Confluence Connector

The Confluence connector ingests pages, comments, and labels from Confluence Cloud and Confluence Data Center via the REST API.

What it ingests

  • Pages — full page content including body text
  • Comments — inline and page-level comments (configurable)
  • Labels — page labels stored as metadata
  • Page hierarchy — parent-child relationships preserved in heading path

Authentication

Confluence Cloud

Use an API token:

  1. Go to id.atlassian.com/manage-profile/security/api-tokens
  2. Click Create API token
  3. Copy the token

In CoreCube:

  • Server URL: https://your-domain.atlassian.net
  • Email: your Atlassian account email
  • API Token: the token from step 3

Confluence Data Center / Server

CoreCube authenticates to every Confluence endpoint with HTTP Basic auth (username:token, base64-encoded). The Bearer / Personal Access Token scheme is not used — even on Data Center, supply a username together with a token or password.

  1. Use your Confluence account username
  2. Create an API token, or use your account password, with at least read access to the target spaces

In CoreCube:

  • Server URL: https://confluence.your-company.com
  • Email / Username: your Confluence username
  • API Token: the token (or password) from step 2
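
As a rough illustration of the Basic scheme described above, the sketch below builds the `Authorization` header from a username (or, on Cloud, an email) and a token or password. The function name `basic_auth_header` is hypothetical and is not part of CoreCube; it only shows what "username:token, base64-encoded" means in practice.

```python
import base64


def basic_auth_header(username: str, secret: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header.

    `username` is the Atlassian email (Cloud) or Confluence username
    (Data Center); `secret` is the API token or account password.
    """
    raw = f"{username}:{secret}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

Base64 is an encoding, not encryption, so the header is only as safe as the transport; this is one reason both example server URLs use `https://`.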

Source filtering

Create separate connections per compartment by filtering to specific space keys:

| Field | Description | Example |
| --- | --- | --- |
| Space key(s) | Comma-separated list of Confluence space keys | ENG, DEVOPS, PLATFORM |

One connection per compartment

Create "Confluence — Engineering Docs" scoped to ENG, DEVOPS with compartment engineering, and "Confluence — HR Policies" scoped to HR with compartment hr. Each has its own sensitivity level and user scope.
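
The scoping in the example above can be sketched as follows. This is an illustration only: `parse_space_keys`, `CONNECTIONS`, and `compartment_for` are hypothetical names, and how CoreCube actually parses the field is not documented here. The sketch assumes whitespace around keys is ignored and that keys are compared case-sensitively.

```python
from typing import Optional


def parse_space_keys(raw: str) -> list[str]:
    """Split a comma-separated field into space keys, dropping blanks and duplicates."""
    keys: list[str] = []
    for part in raw.split(","):
        key = part.strip()  # tolerate "ENG, DEVOPS" as well as "ENG,DEVOPS"
        if key and key not in keys:
            keys.append(key)  # case is preserved: space keys are case-sensitive
    return keys


# Hypothetical connection definitions mirroring the example above:
# compartment name -> space keys the connection is scoped to.
CONNECTIONS = {
    "engineering": parse_space_keys("ENG, DEVOPS"),
    "hr": parse_space_keys("HR"),
}


def compartment_for(space_key: str) -> Optional[str]:
    """Return the compartment whose connection covers this space key, if any."""
    for compartment, keys in CONNECTIONS.items():
        if space_key in keys:
            return compartment
    return None
```

With these definitions, `compartment_for("DEVOPS")` yields `"engineering"`, while a space that no connection lists yields `None` and would not be ingested.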

Sync options

| Option | Default | Description |
| --- | --- | --- |
| Include comments | Yes | Ingest page-level and inline comments |
| Include labels | Yes | Store page labels as chunk metadata |
| Sync schedule | Manual | Manual by default; optional interval of 15, 30, 60, or 360 minutes |
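
One way to picture these options and their defaults is as a small validated settings object. The class and field names below are assumptions made for illustration; they are not CoreCube's configuration schema.

```python
from dataclasses import dataclass
from typing import Optional

# Intervals listed in the table above, in minutes.
ALLOWED_INTERVALS_MINUTES = (15, 30, 60, 360)


@dataclass(frozen=True)
class SyncOptions:
    """Hypothetical container for the sync options in the table above."""

    include_comments: bool = True
    include_labels: bool = True
    interval_minutes: Optional[int] = None  # None stands for "Manual"

    def __post_init__(self) -> None:
        # Reject any interval that is not one of the documented choices.
        if (
            self.interval_minutes is not None
            and self.interval_minutes not in ALLOWED_INTERVALS_MINUTES
        ):
            raise ValueError(
                f"interval must be one of {ALLOWED_INTERVALS_MINUTES} or None (manual)"
            )
```

`SyncOptions()` reproduces the defaults in the table, and `SyncOptions(interval_minutes=45)` raises `ValueError` because 45 is not an offered interval.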

Change detection

CoreCube uses Confluence's page version number to detect changes. Only pages with a version number higher than the last-synced version are re-fetched, so an incremental sync re-processes only the pages that changed since the previous run rather than the whole space.

Deleted and trashed pages are detected and removed from the evidence layer.
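
The comparison described above can be sketched as a pure function over two maps of page ID to version number. This is a simplified model, not CoreCube's implementation: `plan_sync` and its arguments are hypothetical, and it assumes that a page missing from the remote listing has been deleted or trashed.

```python
def plan_sync(
    last_synced: dict[str, int], remote: dict[str, int]
) -> tuple[list[str], list[str]]:
    """Decide which pages to re-fetch and which to remove.

    last_synced: page ID -> version stored at the previous sync
    remote:      page ID -> current version reported by Confluence
    """
    # New pages have no stored version; treating that as 0 means any
    # real version (Confluence versions start at 1) triggers a fetch.
    to_fetch = sorted(
        page_id
        for page_id, version in remote.items()
        if version > last_synced.get(page_id, 0)
    )
    # Pages known locally but absent remotely are assumed deleted or trashed.
    to_remove = sorted(page_id for page_id in last_synced if page_id not in remote)
    return to_fetch, to_remove
```

For example, with `last_synced = {"101": 3, "102": 1, "103": 7}` and `remote = {"101": 4, "102": 1, "104": 1}`, page 101 is re-fetched (version 3 to 4), page 104 is fetched as new, page 102 is skipped, and page 103 is removed.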

Content normalization

Confluence pages are delivered in storage format, Confluence's XHTML-based representation. CoreCube converts this XHTML to clean, Markdown-style text before chunking:

  • Headings → Markdown headings (preserved in heading_path for weighted FTS)
  • Tables → Markdown tables
  • Code blocks → Fenced code blocks
  • Images → Alt text kept as a placeholder (e.g. [Image: architecture diagram]); images without alt text are dropped
  • Macros → Inner text extracted where present
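
To make two of these rules concrete, the sketch below converts headings and images using Python's standard `html.parser`. It is a deliberately small illustration, not CoreCube's converter: the class and function names are hypothetical, it assumes plain `<h1>` to `<h6>`, `<p>`, and `<img alt="...">` tags, and it does not handle tables, code blocks, or Confluence's own `ac:` macro and image elements.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StorageToText(HTMLParser):
    """Collect headings, paragraphs, and image alt text as Markdown-style lines."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []
        self._buf: list[str] = []
        self._prefix = ""

    def flush(self) -> None:
        # Collapse whitespace and emit the buffered block, if it has any text.
        text = " ".join("".join(self._buf).split())
        if text:
            self.lines.append(self._prefix + text)
        self._buf = []
        self._prefix = ""

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.flush()
            self._prefix = "#" * int(tag[1]) + " "  # <h2> -> "## "
        elif tag == "p":
            self.flush()
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:  # images without alt text are dropped
                self._buf.append(f"[Image: {alt}]")

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.flush()

    def handle_data(self, data):
        self._buf.append(data)


def normalize(html: str) -> str:
    """Convert a fragment of XHTML to Markdown-style text."""
    parser = StorageToText()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text outside a block element
    return "\n\n".join(parser.lines)
```

Given `<h2>Setup</h2><p>Install the agent.</p>`, `normalize` returns `## Setup`, a blank line, and `Install the agent.`; an `<img>` with `alt="architecture diagram"` becomes `[Image: architecture diagram]`, and one without alt text produces nothing.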

Troubleshooting

"Authentication failed"

  • Verify your API token (or account password) is still valid
  • Ensure the account has at least read access to the configured spaces
  • For Confluence Cloud, confirm the email matches your Atlassian account

"Space not found"

  • Verify the space key is correct (case-sensitive)
  • Confirm the authenticated user can access the space

Pages missing after sync

  • Verify the pages are not in a restricted space that the user cannot access
  • Confirm the pages belong to one of the configured space keys
  • Check the Connection → Last sync log for skip or failure entries