Confluence Connector

The Confluence connector ingests pages, comments, and attachments from Confluence Cloud and Confluence Data Center via the REST API.

What it ingests

Pages — full page content including body text
Comments — inline and page-level comments (configurable)
Labels — page labels stored as metadata
Page hierarchy — parent-child relationships preserved in heading path
Attachments — PDFs, images, and other files (configurable, with OCR for images and scanned PDFs)

Authentication

Confluence Cloud

Use an API token:

Go to id.atlassian.com/manage-profile/security/api-tokens
Click Create API token
Copy the token

In CoreCube:

Server URL: https://your-domain.atlassian.net
Email: your Atlassian account email
API Token: the token from step 3
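
As a rough illustration of what these three fields amount to on the wire, Confluence Cloud accepts the email and API token as HTTP Basic credentials. The sketch below only builds a request and does not send it. The function names are hypothetical, and using the `/wiki/rest/api/space` endpoint as a connectivity check is an assumption, not a description of what CoreCube calls internally.

```python
import base64
import urllib.request


def cloud_auth_header(email: str, api_token: str) -> str:
    """Return the HTTP Basic Authorization value for an email + API token pair."""
    raw = f"{email}:{api_token}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_space_request(server_url: str, email: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists one space visible to the account."""
    url = server_url.rstrip("/") + "/wiki/rest/api/space?limit=1"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": cloud_auth_header(email, api_token),
            "Accept": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` and getting a 200 response is one way to confirm the credentials before entering them in CoreCube.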

Confluence Data Center / Server

Use a Personal Access Token (Confluence 7.9+):

Go to Profile → Personal Access Tokens → Create token
Set an expiry and the required permissions (read access)

In CoreCube:

Server URL: https://confluence.your-company.com
Personal Access Token: the token from step 2
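
For comparison, a Personal Access Token on Data Center is sent as a Bearer token rather than as Basic credentials. This is a minimal sketch that builds a request without sending it. The function name is hypothetical, and the `/rest/api/space` check is an assumption about how you might test the token, not CoreCube's own behavior.

```python
import urllib.request


def build_dc_request(server_url: str, personal_access_token: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists one space using a PAT."""
    url = server_url.rstrip("/") + "/rest/api/space?limit=1"
    return urllib.request.Request(
        url,
        headers={
            # Data Center PATs are presented as Bearer tokens.
            "Authorization": f"Bearer {personal_access_token}",
            "Accept": "application/json",
        },
    )
```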

Source filtering

Create separate connections per compartment by filtering to specific space keys:

Field	Description	Example
Space key(s)	Comma-separated list of Confluence space keys	`ENG, DEVOPS, PLATFORM`
Parent page ID	Only ingest pages under a specific parent (optional)	`123456`
Label filter	Only ingest pages with specific labels (optional)	`documentation, runbook`

:::tip One connection per compartment
Create "Confluence - Engineering Docs" scoped to ENG, DEVOPS with compartment engineering, and "Confluence - HR Policies" scoped to HR with compartment hr. Each connection has its own sensitivity level and user scope.
:::
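
To preview which pages a filter combination would select, the three fields can be expressed as a Confluence Query Language (CQL) string and tried in Confluence's advanced search. The sketch below is illustrative only: `build_cql` is a hypothetical helper, and whether CoreCube translates the filters to CQL internally is an assumption.

```python
from __future__ import annotations


def build_cql(space_keys: str, parent_page_id: str = "", labels: str = "") -> str:
    """Turn the comma-separated filter fields into a CQL query string."""

    def split(csv: str) -> list[str]:
        # "ENG, DEVOPS" -> ["ENG", "DEVOPS"], ignoring empty entries
        return [item.strip() for item in csv.split(",") if item.strip()]

    def quoted(values: list[str]) -> str:
        return ", ".join('"' + v.replace('"', '\\"') + '"' for v in values)

    clauses = ["type = page"]
    spaces = split(space_keys)
    if spaces:
        clauses.append(f"space in ({quoted(spaces)})")
    if parent_page_id.strip():
        # int() raises ValueError if the ID is not numeric
        clauses.append(f"ancestor = {int(parent_page_id)}")
    wanted = split(labels)
    if wanted:
        # "label in" matches pages carrying at least one of the labels
        clauses.append(f"label in ({quoted(wanted)})")
    return " and ".join(clauses)


print(build_cql("ENG, DEVOPS, PLATFORM", "123456", "documentation, runbook"))
```

With the example values from the table, this prints `type = page and space in ("ENG", "DEVOPS", "PLATFORM") and ancestor = 123456 and label in ("documentation", "runbook")`.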

Sync options

Option	Default	Description
Include comments	Yes	Ingest page-level and inline comments
Include labels	Yes	Store page labels as chunk metadata
Include attachments	No	Ingest attached PDF, DOCX, and image files
Sync schedule	Every 1 hour	How often to check for updated pages
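
If you manage connections from scripts, the defaults in the table can be captured in a small settings object. This is a sketch only: the class and field names are hypothetical and do not reflect CoreCube's actual configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncOptions:
    """Hypothetical container mirroring the defaults in the table above."""

    include_comments: bool = True      # page-level and inline comments
    include_labels: bool = True        # labels stored as chunk metadata
    include_attachments: bool = False  # PDF, DOCX, and image files
    sync_interval_minutes: int = 60    # "Every 1 hour"


# Example: enable attachments, keep every other default.
options = SyncOptions(include_attachments=True)
```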

Change detection

CoreCube uses Confluence's page version number to detect changes. Only pages whose version number is higher than the last-synced version are re-fetched, so an incremental sync re-processes only the pages that changed since the previous run, typically a handful.

Deleted and trashed pages are detected and removed from the evidence layer.
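
The comparison described in this section can be sketched as follows. The function name and the dictionary shapes are assumptions made for illustration. It also assumes that `remote` lists only current pages, so a deleted or trashed page shows up as missing.

```python
from __future__ import annotations


def plan_incremental_sync(
    last_synced: dict[str, int],
    remote: dict[str, int],
) -> tuple[list[str], list[str]]:
    """Return (page IDs to re-fetch, page IDs to remove).

    Both arguments map page ID -> version number.
    """
    # Re-fetch pages that are new (no stored version) or have a higher version.
    to_fetch = sorted(
        page_id
        for page_id, version in remote.items()
        if version > last_synced.get(page_id, 0)
    )
    # Pages synced earlier but no longer listed remotely were deleted or trashed.
    to_remove = sorted(page_id for page_id in last_synced if page_id not in remote)
    return to_fetch, to_remove


last_synced = {"101": 3, "102": 7, "103": 1}
remote = {"101": 3, "102": 8, "104": 1}
print(plan_incremental_sync(last_synced, remote))
```

In this example, page 102 was edited, page 104 is new, and page 103 is gone, so the result is `(['102', '104'], ['103'])`. Page 101 is untouched and skipped.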

Webhook support

Confluence webhooks trigger an immediate sync when pages are created, updated, or deleted, so changes do not wait for the next scheduled sync.

To configure a webhook in Confluence:

Go to Confluence Admin → System → Webhooks
Click Create webhook
Set the URL to: https://corecube.your-domain.com/api/webhooks/{connection-id}
Copy the webhook secret from the CoreCube connection detail view and enter it as the Secret
Select events: page_created, page_updated, page_removed, page_trashed

CoreCube validates the HMAC-SHA256 signature on every incoming webhook and rejects payloads older than 5 minutes.
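
For readers who want to see what this validation involves, here is a minimal sketch of an HMAC-SHA256 check combined with a 5-minute freshness window. It is not CoreCube's implementation. The function name, the `sha256=<hex>` signature format, and the assumption that the caller extracts the send time are all illustrative.

```python
from __future__ import annotations

import hashlib
import hmac
import time

MAX_AGE_SECONDS = 5 * 60  # reject payloads older than 5 minutes


def verify_webhook(
    secret: bytes,
    body: bytes,
    signature_header: str,
    sent_at: float,
    now: float | None = None,
) -> bool:
    """Return True only if the payload is fresh and its signature matches."""
    now = time.time() if now is None else now
    if now - sent_at > MAX_AGE_SECONDS:
        return False
    # Recompute the signature over the raw request body.
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Assumed header format: "sha256=<hex digest>".
    received = signature_header.removeprefix("sha256=")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The signature is computed over the raw bytes of the body, so verification has to happen before any JSON parsing or re-serialization.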

Content normalization

Confluence pages are stored in Atlassian Document Format (ADF). CoreCube converts ADF to clean markdown before chunking:

Headings → Markdown headings (preserved in heading_path for weighted FTS)
Tables → Markdown tables
Code blocks → Fenced code blocks with language tag
Images → Placeholder text (or OCR'd content if attachments are enabled)
Macros → Extracted text where possible, metadata where not
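
The mapping above can be illustrated with a deliberately small converter. This is a sketch, not CoreCube's normalizer: the function names are hypothetical, and it covers only headings, paragraphs, code blocks, and tables. Images and macros are not handled, any other node type falls back to its plain text, and `|` characters inside table cells are not escaped.

```python
from __future__ import annotations


def text_of(node: dict) -> str:
    """Concatenate the text of a node and all of its descendants."""
    if node.get("type") == "text":
        return node.get("text", "")
    return "".join(text_of(child) for child in node.get("content", []))


def table_to_markdown(table: dict) -> str:
    """Render an ADF table node, treating its first row as the header."""
    rows = [
        [text_of(cell).strip() for cell in row.get("content", [])]
        for row in table.get("content", [])
    ]
    if not rows:
        return ""
    lines = ["| " + " | ".join(rows[0]) + " |"]
    lines.append("| " + " | ".join("---" for _ in rows[0]) + " |")
    lines += ["| " + " | ".join(row) + " |" for row in rows[1:]]
    return "\n".join(lines)


def adf_to_markdown(doc: dict) -> str:
    """Convert the top-level blocks of an ADF document to Markdown."""
    fence = "`" * 3
    blocks = []
    for node in doc.get("content", []):
        kind = node.get("type")
        attrs = node.get("attrs", {})
        if kind == "heading":
            blocks.append("#" * attrs.get("level", 1) + " " + text_of(node))
        elif kind == "codeBlock":
            language = attrs.get("language") or ""
            blocks.append(f"{fence}{language}\n{text_of(node)}\n{fence}")
        elif kind == "table":
            blocks.append(table_to_markdown(node))
        else:
            # Paragraphs and unhandled node types: keep their plain text.
            blocks.append(text_of(node))
    return "\n\n".join(block for block in blocks if block)
```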

Troubleshooting

"Authentication failed"

Verify your API token or Personal Access Token is still valid
Ensure the account has at least read access to the configured spaces
For Confluence Cloud, confirm the email matches your Atlassian account

"Space not found"

Verify the space key is correct (case-sensitive)
Confirm the authenticated user can access the space

Pages missing after sync

Check whether the pages carry at least one of the labels in the configured label filter
Verify that the pages are not restricted and are not in a space the authenticated user cannot access
Check the Connection → Last sync log for skip or failure entries
