API Overview

CoreCube exposes two interfaces for AI clients:

Headless API — OpenAI-compatible /v1/chat/completions endpoint
MCP Server — Model Context Protocol server for Claude Desktop, Cursor, and Claude Code

Both are authenticated via API keys and scoped to the knowledge your key has access to.

API reference

Resource	URL	Auth
Interactive API docs (Scalar UI)	`/v1/docs`	None
OpenAPI 3.1 spec	`/v1/openapi.json`	None
Chat completions	`POST /v1/chat/completions`	API key
Model listing	`GET /v1/models`	API key
MCP server	`/mcp`	API key

The interactive API reference at /v1/docs can be imported directly into n8n, Postman, or any OpenAPI-compatible tool via /v1/openapi.json.

Authentication

All API requests require a bearer token:

Authorization: Bearer cc_YOUR_API_KEY

API keys are created in Admin Console → API Keys. Each key is assigned a scope that determines which connections and compartments it can access.

API key types

Key type	User identity	Typical use
Personal	Fixed — the user who owns the key	Claude Desktop, Cursor, single-user OpenWebUI
Service	Dynamic — `X-Cube-User` header per request	OpenWebUI (multi-user), n8n, custom proxies
Public	None — anonymous	Customer support widgets, public docs assistants, AnythingLLM

Personal keys

Authenticate a single user. The key resolves to the user's scope assignments — no extra headers required.

Claude Desktop → Authorization: Bearer cc_personal_...
              → CoreCube resolves: key → user → scope → compartments

Use a personal key when the client runs on behalf of one fixed user: Claude Desktop, Cursor, a personal Raycast extension, or an OpenWebUI instance used by a single person.

Service keys

Authenticate an application (e.g., OpenWebUI, n8n) as a trusted proxy. The key authenticates the application; the X-Cube-User header identifies which end-user is making each request.

OpenWebUI (Sarah) → Authorization: Bearer cc_service_...
                    X-Cube-User: sarah@company.com
                  → CoreCube resolves: service key → user (Sarah) → Sarah's scope

Service keys enable per-user permissions through shared frontends. The same API key serves different users with different compartment access depending on who the proxy passes in the header.

The X-Cube-User value can be either the user's email address or their user ID — CoreCube tries email first, then falls back to ID:

X-Cube-User: admin@example.com     ← human-friendly (recommended)
X-Cube-User: usr_abc123            ← machine-friendly

Service keys fail closed without the header

A service-key request that doesn't include X-Cube-User returns 400 MISSING_USER_IDENTITY. This is deliberate — the server refuses to guess who a request is from. The referenced user must also exist in CoreCube and be active.

Public keys

Authenticate anonymous callers against a single pre-selected scope. No user identity is required or resolved. All callers see the same knowledge — there is no per-user isolation.

Customer support widget → Authorization: Bearer cc_public_...
                        → CoreCube resolves: key → bound scope → compartments

Public keys are for clients that cannot forward a user identity at all: anonymous website visitors, embedded chat widgets, public documentation assistants, and frontends like AnythingLLM that do not pass per-user headers to downstream APIs.

What public keys allow:

GET /v1/models
POST /v1/chat/completions

What public keys do not allow:

Any write endpoint (check-ins, MCP mutations) — rejected with 403 FORBIDDEN
Deletion of the bound scope — the scope cannot be deleted while any public key references it; re-point or revoke those keys first
Per-user audit attribution — queries are logged with user_id = null and the caller's IP address

Public keys are anonymous by design

Treat a public key like a read-only token pasted on your website. Anyone with the key can read everything in the bound scope. Mitigate the blast radius by:

Binding the key to a scope whose max_sensitivity is public
Adding a CIDR allowlist when the caller is server-side
Rotating the key on a fixed schedule (90 days recommended)
Configuring CORS at your reverse proxy — CoreCube does not set per-key CORS headers

Sensitivity confirmation. When binding a public key to a scope whose max_sensitivity is higher than public, the Admin Console requires an explicit acknowledgement before saving. This is deliberate friction — a leaked public key exposes everything the scope can see.

Choosing a key type

If the client…	Use
Runs on behalf of one fixed user (Claude Desktop, Cursor, a single-user OpenWebUI)	Personal key
Runs as a shared proxy where many users log in through it and each should see only their own knowledge (OpenWebUI for a team, n8n, custom chat frontends)	Service key + `X-Cube-User` header
Serves anonymous callers who have no CoreCube identity (website widgets, public docs assistants, AnythingLLM)	Public key

When in doubt, start with a personal key. Reach for service keys when one API key has to represent different end-users on different requests. Reach for public keys only when anonymous access is a hard requirement — they trade per-user isolation for zero-setup anonymous access.

OpenWebUI: a common gotcha

OpenWebUI passes the bearer token but does not automatically send X-Cube-User. If you use a service key without extra configuration, every request fails with 400 MISSING_USER_IDENTITY. Two ways to fix it:

Option 1 — use a personal key (simplest). Create the key owned by the OpenWebUI user. Works out of the box, no extra headers needed. Best choice for a single-user OpenWebUI deployment.

Option 2 — service key + custom headers. OpenWebUI supports custom request headers on its API connections, but the field expects a JSON object, not individual key/value pairs. Paste this into the Headers field on the OpenWebUI connection:

{ "X-Cube-User": "admin@example.com" }

Replace admin@example.com with the CoreCube user whose scope should apply to OpenWebUI queries. Every request OpenWebUI makes will now include the header, and CoreCube will resolve that user's compartments and sensitivity ceiling.

For a multi-user OpenWebUI deployment where each logged-in user should query against their own scope, the proxy would need to inject a per-request X-Cube-User header — OpenWebUI's static connection-level headers don't support that today. Until they do, a service key on OpenWebUI behaves as a single-user key pinned to whatever email you put in the JSON.

Can an end user spoof `X-Cube-User` to escalate permissions?

Short answer: no, not from a browser. The header is set on the server side, not the client side.

This is the standard trust model for any backend-to-backend proxy — it applies to OpenWebUI, n8n, a custom chat frontend, or any application that uses a CoreCube service key on behalf of multiple end users.

The end user's browser never talks directly to CoreCube. It talks to the proxy, which makes a separate outbound request to CoreCube. The proxy constructs that request's headers — including X-Cube-User — from its own configuration and server-side session state, not from anything the browser sent. The user has no access to that HTTP call.

CoreCube trusts the proxy because the proxy holds the service key, and the proxy vouches for its end users based on its own authentication.

What a regular end user at the browser cannot do:

Inject headers into the proxy's outbound request to CoreCube — they have no access to that call.
Override the email the proxy puts in X-Cube-User — it comes from server-side config or session state, not from a request field the user controls.

The real threat surface:

Threat	Exploitable by a regular end user?	Mitigation
Browser-side header injection	No — the browser talks to the proxy, not CoreCube	Architecture
Stolen service key → direct CoreCube calls	Only if they steal the key	Key rotation, audit log, short expiry
Compromised proxy application	No (requires server-level access)	Proxy hardening, network isolation
Network interception between proxy and CoreCube	No (requires network-level access)	HTTPS required on all CoreCube endpoints

Bottom line: the security is as strong as (a) the confidentiality of the service key and (b) the proxy's own user authentication. If the proxy correctly identifies its users — OpenWebUI through sessions, an SSO gateway through signed tokens, n8n through its workflow execution context — CoreCube inherits that trust. A regular end user at a browser has no attack surface here.

Audit logs record the effective user, not the service key

Every request made with a service key is logged against the effective user in X-Cube-User — not against the key itself. So if alice@example.com queries through the proxy, the audit log shows alice@example.com asked "…", which makes forensic review trivial. If the same key is used by different effective users, each appears separately in the audit trail.

Two response modes

Strict mode (default)

Identical request/response format to OpenAI. Citations are injected into the assistant message text as formatted references. No extra top-level fields in the response JSON.

Any OpenAI client connects without modification. Citation rendering depends on how each client handles markdown references in the assistant text.

Extended mode (opt-in)

Adds structured source metadata to the response. Activated by a request header or body field:

X-Cube-Extended: true

or in the request body:

{ "cube_extended": true }

Extended mode adds a top-level corecube object:

{
  "choices": [...],
  "corecube": {
    "citations": [
      {
        "index": 1,
        "title": "Deployment Runbook",
        "url": "https://confluence.company.com/pages/123",
        "connection": "Confluence — Engineering",
        "source_path": "connector",
        "indexed_at": "2026-04-10T14:30:00Z",
        "relevance_score": 0.94
      }
    ],
    "search_latency_ms": 142,
    "llm_latency_ms": 1840,
    "chunks_retrieved": 8
  }
}

For streaming, extended mode sends a custom corecube.sources SSE event before [DONE].

Answerability

CoreCube controls what happens when retrieval finds insufficient evidence:

Mode	Behavior
Strict (recommended)	Returns a structured "insufficient evidence" response. Query is not forwarded to the LLM.
Permissive	Forwards the query to the LLM even without context, clearly marked as ungrounded.

Configure the answerability mode in Admin Console → Settings → Retrieval.

MCP server

CoreCube exposes these MCP tools for Claude Desktop, Cursor, and Claude Code:

Tool	Description
`search_knowledge`	Search the knowledge base with a query
`list_sources`	List available connections and their status
`get_document`	Retrieve a specific document by ID
`get_entity`	Retrieve a synthesized entity page
`get_project_summary`	Retrieve a project summary artifact

SSE and Streamable HTTP transports are both supported.

Claude Desktop configuration:

{
  "mcpServers": {
    "corecube": {
      "command": "npx",
      "args": ["-y", "@anthropic-ai/mcp-client-sse", "https://corecube.your-domain.com/mcp"],
      "env": {
        "AUTHORIZATION": "Bearer cc_YOUR_API_KEY"
      }
    }
  }
}

Rate limiting

Every /v1/* request increments a counter tied to the calling API key. Personal and service keys share one counter per key. Public keys have two counters so that one abusive IP cannot exhaust the key's budget for everyone else.

Key type	Counter	Default limit
Personal / service	Per API key	300 req/min (configurable per key)
Public — per source IP	Per (API key, client IP) pair	30 req/min (configurable globally)
Public — aggregate	Per API key	Same as the key's `rate_limit` value
`/api/*` (admin API)	Per session	1000 req/min

Rate limit responses include X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After headers. The aggregate public-key cap prevents bulk extraction across many IPs; tune it by editing the key's rate_limit value in the Admin Console.

The global per-IP default for public keys is controlled by the public_key_ip_rate_limit setting and defaults to 30 requests per minute.

API reference​

Authentication​

API key types​

Personal keys​

Service keys​

Public keys​

Choosing a key type​

OpenWebUI: a common gotcha​

Can an end user spoof X-Cube-User to escalate permissions?​

Two response modes​

Strict mode (default)​

Extended mode (opt-in)​

Answerability​

MCP server​

Rate limiting​