API Overview
CoreCube exposes two interfaces for AI clients:
- Headless API — OpenAI-compatible
/v1/chat/completionsendpoint - MCP Server — Model Context Protocol server for Claude Desktop, Cursor, and Claude Code
Both are authenticated via API keys and scoped to the knowledge your key has access to.
API reference
| Resource | URL | Auth |
|---|---|---|
| Interactive API docs (Scalar UI) | /v1/docs | None |
| OpenAPI 3.1 spec | /v1/openapi.json | None |
| Chat completions | POST /v1/chat/completions | API key |
| Model listing | GET /v1/models | API key |
| MCP server | /mcp | API key |
The interactive API reference at /v1/docs can be imported directly into n8n, Postman, or any OpenAPI-compatible tool via /v1/openapi.json.
Authentication
All API requests require a bearer token:
Authorization: Bearer cc_YOUR_API_KEY
API keys are created in Admin Console → API Keys. Each key is assigned a scope that determines which connections and compartments it can access.
API key types
| Key type | User identity | Typical use |
|---|---|---|
| Personal | Fixed — the user who owns the key | Claude Desktop, Cursor, single-user OpenWebUI |
| Service | Dynamic — X-Cube-User header per request | OpenWebUI (multi-user), n8n, custom proxies |
| Public | None — anonymous | Customer support widgets, public docs assistants, AnythingLLM |
Personal keys
Authenticate a single user. The key resolves to the user's scope assignments — no extra headers required.
Claude Desktop → Authorization: Bearer cc_personal_...
→ CoreCube resolves: key → user → scope → compartments
Use a personal key when the client runs on behalf of one fixed user: Claude Desktop, Cursor, a personal Raycast extension, or an OpenWebUI instance used by a single person.
Service keys
Authenticate an application (e.g., OpenWebUI, n8n) as a trusted proxy. The key authenticates the application; the X-Cube-User header identifies which end-user is making each request.
OpenWebUI (Sarah) → Authorization: Bearer cc_service_...
X-Cube-User: sarah@company.com
→ CoreCube resolves: service key → user (Sarah) → Sarah's scope
Service keys enable per-user permissions through shared frontends. The same API key serves different users with different compartment access depending on who the proxy passes in the header.
The X-Cube-User value can be either the user's email address or their user ID — CoreCube tries email first, then falls back to ID:
X-Cube-User: admin@example.com ← human-friendly (recommended)
X-Cube-User: usr_abc123 ← machine-friendly
A service-key request that doesn't include X-Cube-User returns 400 MISSING_USER_IDENTITY. This is deliberate — the server refuses to guess who a request is from. The referenced user must also exist in CoreCube and be active.
Public keys
Authenticate anonymous callers against a single pre-selected scope. No user identity is required or resolved. All callers see the same knowledge — there is no per-user isolation.
Customer support widget → Authorization: Bearer cc_public_...
→ CoreCube resolves: key → bound scope → compartments
Public keys are for clients that cannot forward a user identity at all: anonymous website visitors, embedded chat widgets, public documentation assistants, and frontends like AnythingLLM that do not pass per-user headers to downstream APIs.
What public keys allow:
GET /v1/modelsPOST /v1/chat/completions
What public keys do not allow:
- Any write endpoint (check-ins, MCP mutations) — rejected with
403 FORBIDDEN - Deletion of the bound scope — the scope cannot be deleted while any public key references it; re-point or revoke those keys first
- Per-user audit attribution — queries are logged with
user_id = nulland the caller's IP address
Treat a public key like a read-only token pasted on your website. Anyone with the key can read everything in the bound scope. Mitigate the blast radius by:
- Binding the key to a scope whose
max_sensitivityispublic - Adding a CIDR allowlist when the caller is server-side
- Rotating the key on a fixed schedule (90 days recommended)
- Configuring CORS at your reverse proxy — CoreCube does not set per-key CORS headers
Sensitivity confirmation. When binding a public key to a scope whose max_sensitivity is higher than public, the Admin Console requires an explicit acknowledgement before saving. This is deliberate friction — a leaked public key exposes everything the scope can see.
Choosing a key type
| If the client… | Use |
|---|---|
| Runs on behalf of one fixed user (Claude Desktop, Cursor, a single-user OpenWebUI) | Personal key |
| Runs as a shared proxy where many users log in through it and each should see only their own knowledge (OpenWebUI for a team, n8n, custom chat frontends) | Service key + X-Cube-User header |
| Serves anonymous callers who have no CoreCube identity (website widgets, public docs assistants, AnythingLLM) | Public key |
When in doubt, start with a personal key. Reach for service keys when one API key has to represent different end-users on different requests. Reach for public keys only when anonymous access is a hard requirement — they trade per-user isolation for zero-setup anonymous access.
OpenWebUI: a common gotcha
OpenWebUI passes the bearer token but does not automatically send X-Cube-User. If you use a service key without extra configuration, every request fails with 400 MISSING_USER_IDENTITY. Two ways to fix it:
Option 1 — use a personal key (simplest). Create the key owned by the OpenWebUI user. Works out of the box, no extra headers needed. Best choice for a single-user OpenWebUI deployment.
Option 2 — service key + custom headers. OpenWebUI supports custom request headers on its API connections, but the field expects a JSON object, not individual key/value pairs. Paste this into the Headers field on the OpenWebUI connection:
{ "X-Cube-User": "admin@example.com" }
Replace admin@example.com with the CoreCube user whose scope should apply to OpenWebUI queries. Every request OpenWebUI makes will now include the header, and CoreCube will resolve that user's compartments and sensitivity ceiling.
For a multi-user OpenWebUI deployment where each logged-in user should query against their own scope, the proxy would need to inject a per-request X-Cube-User header — OpenWebUI's static connection-level headers don't support that today. Until they do, a service key on OpenWebUI behaves as a single-user key pinned to whatever email you put in the JSON.
Can an end user spoof X-Cube-User to escalate permissions?
Short answer: no, not from a browser. The header is set on the server side, not the client side.
This is the standard trust model for any backend-to-backend proxy — it applies to OpenWebUI, n8n, a custom chat frontend, or any application that uses a CoreCube service key on behalf of multiple end users.
The end user's browser never talks directly to CoreCube. It talks to the proxy, which makes a separate outbound request to CoreCube. The proxy constructs that request's headers — including X-Cube-User — from its own configuration and server-side session state, not from anything the browser sent. The user has no access to that HTTP call.
CoreCube trusts the proxy because the proxy holds the service key, and the proxy vouches for its end users based on its own authentication.
What a regular end user at the browser cannot do:
- Inject headers into the proxy's outbound request to CoreCube — they have no access to that call.
- Override the email the proxy puts in
X-Cube-User— it comes from server-side config or session state, not from a request field the user controls.
The real threat surface:
| Threat | Exploitable by a regular end user? | Mitigation |
|---|---|---|
| Browser-side header injection | No — the browser talks to the proxy, not CoreCube | Architecture |
| Stolen service key → direct CoreCube calls | Only if they steal the key | Key rotation, audit log, short expiry |
| Compromised proxy application | No (requires server-level access) | Proxy hardening, network isolation |
| Network interception between proxy and CoreCube | No (requires network-level access) | HTTPS required on all CoreCube endpoints |
Bottom line: the security is as strong as (a) the confidentiality of the service key and (b) the proxy's own user authentication. If the proxy correctly identifies its users — OpenWebUI through sessions, an SSO gateway through signed tokens, n8n through its workflow execution context — CoreCube inherits that trust. A regular end user at a browser has no attack surface here.
Every request made with a service key is logged against the effective user in X-Cube-User — not against the key itself. So if alice@example.com queries through the proxy, the audit log shows alice@example.com asked "…", which makes forensic review trivial. If the same key is used by different effective users, each appears separately in the audit trail.
Two response modes
Strict mode (default)
Identical request/response format to OpenAI. Citations are injected into the assistant message text as formatted references. No extra top-level fields in the response JSON.
Any OpenAI client connects without modification. Citation rendering depends on how each client handles markdown references in the assistant text.
Extended mode (opt-in)
Adds structured source metadata to the response. Activated by a request header or body field:
X-Cube-Extended: true
or in the request body:
{ "cube_extended": true }
Extended mode adds a top-level corecube object:
{
"choices": [...],
"corecube": {
"citations": [
{
"index": 1,
"title": "Deployment Runbook",
"url": "https://confluence.company.com/pages/123",
"connection": "Confluence — Engineering",
"source_path": "connector",
"indexed_at": "2026-04-10T14:30:00Z",
"relevance_score": 0.94
}
],
"search_latency_ms": 142,
"llm_latency_ms": 1840,
"chunks_retrieved": 8
}
}
For streaming, extended mode sends a custom corecube.sources SSE event before [DONE].
Answerability
CoreCube controls what happens when retrieval finds insufficient evidence:
| Mode | Behavior |
|---|---|
| Strict (recommended) | Returns a structured "insufficient evidence" response. Query is not forwarded to the LLM. |
| Permissive | Forwards the query to the LLM even without context, clearly marked as ungrounded. |
Configure the answerability mode in Admin Console → Settings → Retrieval.
MCP server
CoreCube exposes these MCP tools for Claude Desktop, Cursor, and Claude Code:
| Tool | Description |
|---|---|
search_knowledge | Search the knowledge base with a query |
list_sources | List available connections and their status |
get_document | Retrieve a specific document by ID |
get_entity | Retrieve a synthesized entity page |
get_project_summary | Retrieve a project summary artifact |
SSE and Streamable HTTP transports are both supported.
Claude Desktop configuration:
{
"mcpServers": {
"corecube": {
"command": "npx",
"args": ["-y", "@anthropic-ai/mcp-client-sse", "https://corecube.your-domain.com/mcp"],
"env": {
"AUTHORIZATION": "Bearer cc_YOUR_API_KEY"
}
}
}
}
Rate limiting
Every /v1/* request increments a counter tied to the calling API key. Personal and service keys share one counter per key. Public keys have two counters so that one abusive IP cannot exhaust the key's budget for everyone else.
| Key type | Counter | Default limit |
|---|---|---|
| Personal / service | Per API key | 300 req/min (configurable per key) |
| Public — per source IP | Per (API key, client IP) pair | 30 req/min (configurable globally) |
| Public — aggregate | Per API key | Same as the key's rate_limit value |
/api/* (admin API) | Per session | 1000 req/min |
Rate limit responses include X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After headers. The aggregate public-key cap prevents bulk extraction across many IPs; tune it by editing the key's rate_limit value in the Admin Console.
The global per-IP default for public keys is controlled by the public_key_ip_rate_limit setting and defaults to 30 requests per minute.