Docker Deployment
CoreCube ships as a Docker image with PostgreSQL + pgvector as the required knowledge backend.
For the fastest setup, use the install script. This page documents the manual Compose deployment and operational details.
Deployment modes
| Mode | Use case | Database | External API calls |
|---|---|---|---|
| Production | Teams and organizations | PostgreSQL + pgvector | LLM API required |
| Fully local | Air-gapped, data sovereignty | PostgreSQL + pgvector | None required |
CoreCube uses PostgreSQL + pgvector for durable application and knowledge data. Runtime access is split into bootstrap/admin, app, and maintenance roles so ordinary request-path traffic stays under RLS while background cleanup can safely repair or purge cross-scope knowledge data.
Production deployment
The recommended setup for teams: CoreCube + PostgreSQL/pgvector.
services:
  corecube:
    image: registry.arantic.cloud/corecube/corecube:latest
    container_name: corecube-app
    ports:
      - '7400:7400'
    environment:
      CUBE_ADMIN_EMAIL: admin@example.com
      CUBE_ADMIN_PASSWORD: changeme123
      PGVECTOR_URL: postgresql://corecube:changeme123@pgvector:5432/corecube
      PGVECTOR_APP_URL: postgresql://corecube_app:changeme123@pgvector:5432/corecube
      PGVECTOR_MAINTENANCE_URL: postgresql://corecube_maintenance:changeme123@pgvector:5432/corecube
      ENCRYPTION_KEY: your-256-bit-encryption-key
      # Local inference sidecars (embedding + reranking)
      INFERENCE_EMBEDDING_URL: http://inf-embedding:9440
      INFERENCE_RERANKER_URL: http://inf-reranker:9450
      # Object storage: required for Library uploads and OCR
      STORAGE_ENDPOINT: s3storage
      STORAGE_PORT: '8333'
      STORAGE_USE_SSL: 'false'
      STORAGE_ACCESS_KEY: corecube
      STORAGE_SECRET_KEY: corecube_secret_key_change_me
      STORAGE_REGION: eu-west-1
      # Browser-reachable URL for presigned uploads (set to your storage subdomain in production)
      STORAGE_PUBLIC_ENDPOINT: http://localhost:8333
      # Sandbox executor (Unix-socket IPC)
      CC_EXECUTOR_SOCKET_PATH: /run/corecube/executor.sock
    volumes:
      - corecube-data:/data
      - executor-socket:/run/corecube
    depends_on:
      pgvector:
        condition: service_healthy
      s3storage:
        condition: service_healthy
      executor:
        condition: service_healthy
    restart: unless-stopped

  pgvector:
    image: pgvector/pgvector:pg17
    container_name: corecube-pgvector
    environment:
      POSTGRES_USER: corecube
      POSTGRES_PASSWORD: changeme123
      POSTGRES_DB: corecube
    volumes:
      - pgvector-data:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U corecube -d corecube']
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  # SeaweedFS object store: holds original uploaded files (Library + OCR re-scan)
  s3storage:
    image: chrislusf/seaweedfs:4.22
    container_name: corecube-s3storage
    command:
      [
        'server',
        '-dir=/data',
        '-filer',
        '-s3',
        '-s3.config=/etc/s3storage/s3.json',
        '-volume.max=0',
        '-master.volumeSizeLimitMB=1024',
        '-volume.index=leveldb',
        '-master.volumePreallocate',
        '-filer.concurrentUploadLimitMB=100',
        '-s3.allowEmptyFolder=true',
      ]
    ports:
      - '8333:8333' # S3 API (browser-facing presigned PUT)
      - '9333:9333' # Master UI/API
      - '8888:8888' # Filer UI/API
    volumes:
      - s3storage-data:/data
      - ./s3.json:/etc/s3storage/s3.json:ro
    healthcheck:
      test:
        ['CMD', 'wget', '--no-verbose', '--tries=1', '--spider', 'http://127.0.0.1:9333/dir/status']
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    restart: unless-stopped

  # Sandbox that runs preset code tools; the app waits for it to be healthy
  executor:
    image: registry.arantic.cloud/corecube/corecube-executor:latest
    container_name: corecube-executor
    environment:
      CC_EXECUTOR_SOCKET_PATH: /run/corecube/executor.sock
    read_only: true
    # Drop all Linux capabilities (see Security hardening)
    cap_drop:
      - ALL
    tmpfs:
      - /tmp:noexec,nosuid,nodev,size=64m
    network_mode: 'none'
    user: '65532:65532'
    volumes:
      - executor-socket:/run/corecube
    healthcheck:
      test: ['CMD', '/usr/local/bin/healthz.sh']
      interval: 5s
      timeout: 2s
      retries: 5
      start_period: 10s
    restart: unless-stopped

  # Inference sidecars: local embedding and reranking (always on, no profile)
  inf-embedding:
    image: registry.arantic.cloud/corecube/corecube-inference:${INFERENCE_IMAGE_TAG:-cpu-latest}
    container_name: corecube-inf-embedding
    expose:
      - '9440'
    environment:
      HF_HOME: /data/hf-cache
      INFERENCE_ROLE: embedding
      DEVICE: ${INFERENCE_DEVICE:-cpu}
    volumes:
      - ./inference/hf-cache:/data/hf-cache
    # Uncomment for NVIDIA GPU (also set INFERENCE_IMAGE_TAG=cuda-latest, INFERENCE_DEVICE=cuda):
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]
    restart: unless-stopped

  inf-reranker:
    image: registry.arantic.cloud/corecube/corecube-inference:${INFERENCE_IMAGE_TAG:-cpu-latest}
    container_name: corecube-inf-reranker
    expose:
      - '9450'
    environment:
      HF_HOME: /data/hf-cache
      INFERENCE_ROLE: reranker
      DEVICE: ${INFERENCE_DEVICE:-cpu}
    volumes:
      - ./inference:/data
    # Uncomment for NVIDIA GPU (see inf-embedding above):
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]
    restart: unless-stopped

volumes:
  corecube-data:
  pgvector-data:
  s3storage-data:
  executor-socket:
The s3storage service needs an access-key file (s3.json) next to your docker-compose.yml,
with credentials matching STORAGE_ACCESS_KEY / STORAGE_SECRET_KEY above:
{
  "identities": [
    {
      "name": "corecube",
      "credentials": [{ "accessKey": "corecube", "secretKey": "corecube_secret_key_change_me" }],
      "actions": ["Admin", "Read", "Write"]
    }
  ]
}
Start:
docker compose up -d
Stop:
docker compose down
Fully local deployment
Zero external API calls. The CoreCube Inference sidecars (inf-embedding, inf-reranker) ship in the
Compose file and run by default — they serve local embedding and reranking. There is no separate
profile to enable; the same docker compose up -d starts them.
docker compose up -d
To eliminate external calls entirely:
- In the Admin Console, select a local embedding model and a local reranker model (served by the inference sidecars).
- Configure a self-hosted LLM — for example Ollama — as a custom LLM provider pointing to
http://ollama:11434/v1.
OCR (for scanned PDFs) runs in-process inside the CoreCube container against your configured OCR provider, not the inference sidecars. For a fully local setup, point it at a vision-capable local model.
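The provider URL http://ollama:11434/v1 only resolves if a container named ollama sits on the same Docker network as CoreCube. The following is a minimal sketch of one way to arrange that, assuming the stack was started from the Compose file on this page (container name corecube-app) and using the public ollama/ollama image. The volume name ollama-data and both function names are chosen here for illustration and are not part of CoreCube.

```shell
#!/bin/sh
# Sketch: run Ollama next to an existing CoreCube stack.
# Assumptions: container name corecube-app, the public ollama/ollama image,
# and a volume name (ollama-data) picked for this example.

# Print the first Docker network the given container is attached to.
container_network() {
  docker inspect \
    --format '{{range $name, $conf := .NetworkSettings.Networks}}{{println $name}}{{end}}' \
    "$1" | head -n 1
}

# Start Ollama on that network so CoreCube can reach it as http://ollama:11434.
start_ollama() {
  net=$(container_network corecube-app)
  [ -n "$net" ] || { echo "corecube-app is not running" >&2; return 1; }
  docker run -d \
    --name ollama \
    --network "$net" \
    -v ollama-data:/root/.ollama \
    --restart unless-stopped \
    ollama/ollama
}
```

After calling start_ollama, pull a model with docker exec ollama ollama pull <model>, where <model> is whichever model you plan to serve. Port 11434 is intentionally not published to the host in this sketch, so the LLM stays reachable only from inside the Docker network.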
NVIDIA GPU (CUDA)
For GPU-accelerated inference on Linux, switch the inference sidecars to the CUDA image and reserve a GPU.
Set these in your .env:
INFERENCE_IMAGE_TAG=cuda-latest
INFERENCE_DEVICE=cuda
Then uncomment the deploy.resources block on both inf-embedding and inf-reranker in your
compose file:
inf-embedding:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities: [gpu]
Apply the same deploy.resources block to inf-reranker.
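If you script the switch, the two .env settings can be applied idempotently. This is a small sketch, not a CoreCube tool: set_env_var and switch_inference_to_cuda are helper names invented here, and the file is assumed to be a plain KEY=VALUE dotenv file next to your compose file.

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE: set KEY=VALUE in a dotenv file, replacing any
# existing assignment for KEY and leaving all other lines untouched.
set_env_var() {
  file=$1 key=$2 value=$3
  touch "$file"
  # grep -v exits non-zero when nothing is left to print, which is fine here.
  grep -v "^${key}=" "$file" > "$file.tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$file.tmp"
  mv "$file.tmp" "$file"
}

# switch_inference_to_cuda [FILE]: apply the two CUDA settings (FILE defaults to .env).
switch_inference_to_cuda() {
  env_file=${1:-.env}
  set_env_var "$env_file" INFERENCE_IMAGE_TAG cuda-latest
  set_env_var "$env_file" INFERENCE_DEVICE cuda
}
```

Running docker compose config | grep corecube-inference afterwards shows the image tag Compose will actually use, which is a quick way to confirm the variables were picked up before restarting the sidecars.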
Apple Silicon (MPS)
On Apple Silicon Macs, Docker cannot expose Apple Metal/MPS to Linux containers. Run inference natively:
make deploy-local-mps
This starts the inference servers as native macOS Python processes and runs the rest of the stack in Docker.
Environment variables
| Variable | Default | Description |
|---|---|---|
| PORT | 7400 | Server port |
| CUBE_ADMIN_EMAIL | admin@example.com | Initial admin email |
| CUBE_ADMIN_PASSWORD | changeme123 | Initial admin password |
| PGVECTOR_URL | (none) | PostgreSQL bootstrap/admin connection string (required) |
| PGVECTOR_APP_URL | (none) | PostgreSQL corecube_app connection string (required) |
| PGVECTOR_MAINTENANCE_URL | (none) | PostgreSQL corecube_maintenance connection string (required) |
| PGVECTOR_POOL_MAX | 5 | Bootstrap/admin pool size per app instance |
| PGVECTOR_APP_POOL_MAX | 20 | App-role retrieval/query pool size per app instance |
| PGVECTOR_MAINTENANCE_POOL_MAX | 5 | Maintenance-role cleanup/repair pool size per app instance |
| ENCRYPTION_KEY | auto-generated | AES-256-GCM key for credential encryption |
| DATA_DIR | /data | Persistent data directory |
| SESSION_MAX_LIFETIME_HOURS | 720 | Absolute session lifetime in hours (default: 30 days) |
| SESSION_IDLE_TIMEOUT_MINUTES | 60 | Idle timeout: sessions expire after this much inactivity |
| S3STORAGE_S3_HOST_PORT | 8400 | Host port for the bundled object store's S3 API |
| S3STORAGE_MASTER_HOST_PORT | 8410 | Host port for the object store master UI/API |
| S3STORAGE_FILER_HOST_PORT | 8420 | Host port for the object store filer UI/API |
| STORAGE_PUBLIC_ENDPOINT | http://localhost:8400 | Browser-reachable URL for presigned uploads (set explicitly in production) |
| CC_CHAT_MAX_CONCURRENT | 100 | In-flight chat requests per app instance |
| CC_CHAT_KEY_MAX_CONCURRENT | 100 | In-flight chat requests per API key |
| CC_CHAT_USER_MAX_CONCURRENT | 8 | In-flight chat requests per effective user |
| CC_CHAT_PUBLIC_IP_MAX_CONCURRENT | 8 | In-flight public-key chat requests per source IP |
| CC_PROVIDER_RATE_LIMITS_ENABLED | true | Enforce model-catalog RPM/TPM limits before external provider calls |
The production Compose file publishes the application on 7400 and the bundled s3storage service on SeaweedFS's defaults — 8333 (S3 API), 9333 (master), and 8888 (filer). The inference sidecars communicate over the internal Docker network (9440/9450) and are not published to the host. The 84xx band (8400/8410/8420, overridable via S3STORAGE_*_HOST_PORT) applies only to the local-development Compose file. See Environment Variables → Object storage for details.
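The three PGVECTOR_*_POOL_MAX values are per app instance, so they add up when you scale out. With the defaults, one instance can hold up to 5 + 20 + 5 = 30 PostgreSQL connections. The sketch below works through that arithmetic under the assumption that the three pools are independent and may each fill completely; the function names are illustrative. A stock PostgreSQL server allows 100 connections (max_connections) and keeps 3 of them for superusers.

```shell
#!/bin/sh
# pg_connections_needed INSTANCES [ADMIN_POOL] [APP_POOL] [MAINTENANCE_POOL]
# Prints the worst-case number of PostgreSQL connections for INSTANCES app
# instances. Pool defaults mirror the documented defaults (5 / 20 / 5).
pg_connections_needed() {
  instances=$1
  admin_pool=${2:-5}
  app_pool=${3:-20}
  maintenance_pool=${4:-5}
  echo $(( instances * (admin_pool + app_pool + maintenance_pool) ))
}

# fits_max_connections INSTANCES MAX_CONNECTIONS [RESERVED]
# Succeeds when the worst case fits within max_connections minus RESERVED
# (PostgreSQL's superuser_reserved_connections, 3 by default).
fits_max_connections() {
  needed=$(pg_connections_needed "$1")
  [ "$needed" -le $(( $2 - ${3:-3} )) ]
}
```

For example, pg_connections_needed 3 prints 90, which fits a default server, while four instances would need 120 and require either a higher max_connections or smaller pools. The current server limit can be read with docker exec corecube-pgvector psql -U corecube -d corecube -tAc 'SHOW max_connections'.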
Docker secrets
Sensitive variables support the _FILE suffix pattern for Docker secrets:
environment:
  ENCRYPTION_KEY_FILE: /run/secrets/encryption_key
  PGVECTOR_URL_FILE: /run/secrets/pgvector_url
  PGVECTOR_APP_URL_FILE: /run/secrets/pgvector_app_url
  PGVECTOR_MAINTENANCE_URL_FILE: /run/secrets/pgvector_maintenance_url

secrets:
  encryption_key:
    file: ./secrets/encryption_key.txt
  pgvector_url:
    file: ./secrets/pgvector_url.txt
  pgvector_app_url:
    file: ./secrets/pgvector_app_url.txt
  pgvector_maintenance_url:
    file: ./secrets/pgvector_maintenance_url.txt
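The secret files referenced above have to exist before docker compose up. As a hedged sketch, the helpers below create the three connection-string files with owner-only permissions. The helper names are invented for this example, the role names and host come from the Compose file on this page, and writing the value without a trailing newline is an assumption about how the _FILE values are read.

```shell
#!/bin/sh
# random_hex BYTES: print BYTES random bytes as lowercase hex. Hex is URL-safe,
# so the result can be embedded in a connection string without escaping.
random_hex() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# write_secret DIR NAME VALUE: write VALUE to DIR/NAME.txt, readable by the owner only.
write_secret() {
  mkdir -p "$1"
  rm -f "$1/$2.txt"
  # Set the umask in a subshell so the new file is never world-readable.
  ( umask 077 && printf '%s' "$3" > "$1/$2.txt" )
}

# write_pgvector_secrets DIR ADMIN_PW APP_PW MAINTENANCE_PW
# Builds the three connection strings with the role names used on this page.
write_pgvector_secrets() {
  base='pgvector:5432/corecube'
  write_secret "$1" pgvector_url "postgresql://corecube:$2@$base"
  write_secret "$1" pgvector_app_url "postgresql://corecube_app:$3@$base"
  write_secret "$1" pgvector_maintenance_url "postgresql://corecube_maintenance:$4@$base"
}
```

A typical call is write_pgvector_secrets ./secrets "$(random_hex 24)" "$(random_hex 24)" "$(random_hex 24)". The admin password must match POSTGRES_PASSWORD on the pgvector service. Compose only mounts a secret into a container that requests it, so the corecube service also needs a service-level secrets list naming each entry.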
Data volumes
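CoreCube uses three persistent volumes. The fourth volume declared in the Compose file, executor-socket, only carries the executor's Unix socket and holds no durable data.
| Volume | Contents |
|---|---|
| corecube-data | Encryption key and app-local runtime files |
| pgvector-data | PostgreSQL app and knowledge data |
| s3storage-data | Original uploaded files (Library uploads and OCR) |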
Durable state is split across PostgreSQL, uploaded files, and the encryption key. A backup of only one layer is incomplete. See Backup & Recovery below.
URLs
| Service | URL |
|---|---|
| Admin Console | http://localhost:7400/admin |
| Headless API | http://localhost:7400/v1 |
| OpenAPI spec | http://localhost:7400/v1/openapi.json |
| Interactive API docs | http://localhost:7400/v1/docs |
| Health check | http://localhost:7400/health |
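The health check URL is handy in deployment scripts that must wait until the application is up before continuing. The function below is a minimal polling sketch. It assumes the endpoint answers with a 2xx status once CoreCube is ready, which this page does not spell out, and wait_for_health and HEALTH_PROBE are names introduced for the example.

```shell
#!/bin/sh
# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until the probe succeeds or ATTEMPTS is exhausted.
# HEALTH_PROBE can override the probe command (defaults to curl).
wait_for_health() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # curl -f turns HTTP errors (status >= 400) into a non-zero exit code.
    if ${HEALTH_PROBE:-curl -fsS -o /dev/null} "$url" 2>/dev/null; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

A typical use is docker compose up -d followed by wait_for_health http://localhost:7400/health 30 2, which waits up to roughly a minute and returns a non-zero status if the application never becomes reachable.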
Backup & recovery
Backup domains
| Domain | What | Command |
|---|---|---|
| Database | PostgreSQL app and knowledge data | docker exec corecube-pgvector pg_dump -U corecube corecube > backup.sql |
| Storage | Uploaded files | docker run --rm --volumes-from corecube-s3storage -v "$PWD":/backup alpine tar czf /backup/s3storage.tgz -C /data . |
| Encryption key | Credential and secret encryption key | Back up DATA_DIR/encryption.key separately |
Encryption key
The encryption key is required to decrypt connector credentials. If the key is lost, all connection credentials become permanently unreadable. Back up DATA_DIR/encryption.key separately from the data volume.
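Because a backup of only one domain is incomplete, it can help to capture all three in one step. The function below is a sketch under stated assumptions: the container names come from the Compose file on this page, the key path assumes the default DATA_DIR=/data, and backup_all is a name made up here. It mounts the storage volume with --volumes-from rather than by name, because Compose prefixes named volumes with the project name, so a bare s3storage-data may not refer to the volume the stack actually uses.

```shell
#!/bin/sh
# backup_all DEST_DIR: write a timestamped copy of all three backup domains.
# Assumptions: container names from the Compose file on this page and the
# default DATA_DIR=/data for the encryption key path.
backup_all() {
  mkdir -p "$1" || return 1
  dest=$(cd "$1" && pwd)   # docker bind mounts need an absolute path
  stamp=$(date +%Y%m%d-%H%M%S)

  # 1. Database: logical dump of app and knowledge data.
  docker exec corecube-pgvector pg_dump -U corecube corecube \
    > "$dest/db-$stamp.sql" || return 1

  # 2. Storage: archive /data as mounted in the object-store container.
  docker run --rm --volumes-from corecube-s3storage -v "$dest":/backup \
    alpine tar czf "/backup/s3storage-$stamp.tgz" -C /data . || return 1

  # 3. Encryption key: copy it out of the app container.
  docker cp corecube-app:/data/encryption.key "$dest/encryption-$stamp.key" || return 1

  echo "backup written to $dest"
}
```

After running backup_all /srv/backups/corecube, move the encryption-*.key file to a separate, access-controlled location so that the key and the data it protects are not stored together.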
Upgrading
docker compose pull
docker compose up -d
CoreCube applies schema migrations automatically on startup.
Security hardening
See Security Hardening for the full guide. Key points for Docker:
- The CoreCube container runs its server process as the unprivileged bun user. The entrypoint starts as root only to fix data-volume ownership, then drops privileges via su-exec before launching the server.
- The bundled executor sandbox is hardened in the shipped Compose file: a read-only root filesystem, all Linux capabilities dropped, a non-root user, and network_mode: none (no network access at all).
- PostgreSQL and the inference sidecars communicate over an internal Docker network and are not published to the host.
- The read_only, cap_drop, and explicit-UID settings are applied to executor only, not to the corecube service. To harden corecube further, add those settings yourself, keeping /data and the executor socket path writable.
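To confirm that the executor hardening is in effect on a running host, the settings can be read back with docker inspect. This is a minimal sketch: the function names are illustrative, and the expected values (read-only root filesystem, network mode none, user 65532:65532) are taken from the Compose file on this page.

```shell
#!/bin/sh
# executor_settings [CONTAINER]: print the effective hardening settings as
# "<read-only rootfs> <network mode> <user>".
executor_settings() {
  docker inspect \
    --format '{{.HostConfig.ReadonlyRootfs}} {{.HostConfig.NetworkMode}} {{.Config.User}}' \
    "${1:-corecube-executor}"
}

# executor_is_hardened [CONTAINER]: succeed only if the values match the
# settings shown in the Compose file on this page.
executor_is_hardened() {
  [ "$(executor_settings "$@")" = "true none 65532:65532" ]
}
```

executor_is_hardened returns a non-zero status when any of the three values differs, which makes it usable as a post-deploy check. Dropped capabilities can be inspected the same way with the {{.HostConfig.CapDrop}} field, and executor_settings corecube-app shows where the corecube service stands if you add similar settings to it.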