Pipeline, tools, and prompts
The preset Pipeline tab is the runtime assembly surface for a preset. It defines which tools run before retrieval, which retrieval profile is the default, which tools post-process retrieved chunks, which prompts shape the answer LLM, and which tools the answer LLM may call directly.
To open it, go to Admin Console → Configuration → Presets, open a preset, and select the Pipeline tab.
Runtime flow
When a preset is applied, CoreCube composes a resolved pipeline snapshot. Query-time code uses that snapshot instead of re-reading every tool and prompt row while a user waits for an answer.
The default path is:
- Run active query tools in position order.
- Run the preset's default retrieval tool if the query stage did not already produce chunks.
- Run active chunk tools in position order.
- Compose the answer system prompt from answer-stage prompt fragments plus every active tool's tool prompt.
- Send the grounded context and composed prompt to the preset's answer LLM.
- If the preset exposes retrieval or action capabilities, the answer LLM can call them in a bounded tool-call loop.
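The first three steps of the default path can be sketched as follows. This is a minimal illustration, not CoreCube's implementation: the `Tool` and `Snapshot` classes, the `run_default_path` function, and the assumption that a query tool returns a `(query, chunks)` pair are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    """Hypothetical tool record; only the fields this sketch needs."""
    slug: str
    position: int
    active: bool
    run: Callable  # call signature depends on the stage (assumed)


@dataclass
class Snapshot:
    """Hypothetical resolved pipeline snapshot for one preset."""
    query_tools: List[Tool]
    default_retrieval: Tool
    chunk_tools: List[Tool]


def ordered_active(tools):
    """Return only the active tools, sorted by position."""
    return sorted((t for t in tools if t.active), key=lambda t: t.position)


def run_default_path(snapshot, query):
    chunks = []
    # Step 1: query tools run in position order. Each is assumed to return
    # the (possibly rewritten) query plus any chunks it already retrieved.
    for tool in ordered_active(snapshot.query_tools):
        query, produced = tool.run(query)
        chunks.extend(produced)
    # Step 2: the default retrieval runs only if the query stage
    # did not already produce chunks.
    if not chunks:
        chunks = snapshot.default_retrieval.run(query)
    # Step 3: chunk tools post-process the candidates in position order.
    for tool in ordered_active(snapshot.chunk_tools):
        chunks = tool.run(chunks)
    return query, chunks
```

Prompt composition, the answer-LLM call, and the capability loop are left out of the sketch; they would consume the `query` and `chunks` returned here.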
Tool stages
| Tool type | Pipeline location | What it can do | LLM-callable |
|---|---|---|---|
| query | Before retrieval | Rewrite or expand the user query, then request additional retrieval runs. | No |
| retrieval | Default retrieval or answer capability | Store a full retrieval profile: candidate pools, fusion weights, freshness floor, reranker, final top-K. | Yes, when not the default retrieval |
| chunk | After retrieval | Reorder, filter, expand, or annotate candidate chunks before they reach the answer LLM. | No |
| action | Answer capability | Execute a code tool with an OpenAI-style function schema, scoped to the caller. | Yes |
Only one retrieval tool is the default retrieval. Additional retrieval tools are exposed as LLM-callable capabilities, so the answer model can ask for a narrower or wider search when the initial evidence is insufficient.
Retrieval tools store reranker configuration in their settings.reranker_model field. They do not use the generic tool modelId; that field is reserved for tools that need an LLM or executor model.
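As an illustration of where the reranker lives, here is a hypothetical retrieval tool record. Only `settings.reranker_model` and `modelId` come from the description above; the slug, the reranker name, the `final_top_k` key, and the `reranker_for` helper are assumptions made up for this sketch.

```python
# Hypothetical retrieval tool record (shape assumed for illustration).
retrieval_tool = {
    "slug": "retrieval-narrow",      # made-up slug
    "type": "retrieval",
    "modelId": None,                 # not used by retrieval tools
    "settings": {
        "reranker_model": "example-reranker",  # made-up model name
        "final_top_k": 5,            # assumed key name
    },
}


def reranker_for(tool):
    """Read the reranker from settings.reranker_model, never from modelId."""
    if tool.get("type") != "retrieval":
        return None
    return (tool.get("settings") or {}).get("reranker_model")
```

Reading the value through one helper like this keeps callers from falling back to `modelId` by accident.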
Prompt stages
Prompt fragments are reusable building blocks. A fragment can be attached to a preset stage or used as a tool prompt.
| Stage | Typical fragment types | Runtime use |
|---|---|---|
| Query | retrieval_semantic_query, retrieval_keyword_query | Supplied to query tools that rewrite the user's question before retrieval. |
| Chunk | chunk_selection | Supplied to chunk tools that ask an LLM to classify or expand candidate passages. |
| Answer | answer, answer_citations, answer_attachments | Composed into the final answer-LLM system prompt. |
| Tool | tool_prompt | Attached to each tool. Active tool prompts are appended to the answer system prompt. |
| Ingestion | ocr_extraction | Used by chat-API OCR providers while ingesting scanned or image-based documents. |
The answer system prompt is composed in this order:
- answer fragments in answer-stage position order.
- Tool prompts for the active default retrieval, retrieval capabilities, query tools, chunk tools, and action capabilities.
- answer_citations fragments.
- answer_attachments fragments.
- A non-editable invariant footer that enforces language, untrusted-context handling, refusal rules, and permission-leak prevention.
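The composition order above can be sketched as a simple concatenation. This is an assumption-laden illustration: the `compose_answer_prompt` function, the `(position, text)` tuple shape, the blank-line separator, and the footer placeholder are all hypothetical, and the real footer text is not reproduced here.

```python
# Placeholder for the non-editable footer; the real text is not shown here.
INVARIANT_FOOTER = "[invariant footer]"


def compose_answer_prompt(answer_fragments, tool_prompts,
                          citation_fragments, attachment_fragments):
    """Join prompt parts in the documented order.

    answer_fragments: iterable of (position, text) pairs (assumed shape).
    The other arguments are iterables of text, already in the desired order.
    """
    ordered_answer = [text for _, text in
                      sorted(answer_fragments, key=lambda f: f[0])]
    parts = (
        ordered_answer
        + list(tool_prompts)
        + list(citation_fragments)
        + list(attachment_fragments)
        + [INVARIANT_FOOTER]      # always last, never editable
    )
    # Skip empty fragments so they do not leave stray blank sections.
    return "\n\n".join(p for p in parts if p)
```

Because the footer is appended after every editable part, no fragment or tool prompt can be positioned after it.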
Editing a shared fragment affects every preset or tool that references it. The Admin Console shows affected presets and tools before you save.
Seeded default presets
Fresh installs bind the organization to default-balanced. The seeded defaults are:
| Preset | Default retrieval | Query tools | Chunk tools | Shape |
|---|---|---|---|---|
| default-fast | retrieval-fast | None | freshness_decay | Small candidate pools, final top-K 4, lowest latency. |
| default-balanced | retrieval-balanced | expand_multi_query | freshness_decay | Balanced candidate pools, query expansion, final top-K 7. |
| default-accurate | retrieval-accurate | expand_multi_query | freshness_decay, llm_select_and_expand | Wide candidate pools, LLM chunk selection, final top-K 12. |
These defaults are editable from the preset page. Use Reset to defaults when you want to return one of the seeded presets to its shipped shape, or duplicate a preset before experimenting.
Seeded tools
| Tool slug | Type | Purpose |
|---|---|---|
| retrieval-fast | Retrieval | Lean retrieval profile for low-latency answers. |
| retrieval-balanced | Retrieval | Mid-tier retrieval profile used by the fresh-install default. |
| retrieval-accurate | Retrieval | High-recall retrieval profile with deeper candidate pools. |
| expand_multi_query | Query | Uses an LLM to create semantic and keyword variants, then fuses the resulting streams. |
| freshness_decay | Chunk | Re-scores chunks toward recently synced content, bounded by the retrieval freshness floor. |
| llm_select_and_expand | Chunk | Uses an LLM to select relevant anchors and expand them with neighboring chunks. |
| noop_chunk | Chunk | Pass-through chunk tool, useful when you need an explicit no-op chunk stage. |
Answer model and loop bounds
The answer stage has its own model selection. A preset with tools or prompts but no answer model is incomplete until you pick one.
Two preset-level limits protect runtime tool execution. They apply to deterministic query/chunk tool stages and to LLM-emitted retrieval/action capability calls:
| Field | Default | Meaning |
|---|---|---|
| max_tool_calls_per_request | 8 | Maximum tool invocations consumed by one stage budget. |
| max_tool_call_wall_ms | 60000 | Maximum wall-clock time spent in one stage budget. |
If either bound is exceeded, the request fails fast with a typed tool-loop error instead of running unbounded work. The deterministic query/chunk pipeline and the answer LLM's capability loop each receive their own bounded budget.
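A budget of this kind can be sketched as a small counter with a deadline. The `ToolBudget` and `ToolLoopError` names, the `charge` method, and the check-before-each-call behavior are assumptions for illustration; only the two limits and their defaults come from the table above.

```python
import time


class ToolLoopError(RuntimeError):
    """Hypothetical typed error raised when a budget is exhausted."""


class ToolBudget:
    """Tracks tool invocations and elapsed wall-clock time for one stage."""

    def __init__(self, max_calls=8, max_wall_ms=60000, clock=time.monotonic):
        self.max_calls = max_calls      # max_tool_calls_per_request
        self.max_wall_ms = max_wall_ms  # max_tool_call_wall_ms
        self._clock = clock             # injectable for testing
        self._start = clock()
        self.calls = 0

    def charge(self):
        """Call before each tool invocation; raises if a bound is exceeded."""
        elapsed_ms = (self._clock() - self._start) * 1000
        if elapsed_ms > self.max_wall_ms:
            raise ToolLoopError("wall-clock budget exceeded")
        if self.calls >= self.max_calls:
            raise ToolLoopError("tool-call budget exceeded")
        self.calls += 1


# Each stage gets its own budget, so one cannot starve the other.
pipeline_budget = ToolBudget()
capability_budget = ToolBudget()
```

Keeping two separate instances mirrors the statement that the deterministic pipeline and the capability loop are bounded independently.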
Operational notes
- Streaming chat currently supports retrieval and chunk processing before answer generation. If a preset exposes LLM-callable retrieval or action capabilities, use non-streaming chat for that preset until streaming tool-call deltas are supported.
- Tool execution is scoped to the caller's resolved compartments and sensitivity ceiling. A tool cannot widen the API key's or the user's permissions.
- Code tools execute through the executor sidecar with timeout, memory, and egress allowlist controls.
Related
- System prompts — fragment types, composition order, and examples.
- Retrieval — fields stored inside retrieval tools.
- How retrieval works — the end-to-end query flow.