
Retrieval

Retrieval is the first ranking pass of the pipeline. Given a query, it produces a candidate list that the reranker and the LLM will work from. Get this stage wrong and no amount of clever reranking will recover — you cannot rank what was never retrieved.

CoreCube uses hybrid retrieval: two parallel searches (dense vector + sparse keyword) that each produce a ranked list, fused into a single ordering by Reciprocal Rank Fusion (RRF). The two streams catch different things:

  • Dense (vector) finds paraphrases, related concepts, fuzzy matches. Strong on natural-language questions.
  • Sparse (FTS) finds exact tokens, identifiers, codes, part numbers, file paths. Strong on lookups.

Most production queries are a mix of both, which is why the default is hybrid.

Effect is immediate

Every field on this page applies to the next query — no rebuild, no reindex, no waiting. Switch presets, tweak a weight, and the next search reflects it.


Dense top-K

The number of candidates the dense (vector) search returns before fusion.

Dense search returns the top-K chunks most similar to the query, ranked by cosine distance. Those candidates are then merged with the sparse list via RRF.

Value | Effect
--- | ---
10–20 | Tight pool: risks missing semantically relevant chunks that ranked just below the cutoff. Use only when latency is critical.
20–50 (typical) | Healthy pool: captures most semantically relevant chunks for the reranker to choose from.
50–100 | Wide pool: useful when dense search is missing the right chunks at smaller K. Diminishing returns above 100.
> 100 | Reranker territory: the rerank pool caps how many candidates are fed to the reranker anyway, so going higher is wasted.

Raise this when you suspect the right chunks are being cut before they reach the reranker, for example when a search for a known-good chunk in the index returns it at a rank worse than 20.
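The check described above can be sketched in a few lines. This is an illustrative sketch only: the toy 2-d embeddings, the chunk ids, and the helper names (`cosine_distance`, `dense_ranking`, `survives_cutoff`) are hypothetical and not part of CoreCube.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def dense_ranking(query_vec, chunk_vecs):
    """Return chunk ids ordered from closest to farthest from the query."""
    return sorted(chunk_vecs, key=lambda cid: cosine_distance(query_vec, chunk_vecs[cid]))

def survives_cutoff(chunk_id, ranking, top_k):
    """Return the 1-based rank of chunk_id and whether it makes the top-K cut."""
    rank = ranking.index(chunk_id) + 1
    return rank, rank <= top_k

# Toy embeddings, made up for illustration.
chunks = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "known_good": [0.7, 0.7],
    "d": [0.0, 1.0],
}
query = [1.0, 0.2]

ranking = dense_ranking(query, chunks)
print(ranking)
print(survives_cutoff("known_good", ranking, top_k=2))
print(survives_cutoff("known_good", ranking, top_k=3))
```

If the known-good chunk is found but falls outside the current K, raising dense top-K is the fix; if it ranks very far down, the embedding or chunking is more likely the problem than the cutoff.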


Sparse top-K

The number of candidates the sparse (full-text keyword) search returns before fusion.

Mirror of dense top-K but for the FTS path. Uses Postgres websearch_to_tsquery + ts_rank under the hood.

Same intuitions apply; typical values are 20–50. It is usually matched to dense top-K so that neither stream dominates by sheer volume of candidates. The two pools are independent: a chunk that appears in the dense top-K but not in the sparse top-K is still considered, because the union of both lists goes into RRF.

Asymmetric tuning: if your corpus is keyword-heavy (logs, code, identifiers) you may want sparse > dense; if it is prose-heavy (articles, transcripts) the reverse.
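For orientation, here is roughly what a top-K query built on websearch_to_tsquery and ts_rank can look like. This is a sketch, not CoreCube's actual SQL: the table name `chunks`, the column `content_tsv`, the `english` text search configuration, and the helper `build_fts_query` are all assumptions. The `%(name)s` placeholders follow the named-parameter style used by psycopg.

```python
def build_fts_query(table="chunks", tsv_column="content_tsv", config="english"):
    """Build a parameterized top-K full-text query for Postgres.

    Table, column, and config names are hypothetical. They are interpolated
    into the SQL text, so they must be trusted constants, never user input.
    The user query and the limit are passed as bound parameters.
    """
    return (
        f"SELECT id, ts_rank({tsv_column}, query) AS rank\n"
        f"FROM {table}, websearch_to_tsquery('{config}', %(q)s) AS query\n"
        f"WHERE {tsv_column} @@ query\n"
        f"ORDER BY rank DESC\n"
        f"LIMIT %(top_k)s"
    )

sql = build_fts_query()
# websearch_to_tsquery accepts quoted phrases and a leading minus for exclusion.
params = {"q": '"error 0x80070005" -windows', "top_k": 40}
print(sql)
```

The `LIMIT` parameter plays the role of sparse top-K: it caps how many keyword matches are handed to fusion.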


RRF k

The dampening constant in the Reciprocal Rank Fusion formula.

For each chunk, RRF computes:

score(chunk) = vector_weight × 1 / (rrf_k + vec_rank)
             + fts_weight × 1 / (rrf_k + fts_rank)

rrf_k controls how steeply rank position matters. The standard value 60 comes from the original RRF paper and is appropriate for top-10 retrieval.
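A minimal sketch of the formula above, assuming 1-based ranks and assuming that a chunk missing from one stream simply receives no contribution from that stream. The function name `rrf_fuse` and the chunk ids are hypothetical; the default weights are the seeded values from this page.

```python
def rrf_fuse(dense_ids, sparse_ids, rrf_k=60, vector_weight=1.2, fts_weight=0.8):
    """Fuse two ranked id lists with weighted Reciprocal Rank Fusion.

    Ranks are 1-based (an assumption). The result covers the union of
    both lists, sorted by fused score, highest first.
    """
    scores = {}
    for rank, cid in enumerate(dense_ids, start=1):
        scores[cid] = scores.get(cid, 0.0) + vector_weight / (rrf_k + rank)
    for rank, cid in enumerate(sparse_ids, start=1):
        scores[cid] = scores.get(cid, 0.0) + fts_weight / (rrf_k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

dense = ["c1", "c2", "c3"]   # dense stream, best first
sparse = ["c3", "c4", "c1"]  # sparse stream, best first

fused = rrf_fuse(dense, sparse)
for cid, score in fused:
    print(cid, round(score, 5))
```

In this toy input, c1 and c3 appear in both streams and end up ahead of c2 and c4, which each appear in only one.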

Value | Effect
--- | ---
10–30 | Stronger top-of-list bias. Rank 1 outweighs rank 5 by roughly 1.1–1.4×, versus about 1.07× at the default. Use when only the very top results matter.
60 (default) | Smooth decay across the top ~50 results. Documents that appear in both streams reliably rise.
100–200 | Mild decay. Mid-ranked items contribute meaningfully. Use when you want a broad, diverse candidate set.
> 200 | Near-linear fusion. Rank position barely matters relative to whether a chunk hit at all.

Leave at 60 unless you have a measured reason. The value is a published default and works well for top-K = 10 retrieval.
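To get a feel for how rrf_k changes the decay before touching the setting, you can compare how much more one stream contributes at a high rank than at a lower one. The helper `rank_ratio` below is a hypothetical illustration derived directly from the 1 / (rrf_k + rank) term.

```python
def rank_ratio(rrf_k, low_rank=1, high_rank=5):
    """How many times more a chunk at low_rank contributes than one at high_rank.

    Derived from the per-stream term 1 / (rrf_k + rank):
    (1 / (k + low)) / (1 / (k + high)) = (k + high) / (k + low).
    """
    return (rrf_k + high_rank) / (rrf_k + low_rank)

for k in (10, 60, 200):
    print(
        f"rrf_k={k}: "
        f"rank 1 vs 5 = {rank_ratio(k):.3f}x, "
        f"rank 1 vs 50 = {rank_ratio(k, 1, 50):.3f}x"
    )
```

Smaller values of rrf_k widen the gap between the top of the list and the rest; larger values flatten it.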


Vector weight

Multiplier on the dense stream's contribution to fused rank.

Combined with FTS weight, this is how you bias hybrid retrieval toward semantic matches over keyword matches (or vice versa).

Value | Effect
--- | ---
0 | Disables the vector stream entirely: hybrid degrades to pure FTS.
1.0 | Equal voice with FTS only if FTS weight is also 1.0. Against the seeded FTS default of 0.8 this is still a slight vector bias.
1.2 (default) | Seeded default. Slight vector bias over FTS, which defaults to 0.8.
1.5–2.0 | Vector-biased. Semantic matches outrank keyword matches when both hit. Useful for question-style queries.
> 3 | Heavy vector bias. Use only when measured to help; it is easy to over-tune and lose lookup precision.

Tune as a ratio, not two axes

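Vector weight and FTS weight act as a ratio. Setting vector_weight = 1.5, fts_weight = 1.0 produces the same ordering as vector_weight = 1.0, fts_weight ≈ 0.67, because multiplying both weights by the same constant scales every fused score equally. The seeded defaults of 1.2 and 0.8 are that same 1.5 ratio. Pick one weight to hold fixed (for example at 1.0) and adjust the other. Two-knob tuning quickly becomes incoherent.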
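The ratio claim is easy to verify on toy data. The sketch below assumes 1-based ranks and the RRF formula from this page; the helper `fused_order` and the chunk ids are hypothetical.

```python
def fused_order(dense_ids, sparse_ids, vector_weight, fts_weight, rrf_k=60):
    """Return chunk ids ordered by weighted RRF score, highest first."""
    scores = {}
    for rank, cid in enumerate(dense_ids, start=1):
        scores[cid] = scores.get(cid, 0.0) + vector_weight / (rrf_k + rank)
    for rank, cid in enumerate(sparse_ids, start=1):
        scores[cid] = scores.get(cid, 0.0) + fts_weight / (rrf_k + rank)
    # Sort by descending score; break exact ties by id for determinism.
    return [cid for cid, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

dense = ["c1", "c2", "c3", "c4"]
sparse = ["c4", "c3", "c5", "c1"]

order_a = fused_order(dense, sparse, vector_weight=1.5, fts_weight=1.0)
order_b = fused_order(dense, sparse, vector_weight=1.0, fts_weight=1.0 / 1.5)
order_c = fused_order(dense, sparse, vector_weight=1.0, fts_weight=1.5)

print(order_a)
print(order_a == order_b)  # same ratio, same ordering
print(order_a == order_c)  # different ratio, ordering may change
```

Only the third call changes the ratio between the two weights, and it is the only one whose ordering differs.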


FTS weight

Multiplier on the sparse (full-text) stream's contribution to fused rank.

Symmetric to vector weight; the seeded default is 0.8. Raise it above 0.8 when:

  • Your corpus is identifier-heavy: codes, part numbers, error messages, file paths, ticket IDs.
  • Your users tend to type exact strings rather than questions.
  • You have measured that pure FTS results are usually more relevant than pure vector results on your evaluation set.

Set to 0 to disable the FTS stream (hybrid degrades to pure vector).


Min score

The minimum fused score (0–1) a chunk must reach to be returned to the reranker.

After RRF, every candidate has a normalized fused score in [0, 1]. Min score filters out the weak tail of that ranking.

Value | Effect
--- | ---
0 | No floor: every ranked candidate flows through. Useful for debugging or for mode: 'fts' callers that want every tsquery match.
0.4 / 0.3 / 0.2 (default) | Balanced: discards obviously irrelevant matches while keeping borderline ones for the reranker to judge. Seeded as fast 0.4 / balanced 0.3 / accurate 0.2.
0.5–0.7 | Aggressive: only confident hits survive. Use in strict-evidence flows where you would rather return zero results than a wrong one.
> 0.7 | Very aggressive: risks empty result sets on niche queries. Pair with a graceful "no results" UX in the calling app.

The floor applies to all three search modes (hybrid, vector, fts). Score normalization happens before the floor is applied, so the threshold means the same thing across modes.
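A sketch of how the floor behaves once scores are already normalized to [0, 1]. How CoreCube normalizes is not described here, so the example takes normalized scores as given; the inclusive comparison (a score equal to the floor passes) is an assumption, and `apply_min_score` is a hypothetical helper.

```python
def apply_min_score(candidates, min_score):
    """Keep (chunk_id, score) pairs whose normalized score reaches the floor.

    Scores are assumed to be normalized to [0, 1] already. The comparison
    is inclusive, which is an assumption about the real implementation.
    """
    return [(cid, score) for cid, score in candidates if score >= min_score]

# Made-up normalized scores, best first.
candidates = [("c1", 0.82), ("c2", 0.47), ("c3", 0.31), ("c4", 0.12)]

for floor in (0.0, 0.3, 0.9):
    kept = apply_min_score(candidates, floor)
    if kept:
        print(floor, [cid for cid, _ in kept])
    else:
        # A high floor can legitimately filter out everything.
        print(floor, "no results")
```

The last case is the one to plan for when running with a high floor: the caller receives an empty list and needs a sensible way to present it.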


Setting reference

When defaults differ, values are listed as default-fast / default-balanced / default-accurate.

Setting key | Type | Default | Range
--- | --- | --- | ---
retrieval_dense_top_k | integer | 20 / 40 / 80 | 1–500
retrieval_sparse_top_k | integer | 20 / 40 / 80 | 1–500
retrieval_rrf_k | integer | 60 | 1–500
retrieval_vector_weight | float | 1.2 | 0–5
retrieval_fts_weight | float | 0.8 | 0–5
retrieval_min_score | float | 0.4 / 0.3 / 0.2 | 0–1

The ranges above are recommended operating bounds, not enforced limits. The preset update endpoint (PUT /api/presets/:id) stores whatever numeric value you send and the runtime reads it back as-is — no clamping, no rejection. Stay within these ranges unless you have measured a reason not to.
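Because nothing is clamped or rejected on the server side, a small client-side check before sending an update can catch typos such as a min score of 3 instead of 0.3. The sketch below assumes the recommended bounds read from the Setting reference table (1–500 for the integer settings, 0–5 for the weights, 0–1 for min score); `RECOMMENDED_RANGES` and `out_of_range` are hypothetical names, and the request body format of the preset endpoint is not shown because it is not documented here.

```python
# Recommended operating bounds as (min, max), inclusive.
# Values are taken from the Setting reference table on this page.
RECOMMENDED_RANGES = {
    "retrieval_dense_top_k": (1, 500),
    "retrieval_sparse_top_k": (1, 500),
    "retrieval_rrf_k": (1, 500),
    "retrieval_vector_weight": (0, 5),
    "retrieval_fts_weight": (0, 5),
    "retrieval_min_score": (0, 1),
}

def out_of_range(settings):
    """Return {key: value} for known settings outside the recommended bounds.

    Unknown keys are ignored; this is a warning aid, not a validator.
    """
    flagged = {}
    for key, value in settings.items():
        bounds = RECOMMENDED_RANGES.get(key)
        if bounds is None:
            continue
        low, high = bounds
        if not (low <= value <= high):
            flagged[key] = value
    return flagged

proposed = {
    "retrieval_dense_top_k": 40,
    "retrieval_rrf_k": 60,
    "retrieval_vector_weight": 1.2,
    "retrieval_min_score": 3,  # likely a typo for 0.3
}
print(out_of_range(proposed))
```

Treat a non-empty result as a prompt to double-check, not as a hard error, since going outside the bounds is allowed when you have measured a reason for it.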

