Freshness & Content Decay
Every connection has a freshness_hours setting (default 24). It is one number that does
two different jobs, measured against two different clocks. Understanding which clock applies where
is the key to setting it correctly — and the reason a tighter value does not mean "synced more
often."
freshness_hours does not control whether or when a connection syncs. It controls how quickly
older content is demoted in search ranking, and it defines the window the dashboard uses to call
a connection fresh or stale.
The two clocks
freshness_hours is compared against a different timestamp depending on what it is doing:
| Job | Timestamp it measures against | Advances when… |
|---|---|---|
| Search ranking decay | a document's last indexed time | that document's content is (re)processed |
| Dashboard fresh / stale | the connection's last sync time | any successful sync run completes |
These two clocks move independently, and that difference is the single most important thing on this page:
- Last sync advances every time the connector runs, even if nothing changed.
- Last indexed advances only when a document's content actually changes and is re-processed.
So a Confluence page that has not been edited in 30 days keeps its original "last indexed" time even if the connector syncs every hour. Re-syncing an unchanged page does not make it look fresher to the ranker.
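The difference between the two clocks can be sketched as a small simulation. This is a minimal illustration, not the product's sync code: the class names, the `content_hash` change detector, and `run_sync` are all hypothetical and exist only to show which timestamp moves when.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DocumentState:
    content_hash: str        # hypothetical change detector
    last_indexed: datetime   # the ranking clock


@dataclass
class ConnectionState:
    last_sync: datetime      # the dashboard clock


def run_sync(conn: ConnectionState, doc: DocumentState,
             fetched_hash: str, now: datetime) -> None:
    """Simulate one successful sync run over a single document."""
    conn.last_sync = now                     # advances on every run
    if fetched_hash != doc.content_hash:     # only changed content is re-processed
        doc.content_hash = fetched_hash
        doc.last_indexed = now


start = datetime(2024, 1, 1)
conn = ConnectionState(last_sync=start)
doc = DocumentState(content_hash="v1", last_indexed=start)

# Sync hourly for 30 days; the page is never edited.
for hour in range(1, 30 * 24 + 1):
    run_sync(conn, doc, "v1", start + timedelta(hours=hour))

print(conn.last_sync - start)    # 30 days, 0:00:00
print(doc.last_indexed - start)  # 0:00:00
```

After 720 sync runs the connection's last sync is current, but the document's last indexed time has not moved, so the ranker still sees a 30-day-old page.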
What freshness decay does
When a search runs, the freshness decay step re-scores candidate chunks so that recently indexed content ranks above older content. Each chunk's relevance score is multiplied by a freshness factor based on how long ago its document was last indexed:
- Content newer than freshness_hours → full score (multiplier = 1.0)
- Older than that → the score decays exponentially, halving once per additional freshness_hours window:

    age = now − document_last_indexed
    if age ≤ freshness_hours:
        multiplier = 1.0
    else:
        multiplier = 0.5 ^ ((age − freshness_hours) / freshness_hours)
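The rule above translates directly into a few lines of Python. This is a sketch of the formula as stated on this page, with ages expressed in hours; the function name and the input validation are assumptions, not the shipped implementation.

```python
def freshness_multiplier(age_hours: float, freshness_hours: float) -> float:
    """Return the score multiplier for a chunk whose document was last
    indexed `age_hours` ago, given the connection's window."""
    if freshness_hours <= 0:
        raise ValueError("freshness_hours must be positive")
    if age_hours <= freshness_hours:
        return 1.0  # inside the window: full score
    # Number of additional windows elapsed beyond the first one.
    excess_windows = (age_hours - freshness_hours) / freshness_hours
    return 0.5 ** excess_windows  # halve once per additional window


relevance = 0.8
print(relevance * freshness_multiplier(12, 24))  # 0.8 (still inside the window)
print(relevance * freshness_multiplier(48, 24))  # 0.4 (one window past)
```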
Freshness decay is a chunk-stage retrieval tool (freshness_decay) and ships active in all three
default presets (default-fast, default-balanced, default-accurate). It runs once per search —
there is no second, hidden decay applied at ingest time. See the
retrieval pipeline for where it sits in the overall flow.
Worked example
With freshness_hours = 24, scored by how long ago the document's content was last indexed:
| Content age | Excess over window | Score multiplier |
|---|---|---|
| ≤ 24h | — | 1.00 (100%) |
| 48h | 24h | 0.50 (50%) |
| 72h | 48h | 0.25 (25%) |
| 120h (5d) | 96h | 0.0625 (~6%) |
| 168h (7d) | 144h | ~0.016 (~1.6%) |
A document that has not changed in a week is ranked at roughly 1.6% of its semantic relevance — unless a floor stops the decay (next section).
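The table can also be read in reverse: given a multiplier you consider too low, how old can content get before it reaches that point? Rearranging the decay formula gives age = freshness_hours × (1 + log2(1 / target)). The helper below is a sketch of that rearrangement; its name is hypothetical, and it ignores any floor.

```python
import math


def hours_until_multiplier(target: float, freshness_hours: float) -> float:
    """Largest content age (in hours) at which the decay multiplier is
    still at least `target`, with no floor applied.

    Inverts multiplier = 0.5 ** ((age - freshness_hours) / freshness_hours).
    """
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    if freshness_hours <= 0:
        raise ValueError("freshness_hours must be positive")
    # log2(1 / target) is the number of halvings needed to reach the target.
    halvings = math.log2(1.0 / target)
    return freshness_hours * (1.0 + halvings)


print(hours_until_multiplier(0.5, 24))   # 48.0
print(hours_until_multiplier(0.25, 24))  # 72.0
```

With a 24-hour window, content drops to half its score at two days and a quarter at three; widening the window scales both thresholds proportionally.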
The freshness floor
The decay above is unbounded by default: all three shipped presets set
retrieval_freshness_floor = 0, so a very old document can decay arbitrarily close to zero. The
freshness floor is a separate, preset-level knob that puts a lower bound on the multiplier, which limits how large the penalty can get:

    effective_multiplier = max(decay_multiplier, retrieval_freshness_floor)
| retrieval_freshness_floor | Effect |
|---|---|
| 0 (default) | Floor disabled: decay runs unclamped; very old content can be buried. |
| 0.2 – 0.3 | Mild decay: freshness still tiebreaks, but old authoritative docs stay competitive. |
| 1.0 | Disables decay entirely: freshness no longer affects ranking. |
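To see the three rows side by side, the sketch below applies the floor to a week-old document under a 24-hour window. The function name and the assumption that the floor lies in the range 0 to 1 are illustrative; only the max() clamp itself comes from this page.

```python
def effective_multiplier(age_hours: float, freshness_hours: float,
                         floor: float = 0.0) -> float:
    """Decay multiplier clamped from below by the preset's freshness floor."""
    if freshness_hours <= 0:
        raise ValueError("freshness_hours must be positive")
    if not 0.0 <= floor <= 1.0:  # assumed valid range for the floor
        raise ValueError("floor must be between 0 and 1")
    if age_hours <= freshness_hours:
        decay = 1.0
    else:
        decay = 0.5 ** ((age_hours - freshness_hours) / freshness_hours)
    return max(decay, floor)


week = 168  # hours since the document was last indexed
print(effective_multiplier(week, 24, floor=0.0))   # 0.015625 (unclamped)
print(effective_multiplier(week, 24, floor=0.25))  # 0.25 (held at the floor)
print(effective_multiplier(week, 24, floor=1.0))   # 1.0 (decay disabled)
```

Note that the floor never raises the score of content that is still inside the window or only lightly decayed; it only takes effect once the decay multiplier falls below it.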
The two settings divide the work cleanly:
| Setting | Lives on | Controls |
|---|---|---|
| freshness_hours | the connection | how fast scores decay (the half-life) |
| retrieval_freshness_floor | the preset | how far down decay is allowed to push a score |
Full floor reference: Reranking → Freshness floor.
Freshness vs. sync cadence
A common assumption is "if I sync every day, a 24-hour window keeps everything fresh." That is not how it works, because incremental sync skips unchanged documents and leaves their last-indexed time untouched.
The ranking clock is last indexed (content age), not last synced. A page that has not been edited in 30 days has decayed for 30 days no matter how often the connector runs. Sync cadence governs how fast edits are picked up and what the dashboard badge shows — it does not reset the decay on stable content.
This makes freshness_hours a content-shelf-life setting for ranking: "how long after a document
was last written should it still count as fully current?" — not a sync-frequency setting.
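As a rough illustration of the shelf-life reading, the sketch below scores one page that was last edited 30 days ago under three different windows. The sync schedule does not appear anywhere in the calculation, which is the point. The function name is hypothetical and no floor is applied.

```python
def freshness_multiplier(age_hours: float, freshness_hours: float) -> float:
    """Decay multiplier from the formula on this page (no floor)."""
    if age_hours <= freshness_hours:
        return 1.0
    return 0.5 ** ((age_hours - freshness_hours) / freshness_hours)


PAGE_AGE_HOURS = 30 * 24  # last indexed 30 days ago, however often it is synced

for window in (24, 168, 720):
    m = freshness_multiplier(PAGE_AGE_HOURS, window)
    print(f"freshness_hours={window:>3}: multiplier={m:.3g}")
```

A 24-hour window leaves the page with a vanishingly small multiplier, a one-week window leaves it at roughly a tenth of its score, and a 30-day window keeps it at full score.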
The tension: one number, two clocks
Because the same freshness_hours value drives both jobs, the two can pull in different directions:
- Set it tight (e.g. 24) on a weekly sync, and the dashboard reports the connection stale for six days out of seven, because its last sync is up to 7 days old, well past a 24-hour window.
- Set it wide (e.g. 168) to keep the dashboard green on a weekly sync, and you also widen the ranking half-life to a full week.
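The "six days out of seven" figure follows from simple arithmetic, sketched below. The model is an assumption made for illustration: syncs land exactly on schedule, complete instantly, and the badge reads fresh while the time since the last sync is at most freshness_hours. The function name is hypothetical.

```python
def stale_fraction(freshness_hours: float, sync_interval_hours: float) -> float:
    """Share of each sync cycle during which the dashboard would report the
    connection as stale, under the idealized model described above."""
    if freshness_hours <= 0 or sync_interval_hours <= 0:
        raise ValueError("both arguments must be positive")
    fresh_share = min(1.0, freshness_hours / sync_interval_hours)
    return 1.0 - fresh_share


print(round(stale_fraction(24, 168), 3))  # 0.857, i.e. six days out of seven
print(stale_fraction(168, 168))           # 0.0, always fresh on a weekly sync
```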
How to choose freshness_hours
- Set it at or above your sync interval. This keeps a healthy connection reading fresh on the dashboard. Daily sync → 24+; weekly sync → 168+.
- Then think about content shelf-life for ranking. If your corpus is mostly stable, authoritative content (policies, runbooks, ADRs, legal) that stays valid for months, a tight window will demote it aggressively. Either widen freshness_hours, or set a retrieval_freshness_floor of 0.2–0.3 so old-but-relevant content stays competitive.
- Go tighter than your sync interval only deliberately: when recency genuinely drives relevance and you cannot sync more often (a fast-moving source where a three-day-old page really is less trustworthy than a fresh one). This nudges older content down between syncs. It is a ranking lever, not a data-currency lever: the underlying data still only refreshes on schedule.
Multi-connection searches
Freshness is applied per chunk, per connection. In a search that spans several connections, each result is decayed using:
- the window (freshness_hours) from its own connection, and
- the age from its own document's last-indexed time.
The window is a single value per connection — every document in a Confluence connection shares the same
freshness_hours curve, so you cannot give individual pages different decay rates. But documents sit at
different points on that curve depending on when each was last indexed.
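A cross-connection search can be sketched as follows. The `Chunk` structure, the field names, and the `rescore` function are hypothetical; the sketch only shows that each chunk carries its own connection's window and its own document's age into the decay formula, and it applies no floor.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str
    relevance: float        # score before freshness decay
    age_hours: float        # now minus its document's last indexed time
    freshness_hours: float  # window of the connection it belongs to


def freshness_multiplier(age_hours: float, freshness_hours: float) -> float:
    if age_hours <= freshness_hours:
        return 1.0
    return 0.5 ** ((age_hours - freshness_hours) / freshness_hours)


def rescore(chunks: list[Chunk]) -> list[tuple[float, Chunk]]:
    """Apply per-chunk decay, then sort by the decayed score, best first."""
    scored = [
        (c.relevance * freshness_multiplier(c.age_hours, c.freshness_hours), c)
        for c in chunks
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


results = rescore([
    Chunk("wiki", relevance=0.9, age_hours=72, freshness_hours=24),
    Chunk("policies", relevance=0.6, age_hours=72, freshness_hours=168),
])
for score, chunk in results:
    print(f"{chunk.source}: {score:.3f}")
# policies: 0.600
# wiki: 0.225
```

Both documents are three days old, but the wiki chunk sits two windows past its 24-hour limit while the policy chunk is still inside its 168-hour window, so the less relevant but undecayed chunk ranks first.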
Quick reference
| Question | Answer |
|---|---|
| Does freshness_hours decide when a sync runs? | No. Sync timing is the connection's sync schedule. |
| What does it decide? | Search-ranking decay + the dashboard fresh/stale window. |
| Which clock drives ranking decay? | The document's last indexed time (content age). |
| Which clock drives the dashboard badge? | The connection's last sync time. |
| Does re-syncing an unchanged page reset its decay? | No — last-indexed only advances when content changes. |
| Is decay bounded? | Only if retrieval_freshness_floor > 0. Default is 0 (unbounded). |
| Default value | 24 hours, integer, per connection. |
| Is it per-document or per-connection? | Window is per-connection; age is per-document. |