Freshness & Content Decay

Every connection has a freshness_hours setting (default 24). It is one number that does two different jobs, measured against two different clocks. Understanding which clock applies where is the key to setting it correctly — and the reason a tighter value does not mean "synced more often."

In one sentence

freshness_hours does not control whether or when a connection syncs. It controls how quickly older content is demoted in search ranking, and it defines the window the dashboard uses to call a connection fresh or stale.

The two clocks

freshness_hours is compared against a different timestamp depending on what it is doing:

Job | Timestamp it measures against | Advances when…
Search ranking decay | a document's last indexed time | that document's content is (re)processed
Dashboard fresh / stale | the connection's last sync time | any successful sync run completes

These two clocks move independently, and that difference is the single most important thing on this page:

  • Last sync advances every time the connector runs, even if nothing changed.
  • Last indexed advances only when a document's content actually changes and is re-processed.

So a Confluence page that has not been edited in 30 days keeps its original "last indexed" time even if the connector syncs every hour. Re-syncing an unchanged page does not make it look fresher to the ranker.
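
The difference between the two clocks can be sketched in a few lines. This is an illustration only: the Document class, the incremental_sync function, and hash-based change detection are assumptions made for the example, not the product's actual connector code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
import hashlib


@dataclass
class Document:
    content_digest: str
    last_indexed: datetime


def digest_of(content: str) -> str:
    # Assumption: changes are detected by comparing content hashes.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def incremental_sync(index: dict, source: dict, now: datetime) -> datetime:
    """Hypothetical incremental sync: re-index only documents whose content changed.

    Returns the connection's new last-sync time, which advances on every run.
    """
    for doc_id, content in source.items():
        digest = digest_of(content)
        existing = index.get(doc_id)
        if existing is None or existing.content_digest != digest:
            # New or edited content: re-process it and advance last_indexed.
            index[doc_id] = Document(content_digest=digest, last_indexed=now)
        # Unchanged content is skipped, so its last_indexed stays where it was.
    return now


start = datetime(2024, 1, 1)
index = {}
source = {"page-1": "Deployment runbook v1"}

last_sync = incremental_sync(index, source, start)
# Sync hourly for 30 days without any edit to the page.
for hour in range(1, 30 * 24 + 1):
    last_sync = incremental_sync(index, source, start + timedelta(hours=hour))

print(last_sync - start)                     # 30 days, 0:00:00
print(index["page-1"].last_indexed - start)  # 0:00:00
```

After 720 hourly runs the connection's last sync is current, while the page's last-indexed time has not moved, which is the age the ranker sees.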

What freshness decay does

When a search runs, the freshness decay step re-scores candidate chunks so that recently indexed content ranks above older content. Each chunk's relevance score is multiplied by a freshness factor based on how long ago its document was last indexed:

  • Content newer than freshness_hours → full score (multiplier = 1.0)
  • Older than that → the score decays exponentially, halving once per additional freshness_hours window:

    age = now − document_last_indexed

    if age ≤ freshness_hours:
        multiplier = 1.0
    else:
        multiplier = 0.5 ^ ((age − freshness_hours) / freshness_hours)
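
For readers who prefer runnable code, the rule above might be written as the following Python sketch. The function name freshness_multiplier and the check for a positive window are assumptions for illustration; only the formula itself comes from this page.

```python
def freshness_multiplier(age_hours: float, freshness_hours: float) -> float:
    """Return the score multiplier for content last indexed age_hours ago."""
    if freshness_hours <= 0:
        # Assumption: a non-positive window is treated as invalid input.
        raise ValueError("freshness_hours must be positive")
    if age_hours <= freshness_hours:
        return 1.0  # still inside the window: full score
    excess = age_hours - freshness_hours
    # Halve once for every additional freshness_hours of age.
    return 0.5 ** (excess / freshness_hours)


for age in (12, 24, 48, 72, 120, 168):
    print(f"{age:>4}h -> {freshness_multiplier(age, 24):.4f}")
```

With a 24-hour window, a 48-hour-old document scores 0.5 and a 168-hour-old one scores 0.5 ** 6 = 0.015625.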

Freshness decay is a chunk-stage retrieval tool (freshness_decay) and ships active in all three default presets (default-fast, default-balanced, default-accurate). It runs once per search — there is no second, hidden decay applied at ingest time. See the retrieval pipeline for where it sits in the overall flow.

Worked example

With freshness_hours = 24, scored by how long ago the document's content was last indexed:

Content age | Excess over window | Score multiplier
≤ 24h | none | 1.00 (100%)
48h | 24h | 0.50 (50%)
72h | 48h | 0.25 (25%)
120h (5d) | 96h | 0.0625 (~6%)
168h (7d) | 144h | ~0.016 (~1.6%)

A document that has not changed in a week is ranked at roughly 1.6% of its semantic relevance — unless a floor stops the decay (next section).

The freshness floor

The decay above has no lower bound by default: all three shipped presets set retrieval_freshness_floor = 0, so a very old document's multiplier can decay arbitrarily close to zero. The freshness floor is a separate, preset-level knob that limits the penalty by putting a lower bound on the multiplier:

effective_multiplier = max(decay_multiplier, retrieval_freshness_floor)

retrieval_freshness_floor | Effect
0 (default) | Floor disabled: decay runs unclamped; very old content can be buried.
0.2 to 0.3 | Mild decay: freshness still tiebreaks, but old authoritative docs stay competitive.
1.0 | Disables decay entirely: freshness no longer affects ranking.
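
A minimal sketch of how the floor interacts with the decay, assuming the formulas given on this page. The function name effective_multiplier and its parameter names are hypothetical.

```python
def effective_multiplier(age_hours: float, freshness_hours: float,
                         freshness_floor: float = 0.0) -> float:
    """Decay multiplier clamped from below by the preset's freshness floor."""
    if age_hours <= freshness_hours:
        decay = 1.0
    else:
        decay = 0.5 ** ((age_hours - freshness_hours) / freshness_hours)
    # The floor never raises a multiplier that is already above it.
    return max(decay, freshness_floor)


week_old = 168  # hours since the document was last indexed
for floor in (0.0, 0.25, 1.0):
    print(floor, effective_multiplier(week_old, 24, floor))
```

For a week-old document on a 24-hour window, a floor of 0 leaves the multiplier at 0.015625, a floor of 0.25 holds it at 0.25, and a floor of 1.0 keeps the full score.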

The two settings divide the work cleanly:

Setting | Lives on | Controls
freshness_hours | the connection | how fast scores decay (the half-life)
retrieval_freshness_floor | the preset | how far down decay is allowed to push a score

Full floor reference: Reranking → Freshness floor.

Freshness vs. sync cadence

A common assumption is "if I sync every day, a 24-hour window keeps everything fresh." That is not how it works, because incremental sync skips unchanged documents and leaves their last-indexed time untouched.

Syncing more often does not refresh stable content

The ranking clock is last indexed (content age), not last synced. A page that has not been edited in 30 days has decayed for 30 days no matter how often the connector runs. Sync cadence governs how fast edits are picked up and what the dashboard badge shows — it does not reset the decay on stable content.

This makes freshness_hours a content-shelf-life setting for ranking: "how long after a document was last written should it still count as fully current?" — not a sync-frequency setting.

The tension: one number, two clocks

Because the same freshness_hours value drives both jobs, the two can pull in different directions:

  • Set it tight (e.g. 24) on a weekly sync, and the dashboard reports the connection stale for six days out of seven — because its last sync is up to 7 days old, well past a 24-hour window.
  • Set it wide (e.g. 168) to keep the dashboard green on a weekly sync, and you also widen the ranking half-life to a full week.
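
The dashboard side of this trade-off can be estimated with a small calculation. The sketch below assumes syncs succeed exactly on schedule and complete instantly; stale_fraction is a hypothetical helper, not a product API.

```python
def stale_fraction(sync_interval_hours: float, freshness_hours: float) -> float:
    """Fraction of each sync cycle during which the last sync is older than the window."""
    if sync_interval_hours <= 0:
        raise ValueError("sync_interval_hours must be positive")
    # The connection reads fresh for the first freshness_hours after a sync,
    # then stale until the next sync completes.
    stale_hours = max(0.0, sync_interval_hours - freshness_hours)
    return stale_hours / sync_interval_hours


print(stale_fraction(168, 24))   # weekly sync, 24-hour window: 144 / 168
print(stale_fraction(168, 168))  # weekly sync, 168-hour window: 0.0
```

A weekly sync with a 24-hour window is stale for 144 of every 168 hours, which is the six days out of seven described above. Widening the window to 168 removes the stale period but also moves the ranking half-life to a week.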

How to choose freshness_hours

  1. Set it at or above your sync interval. This keeps a healthy connection reading fresh on the dashboard. Daily sync → 24+; weekly sync → 168+.
  2. Then think about content shelf-life for ranking. If your corpus is mostly stable, authoritative content (policies, runbooks, ADRs, legal) that stays valid for months, a tight window will demote it aggressively. Either widen freshness_hours, or set a retrieval_freshness_floor of 0.2 to 0.3 so old-but-relevant content stays competitive.
  3. Go tighter than your sync interval only deliberately — when recency genuinely drives relevance and you cannot sync more often (a fast-moving source where a three-day-old page really is less trustworthy than a fresh one). This nudges older content down between syncs. It is a ranking lever, not a data-currency lever: the underlying data still only refreshes on schedule.

Multi-connection searches

Freshness is applied per chunk, per connection. In a search that spans several connections, each result is decayed using:

  • the window (freshness_hours) from its own connection, and
  • the age from its own document's last-indexed time.

The window is a single value per connection — every document in a Confluence connection shares the same freshness_hours curve, so you cannot give individual pages different decay rates. But documents sit at different points on that curve depending on when each was last indexed.
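
As an illustration of per-connection windows combined with per-document ages, the sketch below re-scores a mixed result set. The Chunk class, the rescore function, the connection names, and the scores are made up for the example, and no freshness floor is applied.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    connection: str
    relevance: float
    age_hours: float  # hours since the parent document was last indexed


def rescore(chunks, windows):
    """Decay each chunk with its own connection's window and its own document's age."""
    scored = []
    for chunk in chunks:
        window = windows[chunk.connection]
        if chunk.age_hours <= window:
            multiplier = 1.0
        else:
            multiplier = 0.5 ** ((chunk.age_hours - window) / window)
        scored.append((chunk.relevance * multiplier, chunk))
    # Highest decayed score first.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical connections with different freshness_hours values.
windows = {"confluence": 168, "chat": 24}
chunks = [
    Chunk("confluence", relevance=0.80, age_hours=336),  # one window past: x0.5
    Chunk("chat", relevance=0.90, age_hours=72),         # two windows past: x0.25
    Chunk("chat", relevance=0.60, age_hours=12),         # inside window: x1.0
]

for score, chunk in rescore(chunks, windows):
    print(f"{chunk.connection:<10} {score:.3f}")
```

The two-week-old Confluence chunk outranks the three-day-old chat chunk here because its connection uses a much wider window, even though its document is older.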

Quick reference

Question | Answer
Does freshness_hours decide when a sync runs? | No. Sync timing is the connection's sync schedule.
What does it decide? | Search-ranking decay + the dashboard fresh/stale window.
Which clock drives ranking decay? | The document's last indexed time (content age).
Which clock drives the dashboard badge? | The connection's last sync time.
Does re-syncing an unchanged page reset its decay? | No. Last-indexed only advances when content changes.
Is decay bounded? | Only if retrieval_freshness_floor > 0. Default is 0 (unbounded).
Default value | 24 hours, integer, per connection.
Is it per-document or per-connection? | Window is per-connection; age is per-document.
