RFC 0002: Docs Search KPI Policy and SLO Thresholds
Metadata
Goals and non-goals
This policy sets targets for docs search quality and defines what to do when a KPI moves out of range.
Goals
- Clear KPIs for docs search on docs/**/*.html.
- Warning and breach limits so on-call knows when to act.
- Same review rhythm and responses across releases.
Non-goals
- Semantic search, typo fixes, or multilingual ranking.
- Replacing the technical detail in RFC 0001.
Canonical KPI definitions
- Zero-result rate = count(search_query where results_count = 0) / count(search_query).
- Query CTR = count(distinct query_id with at least one search_result_click) / count(search_query).
- TTFS (time-to-first-success) = distribution of first search_success.time_to_success_ms per session (p50/p75).
Data source: GET /internal/telemetry/docs-search/metrics?window_minutes=<N>.
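The KPI definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the telemetry service's implementation: the event dictionaries, the `type` and `session_id` keys, and the nearest-rank percentile method are assumptions made for the example. Only the event names and the `results_count`, `query_id`, and `time_to_success_ms` fields come from the definitions.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list; None when the list is empty."""
    if not sorted_values:
        return None
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def compute_kpis(events):
    """Compute the three KPIs from a chronologically ordered list of event dicts.

    Assumed (hypothetical) event shape: {"type": ..., "query_id": ...,
    "session_id": ..., "results_count": ..., "time_to_success_ms": ...}.
    """
    queries = [e for e in events if e["type"] == "search_query"]
    if not queries:
        return None  # no denominator, so the rates are undefined

    query_ids = {e["query_id"] for e in queries}
    zero_results = sum(1 for e in queries if e["results_count"] == 0)

    # Distinct queries with at least one click; repeated clicks count once.
    clicked = {
        e["query_id"] for e in events if e["type"] == "search_result_click"
    } & query_ids

    # Keep only the first search_success per session.
    first_success = {}
    for e in events:
        if e["type"] == "search_success":
            first_success.setdefault(e["session_id"], e["time_to_success_ms"])
    times = sorted(first_success.values())

    return {
        "zero_result_rate": zero_results / len(queries),
        "query_ctr": len(clicked) / len(queries),
        "ttfs_p50_ms": percentile(times, 50),
        "ttfs_p75_ms": percentile(times, 75),
    }
```

Note that TTFS is recorded in milliseconds, so the values need converting to seconds before comparison with the targets.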
Targets (rolling 7-day window)
| KPI | Target | Warning threshold | Breach threshold |
|---|---|---|---|
| Zero-result rate | ≤ 12% | > 12% and ≤ 18% | > 18% |
| Query CTR | ≥ 45% | < 45% and ≥ 35% | < 35% |
| TTFS p50 | ≤ 25s | > 25s and ≤ 40s | > 40s |
| TTFS p75 | ≤ 60s | > 60s and ≤ 90s | > 90s |
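The table can be read as a three-way classification per KPI. The sketch below encodes the thresholds as data and returns "ok", "warning", or "breach". The `Threshold` class, the KPI key names, and the `classify` function are hypothetical names chosen for illustration. Rates are expressed as fractions and TTFS in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    target: float            # boundary between ok and warning (inclusive on the ok side)
    breach: float            # boundary between warning and breach (inclusive on the warning side)
    higher_is_better: bool = False


# Values copied from the targets table; key names are illustrative.
THRESHOLDS = {
    "zero_result_rate": Threshold(target=0.12, breach=0.18),
    "query_ctr": Threshold(target=0.45, breach=0.35, higher_is_better=True),
    "ttfs_p50_s": Threshold(target=25, breach=40),
    "ttfs_p75_s": Threshold(target=60, breach=90),
}


def classify(kpi, value):
    """Return "ok", "warning", or "breach" for a KPI value."""
    t = THRESHOLDS[kpi]
    if t.higher_is_better:
        if value >= t.target:
            return "ok"
        return "warning" if value >= t.breach else "breach"
    if value <= t.target:
        return "ok"
    return "warning" if value <= t.breach else "breach"
```

For example, `classify("zero_result_rate", 0.15)` returns "warning" and `classify("query_ctr", 0.30)` returns "breach". Keeping the thresholds in one table-like structure makes a later threshold change a one-line edit.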
Operating cadence
- Weekly: review KPIs for the last 7 and 30 days.
- Before big doc releases: confirm no KPI is in breach.
- Monthly: add a short trend note to docs/CHANGELOG.md or an audit page when numbers move a lot.
Response playbook
Warning (one KPI in warning for 2+ days)
- Look at top zero-result queries and low click-through queries.
- Rebuild the index: python3 scripts/build_docs_search_index.py.
- Check recent content or nav changes that could hurt ranking.
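The first step of the warning playbook, finding the top zero-result queries, could look roughly like the sketch below. It assumes the query text is available alongside results_count, which this policy does not confirm. The `top_zero_result_queries` function and its input shape are hypothetical.

```python
from collections import Counter


def top_zero_result_queries(queries, limit=10):
    """Rank zero-result queries by frequency.

    queries: iterable of (query_text, results_count) pairs (assumed shape).
    Text is lowercased and stripped so trivial variants are counted together.
    """
    counts = Counter(
        text.strip().lower()
        for text, results_count in queries
        if results_count == 0
    )
    return counts.most_common(limit)
```

The same pattern applies to low click-through queries if the filter is swapped for "query has no search_result_click".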
Breach (any KPI in breach for 24h)
- Open a backlog item and assign an owner to fix within 48h.
- Run the checks in RFC 0001 (local validation + telemetry).
- Fix quickly: rebuild index, roll back ranking changes, or fix nav/content.
- If users see clearly worse search, add a short runbook or changelog note.
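Both playbooks are triggered by a state that persists: 2+ days for a warning, 24h for a breach. One way to check this, sketched under the assumption that each KPI is sampled periodically and stored as (timestamp, status) pairs, is to measure the length of the current unbroken run. The `should_trigger` function and the sample format are hypothetical.

```python
from datetime import datetime, timedelta

# Durations taken from the playbook headings.
REQUIRED_DURATION = {
    "warning": timedelta(days=2),
    "breach": timedelta(hours=24),
}


def should_trigger(samples, status, now):
    """Return True if the KPI has been in `status` continuously for long enough.

    samples: chronologically ordered (timestamp, status) pairs for one KPI.
    """
    # Walk backwards to find where the current unbroken run of `status` began.
    run_start = None
    for timestamp, sample_status in reversed(samples):
        if sample_status != status:
            break
        run_start = timestamp
    if run_start is None:
        return False  # latest sample is not in `status`, or there are no samples
    return now - run_start >= REQUIRED_DURATION[status]
```

This sketch treats warning and breach as separate states. Whether a period spent in breach should also count toward the warning duration is a policy choice the code does not make.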
Runbook links
- RFC 0001 troubleshooting
- Runbook: Logging failing (telemetry/log signal checks)
- Runbook: Quality-check failing (CI and docs pipeline gates)
Change management
- Change KPI thresholds only after at least 30 days of data and a short written reason.
- Update this RFC and mention the change in docs/CHANGELOG.md.
- Update an ADR only if the architecture changes, not for small threshold tweaks.
Page history
| Date | Change | Author |
|---|---|---|
| | Added Page history section (repository baseline). | Ivan Boyarkin |