ADR 0027: High-relevance client-side docs search (inverted index, IDF, and ranking boosts)

Ratification

Context

Docs are hosted as static files. We need fast, useful search across ADRs, runbooks, internal guides, developer pages, and API docs.

A simple “word present or not” scorer is easy to maintain but weak on large sites: common words win too often, long pages rank too high by sheer volume, and prefix matching while the user types behaves poorly.

Decision

We replace the old scorer with an inverted-index lexical model that combines:

  1. Per-field term frequencies at build time (title, URL, section, content).
  2. IDF-based weighting at query time.
  3. Log-scaled TF and document length normalization.
  4. High-precision boosts for exact phrase, all-token coverage, and title prefix matches.
  5. Prefix expansion only for the last query token for good type-ahead UX.

We also add search telemetry (append-only) in the app SQLite database so local runs stay simple and we can verify behavior.

Theoretical basis

Representation

Each document is represented as:

d = (title, url, section, preview, content_len, tf_title, tf_url, tf_section, tf_content)

where tf_field(term, d) is term frequency in that field.

Normalization

Input text is lowercased, punctuation-normalized, and whitespace-collapsed:

N(x) = trim(collapseSpaces(lowercase(x)))

Tokens are alphanumeric terms extracted from normalized text.
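A minimal sketch of N(x) and tokenization, assuming punctuation is mapped to spaces before collapsing (the ADR does not pin down the exact punctuation rule):

```python
import re

def normalize(x: str) -> str:
    # N(x) = trim(collapseSpaces(lowercase(x)))
    # Assumption: punctuation normalization maps non-alphanumerics to spaces,
    # which also collapses runs of whitespace in one pass.
    lowered = x.lower()
    spaced = re.sub(r"[^a-z0-9]+", " ", lowered)
    return spaced.strip()

def tokenize(x: str) -> list[str]:
    # Alphanumeric terms extracted from normalized text.
    n = normalize(x)
    return n.split() if n else []
```

Applying one regex for both punctuation and whitespace keeps normalization deterministic, so build-time indexing and query-time lookup agree on token boundaries.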

Core ranking formula

For query tokens T, base score is:

score_base(d, T) = Σ over t in T:
    idf(t) * (
          w_title   * log(1 + tf_title(t, d))
        + w_url     * log(1 + tf_url(t, d))
        + w_section * log(1 + tf_section(t, d))
        + w_content * log(1 + tf_content(t, d))
    )

Weights are tuned for precision-first behavior:

w_title   = 8.0
w_url     = 4.0
w_section = 2.0
w_content = 1.4
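The base score can be sketched as follows, assuming each document carries one term-frequency map per field (the field names mirror the representation above; the dict shape is an assumption of this sketch):

```python
import math

# Precision-first field weights from the ADR.
W_TITLE, W_URL, W_SECTION, W_CONTENT = 8.0, 4.0, 2.0, 1.4

def score_base(doc: dict, tokens: list[str], idf: dict) -> float:
    # doc holds per-field term-frequency maps, e.g. doc["tf_title"]["search"] = 2.
    total = 0.0
    for t in tokens:
        field_part = (
              W_TITLE   * math.log(1 + doc["tf_title"].get(t, 0))
            + W_URL     * math.log(1 + doc["tf_url"].get(t, 0))
            + W_SECTION * math.log(1 + doc["tf_section"].get(t, 0))
            + W_CONTENT * math.log(1 + doc["tf_content"].get(t, 0))
        )
        total += idf.get(t, 0.0) * field_part
    return total
```

The log(1 + tf) scaling means a term's second occurrence in a field adds less than its first, so repetition cannot dominate the title weight.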

IDF and length normalization

idf(t) = log(1 + (N + 1) / (df(t) + 0.5))

len_ratio(d) = content_len(d) / avg_content_len
norm(d)      = 1 / (1 + 0.08 * max(0, len_ratio(d) - 1))

score_norm(d, T) = score_base(d, T) * norm(d)

This downweights very common tokens and prevents long pages from winning by volume alone.
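A direct transcription of the two formulas, assuming N is the corpus size and df(t) the number of documents containing t:

```python
import math

def idf(df_t: int, n_docs: int) -> float:
    # idf(t) = log(1 + (N + 1) / (df(t) + 0.5)); the +0.5 keeps the
    # ratio finite for terms seen in every (or no) document.
    return math.log(1 + (n_docs + 1) / (df_t + 0.5))

def length_norm(content_len: int, avg_content_len: float) -> float:
    # Penalize only documents longer than average; 0.08 is the tuned slope.
    len_ratio = content_len / avg_content_len
    return 1 / (1 + 0.08 * max(0.0, len_ratio - 1))
```

Note the asymmetry: shorter-than-average pages get norm(d) = 1 exactly, so the penalty never becomes a bonus for stub pages.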

Precision boosts

Final score adds deterministic bonuses:

score_final = score_norm
              + B_all_tokens_in_title
              + B_all_tokens_in_url
              + B_exact_phrase_in_title
              + B_exact_phrase_in_url
              + B_title_prefix
              + B_exact_section

These bonuses enforce intuitive ranking for navigational queries and short phrase queries.
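A sketch of the title-side bonuses; the ADR fixes which bonus terms exist but not their magnitudes, so the constants below are illustrative assumptions:

```python
# Illustrative magnitudes only; the ADR does not specify these values.
B_ALL_TOKENS_IN_TITLE = 6.0
B_EXACT_PHRASE_IN_TITLE = 10.0
B_TITLE_PREFIX = 4.0

def title_boosts(title_norm: str, tokens: list[str], phrase: str) -> float:
    # title_norm and phrase are assumed already normalized (N(x) above).
    bonus = 0.0
    title_tokens = set(title_norm.split())
    if tokens and all(t in title_tokens for t in tokens):
        bonus += B_ALL_TOKENS_IN_TITLE      # all-token coverage
    if phrase and phrase in title_norm:
        bonus += B_EXACT_PHRASE_IN_TITLE    # exact phrase match
    if phrase and title_norm.startswith(phrase):
        bonus += B_TITLE_PREFIX             # title prefix match
    return bonus
```

Because the bonuses are additive and deterministic, a navigational query whose phrase exactly prefixes a title always outranks a page that merely mentions the same tokens in its body.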

Complexity model

With an inverted index, query complexity becomes proportional to postings, not all documents:

Build: O(total_tokens)
Query: O(sum_postings_for_query_terms + rerank_candidates)
Space: O(vocabulary + postings + doc_metadata)

This is substantially faster than scanning every document on each query.
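A minimal sketch of the postings structure and candidate generation, including prefix expansion for the last query token only. The linear vocabulary scan for the prefix is an assumption of this sketch; a production index would use a sorted term array or trie:

```python
from collections import defaultdict

def build_index(docs: dict[str, list[str]]) -> dict[str, set[str]]:
    # postings: term -> set of doc ids containing it; O(total_tokens) build.
    postings = defaultdict(set)
    for doc_id, tokens in docs.items():
        for t in tokens:
            postings[t].add(doc_id)
    return postings

def candidates(postings: dict, tokens: list[str]) -> set[str]:
    # Exact lookup for every token except the last, which is expanded
    # as a prefix for type-ahead behavior.
    out = set()
    for i, t in enumerate(tokens):
        if i == len(tokens) - 1:
            for term, ids in postings.items():   # sketch: linear scan
                if term.startswith(t):
                    out |= ids
        else:
            out |= postings.get(t, set())
    return out
```

Only the candidate set produced here is scored and reranked, which is what keeps query cost proportional to postings touched rather than corpus size.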

Scope

Alternatives considered

  1. Simple field-presence scoring
    • Pros: tiny implementation.
    • Cons: weak ranking on larger corpora; limited precision control.
  2. Hosted search providers
    • Pros: rich relevance and analytics.
    • Cons: external dependency, operational overhead, crawler governance.
  3. Client-side third-party full-text libraries
    • Pros: mature ranking options.
    • Cons: larger runtime dependency surface than needed.

Consequences

Positive

Trade-offs

Compatibility and migration

Implementation mapping

Telemetry events and storage

Client events

Persistence model

Metric definitions (canonical formulas)

Let Q be all search_query events in a time window, and S be first search_success per session in the same window.

Zero-result rate

zero_result_rate = count(q in Q where q.results_count = 0) / count(Q)

Query CTR

query_ctr = count(distinct query_id in Q with at least one search_result_click) / count(Q)

Time-to-first-success

TTFS = distribution of s.time_to_success_ms for s in S, reported as p50 and p75

For this ADR, dashboard defaults are: p50 and p75 over the selected rolling window.
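The first two metrics can be computed directly from the event lists; the event dict keys (`results_count`, `query_id`) are assumed to mirror the telemetry fields named above:

```python
def zero_result_rate(queries: list[dict]) -> float:
    # queries: search_query events in the window, each with "results_count".
    if not queries:
        return 0.0
    zero = sum(1 for q in queries if q["results_count"] == 0)
    return zero / len(queries)

def query_ctr(queries: list[dict], clicks: list[dict]) -> float:
    # clicks: search_result_click events carrying the originating "query_id".
    if not queries:
        return 0.0
    clicked = {c["query_id"] for c in clicks}
    with_click = sum(1 for q in queries if q["query_id"] in clicked)
    return with_click / len(queries)
```

Deduplicating clicks through a set implements "at least one search_result_click": a query with five clicks counts once in the numerator.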

Validation

References

Page history

Date Change Author
Added Page history section (repository baseline). Ivan Boyarkin