Backlog: priorities and status

Overview

One ordered list of engineering work: most urgent first, nice-to-have last. Dates are rough calendar-day ranges (weekends off) for 3 / 2 / 1 hours of focused work per day. They are not deadlines, only a shared guess for planning. To change status, edit the pill class (see the legend). This page does not track the ADR lifecycle: ADRs use data-adr-weight and ratification (ADR 0018).

Status legend

Work item states
To do — not started
In progress — someone is working on it
Done — finished and accepted
Blocked — waiting on someone or a decision
Rejected — we decided not to do this (out of scope or obsolete)

How to change status (README.html)

  1. Colors live in docs/backlog/backlog.css (CSS variables).
  2. To change colors globally, edit --status-* there.
  3. In each item’s <h2>, find <span class="status-pill status-pill--..."> and set the part after status-pill-- to todo, in-progress, done, blocked, or rejected.
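For example, an item marked as in progress (illustrative item title) carries:

```html
<h2>P2 Example item <span class="status-pill status-pill--in-progress">In progress</span></h2>
```

Keep the visible label in sync with the modifier class so the pill text matches its color.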

Meta Backlog page and docs/backlog/ layout Done

1) Summary
Keep a simple backlog in docs/backlog/: HTML, shared pill styles, and the same fields per item so anyone reading the repo sees priorities without a separate tool.
2) Problem & value
Without one place to look, plans scatter across chat and tickets. This page is versioned with the code, which helps reviews, onboarding, and release planning if you do not use Jira/Linear or prefer docs next to the source.
3) Delivered (to date)
README.html, backlog.css (status color variables), navigation from the developer docs index, and the status model described in the legend above (see ADR 0001 for the README.html convention).
4) Rough estimate (calendar days)
3 h/day: 1 calendar day
2 h/day: 1 calendar day
1 h/day: 1 calendar day

P0 — critical CI pipeline: quality gates, coverage threshold, docs-check Done

1) Summary
Run the same checks in CI as locally (make verify or similar): format, types, tests with a coverage floor, OpenAPI/docs sync, and contract tests, so main stays shippable.
2) Problem & value
Problem: If CI is weak or missing, bad code and doc drift can merge to main.
Value: Bugs surface earlier; reviewers see a green pipeline; standard gates on each PR.
3) Scope & deliverables
GitHub Actions (or equivalent): run formatter/linter, mypy, pytest with coverage and fail-under, OpenAPI check, contract tests, docs-check; cache dependencies for speed; fail the build on violations. Optionally run the same hooks in CI that pre-commit uses locally. Document how to reproduce CI locally in contributor docs.
4) Rough estimate (calendar days)
3 h/day: ~4 days (~12 h)
2 h/day: ~6 days
1 h/day: ~12 days
5) Delivered (to date)
.github/workflows/ci.yml on push/PR to main or master: Python 3.11, pip cache, make verify then pre-commit run --all-files. Coverage enforced via pytest-cov and fail_under in pyproject.toml. Local pre-push drift check: make verify-ci. Contributor note in engineering practices (CI / reproduce locally).

P0 SLOs / SLAs, error budget, and monitoring alerts Done

1) Summary
Set SLIs/SLOs (and optional SLAs) for latency, uptime, and errors. Add dashboards and alerts in Prometheus (or your stack). Write what to do when the error budget runs out (freeze, incident flow).
2) Problem & value
Problem: Metrics without targets do not guide releases or on-call.
Value: Clear reliability goals; ops and product share the same numbers (error-budget style).
3) Scope & deliverables
ADR or runbook: e.g. p95 latency, /ready-based availability, max 5xx ratio; Grafana panels and Prometheus alert rules; runbook steps for budget exhaustion; link from existing observability docs.
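As a shape reference (metric and label names here are assumptions; the delivered rules live in ops/prometheus/rules/study_app_slo.yml), a max-5xx-ratio alert might look like:

```yaml
groups:
  - name: study-app-slo
    rules:
      # Page when the 5xx ratio stays above the budget for 10 minutes.
      # http_requests_total and its labels are illustrative names.
      - alert: HighErrorRatio
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "5xx ratio above 5% for 10m; follow the error-budget runbook"
```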
4) Rough estimate (calendar days)
3 h/day: ~4 days (~12 h)
2 h/day: ~6 days
1 h/day: ~12 days
5) Delivered (to date)
ADR 0011; Prometheus recording and alert rules in ops/prometheus/rules/study_app_slo.yml; Grafana dashboard import and Blackbox probe wiring via docker-compose.observability.yml and ops/prometheus/; links from README and developer docs to local observability URLs.

P1 DB uniqueness, race safety, and documented concurrency behavior Done

1) Summary
Put business rules in the DB (unique constraints), map conflicts to the API error shape, and document races (double submit, retries, idempotency) so behavior under load is clear.
2) Problem & value
Problem: “Check then insert” without unique indexes can race and duplicate rows.
Value: The database enforces truth; clients get stable errors; less guesswork for support.
3) Scope & deliverables
Add constraints on natural keys; consistent HTTP/status codes for conflicts; developer doc table: scenarios (duplicate create, idempotent retry) and expected outcomes.
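The check-then-insert race and its DB-enforced fix can be sketched with stdlib sqlite3 (table and column names are illustrative; the delivered constraints live in the Alembic migrations):

```python
import sqlite3

def create_user(conn: sqlite3.Connection, email: str) -> dict:
    """Insert a user, letting the UNIQUE index decide instead of check-then-insert."""
    try:
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.commit()
        return {"status": 201}
    except sqlite3.IntegrityError:
        # Map the DB-level conflict to the API error shape (409 Conflict).
        return {"status": 409, "error": "user_already_exists"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")
first = create_user(conn, "a@example.com")
second = create_user(conn, "a@example.com")  # same natural key: the DB rejects it
```

Even if two requests race past an application-level existence check, only one insert can win; the other surfaces as a stable conflict error.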
4) Rough estimate (calendar days)
3 h/day: ~3 days (~10 h)
2 h/day: ~5 days
1 h/day: ~10 days
5) Delivered (to date)
Alembic/SQLAlchemy unique indexes and constraints on natural keys (e.g. systems, timezones, users, idempotency_keys composite uniqueness per ADR 0006); HTTP/error contract for conflicts and idempotency replay; ADR 0006 plus developer docs and error matrix for retry and idempotency semantics.

P1 Dependency security (pip-audit) and update policy Done

1) Summary
Scan Python deps for known CVEs, pin direct deps for reproducible builds, and write a short upgrade policy (how often, by severity).
2) Problem & value
Problem: Loose or old pins add CVE risk and surprise builds.
Value: Safer supply chain; fewer panic upgrades.
3) Scope & deliverables
make deps-audit (or equivalent) using pip-audit; integrate into CI; ADR: pinning rules, review cadence, exception process.
4) Rough estimate (calendar days)
3 h/day: ~1 day (~3 h)
2 h/day: ~2 days
1 h/day: ~3 days
5) Delivered (to date)
Policy ADR: ADR 0019 (pinning, pip-audit, cadence, exceptions). pip_audit pinned in requirements.txt; make deps-audit (OSV scan, .pip-audit-cache); CI quality job runs it before make verify (.github/workflows/ci.yml); make verify-ci includes deps-audit for local pre-push parity; engineering practices table documents the command.

P1 Integration tests against the database (CI) To do

1) Summary
Add integration tests that hit real PostgreSQL in CI (Compose), with migrations and repositories—alongside fast unit tests on SQLite or mocks.
2) Problem & value
Problem: SQLite in dev and Postgres in prod differ (types, locks, edge cases).
Value: Fast unit tests plus a smaller set of real-DB tests catch data-heavy bugs.
3) Scope & deliverables
Docker Compose service for PostgreSQL in CI; integration Pytest marker; database fixtures and teardown; migrations executed before tests; well-documented local command matching the CI process.
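A CI Postgres service of roughly this shape would back the integration marker (image tag, credentials, and database name are placeholders):

```yaml
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app        # test-only credentials
      POSTGRES_DB: app_test
    ports:
      - "5432:5432"
    healthcheck:                    # gate migrations and tests on readiness
      test: ["CMD-SHELL", "pg_isready -U app -d app_test"]
      interval: 2s
      retries: 15
```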
4) Rough estimate (calendar days)
3 h/day: ~5 days (~16 h)
2 h/day: ~8 days
1 h/day: ~16 days

P2 Test pyramid: unit tests + property-based tests (Hypothesis) To do

1) Summary
Pull pure validation into small functions; test them with unit tests and property tests (Hypothesis) so many inputs are checked without relying on slow E2E tests alone.
2) Problem & value
Problem: Hand-picked examples miss edge mixes; line coverage is not proof.
Value: Cheaper than E2E for heavy validation; common pattern for parsers and rules.
3) Scope & deliverables
Identify critical pure functions; Hypothesis strategies; guidelines in the developer guide (when to use properties vs examples).
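Hypothesis generates and shrinks inputs automatically; the underlying property idea, sketched with stdlib random against a hypothetical pure normalizer:

```python
import random
import string

def normalize_code(raw: str) -> str:
    """Hypothetical pure validation helper: trim and uppercase a course code."""
    return raw.strip().upper()

# Property: normalizing twice equals normalizing once (idempotence),
# checked over many generated inputs, not just hand-picked examples.
rng = random.Random(0)
alphabet = string.ascii_letters + " \t"
for _ in range(500):
    s = "".join(rng.choice(alphabet) for _ in range(rng.randrange(0, 20)))
    assert normalize_code(normalize_code(s)) == normalize_code(s)
```

With Hypothesis, the loop collapses to a @given(st.text()) decorator and failing inputs are shrunk to minimal counterexamples for free.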
4) Rough estimate (calendar days)
3 h/day: ~4 days (~12 h)
2 h/day: ~6 days
1 h/day: ~12 days

P2 E2E: HTTP scenarios against a running application To do

1) Summary
Add a few E2E tests: real HTTP → API → DB on a running stack, with real status codes, headers, and bodies, not only mocks.
2) Problem & value
Problem: Lower-level tests can miss wiring, middleware, and contract gaps.
Value: Extra confidence before release; does not replace unit/integration tests.
3) Scope & deliverables
httpx + pytest (or similar); separate CI job or marker e2e; minimal happy path plus a few representative error paths; stable test data strategy.
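The real suite would use httpx + pytest against the running stack; the shape of one such check, kept self-contained here with a stdlib stand-in server:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class FakeAPI(BaseHTTPRequestHandler):
    """Stand-in for the running app; the real test targets the Compose stack."""
    def do_GET(self):
        if self.path == "/health":
            body = b'{"status":"ok"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)
    def log_message(self, *args):  # keep test output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), FakeAPI)
threading.Thread(target=server.serve_forever, daemon=True).start()

# E2E-style assertions on real status, headers, and body -- not mocks.
url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as resp:
    status = resp.status
    content_type = resp.headers["Content-Type"]
    payload = resp.read()
server.shutdown()
```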
4) Rough estimate (calendar days)
3 h/day: ~3 days (~8 h)
2 h/day: ~4 days
1 h/day: ~8 days

P2 Mutation testing pilot for critical rules To do

1) Summary
Try mutation testing (e.g. cosmic-ray, mutmut) on one or two sensitive modules to see whether tests really catch broken code, rather than relying on high line coverage alone.
2) Problem & value
Problem: Coverage can look good with weak asserts.
Value: Shows where tests actually fail bad mutations; points to stronger asserts.
3) Scope & deliverables
Scoped run in CI or manual report; threshold or qualitative review; ADR if adopted team-wide; avoid full-repo runs initially (cost).
4) Rough estimate (calendar days)
3 h/day: ~2 days (~6 h)
2 h/day: ~3 days
1 h/day: ~6 days

P3 Distributed rate limiting with Redis (multi-instance) To do

1) Summary
Move rate limits from per-process memory to Redis so limits match across many API instances, still returning HTTP 429 as today.
2) Problem & value
Problem: Each replica has its own counter, so abuse scales with instance count.
Value: Normal pattern behind a load balancer; fair limits when you scale out.
3) Scope & deliverables
Replace InMemoryRateLimiter in app/core/security.py with a Redis-backed adapter using shared counters + TTL windows, preserving existing 429 behavior and headers (X-RateLimit-*, Retry-After); ADR (algorithm: token bucket or sliding window); Compose profile for local dev/CI; clear fallback policy when Redis is unavailable (fail-open vs fail-closed) and integration tests for multi-worker consistency.
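A fixed-window sketch of the shared-counter idea (a plain dict stands in for Redis; in production the increment would be an atomic INCR plus EXPIRE, and the ADR may choose token bucket or sliding window instead):

```python
import time

class WindowRateLimiter:
    """Fixed-window counter keyed by (client, window); Redis would share this state."""
    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.counters: dict[tuple[str, int], int] = {}  # Redis stand-in

    def allow(self, client_id: str) -> bool:
        bucket = int(self.clock()) // self.window
        key = (client_id, bucket)
        # With Redis: INCR key, setting EXPIRE on first hit, atomically.
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit

limiter = WindowRateLimiter(limit=3, window_seconds=60, clock=lambda: 1000.0)
decisions = [limiter.allow("1.2.3.4") for _ in range(5)]  # fourth request exceeds the limit
```

Because the counter lives in shared state, every replica sees the same count, which is exactly what the in-memory limiter cannot guarantee.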
4) Rough estimate (calendar days)
3 h/day: ~5 days (~14 h)
2 h/day: ~7 days
1 h/day: ~14 days

P2 Redis adoption: idempotency cache and short-lived platform data To do

1) Summary
Add Redis as a shared fast-access layer for idempotency lookups and short-lived data, while keeping SQL as the source of truth and audit trail.
2) Problem & value
Problem: Idempotency currently checks SQL on each request, and temporary runtime data has no shared distributed store.
Value: Lower latency/load for retries and repeated reads; enables safe scale-out patterns for cache, one-time keys, and future background workers.
3) Scope & deliverables
Redis read-through cache for app/repositories/idempotency_repository.py (idempotency_key -> status/response/payload_hash with TTL); SQL remains canonical persistence; key schema + TTL policy doc; cache invalidation and observability metrics (hit ratio, errors, fallback rate); optional adapters for hot GET cache (user/course cards), short-lived tokens/OTP/one-time keys, and starter queue primitives for later worker adoption.
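The read-through idea in miniature (a dict stands in for Redis with TTLs, a plain lookup stands in for the SQL repository, and the field names are assumptions):

```python
class IdempotencyCache:
    """Read-through cache: Redis (dict stand-in) in front of canonical SQL."""
    def __init__(self, sql_lookup):
        self.redis: dict[str, dict] = {}   # real code: Redis keys with a TTL
        self.sql_lookup = sql_lookup       # canonical persistence stays in SQL
        self.hits = 0
        self.misses = 0                    # feed these into hit-ratio metrics

    def get(self, idempotency_key: str):
        cached = self.redis.get(idempotency_key)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        record = self.sql_lookup(idempotency_key)
        if record is not None:
            self.redis[idempotency_key] = record  # populate on miss
        return record

sql_rows = {"k1": {"status": "completed", "payload_hash": "abc"}}
cache = IdempotencyCache(sql_rows.get)
first = cache.get("k1")   # miss: falls through to SQL, then caches
second = cache.get("k1")  # hit: served without touching SQL
```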
4) Rough estimate (calendar days)
3 h/day: ~6 days (~18 h)
2 h/day: ~9 days
1 h/day: ~18 days

P3 PostgreSQL as primary database and migration path from SQLite To do

1) Summary
Use PostgreSQL for staging/prod (Docker-friendly), keep schema in Alembic, and plan a safe move from SQLite where it still runs—including data migration runbooks if needed.
2) Problem & value
Problem: SQLite is fine for dev but weak for concurrent prod ops, backups, and tooling.
Value: Postgres matches usual ops expectations (backup, HA, metrics).
3) Scope & deliverables
docker-compose for app + Postgres; environment profiles; SQLAlchemy compatibility checks; migration runbook; cutover or dual-write strategy as appropriate.
4) Rough estimate (calendar days)
3 h/day: ~8 days (~24 h)
2 h/day: ~12 days
1 h/day: ~24 days

P3 Load testing (k6 / Locust) and SLO validation Done

1) Summary
Add repeatable load scripts (e.g. k6 or Locust) for p95 and errors under set concurrency; compare to SLOs; optional CI or nightly job.
2) Problem & value
Problem: Many teams only load-test in production.
Value: Plan capacity from data; spot regressions before traffic jumps.
3) Scope & deliverables
Scripts under ops/load/ or tests/load/; baseline profile (e.g. user CRUD); report artifact; optional nightly job with thresholds; link to SLO doc.
4) Rough estimate (calendar days)
3 h/day: ~3 days (~8 h)
2 h/day: ~4 days
1 h/day: ~8 days
5) Delivered (to date)
Python runner and scenarios under tools/load_testing/; Makefile targets run-loadtest-api and run-loadtest-api-serve for local runs against a live API; scenario docs in-repo. Optional automated CI smoke/load remains tracked under item 23.

P4 Test data: seeds, fixtures, runbook To do

1) Summary
Ship reference data and fixtures for integration/E2E tests, plus a short runbook to reset or seed state without hidden tricks in conftest.py.
2) Problem & value
Problem: Opaque setup flakes and hides real bugs.
Value: Stable CI; new contributors get green tests faster.
3) Scope & deliverables
Single seed script or Alembic data revision; naming conventions; developer doc for when to extend seeds vs factories.
4) Rough estimate (calendar days)
3 h/day: ~2 days (~5 h)
2 h/day: ~3 days
1 h/day: ~5 days

P4 Feature flags (safe rollout) To do

1) Summary
Add simple feature flags (env and/or Redis) so risky behavior can flip without a full deploy—gradual rollout and fast rollback.
2) Problem & value
Problem: Every fix needs a deploy, which slows incidents.
Value: Turn behavior on/off quickly; canary-style rollouts.
3) Scope & deliverables
Small provider abstraction in config; ADR on evaluation rules and caching; one non-business demo flag to prove wiring; security note on who can change flags.
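A minimal env-backed provider sketch (the flag name is hypothetical; a Redis-backed provider would sit behind the same interface):

```python
import os

class EnvFlagProvider:
    """Minimal provider: flags read from environment variables (FEATURE_<NAME>)."""
    PREFIX = "FEATURE_"
    TRUTHY = {"1", "true", "on", "yes"}

    def is_enabled(self, name: str, default: bool = False) -> bool:
        raw = os.environ.get(self.PREFIX + name.upper())
        if raw is None:
            return default
        return raw.strip().lower() in self.TRUTHY

flags = EnvFlagProvider()
os.environ["FEATURE_NEW_GRADING"] = "on"   # demo flag, hypothetical name
enabled = flags.is_enabled("new_grading")
missing = flags.is_enabled("unset_flag")   # falls back to the default
```

Callers depend only on is_enabled, so swapping in a Redis or remote provider later is a config change, not a code change.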
4) Rough estimate (calendar days)
3 h/day: ~3 days (~10 h)
2 h/day: ~5 days
1 h/day: ~10 days

P4 Architecture fitness: enforced layer boundaries To do

1) Summary
Enforce layer rules in CI (e.g. routers do not import repositories directly), using import-linter or a graph check in make verify.
2) Problem & value
Problem: Loose imports cause cycles and muddy layers as the team grows.
Value: The code matches the documented architecture.
3) Scope & deliverables
Rule set matching the intended layers; documented exceptions; CI failure on violation; pointer in contributor guide.
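import-linter would express this as a layers contract; the same rule can be sketched directly over the AST (the layer and package names are assumptions about the layout):

```python
import ast

# Hypothetical layer rule: API routers must not import repositories directly.
FORBIDDEN = {"app.api": ("app.repositories",)}

def forbidden_imports(source: str, module: str) -> list[str]:
    """Return imports in *module*'s source that violate the layer rules."""
    banned = tuple(
        prefix
        for layer, prefixes in FORBIDDEN.items()
        if module.startswith(layer)
        for prefix in prefixes
    )
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        hits.extend(name for name in names if banned and name.startswith(banned))
    return hits
```

CI would run this (or import-linter) over every module and fail the build when the returned list is non-empty.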
4) Rough estimate (calendar days)
3 h/day: ~2 days (~5 h)
2 h/day: ~3 days
1 h/day: ~5 days

P4 Changelog and release-time verification Done

1) Summary
Use Keep a Changelog for user-visible and breaking changes; optional CI check on release tags.
2) Problem & value
Problem: Users and support lack one dated list of what shipped.
Value: Clear history for API consumers and audits.
3) Scope & deliverables
CHANGELOG.md structure; CONTRIBUTING rules; optional CI step on tag/release that fails if changelog section is missing for the version.
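A gate of this kind can reduce to a single heading check (sketch only; the real logic lives in scripts/changelog_gate.py and may differ):

```python
import re

def has_release_section(changelog: str, version: str) -> bool:
    """True if the Keep a Changelog text contains a '## [X.Y.Z]' heading."""
    pattern = rf"^## \[{re.escape(version)}\]"
    return re.search(pattern, changelog, flags=re.MULTILINE) is not None

# Illustrative changelog contents, not the project's real history.
changelog = """# Changelog

## [Unreleased]

## [1.2.0] - 2024-01-15
### Added
- Something user-visible.
"""
ok = has_release_section(changelog, "1.2.0")
missing = has_release_section(changelog, "1.3.0")
```

On a release tag, CI reads the tagged version and fails if the matching section is absent.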
4) Rough estimate (calendar days)
3 h/day: ~1 day (~3 h)
2 h/day: ~2 days
1 h/day: ~3 days
5) Delivered (to date)
CHANGELOG.md (Keep a Changelog); ADR 0013; scripts/changelog_gate.py and optional scripts/changelog_draft.py; CI workflow runs the gate on PR/push to main / master (see .github/workflows/ci.yml).

P4 — lower priority Repository cleanup and dead code removal Done

1) Summary
Remove unused modules, duplicate logic, and old hacks after contracts stabilize, using static analysis plus human review.
2) Problem & value
Problem: Dead code confuses readers and hides real flow.
Value: Smaller surface and faster reviews; best done after big refactors land.
3) Scope & deliverables
vulture / Ruff unused-import rules; manual triage; delete only with test confidence; optional periodic job or checklist in maintainers’ guide.
4) Rough estimate (calendar days)
3 h/day: ~2 days (~4 h)
2 h/day: ~2 days
1 h/day: ~4 days
5) Delivered (to date)
ADR 0014; Ruff F401 + RUF100 and per-file-ignores for tests/conftest.py (E402); [tool.vulture] in pyproject.toml; make dead-code-check; weekly .github/workflows/dead-code.yml; checklist in CONTRIBUTING.md. Ongoing removal of dead code remains manual with test-backed review.

P2 Distributed tracing (OpenTelemetry) and trace context To do

1) Summary
Add OpenTelemetry (W3C trace context): HTTP spans, outbound calls, and DB spans where useful; export via OTLP to Jaeger/Tempo or a hosted backend. Tie traces to logs/metrics with trace_id / span_id.
2) Problem & value
Problem: Metrics and logs alone do not show one request across services and replicas.
Value: Standard way to debug latency and dependencies; complements the SLO work (ADR 0011) rather than replacing it.
3) Scope & deliverables
ADR (sampler rules, PII in attributes, env vars); FastAPI/Starlette OTel integration; optional Compose service for local trace UI; developer doc: how to find a trace from a failing request.
4) Rough estimate (calendar days)
3 h/day: ~5 days (~16 h)
2 h/day: ~8 days
1 h/day: ~16 days

P3 Structured logging (JSON) and request correlation IDs Done

1) Summary
Ship JSON logs in prod (optional in dev) with stable fields: time, level, logger, message, and correlation (request_id; trace_id when OTel ships). Accept or set X-Request-Id per request.
2) Problem & value
Problem: Plain text is hard to search at scale and to join across services.
Value: Fits ELK/Datadog-style tools; faster support and postmortems.
3) Scope & deliverables
Config flag or APP_ENV switch; middleware for request ID; document field list; ensure PII policy (no secrets in log fields).
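A minimal JSON formatter sketch with the stable fields named above (the delivered NDJSON field set may differ slightly):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one NDJSON object per record with stable, searchable field names."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Correlation field; middleware sets request_id per request.
            "request_id": getattr(record, "request_id", None),
        })

record = logging.LogRecord("app", logging.INFO, "app/main.py", 1, "user created", None, None)
record.request_id = "req-123"
line = JsonFormatter().format(record)
parsed = json.loads(line)
```

Attaching the formatter to the root handler in prod (and keeping plain text in dev behind the config flag) keeps the switch a one-liner.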
4) Rough estimate (calendar days)
3 h/day: ~3 days (~8 h)
2 h/day: ~4 days
1 h/day: ~8 days
5) Delivered (to date)
LOG_FORMAT / LOG_SERVICE_NAME; NDJSON fields including request_id, trace_id, span_id (placeholders for OTel); X-Request-Id middleware; optional docker-compose.logging.yml (Elasticsearch, Kibana, Filebeat) and ADR 0023.

P3 CI: container image SBOM and vulnerability scan (published GHCR image) To do

1) Summary
After push to GHCR: build an SBOM (e.g. Syft), scan the image for CVEs (Trivy/Grype), upload SARIF or a summary, and warn or fail by severity; document the rules for engineers.
2) Problem & value
Problem: pip-audit does not cover OS packages inside the image.
Value: Meets common supply-chain checks; complements ADR 0019.
3) Scope & deliverables
GitHub Actions job after publish-image (or same workflow step with the built digest); artifact retention policy; exception process for accepted CVEs.
4) Rough estimate (calendar days)
3 h/day: ~2 days (~6 h)
2 h/day: ~3 days
1 h/day: ~6 days

P1 OpenAPI governance: lint, semantic baseline, and strict contract test Done

1) Summary
Treat OpenAPI like code: lint operation IDs, summaries, examples on writes and 422s, and block surprise breaks vs a checked-in baseline.
2) Problem & value
Problem: Drift or missing docs confuse clients and hide breaks.
Value: Visible contract and safer changes inside /api/v1 (ADR 0007).
3) Scope & deliverables
Lint rules; semantic compare vs docs/openapi/openapi-baseline.json; optional strict byte-for-byte contract test; documented acceptance workflow for intentional spec changes.
4) Rough estimate (calendar days)
3 h/day: ~3 days (~8 h)
2 h/day: ~4 days
1 h/day: ~8 days
5) Delivered (to date)
scripts/openapi_governance.py; make openapi-check and make contract-test in Makefile; baseline under docs/openapi/openapi-baseline.json; integrated into make verify / CI (.github/workflows/ci.yml); ADR 0007.

P1 Continuous delivery: publish container image to GHCR from main / tags Done

1) Summary
After CI passes, build and push a Docker image to GHCR so staging and docs can pull a known digest.
2) Problem & value
Problem: Without a standard image, container workflows are harder to share.
Value: Commit → tested image → registry is the usual path.
3) Scope & deliverables
Workflow job with Buildx cache; tags for SHA, semver, and latest on default branch; ADR describing scope (registry delivery vs full prod rollout).
4) Rough estimate (calendar days)
3 h/day: ~2 days (~6 h)
2 h/day: ~3 days
1 h/day: ~6 days
5) Delivered (to date)
publish-image job in .github/workflows/ci.yml (GHCR, metadata tags, GHA cache); ADR 0021; make docker-build for local parity.

P2 Single source of truth for application / OpenAPI version metadata To do

1) Summary
One version string (e.g. pyproject.toml or app/__version__.py) wired into FastAPI, OpenAPI, and release notes so tags and docs match the running app.
2) Problem & value
Problem: Hardcoded FastAPI(version=…) drifts from changelog and tags.
Value: Support and automation know which build is running.
3) Scope & deliverables
Single import or TOML field; optional /live or /version payload field; contributor note: bump process tied to release.
4) Rough estimate (calendar days)
3 h/day: ~1 day (~3 h)
2 h/day: ~2 days
1 h/day: ~3 days

P2 CI: optional HTTP smoke or load-regression job (post-main or nightly) To do

1) Summary
In GitHub Actions, run a small smoke test or short load run (health plus one protected route, or tools/load_testing/) so latency and 5xx regressions surface without manually running make run-loadtest-api. Builds on item 11.
2) Problem & value
Problem: Item 11’s tooling does not run on every merge by default.
Value: Catches perf/timeouts earlier; works with SLO alerts.
3) Scope & deliverables
Workflow that starts API + DB (or uses test stack), runs scripted checks with thresholds; document flakiness mitigations; keep optional or workflow_dispatch if cost is a concern.
4) Rough estimate (calendar days)
3 h/day: ~2 days (~6 h)
2 h/day: ~3 days
1 h/day: ~6 days

P2 Document versioning: Document Version field and change policy To do

1) Summary
Add explicit document versioning with a Document Version field so each docs page has a clear version value tied to updates and release cadence.
2) Problem & value
Problem: Without an explicit version marker, readers cannot quickly confirm whether they are looking at the latest or expected revision.
Value: Better traceability for reviews and audits; easier support communication when discussing specific document revisions.
3) Scope & deliverables
Define where Document Version is stored/rendered in docs templates; set increment rules (major / minor / patch or date-based); update contributor guidance for version bumps; optionally add a docs check that validates version presence and format.
4) Rough estimate (calendar days)
3 h/day: ~2 days (~6 h)
2 h/day: ~3 days
1 h/day: ~6 days

P0 — critical QA foundation: dedicated tester space, test process, and full testing pyramid (frontend + backend) To do

1) Summary
Create a dedicated QA/testing space and establish an end-to-end testing process for the project, covering both frontend and backend with a clear testing pyramid (unit, integration, API/contract, E2E, and smoke/regression).
2) Problem & value
Problem: Testing coverage is currently near zero for both frontend and backend, and there is no shared QA workflow, ownership model, or quality gate policy.
Value: Predictable release quality, earlier bug detection, lower regression risk, and a stable team process where engineers and testers work with one quality baseline.
3) Scope & deliverables
Define dedicated tester workspace and access model (test environment, test data, tooling); baseline quality strategy document with entry/exit criteria; target coverage and test pyramid ratios per layer; frontend and backend test suites with markers and ownership; CI gates for required checks; defect triage and bug lifecycle policy; release readiness checklist; onboarding guide for test practices and responsibilities.
4) Rough estimate (calendar days)
3 h/day: ~14 days (~42 h)
2 h/day: ~21 days
1 h/day: ~42 days

P1 Mobile docs UX: fix top navigation rendering on iPhone 17 Pro Max and adaptive layout by screen size Done

1) Summary
Fix the broken/awkward top navigation rendering on iPhone 17 Pro Max and implement responsive/platform-aware documentation layout behavior so reading and navigation stay comfortable across small, medium, and large screens.
2) Problem & value
Problem: The top of the docs navigation currently looks poor on iPhone 17 Pro Max, reducing usability and perceived quality on mobile devices.
Value: Better mobile first impression, faster page navigation, and more predictable UX with screen-size-specific behavior.
3) Scope & deliverables
Audit current docs shell/navigation on iOS Safari (including safe-area and notch behavior); define responsive breakpoints and layout rules for nav/sidebar/content; implement platform-friendly patterns (env(safe-area-inset-*), sticky/fixed header behavior, compact navigation states, touch target sizing, and readable typography scale); verify on key viewport buckets and document the rules in internal docs.
4) Rough estimate (calendar days)
3 h/day: ~4 days (~12 h)
2 h/day: ~6 days
1 h/day: ~12 days
5) Completion evidence / acceptance
  • Mobile title row uses safe-area aware top spacing via env(safe-area-inset-top).
  • Drawer panel uses safe-area top/bottom padding for notched devices.
  • Interactive controls in collapsed/compact states keep at least 44px touch target width.
  • Tablet/phone drawer mode hides desktop sidebar and keeps reliable open/close behavior.
  • Implemented behavior is documented in docs/internal/front/docs-frontend-menu-and-theme-controls.html.
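The delivered safe-area handling reduces to CSS of this shape (class names are illustrative; the real rules live in the docs stylesheet):

```css
/* Keep the mobile title row clear of the notch/status bar. */
.docs-header {
  padding-top: calc(0.5rem + env(safe-area-inset-top));
}

/* Drawer respects top and bottom insets on notched devices. */
.docs-drawer {
  padding-top: env(safe-area-inset-top);
  padding-bottom: env(safe-area-inset-bottom);
}

/* Compact controls keep the 44px minimum touch target. */
.docs-nav-toggle {
  min-width: 44px;
  min-height: 44px;
}
```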

Page history

Date Change Author
Added backlog item for iPhone 17 Pro Max navigation fix and adaptive docs layout by screen size. Ivan Boyarkin
Added critical QA foundation backlog item for tester space and full frontend/backend testing process. Ivan Boyarkin
Added backlog item for document versioning and Document Version field policy. Ivan Boyarkin
Expanded Redis rate-limiting backlog item and added Redis idempotency/cache adoption item. Ivan Boyarkin
Added Page history section (repository baseline). Ivan Boyarkin