From d1490c4639aafd8f6764b64ff3e160d1583968ec Mon Sep 17 00:00:00 2001
From: Konrad du Plessis
Date: Fri, 24 Apr 2026 00:23:48 +0200
Subject: [PATCH] docs(perf): design for Quick-Wins Pass A
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Short design covering four changes: mtime-based CSS cache-bust token,
Django Debug Toolbar (dev-only) for profiling, N+1 fixes on Dashboard
and Payroll pages, and a before/after measurement in the commit
message. Scope is deliberately tight: plan B (template splitting) and
plan C (full audit) are deferred until plan A evidence lands.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../2026-04-24-perf-quick-wins-design.md      | 175 ++++++++++++++++++
 1 file changed, 175 insertions(+)
 create mode 100644 docs/plans/2026-04-24-perf-quick-wins-design.md

diff --git a/docs/plans/2026-04-24-perf-quick-wins-design.md b/docs/plans/2026-04-24-perf-quick-wins-design.md
new file mode 100644
index 0000000..d7929d2
--- /dev/null
+++ b/docs/plans/2026-04-24-perf-quick-wins-design.md
@@ -0,0 +1,175 @@
+# Perf Quick-Wins Pass: Design (24 Apr 2026)
+
+## Origin
+
+Konrad, after a long stretch of feature work (Inline Filters + Adjustments
+tab + filter-bar v2):
+
+> _"the app feel a bit sluggish especially changing between main spaces.
+> Go through the app systematically and look for bugs and un optimized
+> code. systematically go through the code and expertly and thoroughly
+> review and fix it."_
+
+Presented three scopes (A quick-wins / B focused dashboard pass / C full
+systematic audit). Konrad picked **A: quick-wins first**, on the
+principle that perf work is notorious for "big rewrite that didn't help."
+If A moves the needle, we can stop; if not, we escalate with evidence.
+
+## Goal
+
+Make navigation between main spaces (Dashboard ↔ Payroll ↔ Workers ↔
+Report ↔ Teams ↔ Projects) feel snappier. Ship in 1-3 commits. No
+architecture changes. Every change individually revertible.
+
+## Who it's for
+
+Everyone who uses the app, most immediately Konrad, who navigates
+between Dashboard and Payroll dozens of times a day.
+
+## What we already know (pre-measurement)
+
+- `payroll_dashboard.html` is 213 KB / 4,147 lines, with all 4 tabs rendered
+  server-side even when only one is visible. Addressed in plan B, not A.
+- `deployment_timestamp` context var is `int(time.time())` per-request
+  → `custom.css?v=` is a new URL every second → Cloudflare
+  edge-cache HIT rate on CSS is effectively 0 → every page load fetches
+  64 KB of CSS from the VM. Documented as a trade-off in CLAUDE.md.
+  This is almost certainly the biggest single contributor to the
+  "heavy navigation" feel.
+- 49 `select_related`/`prefetch_related` calls vs 91
+  `.all()/.first()/.count()` calls in `views.py`. Not damning but worth
+  pointing at hot-path views.
+
+## Scope: 4 changes, in order
+
+### 1. Fix `deployment_timestamp` to bust cache only on real deploys
+
+**File:** `core/context_processors.py`
+
+Today:
+```python
+'deployment_timestamp': int(time.time()),
+```
+
+After:
+```python
+# Cache-bust token tied to custom.css's mtime; it changes only when the
+# file does. Falls back to int(time.time()) if the file isn't on disk yet
+# (fresh container, pre-collectstatic). Needs os, time, settings imported.
+try:
+    _css_path = settings.BASE_DIR / 'static' / 'css' / 'custom.css'
+    _token = int(os.path.getmtime(_css_path))
+except OSError:  # also covers FileNotFoundError, which subclasses OSError
+    _token = int(time.time())
+```
+
+Effect: the `?v=...` query string stays constant across requests until
+`custom.css` is modified. Cloudflare can finally hold the file at its
+edge for its full 4h TTL. Repeat navigation within a session drops from
+"fetch 64 KB from VM" to a browser-cache hit (at most a "304 Not
+Modified" revalidation), after the first hit in a 4h window.
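A minimal, self-contained sketch of the mtime-based token described above. The helper names `css_cache_token` and `static_url_with_token` and the `css_path` parameter are illustrative assumptions, not names from `core/context_processors.py`; the real context processor would presumably pass `settings.BASE_DIR / 'static' / 'css' / 'custom.css'`.

```python
import os
import time
from pathlib import Path


def css_cache_token(css_path: Path) -> int:
    """Return a cache-bust token that changes only when css_path changes.

    Hypothetical helper: the name and signature are assumptions made for
    this sketch, not the project's actual API.
    """
    try:
        # Whole-second mtime: stable across requests until the file is rewritten.
        return int(os.path.getmtime(css_path))
    except OSError:
        # File missing (e.g. fresh container): fall back to the old
        # per-request behaviour instead of raising.
        return int(time.time())


def static_url_with_token(url: str, css_path: Path) -> str:
    """Append the token as a ?v= query string (illustrative only)."""
    return f"{url}?v={css_cache_token(css_path)}"
```

Requests served between edits of the file get the same `?v=` value, which is what lets an edge cache or browser reuse the response; a new URL appears only once a rewrite changes the file's mtime.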
+
+**Degraded-mode guarantee:** if the file is missing (shouldn't happen in
+normal dev or prod, but could in a fresh container), we degrade to
+today's behaviour (per-request timestamp) rather than crash.
+
+### 2. Profile + fix N+1 on the two busiest pages
+
+**Pages:** `/` (dashboard) and `/payroll/` (payroll dashboard, all 4
+tabs: Pending, History, Loans, Adjustments).
+
+**Tool:** Django Debug Toolbar, added to `requirements.txt` as a
+dev-only dependency. Gated in `config/settings.py` so it only
+initialises when `DJANGO_DEBUG=true` AND `USE_SQLITE=true` (never
+loads in prod).
+
+**Process:**
+1. Install toolbar, confirm the SQL panel loads on `/`.
+2. Navigate to `/`, read the SQL tab: flag any query count > ~20,
+   any row with `+N duplicate queries`, any queryset access
+   that could be answered with `select_related`/`prefetch_related`/
+   `annotate(Count/Sum)`.
+3. Fix each flag with the minimal ORM change. One fix = one commit.
+4. Re-run, confirm query count dropped, confirm no test regressions.
+5. Repeat for `/payroll/?status=pending`, `/payroll/?status=history`,
+   `/payroll/?status=loans`, `/payroll/?status=adjustments`.
+
+**Likely suspects** (prediction, to be confirmed by toolbar):
+- **Dashboard cert-expiry card**: aggregates expired/expiring certs
+  across active workers. If it loops in Python instead of using
+  `annotate`-plus-filter, that's an N+1.
+- **Pending payments table**: worker + team + overdue calc per row.
+  The overdue check calls `get_pay_period(team)` per worker; if teams
+  aren't prefetched we're firing one SELECT per row.
+- **Adjustments tab groupings**: we fixed `worker.teams.first()` →
+  `worker.teams.all()` once already (commit `06b3315`); worth
+  double-checking the grouped view for similar patterns.
+
+**Out of scope for this step:** any fix that requires a template
+rewrite. If something needs more than a `.select_related()` /
+`.prefetch_related()` / `.annotate()` tweak, it goes on the plan-B list.
+
+### 3. Double-check WeasyPrint is not eager-imported anywhere
+
+**Files:** `core/utils.py`, `core/views.py`.
+
+We already lazy-import WeasyPrint in `render_to_pdf()` (per CLAUDE.md).
+I'll grep to confirm nothing else on the app-boot path imports
+`weasyprint` at module level. If anything does, move it into a function
+body. 10 minutes, zero-risk.
+
+### 4. Commit message includes before/after measurement
+
+The final commit's message records:
+- Page size bytes (DOM serialized) for `/` and `/payroll/` before & after
+- Network request count on a cold-cache load
+- SQL query count on both pages
+
+If the numbers don't materially improve after steps 1-3, I stop. We
+don't press on to plan B without evidence that plan A helped (or at
+least surfaced what's actually slow).
+
+## What I will NOT touch in this pass
+
+- Splitting `payroll_dashboard.html` into tab partials
+- Any refactoring of `views.py` or extraction of helpers
+- Any visual / UX change
+- Tests: query-count changes don't break the existing tests (they
+  assert URL contract + output shape, not query plans). If a test
+  genuinely needs updating because I materially changed a view's
+  behaviour, I'll note why in the commit
+
+## Risks + rollback
+
+All four changes are individually revertible. Biggest risks:
+
+- **mtime-based token misfires on fresh containers**: mitigated by
+  try/except fallback to today's behaviour
+- **A `select_related` fix changes query semantics**: e.g., eager
+  loading a nullable FK that used to be accessed lazily-with-None. Low
+  risk on Django's ORM, but the test suite (65 tests, all passing at
+  HEAD `503eff6`) will catch any behavioural regression
+- **Django Debug Toolbar loaded in prod**: mitigated by the double gate
+  (`DJANGO_DEBUG=true` AND `USE_SQLITE=true`) in `config/settings.py`
+
+Rollback: `git revert` on the offending commit. No data, schema,
+or URL-contract impact.
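The double gate named in the last risk could look roughly like the sketch below. It is a pure-Python stand-in for what would live in `config/settings.py`: the helper names `env_flag`, `toolbar_enabled`, and `apply_debug_toolbar` are hypothetical, and treating only the string `true` as enabled is an assumption. The two environment variable names come from this document; the `debug_toolbar` app label and the `debug_toolbar.middleware.DebugToolbarMiddleware` path are the ones Django Debug Toolbar itself uses.

```python
import os
from typing import Mapping


def env_flag(env: Mapping[str, str], name: str) -> bool:
    """Treat only the literal string 'true' (any case) as enabled."""
    return env.get(name, "").strip().lower() == "true"


def toolbar_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Both gates must be open; a missing variable counts as closed."""
    return env_flag(env, "DJANGO_DEBUG") and env_flag(env, "USE_SQLITE")


def apply_debug_toolbar(installed_apps, middleware, env=os.environ):
    """Return (apps, middleware) with the toolbar added only in local dev.

    Hypothetical helper: in a real settings module this would adjust
    INSTALLED_APPS and MIDDLEWARE directly.
    """
    apps, mw = list(installed_apps), list(middleware)
    if toolbar_enabled(env):
        apps.append("debug_toolbar")
        # The toolbar's docs recommend placing its middleware early in the list.
        mw.insert(0, "debug_toolbar.middleware.DebugToolbarMiddleware")
    return apps, mw
```

Because a missing variable counts as closed, a production environment that sets neither flag never sees the toolbar in its app list. A real setup also needs the toolbar's URL include and `INTERNAL_IPS`, which this sketch leaves out.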
+
+## Out of scope (explicit non-goals)
+
+- Plan B / C items (template splitting, written baseline doc, whole-app
+  measurement)
+- Moving CDN assets to local / self-hosted
+- Changing Flatlogic's `runserver` → gunicorn
+- Turning on HTTP/2 push, service workers, or other frontend perf tooling
+- Any refactor that requires a migration
+
+## Next step
+
+Generate an implementation plan via the writing-plans skill
+(task-by-task, bite-sized steps) and then execute via
+subagent-driven-development. Auto mode is active: proceed
+continuously, no mid-execution checkpoints (plan A is 4 small
+mechanical changes; a checkpoint adds overhead without value).
+
+Ship alongside current `ai-dev` HEAD (`503eff6`) in the same branch.