Scenario · Query Performance
Stale statistics after a data-distribution change
A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.
L2 · 10–15 min · runs locally in Docker
Launch
Start this scenario
Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.
ride postgres start stage-01/03-stale-statisticsPart of these paths
Show the postmortem & investigation hints spoilers
Stale statistics after a data-distribution change
Type: incident simulation · Topic: Query Performance · Level: L2 · Duration: 10–15 min
Launch: ride postgres start stage-01/03-stale-statistics
POSTMORTEM (root cause · how it was found · the fix · lesson)
Root cause: after the last ANALYZE, the data distribution changed dramatically
(most rows became status='paid'), but the statistics still described the old,
even distribution. The planner estimated the filter would match ~1/5 of the
table when it actually matched ~9/10, so its row estimates — and therefore its
plan choices — were badly wrong. Autovacuum was disabled on the table, so
nothing refreshed the stats automatically.
How it was found: EXPLAIN (ANALYZE) showed a large gap between estimated rows
("rows=") and actual rows ("actual rows="). pg_stat_user_tables confirmed a
stale last_analyze and a shifted live-tuple count.
The fix: ANALYZE orders; (or VACUUM ANALYZE orders;)
After refreshing statistics the estimates lined up with reality and the planner
could choose sensibly again. No new index was needed — reaching for one would
have treated the symptom, not the cause.
Lesson: bad plans after a bulk load, mass UPDATE, or big DELETE are often a
statistics problem, not an indexing problem. When estimates and actuals diverge,
ANALYZE first. Build ANALYZE into data-loading pipelines, and don't disable
autovacuum without a plan to keep stats fresh.
INVESTIGATION HINTS (the staged path to diagnose and fix)
1. The plan didn't get worse because of a missing index — it got worse because the planner's idea of the data is out of date. EXPLAIN (ANALYZE) and compare the estimated rows vs actual rows.
2. A large gap between 'rows=' (estimate) and 'actual rows=' means the statistics are stale. Check pg_stat_user_tables (last_analyze / last_autovacuum, n_live_tup) for the orders table.
3. The data distribution changed after the last ANALYZE and autovacuum is off, so stats never refreshed. Run ANALYZE orders; (or VACUUM ANALYZE orders;) and re-check the estimates. You do NOT need a new index here.