← All scenarios

Scenario · Vacuum & Bloat

Table bloat after update churn

A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.

L2 · 10–15 min · runs locally in Docker

Launch

Start this scenario

Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.

ride postgres start stage-07/02-table-bloat-after-update-churn

Part of these paths

Show the postmortem & investigation hints spoilers
Table bloat after update churn
Type: incident simulation · Topic: Vacuum & Bloat · Level: L2 · Duration: 10–15 min
Launch: ride postgres start stage-07/02-table-bloat-after-update-churn

POSTMORTEM (root cause · how it was found · the fix · lesson)
Root cause: heavy update/delete churn on analytics_db.events left a large number of
dead tuples (bloat) occupying space, so the table grew and scans slowed. It's not a
query-plan or index problem — it's reclaimable dead space.

How it was found: pg_stat_user_tables showed a high n_dead_tup and a large
pg_total_relation_size for events.

The mitigation: VACUUM ANALYZE events — reclaims the dead space for reuse.

Lesson: churn creates bloat; a plain VACUUM reclaims dead space for reuse (it does
NOT shrink the file on disk). VACUUM FULL would shrink it but takes an ACCESS
EXCLUSIVE lock — use a maintenance window or pg_repack for physical shrink, not as
an incident reflex. An index is unrelated, and vacuuming the wrong database does
nothing.

INVESTIGATION HINTS (the staged path to diagnose and fix)
1. An analytics table got slow and large after heavy update/delete churn. Look at dead tuples and size: \connect analytics_db then SELECT relname, n_live_tup, n_dead_tup, pg_size_pretty(pg_total_relation_size(relid)) FROM pg_stat_user_tables WHERE relname='events'; lots of dead tuples.
2. This is table bloat from churn — dead tuples occupying space. The first remediation is a plain VACUUM (reclaims space for reuse), not an index and not VACUUM FULL.
3. Reclaim it: \connect analytics_db then VACUUM ANALYZE events; Avoid VACUUM FULL (it takes a heavy ACCESS EXCLUSIVE lock — a maintenance-window/pg_repack job, not an incident reflex), don't add an index, and don't vacuum the wrong database.