Scenario · Migrations & Releases
ALTER TABLE locks production
A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.
L3 · 10–15 min · runs locally in Docker
Launch
Start this scenario
Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.
ride postgres start stage-09/01-alter-table-locks-productionPart of these paths
Show the postmortem & investigation hints spoilers
ALTER TABLE locks production Type: incident simulation · Topic: Migrations & Releases · Level: L3 · Duration: 10–15 min Launch: ride postgres start stage-09/01-alter-table-locks-production POSTMORTEM (root cause · how it was found · the fix · lesson) Root cause: a release ran an unsafe `ALTER TABLE orders ...` migration against the production database (app_db). The migration took an ACCESS EXCLUSIVE lock and held its transaction open, so every production INSERT/UPDATE — which needs ROW EXCLUSIVE — blocked behind it. The look-alike analytics_db was never involved. How it was found: pg_stat_activity (filtered to datname = 'app_db') showed a `migration_runner` session idle in transaction holding the lock, and an `app_writer` session with wait_event_type = 'Lock'. pg_blocking_pids() tied the waiter to the migration's pid. The fix: terminate the migration session (application_name LIKE 'migration_runner%'). Releasing its ACCESS EXCLUSIVE lock lets the queued writes drain immediately. Lesson: a migration is a write outage waiting to happen if it holds a strong lock on a hot table. Break heavy DDL into safe phases, set a short `lock_timeout` / `statement_timeout` on migrations, and schedule the risky ones. When it's already stuck, end the *migration* (the blocker) — not the app writers — and never reach for an index: this is a locking/release-discipline incident, not a query problem. INVESTIGATION HINTS (the staged path to diagnose and fix) 1. Production writes to app_db.orders started hanging during a release. Look at pg_stat_activity (datname = 'app_db'): a migration session is holding a lock and a writer is waiting on it. 2. The migration grabbed an ACCESS EXCLUSIVE lock (an unsafe ALTER TABLE) and kept its transaction open. Every INSERT/UPDATE needs ROW EXCLUSIVE, which conflicts — so writes queue behind the migration. pg_blocking_pids() ties the waiter to the migration's pid. 3. End the migration session (application_name LIKE 'migration_runner%'), not the app writers. Once its lock is released, the queued writes drain. This is not an index problem.