Scenario · Replication & WAL
Replication slot dropped
A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.
L3 · 10–15 min · runs locally in Docker
Launch
Start this scenario
Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.
ride postgres start stage-04/08-replication-slot-dropped
Postmortem & investigation hints (spoilers)
Type: incident simulation · Topic: Replication & WAL · Level: L3 · Duration: 10–15 min
Launch: ride postgres start stage-04/08-replication-slot-dropped
POSTMORTEM (root cause · how it was found · the fix · lesson)
Root cause: the physical replication slot that the standby streams through
(`replica_slot`, configured as its primary_slot_name) was dropped. Without it the
walreceiver had no slot to attach to, so streaming stopped. Worse, the primary
no longer retained WAL on the standby's behalf, so WAL the standby still needed
could be recycled, leaving a gap.
How it was found: pg_replication_slots on the primary no longer listed
replica_slot; pg_stat_replication on the primary was empty; and
pg_stat_wal_receiver on the replica showed that the walreceiver was not streaming.
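
The three checks above can be sketched as the following psql queries. This is a minimal sketch, assuming PostgreSQL 12 or later (where primary_slot_name is an ordinary setting visible through SHOW); the column selection is illustrative and is not taken from the scenario itself.

```sql
-- On the PRIMARY: list all replication slots.
-- In this incident, no row with slot_name = 'replica_slot' is returned.
SELECT slot_name, slot_type, active, restart_lsn
FROM pg_replication_slots;

-- On the PRIMARY: list connected standbys.
-- An empty result means no standby is currently streaming.
SELECT application_name, client_addr, state, sent_lsn, replay_lsn
FROM pg_stat_replication;

-- On the REPLICA: which slot is the standby configured to use?
SHOW primary_slot_name;

-- On the REPLICA: walreceiver state.
-- Zero rows, or a status other than 'streaming', means WAL is not flowing.
SELECT status, slot_name, sender_host, latest_end_lsn
FROM pg_stat_wal_receiver;
```

Comparing the SHOW output with the slot list is what ties the two sides together: the replica asks for a slot name that the primary no longer has.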
The fix: recreate the slot on the primary with
pg_create_physical_replication_slot('replica_slot'); the walreceiver then
reconnected through it and streaming resumed.
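
As a hedged sketch, the slot can be recreated in a way that is safe to run twice. The NOT EXISTS guard is an addition for illustration, not part of the scenario's stated fix; a plain call to pg_create_physical_replication_slot('replica_slot') does the same job when the slot is known to be missing.

```sql
-- On the PRIMARY: create replica_slot only if it does not already exist.
-- Without the guard, a second call would fail with a "already exists" error.
SELECT pg_create_physical_replication_slot('replica_slot')
WHERE NOT EXISTS (
    SELECT 1
    FROM pg_replication_slots
    WHERE slot_name = 'replica_slot'
);
```

By default a new physical slot reserves WAL only once a consumer first connects through it. The function also accepts an optional second argument, immediately_reserve; passing true reserves WAL from the moment of creation, which may be worth considering if the standby will not reconnect right away.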
Lesson: a slot-based standby depends on its slot existing. Manage slots
carefully: dropping one removes the WAL retention that protects its consumer.
To restore streaming, recreate the expected slot. This is slot management, not
a failover or rebuild, and not an indexing problem.
INVESTIGATION HINTS (the staged path to diagnose and fix)
1. The replica stopped streaming, but the primary is healthy. This standby is configured to stream through a named replication slot, and that slot is gone. On the PRIMARY, run: SELECT slot_name, active, restart_lsn FROM pg_replication_slots; the expected `replica_slot` is missing from the result.
2. Confirm the standby is detached: pg_stat_replication on the primary is empty, and on the REPLICA pg_stat_wal_receiver shows it is not streaming, because the slot named in primary_slot_name no longer exists on the primary.
3. Recreate the expected slot on the PRIMARY: SELECT pg_create_physical_replication_slot('replica_slot'); the walreceiver reconnects through it. Don't drop other slots or add indexes.
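
After step 3, recovery can be confirmed with a few follow-up queries. This is a sketch under the assumption of PostgreSQL 10 or later (for the *_lsn column names and pg_wal_lsn_diff); the replay_lag_bytes alias is a hypothetical name chosen for illustration. The standby retries its connection every wal_retrieve_retry_interval (5 seconds by default), so allow a few seconds before checking.

```sql
-- On the PRIMARY: the slot should exist again and show active = true
-- once the walreceiver has attached to it.
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name = 'replica_slot';

-- On the PRIMARY: the standby should be listed with state = 'streaming'.
-- replay_lag_bytes is how far replay on the standby trails what was sent.
SELECT application_name,
       state,
       sent_lsn,
       replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- On the REPLICA: the walreceiver should report status = 'streaming'
-- and the slot it attached through.
SELECT status, slot_name
FROM pg_stat_wal_receiver;
```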