Scenario · Replication & WAL
Read replica stale reads
A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.
L2 · 10–15 min · runs locally in Docker
Launch
Start this scenario
Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.
ride postgres start stage-04/03-read-replica-stale-reads
Postmortem & investigation hints (spoilers)
Type: incident simulation

POSTMORTEM (root cause · how it was found · the fix · lesson)

Root cause: The application wrote to the primary and immediately read from the replica, but WAL replay on the replica was lagging, so the read returned an older snapshot. The data was never wrong; the replica simply had not applied the latest WAL yet. This is a read-after-write consistency problem caused by replication lag, not a transaction-isolation or query problem.

How it was found: The primary's pg_current_wal_lsn() was well ahead of the replica's pg_last_wal_replay_lsn(), and a row count on the replica trailed the primary while replay was stalled.

The fix: Resume WAL replay so the replica catches up. Once it had caught up, the read returned current data.

Lesson: For consistency-sensitive reads, route read-after-write queries to the primary, or wait until the replica's replay LSN has reached the LSN of the write. Don't write to a read-only replica, and don't treat stale reads as an isolation or indexing issue.

INVESTIGATION HINTS (the staged path to diagnose and fix)

1. The app reads from the replica right after writing to the primary and sees old data. That is read-after-write inconsistency caused by replication lag, not a transaction-isolation bug. Compare the two ends.

2. On the PRIMARY, run SELECT pg_current_wal_lsn(); and count the churn rows. On the REPLICA, run SELECT pg_is_in_recovery(); and SELECT pg_last_wal_replay_lsn();. The replica's replay LSN is stuck behind the primary's current LSN, so its reads are stale.

3. Let the replica catch up by running SELECT pg_wal_replay_resume(); on the replica. Don't try to UPDATE the replica (it is read-only), and don't add indexes: the data is correct, just not yet replayed.
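The "wait for the replica's replay LSN to reach the write's LSN" option from the lesson can be sketched in application code. This is a minimal illustration, not part of the scenario. The helper names (parse_lsn, replay_lag_bytes, choose_read_target, wait_for_replay) are hypothetical. The get_replay_lsn callable is assumed to return the text result of SELECT pg_last_wal_replay_lsn() from the replica. The only PostgreSQL fact it relies on is that an LSN prints as two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit WAL position.

```python
import time
from typing import Callable


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000148' into a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(primary_lsn: str, replica_replay_lsn: str) -> int:
    """Bytes of WAL the replica has not replayed yet (never negative)."""
    return max(0, parse_lsn(primary_lsn) - parse_lsn(replica_replay_lsn))


def choose_read_target(write_lsn: str, replica_replay_lsn: str) -> str:
    """Use the replica only if it has already replayed the write."""
    if parse_lsn(replica_replay_lsn) >= parse_lsn(write_lsn):
        return "replica"
    return "primary"


def wait_for_replay(
    get_replay_lsn: Callable[[], str],  # hypothetical: queries the replica
    write_lsn: str,
    timeout_s: float = 2.0,
    poll_s: float = 0.05,
) -> bool:
    """Poll until the replica's replay LSN reaches write_lsn.

    Returns True if the replica caught up, False if the timeout expired.
    """
    target = parse_lsn(write_lsn)
    deadline = time.monotonic() + timeout_s
    while True:
        if parse_lsn(get_replay_lsn()) >= target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

LSNs are compared as integers rather than as strings because a string comparison would rank '0/FFFFFFFF' above '1/0'. In this sketch a caller would record pg_current_wal_lsn() on the primary right after its write, then either call choose_read_target with the replica's current replay LSN or call wait_for_replay and fall back to the primary if it returns False. While replay is paused, as in this incident, waiting would simply time out, so the fallback to the primary matters.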