← All bricks

PostgreSQL · Advanced

WAL Archiving and Point-in-Time Recovery

Understand WAL as a change journal, enable archive-mode, run pg_basebackup, then perform a real Point-in-Time Recovery — restoring a database to the moment before a destructive DELETE.

Every production Postgres needs more than a nightly dump. This course builds the full safety net from scratch: we turn on WAL archiving so every committed change lands in a durable archive directory, snapshot the live cluster with pg_basebackup, then stage a real disaster — an accidental DELETE that wipes the table — and use Point-in-Time Recovery to rewind the database to the exact second before it happened. Along the way we read LSNs, understand what archive_command does on every segment switch, and learn why a base backup alone is not enough without a complete WAL chain. The recovery mechanics — restore_command, recovery_target_time, recovery.signal, and promote — are exercised on real data, not slides.

What you'll build

  • Explain what WAL is and why it enables crash recovery and archiving
  • Read pg_current_wal_lsn() and track LSN movement across commits
  • Configure wal_level, archive_mode, and archive_command in postgresql.conf
  • Verify that WAL segments land in the archive directory after pg_switch_wal()
  • Create a replication role and run pg_basebackup with streaming WAL
  • Understand the directory layout a base backup produces
  • Perform a complete PITR: recovery.signal, restore_command, recovery_target_time, promote
  • Distinguish recovery_target_time, recovery_target_name, recovery_target_xid, and recovery_target_lsn

Contents

  1. A primary with a mounted config and an archive
  2. The server configuration we own
  3. Access rules for the stand
  4. Start the server
  5. The table that proves recovery works
  6. Connect with psql
  7. Create the table
  8. Fifty thousand orders, deterministically
  9. Load the initial data
  10. Three questions about the WAL
  11. The WAL before archiving is enabled
  12. Every commit moves the LSN
  13. The LSN after five more commits
  14. Three lines that enable the archive
  15. Restart to apply the new GUC values
  16. Archiving is now live
  17. Force the first segment to archive
  18. The first segment is in the archive
  19. Open the replication protocol
  20. Create the replication role
  21. The backup script
  22. Take the base backup
  23. What the base backup contains
  24. The LSN before the additional writes
  25. The writes we want to recover to
  26. Add the second batch and note the time
  27. Fifty-five thousand rows, all present
  28. The destructive operation
  29. Archive the segment that contains the DELETE
  30. The archive holds the complete history
  31. The recovery script
  32. What recovery.signal does
  33. Perform the recovery
  34. The data is back
  35. The archiver's ledger
  36. Four ways to pin the recovery target
  37. Reading the WAL directly
  38. Timeline 2 on the recovered instance
  39. What we built
  40. A two-query health check
  41. Try it yourself: recover to the original 50 000