Scenario · Storage & Backup
Disk full from temp files
A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.
L2 · 10–15 min · runs locally in Docker
Launch
Start this scenario
Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.
ride postgres start stage-05/01-disk-full-from-temp-files
Postmortem & investigation hints (spoilers)
Type: incident simulation · Topic: Storage & Backup · Level: L2 · Duration: 10–15 min

POSTMORTEM (root cause · how it was found · the fix · lesson)

Root cause: a workload ran huge sorts that spilled to temporary files far beyond work_mem, churning the temp area and threatening to fill the disk. temp_file_limit capped each attempt, but the workload kept retrying, so temp_files and temp_bytes climbed continuously. This is storage pressure, not a query-tuning task.

How it was found: pg_stat_database.temp_files and temp_bytes for the database kept rising, and pg_stat_activity showed one application (temp_spill) repeatedly spilling.

The mitigation: stop the runaway temp-spilling workload. Temp-file creation then stopped.

Lesson: trace temp-file pressure to the workload producing it, then stop or fix that workload (an appropriate per-query work_mem, an index, or batching). Don't raise work_mem globally: the setting applies to every backend, and to each sort or hash operation within one, so the cost multiplies and adds to the pressure on the host. Never delete files by hand.

INVESTIGATION HINTS (the staged path to diagnose and fix)

1. Queries are failing or degrading from temporary-file pressure, not a plan problem. Look at temp usage:
   SELECT datname, temp_files, temp_bytes FROM pg_stat_database WHERE datname = current_database();
   The counters keep climbing.
2. Check the guardrails with SHOW temp_file_limit; and SHOW work_mem; then find the culprit in pg_stat_activity: one application (temp_spill) is in a loop, spilling huge sorts to disk.
3. Stop the runaway temp-spilling workload:
   SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE application_name LIKE 'temp_spill%';
   Temp pressure then stops growing. Don't crank work_mem globally (every backend could use up to that much per sort) and don't add indexes.
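The hints above can be sketched as a short psql session. This is a minimal illustration, not the scenario's reference solution: the extra columns (stats_reset, running_for, query_head) are added here for readability, and the 64MB value in step 3 is an arbitrary placeholder, not a recommendation from the scenario.

```sql
-- Sketch only: run in psql against the sandbox database.

-- 1. Sample the cumulative temp counters. Run this twice a few seconds apart
--    and compare; the counters only grow until statistics are reset.
SELECT now() AS sampled_at,
       datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_bytes,
       stats_reset
FROM pg_stat_database
WHERE datname = current_database();

-- 2. Preview which backends the terminate statement would hit
--    before actually running it.
SELECT pid,
       application_name,
       state,
       now() - query_start AS running_for,
       left(query, 60)     AS query_head
FROM pg_stat_activity
WHERE application_name LIKE 'temp_spill%'
  AND pid <> pg_backend_pid();

-- 3. Per-query alternative to a global work_mem change: SET LOCAL lasts only
--    until the end of the current transaction.
--    '64MB' is a placeholder value chosen for illustration.
BEGIN;
SET LOCAL work_mem = '64MB';
-- ... run the one large sort here ...
COMMIT;
```

In psql, ending the first query with \watch 5 instead of a semicolon re-runs it every five seconds, which makes the climb in temp_files and temp_bytes easy to see. Previewing the matching backends in step 2 is a cautious habit before pg_terminate_backend, since a LIKE pattern may match more sessions than expected.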