Scenario · Connections & Pooling
Runaway query holds a pool connection
A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.
L3 · 10–15 min · runs locally in Docker
Launch
Start this scenario
Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.
ride postgres start stage-03/07-runaway-query-holds-pool-connectionPart of these paths
Show the postmortem & investigation hints spoilers
Runaway query holds a pool connection Type: incident simulation · Topic: Connections & Pooling · Level: L3 · Duration: 10–15 min Launch: ride postgres start stage-03/07-runaway-query-holds-pool-connection POSTMORTEM (root cause · how it was found · the fix · lesson) Root cause: a few long-running queries stayed ACTIVE for minutes, each pinning one of PgBouncer's server connections. With the pool's slots all occupied, every other client queued (cl_waiting > 0) and the app appeared to hang — even though PostgreSQL itself had plenty of connection slots free. This is a pooling incident, not a lock and not a missing index. How it was found: pg_stat_activity showed Postgres far from max_connections but a handful of queries active far longer than any healthy request; PgBouncer's SHOW POOLS showed all server connections busy with clients waiting. The mitigation: terminate the runaway queries holding the pool's server connections; the queued client is promoted and cl_waiting drops to 0. Lesson: a runaway query on a pooled path starves everyone behind it. Cap it with statement_timeout (and idle_in_transaction_session_timeout), size the pool for the workload, and keep pooled transactions short — don't reach for indexes or for raising Postgres max_connections. INVESTIGATION HINTS (the staged path to diagnose and fix) 1. The app hangs, but PostgreSQL is NOT at max_connections — there's headroom in pg_stat_activity. The pool one layer up (PgBouncer) is starved. Look for long-running ACTIVE queries holding the pool's server connections. 2. On the direct Postgres connection: SELECT pid, application_name, state, now() - query_start AS age, query FROM pg_stat_activity WHERE state = 'active' ORDER BY query_start; A couple of queries have been active far too long. On the PgBouncer admin console, SHOW POOLS shows cl_waiting > 0 with every server slot busy. 3. Terminate the long-running queries that are pinning the pool's server connections (pg_terminate_backend for the runaway_query backends) — not the waiters. SHOW POOLS cl_waiting then drops to 0. The durable fix is statement_timeout and not running unbounded queries on a pooled path.