Scenario · Connections & Pooling
Too many connections
A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.
L1 · 10–15 min · runs locally in Docker
Launch
Start this scenario
Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.
ride postgres start stage-03/01-too-many-connectionsPart of these paths
Show the postmortem & investigation hints spoilers
Too many connections Type: incident simulation · Topic: Connections & Pooling · Level: L1 · Duration: 10–15 min Launch: ride postgres start stage-03/01-too-many-connections POSTMORTEM (root cause · how it was found · the fix · lesson) Root cause: the application opened far more connections than it needed and sat near max_connections, so new connections started failing with `FATAL: sorry, too many clients already`. Queries themselves were fine — the database ran out of connection slots, most of them held idle by one app. How it was found: count(*) on pg_stat_activity was close to max_connections; grouping by application_name/state showed one app (app_api_worker) holding most slots, largely idle. The mitigation: terminate the idle app connections to free slots. Lesson: connection exhaustion is not a query problem — don't reach for indexes. The real fix is a connection pool (PgBouncer or an app-side pool) and a sane app pool size, not raising max_connections (each connection costs memory). INVESTIGATION HINTS (the staged path to diagnose and fix) 1. New connections are failing but queries are fine — this is connection exhaustion, not a slow query. Count them: SELECT count(*) FROM pg_stat_activity; and compare to SHOW max_connections. 2. Group to find the culprit: SELECT application_name, state, count(*) FROM pg_stat_activity GROUP BY application_name, state ORDER BY 3 DESC. One app is holding most of the slots, much of it idle. 3. Shed the idle app connections to free slots: pg_terminate_backend(pid) for that application_name where state = 'idle'. Don't kill your own session or system backends; the real fix is a connection pool / lower app pool size.