← All scenarios

Scenario · Connections & Pooling

Too many connections

A sandboxed PostgreSQL incident — investigate with your own tools, submit a fix, and get deterministic Detect / Fix / Trap scoring.

L1 · 10–15 min · runs locally in Docker

Launch

Start this scenario

Boot it in a real PostgreSQL sandbox and investigate with psql, EXPLAIN and pg_stat_statements.

ride postgres start stage-03/01-too-many-connections

Part of these paths

Show the postmortem & investigation hints spoilers
Too many connections
Type: incident simulation · Topic: Connections & Pooling · Level: L1 · Duration: 10–15 min
Launch: ride postgres start stage-03/01-too-many-connections

POSTMORTEM (root cause · how it was found · the fix · lesson)
Root cause: the application opened far more connections than it needed and sat
near max_connections, so new connections started failing with `FATAL: sorry,
too many clients already`. Queries themselves were fine — the database ran out
of connection slots, most of them held idle by one app.

How it was found: count(*) on pg_stat_activity was close to max_connections;
grouping by application_name/state showed one app (app_api_worker) holding most
slots, largely idle.

The mitigation: terminate the idle app connections to free slots.

Lesson: connection exhaustion is not a query problem — don't reach for indexes.
The real fix is a connection pool (PgBouncer or an app-side pool) and a sane app
pool size, not raising max_connections (each connection costs memory).

INVESTIGATION HINTS (the staged path to diagnose and fix)
1. New connections are failing but queries are fine — this is connection exhaustion, not a slow query. Count them: SELECT count(*) FROM pg_stat_activity; and compare to SHOW max_connections.
2. Group to find the culprit: SELECT application_name, state, count(*) FROM pg_stat_activity GROUP BY application_name, state ORDER BY 3 DESC. One app is holding most of the slots, much of it idle.
3. Shed the idle app connections to free slots: pg_terminate_backend(pid) for that application_name where state = 'idle'. Don't kill your own session or system backends; the real fix is a connection pool / lower app pool size.