Playground (pgpg) · REPL-first incident simulator

Practice PostgreSQL incidents in Playground (pgpg)

Playground (pgpg) is a REPL-first PostgreSQL incident simulator with 112 hands-on scenarios across 12 production tracks. Start real PostgreSQL failures in Docker, investigate with psql, get hints, submit your fix, and receive deterministic scoring from one interactive terminal session.

REPL-first · Real PostgreSQL · psql · Deterministic scoring · 112 scenarios
What a Playground session looks like

Run pgpg, start an incident, investigate with psql, then submit and score

No toy editor, no fake database UI, no copying session IDs between commands. pgpg opens an interactive REPL that briefs you on the incident, prints a connection string, and scores your fix, while you investigate a real PostgreSQL sandbox with the tools engineers already use.

Step 1 — Enter the Playground
REPL
    $ pgpg
    Playground (pgpg)
    Type help for commands.
    pgpg> help
      start <scenario>   start an incident
      submit · score · stop

Start pgpg without arguments to enter the interactive REPL.

Step 2 — Start an incident
REPL
    pgpg> start stage-01/01-missing-index --profile dev
    Starting PostgreSQL sandbox…
    Incident: missing-index
    A checkout endpoint suddenly got much slower. The database looks
    healthy otherwise — find out what changed and fix it.
    Connect and investigate:
      postgres://postgres:postgres@127.0.0.1:55432/app_db
    pgpg[missing-index]>

pgpg briefs you on the incident and prints a connection string. The prompt tracks the active incident, so the next commands know which session to use.
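To make the idea of a prompt that tracks the active incident concrete, here is a minimal sketch built on Python's standard cmd module. It is not pgpg's implementation: the class name IncidentRepl and the rule for deriving the prompt label (last path component with its numeric prefix stripped) are assumptions made for illustration.

```python
import cmd


class IncidentRepl(cmd.Cmd):
    """Toy REPL whose prompt reflects the active incident (hypothetical, not pgpg's code)."""

    prompt = "pgpg> "

    def __init__(self):
        super().__init__()
        self.active = None  # label of the running incident, if any

    def do_start(self, arg):
        """start <scenario>: remember the scenario and show it in the prompt."""
        parts = arg.split()
        if not parts:
            print("usage: start <scenario>")
            return
        # Assumption: label = last path component minus its numeric prefix.
        label = parts[0].rsplit("/", 1)[-1].lstrip("0123456789").lstrip("-")
        self.active = label
        self.prompt = f"pgpg[{label}]> "

    def do_stop(self, arg):
        """stop: forget the active incident and restore the plain prompt."""
        self.active = None
        self.prompt = "pgpg> "


repl = IncidentRepl()
repl.onecmd("start stage-01/01-missing-index --profile dev")
print(repl.prompt)
```

Because the session lives in the REPL object, later commands such as submit or score can read self.active instead of asking you for a session ID.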

Step 3 — Investigate with psql
psql
    $ psql "postgres://postgres:postgres@127.0.0.1:55432/app_db"
    app_db=> EXPLAIN ANALYZE
    app_db-> SELECT * FROM orders WHERE customer_id = 42;
    Seq Scan on orders (cost=0.00..18650.00 …)

pgpg creates the incident. You investigate with real PostgreSQL tools.
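Some PostgreSQL tools prefer separate connection settings over a single URI. As a small sketch, the printed connection string can be split into the standard libpq environment variables (PGHOST, PGPORT, PGUSER, PGPASSWORD, PGDATABASE) with Python's standard library. The helper name libpq_env is hypothetical and not part of pgpg.

```python
from urllib.parse import unquote, urlsplit


def libpq_env(uri: str) -> dict:
    """Map a postgres:// URI onto the standard libpq environment variables."""
    parts = urlsplit(uri)
    return {
        "PGHOST": parts.hostname or "",
        "PGPORT": str(parts.port or 5432),  # 5432 is PostgreSQL's default port
        "PGUSER": unquote(parts.username or ""),
        "PGPASSWORD": unquote(parts.password or ""),
        "PGDATABASE": unquote(parts.path.lstrip("/")),
    }


env = libpq_env("postgres://postgres:postgres@127.0.0.1:55432/app_db")
for key, value in env.items():
    print(f"{key}={value}")
```

With those variables exported, running psql with no arguments connects to the same sandbox.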

Step 4 — Submit and score
Deterministic
    pgpg[missing-index]> submit
    Submission received.
    pgpg[missing-index]> score
    Detect: pass
    Fix: pass
    Trap: pass
    Score: 100 / 100
    pgpg[missing-index]> stop
    Sandbox stopped.
    pgpg> exit

Scoring is deterministic and offline. AI can assist, but it does not judge the result.

pgpg manages the incident session. psql is where you investigate. The scorecard tells you whether you detected, fixed, and avoided traps.

What pgpg is

A flight simulator for PostgreSQL operations

pgpg is an interactive terminal playground for PostgreSQL incidents. It runs as a REPL-first CLI: it manages the incident session while you investigate the database with psql. Each scenario provisions a real PostgreSQL environment, injects a realistic failure, and scores your attempt on three checks: Detect, Fix and Trap.

Real PostgreSQL sandbox

Every scenario runs an actual PostgreSQL instance in a container — not a multiple-choice quiz or a video.

Real incident scenarios

Missing indexes, stale statistics, lock contention, replication lag, disk pressure and WAL growth — injected into a live system.

AI hints + deterministic scoring

Get progressive hints while you investigate, a scorecard for your fix and a postmortem explaining the root cause.

Built for individuals and teams

Practice alone, onboard new engineers, or run repeatable internal PostgreSQL incident drills.

How it works

From paged to postmortem in four steps

Enter the Playground

Run pgpg to open the interactive REPL.

Start an incident

Choose one of 112 PostgreSQL scenarios across 12 tracks.

Investigate with psql

Use real PostgreSQL tools — EXPLAIN, pg_stat_statements, pg_locks — not a fake browser editor.

Submit and score

Get deterministic Detect / Fix / Trap scoring and a postmortem.

pgpg → start → investigate in psql → submit → score → stop
Detect / Fix / Trap

Deterministic scoring, plus AI hints

pgpg scores three things on every attempt — and the scoring engine is deterministic and offline, so the same fix always earns the same result.

  • Detect — did you find the real root cause?
  • Fix — did your change actually resolve the incident?
  • Trap — did you avoid the dangerous production actions?
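One way to picture deterministic scoring is as a pure function over facts collected from the sandbox: no clock, no randomness, no network, so the same evidence always yields the same scorecard. The sketch below is an illustration under assumptions, not pgpg's scoring engine. The Evidence shape, the expected cause label, and the forbidden-statement list are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Facts gathered after a submission (hypothetical shape)."""
    reported_cause: str
    indexed_columns: frozenset  # e.g. {("orders", "customer_id")}
    statements_run: tuple       # SQL text the user executed, in order


# Assumption: an illustrative trap list for the missing-index scenario.
FORBIDDEN_PREFIXES = ("DROP TABLE", "TRUNCATE", "VACUUM FULL")


def score(ev: Evidence) -> dict:
    """Pure function of the evidence: equal input always gives equal output."""
    detect = ev.reported_cause == "missing_index"
    fix = ("orders", "customer_id") in ev.indexed_columns
    trap = not any(
        s.strip().upper().startswith(FORBIDDEN_PREFIXES) for s in ev.statements_run
    )
    return {"detect": detect, "fix": fix, "trap": trap,
            "passed": detect and fix and trap}


ev = Evidence(
    reported_cause="missing_index",
    indexed_columns=frozenset({("orders", "customer_id")}),
    statements_run=(
        "CREATE INDEX CONCURRENTLY orders_customer_id_idx ON orders (customer_id)",
    ),
)
print(score(ev))
```

Keeping the scorer free of side effects is what lets the same fix earn the same result on every run and on every machine.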

Inside the REPL, tutor can explain what to check next. It guides and explains, but it isn't the source of truth, and it never runs commands against your database for you; it helps you learn.

REPL
    pgpg[missing-index]> tutor what should I check first?
    tutor  Run EXPLAIN on the hot query — the seq scan on orders is the path to fix.
    pgpg[missing-index]> score
    ✓ detect: identified missing index
    ✓ fix: index created, no table rewrite
    ✓ trap: no dangerous actions
    score → 100 / 100
Example incident

Anatomy of one incident: a missing index

A realistic scenario, the way pgpg presents it — symptoms first, then you investigate and fix.

The symptom

Checkout latency is 30× higher than normal. Users hit timeouts, CPU isn't maxed, and one query dominates pg_stat_statements.

The investigation

You open psql with the printed connection string, run EXPLAIN ANALYZE on the hot query and find a sequential scan where an index used to be.

The fix & score

You create the right index without a table rewrite, then submit and score. pgpg scores Detect, Fix and Trap — and explains the root cause.
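The investigation step boils down to spotting a Seq Scan node in the plan text. As a rough sketch, a few lines of Python can pull the scanned table names out of text-format EXPLAIN output. The helper name seq_scanned_tables is hypothetical, and the sample plan line is the truncated one from the session shown earlier on this page.

```python
import re

# Matches "Seq Scan on <table>" and "Parallel Seq Scan on <table>".
SEQ_SCAN = re.compile(r"\bSeq Scan on (\w+)")


def seq_scanned_tables(plan_text: str) -> list:
    """Return table names under a Seq Scan node, in order, without duplicates."""
    seen = []
    for name in SEQ_SCAN.findall(plan_text):
        if name not in seen:
            seen.append(name)
    return seen


plan = "Seq Scan on orders (cost=0.00..18650.00 …)"
print(seq_scanned_tables(plan))
```

A sequential scan is not always wrong, so treat a hit as a lead to check against the query's filter columns rather than as proof of a missing index.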

Incident tracks

112 PostgreSQL incidents across 12 production tracks

From slow queries and lock pileups to replication, failover, migrations, security, and final multi-phase Incident Control simulations.

01

Query Performance

Core

Slow queries, missing indexes, stale statistics, bad plans and sort spills.

02

Locks, Blocking & Transactions

Core

Deadlocks, lock queues, idle-in-transaction sessions and DDL that stalls writes.

03

Connections, Pooling & App Integration

Core

Connection exhaustion, PgBouncer misconfigurations and pooling failures under load.

04

Replication & WAL

Operations

Replication lag, WAL growth, replication slot retention and standby conflicts.

05

Storage, Backup & Recovery

Operations

Disk pressure, backup validation, PITR and accidental data-loss recovery.

06

Multi-Database Operations

Operations

Cross-database workflows, dump/restore between instances and operational drift.

07

Vacuum, Bloat & XID

Advanced

Autovacuum tuning, table and index bloat, and transaction-id wraparound risk.

08

HA & Failover

Advanced

Failover drills, split-brain risk, synchronous replication and recovery checks.

09

Migrations & Releases

Advanced

Migration locks, unsafe backfills and the release patterns that stall production.

10

Security & Access

Advanced

Privilege misconfigurations, exposed roles, row-level security gaps and credentials.

11

Compound Production Incidents

Final

Scenarios that combine several failure modes at once — the messy, multi-cause incidents.

12

Incident Control

Final

Multi-phase production incidents where the goal is to bring the system back under control.

Incident Control

Final simulations for complex PostgreSQL incidents

Incident Control is the final pgpg track: multi-phase production incidents where the goal is not just to fix one issue, but to bring a complex PostgreSQL system back under control.

  • Multi-node behavior
  • Multi-database validation
  • Replication and failover pressure
  • Backup and recovery decisions
  • Migration fallout
  • Security and access mistakes
  • Strict post-incident validation
Incident Control
    # multi-phase incident · bring it back under control
    pgpg> start control/01-cascade --profile dev
    phase 1 · replica lag climbing under write burst
    phase 2 · failover promotes a stale standby
    phase 3 · disk fills from retained WAL
    pgpg[01-cascade]> score
    Recovery: pass
    Data loss: none
    Post-incident validation: pass
Local-first & safe

Practice in a sandbox, never in production

Every incident runs in a local Docker PostgreSQL sandbox controlled by the CLI. No production access is required, generated data is used, and each scenario is a reproducible workflow: provision, seed, inject the fault, run a workload, observe, score and clean up.

provision → seed → inject fault → run workload → observe → score → clean up
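The workflow above is reproducible partly because cleanup always runs, even when an earlier step fails. A minimal sketch of that shape, assuming each step is a plain callable, follows. The function run_scenario and the logging stand-ins are hypothetical and say nothing about how pgpg is actually structured.

```python
from typing import Callable, List


def run_scenario(steps: List[Callable[[], None]], cleanup: Callable[[], None]) -> None:
    """Run steps in order; always run cleanup, even if a step raises."""
    try:
        for step in steps:
            step()
    finally:
        cleanup()


log = []
names = ["provision", "seed", "inject fault", "run workload", "observe", "score"]
# Stand-in steps that only record their own name.
run_scenario([lambda n=n: log.append(n) for n in names],
             lambda: log.append("clean up"))
print(" -> ".join(log))
```

Putting cleanup in a finally block means a failed seed or workload step still leaves no stray sandbox behind.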
Who it's for

Use it alone, with your team, or with clients

Individual engineers

Build confidence before your next on-call rotation and prepare for senior backend, SRE and DBA interviews.

Engineering teams

Onboard engineers into PostgreSQL operations with repeatable incident drills instead of tribal knowledge.

PostgreSQL consultants

Run customer trainings with realistic, reproducible incident scenarios you can reuse across clients.

FAQ

pgpg questions, answered

Is this a hosted product or a local CLI?

pgpg is a REPL-first local CLI running real PostgreSQL containers in Docker. You run pgpg, start an incident, and investigate with your own psql.

Do I need to be a DBA?

No. It's built for backend engineers, SREs, DevOps engineers and DBAs alike.

Does AI score my incidents?

No. The scoring engine is deterministic and offline. AI is used for hints and explanation, not as the source of truth.

Does it use a real PostgreSQL database?

Yes. Every scenario runs against a real PostgreSQL instance in Docker, not a simulation of one.

Is production data used?

No. Scenarios use generated sandbox data, in isolated local environments. No production access is required.

Is this only about query performance?

No. pgpg has 112 scenarios across 12 tracks — query performance, locks, replication, WAL, storage, vacuum, bloat, HA, migrations, security and full Incident Control.

Is pgpg sold separately from psql+?

No. Every Rillence subscription includes both pgpg and psql+. New scenarios and refinements are included with the subscription.

Practice incidents before they become outages

One annual subscription includes both pgpg and psql+.