PostgreSQL · Intermediate

Finding the Heavy Queries

Learn to find your most expensive queries using pg_stat_statements and read table health through pg_stat_user_tables, pg_stat_user_indexes, and pg_size_pretty — so you know where the pain is before you guess at a fix.

Most performance fires start the same way: someone notices the app is slow, but nobody knows which query is causing it. This course builds the habit of looking at the data before guessing. We start by wiring up pg_stat_statements — the extension that tracks every query the server has ever run — and configuring it to capture nested calls too. Then we generate a realistic workload across four tables and read the results through three lenses: total impact, per-call cost, and raw frequency. We finish by building a six-block dashboard that gives a full health snapshot in seconds, covering active queries, blocked sessions, top statements, table scan ratios, unused indexes, and table sizes.

What you'll build

Enable and configure pg_stat_statements with shared_preload_libraries
Find top queries by total_exec_time, mean_exec_time, and calls
Read shared_blks_hit and shared_blks_read to estimate query I/O
Reset statistics with pg_stat_statements_reset and know when to use it
Spot tables with high seq_scan using pg_stat_user_tables
Identify unused indexes through pg_stat_user_indexes
Measure table and index sizes with pg_size_pretty
Calculate cache hit ratio from pg_statio_user_tables
Run a single dashboard query that covers all six health dimensions

A database wired for query statistics
Two more knobs and a healthcheck
One command to bring it up
Start the server
A short way into the database
Confirm the server is up and the extension is loaded
The users table
The orders table
The payments table
The events table
Create the tables
Seed the users, repeatably
Seed half a million orders
Seed four hundred thousand payments
Seed a million events
Load the data
Fresh statistics and single-threaded plans
Confirm the volumes
Create the extension
Verify the configuration was read
Four query patterns, four different profiles
Run the workload
Top queries by total impact
The aggregate query dominates
Finding the rare monster
The events scan is the slowest per call
The frequency lens
Point lookups dominate the call count
Reading the I/O columns
Resetting the statistics
Table-level scan counters
The events table is taking the most sequential scans
The seq_ratio signal
Payments and events have never used an index
Index scan counters per index
Two indexes with zero scans
Three size views for every table
Events is the heaviest table
Cache hit ratio
Reading the ratio
Dashboard: who is running right now
Dashboard: blocked queries
Dashboard: top statements by total time
Dashboard: table health
Dashboard: unused indexes
Dashboard: table sizes
Running the full dashboard
Tearing the stand down for a clean start
A deliberately slow query to find
Finding the heavy query in the dashboard

Finding the Heavy Queries

What you'll build

Contents