← All bricks

PostgreSQL · Intermediate

Finding the Heavy Queries

Learn to find your most expensive queries using pg_stat_statements and read table health through pg_stat_user_tables, pg_stat_user_indexes, and pg_size_pretty — so you know where the pain is before you guess at a fix.

Most performance fires start the same way: someone notices the app is slow, but nobody knows which query is causing it. This course builds the habit of looking at the data before guessing. We start by wiring up pg_stat_statements — the extension that tracks every query the server has ever run — and configuring it to capture nested calls too. Then we generate a realistic workload across four tables and read the results through three lenses: total impact, per-call cost, and raw frequency. We finish by building a six-block dashboard that gives a full health snapshot in seconds, covering active queries, blocked sessions, top statements, table scan ratios, unused indexes, and table sizes.

What you'll build

  • Enable and configure pg_stat_statements with shared_preload_libraries
  • Find top queries by total_exec_time, mean_exec_time, and calls
  • Read shared_blks_hit and shared_blks_read to estimate query I/O
  • Reset statistics with pg_stat_statements_reset and know when to use it
  • Spot tables with high seq_scan using pg_stat_user_tables
  • Identify unused indexes through pg_stat_user_indexes
  • Measure table and index sizes with pg_size_pretty
  • Calculate cache hit ratio from pg_statio_user_tables
  • Run a single dashboard query that covers all six health dimensions

Contents

  1. A database wired for query statistics
  2. Two more knobs and a healthcheck
  3. One command to bring it up
  4. Start the server
  5. A short way into the database
  6. Confirm the server is up and the extension is loaded
  7. The users table
  8. The orders table
  9. The payments table
  10. The events table
  11. Create the tables
  12. Seed the users, repeatably
  13. Seed half a million orders
  14. Seed four hundred thousand payments
  15. Seed a million events
  16. Load the data
  17. Fresh statistics and single-threaded plans
  18. Confirm the volumes
  19. Create the extension
  20. Verify the configuration was read
  21. Four query patterns, four different profiles
  22. Run the workload
  23. Top queries by total impact
  24. The aggregate query dominates
  25. Finding the rare monster
  26. The events scan is the slowest per call
  27. The frequency lens
  28. Point lookups dominate the call count
  29. Reading the I/O columns
  30. Resetting the statistics
  31. Table-level scan counters
  32. The events table is taking the most sequential scans
  33. The seq_ratio signal
  34. Payments and events have never used an index
  35. Index scan counters per index
  36. Two indexes with zero scans
  37. Three size views for every table
  38. Events is the heaviest table
  39. Cache hit ratio
  40. Reading the ratio
  41. Dashboard: who is running right now
  42. Dashboard: blocked queries
  43. Dashboard: top statements by total time
  44. Dashboard: table health
  45. Dashboard: unused indexes
  46. Dashboard: table sizes
  47. Running the full dashboard
  48. Tearing the stand down for a clean start
  49. A deliberately slow query to find
  50. Finding the heavy query in the dashboard