The most useful Postgres extension: pg_stat_statements

Written by Craig Kerstiens
February 8, 2019

Extensions are capable of extending, changing, and advancing the behavior of Postgres. How? By hooking into low level Postgres API hooks. The open source Citus database that scales out Postgres horizontally is itself implemented as a PostgreSQL extension, which allows Citus to stay current with Postgres releases without lagging behind like other Postgres forks. I've previously written about the various types of extensions, today though I want to take a deeper look at the most useful Postgres extension: pg_stat_statements.

You see, I just got back from FOSDEM. FOSDEM is the annual free and open source software conference in Brussels, and at the event I gave a talk in the PostgreSQL devroom about Postgres extensions. By the end of the day, over half the talks that had been given in the Postgres devroom mentioned pg_stat_statements:

If you use Postgres and you haven't yet used pg_stat_statements, it is a must to add it to your toolbox. And even if you are familiar, it may be worth a revisit.

Getting started with pg_stat_statements

Pg_stat_statements is what is known as a contrib extension, found in the contrib directory of a PostgreSQL distribution. This means it already ships with Postgres and you don't have to go and build it from source or install packages. You may have to enable it for your database if it is not already enabled. This is as simple as:

CREATE EXTENSION pg_stat_statements;

If you run on a major cloud provider there is a strong likelihood they have already installed and enabled it for you.

Once pg_stat_statements is installed, it begins silently going to work under the covers. Pg_stat_statements records queries that are run against your database, strips out a number of variables from them, and then saves data about the query, such as how long it took, as well as what happened to underlying reads/writes.

Note: It doesn't save each individual query, rather it parameterizes them and then saves the aggregated result

Let's look at how would work with a couple of examples. Suppose we execute the following query:

SELECT order_details.qty,
FROM order_details,
WHERE = order_details.customer_id
  AND = ''

It would transform this query to:

SELECT order_details.qty,
FROM order_details,
WHERE = order_details.customer_id
  AND = '?'

If this were a query in my application that I was frequently executing to get the order details for something like a retail order history, it wouldn't save how often I ran this for each user, but rather for an aggregated view.

Looking at the data

From here we can query the raw data of pg_stat_statements and we'll see something like:

FROM pg_stat_statements;

userid              | 16384
dbid                | 16388
query               | select * from users where email = ?;
calls               | 2
total_time          | 0.000268
rows                | 2
shared_blks_hit     | 16
shared_blks_read    | 0
shared_blks_dirtied | 0
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0

Extracting insights with pg_stat_statements

Now there is a wealth of valuable information here, and as a more advanced user there are times where it can all prove valuable. But even without starting to understand the internals of your database, you can get some really powerful insights by querying pg_stat_statements in certain ways. By looking at the total_time and number of times a query is called per query, we can get a really quick view of which queries are very frequently run, as well as what they consume on average:

  (total_time / 1000 / 60) as total, 
  (total_time/calls) as avg, 
FROM pg_stat_statements 
LIMIT 100;

There are a number of different ways you can filter this and sort this, you may want to focus only on queries that are run over 1,000 times. Or queries that average over 100 milliseconds. The above query shows us the total amount of time in minutes that have been consumed against our database as well as the average time in milliseconds. With the above query I would get back something looks like:

   total  |   avg  |        query
   295.76 |  10.13 | SELECT id FROM users...
   219.13 |  80.24 | SELECT * FROM ...
  (2 rows)

As a rule of thumb, I know that when quickly grabbing a record, PostgreSQL should be able to return in 1ms. Given this I could get to work optimizing. On the above I see that bringing the first query down to 1ms would be an improvement, but optimizing the second query would give an even bigger boost in performance to my system overall.

A special note: If you're building multi-tenant apps, you may not want pg_stat_statements to parameterize your tenant_id. To help with this, we built citus_stat_statements to give you per tenant insights.

If you haven't looked at your data from pg_stat_statements ever—or even in the past month—today is a good day for it. What does it tell you about places you can optimize? We'd love to hear what you discover @citusdata.

Craig Kerstiens

Written by Craig Kerstiens

Former Head of Cloud at Citus Data. Ran product at Heroku Postgres. Countless conference talks on Postgres & Citus. Loves bbq and football.