PostgreSQL rocks, except when it blocks: Understanding locks

Written by Marco Slot
February 15, 2018

On the Citus open source team, we engineers take an active role in helping our users scale out their Postgres database, be it for migrating an existing application or building a new application from scratch. This means we help you with distributing your relational data model—and also with getting the most out of Postgres.

One problem I often see users struggle with when it comes to Postgres is locks. While Postgres is amazing at running multiple operations at the same time, there are a few cases in which Postgres needs to block an operation using a lock. You therefore have to be careful about which locks your transactions take, but with the high-level abstractions that PostgreSQL provides, it can be difficult to know exactly what will happen. This post aims to demystify the locking behaviors in Postgres, and to give advice on how to avoid common problems.

Postgres is pretty amazing with its support for complex, concurrent, ACID transactions

PostgreSQL is unique among open source databases in its support for complex, concurrent, ACID transactions.

To make sure complex transactions can safely run at the same time, PostgreSQL uses several layers of locks to serialise changes to critical sections of the database. Transactions run concurrently until they try to acquire a conflicting lock, for example when they update the same row. In that case, the first transaction to acquire the lock can proceed, and the second one waits until the first transaction commits or aborts.

Table locks block reads &/or writes during DDL

Whenever you run a command or a query on a table, you take a lock on the table. The primary purpose of table-level locks is to block reads and/or writes when changes to the underlying table structure are made during DDL commands such as ALTER TABLE. However, not all DDL commands need to block reads or writes, some only block each other.

Whether a transaction can acquire a lock depends on whether its “lock level” conflicts with that of the holder(s) of the lock. The PostgreSQL documentation gives a detailed overview of how different lock levels conflict.

To make this more meaningful, we’ve compiled a table of the most common Postgres commands and whether they can run concurrently with each other on the same table:

Runs concurrently with SELECT INSERT
UPDATE
DELETE
CREATE INDEX CONC
VACUUM
ANALYZE
CREATE INDEX CREATE TRIGGER ALTER TABLE
DROP TABLE
TRUNCATE
VACUUM FULL
SELECT
INSERT
UPDATE
DELETE
CREATE INDEX CONC
VACUUM
ANALYZE
CREATE INDEX
CREATE TRIGGER
ALTER TABLE
DROP TABLE
TRUNCATE
VACUUM FULL

If there’s a red X in the table, it means commands block each other. For example, when you run ALTER TABLE items ADD COLUMN last_update timestamptz, the command will first block until all queries on the items table have finished, and other queries on the table will block until the ALTER TABLE is done.

If you have access to a Postgres server, it’s easy to see this in practice.

In one psql session, run:

CREATE TABLE items (
  key text primary key,
  value jsonb
);

BEGIN;
ALTER TABLE items ADD COLUMN last_update timestamptz;

Now open another terminal and in psql, run:

SELECT * FROM items;
< nothing happens (waiting for a lock) >

If you go back to the first session, and run COMMIT, you’ll see that the second session finishes immediately afterwards. Locks are always kept until commit or rollback.

One other thing to be aware of is that Postgres uses lock queues. If you run an ALTER TABLE command then that command goes into the queue and blocks until all queries on that table are finished, but any SELECT that comes immediately after will be blocked until the ALTER TABLE is finished even if the ALTER TABLE is not yet running.

Row locks serialise concurrent changes to rows

Every row in a PostgreSQL table is also protected with a lock. Row locks only come in two flavours:

  • share
  • exclusive

Many transactions can hold a share lock concurrently, but only one transaction can hold an exclusive lock.

If concurrent transactions modify the same row, one of them will get blocked on a row lock. You can also take row-level locks without modifying anything using SELECT … FOR UPDATE or SELECT … FOR SHARE, which lets you temporarily prevent changes to a set of rows.

Can access row concurrently with: SELECT SELECT FOR SHARE SELECT FOR UPDATE UPDATE
DELETE
SELECT
SELECT FOR SHARE
SELECT FOR UPDATE
UPDATE

Going back to our items table, there’s an easy way to see this in practice.

In one psql session, run:

INSERT INTO items VALUES ('key-1', '{"hello":"world"}');

BEGIN;
SELECT * FROM items WHERE key = 'key-1' FOR UPDATE;

Now open another terminal and in psql, run:

UPDATE items SET value = '{"hello":"globe"}' WHERE key = 'key-1';
< nothing happens (waiting for a lock) >

Again, if you go back to the first session, and run COMMIT, you’ll see that the second session finishes immediately afterwards.

Making sense of pg_locks

Sometimes you notice a command is taking awfully long, but the process is not actually doing anything. In that case it might be waiting for a lock and you should have a look at pg_locks.

To see which query is waiting for a lock, the PostgreSQL wiki has a number of useful queries for displaying lock information, which give results such as:

 blocked_pid | blocking_pid |           blocked_statement           | current_statement_in_blocking_process
-------------+--------------+----------------------------------------+----------------------------------------
       15317 |        15206 | UPDATE test SET y = 10 WHERE x = '2'; | UPDATE test SET y = 5 WHERE x = '2';
       15242 |        15206 | ALTER TABLE test ADD COLUMN a int;    | UPDATE test SET y = 5 WHERE x = '2';
       15242 |        15317 | ALTER TABLE test ADD COLUMN a int;    | UPDATE test SET y = 10 WHERE x = '2';
(3 rows)

The table above tells us that an ALTER TABLE is blocked waiting for two UPDATEs to finish, while the UPDATE with SET y = 10 is waiting for the UPDATE with SET y = 5 to finish.

Getting visibility into locks on a distributed Citus database cluster

If you’re using the Citus extension to Postgres to shard database tables across multiple nodes, one more thing to take into consideration is that queries might get blocked on row-level locks on one of the shards on a distributed Citus worker node—and if that happens, then those queries would not show up in pg_locks on the Citus coordinator node. To get visibility into blocked operations across a distributed Citus database cluster, you can use the following query:

WITH citus_xacts AS (
  SELECT * FROM get_all_active_transactions() WHERE initiator_node_identifier = 0
),
citus_wait_pids AS (
  SELECT
    (SELECT process_id FROM citus_xacts WHERE transaction_number = waiting_transaction_num) AS waiting_pid,
    (SELECT process_id FROM citus_xacts WHERE transaction_number = blocking_transaction_num) AS blocking_pid
  FROM
    dump_global_wait_edges()
)
SELECT
  waiting_pid AS blocked_pid,
  blocking_pid,
  waiting.query AS blocked_statement,
  blocking.query AS current_statement_in_blocking_process
FROM
  citus_wait_pids
JOIN
  pg_stat_activity waiting ON (waiting_pid = waiting.pid)
JOIN
  pg_stat_activity blocking ON (blocking_pid = blocking.pid)
 blocked_pid  blocking_pid            blocked_statement            current_statement_in_blocking_process
─────────────┼──────────────┼───────────────────────────────────────┼───────────────────────────────────────
      147619        147613  UPDATE test SET y = 1000 WHERE x = 2;  UPDATE test SET y = 0 WHERE x = 2;

These diagnostic queries give you a good starting point to figure out which sessions are blocking each other. You can then look at pg_stat_activity to get more information such as how long the query was running.

Postgres locks aren’t scary once you understand their behavior

Locking is one of the mechanisms that Postgres uses to safely implement concurrent ACID transactions. While most of the time you don’t even have to think about locking, you may occasionally see a transaction taking way longer than usual. Your first reaction might be: Why is it slow? But often the right question is: Why is it waiting for a lock? We hope this post will help you understand locking behaviors in Postgres and diagnose any database locking issues you might encounter.

Marco Slot

Written by Marco Slot

Former lead engineer for the Citus database engine at Microsoft. Speaker at Postgres Conf EU, PostgresOpen, pgDay Paris, Hello World, SIGMOD, & lots of meetups. Talk selection team member for Citus Con: An Event for Postgres. PhD in distributed systems. Loves mountain hiking.

@marcoslot marcocitus