Written by Samay Sharma
July 28, 2022
This post by Samay Sharma about autovacuum was originally published on the Microsoft TechCommunity Blog.
If you've been running PostgreSQL for a while, you've heard about autovacuum. Yes, autovacuum, the thing which everybody asks you not to turn off, which is supposed to keep your database clean and reduce bloat automatically.
And yet—imagine this: one fine day, you see that your database size is larger than you expect, the I/O load on your database has increased, and things have slowed down without much change in workload. You begin looking into what might have happened. You run the excellent Postgres bloat query and you notice you have a lot of bloat. So you run the VACUUM
command manually to clear the bloat in your Postgres database. Good!
But then you have to address the elephant in the room: why didn't Postgres autovacuum clean up the bloat in the first place...? Does the above story sound familiar? Well, you are not alone. 😊
Autovacuum and VACUUM provide a number of configuration parameters to adapt them to your workload, but the challenge is figuring out which ones to tune. In this post—based on my optimizing autovacuum talk at Citus Con: An Event for Postgres—you'll learn to figure out where the problem lies and what to tune to make it better.
More specifically, you'll learn how to investigate—and how to fix—these 3 common types of autovacuum problems:
- autovacuum not triggering vacuum often enough,
- vacuum being too slow, and
- vacuum not cleaning up dead rows.
Another common type of autovacuum problem is transaction id wraparound related, which is a meaty topic all on its own. In the future I plan to write a separate, follow-on blog post to focus on that topic.
This cheat sheet diagram of "autovacuum tips" gives you an overview of all the Postgres autovacuum fixes you'll learn about in this blog post:
If you're not yet familiar, Postgres uses Multiversion Concurrency Control (MVCC) to guarantee isolation while providing concurrent access to data. This means multiple versions of a row can exist in the database simultaneously. So, when rows are deleted, older versions are still kept around, since older transactions may still be accessing those versions.
Once all transactions which require a row version are complete, those row versions can be removed. This can be done by the VACUUM command. Now, VACUUM can be run manually, but that requires you to monitor and make decisions about various things: when to run vacuum, which tables to vacuum, how frequently to vacuum, etc.
To make life easier for you, PostgreSQL has an autovacuum utility that:
- wakes up every autovacuum_naptime seconds,
- checks for tables that have been "significantly modified", and
- kicks off VACUUM and ANALYZE jobs on those tables in parallel.

You can learn more about autovacuum in general from Joe Nelson's great post on why Autovacuum is not the enemy.
Now, the definition of "significantly modified" in bullet #2 above—and how much to vacuum in parallel—depends heavily on your workload, transaction rate, and hardware. Let's start looking into debugging autovacuum with one of the most common autovacuum issues—autovacuum not vacuuming a "significantly modified" table.
Vacuuming is typically triggered for a table (leaving transaction id wraparound aside) when either:
- the number of dead tuples exceeds autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of tuples, OR
- the number of tuples inserted since the last vacuum exceeds autovacuum_vacuum_insert_threshold + autovacuum_vacuum_insert_scale_factor * number of tuples.

If you see bloat growing more than expected and find yourself needing to manually run VACUUM
to clear up bloat, it's an indication that autovacuum is not vacuuming tables often enough.
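For a concrete sense of scale: with the default settings (autovacuum_vacuum_threshold = 50 and autovacuum_vacuum_scale_factor = 0.2), a table with 1,000,000 rows is only vacuumed once it has accumulated roughly 50 + 0.2 * 1,000,000 = 200,050 dead tuples, which can already represent a lot of bloat on a busy table.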
You can check when and how frequently tables were vacuumed by checking pg_stat_user_tables. If your large tables show up here with low autovacuum counts and last_autovacuum
well in the past, it's another sign that autovacuum isn't vacuuming your tables at the right time.
-- when and how often each table was vacuumed, manually and by autovacuum
SELECT relname, last_autovacuum, autovacuum_count, vacuum_count
FROM pg_stat_user_tables;
To vacuum tables at the right frequency, you should adjust the autovacuum_vacuum_scale_factor
and autovacuum_vacuum_insert_scale_factor
based on the size and growth rate of the tables.
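As a rough sketch (the orders table name is hypothetical, and the exact values depend on your workload), you can lower these scale factors per table via storage parameters, or change the server-wide default:

-- Vacuum the (hypothetical) orders table once ~1% of its rows change,
-- instead of the default 20%.
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.01);
ALTER TABLE orders SET (autovacuum_vacuum_insert_scale_factor = 0.01);

-- Or lower the server-wide default and reload the configuration.
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;
SELECT pg_reload_conf();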
The 2nd problem you might encounter is that your tables are being vacuumed too slowly. This may manifest as bloat growing because your rate of cleaning up bloat is slower than your transaction rate. Or, you will see vacuum processes running constantly on your system when you check pg_stat_activity.
There are a few ways you can speed up vacuuming: these recommendations apply both to autovacuum and to manually triggered VACUUM.
The first thing you should check is if you have cost limiting enabled. When vacuum is running, the system maintains a counter that tracks estimated cost of different I/O operations. When that cost exceeds autovacuum_vacuum_cost_limit (or vacuum_cost_limit), the process sleeps for autovacuum_vacuum_cost_delay (or vacuum_cost_delay) ms. This is called cost limiting and is done to reduce the impact of vacuuming on other processes.
If you notice that vacuum is falling behind, you could disable cost limiting (by setting autovacuum_vacuum_cost_delay
to 0) or reduce its impact by either decreasing autovacuum_vacuum_cost_delay
or increasing autovacuum_vacuum_cost_limit
to a high value (like 10000).
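For illustration, here's a sketch of inspecting and relaxing cost limiting cluster-wide (the values are examples, not a universal recommendation):

-- Inspect the current cost limiting settings.
SHOW autovacuum_vacuum_cost_delay;   -- -1 means it falls back to vacuum_cost_delay
SHOW autovacuum_vacuum_cost_limit;   -- -1 means it falls back to vacuum_cost_limit

-- Make autovacuum sleep less often and for less time, then reload.
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 10000;
SELECT pg_reload_conf();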
Autovacuum can only vacuum autovacuum_max_workers
tables in parallel. So, if you have hundreds of tables being actively written to (and needing to be vacuumed), doing them 3 at a time might take a while (3 is the default value for autovacuum_max_workers
).
Therefore, in scenarios with a large number of active tables, it might be worth increasing autovacuum_max_workers to a higher value—assuming you have enough compute to support running more autovacuum workers.
Before increasing the number of autovacuum workers, make sure that you are not being limited by cost limiting. Cost limits are shared among all active autovacuum workers, so just increasing the number of parallel workers may not help, as each of them will then do less work.
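If you do decide to raise the worker count, a minimal sketch looks like this (the value 6 is illustrative; size it to your CPU and I/O capacity):

SHOW autovacuum_max_workers;              -- default is 3
ALTER SYSTEM SET autovacuum_max_workers = 6;
-- Note: unlike most autovacuum settings, this one only takes effect
-- after a server restart, not a configuration reload.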
To find more ideas on what to tune, it might be worth looking into pg_stat_progress_vacuum to understand what phase your ongoing vacuums are in and how you can improve their performance. Let's look at a few examples where it might give useful insights.
To see how fast vacuum is progressing, you could compare heap_blks_scanned with heap_blks_total in pg_stat_progress_vacuum over time. If you see progress is slow and the phase is scanning heap, that means vacuum needs to scan a lot of heap blocks to complete.

In this case, you can scan the heap faster by prefetching larger relations in memory by using something like pg_prewarm or by increasing shared_buffers.
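For reference, here's a sketch of a query to monitor scan progress (and the index vacuum cycles discussed next) from pg_stat_progress_vacuum; adapt it as needed:

SELECT p.pid, c.relname, p.phase,
       round(100.0 * p.heap_blks_scanned / nullif(p.heap_blks_total, 0), 1)
         AS pct_heap_scanned,
       p.index_vacuum_count
FROM pg_stat_progress_vacuum p
JOIN pg_class c ON c.oid = p.relid
ORDER BY p.pid;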
When scanning the heap, vacuum collects dead tuples in memory. The number of dead tuples it can store is determined by maintenance_work_mem (or autovacuum_work_mem, if set). Once the maximum number of tuples has been collected, vacuum must switch to vacuuming indexes, and can only return to scanning the heap after the indexes and heap are vacuumed (i.e. after an index vacuuming cycle).
So, if you notice that index_vacuum_count
in pg_stat_progress_vacuum
is high—well, it means that vacuum is having to go through many such index vacuum cycles.
To reduce the number of cycles vacuum needs and to make it faster, you can increase autovacuum_work_mem
so that vacuum can store more dead tuples per cycle.
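As a sketch (the 512MB figure is illustrative and assumes you have memory to spare for each worker):

SHOW maintenance_work_mem;
SHOW autovacuum_work_mem;             -- -1 means it falls back to maintenance_work_mem
ALTER SYSTEM SET autovacuum_work_mem = '512MB';
SELECT pg_reload_conf();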
If you see that the phase
in pg_stat_progress_vacuum
is vacuuming indexes
for a long time, you should check if you have a lot of indexes on the table being vacuumed.
If you have many indexes, you could make vacuuming faster by increasing max_parallel_maintenance_workers to process indexes in parallel. Note that this configuration change will help only if you manually run VACUUM
commands. (Unfortunately, parallel vacuum is currently not supported for autovacuum.)
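If you do run VACUUM manually, a sketch of parallel index vacuuming on Postgres 13 or later looks like this (the orders table name is hypothetical):

-- Allow up to 4 parallel maintenance workers in this session.
SET max_parallel_maintenance_workers = 4;
-- Vacuum the (hypothetical) orders table, processing its indexes in parallel.
VACUUM (VERBOSE, PARALLEL 4) orders;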
With all these recommendations, you should be able to speed up vacuuming significantly. But, what if your vacuum completes in time and you still notice that dead tuples have not come down? In the upcoming paragraphs, we will try to find causes and solutions for this new type of problem: vacuum finishes but is unable to clean dead rows.
Vacuum can only clean row versions which no other transaction needs. But, if Postgres feels certain rows are "needed", they won't be cleaned up.
Let's explore 4 common scenarios in which vacuum cannot clean up rows (and what to do about these scenarios!)
If you have a transaction that's been running for several hours or days, the transaction might be holding onto rows, not allowing vacuum to clean the rows. You can find long-running transactions by running:
SELECT pid, datname, usename,
state, backend_xmin
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
To prevent long running transactions from blocking vacuuming, you can terminate them by running pg_terminate_backend() on their PIDs.
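For example, assuming the query above reported a problematic backend with pid 12345 (an illustrative value):

-- Terminate the backend holding the old xmin so vacuum can clean up.
SELECT pg_terminate_backend(12345);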
To deal with long running transactions in a proactive way, you could set:
- statement_timeout to automatically time out long queries, or
- idle_in_transaction_session_timeout to time out sessions which are idle within an open transaction, or
- log_min_duration_statement to at least log long running queries so that you can set an alert on them and kill them manually.

Typically, Postgres can clean up a row version as soon as it isn't visible to any transaction. If you're running Postgres on a primary with a standby node, it's possible for a vacuum to clean up a row version on the primary which is needed by a query on the standby. This situation is called a "replication conflict"—and when it's detected, the query on the standby node will be cancelled.
To prevent queries on the standby node from being cancelled due to replication conflicts, you can set hot_standby_feedback = on, which will make the standby inform the primary about the oldest transaction running on it. As a result, the primary can avoid cleaning up rows which are still being used by transactions on the standby.
However, setting hot_standby_feedback = on also means that long running queries on the standby have the capability to block rows from getting cleaned up on the primary.
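A minimal sketch of checking and enabling it on the standby (weigh the bloat trade-off described above before doing this):

-- Run on the standby.
SHOW hot_standby_feedback;            -- off by default
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();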
To get the xmin
horizon of all your standbys, you can run:
SELECT pid, client_hostname,
state, backend_xmin
FROM pg_stat_replication
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
To avoid having excessive bloat on the primary due to long-running transactions on the standby, you can take one of the following approaches:
- Turn hot_standby_feedback = off.
- Set vacuum_defer_cleanup_age, so that cleanup on the primary is deferred until vacuum_defer_cleanup_age transactions have passed, giving more time to standby queries to complete without running into replication conflicts.

A replication slot in Postgres stores information required by a replica to catch up with the primary. If the replica is down, or severely behind, the rows in the replication slot can't be vacuumed on the primary.
This additional bloat can happen for physical replication only when you have hot_standby_feedback = on. For logical replication, you would be seeing bloat only for catalog tables.
You can run the query below to find replication slots that are retaining old transactions.
SELECT slot_name, slot_type,
database, xmin, catalog_xmin
FROM pg_replication_slots
ORDER BY age(xmin) DESC,
age(catalog_xmin) DESC;
Once you find them, you can drop inactive or unneeded replication slots by running pg_drop_replication_slot(). You can also apply learnings from the section on how to manage hot_standby_feedback
for physical replication.
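For instance, assuming the query above surfaced an unused slot named old_standby_slot (a hypothetical name), you could drop it:

-- Double-check the slot really is inactive before dropping it.
SELECT slot_name, active FROM pg_replication_slots WHERE slot_name = 'old_standby_slot';
SELECT pg_drop_replication_slot('old_standby_slot');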
Postgres supports two-phase commit (2PC), which has 2 distinct steps. First, the transaction is prepared with PREPARE TRANSACTION, and second, the transaction is committed with COMMIT PREPARED.
2PCs are resilient transactions meant to tolerate server restarts. So, if you have any PREPARED
transactions hanging around for some reason, they might be holding onto rows. You can find old prepared transactions by running:
SELECT gid, prepared, owner,
database, transaction AS xmin
FROM pg_prepared_xacts
ORDER BY age(transaction) DESC;
You can remove hanging 2PC transactions by running ROLLBACK PREPARED
on them manually.
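For example, assuming the query above returned a stuck transaction with gid 'stuck_tx_1' (an illustrative identifier):

-- Roll back the orphaned prepared transaction so its row versions can be vacuumed.
ROLLBACK PREPARED 'stuck_tx_1';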
Autovacuum knows that it's a system process and prioritizes itself lower than user queries. So, if a process triggered by autovacuum is unable to acquire the locks it needs to vacuum, the process ends itself. That means if a particular table has DDL running on it almost all the time, a vacuum might not be able to acquire the needed locks and hence dead rows won't be cleaned up.
If you notice that not being able to get the right locks is causing bloat to rise, you might have to do one of 2 things:
- manually VACUUM the table (the good news is that manual VACUUM won't terminate itself), or
- reduce the frequency of DDL and other lock-heavy operations on that table so that autovacuum can acquire the locks it needs.

Another resource on this topic which you might find useful is Laurenz Albe's blog post on Four reasons why VACUUM won't remove dead rows from a table.
Now that you have walked through the causes and the 13 tips for debugging Postgres autovacuum issues, you should be able to handle problems like: (1) autovacuum is not triggering vacuum often enough; or (2) vacuum is too slow; or (3) vacuum isn't cleaning up dead rows.
If you've addressed all of these and autovacuum still can't keep up with your transaction rate, it might be time to upgrade your Postgres server to bigger hardware—or to scale out your database using multiple nodes, with Citus.
Below, I'm including a reference table which summarizes all the different Postgres configs we've mentioned in this post to optimize autovacuum.
| Postgres Configs (in order of appearance) | Recommendation |
|---|---|
| #1 - Vacuum not triggered enough | |
| autovacuum_vacuum_scale_factor | Lower the value to trigger vacuuming more frequently, useful for larger tables with more updates / deletes. |
| autovacuum_vacuum_insert_scale_factor | Lower the value to trigger vacuuming more frequently for large, insert-heavy tables. |
| #2 - Vacuum too slow | |
| autovacuum_vacuum_cost_delay | Decrease to reduce cost limiting sleep time and make vacuuming faster. |
| autovacuum_vacuum_cost_limit | Increase the cost to be accumulated before vacuum will sleep, thereby reducing sleep frequency and making vacuum go faster. |
| autovacuum_max_workers | Increase to allow more parallel workers to be triggered by autovacuum. |
| shared_buffers | Consider increasing memory for shared memory buffers, enabling better caching of blocks which allows vacuum to scan faster.* |
| autovacuum_work_mem | Increase to allow each autovacuum worker process to store more dead tuples while scanning a table. Set to -1 to fall back to maintenance_work_mem. |
| maintenance_work_mem | Increase to allow each autovacuum worker process to store more dead tuples while scanning a table.* |
| max_parallel_maintenance_workers | Increase to allow `VACUUM` to vacuum more indexes in parallel.* |
| #3 - Vacuum isn't cleaning up dead rows | |
| statement_timeout | Set to automatically terminate long-running queries after a specified time.** |
| idle_in_transaction_session_timeout | Set to terminate any session that has been idle with an open transaction for longer than the specified time.** |
| log_min_duration_statement | Set to log each completed statement which takes longer than the specified timeout.** |
| hot_standby_feedback | Set to "on" so the standby sends feedback to the primary about running queries. Decreases query cancellation, but can increase bloat. Consider switching "off" if bloat is too high. |
| vacuum_defer_cleanup_age | Set to defer cleaning up row versions until the specified number of transactions have passed. Allows more time for standby queries to complete without running into conflicts due to early cleanup. |
Side-note: The other class of autovacuum issues you might run into are related to transaction id wraparound vacuums. These are triggered based on a different criterion and behave differently from your regular vacuum. Hence, they deserve a blog post of their own. I'll be writing a part 2 of this blog post soon focused on what transaction ID wraparound autovacuums are, what makes them different, and how to deal with common issues encountered while they are running. Stay tuned!
On a related note, the Postgres talk I gave at Citus Con: An Event for Postgres is all about optimizing autovacuum. If you have questions or feedback, you can always reach me at @samay_sharma on Twitter.