Citus 10 is out! New features include columnar storage & Citus on a single node—plus we’ve open-sourced the shard rebalancer. Read the Citus 10 blog.

Skip navigation

Citus Blog

Articles tagged: Postgres committers

David Rowley

Speeding up recovery & VACUUM in Postgres 14

Written byBy David Rowley | March 25, 2021Mar 25, 2021

One of the performance projects I’ve focused on in PostgreSQL 14 is speeding up PostgreSQL recovery and vacuum. In the PostgreSQL team at Microsoft, I spend most of my time working with other members of the community on the PostgreSQL open source project. And in Postgres 14 (due to release in Q3 of 2021), I committed a change to optimize the compactify_tuples function, to reduce CPU utilization in the PostgreSQL recovery process. This performance optimization in PostgreSQL 14 made our crash recovery test case about 2.4x faster.

The compactify_tuples function is used internally in PostgreSQL:

  • when PostgreSQL starts up after a non-clean shutdown—called crash recovery
  • by the recovery process that is used by physical standby servers to replay changes (as described in the write-ahead log) as they arrive from the primary server
  • by VACUUM

So the good news is that the improvements to compactify_tuples will: improve crash recovery performance; reduce the load on the standby server, allowing it to replay the write-ahead log from the primary server more quickly; and improve VACUUM performance.

Keep reading

As part of my work on the open source PostgreSQL team at Microsoft, I recently committed a new feature into PostgreSQL 14 to track dependencies on collation versions, with help from co-author Julien Rouhaud and the many others who contributed ideas. It took a long time to build a consensus on how to tackle this thorny problem (work I began at EnterpriseDB and continued at Microsoft), and you can read about some of the details and considerations in the commit message below and the referenced discussion thread. Please note that some details may change by the time PostgreSQL 14 is released.

commit 257836a75585934cc05ed7a80bccf8190d41e056
Author: Thomas Munro <[email protected]>
Date:   Mon Nov 2 19:50:45 2020 +1300

    Track collation versions for indexes.

    Record the current version of dependent collations in pg_depend when
    creating or rebuilding an index.  When accessing the index later, warn
    that the index may be corrupted if the current version doesn't match.

    Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
    Michael Paquier, Robert Haas, Tom Lane and others for very helpful
    discussion.

    Author: Thomas Munro <[email protected]>
    Author: Julien Rouhaud <[email protected]>
    Reviewed-by: Peter Eisentraut <[email protected]> (earlier versions)
    Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com

I’m pretty happy with the result so far, but there is more to be done (see further down)! Now seems like a good time to walk you through the problem we needed to solve—that PostgreSQL indexes can get corrupted by changes in collations that occur naturally over time—and how the new feature makes things better in PostgreSQL 14. Plus, you’ll get a bit of background on collations, too.

Keep reading

One of the unique things about Postgres is that it is highly programmable via PL/pgSQL and extensions. Postgres is so programmable that I often think of Postgres as a computing platform rather than just a database (or a distributed computing platform—with Citus). As a computing platform, I always felt that Postgres should be able to take actions in an automated way. That is why I created the open source pg_cron extension back in 2016 to run periodic jobs in Postgres—and why I continue to maintain pg_cron now that I work on the Postgres team at Microsoft.

Using pg_cron, you can schedule Postgres queries to run periodically, according to the familiar cron syntax. Some typical examples:

Keep reading
Andres Freund

Improving Postgres Connection Scalability: Snapshots

Written byBy Andres Freund | October 25, 2020Oct 25, 2020

I recently analyzed the limits of connection scalability, to understand the most effective way to improve Postgres’ handling of large numbers of connections, and why that is important. I concluded that the most pressing issue is snapshot scalability.

This post details the improvements I recently contributed to Postgres 14 (to be released Q3 of 2021), significantly reducing the identified snapshot scalability bottleneck.

As the explanation of the implementation details is fairly long, I thought it’d be more fun for of you if I start with the results of the work, instead of the technical details (I’m cheating, I know ;)).

Keep reading

One common challenge with Postgres for those of you who manage busy Postgres databases, and those of you who foresee being in that situation, is that Postgres does not handle large numbers of connections particularly well.

While it is possible to have a few thousand established connections without running into problems, there are some real and hard-to-avoid problems.

Since joining Microsoft last year in the Azure Database for PostgreSQL team—where I work on open source Postgres—I have spent a lot of time analyzing and addressing some of the issues with connection scalability in Postgres.

Keep reading
Jeff Davis

How to securely authenticate with SCRAM in Postgres 13

Written byBy Jeff Davis | July 28, 2020Jul 28, 2020

Making security easy to use is crucial because hard-to-use security is likely to be neglected entirely. SCRAM with channel binding is a variation of password authentication that is almost as easy to use, but much more secure.

In basic password authentication, the connecting client simply sends the server the password. Then the server checks that it’s the right one, and allows the client to connect. Basic password authentication has several weaknesses which are addressed with SCRAM and channel binding.

In this article, you’ll learn how to set up authentication using SCRAM with channel binding in Postgres. I implemented the client connection parameter channel_binding in PostgreSQL 13, due to be released in late 2020 (PostgreSQL 13 is in beta now). SCRAM and Channel Binding have already been supported in several releases, but this new connection parameter is necessary to realize the security benefits of SCRAM and Channel Binding.

Keep reading
Ozgun Erdogan

Microsoft Azure Welcomes PostgreSQL Committers

Written byBy Ozgun Erdogan | March 3, 2020Mar 3, 2020

Interview with the Postgres committers who have joined the Postgres team at Microsoft by Sudhakar Sannakkayala (Partner Director, Azure Data) and Ozgun Erdogan (Principal, Azure Data)—cross-posted from the Azure Database for PostgreSQL Blog.

In recent years, the data landscape has seen strong innovation as a result of the onset of open source technologies. At the forefront, PostgreSQL has shown that it’s the open source database built for every type of developer. By staying true to its principles of being standards-compliant, highly programmable, and extensible, PostgreSQL has solidified its position as the “most loved database” of developers across the board—ranging from scenarios for OLTP, analytics, and business intelligence to processing various formats of geometric data using the PostGIS extension.

Keep reading

Page 1 of 1