Citus 9.2 speeds up large scale HTAP workloads on Postgres

Written by Marco Slot
March 2, 2020

Some of you have been asking, “what’s happening with the Citus open source extension to Postgres?” The short answer is: a lot. More and more users have adopted the Citus extension in order to scale out Postgres, to increase performance and enable growth. And you’re probably not surprised to learn that since Microsoft acquired Citus Data last year, our engineering team has grown quite a bit—and we’ve been continuing to evolve and innovate on the Citus open source extension.

Our newest release is Citus 9.2. We’ve updated the installation instructions on our Download page and in our Citus documentation, and now it’s time to take a walk through what’s new.

The executive summary of what’s new with Citus 9.2

Pretty much all the work in this release falls into 3 buckets. Citus 9.2 delivers:

  • SQL Coverage Improvements / Optimizations (resulting in huge performance improvements in important areas across “hybrid transactional analytic processing” aka HTAP, a scenario which Citus is particularly good at)
  • Usability / Ease of operations
  • Extension Support (with support for any aggregate function, including all PostGIS aggregates)

CTE performance improvements (making distributed queries up to 10X faster)

In relational databases, common table expressions (CTEs) help you break down complex analytical queries into several sections. A CTE is little more than a named subquery, but both PostgreSQL and Citus used to execute CTEs separately from the rest of the query, which prevented the planner from optimizing the overall query plan. Citus would always collect the result of each CTE on the coordinator and then broadcast it to the worker nodes, where it could be used in joins. That approach gave Citus a huge boost in SQL support, but at a performance cost.

Since PostgreSQL 12, CTEs are treated as subqueries under the covers, which can lead to much faster query plans. We now use the same approach in Citus, and in many cases this even works on PostgreSQL 11. We also improved performance for queries that only use CTE results by performing that final step on the coordinator. Overall, these performance improvements can make distributed queries with CTEs up to 10x faster compared to previous versions of Citus.
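As a sketch of the kind of query that benefits, consider a hypothetical `page_views` table distributed by `site_id` (table and column names are illustrative, not from the release itself):

```sql
-- Hypothetical distributed table; requires a running Citus cluster.
CREATE TABLE page_views (
    site_id bigint,
    viewed_at timestamptz
);
SELECT create_distributed_table('page_views', 'site_id');

-- In Citus 9.2 this CTE can be treated as a subquery and pushed down
-- to the workers, instead of being materialized on the coordinator
-- and broadcast back out.
WITH daily_views AS (
    SELECT site_id, date_trunc('day', viewed_at) AS day, count(*) AS views
    FROM page_views
    GROUP BY site_id, day
)
SELECT site_id, avg(views) AS avg_daily_views
FROM daily_views
GROUP BY site_id
ORDER BY avg_daily_views DESC
LIMIT 10;
```

If you depend on the old materialize-once behavior, PostgreSQL 12 lets you request it explicitly with `WITH daily_views AS MATERIALIZED (...)`.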

Full support for aggregate functions 

Most analytical SQL queries use aggregate functions such as sum and avg to compute statistics or combine many data points into one. Extensions such as PostGIS and HyperLogLog (HLL) also add additional aggregate functions to PostgreSQL.

Citus now supports aggregate functions in all cases: either by pushing the aggregate down to the worker nodes when grouping by the distribution column, or by pulling the necessary data to the coordinator node and evaluating the aggregate there. All PostGIS aggregates, as well as specialized built-in aggregates, will now work seamlessly when querying distributed tables.
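A small illustration of both cases, using a hypothetical `deliveries` table and two real PostGIS aggregates (`ST_Collect`, `ST_Extent`):

```sql
-- Requires a Citus cluster with PostGIS installed on all nodes.
CREATE EXTENSION IF NOT EXISTS postgis;

CREATE TABLE deliveries (
    courier_id bigint,
    dropoff geometry(Point, 4326)
);
SELECT create_distributed_table('deliveries', 'courier_id');

-- Grouped by the distribution column: the aggregate is pushed down
-- and computed in parallel on the workers.
SELECT courier_id, ST_Collect(dropoff)
FROM deliveries
GROUP BY courier_id;

-- No grouping by the distribution column: Citus pulls the needed
-- data and evaluates the aggregate on the coordinator.
SELECT ST_Extent(dropoff) FROM deliveries;
```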

INSERT..SELECT with re-partitioning (shuffle)—enabling real-time analytics and up to 5X performance improvements 

The INSERT..SELECT command allows you to transform data inside your database. You can use INSERT..SELECT to do computations on your data, for instance to pre-aggregate your data into a different table, but also to change the way the data is stored. For instance, you can store raw JSON data in a partitioned table to make writes fast, and then use INSERT..SELECT to extract the fields that are relevant to your application and store them in a heavily indexed table to make your queries fast. Effectively, INSERT..SELECT gives you the power to create multiple different types of database systems within one database.
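The JSON-to-indexed-table pattern above might look like this sketch (all table and field names are hypothetical):

```sql
-- Raw JSONB events land here; minimal structure keeps writes fast.
CREATE TABLE raw_events (
    event jsonb,
    ingested_at timestamptz DEFAULT now()
);

-- A heavily indexed table holding only the fields the app queries.
CREATE TABLE page_clicks (
    user_id bigint,
    url text,
    clicked_at timestamptz
);
CREATE INDEX ON page_clicks (user_id, clicked_at);

-- INSERT..SELECT extracts the relevant fields from the raw JSON.
INSERT INTO page_clicks (user_id, url, clicked_at)
SELECT (event->>'user_id')::bigint,
       event->>'url',
       (event->>'timestamp')::timestamptz
FROM raw_events
WHERE event->>'type' = 'click';
```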

In Citus, Postgres’ INSERT..SELECT command is extra powerful, because it can often be parallelized across all the shards of your distributed tables, meaning you can process very large volumes of data inside a single distributed database. However, to take advantage of parallelism, the tables in the INSERT and the SELECT parts must have the same distribution column. Before Citus 9.2, if you wrote to a table with a different distribution column, the data was collected on the coordinator and then written to the destination table. Under the covers we use the COPY command to do this, which is actually very fast, but we realized that in many cases we could do a lot better.

With Citus 9.2, the INSERT..SELECT command can now shuffle the data between the workers, rather than pulling the data to the coordinator. This change means that INSERT..SELECT can be up to 5x faster. Since re-partitioning removes the requirement of a shared distribution column, users can now build more advanced real-time analytics scenarios and data processing pipelines inside their Citus database.
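A minimal sketch of a re-partitioned INSERT..SELECT, assuming a hypothetical `events` table distributed by `user_id` and a rollup table distributed by `url`:

```sql
-- Requires a running Citus 9.2 cluster; names are illustrative.
CREATE TABLE events (
    user_id bigint,
    url text,
    created_at timestamptz
);
CREATE TABLE clicks_by_url (
    url text,
    day timestamptz,
    click_count bigint
);
SELECT create_distributed_table('events', 'user_id');
SELECT create_distributed_table('clicks_by_url', 'url');

-- Source and destination have different distribution columns:
-- Citus 9.2 shuffles rows directly between the workers instead of
-- funneling them through the coordinator.
INSERT INTO clicks_by_url (url, day, click_count)
SELECT url, date_trunc('day', created_at), count(*)
FROM events
GROUP BY url, date_trunc('day', created_at);
```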

Updates to how re-partition join queries are handled can mean 10X faster joins

Citus is well-optimized for real-time analytics scenarios (check out how Microsoft uses Citus for petabyte-scale real-time analytics), but some of you have discovered that Citus can act as a data warehouse, too.

In data warehousing queries, Citus often needs to re-partition one side or both sides of the join over the network. Before Citus 9.2, queries containing such joins used to be executed via the “task-tracker” executor, which added a lot of extra latency. As of Citus 9.2, these re-partition join queries are executed via the Citus “adaptive executor” (more on the adaptive executor innovation later)—which can be up to 10x faster for joins with under 10 million rows.
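For example, a join on columns that are not the distribution columns of either table forces a re-partition join. A sketch with hypothetical tables (the `citus.enable_repartition_joins` setting is a real Citus GUC):

```sql
-- Requires a running Citus cluster; names are illustrative.
CREATE TABLE orders (order_id bigint, customer_ref text);
CREATE TABLE line_items (line_id bigint, customer_ref text, line_total numeric);
SELECT create_distributed_table('orders', 'order_id');
SELECT create_distributed_table('line_items', 'line_id');

SET citus.enable_repartition_joins TO on;

-- Neither side joins on its distribution column, so both sides are
-- re-partitioned over the network. In Citus 9.2 this runs through the
-- adaptive executor rather than the old task-tracker executor.
SELECT o.order_id, l.line_total
FROM orders o
JOIN line_items l ON o.customer_ref = l.customer_ref;
```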

We’ll have a lot more exciting improvements on the re-partitioning front over the next few releases as we introduce new infrastructure for INSERT..SELECT with re-partitioning, which will speed up joins as well.

Stored procedure call delegation helps you avoid round trips

Stored procedures are useful for doing complex transactions without additional network round trips between the application server and the database server. In a distributed SQL database like Citus, stored procedures can be even more powerful because you can avoid the round trips between the coordinator and the worker nodes and horizontally scale throughput through a technique we call “stored procedure call delegation”.

A typical scenario in which you might use stored procedure call delegation is a multi-tenant application with complex transactions that operate on a single tenant at a time. When you define a procedure or function that takes the tenant ID as an argument, you can use the create_distributed_function UDF to delegate the procedure calls to the worker node that stores the data for that tenant—with almost no overhead on the coordinator. If you make changes to the UDF, the changes will be automatically propagated to the worker nodes.
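Such a multi-tenant setup could be sketched as follows (schema and procedure names are hypothetical; `create_distributed_function` and its `distribution_arg_name` / `colocate_with` arguments are the real Citus API):

```sql
-- Requires a running Citus cluster; names are illustrative.
CREATE TABLE tenants (id bigint PRIMARY KEY, balance numeric DEFAULT 0);
CREATE TABLE payments (tenant_id bigint, amount numeric);
SELECT create_distributed_table('tenants', 'id');
SELECT create_distributed_table('payments', 'tenant_id',
                                colocate_with := 'tenants');

-- A transaction that touches a single tenant's data.
CREATE PROCEDURE record_payment(p_tenant_id bigint, p_amount numeric)
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO payments (tenant_id, amount) VALUES (p_tenant_id, p_amount);
    UPDATE tenants SET balance = balance + p_amount WHERE id = p_tenant_id;
END;
$$;

-- Delegate each call to the worker node holding that tenant's shard.
SELECT create_distributed_function(
    'record_payment(bigint, numeric)',
    distribution_arg_name := 'p_tenant_id',
    colocate_with := 'tenants'
);

CALL record_payment(42, 9.99);
```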

Stored procedure call delegation was introduced in Citus 9.0, and Citus 9.2 increases its throughput by more than 30%. With these performance improvements, stored procedures will often be the best way of scaling out your multi-tenant app.

Propagation of schema grants and collations 

A Citus distributed database cluster is effectively just a collection of PostgreSQL servers—and you can often take advantage of PostgreSQL features even if Citus does not natively support them yet, by manually creating objects (e.g. types, collations, schemas) on the worker nodes.

However, manually creating objects on the worker nodes can create issues when you go to add new worker nodes to the Citus database cluster, because those objects won’t automatically be created on the new nodes—after all, Citus doesn’t know about these objects (and you might forget to create them!). The good news is that since Citus 9.0, we’ve started propagating types, schemas, and extensions automatically if a distributed table depends on them. And Citus 9.2 expands support to schema grants and collations, such that you can seamlessly create distributed tables with more complex configurations.
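A short sketch of what Citus 9.2 can now propagate automatically, using a hypothetical case-insensitive collation and schema grant (the collation assumes an ICU-enabled PostgreSQL 12 build, since nondeterministic collations require both):

```sql
-- Names are illustrative; requires a running Citus 9.2 cluster.
CREATE COLLATION case_insensitive (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

CREATE SCHEMA app;
GRANT USAGE ON SCHEMA app TO analytics_role;

-- The distributed table depends on the schema, the grant, and the
-- collation, so Citus propagates all of them to current and future
-- worker nodes.
CREATE TABLE app.customers (
    id bigint,
    name text COLLATE case_insensitive
);
SELECT create_distributed_table('app.customers', 'id');
```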

Using Citus to scale out Postgres gives you a state-of-the-art HTAP database

The Citus 9.2 extension to Postgres takes huge leaps in database performance across both transactional and analytical workloads. We see Citus as a state-of-the-art HTAP database that can:

  • scale transaction throughput by getting rid of many of the bottlenecks of PostgreSQL (for example, Citus has massively parallel autovacuuming!)
  • scale analytics workloads by using parallelism to run fast analytical queries over ever-growing data sets—or by using parallelism to process a high volume of incoming data in incremental, transactional steps

The parallelism and transactional features of Citus continuously enhance each other, so better INSERT..SELECT performance, improved CTE performance for distributed queries, and faster join performance will also enable more advanced pre-processing, which in turn will enable higher transaction throughput on other distributed Postgres tables.

N.B. Updated October 2022: If you’re reading this post and you’re interested in trying out Citus, but want to do so in the context of a fully-managed database as a service—the good news is that, thanks to the efforts of some of my Citus teammates, Citus is now available in Azure Cosmos DB for PostgreSQL. You can learn about how to get started with Citus on Azure in the Azure quickstart documentation.

Marco Slot

Former lead engineer for the Citus database engine at Microsoft. Speaker at Postgres Conf EU, PostgresOpen, pgDay Paris, Hello World, SIGMOD, & lots of meetups. Talk selection team member for Citus Con: An Event for Postgres. PhD in distributed systems. Loves mountain hiking.
