Citus 10.2 is out! 10.2 brings you new columnar & time series features—and is ready to support Postgres 14. Read the new Citus 10.2 blog.

Skip navigation

Citus Blog

Articles tagged: open source

Burak Velioglu

How to scale Postgres for time series data with Citus

Written byBy Burak Velioglu | October 22, 2021Oct 22, 2021

Managing time series data at scale can be a challenge. PostgreSQL offers many powerful data processing features such as indexes, COPY and SQL—but the high data volumes and ever-growing nature of time series data can cause your database to slow down over time.

Fortunately, Postgres has a built-in solution to this problem: Partitioning tables by time range.

Partitioning with the Postgres declarative partitioning feature can help you speed up query and ingest times for your time series workloads. Range partitioning lets you create a table and break it up into smaller partitions, based on ranges (typically time ranges). Query performance improves since each query only has to deal with much smaller chunks. Though, you’ll still be limited by the memory, CPU, and storage resources of your Postgres server.

The good news is you can scale out your partitioned Postgres tables to handle enormous amounts of data by distributing the partitions across a cluster. How? By using the Citus extension to Postgres. In other words, with Citus you can create distributed time-partitioned tables. To save disk space on your nodes, you can also compress your partitions—without giving up indexes on them. Even better: the latest Citus 10.2 open-source release makes it a lot easier to manage your partitions in PostgreSQL.

Keep reading
Onder Kalaci

What’s new in the Citus 10.2 extension to Postgres

Written byBy Onder Kalaci | September 17, 2021Sep 17, 2021

Citus 10.2 is out! If you are not yet familiar with Citus, it is an open source extension to Postgres that transforms Postgres into a distributed database—so you can achieve high performance at any scale. The Citus open source packages are available for download. And Citus is also available in the cloud as a managed service, too.

You can see a bulleted list of all the changes in the CHANGELOG on GitHub. This post is your guide to what’s new in Citus 10.2, including some of these headline features.

Keep reading
Jelte Fennema

Shard rebalancing in the Citus 10.1 extension to Postgres

Written byBy Jelte Fennema | September 3, 2021Sep 3, 2021

With the 10.1 release to the Citus extension to Postgres, you can now monitor the progress of an ongoing shard rebalance—plus you get performance optimizations, as well as some user experience improvements to the rebalancer, too.

Whether you use Citus open source to scale out Postgres, or you use Citus in the cloud, this post is your guide to what’s new with the shard rebalancer in Citus 10.1.

And if you’re wondering when you might need to use the shard rebalancer: the rebalancer is used when you add a new Postgres node to your existing Citus database cluster and you want to move some of the old data to this new node, to “balance” the cluster. There are also times you might want to balance shards across nodes in a Citus cluster in order to optimize performance. A common example of this is when you have a SaaS application and one of your customers/tenants has significant more activity than the rest.

Keep reading

Citus 10.1 is out! In this latest release to the Citus extension to Postgres, our team focused on improving your user experience. Some of the 10.1 fixes are operational improvements—such as with the shard rebalancer, or with citus_update_node. Some are performance improvements—such as for multi-row INSERTs or with citus_shards. And some are fixes you’ll appreciate if you use Citus with lots of Postgres partitions.

Given that the previous Citus 10 release included a bevy of new features—including things like columnar storage, Citus on a single node, open sourcing the shard rebalancer, new UDFs so you can alter distributed table properties, and the ability to combine Postgres and Citus tables via support for JOINs between local and distributed tables, and foreign keys between local and reference tables—well, we felt that Citus 10.1 needed to prioritize some of our backlog items, the kinds of things that can make your life easier.

This post is your guide to the what’s new in Citus 10.1. And if you want to catch up on all the new things in past releases to Citus, check out the release notes posts about Citus 10, Citus 9.5, Citus 9.4, Citus 9.3, and Citus 9.2.

Keep reading

If you have a large PostgreSQL database that runs on a single node, eventually the single node’s resources—such as memory, CPU, and disk—may deliver query responses that are too slow. That is when you may want to use the Citus extension to Postgres to distribute your tables across a cluster of Postgres nodes.

In your large database, Citus will shine for large tables, since the distributed Citus tables will benefit from the memory across all of the nodes in the cluster. But what if your Postgres database also contains some small tables which easily fit into a single node’s memory? You might be wondering: do you need to distribute these smaller tables, even though there wouldn’t be much performance gain from distributing them?

Fortunately, as of the Citus 10 release, you do not have to choose: you can distribute your large tables across a Citus cluster and continue using your smaller tables as local Postgres tables on the Citus coordinator.

One of the new features in Citus 10 that enables you to use a hybrid “local+distributed” Postgres database is that you can now JOIN local tables and distributed tables. (The other new Citus 10 feature has to do with foreign keys between local and reference tables.)

Keep reading

One of the main reasons people use Citus to transform Postgres into a distributed database is that with Citus, you can scale out horizontally while still enjoying PostgreSQL’s great RDBMS features. Whether you’re already a Postgres expert or are new to Postgres, you probably know one of the benefits of using a relational database is to have relations between your tables. And one of the ways you can relate your tables is of course to use foreign keys.

A foreign key ensures referential integrity, which can help you to avoid bugs in applications. For example, a foreign key can be used to ensure that a table of “orders” can only reference customer IDs that exist in the “customers” table.

If you have already heard about Citus 10, you know that Citus 10 gives you more support for hybrid data models, which means that you can easily combine regular Postgres tables with distributed Citus tables to get the best of the single node and distributed Postgres worlds.

This post will walk you through one of the new features in Citus 10: support for foreign keys between local Postgres tables and Citus reference tables.

Keep reading

Citus is an extension to Postgres that lets you distribute your application’s workload across multiple nodes. Whether you are using Citus open source or using Citus as part of a managed Postgres service in the cloud, one of the first things you do when you start using Citus is to distribute your tables. While distributing your Postgres tables you need to decide on some properties such as distribution column, shard count, colocation. And even before you decide on your distribution column (sometimes called a distribution key, or a sharding key), when you create a Postgres table, your table is created with an access method.

Previously you had to decide on these table properties up front, and then you went with your decision. Or if you really wanted to change your decision, you needed to start over. The good news is that in Citus 10, we introduced 2 new user-defined functions (UDFs) to make it easier for you to make changes to your distributed Postgres tables.

Keep reading
David Rowley

Speeding up recovery & VACUUM in Postgres 14

Written byBy David Rowley | March 25, 2021Mar 25, 2021

One of the performance projects I’ve focused on in PostgreSQL 14 is speeding up PostgreSQL recovery and vacuum. In the PostgreSQL team at Microsoft, I spend most of my time working with other members of the community on the PostgreSQL open source project. And in Postgres 14 (due to release in Q3 of 2021), I committed a change to optimize the compactify_tuples function, to reduce CPU utilization in the PostgreSQL recovery process. This performance optimization in PostgreSQL 14 made our crash recovery test case about 2.4x faster.

The compactify_tuples function is used internally in PostgreSQL:

  • when PostgreSQL starts up after a non-clean shutdown—called crash recovery
  • by the recovery process that is used by physical standby servers to replay changes (as described in the write-ahead log) as they arrive from the primary server
  • by VACUUM

So the good news is that the improvements to compactify_tuples will: improve crash recovery performance; reduce the load on the standby server, allowing it to replay the write-ahead log from the primary server more quickly; and improve VACUUM performance.

Keep reading

One of the big new things in Citus 10 is that you can now shard Postgres on a single Citus node. So in addition to using the Citus extension to Postgres to scale out Postgres across a distributed cluster, you can now also:

  • Try out Citus on a single node with just a few simple commands
  • Shard Postgres on a single Citus node to be “scale-out-ready”
  • Simplify CI/CD pipelines by testing with single-node Citus

The Citus 10 release is chock full of new capabilities like columnar storage for Postgres, the open sourcing of the shard rebalancer, as well as the feature we are going to explore here: using Citus on a single node. No matter what type of application you run on top of Citus—multi-tenant SaaS apps, customer-facing analytics dashboards, time-series workloads, high-throughput transactional apps—there is something for everyone in Citus 10.

Keep reading

One of the main reasons people use the Citus extension for Postgres is to distribute the data in Postgres tables across multiple nodes. Citus does this by splitting the original Postgres table into multiple smaller tables and putting these smaller tables on different nodes. The process of splitting bigger tables into smaller ones is called sharding—and these smaller Postgres tables are called “shards”. Citus then allows you to query the shards as if they were still a single Postgres table.

One of the big changes in Citus 10—in addition to adding columnar storage, and the new ability to shard Postgres on a single Citus node—is that we open sourced the shard rebalancer.

Yes, that’s right, we have open sourced the shard rebalancer! The Citus 10 shard rebalancer gives you an easy way to rebalance shards across your cluster and helps you avoid data hotspots over time. Let’s dig into the what and the how.

Keep reading

Page 1 of 3