Citus Blog

Articles tagged: shard rebalancer

The new PostgreSQL 16 release is out, packed with exciting improvements and features—and Citus 12.1 brings them to you at scale, within just one week of the PG16 release.

As many of you likely know, Citus is an open source PostgreSQL extension that turns Postgres into a distributed database. Our team started integrating Citus with the PG16 beta and release candidates early-on, so that you could have a new Citus 12.1 release that is compatible with Postgres 16 as quickly as possible after PG16 came out.

There are a lot of good reasons to upgrade to Postgres 16—huge thanks to everyone who contributed into this Postgres release! PG16 highlights include query performance boost with more parallelism; load balancing with multiple hosts in libpq (contributed by my Citus teammate, Jelte Fennema-Nio); I/O monitoring with pg_stat_io; developer experience enhancements; finer-grained options for access control; logical replication from standby servers and other replication improvements, like using btree indexes in the absence of a primary key (contributed by one of my teammates, Onder Kalaci.)

The good news for those of you who care about distributed Postgres: Citus 12.1 is now available and adds support for Postgres 16.

In addition to Postgres 16 support, Citus 12.1 includes enhancements to schema-based sharding, which was recently added to Citus 12.0—and is super useful for multi-tenant SaaS applications.

Keep reading

One of the top Citus features is the ability to run PostgreSQL at any scale, on a single node as well as a distributed database cluster.

As your application needs to scale, you can add more nodes to the Citus cluster, rebalance existing data to the new Postgres nodes, and seamlessly scale out. However, these operations require manual intervention: a) first you must create alerts on metrics, b) then, based on those alerts, you need to add more nodes, c) then you must kick off and monitor the shard rebalancer. Automating these steps will give you a complete auto scale experience—and make your life so much easier.

In this blog post, you will learn how to build a full-fledged auto scaling setup for the Citus database extension running as a managed service on Azure—called Azure Cosmos DB for PostgreSQL. You’ll also learn how you can easily add nodes to the Azure Cosmos DB for PostgreSQL cluster and use any metrics available to trigger actions in your cluster! Let’s dive into the following chapters:

Keep reading

Citus enables several different PostgreSQL use cases, but one of the most popular ones is to build scalable multi-tenant software as a service (SaaS) applications. The most common way to build a multi-tenant application on Citus is to distribute all your Postgres tables by a “tenant ID” column. That way rows are (hash-)distributed across nodes, while rows with the same tenant ID value are co-located on the same node for fast local joins, transactions, and foreign keys.

For those of you who build SaaS apps, one question many of you have is how active your tenants are. More specifically: What are your busiest tenants? How many queries is your application doing on behalf of your tenants, and how much CPU do those queries use?

The new 11.3 release to the open source Citus database extension gives you tenant monitoring—with instant visibility into your top tenants using the new citus_stat_tenants feature, which shows query counts and CPU usage over a configurable time period.

Keep reading
Marco Slot

Citus 11.1 shards your Postgres tables without interruption

Written byBy Marco Slot | September 19, 2022Sep 19, 2022

Citus is a distributed database that is built entirely as an open source PostgreSQL extension. In fact, you can install it in your PostgreSQL server without changing any PostgreSQL functionality. Citus simply gives PostgreSQL additional superpowers.

Being an extension also means we can keep adding new Postgres superpowers at a high pace. In the last release (11.0), we focused on giving you the ability to query from any node, opening up Citus for many new use cases, and we also made Citus fully open source. That means you can see everything we do on the Citus GitHub page (and star the repo if you’re a fan 😊). It also means that everyone can take advantage of shard rebalancing without write-downtime.

In the latest release (11.1), our Citus database team at Microsoft improved the application’s experience and avoided blocking writes during important operations like distributing tables and tenant isolation. These new capabilities built on the experience gained from developing the shard rebalancer, which uses logical replication to avoid blocking writes. In addition, we made the shard rebalancer faster and more user-friendly; also, we prepared for the upcoming PostgreSQL 15 release. This post gives you a quick tour of the major changes in Citus 11.1, including:

Keep reading

A few months ago we made Citus fully open source. This was a very exciting milestone for all of us on the Citus database engine team. Contrary to folks who say that Postgres is a monolith that can’t scale—Postgres in fact has a fully open source solution for distributed scale, one that’s also native to Postgres. It’s called Citus! This post will go into more detail on why we open sourced our few remaining enterprise features in Citus 11, what exactly we open sourced, and finally what it took to actually open source our code. If you’re more interested in the code instead, you can find it in our GitHub repo (feel free to give the Citus project a star.)

Keep reading

Citus 11.0 is here! Citus is a PostgreSQL extension that adds distributed database superpowers to PostgreSQL. With Citus, you can create tables that are transparently distributed or replicated across a cluster of PostgreSQL nodes. Citus 11.0 is a new major release, which means that it comes with some very exciting new features that enable new levels of scalability.

The biggest enhancement in Citus 11.0 is that you can now always run distributed queries from any node in the cluster because the schema & metadata are automatically synchronized. We already shared some of the details in the Citus 11.0 beta blog post, but we also have big surprise for those of you who use Citus open source that was not part of the initial beta.

When we do a new Citus release, we usually release 2 versions: The open source version and the enterprise release which includes a few extra features. However, there will be only one version of Citus 11.0, because everything in the Citus extension is now fully open source!

That means that you can now rebalance shards without blocking writes, manage roles across the cluster, isolate tenants to their own shards, and more. All this comes on top of the already massive enhancement in Citus 11.0: You can query your Citus cluster from any node, creating a truly distributed PostgreSQL experience.

Keep reading

With the 10.1 release to the Citus extension to Postgres, you can now monitor the progress of an ongoing shard rebalance—plus you get performance optimizations, as well as some user experience improvements to the rebalancer, too.

Whether you use Citus open source to scale out Postgres, or you use Citus in the cloud, this post is your guide to what’s new with the shard rebalancer in Citus 10.1.

And if you’re wondering when you might need to use the shard rebalancer: the rebalancer is used when you add a new Postgres node to your existing Citus database cluster and you want to move some of the old data to this new node, to “balance” the cluster. There are also times you might want to balance shards across nodes in a Citus cluster in order to optimize performance. A common example of this is when you have a SaaS application and one of your customers/tenants has significant more activity than the rest.

Keep reading

Citus 10.1 is out! In this latest release to the Citus extension to Postgres, our team focused on improving your user experience. Some of the 10.1 fixes are operational improvements—such as with the shard rebalancer, or with citus_update_node. Some are performance improvements—such as for multi-row INSERTs or with citus_shards. And some are fixes you’ll appreciate if you use Citus with lots of Postgres partitions.

Given that the previous Citus 10 release included a bevy of new features—including things like columnar storage, Citus on a single node, open sourcing the shard rebalancer, new UDFs so you can alter distributed table properties, and the ability to combine Postgres and Citus tables via support for JOINs between local and distributed tables, and foreign keys between local and reference tables—well, we felt that Citus 10.1 needed to prioritize some of our backlog items, the kinds of things that can make your life easier.

This post is your guide to the what’s new in Citus 10.1. And if you want to catch up on all the new things in past releases to Citus, check out the release notes posts about Citus 10, Citus 9.5, Citus 9.4, Citus 9.3, and Citus 9.2.

Keep reading

It’s been an eventful time for Hyperscale (Citus) lately. If you’re interested in Postgres, distributed databases, and how to handle ever growing needs for your Postgres application or simply use Hyperscale (Citus), keep reading.

Citus is an open source extension to Postgres that enables horizontal scaling of your Postgres database. Citus distributes your Postgres tables, writes, and SQL queries across multiple nodes—parallelizing your workload and enabling you to use the memory, compute, and disk of a multi-node cluster. And Citus is available on Azure: Hyperscale (Citus) is a deployment option in Azure Database for PostgreSQL.

What’s really exciting to me is that we’ve made it easier and cheaper than ever to try and use Hyperscale (Citus). With Basic tier, you can now use Hyperscale (Citus) on a single node, parallelizing your operations and adopting a distributed database model from the very beginning. And you can now try Citus open source with a single docker run command—boom!

Keep reading

One of the main reasons people use the Citus extension for Postgres is to distribute the data in Postgres tables across multiple nodes. Citus does this by splitting the original Postgres table into multiple smaller tables and putting these smaller tables on different nodes. The process of splitting bigger tables into smaller ones is called sharding—and these smaller Postgres tables are called “shards”. Citus then allows you to query the shards as if they were still a single Postgres table.

One of the big changes in Citus 10—in addition to adding columnar storage, and the new ability to shard Postgres on a single Citus node—is that we open sourced the shard rebalancer.

Yes, that’s right, we have open sourced the shard rebalancer! The Citus 10 shard rebalancer gives you an easy way to rebalance shards across your cluster and helps you avoid data hotspots over time. Let’s dig into the what and the how.

Keep reading

Page 1 of 2