Citus BlogCitus Blog

Thoughts about the Citus database—as well as PostgreSQL, sharding, distributed databases, and other open source extensions to Postgres.

Citus is a PostgreSQL extension that makes PostgreSQL scalable by transparently distributing and/or replicating tables across one or more PostgreSQL nodes. Citus could be used either on Azure cloud, or since the Citus database extension is fully open source, you can download and install Citus anywhere you like.

A typical Citus cluster consists of a special node called coordinator and a few worker nodes. Applications usually send their queries to the Citus coordinator node, which relays them to worker nodes and accumulates the results. (Unless of course you’re using the Citus query from any node feature, an optional feature introduced in Citus 11, in which case the queries can be routed to any of the nodes in the cluster.)

Anyway, one of the most frequently asked questions is: “How does Citus handle failures of the coordinator or worker nodes? What’s the HA story?”

And with the exception of when you’re running Citus in a managed service in the cloud, the answer so far was not great—just use PostgreSQL streaming to run coordinator and workers with HA and it is up to you how to handle a failover.

In this blog post, you’ll learn how Patroni 3.0+ can be used to deploy a highly available Citus database cluster—just by adding a few lines to the Patroni configuration file.

Keep reading

Our goal for the Citus extension is for you to be able to use all PostgreSQL features at any scale, with a seamless scaling experience. Distributed tables (or more generally “Citus tables”) are a powerful tool to get high performance at any scale. There are only a few remaining limitations when distributing a PostgreSQL table, but we are determined to solve them all. The Citus 11.2 release checks off another five SQL & DDL features that now work seamlessly on Citus tables. We also improved progress tracking for the shard rebalancer, so you know exactly what’s going on when rebalancing your cluster.

We also want PostgreSQL tools to work out-of-the-box even if you have a distributed PostgreSQL cluster. One of the most frequent questions we get on the Citus Slack from our open source users is how to set up high availability. Alexander Kukushkin, who is the primary maintainer of Patroni and recently joined the Citus database engine team, therefore developed a new version of Patroni which includes support for Citus!

Before we dive in, you can find detailed release notes for Citus 11.2 by the engineering team on our Updates page.

Keep reading
Nazir Bilal Yavuz

Debugging PostgreSQL CI failures faster: 4 tips

Written byBy Nazir Bilal Yavuz | January 18, 2023Jan 18, 2023

Postgres is one of the most widely used databases and supports a number of operating systems. When you are writing code for PostgreSQL, it's easy to test your changes locally, but it can be cumbersome to test it on all operating systems. A lot of times, you may encounter failures across platforms and it can get confusing to move forward while debugging. To make the dev/test process easier for you, you can use the Postgres CI.

When you test your changes on CI and see it fail, how do you proceed to debug from there? As a part of our work in the open source Postgres team at Microsoft, we often run into CI failures—and more often than not, the bug is not obvious, and requires further digging into.

In this blog post, you'll learn about techniques you can use to debug PostgreSQL CI failures faster. We'll be discussing these 4 tips in detail:

Keep reading

For those of you looking to give a talk at a Postgres conference, some good news: the CFP is open for Citus Con: An Event for Postgres 2023. Citus Con is a free and virtual developer event happening next April, hosted by the Postgres and Citus teams at Microsoft.

Carpe diem, as the CFP will close on Feb 5, 2023 at 11:59pm PST.

Videos of all of the Citus Con talks will be published online for the world to see, including on YouTube—so the reach of your talk is not limited to the day of the event.

Because it’s a virtual event, you won’t need to travel to give your talk. And you don’t need to worry about the process of recording your talk: the organizers take care of the video recording and production—all you need is a decent webcam and microphone. You can see from the playlist of last year’s Citus Con talks that the production values of the videos are quite good.

Keep reading

As you may have heard, we recently made PostgreSQL 15 generally available in Azure Cosmos DB for PostgreSQL within just 1 week of the PostgreSQL 15 release. The Postgres 15 version is available for you whether you need to create a new cluster in Azure Cosmos DB for PostgreSQL, or upgrade your existing cluster. (Note: you can do in-place major version upgrades in Azure Cosmos DB for PostgreSQL.) And the PostgreSQL 15 support is available in all Azure regions that support Azure Cosmos DB for PostgreSQL.

You may be surprised since it's usually not the norm for a managed database service to start supporting the new major PostgreSQL version that early... This post will walk you through what's going on behind the scenes that enables us to do such a feat. Some background before diving in:

Azure Cosmos DB for PostgreSQL is powered by native Postgres and Citus open source—and enables you to run PostgreSQL at any scale, from a single node to a large, distributed cluster. Customers can also scale out as much as they want depending on their needs with many additional features. The Hyperscale (Citus) managed service recently moved into Azure Cosmos DB family (more info on the launch of Azure Cosmos DB for PostgreSQL in this blog post) and with that introduced try Azure Cosmos DB for PostgreSQL for free where you can try out PostgreSQL 15 with Citus 11.1.

Keep reading
Thomas Munro

Reducing replication lag with IO concurrency in Postgres 15

Written byBy Thomas Munro | November 10, 2022Nov 10, 2022

Reducing replication lag with IO concurrency in Postgres 15

PostgreSQL 15 improves crash recovery and physical replication performance of some large and very busy databases by trying to minimise I/O stalls. A standby server might now have an easier time keeping up with the primary.

How? The change in PostgreSQL15 is that recovery now uses the maintenance_io_concurrency setting (default is 10, but you can increase it) to decide how many concurrent I/Os to try to initiate, rather than doing random read I/Os one at a time. With big and busy databases, when I/O concurrency increases, replication lag can be reduced.

Keep reading
Nik Larin

News: Postgres 15 available in Azure Cosmos DB for PostgreSQL

Written byBy Nik Larin | October 21, 2022Oct 21, 2022

Big news from the Postgres and Citus team here at Microsoft! Just 1 week after PostgreSQL 15 was released, PostgreSQL 15 GA is generally available in the portal for the Azure Cosmos DB for PostgreSQL managed service—in all Azure regions. Whether you need to provision new clusters in Azure Cosmos DB for Postgres—or upgrade your existing database clusters—Postgres 15 is now a choice for you. Oh, and you can upgrade your existing cluster to Postgres 15 from any of the other supported major Postgres versions, using the in-place major version upgrade feature.

Keep reading
Melih Mutlu

How to Add More Environments to the Postgres CI

Written byBy Melih Mutlu | September 30, 2022Sep 30, 2022

Have you ever played with Postgres source code and weren't sure if you broke anything? Postgres has a quite comprehensive regression test suite that helps to ensure that nothing is broken. You can, of course, run those tests on your machine and check if your version of Postgres works properly. But it always works on your machine, right? What about other environments?

In this blog post, you will learn about how to enable and use the Postgres CI (plus how to contribute to it!) based on my experience and learnings creating my first patch to Postgres. Specifically, you’ll learn:

Keep reading
Marco Slot

Citus 11.1 shards your Postgres tables without interruption

Written byBy Marco Slot | September 19, 2022Sep 19, 2022

Citus is a distributed database that is built entirely as an open source PostgreSQL extension. In fact, you can install it in your PostgreSQL server without changing any PostgreSQL functionality. Citus simply gives PostgreSQL additional superpowers.

Being an extension also means we can keep adding new Postgres superpowers at a high pace. In the last release (11.0), we focused on giving you the ability to query from any node, opening up Citus for many new use cases, and we also made Citus fully open source. That means you can see everything we do on the Citus GitHub page (and star the repo if you’re a fan 😊). It also means that everyone can take advantage of shard rebalancing without write-downtime.

In the latest release (11.1), our Citus database team at Microsoft improved the application’s experience and avoided blocking writes during important operations like distributing tables and tenant isolation. These new capabilities built on the experience gained from developing the shard rebalancer, which uses logical replication to avoid blocking writes. In addition, we made the shard rebalancer faster and more user-friendly; also, we prepared for the upcoming PostgreSQL 15 release. This post gives you a quick tour of the major changes in Citus 11.1, including:

Keep reading

A few months ago we made Citus fully open source. This was a very exciting milestone for all of us on the Citus database engine team. Contrary to folks who say that Postgres is a monolith that can’t scale—Postgres in fact has a fully open source solution for distributed scale, one that’s also native to Postgres. It’s called Citus! This post will go into more detail on why we open sourced our few remaining enterprise features in Citus 11, what exactly we open sourced, and finally what it took to actually open source our code. If you’re more interested in the code instead, you can find it in our GitHub repo (feel free to give the Citus project a star.)

Keep reading

Page 4 of 32