Distributed Postgres goes full open source with Citus: why, what & how

Written by Jelte Fennema
September 12, 2022

A few months ago we made Citus fully open source. This was a very exciting milestone for all of us on the Citus database engine team. Some folks say that Postgres is a monolith that can’t scale, but Postgres in fact has a fully open source solution for distributed scale, one that’s also native to Postgres. It’s called Citus! This post goes into more detail on why we open sourced our few remaining enterprise features in Citus 11, what exactly we open sourced, and finally what it took to actually open source our code. If you’re more interested in the code itself, you can find it in our GitHub repo (feel free to give the Citus project a star).

Why make the final pieces of Citus open source now?

One of the reasons we open sourced the last few enterprise features in Citus 11 is that our business model has changed. When Citus was first started back in ~2011, the business model consisted of selling enterprise licenses and support contracts. In 2016 we un-forked Citus from Postgres and open sourced the bulk of Citus. After doing that, we differentiated our enterprise-licensed software from the open source version by including a few extra closed source enterprise features. But over time our business model has moved away from selling enterprise licenses.

Currently our business model revolves around our managed service: Azure Database for PostgreSQL - Hyperscale (Citus). As you might imagine, our managed service builds on top of Citus open source by adding in the “managed” features, aka the features that save you time and make it so you no longer have to worry about your database.

With the Azure service, you can create and scale a Postgres cluster with the click of a button; your Postgres settings are already tuned to get extra performance out of the hardware; you get automatic backups, from which you can restore with ease; and if a node in your cluster crashes, you automatically fail over to another one, assuming you enabled High Availability (HA). You’ll also get easy integrations with other Azure cloud services like ADF, Azure Stream Analytics, Azure Kubernetes Service, App Service, and more. And if you ever run into an issue, you can always reach out to the super-knowledgeable Azure support team.

As a result of this change in business model, we started to wonder: If customers are primarily paying us for the managed service, does that mean that we could make Citus completely open source? And what are the advantages of making Citus completely open source?

Hacker News front page screenshot
Figure 1: Lots of you were just as excited as we were about open sourcing the remaining Citus Enterprise features. My GitHub commit that open sourced these enterprise features is the only commit of mine that has ever landed on the front page of Hacker News.

The advantages of making Citus completely open source

Even if open sourcing everything had no big downsides, it still needed to have some clear advantages to justify the effort of moving away from the status quo.

As you might imagine, there are a multitude of benefits to making Citus completely open source.

More open source features mean more open source users

Probably the most obvious group of people who benefit from completely open sourcing Citus are those of you who already use the open source version of Citus. If you’re part of this group, you suddenly get extra features by simply upgrading to Citus 11. And who doesn’t like lots of new features?

But if you’re not yet using Citus, this release could be the turning point too. With these extra features like the online shard rebalancer and better user management, we expect more of you to give Citus a try. For some of you Citus on Azure may not fit into your plans yet—or perhaps you chose not to use Citus because it was missing some features that were critical for you. Well, not anymore! We have taken away that barrier!

We would love as many people as possible, including you, to use Citus to build and deploy applications—whether you’re using Citus on Azure or Citus open source. Why? Because of the following reasons:

  • More people benefiting from Citus: We think that Citus is awesome (we are slightly biased) and we think many more people could benefit from using it. For Citus engineers like me, having people benefit from the software we’re building is already a reward in itself.
  • More contributions: As a user you might very well contribute directly to Citus and its community. You might help others out on our Slack channel or on Stack Overflow, you could report bugs you run into, and you might even contribute code. All this makes Citus better for everyone. Of course, all of these are optional, but from experience we know that some of you greatly improve Citus and add to its community.
  • More customers: This is probably the most counterintuitive reason, but in the long run we expect more open source adoption to result in more customers that use Citus on Azure. Maybe you’re an open source Citus user now but want to move to Azure in the future—or maybe you tell a friend how great Citus is, which causes your friend to adopt Citus on Azure. So if you’re an open source user, you could very well end up influencing people to use the Citus managed service in the future.

Functional parity between your laptop and Azure

If you were already using the Citus managed service on Azure, then the newly open-sourced features are already available to you there. But if you’re a developer, then Azure likely isn’t the only place you used Citus. You have probably also installed the open source version of Citus on your dev machine, to develop and test your application. So in the past, the environment on your dev machine would differ slightly from your production environment. That is not desirable: many developers like to keep the development environment as close to production as possible, so that bugs are caught before they reach production.

By open sourcing the remaining Citus features in Citus 11, you now have complete functional parity between Citus in the cloud and on your laptop.

More developer time spent on improving Citus

Lastly, but not unimportantly, by open sourcing all of Citus we significantly increase the happiness and productivity of our own developers (including myself).

Prior to Citus 11, we had two git repositories: a public GitHub repo for the open source version of Citus, and a private repo that contained the “enterprise” version of Citus.

So, before Citus 11, each new Citus release required a significant chunk of developer time to be spent on the repetitive task of merging the changes from the public repo into the private repo. This often resulted in annoying merge conflicts, or in slight differences in behavior during tests. It wasn’t the favorite part of the job for any of us, and it was time consuming too. Now that everything is open source, my colleagues and I can focus more on the work that we love: improving Citus and distributing PostgreSQL!

We show that we love open source

While some of my teammates work on our Azure managed services for PostgreSQL and for Citus, a fair number of my engineering colleagues here at Microsoft spend the bulk of their time working as committers of the Postgres open source project. We also maintain other open source extensions to Postgres such as pg_cron, hyperloglog, and TopN. And even though Citus was already mostly open source, we felt that we could show our love for open source even better by making Citus entirely open source.

So, what exactly did we open source in Citus 11?

Most of the code for Citus was already open source before Citus 11—we had unforked Citus from Postgres and open sourced Citus as a Postgres extension way back in 2016 already. So, exactly what new features were open sourced in Citus 11? The full list of newly open-sourced features can be found on our updates page, but I have highlighted below the ones I think you will likely benefit from the most:

  • The non-blocking shard rebalancer allows you to scale out your cluster without downtime. With Citus 10 we open sourced the shard rebalancer, but it only allowed you to rebalance while blocking writes to the shard that was being moved. With Citus 11 there’s no need to take such partial downtime anymore when you need to add more compute power to your cluster, or when you’re running out of disk space. With the non-blocking shard rebalancer, your cluster is fully online any time you want to rebalance the data in your cluster.
  • Propagation of CREATE/ALTER/DROP ROLE is very useful in a multi-user Postgres environment. Setting up a cluster with multiple Postgres users was quite cumbersome in the open source version of Citus. You had to create each role on every node, and when you added a new node you would have to add it there as well. This is not the case anymore! You can now create or edit a role on the coordinator and your changes will automatically appear on all the other nodes in your cluster too.
  • Propagation of GRANT statements allows you to easily manage permissions of the users in your cluster. GRANT statements that you execute on distributed tables now do exactly what you would expect them to do: They grant the same permissions on all the underlying shards. The same goes for schemas: when you GRANT access to an entire schema, those permissions are now propagated to all the workers.
  • Authentication options with pg_dist_authinfo allow you to easily configure how nodes should authenticate with each other. Previously, you needed to use a .pgpass file to configure the passwords with which to authenticate between nodes when using open source Citus. With Citus 11 now being fully open source, you can use a simpler and more powerful alternative: the pg_dist_authinfo table. In this table you can put the credentials that should be used to authenticate to other nodes. These credentials can use any authentication option that Postgres supports, like passwords or TLS certificates. What makes pg_dist_authinfo especially easy to use is that you can create a single shared row for each user, which is then used to authenticate to any of the nodes in your cluster, while with the .pgpass approach you needed one line for each node and user combination.
  • Row level security (RLS) isn’t a feature that is used by most Postgres users. But for some of you, this is a critical feature that allows you to configure precisely who can read what data. In Citus 11 we open sourced support for this feature on distributed tables. Any RLS rules that you create on distributed tables now automatically get created on all the shards too. So if this was holding you back from migrating to Citus, that’s one reason less.
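
To make the .pgpass versus pg_dist_authinfo comparison above a bit more concrete, here is a minimal sketch that only prints configuration text and does not connect to any database. The hostnames, role names, and passwords are hypothetical placeholders. The column list (nodeid, rolename, authinfo) and the convention that nodeid 0 applies a row to all nodes are taken from my reading of the Citus documentation, so treat them as assumptions to verify against the docs for your Citus version.

```shell
#!/bin/sh
# Hypothetical cluster: every hostname, role, and password here is a placeholder.
NODES="coordinator worker-1 worker-2"
USERS="app_rw app_ro"

# .pgpass approach: one line per (node, user) combination.
# Line format from the Postgres docs: hostname:port:database:username:password
pgpass_lines() {
    for node in $NODES; do
        for user in $USERS; do
            printf '%s:5432:*:%s:%s\n' "$node" "$user" "secret-for-$user"
        done
    done
}

# pg_dist_authinfo approach: one row per user.
# Assumption: nodeid 0 means "use this row for every node".
authinfo_rows() {
    for user in $USERS; do
        printf "INSERT INTO pg_dist_authinfo (nodeid, rolename, authinfo) VALUES (0, '%s', 'password=%s');\n" \
            "$user" "secret-for-$user"
    done
}

echo "# .pgpass lines (nodes x users):"
pgpass_lines
echo "# pg_dist_authinfo rows (users only):"
authinfo_rows
```

With three nodes and two users, the .pgpass function emits six lines while the pg_dist_authinfo function emits two statements, and adding a node only grows the first list.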

How did we open source Citus completely?

The arguments and reasoning above might seem obvious in hindsight, but it took some time to get the stakeholders aligned. Eventually the team agreed, and everyone was very excited about this huge step.

The only decision left to make was when to open source the code: The Citus 11.0 release was the obvious candidate. Then actually open sourcing all the code was surprisingly easy due to the power of git. It was pretty much as simple as the following four commands:

# Create new branch based on open source code
git checkout -b open-source-master-merge-enterprise open-source/master
# Copy all files tracked by git from the enterprise repo
git checkout enterprise/master .
# Keep the open source license instead of the enterprise license
git checkout HEAD -- LICENSE
# Create one big commit
git commit -m 'Make enterprise features open source'
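
If you want to see what those four commands do without touching a real repository, the sketch below replays them in a throwaway sandbox. It is not the actual Citus setup: the two tiny local repositories, their file names, and the demo commit identity are all hypothetical stand-ins for the public and private repos.

```shell
#!/bin/sh
set -eu

# Everything happens in a temporary directory (hypothetical sandbox).
sandbox=$(mktemp -d)
cd "$sandbox"

# Run git with a fixed demo identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Stand-in for the public open source repo.
git init -q oss
(
    cd oss
    git symbolic-ref HEAD refs/heads/master
    echo 'open source license' > LICENSE
    echo 'core' > core.c
    g add .
    g commit -q -m 'open source code'
)

# Stand-in for the private enterprise repo: same core plus one extra feature.
git init -q ent
(
    cd ent
    git symbolic-ref HEAD refs/heads/master
    echo 'enterprise license' > LICENSE
    echo 'core' > core.c
    echo 'rebalancer' > rebalancer.c
    g add .
    g commit -q -m 'enterprise code'
)

# Working repo with both stand-ins configured as remotes.
git init -q work
cd work
git remote add open-source "$sandbox/oss"
git remote add enterprise "$sandbox/ent"
git fetch -q open-source
git fetch -q enterprise

# The four commands from the post.
git checkout -q -b open-source-master-merge-enterprise open-source/master
git checkout -q enterprise/master .
git checkout -q HEAD -- LICENSE
g commit -q -m 'Make enterprise features open source'

# Show the resulting history and what the big commit changed.
git log --oneline
git show --stat --format= HEAD
```

In this sandbox the final commit adds only the extra enterprise file, while LICENSE keeps the open source text because the third command restores it from HEAD before committing.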

After doing that, we still needed to double check the contents of the commit, both to make sure that there wasn’t some reference to a specific customer in a code comment or something like that, and to create a comprehensive list of all the features that we were making open source (some of which were even a surprise to us).
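
One way to get a first pass at that kind of review is sketched below. It is not the process the team actually followed, only an illustration under stated assumptions: the demo repository, its file contents, and the watch-list pattern are hypothetical, and a keyword scan can only complement reading the diff by hand.

```shell
#!/bin/sh
set -eu

# Build a tiny throwaway repo with a "before" and an "after" commit (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q .
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo 'int core;' > core.c
g add .
g commit -q -m 'open source code'

printf '%s\n' 'int rebalancer;' '/* workaround for customer ACME-CORP */' > rebalancer.c
g add .
g commit -q -m 'Make enterprise features open source'

# Hypothetical watch list of terms that should not appear in public code.
PATTERN='customer|acme-corp'

# Files touched by the big commit: a starting point for the feature list.
changed_files=$(git diff --name-status HEAD~1 HEAD)

# Added lines only: keep "+" lines, drop the "+++ b/file" headers,
# then search case-insensitively for watch-list terms.
suspicious=$(git diff HEAD~1 HEAD | grep '^+' | grep -v '^+++ ' | grep -i -E "$PATTERN" || true)

echo "Changed files:"
echo "$changed_files"
echo "Lines to review by hand:"
echo "$suspicious"
```

The name-status listing doubles as raw material for the list of newly open-sourced features, since every added file shows up there.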

Then finally all that was left was creating a pull request with this commit on our open source repo and Citus was completely open source.

Have fun using Citus!

By open sourcing all our code in Citus 11 we tried to make your life as a Citus user better, no matter if you use it on-prem, in the cloud on Azure, or on your laptop. So have fun trying out the Citus 11 release. If you don’t know where to start, you should check out our Getting Started page. And if you have any questions, just join our Slack channel or drop your questions on Stack Overflow.

Written by Jelte Fennema

Postgres and Citus developer at Microsoft. Low latency APIs at Stream. BSc in Computer Science and MSc in System & Network Engineering from U of Amsterdam. Rust. Hot sun. Cold beer.
