Citus Blog

Articles tagged: open source

Our goal for the Citus extension is for you to be able to use all PostgreSQL features at any scale, with a seamless scaling experience. Distributed tables (or more generally “Citus tables”) are a powerful tool to get high performance at any scale. There are only a few remaining limitations when distributing a PostgreSQL table, but we are determined to solve them all. The Citus 11.2 release checks off another five SQL & DDL features that now work seamlessly on Citus tables. We also improved progress tracking for the shard rebalancer, so you know exactly what’s going on when rebalancing your cluster.

We also want PostgreSQL tools to work out-of-the-box even if you have a distributed PostgreSQL cluster. One of the most frequent questions we get on the Citus Slack from our open source users is how to set up high availability. Alexander Kukushkin, who is the primary maintainer of Patroni and recently joined the Citus database engine team, therefore developed a new version of Patroni which includes support for Citus!

Before we dive in, you can find detailed release notes for Citus 11.2 by the engineering team on our Updates page.

Keep reading
Nazir Bilal Yavuz

Debugging PostgreSQL CI failures faster: 4 tips

Written by Nazir Bilal Yavuz | January 18, 2023

Postgres is one of the most widely used databases and supports a number of operating systems. When you are writing code for PostgreSQL, it's easy to test your changes locally, but it can be cumbersome to test them on all operating systems. Often, you may encounter failures across platforms, and it can get confusing to move forward while debugging. To make the dev/test process easier for you, you can use the Postgres CI.

When you test your changes on CI and see them fail, how do you proceed to debug from there? As a part of our work in the open source Postgres team at Microsoft, we often run into CI failures, and more often than not the bug is not obvious and requires further digging.

In this blog post, you'll learn about techniques you can use to debug PostgreSQL CI failures faster. We'll be discussing these 4 tips in detail:

Keep reading

As you may have heard, we recently made PostgreSQL 15 generally available in Azure Cosmos DB for PostgreSQL within just 1 week of the PostgreSQL 15 release. The Postgres 15 version is available for you whether you need to create a new cluster in Azure Cosmos DB for PostgreSQL, or upgrade your existing cluster. (Note: you can do in-place major version upgrades in Azure Cosmos DB for PostgreSQL.) And the PostgreSQL 15 support is available in all Azure regions that support Azure Cosmos DB for PostgreSQL.

You may be surprised, since it's not the norm for a managed database service to start supporting a new major PostgreSQL version that early. This post will walk you through what's going on behind the scenes that enables us to pull off such a feat. Some background before diving in:

Azure Cosmos DB for PostgreSQL is powered by native Postgres and Citus open source, and enables you to run PostgreSQL at any scale, from a single node to a large, distributed cluster. Customers can also scale out as much as they need, with many additional features. The Hyperscale (Citus) managed service recently moved into the Azure Cosmos DB family (more info on the launch of Azure Cosmos DB for PostgreSQL in this blog post) and with that introduced the option to try Azure Cosmos DB for PostgreSQL for free, where you can try out PostgreSQL 15 with Citus 11.1.

Keep reading
Thomas Munro

Reducing replication lag with IO concurrency in Postgres 15

Written by Thomas Munro | November 10, 2022

PostgreSQL 15 improves crash recovery and physical replication performance of some large and very busy databases by trying to minimise I/O stalls. A standby server might now have an easier time keeping up with the primary.

How? The change in PostgreSQL 15 is that recovery now uses the maintenance_io_concurrency setting (default is 10, but you can increase it) to decide how many concurrent I/Os to try to initiate, rather than doing random read I/Os one at a time. With big and busy databases, when I/O concurrency increases, replication lag can be reduced.
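
To get a feel for why issuing reads concurrently matters, here is a back-of-the-envelope sketch in Python. It is a toy model, not how PostgreSQL recovery is implemented: it assumes every random read takes the same latency and that reads are issued in full batches of `io_concurrency`. The function name and the example numbers are made up for illustration.

```python
import math


def estimated_read_stall_ms(num_reads: int, io_concurrency: int, read_latency_ms: float) -> float:
    """Estimate total time spent waiting on random reads under a toy batch model.

    Assumption (not from PostgreSQL itself): reads are issued in batches of
    `io_concurrency`, and each batch completes in one `read_latency_ms`.
    """
    if io_concurrency < 1:
        raise ValueError("io_concurrency must be at least 1 in this model")
    batches = math.ceil(num_reads / io_concurrency)
    return batches * read_latency_ms


# One read at a time versus ten reads in flight, for 1000 reads at 0.5 ms each.
one_at_a_time = estimated_read_stall_ms(1000, 1, 0.5)
ten_in_flight = estimated_read_stall_ms(1000, 10, 0.5)
print(one_at_a_time, ten_in_flight)
```

Real storage rarely scales this cleanly, so treat the ratio as an upper bound on the benefit rather than a prediction.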

Keep reading
Nik Larin

News: Postgres 15 available in Azure Cosmos DB for PostgreSQL

Written by Nik Larin | October 21, 2022

Big news from the Postgres and Citus team here at Microsoft! Just 1 week after PostgreSQL 15 was released, PostgreSQL 15 is generally available in the portal for the Azure Cosmos DB for PostgreSQL managed service, in all Azure regions. Whether you need to provision new clusters in Azure Cosmos DB for PostgreSQL or upgrade your existing database clusters, Postgres 15 is now a choice for you. Oh, and you can upgrade your existing cluster to Postgres 15 from any of the other supported major Postgres versions, using the in-place major version upgrade feature.

Keep reading
Melih Mutlu

How to Add More Environments to the Postgres CI

Written by Melih Mutlu | September 30, 2022

Have you ever played with Postgres source code and weren't sure if you broke anything? Postgres has quite a comprehensive regression test suite that helps to ensure that nothing is broken. You can, of course, run those tests on your machine and check if your version of Postgres works properly. But it always works on your machine, right? What about other environments?

In this blog post, you will learn how to enable and use the Postgres CI (plus how to contribute to it!) based on my experience creating my first patch to Postgres. Specifically, you’ll learn:

Keep reading
Marco Slot

Citus 11.1 shards your Postgres tables without interruption

Written by Marco Slot | September 19, 2022

Citus is a distributed database that is built entirely as an open source PostgreSQL extension. In fact, you can install it in your PostgreSQL server without changing any PostgreSQL functionality. Citus simply gives PostgreSQL additional superpowers.

Being an extension also means we can keep adding new Postgres superpowers at a high pace. In the last release (11.0), we focused on giving you the ability to query from any node, opening up Citus for many new use cases, and we also made Citus fully open source. That means you can see everything we do on the Citus GitHub page (and star the repo if you’re a fan 😊). It also means that everyone can take advantage of shard rebalancing without write-downtime.

In the latest release (11.1), our Citus database team at Microsoft improved the application experience by avoiding blocked writes during important operations like distributing tables and tenant isolation. These new capabilities build on the experience gained from developing the shard rebalancer, which uses logical replication to avoid blocking writes. In addition, we made the shard rebalancer faster and more user-friendly, and we prepared for the upcoming PostgreSQL 15 release. This post gives you a quick tour of the major changes in Citus 11.1, including:

Keep reading

A few months ago we made Citus fully open source. This was a very exciting milestone for all of us on the Citus database engine team. Contrary to what some folks say, that Postgres is a monolith that can’t scale, Postgres in fact has a fully open source solution for distributed scale, one that’s also native to Postgres. It’s called Citus! This post will go into more detail on why we open sourced our few remaining enterprise features in Citus 11, what exactly we open sourced, and finally what it took to actually open source our code. If you’re more interested in the code instead, you can find it in our GitHub repo (feel free to give the Citus project a star).

Keep reading
Samay Sharma

Debugging Postgres autovacuum problems: 13 tips

Written by Samay Sharma | July 28, 2022

If you've been running PostgreSQL for a while, you've heard about autovacuum. Yes, autovacuum, the thing which everybody asks you not to turn off, which is supposed to keep your database clean and reduce bloat automatically.

And yet—imagine this: one fine day, you see that your database size is larger than you expect, the I/O load on your database has increased, and things have slowed down without much change in workload. You begin looking into what might have happened. You run the excellent Postgres bloat query and you notice you have a lot of bloat. So you run the VACUUM command manually to clear the bloat in your Postgres database. Good!

But then you have to address the elephant in the room: why didn't Postgres autovacuum clean up the bloat in the first place...? Does the above story sound familiar? Well, you are not alone. 😊

Keep reading

We released Citus 11 in recent weeks, and it is packed. Citus went fully open source, so features that were previously enterprise-only, like the non-blocking aspect of the shard rebalancer and multi-user support, are now open source for everyone to enjoy. One other huge change in Citus 11 is that you can now query your distributed Postgres tables from any Citus node, by default.

When using Citus to distribute Postgres before Citus 11, the coordinator node was your application’s only point of contact. Your application needed to connect to the coordinator to query your distributed Postgres tables. The coordinator node can handle high query throughput, about 100K queries per second, but your application might need even more processing power. Thanks to our work in Citus 11, you can now query from any node in the Citus database cluster. In Citus 11 we sync the metadata to all nodes by default, so you can connect to any node and run queries on your tables.

Running queries from any node is awesome, but you also need to be able to monitor and manage your queries from any node. Before, when you only connected to the coordinator, using Postgres’ monitoring tools was enough, but this is not the case anymore. So in Citus 11 we added some ways to observe your queries, similar to how you would in a single Postgres instance.
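
As a rough illustration of what "connect to any node" can look like from the application side, here is a minimal Python sketch that rotates connection strings across nodes. The host names, database name, and the `make_dsn_picker` helper are hypothetical and not part of Citus. The sketch only builds libpq-style connection strings and does not open any connections.

```python
from itertools import cycle

# Hypothetical node addresses; replace with the nodes of your own cluster.
NODES = [
    "coordinator.example.internal",
    "worker-1.example.internal",
    "worker-2.example.internal",
]


def make_dsn_picker(hosts, dbname="app", port=5432):
    """Return a function that yields a connection string for the next node, round-robin."""
    if not hosts:
        raise ValueError("at least one host is required")
    rotation = cycle(hosts)

    def next_dsn():
        # Keyword/value connection string format understood by libpq-based drivers.
        return f"host={next(rotation)} port={port} dbname={dbname}"

    return next_dsn


next_dsn = make_dsn_picker(NODES)
for _ in range(4):
    print(next_dsn())
```

In practice a connection pooler or load balancer would usually take this role. The point is only that, once metadata is synced to all nodes, any of them can serve as an entry point.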

Keep reading
