Blog posts tagged with 'distributed Postgres' on the Citus Blog

PgBouncer Connection Pooler for Postgres Now Supports More Session Vars

Written byBy Emel Simsek | April 4, 2024Apr 4, 2024

PgBouncer is probably the most popular connection pooler for Postgres. It is essentially a transparant middleware between clients and the server. However, it is not %100 transparent in practice. There are a few intricacies that should be taken into account when using PgBouncer. One such consideration is that PgBouncer does not support the use of all session variables in transaction pooling mode. This lack of support is one of the reasons that the most commonly used transaction pooling mode is not fully compatible with Postgres. PgBouncer 1.20.0 started supporting two of the most requested session variables and laid ground work to be able to support all session variables in the future. Let’s break this down further.

The Impact of Pooling Mode on Postgres Compatibility

A connection pooler between the client and the server should ideally be completely transparent such that your application doesn’t have to be aware of the presence of a connection pooler. This is not the case with PgBouncer for all the pooling modes.

There are three different connection pooling modes:

Session
Transaction
Statement

The pooling mode determines how the connections in the pool are assigned to the clients. The way this is done may impose some limitations impacting Postgres compatibility.

In session pooling mode, there is a one-to-one mapping between the client and the pooled server connection throughout the lifetime of the client connection.

Hence session pooling mode is the most compatible mode with Postgres. The clients can use session variables which are Postgres parameters whose values can be set per session.

Keep reading

New Citus Technical README for distributed PostgreSQL

Written byBy Onder Kalaci | October 19, 2023Oct 19, 2023

Want to dive right in? Check out this brand new Citus Technical README on our GitHub repo for the open source Citus database extension.

This README is a bit of a deep dive into the Citus concepts, the use cases and design principles, how Citus uses the PostgreSQL extension hooks to distribute Postgres, and more.

Keep reading

Adding Postgres 16 support to Citus 12.1, plus schema-based sharding improvements

Written byBy Naisila Puka | September 22, 2023Sep 22, 2023

The new PostgreSQL 16 release is out, packed with exciting improvements and features—and Citus 12.1 brings them to you at scale, within just one week of the PG16 release.

As many of you likely know, Citus is an open source PostgreSQL extension that turns Postgres into a distributed database. Our team started integrating Citus with the PG16 beta and release candidates early-on, so that you could have a new Citus 12.1 release that is compatible with Postgres 16 as quickly as possible after PG16 came out.

There are a lot of good reasons to upgrade to Postgres 16—huge thanks to everyone who contributed into this Postgres release! PG16 highlights include query performance boost with more parallelism; load balancing with multiple hosts in libpq (contributed by my Citus teammate, Jelte Fennema-Nio); I/O monitoring with pg_stat_io; developer experience enhancements; finer-grained options for access control; logical replication from standby servers and other replication improvements, like using btree indexes in the absence of a primary key (contributed by one of my teammates, Onder Kalaci.)

The good news for those of you who care about distributed Postgres: Citus 12.1 is now available and adds support for Postgres 16.

In addition to Postgres 16 support, Citus 12.1 includes enhancements to schema-based sharding, which was recently added to Citus 12.0—and is super useful for multi-tenant SaaS applications.

Keep reading

What’s new with Postgres at Microsoft (August 2023)

Written byBy Claire Giordano | August 31, 2023Aug 31, 2023

On one of the Postgres community chat forums, a friend asked me: "Is there a blog post that outlines all the work that is being done on Postgres at Microsoft? It's hard to keep track these days."

And my friend is right: it is hard to keep track. Probably because there are multiple Postgres workstreams at Microsoft, spread across a few different teams.

In this post, you'll get a bird's eye view of all the Postgres work the Microsoft team has done over the last year. Our work includes some pretty significant improvements to the Postgres managed services on Azure, as well as contributions across the entire open source ecosystem—including commits to the Postgres core; new releases to Postgres open source extensions like Citus and pg_cron; plus ecosystem work on Patroni, PgBouncer, pgcopydb. And more.

Keep reading

Understanding partitioning and sharding in Postgres and Citus

Written byBy Claire Giordano | August 4, 2023Aug 4, 2023

The topic of this month's PGSQL Phriday #011 community blogging event is partitioning vs. sharding in PostgreSQL. It seemed right to share a perspective on the question of "partitioning vs. sharding" from someone in the Citus open source team, since we eat, sleep, and breathe sharding for Postgres.

Postgres built-in "native" partitioning—and sharding via PG extensions like Citus—are both tools to grow your Postgres database, scale your application, and improve your application's performance.

What is partitioning and what is sharding? In Postgres, database partitioning and sharding are techniques for splitting collections of data into smaller sets, so the database only needs to process smaller chunks of data at a time. And as you might imagine, work gets done faster when you're processing less data.

In this post, you'll learn what partitioning and sharding are, why they matter, and when to use them. The table of contents:

Keep reading

Schema-based sharding comes to PostgreSQL with Citus

Written byBy Onur Tirtir | July 31, 2023Jul 31, 2023

Citus, a database scaling extension for PostgreSQL, is known for its ability to shard data tables and efficiently distribute workloads across multiple nodes. With Citus 12.0, Citus introduces a very exciting feature called schema-based sharding. The new schema-based sharding feature gives you a choice of how to distribute your data across a cluster, and for some data models (think: multi-tenant apps, microservices, etc.) this schema-based sharding approach may be significantly easier!

In this blog post, we will take a deep dive into the new schema-based sharding feature, and you will learn:

Keep reading

How Citus supports the PostgreSQL MERGE command, as of Citus 12.0

Written byBy Teja Mupparti | July 27, 2023Jul 27, 2023

Postgres community released a new feature, in Postgres 15.0, that performs actions to modify rows in the target table, using the data from a source. MERGE provides a single SQL statement that can conditionally INSERT, UPDATE or DELETE rows, a task that would otherwise require multiple procedural language statements, using INSERT with ON CONFLICT clause etc.

In this blog post, you will learn a high-level overview of the functioning of Postgres MERGE. It will delve into some of the practical use-cases, and subsequently elaborate on the different strategies employed by Citus for handling MERGE in a distributed environment.

Keep reading

Citus 12: Schema-based sharding for PostgreSQL

Written byBy Marco Slot | July 18, 2023Jul 18, 2023

What if you could automatically shard your PostgreSQL database across any number of servers and get industry-leading performance at scale without any special data modelling steps?

Our latest Citus open source release, Citus 12, adds a new and easy way to transparently scale your Postgres database: Schema-based sharding, where the database is transparently sharded by schema name.

Schema-based sharding gives an easy path for scaling out several important classes of applications that can divide their data across schemas:

Multi-tenant SaaS applications
Microservices that use the same database
Vertical partitioning by groups of tables

Each of these scenarios can now be enabled on Citus using regular CREATE SCHEMA commands. That way, many existing applications and libraries (e.g. django-tenants) can scale out without any changes, and developing new applications can be much easier. Moreover, you keep all the other benefits of Citus, including distributed transactions, reference tables, rebalancing, and more.

Keep reading

Distributed PostgreSQL benchmarks using HammerDB, by GigaOM

Written byBy Marco Slot | June 21, 2023Jun 21, 2023

Distributed PostgreSQL has become a hot topic. Several distributed database vendors have added support for the PostgreSQL protocol as a convenient way to gain access to the PostgreSQL ecosystem. Others (like us) have built a distributed database on top of PostgreSQL itself.

For the Citus database team, distributed PostgreSQL is primarily about achieving high performance at scale. The unique thing about Citus, the technology powering Azure Cosmos DB for PostgreSQL, is that it is fully implemented as an open-source extension to PostgreSQL. It also leans entirely on PostgreSQL for storage, indexing, low-level query planning and execution, and various performance features. As such, Citus inherits the performance characteristics of a single PostgreSQL server but applies them at scale.

That all sounds good in theory, but to see whether this holds up in practice, you need benchmark numbers. We therefore asked GigaOM to run performance benchmarks comparing Azure Cosmos DB for PostgreSQL to other distributed implementations. GigaOM compared the transaction performance and price-performance of these popular managed services of distributed PostgreSQL, using the HammerDB benchmark software:

Keep reading

Tenant monitoring in Citus & Postgres with citus_stat_tenants

Written byBy Halil Ozan Akgul | May 12, 2023May 12, 2023

If you have ever used a database like Postgres, you know how important optimization is. Some minor changes in how the database is setup make all the difference between long waiting times and satisfied customers. And one crucial thing you need before doing the optimization is to monitor and understand how your database is being used.

Citus is an extension to Postgres that improves scalability and parallelization by distributing your Postgres database across nodes in a cluster. The Citus database extension is available as open source and as a managed service on the cloud, as Azure Cosmos DB for PostgreSQL. You can track your Citus nodes and the Postgres tables, but Citus 11.3 takes it one step further and introduces a new way to gather insight on your Citus database with tenant monitoring.

The new Citus 11.3 release, among many other features, introduces a new citus_stat_tenants view to track your most active tenants, for those with multi-tenant SaaS applications.

Keep reading