pg_shard, Shard and scale out PostgreSQL

Written by Jason Petersen
December 4, 2014

Notice: Citus is now a successor to pg_shard. We encourage you to take a look at it if you need help with Postgres sharding.

Today we’re excited to announce pg_shard, a transparent sharding extension for PostgreSQL. Without requiring any changes to your application code, pg_shard enables your tables and queries to be distributed across any number of PostgreSQL servers.

CitusDB customers have been clamoring for this functionality for some time. Previously, CitusDB only allowed batch loads: adding new data meant creating more shards using CitusDB’s special \stage command.

We agreed it would be awesome to “just INSERT” data into a cluster, but felt the underlying problem was bigger than Citus: given the fantastic extension capabilities of PostgreSQL, where was the easy-to-use open source sharding extension? So we decided to write one.

In particular, we made sure the extension plays nicely with existing PostgreSQL installations: it preserves the full PostgreSQL feature set for local tables while adding the ability to shard and replicate tables designated as distributed. SELECT, INSERT, UPDATE, and DELETE commands to such tables are then seamlessly spread across a set of PostgreSQL servers. To ensure pg_shard is fully useable as a stand-alone extension, we even implemented cross-shard SELECT functionality.

When used with CitusDB, pg_shard defers to CitusDB’s superior distributed planner during SELECT queries, but takes over during modification commands to give our users the real-time INSERTs they’d been craving.

To get the extension, head over to pg_shard’s page on GitHub.

Got questions?

If you have questions about pg_shard, please contact us using the pg_shard-users Google Group.

If you discover an issue when using pg_shard, please submit it to pg_shard’s issue tracker on GitHub.

Jason Petersen

Written by Jason Petersen

Early team member at Citus Data. Plays piano, gardens, & samples new restaurants.