Announcing Citus MX: Scaling Postgres to over 500k writes per second

Marco Slot Sep 22, 2016

About Citus

Citus is a distributed database that scales out PostgreSQL. An extension to Postgres, Citus is available as open source, as on-prem software, and as a fully-managed database as a service.

Sign up for our newsletter

Enjoy what you're reading? Sign-up to our newsletter to stay informed:

Other Recent Posts

Migrating from single-node Postgres to Citus How Citus works (a look at dynamic executors) Citus 7: Transactions, Framework Integration, and Postgres 10 More Articles

Like our blog, or have a question about Citus? Join us on Slack for a chat :)

Today we’re excited to announce the private beta of Citus MX. Citus MX builds on the Citus extension for PostgreSQL, which allows you to scale out PostgreSQL tables across many servers. Citus MX gives you the ability to write to or query distributed tables from any node, which allows you to horizontally scale out your write-throughput using PostgreSQL. It also removes the need to interact with a primary node in a Citus cluster.

We’ve performed over 500k durable writes per second (using YCSB) on a 32 node Citus Cloud cluster with our regular PostgreSQL settings. We’ve also exceeded ingest rates of 7 million records per second using batch COPY. Watch the video to see it in action. If you’re curious to learn more, read on or to get access, sign up below.

Scaling your writes

With current editions of Citus it’s already simple to scale out PostgreSQL tables. Once your data is distributed across many nodes, you have more memory and parallel processing power available to help scale your read workloads. As you query through the leader, Citus routes your query to the appropriate node, pushing down the heavy lifting of lookups and aggregations to data nodes. Whether you’re performing real-time analytics across terabytes of data or looking to scale out your transactions this is straight-forward today.

Now those that need even higher write volumes, can achieve it through MX. Under the covers Citus MX keeps all the metadata about which data lives where in sync across the cluster. We expose to you a single URL which pools all of your nodes together so each connection goes to a random node. This means that you can have a higher number of connections and each connection you make is helping you scale your writes.

Improved availability

One of the favorite features of Citus according to our customers is improving uptime through redundancy. To date though you still had to manage a failover setup for your leader node, and in the event that it failed, all writes and reads would fail until it was brought back online. While high availability reduces the length of downtime, should a node fail your entire cluster would still be unavailable for this minute while the failover was happening. With Citus MX as long as a node is available, the data on it can be read or written even if other nodes are down.

How it works

In Citus MX you can access your database in one of two ways: Either through the coordinator which allows you to create or make changes to distributed tables, or the data URL which routes you to one of the data nodes on which you can perform regular queries on the distributed tables. These are also the nodes that hold the shards, the regular PostgreSQL tables in which the actual data is stored. When you perform a query on a distributed table, the command is either forwarded to the right shard based on filter conditions, or parallelised across all the shards.

Streaming replication

One key part that enables Citus MX is our usage of Postgres streaming replication. Citus has a built-in replication mechanism in which the leader node sends writes to all replicas. However, this requires that the leader is involved in every write to ensure linearizability. Fortunately, streaming replication provides us with an alternative that we use in Citus MX: Let PostgreSQL handle the replication, removing the need for a single leader. We also make sure that MX automatically sets up streaming replicas and handles node fail-overs for you.

By having hot standbys for all nodes and putting the distributed tables on each of them, the impact of a single node failure is small and can be quickly recovered. Moreover, streaming replication is more efficient at handling certain types of writes like updates. Last, your application talks to a single endpoint and doesn’t require logic to handle load balancing and node fail-over.

Metadata makes it work

The major change we made in Citus MX is that the distributed tables and the metadata about where the data is located are propagated to all nodes. Since all the nodes have the Citus extension, they were already able to run distributed queries, but did not have the necessary metadata.

We’ve taken special care to ensure that metadata-changing commands like ALTER TABLE and moving shards to new nodes can be safely executed at any time. By using PostgreSQL’s built-in two-phase commit (2PC) mechanism for metadata changes, we avoid the risk of nodes ever having outdated metadata since it guarantees that all nodes apply the same changes. We’ve even covered very exceptional failure scenarios by adding a recovery mechanism for 2PCs that fail to complete at the last moment. These techniques ensure that the database remains reliable and consistent even when making changes to the cluster or when experiencing failures.

Availability

Citus MX is available on Citus Cloud, which you can sign up and get started today. We’re currently allowing private beta access to select customers, if you’re interested in access please sign up below.



You can also find us in the Citus Slack channel, for any questions you might have.

← Next article Previous article →