Citus Cloud 2, Postgres, and scaling out without sacrifice

Written by Craig Kerstiens
November 16, 2017

Update in October 2022: The Citus managed database service is now available in Azure Cosmos DB for PostgreSQL. If you’re looking to distribute PostgreSQL in the cloud (for multi-tenant SaaS apps, or real-time analytics, or time series workloads), take a look at Azure Cosmos DB for PostgreSQL. And as always, the Citus database is also available as open source: you can find the Citus repo on GitHub or download Citus here.

Relational databases have been a mainstay in applications for decades now. And with their dominance has come a rich set of tools: you have tools to help with monitoring, to gain insight into performance, and to operate the database in safe ways.

The knock against relational databases has long been: what happens when you need to scale? At that point, you usually have to make difficult trade-offs. You might give up relational semantics in order to get a database that scales out (think: NoSQL). You might find a way to reduce the amount of data you need to retain, in order to continue to skate by with a single-node relational database. Or you might give up as much as a year’s worth of application features by diverting an engineering team away from your core business to spend their time sharding the application. Bottom line: the database trade-offs to get scale can be painful.

Today I’m excited to announce Citus Cloud 2, the newest version of our cloud database. We launched the first release of Citus Cloud 18 months ago as a fully-managed database as a service that enables you to keep right on scaling your relational database. If you’re unfamiliar, Citus is an extension to Postgres that transforms your Postgres database into a distributed system under the covers, while appearing to your application as a single-node database. With Citus, you don’t need to teach your application all about distributed systems to continue scaling. We make Citus available as open source, as on-prem enterprise software, and as a fully-managed database as a service, Citus Cloud. And with Citus Cloud, you have all of your management/backups/etc. taken care of for you.

We’ll dig into this release in more detail below, but the theme for all of the improvements is allowing you to work more powerfully with your database. When my team uses gridding to make priority calls and allocate resources for a new release of Citus Cloud, the goal of empowering you as a SaaS developer is always top of mind. In Citus Cloud 2, that translates into:

  • Tooling for seamless migration from your existing single-node Postgres database
  • Powerful features that let developers and analysts alike get value from your database, without impacting the production database
  • Additional trust knowing your data is secure and safe from harm

Taking the pain out of Postgres migration with Citus Warp

At Citus, we routinely help customers migrate from their existing setup onto Citus, often when they've begun to run into a wall in terms of scaling single-node Postgres. Customers have migrated to Citus Cloud from all over, including RDS, DigitalOcean, and Google Cloud. To make migrations across clouds and into Citus even easier, we’re announcing a number of new migration tools.

For a fast-growing SaaS business, the experience you give to your customers is key. Providing stability, uptime, and features is essential to keeping your customers happy, making your business succeed, and getting enough sleep at night.

In the past, when you migrated from one database to another, you probably had to take downtime. The historical migration process was to stop new writes to the system, dump the database, then restore into the new system, and bring everything online. When your database is small (less than 100 GB of data) this dump/restore process is fast enough, but beyond 100 GB of data the downtime becomes prohibitive.

Enter the new Warp feature in Citus Cloud. Warp allows you to continue writing to your existing Postgres database by streaming all of your updates from that single-node Postgres database into your Citus Cloud cluster. By streaming all updates as they come in to your cluster, you're able to get your Citus cluster prepared over time, and when the time comes your cutover time is reduced to minutes, even for terabytes of data. SalesLoft, who recently migrated to Citus, had this to say about their experience with the migration onto Citus Cloud using Citus Warp:

"It’s difficult to express how great the Citus Warp migration utility is without slamming all the other data migration solutions in the world. But I’ll try: Citus Warp is one of the most seamless database migration tools I’ve ever used, and I’ve used some pretty expensive products from Oracle and EMC/Dell. With Citus Warp, our cutover was smooth, with almost zero downtime."

—Scott Mitchell, CTO at SalesLoft
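
To make the idea behind streaming-based migration concrete, here is a toy simulation in Python. It is not how Citus Warp is implemented: the two dicts, the in-memory change log, and the helper names are all stand-ins invented for illustration. What it shows is why cutover can be short: the bulk copy and most of the catch-up happen while the application keeps writing, so only a small lag is left to drain at the end.

```python
# Toy simulation of migrate-by-streaming; NOT the Citus Warp implementation.
# "source" and "target" are plain dicts standing in for two databases.

def apply_change(table, change):
    """Apply one (op, key, value) change to a dict acting as a table."""
    op, key, value = change
    if op == "delete":
        table.pop(key, None)
    else:  # "upsert" covers both inserts and updates
        table[key] = value


source = {1: "alice", 2: "bob"}
change_log = []  # every write to the source is also recorded here


def write(change):
    """A write from the application: applied to the source and logged."""
    apply_change(source, change)
    change_log.append(change)


# 1. Initial copy: snapshot the source and remember the log position.
target = dict(source)
applied = len(change_log)

# 2. The application keeps writing while the target catches up.
write(("upsert", 3, "carol"))
write(("upsert", 1, "alice v2"))
write(("delete", 2, None))
while applied < len(change_log) - 1:  # leave one change behind as lag
    apply_change(target, change_log[applied])
    applied += 1

# 3. Cutover: pause writes, drain the small remaining lag, then switch.
remaining = len(change_log) - applied
while applied < len(change_log):
    apply_change(target, change_log[applied])
    applied += 1

print(remaining, target == source)  # 1 True
```

The work done during the write pause in step 3 depends on the lag, not on the total size of the database, which is the property that matters for large migrations.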

Reducing the application changes you need to make

One part of migration to a scale-out database is the data migration itself; the other part is the lead-up work to ensure your SaaS app will work as expected. This migration process typically entails:

  1. Updating your models and primary keys to include your tenant_id
  2. Ensuring queries are scoped to tenant
  3. Profit

To ease your application migration process for steps 1 and 2 above, we have you covered. For Ruby on Rails applications you can leverage the activerecord-multi-tenant gem, and in Python for Django you can use django-multitenant. Both of these libraries take the heavy lifting out of updating your SaaS app. With either one, all you need to do is add the library to your application, update the scope of your models, and then jump straight to the scaling part.
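
If you want to see what steps 1 and 2 amount to at the SQL level, here is a minimal sketch. It uses Python's built-in sqlite3 module purely so it runs anywhere; the orders table and the orders_for_tenant helper are hypothetical, and this is not the API of either library mentioned above. Those libraries apply the same idea inside the ORM so you don't write the filter by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: the tenant id becomes part of the primary key.
conn.execute("""
    CREATE TABLE orders (
        tenant_id INTEGER NOT NULL,
        id        INTEGER NOT NULL,
        total     REAL    NOT NULL,
        PRIMARY KEY (tenant_id, id)
    )
""")
# Note that order id 1 exists for both tenants; the composite key allows it.
conn.executemany(
    "INSERT INTO orders (tenant_id, id, total) VALUES (?, ?, ?)",
    [(1, 1, 10.0), (1, 2, 25.0), (2, 1, 99.0)],
)


def orders_for_tenant(conn, tenant_id):
    """Step 2: every query carries a tenant_id filter."""
    rows = conn.execute(
        "SELECT id, total FROM orders WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return rows.fetchall()


print(orders_for_tenant(conn, 1))  # [(1, 10.0), (2, 25.0)]
print(orders_for_tenant(conn, 2))  # [(1, 99.0)]
```

Because every query names a single tenant, a distributed database can route it to the one node holding that tenant's rows, which is what makes this schema shape scale out cleanly.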

Get stuff done with your database without risking production—with forks, followers, and PITR

Relational databases have decades of tooling built up around them to make them friendly to work with. That ranges from tools for BI to well-defined strategies for backup/testing. A common trade-off with distributed systems is that you have to give up many of the benefits of relational databases in order to scale out. And that hurts.

With our Citus Cloud 2 release, we're changing all that and are giving you the ability to work more flexibly than ever before with a distributed database. You can now scale out your relational database, without having to contort your SaaS application. Need a full copy of your production database to test a schema migration? Fork your Citus cluster. Need a read-only replicated copy of your data for your business analyst? Create a follower database. Have a customer that purged some data and now wants to recover it? Use our point-in-time recovery (PITR) feature to create a point-in-time version of your Citus Cloud cluster and go back to the exact moment before your customer accidentally wiped data, so they can recover it.

Scale out when you need to with our zero-downtime shard rebalancer

One of the most important considerations in picking a database is how it will scale. With Citus Cloud, our shard rebalancer is getting some great upgrades. We’ve updated our rebalancer to leverage one of the newest features in Postgres 10: logical replication. When you add nodes to your Citus cluster and scale out from say 4 nodes to 6 nodes, under the covers we gradually stream new writes, updates, and deletes to your newly added nodes as they come in. As your transactions get caught up on the new nodes, we then update the Citus metadata under the covers, flipping them as live and turning the new nodes into primary nodes. So now with Citus Cloud 2, adding nodes to your database cluster is as simple as dragging a slider, and there is zero downtime for your app as you run the shard rebalancer.

To top it all off, you can follow along on the shard rebalancer’s progress, live in the Citus Cloud console:

[Image: Citus shard rebalancer visualization]
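
For a feel of the planning side of scaling from 4 nodes to 6, here is a small Python sketch that decides which shards to move so that every node ends up with a similar number of shards. It is an assumption-laden illustration, not the Citus shard rebalancer's actual algorithm: the plan_moves function, the node names, and the "balance by shard count" rule are all made up for this example, and the real data movement (streaming changes via logical replication, then updating metadata) is not modeled at all.

```python
from collections import Counter


def plan_moves(placements, nodes):
    """Plan shard moves that even out shard counts across nodes.

    placements: dict of shard id -> node name (the input is not modified)
    nodes: all node names, including newly added, empty ones
    Returns a list of (shard, from_node, to_node) tuples.
    """
    placements = dict(placements)
    counts = Counter({node: 0 for node in nodes})
    counts.update(placements.values())
    moves = []
    while True:
        busiest = max(nodes, key=lambda n: counts[n])
        idlest = min(nodes, key=lambda n: counts[n])
        if counts[busiest] - counts[idlest] <= 1:
            return moves  # as balanced as whole shards allow
        # Move the lowest-numbered shard off the busiest node.
        shard = next(s for s, n in sorted(placements.items()) if n == busiest)
        placements[shard] = idlest
        counts[busiest] -= 1
        counts[idlest] += 1
        moves.append((shard, busiest, idlest))


# 12 shards spread over 4 nodes, then 2 empty nodes are added.
nodes = [f"node{i}" for i in range(1, 7)]
placements = {shard: nodes[shard % 4] for shard in range(12)}
moves = plan_moves(placements, nodes)
for move in moves:
    print(move)
```

In this example four shards move, one from each original node, and all six nodes end up holding two shards each. Only the moved shards are touched; everything else stays where it was.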

Citus Security Suite

Ease of migration and scaling are key, but we'd be remiss if we didn't talk about security. As database folk, we view data as the most sacred portion of your application. With the Citus security suite, you can rest easy at night knowing your database is secure. Citus Cloud ensures all of your data is encrypted at rest and in transit. We provide a certificate for you to verify your connection to the database and confirm you’ve not been victim to a MITM attack.
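
On the client side, verifying the server certificate comes down to two standard libpq connection parameters: sslmode=verify-full, which checks both the certificate chain and the host name, and sslrootcert, which points at the CA certificate to trust. The sketch below just builds such a connection string in Python. The host name, database name, user, and certificate path are placeholders, not real Citus Cloud values, and build_dsn is a hypothetical helper.

```python
def quote(value):
    """Quote a value for a libpq keyword/value connection string."""
    if value == "" or any(ch in value for ch in " '\\"):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    return value


def build_dsn(host, dbname, user, ca_cert_path, port=5432):
    """Build a connection string that requires a verified TLS connection."""
    params = {
        "host": host,
        "port": str(port),
        "dbname": dbname,
        "user": user,
        "sslmode": "verify-full",     # verify certificate chain and host name
        "sslrootcert": ca_cert_path,  # CA certificate to trust (placeholder path)
    }
    return " ".join(f"{key}={quote(value)}" for key, value in params.items())


print(build_dsn("db.example.com", "app", "app_user", "/etc/ssl/citus-ca.crt"))
```

The resulting string can be handed to any libpq-based driver. With verify-full in place, a connection to a server that cannot present a matching certificate fails instead of silently proceeding.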

With Citus Cloud 2, you can now set up VPC peering within AWS and peer your fully-managed Citus Cloud cluster directly to your existing AWS infrastructure. Want to restrict your database fully from the internet? We allow you to set up network restrictions so your database is as open or as closed to the rest of the internet as you desire.

Even better, we continually review all new security releases for the underlying operating system, as well as for Postgres. If you’re affected, we’ll take care of getting you patched, while you get to rest easy and keep focusing on your app and features.

Quit worrying about your database, and get back to building your application

With Citus Cloud 2, we’re giving you a cloud database that you enjoy working with, not one you’re afraid to touch. We know that scaling out relational databases has historically been hard, but we’ve made it so much simpler to scale out. If you have questions about how Citus Cloud could help you stop worrying about your database and help you scale your SaaS app into the future, then just let us know. I’d be happy to chat and answer any questions you have.

Craig Kerstiens

Former Head of Cloud at Citus Data. Ran product at Heroku Postgres. Countless conference talks on Postgres & Citus. Loves bbq and football.