Blog posts tagged with 'sharding' on the Citus Blog - Page 3

Dynamically resizing your Postgres cluster with Citus' shard rebalancer

Written byBy Craig Kerstiens | April 11, 2017Apr 11, 2017

One of the most common questions we get at Citus is how the rebalancer works–which is understandable. When you have an elastic scaled out database, how easy it is to scale is a key factor into how usable it will be. While we'll happily take time for anyone that's interested and live demo it, this walk through should give you a better idea of how it works for those of you that are curious.

Keep reading

A multi-tenant sharding tutorial

Written byBy Craig Kerstiens | March 9, 2017Mar 9, 2017

A number of SaaS applications have data models where they want to have their customers interact with only their data. At the enterprise end you have companies like Salesforce and Workday that fall into this bucket, but we see a ton of small ones as well. If you're just getting started figuring out how you should approach your data so it can scale in the future, it doesn't have to be hard.

Here we're going to walk through an example data model that you can use as a basis for learning how you could apply the same to your own multi-tenant application.

Keep reading

Lessons learned from Postgres schema sharding

Written byBy Craig Kerstiens | December 18, 2016Dec 18, 2016

We talk with a number of Postgres users each week that are looking to scale out their database. First, we would never recommend scaling out until you truly have to, it’s always easier to scale your database up rather than out. It’s often not until over 100 GB of data that you need to think about sharding.

When you want to scale out though, you want it to be simple. For scaling a multi-tenant database, there’s three common approaches:

Keep reading

Citus' Replication Model: Today and Tomorrow

Written byBy Ozgun Erdogan | December 15, 2016Dec 15, 2016

Citus is a distributed database that extends (not forks) PostgreSQL. Citus does this by transparently sharding database tables across the cluster and replicating those shards.

After open sourcing Citus, one question that we frequently heard from users related to how Citus replicated data and automated node failovers. In this blog post, we intend to cover the two replication models available in Citus: statement-based and streaming replication. We also plan to describe how these models evolved over time for different use cases.

Keep reading

Designing your SaaS Database for Scale with Postgres

Written byBy Ozgun Erdogan | October 3, 2016Oct 3, 2016

If you’re building a SaaS application, you probably already have the notion of tenancy built in your data model. Typically, most information relates to tenants / customers / accounts and your database tables capture this natural relation.

With smaller amounts of data (10s of GB), it’s easy to throw more hardware at the problem and scale up your database. As these tables grow however, you need to think about ways to scale your multi-tenant database across dozens or hundreds of machines.

After our blog post on sharding a multi-tenant app with Postgres, we received a number of questions on architectural patterns for multi-tenant databases and when to use which. At a high level, developers have three options:

Keep reading

Sharding a multi-tenant app with Postgres

Written byBy Craig Kerstiens | August 10, 2016Aug 10, 2016

Whether you’re building marketing analytics, a portal for e-commerce sites, or an application to cater to schools, if you’re building an application and your customer is another business then a multi-tenant approach is the norm. The same code runs for all customers, but each customer sees their own private data set, except in some cases of holistic internal reporting.

Early in your application’s life customer data has a simple structure which evolves organically. Typically all information relates to a central customer/user/tenant table. With a smaller amount of data (10’s of GB) it’s easy to scale the application by throwing more hardware at it, but what happens when you’ve had enough success and data that you have no longer fits in memory on a single box, or you need more concurrency? You scale out, often painfully.

Keep reading

Sharding Postgres with semi-structured data and its performance implications

Written byBy Craig Kerstiens | July 25, 2016Jul 25, 2016

If you're looking at Citus its likely you've outgrown a single node database. In most cases your application is no longer performing as you’d like. In cases where your data is still under 100 GB a single Postgres instance will still work well for you, and is a great choice. At levels beyond that Citus can help, but how you model your data has a major impact on how much performance you're able to get out of the system.

Keep reading