There are a number of data architectures you could use when building a multi-tenant app. Some, such as using one database per customer or one schema per customer, have trade-offs when it comes to larger scale. The other option is to build the notion of tenancy directly into the logic of your SaaS application. With django-multitenant and Citus, built-in tenancy becomes much easier to put in place for your application without having to re-invent the wheel yourself.
Our django-multitenant Python library enables easy scale-out of applications that are built on top of Django and follow a multi-tenant data model. This Python library has evolved from our experience working with SaaS customers, scaling out their multi-tenant apps.
There are a lot of things that are everyday occurrences for engineering teams: deploying new code, deploying a new service. It’s even fairly common to deploy a net new data store or language. But migrating from one database to another is far rarer. While migrating your database can seem like a daunting task, there are lessons you can learn from others, and steps you can take to minimize risk in migrating from one database to another.
At Citus Data, we’ve helped many customers migrate from single-node Postgres, like RDS or Heroku Postgres, to a distributed Citus database cluster, so they can scale out and take advantage of the compute, memory, and disk resources of a distributed, scale-out solution. So we’ve been privy to some valuable lessons learned, and we’ve developed some best practices. Here is a guide to the steps to follow as you start to create your migration plan to Citus.
When it comes to scaling your database, there are challenges, but the good news is that you have options. The easiest option, of course, is to scale up your hardware. And when you hit the ceiling on scaling up, you have a few more choices: sharding, deleting swaths of data that you think you might not need in the future, or trying to shrink the problem with microservices.
Deleting portions of your data is simple, if you can afford to do it. Regarding sharding, there are a number of approaches, and which one is right depends on a number of factors. Here we’ll survey five sharding approaches and dig into the factors that guide you to each one.
“Your father’s lightsaber. This is the weapon of a Jedi Knight. Not as clumsy or random as a blaster. An elegant weapon, for a more civilized age.”
—Obi-Wan Kenobi, Star Wars Episode IV: A New Hope
Announcing the release of Citus 6.2
Today I’m happy to announce that we’ve rolled out a new version of our database, Citus 6.2. Because as most of you know, good software never stops evolving. Nor should it. If you want the scoop on the new capabilities in Citus 6.2, just scroll ahead. But before diving in, I need to explain the lightsaber pic. Why? Because usually a picture speaks a thousand words, but sometimes it needs an annotation. :-)
When my colleagues first started on their journey to build Citus, they had a vision of combining the best aspects of relational databases with the elastic scale of NoSQL—to give developers a database that delivers SQL capabilities, at scale.
But vision alone does not make a successful company. The Citus co-founders needed a mix of key ingredients: the right team, good timing, good execution, a willingness to experiment and learn, plus (of course) a good idea.
When George Lucas described his days before the first Star Wars film, he said he was “searching for just the right ingredients, characters and storyline.” In Lucas’s search for the right mix, he too had to iterate: he wrote four different screenplays before landing on the final version of the original film!
Because our CTO is such a big fan of Star Wars, Ozgun sometimes talks about his vision for Citus in the language of the Jedi: Ozgun has said his aim for Citus was “to create a database as elegant and as powerful as a lightsaber.” Now, I’m more of a Stranger Things fan myself (after all, mornings are for coffee and contemplation) but I get Ozgun’s desire to create a database that gives you the benefits of SQL—at scale.
Distributed databases often require you to give up SQL and ACID transactions as a trade-off for scale. Citus is a different kind of distributed database. As an extension to PostgreSQL, Citus can leverage PostgreSQL’s internal logic to distribute more sophisticated data models. If you’re building a multi-tenant application, Citus can transparently scale out the underlying database in a way that allows you to keep using advanced SQL queries and transaction blocks.
In multi-tenant applications, most data and queries are specific to a particular tenant. If all tables have a tenant ID column and are distributed by this column, and all queries filter by tenant ID, then Citus supports the full SQL functionality of PostgreSQL—including complex joins and transaction blocks—by transparently delegating each query to the node that stores the tenant’s data. This means that with Citus, you don’t lose any of the functionality or transactional guarantees that you are used to in PostgreSQL, even though your database has been transparently scaled out across many servers. In addition, you can manage your distributed database through parallel DDL, tenant isolation, high performance data loading, and cross-tenant queries.
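To make the routing idea above concrete, here is a toy sketch in Python. It is not how Citus is implemented: the worker names, the shard count, the CRC32 hash, and the round-robin shard placement are all assumptions chosen for illustration. It only shows why a query that filters on a single tenant ID can be handed, whole, to one node.

```python
import zlib

# Hypothetical worker names; a real cluster has its own node list.
WORKERS = ["worker-1", "worker-2", "worker-3"]
# Assumed shard count, picked only for this illustration.
SHARD_COUNT = 32


def shard_for_tenant(tenant_id):
    """Map a tenant ID to a shard using a stable hash (CRC32, an arbitrary choice)."""
    return zlib.crc32(str(tenant_id).encode("utf-8")) % SHARD_COUNT


def worker_for_tenant(tenant_id):
    """Return the worker holding the tenant's shard (shards placed round-robin)."""
    return WORKERS[shard_for_tenant(tenant_id) % len(WORKERS)]


def route(tenant_id, sql):
    """Return (worker, sql): the unmodified query goes to a single node."""
    return worker_for_tenant(tenant_id), sql
```

Because every table is keyed by the same tenant ID, all of a tenant's rows hash to the same place, so a call such as `route(42, "SELECT ... WHERE tenant_id = 42")` always names the same worker, and joins or transaction blocks for that tenant can run locally on it.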
For many SaaS products, a common database problem is having one customer with so much data that it adversely impacts other customers on the shared machine. This leads many to ask, “What do I do with my largest customer?”
Tenant isolation is a great way to solve this issue. Effectively, it lets you choose a particular tenant or customer to isolate on a completely new node. By separating a tenant, you give it dedicated resources with more memory and CPU processing power.
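As a rough illustration of the idea, the Python sketch below models tenant placement as a hash-based default plus a per-tenant override. The class name, node names, and hashing scheme are hypothetical and do not reflect the Citus API. The sketch only shows the bookkeeping: isolating a tenant changes where that one tenant lives and leaves everyone else in place.

```python
import zlib


class TenantPlacement:
    """Toy placement map: tenants share nodes unless explicitly isolated."""

    def __init__(self, shared_nodes):
        self.shared_nodes = list(shared_nodes)
        self.isolated = {}  # tenant_id -> dedicated node name

    def node_for(self, tenant_id):
        # An isolated tenant always resolves to its dedicated node.
        if tenant_id in self.isolated:
            return self.isolated[tenant_id]
        # Everyone else is spread over the shared nodes by a stable hash.
        digest = zlib.crc32(str(tenant_id).encode("utf-8"))
        return self.shared_nodes[digest % len(self.shared_nodes)]

    def isolate(self, tenant_id, dedicated_node):
        # Record the override; moving the data itself is out of scope here.
        self.isolated[tenant_id] = dedicated_node
```

For example, after `placement.isolate(1001, "node-big")`, lookups for tenant 1001 return `"node-big"`, while all other tenants keep resolving to the shared nodes they already used.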
A number of SaaS applications have data models where they want to have their customers interact with only their data. At the enterprise end you have companies like Salesforce and Workday that fall into this bucket, but we see a ton of small ones as well. If you’re just getting started figuring out how you should approach your data so it can scale in the future, it doesn’t have to be hard.
Here we’re going to walk through an example data model that you can use as a basis for learning how you could apply the same to your own multi-tenant application.
Today we’re happy to announce our new activerecord-multi-tenant Ruby library, which enables easy scale-out of applications that are built on top of Ruby on Rails and follow a multi-tenant data model.
This Ruby library has evolved from our experience working with customers, scaling out their multi-tenant apps, and patching some restrictions that ActiveRecord and Rails currently have when it comes to automatic query building. It is based on the excellent acts_as_tenant library, and extends it for the particular use-case of a distributed multi-tenant database like Citus.
Written by Marco Slot | December 22, 2016
Relational databases are the first choice of data store for many applications due to their enormous flexibility and reliability. Historically the one knock against relational databases is that they can only run on a single machine, which creates inherent...
Citus is a distributed database that extends (not forks) PostgreSQL. Citus does this by transparently sharding database tables across the cluster and replicating those shards.
After open sourcing Citus, one question that we frequently heard from users related to how Citus replicated data and automated node failovers. In this blog post, we intend to cover the two replication models available in Citus: statement-based and streaming replication. We also plan to describe how these models evolved over time for different use cases.