Citus Data Blog

Thoughts on scaling out PostgreSQL, big data architectures, distributed systems, and the PostgreSQL community.

How we implement Disaster Recovery and High Availability with Postgres on Citus Cloud

AWS is the leader when it comes to the cloud, and for good reason. AWS is well ahead in the quality and breadth of services they offer.

However, when a service runs at the scale of AWS, it is natural to expect some failures to occur. According to AWS, EBS is designed for 99.999% availability.

The annual failure rate (AFR) is 0.1% - 0.2%, where failure means a complete or partial failure. For example, if you had 1,000 EBS volumes, you should expect 1 or 2 of them to fail per year. In our experience, partial failure is significantly more common than a complete loss. Even so, a partial loss can take a lot of time to resolve and can still be debilitating to a business.
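The arithmetic behind these figures can be sketched in a few lines of Python. The function names are illustrative, and the "at least one failure" estimate assumes that volumes fail independently of each other, which is a simplification rather than something AWS states.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # minutes in a non-leap year


def expected_failures(volumes: int, afr: float) -> float:
    """Expected number of volumes that fail in a year at a given AFR."""
    return volumes * afr


def prob_at_least_one_failure(volumes: int, afr: float) -> float:
    """Chance that one or more volumes fail in a year.

    Assumption: failures are independent across volumes.
    """
    return 1.0 - (1.0 - afr) ** volumes


def downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget implied by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for afr in (0.001, 0.002):
        print(afr, expected_failures(1000, afr), prob_at_least_one_failure(1000, afr))
    print(downtime_minutes_per_year(0.99999))
```

With 1,000 volumes the expected count lands on the 1 to 2 failures per year quoted above, and a 99.999% availability target leaves a budget of roughly five minutes of downtime per year.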

Over the years, some AWS failures have made news headlines because of the havoc they caused for both companies and their users. These incidents put a spotlight on AWS’ imperfections.

Daniel Farina Mar 23, 2017

A Look at Isolating Tenants To Improve Database Performance

For many SaaS products, a common database problem is one customer with so much data that it adversely impacts other customers on the shared machine. This leads many to ask, “What do I do with my largest customer?”

Tenant isolation is a great way to solve this issue. Effectively, it lets you choose a particular tenant or customer to isolate on a completely new node. By separating a tenant, you give it dedicated resources with more memory and CPU processing power.
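As a rough illustration of the idea, the sketch below models placement as a lookup: most tenants are hashed across shared nodes, while an isolated tenant is pinned to a node of its own. This is a conceptual toy, not how Citus implements tenant isolation; the `TenantRouter` class and the node names are hypothetical.

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass, field


@dataclass
class TenantRouter:
    """Toy placement map: hash tenants across shared nodes, pin isolated ones."""

    shared_nodes: list[str]
    isolated: dict[str, str] = field(default_factory=dict)

    def isolate(self, tenant_id: str, dedicated_node: str) -> None:
        """Pin one tenant to its own node; other tenants are unaffected."""
        self.isolated[tenant_id] = dedicated_node

    def node_for(self, tenant_id: str) -> str:
        """Return the node that should serve this tenant's queries."""
        if tenant_id in self.isolated:
            return self.isolated[tenant_id]
        # crc32 is stable across processes, unlike the built-in hash() for str.
        bucket = zlib.crc32(tenant_id.encode("utf-8")) % len(self.shared_nodes)
        return self.shared_nodes[bucket]


if __name__ == "__main__":
    router = TenantRouter(shared_nodes=["worker-1", "worker-2"])
    router.isolate("big-customer", "worker-3")
    print(router.node_for("big-customer"))
    print(router.node_for("small-customer"))
```

The point of the design is that isolating one tenant changes only that tenant's placement: every other tenant keeps resolving to the same shared node as before.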

Metin Doslu Mar 15, 2017

How to Scale PostgreSQL on AWS–Learnings from Citus Cloud

Citus is a distributed database that extends (not forks) PostgreSQL for large workloads. One challenge associated with building a distributed relational database (RDBMS) is that they require notable effort to deploy and operate. To remove these operational barriers, we’ve been thinking about offering Citus as a managed database for a while now.

Naturally, we were also worried that providing a native database offering on AWS could split our startup’s focus and take up significant engineering resources. (Honestly, if the founding engineers of the Heroku Postgres team hadn’t joined Citus, we might have decided to wait on this.) After having Citus Cloud publicly available for eight months, though, we are now more bullish on the cloud than ever.

It turns out that targeting an important use case for your customers and delivering it to them in a way that removes their pain points matters more than anything else. In this blog post, we’ll focus only on removing operational pain points and not on use cases: Why is the cloud changing the way databases are delivered to customers? Which AWS technologies is Citus Cloud using to enable that in a unique way?

Ozgun Erdogan Mar 10, 2017

A multi-tenant sharding tutorial

A number of SaaS applications have data models where they want to have their customers interact with only their data. At the enterprise end you have companies like Salesforce and Workday that fall into this bucket, but we see a ton of small ones as well. If you’re just getting started figuring out how you should approach your data so it can scale in the future, it doesn’t have to be hard.

Here we’re going to walk through an example data model that you can use as a basis for learning how you could apply the same to your own multi-tenant application.

Craig Kerstiens Mar 9, 2017

Citus 6.1 Released–Horizontally scale your Postgres database

Microservices and NoSQL get a lot of hype, but in many cases what you really want is a relational database that simply works and can easily scale as your application data grows. Microservices can help you split up areas of concern, but they also introduce complexity and often heavy engineering work to migrate to them. Yet there are a lot of monolithic apps out there that do need to scale. If you don’t want the added complexity of microservices but do need to continue scaling your relational database, then you can with Citus. With Citus 6.1 we’re continuing to make scaling out your database even easier with all the benefits of Postgres (SQL, JSONB, PostGIS, indexes, etc.) still packed in there.

With this new release, customers like Heap and Convertflow are able to scale from single-node Postgres to horizontal linear scale. Citus 6.1 brings several improvements, making scaling your multi-tenant app even easier. These include:

  • Integrated reference table support
  • Tenant Isolation
  • View support on distributed tables
  • Distributed Vacuum / Analyze

All of this with the same language bindings, clients, drivers, libraries (like ActiveRecord) that Postgres already works with.

Give Citus 6.1 a try today on Citus Cloud, our fully managed database-as-a-service on top of AWS, or read on to learn more about all that’s included in this release.

Craig Kerstiens Feb 16, 2017

Setting up your log destination on Citus Cloud

Your database is a key part of your stack, and when things act up in your application, getting insight into it is key. With Citus Cloud you have a number of dashboards with metrics you can look into as well as centralized logging. In addition to the centralized logging, you also have the ability to drain your logs to the provider of your choice. This means you can have all your Citus Cloud logs (both the coordinator and distributed nodes) integrated with the rest of your application logs.

Craig Kerstiens Feb 13, 2017

Yubikeys and U2F make two-factor authentication easier

We’re excited to announce FIDO U2F (Yubikey) support for Citus Cloud to make the experience of keeping your account and data secure even easier. Within the Account Security section of the Citus Cloud Console you’ll now see a section to add your new device. If you already have a U2F device, click Register New Device; you’ll then be prompted to activate it, and you’re done.

If you already have a Yubikey, then you know all the benefits it brings. However, when testing, we found that many of our customers were unaware of them or weren’t using them already. We felt it would be worth spending some time explaining why they’re great, as well as creating a few guides for how to set them up on the most common services you may be using.

Craig Kerstiens Feb 1, 2017

Getting started with GitHub event data on Citus

Getting an example schema and data is often one of the more time-consuming parts of testing a database. To make that easier for you, we’re going to walk through Citus with an open data set which almost any developer can relate to–GitHub event data. If you already have your own schema, data, and queries you want to test with, by all means use them. If you need any help with getting set up, join us in our Slack channel and we’ll be happy to talk through different data modeling options for your own data.

An overview of the schema and queries

The data model we’re going to work with here is simple: we have users and events. An event can be a fork or a commit related to an organization, among many other types.
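To make the shape of that model concrete, here is a minimal sketch using plain Python dataclasses. The field names (`login`, `event_type`, `org`) are assumptions chosen for illustration and are not taken from the actual schema used in the post.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    user_id: int
    login: str  # hypothetical field


@dataclass(frozen=True)
class Event:
    event_id: int
    user_id: int                # the user who triggered the event
    event_type: str             # e.g. "fork" or "commit"
    org: Optional[str] = None   # organization the event relates to, if any


def events_per_type(events: list[Event], user_id: int) -> Counter:
    """Count one user's events, grouped by event type."""
    return Counter(e.event_type for e in events if e.user_id == user_id)


if __name__ == "__main__":
    alice = User(user_id=1, login="alice")
    events = [
        Event(1, alice.user_id, "commit", org="example-org"),
        Event(2, alice.user_id, "fork"),
        Event(3, alice.user_id, "commit", org="example-org"),
    ]
    print(events_per_type(events, alice.user_id))
```

Every event carries the id of the user it belongs to, which is what lets queries be grouped or filtered per user.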

Craig Kerstiens Jan 27, 2017

Postgres Parallel indexing in Citus

Indexes are an essential tool for optimizing database performance and are becoming ever more important with big data. However, as the volume of data increases, index maintenance often becomes a write bottleneck, especially for advanced index types which use a lot of CPU time for every row that gets written. Index creation may also become prohibitively expensive, as it may take hours or even days to build a new index on terabytes of data in Postgres. As of Citus 6.0, we’ve made creating and maintaining indexes that much faster through parallelization.
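The general pattern behind parallelized index builds can be sketched as follows, assuming the data is already split into independent shards so that each shard's index can be built without coordinating with the others. This is a conceptual toy and not Citus internals: the function names are hypothetical, and Python threads only illustrate the structure here rather than delivering a real CPU speedup.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor

Row = tuple[int, str]            # (row id, indexed value)
Index = dict[str, list[int]]     # indexed value -> matching row ids


def build_shard_index(rows: list[Row]) -> Index:
    """Build a value -> row ids lookup for a single shard."""
    index: Index = {}
    for row_id, value in rows:
        index.setdefault(value, []).append(row_id)
    return index


def build_indexes_in_parallel(
    shards: dict[str, list[Row]], max_workers: int = 4
) -> dict[str, Index]:
    """Build one index per shard, running the per-shard builds concurrently."""
    names = list(shards)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the inputs.
        built = pool.map(build_shard_index, [shards[name] for name in names])
        return dict(zip(names, built))


if __name__ == "__main__":
    shards = {
        "shard_1": [(1, "a"), (2, "b"), (3, "a")],
        "shard_2": [(4, "b")],
    }
    print(build_indexes_in_parallel(shards))
```

Because no shard's index depends on another's, the total build time is bounded by the slowest shard rather than by the sum of all of them.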

Marco Slot Jan 17, 2017

Scale Out Multi-Tenant Apps based on Ruby on Rails

Today we’re happy to announce our new activerecord-multi-tenant Ruby library, which enables easy scale-out of applications that are built on top of Ruby on Rails and follow a multi-tenant data model.

This Ruby library has evolved from our experience working with customers, scaling out their multi-tenant apps, and patching some restrictions that ActiveRecord and Rails currently have when it comes to automatic query building. It is based on the excellent acts_as_tenant library, and extends it for the particular use case of a distributed multi-tenant database like Citus.

Lukas Fittl Jan 5, 2017
