Citus Data Blog

Thoughts on scaling out PostgreSQL, big data architectures, distributed systems, and the PostgreSQL community.

Citus 6.1 Released–Horizontally scale your Postgres database

Microservices and NoSQL get a lot of hype, but in many cases what you really want is a relational database that simply works and can easily scale as your application data grows. Microservices can help you split up areas of concern, but they also introduce complexity and often require heavy engineering work to migrate to. Yet there are a lot of monolithic apps out there that do need to scale. If you don’t want the added complexity of microservices, but do need to continue scaling your relational database, then you can with Citus. With Citus 6.1 we’re continuing to make scaling out your database even easier, with all the benefits of Postgres (SQL, JSONB, PostGIS, indexes, etc.) still packed in there.

With this new release, customers like Heap and Convertflow are able to scale from single-node Postgres to horizontal linear scale. Citus 6.1 brings several improvements that make scaling your multi-tenant app even easier. These include:

  • Integrated reference table support
  • Tenant Isolation
  • View support on distributed tables
  • Distributed Vacuum / Analyze

All of this with the same language bindings, clients, drivers, libraries (like ActiveRecord) that Postgres already works with.

Give Citus 6.1 a try today on Citus Cloud, our fully managed database-as-a-service on top of AWS, or read on to learn more about all that’s included in this release.

Craig Kerstiens Feb 16, 2017

Setting up your log destination on Citus Cloud

Your database is a key part of your stack, and when things act up in your application, getting insight into it is key. With Citus Cloud you have a number of dashboards with metrics you can look into, as well as centralized logging. In addition to the centralized logging, you also have the ability to drain your logs to the provider of your choice. This means you can have all your Citus Cloud logs (both the coordinator and distributed nodes) integrated with the rest of your application logs.

Craig Kerstiens Feb 13, 2017

Yubikeys and U2F make two-factor authentication easier

We’re excited to announce U2F Fido (Yubikey) support for Citus Cloud to make the experience of keeping your account and data secure even easier. Within the Account Security section of the Citus Cloud Console you’ll now see a section to add your new device. If you already have a U2F device, click Register New Device, then you’ll be prompted to activate it, and you’re done.

If you already have a Yubikey then you know all the benefits it brings. However, when testing, many of our customers were unaware of those benefits or weren’t using a Yubikey already. We felt it would be worth spending some time explaining why they’re great, as well as creating a few guides for how to set them up on the most common services you may be using.

Craig Kerstiens Feb 1, 2017

Getting started with GitHub event data on Citus

Getting an example schema and data is often one of the more time-consuming parts of testing a database. To make that easier for you, we’re going to walk through Citus with an open data set which almost any developer can relate to–GitHub event data. If you already have your own schema, data, and queries you want to test with, by all means use them. If you need any help with getting set up, join us in our Slack channel and we’ll be happy to talk through different data modeling options for your own data.

An overview of the schema and queries

The data model we’re going to work with here is simple: we have users and events. An event can be a fork or a commit related to an organization, and of course many more.

Craig Kerstiens Jan 27, 2017

Postgres Parallel indexing in Citus

Indexes are an essential tool for optimizing database performance and are becoming ever more important with big data. However, as the volume of data increases, index maintenance often becomes a write bottleneck, especially for advanced index types which use a lot of CPU time for every row that gets written. Index creation may also become prohibitively expensive, as it may take hours or even days to build a new index on terabytes of data in Postgres. As of Citus 6.0, we’ve made creating and maintaining indexes that much faster through parallelization.

Marco Slot Jan 17, 2017

Scale Out Multi-Tenant Apps based on Ruby on Rails

Today we’re happy to announce our new activerecord-multi-tenant Ruby library, which enables easy scale-out of applications that are built on top of Ruby on Rails and follow a multi-tenant data model.

This Ruby library has evolved from our experience working with customers, scaling out their multi-tenant apps, and patching some restrictions that ActiveRecord and Rails currently have when it comes to automatic query building. It is based on the excellent acts_as_tenant library, and extends it for the particular use-case of a distributed multi-tenant database like Citus.

Lukas Fittl Jan 5, 2017

Scaling out relational data models, and SQL, through co-location

Relational databases are the first choice of data store for many applications due to their enormous flexibility and reliability. Historically the one knock against relational databases is that they can only run on a single machine, which creates inherent...

Marco Slot Dec 22, 2016

Lessons learned from Postgres schema sharding

We talk with a number of Postgres users each week who are looking to scale out their database. First, we would never recommend scaling out until you truly have to; it’s always easier to scale your database up rather than out. It’s often not until you have over 100 GB of data that you need to think about sharding.

When you want to scale out, though, you want it to be simple. For scaling a multi-tenant database, there are three common approaches:

Craig Kerstiens Dec 18, 2016

Citus' Replication Model: Today and Tomorrow

Citus is a distributed database that extends (not forks) PostgreSQL. Citus does this by transparently sharding database tables across the cluster and replicating those shards.
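To make the idea of sharding a table and replicating its shards concrete, here is a minimal, generic sketch in Python. It is not how Citus itself is implemented: the CRC32 hash, the round-robin placement rule, and the names `shard_for` and `placements` are all assumptions made purely for illustration.

```python
import zlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a distribution key to a shard id.

    CRC32 is only a stand-in hash here, chosen because it is
    deterministic across runs; it is an assumption, not Citus's hash.
    """
    return zlib.crc32(key.encode("utf-8")) % shard_count

def placements(shard_id: int, nodes: list[str], replication_factor: int) -> list[str]:
    """Pick the nodes that hold copies of a shard.

    Hypothetical round-robin rule: start at the node indexed by the
    shard id and take the next `replication_factor` nodes, wrapping around.
    """
    return [nodes[(shard_id + i) % len(nodes)] for i in range(replication_factor)]

# Hypothetical three-node cluster with two copies of every shard.
nodes = ["worker-a", "worker-b", "worker-c"]
shard_id = shard_for("tenant-42", shard_count=32)
replicas = placements(shard_id, nodes, replication_factor=2)
```

Every row with the same key lands on the same shard, and each shard lives on more than one node, so losing a single node does not make the shard unavailable.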

After open sourcing Citus, one question that we frequently heard from users related to how Citus replicated data and automated node failovers. In this blog post, we intend to cover the two replication models available in Citus: statement-based and streaming replication. We also plan to describe how these models evolved over time for different use cases.

Ozgun Erdogan Dec 15, 2016

Real-time event aggregation at scale using Postgres w/ Citus

Citus is commonly used to scale out event data pipelines on top of PostgreSQL. Its ability to transparently shard data and parallelise queries over many machines makes it possible to have real-time responsiveness even with terabytes of data. Users with very high data volumes often store pre-aggregated data to avoid the cost of processing raw data at run-time. With Citus 6.0 this type of workflow became even easier using a new feature that enables pre-aggregation inside the database in a massively parallel fashion using standard SQL. For large datasets, querying pre-computed aggregation tables can be orders of magnitude faster than querying the fact table on demand.
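The pre-aggregation idea can be sketched outside the database in a few lines of Python. This is an in-memory illustration of rolling raw events up into per-minute counts, not the SQL feature the post refers to; the sample events and the names `rollup_by_minute` and `clicks_in_minute` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw events as (timestamp, event_type) pairs.
raw_events = [
    (datetime(2016, 11, 29, 10, 0, 5), "click"),
    (datetime(2016, 11, 29, 10, 0, 40), "click"),
    (datetime(2016, 11, 29, 10, 1, 10), "view"),
    (datetime(2016, 11, 29, 10, 1, 30), "click"),
]

def rollup_by_minute(events):
    """Pre-aggregate raw events into counts keyed by (minute, event_type)."""
    counts = Counter()
    for ts, event_type in events:
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        counts[(minute, event_type)] += 1
    return counts

def clicks_in_minute(rollup, minute):
    """Answer a query from the rollup without rescanning the raw events."""
    return rollup.get((minute, "click"), 0)

rollup = rollup_by_minute(raw_events)
```

Once the rollup exists, a question such as "how many clicks happened at 10:00?" becomes a single lookup rather than a scan over every raw event, which is where the speedup for large datasets comes from.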

Marco Slot Nov 29, 2016
