Citus Data Blog

Thoughts on scaling out PostgreSQL, sharding, multi-tenant apps, real-time analytics, and distributed databases.

Colton Shepard
By Colton Shepard
November 8, 2018

Making the PostgreSQL pgloader utility more Citus aware, a database hackathon project

This year, as part of the Citus Data annual team retreat and hackathon, we decided to get ambitious and try to create an easy, 1-click migration to the Citus distributed database. My team was lucky enough to get Dimitri Fontaine, the author of the excellent PostgreSQL pgloader utility, so we decided to start our hackathon project by building on top of pgloader and making pgloader more Citus-aware.

For those of you not familiar with Citus—Citus is an extension to PostgreSQL that transforms Postgres into a distributed database. We make Citus available as open source (star us on GitHub!), as enterprise software you can run anywhere, and as a fully-managed database as a service.

Continue reading
Craig Kerstiens
By Craig Kerstiens
October 31, 2018

Materialized views vs. Rollup tables in Postgres

Materialized views were a long awaited feature within Postgres for a number of years. They finally arrived in Postgres 9.3, though at the time were limited. In Postgres 9.3 when you refreshed materialized views it would hold a lock on the table while they were being refreshed. If your workload was extremely business hours based this could work, but if you were powering something to end-users this was a deal breaker. In Postgres 9.4 we saw Postgres achieve the ability to refresh materialized views concurrently. With this we now have fully baked materialized view support, but even still we’ve seen they may not always be the right approach.

Continue reading
Ozgun Erdogan
By Ozgun Erdogan
October 24, 2018

Why Citus Data is donating 1% equity to PostgreSQL organizations

Today, we’re excited to announce that we have donated 1% of Citus Data’s stock to the non-profit PostgreSQL organizations in the US and Europe. The United States PostgreSQL Association (PgUS) has received this stock grant. PgUS will work with PostgreSQL Europe to support the growth, education, and future innovation of Postgres both in the US and Europe.

To our knowledge, this is the first time a company has donated 1% of its equity to support the mission of an open source foundation.

To coincide with this donation, we’re also joining the Pledge 1% movement, alongside well-known technology organizations such as Atlassian, Twilio, Box, and more.

Continue reading
Craig Kerstiens
By Craig Kerstiens
October 17, 2018

Commenting your Postgres database

At Citus whether it’s looking at our own data or helping a customer debug a query I end up writing a lot of SQL. When I do write SQL I do my best to make sure it’s readable in case others need to come along and understand or modify, but admittedly I do have some bad habits from time to time such as using implicit joins. Regardless of my bad habits I still try to make my SQL and database as easy to understand for someone not already familiar with it. One of the biggest tools for that is comments.

Even early on in learning to program we take advantage of comments to explain and describe what our code is doing, even in times when it seems obvious. I see this less commonly in SQL and databases, which is a shame because data is just as valuable so making it easier to reason and work with seems logical. Postgres has a few great mechanisms you can start leveraging when it comes to commenting so you can better document things.

Continue reading
Gary Sahota
By Gary Sahota
September 27, 2018

Fun with SQL: Unions in Postgres

Before joining the Citus Data team to work on the Postgres extension that transforms Postgres into a distributed database, I checked out the Citus Data blog to learn a bit more. And found all sorts of material for Postgres beginners as well as power users, with one of my favorites being the “Fun with SQL” series. After I joined, I jumped at the chance to write a Fun with SQL post, and the only question was what to write about. I chose unions.

In SQL, the UNION and UNION ALL operators help take multiple tables and combine the results into a single table of all matching columns. This operator is extremely useful if you want the results to come back as a single set of records.

Continue reading
Dimitri Fontaine
By Dimitri Fontaine
September 25, 2018

A history lesson on SQL joins (in Postgres)

Our beloved Structured Query Language may be the lingua franca for relational databases—but like many languages, SQL is in a state of constant evolution. The first releases of SQL didn’t even have a notation for joins. At the time, SQL only supported inner joins.

Cross Joins and Where Filters

As a result, back in early eighties, the only way to express a join condition between tables would be in the WHERE clause.

Continue reading
Craig Kerstiens
By Craig Kerstiens
September 19, 2018

Use cases for followers (read replicas) in Postgres

Citus extends Postgres to be a horizontally scalable database. By horizontally scalable, we mean the data is spread across multiple machines, and you’re able to scale not only storage but also memory and compute—thus providing better performance. Without using something like Citus to transform PostgreSQL into a distributed database, sure you can add read replicas to scale, but you’re still maintaining a single copy of your data. When you run into scaling issues with your Postgres database, adding a read replica and offloading some of your traffic to your read replica is a common bandaid to slow down the bleeding, but it is only a matter of time until even that doesn’t work any further. Whereas with Citus, scaling out your database is as simple as dragging a slider and rebalancing your data.

Are read replicas still useful with horizontally scalable databases?

But that leaves a question, are read-replicas still useful? Well, sure they are.

Continue reading
Dimitri Fontaine
By Dimitri Fontaine
September 11, 2018

PostgreSQL 11 and Just In Time Compilation of Queries

PostgreSQL 11 is brewing and will be released soon. In the meantime, testing it with your own application is a great way to make sure the community catches all the remaining bugs before the dot-zero release.

One of the big changes in the next PostgreSQL release is the result of Andres Freund’s work on the query executor engine. Andres has been working on this part of the system for a while now, and in the next release we are going to see a new component in the execution engine: a JIT expression compiler!

Continue reading
Craig Kerstiens
By Craig Kerstiens
September 4, 2018

12 Factor: Dev/prod parity for your database

The twelve-factor app changed the way we build SaaS applications. Explicit dependency management, separating config from code, scaling out your app concurrently—these design principles took us from giant J2EE apps to apps that scale predictably on the web. One of these 12 factors has long stood out as a challenge when it comes to databases: dev/prod parity. Sure, you can run the exact same version of your database, and have a sandbox copy, but testing and staging with production data… that’s a different story.

Continue reading
Craig Kerstiens
By Craig Kerstiens
August 29, 2018

Postgres data types you should consider using

Postgres is a rich and powerful database. And the existence of PostgreSQL extension APIs have enabled Postgres to expand its capabilities beyond the boundaries of what you would expect in a traditional relational database. Examples of popular Postgres extensions today include HyperLogLog, which gives you approximate distincts with a small footprint—to rich geospatial support via PostGIS—to Citus which helps you scale out your Postgres database across multiple nodes to improve performance for multi-tenant SaaS applications and real-time analytics dashboards—to the built-in full text search capabilities in PostgreSQL. With all the bells and whistles you can layer into Postgres, sometimes the most basic built-ins get overlooked.

PostgreSQL has nearly 100 different data types, and these data types can come with their own tuned indexing or their own specialized functions. You probably already use the basics such as integers and text, and today we’re going to take a survey of less-used but incredibly powerful PostgreSQL data types.

Continue reading

Page 1 of 20