Citus Data Blog

Thoughts on scaling out PostgreSQL, sharding, multi-tenant apps, real-time analytics, and distributed databases.

Craig Kerstiens
By Craig Kerstiens
January 15, 2019

Contributing to Postgres

About once a month I get this question: “How do I contribute to Postgres?”. PostgreSQL is a great database with a solid code base and for many of us, contributing back to open source is a worthwhile cause. The thing about contributing back to Postgres is you generally don’t just jump right in and commit code on day one. So figuring out where to start can be a bit overwhelming. If you’re considering getting more involved with Postgres, here’s a few tips that you may find helpful.

Continue reading
Claire Giordano
By Claire Giordano
January 13, 2019

10 Most Popular Citus Data Blog Posts in 2018, ft. Postgres

Seasons each have a different feel, a different rhythm. Temperature, weather, sunlight, and traditions—they all vary by season. For me, summer usually includes a beach vacation. And winter brings the smell of hot apple cider on the stove, days in the mountains hoping for the next good snowstorm—and New Year’s resolutions. Somehow January is the time to pause and reflect on the accomplishments of the past year, to take stock in what worked, and what didn’t. And of course there are the TOP TEN LISTS.

Spoiler alert, yes, this is a Top 10 list. If you’re a regular on the Citus Data blog, you know our Citus database engineers love PostgreSQL. And one of the open source responsibilities we take seriously is the importance of sharing learnings, how-to’s, and expertise. One way we share learnings is by giving lots of conference talks (seems like I have to update our Events page every week with new events.) And another way we share our learnings is with our blog.

So just in case you missed any of our best posts from last year, here is the TOP TEN list of the most popular Citus Data blogs published in 2018. Enjoy.

Continue reading
Craig Kerstiens
By Craig Kerstiens
January 2, 2019

Fun with SQL: Self joins

Various families have various traditions in the US around Christmas time. Some will play games like white elephant where you get a mix of decent gifts as well as gag gifts… you then draw numbers and get to pick from existing presents that have been opened (“stealing” from someone else) or opening an up-opened one. The game is both entertaining to try to get something you want, but also stick Aunt Jennifer with the stuffed poop emoji with a Santa hat on it.

Other traditions are a bit simpler, one that my partner’s family follows is drawing names for one person you buy a gift for. This is nice because you can put a bit of effort into that one person without having to be too overwhelmed in tracking down things for multiple people. Each year we draw names for the next year. And by now you’re probably thinking what does any of this have to do with SQL? Well normally when we draw names we write them on a piece of paper, someone takes a picture, then that gets texted around to other family members. At least for me every October I’m scrolling back through text messages to try to recall who it was I’m supposed to buy for. This year I took a little time to put everyone’s name in a SQL database and write a simple query for easier recall.

Continue reading
Claire Giordano
By Claire Giordano
December 27, 2018

The perks of sharing your Citus open source stories

Most of us who work with open source like working with open source. You get to build on what’s already been built, and you get to focus on inventing new solutions to new problems instead of reinventing the wheel on each project. Plus you get to share your work publicly (which can improve the state of the art in the industry) and you get feedback from developers outside your company. Hiring managers give it a +1 too, since sharing your code will sometimes trigger outside interest in what you’re doing and can be a big boon for recruiting. After all “smart people like to hang out with smart people”.

Continue reading
Will Leinweber
By Will Leinweber
December 14, 2018

\watch ing Star Wars in Postgres

I recently had the honor of speaking at the last Keep Ruby Weird. A good part of the talk dealt with Postgres and since Citus Data is not only a database company but also a Postgres company, I figured sharing those parts on the Citus Data blog would be a good idea. If you’d like to see it in talk form, or you’d also like to know how to watch movies rendered as emojis in your terminal, I encourge you to watch the talk.

Continue reading
Marco Slot
By Marco Slot
November 30, 2018

Why the RDBMS is the future of distributed databases, ft. Postgres and Citus

Around 10 years ago I joined Amazon Web Services and that’s where I first saw the importance of trade-offs in distributed systems. In university I had already learned about the trade-offs between consistency and availability (the CAP theorem), but in practice the spectrum goes a lot deeper than that. Any design decision may involve trade-offs between latency, concurrency, scalability, durability, maintainability, functionality, operational simplicity, and other aspects of the system—and those trade-offs have meaningful impact on the features and user experience of the application, and even on the effectiveness of the business itself.

Perhaps the most challenging problem in distributed systems, in which the need for trade-offs is most apparent, is building a distributed database. When applications began to require databases that could scale across many servers, database developers began to make extreme trade-offs. In order to achieve scalability over many nodes, distributed key-value stores (NoSQL) put aside the rich feature set offered by the traditional relational database management systems (RDBMS), including SQL, joins, foreign keys, and ACID guarantees. Since everyone wants scalability, it would only be a matter of time before the RDBMS would disappear, right? Actually, relational databases have continued to dominate the database landscape. And here’s why:

Continue reading
Craig Kerstiens
By Craig Kerstiens
November 27, 2018

How Postgres is more than a relational database: Extensions

Postgres has been a great database for decades now, and has really come into its own in the last ten years. Databases more broadly have also gotten their own set of attention as well. First we had NoSQL which started more on document databases and key/value stores, then there was NewSQL which expanding things to distributed, graph databases, and all of these models from documents to distributed to relational were not mutually exclusive. Postgres itself went from simply a relational database (which already had geospatial capabilities) to a multi modal database by adding support for JSONB.

But to me the most exciting part about Postgres isn’t how it continues to advance itself, rather it is how Postgres has shifted itself from simply a relational database to more of a data platform. The largest driver for this shift to being a data platform is Postgres extensions. Postgres extensions in simplified terms are lower level APIs that exist within Postgres that allow to change or extend it’s functionality. These extension hooks allow Postgres to be adapted for new use cases without requiring upstream changes to the core database. This is a win in two ways:

  1. The Postgres core can continue to move at a very safe and stable pace, ensuring a solid foundation and not risking your data.
  2. Extensions themselves can move quickly to explore new areas without the same review process or release cycle allowing them to be agile in how they evolve.

Okay, plug-ins and frameworks aren’t new when it comes to software, what is so great about extensions for Postgres? Well they may not be new to software, but they’re not new to Postgres either. Postgres has had extensions as long as I can remember. In Postgres 9.1 we saw a new sytax to make it easy to CREATE EXTENSION and since that time the ecosystem around them has grown. We have a full directory of extensions at PGXN. Older forks such as which were based on older versions are actively working on catching up to a modern release to presumably become a pure extension. By being a pure extension you’re able to stay current with Postgres versions without heavy rebasing for each new release. Now the things you can do with extensions is as powerful as ever, so much so that Citus’ distributed database support is built on top of this extension framework.

Continue reading
Colton Shepard
By Colton Shepard
November 8, 2018

Making the PostgreSQL pgloader utility more Citus aware, a database hackathon project

This year, as part of the Citus Data annual team retreat and hackathon, we decided to get ambitious and try to create an easy, 1-click migration to the Citus distributed database. My team was lucky enough to get Dimitri Fontaine, the author of the excellent PostgreSQL pgloader utility, so we decided to start our hackathon project by building on top of pgloader and making pgloader more Citus-aware.

For those of you not familiar with Citus—Citus is an extension to PostgreSQL that transforms Postgres into a distributed database. We make Citus available as open source (star us on GitHub!), as enterprise software you can run anywhere, and as a fully-managed database as a service.

Continue reading
Craig Kerstiens
By Craig Kerstiens
October 31, 2018

Materialized views vs. Rollup tables in Postgres

Materialized views were a long awaited feature within Postgres for a number of years. They finally arrived in Postgres 9.3, though at the time were limited. In Postgres 9.3 when you refreshed materialized views it would hold a lock on the table while they were being refreshed. If your workload was extremely business hours based this could work, but if you were powering something to end-users this was a deal breaker. In Postgres 9.4 we saw Postgres achieve the ability to refresh materialized views concurrently. With this we now have fully baked materialized view support, but even still we’ve seen they may not always be the right approach.

Continue reading
Ozgun Erdogan
By Ozgun Erdogan
October 24, 2018

Why Citus Data is donating 1% equity to PostgreSQL organizations

Today, we’re excited to announce that we have donated 1% of Citus Data’s stock to the non-profit PostgreSQL organizations in the US and Europe. The United States PostgreSQL Association (PgUS) has received this stock grant. PgUS will work with PostgreSQL Europe to support the growth, education, and future innovation of Postgres both in the US and Europe.

To our knowledge, this is the first time a company has donated 1% of its equity to support the mission of an open source foundation.

To coincide with this donation, we’re also joining the Pledge 1% movement, alongside well-known technology organizations such as Atlassian, Twilio, Box, and more.

Continue reading

Page 1 of 21