Blog posts by Ozgun Erdogan on the Citus Blog - Page 2

Citus 7: Transactions, Framework Integration, and Postgres 10

Written by Ozgun Erdogan | September 7, 2017

"Thirty years ago, my older brother was trying to get a report on birds written that he'd had three months to write. It was due the next day.

We were out at our family cabin in Bolinas, and he was at the kitchen table close to tears, surrounded by binder paper and pencils and unopened books on birds, immobilized by the hugeness of the task ahead. Then my father sat down beside him, put his arm around my brother's shoulder, and said, 'Bird by bird, buddy. Just take it bird by bird.'"

From Bird by Bird: Some Instructions on Writing and Life, by Anne Lamott

When we started working on Citus, our vision was to combine the power of relational databases with the elastic scale of NoSQL. To do this, we took a different approach. Instead of building a new database from scratch, we leveraged PostgreSQL’s new extension APIs. This way, Citus would make Postgres a distributed database and integrate with the rich ecosystem of tools you already use.

When PostgreSQL is involved, executing on this vision isn’t a simple task. The PostgreSQL manual offers 3,558 pages of features built over two decades. The tools built around Postgres use and combine these features in unimaginable ways.

After our Citus open source announcement, we talked to many of you about scaling out your relational database. In every conversation, we’d hear about different Postgres features that needed to scale out of the box. We’d take notes from our meeting and add these features into an internal document. The list would keep getting longer, and longer, and longer.

Like the child writing a report on birds, the task ahead felt insurmountable. So how do you take a solid relational database and make sure that all those complex features scale? You take it bird by bird. We broke down the problem of scaling into five hundred smaller ones and started implementing these features one by one.

Keep reading

Principles of Sharding for Relational Databases

Written by Ozgun Erdogan | August 9, 2017

When your database is small (10s of GB), it's easy to throw more hardware at the problem and scale up. As these tables grow, however, you need to think about other ways to scale your database.

In one way, sharding is the best way to scale. Sharding enables you to linearly scale your database’s CPU, memory, and disk resources by separating your database into smaller parts. In other ways, sharding is a controversial topic. The internet is full of advice on sharding, from "essential to scaling your database infrastructure" to "why you never want to shard". So the question is, whose advice should you take?

Keep reading

How to Scale PostgreSQL on AWS–Learnings from Citus Cloud

Written by Ozgun Erdogan | March 10, 2017

Citus is a distributed database that extends (not forks) PostgreSQL for large workloads. One challenge associated with a distributed relational database (RDBMS) is that it requires notable effort to deploy and operate. To remove these operational barriers, we’ve been thinking about offering Citus as a managed database for a while now.

Naturally, we were also worried that providing a native database offering on AWS could split our startup’s focus and take up significant engineering resources. (Honestly, if the founding engineers of the Heroku Postgres team hadn’t joined Citus, we might have decided to wait on this.) After having Citus Cloud publicly available for eight months, though, we are now more bullish on the cloud than ever.

It turns out that targeting an important use case for your customers and delivering it in a way that removes their pain points matters more than anything else. In this blog post, we’ll focus only on removing operational pain points and not on use cases: Why is the cloud changing the way databases are delivered to customers? Which AWS technologies is Citus Cloud using to enable that in a unique way?

Keep reading

Citus' Replication Model: Today and Tomorrow

Written by Ozgun Erdogan | December 15, 2016

Citus is a distributed database that extends (not forks) PostgreSQL. Citus does this by transparently sharding database tables across the cluster and replicating those shards.
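To make the idea of sharding and replicating shards concrete, here is a minimal sketch in Python. It is not how Citus itself assigns shards: the function names `shard_for` and `placements`, the CRC32 hash, and the round-robin placement are all assumptions chosen for illustration.

```python
import zlib
from typing import List


def shard_for(key: str, shard_count: int) -> int:
    """Map a distribution key to a shard id using a stable hash.

    Hypothetical helper: CRC32 is used only because it is deterministic
    across processes; it is not the hash function Citus uses.
    """
    return zlib.crc32(key.encode("utf-8")) % shard_count


def placements(shard_id: int, nodes: List[str], replication_factor: int) -> List[str]:
    """Pick `replication_factor` distinct nodes for a shard, round-robin.

    Hypothetical placement policy, for illustration only.
    """
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    return [nodes[(shard_id + i) % len(nodes)] for i in range(replication_factor)]


if __name__ == "__main__":
    workers = ["worker-1", "worker-2", "worker-3"]  # made-up node names
    shard = shard_for("tenant-42", shard_count=32)
    print(shard, placements(shard, workers, replication_factor=2))
```

With a replication factor of 2, every shard lives on two different workers, so a single node failure still leaves one copy of each shard reachable.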

After open sourcing Citus, one question that we frequently heard from users related to how Citus replicated data and automated node failovers. In this blog post, we intend to cover the two replication models available in Citus: statement-based and streaming replication. We also plan to describe how these models evolved over time for different use cases.

Keep reading

Designing your SaaS Database for Scale with Postgres

Written by Ozgun Erdogan | October 3, 2016

If you’re building a SaaS application, you probably already have the notion of tenancy built into your data model. Typically, most information relates to tenants / customers / accounts and your database tables capture this natural relation.

With smaller amounts of data (10s of GB), it’s easy to throw more hardware at the problem and scale up your database. As these tables grow, however, you need to think about ways to scale your multi-tenant database across dozens or hundreds of machines.

After our blog post on sharding a multi-tenant app with Postgres, we received a number of questions on architectural patterns for multi-tenant databases and when to use which. At a high level, developers have three options:

Keep reading

Citus Unforks From PostgreSQL, Goes Open Source

Written by Ozgun Erdogan | March 24, 2016

When we started working on CitusDB 1.0 four years ago, we envisioned scaling out relational databases. We loved Postgres (and the elephant) and picked it as our underlying database of choice. Our goal was to extend this database to seamlessly shard

Keep reading

PostgreSQL, pg_shard, and what we learned from our failures

Written by Ozgun Erdogan | September 9, 2015

pg_shard is a PostgreSQL extension that scales out real-time reads and writes. This document talks about an earlier version of pg_shard that used Postgres' foreign data wrappers (FDWs) for sharding and scaling. We failed, learned, and succeeded in our...

Keep reading

First PGConf Silicon Valley Speakers Announced

Written by Ozgun Erdogan | July 23, 2015

As a member of the PGConf Silicon Valley Conference Committee, I'm extremely happy with the volume and quality of the talks submitted to the conference. The Committee has been working hard on sorting through the talks, and I am pleased to announce...

Keep reading

CitusDB 4.0, pg_shard 1.1, and cstore 1.2 are out. What's next?

Written by Ozgun Erdogan | April 27, 2015

We're excited to release CitusDB 4.0, pg_shard 1.1, and cstore 1.2! These products extend PostgreSQL for scaling out and high performance.

Now that our new releases are out, we wanted to answer two questions that we continuously hear from our users...

Keep reading

Scaling out PostgreSQL at Cloudflare with Citus

Written by Ozgun Erdogan | April 14, 2015

Cloudflare is a content delivery network (CDN) and DNS provider that powers millions of websites around the world. Last week, we were happy to see them publish a technical blog post that described how they power their analytics dashboards using Citus...

Keep reading