SQL, Scaling, and What's Unique About Postgres

Ozgun Erdogan Jan 12, 2015

About Citus

Citus is a distributed database that scales out PostgreSQL. An extension to Postgres, Citus is available as open source, as on-prem software, and as a fully-managed database as a service.

Sign up for our newsletter

Enjoy what you're reading? Sign-up to our newsletter to stay informed:

Other Recent Posts

Migrating from single-node Postgres to Citus How Citus works (a look at dynamic executors) Citus 7: Transactions, Framework Integration, and Postgres 10 More Articles

Like our blog, or have a question about Citus? Join us on Slack for a chat :)

This blog post simplifies certain concepts to be brief. If you’re looking for examples on how to override the PostgreSQL planner and executor, please check out Citus.

By now, you’ve likely heard the line “SQL doesn’t scale.” If you take away all the marketing, what remains of this line is “SQL is hard to scale.” At a high level, three reasons make scaling out SQL hard:

  1. Scaling is always hard. Scaling data, compared to scaling computations, is even harder.
  2. SQL means different things to different people. Depending on the context, it could mean transactional workloads, short reads/writes, real-time analytics, data warehousing, or triggers. It’s hard to (know how to) scale out all these different use cases.
  3. SQL doesn’t have the notion of “distribution” built into the language. This can be added into the database, but it isn’t there in the language.

This blog post focuses on the third challenge. To make things concrete, let’s take a look at how you insert entries using an example key-value store and through SQL.

-- Redis commands to hash-set properties for user billgates
HSET billgates race sayan
HSET billgates power 9001

-- Example SQL query to insert values for user billgates
INSERT INTO user (player_race, player_power, user_name)
  VALUES ('sayan', 9000 + 1, 'billgates');

In the first example, Redis simply hashes the key “billgates”, finds the related shard, and routes the query to that shard. For the second example, the database first parses a plain text SQL query. The database then checks column values and their data types against the table schema, and also applies optimizations, such as constant folding. Only after these steps are performed, can the query routing logic kick in, and determine that “billgates” makes up the distribution key.

In other words, the routing logic in SQL needs to be coupled with the rest of the database. And this is for a simple INSERT. If you’ve ever looked into SELECT queries, you can imagine things getting quite complicated. Think of filter clauses, functions, operators, aggregates, sub-selects, window functions, and others.

All this functionality means even tighter integration with the database. The component that parallelizes SELECT queries now needs to tell apart a filter clause from an aggregate function. Contrast this with Map/Reduce, where the developer needs to represent their computations within a specific parallel programming paradigm.

In summary, when the user types up a SQL query, there’s a fair bit of machinery associated with just understanding what that plain text query means. Now, if you are looking to scale out this SQL query, your “query distribution” logic needs to work together with the part that does the understanding.

This used to happen in one of four ways:

  1. Application level sharding: Push the distribution logic all the way up. The application developer then owns the complexity of partitioning and failure handling.
  2. Build a distributed database from scratch: One vendor owns the entire stack, and can couple their components at their own leisure. Building a SQL database takes a long time, and creating a community around it takes even longer.
  3. Use middleware for an open source database: Delegate the routing logic to an external process. The developer then needs to understand how to make their application interact with this external component, and how that component behaves during node failures.
  4. Fork an open source database: And make it distributed. The forked database supports the original’s features as of the time of the fork, and diverges from it over time. It in essence becomes a separate project.

This last item on forking applies a lot to PostgreSQL. In fact, PostgreSQL has probably been forked more than any other database.

I don’t know if it was one fork too many for the PostgreSQL community, an enthusiasm to involve more developers, or the modularly designed codebase. Either way, for the last few releases, PostgreSQL has been providing official APIs to extend SQL. In particular, the Postgres manual describes how to add new data types, operators, aggregates, background processes, foreign data wrappers, specialized indexes, and more.

If you like Postgres and want to extend it in a particular way, you can read through the manual, and implement your idea according to the API contract. You can then dynamically load in your extension (.so), and start using your extended database.

What makes things even more interesting are the lesser known, but more powerful, PostgreSQL hook APIs. If you’re looking to scale out a SQL database, you need more than new functionality. As a database programmer, you need to be able to intercept an INSERT statement after it gets parsed and semantically checked, but before it gets planned.

Fortunately, you can change any database submodule’s behavior in Postgres by intercepting the right hook.

For example, pg_shard transforms an incoming SQL query into one or more sub-queries, and therefore needs to run different planning logic than Postgres. No problem. Let’s intercept the planner logic.

This way, we get to change Postgres’ behavior for distributed tables. Also, we get to benefit from core functionality that Postgres has to offer. For example, we need to perform partition pruning to determine the shards that are relevant to an INSERT or SELECT query. That’s easy, PostgreSQL has a constraint exclusion library for partition pruning. We’ll just use that to prune away shards instead.

 * _PG_init is called when the module is loaded. In this function we save the
 * previous planner, executor, and utility hooks. We then install our hooks to
 * intercept operations for distributed tables.
  PreviousPlannerHook = planner_hook;
  planner_hook = PgShardPlanner;

 * PgShardPlanner implements custom planner logic to plan queries involving
 * distributed tables.
static PlannedStmt *
PgShardPlanner(Query *query, int cursorOptions, ParamListInfo boundParams)

Of course, revamping the query planner or executor is trickier than adding a new index type. The API contracts aren’t cleanly documented and require a fair amount of code reading. This was one of the biggest challenges for us; and we’re hoping that cstore_fdw or Citus could serve as good examples for these APIs.

In summary, PostgreSQL’s extensible architecture puts it in a unique place for scaling out SQL and also for adapting to evolving hardware trends. It could just be that the monolithic SQL database is dying. If so, long live Postgres!

← Next article Previous article →