Why Citus?

Learn Your Way: Read, Watch, or Do

Read the docs

Find out more about the Citus concepts, architecture, cluster management, APIs, use cases, & performance tuning.

Watch the demos

See how Citus scales out Postgres and parallelizes your workloads via these YouTube videos. Tip: turn on captions.

Try the tutorials

Learn how to use Citus with sample data in these short tutorials. For time series data, check out the use case guide.

Try Citus Right Now

Citus Open Source

You can download and install Citus open source packages for Docker, Ubuntu, Debian, Fedora, CentOS, and Red Hat via these simple steps.

Citus on Azure

You can stand up a Citus cluster in minutes with the Hyperscale (Citus) option in the Azure Database for PostgreSQL managed service.

Scaling out Postgres with Citus

Using sharding and replication, the Citus open source extension to Postgres distributes your data and queries across multiple servers in a database cluster. Because Citus uses a coordinator as the single entry point for applications, your app can interact with the Citus cluster as if it were a single Postgres server.

Citus is available as an open source download and in the cloud as a managed service. The Hyperscale (Citus) option in Azure Database for PostgreSQL makes it easy to stand up a managed Citus cluster in minutes.

Figure: How Citus scales out Postgres
A Citus distributed database cluster contains a Citus coordinator node and multiple worker nodes. Each node contains small Postgres tables called shards. Learn more in the animated Citus architecture graphic—or in the Citus GitHub repo.

Frequently Asked Questions

  1. Citus Version   Compatible with PostgreSQL
     -------------   --------------------------
     5.2             9.5 only
     6.x             9.5, 9.6
     7.x             9.6, 10
     8.x             10, 11
     9.0-9.4         11, 12
     9.5             11, 12, 13
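
    To find out which row of the table applies to an existing installation, you can ask the server for both versions. This is a minimal sketch, assuming the Citus extension is already installed in the database you are connected to:

    ```sql
    -- PostgreSQL server version (compare against the right-hand column)
    SHOW server_version;

    -- Citus version as reported by the extension itself
    SELECT citus_version();

    -- Alternatively, read the installed extension version from the catalog
    SELECT extversion FROM pg_extension WHERE extname = 'citus';
    ```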
  2. Citus achieves order-of-magnitude faster execution compared to vanilla PostgreSQL through a combination of parallelism, keeping more data in memory, and higher I/O bandwidth.

    Citus enables human real-time interaction with large datasets that span billions of records—and is a good fit for customer-facing workloads that often require low-latency response times. Performance increases as you add nodes to a Citus database cluster. Watch our 15-min performance demo from SIGMOD to see an example of how Citus speeds up Postgres.

  3. The first step in migrating an application from Postgres to Citus is to choose your distribution column (sometimes called a distribution key or a sharding key). You’ll want to understand your workload in order to pinpoint a “good” distribution column, i.e., a column that enables you to get the maximum performance from Citus.

    The second step is to prepare the Postgres tables and SQL queries for migration. The amount of effort involved depends (you’ve heard that before, right?) on whether your application is already centered around that distribution column in terms of queries and schema. If not, you may have to update some of your queries and/or add the distribution column to some of your tables.

    If you are ready to delve deeper, the Migrating to Citus guide in the Citus documentation should be useful.
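
    The schema side of the second step can be sketched as follows. The table and column names (orders, order_items, tenant_id) and the constraint name order_items_pkey are hypothetical, chosen only for illustration. The point is that each table you plan to distribute carries the distribution column and includes it in its primary key, which Citus requires for unique constraints on distributed tables.

    ```sql
    -- Hypothetical starting point: orders already has tenant_id, order_items does not.
    ALTER TABLE order_items ADD COLUMN tenant_id bigint;

    -- Backfill the new column from the parent table.
    UPDATE order_items oi
    SET tenant_id = o.tenant_id
    FROM orders o
    WHERE oi.order_id = o.id;

    ALTER TABLE order_items ALTER COLUMN tenant_id SET NOT NULL;

    -- Rebuild the primary key so that it includes the distribution column.
    -- (order_items_pkey is assumed to be the existing constraint name.)
    ALTER TABLE order_items DROP CONSTRAINT order_items_pkey;
    ALTER TABLE order_items ADD PRIMARY KEY (tenant_id, id);

    -- Application queries then filter on the distribution column as well.
    SELECT * FROM order_items WHERE tenant_id = 42 AND order_id = 1001;
    ```

    How many of these changes you need depends on how closely your existing schema and queries already follow the distribution column.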

  4. The Citus extension to Postgres is commonly used with customer-facing applications that are growing fast, have demanding performance requirements, are starting to experience slow queries, need to plan for future scale—or all of the above. Common use cases for Citus—both on-prem and in the cloud where Hyperscale (Citus) is an option in the Azure Database for PostgreSQL managed service—include:

    • Customer-facing analytics dashboards
    • SaaS applications—usually multi-tenant
    • Time series workloads
    • IoT workloads that need UPDATEs & JOINs
    • High-throughput transactional applications
  5. As you’ll learn in the Citus concepts section of the documentation, Citus divides Postgres tables into multiple smaller tables, called shards. The shards are then spread across the nodes in the Citus database cluster when you configure Citus with the create_distributed_table() function. When new data is ingested or when queries come in, the Citus coordinator routes them to the correct shards based on the value of the distribution column.

    -- Replace both placeholders with your own table and column names.
    SELECT create_distributed_table(
      'table_name',
      'distribution_column');

    Another way of thinking about shards: Each shard contains a portion of the larger Postgres table that you have distributed. Imagine you previously had a 1 TB Postgres table. Now imagine you have distributed that 1 TB table across 100 shards in a Citus cluster. Assuming the rows are spread evenly, each shard (which is just a smaller Postgres table) would be a 10 GB Postgres table.
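
    As a concrete version of the 100-shard scenario, the sketch below distributes a hypothetical events table by tenant_id. It assumes a running Citus cluster with worker nodes already added. The table and its columns are made up for illustration; citus.shard_count is the Citus setting that controls how many shards a newly hash-distributed table gets, and pg_dist_shard is the Citus metadata table that records them.

    ```sql
    -- Hypothetical table; the primary key includes the distribution column.
    CREATE TABLE events (
      tenant_id  bigint NOT NULL,
      event_id   bigint NOT NULL,
      payload    jsonb,
      created_at timestamptz DEFAULT now(),
      PRIMARY KEY (tenant_id, event_id)
    );

    -- Ask for 100 shards for tables distributed in this session.
    SET citus.shard_count = 100;

    -- Split events into shards by hashing tenant_id and place them on the workers.
    SELECT create_distributed_table('events', 'tenant_id');

    -- Inspect the shard metadata: one row per shard, with the hash range it covers.
    SELECT shardid, shardminvalue, shardmaxvalue
    FROM pg_dist_shard
    WHERE logicalrelid = 'events'::regclass
    ORDER BY shardid;
    ```

    The number of rows returned by the last query is expected to match the citus.shard_count value in effect when the table was distributed.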

    Citus does more than simply shard and distribute your data, however. Citus also parallelizes your SQL queries across different nodes in the Citus cluster, giving you an order-of-magnitude reduction in query response times for many use cases.
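
    One way to observe routing and parallelism is to compare the plan of a query that filters on the distribution column with one that does not. This sketch assumes a hypothetical table events (with columns tenant_id and created_at) that has already been distributed by tenant_id. The plan output differs between Citus versions, so none is shown here.

    ```sql
    -- Filters on the distribution column: the coordinator can route
    -- the query to the single shard that holds tenant 42.
    EXPLAIN
    SELECT count(*) FROM events WHERE tenant_id = 42;

    -- No filter on the distribution column: the coordinator sends one task
    -- per shard to the workers, which run in parallel, and merges the results.
    EXPLAIN
    SELECT date_trunc('day', created_at) AS day, count(*)
    FROM events
    GROUP BY day
    ORDER BY day;
    ```

    In the plans, the reported task count is a quick indicator of how many shards a query touches.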