Answers to Frequently Asked Questions

  • In January 2019, Microsoft acquired Citus Data. For more details on the exciting news, please visit the announcements on the Official Microsoft Blog and the Citus Data Blog.

  • Citus extends PostgreSQL to support distributed SQL queries. On top of PostgreSQL, Citus comes with its own transparent sharding, replication, distributed query planner and executor logic which enable execution of distributed SQL queries in parallel. This provides Hadoop-like fault tolerance, scalability, and recovery from mid-query failures—while still allowing large datasets to be queried orders of magnitude faster than what has been possible on PostgreSQL before.

  • Citus Version    Compatible with PostgreSQL
    5.2              9.5 only
    6.x              9.5, 9.6
    7.x              9.6, 10
    8.x              10, 11
    9.0-9.4          11, 12
    9.5              11, 12, 13
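
    The compatibility table above can be encoded as a small lookup, which may be handy in deployment scripts. This is only a sketch: the function name `supported_postgres` is hypothetical, and the data is copied from the table rather than queried from Citus.

```python
from typing import List


def supported_postgres(citus_version: str) -> List[str]:
    """Return the PostgreSQL versions the table lists for a Citus version.

    Expects a "major.minor" string such as "8.3". The mapping is copied
    from the compatibility table in this FAQ, not read from Citus itself.
    """
    parts = citus_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if (major, minor) == (5, 2):
        return ["9.5"]
    if major == 6:
        return ["9.5", "9.6"]
    if major == 7:
        return ["9.6", "10"]
    if major == 8:
        return ["10", "11"]
    if major == 9 and 0 <= minor <= 4:
        return ["11", "12"]
    if major == 9 and minor == 5:
        return ["11", "12", "13"]
    raise ValueError(f"Citus {citus_version} is not covered by the table")


if __name__ == "__main__":
    print(supported_postgres("9.5"))
```
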
  • Since Citus provides distributed functionality by extending PostgreSQL, it uses the standard PostgreSQL SQL constructs. It provides full SQL support for queries which access a single node in the database cluster. These queries are common, for instance, in multi-tenant applications where different nodes store different tenants (see When to Use Citus).

  • Since Citus is based on PostgreSQL, you can directly use PostgreSQL extensions such as HyperLogLog, TopN, or PostGIS with it. To use other extensions alongside Citus, first create the Citus extension on your PostgreSQL instance, then create the other extensions you want to use. Citus also works with tools such as Tableau that connect through standard PostgreSQL ODBC/JDBC drivers.

    In general, you can use standard PostgreSQL drivers and language bindings with Citus, which means almost any language is supported. You can view a list of supported drivers and interfaces for PostgreSQL here.
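
    The ordering described above (Citus first, then the other extensions) could be scripted roughly as follows. This is a minimal sketch: the helper name `extension_setup_sql` is hypothetical, and the extension names `hll`, `topn`, and `postgis` are assumed to be the installed names of HyperLogLog, TopN, and PostGIS on your server.

```python
import re
from typing import Iterable, List

# Accept only plain lowercase identifiers so names can be embedded safely.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def extension_setup_sql(other_extensions: Iterable[str]) -> List[str]:
    """Build CREATE EXTENSION statements with the Citus extension first."""
    statements = ["CREATE EXTENSION IF NOT EXISTS citus;"]
    for name in other_extensions:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected extension name: {name!r}")
        if name != "citus":  # citus is already the first statement
            statements.append(f"CREATE EXTENSION IF NOT EXISTS {name};")
    return statements


if __name__ == "__main__":
    for statement in extension_setup_sql(["hll", "topn", "postgis"]):
        print(statement)
```

    The resulting statements can be run through any standard PostgreSQL driver or client.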

  • You can find real-world examples of how organizations use Citus to scale out Postgres in our customer stories. Our customers are wonderful, and we appreciate their vote of confidence and the time they spent being interviewed for these case studies. You’ll find stories from companies that use Citus to build real-time analytics APIs and dashboards, as well as stories about teams that use Citus to scale their multi-tenant SaaS applications.

    The Citus distributed database is used by Fortune 100 companies and startups alike, across different types of businesses including web & mobile analytics, information & network security, advertising technology, sales & marketing automation, and fintech.

    And Citus is now available in the cloud on Microsoft Azure, as an integrated deployment option in the Azure Database for PostgreSQL managed service. Learn more about how the Helsinki Regional Transportation Authority uses Hyperscale (Citus) on Azure—along with PostGIS—to deliver impressive performance and reduce costs by 50%. Or watch this video interview with BNY Mellon about how the bank has made their Postgres queries 24X faster by scaling out Postgres with Hyperscale (Citus) on Azure Database for PostgreSQL.

  • Citus deployments continue to scale out. At last count, we had customers storing hundreds of TBs on Citus, using tens of nodes in parallel, and ingesting TBs of data per day. We test Citus on 100+ nodes, and Citus is capable of storing and processing PB-scale workloads, so we look forward to the continued growth of our customers' Citus deployments.

  • There are several ways in which Citus differs from other analytics databases:

    • Citus is built for fast analytics and high transaction rates for many concurrent users. Most analytics databases are generally not intended to support concurrent users or transactions.
    • Citus is open source, unlike proprietary analytics databases. Open source gives you a lot of freedom and flexibility: you can run Citus and Postgres on your laptop or on VMs in the cloud, and you can take advantage of a large, vibrant ecosystem of tools and client libraries.
    • Because Citus is implemented as an extension to PostgreSQL (not a fork), it’s easy for us to keep Citus current with the latest releases of Postgres.
    • Because Citus supports distributed transactions, you can transform your data in parallel inside your database, simplifying your infrastructure and enabling you to build fast and powerful analytics pipelines.

  • Citus is not a columnar database by design since it extends PostgreSQL. However, it can be used in combination with the cstore_fdw extension, which gives Citus the capability to create distributed columnar tables. This helps to reduce the data footprint and improves the performance for disk-bound workloads.

  • Citus achieves order-of-magnitude faster execution compared to vanilla PostgreSQL through a combination of parallelism, keeping more data in memory, higher I/O bandwidth, and simultaneous use of the multiple cores available in your Citus database cluster.

    Citus enables human real-time interaction with large datasets that span billions of records—and is a good fit for customer-facing applications that often require low-latency response times. You can increase performance by adding nodes to the Citus database cluster. Watch our 15-min performance demo from SIGMOD to see an example of how Citus speeds up Postgres.

  • A single Citus node stores multiple shards of the same distributed table. This enables Citus to use multiple cores for a single query by virtue of hitting multiple PostgreSQL tables (shards) on each node. However, to get true scalability in performance and reliability, we recommend a multi-node cluster. In cases where queries hit the disk, a single node setup can easily become disk I/O bound.
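
    The idea of one query fanning out over several shards and then combining partial results can be illustrated with a toy scatter-gather sketch. It is not how Citus is implemented: the shards here are plain Python lists, the function names are hypothetical, and Python threads only mimic the structure of the fan-out rather than real multi-core execution.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def count_matching(shard_rows: Sequence[int], predicate: Callable[[int], bool]) -> int:
    """Fragment step: count the matching rows inside one shard."""
    return sum(1 for row in shard_rows if predicate(row))


def parallel_count(
    shards: List[Sequence[int]],
    predicate: Callable[[int], bool],
    max_workers: int = 4,
) -> int:
    """Run one fragment per shard concurrently, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_counts = list(
            pool.map(lambda rows: count_matching(rows, predicate), shards)
        )
    return sum(partial_counts)


if __name__ == "__main__":
    # Three toy shards of one logical table.
    shards = [[1, 5, 9], [2, 8], [7, 7, 3]]
    print(parallel_count(shards, lambda value: value > 4))
```
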

  • You can deploy Citus on premises as well as in the cloud. Citus is available as open source, as on-prem enterprise software, and in the cloud as Hyperscale (Citus), a built-in deployment option for the Azure Database for PostgreSQL managed service. More details on the ways to get Citus can be found here.

  • The needs of your application (performance, scalability, concurrency, seasonality) will influence how many nodes you need in your Citus distributed database cluster. The good news: because of how Citus shards your data and distributes SQL across multiple nodes to parallelize the workload, Citus is able to scale out processing power, memory, and storage linearly. You can read more about how to tune query performance in our Citus documentation. Or feel free to contact us to explore what size Citus database cluster will work for you.

  • Citus implements transparent sharding at the database layer—so if you use Citus, you do not need to shard at the application layer, and you will not need to re-architect your application in order to scale out horizontally. You can read more about the Citus architecture and sharding semantics in our documentation.
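
    To give a feel for what sharding at the database layer saves the application from doing, here is a simplified sketch of hash-based routing: a key is hashed into a signed 32-bit space, and that space is split into equal ranges, one per shard. It is an illustration under assumptions, not the Citus implementation: CRC32 stands in for the PostgreSQL hash function, and the function names and the default of 32 shards are chosen for the example.

```python
import zlib

INT32_MIN = -2**31
HASH_SPACE = 2**32  # number of distinct signed 32-bit hash values


def stand_in_hash(key: str) -> int:
    """Map a key to a signed 32-bit integer (CRC32 stand-in, not the PostgreSQL hash)."""
    return zlib.crc32(key.encode("utf-8")) + INT32_MIN


def shard_index(hash_value: int, shard_count: int) -> int:
    """Return which of shard_count equal hash ranges contains hash_value."""
    range_size = HASH_SPACE // shard_count
    index = (hash_value - INT32_MIN) // range_size
    return min(index, shard_count - 1)  # last range absorbs any remainder


def route(tenant_id: str, shard_count: int = 32) -> int:
    """Pick the shard for a tenant; the same tenant always maps to the same shard."""
    return shard_index(stand_in_hash(tenant_id), shard_count)


if __name__ == "__main__":
    for tenant in ("acme", "globex", "initech"):
        print(tenant, "-> shard", route(tenant))
```

    With transparent sharding, this kind of routing happens inside the database, so application queries do not need to know which shard holds a given row.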

  • Optimal shard count is related to the total number of cores on the workers. Citus partitions an incoming query into its fragment queries which run on individual worker shards. Hence, the degree of parallelism for each query is governed by the number of shards the query hits. To ensure maximum parallelism, you should create enough shards on each node such that there is at least one shard per CPU core.
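
    The "at least one shard per CPU core" guideline above reduces to simple arithmetic. The sketch below assumes every worker has the same number of cores; the function names are hypothetical, and `citus.shard_count` is the Citus setting assumed here for the number of shards given to newly distributed tables.

```python
def minimum_shard_count(worker_nodes: int, cores_per_worker: int) -> int:
    """Smallest shard count that gives every worker core at least one shard."""
    if worker_nodes < 1 or cores_per_worker < 1:
        raise ValueError("worker_nodes and cores_per_worker must be positive")
    return worker_nodes * cores_per_worker


def shard_count_setting_sql(shard_count: int) -> str:
    """SQL that applies the shard count before a table is distributed."""
    return f"SET citus.shard_count = {shard_count};"


if __name__ == "__main__":
    # Example: 4 workers with 8 cores each need at least 32 shards.
    shards = minimum_shard_count(worker_nodes=4, cores_per_worker=8)
    print(shards)
    print(shard_count_setting_sql(shards))
```
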

  • Migrating an existing relational store to Citus sometimes requires adjusting the schema and queries for optimal performance. Since Citus is deployed as a Postgres extension, Postgres users can often start using Citus by simply installing the extension on their existing database. Once the extension is created, you can create and use distributed tables through standard Postgres interfaces while maintaining compatibility with existing Postgres tools. For more information, see our Migrating to Citus guide.

    If you are moving from MySQL or any other relational database, the migration path is similar to moving to Postgres from another relational database. We've had numerous customers move from MySQL to Citus with little change in their applications.
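
    Once the extension is created, turning an existing table into a distributed table is typically a single call to the Citus function `create_distributed_table`. The sketch below only builds that SQL string. The helper name, the table `events`, and the distribution column `tenant_id` are hypothetical, and choosing the right distribution column is covered in the migration guide rather than here.

```python
import re

# Accept only plain lowercase identifiers so names can be embedded safely.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def distribute_table_sql(table_name: str, distribution_column: str) -> str:
    """Build the SQL call that distributes a table by the given column."""
    for name in (table_name, distribution_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return f"SELECT create_distributed_table('{table_name}', '{distribution_column}');"


if __name__ == "__main__":
    print(distribute_table_sql("events", "tenant_id"))
```
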

  • Citus treats cstore_fdw tables just like regular PostgreSQL tables. When cstore_fdw is used with Citus, each logical shard is created as a foreign cstore_fdw table instead of a regular PostgreSQL table. If your cstore_fdw use case suits the distributed nature of Citus (e.g. large dataset archival and reporting), the two can be combined to pair the query parallelization, seamless sharding, and HA benefits of Citus with the superior compression and I/O utilization of cstore_fdw.

  • With the open-source release of Citus v5.x, pg_shard's codebase was merged into Citus to offer a unified solution. It provides the advanced distributed query planning previously available only to CitusDB customers, while preserving the simple, transparent sharding and real-time reads and writes that pg_shard brought to the PostgreSQL ecosystem. Our flagship product, Citus, provides a superset of the functionality of pg_shard, and we have migration steps to help existing users perform a drop-in replacement. Please contact us for more information.

  • Citus Community is open source and is available as a free download here. Citus is also available as on-prem enterprise software, and in the cloud, integrated with Azure Database for PostgreSQL, a fully managed database as a service on Microsoft Azure.

    GA pricing starts at less than $2/hour for Hyperscale (Citus) on Azure Database for PostgreSQL in the US East region (price varies by region). You can visit the Hyperscale (Citus) page to learn more about pricing.

    We have several different pricing models for our Citus Enterprise offering including OEM, site-wide, and per-node licenses. Please contact sales for more details.

  • The Citus server is licensed under the GNU Affero General Public License v3.0. For additional details, including answers to common questions about the AGPL, see the FAQ from the Free Software Foundation. The client drivers are licensed under the PostgreSQL license.

    With this licensing structure, we looked to accomplish the following objectives:

    • Allow users to download Citus, see the source code, and use it for free.
    • Require users who choose to modify Citus to fit their needs to release the patches to the software development community.
    • Require users who are unwilling to release the patches to the software development community to purchase a commercial license.

    With a significant volume of database software delivered today as a hosted service vs. distributed in binary form, GNU AGPL became the most effective license to fulfill all of the above.

    Having the client drivers under the PostgreSQL license removes any ambiguity as to the extent of the server license. We also have the Citus Enterprise product available under a commercial license from Citus Data.