Hyperscale (Citus) is now GA as a built-in deployment option in Azure Database for PostgreSQL. Read Craig’s blog
Today, we’re excited to announce Citus 7.3—the latest release of our distributed database that scales out Postgres. Citus 7.3 improves support for complex analytical queries, provides integration with Tableau and other BI tools, and integrates with the open source Postgres extension, TopN.
The features in this latest Citus database release are particularly important for real-time analytics workloads. In these workloads, users typically need to ingest data in real time and run analytical queries with sub-second response times. A good example is when you’re serving a dashboard to thousands of customers and your database needs to provide fast replies over billions of rows.
What’s New in Citus 7.3
- Complex subquery support
- Broader support for aggregate functions (json(b), bool, and bit aggregates)
- Renaming distributed tables and indexes
- Tableau integration
- TopN extension integration
For all these new Citus features, you can try them today by downloading Citus open source on your local machine. Or you can spin up a new Citus database cluster in the cloud by using our fully-managed database as a service, Citus Cloud. Or keep reading to learn more about the new features in Citus.
Complex subquery support
Citus already provided full SQL support for queries that were scoped to a single machine, such as with many multi-tenant applications. We’ve also been adding and strengthening SQL support for complex queries that span across a cluster of machines. With version 7.3 of the Citus database, we now have even broader support for complex SQL queries. In particular, Citus 7.3 makes three improvements.
- Joins between two or more subqueries, where the join is between large tables and not on the distribution column.
- Support for all types of subqueries that appear in a WHERE clause.
- Support for data-modifying statements in WITH (CTE) clauses.
-- Subquery foo has a join on distribution keys. Subquery bar has a join on distribution keys -- However, join among subqueries foo and bar isn’t on the distribution keys SELECT foo.value_2 FROM (SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (1,2,3,4)) as foo, (SELECT users_table.value_2 FROM users_table, events_table WHERE users_table.user_id = events_table.user_id AND event_type IN (5,6,7,8)) as bar WHERE foo.value_2 = bar.value_2;
Broader support for complex aggregate functions across a Citus database cluster
Citus 7.3 improves support for running complex aggregate functions across a distributed cluster. In particular, Citus 7.3 can now distribute the following PostgreSQL aggregate functions.
- json and jsonb aggregate functions:
jsonb_object_aggare now supported. These functions were contributed to Citus by Markus Sintonen.
- bool and bit aggregate functions:
bit_andare now supported over distributed clusters.
Renaming distributed tables and indexes
Citus 7.3 adds support for renaming distributed tables and indexes. Renaming distributed tables becomes helpful in several scenarios. For example, you could change your Rails application, adjust the model name in your ORM, and you need the underlying table name to match the new model name. Or, you may need to change the distribution key on a table. To do that, you can create a new table, copy data to the new table, and rename the old table to the new one.
Citus 7.3 also broadens its support for index modification commands. You can now rename an index created on a distributed table, or set / reset storage parameters for an index.
ALTER TABLE ad_campaigns RENAME TO campaigns; ALTER INDEX ad_campaigns_pkey RENAME TO campaigns_pkey; ALTER INDEX campaigns_pkey SET (fillfactor = 40);
Verified integration between Tableau & Citus
At Citus we believe that data is a valuable asset to any business, and extracting value and insights from that data are key to success. Tableau is a popular business intelligence and analytics tool for databases. Tableau already has a robust connector for PostgreSQL. Since Citus extends Postgres, you can leverage the vast ecosystem of tools and libraries that support Postgres.
Some of you already use Citus and Tableau together. With Citus 7.3 we’ve done the work to ensure Citus and Tableau provide a seamless experience when you need to perform ad-hoc reporting or analysis. You can now interact with Tableau using the following steps.
- You can directly connect to Citus by choosing PostgreSQL from the connection menu.
- Once you connect to Tableau, you will see the tables in your database. You can define your data source by dragging and dropping tables from the “Table” pane. Or, you can run a custom query through “New Custom SQL”.
- You can create your own sheets by dragging and dropping dimensions, measures, and filters. You can also create an interactive user interface with Tableau. To do this, Tableau automatically chooses a date range over the data. Citus can then compute aggregations over this range in human real-time.
TopN, an open source extension to PostgreSQL
Citus 7.3 also comes with support for a new open source PostgreSQL extension called TopN. TopN returns the top values in a database according to some criteria. When doing this, TopN applies an approximation algorithm to provide fast results using few compute and memory resources.
We originally created TopN to help Citus customers, who needed to scale out their PostgreSQL databases. These customers needed to serve customer-facing dashboards and run analytical queries that needed sub-second responses.
The TopN extension achieves that by materializing top values in a database, incrementally updating these top values, and/or merging top values from different time ranges. If you’re familiar with the HyperLogLog (HLL) extension, you can think of TopN as its sister.
If you’re interested in learning more, please see the examples section in the TopN repository.
Full list of changes in Citus 7.3
We only covered the highlights for Citus 7.3 in this blog post. We have many more changes in this release and you can read the full list in our Citus GitHub repo.
Share your thoughts on Citus 7.3 with us
Citus 7.3 makes Citus more of a drop-in replacement for your single-node Postgres database. By scaling out Postgres using Citus, you don’t need to worry about adapting your database for a distributed system.
If you give Citus a try, we’d love to hear your feedback. Please join the conversation in our Slack channel and let us know what you think.