v10.2 (Sep 2021)

What’s new in Citus 10.2?

Welcome to release v10.2 of the Citus database, an open source extension to PostgreSQL. Explore these headline features of Citus 10.2:

PostgreSQL 14 Beta3 support—and ready for PostgreSQL 14

Because Citus is an extension to Postgres, it’s easier for us to keep Citus in sync with the newest releases of Postgres (which would not be true if Citus were a fork). This means that when a new Postgres version is released, we can quickly release a new version of Citus that supports all or most of the new features.

One of the things to be excited about: Citus 10.2 is compatible with the Postgres 14beta3 release and already supports most of the new Postgres 14 features. This means that Citus 10.2 is ready for Postgres 14, and when Postgres 14 is officially released, it can be used with Citus 10.2.

Going over all the improvements that Postgres 14 provides is out of the scope of this post, as there are many of them! You can read the feature highlights for PG 14 here and the complete Postgres release notes here.

As an example, Postgres 14 adds a general subscripting framework for retrieving information in nested objects. You can now retrieve nested info in the JSONB data type using subscript syntax on distributed tables as well:

-- create a table, distribute it, and
-- ingest 3 rows for demo purposes
CREATE TABLE distributed_table(data jsonb, key int);
SELECT create_distributed_table('distributed_table', 'key');
INSERT INTO distributed_table
  VALUES
  ('{ "attributes": { "state": "CA" }}'::jsonb, 1),
  ('{ "attributes": { "state": "CA" }}'::jsonb, 2),
  ('{ "attributes": { "state": "WA" }}'::jsonb, 3);

-- subscript jsonb in GROUP BY clause
SELECT
  data['attributes']['state'] as state, count(*)
FROM
  distributed_table
GROUP BY
  data['attributes']['state'];
state | count
------+------
"CA"  |     2
"WA"  |     1
(2 rows)

-- subscript jsonb in WHERE clause
SELECT
  key
FROM
  distributed_table
WHERE
  data['attributes']['state'] = '"CA"'::jsonb;
key
-----
   1
   2
(2 rows)

There are only a handful of Postgres 14 features that are not yet compatible with Citus 10.2, and you can see all of them on this tracking issue. The Citus team will gradually add compatibility for these features. The prioritization is typically driven by the input we get from those of you who use Citus, so feel free to comment on the GitHub issue to note your favorites!

Partition management for time series workloads

With Citus 10.2, partition management for time series workloads becomes much easier!

Citus provides a powerful way of handling time series workloads at scale. The Citus 10 release introduced columnar tables for Postgres; if you haven’t yet heard about using columnar storage with Citus, this demo that Jeff created is a good start. One particularly useful way to take advantage of Citus columnar is to combine it with Postgres native range partitioning, which is commonly used for time-based partitioning in time series workloads.

Citus 10.2 provides the following APIs for easily managing time partitions:

  • create_time_partitions(table_name regclass, partition_interval interval, end_at timestamp with time zone, start_from timestamp with time zone DEFAULT now()): for the given table and interval, create as many partitions as necessary to cover the given time range [start_from, end_at].
  • alter_old_partitions_set_access_method(parent_table_name regclass, older_than timestamp with time zone, new_access_method name): for the given table, convert the access method (e.g., heap -> columnar) of all partitions older than the given timestamp.
  • drop_old_time_partitions(table_name regclass, older_than timestamp with time zone): for the given table, drop all partitions older than the given timestamp.

These 3 functions provide a simple way of managing partitions for time series workloads. They help you maintain the desired number of partitions with the desired access methods (e.g., row-based or columnar) over time.

-- create a partitioned table
CREATE TABLE time_series_events
  (event_time timestamp, event int, user_id int)
  PARTITION BY RANGE (event_time);

-- optional: partition management for time series workloads is supported on
-- both distributed and non-distributed tables. In this example,
-- we use it on a distributed table
SELECT create_distributed_table('time_series_events', 'user_id');

-- first, create monthly partitions for the next 12 months
-- note that you can choose the interval based on your needs
-- we choose monthly partitions in this demo
SELECT create_time_partitions(table_name:='time_series_events',
  partition_interval:= '1 month',
  end_at:=now() + '12 months',
  start_from:=now());


-- the above command creates one partition per month; since the range
-- [now(), now() + 12 months] touches 13 calendar months, 13 partitions are created.
-- you can use the "time_partitions" view that Citus provides
-- for inspecting various details about the time-partitioned tables
SELECT partition, from_value, to_value, access_method FROM time_partitions;
          partition         |     from_value      |      to_value       | access_method
----------------------------+---------------------+---------------------+---------------
time_series_events_p2021_09 | 2021-09-01 00:00:00 | 2021-10-01 00:00:00 | heap
time_series_events_p2021_10 | 2021-10-01 00:00:00 | 2021-11-01 00:00:00 | heap
time_series_events_p2021_11 | 2021-11-01 00:00:00 | 2021-12-01 00:00:00 | heap
time_series_events_p2021_12 | 2021-12-01 00:00:00 | 2022-01-01 00:00:00 | heap
time_series_events_p2022_01 | 2022-01-01 00:00:00 | 2022-02-01 00:00:00 | heap
time_series_events_p2022_02 | 2022-02-01 00:00:00 | 2022-03-01 00:00:00 | heap
time_series_events_p2022_03 | 2022-03-01 00:00:00 | 2022-04-01 00:00:00 | heap
time_series_events_p2022_04 | 2022-04-01 00:00:00 | 2022-05-01 00:00:00 | heap
time_series_events_p2022_05 | 2022-05-01 00:00:00 | 2022-06-01 00:00:00 | heap
time_series_events_p2022_06 | 2022-06-01 00:00:00 | 2022-07-01 00:00:00 | heap
time_series_events_p2022_07 | 2022-07-01 00:00:00 | 2022-08-01 00:00:00 | heap
time_series_events_p2022_08 | 2022-08-01 00:00:00 | 2022-09-01 00:00:00 | heap
time_series_events_p2022_09 | 2022-09-01 00:00:00 | 2022-10-01 00:00:00 | heap
(13 rows)


-- make sure that we never have more than one year of data
-- (e.g., drop partitions older than 12 months)
CALL drop_old_time_partitions('time_series_events',
now() - interval '12 months' /* older_than */);

-- and finally, make sure that everything older than 6 months is columnar compressed
-- (e.g., compress partitions older than 6 months)
CALL alter_old_partitions_set_access_method('time_series_events',
             now() - interval '6 months' /* older_than */,
            'columnar');

You can easily automate these operations using pg_cron, an open source Postgres extension that our team builds and maintains. If you are not yet familiar with pg_cron, it is a simple, cron-based job scheduler for PostgreSQL that runs inside the database as an extension. You can easily combine Citus with pg_cron to automate time series partition management. (Side note: pg_cron does not require a new release to work with PG 14; you will just need to get the new packages.)

For instance, the following command sets up a pg_cron job that runs once a week to ensure that monthly partitions are always pre-created for the next 3 months.

-- schedule a cron job such that the partitions
-- for the next 3 months are always created automatically
SELECT cron.schedule('create-partitions',
                     '@weekly',
                     $$SELECT create_time_partitions(table_name:='time_series_events',
                                                     partition_interval:= '1 month',
                                                     end_at:=now() + '3 months') $$);
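
You can schedule the retention and compression steps in exactly the same way. Below is a minimal sketch that reuses the time_series_events table and the '@weekly' schedule from above; the job names ('drop-old-partitions', 'compress-old-partitions') are just illustrative.

-- drop partitions older than 12 months, once a week
SELECT cron.schedule('drop-old-partitions',
                     '@weekly',
                     $$CALL drop_old_time_partitions('time_series_events',
                                                     now() - interval '12 months') $$);

-- compress partitions older than 6 months, once a week
SELECT cron.schedule('compress-old-partitions',
                     '@weekly',
                     $$CALL alter_old_partitions_set_access_method('time_series_events',
                                                                   now() - interval '6 months',
                                                                   'columnar') $$);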

With that, you don’t need to worry about managing partitions anymore. Citus 10.2 with pg_cron will do the heavy lifting for you!

Index support for columnar tables

Citus 10 introduced columnar tables. One common piece of feedback from our users was the lack of index support on columnar tables. The good news is that as of Citus 10.2, indexes on columnar tables are supported.

This provides the following 3 key capabilities for you:

CREATE INDEX on partitioned tables that have columnar partitions

One of the powerful features of Citus columnar is that you can use it to compress the older partitions in a table that is partitioned by time. By keeping recent partitions in row-based storage, you can efficiently perform single-row inserts and updates on recent data, while older partitions that no longer change can use compressed columnar storage to save on storage space.

Prior to Citus 10.2, using columnar storage to compress old partitions of a partitioned table had one caveat: creating indexes on partitioned tables which have columnar partitions was not supported because indexes on columnar tables were not supported. Similarly, converting a partition to columnar storage was not allowed when the partitioned table already had any index defined on it. As of Citus 10.2, this limitation is removed.

For users, this improvement significantly simplifies index management on partitioned tables with columnar partitions. You can simply create an index on the partitioned table, and it will be created on all of its partitions (including future partitions):

CREATE TABLE events(ts timestamptz, i int, n numeric, s text) PARTITION BY RANGE (ts);

-- index on a partitioned table
CREATE INDEX index_on_partitioned_table ON events(n);

-- the new columnar partition will automatically
-- have the index defined on it
CREATE TABLE events_2021_jan PARTITION OF events
  FOR VALUES FROM ('2021-01-01') TO ('2021-02-01') USING COLUMNAR;

\d+ events_2021_jan
                   Table "public.events_2021_jan"
Column |           Type           | Collation | Nullable | Default | Storage  | Stats target | Description
-------+--------------------------+-----------+----------+---------+----------+--------------+-------------
ts     | timestamp with time zone |           |          |         | plain    |              |
i      | integer                  |           |          |         | plain    |              |
n      | numeric                  |           |          |         | main     |              |
s      | text                     |           |          |         | extended |              |

Partition of: events FOR VALUES FROM ('2021-01-01 00:00:00+01') TO ('2021-02-01 00:00:00+01')
Partition constraint: ((ts IS NOT NULL) AND (ts >= '2021-01-01 00:00:00+01'::timestamp with time zone) AND (ts < '2021-02-01 00:00:00+01'::timestamp with time zone))
Indexes:
    "events_2021_jan_n_idx" btree (n)
Access method: columnar

With this improvement, you can compress your old data while also maintaining indexes across all partitions for fast lookups. Do keep in mind that the indexes take up space and are not themselves compressed.
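
If you want to see how much space the indexes add on top of the compressed columnar data, you can compare them with standard Postgres size functions. Here is a minimal sketch against the events_2021_jan partition created above:

-- compressed columnar table data vs. (uncompressed) index size for one partition
SELECT pg_size_pretty(pg_table_size('events_2021_jan'))   AS table_size,
       pg_size_pretty(pg_indexes_size('events_2021_jan')) AS index_size;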

Enable use of constraints on columnar tables

In Postgres (and hence in Citus as well), some constraints are backed by indexes. With index support on columnar tables, Citus 10.2 supports the following constraints on columnar tables: primary keys, unique constraints, and exclusion constraints.

-- Create a columnar table with
--    (a) PRIMARY KEY on c1
--    (b) UNIQUE constraint on c2
--    (c) EXCLUSION constraint on c3
CREATE TABLE columnar_with_constraints (c1 INT PRIMARY KEY, c2 INT UNIQUE, c3 INT,
  EXCLUDE USING btree (c3 WITH =)) USING columnar;

-- violate (a) PRIMARY KEY
INSERT INTO columnar_with_constraints (c1) VALUES (1), (1);
ERROR:  duplicate key value violates unique constraint "columnar_with_constraints_pkey"
DETAIL:  Key (c1)=(1) already exists.

-- violate (b) UNIQUE constraint
INSERT INTO columnar_with_constraints (c1, c2) VALUES (1, 1), (2, 1);
ERROR:  duplicate key value violates unique constraint "columnar_with_constraints_c2_key"
DETAIL:  Key (c2)=(1) already exists.

-- violate (c) EXCLUSION constraint
INSERT INTO columnar_with_constraints (c1, c3) VALUES (1, 1), (2, 1);
ERROR:  conflicting key value violates exclusion constraint "columnar_with_constraints_c3_excl"
DETAIL:  Key (c3)=(1) conflicts with existing key (c3)=(1).

Improve performance for lookup queries on columnar tables

Indexes are useful data structures for query optimization. Postgres typically stores data using the heap access method, which is row-based storage. For row-based tables, having an index almost always speeds up lookup queries. However, there are some caveats to consider for similar workloads on columnar tables. In this section, you’ll learn how indexes behave differently on row-based and columnar tables.

Columnar tables group data together by column instead of by row, and they compress the data, too. Indexes look up individual rows, whereas columnar storage lays out tables column by column. This means that extracting an individual row from a columnar table requires extra processing: for each column in the table, columnar needs to decompress an entire chunk group (10K rows by default) in order to reconstruct a single row, and this may need to be repeated per matched row. So index scans are not as cheap for columnar tables as they are for row-based tables.

In addition to the above, Citus columnar tables internally store data in a way that allows filtering out the data that is not relevant for the query (see the "What is Chunk Group Filtering?" section). The idea is to skip the parts of the data that are not matched by the WHERE clause of the query. If the filters in the query eliminate most of the data, Chunk Group Filtering works efficiently.

As a rule of thumb, lookup queries on columnar tables work as follows:

  • When the column filtered in the WHERE clause is ordered (say, timestamp or auto-incrementing primary key), Chunk Group Filtering is often picked as most of the chunks will be eliminated via Chunk Group Filtering
  • When the column filtered in the WHERE clause is not ordered (say, randomly populated integer/UUID), indexes are often picked as most of the chunks won’t be eliminated via Chunk Group Filtering

-- create a columnar table with two columns
-- and fill the table with data: the first column is loaded with
-- ordered data (e.g., monotonically increasing) and the
-- second column is loaded with random values (e.g., unordered data)
CREATE TABLE columnar_index_test
  (ordered_data INT, unordered_data INT) USING COLUMNAR;
INSERT INTO
  columnar_index_test
SELECT
  i, random() * i FROM generate_series(0, 10000000) i;

-- create an index on the already-ordered column;
-- columnar will not need to use this index
CREATE INDEX
  index_on_ordered_col ON columnar_index_test(ordered_data);

-- and, let Postgres update the statistics about the table
ANALYZE columnar_index_test;

-- show that for a column where the data is ordered, columnar picks
-- chunk group filtering over the index (i.e., index_on_ordered_col is not picked)
EXPLAIN ANALYZE
  SELECT
    count(*)
  FROM
    columnar_index_test
  WHERE
    ordered_data = 1;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=55.78..55.79 rows=1 width=8) (actual time=8.300..8.301 rows=1 loops=1)
   ->  Custom Scan (ColumnarScan) on columnar_index_test  (cost=0.00..55.78 rows=1 width=0) (actual time=0.429..8.295 rows=1 loops=1)
         Filter: (ordered_data = 1)
         Rows Removed by Filter: 9999
         Columnar Projected Columns: ordered_data
         Columnar Chunk Group Filters: (ordered_data = 1)
         Columnar Chunk Groups Removed by Filter: 1000
Planning Time: 0.415 ms
Execution Time: 8.331 ms
(9 rows)

-- for the unordered column, we do not have an index yet;
-- since the matching values are spread across all chunk groups,
-- the query ends up scanning the whole table and is slow
EXPLAIN ANALYZE
  SELECT
    count(*)
  FROM
    columnar_index_test
  WHERE
    unordered_data = 1;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=3737.32..3737.33 rows=1 width=8) (actual time=685.035..685.036 rows=1 loops=1)
   ->  Custom Scan (ColumnarScan) on columnar_index_test  (cost=0.00..3737.31 rows=3 width=0) (actual time=0.705..685.020 rows=19 loops=1)
         Filter: (unordered_data = 1)
         Rows Removed by Filter: 9999982
         Columnar Projected Columns: unordered_data
Planning Time: 0.305 ms
Execution Time: 685.063 ms
(7 rows)

-- now create an index on the unordered column
-- and see that the index is picked during query execution,
-- and that the execution time has decreased dramatically
-- compared to the execution without the index
CREATE INDEX
  index_on_unordered_col ON columnar_index_test(unordered_data);
ANALYZE columnar_index_test;

EXPLAIN ANALYZE
  SELECT
    count(*)
  FROM
    columnar_index_test
  WHERE
    unordered_data = 1;

                                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=116.06..116.07 rows=1 width=8) (actual time=8.349..8.350 rows=1 loops=1)
   ->  Index Only Scan using index_on_unordered_col on columnar_index_test  (cost=0.43..116.05 rows=3 width=0) (actual time=1.590..8.335 rows=19 loops=1)
         Index Cond: (unordered_data = 1)
         Heap Fetches: 19
Planning Time: 0.348 ms
Execution Time: 8.516 ms
(6 rows)

As you can see in the above examples, lookup queries on columnar tables are optimized based on the indexes and the distribution of data.
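
One more knob worth knowing about: the chunk group size mentioned above (10K rows by default) is configurable per table. The sketch below assumes the alter_columnar_table_set() function and its chunk_group_row_limit option as shipped with Citus columnar; check the options available in your Citus version before relying on it.

-- smaller chunk groups mean less data to decompress per matched row,
-- typically at the cost of a somewhat lower compression ratio
SELECT alter_columnar_table_set('columnar_index_test',
                                chunk_group_row_limit => 5000);

Keep in mind that settings like this generally apply to data written after the change, not to stripes that already exist.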

What else is in Citus 10.2?

Postgres 14beta3 support, time partition management, and indexes on Citus columnar tables are only a subset of what Citus 10.2 is all about.

As usual with new Citus releases, there are also several performance improvements; for example, DROP TABLE on distributed, partitioned tables is now significantly faster. Citus 10.2 also adds support for some new DDL commands, such as CREATE INDEX ... ON ONLY and ALTER INDEX ... ATTACH PARTITION.
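
To give a sense of what those two DDL commands look like, here is a minimal sketch using standard Postgres syntax on the partitioned events table from earlier; the index names (events_i_idx, events_2021_jan_i_idx) are just illustrative.

-- create an index on the parent table only, without recursing into partitions
CREATE INDEX events_i_idx ON ONLY events (i);

-- create a matching index on an individual partition
CREATE INDEX events_2021_jan_i_idx ON events_2021_jan (i);

-- attach the partition's index to the parent's partitioned index
ALTER INDEX events_i_idx ATTACH PARTITION events_2021_jan_i_idx;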