Citus 11.0 beta is out! 🐘🦄 Read all about 11.0 beta and then give it a test drive. 🚗
Welcome to release 11.0-beta! the Citus extension to PostgreSQL. Explore these headline features of Citus 11.0-beta:
The easiest way to use Citus is to connect to the coordinator and use it for both schema changes and distributed queries—as most of you have been doing prior to 11.0 beta—but for very demanding applications, you now have the option to load balance distributed queries across the workers in (parts of) your application by using a different connection string and factoring a few limitations.
Load balancing distributed queries across nodes are enabled by automatic metadata syncing to the nodes. All the code changes that enabled automatic metadata syncing is tracked on
#5304. Throughout this release notes, we’ll dive into different parts of the changes.
Citus 11.0 beta changes to a new mode of operation by default: All Citus database clusters always use metadata syncing. That means that with Citus 11.0, you can always run distributed Postgres queries, from any node.
-- create a table, distribute it on the coordinator and -- ingest a 3 rows for demo purposes CREATE TABLE distributed_table(key int, value text); SELECT create_distributed_table('distributed_table' ,'key'); INSERT INTO distributed_table VALUES (1, '1'), (2, '2'), (3, '3'); INSERT 0 3
Now, connect to any node, and run queries:
SELECT count(*) FROM distributed_table; ┌───────┐ │ count │ ├───────┤ │ 3 │ └───────┘ (1 row) INSERT INTO distributed_table VALUES (4, '4'), (5, '5'), (6, '6'); INSERT 0 3
All commands (
TRUNCATE) can be executed from any node on any tables (
reference tables or
citus managed local tables). All DDL commands, like
CREATE INDEX or
create_distributed_table, should still happen via the coordinator.
To enable running queries via any node, Citus propagates all its metadata (
pg_dist_* tables) to the nodes in the database cluster. Some of the metadata tables were already propagated by earlier versions of Citus. Citus 11.0-beta enabled propagation of
If you are upgrading an existing cluster to Citus 11.0 beta, then after installing the new packages, you will need to call a function to finalize the upgrade. This has been implemented on
Note that we do not suggest upgrading production clusters Citus 11.0 beta. We have similar perspective on the beta release as Postgres commnuity. Citus beta releases are not meant for use in production systems.
-- on all nodes ALTER EXTENSION citus UPDATE;
And, then on the coordinator:
-- only on the coordinator select citus_finalize_upgrade_to_citus11();
The upgrade function will make sure that all worker nodes have the right schema and metadata. It also addresses several naming issues that affected shards of partitioned tables.
If there is anything preventing metadata from being synchronized (e.g. missing permissions on worker nodes, or conflicting objects) the upgrade function will throw an error. You can still use your existing Citus database cluster via the coordinator until you address the issue and finalize the upgrade, but some new 11.0-beta functionality will not be available.
If you need to perform many queries per second on a distributed PostgreSQL database, it may be necessary to use all nodes in the cluster for running distributed queries. To do that, you need a relatively large number of connections.
When your application opens a connection to one of the Citus nodes, that connection spawns a Postgres process, which will need to make internal connections to other nodes to perform distributed queries. If you connect directly to the worker nodes for distributed queries, your client connections are in a sense competing with internal connections. To make sure that every Postgres process resulting from a client connection can also make internal connections to all other nodes, we added the
citus.max_client_connections setting via
#5698. This setting restricts the number of (external) client connections, while continuing to allow internal connections between Citus nodes. In addition to the usual installation instructions, we recommend adding the following settings to
postgresql.conf on each Citus node (coordinator and all workers) to accommodate a large number of client connections:
# The maximum number of client + internal connections a node can handle # The total number of client connections across all nodes should never exceed this number max_connections = 6000 # The number of client connections an individual node can handle # Should be no greater than: max_connections / node count including the coordinator citus.max_client_connections = 500
With these settings, each node will accept up to 500 connections from your application, so if you have 10 worker nodes and 1 coordinator your application can make 5500 connections in total. You can increase this number further by using a connection pooler like pgbouncer on each node.
We also highly recommend adding the Citus coordinator to the metadata, such that worker nodes can connect to the coordinator as well. Some Citus features will only be available if the coordinator is in the metadata. We will likely make adding the coordinator required in the future.
When there is a single node in the cluster for running distributed queries, the cluster insights can be extracted pretty much simiilar to single node Postgres. You can use
pid(process id) to gain an insight of the current activity. Similarly, you can use
pg_terminate_backend to have a further control over the activities.
When you use multiple nodes for querying the Citus cluster, gaining insights could become more complicated. We introduced several tools to alleviate potential difficulties the users can have.
citus_check_cluster_node_health()UDF: Any node should be able to connect to any other node in the cluster. You can use
citus_check_cluster_node_health()UDF to check if there are any connectivity issues.
from_nodename | from_nodeport | to_nodename | to_nodeport | result -------------------------------------------------------------------- localhost | 9700 | localhost | 9700 | t localhost | 9700 | localhost | 9701 | t localhost | 9701 | localhost | 9700 | t localhost | 9701 | localhost | 9701 | t (4 rows)
citus_backend_gpid()UDF: Any process that initiates a distributed query could create some internal connections. With Citus 11.0-beta, we introduce
gpid), which uniquely identifies a client session and all internal connections associated with the session via
gpidstarts with the node ID of the node that initiated the query (= the node that the client connected to).
select citus_backend_gpid(); ┌────────────────────┐ │ citus_backend_gpid │ ├────────────────────┤ │ 10000059518 │ └────────────────────┘ (1 row)
Note that the internal connections also gets the same unique global pid of the session that initiated the distributed query.
select key, citus_backend_gpid() FROM distributed_table; ┌─────┬────────────────────┐ │ key │ citus_backend_gpid │ ├─────┼────────────────────┤ │ 1 │ 10000059518 │ │ 5 │ 10000059518 │ │ 4 │ 10000059518 │ │ 3 │ 10000059518 │ │ 6 │ 10000059518 │ │ 2 │ 10000059518 │ └─────┴────────────────────┘ (6 rows)
citus_stat_activityView: Shows the information from
pg_stat_activityfrom all client sessions across all nodes, along with a
gpid). It includes all the internal backends as well. This change is introduced via
SELECT global_pid, query, state from citus_stat_activity WHERE query ilike 'INSER%' and global_pid != citus_backend_gpid(); ┌─────────────┬────────────────────────────────────────────────────────────────────────────────┬─────────────────────┐ │ global_pid │ query │ state │ ├─────────────┼────────────────────────────────────────────────────────────────────────────────┼─────────────────────┤ │ 80000062731 │ INSERT INTO distributed_table VALUES (1, '1'); │ idle in transaction │ │ 80000062731 │ INSERT INTO public.distributed_table_102009 (key, value) VALUES (1, '1'::text) │ idle in transaction │ └─────────────┴────────────────────────────────────────────────────────────────────────────────┴─────────────────────┘ (2 rows)
citus_dist_stat_activityView: Similar to the
citus_stat_activitybut excludes the internal backends. Most of the time, this is what you would probably need. This change is introduced via
SELECT global_pid, query, state from citus_dist_stat_activity WHERE query ilike 'INSER%' and global_pid != citus_backend_gpid(); ┌─────────────┬────────────────────────────────────────────────┬─────────────────────┐ │ global_pid │ query │ state │ ├─────────────┼────────────────────────────────────────────────┼─────────────────────┤ │ 80000062731 │ INSERT INTO distributed_table VALUES (1, '1'); │ idle in transaction │ └─────────────┴────────────────────────────────────────────────┴─────────────────────┘ (1 row)
citus_lock_waitsView: In single node Postgres, there is a useful query which shows blocking/blocked activities. At Citus, we had already provided
citus_lock_waitswhich gives the same information, but considering all the activity across the cluster. With Citus 11.0-beta, we improved this view to show any blocking activitiy and with
gpids attached. This change is introduced via
SELECT * FROM citus_lock_waits; ┌─[ RECORD 1 ]──────────────────────────┬────────────────────────────────────────────────┐ │ waiting_gpid │ 10000062689 │ │ blocking_gpid │ 80000062731 │ │ blocked_statement │ TRUNCATE distributed_table; │ │ current_statement_in_blocking_process │ INSERT INTO distributed_table VALUES (1, '1'); │ │ waiting_nodeid │ 1 │ │ blocking_nodeid │ 8 │ └───────────────────────────────────────┴────────────────────────────────────────────────┘ Time: 76.163 ms
pg_terminate_backend(): Postgres allows cancelling and/or terminating processes with a
pid. On a distributed Citus cluster, we want to preserve the same ability with as little effort as possible. Given that we have
global pids, we overriden these UDFs to accept
gpids. For simplicty, below we show using the current backends
gpidwith the UDFs. In general, you can pass
gpids from any node.
-- cancel own backend via gpid SELECT pg_cancel_backend(citus_backend_gpid()); ERROR: canceling statement due to user request -- terminatee own backend via global_pid SELECT pg_terminate_backend(citus_backend_gpid()); FATAL: terminating connection due to administrator command server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded.
These changes are introduced in
Pull Request #5699.
With Citus 11.0-beta, users can connect to any node to query their distributed tables. We are getting reports from users who are understandbly confused by seeing Citus-generated shards in their schemas next to their distributed tables when inspecting the database with tools like pgAdmin.
To avoid the confusion, Citus 11.0-beta by default hides all the shards from all external applications. This is controlled by a GUC
This change is introduced via
Pull request #5567.
-- * means that hide shards from all applications show citus.hide_shards_from_app_name_prefixes; ┌──────────────────────────────────────────┐ │ citus.hide_shards_from_app_name_prefixes │ ├──────────────────────────────────────────┤ │ * │ └──────────────────────────────────────────┘ (1 row) \d List of relations ┌────────┬───────────────────┬───────┬─────────────┐ │ Schema │ Name │ Type │ Owner │ ├────────┼───────────────────┼───────┼─────────────┤ │ public │ citus_tables │ view │ onderkalaci │ │ public │ distributed_table │ table │ onderkalaci │ └────────┴───────────────────┴───────┴─────────────┘ (2 rows) -- now configure the GUC so that it only hides shards from a specific application -- and then, on psql -- where we run the test -- we can see the shards SET citus.hide_shards_from_app_name_prefixes TO 'non_existing_app'; \d List of relations ┌────────┬──────────────────────────┬───────┬─────────────┐ │ Schema │ Name │ Type │ Owner │ ├────────┼──────────────────────────┼───────┼─────────────┤ │ public │ citus_tables │ view │ onderkalaci │ │ public │ distributed_table │ table │ onderkalaci │ │ public │ distributed_table_102008 │ table │ onderkalaci │ │ public │ distributed_table_102015 │ table │ onderkalaci │ │ public │ distributed_table_102022 │ table │ onderkalaci │ │ public │ distributed_table_102029 │ table │ onderkalaci │ │ public │ distributed_table_102036 │ table │ onderkalaci │ └────────┴──────────────────────────┴───────┴─────────────┘ (7 rows)
Note that, if you want to make all the shards visible, you can do that via
-- do not hide any tables (a.k.a., shards) SET citus.override_table_visibility TO off; \d List of relations ┌────────┬──────────────────────────┬───────┬─────────────┐ │ Schema │ Name │ Type │ Owner │ ├────────┼──────────────────────────┼───────┼─────────────┤ │ public │ citus_tables │ view │ onderkalaci │ │ public │ distributed_table │ table │ onderkalaci │ │ public │ distributed_table_102008 │ table │ onderkalaci │ │ public │ distributed_table_102015 │ table │ onderkalaci │ │ public │ distributed_table_102022 │ table │ onderkalaci │ │ public │ distributed_table_102029 │ table │ onderkalaci │ │ public │ distributed_table_102036 │ table │ onderkalaci │ └────────┴──────────────────────────┴───────┴─────────────┘ (7 rows)
Starting from Citus 9.0, Citus automatically manages some of the database objects. In other words, users only need to take care of creating the objects on the coordinator, and the rest is handled by Citus. The objects are automatically created on the worker nodes.
In Citus 11.0-beta release, we expanded the supported objects by Citus. Before getting into the details of the new object support, let us discuss
citus.create_object_propagation GUC to control DDL creation behaviour in transactions.
In a distributed database we often need to make trade-offs between consistency, fault tolerance, parallelism, and other distributed systems aspects. Citus needs to support PostgreSQL’s interactive, multi-statement transaction blocks, which is particularly challenging in a distributed setting.
For example, Citus typically parallelizes expensive operations across shards—such as analytical queries and
create_distributed_table() over multiple connections per worker. When creating a database object, Citus propagates it to worker nodes over a single connection per worker. Combining those two operations in a single multi-statement transaction may cause issues because the parallel connections will not be able to see the object that was created over a single connection but not yet committed.
Consider a transaction block that creates a type, a table, loads data, and distributes the table:
BEGIN; -- type creation over a single connection: CREATE TYPE coordinates AS (x int, y int); CREATE TABLE positions (object_id text primary key, position coordinates); -- data loading thus goes over a single connection: SELECT create_distributed_table(‘positions’, ‘object_id’); \COPY positions FROM ‘positions.csv’ …
Previously, Citus would defer creating the type on the worker nodes and commit it separately when doing the
create_distributed_table. This enabled the data copying in
create_distributed_table to happen in parallel. However, it also meant that the type was not always present on the Citus worker nodes, or if the transaction rolled back, it would be present only on the worker nodes. We were able to hide these inconsistencies, but eventually they could cause issues.
With Citus 11.0 beta the default behaviour changes to prioritize schema consistency between coordinator and worker nodes. That does have a downside: If object propagation happens after a parallel command, it can no longer be completed, as highlighted by the ERROR in the code block below:
BEGIN; CREATE TABLE items (key text, value text); -- parallel data loading: SELECT create_distributed_table(‘items’, ‘key’); \COPY items FROM ‘items.csv’; CREATE TYPE coordinates AS (x int, y int); ERROR: cannot run type command because there was a parallel operation on a distributed table in the transaction DETAIL: When running command on/for a distributed type, Citus needs to perform all operations over a single connection per node to ensure consistency. HINT: Try re-running the transaction with "SET LOCAL citus.multi_shard_modify_mode TO 'sequential';"
If you run into this issue, there are 2 simple workarounds: * Use
set citus.create_object_propagation to deferred; to return to the old object propagation behavior * Use
set citus.multi_shard_modify_mode to sequential to disable per-node parallelism. Data load in the same transaction might be slower
Several database objects, such as
CREATE TYPE or
CREATE EXTENSION has already been managed by Citus. In this release, we add several more objects that are automatically managed by Citus, so that you don’t have to take any manual steps to manage the objects across the cluster.
ALTER FUNCTION ... SUPPORT ...
TEXT SEARCH CONFIGURATIONobjects
TEXT SEARCH DICTIONARYobjects
CREATE SCHEMA .. GRANT ON SCHEMA ..commands
CREATE SCHEMA AUTHORIZATIONstatements without schema name
Prior to Citus 11.0-beta, Citus allowed objects in a way that could cause problems when the metadata syncing is enabled (a.k.a., Citus 11.0-beta). In other words, when metadata syncing is enabled, we cannot afford to have any dependencies that Citus cannot sync to the nodes. With Citus 11.0-beta, the following two cases are prevented:
Although not very commonly used, Postgres allows to create circular dependencies between objects. Syncing such objects to remote nodes are currently unsupported.
-- Show that causing circular dependency via functions and default values are not allowed CREATE TABLE table_1_for_circ_dep(id int); SELECT create_distributed_table('table_1_for_circ_dep','id'); CREATE OR REPLACE FUNCTION func_1_for_circ_dep(col_1 table_1_for_circ_dep) RETURNS int LANGUAGE plpgsql AS $$ BEGIN return 1; END; $$; CREATE TABLE table_2_for_circ_dep(id int, col_2 int default func_1_for_circ_dep(NULL::table_1_for_circ_dep)); SELECT create_distributed_table('table_2_for_circ_dep','id'); CREATE OR REPLACE FUNCTION func_2_for_circ_dep(col_3 table_2_for_circ_dep) RETURNS int LANGUAGE plpgsql AS $$ BEGIN return 1; END; $$; -- It should error out due to circular dependency ALTER TABLE table_1_for_circ_dep ADD COLUMN col_2 int default func_2_for_circ_dep(NULL::table_2_for_circ_dep); ERROR: Citus can not handle circular dependencies between distributed objects DETAIL: "table table_1_for_circ_dep" circularly depends itself, resolve circular dependency first
Even if an object cannot be distributed by Citus, it can still reside on the coordinator as a local object. However, as soon as any such object is used as a dependency of a distributed object, Citus throws an error.
-- see https://github.com/citusdata/citus/blob/76f249a05a3b2d5a77fd79e23fd62fa85bb49b2c/src/test/regress/expected/pg13.out#L143-L167 -- to reproduce this example. This is only a simplified version. The main point to keep in mind is that -- my_table cannot be distributed -- myvarchar is a shell type that Citus does not support CREATE TYPE myvarchar; .... -- create a table that depends on unsupported object (e.g., myvarchar) CREATE TABLE my_table (a int, b myvarchar); SELECT create_distributed_table('my_table', 'a'); ERROR: Relation "my_table" has dependency on unsupported object "type myvarchar”
Under the hood, Citus can convert local Postgres tables on the coordinator to “citus managed local tables”. This is nothing but converting the Postgres table to a single shard Citus table sitting on the coordinator, and record the table on the Citus metadata. With Citus 11.0-beta, as the metadata is synced, “citus managed local tables” are also synced. Meaning that, those tables can be accessed / used in the queries via the worker nodes as well, although the data sits on the coordinator.
Prior to Citus 11.0-beta, this feature was automatically handled by Citus when a foreign key is formed between a reference table and local Postgres tables. With Citus 11.0-beta, as this concept is also useful for accessing these tables on the worker nodes, we added some explicit ways of enabling the feature.
To do that, the pre-requisite is to add the coordinator to the metadata.
-- add the coordinator to the metadata -- make sure to give the correct hostname / port of your enviroment SELECT citus_set_coordinator_host('10.0.0.1', 5432);
Later, Citus 11.0-beta offers two seperate ways to convert local tabes to citus managed tables.
First, all new local tables are automatically added to the metadata with
Pull request #5440:
-- change use_citus_managed_tables permenantly ALTER SYSTEM SET citus.use_citus_managed_tables TO true; SELECT pg_reload_conf(); -- create a local table, and see that it is on the Citus metadata CREATE TABLE local(a int); SELECT * FROM pg_dist_partition WHERE logicalrelid = 'local'::regclass; ┌──────────────┬────────────┬─────────┬──────────────┬──────────┬───────────────┐ │ logicalrelid │ partmethod │ partkey │ colocationid │ repmodel │ autoconverted │ ├──────────────┼────────────┼─────────┼──────────────┼──────────┼───────────────┤ │ local │ n │ │ 0 │ s │ f │ └──────────────┴────────────┴─────────┴──────────────┴──────────┴───────────────┘ (1 row)
Second, you can selectively decide which local tables to become Citus managed tables via
-- create a local table, and see that it is on the Citus metadata CREATE TABLE local(a int); select citus_add_local_table_to_metadata('local'); ┌───────────────────────────────────┐ │ citus_add_local_table_to_metadata │ ├───────────────────────────────────┤ │ │ └───────────────────────────────────┘ (1 row) SELECT * FROM pg_dist_partition WHERE logicalrelid = 'local'::regclass; ┌──────────────┬────────────┬─────────┬──────────────┬──────────┬───────────────┐ │ logicalrelid │ partmethod │ partkey │ colocationid │ repmodel │ autoconverted │ ├──────────────┼────────────┼─────────┼──────────────┼──────────┼───────────────┤ │ local │ n │ │ 0 │ s │ f │ └──────────────┴────────────┴─────────┴──────────────┴──────────┴───────────────┘ (1 row)
In any case, if you connect to the worker nodes, you can still access to the local tables.
# 9700 is a worker node port psql postgres -p 9700 -c "SELECT count(*) FROM local" ┌───────┐ │ count │ ├───────┤ │ 0 │ └───────┘ (1 row)
In general, our design policy for “citus managed local tables” is that they become indistinguishable from Postgres tables. In this release, we added support for partitioning. We track all the remaining features on this issue.
Like PostgreSQL, Citus maintains long-term backwards compatibility. We go through great lengths to make sure your applications continue working when upgrading Citus. However, sometimes a feature is no longer in line with how Citus is used and gets in the way of development. There are a few Citus features which we decided to remove in 11.0 beta:
Drops support for PostgreSQL 12: Our soft policy for Postgres version compatibilty is to support Citus’ latest release with Postgres 2 latest releases. In other words, Citus 11.0-beta supports Postgres 13 and Postgres 14.
Invalid shard placements: Shards are no longer marked invalid when a write fails, since this behavior had some flaws and leads limitations when metadata synced. This work involved several changes:
Pull request #5379: When replication factor > 1, all modifications are done via 2PC
Pull request #5380: Drop support for citus.multishardcommit_protocol
Pull request #5381: Drop support Inactive Shard States
Pull request #5392: Enable shard replication > 1 hash distributed tables on metadata synced clusters
Pull request #5470: Active placements can only be on active nodes
Pull request #5486: Allow disabling node(s) when multiple failures happen
Append-distributed table functions: The original method of distribution in Citus was “append” distribution, which was optimized for append-only data. Hash-distributed tables are a lot easier to use and have many more capabilities and can also handle append-only data very well through partitioning. Citus 11.0 beta removes the functions for creating shards and loading new data into append-distributed tables. We are not aware of any append-distributed table users, but just in case: you can still upgrade to 11.0 beta, but the tables will become read only. We recommend creating new distributed tables that use the default hash-distribution.
Pull request #5690: Remove citus.shardplacementpolicy
Pull request #5481: Stop updating shard range in citusupdateshard_statistics for append tables
Pull request #5444: Remove masterappendtabletoshard UDF
Pull request #5389: Deprecate mastergettable_metadata UDF
Pull request #5399: Remove copy into new append shard logic
Pull request #5361: Remove masterapplydelete_command UDF
Pull request #5359: Disable co-located and single-repartition joins for append tables
cstore_fdw tables (should switch to columnar access method) Since version 10.0, Citus comes with built-in columnar storage. Prior to Citus 10.0, it was possible to use Citus with columnar storage using the now deprecated cstore_fdw extension.
cstore_fdw did not support important PostgreSQL features like streaming replication and backups, hence we rarely saw Citus customers use columnar storage before Citus 10. Many companies are now successfully using Citus’ built-in columnar storage for time series data, and we are therefore dropping support for creating or using distributed
cstore_fdw tables. If you have distributed
cstore_fdw tables, we recommend converting them to the columnar access method before upgrading to 11.0 beta.
This change is introduced via
In order to simplify the user experience, we dropped support for foreign distributed or reference tables via
Pull request #5605. Instead, we enabled “citus managed local tables” to become foreign tables that can be accessed from any node via
Pull request #5461. Previously,
TRUNCATE command was also not supported. As of Citus 11.0-beta, Citus supports [
TRUNCATE] on foreign tables via
Pull request #5710.
Note that you can still upgrade the cluster with existing distributed or reference foreign tables, but you cannot create new such tables.
Pull request #5851: Drops
citus.enable_cte_inliningGUC, which is only useful for Postgres 11
Pull request #5525: Removes support for dropping distributed and local indexes in the same statement, which doesn’t work well on metadata synced clusters
Pull request #5678: Replaces
citus.enable_metadata_sync, mostly a cosmetic change to have a broader name
Pull request #5609: Requires superuser for
citus_activate_node()UDFs, you could have given certain privileges for non-superusers to add/activate nodes. For simplicity, always require superuser
Pull request #5753: Disallows unsupported lateral subqueries on distributed tables, we realized that when reference tables are involved in certain complex lateral joins, Citus might return wrong results. Until we properly support such queries, we prevent users getting wrong results.
Pull request #5400: A major improvement for repartition joins, fixing several issues
Pull request #5654: Fixes a bug that causes columnar storage pages to have zero LSN