Tenant monitoring in Citus & Postgres with citus_stat_tenants

Written by Halil Ozan Akgul
May 12, 2023

If you have ever used a database like Postgres, you know how important optimization is. Some minor changes in how the database is setup make all the difference between long waiting times and satisfied customers. And one crucial thing you need before doing the optimization is to monitor and understand how your database is being used.

Citus is an extension to Postgres that improves scalability and parallelization by distributing your Postgres database across nodes in a cluster. The Citus database extension is available as open source and as a managed service on the cloud, as Azure Cosmos DB for PostgreSQL. You can track your Citus nodes and the Postgres tables, but Citus 11.3 takes it one step further and introduces a new way to gather insight on your Citus database with tenant monitoring.

The new Citus 11.3 release, among many other features, introduces a new citus_stat_tenants view to track your most active tenants, for those with multi-tenant SaaS applications.

In a multi-tenant SaaS application, the same database stores data from the multiple customers of the application. Each of the customers are often referred as tenants. Usually the data for each tenant is handled separately. When you distribute your Postgres table using the Citus create_distributed_table function, every partition key value represents a tenant.

With the new citus_stat_tenants view you can track:

  • read query count, SELECT queries,
  • total query count, SELECT, INSERT, DELETE, and UPDATE queries,
  • total CPU usage in seconds

In this post, you’ll learn how to monitor your top tenants to make more-informed decisions on your database—and you’ll learn how to configure citus_stat_tenants to best fit your application. This post includes a quickstart and code examples. Let’s explore:

Monitor your top tenants with citus_stat_tenants

Let us say you have a multi-tenant app with Citus database, similar to the one on our Citus 11.3 blog post, and your customers are companies who have some ad campaigns.

CREATE TABLE companies (id BIGSERIAL, name TEXT);
SELECT create_distributed_table ('companies', 'id');

CREATE TABLE campaigns (id BIGSERIAL, company_id BIGINT, name TEXT);
SELECT create_distributed_table ('campaigns', 'company_id');

Each of the companies will be a tenant and the company ids are the tenant ids, or the tenant attributes.

Now we can run some queries to track with the citus_stat_tenants view.

Tip: You need to set the citus.stat_tenants_track to all in all your nodes to be able to track the statistics. You can put the setting in the postgresql.conf file.

Let us run some queries.

INSERT INTO companies (name) VALUES ('GigaMarket');
INSERT INTO campaigns (company_id, name) VALUES (1, 'Crazy Wednesday'), (1, 'Frozen Food Frenzy');
INSERT INTO campaigns (company_id, name) VALUES (1, 'Spring Cleaning'), (1, 'Bread&Butter');
INSERT INTO campaigns (company_id, name) VALUES (1, 'Personal Care Refresh'), (1, 'Lazy Lunch');

INSERT INTO companies (name) VALUES ('White Bouquet Flowers');
INSERT INTO campaigns (company_id, name) VALUES (2, 'Bonjour Begonia'), (2, 'April Selection'), (2, 'May Selection');

INSERT INTO companies (name) VALUES ('Smart Pants Co.');
INSERT INTO campaigns (company_id, name) VALUES (3, 'Short Shorts'), (3, 'Tailors Cut');
INSERT INTO campaigns (company_id, name) VALUES (3, 'Smarter Casual');

SELECT COUNT(*) FROM campaigns WHERE company_id = 1;
count
-------
     6
(1 row)
SELECT name FROM campaigns WHERE company_id = 2 AND name LIKE '%Selection';
      name
-----------------
 April Selection
 May Selection
(2 rows)
UPDATE campaigns SET name = 'Tailor''s Cut' WHERE company_id = 3 AND name = 'Tailors Cut';

Now you can check citus_stat_tenants to see the statistics.

SELECT tenant_attribute,
       read_count_in_this_period,
       query_count_in_this_period,
       cpu_usage_in_this_period
FROM citus_stat_tenants;
tenant_attribute | read_count_in_this_period | query_count_in_this_period | cpu_usage_in_this_period
------------------+---------------------------+----------------------------+--------------------------
 1                |                         1 |                          5 |                 0.000299
 3                |                         0 |                          4 |                 0.000314
 2                |                         1 |                          3 |                 0.000295
(3 rows)

Now you have insight on your tenants’ activities.

Monitoring tenants in real time

Activity trends of your tenants might change in time. A new tenant could be much more active than a year old one. Instead of tracking the activity of your tenants since they entered the database, citus_stat_tenants monitors them within time buckets. Each time period’s query and CPU statistics are counted separately. Once a period ends, that period’s numbers are finalized and only stored for one more period. That means citus_stat_tenants only shows the current and the last period’s statistics.

A period is 60 seconds by default, and you can change it with citus.stat_tenants_period parameter.

Let us say you waited 60 seconds (1 period) and ran new queries.

SELECT companies.name company, campaigns.name campaign
FROM companies JOIN campaigns ON companies.id = campaigns.company_id
WHERE companies.id = 1 AND campaigns.name LIKE '%zy%';
  company   |      campaign
------------+--------------------
 GigaMarket | Crazy Wednesday
 GigaMarket | Frozen Food Frenzy
 GigaMarket | Lazy Lunch
(3 rows)
DELETE FROM campaigns WHERE company_id = 2 AND name = 'April Selection';

If you check the citus_stat_tenants view you can see something like:

SELECT * FROM citus_stat_tenants WHERE tenant_attribute = '1' OR tenant_attribute = '2';
-[ RECORD 1 ]--------------+---------
nodeid                     | 2
colocation_id              | 1
tenant_attribute           | 1
read_count_in_this_period  | 1
read_count_in_last_period  | 1
query_count_in_this_period | 1
query_count_in_last_period | 5
cpu_usage_in_this_period   | 0.000054
cpu_usage_in_last_period   | 0.000299
-[ RECORD 2 ]--------------+---------
nodeid                     | 1
colocation_id              | 1
tenant_attribute           | 2
read_count_in_this_period  | 0
read_count_in_last_period  | 1
query_count_in_this_period | 1
query_count_in_last_period | 3
cpu_usage_in_this_period   | 0.000132
cpu_usage_in_last_period   | 0.000295

Note that in addition to query and CPU usage, citus_stat_tenants also includes these columns:

  • nodeid: id of the Citus node the tenant lives in, can also be seen in pg_dist_node
  • colocation_id: Colocation group id for the distributed table the tenant is from
  • tenant_attribute: id of the tenant (the partition key value)

If you want to create graphs and dashboards with citus_stat_tenants, you can use the “last period” columns for every period. Because “last period” columns have values that cannot be changed, the values in the “last period” columns will be removed after the next period is over.

Insights on the most active tenants (versus on all tenants)

The new tenant monitoring feature of Citus 11.3 is not meant for investigating each and every tenant in your Citus database.

Rather, the new citus_stat_tenants is designed to help you gather information on your most active tenants. The tenant monitoring view will store and show the tenants that ran the most queries the most recently.

If a tenant in your Citus database stops running queries on your distributed Postgres tables, that tenant’s statistics will eventually be removed from the monitor to open space for the more active tenants.

By default, citus_stat_tenants will show 100 tenants. You can change the number of tenants stored with citus.stat_tenants_limit parameter.

Citus stores extra data to ensure consistency of tenant-level statistics

The tenant monitor will show the top citus.stat_tenants_limit number of tenants and only for the current and the last period. But to make the monitor and the statistics consistent and more useful, in the background, citus_stat_tenants stores some more tenants and some extra data related to recency and the number of queries. Citus stores extra data is to make sure scenarios like “an active tenant dropping out of the monitor list, just because the tenant didn’t run queries during one period” does not happen.

Local stats from Citus worker nodes

You can query citus_stat_tenants from any node and the view will show the statistics from all the cluster. However, the tracking of the data is done locally on the Citus node where the tenant resides in.

You might not be interested in the whole Citus cluster’s data, if:

  • you plan to create a set of graphs for each of your Citus nodes,
  • you used the tenant isolation feature and are not particularly interested in the isolated tenants’ usages, or
  • you have a specific Citus node which you are trying to optimize

If you just want to get the monitoring data from the node you are connected to, you can use the citus_stat_tenants_local view.

From Citus worker node with node id 1:

SELECT * FROM citus_stat_tenants_local;
-[ RECORD 1 ]--------------+--
colocation_id              | 1
tenant_attribute           | 2
read_count_in_this_period  | 0
read_count_in_last_period  | 1
query_count_in_this_period | 1
query_count_in_last_period | 3
cpu_usage_in_this_period   | 0.000132
cpu_usage_in_last_period   | 0.000295

Note that you can only see the tenants whose shards are in the Citus node with node id 1 and you no longer have the nodeid column that you had in citus_stat_tenants view.

Optional choice to clear tenant statistics history

The citus_stat_tenants shows the statistics for one period. But if you want to reset the monitor and clear all the data you can use the citus_stat_tenants_reset function.

SELECT citus_stat_tenants_reset();
SELECT COUNT(*) FROM citus_stat_tenants;
count
-------
     0
(1 row)

There is also a function for cleaning local monitor data of a single Citus node, citus_stat_tenants_reset_local(). Keep in mind that resetting the data for only one Citus node might create inconsistent results across the cluster, so for most cases you should not need the local reset function.

Getting started with Citus & Citus tenant monitoring for multi-tenant SaaS

At Citus we try make your database faster every day, which includes giving you the tools to optimize your database performance yourself.

Tenant monitoring is created with this thought in mind. Now that you have the tools to learn more about the activity and usage statistics for the top tenants in your SaaS application, you can make more-informed decisions on how to optimize your Citus database and cluster.

To learn more about Citus 11.3 and tenant monitoring:

And to get started with Citus:

Halil Ozan Akgul

Written by Halil Ozan Akgul

Computer engineer, Citus developer, Postgres lover, piano player, plant grower—all while working at Microsoft.

halilozanakgul