Citus Blog

Articles tagged: analytics

Citus scales out Postgres for a number of different use cases, both as a system of record and as a system of engagement. One use case we're seeing implemented a lot these days: using the Citus database to power customer-facing real-time analytics dashboards, even when dealing with billions of events per day. Dashboards and pipelines are easy to handle when you’re at 10 GB of data, as you grow even basic operations like a count of unique users require non-trivial engineering work to get performing well.

Citus is a good fit for these types of event dashboards because of Citus’ ability to ingest large amounts of data, to perform rollups concurrently, to mix both raw unrolled-up data with pre-aggregated data, and finally to support a large number of concurrent users. Adding all these capabilities together, the Citus extension to Postgres works well for end users where a data warehouse may not work nearly as well. We've talked some here about various parts of building a real-time customer facing dashboard, but today we thought we'd go one step further and give you a guide for doing it end to end.

Keep reading
Ozgun Erdogan

Citus' Replication Model: Today and Tomorrow

Written byBy Ozgun Erdogan | December 15, 2016Dec 15, 2016

Citus is a distributed database that extends (not forks) PostgreSQL. Citus does this by transparently sharding database tables across the cluster and replicating those shards.

After open sourcing Citus, one question that we frequently heard from users related to how Citus replicated data and automated node failovers. In this blog post, we intend to cover the two replication models available in Citus: statement-based and streaming replication. We also plan to describe how these models evolved over time for different use cases.

Keep reading
Marco Slot

Real-time event aggregation at scale using Postgres w/ Citus

Written byBy Marco Slot | November 29, 2016Nov 29, 2016

Citus is commonly used to scale out event data pipelines on top of PostgreSQL. Its ability to transparently shard data and parallelise queries over many machines makes it possible to have real-time responsiveness even with terabytes of data. Users with very high data volumes often store pre-aggregated data to avoid the cost of processing raw data at run-time. With Citus 6.0 this type of workflow became even easier using a new feature that enables pre-aggregation inside the database in a massively parallel fashion using standard SQL. For large datasets, querying pre-computed aggregation tables can be orders of magnitude faster than querying the facts table on demand.

Keep reading

Page 2 of 2