Citus Blog

Articles tagged: software engineering

Claire Giordano

UK COVID-19 dashboard built using Postgres and Citus for millions of users

Written byBy Claire Giordano & Pouria Hadjibagheri | December 11, 2021Dec 11, 2021

From the beginning of the COVID-19 pandemic, the United Kingdom (UK) government has made it a top priority to track key health metrics and to share those metrics with the public.

And the citizens of the UK were hungry for information, as they tried to make sense of what was happening. Maps, graphs, and tables became the lingua franca of the pandemic. As a result, the GOV.UK Coronavirus dashboard became one of the most visited public service websites in the United Kingdom.

The list of people who rely on the UK Coronavirus dashboard is quite long: government personnel, public health officials, healthcare employees, journalists, and the public all use the same service.

Keep reading

When working on the internals of Citus, an open source extension to Postgres that transforms Postgres into a distributed database, we often get to talk with customers that have interesting challenges you won’t find everywhere. Just a few months back, I encountered an analytics workload that was a really good fit for Citus.

But we had one problem: the percentile calculations on their data (over 300 TB of data) could not meet their SLA of 30 seconds.

To make things worse, the query performance was not even close to the target: the percentile calculations were taking about 6 minutes instead of the required 30 second SLA.

Keep reading

Many of you rely on databases to return correct results for your SQL queries, however complex your queries might be. And you probably place your trust with no questions asked—since you know relational databases are built on top of proven mathematical foundations, and since there is no practical way to manually verify your SQL query output anyway.

Since it is possible that a database’s implementation of the SQL logic could have a few errors, database developers apply extensive testing methods to avoid such flaws. For instance, the Citus open source repo on GitHub has more than twice as many lines related to automated testing than lines of database code. However, checking correctness for all possible SQL queries is challenging because of the lack of a “ground truth” to compare their outputs against, and the infinite number of possible SQL queries.

Keep reading

Over the last two years, our engineering team at Citus Data has shortened release cycles from 12 months all the way down to 8 weeks. The most recent 7.2 release of the Citus database took 8 weeks exactly, start to finish.

These shortened release cycles have been chock full of new capabilities for our users, including distributed deadlock detection in Citus 7.0, multi-shard updates and deletes in Citus 7.1, and support for CTE’s (common table expressions) and complex Postgres subqueries in Citus 7.2.

On the Citus Cloud side (that’s our fully-managed database as a service that runs on AWS), we’ve recently added fork, followers, fully-online “warp” migration from existing PostgreSQL installations, and point-in-time-recovery (PITR), just to name a few.

When I step back to think about how we got here (as a co-founder of Citus Data, I’ve been here since the beginning), it’s no surprise that I attribute much of what we’ve accomplished to our team. But here’s the point about all these accomplishments that I think is so interesting: our engineering team is distributed across 5 countries and 6 different cities.

Keep reading
Metin Doslu

Linux memory manager and your big data

Written byBy Metin Doslu | November 23, 2013Nov 23, 2013

Disclaimer: We always assume that when we have an issue and think it’s the operating system, 99% of the time, it turns out to be something else. We therefore caution against assuming that the problem is with your operating system, unless your use-case...

Keep reading

Page 1 of 1