Citus Blog

Articles tagged: AzureDBPostgres

Claire Giordano

What’s new with Postgres at Microsoft (August 2023)

Written byBy Claire Giordano | August 31, 2023Aug 31, 2023

On one of the Postgres community chat forums, a friend asked me: "Is there a blog post that outlines all the work that is being done on Postgres at Microsoft? It's hard to keep track these days."

And my friend is right: it is hard to keep track. Probably because there are multiple Postgres workstreams at Microsoft, spread across a few different teams.

In this post, you'll get a bird's eye view of all the Postgres work the Microsoft team has done over the last year. Our work includes some pretty significant improvements to the Postgres managed services on Azure, as well as contributions across the entire open source ecosystem—including commits to the Postgres core; new releases to Postgres open source extensions like Citus and pg_cron; plus ecosystem work on Patroni, PgBouncer, pgcopydb. And more.

Keep reading
Claire Giordano

UK COVID-19 dashboard built using Postgres and Citus for millions of users

Written byBy Claire Giordano & Pouria Hadjibagheri | December 11, 2021Dec 11, 2021

From the beginning of the COVID-19 pandemic, the United Kingdom (UK) government has made it a top priority to track key health metrics and to share those metrics with the public.

And the citizens of the UK were hungry for information, as they tried to make sense of what was happening. Maps, graphs, and tables became the lingua franca of the pandemic. As a result, the GOV.UK Coronavirus dashboard became one of the most visited public service websites in the United Kingdom.

The list of people who rely on the UK Coronavirus dashboard is quite long: government personnel, public health officials, healthcare employees, journalists, and the public all use the same service.

Keep reading

There is some good news for those of you wanting to shard your Postgres database in the cloud, so that as your data grows you have an easy way to scale out your Postgres database. I’m delighted to announce that Citus 10—the latest open source release of the Citus extension to Postgres—is now generally available in Hyperscale (Citus).

Hyperscale (Citus) is a built-in option in the Azure Database for PostgreSQL managed service, which has been around for a couple of years to help those of you who would rather focus on your application—and not on spending cycles managing your database.

Keep reading

It’s been an eventful time for Hyperscale (Citus) lately. If you’re interested in Postgres, distributed databases, and how to handle ever growing needs for your Postgres application or simply use Hyperscale (Citus), keep reading.

Citus is an open source extension to Postgres that enables horizontal scaling of your Postgres database. Citus distributes your Postgres tables, writes, and SQL queries across multiple nodes—parallelizing your workload and enabling you to use the memory, compute, and disk of a multi-node cluster. And Citus is available on Azure: Hyperscale (Citus) is a deployment option in Azure Database for PostgreSQL.

What’s really exciting to me is that we’ve made it easier and cheaper than ever to try and use Hyperscale (Citus). With Basic tier, you can now use Hyperscale (Citus) on a single node, parallelizing your operations and adopting a distributed database model from the very beginning. And you can now try Citus open source with a single docker run command—boom!

Keep reading

Citus is an extension to Postgres that lets you distribute your application’s workload across multiple nodes. Whether you are using Citus open source or using Citus as part of a managed Postgres service in the cloud, one of the first things you do when you start using Citus is to distribute your tables. While distributing your Postgres tables you need to decide on some properties such as distribution column, shard count, colocation. And even before you decide on your distribution column (sometimes called a distribution key, or a sharding key), when you create a Postgres table, your table is created with an access method.

Previously you had to decide on these table properties up front, and then you went with your decision. Or if you really wanted to change your decision, you needed to start over. The good news is that in Citus 10, we introduced 2 new user-defined functions (UDFs) to make it easier for you to make changes to your distributed Postgres tables.

Keep reading

Once you start using the Citus extension to distribute your Postgres database, you may never want to go back. But what if you just want to experiment with Citus and want to have the comfort of knowing you can go back? Well, as of Citus 9.5, now there is a new undistribute_table() function to make it easy for you to, well, to revert a distributed table back to being a regular Postgres table.

If you are familiar with Citus, you know that Citus is an open source extension to Postgres that distributes your data (and queries) to multiple machines in a cluster—thereby parallelizing your workload and scaling your Postgres database horizontally. When you start using Citus—whether you’re using Citus open source or whether you’re using Citus as part of a managed service in the cloud—usually the first thing you need to do is distribute your Postgres tables across the cluster.

Keep reading
Claire Giordano

When to use Hyperscale (Citus) to scale out Postgres

Written byBy Claire Giordano | December 5, 2020Dec 5, 2020

If you've built your application on Postgres, you already know why so many people love Postgres.

And if you're new to Postgres, the list of reasons people love Postgres is loooong—and includes things like: 3 decades of database reliability baked in; rich datatypes; support for custom types; myriad index types from B-tree to GIN to BRIN to GiST; support for JSON and JSONB from early days; constraints; foreign data wrappers; rollups; the geospatial capabilities of the PostGIS extension, and all the innovations that come from the many Postgres extensions.

But what to do if your Postgres database gets very large?

Keep reading

In my work as an engineer on the Postgres team at Microsoft, I get to meet all sorts of customers going through many challenging projects. One recent database migration project I worked on is a story that just needs to be told. The customer—in the retail space—was using Redshift as the data warehouse and Databricks as their ETL engine. Their setup was deployed on AWS and GCP, across different data centers in different regions. And they'd been running into performance bottlenecks and also was incurring unnecessary egress cost.

Specifically, the amount of data in our customer's analytic store was growing faster than the compute required to process that data. AWS Redshift was not able to offer independent scaling of storage and compute—hence our customer was paying extra cost by being forced to scale up the Redshift nodes to account for growing data volumes. To address these issues, they decided to migrate their analytics landscape to Azure.

Keep reading

When working on the internals of Citus, an open source extension to Postgres that transforms Postgres into a distributed database, we often get to talk with customers that have interesting challenges you won't find everywhere. Just a few months back, I encountered an analytics workload that was a really good fit for Citus.

But we had one problem: the percentile calculations on their data (over 300 TB of data) could not meet their SLA of 30 seconds.

To make things worse, the query performance was not even close to the target: the percentile calculations were taking about 6 minutes instead of the required 30 second SLA.

Keep reading

The last two months, I managed the agenda for our weekly Citus team meeting, the one time each week where our entire distributed team—with people spread across 6 different countries—gets together to talk about Citus things. As I chatted with our PostgreSQL folks to find speakers to give 10-minute “lightning talks”, I heard a chorus from several of the engineers: “see if you can get Joe to give a talk. His talks are always super interesting.”

I succeeded. Joe Nelson (known as begriffs online) did deliver a talk titled “Dominus SQL, lord of my domain.” And the engineers liked it. Not a surprise, as Joe’s content tends to be pretty popular, both on his personal blog, and on the Citus Data blog, including high traffic posts such as 5 ways to paginate in Postgres and Faster PostgreSQL Counting.

And when Joe agreed to let me interview him about his work on the Citus documentation (he’s quite busy so I wasn’t sure he would say yes), well, I was thrilled. This post is an edited transcript of my interview with Joe—and it’s your inside baseball view into how the documentation for the Citus open source project gets made.

Keep reading

Page 1 of 2