If you want to learn more about Citus on Microsoft Azure, read this post about Hyperscale (Citus) on Azure Database for PostgreSQL.

Skip navigation

Citus Blog

Articles by Sai Srirampur

Sai Krishna Srirampur

Migrating interactive analytics apps from Redshift to Postgres, ft. Hyperscale (Citus)

Written by By Sai Srirampur | October 28, 2020 Oct 28, 2020

In my work as an engineer on the Postgres team at Microsoft, I get to meet all sorts of customers going through many challenging projects. One recent database migration project I worked on is a story that just needs to be told. The customer—in the retail space—was using Redshift as the data warehouse and Databricks as their ETL engine. Their setup was deployed on AWS and GCP, across different data centers in different regions. And they’d been running into performance bottlenecks and also was incurring unnecessary egress cost.

Specifically, the amount of data in our customer’s analytic store was growing faster than the compute required to process that data. AWS Redshift was not able to offer independent scaling of storage and compute—hence our customer was paying extra cost by being forced to scale up the Redshift nodes to account for growing data volumes. To address these issues, they decided to migrate their analytics landscape to Azure.

Keep reading
Sai Krishna Srirampur

Using search_path and views to hide columns for reporting with Postgres

Written by By Sai Srirampur | July 3, 2018 Jul 3, 2018

Data security and data privacy are important, no one disputes that. We all want to keep private things private and to keep our data secure. And yet, data needs to be shared, to enable insights, to help organizations observe patterns and have those “ah-ha” moments. None of us want the extreme where, in an effort to keep data secure, there is no access to data of any form within your organization, and the result is no business insights or analytics. With GDPR going into effect, you’ve likely been rethinking what security controls you have in place.

Here at Citus Data we collaborate with SaaS businesses and larger enterprises alike, generally to consult on Postgres data models and how to best scale out their database. (Our Citus extension to Postgres enables you to scale out Postgres horizontally. The benefit: performance.) In working with teams, one common thing we’ve seen companies do is to restrict who can see which bits of Personally Identifiable Information (PII) within your database. There are a number of approaches, including heavyweight ETL processes that mask PII bits. An ETL process tends to introduce a certain amount of latency from the time data is in your system until the time it can be analyzed.

Fortunately, Postgres provides a few primitives that can be used directly within your database to hide PII, while still enabling sophisticated analytics and exploration of data in real time.

Here we’ll look at using Postgres schemas and views to provide access to data while keeping PII safe and hidden.

Keep reading
Sai Krishna Srirampur

Fun with SQL: Relocating shards on a Citus database cluster

Written by By Sai Srirampur | February 28, 2018 Feb 28, 2018

The Citus extension to Postgres allows you to shard your Postgres database across multiple nodes without having to make major changes to your SaaS application. Citus then provides performance improvements (as compared to single-node Postgres) by transforming SQL queries and distributing queries across multiple nodes, thereby parallelizing the workload. This means that a 2 node, 4 core Citus database cluster could perform 4x faster than single node Postgres.

With the Citus shard rebalancer, you can easily scale your database cluster from 2 nodes to 3 nodes or 4 nodes, with no downtime. You simply run the move shard function on the co-location group you want move shards for, and Citus takes care of the rest. When Citus moves shards, it ensures tables that are co-located stay together. This means all of your joins, say, from orders to order_items still work, just as you’d expect.

Keep reading
Sai Krishna Srirampur

Scaling out your Django Multi-tenant App

Written by By Sai Srirampur | November 14, 2017 Nov 14, 2017

There are a number of data architectures you could use when building a multi-tenant app. Some, such as using one database per customer or one schema per customer, have trade-offs when it comes to larger scale. The other option is to build the notion of tenancy directly into the logic of your SaaS application. With django-multitenant and Citus, built-in tenancy becomes much easier to put in place for your application without having to re-invent the wheel yourself.

Our django-multitenant Python library, enables easy scale out of applications that are built on top of Django and follow a multi tenant data model. This Python library has evolved from our experience working with SaaS customers, scaling out their multi-tenant apps.

Keep reading

Page 1 of 1