Blog posts by Craig Kerstiens on the Citus Blog - Page 4

Monitoring your Citus Cloud cluster with Datadog

Written by Craig Kerstiens | April 25, 2018

At the heart of most applications is a database. Ensuring your database is performing well is key to ensuring your customers receive a good experience when working with your app. It's likely you're already monitoring your systems today, and want to monitor your database using similar tooling. Today we're excited to release a turnkey integration for monitoring Citus Cloud clusters with one of the more popular tools out there: Datadog.

Keep reading

Custom configuring your Postgres settings on Citus Cloud

Written by Craig Kerstiens | April 17, 2018

Postgres has long been a reliable database for keeping your data safe, and it is used in a variety of flexible ways. Because of the many ways it can be used (ranging from embedded devices to data warehousing to large transactional systems), it also comes with a lot of knobs to configure it. Part of our approach in providing a fully managed database as a service is configuring Postgres to be production ready from the moment you click provision, which is what you get with Citus Cloud.

Over time, though, we have seen a need for more flexibility to tune and customize configurations to your specific needs. Part of this flexibility is in supporting the rich Postgres feature set, such as JSONB, rich indexing, and more. Part is supporting a broad set of extensions such as HyperLogLog, pg_partman, TopN, PostGIS, and more. And today we're excited to support custom configuration of your Citus clusters on Citus Cloud to enable even broader flexibility.

Keep reading

Raw SQL access for users with row-level-security

Written by Craig Kerstiens | April 4, 2018

We talk with a lot of SaaS companies that are encountering issues with their database. The most common issue we discuss relates to performance: either a need to keep scaling, or at times how to handle the really intensive data needs of only a few customers.

And then as you continue to scale and capture more data you want to provide more value back to your customers.

At times you might even consider giving raw SQL access to your largest and most important customers. Typically, controlling what data you give them via dashboards and canned reports is ideal; this way you can control performance impact and other risks. But if you have extra large or important customers that require you to give them raw access to the data... then PostgreSQL, and thus Citus, has your answer.

Pro-tip: Don't grant access to all of your customers.

Keep reading

Contributing to Postgres via patch review

Written by Craig Kerstiens | March 31, 2018

Citus is an open source extension to Postgres that transforms Postgres into a distributed database, scaling horizontally. The fact that Citus is built on top of Postgres is a huge benefit to our users: it means that when you choose Citus, you get all the great features that are available in Postgres. And Postgres itself is an awesome database. Awesome. As a team, we value the foundation we're built on and regularly aim to contribute back to it. We have a number of developers who have contributed to Postgres over the years, with features like watch, event triggers, and the PostgreSQL extension framework.

Recently a few more of our engineers expressed an interest in giving back to the PostgreSQL community. In fact, it's a common question: how can we better help the PostgreSQL project? And a common answer is reviewing patches. To help kick-start that process we organized a session and carved out a few days just for patch review during the most recent commitfest.

Keep reading

Citus Data internal hackathon roundup

Written by Craig Kerstiens | March 26, 2018

At Citus Data, we regularly get the team together, because even with an engineering team that is distributed around the globe, face-to-face time is valuable for connecting and collaborating. During our team offsites, we often organize engineering hackathons to prove out new ideas, learn new things, or just for fun. We recently completed one of our Citus hackathons and thought we'd share some of what we built.

The theme of our hackathon this time was building the ultimate dashboard for our Citus extension to Postgres. For Postgres, there are lots of options out there for capturing and displaying insights into your database. You could use New Relic, VividCortex, or something entirely open source like pghero. But we wanted to explore the question: what more could we provide?

Our two teams took two very different approaches, but each emerged with something interesting that we hope to continue to build on and productize in the future. In case you’re curious, here’s a look at each of the projects from our hackday:

Keep reading

Raw SQL access for users with row-level-security

Written by Craig Kerstiens | March 19, 2018

And then as you continue to scale and capture more data you want to provide more value back to your customers.

At times you might even consider giving raw SQL access to your largest and most important customers. Typically, controlling what data you give them via dashboards and canned reports is ideal; this way you can control performance impact and other risks. But if you have extra large or important customers that require you to give them raw access to the data... then PostgreSQL, and thus Citus, has your answer.

Pro-tip: Don't grant access to all of your customers.

Keep reading

Fun with SQL: generate_series in Postgres

Written by Craig Kerstiens | March 14, 2018

There are times within Postgres where you may want to generate sample data or some consistent series of records to join against for reporting. Enter the simple but handy set-returning function of Postgres: generate_series. As the name implies, generate_series lets you generate a set of data starting at some point, ending at another point, with an optional increment between values. generate_series works on two datatypes:

integers
timestamps
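
To make the start, stop, and step behavior concrete, here is a rough Python sketch of the semantics. It is only an illustration, not how Postgres implements the function. The Python function name simply mirrors the SQL one, and using timedelta as a stand-in for a Postgres interval is an assumption of this sketch (it cannot express steps like "1 month"). In SQL, the matching integer call would be generate_series(1, 10, 3).

```python
from datetime import datetime, timedelta


def generate_series(start, stop, step=1):
    """Yield start, start + step, ... without going past stop.

    Illustrative emulation only: stop is inclusive, a negative step
    counts downward, and a zero step is rejected.
    """
    if not step:  # 0 and timedelta(0) are both falsy
        raise ValueError("step size cannot equal zero")
    zero = step - step  # 0 for integers, timedelta(0) for timedeltas
    ascending = step > zero
    current = start
    while (current <= stop) if ascending else (current >= stop):
        yield current
        current = current + step


# Integers: every third value from 1 through 10.
ints = list(generate_series(1, 10, 3))  # [1, 4, 7, 10]

# Timestamps: one value per day; the step must be given explicitly.
days = list(
    generate_series(datetime(2018, 3, 12), datetime(2018, 3, 14), timedelta(days=1))
)
```

If the start is already past the stop for the chosen direction, the series is simply empty, which is handy when joining against it for reports.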

Keep reading

How the Citus distributed database rebalances your data

Written by Craig Kerstiens | February 1, 2018

In both Citus Cloud 2 and the enterprise edition of Citus 7.1 there was a pretty big update to one of our flagship features: the shard rebalancer. No, I'm not talking about our shard rebalancer visualization that reminds me of the Windows 95 disk defrag. (Side note: At one point I tried to persuade my engineering team to play Tetris music in the background while the shard rebalancer UI in Citus Cloud was running. The good news for all of you is that I was overwhelmingly vetoed by my team. Whew.) The interesting new capability in the Citus database is the online nature of our shard rebalancer.

Keep reading

Citus and pg_partman: Creating a scalable time series database on Postgres

Written by Craig Kerstiens | January 24, 2018

Years ago Citus had multiple methods for distributing data across many nodes (we actually still support both today): hash-based partitioning and time-based partitioning. Over time we found big benefits in further enhancing the features around hash-based partitioning, which enabled us to add richer SQL support, transactions, foreign keys, and more. Thus in recent years, we put less energy into time-based partitioning. But… no one stopped asking us about time partitioning, especially for fast data expiration. All that time we were listening. We just thought it best to align our product with the path of core Postgres as opposed to branching away from it.

Postgres has had some form of time-based partitioning for years, though for many of those years it was a bit kludgy and wasn't part of core Postgres. With Postgres 10 came native time partitioning, and because Citus is an open source extension to Postgres, anyone using Citus gets to take advantage of time-based partitioning as well. You can now create tables that are distributed across nodes by ID and partitioned by time on disk.

We have found a few Postgres extensions that make partitioning much easier to use. The best in class for improving time partitioning is pg_partman, and today we'll dig into getting time partitioning set up with your Citus database cluster using pg_partman.

Keep reading

Building HIPAA-compliant applications with Citus Cloud and Postgres

Written by Craig Kerstiens | January 23, 2018

Today we're excited to announce that you can now use our fully-managed database as a service, Citus Cloud, to manage protected health information (PHI) and to build HIPAA-compliant applications on top of Postgres. For those of you building apps in healthcare environments regulated by the Health Insurance Portability and Accountability Act (HIPAA), you can feel safer knowing you now have a scalable Postgres database that meets your healthcare compliance requirements.

If you're building an application on top of Postgres and you need a combination of horizontal scale as well as HIPAA compliance, reach out to us if you want more information about getting a Business Associate Agreement (BAA) with Citus Data in place.

Keep reading