Hyperscale (Citus) is now GA as a built-in deployment option in Azure Database for PostgreSQL. Read Craig’s blog
Hi. I work as a data architect in San Francisco and I’m auditing Dr. Jones class to stay up to date on the latest technologies and she mentioned you might be able to help me before I get too deep into the design of a new system.
- I would be happy to help. Can you give me an overview of where you’re at?
My default was just to use Postgres. I had a few questions on what schema designs might make most sense.
- Well I’m working with more interesting data architectures, not sure I can help you much with Postgres. Really getting excited about what’s possible with neomodern data architectures, they make it so my app devs can build any feature their hearts desire.
I thought your expertise used to be relational databases?
- It was, but neomodern data architectures are better. Neomodern data architectures allow it so app devs can build any feature they like without having to think about data models. Really, it’s the future of databases.
Hmmm, my app is a pretty straightforward web app that allows salespeople to send campaigns and track metrics on the campaigns. My app should be fine in a Postgres database right?
- That might work for you, but really, if you have to define a data model up front, then you’re limiting yourself.
How do you mean limiting? Can’t I just add new tables as I need them? Or add columns to existing tables?
- Well you could, but meta document stores can take care of that for you
Meta document stores?
- Yeah, you have a custom document for each data model, but then have a document store that auto parses each document and can tell you the structure without having to read the document itself
Oh nice, so I can have an index of all my data.
- Exactly, and because you have a bunch of smaller document stores you can distribute them around the world and read the data from the nearest one
So reading from the nearest copy of data would be faster?
- Well it’s supposed to be, but to make sure the data is correct you have to read from at least 3 nodes to get consensus
- Yeah consensus. Since the underlying storage is all in blockchain, there has to be an agreement about which of the data is a source of truth
How long does consensus take?
- Right now, most queries take around 400 milliseconds, but I’m planning to rewrite the consensus algorithm over the next two months
In the past my Postgres database would perform reads in usually a millisecond or less. 400 ms doesn’t sound any better?
- Well the key is scale, at larger scale. We’re planning ahead for really large growth so we needed to have something distributed in place
And what about writes? Is write performance just as bad?
- Writes take about 3 seconds, but we don’t write to the database directly
You don’t write to the database?! What do you write to then?
- Well we tried to write directly to the database, but for some reason there was clock skew which gave inconsistencies. So now we’re planning to upgrade to TrueTime atomic clocks in July.
What the hell! Where do you get atomic clocks from?
- We sent our infrastructure guy to China to get our atomic clocks custom designed
This is running on your own hardware? Not on a cloud vendor like AWS, Microsoft, or Google?
- We budget our hardware on a 5 year cycle, and accounting wouldn’t let us move the classification. Over the next 3 years, we’ll save about 50% on hardware cost, and almost break even when you include the people hours
Ouch! That sounds like a lot of work. Wait, can we back up for a minute. You said you don’t write to the database. If you don’t write directly to the database, what do you write to then?
- Right now we dual write to Redis and Memcached
You’re running two things that you write to?! Why would you even need to do that?
- They’re not normally used for transactions, but since they’re so fast, we can write to both. And since we write to both, we have a better chance of transactions
This is sounding like more work than I was hoping for. Is that all I need to know?
- Oh we haven’t even gotten to the big wins yet
Okay, color me curious. What’s the payoff to all this investment?
- I’ve seen the biggest value in how we’ve been able to give each analyst their own data lake.
- Yeah, it’s like a partial copy of the database that has just what that person wants
You lost me
- Because it’s schemaless, we have to use the meta document to parse it into a structured format so people can read it
How do you parse it?
- We have event streams that feed out from Kafka
Where do you stream it?
- Into Apache Spark
What’s Spark then do?
- It computes pre-pre-aggregates
And those then get queried?
- Oh, no. We push the pre-pre-aggregates to Druid, which computes our pre-aggregates
Okay, then you query the pre-aggregates?
- No, that go into our data warehouse. Our data warehouse is great for analysts. It doesn’t help for user facing things, but our analysts can create some really powerful reports.
Oh, so you can make business decisions in real time?
- Well because the pipeline data is usually about 12 hours behind, our business decisions are 12 hours behind. But we’re confident we can get it down to 8 hrs.
Okay, I think you lost me somewhere, but you’re saying the application developers are way more productive at least?
- Oh the application developers haven’t gotten to try it yet, we’re just wrapping up our staging environment, then we have to start ordering the hardware for production. We plan to go live by 2022.
I think I may just stick with Postgres.