To grow and evolve, Agari needed a way to scale their database
Agari provides global brands with the experience, tools and analytics they need to eliminate email threats, protect customers and their personal data, and proactively guard brand reputation. To date, Agari has analyzed more than 1 trillion email messages, and it secures more than 85 percent of U.S. consumer emails.
A primary mechanism cybercriminals use to trick an employee into clicking on a link or opening a file is “spear phishing.” The practice involves sending emails that appear to be from a well-known or trusted brand but contain malicious links. To combat this type of spoofing attack, Agari works with large ISPs and email providers to mine huge volumes of aggregated email metadata (e.g., domains, IP addresses, and authentication information). Their insights are then provided to enterprise customers whose brands are being spoofed to help them understand the scope of the problem and how to stop it.
Agari built its original product on PostgreSQL but encountered significant constraints as it tried to evolve the solution. The company wanted to enable its customers to look at the high-level analytics and quickly drill down to the underlying specifics, but Postgres lacked a scale-out option. “We vetted a number of SQL and NoSQL solutions and considered some of the forked Postgres solutions that added a scale-out capability, but fortunately I heard a talk about Citus Data at a PostgreSQL meetup in San Francisco,” said Vidur Apparao, CTO of Agari. “I liked Citus’s combination of features, open source flexibility and commercial-grade scalability. Best of all, because Citus is built on the latest version of PostgreSQL, we have been able to continue using the same core SQL toolset, leverage our existing PostgreSQL expertise, and take advantage of the continuing innovation within the PostgreSQL community.” Following an extensive evaluation of Citus, Apparao began building a new version of the Agari product, which was released in early 2014.
Migrating to Citus shaved months off the transition
Because Citus is code-compatible with PostgreSQL, Agari was able to leverage the same comprehensive set of data types – including IP addresses and CIDR ranges – and existing Postgres extensions. “Rebuilding the Agari platform from scratch while adding substantial improvements took far less time than it otherwise would have,” said Apparao. “We had a product to demo for clients in less than four months. Having had to do more development work on our own at the data store level would likely have added months to our project.”
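To make the IP address and CIDR range types concrete, here is a minimal sketch of the kind of containment check that PostgreSQL's `inet` and `cidr` types support natively (for example through the `<<=` operator). It uses Python's standard `ipaddress` module rather than a database, and the sample rows, the authorized range, and the `in_range` helper are all hypothetical, not taken from Agari's product.

```python
import ipaddress

# Hypothetical sample of email metadata rows: (sending domain, source IP).
rows = [
    ("example.com", "192.0.2.10"),
    ("example.com", "198.51.100.7"),
    ("example.org", "192.0.2.200"),
]

# Hypothetical CIDR range that a brand has authorized to send its mail.
authorized = ipaddress.ip_network("192.0.2.0/24")


def in_range(ip_text, network):
    """Return True if the address falls inside the CIDR range."""
    return ipaddress.ip_address(ip_text) in network


# Keep only the messages whose source IP lies outside the authorized range.
unauthorized = [(domain, ip) for domain, ip in rows if not in_range(ip, authorized)]
print(unauthorized)
```

In a database with native network types, the same filter can be expressed in the query itself, which is one reason keeping those types mattered for this workload.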
On-the-fly analytics across multiple dimensions of 6-8 TB
Agari now runs its solution on two 8-node Citus clusters on Amazon Web Services (AWS). The masters run on systems with 30 GB of RAM and 8 cores. The data nodes run on systems with 15 GB of RAM and 8 cores. The most recent data is stored on solid-state drives, while older data is moved to magnetic drives. The solution serves both interactive customer queries from Agari’s SaaS-based products and internal batch queries for analytics. Agari can run on-the-fly analytics across multiple dimensions of 6 to 8 TB of the most recent data while still drilling down interactively to row-level detail. Typical interactive queries spanning the most recent month of data return in under 4 seconds.
“There are other solutions we could have used to obtain this capability, but they would have come at a substantially higher cost,” added Apparao.
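The master and data node layout described above follows a general scatter-gather pattern: rows are spread across data nodes, each node computes a partial aggregate, and the master merges the partial results. The following is only an illustrative sketch of that pattern in plain Python, not Citus's actual implementation. The worker count, the hash function, the row fields (`domain`, `auth_result`), and every function name are assumptions made for the example.

```python
from collections import Counter
from zlib import crc32

NUM_WORKERS = 4  # hypothetical number of data nodes


def shard_for(key, num_workers=NUM_WORKERS):
    """Map a distribution key to a worker index with a stable hash."""
    return crc32(key.encode("utf-8")) % num_workers


def distribute(rows, num_workers=NUM_WORKERS):
    """Place each row on the worker chosen by its domain."""
    shards = [[] for _ in range(num_workers)]
    for row in rows:
        shards[shard_for(row["domain"], num_workers)].append(row)
    return shards


def worker_count_by_result(shard):
    """Partial aggregate computed locally on one worker."""
    return Counter(row["auth_result"] for row in shard)


def coordinator_summary(shards):
    """High-level view: merge the partial counts from every worker."""
    total = Counter()
    for shard in shards:
        total.update(worker_count_by_result(shard))
    return total


def drill_down(shards, domain):
    """Row-level view: only the shard that owns the domain is scanned."""
    shard = shards[shard_for(domain, len(shards))]
    return [row for row in shard if row["domain"] == domain]


# Hypothetical email metadata rows.
rows = [
    {"domain": "example.com", "auth_result": "pass"},
    {"domain": "example.com", "auth_result": "fail"},
    {"domain": "example.org", "auth_result": "pass"},
    {"domain": "example.net", "auth_result": "fail"},
    {"domain": "example.net", "auth_result": "fail"},
]

shards = distribute(rows)
summary = coordinator_summary(shards)
detail = drill_down(shards, "example.com")
print(dict(summary), len(detail))
```

The summary touches every shard in turn here, while in a real cluster the workers would run in parallel. The drill-down touches only one shard, which is the property that makes it possible to move from a dashboard-level number to the underlying rows interactively.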
Being an extension to Postgres means Citus is compatible with new releases
Citus is code-compatible with PostgreSQL, so it benefits from the innovation taking place in the Postgres community, including the tools built for PostgreSQL. Extension libraries contributed to the community can also be easily integrated into Citus applications. “The Citus Data approach of building a scale-out version of PostgreSQL that tracks and leverages the innovations of the Postgres community was a huge driver behind our decision to adopt the Citus database,” said Apparao. “Had we chosen a proprietary solution or one that was open but not supported by such a large community, we would have suffered in a number of ways, including not having access to all the tools we needed.”
Agari gets to leverage their PostgreSQL expertise and gets a relational database that scales out
“Our customers have been delighted with our product,” said Apparao. “They specifically like our ability to pull insight out of the underlying data and present it to them in the form of dashboards, alerts and workflows. They also appreciate the flexibility in querying the underlying data to explore, understand and drill down to the core issues they are solving.”
Citus has allowed Agari to leverage its investment and experience in the PostgreSQL ecosystem while still getting the benefit of a scale-out solution that is easy to manage. As a result, the company got a product to market faster than it would have by taking on a new and potentially more complex data solution. “Citus is a cost-effective solution that delivers the interactive insight our customers demand and the querying and processing power our internal team needs for deep analytics and business intelligence,” said Apparao.