Testing Postgres vectorization for faster aggregations

One idea we have wanted to explore further is speeding up in-memory aggregations in PostgreSQL through vectorized execution. The opportunity came up when our intern, Can, took this on as his summer internship project. His early numbers are promising: they suggest a 3-4x performance improvement for simple SELECT queries with sum, count, and GROUP BY operations.
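
For readers new to the idea, here is a minimal sketch of what "vectorized" means for a grouped sum/count. It is not code from the project: the AggState struct and both function names are hypothetical, and it assumes group keys have already been mapped to small integer ids. The first function is called once per row, the second once per batch of column values, so per-call overhead is paid once per batch and the work happens in a tight loop over arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-group state for sum() and count(); illustration only. */
typedef struct {
    int64_t sum;
    int64_t count;
} AggState;

/* Row-at-a-time: the executor calls this once for every input row. */
void agg_advance_row(AggState *groups, size_t num_groups,
                     uint32_t group_id, int64_t value)
{
    assert(group_id < num_groups);
    groups[group_id].sum += value;
    groups[group_id].count += 1;
}

/*
 * Vectorized: the executor calls this once per batch. group_ids[i] and
 * values[i] describe the i-th row of the batch, stored as two arrays.
 */
void agg_advance_batch(AggState *groups, size_t num_groups,
                       const uint32_t *group_ids, const int64_t *values,
                       size_t batch_size)
{
    for (size_t i = 0; i < batch_size; i++) {
        assert(group_ids[i] < num_groups);
        groups[group_ids[i]].sum += values[i];
        groups[group_ids[i]].count += 1;
    }
}
```

Both functions produce the same per-group results; they differ only in how many calls it takes to get there. How the extension itself batches and aggregates is described in its readme.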

This is proof-of-concept work done over several weeks, without the production-ready diligence we normally apply to our projects, which is why we made a point of adding "_test" to the end of the project name. That said, it shows good promise for further performance gains, and since the ideas can be useful more broadly, we are happy to open source the project.

Please take a look at postgres_vectorization_test on GitHub, and let us know what you think! The readme also has more details on our approach, sample queries, performance comparisons, and instructions on getting it set up as a PostgreSQL extension: https://github.com/citusdata/postgres_vectorization_test

With this, my special thanks go to our intern, Can, for tackling this in the short amount of time he had, to Metin for mentoring Can, and of course, to Ozgun, for pulling everything together for an exciting summer project.

cstore_fdw 1.1 release notes

We are excited to announce the release of cstore_fdw 1.1, Citus Data's open source columnar store extension for PostgreSQL. The changes in this release include:

  • Automatic file management. The filename foreign table option has become optional, and cstore_fdw uses a default directory inside PostgreSQL’s data directory to manage cstore tables.
  • Automatically delete table files on DROP FOREIGN TABLE. In cstore_fdw v1.0 it was the user's responsibility to delete the files created by cstore_fdw after dropping a table. Failing to delete them could cause unexpected behavior later. For example, a user who dropped a table and then created another table with the same filename option could get errors when querying the new table. cstore_fdw now deletes table files automatically on DROP FOREIGN TABLE, which eliminates these kinds of problems.
  • cstore_table_size. The new cstore_table_size('tablename') function can be used to get the size of a cstore table in bytes.
  • Improved documentation. “Using Skip Indexes” and “Uninstalling cstore_fdw” sections were added to the README file.
  • Bug fixes:
    • Previously, querying empty tables errored out. These tables can now be queried as expected.
    • Previously, the cost estimation functions overestimated the number of columns. The source of the estimation error has been fixed.

For installation and update instructions, please see cstore_fdw’s page on GitHub.

To learn more about what’s coming up for cstore_fdw, see our development roadmap.

Got questions?

If you have questions about cstore_fdw, please contact us using the cstore-users Google group.

If you discover an issue when using cstore_fdw, please submit it to cstore_fdw’s issue tracker on GitHub.

