Citus Blog

Articles by Thomas Munro

Thomas Munro

Don’t let collation versions corrupt your PostgreSQL indexes

Written by Thomas Munro | December 12, 2020

As part of my work on the open source PostgreSQL team at Microsoft, I recently committed a new feature into PostgreSQL 14 to track dependencies on collation versions, with help from co-author Julien Rouhaud and the many others who contributed ideas. It took a long time to build a consensus on how to tackle this thorny problem (work I began at EnterpriseDB and continued at Microsoft), and you can read about some of the details and considerations in the commit message below and the referenced discussion thread. Please note that some details may change by the time PostgreSQL 14 is released.

commit 257836a75585934cc05ed7a80bccf8190d41e056
Author: Thomas Munro <[email protected]>
Date:   Mon Nov 2 19:50:45 2020 +1300

    Track collation versions for indexes.

    Record the current version of dependent collations in pg_depend when
    creating or rebuilding an index.  When accessing the index later, warn
    that the index may be corrupted if the current version doesn't match.

    Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
    Michael Paquier, Robert Haas, Tom Lane and others for very helpful
    discussion.

    Author: Thomas Munro <[email protected]>
    Author: Julien Rouhaud <[email protected]>
    Reviewed-by: Peter Eisentraut <[email protected]> (earlier versions)
    Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com

I’m pretty happy with the result so far, but there is more to be done (see further down)! Now seems like a good time to walk you through the problem we needed to solve—that PostgreSQL indexes can get corrupted by changes in collations that occur naturally over time—and how the new feature makes things better in PostgreSQL 14. Plus, you’ll get a bit of background on collations, too.
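
To make the problem concrete, here is a minimal Python sketch of the idea. It is not PostgreSQL code. The two "collation versions" are invented sort rules (plain code point order, and an order that ignores hyphens on the first pass), and `build_index` and `lookup` are hypothetical helpers. The sketch stores a sorted list together with the version that ordered it, then searches with a different version: the binary search walks the wrong way and misses a value that is present, and the version check is what lets us warn about it.

```python
import warnings
from bisect import bisect_left

# Hypothetical sort rules standing in for two versions of one collation.
# Neither is a real ICU or glibc rule set.
COLLATIONS = {
    "1": lambda s: s,                        # plain code point order
    "2": lambda s: (s.replace("-", ""), s),  # hyphens ignored on the first pass
}

def build_index(values, version):
    """Sort the values and record which collation version ordered them."""
    return {"version": version,
            "entries": sorted(values, key=COLLATIONS[version])}

def lookup(index, value, current_version):
    """Binary-search the index using the collation that is current now."""
    if index["version"] != current_version:
        warnings.warn(
            f"index was built with collation version {index['version']}, "
            f"but the current version is {current_version}; "
            "it may be corrupted"
        )
    key = COLLATIONS[current_version]
    keys = [key(v) for v in index["entries"]]
    pos = bisect_left(keys, key(value))
    return pos < len(keys) and index["entries"][pos] == value

index = build_index(["ab", "aa", "a-b"], "1")
print(index["entries"])           # ['a-b', 'aa', 'ab']
print(lookup(index, "aa", "1"))   # True
print(lookup(index, "aa", "2"))   # False, and a warning is emitted

# Rebuilding under the current version makes lookups reliable again.
rebuilt = build_index(index["entries"], "2")
print(lookup(rebuilt, "aa", "2"))  # True
```

The stored data never changed; only the comparison rule did. That is why recording the version at build time and comparing it at access time is enough to detect the risk, even though it cannot by itself say whether any particular entry is actually out of order.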
