Looking for more Postgres content? Visit our sister blog, the Azure Postgres Blog on Microsoft TechCommunity.
As part of my work on the open source PostgreSQL team at Microsoft, I recently committed a new feature into PostgreSQL 14 to track dependencies on collation versions, with help from co-author Julien Rouhaud and the many others who contributed ideas. It took a long time to build a consensus on how to tackle this thorny problem (work I began at EnterpriseDB and continued at Microsoft), and you can read about some of the details and considerations in the commit message below and the referenced discussion thread. Please note that some details may change by the time PostgreSQL 14 is released.
commit 257836a75585934cc05ed7a80bccf8190d41e056
Author: Thomas Munro <[email protected]>
Date: Mon Nov 2 19:50:45 2020 +1300
Track collation versions for indexes.
Record the current version of dependent collations in pg_depend when
creating or rebuilding an index. When accessing the index later, warn
that the index may be corrupted if the current version doesn't match.
Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
Michael Paquier, Robert Haas, Tom Lane and others for very helpful
discussion.
Author: Thomas Munro <[email protected]>
Author: Julien Rouhaud <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]> (earlier versions)
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
I’m pretty happy with the result so far, but there is more to be done (see further down)! Now seems like a good time to walk you through the problem we needed to solve—that PostgreSQL indexes can get corrupted by changes in collations that occur naturally over time—and how the new feature makes things better in PostgreSQL 14. Plus, you’ll get a bit of background on collations, too.
Keep reading