Big News: Hyperscale (Citus) is now generally available as part of Azure Database for PostgreSQL. Learn more.

Fun with SQL: Text and system functions

SQL by itself is great and powerful, and Postgres supports a broad array of more modern SQL including things like window functions and common table expressions. But rarely do I write a query where I don’t want to tweak or format the data I’m getting back out of the database. Thankfully Postgres has a rich array of functions to help with converting or formatting data. These built-in functions save me from having to do the logic elsewhere or write my own functions, in other words I have to do less work because Postgres has already done it for me which I’m always happy about.

We’ve covered a set of functions earlier, today we’re going to look at some different categories of functions to dive deeper.

Manipulating strings

When building a nice cleamn report I’m often doing some cleaning up of the data. This could be as simple as combining my first_name and last_name column. In that case a simple concatenation is quite useful:

SELECT first_name || last_name as name
FROM users;

But as you can expect I can do much more. If I’m perfoming some basic search for a keyword or delimiter I can search for the position of that string. Then I can feed that into a case statement:

SELECT CASE
  WHEN position(',' in my_string) > 0
    then True
  ELSE
    False
  END
FROM my_table;

Of course if you’re working with some field that is delimited you may want to take the above to first find the correct field. From there you likely would want to split that delimited field up. For that we can use split_part and grab the appropriate section of our delimited field:

SELECT CASE
  WHEN position(',' in my_string) > 0
    then split_part(my_string, ',', 1)
  ELSE
    null
  END
FROM my_table;

Finally, if you’re dealing with messy input/machine data there are a number of functions that can be handing for cleaning up that data:

  • substr - allows you to extract a substring
  • rtrim - removes the longest part of a matching string from the right side
  • reverse - for the next time you’re asked to reverse a string in an interview :)
  • regexp_replace - for when you need regular expressions

Sizing made simply

A less common category of functions is working with sizes of things like disk/memory. If you work on a tool that records data about how much data you’ve stored, or even work with the system information about your Postgres database you often deal with this. Lets say once a month you’re querying your database for bloat to track it.

When you query for bloat you get back something that looks like 37502634 (this is in bytes). At first glance my though is we’re doomed and the database will die tomorrow because that looks like a lot. But what I really want to do is get a better understanding of it. Fortunately I don’t have to pull up Wolfram Alpha to get a better idea of how large that actually is. I can use pg_size_pretty:

SELECT pg_size_pretty(37502634::bigint);

 pg_size_pretty
----------------
 36 MB
(1 row)

Whew. 36MB isn’t a level that concerns me on my 40 GB table, so I’m safe.

One I’ve seen less used is the reverse. Postgres also makes it easy for you to go from the pretty form to the raw number:

SELECT pg_size_bytes ('40 GB'::text);
 pg_size_bytes
---------------
   42949672960
(1 row)

The admin functions are a unique category in and of themself that can be very handy when operating/managing your database. We make heavy use of them ourselves while running and managing your database as a service.

A function a day

Postgres functions make the little things that eat up time when creating the perfect query much easier. The next time you’re trying to create the perfect output or working with a query and the SQL itself can’t quite get you there explore Postgres built-in functions. Chances are Postgres has already done the heavy lifting to make working with your data easier.