Hyperscale (Citus) is now GA as a built-in deployment option in Azure Database for PostgreSQL. Read Craig’s blog

Postgres tips for the average and power user

Personally I’m a big fan of email, just like blogging. To me a good email thread can be like a good novel where you’re following along always curious for what comes next. And no, I don’t mean the ones where there is an email to all-[email protected] and someone replies all, to only receive reply-all’s to not reply-all. I mean ones like started last week internally among the Azure Postgres team.

The first email was titled: Random Citus development and psql tips, and from there it piled on to be more and more tips and power user suggestions for Postgres. Some of these tips are relevant if you’re working directly on the Citus codebase, others relevant as anyone that works with Postgres, and some useful for debugging Postgres internals. While the thread is still ongoing here is just a few of the great tips:

In psql, tag your queries and use Ctrl+R

Psql supports Ctrl+R to search previous queries you ran. For demos and when testing complex scenarios, I like adding a little comment to queries that then becomes the tag by which I can later find the query:

# SELECT count(*) FROM test; -- full count
┌───────┐
 count 
├───────┤
     0 
└───────┘
(1 row)

Time: 127.124 ms
(reverse-i-search)`f': SELECT count(*) FROM test; -- full count

In most cases, 2-3 letters is going to be enough to find the query.

Better psql output

I find \x lacking, but pspg is great. It is available from PGDG via sudo yum install -y pspg or the equivalent on your system. I have the following .psqlrc which sets up pspg with a very minimalistic configuration:

$ cat > ~/.psqlrc
\timing on
\pset linestyle unicode 
\pset border 2
\setenv PAGER 'pspg --no-mouse -bX --no-commandbar --no-topbar'
\set HISTSIZE 100000

Get a stack trace for an error

In psql:

# SELECT pg_backend_pid();
┌────────────────┐
 pg_backend_pid 
├────────────────┤
         156796 
└────────────────┘
(1 row)

In another shell:

$ gdb -p 156796 
(gdb) b errfinish
Breakpoint 1 at 0x83475b: file elog.c, line 251.
(gdb) c
Continuing.

Back in psql:

# SELECT 1/0;

Back in gdb:

Breakpoint 1, errfinish (dummy=0) at elog.c:414
414     {
(gdb) bt
#0  errfinish (dummy=0) at elog.c:414
#1  0x00000000007890f3 in int4div (fcinfo=<optimized out>) at int.c:818
#2  0x00000000005f543c in ExecInterpExpr (state=0x1608000, econtext=0x1608900, isnull=0x7ffd27d1ad7f) at execExprInterp.c:678
...

Generating and inserting fake data

I know there are a lot of ways to generate fake data, but if you want something simple and quick you can do it directly in SQL:

CREATE TABLE some_table (id bigserial PRIMARY KEY, a float);
SELECT create_distributed_table('some_table', 'id');​​​

INSERT INTO some_table (a) SELECT random() * 100000 FROM generate_series(1, 1000000) i;    -- 2 secs  (40MB)​
INSERT INTO some_table (a) SELECT random() * 100000 FROM generate_series(1, 10000000) i;   -- 20 secs (400MB)​
INSERT INTO some_table (a) SELECT random() * 100000 FROM generate_series(1, 100000000) i;  -- 300 secs (4GB)​
INSERT INTO some_table (a) SELECT random() * 100000 FROM generate_series(1, 1000000000) i; -- 40 mins (40GB)​

A better pg_stat_activity

We’ve talked about pg_stat_statements before, but less about pg_stat_activity. Pg_stat_activity will show you information about currently running queries. Though it’s default view gets improved with the following query, and is easy to tweak as well:

SELECT 
  pid, 
  -- procpid,​
  -- usename, ​
  substring(query, 0, 100) as query,
  query_start,
  -- backend_start,​
  backend_type,
  state
FROM 
  pg_stat_activity
--WHERE​
--  state='active'​
ORDER BY
  query_start ASC
;

Managing multiple versions of Postgres

For many people one version of Postgres is enough, but if you’re working with different versions in production it can be useful to have them locally as well for dev/prod parity. pgenv is a tool that helps you manage and easily swap between Postgres versions.

What are your tips?

As our email thread goes on we may create another post on useful tips, but I’d also love to hear from you. Have some useful tips for working with Postgres, let us hear them and help share with others @citusdata.

Big thanks to Onder, Marco, Furkan on the team for contributing tips to the thread