POSETTE 2024 is a wrap! 💯 Thanks for joining the fun! Missed it? Watch all 42 talks online 🍿
POSETTE 2024 is a wrap! 💯 Thanks for joining the fun! Missed it? Watch all 42 talks online 🍿
Written by Nazir Bilal Yavuz
January 18, 2023
This post by Nazir Bilal Yavuz about debugging PostgreSQL was originally published on the Microsoft Tech Community Blog.
Postgres is one of the most widely used databases and supports a number of operating systems. When you are writing code for PostgreSQL, it's easy to test your changes locally, but it can be cumbersome to test it on all operating systems. A lot of times, you may encounter failures across platforms and it can get confusing to move forward while debugging. To make the dev/test process easier for you, you can use the Postgres CI.
When you test your changes on CI and see it fail, how do you proceed to debug from there? As a part of our work in the open source Postgres team at Microsoft, we often run into CI failures—and more often than not, the bug is not obvious, and requires further digging into.
In this blog post, you'll learn about techniques you can use to debug PostgreSQL CI failures faster. We'll be discussing these 4 tips in detail:
Before diving into each of these tips, let's discuss some basics about how Postgres CI works.
PostgreSQL uses Cirrus CI for its continuous integration testing. To use it for your changes, Cirrus CI should be enabled on your GitHub fork. The details on how to do this are in my colleague Melih Mutlu's blog post about how to enable the Postgres CI. When a commit is pushed after enabling CI; you can track and see the results of the CI run on the Cirrus CI website. You can also track it in the "Checks" GitHub tab.
Cirrus CI works by reading a .cirrus.yml
file from the Postgres codebase to understand the configuration with which a test should be run. Before we discuss how to make changes to this file to debug further, let's understand its basic structure:
# A sequence of instructions to execute and
# an execution environment to execute these instructions in
task:
# Name of the CI task
name: Postgres CI Blog Post
# Container where CI will run
container:
# Container configuration
image: debian:latest
cpu: 4
memory: 12G
# Where environment variables are configured
env:
POST_TYPE: blog
FILE_NAME: blog.txt
# {script_name}_script: Instruction to execute commands
print_post_type_script:
# command to run at script instruction
- echo "Will print POST_TYPE to the file"
- echo "This post's type is ${POST_TYPE}" > ${FILE_NAME}
# {artifacts_name}_artifacts: Instruction to store files and expose them in the UI for downloading later
blog_artifacts:
# Path of files which should be relative to Cirrus CI's working directory
paths:
- "${FILE_NAME}"
# Type of the files that will be stored
type: text/plain
As you can see, the echo
commands are run at script instruction. Environment variables are configured and used in the same script instruction. Lastly, the blog.txt file is gathered and uploaded to Cirrus CI. Now that we understand basic structure, let's discuss some tips you can follow when you see CI failures.
When Postgres is working on your local machine but you see failures on CI, it's generally helpful to connect to the environment where it fails and check what is wrong.
You can achieve easily that using the RE-RUN with terminal button on the CI. Also, typically, a CI run can take time as it needs to find available resources to start and rerun instructions. However, thanks to this option, that time is saved as the resources are already allocated.
After the CI's task run is finished, there is a RE-RUN button on the task's page.
You may not have noticed it before, but there is a small arrow on the right of the RE-RUN button. When you click this arrow, the "Re-Run with Terminal Access" button will appear. When this button is clicked, the task will start to re-run and shortly after you will see the Cirrus terminal. With the help of this terminal, you can run commands on the CI environment where your task is running. You can get information from the environment, change configurations and re-test your task.
Note that the re-run with terminal option is not available for Windows yet, but there is ongoing work to support it.
Postgres and meson provide additional build-time debug options to generate more information to find the root cause of certain types of errors. Some examples of build options which might be useful to set are:
-Dcassert=true
[defaults to false]: Turns on various assertion checks. This is a debugging aid. If you are experiencing strange problems or crashes you might want to turn this on, as it might expose programming mistakes.-Dbuildtype=debug
[defaults to debug]: Turns on basic warnings and debug information and disables compiler optimizations.-Dwerror=true
[defaults to false]: Treat warnings as errors.-Derrorlogs=true
[defaults to true]: Whether to print the logs from failing tests.While building Postgres with meson, these options can be setup using the meson setup [<options>] [<build_dir>]
or the meson configure
commands.
These options can either be enabled with the "re-running with terminal access" option or by editing the cirrus.yml config file. Cirrus CI has a script instruction in the .cirrus.yml file to execute a script. These debug options could be added to the script instructions in which meson is configured. For example:
configure_script: |
su postgres <<-EOF
meson setup \
-Dbuildtype=debug \
-Dwerror=true \
-Derrorlogs=true \
-Dcassert=true \
${LINUX_MESON_FEATURES} \
-DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
build
EOF
Once it's written as such, the debug options will be activated next time CI runs. Then, you can check again if the build fails and investigate the logs in a more detailed manner. You may also want to store these logs to work on them later. To gather the logs and store them, you can follow the tip below.
Cirrus CI has an artifact instruction to store files and expose them in the UI for downloading later. This can be useful for analyzing test or debug output offline. By default, Postgres' CI configuration gathers log, diff, regress log, and meson's build files—as can be seen below:
testrun_artifacts:
paths:
- "build*/testrun/**/*.log"
- "build*/testrun/**/*.diffs"
- "build*/testrun/**/regress_log_*"
type: text/plain
meson_log_artifacts:
path: "build*/meson
If there are other files that need to be gathered, another artifact instruction could be written or the current artifact instruction could be updated at the .cirrus.yml
file. For example, if you want to collect the docs to review or share with others offline, you can add the instructions below to the task in the .cirrus.yml
file.
configure_script: su postgres -c 'meson setup build'
build_docs_script: |
su postgres <<-EOF
cd build
ninja docs
EOF
docs_artifacts:
path: build/doc/src/sgml/html/*.html
type: text/html
Then, collected logs will be available in the Cirrus CI website in html format.
Apart from the tips mentioned above, here is another tip you might find helpful. At times, we want to run some commands only when we come across a failure. This might be to avoid unnecessary logging and make CI runs faster for successful builds. For example, you may want to gather the logs and stack traces only when there is a test failure. The on_failure
instruction helps to run certain commands only in case of an error.
on_failure:
testrun_artifacts:
paths:
- "build*/testrun/**/*.log"
- "build*/testrun/**/*.diffs"
- "build*/testrun/**/regress_log_*"
type: text/plain
meson_log_artifacts:
path: "build*/meson-logs/*.txt"
type: text/plain
As an example, in the above, the logs are gathered only in case of a failure.
While working on multi-platform databases like Postgres, debugging issues can often be difficult. Postgres CI makes it easier to catch and solve errors since you can work on and test your changes on various settings and platforms. In fact, Postgres automatically runs CI on every commitfest entry via Cfbot to catch errors and report them.
These 4 tips for debugging CI failures should help you speed up your dev/test workflows as you develop Postgres. Remember to use the terminal to connect CI environment, gather logs and files from CI runs, use build options on CI, and run specific commands on failure. I hope these tips will make Postgres development easier for you!