RustProof Labs: blogging for education (logo)
My book Mastering PostGIS and OpenStreetMap is available!

Stubbing toes with auto_explain

By Ryan Lambert -- Published December 20, 2022

Postgres has a handy module called auto_explain. The auto_explain module lives up to its name: it runs EXPLAIN automatically for you. The intent for this module is to automatically provide information useful for troubleshooting about your slow queries as they happen. This post outlines a pitfall I recently discovered with auto_explain. Luckily for us, it's an easy thing to avoid.

I discovered this by running CREATE EXTENSION postgis; and watching it run for quite a while before failing with an out of disk space error. That is not my typical experience with a simple CREATE EXTENSION command!

Standard use of auto_explain

A common way auto_explain is used is to target "slow queries" through the setting auto_explain.log_min_duration. This setting defines the number of milliseconds to use as the threshold of when to log the EXPLAIN output. If your queries are typically 10-50 ms, you might decide to set auto_explain.log_min_duration = 100 to log queries taking twice as long as your goal. An instance serving big analytic queries might want to set that much higher, say 2 or 5 seconds.

Innocent testing

I say my testing was innocent because I wasn't trying to break something. That makes it innocent, right? I was playing around with auto_explain trying out PgMustard's scoring API. At the time I didn't want to think about where to set that threshold... I just wanted to capture some explain output for testing. The auto_explain documentation explains that setting auto_explain.log_min_duration = 0 will capture "all plans." Sounds good, let's do that!

To be clear, I knew that this type of configuration would have negative performance side effects. I had even read warnings such as "beware, this can have severe performance implications." I said to myself "How severe could it be?" I didn't think what I was doing was too bad. After all, I was on a small test machine with a small test database and no load other than my queries.

Early on in my session I had ran the following queries.

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;

I ran a variety of basic test queries, checked the logs, ran some more queries, then found myself saying, "I need PostGIS in this database." I ran the following query to create PostGIS without even thinking about it.

CREATE EXTENSION postgis;

I waited. And waited. And waited. Finally, it failed as I mentioned at the beginning. The reason it failed is because it wrote so much output to the logs that it consumed the small 20 GB drive I gave it. It was pretty clear that storage space was the issue. Running sudo du -h --max-depth=1 /var/log/ quickly found the culprit in the /var/log/postgresql/ directory.

Past the failure

I re-ran the test on a local computer with a lot more storage space. It completed! It took 15 minutes and 38 seconds, but it did at least complete.

Screenshot showing psql running the query create extension postgis; took nearly 16 minutes to complete.

The log file (created fresh for this test) resulted in 66 GB of logs!

Screenshot showing 66 GB log file resulting from this misuse of auto_explain

The exact size of the log file will depend on a variety of other logging related settings, which I also have turned up higher than many production databases use.

What happened?

I set auto_explain.log_min_duration = 0 which reports on "all plans." All. Creating the PostGIS extension involves running a lot of queries. Further, the output from EXPLAIN can be quite lengthy and verbose.

Summary

Like stubbing your toes, this problem is easy to avoid. Set auto_explain.log_min_duration to 1 or higher. Only in the most rare cases would you want to set that to 0. Setting auto_explain.log_min_duration = 1 works as expected with CREATE EXTENSION postgis; and doesn't write dozens of GB of logs.

([local] 🐘) ryanlambert@dev=# LOAD 'auto_explain';
set auto_explain.log_min_duration = 1;
LOAD
Time: 0.757 ms
SET
Time: 0.377 ms
([local] 🐘) ryanlambert@dev=# create extension postgis; 
CREATE EXTENSION
Time: 684.501 ms

Need help with your PostgreSQL servers or databases? Contact us to start the conversation!

By Ryan Lambert
Published December 20, 2022
Last Updated December 20, 2022