Stubbing toes with auto_explain
Postgres has a handy module called auto_explain
. The
auto_explain
module
lives up to its name: it runs EXPLAIN
automatically for you.
The intent for this module is to automatically provide information useful
for troubleshooting about your slow queries as they happen.
This post outlines a pitfall I recently discovered with
auto_explain
. Luckily for us, it's an easy thing to avoid.
I discovered this by running CREATE EXTENSION postgis;
and watching it run for quite a while before failing with an out of disk space error.
That is not my typical experience with a simple CREATE EXTENSION
command!
Standard use of auto_explain
A common way auto_explain
is used is to target "slow queries" through the setting
auto_explain.log_min_duration
. This setting defines the number of milliseconds to use
as the threshold of when to log the EXPLAIN
output. If your queries are typically 10-50 ms,
you might decide to set auto_explain.log_min_duration = 100
to log queries taking twice as
long as your goal. An instance serving big analytic queries might want to set that much higher,
say 2 or 5 seconds.
Innocent testing
I say my testing was innocent because I wasn't trying to break something.
That makes it innocent, right? I was playing around with auto_explain
trying out
PgMustard's scoring API.
At the time I didn't want to think about where to set that threshold... I just wanted to capture
some explain
output for testing.
The auto_explain
documentation
explains that setting auto_explain.log_min_duration = 0
will capture "all plans." Sounds good, let's do that!
To be clear, I knew that this type of configuration would have negative performance side effects. I had even read warnings such as "beware, this can have severe performance implications." I said to myself "How severe could it be?" I didn't think what I was doing was too bad. After all, I was on a small test machine with a small test database and no load other than my queries.
Early on in my session I had ran the following queries.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;
I ran a variety of basic test queries, checked the logs, ran some more queries, then found myself saying, "I need PostGIS in this database." I ran the following query to create PostGIS without even thinking about it.
CREATE EXTENSION postgis;
I waited. And waited. And waited. Finally, it failed as I mentioned at the beginning.
The reason it failed is because it wrote so much output to the logs that it consumed the
small 20 GB drive I gave it. It was pretty clear that storage space was the issue.
Running sudo du -h --max-depth=1 /var/log/
quickly found the culprit in the
/var/log/postgresql/
directory.
Past the failure
I re-ran the test on a local computer with a lot more storage space. It completed! It took 15 minutes and 38 seconds, but it did at least complete.
The log file (created fresh for this test) resulted in 66 GB of logs!
The exact size of the log file will depend on a variety of other logging related settings, which I also have turned up higher than many production databases use.
What happened?
I set auto_explain.log_min_duration = 0
which reports on "all plans." All.
Creating the PostGIS extension involves running a lot of queries. Further, the output
from EXPLAIN
can be quite lengthy and verbose.
Summary
Like stubbing your toes, this problem is easy to avoid.
Set auto_explain.log_min_duration
to 1 or higher. Only in the most rare cases
would you want to set that to 0.
Setting auto_explain.log_min_duration = 1
works as expected with
CREATE EXTENSION postgis;
and doesn't write dozens of GB of logs.
([local] 🐘) ryanlambert@dev=# LOAD 'auto_explain';
set auto_explain.log_min_duration = 1;
LOAD
Time: 0.757 ms
SET
Time: 0.377 ms
([local] 🐘) ryanlambert@dev=# create extension postgis;
CREATE EXTENSION
Time: 684.501 ms
Need help with your PostgreSQL servers or databases? Contact us to start the conversation!