PostgreSQL performance on Raspberry Pi
Have you ever wondered how well PostgreSQL performs on a
Raspberry Pi 3B?
Oh. You haven't? Oh well, I asked myself that question
and put it to the test. This post covers how I used
to answer this question and the results of what I found.
I love the Raspberry Pi, my post Low Power Computing with Raspberry Pi explains more.
This post is part of the series PostgreSQL: From Idea to Database.
Modern PostgreSQL (>=9.6), out-of-the-box, achieves around 200 TPS
TPC-B (sort of)) on a Raspberry Pi 3B.
That's 17+ million transactions per day!
What was tested
I tested four PostgreSQL versions, 9.6.12, 10.7, 11.2, and 12-dev. Tests
were ran using four Raspberry Pi 3Bs mounted as a
Rack-o-Pi. The first
round of testing used one version of Postgres installed on each
Pi. For the rest of the testing I focused on Pg11 and Pg12.
These little machines put in more than 14 hours of digital blood, sweat and tears to produce this post!
All versions of Postgres tested were built from source.
pgBenchran locally on each machine.
I kept most of the default configurations for each version, though there are four (4) settings I controlled for this testing.
All other settings are whatever the default was for that setting in the tested version.
The defaults for
0.5) don't work well for this case,
I've adjusted those to
0.8 for all tests. I didn't
capture the results from those initial tests with the default settings,
but the difference was noticeable.
wal_level deserves a mention; the default in Pg9.6 was
and in Pg10 and later the default is
replica. This change requires a lot of
extra WAL activity that seriously hinders performance on the Pi.
All tests here used
wal_level = minimal. (Again, initial tests with replica were noticably slower!)
The only PostgreSQL configuration that changed throughout these tests was
max_parallel_workers_per_gather, set to 2 by default in Pg10 and newer.
Parallel queries was introduced in Pg9
.6 but was that value was 0 by default.
Read my post on parallel query in Pg10 to learn more.
Each test was ran with a fresh database initialized with a scale of 10 using these commands.
/usr/local/pgsql/bin/createdb bench_test /usr/local/pgsql/bin/pgbench -i -s 10 bench_test
This creates a
bench_test database with the
pgbench_accounts table storing 1M rows, taking up 133 MB for that table.
postgres@tag103:~$ /usr/local/pgsql/bin/psql -d bench_test -c "SELECT COUNT(*) FROM pgbench_accounts;" count --------- 1000000 (1 row)
Show tables with sizes:
postgres@tag103:~$ /usr/local/pgsql/bin/psql -d bench_test -c "\dt+" List of relations Schema | Name | Type | Owner | Size | Description --------+------------------+-------+----------+--------+------------- public | pgbench_accounts | table | postgres | 133 MB | public | pgbench_branches | table | postgres | 96 kB | public | pgbench_history | table | postgres | 41 MB | public | pgbench_tellers | table | postgres | 392 kB |
The following command is the
pgbench command used.
The only change was if the
-T parameter was adjusted between
3600 (1 hour) or 600 (10 minutes).
pgbench -c 10 -j 2 -T 3600 -P 60 bench_test starting vacuum...end. ... transaction type: <builtin: TPC-B (sort of)> scaling factor: 10 query mode: simple number of clients: 10 number of threads: 2 duration: 3600 s number of transactions actually processed: 772292 latency average = 46.614 ms latency stddev = 875.019 ms tps = 214.522850 (including connections establishing) tps = 214.523484 (excluding connections establishing)
Let's take a closer look at these two lines of the output at the end.
duration: 3600 s number of transactions actually processed: 772292
More than 750k transactions per hour!
I think it's cool that a Raspberry Pi 3B with it's 5V power supply, SD card, ARM processor (1.2GHz) and 1GB of RAM handled 750k transactions in an hour. Then again, I am a bit biased.
Within the first two runs of
pgbench it was obvious that one of my
Pi's was not keeping pace with the others. I ran it through the full
range of tests and found that particular hardware consistently ran 20%
slower than the median of the other three nodes. I had had issues
with that particular Pi in the past and my suspicion was the SD card is
going bad, this could be pointing to the same thing...
The results from the slow node have been removed from all aggregate results presented in the remainder of this post.
On the Raspberry Pi, you can expect roughly the same performance across versions 9.6 through 11.
|Pg Version||Min TPS||Median TPS||Max TPS||Std Dev||# Tests|
PostgreSQL 11 had less variance in its performance as seen in the standard deviation of TPS observed. Preliminary tests of the Pg12 development branch show further improvements in performance, but the real test is how it works when it goes live!
One more note one the above results is that versions 11 and 12 had
more than one test run at 1 hour, while versions 9.6 and 10 only received
a single one-hour test.
The difference between running this
pgbench test for 10 minutes versus 60
minutes caused a 10% drop in average TPS.
Parallel on Pi?
Starting with Postgres 10, the default is to give 2 cores to
parallel processing. Switching
max_parallel_workers_per_gather between 2 and 0 had nearly
zero impact on that Pi, either for or against.
When working with mobile-scale technology, there are plenty of
other bottlenecks that parallel just won't help with. Also,
the queries used for the
TPC-B (sort of) tests are not the typical
workload that parallel query is expected to benefit.
|Pg Version||# Parallel Workers||Min TPS||Median TPS||Max TPS||Std Dev||# Tests|
Reporting style queries with aggregations are better suited for parallel query. See my follow-up post for those results.
Incremental reporting with
pgbench command I used includes
-P 60. This tells
to print out progress updates every 60 seconds. I highly recommend
you do this for longer tests, you'll probably see some details that
are otherwise obscured by aggregated details. Notice the severe
change in TPS and related metrics around 10 minutes (600 seconds)
into this test run.
pgbench -c 10 -j 2 -T 3600 -P 60 bench_test starting vacuum...end. ... progress: 540.0 s, 171.8 tps, lat 55.105 ms stddev 946.851 progress: 600.0 s, 24.6 tps, lat 435.693 ms stddev 2945.727 progress: 660.0 s, 405.8 tps, lat 24.108 ms stddev 134.692 ...
Notice that across three minutes it went from 171 TPS, down to 25 TPS, and then back up to 400 TPS? That's a very big swing in performance...
Pi Hardware Considerations
These tests all use standard Pi hardware including the often-loathed micro-SD card. Ultimately, I made the decision to keep things as standard as possible in order to keep things generic. The SD cards used in these tests have all been in 24/7 use for at least six (6) months with occasional abuse as required for content like this, and how to load OSM data on the Pi. The slow-node mentioned earlier happens to live on the Pi that I've beat-up the most with OpenStreetMap and other large spatial data sets. I've repeated loaded multi-GB databases to that poor 16GB card... not surprised it is wearing out the SD card!
That said, I did perform the same basic tests with a few USB drives I had laying around. Putting the entire Postgres data directory (including logs, WAL, etc) on a USB drive seemed to drop performance considerably. That was unexpected, and bad. By putting either the data files or the logs/WAL on the USB (leaving the other portion on the SD card) seemed to get a moderate performance boost, though far from a major boost like I had hoped for. I suspect this is due to the USB bus sharing with Ethernet as mentioned in this article:
"... still shares a single USB channel to the SoC, as in previous designs..."
The only other testing I could find of PostgreSQL on the Raspberry Pi was this post from 2013 on older hardware and PostgreSQL 9.2. The results for the "small" database in that post (slightly smaller than tested here) showed 14 TPS for the TPC-B test. Compared to >200 TPS that I found here!
There are plenty of limitations when working with this hardware. This post has a good explanation of the shared bus issues, thermal limiting, and so forth.
The sort-of TCP-B tests are also a limitation. This test workload
looks nothing like the queries I actually run on a Raspberry Pi.
The queries used in the default
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; SELECT abalance FROM pgbench_accounts WHERE aid = :aid; UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
Most of what I use a Pi for is reporting type work including joins, aggregations, and such. Even using the read-only option with the default tests does not provide a real test for this type of workload.
Read part 2 (the reporting edition) for a look at testing a reporting-style PostgreSQL workload.
PostgreSQL truly is an amazing database platform to be able to handle such limited resources so eloquently. The Raspberry Pi is quite a cool little piece of hardware, you just have to understand what you're working with.
Rack-o-Pi could easily be setup with a single primary node
to handle the writes, streaming replication the other three nodes
serving read-only queries and you'd be able to handle quite a respectable
load (assuming more read than write).
My next round of
pgbench testing with PostgreSQL will be on
more powerful systems. Stay tuned!
Need help with your PostgreSQL servers or databases? Contact us to start the conversation!
Published April 03, 2019
Last Updated April 18, 2019