PostgreSQL 12 and PostGIS 3 Initial Review
Postgres 12 and PostGIS 3 have been out and production ready for about a month now. I have been testing the upgrade from Pg 11 to 12 and have evaluated various workloads, mainly looking for red flags that would hold up an upgrade. At this point I am happy with what I see and am planning the upgrade for all our Postgres databases! This post covers a few points I found interesting during my initial testing. See the Postgres 12 release notes and PostGIS 3 release notes for more details about what has changed in these versions.
Reasons to upgrade
The main feature making me want to upgrade to Postgres 12 is stored generated columns. This allows columns to store a calculated value to disk, much like materialized views store the results of queries. Not all our Pg instances have plans to use the feature at this time, so I could hold some back on Pg11 if I wanted. I decided we will upgrade all instances anyway, largely because with the changes to streaming replication it will be easier and safer to not have to keep two sets of procedures.
The other main feature I have had an eye on is the ability to create covering GIST indexes. The idea here is that a GIST index can be used to find appropriate records from the database, and the INCLUDE allows adding other columns to be included in the index and avoid pulling them from disk.
Generated columns
My first use for generated columns in Postgres 12 is to combine components of
physical addresses into a single field. This is helpful with
data I am geocoding with PostGIS' Tiger geocoder. The data typically comes in with the
address split out into its individual components (street, city, state, zip),
while the address normalizer wants a single string.
The following example will add a generated column named address_full
to automatically combine the fields from their sub-components.
ALTER TABLE co_businesses
ADD address_full TEXT
GENERATED ALWAYS AS (address1 || ', ' || city || ', ' || state_code || ' ' || zip)
STORED;
Note: This assumes each individual field is set to
NOT NULL
. If values can containNULL
you should useCOALESCE()
or the entire string will be empty.
Covering GIST Indexes
The basic syntax to create a covering index:
CREATE INDEX GIX_building_covering_with_name
ON osm.building_polygon USING GIST (way)
INCLUDE (name)
;
I did some quick testing on existing data and queries but was unable to find a query that took advantage of this. I need to put in more time into testing the nuances of this feature before I give up on it.
Performance
My initial impression of Postgres 12's performance is it is roughly consistent with Pg11 on the same hardware. Some parts are a bit faster (creating GIST indexes is about 6% faster), other parts are a bit slower (bulk loading data). Kaarel Moppel found a slight decline in performance and attributes much of that to the overhaul of the storage layer.
"... the average performance actually decreased slightly! Well, it’s only a modest 3.3% for the mean times .... And it can of course be totally normal and expected, since the storage architecture was overhauled ..."
Query to examine
I settled on one query to examine EXPLAIN (ANALYZE)
output that provides a few interesting
points.
The basic query is to find points (trees) within a buffer around a building.
In this case we filter for a specific building and find trees within 1km (1000m) of the building.
The underlying tables are loaded with OpenStreetMap data for Colorado
(read how to load and transform).
The way
columns in each table have appropriate GIST
indexes
but no index is on the building name
column, or the natural
column from the point table.
SELECT t.natural, t.way
FROM osm.building_polygon b
INNER JOIN osm.natural_point t
ON t.natural = 'tree'
AND ST_Contains(ST_Buffer(b.way, 1000), t.way)
WHERE b.name = 'Pangea Coffee Roasters'
;
The query was ran on two server configurations, a Digital Ocean droplet with 4 CPU and
8 GB RAM, and a local VM with 4 CPU and 4 GB RAM. The VM and the DO servers were loaded
with OSM data loaded on different dates, so row counts do not match identically. The
output from the four (4) EXPLAIN (ANALYZE)
outputs was loaded into https://explain.depesz.com linked below.
- DO - Postgres 11, PostGIS 2.5
- DO - Postgres 12, PostGIS 3
- VM - Postgres 11, PostGIS 2.5
- VM - Postgres 12, PostGIS 3
EXPLAIN observation: Updated output
Output of EXPLAIN with GIST indexes now uses @
instead of ~
.
Also, Pg12 reorders the condition itself, I'm not sure if that's purely cosmetic
or an indicator that it is reordering for efficiency.
Postgres 11 / PostGIS 2.5 index condition:
Index Cond: (st_buffer(b.way, '1000'::double precision) ~ way)
Postgres 12 / PostGIS 3 index condition:
Index Cond: (way @ st_buffer(b.way, '1000'::double precision, ''::text))
EXPLAIN observation: Goodbye, recheck
Postgres 11 and PostGIS 2.5 (and earlier) have always had a Recheck condition listed when filtering on a GIST index. With Postgres 12 and PostGIS 3 (both minimum required) the need to recheck disappears. An example of this output:
Recheck Cond: (st_buffer(b.way, '1000'::double precision) ~ way)
It appears this change is a result of this commit, thanks to Jeff Janes for helping find that.
Key takeaways
- Generated Columns working as expected
- Faster GIST index creation
- Streaming replication changes
Summary
Postgres 12 and PostGIS 3 are both great steps forward! This post focused on a few key details and early observations. There will be much more to write about after our production upgrades are completed!
My first goal with testing was to confirm there wasn't a major performance regression. Second, I needed to ensure the features I'm upgrading for will work the way I intend them to. All in all, success!
Need help with your PostgreSQL servers or databases?
Contact us to start the conversation!