Book Release! Mastering PostGIS and OpenStreetMap
I'm excited to announce my book, Mastering PostGIS and OpenStreetMap, is available to purchase as of October 1, 2022! This book provides a practical guide to introduce readers to PostGIS, OpenStreetMap data, and spatial querying. Queries used for examples are written against real OpenStreetMap data (included) to help you learn how to navigate and explore complex spatial data. The examples start simple and quickly progress through a variety of clever spatial queries and powerful techniques.
Section 12.3, Create Denver specific tables, is available as a free preview section. The full Table of Contents is available from the free preview page.
Who is this book for?
Mastering PostGIS and OpenStreetMap is for anyone that wants to learn more about PostGIS and/or OpenStreetMap data. The hefty Appendix helps keep new users on track without distracting users with more experience. The following table gives an idea of the topics covered.
Topic | Included? |
---|---|
Install PostGIS | ✅ |
Spatial SQL queries | ✅ |
Basics of OpenStreetMap tagging | ✅ |
Load OpenStreetMap data to PostGIS | ✅ |
Find and use local SRIDs everywhere | ✅ |
Handle real-world (dirty!) data | ✅ |
Performance of Geometry vs. Geography | ✅ |
Routing | ✅ |
Postgres 15 improves UNIQUE and NULL
Postgres 15 beta 2 was released
recently! I enjoy Beta season... reviewing and testing new features
is a fun diversion from daily tasks. This post takes a look at an improvement
to UNIQUE
constraints on columns with NULL
values. While the nuances of unique constraints are not as flashy
as making sorts faster (that's exciting!),
improving the database developer's control over data quality is always a good benefit.
This email chain has the history behind this change. The Postgres 15 release notes summarize this improvement:
"Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)
Previously
NULL
values were always indexed as distinct values, but this can now be changed by creating constraints and indexes usingUNIQUE NULLS NOT DISTINCT
."
Two styles of UNIQUE
To take a look at what this change does, we create two tables.
The null_old_style
table has a 2-column UNIQUE
constraint
on (val1, val2)
. The val2
allows NULL
values.
CREATE TABLE null_old_style
(
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
val1 TEXT NOT NULL,
val2 TEXT NULL,
CONSTRAINT uq_val1_val2
UNIQUE (val1, val2)
);
H3 indexes for performance with PostGIS data
I recently started using the H3 hex grid extension in Postgres with the goal of making some not-so-fast queries faster. My previous post, Using Uber's H3 hex grid in PostGIS, has an introduction to the H3 extension. The focus in that post, admittedly, is a PostGIS focused view instead of an H3 focused view. This post takes a closer look at using the H3 extension to enhance performance of spatial searches.
The two common spatial query patterns considered in this post are:
- Nearest neighbor style searches
- Regional analysis
This post used the H3 v3 extension. See Using v4 of the Postgres H3 extension for usage in the latest version.
Setup and Point of Focus
This post uses two tables to examine performance.
The following queries add an h3_ix
column to the osm.natural_point
and osm.building_polygon
tables. This approach uses
GENERATED
columns
and adds an index to the column. Going through these steps allow us
to remove the need for PostGIS joins at query time for rough distance searches.
See my
previous post for details about installing
the H3 extension and the basics of how it works.
Using Uber's H3 hex grid in PostGIS
This post explores using the H3 hex grid system
within PostGIS. H3 was developed by Uber and has some cool benefits
over the PostGIS native ST_HexagonGrid()
function used
in my post Find missing crossings in OpenStreetMap with PostGIS.
The hex grid built-in to PostGIS is great for one-off projects covering a specific region,
though it has shortcomings for larger scale consistency.
On the other hand, the H3 grid is a globally defined grid that scales up and down
through resolutions neatly. For more details, read
Uber's description.
This post used the H3 v3 extension. See Using v4 of the Postgres H3 extension for usage in the latest version.
This post works through a few of the functions available in the H3 extension and how they can be used for spatial aggregation in an analysis. One additional focus is how to generate a table of H3 hexagons for a given resolution.
Note: This post does not focus on using H3 for the best performance. See my post H3 indexes for performance with PostGIS data for a look into high performance spatial searches with H3.
Install H3 in Postgres
The H3 library is available to PostGIS as a Postgres extension. I am using
the bytesandbrains h3-pg project
available on GitHub. The extension can be installed using
pgxn install h3
. Once installed, create the H3 extension in the database.
CREATE EXTENSION h3;
Postgres Data Dictionary for everyone
A data dictionary is an important tool for anyone that stores and consumes data. The PgDD extension makes it easy to inspect and explore your data structures in Postgres. This post shows how PgDD provides access to current and accurate information about your databases for a variety of users:
- Analysts
- DBAs and Developers
- The Business
This data dictionary information from PgDD is made available using standard SQL by querying a small set of views.
Background
Relational databases, including Postgres, track the majority of the information
needed for a data dictionary. This is done in the underlying
system catalogs;
Postgres' system catalogs are in the pg_catalog
schema.
The challenge with using the system catalogs is they are not very
user friendly to query for the type of details commonly needed.
PgDD does not do anything magical, it is simply
a wrapper around the Postgres system catalogs!