RustProof Labs: blogging for education (logo)

Using Uber's H3 hex grid in PostGIS

By Ryan Lambert -- Published April 24, 2022

This post explores using the H3 hex grid system within PostGIS. H3 was developed by Uber and has some cool benefits over the PostGIS native ST_HexagonGrid() function used in my post Find missing crossings in OpenStreetMap with PostGIS. The hex grid built-in to PostGIS is great for one-off projects covering a specific region, though it has shortcomings for larger scale consistency. On the other hand, the H3 grid is a globally defined grid that scales up and down through resolutions neatly. For more details, read Uber's description.

This post works through a few of the functions available in the H3 extension and how they can be used for spatial aggregation in an analysis. One additional focus is how to generate a table of H3 hexagons for a given resolution.

Install H3 in Postgres

The H3 library is available to PostGIS as a Postgres extension. I am using the bytesandbrains h3-pg project available on GitHub. The extension can be installed using pgxn install h3. Once installed, create the H3 extension in the database.

CREATE EXTESION h3;

Continue Reading

Postgres Data Dictionary for everyone

By Ryan Lambert -- Published January 04, 2022

A data dictionary is an important tool for anyone that stores and consumes data. The PgDD extension makes it easy to inspect and explore your data structures in Postgres. This post shows how PgDD provides access to current and accurate information about your databases for a variety of users:

  • Analysts
  • DBAs and Developers
  • The Business

This data dictionary information from PgDD is made available using standard SQL by querying a small set of views.

Background

Relational databases, including Postgres, track the majority of the information needed for a data dictionary. This is done in the underlying system catalogs; Postgres' system catalogs are in the pg_catalog schema. The challenge with using the system catalogs is they are not very user friendly to query for the type of details commonly needed. PgDD does not do anything magical, it is simply a wrapper around the Postgres system catalogs!

Continue Reading

Permissions required for PostGIS

By Ryan Lambert -- Published December 01, 2021

PostGIS is a widely popular spatial database extension for Postgres. It's also one of my favorite tools! A recent discussion on the People, Postgres, Data Discord server highlighted that the permissions required for various PostGIS operations were not clearly explained in the PostGIS documentation. As it turned out, I didn't know exactly what was required either. The basic PostGIS install page provides resources for installing the binary on the server and the basic CREATE EXTENSION commands, but does not explain permissions required.

This post explores the permissions required for three types of PostGIS interactions:

  • Install/Create PostGIS
  • Use PostGIS
  • Load data from pg_dump

Database and Users

I am using Postgres installed on my laptop for these tests, Postgres 13.5 and PostGIS 3.1. I created an empty database named postgis_perms and check the \du slash command in psql to see the current roles. This instance has my my ryanlambert role, a superuser, and the default postgres role. The postgres role is not used in this post outside of this example.

([local] 🐘) ryanlambert@postgis_perms=# \du
                                     List of roles
┌─────────────┬────────────────────────────────────────────────────────────┬───────────┐
│  Role name  │                         Attributes                         │ Member of │
╞═════════════╪════════════════════════════════════════════════════════════╪═══════════╡
│ postgres    │ Superuser, Create role, Create DB, Replication, Bypass RLS │ {}        │
│ ryanlambert │ Superuser, Create role, Create DB                          │ {}        │
└─────────────┴────────────────────────────────────────────────────────────┴───────────┘

Continue Reading

Find missing crossings in OpenStreetMap with PostGIS

By Ryan Lambert -- Published November 07, 2021

The #30DayMapChallenge is going on again this November. Each day of the month has a different theme for that day's map challenge. These challenges do not have a requirement for technology, so naturally I am using OpenStreetMap data stored in PostGIS with QGIS for the visualization component.

The challenge for Day 5 was an OpenStreetMap data challenge. I decided to find and visualize missing crossing tags. Crossing tags are added to the node (point) where a pedestrian highway (e.g. highway=footway) intersects a motorized highway (e.g. highway=tertiary). This post explains how I used PostGIS and OpenStreetMap data to find intersections missing a dedicated crossing tag.

Without further ado, here was my submission for Day 5.

Map of the Denver, Colorado metro area with a shaded hex grid overlay. Title reads "% of Footway Intersections missing Crossing".  Subtitles read "Denver Metro area, November 2021" and "#30DayMapChallenge - 2021 Day 5: OpenStreetMap". The hex grid is shaded from light red to dark red (5 gradients), with only 4 of the lightest shaded areas around Denver proper.  Throughout the rest of the inner-metro area are shades 2-4 (35% through 94% missing) with most of the outer regions in the 100% or "no data" area.

Continue Reading

Using Query ID in Postgres 14

By Ryan Lambert -- Published October 15, 2021

Postgres 14 was released on September 30, 2021. With a new major version comes new features to explore! This post takes a look at the unique query id option enabled with compute_query_id in postgresql.conf. This particular backend improvement, included with Postgres 14, is one I am excited about because it makes investigating and monitoring query related performance easier. This post covers how to enable the new feature and explores how it can be used in real life performance tuning.

Enable query id

For testing I created a new instance with Postgres 14 installed and edited the postgresql.conf file to change a few configuration options related to the query id. I set compute_query_id to on instead of auto and to allow the pg_stat_statements extension to be loaded. Additionally, I turn on log_duration, set log_statement to all and update log_line_prefix to include query_id=%Q,

compute_query_id = on
shared_preload_libraries = 'pg_stat_statements'

log_duration = on
log_statement = 'all'
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h,query_id=%Q '

Continue Reading