Blog @ RustProof Labs

My book Mastering PostGIS and OpenStreetMap is available!

Setup Geocoder with PostGIS and Tiger/LINE

By Ryan Lambert -- Published October 08, 2023

Geocoding addresses is the process of taking a street address and converting it to its location on a map. This post shows how to create a PostGIS geocoder using the U.S. Census Bureau's TIGER/Line data set. This is part one of a series of posts exploring geocoding addresses. The next post illustrates how to geocode in bulk with a focus on evaluating the accuracy of the resulting geometry data.

Before diving in, let's look at an example of geocoding. The address for Union Station (see on OpenStreetMap) is 1701 Wynkoop Street, Denver, CO, 80202. This address was the input to geocode. The blue point shown in the following screenshot is the resulting point from the PostGIS geocode() function. The pop-up dialog shows the address, a rating of 0, and the calculated distance away from the OpenStreetMap polygon representing that address (13 meters), shown in red under the pop-up dialog.

Screenshot showing the source polygon for Union Station next to the geocoded point using the street address and the PostGIS Geocoder function.

PostgreSQL 16 improves infinity: PgSQLPhriday #012

By Ryan Lambert -- Published September 01, 2023

This month's #pgsqlphriday challenge is the 12th PgSQL Phriday, marking the end of the first year of the event! Before getting into this month's topic I want to give a shout out to Ryan Booz for starting #pgsqlphriday. More importantly though, a huge thank you to the hosts and contributors from the past year! I looked forward to seeing the topic each month followed by waiting to see who all would contribute and how they would approach the topic.

Check out pgsqlphriday.com for the full list of topics, including recaps from each topic to link to contributing posts. This month is the 7th topic I've been able to contribute to the event. I even had the honor of hosting #005 with the topic Is your data relational? I'm really looking forward to another year ahead!

Now returning to your regularly scheduled PgSQL Phriday content.

This month, Ryan Booz chose the topic: What Excites You About PostgreSQL 16? With the release of Postgres 16 expected in the near(ish) future, it's starting to get real. It won't be long until casual users are upgrading their Postgres instances. To decide what to write about I headed to the Postgres 16 release notes to scan through the documents. Through all of the items, I picked this item attributed to Vik Fearing.

Accept the spelling "+infinity" in datetime input

The rest of this post looks at what this means, and why I think this matters.

Load the Right Amount of OpenStreetMap Data

By Ryan Lambert -- Published August 24, 2023

Populating a PostGIS database with OpenStreetMap data is favorite way to start a new geospatial project. Loading a region of OpenStreetMap data enables you with data ranging from roads, buildings, water features, amenities, and so much more! The breadth and bulk of data is great, but it can turn into a hinderance especially for projects focused on smaller regions. This post explores how to use PgOSM Flex with custom layersets, multiple schemas, and osmium. The goal is load limited data for a larger region, while loading detailed data for a smaller, target region.

The larger region for this post will be the Colorado extract from Geofabrik. The smaller region will be the Fort Collins area, extracted from the Colorado file. The following image shows the data loaded in this post with two maps side-by-side. The minimal data loaded for all of Colorado is shown on the left and the full details of Fort Collins is on the right.

Image with two maps. The map on the left is a map showing most of Colorado showing place boundaries and major roadways. The map on the right is a closer view in Fort Collins, Colorado, showing a portion of Colorado State University's campus and residential areas. The map on the right has details of minor roadways, sidewalks, buildings, and even trees.

Getting started with MobilityDB

By Ryan Lambert -- Published August 15, 2023

The MobilityDB is an exciting project I have been watching for a while. The project's README explains that MobilityDB "adds support for temporal and spatio-temporal objects." The spatio-temporal part translates as PostGIS plus time which is very interesting stuff to me. I had briefly experimented with the project in its early days and found a lot of potential. Mobility DB 1.0 was released in April 2022, at the time of writing the 1.1.0 alpha release is available.

This post explains how I got started with MobilityDB using PostGIS and pgRouting. I'm using OpenStreetMap roads data with pgRouting to generate trajectories. If you have gpx traces or other time-aware PostGIS data handy, those could be used in place of the routes I create with pgRouting.

Install MobilityDB

When I started working on this post a few weeks ago I had an unexpectedly difficult time trying to get MobilityDB working. I was trying to install from the production branch with Postgres 15 and Ubuntu 22.04 and ran into a series of errors. It turned out the fixes to allow MobilityDB to work with these latest versions had been in the develop branch for more than a year. After realizing what the problem was I asked a question and got an answer. The fixes are now in the master branch tagged as 1.1.0 alpha. Thank you to everyone involved with making that happen!

To install MobilityDB I'm following the instructions to install from source. These steps involve git clone then using cmake, make, and sudo make install. Esteban Zimanyi explained they are working on getting packaging worked out for providing deb and yum installers. It looks like work is progressing on those!

Update Configuration

After installing the extension, the postgresql.conf needs to be updated to include PostGIS in the shared_preload_libraries and increase the max_locks_per_transaction to double the default value.

shared_preload_libraries = 'postgis-3'
max_locks_per_transaction = 128

Postgres 15: Explain Buffer now with Temp Timings

By Ryan Lambert -- Published June 24, 2023

This post explores a helpful addition to Postgres 15's EXPLAIN output when using BUFFERS. The Postgres 15 release notes mention lists this item:

Add EXPLAIN (BUFFERS) output for temporary file block I/O (Masahiko Sawada)

This improvement adds new detail to the output provided from Postgres 15 when running EXPLAIN (ANALYZE, BUFFERS) <query>. This post explores this feature along with a couple different ways the reported I/O timing interacts with performance tuning.

Getting the feature

The first thing you need is to be using at least Postgres 15. Your instance also needs to have track_io_timing=on in your Postgres configuration file, postgresql.conf. Check the value of this setting with SHOW track_io_timing;.

SHOW track_io_timing;
┌─────────────────┐
│ track_io_timing │
╞═════════════════╡
│ on              │
└─────────────────┘

Test Data and Server

This post used a Postgres 15 instance on a Digital Ocean droplet with 2 AMD CPU and 2 GB RAM. I loaded Colorado OpenStreetMap data via PgOSM Flex. The data loaded to the osm schema weighs in at 2.5 GB, the public schema, 3.3 GB, has the raw data needed to enable PgOSM Flex's --replication feature via osm2pgsql-replication. The advantage to having more data than RAM is it's pretty easy to show I/O timings, which I need for this post!

<-- Older Posts Newer Posts -->