RustProof Labs: blogging for education (logo)
My book Mastering PostGIS and OpenStreetMap is available!

Revisiting Decisions: Improving Systems and Processes

By Ryan Lambert -- Published July 06, 2025

Revisiting your code of the past is a worthwhile exercise. More specifically, revisiting the decisions, reasons, and other factors that led to the current state is an extremely valuable exercise. Authors of technical work (databases, applications, calculations, etc) should re-evaluate the purpose of our systems and decide when "the way it's always been" needs to become "the way it used to be." While it is easy to proclaim this, it is often difficult to do in practice. Why? Because it's complicated, of course! 🙄 The more serious answer is that humans, including yourself, are involved. Technical people like to focus on the technology even though the technology is often a side-story to the bigger picture.

This post tells a story of a single project that spanned over more than a decade. While there are technical elements involved, you'll see some of the biggest technical issues were human created.

A non-exhaustive list of human elements include:

  • Who was involved? (developer, end-user, other stakeholders, etc.)
  • Experience of the people involved.
  • Internal politics, then and now.
  • Process and systems evolution.

The Widgets database

This story you are about to read is true. The names have been changed to protect the innocent. For this post, we will refer to the project as "The Widgets Database." It started with an Excel file that had far exceeded what should be done in Excel. The top of the main Widgets worksheet had a line with the warning:

Do not sort, you will break formulas. Spoiler alert! The formulas were already broken.

Continue Reading

Can you use ltree for Nested Place Data?

By Ryan Lambert -- Published February 29, 2024

The topic of the ltree data type has come up a few times recently. This intersects with a common type of query used in PostGIS: nested geometries. An example of nested geometries is the state of Colorado exists within the United States. The PgOSM Flex project calculates and stores nested polygon data from OpenStreetMap places into a handful of array (TEXT[], BIGINT[]) columns. I decided to explore ltree to see if it would be a suitable option for PgOSM Flex nested places.

Spoiler alert: ltree is not suitable for OpenStreetMap data in the way I would want to use it.

Nested data in arrays

The following is what the "Colorado is in the U.S" would look like using a Postgres TEXT[] array:

{"United States","Colorado"}

Continue Reading

Hosting a set of Postgres Demo databases

By Ryan Lambert -- Published February 10, 2024

In April 2023, I submitted my proposal for a full-day pre-conference at PASS 2023. My chosen topic was focused on PostGIS, titled GIS Data, Queries, and Performance. A key part of my submission was that the session would be an interactive, follow-along type design. Julie and I believe that doing is key to learning so we wanted to enforce that as much as possible. The plan was to use real data and queries to teach a nuanced, technical topic to an audience of unknown size or background. I also knew that PASS is very much a Microsoft focused community.

Knowing these things, I could not assume specific pre-existing knowledge about Postgres and PostGIS. I also didn't want to assume they had a Postgres 15 instance with PostGIS immediately available. I decided the best approach was to provide participants each a demo Postgres database so they didn't have to worry about those steps. These demo databases would be pre-loaded with the same data and extensions I used for my demos. This would allow participants to run the same queries, on the same data, on the same general hardware.

Of course, when my proposal was accepted then I realized I had to figure out how I was actually going to deliver! This post explains how I deployed demo databases to the participants of my PASS 2023 pre-con session.

The net result was a reliable, secure (enough), scalable, and affordable setup.

Continue Reading

UUID in Postgres: PgSQLPhriday #015

By Ryan Lambert -- Published January 31, 2024

This month's PgSQLPhriday #015 topic is about UUIDs. Lætitia Avrot is this month's host, see her post for the full challenge text. The topic is described as a debate between the Database People and Developers. I'm not sure there's such a clean divide on people's opinions on the topic, as I know plenty of Database People that have settled on using UUIDs as their default. Similarly, I know even more developer types that have followed the arguably more conventional choice of using an auto-incrementing ID.

TLDR;

I avoid UUIDs. The only places I have used UUIDs in production are the places where a 3rd party system is involved.

Continue Reading

Working with GPS data in PostGIS

By Ryan Lambert -- Published December 18, 2023

One of the key elements to using PostGIS is having spatial data to work with! Lucky for us, one big difference today compared to the not-so-distant past is that essentially everyone is carrying a GPS unit with them nearly everywhere. This makes it easy to create your own GPS data that you can then load into PostGIS! This post explores some basics of loading GPS data to PostGIS and cleaning it for use. It turns out, GPS data fr om nearly any GPS-enabled device comes with some... character. Getting from the raw input to usable spatial data takes a bit of effort.

This post starts with using ogr2ogr to load the .gpx data to PostGIS. Once the data is in PostGIS then we actually want to do something with it. Before the data is completely usable, we should spend some time cleaning the data first. Technically you can start querying the data right away, however, I have found there is always data cleanup and processing involved first to make the data truly useful.

This is especially true when using data collected over longer periods of time with a variety of users and data sources.

Travel database project

Before I started writing this post I had assumed that all of the code would be contained in the body of this post. It turned out that to get to the quality I wanted, I had to create a new travel database project (MIT licensed) to share the code. This is already a long post without including a few hundred more lines of code! The travel-db project creates a few tables, a view, and a stored procedure in the travel schema. The stored procedure travel.load_bad_elf_points is responsible for cleaning an importing the data and weighs in at nearly 500 lines of code itself.

Continue Reading