Blog @ RustProof Labs

My book Mastering PostGIS and OpenStreetMap is available!

Can you use ltree for Nested Place Data?

By Ryan Lambert -- Published February 29, 2024

The topic of the ltree data type has come up a few times recently. This intersects with a common type of query used in PostGIS: nested geometries. An example of nested geometries is the state of Colorado exists within the United States. The PgOSM Flex project calculates and stores nested polygon data from OpenStreetMap places into a handful of array (TEXT[], BIGINT[]) columns. I decided to explore ltree to see if it would be a suitable option for PgOSM Flex nested places.

Spoiler alert: ltree is not suitable for OpenStreetMap data in the way I would want to use it.

Nested data in arrays

The following is what the "Colorado is in the U.S" would look like using a Postgres TEXT[] array:

{"United States","Colorado"}

Hosting a set of Postgres Demo databases

By Ryan Lambert -- Published February 10, 2024

In April 2023, I submitted my proposal for a full-day pre-conference at PASS 2023. My chosen topic was focused on PostGIS, titled GIS Data, Queries, and Performance. A key part of my submission was that the session would be an interactive, follow-along type design. Julie and I believe that doing is key to learning so we wanted to enforce that as much as possible. The plan was to use real data and queries to teach a nuanced, technical topic to an audience of unknown size or background. I also knew that PASS is very much a Microsoft focused community.

Knowing these things, I could not assume specific pre-existing knowledge about Postgres and PostGIS. I also didn't want to assume they had a Postgres 15 instance with PostGIS immediately available. I decided the best approach was to provide participants each a demo Postgres database so they didn't have to worry about those steps. These demo databases would be pre-loaded with the same data and extensions I used for my demos. This would allow participants to run the same queries, on the same data, on the same general hardware.

Of course, when my proposal was accepted then I realized I had to figure out how I was actually going to deliver! This post explains how I deployed demo databases to the participants of my PASS 2023 pre-con session.

The net result was a reliable, secure (enough), scalable, and affordable setup.

UUID in Postgres: PgSQLPhriday #015

By Ryan Lambert -- Published January 31, 2024

This month's PgSQLPhriday #015 topic is about UUIDs. Lætitia Avrot is this month's host, see her post for the full challenge text. The topic is described as a debate between the Database People and Developers. I'm not sure there's such a clean divide on people's opinions on the topic, as I know plenty of Database People that have settled on using UUIDs as their default. Similarly, I know even more developer types that have followed the arguably more conventional choice of using an auto-incrementing ID.

TLDR;

I avoid UUIDs. The only places I have used UUIDs in production are the places where a 3rd party system is involved.

Working with GPS data in PostGIS

By Ryan Lambert -- Published December 18, 2023

One of the key elements to using PostGIS is having spatial data to work with! Lucky for us, one big difference today compared to the not-so-distant past is that essentially everyone is carrying a GPS unit with them nearly everywhere. This makes it easy to create your own GPS data that you can then load into PostGIS! This post explores some basics of loading GPS data to PostGIS and cleaning it for use. It turns out, GPS data fr om nearly any GPS-enabled device comes with some... character. Getting from the raw input to usable spatial data takes a bit of effort.

This post starts with using ogr2ogr to load the .gpx data to PostGIS. Once the data is in PostGIS then we actually want to do something with it. Before the data is completely usable, we should spend some time cleaning the data first. Technically you can start querying the data right away, however, I have found there is always data cleanup and processing involved first to make the data truly useful.

This is especially true when using data collected over longer periods of time with a variety of users and data sources.

Travel database project

Before I started writing this post I had assumed that all of the code would be contained in the body of this post. It turned out that to get to the quality I wanted, I had to create a new travel database project (MIT licensed) to share the code. This is already a long post without including a few hundred more lines of code! The travel-db project creates a few tables, a view, and a stored procedure in the travel schema. The stored procedure travel.load_bad_elf_points is responsible for cleaning an importing the data and weighs in at nearly 500 lines of code itself.

Postgres Events: PgSQLPhriday #014

By Ryan Lambert -- Published December 01, 2023

It is PgSQLPhriday time again! This month's event is PgSQLPhriday (#014) and is hosted by Pavlo Golub. I'm barely making the deadline, but didn't want to miss this one! Pavlo chose PostgreSQL Events for the focus for this month's topic. See his post for the full details. As always, I can't wait to read the rest of the contributions this month. This post addresses roughly three of his questions.

Networking

It just so happens, I finally got to meet Pavlo in person at the PASS 2023 summit in Seattle, Washington! 👋

Q: "Discuss the importance of networking. Have you formed valuable connections or partnerships as a result of these events?"

<-- Older Posts