PostGIS Trajectory: Space plus Time
A few months ago I started experimenting with a few project ideas involving data over space and time. Naturally, I want to use Postgres and PostGIS as the main workhorse for these projects, the challenge was working out exactly how to pull it all together. After a couple false starts I had to put those projects aside for other priorities. In my free time I have continued working through some related reading on the topic. I found why you should be using PostGIS trajectories by Anita Graser and recommend reading that before continuing with this post. In fact, read Evaluating Spatio-temporal Data Models for Trajectories in PostGIS Databases while you're at it! There is great information in those resources with more links to other resources.
This post outlines examples of how to use these new PostGIS trajectory tricks with
OpenStreetMap data I already have available
(load
and
prepare
).
Often, trajectory examples assume using data collected from our
new age of IoT sensors sending GPS points and timestamps. This example approaches
trajectories from a data modeling perpective instead, showing how to synthesize trajectory data using pgrouting
.
Visualization of data is a critical component of sharing information,
QGIS
has long been my favorite GIS application to use with PostGIS data.
Find your local SRID in PostGIS
The past few weeks I had been tossing around some ideas that resulted in me
looking for a particular data set. I needed to get the
bounding boxes
for the most commonly used SRIDs
(Spatial Reference IDentifier)
in PostGIS to join with the
public.spatial_ref_sys
table. My hope was to be able to use the data to quickly identify local
SRIDs for geometries spreading across the U.S. This data was needed to support
another idea where I want both accurate spatial calculations and the best possible
performance when working with large OpenStreetMap data sets.
The good news is now I have the exact data I was looking for. The unexpected bonus is that there is a much broader use case for this data in providing an easy way to find which SRIDs might be appropriate for a specific area!
This post explores this new data with an example of how to use it with pre-existing spatial data.
Postgres 13 Performance with OpenStreetMap data
With Postgres 13 released recently, a new season of testing has begun! I recently wrote about the impact of BTree index deduplication, finding that improvement is a serious win. This post continues looking at Postgres 13 by examining performance through a few steps of an OpenStreetMap data (PostGIS) workflow.
Reasons to upgrade
Performance appears to be a strong advantage to Postgres 13 over Postgres 12. Marc Linster wrote there's "not one headline-grabbing feature, but rather a wide variety of improvements along with updates to previously released features." I am finding that to be an appropriate description. At this point I intend to upgrade our servers for the improved performance, plus a few other cool benefits.
PostgreSQL 13Beta3: B-Tree index deduplication
PostgreSQL 13 development is coming along nicely, Postgres 13 Beta3 was
released on 8/13/2020.
The Postgres Beta 1 and 2 releases were released in May and June 2020.
One of the features that has my interest in Postgres 13 is the B-Tree deduplication effort. B-Tree indexes are the default indexing method
in Postgres, and are likely the most-used indexes in production
environments.
Any improvements to this part of the database are likely to have wide-reaching benefits.
Removing duplication from indexes keeps their physical size smaller,
reduces I/O overhead, and should help keep SELECT
queries fast!
PostgreSQL: Integers, on-disk order, and performance
This post examines how the on-disk order of integer data can influence performance in PostgreSQL. When working with relational databases, you often do not need to think about data storage, though there are times when these details can have a noticeable impact on your database's performance.
This post uses PostgreSQL 12.3 on Ubuntu 18.04 on a DigitalOcean droplet with 4 CPU and 8 GB RAM droplet, aka "Rig B" from Scaling osm2pgsql: Process and costs.