Better OpenStreetMap places in PostGIS
Data quality is important. This post continues exploring the improved quality of OpenStreetMap data loaded to Postgres/PostGIS via PgOSM-Flex. These improvements are enabled by the new flex output of osm2pgsql, making it easier to understand and consume OpenStreetMap data for analytic purposes.
I started exploring the Flex output a few weeks ago; the post before this one used PgOSM-Flex v0.0.3. This post uses PgOSM-Flex v0.0.7 and highlights a few cool improvements by exploring the OSM place data. Some of the improvements made over the past few weeks were ideas brought over from the legacy PgOSM project. Other improvements, such as the nested admin polygons, were spurred by questions and conversations with the community.
This post focuses on the
osm.place_polygon data that stores things like
city, county and country boundaries, along with neighborhoods and other details.
The format of place data has a number of improvements covered in this post:
- Consolidated name
- Removed duplication between relation/member polygons
- Boundary hierarchy
Improved OpenStreetMap data structure in PostGIS
It was nearly a decade ago when I first loaded OpenStreetMap data to PostGIS.
Over the years my fingers have typed
osm2pgsql --slim --drop ... countless times
and I do not see an end to that trend anytime soon.
One thing that is changing is that getting high quality OpenStreetMap data into
PostGIS is easier than ever!
This improvement in data quality is made possible by the new Flex output available in osm2pgsql 1.4.0.
I wrote about my initial impressions of the Flex output a few weeks ago.
This post looks at how I am starting to use osm2pgsql's Flex output to provide a
standardized and sanitized OpenStreetMap data set in Postgres/PostGIS.
No longer is osm2pgsql limited to loading data to the 3-table structure,
so I am eagerly converting to the Flex output and taking advantage of these changes!
It is also easier than ever to create mix-and-match data loads
tailored to the needs of specific projects.
Hands on with osm2pgsql's new Flex output
The osm2pgsql project has seen quite a bit of development over the past couple of years. This is a core piece of software used by a large number of people to load OpenStreetMap data into PostGIS / PostgreSQL databases, so it has been great to see the activity and improvements. Recently, I was contacted by Jochen Topf to see if I would give one of those (big!) improvements, osm2pgsql's new Flex output, a try. While the Flex output is still marked as "experimental", it is already quite robust. In fact, I have already started thinking of the typical pgsql output I have used for nearly a decade as "the old output!"
So what does this new Flex output do for us? It gives us control over the imported data's format, structure and quality. This process uses Lua styles (scripts) to achieve powerful results. The legacy pgsql output from osm2pgsql gave you three (3) main tables with everything organized into points, lines and polygons, solely by geometry type. From a database design perspective this would be like keeping product prices, employee salaries and expense reports all in one table using the justification "they all deal with money." With the flex output we are no longer constrained by this legacy design. With that in mind, the rest of this post explores osm2pgsql's Flex output.
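To make the design difference concrete, here is a small conceptual sketch in Python. It is not an osm2pgsql Lua style and does not use any osm2pgsql API; the `ROUTES` list, the themed table names, and the sample features are all made up for illustration. The only real names are the legacy tables, which use osm2pgsql's default `planet_osm` prefix.

```python
# Hypothetical routing rules: first matching OSM tag key decides the theme.
# These themes and table names are illustrative, not taken from any real style.
ROUTES = [("highway", "road"), ("building", "building"), ("place", "place")]

LEGACY_TABLES = {
    "point": "planet_osm_point",
    "line": "planet_osm_line",
    "polygon": "planet_osm_polygon",
}


def legacy_table(feature):
    """Legacy-style routing: only the geometry type matters."""
    return LEGACY_TABLES[feature["geom_type"]]


def themed_table(feature):
    """Flex-style idea: route by what the feature is, then by geometry type."""
    for tag_key, theme in ROUTES:
        if tag_key in feature["tags"]:
            return f"{theme}_{feature['geom_type']}"
    return None  # no rule matched, so the feature is not loaded


# Made-up sample features
city = {"geom_type": "polygon", "tags": {"place": "city", "name": "Example"}}
house = {"geom_type": "polygon", "tags": {"building": "yes"}}

print(legacy_table(city), legacy_table(house))
print(themed_table(city), themed_table(house))
```

With geometry-only routing both features land in the same polygon table, while tag-based routing separates a city boundary from a building even though both are polygons.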
PostGIS Trajectory: Space plus Time
A few months ago I started experimenting with a few project ideas involving data over space and time. Naturally, I want to use Postgres and PostGIS as the main workhorse for these projects; the challenge was working out exactly how to pull it all together. After a couple of false starts I had to put those projects aside for other priorities. In my free time I have continued working through some related reading on the topic. I found why you should be using PostGIS trajectories by Anita Graser and recommend reading that before continuing with this post. In fact, read Evaluating Spatio-temporal Data Models for Trajectories in PostGIS Databases while you're at it! There is great information in those resources with more links to other resources.
This post outlines examples of how to use these new PostGIS trajectory tricks with
OpenStreetMap data I already have available.
Often, trajectory examples assume using data collected from our
new age of IoT sensors sending GPS points and timestamps. This example approaches
trajectories from a data modeling perspective instead, showing how to synthesize trajectory data using
Visualization of data is a critical component of sharing information,
has long been my favorite GIS application to use with PostGIS data.
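The idea of synthesizing trajectory data from an existing line can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method used in the post: it assumes a constant travel speed, coordinates in a projected (planar) coordinate system, and a hypothetical helper named `synthesize_trajectory`. Each vertex gets a timestamp that plays the role of the M value a PostGIS trajectory stores.

```python
from math import hypot


def synthesize_trajectory(points, start_epoch, speed):
    """Attach a timestamp to each vertex, assuming constant speed.

    points:      list of (x, y) tuples in a projected CRS
    start_epoch: seconds since the Unix epoch at the first vertex
    speed:       distance units per second (same units as the coordinates)
    Returns a list of (x, y, m) tuples where m is the synthesized timestamp.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")

    trajectory = []
    elapsed = 0.0
    previous = None
    for x, y in points:
        if previous is not None:
            # Travel time for this segment = planar length / speed
            elapsed += hypot(x - previous[0], y - previous[1]) / speed
        trajectory.append((x, y, start_epoch + elapsed))
        previous = (x, y)
    return trajectory


# Made-up path: a 5 unit segment followed by a 6 unit segment, at 2 units/second
path = [(0, 0), (3, 4), (3, 10)]
for vertex in synthesize_trajectory(path, start_epoch=1000, speed=2):
    print(vertex)
```

Timestamps produced this way never decrease along the line. Repeated consecutive vertices would produce equal timestamps, so they should be removed first if strictly increasing M values are required.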
Find your local SRID in PostGIS
The past few weeks I had been tossing around some ideas that resulted in me
looking for a particular data set. I needed to get the
for the most commonly used SRIDs
(Spatial Reference IDentifier)
in PostGIS to join with the
public.spatial_ref_sys table. My hope was to be able to use the data to quickly identify local
SRIDs for geometries spreading across the U.S. This data was needed to support
another idea where I want both accurate spatial calculations and the best possible
performance when working with large OpenStreetMap data sets.
The good news is now I have the exact data I was looking for. The unexpected bonus is that there is a much broader use case for this data in providing an easy way to find which SRIDs might be appropriate for a specific area!
This post explores this new data with an example of how to use it with pre-existing spatial data.
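For readers who want a feel for what "finding a local SRID" means before looking at the data, here is a minimal Python sketch of a different, simpler heuristic: choosing the WGS 84 / UTM zone for a longitude/latitude point. This is not the data set described above and is only one of many ways to pick a local projection. The function name `utm_srid` is hypothetical, and the sketch ignores the special zone exceptions around Norway and Svalbard.

```python
from math import floor


def utm_srid(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing a point.

    Northern hemisphere zones are EPSG:32601 to 32660,
    southern hemisphere zones are EPSG:32701 to 32760.
    """
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if not -80 <= lat <= 84:
        raise ValueError("UTM is defined between 80 S and 84 N")

    # Zones are 6 degrees wide, numbered 1 to 60 starting at 180 W
    zone = min(floor((lon + 180) / 6) + 1, 60)
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Approximate coordinates for Denver, Colorado
print(utm_srid(-104.99, 39.74))
```

A UTM zone is a reasonable default for small areas, but a regional system such as a State Plane zone may be more appropriate for a specific area, which is the kind of lookup a table of SRIDs joined to `public.spatial_ref_sys` can support.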