RustProof Labs: blogging for education (logo)
My book Mastering PostGIS and OpenStreetMap is available!

Routing with Lines through Polygons

By Ryan Lambert -- Published October 23, 2022

One of my favorite layers to route with pgRouting is the water layer. I am interested in where water comes from, where it goes, where runoff happens, and how urban development interacts with this powerful force of nature. The OpenStreetMap water layer, however, presents a challenge when routing with PostGIS and pgRouting: Polygons.

Why are polygons a challenge? A routing network using pgRouting is built from lines (edges). Now, to state the obvious: polygons are not lines.

Real world waterway networks are made up of both lines and polygons. Rivers, streams, and drainage routes are predominately (but not exclusively!) mapped using lines. These lines feed into and out of ponds, lakes, and reservoirs. The following animation shows how much impact the water polygons can have on a waterway network... some very important paths simply disappear when they are excluded.

Animated image showing an area northwest of Denver, Colorado with a few large reservoirs (polygons) connected by waterways (lines).  The animation shows the impact of taking these large polygons out of the routing equation, many important route options disappear.

To make the full water network route-able we need to create a combined line layer. The combined line layer will include:

This post is a big one thanks to the query examples and screenshots involved.

This post was inspired by a two-part post written by Bruce Rindahl. The posts, titled "Flow path through a lake along the medial axis," are currently unavailable due to a blog upgrade issue. The original source URL will hopefully be restored in the future: http://berrie.ddns.net/?p=28.

Extensions and Data

This post uses the ST_ApproximateMedialAxis() function which requires the postgis_sfcgal extension. Create the extension, along with postgis and pgrouting so they are available as they are needed.

CREATE EXTENSION postgis;
CREATE EXTENSION pgrouting;
CREATE EXTENSION postgis_sfcgal;

The data source for this post is OpenStreetMap data for Colorado loaded using PgOSM Flex's Docker procedures. Loading Colorado with the default layerset typically takes less than 10 minutes to get the .sql, which I load into my Postgres instance of choice. Be sure to use PgOSM Flex 0.6.0 or newer. Older versions of PgOSM Flex did not include water relations such as Standley Lake.

My book Mastering PostGIS and OpenStreetMap provides in-depth explanations covering how to load OpenStreetMap data (chapter 8) and prepare it for routing (chapters 14 - 16 and 18). The free preview section shows the book's full table of contents.

Example lake polygon and medial axis

The largest body of water that disappears in the above animation is Standley Lake, also know as relation 3023193. The PostGIS function ST_ApproximateMedialAxis() can be used to generate a reasonable path of travel for many polygons. Another option is ST_StraightSkeleton(); that function is not discussed in this post. The following query and image shows how to calculate the medial axis for Standley Lake.

SELECT osm_id, geom, ST_ApproximateMedialAxis(geom)
    FROM osm.water_polygon 
    WHERE osm_id = -3023193
;

Screenshot showing the selected polygon for relation 3023193 and its calculated medial axis lines.

Define small test area

Routing has steps that can take a long time on large data sets. For the purpose of this post I limit the full Colorado data down to a roughly 5 kilometer area of focus are Standley Lake. I create a schema named routing_demo to hold our data and a table (routing_demo.focus) to store the geometry of focus. The focus area is used in multiple queries throughout this post to limit the amount of data being used.

CREATE SCHEMA routing_demo;
COMMENT ON SCHEMA routing_demo IS 'Contains examples for routing through polygons using ST_ApproximateMedialAxis()';

CREATE TABLE routing_demo.focus AS
SELECT osm_id, geom, ST_Buffer(geom, 5000) AS geom_buffer
    FROM osm.water_polygon 
    WHERE osm_id = -3023193
;
CREATE INDEX ON routing_demo.focus
    USING GIST (geom);
COMMENT ON TABLE routing_demo.focus IS 'Focus area for routing';

The newly created routing_demo.focus table contains the data needed to establish the focus area. The lake polygon and calculated buffer are shown in the following screenshot.

SELECT * FROM routing_demo.focus;

Screenshot from DBeaver showing the data in the routing_demo.focus table containing the polygon for Standley Lake and a roughly 5 kilometer buffer around the lake.  The data is displayed over the standard OpenStreetMap basemap.

The next step I take is to check the osm.water_polygon data joined to the routing_demo.focus table. The following query shows that there are two types of water polygons I will want to exclude from the following steps: dam and boatyard. While dams have a significant impact to the flow of water, they are not part of the routing analysis I want to achieve.

SELECT p.osm_type, p.osm_subtype, COUNT(*)
    FROM osm.water_polygon p
    INNER JOIN routing_demo.focus f ON p.geom && f.geom_buffer
    GROUP BY p.osm_type, p.osm_subtype
    ORDER BY p.osm_subtype
;

osm_type|osm_subtype|count|
--------+-----------+-----+
waterway|boatyard   |    1|
waterway|dam        |    2|
natural |water      |  172|
natural |wetland    |   38|

Create medial axis for polygons

With the focus area defined, the next step is to calculate and persist the medial axis lines for these polygons. The following query creates the routing_demo.polygon_medial_axis table with the individual lines that make up the medial axis for each polygon. The ST_ApproximateMedialAxis() function returns MULTILINESTRINGs, so ST_Dump() is also used in order to extract the individual line components for each polygon.

CREATE TABLE routing_demo.polygon_medial_axis AS
SELECT p.osm_id,
        (ST_Dump(ST_ApproximateMedialAxis(p.geom))).path[1] AS ma_id,
        (ST_Dump(ST_ApproximateMedialAxis(p.geom))).geom AS geom
    FROM osm.water_polygon p
    INNER JOIN routing_demo.focus f ON p.geom && f.geom_buffer
    WHERE p.osm_subtype NOT IN ('dam', 'boatyard')
;
CREATE INDEX ON routing_demo.polygon_medial_axis
    USING GIST (geom);
COMMENT ON TABLE routing_demo.polygon_medial_axis IS 'Contains calculated medial axis lines created from the osm.water_polygon table.';

The input of 210 polygons created 4,415 medial axis lines, an average of 21 lines per polygon.

SELECT COUNT(*) AS rows, COUNT(DISTINCT osm_id) AS input_polygons,
        COUNT(*) / COUNT(DISTINCT osm_id) AS avg_medial_axis_per_input
    FROM routing_demo.polygon_medial_axis
;

rows|input_polygons|avg_medial_axis_per_input|
----+--------------+-------------------------+
4415|           210|                       21|

The following query and screenshot provide a look at the medial axis data saved to the routing_demo.polygon_medial_axis table.

SELECT * FROM routing_demo.polygon_medial_axis;

Screenshot from DBeaver showing the data in the routing_demo.polygon_medial_axis table. The data is displayed over the standard OpenStreetMap basemap.

Identify water lines intersecting water polygons

The above steps have provided basic line routes through our water polygons. The medial axis lines, however, do not match up to where waterways enter and exit the polygons. To solve that challenge, we need to identify where the water lines from osm.water_line intersect with the water polygon layer. To accomplish this I'm reusing a trick from my post on finding missing crossing tags in OpenStreetMap data.

In the following query, the intersections CTE finds lines that intersect with polygons. This first query also joins to the focus table to limit the scope to our region of interest. The node_ids query uses ST_PointN() and generate_series() to extract node ids from lines that intersect the polygon at multiple points. The combined CTE query combines the intersecting lines with one intersecting node (from intersections) and those with multiple intersecting nodes (from node_ids). The screenshot after the query shows the intersecting points around Standley Lake as blue dots.

CREATE TABLE routing_demo.line_polygon_intersection AS
WITH intersections AS (
SELECT p.osm_id AS water_polygon_osm_id,
        l.osm_id,
        l.osm_type, l.osm_subtype,
        ST_Intersection(p.geom, l.geom) AS geom
    FROM osm.water_polygon p
    INNER JOIN routing_demo.focus f ON p.geom && f.geom_buffer
    INNER JOIN osm.water_line l
        ON ST_Intersects(p.geom, l.geom)
            AND l.osm_subtype NOT IN ('dam', 'boatyard')
), node_ids AS (
SELECT osm_id, generate_series(1, ST_Npoints(geom)) AS node_id
    FROM intersections
    WHERE ST_NPoints(geom) > 1
), combined AS (
SELECT DISTINCT i.water_polygon_osm_id, i.osm_id, i.osm_type, i.osm_subtype,
        CASE WHEN n.osm_id IS NULL THEN i.geom
            ELSE ST_PointN(i.geom, n.node_id)
        END AS geom 
    FROM intersections i
    LEFT JOIN node_ids n ON i.osm_id = n.osm_id
)
SELECT *
    FROM combined 
    ;

Screenshot from DBeaver showing the intersections between waterways (lines) and bodies of water (polygons) around Standley Lake as blue dots. The image shows many intersections are right on the boundary of the polygon, others have intersections throughout where the waterway line is extended into/through the polygon.

In case you were curious, the points that are contained within polygons are where the waterway line has been extended into (or through) the polygon itself. The wetland area to the west of Standley Lake is an example of that. The query above could be easily modified to exclude these points if the user desired that change. I have decided to leave them in, they won't hurt anything beyond taking up a bit more space.

Status check

To see the progress made so far I pull the data from the routing_demo.polygon_medial_axis and routing_demo.line_polygon_intersection tables into QGIS. The following screenshot shows the blue point where a water line intersects with the Standley Lake polygon. The medial axis has a branch that comes close to, but does not touch, the intersection at the boundary of the polygon. To create complete routes, we need to connect the pieces.

Screenshot from QGIS showing that the points intersecting polygons and the medial axis lines are close to each other, but not touching.

Connect the pieces

The following query uses ST_ClosesPoint() and ST_MakeLine() to bridge the point where a waterway intersects a polygon and the medial axis line(s) inside the polygon. The subquery in the INNER JOIN uses ST_Collect() to collapse the individual lines of the medial axis into a single geometry per polygon. Skipping this step would result in lines calculated to the closest point of each line.

CREATE TABLE routing_demo.water_line_to_polygon AS
SELECT i.osm_id, i.water_polygon_osm_id,
        ST_MakeLine(i.geom, ST_ClosestPoint(ma.geom, i.geom)) AS geom
    FROM routing_demo.line_polygon_intersection i
    INNER JOIN (
        SELECT osm_id, ST_Collect(geom) AS geom
             FROM routing_demo.polygon_medial_axis
             GROUP BY osm_id
    ) ma
    ON i.water_polygon_osm_id = ma.osm_id
;

Screenshot from QGIS showing the line connecting the standard waterway with the calculated medial axis line within Standley Lake.

Bring it together

The steps above created the components we need to stitch together. Revisiting the list of items from the beginning of the post, the following table shows where the data elements are currently located.

Data needed Available in
Initial osm.water_line inputs osm.water_line
Medial axis lines from osm.water_polygon routing_demo.polygon_medial_axis
Lines to connect initial inputs to medial axis routing_demo.water_line_to_polygon

The following query combines these three (3) data sets into a single table of lines. The goal in this step is to make everything ready for the pgRouting steps.

DROP TABLE IF EXISTS routing_demo.waterways_to_route;
CREATE TABLE routing_demo.waterways_to_route AS
WITH ww_selected AS (
SELECT l.osm_id, l.osm_type, l.osm_subtype,
        NULL::BIGINT AS water_polygon_osm_id,
        l.layer,
        -- This will simplify some/most...
        ST_LineMerge(l.geom) AS the_geom
    FROM routing_demo.focus f
    INNER JOIN osm.water_line l
        ON f.geom_buffer && l.geom
            AND l.osm_subtype NOT IN ('dam', 'boatyard')
), ww_extra_cleanup AS (
-- Split the complex lines that made it through ST_LineMerge
SELECT osm_id, osm_type, osm_subtype, water_polygon_osm_id,
        layer,
        (ST_Dump(the_geom)).geom AS the_geom
    FROM ww_selected
    WHERE ST_GeometryType(the_geom) = 'ST_MultiLineString'
), combined AS (
SELECT osm_id, osm_type, osm_subtype, water_polygon_osm_id, layer,
        the_geom
    FROM ww_selected
    WHERE ST_GeometryType(the_geom) != 'ST_MultiLineString'
UNION
SELECT osm_id, osm_type, osm_subtype, water_polygon_osm_id, layer,
        the_geom
    FROM ww_extra_cleanup
    WHERE ST_GeometryType(the_geom) != 'ST_MultiLineString'
UNION
SELECT NULL AS osm_id, 'line-to-polygon' AS osm_type, '' AS osm_subtype,
        water_polygon_osm_id, 0 AS layer,
        geom AS the_geom
    FROM routing_demo.water_line_to_polygon
UNION
SELECT NULL AS osm_id, 'medial-axis' AS osm_type, '' AS osm_subtype,
        osm_id AS water_polygon_osm_id, 0 AS layer,
        geom AS the_geom
    FROM routing_demo.polygon_medial_axis
)
SELECT ROW_NUMBER() OVER () AS id, *
    FROM combined
    WHERE the_geom IS NOT NULL
;

Routing Network

The following steps create the structures and data needed for routing. There is not room in this post to go into these steps in depth.
Chapters 14, 15, 16 and 18 of my book Mastering PostGIS and OpenStreetMap are all about pgRouting and explain this process in detail.

SELECT pgr_nodeNetwork('routing_demo.waterways_to_route', .1);

-- Establish route costs
ALTER TABLE routing_demo.waterways_to_route_noded
    ADD cost NUMERIC
    GENERATED ALWAYS AS (
        ST_Length(the_geom)
    )
    STORED
;
ALTER TABLE routing_demo.waterways_to_route_noded
    ADD reverse_cost NUMERIC NOT NULL DEFAULT -1
;
UPDATE routing_demo.waterways_to_route_noded n
    SET reverse_cost = ST_Length(n.the_geom)
    FROM routing_demo.waterways_to_route r
    WHERE n.old_id = r.id
        AND r.osm_type IN ('line-to-polygon', 'medial-axis')
;

-- Topology and Analyze
SELECT pgr_createTopology('routing_demo.waterways_to_route_noded', 0.1);
SELECT pgr_analyzeGraph('routing_demo.waterways_to_route_noded', 0.1);

Create route through a lake

Phew, we made it. It's time to create a route! The following query creates a route from a chosen start and end node that goes through Standley Lake from west to east.

SELECT *
    FROM pgr_dijkstra('
        SELECT rn.id, rn.source,
              rn.target, rn.cost,
              rn.reverse_cost
            FROM routing_demo.waterways_to_route_noded rn
            WHERE source IS NOT NULL
          ',
         235, -- start node
         5930, -- end node
         True -- Enforce one-way restrictions
        ) d
    INNER JOIN routing_demo.waterways_to_route_noded x
        ON d.edge = x.id
;

Screenshot from DBeaver showing the waterway route through Standley Lake.

The start and end nodes that I used in this post (235 and 5930) will almost certainly be different for you. Any changes to the data or exact steps above will result in different IDs for the same point from the data.

Summary

This overstuffed post gave an in-depth guide to using OpenStreetMap data to create a waterway routing network capable of traversing polygons. While ST_ApproximateMedialAxis() was a key function involved, this post shows the use of more than a dozen functions! This technique sets the stage for more complete and complex hydrology analysis using this layer.

Beyond the waterway layer, this approach can be used to improve routing using other layers of data available in OpenStreetMap. Pedestrian routes are another place where polygons need to be included. To close this post, one more query and one more screenshot. Kudos for hanging in there!

SELECT osm_id, osm_type, route_foot, geom
    FROM osm.road_polygon
    WHERE osm_id = 323692586
;

Screenshot from DBeaver showing a pedestrian area as a polygon from the osm.road_polygon table with osm_id = 323692586.  The osm_type column reads "pedestrian" and the "route_foot" value is true.

Need help with your PostgreSQL servers or databases? Contact us to start the conversation!

By Ryan Lambert
Published October 23, 2022
Last Updated November 20, 2022