RustProof Labs: blogging for education (logo)

PostGIS Trajectory: Space plus Time

By Ryan Lambert -- Published November 29, 2020

A few months ago I started experimenting with a few project ideas involving data over space and time. Naturally, I want to use Postgres and PostGIS as the main workhorse for these projects, the challenge was working out exactly how to pull it all together. After a couple false starts I had to put those projects aside for other priorities. In my free time I have continued working through some related reading on the topic. I found why you should be using PostGIS trajectories by Anita Graser and recommend reading that before continuing with this post. In fact, read Evaluating Spatio-temporal Data Models for Trajectories in PostGIS Databases while you're at it! There is great information in those resources with more links to other resources.

This post outlines examples of how to use these new PostGIS trajectory tricks with OpenStreetMap data I already have available (load and prepare ). Often, trajectory examples assume using data collected from our new age of IoT sensors sending GPS points and timestamps. This example approaches trajectories from a data modeling perpective instead, showing how to synthesize trajectory data using pgrouting. Visualization of data is a critical component of sharing information, QGIS has long been my favorite GIS application to use with PostGIS data.

This post is part of the series PostgreSQL: From Idea to Database.

Route data

Routes for this example are created with manually selected start (s_id) and end (e_id) nodes. These id values line up to pgrouting node data in the table osm_routing.roads_noded_vertices_pgr. To set the stage for temporal data a start time (s_time) is also included. This example gives each route a start time of 5:00 AM, but makes it easy to adjust start times to explore the impact.

route_id|s_id |e_id  |s_time             |
--------|-----|------|-------------------|
       1|24902| 21887|2020-11-01 05:00:00|
       2|43615| 82508|2020-11-01 05:00:00|
       3|83320| 25649|2020-11-01 05:00:00|
       4|27478| 46247|2020-11-01 05:00:00|
       5| 3487|113816|2020-11-01 05:00:00|
       6|20587|113816|2020-11-01 05:00:00|
       7|30505|113816|2020-11-01 05:00:00|

The start and end nodes (r.s_id and r.e_id below) are fed into pgrouting's pgr_dijkstra() routing function. The osm_routing.roads table contains OpenStreetMap data for Denver, CO prepared by PgOSM. The pgosm.layer_detail and pgosm.routable tables are used to easily filter down to roads usable by motorized traffic (cars). The osm_routing.roads_noded (edges) table was prepared with pg_routing and is based on the base osm_routing.roads table.

The route details are created and saved into a new table (osm_routing.route_details) using the following query.

CREATE TABLE osm_routing.route_details AS
WITH rte AS (
SELECT * 
    FROM osm_routing.routes
)
SELECT r.route_id, d.*, n.the_geom AS node_geom, e.the_geom AS edge_geom
    FROM rte r
    INNER JOIN pgr_dijkstra(
        'SELECT rn.id, rn.source, rn.target, rn.cost_length AS cost,
                rn.the_geom
            FROM osm_routing.roads_noded rn
            INNER JOIN osm_routing.roads r ON rn.old_id = r.id
            INNER JOIN pgosm.layer_detail ld ON ld.code = r.code
            INNER JOIN pgosm.routable rbl
                ON rbl.layer_detail_id = ld.layer_detail_id
                    AND rbl.route_motor
            ',
        r.s_id, r.e_id, directed := False
        ) d
        ON True
    INNER JOIN osm_routing.roads_noded_vertices_pgr n ON d.node = n.id
    INNER JOIN osm_routing.roads_noded e ON d.edge = e.id
;

The pgr_dijkstra() function returns one row for each step of a given route. The route_id value links back to osm_routing.routes. The cost used a simple length based cost with meters as the unit. The node and edge values point back to the id values in their respective tables used by pgrouting.

route_id|seq|path_seq|node |edge |cost              |agg_cost          |
--------|---|--------|-----|-----|------------------|------------------|
       1|  1|       1|24902|27505| 20.42259891997424|               0.0|
       1|  2|       2|27462|27506| 70.51812482510054| 20.42259891997425|
       1|  3|       3|27463|27507|21.724521060420614| 90.94072374507479|
       1|  4|       4|27464|27508|23.621809729839022| 112.6652448054954|
       1|  5|       5|27465|27509| 9.095234265004999|136.28705453533442|

The nodes (blue circles) and edges (red lines) for each of the routes are shown below. We can easily confirm with code (ST_Intersects()) and visually on the map that these routes intersect using simple spatial checks. Would these routes intersect if time was also considered?

Screenshot from DBeaver showing generated route details.  Blue circles indicate each route node and the red lines illustrate the edges connecting the nodes.

The animated version of these routes is at the bottom of this post under Animate in QGIS.

Path to Trajectory

A simple approach to adding time to geometry is by adding two TIMESTAMPTZ columns for start and end times. Unfortunately this simple approach quickly falls apart as later queries will illustrate. Luckily, the PostGIS extension comes to the rescue (again) with built in trajectory support. What is a trajectory? It starts with at least two points that include a Measure (M) representing time. These points can be created with ST_MakePointM(). Points including M can be made into a line with ST_MakeLine(), and now you have a trajectory.

The key detail added to make a trajectory is M, or the Measure. Many trajectory use cases use time as the measure. Unfortunately, the TIMESTAMPTZ format cannot be used but using the epoch is an easy enough workaround in Postgres.

Assign Speeds

To get to a M value for each step of the route we need the speed traveled for each edge. Add a column for edge_mph.

ALTER TABLE osm_routing.route_details ADD edge_mph NUMERIC;

Quite a series of joins to get back to the speed limit. In the future this step will most likely be done far earlier on in the data processing pipeline.

UPDATE osm_routing.route_details rd_update
    SET edge_mph = rbl.max_speed
    FROM osm_routing.route_details rd
    INNER JOIN osm_routing.routes rte ON rd.route_id = rte.route_id
    INNER JOIN osm_routing.roads_noded rn ON rd.edge = rn.id
    INNER JOIN osm_routing.roads r ON rn.old_id = r.id
    INNER JOIN pgosm.layer_detail ld ON ld.code = r.code
    INNER JOIN pgosm.routable rbl ON rbl.layer_detail_id = ld.layer_detail_id
    WHERE rd_update.route_id = rd.route_id
        AND rd_update.seq = rd.seq
        AND rd_update.path_seq = rd.path_seq
;

Calculate time for each edge

Add a column to store the time for each node.

ALTER TABLE osm_routing.route_details ADD node_ts TIMESTAMPTZ;

To get the value for each node_ts we need to calculate cost in seconds based on an edge's speed. Our distance is in meters, our speed in MPH, and we need seconds. Perfect use case for a function!

The following function accepts two parameters, length_m and a miles_per_hour, and returns the time in seconds required to travel the edge.

CREATE OR REPLACE FUNCTION osm_routing.mph_length_m_to_cost_seconds(
    length_m NUMERIC, miles_per_hour NUMERIC)
 RETURNS NUMERIC
 LANGUAGE sql
 SECURITY DEFINER
 SET search_path TO 'pg_temp'
AS $function$
/*
 * miles per hour * 0.44704  = meters per second
 */
    SELECT ($1 / ($2 * .44704))

$function$
;

This makes it easy to calculate the time to traverse an edge based on its length and speed. How many seconds does it take to travel 1,000 meters at 60 miles per hour?

SELECT osm_routing.mph_length_m_to_cost_seconds(1000, 60);

mph_length_m_to_cost_seconds|
----------------------------|
         37.2822715342400382|

What about 1000 meters at 20 mph?

SELECT osm_routing.mph_length_m_to_cost_seconds(1000, 20);

mph_length_m_to_cost_seconds|
----------------------------|
        111.8468146027201145|

The edge_times CTE below uses the mph_length_m_to_cost_seconds() function to calculate each segment's time to traverse. The time_to_node CTE uses uses SUM() as a window function to provide a running total of cost in time to each node. The final UPDATE query sets the node_ts to the route's start time (from osm_routing.routes) plus the appropriate number of seconds from the previous window function.

WITH edge_times AS (
SELECT r.route_id, rd.seq, rd.path_seq, 
        r.s_time, rd.cost, rd.agg_cost, rd.edge_mph,
        CASE WHEN (rd.edge_mph > 0 AND agg_cost > 0)
            THEN osm_routing.mph_length_m_to_cost_seconds(rd.cost::NUMERIC, rd.edge_mph)
            ELSE NULL
            END AS edge_time_s
    FROM osm_routing.routes r
    INNER JOIN osm_routing.route_details rd ON True
    WHERE r.route_id = rd.route_id
), time_to_node AS (
SELECT SUM(et.edge_time_s)
            OVER (PARTITION BY et.route_id ORDER BY et.seq, et.path_seq)
            AS time_to_node_s,
        *
    FROM edge_times et
)
UPDATE osm_routing.route_details rd
    SET node_ts = ttn.s_time + (ttn.time_to_node_s || ' seconds')::INTERVAL
    FROM time_to_node ttn
    WHERE rd.route_id = ttn.route_id
        AND rd.seq = ttn.seq
        AND rd.path_seq = ttn.path_seq
        AND rd.node_ts IS NULL
;

The above UPDATE query does not set the route's initial node. The following UPDATE fixes the first record in each route equal to the s_time.

UPDATE osm_routing.route_details rd
    SET node_ts = r.s_time
    FROM osm_routing.routes r
    WHERE rd.route_id = r.route_id
        AND rd.node_ts IS NULL
        AND rd.seq = 1
;

The osm_routing.route_details table now includes a speed limit and timestamp for each segment for the included routes.

route_id|seq|path_seq|edge_mph|node_ts            |
--------|---|--------|--------|-------------------|
       1|  1|       1|   45.00|2020-11-01 05:00:00|
       1|  2|       2|   45.00|2020-11-01 05:00:03|
       1|  3|       3|   45.00|2020-11-01 05:00:04|
       1|  4|       4|   45.00|2020-11-01 05:00:05|
       1|  5|       5|   45.00|2020-11-01 05:00:06|

Finally, Trajectories!

Now to turn into trajectories I am using a materialized view created with a CTE. The first portion (nodes) pulls out the needed details for each node. The second step (traj_nodes) creates the points with the epoch of the timestamp for the M value. The third step (build_line) uses the LEAD() window function to string route nodes together into trajectory lines. The final portion calculates few extra handy columns, such as time_range for easy temporal only filtering.

CREATE MATERIALIZED VIEW osm_routing.route_trajectory_segment AS
WITH nodes AS (
SELECT route_id, seq, path_seq, node, edge, cost, agg_cost,
        edge_mph, node_ts,
        ST_X(ST_Transform(node_geom, 4326)) AS long,
        ST_Y(ST_Transform(node_geom, 4326)) AS lat
    FROM osm_routing.route_details
), traj_nodes AS (
SELECT route_id, seq, path_seq, node_ts,
        ST_SetSRID(ST_MakePointM(long, lat, EXTRACT(epoch FROM node_ts)), 4326) AS traj_point
    FROM nodes
), build_line AS (
SELECT traj_nodes.route_id, traj_nodes.seq, traj_nodes.path_seq,
        node_ts AS time_start,
        LEAD(node_ts) OVER (PARTITION BY route_id ORDER BY seq, path_seq)
            AS time_end,
        ST_Transform(
            ST_MakeLine(traj_nodes.traj_point,
                LEAD(traj_nodes.traj_point)
                    OVER (PARTITION BY traj_nodes.route_id
                            ORDER BY traj_nodes.seq, traj_nodes.path_seq)
                        )
            , 2773)
                AS traj
    FROM traj_nodes
)
SELECT *,
        TSTZRANGE(time_start, time_end) AS time_range,
        ST_Length(traj) AS route_length,
        time_end - time_start AS duration
    FROM build_line
;

Create indexes in the trajectory and time range columns for speedy querying.

CREATE INDEX gix_osm_routing_route_trajectory_segment
    ON osm_routing.route_trajectory_segment
    USING GIST (traj);
CREATE INDEX ix_osm_routing_route_trajectory_segment_trange
    ON osm_routing.route_trajectory_segment
    USING GIST (time_range);

While the materialized view created above includes dedicated timestamp and time range calculations, these values could be pulled back out of the trajectory data itself.

SELECT route_id, seq,
        TO_TIMESTAMP(ST_M(ST_StartPoint(traj))) AS t_start_traj, 
        TO_TIMESTAMP(ST_M(ST_StartPoint(traj))) AS t_end_traj
    FROM osm_routing.route_trajectory_segment
    ORDER BY route_id, seq, path_seq
    LIMIT 5
;


route_id|seq|t_start_traj       |t_end_traj         |
--------|---|-------------------|-------------------|
       1|  1|2020-11-01 05:00:00|2020-11-01 05:00:00|
       1|  2|2020-11-01 05:00:03|2020-11-01 05:00:03|
       1|  3|2020-11-01 05:00:04|2020-11-01 05:00:04|
       1|  4|2020-11-01 05:00:05|2020-11-01 05:00:05|
       1|  5|2020-11-01 05:00:06|2020-11-01 05:00:06|

The animation using QGIS' temporal controller currently requires the dedicated start/end times as actual columns. Future QGIS versions may remove this requirement. For that reason, the timestamp columns have been kept in the materialized view.

Check validity

It is a good habit to check your trajectories for invalid data using ST_IsValidTrajectory(). It took me more than a couple tries to get the final trajectory data to build properly. Most of my issues were due to my own errors with sorting and other logic.

Intersections in time and space

This query is an example of how trajectories may be faked using time range data. As mentioned earlier, this method does not work as desired for more than the most basic examples. The join uses a.time_range && b.time_range AND ST_DWithin(a.traj, b.traj, 2) to filter for rows where the time range overlap and geometries are withing 2 meters of each other. Using this approach returns 45 rows with most of them are false positives.

SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
        a.seq AS seq_a, b.seq AS seq_b,
        a.time_range, b.time_range,
        a.traj, b.traj AS traj_line_intersect
    FROM osm_routing.route_trajectory_segment a
    INNER JOIN osm_routing.route_trajectory_segment b
        ON a.time_range && b.time_range 
            AND ST_DWithin(a.traj, b.traj, 2)
            AND a.route_id < b.route_id
;

Taking a closer look at a single record we can see the tail of one route is at the same point as the beginning of another route (POINT (-104.99831660000001 39.74985710023273)). These nodes were traversed 6 seconds apart by the different routes (05:00:05 vs 05:00:11). Did the objects traveling on the routes really get within 2 meters of each other?

SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
        a.seq AS seq_a, b.seq AS seq_b,
        ST_Transform(ST_StartPoint(a.traj), 4326) AS start_a,
        ST_Transform(ST_EndPoint(b.traj), 4326) AS end_b,
        TO_TIMESTAMP(ST_M(ST_StartPoint(a.traj))) AS start_time_a,
        TO_TIMESTAMP(ST_M(ST_EndPoint(b.traj))) AS end_time_b
    FROM osm_routing.route_trajectory_segment a
    INNER JOIN osm_routing.route_trajectory_segment b
        ON a.time_range && b.time_range 
            AND ST_DWithin(a.traj, b.traj, 2)
            AND a.route_id < b.route_id
            AND (a.route_id = 3 AND b.route_id = 4 AND a.seq = 3 AND b.seq = 6)
;

Name        |Value                                        |
------------|---------------------------------------------|
route_id_a  |3                                            |
route_id_b  |4                                            |
seq_a       |3                                            |
seq_b       |6                                            |
start_a     |POINT (-104.99831660000001 39.74985710023273)|
end_b       |POINT (-104.99831660000001 39.74985710023273)|
start_time_a|2020-11-01 05:00:05                          |
end_time_b  |2020-11-01 05:00:11                          |

To check the closest distance between two traveling objects represented by the trajectories we can use ST_DistanceCPA. The following query is updated with ST_DistanceCPA(a.traj, b.traj) AS route_cpa_meters. This shows the two routes' closest points were actually 3.9 meters apart, further than the 2 meter threshold! So while the times overlap, and the geometries are within 2 meters, the traveling objects would not have been within 2 meters as they travel along their respective routes.

SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
        a.seq AS seq_a, b.seq AS seq_b,
        ST_DistanceCPA(a.traj, b.traj) AS route_cpa_meters
    FROM osm_routing.route_trajectory_segment a
    INNER JOIN osm_routing.route_trajectory_segment b
        ON a.time_range && b.time_range 
            AND ST_DWithin(a.traj, b.traj, 2)
            AND a.route_id < b.route_id
            AND a.route_id = 3 AND b.route_id = 4
            AND a.seq = 3 AND b.seq = 6
;

route_id_a|route_id_b|seq_a|seq_b|route_cpa_meters |
----------|----------|-----|-----|-----------------|
         3|         4|    3|    6|3.938249529587584|

Trajectory specific functions

To be more accurate with spatio-temporal filtering, use the trajectory specific function ST_CPAWithin. This replaces the two-part join on time range and geometry in the previous queries with ST_CPAWithin(a.traj, b.traj, 2). This returns only 3 records instead of 45. This also includes a check column to show that all three trajectory segments had time within 2 meters or less.

SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
        a.seq AS seq_a, b.seq AS seq_b,
        ST_DistanceCPA(a.traj, b.traj) AS traj_closest_point_meters,
        a.traj, b.traj AS traj_line_intersect
    FROM osm_routing.route_trajectory_segment a
    INNER JOIN osm_routing.route_trajectory_segment b
        ON ST_CPAWithin(a.traj, b.traj, 2)
            AND a.route_id < b.route_id
;

route_id_a|route_id_b|seq_a|seq_b|traj_closest_point_meters|
----------|----------|-----|-----|-------------------------|
         3|         4|   10|   13|         1.87513147035179|
         3|         4|   10|   14|       1.8751314051078767|
         3|         4|   11|   14|       1.8751314051078767|

Animate in QGIS

Seeing data in action is always helpful. QGIS now includes a feature called Temporal Controller, available in QGIS 3.14. and newer. I am using the feature currently in QGIS v3.16 with good results. Nyall Dawson has a great tutorial on how to use this feature. The trajectory routes created above are illustrated in the following animated image created using QGIS' Temporal Controller.

Animiated GIF image showing the PostGIS routes animated within QGIS using the Temporal Controller feature.

Note: Above map uses Stamen Toner basemap, see https://wiki.openstreetmap.org/wiki/Tile_servers.

Saving the animation from QGIS seems to only have one option currently, to save a series of PNG files into a directory. To stitch them into an animated GIF I used the convert tool. I wasn't able to get the Cumulative range feature to export with the images, and getting the right resolution and extents set is a bit finicky still. That said, it's a new feature and I'm sure those nuances will be improved on in the future.

For now, to convert the exported PNG files into an animated GIF.

cd /path/to/qgis/dir
convert -delay 10 -loop 0 *.png postgis-trajectory-qgis-animation.gif

Summary

This post showed how to get started with PostGIS trajectory data using OpenStreetMap data and pgrouting. There is plenty more exploring to be done with this now that the basics for spatio-temporal modeling are in place.

Need help with your PostgreSQL servers or databases? Contact us to start the conversation!

By Ryan Lambert
Published November 29, 2020
Last Updated November 30, 2020