PostGIS Trajectory: Space plus Time
A few months ago I started experimenting with a few project ideas involving data over space and time. Naturally, I want to use Postgres and PostGIS as the main workhorse for these projects, the challenge was working out exactly how to pull it all together. After a couple false starts I had to put those projects aside for other priorities. In my free time I have continued working through some related reading on the topic. I found why you should be using PostGIS trajectories by Anita Graser and recommend reading that before continuing with this post. In fact, read Evaluating Spatio-temporal Data Models for Trajectories in PostGIS Databases while you're at it! There is great information in those resources with more links to other resources.
This post outlines examples of how to use these new PostGIS trajectory tricks with
OpenStreetMap data I already have available
(load
and
prepare
).
Often, trajectory examples assume using data collected from our
new age of IoT sensors sending GPS points and timestamps. This example approaches
trajectories from a data modeling perpective instead, showing how to synthesize trajectory data using pgrouting
.
Visualization of data is a critical component of sharing information,
QGIS
has long been my favorite GIS application to use with PostGIS data.
This post is part of the series PostgreSQL: From Idea to Database.
Route data
Routes for this example are created with manually selected start (s_id
)
and end (e_id
) nodes. These id values line up to pgrouting node data
in the table osm_routing.roads_noded_vertices_pgr
.
To set the stage for temporal data a start time (s_time
) is also included. This example
gives each route a start time of 5:00 AM, but makes it easy to adjust start times to explore
the impact.
route_id|s_id |e_id |s_time |
--------|-----|------|-------------------|
1|24902| 21887|2020-11-01 05:00:00|
2|43615| 82508|2020-11-01 05:00:00|
3|83320| 25649|2020-11-01 05:00:00|
4|27478| 46247|2020-11-01 05:00:00|
5| 3487|113816|2020-11-01 05:00:00|
6|20587|113816|2020-11-01 05:00:00|
7|30505|113816|2020-11-01 05:00:00|
The start and end nodes (r.s_id
and r.e_id
below) are fed into pgrouting's pgr_dijkstra()
routing function.
The osm_routing.roads
table contains OpenStreetMap data for Denver, CO prepared by PgOSM.
The pgosm.layer_detail
and pgosm.routable
tables are used to
easily filter down to roads usable by motorized traffic (cars).
The osm_routing.roads_noded
(edges) table was prepared with pg_routing
and is based on the base osm_routing.roads
table.
The route details are created and saved into a new table
(osm_routing.route_details
) using the following query.
CREATE TABLE osm_routing.route_details AS
WITH rte AS (
SELECT *
FROM osm_routing.routes
)
SELECT r.route_id, d.*, n.the_geom AS node_geom, e.the_geom AS edge_geom
FROM rte r
INNER JOIN pgr_dijkstra(
'SELECT rn.id, rn.source, rn.target, rn.cost_length AS cost,
rn.the_geom
FROM osm_routing.roads_noded rn
INNER JOIN osm_routing.roads r ON rn.old_id = r.id
INNER JOIN pgosm.layer_detail ld ON ld.code = r.code
INNER JOIN pgosm.routable rbl
ON rbl.layer_detail_id = ld.layer_detail_id
AND rbl.route_motor
',
r.s_id, r.e_id, directed := False
) d
ON True
INNER JOIN osm_routing.roads_noded_vertices_pgr n ON d.node = n.id
INNER JOIN osm_routing.roads_noded e ON d.edge = e.id
;
The pgr_dijkstra()
function returns one row for each step of a given route.
The route_id
value links back to osm_routing.routes
.
The cost
used a simple length based cost with meters as the unit.
The node
and edge
values point back to the id values in their respective tables
used by pgrouting.
route_id|seq|path_seq|node |edge |cost |agg_cost |
--------|---|--------|-----|-----|------------------|------------------|
1| 1| 1|24902|27505| 20.42259891997424| 0.0|
1| 2| 2|27462|27506| 70.51812482510054| 20.42259891997425|
1| 3| 3|27463|27507|21.724521060420614| 90.94072374507479|
1| 4| 4|27464|27508|23.621809729839022| 112.6652448054954|
1| 5| 5|27465|27509| 9.095234265004999|136.28705453533442|
The nodes (blue circles) and edges (red lines) for each of the routes are shown
below. We can easily confirm with code (ST_Intersects()
) and visually on
the map that these routes intersect using simple spatial checks.
Would these routes intersect if time was also considered?
The animated version of these routes is at the bottom of this post under Animate in QGIS.
Path to Trajectory
A simple approach to adding time to geometry is by adding two TIMESTAMPTZ
columns
for start and end times. Unfortunately this simple approach quickly falls apart
as later queries will illustrate.
Luckily, the PostGIS extension comes to the rescue (again) with built in trajectory
support. What is a trajectory?
It starts with at least two points that include a Measure (M
)
representing time. These points can be created with
ST_MakePointM()
.
Points including M
can be made into a line with ST_MakeLine()
, and
now you have a trajectory.
The key detail added to make a trajectory is M
, or the Measure.
Many trajectory use cases use time as the measure. Unfortunately,
the TIMESTAMPTZ
format cannot be used but using
the epoch is an easy enough workaround in Postgres.
Assign Speeds
To get to a M
value for each step of the route we need the speed traveled
for each edge. Add a column for edge_mph
.
ALTER TABLE osm_routing.route_details ADD edge_mph NUMERIC;
Quite a series of joins to get back to the speed limit. In the future this step will most likely be done far earlier on in the data processing pipeline.
UPDATE osm_routing.route_details rd_update
SET edge_mph = rbl.max_speed
FROM osm_routing.route_details rd
INNER JOIN osm_routing.routes rte ON rd.route_id = rte.route_id
INNER JOIN osm_routing.roads_noded rn ON rd.edge = rn.id
INNER JOIN osm_routing.roads r ON rn.old_id = r.id
INNER JOIN pgosm.layer_detail ld ON ld.code = r.code
INNER JOIN pgosm.routable rbl ON rbl.layer_detail_id = ld.layer_detail_id
WHERE rd_update.route_id = rd.route_id
AND rd_update.seq = rd.seq
AND rd_update.path_seq = rd.path_seq
;
Calculate time for each edge
Add a column to store the time for each node.
ALTER TABLE osm_routing.route_details ADD node_ts TIMESTAMPTZ;
To get the value for each node_ts
we need to calculate
cost in seconds based on an edge's speed. Our distance is in meters, our
speed in MPH, and we need seconds. Perfect use case for a function!
The following function accepts two parameters, length_m
and a miles_per_hour
,
and returns the time in seconds required to travel the edge.
CREATE OR REPLACE FUNCTION osm_routing.mph_length_m_to_cost_seconds(
length_m NUMERIC, miles_per_hour NUMERIC)
RETURNS NUMERIC
LANGUAGE sql
SECURITY DEFINER
SET search_path TO 'pg_temp'
AS $function$
/*
* miles per hour * 0.44704 = meters per second
*/
SELECT ($1 / ($2 * .44704))
$function$
;
This makes it easy to calculate the time to traverse an edge based on its length and speed. How many seconds does it take to travel 1,000 meters at 60 miles per hour?
SELECT osm_routing.mph_length_m_to_cost_seconds(1000, 60);
mph_length_m_to_cost_seconds|
----------------------------|
37.2822715342400382|
What about 1000 meters at 20 mph?
SELECT osm_routing.mph_length_m_to_cost_seconds(1000, 20);
mph_length_m_to_cost_seconds|
----------------------------|
111.8468146027201145|
The edge_times
CTE below uses the mph_length_m_to_cost_seconds()
function
to calculate each segment's time to traverse.
The time_to_node
CTE uses uses SUM()
as a window function to provide a running total of
cost in time to each node. The final UPDATE
query sets the node_ts
to the
route's start time (from osm_routing.routes
) plus the appropriate number of seconds
from the previous window function.
WITH edge_times AS (
SELECT r.route_id, rd.seq, rd.path_seq,
r.s_time, rd.cost, rd.agg_cost, rd.edge_mph,
CASE WHEN (rd.edge_mph > 0 AND agg_cost > 0)
THEN osm_routing.mph_length_m_to_cost_seconds(rd.cost::NUMERIC, rd.edge_mph)
ELSE NULL
END AS edge_time_s
FROM osm_routing.routes r
INNER JOIN osm_routing.route_details rd ON True
WHERE r.route_id = rd.route_id
), time_to_node AS (
SELECT SUM(et.edge_time_s)
OVER (PARTITION BY et.route_id ORDER BY et.seq, et.path_seq)
AS time_to_node_s,
*
FROM edge_times et
)
UPDATE osm_routing.route_details rd
SET node_ts = ttn.s_time + (ttn.time_to_node_s || ' seconds')::INTERVAL
FROM time_to_node ttn
WHERE rd.route_id = ttn.route_id
AND rd.seq = ttn.seq
AND rd.path_seq = ttn.path_seq
AND rd.node_ts IS NULL
;
The above UPDATE
query does not set the route's initial node. The following
UPDATE
fixes the first record in each route equal to the s_time
.
UPDATE osm_routing.route_details rd
SET node_ts = r.s_time
FROM osm_routing.routes r
WHERE rd.route_id = r.route_id
AND rd.node_ts IS NULL
AND rd.seq = 1
;
The osm_routing.route_details
table now includes a speed limit and timestamp
for each segment for the included routes.
route_id|seq|path_seq|edge_mph|node_ts |
--------|---|--------|--------|-------------------|
1| 1| 1| 45.00|2020-11-01 05:00:00|
1| 2| 2| 45.00|2020-11-01 05:00:03|
1| 3| 3| 45.00|2020-11-01 05:00:04|
1| 4| 4| 45.00|2020-11-01 05:00:05|
1| 5| 5| 45.00|2020-11-01 05:00:06|
Finally, Trajectories!
Now to turn into trajectories I am using a materialized view created with a CTE.
The first portion (nodes
) pulls out the needed details for each node. The second
step (traj_nodes
) creates the points with the epoch of the timestamp for the
M
value. The third step (build_line
) uses the LEAD()
window function to string
route nodes together into trajectory lines.
The final portion calculates few extra handy columns, such as time_range
for easy temporal only filtering.
CREATE MATERIALIZED VIEW osm_routing.route_trajectory_segment AS
WITH nodes AS (
SELECT route_id, seq, path_seq, node, edge, cost, agg_cost,
edge_mph, node_ts,
ST_X(ST_Transform(node_geom, 4326)) AS long,
ST_Y(ST_Transform(node_geom, 4326)) AS lat
FROM osm_routing.route_details
), traj_nodes AS (
SELECT route_id, seq, path_seq, node_ts,
ST_SetSRID(ST_MakePointM(long, lat, EXTRACT(epoch FROM node_ts)), 4326) AS traj_point
FROM nodes
), build_line AS (
SELECT traj_nodes.route_id, traj_nodes.seq, traj_nodes.path_seq,
node_ts AS time_start,
LEAD(node_ts) OVER (PARTITION BY route_id ORDER BY seq, path_seq)
AS time_end,
ST_Transform(
ST_MakeLine(traj_nodes.traj_point,
LEAD(traj_nodes.traj_point)
OVER (PARTITION BY traj_nodes.route_id
ORDER BY traj_nodes.seq, traj_nodes.path_seq)
)
, 2773)
AS traj
FROM traj_nodes
)
SELECT *,
TSTZRANGE(time_start, time_end) AS time_range,
ST_Length(traj) AS route_length,
time_end - time_start AS duration
FROM build_line
;
Create indexes in the trajectory and time range columns for speedy querying.
CREATE INDEX gix_osm_routing_route_trajectory_segment
ON osm_routing.route_trajectory_segment
USING GIST (traj);
CREATE INDEX ix_osm_routing_route_trajectory_segment_trange
ON osm_routing.route_trajectory_segment
USING GIST (time_range);
While the materialized view created above includes dedicated timestamp and time range calculations, these values could be pulled back out of the trajectory data itself.
SELECT route_id, seq,
TO_TIMESTAMP(ST_M(ST_StartPoint(traj))) AS t_start_traj,
TO_TIMESTAMP(ST_M(ST_StartPoint(traj))) AS t_end_traj
FROM osm_routing.route_trajectory_segment
ORDER BY route_id, seq, path_seq
LIMIT 5
;
route_id|seq|t_start_traj |t_end_traj |
--------|---|-------------------|-------------------|
1| 1|2020-11-01 05:00:00|2020-11-01 05:00:00|
1| 2|2020-11-01 05:00:03|2020-11-01 05:00:03|
1| 3|2020-11-01 05:00:04|2020-11-01 05:00:04|
1| 4|2020-11-01 05:00:05|2020-11-01 05:00:05|
1| 5|2020-11-01 05:00:06|2020-11-01 05:00:06|
The animation using QGIS' temporal controller currently requires the dedicated start/end times as actual columns. Future QGIS versions may remove this requirement. For that reason, the timestamp columns have been kept in the materialized view.
Check validity
It is a good habit to check your trajectories for invalid data using
ST_IsValidTrajectory()
.
It took me more than a couple tries to get the final trajectory data
to build properly.
Most of my issues were due to my own errors with sorting and other logic.
Intersections in time and space
This query is an example of how trajectories may be faked using
time range data. As mentioned earlier, this method does not work as desired for more
than the most basic examples.
The join uses a.time_range && b.time_range AND ST_DWithin(a.traj, b.traj, 2)
to filter for rows where the time range overlap and geometries are withing
2 meters of each other.
Using this approach returns 45 rows with most of them are false positives.
SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
a.seq AS seq_a, b.seq AS seq_b,
a.time_range, b.time_range,
a.traj, b.traj AS traj_line_intersect
FROM osm_routing.route_trajectory_segment a
INNER JOIN osm_routing.route_trajectory_segment b
ON a.time_range && b.time_range
AND ST_DWithin(a.traj, b.traj, 2)
AND a.route_id < b.route_id
;
Taking a closer look at a single record we can see the tail of one route is
at the same point as the beginning of another route (POINT (-104.99831660000001 39.74985710023273)
). These nodes were traversed 6 seconds apart by the
different routes (05:00:05 vs 05:00:11).
Did the objects traveling on the routes really get within 2 meters of each other?
SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
a.seq AS seq_a, b.seq AS seq_b,
ST_Transform(ST_StartPoint(a.traj), 4326) AS start_a,
ST_Transform(ST_EndPoint(b.traj), 4326) AS end_b,
TO_TIMESTAMP(ST_M(ST_StartPoint(a.traj))) AS start_time_a,
TO_TIMESTAMP(ST_M(ST_EndPoint(b.traj))) AS end_time_b
FROM osm_routing.route_trajectory_segment a
INNER JOIN osm_routing.route_trajectory_segment b
ON a.time_range && b.time_range
AND ST_DWithin(a.traj, b.traj, 2)
AND a.route_id < b.route_id
AND (a.route_id = 3 AND b.route_id = 4 AND a.seq = 3 AND b.seq = 6)
;
Name |Value |
------------|---------------------------------------------|
route_id_a |3 |
route_id_b |4 |
seq_a |3 |
seq_b |6 |
start_a |POINT (-104.99831660000001 39.74985710023273)|
end_b |POINT (-104.99831660000001 39.74985710023273)|
start_time_a|2020-11-01 05:00:05 |
end_time_b |2020-11-01 05:00:11 |
To check the closest distance between two traveling objects represented by
the trajectories we can use
ST_DistanceCPA
.
The following query is updated with ST_DistanceCPA(a.traj, b.traj) AS route_cpa_meters
.
This shows the two routes' closest
points were actually 3.9 meters apart, further than the 2 meter threshold!
So while the times overlap, and the geometries are
within 2 meters, the traveling objects would not have been within 2 meters
as they travel along their respective routes.
SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
a.seq AS seq_a, b.seq AS seq_b,
ST_DistanceCPA(a.traj, b.traj) AS route_cpa_meters
FROM osm_routing.route_trajectory_segment a
INNER JOIN osm_routing.route_trajectory_segment b
ON a.time_range && b.time_range
AND ST_DWithin(a.traj, b.traj, 2)
AND a.route_id < b.route_id
AND a.route_id = 3 AND b.route_id = 4
AND a.seq = 3 AND b.seq = 6
;
route_id_a|route_id_b|seq_a|seq_b|route_cpa_meters |
----------|----------|-----|-----|-----------------|
3| 4| 3| 6|3.938249529587584|
Trajectory specific functions
To be more accurate with spatio-temporal filtering, use the trajectory specific function
ST_CPAWithin
.
This replaces the two-part join on time range and geometry in the previous queries
with ST_CPAWithin(a.traj, b.traj, 2)
.
This returns only 3 records instead of 45. This also includes a check column to show that
all three trajectory segments had time within 2 meters or less.
SELECT a.route_id AS route_id_a, b.route_id AS route_id_b,
a.seq AS seq_a, b.seq AS seq_b,
ST_DistanceCPA(a.traj, b.traj) AS traj_closest_point_meters,
a.traj, b.traj AS traj_line_intersect
FROM osm_routing.route_trajectory_segment a
INNER JOIN osm_routing.route_trajectory_segment b
ON ST_CPAWithin(a.traj, b.traj, 2)
AND a.route_id < b.route_id
;
route_id_a|route_id_b|seq_a|seq_b|traj_closest_point_meters|
----------|----------|-----|-----|-------------------------|
3| 4| 10| 13| 1.87513147035179|
3| 4| 10| 14| 1.8751314051078767|
3| 4| 11| 14| 1.8751314051078767|
Animate in QGIS
Seeing data in action is always helpful. QGIS now includes a feature called Temporal Controller, available in QGIS 3.14. and newer. I am using the feature currently in QGIS v3.16 with good results. Nyall Dawson has a great tutorial on how to use this feature. The trajectory routes created above are illustrated in the following animated image created using QGIS' Temporal Controller.
Note: Above map uses Stamen Toner basemap, see https://wiki.openstreetmap.org/wiki/Tile_servers.
Saving the animation from QGIS seems to only have one option currently, to save a series
of PNG files into a directory. To stitch them into an animated GIF I used
the convert
tool. I wasn't able to get the Cumulative range feature to export
with the images, and getting the right resolution and extents set is a bit finicky still.
That said, it's a new feature and I'm sure those nuances will be improved on in the future.
For now, to convert the exported PNG files into an animated GIF.
cd /path/to/qgis/dir
convert -delay 10 -loop 0 *.png postgis-trajectory-qgis-animation.gif
Summary
This post showed how to get started with PostGIS trajectory data using OpenStreetMap data and pgrouting. There is plenty more exploring to be done with this now that the basics for spatio-temporal modeling are in place.
Need help with your PostgreSQL servers or databases? Contact us to start the conversation!