Postgres is Relational Plus
I was the host for this month's #PGSQLPhriday topic (#005), and decided on the topic question: Is your data relational? This is my submission on the topic, and how I use Postgres for Relational Plus usages.
Q: What non-relational data do you store in Postgres and how do you use it?
PostGIS is the most prominent non-relational data I am involved with. Pretty much all of the PostGIS data I am involved with is rooted alongside solidly relational data. Want to know demographics of customers within 10 miles of a specific location? The location portion is spatial, the demographic data is relational.
One of my favorite spatial data sources
is built using a flexible
key/value tagging scheme
to support a diverse, global community.
The PgOSM Flex project uses
to provide additional attributes in the
PgOSM Flex pulls out the most commonly used "tags" from
the data during processing and stores them in proper relational
table structures. Of course, there are many thousands of distinct
keys used in OpenStreetMap so it's impossible to try to catch them all.
JSONB data comes in!
osm.tags table has a
column with all of the key/value data from OpenStreetMap.
An example of using this
JSONB data is shown in
Identify OpenStreetMap changes with Postgres.
I also have found myself storing various API results in
format. My usage pattern with all JSON data is that it is saved once,
never updated, rarely retrieved. It serves a purpose, at a cost.
One always-relevant cost is the size JSON takes on disk.
Didn't work well
Q: Have you attempted non-relational uses of Postgres that did not work well?
Messing up spatial joins on big data. In a
traditional relational join, missing a join clause (
AND a.county_id = b.county_id)
might result in an slower query returning way too many rows.
It probably won't be too bad though.
In a spatial query, missing a join clause (
AND a.geom && b.geom)
can change a query timing from seconds into hours. In some cases,
it's even possible to crash your server. How does that happen?
A lot of spatial data with complex polygons, queries with sorting,
and spilling a lot of data to disk. Turns out,
Postgres really doesn't enjoy running out of disk space! 🔥
The importance of monitoring free disk space cannot be overstated!
My 2018 post Tame your spatial data shows the type of complex polygons mentioned above. Postgres 9.6 and spinning rust HDDs feels so long ago!
Q: What are the biggest challenges with your data, whatever its structure?
Documentation! In relational databases, the structure itself is a significant part of the documentation. Primary and foreign keys, constraints, and indexes all give hints about the meaning contained. When I approach a new database that is missing these critical elements, my job is significantly more difficult.
The other biggest challenge I deal with is the balance on the ideal structure. That's always an "it depends" answer based on the use case and organization's culture.
How does the data come in? How is it used? How is it retained? How do we think it will be needed in 5 years? 10 years? I am finding myself now at the point where I am maintaining a growing number of systems that go back a decade or more. At this point in their lifecycle, I am very aware of how design implications now will still be impacting me (or someone else) in the future. Not all systems will last 10 years, but many will.
Postgres is a great database providing Relational Plus support. If you need relational support, Postgres is trusted and reliable. If you need support beyond standard relational data, Postgres is quite capable of handling a wide range of other data structures.
Need help with your PostgreSQL servers or databases? Contact us to start the conversation!
Published February 03, 2023
Last Updated February 03, 2023