PostgreSQL at RustProof Labs: 2018 in Review
As I look back on 2018, I can say with great joy: I have spent a lot of time working with PostgreSQL this year! Postgres is my favorite open-source database, and I'm really excited to start digging into some of the new features and enhancements in PostgreSQL 11 and PostGIS 2.5.
This post is a quick snapshot of my Postgres activities in 2018 and some goals for 2019. For me, these activities fall into a three broad groups.
- Blog posts
- Projects built around PostgreSQL (open-source and internal)
- Other stuff
PostgreSQL popularity
PostgreSQL has continued to grow in popularity through 2018, a long running trend according to DB-Engine's ranking. Postgres is 4th on the list ahead of MongoDB (even before the news of The Guardian switching!) and behind Oracle, MySQL and MS SQL. Postgres and MongoDB are growing in popularity while all of the top 3 databases are apparently declining.
Yay, PostgreSQL!!
PostGIS: Tame your spatial data (Part 2)
In a previous post PostGIS: Tame your spatial data
I illustrated how large-area polygons in your spatial data can take up a lot of space.
That post examined one method (ST_Simplify
) to reduce the size of the data (45% reduction of size on disk)
in order to improve performance (37% faster queries).
The goal of reductions like this are to improve the life of the analysts
working with spatial data for their job.
These analysts are often using spatial data within a GIS tool such as QGIS, and in
those tools Time-to-Render (TTR) is quite important.
The topic of that prior post solved a specific problem (large polygons), and unfortunately that solution can't be applied to all problems related to the size of spatial data.
This post is an advanced topic of the series PostgreSQL: From Idea to Database.
Enhance your PostgreSQL Data Dictionary
This post continues the topic of creating a robust data dictionary in PostgreSQL.
My previous post on
building a data dictionary in PostgreSQL
focused on using built-in psql
tools, such as \dn
and \dt
. It also quickly covered the use of comments
to document database objects.
Those tools and shortcuts are great and readily available, but beyond quick exploration tasks these built-in options don't provide enough functionality. This is especially true in today's data-filled world with
increasing regulation.
This post is part of the series PostgreSQL: From Idea to Database.
Meet pgdd
: PostgreSQL data dictionary
The PostgreSQL Data Dictionary (pgdd
) project,
available on GitHub, intends to improve our ability to document databases in PostgreSQL.
This open-source project super-powers your PostgreSQL databases by introducing additional
meta-tracking abilities.
It also makes the internal database meta-data more user accessible and user friendly through a
handy set of views. These views can be queried and joined using standard SQL syntax using any
tool you like.
pgdd
exists to make PostgreSQL data dictionaries more usable.
Data dictionary should be for everyone
Imagine an analyst with access to a PostgreSQL/PostGIS database. That database probably has
data from a variety of open and closed sources.
That analyst spends
quite a bit of time in QGIS, but very little time in psql
. Being able to run
standard SQL type queries in their favorite tool makes the data dictionary more usable for this user.
Data Analytics and Software: It still isn't easy
This post examines the work of producing analytics/visualizations and the role
software plays. Business Intelligence (BI) software, such as
Power BI,
Tableau, and
SAP Dashboards
are a few examples of software with a lot of buzz.
The marketing folks in that industry would have us believe their software enables analysts
to point and click at data and enable magic
. This magic
feature provides instant access
to interactive dashboards, drill-down functionality, and more.
Don't forget, the data will be reliable, accurate, easy to share, easy to secure,
and cross-platform (mobile-friendly) too!
Reality check
Data visualization software in 2018 doesn't have the magic
feature working yet.
I believe it takes at least as much human effort today to produce a great visualization as it
did a decade ago. The improvements in the software are not making the process quicker or
easier. The reality is that working with data isn't for everyone, and
#DataIsHard.
I recently attended a PASS webinar by Scot Reagin who stated:
"90% of users are not self-service capable."
Build your Data Dictionary in PostgreSQL
This post provides an introduction to building a data dictionary directly in our PostgreSQL databases. The steps outlined here are specific to PostgreSQL, though every database platform has these basics components in place.
My 2022 post Postgres Data Dictionary for everyone shows how to use the PgDD extension to make it easy to query data dictionary details using standard SQL sytnax.
What is a Data Dictionary?
Documentation is a good thing. Data dictionaries are one important component of documenting your databases. A data dictionary is a common tool used to provide database-specific documentation to analysts, developers, and other business users. A good data dictionary provides insights into a database's structure, constraints, relationships, and sources of data found inside a system.