PGSQL Phriday #005 Recap
Thank you to everyone who contributed to PGSQL Phriday #005! This month's topic: "Is your data relational?" If I missed any contributions, or if new ones are published, let me know and I'll try to update this post. These snippets are in a somewhat random order, loosely threaded together by sub-topic.
Hettie D. wrote a great post starting by addressing the bonus question. I put that question last partly because I have struggled with a succinct definition myself. I also put it last because I hoped the initial 3 questions would lead us to answer the bonus question in our own ways. Hettie also discusses storing blobs and objects, and the trade-offs between added complexity and reaching for a more targeted technology.
Postgres is Relational Plus
I was the host for this month's #PGSQLPhriday topic (#005), and decided on the topic question: Is your data relational? This is my submission on the topic, covering how I use Postgres for "Relational Plus" use cases.
Q: What non-relational data do you store in Postgres and how do you use it?
PostGIS is the most prominent non-relational data I am involved with. Pretty much all of the PostGIS data I am involved with is rooted alongside solidly relational data. Want to know demographics of customers within 10 miles of a specific location? The location portion is spatial, the demographic data is relational.
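A query combining the two might look like the following sketch. The table and column names here are hypothetical, assuming a customers table with a PostGIS geography column; ST_DWithin on geography compares distances in meters, so 10 miles is roughly 16,093 meters.

```sql
-- Hypothetical schema: customers(customer_id, age_bracket, income_bracket, geom geography)
-- Find demographics for customers within 10 miles (~16,093 m) of a point.
SELECT c.customer_id, c.age_bracket, c.income_bracket
    FROM customers c
    WHERE ST_DWithin(
        c.geom,
        ST_SetSRID(ST_MakePoint(-122.33, 47.61), 4326)::geography,
        16093
    );
```

The spatial filter and the relational attributes live in one query, which is exactly why keeping both in Postgres is convenient.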
Relational and Non-relational Data: PGSQL Phriday #005
Welcome to the 5th installment of the #PGSQLPhriday blogging series. I am thrilled to be this month's host! The topic posts should be published by Friday February 3rd.
When Ryan Booz proposed the idea for #PGSQLPhriday I was immediately excited about it. Other than our first names, Ryan and I have a few other things in common. One of these common points is that we both started our database careers in the world of MS SQL Server and later found our way to Postgres. My move to Postgres, and why I discovered Postgres, is at the heart of this month's topic for PGSQL Phriday 005.
Is your data relational?
The entire reason I discovered and started using Postgres was PostGIS. I needed PostGIS because I had a project in 2011 that could benefit from the OpenStreetMap data. The project still needed rock solid support for relational data and the SQL Standard, which Postgres also provides. However, it was the spatial support of PostGIS that pulled me into the world of Postgres.
Postgres and Software Development: PGSQL Phriday #004
This blog post is part of PGSQL Phriday #004. Hettie Dombrovskaya is this month's host! I was very excited to see the topic chosen as Postgres and Software Development. My contribution to this month's #pgsqlphriday topic covers how I manage code through our development processes. Check out Hettie's post for more about this month's topic.
Types of code
Before continuing on to the specific questions for the challenge, I want to define the broad groupings of code I work with.
- Mission critical
- Not trivial
Mission critical code is where most of the "special sauce" is at. Mission critical SQL code includes the DDL commands that create the database structure, such as CREATE TABLE foo and CREATE VIEW baz. This level of code represents the data structure that enables everything else to function.
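As a minimal sketch of what that DDL level looks like, using the foo and baz names from above (the column definitions are illustrative assumptions, not the actual schema):

```sql
-- Illustrative mission critical DDL: a table and a view built on it.
CREATE TABLE foo (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL
);

-- Views like this define how downstream code sees the data.
CREATE VIEW baz AS
SELECT id, name
    FROM foo;
```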
Stubbing toes with auto_explain
Postgres has a handy module called auto_explain that lives up to its name: it runs EXPLAIN automatically for you. The intent of this module is to automatically provide useful troubleshooting information about your slow queries as they happen.
This post outlines a pitfall I recently discovered with auto_explain. Luckily for us, it's an easy thing to avoid. I discovered this by running CREATE EXTENSION postgis; and watching it run for quite a while before failing with an out of disk space error. That is not my typical experience with a simple CREATE EXTENSION command!
Standard use of auto_explain
A common way auto_explain is used is to target "slow queries" through the setting auto_explain.log_min_duration. This setting defines the number of milliseconds to use as the threshold for logging the EXPLAIN output. If your queries are typically 10-50 ms, you might decide to set auto_explain.log_min_duration = 100 to log queries taking twice as long as your goal. An instance serving big analytic queries might want to set that much higher, say 2 or 5 seconds.
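In a session that can be sketched as follows; loading the module per-session works for testing, while production setups typically load it via shared_preload_libraries or session_preload_libraries instead.

```sql
-- Load auto_explain for the current session (requires superuser).
LOAD 'auto_explain';

-- Log EXPLAIN output for any query running 100 ms or longer.
SET auto_explain.log_min_duration = 100;

-- An analytics-heavy instance might prefer a higher threshold:
-- SET auto_explain.log_min_duration = '2s';
```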
I say my testing was innocent because I wasn't trying to break something. That makes it innocent, right? I was playing around with auto_explain trying out PgMustard's scoring API. At the time I didn't want to think about where to set that threshold... I just wanted to capture explain output for testing. The documentation explains that setting auto_explain.log_min_duration = 0 will capture "all plans." Sounds good, let's do that!
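The catch-everything configuration looks harmless on its own; this is a sketch of the settings in question, not a recommendation:

```sql
LOAD 'auto_explain';

-- Zero means log the plan for EVERY statement -- including the many
-- statements run internally by commands like CREATE EXTENSION.
SET auto_explain.log_min_duration = 0;
```

Logging a plan for every statement is what allows an otherwise routine command to flood the log, which is the pitfall this post walks into.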