RustProof Labs: blogging for education (logo)
My book Mastering PostGIS and OpenStreetMap is available!

PostgreSQL Crash Course

By Ryan Lambert -- Published July 30, 2018

PostgreSQL (postgres, pg) is a powerful, open-source relational database (RMDS). It is the database of choice here at RustProof Labs, I wrote previously why I chose PostgreSQL over MySQL. This post provides a 10-mile view of how PostgreSQL is structured, tools to use, and other important concepts. The goal is to provide a broad view of PostgreSQL with resources to more detail on each topic, without getting caught up in too much technical jargon.

This post is part of the series PostgreSQL: From Idea to Database.

Installing PostgreSQL

PostgreSQL will run on most any modern hardware and operating systems. PostgreSQL has installation guides for a large number of operating systems, supporting most Linux/Unix, Mac, and Windows. As for hardware, it will run on nearly anything modern, including a Raspberry Pi (ARM processor). See the PostgreSQL supported platforms for more details.

I have successfully ran PostgreSQL on Linux servers of all sizes, from tiny to powerful. I previously wrote about load testing PostgreSQL that gives more detail about some of the systems it can run on and what type of performance to expect.

Structure and Terminology

Every RDMS has its own way to organize the elements required for a reliable database system. PostgreSQL has the following hierarchy to organize the required elements.

Cluster > Database > Schema > Table > Data

Data is stored as rows, contained in tables, within a schema, within a database, within a cluster. The PostgreSQL Concepts page provides a nice explanation of this overall structure.

Continue Reading

Security is a feature. Choose it!

By Ryan Lambert -- Published July 11, 2018

Nearly every modern organization, large or small, has a significant amount of IT infrastructure reliant on 3rd parties. For the purpose of this post, 3rd parties includes anything you think of as "the cloud" (e.g. not my servers) or any service that you don't fully control on your hardware. Web hosting, email, social media, and file storage are all examples of services commonly outsourced.

I wrote previously about the responsibilities associated with managing data securely. A solid cybersecurity posture should require evaluating 3rd parties for proper security practices. This post provides a loose framework for evaluating a 3rd party's overall cybersecurity posture. This evaluation can then be used in your ongoing decision making to pick new vendors and reevaluate existing vendors. This post does not go in depth to any specific technology or vendor specific details, instead I intend to focus more on the culture of security.

Security as a feature

Security as a feature in this model has 3 overarching components:

  • Preparation
  • Problem solving
  • Learn from mistakes

The quality of cybersecurity is an aggregate of these three components. An organization with good security practices is well prepared, knows how to react when something goes wrong, and they learn from their mistakes.

Plan for the worst, hope for the best.

The reality is, until more organizations make cybersecurity a higher priority, the never-ending stream of data breaches will continue.

Continue Reading

PostGIS: Tame your spatial data (Part 1)

By Ryan Lambert -- Published July 06, 2018

The goal of this post is to examine and fix one common headache when using spatial data: its size. Spatial data (commonly called "GIS data") can become painfully slow because spatial data gets big in a hurry. I discussed this concept before by reducing the OSM roads layer to provide low-overhead thematic layers. This post uses the same basic philosophy to reduce another common bottleneck with polygon layers covering large areas, such as the US county boundaries.

This post goes into detail about using PostGIS to simplify large polygon objects, the effects this has on storage, performance and the accuracy of spatial analysis when using simplified geometries. The solution provided is not suitable for all data and/or use cases, particularly if a high level of accuracy and precision is required.

One of the main reasons why PostgreSQL became my personal favorite RDMS was its PostGIS extension, providing the most robust spatial database in the world.

This post is an advanced topic of the series PostgreSQL: From Idea to Database.

A numeric example

This post assumes basic knowledge of SQL syntax, PostgreSQL and PostGIS.

To begin this conversation, let's examine a bit about what we already know about storing numbers. Let's start with Pi (π). Pi is a mathematical constant, roughly represented as 3.14159, though the decimals go on and on into infinity. The following example shows how to create a table with a single row storing two different representations of Pi. The first column, pi_long, stores 31 decimals of Pi. The pi_short column stores only two decimal places.

DROP TABLE IF EXISTS cool_numbers;
CREATE TEMP TABLE cool_numbers AS 
SELECT 3.1415926535897932384626433832795::NUMERIC(32,31) AS pi_long,
    3.14::NUMERIC(3,2) AS pi_short;

As you may expect, storing a number with more digits takes more disk space than storing a number with fewer digits. This is called precision. The following query uses the pg_column_size() function to illustrate that the number with more decimal spaces takes up more disk space.

Continue Reading

PostgreSQL: From Idea to Database - Table of Contents

By Ryan Lambert -- Published June 19, 2018

PostgreSQL: From Idea to Database is a series of blog posts written to teach practical database design. This series focuses on practical examples of working with Postgres to illustrate the why and how of operation.

Note: This series is an evolving and improving collection of blog posts. I regularly update this page as content is published and maintained.

Table of contents

The following listing includes links to the pages that are already published along with placeholders for planned future topics. Future topics (e.g. titles without links) are likely to change in title, exact content and scope.

Continue Reading

PostgreSQL: From Idea to Database - Introduction

By Ryan Lambert -- Published June 14, 2018

Databases are a critical component for nearly any modern application. The database is also a component that often is surrounded by confusion and apprehension.

This post is the first of the series PostgreSQL: From Idea to Database. The goal of this series to provide a guideline to how we approach database development at RustProof Labs. This includes covering both methodology and technology. This series provides working code examples, friendly explanations, and real-world database design scenarios. Hopefully this series helps explain and teach database design is a new, friendly format.

Project for reference

The PiWS Logo

A major challenge with teaching database design is when you don't have a real project to use as an example. For this, I've chosen to use The PiWS project, an open source weather station designed around the Raspberry Pi. To read more about the PiWS, see my introductory post.

The PiWS project was chosen for a variety of reasons, but there are three main reasons:

  • I know the code
  • The project was developed in a way worth describing
  • It's open source!

The third reason is a big benefit because the source code used in this series will come from the project's GitHub repository. This gives us a real project with real code to study how to approach the task of designing a database.

Throughout this series I provide examples of how the PiWS project was initially designed and how it has evolved.

Continue Reading

<-- Older Posts          Newer Posts -->