RustProof Labs: blogging for education (logo)
My book Mastering PostGIS and OpenStreetMap is available!

PostgreSQL deployments using Ansible and Sqitch

By Ryan Lambert -- Published July 07, 2019

The database is a critical component of pretty much every application in existence today. Being such an important component it is good to be able to consistently deploy your databases in a efficient, reliable and accurate manner. At RustProof Labs, Postgres is our database of choice so in this post I examine how we use Ansible and Sqitch together at a high level to deploy, maintain, and upgrade our Postgres servers and databases.

This post is part of the series PostgreSQL: From Idea to Database.

Define goals

Before diving into how Ansible and Sqitch work together with Postgres lets define some goals. At the server/service level, we need the ability to quickly provision new Postgres instances ready for production or development. It should be easy to set/change many common configuration options (postgresql.conf, pg_hba.conf, etc.) to allow efficient operation on a variety of hardware and handle widely different workloads.

At the individual database level (where your app stores its data) we need the ability to manage schema changes (DDL) over time. This typically involves a large number of tables, views and functions. It should be easy and low-risk to test, deploy and verify even the most complex schema changes.

Meeting our goals

Ansible is a tool for automating IT deployments and configurations. We use Ansible to configure servers, install software, manage security, and customize configuration files. In other words, Ansible is the conductor of our automation orchestra. I had reassurance that Ansible meets our needs in 2018 after one of our primary database servers became "unavailable to communicate." I received a notification of the issue and promoted a secondary server to be the new primary after initial triage. After creating a new Amazon EC2 instance I ran two Ansible playbooks taking a total of less than 5 minutes to provision the replacement secondary server.

Sqitch is a database change management tool. Our database projects use Sqitch to deploy and track changes to individual databases schemas. It is a straight forward command line tool, simple to use during development and easy to automate into our large deployments with Ansible.

Ansible

The great thing with Ansible is that if you can document the manual steps to take, those can be converted into Ansible roles and playbooks. In fact, I recommend you have a solid grasp on your desired deployment process before you attempt to automate it.

Install Ansible (Ubuntu) on your control node. This is normally installed on my laptop for easy access (development) and on our deployment servers.

sudo apt update
sudo apt install software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install ansible

With Ansible available, it is time to start automating. I'd recommend their quickstart and docs if you are new to Ansible.

Install PostgreSQL and Sqitch

We can now write Ansible tasks to deploy any version of Postgres available, such as this excerpt:

- name: "Install PostgreSQL {{ pg_version }} base package"
  apt:
    package: "{{ item }}"
    state: latest
    update_cache: yes
    cache_valid_time: 3600
  with_items:
    - "postgresql-{{ pg_version }}"
    - python-psycopg2
    - sqitch
    - pgbadger

The variable pg_version can be set in defaults/main.yml to define your preferred version. It is easy to override the defaults in the inventory files, playbooks, or even environment variables. Other than the core PostgreSQL server and Sqitch, this task installs a couple other Postgres tools that I always want installed (pgbadger and python-psycopg2). If certain software only gets installed on some servers, those can be controlled in their own task by using a when clause.

when: pg_dev_server

Configure PostgreSQL

One of our big requirements was to easily configure and reconfigure our Postgres instances. Ansible makes this easy as well. The following example task uses the pg_version variable again to handle deploying a customizsed version-specific postgresql.conf file.

- name: Copy production postgres.conf template
  template:
    src: "postgresql-{{ pg_version }}.conf.j2"
    dest: "/etc/postgresql/{{ pg_version }}/main/postgresql.conf"
    owner: postgres
    group: postgres
  notify:
    - Restart PostgreSQL

The src file is named to include the version it is valid for (configuration options do change over time!) and ends with a .j2 file extension. Ansible templates use Jinja to generate and deploy customized configurations. For example, if we want to control shared_buffers, probably the single-most configured option in Postgres, our templates/postgresql-11.conf.j2 would replace the default postgresql.conf line with this one:

shared_buffers = {{ pg_shared_buffers  }}

Now, defaults/main.yml should be updated to provide a reasonable default value. If you know your Postgres servers typically have at least 4GB RAM, 1GB for shared_buffers is a good starting point (25%).

pg_shared_buffers: 1GB

Setting defaults is not required by Ansible, but can help keep your inventory variables limited to documenting deviation from your defaults.

An Ansible inventory file (or env vars) can override the default value, but note the use of = (below) instead of : (above) between the variable name and the value. This inventory example would be suitable for a Raspberry Pi's configuration with the older version of Postgres and reduced shared buffers.

[postgresql:vars]
pg_version=9.6
pg_shared_buffers=100MB

Sqitch

The best way to get started with Sqitch is with their Postgres tutorial. It is worth following along to learn how powerfully simple Sqitch is.

sqitch status
sqitch deploy
sqitch verify

I worked up this short, cheesy video highlighting the basics:

Numbered migrations

Sqitch makes it clear you do not need to use numbered, ordered changes.

"There is no need to number your changes, although you can if you want. Sqitch doesn’t much care how you name your changes."

I still prefer numbered, ordered migrations with names like 001, 002 and so on. Database migrations are an ordered series of changes so having the basic ordering inherent in the file names is nice. Also, we are creatures of habit and that was how I was taught to do it, and since Sqitch doesn't care... 😀

Deploying with Ansible

The Sqitch commands can easily be wrapped into Ansible tasks using the shell module, this example includes --verify to instruct Sqitch to automatically run the verification scripts.

- name: Deploy database using Sqitch
  shell: "sqitch deploy --verify db:pg:{{ db_name }}"
  become_user: postgres
  become: yes

It's often handy to have a single script to use to install a database. This is easy enough to accomplish with pg_dump in another task.

- name: Dump database schema for single DB install file.
  shell: "pg_dump --schema-only -U {{ db_user }} -d "{{ db_name }} -f {{ db_name }}-schema.sql"
  become_user: postgres
  become: yes

Summary

This post has provided a high level view of how we use Ansible and Sqitch to deploy our Postgres at RustProof Labs. Automation is a great way to increase your efficiency, reduce errors, and generally maintain more reliable database driven architectures. This is important throughout our infrastructure, but it is especially true for our databases!

Need help with your PostgreSQL servers or databases? Contact us to start the conversation!

By Ryan Lambert
Published July 07, 2019
Last Updated August 29, 2019