PostgreSQL deployments using Ansible and Sqitch
The database is a critical component of pretty much every application in existence today. Being such an important component it is good to be able to consistently deploy your databases in a efficient, reliable and accurate manner. At RustProof Labs, Postgres is our database of choice so in this post I examine how we use Ansible and Sqitch together at a high level to deploy, maintain, and upgrade our Postgres servers and databases.
This post is part of the series PostgreSQL: From Idea to Database.
Define goals
Before diving into how Ansible and Sqitch work together with Postgres
lets define some goals. At the server/service level, we need the ability
to quickly provision new Postgres instances ready for production or development.
It should be easy to set/change many common configuration options
(postgresql.conf
, pg_hba.conf
, etc.) to allow efficient operation
on a variety of hardware and handle widely different workloads.
At the individual database level (where your app stores its data) we need
the ability to manage schema changes (DDL
) over time. This typically involves
a large number of tables, views and functions. It should be easy
and low-risk to test, deploy and verify even the most complex schema changes.
Meeting our goals
Ansible is a tool for automating IT deployments and configurations. We use Ansible to configure servers, install software, manage security, and customize configuration files. In other words, Ansible is the conductor of our automation orchestra. I had reassurance that Ansible meets our needs in 2018 after one of our primary database servers became "unavailable to communicate." I received a notification of the issue and promoted a secondary server to be the new primary after initial triage. After creating a new Amazon EC2 instance I ran two Ansible playbooks taking a total of less than 5 minutes to provision the replacement secondary server.
Sqitch is a database change management tool. Our database projects use Sqitch to deploy and track changes to individual databases schemas. It is a straight forward command line tool, simple to use during development and easy to automate into our large deployments with Ansible.
Ansible
The great thing with Ansible is that if you can document the manual steps to take, those can be converted into Ansible roles and playbooks. In fact, I recommend you have a solid grasp on your desired deployment process before you attempt to automate it.
Install Ansible (Ubuntu) on your control node. This is normally installed on my laptop for easy access (development) and on our deployment servers.
sudo apt update
sudo apt install software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install ansible
With Ansible available, it is time to start automating. I'd recommend their quickstart and docs if you are new to Ansible.
Install PostgreSQL and Sqitch
We can now write Ansible tasks to deploy any version of Postgres available, such as this excerpt:
- name: "Install PostgreSQL {{ pg_version }} base package" apt: package: "{{ item }}" state: latest update_cache: yes cache_valid_time: 3600 with_items: - "postgresql-{{ pg_version }}" - python-psycopg2 - sqitch - pgbadger
The variable pg_version
can be set in
defaults/main.yml
to define your
preferred version. It is easy to override the defaults in the inventory files,
playbooks, or even environment variables.
Other than the core PostgreSQL server and Sqitch, this task installs a couple
other Postgres tools that I always want installed
(pgbadger
and python-psycopg2
).
If certain software only gets installed on some servers,
those can be controlled in their own task by using a
when
clause.
when: pg_dev_server
Configure PostgreSQL
One of our big requirements was to easily configure and reconfigure our Postgres
instances. Ansible makes this easy as well. The following example task uses
the pg_version
variable
again to handle deploying a customizsed version-specific postgresql.conf
file.
- name: Copy production postgres.conf template template: src: "postgresql-{{ pg_version }}.conf.j2" dest: "/etc/postgresql/{{ pg_version }}/main/postgresql.conf" owner: postgres group: postgres notify: - Restart PostgreSQL
The src
file is named to include the version it is valid for (configuration options
do change over time!) and ends with a .j2
file extension.
Ansible templates use Jinja to generate and
deploy customized configurations. For example, if we want to control
shared_buffers
, probably the single-most configured option in Postgres,
our templates/postgresql-11.conf.j2
would replace the default postgresql.conf
line with this one:
shared_buffers = {{ pg_shared_buffers }}
Now, defaults/main.yml
should be updated to provide a reasonable default value.
If you know your Postgres servers typically have at least 4GB RAM,
1GB for shared_buffers
is a good starting point (25%).
pg_shared_buffers: 1GB
Setting defaults is not required by Ansible, but can help keep your inventory variables limited to documenting deviation from your defaults.
An Ansible inventory file (or env vars) can override the default value,
but note the use of =
(below)
instead of :
(above) between the variable name and the value.
This inventory example would be suitable for a
Raspberry Pi's
configuration with the older version of Postgres and reduced shared buffers.
[postgresql:vars]
pg_version=9.6
pg_shared_buffers=100MB
Sqitch
The best way to get started with Sqitch is with their Postgres tutorial. It is worth following along to learn how powerfully simple Sqitch is.
sqitch status
sqitch deploy
sqitch verify
I worked up this short, cheesy video highlighting the basics:
Friday lunch project - I was thinking something along these lines. 45 seconds, shows multi-platform support and the most important commands/functionality.
— RustProof Labs (@RustProofLabs) April 5, 2019
Cheesy music optional. pic.twitter.com/KfvNw2oFUl
Numbered migrations
Sqitch makes it clear you do not need to use numbered, ordered changes.
"There is no need to number your changes, although you can if you want. Sqitch doesn’t much care how you name your changes."
I still prefer numbered, ordered migrations with names like 001
, 002
and so on. Database migrations are an ordered series of changes so
having the basic ordering inherent in the file names is nice. Also,
we are creatures of habit and that was how I was taught to do it, and since
Sqitch doesn't care... 😀
Deploying with Ansible
The Sqitch commands can easily be wrapped into Ansible tasks using the
shell
module,
this example includes --verify
to instruct Sqitch to automatically run
the verification scripts.
- name: Deploy database using Sqitch shell: "sqitch deploy --verify db:pg:{{ db_name }}" become_user: postgres become: yes
It's often handy to have a single script to use to install a database. This
is easy enough to accomplish with pg_dump
in another task.
- name: Dump database schema for single DB install file. shell: "pg_dump --schema-only -U {{ db_user }} -d "{{ db_name }} -f {{ db_name }}-schema.sql" become_user: postgres become: yes
Summary
This post has provided a high level view of how we use Ansible and Sqitch to deploy our Postgres at RustProof Labs. Automation is a great way to increase your efficiency, reduce errors, and generally maintain more reliable database driven architectures. This is important throughout our infrastructure, but it is especially true for our databases!
Need help with your PostgreSQL servers or databases? Contact us to start the conversation!