There are several options for backing up a PostgreSQL database. The standard
pg_dump command ships with PostgreSQL and is very easy to use. Another option is to
create a file system level backup by directly copying the files that PostgreSQL uses to store
the data in the database. It is also possible to use PostgreSQL’s write ahead
log (WAL) to continuously archive the database. However, each of these approaches
has drawbacks, which we will discuss in the next section.
What’s wrong with the existing PostgreSQL backup strategies?
Each of the backup strategies discussed in the PostgreSQL documentation has its own set of advantages and disadvantages. I will discuss those tradeoffs and explain why we decided against each one.
SQL dump with pg_dump
Creating a SQL dump of the database with pg_dump is very simple for both the
backup and the restore process. Unfortunately, pg_dump can take a long time to
back up and restore a large database. Since backups take a long time, if the
backup is taken on the slave database, there is a chance that the slave could
fall too far behind the master database and the backup will be aborted. You can
reduce the chance of this happening by running the backup in the middle of the
night when the site is less busy. But if the database is large and the site is
busy at all times of the day, it may never be possible to create a backup with
pg_dump on the slave server.
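As a rough sketch, a pg_dump backup and restore looks like this. The database name, host, and user below are placeholders; the custom format (-Fc) is one common choice because it is compressed and works with pg_restore:

```shell
# Dump the database in the custom (compressed) format.
# Host, user, and database name are placeholders.
pg_dump -Fc -h db2.example.com -U postgres mydb > mydb.dump

# Restore into a freshly created database with pg_restore.
createdb mydb_restored
pg_restore -d mydb_restored mydb.dump
```

Both steps read or write every row in the database, which is exactly why they get slow as the database grows.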
File system level backup
Using a file system level backup to directly copy the files that PostgreSQL uses
to store the data in the database is another option. You create a backup
by using tar to archive and optionally compress the PostgreSQL data
directory. For example:
tar czf backup.tar.gz /var/lib/postgresql/9.2/main
A file system level backup is fast, but it has one major restriction: the database server must be shut down in order to get a usable backup. This restriction makes it a non-starter for anyone with a single database server. However, if you have a master/slave cluster it can be an attractive option. Just shut down the slave, back up its data directory, then restart the slave and let it catch back up to the master using streaming replication.
This will work for most people, but if the master gets too far ahead of the slave during the backup process, replication will have to be set up again by copying a fresh base backup from the master to the slave. If this happens you may be left with just as much downtime rebuilding replication, if not more, than taking the master down to perform the backup would have caused.
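The slave-side procedure is only a few commands. This is a sketch; the service name, data directory path, and backup destination all vary by distribution and PostgreSQL version:

```shell
# Hypothetical file system level backup on the slave.
# Adjust the service name and paths for your system.
service postgresql stop                                  # shut the slave down cleanly
tar czf /backups/slave-backup.tar.gz /var/lib/postgresql/9.2/main
service postgresql start                                 # restart; streaming replication catches up
```

The risk described above lives in the gap between `stop` and `start`: the longer the tar takes, the further ahead the master gets.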
Continuous Archiving and Point-in-Time Recovery (PITR)
Continuous Archiving is a backup approach that takes advantage of PostgreSQL’s write ahead log (WAL). The log records every change made to the database’s data files and forms the basis of PostgreSQL’s streaming replication. Using continuous archiving for backup purposes involves creating a base backup as the starting point and then just archiving every WAL file created as changes are made to the database.
There are several advantages to continuous archiving backups, discussed in PostgreSQL’s documentation on the subject. However, the main drawbacks are that setup is difficult and the backups may be large – both because of the ever-increasing number of WAL files and because the default WAL format is fairly bulky.
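For a sense of what the setup involves, here is a minimal postgresql.conf fragment for continuous archiving, assuming an archive directory at /mnt/wal_archive (the path is a placeholder; the archive_command shown is the example from the PostgreSQL documentation):

```
# postgresql.conf – minimal continuous archiving setup (sketch)
wal_level = archive          # use 'hot_standby' if slaves also serve reads
archive_mode = on
archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'
```

On top of this you still need a base backup, monitoring of the archive directory, and a retention policy for old WAL files – which is where the "setup is difficult" drawback comes from.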
A better approach with OmniPITR
OmniPITR is, in its own words, “a
set of scripts to ease setting up WAL replication, and making hot backups from
both Master and Slave systems”. OmniPITR is able to manage all aspects of
PostgreSQL streaming replication. It contains scripts for archiving and
shipping WAL segment files from the master to the slave, restoring the WAL
segments on the slave, and creating hot backups on both the master and slave
database servers. It is the slave backup script, omnipitr-backup-slave, that
we are most interested in today.
The omnipitr-backup-slave script is able to create a ‘hot backup’ on a PostgreSQL
slave server. A hot backup is made by backing up the data directory,
much like a file system level backup. However, unlike a normal file system
level backup, OmniPITR takes special care to allow the backup to be taken
without shutting down the database.
The omnipitr-backup-slave script first creates a trigger file that pauses
the removal of WAL archives. The script then tells the master to create a
backup label. This causes PostgreSQL to create a ‘snapshot’ of the data and
ensures that all of the data at the time of the snapshot is consistent on
disk. Next, the script creates a file system level backup of the PostgreSQL
data directory on the slave server. This is the core of the backup and
contains all of the data, in a consistent state, as of the time the snapshot
was taken. The script then tells the master that we are finished
making backups and compresses all of the WAL archives created during the
backup process. Finally, the trigger file that was created in the first step
to pause the removal of WAL archives is removed so that normal WAL archive
removal can resume.
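An invocation looks something like the following. This is a hypothetical sketch – the paths are placeholders, and you should check `omnipitr-backup-slave --help` for the authoritative option names on your version:

```shell
# Hypothetical omnipitr-backup-slave invocation; all paths are placeholders.
omnipitr-backup-slave \
  --data-dir /var/lib/postgresql/9.2/main \
  --source /var/lib/postgresql/walarchive \
  --dst-local /var/backups/postgres \
  --log /var/log/omnipitr/backup.log
```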
The two backup files created by omnipitr-backup-slave, one containing the
data directory backup and the other containing the WAL segment files created
during the backup process, can simply be extracted on top of each other on a
fresh database server or on your local machine to restore the database. The
backup is quick since it uses a file system level backup for the bulk of the
data, and it isn’t susceptible to the problem of the slave falling behind since
the master server continues processing incoming WAL segment files while the
backup is taking place.
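The “extract on top of each other” restore mechanics can be seen with plain tar. The archive and file names below are stand-ins, not the names OmniPITR actually produces:

```shell
# Demonstrate restoring by extracting two tarballs into one tree.
# All names here are illustrative stand-ins.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the data directory archive...
mkdir -p data/base
echo "table data" > data/base/16384
tar czf data-backup.tar.gz data

# ...and for the archive of WAL segments created during the backup.
mkdir -p data/pg_xlog
echo "wal segment" > data/pg_xlog/000000010000000000000001
tar czf xlog-backup.tar.gz data/pg_xlog
rm -rf data

# Restore: extract both archives into the same directory tree.
mkdir restore
tar xzf data-backup.tar.gz -C restore
tar xzf xlog-backup.tar.gz -C restore

ls restore/data/base restore/data/pg_xlog
```

After both extractions the restore tree contains the data files and the WAL segments side by side, which is exactly the layout PostgreSQL needs to recover to a consistent state.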
Easy PostgreSQL Slave Backups with Moonshine OmniPITR
OmniPITR is a very powerful tool, and as such it provides a lot of options. At Rails Machine we like to automate the installation and configuration of every tool we use. OmniPITR was no different, which is why we created moonshine_omnipitr.
Moonshine OmniPITR automates the installation and configuration of the
omnipitr-backup-slave script – as we learn more about OmniPITR and its
capabilities we may add additional functionality. Installation works just like
any other moonshine plugin:
# Rails 2.x.x
script/plugin install git://github.com/railsmachine/moonshine_omnipitr.git
# Rails 3.x.x
script/rails plugin install git://github.com/railsmachine/moonshine_omnipitr.git
Once it is installed, you can include it in your database manifest:
# app/manifests/database_manifest.rb
class DatabaseManifest < Moonshine::Manifest::Rails
  include Moonshine::Omnipitr

  # other recipes and configuration omitted

  # tell DatabaseManifest to use the omnipitr recipe;
  # omnipitr should only be set up on the slave database server
  if Facter.hostname =~ /^db2/
    recipe :omnipitr
  end
end
Moonshine OmniPITR automatically handles uploading the backups to your Amazon S3 bucket. For this to work, it needs access to your Amazon S3 account, which you can provide in any of these three places:
# in global config/moonshine.yml
:omnipitr:
  :s3_bucket: 'your-bucket'
  :s3_prefix: 'backups/'
  :s3_key: 'YOUR_ACCESS_KEY'
  :s3_secret: 'YOUR_SECRET_KEY'

# in stage-specific moonshine.yml,
# config/moonshine/staging.yml and config/moonshine/production.yml
:omnipitr:
  :s3_bucket: 'your-bucket'
  :s3_prefix: 'backups/'
  :s3_key: 'YOUR_ACCESS_KEY'
  :s3_secret: 'YOUR_SECRET_KEY'

# `configure` call in app/manifests/database_manifest.rb
configure :omnipitr => {
  :s3_bucket => 'your-bucket',
  :s3_prefix => 'backups/',
  :s3_key => 'YOUR_ACCESS_KEY',
  :s3_secret => 'YOUR_SECRET_KEY'
}
All of the configuration options are required except for :s3_prefix. The
prefix is an optional setting that can be used to organize your backups into a
subfolder within your bucket. You can then use S3’s Object Expiration to
automatically delete older backups or move them to Amazon Glacier.
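As a sketch of that retention setup using the AWS CLI: the rule below transitions objects under the 'backups/' prefix to Glacier after 30 days and deletes them after a year. The bucket name, rule ID, and day counts are placeholders to adjust for your own retention policy:

```shell
# Hypothetical lifecycle rule for backups under the 'backups/' prefix.
# Requires the AWS CLI and credentials; bucket and day counts are placeholders.
aws s3api put-bucket-lifecycle-configuration \
  --bucket your-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-old-backups",
      "Filter": {"Prefix": "backups/"},
      "Status": "Enabled",
      "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
      "Expiration": {"Days": 365}
    }]
  }'
```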