Rails Migrations on MariaDB Galera Cluster

Written By : Bryan Traywick

Here at Rails Machine we love MariaDB Galera Cluster. It solves a lot of problems we have experienced in the past with other MySQL multi-master solutions. However, schema changes require special care with Galera Cluster. Galera Cluster provides two options for processing DDL (Data Definition Language) statements: Total Order Isolation (TOI) and Rolling Schema Upgrade (RSU). Each method has its advantages and disadvantages and understanding when to use each is important to ensuring that schema changes go smoothly in a Galera Cluster. We will go through both methods, when to use them, and a solution we developed to help manage the complexity.

Total Order Isolation

TOI is the default DDL replication method in Galera Cluster. When the master node receives a DDL statement it sends out a replication event before starting the DDL processing. Every node in the cluster will processs the replicated DDL statement during the same “slot” in the cluster transaction stream. This ensures that every node in the cluster will process the schema change at the same time.

With this guarantee you don’t have to worry about schema backwards compatibility, but there are some drawbacks. The strict commit order will make every transaction wait until DDL processing is over. Meaning that altering a table will block any queries that are trying to access that table. For a table with a large number or rows, altering a table or adding an index could take several minutes, or longer, during which the table cannot be queried.

Rolling Schema Upgrade

To allow the rest of the cluster to continue operating at full speed Galera Cluster offers the RSU method for DDL statements. During RSU the node executing a DDL statement is desynchronized from replication for the duration of the DDL processing. All incoming replication events are buffered and the node will not send replication events to the other nodes in the cluster. When DDL processing is over, the node will automatically join back into the cluster and process missed transactions from the buffer. Once the node has rejoined and caught up with the rest of the cluster, you must repeat the DDL statements on the next node in the cluster.

The RSU method will not slow down the cluster; all other transactions can complete at full speed on the two synced nodes. However, there are caveats that must be considered when using RSU. The entire session will be processed with RSU (i.e. any insert statements will not be replicated to the other nodes). Second, upgrading the schema on all nodes is a manual operation. As a result the schema changes must be backward compatible since queries will be processed against and replicated to upgraded and non-upgraded nodes.

How does this affect my Rails migrations?

The default DDL method for Galera Cluster is TOI. This means everytime you run cap production deploy:migrate (or some equivalent method of running database migrations in production, such as Moonshine!) the DDL statements are being executed simultaneously on every node in the cluster. Now, if your tables are sufficiently small, the load is low at the time, or you can afford to put your app in maintenance mode while the migrations are running, then you may never need to consider RSU. But, if your migrations take several minutes (or hours) to run and you can’t afford the downtime, then you need to consider using RSU for your migrations.

Unfortunately, there weren’t any good solutions for using RSU with Rails migrations so we wrote GaleraClusterMigrations. GaleraClusterMigrations helps to take the pain out of using RSU with Rails migrations on a MariaDB Galera Cluster.

Using GaleraClusterMigrations

GaleraClusterMigrations contains a module that can be included in your migration with methods to enable TOI and RSU. It also contains a Rake task that will run the migration against a single cluster node, and a Capistrano task for remote executing the migration on each node.

The first step is to include the GaleraClusterMigrations module in your migration and use the #with_rsu method to enable RSU for DDL statements specified inside a block:

# db/migrate/20140710000000_add_foo_to_bars.rb
class AddFooToBars < ActiveRecord::Migration
  include GaleraClusterMigrations

  def change
    with_rsu do
      add_column :bars, :foo, :integer, default: 0
    end
  end
end

The add_column statement above will be executed in RSU mode when the migration is run.

If you are using Capistrano, require galera_cluster_migrations/capistrano in your config/deploy.rb file and use the galera:migrate task to run the migration on each node:

# config/deploy.rb
...
require 'galera_cluster_migrations/capistrano'
...

You should also disable the default deploy:migrate task from running during a deploy. Moonshine users should create a “dummy” rails_migrations recipe to prevent Moonshine from migrating the database during a deploy:

# app/manifests/base_manifest.rb
...
def rails_migrations
  exec 'rake db:migrate',
    :command => 'true'
end
...

Once the migration has been deployed to your cluster, execute the migration on each node checking that the entire cluster is in the Synced state before proceeding with the next node. Be sure to use the HOSTFILTER option to specify a single node to migrate at a time.

$ cap production galera:migrate HOSTFILTER=db1.example.com
# check that db1 has rejoined the cluster and is Synced
$ cap production galera:migrate HOSTFILTER=db2.example.com
# check that db2 has rejoined the cluster and is Synced
$ cap production galera:migrate HOSTFILTER=db3.example.com
# check that db3 has rejoined the cluster and is Synced

You have now successfully upgraded the schema on every database node using RSU! During this time the two nodes not processing the migration will be available to process queries at full speed.

For migrations that you want to run using TOI there is nothing special required to migrate the database. Write the database migration as you normally would and migrate the database with cap production deploy:migrate.

Recommendations

Understanding the nuances of TOI and RSU can be challenging. Fortunately, we have compiled a set of recommendations to help you decide which method is appropriate to use:

  • Use TOI for CREATE TABLE and DROP TABLE statements.
  • Use RSU for ALTER TABLE and CREATE INDEX statements.
  • Set default values for new columns when using RSU.
  • When using RSU, ensure that your application can use the upgraded and non-upgraded tables until every node has been upgraded.
  • Use the GaleraClusterMigrations gem to ease the use of RSU with Rails migrations.

Further Reading

The following resources provide excellent details on TOI and RSU if you want to understand the nuances of each more: