Friday, April 27, 2012

Moving Master or Slave nodes in MySQL

In this post I want to discuss with you a concept which simplifies failing-over or migration of a master or slave node in a MySQL replication setup without incurring any downtime to the application - i.e. hot  failover/migration.

But before I need you to read and understand these three axioms which I collected and which I believe are the core understanding to MySQL High Availability Concepts:
  1. In a multi-master, multi-slave replication setup, a slave will only follow its one single adjacent master coordinates.
  2. In a master-master replication, transactions written to master 1 will first propagate to master 2 and then bounced back to master 1 - in the latter route, master 1 will realize that these are its original transactions and ignore them. 
  3. Place and record marker events on both the original and new master for the slave to be moved.
Ok, I made up these three axioms and may seem confusing at first. Let me explain a bit:

In the first rule I am saying that if I have this setup: M1<->M2->S1, then S1 will need the coordinates of M2 to stay in sync. Quite obvious no? Just make sure you never forget it and try to use the coordinates of M1 instead. Also for beginners, remember that a slave cannot have more than one master.

In the second rule I am stating another obvious thing but which few people put it into practical use. Consider the following setup: M1->M2->S1. How would you go about to convert it to: M1<->M2->S1, i.e. making M1 the slave of M2 without incurring any downtime whatsoever on any of the nodes? This is pretty simple if you read again this concept - if at any point in time I read the M2's master coordinates (SHOW MASTER STATUS on M2) and request M1 to be a slave of M2 using these coordinates, any transactions which have been already propagated from M1 to M2, will be ignored when they come back to M1. I will write another article to illustrate this with a detailed example.

The third rule is the concept I would like you to understand in this article - if you are able to understand it, then you can do almost all MySQL migrations without ever incurring any downtime on the setup. The problem when moving nodes is to make sure they remain consistent - so we need to know exactly when the replication stopped and then at which point within the new master log file we have to tell the slave to resume replication. This can be easily done by using Marker Events. Marker Events is just a technique used in many replication setups (not just MySQL) and it involves creating a fictitious table on the master, taking note of the coordinates within the bin log, and then ask the slave to stop when this table is created on its end. Then you can resume the slave with the previously recorded coordinates.

The best way to understand the 3rd axiom, Marker Events, is to look at this example:


Suppose we want to migrate the setup on the left hand side to the setup on the right hand side. This is a very common migration in the real world where we are requested to convert from a chained replication to a star replication setup (the inverse is also common). To do this, we need to follow these steps:

Step 1 - Stop Slaving C
On node C:
STOP SLAVE

Step 2 - Create and Record Marker Events
On node A create a marker table and determine its coordinates from the bin log events:
CREATE TABLE mydb.marker1 (c1 SMALLINT);
SHOW BINLOG EVENTS;

Call these coordinates (fileA`, posA`).

On node B determine the coordinates from the bin log events:
SHOW BINLOG EVENTS;

Call these coordinates (fileB`, posB`).

Step 3 - Sync C to the Marker Event on B
On node C:
START SLAVE UNTIL
    MASTER_LOG_FILE=fileB`, 
    MASTER_LOG_POS=posB`;

Node C will stop exactly when the marker table is created. 

Step 4 - Point C to Marker Event on A and start slave
On node C:
CHANGE MASTER TO
    MASTER_HOST=hostA,
    MASTER_LOG_FILE=fileA`, 
    MASTER_LOG_POS=posA`;
START SLAVE;

Since we know that C stopped at the Marker, we can pretty much resume at that stage! This is recorded as (fileA`, posA`) on node A! As you can see, as long as we setup and record marker events on the original and new master, we are able to seamlessly move slaves around with no required downtime. As an exercise I would recommend you to try doing the inverse (from star to chain) using the same technique!

As a side note, I would like to personally thank Rene Cannao` for introducing me to these concepts.
Post a Comment