MongoDB Manipulation, Mastery and Monkey Business

Written By : Kevin Lawver

August 28, 2012

I’m a big fan of MongoDB. I used it for a product at my last company and found it to be easy to manage and deploy and fun to develop with. We have a couple of customers here at Rails Machine that are also big fans of MongoDB and we helped one of them upgrade their replica set today from two nodes with an arbiter to three gigantic servers. You’d think that completely replacing the hardware that runs your database would be painful… but with Mongo, it’s really not. Here’s how we did it.

Since these were new servers, and our customer has a staging environment, we did a lot of testing of the new servers by adding them to the staging replica set. It was really easy to move things around using the Moonshine MongoDB plugin. Some of the tests we ran:

  • How long does it take for a new node to become a fully functioning secondary?
    • Depends on the amount of data, but not as long as you’d expect.
  • What happens when a secondary unexpectedly drops out?
    • Not much. If there was an election, it was so quick we didn’t notice.
  • What happens when the primary unexpectedly drops out?
    • There’s an election and things “flap” for a few seconds. We did this a few times, and elections took as little as 2 seconds and as long as 10.
  • Does it take longer to sync two new secondaries than one?
    • As long as there are as many secondaries as there are new nodes, then no.
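
If you want to put numbers on tests like these, it helps to boil each rs.status() document down to "who is primary, and who isn't ready yet" so repeated checks are easy to compare. What follows is a minimal sketch, not what we actually ran. It assumes only the members, name and stateStr fields of the status document, and the function name is made up for illustration.

```javascript
// Sketch: summarize a status document shaped like the one rs.status() returns.
// Written in plain ES5 so it can be pasted into the mongo shell or run in Node.
function summarizeStatus(status) {
  var members = status.members || [];
  var primary = null;
  var notReady = [];
  for (var i = 0; i < members.length; i++) {
    var m = members[i];
    if (m.stateStr === "PRIMARY") {
      primary = m.name;
    } else if (m.stateStr !== "SECONDARY" && m.stateStr !== "ARBITER") {
      // Anything else (RECOVERING, STARTUP2, ...) is still catching up or down.
      notReady.push(m.name + " (" + m.stateStr + ")");
    }
  }
  return {
    primary: primary,
    notReady: notReady,
    // "settled" means a primary exists and no member is in a transitional state.
    settled: primary !== null && notReady.length === 0
  };
}
```

In the shell you could call summarizeStatus(rs.status()) every second or so and note the time between the old primary disappearing and settled turning true again.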

After we were happy that the servers were ready, we removed them from the staging replica set and deleted their data. With Moonshine, removing them from staging and adding them to the production deployment was just moving a few lines of configuration around.

I’ve changed the server names and IP addresses for this example, so let’s pretend that things look like this:

  • Existing replica set:
    • arbiter:
    • donald:
    • daisy: - the current primary (pretty much forced to be primary because we set its priority to 2, which will matter later)
  • New ‘super’ mongo servers:
    • huey:
    • dewey:
    • louie:

We did this over two days to make sure the new servers were “happy” and ready to take over all the traffic, but here’s what we did:

  1. We deployed to all the servers to make sure the correct iptables rules were in place so that everything that needed to talk to the new Mongos could, and also to add the three new nodes to the app config (they should be in the config after the two current nodes).
  2. We confirmed that the new servers could connect to, and be connected to by, the old ones (you’ll get an error if things are wrong; otherwise you’ll connect and can run queries):
    • from huey:
      • mongo
      • mongo
    • from daisy:
      • mongo
      • mongo
      • mongo
  3. Now you need to connect to the current primary (daisy for this story) and reconfigure things. We always wanted a quorum of “up” nodes in the replica set to keep things from going south (as in, not enough functioning nodes to elect a primary, which I’ve seen before and is unpleasant), so we first need to remove the arbiter and add one of the new nodes:
    1. config = rs.conf()
    2. huey = {_id:4,host:'',hidden:true,priority:0}
    3. config.members.push(huey)
    4. config.members.splice(INDEX,1) where INDEX is the index of the arbiter in the members array.
    5. config.version++
    6. Now, before we commit this, we need to look at the config variable and make sure things make sense:
      • Are all the nodes you want to be in the list of members?
      • Do they all have their host field set to “IP:PORT”?
      • Are the new nodes set to hidden?
      • Do the new nodes all have unique _id fields?
      • Do they all have priority set to 0?
    7. rs.reconfig(config)
  4. Now you get to obsessively run rs.status() over and over again until huey is all synced up and a full-fledged member of the set (which will happen when it’s no longer RECOVERING and says SECONDARY). You may see a few things that look like error messages while this is happening:
    • “errmsg” : “initial sync need a member to be primary or secondary to do our initial sync” - This almost always means the election is taking place. Wait a few seconds and run rs.status() again.
    • “errmsg” : “initial sync cloning db: DBNAME” - This is good! That means the sync is happening.
    • “errmsg” : “syncThread: 10278 dbclient error communicating with server:” - I saw this one a couple times right after I triggered an election. I think this is the normal “I’ve just triggered an election and am switching connections” message.
    • If you see any other errors, google them, because I didn’t see them.
  5. Once the new node is a SECONDARY, you can add the other two (because there are now three healthy nodes, adding two won’t cause an imbalance):
    1. config = rs.conf()
    2. dewey = {_id:5,host:'',hidden:true,priority:0}
    3. louie = {_id:6,host:'',hidden:true,priority:0}
    4. config.members.push(dewey)
    5. config.members.push(louie)
    6. config.version++
    7. Before we commit this, go through the checklist we went through the first time we did this and make sure we don’t have any typos or other mistakes. If things are cool:
    8. rs.reconfig(config)
    9. Again, obsessively run rs.status() until everything’s happy.
  6. Once the new nodes are all listed as SECONDARY in rs.status(), you’ve successfully added the new nodes to the replica set. This is where we stopped on day one so we could watch things to make sure everything was fine with the new nodes. But, once the new nodes are secondaries, you can continue with dropping the old nodes:
    1. We created a pull request that had the updated app config that removes the two old nodes.
    2. We also stopped all the Resque workers at this point to make sure we didn’t cause any jobs to fail during the election.
    3. Open up the mongo console on the current primary and get rolling!
    4. config = rs.conf()
    5. For each of the hidden nodes in the members list:
      • config.members[x].hidden = false
      • config.members[x].priority = 1
    6. Now we need to give one of the new nodes a priority higher than the current primary’s, so:
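      • config.members[3].priority = 3 (3 was the index of our chosen node in the members array; the priority just needs to be higher than daisy’s 2)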
    7. config.version++
    8. Look at the config variable again and make sure it’s got all the right stuff in it and then:
    9. rs.reconfig(config)
    10. Do the rs.status() dance until the new primary is elected. There may be a considerable amount of “flapping” during the election. I’ve seen it take as little as 2 seconds and as long as 10 for a new primary to be elected. Just keep checking rs.status() until things calm down.
  7. After the new primary is elected, we merged the pull request that removes the old nodes from the app config and deployed.
  8. Once the deploy was done and the app was up and running talking to the new primary, it was time to remove the old nodes from the replica set.
    1. Connect to the new primary’s mongo console.
    2. config = rs.conf()
    3. And now we need to splice out the old nodes. For each of the old nodes:
      • config.members.splice(INDEX,1) (where INDEX is the index of the old node we’re removing; recheck it after each splice, because the remaining members shift down)
    4. config.version++
    5. Go through the checklist again, and this time, make sure the old nodes are no longer in the members array.
    6. rs.reconfig(config)
    7. There might be some more flapping here as it disconnects the old nodes. We definitely saw a few seconds of “weirdness” when we did it.
  9. That’s pretty much it!
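
The config edits in the steps above are all hand-counted indexes and eyeball checks, so here is a minimal sketch of how they could be wrapped in small helpers. None of this is what we actually typed: the function names are hypothetical, the host names in the usage comment are placeholders, and the helpers only touch a plain object shaped like the one rs.conf() returns. It is plain ES5, so it should paste into the mongo shell or run in Node.

```javascript
// Build a hidden, priority-0 member like the huey/dewey/louie entries.
function hiddenMember(id, host) {
  return { _id: id, host: host, hidden: true, priority: 0 };
}

// Find a member by host so INDEX does not have to be counted by hand.
function indexOfHost(config, host) {
  for (var i = 0; i < config.members.length; i++) {
    if (config.members[i].host === host) return i;
  }
  return -1;
}

// Splice out one member (the arbiter, or an old node) by host.
function removeHost(config, host) {
  var i = indexOfHost(config, host);
  if (i === -1) throw new Error("no such member: " + host);
  config.members.splice(i, 1);
}

// The pre-reconfig checklist. Returns a list of problems; empty means OK.
function checkNewMembers(config, newHosts) {
  var problems = [];
  var seenIds = {};
  config.members.forEach(function (m) {
    if (seenIds[m._id]) problems.push("duplicate _id " + m._id);
    seenIds[m._id] = true;
    if (!/^[^:]+:\d+$/.test(m.host)) problems.push("host is not IP:PORT: " + m.host);
  });
  newHosts.forEach(function (host) {
    var i = indexOfHost(config, host);
    if (i === -1) { problems.push("missing member " + host); return; }
    if (config.members[i].hidden !== true) problems.push(host + " is not hidden");
    if (config.members[i].priority !== 0) problems.push(host + " priority is not 0");
  });
  return problems;
}

// Step 6: unhide the new members and give one of them the winning priority.
function promoteNewMembers(config, newHosts, preferredHost, preferredPriority) {
  newHosts.forEach(function (host) {
    var i = indexOfHost(config, host);
    if (i === -1) throw new Error("no such member: " + host);
    config.members[i].hidden = false;
    config.members[i].priority = host === preferredHost ? preferredPriority : 1;
  });
}

// Possible use in the mongo shell (placeholder hosts):
//   var config = rs.conf();
//   config.members.push(hiddenMember(4, "10.0.0.4:27017"));
//   removeHost(config, "10.0.0.1:27017");
//   if (checkNewMembers(config, ["10.0.0.4:27017"]).length === 0) rs.reconfig(config);
```

Looking members up by host rather than by position avoids the easy mistake of splicing the wrong entry when the array shifts after a removal. It is still worth printing the config and reading it before calling rs.reconfig().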

Overall, the entire process went very smoothly. The only issue we had was that when we removed the old nodes, the apps lost their connection to MongoDB entirely and refused to connect. An Apache restart fixed that. We think it was a “failed” restart during the deploy that didn’t restart all of the Passenger instances. That was the only real downtime during the entire migration, and it lasted only a couple of minutes while we restarted Apache.

Having done this process a few times now, and having done this with other database systems in the past, I’m really impressed with how easy it is to manipulate replica sets with MongoDB. It’s a lot easier than I originally thought it would be. Running three instances for a replica set is more expensive than the regular master/slave setup you see with traditional databases, but it makes a lot of sense and works really well in the “real world”.

I’d love to hear how other folks have done this kind of thing with MongoDB!

And as a congratulations for getting to the bottom of this post, here’s a photo of a corgi:

Corgi in the leaves