Redis as a Tool For Managing State in Simple Clustered Processes

April 5th, 2013

The Redis datastore is a great tool for managing state in a backend cluster of simple processes that all perform the same function. For the purposes of this post we'll say that the cluster consists of Node.js processes. In Node.js it is essential to cluster multiple processes even when building comparatively small services. On the one hand there is the need for redundancy, so that the frontend can continue to provide service if (more usually when) a single backend process dies. On the other hand, Node.js is effectively a single thread of operation for your code, so you need to run multiple Node.js processes in order to take advantage of the fact that your server has more than one processor.
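
To make that concrete, here is a minimal sketch of one common way to do it, using Node's built-in cluster module to fork a worker per processor; the ./server module name is just a placeholder for whatever your actual service code is:

// Fork one worker process per CPU; the master only manages workers.
var cluster = require("cluster");
var os = require("os");

if (cluster.isMaster) {
  os.cpus().forEach(function () {
    cluster.fork();
  });
  // Replace any worker that dies, to preserve redundancy.
  cluster.on("exit", function (worker) {
    cluster.fork();
  });
} else {
  // Each worker runs the actual service code.
  require("./server");
}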

So let us say that your simple process has a state that can be represented in an object, such that somewhere in the process launch script we find:

// Load the state from Redis
var redis = require("redis");
var client = redis.createClient();
var state = {};
client.get("application:state", function (error, stateJson) {
  if (error) {
    // Logging and error handling.
  } else {
    try {
      state = JSON.parse(stateJson);
    } catch (e) {
      // Logging and error handling.
    }
  }
});

Elsewhere in the code there is a way to update the process state. Perhaps state changes occur as a result of actions in an administration interface - something that happens in only one process but must be propagated to all processes. Storing the updated state in Redis can be as simple as the following code if overlapping updates are not expected, or can be disregarded with the last write winning (a way to guard against them is sketched after the example):

// Store the state to Redis
client.set("application:state", JSON.stringify(state), function (error) {
  if (error) {
    // Logging and error handling.
  }
});
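
If overlapping updates do matter, Redis's optimistic locking can guard the write. The following is just a sketch of that approach using WATCH and MULTI - in practice you would also re-read the current state between the WATCH and the MULTI, and decide on a retry policy:

// Guard the write with optimistic locking: the MULTI block is discarded
// if another client changes the key after WATCH.
client.watch("application:state", function (error) {
  if (error) {
    // Logging and error handling.
    return;
  }
  client.multi()
    .set("application:state", JSON.stringify(state))
    .exec(function (error, replies) {
      if (error) {
        // Logging and error handling.
      } else if (!replies) {
        // The key changed under us; reload the state and retry if needed.
      }
    });
});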

But how to propagate notice of this change to all processes, so that they can reload the state? Redis comes in handy here too: it has easy-to-use publish/subscribe functionality. A Redis client can be subscribed to one or more channels (at which point it can't be used for anything else, so a separate client must be created for other needs), and will receive published messages as they arrive. In the case of Node.js, a subscribed Redis client emits events.

In the hypothetical process startup script, then, we need to create a dedicated Redis client and subscribe it to a channel:

var channel = "application:channel";
// Create an additional client to subscribe and listen to the channel.
var subscribeClient = redis.createClient();
// Sign up for channel messages.
subscribeClient.subscribe(channel, function (error) {
  if (error) {
    // Error logging.
  }
});
// React to channel messages.
subscribeClient.on("message", function (channel, json) {
  var data;
  try {
      data = JSON.parse(json);
  } catch (e) {}
  if (data) {
    // Process the received message.
  }
});

Sending messages to the channel - and thus to all listening Node.js processes - is a simple matter. To send a message telling the processes to reload their states from the Redis datastore, for example:

var data = {
  task: "reloadState"
};
client.publish(channel, JSON.stringify(data), function (error) {
  if (error) {
    // Logging and error handling.
  }
});
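
Tying the two halves together, the "Process the received message" placeholder in the subscriber code might check for this task and re-read the state using the original, non-subscribed client. A minimal sketch:

// Inside the message handler: act on a "reloadState" instruction.
if (data.task === "reloadState") {
  client.get("application:state", function (error, stateJson) {
    if (error) {
      // Logging and error handling.
    } else if (stateJson) {
      try {
        state = JSON.parse(stateJson);
      } catch (e) {
        // Logging and error handling.
      }
    }
  });
}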

It should be clear that you could construct any number of simple message systems using this sort of strategy - and it's really a very convenient tool for clustering simple Node.js processes, wherein all you really care about is keeping process states more or less synchronized. So I threw together an example implementation of Redis-backed communication between Node.js processes called simple-cluster, which you can find at GitHub or install via NPM. It is used as follows:

var simpleCluster = require("simple-cluster");
// Start things running with default options.
var instance = simpleCluster.start();
// Set up an example listener. Arbitrary event names can be used.
instance.on("exampleEventName", function (data) {
  // Take action.
});

// Send a message to all listening processes, including this one.
var data = {
  // Message data goes here.
};
instance.sendToAll("exampleEventName", data, function (error) {
  if (error) {
    // Error logging.
  }
});