Run Multiple Cron Tasks in Parallel in Drupal

February 11th, 2012

Today's topic is cron in Drupal 7, and how to get tasks to run in parallel with a minimum of effort. If you dig around in the Drupal core code, you'll see that cron tasks defined by modules are run in sequence - one hook_cron() invocation after another. Take a look at drupal_cron_run() in /includes/common.inc:

function drupal_cron_run() {

  ...

    // Iterate through the modules calling their cron handlers (if any):
    module_invoke_all('cron');

    // Record cron time
    variable_set('cron_last', REQUEST_TIME);
    watchdog('cron', 'Cron run completed.', array(), WATCHDOG_NOTICE);

The same goes for any queues defined through hook_cron_queue_info(). A few lines further on you'll see that queues are also processed in sequence.

  foreach ($queues as $queue_name => $info) {
    $function = $info['worker callback'];
    $end = time() + (isset($info['time']) ? $info['time'] : 15);
    $queue = DrupalQueue::get($queue_name);
    while (time() < $end && ($item = $queue->claimItem())) {
      $function($item->data);
      $queue->deleteItem($item);
    }
  }
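
To put rough numbers on that loop: each queue gets a time budget ('time', or 15 seconds by default), and the budgets are spent one after another. The sketch below is a back-of-envelope estimate only. Both helper functions and the queue names are hypothetical, not part of Drupal, and the figures are approximate because the loop only checks the clock between items, so a single slow item can overshoot its queue's budget.

```php
<?php

/**
 * Hypothetical helper (not part of Drupal): approximate time budget, in
 * seconds, when busy queues are drained one after another.
 */
function cron_example_sequential_budget(array $queues, $default_time = 15) {
  $total = 0;
  foreach ($queues as $info) {
    // Same fallback as the core loop: 15 seconds if no 'time' is given.
    $total += isset($info['time']) ? $info['time'] : $default_time;
  }
  return $total;
}

/**
 * Hypothetical helper: approximate budget if every queue had its own process.
 */
function cron_example_parallel_budget(array $queues, $default_time = 15) {
  $longest = 0;
  foreach ($queues as $info) {
    $longest = max($longest, isset($info['time']) ? $info['time'] : $default_time);
  }
  return $longest;
}

// Ten busy queues that each ask for 60 seconds (illustrative names).
$queues = array();
for ($index = 0; $index < 10; $index++) {
  $queues['example_queue_' . $index] = array('time' => 60);
}

print cron_example_sequential_budget($queues) . "\n"; // 600
print cron_example_parallel_budget($queues) . "\n";   // 60
```

In other words, ten queues with plenty of work could keep a single cron run busy for around ten minutes in sequence, against roughly one minute if each queue had a process of its own.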

This is not really an issue for most production Drupal sites: all cron tasks complete quickly, in an elapsed time that is short compared to the interval between cron runs. But what about other scenarios? Suppose, for example, that your site generates thousands of tasks every hour that each require multiple calls to distant servers across the internet - so they are slow, but not particularly computationally intensive. Or your cron run consists of hundreds of tasks and one summary of all those tasks, but you must have that summary as quickly as possible: waiting for the tasks to complete in sequence is impractical for your business case. Or your site is simply a monster of many, many modules and tasks, cron is just running way too long, and you wish you could make use of all the computing power in the many processing cores that rack servers ship with these days.

Running cron tasks in parallel across many processes is the obvious solution to these issues. For a newcomer the best toolkit is probably the Ultimate Cron module, which provides a pleasant interface for cron tasks and queues and uses Background Process to manage parallel execution of tasks. It has enough support and use in the community to generate a modest halo of related modules and enhancements, so you're likely to be able to satisfy your exact requirements without too much coding.

The example I'll give here is a simple one of using Ultimate Cron - version 7.x-1.6 at the time of writing - to plow through many tasks in parallel queues rather than leaving core cron to chug through those tasks one at a time. Assuming that this all takes place in a "cron_example" module:

/**
 * Implements hook_cron_queue_info().
 *
 * Set up ten queues that all send items to the same callback function.
 */
function cron_example_cron_queue_info() {
  $queues = array();
  for ($index = 0; $index < 10; $index++) {
    $queues['cron_example_queue_' . $index] = array(
      'worker callback' => 'cron_example_consume_queue_item',
      'time' => 60,
    );
  }
  return $queues;
}

/**
 * Queue worker callback: receives an item from the queue and performs
 * some mysterious processing task based on the item data.
 *
 * @param mixed $item whatever was added to the queue via queue->createItem()
 */
function cron_example_consume_queue_item($item) {

  // mysterious processing task goes here

}
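
Because all ten queues feed the same callback, it doesn't matter which queue a given item lands in. One way to spread a real workload evenly is simple round-robin assignment, sketched below. Both helper functions are hypothetical names invented for this example, and the default queue count of 10 is assumed to match the loop in cron_example_cron_queue_info(). The second helper only works inside a bootstrapped Drupal 7 site, where DrupalQueue is available.

```php
<?php

/**
 * Hypothetical helper: picks one of the example queues for the Nth task,
 * round-robin. $queue_count must match cron_example_cron_queue_info().
 */
function cron_example_queue_name($task_number, $queue_count = 10) {
  return 'cron_example_queue_' . ($task_number % $queue_count);
}

/**
 * Hypothetical helper: spreads a list of tasks across the example queues.
 *
 * Only call this inside Drupal 7, where the DrupalQueue class exists.
 */
function cron_example_enqueue_all(array $tasks) {
  // array_values() gives us 0, 1, 2, ... regardless of the original keys.
  foreach (array_values($tasks) as $task_number => $task) {
    $queue = DrupalQueue::get(cron_example_queue_name($task_number));
    $queue->createItem($task);
  }
}
```

With this scheme tasks 0, 10, 20 and so on share cron_example_queue_0, tasks 1, 11, 21 share cron_example_queue_1, and each queue ends up with roughly a tenth of the work.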

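If you enable cron_example and Ultimate Cron in your system, and navigate to the cron configuration page, you'll see your ten queues showing up as distinct cron tasks in the table provided by Ultimate Cron. These queues should in theory execute in parallel. To test that, we could add a hook_cron() implementation and replace the stub worker callback above with one that logs and then sleeps (replace it rather than adding a second copy, since PHP will not accept two functions with the same name):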

/**
 * Implements hook_cron().
 *
 * Drops an item into each of the example queues.
 */
function cron_example_cron() {
  for ($index = 0; $index < 10; $index++) {
    $queue = DrupalQueue::get('cron_example_queue_' . $index);
    if ($queue && !$queue->numberOfItems()) {
      $item = array('index' => $index);
      $queue->createItem($item);
    }
  }
}

/**
 * Queue worker callback: receives an item from the queue and performs
 * some mysterious processing task based on the item data.
 *
 * @param mixed $item whatever was added to the queue via queue->createItem()
 */
function cron_example_consume_queue_item($item) {
  // log and then hold things up for 10 seconds - which is a fair simulation of
  // processes that have to make many calls out to services across the internet.
  error_log('Queue ' . $item['index']);
  sleep(10);
}

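When the code above is on your server, the Ultimate Cron task list will show a default cron task entry for the cron_example module - and you can run that by using the "Run" link for the task, or by requesting /cron.php (Drupal 7 expects the site's cron key in the cron_key query parameter). The cron task drops a single item into each of your queues, and each item logs a line and then waits for 10 seconds: this is a quick and easy way to determine whether or not tasks are in fact running in parallel. Just look at the timestamps in your log: if the queues ran in parallel the ten entries should land close together, whereas in sequence they would be about 10 seconds apart.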
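
If you'd rather not eyeball the log, the check can be written down as a small function. This is only a sketch: the function name is hypothetical, it assumes every task takes the same fixed time (10 seconds here, because of the sleep(10) call), and it expects you to supply the start times yourself as Unix timestamps, since the exact log format depends on your PHP configuration.

```php
<?php

/**
 * Hypothetical helper: TRUE if at least two tasks overlapped in time.
 *
 * @param array $start_times Unix timestamps at which each task began.
 * @param int $task_seconds How long every task takes (10 in the example).
 */
function cron_example_tasks_overlapped(array $start_times, $task_seconds) {
  // Work on a sorted copy; the caller's array is passed by value.
  sort($start_times);
  for ($i = 1; $i < count($start_times); $i++) {
    // A task that starts before its predecessor could have finished must
    // have been running alongside it.
    if ($start_times[$i] - $start_times[$i - 1] < $task_seconds) {
      return TRUE;
    }
  }
  return FALSE;
}
```

Start times that are all a second or two apart would return TRUE for a 10-second task, while start times spaced 10 or more seconds apart would return FALSE, which is what purely sequential processing looks like.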