Drupal: Trim All Form Fields Everywhere

June 10th, 2012

Update 02/2015: Benjamin Melançon was kind enough to adopt and clean up this code. It is now published as the Drupal module Trim.

It is a real pain to have to think about trimming strings submitted in Drupal forms every time you write a validate or submit callback function. In any major Drupal project you probably have a bunch of custom forms and a fair number of cases where the difference between "value" and "value " is important. That is not to mention all the third-party code that doesn't trim entered values. Users, eternally fat-fingered as they are, will eventually manage to hit all of the sore points thus created in your shiny new website. "Why can't I load this item by name?" and so forth.

One approach to this issue is to ignore it and hope it goes away. This is popular among contractors, so I hear. Another, possibly better, methodology is to globally trim every text field value at the form validation step. This is fairly easy to accomplish with a simple custom module, here called "trim". First of all, you will want to add a new validation function to every form, one that always goes first:

/**
 * Implements hook_form_alter().
 */
function trim_form_alter(&$form, &$form_state, $form_id) {
  // Ensure that there is an array here.
  if (!isset($form['#validate'])) {
    $form['#validate'] = array();
  }
  // And if someone has set it as a string, fix that issue. You'd be surprised.
  if (!is_array($form['#validate'])) {
    $form['#validate'] = array($form['#validate']);
  }
  // Now add a new function to the list, but ensure that it is called first.
  array_unshift($form['#validate'], 'trim_form_values');
}

Depending on how careful and paranoid you are, you may want to give the trim module a hook_install() implementation that ensures trim_form_alter() will most likely be called after every other hook_form_alter() implementation.

/**
 * Implements hook_install().
 */
function trim_install() {
  // Set a large enough weight to be fairly sure of going last.
  db_update('system')
    ->fields(array('weight' => 1001))
    ->condition('name', 'trim', '=')
    ->execute();
}
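
As an alternative to adjusting the module weight, Drupal 7 also provides hook_module_implements_alter(), which lets a module reorder the implementations of a single hook. The following is a minimal sketch of that approach and was not part of the original module: it moves the trim module's entry to the end of the list for hook_form_alter() only, leaving the module's weight alone.

```php
<?php
/**
 * Implements hook_module_implements_alter().
 *
 * Sketch only: moves this module's hook_form_alter() implementation to the
 * end of the list so that it runs after the other modules' implementations.
 */
function trim_module_implements_alter(&$implementations, $hook) {
  if ($hook == 'form_alter' && array_key_exists('trim', $implementations)) {
    // Removing the entry and adding it back places it last in the array.
    $group = $implementations['trim'];
    unset($implementations['trim']);
    $implementations['trim'] = $group;
  }
}
```

Drupal caches the list of hook implementations, so a change like this usually only takes effect after the caches are cleared.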

Now, that first-in-line validation function should run through the submitted values and trim them all, bearing in mind that $form_state['values'] may or may not be collapsed into a single level array.

/**
 * Validation callback function. Trim the values of the form.
 */
function trim_form_values($form, &$form_state) {
  trim_array_values($form_state['values']);
}

/**
 * Trim all the values of the provided array, and those in all nested arrays.
 *
 * @param array $array
 *   An array of values.
 */
function trim_array_values(&$array) {
  foreach ($array as &$value) {
    if (is_string($value)) {
      $value = preg_replace('/^\s+|\s+$/', '', $value);
    }
    elseif (is_array($value)) {
      trim_array_values($value);
    }
  }
  // Break the reference left behind by the foreach loop.
  unset($value);
}

Note that this simple example fails to detect recursion - I'll leave that as an exercise for the reader, but there should be no recursion in form values.
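
For readers who would rather not do the exercise, one simple guard is a nesting limit. This is not true recursion detection, only a cap on how deep the function will descend. The sketch below is a hypothetical variant: the function name, the $max_depth parameter and its default of 32 are all assumptions chosen for illustration, and the sample values are made up.

```php
<?php
/**
 * Hypothetical variant of trim_array_values() with a nesting limit.
 *
 * @param array $array
 *   An array of values, trimmed in place.
 * @param int $max_depth
 *   How many levels of nested arrays to descend into. Assumed default.
 */
function example_trim_array_values_limited(array &$array, $max_depth = 32) {
  foreach ($array as &$value) {
    if (is_string($value)) {
      $value = preg_replace('/^\s+|\s+$/', '', $value);
    }
    elseif (is_array($value) && $max_depth > 0) {
      // Descend one level and spend one unit of the depth budget.
      example_trim_array_values_limited($value, $max_depth - 1);
    }
  }
  // Break the reference left behind by the foreach loop.
  unset($value);
}

// Illustrative values, loosely shaped like a nested $form_state['values'].
$values = array(
  'name' => ' alice ',
  'address' => array('city' => "Paris \n", 'zip' => 75001),
);
example_trim_array_values_limited($values);
// Strings at both levels are now trimmed; the integer is left untouched.
```

Once the depth budget is used up, deeper arrays are simply left as they were, so a self-referencing array cannot keep the function running forever.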

Another point: arguably it's acceptable to use trim() rather than preg_replace(), given that Drupal uses UTF-8 to encode strings and all trim() does is strip ASCII-range whitespace characters. In UTF-8, every byte of a multi-byte character is 0x80 or higher, so stripping single ASCII bytes from the ends of a string cannot cut a character in half. Being forced to think about all of the fine details regarding which single-byte string functions are safe to use under various encodings and circumstances is one of the many fun things that localization and internationalization inflict upon developers.
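
To make the comparison concrete, here is a small standalone sketch using sample strings of my own choosing. It runs both approaches over a UTF-8 string padded with ASCII whitespace, then shows one consequence of working only in the ASCII range: a UTF-8 non-breaking space survives trim().

```php
<?php
// "été" encoded as UTF-8 (é is the byte pair C3 A9), padded with ASCII
// whitespace on both sides.
$padded = " \t\xC3\xA9t\xC3\xA9 \r\n";

$with_trim = trim($padded);
$with_preg = preg_replace('/^\s+|\s+$/', '', $padded);

// Both calls remove the ASCII padding and leave the multi-byte characters
// intact, so the two results are identical here.
$same_result = ($with_trim === $with_preg);

// A UTF-8 non-breaking space (U+00A0, bytes C2 A0) is not ASCII whitespace,
// so trim() leaves it in place.
$nbsp_padded = "\xC2\xA0value";
$nbsp_survives = (trim($nbsp_padded) === $nbsp_padded);
```

Whether a leading or trailing non-breaking space should count as whitespace is a decision for the site. The point is only that neither trim() nor the pattern above is aimed at it.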