Regular Expressions

Perl Back References


In the following example, I capture the W of my first name in a parenthesized subpattern and reuse it, through the backreference \1, to match the W in my last name:
/([Ww])illiam \1einman/

William Weinman will match
william weinman will match
William weinman will not match

If the expression in parentheses matched a capital W, then \1 will match only another capital W, and vice versa.
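As a quick check, the three cases above can be run through the pattern. This is a minimal sketch; the names `$re`, `@names`, and `%result` are chosen here for illustration only.

```perl
use strict;
use warnings;

# \1 must repeat exactly what the first group ([Ww]) captured.
my $re = qr/([Ww])illiam \1einman/;

my @names = ('William Weinman', 'william weinman', 'William weinman');

my %result;
for my $name (@names) {
    # Record whether each candidate string matches the pattern.
    $result{$name} = ($name =~ $re) ? 'match' : 'no match';
    print "$name: $result{$name}\n";
}
```

The third string fails because the group captured a capital W, so \1 cannot match the lowercase w that follows.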

Relative backreferences

Counting the opening parentheses to get the correct number for a backreference is error-prone as soon as there is more than one capturing group.
A more convenient technique became available with Perl 5.10: relative backreferences. To refer to the immediately preceding capture group, one may now write \g{-1}; the group before that is available via \g{-2}, and so on.
Besides readability and maintainability, there is another good reason to use relative backreferences. It is illustrated by the following example, which uses a simple pattern for matching peculiar strings:
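To make the counting concrete, here is a small sketch that assumes Perl 5.10 or later. The pattern `$mirror` and the test strings are hypothetical, picked only to show which group each relative reference points at.

```perl
use strict;
use warnings;

# Hypothetical pattern: two letters followed by the same two letters reversed.
# \g{-1} is the nearest preceding group, \g{-2} the one before it.
my $mirror = qr/^([a-z])([a-z])\g{-1}\g{-2}$/;

my $abba_ok = ('abba' =~ $mirror) ? 1 : 0;   # second half mirrors the first
my $abab_ok = ('abab' =~ $mirror) ? 1 : 0;   # second half repeats, not mirrors

print "abba: $abba_ok, abab: $abab_ok\n";
```

Here \g{-1} stands for the second group and \g{-2} for the first, whatever absolute numbers those groups happen to have.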
$a99a = '([a-z])(\d)\g2\g1';   # matches a11a, g22g, x33x, etc.

Now that we have this pattern stored as a handy string, we might feel tempted to use it as a part of some other pattern:

$line = "code=e99e";
if ($line =~ /^(\w+)=$a99a$/){   # unexpected behavior!
 print "$1 is valid\n";
} else {
 print "bad line: '$line'\n";
}

But this doesn't match, at least not the way one might expect. Only after inserting the interpolated $a99a and looking at the resulting full text of the regexp is it obvious that the backreferences have backfired. The subexpression (\w+) has snatched number 1 and demoted the groups in $a99a by one rank. This can be avoided by using relative backreferences:
$a99a = '([a-z])(\d)\g{-1}\g{-2}';  # safe for being interpolated
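The two variants can be compared side by side when interpolated into the larger pattern. This is a sketch only; the variable names `$absolute` and `$relative` are introduced here for illustration and stand in for the two versions of $a99a.

```perl
use strict;
use warnings;

my $absolute = '([a-z])(\d)\g2\g1';          # numbering shifts when a group precedes it
my $relative = '([a-z])(\d)\g{-1}\g{-2}';    # counts back from its own position

my $line = "code=e99e";

# After interpolation, (\w+) becomes group 1, so \g2 and \g1 in $absolute
# now refer to ([a-z]) and (\w+) instead of (\d) and ([a-z]).
my $absolute_matches = ($line =~ /^(\w+)=$absolute$/) ? 1 : 0;

# The relative references still point at the two groups just before them.
my $relative_matches = ($line =~ /^(\w+)=$relative$/) ? 1 : 0;

print "absolute: $absolute_matches, relative: $relative_matches\n";
```

With the relative form, the enclosing pattern can add or remove capture groups in front of the interpolated string without changing what its backreferences mean.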