Lesson 3: Regular expression reference
Objective: Write a regular expression that will catch most common misspellings of your name.

Regular Expression Reference

Perl regular expressions are based on the standard egrep-style (so-called Version 8) regexps. These regexes perform pattern matching according to a set of rules. The basic rules are explained in this lesson.
For the purpose of the examples in this discussion, we will use the simple form of Perl's pattern-matching operator (m//).

For a review of the match operator, see "The match operator" lesson from Module 3.
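
As a quick reminder of what the simple form looks like, here is a minimal sketch. The helper names mentions_perl and mentions_perl_short are hypothetical and exist only for illustration; the point is that =~ binds a string to the m// operator, and that the leading m may be dropped when the delimiters are slashes.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $text contains the text "perl"
# (lowercase), otherwise 0. Uses the explicit m// form.
sub mentions_perl {
    my ($text) = @_;
    return $text =~ m/perl/ ? 1 : 0;
}

# Same test with the leading "m" omitted, which is allowed
# when the delimiters are slashes.
sub mentions_perl_short {
    my ($text) = @_;
    return $text =~ /perl/ ? 1 : 0;
}

print mentions_perl("I like perl"), "\n";          # prints 1
print mentions_perl_short("I like Python"), "\n";  # prints 0
```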

Pattern-Matching Rules

There are a lot of details in this lesson that we will be using later on in the module. Be sure to read each of the paragraphs below as well as the linked pages from this lesson. In addition, we will apply the regular expressions discussed to the yes/no if structure we examined in the previous lesson.
Any single character matches itself, unless it is one of the recognized metacharacters.

1) Perl Metacharacters Example

Note: By now you have noticed that some characters in regexes have a special meaning. These are called metacharacters. The following are the metacharacters that Perl regular expressions recognize:
{} [] () ^ $ . | * + ? \

If you want to match the literal version of any of those characters, you must precede it with a backslash, \. As you go through the chapter, the meaning of these metacharacters will become clear.

For example,
/$15/
will not look for the literal text $15 in this string, because the unescaped $ is not treated as a literal dollar sign:
Can I borrow $15?

If a metacharacter appears in your expression and you want to match that character itself, you must escape it with a backslash. In the search example above, use:
/\$15/
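
The escaping rule can be tried out with a short sketch. The helpers asks_for_15 and contains_literal are hypothetical names introduced here for illustration. The second one relies on Perl's built-in quotemeta function, which the lesson does not cover, as one possible way to escape every metacharacter in a string without typing the backslashes by hand.

```perl
use strict;
use warnings;

# Single quotes keep Perl from interpolating $15 inside the string itself.
my $sentence = 'Can I borrow $15?';

# Hypothetical helper: returns 1 if the text contains a literal "$15".
sub asks_for_15 {
    my ($text) = @_;
    return $text =~ /\$15/ ? 1 : 0;    # \$ is a literal dollar sign
}

# Hypothetical helper: backslash-escapes the metacharacters in $literal
# with quotemeta, then searches for the result as a pattern.
sub contains_literal {
    my ($text, $literal) = @_;
    my $escaped = quotemeta($literal);
    return $text =~ /$escaped/ ? 1 : 0;
}

print asks_for_15($sentence), "\n";                         # prints 1
print contains_literal($sentence, '$15?'), "\n";            # prints 1
print contains_literal('Can I borrow $150', '$15?'), "\n";  # prints 0
```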
You can also use special metacharacters to match the beginning or end of a line or string.

2) Perl Special Metacharacters

The following special metacharacters have these special meanings:
  1. ^ matches the beginning of the line or string.
  2. $ matches the end of the line or string.

When the special metacharacter ^ is used outside of a bracketed character class, it means "the beginning of a line or string." However, when ^ is the first character inside a bracketed character class, it negates the class, so the class matches any one character that is not listed. Here is an example of how you would apply the special metacharacters to our yes/no if structure:

if($input=~/^[Yy](es)?$/)
   { print "Let's play!\n" }
else
   { print "Okay. Thanks anyway.\n" }
Let's examine the regular expression:
/^[Yy](es)?$/

  1. ^ matches the beginning of the string.
  2. [Yy] matches one character from the set Y or y.
  3. (es)? matches the pattern es either 0 or 1 times.
  4. $ matches the end of the string.
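
To see how the pieces described above work together, the following sketch wraps the lesson's pattern in a hypothetical helper named is_yes and runs it against a few sample answers. The sample inputs are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the lesson's pattern:
# returns 1 if the input is y, Y, yes, or Yes; otherwise 0.
sub is_yes {
    my ($input) = @_;
    return $input =~ /^[Yy](es)?$/ ? 1 : 0;
}

# Try the pattern against several made-up answers.
for my $answer ('y', 'Y', 'yes', 'Yes', 'yeah', 'oh yes', 'YES') {
    my $result = is_yes($answer) ? 'match' : 'no match';
    print "'$answer' => $result\n";
}
```

Only y, Y, yes, and Yes match. The ^ anchor rejects "oh yes", the $ anchor rejects "yeah", and "YES" fails because (es)? is case sensitive. Note that $ also matches just before a trailing newline, so input read from the keyboard matches even if it has not been chomped.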

The remaining pattern-matching rules are:
  1. Brackets are used to create your own class of characters.
  2. The backslash (\) character is used to create special escape characters for matching some nonalphanumerics and classes of characters.
  3. The period (.) matches any character (except \n). To match a period itself, use \. or [.].
  4. Alternate matches can be specified using | to separate them.
  5. Within a pattern, you can specify subpatterns for later reference by enclosing them in parentheses. You can refer to those subpatterns later in the pattern by using \1, \2, and so on, where the number refers back to the first, second, or nth subpattern. These are called back-references.
  6. You can repeat a pattern several times by following a character, class, or parenthesized expression with one of these quantifiers: *, +, ?, {n}, {n,}, or {n,m}.
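
The rules in the list above can be sketched one at a time. Every helper name and sample pattern below is hypothetical and chosen only for illustration; each helper returns 1 on a match and 0 otherwise.

```perl
use strict;
use warnings;

# Character class: [aeiou] matches any one lowercase vowel.
sub has_vowel { return $_[0] =~ /[aeiou]/ ? 1 : 0 }

# Negated class: a leading ^ inside the brackets means "anything but".
sub has_non_digit { return $_[0] =~ /[^0-9]/ ? 1 : 0 }

# Escapes: \d matches one digit, \. matches a literal period.
sub looks_like_price { return $_[0] =~ /^\d+\.\d\d$/ ? 1 : 0 }

# Alternation: either word is accepted.
sub is_pet { return $_[0] =~ /^(cat|dog)$/ ? 1 : 0 }

# Back-reference: \1 must repeat whatever the first parentheses captured.
sub has_doubled_word { return $_[0] =~ /\b(\w+) \1\b/ ? 1 : 0 }

# Quantifier: o{2,4} allows between two and four o's in a row.
sub is_goal_cheer { return $_[0] =~ /^go{2,4}al$/ ? 1 : 0 }

print has_vowel('rhythm'), "\n";                  # prints 0
print has_non_digit('12a45'), "\n";               # prints 1
print looks_like_price('12.50'), "\n";            # prints 1
print is_pet('dog'), "\n";                        # prints 1
print has_doubled_word('this is is odd'), "\n";   # prints 1
print is_goal_cheer('goal'), "\n";                # prints 0
```

The same building blocks apply to the exercise: a class or alternation can cover letters that are commonly swapped, and ? can make an easily dropped letter optional.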

Perl Spell Check Name - Exercise

Click the exercise link below to write a regular expression that will catch misspellings of your name.
Perl Spell Check Name - Exercise
