Regular expression examples

This is an old revision of this page, as edited by 65.215.72.62 (talk) at 17:46, 21 March 2005 (Split RE examples off from main Perl article). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Here are some examples of Perl regular expressions.

Regular Expression Description Example
Note that the condition of every if statement below evaluates to true.
. Matches an arbitrary character, but not a newline.
$string1 = "Hello World\n";
if ($string1 =~ m/...../) {
  print "$string1 has length >= 5\n";
}
( ) Groups a series of pattern elements into a single element. When you match a pattern within parentheses, you can later use $1, $2, ... to refer to the text each group matched.
$string1 = "Hello World\n";
if ($string1 =~ m/(H..).(o..)/) {
  print "We matched '$1' and '$2'\n";
}

Output:
We matched 'Hel' and 'o W'
+ Matches the preceding pattern element one or more times.
$string1 = "Hello World\n";
if ($string1 =~ m/l+/) {
  print "There are one or more consecutive l's in $string1\n";
}
? Matches the preceding pattern element zero or one times.
$string1 = "Hello World\n";
if ($string1 =~ m/H.?e/) {
  print "There is an 'H' and an 'e' separated by ";
  print "0-1 characters (Ex: He Hoe)\n";
}
? Following *, +, ?, or {M,N}, makes that quantifier non-greedy, so it matches as few times as possible.
$string1 = "Hello World\n";
if ($string1 =~ m/(l.+?o)/) {
  print "The non-greedy match with 'l' followed by one or ";
  print "more characters is 'llo' rather than 'llo Wo'.\n";
}
* Matches the preceding pattern element zero or more times.
$string1 = "Hello World\n";
if ($string1 =~ m/el*o/) {
  print "There is an 'e' followed by zero to many ";
  print "'l' followed by 'o' (eo, elo, ello, elllo)\n";
}
{M,N} Matches the preceding pattern element at least M and at most N times.
$string1 = "Hello World\n";
if ($string1 =~ m/l{1,2}/) {
  print "There exists a substring with at least 1 ";
  print "and at most 2 l's in $string1\n";
}
[...] Matches any one character from the set inside the brackets.
$string1 = "Hello World\n";
if ($string1 =~ m/[aeiou]+/) {
  print "$string1 contains one or more vowels.\n";
}
| Separates alternate possibilities.
$string1 = "Hello World\n";
if ($string1 =~ m/(Hello|Hi|Pogo)/) {
  print "At least one of Hello, Hi, or Pogo is ";
  print "contained in $string1.\n";
}
\b Matches a word boundary.
$string1 = "Hello World\n";
if ($string1 =~ m/llo\b/) {
  print "There is a word that ends with 'llo'\n";
} else {
  print "There are no words that end with 'llo'\n";
}
\w Matches an alphanumeric character, including "_".
$string1 = "Hello World\n";
if ($string1 =~ m/\w/) {
  print "There is at least one alphanumeric ";
  print "character in $string1 (A-Z, a-z, 0-9, _)\n";
}
\W Matches a non-alphanumeric character (anything that \w does not match).
$string1 = "Hello World\n";
if ($string1 =~ m/\W/) {
  print "The space between Hello and ";
  print "World is not alphanumeric\n";
}
\s Matches a whitespace character (space, tab, newline, form feed).
$string1 = "Hello World\n";
if ($string1 =~ m/\s.*\s/) {
  print "There are TWO whitespace characters, which may";
  print " be separated by other characters, in $string1";
}
\S Matches any character except whitespace.
$string1 = "Hello World\n";
if ($string1 =~ m/\S.*\S/) {
  print "There are TWO non-whitespace characters, which";
  print " may be separated by other characters, in $string1";
}
\d Matches a digit, same as [0-9].
$string1 = "99 bottles of beer on the wall.";
if ($string1 =~ m/(\d+)/) {
  print "$1 is the first number in '$string1'\n";
}

Output:
99 is the first number in '99 bottles of beer on the wall.'
\D Matches a non-digit.
$string1 = "Hello World\n";
if ($string1 =~ m/\D/) {
  print "There is at least one character in $string1";
  print " that is not a digit.\n";
}
^ Matches the beginning of the string, or of each line when the /m modifier is used.
$string1 = "Hello World\n";
if ($string1 =~ m/^He/) {
  print "$string1 starts with the characters 'He'\n";
}
$ Matches the end of the string (or just before a trailing newline), or the end of each line when the /m modifier is used.
$string1 = "Hello World\n";
if ($string1 =~ m/rld$/) {
  print "$string1 is a line or string ";
  print "that ends with 'rld'\n";
}
\A Matches the beginning of a string (but not of an internal line).
$string1 = "Hello\nWorld\n";
if ($string1 =~ m/\AH/) {
  print "$string1 is a string ";
  print "that starts with 'H'\n";
}
\Z Matches the end of a string, or just before a trailing newline (but not the end of an internal line).
$string1 = "Hello\nWorld\n";
if ($string1 =~ m/d\n\Z/) {
  print "$string1 is a string ";
  print "that ends with 'd\\n'\n";
}
[^...] Matches any one character except those inside the brackets.
$string1 = "Hello World\n";
if ($string1 =~ m/[^abc]/) {
  print "$string1 contains at least one character ";
  print "other than a, b, and c\n";
}
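
The difference between the line anchors (^ and $) and the string anchors (\A and \Z) only shows up on a multi-line string. The following is a minimal sketch, assuming the /m modifier, which the examples above do not use; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "Hello\nWorld\n";

# Without /m, ^ matches only at the very start of the string,
# so 'World' on the second line is not found.
my $caret_plain = ($text =~ m/^World/) ? 1 : 0;

# With /m, ^ also matches just after each internal newline.
my $caret_multi = ($text =~ m/^World/m) ? 1 : 0;

# \A ignores /m and still matches only at the start of the string.
my $string_start = ($text =~ m/\AWorld/m) ? 1 : 0;

# With /m, $ matches before any newline; \Z only before the final one.
my $dollar_multi = ($text =~ m/Hello$/m) ? 1 : 0;
my $string_end   = ($text =~ m/Hello\Z/m) ? 1 : 0;

print "^World: $caret_plain, ^World with /m: $caret_multi, ";
print "\\AWorld with /m: $string_start\n";
print "Hello\$ with /m: $dollar_multi, Hello\\Z with /m: $string_end\n";
```

Only the two patterns that combine a line anchor with /m succeed here; the \A and \Z versions fail because 'World' is not at the start of the string and 'Hello' is not at its end.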

The 'm' in the regular expressions above, for example m/[^abc]/, is optional when the delimiter is a slash: Perl recognizes /[^abc]/ as a match just as readily (compare the substitution operator, s/a/b/, whose 's' is required). Writing the 'm' explicitly allows a different delimiter to be chosen; for example, m{/} is easier to read than /\//. See 'perldoc perlre' for more details.
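
As a short illustration of the delimiter choice, the sketch below matches the same pattern three ways. The string in $path and the variable names are hypothetical, chosen only because the pattern contains slashes.

```perl
use strict;
use warnings;

my $path = "/usr/local/bin";

# The leading 'm' is optional when the delimiter is a slash,
# but every slash inside the pattern must then be escaped.
my $with_slashes = ($path =~ /\/local\//) ? 1 : 0;

# With 'm' and braces as delimiters, the slashes need no escaping.
my $with_braces = ($path =~ m{/local/}) ? 1 : 0;

# Substitution accepts alternative delimiters in the same way.
(my $renamed = $path) =~ s{/local/}{/opt/};

print "slashes: $with_slashes, braces: $with_braces, renamed: $renamed\n";
```

Both match forms succeed on this input, and the substitution leaves $path untouched while $renamed holds the edited copy.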