Regular expression examples: Difference between revisions

Revision as of 10:26, 1 June 2010 by 193.219.160.34. Edit summary: →+External links: regex.powertoy.org

Latest revision as of 17:50, 9 June 2017 by Tom.Reding (minor edit). Edit summary: +{{Redirect category shell}}, using AWB
(23 intermediate revisions by 20 users not shown)
Line 1:

Latest revision (content added):

#REDIRECT [[Regular expression#Examples]]

{{Redirect category shell|1=
{{R from merge}}
}}

Earlier revision (content deleted):

{{Mergeto|Regular expression|date=December 2008}}
{{Cleanup-rewrite|date=May 2009}}
{{examplefarm}}

A [[regular expression]] (also "RegEx" or "regex") is a string that is used to describe or match a set of strings according to certain [[syntax]] rules. The specific syntax rules vary depending on the specific [[implementation]], [[programming language]], or [[Library (computing)|library]] in use. Additionally, the functionality of regex implementations can vary between [[Software versioning|version]]s.

Despite this variability, and because regular expressions can be difficult to both explain and understand without examples, this article provides a basic description of some of the properties of regular expressions by way of illustration.

== Conventions ==
The following conventions are used in the examples.<ref name="clarify000">The character 'm' is not always required to specify a Perl match operation. For example, m/[^abc]/ could also be rendered as /[^abc]/. The 'm' is only necessary if the user wishes to specify a match operation without using a forward slash as the regex [[delimiter]]. Sometimes it is useful to specify an alternate regex delimiter in order to avoid "[[Delimiter#Delimiter collision|delimiter collision]]". See '[http://perldoc.perl.org/perlre.html perldoc perlre]' for more details.</ref>

 metacharacter(s) ;; the metacharacters column specifies the regex syntax being demonstrated
 =~ m//           ;; indicates a regex '''match''' operation in Perl
 =~ s///          ;; indicates a regex '''substitution''' operation in Perl

Also worth noting is that these regular expressions all use Perl-like syntax. Standard POSIX regular expressions are different.

== Examples ==
Unless otherwise indicated, the following examples conform to the [[Perl]] programming language, release 5.8.8, January 31, 2006.
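The match and substitution conventions described under Conventions, including the footnote's point about alternate delimiters, can be sketched as follows. This is only an illustrative sketch: the variable names and sample strings are hypothetical and do not come from the article.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $path = "/usr/local/bin/perl";

# With the default "/" delimiter, every literal slash must be escaped.
my $escaped = ($path =~ /^\/usr\/local\//) ? 1 : 0;

# The leading "m" permits another delimiter, here braces, so the
# slashes in the pattern need no escaping (no delimiter collision).
my $braced = ($path =~ m{^/usr/local/}) ? 1 : 0;

# s/// replaces the first match in the bound variable and returns
# the number of substitutions performed.
my $greeting = "Hello World";
my $count = ($greeting =~ s/World/Perl/);

print "escaped=$escaped braced=$braced\n";   # prints: escaped=1 braced=1
print "$greeting ($count substitution)\n";   # prints: Hello Perl (1 substitution)
```

Both match forms test the same pattern; the braces only change how the pattern is delimited, not what it matches.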
The syntax and conventions used in these examples coincide with those of other programming environments as well (e.g., see ''Java in a Nutshell'', page 213; ''Python Scripting for Computational Science'', page 320; ''Programming PHP'', page 106).

<table class="wikitable">
<tr>
<th>Metacharacter(s)</th>
<th>Description</th>
<th>Example
<br>Note that all the if statements return a TRUE value</th>
</tr>
<tr>
<td>'''.'''</td>
<td>Normally matches any character except a newline. Within square brackets the dot is literal.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/...../) {
  print "$string1 has length >= 5\n";
}
</pre></td>
</tr>
<tr>
<td>( )</td>
<td>Groups a series of pattern elements to a single element. When you match a pattern within parentheses, you can use any of $1, $2, ... later to refer to the previously matched pattern.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/(H..).(o..)/) {
  print "We matched '$1' and '$2'\n";
}
</pre>'''Output:'''<pre>
We matched 'Hel' and 'o W'
</pre></td>
</tr>
<tr>
<td>+</td>
<td>Matches the preceding pattern element one or more times.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/l+/) {
  print "There are one or more consecutive letter \"l\"'s in $string1\n";
}
</pre>'''Output:'''<pre>
There are one or more consecutive letter "l"'s in Hello World
</pre></td>
</tr>
<tr>
<td>?</td>
<td>Matches the preceding pattern element zero or one times.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/H.?e/) {
  print "There is an 'H' and an 'e' separated by ";
  print "0-1 characters (Ex: He Hoe)\n";
}
</pre></td>
</tr>
<tr>
<td>?</td>
<td>Modifies the *, +, or {M,N}'d regexp that comes before to match as few times as possible.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/(l.+?o)/) {
  print "The non-greedy match with 'l' followed by one or ";
  print "more characters is 'llo' rather than 'llo Wo'.\n";
}
</pre></td>
</tr>
<tr>
<td>*</td>
<td>Matches the preceding pattern element zero or more times.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/el*o/) {
  print "There is an 'e' followed by zero to many ";
  print "'l' followed by 'o' (eo, elo, ello, elllo)\n";
}
</pre></td>
</tr>
<tr>
<td>{M,N}</td>
<td>Denotes the minimum M and the maximum N match count.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/l{1,2}/) {
  print "There exists a substring with at least 1 ";
  print "and at most 2 l's in $string1\n";
}
</pre>
</td>
</tr>
<tr>
<td>[...]</td>
<td>Denotes a set of possible character matches.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/[aeiou]+/) {
  print "$string1 contains one or more vowels.\n";
}
</pre></td>
</tr>
<tr>
<td>|</td>
<td>Separates alternate possibilities.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/(Hello|Hi|Pogo)/) {
  print "At least one of Hello, Hi, or Pogo is ";
  print "contained in $string1.\n";
}
</pre></td>
</tr>
<tr>
<td>\b</td>
<td>Matches a word boundary.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/llo\b/) {
  print "There is a word that ends with 'llo'\n";
} else {
  print "There are no words that end with 'llo'\n";
}
</pre></td>
</tr>
<tr>
<td>\w</td>
<td>Matches an alphanumeric character, including "_"; same as [A-Za-z0-9_]</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/\w/) {
  print "There is at least one alphanumeric ";
  print "character in $string1 (A-Z, a-z, 0-9, _)\n";
}
</pre></td>
</tr>
<tr>
<td>\W</td>
<td>Matches a <b>non</b>-alphanumeric character, excluding "_"; same as [^A-Za-z0-9_]</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/\W/) {
  print "The space between Hello and ";
  print "World is not alphanumeric\n";
}
</pre></td>
</tr>
<tr>
<td>\s</td>
<td>Matches a whitespace character (space, tab, newline, form feed)</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/\s.*\s/) {
  print "There are TWO whitespace characters, which may";
  print " be separated by other characters, in $string1";
}
</pre></td>
</tr>
<tr>
<td>\S</td>
<td>Matches anything BUT a whitespace.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/\S.*\S/) {
  print "There are TWO non-whitespace characters, which";
  print " may be separated by other characters, in $string1";
}
</pre></td>
</tr>
<tr>
<td>\d</td>
<td>Matches a digit; same as [0-9].</td>
<td align="left">
<pre>
$string1 = "99 bottles of beer on the wall.";
if ($string1 =~ m/(\d+)/) {
  print "$1 is the first number in '$string1'\n";
}
</pre>'''Output:'''<pre>
99 is the first number in '99 bottles of beer on the wall.'
</pre></td>
</tr>
<tr>
<td>\D</td>
<td>Matches a non-digit; same as [^0-9].</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/\D/) {
  print "There is at least one character in $string1";
  print " that is not a digit.\n";
}
</pre></td>
</tr>
<tr>
<td>^</td>
<td>Matches the beginning of a line or string.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/^He/) {
  print "$string1 starts with the characters 'He'\n";
}
</pre></td>
</tr>
<tr>
<td>$</td>
<td>Matches the end of a line or string.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/rld$/) {
  print "$string1 is a line or string ";
  print "that ends with 'rld'\n";
}
</pre></td>
</tr>
<tr>
<td>\A</td>
<td>Matches the beginning of a string (but not an internal line).</td>
<td align="left">
<pre>
$string1 = "Hello\nWorld\n";
if ($string1 =~ m/\AH/) {
  print "$string1 is a string ";
  print "that starts with 'H'\n";
}
</pre></td>
</tr>
<tr>
<td>\z</td>
<td>Matches the end of a string (but not an internal line).<br/> See ''Perl Best Practices'', page 240.</td>
<td align="left">
<pre>
$string1 = "Hello\nWorld\n";
if ($string1 =~ m/d\n\z/) {
  print "$string1 is a string ";
  print "that ends with 'd\\n'\n";
}
</pre></td>
</tr>
<tr>
<td>[^...]</td>
<td>Matches every character except the ones inside brackets.</td>
<td align="left">
<pre>
$string1 = "Hello World\n";
if ($string1 =~ m/[^abc]/) {
  print "$string1 contains a character other than ";
  print "a, b, and c\n";
}
</pre></td>
</tr>
</table>

== Notes ==
{{Reflist}}

== See also ==
* [[Comparison of programming languages]]

[[Category:Perl]]
[[Category:Pattern matching]]
[[Category:Articles with example code]]
[[Category:Programming language topics]]

==External links==
* [http://regex.powertoy.org/ Simple and straightforward RegEx online trainer/demo]