    /(^a|b)c/;  # matches 'ac' at start of string or 'bc' anywhere
    /house(cat|)/;  # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'.  Note groups can be nested.
    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

=head2 Extracting matches

The grouping metacharacters C<()> also allow the extraction of the
parts of a string that matched.  For each grouping, the part that
matched inside goes into the special variables C<$1>, C<$2>, etc.
They can be used just as ordinary variables:

    # extract hours, minutes, seconds
    $time =~ /(\d\d):(\d\d):(\d\d)/;  # match hh:mm:ss format
    $hours = $1;
    $minutes = $2;
    $seconds = $3;

In list context, a match C</regex/> with groupings will return the
list of matched values C<($1,$2,...)>.  So we could rewrite it as

    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);

If the groupings in a regex are nested, C<$1> gets the group with the
leftmost opening parenthesis, C<$2> the next opening parenthesis,
etc.  For example, here is a complex regex and the matching variables
indicated below it:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34

Associated with the matching variables C<$1>, C<$2>, ... are
the B<backreferences> C<\1>, C<\2>, ...  Backreferences are
matching variables that can be used I<inside> a regex:

    /(\w\w\w)\s\1/;  # find sequences like 'the the' in string

C<$1>, C<$2>, ... should only be used outside of a regex, and C<\1>,
C<\2>, ... only inside a regex.

=head2 Matching repetitions

The B<quantifier> metacharacters C<?>, C<*>, C<+>, and C<{}> allow us
to determine the number of repeats of a portion of a regex we
consider to be a match.  Quantifiers are put immediately after the
character, character class, or grouping that we want to specify.
They have the following meanings:

=over 4

=item *

C<a?> = match 'a' 1 or 0 times

=item *

C<a*> = match 'a' 0 or more times, i.e., any number of times

=item *

C<a+> = match 'a' 1 or more times, i.e., at least once

=item *

C<a{n,m}> = match at least C<n> times, but not more than C<m>
times.

=item *

C<a{n,}> = match at least C<n> times

=item *

C<a{n}> = match exactly C<n> times

=back

Here are some examples:

    /[a-z]+\s+\d*/;  # match a lowercase word, at least some space, and
                     # any number of digits
    /(\w+)\s+\1/;    # match doubled words of arbitrary length
    $year =~ /^\d{2,4}$/;  # make sure year is at least 2 but not more
                           # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/;  # better match; throw out 3 digit dates

These quantifiers will try to match as much of the string as possible,
while still allowing the regex to match.  So we have

    $x = 'the cat in the hat';
    $x =~ /^(.*)(at)(.*)$/;  # matches,
                             # $1 = 'the cat in the h'
                             # $2 = 'at'
                             # $3 = ''   (0 matches)

The first quantifier C<.*> grabs as much of the string as possible
while still having the regex match.  The second quantifier C<.*> has
no string left to it, so it matches 0 times.

=head2 More matching

There are a few more things you might want to know about matching
operators.  In the code

    $pattern = 'Seuss';
    while (<>) {
        print if /$pattern/;
    }

perl has to re-evaluate C<$pattern> each time through the loop.  If
C<$pattern> won't be changing, use the C<//o> modifier to perform
variable substitution only once.  If you don't want any
substitutions at all, use the special delimiter C<m''>:

    $pattern = 'Seuss';
    m'$pattern';  # matches '$pattern', not 'Seuss'

The global modifier C<//g> allows the matching operator to match
within a string as many times as possible.
In scalar context,
successive matches against a string will have C<//g> jump from match
to match, keeping track of position in the string as it goes along.
You can get or set the position with the C<pos()> function.
For example,

    $x = "cat dog house";  # 3 words
    while ($x =~ /(\w+)/g) {
        print "Word is $1, ends at position ", pos $x, "\n";
    }

prints

    Word is cat, ends at position 3
    Word is dog, ends at position 7
    Word is house, ends at position 13

A failed match or changing the target string resets the position.  If
you don't want the position reset after failure to match, add the
C<//c> modifier, as in C</regex/gc>.

In list context, C<//g> returns a list of matched groupings, or if
there are no groupings, a list of matches to the whole regex.  So

    @words = ($x =~ /(\w+)/g);  # matches,
                                # $words[0] = 'cat'
                                # $words[1] = 'dog'
                                # $words[2] = 'house'

=head2 Search and replace

Search and replace is performed using C<s/regex/replacement/modifiers>.
The C<replacement> is a Perl double quoted string that replaces in the
string whatever is matched with the C<regex>.  The operator C<=~> is
also used here to associate a string with C<s///>.  If matching
against C<$_>, the S<C<$_ =~> > can be dropped.  If there is a match,
C<s///> returns the number of substitutions made; otherwise it returns
false.  Here are a few examples:

    $x = "Time to feed the cat!";
    $x =~ s/cat/hacker/;   # $x contains "Time to feed the hacker!"
    $y = "'quoted words'";
    $y =~ s/^'(.*)'$/$1/;  # strip single quotes,
                           # $y contains "quoted words"

With the C<s///> operator, the matched variables C<$1>, C<$2>, etc.
are immediately available for use in the replacement expression.
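
As a further illustration of matched variables in the replacement,
here is a minimal sketch that swaps two words.  The sample string and
the literal C<and> separator are assumptions chosen for illustration:

    $x = "Hobbes and Calvin";
    $x =~ s/^(\w+) and (\w+)$/$2 and $1/;  # $x contains "Calvin and Hobbes"

Here C<$1> and C<$2> are interpolated into the replacement just as
they would be in any double quoted string.
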
With the global modifier, C<s///g> will search and replace all
occurrences of the regex in the string:

    $x = "I batted 4 for 4";
    $x =~ s/4/four/;   # $x contains "I batted four for 4"
    $x = "I batted 4 for 4";
    $x =~ s/4/four/g;  # $x contains "I batted four for four"

The evaluation modifier C<s///e> wraps an C<eval{...}> around the
replacement string and the evaluated result is substituted for the
matched substring.  Some examples:

    # reverse all the words in a string
    $x = "the cat in the hat";
    $x =~ s/(\w+)/reverse $1/ge;   # $x contains "eht tac ni eht tah"

    # convert percentage to decimal
    $x = "A 39% hit rate";
    $x =~ s!(\d+)%!$1/100!e;       # $x contains "A 0.39 hit rate"

The last example shows that C<s///> can use other delimiters, such as
C<s!!!> and C<s{}{}>, and even C<s{}//>.  If single quotes are used,
as in C<s'''>, then the regex and replacement are treated as single
quoted strings.

=head2 The split operator

C<split /regex/, string> splits C<string> into a list of substrings
and returns that list.  The regex determines the character sequence
that C<string> is split with respect to.  For example, to split a
string into words, use

    $x = "Calvin and Hobbes";
    @word = split /\s+/, $x;  # $word[0] = 'Calvin'
                              # $word[1] = 'and'
                              # $word[2] = 'Hobbes'

To extract a comma-delimited list of numbers, use

    $x = "1.618,2.718, 3.142";
    @const = split /,\s*/, $x;  # $const[0] = '1.618'
                                # $const[1] = '2.718'
                                # $const[2] = '3.142'

If the empty regex C<//> is used, the string is split into individual
characters.  If the regex has groupings, then the list produced
contains the matched substrings from the groupings as well:

    $x = "/usr/bin";
    @parts = split m!(/)!, $x;  # $parts[0] = ''
                                # $parts[1] = '/'
                                # $parts[2] = 'usr'
                                # $parts[3] = '/'
                                # $parts[4] = 'bin'

Since the first character of C<$x> matched the regex, C<split>
prepended an empty initial element to the list.

=head1 BUGS

None.

=head1 SEE ALSO

This is just a quick start guide.
For a more in-depth tutorial on
regexes, see L<perlretut> and for the reference page, see L<perlre>.

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 2000 Mark Kvale
All rights reserved.

This document may be distributed under the same terms as Perl itself.

=head2 Acknowledgments

The author would like to thank Mark-Jason Dominus, Tom Christiansen,
Ilya Zakharevich, Brad Hughes, and Mike Giroux for all their helpful
comments.

=cut