
📄 perlre.pod

where C<\G> will match can also be influenced by using C<pos()> as
an lvalue.  See L<perlfunc/pos>.

The bracketing construct C<( ... )> creates capture buffers.  To
refer to the digit'th buffer use \<digit> within the
match.  Outside the match use "$" instead of "\".  (The
\<digit> notation works in certain circumstances outside
the match.  See the warning below about \1 vs $1 for details.)
Referring back to another part of the match is called a
I<backreference>.

There is no limit to the number of captured substrings that you may
use.  However Perl also uses \10, \11, etc. as aliases for \010,
\011, etc.  (Recall that 0 means octal, so \011 is the character at
number 9 in your coded character set; which would be the 10th character,
a horizontal tab under ASCII.)  Perl resolves this
ambiguity by interpreting \10 as a backreference only if at least 10
left parentheses have opened before it.  Likewise \11 is a
backreference only if at least 11 left parentheses have opened
before it.  And so on.  \1 through \9 are always interpreted as
backreferences.

Examples:

    s/^([^ ]*) *([^ ]*)/$2 $1/;     # swap first two words

    if (/(.)\1/) {                  # find first doubled char
        print "'$1' is the first doubled character\n";
    }

    if (/Time: (..):(..):(..)/) {   # parse out values
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

Several special variables also refer back to portions of the previous
match.  C<$+> returns whatever the last bracket match matched.
C<$&> returns the entire matched string.  (At one point C<$0> did
also, but now it returns the name of the program.)  C<$`> returns
everything before the matched string.  And C<$'> returns everything
after the matched string.

The numbered variables ($1, $2, $3, etc.) and the related punctuation
set (C<$+>, C<$&>, C<$`>, and C<$'>) are all dynamically scoped
until the end of the enclosing block or until the next successful
match, whichever comes first.
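
The scoping rule above can be sketched roughly as follows.  This is
only an illustration: the sample strings and the variable names
($swapped, $doubled, $outer_before, $inner, $outer_after) are
assumptions made up for this sketch and do not come from the text.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $line = "hello world again";

# Inside the pattern a backreference is \1; in the replacement it is $1.
(my $swapped = $line) =~ s/^([^ ]*) *([^ ]*)/$2 $1/;

# \1 inside the match refers to whatever the first group just captured.
my $doubled;
if ("bookkeeper" =~ /(.)\1/) {
    $doubled = $1;
}

# Captures set by a match in an inner block are undone when that
# block exits, so $1 reverts to the value from the outer match.
my ($outer_before, $inner, $outer_after);
if ("abc" =~ /(b)/) {
    $outer_before = $1;
    {
        if ("xyz" =~ /(y)/) {
            $inner = $1;
        }
    }
    $outer_after = $1;
}

print "$swapped\n$doubled\n$outer_before $inner $outer_after\n";
```

In this sketch $outer_after holds the outer capture again, because
the inner match was confined to its own block.
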
(See L<perlsyn/"Compound Statements">.)

B<WARNING>: Once Perl sees that you need one of C<$&>, C<$`>, or
C<$'> anywhere in the program, it has to provide them for every
pattern match.  This may substantially slow your program.  Perl
uses the same mechanism to produce $1, $2, etc, so you also pay a
price for each pattern that contains capturing parentheses.  (To
avoid this cost while retaining the grouping behaviour, use the
extended regular expression C<(?: ... )> instead.)  But if you never
use C<$&>, C<$`> or C<$'>, then patterns I<without> capturing
parentheses will not be penalized.  So avoid C<$&>, C<$'>, and C<$`>
if you can, but if you can't (and some algorithms really appreciate
them), once you've used them once, use them at will, because you've
already paid the price.  As of 5.005, C<$&> is not so costly as the
other two.

Backslashed metacharacters in Perl are alphanumeric, such as C<\b>,
C<\w>, C<\n>.  Unlike some other regular expression languages, there
are no backslashed symbols that aren't alphanumeric.  So anything
that looks like \\, \(, \), \<, \>, \{, or \} is always
interpreted as a literal character, not a metacharacter.  This was
once used in a common idiom to disable or quote the special meanings
of regular expression metacharacters in a string that you want to
use for a pattern.  Simply quote all non-"word" characters:

    $pattern =~ s/(\W)/\\$1/g;

(If C<use locale> is set, then this depends on the current locale.)
Today it is more common to use the quotemeta() function or the C<\Q>
metaquoting escape sequence to disable all metacharacters' special
meanings like this:

    /$unquoted\Q$quoted\E$unquoted/

Beware that if you put literal backslashes (those not inside
interpolated variables) between C<\Q> and C<\E>, double-quotish
backslash interpolation may lead to confusing results.
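
As a rough sketch of the quoting approach just described, the
following compares a quoted and an unquoted interpolation.  The
sample string and the variable names ($literal, $quoted, $exact,
$loose, $strict) are assumptions invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical text containing the metacharacters "." and "*".
my $literal = "a.b*c";

# quotemeta() puts a backslash before every non-word character.
my $quoted = quotemeta($literal);

# \Q...\E applies the same quoting while the pattern is interpolated.
my $exact  = ("xa.b*cx" =~ /\Q$literal\E/) ? 1 : 0;  # literal text is present
my $loose  = ("aXbbbc"  =~ /$literal/)     ? 1 : 0;  # "." and "*" stay special
my $strict = ("aXbbbc"  =~ /\Q$literal\E/) ? 1 : 0;  # literal text is absent

print "$quoted $exact $loose $strict\n";
```

Here the unquoted pattern matches "aXbbbc" because "." and "*" keep
their special meanings, while the quoted one only matches the
literal text.
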
If you
I<need> to use literal backslashes within C<\Q...\E>,
consult L<perlop/"Gory details of parsing quoted constructs">.

=head2 Extended Patterns

Perl also defines a consistent extension syntax for features not
found in standard tools like B<awk> and B<lex>.  The syntax is a
pair of parentheses with a question mark as the first thing within
the parentheses.  The character after the question mark indicates
the extension.

The stability of these extensions varies widely.  Some have been
part of the core language for many years.  Others are experimental
and may change without warning or be completely removed.  Check
the documentation on an individual feature to verify its current
status.

A question mark was chosen for this and for the minimal-matching
construct because 1) question marks are rare in older regular
expressions, and 2) whenever you see one, you should stop and
"question" exactly what is going on.  That's psychology...

=over 10

=item C<(?#text)>

A comment.  The text is ignored.  If the C</x> modifier enables
whitespace formatting, a simple C<#> will suffice.  Note that Perl closes
the comment as soon as it sees a C<)>, so there is no way to put a literal
C<)> in the comment.

=item C<(?imsx-imsx)>

One or more embedded pattern-match modifiers.  This is particularly
useful for dynamic patterns, such as those read in from a configuration
file, read in as an argument, or specified in a table somewhere.
Consider the case where some of these patterns need to be case
sensitive and some do not.  The case insensitive ones merely need to
include C<(?i)> at the front of the pattern.  For example:

    $pattern = "foobar";
    if ( /$pattern/i ) { }

    # more flexible:

    $pattern = "(?i)foobar";
    if ( /$pattern/ ) { }

Letters after a C<-> turn those modifiers off.  These modifiers are
localized inside an enclosing group (if any).  For example,

    ( (?i) blah ) \s+ \1

will match a repeated (I<including the case>!)
word C<blah> in any
case, assuming C<x> modifier, and no C<i> modifier outside this
group.

=item C<(?:pattern)>

=item C<(?imsx-imsx:pattern)>

This is for clustering, not capturing; it groups subexpressions like
"()", but doesn't make backreferences as "()" does.  So

    @fields = split(/\b(?:a|b|c)\b/)

is like

    @fields = split(/\b(a|b|c)\b/)

but doesn't spit out extra fields.  It's also cheaper not to capture
characters if you don't need to.

Any letters between C<?> and C<:> act as flags modifiers as with
C<(?imsx-imsx)>.  For example,

    /(?s-i:more.*than).*million/i

is equivalent to the more verbose

    /(?:(?s-i)more.*than).*million/i

=item C<(?=pattern)>

A zero-width positive look-ahead assertion.  For example, C</\w+(?=\t)/>
matches a word followed by a tab, without including the tab in C<$&>.

=item C<(?!pattern)>

A zero-width negative look-ahead assertion.  For example C</foo(?!bar)/>
matches any occurrence of "foo" that isn't followed by "bar".  Note
however that look-ahead and look-behind are NOT the same thing.  You cannot
use this for look-behind.

If you are looking for a "bar" that isn't preceded by a "foo", C</(?!foo)bar/>
will not do what you want.  That's because the C<(?!foo)> is just saying that
the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will
match.  You would have to do something like C</(?!foo)...bar/> for that.  We
say "like" because there's the case of your "bar" not having three characters
before it.  You could cover that this way: C</(?:(?!foo)...|^.{0,2})bar/>.
Sometimes it's still easier just to say:

    if (/bar/ && $` !~ /foo$/)

For look-behind see below.

=item C<(?<=pattern)>

A zero-width positive look-behind assertion.  For example, C</(?<=\t)\w+/>
matches a word that follows a tab, without including the tab in C<$&>.
Works only for fixed-width look-behind.

=item C<(?<!pattern)>

A zero-width negative look-behind assertion.  For example C</(?<!bar)foo/>
matches any occurrence of "foo" that does not follow "bar".
Works
only for fixed-width look-behind.

=item C<(?{ code })>

B<WARNING>: This extended regular expression feature is considered
highly experimental, and may be changed or deleted without notice.

This zero-width assertion evaluates any embedded Perl code.  It
always succeeds, and its C<code> is not interpolated.  Currently,
the rules to determine where the C<code> ends are somewhat convoluted.

The C<code> is properly scoped in the following sense: If the assertion
is backtracked (compare L<"Backtracking">), all changes introduced after
C<local>ization are undone, so that

  $_ = 'a' x 8;
  m<
     (?{ $cnt = 0 })                    # Initialize $cnt.
     (
       a
       (?{
           local $cnt = $cnt + 1;       # Update $cnt, backtracking-safe.
       })
     )*
     aaaa
     (?{ $res = $cnt })                 # On success copy to non-localized
                                        # location.
   >x;

will set C<$res = 4>.  Note that after the match, $cnt returns to the globally
introduced value, because the scopes that restrict C<local> operators
are unwound.

This assertion may be used as a C<(?(condition)yes-pattern|no-pattern)>
switch.  If I<not> used in this way, the result of evaluation of
C<code> is put into the special variable C<$^R>.  This happens
immediately, so C<$^R> can be used from other C<(?{ code })> assertions
inside the same regular expression.

The assignment to C<$^R> above is properly localized, so the old
value of C<$^R> is restored if the assertion is backtracked; compare
L<"Backtracking">.

For reasons of security, this construct is forbidden if the regular
expression involves run-time interpolation of variables, unless the
perilous C<use re 'eval'> pragma has been used (see L<re>), or the
variables contain results of the C<qr//> operator (see
L<perlop/"qr/STRING/imosx">).

This restriction is because of the widespread and remarkably convenient
custom of using run-time determined strings as patterns.
For example:

    $re = <>;
    chomp $re;
    $string =~ /$re/;

Before Perl knew how to execute interpolated code within a pattern,
this operation was completely safe from a security point of view,
although it could raise an exception from an illegal pattern.  If
you turn on C<use re 'eval'>, though, it is no longer secure,
so you should only do so if you are also using taint checking.
Better yet, use the carefully constrained evaluation within a Safe
compartment.  See L<perlsec> for details about both these mechanisms.

=item C<(??{ code })>

B<WARNING>: This extended regular expression feature is considered
highly experimental, and may be changed or deleted without notice.
A simplified version of the syntax may be introduced for commonly
used idioms.

This is a "postponed" regular subexpression.  The C<code> is evaluated
at run time, at the moment this subexpression may match.  The result
of evaluation is considered as a regular expression and matched as
if it were inserted instead of this construct.

The C<code> is not interpolated.  As before, the rules to determine
where the C<code> ends are currently somewhat convoluted.

The following pattern matches a parenthesized group:

  $re = qr{
             \(
             (?:
                (?> [^()]+ )    # Non-parens without backtracking
              |
                (??{ $re })     # Group with matching parens
             )*
             \)
          }x;

=item C<< (?>pattern) >>

B<WARNING>: This extended regular expression feature is considered
highly experimental, and may be changed or deleted without notice.

An "independent" subexpression, one which matches the substring
that a I<standalone> C<pattern> would match if anchored at the given
position, and it matches I<nothing other than this substring>.
This
construct is useful for optimizations of what would otherwise be
"eternal" matches, because it will not backtrack (see L<"Backtracking">).
It may also be useful in places where the "grab all you can, and do not
give anything back" semantic is desirable.

For example: C<< ^(?>a*)ab >> will never match, since C<< (?>a*) >>
(anchored at the beginning of string, as above) will match I<all>
characters C<a> at the beginning of string, leaving no C<a> for
C<ab> to match.  In contrast, C<a*ab> will match the same as C<a+b>,
since the match of the subgroup C<a*> is influenced by the following
group C<ab> (see L<"Backtracking">).  In particular, C<a*> inside
C<a*ab> will match fewer characters than a standalone C<a*>, since
this makes the tail match.

An effect similar to C<< (?>pattern) >> may be achieved by writing
C<(?=(pattern))\1>.  This matches the same substring as a standalone
C<pattern>, and the following C<\1> eats the matched string; it therefore
makes a zero-length assertion into an analogue of C<< (?>...) >>.
(The difference between these two constructs is that the second one
uses a capturing group, thus shifting ordinals of backreferences
in the rest of a regular expression.)

Consider this pattern:
