
📄 perlfaq6.1

.PP
.Vb 2
\&        $_ = "1122a44";
\&        my @pairs = m/(\ed\ed)/g;   # qw( 11 22 44 )
.Ve
.PP
If you use the \f(CW\*(C`\eG\*(C'\fR anchor, you force the match after \f(CW22\fR to
start with the \f(CW\*(C`a\*(C'\fR.  The regular expression cannot match
there since it does not find a digit, so the next match
fails and the match operator returns the pairs it already
found.
.PP
.Vb 2
\&        $_ = "1122a44";
\&        my @pairs = m/\eG(\ed\ed)/g; # qw( 11 22 )
.Ve
.PP
You can also use the \f(CW\*(C`\eG\*(C'\fR anchor in scalar context. You
still need the \f(CW\*(C`g\*(C'\fR flag.
.PP
.Vb 5
\&        $_ = "1122a44";
\&        while( m/\eG(\ed\ed)/g )
\&                {
\&                print "Found $1\en";
\&                }
.Ve
.PP
After the match fails at the letter \f(CW\*(C`a\*(C'\fR, perl resets \f(CW\*(C`pos()\*(C'\fR
and the next match on the same string starts at the beginning.
.PP
.Vb 5
\&        $_ = "1122a44";
\&        while( m/\eG(\ed\ed)/g )
\&                {
\&                print "Found $1\en";
\&                }
\&
\&        print "Found $1 after while" if m/(\ed\ed)/g; # finds "11"
.Ve
.PP
You can disable \f(CW\*(C`pos()\*(C'\fR resets on fail with the \f(CW\*(C`c\*(C'\fR flag, documented
in perlop and perlreref. Subsequent matches start where the last
successful match ended (the value of \f(CW\*(C`pos()\*(C'\fR) even if a match on the
same string has failed in the meantime. In this case, the match after
the \f(CW\*(C`while()\*(C'\fR loop starts at the \f(CW\*(C`a\*(C'\fR (where the last match stopped),
and since it does not use any anchor it can skip over the \f(CW\*(C`a\*(C'\fR to find
\&\f(CW44\fR.
.PP
.Vb 5
\&        $_ = "1122a44";
\&        while( m/\eG(\ed\ed)/gc )
\&                {
\&                print "Found $1\en";
\&                }
\&
\&        print "Found $1 after while" if m/(\ed\ed)/g; # finds "44"
.Ve
.PP
Typically you use the \f(CW\*(C`\eG\*(C'\fR anchor with the \f(CW\*(C`c\*(C'\fR flag
when you want to try a different match if one fails,
such as in a tokenizer.
Jeffrey Friedl offers this example
which works in 5.004 or later.
.PP
.Vb 9
\&        while (<>) {
\&                chomp;
\&                PARSER: {
\&                        m/ \eG( \ed+\eb    )/gcx   && do { print "number: $1\en";  redo; };
\&                        m/ \eG( \ew+      )/gcx   && do { print "word:   $1\en";  redo; };
\&                        m/ \eG( \es+      )/gcx   && do { print "space:  $1\en";  redo; };
\&                        m/ \eG( [^\ew\ed]+ )/gcx   && do { print "other:  $1\en";  redo; };
\&                }
\&        }
.Ve
.PP
For each line, the \f(CW\*(C`PARSER\*(C'\fR loop first tries to match a series
of digits followed by a word boundary.  This match has to
start at the place the last match left off (or the beginning
of the string on the first match). Since
\&\f(CW\*(C`m/ \eG( \ed+\eb )/gcx\*(C'\fR uses the \f(CW\*(C`c\*(C'\fR flag, if the string does not match that
regular expression, perl does not reset \fIpos()\fR and the next
match starts at the same position to try a different
pattern.
.Sh "Are Perl regexes DFAs or NFAs?  Are they \s-1POSIX\s0 compliant?"
.IX Xref "DFA NFA POSIX"
.IX Subsection "Are Perl regexes DFAs or NFAs?  Are they POSIX compliant?"
While it's true that Perl's regular expressions resemble the DFAs
(deterministic finite automata) of the \fIegrep\fR\|(1) program, they are in
fact implemented as NFAs (non-deterministic finite automata) to allow
backtracking and backreferencing.  And they aren't POSIX-style either,
because those guarantee worst-case behavior for all cases.  (It seems
that some people prefer guarantees of consistency, even when what's
guaranteed is slowness.)
See the book \*(L"Mastering Regular Expressions\*(R"
(from O'Reilly) by Jeffrey Friedl for all the details you could ever
hope to know on these matters (a full citation appears in
perlfaq2).
.Sh "What's wrong with using grep in a void context?"
.IX Xref "grep"
.IX Subsection "What's wrong with using grep in a void context?"
The problem is that grep builds a return list, regardless of the context.
This means you're making Perl go to the trouble of building a list that
you then just throw away. If the list is large, you waste both time and space.
If your intent is to iterate over the list, then use a for loop for this
purpose.
.PP
In perls older than 5.8.1, map suffers from this problem as well.
But since 5.8.1, this has been fixed, and map is context aware \- in void
context, no lists are constructed.
.Sh "How can I match strings with multibyte characters?"
.IX Xref "regex, and multibyte characters regexp, and multibyte characters regular expression, and multibyte characters martian encoding, Martian"
.IX Subsection "How can I match strings with multibyte characters?"
Starting from Perl 5.6 Perl has had some level of multibyte character
support.  Perl 5.8 or later is recommended.  Supported multibyte
character repertoires include Unicode, and legacy encodings
through the Encode module.  See perluniintro, perlunicode,
and Encode.
.PP
If you are stuck with older Perls, you can do Unicode with the
\&\f(CW\*(C`Unicode::String\*(C'\fR module, and character conversions using the
\&\f(CW\*(C`Unicode::Map8\*(C'\fR and \f(CW\*(C`Unicode::Map\*(C'\fR modules.  If you are using
Japanese encodings, you might try using the jperl 5.005_03.
.PP
Finally, the following set of approaches was offered by Jeffrey
Friedl, whose article in issue #5 of The Perl Journal talks about
this very matter.
.PP
Let's suppose you have some weird Martian encoding where pairs of
\&\s-1ASCII\s0 uppercase letters encode single Martian letters (i.e.
the two
bytes \*(L"\s-1CV\s0\*(R" make a single Martian letter, as do the two bytes \*(L"\s-1SG\s0\*(R",
\&\*(L"\s-1VS\s0\*(R", \*(L"\s-1XX\s0\*(R", etc.). Other bytes represent single characters, just like
\&\s-1ASCII\s0.
.PP
So, the string of Martian \*(L"I am \s-1CVSGXX\s0!\*(R" uses 12 bytes to encode the
nine characters 'I', ' ', 'a', 'm', ' ', '\s-1CV\s0', '\s-1SG\s0', '\s-1XX\s0', '!'.
.PP
Now, say you want to search for the single character \f(CW\*(C`/GX/\*(C'\fR. Perl
doesn't know about Martian, so it'll find the two bytes \*(L"\s-1GX\s0\*(R" in the \*(L"I
am \s-1CVSGXX\s0!\*(R"  string, even though that character isn't there: it just
looks like it is because \*(L"\s-1SG\s0\*(R" is next to \*(L"\s-1XX\s0\*(R", but there's no real
\&\*(L"\s-1GX\s0\*(R".  This is a big problem.
.PP
Here are a few ways, all painful, to deal with it:
.PP
.Vb 2
\&        # Make sure adjacent "martian" bytes are no longer adjacent.
\&        $martian =~ s/([A\-Z][A\-Z])/ $1 /g;
\&
\&        print "found GX!\en" if $martian =~ /GX/;
.Ve
.PP
Or like this:
.PP
.Vb 6
\&        @chars = $martian =~ m/([A\-Z][A\-Z]|[^A\-Z])/g;
\&        # above is conceptually similar to:     @chars = $text =~ m/(.)/g;
\&        #
\&        foreach $char (@chars) {
\&        print "found GX!\en", last if $char eq \*(AqGX\*(Aq;
\&        }
.Ve
.PP
Or like this:
.PP
.Vb 3
\&        while ($martian =~ m/\eG([A\-Z][A\-Z]|.)/gs) {  # \eG probably unneeded
\&                print "found GX!\en", last if $1 eq \*(AqGX\*(Aq;
\&                }
.Ve
.PP
Here's another, slightly less painful, way to do it from Benjamin
Goldberg, who uses a zero-width negative look-behind assertion.
.PP
.Vb 5
\&        print "found GX!\en" if  $martian =~ m/
\&                (?<![A\-Z])
\&                (?:[A\-Z][A\-Z])*?
\&                GX
\&                /x;
.Ve
.PP
This succeeds if the \*(L"martian\*(R" character \s-1GX\s0 is in the string, and fails
otherwise.
If you don't like using (?<!), a zero-width negative
look-behind assertion, you can replace (?<![A\-Z]) with (?:^|[^A\-Z]).
.PP
That replacement has the drawback of putting the wrong thing in $\-[0] and $+[0],
but this usually can be worked around.
.Sh "How do I match a regular expression that's in a variable?"
.IX Xref "regex, in variable eval regex quotemeta \eQ, regex \eE, regex qr"
.IX Subsection "How do I match a regular expression that's in a variable?"
(contributed by brian d foy)
.PP
We don't have to hard-code patterns into the match operator (or
anything else that works with regular expressions). We can put the
pattern in a variable for later use.
.PP
The match operator is a double quote context, so you can interpolate
your variable just like a double quoted string. In this case, you
read the regular expression as user input and store it in \f(CW$regex\fR.
Once you have the pattern in \f(CW$regex\fR, you use that variable in the
match operator.
.PP
.Vb 1
\&        chomp( my $regex = <STDIN> );
\&
\&        if( $string =~ m/$regex/ ) { ... }
.Ve
.PP
Any regular expression special characters in \f(CW$regex\fR are still
special, and the pattern still has to be valid or Perl will complain.
For instance, in this pattern there is an unpaired parenthesis.
.PP
.Vb 1
\&        my $regex = "Unmatched ( paren";
\&
\&        "Two parens to bind them all" =~ m/$regex/;
.Ve
.PP
When Perl compiles the regular expression, it treats the parenthesis
as the start of a memory match. When it doesn't find the closing
parenthesis, it complains:
.PP
.Vb 1
\&        Unmatched ( in regex; marked by <\-\- HERE in m/Unmatched ( <\-\- HERE  paren/ at script line 3.
.Ve
.PP
You can get around this in several ways depending on your situation.
First, if you don't want any of the characters in the string to be
special, you can escape them with \f(CW\*(C`quotemeta\*(C'\fR before you use the string.
.PP
.Vb 2
\&        chomp( my $regex = <STDIN> );
\&        $regex = quotemeta( $regex );
\&
\&        if( $string =~ m/$regex/ ) { ... }
.Ve
.PP
You can also do this directly in the match operator using the \f(CW\*(C`\eQ\*(C'\fR
and \f(CW\*(C`\eE\*(C'\fR sequences. The \f(CW\*(C`\eQ\*(C'\fR tells Perl where to start escaping
special characters, and the \f(CW\*(C`\eE\*(C'\fR tells it where to stop (see perlop
for more details).
.PP
.Vb 1
\&        chomp( my $regex = <STDIN> );
\&
\&        if( $string =~ m/\eQ$regex\eE/ ) { ... }
.Ve
.PP
Alternatively, you can use \f(CW\*(C`qr//\*(C'\fR, the regular expression quote operator (see
perlop for more details).  It quotes and perhaps compiles the pattern,
and you can apply regular expression flags to the pattern.
.PP
.Vb 1
\&        chomp( my $input = <STDIN> );
\&
\&        my $regex = qr/$input/is;
\&
\&        $string =~ m/$regex/  # same as m/$input/is;
.Ve
.PP
You might also want to trap any errors by wrapping an \f(CW\*(C`eval\*(C'\fR block
around the whole thing.
.PP
.Vb 1
\&        chomp( my $input = <STDIN> );
\&
\&        eval {
\&                if( $string =~ m/\eQ$input\eE/ ) { ... }
\&                };
\&        warn $@ if $@;
.Ve
.PP
Or...
.PP
.Vb 7
\&        my $regex = eval { qr/$input/is };
\&        if( defined $regex ) {
\&                $string =~ m/$regex/;
\&                }
\&        else {
\&                warn $@;
\&                }
.Ve
.SH "REVISION"
.IX Header "REVISION"
Revision: \f(CW$Revision:\fR 10126 $
.PP
Date: \f(CW$Date:\fR 2007\-10\-27 21:29:20 +0200 (Sat, 27 Oct 2007) $
.PP
See perlfaq for source control details and availability.
.SH "AUTHOR AND COPYRIGHT"
.IX Header "AUTHOR AND COPYRIGHT"
Copyright (c) 1997\-2007 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.
.PP
This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.
.PP
Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit.
A simple comment in the code giving
credit would be courteous but is not required.
