text::balanced.3
substring that matched a tagged text (including the start and end
tags). \f(CW\*(C`undef\*(C'\fR is returned on failure. In addition, the original input
text has the returned substring (and any prefix) removed from it.
.PP
In a void context, the input text just has the matched substring (and
any specified prefix) removed.
.ie n .Sh """gen_extract_tagged"""
.el .Sh "\f(CWgen_extract_tagged\fP"
.IX Subsection "gen_extract_tagged"
(Note: This subroutine is only available under Perl5.005)
.PP
\&\f(CW\*(C`gen_extract_tagged\*(C'\fR generates a new anonymous subroutine which
extracts text between (balanced) specified tags. In other words,
it generates a function identical in function to \f(CW\*(C`extract_tagged\*(C'\fR.
.PP
The difference between \f(CW\*(C`extract_tagged\*(C'\fR and the anonymous
subroutines generated by
\&\f(CW\*(C`gen_extract_tagged\*(C'\fR, is that those generated subroutines:
.IP "\(bu" 4
do not have to reparse tag specification or parsing options every time
they are called (whereas \f(CW\*(C`extract_tagged\*(C'\fR has to effectively rebuild
its tag parser on every call);
.IP "\(bu" 4
make use of the new qr// construct to pre-compile the regexes they use
(whereas \f(CW\*(C`extract_tagged\*(C'\fR uses standard string variable interpolation
to create tag-matching patterns).
.PP
The subroutine takes up to four optional arguments (the same set as
\&\f(CW\*(C`extract_tagged\*(C'\fR except for the string to be processed).
It returns
a reference to a subroutine which in turn takes a single argument (the text to
be extracted from).
.PP
In other words, the implementation of \f(CW\*(C`extract_tagged\*(C'\fR is exactly
equivalent to:
.PP
.Vb 6
\&    sub extract_tagged
\&    {
\&        my $text = shift;
\&        my $extractor = gen_extract_tagged(@_);
\&        return $extractor\->($text);
\&    }
.Ve
.PP
(although \f(CW\*(C`extract_tagged\*(C'\fR is not currently implemented that way, in order
to preserve pre\-5.005 compatibility).
.PP
Using \f(CW\*(C`gen_extract_tagged\*(C'\fR to create extraction functions for specific tags
is a good idea if those functions are going to be called more than once, since
their performance is typically twice as good as the more general-purpose
\&\f(CW\*(C`extract_tagged\*(C'\fR.
.ie n .Sh """extract_quotelike"""
.el .Sh "\f(CWextract_quotelike\fP"
.IX Subsection "extract_quotelike"
\&\f(CW\*(C`extract_quotelike\*(C'\fR attempts to recognize, extract, and segment any
one of the various Perl quotes and quotelike operators (see
\&\fIperlop\fR\|(3)). Nested backslashed delimiters, embedded balanced bracket
delimiters (for the quotelike operators), and trailing modifiers are
all caught. For example, in:
.PP
.Vb 1
\&    extract_quotelike \*(Aqq # an octothorpe: \e# (not the end of the q!) #\*(Aq
\&
\&    extract_quotelike \*(Aq "You said, \e"Use sed\e"." \*(Aq
\&
\&    extract_quotelike \*(Aq s{([A\-Z]{1,8}\e.[A\-Z]{3})} /\eL$1\eE/; \*(Aq
\&
\&    extract_quotelike \*(Aq tr/\e\e\e/\e\e\e\e/\e\e\e//ds; \*(Aq
.Ve
.PP
the full Perl quotelike operations are all extracted correctly.
.PP
Note too that, when using the /x modifier on a regex, any comment
containing the current pattern delimiter will cause the regex to be
immediately terminated.
In other words:
.PP
.Vb 5
\&    \*(Aqm /
\&        (?i)        # CASE INSENSITIVE
\&        [a\-z_]      # LEADING ALPHABETIC/UNDERSCORE
\&        [a\-z0\-9]*   # FOLLOWED BY ANY NUMBER OF ALPHANUMERICS
\&    /x\*(Aq
.Ve
.PP
will be extracted as if it were:
.PP
.Vb 3
\&    \*(Aqm /
\&        (?i)        # CASE INSENSITIVE
\&        [a\-z_]      # LEADING ALPHABETIC/\*(Aq
.Ve
.PP
This behaviour is identical to that of the actual compiler.
.PP
\&\f(CW\*(C`extract_quotelike\*(C'\fR takes two arguments: the text to be processed and
a prefix to be matched at the very beginning of the text. If no prefix
is specified, optional whitespace is the default. If no text is given,
\&\f(CW$_\fR is used.
.PP
In a list context, an array of 11 elements is returned. The elements are:
.IP "[0]" 4
.IX Item "[0]"
the extracted quotelike substring (including trailing modifiers),
.IP "[1]" 4
.IX Item "[1]"
the remainder of the input text,
.IP "[2]" 4
.IX Item "[2]"
the prefix substring (if any),
.IP "[3]" 4
.IX Item "[3]"
the name of the quotelike operator (if any),
.IP "[4]" 4
.IX Item "[4]"
the left delimiter of the first block of the operation,
.IP "[5]" 4
.IX Item "[5]"
the text of the first block of the operation
(that is, the contents of
a quote, the regex of a match or substitution or the target list of a
translation),
.IP "[6]" 4
.IX Item "[6]"
the right delimiter of the first block of the operation,
.IP "[7]" 4
.IX Item "[7]"
the left delimiter of the second block of the operation
(that is, if it is a \f(CW\*(C`s\*(C'\fR, \f(CW\*(C`tr\*(C'\fR, or \f(CW\*(C`y\*(C'\fR),
.IP "[8]" 4
.IX Item "[8]"
the text of the second block of the operation
(that is, the replacement of a substitution or the translation list
of a translation),
.IP "[9]" 4
.IX Item "[9]"
the right delimiter of the second block of the operation (if any),
.IP "[10]" 4
.IX Item "[10]"
the trailing modifiers on the operation (if any).
.PP
For each of the fields marked \*(L"(if any)\*(R" the default value on success is
an empty string.
On failure, all of these values (except the remaining text) are \f(CW\*(C`undef\*(C'\fR.
.PP
In a scalar context,
\&\f(CW\*(C`extract_quotelike\*(C'\fR returns just the complete substring
that matched a quotelike operation (or \f(CW\*(C`undef\*(C'\fR on failure). In a scalar or
void context, the input text has the same substring (and any specified
prefix) removed.
.PP
Examples:
.PP
.Vb 1
\&    # Remove the first quotelike literal that appears in text
\&
\&    $quotelike = extract_quotelike($text,\*(Aq.*?\*(Aq);
\&
\&    # Replace one or more leading whitespace\-separated quotelike
\&    # literals in $_ with "<QLL>"
\&
\&    do { $_ = join \*(Aq<QLL>\*(Aq, (extract_quotelike)[2,1] } until $@;
\&
\&
\&    # Isolate the search pattern in a quotelike operation from $text
\&
\&    ($op,$pat) = (extract_quotelike $text)[3,5];
\&    if ($op =~ /[ms]/)
\&    {
\&        print "search pattern: $pat\en";
\&    }
\&    else
\&    {
\&        print "$op is not a pattern matching operation\en";
\&    }
.Ve
.ie n .Sh """extract_quotelike"" and ""here documents"""
.el .Sh "\f(CWextract_quotelike\fP and ``here documents''"
.IX Subsection "extract_quotelike and here documents"
\&\f(CW\*(C`extract_quotelike\*(C'\fR can successfully extract \*(L"here documents\*(R" from an input
string, but with an important caveat in list contexts.
.PP
Unlike other types of quote-like literals, a here document is rarely
a contiguous substring. For example, a typical piece of code using
a here document might look like this:
.PP
.Vb 4
\&    <<\*(AqEOMSG\*(Aq || die;
\&    This is the message.
\&    EOMSG
\&    exit;
.Ve
.PP
Given this as an input string in a scalar context, \f(CW\*(C`extract_quotelike\*(C'\fR
would correctly return the string \*(L"<<'\s-1EOMSG\s0'\enThis is the message.\enEOMSG\*(R",
leaving the string \*(L" || die;\enexit;\*(R" in the original variable. In other words,
the two separate pieces of the here document are successfully extracted and
concatenated.
.PP
In a list context, \f(CW\*(C`extract_quotelike\*(C'\fR would return the list
.IP "[0]" 4
.IX Item "[0]"
\&\*(L"<<'\s-1EOMSG\s0'\enThis is the message.\enEOMSG\en\*(R" (i.e. the full extracted here document,
including fore and aft delimiters),
.IP "[1]" 4
.IX Item "[1]"
\&\*(L" || die;\enexit;\*(R" (i.e. the remainder of the input text, concatenated),
.IP "[2]" 4
.IX Item "[2]"
"" (i.e. the prefix substring \*(-- trivial in this case),
.IP "[3]" 4
.IX Item "[3]"
\&\*(L"<<\*(R" (i.e. the \*(L"name\*(R" of the quotelike operator),
.IP "[4]" 4
.IX Item "[4]"
\&\*(L"'\s-1EOMSG\s0'\*(R" (i.e. the left delimiter of the here document, including any quotes),
.IP "[5]" 4
.IX Item "[5]"
\&\*(L"This is the message.\en\*(R" (i.e. the text of the here document),
.IP "[6]" 4
.IX Item "[6]"
\&\*(L"\s-1EOMSG\s0\*(R" (i.e. the right delimiter of the here document),
.IP "[7..10]" 4
.IX Item "[7..10]"
"" (a here document has no second left delimiter, second text, second right
delimiter, or trailing modifiers).
.PP
However, the matching position of the input variable would be set to
\&\*(L"exit;\*(R" (i.e. \fIafter\fR the closing delimiter of the here document),
which would cause the earlier \*(L" || die;\enexit;\*(R" to be skipped in any
sequence of code fragment extractions.
.PP
To avoid this problem, when it encounters a here document whilst
extracting from a modifiable string, \f(CW\*(C`extract_quotelike\*(C'\fR silently
rearranges the string to an equivalent piece of Perl:
.PP
.Vb 5
\&    <<\*(AqEOMSG\*(Aq
\&    This is the message.
\&    EOMSG
\&    || die;
\&    exit;
.Ve
.PP
in which the here document \fIis\fR contiguous.
It still leaves the
matching position after the here document, but now the rest of the line
on which the here document starts is not skipped.
.PP
To prevent \f(CW\*(C`extract_quotelike\*(C'\fR from mucking about with the input in this way
(this is the only case where a list-context \f(CW\*(C`extract_quotelike\*(C'\fR does so),
you can pass the input variable as an interpolated literal:
.PP
.Vb 1
\&    $quotelike = extract_quotelike("$var");
.Ve
.ie n .Sh """extract_codeblock"""
.el .Sh "\f(CWextract_codeblock\fP"
.IX Subsection "extract_codeblock"
\&\f(CW\*(C`extract_codeblock\*(C'\fR attempts to recognize and extract a balanced
bracket delimited substring that may contain unbalanced brackets
inside Perl quotes or quotelike operations. That is, \f(CW\*(C`extract_codeblock\*(C'\fR
is like a combination of \f(CW"extract_bracketed"\fR and
\&\f(CW"extract_quotelike"\fR.
.PP
\&\f(CW\*(C`extract_codeblock\*(C'\fR takes the same initial three parameters as \f(CW\*(C`extract_bracketed\*(C'\fR:
a text to process, a set of delimiter brackets to look for, and a prefix to
match first. It also takes an optional fourth parameter, which allows the
outermost delimiter brackets to be specified separately (see below).
.PP
Omitting the first argument (input text) means process \f(CW$_\fR instead.
Omitting the second argument (delimiter brackets) indicates that only \f(CW\*(Aq{\*(Aq\fR is to be used.
Omitting the third argument (prefix argument) implies optional whitespace at the start.
Omitting the fourth argument (outermost delimiter brackets) indicates that the
value of the second argument is to be used for the outermost delimiters.
.PP
Once the prefix and the outermost opening delimiter bracket have been
recognized, code blocks are extracted by stepping through the input text and
trying the following alternatives in sequence:
.IP "1." 4
Try to match a closing delimiter bracket. If the bracket was the same
species as the last opening bracket, return the substring to that
point. If the bracket was mismatched, return an error.
.IP "2." 4
Try to match a quote or quotelike operator. If found, call
\&\f(CW\*(C`extract_quotelike\*(C'\fR to eat it. If \f(CW\*(C`extract_quotelike\*(C'\fR fails, return
the error it returned. Otherwise go back to step 1.
.IP "3." 4
Try to match an opening delimiter bracket. If found, call
\&\f(CW\*(C`extract_codeblock\*(C'\fR recursively to eat the embedded block. If the
recursive call fails, return an error. Otherwise, go back to step 1.
.IP "4." 4
Unconditionally match a bareword or any other single character, and
then go back to step 1.
.PP
Examples:
.PP
.Vb 1
\&    # Find a while loop in the text
\&
\&    if ($text =~ s/.*?while\es*\e{/{/)
\&    {
\&        $loop = "while " . extract_codeblock($text);
\&    }
\&
\&    # Remove the first round\-bracketed list (which may include
\&    # round\- or curly\-bracketed code blocks or quotelike operators)
\&
\&    extract_codeblock $text, "(){}", \*(Aq[^(]*\*(Aq;
.Ve
.PP
The ability to specify a different outermost delimiter bracket is useful
in some circumstances. For example, in the Parse::RecDescent module,
parser actions which are to be performed only on a successful parse
are specified using a \f(CW\*(C`<defer:...>\*(C'\fR directive. For example:
.PP
.Vb 2
\&    sentence: subject verb object
\&        <defer: {$::theVerb = $item{verb}} >
.Ve
.PP
Parse::RecDescent uses \f(CW\*(C`extract_codeblock($text, \*(Aq{}<>\*(Aq)\*(C'\fR to extract the code
within the \f(CW\*(C`<defer:...>\*(C'\fR directive, but there's a problem.
.PP
A deferred action like this:
.PP
.Vb 1
\&    <defer: {if ($count>10) {$count\-\-}} >
.Ve
.PP
will be incorrectly parsed as:
.PP
.Vb 1
\&    <defer: {if ($count>
.Ve
.PP
because the \*(L"greater than\*(R" operator is interpreted as a closing delimiter.
.PP
But, by extracting the directive using
\&\f(CW\*(C`extract_codeblock($text,\ \*(Aq{}\*(Aq,\ undef,\ \*(Aq<>\*(Aq)\*(C'\fR
the '>' character is only treated as a delimiter at the outermost
level of the code block, so the directive is parsed correctly.
.ie n .Sh """extract_multiple"""
.el .Sh "\f(CWextract_multiple\fP"