[/ 
  Copyright 2006-2007 John Maddock.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]

[section:perl_syntax Perl Regular Expression Syntax]

[h3 Synopsis]

The Perl regular expression syntax is based on that used by the 
programming language Perl.  Perl regular expressions are the 
default behavior in Boost.Regex, or you can pass the flag `perl` to the 
[basic_regex] constructor, for example:

   // e1 is a case sensitive Perl regular expression: 
   // since Perl is the default option there's no need to explicitly specify the syntax used here:
   boost::regex e1(my_expression);
   // e2 is a case insensitive Perl regular expression:
   boost::regex e2(my_expression, boost::regex::perl|boost::regex::icase);

[h3 Perl Regular Expression Syntax]

In Perl regular expressions, all characters match themselves except for 
the following special characters:

[pre .\[{()\\\*+?|^$]

[h4 Wildcard]

The single character '.' when used outside of a character set will match 
any single character except:

* The NULL character when the [link boost_regex.ref.match_flag_type flag 
   `match_not_dot_null`] is passed to the matching algorithms.
* The newline character when the [link boost_regex.ref.match_flag_type
   flag `match_not_dot_newline`] is passed to 
   the matching algorithms.

[h4 Anchors]

A '^' character shall match the start of a line.

A '$' character shall match the end of a line.

[h4 Marked sub-expressions]

A section beginning `(` and ending `)` acts as a marked sub-expression.  
Whatever matched the sub-expression is split out in a separate field by 
the matching algorithms.  Marked sub-expressions can also be repeated, or 
referred to by a back-reference.

[h4 Non-marking grouping]

A marked sub-expression is useful to lexically group part of a regular 
expression, but has the side-effect of spitting out an extra field in 
the result.
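
To make the "extra field" concrete, here is a minimal sketch that collects every field reported for a match. It is written against `std::regex` through the alias `rx` so that it needs nothing beyond the standard library. The assumption is that including `<boost/regex.hpp>` and aliasing `rx` to `boost` gives the same results for patterns this simple. The helper name `match_fields` is hypothetical and not part of either library.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Assumption: for Boost.Regex, include <boost/regex.hpp> instead of <regex>
// and write `namespace rx = boost;`.
namespace rx = std;

// Hypothetical helper: returns every field of a successful whole-string match.
// Field 0 is the entire match; field n is what marked sub-expression n matched.
// Returns an empty vector when `text` does not match `pattern`.
std::vector<std::string> match_fields(const std::string& pattern,
                                      const std::string& text)
{
    std::vector<std::string> fields;
    const rx::regex e(pattern);
    rx::smatch what;
    if (rx::regex_match(text, what, e))
    {
        for (std::size_t i = 0; i < what.size(); ++i)
            fields.push_back(what[i].str());
    }
    return fields;
}
```

For instance, `match_fields("(a+)(b+)", "aaabb")` yields three fields: the whole match `aaabb`, followed by `aaa` and `bb` for the two marked sub-expressions.
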
As an alternative you can lexically group part of a regular expression, 
without generating a marked sub-expression, by using `(?:` and `)`; for 
example `(?:ab)+` will repeat `ab` without splitting out any separate 
sub-expressions.

[h4 Repeats]

Any atom (a single character, a marked sub-expression, or a character class) 
can be repeated with the `*`, `+`, `?`, and `{}` operators.

The `*` operator will match the preceding atom zero or more times, 
for example the expression `a*b` will match any of the following:

   b
   ab
   aaaaaaaab

The `+` operator will match the preceding atom one or more times, for 
example the expression `a+b` will match any of the following:

   ab
   aaaaaaaab

But will not match:

   b

The `?` operator will match the preceding atom zero or one times, for 
example the expression `ca?b` will match any of the following:

   cb
   cab

But will not match:

   caab

An atom can also be repeated with a bounded repeat:

`a{n}`  Matches 'a' repeated exactly n times.

`a{n,}`  Matches 'a' repeated n or more times.

`a{n, m}`  Matches 'a' repeated between n and m times inclusive.

For example:

[pre ^a{2,3}$]

Will match either of:

   aa
   aaa

But neither of:

   a
   aaaa

It is an error to use a repeat operator if the preceding construct cannot 
be repeated, for example:

   a(*)

Will raise an error, as there is nothing for the `*` operator to be applied to.

[h4 Non greedy repeats]

The normal repeat operators are "greedy", that is to say they will consume as 
much input as possible.
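
The repeat rules and the greedy behaviour described above can be checked with a short sketch. As an assumption made only to keep it dependency-free, it uses `std::regex` through the alias `rx`. Switching the alias to `boost` (with `<boost/regex.hpp>`) is expected to behave identically for these patterns. The helper names `matches` and `first_match` are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Assumption: for Boost.Regex, include <boost/regex.hpp> instead of <regex>
// and write `namespace rx = boost;`.
namespace rx = std;

// Hypothetical helper: true when the whole of `text` matches `pattern`.
bool matches(const std::string& pattern, const std::string& text)
{
    const rx::regex e(pattern);
    return rx::regex_match(text, e);
}

// Hypothetical helper: the first substring of `text` matched by `pattern`,
// or an empty string when nothing matches.
std::string first_match(const std::string& pattern, const std::string& text)
{
    const rx::regex e(pattern);
    rx::smatch what;
    return rx::regex_search(text, what, e) ? what[0].str() : std::string();
}
```

With these helpers, `matches("a*b", "b")` and `matches("^a{2,3}$", "aaa")` hold while `matches("a+b", "b")` and `matches("^a{2,3}$", "aaaa")` do not. Greediness shows up in `first_match("a{2,3}", "aaaa")`, which returns `aaa` rather than `aa`.
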
There are non-greedy versions available that will consume as little input as 
possible while still producing a match.

`*?` Matches the previous atom zero or more times, while consuming as little 
   input as possible.

`+?` Matches the previous atom one or more times, while consuming as 
   little input as possible.

`??` Matches the previous atom zero or one times, while consuming 
   as little input as possible.

`{n,}?` Matches the previous atom n or more times, while consuming as 
   little input as possible.

`{n,m}?` Matches the previous atom between n and m times, while 
   consuming as little input as possible.

[h4 Back references]

An escape character followed by a digit /n/, where /n/ is in the range 1-9, 
matches the same string that was matched by sub-expression /n/.  For example 
the expression:

[pre ^(a\*).\*\\1$]

Will match the string:

   aaabbaaa

But not the string:

   aaabba

[h4 Alternation]

The `|` operator will match either of its arguments, so for example: 
`abc|def` will match either "abc" or "def". 

Parentheses can be used to group alternations, for example: `ab(d|ef)` 
will match either of "abd" or "abef".

Empty alternatives are not allowed (these are almost always a mistake), but 
if you really want an empty alternative use `(?:)` as a placeholder, for example:

`|abc` is not a valid expression, but

`(?:)|abc` is and is equivalent, also the expression:

`(?:abc)??` has exactly the same effect.

[h4 Character sets]

A character set is a bracket-expression starting with `[` and ending with `]`; 
it defines a set of characters, and matches any single character that is a 
member of that set.

A bracket expression may contain any combination of the following:

[h5 Single characters]

For example `[abc]` will match any of the characters 'a', 'b', or 'c'.

[h5 Character ranges]

For example `[a-c]` will match any single character in the range 'a' to 'c'.
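
As a small illustration of single characters and ranges inside a set, the sketch below tests one character at a time against a bracket expression. It assumes `std::regex` behind the alias `rx` as a standard-library stand-in. With `<boost/regex.hpp>` and `rx` aliased to `boost` the same calls are expected to give the same answers for plain ASCII input. The helper name `in_set` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Assumption: for Boost.Regex, include <boost/regex.hpp> instead of <regex>
// and write `namespace rx = boost;`.
namespace rx = std;

// Hypothetical helper: true when the single character `c` is a member of the
// character set written as the bracket expression `set`, e.g. "[a-c]".
bool in_set(const std::string& set, char c)
{
    const rx::regex e(set);
    return rx::regex_match(std::string(1, c), e);
}
```

Here `in_set("[abc]", 'b')` and `in_set("[a-c]", 'c')` are true, while `in_set("[abc]", 'd')` and `in_set("[a-c]", 'd')` are false.
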
By default, for Perl regular expressions, a character x is within the 
range y to z if the code point of the character lies within the code points of 
the endpoints of the range.  Alternatively, if you set the 
[link boost_regex.ref.syntax_option_type.syntax_option_type_perl `collate` flag] 
when constructing the regular expression, then ranges are locale sensitive.

[h5 Negation]

If the bracket-expression begins with the `^` character, then it matches the 
complement of the characters it contains, for example `[^a-c]` matches 
any character that is not in the range `a-c`.

[h5 Character classes]

An expression of the form `[[:name:]]` matches the named character class 
"name", for example `[[:lower:]]` matches any lower case character.  
See [link boost_regex.syntax.character_classes character class names].

[h5 Collating Elements]

An expression of the form `[[.col.]]` matches the collating element /col/.  
A collating element is any single character, or any sequence of characters 
that collates as a single unit.  Collating elements may also be used 
as the end point of a range, for example: `[[.ae.]-c]` matches the 
character sequence "ae", plus any single character in the range "ae"-c, 
assuming that "ae" is treated as a single collating element in the current locale.

As an extension, a collating element may also be specified via its 
[link boost_regex.syntax.collating_names symbolic name], for example:

   [[.NUL.]]

matches a `\0` character.

[h5 Equivalence classes]

An expression of the form `[[=col=]]` matches any character or collating element 
whose primary sort key is the same as that for collating element /col/; as with 
collating elements, the name /col/ may be a 
[link boost_regex.syntax.collating_names symbolic name].
A primary sort key is one that ignores case, accentation, or locale-specific 
tailorings; so for example `[[=a=]]` matches any of the characters: 
a, '''&#xC0;''', '''&#xC1;''', '''&#xC2;''', 
'''&#xC3;''', '''&#xC4;''', '''&#xC5;''', A, '''&#xE0;''', '''&#xE1;''', 
'''&#xE2;''', '''&#xE3;''', '''&#xE4;''' and '''&#xE5;'''.  
Unfortunately implementation of this is reliant on the platform's collation 
and localisation support; this feature cannot be relied upon to work portably 
across all platforms, or even all locales on one platform.

[h5 Escaped Characters]

All the escape sequences that match a single character, or a single character 
class, are permitted within a character class definition.  For example 
`[\[\]]` would match either of `[` or `]` while `[\W\d]` would match any character 
that is either a "digit", /or/ is /not/ a "word" character.

[h5 Combinations]

All of the above can be combined in one character set declaration, for example: 
`[[:digit:]a-c[.NUL.]]`.

[h4 Escapes]

Any special character preceded by an escape shall match itself.

The following escape sequences are all synonyms for single characters: