<HTML><HEAD><TITLE> Regular Expressions </TITLE></HEAD><BODY>
<A NAME="Regular_Expressions"></A><H1> Regular Expressions </H1>
<A NAME="Basic_Regular_Expression_Syntax"></A><H2> Basic Regular Expression Syntax </H2>
<P>Regular expressions (regex's) are useful as a way to match inexact sequences of characters. They can be used in the `Find...' and `Replace...' search dialogs and are at the core of Color Syntax Highlighting patterns. To specify a regular expression in a search dialog, simply click on the `Regular Expression' radio button in the dialog.</P>
<P>A regex is a specification of a pattern to be matched in the searched text. This pattern consists of a sequence of tokens, each being able to match a single character or a sequence of characters in the text, or assert that a specific position within the text has been reached (the latter is called an anchor.) Tokens (also called atoms) can be modified by adding one of a number of special quantifier tokens immediately after the token. A quantifier token specifies how many times the previous token must be matched (see below.)</P>
<P>Tokens can be grouped together using one of a number of grouping constructs, the most common being plain parentheses. Tokens that are grouped in this way are also collectively considered to be a regex atom, since this new larger atom may also be modified by a quantifier.</P>
<P>A regex can also be organized into a list of alternatives by separating each alternative with pipe characters, `|'. This is called alternation. A match will be attempted for each alternative listed, in the order specified, until a match results or the list of alternatives is exhausted (see <A HREF="#alternation">Alternation</A> section below.)</P>
<P><H3>The 'Any' Character</H3></P>
<P>If a dot (`.') appears in a regex, it means to match any character exactly once.
By default, dot will not match a newline character, but this behavior can be changed (see help topic <A HREF="parenConstructs.html#Parenthetical_Constructs">Parenthetical Constructs</A>, under the heading, Matching Newlines).</P>
<P><H3>Character Classes</H3></P>
<P>A character class, or range, matches exactly one character of text, but the candidates for matching are limited to those specified by the class. Classes come in two flavors as described below:</P>
<P><PRE>
    [...]   Regular class, match only characters listed.
    [^...]  Negated class, match only characters NOT listed.
</PRE></P>
<P>As with the dot token, by default negated character classes do not match newline, but can be made to do so.</P>
<P>The characters that are considered special within a class specification are different than the rest of regex syntax as follows. If the first character in a class is the `]' character (second character if the first character is `^') it is a literal character and part of the class character set. This also applies if the first or last character is `-'. Outside of these rules, two characters separated by `-' form a character range which includes all the characters between the two characters as well. For example, `[^f-j]' is the same as `[^fghij]' and means to match any character that is not `f', `g', `h', `i', or `j'.</P>
<P><H3>Anchors</H3></P>
<P>Anchors are assertions that you are at a very specific position within the search text. NEdit regular expressions support the following anchor tokens:</P>
<P><PRE>
    ^    Beginning of line
    $    End of line
    &lt;    Left word boundary
    &gt;    Right word boundary
    \B   Not a word boundary
</PRE></P>
<P>Note that the \B token ensures that the left and right characters are both delimiter characters, or that both left and right characters are non-delimiter characters. Currently word anchors check only one character, e.g. the left word anchor `&lt;' only asserts that the left character is a word delimiter character.
Similarly the right word anchor checks the right character.</P>
<P><H3>Quantifiers</H3></P>
<P>Quantifiers specify how many times the previous regular expression atom may be matched in the search text. Some quantifiers can produce a large performance penalty, and can in some instances completely lock up NEdit. To prevent this, avoid nested quantifiers, especially those of the maximal matching type (see below.)</P>
<P>The following quantifiers are maximal matching, or "greedy", in that they match as much text as possible.</P>
<P><PRE>
    *   Match zero or more
    +   Match one or more
    ?   Match zero or one
</PRE></P>
<P>The following quantifiers are minimal matching, or "lazy", in that they match as little text as possible.</P>
<P><PRE>
    *?   Match zero or more
    +?   Match one or more
    ??   Match zero or one
</PRE></P>
<P>One final quantifier is the counting quantifier, or brace quantifier. It takes the following basic form:</P>
<P><PRE>
    {min,max}  Match from `min' to `max' times the
               previous regular expression atom.
</PRE></P>
<P>If `min' is omitted, it is assumed to be zero. If `max' is omitted, it is assumed to be infinity. Whether specified or assumed, `min' must be less than or equal to `max'. Note that both `min' and `max' are limited to 65535. If both are omitted, then the construct is the same as `*'. Note that `{,}' and `{}' are both valid brace constructs. A single number appearing without a comma, e.g. `{3}' is short for the `{min,min}' construct, or to match exactly `min' number of times.</P>
<P>The quantifiers `{1}' and `{1,1}' are accepted by the syntax, but are optimized away since they mean to match exactly once, which is redundant information. Also, for efficiency, certain combinations of `min' and `max' are converted to either `*', `+', or `?' as follows:</P>
<P><PRE>
    {} {,} {0,}    *
    {1,}           +
    {,1} {0,1}     ?
</PRE></P>
<P>Note that {0} and {0,0} are meaningless and will generate an error message at regular expression compile time.</P>
<P>Brace quantifiers can also be "lazy". For example {2,5}?
would try to match 2 times if possible, and will only match 3, 4, or 5 times if that is what is necessary to achieve an overall match.</P>
<P><H3>Alternation</H3></P>
<P>A series of alternative patterns to match can be specified by separating them <A NAME="alternation"></A>with vertical pipes, `|'. An example of alternation would be `a|be|sea'. This will match `a', or `be', or `sea'. Each alternative can be an arbitrarily complex regular expression. The alternatives are attempted in the order specified. An empty alternative can be specified if desired, e.g. `a|b|'. Since an empty alternative can match nothingness (the empty string), this guarantees that the expression will match.</P>
<P><H3>Comments</H3></P>
<P>Comments are of the form `(?#&lt;comment text&gt;)' and can be inserted anywhere and have no effect on the execution of the regular expression. They can be handy for documenting very complex regular expressions. Note that a comment begins with `(?#' and ends at the first occurrence of an ending parenthesis, or the end of the regular expression... period. Comments do not recognize any escape sequences.</P>
<P><HR></P>
<P></P>
</BODY></HTML>