<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link type="text/css" rel="stylesheet" href="style.css">
<!-- Generated by The Open Group's rhtm tool v1.2.1 -->
<!-- Copyright (c) 2001-2003 The Open Group, All Rights Reserved -->
<title>regcomp</title>
</head>
<body bgcolor="white">
<basefont size="3"> <a name="regcomp"></a> <a name="tag_03_603"></a><!-- regcomp --> <!--header start-->
<center><font size="2">The Open Group Base Specifications Issue 6<br>
IEEE Std 1003.1, 2003 Edition<br>
Copyright &copy; 2001-2003 The IEEE and The Open Group, All Rights reserved.</font></center>
<!--header end-->
<hr size="2" noshade>
<h4><a name="tag_03_603_01"></a>NAME</h4>
<blockquote>regcomp, regerror, regexec, regfree - regular expression matching</blockquote>
<h4><a name="tag_03_603_02"></a>SYNOPSIS</h4>
<blockquote class="synopsis">
<p><code><tt>#include &lt;<a href="../basedefs/regex.h.html">regex.h</a>&gt;<br>
<br>
 int regcomp(regex_t *restrict</tt> <i>preg</i><tt>, const char *restrict</tt> <i>pattern</i><tt>,<br>
 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int</tt> <i>cflags</i><tt>);<br>
 size_t regerror(int</tt> <i>errcode</i><tt>, const regex_t *restrict</tt> <i>preg</i><tt>,<br>
 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; char *restrict</tt> <i>errbuf</i><tt>, size_t</tt> <i>errbuf_size</i><tt>);<br>
 int regexec(const regex_t *restrict</tt> <i>preg</i><tt>, const char *restrict</tt> <i>string</i><tt>,<br>
 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; size_t</tt> <i>nmatch</i><tt>, regmatch_t</tt> <i>pmatch</i><tt>[restrict], int</tt> <i>eflags</i><tt>);<br>
 void regfree(regex_t *</tt><i>preg</i><tt>);<br>
</tt></code></p>
</blockquote>
<h4><a name="tag_03_603_03"></a>DESCRIPTION</h4>
<blockquote>
<p>These functions interpret <i>basic</i> and <i>extended</i> regular expressions as described in the Base Definitions volume of IEEE&nbsp;Std&nbsp;1003.1-2001, <a
href="../basedefs/xbd_chap09.html">Chapter 9, Regular Expressions</a>.</p>
<p>The <b>regex_t</b> structure is defined in <a href="../basedefs/regex.h.html"><i>&lt;regex.h&gt;</i></a> and contains at least the following member:</p>
<center>
<table border="1" cellpadding="3" align="center">
<tr valign="top">
<th align="center"><p class="tent"><b>Member Type</b></p></th>
<th align="center"><p class="tent"><b>Member Name</b></p></th>
<th align="center"><p class="tent"><b>Description</b></p></th>
</tr>
<tr valign="top">
<td align="left"><p class="tent">size_t</p></td>
<td align="left"><p class="tent">re_nsub</p></td>
<td align="left"><p class="tent">Number of parenthesized subexpressions.</p></td>
</tr>
</table>
</center>
<p>The <b>regmatch_t</b> structure is defined in <a href="../basedefs/regex.h.html"><i>&lt;regex.h&gt;</i></a> and contains at least the following members:</p>
<center>
<table border="1" cellpadding="3" align="center">
<tr valign="top">
<th align="center"><p class="tent"><b>Member Type</b></p></th>
<th align="center"><p class="tent"><b>Member Name</b></p></th>
<th align="center"><p class="tent"><b>Description</b></p></th>
</tr>
<tr valign="top">
<td align="left"><p class="tent"><b>regoff_t</b></p></td>
<td align="left"><p class="tent"><i>rm_so</i></p></td>
<td align="left"><p class="tent">Byte offset from start of <i>string</i> to start of substring.</p></td>
</tr>
<tr valign="top">
<td align="left"><p class="tent"><b>regoff_t</b></p></td>
<td align="left"><p class="tent"><i>rm_eo</i></p></td>
<td align="left"><p class="tent">Byte offset from start of <i>string</i> of the first character after the end of substring.</p></td>
</tr>
</table>
</center>
<p>The <i>regcomp</i>() function shall compile the regular expression contained in the string pointed to by the <i>pattern</i> argument and place the results in the structure pointed to by <i>preg</i>.
The <i>cflags</i> argument is the bitwise-inclusive OR of zero or more of the following flags, which are defined in the <a href="../basedefs/regex.h.html"><i>&lt;regex.h&gt;</i></a> header:</p>
<dl compact>
<dt>REG_EXTENDED</dt>
<dd>Use Extended Regular Expressions.</dd>
<dt>REG_ICASE</dt>
<dd>Ignore case in match. (See the Base Definitions volume of IEEE&nbsp;Std&nbsp;1003.1-2001, <a href="../basedefs/xbd_chap09.html">Chapter 9, Regular Expressions</a>.)</dd>
<dt>REG_NOSUB</dt>
<dd>Report only success/fail in <i>regexec</i>().</dd>
<dt>REG_NEWLINE</dt>
<dd>Change the handling of &lt;newline&gt;s, as described in the text.</dd>
</dl>
<p>The default regular expression type for <i>pattern</i> is a Basic Regular Expression. The application can specify Extended Regular Expressions using the REG_EXTENDED <i>cflags</i> flag.</p>
<p>If the REG_NOSUB flag was not set in <i>cflags</i>, then <i>regcomp</i>() shall set <i>re_nsub</i> to the number of parenthesized subexpressions (delimited by <tt>"\(\)"</tt> in basic regular expressions or <tt>"()"</tt> in extended regular expressions) found in <i>pattern</i>.</p>
<p>The <i>regexec</i>() function compares the null-terminated string specified by <i>string</i> with the compiled regular expression <i>preg</i> initialized by a previous call to <i>regcomp</i>(). If it finds a match, <i>regexec</i>() shall return 0; otherwise, it shall return non-zero indicating either no match or an error. The <i>eflags</i> argument is the bitwise-inclusive OR of zero or more of the following flags, which are defined in the <a href="../basedefs/regex.h.html"><i>&lt;regex.h&gt;</i></a> header:</p>
<dl compact>
<dt>REG_NOTBOL</dt>
<dd>The first character of the string pointed to by <i>string</i> is not the beginning of the line.
Therefore, the circumflex character ( <tt>'^'</tt> ), when taken as a special character, shall not match the beginning of <i>string</i>.</dd>
<dt>REG_NOTEOL</dt>
<dd>The last character of the string pointed to by <i>string</i> is not the end of the line. Therefore, the dollar sign ( <tt>'$'</tt> ), when taken as a special character, shall not match the end of <i>string</i>.</dd>
</dl>
<p>If <i>nmatch</i> is 0 or REG_NOSUB was set in the <i>cflags</i> argument to <i>regcomp</i>(), then <i>regexec</i>() shall ignore the <i>pmatch</i> argument. Otherwise, the application shall ensure that the <i>pmatch</i> argument points to an array with at least <i>nmatch</i> elements, and <i>regexec</i>() shall fill in the elements of that array with offsets of the substrings of <i>string</i> that correspond to the parenthesized subexpressions of <i>pattern</i>: <i>pmatch</i>[<i>i</i>].<i>rm_so</i> shall be the byte offset of the beginning and <i>pmatch</i>[<i>i</i>].<i>rm_eo</i> shall be one greater than the byte offset of the end of substring <i>i</i>. (Subexpression <i>i</i> begins at the <i>i</i>th matched open parenthesis, counting from 1.) Offsets in <i>pmatch</i>[0] identify the substring that corresponds to the entire regular expression. Unused elements of <i>pmatch</i> up to <i>pmatch</i>[<i>nmatch</i>-1] shall be filled with -1. If there are more than <i>nmatch</i> subexpressions in <i>pattern</i> (<i>pattern</i> itself counts as a subexpression), then <i>regexec</i>() shall still do the match, but shall record only the first <i>nmatch</i> substrings.</p>
<p>When matching a basic or extended regular expression, any given parenthesized subexpression of <i>pattern</i> might participate in the match of several different substrings of <i>string</i>, or it might not match any substring even though the pattern as a whole did match.
The following rules shall be used to determine which substrings to report in <i>pmatch</i> when matching regular expressions:</p>
<ol>
<li>
<p>If subexpression <i>i</i> in a regular expression is not contained within another subexpression, and it participated in the match several times, then the byte offsets in <i>pmatch</i>[<i>i</i>] shall delimit the last such match.</p>
</li>
<li>
<p>If subexpression <i>i</i> is not contained within another subexpression, and it did not participate in an otherwise successful match, the byte offsets in <i>pmatch</i>[<i>i</i>] shall be -1. A subexpression does not participate in the match when:</p>
<blockquote><tt>'*'</tt> or <tt>"\{\}"</tt> appears immediately after the subexpression in a basic regular expression, or <tt>'*'</tt>, <tt>'?'</tt>, or <tt>"{}"</tt> appears immediately after the subexpression in an extended regular expression, and the subexpression did not match (matched 0 times)</blockquote>
<p>or:</p>
<blockquote><tt>'|'</tt> is used in an extended regular expression to select this subexpression or another, and the other subexpression matched.</blockquote>
</li>
<li>
<p>If subexpression <i>i</i> is contained within another subexpression <i>j</i>, and <i>i</i> is not contained within any other subexpression that is contained within <i>j</i>, and a match of subexpression <i>j</i> is reported in <i>pmatch</i>[<i>j</i>], then the match or non-match of subexpression <i>i</i> reported in <i>pmatch</i>[<i>i</i>] shall be as described in 1. and 2. above, but within the substring reported in <i>pmatch</i>[<i>j</i>] rather than the whole string.
The offsets in <i>pmatch</i>[<i>i</i>] are still relative to the start of <i>string</i>.</p>
</li>
<li>
<p>If subexpression <i>i</i> is contained in subexpression <i>j</i>, and the byte offsets in <i>pmatch</i>[<i>j</i>] are -1, then the pointers in <i>pmatch</i>[<i>i</i>] shall also be -1.</p>
</li>
<li>
<p>If subexpression <i>i</i> matched a zero-length string, then both byte offsets in <i>pmatch</i>[<i>i</i>] shall be the byte offset of the character or null terminator immediately following the zero-length string.</p>
</li>
</ol>
<p>If, when <i>regexec</i>() is called, the locale is different from when the regular expression was compiled, the result is undefined.</p>
<p>If REG_NEWLINE is not set in <i>cflags</i>, then a &lt;newline&gt; in <i>pattern</i> or <i>string</i> shall be treated as an ordinary character. If REG_NEWLINE is set, then &lt;newline&gt; shall be treated as an ordinary character except as follows:</p>
<ol>
<li>
<p>A &lt;newline&gt; in <i>string</i> shall not be matched by a period outside a bracket expression or by any form of a non-matching list (see the Base Definitions volume of IEEE&nbsp;Std&nbsp;1003.1-2001, <a href="../basedefs/xbd_chap09.html">Chapter 9, Regular Expressions</a>).</p>
</li>
<li>
<p>A circumflex ( <tt>'^'</tt> ) in <i>pattern</i>, when used to specify expression anchoring (see the Base Definitions volume of IEEE&nbsp;Std&nbsp;1003.1-2001, <a href="../basedefs/xbd_chap09.html#tag_09_03_08">Section 9.3.8, BRE Expression Anchoring</a>), shall match the zero-length string immediately after a &lt;newline&gt; in <i>string</i>, regardless of the setting of REG_NOTBOL.</p>
</li>
<li>
<p>A dollar sign ( <tt>'$'</tt> ) in <i>pattern</i>, when used to specify expression anchoring, shall match the zero-length string immediately before a &lt;newline&gt; in <i>string</i>, regardless of the setting of REG_NOTEOL.</p>
</li>
</ol>
<p>The <i>regfree</i>() function frees any memory allocated by <i>regcomp</i>() associated with <i>preg</i>.</p>
<p>The following constants
are defined as error return values:</p><dl compact><dt>REG_NOMATCH</dt><dd><i>regexec</i>() failed to match.</dd><dt>REG_BADPAT</dt><dd>Invalid regular expression.</dd><dt>REG_ECOLLATE</dt><dd>Invalid collating element referenced.</dd><dt>REG_ECTYPE</dt><dd>Invalid character class type referenced.</dd><dt>REG_EESCAPE</dt><dd>Trailing <tt>'\'</tt> in pattern.</dd><dt>REG_ESUBREG</dt><dd>Number in <tt>"\digit"</tt> invalid or in error.</dd><dt>REG_EBRACK</dt><dd><tt>"[]"</tt> imbalance.</dd><dt>REG_EPAREN</dt><dd><tt>"\(\)"</tt> or <tt>"()"</tt> imbalance.</dd><dt>REG_EBRACE</dt><dd><tt>"\{\}"</tt> imbalance.</dd><dt>REG_BADBR</dt><dd>Content of <tt>"\{\}"</tt> invalid: not a number, number too large, more than two numbers, first larger than second.</dd><dt>REG_ERANGE</dt>