📄 awk.html
input file. The name "-" indicates the standard input. If an argument matches the format of an <i>assignment</i> operand, this argument will be treated as an assignment rather than a <i>file</i> argument.
<dt><b>CONVFMT</b><dd>The <b>printf</b> format for converting numbers to strings (except for output statements, where <b>OFMT</b> is used); %.6g by default.
<dt><b>ENVIRON</b><dd>The variable <b>ENVIRON</b> is an array representing the value of the environment, as described in the <b>XSH</b> specification under the <i>exec</i> functions. The indices of the array are strings consisting of the names of the environment variables, and the value of each array element is a string consisting of the value of that variable. If the value of an environment variable is considered a <i>numeric string</i> (see <xref href=awkexpr><a href="#tag_000_000_108_002">Expressions in awk</a></xref>), the array element will also have its numeric value. In all cases where the behaviour of <i>awk</i> is affected by environment variables (including the environment of any commands that <i>awk</i> executes via the <b>system</b> function or via pipeline redirections with the <b>print</b> statement, the <b>printf</b> statement, or the <b>getline</b> function), the environment used will be the environment at the time <i>awk</i> began executing; it is implementation-dependent whether any modification of <b>ENVIRON</b> affects this environment.
<dt><b>FILENAME</b><dd>A pathname of the current input file. Inside a <b>BEGIN</b> action the value is undefined. Inside an <b>END</b> action the value is the name of the last input file processed.
<dt><b>FNR</b><dd>The ordinal number of the current record in the current file. Inside a <b>BEGIN</b> action the value is zero. Inside an <b>END</b> action the value is the number of the last record processed in the last file processed.
<dt><b>FS</b><dd><index term="regular expressions"></index>Input field separator regular expression; a space character by default.
<dt><b>NF</b><dd>The number of fields in the current record. Inside
a <b>BEGIN</b> action, the use of <b>NF</b> is undefined unless a <b>getline</b> function without a <i>var</i> argument is executed previously. Inside an <b>END</b> action, <b>NF</b> will retain the value it had for the last record read, unless a subsequent, redirected, <b>getline</b> function without a <i>var</i> argument is performed prior to entering the <b>END</b> action.
<dt><b>NR</b><dd>The ordinal number of the current record from the start of input. Inside a <b>BEGIN</b> action the value is zero. Inside an <b>END</b> action the value is the number of the last record processed.
<dt><b>OFMT</b><dd>The <b>printf</b> format for converting numbers to strings in output statements (see <xref href=awkout><a href="#tag_000_000_108_010">Output Statements</a></xref>); %.6g by default. The result of the conversion is unspecified if the value of <b>OFMT</b> is not a floating-point format specification.
<dt><b>OFS</b><dd>The <b>print</b> statement output field separation; a space character by default.
<dt><b>ORS</b><dd>The <b>print</b> statement output record separator; a newline character by default.
<dt><b>RLENGTH</b><dd>The length of the string matched by the <b>match</b> function.
<dt><b>RS</b><dd>The first character of the string value of <b>RS</b> is the input record separator; a newline character by default. If <b>RS</b> contains more than one character, the results are unspecified. If <b>RS</b> is null, then records are separated by sequences of one or more blank lines, leading or trailing blank lines do not result in empty records at the beginning or end of the input, and a newline character is always a field separator, no matter what the value of <b>FS</b> is.
<dt><b>RSTART</b><dd>The starting position of the string matched by the <b>match</b> function, numbering from 1. This is always equivalent to the return value of the <b>match</b> function.
<dt><b>SUBSEP</b><dd>The subscript separator string for multi-dimensional arrays; the default value is implementation-dependent.
</dl>
<h5><a name = "tag_000_000_108_004">&nbsp;</a>Regular
Expressions</h5><xref type="5" name="awkre"></xref>The <i>awk</i> utility makes use of the extended regular expression notation (see the <b>XBD</b> specification, <a href="../xbd/re.html#tag_007_004"><b>Extended Regular Expressions</b>&nbsp;</a> ) except that it will allow the use of C-language conventions for escaping special characters within the EREs, as specified in the table in the <b>XBD</b> specification, <a href="../xbd/notation.html"><b>File Format Notation</b>&nbsp;</a> (\\, \a, \b, \f, \n, \r, \t, \v) and the following table; these escape sequences will be recognised both inside and outside bracket expressions. Note that records need not be separated by newline characters and string constants can contain newline characters, so even the \n sequence is valid in <i>awk</i> EREs. Using a slash character within the regular expression requires the escaping shown in the following table:
<pre><table  bordercolor=#000000 border=1 align=center>
<tr valign=top><th align=center><b>Escape Sequence</b><th align=center><b>Description</b><th align=center><b>Meaning</b>
<tr valign=top><td align=center><b>\"</b><td align=left> Backslash quotation-mark <td align=left> Quotation-mark character
<tr valign=top><td align=center><b>\/</b><td align=left> Backslash slash <td align=left> Slash character
<tr valign=top><td align=center><b>\<i>ddd</i></b><td align=left> A backslash character followed by the longest sequence of one, two or three octal-digit characters (01234567). If all of the digits are 0 (that is, representation of the NUL character), the behaviour is undefined. <td align=left> The character whose encoding is represented by the one-, two- or three-digit octal integer. If the size of a byte on the system is greater than nine bits, the valid escape sequence used to represent a byte is implementation-dependent. Multi-byte characters require multiple, concatenated escape sequences of this type, including the leading \ for each byte.
<tr valign=top><td align=center><b>\<i>c</i></b><td align=left> A backslash character followed by any character not described in this table or in the table in the <b>XBD</b> specification, <a href="../xbd/notation.html"><b>File Format Notation</b>&nbsp;</a>  (<code>\\,\a,\b,\f,\n,\r,\t,\v</code>)<td align=left>Undefined</table></pre>
<h6 align=center><xref table="Escape Sequences in <I>awk</i>"></xref>Table: Escape Sequences in <i>awk</i></h6><xref type="7" name="awkesc"></xref>
<p>A regular expression can be matched against a specific field or string by using one of the two regular expression matching operators, ~ and !~. These operators interpret their right-hand operand as a regular expression and their left-hand operand as a string. If the regular expression matches the string, the ~ expression will evaluate to a value of 1, and the !~ expression will evaluate to a value of 0. (The regular expression matching operation is as defined by the term matched in the <b>XBD</b> specification, <a href="../xbd/re.html#tag_007_001"><b>Regular Expression Definitions</b>&nbsp;</a> , where a match occurs on any part of the string unless the regular expression is limited with the circumflex or dollar sign special characters.) If the regular expression does not match the string, the ~ expression will evaluate to a value of 0, and the !~ expression will evaluate to a value of 1. If the right-hand operand is any expression other than the lexical token <b>ERE</b>, the string value of the expression will be interpreted as an extended regular expression, including the escape conventions described above. Note that these same escape conventions also will be applied in determining the value of a string literal (the lexical token <b>STRING</b>), and thus will be applied a second time when a string literal is used in this context.
<p>When an <b>ERE</b> token appears as an expression in any context other than as the right-hand side of the ~ or !~ operator or as one of the built-in function arguments described below, the value of the
resulting expression will be the equivalent of:
<pre><code>$0  ~  /<i>ere</i>/</code></pre>
<p>The <i>ere</i> argument to the <b>gsub</b>, <b>match</b>, <b>sub</b> functions, and the <i>fs</i> argument to the <b>split</b> function (see <xref href=awkstr><a href="#tag_000_000_108_013">String Functions</a></xref>) will be interpreted as extended regular expressions. These can be either <b>ERE</b> tokens or arbitrary expressions, and will be interpreted in the same manner as the right-hand side of the ~ or !~ operator.
<p>An extended regular expression can be used to separate fields by using the <b>-F</b>&nbsp;<i>ERE</i> option or by assigning a string containing the expression to the built-in variable <b>FS</b>. The default value of the <b>FS</b> variable will be a single space character. The following describes <b>FS</b> behaviour:
<ol>
<p><li>If <b>FS</b> is a single character:
<ol type = a>
<p><li>If <b>FS</b> is the space character, skip leading and trailing blank characters; fields will be delimited by sets of one or more blank characters.
<p><li>Otherwise, if <b>FS</b> is any other character <i>c</i>, fields will be delimited by each single occurrence of <i>c</i>.
<p></ol>
<p><li>Otherwise, the string value of <b>FS</b> will be considered to be an extended regular expression. Each occurrence of a sequence matching the extended regular expression will delimit fields.
<p></ol>
<p>Except in the <b>gsub</b>, <b>match</b>, <b>split</b> and <b>sub</b> built-in functions, regular expression matching will be based on input records; that is, record separator characters (the first character of the value of the variable <b>RS</b>, a newline character by default) cannot be embedded in the expression, and no expression will match the record separator character. If the record separator is not a newline character, newline characters embedded in the expression can be matched. In those four built-in functions, regular expression matching will be based on text strings; that is, any character (including the newline character and the record separator) can be embedded
in the pattern and an appropriate pattern will match any character. However, in all <i>awk</i> regular expression matching, the use of one or more NUL characters in the pattern, input record or text string produces undefined results.
<h5><a name = "tag_000_000_108_005">&nbsp;</a>Patterns</h5>A <i>pattern</i> is any valid <i>expression</i>, a range specified by two expressions separated by comma, or one of the two special patterns <b>BEGIN</b> or <b>END</b>.
<h5><a name = "tag_000_000_108_006">&nbsp;</a>Special Patterns</h5>The <i>awk</i> utility recognises two special patterns, <b>BEGIN</b> and <b>END</b>. Each <b>BEGIN</b> pattern will be matched once and its associated action executed before the first record of input is read (except possibly by use of the <b>getline</b> function (see <xref href=awkio><a href="#tag_000_000_108_014">Input/Output and General Functions</a></xref>) in a prior <b>BEGIN</b> action) and before command line assignment is done. Each <b>END</b> pattern will be matched once and its associated action executed after the last record of input has been read. These two patterns will have associated actions.
<p><b>BEGIN</b> and <b>END</b> will not combine with other patterns. Multiple <b>BEGIN</b> and <b>END</b> patterns are allowed. The actions associated with the <b>BEGIN</b> patterns will be executed in the order specified in the program, as are the <b>END</b> actions. An <b>END</b> pattern can precede a <b>BEGIN</b> pattern in a program.
<p>If an <i>awk</i> program consists of only actions with the pattern <b>BEGIN</b>, and the <b>BEGIN</b> action contains no <b>getline</b> function, <i>awk</i> will exit without reading its input when the last statement in the last <b>BEGIN</b> action is executed. If an <i>awk</i> program consists of only actions with the pattern <b>END</b> or only actions with the patterns <b>BEGIN</b> and <b>END</b>, the input will be read before the statements in the <b>END</b> actions are executed.
<h5><a name = "tag_000_000_108_007">&nbsp;</a>Expression Patterns</h5>An expression pattern will be evaluated as
if it were an expression in a Boolean context.
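<p>The special patterns, the single-character <b>FS</b> rule and the matching operators described above can be tried out with a short shell session. This is only a minimal sketch: the sample records and the shell variable names <code>data</code>, <code>result</code> and <code>flags</code> are hypothetical, chosen for illustration, and are not part of the specification.

```shell
# Hypothetical sample input: three colon-separated records.
data='alice:30
bob:25
carol:41'

# -F: sets FS to a single non-space character, so each ":" delimits a field.
# BEGIN runs before the first record is read, END after the last one.
# The middle rule is an expression pattern: its action runs when $2 > 28 is true.
result=$(printf '%s\n' "$data" | awk -F: '
BEGIN   { count = 0 }
$2 > 28 { count++; print $1 }
END     { print "matched", count, "of", NR }
')
printf '%s\n' "$result"

# The ~ operator evaluates to 1 on a match and 0 otherwise; !~ is the inverse.
flags=$(printf '%s\n' "$data" | awk -F: '
{ m = ($1 ~ /^b/); n = ($1 !~ /^b/); print m, n }
')
printf '%s\n' "$flags"
```

<p>In the <b>END</b> action, <b>NR</b> holds the number of the last record processed, which is why it can serve as a record count there.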