📄 gawk.hlp

 The field prefix operator ($) is used to reference a particular field.
 For example, $3 designates the third field of the current record.  The
 entire record can be referenced via $0 (and it holds the actual input
 record, not the values of $1, $2, ... concatenated together, so
 multiple spaces--when present--remain intact, unless a new value gets
 assigned).

 The builtin variable NF holds the number of fields in the current
 record.  $NF is therefore the value of the last field.  Attempts to
 access fields beyond NF result in null values (if a record contained
 3 fields, the value of $5 would be "").

 Assigning a new value to $0 causes all the other field values (and NF)
 to be re-evaluated.  Changing a specific field will cause $0 to
 receive a new value once it's re-evaluated, but until then the other
 existing fields remain unchanged.
3 variables
 Variables in awk can hold both numeric and string values and do not
 have to be pre-declared.  In fact, there is no way to explicitly
 declare them at all.  Variable names consist of a leading letter
 (either upper or lower case, which are distinct from each other) or
 underscore (_) character followed by any number of letters, digits,
 or underscores.

 When a variable that didn't previously exist is referenced, it is
 created and given a null value.  A null value is treated as 0 when
 used as a number, and is a string of zero characters in length if
 used as a string.
4 builtin_variables
 GAWK maintains several 'built-in' variables.  All have default values;
 some are updated automatically.  All the builtins have uppercase-only
 names.
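 As a rough illustration of two automatically updated builtins (NF and
 $0) and of the null value given to a never-assigned variable, the
 sketch below feeds one made-up record to awk.  It assumes a POSIX awk
 is reachable as `awk` from a Unix-style shell; the wrapper name
 show_fields and the variable name unset are hypothetical, chosen only
 for this example.

```shell
# Hypothetical helper: runs awk on a single sample record.
show_fields() {
    printf 'alpha  beta gamma\n' | awk '{
        print NF              # number of fields in the record
        print $NF             # value of the last field
        print "[" $5 "]"      # a field beyond NF is a null string
        print "[" $0 "]"      # $0 keeps the original double space
        print unset + 0       # a never-assigned variable acts as 0 ...
        print "[" unset "]"   # ... or as a string of zero characters
    }'
}
show_fields
```

 The six output lines should be 3, gamma, [], [alpha  beta gamma], 0,
 and [].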
 These builtin variables control how awk behaves
  FS  input field separator; default is a single space, which is
        treated as if it were a regular expression for matching
        one or more spaces and/or tabs; a value of " " also has a
        second special-case side-effect of causing leading blanks
        to be ignored instead of producing a null first field;
        initial value can be specified on the command line with
        the -F option (or /field_separator); the value can be a
        regular expression
  RS  input record separator; default value is a newline ("\n");
        only a single character is allowed [no regular expressions
        or multi-character strings; expected to be remedied in a
        future release of gawk]
  OFS output field separator; value to place between variables in
        a 'print' statement; default is one space; can be arbitrary
        string
  ORS output record separator; value to implicitly terminate 'print'
        statement with; default is newline ("\n"); can be arbitrary
        string
  OFMT default output format used for printing numbers; default
        value is "%.6g"
  CONVFMT conversion format used for number-to-string conversions;
        default value is also "%.6g", like OFMT
  SUBSEP subscript separator for array indices; used when an array
        subscript is specified as a comma separated list of values:
        the comma is replaced by SUBSEP and the resulting index
        is a concatenation of the values and SUBSEP(s); default
        value is "\034"; value may be arbitrary string
  IGNORECASE regular expression matching flag; if true (non-zero)
        matching ignores differences between upper and lower case
        letters; affects the '~' and '!~' operators, the 'index',
        'match', 'split', 'sub', and 'gsub' functions, and the
        field splitting based on FS; default value is false (0);
        has no effect if GAWK is in strict compatibility mode (via
        the -"W compat" option or /strict)
  FIELDWIDTHS space or tab separated list of width sizes; takes
        precedence over FS when set, but is cleared if FS has a
        value assigned to it; [note: the current implementation
        of fixed-field input is considered experimental and is
        expected to evolve over time]

 These builtin variables provide useful information
  NF  number of fields in the current record
  NR  record number (accumulated over all files when more than one
        input file is processed by the same program)
  FNR current record number of the current input file; reset to 0
        each time an input file is completed
  RSTART starting position of substring matched by last invocation
        of the 'match' function; set to 0 if a match fails and at
        the start of each input record
  RLENGTH length of substring matched by the last invocation of the
        'match' function; set to -1 if a match fails
  FILENAME name of the input file currently being processed; the
        special name "-" is used to represent the standard input
  ENVIRON array of miscellaneous user environment values; the VMS
        implementation of GAWK provides values for ["USER"] (the
        username), ["PATH"] (current default directory), ["HOME"]
        (the user's login directory), and ["TERM"] (terminal type
        if available) [all info provided by VAXCRTL's environ]
  ARGC number of elements in the ARGV array, counting [0] which is
        the program name (ie, "gawk")
  ARGV array of command-line arguments (in [0] to [ARGC-1]); the
        program name (ie, "gawk") is held in ARGV[0]; command line
        parameters (data files and "var=value" expressions, but not
        program options or the awk program text string if present)
        are stored in ARGV[1] through ARGV[ARGC-1]; the awk program
        can change values of ARGC and ARGV[] during execution in
        order to alter which files are processed or which
        between-file assignments are made
4 arrays
 awk supports associative arrays to collect data into tables.  Array
 elements can be either numeric or string, as can the indices used to
 access them.  Each array must have a unique name, but a given array
 can hold both string and numeric elements at the same time.  Arrays
 are one-dimensional only, but multi-dimensional arrays can be
 simulated using comma (,) separated indices, whereby a single index
 value gets created by replacing commas with SUBSEP and concatenating
 the resulting expression into a single string.

 Referencing an array element is done with the expression
       Array[Index]
 where 'Array' represents the array's name and 'Index' represents a
 value or expression used for a subscript.  If the requested array
 element did not exist, it will be created and assigned an initial
 null value.  To check whether an element exists without creating it,
 use the 'in' boolean operator.
       Index in Array
 would check 'Array' for element 'Index' and return 1 if it existed
 or 0 otherwise.  To remove an element from an array, use the 'delete'
 statement
       delete Array[Index]
 Note:  there is no way to delete an ordinary variable or an entire
 array; 'delete' only works on a specific array element.

 To process all elements of an array (in succession) when their
 subscripts might be unknown, use the 'in' variant of the for-loop
       for (Index in Array) { ... }
3 functions
 awk supports both built-in and user-defined functions.  A function
 may be considered a 'black-box' which accepts zero or more input
 parameters, performs some calculations or other manipulations based
 on them, and returns a single result.

 The syntax for calling a function consists of the function name
 immediately followed by an open parenthesis (left parenthesis '('),
 followed by an argument list, followed by a closing parenthesis
 (right parenthesis ')').
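
 The call syntax just described, together with the array operations
 from the arrays topic, can be sketched as follows.  This is only an
 illustration: it assumes a POSIX awk is reachable as `awk` from a
 Unix-style shell, and the names array_demo, color, grid, k, n, and
 count are hypothetical.

```shell
# Hypothetical helper: exercises split(), "in", delete, and for-in.
array_demo() {
    awk 'BEGIN {
        # Function call: name, "(", comma separated arguments, ")".
        n = split("red green blue", color, " ")
        print n                                        # 3 components
        print ((2 in color) ? "yes" : "no")            # element 2 exists
        delete color[2]                                # remove one element
        print ((2 in color) ? "yes" : "no")            # now it is gone
        grid[1, 2] = "x"                               # index is 1 SUBSEP 2
        print (((1 SUBSEP 2) in grid) ? "yes" : "no")  # same single index
        count = 0
        for (k in color) count++                       # visit what is left
        print count                                    # 2 elements remain
    }'
}
array_demo
```

 The five output lines should be 3, yes, no, yes, and 2.  Testing with
 'in' before and after the 'delete' avoids creating the element as a
 side effect, which a plain reference to color[2] would do.
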
 The argument list is a sequence of values (numbers, strings,
 variables, array references, or expressions involving the above
 and/or nested function calls), separated by commas and optional
 white space.

 The parentheses are required punctuation, except for the 'print' and
 'printf' builtin IO functions, where they're optional, and for the
 builtin IO function 'getline', where they're not allowed.  Some
 functions support optional [trailing] arguments which can be simply
 omitted (along with the corresponding comma if applicable).
4 numeric_functions
 Builtin numeric functions
  int(n)      returns the value of 'n' with any fraction truncated
                [truncation of negative values is towards 0]
  sqrt(n)     the square root of n
  exp(n)      the exponential of n ('e' raised to the 'n'th power)
  log(n)      natural logarithm of n
  sin(n)      sine of n (in radians)
  cos(n)      cosine of n (radians)
  atan2(m,n)  arctangent of m/n (radians)
  rand()      random number in the range 0 to 1 (exclusive)
  srand(s)    sets the random number 'seed' to s, so that a sequence
                of 'random' numbers can be repeated; returns the
                previous seed value; srand() [argument omitted] sets
                the seed to an 'unpredictable' value (based on date
                and time, for instance, so should be unrepeatable)
4 string_functions
 Builtin string functions
  index(s,t)  search string s for substring t; result is 1-based
                offset of t within s, or 0 if not found
  length(s)   returns the length of string s; either 'length()'
                with its argument omitted or 'length' without any
                parenthesized argument list will return length of $0
  match(s,r)  search string s for regular expression r; the offset
                of the longest, left-most substring which matches
                is returned, or 0 if no match was found; the builtin
                variables RSTART and RLENGTH are also set [RSTART to
                the return value and RLENGTH to the size of the
                matching substring, or to -1 if no match was found]
  split(s,a,f) break string s into components based on field
                separator f and store them in array a (into elements
                [1], [2], and so on); the last argument is optional,
                if omitted, the value of FS is used; the return value
                is the number of components found
  sprintf(f,e,...) format expression(s) e using format string f and
                return the result as a string; formatting is similar
                to the printf function
  sub(r,t,s)  search string target s for regular expression r, and
                if a match is found, replace the matching text with
                substring t, then store the result back in s; if s
                is omitted, use $0 for the string; the result is
                either 1 if a match+substitution was made, or 0
                otherwise; if substring t contains the character
                '&', the text which matched the regular expression
                is used instead of '&' [to suppress this feature
                of '&', 'quote' it with a backslash (\); since this
                will be inside a quoted string which will receive
                'backslash' processing before being passed to sub(),
                *two* consecutive backslashes will be needed "\\&"]
  gsub(r,t,s) similar to sub(), but gsub() replaces all nonoverlapping
                substrings instead of just the first, and the return
                value is the number of substitutions made
  substr(s,p,l) extract a substring l characters long starting at
                offset p in string s; l is optional, if omitted then
                the remainder of the string (p thru end) is returned
  tolower(s)  return a copy of string s in which every uppercase
                letter has been converted into lowercase
  toupper(s)  analogous to tolower(); convert lowercase to uppercase
4 time_functions
 Builtin time functions
  systime()   return the current time of day as the number of seconds
                since some reference point; on VMS the reference point
                is January 1, 1970, at 12 AM local time (not UTC)
  strftime(f,t) format time value t using format f; if t is omitted,
                the default is systime()
5 time_formats
 Formatting directives similar to the 'printf' & 'sprintf' functions
 (each is introduced in the format string by preceding it with a
 percent sign (%)); the directive is substituted by the corresponding
 value
  a   abbreviated weekday name (Sun,Mon,Tue,Wed,Thu,Fri,Sat)
  A   full weekday name
  b   abbreviated month name (Jan,Feb,...)
  B   full month name
  c   date and time (Unix-style "aaa bbb dd HH:MM:SS YYYY" format)
  C   century prefix (19 or 20) [not century number, ie 20th]
  d   day of month as two digit decimal number (01-31)
  D   date in mm/dd/yy format
  e   day of month with leading space instead of leading 0 ( 1-31)
  E   ignored; following format character used
  H   hour (24 hour clock) as two digit number (00-23)
  h   abbreviated month name (Jan,Feb,...) [same as %b]
  I   hour (12 hour clock) as two digit number (01-12)
  j   day of year as three digit number (001-366)
  m   month as two digit number (01-12)
  M   minute as two digit number (00-59)
  n   'newline' (ie, treat %n as \n)
  O   ignored; following format character used
  p   AM/PM designation for 12 hour clock
  r   time in AM/PM format ("II:MM:SS p")
  R   time without seconds ("HH:MM")
  S   second as two digit number (00-59)
  t   tab (ie, treat %t as \t)
  T   time ("HH:MM:SS")
  U   week of year (00-53) [first Sunday is first day of week 1]
  V   date (VMS-style "dd-bbb-YYYY" with 'bbb' forced to uppercase)
  w   weekday as decimal digit (0 [Sunday] through 6 [Saturday])
  W   week of year (00-53) [first _Monday_ is first day of week 1]
  x   date ("aaa bbb dd YYYY")
  X   time ("HH:MM:SS")
  y   year without century (00-99)
  Y   year with century (19yy-20yy)
  Z   time zone name (always "local" for VMS)
  %   literal percent sign (%)
4 IO_functions
 Builtin I/O functions
  print x,... print the values of one or more expressions; if none
                are listed, $0 is used; parentheses are optional;
                when multiple values are printed, the current value
                of builtin OFS (default is 1 space) is used to
                separate them; the print line is implicitly
                terminated with the current value of ORS (default
                is newline); print does not have a return value
  printf(f,x,...) print the values of one or more expressions, using
                the specified format string; null strings are used
                to supply missing values (if any); no between field
                or trailing newline characters are printed, they
                should be specified within the format string; the
                argument-enclosing parentheses are optional;
                printf does not have a return value
  getline v   read a record into variable v; if v is omitted, $0 is
                used (and NF, NR, and FNR are updated); if v is
                specified, then field-splitting won't be performed;
                note:  parentheses around the argument are *not*
                allowed; return value is 1 for successful read, 0
                if end of file is encountered, or -1 if some sort
                of error occurred; [see 'redirection' for several
                variants]
  close(s)    close a file or pipe specified by the string s; the
                string used should have the same value as the one
                used in a getline or print/printf redirection
  system(s)   pass string s to be executed by the operating system;
                the command string is executed in a subprocess
5 redirection
 Both getline and print/printf support variant forms which use
 redirection and pipes.

 To read from a file (instead of from the primary input file), use
     getline var < "file"
 or  getline < "file"    (read into $0)
 where the string "file" represents either an actual file name (in
 quotes) or a variable which contains a file name string value or an
 expression which evaluates to a string filename.

 To create a pipe executing some command and read the result into
 a variable (or into $0), use
     "command" | getline var
 or  "command" | getline    (read into $0)
 where "command" is a literal string containing an operating system