# To unbundle, sh this file
echo README 1>&2
sed 's/.//' >README <<'//GO.SYSIN DD README'
-/****************************************************************
-Copyright (C) Lucent Technologies 1997
-All Rights Reserved
-
-Permission to use, copy, modify, and distribute this software and
-its documentation for any purpose and without fee is hereby
-granted, provided that the above copyright notice appear in all
-copies and that both that the copyright notice and this
-permission notice and warranty disclaimer appear in supporting
-documentation, and that the name Lucent Technologies or any of
-its entities not be used in advertising or publicity pertaining
-to distribution of the software without specific, written prior
-permission.
-
-LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
-INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
-IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
-SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
-WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
-IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
-ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
-THIS SOFTWARE.
-****************************************************************/
-
-This is the version of awk described in "The AWK Programming Language",
-by Al Aho, Brian Kernighan, and Peter Weinberger
-(Addison-Wesley, 1988, ISBN 0-201-07981-X).
-
-Changes, mostly bug fixes and occasional enhancements, are listed
-in FIXES. If you distribute this code further, please please please
-distribute FIXES with it. If you find errors, please report them
-to bwk@bell-labs.com. Thanks.
-
-The program itself is created by
-	make
-which should produce a sequence of messages roughly like this:
-
-	yacc -d awkgram.y
-
-conflicts: 43 shift/reduce, 85 reduce/reduce
-	mv y.tab.c ytab.c
-	mv y.tab.h ytab.h
-	cc -c ytab.c
-	cc -c b.c
-	cc -c main.c
-	cc -c parse.c
-	cc maketab.c -o maketab
-	./maketab >proctab.c
-	cc -c proctab.c
-	cc -c tran.c
-	cc -c lib.c
-	cc -c run.c
-	cc -c lex.c
-	cc ytab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o -lm
-
-This produces an executable a.out; you will eventually want to
-move this to some place like /usr/bin/awk.
-
-If your system does not have yacc or bison (the GNU
-equivalent), you must compile the pieces manually. We have
-included yacc output in ytab.c and ytab.h, and backup copies in
-case you overwrite them. We have also included a copy of
-proctab.c so you do not need to run maketab.
-
-NOTE: This version uses ANSI C, as you should also. We have
-compiled this without any changes using gcc -Wall and/or local C
-compilers on a variety of systems, but new systems or compilers
-may raise some new complaint; reports of difficulties are
-welcome.
-
-This also compiles with Visual C++ on all flavors of Windows,
-*if* you provide versions of popen and pclose. The file
-missing95.c contains versions that can be used to get started
-with, though the underlying support has mysterious properties,
-the symptom of which can be truncated pipe output. Beware. The
-file makefile.win gives hints on how to proceed; if you run
-vcvars32.bat, it will set up necessary paths and parameters so
-you can subsequently run nmake -f makefile.win. Beware also that
-when running on Windows under command.com, various quoting
-conventions are different from Unix systems: single quotes won't
-work around arguments, and various characters like % are
-interpreted within double quotes.
-
-This compiles without change on Macintosh OS X using gcc and
-the standard developer tools.
-
-This is also said to compile on Macintosh OS 9 systems, using the
-file "buildmac" provided by Dan Allen (danallen@microsoft.com),
-to whom many thanks.
-
-The version of malloc that comes with some systems is sometimes
-astonishingly slow. If awk seems slow, you might try fixing that.
-More generally, turning on optimization can significantly improve
-awk's speed, perhaps by 1/3 for highest levels.
//GO.SYSIN DD README
echo FIXES 1>&2
sed 's/.//' >FIXES <<'//GO.SYSIN DD FIXES'
-/****************************************************************
-Copyright (C) Lucent Technologies 1997
-All Rights Reserved
-
-Permission to use, copy, modify, and distribute this software and
-its documentation for any purpose and without fee is hereby
-granted, provided that the above copyright notice appear in all
-copies and that both that the copyright notice and this
-permission notice and warranty disclaimer appear in supporting
-documentation, and that the name Lucent Technologies or any of
-its entities not be used in advertising or publicity pertaining
-to distribution of the software without specific, written prior
-permission.
-
-LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
-INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
-IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
-SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
-WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
-IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
-ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
-THIS SOFTWARE.
-****************************************************************/
-
-This file lists all bug fixes, changes, etc., made since the AWK book
-was sent to the printers in August, 1987.
-
-Apr 24, 2005:
-	modified lib.c so that values of $0 et al are preserved in the END
-	block, apparently as required by posix. thanks to havard eidnes
-	for the report and code.
-
-Jan 14, 2005:
-	fixed infinite loop in parsing, originally found by brian tsang.
-	thanks to arnold robbins for a suggestion that started me
-	rethinking it.
-
-Dec 31, 2004:
-	prevent overflow of -f array in main, head off potential error in
-	call of SYNTAX(), test malloc return in lib.c, all with thanks to
-	todd miller.
-
-Dec 22, 2004:
-	cranked up size of NCHARS; coverity thinks it can be overrun with
-	smaller size, and i think that's right. added some assertions to b.c
-	to catch places where it might overrun. the RE code is still fragile.
-
-Dec 5, 2004:
-	fixed a couple of overflow problems with ridiculous field numbers:
-	e.g., print $(2^32-1). thanks to ruslan ermilov, giorgos keramidas
-	and david o'brien at freebsd.org for patches. this really should
-	be re-done from scratch.
-
-Nov 21, 2004:
-	fixed another 25-year-old RE bug, in split. it's another failure
-	to (re-)initialize. thanks to steve fisher for spotting this and
-	providing a good test case.
-
-Nov 22, 2003:
-	fixed a bug in regular expressions that dates (so help me) from 1977;
-	it's been there from the beginning. an anchored longest match that
-	was longer than the number of states triggered a failure to initialize
-	the machine properly. many thanks to moinak ghosh for not only finding
-	this one but for providing a fix, in some of the most mysterious
-	code known to man.
-
-	fixed a storage leak in call() that appears to have been there since
-	1983 or so -- a function without an explicit return that assigns a
-	string to a parameter leaked a Cell. thanks to moinak ghosh for
-	spotting this very subtle one.
-
-Jul 31, 2003:
-	fixed, thanks to andrey chernov and ruslan ermilov, a bug in lex.c
-	that mis-handled the character 255 in input. (it was being compared
-	to EOF with a signed comparison.)
-
-Jul 29, 2003:
-	fixed (i think) the long-standing botch that included the beginning of
-	line state ^ for RE's in the set of valid characters; this led to a
-	variety of odd problems, including failure to properly match certain
-	regular expressions in non-US locales. thanks to ruslan for keeping
-	at this one.
-
-Jul 28, 2003:
-	n-th try at getting internationalization right, with thanks to volker
-	kiefel, arnold robbins and ruslan ermilov for advice, though they
-	should not be blamed for the outcome. according to posix, "." is the
-	radix character in programs and command line arguments regardless of
-	the locale; otherwise, the locale should prevail for input and output
-	of numbers. so it's intended to work that way.
-
-	i have rescinded the attempt to use strcoll in expanding shorthands in
-	regular expressions (cclenter). its properties are much too
-	surprising; for example [a-c] matches aAbBc in locale en_US but abBcC
-	in locale fr_CA. i can see how this might arise by implementation
-	but i cannot explain it to a human user. (this behavior can be seen
-	in gawk as well; we're leaning on the same library.)
-
-	the issue appears to be that strcoll is meant for sorting, where
-	merging upper and lower case may make sense (though note that unix
-	sort does not do this by default either). it is not appropriate
-	for regular expressions, where the goal is to match specific
-	patterns of characters. in any case, the notations [:lower:], etc.,
-	are available in awk, and they are more likely to work correctly in
-	most locales.
-
-	a moratorium is hereby declared on internationalization changes.
-	i apologize to friends and colleagues in other parts of the world.
-	i would truly like to get this "right", but i don't know what
-	that is, and i do not want to keep making changes until it's clear.
-
-Jul 4, 2003:
-	fixed bug that permitted non-terminated RE, as in "awk /x".
-
-Jun 1, 2003:
-	subtle change to split: if source is empty, number of elems
-	is always 0 and the array is not set.
-
-Mar 21, 2003:
-	added some parens to isblank, in another attempt to make things
-	internationally portable.
-
-Mar 14, 2003:
-	the internationalization changes, somewhat modified, are now
-	reinstated. in theory awk will now do character comparisons
-	and case conversions in national language, but "." will always
-	be the decimal point separator on input and output regardless
-	of national language. isblank(){} has an #ifndef.
-
-	this no longer compiles on windows: LC_MESSAGES isn't defined
-	in vc6++.
-
-	fixed subtle behavior in field and record splitting: if FS is
-	a single character and RS is not empty, \n is NOT a separator.
-	this tortuous reading is found in the awk book; behavior now
-	matches gawk and mawk.
-
-Dec 13, 2002:
-	for the moment, the internationalization changes of nov 29 are
-	rolled back -- programs like x = 1.2 don't work in some locales,
-	because the parser is expecting x = 1,2. until i understand this
-	better, this will have to wait.
-
-Nov 29, 2002:
-	modified b.c (with tiny changes in main and run) to support
-	locales, using strcoll and iswhatever tests for posix character
-	classes. thanks to ruslan ermilov (ru@freebsd.org) for code.
-	the function isblank doesn't seem to have propagated to any
-	header file near me, so it's there explicitly. not properly
-	tested on non-ascii character sets by me.
-
-Jun 28, 2002:
-	modified run/format() and tran/getsval() to do a slightly better
-	job on using OFMT for output from print and CONVFMT for other
-	number->string conversions, as promised by posix and done by
-	gawk and mawk. there are still places where it doesn't work
-	right if CONVFMT is changed; by then the STR attribute of the
-	variable has been irrevocably set. thanks to arnold robbins for
-	code and examples.
-
-	fixed subtle bug in format that could get core dump. thanks to
-	Jaromir Dolecek <jdolecek@NetBSD.org> for finding and fixing.
-	minor cleanup in run.c / format() at the same time.
-
-	added some tests for null pointers to debugging printf's, which
-	were never intended for external consumption. thanks to dave
-	kerns (dkerns@lucent.com) for pointing this out.
-
-	GNU compatibility: an empty regexp matches anything (thanks to
-	dag-erling smorgrav, des@ofug.org). subject to reversion if
-	this does more harm than good.
-
-	pervasive small changes to make things more const-correct, as
-	reported by gcc's -Wwrite-strings. as it says in the gcc manual,
-	this may be more nuisance than useful. provoked by a suggestion
-	and code from arnaud desitter, arnaud@nimbus.geog.ox.ac.uk
-
-	minor documentation changes to note that this now compiles out
-	of the box on Mac OS X.
-
-Feb 10, 2002:
-	changed types in posix chars structure to quiet solaris cc.
-
-Jan 1, 2002:
-	fflush() or fflush("") flushes all files and pipes.
-
-	length(arrayname) returns number of elements; thanks to
-	arnold robbins for suggestion.
-
-	added a makefile.win to make it easier to build on windows.
-	based on dan allen's buildwin.bat.
-
-Nov 16, 2001:
-	added support for posix character class names like [:digit:],
-	which are not exactly shorter than [0-9] and perhaps no more
-	portable. thanks to dag-erling smorgrav for code.
-
-Feb 16, 2001:
-	removed -m option; no longer needed, and it was actually
-	broken (noted thanks to volker kiefel).
-
-Feb 10, 2001:
-	fixed an appalling bug in gettok: any sequence of digits, +,-, E, e,
-	and period was accepted as a valid number if it started with a period.
-	this would never have happened with the lex version.
-
-	other 1-character botches, now fixed, include a bare $ and a
-	bare " at the end of the input.
-
-Feb 7, 2001:
-	more (const char *) casts in b.c and tran.c to silence warnings.
-
-Nov 15, 2000:
-	fixed a bug introduced in august 1997 that caused expressions
-	like $f[1] to be syntax errors. thanks to arnold robbins for
-	noticing this and providing a fix.
-
-Oct 30, 2000:
-	fixed some nextfile bugs: not handling all cases. thanks to
-	arnold robbins for pointing this out. new regressions added.
-
-	close() is now a function. it returns whatever the library
-	fclose returns, and -1 for closing a file or pipe that wasn't
-	opened.
-
-Sep 24, 2000:
-	permit \n explicitly in character classes; won't work right
-	if comes in as "[\n]" but ok as /[\n]/, because of multiple
-	processing of \'s. thanks to arnold robbins.
-
-July 5, 2000:
-	minor fiddles in tran.c to keep compilers happy about uschar.
-	thanks to norman wilson.
-
-May 25, 2000:
-	yet another attempt at making 8-bit input work, with another
-	band-aid in b.c (member()), and some (uschar) casts to head
-	off potential errors in subscripts (like isdigit). also
-	changed HAT to NCHARS-2. thanks again to santiago vila.
-
-	changed maketab.c to ignore apparently out of range definitions
-	instead of halting; new freeBSD generates one. thanks to
-	jon snader <jsnader@ix.netcom.com> for pointing out the problem.
-
-May 2, 2000:
-	fixed an 8-bit problem in b.c by making several char*'s into
-	unsigned char*'s. not clear i have them all yet. thanks to
-	Santiago Vila <sanvila@unex.es> for the bug report.
-
-Apr 21, 2000:
-	finally found and fixed a memory leak in function call; it's
-	been there since functions were added ~1983. thanks to
-	jon bentley for the test case that found it.
-
-	added test in envinit to catch environment "variables" with
-	names beginning with '='; thanks to Berend Hasselman.
-
-Jul 28, 1999:
-	added test in defn() to catch function foo(foo), which
-	otherwise recurses until core dump. thanks to arnold
-	robbins for noticing this.
-
-Jun 20, 1999:
-	added *bp in gettok in lex.c; appears possible to exit function
-	without terminating the string. thanks to russ cox.
-