.TH PRECCX 1L "30 August 1994" "Oxford University" "LOCAL"
.SH NAME
preccx \- PREttier Compiler Compiler 2.42
.SH SYNOPSIS
.B preccx
[\fIoptions\fP] < \fIfile.y\fP > \fIfile.c\fP
.br
.B preccx
[\fIoptions\fP] \fIfile.y\fP > \fIfile.c\fP
.br
.B preccx
[\fIoptions\fP] \fIfile.y\fP \fIfile.c\fP
.SH DESCRIPTION
.I Preccx
is a compiler compiler. It converts \fIpreccx\fP-style context-grammar
definition scripts (with a \fI.y\fP extension) into C code scripts
(with a \fI.c\fP extension). The output code compiles under ANSI C
compilers such as the Free Software Foundation's \fIgcc\fP(1).
.PP
There is an easy-to-use hook for \fIlex\fP(1) tokenisers.
.PP
\fIPreccx\fP extends the UNIX \fIyacc\fP(1) utility by allowing:
.PP
[0] Contextual definitions. Each grammar definition may be
parameterized with contexts. For example, some languages determine
whether a declaration is local (and to what) or global in scope
by relative indentation, and this can be encoded in \fIpreccx\fP
using the number of spaces indentation as a parameter, n:
.IP
@ decl(n) = <' '>*n expr <'\\n'> decl(n+1)*
.PP
This definition is intended to mean that a "decl" indented by n
spaces consists of n spaces, an expression, and a newline, optionally
followed by one or several "decl"s indented still further.
.PP
[1] Infinite lookahead and backtracking in place of the \fIyacc\fP
1-token lookahead. This means that \fIpreccx\fP parsers distinguish
correctly between sentences of the form `foo bah gum' and `foo bah NAY'
on a single pass.
If you cannot imagine why one should want to decide between the two,
think about `if\0...\0then\0...' and `if\0...\0then\0...\0else\0...\0'.
.PP
[2] Arbitrarily complex expressions. This means that compound
definitions like
.IP
explain {{this | that} {several | no} times}+
.PP
are legal within \fIpreccx\fP definition scripts.
.PP
[3] \fIPreccx\fP has postfix operators `*' (zero or more times),
`*n' (exactly n times),
`+' (one or more times), and `!'
(execute accumulated actions now)
built in, along with the `[\0]' (optionally) outfix operator. For
example, the following means `exactly n spaces':
.IP
@ space(n) = <'\0'>*n
.PP
The other built-ins are
.IP
`?' (any token)
.IP
`^' (beginning of line)
.IP
`$' (end of line)
.IP
`|' (or, placed between alternate phrases of the grammar)
.IP
`{\0}' (grouping brackets)
.IP
`<\0>' (around literals)
.IP
`>\0<' (to mean `not a particular literal')
.IP
`(\0)' around the name of a \fBBOOLEAN\fP valued predicate on tokens,
defined as an int 1 or 0 \-valued C function elsewhere in the script, and
.IP
`)\0(' (anti-brackets) round a C expression of BOOLEAN type, meaning
a logical test condition.
.IP
`]..[' anti-brackets hide an expression, causing it to be required but
ignored.
.PP
`]a[ b' means that input must satisfy both a and b, while
`a ]b[' means that b is trailing context.
.PP
`$!' is a shorthand for matching end-of-line followed by execution
of pending actions (it also causes the input buffer to start being
overwritten). It is roughly equivalent to the conjunction `! $',
but more efficient.
.PP
`a\0b\0c' (conjunction) is the term
denoting an expression consisting of an `a expression' followed by a `b
expression' followed by a `c expression'. An example of a \fIpreccx\fP
script follows in the section \fIUSAGE\fP.
.PP
[4] Modular output. Parts of a script can be \fIpreccx\fP'ed
separately, compiled separately, and then linked together later, which
makes maintenance and version control easy.
.PP
[5] Speed. \fIPreccx\fP is fast, typically taking two to five seconds to
compile scripts of several hundred lines. And it builds fast parsers
too.
.PP
[6] Higher order behaviour. `Macros' may be defined in a script.
For example,
.IP
@ optional(parser) = parser
.IP
@ \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 | {}
.PP
may be defined (this particular example is an equivalent for the
built-in `[parser]' construct).
After the definition, the construct
.IP
@ ice_cream(flavour) = tub(flavour) optional(sauce)
.PP
may be used instead of the built-in:
.IP
@ ice_cream(flavour) = tub(flavour) [sauce]
.PP
[7] Separate syntax to distinguish synthesised attributes
(without side-effects) from attached actions (with side-effects) in
2.40 and above. This is a break from yacc(1) style aimed at greater
transparency and robustness. For example, the following synthesises a
total for a simple sum without using side-effecting actions or the
run-time value stack:
.IP
@ sum = summand\\x <'+'> summand\\y {@ $x+$y @}
.PP
whereas the following uses \fIyacc\fP(1)-style references in an
attached action:
.IP
@ sum = summand <'+'> summand {: total=$1+$3; :}
.PP
NOTE: From 2.41 onwards \fIyacc\fP(1)-style referencing is only supported with
the `-old' command line switch to preccx, and less efficient code is
generated. Moreover, the scope of both kinds of dollar variables is now
strictly left to right so that `$0' can no longer be used to access a term
to the left. In other words, the \fIyacc\fP(1) style of use is now restricted and
discouraged.
.PP
[8] Built-in error handling capability (2.40 and above). The following
code sets the handler `foo' as the parser to use when the parse beyond
the `!{foo}' does not match:
.IP
@ typical = okstuff !{foo} morestuff
.PP
Malformed parse input will be matched against the parse `okstuff foo',
and well-formed input will be matched against `okstuff morestuff'. The
definition in this instance is equivalent to
`okstuff ! {morestuff | foo}'.
.PP
\fIPreccx\fP is intended to be both easy and convenient to use, but a compiler
compiler cannot be understood in one minute. Have a look at the example
*.y files in the \fIpreccx\fP directory to get more of the feel. A more
complex line in a grammar definition script than those above may look
like:
.IP
@ expr = var { <'+'> | <'-'> } expr
.IP
@ \0\0\0\0 | <'('> expr <')'>
.PP
The `@' is an `attention mark'.
Every line which does not begin with an
`@' is passed through to the output unchanged, so arbitrary C code can
be embedded in a preccx script. Intended comments must therefore be
surrounded by C comment marks, `/*' and `*/'.
.PP
A default do-nothing tokeniser is provided in the preccx library and
will be automatically linked in unless you specify a different yylex()
routine to the C compiler. There is nothing to worry about here. If you
do nothing yourself, you will get a working parser out of a \fIpreccx\fP
script immediately, but if you particularly want to put your own
tokeniser on the input, then you do that by naming it `yylex()' and
making it return TOKENs when called. It should write VALUE attributes
into `yylval', just like \fIlex\fP(1). Place its object module or source
code file ahead of the `-lcc' argument when you use the C compiler, and
it will be linked in instead of the default (NB. yylex() \fImust\fP
signal EOF to \fIpreccx\fP by setting `yytchar=EOF', which yylex()
routines generated by \fIlex\fP(1) do not seem to get right).
.PP
The way to compile a C source code file `foo.c' generated by
\fIpreccx\fP into an executable `foo' is to use an incantation like:
.IP
gcc\0\-Wall\0\-ansi\0\-o\0foo\0foo.c\0\-L\0<preccx\0dir>\0\-lcc
.PP
You can change the TOKEN type by #defining it as a C macro in the *.y
script (you may want a wider range of TOKENs than the 256 possibilities
afforded by the default 8-bit char, and `#define TOKEN short int' is
sometimes useful). But it is important that the appropriate \fIpreccx\fP
library is used at link time. The default libcc.a library will assume
TOKEN=char, but different versions of the library can be produced by
recompiling with TOKEN set to the desired data type.
.PP
The parser generated from a \fIpreccx\fP script will ordinarily
signal valid input by absorbing it silently, and signal invalid input by
rejecting it and spouting an error message. This is a standard style
for compiler-compilers.
To get the parser to do anything else, you must
decorate the definition script with ACTIONs (see below for details).
.PP
The error handler may be redefined by #defining an ON_ERROR(x)
macro. An x=0 value should give the code to execute on a partial but
successful parse and x=1 should give the code to execute on an
unsuccessful parse. x=-1 should give code to execute when \fIpreccx\fP
attempts to backtrack across a `cut' (`!', see below). For example:
.IP
#define\0ON_ERROR(x)\0x?printf("ow!\\n"):printf("ouch!\\n")
.PP
The default error actions attempt to restart the parse on the next line
of input, using the parser p designated by `MAIN(p)' in the script.
.PP
You may likewise #define BEGIN and END for C code to be executed at
either end of a parse attempt. This means that BEGIN will be
re-executed if the parse resyncs after an error, and your code should
take account of that (most likely by installing and using an invocation
counter).
.SH OPTIONS
.I Preccx
can be run as a \fIstdin\fP to \fIstdout\fP filter, taking no options or
arguments. It is better practice, however, to use the command line
options:
.IP
\fIpreccx\fP [options] infile outfile
.PP
because then there is no danger of \fIpreccx\fP misidentifying the console
or keyboard when you have redirected stdin and stdout.
.PP
The default sizes of various internal buffers can be changed by command
line options (version 2.40 and above only), as follows:
.IP
-rNNNN The read buffer size in Kb. This determines the maximum
char length of a single production in a script readable by
\fIpreccx\fP. Default 2Kb/2K chars.
.IP
-pNNNN The maximum size in Kb of the internal program (tables)
built by \fIpreccx\fP during the scan of a specification script. It
correlates with the maximum number of symbols in a single
production rule. Default 20Kb/4K instructions.
.IP
-vNNNN The maximum size in Kb of the attributed data built by \fIpreccx\fP
during the scan of the specification script.
Default 16Kb/4K
data items up to v2.41, 0Kb/0K in v2.42 and later (now handled by C and
the data is compiled instead of dynamically interpreted).
.IP
-fNNNN The maximal size in Kb of the area used by \fIpreccx\fP to store
backtrack points when scanning a script. It correlates to the
maximal number of sequents in a production rule. Default
16Kb/1K breakpoints.
.PP
The sizes need only be changed if \fIpreccx\fP fails to parse an input
script returning an error message that indicates an overflow of one of
these buffers.
.PP
The buffers are also used by utilities built by \fIpreccx\fP, and their
sizes in the utilities are set by the macros READBUFFERSIZE,
MAXPROGRAMSIZE, STACKSIZE and CONTEXTSTACKSIZE respectively (see below
and look in cc.h and ccx.h).
.PP
-old This flag (version 2.41 and above) supports the use of \fIyacc\fP(1)
style dollar variables in attached actions. The support is limited,
however: $0 and lower cannot be referenced and the variables should only
be read, not written. Writing to $1 still works as a way to assign the
attribute attached to an entire clause, but use the {@foo@} notation in
preference.
.SH ENVIRONMENT
The following macros must be set in the user's grammar definition
script, above the #include <cc.h> or <ccx.h> directive:
.TP 20
.BI #\0define\0TOKEN\0tokentype
.PP
(default char)
This defines the space reserved for each incoming token in the parser
which \fIpreccx\fP builds.
Note that a corresponding version of libcc.a
must be linked in at compile time.
.TP 20
.BI #\0define\0VALUE\0valuetype
.PP
(default char*)
This defines the space reserved for each value on the runtime stack
manipulated by the runtime program which \fIpreccx\fP attaches to the parser.
There is no good reason for changing this to a type which
is shorter than long int (or char far *), because the actual space used
will be a union type which is at least as long as these.
.PP
In version 2.41 and above, this stack is by default absent, but the VALUE
macro still has significance.
.TP 20
.BI #\0define\0PARAM\0parametertype
.PP
(default long)
This defines the space reserved for grammar parameters on
the C runtime call stack. It may be worthwhile changing this to int on
systems where int is much shorter than long. On such systems, integer
constants must be cast to PARAM before they can be used as grammar
parameters, viz: foo((PARAM)0).
.PP