.SH
3: Lexical Analysis
.PP
The user must supply a lexical analyzer to read the input stream and communicate tokens
(with values, if desired) to the parser.
The lexical analyzer is an integer-valued function called
.I yylex .
The function returns an integer, the
.I "token number" ,
representing the kind of token read.
If there is a value associated with that token, it should be assigned
to the external variable
.I yylval .
.PP
The parser and the lexical analyzer must agree on these token numbers in order for
communication between them to take place.
The numbers may be chosen by Yacc, or chosen by the user.
In either case, the ``# define'' mechanism of C is used to allow the lexical analyzer
to return these numbers symbolically.
For example, suppose that the token name DIGIT has been defined in the declarations section of the
Yacc specification file.
The relevant portion of the lexical analyzer might look like:
.DS
yylex(){
	extern int yylval;
	int c;
	. . .
	c = getchar();
	. . .
	switch( c ) {
		. . .
	case \'0\':
	case \'1\':
	  . . .
	case \'9\':
		yylval = c\-\'0\';
		return( DIGIT );
		. . .
		}
	. . .
.DE
.PP
The intent is to return a token number of DIGIT, and a value equal to the numerical value of the
digit.
Provided that the lexical analyzer code is placed in the programs section of the specification file,
the identifier DIGIT will be defined as the token number associated
with the token DIGIT.
.PP
This mechanism leads to clear,
easily modified lexical analyzers; the only pitfall is the need
to avoid using any token names in the grammar that are reserved
or significant in C or the parser; for example, the use of
token names
.I if
or
.I while
will almost certainly cause severe
difficulties when the lexical analyzer is compiled.
The token name
.I error
is reserved for error handling, and should not be used naively
(see Section 7).
.PP
As mentioned above, the token numbers may be chosen by Yacc or by the user.
In the default situation, the numbers are chosen by Yacc.
The default token number for a literal
character is the numerical value of the character in the local character set.
Other names are assigned token numbers
starting at 257.
.PP
To assign a token number to a token (including literals),
the first appearance of the token name or literal
.I
in the declarations section
.R
can be immediately followed by
a nonnegative integer.
This integer is taken to be the token number of the name or literal.
Names and literals not defined by this mechanism retain their default definition.
It is important that all token numbers be distinct.
.PP
For historical reasons, the endmarker must have token
number 0 or negative.
This token number cannot be redefined by the user; thus, all
lexical analyzers should be prepared to return 0 or negative as a token number
upon reaching the end of their input.
.PP
A very useful tool for constructing lexical analyzers is
the
.I Lex
program developed by Mike Lesk.
.[
Lesk Lex
.]
These lexical analyzers are designed to work in close
harmony with Yacc parsers.
The specifications for these lexical analyzers
use regular expressions instead of grammar rules.
Lex can be easily used to produce quite complicated lexical analyzers,
but there remain some languages (such as FORTRAN) which do not
fit any theoretical framework, and whose lexical analyzers
must be crafted by hand.
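.PP
As a hedged illustration of the conventions in this section, the following is a minimal,
self-contained sketch of a hand-written lexical analyzer in C.
It is not taken from Yacc itself and rests on several assumptions:
DIGIT is defined by hand as 257 (normally Yacc would supply the definition),
.I yylval
is defined locally rather than by the parser,
and input comes from a string through the hypothetical helper
.I lex_set_input
instead of from
.I getchar .
Digits are returned as DIGIT with their value in
.I yylval ,
any other nonblank character is returned as a literal whose token number is its character code,
and 0 is returned as the endmarker.

```c
#include <ctype.h>

/* Assumption: in a real specification Yacc would generate this definition
   from a DIGIT token declaration; 257 is the first default number for names. */
#define DIGIT 257

/* Assumption: defined here so the sketch stands alone; a Yacc-generated
   parser normally provides yylval and the lexer declares it extern. */
int yylval;

/* Hypothetical input source for this sketch: a C string instead of stdin. */
static const char *lex_input = "";

/* Hypothetical helper: point the lexer at a new input string. */
void lex_set_input(const char *s)
{
    lex_input = s;
}

int yylex(void)
{
    int c;

    /* Skip blanks and tabs; they produce no token in this sketch. */
    while (*lex_input == ' ' || *lex_input == '\t')
        lex_input++;

    /* End of input: return the endmarker. The pointer is not advanced,
       so every later call also returns 0. */
    if (*lex_input == '\0')
        return 0;

    c = (unsigned char)*lex_input++;

    if (isdigit(c)) {
        yylval = c - '0';   /* numerical value of the digit */
        return DIGIT;       /* symbolic token number */
    }

    /* Literal character: its default token number is its character code. */
    return c;
}
```

.PP
Under these assumptions, after lex_set_input("7 + 4") successive calls of
.I yylex
return DIGIT (with
.I yylval
set to 7), the character code of the plus sign, DIGIT (with
.I yylval
set to 4), and then 0 on every later call.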