<HTML><HEAD><TITLE>Using LEX with ACCENT</TITLE></HEAD>
<BODY bgcolor="white">
<TABLE cellspacing=20>
<TR>
<TD valign="top"> <img src="logo.gif"> </TD>
<TD valign="bottom" align="left">
<a href="index.html">The Accent Compiler Compiler</a>
<h1>Using LEX with ACCENT</h1>
</TD>
</TR>
<TR>
<TD align="right" valign="top">
<!-- MENU -->
<font face="helvetica">
<a href="index.html">Accent</a><br>
<a href="overview.html">Overview</a><br>
<a href="tutorial.html">Tutorial</a><br>
<a href="language.html">Language</a><br>
<a href="installation.html">Installation</a><br>
<a href="usage.html">Usage</a><br>
Lex<br>
<a href="algorithms.html">Algorithms</a><br>
<a href="distribution.html">Distribution</a><br>
</font>
</TD>
<TD valign="top">
<!--- begin main content -->
<h3>The Scanner Function</h3>
The representation of terminal symbols (tokens) is not defined
by the <i>Accent</i> specification. An <i>Accent</i> parser
cooperates with a lexical scanner that converts the source text into
a sequence of tokens. This scanner is implemented by a function
<tt>yylex()</tt> that reads the next token and returns a value
representing the kind of the token.
<h3>The Kind of a Token</h3>
The kind of a token is indicated by a number.
<p>
A terminal symbol denoted by a literal in the <i>Accent</i> specification,
e.g. <tt>'+'</tt>, is represented by the numerical value of the character.
So <tt>yylex()</tt> returns this value when it has recognized this literal:
<pre>
   return '+';
</pre>
A terminal symbol denoted by a symbolic name declared
in the token declaration part of the <i>Accent</i> specification,
e.g. <tt>NUMBER</tt>, is represented by a constant with a symbolic name
that is the same as the token name.
So <tt>yylex</tt> returns this constant:
<pre>
   return NUMBER;
</pre>
The definition of the constants is generated by <i>Accent</i>
and is contained in the generated file <tt>yygrammar.h</tt>.
Hence the file defining <tt>yylex</tt> should include this file:
<pre>
   #include "yygrammar.h"
</pre>
<h3>The Attribute of a Token</h3>
Besides having a kind (e.g. <tt>NUMBER</tt>),
a token can also carry a semantic attribute.
The function <tt>yylex</tt>
assigns this attribute value to the variable <tt>yylval</tt>.
For example
<pre>
   yylval = atoi(yytext);
</pre>
(here <tt>yytext</tt> is the text of the token that has been recognized
as a <tt>NUMBER</tt>; the function <tt>atoi()</tt> converts this
string into a numerical value).
<p>
The variable <tt>yylval</tt> is defined
in the generated file <tt>yygrammar.c</tt>.
An <tt>extern</tt> declaration for this variable
is provided in the generated file <tt>yygrammar.h</tt>.
<p>
<tt>yylval</tt> has type <tt>YYSTYPE</tt>.
This is defined by <i>Accent</i>
in the file <tt>yygrammar.h</tt> as a macro standing for <tt>long</tt>:
<pre>
   #ifndef YYSTYPE
   #define YYSTYPE long
   #endif
</pre>
The user can define his or her own type before including the file
<tt>yygrammar.h</tt>.
For example, a file <tt>yystype.h</tt> may define
<pre>
   typedef union {
      int intval;
      float floatval;
   } ATTRIBUTE;
   #define YYSTYPE ATTRIBUTE
</pre>
Now the file defining <tt>yylex()</tt> includes two header files, in this order:
<pre>
   #include "yystype.h"
   #include "yygrammar.h"
</pre>
and sets the semantic attribute by:
<pre>
   yylval.intval = atoi(yytext);
</pre>
<h3>The <i>Lex</i> Specification</h3>
The function <tt>yylex</tt> can be generated by the scanner generator
<i>Lex</i> (or the free implementation <i>Flex</i>).
<p>
The <a href="http://dinosaur.compilertools.net"><i>Lex &amp; Yacc Page</i></a>
has online documentation for <i>Lex</i> and <i>Flex</i>.
<p>
A <i>Lex</i> specification gives rules that define, for each token, how it
is represented and how it is processed.
A rule has the form
<pre>
   pattern { action }
</pre>
<tt>pattern</tt> is a regular expression
that specifies the representation of the token.
<p>
<tt>action</tt> is <i>C</i> code that specifies how the token is processed.
This code sets the attribute value and returns the kind of the token.
<p>
For example, here is a rule for the token <tt>NUMBER</tt>:
<pre>
   [0-9]+ { yylval.intval = atoi(yytext); return NUMBER; }
</pre>
The <i>Lex</i> specification starts with a definition section,
which can be used to include header files and to declare variables.
For example,
<pre>
   %{
   #include "yystype.h"
   #include "yygrammar.h"
   %}
   %%
</pre>
Here the section includes <tt>yystype.h</tt> to provide a user-specific
definition of <tt>YYSTYPE</tt>, and <tt>yygrammar.h</tt>,
which defines the token codes.
The <tt>%%</tt> separates this section from the rules part.
<h3>The <i>Accent</i> Specification</h3>
In the <i>Accent</i> specification, tokens are introduced in the token
declaration part.
<p>
For example
<pre>
   %token NUMBER;
</pre>
introduces a token with the name <tt>NUMBER</tt>.
<p>
Inside a rule the token can be used with a parameter,
for example
<pre>
   NUMBER&lt;x&gt;
</pre>
This parameter can then be used in actions to access the attribute of the token.
It is of type <tt>YYSTYPE</tt>.
<pre>
   Value :
      NUMBER&lt;x&gt; { printf("%d", x.intval); }
   ;
</pre>
or simply
<pre>
   Value :
      NUMBER&lt;x&gt; { printf("%ld", x); }
   ;
</pre>
if there is no user-specific definition of <tt>YYSTYPE</tt>
(the default type is <tt>long</tt>, hence <tt>%ld</tt>).
<p>
In contrast to the <i>Lex</i> specification, the <i>Accent</i> specification
does not include <tt>yygrammar.h</tt>.
If the user defines his or her own type <tt>YYSTYPE</tt>,
this has to be done in the global prelude part, e.g.
<pre>
   %prelude {
   #include "yystype.h"
   }
</pre>
<h3>Tracking the Source Position</h3>
Like <tt>yylval</tt>, which holds the attribute of a token,
there is a further variable, <tt>yypos</tt>, that holds the source position
of the token.
<p>
<tt>yypos</tt> is declared in the <i>Accent</i> runtime
as an <tt>extern</tt> variable of type <tt>long</tt>.
Its initial value is <tt>1</tt>.
<p>
This variable can
be set in rules of the <i>Lex</i> specification.
For example,
<pre>
   \n { yypos++; /* adjust line number and skip newline */ }
</pre>
Whenever a newline character is seen, <tt>yypos</tt> is incremented
and so holds the current line number.
<p>
The variable <tt>yypos</tt> is managed in such a way that
it holds the correct value when <tt>yyerror</tt> is invoked to
report a syntax error
(even though, due to lookahead, the next token has already been read).
<p>
It also holds the correct value when semantic actions are executed
(note that this is done after lexical analysis and parsing).
Hence it can be used inside semantic actions,
for example
<pre>
   value:
      NUMBER&lt;n&gt; { printf("value in line %ld is %ld\n", yypos, n); }
   ;
</pre>
<!--- end main content -->
<br> <br>
<font face="helvetica" size="1">
<a href="http://accent.compilertools.net">accent.compilertools.net</a>
</font>
</TD>
</TR>
</TABLE>
</BODY></HTML>