⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 context.htm

📁 英文版编译器设计:里面详细介绍啦C编译器的设计
💻 HTM
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0047)http://cs.wwc.edu/~aabyan/464/Book/Context.html -->
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1458" name=GENERATOR></HEAD>
<BODY text=#000000 vLink=#51188e aLink=#ff0000 link=#0000ef bgColor=#ffffff>
<HR width="100%">

<CENTER>
<H1>Symbol Tables</H1></CENTER>
<HR width="100%">
<BR>Lex and Yacc files can be extended to handle the context sensitive 
information. For example, suppose we want to require that, in Simple, we require 
that variables be declared before they are referenced. Therefore the parser must 
be able to compare variable references with the variable declarations. 
<P>One way to accomplish this is to construct a list of the variables during the 
parse of the declaration section and then check variable references against the 
those on the list. Such a list is called a <I>symbol table</I>. Symbol tables 
may be implemented using lists, trees, and hash-tables. 
<P>We modify the Lex file to assign the global variable <TT>yylval</TT> to the 
identifier string since the information will be needed by the attribute grammar. 

<H2>The Symbol Table Module</H2>To hold the information required by the 
attribute grammar we construct a symbol table. A symbol table contains the 
environmental information concerning the attributes of various programming 
language constructs. In particular, the type and scope information for each 
variable. The symbol table will be developed as a module to be included in the 
yacc/bison file. 
<P>The symbol table for Simple consists of a linked list of identifiers, 
initially empty. Here is the declaration of a node, initialization of the list 
to empty, <BR>&nbsp; 
<CENTER>
<TABLE border=0 NOSAVE>
  <TBODY>
  <TR NOSAVE>
    <TD NOSAVE><TT>struct symrec<BR>{<BR>&nbsp; char 
      *name;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      /* name of symbol&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      */<BR>&nbsp; struct symrec *next;&nbsp;&nbsp;&nbsp; /* link 
      field&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      */<BR>};<BR>typedef struct symrec symrec;<BR>symrec *sym_table = (symrec 
      *)0;<BR>symrec *putsym ();<BR>symrec *getsym 
();</TT></TD></TR></TBODY></TABLE></CENTER>
<P>and two operations: putsym to put an identifier into the table, and 
<TT>getsym</TT> which returns a pointer to the symbol table entry corresponding 
to an identifier. <BR>&nbsp; 
<CENTER>
<TABLE border=0 NOSAVE>
  <TBODY>
  <TR NOSAVE>
    <TD NOSAVE><PRE><TT>symrec * putsym ( char *sym_name )
{
&nbsp; symrec *ptr;
&nbsp; ptr = (symrec *) malloc (sizeof(symrec));
&nbsp; ptr$-&gt;$name = (char *) malloc (strlen(sym_name)+1);
&nbsp; strcpy (ptr$-&gt;$name,sym_name);
&nbsp; ptr$-&gt;$next = (struct symrec *)sym_table;
&nbsp; sym_table = ptr;
&nbsp; return ptr;
}</TT></PRE><TT>symrec * getsym ( char *sym_name )<BR>{<BR>&nbsp; symrec 
      *ptr;<BR>&nbsp; for (ptr = sym_table; ptr != (symrec *) 
      0;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ptr = (symrec 
      *)ptr$-&gt;$next)<BR>&nbsp;&nbsp;&nbsp; if (strcmp 
      (ptr$-&gt;$name,sym_name) == 0)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return 
      ptr;<BR>&nbsp; return 0;<BR>}</TT></TD></TR></TBODY></TABLE></CENTER>
<H2>The Parser Modifications</H2>The Yacc/Bison file is modified to include the 
symbol table and with routines to perform the installation of an indentifier in 
the symbol table and to perform context checking. <BR>&nbsp; 
<CENTER>
<TABLE border=0 NOSAVE>
  <TBODY>
  <TR NOSAVE>
    <TD NOSAVE><TT>%{<BR>#include 
      $&lt;$stdlib.h$&gt;$&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* For malloc in 
      symbol table */<BR>#include 
      $&lt;$string.h$&gt;$&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* For strcmp in 
      symbol table */<BR>#include 
      $&lt;$stdio.h$&gt;$&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* For error 
      messages&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; */<BR>#include 
      "ST.h"&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* The 
      Symbol Table Module&nbsp;&nbsp;&nbsp; */<BR>#define YYDEBUG 
      1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* For 
      debugging&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      */<BR>install ( char *sym_name )<BR>{&nbsp; symrec *s;<BR>&nbsp;&nbsp; s = 
      getsym (sym_name);<BR>&nbsp;&nbsp; if (s == 
      0)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; s = putsym 
      (sym_name);<BR>&nbsp;&nbsp; else { 
      errors++;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      printf( "%s is already defined\n", sym_name );<BR>&nbsp;&nbsp; 
      }<BR>}<BR>context_check( char *sym_name )<BR>{ if ( getsym( sym_name ) == 
      0 )&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp; printf( "%s is an undeclared 
      identifier\n", sym_name );<BR>}<BR>%}<BR></TT><I>Parser 
      declarations<BR></I><TT>%%<BR></TT><I>Grammar rules and 
      actions<BR></I><TT>%%<BR></TT><I>C 
subroutines</I></TD></TR></TBODY></TABLE></CENTER>
<P>Since the scanner (the Lex file) will be returning identifiers, a semantic 
record (static semantics) is required to hold the value and <TT>IDENT</TT>&nbsp; 
is associated with that semantic record. <BR>&nbsp; 
<CENTER>
<TABLE border=0 NOSAVE>
  <TBODY>
  <TR NOSAVE>
    <TD NOSAVE><I>C declarations<BR></I><TT>%union 
      {&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      /* SEMANTIC 
      RECORD&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      */<BR>char&nbsp;&nbsp;&nbsp; 
      *id;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      /* For returning identifiers&nbsp; */<BR>}<BR>%token INT SKIP IF THEN ELSE 
      FI WHILE DO END&nbsp;<BR>%token &lt;id&gt; IDENT /* Simple 
      identifier&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; */<BR>%left '-' '+'<BR>%left '*' 
      '/'<BR>%right '^'&nbsp;<BR>%%<BR></TT><I>Grammar rules and 
      actions<BR></I><TT>%%<BR></TT><I>C 
subroutines</I></TD></TR></TBODY></TABLE></CENTER>The context free-grammar is 
modified to include calls to the install and context checking functions. 
<TT>$<I>n</I></TT> is a variable internal to Yacc which refers to the semantic 
record corresponding the the n-th symbol on the right hand side of a production. 
<BR>&nbsp; 
<CENTER>
<TABLE border=0 NOSAVE>
  <TBODY>
  <TR NOSAVE>
    <TD NOSAVE><I>C and parser 
      declarations<BR></I><TT>%%<BR>...<BR>declarations : /* empty 
      */<BR>&nbsp;&nbsp; | INTEGER id_seq IDENTIFIER '.' { install( $3 ); 
      }<BR>;<BR>id_seq : /* empty */<BR>&nbsp;&nbsp; | id_seq IDENTIFIER 
      ','&nbsp; { install( $2 );&nbsp;&nbsp; }<BR>;<BR>command : 
      SKIP<BR>&nbsp;&nbsp; | READ IDENTIFIER&nbsp;&nbsp; { context_check( $2 ); 
      }<BR>&nbsp;&nbsp; | IDENT ASSGNOP exp { context_check( $2 ); 
      }<BR>...<BR>exp : 
      INT&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp; 
      | 
      IDENT&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      { context_check( $1 ); }&nbsp;<BR>...<BR>%%<BR></TT><I>C 
  subroutines</I></TD></TR></TBODY></TABLE></CENTER>
<P>In this implementation the parse tree is implicitly annotated with the 
information concerning whether a variable is assigned to a value before it is 
referenced in an expression. The annotations to the parse tree are collected 
into the symbol table. 
<H2>The Scanner Modifications</H2>The scanner must be modified to return the 
literal string associated each identifier (the semantic value of the token). The 
semantic value is returned in the global variable <TT>yylval</TT>. 
<TT>yylval</TT>'s type is a union made from the <TT>%union</TT> declaration in 
the parser file. The semantic value must be stored in the proper member of the 
union. Since the union declaration is: 
<CENTER>
<TABLE border=0 NOSAVE>
  <TBODY>
  <TR NOSAVE>
    <TD NOSAVE>%union { char *id;&nbsp;<BR>}</TD></TR></TBODY></TABLE></CENTER>
<P>The semantic value is copied from the global variable <TT>yytext</TT> (which 
contains the input text) to <TT>yylval.id</TT>. Since the function 
<TT>strdup</TT> is used (from the library <TT>string.h</TT>) the library must be 
included. The resulting scanner file is: <BR>&nbsp; 
<CENTER>
<TABLE border=0 NOSAVE>
  <TBODY>
  <TR NOSAVE>
    <TD NOSAVE><TT>%{<BR>#include &lt;string.h&gt;&nbsp;&nbsp;&nbsp;&nbsp; /* 
      for 
      strdup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      */&nbsp;&nbsp;&nbsp;<BR>#include "Simple.tab.h" /* for token definitions 
      and yylval */<BR>%}<BR>DIGIT&nbsp;&nbsp;&nbsp; 
      [0-9]<BR>ID&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      [a-z][a-z0-9]*<BR>%%<BR>":="&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(ASSGNOP);&nbsp;&nbsp; }<BR>{DIGIT}+ { 
      return(NUMBER);&nbsp;&nbsp;&nbsp; 
      }<BR>do&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(DO);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>else&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(ELSE);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>end&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(END);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>fi&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(FI);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>if&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(IF);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>in&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(IN);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>integer&nbsp; { 
      return(INTEGER);&nbsp;&nbsp; }<BR>let&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(LET);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>read&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(READ);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>skip&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(SKIP);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      }<BR>then&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(THEN);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<BR>while&nbsp;&nbsp;&nbsp; { 
      return(WHILE);&nbsp;&nbsp;&nbsp;&nbsp; }<BR>write&nbsp;&nbsp;&nbsp; { 
      return(WRITE);&nbsp;&nbsp;&nbsp;&nbsp; }<BR>{ID}&nbsp;&nbsp;&nbsp;&nbsp; { 
      yylval.id = (char *) 
      strdup(yytext);&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
      return(IDENTIFIER);}<BR>[ \t\n]+ /* eat up whitespace 
      */<BR>.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { 
      return(yytext[0]);}<BR>%%</TT></TD></TR></TBODY></TABLE></CENTER>
<H2>Intermediate Representation</H2>Most compilers convert the source code to an 
intermedate representation during this phase. In this example, the intermediate 
representation is a parse tree. The parse tree is held in the stack but it could 
be made explicit. Other popular choices for intermediate representation include 
abstract parse trees, three-address code, also known as quadruples, and post fix 
code. In this example we have chosen to bypass the generation of an intermediate 
representation and go directly to code generation. The principles illustrated in 
the section on code generation also apply to the generation of intermediate 
code. </BODY></HTML>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -