📄 readme.txt

📁 c—语言的句法分析器。读入一个C--语言程序
💻 TXT
📖 第 1 页 / 共 2 页
字号:
12 下一页
编译原理实验（3）实验报告
============================================================================
计算机科学系2004级(1)班  陈健 学号04120004
完成时间  2007年5月7日

----------------------------------------------------------------------------
一. 实验目的及要求
----------------------------------------------------------------------------
1．c—语言的句法分析器。读入一个C--语言程序，判断该程序是不是一个合法的C--
语言程序。如果程序有语法和语义错误，请给出错误信息。至少处理以下错误：
（1）所有语法错误。C――的语法参见C――的文档；
（2）语义错误：
    2a：标志符未定义错误，多次定义错误（需要采用符号表。关于符号表的实现，请
大家参阅第7章。）
    2b: 类型错误。判断运算符的参数是否具有所需要的类型。注意：算术运算符包括：
＋，－，*, /, ％只作用于整数。关系运算符（<,>，等）可以作用于整数表达式，字符
表达式，但是不要求作用于字符串表达式（否则C－－就不是C语言的子集了，虽然理论
上支持“ab” < “ac”还是挺有意思的）。类型错误包括函数调用的参数。

2．程序要做成命令行程序，带一个参数，表示输入的C--文件名。

3．实例输入文件为 s1.c, s2.c，s3.c其中s1.c和s3.c有错误，s2.c没有错误。

----------------------------------------------------------------------------
二. 设计思想
----------------------------------------------------------------------------
①概况
●利用语法分析器代码生成器yacc(bison)生成代码。可以识别出语法错误。
●识别类型错误，主要通过构造符号表，函数表，修改产生式对应的语义动作来实现。
●yacc文件的设计 见 ②lab3.y 的设计
●如何编译及运行程序可以看下面的 三. 编译及运行

②lab3.y 的设计
0. 下面用到的函数InsertID  InsertFun  LookUpFun  MakeTable  InsertSym  
   LookUp 在src/lab3.h中声明,src/lab3.c中实现
   它们是按照老师给的 C--编译器的示例代码 中的函数来写的，稍做修改
    
1. 修改实验2的lex文件，得到需要的词法分析器yylex函数
1.1 全局变量lineno记录行号，将用于错误输出时指定行号
1.2 遇见token:ID NUM CHAR_LITERAL STRING_LITERAL，用InsertID函数插入表
    id_hash_table中，这个表将存放这些token的串，散列函数采用的是书7.6.5节的
    hashpjw
1.3 id_hash_table相关的结构
    #define HASHTABLESIZE 1511
    char *id指向字符串在内存中的位置
                                                     
      ┌──┐    ┌──┬──┐    ┌──┬──┐         
      │ 0  │─→│id  │next│─→│id  │next│─→NULL
      ├──┤    └──┴──┘    └──┴──┘       
      │ 1  │       id_entry          id_entry            
      ├──┤                                            
      │ :  │                                           
      │ :  │                                                                     
      ├──┤                                             
      │1510│                                             
      └──┘                                            
   id_hash_table

2. 声明
2.1 
%union {
    char *id;             /* identifier */
    int  value;           /* value */
    char op[3];           /* operator */
    symbol *sign;         /* symbol */
    expr expn;            /* expression */
}
定义YYSTYPE
其中op[3],联系了算术运算符，关系运算符和一元操作符
2.2 全局变量
level 代表当前处理函数的所在的范围，level越高，嵌套的越多(我的理解可能是错的)

3. 翻译规则
3.1 按照 C--语言的说明.doc 写好产生式和对应的语义动作，初始设计的动作是空，如
leftside
    :  alt 
        {/* empty */}
    ;
3.2 处理移进归约冲突
会有两个state有移进/归约冲突
●一个是if else语句的悬空else二义所引起的，移进优先，修改如下：
%nonassoc  IFX
%nonassoc  ELSE
%%
if_stmt
    :  IF '(' expression ')' statement %prec IFX
    |  IF '(' expression ')' statement ELSE statement
    ;
●另一个是
fun_declaration
    :  type_specifer calling_convention ID '(' params ')' compound_stmt
    |  type_specifer calling_convention ID '(' params ')' ';'
    ;
calling_convention
    :  CDECL
    |  STDCALL
    |  ε
    ;
由于calling_convention含ε产生式, 在
var_declaration  ->  type_specifer . ID ';'
fun_declaration  ->  type_specifer . calling_convention ID '(' params ')' compound_stmt
fun_declaration  ->  type_specifer . calling_convention ID '(' params ')' ';'
时，遇见ID时产生移进/归约冲突
为了解决冲突和方便后面添加函数到函数表funtable，修改如下:
fun_declaration
    :  fun_tag compound_stmt
    |  fun_tag ';'
    ;
fun_tag
    :  type_specifer calling_convention ID '(' params ')'
    |  type_specifer ID '(' params ')'
    ;
calling_convention
    :  CDECL
    |  STDCALL
    ;

3.3 在定义ID时将它加入符号表 symboltable
3.3.1 找出所有定义ID的地方用InsertSym加入符号表
var_declaration              /* 变量定义处 */
    :  type_specifer ID ';'
    |  type_specifer ID '[' NUM  ']' ';'
    |  type_specifer '*' ID ';'
    |  type_specifer '*' ID '[' NUM ']' ';'
    ;
param                         /* 函数声明处形参 */
    :  type_specifer ID
    |  type_specifer '*' ID
    |  type_specifer ID '[' ']'
    |  ELLIPSIS
    ;

3.3.2 在不同的函数范围内，改变level后添加符号
●fun_tag之后跟着 compound_stmt
进入函数体,level加1
fun_tag
    :  type_specifer calling_convention ID '(' params ')'
       {
           ...
           level++;
           ...
       }
    |  type_specifer ID '(' params ')'
       {
           ...
           level++;
           ...
       }
    ;
●在
fun_declaration => fun_tag compound_stmt
fun_declaration => fun_tag ;
归约时，离开函数，level减1
fun_declaration
    :  fun_tag compound_stmt
       {
           tablelen--;
           level--;
       }
    |  fun_tag ';'
       {
           tablelen--;
           level--;
       }
    ;
●函数参数的声明
这里定义的变量在函数体内是可用的，level加1，插入符号，再减1
param
    :  type_specifer ID
       {
           ...
           level++;
           p = InsertSym($2, tp);
           ...
           level--;
           ...
       }
    |  type_specifer '*' ID
       {
           ...
           level++;
           p = InsertSym($3, tp);
           ...
           level--;
           ...
       }
    |  type_specifer ID '[' ']'
       {
           ...
           level++;
           p = InsertSym($2, tp);
           ...
           level--;
           ...
       }
    |  ELLIPSIS
       {
           ...
       }
    ;

3.3.3 在插入前要注意这些量所在的level
若level相同则在相同级别的symboltable[tablelen-1]中添加 
若待插入的table的 level(原) 小于现在所在的 level(当前) 则说明进入新的函数范围，
就新建一个level等于 level(当前) 的symboltable，再添加符号

3.3.4 symboltable相关结构
     #define BUCKETSIZE    211
     #define SYMBOLSIZE    100
     ┌───┐                                                           
     │  0   │─→   struct table                                         
     ├───┤                                                          
     │  1   │─→ ┌──────┐                                     
     ├───┤     │ level      │                                    
     │      │     ├──────┤                                    
     │  :   │     │ previous   │─→  global_table                  
     │  :   │     ├──────┤                                    
     │  :   │     │ buckets[0] │                                    
     │      │     │            │     ┌───┬───┐              
     │      │     │ buckets[1] │─→ │      │name  │             
     │      │     │            │     │      ├───┤             
     ├───┤     │    :       │     │ sym  │scope │          
     │ 100  │     │    :       │     │      ├───┤            
     └───┘     │    :       │     │      │type  │             
    symboltable     │            │     ├───┴───┤              
                    │buckets[210]│     │     next     │─→struct   
                    └──────┘     └───────┘    sym_entry 
                      struct table       struct sym_entry                

3.4 在定义函数时将它加入函数的表 funtable
3.3.1 找出所有定义函数的地方用InsertFun加入funtable
fun_tag
    :  type_specifer calling_convention ID '(' params ')'
    |  type_specifer ID '(' params ')'
    ;

3.3.2 funtable相关结构
      #define FUNCTIONSIZE  1009
      char *funame函数名
      int retype  函数返回值类型
      int count   函数参数个数
      int *type   函数参数类型(存放在一个数组)
     ┌───┐    ┌──────┐                
     │  0   │─→│ funame     │               
     ├───┤    ├──────┤               
     │  1   │    │ retype     │             
     ├───┤    ├──────┤              
     │      │    │ count      │               
     │  :   │    ├──────┤              
     │  :   │    │ type       │─→ argtemp.temp
     │  :   │    ├──────┤               
     │      │    │ next       │─→ struct fun   
     │      │    └──────┘               
     │      │      struct fun                  
     ├───┤                                 
     │ 1008 │                                  
     └───┘                                  
      funtable

4 检查错误
4.1 检查语法错误
yacc检查到语法错误时会报错，修改的yyerror函数，将打印出错的行
利用bison的YYERROR_VERBOSE这个宏，给出更详细的错误说明

4.2 检查语义错误
4.2.1 标志符未定义错误
在用到ID的地方
var
    :  ID
    |  ID '[' expression ']'
    ;
用lookup函数搜索该变量所在的符号表和全局变量所在的符号表
若找不到提示出错：undeclared identifier

4.2.2 多次定义错误
找出所有定义ID的地方
var_declaration              /* 变量定义处 */
    :  type_specifer ID ';'
    |  type_specifer ID '[' NUM  ']' ';'
    |  type_specifer '*' ID ';'
    |  type_specifer '*' ID '[' NUM ']' ';'
    ;
param                         /* 函数声明处形参 */
12 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -