cobol.pars

From "Source code analysis reader for Linux, new Red Hat version" · PARS code · 2,242 lines · page 1 of 5
/*
		COBOL grammar
		=============

		conforming to:

	ANSI'74 Standard (ANSI X3.23 - 1974)
	ANSI'85 Standard (ANSI X3.23 - 1985)
	IBM OS/VS COBOL
	IBM VS COBOL II
	IBM SAA COBOL/370
	IBM DOSVS COBOL
	X/Open
	Micro Focus COBOL
*/

/* Ich, Doktor Josef Grosch, Informatiker, March 1997 */

/*
Conventions:

The suffix _l stands for list.
The suffix _e stands for list element.
The suffix _o stands for optional.
The suffix _i stands for imperative statement.

For some nonterminals such as name, qualification, subscription, and identifier
various kinds of usage are distinguished:

No  suffix    stands for read access.
The suffix _w stands for write access.
The suffix _f stands for forward reference.
The suffix _c stands for reference in CORRESPONDING context.
The suffix _n stands for none of the above.

The character - is replaced by _ in nonterminals.

Notation for words:

 keywords	: all uppercase				: END
 optional words	: first uppercase, else lowercase	: Is
 terminals	: all lowercase				: real
 nonterminals	: all lowercase				: identifier

Terminals with	: unsigned_integer, plus_integer, minus_integer, level_number,
 attributes are	: real, string, name, paragraph_name, pseudo_text,
		  picture_string, illegal_character
*/

/*
		  Discussion of the LR-Conflicts
		  ------------------------------

The grammars for COBOL '85 and Micro Focus COBOL in their published forms are
highly ambiguous and they are not LR(k) for any k. The grammars in their original
version contain dozens of LR conflicts.
However, the situation is not as bad as it might seem because:

- The verbose syntax rules of the language specify how to resolve some conflicts.
- Many rules can be rewritten into an LR(1) form.
- A few rules are LR(2); they require a lookahead of 2 tokens.
- Some rules could be written in an LR(1) form, but the natural version that
  reflects the semantic structure is LR(2).

Theoretically, the grammar for COBOL following below is still not LR(k) for any k.
The shift-reduce conflicts that require a lookahead of arbitrary length can be
resolved in favor of the shift action according to the verbose syntax rules.
Therefore, the grammar is practically LR(2). In order to process the grammar with
an LALR(1) tool, a buffer could be inserted between scanner and parser. This buffer
implements a lookahead of 2 tokens by modifying some tokens if these are followed
by certain tokens. With this mechanism the grammar is actually LALR(1).
The parser generator Lark automatically provides the mentioned buffer and even
supports lookahead of an unlimited number of tokens. In the following
I will discuss the interesting conflicts present in COBOL '85 in detail.

1. file_control_entry

   SELECT f ACCESS MODE SEQUENTIAL .
   SELECT f ACCESS MODE SEQUENTIAL RELATIVE KEY IS n .
   SELECT f ACCESS MODE RANDOM .
   SELECT f ACCESS MODE RANDOM RELATIVE KEY IS n .
   SELECT f ACCESS MODE DYNAMIC .
   SELECT f ACCESS MODE DYNAMIC RELATIVE KEY IS n .

   Does RELATIVE start the KEY phrase of the ACCESS MODE clause or does it
   start the ORGANIZATION IS clause? This conflict requires a lookahead of 2.
   It could be handled with a lookahead of 1 by adding the following rules:

   select_clause = ACCESS Mode Is SEQUENTIAL RELATIVE .
   select_clause = ACCESS Mode Is RANDOM     RELATIVE .
   select_clause = ACCESS Mode Is DYNAMIC    RELATIVE .

   These rules are combinations of ACCESS MODE clauses and ORGANIZATION IS
   clauses. They recognize the given combinations of two clauses with one rule.
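   As an illustration of the token buffer mentioned in the introduction, the
   following C sketch shows how a filter between scanner and parser could mark
   a RELATIVE token that is directly followed by KEY. It is a simplification
   and not part of this grammar: the token codes and the functions start, scan,
   and next_token are hypothetical names, the scanner is replaced by an array,
   and the optional words Key and Is are not handled.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical token codes; a real scanner would define its own.
enum Token { T_EOF, T_SEQUENTIAL, T_RELATIVE, T_KEY, T_NAME, T_PERIOD,
             T_RELATIVE_KEY };  // T_RELATIVE_KEY: RELATIVE that starts a KEY phrase

static const enum Token *source;   // stand-in for the scanner: an array of tokens
static size_t source_len, source_pos;
static enum Token pending;         // token read ahead but not yet delivered
static int has_pending;

// Reset the filter to read from the given token array.
void start(const enum Token *tokens, size_t count)
{
    assert(tokens != NULL);
    source = tokens;
    source_len = count;
    source_pos = 0;
    has_pending = 0;
}

// Fetch the next raw token, or T_EOF when the array is exhausted.
static enum Token scan(void)
{
    return source_pos < source_len ? source[source_pos++] : T_EOF;
}

// Deliver the next token to the parser. A RELATIVE token is replaced by
// T_RELATIVE_KEY if the token after it is KEY; the token read ahead stays
// in the buffer and is delivered by the next call.
enum Token next_token(void)
{
    enum Token t = has_pending ? pending : scan();
    has_pending = 0;
    if (t == T_RELATIVE) {
        pending = scan();
        has_pending = 1;
        if (pending == T_KEY) t = T_RELATIVE_KEY;
    }
    return t;
}
```

   With such a filter the parser would see T_RELATIVE_KEY only where a KEY
   phrase starts, so a single token of lookahead would suffice to separate the
   two readings of RELATIVE.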
   Using Lark, syntactic predicates that trigger trial parsing can be added
   in order to solve the shift reduce conflicts:

   select_clause = ACCESS Mode Is SEQUENTIAL ? - RELATIVE_Key_Is_name .
   select_clause = ACCESS Mode Is RANDOM     ? - RELATIVE_Key_Is_name .
   select_clause = ACCESS Mode Is DYNAMIC    ? - RELATIVE_Key_Is_name .
   RELATIVE_Key_Is_name = RELATIVE Key Is name .

   The nonterminal RELATIVE_Key_Is_name checks whether RELATIVE starts the
   KEY phrase.

2. report_group_description_entry

   01 LINE NUMBER 50 .
   01 LINE NUMBER 50 NEXT PAGE .

   Does NEXT start the NEXT PAGE phrase of the LINE NUMBER clause or does it
   start a NEXT GROUP clause? This conflict requires a lookahead of 2.
   It could be handled with a lookahead of 1 by adding the following rules:

   report_group_clause = LINE Number Is integer NEXT GROUP Is integer .
   report_group_clause = LINE Number Is integer NEXT GROUP Is PLUS integer .
   report_group_clause = LINE Number Is integer NEXT GROUP Is NEXT PAGE .

   These rules are combinations of LINE NUMBER clauses and NEXT GROUP clauses.
   They recognize the given combinations of two clauses with one rule.
   Using Lark, the shift reduce conflict can be solved by considering a
   lookahead of 2 tokens:

   report_group_clause = LINE Number Is integer
                                ? { GetLookahead (2) == YYCODE (GROUP) } .

3. Scope delimiters and optional error phrases

   ADD a TO b SIZE ERROR ADD c TO d END-ADD
   ADD a TO b SIZE ERROR ADD c TO d NOT SIZE ERROR STOP RUN

   Are END-ADD or the NOT SIZE ERROR phrase associated with the outer or the inner
   ADD statement? This part of the grammar is ambiguous. The verbose syntax rules
   specify that both phrases are to be associated with the inner ADD statement.
   This corresponds to taking a shift action instead of a reduce action. This is
   also the usual method for parser generators to solve this type of conflict.
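   The effect of preferring the shift action can be illustrated by a small
   hand-written C sketch. It is not taken from this grammar: the reduced
   statement form and the functions at, parse_add, and owner_of_end_add are
   hypothetical, operands are skipped without checking, and the statement
   after SIZE ERROR is assumed to be another ADD.

```c
#include <assert.h>
#include <string.h>

// Hypothetical, heavily reduced statement form:
//   add = ADD operand TO operand [ SIZE ERROR add ] [ END-ADD ]
static const char **tok;    // current position in a NULL-terminated token array
static int end_add_owner;   // nesting depth of the ADD that took END-ADD, 0 = none

// True if the current token exists and equals the given word.
static int at(const char *word)
{
    return *tok != NULL && strcmp(*tok, word) == 0;
}

// Parse one ADD statement at the given nesting depth (1 = outermost).
static void parse_add(int depth)
{
    assert(at("ADD"));
    tok += 4;                    // skip: ADD operand TO operand
    if (at("SIZE")) {
        tok += 2;                // skip: SIZE ERROR
        parse_add(depth + 1);    // the imperative statement, here another ADD
    }
    if (at("END-ADD")) {         // shift: the innermost open ADD sees END-ADD first
        tok += 1;
        end_add_owner = depth;
    }
}

// Return the nesting depth of the ADD statement that consumed END-ADD.
int owner_of_end_add(const char **tokens)
{
    tok = tokens;
    end_add_owner = 0;
    parse_add(1);
    return end_add_owner;
}
```

   For the first example above the sketch reports depth 2: the inner ADD has
   already consumed END-ADD when control returns to the outer statement, which
   mirrors the shift preference.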
   This problem arises with all variants of ADD statements such as ADD TO,
   ADD TO GIVING, and ADD CORRESPONDING as well as for many more imperative
   statements with optional NOT phrases and optional scope delimiters:

      ADD CORRESPONDING, ADD TO, ADD TO GIVING,
      CALL,
      COMPUTE,
      DELETE,
      DIVIDE BY GIVING, DIVIDE BY GIVING REMAINDER, DIVIDE INTO,
      DIVIDE INTO GIVING, DIVIDE INTO GIVING REMAINDER,
      MULTIPLY BY, MULTIPLY BY GIVING,
      READ, READ KEY, READ NEXT,
      RECEIVE,
      REWRITE,
      START,
      STRING,
      SUBTRACT CORRESPONDING FROM, SUBTRACT FROM, SUBTRACT FROM GIVING,
      UNSTRING,
      WRITE, WRITE WITH NO ADVANCING

4. RECEIVE WITH DATA

   RECEIVE n MESSAGE INTO i NO DATA CLOSE f WITH DATA STOP RUN
   RECEIVE n MESSAGE INTO i NO DATA CLOSE f WITH LOCK

   Does WITH start the WITH DATA phrase of the RECEIVE statement or does it
   start the WITH LOCK phrase of the CLOSE statement? This conflict requires a
   lookahead of 2. This problem arises in combination of the RECEIVE statement
   and all statements that have an optional phrase starting with WITH. These are:

      CLOSE	/ WITH NO REWIND / WITH LOCK
      DISABLE	/ WITH KEY
      DISPLAY	/ WITH NO ADVANCING
      ENABLE	/ WITH KEY
      OPEN	/ WITH NO REWIND
      PERFORM	/ WITH TEST BEFORE / WITH TEST AFTER
      STRING	/ WITH POINTER
      UNSTRING	/ WITH POINTER
      WRITE	/ WITH NO ADVANCING

   Using Lark, these shift reduce conflicts can be solved by adding an
   inspection of 2 lookahead tokens to numerous rules such as e. g.:

   perform = PERFORM procedure ? { GetLookahead (2) == YYCODE (DATA) } .

5. PERFORM UNTIL NOT

   ADD a TO b SIZE ERROR PERFORM p UNTIL i NOT NUMERIC
   "                                     e NOT ZERO
   "                                     e NOT EQUAL TO f

   ADD a TO b SIZE ERROR PERFORM p UNTIL i NOT On SIZE ERROR
   "                                     i NOT INVALID KEY
   "                                     i NOT On OVERFLOW
   "                                     i NOT At END
   "                                     i NOT At END-OF-PAGE
   "                                     i NOT On EXCEPTION

   Does the NOT continue the condition after UNTIL or does it start a NOT ERROR
   or a similar NOT phrase of a containing statement? This conflict requires a
   lookahead of 2. In combination of PERFORM and ADD it is the SIZE ERROR phrase
   that causes the problem. For the other phrases, combinations of PERFORM with
   READ and WRITE or similar statements cause the trouble.
   Using Lark, this conflict can be solved by adding a syntactic predicate:

   Is           = ? not .
   not          = <
                = NOT classification .
                = NOT sign_3 .
                = NOT EQUAL .
                = NOT LESS .
                = NOT GREATER .
                = NOT '=' .
                = NOT '<' .
                = NOT '>' .
                = NOT '(' .
   > .

6. INSPECT TALLYING

   INSPECT a TALLYING i FOR ALL u v w j FOR ALL x y z

   Does the identifier j continue the list of identifiers after the first ALL
   or does it start a new FOR phrase? This problem could be formulated with a
   lookahead of 1 but the natural version that reflects the semantic structure
   requires a lookahead of 2.

   LR(2) Version:

      inspect           = INSPECT identifier TALLYING tallying_l .
      tallying_l        = < = tallying_e . = tallying_l tallying_e . > .
      tallying_e        = identifier 'FOR' for_l .

   LR(1) Version:

      inspect           = INSPECT identifier TALLYING identifier tallying_l .
      tallying_l        = < = tallying_e . = tallying_l tallying_e . > .
      tallying_e        = < = 'FOR' for_l . = 'FOR' for_l identifier . > .

   Using Lark, this conflict can be solved by adding syntactic predicates:

   for_e		= ALL     all_leading_l ? identifier_FOR .
   for_e		= LEADING all_leading_l ? identifier_FOR .
   identifier_FOR	= identifier 'FOR' .

   The nonterminal identifier_FOR checks whether identifier starts a new
   FOR phrase.

7. Sections and Paragraphs

   A SECTION. a. CONTINUE. b. CONTINUE. B SECTION. c. CONTINUE.

   Does the name B start a new section or a new paragraph? As before this problem
   could be formulated with a lookahead of 1 but the natural version that reflects
   the semantic structure requires a lookahead of 2.

   LR(2) Version:

procedure_division	= <
		= PROCEDURE DIVISION using_o '.' declaratives section_l .
		= PROCEDURE DIVISION using_o '.'              section_l .
		= PROCEDURE DIVISION using_o '.' paragraph_l .
> .
declaratives	= DECLARATIVES '.' d_section_l 'END' DECLARATIVES '.' .
d_section_l	= <
		=             section_head use '.' paragraph_l
		   ? { GetLookahead (2) == YYCODE (SECTION) } .
		= d_section_l section_head use '.' paragraph_l
		   ? { GetLookahead (2) == YYCODE (SECTION) } .
> .
section_l	= <
		= section_head paragraph_l
		   ? { GetLookahead (2) == YYCODE (SECTION) } .
		= section_head paragraph_l section_l .
> .
section_head	= name SECTION segment_number_o '.'
paragraph_l	= <
		= .
		= paragraph_l paragraph_e .
> .
paragraph_e	= name '.' sentence_l .

   LR(1) Version:

procedure_division	= <
		= PROCEDURE DIVISION using_o '.' declaratives name section_l .
		= PROCEDURE DIVISION using_o '.' name section_l .
		= PROCEDURE DIVISION using_o '.' name paragraph_l .
> .
declaratives	= DECLARATIVES '.' d_section_l 'END' DECLARATIVES '.' .
d_section_l	= <
		= SECTION segment_number_o '.' use '.' name paragraph_l .
		= SECTION segment_number_o '.' use '.' .
		= SECTION segment_number_o '.' use '.' name paragraph_l d_section_l .
		= SECTION segment_number_o '.' use '.' name             d_section_l .
> .
section_l	= <
		= SECTION segment_number_o '.' name paragraph_l .
		= SECTION segment_number_o '.' .
		= SECTION segment_number_o '.' name paragraph_l section_l .
		= SECTION segment_number_o '.' name             section_l .
> .
paragraph_l	= <
		= paragraph_e .
		= paragraph_l paragraph_e .
> .
paragraph_e	= <
		= '.' sentence_l name .
		= '.' sentence_l .
> .

   Using Lark, the shift reduce conflicts in the LR(2) version can be solved
   by adding an inspection of 2 lookahead tokens to some rules as shown above.

8. Identifiers: Subscription or Modification?

   n (i + 1)
   n (i + 1 :)

   Does the character '+' continue the index of a subscription or does it continue
   the expression of a modification? This conflict requires lookahead of arbitrary
   length. It can be solved by allowing a full expression for the first index of a
   subscription. The check for the restricted form of the index is delegated to
   semantic analysis.

   identifier	= <
		= qualification .
		= qualification '(' expression index_l ')' .
		= qualification '(' expression ':' ')' .
		...
   > .

9. LINAGE clause of file_description_entry

   In its natural form the LINAGE clause of the file_description_entry is LR(2).
   It can be rewritten into an LR(1) form as can be seen below.

10. SUM clause of report_group_description

   SUM a SUM b

   The SUM clause in a report_group_description_entry can be repeated several
   times. All the clauses in a report_group_description_entry can be given in
   arbitrary order. Therefore we allow a list of clauses and rely on semantic
   analysis to detect multiple appearances of clauses. Now, if the SUM clause
   would be specified as a list, too, it is ambiguous whether SUM b continues
   the outer list of all clauses or the inner list of SUM clauses.
   The solution omits the inner list, uses the outer list for the repetition
   of SUM clauses, too, and delegates the detailed checks to semantic analysis.
*/

PARSER

GLOBAL {

# include <ctype.h>
# include "Position.h"
# include "StringM.h"
# include "Idents.h"
# include "keywdef.h"
# include "keywords.h"
# include "def.h"
# include "deftab.h"
# include "paf.h"

# define yyInitStackSize	200
# define yyInitBufferSize	32
# define TOKENOP	PrevEPos = CurrentEPos; CurrentEPos = Attribute.name.EPos;
# define BEFORE_TRIAL	tPosition SavePEPos, SaveCEPos; SavePEPos = PrevEPos; SaveCEPos = CurrentEPos;
# define AFTER_TRIAL	PrevEPos = SavePEPos; CurrentEPos = SaveCEPos;

extern	rbool		Copy ARGS ((tIdent ident, tPosition pos));

static	tPosition	PrevEPos, CurrentEPos;
static	tIdent		iCURRENT_DATE	;
static	tIdent		iWHEN_COMPILED	;
}

BEGIN {
   iCURRENT_DATE	= MakeIdent ("CURRENT-DATE"	, 12);
   iWHEN_COMPILED	= MakeIdent ("WHEN-COMPILED"	, 13);
}

START programs descriptions

PROPERTY INPUT

RULE

programs	= program_l program_end_o .
program_end_o	= <
		= program .
		= program program_l 'END' PROGRAM name '.' .
> .
program_end	= program program_l 'END' PROGRAM name '.' .
program_l	= <
		= .
		= program_l program_end .
> .
program		= identification_division
		  environment_division_o
		  data_division_o
		  procedure_division_o .
sce		= { => { Start_Comment_Entry (); }; } .
identification_division	= identification_o program_id_o identification_l .
environment_division_o	= <
		= environment_division .
	     /* = . */
> .
data_division_o	= <
		= data_division .
	     /* = . */
> .
procedure_division_o	= <
		= procedure_division .
		= .
> .
identification_o	= <
		= IDENTIFICATION DIVISION '.'
		{ => { Section = cID_DIV; }; } .
		=
		{ => { Section = cID_DIV; }; } .
> .
program_id_o	= <
		= 'PROGRAM-ID' period_o sce name   program_o
		{ => { (void) DeclareLabel (name:Scan, lPROGRAM, PrevEPos); }; } .
		= 'PROGRAM-ID' period_o sce string program_o
		{ => {	char word [128];
			StGetString (string:Value, word);
			string:Scan.name.Ident = MakeIdent (word, strlen (word));
			(void) DeclareLabel (string:Scan, lPROGRAM, PrevEPos); }; } .
		= .