📄 lr_parser.java
字号:
package com.sun.java_cup.internal.runtime;import java.util.Stack;/** This class implements a skeleton table driven LR parser. In general, * LR parsers are a form of bottom up shift-reduce parsers. Shift-reduce * parsers act by shifting input onto a parse stack until the Symbols * matching the right hand side of a production appear on the top of the * stack. Once this occurs, a reduce is performed. This involves removing * the Symbols corresponding to the right hand side of the production * (the so called "handle") and replacing them with the non-terminal from * the left hand side of the production. <p> * * To control the decision of whether to shift or reduce at any given point, * the parser uses a state machine (the "viable prefix recognition machine" * built by the parser generator). The current state of the machine is placed * on top of the parse stack (stored as part of a Symbol object representing * a terminal or non terminal). The parse action table is consulted * (using the current state and the current lookahead Symbol as indexes) to * determine whether to shift or to reduce. When the parser shifts, it * changes to a new state by pushing a new Symbol (containing a new state) * onto the stack. When the parser reduces, it pops the handle (right hand * side of a production) off the stack. This leaves the parser in the state * it was in before any of those Symbols were matched. Next the reduce-goto * table is consulted (using the new state and current lookahead Symbol as * indexes) to determine a new state to go to. The parser then shifts to * this goto state by pushing the left hand side Symbol of the production * (also containing the new state) onto the stack.<p> * * This class actually provides four LR parsers. The methods parse() and * debug_parse() provide two versions of the main parser (the only difference * being that debug_parse() emits debugging trace messages as it parses). * In addition to these main parsers, the error recovery mechanism uses two * more. One of these is used to simulate "parsing ahead" in the input * without carrying out actions (to verify that a potential error recovery * has worked), and the other is used to parse through buffered "parse ahead" * input in order to execute all actions and re-synchronize the actual parser * configuration.<p> * * This is an abstract class which is normally filled out by a subclass * generated by the JavaCup parser generator. In addition to supplying * the actual parse tables, generated code also supplies methods which * invoke various pieces of user supplied code, provide access to certain * special Symbols (e.g., EOF and error), etc. Specifically, the following * abstract methods are normally supplied by generated code: * <dl compact> * <dt> short[][] production_table() * <dd> Provides a reference to the production table (indicating the index of * the left hand side non terminal and the length of the right hand side * for each production in the grammar). * <dt> short[][] action_table() * <dd> Provides a reference to the parse action table. * <dt> short[][] reduce_table() * <dd> Provides a reference to the reduce-goto table. * <dt> int start_state() * <dd> Indicates the index of the start state. * <dt> int start_production() * <dd> Indicates the index of the starting production. * <dt> int EOF_sym() * <dd> Indicates the index of the EOF Symbol. * <dt> int error_sym() * <dd> Indicates the index of the error Symbol. * <dt> Symbol do_action() * <dd> Executes a piece of user supplied action code. This always comes at * the point of a reduce in the parse, so this code also allocates and * fills in the left hand side non terminal Symbol object that is to be * pushed onto the stack for the reduce. * <dt> void init_actions() * <dd> Code to initialize a special object that encapsulates user supplied * actions (this object is used by do_action() to actually carry out the * actions). * </dl> * * In addition to these routines that <i>must</i> be supplied by the * generated subclass there are also a series of routines that <i>may</i> * be supplied. These include: * <dl> * <dt> Symbol scan() * <dd> Used to get the next input Symbol from the scanner. * <dt> Scanner getScanner() * <dd> Used to provide a scanner for the default implementation of * scan(). * <dt> int error_sync_size() * <dd> This determines how many Symbols past the point of an error * must be parsed without error in order to consider a recovery to * be valid. This defaults to 3. Values less than 2 are not * recommended. * <dt> void report_error(String message, Object info) * <dd> This method is called to report an error. The default implementation * simply prints a message to System.err and where the error occurred. * This method is often replaced in order to provide a more sophisticated * error reporting mechanism. * <dt> void report_fatal_error(String message, Object info) * <dd> This method is called when a fatal error that cannot be recovered from * is encountered. In the default implementation, it calls * report_error() to emit a message, then throws an exception. * <dt> void syntax_error(Symbol cur_token) * <dd> This method is called as soon as syntax error is detected (but * before recovery is attempted). In the default implementation it * invokes: report_error("Syntax error", null); * <dt> void unrecovered_syntax_error(Symbol cur_token) * <dd> This method is called if syntax error recovery fails. In the default * implementation it invokes:<br> * report_fatal_error("Couldn't repair and continue parse", null); * </dl> * * @see com.sun.java_cup.internal.runtime.Symbol * @see com.sun.java_cup.internal.runtime.Symbol * @see com.sun.java_cup.internal.runtime.virtual_parse_stack * @version last updated: 7/3/96 * @author Frank Flannery */public abstract class lr_parser { /*-----------------------------------------------------------*/ /*--- Constructor(s) ----------------------------------------*/ /*-----------------------------------------------------------*/ /** Simple constructor. */ public lr_parser() { /* nothing to do here */ } /** Constructor that sets the default scanner. [CSA/davidm] */ public lr_parser(Scanner s) { this(); /* in case default constructor someday does something */ setScanner(s); } /*-----------------------------------------------------------*/ /*--- (Access to) Static (Class) Variables ------------------*/ /*-----------------------------------------------------------*/ /** The default number of Symbols after an error we much match to consider * it recovered from. */ protected final static int _error_sync_size = 3; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The number of Symbols after an error we much match to consider it * recovered from. */ protected int error_sync_size() {return _error_sync_size; } /*-----------------------------------------------------------*/ /*--- (Access to) Instance Variables ------------------------*/ /*-----------------------------------------------------------*/ /** Table of production information (supplied by generated subclass). * This table contains one entry per production and is indexed by * the negative-encoded values (reduce actions) in the action_table. * Each entry has two parts, the index of the non-terminal on the * left hand side of the production, and the number of Symbols * on the right hand side. */ public abstract short[][] production_table(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The action table (supplied by generated subclass). This table is * indexed by state and terminal number indicating what action is to * be taken when the parser is in the given state (i.e., the given state * is on top of the stack) and the given terminal is next on the input. * States are indexed using the first dimension, however, the entries for * a given state are compacted and stored in adjacent index, value pairs * which are searched for rather than accessed directly (see get_action()). * The actions stored in the table will be either shifts, reduces, or * errors. Shifts are encoded as positive values (one greater than the * state shifted to). Reduces are encoded as negative values (one less * than the production reduced by). Error entries are denoted by zero. * * @see com.sun.java_cup.internal.runtime.lr_parser#get_action */ public abstract short[][] action_table(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The reduce-goto table (supplied by generated subclass). This * table is indexed by state and non-terminal number and contains * state numbers. States are indexed using the first dimension, however, * the entries for a given state are compacted and stored in adjacent * index, value pairs which are searched for rather than accessed * directly (see get_reduce()). When a reduce occurs, the handle * (corresponding to the RHS of the matched production) is popped off * the stack. The new top of stack indicates a state. This table is * then indexed by that state and the LHS of the reducing production to * indicate where to "shift" to. * * @see com.sun.java_cup.internal.runtime.lr_parser#get_reduce */ public abstract short[][] reduce_table(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The index of the start state (supplied by generated subclass). */ public abstract int start_state(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The index of the start production (supplied by generated subclass). */ public abstract int start_production(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The index of the end of file terminal Symbol (supplied by generated * subclass). */ public abstract int EOF_sym(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The index of the special error Symbol (supplied by generated subclass). */ public abstract int error_sym(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Internal flag to indicate when parser should quit. */ protected boolean _done_parsing = false; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** This method is called to indicate that the parser should quit. This is * normally called by an accept action, but can be used to cancel parsing * early in other circumstances if desired. */ public void done_parsing() { _done_parsing = true; } /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /* Global parse state shared by parse(), error recovery, and * debugging routines */ /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Indication of the index for top of stack (for use by actions). */ protected int tos; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The current lookahead Symbol. */ protected Symbol cur_token; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** The parse stack itself. */ protected Stack stack = new Stack(); /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Direct reference to the production table. */ protected short[][] production_tab; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Direct reference to the action table. */ protected short[][] action_tab; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Direct reference to the reduce-goto table. */ protected short[][] reduce_tab; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** This is the scanner object used by the default implementation * of scan() to get Symbols. To avoid name conflicts with existing * code, this field is private. [CSA/davidm] */ private Scanner _scanner; /** * Simple accessor method to set the default scanner. */ public void setScanner(Scanner s) { _scanner = s; } /** * Simple accessor method to get the default scanner. */ public Scanner getScanner() { return _scanner; } /*-----------------------------------------------------------*/ /*--- General Methods ---------------------------------------*/ /*-----------------------------------------------------------*/ /** Perform a bit of user supplied action code (supplied by generated * subclass). Actions are indexed by an internal action number assigned * at parser generation time. * * @param act_num the internal index of the action to be performed. * @param parser the parser object we are acting for. * @param stack the parse stack of that object. * @param top the index of the top element of the parse stack. */ public abstract Symbol do_action( int act_num, lr_parser parser, Stack stack, int top) throws java.lang.Exception; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** User code for initialization inside the parser. Typically this * initializes the scanner. This is called before the parser requests * the first Symbol. Here this is just a placeholder for subclasses that * might need this and we perform no action. This method is normally * overridden by the generated code using this contents of the "init with" * clause as its body. */ public void user_init() throws java.lang.Exception { } /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Initialize the action object. This is called before the parser does * any parse actions. This is filled in by generated code to create * an object that encapsulates all action code. */ protected abstract void init_actions() throws java.lang.Exception; /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Get the next Symbol from the input (supplied by generated subclass). * Once end of file has been reached, all subsequent calls to scan * should return an EOF Symbol (which is Symbol number 0). By default * this method returns getScanner().next_token(); this implementation * can be overriden by the generated parser using the code declared in * the "scan with" clause. Do not recycle objects; every call to * scan() should return a fresh object. */ public Symbol scan() throws java.lang.Exception { return getScanner().next_token(); } /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Report a fatal error. This method takes a message string and an * additional object (to be used by specializations implemented in * subclasses). Here in the base class a very simple implementation * is provided which reports the error then throws an exception. * * @param message an error message. * @param info an extra object reserved for use by specialized subclasses. */ public void report_fatal_error( String message, Object info) throws java.lang.Exception { /* stop parsing (not really necessary since we throw an exception, but) */ done_parsing(); /* use the normal error message reporting to put out the message */ report_error(message, info); /* throw an exception */ throw new Exception("Can't recover from previous error(s)"); } /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** Report a non fatal error (or warning). This method takes a message * string and an additional object (to be used by specializations * implemented in subclasses). Here in the base class a very simple * implementation is provided which simply prints the message to * System.err. * * @param message an error message. * @param info an extra object reserved for use by specialized subclasses. */ public void report_error(String message, Object info) { System.err.print(message); if (info instanceof Symbol) if (((Symbol)info).left != -1) System.err.println(" at character " + ((Symbol)info).left + " of input"); else System.err.println(""); else System.err.println(""); } /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** This method is called when a syntax error has been detected and recovery * is about to be invoked. Here in the base class we just emit a * "Syntax error" error message. * * @param cur_token the current lookahead Symbol. */ public void syntax_error(Symbol cur_token) { report_error("Syntax error", cur_token); } /*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/ /** This method is called if it is determined that syntax error recovery * has been unsuccessful. Here in the base class we report a fatal error. * * @param cur_token the current lookahead Symbol. */ public void unrecovered_syntax_error(Symbol cur_token) throws java.lang.Exception { report_fatal_error("Couldn't repair and continue parse", cur_token); }
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -