      (12345678901234567890). Replaced with 0.
    \end{verbatim}
  \end{quote}
  Error
  \begin{quote}
    The source code is missing the keyword \verb|THEN| where it is expected according to the grammar.
  \end{quote}
  Response
  \begin{quote}
    This error cannot be treated as a warning, since an essential piece of code cannot be compiled. 
    The location of the error must be pinpointed so that the user can easily correct it: 
    \begin{verbatim}
      test.i (54): error: THEN expected after IF condition.
    \end{verbatim}
    Note that the compiler must now 
    recover from the error; an important part of the \verb|IF| statement is missing, so the 
    statement must be skipped somehow. More information on error recovery will follow below.
  \end{quote}
\end{example}
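
  The message layout used in the example above (file name, line number in parentheses, severity, 
  message text) can be produced by a single helper. The following Python fragment is only a sketch of 
  that layout; the function name \verb|format_diagnostic| and its parameters are assumptions made for 
  illustration and are not taken from the compiler's source.

```python
def format_diagnostic(filename, line, severity, message):
    """Return a diagnostic in the `file (line): severity: message` layout."""
    return f"{filename} ({line}): {severity}: {message}"


# Reproduces the message shown for the missing THEN keyword.
print(format_diagnostic("test.i", 54, "error",
                        "THEN expected after IF condition."))
# prints: test.i (54): error: THEN expected after IF condition.
```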

\section{Error recovery}

  There are three ways to perform error recovery:
  \begin{itemize}
    \item[a]When an error is found, the parser stops and does not attempt to find other errors.
    \item[b]When an error is found, the parser reports the error and continues parsing. 
      No attempt is made at error correction (recovery), so the next errors may be irrelevant because 
      they are caused by the first error.
    \item[c]	When an error is found, the parser reports it and recovers from the error, so that subsequent 
      errors do not result from the original error. This is the method discussed below.
  \end{itemize}
  Any of these three approaches may be used (and all have been), but approach \verb|c)| 
  is clearly the most useful to the programmer using the compiler.  
  Compiling a large source program may take a long time, so it is advantageous to have the compiler 
  report multiple errors at once. The user may then correct all errors at their leisure. 

\section{Synchronization}

  Error recovery uses so-called synchronization points that the parser looks for after an error has been 
  detected. A synchronization point is a location in the source code from which the parser can safely 
  continue parsing without printing further errors resulting from the original error.
  
  Error recovery uses two sets of terminal tokens, the so-called direction sets [TODO: 02]:
  \begin{itemize}
  \item[a] The FIRST set of a nonterminal is the set of all terminal symbols with which the strings 
    generated by the productions for that nonterminal can begin. 
  \item[b] The FOLLOW set of a nonterminal is the set of all terminal symbols that the grammar can 
    generate directly after that nonterminal.
  \end{itemize}
  As an example for direction sets, we will consider the following very simple grammar and 
  show how the FIRST and FOLLOW sets may be constructed for it.

  \begin{quote}
    \begin{verbatim}
  number: digit morenumber.
  morenumber: digit morenumber.
  morenumber: . 
  digit: '0'.
  digit: '1'.
    \end{verbatim}
  \end{quote}
  
  Every nonterminal has at least one production rule, and frequently more than one. Every production 
  rule has its own FIRST set, which we will call PFIRST. 
  The PFIRST set for a production rule contains all the leftmost terminal tokens that the production 
  rule may eventually produce. The FIRST set of any nonterminal is the union of all its PFIRST sets. 
  We will now construct the FIRST and PFIRST sets for our sample grammar.


  PFIRST sets for every production:

  \begin{quote}
    \begin{verbatim}
  number: digit morenumber.		PFIRST = { '0', '1' }
  morenumber: digit morenumber.		PFIRST = { '0', '1' }
  morenumber: . 				PFIRST = { }
  digit: '0'.					PFIRST = { '0' }
  digit: '1'.					PFIRST = { '1' }
    \end{verbatim}
  \end{quote}
  
  FIRST sets per nonterminal:
  \begin{quote}
    \begin{verbatim}
  FIRST(number) = { '0', '1' }
  FIRST(morenumber) = { '0', '1' } V { } = { '0', '1' }
  FIRST(digit) = { '0' } V { '1' } = { '0', '1' }
    \end{verbatim}
  \end{quote}

  PRACTICAL ADVICE
  PFIRST sets may be most easily constructed by working from bottom to top: find the PFIRST sets for 'digit' first (these are easy since the production rules for digit contain only terminal tokens). When finding the PFIRST set for a production rule higher up (such as number), combine the FIRST sets of the nonterminals it uses (in the case of number, that is digit). These make up the PFIRST set.
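
  The construction of the PFIRST and FIRST sets can also be automated. The Python fragment below is a 
  minimal sketch for the sample grammar, not the method used inside the compiler: instead of ordering 
  the nonterminals from bottom to top by hand, it repeats the computation until no set changes any more 
  (a fixed-point iteration), which gives the same result. The names \verb|GRAMMAR|, \verb|pfirst| and 
  \verb|compute_first| are assumptions made for illustration. As in the tables above, an empty 
  production contributes an empty PFIRST set.

```python
# Each production is (left-hand side, list of right-hand-side symbols).
# Terminals are written as the bare characters "0" and "1".
GRAMMAR = [
    ("number", ["digit", "morenumber"]),
    ("morenumber", ["digit", "morenumber"]),
    ("morenumber", []),
    ("digit", ["0"]),
    ("digit", ["1"]),
]


def compute_nullable(grammar):
    """Return the nonterminals that can produce the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            if lhs not in nullable and all(sym in nullable for sym in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


def pfirst(rhs, first, nullable):
    """PFIRST of one production, given the FIRST sets known so far."""
    result = set()
    for sym in rhs:
        if sym not in first:          # a terminal: it starts the string
            result.add(sym)
            break
        result |= first[sym]          # a nonterminal: take over its FIRST set
        if sym not in nullable:       # look further only if sym may be empty
            break
    return result


def compute_first(grammar):
    """Return (FIRST sets per nonterminal, set of nullable nonterminals)."""
    nullable = compute_nullable(grammar)
    first = {lhs: set() for lhs, _ in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            new = pfirst(rhs, first, nullable)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first, nullable


FIRST, NULLABLE = compute_first(GRAMMAR)
for lhs, rhs in GRAMMAR:
    print(lhs, rhs, "PFIRST =", sorted(pfirst(rhs, FIRST, NULLABLE)))
for nonterminal in ("number", "morenumber", "digit"):
    print("FIRST(%s) =" % nonterminal, sorted(FIRST[nonterminal]))
```

  For the sample grammar the computed sets agree with the tables given above.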
  
  Every nonterminal must also have a FOLLOW set. A FOLLOW set contains all the terminal tokens that the grammar accepts after the nonterminal to which the FOLLOW set belongs. To illustrate this, we will now determine the FOLLOW sets for our sample grammar.
  
  \begin{quote}
    \begin{verbatim}
  number: digit morenumber.
  morenumber: digit morenumber.
  morenumber: . 
  digit: '0'.
  digit: '1'.
    \end{verbatim}
  \end{quote}
  
  FOLLOW sets for every nonterminal:
  \begin{quote}
    \begin{verbatim}
  FOLLOW(number) = { EOF }
  FOLLOW(morenumber) = { EOF }
  FOLLOW(digit) = { EOF, '0', '1' }
    \end{verbatim}
  \end{quote}
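
  The FOLLOW sets can be derived mechanically as well. The Python fragment below is a sketch under 
  stated assumptions: it takes the FIRST sets listed earlier as given, it assumes that 
  \verb|morenumber| is the only nonterminal that can produce the empty string, and it treats 
  \verb|number| as the start symbol that is followed by EOF. The function name \verb|compute_follow| 
  is hypothetical. The sets are again grown by fixed-point iteration.

```python
EOF = "EOF"

GRAMMAR = [
    ("number", ["digit", "morenumber"]),
    ("morenumber", ["digit", "morenumber"]),
    ("morenumber", []),
    ("digit", ["0"]),
    ("digit", ["1"]),
]

# FIRST sets as listed in the text; morenumber has an empty production.
FIRST = {
    "number": {"0", "1"},
    "morenumber": {"0", "1"},
    "digit": {"0", "1"},
}
NULLABLE = {"morenumber"}


def compute_follow(grammar, first, nullable, start):
    """Return the FOLLOW set of every nonterminal (the keys of `first`)."""
    follow = {nonterminal: set() for nonterminal in first}
    follow[start].add(EOF)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, sym in enumerate(rhs):
                if sym not in first:              # terminals have no FOLLOW set
                    continue
                new = set()
                rest_may_be_empty = True
                for nxt in rhs[i + 1:]:
                    if nxt not in first:          # a terminal follows directly
                        new.add(nxt)
                        rest_may_be_empty = False
                        break
                    new |= first[nxt]
                    if nxt not in nullable:
                        rest_may_be_empty = False
                        break
                if rest_may_be_empty:             # whatever follows lhs follows sym
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, FIRST, NULLABLE, "number")
for nonterminal in ("number", "morenumber", "digit"):
    print("FOLLOW(%s) =" % nonterminal, sorted(FOLLOW[nonterminal]))
```

  The symbol \verb|digit| receives the tokens '0' and '1' because it is followed by 
  \verb|morenumber|, and EOF because \verb|morenumber| may be empty.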

  The terminal tokens in these two sets are the synchronization points. After the parser detects 
  and displays an error, it must synchronize (recover from the error). The parser does this by ignoring 
  all further tokens until it reads a token that occurs in a synchronization point set, after which parsing 
  is resumed. 
  
  This point is best illustrated by an example, describing a \verb|Sync()| routine:
  %!SYNC
  
  Tokens are requested from the lexer and discarded until a token occurs in one of the synchronization 
  point lists. 
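
  The token-skipping loop just described can be sketched in a few lines of Python. This fragment is 
  an illustration only and is not the compiler's own \verb|Sync()| routine. The \verb|Lexer| class 
  is a hypothetical stand-in that serves tokens from a list, and the return value (true when the 
  current token is in the FIRST set, so that the production can start) is a design assumption. The 
  loop also stops at EOF so that it always terminates.

```python
class Lexer:
    """Hypothetical stand-in for a real lexer: serves tokens from a list."""

    def __init__(self, tokens):
        self._tokens = list(tokens) + ["EOF"]
        self._pos = 0

    def peek(self):
        """Return the current token without consuming it."""
        return self._tokens[self._pos]

    def advance(self):
        """Discard the current token; never moves past EOF."""
        if self._tokens[self._pos] != "EOF":
            self._pos += 1


def sync(lexer, first, follow, errors):
    """Skip tokens until one occurs in `first` or `follow`.

    Reports at most one error per call. Returns True if the token at
    which parsing resumes is in `first`, False otherwise.
    """
    token = lexer.peek()
    if token not in first and token not in follow:
        errors.append("unexpected token %r" % token)
        while (token not in first and token not in follow
               and token != "EOF"):
            lexer.advance()
            token = lexer.peek()
    return token in first


errors = []
lexer = Lexer(["x", "y", "1", "0"])
can_start = sync(lexer, first={"0", "1"}, follow={"EOF"}, errors=errors)
print(can_start, lexer.peek(), errors)
```

  In this run the tokens \verb|x| and \verb|y| are discarded, a single error is recorded, and 
  parsing can resume at the token \verb|1|.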
  
  At the beginning of each production function in the parser, the FIRST and FOLLOW sets are filled. 
  The function \verb|Sync()| is then called to check whether the token supplied by the lexer occurs in the 
  FIRST or FOLLOW set. If it does not, the compiler must display an error and search for a token that is 
  part of the FIRST or FOLLOW set of the current production. This is the synchronization point. From here 
  on, the parser can resume checking for other errors.
  
  It is possible that an unexpected token is encountered halfway through a nonterminal function. When this 
  happens, it is necessary to synchronize until a token of the FOLLOW set is found. The function 
  \verb|SyncOut()| provides this functionality.
  %!SYNCOUT
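
  Synchronizing on the FOLLOW set alone can be sketched as follows. The Python fragment is an 
  illustration under assumptions and is not the compiler's own \verb|SyncOut()| routine: tokens are 
  held in a plain list, the position is passed in and returned as an index, and the names 
  \verb|sync_out|, \verb|tokens|, \verb|pos| and \verb|errors| are hypothetical.

```python
def sync_out(tokens, pos, follow, errors):
    """Skip from tokens[pos] to the next token that occurs in `follow`.

    Records one error if tokens[pos] is unexpected, then returns the index
    of the token at which parsing may resume (len(tokens) if none is found).
    """
    if pos < len(tokens) and tokens[pos] not in follow:
        errors.append("unexpected token %r" % tokens[pos])
    while pos < len(tokens) and tokens[pos] not in follow:
        pos += 1
    return pos


# Halfway through `number`, after the digit "0", the parser meets "x".
tokens = ["0", "x", "y", "EOF"]
errors = []
resume_at = sync_out(tokens, 1, follow={"EOF"}, errors=errors)
print(resume_at, tokens[resume_at], errors)
```

  Here the parser skips \verb|x| and \verb|y| and resumes at EOF, which is in the FOLLOW set of 
  \verb|number|, so the calling production function can return without reporting further errors.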
  
  Morgan (1970) claims that up to 80\% of the spelling errors occurring in student programs may be corrected 
  in this fashion.