(12345678901234567890). Replaced with 0.
\end{verbatim}
\end{quote}
Error
\begin{quote}
The source code is missing the keyword \verb|THEN| where it is expected according to the grammar.
\end{quote}
Response
\begin{quote}
This error cannot be treated as a warning, since an essential piece of code cannot be compiled.
The location of the error must be pinpointed so that the user can easily correct it:
\begin{verbatim}
test.i (54): error: THEN expected after IF condition.
\end{verbatim}
Note that the compiler must now
recover from the error; obviously an important part of the \verb|IF| statement is missing and
must be skipped somehow. More information on error recovery will follow below.
\end{quote}
\end{example}
\section{Error recovery}
There are three ways to perform error recovery:
\begin{itemize}
\item[a] When an error is found, the parser stops and does not attempt to find other errors.
\item[b] When an error is found, the parser reports the error and continues parsing.
No attempt is made at error recovery, so subsequent errors may be irrelevant because
they are caused by the first error.
\item[c] When an error is found, the parser reports it and recovers from the error, so that subsequent
errors do not result from the original error. This is the method discussed below.
\end{itemize}
All three approaches have been used in practice, but approach \verb|c)| is clearly
the most useful to the programmer using the compiler.
Compiling a large source program may take a long time, so it is advantageous to have the compiler
report multiple errors at once. The user may then correct all errors at their leisure.
\section{Synchronization}
Error recovery uses so-called synchronization points that the parser looks for after an error has been
detected. A synchronization point is a location in the source code from which the parser can safely
continue parsing without printing further errors resulting from the original error.
Error recovery uses two sets of terminal tokens, the so-called direction sets [TODO: 02]:
\begin{itemize}
\item[a] The FIRST set: the set of all terminal symbols with which the strings generated by
all the productions for this nonterminal can begin.
\item[b] The FOLLOW set: the set of all terminal symbols that the grammar can generate
directly after the current nonterminal.
\end{itemize}
As an example of direction sets, we will consider the following very simple grammar and
show how the FIRST and FOLLOW sets may be constructed for it.
\begin{quote}
\begin{verbatim}
number: digit morenumber.
morenumber: digit morenumber.
morenumber: .
digit: '0'.
digit: '1'.
\end{verbatim}
\end{quote}
Every nonterminal has at least one production rule, and frequently more than one. Every production
rule has its own FIRST set, which we will call PFIRST.
The PFIRST set for a production rule contains all the leftmost terminal tokens that the production
rule may eventually produce. The FIRST set of any nonterminal is the union of all its PFIRST sets.
We will now construct the FIRST and PFIRST sets for our sample grammar.
PFIRST sets for every production:
\begin{quote}
\begin{verbatim}
number: digit morenumber. PFIRST = { '0', '1' }
morenumber: digit morenumber. PFIRST = { '0', '1' }
morenumber: . PFIRST = { }
digit: '0'. PFIRST = { '0' }
digit: '1'. PFIRST = { '1' }
\end{verbatim}
\end{quote}
FIRST sets per nonterminal:
\begin{quote}
\begin{verbatim}
FIRST(number) = { '0', '1' }
FIRST(morenumber) = { '0', '1' } V { } = { '0', '1' }
FIRST(digit) = { '0' } V { '1' } = { '0', '1' }
\end{verbatim}
\end{quote}
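The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the compiler's own code: the dictionary \verb|GRAMMAR| and the helper names \verb|nullable_set|, \verb|pfirst| and \verb|first_sets| are hypothetical, and the empty production is handled by tracking separately which nonterminals can produce nothing.

```python
# Sample grammar from the text. Each nonterminal maps to its productions;
# a production is a list of symbols. Symbols that are not keys of GRAMMAR
# are terminals. (This representation is an assumption of the sketch.)
GRAMMAR = {
    "number": [["digit", "morenumber"]],
    "morenumber": [["digit", "morenumber"], []],
    "digit": [["0"], ["1"]],
}


def nullable_set(grammar):
    """Return the nonterminals that can produce the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for nonterminal, productions in grammar.items():
            if nonterminal in nullable:
                continue
            # A nonterminal is nullable if one of its productions consists
            # only of nullable symbols (the empty production qualifies).
            if any(all(s in nullable for s in p) for p in productions):
                nullable.add(nonterminal)
                changed = True
    return nullable


def pfirst(production, grammar, first, nullable):
    """Return the leftmost terminals that one production may produce."""
    result = set()
    for symbol in production:
        if symbol not in grammar:      # terminal: it starts the string
            result.add(symbol)
            break
        result |= first[symbol]        # nonterminal: borrow its FIRST set
        if symbol not in nullable:     # only look further if it may vanish
            break
    return result


def first_sets(grammar):
    """Return (FIRST, nullable); FIRST is the union of the PFIRST sets."""
    nullable = nullable_set(grammar)
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:                     # repeat until nothing is added
        changed = False
        for nonterminal, productions in grammar.items():
            for production in productions:
                new = pfirst(production, grammar, first, nullable)
                if not new <= first[nonterminal]:
                    first[nonterminal] |= new
                    changed = True
    return first, nullable


if __name__ == "__main__":
    first, _ = first_sets(GRAMMAR)
    for nonterminal in GRAMMAR:
        print(nonterminal, sorted(first[nonterminal]))
```

Iterating until no set changes avoids having to order the nonterminals by hand; for the sample grammar it yields the same FIRST sets as listed above.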
PRACTICAL ADVICE
PFIRST sets are most easily constructed by working from the bottom up: find the PFIRST sets for \verb|digit| first (these are easy, since the production rules for \verb|digit| contain only terminal tokens). When finding the PFIRST set for a production rule higher up (such as \verb|number|), take the FIRST set of the nonterminal with which the rule begins (in the case of \verb|number|, that is \verb|digit|). If that nonterminal can produce the empty string, the FIRST set of the symbol that follows it must be added as well.
Every nonterminal must also have a FOLLOW set. A FOLLOW set contains all the terminal tokens that the grammar accepts after the nonterminal to which the FOLLOW set belongs. To illustrate this, we will now determine the FOLLOW sets for our sample grammar.
\begin{quote}
\begin{verbatim}
number: digit morenumber.
morenumber: digit morenumber.
morenumber: .
digit: '0'.
digit: '1'.
\end{verbatim}
\end{quote}
FOLLOW sets for every nonterminal:
\begin{quote}
\begin{verbatim}
FOLLOW(number) = { EOF }
FOLLOW(morenumber) = { EOF }
FOLLOW(digit) = { EOF, '0', '1' }
\end{verbatim}
\end{quote}
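The FOLLOW sets listed above can be reproduced with the following Python sketch. It is an illustration only: the names \verb|GRAMMAR|, \verb|FIRST|, \verb|NULLABLE| and \verb|follow_sets| are hypothetical, the FIRST sets are copied from the text, and the fact that \verb|morenumber| may be empty is supplied by hand.

```python
EOF = "EOF"

# Sample grammar from the text; symbols that are not keys are terminals.
GRAMMAR = {
    "number": [["digit", "morenumber"]],
    "morenumber": [["digit", "morenumber"], []],
    "digit": [["0"], ["1"]],
}

# FIRST sets as derived in the text, plus the nonterminals that may
# produce the empty string (both supplied by hand in this sketch).
FIRST = {
    "number": {"0", "1"},
    "morenumber": {"0", "1"},
    "digit": {"0", "1"},
}
NULLABLE = {"morenumber"}


def follow_sets(grammar, first, nullable, start):
    """Return the FOLLOW set of every nonterminal in the grammar."""
    follow = {nonterminal: set() for nonterminal in grammar}
    follow[start].add(EOF)             # only EOF may follow the start symbol
    changed = True
    while changed:                     # repeat until nothing is added
        changed = False
        for lhs, productions in grammar.items():
            for production in productions:
                # 'trailer' holds the terminals that may come directly
                # after the symbol currently being examined. Walking the
                # production right to left, it starts out as FOLLOW(lhs).
                trailer = set(follow[lhs])
                for symbol in reversed(production):
                    if symbol not in grammar:          # terminal
                        trailer = {symbol}
                        continue
                    if not trailer <= follow[symbol]:
                        follow[symbol] |= trailer
                        changed = True
                    if symbol in nullable:
                        trailer = trailer | first[symbol]
                    else:
                        trailer = set(first[symbol])
    return follow


if __name__ == "__main__":
    follow = follow_sets(GRAMMAR, FIRST, NULLABLE, "number")
    for nonterminal in GRAMMAR:
        print(nonterminal, sorted(follow[nonterminal]))
```

Because \verb|morenumber| may be empty, whatever can follow \verb|morenumber| can also follow the \verb|digit| in front of it, which is why EOF appears in the FOLLOW set of \verb|digit| next to the two digits.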
The terminal tokens in these two sets are the synchronization points. After the parser detects
and displays an error, it must synchronize (recover from the error). The parser does this by ignoring
all further tokens until it reads a token that occurs in a synchronization point set, after which parsing
is resumed.
This point is best illustrated by an example, describing a \verb|Sync()| routine:
%!SYNC
Tokens are requested from the lexer and discarded until a token occurs in one of the synchronization
point lists.
At the beginning of each production function in the parser, the FIRST and FOLLOW sets are filled. Then
the function \verb|Sync()| is called to check whether the token supplied by the lexer is in the
FIRST or FOLLOW set. If it is not, the compiler must display an error and search for a token that is
part of the FIRST or FOLLOW set of the current production. This is the synchronization point. From here
on, the parser can resume checking for other errors.
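The behaviour just described can be sketched in Python as follows. This is not the \verb|Sync()| routine of the compiler itself, only a minimal model of it under several assumptions: the lexer is replaced by a list of tokens that ends in an EOF marker, the parser position is a plain index, and the function name \verb|sync|, its parameters and the wording of the error message are hypothetical.

```python
EOF = "EOF"


def sync(tokens, pos, first, follow, errors):
    """Return the index of the nearest synchronization point at or after pos.

    tokens -- list of tokens, assumed to end with the EOF marker
    pos    -- index of the token the lexer currently offers
    first  -- FIRST set of the production being entered
    follow -- FOLLOW set of the production being entered
    errors -- list that collects error messages
    """
    # EOF always counts as a synchronization point in this sketch, so the
    # skipping loop below is guaranteed to stop at the end of the input.
    sync_points = first | follow | {EOF}

    if tokens[pos] in sync_points:
        return pos                     # nothing to recover from

    # Report a single error, then discard tokens until a token from one
    # of the synchronization point sets turns up.
    errors.append(f"error: unexpected token {tokens[pos]!r}")
    while tokens[pos] not in sync_points:
        pos += 1
    return pos


if __name__ == "__main__":
    errors = []
    tokens = ["x", "y", "1", "0", EOF]
    pos = sync(tokens, 0, {"0", "1"}, {EOF}, errors)
    print(pos, tokens[pos], errors)
```

Only one message is recorded for the whole run of skipped tokens, which is what keeps a single mistake from producing a cascade of follow-up errors.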
It is possible that an unexpected token is encountered halfway through a nonterminal function. When this happens,
it is necessary to synchronize until a token of the FOLLOW set is found. The function \verb|SyncOut()| provides
this functionality.
%!SYNCOUT
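A comparable Python sketch for leaving a production after an unexpected token is given below. Again this is only a model and not the \verb|SyncOut()| routine of the compiler: the token list ending in an EOF marker, the index-based position, the name \verb|sync_out| and the choice to record one error message are all assumptions made for the illustration.

```python
EOF = "EOF"


def sync_out(tokens, pos, follow, errors):
    """Skip ahead to the first token that may follow the current production.

    tokens -- list of tokens, assumed to end with the EOF marker
    pos    -- index of the unexpected token
    follow -- FOLLOW set of the production being abandoned
    errors -- list that collects error messages
    """
    # Unlike the check at the start of a production, only the FOLLOW set
    # counts here. EOF is added so that the loop always terminates.
    stop = follow | {EOF}

    if tokens[pos] in stop:
        return pos                     # already at a point where we can leave

    errors.append(f"error: unexpected token {tokens[pos]!r}")
    while tokens[pos] not in stop:
        pos += 1                       # discard the rest of the construct
    return pos


if __name__ == "__main__":
    errors = []
    tokens = ["IF", "x", "y", ";", "z", EOF]
    # Suppose the parser stumbles over "y" (index 2) and that ";" may
    # follow the construct it was parsing.
    pos = sync_out(tokens, 2, {";"}, errors)
    print(pos, tokens[pos], errors)
```

The caller can then return from the nonterminal function and let the enclosing production continue with the token at the returned position.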
Morgan (1970) claims that up to 80\% of the spelling errors occurring in student programs may be corrected
in this fashion.