📄 sec-monads.html
字号:
<HTML
><HEAD
><TITLE
>Monadic Parsers</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.74b"><LINK
REL="HOME"
TITLE="Happy User Guide"
HREF="happy.html"><LINK
REL="UP"
TITLE="Using Happy"
HREF="sec-using.html"><LINK
REL="PREVIOUS"
TITLE="Type Signatures"
HREF="sec-type-signatures.html"><LINK
REL="NEXT"
TITLE="The Error Token"
HREF="sec-error.html"></HEAD
><BODY
CLASS="SECT1"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>Happy User Guide</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="sec-type-signatures.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
>Chapter 2. Using <SPAN
CLASS="APPLICATION"
>Happy</SPAN
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="sec-error.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="SEC-MONADS"
>2.5. Monadic Parsers</A
></H1
><P
><SPAN
CLASS="APPLICATION"
>Happy</SPAN
> has support for threading a monad
through the generated parser. This might be useful for several
reasons:</P
><P
></P
><UL
><LI
><P
> Handling parse errors by using an exception monad
(see <A
HREF="sec-monads.html#SEC-EXCEPTION"
>Section 2.5.1</A
>).</P
></LI
><LI
><P
> Keeping track of line numbers in the input file, for
example for use in error messages (see <A
HREF="sec-monads.html#SEC-LINE-NUMBERS"
>Section 2.5.3</A
>).</P
></LI
><LI
><P
> Performing IO operations during parsing.</P
></LI
><LI
><P
> Parsing languages with context-dependencies (such as
C) require some state in the parser.</P
></LI
></UL
><P
>Adding monadic support to your parser couldn't be simpler.
Just add the following directive to the declaration section of
the grammar file:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%monad { <type> } [ { <then> } { <return> } ]</PRE
></TD
></TR
></TABLE
><P
>where <TT
CLASS="LITERAL"
><type></TT
> is the type constructor for
the monad, <TT
CLASS="LITERAL"
><then></TT
> is the bind operation of the
monad, and <TT
CLASS="LITERAL"
><return></TT
> is the return operation. If
you leave out the names for the bind and return operations,
<SPAN
CLASS="APPLICATION"
>Happy</SPAN
> assumes that <TT
CLASS="LITERAL"
><type></TT
> is an
instance of the standard Haskell type class <TT
CLASS="LITERAL"
>Monad</TT
> and
uses the overloaded names for the bind and return
operations.</P
><P
>When this declaration is included in the grammar,
<SPAN
CLASS="APPLICATION"
>Happy</SPAN
> makes a couple of changes to the generated
parser: the types of the main parser function and
<TT
CLASS="LITERAL"
>happyError</TT
> become <TT
CLASS="LITERAL"
>[Token] -> P a</TT
> where
<TT
CLASS="LITERAL"
>P</TT
> is the monad type constructor, and the function must
be polymorphic in <TT
CLASS="LITERAL"
>a</TT
>. In other words,
<SPAN
CLASS="APPLICATION"
>Happy</SPAN
> adds an application of the
<TT
CLASS="LITERAL"
><return></TT
> operation defined in the declaration
above, around the result of the parser (<TT
CLASS="LITERAL"
>happyError</TT
> is
affected because it must have the same return type as the
parser). And that's all it does.</P
><P
>This still isn't very useful: all you can do is return
something of monadic type from <TT
CLASS="LITERAL"
>happyError</TT
>. How do you
specify that the productions can also have type <TT
CLASS="LITERAL"
>P a</TT
>?
Most of the time, you don't want a production to have this type:
you'd have to write explicit <TT
CLASS="LITERAL"
>returnP</TT
>s everywhere.
However, there may be a few rules in a grammar that need to get
at the monad, so <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> has a special syntax for
monadic actions:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>n : t_1 ... t_n {% <expr> }</PRE
></TD
></TR
></TABLE
><P
>The <TT
CLASS="LITERAL"
>%</TT
> in the action indicates that this is a
monadic action, with type <TT
CLASS="LITERAL"
>P a</TT
>, where <TT
CLASS="LITERAL"
>a</TT
> is
the real return type of the production. When
<SPAN
CLASS="APPLICATION"
>Happy</SPAN
> reduces one of these rules, it evaluates the
expression </P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
><expr> `then` \result -> <continue parsing></PRE
></TD
></TR
></TABLE
><P
><SPAN
CLASS="APPLICATION"
>Happy</SPAN
> uses <TT
CLASS="LITERAL"
>result</TT
> as the real
semantic value of the production. During parsing, several
monadic actions might be reduced, resulting in a sequence
like</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
><expr1> `then` \r1 ->
<expr2> `then` \r2 ->
...
return <expr3></PRE
></TD
></TR
></TABLE
><P
>The monadic actions are performed in the order that they
are <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>reduced</I
></SPAN
>. If we consider the parse as a tree,
then reductions happen in a depth-first left-to-right manner.
The great thing about adding a monad to your parser is that it
doesn't impose any performance overhead for normal reductions -
only the monadic ones are translated like this.</P
><P
>Take a look at the Haskell parser for a good illustration
of how to use a monad in your parser: it contains examples of
all the principles discussed in this section, namely parse
errors, a threaded lexer, line/column numbers, and state
communication between the parser and lexer.</P
><P
>The following sections consider a couple of uses for
monadic parsers, and describe how to also thread the monad
through the lexical analyser.</P
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="SEC-EXCEPTION"
>2.5.1. Handling Parse Errors</A
></H2
><P
>It's not very convenient to just call <TT
CLASS="LITERAL"
>error</TT
> when
a parse error is detected: in a robust setting, you'd like the
program to recover gracefully and report a useful error message
to the user. Exceptions (of which errors are a special case)
are normally implemented in Haskell by using an exception monad,
something like:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>data E a = Ok a | Failed String
thenE :: E a -> (a -> E b) -> E b
m `thenE` k =
case m of
Ok a -> k a
Failed e -> Failed e
returnE :: a -> E a
returnE a = Ok a
failE :: String -> E a
failE err = Failed err
catchE :: E a -> (String -> E a) -> E a
catchE m k =
case m of
Ok a -> OK a
Failed e -> k e</PRE
></TD
></TR
></TABLE
><P
>This monad just uses a string as the error type. The
functions <TT
CLASS="LITERAL"
>thenE</TT
> and <TT
CLASS="LITERAL"
>returnE</TT
> are the usual
bind and return operations of the monad, <TT
CLASS="LITERAL"
>failE</TT
>
raises an error, and <TT
CLASS="LITERAL"
>catchE</TT
> is a combinator for
handling exceptions.</P
><P
>We can add this monad to the parser with the declaration</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%monad { E } { thenE } { returnE }</PRE
></TD
></TR
></TABLE
><P
>Now, without changing the grammar, we can change the
definition of <TT
CLASS="LITERAL"
>happyError</TT
> and have something sensible
happen for a parse error:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>happyError tokens = failE "Parse error"</PRE
></TD
></TR
></TABLE
><P
>The parser now raises an exception in the monad instead
of bombing out on a parse error.</P
><P
>We can also generate errors during parsing. There are
times when it is more convenient to parse a more general
language than that which is actually intended, and check it
later. An example comes from Haskell, where the precedence
values in infix declarations must be between 0 and 9:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>prec :: { Int }
: int {% if $1 < 0 || $1 > 9
then failE "Precedence out of range"
else returnE $1
}</PRE
></TD
></TR
></TABLE
><P
>The monadic action allows the check to be placed in the
parser itself, where it belongs.</P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="SEC-LEXERS"
>2.5.2. Threaded Lexers</A
></H2
><P
><SPAN
CLASS="APPLICATION"
>Happy</SPAN
> allows the monad concept to be
extended to the lexical analyser, too. This has several
useful consequences:</P
><P
></P
><UL
><LI
><P
> Lexical errors can be treated in the same way as
parse errors, using an exception monad.</P
></LI
><LI
><P
> Information such as the current file and line
number can be communicated between the lexer and
parser. </P
></LI
><LI
><P
> General state communication between the parser and
lexer - for example, implementation of the Haskell layout
rule requires this kind of interaction.
</P
></LI
><LI
><P
> IO operations can be performed in the lexer - this
could be useful for following import/include declarations
for instance.</P
></LI
></UL
><P
>A monadic lexer is requested by adding the following
declaration to the grammar file:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%lexer { <lexer> } { <eof> }</PRE
></TD
></TR
></TABLE
><P
>where <TT
CLASS="LITERAL"
><lexer></TT
> is the name of the lexical
analyser function, and <TT
CLASS="LITERAL"
><eof></TT
> is a token that
is to be treated as the end of file.</P
><P
>When using a monadic lexer, the parser no longer reads a
list of tokens. Instead, it calls the lexical analysis
function for each new token to be read. This has the side
effect of eliminating the intermediate list of tokens, which
is a slight performance win.</P
><P
>The type of the main parser function and
<TT
CLASS="LITERAL"
>happyError</TT
> is now just <TT
CLASS="LITERAL"
>P a</TT
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -