📄 sec-monads.html

📁 編譯器像YACC的編譯及語法產生器
💻 HTML
📖 第 1 页 / 共 2 页
字号:
12 下一页
<HTML
><HEAD
><TITLE
>Monadic Parsers</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.74b"><LINK
REL="HOME"
TITLE="Happy User Guide"
HREF="happy.html"><LINK
REL="UP"
TITLE="Using Happy"
HREF="sec-using.html"><LINK
REL="PREVIOUS"
TITLE="Type Signatures"
HREF="sec-type-signatures.html"><LINK
REL="NEXT"
TITLE="The Error Token"
HREF="sec-error.html"></HEAD
><BODY
CLASS="SECT1"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>Happy User Guide</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="sec-type-signatures.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
>Chapter 2. Using <SPAN
CLASS="APPLICATION"
>Happy</SPAN
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="sec-error.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="SEC-MONADS"
>2.5. Monadic Parsers</A
></H1
><P
><SPAN
CLASS="APPLICATION"
>Happy</SPAN
> has support for threading a monad
      through the generated parser.  This might be useful for several
      reasons:</P
><P
></P
><UL
><LI
><P
> Handling parse errors by using an exception monad
          (see <A
HREF="sec-monads.html#SEC-EXCEPTION"
>Section 2.5.1</A
>).</P
></LI
><LI
><P
> Keeping track of line numbers in the input file, for
          example for use in error messages (see <A
HREF="sec-monads.html#SEC-LINE-NUMBERS"
>Section 2.5.3</A
>).</P
></LI
><LI
><P
> Performing IO operations during parsing.</P
></LI
><LI
><P
> Parsing languages with context-dependencies (such as
          C) require some state in the parser.</P
></LI
></UL
><P
>Adding monadic support to your parser couldn't be simpler.
      Just add the following directive to the declaration section of
      the grammar file:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%monad { &#60;type&#62; } [ { &#60;then&#62; } { &#60;return&#62; } ]</PRE
></TD
></TR
></TABLE
><P
>where <TT
CLASS="LITERAL"
>&#60;type&#62;</TT
> is the type constructor for
      the monad, <TT
CLASS="LITERAL"
>&#60;then&#62;</TT
> is the bind operation of the
      monad, and <TT
CLASS="LITERAL"
>&#60;return&#62;</TT
> is the return operation. If
      you leave out the names for the bind and return operations,
      <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> assumes that <TT
CLASS="LITERAL"
>&#60;type&#62;</TT
> is an
      instance of the standard Haskell type class <TT
CLASS="LITERAL"
>Monad</TT
> and
      uses the overloaded names for the bind and return
      operations.</P
><P
>When this declaration is included in the grammar,
      <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> makes a couple of changes to the generated
      parser: the types of the main parser function and
      <TT
CLASS="LITERAL"
>happyError</TT
> become <TT
CLASS="LITERAL"
>[Token] -&#62; P a</TT
> where
      <TT
CLASS="LITERAL"
>P</TT
> is the monad type constructor, and the function must
      be polymorphic in <TT
CLASS="LITERAL"
>a</TT
>.  In other words,
      <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> adds an application of the
      <TT
CLASS="LITERAL"
>&#60;return&#62;</TT
> operation defined in the declaration
      above, around the result of the parser (<TT
CLASS="LITERAL"
>happyError</TT
> is
      affected because it must have the same return type as the
      parser).  And that's all it does.</P
><P
>This still isn't very useful: all you can do is return
      something of monadic type from <TT
CLASS="LITERAL"
>happyError</TT
>.  How do you
      specify that the productions can also have type <TT
CLASS="LITERAL"
>P a</TT
>?
      Most of the time, you don't want a production to have this type:
      you'd have to write explicit <TT
CLASS="LITERAL"
>returnP</TT
>s everywhere.
      However, there may be a few rules in a grammar that need to get
      at the monad, so <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> has a special syntax for
      monadic actions:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>n  :  t_1 ... t_n          {% &#60;expr&#62; }</PRE
></TD
></TR
></TABLE
><P
>The <TT
CLASS="LITERAL"
>%</TT
> in the action indicates that this is a
      monadic action, with type <TT
CLASS="LITERAL"
>P a</TT
>, where <TT
CLASS="LITERAL"
>a</TT
> is
      the real return type of the production.  When
      <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> reduces one of these rules, it evaluates the
      expression </P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>&#60;expr&#62; `then` \result -&#62; &#60;continue parsing&#62;</PRE
></TD
></TR
></TABLE
><P
><SPAN
CLASS="APPLICATION"
>Happy</SPAN
> uses <TT
CLASS="LITERAL"
>result</TT
> as the real
      semantic value of the production.  During parsing, several
      monadic actions might be reduced, resulting in a sequence
      like</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>&#60;expr1&#62; `then` \r1 -&#62;
&#60;expr2&#62; `then` \r2 -&#62;
...
return &#60;expr3&#62;</PRE
></TD
></TR
></TABLE
><P
>The monadic actions are performed in the order that they
      are <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>reduced</I
></SPAN
>.  If we consider the parse as a tree,
      then reductions happen in a depth-first left-to-right manner.
      The great thing about adding a monad to your parser is that it
      doesn't impose any performance overhead for normal reductions -
      only the monadic ones are translated like this.</P
><P
>Take a look at the Haskell parser for a good illustration
      of how to use a monad in your parser: it contains examples of
      all the principles discussed in this section, namely parse
      errors, a threaded lexer, line/column numbers, and state
      communication between the parser and lexer.</P
><P
>The following sections consider a couple of uses for
      monadic parsers, and describe how to also thread the monad
      through the lexical analyser.</P
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="SEC-EXCEPTION"
>2.5.1. Handling Parse Errors</A
></H2
><P
>It's not very convenient to just call <TT
CLASS="LITERAL"
>error</TT
> when
      a parse error is detected: in a robust setting, you'd like the
      program to recover gracefully and report a useful error message
      to the user.  Exceptions (of which errors are a special case)
      are normally implemented in Haskell by using an exception monad,
      something like:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>data E a = Ok a | Failed String

thenE :: E a -&#62; (a -&#62; E b) -&#62; E b
m `thenE` k = 
   case m of 
       Ok a -&#62; k a
	 Failed e -&#62; Failed e

returnE :: a -&#62; E a
returnE a = Ok a

failE :: String -&#62; E a
failE err = Failed err

catchE :: E a -&#62; (String -&#62; E a) -&#62; E a
catchE m k = 
   case m of
      Ok a -&#62; OK a
	Failed e -&#62; k e</PRE
></TD
></TR
></TABLE
><P
>This monad just uses a string as the error type.  The
        functions <TT
CLASS="LITERAL"
>thenE</TT
> and <TT
CLASS="LITERAL"
>returnE</TT
> are the usual
        bind and return operations of the monad, <TT
CLASS="LITERAL"
>failE</TT
>
        raises an error, and <TT
CLASS="LITERAL"
>catchE</TT
> is a combinator for
        handling exceptions.</P
><P
>We can add this monad to the parser with the declaration</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%monad { E } { thenE } { returnE }</PRE
></TD
></TR
></TABLE
><P
>Now, without changing the grammar, we can change the
        definition of <TT
CLASS="LITERAL"
>happyError</TT
> and have something sensible
        happen for a parse error:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>happyError tokens = failE "Parse error"</PRE
></TD
></TR
></TABLE
><P
>The parser now raises an exception in the monad instead
	of bombing out on a parse error.</P
><P
>We can also generate errors during parsing.  There are
        times when it is more convenient to parse a more general
        language than that which is actually intended, and check it
        later.  An example comes from Haskell, where the precedence
        values in infix declarations must be between 0 and 9:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>prec :: { Int }
      : int    {% if $1 &#60; 0 || $1 &#62; 9 
	                then failE "Precedence out of range"
		        else returnE $1
		}</PRE
></TD
></TR
></TABLE
><P
>The monadic action allows the check to be placed in the
	parser itself, where it belongs.</P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="SEC-LEXERS"
>2.5.2. Threaded Lexers</A
></H2
><P
><SPAN
CLASS="APPLICATION"
>Happy</SPAN
> allows the monad concept to be
	extended to the lexical analyser, too.  This has several
	useful consequences:</P
><P
></P
><UL
><LI
><P
> Lexical errors can be treated in the same way as
            parse errors, using an exception monad.</P
></LI
><LI
><P
> Information such as the current file and line
            number can be communicated between the lexer and
            parser. </P
></LI
><LI
><P
> General state communication between the parser and
            lexer - for example, implementation of the Haskell layout
            rule requires this kind of interaction.
            </P
></LI
><LI
><P
> IO operations can be performed in the lexer - this
            could be useful for following import/include declarations
            for instance.</P
></LI
></UL
><P
>A monadic lexer is requested by adding the following
	declaration to the grammar file:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>%lexer { &#60;lexer&#62; } { &#60;eof&#62; }</PRE
></TD
></TR
></TABLE
><P
>where <TT
CLASS="LITERAL"
>&#60;lexer&#62;</TT
> is the name of the lexical
        analyser function, and <TT
CLASS="LITERAL"
>&#60;eof&#62;</TT
> is a token that
        is to be treated as the end of file.</P
><P
>When using a monadic lexer, the parser no longer reads a
        list of tokens.  Instead, it calls the lexical analysis
        function for each new token to be read.  This has the side
        effect of eliminating the intermediate list of tokens, which
        is a slight performance win.</P
><P
>The type of the main parser function and
        <TT
CLASS="LITERAL"
>happyError</TT
> is now just <TT
CLASS="LITERAL"
>P a</TT
12 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -