📄 sec-using.html

📁 編譯器像YACC的編譯及語法產生器
💻 HTML
📖 第 1 页 / 共 2 页
字号:
上一页 12
>E</TT
>, which may refer to the values of
    <TT
CLASS="LITERAL"
>t_1...t_n</TT
> using the symbols
    <TT
CLASS="LITERAL"
>$1...$n</TT
>.</P
><P
>The parser reduces the input using the rules in the grammar
    until just one symbol remains: the first symbol defined in the
    grammar (namely <TT
CLASS="LITERAL"
>Exp</TT
> in our example).  The value of this
    symbol is the return value from the parser.</P
><P
>To complete the program, we need some extra code.  The
    grammar file may optionally contain a final code section, enclosed
    in curly braces.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>{</PRE
></TD
></TR
></TABLE
><P
>All parsers must declare the function <TT
CLASS="FUNCTION"
>happyError</TT
>,
    which is called when an error is detected. </P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>happyError :: [Token] -&#62; a
happyError _ = error "Parse error"</PRE
></TD
></TR
></TABLE
><P
>Note that <TT
CLASS="LITERAL"
>happyError</TT
> must be polymorphic
    in its return type <TT
CLASS="LITERAL"
>a</TT
>, which usually means it
    must be a call to <TT
CLASS="LITERAL"
>error</TT
>.  We'll see in <A
HREF="sec-monads.html"
>Section 2.5</A
> how to wrap the parser in a monad so that we
    can do something more sensible with errors.  It's also possible to
    keep track of line numbers in the parser for use in error
    messages, this is described in <A
HREF="sec-monads.html#SEC-LINE-NUMBERS"
>Section 2.5.3</A
>.</P
><P
>Next we can declare the data type that represents the parsed
    expression:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>data Exp  
      = Let String Exp Exp
      | Exp1 Exp1

data Exp1 
      = Plus Exp1 Term 
      | Minus Exp1 Term 
      | Term Term

data Term 
      = Times Term Factor 
      | Div Term Factor 
      | Factor Factor

data Factor 
      = Int Int 
      | Var String 
      | Brack Exp</PRE
></TD
></TR
></TABLE
><P
>And the data structure for the tokens...</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>data Token
      = TokenLet
      | TokenIn
      | TokenInt Int
      | TokenVar String
      | TokenEq
      | TokenPlus
      | TokenMinus
      | TokenTimes
      | TokenDiv
      | TokenOB
      | TokenCB
 deriving Show</PRE
></TD
></TR
></TABLE
><P
>... and a simple lexer that returns this data
    structure.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>lexer :: String -&#62; [Token]
lexer [] = []
lexer (c:cs) 
      | isSpace c = lexer cs
      | isAlpha c = lexVar (c:cs)
      | isDigit c = lexNum (c:cs)
lexer ('=':cs) = TokenEq : lexer cs
lexer ('+':cs) = TokenPlus : lexer cs
lexer ('-':cs) = TokenMinus : lexer cs
lexer ('*':cs) = TokenTimes : lexer cs
lexer ('/':cs) = TokenDiv : lexer cs
lexer ('(':cs) = TokenOB : lexer cs
lexer (')':cs) = TokenCB : lexer cs

lexNum cs = TokenInt (read num) : lexer rest
      where (num,rest) = span isDigit cs

lexVar cs =
   case span isAlpha cs of
      ("let",rest) -&#62; TokenLet : lexer rest
      ("in",rest)  -&#62; TokenIn : lexer rest
      (var,rest)   -&#62; TokenVar var : lexer rest</PRE
></TD
></TR
></TABLE
><P
>And finally a top-level function to take some input, parse
    it, and print out the result.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>main = getContents &#62;&#62;= print . calc . lexer
}</PRE
></TD
></TR
></TABLE
><P
>And that's it! A whole lexer, parser and grammar in a few
    dozen lines.  Another good example is <SPAN
CLASS="APPLICATION"
>Happy</SPAN
>'s own
    parser. Several features in <SPAN
CLASS="APPLICATION"
>Happy</SPAN
> were developed
    using this as an example.</P
><P
>To generate the Haskell module for this parser, type the
    command <B
CLASS="COMMAND"
>happy example.y</B
> (where
    <TT
CLASS="FILENAME"
>example.y</TT
> is the name of the grammar file).
    The Haskell module will be placed in a file named
    <TT
CLASS="FILENAME"
>example.hs</TT
>.  Additionally, invoking the
    command <B
CLASS="COMMAND"
>happy example.y -i</B
> will produce the
    file <TT
CLASS="FILENAME"
>example.info</TT
> which contains detailed information
    about the parser, including states and reduction rules (see <A
HREF="sec-info-files.html"
>Chapter 5</A
>).  This can be invaluable for debugging
    parsers, but requires some knowledge of the operation of a
    shift-reduce parser. </P
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="SEC-OTHER-DATATYPES"
>2.1. Returning other datatypes</A
></H1
><P
>In the above example, we used a data type to represent the
      syntax being parsed.  However, there's no reason why it has to
      be this way: you could calculate the value of the expression on
      the fly, using productions like this:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>Term  : Term '*' Factor         { $1 * $3 }
      | Term '/' Factor         { $1 / $3 }
      | Factor                  { $1 }</PRE
></TD
></TR
></TABLE
><P
>The value of a <TT
CLASS="LITERAL"
>Term</TT
> would be the value of the
      expression itself, and the parser could return an integer.  </P
><P
>This works for simple expression types, but our grammar
      includes variables and the <TT
CLASS="LITERAL"
>let</TT
> syntax.  How do we know
      the value of a variable while we're parsing it?  We don't, but
      since the Haskell code for a production can be anything at all,
      we could make it a function that takes an environment of
      variable values, and returns the computed value of the
      expression:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="PROGRAMLISTING"
>Exp   : let var '=' Exp in Exp  { \p -&#62; $6 (($2,$4 p):p) }
      | Exp1                    { $1 }

Exp1  : Exp1 '+' Term           { \p -&#62; $1 p + $3 p }
      | Exp1 '-' Term           { \p -&#62; $1 p - $3 p }
      | Term                    { $1 }

Term  : Term '*' Factor         { \p -&#62; $1 p * $3 p }
      | Term '/' Factor         { \p -&#62; $1 p `div` $3 p }
      | Factor                  { $1 }

Factor			  
      : int                     { \p -&#62; $1 }
      | var                     { \p -&#62; case lookup $1 p of
	                                    Nothing -&#62; error "no var"
					    Just i  -&#62; i }
      | '(' Exp ')'             { $2 }</PRE
></TD
></TR
></TABLE
><P
>The value of each production is a function from an
      environment <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>p</I
></SPAN
> to a value.  When parsing a
      <TT
CLASS="LITERAL"
>let</TT
> construct, we extend the environment with the new
      binding to find the value of the body, and the rule for
      <TT
CLASS="LITERAL"
>var</TT
> looks up its value in the environment.  There's
      something you can't do in <TT
CLASS="LITERAL"
>yacc</TT
> :-)</P
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="sec-obtaining.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="happy.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="sec-sequences.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Obtaining <SPAN
CLASS="APPLICATION"
>Happy</SPAN
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
>&nbsp;</TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Parsing sequences</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>
上一页 12
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -