       like  the semicolon, each item of sugar  is  something  that  can
       potentially cause a compile error by its omission.


       DEALING WITH SEMICOLONS

       There  are  two  distinct  ways  in which semicolons are used  in
       popular  languages.    In Pascal, the semicolon is regarded as  a
       statement SEPARATOR.  No semicolon  is  required  after  the last
       statement in a block.  The syntax is:


            <block> ::= <statement> ( ';' <statement>)*

            <statement> ::= <assignment> | <if> | <while> ... | null


       (The null statement is IMPORTANT!)

       Pascal  also defines some semicolons in  other  places,  such  as
       after the PROGRAM statement.

       In  C  and  Ada, on the other hand, the semicolon is considered a
       statement TERMINATOR,  and  follows  all  statements  (with  some
       embarrassing and confusing  exceptions).   The syntax for this is
       simply:


            <block> ::= ( <statement> ';')*


       Of  the two syntaxes, the Pascal one seems on the face of it more
       rational, but experience has shown  that it leads to some strange
       difficulties.  People get  so  used  to  typing a semicolon after
       every  statement  that  they tend to  type  one  after  the  last
       statement in a block, also.  That usually doesn't cause  any harm
       ...  it  just gets treated as a  null  statement.    Many  Pascal
       programmers, including yours truly,  do  just  that. But there is
       one  place you absolutely CANNOT type  a  semicolon,  and  that's
       right before an ELSE.  This little gotcha  has  cost  me  many an
       extra  compilation,  particularly  when  the  ELSE  is  added  to
       existing code.    So  the  C/Ada  choice  turns out to be better.
       Apparently Niklaus Wirth thinks so,  too:  In his  Modula-2,  he
       abandoned the Pascal approach.
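
       The gotcha is easy to reproduce in ordinary Pascal.  What follows
       is  only a minimal sketch, not part of TINY; the program name and
       the variable x are made up for illustration.

```pascal
program SemiDemo;
{ Illustrative sketch only: shows where Pascal tolerates a semicolon
  and where it does not. }
var
   x: integer;
begin
   x := 1;
   if x > 0 then
      WriteLn('positive')        { NO semicolon here: it would end the IF }
   else
      WriteLn('not positive');   { harmless: a null statement follows it }
end.
```

       Put a semicolon after the first WriteLn and the  compiler  should
       reject  the ELSE, because the semicolon has already ended the  IF
       statement.  The semicolon after the  second  WriteLn  is  merely
       followed by a null statement before END.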

       Given either of these two syntaxes, it's an easy matter (now that
       we've  reorganized  the  parser!) to add these  features  to  our
       parser.  Let's take the last case first, since it's simpler.

       To begin, I've made things easy by introducing a new recognizer:


       {--------------------------------------------------------------}
       { Match a Semicolon }

       procedure Semi;
       begin
          MatchString(';');
       end;
       {--------------------------------------------------------------}


       This procedure works very much like our old Match.  It insists on
       finding a semicolon as the next token.  Having found it, it  moves
       on to the token that follows.

       Since a  semicolon follows a statement, procedure Block is almost
       the only one we need to change:


       {--------------------------------------------------------------}
       { Parse and Translate a Block of Statements }

       procedure Block;
       begin
          Scan;
          while not(Token in ['e', 'l']) do begin
             case Token of
              'i': DoIf;
              'w': DoWhile;
              'R': DoRead;
              'W': DoWrite;
              'x': Assignment;
             end;
             Semi;
             Scan;
          end;
       end;
       {--------------------------------------------------------------}


       Note carefully the subtle change in the case statement.  The call
       to  Assignment  is now guarded by a test on Token: it is made only
       when  Token  is  'x'.   This  avoids calling Assignment  when  the
       token is a semicolon (which could happen if the statement is null).
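
       To watch the terminator syntax at work without the rest of  TINY,
       here is a stripped-down sketch.  It is an illustration under made-
       up assumptions, not the real compiler: the input lives in a string
       called Source, each letter a..d stands for a whole statement, 'e'
       stands for END, and Next is a one-character stand-in for the  real
       scanner.

```pascal
program TerminatorDemo;
{ Illustrative sketch only; Source, Idx and the one-letter "statements"
  are hypothetical stand-ins for TINY's scanner and parser. }
var
   Source: string;     { the whole input, held in a string }
   Idx: integer;       { position of the next character to read }
   Token: char;        { current token }

{ Advance to the next token; #0 marks end of input }
procedure Next;
begin
   if Idx <= Length(Source) then begin
      Token := Source[Idx];
      Idx := Idx + 1;
   end
   else
      Token := #0;
end;

{ Insist on a semicolon, then step past it }
procedure Semi;
begin
   if Token <> ';' then begin
      WriteLn('Error: semicolon expected');
      Halt;
   end;
   Next;
end;

{ C/Ada style: every statement, even a null one, is followed by ';' }
procedure Block;
begin
   while Token <> 'e' do begin
      case Token of
       'a'..'d': begin
                    WriteLn('statement ', Token);
                    Next;
                 end;
      end;             { any other token is treated as a null statement }
      Semi;
   end;
end;

begin
   Source := 'a;;b;e';
   Idx := 1;
   Next;
   Block;
   WriteLn('block complete');
end.
```

       With the input shown, the doubled semicolon is taken  as  a  null
       statement, so the sketch reports statements a and b and then  the
       end of the block.  Change Source to 'ab;e' and Semi complains about
       the missing semicolon after a.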

       Since declarations are also  statements,  we  also  need to add a
       call to Semi within procedure TopDecls:


       {--------------------------------------------------------------}
       { Parse and Translate Global Declarations }

       procedure TopDecls;
       begin
          Scan;
          while Token = 'v' do begin
             Alloc;
             while Token = ',' do
                Alloc;
             Semi;
          end;
       end;
       {--------------------------------------------------------------}


       Finally, we need one for the PROGRAM statement:


       {--------------------------------------------------------------}
       { Main Program }

       begin
          Init;
          MatchString('PROGRAM');
          Semi;
          Header;
          TopDecls;
          MatchString('BEGIN');
          Prolog;
          Block;
          MatchString('END');
          Epilog;
       end.
       {--------------------------------------------------------------}


       It's as easy as that.  Try it with a copy of TINY and see how you
       like it.

       The Pascal version  is  a  little  trickier,  but  it  still only
       requires  minor  changes,  and those only to procedure Block.  To
       keep things as simple as possible, let's split the procedure into
       two parts.  The following procedure handles just one statement:


       {--------------------------------------------------------------}
       { Parse and Translate a Single Statement }

       procedure Statement;
       begin
          Scan;
          case Token of
           'i': DoIf;
           'w': DoWhile;
           'R': DoRead;
           'W': DoWrite;
           'x': Assignment;
          end;
       end;
       {--------------------------------------------------------------}


       Using this procedure, we can now rewrite Block like this:


       {--------------------------------------------------------------}
       { Parse and Translate a Block of Statements }

       procedure Block;
       begin
          Statement;
          while Token = ';' do begin
             Next;
             Statement;
          end;
       end;
       {--------------------------------------------------------------}


       That  sure  didn't  hurt, did it?  We can now parse semicolons in
       Pascal-like fashion.
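
       The same kind of stripped-down sketch shows the separator  syntax
       in  action.   Again,  this  is  an illustration under  made-up
       assumptions, not TINY: Source holds the input, letters a..d  stand
       for statements, 'e' stands for END, and Next reads one character.
       Because every token here is a single character, the sketch has no
       need for the call to Scan that the real Statement makes.

```pascal
program SeparatorDemo;
{ Illustrative sketch only; Source, Idx and the one-letter "statements"
  are hypothetical stand-ins for TINY's scanner and parser. }
var
   Source: string;     { the whole input, held in a string }
   Idx: integer;       { position of the next character to read }
   Token: char;        { current token }

{ Advance to the next token; #0 marks end of input }
procedure Next;
begin
   if Idx <= Length(Source) then begin
      Token := Source[Idx];
      Idx := Idx + 1;
   end
   else
      Token := #0;
end;

{ One statement; anything that is not a..d counts as a null statement }
procedure Statement;
begin
   if Token in ['a'..'d'] then begin
      WriteLn('statement ', Token);
      Next;
   end;
end;

{ Pascal style: semicolons sit BETWEEN statements }
procedure Block;
begin
   Statement;
   while Token = ';' do begin
      Next;
      Statement;
   end;
end;

begin
   Source := 'a;b;e';
   Idx := 1;
   Next;
   Block;
   if Token = 'e' then
      WriteLn('block complete')
   else
      WriteLn('Error: END expected');
end.
```

       Note that the semicolon after b is accepted: what follows it  is
       simply parsed as a null statement.  The input 'a;be', with no
       semicolon before the END marker, is accepted as well.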


       A COMPROMISE

       Now that we know how to deal with semicolons, does that mean that
       I'm going to put them in KISS/TINY?  Well, yes and  no.    I like
       the extra sugar and the security that comes with knowing for sure
       where the  ends  of  statements  are.    But I haven't changed my
       dislike for the compilation errors associated with semicolons.

       So I have what I think is a nice compromise: Make them OPTIONAL!

       Consider the following version of Semi:


       {--------------------------------------------------------------}
       { Match a Semicolon }

       procedure Semi;
       begin
          if Token = ';' then Next;
       end;
       {--------------------------------------------------------------}


       This procedure will ACCEPT a semicolon whenever it is called, but
       it won't INSIST on one.  That means that when  you  choose to use
       semicolons, the compiler  will  use the extra information to help
       keep itself on track.  But if you omit one (or omit them all) the
       compiler won't complain.  The best of both worlds.
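
       Here is a small sketch of the optional version at work.  As before
       it is only an illustration under made-up assumptions, not TINY
       itself: Source holds the input, letters a..d stand for statements,
       'e' stands for END, and Next reads one character.  One  thing  I
       have added that is not in the text: since Semi no longer  forces
       progress, the sketch rejects any unexpected token so that the loop
       cannot spin forever.

```pascal
program OptionalSemiDemo;
{ Illustrative sketch only; Source, Idx and the one-letter "statements"
  are hypothetical stand-ins for TINY's scanner and parser. }
var
   Source: string;     { the whole input, held in a string }
   Idx: integer;       { position of the next character to read }
   Token: char;        { current token }

{ Advance to the next token; #0 marks end of input }
procedure Next;
begin
   if Idx <= Length(Source) then begin
      Token := Source[Idx];
      Idx := Idx + 1;
   end
   else
      Token := #0;
end;

{ Accept a semicolon if one is there, but do not insist on it }
procedure Semi;
begin
   if Token = ';' then Next;
end;

procedure Block;
begin
   while Token <> 'e' do begin
      case Token of
       'a'..'d': begin
                    WriteLn('statement ', Token);
                    Next;
                 end;
       ';': begin end;           { null statement: Semi will consume it }
      else begin                 { added guard: never loop on bad input }
         WriteLn('Error: unexpected input');
         Halt;
      end;
      end;
      Semi;
   end;
end;

begin
   Source := 'a;bc;e';           { semicolons after a and c, none after b }
   Idx := 1;
   Next;
   Block;
   WriteLn('block complete');
end.
```

       The input mixes both habits, and all three statements  are  still
       recognized.  Strip every semicolon from Source ('abce') and the
       result is the same.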

       Put this procedure in place in the first version of  your program
       (the  one for C/Ada syntax), and you have  the  makings  of  TINY
       Version 1.2.


       COMMENTS

       Up  until  now  I have carefully avoided the subject of comments.
       You would think that this would be an easy subject ... after all,
       the compiler doesn't have to deal with comments at all; it should
       just ignore them.  Well, sometimes that's true.

       Comments can be just about as easy or as difficult as  you choose
       to make them.    At  one  extreme,  we can arrange things so that
       comments  are  intercepted  almost  the  instant  they  enter the
       compiler.  At the  other,  we can treat them as lexical elements.
       Matters tend to get interesting when  you  consider  cases  like
       comment delimiters contained in quoted strings.


       SINGLE-CHARACTER DELIMITERS

       Here's an example.  Suppose we assume the  Turbo  Pascal standard
       and use curly braces for comments.  In this case we  have single-
       character delimiters, so our parsing is a little easier.

       One  approach  is  to  strip  the  comments  out the  instant  we
       encounter them in the input stream; that is,  right  in procedure
       GetChar.    To  do  this,  first  change  the  name of GetChar to
       something else, say GetCharX.  (For the record, this is  going to
       be a TEMPORARY change, so best not do this with your only copy of
       TINY.  I assume you understand that you should  always  do  these
       experiments with a working copy.)

       Now, we're going to need a  procedure  to skip over comments.  So
       key in the following one:


       {--------------------------------------------------------------}
       { Skip A Comment Field }

       procedure SkipComment;
       begin
          while Look <> '}' do
             GetCharX;
          GetCharX;
       end;
       {--------------------------------------------------------------}


       Clearly, what this procedure is going to do is to simply read and
       discard characters from the input  stream, until it finds a right
       curly brace.  Then it reads one more character and returns  it in
       Look.
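
       To see SkipComment do its job in isolation, here is a small sketch
       that feeds it from a string instead  of  the  real  input.   The
       string Source, the index Idx, the body of GetCharX and the driver
       loop in the main program are all assumptions made up for the demo;
       only SkipComment itself is taken from the text.

```pascal
program CommentDemo;
{ Illustrative sketch only; Source, Idx and this GetCharX are
  hypothetical stand-ins for TINY's real input routine. }
var
   Source: string;     { the whole input, held in a string }
   Idx: integer;       { position of the next character to read }
   Look: char;         { lookahead character }

{ Raw character fetch; #0 marks end of input }
procedure GetCharX;
begin
   if Idx <= Length(Source) then begin
      Look := Source[Idx];
      Idx := Idx + 1;
   end
   else
      Look := #0;
end;

{ Skip A Comment Field }
procedure SkipComment;
begin
   while Look <> '}' do
      GetCharX;
   GetCharX;
end;

begin
   Source := 'x=1{set x}y=2';
   Idx := 1;
   GetCharX;
   while Look <> #0 do
      if Look = '{' then
         SkipComment             { discard everything through the '}' }
      else begin
         Write(Look);            { echo whatever is not comment }
         GetCharX;
      end;
   WriteLn;
end.
```

       The sketch echoes the input with the comment removed.  Be warned
       that in this sketch a comment with no closing brace would keep the
       loop in SkipComment running forever, since the end-of-input marker
       never matches '}'.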

       Now we can  write  a  new  version of GetChar that uses SkipComment
       to strip out comments:


       {--------------------------------------------------------------}
       { Get Character from Input Stream }
       { Skip Any Comments }