       procedure GetChar;
       begin
          GetCharX;
          if Look = '{' then SkipComment;
       end;
       {--------------------------------------------------------------}


       Code this up  and  give  it  a  try.    You'll find that you can,
       indeed, bury comments anywhere you like.  The comments never even
       get into the parser proper ... every call to GetChar just returns
       any character that's NOT part of a comment.

       As a matter of fact, while  this  approach gets the job done, and
       may even be  perfectly  satisfactory  for  you, it does its job a
       little  TOO  well.    First  of all, most  programming  languages
       specify that a comment should be treated like a  space,  so  that
       comments aren't allowed  to  be embedded in, say, variable names.
       This current version doesn't care WHERE you put comments.

       Second, since the  rest  of  the  parser can't even receive a '{'
       character, you will not be allowed to put one in a quoted string.

       Before you turn up your nose at this simplistic solution, though,
       I should point out  that  as respected a compiler as Turbo Pascal
       also won't allow  a  '{' in a quoted string.  Try it.  And as for
       embedding a comment in an  identifier, I can't imagine why anyone
       would want to do such a  thing,  anyway, so the question is moot.
       For 99% of all  applications,  what I've just shown you will work
       just fine.

       But,  if  you  want  to  be  picky  about it  and  stick  to  the
       conventional treatment, then we  need  to  move  the interception
       point downstream a little further.

       To do this, first change GetChar back to the way it was, and have
       SkipComment call GetChar instead of GetCharX.   Then,  let's  add
       the left brace as a possible whitespace character:


       {--------------------------------------------------------------}
       { Recognize White Space }

       function IsWhite(c: char): boolean;
       begin
          IsWhite := c in [' ', TAB, CR, LF, '{'];
       end;
       {--------------------------------------------------------------}


       Now, we can deal with comments in procedure SkipWhite:


       {--------------------------------------------------------------}
       { Skip Over Leading White Space }

       procedure SkipWhite;
       begin
          while IsWhite(Look) do begin
             if Look = '{' then
                SkipComment
             else
                GetChar;
          end;
       end;
       {--------------------------------------------------------------}


       Note  that SkipWhite is written so that we  will  skip  over  any
       combination of whitespace characters and comments, in one call.

       OK, give this one a try, too.   You'll  find  that  it will let a
       comment serve to delimit tokens.  It's worth mentioning that this
       approach also gives us the  ability to handle curly braces within
       quoted strings, since within such  strings we will not be testing
       for or skipping over whitespace.

       There's one last  item  to  deal  with:  Nested  comments.   Some
       programmers like the idea  of  nesting  comments, since it allows
       you to comment out code during debugging.  The  code  I've  given
       here won't allow that and, again, neither will Turbo Pascal.

       But the fix is incredibly easy.  All  we  need  to  do is to make
       SkipComment recursive:


       {--------------------------------------------------------------}
       { Skip A Comment Field }

       procedure SkipComment;
       begin
          while Look <> '}' do begin
             GetChar;
             { 'while', not 'if': nested comments may sit back to back }
             while Look = '{' do SkipComment;
          end;
          GetChar;
       end;
       {--------------------------------------------------------------}
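

       If you'd rather avoid recursion, a nesting counter gives the same
       result.  What follows is only a sketch, written as a stand-alone
       test program rather than as part of the compiler.  The names Src
       and Idx, the #0 end-of-input marker, and the string-based GetChar
       are assumptions made for the demonstration; they are not part of
       TINY.


       {--------------------------------------------------------------}
       program NestDemo;

       var
          Src: string;       { hypothetical test input }
          Idx: integer;      { index of the next unread character }
          Look: char;        { lookahead character }

       { Read the next character of Src; #0 marks end of input }
       procedure GetChar;
       begin
          if Idx <= Length(Src) then begin
             Look := Src[Idx];
             Inc(Idx);
          end
          else
             Look := #0;
       end;

       { Skip a comment, counting nesting depth instead of recursing }
       procedure SkipComment;
       var
          Depth: integer;
       begin
          Depth := 1;              { Look holds the opening brace }
          repeat
             GetChar;
             if Look = '{' then
                Inc(Depth)
             else if Look = '}' then
                Dec(Depth);
          { the #0 test is an added guard against unclosed comments }
          until (Depth = 0) or (Look = #0);
          GetChar;                 { step past the last closing brace }
       end;

       begin
          Src := 'a{one{two}{three}four}b';
          Idx := 1;
          GetChar;
          while Look <> #0 do begin
             if Look = '{' then
                SkipComment
             else begin
                Write(Look);
                GetChar;
             end;
          end;
          WriteLn;
       end.
       {--------------------------------------------------------------}


       The main loop echoes only the characters that lie outside the
       braces.  In the compiler itself, only the body of SkipComment
       would carry over; the rest is scaffolding for the test.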


       That does it.  As  sophisticated a comment-handler as you'll ever
       need.


       MULTI-CHARACTER DELIMITERS

       That's all well and  good  for cases where a comment is delimited
       by single  characters,  but  what  about  the  cases such as C or
       standard Pascal, where two  characters  are  required?  Well, the
       principles are still the same, but we have to change our approach
       quite a bit.  I'm sure it won't surprise you to learn that things
       get harder in this case.

       For the multi-character situation, the  easiest thing to do is to
       intercept the left delimiter  back  at the GetChar stage.  We can
       "tokenize" it right there, replacing it by a single character.

       Let's assume we're using the C delimiters '/*' and '*/'.   First,
       we  need  to  go back to the 'GetCharX' approach.  In yet another
       copy of your compiler, rename  GetChar to GetCharX and then enter
       the following new procedure GetChar:


       {--------------------------------------------------------------}
       { Read New Character.  Intercept '/*' }

       procedure GetChar;
       begin
          if TempChar <> ' ' then begin
             Look := TempChar;
             TempChar := ' ';
             end
          else begin
             GetCharX;
             if Look = '/' then begin
                Read(TempChar);
                if TempChar = '*' then begin
                   Look := '{';
                   TempChar := ' ';
                end;
             end;
          end;
       end;
       {--------------------------------------------------------------}


       As you can see, what this procedure does is  to  intercept  every
       occurrence of '/'.  It then examines the NEXT  character  in  the
       stream.  If the character  is  a  '*',  then  we  have  found the
       beginning  of  a  comment,  and  GetChar  will  return  a  single
       character replacement for it.   (For  simplicity,  I'm  using the
       same '{' character  as I did for Pascal.  If you were writing a C
       compiler, you'd no doubt want to pick some other character that's
       not  used  elsewhere  in C.  Pick anything you like ... even $FF,
       anything that's unique.)

       If the character  following  the  '/'  is NOT a '*', then GetChar
       tucks it away in the new global TempChar, and  returns  the  '/'.

       Note that you need to declare this new variable and initialize it
       to ' '.  I like to do  things  like  that  using the Turbo "typed
       constant" construct:


            const TempChar: char = ' ';


       Now we need a new version of SkipComment:


       {--------------------------------------------------------------}
       { Skip A Comment Field }

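       procedure SkipComment;
       begin
          GetCharX;
          repeat
             { 'while' re-tests the current character, so that a }
             { comment ending in '**/' is still closed properly  }
             while Look <> '*' do
                GetCharX;
             GetCharX;
          until Look = '/';
          GetChar;
       end;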
       {--------------------------------------------------------------}


       A  few  things  to  note:  first  of  all, function  IsWhite  and
       procedure SkipWhite  don't  need  to  be  changed,  since GetChar
       returns the '{' token.  If you change that token  character, then
       of  course you also need to change the  character  in  those  two
       routines.

       Second, note that  SkipComment  doesn't call GetChar in its loop,
       but  GetCharX.    That  means   that  the  trailing  '/'  is  not
       intercepted and  is seen by SkipComment.  Third, although GetChar
       is the  procedure  doing  the  work,  we  can still deal with the
       comment  characters  embedded  in  a  quoted  string,  by calling
       GetCharX  instead  of  GetChar  while  we're  within  the string.
       Finally,  note  that  we can again provide for nested comments by
       adding a single statement to SkipComment, just as we did before.
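

       To see the pieces working together, including the quoted-string
       case, here is a rough stand-alone sketch.  It reads from a string
       instead of the input file, so Src, Idx, the #0 end marker, the
       procedure CopyString and the double-quote string delimiter are
       all assumptions made for the test, not part of the compiler.  As
       in the listing above, a blank that directly follows a '/' is
       lost, because ' ' doubles as the "empty" value of TempChar.


       {--------------------------------------------------------------}
       program SlashStarDemo;

       var
          Src: string;       { hypothetical test input }
          Idx: integer;      { index of the next unread character }
          Look: char;        { lookahead character }
          TempChar: char;    { one-character holding area }

       { Raw read from Src; #0 marks end of input }
       procedure GetCharX;
       begin
          if Idx <= Length(Src) then begin
             Look := Src[Idx];
             Inc(Idx);
          end
          else
             Look := #0;
       end;

       { Read a character, turning the opening delimiter into a token }
       procedure GetChar;
       begin
          if TempChar <> ' ' then begin
             Look := TempChar;
             TempChar := ' ';
          end
          else begin
             GetCharX;
             if Look = '/' then begin
                GetCharX;            { examine the character after '/' }
                if Look = '*' then
                   Look := '{'
                else begin
                   TempChar := Look;
                   Look := '/';
                end;
             end;
          end;
       end;

       { Skip a comment; the #0 tests guard against an unclosed one }
       procedure SkipComment;
       begin
          GetCharX;
          repeat
             while (Look <> '*') and (Look <> #0) do
                GetCharX;
             GetCharX;
          until (Look = '/') or (Look = #0);
          GetChar;
       end;

       { Echo a quoted string, reading it with GetCharX throughout }
       procedure CopyString;
       begin
          repeat
             Write(Look);
             GetCharX;
          until (Look = '"') or (Look = #0);
          if Look = '"' then Write(Look);
          GetChar;
       end;

       begin
          Src := 'x = "a /* b */ c" /* note **/ z/y';
          Idx := 1;
          TempChar := ' ';
          GetChar;
          while Look <> #0 do begin
             if Look = '{' then
                SkipComment
             else if Look = '"' then
                CopyString
             else begin
                Write(Look);
                GetChar;
             end;
          end;
          WriteLn;
       end.
       {--------------------------------------------------------------}


       The comment outside the string is dropped, while the delimiters
       inside the quotes are echoed untouched, because CopyString never
       goes through the intercepting GetChar.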


       ONE-SIDED COMMENTS

       So far I've shown you  how  to  deal  with  any  kind  of comment
       delimited on the left and the right.  That leaves only  the  one-
       sided comments, like those in assembly language or in Ada,  which
       are terminated by the end of the line.  In a way,  that  case  is
       easier.   The only procedure that would need  to  be  changed  is
       SkipComment, which must now terminate at the end of the line:


       {--------------------------------------------------------------}
       { Skip A Comment Field }

       procedure SkipComment;
       begin
          repeat
             GetCharX;
          until Look = CR;
          GetChar;
       end;
       {--------------------------------------------------------------}


       If the leading character is  a  single  one,  as  in  the  ';' of
       assembly language, then we're essentially done.  If  it's  a two-
       character token, as in the '--'  of  Ada, we need only modify the
       tests  within  GetChar.   Either way, it's an easier problem than
       the balanced case.
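

       For the Ada case, the modified tests might look something like
       the sketch below.  It is a stand-alone test program that reads
       from a string; Src, Idx and the #0 end marker are assumptions
       made for the demonstration, and '{' is kept as the internal
       comment token purely for convenience.


       {--------------------------------------------------------------}
       program DashDashDemo;

       const
          CR = #13;
          LF = #10;

       var
          Src: string;       { hypothetical test input }
          Idx: integer;      { index of the next unread character }
          Look: char;        { lookahead character }
          TempChar: char;    { one-character holding area }

       { Raw read from Src; #0 marks end of input }
       procedure GetCharX;
       begin
          if Idx <= Length(Src) then begin
             Look := Src[Idx];
             Inc(Idx);
          end
          else
             Look := #0;
       end;

       { Read a character, turning a double dash into the token }
       procedure GetChar;
       begin
          if TempChar <> ' ' then begin
             Look := TempChar;
             TempChar := ' ';
          end
          else begin
             GetCharX;
             if Look = '-' then begin
                GetCharX;            { examine the character after '-' }
                if Look = '-' then
                   Look := '{'
                else begin
                   TempChar := Look;
                   Look := '-';
                end;
             end;
          end;
       end;

       { Skip to the end of the line; #0 guards the last line }
       procedure SkipComment;
       begin
          repeat
             GetCharX;
          until (Look = CR) or (Look = #0);
          GetChar;
       end;

       begin
          Src := 'a-b -- first' + CR + LF + 'c -- second' + CR + LF;
          Idx := 1;
          TempChar := ' ';
          GetChar;
          while Look <> #0 do begin
             if Look = '{' then
                SkipComment
             else begin
                Write(Look);
                GetChar;
             end;
          end;
          WriteLn;
       end.
       {--------------------------------------------------------------}


       A single '-', as in 'a-b', passes through unchanged by way of
       TempChar; only the doubled dash starts a comment.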


       CONCLUSION

       At this point we now have the ability to deal with  both comments
       and semicolons, as well as other kinds of syntactic sugar.   I've
       shown  you several ways to deal with  each,  depending  upon  the
       convention  desired.    The  only  issue left is: which of  these
       conventions should we use in KISS/TINY?

       For the reasons that I've given as we went  along,  I'm  choosing
       the following:


        (1) Semicolons are TERMINATORS, not separators

        (2) Semicolons are OPTIONAL

        (3) Comments are delimited by curly braces

        (4) Comments MAY be nested


       Put the code corresponding to these cases into your copy of TINY.
       You now have TINY Version 1.2.

       Now that we  have  disposed  of  these  sideline  issues,  we can
       finally get back into the mainstream.  In  the  next installment,
       we'll talk  about procedures and parameter passing, and we'll add
       these important features to TINY.  See you then.


       *****************************************************************
       *                                                               *
       *                        COPYRIGHT NOTICE                       *
       *                                                               *
       *   Copyright (C) 1989 Jack W. Crenshaw. All rights reserved.   *
       *                                                               *
       *****************************************************************