procedure GetChar;
begin
   GetCharX;
   if Look = '{' then SkipComment;
end;
{--------------------------------------------------------------}
Code this up and give it a try. You'll find that you can,
indeed, bury comments anywhere you like. The comments never even
get into the parser proper ... every call to GetChar just returns
any character that's NOT part of a comment.
As a matter of fact, while this approach gets the job done, and
may even be perfectly satisfactory for you, it does its job a
little TOO well. First of all, most programming languages
specify that a comment should be treated like a space, so that
comments aren't allowed to be embedded in, say, variable names.
This current version doesn't care WHERE you put comments.
Second, since the rest of the parser can't even receive a '{'
character, you will not be allowed to put one in a quoted string.
Before you turn up your nose at this simplistic solution, though,
I should point out that as respected a compiler as Turbo Pascal
also won't allow a '{' in a quoted string. Try it. And as for
embedding a comment in an identifier, I can't imagine why anyone
would want to do such a thing, anyway, so the question is moot.
For 99% of all applications, what I've just shown you will work
just fine.
But, if you want to be picky about it and stick to the
conventional treatment, then we need to move the interception
point downstream a little further.
To do this, first change GetChar back to the way it was, and
change SkipComment so that it calls GetChar instead of GetCharX.
Then, let's add the left brace as a possible whitespace
character:
{--------------------------------------------------------------}
{ Recognize White Space }
function IsWhite(c: char): boolean;
begin
   IsWhite := c in [' ', TAB, CR, LF, '{'];
end;
{--------------------------------------------------------------}
Now, we can deal with comments in procedure SkipWhite:
{--------------------------------------------------------------}
{ Skip Over Leading White Space }
procedure SkipWhite;
begin
   while IsWhite(Look) do begin
      if Look = '{' then
         SkipComment
      else
         GetChar;
   end;
end;
{--------------------------------------------------------------}
Note that SkipWhite is written so that we will skip over any
combination of whitespace characters and comments, in one call.
OK, give this one a try, too. You'll find that it will let a
comment serve to delimit tokens. It's worth mentioning that this
approach also gives us the ability to handle curly braces within
quoted strings, since within such strings we will not be testing
for or skipping over whitespace.
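If you'd like to try the idea without touching your compiler,
here is a minimal stand-alone sketch. It is not part of TINY:
the program name SkipDemo, the sample string Source, and the
index SrcPos are all made up for the test, and this GetChar reads
from that string instead of the keyboard, handing back '.' once
the string runs out. SkipComment is the simple, non-nested
version, and it assumes every comment is properly closed.
{--------------------------------------------------------------}
program SkipDemo;

const TAB = ^I;
      CR  = ^M;
      LF  = ^J;
      Source: string = 'ab{ note }{ more }  cd.';  { made-up input }

var Look: char;          { lookahead character }
    SrcPos: integer;     { index of next unread character }

{ Read New Character From The Sample String }
procedure GetChar;
begin
   if SrcPos <= Length(Source) then begin
      Look := Source[SrcPos];
      Inc(SrcPos);
      end
   else
      Look := '.';
end;

{ Skip A Comment Field (assumes the comment is closed) }
procedure SkipComment;
begin
   while Look <> '}' do
      GetChar;
   GetChar;
end;

{ Recognize White Space }
function IsWhite(c: char): boolean;
begin
   IsWhite := c in [' ', TAB, CR, LF, '{'];
end;

{ Skip Over Leading White Space }
procedure SkipWhite;
begin
   while IsWhite(Look) do begin
      if Look = '{' then
         SkipComment
      else
         GetChar;
   end;
end;

{ Main Program: echo each token on its own line }
begin
   SrcPos := 1;
   GetChar;
   while Look <> '.' do begin
      Write(Look);
      GetChar;
      if IsWhite(Look) then begin
         SkipWhite;
         WriteLn;
      end;
   end;
   WriteLn;
end.
{--------------------------------------------------------------}
With the sample string shown, this should print "ab" and "cd" on
separate lines. The two back-to-back comments and the blanks that
follow them are all swallowed by a single call to SkipWhite, and
together they act as one delimiter between the two names.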
There's one last item to deal with: Nested comments. Some
programmers like the idea of nesting comments, since it allows
you to comment out code during debugging. The code I've given
here won't allow that and, again, neither will Turbo Pascal.
But the fix is incredibly easy. All we need to do is to make
SkipComment recursive:
{--------------------------------------------------------------}
{ Skip A Comment Field }
procedure SkipComment;
begin
   while Look <> '}' do begin
      GetChar;
      { a nested comment may be followed at once by another }
      while Look = '{' do SkipComment;
   end;
   GetChar;
end;
{--------------------------------------------------------------}
That does it. As sophisticated a comment-handler as you'll ever
need.
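One case worth testing is a nested comment that is followed
immediately by another nested comment. The stand-alone sketch
below does that. As before, the program name NestDemo, the sample
string Source, and the index SrcPos are made up for the test, and
GetChar reads from the string and returns '.' at its end. Note
that this SkipComment makes its recursive call from a "while"
rather than an "if". With a plain "if", the '{' that opens the
second inner comment would be read past as ordinary comment text,
and its '}' would then close the outer comment too early.
{--------------------------------------------------------------}
program NestDemo;

const Source: string = 'a{ x {y}{z} w }b.';   { made-up input }

var Look: char;          { lookahead character }
    SrcPos: integer;     { index of next unread character }

{ Read New Character From The Sample String }
procedure GetChar;
begin
   if SrcPos <= Length(Source) then begin
      Look := Source[SrcPos];
      Inc(SrcPos);
      end
   else
      Look := '.';
end;

{ Skip A Comment Field, Allowing Nesting }
{ (assumes every comment is closed)      }
procedure SkipComment;
begin
   while Look <> '}' do begin
      GetChar;
      while Look = '{' do SkipComment;
   end;
   GetChar;
end;

{ Main Program: echo everything that is not in a comment }
begin
   SrcPos := 1;
   GetChar;
   while Look <> '.' do
      if Look = '{' then
         SkipComment
      else begin
         Write(Look);
         GetChar;
      end;
   WriteLn;
end.
{--------------------------------------------------------------}
With the sample string shown, this should print "ab": the whole
outer comment, including both inner ones, disappears.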
MULTI-CHARACTER DELIMITERS
That's all well and good for cases where a comment is delimited
by single characters, but what about the cases such as C or
standard Pascal, where two characters are required? Well, the
principles are still the same, but we have to change our approach
quite a bit. I'm sure it won't surprise you to learn that things
get harder in this case.
For the multi-character situation, the easiest thing to do is to
intercept the left delimiter back at the GetChar stage. We can
"tokenize" it right there, replacing it by a single character.
Let's assume we're using the C delimiters '/*' and '*/'. First,
we need to go back to the "GetCharX" approach. In yet another
copy of your compiler, rename GetChar to GetCharX and then enter
the following new procedure GetChar:
{--------------------------------------------------------------}
{ Read New Character. Intercept '/*' }
procedure GetChar;
begin
   if TempChar <> ' ' then begin
      Look := TempChar;
      TempChar := ' ';
      end
   else begin
      GetCharX;
      if Look = '/' then begin
         Read(TempChar);
         if TempChar = '*' then begin
            Look := '{';
            TempChar := ' ';
         end;
      end;
   end;
end;
{--------------------------------------------------------------}
As you can see, what this procedure does is to intercept every
occurrence of '/'. It then examines the NEXT character in the
stream. If the character is a '*', then we have found the
beginning of a comment, and GetChar will return a single
character replacement for it. (For simplicity, I'm using the
same '{' character as I did for Pascal. If you were writing a C
compiler, you'd no doubt want to pick some other character that's
not used elsewhere in C. Pick anything you like ... even $FF,
anything that's unique.)
If the character following the '/' is NOT a '*', then GetChar
tucks it away in the new global TempChar, and returns the '/'.
Note that you need to declare this new variable and initialize it
to ' '. I like to do things like that using the Turbo "typed
constant" construct:
const TempChar: char = ' ';
Now we need a new version of SkipComment:
{--------------------------------------------------------------}
{ Skip A Comment Field }
procedure SkipComment;
begin
   GetCharX;
   repeat
      while Look <> '*' do
         GetCharX;
      repeat                    { skip a run of '*', as in '**/' }
         GetCharX;
      until Look <> '*';
   until Look = '/';
   GetChar;
end;
{--------------------------------------------------------------}
A few things to note: first of all, function IsWhite and
procedure SkipWhite don't need to be changed, since GetChar
returns the '{' token. If you change that token character, then
of course you also need to change the character in those two
routines.
Second, note that SkipComment doesn't call GetChar in its loop,
but GetCharX. That means that the trailing '/' is not
intercepted and is seen by SkipComment. Third, although GetChar
is the procedure doing the work, we can still deal with the
comment characters embedded in a quoted string, by calling
GetCharX instead of GetChar while we're within the string.
Finally, note that we can again provide for nested comments by
adding a single statement to SkipComment, just as we did before.
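To see the '/*' machinery work end to end, here is a minimal
stand-alone sketch (non-nested). The program name CDemo, the
sample string Source, the index SrcPos, and the helper NextRaw
are all made up for the test. NextRaw stands in for the Read
calls and returns '.' once the string runs out. SkipComment is
written so that a comment ending in a run of asterisks, such as
'**/', is still closed properly, and it assumes every comment is
in fact closed.
{--------------------------------------------------------------}
program CDemo;

const Source: string = 'a/* one **/b/c/* two */d.';  { made up }
      TempChar: char = ' ';      { one-character pushback }

var Look: char;          { lookahead character }
    SrcPos: integer;     { index of next unread character }

{ Fetch The Next Raw Character Of The Sample String }
procedure NextRaw(var c: char);
begin
   if SrcPos <= Length(Source) then begin
      c := Source[SrcPos];
      Inc(SrcPos);
      end
   else
      c := '.';
end;

{ Read New Character, No Interception }
procedure GetCharX;
begin
   NextRaw(Look);
end;

{ Read New Character.  Intercept '/*' }
procedure GetChar;
begin
   if TempChar <> ' ' then begin
      Look := TempChar;
      TempChar := ' ';
      end
   else begin
      GetCharX;
      if Look = '/' then begin
         NextRaw(TempChar);
         if TempChar = '*' then begin
            Look := '{';
            TempChar := ' ';
         end;
      end;
   end;
end;

{ Skip A Comment Field }
procedure SkipComment;
begin
   GetCharX;
   repeat
      while Look <> '*' do
         GetCharX;
      repeat
         GetCharX;
      until Look <> '*';
   until Look = '/';
   GetChar;
end;

{ Main Program: echo everything that is not in a comment }
begin
   SrcPos := 1;
   GetChar;
   while Look <> '.' do
      if Look = '{' then
         SkipComment
      else begin
         Write(Look);
         GetChar;
      end;
   WriteLn;
end.
{--------------------------------------------------------------}
With the sample string shown, this should print "ab/cd". Both
comments vanish, while the lone '/' between 'b' and 'c' comes
through untouched, with the 'c' that was read ahead delivered
from TempChar on the next call.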
ONE-SIDED COMMENTS
So far I've shown you how to deal with any kind of comment
delimited on the left and the right. That only leaves the one-
sided comments like those in assembler language or in Ada, that
are terminated by the end of the line. In a way, that case is
easier. The only procedure that would need to be changed is
SkipComment, which must now terminate at the newline characters:
{--------------------------------------------------------------}
{ Skip A Comment Field }
procedure SkipComment;
begin
   repeat
      GetCharX;
   until Look = CR;
   GetChar;
end;
{--------------------------------------------------------------}
If the leading character is a single one, as in the ';' of
assembly language, then we're essentially done. If it's a two-
character token, as in the '--' of Ada, we need only modify the
tests within GetChar. Either way, it's an easier problem than
the balanced case.
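For the two-character case, here is a minimal stand-alone sketch
using the Ada '--'. The only change to GetChar is that it tests
for '-' twice, where the C version tested for '/' and then '*'.
The program name AdaDemo, the sample string Source, the index
SrcPos, and the helper NextRaw are made up for the test, and I'm
still using '{' as the stand-in token for the comment opener.
The sketch assumes the comment line ends in a CR.
{--------------------------------------------------------------}
program AdaDemo;

const CR = ^M;
      Source: string = 'a-b -- note'#13'c.';   { made-up input }
      TempChar: char = ' ';      { one-character pushback }

var Look: char;          { lookahead character }
    SrcPos: integer;     { index of next unread character }

{ Fetch The Next Raw Character Of The Sample String }
procedure NextRaw(var c: char);
begin
   if SrcPos <= Length(Source) then begin
      c := Source[SrcPos];
      Inc(SrcPos);
      end
   else
      c := '.';
end;

{ Read New Character, No Interception }
procedure GetCharX;
begin
   NextRaw(Look);
end;

{ Read New Character.  Intercept '--' }
procedure GetChar;
begin
   if TempChar <> ' ' then begin
      Look := TempChar;
      TempChar := ' ';
      end
   else begin
      GetCharX;
      if Look = '-' then begin
         NextRaw(TempChar);
         if TempChar = '-' then begin
            Look := '{';
            TempChar := ' ';
         end;
      end;
   end;
end;

{ Skip A Comment Field (runs to the end of the line) }
procedure SkipComment;
begin
   repeat
      GetCharX;
   until Look = CR;
   GetChar;
end;

{ Main Program: echo everything that is not in a comment }
begin
   SrcPos := 1;
   GetChar;
   while Look <> '.' do
      if Look = '{' then
         SkipComment
      else begin
         Write(Look);
         GetChar;
      end;
   WriteLn;
end.
{--------------------------------------------------------------}
With the sample string shown, this should print "a-b c". The
single '-' in "a-b" is passed through as a minus sign, while the
'--' and everything after it, up to and including the CR, is
skipped.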
CONCLUSION
At this point we now have the ability to deal with both comments
and semicolons, as well as other kinds of syntactic sugar. I've
shown you several ways to deal with each, depending upon the
convention desired. The only issue left is: which of these
conventions should we use in KISS/TINY?
For the reasons that I've given as we went along, I'm choosing
the following:
(1) Semicolons are TERMINATORS, not separators
(2) Semicolons are OPTIONAL
(3) Comments are delimited by curly braces
(4) Comments MAY be nested
Put the code corresponding to these cases into your copy of TINY.
You now have TINY Version 1.2.
Now that we have disposed of these sideline issues, we can
finally get back into the mainstream. In the next installment,
we'll talk about procedures and parameter passing, and we'll add
these important features to TINY. See you then.
*****************************************************************
*                                                               *
*                        COPYRIGHT NOTICE                       *
*                                                               *
*   Copyright (C) 1989 Jack W. Crenshaw. All rights reserved.   *
*                                                               *
*****************************************************************