📄 compilers-scanners.st
字号:
Object subclass: #AbstractScanner
instanceVariableNames: 'source nextChar token tokenType buffer '
classVariableNames: ''
poolDictionaries: ''
category: 'Compilers-Scanners'!
AbstractScanner comment:
'=================================================
Copyright (c) 1992 by Justin O. Graver.
All rights reserved (with exceptions).
For complete information evaluate "Object tgenCopyright."
=================================================
I scan a source string and break it up into tokens using mechanisms provided in concrete subclasses.
Instance Variables:
source <ReadStream> - character input stream.
nextChar <Character + UndefinedObject> - one-character lookahead buffer for source, nil if no input left.
token <String> - current token buffer.
tokenType <String + Symbol> - current token type buffer.
buffer <WriteStream> - character accumulation buffer for tokens.
'!
!AbstractScanner methodsFor: 'initialization'!
init
self buffer: (RetractableWriteStream on: (String new: 32))!
reset
"Reset the initial state of the scanner before scanning a new source."
self buffer reset.
self token: nil.
self tokenType: nil.
self nextChar: nil!
scanSource: aString
"Convert the input string to a read stream and scan the first token."
self reset.
self source: (RetractableReadStream on: aString).
self nextChar: self source next.
self scanToken! !
!AbstractScanner methodsFor: 'state accessing'!
buffer
^buffer!
buffer: argument
buffer := argument!
nextChar
^nextChar!
nextChar: argument
nextChar := argument!
source
^source!
source: argument
source := argument!
token
^token!
token: argument
token := argument!
tokenType
^tokenType!
tokenType: argument
tokenType := argument! !
!AbstractScanner methodsFor: 'scanning'!
backspaceSource
"When the source is at the end, 'source current' is the last character."
self atEnd ifFalse: [self source backspace].
self nextChar: self source current!
getNextChar
"Source will answer an empty string when no more input is available.
Subclasses may override this to avoid unnecessary buffering."
self buffer nextPut: self nextChar.
self nextChar: self source next!
putBackChar
"Remove the last character in the buffer and backspace the source.
Subclasses may override this to avoid unnecessary buffering."
self buffer backspace.
self backspaceSource!
scanToken
"Subclasses must compute values for token and tokenType here."
self subclassResponsibility!
signalEndOfInput
"Set scanner to the end-of-input state."
self tokenType: self endOfInputTokenType.
self token: self endOfInputToken! !
!AbstractScanner methodsFor: 'testing'!
atEnd
^self nextChar = self endOfInputToken! !
!AbstractScanner methodsFor: 'accessing'!
contents
^self source contents!
endOfInputToken
"Answer a token representing the end of the input."
self subclassResponsibility!
endOfInputTokenType
"Answer the token type representing the end of the input."
self subclassResponsibility!
errorPosition
"Answer the source position of the last acceptable character."
^source position + (self atEnd
ifTrue: [1]
ifFalse: [0]) max: 1!
position
^self source position! !
"-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- "!
AbstractScanner class
instanceVariableNames: ''!
!AbstractScanner class methodsFor: 'instance creation'!
new
^super new init!
scanFrom: aString
| newScanner |
newScanner := self new.
newScanner scanSource: aString.
^newScanner! !
AbstractScanner subclass: #FSABasedScanner
instanceVariableNames: 'fsa '
classVariableNames: ''
poolDictionaries: ''
category: 'Compilers-Scanners'!
FSABasedScanner comment:
'=================================================
Copyright (c) 1992 by Justin O. Graver.
All rights reserved (with exceptions).
For complete information evaluate "Object tgenCopyright."
=================================================
I am an abstract class of scanner that scans a source string and breaks it up into tokens using a minimal deterministic finite-state automata (FSA). Each token is also given a type by its associated final state in the FSA. Specific FSAs are stored in class instance variables of my concrete subclasses.
Instance Variables:
fsa <FSAState> - a local reference to the token recognizer, in minimal deterministic form, for this class of scanner.
'!
!FSABasedScanner methodsFor: 'state accessing'!
fsa
^fsa!
fsa: argument
fsa := argument! !
!FSABasedScanner methodsFor: 'scanning directives'!
compactDoubleApostrophes
"Compact all two apostrophe sequences in my current token into a single
apostrophe."
| readStream writeStream ch nextCh |
readStream := ReadStream on: self token.
writeStream := WriteStream on: (String new: 20).
[readStream atEnd]
whileFalse:
[writeStream nextPut: (ch := readStream next).
(ch = $' and: [(nextCh := readStream peek) notNil and: [nextCh = $']])
ifTrue: [readStream skip: 1]].
self token: writeStream contents!
ignoreComment
self scanToken!
ignoreDelimiter
self scanToken! !
!FSABasedScanner methodsFor: 'accessing'!
endOfInputToken
"Answer a token representing the end of the input."
^Character endOfInput!
endOfInputTokenType
"Answer the token type representing the end of the input."
^self endOfInputToken!
myFsa
^self class fsa!
startState
^self fsa! !
!FSABasedScanner methodsFor: 'scanning'!
scanToken
"Scan the next token and compute its token type."
| state nextState tok typeAction |
self atEnd
ifTrue: [self signalEndOfInput]
ifFalse:
[state := self startState.
[(nextState := self at: state transitionFor: self nextChar) isNil]
whileFalse:
[state := nextState.
self getNextChar].
tok := self buffer contents.
typeAction := self at: state tokenTypeAndActionFor: tok.
self tokenType: typeAction type.
self token: tok.
self buffer reset.
typeAction action notNil ifTrue: [self perform: typeAction action]]! !
!FSABasedScanner methodsFor: 'initialization'!
init
super init.
self fsa: self myFsa! !
!FSABasedScanner methodsFor: 'scanner generation'!
classInitializationMethodTextForClassNamed: name spec: tokenSpec
^self subclassResponsibility!
createScannerClassNamed: name category: category spec: tokenSpec
| scannerClass |
scannerClass := self defaultScannerClass
subclass: name asSymbol
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: category.
scannerClass comment: self generatedScannerClassComment.
scannerClass class compile: (self classInitializationMethodTextForClassNamed: name spec: tokenSpec)
classified: 'class initialization'.
scannerClass initialize.
^scannerClass!
defaultOptimizedScannerClass
^OptimizedScanner!
defaultScannerClass
^self class!
generatedScannerClassComment
^'This scanner class was automatically generated by ', TranslatorGenerator versionName , '.'!
newStreamForMethodRendering
| ws |
ws := WriteStream on: (String new: 2048).
ws policy printCharactersLiterally: true.
^ws! !
!FSABasedScanner methodsFor: 'converting'!
fastScanner
^self defaultOptimizedScannerClass buildFrom: self! !
!FSABasedScanner methodsFor: 'private'!
at: state tokenTypeAndActionFor: tok
^state tokenTypeAndActionFor: tok!
at: state transitionFor: char
^state transitionFor: char! !
"-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- "!
FSABasedScanner class
instanceVariableNames: 'fsa '!
!FSABasedScanner class methodsFor: 'state accessing'!
fsa
^fsa!
fsa: argument
fsa := argument! !
!FSABasedScanner class methodsFor: 'class initialization'!
initialize
"Concrete subclasses must somehow provide a fsa. Subclasses created by
automatic means may simply 'plug-in' a dynamically computed fsa. However, if a
class that can be filed-out is desired then it is worthwhile to override this
initialization method with one that can build the appropriate fsa directly."
"FSABasedScanner initialize"
self fsa: nil! !
FSABasedScanner subclass: #FSABasedLookaheadScanner
instanceVariableNames: 'savePosition '
classVariableNames: ''
poolDictionaries: ''
category: 'Compilers-Scanners'!
FSABasedLookaheadScanner comment:
'=================================================
Copyright (c) 1992 by Justin O. Graver.
All rights reserved (with exceptions).
For complete information evaluate "Object tgenCopyright."
=================================================
This is an abstract class for scanners with lookahead.
Instance Variables:
savePosition <Integer> - pointer into input source for error notification.'!
!FSABasedLookaheadScanner methodsFor: 'initialization'!
reset
"Reset the initial state of the scanner before scanning a new source."
super reset.
self savePosition: 0! !
!FSABasedLookaheadScanner methodsFor: 'state accessing'!
savePosition
^savePosition!
savePosition: argument
savePosition := argument! !
!FSABasedLookaheadScanner methodsFor: 'accessing'!
errorPosition
"Answer the source position of the last acceptable character."
^self savePosition max: 1! !
FSABasedLookaheadScanner subclass: #FSABasedScannerWithOneTokenLookahead
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Compilers-Scanners'!
FSABasedScannerWithOneTokenLookahead comment:
'=================================================
Copyright (c) 1992 by Justin O. Graver.
All rights reserved (with exceptions).
For complete information evaluate "Object tgenCopyright."
=================================================
This class provides a scanner with simple one-token lookahead. '!
!FSABasedScannerWithOneTokenLookahead methodsFor: 'scanner generation'!
defaultOptimizedScannerClass
^OptimizedScannerWithOneTokenLookahead! !
!FSABasedScannerWithOneTokenLookahead methodsFor: 'scanning'!
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -