<html>
<head>
<title>Introduction To StrongForth</title>
</head>
<body>
<h1>Introduction To StrongForth</h1>
<h2>Preface</h2>
<p>This introduction to StrongForth is written for readers
who already have some experience with Forth. 
Although StrongForth stays as
close to ANS Forth as possible, the reader is not required to
have worked with an ANS-compliant Forth system.</p>
<p>The basic idea behind StrongForth is to add strong
static type checking to a Forth system. Previous Forth systems
and standards (including ANS) are considered <em>typeless</em>
or <em>untyped</em>, which means they do not do any type
checking at all. The interpreter and the compiler generally
accept any word to be applied to the operands on the data and
return stacks. This behaviour grants total freedom to the
programmer, but on the other hand it is quite often a source of
type errors, which frequently cause system crashes and other
strange behaviour throughout the whole development phase.</p>
<p>StrongForth does not guarantee bug-free programs. It does not
even guarantee the absence of crashes. But type errors will be
greatly reduced. Furthermore, since the interpreter and the compiler know
the data types of the operands on the stack, they are able
to choose the appropriate version of a word if the dictionary
contains several words with the same name but different input
parameter types. This is called operator overloading. As will be
shown in this introduction, operator overloading allows a much
more comfortable way of programming. Additionally, it is no longer 
necessary for you to invent individual names for words with the 
same semantics but different data types.</p>
<p>Of course, strong static typing has some drawbacks, which
might keep traditional Forth programmers from using it.
First, it requires a higher degree of discipline, because all
words having stack-effects have to be provided with precise stack
diagrams. Second, interpreter and compiler will prohibit not only
dirty tricks, but sometimes also just <em>unusual</em>
operations. For example, adding a flag to an address is not
possible, although it might be useful in some cases. And third,
relying on a system that does all the type checking itself might
lead to more careless programming.</p>
<p>Nevertheless, the advantages and disadvantages of strong
static type checking have already been discussed in the Forth
community. The availability of StrongForth will certainly put
more practical aspects into the previously rather theoretical
discussion, allowing you to simply try it out by yourself.</p>
<h2>First Steps</h2>
<p>Let's begin with a few examples from the first chapter of
Leo Brodie's famous textbook <em>Starting Forth</em>:</p>
<pre><u>15 SPACES</u>                 OK</pre>
<p>When interpreting the number <kbd>15</kbd>, the interpreter pushes this
value on the data stack and remembers that it is an unsigned
single integer. <kbd>SPACES</kbd> is a word that requires an unsigned single
integer as input parameter. Here's a possible definition of 
<kbd>SPACES</kbd>:</p>
<pre>: SPACES ( UNSIGNED -- )
  0 ?DO SPACE LOOP ;</pre>
<p>Well, this is not very exciting. At first glance, the only
more or less interesting thing about it is the stack diagram.
Standard Forth systems use <kbd>( n -- )</kbd>, which is nothing but a
comment. In StrongForth, the stack diagram is interpreted source code, which
compiles the stack diagram of <kbd>SPACES</kbd> into the dictionary.
Additionally, it tells the compiler that the definition starts
with an item of data type <kbd>UNSIGNED</kbd> on the data stack and is
expected to remove this item on exiting. In general, each word in
the dictionary includes full information about its stack effect.</p>
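<p>To make the idea of a compiled stack diagram concrete, here is a minimal
Python sketch. It is not StrongForth's implementation; the names
<kbd>Word</kbd> and <kbd>parse_stack_diagram</kbd> are hypothetical and only
illustrate how a diagram such as <kbd>( UNSIGNED -- )</kbd> could be stored
as data next to a word's name.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """Hypothetical dictionary entry: a name plus its stack diagram."""
    name: str
    inputs: tuple   # data types consumed, top of stack last
    outputs: tuple  # data types left on the stack


def parse_stack_diagram(text):
    """Split '( UNSIGNED -- )' into (inputs, outputs) tuples of type names."""
    tokens = text.split()
    if len(tokens) < 3 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("malformed stack diagram: " + text)
    if "--" not in tokens:
        raise ValueError("missing '--' in stack diagram: " + text)
    sep = tokens.index("--")
    return tuple(tokens[1:sep]), tuple(tokens[sep + 1:-1])


inputs, outputs = parse_stack_diagram("( UNSIGNED -- )")
SPACES = Word("SPACES", inputs, outputs)
print(SPACES)
```

<p>In this model, the diagram is no longer a comment: it becomes part of the
entry that a type-aware lookup can inspect later.</p>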
<p>So let us now try a second example:</p>
<pre><u>42 EMIT</u> * OK</pre>
<p><kbd>EMIT</kbd> is a word that expects a number on the stack and 
displays the ASCII character associated with this number. We can also 
write</p>
<pre><u>CHAR * EMIT</u> * OK</pre>
<p>instead, because a character is a kind of number. Even 
the following code works well:</p>
<pre><u>CHAR * .</u> * OK</pre>
<p>But wait ... Isn't <kbd>.</kbd> supposed to display a number, and not 
a character? Let's see:</p>
<pre><u>42 .</u> 42  OK</pre>
<p>Yes, this still works. But how does <kbd>.</kbd> know whether it should print 
a number or an ASCII character? StrongForth actually provides more than 
one version of <kbd>.</kbd>. There is one version for displaying numbers, 
and there is one version for displaying characters. The interpreter and 
the compiler take care of selecting the version that is best suited to the 
purpose. In this case, a number is displayed as a number, and a 
character is displayed as a character. When we write <kbd>42</kbd>, the 
interpreter pushes 42 onto the data stack and keeps in mind that this 
is a number. When we write <kbd>CHAR *</kbd>, the interpreter pushes 
exactly the same value onto the stack, but this time it makes a note that 
the item on top of the stack is a character. This note later allows the 
interpreter to select the correct version of <kbd>.</kbd>. <kbd>EMIT</kbd> 
does not make this distinction. It displays each and every parameter as an 
ASCII character.</p>
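<p>The selection described above can be modelled in a few lines of Python.
This is only a sketch under simplifying assumptions: the type note is kept
next to each value on the stack, and the table <kbd>DOT_VERSIONS</kbd> and
the helper names are hypothetical, not part of StrongForth.</p>

```python
# Each stack item is a (value, data type) pair; the type is the
# "note" the interpreter keeps about the value.
stack = []


def push(value, dtype):
    stack.append((value, dtype))


# Two hypothetical versions of "." selected by the type of the top item.
DOT_VERSIONS = {
    "UNSIGNED": lambda value: str(value) + " ",
    "CHARACTER": lambda value: chr(value),
}


def dot():
    """Model of '.': pick the version that matches the item's data type."""
    value, dtype = stack.pop()
    return DOT_VERSIONS[dtype](value)


def emit():
    """Model of EMIT: ignore the type and always show a character."""
    value, _ = stack.pop()
    return chr(value)


push(42, "UNSIGNED")     # 42
print(dot())             # displayed as a number
push(42, "CHARACTER")    # CHAR *
print(dot())             # displayed as a character
push(42, "UNSIGNED")     # 42
print(emit())            # EMIT shows a character in either case
```

<p>The same value 42 is stored in all three cases; only the recorded data
type decides which version of <kbd>.</kbd> runs.</p>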
<p>There are several other versions of <kbd>.</kbd> in
StrongForth's dictionary. Just have a look at these:</p>
<pre><u>3 4 = .</u> FALSE  OK
<u>-16 .</u> -16  OK</pre>
<p>In the first example, <kbd>=</kbd> takes two items of data type 
<kbd>UNSIGNED</kbd> and returns an item of data type <kbd>FLAG</kbd>. 
A special version of <kbd>.</kbd> for flags delivers the appropriate 
result. The second example seems to be straightforward, but it is not.
Remember that <kbd>15</kbd>, <kbd>42</kbd>, <kbd>3</kbd> and <kbd>4</kbd> 
produced items of data type <kbd>UNSIGNED</kbd>. <kbd>-16</kbd> produces 
an item of data type <kbd>SIGNED</kbd>, and the
interpreter finds a version of <kbd>.</kbd> suited for signed
numbers. To enter a positive signed number, you have to precede
it with a sign, for example <kbd>+16</kbd>. The advantage of distinguishing
between signed and unsigned numeric literals becomes obvious when
we try larger numbers:</p>
<pre><u>4000000000 .</u> 4000000000  OK
<u>+4000000000 .</u> -294967296  OK</pre>
<p>A standard 32-bit Forth system would always display <kbd>-294967296</kbd>,
because it cannot distinguish signed from unsigned numbers. You would have to 
explicitly use <kbd>U.</kbd> in order to display 4000000000 as 
an unsigned number.</p>
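<p>The two results come from reading the same 32-bit cell in two different
ways. The following Python sketch shows the arithmetic; the helper names
<kbd>as_signed_32</kbd> and <kbd>as_unsigned_32</kbd> are made up for this
illustration, and a 32-bit cell size is assumed as in the text.</p>

```python
def as_unsigned_32(n):
    """Read the low 32 bits of n as an unsigned number."""
    return n & 0xFFFFFFFF


def as_signed_32(n):
    """Read the low 32 bits of n as a two's complement signed number."""
    n &= 0xFFFFFFFF
    if n & 0x80000000:          # sign bit set: value is negative
        return n - 0x100000000  # subtract 2**32
    return n


print(as_unsigned_32(4000000000))  # 4000000000
print(as_signed_32(4000000000))    # -294967296
```

<p>Since 4000000000 is larger than 2147483647, the largest signed 32-bit
value, the signed reading wraps around to 4000000000 - 4294967296.</p>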
<p>With the knowledge obtained so far, let's try out the compiler,
still sticking to the examples in Leo Brodie's <em>Starting Forth</em>:</p>
<pre><u>: STAR [CHAR] * . ;</u>  OK
<u>STAR</u> * OK
<u>CR</u>
 OK
<u>CR STAR CR STAR CR STAR</u>
*
*
* OK
<u>: STARS 0 DO STAR LOOP ;</u>
: STARS 0 DO ? undefined word
UNSIGNED
  OK</pre>
<p>Oops. What's that? <kbd>DO</kbd> tried to compile its runtime semantics, 
which expect two numbers of the same data type on the stack, but there was
only one. Thus, the compiler could not find an appropriate
runtime word for <kbd>DO</kbd> in the dictionary and threw an exception.
We have to supply a stack diagram for <kbd>STARS</kbd>:</p>
<pre><u>: STARS ( UNSIGNED -- ) 0 DO STAR LOOP ;</u>  OK
<u>5 STARS</u> ***** OK
<u>STARS</u>
STARS ? undefined word
</pre>
<p>So, the compiler starts with an <kbd>UNSIGNED</kbd> on the stack, adds
another one (<kbd>0</kbd>), and now <kbd>DO</kbd>'s runtime word gets its 
input parameters. The last line just shows that <kbd>STARS</kbd> will 
not be found in the dictionary if the stack is empty.</p>
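<p>Both error messages above follow from the same rule: a word is only found
if its input parameters are present on the stack. The Python sketch below
models this with a toy dictionary. The entries and the function
<kbd>find</kbd> are assumptions made for illustration; in particular, the
entry for <kbd>DO</kbd> is a simplification of "two numbers of the same data
type", and only exact type matches are considered here.</p>

```python
# Toy dictionary: (name, input parameter types), top of stack last.
DICTIONARY = [
    ("STAR", ()),
    ("STARS", ("UNSIGNED",)),
    ("DO", ("UNSIGNED", "UNSIGNED")),  # simplified stand-in for DO's runtime word
]


def find(name, stack_types):
    """Return the first entry whose inputs match the top of the stack."""
    for word_name, inputs in DICTIONARY:
        if word_name != name:
            continue
        n = len(inputs)
        if n > len(stack_types):
            continue  # not enough items on the stack
        if tuple(stack_types[len(stack_types) - n:]) == inputs:
            return word_name, inputs
    raise LookupError(name + " ? undefined word")


print(find("STARS", ["UNSIGNED"]))           # found
print(find("DO", ["UNSIGNED", "UNSIGNED"]))  # found

for name, types in (("STARS", []), ("DO", ["UNSIGNED"])):
    try:
        find(name, types)
    except LookupError as error:
        print(error)
```

<p>With an empty stack, the name <kbd>STARS</kbd> exists in the toy dictionary
but no entry matches, so the lookup reports an undefined word, just as in the
session above.</p>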
<p>Finally, let's complete Leo Brodie's example:</p>
<pre><u>: MARGIN CR 30 SPACES ;</u>  OK
<u>: BLIP MARGIN STAR ;</u>  OK
<u>: BAR MARGIN 5 STARS ;</u>  OK
<u>: F BAR BLIP BAR BLIP BLIP CR ;</u>  OK
<u>F</u>
                              *****
                              *
                              *****
                              *
                              *
 OK</pre>
<h2>Data Types</h2>
<p>In the previous section, we introduced four data types:
<kbd>UNSIGNED</kbd>, <kbd>SIGNED</kbd>, <kbd>CHARACTER</kbd>, and 
<kbd>FLAG</kbd>. Actually, StrongForth has 
many more data types, and it is even possible to define new,
application-specific data types.</p>
<h3>Data Type Structure</h3>
<p>Having several different data types is certainly useful, but a
large, unstructured set of data types would cause a serious
problem. Since it should be possible to apply words like <kbd>DUP</kbd> and
<kbd>DROP</kbd> to almost every data type, it would be necessary to 
supply a separate version of these words for each of them. Words with
two input parameters, like <kbd>SWAP</kbd>, would have to be defined for
each possible combination of data types, which already amounts to 1024
versions for 32 data types! <kbd>ROT</kbd> would be even worse.</p>
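<p>The numbers follow from simple counting: a word with k input parameters
needs one version per combination of data types. A short Python check,
using the figure of 32 data types from the text (the function name
<kbd>versions_needed</kbd> is hypothetical):</p>

```python
def versions_needed(num_types, num_params):
    """One version for every combination of parameter data types."""
    return num_types ** num_params


print(versions_needed(32, 1))  # DUP, DROP: 32
print(versions_needed(32, 2))  # SWAP: 1024
print(versions_needed(32, 3))  # ROT: 32768
```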
<p>To solve this problem, StrongForth arranges all data types in
a hierarchical structure. There are four data types at the root of
this hierarchy: <kbd>SINGLE</kbd>, <kbd>DOUBLE</kbd>, <kbd>TUPLE</kbd>, 
and <kbd>SYS</kbd>. All other data types are direct or indirect subtypes 
of these four so-called <em>ancestor</em> data types. The complete
data type structure looks like this:</p>

<pre>SINGLE
|
+-- INTEGER
|   |
|   +-- UNSIGNED
|   |
|   +-- SIGNED
|   |
|   +-- CHARACTER
|
+-- ADDRESS
|   |
|   +-- CADDRESS
|
+-- LOGICAL
|   |
|   +-- FLAG
|
+-- DEFINITION
|
+-- TOKEN
|   |
|   +-- SEARCH-CRITERION
|
+-- FILE
|
+-- FAM
|
+-- WID
|
+-- R-SIZE
|
+-- CONTROL-FLOW

DOUBLE
|
+-- INTEGER-DOUBLE
|   |
|   +-- UNSIGNED-DOUBLE
|   |   |
|   |   +-- NUMBER-DOUBLE
|   |
|   +-- SIGNED-DOUBLE
|
+-- DATA-TYPE
    |
    +-- STACK-DIAGRAM

TUPLE
|
+-- INPUT-SOURCE

SYS
|
+-- ORIG/DEST
|   |
|   +-- ORIG
|   |
|   +-- DEST
|
+-- COLON-SYS
|   |
|   +-- DOES-SYS
|
+-- DO-SYS
|
+-- CASE-SYS
|
+-- OF-SYS</pre>
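<p>One way to picture this structure is as a table that maps each data type
to its parent. The Python sketch below encodes part of the tree above and
walks up the chain of parents. It is an illustration only; the names
<kbd>PARENT</kbd>, <kbd>ancestors</kbd> and <kbd>is_same_or_subtype</kbd> are
hypothetical and say nothing about how StrongForth stores the hierarchy
internally.</p>

```python
# Parent of each data type, copied from part of the tree above.
# Root types (SINGLE, DOUBLE, TUPLE, SYS) have no entry.
PARENT = {
    "INTEGER": "SINGLE",
    "UNSIGNED": "INTEGER",
    "SIGNED": "INTEGER",
    "CHARACTER": "INTEGER",
    "ADDRESS": "SINGLE",
    "CADDRESS": "ADDRESS",
    "LOGICAL": "SINGLE",
    "FLAG": "LOGICAL",
    "INTEGER-DOUBLE": "DOUBLE",
    "UNSIGNED-DOUBLE": "INTEGER-DOUBLE",
    "NUMBER-DOUBLE": "UNSIGNED-DOUBLE",
    "SIGNED-DOUBLE": "INTEGER-DOUBLE",
}


def ancestors(dtype):
    """Return dtype followed by its parent, grandparent, ... up to the root."""
    chain = [dtype]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_same_or_subtype(dtype, candidate):
    """True if candidate is dtype itself or one of its ancestors."""
    return candidate in ancestors(dtype)


print(ancestors("UNSIGNED"))
print(is_same_or_subtype("UNSIGNED", "SINGLE"))
print(is_same_or_subtype("FLAG", "INTEGER"))
```

<p>Here <kbd>UNSIGNED</kbd> reaches <kbd>SINGLE</kbd> through
<kbd>INTEGER</kbd>, while <kbd>FLAG</kbd> never reaches <kbd>INTEGER</kbd>
because its chain runs through <kbd>LOGICAL</kbd>.</p>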

<p>Whenever the interpreter or compiler tries to find a word in
the dictionary, it accepts not only a word whose input parameters
match the data types of the items on the stack <em>exactly</em>,
but also a word whose input parameters are direct or indirect parents of those data types. Thus,
only two versions of <kbd>DUP</kbd> and <kbd>DROP</kbd> are required 
for data types <kbd>SINGLE</kbd> and <kbd>DOUBLE</kbd> and all their 
respective subtypes. If, for example, the item on top of the data 
stack has data type <kbd>UNSIGNED</kbd>, the version of 
<kbd>DUP</kbd> for data type <kbd>SINGLE</kbd> would match, because
<kbd>UNSIGNED</kbd> is a (second-generation) subtype of <kbd>SINGLE</kbd>. 
Similarly, four versions of <kbd>SWAP</kbd> and eight versions of 
<kbd>ROT</kbd> (instead of 22