Basis of AI Backprop
Code from April 10, 1996
Documentation from April 10, 1996
Copyright (c) 1990-96 by Donald R. Tveter
CONTENTS
--------
1. Introduction
2. Making the Simulators
3. A Simple Example
4. Basic Facilities
5. The Format Command
6. Taking Training and Testing Patterns from a File
7. Saving and Restoring Weights
8. Initializing Weights
9. The Seed Values
10. The Algorithm Command
11. The Delta-Bar-Delta Method
12. Quickprop
13. Making a Network
14. Recurrent Networks
15. Miscellaneous Commands
16. Limitations
17. The Pro Version Additions
1. Introduction
---------------
This manual describes the free version of my Basis of AI Backprop
designed to accompany my not yet published (sigh) textbook, _The Basis
of AI_. This program contains enough features for students in an
ordinary AI or Neural Networking course. More serious users will
probably need the professional version of this software, see:
http://www.mcs.com/~drt/probp.html
or send me email at: drt@mcs.com. Other free NN software for the
textbook is also available at:
http://www.mcs.com/~drt/svbp.html
For more on backprop see my "Backpropagator's Review" at:
http://www.mcs.com/~drt/bprefs.html
Notice: this is use at your own risk software. There is no guarantee
that it is bug-free. Use of this software constitutes acceptance for
use in an as is condition. There are no warranties with regard to this
software. In no event shall the author be liable for any damages
whatsoever arising out of or in connection with the use or performance
of this software.
There are four simulators that can be constructed from the included
files. The program bp does back-propagation using real weights and
arithmetic. The program ibp does back-propagation using 16-bit
integer weights, 16- and 32-bit integer arithmetic and some floating
point arithmetic. The program sbp uses symmetric floating point
weights and its sole purpose is to produce weights for two-layer
networks for use with the Hopfield and Boltzmann relaxation algorithms
(included in another package). The program sibp does the same using
16-bit integer weights. The integer versions are faster on systems
without floating point hardware; however, sometimes these versions
don't have enough range or precision, and then using the floating
point versions is necessary. DOS binaries are included here for
systems with floating point hardware. If you need other versions,
write me.
2. Making the Simulators
------------------------
This code has been written to use either 32-bit floating point
(float) or 64-bit floating point (double) arithmetic. On System V
machines the standard seems to be that all floating point arithmetic is
done in double precision, so double arithmetic is faster than float and
is therefore the default. Other versions of C (e.g. ANSI C) will do
single precision real arithmetic, which will ordinarily be faster on
most machines (I think). To get 32-bit floating point, set
the compiler flag FLOAT in the makefile. The function, exp, defined in
real.c is double since System V specifies it as double. If your C uses
float, change this definition as well.
For UNIX systems, use either makefile.unx or makereal.unx.
The makefile.unx can make any of the programs and keeps the bp object
code files around, while makereal.unx makes only bp and also keeps the
bp object code files around. Also for DOS systems
there are two makefiles to choose from, makefile and makereal. Makefile
is designed to make all four programs but it only leaves around the
object files for ibp while erasing object files for sibp, sbp and bp.
On the other hand, makereal only makes bp and it leaves its object
files around. For 16-bit DOS you need to set the flag -DDOS16, and for
32-bit DOS the flag -DDOS32. The flags I have in the
DOS makefiles are what I use with Zortech C 3.1. The code is known not
to compile with at least one version of Turbo C because of an oddity
(or bug?) in the compiler.
There was a problem with the previous free student version where it
crashed on a Sun when the program hit a call to free in the file bp.c.
This can be solved by removing the two calls to free; the amount of
space you waste is minimal. I haven't had a report of such a problem
with this version yet, but if it happens, let me know; in all
probability removing a call or two to free in the file io.c will solve
the problem.
This code will work with basic C compilers; however, the libraries
sometimes vary from system to system. DOS systems seem to use the
function getch in the conio library for hot key capability. For a
System V UNIX system the code uses a home-made function called getch for
hot key capability. This is the default setting for a UNIX system and
it also works with Suns. If you use BSD UNIX then you need to define
the compiler variable BSD by adding the parameter -DBSD to the cc
command. To get the hot key feature to work with a NeXT use the
parameter -DNEXT. At this point I don't know what other variations of
UNIX use so you may need to adapt the ioctl function call in the file
io.c and the files rbp.h and ibp.h to make them fit some other version.
If your system uses some other standard then if you can send me the
documentation I should be able to make it work as well. If necessary
the hot key option can be removed by removing or commenting out the
line:
#define HOTKEYS
in the rbp.h and ibp.h files.
There are some other more minor options that can be compiled in or
left out but these are mentioned at other points in the documentation.
To make a particular executable file use the makefile given with the
data files and make any or all of them like so:
UNIX DOS
make -f makereal.unx bp make -f makereal bp
make -f makefile.unx bp make bp
make -f makefile.unx ibp make ibp
make -f makefile.unx sibp make sibp
make -f makefile.unx sbp make sbp
If you do get bugs on an odd system and you can let me telnet in to
your system (preferably on a separate login, rather than your personal
login) I will try to fix the problem for you.
3. A Simple Example
-------------------
Each version would normally be called with the name of a file to read
commands from, as in:
bp xor
After the data file is read commands are then taken from the keyboard.
When no file name is specified bp will take commands from the keyboard
(stdin file). Normally you will find it convenient to put the commands
you need to set up the network in a short file however it is possible to
type them all in to the program from the keyboard. If you have more
than a tiny amount of data you should have the data ready in a training
file and a testing file if you have test data.
The commands are one-, two- or three-letter commands and most of them
have optional parameters. The `a', `d', `f' and `q' commands allow a
number of sub-commands on a line. The maximum length of any line is 256
characters. An `*' makes the remainder of the line a comment. In
addition ctrl-R will run the training.
Here is an example of a data file to do the xor problem:
* input file for the xor problem
m 2 1 1 x * make a 2-1-1 network with extra input-output connections
s 7 * seed the random number function
ci * clear and initialize the network with random weights
rt { * read training patterns into memory
1 0 1
0 0 0
0 1 1
1 1 0}
e 0.5 * set eta, the learning rate to 0.5 (and eta2 to 0.5)
a 0.9 * set alpha, the momentum to 0.9
First in this example, the m command will make a network with 2 units in
the input layer, 1 unit in the second layer and 1 unit in the third
layer. Much of the time a three layer network where the connections are
only between adjacent layers is as complex as a network needs to be
however there are problems where having additional connections between
the input units and output units will greatly speed-up the learning
process. The xor problem is one of those problems where the extra
connections help so the 'x' at the end of the command will add these
two extra connections. The `s' (seed) command sets the seed for the
random number function. The "ci" command (clear and initialize) clears
the existing network weights and initializes the weights to random
values between -1 and +1. The rt (read training set) command gives four
new patterns to be read into the program. All of them are listed
between the curly brackets ({}). The input pattern comes first followed
by the output pattern. The command "e 0.5" sets eta, the learning
rate for the upper layer to 0.5 and eta2 for the lower layers to 0.5 as
well. The last line sets alpha, the momentum parameter, to 0.9.
After these commands are executed the following messages and prompt
appear:
Basis of AI Backprop (c) 1990-96 by Donald R. Tveter
drt@mcs.com - http://www.mcs.com/~drt/home.html
April 10, 1996 version.
taking commands from stdin now
[ACDFGMNPQTW?!acdefhlmopqrstw]? q
The characters within the square brackets are a list of the possible
commands. To run 100 iterations of back-propagation and print out the
status of the learning every 10 iterations type "r 100 10" at the
prompt:
[ACDFGMNPQTW?!acdefhlmopqrstw]? r 100 10
This gives:
running . . .
10 0.00 % 0.49947
20 0.00 % 0.49798
30 0.00 % 0.48713
40 0.00 % 0.37061
50 0.00 % 0.15681
59 100.00 % 0.07121 DONE
The program immediately prints out the "running . . ." message. After
each 10 iterations a summary of the learning process is printed giving
the percentage of patterns that are right and the average value of the
absolute values of the errors of the output units. The program stops
when each output for each pattern has been learned to within the
required tolerance, in this case the default value of 0.1. Sometimes
the integer versions will do a few extra iterations before declaring the
problem done because of truncation errors in the arithmetic done to
check for convergence. Unlike the previous student version, these
statistics are kept up-to-date by default; however, this can be
overridden to save a little on CPU time.
There are many factors that affect the number of iterations needed
for a network to converge. For instance, if your random number function
doesn't generate the same values as the one with the Zortech 3.1
compiler (which is the same one used by most UNIX C compilers), the
number of iterations it takes will be different. The integer versions
produce slightly different results than the floating point versions.
Listing Patterns
To get a listing of the status of each pattern use the `p' command
to give:
[ACDFGMNPQTW?!acdefhlmopqrstw]? p
1 0.903 e 0.097 ok
2 0.050 e 0.050 ok
3 0.935 e 0.065 ok
4 0.072 e 0.072 ok
59 (TOL) 100.00 % (4 right 0 wrong) 0.07121 err/unit
The number following the e (for error) is the sum of the absolute values
of the output errors for each pattern. An `ok' is given to every
pattern that has been learned to within the required tolerance. To get
the status of one pattern, say, the fourth pattern, type "p 4" to give:
0.07 (0.072) ok
To get a summary without the complete listing use "p 0". To get the
output targets for a given pattern, say pattern 3, use "o 3".
A particular test pattern can be input to the network by giving the
pattern at the prompt:
[ACDFGMNPQTW?!acdefhlmopqrstw]? 1 0
0.903
Examining Weights
It is often interesting to see the values of some particular weights
in the network. To see a listing of all the weights in a network you
can use the save weights command described later on and then list the
file containing the weights; however, to see the weights leading into a
particular node, say unit 1 of layer 3, use the w command as in:
[ACDFGMNPQTW?!acdefhlmopqrstw]? w 3 1
layer unit inuse unit value weight inuse input from unit
1 1 1 1.00000 5.38258 1 5.38258
1 2 1 0.00000 -4.86238 1 0.00000
2 1 1 1.00000 -10.86713 1 -10.86710
3 b 1 1.00000 7.71563 2 7.71563
sum = 2.23111
This listing also gives data on how the current activation value of the
node is computed using the weights and the activation values of the
nodes feeding into unit 1 of layer 3. The `b' unit is the bias (also
called the threshold) unit. The inuse column to the right of the unit
column is 1 when the unit is in use and 0 when it is not. In this
free version there are no commands to take weights out of use. In the
inuse column to the right of the weight column, a 1 indicates a regular
weight in use and a 2 indicates a bias weight in use.
Besides saving weights you can save all the parameters to a file
with the save everything command as in:
se saved
At the same time the weights will be written to the current weights
file. The file saved is virtually the same as the one you get with the
`?' command. To start over from where you left off you can use:
bp saved
and this also reads in the patterns and weights. This command DOES NOT