NAME: Sonar, Mines vs. Rocks

SUMMARY: This is the data set used by Gorman and Sejnowski in their study
of the classification of sonar signals using a neural network [1]. The
task is to train a network to discriminate between sonar signals bounced
off a metal cylinder and those bounced off a roughly cylindrical rock.

SOURCE: The data set was contributed to the benchmark collection by Terry
Sejnowski, now at the Salk Institute and the University of California at
San Diego. The data set was developed in collaboration with R. Paul
Gorman of Allied-Signal Aerospace Technology Center.

MAINTAINER: Scott E. Fahlman

PROBLEM DESCRIPTION:

The file "sonar.mines" contains 111 patterns obtained by bouncing sonar
signals off a metal cylinder at various angles and under various
conditions. The file "sonar.rocks" contains 97 patterns obtained from
rocks under similar conditions. The transmitted sonar signal is a
frequency-modulated chirp, rising in frequency. The data set contains
signals obtained from a variety of different aspect angles, spanning 90
degrees for the cylinder and 180 degrees for the rock.

Each pattern is a set of 60 numbers in the range 0.0 to 1.0. Each number
represents the energy within a particular frequency band, integrated over
a certain period of time. The integration aperture for higher frequencies
occurs later in time, since these frequencies are transmitted later during
the chirp.

The label associated with each record contains the letter "R" if the object
is a rock and "M" if it is a mine (metal cylinder).
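The exact serialization of the data files is not specified in this
description; as an illustration, a record in the comma-separated layout
used by common redistributions of this data set (60 floats followed by
the class letter, an assumption here) could be parsed like this:

```python
def parse_record(line):
    """Split one data line into (features, label).

    Assumes a comma-separated layout: 60 floating-point energy
    values in [0.0, 1.0], then the letter "R" (rock) or "M" (mine).
    This layout is an assumption, not part of the original description.
    """
    fields = line.strip().split(",")
    features = [float(x) for x in fields[:60]]
    label = fields[60]
    assert len(features) == 60
    assert all(0.0 <= v <= 1.0 for v in features)
    return features, label

# Example with a synthetic record:
demo_line = ",".join(["0.5"] * 60) + ",M"
feats, lab = parse_record(demo_line)
```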
The numbers in the
labels are in increasing order of aspect angle, but they do not encode the
angle directly.

METHODOLOGY:

This data set can be used in a number of different ways to test learning
speed, quality of ultimate learning, ability to generalize, or combinations
of these factors.

In [1], Gorman and Sejnowski report two series of experiments: an
"aspect-angle independent" series, in which the whole data set is used
without controlling for aspect angle, and an "aspect-angle dependent"
series in which the training and testing sets were carefully controlled to
ensure that each set contained cases from each aspect angle in
appropriate proportions.

For the aspect-angle independent experiments the combined set of 208 cases
is divided randomly into 13 disjoint sets with 16 cases in each. For each
experiment, 12 of these sets are used as training data, while the 13th is
reserved for testing. The experiment is repeated 13 times so that every
case appears once as part of a test set. The reported performance is an
average over the entire set of 13 different test sets, each run 10 times.

It was observed that this random division of the sample set led to rather
uneven performance. A few of the splits gave poor results, presumably
because the test set contains some samples from aspect angles that are
under-represented in the corresponding training set. This motivated Gorman
and Sejnowski to devise a different set of experiments in which an attempt
was made to balance the training and test sets so that each would have a
representative number of samples from all aspect angles. Since detailed
aspect-angle information was not present in the data base of samples, the
208 samples were first divided into clusters, using a 60-dimensional
Euclidean metric; each of these clusters was then divided between the
104-member training set and the 104-member test set.

The actual training and testing samples used for the "aspect-angle
dependent" experiments are marked in the data files.
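The 13-fold division used in the aspect-angle independent experiments can
be sketched as follows; the seed and the use of `random.shuffle` are
illustrative assumptions, since the original random division procedure is
not specified:

```python
import random

def thirteen_fold_splits(n_cases=208, n_folds=13, seed=0):
    """Randomly partition the case indices into 13 disjoint sets of
    16 cases each, then yield (train, test) index lists so that each
    set serves once as the test set and the other 12 as training data."""
    rng = random.Random(seed)
    indices = list(range(n_cases))
    rng.shuffle(indices)
    fold_size = n_cases // n_folds          # 16 cases per fold
    folds = [indices[k * fold_size:(k + 1) * fold_size]
             for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k
                 for i in fold]
        yield train, test

splits = list(thirteen_fold_splits())
```

Each of the 13 (train, test) pairs would then be trained and evaluated 10
times, matching the averaging described above.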
The reported
performance is an average over 10 runs with this single division of the
data set.

A standard back-propagation network was used for all experiments. The
network had 60 inputs and 2 output units, one indicating a cylinder and the
other a rock. Experiments were run with no hidden units (direct
connections from each input to each output) and with a single hidden layer
with 2, 3, 6, 12, or 24 units. Each network was trained for 300 epochs over
the entire training set.

The weight-update formulas used in this study were slightly different from
the standard form. A learning rate of 2.0 and a momentum of 0.0 were used.
Errors less than 0.2 were treated as zero. Initial weights were uniform
random values in the range -0.3 to +0.3.

RESULTS:

For the angle-independent experiments, Gorman and Sejnowski report the
following results for networks with different numbers of hidden units:

Hidden   % Right on    Std.   % Right on   Std.
Units    Training set  Dev.   Test Set     Dev.
------   ------------  ----   ----------   ----
  0          89.4       2.1      77.1       8.3
  2          96.5       0.7      81.9       6.2
  3          98.8       0.4      82.0       7.3
  6          99.7       0.2      83.5       5.6
 12          99.8       0.1      84.7       5.7
 24          99.8       0.1      84.5       5.7

For the angle-dependent experiments Gorman and Sejnowski report the
following results:

Hidden   % Right on    Std.   % Right on   Std.
Units    Training set  Dev.   Test Set     Dev.
------   ------------  ----   ----------   ----
  0          79.3       3.4      73.1       4.8
  2          96.2       2.2      85.7       6.3
  3          98.1       1.5      87.6       3.0
  6          99.4       0.9      89.3       2.4
 12          99.8       0.6      90.4       1.8
 24         100.0       0.0      89.2       1.4

Not surprisingly, the network's performance on the test set was somewhat
better when the aspect angles in the training and test sets were balanced.

Gorman and Sejnowski further report that a nearest-neighbor classifier on
the same data gave an 82.7% probability of correct classification.

Three trained human subjects were each tested on 100 signals, chosen at
random from the set of 208 returns used to create this data set. Their
responses ranged between 88% and 97% correct.
However, they may have been
using information from the raw sonar signal that is not preserved in the
processed data sets presented here.

REFERENCES:

1. Gorman, R. P., and Sejnowski, T. J. (1988). "Analysis of Hidden Units
in a Layered Network Trained to Classify Sonar Targets" in Neural Networks,
Vol. 1, pp. 75-89.
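The network setup described under METHODOLOGY can be sketched as below.
This is a minimal illustration, not the study's actual code: the exact
"slightly different" weight-update formulas are not given here, so a
plain sigmoid back-propagation step is assumed, keeping the stated
hyperparameters (learning rate 2.0, momentum 0.0, the 0.2 error
tolerance, and uniform initial weights in [-0.3, 0.3]).

```python
import math
import random

random.seed(0)

def init_layer(n_in, n_out):
    """Weight matrix with an extra bias row, drawn uniformly
    from [-0.3, 0.3] as stated in the methodology."""
    return [[random.uniform(-0.3, 0.3) for _ in range(n_out)]
            for _ in range(n_in + 1)]       # last row holds the biases

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layer, inputs):
    xs = inputs + [1.0]                     # append the bias input
    return [sigmoid(sum(xs[i] * layer[i][j] for i in range(len(xs))))
            for j in range(len(layer[0]))]

def train_step(w_hid, w_out, x, target, lr=2.0):
    """One back-propagation update on a single pattern. Output errors
    smaller than 0.2 are treated as zero, and the momentum term is
    omitted because it was set to 0.0 in the study."""
    h = forward(w_hid, x)
    y = forward(w_out, h)
    err = [0.0 if abs(t - o) < 0.2 else t - o for t, o in zip(target, y)]
    d_out = [e * o * (1.0 - o) for e, o in zip(err, y)]
    d_hid = [sum(d_out[j] * w_out[i][j] for j in range(len(d_out)))
             * h[i] * (1.0 - h[i]) for i in range(len(h))]
    for i, v in enumerate(x + [1.0]):
        for j, d in enumerate(d_hid):
            w_hid[i][j] += lr * v * d
    for i, v in enumerate(h + [1.0]):
        for j, d in enumerate(d_out):
            w_out[i][j] += lr * v * d
    return y

# Fit one synthetic "mine" pattern with 12 hidden units; the two
# outputs are (cylinder, rock). Real training would loop over all
# patterns in the training set for 300 epochs.
w_hid = init_layer(60, 12)
w_out = init_layer(12, 2)
pattern = [0.5] * 60
target = [1.0, 0.0]
for _ in range(300):
    y = train_step(w_hid, w_out, pattern, target)
```

With the 0.2 tolerance, updates stop once every output is within 0.2 of
its target, which is the sense in which small errors are "treated as
zero" above.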