===================================================================
Experiment Gui Quick Primer
===================================================================

NOTE THAT THIS README IS FOR THE "ADVANCED" MODE OF THE EXPERIMENTER
(however, some of it may be useful for the "simple" mode as well).

The basic philosophy of the Experiment package is described briefly in
the main README file. Please read this first before continuing.

The Experimenter provides a graphical front end to the classes in the
Experiment package. However, the Experimenter can be bewildering for
the new user. Here is a quick rundown on how to set it up for doing a
standard 10x10 fold cross-validation experiment (that is, 10 runs of
10-fold cross-validation on a bunch of data sets using several
classifiers).

Setting up and running the experiment:

First start the Experimenter:

java weka.gui.experiment.Experimenter

Next click "New" for a new experiment.

Next choose a destination for results: click on the destination panel
and choose InstancesResultListener. Set the output file for the
InstancesResultListener (all results will be saved in an arff file so
that you can load them again later and perform the t-test again etc.).

Now choose a ResultProducer: click on the Result generator panel and
select AveragingResultProducer (this result producer takes results
from a CrossValidationResultProducer and averages them---this will
give you the 10 averaged results mentioned above). Do not set the
"calculateStdDevs" option to true for the AveragingResultProducer---
this calculates the standard deviation for a SINGLE run of
cross-validation. The standard deviation that you are most likely to
be interested in is the standard deviation of the AVERAGES from the
10 runs of 10-fold cross-validation.

Next change "Disabled" to "Enabled" under Generator properties (on the
right hand side of the Setup panel). This will pop up a list-view.
Now expand the "resultProducer" entry in the list-view---this should
show a property called "splitEvaluator". Expand the splitEvaluator
entry. This should show an entry called "classifier"---click on this
to highlight it and press the "Select" button at the bottom of the
list-view.

Now you should see that the generator panel has become active and
shows "ZeroR" as a single entry in a list. Here is where you can add
(or delete) the classifiers that will be involved in the experiment.
Now add the classifiers you are interested in comparing.

Last step (whew!). Add the datasets to compare the schemes on in the
Datasets panel on the left of the Setup panel. (Note that the last
column is treated as the class column for all datasets---if this is
not the case for a particular dataset of yours, you will have to
reorder the columns using an AttributeFilter.)

Now you are ready to run your experiment. Change to the "Run" panel
and press Start. If all goes well you will see the status of your
experiment as it proceeds and will be informed in the Log panel of
any errors.

Analysing the results:

Click on the Analyse panel of the Experimenter. If you've saved your
results to an arff file you can load them into the Analyse panel
either by pressing the Experiment button (which grabs the most
recently run experiment's results) or by pressing the File button.

You won't need to adjust the Row Key fields, Run fields or Column key
fields. If you are just interested in percent correct as your
accuracy measure you needn't change the Comparison field either.
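If you would rather script this analysis than click through the
Analyse panel, the same corrected resampled t-test can be run from
Java. The sketch below is illustrative only: it assumes the default
column names written by the InstancesResultListener (Key_Dataset,
Key_Run, Key_Scheme, Key_Scheme_options, Key_Scheme_version_ID,
Percent_correct), a results file called results.arff, and the
PairedCorrectedTTester API of the 3.x releases - check the Javadoc
for your version, as method names have shifted between releases.

import java.io.BufferedReader;
import java.io.FileReader;

import weka.core.Instances;
import weka.core.Range;
import weka.experiment.PairedCorrectedTTester;
import weka.experiment.PairedTTester;

// Rough sketch: load the arff written by the InstancesResultListener and
// run the corrected resampled t-test on percent correct, using the first
// scheme in the file as the test base (resultset 0).
public class AnalyseResults {
  public static void main(String[] args) throws Exception {
    Instances results = new Instances(
        new BufferedReader(new FileReader("results.arff"))); // assumed name

    PairedTTester tester = new PairedCorrectedTTester();
    tester.setInstances(results);
    tester.setRunColumn(results.attribute("Key_Run").index());
    // Row key: the dataset; column key: scheme + options + version id.
    tester.setDatasetKeyColumns(new Range(
        "" + (results.attribute("Key_Dataset").index() + 1)));
    tester.setResultsetKeyColumns(new Range(
        "" + (results.attribute("Key_Scheme").index() + 1) + ","
           + (results.attribute("Key_Scheme_options").index() + 1) + ","
           + (results.attribute("Key_Scheme_version_ID").index() + 1)));
    tester.setSignificanceLevel(0.05);
    tester.setShowStdDevs(false);

    int comparisonCol = results.attribute("Percent_correct").index();
    System.out.println(tester.header(comparisonCol));
    System.out.println(tester.multiResultsetFull(0, comparisonCol));
  }
}

The printed table corresponds to what the Perform test button
produces in the GUI.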
Back in the Analyse panel, Significance allows you to change the
statistical significance level for the (corrected resampled) t-test
(0.05 and 0.01 are standard levels of significance that you see used
all the time in scientific papers). Test base allows you to set the
scheme against which the other schemes are compared.

Press Perform test to see a results table with significances. The
left hand column of the table shows the scheme which is being
compared against. The figures are percent correct accuracies for each
dataset/scheme. A 'v' indicates that a result is significantly higher
at the chosen significance level; a '*' indicates that it is
significantly lower; if no symbol is present there is no significant
difference at the chosen significance level.

Click the Show std. deviations check box and press Perform test again
to get the results with standard deviations.

A couple of other summaries are available. If you choose "Summary" in
the Test base drop-down list and press Perform test you will see a
kind of wins vs losses table---it should be relatively
self-explanatory. If you choose "Ranking" from the Test base
drop-down you will get a kind of league table. This ranks the schemes
according to their total wins minus losses against all other schemes.

===================================================================
Distributed Experiments
===================================================================

This is very much experimental (no pun intended). The Experimenter
includes the ability to split an experiment up and distribute it to
multiple hosts. This works best when all results are being sent to a
central database, although you could have each host save its results
to a distinct arff file and then merge the files afterwards.
Distributed experiments have been tested using InstantDB (with the
RMI bridge) and MySQL under Linux.

Each host *must* have Java installed, access to whatever datasets you
are using, and an experiment server running
(weka.experiment.RemoteEngine). If results are being sent to a
central database, then the appropriate JDBC database drivers must
also be installed on each host and be listed in a DatabaseUtils.props
file which is accessible to the RemoteEngine running on that host.

To start a RemoteEngine experiment server on a host, first copy
remoteExperimentServer.jar from the weka-3-x-y distribution to a
directory on the host machine. Next unpack the jar with

jar xvf remoteExperimentServer.jar

This will expand to three files: remoteEngine.jar, remote.policy, and
DatabaseUtils.props.

You will need to edit the DatabaseUtils.props file in order to list
the names of the JDBC database driver(s) you are using. The entry for
the URL to the database is not needed by the RemoteEngine server - it
will be supplied by clients when they start a remote experiment on
the server.

The RemoteEngine server will download code from clients on an
as-needed basis. The remote.policy file grants downloaded code
permission to perform certain operations, such as connecting to
ports. You will need to edit this file in order to specify correct
paths in some of the permissions; this should be self-explanatory
when you look at the file. By default the policy file specifies that
code can be downloaded from places accessible on the web via the http
port (80). If you have a network-accessible shared file system that
your RemoteEngines and clients will all be using, then you can also
have RemoteEngines obtain downloaded code from file URLs - just
uncomment the examples and replace the paths with something sensible.
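To make the last two paragraphs concrete, here are hypothetical
snippets of the two files. Neither is a copy of what ships with WEKA;
the driver class and paths are placeholders to replace with your own.

# DatabaseUtils.props (illustrative): list the driver class(es) you use.
# The jdbcURL entry can be left as shipped - clients supply the real URL.
jdbcDriver=com.mysql.jdbc.Driver

// remote.policy (illustrative): ordinary Java policy grants.
grant {
  // let downloaded tasks read data sets from a shared directory (example path)
  permission java.io.FilePermission "/shared/weka/datasets/-", "read";
  // allow RMI/JDBC connections on unprivileged ports and code download via http
  permission java.net.SocketPermission "*:1024-65535", "connect,accept";
  permission java.net.SocketPermission "*:80", "connect";
};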
It is actually *necessary* to have a shared file system, as data sets
need to be accessible to tasks running on RemoteEngines (see the
first entry in remote.policy).

To start the RemoteEngine server, first make sure that the CLASSPATH
environment variable is unset and then type (from the directory
containing remoteEngine.jar):

java -classpath remoteEngine.jar:/path_to_any_jdbc_drivers \
 -Djava.security.policy=remote.policy \
 -Djava.rmi.server.codebase=file:/path_to_this_directory/remoteEngine.jar \
 weka.experiment.RemoteEngine

If all goes well, you should see a message similar to:

ml@kiwi:remote_engine>
Host name : kiwi.cs.waikato.ac.nz
RemoteEngine exception: Connection refused to host: kiwi.cs.waikato.ac.nz;
nested exception is:
        java.net.ConnectException: Connection refused
Attempting to start rmi registry...
RemoteEngine bound in RMI registry

Now you can repeat this process on all hosts that you want to use.

The Setup panel of the Experimenter works exactly as before, but
there is now a small panel next to the Runs panel which controls
whether an experiment will be distributed or not. By default, this
panel is inactive, indicating that the experiment is a default
(single machine) experiment. Clicking the checkbox enables a remote
(distributed) experiment and activates the "Hosts" button. Clicking
the Hosts button will pop up a window into which you can enter the
names of the machines that you want to distribute the experiment to.
Enter fully qualified names here, e.g. blackbird.cs.waikato.ac.nz.

Once host names have been entered, configure the rest of the
experiment as you would normally. When you go to the Run panel and
start the experiment, progress on the sub-experiments running on the
different hosts will be displayed along with any error messages.

Remote experiments work by splitting a standard experiment into a
number of sub-experiments which get sent by RMI to remote hosts for
execution. By default, an experiment is split up on the basis of data
set, so each sub-experiment is self-contained, with all schemes
applied to a single data set. This allows you to specify, at most, as
many remote hosts as there are data sets in your experiment. If you
only have a few data sets, you can split your experiment up by run
instead. For example, a 10 x 10 fold cross-validation experiment will
get split into 10 sub-experiments - one for each run.
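Before kicking off a distributed experiment it can be handy to
confirm that each RemoteEngine server is actually reachable. The
small Java sketch below just lists what is bound in a host's RMI
registry (default port 1099); the class name and this usage are our
own illustration, not part of WEKA, but java.rmi.Naming.list is
standard Java.

import java.rmi.Naming;

// Lists the names bound in a host's RMI registry. A running RemoteEngine
// should show up here (the startup log above says it binds itself in the
// registry). Usage: java ListRemoteEngines blackbird.cs.waikato.ac.nz
public class ListRemoteEngines {
  public static void main(String[] args) throws Exception {
    String host = (args.length > 0) ? args[0] : "localhost";
    String[] names = Naming.list("rmi://" + host + "/");
    for (int i = 0; i < names.length; i++) {
      System.out.println("bound: " + names[i]);
    }
  }
}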