java weka.classifiers.trees.J48 -t $WEKAHOME/data/iris.arff

This prints out a decision tree classifier for the iris dataset and ten-fold cross-validation estimates of its performance. If you don't pass any options to the classifier, WEKA will list all the available options. Try:

java weka.classifiers.trees.J48

The options are divided into "general" options that apply to most classification schemes in WEKA, and scheme-specific options that only apply to the current scheme---in this case J48. WEKA has a common interface to all classification methods. Any class that implements a classifier can be used in the same way as J48 is used above. WEKA knows that a class implements a classifier if it extends the Classifier class in weka.classifiers. Almost all classes in weka.classifiers fall into this category. Try, for example:

java weka.classifiers.bayes.NaiveBayes -t $WEKAHOME/data/labor.arff

Here is a list of some of the classifiers currently implemented in weka.classifiers:

a) Classifiers for categorical prediction:

weka.classifiers.lazy.IBk: k-nearest neighbour learner
weka.classifiers.trees.J48: C4.5 decision trees
weka.classifiers.rules.PART: rule learner
weka.classifiers.bayes.NaiveBayes: naive Bayes with/without kernels
weka.classifiers.rules.OneR: Holte's OneR
weka.classifiers.functions.SMO: support vector machines
weka.classifiers.functions.Logistic: logistic regression
weka.classifiers.meta.AdaBoostM1: AdaBoost
weka.classifiers.meta.LogitBoost: logit boost
weka.classifiers.trees.DecisionStump: decision stumps (for boosting)
etc.

b) Classifiers for numeric prediction:

weka.classifiers.functions.LinearRegression: linear regression
weka.classifiers.trees.M5P: model trees
weka.classifiers.rules.M5Rules: model rules
weka.classifiers.lazy.IBk: k-nearest neighbour learner
weka.classifiers.lazy.LWR: locally weighted regression

=================
Association rules
=================

Besides classification schemes, there is some other useful stuff in WEKA. Association rules, for example, can be extracted using the Apriori algorithm. Try

java weka.associations.Apriori -t $WEKAHOME/data/weather.nominal.arff

=======
Filters
=======

There are also a number of tools that allow you to manipulate a dataset. These tools are called filters in WEKA and can be found in weka.filters:

weka.filters.unsupervised.attribute.Discretize: discretizes numeric data
weka.filters.unsupervised.attribute.Remove: deletes/selects attributes
etc.

Try:

java weka.filters.supervised.attribute.Discretize -i $WEKAHOME/data/iris.arff -c last
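All of this can also be driven from Java code rather than the command line. Below is a minimal sketch, assuming a WEKA 3.x class library on the classpath; the dataset path and the class name WekaApiSketch are placeholders, and in very old releases Evaluation.crossValidateModel may not take a Random argument. It builds a J48 tree, cross-validates it, and pushes the dataset through the supervised Discretize filter:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.supervised.attribute.Discretize;

public class WekaApiSketch {
  public static void main(String[] args) throws Exception {
    // Load a dataset and declare the last attribute to be the class.
    Instances data = new Instances(
        new BufferedReader(new FileReader("data/iris.arff")));
    data.setClassIndex(data.numAttributes() - 1);

    // Any scheme that extends weka.classifiers.Classifier works here.
    J48 tree = new J48();
    tree.buildClassifier(data);
    System.out.println(tree);  // prints the decision tree

    // Ten-fold cross-validation, as "-t" does on the command line.
    Evaluation eval = new Evaluation(data);
    eval.crossValidateModel(new J48(), data, 10, new Random(1));
    System.out.println(eval.toSummaryString());

    // Filters follow the same pattern: configure the filter, set the
    // input format, then push the dataset through.
    Discretize disc = new Discretize();
    disc.setInputFormat(data);
    Instances discretized = Filter.useFilter(data, disc);
    System.out.println(discretized);
  }
}

Because every scheme extends Classifier, swapping in NaiveBayes or SMO only changes the constructor call.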
----------------------------------------------------------------------
4. Database access:
-------------------

In terms of database connectivity, you should be able to use any database with a Java JDBC driver. When using classes that access a database (e.g. the Explorer), you will probably want to create a properties file that specifies which JDBC driver to use and where to find the database. This file should reside in your home directory or the current directory and be called "DatabaseUtils.props". An example is provided in weka/experiment (you need to expand weka.jar to be able to look at this file). That bundled file is used unless it is overridden by one in your home directory or the current directory (in that order).
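As a sketch, a minimal DatabaseUtils.props for a MySQL setup might contain just the two entries below; the driver class and URL are placeholders for your own database, and the driver's jar must also be on the CLASSPATH:

# DatabaseUtils.props (sketch) -- values below are placeholders
jdbcDriver=com.mysql.jdbc.Driver
jdbcURL=jdbc:mysql://localhost:3306/weka_experiments

Compare with the bundled example file in weka/experiment for the full set of recognized keys.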
----------------------------------------------------------------------
5. The Experiment package:
--------------------------

There is support for running experiments that involve evaluating classifiers on repeated randomizations of datasets, over multiple datasets (you can do much more than this, besides). The classes for this reside in the weka.experiment package. The basic architecture is that a ResultProducer (which generates results on some randomization of a dataset) sends results to a ResultListener (which is responsible for stating whether it already has the result, and otherwise storing results).

Example ResultListeners include:

weka.experiment.CSVResultListener: outputs results as comma-separated-value files.
weka.experiment.InstancesResultListener: converts results into a set of Instances.
weka.experiment.DatabaseResultListener: sends results to a database via JDBC.

Example ResultProducers include:

weka.experiment.RandomSplitResultProducer: train/test on a % split
weka.experiment.CrossValidationResultProducer: n-fold cross-validation
weka.experiment.AveragingResultProducer: averages results from another ResultProducer
weka.experiment.DatabaseResultProducer: acts as a cache for results, storing them in a database.

The RandomSplitResultProducer and CrossValidationResultProducer make use of a SplitEvaluator to obtain actual results for a particular split. Two SplitEvaluators are provided: ClassifierSplitEvaluator (for nominal classification) and RegressionSplitEvaluator (for numeric prediction). Each of these uses a Classifier for actual results generation.

So, you might have a DatabaseResultListener that is sent results from an AveragingResultProducer, which produces averages over the n results produced for each run of an n-fold CrossValidationResultProducer, which in turn is doing nominal classification through a ClassifierSplitEvaluator, which uses OneR for prediction. Whew. But you can combine these things together to do pretty much whatever you want. You might want to write a LearningRateResultProducer that splits a dataset into increasing numbers of training instances.

To run a simple experiment from the command line, try:

java weka.experiment.Experiment -r -T datasets/UCI/iris.arff \
  -D weka.experiment.InstancesResultListener \
  -P weka.experiment.RandomSplitResultProducer -- \
  -W weka.experiment.ClassifierSplitEvaluator -- \
  -W weka.classifiers.rules.OneR

(Try "java weka.experiment.Experiment -h" to find out what these options mean.)

If you have your results as a set of instances, you can perform paired t-tests using weka.experiment.PairedTTester (use the -h option to find out what options it needs).

However, all this is much easier if you use the Experimenter GUI.

----------------------------------------------------------------------
6. Tutorial:
------------

A tutorial on how to use WEKA is in $WEKAHOME/Tutorial.pdf. However, not everything in WEKA is covered in the Tutorial, and the package structure has changed quite a bit. For a complete list of classes you have to look at the online documentation in $WEKAHOME/doc/packages.html. In particular, Tutorial.pdf is a draft from the "Data Mining" book (see our web page), and so only describes features in the stable 3.0 release.

----------------------------------------------------------------------
7. Source code:
---------------

The source code for WEKA is in $WEKAHOME/weka-src.jar. To expand it, use the jar utility that's in every Java distribution.

----------------------------------------------------------------------
8. Credits:
-----------

Refer to the web page for an up-to-date list of contributors:

http://www.cs.waikato.ac.nz/~ml/weka/

----------------------------------------------------------------------
9. Call for code and bug reports:
---------------------------------

If you have implemented a learning scheme, filter, application, visualization tool, etc. using the WEKA classes, and you think it should be included in WEKA, send us the code, and we can put it in the next WEKA distribution.

If you find any bugs, send a fix to mlcontrib@cs.waikato.ac.nz. If that's too hard, just send a bug report to the wekalist mailing list.

----------------------------------------------------------------------
10. Copyright:
--------------

WEKA is distributed under the GNU General Public License. Please read the file COPYING.

----------------------------------------------------------------------