=====================================================================

======                        README                           ======

                            WEKA 3.2.3
                            28 May 2002

                 Java Programs for Machine Learning

        Copyright (C) 1998, 1999, 2000, 2001  Eibe Frank,
           Leonard Trigg, Mark Hall, Richard Kirkby

                email: wekasupport@cs.waikato.ac.nz

=====================================================================

NOTE: We are following the Linux model of releases, where an even
second digit of a release number indicates a "stable" release and an
odd second digit indicates a "development" release (e.g. 3.0.x is a
stable release, and 3.1.x is a development release). If you are
using a development release, you might get to play with extra funky
features, but it is entirely possible that these features
come/go/transmogrify from one release to the next. If you require
stability (e.g. if you are using Weka for teaching), use a stable
release.

=====================================================================

Contents:
---------

1. Installation

2. Getting started

   - Classifiers
   - Association rules
   - Filters
   - Data format
   - Experiment package
   - GUIs

3. Tutorial

4. Source code

5. Credits

6. Submission of code and bug reports

7. Copyright

----------------------------------------------------------------------

1. Installation:
----------------

For people familiar with their command-line interface
-----------------------------------------------------

a) Set WEKAHOME to be the directory which contains this README.
b) Add $WEKAHOME/weka.jar to your CLASSPATH environment variable.
c) Bookmark $WEKAHOME/doc/packages.html in your web browser.

To start a simple GUI for using Weka
------------------------------------

If you are using Java 2 (JDK 1.2 or equivalent), or you have Swing
1.1.1 or later installed for Java 1.1, you should be able to just
double-click on the weka.jar icon, or from a command line (assuming
you are in the directory containing weka.jar) type

java -jar weka.jar

or if you are using Windows use

javaw -jar weka.jar

This will start a small GUI (GUIChooser) from which you can select the
SimpleCLI interface or the more sophisticated Explorer and
Experimenter interfaces (see below). SimpleCLI just acts like a simple
command shell and has been provided mainly for Mac users who don't
have their own shell :)

If you are using NT/Windows you may need to create a file association
before you can double-click on the weka.jar icon. Open the file
Explorer or a file browser window. Select View (or perhaps
Tools)->Options. Click on File Types. Click on New Type. Fill in the
Type field (put something like "java jar files"). Fill in the
Associated Extension ("jar"). Add a new Action, with Action name Open,
and application as "javaw.exe -jar" (you will probably need to browse
to the location of your JRE to get the path correct for javaw; you
will find javaw in the "bin" directory of wherever your JRE is
installed).

If you are using some other Java virtual machine you need to start
GUIChooser from within weka.jar. For JDK 1.1 users, use something
like the following:

java -classpath weka.jar:$CLASSPATH weka.gui.GUIChooser

or if you are using Windows use

javaw -classpath weka.jar;%CLASSPATH% weka.gui.GUIChooser

----------------------------------------------------------------------

2. Getting started:
-------------------

In the following, the names of files assume use of a unix command line
with environment variables. For other command lines (including
SimpleCLI) you should substitute the name of the directory where
weka.jar lives where you see $WEKAHOME. If your platform uses
something other than / as the path separator, also make the
appropriate substitutions.

===========
Classifiers
===========

Try:

java weka.classifiers.j48.J48 -t $WEKAHOME/data/iris.arff

This prints out a decision tree classifier for the iris dataset
and ten-fold cross-validation estimates of its performance. If you
don't pass any options to the classifier, WEKA will list all the
available options. Try:

java weka.classifiers.j48.J48

The options are divided into "general" options that apply to most
classification schemes in WEKA, and scheme-specific options that
only apply to the current scheme (in this case J48). WEKA has a
common interface to all classification methods. Any class that
implements a classifier can be used in the same way as J48 is used
above. WEKA knows that a class implements a classifier if it
extends the Classifier or DistributionClassifier classes in
weka.classifiers. Almost all classes in weka.classifiers fall into
this category.
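
For readers who have not met ten-fold cross-validation before, the
following is a rough sketch of how a dataset can be partitioned into
folds. It is not WEKA's own code: the class FoldSketch and its methods
are made up for illustration, the split is a plain shuffle with no
stratification, and WEKA's implementation may differ in both respects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of WEKA: splits instance indices into k folds. */
final class FoldSketch {

    /** Shuffles indices 0..n-1 with a fixed seed and deals them into k folds round-robin. */
    static List<List<Integer>> makeFolds(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /** The training set for one fold is every index that is not in that fold. */
    static List<Integer> trainingIndices(List<List<Integer>> folds, int testFold) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != testFold) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        // The iris dataset has 150 instances, so each of the 10 folds holds 15.
        List<List<Integer>> folds = makeFolds(150, 10, 1L);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f + ": test=" + folds.get(f).size()
                    + " train=" + trainingIndices(folds, f).size());
        }
    }
}
```

Each fold serves once as the test set while the remaining nine are
used for training; the reported estimate is then computed over all ten
test folds.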
Try, for example:

java weka.classifiers.NaiveBayes -t $WEKAHOME/data/labor.arff

Here is a list of the most important classifiers currently
implemented in weka.classifiers:

a) Classifiers for categorical prediction:

weka.classifiers.IBk: k-nearest neighbour learner
weka.classifiers.j48.J48: C4.5 decision trees
weka.classifiers.j48.PART: rule learner
weka.classifiers.NaiveBayes: naive Bayes with/without kernels
weka.classifiers.OneR: Holte's OneR
weka.classifiers.KernelDensity: kernel density classifier
weka.classifiers.SMO: support vector machines
weka.classifiers.Logistic: logistic regression
weka.classifiers.AdaBoostM1: AdaBoost
weka.classifiers.LogitBoost: logit boost
weka.classifiers.DecisionStump: decision stumps (for boosting)

b) Classifiers for numeric prediction:

weka.classifiers.LinearRegression: linear regression
weka.classifiers.m5.M5Prime: model trees
weka.classifiers.IBk: k-nearest neighbour learner
weka.classifiers.LWR: locally weighted regression
weka.classifiers.RegressionByDiscretization: uses categorical classifiers

=================
Association rules
=================

Besides classification schemes, WEKA contains other useful tools.
Association rules, for example, can be extracted using the Apriori
algorithm. Try

java weka.associations.Apriori -t $WEKAHOME/data/weather.nominal.arff

=======
Filters
=======

There are also a number of tools that allow you to manipulate a
dataset. These tools are called filters in WEKA and can be found
in weka.filters.

weka.filters.DiscretizeFilter: discretizes numeric data
weka.filters.AttributeFilter: deletes/selects attributes
etc.

Try:

java weka.filters.DiscretizeFilter -i $WEKAHOME/data/iris.arff -c last

===========
Data format
===========

Datasets in WEKA have to be formatted according to the arff format.
Examples of arff files can be found in $WEKAHOME/data. What follows
is a short description of the file format.
A dataset has to start with a declaration of its name:

@relation name

followed by a list of all the attributes in the dataset (including
the class attribute). These declarations have the form

@attribute attribute_name specification

If an attribute is nominal, specification contains a list of the
possible attribute values in curly brackets:

@attribute nominal_attribute {first_value, second_value, third_value}

If an attribute is numeric, specification is replaced by the keyword
numeric (integer values are treated as real numbers in WEKA):

@attribute numeric_attribute numeric

In addition to these two types of attributes, there also exists a
string attribute type. This attribute type makes it possible to
store a comment or ID field for each of the instances in a dataset:

@attribute string_attribute string

After the attribute declarations, the actual data is introduced by a

@data

tag, which is followed by a list of all the instances. The instances
are listed in comma-separated format, with a question mark
representing a missing value.

Comments are lines starting with %.

==================
Experiment package
==================

There is now support for running experiments that involve evaluating
classifiers on repeated randomizations of datasets, over multiple
datasets (you can do much more than this, besides). The classes for
this reside in the weka.experiment package. The basic architecture is
that a ResultProducer (which generates results on some randomization
of a dataset) sends results to a ResultListener (which is responsible
for stating whether it already has the result, and otherwise storing
results).

Example ResultListeners include:

weka.experiment.CSVResultListener: outputs results as
comma-separated-value files.
weka.experiment.InstancesResultListener: converts results into a set
of Instances.
weka.experiment.DatabaseResultListener: sends results to a database
via jdbc.
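
The producer/listener split described above can be pictured with a
small stand-alone sketch. Everything in it is an assumption made for
illustration: SketchListener, InMemoryListener and SketchProducer are
invented names, the string key and the placeholder result value are
made up, and none of this is the actual weka.experiment API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical listener role: says whether a result is still needed, and stores it. */
interface SketchListener {
    boolean isResultRequired(String key);
    void acceptResult(String key, double result);
}

/** Keeps results in memory; a stand-in for writing to a CSV file or a database. */
final class InMemoryListener implements SketchListener {
    final Map<String, Double> stored = new LinkedHashMap<>();

    public boolean isResultRequired(String key) {
        return !stored.containsKey(key);
    }

    public void acceptResult(String key, double result) {
        stored.put(key, result);
    }
}

/** Hypothetical producer role: one result per run, skipped if the listener has it already. */
final class SketchProducer {
    int computed = 0; // counts how many results were actually generated

    void doRun(String dataset, int run, SketchListener listener) {
        String key = dataset + "#" + run;
        if (!listener.isResultRequired(key)) {
            return; // the listener already holds this result, so do no work
        }
        computed++;
        // Placeholder value only; a real producer would train and test a classifier here.
        double result = run / 10.0;
        listener.acceptResult(key, result);
    }
}

final class ExperimentSketch {
    public static void main(String[] args) {
        InMemoryListener listener = new InMemoryListener();
        SketchProducer producer = new SketchProducer();
        // Two passes over the same three runs: the second pass finds everything stored.
        for (int pass = 0; pass < 2; pass++) {
            for (int run = 1; run <= 3; run++) {
                producer.doRun("iris", run, listener);
            }
        }
        // prints "3 stored, 3 computed"
        System.out.println(listener.stored.size() + " stored, " + producer.computed + " computed");
    }
}
```

Letting the listener decide whether a result is still required is
what allows an interrupted experiment to be restarted without
repeating work that has already been stored.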
Example ResultProducers include:

weka.experiment.RandomSplitResultProducer: train/test on a % split
weka.experiment.CrossValidationResultProducer: n-fold cross-validation
weka.experiment.AveragingResultProducer: averages results from another
ResultProducer
weka.experiment.DatabaseResultProducer: acts as a cache for results,
storing them in a database.

The RandomSplitResultProducer and CrossValidationResultProducer make
use of a SplitEvaluator to obtain actual results for a particular
split. Two are provided: ClassifierSplitEvaluator (for nominal
classification) and RegressionSplitEvaluator (for numeric
classification). Each of these uses a Classifier for actual results
generation.

So, you might have a DatabaseResultListener that is sent results from
an AveragingResultProducer, which produces averages over the n results
produced for each run of an n-fold CrossValidationResultProducer,
which in turn is doing nominal classification through a
ClassifierSplitEvaluator, which uses OneR for prediction. Whew. But
you can combine these things together to do pretty much whatever you
want. You might want to write a LearningRateResultProducer that splits
a dataset into increasing numbers of training instances.

In terms of database connectivity, we use InstantDB, a free database
implemented entirely in Java. It is available from:

http://www.instantdb.co.uk/index.htm

From there you will also be able to find an RmiJdbc bridge, which is
useful for running a server that just listens for experiment results
from other machines. When using classes that access a database, you
will probably want to create a properties file that specifies which
jdbc drivers to use, and where to find the database. This file should
reside in your home directory or the current directory and be called
"DatabaseUtils.props".
An example is provided in weka/experiment; this file is used unless
it is overridden by one in your home directory or the current
directory (in that order).

To run a simple experiment from the command line, try:

java weka.experiment.Experiment -r -T datasets/UCI/iris.arff  \
  -D weka.experiment.InstancesResultListener \
  -P weka.experiment.RandomSplitResultProducer -- \
  -W weka.experiment.ClassifierSplitEvaluator -- \
  -W weka.classifiers.OneR

(Try "java weka.experiment.Experiment -h" to find out what these
options mean.)

If you have your results as a set of instances, you can perform paired
t-tests using weka.experiment.PairedTTester (use the -h option to find
out what options it needs).

This is all much easier from the Experiment Environment GUI :-)

====
GUIs
====

We now have two GUIs to make using Weka a little easier: one that acts
much as the original interface to the old Weka 2 system, and one for
conducting experiments (see README_Experiment_Gui). Both of these
interfaces use Swing, so you need to either be using Java 2 or have
downloaded Swing 1.1.1 or later for your JDK 1.1. One of the
components of the GUIs is a generic object editor that requires a
configuration file, "GenericObjectEditor.props". There is an example
file in weka/gui. This file will be used unless it is overridden by
one in your home directory or the current directory (in that order).
The file simply specifies, for each superclass, which subclasses to
offer as choices: for example, which Classifiers are offered when an
object requires a property of type Classifier.

To start the Explorer:

java weka.gui.explorer.Explorer

To start the experiment editor:

java weka.gui.experiment.Experimenter

These _really_ need more documentation, but that'll do to get you
started :)

----------------------------------------------------------------------

3. Tutorial:
------------

A tutorial on how to use WEKA is in $WEKAHOME/Tutorial.pdf. However,
not everything in WEKA is covered in the Tutorial. For a complete
list you have to look at the online documentation in
$WEKAHOME/doc/packages.html. In particular, Tutorial.pdf is a draft
from the forthcoming book (see our web page), and so only describes
features in the stable 3.0 release.

----------------------------------------------------------------------

4. Source code:
---------------

The source code for WEKA is in $WEKAHOME/weka-src.jar. To expand it,
use the jar utility that's in every Java distribution.

----------------------------------------------------------------------

5. Credits:
-----------

Len Trigg           - weka.experiment, weka.gui, weka.gui.experiment,
                      weka.gui.explorer, weka.filters, weka.estimators,
                      weka.classifiers, weka.core
Eibe Frank          - weka.core, weka.classifiers,
                      weka.classifiers.j48, weka.filters,
                      weka.associations
Mark Hall           - weka.clusterers, weka.attributeSelection,
                      weka.classifiers.DecisionTable, weka.gui,
                      weka.gui.explorer, weka.gui.experiment,
                      weka.gui.visualize, weka.converters
Richard Kirkby      - weka.classifiers.adtree
Malcolm Ware        - weka.classifiers.neural,
                      weka.classifiers.UserClassifier
Bernhard Pfahringer - weka.classifiers.adtree
Yong Wang           - weka.classifiers.m5
Abdelaziz Mahoui    - weka.classifiers.kstar
Ian H. Witten       - weka.classifiers.OneR, weka.classifiers.Prism
Stuart Inglis       - weka.classifiers.IB1

... and others!

----------------------------------------------------------------------

6. Submission of code and bug reports:
--------------------------------------

If you have implemented a learning scheme, filter, application,
visualization tool, etc., using the WEKA classes, and you think it
should be included in WEKA, send us the code, and we can put it
in the next WEKA distribution.

If you find any bugs, send a fix to wekasupport@cs.waikato.ac.nz.
If that's too hard, just send us a bug report.

----------------------------------------------------------------------

7. Copyright:
-------------

WEKA is distributed under the GNU General Public License. Please read
the file COPYING.

----------------------------------------------------------------------
