readme.txt
The Recurrent Sliding Window (RSW) classifier is to be used with the WEKA machine learning package. WEKA has been developed at the Computer Science Department of the University of Waikato, New Zealand.

1. An introduction to sequential data classification and recurrent sliding windows:

The standard supervised learning problem is to learn to map from an input feature vector x to an output class variable y given N training examples. Many recent learning problems can be viewed as extensions of standard supervised learning to the setting where each input object X is a sequence of feature vectors X = (x1, ..., xT), and the corresponding output object Y is a sequence of class labels Y = (y1, ..., yT). The sequential supervised learning (SSL) problem is to learn to map from X to Y given a set of N training examples {(X1, Y1), ..., (XN, YN)}.

Several recent learning systems involve solving SSL problems. One example is the famous NETtalk problem of learning to pronounce English words. Each training example consists of a sequence of letters (e.g., "enough") and a corresponding output sequence of phonemes (e.g., "In^-f-"). Another example is the problem of part-of-speech tagging, in which the input is a sequence of words (e.g., "do you want fries with that?") and the output is a sequence of parts of speech (e.g., "verb pron verb noun prep pron"). A third example is the problem of information extraction from web pages, in which the input is a sequence of tokens from a web page and the output is a sequence of field labels.

In the literature, two general strategies for solving SSL problems have been studied. One strategy, which we might call the "direct" approach, is to develop probabilistic models of sequential data, such as Hidden Markov Models (HMMs) and Conditional Random Fields (CRFs). These methods directly learn a model of the sequential data. The other general strategy that has been explored might be called the "indirect" approach (i.e., a "hack").
In this strategy, the sequential supervised learning problem is solved indirectly: it is first converted into a standard supervised learning problem, that problem is solved, and the results are then converted into a solution to the SSL problem. Specifically, the indirect approach converts the input sequence X and output sequence Y into a set of "windows" {w1, ..., w6}, as shown in the table below. Each window wt consists of a central element xt and some number of elements to the left and right of xt. The output of the window is the corresponding label yt. We will denote the number of elements to the left of xt as the Left Input Context (LIC), and the number of elements to the right of xt as the Right Input Context (RIC). In the example, LIC = RIC = 3. Contextual positions before the start of the sequence or after the end of the sequence are filled with a designated null value (in this case "_").

Simple Sliding Windows:

original SSL training example: (X, Y) where X = "enough" and Y = "In^-f-"

derived windows:    input elements:    output class label:
 w1                 _ _ _ e n o u      I
 w2                 _ _ e n o u g      n
 w3                 _ e n o u g h      ^
 w4                 e n o u g h _      -
 w5                 n o u g h _ _      f
 w6                 o u g h _ _ _      -

In this example, each input element is a single character, but in general each input element can be a vector of features.

The process of converting SSL training examples into windows is called "windowizing". The resulting windowized examples can be provided as input to any standard supervised learning algorithm, which will learn a classifier that takes an input window and predicts an output class. The RSW package can then take this classifier and apply it to classify additional windowized SSL examples. RSW computes two measures of error rate: (i) the error rate on individual windows, and (ii) the error rate on entire sequences. An entire (X, Y) sequence is classified incorrectly if any of its windows is misclassified.

The RSW package also supports recurrent sliding windows.
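As a concrete illustration, the simple windowizing step described above can be sketched in a few lines of Python. The function name `windowize` is ours, purely for illustration — it is not part of the RSW package, which performs this conversion internally.

```python
def windowize(x, y, lic=3, ric=3, null="_"):
    """Convert one SSL example (x, y) into (window, label) pairs.

    lic/ric are the left/right input context sizes; positions before the
    start or after the end of the sequence are padded with the null value.
    """
    padded = [null] * lic + list(x) + [null] * ric
    windows = []
    for t in range(len(x)):
        # Window t covers lic elements left of x[t], x[t] itself,
        # and ric elements to its right.
        windows.append((padded[t:t + lic + 1 + ric], y[t]))
    return windows

# Reproduces the "enough" example: the first window is  _ _ _ e n o u -> I
windows = windowize("enough", "In^-f-")
```

With LIC = RIC = 3 this yields exactly the six (window, label) pairs shown in the table above.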
In recurrent sliding windows, the predicted outputs of the window classifier at positions t-1, t-2, ... are provided as additional input features when making the prediction at position t. These additional input features are called the Output Context. Recurrent sliding windows require that the X sequence be processed in one direction, either left-to-right or right-to-left. If the sequence is processed left-to-right, the Left Output Context (LOC) parameter specifies the number of previous positions whose predictions are fed back as additional inputs. If the sequence is processed right-to-left, the Right Output Context (ROC) parameter specifies the number of positions whose predictions are fed back as additional inputs. No more than one of LOC and ROC can be nonzero. If LOC = ROC = 0, then only simple sliding windows are produced.

The table below shows the case LOC = 1 (and LIC = RIC = 3, as before). When the training data are windowized, the correct label for yt-1 is provided as an input feature for predicting yt.

Recurrent Sliding Windows:

(X, Y): X = "enough", Y = "In^-f-"

 w1     _ _ _ e n o u _    I
 w2     _ _ e n o u g I    n
 w3     _ e n o u g h n    ^
 w4     e n o u g h _ ^    -
 w5     n o u g h _ _ -    f
 w6     o u g h _ _ _ f    -

These windowized examples can then be provided to a standard supervised learning algorithm, which learns to map the input windows to the output labels. The learned function can then be applied by RSW to classify new (X, Y) sequences. The new test sequences are windowized dynamically, using the predicted value yhat at position t-1 as the extra input feature for predicting yt. As with simple sliding windows, two error rates are computed: (i) the fraction of individual elements incorrectly predicted, and (ii) the fraction of entire (X, Y) sequences incorrectly predicted.

2. Structure of the ARFF file for sequential data:

For RSW, the ARFF file contains one example for each element of each training sequence.
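The per-element row layout can be sketched as follows. `to_arff_rows` is an illustrative helper of our own, not part of the RSW package, and it assumes a single nominal feature per element, as in the example ARFF file below.

```python
def to_arff_rows(sequences):
    """Emit @data lines in RSW's sequential format: each row holds the
    sequence number, the element number (both counting from 1), the
    element's feature value, and its class label."""
    rows = []
    for seq_num, (x, y) in enumerate(sequences, start=1):
        for elem_num, (xt, yt) in enumerate(zip(x, y), start=1):
            rows.append(f"{seq_num}, {elem_num}, {xt}, {yt}")
    return rows

# Two training sequences, (bad, BAD) and (feed, FEED):
rows = to_arff_rows([("bad", ["B1", "A1", "D1"]),
                     ("feed", ["F1", "E1", "E1", "D1"])])
```

The rows produced match the @data section of the example file shown next.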
The data file has the standard ARFF format, with the following extensions. The first attribute on each line must be the sequence number: each training example (X, Y) is assigned a unique sequence number, in ascending order counting from 1. The second attribute on each line must be the element number: each position (xt, yt) for t = 1, ..., T (where T is the length of X and Y) must be assigned an element number, in ascending order counting from 1. The remaining attributes on each line provide the features that describe xt and the class label yt.

The following example ARFF file shows two training sequences, (bad, BAD) and (feed, FEED). Note that each attribute and the class variable must specify an extra null value that is used to pad context extending beyond the ends of the sequence. In this case, we have specified the null value "_".

@relation SampleSequentialData
@attribute sequence_number numeric
@attribute element_number numeric
@attribute feature1 {a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,_}
@attribute class {A1, B1, C1, D1, E1, F1, _}
@data
1, 1, b, B1
1, 2, a, A1
1, 3, d, D1
2, 1, f, F1
2, 2, e, E1
2, 3, e, E1
2, 4, d, D1

The type of the sequence_number and element_number attributes must be numeric. The sequence numbers must start at 1 and proceed serially, in increasing order without any breaks or jumps. The element numbers inside each sequence must likewise start at 1 and proceed serially in increasing order until the end of the sequence. The class attribute must be nominal. RSW converts data structured as above into windowized data as shown in the previous section. Non-nominal attributes remain unmodified.

3. Using RSW from the command line:

RSW is a meta classifier (like Bagging and AdaBoostM1); therefore, it requires a base classifier. To run RSW from the command line, do the following.

Preparation steps:

1. Move the files RSW.java and RSW.class to the directory weka/classifiers/meta.
2. Move the files SequentialClassifier.java and SequentialClassifier.class to the directory weka/classifiers.
3. Move the files DistributionSequentialClassifier.java and DistributionSequentialClassifier.class to the directory weka/classifiers.
4. Move the files SequentialEvaluation.java and SequentialEvaluation.class to the directory weka/classifiers.
5. Move the files Windowise.java and Windowise.class to the directory weka/classifiers.
6. Change your working directory to the directory holding the weka package.
7. Run RSW like any other classifier, with its command-line options:

	java RSW {options}

RSW takes all the options taken by any other classifier, e.g. -t "train.arff", -T "test.arff", etc. In addition to these, RSW requires 5 mandatory options:

-W  name of the base classifier
-A  left input context (LIC) for the sliding window
-B  right input context (RIC) for the sliding window
-Y  left output context (LOC) for the sliding window
-Z  right output context (ROC) for the sliding window

4. Using RSW from the GUI:

The directory weka/gui contains a file named GenericObjectEditor.props. In that file, below the lines saying

# Lists the Classifiers I want to choose from

there is a list of all the classifiers that may appear in the drop-down list of the GUI Explorer. Insert the following line as part of that list and save the file:

weka.classifiers.meta.RSW, \

Now open the GUI Explorer and use the RSW classifier just as you would use any other classifier.

5. Parameter Selection with RSW using cross-validation:

In our experience, it is important to choose the sizes of the input and output contexts carefully in order to obtain good results. If the contexts are too large, performance is hurt by the high variance (overfitting) of the base learning algorithm. If the contexts are too small, performance may be poor because not enough contextual information is available to the base learning algorithm.
We have observed cases where the input context should be very small while the output context is large, and also cases where the output context should be zero and the input context large. Furthermore, the choice of context may also depend on whether you seek to maximize the number of elements correctly classified or the number of entire sequences correctly classified.

A good way to choose the context parameter values (LIC, LOC, RIC, ROC) is to perform an internal cross-validation on the training data. RSW supports this with the SeqCVParameterSelection class. Follow these steps:

1. Move the files SeqCVParameterSelection.java and SeqCVParameterSelection.class to the directory weka/classifiers/meta.
2. Move the files Instances.java and Instances.class to the directory weka/core. You will have to replace the existing files of the same name in that directory.
3. Use SeqCVParameterSelection from the command line exactly as you would use CVParameterSelection, but with RSW as the base classifier. Please note that SeqCVParameterSelection will not accept any other base classifier.
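For readers scripting such a parameter search themselves, the space of candidate context settings can be enumerated as below. This is a sketch, not part of the RSW package; the ranges are illustrative, and the rule that no more than one of LOC and ROC may be nonzero is enforced by construction.

```python
from itertools import product

def candidate_contexts(max_input=3, max_output=2):
    """Enumerate (LIC, RIC, LOC, ROC) settings to evaluate with an
    internal cross-validation, honoring the constraint that at most
    one of LOC and ROC is nonzero."""
    for lic, ric in product(range(max_input + 1), repeat=2):
        yield (lic, ric, 0, 0)            # simple sliding windows
        for c in range(1, max_output + 1):
            yield (lic, ric, c, 0)        # left-to-right recurrence
            yield (lic, ric, 0, c)        # right-to-left recurrence
```

Each candidate would then be passed to RSW (via the -A, -B, -Y, -Z options) and scored by cross-validated error on individual elements or entire sequences, whichever you are optimizing.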
