stringtowordvector.html
</TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getIDFTransform()">getIDFTransform</A></B>()</CODE><BR> Gets whether the word frequencies in a document should be transformed into: <br> fij*log(num of Docs/num of Docs with word i) <br> where fij is the frequency of word i in document (instance) j.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getInvertSelection()">getInvertSelection</A></B>()</CODE><BR> Gets whether the supplied columns are to be processed or skipped.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getLowerCaseTokens()">getLowerCaseTokens</A></B>()</CODE><BR> Gets whether the tokens are to be converted to lowercase.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> int</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getMinTermFreq()">getMinTermFreq</A></B>()</CODE><BR> Gets the MinTermFreq value.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> <A HREF="../../../../weka/core/SelectedTag.html" title="class in weka.core">SelectedTag</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getNormalizeDocLength()">getNormalizeDocLength</A></B>()</CODE><BR> Gets whether the word frequencies for a document (instance) should be normalized or 
not.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String[]</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getOptions()">getOptions</A></B>()</CODE><BR> Gets the current settings of the filter.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getOutputWordCounts()">getOutputWordCounts</A></B>()</CODE><BR> Gets whether output instances contain 0 or 1 indicating word presence, or word counts.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> double</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getPeriodicPruning()">getPeriodicPruning</A></B>()</CODE><BR> Gets the rate at which the dictionary is periodically pruned, as a percentage of the dataset size.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getRevision()">getRevision</A></B>()</CODE><BR> Returns the revision string.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> <A HREF="../../../../weka/core/Range.html" title="class in weka.core">Range</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getSelectedRange()">getSelectedRange</A></B>()</CODE><BR> Get the value of m_SelectedRange.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> <A 
HREF="../../../../weka/core/stemmers/Stemmer.html" title="interface in weka.core.stemmers">Stemmer</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getStemmer()">getStemmer</A></B>()</CODE><BR> Returns the current stemming algorithm, null if none is used.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.io.File</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getStopwords()">getStopwords</A></B>()</CODE><BR> Returns the file used for obtaining the stopwords; if the file represents a directory, the default stopwords are used.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getTFTransform()">getTFTransform</A></B>()</CODE><BR> Gets whether the word frequencies should be transformed into log(1+fij), where fij is the frequency of word i in document (instance) j.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> <A HREF="../../../../weka/core/tokenizers/Tokenizer.html" title="class in weka.core.tokenizers">Tokenizer</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getTokenizer()">getTokenizer</A></B>()</CODE><BR> Returns the current tokenizer algorithm.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getUseStoplist()">getUseStoplist</A></B>()</CODE><BR> Gets whether the words on the stoplist are to be ignored (the stoplist is in weka.core.StopWords).</TD></TR><TR 
BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> int</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#getWordsToKeep()">getWordsToKeep</A></B>()</CODE><BR> Gets the number of words (per class if there is a class attribute assigned) to attempt to keep.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#globalInfo()">globalInfo</A></B>()</CODE><BR> Returns a string describing this filter.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#IDFTransformTipText()">IDFTransformTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#input(weka.core.Instance)">input</A></B>(<A HREF="../../../../weka/core/Instance.html" title="class in weka.core">Instance</A> instance)</CODE><BR> Input an instance for filtering.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#invertSelectionTipText()">invertSelectionTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.util.Enumeration</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#listOptions()">listOptions</A></B>()</CODE><BR> Returns an enumeration describing the available options.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#lowerCaseTokensTipText()">lowerCaseTokensTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#main(java.lang.String[])">main</A></B>(java.lang.String[] argv)</CODE><BR> Main method for testing this class.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#minTermFreqTipText()">minTermFreqTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#normalizeDocLengthTipText()">normalizeDocLengthTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#outputWordCountsTipText()">outputWordCountsTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" 
CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#periodicPruningTipText()">periodicPruningTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setAttributeIndices(java.lang.String)">setAttributeIndices</A></B>(java.lang.String rangeList)</CODE><BR> Sets which attributes are to be worked on.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setAttributeIndicesArray(int[])">setAttributeIndicesArray</A></B>(int[] attributes)</CODE><BR> Sets which attributes are to be processed.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setAttributeNamePrefix(java.lang.String)">setAttributeNamePrefix</A></B>(java.lang.String newPrefix)</CODE><BR> Set the attribute name prefix.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setDoNotOperateOnPerClassBasis(boolean)">setDoNotOperateOnPerClassBasis</A></B>(boolean newDoNotOperateOnPerClassBasis)</CODE><BR> Set the DoNotOperateOnPerClassBasis value.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> 
void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setIDFTransform(boolean)">setIDFTransform</A></B>(boolean IDFTransform)</CODE><BR> Sets whether the word frequencies in a document should be transformed into: <br> fij*log(num of Docs/num of Docs with word i) <br> where fij is the frequency of word i in document (instance) j.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setInputFormat(weka.core.Instances)">setInputFormat</A></B>(<A HREF="../../../../weka/core/Instances.html" title="class in weka.core">Instances</A> instanceInfo)</CODE><BR> Sets the format of the input instances.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setInvertSelection(boolean)">setInvertSelection</A></B>(boolean invert)</CODE><BR> Sets whether selected columns should be processed or skipped.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setLowerCaseTokens(boolean)">setLowerCaseTokens</A></B>(boolean downCaseTokens)</CODE><BR> Sets whether the tokens are to be converted to lowercase.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setMinTermFreq(int)">setMinTermFreq</A></B>(int newMinTermFreq)</CODE><BR> Sets the MinTermFreq value.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD 
ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setNormalizeDocLength(weka.core.SelectedTag)">setNormalizeDocLength</A></B>(<A HREF="../../../../weka/core/SelectedTag.html" title="class in weka.core">SelectedTag</A> newType)</CODE><BR> Sets whether the word frequencies for a document (instance) should be normalized.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOptions(java.lang.String[])">setOptions</A></B>(java.lang.String[] options)</CODE><BR> Parses a given list of options.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOutputWordCounts(boolean)">setOutputWordCounts</A></B>(boolean outputWordCounts)</CODE><BR> Sets whether output instances contain 0 or 1 indicating word presence, or word counts.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1">