<CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#outputWordCountsTipText()">outputWordCountsTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setAttributeNamePrefix(java.lang.String)">setAttributeNamePrefix</A></B>(java.lang.String newPrefix)</CODE><BR> Set the attribute name prefix.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setDelimiters(java.lang.String)">setDelimiters</A></B>(java.lang.String newDelimiters)</CODE><BR> Set the value of delimiters.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setIDFTransform(boolean)">setIDFTransform</A></B>(boolean IDFTransform)</CODE><BR> Sets whether the word frequencies in a document should be transformed into: <br> fij*log(num of Docs/num of Docs with word i) <br> where fij is the frequency of word i in document (instance) j.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setInputFormat(weka.core.Instances)">setInputFormat</A></B>(<A HREF="../../../../weka/core/Instances.html" title="class in weka.core">Instances</A> instanceInfo)</CODE><BR> Sets the format of the input instances.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD 
ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setLowerCaseTokens(boolean)">setLowerCaseTokens</A></B>(boolean downCaseTokens)</CODE><BR> Sets whether the tokens are to be converted to lower case.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setNormalizeDocLength(boolean)">setNormalizeDocLength</A></B>(boolean normalizeDocLength)</CODE><BR> Sets whether the word frequencies for a document (instance) should be normalized.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOnlyAlphabeticTokens(boolean)">setOnlyAlphabeticTokens</A></B>(boolean tokenizeOnlyAlphabeticSequences)</CODE><BR> Sets whether tokens are to be formed only from contiguous alphabetic character sequences.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOptions(java.lang.String[])">setOptions</A></B>(java.lang.String[] options)</CODE><BR> Parses a given list of options controlling the behaviour of this object.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOutputWordCounts(boolean)">setOutputWordCounts</A></B>(boolean outputWordCounts)</CODE><BR> Sets whether output instances contain 0 or 1 indicating word presence, or word 
counts.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setSelectedRange(java.lang.String)">setSelectedRange</A></B>(java.lang.String newSelectedRange)</CODE><BR> Set the value of m_SelectedRange.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setTFTransform(boolean)">setTFTransform</A></B>(boolean TFTransform)</CODE><BR> Sets whether the word frequencies should be transformed into log(1+fij) where fij is the frequency of word i in document (instance) j.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setUseStoplist(boolean)">setUseStoplist</A></B>(boolean useStoplist)</CODE><BR> Sets whether words that are on the stoplist are to be ignored (the stoplist is in weka.core.StopWords).</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setWordsToKeep(int)">setWordsToKeep</A></B>(int newWordsToKeep)</CODE><BR> Sets the number of words (per class if there is a class attribute assigned) to attempt to keep.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#TFTransformTipText()">TFTransformTipText</A></B>()</CODE><BR> Returns the tip text for this 
property</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#useStoplistTipText()">useStoplistTipText</A></B>()</CODE><BR> Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#wordsToKeepTipText()">wordsToKeepTipText</A></B>()</CODE><BR> Returns the tip text for this property</TD></TR></TABLE> <A NAME="methods_inherited_from_class_weka.filters.Filter"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TD><B>Methods inherited from class weka.filters.<A HREF="../../../../weka/filters/Filter.html" title="class in weka.filters">Filter</A></B></TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../weka/filters/Filter.html#batchFilterFile(weka.filters.Filter, java.lang.String[])">batchFilterFile</A>, <A HREF="../../../../weka/filters/Filter.html#filterFile(weka.filters.Filter, java.lang.String[])">filterFile</A>, <A HREF="../../../../weka/filters/Filter.html#getOutputFormat()">getOutputFormat</A>, <A HREF="../../../../weka/filters/Filter.html#inputFormat(weka.core.Instances)">inputFormat</A>, <A HREF="../../../../weka/filters/Filter.html#isOutputFormatDefined()">isOutputFormatDefined</A>, <A HREF="../../../../weka/filters/Filter.html#numPendingOutput()">numPendingOutput</A>, <A HREF="../../../../weka/filters/Filter.html#output()">output</A>, <A HREF="../../../../weka/filters/Filter.html#outputFormat()">outputFormat</A>, <A HREF="../../../../weka/filters/Filter.html#outputPeek()">outputPeek</A>, <A 
HREF="../../../../weka/filters/Filter.html#useFilter(weka.core.Instances, weka.filters.Filter)">useFilter</A></CODE></TD></TR></TABLE> <A NAME="methods_inherited_from_class_java.lang.Object"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TD><B>Methods inherited from class java.lang.Object</B></TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE>equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait</CODE></TD></TR></TABLE> <P><!-- ============ FIELD DETAIL =========== --><!-- ========= CONSTRUCTOR DETAIL ======== --><A NAME="constructor_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TD COLSPAN=1><FONT SIZE="+2"><B>Constructor Detail</B></FONT></TD></TR></TABLE><A NAME="StringToWordVector()"><!-- --></A><H3>StringToWordVector</H3><PRE>public <B>StringToWordVector</B>()</PRE><DL><DD>Default constructor. 
Targets 1000 words in the output.<P></DL><HR><A NAME="StringToWordVector(int)"><!-- --></A><H3>StringToWordVector</H3><PRE>public <B>StringToWordVector</B>(int wordsToKeep)</PRE><DL><DD>Constructor that allows specification of the target number of words in the output.<P><DT><B>Parameters:</B><DD><CODE>wordsToKeep</CODE> - the number of words in the output vector (per class if assigned).</DL><!-- ============ METHOD DETAIL ========== --><A NAME="method_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TD COLSPAN=1><FONT SIZE="+2"><B>Method Detail</B></FONT></TD></TR></TABLE><A NAME="listOptions()"><!-- --></A><H3>listOptions</H3><PRE>public java.util.Enumeration <B>listOptions</B>()</PRE><DL><DD>Returns an enumeration describing the available options.<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../weka/core/OptionHandler.html#listOptions()">listOptions</A></CODE> in interface <CODE><A HREF="../../../../weka/core/OptionHandler.html" title="interface in weka.core">OptionHandler</A></CODE></DL></DD><DD><DL><DT><B>Returns:</B><DD>an enumeration of all the available options</DL></DD></DL><HR><A NAME="setOptions(java.lang.String[])"><!-- --></A><H3>setOptions</H3><PRE>public void <B>setOptions</B>(java.lang.String[] options) throws java.lang.Exception</PRE><DL><DD>Parses a given list of options controlling the behaviour of this object. Valid options are:<p> -C<br> Output word counts rather than boolean word presence.<p> -D delimiter_characters <br> Specify set of delimiter characters (default: " \n\t.,:'\\\"()?!\")<p> -R index1,index2-index4,...<br> Specify list of string attributes to convert to words. (default: all string attributes)<p> -P attribute_name_prefix <br> Specify a prefix for the created attribute names. (default: "")<p> -W number_of_words_to_keep <br> Specify number of word fields to create. Other, less useful words will be discarded. 
(default: 1000)<p> -A <br> Only tokenize contiguous alphabetic sequences. <p> -L <br> Convert all tokens to lower case before adding to the dictionary. <p> -S <br> Do not add words to the dictionary which are on the stop list. <p> -T <br> Transform word frequencies to log(1+fij) where fij is the frequency of word i in document j. <p> -I <br> Transform word frequencies to fij*log(numOfDocs/numOfDocsWithWordi) where fij is the frequency of word i in document j. <p> -N <br> Normalize word frequencies for each document (instance). The frequencies are normalized to the average length of the documents specified in the input format. <p><P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../weka/core/OptionHandler.html#setOptions(java.lang.String[])">setOptions</A></CODE> in interface <CODE><A HREF="../../../../weka/core/OptionHandler.html" title="interface in weka.core">OptionHandler</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>options</CODE> - the list of options as an array of strings<DT><B>Throws:</B><DD><CODE>java.lang.Exception</CODE> - if an option is not supported</DL></DD></DL><HR><A NAME="getOptions()"><!-- --></A><H3>getOptions</H3><PRE>public java.lang.String[] <B>getOptions</B>()</PRE><DL><DD>Gets the current settings of the filter.<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../weka/core/OptionHandler.html#getOptions()">getOptions</A></CODE> in interface <CODE><A HREF="../../../../weka/core/OptionHandler.html" title="interface in weka.core">OptionHandler</A></CODE></DL></DD><DD><DL><DT><B>Returns:</B><DD>an array of strings suitable for passing to setOptions</DL></DD></DL><HR><A NAME="setInputFormat(weka.core.Instances)"><!-- --></A><H3>setInputFormat</H3><PRE>public boolean <B>setInputFormat</B>(<A HREF="../../../../weka/core/Instances.html" title="class in weka.core">Instances</A> instanceInfo) throws java.lang.Exception</PRE><DL><DD>Sets the format of the input instances.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A 
HREF="../../../../weka/filters/Filter.html#setInputFormat(weka.core.Instances)">setInputFormat</A></CODE> in class <CODE><A HREF="../../../../weka/filters/Filter.html" title="class in weka.filters">Filter</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>instanceInfo</CODE> - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).<DT><B>Returns:</B><DD>true if the outputFormat may be collected immediately<DT><B>Throws:</B><DD><CODE>java.lang.Exception</CODE> - if the input format can't be set successfully</DL></DD></DL><HR><A NAME="input(weka.core.Instance)"><!-- --></A><H3>input</H3><PRE>public boolean <B>input</B>(<A HREF="../../../../weka/core/Instance.html" title="class in weka.core">Instance</A> instance) throws java.lang.Exception</PRE><DL><DD>Input an instance for filtering. Filter requires all training instances be read before producing output.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A HREF="../../../../weka/filters/Filter.html#input(weka.core.Instance)">input</A></CODE> in class <CODE><A HREF="../../../../weka/filters/Filter.html" title="class in weka.filters">Filter</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>instance</CODE> - the input instance.<DT><B>Returns:</B><DD>true if the filtered instance may now be collected with output().<DT><B>Throws:</B><DD><CODE>java.lang.IllegalStateException</CODE> - if no input structure has been defined.<DD><CODE>java.lang.Exception</CODE> - if the input instance was not of the correct