stringtowordvector.html


<CODE>&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#outputWordCountsTipText()">outputWordCountsTipText</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Returns the tip text for this property</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setAttributeNamePrefix(java.lang.String)">setAttributeNamePrefix</A></B>(java.lang.String&nbsp;newPrefix)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Set the attribute name prefix.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setDelimiters(java.lang.String)">setDelimiters</A></B>(java.lang.String&nbsp;newDelimiters)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Set the value of delimiters.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setIDFTransform(boolean)">setIDFTransform</A></B>(boolean&nbsp;IDFTransform)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets whether the word frequencies in a document should be transformed into: <br> fij*log(num of Docs/num of Docs with word i) <br>      where fij is the frequency of word i in document (instance) j.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;boolean</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setInputFormat(weka.core.Instances)">setInputFormat</A></B>(<A HREF="../../../../weka/core/Instances.html" title="class in weka.core">Instances</A>&nbsp;instanceInfo)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets the format of the input instances.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setLowerCaseTokens(boolean)">setLowerCaseTokens</A></B>(boolean&nbsp;downCaseTokens)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets whether the tokens are to be converted to lower case.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setNormalizeDocLength(boolean)">setNormalizeDocLength</A></B>(boolean&nbsp;normalizeDocLength)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets whether the word frequencies for a document (instance) should be normalized.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOnlyAlphabeticTokens(boolean)">setOnlyAlphabeticTokens</A></B>(boolean&nbsp;tokenizeOnlyAlphabeticSequences)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets whether tokens are to be formed only from contiguous alphabetic character sequences.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOptions(java.lang.String[])">setOptions</A></B>(java.lang.String[]&nbsp;options)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Parses a given list of options controlling the behaviour of this object.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setOutputWordCounts(boolean)">setOutputWordCounts</A></B>(boolean&nbsp;outputWordCounts)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets whether output instances contain 0 or 1 indicating word presence, or word counts.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setSelectedRange(java.lang.String)">setSelectedRange</A></B>(java.lang.String&nbsp;newSelectedRange)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Set the value of m_SelectedRange.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setTFTransform(boolean)">setTFTransform</A></B>(boolean&nbsp;TFTransform)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets whether the word frequencies should be transformed into log(1+fij), where fij is the frequency of word i in document (instance) j.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setUseStoplist(boolean)">setUseStoplist</A></B>(boolean&nbsp;useStoplist)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets whether words that are on the stop list are to be ignored (the stop list is in weka.core.StopWords).</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#setWordsToKeep(int)">setWordsToKeep</A></B>(int&nbsp;newWordsToKeep)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Sets the number of words (per class if there is a class attribute assigned) to attempt to keep.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#TFTransformTipText()">TFTransformTipText</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Returns the tip text for this property</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#useStoplistTipText()">useStoplistTipText</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Returns the tip text for this property.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../weka/filters/unsupervised/attribute/StringToWordVector.html#wordsToKeepTipText()">wordsToKeepTipText</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Returns the tip text for this property</TD></TR></TABLE>&nbsp;<A NAME="methods_inherited_from_class_weka.filters.Filter"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TD><B>Methods inherited from class weka.filters.<A HREF="../../../../weka/filters/Filter.html" title="class in weka.filters">Filter</A></B></TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../weka/filters/Filter.html#batchFilterFile(weka.filters.Filter, java.lang.String[])">batchFilterFile</A>, <A HREF="../../../../weka/filters/Filter.html#filterFile(weka.filters.Filter, java.lang.String[])">filterFile</A>, <A HREF="../../../../weka/filters/Filter.html#getOutputFormat()">getOutputFormat</A>, <A HREF="../../../../weka/filters/Filter.html#inputFormat(weka.core.Instances)">inputFormat</A>, <A HREF="../../../../weka/filters/Filter.html#isOutputFormatDefined()">isOutputFormatDefined</A>, <A HREF="../../../../weka/filters/Filter.html#numPendingOutput()">numPendingOutput</A>, <A HREF="../../../../weka/filters/Filter.html#output()">output</A>, <A HREF="../../../../weka/filters/Filter.html#outputFormat()">outputFormat</A>, <A HREF="../../../../weka/filters/Filter.html#outputPeek()">outputPeek</A>, <A HREF="../../../../weka/filters/Filter.html#useFilter(weka.core.Instances, weka.filters.Filter)">useFilter</A></CODE></TD></TR></TABLE>&nbsp;<A NAME="methods_inherited_from_class_java.lang.Object"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TD><B>Methods inherited from class java.lang.Object</B></TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE>equals, getClass, hashCode, notify, notifyAll, 
toString, wait, wait, wait</CODE></TD></TR></TABLE>&nbsp;<P><!-- ============ FIELD DETAIL =========== --><!-- ========= CONSTRUCTOR DETAIL ======== --><A NAME="constructor_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TD COLSPAN=1><FONT SIZE="+2"><B>Constructor Detail</B></FONT></TD></TR></TABLE><A NAME="StringToWordVector()"><!-- --></A><H3>StringToWordVector</H3><PRE>public <B>StringToWordVector</B>()</PRE><DL><DD>Default constructor. Targets 1000 words in the output.<P></DL><HR><A NAME="StringToWordVector(int)"><!-- --></A><H3>StringToWordVector</H3><PRE>public <B>StringToWordVector</B>(int&nbsp;wordsToKeep)</PRE><DL><DD>Constructor that allows specification of the target number of words in the output.<P><DT><B>Parameters:</B><DD><CODE>wordsToKeep</CODE> - the number of words in the output vector (per class if assigned).</DL><!-- ============ METHOD DETAIL ========== --><A NAME="method_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TD COLSPAN=1><FONT SIZE="+2"><B>Method Detail</B></FONT></TD></TR></TABLE><A NAME="listOptions()"><!-- --></A><H3>listOptions</H3><PRE>public java.util.Enumeration <B>listOptions</B>()</PRE><DL><DD>Returns an enumeration describing the available options<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../weka/core/OptionHandler.html#listOptions()">listOptions</A></CODE> in interface <CODE><A HREF="../../../../weka/core/OptionHandler.html" title="interface in weka.core">OptionHandler</A></CODE></DL></DD><DD><DL><DT><B>Returns:</B><DD>an enumeration of all the available options</DL></DD></DL><HR><A NAME="setOptions(java.lang.String[])"><!-- --></A><H3>setOptions</H3><PRE>public void <B>setOptions</B>(java.lang.String[]&nbsp;options)                throws java.lang.Exception</PRE><DL><DD>Parses a given list of options controlling the 
behaviour of this object. Valid options are:<p> -C<br> Output word counts rather than boolean word presence.<p>  -D delimiter_characters <br> Specify set of delimiter characters (default: " \n\t.,:'\\\"()?!\")<p> -R index1,index2-index4,...<br> Specify list of string attributes to convert to words. (default: all string attributes)<p> -P attribute_name_prefix <br> Specify a prefix for the created attribute names. (default: "")<p> -W number_of_words_to_keep <br> Specify number of word fields to create. Other, less useful words will be discarded. (default: 1000)<p> -A <br> Only tokenize contiguous alphabetic sequences. <p> -L <br> Convert all tokens to lower case before adding to the dictionary. <p> -S <br> Do not add words to the dictionary which are on the stop list. <p> -T <br> Transform word frequencies to log(1+fij), where fij is the frequency of word i in document j. <p> -I <br> Transform word frequencies to fij*log(numOfDocs/numOfDocsWithWordi), where fij is the frequency of word i in document j. <p> -N <br> Normalize word frequencies for each document (instance). The frequencies are normalized to the average length of the documents specified in the input format. 
<p><P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../weka/core/OptionHandler.html#setOptions(java.lang.String[])">setOptions</A></CODE> in interface <CODE><A HREF="../../../../weka/core/OptionHandler.html" title="interface in weka.core">OptionHandler</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>options</CODE> - the list of options as an array of strings<DT><B>Throws:</B><DD><CODE>java.lang.Exception</CODE> - if an option is not supported</DL></DD></DL><HR><A NAME="getOptions()"><!-- --></A><H3>getOptions</H3><PRE>public java.lang.String[] <B>getOptions</B>()</PRE><DL><DD>Gets the current settings of the filter.<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../weka/core/OptionHandler.html#getOptions()">getOptions</A></CODE> in interface <CODE><A HREF="../../../../weka/core/OptionHandler.html" title="interface in weka.core">OptionHandler</A></CODE></DL></DD><DD><DL><DT><B>Returns:</B><DD>an array of strings suitable for passing to setOptions</DL></DD></DL><HR><A NAME="setInputFormat(weka.core.Instances)"><!-- --></A><H3>setInputFormat</H3><PRE>public boolean <B>setInputFormat</B>(<A HREF="../../../../weka/core/Instances.html" title="class in weka.core">Instances</A>&nbsp;instanceInfo)                       throws java.lang.Exception</PRE><DL><DD>Sets the format of the input instances.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A HREF="../../../../weka/filters/Filter.html#setInputFormat(weka.core.Instances)">setInputFormat</A></CODE> in class <CODE><A HREF="../../../../weka/filters/Filter.html" title="class in weka.filters">Filter</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>instanceInfo</CODE> - an Instances object containing the input  instance structure (any instances contained in the object are  ignored - only the structure is required).<DT><B>Returns:</B><DD>true if the outputFormat may be collected immediately<DT><B>Throws:</B><DD><CODE>java.lang.Exception</CODE> - if the input format can't be set  
successfully</DL></DD></DL><HR><A NAME="input(weka.core.Instance)"><!-- --></A><H3>input</H3><PRE>public boolean <B>input</B>(<A HREF="../../../../weka/core/Instance.html" title="class in weka.core">Instance</A>&nbsp;instance)              throws java.lang.Exception</PRE><DL><DD>Input an instance for filtering. Filter requires all training instances be read before producing output.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A HREF="../../../../weka/filters/Filter.html#input(weka.core.Instance)">input</A></CODE> in class <CODE><A HREF="../../../../weka/filters/Filter.html" title="class in weka.filters">Filter</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>instance</CODE> - the input instance.<DT><B>Returns:</B><DD>true if the filtered instance may now be collected with output().<DT><B>Throws:</B><DD><CODE>java.lang.IllegalStateException</CODE> - if no input structure has been defined.<DD><CODE>java.lang.Exception</CODE> - if the input instance was not of the correct 
