📄 classifiers.xml

📁 Supervised learning methods with incremental learning. Includes several different classification algorithms.
<chapter id="classifiers">
  <title>Classifiers</title>
  <para>This chapter describes some of the classifiers available in <application>Select</application>.</para>
  <para>There are two main types of classifiers in <application>Select</application>: document classifiers and vector classifiers.</para>
  <sect1>
    <title>Document classifiers</title>
    <para>A document classifier takes a document as input.</para>
    <para>All document classifiers have type document.
<programlisting>type		document		# Document classifier</programlisting>
    </para>
    <sect2>
      <title>From classifier</title>
      <para>Classifies according to sender.</para>
      <sect3>
        <title>Options</title>
        <variablelist>
          <varlistentry>
            <term><option>n</option></term>
            <listitem><simpara>Specifies the maximum number of addresses to save.</simpara></listitem>
          </varlistentry>
          <varlistentry>
            <term><option>o</option></term>
            <listitem><simpara>Specifies the address eviction order.</simpara></listitem>
          </varlistentry>
        </variablelist>
      </sect3>
      <sect3>
        <title>Example</title>
        <para>
<programlisting>[classifier]
name		from			# Name the classifier from
classifier	From			# From classifier
type		document		# Document classifier
options		n=100,o=fifo		# 100 addresses, fifo order</programlisting>
        </para>
      </sect3>
    </sect2>
    <sect2>
      <title>Reply classifier</title>
      <para>Classifies according to threads.</para>
      <sect3>
        <title>Options</title>
        <variablelist>
          <varlistentry>
            <term><option>n</option></term>
            <listitem><simpara>Specifies the maximum number of threads to save.</simpara></listitem>
          </varlistentry>
        </variablelist>
      </sect3>
      <sect3>
        <title>Example</title>
        <para>
<programlisting>[classifier]
name		reply			# Name the classifier reply
classifier	Reply			# Reply classifier
type		document		# Document classifier
options		n=100			# 100 subject entries</programlisting>
        </para>
      </sect3>
    </sect2>
  </sect1>
  <sect1>
    <title>Vector classifiers</title>
    <para>A vector classifier takes a vector as input.</para>
    <para>Vector classifiers can have different type arguments. A multi-class classifier should have type multi_one:
<programlisting>type		multi_one		# Multi classifier, type ONE_MAX</programlisting>
A binary classifier can have one of several types:
<programlisting>type		multi_rest		# Multi classifier, type REST_MAX
type		multi_linmax		# Multi classifier, type LIN_MAX
type		multi_uc		# Multi classifier, type UC_MAX</programlisting>
    </para>
    <sect2>
      <title>Alma</title>
      <para>Alma is a binary maximal margin classifier. See <link linkend="Gen01"><citation>Gen01</citation></link> for a description of it.</para>
      <sect3>
        <title>Options</title>
        <para>There are no options.</para>
      </sect3>
      <sect3>
        <title>Example</title>
        <para>
<programlisting>[classifier]
name		alma			# Name the classifier alma
classifier	Alma			# Alma classifier
type		multi_linmax		# Multi classifier, type LIN_MAX
options					# No options
tokenizer	alpha			# Alpha tokenizer
vectorizer	tfidf			# TF-IDF vectorizer
normalizer				# No normalization</programlisting>
        </para>
      </sect3>
    </sect2>
    <sect2>
      <title>Naive Bayes</title>
      <para>Naive Bayes is a simple probabilistic multi-class classifier. See <link linkend="McCNig98"><citation>McCNig98</citation></link> for a description of it.</para>
      <para>Should only be used with type multi_one.</para>
      <sect3>
        <title>Options</title>
        <para>There are no options.</para>
      </sect3>
      <sect3>
        <title>Example</title>
        <para>
<programlisting>[classifier]
name		nb			# Name the classifier nb
classifier	NaiveBayes		# NaiveBayes classifier
type		multi_one		# Multi classifier, type ONE_MAX
options					# No options
tokenizer	alpha			# Alpha tokenizer
vectorizer	tfidf			# TF-IDF vectorizer
normalizer				# No normalization</programlisting>
        </para>
      </sect3>
    </sect2>
    <sect2>
      <title>N-gram</title>
      <para>N-gram is a classifier that uses relative entropy. It is suitable for language identification. See <link linkend="SibRey96"><citation>SibRey96</citation></link> for a description of it.</para>
      <para>Should only be used with an n-gram tokenizer and type multi_one.</para>
      <sect3>
        <title>Options</title>
        <para>There are no options.</para>
      </sect3>
      <sect3>
        <title>Example</title>
        <para>
<programlisting>[classifier]
name		ng			# Name the classifier ng
classifier	N-gram			# N-gram classifier
type		multi_one		# Multi classifier, type ONE_MAX
options					# No options
tokenizer	ngram.byte		# N-gram byte tokenizer
vectorizer	tf			# TF vectorizer
normalizer				# No normalization</programlisting>
        </para>
      </sect3>
    </sect2>
    <sect2>
      <title>Perceptron</title>
      <para>Perceptron is an old, simple binary classifier. It is described in just about every textbook on machine learning.</para>
      <para>It can be used with type multi_rest, multi_linmax, or multi_uc.</para>
      <sect3>
        <title>Options</title>
        <para>There are no options.</para>
      </sect3>
      <sect3>
        <title>Example</title>
        <para>
<programlisting>[classifier]
name		per			# Name the classifier per
classifier	Perceptron		# Perceptron classifier
type		multi_linmax		# Multi classifier, type LIN_MAX
options					# No options
tokenizer	alpha			# Alpha tokenizer
vectorizer	tfidf			# TF-IDF vectorizer
normalizer				# No normalization</programlisting>
        </para>
      </sect3>
    </sect2>
    <sect2 id="trivial">
      <title>Trivial classifier</title>
      <para>Classifies either according to class frequency or at random. Needless to say, this is only useful for testing purposes and should not be used in practice.</para>
      <para>Should only be used with a null tokenizer and type multi_one.</para>
      <sect3>
        <title>Options</title>
        <variablelist>
          <varlistentry>
            <term><option>s</option></term>
            <listitem><simpara>Seed (!= 0) for the pseudo-random number generator. Setting this option puts the classifier in random mode.</simpara></listitem>
          </varlistentry>
        </variablelist>
      </sect3>
      <sect3>
        <title>Example</title>
        <para>
<programlisting>[classifier]
name		triv			# Name the classifier triv
classifier	Trivial			# Trivial classifier
type		multi_one		# Multi classifier, type ONE_MAX
options		s=123			# Random mode, with seed 123
tokenizer	null			# Null tokenizer
vectorizer				# Default vectorizer
normalizer				# No normalization</programlisting>
        </para>
      </sect3>
    </sect2>
  </sect1>
</chapter>
