📄 createmodel.java~16~

///////////////////////////////////////////////////////////////////////////////
// Copyright (C) 2001 Chieu Hai Leong and Jason Baldridge
//
// This library is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 2.1 of the License, or (at your option) any later version.
//
// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License along with this program; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
//////////////////////////////////////////////////////////////////////////////

import opennlp.maxent.*;
import opennlp.maxent.io.*;
import java.io.*;

/**
 * Main class which calls the GIS procedure after building the EventStream
 * from the data.
 *
 * @author  Chieu Hai Leong and Jason Baldridge
 * @version $Revision: 1.3 $, $Date: 2001/11/20 17:07:16 $
 */
public class CreateModel {

    // Parameters for experimenting with the smoothing option during model
    // training.  Smoothing can improve model accuracy, but training may take
    // longer and use more memory, and the resulting model will be larger.
    // Initial testing indicates improvements for models built on small data
    // sets with few outcomes, but performance degradation for models built
    // on large data sets with many outcomes.
    public static boolean USE_SMOOTHING = true;
    public static double SMOOTHING_OBSERVATION = 0.01;

    /**
     * Main method. Call as follows:
     * <p>
     * java CreateModel dataFile
     */
    public static void main (String[] args) {

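        // Take the training data file name from the command line and derive
        // the model file name from it.
        if (args.length < 1) {
            System.err.println("Usage: java CreateModel dataFile");
            return;
        }
        String dataFileName = args[0];
        // Fall back to the whole name when the file has no extension.
        int dot = dataFileName.lastIndexOf('.');
        String modelFileName =
            (dot >= 0 ? dataFileName.substring(0, dot) : dataFileName)
            + "Model.txt";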

        try (FileReader datafr = new FileReader(dataFileName)) {
            // Read the data file and wrap its contents as an EventStream.
            // The reader is closed automatically when the try block exits.
            EventStream es =
                new BasicEventStream(new PlainTextByLineDataStream(datafr));

            // Set the smoothing options, then call the static method
            // GIS.trainModel to obtain a GISModel.  This is the key step.
            GIS.SMOOTHING = USE_SMOOTHING;
            GIS.SMOOTHING_OBSERVATION = SMOOTHING_OBSERVATION;
            GISModel model = GIS.trainModel(es);

            // Write the trained model to a file with a GISModelWriter so
            // that it can be loaded later for testing.
            File outputFile = new File(modelFileName);
            GISModelWriter writer =
                new SuffixSensitiveGISModelWriter(model, outputFile);
            writer.persist();
            System.out.print("\nCongratulations! Model has been built!\n");
        } catch (Exception e) {
            // Report any failure during reading, training, or writing.
            System.out.print("Sorry, unable to create model due to exception: ");
            System.out.println(e);
        }
    }

}
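
The class above derives the model file name from the data file name and hands each line of the data file to BasicEventStream. The sketch below illustrates both steps without depending on opennlp.maxent. It assumes the usual BasicEventStream line layout (space-separated context predicates with the outcome as the last token). The class name CreateModelSketch, its helper methods, and the sample line are hypothetical and exist only for illustration. Keeping the whole name when the data file has no extension is a choice made in this sketch.

```java
import java.util.Arrays;

// Illustrative helper only; not part of opennlp.maxent or the original source.
class CreateModelSketch {

    // Same naming rule as CreateModel: drop the last extension, append "Model.txt".
    // Assumption: a name without any '.' is kept whole.
    static String deriveModelFileName(String dataFileName) {
        int dot = dataFileName.lastIndexOf('.');
        String base = (dot >= 0) ? dataFileName.substring(0, dot) : dataFileName;
        return base + "Model.txt";
    }

    // Assumed line layout: "cp_1 cp_2 ... cp_n outcome".
    // The outcome is the last whitespace-separated token.
    static String outcomeOf(String line) {
        String[] tokens = line.trim().split("\\s+");
        return tokens[tokens.length - 1];
    }

    // The context predicates are every token before the outcome.
    static String[] contextOf(String line) {
        String[] tokens = line.trim().split("\\s+");
        return Arrays.copyOf(tokens, tokens.length - 1);
    }

    public static void main(String[] args) {
        String line = "free money click spam"; // hypothetical training line
        System.out.println(deriveModelFileName("mail.dat"));
        System.out.println(Arrays.toString(contextOf(line)) + " -> " + outcomeOf(line));
    }
}
```

Under these assumptions, a data file named mail.dat would produce a model file named mailModel.txt, and each training line would contribute one event whose outcome is its final token.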
