
📄 paodingmaker.java

📁 Paoding Chinese word segmentation program. It works reasonably well and can serve as the tokenizer for basic applications.
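		// Otherwise use the paoding.dic.home setting from the properties file.
		// However, if the properties file sets paoding.dic.home.config-first=this,
		// the paoding.dic.home setting from the properties file takes precedence,
		// and the environment variable is used only when the properties file
		// does not set paoding.dic.home.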
		String dicHomeBySystemEnv = System
				.getenv(Constants.ENV_PAODING_DIC_HOME);
		String dicHome = getProperty(p, Constants.DIC_HOME);
		if (dicHomeBySystemEnv != null) {
			String first = getProperty(p, Constants.DIC_HOME_CONFIG_FIRST);
			if (first != null && first.equalsIgnoreCase("this")) {
				if (dicHome == null) {
					dicHome = dicHomeBySystemEnv;
				}
			} else {
				dicHome = dicHomeBySystemEnv;
			}
		}
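		// If neither the environment variable nor the properties file configures
		// the dictionary installation directory, look for a "dic" directory in the
		// current directory and then on the classpath;
		// if one is found, use it as paoding.dic.home.
		// If both attempts fail, throw a PaodingAnalysisException.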
		if (dicHome == null) {
			File f = new File("dic");
			if (f.exists()) {
				dicHome = "dic/";
			} else {
				URL url = PaodingMaker.class.getClassLoader()
						.getResource("dic");
				if (url != null) {
					dicHome = "classpath:dic/";
				}
			}
		}
		if (dicHome == null) {
			throw new PaodingAnalysisException(
					"Please set the system environment variable PAODING_DIC_HOME, or configure paoding.dic.home in paoding-dic-home.properties, to point to the dictionaries!");
		}
		// Normalize dicHome and store it in the properties object
		dicHome = dicHome.replace('\\', '/');
		if (!dicHome.endsWith("/")) {
			dicHome = dicHome + "/";
		}
		p.setProperty(Constants.DIC_HOME, dicHome); // write to the properties object
		// Convert dicHome to a system-unique absolute path and record it in the properties object
		File dicHomeFile = getFile(dicHome);
		if (!dicHomeFile.exists()) {
			throw new PaodingAnalysisException("dic home directory not found: " + dicHomeFile.getAbsolutePath());
		}
		if (!dicHomeFile.isDirectory()){
			throw new PaodingAnalysisException("dic home should not be a file, but a directory!");
		}
		p.setProperty("paoding.dic.home.absolute.path", dicHomeFile.getAbsolutePath());
	}

	
	private static Paoding implMake(final Properties p) {
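		// The Paoding object to return; it may be newly created or an existing one taken from paodingHolder
		Paoding paoding;
		// The key of the returned Paoding object in paodingHolder, so that later calls with the same key do not create another Paoding object
		final Object paodingKey;
		// If this properties object was loaded from a file by PaodingMaker, it must contain the paoding.dic.properties.path property
		// (see the loadProperties method for details)
		String path = p.getProperty("paoding.dic.properties.path");
		// If the properties were loaded from a file, the file path is the key of the Paoding object in paodingHolder
		if (path != null) {
			paodingKey = path;
		// Otherwise the properties object itself is the key; whenever the same properties object is passed in, the same Paoding object is returned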
		} else {
			paodingKey = p;
		}
		paoding = (Paoding) paodingHolder.get(paodingKey);
		if (paoding != null) {
			return paoding;
		}
		try {
			paoding = createPaodingWithKnives(p);
			final Paoding finalPaoding = paoding;
			//
			String compilerClassName = getProperty(p, Constants.ANALYZER_DICTIONARIES_COMPILER);
			Class compilerClass = null;
			if (compilerClassName != null) {
				compilerClass = Class.forName(compilerClassName);
			}
			if (compilerClass == null) {
				String analyzerMode = getProperty(p, Constants.ANALYZER_MODE);
				if ("most-words".equalsIgnoreCase(analyzerMode)
						|| "default".equalsIgnoreCase(analyzerMode)) {
					compilerClass = MostWordsModeDictionariesCompiler.class;
				}
				else {
					compilerClass = SortingDictionariesCompiler.class;
				}
			}
			final DictionariesCompiler compiler 
				= (DictionariesCompiler)compilerClass.newInstance();
			new Function() {
				public void run() throws Exception {
					// Compile the dictionaries: apply any processing needed to make them suit the analyzer
					if (compiler.shouldCompile(p)) {
						Dictionaries dictionaries = readUnCompiledDictionaries(p);
						Paoding tempPaoding = createPaodingWithKnives(p);
						setDictionaries(tempPaoding, dictionaries);
						compiler.compile(dictionaries, tempPaoding, p);
					}
					
					// Use the compiled dictionaries
					final Dictionaries dictionaries = compiler.readCompliedDictionaries(p);
					setDictionaries(finalPaoding, dictionaries);
					
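					// Start the detector for dynamic dictionary loading/unloading
					// Detection interval in seconds; the default is 60. Zero or a negative value disables detection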
					String interval = getProperty(p, Constants.DIC_DETECTOR_INTERVAL);
					dictionaries.startDetecting(Integer.parseInt(interval), new DifferenceListener() {
						public void on(Difference diff) throws Exception {
							dictionaries.stopDetecting();
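							// Call run() here so that, when a change, removal, or addition of the **compiled** dictionaries is detected,
							// the source dictionaries are recompiled and self-detection on the new dictionaries is recreated and restarted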
							run();
						}
					});
				}
			}.run();
			// The Paoding object was created successfully; store it in paodingHolder for reuse next time
			paodingHolder.set(paodingKey, paoding);
			return paoding;
		} catch (Exception e) {
			throw new PaodingAnalysisException("", e);
		}
	}
	
	private static Paoding createPaodingWithKnives(Properties p) throws Exception {
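		// If paodingHolder has no cached Paoding object for this properties file or object,
		// create a new Paoding object from the given properties; the caller (implMake) stores it in paodingHolder before returning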
		Paoding paoding = new Paoding();
		
		// Look for the legendary Knife...
		final Map /* <String, Knife> */ knifeMap = new HashMap /* <String, Knife> */ ();
		final List /* <Knife> */ knifeList = new LinkedList/* <Knife> */();
		final List /* <Function> */ functions = new LinkedList/* <Function> */();
		Iterator iter = p.entrySet().iterator();
		while (iter.hasNext()) {
			Map.Entry e = (Map.Entry) iter.next();
			final String key = (String) e.getKey();
			final String value = (String) e.getValue();
			int index = key.indexOf(Constants.KNIFE_CLASS);
			if (index == 0 && key.length() > Constants.KNIFE_CLASS.length()) {
				final int end = key.indexOf('.', Constants.KNIFE_CLASS.length());
				if (end  == -1) {
					Class clazz = Class.forName(value);
					Knife knife = (Knife) clazz.newInstance();
					knifeList.add(knife);
					knifeMap.put(key, knife);
					log.info("add knife: " + value);
				}
				else {
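					// The properties object is a hash table, so keys are not read in file order and there is no guarantee that the knife object already exists when its property is set.
					// So only define a function here and add it to functions; it runs after all knives have been created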
					functions.add(new Function() {
						public void run() throws Exception {
							String knifeName = key.substring(0, end);
							Object obj = knifeMap.get(knifeName);
							if (!obj.getClass().getName().equals("org.springframework.beans.BeanWrapperImpl")) {
								Class beanWrapperImplClass = Class.forName("org.springframework.beans.BeanWrapperImpl");
								Method setWrappedInstance = beanWrapperImplClass.getMethod("setWrappedInstance", new Class[]{Object.class});
								Object beanWrapperImpl = beanWrapperImplClass.newInstance();
								setWrappedInstance.invoke(beanWrapperImpl, new Object[]{obj});
								knifeMap.put(knifeName, beanWrapperImpl);
								obj = beanWrapperImpl;
							}
							String propertyName = key.substring(end + 1);
							Method setPropertyValue = obj.getClass().getMethod("setPropertyValue", new Class[]{String.class, Object.class});
							setPropertyValue.invoke(obj, new Object[]{propertyName, value});
						}
					});
				}
			}
		}
		// Run all deferred functions
		for (Iterator iterator = functions.iterator(); iterator.hasNext();) {
			Function function = (Function) iterator.next();
			function.run();
		}
		// Hand the knives to Paoding
		paoding.setKnives(knifeList);
		return paoding;
	}

	private static Dictionaries readUnCompiledDictionaries(Properties p) {
		String skipPrefix = getProperty(p, Constants.DIC_SKIP_PREFIX);
		String noiseCharactor = getProperty(p, Constants.DIC_NOISE_CHARACTOR);
		String noiseWord = getProperty(p, Constants.DIC_NOISE_WORD);
		String unit = getProperty(p, Constants.DIC_UNIT);
		String confucianFamilyName = getProperty(p, Constants.DIC_CONFUCIAN_FAMILY_NAME);
		String combinatorics = getProperty(p, Constants.DIC_FOR_COMBINATORICS);
		String charsetName = getProperty(p, Constants.DIC_CHARSET);
		Dictionaries dictionaries = new FileDictionaries(getDicHome(p),
				skipPrefix, noiseCharactor, noiseWord, unit,
				confucianFamilyName, combinatorics, charsetName);
		return dictionaries;
	}

	private static void setDictionaries(Paoding paoding, Dictionaries dictionaries) {
		Knife[] knives = paoding.getKnives();
		for (int i = 0; i < knives.length; i++) {
			Knife knife = (Knife) knives[i];
			if (knife instanceof DictionariesWare) {
				((DictionariesWare) knife).setDictionaries(dictionaries);
			}
		}
	}


	private static File getFile(String path) {
		File file;
		URL url;
		if (path.startsWith("classpath:")) {
			path = path.substring("classpath:".length());
			url = getClassLoader().getResource(path);
			final boolean fileExist = url != null;
			file = new File(fileExist ? url.getFile() : path) {
				private static final long serialVersionUID = 4009013298629147887L;

				public boolean exists() {
					return fileExist;
				}
			};
		} else {
			file = new File(path);
		}
		return file;
	}

	private static ClassLoader getClassLoader() {
		ClassLoader loader = Thread.currentThread().getContextClassLoader();
		if (loader == null) {
			loader = PaodingMaker.class.getClassLoader();
		}
		return loader;
	}

	private static String getProperty(Properties p, String name) {
		return Constants.getProperty(p, name);
	}
	
	//--------------------------------------------------------------------
	
	private static class ObjectHolder/* <T> */{

		private ObjectHolder() {
		}

		private Map/* <Object, T> */objects = new HashMap/* <Object, T> */();

		public Object/* T */get(Object name) {
			return objects.get(name);
		}

		public void set(Object name, Object/* T */object) {
			objects.put(name, object);
		}

		public void remove(Object name) {
			objects.remove(name);
		}
	}
	
	private static interface Function {
		public void run() throws Exception;
	}

}
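
The precedence rule at the top of this listing (environment variable versus paoding.dic.home, switched by paoding.dic.home.config-first=this) is easy to misread, so here is a minimal standalone sketch of it. It is not part of the Paoding source: the class DicHomeResolutionSketch and its resolve method are hypothetical, the property keys are copied from the comments above, and it reads the Properties object directly, whereas the original goes through Constants.getProperty, which may behave differently. The fallback search for a "dic" directory is left out.

```java
import java.util.Properties;

// Hypothetical sketch, not part of PaodingMaker: it isolates the dic home
// precedence rule and the path normalization so they can be tried on their own.
final class DicHomeResolutionSketch {

    // Keys as written in the comments of the listing; the real code takes them from Constants.
    static final String DIC_HOME = "paoding.dic.home";
    static final String DIC_HOME_CONFIG_FIRST = "paoding.dic.home.config-first";

    /**
     * @param envValue value of the PAODING_DIC_HOME environment variable, or null if unset
     * @param p        properties that may contain the two keys above
     * @return the normalized dic home ending in '/', or null if nothing is configured
     */
    static String resolve(String envValue, Properties p) {
        String dicHome = p.getProperty(DIC_HOME);
        if (envValue != null) {
            // "this" means: the properties file wins whenever it sets a value.
            boolean propertiesFirst = "this".equalsIgnoreCase(p.getProperty(DIC_HOME_CONFIG_FIRST));
            if (!propertiesFirst || dicHome == null) {
                dicHome = envValue;
            }
        }
        if (dicHome == null) {
            return null; // the original would now look for a "dic" directory, then throw
        }
        // Same normalization as the listing: forward slashes and a trailing slash.
        dicHome = dicHome.replace('\\', '/');
        return dicHome.endsWith("/") ? dicHome : dicHome + "/";
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty(DIC_HOME, "C:\\paoding\\dic");

        // Without config-first, the environment variable overrides the properties file.
        System.out.println(resolve("/opt/paoding/dic", p)); // /opt/paoding/dic/

        // With config-first=this, the properties file wins.
        p.setProperty(DIC_HOME_CONFIG_FIRST, "this");
        System.out.println(resolve("/opt/paoding/dic", p)); // C:/paoding/dic/
    }
}
```

Passing the environment value as a parameter instead of calling System.getenv inside the method keeps the sketch testable without changing the process environment.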
