// segmenter.cpp
#include "segmenter.h"

#include <cstring>	// memset
#include <string>
#include <utility>	// pair
#include <vector>

/*************************************************************************
// This method searches the lexicon for the specified word.
// argument: string &s ----- the characters to search for.
// return value: string ----- the part of speech of the word, or an empty
//               string if the word is not in the lexicon.
*************************************************************************/

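string segmenter::exist(string &s){
	// The lexicon call takes a non-const unsigned char pointer, hence the casts.
	unsigned char * word = (unsigned char*)const_cast<char *>(s.c_str());
	unsigned char wordpos[10] = {0};	// zero-filled, so always terminated
	if( lex->exist(word,s.size(),wordpos)){
		wordpos[8] = '\0';	// truncate the tag to at most 8 bytes
		return string((char*)wordpos);
	}
	return "";
}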

/*********************************************************************************
// This method segments an input text and returns the result in a specific form,
// such as tag1 + word + tag2 + pos.
// argument: string &str ----- the input characters
// argument: char tag1, char tag2 ----- the tag characters placed around each word
// return value: string ----- the output characters in that form
**********************************************************************************/

string segmenter::segment(std::string &str, char tag1,char tag2){
	unsigned char * text = (unsigned char*)const_cast<char*>(str.c_str());
	unsigned int length = str.size();
	// Output buffer: 5 bytes per input byte, filled with 1, plus a final '\0'
	// so that the result is always terminated, even for an empty input.
	std::vector<unsigned char> buffer(5*length + 1, 1);
	buffer[5*length] = '\0';
	lex->Segment(text,&buffer[0],tag1,tag2);
	return string((char *)&buffer[0]);	// the vector frees the buffer on return
}
/*******************************************************************************
// This method segments an input text into words.
// argument 1: string &str ----- the input characters
// argument 2: vector<pair<string,int>> ----- the segmentation result split into
//             words. Each item in the vector is a pair of the word and its
//             position in the text.
// return value: void
*******************************************************************************/
void segmenter::segment(std::string &str, std::vector<wordpair> &results){
	unsigned char * text = (unsigned char*)const_cast<char*>(str.c_str());
	unsigned int length = str.size();
	// Scratch buffer for the lexicon: 5 bytes per input byte, filled with 1,
	// plus one terminating '\0'. The vector releases it even if Segment throws.
	std::vector<unsigned char> buffer(5*length + 1, 1);
	buffer[5*length] = '\0';
	lex->Segment(text,&buffer[0],results);
}
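/*******************************************************************************
// This method segments an input text and joins the words with a delimiter.
// argument 1: string &str ----- the input characters
// argument 2: char delima ----- the delimiter appended after every word,
//             including the last one
// return value: string ----- the delimited words
*******************************************************************************/
string segmenter::segment(std::string &str,char delima){
	vector< pair<string,int> > resultwords;
	segment(str,resultwords);
	string result;
	// size_t index avoids the signed/unsigned comparison with size().
	for(size_t i = 0; i < resultwords.size(); i++){
		result += resultwords[i].first;
		result += delima;
	}
	return result;
}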
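
// The delimiter-joining step above can be tried without the lexicon. The
// sketch below is only an illustration under stated assumptions: WordPos is a
// hypothetical stand-in for the wordpair type declared in segmenter.h (assumed
// here to be pair<string,int>), and join_words is a hypothetical helper, not
// part of the segmenter class.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for wordpair: a word plus its position in the text.
typedef std::pair<std::string, int> WordPos;

// Appends the delimiter after every word, including the last one,
// mirroring the loop in segmenter::segment(str, delima).
std::string join_words(const std::vector<WordPos> &words, char delim) {
    std::string out;
    for (std::vector<WordPos>::size_type i = 0; i < words.size(); ++i) {
        out += words[i].first;
        out += delim;
    }
    return out;
}
```

// Note that the result carries a trailing delimiter; a caller that does not
// want it has to strip the last character itself.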
