📄 词性标注类.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE></TITLE>
<META http-equiv=Content-Type content="text/html; charset=GB2312">
<META content="MSHTML 6.00.2900.3157" name=GENERATOR></HEAD>
<BODY><SPAN
style="FONT-SIZE: 12pt; FONT-FAMILY: ??; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'; mso-font-kerning: 0pt"><SPAN
style="FONT-SIZE: 12pt; FONT-FAMILY: ??; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'; mso-font-kerning: 0pt"><SPAN
style="FONT-SIZE: 12pt; FONT-FAMILY: ??; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'; mso-font-kerning: 0pt">
<P class=MsoNormal
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; mso-char-indent-count: 2.0"><SPAN
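style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'; mso-font-kerning: 0pt">Features are organized as a two-level table. The first level is the feature table, which has two fields: a feature string and a feature pointer. The feature pointer refers to a stored data pair consisting of the feature ID and the feature's occurrence count. The feature table is stored and accessed through a red-black tree, as shown </SPAN><SPAN
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'; mso-font-kerning: 0pt">in the figure below.</SPAN><BR><BR></SPAN><SPAN lang=EN-US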
style="FONT-SIZE: 12pt; mso-bidi-font-size: 9.0pt; mso-font-kerning: 0pt"><o:p>
<IMG
src="file://C:\Documents and Settings\Administrator\My Documents\CRF帮助文档\posFeatureOrg.JPG"
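align=baseline><BR><BR>Note: during feature extraction for part-of-speech tagging and similar problems, weights are stored in and read from a single weight array, and the weight of each feature is looked up through its feature ID. Because every column of the resulting lattice has the same fixed number of states, denoted <STRONG>N</STRONG>, feature IDs are inserted into the feature table above at intervals of <STRONG>N</STRONG>. For example:</SPAN> the feature string "U00:227" has feature ID 0, which implies that the weights of the <STRONG>N</STRONG> states for
"U00:227" are obtained at indices 0, 1, 2, 3, ..., N-1. By the same reasoning, the feature string
"U01:11" has feature ID <STRONG>N</STRONG>, and the feature string "U02:131"
has feature ID <STRONG>2N</STRONG>.</O:P></P></SPAN></SPAN></BODY></HTML>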
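The ID scheme described above can be sketched in C++ as follows. This is only an illustrative sketch, not the original implementation: the class name `FeatureTable` and all member names are hypothetical, `std::map` stands in for the red-black tree (common standard library implementations use one, though the standard does not require it), and the (feature ID, occurrence count) pair is stored directly as the mapped value rather than behind a separate pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of the two-level feature table; names are assumptions.
class FeatureTable {
public:
    // num_states is N, the fixed number of states per lattice column.
    explicit FeatureTable(std::size_t num_states) : num_states_(num_states) {
        assert(num_states > 0);
    }

    // Look up a feature string, inserting it if it is new, and increment
    // its occurrence count. New features receive IDs 0, N, 2N, ... in
    // insertion order. Returns the feature ID.
    std::size_t add(const std::string& feature) {
        auto it = table_.find(feature);
        if (it == table_.end()) {
            it = table_.emplace(feature,
                                std::make_pair(next_id_, std::size_t{0})).first;
            next_id_ += num_states_;  // reserve N weight slots for this feature
        }
        ++it->second.second;
        return it->second.first;
    }

    // Occurrence count of a feature string, or 0 if it was never added.
    std::size_t count(const std::string& feature) const {
        auto it = table_.find(feature);
        return it == table_.end() ? 0 : it->second.second;
    }

    // Index into the weight array for a given feature ID and state.
    std::size_t weight_index(std::size_t feature_id, std::size_t state) const {
        assert(state < num_states_);
        return feature_id + state;
    }

    // Total number of weight slots reserved so far.
    std::size_t weight_count() const { return next_id_; }

private:
    std::size_t num_states_;
    std::size_t next_id_ = 0;
    // feature string -> (feature ID, occurrence count)
    std::map<std::string, std::pair<std::size_t, std::size_t>> table_;
};
```

With N = 3, adding "U00:227", "U01:11" and "U02:131" in that order yields feature IDs 0, 3 and 6, and the weight for state 2 of "U01:11" sits at index 3 + 2 = 5 of a weight array such as `std::vector<double> weights(table.weight_count())`.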