
📄 crf原理.html

📁 Introduction to CRF (conditional random fields): the CRF model explained through a sound-to-character conversion example
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns:o = "urn:schemas-microsoft-com:office:office" xmlns:v = 
"urn:schemas-microsoft-com:vml"><HEAD><TITLE></TITLE>
<META http-equiv=Content-Type content="text/html; charset=GB2312">
<META content="MSHTML 6.00.2900.3157" name=GENERATOR></HEAD>
<BODY><FONT size=4><FONT face=@宋体 
DESIGNTIMESP="16432"><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA"><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA">
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
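style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">A conditional random field is a conditional probability model. It defines the joint probability of the entire label sequence given the input, rather than specifying a separate probability distribution for each state. The states are not independent and can interact with one another, so the model can represent real-world data more faithfully.<SPAN 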
lang=EN-US><o:p></o:p></SPAN></SPAN></P>
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 9pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">&nbsp; The linear chain is one of the common graph structures used in conditional random fields </SPAN><SPAN lang=EN-US 
style="FONT-SIZE: 12pt; mso-bidi-font-size: 9.0pt; mso-font-kerning: 0pt"><FONT face="Times New Roman">(CRF)</FONT></SPAN><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">; it is formed by linking the designated output nodes in sequence. A linear chain corresponds to a finite-state machine and can be used to label sequence data.<SPAN 
lang=EN-US><o:p></o:p></SPAN></SPAN></P>
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">Let <STRONG>X</STRONG></SPAN><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt"> be the given sequence of input observations, that is, the values on the <SPAN lang=EN-US>T</SPAN> input nodes of the undirected graphical model<SPAN lang=EN-US>,</SPAN> such as a sequence of syllables. Each element of <SPAN lang=EN-US><STRONG>Y</STRONG></SPAN> is drawn from the state set of a finite-state machine, where each state can correspond to a label, such as the Chinese characters that share one pronunciation: 号, 好, 浩. <STRONG>Y</STRONG></SPAN><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt"> itself is a state sequence of the same length as <SPAN lang=EN-US><STRONG>X</STRONG></SPAN>, that is, the values on the <SPAN lang=EN-US>T</SPAN> output nodes of the undirected graphical model. With parameters <IMG 
src="file://C:\Documents and Settings\Administrator\My Documents\CRF帮助文档\Lameda.JPG" 
align=baseline></SPAN><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">, a linear-chain conditional random field </SPAN><SPAN lang=EN-US 
style="FONT-SIZE: 12pt; mso-bidi-font-size: 9.0pt; mso-font-kerning: 0pt"><FONT face="Times New Roman">(CRF)</FONT></SPAN><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt"> defines the conditional probability of a state sequence <SPAN lang=EN-US><STRONG>Y</STRONG></SPAN> given the input sequence <SPAN lang=EN-US><STRONG>X</STRONG></SPAN> as:</SPAN></P>
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
&nbsp;&nbsp; <IMG 
src="file://C:\Documents and Settings\Administrator\My Documents\CRF帮助文档\条件公式.JPG" 
align=baseline></SPAN></P>
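<P>The formula here is embedded as an image. As a sketch only, the standard linear-chain CRF form that matches the symbols used in the surrounding text (transition features f_k, state features g_k, normalization factor Z_X) can be written as below. The weight names \lambda_k and \mu_k, the argument order of the feature functions, and the start state y_0 are assumptions, since the original symbols appear only in the images.</P>
<PRE>
```latex
P_{\Lambda}(Y \mid X) = \frac{1}{Z_X}
  \exp\!\left(
    \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, X, t)
    + \sum_{t=1}^{T} \sum_{k} \mu_k \, g_k(y_t, X, t)
  \right),
\qquad
Z_X = \sum_{Y'}
  \exp\!\left(
    \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, X, t)
    + \sum_{t=1}^{T} \sum_{k} \mu_k \, g_k(y'_t, X, t)
  \right)
```
</PRE>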
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-ALIGN: left; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">In the formula above, <STRONG>Zx</STRONG> is a normalization factor that makes the probabilities of all possible state sequences for a given input sum to <SPAN lang=EN-US>1</SPAN>. Computing <STRONG>Zx</STRONG> involves a number of state sequences that grows exponentially with the sequence length, but in a linear-chain model the nodes form no closed paths, so the sum can be computed efficiently by dynamic programming. Inference, that is, finding the most probable state sequence, can likewise be solved by dynamic programming.<SPAN 
lang=EN-US><o:p></o:p></SPAN></SPAN></P>
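<P>The two dynamic programs mentioned above can be sketched as follows. This is a minimal illustration, not the original author's implementation: it assumes the feature sums have already been collapsed into two hypothetical score tables, emit[t][j] (state score at position t) and trans[i][j] (transition score from state i to state j), filled with made-up numbers. The forward recursion computes log Zx, the Viterbi recursion finds the best state sequence, and a brute-force enumeration is included only to check both on a tiny example.</P>
<PRE>
```python
import itertools
import math


def sequence_score(path, emit, trans):
    """Unnormalized log-score of one complete state sequence."""
    score = emit[0][path[0]]
    for t in range(1, len(path)):
        score += trans[path[t - 1]][path[t]] + emit[t][path[t]]
    return score


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emit, trans):
    """Forward algorithm: log Zx in O(T * S^2) instead of O(S^T)."""
    n_states = len(trans)
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        alpha = [
            emit[t][j]
            + logsumexp([alpha[i] + trans[i][j] for i in range(n_states)])
            for j in range(n_states)
        ]
    return logsumexp(alpha)


def viterbi(emit, trans):
    """Viterbi algorithm: most probable state sequence and its log-score."""
    n_states = len(trans)
    delta = list(emit[0])
    backptr = []
    for t in range(1, len(emit)):
        new_delta, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: delta[i] + trans[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] + trans[best_i][j] + emit[t][j])
        delta = new_delta
        backptr.append(pointers)
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, max(delta)


def brute_force(emit, trans):
    """Enumerate every state sequence; feasible only for tiny inputs."""
    n_states = len(trans)
    paths = list(itertools.product(range(n_states), repeat=len(emit)))
    scores = [sequence_score(p, emit, trans) for p in paths]
    best = max(range(len(paths)), key=lambda i: scores[i])
    return logsumexp(scores), list(paths[best])


# Toy scores (made-up numbers): 3 positions, 2 states.
EMIT = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.4]]
TRANS = [[0.6, -0.4], [-0.2, 0.8]]
```
</PRE>
<P>On inputs this small, log_partition(EMIT, TRANS) and viterbi(EMIT, TRANS) can be compared directly against brute_force(EMIT, TRANS); the enumeration visits all S^T sequences, while the two recursions only need T * S^2 steps.</P>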
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-bidi-font-family: 黑体; mso-font-kerning: 0pt">In this formula, <STRONG>f<FONT size=2>k</FONT></STRONG> denotes a feature function, which usually takes Boolean values (it may also be real-valued); <IMG 
src="file://C:\Documents and Settings\Administrator\My Documents\CRF帮助文档\w1.JPG" 
align=baseline>&nbsp; and&nbsp;<IMG 
src="file://C:\Documents and Settings\Administrator\My Documents\CRF帮助文档\w2.JPG" 
align=baseline>&nbsp; are the weight parameters, learned during training, associated with each feature <STRONG>f<FONT size=2>k</FONT></STRONG> and <STRONG><FONT size=4>g</FONT><FONT size=2>k</FONT></STRONG>. By examining the whole observation sequence <SPAN lang=EN-US><STRONG>X</STRONG></SPAN> around the current time step <SPAN lang=EN-US><STRONG>t</STRONG></SPAN>, the feature functions can measure any aspect of a state transition. The values of these two weight parameters reflect how likely the event represented by the corresponding feature function is to occur: a large positive weight makes the event more likely, while a large negative weight makes it less likely.<SPAN 
lang=EN-US><o:p></o:p></SPAN></SPAN></P>
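<P>To make the role of feature functions and weights concrete, here is a small sketch based on the sound-to-character example. Everything in it is hypothetical and chosen for illustration: the toy lexicon CANDIDATES, the three Boolean feature functions, and their weights (which in practice would be learned during training). Restricting each syllable to a short candidate list is a simplifying assumption, and the distribution is computed by plain enumeration because the example is tiny.</P>
<PRE>
```python
import itertools
import math

# Hypothetical toy lexicon: candidate characters for each syllable.
CANDIDATES = {"ni": ["你", "泥"], "hao": ["好", "号", "浩"]}


# Transition feature f_k(y_prev, y_t, X, t): Boolean, looks at two states.
def f_ni_then_hao(y_prev, y_t, X, t):
    return y_prev == "你" and y_t == "好"


# State features g_k(y_t, X, t): Boolean, look at one state and the input.
def g_hao_as_number(y_t, X, t):
    return X[t] == "hao" and y_t == "号"


def g_ni_as_mud(y_t, X, t):
    return X[t] == "ni" and y_t == "泥"


# Made-up weights: positive favors the event, negative discourages it.
TRANSITION_FEATURES = [(2.0, f_ni_then_hao)]
STATE_FEATURES = [(0.5, g_hao_as_number), (-1.5, g_ni_as_mud)]


def score(Y, X):
    """Weighted sum of all active features over the whole sequence."""
    total = 0.0
    for t in range(len(X)):
        for weight, g in STATE_FEATURES:
            total += weight * g(Y[t], X, t)
        if t > 0:
            for weight, f in TRANSITION_FEATURES:
                total += weight * f(Y[t - 1], Y[t], X, t)
    return total


def distribution(X):
    """P(Y | X) for every candidate sequence, normalized by enumeration."""
    sequences = list(itertools.product(*(CANDIDATES[s] for s in X)))
    weights = [math.exp(score(Y, X)) for Y in sequences]
    z = sum(weights)
    return {"".join(Y): w / z for Y, w in zip(sequences, weights)}
```
</PRE>
<P>With these made-up weights, distribution(["ni", "hao"]) assigns the highest probability to 你好, because the transition feature with the large positive weight fires only for that sequence, while every sequence starting with 泥 is penalized by the negative weight.</P>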
<P>
<P>&nbsp;</P>
<P></SPAN></SPAN></FONT></FONT>&nbsp;</P>
<P></P></BODY></HTML>
