📄 训练方法.html

📁 Introduction to CRF (conditional random fields): presenting the CRF model through a pinyin-to-character conversion example
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns:o = "urn:schemas-microsoft-com:office:office"><HEAD><TITLE></TITLE>
<META http-equiv=Content-Type content="text/html; charset=GB2312">
<META content="MSHTML 6.00.2900.3157" name=GENERATOR></HEAD>
<BODY>
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt"><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA">The CRF training command is as follows:</SPAN></SPAN></P>
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt"><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA">crfLearn.exe template_file train_file 
model_file</SPAN></SPAN></P>
<P class=MsoNormal 
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
align=left><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt"><SPAN 
style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA">Here template_file is the template file prepared in advance, train_file is the prepared training corpus, and model_file is the model file produced by CRF training.</SPAN></SPAN></P>
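<P class=MsoNormal>A minimal Python sketch of assembling the command line above. The helper names build_crf_learn_command and run_training are hypothetical, and the sketch assumes that crfLearn.exe is on the PATH and that options precede the three file arguments, as in the example further down this page.</P>
<PRE>
```python
import subprocess
from typing import List, Mapping, Optional


def build_crf_learn_command(template_file: str, train_file: str, model_file: str,
                            options: Optional[Mapping[str, str]] = None,
                            executable: str = "crfLearn.exe") -> List[str]:
    """Return the argument list for one training run (hypothetical helper)."""
    command = [executable]
    # Assumption: options are written before the positional file arguments.
    for flag, value in (options or {}).items():
        command += [flag, value]
    command += [template_file, train_file, model_file]
    return command


def run_training(command: List[str]) -> int:
    """Run the command and return its exit code (assumes the tool is on PATH)."""
    return subprocess.run(command, check=False).returncode


command = build_crf_learn_command(
    "template", "train", "model",
    {"-x": "2", "-y": "100", "-c": "0.8", "-e": "0.00007"},
)
print(" ".join(command))
```
</PRE>
<P class=MsoNormal>Option values are passed as strings so that a value such as 0.00007 reaches the tool exactly as typed rather than being reformatted by Python.</P>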
<UL>
  <LI>
  <DIV class=MsoNormal 
  style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
  align=left><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
  >-x&nbsp;&nbsp;&nbsp;&nbsp; 
  Lower bound on the number of occurrences a feature must have to be used in training; default 1</SPAN></SPAN></DIV>
  <LI>
  <DIV class=MsoNormal 
  style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
  align=left><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
  >-y&nbsp;&nbsp;&nbsp;&nbsp; 
  Upper bound on the number of occurrences a feature may have to be used in training; default 10000</SPAN></SPAN></DIV>
  <LI>
  <DIV class=MsoNormal 
  style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
  align=left><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
  >-m&nbsp;&nbsp;&nbsp; 
  Maximum number of training iterations; default 100000</SPAN></SPAN></DIV>
  <LI>
  <DIV class=MsoNormal 
  style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
  align=left><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
  >-c&nbsp;&nbsp;&nbsp;&nbsp; 
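  Parameter that balances overfitting against underfitting: a value that is too large may cause overfitting, and a value that is too small may cause underfitting. Default 1.0</SPAN></SPAN></DIV>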
  <LI>
  <DIV class=MsoNormal 
  style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 24pt; TEXT-ALIGN: left; mso-char-indent-count: 2.0; mso-layout-grid-align: none" 
  align=left><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
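  >-e&nbsp;&nbsp;&nbsp;&nbsp; Relative difference between the objective function values of two consecutive iterations; default 0.0001<BR 
  >For example, crfLearn.exe -x 2 -y 100 -c 0.8 -e 
  0.00007&nbsp;template train model means:<BR 
  >features that occur between 2 and 100 times are used for training, the maximum number of training iterations is 100000 (the default, since -m is not given), the parameter controlling overfitting and underfitting is 0.8, and the relative difference between the objective function values of two consecutive iterations is 0.00007.</SPAN></SPAN><SPAN 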
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
  ><BR></SPAN></SPAN><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
  ><BR>The output during training looks like this:<BR 
  ></SPAN></SPAN><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-font-kerning: 0pt" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: KNLe; mso-bidi-font-family: :ZLe; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA" 
  ><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'" 
  ><BR>Number of sentences: 40000<BR>Number of features: 3773667<BR>Finished building features!<BR>Finished clear!<BR>iteration=0 
  terr=0.882073 serr=0.9992 obj=1.69213e+006 difference=1<BR>iteration=1 
  terr=0.326572 serr=0.93215 obj=1.66518e+006 difference=0.015922<BR>iteration=2 
  terr=0.326568 serr=0.93215 obj=1.56032e+006 
  difference=0.0629739<BR>iteration=3 terr=0.326575 serr=0.93215 obj=1.2318e+006 
  difference=0.210543<BR>iteration=4 terr=0.30927 serr=0.9191 obj=809550 
  difference=0.342792<BR>iteration=5 terr=0.268727 serr=0.890275 obj=640726 
  difference=0.208539<BR>...<BR><BR><SPAN lang=EN-US style="FONT-SIZE: 12pt" 
  >terr</SPAN><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'" 
  > is</SPAN> the error rate over all pinyin tokens in the training data, </SPAN><SPAN lang=EN-US 
  style="FONT-SIZE: 12pt">serr</SPAN><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'" 
  > is the error rate over all training sentences, </SPAN><SPAN lang=EN-US 
  style="FONT-SIZE: 12pt">obj</SPAN><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'" 
  > is the value of the objective function, and </SPAN><SPAN lang=EN-US style="FONT-SIZE: 12pt" 
  >difference</SPAN><SPAN 
  style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'" 
  > is the relative difference between the objective function values of two consecutive iterations.</SPAN><SPAN lang=EN-US 
  style="FONT-SIZE: 12pt"><o:p 
  ></o:p></SPAN></SPAN></SPAN></DIV></LI></UL></BODY></HTML>
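<P class=MsoNormal>As a rough check on how the difference column relates to obj, the sketch below parses the log lines shown above and recomputes the relative change between consecutive iterations. The formula |obj_prev - obj_cur| / obj_prev is an assumption inferred from the printed numbers, not taken from the crfLearn source; parse_line and relative_difference are hypothetical helper names.</P>
<PRE>
```python
import re

# Log lines copied from the training output above.
LOG = """\
iteration=0 terr=0.882073 serr=0.9992 obj=1.69213e+006 difference=1
iteration=1 terr=0.326572 serr=0.93215 obj=1.66518e+006 difference=0.015922
iteration=2 terr=0.326568 serr=0.93215 obj=1.56032e+006 difference=0.0629739
iteration=3 terr=0.326575 serr=0.93215 obj=1.2318e+006 difference=0.210543
iteration=4 terr=0.30927 serr=0.9191 obj=809550 difference=0.342792
iteration=5 terr=0.268727 serr=0.890275 obj=640726 difference=0.208539
"""

FIELD = re.compile(r"(\w+)=(\S+)")


def parse_line(line):
    """Turn 'key=value key=value ...' into a dict of floats."""
    return {key: float(value) for key, value in FIELD.findall(line)}


def relative_difference(previous_obj, current_obj):
    # Assumed formula: |previous - current| / previous.
    return abs(previous_obj - current_obj) / previous_obj


records = [parse_line(line) for line in LOG.splitlines() if line.strip()]

# Compare each iteration with its predecessor; iteration 0 has none.
for previous, current in zip(records, records[1:]):
    recomputed = relative_difference(previous["obj"], current["obj"])
    print(int(current["iteration"]), recomputed, current["difference"])
```
</PRE>
<P class=MsoNormal>On these six lines the recomputed values agree with the printed difference column to within about 1e-5, which is consistent with the assumed formula given that obj is printed with only six significant digits. Under the same assumption, training would stop once this value drops below the -e setting or the -m iteration limit is reached.</P>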
