📄 cs585 fall 1998 project one by stanislav rost.htm

        src="CS585 Fall 1998 Project One by Stanislav Rost.files/mahalanobis.gif"></CENTER><BR><I><B>X</B> 
        is the identifying vector for the letter which we are attempting to 
        recognize<BR><B>Mu</B> is the mean and <B>C</B> is the covariance matrix 
        for the vowel to which we are comparing the letter, both obtained in training</I> 
        <LI>Pick a vowel which is the closest to the letter we are considering 
        (whose shifted Mahalanobis distance is the smallest). 
        <LI>Calculate a non-shifted Mahalanobis distance (without a log term) to 
        find out the actual closeness of the vowel and the letter in question. 
        <LI>Find the range that thresholds the distance and lets the program 
        decide whether the letter matches the vowel. The variances are taken 
        from the diagonal of the covariance matrix, and their square roots give 
        the standard deviations. To express the standard deviation in the space 
        of the vowel's data distribution, the program applies the Mahalanobis 
        distance to it. The normalized standard deviation is then scaled by an 
        empirically derived constant so that the range covers most of the 
        distribution. 
        <LI>If the distance is greater than the allowed range, mark the letter 
        as non-(O, E, A, U) and proceed to the next letter. <A name=checks></A>
        <LI>If the distance is within the allowed range, perform a series of 
        secondary checks using the Euler number (the number of image components 
        minus the number of holes in the image). For instance, O's must have an 
        Euler number of 0 (1 component, 1 hole). If the Euler check fails, the 
        letter is marked as non-(O, E, A, U) and the program proceeds to the 
        next letter.<BR><BR>In particular, this is where <B>u</B>'s are 
        distinguished from <B>n</B>'s. The letter in question is sliced 
        horizontally into two halves, and the Euler numbers of the halves of a 
        <B>u</B> differ from those of an <B>n</B>. <BR>
        <CENTER><IMG 
        src="CS585 Fall 1998 Project One by Stanislav Rost.files/n2.gif">&nbsp;&nbsp;<IMG 
        src="CS585 Fall 1998 Project One by Stanislav Rost.files/u2.gif"></CENTER><BR>Also, 
        <B>a</B>'s, which resemble <B>s</B>'s, go through an additional 
        check here. In a's, the top half of the image has a smaller area than 
        the bottom half. In s's, both halves have roughly the same area. 
        <LI>If the letter which we want to recognize passes all the checks and 
        is within the allowed range, then paint the letter as the appropriate 
        vowel and proceed to the next letter in the paragraph. </LI></UL><A 
      name=results></A>
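      <P>The distance steps above can be sketched in Python as follows. This is 
      a minimal illustration, not the author's MATLAB code. It assumes a 
      two-dimensional feature vector so that the covariance matrix can be 
      inverted by hand, assumes that the log term of the shifted distance is the 
      logarithm of the determinant of <B>C</B> (the exact formula appears only 
      in the image), and reads the range step as the Mahalanobis distance of a 
      point one standard deviation away from the mean, multiplied by a constant. 
      The names MODELS, SCALE and classify, and all numeric values, are 
      hypothetical.</P>

```python
import math

# Hypothetical training results: vowel -> (mean vector, 2x2 covariance matrix).
MODELS = {
    "o": ((1.0, 1.0), ((0.04, 0.00), (0.00, 0.09))),
    "e": ((2.0, 0.5), ((0.09, 0.01), (0.01, 0.04))),
}
SCALE = 3.0  # assumed stand-in for the empirically derived constant


def determinant(cov):
    (a, b), (c, d) = cov
    return a * d - b * c


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T C^-1 (x - mean) in 2-D."""
    (a, b), (c, d) = cov
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Uses the closed-form inverse of a 2x2 matrix.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / determinant(cov)


def shifted_distance(x, mean, cov):
    """Assumed form of the shifted distance: squared distance plus log|C|."""
    return mahalanobis_sq(x, mean, cov) + math.log(determinant(cov))


def allowed_range(mean, cov):
    """One reading of the range step: normalize the standard deviation, then scale."""
    sigma = (math.sqrt(cov[0][0]), math.sqrt(cov[1][1]))
    one_sigma_point = (mean[0] + sigma[0], mean[1] + sigma[1])
    return SCALE * math.sqrt(mahalanobis_sq(one_sigma_point, mean, cov))


def classify(x, models):
    """Return the closest vowel, or None if x lies outside its allowed range."""
    best = min(models, key=lambda v: shifted_distance(x, *models[v]))
    mean, cov = models[best]
    distance = math.sqrt(mahalanobis_sq(x, mean, cov))  # non-shifted, no log term
    if distance > allowed_range(mean, cov):
        return None
    return best


print(classify((1.1, 0.9), MODELS))  # close to the "o" model
print(classify((5.0, 5.0), MODELS))  # far from every model
```

      <P>In this sketch a result of None corresponds to marking the letter as 
      non-(O, E, A, U); the secondary checks would run only when a vowel is 
      returned.</P>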
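      <P>The Euler-number checks described above can be illustrated with a small 
      dependency-free Python sketch. It is not the author's MATLAB 
      implementation. The tiny hand-drawn glyphs, the helper names 
      (count_regions, euler_number, half_euler_numbers), the choice of 
      8-connectivity for letter pixels and 4-connectivity for background, and 
      the assumption that the slice is a cut at mid-height into a top and a 
      bottom half are all illustrative.</P>

```python
from collections import deque

FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(grid, value, steps):
    """Count connected regions of cells equal to `value` using flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or (r, c) in seen:
                continue
            regions += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    inside = ny in range(rows) and nx in range(cols)
                    if inside and grid[ny][nx] == value and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return regions


def euler_number(image):
    """Number of components minus number of holes in a binary image."""
    width = len(image[0]) + 2
    # Pad with background so everything outside the letter is one region.
    padded = [[0] * width] + [[0] + list(row) + [0] for row in image] + [[0] * width]
    components = count_regions(padded, 1, EIGHT)
    holes = count_regions(padded, 0, FOUR) - 1  # subtract the outer background
    return components - holes


def half_euler_numbers(image):
    """Euler numbers of the top and bottom halves (assumed cut at mid-height)."""
    middle = len(image) // 2
    return euler_number(image[:middle]), euler_number(image[middle:])


def parse(art):
    """Turn rows of '#' and '.' characters into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in line] for line in art]


LETTER_O = parse(["####", "#..#", "#..#", "####"])
LETTER_U = parse(["#..#", "#..#", "#..#", "####"])
LETTER_N = parse(["####", "#..#", "#..#", "#..#"])

print(euler_number(LETTER_O))        # 1 component, 1 hole
print(half_euler_numbers(LETTER_U))  # two strokes on top, one piece below
print(half_euler_numbers(LETTER_N))  # one piece on top, two strokes below
```

      <P>On these toy glyphs the whole-letter Euler number is the same for the u 
      and the n, which is why the halves are compared instead.</P>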
      <CENTER><IMG 
      src="CS585 Fall 1998 Project One by Stanislav Rost.files/results.jpg" 
      border=0></CENTER>
      <CENTER>Paragraph 1<BR><A 
      href="CS585 Fall 1998 Project One by Stanislav Rost.files/par1result.gif"><IMG 
      height=70 
      src="CS585 Fall 1998 Project One by Stanislav Rost.files/par1result.gif" 
      width=497 border=0></A><BR><BR>
      <TABLE cellSpacing=1 border=1>
        <TBODY>
        <TR>
          <TD><B>Number Right</B> </TD>
          <TD><B>Number Wrong</B> </TD>
          <TD><B>Correctness</B> </TD></TR>
        <TR>
          <TD>16 </TD>
          <TD>1 </TD>
          <TD>94.11 % </TD></TR></TBODY></TABLE>
      <P>Paragraph 2<BR><A 
      href="CS585 Fall 1998 Project One by Stanislav Rost.files/par2result.gif"><IMG 
      height=137 
      src="CS585 Fall 1998 Project One by Stanislav Rost.files/par2result.gif" 
      width=488 border=0></A><BR><BR>
      <TABLE cellSpacing=1 border=1>
        <TBODY>
        <TR>
          <TD><B>Number Right</B> </TD>
          <TD><B>Number Wrong</B> </TD>
          <TD><B>Correctness</B> </TD></TR>
        <TR>
          <TD>82 </TD>
          <TD>4 wrong, 4 missed </TD>
          <TD>91.11 % </TD></TR></TBODY></TABLE>
      <P>Scaled Paragraph<BR><A 
      href="CS585 Fall 1998 Project One by Stanislav Rost.files/scaledresult.gif"><IMG 
      height=194 
      src="CS585 Fall 1998 Project One by Stanislav Rost.files/scaledresult.gif" 
      width=229 border=0></A><BR><BR>
      <TABLE cellSpacing=1 border=1>
        <TBODY>
        <TR>
          <TD><B>Number Right</B> </TD>
          <TD><B>Number Wrong</B> </TD>
          <TD><B>Correctness</B> </TD></TR>
        <TR>
          <TD>11 </TD>
          <TD>0 </TD>
          <TD>100.00 % </TD></TR></TBODY></TABLE>
      <P>Tilted Paragraph<BR><BR>Original<BR><A 
      href="CS585 Fall 1998 Project One by Stanislav Rost.files/tilted.gif"><IMG 
      height=20 
      src="CS585 Fall 1998 Project One by Stanislav Rost.files/tilted.gif" 
      width=498 border=0></A><BR><BR>Result<BR><A 
      href="CS585 Fall 1998 Project One by Stanislav Rost.files/tiltedresult.gif"><IMG 
      height=62 
      src="CS585 Fall 1998 Project One by Stanislav Rost.files/tiltedresult.gif" 
      width=498 border=0></A><BR><BR>
      <TABLE cellSpacing=1 border=1>
        <TBODY>
        <TR>
          <TD><B>Number Right</B> </TD>
          <TD><B>Number Wrong</B> </TD>
          <TD><B>Correctness</B> </TD></TR>
        <TR>
          <TD>10 </TD>
          <TD>6 </TD>
          <TD>62.50 % </TD></TR></TBODY></TABLE></CENTER><BR><FONT size=+1>What Went 
      Wrong and How To Fix It</FONT><BR><BR>The recognition performed poorly on 
      the tilted paragraph for several reasons. 
      <UL>
        <LI>When the program performed reverse rotation on the tilted paragraph 
        to orient it properly, the algorithm for rotation might have distorted 
        data or added noise to it. There really is no way to avoid this because 
        the data is discrete, and during rotation some data will be corrupted. 
        <LI>The training paragraph did not have any rotated letters. Although 
        the moments are rotation-invariant, when the training routine calculated 
        the mean for each set of instances of vowels, the mean was probably far 
        away from the value of the mean which also accounts for rotated letters. 
        To correct for this, I should have also included rotated letters in 
        training. 
        <LI>The tilted paragraph was not simply rotated but, as you can see from 
        the image, deformed in other ways; for example, some letters are rotated 
        differently than the whole paragraph. These deformations, in conjunction 
        with my use of orientation-variant parameters, may produce errors. To 
        correct for this, I should have refrained from using rotation-variant 
        parameters. </LI></UL>The splitting did not quite work in some cases. To 
      improve the splitting technique, I probably should have considered a 
      technique which uses image morphology such as erosion, removal of spurs, 
      shrinking etc. <A name=code></A>
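      <P>The morphological idea mentioned above can be sketched as follows. The 
      project did not implement it, so this is only an illustration of how 
      erosion might separate two touching letters. The toy image and the helper 
      names (erode, count_components) are hypothetical, and the 3x3 square 
      structuring element is an assumed choice.</P>

```python
def erode(image):
    """Binary erosion with a 3x3 square structuring element."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    # Border pixels are always cleared because part of the 3x3 window
    # would fall outside the image, which is treated as background.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = 1 if all(window) else 0
    return out


def count_components(image):
    """Count 8-connected groups of foreground pixels with an iterative flood fill."""
    rows, cols = len(image), len(image[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or (r, c) in seen:
                continue
            count += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny in range(max(y - 1, 0), min(y + 2, rows)):
                    for nx in range(max(x - 1, 0), min(x + 2, cols)):
                        if image[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return count


# Two 3x3 blobs joined by a one-pixel-thick bridge, standing in for touching letters.
ART = [
    "...........",
    ".###...###.",
    ".#########.",
    ".###...###.",
    "...........",
]
TOUCHING = [[1 if ch == "#" else 0 for ch in line] for line in ART]

print(count_components(TOUCHING))         # the bridge joins both blobs
print(count_components(erode(TOUCHING)))  # erosion removes the thin bridge
```

      <P>Erosion also shrinks the letters themselves, so the eroded blobs would 
      more plausibly serve as markers for where to split than as the final 
      letter shapes.</P>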
      <CENTER><IMG 
      src="CS585 Fall 1998 Project One by Stanislav Rost.files/code.jpg" 
      border=0></CENTER><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/train.m">train.m</A> - 
      Training program <BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/vowel.m">vowel.m</A> - main 
      recognition program <BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/letter_compare.m">letter_compare.m</A> 
      - compares a letter to all vowels <BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/invmoments.m">invmoments.m</A> 
      - calculates Hu moments <BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/ncmoment.m">ncmoment.m</A> - 
      calculates normalized central moments<BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/centralmoment.m">centralmoment.m</A> 
      - calculates central moments<BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/gravcenters.m">gravcenters.m</A> 
      - calculates gravity centers for an image<BR><BR><B>Training 
      data</B><BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/a.mat">a.mat</A> - training 
      data for vowel a<BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/e.mat">e.mat</A> - training 
      data for vowel e<BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/o.mat">o.mat</A> - training 
      data for vowel o<BR><A 
      href="http://web.mit.edu/stanrost/www/cs585p1/u.mat">u.mat</A> - training 
      data for vowel u<BR></TD></TR></TBODY></TABLE></P></BODY></HTML>