src="Using a Particle Filter for Gesture Recognition.files/img18.gif" width=15
align=middle> refers to the position in the model for the right hand's
trajectory).
<P>In summary, there are 7 parameters that describe each <EM>state</EM>.
<P>
<H4><A name=SECTION00021000000000000000>Initialization</A></H4>
<P>The sample set is initialized with <I>N</I> samples distributed over possible
starting states and each assigned a weight of <IMG height=27
alt=tex2html_wrap_inline184
src="Using a Particle Filter for Gesture Recognition.files/img19.gif" width=12
align=middle> . Specifically, the initial parameters are picked uniformly
according to: <BR><BR><IMG height=24 alt=tex2html_wrap_inline186
src="Using a Particle Filter for Gesture Recognition.files/img20.gif" width=91
align=middle> <BR><IMG height=35 alt=tex2html_wrap_inline188
src="Using a Particle Filter for Gesture Recognition.files/img21.gif" width=74
align=middle> , where <IMG height=20 alt=tex2html_wrap_inline190
src="Using a Particle Filter for Gesture Recognition.files/img22.gif" width=22
align=middle> [0,1] <BR><IMG height=30 alt=tex2html_wrap_inline192
src="Using a Particle Filter for Gesture Recognition.files/img23.gif" width=124
align=middle> <BR><IMG height=30 alt=tex2html_wrap_inline194
src="Using a Particle Filter for Gesture Recognition.files/img24.gif" width=119
align=middle> <BR><BR>In this application, I set the parameters as follows:
<BR><IMG height=14 alt=tex2html_wrap_inline196
src="Using a Particle Filter for Gesture Recognition.files/img25.gif" width=35
align=middle> = 2, since there are two models <BR><IMG height=22
alt=tex2html_wrap_inline198
src="Using a Particle Filter for Gesture Recognition.files/img26.gif" width=157
align=middle> for 50 percent scaling <BR><IMG height=22
alt=tex2html_wrap_inline200
src="Using a Particle Filter for Gesture Recognition.files/img27.gif" width=153
align=middle> for 50 percent scaling <BR>
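<P>As a rough illustration, the following C++ sketch shows one way this
initialization could look. The <TT>Sample</TT> struct, its field names, the
0.5-1.5 scaling ranges, and the squared-uniform draw used to bias the starting
phase toward zero are illustrative assumptions, not the actual contents of the
source files linked below; the full state described above has seven parameters,
while this sketch keeps only a single phase, amplitude, and rate for brevity.
<PRE>
#include <random>
#include <vector>

// Illustrative state for one sample; the real sample.h linked below
// may organize these parameters differently.
struct Sample {
    int    model;   // which gesture model (1 or 2)
    double phase;   // position within the model trajectory
    double alpha;   // amplitude scaling factor
    double rho;     // rate (speed) scaling factor
    double weight;  // sample weight
};

// Initialize N samples spread uniformly over plausible starting states,
// each with weight 1/N.
std::vector<Sample> InitializeSamples(int N, std::mt19937& rng) {
    std::uniform_int_distribution<int>     pick_model(1, 2);   // two models
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_real_distribution<double> scale(0.5, 1.5);    // assumed "50 percent scaling" range

    std::vector<Sample> samples(N);
    for (Sample& s : samples) {
        s.model  = pick_model(rng);
        double r = unit(rng);
        s.phase  = r * r;          // one way to bias the starting phase toward zero
        s.alpha  = scale(rng);
        s.rho    = scale(rng);
        s.weight = 1.0 / N;        // every sample starts with the same weight
    }
    return samples;
}
</PRE>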
<P>
<H4><A name=SECTION00022000000000000000>Prediction</A></H4>
<P>In the prediction step, each parameter of a randomly sampled <IMG height=14
alt=tex2html_wrap_inline136
src="Using a Particle Filter for Gesture Recognition.files/img2.gif" width=11
align=middle> is used to determine <IMG height=15 alt=tex2html_wrap_inline138
src="Using a Particle Filter for Gesture Recognition.files/img3.gif" width=27
align=middle> based on the parameters of that particular <IMG height=14
alt=tex2html_wrap_inline136
src="Using a Particle Filter for Gesture Recognition.files/img2.gif" width=11
align=middle> . Each old state, <IMG height=14 alt=tex2html_wrap_inline136
src="Using a Particle Filter for Gesture Recognition.files/img2.gif" width=11
align=middle> , is randomly chosen from the sample set, based on the weight of
each sample. That is, the weight of each sample determines the probability of
its being chosen. This is done efficiently by creating a cumulative probability
table, choosing a uniform random number on [0,1], and then using binary search
to pull out a sample (see Isard and Blake for details).
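<P>A minimal sketch of that weighted selection, assuming the weights have
already been normalized to sum to one; the function names here are mine, not
those of the linked source files:
<PRE>
#include <algorithm>
#include <random>
#include <vector>

// Build the cumulative-probability table once per time step.
std::vector<double> BuildCumulativeTable(const std::vector<double>& weights) {
    std::vector<double> cumulative(weights.size());
    double running = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        running += weights[i];          // assumes the weights sum to 1
        cumulative[i] = running;
    }
    return cumulative;
}

// Choose the index of one old sample with probability proportional to its
// weight, using a uniform draw on [0,1] and binary search, as in Isard and Blake.
std::size_t PickSampleIndex(const std::vector<double>& cumulative, std::mt19937& rng) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double u = unit(rng);
    auto it = std::lower_bound(cumulative.begin(), cumulative.end(), u);
    if (it == cumulative.end()) --it;   // guard against floating-point round-off
    return static_cast<std::size_t>(it - cumulative.begin());
}
</PRE>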
<P>The following equations are used to choose the new state <BR><BR><IMG
height=15 alt=tex2html_wrap_inline210
src="Using a Particle Filter for Gesture Recognition.files/img28.gif" width=69
align=middle> <BR><IMG height=30 alt=tex2html_wrap_inline212
src="Using a Particle Filter for Gesture Recognition.files/img29.gif" width=169
align=middle> <BR><IMG height=30 alt=tex2html_wrap_inline214
src="Using a Particle Filter for Gesture Recognition.files/img30.gif" width=137
align=middle> <BR><IMG height=30 alt=tex2html_wrap_inline216
src="Using a Particle Filter for Gesture Recognition.files/img31.gif" width=132
align=middle> <BR><BR>where <IMG height=23 alt=tex2html_wrap_inline218
src="Using a Particle Filter for Gesture Recognition.files/img32.gif" width=44
align=middle> refers to a number chosen randomly according to the normal
distribution with standard deviation <IMG height=14 alt=tex2html_wrap_inline220
src="Using a Particle Filter for Gesture Recognition.files/img33.gif" width=13
align=middle> . This adds an element of uncertainty to each prediction, which
keeps the sample set diffuse enough to deal with noisy data. In this application
I set: <IMG height=21 alt=tex2html_wrap_inline222
src="Using a Particle Filter for Gesture Recognition.files/img34.gif" width=131
align=middle> .
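<P>Assuming the equations above follow Black and Jepson's dynamics (the model
index is carried over unchanged, the phase advances by the rate, and zero-mean
Gaussian noise diffuses the phase, amplitude, and rate), a prediction step for
one drawn sample might look like the sketch below. The standard deviations are
passed in rather than hard-coded, since their exact values are given above.
<PRE>
#include <random>

// Same illustrative Sample struct as in the initialization sketch:
struct Sample { int model; double phase, alpha, rho, weight; };

// Illustrative container for the diffusion standard deviations.
struct Sigmas {
    double phase;
    double alpha;
    double rho;
};

// Predict a new state from a previously selected sample by propagating it
// through the assumed dynamics and adding Gaussian diffusion noise.
Sample PredictState(const Sample& old_state, const Sigmas& sigma, std::mt19937& rng) {
    std::normal_distribution<double> n_phase(0.0, sigma.phase);
    std::normal_distribution<double> n_alpha(0.0, sigma.alpha);
    std::normal_distribution<double> n_rho(0.0, sigma.rho);

    Sample s = old_state;
    s.model = old_state.model;                                  // model index is kept
    s.phase = old_state.phase + old_state.rho + n_phase(rng);   // advance phase by the rate
    s.alpha = old_state.alpha + n_alpha(rng);
    s.rho   = old_state.rho   + n_rho(rng);
    return s;
}
</PRE>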
<P>For a given drawn sample, predictions are generated until all of the
parameters are within the accepted range. If, after a set number of attempts, it
is still impossible to generate a valid prediction, a new sample is created
according to the initialization procedure above. In addition, 10 percent of all
samples in the new sample set are initialized randomly as in the initialization
step above (with the exception that rather than having the phase parameter
biased towards zero, it is biased towards the number of observations that have
been made thus far). This ensures that local maxima can't completely take over
the curve; new hypotheses are always given a chance to dominate.
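<P>A rough sketch of that retry-and-reseed logic, reusing the illustrative
<TT>Sample</TT>, <TT>Sigmas</TT>, <TT>InitializeSamples</TT>, and
<TT>PredictState</TT> helpers from the sketches above; the attempt limit and the
accepted ranges are placeholders, and the 10 percent random reseeding of the new
sample set is not shown:
<PRE>
#include <random>

const int kMaxAttempts = 20;   // placeholder for "a set number of attempts"

// Placeholder validity test: every parameter within its accepted range.
bool IsValid(const Sample& s) {
    return s.phase >= 0.0 &&
           s.alpha >= 0.5 && s.alpha <= 1.5 &&
           s.rho   >= 0.5 && s.rho   <= 1.5;
}

// Keep predicting from the drawn sample until the prediction is valid;
// fall back to a freshly initialized sample if that never happens.
Sample PredictOrReinitialize(const Sample& old_state, const Sigmas& sigma,
                             std::mt19937& rng) {
    for (int attempt = 0; attempt < kMaxAttempts; ++attempt) {
        Sample s = PredictState(old_state, sigma, rng);
        if (IsValid(s)) return s;
    }
    return InitializeSamples(1, rng).front();   // its weight is reset in the update step
}
</PRE>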
<P>
<H4><A name=SECTION00023000000000000000>Updating</A></H4>
<P>After the Prediction step above, there exists a new set of <I>N</I> predicted
samples which need to be assigned weights. The weight of each sample is a
measure of its likelihood given the observed data <IMG height=23
alt=tex2html_wrap_inline226
src="Using a Particle Filter for Gesture Recognition.files/img35.gif" width=128
align=middle> . I define <IMG height=26 alt=tex2html_wrap_inline228
src="Using a Particle Filter for Gesture Recognition.files/img36.gif" width=163
align=middle> as a sequence of observations for the <I>i</I>th coefficient over
time; specifically, let <IMG height=22 alt=tex2html_wrap_inline232
src="Using a Particle Filter for Gesture Recognition.files/img37.gif" width=131
align=middle> be the sequence of observations of the horizontal velocity of the
left hand, the vertical velocity of the left hand, the horizontal velocity of
the right hand, and the vertical velocity of the right hand respectively.
<P>Extending Black and Jepson, I then calculate the weight by the following
equation:
<P><IMG height=48 alt=equation84
src="Using a Particle Filter for Gesture Recognition.files/img38.gif" width=500
align=bottom>
<P><BR>where <BR>
<P><IMG height=48 alt=displaymath234
src="Using a Particle Filter for Gesture Recognition.files/img39.gif" width=450
align=bottom>
<P>and where <I>w</I> is the size of a temporal window that spans back in time
(here, I take <I>w</I> = 10). Note that <IMG height=11
alt=tex2html_wrap_inline174
src="Using a Particle Filter for Gesture Recognition.files/img15.gif" width=15
align=bottom> , <IMG height=24 alt=tex2html_wrap_inline172
src="Using a Particle Filter for Gesture Recognition.files/img14.gif" width=14
align=middle> , and <IMG height=22 alt=tex2html_wrap_inline176
src="Using a Particle Filter for Gesture Recognition.files/img16.gif" width=14
align=middle> refer to the appropriate parameters of the model for the blob in
question and that <IMG height=37 alt=tex2html_wrap_inline246
src="Using a Particle Filter for Gesture Recognition.files/img40.gif" width=95
align=middle> refers to the value given to the <I>i</I>th coefficient of the
model <IMG height=14 alt=tex2html_wrap_inline160
src="Using a Particle Filter for Gesture Recognition.files/img9.gif" width=8
align=middle> interpolated at time <IMG height=24 alt=tex2html_wrap_inline252
src="Using a Particle Filter for Gesture Recognition.files/img41.gif" width=59
align=middle> and scaled by <IMG height=11 alt=tex2html_wrap_inline174
src="Using a Particle Filter for Gesture Recognition.files/img15.gif" width=15
align=bottom> .
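<P>Assuming the weight takes the Black and Jepson form (a product over the four
velocity coefficients of a Gaussian term whose exponent sums, over the last
<I>w</I> frames, the squared differences between each observed velocity and the
correspondingly scaled, interpolated model value), the computation might be
sketched as follows. The interpolation helper, the per-coefficient standard
deviations, and the use of a single amplitude and rate for all coefficients are
simplifying assumptions; as noted above, the actual parameters are chosen per
blob.
<PRE>
#include <array>
#include <cmath>
#include <vector>

// Same illustrative Sample struct as in the earlier sketches.
struct Sample { int model; double phase, alpha, rho, weight; };

// One gesture model: model[k][i] is the i-th velocity coefficient at model time k.
using Model = std::vector<std::array<double, 4>>;

// Linearly interpolate the i-th coefficient of a model at a fractional model time.
double ModelValue(const Model& m, int i, double t) {
    if (t <= 0.0) return m.front()[i];
    if (t >= m.size() - 1.0) return m.back()[i];
    std::size_t k = static_cast<std::size_t>(t);
    double frac = t - static_cast<double>(k);
    return (1.0 - frac) * m[k][i] + frac * m[k + 1][i];
}

const int    kWindow   = 10;                     // temporal window w
const double kSigma[4] = {1.0, 1.0, 1.0, 1.0};   // placeholder per-coefficient sigmas

// Likelihood of the recent observations given one predicted state.
// observations[j][i] is the i-th velocity coefficient observed j frames ago.
// The constant Gaussian normalization is dropped; weights are renormalized later.
double ComputeWeight(const Sample& s, const Model& model,
                     const std::vector<std::array<double, 4>>& observations) {
    double weight = 1.0;
    for (int i = 0; i < 4; ++i) {                // the four velocity coefficients
        double sum_sq = 0.0;
        for (int j = 0; j < kWindow; ++j) {
            double predicted = s.alpha * ModelValue(model, i, s.phase - j * s.rho);
            double diff = observations[j][i] - predicted;
            sum_sq += diff * diff;
        }
        weight *= std::exp(-sum_sq / (2.0 * kSigma[i] * kSigma[i] * kWindow));
    }
    return weight;
}
</PRE>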
<P>
<H4><A name=SECTION00024000000000000000>Classification</A></H4>
<P>With this algorithm in place, all that remains is to classify the video
sequence as one of the two signs. Since the whole idea of Condensation is that
the most likely hypothesis will dominate by the end, I classify the entire video
sequence according to whichever model is deemed most likely at the end of the
sequence. Determining the probability
assigned to each model is a simple matter of summing the weights of each sample
in the sample set at a given moment whose <EM>state</EM> refers to the model in
question. The following graphs plot the likelihood of each model over time for
an instance of each sign (the first is a sign that is classified as model 1, the
second a sign that is classified as model 2): <BR><IMG alt=tex2html_wrap256
src="Using a Particle Filter for Gesture Recognition.files/img42.gif"
align=bottom> <IMG alt=tex2html_wrap258
src="Using a Particle Filter for Gesture Recognition.files/img43.gif"
align=bottom> <BR>Using this criterion, my system correctly classified 80
percent of the signs it was trained on and 75 percent of novel signs.
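<P>The per-model probability described above reduces to a small loop over the
sample set. A sketch, again using the illustrative <TT>Sample</TT> struct and
assuming the weights are normalized:
<PRE>
#include <array>
#include <vector>

// Same illustrative Sample struct as in the earlier sketches.
struct Sample { int model; double phase, alpha, rho, weight; };

// Probability assigned to each of the two models at the current moment:
// the sum of the (normalized) weights of the samples belonging to that model.
std::array<double, 2> ModelProbabilities(const std::vector<Sample>& samples) {
    std::array<double, 2> prob = {0.0, 0.0};
    for (const Sample& s : samples) {
        prob[s.model - 1] += s.weight;   // models are numbered 1 and 2
    }
    return prob;
}

// The sequence is classified as whichever model is more likely at the final frame.
int ClassifySequence(const std::vector<Sample>& final_samples) {
    std::array<double, 2> prob = ModelProbabilities(final_samples);
    return (prob[0] >= prob[1]) ? 1 : 2;
}
</PRE>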
<H3>Condensation Implementation</H3>The Condensation algorithm was coded in C++
and ran much faster than real time on the Sweet Hall machines (excluding the
image preprocessing). The complete source code is here:
<UL>
<LI>Condensation code
<UL>
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/sample.cpp">sample.cpp</A>
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/sample.h">sample.h</A>
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/model.cpp">model.cpp</A>
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/model.h">model.h</A>
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/main.cpp">main.cpp</A>
</LI></UL>
<LI>Utilities
<UL>
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/random_utils.cpp">random_utils.cpp</A>
(create random Gaussian numbers, etc.)
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/random_utils.h">random_utils.h</A>
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/utility.h">utility.h</A>
<LI>scanner.cpp (string tokenizer)
<LI><A
href="http://www.mit.edu/~alexgru/vision/condensation_source/scanner.h">scanner.h</A>
</LI></UL></LI></UL>
<H1>References </H1>For more background on gesture recognition, see my
literature review here: <A
href="http://www.mit.edu/~alexgru/vision/review.ps">ps</A> or <A
href="http://www.mit.edu/~alexgru/vision/review.pdf">pdf</A><BR><BR>
<DL>
<DT><A name=condensationGesture><STRONG>1</STRONG></A>
<DD>Michael J. Black and Allan D. Jepson. <A
href="http://citeseer.nj.nec.com/black98probabilistic.html">A probabilistic
framework for matching temporal trajectories: Condensation-based recognition of
gestures and expressions.</A> In <EM>Proceedings of the 5th European Conference on
Computer Vision</EM>, volume 1, pages 909-924, 1998.
<P></P>
<DT><A name=condensation><STRONG>2</STRONG></A>
<DD>Michael Isard and Andrew Blake. <A
href="http://citeseer.nj.nec.com/isard96contour.html">Contour tracking by
stochastic propagation of conditional density.</A> In <EM>Proceedings of the
European Conference on Computer Vision</EM>, pages 343-356, 1996.
<P></P>
<DT><A name=condensationSwitching><STRONG>3</STRONG></A>
<DD>Michael Isard and Andrew Blake. <A
href="http://citeseer.nj.nec.com/isard98mixedstate.html">A mixed-state
condensation tracker with automatic model-switching.</A> In <EM>Proceedings of the 6th
International Conference on Computer Vision</EM>, pages 107-112, 1998.
<P></P>
<DT><A name=motionTrajectories><STRONG>4</STRONG></A>
<DD>Ming-Hsuan Yang and Narendra Ahuja. <A
href="http://citeseer.nj.nec.com/yang00recognizing.html">Recognizing hand
gesture using motion trajectories.</A> In <EM>IEEE CS Conference on Computer
Vision and Pattern Recognition</EM>, volume 1, pages 466-472, June 1999.
</DL><BR>
<HR>
<HR>
<ADDRESS><A href="mailto:alexgru@stanford.edu">Alexander Houston
Gruenstein</A></ADDRESS><!-- Created: Fri Feb 22 13:17:58 PST 2002 --><!-- hhmts start -->Last
modified: Tue May 28 01:32:21 PDT 2002 <!-- hhmts end --></DD></BODY></HTML>