http://www.cs.cornell.edu/Info/People/hejik/cs631/paper.html

<p>TMDF builds the background frame by computing the median pixel value over the images in the video sequence. This technique relies on the assumption that the tracked object occupies any one particular location in fewer than half of the image frames. We implement this filter by finding which frame holds the median gray-level value at each pixel, and then reconstructing the background using the corresponding RGB values.</p><p>Both temporal filters are pixel-level operations we wrote in RiVL_Genc. RiVL_Genc allows at most twenty frames as input to a function, and because a median of medians is not a true median, we could not implement an exact median over the entire video sequence. Instead, we compute the median for several different samples -- each sample consisting of twenty frames taken at equal intervals -- and allow the user to choose the best result.<p><H3><A NAME="HDR6">Physical Space Search</A></H3><p>Physical space search finds a frame in which the bounding box of the tracked object does not overlap the box in the currently processed frame; that is, the patch of background needed to replace the object is one the object has not occupied in that frame. Using the assumption of motion continuity, we initially search for the background of the current image near the frame where we found the background for the previous frame; this way we avoid a comprehensive search.
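Returning to the temporal median filter above: its per-pixel selection step can be sketched in a few lines of NumPy. This is a minimal sketch, not the RiVL_Genc implementation; the array layout, dtype, and gray-level conversion are assumptions.

```python
import numpy as np

def median_background(frames):
    """Pick, per pixel, the frame whose gray level is the median of the
    stack, then copy that frame's RGB value into the background.

    frames: uint8 array of shape (N, H, W, 3) -- an assumed layout.
    """
    gray = frames.mean(axis=3)            # (N, H, W) gray-level values
    order = np.argsort(gray, axis=0)      # per-pixel ordering of frames by gray level
    mid = order[frames.shape[0] // 2]     # (H, W) index of the median frame per pixel
    rows, cols = np.indices(mid.shape)
    return frames[mid, rows, cols]        # (H, W, 3) reconstructed background
```

Under the twenty-frame input limit described above, this function would be applied to each twenty-frame sample separately rather than to the whole sequence.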
For the initial frame, we must search the entire sequence for all possible background replacements. Although we prefer the closest frame that contains the background, we also want to find multiple frames in which the background is visible, in case another moving object has covered it. It is also possible to partition the bounding box into smaller blocks and search for the background in pieces.<p>If we assume a single moving object in the sequence, then one frame with the object removed and the background reconstructed can serve as the background for the entire sequence. However, because lighting levels shift over time, it is desirable to reconstruct the scene for every frame or every block of frames. Figure 3 shows the background covered by the subject's head reconstructed.<p><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/headless.gif"><B><A NAME="REF69077"><BR></a>Figure 3:</B> Sequence illustrating object-tracking and background reconstruction<p><H3><A NAME="HDR7">3.3 Object Segmentation</A></H3><H3>Image Segmentation</H3>Segmentation, or separating the tracked object from the background, is one of the core problems in vision that has yet to be adequately solved for unconstrained settings. We explore motion differencing, second differencing, and background subtraction for this classical problem.<p><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/imdif.gif"><B><A NAME="REF69077"><BR></a>Figure 4:</B> Segmentation methods<p><H4><A NAME="HDR13">A. Image Differencing</A></H4>Motion differencing applies a threshold to the difference of two consecutive images to produce a binary image indicating the regions of motion. We extend motion differencing to use three consecutive frames.
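The physical space search described earlier reduces to finding the nearest frame whose tracked bounding box does not overlap the current one. A minimal sketch, assuming boxes are `(x0, y0, x1, y1)` tuples and a simple outward scan from a hint frame (both assumptions, not the paper's exact procedure):

```python
def overlaps(a, b):
    """Axis-aligned boxes (x0, y0, x1, y1); True if the boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def find_background_frame(boxes, t, hint=None):
    """Return the index of the frame nearest `hint` (default: t itself)
    whose tracked bounding box does not overlap the box in frame t,
    or None if no such frame exists.
    """
    center = t if hint is None else hint
    for d in range(len(boxes)):                      # scan outward from the hint
        for i in (center - d, center + d):
            if 0 <= i < len(boxes) and i != t and not overlaps(boxes[i], boxes[t]):
                return i
    return None
```

With no hint, the scan radiates out from the current frame and, in the worst case, covers the whole sequence -- matching the exhaustive search needed for the initial frame.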
With second differencing, we perform a binary <i>AND</i> of the difference image of the first two frames and that of the last two frames to segment out the moving object in the middle frame. Moving objects are segmented more cleanly when the object overlaps itself less between consecutive images; we therefore choose three consecutive frames with sufficient motion between them.<p><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/backsub.gif"><B><A NAME="REF69077"><BR></a>Figure 5:</B> Segmentation methods<p><H4><A NAME="HDR14">B. Background Subtraction</A></H4>Background subtraction applies a threshold to the difference between the background and the image containing the moving object. This technique works well only when used with a faithful copy of the background.<p><H5><A HREF="#toc"><-- Table of Contents</A></H5><H2><A NAME="HDR8">4.  Evaluation</A></H2><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/oseq.gif"><br><B>Figure 6:</B> Input video sequence<p><br><br><p>We used the above video sequence of 200 frames as one of the inputs for our test.
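The differencing and subtraction operations of Section 3.3 can all be sketched as thresholded frame differences. In this sketch the gray-level input and the threshold value are assumptions:

```python
import numpy as np

THRESH = 30  # gray-level difference threshold (an assumed value)

def motion_mask(f1, f2):
    """Binary mask of pixels whose gray-level change exceeds THRESH."""
    return np.abs(f1.astype(int) - f2.astype(int)) > THRESH

def second_difference(f1, f2, f3):
    """Segment the object in the middle frame: AND of the two difference masks."""
    return motion_mask(f1, f2) & motion_mask(f2, f3)

def background_subtraction(frame, background):
    """Mask of pixels where the frame differs from a reconstructed background."""
    return motion_mask(frame, background)
```

Note that `second_difference` needs no reconstructed background, while `background_subtraction` is only as good as the background it is given -- the trade-off discussed in the evaluation.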
Images were recorded as Motion JPEGs with a Sun Microsystems camera using a Parallax board.</p><p><br><br><A HREF="http://www.cs.cornell.edu/Info/People/hejik/cs631/meanbkg.gif"><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/thumbmeanbkg.gif"></a> Mean <A HREF="http://www.cs.cornell.edu/Info/People/hejik/cs631/medianbkg.gif"><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/thumbmedianbkg.gif"></a> Median <br><B>Figure 7:</B> Temporal filter results<p><p>The example images above are the results of the temporal filters. The background reconstructed by the mean filter shows slightly visible blurring caused by the moving object. As the number of frames in the video increases, this effect becomes negligible. We can further process the mean-filter output with a smoothing function and then a sharpening function to reduce the shadowing effect.<p><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/resultbg.gif"><B><BR>Figure 8:</B> Segmentation results for background subtraction<p><p>We choose the "eyeball test", a metric commonly used by vision researchers, to judge the quality of the segmentation.
Background subtraction produced the best segmentation, with the smoothest edges and the fewest holes within the object. Motion differencing performed the worst: it tends to give an irregular outline of the motion and includes the portion of the background that belonged to the object in the previous image but not in the current one; this effect appears as an undesirable white outline around the object in the right pair of images below. Second differencing improves on regular motion differencing but is still not as solid as background subtraction. Second differencing does have an advantage over background subtraction in that reconstructing the background is not necessary. In all cases some post-filtering is needed to fill in the holes and smooth the edges.<p><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/result2nddiff.gif"><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/white.gif"><img src="http://www.cs.cornell.edu/Info/People/hejik/cs631/resultmdiff.gif"><B><BR>Figure 9:</B> Segmentation results for second differencing and motion differencing<p><p>We set out with a two-fold goal: object removal and object segmentation. The overall quality of object removal depends on the accuracy of the Hausdorff tracker and the fidelity of the reconstructed background.
We feel we have accomplished OTR as long as the background does exist somewhere in the sequence. We have had less success with segmentation, which leaves much room for future improvement.<p><H5><A HREF="#toc"><-- Table of Contents</A></H5><H2><A NAME="HDR9">5.  Related Work and Extensions</A></H2>Multimedia and vision are highly experimental areas embodying numerous possibilities. Although tracking and object segmentation are active areas of research in vision, there appears to be virtually no established work on automating object removal using background reconstruction.<p>We can extend this project along these orthogonal directions:<UL><li>solve object removal for a moving camera, to handle zooms and pans<li>handle subtle problems in object removal such as the object's shadow, reflection, etc.<li>integrate Orwell with a full video editor and add functionality such as allowing the tracker to backtrack to reset the object position<li>refine segmentation with morphological operators.</UL><H5><A HREF="#toc"><-- Table of Contents</A></H5><H2><A NAME="HDR10">6.  References</A></H2><DL><DT><A NAME="REF98469">[1]</A><DD><A HREF="http://cs-tr.cs.cornell.edu:/TR/CORNELLCS:TR92-1320">Tracking Non-Rigid Objects in Complex Scenes</A>. <em>Proceedings of the Fourth International Conference on Computer Vision</em> (1993), 93-101 (with J.J. Noh and W.J. Rucklidge).<DT><A NAME="JAIN">[2]</A><DD>Jain, Kasturi, and Schunck. <em>Machine Vision</em>. McGraw-Hill, 1995.<DT><A NAME="REF53096">[3]</A><DD>Ousterhout, John K. <em>Tcl and the Tk Toolkit</em>. Addison-Wesley, Massachusetts, 1994.<DT><A NAME="REF18623">[4]</A><DD>Swartz, Jonathan and Smith, Brian C.
<A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/tcl-tk-95.ps">RiVL: A Resolution Independent Video Language</A>. Submitted to the 1995 Tcl/Tk Workshop, July 1995, Toronto, Canada.<DT><A NAME="Salient3">[5]</A><DD>Teodosio, L. and Bender, W. Salient Video Stills: Content and Context Preserved. <em>Proc. ACM Multimedia</em>, 1993, pp. 39-46.</DL><H5><A HREF="#toc"><-- Table of Contents</A></H5>
