http:^^www.cs.cornell.edu^info^people^barber^516fin^pcmrivl.html
来自「This data set contains WWW-pages collect」· HTML 代码 · 共 544 行 · 第 1/3 页
HTML
544 行
MIME-Version: 1.0
Server: CERN/3.0
Date: Sunday, 01-Dec-96 18:54:06 GMT
Content-Type: text/html
Content-Length: 28767
Last-Modified: Monday, 06-May-96 23:26:29 GMT
<html><head><title>Fine-Grain Parallel CM RIVL</title><a name="home"><center><h1>Fine-Grain Parallel CM RIVL: A Step Towards Real-Time Multimedia Processing</h1><p><b><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><!WA0><a href="http://www.cs.cornell.edu/Info/People/barber">Jonathan Barber</a> (<!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><!WA1><a href=mailto:barber@cs.cornell.edu>barber@cs.cornell.edu</a>)<br><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><!WA2><a href="http://www.cs.cornell.edu/Info/People/sugata/home.html">Sugata Mukhopadhyay</a> (<!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><!WA3><a href=mailto:sugata@cs.cornell.edu>sugata@cs.cornell.edu</a>)<p> <!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><!WA4><a href="http://www.cs.cornell.edu/Info/Courses/Spring-96/CS516">CS516</a> Final Project<br><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><!WA5><a href="http://www.cs.cornell.edu/Info/People/tve/tve.html">Professor Thorsten von Eicken</a><br><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><!WA6><a href="http://www.cs.cornell.edu/">Department of Computer Science</a><br><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><!WA7><a href="http://www.cornell.edu">Cornell University</a></h2></center></b><p></head><body><hr><hr><h2>0.0 Table of Contents</h2><ul><b><li><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><!WA8><a href="#1.0">1.0 Abstract</a><li><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><!WA9><a href="#2.0">2.0 Introduction</a><li><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><!WA10><a href="#3.0">3.0 RIVL and the Generic Parallel Paradigm</a><ul><li><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><!WA11><a href="#3.1">3.1 The RIVL Graph</a> <li><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><!WA12><a href="#3.2">3.2 Parallelizing RIVL</a> <li><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><!WA13><a href="#3.3">3.3 Continuous Media Parallel RIVL</a></ul><li><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><!WA14><a href="#4.0">4.0 Implementations</a><ul><li><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><!WA15><a href="#4.1">4.1 Shared Memory Implementation</a> <li><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><!WA16><a href="#4.2">4.2 Networked Implementation</a> <li><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><!WA17><a href="#4.3">4.3 Implementation Caveats</a></ul><li><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><!WA18><a href="#5.0">5.0 Performance Results</a><li><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><!WA19><a href="#6.0">6.0 Extensions and Robustness</a><li><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><!WA20><a href="#7.0">7.0 Conclusions</a><li><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><!WA21><a href="#8.0">8.0 References</a><p></b></ul><hr><hr><a name="1.0"><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><!WA22><a href="#home">Go Back</a><h2>1.0 Abstract</h2> Any form of multimedia processing is typically computationally expensive. An even harder problem is performing some form of multimedia processing on multiple real-time continuous streams of data. In such a paradigm, each frame in a sequence of images incurs a very large computational expense. An obvious yet difficult solution is to divide up the problem, and compute the solution in parallel. This paper details the nature of the problems and the solutions for dealing with parallel multimedia processing in both shared memory and distributed environments.<p><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><!WA23><a href="http://www.cs.cornell.edu/Info/People/barber/516fin/presen/index.htm">Click here to view a slide-show presentation of this paper</a><p>.<hr><a name="2.0"><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><!WA24><a href="#home">Go Back</a><h2>2.0 Introduction: The Evolution of RIVL</h2>Over the course of the past two years, a large effort has been mounted to develop applications that can efficiently and reliably process multimedia data. The effort manifested itself with the construction of <b><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><!WA25><a href="http://www.cs.cornell.edu/Info/Projects/zeno/Projects/Rivl.html">RIVL (A Resolution Independent Video Language)</a></b>. RIVL is a multimedia processing package that given a set of images (or a set of a sequence of images), can efficiently process these multimedia streams and generate an outgoing image (or a sequence of images). RIVL is implemented as a tcl extension that is capable of performing common image operations such as overlay, smoothing, clipping, cropping, etc. The tcl interface simplifies the process of coding an image processing script.<p> Recently, RIVL has been extended to process continuous streams of multimedia data, and generate a corresponding output stream of images. The extended RIVL package, called <b><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><!WA26><a href="http://www.cs.cornell.edu/Info/Courses/Fall-95/CS631/final-projects/Integratig-Rivl-and-CMT/final.html">CM RIVL</a></b>, was made possible by treating RIVL evaluation as a midpoint in a continuous media object. This work was facilitated by using <b><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><!WA27><a href="http://www.bmrc.berkeley.edu/projects/cmt/">CMT (The Continuous Media Toolkit)</a></b>.<p> Image processing continuous streams of media in real-time is a very hard problem, considering today's current state of computer technology. Performing even a simple image oper-ation over a single sequence of images, and outputting the resultant image[s] in real-time requires on the order of a million CPU cycles. To approach a real-time image-processing frame-rate of 30 frames per second, which is the standard frame-rate for perceiving continuous motion, would require one of the following items to be true:<p><ul><li>to be able to perform image processing operations in less than linear time on a single processor<br><li>to be able to utilize high-performance technology that does not yet exist<br><li>to be able to divide up the work, and perform the image processing in parallel to achieve less than linear time performance<p> </ul>Since we have little or no control over the first two items, we have focused our efforts on the third. Most image processing routines can be performed in super-linear time if the work is divided among an array[s] of parallel processors. This is true for RIVL, and also for CM RIVL.<p> <b>Bearing this in mind, we established the project goal to develop an easy-to-use, fast, and inexpensive, real-time multimedia processing application.</b><p>In Section 3.0, we describe a generic method for parallelizing most of the image operations in RIVL, by exploiting the way that RIVL processes an inputted set of images. In Section 4.0, we describe two implementations of Parallel CM RIVL (PRIVL). The first version is designed to run on shared memory machines. The second version is designed to run over a cluster of Workstations. In Section 5.0, we present an analysis of performance results. In Section 6.0, we describe some improvements to our implementations. Finally, in Section 7.0, we draw some conclusions and analyze our progress.<p> <hr><a name="3.0"><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><!WA28><a href="#home">Go Back</a><h2>3.0 RIVL and the Generic Parallel Paradigm</h2><a name="3.1"><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><!WA29><a href="#home">Go Back</a><h3>3.1 The RIVL Graph</h3>We begin our discussion of RIVL by introducing the RIVL Evaluation Graph.<p> <!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><!WA30><img src = "http://www.cs.cornell.edu/Info/People/barber/516fin/presen/sld007.gif"><p>In order for RIVL to execute, it requires a set of multimedia input data, and a control RIVL script. The RIVL script is a sequence of tcl-rivl commands that specify what image processing operations should occur on the input data. Once RIVL is invoked, the RIVL script is translated into the RIVL graph, as pictured above. Each node corresponds to some <b>image operator</b> (e.g. im_smooth, im_canny, etc.), and each edge or <b>signal</b> corresponds to the actual image data. Those nodes lying inside of the illustrated rectangle above correspond to true image operators. Those nodes lying outside of the rectangle are the RIVL I/O nodes. The nodes outside and to the left of the rectangle correspond to read nodes (i.e. one read/node per image [or stream]), and the node to right of the rectangle corresponds to the write node.<p> We want to emphasize that construction of the RIVL graph <b>does not compute on any multimedia data</b>. The RIVL graph is merely the control-flow structure through which each inputted sequence of data must propagate to generate the outputted, processed image.<p> There are two phases in processing data using the RIVL graph once it has been constructed. The first phase manifests itself in a graph traversal from <b>right-to-left</b>. This is what makes RIVL an efficient image processing mechanism. The first node that is evaluated is the <b>Write</b> node (the right-most node). By traversing the graph in reverse-order, RIVL decides at each node exactly how much data the output signal requires from the input signal. The evaluation is reverse-propagated from the write node, through the graph, and back to every read node. Once the reverse-propagation completes, every node in the graph knows exactly how much data from each input signal is required to compute the node's corresponding output signal. The multimedia data is then processed on the second traversal, which conforms to a <b>left-to-right</b> traversal of the RIVL graph, propagating the input data forwards through the graph, only operating on data that is relevant to the final output image.<p> <a name="3.2"><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><!WA31><a href="#home">Go Back</a><h3>3.2 Parallelizing RIVL</h3>We can summarize the preceding section into the statement that, the amount of data that is fetched from each <b>Read</b> node is exactly a function of the output of the <b>Write node</b>. Combining this notion with the fact that most of the image processing operations in RIVL do not create dependencies from one pixel to another in a given input image, we can derive a simple for mechanism for "dividing up the work", and parallelizing RIVL.<p> Instead of running RIVL on a single processor, we spawn multiple RIVL processes on different processors, and have each process work towards computing a different segment of the output data. We define the notion of a single <b>Master</b> RIVL process, and multiple <b>slave</b> RIVL processes. Each slave process is started on a different processor. Once started, the slave process sits idle, listening for instructions from a Master process. After the slave processes have been started, a Master process is created. The Master Process determines how many slaves are "available for work". Once a control connection is established between the Master and every Slave, the Master assigns each slave a logical ID# (the Master ID# is 0, the Slave's ID# ranges from 1 to N slaves). After each slave is assigned an ID#, the Master sends each slave the total number of processes "available for work", followed by a copy of the RIVL script. Once each slave (and the master) receives the RIVL script, they each generate a copy of the RIVL graph, and perform the right-to-left traversal independently.<p>The difference between the right-to-left traversal now, is that the logical ID# for the current processor and the total number of processes becomes a factor in determining how much computation gets done for each process.<p>
⌨️ 快捷键说明
复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?