http://www.cs.cornell.edu/Info/People/barber/516fin/pcmrivl.html
<img src="http://www.cs.cornell.edu/Info/People/barber/516fin/presen/sld010.gif"><p>According to the figure above, the amount of data fetched from each read node is no longer a function of the write node's output alone, but is now a function of:<p><ul><li>the process's logical ID#<li>the total number of processes<li>the write node's output<p></ul>That is, each RIVL process is responsible for computing a different, independent portion of the final output data, based on the parameters above. Hence the term "Fine-Grain Parallel CM RIVL": our approach is fine-grained in that each RIVL process performs the same set of computations on different data.<p>Actual data computation (the left-to-right graph traversal) occurs when the master says "go". Each slave process, and the master process, computes its assigned portion of the output image.<p> <a name="3.3"><a href="#home">Go Back</a><h3>3.3 Continuous Media Parallel RIVL</h3><img src="http://www.cs.cornell.edu/Info/People/barber/516fin/presen/sld004.gif"><p>The model of parallelization for RIVL just described maps smoothly to CM RIVL. With CM RIVL, there is an initial setup phase for each slave process and the master process, as previously described (the Master process sends each slave its logical ID#, the total number of processes, and a copy of the RIVL script.
Each RIVL process then computes the RIVL graph and makes the right-to-left traversal).<p> The image processing for computing each output frame in a continuous media stream proceeds as follows:<p><ul><li>A <b>CMO (Continuous Media Object)</b>, which captures and manages continuous streams of data, resides as part of the Master process.<p> <li>When the CMO has captured all of its input data for a single output image, it contacts the Master's parallel synchronization device and tells each RIVL process (the slaves and the Master) that data is ready to be fetched and that computation can begin.<p> <li>Each RIVL process then fetches only the input data it needs to generate its segment of the output data, and makes the left-to-right traversal through the graph.<p> <li>The output data from each RIVL process is then written back to a buffer within the CMO, where it is re-assembled into a single output-data object.<p> <li>Each RIVL process then blocks, listening for further instructions from the CMO as to when another image will be ready for processing.<p> </ul>Using this method, for a given stream of multimedia data, the construction of the RIVL graph and the reverse traversal of the graph are performed only once, at setup time. The actual image processing requires only one traversal of the graph on each RIVL process, with the computation area distributed among all of the RIVL processes.<p><hr><a name="4.0"><a href="#home">Go Back</a><h2>4.0 Implementations</h2>Based on the generic parallelization scheme described in the preceding section, we have developed two implementations of Parallel CM RIVL.
Each implementation has its own synchronization mechanism for coordinating the independent RIVL processes, and its own mechanism for transferring data.<p> <a name="4.1"><a href="#home">Go Back</a><h3>4.1 Shared Memory Implementation</h3><img src="http://www.cs.cornell.edu/Info/People/barber/516fin/presen/sld005.gif"><p>The shared-memory implementation is illustrated above. Each RIVL process resides on a different processor, but all processors reside on the same machine and have access to the same shared-memory segment.<p> This implementation mirrors the generic parallel model described in <b>Section 3</b>.<p>Implementation details:<p><ul> <li>The initial setup is facilitated using IP multicast via Tcl-DP.<p><li>Process synchronization is facilitated using UNIX semaphores.<p><li>Data transfer is facilitated using shared-memory reads and writes via UNIX IPC.<p><li>The program was compiled for a SparcStation running SunOS.<p></ul>This model operates as follows:<p> Following the initial setup phase, the CMO works at capturing all the data necessary to compute a single RIVL output frame. Once the CMO has captured all the necessary data, it tells each RIVL process to begin processing by means of an <b>entry</b> semaphore. Each RIVL process then reads only the data relevant to its own output via a shared-memory read. Once the left-to-right evaluation of the RIVL graph completes, the RIVL process performs a shared-memory write to the memory region, accessible by the CMO, that contains the output image.
The RIVL process then blocks at an <b>exit</b> semaphore until all of the RIVL processes have completed computation for the same frame of data. Once every RIVL process blocks, the master RIVL process un-sets the exit semaphore, and each RIVL process waits again at the entry semaphore until the CMO again releases it.<p><a name="4.2"><a href="#home">Go Back</a><h3>4.2 Networked Implementation</h3><img src="http://www.cs.cornell.edu/Info/People/barber/516fin/presen/sld006.gif"><p>The networked implementation is illustrated above. Each RIVL process resides on a different processor, and each processor resides on a different machine.<p> This implementation also mirrors the generic parallel model described in <b>Section 3</b>.<p>Implementation details:<p><ul><li>The initial setup is again facilitated using IP multicast via <a href="http://www.cs.cornell.edu/Info/Projects/zeno/Projects/Tcl-DP.html">Tcl-DP</a>.<p><li>Data transfer is facilitated using <a href="http://www.cs.cornell.edu/Info/Projects/CAM/">Active Messages</a> over
<a href="http://www.cs.cornell.edu/Info/Projects/U-Net/">U-Net</a>.<p> <li>The synchronization mechanism is implicit in the Active Messages paradigm.<p><li>The program was compiled for a SparcStation running SunOS.<p></ul>This model operates as follows:<p>Like its shared-memory counterpart, this model performs the initial setup using IP multicast to establish the Active Message connections from the Master to each slave RIVL process. The CMO works at capturing all the data necessary to compute a single RIVL output frame. This model differs from the generic model in that the Master process knows exactly what portion of the input data each RIVL process needs to evaluate its RIVL graph. Once the CMO has captured all the necessary data, it tells each RIVL process to begin processing by issuing a <b>gam_store()</b> to each RIVL process. When the message is received by a RIVL process, a handler is invoked which tells the RIVL process that it can begin evaluating its RIVL graph on the transferred data. Once the output data is computed, the RIVL process issues a <b>gam_store()</b> to the Master process, specifying exactly where the sent data should be stored in the final output-image buffer managed by the CMO. A handler routine in the Master process then updates a "received-from" list. Once the Master has received data from every RIVL process, the CMO outputs the computed frame and begins processing the next multimedia frame.<p>The process-synchronization mechanism is implicit in the actual data transfer: a RIVL process cannot begin evaluating its graph on a given frame segment until it receives an Active Message from the Master process.
Similarly, the Master process cannot move on to the next multimedia image until it receives an Active Message from each slave process.<p>Another subtle point is that by having the Master determine how much of the input data each RIVL process requires, rather than having each RIVL process determine this itself, we reduce the round-trip communication between master and slave. Having each RIVL process compute its own region would require a <b>gam_request()</b> from the slave, followed by a <b>gam_reply()</b> from the Master process. Instead, the Master decides how much data each RIVL process needs and simply issues a single <b>gam_store()</b>.<p> <a name="4.3"><a href="#home">Go Back</a><h3>4.3 Implementation Caveats</h3>Our actual executables are not SPMD: there is one executable for the Master process and another for each slave process. This caused no problems when developing the shared-memory implementation. However, since Active Messages ver. 1.1 assumes an SPMD model, we ran into problems when specifying AM handlers in both the Master process and the slave processes.<p> When the Master process received an Active Message from a slave process, the message attempted to invoke an AM handler at a virtual address that was valid in the slave executable but not in the Master's. The situation was the same when a slave process received an Active Message from the Master.<p> We overcame this shortcoming by modifying the Active Messages source code. The modification allows an application to register a handler with Active Messages by calling<p> <center><b>hid uam_reg_handler(handler_t handler)</b></center><p> where <b>handler_t handler</b> corresponds to the handler's virtual address. The call returns an <b>hid</b>, an integer "handler ID#".
In our implementation, since the Master executable and slave executable differ, the Master and each slave must register their handlers with the Active Messages library. Now, when a process sends an Active Message (from slave to Master or vice versa), it no longer ships the sending process's virtual address of the handler; rather, it ships a logical ID# corresponding to the handler to be invoked. The Active Messages library maintains a look-up table that is indexed by the logical ID#. The logical ID# corresponds