http:^^www.cs.cornell.edu^info^projects^zeno^rivl^rivl.html
来自「This data set contains WWW-pages collect」· HTML 代码 · 共 343 行 · 第 1/3 页
HTML
343 行
set flowers [im_read flowers.jpg] im_blur! flowers 5.0 im_overlay $tiger $flowers im_write out.jpg</PRE>Figure 4 shows the graph created by this program. Both im_scaleC and im_rotateC are implemented with a pair of translations surrounding an im_scale or im_rotate; the translations move the origin of the scale and rotate operations to the center of the image.<p><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><!WA36><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.4.gif"><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><!WA37><img src="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.4.gif"><B><A NAME="REF69077"><BR></a>Figure 4:</B> Sample image graph<p></A><p>A call to im_write triggers the graph-evaluation mode. In principle, the Rivl interpreter traverses the graph from inputs to output, computing intermediate images until the output image is computed. But before computing the images, the interpreter perform several optimizations. Two optimizations, graph restructuring and result-region calculation, are described below.<p><strong>Graph restructuring.</strong> The first optimization modifies the graph so its output is equivalent, but the computation is more efficient. Such modifications include combining or swapping adjacent nodes. For example, figure 4 contains six adjacent affine transformations. Rivl collapses these nodes into a single affine transform (figure 5a). This optimization improves both the computation speed and the quality of the final image by reducing the number of times the image is resampled.<p><strong>Result-region calculation.</strong> The second optimization, introduced by Shantzis[<!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><!WA38><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF67492">10</A>], is to compute only those regions of each intermediate image that affect the final result. In our example program, only a small portion of flowers is visible in the final result (figure 5b, on right). It is sufficient to read in and blur only this portion.<p>Rivl calculates 3 regions for intermediate images (i.e. for every edge in the DAG): the have region, the need region, and the result region. The have region at an edge is the region of pixels provided by the edge's left node(s). The need region at an edge is the region of pixels needed by the edge's right node(s). Finally, the result region is the intersection of have and need. This intersection contains all the pixels that both have defined values and affect the final output image. Only the pixels inside the result region need to be computed. Figure 5b shows the regions computed for each intermediate image in our example graph. In particular, only a small region is calculated for each of the two lower images.<p>Following these optimizations, the Rivl interpreter computes the graph's result image, writes the image to disk, and returns to processing commands in graph-construction mode.<p><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><!WA39><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.14.gif"><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><!WA40><img src="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.14.gif"><B><A NAME="REF53357"><BR></a>Figure 5:</B> (a) Restructured image graph / (b) Result regions<p></A><p><H5><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><!WA41><A HREF="#HDR6"><-- The Rivl Interpreter</A></H5><H3><A NAME="HDR8">3. 2 Implementation of Sequences</A></H3>Our implementation of sequences borrows many ideas from our implementation of images, since many still-image optimizations prove especially beneficial for sequence computing.<p>We will use scrolling titles as a sample task to motivate this section. The following program adds scrolling credits to the last 40 seconds of a 240 second (4 minute) movie:<p><PRE> set credits [ims_to_seq [im_mask [im_read credits.ps]]] seq_map! credits {im_trans %1 0.0 [expr -%p*8000]} seq_scale! credits 40.0 seq_shift! credits [expr 200.0] set raiders [seq_read raiders.mpg] set outseq [seq_map "$credits $raiders" {im_overlay %1 %2}] seq_write $outseq out.mpg</PRE><A NAME="REF62381">Program 2: Adding scrolling credits to the end of a movie</A><p>The titles are stored as a Postscript program that generates a long image (640x8000). The im_mask function makes the titles' background transparent. The program converts the image to a one-second sequence (credits), scrolls credits upwards over time using seq_map, and scales and shifts credits to the desired time range (200-240 sec). The final seq_map overlays the titles onto the raiders movie, and the result is written to the file out.mpg.<p>Like image commands, sequence commands are stored in a graph (called the sequence graph) until the sequence is computed (e.g. in response to a seq_write command). Figure 6 shows the sequence graph for program <!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><!WA42><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF62381">2</A>, in which each node corresponds with one line of the program. The sequence graph is used to generate a set of image graphs that correspond to the sequence's individual frames.<p><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><!WA43><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.17.gif"><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><!WA44><img src="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.17.gif"><B><A NAME="REF69077"><BR></a>Figure 6:</B> Sequence graph for scrolling titles<p></A><p>Suppose we want to compute the frame in the output sequence at time T. We perform two passes over the sequence graph, a backward pass and a forward pass. In the backward pass, we compute a timestamp for each edge. An edge's timestamp indicates the time value at that edge that influences output frame T. As we traverse the graph, each seq_scale and seq_shift node we encounter potentially alters the timestamp. The top of figure 7 shows the timestamps computed for T=220.0.<p>In the forward pass, we build an image graph corresponding to output frame T. As we traverse the sequence graph, each seq_read and seq_map node we encounter adds node(s) to the image graph. seq_read uses the timestamp to determine which frame to read; seq_map uses the timestamp when substituting values for %p and %t. The bottom of figure 7 shows the image graph computed for T = 220.0.<p>To compute the whole sequence, we repeat the image graph generation algorithm for all relevant output times T, at increment of 1/fps, where fps is the desired frame rate (in frames/sec) of the output sequence. For program 2, T ranges from 0.0 to 240.0. The resulting graphs are merged into a single compound image graph, as shown in figure 8.<p><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><!WA45><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.31.gif"><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><!WA46><img src="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.31.gif"><B><A NAME="REF28898"><BR></a>Figure 7:</B> Generating the image graph for t = 220.0<p></A><p><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><!WA47><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.6.gif"><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><!WA48><img src="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.6.gif"><B><A NAME="REF69077"><BR></a>Figure 8:</B> Image graph for entire sequence<p></A><p>The optimizations of section 3.1 are used to process the compound image graph to produce the output images, along with two additional optimizations: image subgraph reuse and direct-transfer detection.<p><strong>Image subgraph reuse.</strong> In figure 8 the subgraph containing im_read(credits.ps) and im_mask is replicated many times. It is more efficient to use a single subgraph with multiple output edges, as shown in figure 9. In this way the pixels of credit.ps are read and masked only once, and the various im_trans nodes share a common input. In general, Rivl detects and merges redundant image subgraphs whenever possible, a form of common subexpression elimination[<!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><!WA49><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF11216">1</A>].<p><strong>Direct-transfer detection.</strong> In our example, the first 200 seconds of raiders.mpg appear unchanged in the output. An obvious optimization is to avoid unnecessary decompression and compression by copying the compressed data directly to the output.<p>In formats such as MPEG, direct copying is not always possible on every frame, since MPEG sequences contain frames that are encoded as differences from other frames and cannot be decoded in isolation. However, MPEG streams are often divided into groups of pictures (GOPs), usually 15 to 30 frames long, that are independent from other GOPs. When reading and writing MPEG, Rivl transfers groups of pictures directly whenever possible.<p><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><!WA50><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.25.gif"><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><!WA51><img src="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.25.gif"><B><A NAME="REF69077"><BR></a>Figure 9:</B> Image graph for entire sequence with shared subgraph<p></A><p><H5><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><!WA52><A HREF="#HDR6"><-- The Rivl Interpreter</A></H5><H3><A NAME="HDR9">3. 3 Memory Management</A></H3>In addition to optimizing image and sequence calculation, the Rivl interpreter contains a custom memory management module to cache previously computed images and cope with very large images.<p>To understand the utility of caching images, consider the evaluation of the graph in figure 9. The output of the im_mask node is used many times; it is advantageous to cache this image. The Rivl memory manager detects this case, freeing each image only when it is no longer needed in the current graph evaluation.<p>Another issue is whether to store images after a graph evaluation ends. Interactive applications of Rivl often require repeated evaluations of a slightly changing graph. Because the language restricts the way image graphs can be modified, the image associated with an edge remains accurate for the lifetime of the graph and can be cached. Unfortunately, we have no special knowledge about which images to cache for future graph evaluations. In principle, the user can access any edge that was ever created. Mistakenly discarding data is nonfatal, since we can always recompute the data, but such mistakes hurt performance.<p>To address this issue, Rivl provides an im_priority command to allow applications to set the priority of an image. The memory manager discards low priority images and keeps high priority images in memory. For instance, a video editor built using Rivl calls im_priority to raise the priority of displayed images, so that the results of special effects can be quickly viewed. We are looking into algorithms and heuristics for automatic priority adjustment. For example, images generated by expensive operations, and images that have been referenced repeatedly in the past, are candidates for high priority.<p>The initial implementation of Rivl treated images as indivisible memory buffers. Unfortunately, this representation performed poorly for large images. The Rivl memory manager divides large images into non-overlapping pages of manageable size (figure 10a). Pages are handled as independent entities by the memory manager, allowing an image to be cached in parts. In addition, large images with considerable blank space are efficiently represented by a set of non-contiguous pages (figure 10b).<p><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><!WA53><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.38.gif"><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><!WA54><img src="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm95.figure.id.38.gif"><B><A NAME="REF53357"><BR></a>Figure 10:</b> (a) Dividing large image / (b) Representing sparse image<p></A><p>To illustrate the utility of Rivl's memory management policy, we consider the execution of the scrolling titles program (program 2) under a standard memory model in which the entire image is read into a virtual memory buffer for the duration of the program. Assuming a 256 color image, this requires 5 MB of storage. In contrast, Rivl accomplishes the task as follows:<p><OL><LI>The 640x8000 title region is divided into ten equally sized pages (given Rivl's current maximum page size.)<BR><LI>Rivl allocates, loads, and masks each page of data only when necessary. The results of each call to im_mask are cached for future requests.<BR><LI>Rivl discards pages as soon as they are no longer needed.<BR></OL>The memory footprint is 1 MB, enough for Rivl to hold two pages at once.<p>In summary, the Rivl interpreter uses a variety of strategies to optimize execution of Rivl programs. They are:<p><UL><LI>Graph restructuring: Combining or reordering nodes in the graph for speed<BR><LI>Result-region calculation: Computing only the parts of an image that affect the output<BR><LI>Direct transfer detection: Copying compressed data directly to the output when possible<BR><LI>Image subgraph reuse: Sharing common subexpressions in the image graph<BR><LI>Image caching: Caching images if they are needed later in the graph evaluation<BR><LI>Image subdivision: Dividing large images into manageable pieces<BR></UL><H5><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><!WA55><A HREF="#HDR6"><-- The Rivl Interpreter</A></H5><H5><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><!WA56><A HREF="#toc"><-- Table of Contents</A></H5><H2><A NAME="HDR10">4. Related and Future Work</A></H2>Many commercial packages are available that provide software libraries of image manipulation functions. Some use demand-driven execution to achieve similar optimizations as those mentioned in section 3. These include the Pixar system described by Shantzis<!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><!WA57><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF67492">[10]</A>, and Silicon Graphics' ImageVisionLibrary<!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><!WA58><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF93885">[13]</A>. Holzmann's Popi<!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><!WA59><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF73579">[4]</A> allows image transformations to be specified with concise expressions at run-time, a mechanism that permits rapid prototyping of new image primitives. We are adapting this idea to Rivl. None of the above mentioned systems provide language support for motion video.<p>Some systems (e.g., Data Explorer<!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><!WA60><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF98469">[2]</A> or Khoros<!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><!WA61><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF36403">[12]</A>) provide a graphical programming environment where image "programs" are expressed as flowcharts. Although this way of expressing image operations is an improvement over pixel manipulation, the limitations of flowcharts for expressing complex programs are well-known. Furthermore, the support for motion video operations in these systems is limited or non-existent.<p>Matthews, Gloor, and Makedon's VideoScheme <!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><!WA62><A HREF="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/mm-95.html#REF98611">[6]</A> combines Apple's QuickTime movie player with a Scheme-based video manipulation language. In VideoScheme the user works with objects close to the underlying implementation of video data, such as pixel arrays and frames. This low-level access gives users considerable flexibility in creating new image operations. For example, algorithms for detecting "cuts" in video can be easily built out of pixel array primitives. In contrast, Rivl's high level of abstraction allows it to exploit delayed computation for improved efficiency, and its resolution independence makes programs more portable.<p>Rivl is implemented with 4000 lines of C code and 500 lines of Tcl code. It has been ported to the Sun OS, HPUX, and Linux operating systems. Rivl has been used to build a simple video editor. Rivl and its editor can be found at <!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><!WA63><a href="http://www.cs.cornell.edu/Info/Projects/zeno/rivl/rivl.html">http://www.cs.cornell.edu/Info/Projects/zeno/rivl/rivl.html.</a><p>The Rivl language is still evolving. We are extending the core set of Rivl primitives to support other types of video processing, such as image analysis, computer vision, and morphing. With the right primitives, we hope to build a rapid prototyping environment for exploring video content processing.<p>We are also building a parallel implementation of the Rivl interpreter using workstation clusters. In this implementation, a Rivl program will run quickly using low resolution images on a small cluster, slowly using high resolution images on a small cluster, and quickly using high resolution images on a large cluster. The interpreter will automatically parallelize the Rivl program, using both coarse grained parallelism (one image / one process) and fine grained parallelism (one image / multiple processes).<p><H5><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><!WA64><A HREF="#toc"><-- Table of Contents</A></H5><H2><A NAME="HDR11">5. References</A></H2><DL><DT><A NAME="REF11216">[1] </A><DD>Aho, Sethi and Ullman, Compilers: Principles, Techniques, and Tools, Reading, Mass. Addison-Wesley, 1988, pp. 592-595<DT><A NAME="REF98469">[2] </A><DD>Data Explorer software package. IBM.<DT><A NAME="REF25762">[3] </A><DD>J. D. Foley, et. al., Computer Graphics: Principles and Practice, second edition, Reading, Mass. Addison-Wesley, 1990.<DT><A NAME="REF73579">[4] </A><DD>Holzmann, Gerald J. Popi. AT&T Bell Laboratories. Murray Hill, NJ.<DT><A NAME="REF84725">[5] </A><DD>D. Le Gall, MPEG: a video compression standard for multimedia applications, Communications of the ACM, April 1991, Vol. 34, Num.4, pp. 46-58.<DT><A NAME="REF98611">[6] </A><DD>Matthews, James, Peter Gloor, and Fillia Makedon. "VideoScheme: A Programmable Video Editing System for Automation and Media Recognition." ACM Multimedia `93 Proceedings, pp. 419-426.<DT><A NAME="REF68172">[7] </A><DD>W. B. Pennebaker, JPEG still image data compression standard, Van Nos and Reinhold, New York, 1992.<DT><A NAME="REF31620">[8] </A><DD>Poskanzer, Jef. The Extended Portable Bitmap Toolkit (PBMPLUS).<DT><A NAME="REF59020">[9] </A><DD>Postscript. Adobe Systems Incorporated, Mountain View, CA.<DT><A NAME="REF67492">[10] </A><DD>Shantzis, Michael. "A Model for Efficient and Flexible Image Computing." SIGGRAPH `94 Proceedings, pp. 147-154.<DT><A NAME="REF53096">[11] </A><DD>Ousterhout, John K. Tcl and the Tk Toolkit. Addison-Wesley, Massachusetts, 1994.<DT><A NAME="REF36403">[12] </A><DD>Rasure and Kubica, "The Khoros Application Development Environment", Experimental Environments for Computer Vision and Image Processing, editor H.I Christensen and J.L Crowley, World Scientific 1994. <DT><A NAME="REF93885">[13] </A><DD>Silicon Graphics Inc. ImageVision Library. Silicon Graphics Inc., Mountain View, CA.<DT><A NAME="REF18623">[14] </A><DD>Swartz, Jonathan and Smith, Brian C. RIVL: A Resolution Independent Video Language. Submitted to the 1995 Tcl/Tk Workshop, July 1995, Toronto, CA. http://www.cs.cornell.edu/Info/Projects/multimedia/rivl/tcl-tk-95.ps</DL><H5><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><!WA65><A HREF="#toc"><-- Table of Contents</A></H5><HR><h3>Footnotes</h3><DL COMPACT><DT><A NAME=FN1>(1)</A><DD>The Rivl image type is unrelated to the Tcl/Tk[11] canvas image type.</DL><A NAME="ENDFILE"><PRE> </PRE></A>
⌨️ 快捷键说明
复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?