http:^^www.cs.wisc.edu^~praveen^projects^seq.html
Date: Mon, 11 Nov 1996 01:56:48 GMT
Server: NCSA/1.5
Content-type: text/html
Last-modified: Mon, 05 Feb 1996 21:52:13 GMT
Content-length: 12061

<HTML><HEAD><TITLE>SEQ Home Page</TITLE></HEAD>
<BODY background="background.gif" TEXT="#000001">
<H1>The SEQ Project: Querying Sequence Data</H1>
<H5>(Document under construction)</H5>
<HR>
<!-- <img src="http://www.cs.wisc.edu/~praveen/projects/control.gif" align=middle> -->
<H3><em>
<img src="http://www.cs.wisc.edu/~praveen/pics/redball.gif">Time to put Order in the Database!<img src="http://www.cs.wisc.edu/~praveen/pics/redball.gif"><p>
<img src="http://www.cs.wisc.edu/~praveen/pics/blueball.gif">Order Time put in the Database!<img src="http://www.cs.wisc.edu/~praveen/pics/blueball.gif"><p>
<img src="http://www.cs.wisc.edu/~praveen/pics/greenball.gif">Time to put the Database in Order!<img src="http://www.cs.wisc.edu/~praveen/pics/greenball.gif">
</em></H3>
<HR>
<H2>Document Contents:</H2>
<H4><UL>
<LI><A href="#Objective">Project Objectives</A>
<LI><A href="#Status">Current Status</A>
<LI><A href="#Example">Motivating Example</A>
<LI><A href="#Data Model">SEQ Data
Model</A>
<LI><A href="#Language"><em>Sequin</em> Query Language</A>
<LI><A href="#Optimization">Optimization Techniques</A>
<LI><A href="#System">SEQ System Development</A>
<LI><A href="#Publications">Publications</A>
<LI><A href="#Related">Related Work</A>
<LI><A href="#Contacts">Contact Information</A>
</UL></H4>
<hr>
<A name="Objective"><H2>Project Objectives</H2></A>
<BLOCKQUOTE><H4>A number of important database applications require the processing of large amounts of ordered <em>sequence</em> data. The domains of these applications include financial management, historical analysis, economic and social sciences, meteorology, medical sciences and biological sciences. Existing relational databases are inadequate in this regard; data collections are treated as sets, not sequences. Consequently, expressing sequence queries is tedious, and evaluating them is inefficient.</H4></BLOCKQUOTE>
<H4><BLOCKQUOTE>Databases should
<UL>
<LI>model the data using the abstraction of <em>sequences</em>,
<LI>allow data sequences to be queried in a <em>declarative manner</em>, utilizing the ordered semantics,
<LI>take advantage of the unique opportunities available for query optimization and evaluation, and
<LI>integrate sequence data with relational data, so that users can store and query a combination of relations and sequences.
</UL>
These requirements serve as the goals of the SEQ project. Various kinds of sequences need to be supported, temporal sequences being the most important kind.
Queries should be expressible using notions like "next" and "previous", which are natural when considering sequences. These queries should be optimized so that they can be evaluated efficiently. These issues need to be studied in theory, and then a database system needs to be built that demonstrates the feasibility of the theoretical ideas.</BLOCKQUOTE></H4>
<hr>
<a name="Status"><H2>Project Status</H2></A>
<H4><BLOCKQUOTE>The current status of the project is:
<ul>
<li>We have defined the <em>SEQ</em> data model, which can support most important kinds of sequence data. We have also defined algebraic query operators that can be composed to form sequence queries (analogous to the composition of relational algebra operators to form relational queries).
<li>We have described how sequence queries can be efficiently processed, and have identified various optimization techniques.
<li>We use a sequence query language, <em>Sequin</em>, that can declaratively express queries over sequences. A <em>Sequin</em> query can include embedded expressions in a relational query language like SQL, or vice versa.
<li>We are building a disk-based database system to demonstrate the feasibility of our proposals. The system implements the <em>SEQ</em> model using a nested complex object architecture. It is built over the SHORE storage manager and can process several megabytes of data. Relations and sequences are supported in an integrated and extensible manner.
</ul>
</BLOCKQUOTE></H4>
<hr>
<a name="Example"><H2>Motivating Example of a Sequence Query</H2></A>
<p><H4><BLOCKQUOTE>A weather monitoring system records information about various meteorological phenomena. There is a sequentiality in the occurrence of these phenomena; the various meteorological events are sequenced by the time at which they are recorded.
A scientist asks the query:
<p><img src="http://www.cs.wisc.edu/~praveen/pics/redball.gif"> <em>"For which volcano eruptions did the most recent earthquake have a strength greater than 7.0 on the Richter scale?"</em>
<p>If this query is to be expressed in a relational query language like SQL, complex features like GROUP BY clauses, correlated subqueries and aggregate functions are required. Further, a conventional relational query optimizer would not find an efficient query execution plan, even given the knowledge that the Earthquakes and Volcano relations are sorted by time.
<p>However, a very efficient plan exists if one models the data as sequences ordered by time. The two sequences can be scanned in lock step (similar to a sort-merge join). The most recent earthquake record scanned can be stored in a temporary buffer. Whenever a volcano record is processed, the most recent earthquake record stored in the buffer is checked to see if its strength was greater than 7.0, possibly generating an answer. This query can therefore be processed with a single scan of the two sequences, using very little memory. The key to such optimization is the sequentiality of the data and the query.</BLOCKQUOTE></H4>
<hr>
<A NAME="Data Model"><H2>Data Model</H2></A>
<H4><BLOCKQUOTE>The details of the <em>SEQ</em> data model are described in a published paper (click <a href="http://www.cs.wisc.edu/~praveen/papers/seq.de95.ps">here</a> for postscript version).
Here we present the gist of it. The basic model of a sequence is a set of records mapped to an ordered domain of "positions". This many-to-many relationship between records and positions can be viewed in two dual but distinct ways: as a set of records mapped to each position, or as a set of positions mapped to each record. These two views are called "Positional" and "Record-Oriented" respectively, and each gives rise to a set of query operators based on that view. Queries on sequences could require operators of either or both flavors. The Record-Oriented operators are similar to relational operators and include various kinds of joins (overlap, containment, etc.) and aggregates. Such operators have been extensively explored by researchers in the temporal database community.
<p><img src="http://www.cs.wisc.edu/~praveen/projects/sequence.gif" ALT="(Picture of Sequence Mapping)">
<p>The Positional operators include Next, Previous, Offset, Moving Aggregates, etc. Further operators allow "zooming" operations on sequences by means of collapsing and expanding the ordering domains associated with the sequence. For instance, a daily sequence could be "zoomed out" (i.e. collapsed) to a weekly sequence, or "zoomed in" (i.e. expanded) to an hourly sequence.
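<p>As a rough illustration only (this is not code from the SEQ system), the Python sketch below shows two of the ideas described above. The first function is the lock-step scan from the motivating example, which keeps the most recent earthquake in a one-record buffer. The second is a "zoom out" that collapses daily positions into weekly ones. The record types <code>Quake</code> and <code>Eruption</code>, the function names, the integer positions, the rule that an earthquake at the same time as an eruption counts as "most recent", and the use of an average when collapsing are all assumptions made for this example.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional, Tuple


class Quake(NamedTuple):
    time: int        # position in the ordering domain (hypothetical integer timestamp)
    strength: float  # magnitude on the Richter scale


class Eruption(NamedTuple):
    time: int
    name: str


def eruptions_after_strong_quake(
    quakes: Iterable[Quake],
    eruptions: Iterable[Eruption],
    threshold: float = 7.0,
) -> Iterator[Eruption]:
    """Yield eruptions whose most recent earthquake was stronger than threshold.

    Both inputs are assumed to be sorted by time, so each is read exactly once.
    """
    quake_iter = iter(quakes)
    pending: Optional[Quake] = next(quake_iter, None)  # next unread earthquake
    latest: Optional[Quake] = None                     # the one-record buffer
    for eruption in eruptions:
        # Advance the earthquake scan up to (and including) the eruption's time.
        while pending is not None and pending.time <= eruption.time:
            latest = pending
            pending = next(quake_iter, None)
        if latest is not None and latest.strength > threshold:
            yield eruption


def zoom_out(
    daily: Iterable[Tuple[int, float]], days_per_bucket: int = 7
) -> List[Tuple[int, float]]:
    """Collapse (day, value) pairs to coarser positions, averaging each bucket.

    Averaging is an assumption; any aggregate could be used when collapsing.
    """
    buckets = {}
    for day, value in daily:
        buckets.setdefault(day // days_per_bucket, []).append(value)
    return [(pos, sum(vals) / len(vals)) for pos, vals in sorted(buckets.items())]


if __name__ == "__main__":
    quakes = [Quake(1, 6.5), Quake(4, 7.8), Quake(9, 5.0)]
    eruptions = [Eruption(0, "A"), Eruption(2, "B"), Eruption(5, "C"), Eruption(10, "D")]
    print([e.name for e in eruptions_after_strong_quake(quakes, eruptions)])
    print(zoom_out([(0, 1.0), (1, 3.0), (7, 5.0)]))
```

<p>In this sketch the earthquake iterator never moves backwards, so the scan touches each record once and holds only a single earthquake in memory, which matches the single-scan, small-buffer plan described in the motivating example.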