
📄 sigmod_2004_elementary.txt

📁 Written using lwp::get
<proceedings><paper><title>The next database revolution</title><author><AuthorName>Jim Gray</AuthorName><institute><InstituteName>Microsoft, San Francisco, CA</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation></citation><abstract>Database system architectures are undergoing revolutionary changes. Most importantly, algorithms and data are being unified by integrating programming languages with the database system. This gives an extensible object-relational system where non-procedural relational operators manipulate object sets. Coupled with this, each DBMS is now a web service. This has huge implications for how we structure applications. DBMSs are now object containers. Queues are the first objects to be added. These queues are the basis for transaction processing and workflow applications. Future workflow systems are likely to be built on this core. Data cubes and online analytic processing are now baked into most DBMSs. Beyond that, DBMSs have a framework for data mining and machine learning algorithms. Decision trees, Bayes nets, clustering, and time series analysis are built in; new algorithms can be added. There is a rebirth of column stores for sparse tables and to optimize bandwidth. Text, temporal, and spatial data access methods, along with their probabilistic reasoning have been added to database systems. Allowing approximate and probabilistic answers is essential for many applications. Many believe that XML and xQuery will be the main data structure and access pattern. Database systems must accommodate that perspective. External data increasingly arrives as streams to be compared to historical data; so stream-processing operators are being added to the DBMS. Publish-subscribe systems invert the data-query ratios; incoming data is compared against millions of queries rather than queries searching millions of records. 
Meanwhile, disk and memory capacities are growing much faster than their bandwidth and latency, so the database systems increasingly use huge main memories and sequential disk access. These changes mandate a much more dynamic query optimization strategy - one that adapts to current conditions and selectivities rather than having a static plan. Intelligence is moving to the periphery of the network. Each disk and each sensor will be a competent database machine. Relational algebra is a convenient way to program these systems. Database systems are now expected to be self-managing, self-healing, and always-up. We researchers and developers have our work cut out for us in delivering all these features.</abstract></paper><paper><title>The role of cryptography in database security</title><author><AuthorName>Ueli Maurer</AuthorName><institute><InstituteName>ETH Zurich, Zurich, Switzerland</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>Michael Ben-Or , Shafi Goldwasser , Avi Wigderson, Completeness theorems for non-cryptographic fault-tolerant distributed computation, Proceedings of the twentieth annual ACM symposium on Theory of computing, p.1-10, May 02-04, 1988, Chicago, Illinois, United States</name><name>R. Canetti. Security and composition of multi-party cryptographic protocols. Journal of Cryptology, vol. 13, no. 1, pp. 143--202, 2000.</name><name>Silvana Castano , Maria Grazia Fugini , Giancarlo Martella , Pierangela Samarati, Database security, ACM Press/Addison-Wesley Publishing Co., New York, NY, 1995</name><name>B. Chor , O. Goldreich , E. Kushilevitz , M. 
Sudan, Private information retrieval, Proceedings of the 36th Annual Symposium on Foundations of Computer Science (FOCS'95), p.41, October 23-25, 1995</name><name>David Chaum , Claude Cr&#233;peau , Ivan Damgard, Multiparty unconditionally secure protocols, Proceedings of the twentieth annual ACM symposium on Theory of computing, p.11-19, May 02-04, 1988, Chicago, Illinois, United States</name><name>O. Goldreich , S. Micali , A. Wigderson, How to play ANY mental game, Proceedings of the nineteenth annual ACM conference on Theory of computing, p.218-229, January 1987, New York, New York, United States</name><name>M. Hirt and U. Maurer. Player simulation and general adversary structures in perfect multi-party computation. Journal of Cryptology, vol. 13, no. 1, pp. 31--60, 2000.</name><name>Ueli M. Maurer, Cryptography 2000&#177;10, Informatics  - 10 Years Back. 10 Years Ahead., p.63-85, January 2001</name><name>U. Maurer. Secure multi-party computation made simple. Security in Communication Networks (SCN'02), G. Persiano (Ed.), Lecture Notes in Computer Science, Springer-Verlag, vol. 2576, pp. 14--28, 2003.</name><name>Alfred J. Menezes , Scott A. Vanstone , Paul C. Van Oorschot, Handbook of Applied Cryptography, CRC Press, Inc., Boca Raton, FL, 1996</name><name>B. Pfitzmann, M. Schunter, and M. Waidner. Secure Reactive Systems. IBM Research Report RZ 3206, Feb. 14, 2000.</name><name>B. Schneier. Applied Cryptography. Wiley, 2nd edition, 1996.</name><name>A. C. Yao. Protocols for secure computations. Proc. 23rd IEEE Symposium on the Foundations of Computer Science (FOCS), pp. 160--164. IEEE, 1982.</name></citation><abstract>In traditional database security research, the database is usually assumed to be trustworthy. Under this assumption, the goal is to achieve security against external attacks (e.g. from hackers) and possibly also against users trying to obtain information beyond their privileges, for instance by some type of statistical inference. 
However, for many database applications such as health information systems there exist conflicting interests of the database owner and the users or organizations interacting with the database, and also between the users. Therefore the database cannot necessarily be assumed to be fully trusted. In this extended abstract we address the problem of defining and achieving security in a context where the database is not fully trusted, i.e., when the users must be protected against a potentially malicious database. Moreover, we address the problem of the secure aggregation of databases owned by mutually mistrusting organisations, for example by competing companies.</abstract></paper><paper><title>Adaptive stream resource management using Kalman Filters</title><author><AuthorName>Ankur Jain</AuthorName><institute><InstituteName>University of California, Santa Barbara, CA</InstituteName><country></country></institute></author><author><AuthorName>Edward Y. Chang</AuthorName><institute><InstituteName>University of California, Santa Barbara, CA</InstituteName><country></country></institute></author><author><AuthorName>Yuan-Fang Wang</AuthorName><institute><InstituteName>University of California, Santa Barbara, CA</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>D. Abadi , D. Carney , U. &#199;etintemel , M. Cherniack , C. Convey , C. Erwin , E. Galvez , M. Hatoun , A. Maskey , A. Rasin , A. Singer , M. Stonebraker , N. Tatbul , Y. Xing , R. Yan , S. Zdonik, Aurora: a data stream management system, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>A. Arasu, B. Babcock, S. Babu, M. Datar, K. Ito, R. Motwani, I. Nishizawa, U. Srivastava, D. Thomas, R. Varma, and J. Widom. STREAM: The Stanford stream data manager. 
IEEE Data Engineering Bulletin, 26:19--26, March 2003.</name><name>Arvind Arasu , Brian Babcock , Shivnath Babu , Jon McAlister , Jennifer Widom, Characterizing memory requirements for queries over continuous data streams, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin</name><name>B. Babcock, S. Babu, M. Datar, R. Motwani, and D. Thomas. Operator scheduling in data stream systems. Technical report, Stanford University, CA, USA, October 2003.</name><name>Brian Babcock , Shivnath Babu , Mayur Datar , Rajeev Motwani , Jennifer Widom, Models and issues in data stream systems, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin</name><name>Brian Babcock , Mayur Datar , Rajeev Motwani, Load Shedding for Aggregation Queries over Data Streams, Proceedings of the 20th International Conference on Data Engineering, p.350, March 30-April 02, 2004</name><name>S. Babu, U. Srivastava, and J. Widom. Exploiting k-constraints to reduce memory overhead in continuous queries over data streams. Technical report, Stanford University, CA, USA, November 2003.</name><name>Aggelos Bletsas, Evaluation of Kalman Filtering for Network Time Keeping, Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, p.289, March 23-26, 2003</name><name>R. F. Boisvert, B. Miller, R. Pozo, K. Remington, J. Hicklin, C. Moler, and P. Webb. JAMA: A java matrix package.</name><name>R. G. Brown. Introduction to Random Signal Analysis and Kalman Filtering. Wiley, New York, NY, USA, 1983.</name><name>A. Bulut and A. K. Singh. SWAT: Hierarchical stream summarization in large networks. In Proceedings of the ICDE Intl. Conf. on Data Engineering, pages 303--314, Bangalore, India, March 2003.</name><name>S. Chandrasekaran. Telegraph CQ: Continuous dataflow processing for an uncertain world. 
In Proceedings of the CIDR Conf. on Innovative Data Systems Research, Asilomar, CA, USA, January 2003.</name><name>Reynold Cheng , Dmitri V. Kalashnikov , Sunil Prabhakar, Evaluating probabilistic queries over imprecise data, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>R. Clarke, J. Waddington, and J. N. Wallace. The application of Kalman filtering to the load/pressure control of coal-fired boilers. In IEE Colloquium on Kalman Filters: Introduction, Applications and Future Developments, volume 27, pages 2/1--2/6, London, UK, February 1989.</name><name>Richard O. Duda , Peter E. Hart , David G. Stork, Pattern Classification (2nd Edition), Wiley-Interscience, 2000</name><name>Lukasz Golab , M. Tamer &#214;zsu, Issues in data stream management, ACM SIGMOD Record, v.32 n.2, p.5-14, June 2003</name><name>Zachary G. Ives , Daniela Florescu , Marc Friedman , Alon Levy , Daniel S. Weld, An adaptive query execution system for data integration, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.299-310, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>R. E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME-Journal of Basic Engineering, 82 (Series D):35--45, March 1960.</name><name>I. Lazaridis and S. Mehrotra. Capturing sensor-generated time series with quality guarantees. In Proceedings of the ICDE Intl. Conf. on Data Engineering, pages 429--420, Bangalore, India, March 5--8 2003.</name><name>P. S. Maybeck. Stochastic Models, Estimation, and Control, volume 1. Academic Press, New York, NY, USA, 1979.</name><name>R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma. Query processing, resource management, and approximation in a data stream management system. In Proceedings of the CIDR Conf. 
on Innovative Data Systems Research, Asilomar, California, USA, January 2003.</name><name>Basic generation services data room, http://www.bgs-auction.com/bgs.dataroom.asp. Newark, NJ, 2003.</name><name>Chris Olston , Jing Jiang , Jennifer Widom, Adaptive filters for continuous queries over distributed data streams, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>Chris Olston , Boon Thau Loo , Jennifer Widom, Adaptive precision setting for cached approximate values, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.355-366, May 21-24, 2001, Santa Barbara, California, United States</name><name>C. Pereira, S. Gupta, K. Niyogi, I. Lazaridis, S. Mehrotra, and R. Gupta. Energy efficient communication for reliability and quality aware sensor networks. Technical report, University of California at Irvine and University of California at San Diego, April 2003.</name><name>V. Raghunathan, C. Schurgers, S. Park, and M. Srivastava. Energy aware wireless microsensor networks. IEEE Signal Processing Magazine, 19(2):40--50, March 2002.</name><name>Tajana Simunic , Haris Vikalo , Peter Glynn , Giovanni De Micheli, Energy efficient design of portable wireless systems, Proceedings of the 2000 international symposium on Low power electronics and design, p.49-54, July 25-27, 2000, Rapallo, Italy</name><name>G. Strang. Introduction to Applied Mathematics. Wellesley-Cambridge Press, Wellesley, MA, USA, 1986.</name><name>N. Tatbul, U. Cetintemel, S. Zdonik, M. Cherniack, and M. Stonebraker. Load shedding in a data stream manager. In Proceedings of VLDB Intl. Conf. 
on Very Large Data Bases, pages 309--320, Berlin, Germany, September 2003.</name><name>The internet traffic archive, http://ita.ee.lbl.gov. Lawrence Berkeley National Laboratory, USA, April 2000.</name><name>G. Welch and G. Bishop. An introduction to the Kalman filter. In ACM SIGGRAPH Intl. Conf. on Computer Graphics and Interactive Techniques, Los Angeles, CA, USA, August 2001.</name><name>Gang Wu , Yi Wu , Long Jiao , Yuan-Fang Wang , Edward Y. Chang, Multi-camera spatio-temporal fusion and biased sequence-data learning for security surveillance, Proceedings of the eleventh ACM international conference on Multimedia, November 02-08, 2003, Berkeley, CA, USA</name><name>W. Wu, M. J. Black, E. B. Y. Gao, M. Serruya, A. Shaikhouni, and J. P. Donoghue. Neural decoding of cursor motion using a Kalman filter. In Neural Information Processing Systems: Natural and Synthetic, pages 133--140, Vancouver, British Columbia, Canada, December 2002.</name><name>Y. Yao and J. Gehrke. Query processing for sensor networks. In Proceedings of the CIDR Conf. on Innovative Data Systems Research, Asilomar, CA, USA, January 2003.</name><name>Online Data Mining for Co-Evolving Time Sequences, Proceedings of the 16th International Conference on Data Engineering, p.13, February 28-March 03, 2000</name></citation><abstract>To answer user queries efficiently, a stream management system must handle continuous, high-volume, possibly noisy, and time-varying data streams. One major research area in stream management seeks to allocate resources (such as network bandwidth and memory) to query plans, either to minimize resource usage under a precision requirement, or to maximize precision of results under resource constraints. To date, many solutions have been proposed; however, most solutions are ad hoc with hard-coded heuristics to generate query plans. 
In contrast, we perceive stream resource management as fundamentally a filtering problem, in which the objective is to filter out as much data as possible to conserve resources, provided that the precision standards can be met. We select the Kalman Filter as a general and adaptive filtering solution for conserving resources. The Kalman Filter has the ability to adapt to various stream characteristics, sensor noise, and time variance. Furthermore, we realize a significant performance boost by switching from traditional methods of caching static data (which can soon become stale) to our method of caching dynamic procedures that can predict data reliably at the server without the clients' involvement. In this work we focus on minimization of communication overhead for both synthetic and real-world streams. Through examples and empirical studies, we demonstrate the flexibility and effectiveness of using the Kalman Filter as a solution for managing trade-offs between precision of results and resources in satisfying stream queries.</abstract></paper><paper><title>Online event-driven subsequence matching over financial data streams</title><author><AuthorName>Huanmei Wu</AuthorName><institute><InstituteName>Northeastern University, Boston, MA</InstituteName><country></country></institute></author><author><AuthorName>Betty Salzberg</AuthorName><institute><InstituteName>Northeastern University, Boston, MA</InstituteName><country></country></institute></author><author><AuthorName>Donghui Zhang</AuthorName><institute><InstituteName>Northeastern University, Boston, MA</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A Framework for Clustering Evolving Data Streams. VLDB, pages 81--92, 2003.</name><name>Rakesh Agrawal , Christos Faloutsos , Arun N. 
Swami, Efficient Similarity Search In Sequence Databases, Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms, p.69-84, October 13-15, 1993</name><name>Shivnath Babu , Jennifer Widom, Continuous queries over data streams, ACM SIGMOD Record, v.30 n.3, September 2001</name><name>Daniel Barbar&#225; , Ping Chen, Using the fractal dimension to cluster datasets, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, p.260-264, August 20-23, 2000, Boston, Massachusetts, United States</name><name>Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States</name><name>J. A. Bollinger. Bollinger on Bollinger Bands. McGraw-Hill, first edition, 2001.</name><name>Tolga Bozkaya , Meral Ozsoyoglu, Distance-based indexing for high-dimensional metric spaces, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.357-368, May 11-15, 1997, Tucson, Arizona, United States</name><name>K.-P. Chan and A.-C. Fu. Efficient Time Series Matching by Wavelets. 
ICDE, pages 126--133, 1999.</name><name>Tzi-cker Chiueh, Content-Based Image Indexing, Proceedings of the 20th International Conference on Very Large Data Bases, p.582-593, September 12-15, 1994</name><name>Paolo Ciaccia , Marco Patella , Pavel Zezula, M-tree: An Efficient Access Method for Similarity Search in Metric Spaces, Proceedings of the 23rd International Conference on Very Large Data Bases, p.426-435, August 25-29, 1997</name><name>Gautam Das , Dimitrios Gunopulos , Heikki Mannila, Finding Similar Time Series, Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery, p.88-100, June 24-27, 1997</name><name>Mayur Datar , Aristides Gionis , Piotr Indyk , Rajeev Motwani, Maintaining stream statistics over sliding windows: (extended abstract), Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, p.635-644, January 06-08, 2002, San Francisco, California</name><name>Christos Faloutsos , M. Ranganathan , Yannis Manolopoulos, Fast subsequence matching in time-series databases, ACM SIGMOD Record, v.23 n.2, p.419-429, June 1994</name><name>E. Fink and K. B. Pratt. Indexing of compressed time series.</name><name>A. J. Frost and R. R. Prechter. Elliott Wave Principle. New Classics Library, first edition, 1998.</name><name>Gabriel Pui Cheong Fung , Jeffrey Xu Yu , Wai Lam, News Sensitive Stock Trend Prediction, Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, p.481-493, May 06-08, 2002</name><name>Like Gao , X. Sean Wang, Continually evaluating similarity-based pattern queries on a streaming time series, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Like Gao , Zhengrong Yao , X. 
Sean Wang, Evaluating continuous nearest neighbor queries for streaming time series via pre-fetching, Proceedings of the eleventh international conference on Information and knowledge management, November 04-09, 2002, McLean, Virginia, USA</name><name>Johannes Gehrke , Flip Korn , Divesh Srivastava, On computing correlated aggregates over continual data streams, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.13-24, May 21-24, 2001, Santa Barbara, California, United States</name><name>Anna C. Gilbert , Yannis Kotidis , S. Muthukrishnan , Martin Strauss, Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries, Proceedings of the 27th International Conference on Very Large Data Bases, p.79-88, September 11-14, 2001</name><name>Lukasz Golab , M. Tamer &#214;zsu, Issues in data stream management, ACM SIGMOD Record, v.32 n.2, p.5-14, June 2003</name><name>S. Guha , N. Mishra , R. Motwani , L. O'Callaghan, Clustering data streams, Proceedings of the 41st Annual Symposium on Foundations of Computer Science, p.359, November 12-14, 2000</name><name>T. Hellstr&#246;m and K. Holmstr&#246;m. "Predicting the Stock Market". 1998.</name><name>Eamonn Keogh , Kaushik Chakrabarti , Michael Pazzani , Sharad Mehrotra, Locally adaptive dimensionality reduction for indexing large time series databases, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.151-162, May 21-24, 2001, Santa Barbara, California, United States</name><name>E. J. Keogh, K. Chakrabarti, M. J. Pazzani, and S. Mehrotra. Dimensionality reduction for fast similarity search in large time series databases. Knowledge and Information Systems, 3(3):263--286, 2001.</name><name>Eamonn J. Keogh , Selina Chu , David Hart , Michael J. Pazzani, An Online Algorithm for Segmenting Time Series, Proceedings of the 2001 IEEE International Conference on Data Mining, p.289-296, November 29-December 02, 2001</name><name>D. Komo, C. Chang, and H. Ko. 
"Neural Network Technology for Stock Market Index Prediction". ISSIPNN, pages 543--546, 1994.</name><name>Flip Korn , H. V. Jagadish , Christos Faloutsos, Efficiently supporting ad hoc queries in large datasets of time sequences, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.289-300, May 11-15, 1997, Tucson, Arizona, United States</name><name>X. Liu and H. Ferhatosmanoglu. Efficient k-NN Search on Streaming Data Series. In SSTD, pages 83--101, 2003.</name><name>Yang-Sae Moon , Kyu-Young Whang , Wook-Shin Han, General match: a subsequence matching method in time-series databases based on generalized windows, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>L. O'Callaghan, A. Meyerson, R. Motwani, N. Mishra, and S. Guha. Streaming-Data Algorithms for High-Quality Clustering. ICDE, pages 685--, 2002.</name><name>Sanghyun Park , Sang-Wook Kim , Wesley W. Chu, Segment-based approach for subsequence searches in sequence databases, Proceedings of the 2001 ACM symposium on Applied computing, p.248-252, March 2001, Las Vegas, Nevada, United States</name><name>Davood Rafiei , Alberto Mendelzon, Similarity-based queries for time series data, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.13-25, May 11-15, 1997, Tucson, Arizona, United States</name><name>J. Uhlmann. Satisfying General Proximity Similarity Queries with Metric Trees. IPL, 4:175--179, 1991.</name><name>Byoung-Kee Yi , H. V. Jagadish , Christos Faloutsos, Efficient Retrieval of Similar Time Sequences Under Time Warping, Proceedings of the Fourteenth International Conference on Data Engineering, p.201-208, February 23-27, 1998</name><name>Peter N. 
Yianilos, Data structures and algorithms for nearest neighbor search in general metric spaces, Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms, p.311-321, January 25-27, 1993, Austin, Texas, United States</name><name>Cui Yu , Beng Chin Ooi , Kian-Lee Tan , H. V. Jagadish, Indexing the Distance: An Efficient Method to KNN Processing, Proceedings of the 27th International Conference on Very Large Data Bases, p.421-430, September 11-14, 2001</name><name>Y. Zhu and D. Shasha. StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time. VLDB, pages 358--369, 2002.</name></citation><abstract>Subsequence similarity matching in time series databases is an important research area for many applications. This paper presents a new approximate approach for automatic online subsequence similarity matching over massive data streams. With a simultaneous on-line segmentation and pruning algorithm over the incoming stream, the resulting piecewise linear representation of the data stream features high sensitivity and accuracy. The similarity definition is based on a permutation followed by a metric distance function, which provides the similarity search with flexibility, sensitivity and scalability. Also, the metric-based indexing methods can be applied for speed-up. To reduce the system burden, the event-driven similarity search is performed only when there is a potential event. The query sequence is the most recent subsequence of piecewise data representation of the incoming stream which is automatically generated by the system. The retrieved results can be analyzed in different ways according to the requirements of specific applications. This paper discusses an application for future data movement prediction based on statistical information. Experiments on real stock data are performed. 
The correctness of trend predictions is used to evaluate the performance of subsequence similarity matching.</abstract></paper><paper><title>Holistic UDAFs at streaming speeds</title><author><AuthorName>Graham Cormode</AuthorName><institute><InstituteName>Rutgers University</InstituteName><country></country></institute></author><author><AuthorName>Theodore Johnson</AuthorName><institute><InstituteName>AT&amp;T Labs-Research</InstituteName><country></country></institute></author><author><AuthorName>Flip Korn</AuthorName><institute><InstituteName>AT&amp;T Labs-Research</InstituteName><country></country></institute></author><author><AuthorName>S. Muthukrishnan</AuthorName><institute><InstituteName>Rutgers University</InstituteName><country></country></institute></author><author><AuthorName>Oliver Spatscheck</AuthorName><institute><InstituteName>AT&amp;T Labs-Research</InstituteName><country></country></institute></author><author><AuthorName>Divesh Srivastava</AuthorName><institute><InstituteName>AT&amp;T Labs-Research</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>Agilent Technologies. Router Tester. http://advanced.comms.agilent.com/Router Tester/.</name><name>Noga Alon , Phillip B. Gibbons , Yossi Matias , Mario Szegedy, Tracking join and self-join sizes in limited storage, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.10-20, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>Noga Alon , Yossi Matias , Mario Szegedy, The space complexity of approximating the frequency moments, Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.20-29, May 22-24, 1996, Philadelphia, Pennsylvania, United States</name><name>A. Arasu and et al. STREAM: The Stanford stream data manager. 
IEEE Data Engineering Bulletin, 26(1):19--26, 2003.</name><name>Brian Babcock , Shivnath Babu , Mayur Datar , Rajeev Motwani , Jennifer Widom, Models and issues in data stream systems, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin</name><name>D. Carney and et al. Monitoring streams - a new class of data management applications. In Proc VLDB, pages 215--226, 2002.</name><name>S. Chandrasekaran and et al. Telegraph CQ: Continuous dataflow processing for an uncertain world. In Proc. CIDR, 2003.</name><name>Moses Charikar , Kevin Chen , Martin Farach-Colton, Finding Frequent Items in Data Streams, Proceedings of the 29th International Colloquium on Automata, Languages and Programming, p.693-703, July 08-13, 2002</name><name>G. Cormode, M. Datar, P. Indyk, and S. Muthukrishnan. Comparing data streams using Hamming norms. In Proc. Intl. Conf. VLDB, pages 335--345, 2002.</name><name>Graham Cormode , S. Muthukrishnan, An improved data stream summary: the count-min sketch and its applications, Journal of Algorithms, v.55 n.1, p.58-75, April 2005</name><name>Chuck Cranor , Yuan Gao , Theodore Johnson , Vladislav Shkapenyuk , Oliver Spatscheck, Gigascope: high performance network monitoring with an SQL interface, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Chuck Cranor , Theodore Johnson , Oliver Spatscheck , Vladislav Shkapenyuk, Gigascope: a stream database for network applications, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>C. Cranor, T. Johnson, O. Spatscheck, and V. Shkapenyuk. The Gigascope stream database. 
IEEE Data Engineering Bulletin, 26(1): pages 27--32, 2003.</name><name>Alin Dobra , Minos Garofalakis , Johannes Gehrke , Rajeev Rastogi, Processing complex aggregate queries over data streams, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Pedro Domingos , Geoff Hulten, Mining high-speed data streams, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, p.71-80, August 20-23, 2000, Boston, Massachusetts, United States</name><name>Philippe Flajolet , G. Nigel Martin, Probabilistic counting algorithms for data base applications, Journal of Computer and System Sciences, v.31 n.2, p.182-209, Sept. 1985</name><name>Minos Garofalakis , Johannes Gehrke , Rajeev Rastogi, Querying and mining data streams: you only get one look a tutorial, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Johannes Gehrke , Flip Korn , Divesh Srivastava, On computing correlated aggregates over continual data streams, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.13-24, May 21-24, 2001, Santa Barbara, California, United States</name><name>Phillip B. Gibbons, Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports, Proceedings of the 27th International Conference on Very Large Data Bases, p.541-550, September 11-14, 2001</name><name>Phillip B. Gibbons , Yossi Matias , Viswanath Poosala, Fast Incremental Maintenance of Approximate Histograms, Proceedings of the 23rd International Conference on Very Large Data Bases, p.466-475, August 25-29, 1997</name><name>Anna C. Gilbert , Yannis Kotidis , S. 
Muthukrishnan , Martin Strauss, Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries, Proceedings of the 27th International Conference on Very Large Data Bases, p.79-88, September 11-14, 2001</name><name>A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. Strauss. How to summarize the universe: Dynamic maintenance of quantiles. In Proc. Intl. Conf. VLDB, pages 454--465, 2002.</name><name>Jim Gray , Adam Bosworth , Andrew Layman , Hamid Pirahesh, Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total, Proceedings of the Twelfth International Conference on Data Engineering, p.152-159, February 26-March 01, 1996</name><name>Michael Greenwald , Sanjeev Khanna, Space-efficient online computation of quantile summaries, ACM SIGMOD Record, v.30 n.2, p.58-66, June 2001</name><name>Sudipto Guha , Nick Koudas , Kyuseok Shim, Data-streams and histograms, Proceedings of the thirty-third annual ACM symposium on Theory of computing, p.471-475, July 2001, Hersonissos, Greece</name><name>S. Guha , N. Mishra , R. Motwani , L. O'Callaghan, Clustering data streams, Proceedings of the 41st Annual Symposium on Foundations of Computer Science, p.359, November 12-14, 2000</name><name>ISO DBL LHR-004 and ANSI X3H2-95-364. (ISO/ANSI Working Draft) Database Language SQL3.</name><name>Richard M. Karp , Scott Shenker , Christos H. Papadimitriou, A simple algorithm for finding frequent elements in streams and bags, ACM Transactions on Database Systems (TODS), v.28 n.1, p.51-55, March 2003</name><name>N. Koudas and D. Srivastava. Data stream query processing: A tutorial. In Proc. VLDB, page 1149, 2003.</name><name>A. Lerner and D. Shasha. The virtues and challenges of ad hoc + streams querying in finance. 
Data Engineering Bulletin, 26(1):49--56, 2003.</name><name>Fjording the Stream: An Architecture for Queries Over Streaming Sensor Data, Proceedings of the 18th International Conference on Data Engineering, p.555, February 26-March 01, 2002</name><name>G. Manku and R. Motwani. Approximate frequency counts over data streams. In Proc. VLDB, pages 346--357, 2002.</name><name>Gurmeet Singh Manku , Sridhar Rajagopalan , Bruce G. Lindsay, Approximate medians and other quantiles in one pass and with limited memory, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.426-435, June 01-04, 1998, Seattle, Washington, United States</name><name>S. Muthukrishnan, Data streams: algorithms and applications, Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, January 12-14, 2003, Baltimore, Maryland</name><name>Stanford stream data manager. http://www-db.stanford.edu/stream/sqr 2003. J. Widom and et al.</name><name>M. Sullivan and A. Heybey. Tribeca: A system for managing large databases of network traffic. In Proc. USENIX Technical Conf., 1998.</name><name>Nitin Thaper , Sudipto Guha , Piotr Indyk , Nick Koudas, Dynamic multidimensional histograms, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>H. Wang and C. Zaniolo. ATLaS: A native extension of SQL for data mining. In SIAM Intl. Conf. Data Mining 2003.</name></citation><abstract>Many algorithms have been proposed to approximate holistic aggregates, such as quantiles and heavy hitters, over data streams. 
However, little work has been done to explore what techniques are required to incorporate these algorithms in a data stream query processor, and to make them useful in practice. In this paper, we study the performance implications of using user-defined aggregate functions (UDAFs) to incorporate selection-based and sketch-based algorithms for holistic aggregates into a data stream management system's query processing architecture. We identify key performance bottlenecks and tradeoffs, and propose novel techniques to make these holistic UDAFs fast and space-efficient for use in high-speed data stream applications. We evaluate performance using generated and actual IP packet data, focusing on approximating quantiles and heavy hitters. The best of our current implementations can process streaming queries at OC48 speeds (2x 2.4Gbps).</abstract></paper><paper><title>BLAS: an efficient XPath processing system</title><author><AuthorName>Yi Chen</AuthorName><institute><InstituteName>University of Pennsylvania</InstituteName><country></country></institute></author><author><AuthorName>Susan B. 
Davidson</AuthorName><institute><InstituteName>University of Pennsylvania and INRIA-FUTURS (France)</InstituteName><country></country></institute></author><author><AuthorName>Yifeng Zheng</AuthorName><institute><InstituteName>University of Pennsylvania</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>Serge Abiteboul , Haim Kaplan , Tova Milo, Compact labeling schemes for ancestor queries, Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, p.547-556, January 07-09, 2001, Washington, D.C., United States</name><name>Structural Joins: A Primitive for Efficient XML Query Pattern Matching, Proceedings of the 18th International Conference on Data Engineering, p.141, February 26-March 01, 2002</name><name>Stephen Alstrup , Theis Rauhe, Improved labeling scheme for ancestor queries, Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, p.947-953, January 06-08, 2002, San Francisco, California</name><name>From XML Schema to Relations: A Cost-Based Approach to XML Storage, Proceedings of the 18th International Conference on Data Engineering, p.64, February 26-March 01, 2002</name><name>J. Bosak. Shakespeare. http://www.ibiblio.org/xml/examples/shakespeare/.</name><name>Nicolas Bruno , Nick Koudas , Divesh Srivastava, Holistic twig joins: optimal XML pattern matching, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Y. Chen, S. Davidson, C. Hara, and Y. Zheng. RRXS: Redundancy reducing XML storage in relations. In Proceedings of VLDB, 2003.</name><name>Y. Chen, S. B. Davidson, and Y. Zheng. Constraint Preserving XML Storage in Relations. In WebDB, 2002.</name><name>Shu-Yao Chien , Vassilis J. 
Tsotras , Carlo Zaniolo , Donghui Zhang, Efficient Complex Query Support for Multiversion XML Documents, Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology, p.161-178, March 25-27, 2002</name><name>S.-Y. Chien, Z. Vagena, D. Zhang, V. J. Tsotras, and C. Zaniolo. Efficient Structural Joins on Indexed XML Documents. In Proceedings of VLDB, 2002.</name><name>J. Clark and S. DeRose. XML Path language (XPath), November 1999. http://www.w3.org/TR/xpath.</name><name>Brian Cooper , Neal Sample , Michael J. Franklin , Gisli R. Hjaltason , Moshe Shadmon, A Fast Index for Semistructured Data, Proceedings of the 27th International Conference on Very Large Data Bases, p.341-350, September 11-14, 2001</name><name>David DeHaan , David Toman , Mariano P. Consens , M. Tamer &amp;#214;zsu, A comprehensive XQuery to SQL translation using dynamic interval encoding, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>A. Deutsch. An Experimental Evaluation of the MARS System. In Excerpt from PhD Thesis Alin Deutsch, 2002.</name><name>Alin Deutsch , Mary Fernandez , Dan Suciu, Storing semistructured data with STORED, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.431-442, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>Paul F. Dietz, Maintaining order in a linked list, Proceedings of the fourteenth annual ACM symposium on Theory of computing, p.122-127, May 05-07, 1982, San Francisco, California, United States</name><name>D. Florescu and D. Kossmann. Storing and querying XML data using an RDBMS. In Bulletin of the Technical Committee on Data Engineering, pages 27--34, September 1999.</name><name>Georgetown Protein Information Resource. Protein Sequence Database, 2001. 
http://www.cs.washington.edu/research/xmldatasets/.</name><name>Torsten Grust, Accelerating XPath location steps, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>H. Jiang, H. Lu, W. Wang, and B. C. Ooi. XR-Tree: Indexing XML Data for Efficient Structural Joins. In Proceedings of ICDE, 2003.</name><name>Raghav Kaushik , Philip Bohannon , Jeffrey F Naughton , Henry F Korth, Covering indexes for branching path queries, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>R. Kaushik, P. Shenoy, P. Bohannon, and E. Gudes. Exploiting local similarity for efficient indexing of paths in graph structured data. In Proceedings of ICDE, 2002.</name><name>Quanzhong Li , Bongki Moon, Indexing and Querying XML Data for Regular Path Expressions, Proceedings of the 27th International Conference on Very Large Data Bases, p.361-370, September 11-14, 2001</name><name>J. McHugh, J. Widom, S. Abiteboul, Q. Luo, and A. Rajaraman. Indexing semistructured data. Technical report, Stanford University, 1998.</name><name>Tova Milo , Dan Suciu, Index Structures for Path Expressions, Proceeding of the 7th International Conference on Database Theory, p.277-295, January 10-12, 1999</name><name>Jun-Ki Min , Myung-Jae Park , Chin-Wan Chung, XPRESS: a queriable compression for XML data, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>Jayavel Shanmugasundaram , Kristin Tufte , Chun Zhang , Gang He , David J. DeWitt , Jeffrey F. Naughton, Relational Databases for Querying XML Documents: Limitations and Opportunities, Proceedings of the 25th International Conference on Very Large Data Bases, p.302-314, September 07-10, 1999</name><name>J. Simon and M. Fernndez. Galax. http://db.bell-labs.com/galax.</name><name>W. Wang, H. Jiang, H. Lu, and J. X. Yu. 
PBiTree Coding and Efficient Processing of Containment Joins. In Proceedings of ICDE, 2003.</name><name>XMARK the XML-benchmark project, April 2001. http://monetdb.cwi.nl/xml/index.html.</name><name>Chun Zhang , Jeffrey Naughton , David DeWitt , Qiong Luo , Guy Lohman, On supporting containment queries in relational database management systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.425-436, May 21-24, 2001, Santa Barbara, California, United States</name></citation><abstract>We present BLAS, a Bi-LAbeling based System, for efficiently processing complex XPath queries over XML data. BLAS uses P-labeling to process queries involving consecutive child axes, and D-labeling to process queries involving descendant axes traversal. The XML data is stored in labeled form, and indexed to optimize descendant axis traversals. Three algorithms are presented for translating complex XPath queries to SQL expressions, and two alternate query engines are provided. Experimental results demonstrate that the BLAS system has a substantial performance improvement compared to traditional XPath processing using D-labeling.</abstract></paper><paper><title>Efficient processing of XML twig queries with OR-predicates</title><author><AuthorName>Haifeng Jiang</AuthorName><institute><InstituteName>The Hong Kong University of Science and Technology, Hong Kong, China</InstituteName><country></country></institute></author><author><AuthorName>Hongjun Lu</AuthorName><institute><InstituteName>The Hong Kong University of Science and Technology, Hong Kong, China</InstituteName><country></country></institute></author><author><AuthorName>Wei Wang</AuthorName><institute><InstituteName>The Hong Kong University of Science and Technology, Hong Kong, China</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>IBM XML data generator. 
http://www.alphaworks.ibm.com/tech/xmlgenerator.</name><name>Sihem Amer-Yahia , SungRan Cho , Laks V. S. Lakshmanan , Divesh Srivastava, Minimization of tree pattern queries, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.497-508, May 21-24, 2001, Santa Barbara, California, United States</name><name>Nicolas Bruno , Nick Koudas , Divesh Srivastava, Holistic twig joins: optimal XML pattern matching, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>S.-Y. Chien, Z. Vagena, D. Zhang, V. J. Tsotras, and C. Zaniolo. Efficient structural joins on indexed XML documents. In VLDB, pages 263--274, 2002.</name><name>Brian Cooper , Neal Sample , Michael J. Franklin , Gisli R. Hjaltason , Moshe Shadmon, A Fast Index for Semistructured Data, Proceedings of the 27th International Conference on Very Large Data Bases, p.341-350, September 11-14, 2001</name><name>T. Fiebig , S. Helmer , C.-C. Kanne , G. Moerkotte , J. Neumann , R. Schiele , T. Westmann, Anatomy of a native XML base management system, The VLDB Journal &amp;mdash; The International Journal on Very Large Data Bases, v.11 n.4, p.292-314, December 2002</name><name>Roy Goldman , Jennifer Widom, DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases, Proceedings of the 23rd International Conference on Very Large Data Bases, p.436-445, August 25-29, 1997</name><name>A. Halverson, J. Burger, A. Kini, R. Krishnamurthy, A. N. Rao, F. Tian, S. Viglas, Y. Wang, J. F. Naughton, and D. J. DeWitt. Mixed mode XML query processing. In VLDB, pages 225--236, 2003.</name><name>H. V. Jagadish , S. Al-Khalifa , A. Chapman , L. V. S. Lakshmanan , A. Nierman , S. Paparizos , J. M. Patel , D. Srivastava , N. Wiwatwattana , Y. Wu , C. Yu, TIMBER: A native XML database, The VLDB Journal &amp;mdash; The International Journal on Very Large Data Bases, v.11 n.4, p.274-291, December 2002</name><name>H. 
Jiang, H. Lu, W. Wang, and B. C. Ooi. XR-Tree: Indexing XML data for efficient structural joins. In ICDE, pages 253--264, 2003.</name><name>H. Jiang, W. Wang, H. Lu, and J. X. Yu. Holistic twig joins on indexed XML documents. In VLDB, pages 273--284, 2003.</name><name>Raghav Kaushik , Philip Bohannon , Jeffrey F Naughton , Henry F Korth, Covering indexes for branching path queries, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Quanzhong Li , Bongki Moon, Indexing and Querying XML Data for Regular Path Expressions, Proceedings of the 27th International Conference on Very Large Data Bases, p.361-370, September 11-14, 2001</name><name>Jason McHugh , Serge Abiteboul , Roy Goldman , Dallas Quass , Jennifer Widom, Lore: a database management system for semistructured data, ACM SIGMOD Record, v.26 n.3, p.54-66, Sept. 1997</name><name>Jason McHugh , Jennifer Widom, Query Optimization for XML, Proceedings of the 25th International Conference on Very Large Data Bases, p.315-326, September 07-10, 1999</name><name>Tova Milo , Dan Suciu, Index Structures for Path Expressions, Proceeding of the 7th International Conference on Database Theory, p.277-295, January 10-12, 1999</name><name>Dallan Quass , Jennifer Widom , Roy Goldman , Kevin Haas , Qingshan Luo , Jason McHugh , Svetlozar Nestorov , Anand Rajaraman , Hugo Rivero , Serge Abiteboul , Jeff Ullman , Janet Wiener, LORE: a Lightweight Object REpository for semistructured data, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.549, June 04-06, 1996, Montreal, Quebec, Canada</name><name>Structural Joins: A Primitive for Efficient XML Query Pattern Matching, Proceedings of the 18th International Conference on Data Engineering, p.141, February 26-March 01, 2002</name><name>Haixun Wang , Sanghyun Park , Wei Fan , Philip S. 
Yu, ViST: a dynamic index method for querying XML data by tree structures, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>Y. Wu, J. M. Patel, and H. V. Jagadish. Structural join order selection for XML query optimization. In ICDE, pages 443--454, 2003.</name><name>Chun Zhang , Jeffrey Naughton , David DeWitt , Qiong Luo , Guy Lohman, On supporting containment queries in relational database management systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.425-436, May 21-24, 2001, Santa Barbara, California, United States</name></citation><abstract>An XML twig query, represented as a labeled tree, is essentially a complex selection predicate on both structure and content of an XML document. Twig query matching has been identified as a core operation in querying tree-structured XML data. A number of algorithms have been proposed recently to process a twig query holistically. Those algorithms, however, only deal with twig queries without OR-predicates. A straightforward approach that first decomposes a twig query with OR-predicates into multiple twig queries without OR-predicates and then combines their results is obviously not optimal in most cases. In this paper, we study novel holistic-processing algorithms for twig queries with OR-predicates without decomposition. In particular, we present a merge-based algorithm for sorted XML data and an index-based algorithm for indexed XML data. We show that holistic processing is much more efficient than the decomposition approach. 
Furthermore, we show that using indexes can significantly improve the performance for matching twig queries with OR-predicates, especially when the queries have large inputs but relatively small outputs.</abstract></paper><paper><title>Tree logical classes for efficient evaluation of XQuery</title><author><AuthorName>Stelios Paparizos</AuthorName><institute><InstituteName>University of Michigan</InstituteName><country></country></institute></author><author><AuthorName>Yuqing Wu</AuthorName><institute><InstituteName>University of Michigan</InstituteName><country></country></institute></author><author><AuthorName>Laks V. S. Lakshmanan</AuthorName><institute><InstituteName>University of British Columbia</InstituteName><country></country></institute></author><author><AuthorName>H. V. Jagadish</AuthorName><institute><InstituteName>University of Michigan</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>Structural Joins: A Primitive for Efficient XML Query Pattern Matching, Proceedings of the 18th International Conference on Data Engineering, p.141, February 26-March 01, 2002</name><name>S. Boag, D. Chamberlin, M. F. Fernandez, D. Florescu, J. Robie, and J. Simeon. XQuery 1.0: An XML query languge. Working Draft. http://www.w3.org/TR/xquery.</name><name>Nicolas Bruno , Nick Koudas , Divesh Srivastava, Holistic twig joins: optimal XML pattern matching, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Z. Chen, H. V. Jagadish, L. V. S. Lakshmanan, and S. Paparizos. From tree patterns to generalized tree patterns: On efficient evaluation of XQuery. In Proc. VLDB Conf., Sep. 
2003.</name><name>Chun Zhang , Jeffrey Naughton , David DeWitt , Qiong Luo , Guy Lohman, On supporting containment queries in relational database management systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.425-436, May 21-24, 2001, Santa Barbara, California, United States</name><name>David DeHaan , David Toman , Mariano P. Consens , M. Tamer &amp;#214;zsu, A comprehensive XQuery to SQL translation using dynamic interval encoding, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>D. Florescu and D. Kossman. Storing and querying XML data using an RDMBS. IEEE Data Eng. Bull., 22(3), 1999.</name><name>H. V. Jagadish , S. Al-Khalifa , A. Chapman , L. V. S. Lakshmanan , A. Nierman , S. Paparizos , J. M. Patel , D. Srivastava , N. Wiwatwattana , Y. Wu , C. Yu, TIMBER: A native XML database, The VLDB Journal &amp;mdash; The International Journal on Very Large Data Bases, v.11 n.4, p.274-291, December 2002</name><name>H. V. Jagadish , Laks V. S. Lakshmanan , Divesh Srivastava , Keith Thompson, TAX: A Tree Algebra for XML, Revised Papers from the  8th International Workshop on Database Programming Languages, p.149-164, September 08-10, 2001</name><name>Bertram Lud&amp;#228;scher , Yannis Papakonstantinou , Pavel Velikhov, Navigation-Driven Evaluation of Virtual Mediated Views, Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology, p.150-165, March 27-31, 2000</name><name>U. of Michigan. The Timber project. http://www.eecs.umich.edu/db/timber.</name><name>U. of Wisconsin. The Niagara internet query system. http://www.cs.wisc.edu/niagara/.</name><name>A. R. Schmidt, F. Waas, M. L. Kersten, M. J. Carey, I. Manolescu, and R. Busse. XMark: A benchmark for XML data management. In Proc. 
VLDB Conf., 2002.</name><name>Harald Sch&amp;#246;ning, Tamino - A DBMS designed for XML, Proceedings of the 17th International Conference on Data Engineering, p.149-154, April 02-06, 2001</name><name>Jayavel Shanmugasundaram , Kristin Tufte , Chun Zhang , Gang He , David J. DeWitt , Jeffrey F. Naughton, Relational Databases for Querying XML Documents: Limitations and Opportunities, Proceedings of the 25th International Conference on Very Large Data Bases, p.302-314, September 07-10, 1999</name><name>J. Simeon and M. F. Fernandez. Galax, an open implementation of XQuery. http://db.bell-labs.com/galax/.</name><name>Igor Tatarinov , Stratis D. Viglas , Kevin Beyer , Jayavel Shanmugasundaram , Eugene Shekita , Chun Zhang, Storing and querying ordered XML using a relational database system, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>S. D. Viglas, L. Galanis, D. J. DeWitt, D. Maier, and J. F. Naughtonn. Putting XML query algebras into context. http://www.cs.wisc.edu/niagara/.</name><name>Y. Wu, J. M. Patel, and H. V. Jagadish. Structural join order selection for XML query optimization. In Proc. ICDE Conf., Mar. 2003.</name><name>X-Hive Corp. X-Hive/DB native XML storage. http://www.x-hive.com/.</name><name>XMark, an XML benchmark project. http://www.xml-benchmark.org/.</name><name>Xin Zhang , Bradford Pielech , Elke A. Rundesnteiner, Honey, I shrunk the XQuery!: an XML algebra optimization approach, Proceedings of the 4th international workshop on Web information and data management, November 08-08, 2002, McLean, Virginia, USA</name></citation><abstract>XML is widely praised for its flexibility in allowing repeated and missing sub-elements. However, this flexibility makes it challenging to develop a bulk algebra, which typically manipulates sets of objects with identical structure. A set of XML elements, say of type book, may have members that vary greatly in structure, e.g. 
in the number of author sub-elements. This kind of heterogeneity may permeate the entire document in a recursive fashion: e.g., different authors of the same or different book may in turn greatly vary in structure. Even when the document conforms to a schema, the flexible nature of schemas for XML still allows such significant variations in structure among elements in a collection. Bulk processing of such heterogeneous sets is problematic. In this paper, we introduce the notion of logical classes (LC) of pattern tree nodes, and generalize the notion of pattern tree matching to handle node logical classes. This abstraction pays off significantly in allowing us to reason with an inherently heterogeneous collection of elements in a uniform, homogeneous way. Based on this, we define a Tree Logical Class (TLC) algebra that is capable of handling the heterogeneity arising in XML query processing, while avoiding redundant work. We present an algorithm to obtain a TLC algebra expression from an XQuery statement (for a large fragment of XQuery). We show how to implement the TLC algebra efficiently, introducing the nest-join as an important physical operator for XML query processing. We show that evaluation plans generated using the TLC algebra not only are simpler but also perform better than those generated by competing approaches. TLC is the algebra used in the Timber [8] system developed at the University of Michigan.</abstract></paper><paper><title>FleXPath: flexible structure and full-text querying for XML</title><author><AuthorName>Sihem Amer-Yahia</AuthorName><institute><InstituteName>AT&amp;T Labs-Research, Florham Park, NJ</InstituteName><country></country></institute></author><author><AuthorName>Laks V. S. 
Lakshmanan</AuthorName><institute><InstituteName>University of British Columbia, Vancouver, CA</InstituteName><country></country></institute></author><author><AuthorName>Shashank Pandit</AuthorName><institute><InstituteName>IIT Bombay, Mumba&amp;#238;, India</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>S. Al-Khalifa et al. Structural joins: A primitive for efficient XML query pattern matching. In ICDE, 2002.</name><name>S. Amer-Yahia , C. Botev , J. Shanmugasundaram, Texquery: a full-text search extension to xquery, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA</name><name>Sihem Amer-Yahia , SungRan Cho , Divesh Srivastava, Tree Pattern Relaxation, Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology, p.496-513, March 25-27, 2002</name><name>Klemens B&amp;#246;hm , Karl Aberer , Erich J. Neuhold , Xiaoya Yang, Structured document storage and refined declarative and navigational access mechanisms in HyperStorM, The VLDB Journal &amp;mdash; The International Journal on Very Large Data Bases, v.6 n.4, p.296-311, November 1997</name><name>J. M. Bremer and M. Gertz. XQuery/IR: Integrating XML Document and Data Retrieval. WebDB 2002.</name><name>Eric W. Brown, Fast evaluation of structured queries for information retrieval, Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, p.30-38, July 09-13, 1995, Seattle, Washington, United States</name><name>Nicolas Bruno , Surajit Chaudhuri , Luis Gravano, Top-k selection queries over relational databases: Mapping strategies and performance evaluation, ACM Transactions on Database Systems (TODS), v.27 n.2, p.153-187, June 2002</name><name>Michael J. 
Carey , Donald Kossmann, On saying &amp;ldquo;Enough already!&amp;rdquo; in SQL, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.219-230, May 11-15, 1997, Tucson, Arizona, United States</name><name>David Carmel , Yoelle S. Maarek , Matan Mandelbrod , Yosi Mass , Aya Soffer, Searching XML documents via XML fragments, Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval, July 28-August 01, 2003, Toronto, Canada</name><name>C. Chen and Y. Ling. A Sampling-Based Estimator for Top-K Query. In ICDE 2002.</name><name>T. T. Chinenyanga and N. Kushmerick. Expressive and Efficient Ranked Querying of XML Data. 4th International Workshop on the Web and Databases (WebDB). Santa Barbara, California, 2001.</name><name>S. Cohen et al. XSEarch: A Semantic Search Engine for XML. In VLDB 2003.</name><name>M. Cutler et al. Using the Structure of HTML Documents to Improve Retrieval. USENIX Symposium on Internet Technologies and Systems. California 1997.</name><name>Ernesto Damiani , Nico Lavarini , Stefania Marrara , Barbara Oliboni , Daniele Pasini , Letizia Tanca , Giuseppe Viviani, The APPROXML Tool Demonstration, Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology, p.753-755, March 25-27, 2002</name><name>C. Delobel and M. C. Rousset. A Uniform Approach for Querying Large Tree-structured Data through a Mediated Schema. International Workshop on Foundations of Models for Information Integration (FMII-2001).</name><name>S. Flesca et al. On the minimization of XPath queries. 
In VLDB 2003: 153--164</name><name>Daniela Florescu , Donald Kossmann , Ioana Manolescu, Integrating keyword search into XML query processing, Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking, p.119-135, June 2000, Amsterdam, The Netherlands</name><name>N. Fuhr and K. Grossjohann. XIRQL: An Extension of XQL for Information Retrieval. ACM SIGIR Workshop on XML and Information Retrieval. Athens, Greece, 2000.</name><name>Norbert Fuhr , Thomas R&amp;#246;lleke, HySpirit - A Probabilistic Inference Engine for Hypermedia Retrieval in Large Databases, Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology, p.24-38, March 23-27, 1998</name><name>Lin Guo , Feng Shao , Chavdar Botev , Jayavel Shanmugasundaram, XRANK: ranked keyword search over XML documents, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>Y. Hayashi et al. Searching Text-rich XML Documents with Relevance Ranking. ACM SIGIR 2000 Workshop on XML and Information Retrieval, Greece, 2000.</name><name>Vagelis Hristidis , Nick Koudas , Yannis Papakonstantinou, PREFER: a system for the efficient execution of multi-parametric ranked queries, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.259-270, May 21-24, 2001, Santa Barbara, California, United States</name><name>P. Kilpelainen. Tree Matching Problems with Applications to Structured Text Databases. 
PhD thesis, University of Helsinki, Finland, November 1992.</name><name>Gerome Miklau , Dan Suciu, Containment and equivalence for an XPath fragment, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin</name><name>Sung Hyon Myaeng , Don-Hyun Jang , Mun-Seok Kim , Zong-Cheol Zhoo, A flexible model for retrieval of SGML documents, Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, p.138-145, August 24-28, 1998, Melbourne, Australia</name><name>J. Naughton et al. The Niagara Internet Query System. http://www.cs.wisc.edu/niagara/Publications.html</name><name>Neoklis Polyzotis , Minos Garofalakis , Yannis Ioannidis, Selectivity Estimation for XML Twigs, Proceedings of the 20th International Conference on Data Engineering, p.264, March 30-April 02, 2004</name><name>Gerard Salton , Michael J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, Inc., New York, NY, 1986</name><name>Albrecht Schmidt , Martin L. Kersten , Menzo Windhouwer, Querying XML Documents Made Easy: Nearest Concept Queries, Proceedings of the 17th International Conference on Data Engineering, p.321-329, April 02-06, 2001</name><name>T. Schlieder. Similarity Search in XML Data using Cost-Based Query Transformations. ACM SIGMOD 2001 Web and Databases Workshop. May, 2001. Santa Barbara, California.</name><name>Anja Theobald , Gerhard Weikum, Adding Relevance to XML, Selected papers from the Third International Workshop WebDB 2000 on The World Wide Web and Databases, p.105-124, May 18-19, 2000</name></citation><abstract>Querying XML data is a well-explored topic with powerful database-style query languages such as XPath and XQuery set to become W3C standards. An equally compelling paradigm for querying XML documents is full-text search on textual content. 
In this paper, we study fundamental challenges that arise when we try to integrate these two querying paradigms. While keyword search is based on approximate matching, XPath has exact match semantics. We address this mismatch by considering queries on structure as a "template", and looking for answers that best match this template and the full-text search. To achieve this, we provide an elegant definition of relaxation on structure and define primitive operators to span the space of relaxations. Query answering is now based on ranking potential answers on structural and full-text search conditions. We set out certain desirable principles for ranking schemes and propose natural ranking schemes that adhere to these principles. We develop efficient algorithms for answering top-K queries and discuss results from a comprehensive set of experiments that demonstrate the utility and scalability of the proposed framework and algorithms.</abstract></paper><paper><title>An interactive clustering-based approach to integrating source query interfaces on the deep Web</title><author><AuthorName>Wensheng Wu</AuthorName><institute><InstituteName>University of Illinois at Urbana-Champaign</InstituteName><country></country></institute></author><author><AuthorName>Clement Yu</AuthorName><institute><InstituteName>University of Illinois at Chicago</InstituteName><country></country></institute></author><author><AuthorName>AnHai Doan</AuthorName><institute><InstituteName>University of Illinois at Urbana-Champaign</InstituteName><country></country></institute></author><author><AuthorName>Weiyi Meng</AuthorName><institute><InstituteName>SUNY at Binghamton</InstituteName><country></country></institute></author><year>2004</year><conference>International Conference on Management of Data</conference><citation><name>IceQ project: http:
