📄 sigmod_2003_elementary.txt

📁 Written using lwp::get
<proceedings><paper><title>Improving the efficiency of database-system teaching</title><author><AuthorName>Jeffrey D. Ullman</AuthorName><institute><InstituteName>Stanford University</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation></citation><abstract>The education industry has a very poor record of productivity gains. In this brief article, I outline some of the ways the teaching of a college course in database systems could be made more efficient, and staff time used more productively. These ideas carry over to other programming-oriented courses, and many of them apply to any academic subject whatsoever. After proposing a number of things that could be done, I concentrate here on a system under development, called OTC (On-line Testing Center), and on its methodology of "root questions." These questions encourage students to do homework of the long-answer type, yet we can have their work checked and graded automatically by a simple multiple-choice-question grader. OTC also offers some improvement in the way we handle SQL homework, and could be used with other languages as well.</abstract></paper><paper><title>Querying structured text in an XML database</title><author><AuthorName>Shurug Al-Khalifa</AuthorName><institute><InstituteName>University of Michigan, Ann Arbor, MI</InstituteName><country></country></institute></author><author><AuthorName>Cong Yu</AuthorName><institute><InstituteName>University of Michigan, Ann Arbor, MI</InstituteName><country></country></institute></author><author><AuthorName>H. V. Jagadish</AuthorName><institute><InstituteName>University of Michigan, Ann Arbor, MI</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>Rakesh Agrawal , Edward L. 
Wimmers, A framework for expressing and combining preferences, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.297-306, May 15-18, 2000, Dallas, Texas, United States</name><name>S. Al-Khalifa, H. V. Jagadish, N. Kouda, J. Patel, D. Srivastava, and Y. Wu. Structural joins: A primitive for efficient XML query pattern matching. In ICDE, 2001.</name><name>D. Beech, A. Malhotra, and M. Rys. A formal data model and algebra for XML. W3C XML Query Working Group Note, September 1999.</name><name>C. Beeri and Y. Tzaban. SAL: An algebra for semi-structured data and XML. In ACM SIGMOD Workshop on the Web and Databases, pages 37--42, Philadelphia, PA, June 1999.</name><name>N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. In ICDE, 2002.</name><name>Nicolas Bruno , Nick Koudas , Divesh Srivastava, Holistic twig joins: optimal XML pattern matching, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>D. D. Chamberlin, J. Clark, D. Florescu, J. Robie, J. Simon, and M. Stefanescu. XQuery 1.0: An XML query language. W3C working draft, June 2001. http://www.w3.org/TR/xquery/.</name><name>Kevin Chen-Chuan Chang , Seung-won Hwang, Minimal probing: supporting expensive predicates for top-k queries, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>S.-Y. Chien, Z. Vagena, D. Zhang, V. J. Tsotras, and C. Zaniolo. Efficient structural joins on indexed XML documents. In VLDB, 2002.</name><name>William W. Cohen, Integration of heterogeneous databases without common domains using queries based on textual similarity, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.201-212, June 01-04, 1998, Seattle, Washington, United States</name><name>DELOS. Initiative for the evaluation of XML retrieval. 
http://qmir.dcs.qmw.ac.uk/inex/.</name><name>P. Fankhauser, M. Fernandez, A. Malhotra, M. Rys, J. Simeon, and P. Wadler. The XML query algebra. W3C Working Draft, February 2001.</name><name>Norbert Fuhr , Kai Gro&amp;#223;johann, XIRQL: a query language for information retrieval in XML documents, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.172-180, September 2001, New Orleans, Louisiana, United States</name><name>Norbert Fuhr , Thomas R&amp;#246;lleke, A probabilistic relational algebra for the integration of information retrieval and database systems, ACM Transactions on Information Systems (TOIS), v.15 n.1, p.32-66, Jan. 1997</name><name>Christoph M. Hoffmann , Michael J. O'Donnell, Pattern Matching in Trees, Journal of the ACM (JACM), v.29 n.1, p.68-95, Jan. 1982</name><name>Vagelis Hristidis , Nick Koudas , Yannis Papakonstantinou, PREFER: a system for the efficient execution of multi-parametric ranked queries, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.259-270, May 21-24, 2001, Santa Barbara, California, United States</name><name>H. V. Jagadish , Laks V. S. Lakshmanan , Divesh Srivastava , Keith Thompson, TAX: A Tree Algebra for XML, Revised Papers from the 8th International Workshop on Database Programming Languages, p.149-164, September 08-10, 2001</name><name>A. Nierman and H. V. Jagadish. ProTDB: Probabilistic data in XML. In VLDB, 2002.</name><name>G. Ozsoyoglu, A. Al-Hamdani, I. S. Altingovde, S. A. Ozel, O. Ulusoy, and Z. M. Ozsoyoglu. Sideway value algebra for object-relational databases. In VLDB, 2002.</name><name>Gerard Salton , Michael J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, Inc., New York, NY, 1986</name><name>T. Schlieder and H. Meuss. Result ranking for structured queries against XML documents. 
In DELOS Workshop on Information Seeking, Searching and Querying in Digital Libraries, 2000.</name><name>Albrecht Schmidt , Martin L. Kersten , Menzo Windhouwer, Querying XML Documents Made Easy: Nearest Concept Queries, Proceedings of the 17th International Conference on Data Engineering, p.321-329, April 02-06, 2001</name><name>Anja Theobald , Gerhard Weikum, The Index-Based XXL Search Engine for Querying XML Data with Relevance Ranking, Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology, p.477-495, March 25-27, 2002</name><name>U. of Michigan. The Timber system. http://www.eecs.umich.edu/db/timber/.</name><name>Chun Zhang , Jeffrey Naughton , David DeWitt , Qiong Luo , Guy Lohman, On supporting containment queries in relational database management systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.425-436, May 21-24, 2001, Santa Barbara, California, United States</name></citation><abstract>XML databases often contain documents comprising structured text. Therefore, it is important to integrate "information retrieval style" query evaluation, which is well-suited for natural language text, with standard "database style" query evaluation, which handles structured queries efficiently. Relevance scoring is central to information retrieval. In the case of XML, this operation becomes more complex because the data required for scoring could reside not directly in an element itself but also in its descendant elements. In this paper, we propose a bulk-algebra, TIX, and describe how it can be used as a basis for integrating information retrieval techniques into a standard pipelined database query evaluation engine. We develop new evaluation strategies essential to obtaining good performance, including a stack-based TermJoin algorithm for efficiently scoring composite elements. 
We report results from an extensive experimental evaluation, which show, among other things, that the new TermJoin access method outperforms a direct implementation of the same functionality using standard operators by a large factor.</abstract></paper><paper><title>XRANK: ranked keyword search over XML documents</title><author><AuthorName>Lin Guo</AuthorName><institute><InstituteName>Cornell University</InstituteName><country></country></institute></author><author><AuthorName>Feng Shao</AuthorName><institute><InstituteName>Cornell University</InstituteName><country></country></institute></author><author><AuthorName>Chavdar Botev</AuthorName><institute><InstituteName>Cornell University</InstituteName><country></country></institute></author><author><AuthorName>Jayavel Shanmugasundaram</AuthorName><institute><InstituteName>Cornell University</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>DBXplorer: A System for Keyword-Based Search over Relational Databases, Proceedings of the 18th International Conference on Data Engineering, p.5, February 26-March 01, 2002</name><name>V. Aguilera, S. Cluet, F. Wattez, "Xyleme Query Architecture", WWW Conf., 2001.</name><name>Vo Ngoc Anh , Owen de Kretser , Alistair Moffat, Vector-space ranking with effective early termination, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.35-42, September 2001, New Orleans, Louisiana, United States</name><name>G. Bhalotia, et al., "Keyword Searching and Browsing in Databases using BANKS", ICDE Conf., 2002.</name><name>Klemens B&amp;#246;hm , Karl Aberer , Erich J. 
Neuhold , Xiaoya Yang, Structured document storage and refined declarative and navigational access mechanisms in HyperStorM, The VLDB Journal &amp;mdash; The International Journal on Very Large Data Bases, v.6 n.4, p.296-311, November 1997</name><name>Sergey Brin , Lawrence Page, The anatomy of a large-scale hypertextual Web search engine, Proceedings of the seventh international conference on World Wide Web 7, p.107-117, April 1998, Brisbane, Australia</name><name>Eric W. Brown , James P. Callan , W. Bruce Croft, Fast Incremental Indexing for Full-Text Information Retrieval, Proceedings of the 20th International Conference on Very Large Data Bases, p.192-202, September 12-15, 1994</name><name>L. J. Brown , M. P. Consens , I. J. Davis , C. R. Palmer , F. W. Tompa, A structured text ADT for object-relational databases, Theory and Practice of Object Systems, v.4 n.4, p.227-244, Oct. 12, 1998</name><name>Chris Buckley , Alan F. Lewit, Optimization of inverted vector searches, Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval, p.97-110, June 05-07, 1985, Montreal, Quebec, Canada</name><name>Soumen Chakrabarti , Mukul Joshi , Vivek Tawde, Enhanced topic distillation using text, markup tags, and hyperlinks, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.208-216, September 2001, New Orleans, Louisiana, United States</name><name>V. Christophides , S. Abiteboul , S. Cluet , M. Scholl, From structured documents to novel query facilities, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, p.313-324, May 24-27, 1994, Minneapolis, Minnesota, United States</name><name>Tuong Dao , Ron Sacks-Davis , James A. 
Thom, An Indexing Scheme for Structured Documents and its Implementation, Proceedings of the Fifth International Conference on Database Systems for Advanced Applications (DASFAA), p.125-134, April 01-04, 1997</name><name>Shaul Dar , Gadi Entin , Shai Geva , Eran Palmon, DTL's DataSpot: Database Exploration Using Plain Language, Proceedings of the 24rd International Conference on Very Large Data Bases, p.645-649, August 24-27, 1998</name><name>Ronald Fagin , Amnon Lotem , Moni Naor, Optimal aggregation algorithms for middleware, Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.102-113, May 2001, Santa Barbara, California, United States</name><name>Daniela Florescu , Donald Kossmann , Ioana Manolescu, Integrating keyword search into XML query processing, Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking, p.119-135, June 2000, Amsterdam, The Netherlands</name><name>Norbert Fuhr , Kai Gro&amp;#223;johann, XIRQL: a query language for information retrieval in XML documents, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.172-180, September 2001, New Orleans, Louisiana, United States</name><name>Roy Goldman , Narayanan Shivakumar , Suresh Venkatasubramanian , Hector Garcia-Molina, Proximity Search in Databases, Proceedings of the 24rd International Conference on Very Large Data Bases, p.26-37, August 24-27, 1998</name><name>L. Guo, F. Shao, C. Botev, J. Shanmugasundaram, "XRANK: Ranked Keyword Search Over XML Documents", Cornell University Technical Report, 2003.</name><name>Dov Harel , Robert Endre Tarjan, Fast algorithms for finding nearest common ancestors, SIAM Journal on Computing, v.13 n.2, p.338-355, May 1984</name><name>V. Hristidis, Y. 
Papakonstantinou, "DISCOVER: Keyword Search in Relational Databases", VLDB Conf., 2002.</name><name>HyTime, http://www.hytime.org.</name><name>Guy Jacobson , Balachander Krishnamurthy , Divesh Srivastava , Dan Suciu, Focusing search in hierarchical structures with directory sets, Proceedings of the seventh international conference on Information and knowledge management, p.1-9, November 02-07, 1998, Bethesda, Maryland, United States</name><name>H. V. Jagadish , Laks V. S. Lakshmanan , Tova Milo , Divesh Srivastava , Dimitra Vista, Querying network directories, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.133-144, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>Jon M. Kleinberg, Authoritative sources in a hyperlinked environment, Journal of the ACM (JACM), v.46 n.5, p.604-632, Sept. 1999</name><name>Yong Kyu Lee , Seong-Joon Yoo , Kyoungro Yoon , P. Bruce Berra, Index structures for structured documents, Proceedings of the first ACM international conference on Digital libraries, p.91-99, March 20-23, 1996, Bethesda, Maryland, United States</name><name>R. Luk, et al., "A Survey of Search Engines for XML Documents", SIGIR Workshop on XML and IR, 2000.</name><name>Sung Hyon Myaeng , Don-Hyun Jang , Mun-Seok Kim , Zong-Cheol Zhoo, A flexible model for retrieval of SGML documents, Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, p.138-145, August 24-28, 1998, Melbourne, Australia</name><name>Michael Persin, Document filtering for fast ranking, Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval, p.339-348, July 03-06, 1994, Dublin, Ireland</name><name>Gerard Salton, Automatic text processing: the transformation, analysis, and retrieval of information by computer, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1989</name><name>Albrecht Schmidt , Martin L. 
Kersten , Menzo Windhouwer, Querying XML Documents Made Easy: Nearest Concept Queries, Proceedings of the 17th International Conference on Data Engineering, p.321-329, April 02-06, 2001</name><name>A. R. Schmidt , Florian Waas , Martin L. Kersten , D. Florescu , I. Manolescu , M. J. Carey , R. Busse, The XML benchmark project, CWI (Centre for Mathematics and Computer Science), Amsterdam, The Netherlands, 2001</name><name>Igor Tatarinov , Stratis D. Viglas , Kevin Beyer , Jayavel Shanmugasundaram , Eugene Shekita , Chun Zhang, Storing and querying ordered XML using a relational database system, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Anja Theobald , Gerhard Weikum, The Index-Based XXL Search Engine for Querying XML Data with Relevance Ranking, Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology, p.477-495, March 25-27, 2002</name><name>Anthony Tomasic , H&amp;#233;ctor Garc&amp;#237;a-Molina , Kurt Shoens, Incremental updates of inverted lists for text document retrieval, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, p.289-300, May 24-27, 1994, Minneapolis, Minnesota, United States</name><name>World Wide Web Consortium, http://www.w3.org.</name></citation><abstract>We consider the problem of efficiently producing ranked results for keyword search queries over hyperlinked XML documents. Evaluating keyword search queries over hierarchical XML documents, as opposed to (conceptually) flat HTML documents, introduces many new challenges. First, XML keyword search queries do not always return entire documents, but can return deeply nested XML elements that contain the desired keywords. Second, the nested structure of XML implies that the notion of ranking is no longer at the granularity of a document, but at the granularity of an XML element. 
Finally, the notion of keyword proximity is more complex in the hierarchical XML data model. In this paper, we present the XRANK system that is designed to handle these novel features of XML keyword search. Our experimental results show that XRANK offers both space and performance benefits when compared with existing approaches. An interesting feature of XRANK is that it naturally generalizes a hyperlink based HTML search engine such as Google. XRANK can thus be used to query a mix of HTML and XML documents.</abstract></paper><paper><title>Distributed top-k monitoring</title><author><AuthorName>Brian Babcock</AuthorName><institute><InstituteName>Stanford University</InstituteName><country></country></institute></author><author><AuthorName>Chris Olston</AuthorName><institute><InstituteName>Stanford University</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>M. Arlitt and T. Jin. 1998 world cup web site access logs, Aug. 1988. Available at http://www.acm.org/sigcomm/ITA/.</name><name>M. Arlitt and T. Jin. Workload characterization of the 1998 world cup web site. Technical Report HPL-1999-35R1, Hewlett Packard, Sept. 1999.</name><name>Brian Babcock , Shivnath Babu , Mayur Datar , Rajeev Motwani , Jennifer Widom, Models and issues in data stream systems, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin</name><name>B. Babcock and C. Olston. Distributed top-k monitoring. Technical report, Stanford University Computer Science Department, 2002. 
http://dbpubs.stanford.edu/pub/2002-61.</name><name>Daniel Barbar&amp;#225; , Hector Garcia-Molina, The Demarcation Protocol: A Technique for Maintaining Linear Arithmetic Constraints in Distributed Database Systems, Proceedings of the 3rd International Conference on Extending Database Technology: Advances in Database Technology, p.373-388, March 23-27, 1992</name><name>P. A. Bernstein, B. T. Blaustein, and E. M. Clarke. Fast maintenance of semantic integrity assertions using redundant aggregate data. In Proc. VLDB, 1980.</name><name>N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. In Proc. ICDE, 2002.</name><name>D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, and S. Zdonik. Monitoring streams - a new class of data management applications. In Proc. VLDB, 2002.</name><name>Moses Charikar , Kevin Chen , Martin Farach-Colton, Finding Frequent Items in Data Streams, Proceedings of the 29th International Colloquium on Automata, Languages and Programming, p.693-703, July 08-13, 2002</name><name>Jianjun Chen , David J. DeWitt , Feng Tian , Yuan Wang, NiagaraCQ: a scalable continuous query system for Internet databases, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.379-390, May 15-18, 2000, Dallas, Texas, United States</name><name>Denial of service attacks using nameservers. Incident Note IN-2000-04, CMU Software Engineering Institute CERT Coordination Center, Apr. 2000. http://www.cert.org/incident_notes/IN-2000-04.html.</name><name>M. Dilman and D. Raz. Efficient reactive monitoring. In Proc. INFOCOM, 2001.</name><name>Cynthia Dwork , Ravi Kumar , Moni Naor , D. Sivakumar, Rank aggregation methods for the Web, Proceedings of the 10th international conference on World Wide Web, p.613-622, May 01-05, 2001, Hong Kong, Hong Kong</name><name>D. Estrin, L. Girod, G. Pottie, and M. Srivastava. 
Instrumenting the world with wireless sensor networks. In Proc. International Conference on Acoustics, Speech, and Signal Processing, 2001.</name><name>Ronald Fagin , Ravi Kumar , D. Sivakumar, Comparing top k lists, Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, January 12-14, 2003, Baltimore, Maryland</name><name>Ronald Fagin , Amnon Lotem , Moni Naor, Optimal aggregation algorithms for middleware, Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.102-113, May 2001, Santa Barbara, California, United States</name><name>Vern Paxson , Sally Floyd, Wide area traffic: the failure of Poisson modeling, IEEE/ACM Transactions on Networking (TON), v.3 n.3, p.226-244, June 1995</name><name>Phillip B. Gibbons , Yossi Matias, New sampling-based summary statistics for improving approximate query answers, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.331-342, June 01-04, 1998, Seattle, Washington, United States</name><name>Phillip B. Gibbons , Srikanta Tirthapura, Estimating simple functions on the union of data streams, Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures, p.281-291, July 2001, Crete Island, Greece</name><name>Phillip B. Gibbons , Srikanta Tirthapura, Distributed streams algorithms for sliding windows, Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures, August 10-13, 2002, Winnipeg, Manitoba, Canada</name><name>Anna C. Gilbert , Yannis Kotidis , S. 
Muthukrishnan , Martin Strauss, Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries, Proceedings of the 27th International Conference on Very Large Data Bases, p.79-88, September 11-14, 2001</name><name>Ashish Gupta , Jennifer Widom, Local verification of global integrity constraints in distributed databases, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.49-58, May 25-28, 1993, Washington, D.C., United States</name><name>A. Householder, A. Manion, L. Pesante, and G. Weaver. Managing the threat of denial-of-service attacks. Technical report, CMU Software Engineering Institute CERT Coordination Center, Oct. 2001. http://www.cert.org/archive/pdf/Managing_DoS.pdf.</name><name>Nam Huyn, Maintaining Global Integrity Constraints in Distributed Databases, Constraints, v.2 n.3-4, p.377-399, 1997</name><name>D. Li, K. Wong, Y. H. Hu, and A. Sayeed. Detection, classification and tracking of targets. IEEE Signal Processing Magazine, 2002.</name><name>Samuel Madden , Mehul Shah , Joseph M. Hellerstein , Vijayshankar Raman, Continuously adaptive continuous queries over streams, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>G. S. Manku and R. Motwani. Approximate frequency counts over data streams. In Proc. VLDB, 2002.</name><name>Rex Min , Manish Bhardwaj , Seong-Hwan Cho , Eugene Shih , Amit Sinha , Alice Wang , Anantha Chandrakasan, Low-Power Wireless Sensor Networks, Proceedings of the 14th International Conference on VLSI Design (VLSID '01), p.205, January 03-07, 2001</name><name>R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma. Query processing, resource management, and approximation in a data stream management system. In Proc. 
CIDR, 2003.</name><name>Chris Olston , Jing Jiang , Jennifer Widom, Adaptive filters for continuous queries over distributed data streams, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>T. Palpanas, R. Sidle, R. Cochrane, and H. Pirahesh. Incremental maintenance for non-distributive aggregate functions. In Proc. VLDB, 2002.</name><name>G. J. Pottie , W. J. Kaiser, Wireless integrated network sensors, Communications of the ACM, v.43 n.5, p.51-58, May 2000</name><name>Nandit Soparkar , Abraham Silberschatz, Data-valued partitioning and virtual messages (extended abstract), Proceedings of the ninth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.357-367, April 02-04, 1990, Nashville, Tennessee, United States</name><name>Takao Yamashita, Dynamic Replica Control Based on Fairly Assigned Variation of Data with Weak Consistency for Loosely Coupled Distributed Systems, Proceedings of the 22 nd International Conference on Distributed Computing Systems (ICDCS'02), p.280, July 02-05, 2002</name><name>K. Yi, H. Yu, J. Yang, G. Xia, and Y. Chen. Efficient maintenance of materialized top-k views. In Proc. ICDE, 2003.</name></citation><abstract>The querying and analysis of data streams has been a topic of much recent interest, motivated by applications from the fields of networking, web usage analysis, sensor instrumentation, telecommunications, and others. Many of these applications involve monitoring answers to continuous queries over data streams produced at physically distributed locations, and most previous approaches require streams to be transmitted to a single location for centralized processing. Unfortunately, the continual transmission of a large number of rapid data streams to a central location can be impractical or expensive. 
We study a useful class of queries that continuously report the k largest values obtained from distributed data streams ("top-k monitoring queries"), which are of particular interest because they can be used to reduce the overhead incurred while running other types of monitoring queries. We show that transmitting entire data streams is unnecessary to support these queries and present an alternative approach that reduces communication significantly. In our approach, arithmetic constraints are maintained at remote stream sources to ensure that the most recently provided top-k answer remains valid to within a user-specified error tolerance. Distributed communication is only necessary on occasion, when constraints are violated, and we show empirically through extensive simulation on real-world data that our approach reduces overall communication cost by an order of magnitude compared with alternatives that offer the same error guarantees.</abstract></paper><paper><title>Approximate join processing over data streams</title><author><AuthorName>Abhinandan Das</AuthorName><institute><InstituteName>Cornell University</InstituteName><country></country></institute></author><author><AuthorName>Johannes Gehrke</AuthorName><institute><InstituteName>Cornell University</InstituteName><country></country></institute></author><author><AuthorName>Mirek Riedewald</AuthorName><institute><InstituteName>Cornell University</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>Arvind Arasu , Brian Babcock , Shivnath Babu , Jon McAlister , Jennifer Widom, Characterizing memory requirements for queries over continuous data streams, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin</name><name>Brian Babcock , Shivnath Babu , Mayur Datar , Rajeev Motwani , Jennifer Widom, Models and issues in data stream 
systems, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin</name><name>Shivnath Babu , Jennifer Widom, Continuous queries over data streams, ACM SIGMOD Record, v.30 n.3, September 2001</name><name>D. Barbar&amp;#225;, W. DuMouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. E. Ioannidis, H. V. Jagadish, T. Johnson, R. T. Ng, V. Poosala, K. A. Ross, and K. C. Sevcik. The New Jersey data reduction report. IEEE Data Engineering Bulletin, 20(4):3--45, 1997.</name><name>Philippe Bonnet , Johannes Gehrke , Praveen Seshadri, Towards Sensor Database Systems, Proceedings of the Second International Conference on Mobile Data Management, p.3-14, January 08-10, 2001</name><name>D. Carney, U. &amp;#199;etintemel, M. Cherniack, C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, and S. Zdonik. Monitoring streams---a new class of data management applications. In Proc. Int. Conf. on Very Large Databases (VLDB), 2002.</name><name>S. Chandrasekaran and M. J. Franklin. Streaming queries over streaming data. In Proc. Int. Conf. on Very Large Databases (VLDB), 2002.</name><name>Surajit Chaudhuri , Rajeev Motwani , Vivek Narasayya, On random sampling over joins, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.263-274, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>Jianjun Chen , David J. DeWitt , Feng Tian , Yuan Wang, NiagaraCQ: a scalable continuous query system for Internet databases, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.379-390, May 15-18, 2000, Dallas, Texas, United States</name><name>Mayur Datar , Aristides Gionis , Piotr Indyk , Rajeev Motwani, Maintaining stream statistics over sliding windows: (extended abstract), Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, p.635-644, January 06-08, 2002, San Francisco, California</name><name>M. R. 
Garey and D. S. Johnson. Computers and Intractability. W. H. Freeman and Company, 1979.</name><name>Anna C. Gilbert , Yannis Kotidis , S. Muthukrishnan , Martin Strauss, Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries, Proceedings of the 27th International Conference on Very Large Data Bases, p.79-88, September 11-14, 2001</name><name>Andrew V. Goldberg, An efficient implementation of a scaling minimum-cost flow algorithm, Journal of Algorithms, v.22 n.1, p.1-29, Jan. 1997</name><name>Michael Greenwald , Sanjeev Khanna, Space-efficient online computation of quantile summaries, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.58-66, May 21-24, 2001, Santa Barbara, California, United States</name><name>Sudipto Guha , Nick Koudas , Kyuseok Shim, Data-streams and histograms, Proceedings of the thirty-third annual ACM symposium on Theory of computing, p.471-475, July 2001, Hersonissos, Greece</name><name>C. J. Hahn, S. G. Warren, and J. London. Edited synoptic cloud reports from ships and land stations over the globe, 1982--1991. http://cdiac.esd.ornl.gov/ftp/ndp026b, 1996.</name><name>David J. Hand , Padhraic Smyth , Heikki Mannila, Principles of data mining, MIT Press, Cambridge, MA, 2001</name><name>J. M. Hellerstein, M. J. Franklin, S. Chandrasekaran, A. Deshpande, K. Hildrum, S. Madden, V. Raman, and M. A. Shah. Adaptive query processing: Technology in evolution. IEEE Data Engineering Bulletin, 23(2):7--18, 2000.</name><name>Yannis E. Ioannidis , Viswanath Poosala, Histogram-Based Approximation of Set-Valued Query-Answers, Proceedings of the 25th International Conference on Very Large Data Bases, p.174-185, September 07-10, 1999</name><name>Zachary G. Ives , Daniela Florescu , Marc Friedman , Alon Levy , Daniel S. 
Weld, An adaptive query execution system for data integration, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.299-310, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>J. Kang, J. F. Naughton, and S. D. Viglas. Evaluating window joins over unbounded streams. In Proc. Int. Conf. on Data Engineering (ICDE), 2003.</name><name>F. Korn, S. Muthukrishnan, and D. Srivastava. Reverse nearest neighbor aggregates over data streams. In Proc. Int. Conf. on Very Large Databases (VLDB), 2002.</name><name>Fjording the Stream: An Architecture for Queries Over Streaming Sensor Data, Proceedings of the 18th International Conference on Data Engineering, p.555, February 26-March 01, 2002</name><name>Samuel Madden , Mehul Shah , Joseph M. Hellerstein , Vijayshankar Raman, Continuously adaptive continuous queries over streams, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>R. T. Rockafellar. Network flows and monotropic optimization. John Wiley &amp; Sons, 1984.</name><name>Yossi Rubner , Carlo Tomasi , Leonidas J. Guibas, A Metric for Distributions with Applications to Image Databases, Proceedings of the Sixth International Conference on Computer Vision, p.59, January 04-07, 1998</name><name>Nitin Thaper , Sudipto Guha , Piotr Indyk , Nick Koudas, Dynamic multidimensional histograms, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>C. J. Van Rijsbergen, Information Retrieval, Butterworth-Heinemann, Newton, MA, 1979</name></citation><abstract>We consider the problem of approximating sliding window joins over data streams in a data stream processing system with limited resources. In our model, we deal with resource constraints by shedding load in the form of dropping tuples from the data streams. 
We first discuss alternate architectural models for data stream join processing, and we survey suitable measures for the quality of an approximation of a set-valued query result. We then consider the number of generated result tuples as the quality measure, and we give optimal offline and fast online algorithms for it. In a thorough experimental study with synthetic and real data we show the efficacy of our solutions. For applications with demand for exact results we introduce a new Archive-metric which captures the amount of work needed to complete the join in case the streams are archived for later processing.</abstract></paper><paper><title>Spreadsheets in RDBMS for OLAP</title><author><AuthorName>Andrew Witkowski</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Srikanth Bellamkonda</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Tolga Bozkaya</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Gregory Dorman</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Nathan Folkert</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Abhinav Gupta</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Lei Shen</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Sankar Subramanian</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>APB Benchmark Specifications. 
http://www.olapcouncil.org/research/APB1R2_spec.pdf</name><name>Andrey Balmin , Thanos Papadimitriou , Yannis Papakonstantinou, Hypothetical Queries in an OLAP Environment, Proceedings of the 26th International Conference on Very Large Data Bases, p.220-231, September 10-14, 2000</name><name>Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States</name><name>Randall G. Bello , Karl Dias , Alan Downing , James J. Feenan, Jr. , James L. Finnerty , William D. Norcott , Harry Sun , Andrew Witkowski , Mohamed Ziauddin, Materialized Views in Oracle, Proceedings of the 24th International Conference on Very Large Data Bases, p.659-664, August 24-27, 1998</name><name>Jose A. Blakeley , Per-Ake Larson , Frank Wm Tompa, Efficiently updating materialized views, Proceedings of the 1986 ACM SIGMOD international conference on Management of data, p.61-71, May 28-30, 1986, Washington, D.C., United States</name><name>Damianos Chatziantoniou , Kenneth A. Ross, Querying Multiple Features of Groups in Relational Databases, Proceedings of the 22nd International Conference on Very Large Data Bases, p.295-306, September 03-06, 1996</name><name>Ashish Gupta , Inderpal Singh Mumick , V. S. Subrahmanian, Maintaining views incrementally, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.157-166, May 25-28, 1993, Washington, D.C., United States</name><name>Antonin Guttman, R-trees: a dynamic index structure for spatial searching, Proceedings of the 1984 ACM SIGMOD international conference on Management of data, June 18-21, 1984, Boston, Massachusetts</name><name>Joseph M. 
Hellerstein, Practical predicate placement, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, p.325-335, May 24-27, 1994, Minneapolis, Minnesota, United States</name><name>Joseph M. Hellerstein , Michael Stonebraker, Predicate migration: optimizing queries with expensive predicates, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.267-276, May 25-28, 1993, Washington, D.C., United States</name><name>Laks V. S. Lakshmanan , Jian Pei , Yan Zhao, QC-trees: an efficient summary structure for semantic OLAP, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California</name><name>Alon Y. Levy , Inderpal Singh Mumick , Yehoshua Sagiv, Query Optimization by Predicate Move-Around, Proceedings of the 20th International Conference on Very Large Data Bases, p.96-107, September 12-15, 1994</name><name>Kenneth A. Ross , Divesh Srivastava , Peter J. Stuckey , S. Sudarshan, Foundations of Aggregation Constraints, Proceedings of the Second International Workshop on Principles and Practice of Constraint Programming, p.193-204, May 02-04, 1994</name><name>I. S. Mumick , S. J. Finkelstein , Hamid Pirahesh , Raghu Ramakrishnan, Magic is relevant, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.247-258, May 23-26, 1990, Atlantic City, New Jersey, United States</name><name>Divesh Srivastava , Raghu Ramakrishnan, Pushing constraint selections, Proceedings of the eleventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.301-315, June 02-05, 1992, San Diego, California, United States</name><name>Yannis Sismanis , Antonios Deligiannakis , Nick Roussopoulos , Yannis Kotidis, Dwarf: shrinking the PetaCube, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>R. Tarjan. "Depth-first search and linear graph algorithms," SIAM J. 
Computing, 1997.</name><name>F. Zemke. "Rank, Moving and reporting functions for OLAP," 99/01/22 proposal for ANSI-NCTS.</name></citation><abstract>One of the critical deficiencies of SQL is lack of support for n-dimensional array-based computations which are frequent in OLAP environments. Relational OLAP (ROLAP) applications have to emulate them using joins, recently introduced SQL Window Functions [18] and complex and inefficient CASE expressions. The designated place in SQL for specifying calculations is the SELECT clause, which is extremely limiting and forces the user to generate queries using nested views, subqueries and complex joins. Furthermore, SQL-query optimizer is pre-occupied with determining efficient join orders and choosing optimal access methods and largely disregards optimization of complex numerical formulas. Execution methods concentrated on efficient computation of a cube [11], [16] rather than on random access structures for inter-row calculations. This has created a gap that has been filled by spreadsheets and specialized MOLAP engines, which are good at formulas for mathematical modeling but lack the formalism of the relational model, are difficult to manage, and exhibit scalability problems. This paper presents SQL extensions involving array based calculations for complex modeling. In addition, we present optimizations, access structures and execution models for processing them efficiently.</abstract></paper><paper><title>QC-trees: an efficient summary structure for semantic OLAP</title><author><AuthorName>Laks V. S. 
Lakshmanan</AuthorName><institute><InstituteName>University of British Columbia, Canada</InstituteName><country></country></institute></author><author><AuthorName>Jian Pei</AuthorName><institute><InstituteName>State University of New York at Buffalo</InstituteName><country></country></institute></author><author><AuthorName>Yan Zhao</AuthorName><institute><InstituteName>University of British Columbia, Canada</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>Sameet Agarwal , Rakesh Agrawal , Prasad Deshpande , Ashish Gupta , Jeffrey F. Naughton , Raghu Ramakrishnan , Sunita Sarawagi, On the Computation of Multidimensional Aggregates, Proceedings of the 22nd International Conference on Very Large Data Bases, p.506-521, September 03-06, 1996</name><name>Andrey Balmin , Thanos Papadimitriou , Yannis Papakonstantinou, Hypothetical Queries in an OLAP Environment, Proceedings of the 26th International Conference on Very Large Data Bases, p.220-231, September 10-14, 2000</name><name>Daniel Barbará , Mark Sullivan, Quasi-cubes: exploiting approximations in multidimensional databases, ACM SIGMOD Record, v.26 n.3, p.12-17, Sept. 1997</name><name>Daniel Barbará , Xintao Wu, Using Loglinear Models to Compress Datacube, Proceedings of the First International Conference on Web-Age Information Management, p.311-322, June 21-23, 2000</name><name>Kevin Beyer , Raghu Ramakrishnan, Bottom-up computation of sparse and Iceberg CUBE, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.359-370, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>C. Carpineto and G. Romano. Galois: An order-theoretic approach to conceptual clustering. 
In ICML'93.</name><name>Sara Cohen , Werner Nutt , Alexander Serebrenik, Rewriting aggregate queries using views, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.155-166, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States</name><name>Jim Gray , Adam Bosworth , Andrew Layman , Hamid Pirahesh, Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total, Proceedings of the Twelfth International Conference on Data Engineering, p.152-159, February 26-March 01, 1996</name><name>Ashish Gupta , Inderpal Singh Mumick , V. S. Subrahmanian, Maintaining views incrementally, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.157-166, May 25-28, 1993, Washington, D.C., United States</name><name>C.Hahn et al. Edited synoptic cloud reports from ships and land stations over the globe, 1982--1991. cdiac.est.ornl.gov/ftp/ndp026b/SEP85L.Z, 1994.</name><name>Venky Harinarayan , Anand Rajaraman , Jeffrey D. Ullman, Implementing data cubes efficiently, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.205-216, June 04-06, 1996, Montreal, Quebec, Canada</name><name>C. A. Hurtado et al. Maintaining data cubes under dimension updates. In ICDE'99.</name><name>Carlos A. Hurtado , Alberto O. Mendelzon , Alejandro A. Vaisman, Updating OLAP dimensions, Proceedings of the 2nd ACM international workshop on Data warehousing and OLAP, p.60-66, November 02-06, 1999, Kansas City, Missouri, United States</name><name>L. V. S. Lakshmanan et al. Quotient cube: How to summarize the semantics of a data cube. In VLDB'02.</name><name>Alon Y. Levy , Alberto O. Mendelzon , Yehoshua Sagiv, Answering queries using views (extended abstract), Proceedings of the fourteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.95-104, May 22-25, 1995, San Jose, California, United States</name><name>Alberto O. Mendelzon , Alejandro A. 
Vaisman, Temporal Queries in OLAP, Proceedings of the 26th International Conference on Very Large Data Bases, p.242-253, September 10-14, 2000</name><name>Inderpal Singh Mumick , Dallan Quass , Barinderpal Singh Mumick, Maintenance of data cubes and summary tables in a warehouse, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.100-111, May 11-15, 1997, Tucson, Arizona, United States</name><name>Dallan Quass , Ashish Gupta , Inderpal Singh Mumick , Jennifer Widom, Making views self-maintainable for data warehousing, Proceedings of the fourth international conference on Parallel and distributed information systems, p.158-169, December 18-20, 1996, Miami Beach, Florida, United States</name><name>Dallan Quass , Jennifer Widom, On-line warehouse view maintenance, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.393-404, May 11-15, 1997, Tucson, Arizona, United States</name><name>Kenneth A. Ross , Divesh Srivastava, Fast Computation of Sparse Datacubes, Proceedings of the 23rd International Conference on Very Large Data Bases, p.116-125, August 25-29, 1997</name><name>Nick Roussopoulos , Yannis Kotidis , Mema Roussopoulos, Cubetree: organization of and bulk incremental updates on the data cube, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.89-99, May 11-15, 1997, Tucson, Arizona, United States</name><name>S. Sarawagi. Indexing OLAP data. IEEE Data Eng. Bulletin, 20:36--43, 1997.</name><name>Gayatri Sathe , Sunita Sarawagi, Intelligent Rollups in Multidimensional OLAP Data, Proceedings of the 27th International Conference on Very Large Data Bases, p.531-540, September 11-14, 2001</name><name>Jayavel Shanmugasundaram , Usama Fayyad , P. S. 
Bradley, Compressed data cubes for OLAP aggregate query approximation on continuous dimensions, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.223-232, August 15-18, 1999, San Diego, California, United States</name><name>Yannis Sismanis , Antonios Deligiannakis , Nick Roussopoulos , Yannis Kotidis, Dwarf: shrinking the PetaCube, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin</name><name>Jeffrey Scott Vitter , Min Wang , Bala Iyer, Data cube approximation and histograms via wavelets, Proceedings of the seventh international conference on Information and knowledge management, p.96-104, November 02-07, 1998, Bethesda, Maryland, United States</name><name>W. Wang et al. Condensed cube: An effective approach to reducing data cube size. In ICDE'02.</name><name>Jun Yang , Jennifer Widom, Maintaining Temporal Views over Non-Temporal Information Sources for Data Warehousing, Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology, p.389-403, March 23-27, 1998</name><name>Jun Yang , Jennifer Widom, Temporal View Self-Maintenance, Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology, p.395-412, March 27-31, 2000</name><name>Yihong Zhao , Prasad M. Deshpande , Jeffrey F. Naughton, An array-based algorithm for simultaneous multidimensional aggregates, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.159-170, May 11-15, 1997, Tucson, Arizona, United States</name></citation><abstract>Recently, a technique called quotient cube was proposed as a summary structure for a data cube that preserves its semantics, with applications for online exploration and visualization. The authors showed that a quotient cube can be constructed very efficiently and it leads to a significant reduction in the cube size. 
While it is an interesting proposal, that paper leaves many issues unaddressed. Firstly, a direct representation of a quotient cube is not as compact as possible and thus still wastes space. Secondly, while a quotient cube can in principle be used for answering queries, no specific algorithms were given in the paper. Thirdly, maintaining any summary structure incrementally against updates is an important task, a topic not addressed there. In this paper, we propose an efficient data structure called QC-tree and an efficient algorithm for directly constructing it from a base table, solving the first problem. We give efficient algorithms that address the remaining questions. We report results from an extensive performance study that illustrate the space and time savings achieved by our algorithms over previous ones (wherever they exist).</abstract></paper><paper><title>Winnowing: local algorithms for document fingerprinting</title><author><AuthorName>Saul Schleimer</AuthorName><institute><InstituteName>University of Illinois, Chicago</InstituteName><country></country></institute></author><author><AuthorName>Daniel S. Wilkerson</AuthorName><institute><InstituteName>Computer Science Division, UC Berkeley</InstituteName><country></country></institute></author><author><AuthorName>Alex Aiken</AuthorName><institute><InstituteName>Computer Science Division, UC Berkeley</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>Arvind Arasu , Junghoo Cho , Hector Garcia-Molina , Andreas Paepcke , Sriram Raghavan, Searching the Web, ACM Transactions on Internet Technology (TOIT), v.1 n.1, p.2-43, Aug. 2001</name><name>B. S. Baker, On finding duplication and near-duplication in large software systems, Proceedings of the Second Working Conference on Reverse Engineering, p.86, July 14-16, 1995</name><name>Brenda S. Baker and Udi Manber. 
Deducing similarities in Java sources from bytecodes. In Proc. of Usenix Annual Technical Conf., pages 179--190, 1998.</name><name>Sergey Brin , James Davis , Héctor García-Molina, Copy detection mechanisms for digital documents, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, p.398-409, May 22-25, 1995, San Jose, California, United States</name><name>Andrei Broder. On the resemblance and containment of documents. In SEQS: Sequences '91, 1998.</name><name>Andrei Z. Broder , Steven C. Glassman , Mark S. Manasse , Geoffrey Zweig, Syntactic clustering of the Web, Selected papers from the sixth international conference on World Wide Web, p.1157-1166, September 1997, Santa Clara, California, United States</name><name>The Crystals. Da do run run, 1963.</name><name>Nevin Heintze. Scalable document fingerprinting. In 1996 USENIX Workshop on Electronic Commerce, November 1996.</name><name>James Joyce. Finnegans wake {1st trade ed.}. Faber and Faber (London), 1939.</name><name>Richard M. Karp , Michael O. Rabin, Efficient randomized pattern-matching algorithms, IBM Journal of Research and Development, v.31 n.2, p.249-260, March 1987</name><name>Sergio Leone, Clint Eastwood, Eli Wallach, and Lee Van Cleef. The Good, the Bad and the Ugly / Il Buono, Il Brutto, Il Cattivo (The Man with No Name). Produzioni Europee Associate (Italy) Production, Distributed by United Artists (USA), 1966.</name><name>Udi Manber. Finding similar files in a large file system. In Proceedings of the USENIX Winter 1994 Technical Conference, pages 1--10, San Francisco, CA, USA, 17--21 1994.</name><name>Peter Mork, Beitao Li, Edward Chang, Junghoo Cho, Chen Li, and James Wang. Indexing tamper resistant features for image copy detection, 1999. URL: citeseer.nj.nec.com/mork99indexing.html.</name><name>Narayanan Shivakumar and Héctor García-Molina. SCAM: A copy detection mechanism for digital documents. 
In Proceedings of the Second Annual Conference on the Theory and Practice of Digital Libraries, 1995.</name><name>Esko Ukkonen. On-line construction of suffix trees. Algorithmica, 14:249--260, 1995.</name><name>George K. Zipf. The Psychobiology of Language. Houghton Mifflin Co., 1935.</name></citation><abstract>Digital content is for copying: quotation, revision, plagiarism, and file sharing all create copies. Document fingerprinting is concerned with accurately identifying copying, including small partial copies, within large sets of documents. We introduce the class of local document fingerprinting algorithms, which seems to capture an essential property of any fingerprinting technique guaranteed to detect copies. We prove a novel lower bound on the performance of any local algorithm. We also develop winnowing, an efficient local fingerprinting algorithm, and show that winnowing's performance is within 33% of the lower bound. Finally, we also give experimental results on Web data, and report experience with MOSS, a widely-used plagiarism detection service.</abstract></paper><paper><title>Information sharing across private databases</title><author><AuthorName>Rakesh Agrawal</AuthorName><institute><InstituteName>IBM Almaden Research Center, San Jose, CA</InstituteName><country></country></institute></author><author><AuthorName>Alexandre Evfimievski</AuthorName><institute><InstituteName>Cornell University, Ithaca, NY</InstituteName><country></country></institute></author><author><AuthorName>Ramakrishnan Srikant</AuthorName><institute><InstituteName>IBM Almaden Research Center, San Jose, CA</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>Nabil R. Adam , John C. Worthmann, Security-control methods for statistical databases: a comparative study, ACM Computing Surveys (CSUR), v.21 n.4, p.515-556, Dec. 1989</name><name>R. Agrawal and J. Kiernan. 
Watermarking relational databases. In 28th Int'l Conference on Very Large Databases, Hong Kong, China, August 2002.</name><name>R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu. Hippocratic databases. In Proc. of the 28th Int'l Conference on Very Large Databases, Hong Kong, China, August 2002.</name><name>R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu. Implementing P3P using database technology. In Proc. of the 19th Int'l Conference on Data Engineering, Bangalore, India, March 2003.</name><name>Rakesh Agrawal , Jerry Kiernan , Ramakrishnan Srikant , Yirong Xu, An XPath-based preference language for P3P, Proceedings of the 12th international conference on World Wide Web, May 20-24, 2003, Budapest, Hungary</name><name>Rakesh Agrawal , Ramakrishnan Srikant, Privacy-preserving data mining, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.439-450, May 15-18, 2000, Dallas, Texas, United States</name><name>S. Ajmani, R. Morris, and B. Liskov. A trusted third-party computation service. Technical Report MIT-LCS-TR-847, MIT, May 2001.</name><name>Mihir Bellare , Phillip Rogaway, Random oracles are practical: a paradigm for designing efficient protocols, Proceedings of the 1st ACM conference on Computer and communications security, p.62-73, November 03-05, 1993, Fairfax, Virginia, United States</name><name>Josh Benaloh , Michael de Mare, One-way accumulators: a decentralized alternative to digital signatures, Workshop on the theory and application of cryptographic techniques on Advances in cryptology, p.274-285, January 1994, Lofthus, Norway</name><name>Dan Boneh, The Decision Diffie-Hellman Problem, Proceedings of the Third International Symposium on Algorithmic Number Theory, p.48-63, June 21-25, 1998</name><name>C. Cachin, S. Micali, and M. Stadler. Computationally private information retrieval with polylogarithmic communication. In Theory and Application of Cryptographic Techniques, pages 402--414, 1999.</name><name>S. Chawathe, H. 
Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous information sources. In 16th Meeting of the Information Processing Society of Japan, pages 7--18, Tokyo, Japan, 1994.</name><name>F. Chin and G. Ozsoyoglu. Auditing and inference control in statistical databases. IEEE Transactions on Software Eng., SE-8(6):113--139, April 1982.</name><name>Benny Chor , Niv Gilboa, Computationally private information retrieval (extended abstract), Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, p.304-313, May 04-06, 1997, El Paso, Texas, United States</name><name>B. Chor , O. Goldreich , E. Kushilevitz , M. Sudan, Private information retrieval, Proceedings of the 36th Annual Symposium on Foundations of Computer Science (FOCS'95), p.41, October 23-25, 1995</name><name>U. Dayal and H.-Y. Hwang. View definition and generalization for database integration in a multidatabase system. IEEE Transactions on Software Eng., 10(6):628--645, 1984.</name><name>Dorothy E. Denning , Peter J. Denning , Mayer D. Schwartz, The tracker: a threat to statistical database security, ACM Transactions on Database Systems (TODS), v.4 n.1, p.76-96, March 1979</name><name>W. Diffie and M. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT-22(6):644--654, November 1976.</name><name>David Dobkin , Anita K. Jones , Richard J. Lipton, Secure databases: protection against user influence, ACM Transactions on Database Systems (TODS), v.4 n.1, p.97-106, March 1979</name><name>T. ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. 
IEEE Transactions on Information Theory, IT-31(4):469--472, July 1985.</name><name>Ahmed Elmagarmid , Marek Rusinkiewicz , Amit Sheth, Management of heterogeneous and autonomous database systems, Morgan Kaufmann Publishers Inc., San Francisco, CA, 1998</name><name>Alexandre Evfimievski , Ramakrishnan Srikant , Rakesh Agrawal , Johannes Gehrke, Privacy preserving mining of association rules, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada</name><name>I. Fellegi. On the question of statistical confidentiality. Journal of the American Statistical Assoc., 67(337):7--18, March 1972.</name><name>Amos Fiat , Adi Shamir, How to prove yourself: practical solutions to identification and signature problems, Proceedings on Advances in cryptology---CRYPTO '86, p.186-194, January 1987, Santa Barbara, California, United States</name><name>Yael Gertner , Yuval Ishai , Eyal Kushilevitz , Tal Malkin, Protecting data privacy in private information retrieval schemes, Proceedings of the thirtieth annual ACM symposium on Theory of computing, p.151-160, May 24-26, 1998, Dallas, Texas, United States</name><name>O. Goldreich. Secure multi-party computation. Working Draft, Version 1.3, June 2001.</name><name>L. M. Haas, R. J. Miller, B. Niswonger, M. T. Roth, P. M. Schwarz, and E. L. Wimmers. Transforming heterogeneous data with database middleware: Beyond integration. IEEE Data Engineering Bulletin, 22(1), 1999.</name><name>Bernardo A. Huberman , Matt Franklin , Tad Hogg, Enhancing privacy and trust in electronic communities, Proceedings of the 1st ACM conference on Electronic commerce, p.78-86, November 03-05, 1999, Denver, Colorado, United States</name><name>P. Ipeirotis and L. Gravano. Distributed search over the hidden web: Hierarchical database sampling and selection. In 28th Int'l Conference on Very Large Databases, Hong Kong, China, August 2002.</name><name>Nigel Jefferies , Chris J. 
Mitchell , Michael Walker, A Proposed Architecture for Trusted Third Party Services, Proceedings of the International Conference on Cryptography: Policy and Algorithms, p.98-104, July 03-05, 1995</name><name>M. Kantarcioglu and C. Clifton. Privacy-preserving distributed mining of association rules on horizontally partitioned data. In ACM SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, June 2002.</name><name>E. Kushilevitz , R. Ostrovsky, Replication is not needed: single database, computationally-private information retrieval, Proceedings of the 38th Annual Symposium on Foundations of Computer Science (FOCS '97), p.364, October 19-22, 1997</name><name>Y. Lindell and B. Pinkas. Privacy preserving data mining. Journal of Cryptology, 15(3):177--206, 2002.</name><name>Moni Naor , Kobbi Nissim, Communication preserving protocols for secure function evaluation, Proceedings of the thirty-third annual ACM symposium on Theory of computing, p.590-599, July 2001, Hersonissos, Greece</name><name>Moni Naor , Benny Pinkas, Oblivious transfer and polynomial evaluation, Proceedings of the thirty-first annual ACM symposium on Theory of computing, p.245-254, May 01-04, 1999, Atlanta, Georgia, United States</name><name>Moni Naor , Benny Pinkas, Efficient oblivious transfer protocols, Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, p.448-457, January 07-09, 2001, Washington, D.C., United States</name><name>Moni Naor , Benny Pinkas , Reuban Sumner, Privacy preserving auctions and mechanism design, Proceedings of the 1st ACM conference on Electronic commerce, p.129-139, November 03-05, 1999, Denver, Colorado, United States</name><name>B. Preneel. Analysis and design of cryptographic hash functions. Ph.D. dissertation, Katholieke Universiteit Leuven, 1992.</name><name>M. O. Rabin. How to exchange secrets by oblivious transfer. Technical Memo TR-81, Aiken Computation Laboratory, Harvard University, 1981.</name><name>S. J. 
Rizvi and J. R. Haritsa. Privacy-preserving association rule mining. In Proc. of the 28th Int'l Conference on Very Large Databases, August 2002.</name><name>Gerard Salton , Michael J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, Inc., New York, NY, 1986</name><name>A. Shamir, R. L. Rivest, and L. M. Adleman. Mental poker. Technical Memo MIT-LCS-TM-125, Laboratory for Computer Science, MIT, February 1979.</name><name>C. E. Shannon. Communication theory of secrecy systems. Bell System Technical Journal, 28--4:656--715, 1949.</name><name>Arie Shoshani, Statistical Databases: Characteristics, Problems, and some Solutions, Proceedings of the 8th International Conference on Very Large Data Bases, p.208-222, September 08-10, 1982</name><name>S. W. Smith and D. Safford. Practical private information retrieval with secure coprocessors. Research Report RC 21806, IBM, July 2000.</name><name>Douglas Stinson, Cryptography: Theory and Practice, Second Edition, CRC/C&amp;H, 2002</name><name>Jaideep Vaidya , Chris Clifton, Privacy preserving association rule mining in vertically partitioned data, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada</name><name>Gio Wiederhold, Intelligent integration of information, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.434-437, May 25-28, 1993, Washington, D.C., United States</name><name>A. C. Yao. How to generate and exchange secrets. In Proc. of the 27th Annual Symposium on Foundations of Computer Science, pages 162--167, Toronto, Canada, October 1986.</name></citation><abstract>Literature on information integration across databases tacitly assumes that the data in each database can be revealed to the other databases. However, there is an increasing need for sharing information across autonomous entities in such a way that no information apart from the answer to the query is revealed. 
We formalize the notion of minimal information sharing across private databases, and develop protocols for intersection, equijoin, intersection size, and equijoin size. We also show how new applications can be built using the proposed protocols.</abstract></paper><paper><title>Rights protection for relational data</title><author><AuthorName>Radu Sion</AuthorName><institute><InstituteName>Purdue University West Lafayette, IN</InstituteName><country></country></institute></author><author><AuthorName>Mikhail Atallah</AuthorName><institute><InstituteName>Purdue University West Lafayette, IN</InstituteName><country></country></institute></author><author><AuthorName>Sunil Prabhakar</AuthorName><institute><InstituteName>Purdue University West Lafayette, IN</InstituteName><country></country></institute></author><year>2003</year><conference>International Conference on Management of Data</conference><citation><name>M. J. Atallah and Jr. S. S. Wagstaff. Watermarking with quadratic residues. In Proc. of IS-T/SPIE Conf. on Security and Watermarking of Multimedia Contents, SPIE Vol. 3657, pp. 283--288., 1999.</name><name>Mikhail J. Atallah , Victor Raskin , Michael Crogan , Christian Hempelmann , Florian Kerschbaum , Dina Mohamed , Sanket Naik, Natural Language Watermarking: Design, Analysis, and a Proof-of-Concept Implementation, Proceedings of the 4th Internati
