📄 sigmod_2006_elementary.txt

📁 利用lwp：：get写的
💻 TXT
📖 第 1 页 / 共 5 页
字号:
<proceedings><paper><title>Speeding up search in peer-to-peer networks with a multi-way tree structure</title><author><AuthorName>H. V. Jagadish</AuthorName><institute><InstituteName>University of Michigan, Ann Arbor, MI</InstituteName><country></country></institute></author><author><AuthorName>Beng Chin Ooi</AuthorName><institute><InstituteName>National University of Singapore, Singapore</InstituteName><country></country></institute></author><author><AuthorName>Kian-Lee Tan</AuthorName><institute><InstituteName>National University of Singapore, Singapore</InstituteName><country></country></institute></author><author><AuthorName>Quang Hieu Vu</AuthorName><institute><InstituteName>National University of Singapore, Singapore</InstituteName><country></country></institute></author><author><AuthorName>Rong Zhang</AuthorName><institute><InstituteName>Fudan University, China</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation></citation><abstract>Peer-to-Peer systems have recently become a popular means to share resources. Effective search is a critical requirement in such systems, and a number of distributed search structures have been proposed in the literature. Most of these structures provide "log time search" capability, where the logarithm is taken base 2. That is, in a system with N nodes, the cost of the search is O(log2N).In database systems, the importance of large fanout index structures has been well recognized. In P2P search too, the cost could be reduced considerably if this logarithm were taken to a larger base. In this paper, we propose a multi-way tree search structure, which reduces the cost of search to O(logmN), where m is the fanout. The penalty paid is a larger update cost, but we show how to keep this penalty to be no worse than linear in m. We experimentally explore this tradeoff between search and update cost as a function of m, and suggest how to find a good trade-off point.The multi-way tree structure we propose, BATON*, is derived from the BATON structure that has recently been suggested. In addition to multi-way fanout, BATON* also adds support for multi-attribute queries to BATON.</abstract></paper><paper><title>Reconciling while tolerating disagreement in collaborative data sharing</title><author><AuthorName>Nicholas E. Taylor</AuthorName><institute><InstituteName>University of Pennsylvania</InstituteName><country></country></institute></author><author><AuthorName>Zachary G. Ives</AuthorName><institute><InstituteName>University of Pennsylvania</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>M. Arenas, L. E. Bertossi, and J. Chomicki. Consistent query answers in inconsistent databases. In PODS, pages 68--79, 1999.</name><name>A. Bairoch and R. Apweiler. The SWISS-PROT protein sequence database and its supplement TrEMBL. Nucleic Acids Research, 28:45--48, 2000.</name><name>P. Buneman, S. Khanna, and W. C. Tan. Why and where: A characterization of data provenance. In ICDT, pages 316--330, 2001.</name><name>D. Calvanese, G. D. Giacomo, M. Lenzerini, and R. Rosati. Logical foundations of peer-to-peer data integration. In PODS, pages 241--251, 2004.</name><name>S. Ceri, M. A. W. Houtsma, A. M. Keller, and P. Samarati. Independent updates and incremental agreement in replicated databases. Distributed and Parallel Databases, 3(3):225--246, 1995.</name><name>Y. Cui. Lineage Tracing in Data Warehouses. PhD thesis, Stanford University, 2001.</name><name>Concurrent versions system. Available from www.cvshome.org.</name><name>A. Demers, D. Greene, C. Hauser, W. Irish, and J. Larson. Epidemic algorithms for replicated database maintenance. In PODC, 1987.</name><name>W. K. Edwards, E. D. Mynatt, K. Petersen, M. J. Spreitzer, D. B. Terry, and M. M. Theimer. Designing and implementing asynchronous collaborative applications with Bayou. In UIST, pages 119--128, 1997.</name><name>J. N. Foster, M. B. Greenwald, J. T. Moore, B. C. Pierce, and A. Schmitt. Combinators for bi-directional tree transformations: A linguistic approach to the view update problem. Technical Report MS-CIS-004-15, University of Pennsylvania, July 2004.</name><name>H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv, J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous information sources. Journal of Intelligent Information Systems, 8(2):117--132, March 1997.</name><name>S. Ghandeharizadeh, R. Hull, and D. Jacobs. Heraclitus: Elevating deltas to be first-class citizens in a database programming language. TODS, 21(3):370--426, 1996.</name><name>A. Y. Halevy, Z. G. Ives, D. Suciu, and I. Tatarinov. Schema mediation in peer data management systems. In ICDE, pages 505--516, March 2003.</name><name>Z. Ives, N. Khandelwal, A. Kapur, and M. Cakir. Orchestra: Rapid, collaborative sharing of dynamic data. In CIDR, pages 107--118, January 2005.</name><name>A. Kementsietsidis, M. Arenas, and R. J. Miller. Mapping data in peer-to-peer systems: Semantics and algorithmic issues. In SIGMOD, June 2003.</name><name>A.-M. Kermarrec, A. Rowstron, M. Shapiro, and P. Druschel. The IceCube approach to the reconciliation of diverging replicas. In PODC, August 2001.</name><name>H. T. Kung and J. T. Robinson. On optimistic methods for concurrency control. TODS, 6(2):213--226, 1981.</name><name>L. Lamport. Concurrent reading and writing of clocks. ACM Trans. Comput. Syst., 8(4):305--310, 1990.</name><name>D. Lembo, M. Lenzerini, and R. Rosati. Source inconsistency and incompleteness in data integration. In KRDB '02, April 2002.</name><name>A. Y. Levy, A. Rajaraman, and J. J. Ordille. Querying heterogeneous information sources using source descriptions. In VLDB, pages 251--262, 1996.</name><name>K. Moore. The Lotus Notes storage system. In SIGMOD, pages 427--428, 1995.</name><name>P. Mork, R. Shaker, A. Halevy, and P. Tarczy-Hornoch. PQL: A declarative query language over dynamic biological schemata. In American Medical Informatics Association (AMIA) Symposium, 2002, November 2002.</name><name>A. Muthitacharoen, R. Morris, T. M. Gil, and B. Chen. Ivy: A read/write peer-to-peer file system. In OSDI, 2002.</name><name>National Center for Biotechnology Information. GenBank. Available from www.ncbi.nlm.nih.gov/GenBank/.</name><name>D. S. Parker, Jr., G. J. Popek, G. Rudisin, A. Stoughton, B. J. Walker, E. Walton, J. M. Chow, D. A. Edwards, S. Kiser, and C. S. Kline. Detection of mutual inconsistency in distributed systems. IEEE Trans. Software Eng., 9(3):240--247, 1983.</name><name>B. C. Pierce, T. Jim, and J. Vouillon. Unison: A portable, cross-platform file synchronizer, 1999--2001.</name><name>S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A scalable content-addressable network. In Proc. of ACM SIGCOMM '01, 2001.</name><name>A. Rowstron and P. Druschel. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), pages 329--350, Nov. 2001.</name><name>F. Sadri. Aggregate operations in the information source tracking method. Theor. Comput. Sci., 133(2):421--442, 1994.</name><name>M. Satyanarayanan, J. J. Kistler, P. Kumar, M. E. Okasaki, E. H. Siegel, and D. C. Steere. Coda: A highly available file system for a distributed workstation environment. IEEE Trans. Comput., 39(4):447--459, 1990.</name><name>I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for Internet applications. In Proc. of ACM SIGCOMM '01, 2001.</name><name>J. Widom. Trio: A system for integrated management of data, accuracy, and lineage. In CIDR, 2005.</name></citation><abstract>In many data sharing settings, such as within the biological and biomedical communities, global data consistency is not always attainable: different sites' data may be dirty, uncertain, or even controversial. Collaborators are willing to share their data, and in many cases they also want to selectively import data from others --- but must occasionally diverge when they disagree about uncertain or controversial facts or values. For this reason, traditional data sharing and data integration approaches are not applicable, since they require a globally consistent data instance. Additionally, many of these approaches do not allow participants to make updates; if they do, concurrency control algorithms or inconsistency repair techniques must be used to ensure a consistent view of the data for all users.In this paper, we develop and present a fully decentralized model of collaborative data sharing, in which participants publish their data on an ad hoc basis and simultaneously reconcile updates with those published by others. Individual updates are associated with provenance information, and each participant accepts only updates with a sufficient authority ranking, meaning that each participant may have a different (though conceptually overlapping) data instance. We define a consistency semantics for database instances under this model of disagreement, present algorithms that perform reconciliation for distributed clusters of participants, and demonstrate their ability to handle typical update and conflict loads in settings involving the sharing of curated data.</abstract></paper><paper><title>Approximately detecting duplicates for streaming data using stable bloom filters</title><author><AuthorName>Fan Deng</AuthorName><institute><InstituteName>University of Alberta, Edmonton, Alberta, Canada</InstituteName><country></country></institute></author><author><AuthorName>Davood Rafiei</AuthorName><institute><InstituteName>University of Alberta, Edmonton, Alberta, Canada</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation></citation><abstract>Traditional duplicate elimination techniques are not applicable to many data stream applications. In general, precisely eliminating duplicates in an unbounded data stream is not feasible in many streaming scenarios. Therefore, we target at approximately eliminating duplicates in streaming environments given a limited space. Based on a well-known bitmap sketch, we introduce a data structure, Stable Bloom Filter, and a novel and simple algorithm. The basic idea is as follows: since there is no way to store the whole history of the stream, SBF continuously evicts the stale information so that SBF has room for those more recent elements. After finding some properties of SBF analytically, we show that a tight upper bound of false positive rates is guaranteed. In our empirical study, we compare SBF to alternative methods. The results show that our method is superior in terms of both accuracy and time effciency when a fixed small space and an acceptable false positive rate are given.</abstract></paper><paper><title>Query evaluation using overlapping views: completeness and efficiency</title><author><AuthorName>Gang Gou</AuthorName><institute><InstituteName>North Carolina State University, Raleigh, NC</InstituteName><country></country></institute></author><author><AuthorName>Maxim Kormilitsin</AuthorName><institute><InstituteName>North Carolina State University, Raleigh, NC</InstituteName><country></country></institute></author><author><AuthorName>Rada Chirkova</AuthorName><institute><InstituteName>North Carolina State University, Raleigh, NC</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>F. Afrati and R. Chirkova. Selecting and using views to compute aggregate queries. In Proc. ICDT, 2005.</name><name>F. Afrati, C. Li, and J.D. Ullman. Generating efficient plans for queries using views. In Proc. SIGMOD, 2001.</name><name>R.G. Bello, K. Dias, A. Downing, J.J. Feenan Jr., J.L. Finnerty, W.D. Norcott, H. Sun, A. Witkowski, and M. Ziauddin. Materialized views in Oracle. In VLDB, 1998.</name><name>A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in relational databases. In Proc. 9th ACM STOC, pages 77--90, 1977.</name><name>S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. SIGMOD Record, 26(1):65--74, 1997.</name><name>S. Chaudhuri, R. Krishnamurthy, S. Potamianos, and K. Shim. Optimizing queries with materialized views. In Proc. ICDE, pages 190--200, 1995.</name><name>S. Chaudhuri and M. Y. Vardi. Optimization of real conjunctive queries. In Proc. PODS, pages 59--70, 1993.</name><name>S. Cohen, W. Nutt, and A. Serebrenik. Rewriting aggregate queries using views. In Proc. PODS, pages 155--166, 1999.</name><name>S. Cohen, W. Nutt, and A. Serebrenik. Algorithms for rewriting aggregate queries using views. In Proc. ADBIS-DASFAA, pages 65--78, 2000.</name><name>D. DeHaan, P.-&amp;#197;. Larson, and J. Zhou. Stacked indexed views in Microsoft SQL server. In Proc. SIGMOD, 2005.</name><name>H. Garcia-Molina, J.D. Ullman, and J. Widom. Database Systems: The Complete Book. Prentice Hall, 2002.</name><name>J. Goldstein and P.-&amp;#197;. Larson. Optimizing queries using materialized views: A practical, scalable solution. In Proc. SIGMOD, 2001.</name><name>A. Gupta, V. Harinarayan, and D. Quass. Aggregate-query processing in data warehousing environments. In Proc. VLDB, pages 358--369, 1995.</name><name>H. Gupta, V. Harinarayan, A. Rajaraman, and J. D. Ullman. Index selection for OLAP. In Proc. ICDE, 1997.</name><name>A.Y. Halevy. Answering queries using views: A survey. VLDB Journal, 10(4):270--294, 2001.</name><name>P.-&amp;#197;. Larson and H.Z. Yang. Computing queries from derived relations. In Proc. VLDB, pages 259--269, 1985.</name><name>A. Y. Levy, A. O. Mendelzon, Y. Sagiv, and D. Srivastava. Answering queries using views. In Proc. PODS, 1995.</name><name>R. Pottinger and A.Y. Halevy. MiniCon: A scalable algorithm for answering queries using views. VLDB Journal, 10(2-3):182--198, 2001.</name><name>P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In Proc. SIGMOD, 1979.</name><name>D. Srivastava, S. Dar, H.V. Jagadish, and A.Y. Levy. Answering queries with aggregation using views. In Proc. VLDB, pages 318--329, 1996.</name><name>D. Theodoratos and T. Sellis. Data warehouse configuration. In Proc. VLDB, pages 126--135, 1997.</name><name>O.G. Tsatalos, M.H. Solomon, and Y.E. Ioannidis. The GMAP: A versatile tool for physical data independence. VLDB Journal, 5(2):101--118, 1996.</name><name>H.Z. Yang and P.-&amp;#197;. Larson. Query transformation for PSJ-queries. In Proc. VLDB, pages 245--254, 1987.</name><name>M. Zaharioudakis, R. Cochrane, G. Lapis, H. Pirahesh, and M. Urata. Answering complex SQL queries using automatic summary tables. In Proc. SIGMOD, pages 105--116, 2000.</name></citation><abstract>We study the problem of finding efficient equivalent view-based rewritings of relational queries, focusing on query optimization using materialized views under the assumption that base relations cannot contain duplicate tuples. A lot of work in the literature addresses the problems of answering queries using views and query optimization. However, most of it proposes solutions for special cases, such as for conjunctive queries (CQs) or for aggregate queries only. In addition, most of it addresses the problems separately under set or bag-set semantics for query evaluation, and some of it proposes heuristics without formal proofs for completeness or soundness. In this paper we look at the two problems by considering CQ/A queries - that is, both pure conjunctive and aggregate queries, with aggregation functions SUM, COUNT, MIN, and MAX; the DISTINCT keyword in (SQL versions of) our queries is also allowed. We build on past work to provide algorithms that handle this general setting. This is possible because recent results on rewritings of CQ/A queries [1, 8] show that there are sound and complete algorithms based on containment tests of CQs.Our focus is that our algorithms are efficient as well as sound and complete. Besides the contribution we make in putting and addressing the problems in this general setting, we make two additional contributions for bag-set and set semantics. First, we propose efficient sound and complete tests for equivalence of CQ/A queries to rewritings that use overlapping views (the algorithms are complete with respect to the language of rewritings). These results apply not only to query optimization, but to all areas where the goal is to obtain efficient equivalent view-based query rewritings. Second, based on these results we propose two sound algorithms, BDPV and CDPV, that find efficient execution plans for CQ/A queries in terms of materialized views. Both algorithms extend the cost-based query-optimization approach of System R [19]. The efficient sound algorithm BDPV is also complete in some cases, whereas CDPV is sound and complete for all CQ/A queries we consider. We present a study of the completeness-efficiency tradeoff in the algorithms, and provide experimental results that show the viability of our approach and test the limits of query optimization using overlapping views.</abstract></paper><paper><title>User-defined aggregate functions: bridging theory and practice</title><author><AuthorName>Sara Cohen</AuthorName><institute><InstituteName>Technion---Israel Institute of Technology, Technion City, Haifa, Israel</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>J. Blakeley, N. Coburn, and P.-A. Larson. Updating derived relations: Detecting irrelevant and autonomously computable updates. ACM TODS, 14(3):369--400, 1989.</name><name>S. Chaudhuri and K. Shim. Optimizing queries with aggregate views. In EDBT, 1996.</name><name>S. Cohen, W. Nutt, and Y. Sagiv. Rewriting queries with arbitrary aggregation functions using views. ACM TODS. Accepted for publication.</name><name>S. Cohen, W. Nutt, and A. Serebrenik. Rewriting aggregate queries using views. In PODS, 1999.</name><name>C. Galindo-Legaria and M. Joshi. Orthogonal optimization of subqueries and aggregation. In SIGMOD, 2001.</name><name>J. Goldstein and P.-A. Larson. Optimizing queries using materialized views: A practical, scalable solution. In SIGMOD, 2001.</name><name>T. Griffin and L. Libkin. Incremental maintenance of views with duplicates. In SIGMOD, 1995.</name><name>S. Grumbach and L. Tininini. On the content of materialized aggregate views. JCSS, 66(1):133--168, 2003.</name><name>A. Gupta, V. Harinarayan, and D. Quass. Aggregate-query processing in data warehousing environments. In VLDB, 1995.</name><name>A. Gupta, I. Mumick, and V. Subrahmanian. Maintaining views incrementally. In SIGMOD, 1993.</name><name>V. Harinarayan, A. Rajaraman, and J. Ullman. Implementing data cubes efficiently. In SIGMOD, 1996.</name><name>A. Levy, I. Mumick, and Y. Sagiv. Query optimization by predicate move-around. In VLDB, 1994.</name><name>A. Levy, A. Rajamaran, and J. Ordille. Querying heterogeneous information sources using source description. In VLDB, 1996.</name><name>A. Levy and Y. Sagiv. Queries independent of updates. In VLDB, pages 171--181, 1993.</name><name>C. Lucchesi and S. Osborn. Candidate keys for relations. JCSS, 17(2):270--279, 1978.</name><name>M. Mohania and Y. Kambayashi. Making aggregate views self-maintainable. DKE, 32(1):87--109, 2000.</name><name>I. Mumick, H. Pirahesh, and R. Ramakrishnan. The magic of duplicates and aggregates. In VLDB, 1990.</name><name>R. Pottinger and A. Halevy. MiniCon: a scalable algorithm for answering queries using views. The VLDB Journal, 10(2--3):20--26, 2001.</name><name>D. Quass. Maintenance expressions for views with aggregation. In Workshop on Materialized Views, 1996.</name><name>R. Ramakrishnan, K. Ross, D. Srivastava, and S. Sudarshan. Efficient incremental evaluation of queries with aggregation. In SLP, 1994.</name><name>K. Ross, D. Srivastava, and S. Sudarshan. Materialized view maintenance and integrity constraint checking: Trading space for time. In SIGMOD, 1996.</name><name>D. Srivastava, S. Dar, H. Jagadish, and A. Levy. Answering queries with aggregation using views. In VLDB, 1996.</name><name>D. Theodoratos and T. Sellis. Answering multidimensional queries on cubes using other cubes. In SSDBM, 2000.</name><name>J. D. Ullman. Principles of Database and Knowledge-Base Systems, volume I. Computer Science Press, 1988.</name><name>W. Yan and P.-A. Larson. Eager aggregation and lazy aggregation. In VLDB, 1995.</name><name>M. Zaharioudakis, R. Cochrane, G. Lapis, H. Pirahesh, and M. Urata. Answering complex SQL queries using automatic summary tables. In SIGMOD, 2000.</name></citation><abstract>The ability to create user-defined aggregate functions (UDAs) is rapidly becoming a standard feature in relational database systems. Therefore, problems such as query optimization, query rewriting and view maintenance must take into account queries (or views) with UDAs. There is a wealth of research on these problems for queries with general aggregate functions. Unfortunately, there is a mismatch between the manner in which UDAs are created, and the information that the database system requires in order to apply previous research.The purpose of this paper is to explore this mismatch and to bridge the gap between theory and practice, thereby enabling UDAs to become first-class citizens within the database. Specifically, we consider query optimization, query rewriting and view maintenance for queries with UDAs. For each of these problems we first survey previous results and explore the mismatch between theory and practice. We then present theoretical and practical insights that can be combined to derive a coherent framework for defining UDAs within a database system.</abstract></paper><paper><title>Supporting ad-hoc ranking aggregates</title><author><AuthorName>Chengkai Li</AuthorName><institute><InstituteName>University of Illinois at Urbana-Champaign, Urbana, IL</InstituteName><country></country></institute></author><author><AuthorName>Kevin Chen-Chuan Chang</AuthorName><institute><InstituteName>University of Illinois at Urbana-Champaign, Urbana, IL</InstituteName><country></country></institute></author><author><AuthorName>Ihab F. Ilyas</AuthorName><institute><InstituteName>University of Waterloo, Waterloo, Ontario, Canada</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>F. N. Afrati and R. Chirkova. Selecting and using views to compute aggregate queries (extended abstract). In ICDT, pages 383--397, 2005.</name><name>S. Agarwal, R. Agrawal, P. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. In VLDB, pages 506--521, 1996.</name><name>K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cube. SIGMOD, pages 359--370, 1999.</name><name>N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. In ICDE, 2002.</name><name>M. J. Carey and D. Kossmann. On saying "enough already!" in SQL. In SIGMOD, pages 219--230, 1997.</name><name>K. C.-C. Chang and S. Hwang. Minimal probing: Supporting expensive predicates for top-k queries. In SIGMOD, 2002.</name><name>S. Chaudhuri and L. Gravano. Evaluating top-k selection queries. In VLDB, pages 397--410, 1999.</name><name>S. Chaudhuri, R. Ramakrishnan, and G. Weikum. Integrating DB and IR technologies: What is the sound of one hand clapping? In CIDR, pages 1--12, 2005.</name><name>S. Chaudhuri and K. Shim. Including group-by in query optimization. In VLDB, pages 354--366, 1994.</name><name>J. Claussen, A. Kemper, D. Kossmann, and C. Wiesner. Exploiting early sorting and early partitioning for decision support query processing. VLDB J., 9(3), 2000.</name><name>S. Cohen, W. Nutt, and A. Serebrenik. Rewriting aggregate queries using views. In PODS, pages 155--166, 1999.</name><name>D. Donjerkovic and R. Ramakrishnan. Probabilistic optimization of top n queries. In VLDB, 1999.</name><name>R. Fagin. Combining fuzzy information from multiple systems. In PODS, pages 216--226, 1996.</name><name>R. Fagin, A. Lote, and M. Naor. Optimal aggregation algorithms for middleware. In PODS, 2001.</name><name>M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J. D. Ullman. Computing iceberg queries efficiently. In VLDB, pages 299--310, San Francisco, CA, USA, 1998.</name><name>J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. J. Data Mining and Knowledge Discovery, 1(1):29--53, 1997.</name><name>A. Gupta, V. Harinarayan, and D. Quass. Aggregate query processing in data warehousing environments. In VLDB, pages 358--369, 1995.</name><name>H. Gupta, V. Harinarayan, A. Rajaraman, and J. D. Ullman. Index selection for OLAP. In ICDE, pages 208--219, 1997.</name><name>P. J. Haas and J. M. Hellerstein. Ripple joins for online aggregation. In SIGMOD, pages 287--298, 1999.</name><name>J. Han, J. Pei, G. Dong, and K. Wang. Efficient computation of iceberg cubes with complex measures. In SIGMOD, 2001.</name><name>V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In SIGMOD Conference, pages 205--216, 1996.</name><name>J. M. Hellerstein, P. J. Haas, and H. J. Wang. Online aggregation. In SIGMOD, pages 171--182, 1997.</name><name>I. F. Ilyas, W. G. Aref, and A. K. Elmagarmid. Supporting top-k join queries in relational databases. In VLDB, pages 754--765, 2003.</name><name>I. F. Ilyas, R. Shah, W. G. Aref, J. S. Vitter, and A. K. Elmagarmid. Rank-aware query optimization. In SIGMOD, pages 203--214, 2004.</name><name>C. Li, K. C.-C. Chang, and I. F. Ilyas. Efficient processing of ad-hoc top-k aggregate queries in OLAP. Technical Report UIUCDCS-R-2005-2596, Department of Computer Science, UIUC, June 2005. http://aim.cs.uiuc.edu.</name><name>C. Li, K. C.-C. Chang, I. F. Ilyas, and S. Song. RankSQL: Query algebra and optimization for relational top-k queries. In SIGMOD, pages 131--142, 2005.</name><name>H.-G. Li, H. Yu, D. Agrawal, , and A. E. Abbadi. Ranking aggregates. Technical report, UCSB, July 2004.</name><name>V. Lin, V. Vassalos, and P. Malakasiotis. MiniCount: Efficient rewriting of COUNT-queries using views. In ICDE, 2006.</name><name>T. Neumann and G. Moerkotte. A combined framework for grouping and order optimization. In VLDB, 2004.</name><name>K. A. Ross and D. Srivastava. Fast computation of sparse datacubes. In VLDB, pages 116--125, 1997.</name><name>D. E. Simmen, E. J. Shekita, and T. Malkemus. Fundamental techniques for order optimization. In SIGMOD, 1996.</name><name>D. Srivastava, S. Dar, H. V. Jagadish, and A. Y. Levy. Answering queries with aggregation using views. In VLDB, pages 318--329, 1996.</name><name>A. Tsois and T. K. Sellis. The generalized pre-grouping transformation: Aggregate-query optimization in the presence of dependencies. In VLDB, pages 644--655, 2003.</name><name>W. P. Yan and P.-&amp;#197;. Larson. Performing group-by before join. In ICDE, pages 89--100, 1994.</name><name>W. P. Yan and P.-&amp;#197;. Larson. Eager aggregation and lazy aggregation. In VLDB'95, pages 345--357, 1995.</name><name>Y. Zhao, P. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. In SIGMOD, pages 159--170, 1997.</name></citation><abstract>This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support of such queries is lacking in current systems, which process the queries in a na&amp;#239;ve materialize-group-sort scheme that can be prohibitively inefficient. Our framework is based on three fundamental principles. The Upper-Bound Principle dictates the requirements of early pruning, and the Group-Ranking and Tuple-Ranking Principles dictate group-ordering and tuple-ordering requirements. They together guide the query processor toward a provably optimal tuple schedule for aggregate query processing. We propose a new execution framework to apply the principles and requirements. We address the challenges in realizing the framework and implementing new query operators, enabling efficient group-aware and rank-aware query plans. The experimental study validates our framework by demonstrating orders of magnitude performance improvement in the new query plans, compared with the traditional plans.</abstract></paper><paper><title>MauveDB: supporting model-based user views in database systems</title><author><AuthorName>Amol Deshpande</AuthorName><institute><InstituteName>University of Maryland</InstituteName><country></country></institute></author><author><AuthorName>Samuel Madden</AuthorName><institute><InstituteName>MIT</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless sensor networks: a survey. Computer Networks, 38, 2002.</name><name>Periklis Andritsos, Ariel Fuxman, and Renee J. Miller. Clean answers over dirty databases. In ICDE, 2006.</name><name>The Apache Derby Project. Web Site. http://db.apache.org/derby/.</name><name>D. Barbara, H. Garcia-Molina, and D. Porter. The management of probabilistic data. IEEE TKDE, 4(5):487--502, 1992.</name><name>Tim Brooke and Jenna Burrell. From ethnography to design in a vineyard. In Proceeedings of the Design User Experiences (DUX) Conference, June 2003.</name><name>A. Cerpa, J. Elson, D.Estrin, L. Girod, M. Hamilton, and J. Zhao. Habitat monitoring: Application driver for wireless communications technology. In Proceedings of ACM SIGCOMM 2001 Workshop on Data Communications in Latin America and the Caribbean.</name><name>Surajit Chaudhuri, Vivek Narasayya, and Sunita Sarawagi. Efficient evaluation of queries with mining predicates. In Proceedings of ICDE, 2002.</name><name>Reynold Cheng, Dmitri V. Kalashnikov, and Sunil Prabhakar. Evaluating probabilistic queries over imprecise data. In Proceedings of SIGMOD, 2003.</name><name>M. Chu, H. Haussecker, and F. Zhao. Scalable information-driven sensor querying and routing for ad hoc heterogeneous sensor networks. In Intl Journal of High Performance Computing Applications, 2002.</name><name>Nilesh N. Dalvi and Dan Suciu. Efficient query evaluation on probabilistic databases. In VLDB, 2004.</name><name>Dorothy E. Denning et al. Views for multilevel database security. IEEE Trans. Softw. Eng., 1987.</name><name>Amol Deshpande, Carlos Guestrin, Sam Madden, Joe Hellerstein, and Wei Hong. Model-driven data acquisition in sensor networks. In VLDB, 2004.</name><name>Norbert Fuhr and Thomas Rolleke. A probabilistic relational algebra for the integration of information retrieval and database systems. ACM Trans. Inf. Syst., 15(1):32--66, 1997.</name><name>G. Golub and C. Van Loan. Matrix Computations. Johns Hopkins, 1989.</name><name>G. Grahne. Horn tables - an efficient tool for handling incomplete information in databases. In PODS, 1989.</name><name>S. Grumbach, P. Rigaux, and L. Segoufin. Manipulating interpolated data is easier than you thought. In VLDB, 2000.</name><name>C. Guestrin, P. Bodik, R. Thibaux, M. Paskin, and S. Madden. Distributed regression: an efficient frame- work for modeling sensor network data. In IPSN, 2004.</name><name>A. Gupta and I.S. Mumick. Materialized views: techniques, implementations, and applications. MIT Press, 1999.</name><name>David Hand, Heikki Mannila, and Padhraic Smyth. Principles of Data Mining. MIT Press, 2001.</name><name>DB2 Intelligent Miner. Web Site. http://www-306.ibm.com/software/data/iminer/.</name><name>T. Imielinski and W. Lipski Jr. Incomplete infor- mation in relational databases. JACM, 31(4), 1984.</name><name>C. Intanagonwiwat, R. Govindan, and D. Estrin. Directed diffusion: A scalable and robust communication paradigm for sensor networks. In MOBICOM, 2000.</name><name>A. Jain, E. Change, and Y. Wang. Adaptive stream resource management using kalman filters. In SIGMOD, 2004.</name><name>L. V. S. Lakshmanan, N. Leone, R. Ross, and V. S. Subrahmanian. Probview: a flexible probabilistic database system. ACM TODS, 22(3), 1997.</name><name>Suk Kyoon Lee. An extended relational database model for uncertain and imprecise information. In VLDB, 1992.</name><name>L. Liao, D. Fox, and H. Kautz. Location-based activity recognition using relational markov networks. In IJCAI, 2005.</name><name>Sam Madden. Intel lab data, 2004. http://berkeley.intel-research.net/labdata.</name><name>Samuel Madden, Wei Hong, Joseph M. Hellerstein, and Michael Franklin. TinyDB web page. http://telegraph.cs.berkeley.edu/tinydb.</name><name>A. Mainwaring, J. Polastre, R. Szewczyk, and D. Culler. Wireless sensor networks for habitat monitoring. In ACM Workshop on Sensor Networks and Applications, 2002.</name><name>Erin McKean, editor. The Oxford English Dictionary (2nd Edition). Oxford Univeristy Press, 2005.</name><name>Leonore Neugebauer. Optimization and evaluation of database queries including embedded interpolation procedures. In Proceedings of SIGMOD, 1991.</name><name>George M. Phillips. Interpolation and Approximation by Polynomials. Springer-Verlag, 2003.</name><name>PMML 3.0 Specification. Web Site. http://www.dmg.org/v3-0/GeneralStructure.html.</name><name>S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with databases: alternatives and implications. In Proceedings of SIGMOD, 1998.</name><name>Business Analytics Software Solutions (SAS). Web Site. http://www.sas.com/technologies/analytics.</name><name>J. Widom. Trio: A system for integrated management of data, accuracy, and lineage. In CIDR, 2005.</name><name>Y. Xia, S. Prabhakar, S. Lei, R. Cheng, and R. Shah. Indexing continuously changing data with mean-variance tree. In ACM SAC, 2005.</name><name>Y. Yao and J. Gehrke. Query processing in sensor networks. In CIDR, 2003.</name></citation><abstract>Real-world data --- especially when generated by distributed measurement infrastructures such as sensor networks --- tends to be incomplete, imprecise, and erroneous, making it impossible to present it to users or feed it directly into applications. The traditional approach to dealing with this problem is to first process the data using statistical or probabilistic models that can provide more robust interpretations of the data. Current database systems, however, do not provide adequate support for applying models to such data, especially when those models need to be frequently updated as new data arrives in the system. Hence, most scientists and engineers who depend on models for managing their data do not use database systems for archival or querying at all; at best, databases serve as a persistent raw data store.In this paper we define a new abstraction called model-based views and present the architecture of MauveDB, the system we are building to support such views. Just as traditional database views provide logical data independence, model-based views provide independence from the details of the underlying data generating mechanism and hide the irregularities of the data by using models to present a consistent view to the users. MauveDB supports a declarative language for defining model-based views, allows declarative querying over such views using SQL, and supports several different materialization strategies and techniques to efficiently maintain them in the face of frequent updates. We have implemented a prototype system that currently supports views based on regression and interpolation, using the Apache Derby open source DBMS, and we present results that show the utility and performance benefits that can be obtained by supporting several different types of model-based views in a database system.</abstract></paper><paper><title>Database support for matching: limitations and opportunities</title><author><AuthorName>Ameet Kini</AuthorName><institute><InstituteName>University of Wisconsin - Madison, Madison, WI</InstituteName><country></country></institute></author><author><AuthorName>Srinath Shankar</AuthorName><institute><InstituteName>University of Wisconsin - Madison, Madison, WI</InstituteName><country></country></institute></author><author><AuthorName>Jeffrey F. Naughton</AuthorName><institute><InstituteName>University of Wisconsin - Madison, Madison, WI</InstituteName><country></country></institute></author><author><AuthorName>David J. Dewitt</AuthorName><institute><InstituteName>University of Wisconsin - Madison, Madison, WI</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>R. Agrawal and H.V. Jagadish. "Materialization and Incremental Update of Path Information", Proceedings of ICDE 1989, p.374--383.</name><name>R. Agrawal and E. Wimmers. "A Framework for Expressing and Combining Preferences", Proceedings of ACM SIGMOD 2000, p. 297--306.</name><name>R. K. Ahuja, T.L. Magnanti, and J.B. Orlin, Network Flows: Theory, Algorithms, and Applications. Prentice Hall, Englewood Cliffs, NJ, 1993.</name><name>C. Berge, "Two Theorems in Graph Theory" Proceedings of the National Academy of Science USA, 1957, p. 842--844.</name><name>Y. Chang et al. "The Onion Technique: Indexing for Linear Optimization Queries", Proceedings of ACM SIGMOD 2000, p. 391--402.</name><name>J. Feigenbaum et al., "On Graph Problems in a Semi-Streaming Model", Proceedings of ICALP 2004, p. 531--543</name><name>A.V. Goldberg and R. E. Tarjan. "A new approach to the maximum-flow problem", Journal of the ACM (JACM), v.35 n.4, Oct 1988, p. 921--940.</name><name>L Gravano and S. Chaudhuri. "Evaluating Top-k Selection Queries", Proceedings of VLDB 1999, p. 397--410</name><name>S. Guha et al. "Efficient Approximation of Optimization Queries Under Parametric Aggregation Constraints", Proceedings of VLDB 2003, p. 778--789</name><name>S. Guha et al. "Merging the Results of Approximate Match Operations", Proceedings of VLDB 2004, p. 636--647.</name><name>J. Hopcroft and R. Karp. "An n5/2 Algorithm for Maximum Matching in Bipartite Graphs", SIAM Journal of Computing, 1975, p. 225--231.</name><name>I. Ilyas et al. "Supporting Top-k Join Queries in Relational Databases", VLDB Journal, v.13 n.3, p. 207--221.</name><name>R. Karp, U.V. Vazirani, and V.V. Vazirani. "An optimal algorithm for online bipartite matching", Proceedings of ACM STOC 1990, p. 352--358.</name><name>J. Magun. "Greedy Matching Algorithms: An experimental study", Proceedings of the 1st Workshop on Algorithm Engineering, p. 22--31, 1997.</name><name>A. Mehta et al. "AdWords and Generalized Online Matching", Proceedings of IEEE FOCS 2005, p. 264--273.</name><name>Predator Project. http://www.distlab.dk/predator/</name><name>R. Raman et al. "Matchmaking: Distributed Resource Management for High Throughput Computing", Proceedings of IEEE HPDC 1998, p. 140--146.</name><name>C. Schlup. "Automatic Game Matching", http://dcg.ethz.ch/theses/ws0203/OnlineMatching_abstract.pdf</name><name>T. Tannenbaum et al. "Condor - A Distributed Job Scheduler", Beowulf Cluster Computing with Linux, The MIT Press, 2002.</name><name>P. Tsaparas et al. "Ranked Join Indices", Proceedings of IEEE ICDE 2003, p. 277--288.</name><name>http://www.research.microsoft.com/mlp/trueskill/default.aspx</name></citation><abstract>We define a match join of R and S with predicate &amp;#952; to be a subset of the &amp;#952;-join of R and S such that each tuple of R and S contributes to at most one result tuple. Match joins and their generalizations belong to a broad class of matching problems that have attracted a great deal of attention in disciplines including operations research and theoretical computer science. Instances of these problems arise in practice in resource allocation scenarios. To the best of our knowledge no one uses an RDBMS as a tool to help solve these problems; our goal in this paper is to explore whether or not this needs to be the case. We show that the simple approach of computing the full &amp;#952;-join and then applying standard graph-matching algorithms to the result is ineffective for all but the smallest of problem instances. By contrast, a closer study shows that the DBMS primitives of grouping, sorting, and joining can be exploited to yield efficient match join operations. This suggests that RDBMSs can play a role in matching related problems beyond merely serving as expensive file systems exporting data sets to external user programs.</abstract></paper><paper><title>Declarative networking: language, execution and optimization</title><author><AuthorName>Boon Thau Loo</AuthorName><institute><InstituteName>UC Berkeley</InstituteName><country></country></institute></author><author><AuthorName>Tyson Condie</AuthorName><institute><InstituteName>UC Berkeley</InstituteName><country></country></institute></author><author><AuthorName>Minos Garofalakis</AuthorName><institute><InstituteName>Intel Research Berkeley</InstituteName><country></country></institute></author><author><AuthorName>David E. Gay</AuthorName><institute><InstituteName>Intel Research Berkeley</InstituteName><country></country></institute></author><author><AuthorName>Joseph M. Hellerstein</AuthorName><institute><InstituteName>University of Wisconsin-Madison</InstituteName><country></country></institute></author><author><AuthorName>Petros Maniatis</AuthorName><institute><InstituteName>Intel Research Berkeley</InstituteName><country></country></institute></author><author><AuthorName>Raghu Ramakrishnan</AuthorName><institute><InstituteName>University of Wisconsin-Madison</InstituteName><country></country></institute></author><author><AuthorName>Timothy Roscoe</AuthorName><institute><InstituteName>Intel Research Berkeley</InstituteName><country></country></institute></author><author><AuthorName>Ion Stoica</AuthorName><institute><InstituteName>UC Berkeley</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>GT-ITM. http://www.cc.gatech.edu/projects/gtitm/.</name><name>S. Abiteboul, Z. Abrams, S. Haar, and T. Milo. Diagnosis of Asynchronous Discrete Event Systems - Datalog to the Rescue! In ACM PODS, 2005.</name><name>I. Balbin and K. Ramamohanarao. A Generalization of the Differential Approach to Recursive Query Evaluation. Journal of Logic Prog, 4(3):259--262, 1987.</name><name>F. Bancilhon. Naive Evaluation of Recursively Defined Relations. On Knowledge Base Management Systems: Integrating AI and DB Technologies, 1986.</name><name>F. Bancilhon, D. Maier, Y. Sagiv, and J. Ullman. Magic Sets and Other Strange Ways to Implement Logic Programs. In SIGMOD, 1986.</name><name>M. Y. Becker and P. Sewell. Cassandra: Distributed Access Control Policies with Tunable Expressiveness. In 5th IEEE International Workshop on Policies for Distributed Systems and Networks, 2004.</name><name>P. A. Bernstein, U. Dayal, D. J. DeWitt, D. Gawlick, J. Gray, M. Jarke, B. G. Lindsay, P. C. Lockemann, D. Maier, E. J. Neuhold, A. Reuter, L. A. Rowe, H.-J. Schek, J. W. Schmidt, M. Schrefl, and M. Stonebraker. Future Directions in DBMS Research. SIGMOD Record, 18(1):17--26, 1989.</name><name>F. Cacace, S. Ceri, and M. A. W. Houtsma. A survey of parallel execution strategies for transitive closure and logic programs. Distributed and Parallel Databases, 1(4):337--382, 1993.</name><name>D. Calvanese, G. D. Giacomo, and M. Y. Vardi. Decidable Containment of Recursive Queries. In ICDT, 2003.</name><name>Emulab. http://www.emulab.net.</name><name>N. Feamster and H. Balakrishnan. Correctness properties for Internet routing. In Allerton Conference on Communication, Control, and Computing, Sept. 2005.</name><name>F. Furfaro, S. Greco, S. Ganguly, and C. Zaniolo. Pushing Extrema Aggregates to Optimize Logic Queries. Inf.Sys., 27(5):321--343, 2002.</name><name>Overcoming barriers to disruptive innovation in networking. Report of NSF Workshop, Jan. 2005.</name><name>G. Graefe. Encapsulation of Parallelism in the Volcano Query Processing System. In SIGMOD, 1990.</name><name>A. Gupta, I. S. Mumick, and V. S. Subrahmanian. Maintaining Views Incrementally. In SIGMOD, 1993.</name><name>Z. J. Haas. A New Routing Protocol for the Reconfigurable Wireless Networks. In IEEE Int. Conf. on Universal Personal Communications, 1997.</name><name>R. Huebsch, B. Chun, J. Hellerstein, B. T. Loo, P. Maniatis, T. Roscoe, S. Shenker, I. Stoica, and A. R. Yumerefendi. The Architecture of PIER: an Internet-Scale Query Processor. In CIDR, 2005.</name><name>Jeffery Ullman. Assigning an Appropriate Meaning to Database Logic with Negation. Computers as Our Better Partners, pages 216--225, 1994.</name><name>T. Jim and D. Suciu. Dynamically Distributed Query Evaluation. In PODS, 2001.</name><name>D. B. Johnson and D. A. Maltz. Dynamic Source Routing in Ad Hoc Wireless Networks. In Mobile Computing, volume 353. 1996.</name><name>R. Krishnamurthy, R. Ramakrishnan, and O. Shmueli. A Framework for Testing Safety and Effective Computability. J. Comp. Sys. Sci. 52(1):100--124, 1996.</name><name>Laurent Vieille. Recursive Axioms in Deductive Database: The Query-Subquery Approach. In 1st International Conference on Expert Database Systems, 1986.</name><name>B. T. Loo, T. Condie, J. M. Hellerstein, P. Maniatis, T. Roscoe, and I. Stoica. Implementing Declarative Overlays. In 20th ACM Symposium on Operating Systems Principles (SOSP), 2005.</name><name>B. T. Loo, J. M. Hellerstein, I. Stoica, and R. Ramakrishnan. Declarative Routing: Extensible Routing with Declarative Queries. In SIGCOMM, 2005.</name><name>L. Peterson and B. Davie. Computer Networks: A Systems Approach. Morgan-KaufMann, 2003.</name><name>L. Peterson, S. Shenker, and J. Turner. Overcoming the Internet Impasse Through Virtualization. In HotNets-III, 2004.</name><name>R. Ramakrishnan, K. A. Ross, D. Srivastava, and S. Sudarshan. Efficient Incremental Evaluation of Queries with Aggregation. In SIGMOD, 1992.</name><name>R. Ramakrishnan and J. D. Ullman. A Survey of Research on Deductive Database Systems. Journal of Logic Programming, 23(2):125--149, 1993.</name><name>V. Ramasubramanian, Z. J. Haas, and E. G. Sirer. SHARP: A Hybrid Adaptive Routing Protocol for Mobile Ad Hoc Networks. In ACM MobiHoc, 2003.</name><name>J. Rohmer, R. Lescoeur, and J. M. Kerisit. Alexander Method - A Technique for the Processing of Recursive Axioms in Deductive Databases. New Generation Computing 4:522--528, 1986.</name><name>C. R.Palmer, P. B. Gibbons, and C. Faloutsos. ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs. In ACM SIGKDD, pages 102--111, 2002.</name><name>J. Shao, D. A. Bell, and M. E. C. Hull. An Experimental Performance Study of a pipelined recursive query processing strategy. In International Symposium on Databases for Parallel and Distributed Systems, 1990.</name><name>A. Singh, P. Maniatis, T. Roscoe, and P. Druschel. Distributed Monitoring and Forensics in Overlay Networks. In Eurosys, 2006.</name><name>I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In SIGCOMM, 2001.</name><name>M. Stonebraker and J. M. Hellerstein, editors. Readings in Database Systems, Third Edition. Morgan Kaufmann, San Francisco, 1998.</name><name>S. Sudarshan and R. Ramakrishnan. Aggregation and Relevance in Deductive Databases. In VLDB, 1991.</name><name>J. Whaley and M. S. Lam. Cloning-Based Context-Sensitive Pointer Alias Analysis Using Binary Decision Diagrams. In PLDI, 2004.</name></citation><abstract>The networking and distributed systems communities have recently explored a variety of new network architectures, both for application-level overlay networks, and as prototypes for a next-generation Internet architecture. In this context, we have investigated declarative networking: the use of a distributed recursive query engine as a powerful vehicle for accelerating innovation in network architectures [23, 24, 33]. Declarative networking represents a significant new application area for database research on recursive query processing. In this paper, we address fundamental database issues in this domain. First, we motivate and formally define the Network Datalog (NDlog) language for declarative network specifications. Second, we introduce and prove correct relaxed versions of the traditional semi-na&amp;#239;ve query evaluation technique, to overcome fundamental problems of the traditional technique in an asynchronous distributed setting. Third, we consider the dynamics of network state, and formalize the iheventual consistencyl. of our programs even when bursts of updates can arrive in the midst of query execution. Fourth, we present a number of query optimization opportunities that arise in the declarative networking context, including applications of traditional techniques as well as new optimizations. Last, we present evaluation results of the above ideas implemented in our P2 declarative networking system, running on 100 machines over the Emulab network testbed.</abstract></paper><paper><title>Forensic analysis of database tampering</title><author><AuthorName>Kyriacos Pavlou</AuthorName><institute><InstituteName>University of Arizona, Tucson, AZ</InstituteName><country></country></institute></author><author><AuthorName>Richard T. Snodgrass</AuthorName><institute><InstituteName>University of Arizona, Tucson, AZ</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>I. Ahn and R. T. Snodgrass, "Partitioned Storage Structures for Temporal Databases," Information Systems, Vol. 13, No. 4, December 1988, pp. 369--391.</name><name>J. Bair, M. B&amp;#246;hlen, C. S. Jensen, and R. T. Snodgrass, "Notions of Upward Compatibility of Temporal Query Languages," Business Informatics (Wirtschafts Informatik) 39(1):25--34, February, 1997.</name><name>K. Fu, M. F. Kaashoek and D. Mazi&amp;#232;res, "Fast and secure distributed read-only file system," in Proceedings of the USENIX Symposium on Operating Systems Design and Implementation, pp. 181--196, October 2000.</name><name>P. A. Gerr, B. Babineau, and P. C. Gordon, "Compliance: the effect on information management and the storage industry," Enterprise Storage Group Technical Report, May 2003.</name><name>S. Haber and W. S. Stornetta, "How To Time-Stamp a Digital Document," Journal of Cryptology 3:99--111, 1999.</name><name>W. W. Hsu and S. Ong, "Fossilization: A process for establishing truly trustworthy records," IBM Research report RJ 10331, 2004.</name><name>C. S. Jensen and C. E. Dyreson (eds), "A Consensus Glossary of Temporal Database Concepts---February 1998 Version," in Temporal Databases: Research and Practice, O. Etzion, S. Jajodia, and S. Sripada (eds.), Springer-Verlag, pp. 367--405, 1998.</name><name>C. S. Jensen and R. T. Snodgrass, "Temporal Specialization and Generalization," IEEE Transactions on Knowledge and Data Engineering, Vol. 6, No. 6, December 1994, pp. 954--974.</name><name>Lab Compliance, www.labcompliance.com/e-signatures/overview.htm, viewed November 14, 2005.</name><name>D. Lomet, R. Barga, M. F. Mokbel, G. Shegalov, R. Wang, and Y. Zhu, "Immortal DB: transaction time support for SQL server," in Proceedings of the International ACM Conference on Management of Data (SIGMOD), pp. 939--941, June 2005.</name><name>D. Mazi&amp;#232;res, M. Kaminsky, M. F. Kaashoek and E. Witchel, "Separating key management from file system security," in Proceedings of the ACM Symposium on Operating Systems Principles, pp. 124--139, December 1999.</name><name>A. Muthitacharoen, R. Morris, T. M. Gil and B. Chen, "Ivy: A Read/Write Peer-to-Peer File System," in Proceedings of USENIX Operating Systems Design and Implementation, 2002.</name><name>Oracle Corporation, "Oracle Database 10g Workspace Manager Overview," Oracle White Paper, May 2005.</name><name>B. Schneier and J. Kelsey, "Secure Audit Logs to Support Computer Forensics," ACM Transactions on Information and System Security 2(2):159--196, May 1999.</name><name>R. T. Snodgrass, S. S. Yao, and C. Collberg, "Tamper Detection in Audit Logs," in Proceedings of the International Conference on Very Large Databases, pp. 504--515, Toronto, Canada, September 2004.</name><name>Q. Zhu and W. W. Hsu, "Fossilized Index: The Linchpin of Trustworthy Non-Alterable Electronic Records," in Proceedings of the ACM International Conference on Management of Data, pp. 395--406, Baltimore, Maryland, June 2005.</name></citation><abstract>Mechanisms now exist that detect tampering of a database, through the use of cryptographically-strong hash functions. This paper addresses the next problem, that of determining who, when, and what, by providing a systematic means of performing forensic analysis after such tampering has been uncovered. We introduce a schematic representation termed a "corruption diagram" that aids in intrusion investigation. We use these diagrams to fully analyze the original proposal, that of a linked sequence of hash values. We examine the various kinds of intrusions that are possible, including retroactive, introactive, backdating, and postdating intrusions. We then introduce successively more sophisticated forensic analysis algorithms: the monochromatic, RGB, and polychromatic algorithms, and characterize the "forensic strength" of these algorithms. We show how forensic analysis can efficiently extract a good deal of information concerning a corruption event.</abstract></paper><paper><title>Dynamic authenticated index structures for outsourced databases</title><author><AuthorName>Feifei Li</AuthorName><institute><InstituteName>Boston University</InstituteName><country></country></institute></author><author><AuthorName>Marios Hadjieleftheriou</AuthorName><institute><InstituteName>AT&amp;T Labs-Research</InstituteName><country></country></institute></author><author><AuthorName>George Kollios</AuthorName><institute><InstituteName>Boston University</InstituteName><country></country></institute></author><author><AuthorName>Leonid Reyzin</AuthorName><institute><InstituteName>Boston University</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation></citation><abstract>In outsourced database (ODB)systems the database owner publishes its data through a number of remote servers, with the goal of enabling clients at the edge of the network to access and query the data more efficiently. As servers might be untrusted or can be compromised, query authentication becomes an essential component of ODB systems. Existing solutions for this problem concentrate mostly on static scenarios and are based on idealistic properties for certain cryptographic primitives. In this work, first we define a variety of essential and practical cost metrics associated with ODB systems. Then, we analytically evaluate a number of different approaches, in search for a solution that best leverages all metrics. Most importantly, we look at solutions that can handle dynamic scenarios, where owners periodically update the data residing at the servers. Finally, we discuss query freshness, a new dimension in data authentication that has not been explored before. A comprehensive experimental evaluation of the proposed and existing approaches is used to validate the analytical models and verify our claims. Our findings exhibit that the proposed solutions improve performance substantially over existing approaches, both for static and dynamic environments.</abstract></paper><paper><title>Redundancy and information leakage in fine-grained access control</title><author><AuthorName>Govind Kabra</AuthorName><institute><InstituteName>Univ. of Illinois, Urbana-Champaign</InstituteName><country></country></institute></author><author><AuthorName>Ravishankar Ramamurthy</AuthorName><institute><InstituteName>Microsoft Research</InstituteName><country></country></institute></author><author><AuthorName>S. Sudarshan</AuthorName><institute><InstituteName>I.I.T. Bombay</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>R. Agrawal, P. Bird, T. Grandison, J. Kiernan, S. Logan, W. Rjaibi: Extending Relational Database Systems to Automatically Enforce Privacy Policies. In ICDE, pages 1013--1022, 2005.</name><name>A. Brodsky, C. Farkas, and S. Jajodia. Secure databases: Constraints, inference channels, and monitoring disclosures. IEEE Trans. on Knowl. and Data Engg., 12(6):900--919, 2000.</name><name>A.K Chandra and P.M Merlin, Optimal Implementation of conjunctive queries in relational databases. STOC, 1977</name><name>K. LeFevre, R. Agrawal, V. Ercegovac, R. Ramakrishnan, Y. Xu and D. DeWitt, Limiting disclosure in Hippocratic databases, In VLDB, 2004</name><name>G. Graefe, W. McKenna, The Volcano Optimizer Generator: Extensibility and Efficient Search, In ICDE, 1993</name><name>G. Graefe, The Cascades Optimization Framework, Data Engg. Bulletin, 1995</name><name>D. Litchfield, Web Application Disassembly with ODBC Error Messages, 2001, http://www.blackhat.com/presen-tations/win-usa-01/Litchfield/BHWin01Litchfield.doc</name><name>A. Motro. An access authorization model for relational databases based on algebraic manipulation of view definitions. In ICDE, pages 339--347, 1989.</name><name>The Virtual Private Database in Oracle9ir2: An Oracle Technical White Paper http://otn.oracle.com/deploy/security/oracle9ir2/pdf/vpd9ir2twp.pdf.</name><name>New Security Features in Sybase Adaptive Server Enterprise. Sybase Technical White Paper, 2003.</name><name>S. Rizvi, A. Mendelzon, S. Sudarshan and P. Roy, Extending query rewriting techniques for fine-grained access control. In SIGMOD, 2004</name><name>A. Rosenthal and E. Sciore. Abstracting and Refining Authorization in SQL. In Secure Data Management (SDM) workshop, In VLDB, 2004.</name><name>M. Stonebraker and E. Wong. Access control in a relational database management system by query modification. In Procs of the ACM Annual Conference, pages 180--186, 1974.</name></citation><abstract>The current SQL standard for access control is coarse grained, in that it grants access to all rows of a table or none. Fine-grained access control, which allows control of access at the granularity of individual rows, and to specific columns within those rows, is required in practically all database applications. There are several models for fine grained access control, but the majority of them follow a view replacement strategy. There are two significant problems with most implementations of the view replacement model, namely (a) the unnecessary overhead of the access control predicates when they are redundant and (b) the potential of information leakage through channels such as user-defined functions, and operations that cause exceptions and error messages. We first propose techniques for redundancy removal. We then define when a query plan is safe with respect to UDFs and other unsafe functions, and propose techniques to generate safe query plans. We have prototyped redundancy removal and safe UDF pushdown on the Microsoft SQL Server query optimizer, and present a preliminary performance study.</abstract></paper><paper><title>Contour map matching for event detection in sensor networks</title><author><AuthorName>Wenwei Xue</AuthorName><institute><InstituteName>Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong</InstituteName><country></country></institute></author><author><AuthorName>Qiong Luo</AuthorName><institute><InstituteName>Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong</InstituteName><country></country></institute></author><author><AuthorName>Lei Chen</AuthorName><institute><InstituteName>Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong</InstituteName><country></country></institute></author><author><AuthorName>Yunhao Liu</AuthorName><institute><InstituteName>Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong</InstituteName><country></country></institute></author><year>2006</year><conference>International Conference on Management of Data</conference><citation><name>Abadi, D., Madden, S., and Lindner, W. REED: Robust, efficient filtering and event detection in sensor networks. VLDB, 2005.</name><name>Agrawal, R., Psaila, G., Wimmers, E., and Zaut, M. Querying shapes of histories. VLDB, 1995.</name><name>Bonnet, P., Gehrke, J., and Seshadri, P. Querying the physical world. IEEE Personal Communications, 7(5), 2000.</name><name>Considine, J., Li, F., Kollios, G., and Byers, J. Approximate aggregation techniques for sensor databases. ICDE, 2004.</name><name>Contour Map. http://en.wikipedia.org/wiki/Contour_map.</name><name>Crossbow Inc. www.xbow.com.</name><name>Deligiannakis, A., Kotidis, Y., and Roussopoulos, N. Compressing historical information in sensor networks. SIGMOD, 2004.</name><name>Deshpande, A., Guestrin, C., and Madden, S. Model-driven data acquisition in sensor networks. VLDB, 2004.</name><name>Dutta, P., Grimmer, M., Arora, A., Bibyk, S., and Culler, D. Design of a wireless sensor network platform for detecting rare, random, and ephemeral events. IPSN, 2005.</name><name>GPCJ. http://www.seisw.com/GPCJ/GPCJ.html.</name><name>Guralnik, V., and Srivastava, J. Event detection from time series data. SIGKDD, 1999.</name><name>Hellerstein, J. M., Hong, W., Madden, S., and Stanek, K. Beyond average: Towards sophisticated sensing with queries. IPSN, 2003.</name><name>Intanagonwiwat, C., Govindan, R., and Estrin, D. Directed diffusion: A scalable and robust communication paradigm for sensor networks. MOBICOM, 2000.</name><name>Intel Lab Data. http://berkeley.intel-research.net/labdata/.</name><name>Kaplan, W. Advanced Calculus. Addison-Wesley, Boston, MA, USA.</name><name>Kotidis, Y. Snapshot queries: Towards data-centric sensor networks. ICDE, 2005.</name><name>Li, S., Lin, Y., Son, S., Stankovic, J., and Wei, Y. Event detection services using data service middleware in distributed sensor networks. Telecommunication Systems, 26(2--4), 2004.</name><name>Madden, S, Franklin, M. J., Hellerstein, J. M., and Hong, W. TAG: A tiny aggregation service for ad-hoc sensor networks. OSDI, 2002.</name><name>Madden, S., Franklin, M. J., Hellerstein, J. M., and Hong, W. The design of an acquisitional query processor for sensor networks. SIGMOD, 2003.</name><name>Manjhi, A., Nath, S., and Gibbons, P. Tributaries and deltas: Efficient and robust aggregation in sensor network streams. SIGMOD, 2005.</name><name>Ni, L. M., Liu, Y., Lau, Y., and Patil, A. LANDMARC: Indoor location sensing using active RFID. Wireless Networks, 10(6), 2004.</name><name>O'Rourke, J. 1998. Computational Geometry in C. Cambridge University Press, New York, NY, USA.</name><name>Papadimitriou, S., Brockwell, A., and Faloutsos, C. Adaptive, hands-off stream mining. VLDB, 2003.</name><name>Rahimi, M., Pon, R., Kaiser, W., Sukhatme, G., Estrin, D., and Srivastava, M. Adaptive sampling for environmental robotics. ICRA, 2004.</name><name>Ruppert, D., Wand, M. P., and Carroll, R. J. Semiparametric Regression. Cambridge University Press, New York, NY, USA.</name><name>Solis, I., and Obraczka, K. Efficient continuous mapping in sensor networks using isolines. Mobiquitous, 2005.</name><name>Szewczyk, R., Mainwaring, A., Polastre, J., Anderson J., and Culler, D. Lessons from a sensor network expedition. EWSN, 2004.</name><name>Wu, H., Salzberg, B., and Zhang, D. Online event-driven subsequence matching over financial data streams. SIGMOD, 2004.</name><name>Yao, Y., and Gehrke, J. Query processing for sensor networks. CIDR, 2003.</name></citation><abstract>Many sensor network applications, such as object tracking and disaster monitoring, require effective techniques for event detection. In this paper, we propose a novel event detection mechanism based on matching the contour maps of in-network sensory data distribution. Our key observation is that events in sensor networks can be abstracted into spatio-temporal patterns of sensory data and that pattern matching can be done efficiently through contour map matching. Therefore, we propose simple SQL extensions to allow users to specify common types of events as patterns in contour maps and study energy-efficient techniques of contour map construction and maintenance for our pattern-based event detection. Our experiments with synthetic workloads derived from a real-world coal mine surveillance application validate the effectiveness and efficiency of our approach.</abstract></paper><paper><title>Constraint chaining: on energy-efficient cont
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -