📄 sigmod_1998_elementary.txt

This paper presents a new query unnesting algorithm that generalizes many unnesting techniques proposed recently in the literature. Our system is capable of removing any form of query nesting using a very simple and efficient algorithm. The simplicity of the system is due to the use of the monoid comprehension calculus as an intermediate form for OODB queries. The monoid comprehension calculus treats operations over multiple collection types, aggregates, and quantifiers in a similar way, resulting in a uniform way of unnesting queries, regardless of their type of nesting.</abstract></paper><paper><title>Changing the rules: transformations for rule-based optimizers</title><author><AuthorName>Mitch Cherniack</AuthorName><institute><InstituteName>Department of Computer Science, Brown University, Providence, RI</InstituteName><country></country></institute></author><author><AuthorName>Stan Zdonik</AuthorName><institute><InstituteName>Department of Computer Science, Brown University, Providence, RI</InstituteName><country></country></institute></author><year>1998</year><conference>International Conference on Management of Data</conference><citation><name>Ludger Becker , Ralf Hartmut G&amp;#252;ting, Rule-based optimization and query processing in an extensible geometric database system, ACM Transactions on Database Systems (TODS), v.17 n.2, p.247-303, June 1992</name><name>M. J. Carey , David J. DeWitt , G. Graefe , D. M. Haight , J. E. Richardson , D. T. Schuh , E. J. Shekita , S. L. Vandenberg, The EXODUS extensible DBMS project: an overview, Readings in object-oriented database systems, Morgan Kaufmann Publishers Inc., San Francisco, CA, 1989</name><name>M. Cherniack. Translating queries into combinators. September 1996.</name><name>Mitchell Frederic Cherniack , Stan Zdonik, Building query optimizers with combinators, 1999</name><name>Mitch Cherniack , Stanley B. 
Zdonik, Rule languages and internal algebras for rule-based optimizers, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.401-412, June 04-06, 1996, Montreal, Quebec, Canada</name><name>B&amp;#233;atrice Finance , Georges Gardarin, A Rule-Based Query Rewriter in an Extensible DBMS, Proceedings of the Seventh International Conference on Data Engineering, p.248-256, April 08-12, 1991</name><name>B&amp;#233;atrice Finance , Georges Gardarin, A rule-based query optimizer with multiple search strategies, Data &amp; Knowledge Engineering, v.13 n.1, p.1-29, Aug. 1994</name><name>G. Graefe. The Cascades framework for query optimization. Data Engineering Bulletin, 18(3): 19-29, September 1995.</name><name>Goetz Graefe , William J. McKenna, The Volcano Optimizer Generator: Extensibility and Efficient Search, Proceedings of the Ninth International Conference on Data Engineering, p.209-218, April 19-23, 1993</name><name>J. Guttag, J. Horning, S. Garland, K. Jones, A. Modet, and J. Wing. Larch: Languages and Tools for Formal Specifications. Springer-Verlag, 1992.</name><name>Won Kim, On optimizing an SQL-like nested query, ACM Transactions on Database Systems (TODS), v.7 n.3, p.443-469, Sept. 1982</name><name>J.-S. Lee, K.-E. Kim, and M. Cherniack. A COKO compiler. Available at http://www.cs.brown.edu/software/cokokola/coko.tar.Z, 1996.</name><name>Gail Mitchell , Umeshwar Dayal , Stanley B. Zdonik, Control of an Extensible Query Optimizer: A Planning-Based Approach, Proceedings of the 19th International Conference on Very Large Data Bases, p.517-528, August 24-27, 1993</name><name>I. S. Mumick , S. J. Finkelstein , Hamid Pirahesh , Raghu Ramakrishnan, Magic is relevant, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.247-258, May 23-26, 1990, Atlantic City, New Jersey, United States</name><name>Hamid Pirahesh , Joseph M. 
Hellerstein , Waqar Hasan, Extensible/rule based query rewrite optimization in Starburst, Proceedings of the 1992 ACM SIGMOD international conference on Management of data, p.39-48, June 02-05, 1992, San Diego, California, United States</name><name>Raghu Ramakrishnan, Database management systems, McGraw-Hill, Inc., New York, NY, 1997</name><name>Edward Sciore , John Sieg, Jr., A Modular Query Optimizer Generator, Proceedings of the Sixth International Conference on Data Engineering, p.146-153, February 05-09, 1990</name><name>Praveen Seshadri , Joseph M. Hellerstein , Hamid Pirahesh , T. Y. Cliff Leung , Raghu Ramakrishnan , Divesh Srivastava , Peter J. Stuckey , S. Sudarshan, Cost-based optimization for magic: algebra and implementation, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.435-446, June 04-06, 1996, Montreal, Quebec, Canada</name></citation><abstract>Rule-based optimizers are extensible because they consist of modifiable sets of rules. For modification to be straightforward, rules must be easily reasoned about (i.e., understood and verified). At the same time, rules must be expressive and efficient (to fire) for rule-based optimizers to be practical. Production-style rules (as in [15]) are expressed with code and are hard to reason about. Pure rewrite rules (as in [1]) lack code, but cannot atomically express complex transformations (e.g., normalizations). Some systems allow rules to be grouped, but sacrifice efficiency by providing limited control over their firing. Therefore, none of these approaches succeeds in making rules expressive, efficient and understandable.
We propose a language (COKO) for expressing an alternative form of input to a rule-based optimizer. A COKO transformation consists of a set of declarative (KOLA) rewrite rules and a (firing) algorithm that specifies their firing. It is straightforward to reason about COKO transformations because all query modification is expressed with declarative rewrite rules. Firing is specified algorithmically with an expressive language that provides direct control over how query representations are traversed, and under what conditions rules are fired. Therefore, COKO achieves a delicate balance of understandability, efficiency and expressivity.</abstract></paper><paper><title>CURE: an efficient clustering algorithm for large databases</title><author><AuthorName>Sudipto Guha</AuthorName><institute><InstituteName>Stanford University, Stanford, CA</InstituteName><country></country></institute></author><author><AuthorName>Rajeev Rastogi</AuthorName><institute><InstituteName>Bell Laboratories, Murray Hill, NJ</InstituteName><country></country></institute></author><author><AuthorName>Kyuseok Shim</AuthorName><institute><InstituteName>Bell Laboratories, Murray Hill, NJ</InstituteName><country></country></institute></author><year>1998</year><conference>International Conference on Management of Data</conference><citation><name>Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States</name><name>Thomas T. Cormen , Charles E. Leiserson , Ronald L. Rivest, Introduction to algorithms, MIT Press, Cambridge, MA, 1990</name><name>Martin Ester, Hans-Peter Kriegel, Jorg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial database with noise. 
In Int'l Conference on Knowledge Discovery in Databases and Data Mining (KDD-96), Portland, Oregon, August 1996.</name><name>Martin Ester, Hans-Peter Kriegel, and Xiaowei Xu. A database interface for clustering in large spatial databases. In Int'l Conference on Knowledge Discovery in Databases and Data Mining (KDD-95), Montreal, Canada, August 1995.</name><name>Sudipto Guha, R. Rastogi, and K. Shim. CURE: A clustering algorithm for large databases. Technical report, Bell Laboratories, Murray Hill, 1997.</name><name>Anil K. Jain , Richard C. Dubes, Algorithms for clustering data, Prentice-Hall, Inc., Upper Saddle River, NJ, 1988</name><name>Rajeev Motwani , Prabhakar Raghavan, Randomized algorithms, Cambridge University Press, New York, NY, 1995</name><name>Raymond T. Ng , Jiawei Han, Efficient and Effective Clustering Methods for Spatial Data Mining, Proceedings of the 20th International Conference on Very Large Data Bases, p.144-155, September 12-15, 1994</name><name>Clark F. Olson, Parallel Algorithms for Hierarchical Clustering, University of California at Berkeley, Berkeley, CA, 1994</name><name>Hanan Samet, The design and analysis of spatial data structures, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1990</name><name>Hanan Samet, The design and analysis of spatial data structures, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1990</name><name>Timos K. Sellis , Nick Roussopoulos , Christos Faloutsos, The R+-Tree: A Dynamic Index for Multi-Dimensional Objects, Proceedings of the 13th International Conference on Very Large Data Bases, p.507-518, September 01-04, 1987</name><name>Jeffrey S. 
Vitter, Random sampling with a reservoir, ACM Transactions on Mathematical Software (TOMS), v.11 n.1, p.37-57, March 1985</name><name>Tian Zhang , Raghu Ramakrishnan , Miron Livny, BIRCH: an efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.103-114, June 04-06, 1996, Montreal, Quebec, Canada</name></citation><abstract>Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new clustering algorithm called CURE that is more robust to outliers, and identifies clusters having non-spherical shapes and wide variances in size. CURE achieves this by representing each cluster by a certain fixed number of points that are generated by selecting well scattered points from the cluster and then shrinking them toward the center of the cluster by a specified fraction. Having more than one representative point per cluster allows CURE to adjust well to the geometry of non-spherical shapes and the shrinking helps to dampen the effects of outliers. To handle large databases, CURE employs a combination of random sampling and partitioning. A random sample drawn from the data set is first partitioned and each partition is partially clustered. The partial clusters are then clustered in a second pass to yield the desired clusters. Our experimental results confirm that the quality of clusters produced by CURE is much better than those found by existing algorithms. 
Furthermore, they demonstrate that random sampling and partitioning enable CURE to not only outperform existing algorithms but also to scale well for large databases without sacrificing clustering quality.</abstract></paper><paper><title>Efficiently mining long patterns from databases</title><author><AuthorName>Roberto J. Bayardo, Jr.</AuthorName><institute><InstituteName>IBM Almaden Research Center</InstituteName><country></country></institute></author><year>1998</year><conference>International Conference on Management of Data</conference><citation><name>Rakesh Agrawal , Tomasz Imieli&amp;#324;ski , Arun Swami, Mining association rules between sets of items in large databases, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.207-216, May 25-28, 1993, Washington, D.C., United States</name><name>Rakesh Agrawal , Heikki Mannila , Ramakrishnan Srikant , Hannu Toivonen , A. Inkeri Verkamo, Fast discovery of association rules, Advances in knowledge discovery and data mining, American Association for Artificial Intelligence, Menlo Park, CA, 1996</name><name>Agrawal, R., and Srikant, R. 1994. Fast Algorithms for Mining Association Rules. IBM Research Report RJ9839, June 1994, IBM Almaden Research Center, San Jose, CA.</name><name>Rakesh Agrawal , Ramakrishnan Srikant, Mining Sequential Patterns, Proceedings of the Eleventh International Conference on Data Engineering, p.3-14, March 06-10, 1995</name><name>Bayardo, R. J. 1997. Brute-Force Mining of High-Confidence Classification Rules. In Proc. of the Third Int'l Conf. on Knowledge Discovery and Data Mining, 123-126.</name><name>Sergey Brin , Rajeev Motwani , Jeffrey D. 
Ullman , Shalom Tsur, Dynamic itemset counting and implication rules for market basket data, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.255-264, May 11-15, 1997, Tucson, Arizona, United States</name><name>Dimitrios Gunopulos , Heikki Mannila , Sanjeev Saluja, Discovering All Most Specific Sentences by Randomized Algorithms, Proceedings of the 6th International Conference on Database Theory, p.215-229, January 08-10, 1997</name><name>Dao-I Lin , Zvi M. Kedem, Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set, Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology, p.105-119, March 23-27, 1998</name><name>Jong Soo Park , Ming-Syan Chen , Philip S. Yu, An effective hash-based algorithm for mining association rules, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, p.175-186, May 22-25, 1995, San Jose, California, United States</name><name>Rymon, R. 1992. Search through Systematic Set Enumeration. In Proc. of Third Int'l Conf. on Principles of Knowledge Representation and Reasoning, 539-550.</name><name>Ashoka Savasere , Edward Omiecinski , Shamkant B. Navathe, An Efficient Algorithm for Mining Association Rules in Large Databases, Proceedings of the 21th International Conference on Very Large Data Bases, p.432-444, September 11-15, 1995</name><name>Slagle, J. R.; Chang, C.-L.; and Lee, R. C. T. 1970. A New Algorithm for Generating Prime Implicants. IEEE Trans. on Computers, C-19(4):304-310.</name><name>P. Smyth , R. M. 
Goodman, An Information Theoretic Approach to Rule Induction from Databases, IEEE Transactions on Knowledge and Data Engineering, v.4 n.4, p.301-316, August 1992</name><name>Ramakrishnan Srikant , Rakesh Agrawal, Mining Sequential Patterns: Generalizations and Performance Improvements, Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology, p.3-17, March 25-29, 1996</name><name>Srikant, R.; Vu, Q.; and Agrawal, R. 1997. Mining Association Rules with Item Constraints. In Proc. of the Third Int'l Conf. on Knowledge Discovery in Databases and Data Mining, 67-73.</name><name>Zaki, M. J.; Parthasarathy, S.; Ogihara, M.; and Li, W. 1997. New Algorithms for Fast Discovery of Association Rules. In Proc. of the Third Int'l Conf. on Knowledge Discovery in Databases and Data Mining, 283-286.</name></citation><abstract>We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. In comparison, previous algorithms based on Apriori scale exponentially with longest pattern length. 
Experiments on real data show that when the patterns are long, our algorithm is more efficient by an order of magnitude or more.</abstract></paper><paper><title>Automatic subspace clustering of high dimensional data for data mining applications</title><author><AuthorName>Rakesh Agrawal</AuthorName><institute><InstituteName>IBM Almaden Research Center, 650 Harry Road, San Jose, CA</InstituteName><country></country></institute></author><author><AuthorName>Johannes Gehrke</AuthorName><institute><InstituteName>IBM Almaden Research Center, 650 Harry Road, San Jose, CA</InstituteName><country></country></institute></author><author><AuthorName>Dimitrios Gunopulos</AuthorName><institute><InstituteName>IBM Almaden Research Center, 650 Harry Road, San Jose, CA</InstituteName><country></country></institute></author><author><AuthorName>Prabhakar Raghavan</AuthorName><institute><InstituteName>IBM Almaden Research Center, 650 Harry Road, San Jose, CA</InstituteName><country></country></institute></author><year>1998</year><conference>International Conference on Management of Data</conference><citation><name>Rakesh Agrawal , Heikki Mannila , Ramakrishnan Srikant , Hannu Toivonen , A. Inkeri Verkamo, Fast discovery of association rules, Advances in knowledge discovery and data mining, American Association for Artificial Intelligence, Menlo Park, CA, 1996</name><name>A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.</name><name>P. Arabie and L. J. Hubert. An overview of combinatorial data analysis. In P. Arabie, L. Hubert, and G. De Soete, editors, Clustering and Classification, pages 5-63. World Scientific Pub., New Jersey, 1996.</name><name>Arbor Software Corporation. Application Manager User's Guide, Essbase Version 4.0 edition.</name><name>Roberto J. 
Bayardo, Jr., Efficiently mining long patterns from databases, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.85-93, June 01-04, 1998, Seattle, Washington, United States</name><name>Stefan Berchtold , Christian B&amp;#246;hm , Daniel A. Keim , Hans-Peter Kriegel, A cost model for nearest neighbor search in high-dimensional data space, Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.78-86, May 11-15, 1997, Tucson, Arizona, United States</name><name>M. Berger and I. Rigoutsos. An algorithm for point clustering and grid generation. IEEE Transactions on Systems, Man and Cybernetics, 21(5):1278-86, 1991.</name><name>Sergey Brin , Rajeev Motwani , Jeffrey D. Ullman , Shalom Tsur, Dynamic itemset counting and implication rules for market basket data, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.255-264, May 11-15, 1997, Tucson, Arizona, United States</name><name>Peter Cheeseman , John Stutz, Bayesian classification (AutoClass): theory and results, Advances in knowledge discovery and data mining, American Association for Artificial Intelligence, Menlo Park, CA, 1996</name><name>R. Chhikara and D. Register. A numerical classification method for partitioning of a large multidimensional mixed data set. Technometrics, 21:531-537, 1979.</name><name>R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. John Wiley and Sons, 1973.</name><name>R. J. Earle. Method and apparatus for storing and retrieving multi-dimensional data in computer memory. U.S. Patent No. 5359724, October 1994.</name><name>M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise, in Proc. 
of the 2nd Int'l Conference on Knowledge Discovery in Databases and Data Mining, Portland, Oregon, August 1996.</name><name>Martin Ester , Hans-Peter Kriegel , J&amp;#246;rg Sander, Knowledge Discovery in Spatial Databases, Proceedings of the 23rd Annual German Conference on Artificial Intelligence: Advances in Artificial Intelligence, p.61-74, September 13-15, 1999</name><name>Usama M. Fayyad , Gregory Piatetsky-Shapiro , Padhraic Smyth , Ramasamy Uthurusamy, Advances in knowledge discovery and data mining, American Association for Artificial Intelligence, Menlo Park, CA, 1996</name><name>Uriel Feige, A threshold of ln n for approximating set cover (preliminary version), Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.314-318, May 22-24, 1996, Philadelphia, Pennsylvania, United States</name><name>D. S. Franzblau, Performance guarantees on a sweep-line heuristic for covering rectilinear polygons with rectangles, SIAM Journal on Discrete Mathematics, v.2 n.3, p.307-321, August 1989</name><name>J. Friedman. Optimizing a noisy function of many variables with application to data mining. In UW/MSR Summer Research Institute in Data Mining, July 1997.</name><name>Keinosuke Fukunaga, Introduction to statistical pattern recognition (2nd ed.), Academic Press Professional, Inc., San Diego, CA, 1990</name><name>Dimitrios Gunopulos , Heikki Mannila , Roni Khardon , Hannu Toivonen, Data mining, hypergraph transversals, and machine learning (extended abstract), Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.209-216, May 11-15, 1997, Tucson, Arizona, United States</name><name>Ching-Tien Ho , Rakesh Agrawal , Nimrod Megiddo , Ramakrishnan Srikant, Range queries in OLAP data cubes, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.73-88, May 11-15, 1997, Tucson, Arizona, United States</name><name>S. J. Hong. 
MINI: A heuristic algorithm for two-level logic minimization. In R. Newton, editor, Selected Papers on Logic Synthesis for Integrated Circuit Design. IEEE Press, 1987.</name><name>International Business Machines. IBM Intelligent Miner User's Guide, Version 1 Release 1, SH12-6213-00 edition, July 1996.</name><name>Anil K. Jain , Richard C. Dubes, Algorithms for clustering data, Prentice-Hall, Inc., Upper Saddle River, NJ, 1988</name><name>L. Kaufman and P. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley and Sons, 1990.</name><name>Dao-I Lin , Zvi M. Kedem, Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set, Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology, p.105-119, March 23-27, 1998</name><name>L. Lovasz. On the ratio of the optimal integral and fractional covers. Discrete Mathematics, 13:383-390, 1975.</name><name>Carsten Lund , Mihalis Yannakakis, On the hardness of approximating minimization problems, Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, p.286-293, May 16-18, 1993, San Diego, California, United States</name><name>W. Masek. Some NP-complete set covering problems. M.S. Thesis, MIT, 1978.</name><name>Manish Mehta , Rakesh Agrawal , Jorma Rissanen, SLIQ: A Fast Scalable Classifier for Data Mining, Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology, p.18-32, March 25-29, 1996</name><name>R. S. Michalski and R. E. Stepp. Learning from observation: Conceptual clustering. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach, volume I, pages 331-363. Morgan Kaufmann, 1983.</name><name>R. J. Miller , Y. 
Yang, Association rules over interval data, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.452-461, May 11-15, 1997, Tucson, Arizona, United States</name><name>Raymond T. Ng , Jiawei Han, Efficient and Effective Clustering Methods for Spatial Data Mining, Proceedings of the 20th International Conference on Very Large Data Bases, p.144-155, September 12-15, 1994</name><name>R. A. Reckhow , J. Culberson, Covering a simple orthogonal polygon with a minimum number of orthogonally convex polygons, Proceedings of the third annual symposium on Computational geometry, p.268-277, June 08-10, 1987, Waterloo, Ontario, Canada</name><name>Jorma Rissanen, Stochastic Complexity in Statistical Inquiry Theory, World Scientific Publishing Co., Inc., River Edge, NJ, 1989</name><name>P. Schroeter and J. Bigun. Hierarchical image segmentation by multi-dimensional clustering and orientation-adaptive boundary refinement. Pattern Recognition, 25(5):695-709, May 1995.</name><name>John C. Shafer , Rakesh Agrawal , Manish Mehta, SPRINT: A Scalable Parallel Classifier for Data Mining, Proceedings of the 22th International Conference on Very Large Data Bases, p.544-555, September 03-06, 1996</name><name>A. Shoshani. Personal communication. 1997.</name><name>P. Sneath and R. Sokal. Numerical Taxonomy. Freeman, 1973.</name><name>Ramakrishnan Srikant , Rakesh Agrawal, Mining quantitative association rules in large relational tables, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.1-12, June 04-06, 1996, Montreal, Quebec, Canada</name><name>Hannu Toivonen, Sampling Large Databases for Association Rules, Proceedings of the 22th International Conference on Very Large Data Bases, p.134-145, September 03-06, 1996</name><name>S. Wharton. A generalized histogram clustering for multidimensional image data. 
Pattern Recognition, 16(2):193-199, 1983.</name><name>Mohamed Za&amp;#239;t , Hammou Messatfa, A comparative study of clustering methods, Future Generation Computer Systems, v.13 n.2-3, p.149-159, Nov. 1997</name><name>D Zhang , A Bowyer, CSG set-theoretic solid modelling and NC machining of blend surfaces, Proceedings of the second annual symposium on Computational geometry, p.236-245, June 02-04, 1986, Yorktown Heights, New York, United States</name><name>Tian Zhang , Raghu Ramakrishnan , Miron Livny, BIRCH: an efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.103-114, June 04-06, 1996, Montreal, Quebec, Canada</name></citation><abstract>Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering algorithm that satisfies each of these requirements. CLIQUE identifies dense clusters in subspaces of maximum dimensionality. It generates cluster descriptions in the form of DNF expressions that are minimized for ease of comprehension. It produces identical results irrespective of the order in which input records are presented and does not presume any specific mathematical form for data distribution. Through experiments, we show that CLIQUE efficiently finds accurate clusters in large high dimensional datasets.</abstract></paper><paper><title>Efficient mid-query re-optimization of sub-optimal query execution plans</title><author><AuthorName>Navin Kabra</AuthorName><institute><InstituteName>Computer Sciences Department, University of Wisconsin, Madison</InstituteName><country></country></institute></author><author><AuthorName>David J. 
DeWitt</AuthorName><institute><InstituteName>Computer Sciences Department, University of Wisconsin, Madison</InstituteName><country></country></institute></author><year>1998</year><conference>International Conference on Management of Data</conference><citation><name>Laurent Amsaleg , Michael J. Franklin , Anthony Tomasic , Tolga Urhan, Scrambling query plans to cope with unexpected delays, Proceedings of the fourth international conference on Parallel and distributed information systems, p.208-219, December 18-20, 1996, Miami Beach, Florida, United States</name><name>Gennady Antoshenkov, Dynamic Query Optimization in Rdb/VMS, Proceedings of the Ninth International Conference on Data Engineering, p.538-547, April 19-23, 1993</name><name>Gennady Antoshenkov, Dynamic Optimization of Index Scans Restricted by Booleans, Proceedings of the Twelfth International Conference on Data Engineering, p.430-440, February 26-March 01, 1996</name><name>Ming-Syan Chen , Ming-Ling Lo , Philip S. Yu , Honesty C. Young, Using Segmented Right-Deep Trees for the Execution of Pipelined Hash Joins, Proceedings of the 18th International Conference on Very Large Data Bases, p.15-26, August 23-27, 1992</name><name>DERR, M. A., MORISHITA, S., AND PHIPPS, G. "Adaptive Query Optimization in a Deductive Database System". In Proceedings of the Second International Conference on Information and Knowledge Management (Washington D. C., USA, 1993).</name><name>Philippe Flajolet , G. Nigel Martin, Probabilistic counting algorithms for data base applications, Journal of Computer and System Sciences, v.31 n.2, p.182-209, Sept. 1985</name><name>Richard L. Cole , Goetz Graefe, Optimization of dynamic query evaluation plans, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, p.150-160, May 24-27, 1994, Minneapolis, Minnesota, United States</name><name>G. Graefe , K. 
Ward, Dynamic query evaluation plans, Proceedings of the 1989 ACM SIGMOD international conference on Management of data, p.358-366, June 1989, Portland, Oregon, United States</name><name>Yannis E. Ioannidis , Stavros Christodoulakis, On the propagation of errors in the size of join results, Proceedings of the 1991 ACM SIGMOD international conference on Management of data, p.268-277, May 29-31, 1991, Denver, Colorado, United States</name><name>Yannis E. Ioannidis , Raymond T. Ng , Kyuseok Shim , Timos K. Sellis, Parametric Query Optimization, Proceedings of the 18th International Conference on Very Large Data Bases, p.103-114, August 23-27, 1992</name><name>Yannis E. Ioannidis , Viswanath Poosala, Balancing histogram optimality and practicality for query result size estimation, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, p.233-244, May 22-25, 1995, San Jose, California, United States</name><name>Navin Kabra , David J. Dewitt, Query optimization for object-relational database systems, 1999</name><name>Navin Kabra , David J. DeWitt, OPT++ : an object-oriented implementation for extensible database query optimization, The VLDB Journal &amp;mdash; The International Journal on Very Large Data Bases, v.8 n.1, p.55-78, April 1999</name><name>Manish Mehta , David J. DeWitt, Dynamic Memory Allocation for Multiple-Query Workloads, Proceedings of the 19th International Conference on Very Large Data Bases, p.354-367, August 24-27, 1993</name><name>NAG, B., AND DEWITT, D.J. "Memory Allocation Strategies for Complex Decision Support Queries". Submitted for publication.</name><name>Kiyoshi Ono , Guy M. 
Lohman, Measuring the Complexity of Join Enumeration in Query Optimization, Proceedings of the 16th International Conference on Very Large Data Bases, p.314-325, August 13-16, 1990</name><name>Jignesh Patel , JieBing Yu , Navin Kabra , Kristin Tufte , Biswadeep Nag , Josef Burger , Nancy Hall , Karthikeyan Ramasamy , Roger Lueder , Curt Ellmann , Jim Kupsch , Shelly Guo , Johan Larson , David De Witt , Jeffrey Naughton, Building a scaleable geo-spatial DBMS: technology, implementation, and evaluation, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.336-347, May 11-15, 1997, Tucson, Arizona, United States</name><name>POOSALA, V. "Zipf's Law". Tech. rep., University of Wisconsin, Madison, 1995.</name><name>POOSALA, V., AND IOANNIDIS, Y. "Histogram-Based Solutions to Diverse Database Estimation Problems". In Data Engineering Bulletin (1995), vol. 18(3), pp. 10-18.</name><name>Viswanath Poosala , Peter J. Haas , Yannis E. Ioannidis , Eugene J. Shekita, Improved histograms for selectivity estimation of range predicates, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.294-305, June 04-06, 1996, Montreal, Quebec, Canada</name><name>RAAB, F. "TPC Benchmark D - Standard Specification, Revision 1.0". Transaction Processing Performance Council, May 1995.</name><name>P. Griffiths Selinger , M. M. Astrahan , D. D. Chamberlin , R. A. Lorie , T. G. Price, Access path selection in a relational database management system, Proceedings of the 1979 ACM SIGMOD international conference on Management of data, May 30-June 01, 1979, Boston, Massachusetts</name><name>STONEBRAKER, M., ANTON, J., AND HIROHAMA, M. "Extendability in POSTGRES". In Data Engineering Bulletin (1987), vol. 10(2), pp. 16-23.</name><name>Jeffrey S. 
Vitter, Random sampling with a reservoir, ACM Transactions on Mathematical Software (TOMS), v.11 n.1, p.37-57, March 1985</name><name>Eugene Wong , Karel Youssefi, Decomposition&amp;mdash;a strategy for query processing, ACM Transactions on Database Systems (TODS), v.1 n.3, p.223-241, Sept. 1976</name><name>Philip S. Yu , Douglas W. Cornell, Buffer management based on return on consumption in a multi-query environment, The VLDB Journal &amp;mdash; The International Journal on Very Large Data Bases, v.2 n.1, p.1-38, January 1993</name><name>ZIPF, G.K. "Human Behavior and the Principle of Least Resistance". Addison-Wesley, Reading, MA, 1949.</name></citation><abstract>For a number of reasons, even the best query optimizers can very often produce sub-optimal query execution plans, leading to a significant degradation of performance. This is especially true in databases used for complex decision support queries and/or object-relational databases. In this paper, we describe an algorithm that detects sub-optimality of a query execution plan during query execution and attempts to correct the problem. The basic idea is to collect statistics at key points during the execution of a complex query. These statistics are then used to optimize the execution of the query, either by improving the resource allocation for that query, or by changing the execution plan for the remainder of the query. To ensure that this does not significantly slow down the normal execution of a query, the Query Optimizer carefully chooses what statistics to collect, when to collect them, and the circumstances under which to re-optimize the query. 
We describe an implementation of this algorithm in the Paradise Database System, and we report on performance studies, which indicate that this can result in significant improvements in the performance of complex queries.</abstract></paper><paper><title>Interaction of query evaluation and buffer management for information retrieval</title><author><AuthorName>Bj&amp;#246;rn T. J&amp;#243;nsson</AuthorName><institute><InstituteName>University of Maryland</InstituteName><country></country></institute></author><author><AuthorName>Michael J. Franklin</AuthorName><institute><InstituteName>University of Maryland</InstituteName><country></country></institute></author><author><AuthorName>Divesh Srivastava</AuthorName><institute><InstituteName>AT&amp;T Labs-Research</InstituteName><country></country></institute></author><year>1998</year><conference>International Conference on Management of Data</conference><citation><name>Rafael Alonso ,