hold in the whole database, and then to verify the results with the rest 
of the database. The algorithms thus produce exact association rules, not 
approximations based on a sample. The approach is, however, probabilistic, 
and in those rare cases where our sampling method does not produce all 
association rules, the missing rules can be found in a second pass. Our 
experiments show that the proposed algorithms can find association rules 
very efficiently in only one database pass.</abstract></paper><paper><title>Constructing Efficient Decision Trees by Using Optimized Numeric Association Rules.</title><author><AuthorName>Takeshi Fukuda</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Yasuhiko Morimoto</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Shinichi Morishita</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Takeshi Tokuyama</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1996</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Polynomial-Time Solutions to Image Segmentation.</name><name>An Interval Classifier for Database Mining Applications.</name><name>Database Mining: A Performance Perspective.</name><name>Fractional Cascading: I. A Data Structuring Technique.</name><name>Computing the Discrepancy.</name><name>Probing Convex Polytopes.</name><name>Mining Optimized Association Rules for Numeric Attributes.</name><name>Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization.</name><name>SONAR: System for Optimized Numeric Association Rules.</name><name>Computers and Intractability: A Guide to the Theory of NP-Completeness.
 W. H. Freeman 1979, ISBN 0-7167-1044-7</name><name>Constructing Optimal Binary Decision Trees is NP-Complete.</name><name>SLIQ: A Fast Scalable Classifier for Data Mining.</name><name>Induction of Decision Trees.</name><name>C4.5: Programs for Machine Learning.</name><name>Inferring Decision Trees Using the Minimum Description Length Principle.</name></citation><abstract>We propose an extension of an entropy-based heuristic of 
Quinlan [Q93] for constructing a decision tree from a 
large database with many numeric attributes.
Quinlan pointed out that his original method 
(as well as other existing methods) may be inefficient if 
any numeric attributes are strongly correlated.
Our approach offers one solution to this problem.
For each pair of numeric attributes with strong correlation,
we compute a two-dimensional association rule with respect to  
these attributes and the objective attribute of the decision tree.
In particular, we consider a family $\mathcal{R}$ of grid-regions
in the plane associated with the pair of attributes.
For each region $R \in \mathcal{R}$, the data can be
split into two classes: data inside $R$ and data outside $R$.
We compute the region $R_{opt} \in \mathcal{R}$
that minimizes the entropy of the splitting,
and add the splitting associated with $R_{opt}$
(for each pair of strongly correlated attributes) to
the set of candidate tests in Quinlan's entropy-based heuristic.

We give efficient algorithms for cases in which $\mathcal{R}$ consists of
(1) x-monotone connected regions, 
(2) based-monotone regions, 
(3) rectangles, and 
(4) rectilinear convex regions.
The algorithm for the first case 
has been implemented as a subsystem of 
SONAR (System for Optimized Numeric Association Rules), developed by the
authors.
Tests show that our approach can create small-sized decision trees.</abstract></paper><paper><title>Reordering Query Execution in Tertiary Memory Databases.</title><author><AuthorName>Sunita Sarawagi</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Michael Stonebraker</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1996</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Broadcast Disks: Data Management for Asymmetric Communications Environments.</name><name>Prefetching from Broadcast Disks.</name><name>Process And Dataflow Control In Distributed Data-Intensive Systems.</name><name>Dynamic Query Optimization in Rdb/VMS.</name><name>Compiling Control into Database Queries for Parallel Execution Management.</name><name>Distributed Databases: Principles and Systems.
 McGraw-Hill Book Company 1984, ISBN 0-07-010829-3</name><name>An Efficient Hybrid Join Algorithm: A DB2 Prototype.</name><name>An Evaluation of Buffer Management Strategies for Relational Database Systems.</name><name>Optimization of Dynamic Query Evaluation Plans.</name><name>Practical Prefetching via Data Compression.</name><name>A Multi-Threaded Architecture for Prefetching in Object Bases.</name><name>Encapsulation of Parallelism in the Volcano Query Processing System.</name><name>Query Evaluation Techniques for Large Databases.</name><name>The Datacycle Architecture for Very High Throughput Database Systems.</name><name>Optimization of Parallel Query Execution Plans in XPRS.</name><name>Energy Efficient Indexing on Air.</name><name>Efficient Assembly of Complex Objects.</name><name>Disk-directed I/O for MIMD Multiprocessors.</name><name>Workload Balance and Page Access Scheduling For Parallel Joins In Shared-Nothing Systems.</name><name>Batch Scheduling in Parallel Database Systems.</name><name>Scheduling of Page-Fetches in Join Operations.</name><name>Single Table Access Using Multiple Indexes: Optimization, Execution, and Concurrency Control Techniques.</name><name>Multiprocessor Join Scheduling.</name><name>Database Principles, Programming, Performance.</name><name>Informed Prefetching and Caching.</name><name>Query Processing in Tertiary Memory Databases.</name><name>On the Multiple-Query Optimization Problem.</name><name>Mariposa: A Wide-Area Distributed Database System.</name><name>The Postgres Next Generation Database Management System.</name><name>Managing IBM Database 2 Buffers to Maximize Performance.</name><name>DB2 Query Parallelism: Staging and Implementation.</name><name>Query Pre-Execution and Batching in Paradise: A Two-Pronged Approach to the Efficient Processing of Queries on Tape-Resident Raster Images.</name></citation><abstract>In the relational data model the order of fetching data does not
affect the correctness of query semantics.
This flexibility is exploited in query optimization by statically
reordering data accesses.
However, once a query is optimized, it is executed in a
fixed order in most systems,
with the result that data requests are made in a fixed order.
Only limited forms of runtime reordering can
be provided by low-level device managers.
More aggressive reordering strategies are essential in scenarios where the
latency of access to data objects varies widely and dynamically, as in
tertiary devices.
This paper presents such a strategy.
Our key innovation is to exploit dynamic reordering to match execution order
to the optimal data fetch order, in all parts of the plan-tree.
To demonstrate the practicality of our approach and the impact of our
optimizations, we report on a prototype implementation based on
Postgres.
Using our system, typical I/O cost for queries on tertiary
memory databases is as much as an order of magnitude smaller than with
conventional query processing techniques.</abstract></paper><paper><title>Query Processing Techniques for Multiversion Access Methods.</title><author><AuthorName>Jochen Van den Bercken</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Bernhard Seeger</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1996</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Temporal Relations in Geographic Information Systems: A Workshop at the University of Maine.</name><name>On Optimal Multiversion Access Structures.</name><name>Hashing Methods and Relational Algebra Operations.</name><name>Key-Sequence Data Sets on Indelible Storage.</name><name>The BANG File: A New Kind of Grid File.</name><name>R-Trees: A Dynamic Index Structure for Spatial Searching.</name><name>The Art of Computer Programming, Volume III: Sorting and Searching.
 Addison-Wesley 1973, ISBN 0-201-03803-X</name><name>Segment Indexes: Dynamic Indexing Techniques for Multi-Dimensional Interval Data.</name><name>Access Methods for Bi-Temporal Databases.</name><name>Access Methods for Multiversion Data.</name><name>The Performance of a Multiversion Access Method.</name><name>LoT: Dynamic Declustering of TSB-Tree Nodes for Parallel Access to Temporal Data.</name><name>A Taxonomy of Time in Databases.</name><name>Techniques for Design and Implementation of Efficient Spatial Access Methods.</name><name>The R+-Tree: A Dynamic Index for Multi-Dimensional Objects.</name><name>The Design of the POSTGRES Storage System.</name></citation><abstract>Multiversion access methods have emerged in the literature primarily to
support queries on a transaction-time database where records are never
physically deleted. For a popular class of efficient methods (including the
multiversion B-tree), data records and index entries are occasionally
duplicated to separate data according to time. In this paper, we present
techniques for improving query processing in multiversion access methods. In
particular, we address the problem of avoiding duplicates in the response sets.
We first discuss traditional approaches that eliminate duplicates using hashing
and sorting. Next, we propose two new algorithms for avoiding duplicates
without using additional data structures. One performs queries in a
depth-first order starting from a root, whereas the other exploits links
between data pages. These methods are discussed in full detail and their main
properties are identified. Preliminary performance results confirm the
advantages of these methods over traditional ones in terms of
CPU-time, disk accesses and storage.</abstract></paper><paper><title>Coalescing in Temporal Databases.</title><author><AuthorName>Michael H. Böhlen</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Richard T. Snodgrass</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Michael D. Soo</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1996</year><conference>International Conference on Very Large Data Bases</conference><citation><name>The Temporal Deductive Database System ChronoLog.
Ph.D. thesis,  Departement Informatik, ETH Zürich 1994</name><name>Duplicate Record Elimination in Large Data Files.</name><name>SQL for Smarties: Advanced SQL Programming.</name><name>Implementation Techniques for Main Memory Database Systems.</name><name>An Evaluation of Non-Equijoin Algorithms.</name><name>A Homogeneous Relational Model and Query Languages for Temporal Databases.</name><name>Sort versus Hash Revisited.</name><name>Query Evaluation Techniques for Large Databases.</name><name>Practical Predicate Placement.</name><name>Predicate Migration: Optimizing Queries with Expensive Predicates.</name><name>A Consensus Glossary of Temporal Database Concepts.</name><name>Aggregates.</name><name>Extending relational algebra to manipulate temporal data.</name><name>Query Processing for Temporal Databases.</name><name>Temporal Query Processing and Optimization in Multiprocessor Database Machines.</name><name>Querying Historical Data in IBM DB2 C/S DBMS Using Recursive SQL.</name><name>An Algebraic Language for Query and Update of Temporal Databases.
Ph.D. thesis,  University of North Carolina, Computer Science Department 1988</name><name>Evaluation of Relational Algebras Incorporating the Time Dimension in Databases.</name><name>Understanding the New SQL: A Complete Guide.</name><name>Database Principles, Programming, Performance.</name><name>Exploiting Uniqueness in Query Optimization.</name><name>An Algebra for TSQL2.</name><name>The Temporal Query Language TQuel.</name><name>The TSQL2 Temporal Query Language.
 Kluwer 1995, ISBN 0-7923-9614-6</name><name>Temporal Reasoning in Deductive Databases.
Ph.D. thesis,  Imperial College of Science, University of London 1991</name><name>HQL - A Historical Query Language.</name><name>Efficient Evaluation of the Valid-Time Natural Join.</name><name>Adding time dimension to relational model and extending relational algebra.</name><name>Temporal Databases: Theory, Design, and Implementation.
 Benjamin/Cummings 1993, ISBN 0-8053-2413-5</name><name>Principles of Database and Knowledge-Base Systems, Volume I.
 Computer Science Press 1988, ISBN 0-7167-8158-1</name><name>Performing Group-By before Join.</name></citation><abstract>Coalescing is a unary operator applicable to temporal databases; it is
similar to duplicate elimination in conventional databases.  Tuples in
a temporal relation that agree on the explicit attribute values and
that have adjacent or overlapping time periods are candidates for
coalescing.  Uncoalesced relations can arise in many ways, e.g., via a
projection or union operator, or by not enforcing coalescing on update