📄 vldb_1999_elementary.txt
字号:
(2) feasibility of getting interactive answers even
with modest hardware resources.</abstract></paper><paper><title>Database Architecture Optimized for the New Bottleneck: Memory Access.</title><author><AuthorName>Peter A. Boncz</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Stefan Manegold</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Martin L. Kersten</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>PRISMA/DB: A Parallel Main Memory Relational DBMS.</name><name>Monet And Its Geographic Extensions: A Novel Approach to High Performance GIS Processing.</name><name>The Drill Down Benchmark.</name><name>Flattening an Object Algebra to Provide Performance.</name><name>A Decomposition Storage Model.</name><name>The Art of Computer Programming, Volume I: Fundamental Algorithms.
Addison-Wesley 1968</name><name>A Study of Index Structures for Main Memory Database Management Systems.</name><name>Modelling Costs for a MM-DBMS.</name><name>Smarter Memory: Improving Bandwidth for Streamed References.</name><name>AlphaSort: A RISC Machine Sort.</name><name>Cache Conscious Algorithms for Relational Query Processing.</name><name>Join Indices.</name><name>Query Optimization in a Memory-Resident Domain Relational Calculus Database System.</name></citation><abstract>In the past decade, advances in speed of commodity CPUs
have far out-paced advances in memory latency.
Main-memory access is therefore incresasingly a performance
bottleneck for many computer applications, including
database systems.
In this article, we use a simple scan test to show the
severe impact of this bottleneck.
The insights gained are translated into guidelines for
database architecture; in terms of both data structures and
algorithms.
We discuss how vertically fragmented data structures optimize
cache performance on sequential data access.
We then focus on equi-join, typically a random-access
operation, and introduce radix algorithms for partitioned
hash-join.
The performance of these algorithms is quantified using a detailed
analytical model that incorporates memory access cost.
Experiments that validate this model were performed on the Monet
database System.
We obtained exact statistics on events like TLB misses,
L1 and L2 cache misses, by using hardware performance
counters found in modern CPUs.
Using our cost model, we show how the carefully tuned
memory access pattern of our radix algorithms make them
perform well, which is confirmed by experimental results.</abstract></paper><paper><title>The Persistent Cache: Improving OID Indexing in Temporal Object-Oriented Database Systems.</title><author><AuthorName>Kjetil N{\o}rv{\aa}g</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>A Simple Analysis of the LRU Buffer Policy and Its Relationship to Buffer Warm-Up Transient.</name><name>A Performance Evaluation of OID Mapping Techniques.</name><name>The Time Index and the Monotonic B+-tree.</name><name>Access Methods for Multiversion Data.</name><name>Design, Implementation, and Performance of the LHAM Log-Structured History Data Access Method.</name><name>Log-Only Temporal Object Storage.</name><name>An Analytical Study of Object Identifier Indexing.</name><name>Development and Performance Analysis of a Temporal Persistent Object Store POST/C++.</name><name>Principles of Database and Knowledge-Base Systems, Volume I.
Computer Science Press 1988, ISBN 0-7167-8158-1</name><name>Principles of Database and Knowledge-Base Systems, Volume II.
Computer Science Press 1989, ISBN 0-7167-8162-X</name></citation><abstract>In a temporal OODB, an OID index (OIDX) is
needed to map from OID to the physical location
of the object. In a transaction time temporal
OODB, the OIDX should also index the object
versions. In this case, the index entries, which
we call object descriptors (OD), also include the
commit timestamp of the transaction that created
the object version. The OIDX in a non-temporal
OODB only needs to be updated when an object is
created, but in a temporal OODB, the OIDX
has to be updated every time an object is updated.
This has previously been shown to be a potential
bottleneck, and in this paper, we present the
Persistent Cache (PCache), a novel approach which
reduces the index update and lookup costs in temporal OODBs.
We develop a cost model for the PCache,
and use this to show that the use of a
PCache can reduce the average access cost to only
a fraction of the cost when not using the PCache.
Even though the primary context of this paper is
OID indexing in a temporal OODB, the PCache
can also be applied to general secondary indexing,
and can be especially beneficial for applications
where updates are non-clustered.</abstract></paper><paper><title>Cache Conscious Indexing for Decision-Support in Main Memory.</title><author><AuthorName>Jun Rao</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Kenneth A. Ross</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>The Asilomar Report on Database Research.</name><name>The Ubiquitous B-Tree.</name><name>An Algorithm to Construct a Compact B-Tree in Case of Ordered Keys.</name><name>``One Size Fits All'' Database Architectures Do Not Work for DDS.</name><name>Teaching an OLTP Database Kernel Advanced Data Warehousing Techniques.</name><name>Hash Joins and Hash Teams in Microsoft SQL Server.</name><name>Transaction Processing: Concepts and Techniques.</name><name>Crash Recovery Scheme for a Memory-Resident Database System.</name><name>Dal&iacute;: A High Performance Main Memory Storage Manager.</name><name>Two Access Methods Using Compact Binary Trees.</name><name>The Art of Computer Programming, Volume III: Sorting and Searching.
Addison-Wesley 1973, ISBN 0-201-03803-X</name><name>Query Processing in Main Memory Database Management Systems.</name><name>A Study of Index Structures for Main Memory Database Management Systems.</name><name>A Recovery Algorithm for A High-Performance Memory-Resident Database System.</name><name>The Influence of Caches on the Performance of Heaps.</name><name>The Influence of Caches on the Performance of Sorting.</name><name>Multiprocessor Main Memory Transaction Processing.</name><name>An Evaluation of Starburst's Memory Resident Storage Component.</name><name>AlphaSort: A RISC Machine Sort.</name><name>Access Path Selection in a Relational Database Management System.</name><name>Cache Conscious Algorithms for Relational Query Processing.</name><name>Cache Memories.</name><name>Query Optimization in a Memory-Resident Domain Relational Calculus Database System.</name><name>A Data Locality Optimizing Algorithm.</name></citation><abstract>We study indexing techniques for main memory,
including hash indexes, binary search trees,
T-trees, B+-trees, interpolation search, and binary
search on arrays. In a decision-support context, our
primary concerns are the lookup time, and the space
occupied by the index structure.
Our goal is to provide faster lookup times than binary
search by paying attention to reference locality and
cache behavior, with- out using substantial extra space.
We propose a new indexing technique called ``Cache-Sensitive
Search Trees'' (CSS-trees). Our technique stores a directory
structure on top of a sorted array. Nodes in this directory
have size matching the cache-line size of the machine.
We store the directory in an array and do not store
internal-node pointers; child nodes can be found by performing
arithmetic on array o sets.
We compare the algorithms based on their time and space
requirements. We have implemented all of the techniques,
and present a performance study on two popular modern
machines. We demonstrate that with
a small space overhead, we can reduce the cost of binary
search on the array by more than a factor of two. We also
show that our technique dominates B+-trees, T-trees, and
binary search trees in terms of both space and time.
A cache simulation verifies that the gap is due largely
to cache misses.</abstract></paper><paper><title>Comparing Hierarchical Data in External Memory.</title><author><AuthorName>Sudarshan S. Chawathe</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Bounds on the Complexity of the Longest Common Subsequence Problem.</name><name>Representing and Querying Changes in Semistructured Data.</name><name>Meaningful Change Detection in Structured Data.</name><name>Change Detection in Hierarchically Structured Information.</name><name>Efficient Snapshot Differential Algorithms for Data Warehousing.</name><name>A File Comparison Program.</name><name>A Faster Algorithm Computing String Edit Distances.</name><name>An O(ND) Difference Algorithm and Its Variations.</name><name>The Tree-to-Tree Editing Problem.</name><name>RCS - A System for Version Control.</name><name>External Memory Algorithms.</name><name>Bounds for the String Editing Problem.</name><name>Combinatorial Pattern Discovery for Scientific Data: Some Preliminary Results.</name><name>The String-to-String Correction Problem.</name><name>An O(NP) Sequence Comparison Algorithm.</name><name>Structural Matching and Discovery in Document Databases.</name><name>A System for Approximate Tree Matching.</name><name>Identifying Syntactic differences Between Two Programs.</name><name>Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems.</name><name></name></citation><abstract>We present an external-memory algorithm for
computing a minimum-cost edit script between
two rooted, ordered, labeled trees. The I/O, RAM,
and CPU costs of our algorithm are, respectively,
4mn+7m+5n, 6S, and O(MN+(M+N)S1.5),
where M and N are the input tree sizes,
S is the block size, m=M/S, and n=N/S.
This algorithm can make effective use of surplus RAM
capacity to quadratically reduce I/O cost. We extend to
trees the commonly used mapping from sequence comparison
problems to shortest-path problems in edit graphs.</abstract></paper><paper><title>Mining Deviants in a Time Series Database.</title><author><AuthorName>H. V. Jagadish</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Nick Koudas</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>S. Muthukrishnan</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>A Linear Method for Deviation Detection in Large Databases.</name><name>Fast Incremental Maintenance of Approximate Histograms.</name><name>Efficient Mining of Partial Periodic Patterns in Time Series Database.</name><name>Universality of Serial Histograms.</name><name>Balancing Histogram Optimality and Practicality for Query Result Size Estimation.</name><name>Optimal Histograms with Quality Guarantees.</name><name>Algorithms for Mining Distance-Based Outliers in Large Datasets.</name><name>Selectivity Estimation Without the Attribute Value Independence Assumption.</name><name>Improved Histograms for Selectivity Estimation of Range Predicates.</name></citation><abstract>Identifiying outliers is an important data analysis
function. Statisticans have long studied techniques
to identify outliers is a data set in the context
of fitting the data to some model. In the case
of time series data, the situation is more murky.
For instance, the ``typical'' value cound ``drift''
up or down over time, so the extrema may not necessarily
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -