📄 vldb_1999_elementary.txt

Fetched using lwp::get
data delivered in context, is now the key weapon in gaining and
retaining customers. The data management challenges in an
environment of massively growing data volumes and complexity
introduced by distributed processing are outlined.
A framework and methodology for the management of information
is presented and the term Context Data is introduced.</abstract></paper><paper><title>Finding Intensional Knowledge of Distance-Based Outliers.</title><author><AuthorName>Edwin M. Knorr</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Raymond T. Ng</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications.</name><name>An Interval Classifier for Database Mining Applications.</name><name>Mining Association Rules between Sets of Items in Large Databases.</name><name>Beyond Market Baskets: Generalizing Association Rules to Correlations.</name><name>Efficient Discovery of Functional and Approximate Dependencies Using Partitions.</name><name>Fast Computation of 2-Dimensional Depth Contours.</name><name>Algorithms for Mining Distance-Based Outliers in Large Datasets.</name><name>Exploratory Mining and Pruning Optimizations of Constrained Association Rules.</name><name>Scalable Techniques for Mining Causal Structures.</name><name>Knowledge Discovery in Data Warehouses.</name></citation><abstract>Existing studies on outliers focus only on the
identification aspect; none provides any intensional
knowledge of the outliers - by which we
mean a description or an explanation of why an
identified outlier is exceptional. For many applications,
a description or explanation is at least as vital to the
user as the identification aspect. Specifically, intensional 
knowledge helps the user to:
(i) evaluate the validity of the identified outliers, and
(ii) improve one's understanding of the data.
The two main issues addressed in this paper are:
what kinds of intensional knowledge to provide,
and how to optimize the computation of such knowledge.
With respect to the first issue, we propose finding
strongest and weak outliers and their
corresponding structural intensional knowledge.
With respect to the second issue, we first present a naive
and a semi-naive algorithm. Then, by means of what we call
path and semi-lattice sharing of I/O processing,
we develop two optimized approaches. We provide analytic results
on their I/O performance, and present experimental results 
showing significant reductions in I/O
and significant speedups in overall runtime.</abstract></paper><paper><title>SPIRIT: Sequential Pattern Mining with Regular Expression Constraints.</title><author><AuthorName>Minos N. Garofalakis</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Rajeev Rastogi</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Kyuseok Shim</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Querying Shapes of Histories.</name><name>Fast Algorithms for Mining Association Rules in Large Databases.</name><name>Mining Sequential Patterns.</name><name>Efficient Data Mining for Path Traversal Patterns.</name><name>Elements of the Theory of Computation.</name><name>Discovering Generalized Episodes Using Minimal Occurrences.</name><name>Discovering Frequent Episodes in Sequences.</name><name>Exploratory Mining and Pruning Optimizations of Constrained Association Rules.</name><name>Mining Association Rules with Item Constraints.</name><name>Mining Sequential Patterns: Generalizations and Performance Improvements.</name><name>Combinatorial Pattern Discovery for Scientific Data: Some Preliminary Results.</name></citation><abstract>Discovering sequential patterns is an important problem in
data mining with a host of application domains including
medicine, telecommunications, and the World Wide Web.
Conventional mining systems provide users with only a
very restricted mechanism (based on minimum support)
for specifying patterns of interest. In this paper, we propose
the use of Regular Expressions (REs) as a flexible constraint
specification tool that enables user-controlled focus to be
incorporated into the pattern mining process. We develop a
family of novel algorithms (termed SPIRIT - Sequential Pattern
mIning with Regular expressIon consTraints) for mining frequent
sequential patterns that also satisfy user-specified RE
constraints. The main distinguishing factor among the
proposed schemes is the degree to which the RE constraints
are enforced to prune the search space of patterns during
computation. Our solutions provide valuable insights into
the tradeoffs that arise when constraints that do not
subscribe to nice properties (like anti-monotonicity)
are integrated into the mining process. A quantitative
exploration of these tradeoffs is conducted through an
extensive experimental study on synthetic and real-life data sets.</abstract></paper><paper><title>A Novel Index Supporting High Volume Data Warehouse Insertion.</title><author><AuthorName>Chris Jermaine</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Anindya Datta</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Edward Omiecinski</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Incremental Organization for Data Recording and Warehousing.</name><name>The Log-Structured Merge-Tree (LSM-Tree).</name><name>Improved Query Performance with Variant Indexes.</name><name>Concurrency Control in B-Trees with Batch Updates.</name><name>The Design and Implementation of a Log-Structured File System.</name><name>Access Methods.</name></citation><abstract>While the desire to support fast, ad hoc query processing
for large data warehouses has motivated the recent
introduction of many new indexing structures, with a few notable
exceptions (namely, the LSM-Tree [4]
and the Stepped Merge Method [1])
little attention has been given to developing new indexing
schemes that allow fast insertions.
Since additions to a large warehouse may number in the
millions per day, indices that require a disk seek
(or even a significant fraction of a seek) per insertion
are not acceptable.
In this paper, we offer an alternative to the B+-tree called the
Y-tree for indexing huge warehouses having frequent
insertions. The Y-tree is a new indexing structure supporting
both point and range queries over a single attribute,
with retrieval performance comparable to the B+-tree.
For processing insertions, however, the Y-tree may exhibit a
speedup of 100 times over batched insertions into a B+-tree.</abstract></paper><paper><title>Microsoft English Query 7.5: Automatic Extraction of Semantics from Relational Databases and OLAP Cubes.</title><author><AuthorName>Adam Blum</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation></citation><abstract></abstract></paper><paper><title>The New Locking, Logging, and Recovery Architecture of Microsoft SQL Server 7.0.</title><author><AuthorName>David Campbell</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation></citation><abstract></abstract></paper><paper><title>The Value of Merge-Join and Hash-Join in SQL Server.</title><author><AuthorName>Goetz Graefe</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Storage and Access in Relational Data Bases.</name><name>Microsoft Index Tuning Wizard for SQL Server 7.0.</name><name>Implementation Techniques for Main Memory Database Systems.</name><name>Nested Loops Revisited.</name><name>Hash Joins and Hash Teams in Microsoft SQL Server.</name><name>Application of Hash to Data Base Machine and Its Architecture.</name><name>Fragmentation: A Technique for Efficient Query Processing.</name><name>Access Path Selection in a Relational Database Management System.</name><name>Join Processing in Database Systems with Large Main Memories.</name><name></name></citation><abstract>Microsoft SQL Server was successful for many
years for transaction processing and decision support
workloads with neither merge join nor hash join,
relying entirely on nested loops and index nested loops join.
How much difference do additional join algorithms really make,
and how much system performance do they actually add?
In a pure OLTP workload that requires only record-to-record
navigation, intuition suggests that index nested loops join is
sufficient. For a DSS workload, however, the question is much
more complex. To answer this question, we have analyzed TPC-D
query performance using an internal build of SQL Server with
merge-join and hash-join enabled and disabled. It shows that merge
join and hash join are both required to achieve the best
performance for decision support workloads.</abstract></paper><paper><title>VOODB: A Generic Discrete-Event Random Simulation Model To Evaluate the Performances of OODBs.</title><author><AuthorName>J{\'e}r{\^o}me Darmont</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Michel Schneider</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>The HyperModel Benchmark.</name><name>The Simulation Model Development Environment: An Overview.</name><name>Output Analysis Capabilities of Simulation Software.</name><name>Libraries of Reusable Models: Theory and Application.</name><name>Dynamic Clustering in Object Databases Exploiting Effective Use of Relationships Between Objects.</name><name>The oo7 Benchmark.</name><name>An Engineering Database benchmark.</name><name>Exploiting Inheritance and Structure Semantics for Effective Clustering and Buffering in an Object-Oriented DBMS.</name><name>Effective Clustering of Complex Objects in Object-Oriented Databases.</name><name>A Comparison Study of Object-Oriented Database Clustering Techniques.</name><name>OCB: A Generic Benchmark to Evaluate the Performances of Object-Oriented Database Systems.</name><name>The O2 System.</name><name>A Clustering Technique for Object Oriented Databases.</name><name>Integrating an Object-Oriented Programming System with a Database System.</name><name>The ObjectStore Database System.</name><name>Texas: An Efficient, Portable Persistent Store.</name><name>On the Performance of Object Clustering Techniques.</name></citation><abstract>Performance of object-oriented database systems
(OODBs) remains an issue for both designers and
users.
The aim of this paper is to propose a generic discrete-event
random simulation model, called VOODB, in order to evaluate
the performance of OODBs in general, and the performance of
optimization methods like clustering in particular.
Such optimization methods undoubtedly improve the
performance of OODBs.
Yet, they always induce some overhead for the
system. Therefore, it is important to evaluate their exact
impact on overall performance.
VOODB has been designed as a generi
