which results in a heterogeneous system, where the
clock cycle of the CPU, the performance/capacity of disk
drives, etc. differ among the component PCs.
Heterogeneity is inevitable. Current algorithms basically
assume homogeneity. Thus, if we naively apply them to
a heterogeneous system, their performance is far below
expectation. We need new methodologies to handle
heterogeneity. In this paper, we propose new dynamic
load balancing methods for association rule mining,
which work under a heterogeneous system. Two strategies,
called candidate migration and transaction migration, are
proposed. Initially the first one is invoked. When the load
imbalance cannot be resolved with the first method, the
second one is employed, which is costly but more effective
for strong imbalance. We have implemented them on a PC
cluster system with two different types of PCs: one with
Pentium Pro, the other with Pentium II. The experimental
results confirm that the proposed approach can very effectively
balance the workload among heterogeneous PCs.</abstract></paper><paper><title>Histogram-Based Approximation of Set-Valued Query-Answers.</title><author><AuthorName>Yannis E. Ioannidis</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Viswanath Poosala</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Join Synopses for Approximate Query Answering.</name><name>A Semantics for Complex Objects and Approximate Queries.</name><name>Adaptive Selectivity Estimation Using Query Feedback.</name><name>New Sampling-Based Summary Statistics for Improving Approximate Query Answers.</name><name>Fast Incremental Maintenance of Approximate Histograms.</name><name>Online Aggregation.</name><name>The Optimization of Queries in Relational Databases.
Ph.D. thesis, Case Western Reserve University 1980</name><name>Practical Selectivity Estimation through Adaptive Sampling.</name><name>Query Generalization: A Method for Interpreting Null Answers.</name><name>Processing Real-Time, Non-Aggregate Queries with Time-Constraints in CASE-DB.</name><name>Accurate Estimation of the Number of Tuples Satisfying a Condition.</name><name>Fast Approximate Answers to Aggregate Queries on a Data Cube.</name><name>Fast Approximate Query Answering Using Precomputed Statistics.</name><name>Selectivity Estimation Without the Attribute Value Independence Assumption.</name><name>Improved Histograms for Selectivity Estimation of Range Predicates.</name><name>Data Cube Approximation and Histograms via Wavelets.</name><name>APPROXIMATE - A Query Processor that Produces Monotonically Improving Approximate Answers.</name><name>Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology.
Addison-Wesley 1949</name></citation><abstract>Answering queries approximately has recently
been proposed as a way to reduce query response
times in on-line decision support systems, when
the precise answer is not necessary or early
feedback is helpful. Most of the work in this
area uses sampling-based techniques and handles
aggregate queries, ignoring queries that return
relations as answers. In this paper, we extend the
scope of approximate query answering to general queries.
We propose a novel and intuitive error measure for quantifying
the error in an approximate query answer, which can be a
multiset in general. We also study the use of histograms
in approximate query answering as an alternative to sampling.
In that direction, we develop a histogram algebra and
demonstrate how complex SQL queries on a database may
be translated into algebraic operations on the corresponding
histograms. Finally, we present the results of an initial
set of experiments where various types of histograms and
sampling are compared with respect to their effectiveness
in approximate query answering as captured by the introduced
error measure. The results indicate that the MaxDiff(V,A)
histograms provide quality approximations for both set-valued and
aggregate queries, while sampling is competitive mainly for
aggregate queries with no join operators.</abstract></paper><paper><title>Semantic Compression and Pattern Extraction with Fascicles.</title><author><AuthorName>H. V. Jagadish</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>J. Madar</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Raymond T. Ng</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name>Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications.</name><name>Mining Association Rules between Sets of Items in Large Databases.</name><name>Fast Algorithms for Mining Association Rules in Large Databases.</name><name>The New Jersey Data Reduction Report.</name><name>Dynamic Itemset Counting and Implication Rules for Market Basket Data.</name><name>A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.</name><name>FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets.</name><name>CURE: An Efficient Clustering Algorithm for Large Databases.</name><name>Comparing Massive High-Dimensional Data Sets.</name><name>Finding Interesting Rules from Large Sets of Discovered Association Rules.</name><name>Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences.</name><name>Levelwise Search and Borders of Theories in Knowledge Discovery.</name><name>Efficient and Effective Clustering Methods for Spatial Data Mining.</name><name>Block-Oriented Compression Techniques for Large Statistical Databases.</name><name>Rearranging Data to Maximize the Efficiency of Compression.</name><name>Power Domains.</name><name>BIRCH: An Efficient Data Clustering Method for Very Large Databases.</name></citation><abstract>Often many records
in a database share similar
values for several attributes. If one is able to
identify and group together records that share
similar values for some - even if not all - attributes,
one can both obtain a more parsimonious representation
of the data, and gain useful insight into the data from
a mining perspective.
In this paper, we introduce the notion of fascicles.
A fascicle F(k,t) is a subset of records that have
k compact attributes. An attribute A of a
collection F of records is compact if the width of
the range of A-values (for numeric attributes) or
the number of distinct A-values (for categorical
attributes) of all the records in F does not exceed t.
We introduce and study two problems related to fascicles.
First, we consider how to find fascicles such that the total
storage of the relation is minimized.
Second, we study how best to extract fascicles whose sizes
exceed a given minimum threshold (i.e., support) and that
represent patterns of maximal quality, where quality is
measured by the pair (k,t). We develop algorithms
to attack both of the above problems.
We show that these two problems are very hard to solve
optimally. But we demonstrate empirically that good
solutions can be obtained using our algorithms.</abstract></paper><paper><title>Issues in Network Management in the Next Millennium.</title><author><AuthorName>Michael L. Brodie</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Surajit Chaudhuri</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation></citation><abstract>The next generation of computing will involve vast
numbers of devices and humans interacting to achieve
organizational, enterprise, and any number of other personal
objectives. The explosive growth of the web and the emergence
of a new generation of personal gizmos are two extreme examples:
the web is a universal publishing platform and the PalmPilot is
a hand-held database of contact and scheduling information.
In the emerging networked world, data will reside not on a few
data servers but on millions of servers and devices distributed
worldwide in connected and disconnected modes. Conventional
database concepts, tools, and techniques apply in the abstract but
the networked world will present several discontinuities.
Some fundamental database assumptions will no longer apply.
What are the data management requirements in the future
networked world? What are the current and future requirements
and challenges for networked data management? In this industrial
session, three industry leaders with major commitments to
networked data management will present their views and will
respond to questions.</abstract></paper><paper><title>A Scalable and Highly Available Networked Database Architecture.</title><author><AuthorName>Roger Bamford</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Rafiul Ahad</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><author><AuthorName>Angelo Pruscino</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation></citation><abstract>The explosive growth of the Internet and
information devices has driven database systems
to be more scalable, available, and able to
support online, mobile, and disconnected clients
while keeping the cost of operations low. This
paper presents the concept of a Scalable Server
that has the above characteristics and that can
directly serve applications and data.</abstract></paper><paper><title>Networked Data Management Design Points.</title><author><AuthorName>James R. Hamilton</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation></citation><abstract></abstract></paper><paper><title>In Cyber Space No One can Hear You Scream.</title><author><AuthorName>Chris Pound</AuthorName><institute><InstituteName></InstituteName><country></country></institute></author><year>1999</year><conference>International Conference on Very Large Data Bases</conference><citation><name></name></citation><abstract>As the telecommunications industry endeavours to reinvent itself,
the effective management and exploitation of information,