|
Range
Search on Multidimensional Uncertain Data |
| |
|
Yufei Tao,
Xiaokui Xiao,
Reynold Cheng |
| |
|
ACM Transactions on Databases
Systems (TODS),
32(3), 2007 |
| |
| Abstract |
|
In an uncertain database, every object o is associated with a
probability density function, which describes the likelihood that o
appears at each position in a multidimensional workspace. This article
studies two types of range retrieval fundamental to many analytical tasks.
Specifically, a nonfuzzy query returns all the objects that appear in a
search region rq with at least a certain probability tq.
On the other hand, given an uncertain object q, fuzzy search
retrieves the set of objects that are within distance εq
from q with no less than probability tq. The core
of our methodology is a novel concept of "probabilistically constrained
rectangle", which permits effective pruning/validation of nonqualifying/qualifying
data. We develop a new index structure called the U-tree for minimizing the
query overhead. Our algorithmic findings are accompanied with a thorough
theoretical analysis, which reveals valuable insight into the problem
characteristics, and mathematically confirms the efficiency of our
solutions. We verify the effectiveness of the proposed techniques with
extensive experiments.
|
| Paper
download |
The short version appeared in VLDB 2005.
|
|
Implementation and datasets |
Before you proceed with downloading, please read and agree to
the terms of using our implementation.
Implementation of the U-tree
(implemented: by
Xiaokui Xiao).
Datasets used in our experiments: CA,
LB.
Dataset format:
CA (LB): Each line corresponds to a location in California (the
Long Beach county), in the form of:
tuple-id <space> x- <space> y-coordinate
Based on each point in CA (LB), an uncertain object is defined as
described in the experimental settings of our paper. The PCRs of the object
are computed in the following function of the U-tree implementation: void
U_Tree::init(char *_fname, int _b_length, Cache *_c, int _dimension, int
nCataSize, float* pCatalog).
|
| |