Range Search on Multidimensional Uncertain Data

 

Yufei Tao, Xiaokui Xiao, Reynold Cheng

 

In ACM Transactions on Database Systems (TODS)

 

Abstract



 


In an uncertain database, every object o is associated with a probability density function, which describes the likelihood that o appears at each position in a multidimensional workspace. This article studies two types of range retrieval fundamental to many analytical tasks. Specifically, a nonfuzzy query returns all the objects that appear in a search region r_q with probability at least t_q. On the other hand, given an uncertain object q, fuzzy search retrieves the set of objects that are within distance ε_q from q with probability no less than t_q. The core of our methodology is a novel concept of "probabilistically constrained rectangle", which permits effective pruning of nonqualifying data and validation of qualifying data. We develop a new index structure called the U-tree for minimizing the query overhead. Our algorithmic findings are accompanied by a thorough theoretical analysis, which reveals valuable insight into the problem characteristics and mathematically confirms the efficiency of our solutions. We verify the effectiveness of the proposed techniques with extensive experiments.
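To make the nonfuzzy query definition concrete, here is a minimal brute-force sketch. It is not the U-tree or the probabilistically constrained rectangle technique from the article. It assumes, purely for illustration, that each object's density is uniform over an axis-aligned rectangular uncertainty region, so the appearance probability is the fraction of that region covered by the search region. The names UncertainObject, appearance_probability, and nonfuzzy_range_query are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# A rectangle is one (low, high) interval per dimension.
Rect = Sequence[Tuple[float, float]]


@dataclass
class UncertainObject:
    name: str
    region: Rect  # uncertainty region; ASSUMPTION: pdf is uniform inside it


def appearance_probability(obj: UncertainObject, rq: Rect) -> float:
    """Probability that obj lies in rq, under the uniform-pdf assumption.

    Each interval in obj.region must have positive length (high > low).
    """
    prob = 1.0
    for (lo, hi), (qlo, qhi) in zip(obj.region, rq):
        overlap = min(hi, qhi) - max(lo, qlo)
        if overlap <= 0:
            return 0.0  # disjoint along this dimension
        prob *= overlap / (hi - lo)
    return prob


def nonfuzzy_range_query(
    objects: List[UncertainObject], rq: Rect, tq: float
) -> List[str]:
    """Linear scan: names of objects appearing in rq with probability >= tq."""
    return [o.name for o in objects if appearance_probability(o, rq) >= tq]


objects = [
    UncertainObject("a", [(0.0, 2.0), (0.0, 2.0)]),  # fully inside rq
    UncertainObject("b", [(1.0, 5.0), (1.0, 5.0)]),  # partially inside rq
    UncertainObject("c", [(6.0, 8.0), (6.0, 8.0)]),  # disjoint from rq
]
rq = [(0.0, 3.0), (0.0, 3.0)]

strict = nonfuzzy_range_query(objects, rq, 0.5)
loose = nonfuzzy_range_query(objects, rq, 0.25)
```

The scan evaluates every object's probability, which is the cost an index is meant to avoid. For general densities the product of interval ratios would be replaced by an integral of the density over the intersection, typically evaluated numerically.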
 

 

Paper download



 

 

Codes



 


You need to agree to this before downloading

Codes: U-tree (by Xiaokui Xiao).