¡¡
CSC6210 Advanced multidimensional search |
Offered by Yufei Tao |
¡¡ |
Course description |
This course will discuss data structures that can be used to solve several important geometric problems efficiently. We will assume that data are stored in the external memory, so the goal is to minimize the number of I/Os performed (nevertheless, most of the techniques taught can be used in the internal memory as well). The first main problem discussed is range search. Here, given a set S of points, we need to pre-process it into a structure such that, given any rectangle Q at run time, all the points of S that fall in Q can be reported efficiently. We will learn several asymptotically optimal structures that can achieve the best tradeoff between the space consumption and query cost. The second major problem is nearest neighbor (NN) search. Given a query point q, a NN query returns the point in S nearest to q. Again, our objective is to build a structure on S so that all NN queries can be processed efficiently in the worst case, especially when the dimensionality is high. All the structures we need for the above purposes are tailored for specific problems. In practice, it would be nice to have a single, generic, structure that is reasonably good for all types of queries. The course will also cover the R-tree, which is the most popular multidimensional structure in practical systems. We will also study the cost model of the R-tree, which is a useful tool for analyzing its practical performance. Target audience Students in the areas of theory, database, data mining, machine learning, and pattern recognition. ¡¡ |
Time table |
|
Grading scheme |
|
Lecture notes |
No textbook. Lecture notes and/or key references will be provided for download before lectures. Lecture 1 Introduction Lecture 2 Persistent B-trees Lecture 3 Point location query (revised on Sep 20; the original version) Lecture 4 Distribution sweep and logarithmic rebuilding Lecture 5 The buffer-tree technique (Sections 1-4) News: Assignment 1 can be downloaded here. Due day: 31 Oct 2009. (Deadline extended to 7 Nov 2009). Lecture 6 Stabbing query Lectures 6-9 will be based on Prof. Lars Arge's excellent notes, which can be downloaded here. Prof. Arge's homepage can be found here. Lecture 7 3-sided range search Lecture 8 Range search 1 (linear space) Lecture 9 Range search 2 (nonlinear space) The
description of
assignment 1 has a typo in Problem 2. The corrected
version is available here. For comparison, the
old version is still available for download. The due day is postponed to
7 Nov 2009. |