Math @ Duke

Publications [#235483] of Pankaj K. Agarwal
Papers Published
 Agarwal, PK; Arge, L; Yi, K, I/Oefficient batched unionfind and its applications to terrain analysis,
Proceedings of the Annual Symposium on Computational Geometry, vol. 2006
(2006),
pp. 167176
(last updated on 2017/12/18)
Abstract: Despite extensive study over the last four decades and numerous applications, no I/Oefficient algorithm is known for the unionfind problem. In this paper we present an I/Oefficient algorithm for the batched (offline) version of the unionfind problem. Given any sequence of N union and find operations, where each union operation joins two distinct sets, our algorithm uses O(SORT(N)) = O(N/B logM/B N/B) I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(SORT(N) + MST(N)) I/Os, where MST(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(SORT(N)log(N/M))I/O algorithm for this problem, which we have implemented. We are interested in the unionfind problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over ℝ2, and many problems that deal with such functions require a unionfind data structure. With the emergence of modem mapping technologies, huge amount of elevation data is being generated that is too large to fit in memory, thus I/Oefficient algorithms are needed to process this data efficiently. In this paper, we study two terrain analysis problems that benefit from a unionfind data structure: (i) computing topological persistence and (ii) constructing the contour tree. We give the first O(SORT(N))I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices. Finally, we report some preliminary experimental results, showing that our algorithms give orderofmagnitude improvement over previous methods on large data sets that do not fit in memory. Copyright 2006 ACM.


dept@math.duke.edu
ph: 919.660.2800
fax: 919.660.2821
 
Mathematics Department
Duke University, Box 90320
Durham, NC 277080320

