Department of Mathematics
 Search | Help | Login | pdf version | printable version

Math @ Duke





.......................

.......................


Publications [#235548] of Pankaj K. Agarwal

Papers Published

  1. Agarwal, PK; Cormode, G; Huang, Z; Phillips, J; Wei, Z; Yi, K, Mergeable summaries, Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (2012), pp. 23-34 [doi]
    (last updated on 2017/12/13)

    Abstract:
    We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means that the summaries can be merged in a way like other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the data sets. But some other fundamental ones like those for heavy hitters and quantiles, are not (known to be) mergeable. In this paper, we demonstrate that these summaries are indeed mergeable or can be made mergeable after appropriate modifications. Specifically, we show that for ε-approximate heavy hitters, there is a deterministic mergeable summary of size O(1/ε) for ε-approximate quantiles, there is a deterministic summary of size O(1/ε log(εn))that has a restricted form of mergeability, and a randomized one of size O(1/ε log 3/21/ε) with full mergeability. We also extend our results to geometric summaries such as ε-approximations and εkernels. We also achieve two results of independent interest: (1) we provide the best known randomized streaming bound for ε-approximate quantiles that depends only on ε, of size O(1/ε log 3/21/ε, and (2) we demonstrate that the MG and the SpaceSaving summaries for heavy hitters are isomorphic. © 2012 ACM.

 

dept@math.duke.edu
ph: 919.660.2800
fax: 919.660.2821

Mathematics Department
Duke University, Box 90320
Durham, NC 27708-0320