Wu, Y; Gao, J; Agarwal, PK; Yang, J, Finding diverse, high-value representatives on a surface of answers,
Proceedings of the VLDB Endowment International Conference on Very Large Data Bases, vol. 10 no. 7
(January, 2017),
pp. 793-804
Abstract: © 2017 VLDB Endowment. In many applications, the system needs to selectively present a small subset of answers to users. The set of all possible answers can be seen as an elevation surface over a domain, where the elevation measures the quality of each answer, and the dimensions of the domain correspond to attributes of the answers with which similarity between answers can be measured. This paper considers the problem of finding a diverse set of k high-quality representatives for such a surface. We show that existing methods for diversified top-k and weighted clustering problems are inadequate for this problem. We propose k-DHR as a better formulation for the problem. We show that k-DHR has a submodular and monotone objective function, and we develop efficient algorithms for solving k-DHR with provable guarantees. We conduct extensive experiments to demonstrate the usefulness of the results produced by k-DHR for applications in computational lead-finding and fact-checking, as well as the efficiency and effectiveness of our algorithms.


