Math @ Duke

Publications [#326893] of Robert Calderbank
Papers Published
 Nokleby, M; Beirami, A; Calderbank, R, A rate-distortion framework for supervised learning,
IEEE International Workshop on Machine Learning for Signal Processing : [proceedings]. IEEE International Workshop on Machine Learning for Signal Processing, vol. 2015-November
(November, 2015), ISBN 9781467374545 [doi]
(last updated on 2017/12/17)
Abstract: © 2015 IEEE. An information-theoretic framework is presented for bounding the number of samples needed for supervised learning in a parametric Bayesian setting. This framework is inspired by an analogy with rate-distortion theory, which characterizes tradeoffs in the lossy compression of random sources. In a parametric Bayesian environment, the maximum a posteriori classifier can be viewed as a random function of the model parameters. Labeled training data can be viewed as a finite-rate encoding of that source, and the excess loss due to using the learned classifier instead of the MAP classifier can be viewed as distortion. A strict bound on the loss, measured in terms of the expected total variation, is derived, providing a minimum number of training samples needed to drive the expected total variation to within a specified tolerance. The tightness of this bound is demonstrated on the classification of Gaussians, for which one can derive closed-form expressions for the bound.



