Papers Published
- Liu, M; Liao, X; Carin, L, The infinite regionalized policy representation, Proceedings of the 28th International Conference on Machine Learning, ICML 2011 (October, 2011), pp. 769-776.
(last updated on 2024/12/31)
Abstract:
We introduce the infinite regionalized policy representation (iRPR), a nonparametric policy for reinforcement learning in partially observable Markov decision processes (POMDPs). The iRPR assumes an unbounded set of decision states a priori and infers the number of states needed to represent the policy given the experiences. We propose algorithms for learning the number of decision states while maintaining a proper balance between exploration and exploitation. Convergence analysis is provided, along with performance evaluations on benchmark problems. Copyright 2011 by the author(s)/owner(s).