Math @ Duke
Publications [#353378] of Andrea Agazzi
Papers Published
- Agazzi, A; Lu, J, Global optimality of softmax policy gradient with single hidden layer
neural networks in the mean-field regime, vol. abs/2010.11858
(October, 2020)
(last updated on 2022/05/25)
Abstract: We study the problem of policy optimization for infinite-horizon discounted
Markov Decision Processes with softmax policy and nonlinear function
approximation trained with policy gradient algorithms. We concentrate on the
training dynamics in the mean-field regime, modeling, e.g., the behavior of wide
single hidden layer neural networks, when exploration is encouraged through
entropy regularization. The dynamics of these models is established as a
Wasserstein gradient flow of distributions in parameter space. We further prove
global optimality of the fixed points of this dynamics under mild conditions on
their initialization.
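
As a rough illustration of the objects named in the abstract, the sketch below builds a softmax policy whose logits are the average of N single-neuron contributions (the finite-N counterpart of a mean-field single hidden layer network) and evaluates the policy entropy used as an exploration bonus. It is a minimal sketch, not the paper's construction: the scalar state, the tanh activation, the Gaussian initialization, and all function names are assumptions made here for illustration, and neither the policy gradient update nor the Wasserstein gradient flow is implemented.

```python
import math
import random


def make_particles(n, num_actions, seed=0):
    """Sample n hypothetical neurons (w, b, c) from a standard Gaussian.

    Each particle is one hidden unit: input weight w, bias b, and one
    output weight per action in c. The initialization is an assumption.
    """
    rng = random.Random(seed)
    return [
        (rng.gauss(0.0, 1.0),
         rng.gauss(0.0, 1.0),
         [rng.gauss(0.0, 1.0) for _ in range(num_actions)])
        for _ in range(n)
    ]


def mean_field_logits(state, particles, num_actions):
    """Logits f(s, a) = (1/N) * sum_i c_i[a] * tanh(w_i * s + b_i).

    The 1/N scaling is what makes this an average over particles, i.e.
    an integral against their empirical distribution.
    """
    n = len(particles)
    logits = [0.0] * num_actions
    for w, b, c in particles:
        hidden = math.tanh(w * state + b)
        for a in range(num_actions):
            logits[a] += c[a] * hidden / n
    return logits


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


particles = make_particles(n=1000, num_actions=3)
policy = softmax(mean_field_logits(0.5, particles, num_actions=3))
bonus = entropy(policy)  # entropy term that would be added to the reward
```

Because the logits depend on the neurons only through their average, permuting the particles leaves the policy unchanged, which is why the training dynamics can be described at the level of a distribution over parameters rather than individual weights.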