Fitzpatrick Institute for Photonics
Pratt School of Engineering
Duke University

Publications [#338747] of Lawrence Carin

Papers Published

  1. Liu, M; Liao, X; Carin, L, Online expectation maximization for reinforcement learning in POMDPs, IJCAI International Joint Conference on Artificial Intelligence (December, 2013), pp. 1501-1507
    (last updated on 2024/12/31)

    Abstract:
    We present online nested expectation maximization for model-free reinforcement learning in a POMDP. The algorithm evaluates the policy only in the current learning episode, discarding the episode after the evaluation and memorizing the sufficient statistic, from which the policy is computed in closed form. As a result, the online algorithm has a time complexity of O(n) and a memory complexity of O(1), compared to O(n^2) and O(n) for the corresponding batch-mode algorithm, where n is the number of learning episodes. The online algorithm, which has provable convergence, is demonstrated on five benchmark POMDP problems.
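
    Illustration (not from the paper):
    The complexity contrast in the abstract can be illustrated with a toy sketch. This is not the paper's nested EM algorithm: the statistic used here (return-weighted counts of observation-action pairs), the memoryless policy, and all function names are assumptions chosen only to show why keeping a running sufficient statistic gives O(n) total time and O(1) memory in the number of episodes, while reprocessing stored episodes gives O(n^2) time and O(n) memory.

```python
from collections import defaultdict


def update_statistic(stat, episode):
    """Fold one episode into the running statistic.

    Hypothetical statistic: each (observation, action) pair visited in the
    episode is credited with the episode's total (nonnegative) return.
    episode is a list of (observation, action, reward) tuples.
    """
    total_return = sum(reward for _, _, reward in episode)
    for obs, act, _ in episode:
        stat[(obs, act)] += total_return
    return stat


def policy_from_statistic(stat, actions):
    """Closed-form policy: normalize the weighted counts per observation."""
    totals = defaultdict(float)
    for (obs, _), weight in stat.items():
        totals[obs] += weight
    return {
        obs: {a: stat.get((obs, a), 0.0) / total for a in actions}
        for obs, total in totals.items()
        if total > 0
    }


def run_online(episodes, actions):
    """Each episode is processed once and then discarded."""
    stat = defaultdict(float)  # size does not grow with the number of episodes
    for episode in episodes:
        update_statistic(stat, episode)
    return policy_from_statistic(stat, actions)


def run_batch(episodes, actions):
    """Every stored episode is re-evaluated whenever a new one arrives."""
    stored = []  # grows linearly with the number of episodes
    policy = {}
    for episode in episodes:
        stored.append(episode)
        stat = defaultdict(float)
        for old_episode in stored:  # n passes of up to n episodes each
            update_statistic(stat, old_episode)
        policy = policy_from_statistic(stat, actions)
    return policy


episodes = [
    [("o1", "a", 1.0), ("o1", "b", 1.0)],
    [("o1", "a", 2.0)],
]
online_policy = run_online(episodes, actions=["a", "b"])
batch_policy = run_batch(episodes, actions=["a", "b"])
```

    In this toy setting both routines return the same policy, because the statistic does not depend on the policy that generated the episodes. In the setting the abstract describes, the policy changes as learning proceeds, so this sketch shows only the bookkeeping difference, not the algorithm's convergence behavior.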

