A model for system uncertainty in reinforcement learning

Ryan Murray, Michele Palladino

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

This work provides a rigorous framework for studying continuous-time control problems in uncertain environments. The framework models uncertainty in state dynamics as a probability measure on the space of functions. Such a probability measure is permitted to change over time as agents learn about their environment. This model can be seen as a variant of either Bayesian reinforcement learning (RL) or adaptive optimal control. We study conditions for locally optimal trajectories within this model, in particular deriving an appropriate dynamic programming principle and Hamilton–Jacobi equations. Some discussion of variants of the model is also provided, including one potential framework for studying the tradeoff between exploration and exploitation in RL.
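To make the abstract's idea concrete, the following is a minimal toy sketch (not the paper's construction) of the core mechanism it describes: a probability measure over candidate dynamics that is updated over time as the agent observes its environment, with the control chosen against the current belief. All names, the two candidate drift functions, the cost weights, and the discretization are illustrative assumptions.

```python
import numpy as np

# Toy illustration, not the paper's framework: maintain a probability measure
# (here, a discrete belief) over a finite family of candidate drift functions
# f_i for dx = f_i(x, u) dt + noise, update it by Bayes' rule from observed
# state increments, and pick the control minimizing expected one-step cost.

rng = np.random.default_rng(0)

# Hypothetical candidate dynamics the agent is uncertain between.
candidates = [
    lambda x, u: -1.0 * x + u,   # stable plant
    lambda x, u: +0.5 * x + u,   # unstable plant
]
belief = np.array([0.5, 0.5])    # prior over the candidates
true_f = candidates[1]           # the environment secretly uses this one

dt, sigma = 0.1, 0.05
x = 1.0
for _ in range(50):
    # Exploitation-only control: minimize expected quadratic cost
    # E[x'^2] + 0.1 u^2 under the current belief, over a coarse action grid.
    actions = np.linspace(-2.0, 2.0, 41)

    def expected_cost(u):
        next_states = np.array([x + f(x, u) * dt for f in candidates])
        return belief @ (next_states ** 2) + 0.1 * u ** 2

    u = min(actions, key=expected_cost)

    # Step the true system with Gaussian noise on the increment.
    x_next = x + true_f(x, u) * dt + sigma * np.sqrt(dt) * rng.normal()

    # Bayesian update: likelihood of the observed increment under each f_i.
    means = np.array([x + f(x, u) * dt for f in candidates])
    lik = np.exp(-(x_next - means) ** 2 / (2 * sigma ** 2 * dt))
    belief = belief * lik
    belief = belief / belief.sum()
    x = x_next

# The belief concentrates on the true (unstable) candidate as data accrues.
print(belief)
```

The greedy rule above never trades exploitation for information, which is exactly the exploration/exploitation tension the abstract alludes to: a control that perturbs the state can make the candidate models easier to distinguish at the cost of short-term performance.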

Original language: English (US)
Pages (from-to): 24-31
Number of pages: 8
Journal: Systems and Control Letters
Volume: 122
DOIs
State: Published - Dec 2018

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science(all)
  • Mechanical Engineering
  • Electrical and Electronic Engineering

