On the complexity of partially observed Markov decision processes

Dima Burago, Michel De Rougemont, Anatol Slissenko

Research output: Contribution to journal › Article › peer-review


Abstract

In this paper we consider the complexity of constructing optimal policies (strategies) for a certain type of partially observed Markov decision process. This particular case of the classical problem deals with finite stationary processes, and can be represented as constructing optimal strategies for reaching target vertices from a starting vertex in a graph with colored vertices and probabilistic deviations from the edge chosen to follow. The colors of the visited vertices are the only information available to a strategy. The complexity of Markov decision processes in the case of perfect information (bijective coloring of vertices) is known and is briefly surveyed at the beginning of the paper. For the unobservable case (all colors equal) we improve a result of Papadimitriou and Tsitsiklis: we show that the problem of constructing even a very weak approximation to an optimal strategy is NP-hard. Our main results concern the case of a fixed bound on the multiplicity of the coloring, that is, partially observed processes for which an upper bound on the unobservability is assumed. We show that the problem of finding an optimal strategy remains NP-hard, but polynomial-time approximations are possible. Some relations of our results to the Max-Word Problem are also indicated.
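The colored-graph model described in the abstract can be illustrated with a minimal simulation sketch. This is a hypothetical toy instance, not taken from the paper: the graph, coloring, deviation probability `slip`, and the trivial color-based strategy are all illustrative assumptions. A strategy here sees only the sequence of colors of visited vertices, and with probability `slip` the process deviates to a uniformly random neighbor instead of the chosen edge.

```python
import random

random.seed(0)

# Toy instance (hypothetical): vertices 0..3, vertex 3 is the target.
# The coloring has multiplicity 2: vertices 1 and 2 share the color "blue".
edges = {0: [1, 2], 1: [3, 0], 2: [3, 0], 3: [3]}
color = {0: "red", 1: "blue", 2: "blue", 3: "green"}
slip = 0.1  # probability of deviating from the chosen edge

def strategy(observed_colors):
    # A simple color-based rule: always pick the first outgoing edge.
    # Only the observed color sequence is available, never the vertex itself.
    return 0

def run(start=0, target=3, max_steps=50):
    v, obs = start, [color[start]]
    for _ in range(max_steps):
        if v == target:
            return True
        chosen = edges[v][strategy(obs) % len(edges[v])]
        # With probability `slip`, the process deviates to a random neighbor.
        v = random.choice(edges[v]) if random.random() < slip else chosen
        obs.append(color[v])
    return v == target

success = sum(run() for _ in range(1000)) / 1000
print(f"empirical success probability: {success:.2f}")
```

Estimating the value of a fixed strategy by simulation, as above, is straightforward; the paper's hardness results concern the much harder problem of *finding* an optimal strategy from the color observations alone.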

Original language: English (US)
Pages (from-to): 161-183
Number of pages: 23
Journal: Theoretical Computer Science
Volume: 157
Issue number: 2
DOIs
State: Published - May 5 1996

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)
