Markov decision process: Difference between revisions

Content deleted Content added
m Computational complexity: added a few links
m Computational complexity: moved periods before references
Line 81:
==Computational complexity==
 
Algorithms for finding optimal policies with [[time complexity]] polynomial in the size of the problem representation exist for finite MDPs. Thus, [[decision problem]]s based on MDPs are in computational [[complexity class]] [[P (complexity)|P]].<ref>{{cite journal |last1=Papadimitriou|first1=Christos|last2=Tsitsiklis|first2=John|date=1987|title=The Complexity of Markov Decision Processes|url=https://pubsonline.informs.org/doi/abs/10.1287/moor.12.3.441|journal=Mathematics of Operations Research|volume=12|issue=3|doi= 10.1287/moor.12.3.441|access-date=November 2, 2023}}</ref>. However, due to the [[curse of dimensionality]], the size of the problem representation is often exponential in the number of state and action variables, limiting exact solution techniques to problems that have a compact representation. In practice, online planning techniques such as [[Monte Carlo tree search]] can find useful solutions in larger problems, and, in theory, it is possible to construct online planning algorithms that can find an arbitrarily near-optimal policy with no computational complexity dependence on the size of the state space.<ref>{{cite journal|last1=Kearns|first1=Michael|last2=Mansour|first2=Yishay|last3=Ng|first3=Andrew|date=November 2002|title=A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes|url=https://link.springer.com/article/10.1023/A:1017932429737|journal=Machine Learning|volume=49|doi=10.1023/A:1017932429737|access-date=November 2, 2023}}</ref>.
 
==Extensions and generalizations==