Markov model: Difference between revisions

m Reference uniformity. Cleaned up using AutoEd, General formatting by script
Line 31:
{{main|Markov decision process}}
 
A [[Markov decision process]] is a Markov chain in which state transitions depend on the current state and an action vector that is applied to the system. Typically, a Markov decision process is used to compute a policy of actions that will maximize some utility with respect to expected rewards. It is closely related to [[reinforcement learning]], and can be solved with [[value iteration]] and related methods.
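
As an illustration of how value iteration can produce such a policy, the following is a minimal sketch in Python for a small, finite Markov decision process. The function name <code>value_iteration</code>, the table layout (<code>P[s][a][t]</code> for transition probabilities, <code>R[s][a]</code> for expected immediate rewards), the discount factor <code>gamma</code>, and the two-state toy problem are all assumptions made for this example and are not taken from any particular library or source.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Approximate the optimal state values and a greedy policy.

    P[s][a][t] -- probability of moving from state s to state t under action a
    R[s][a]    -- expected immediate reward for taking action a in state s
    gamma      -- discount factor in [0, 1) (an assumption of this sketch)
    """
    n_states = len(P)
    n_actions = len(P[0])

    def q_value(V, s, a):
        # Expected return of taking action a in state s, then following V.
        return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup: best achievable value in every state.
        V_new = [max(q_value(V, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(new - old) for new, old in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # The policy picks, in each state, the action with the highest value.
    policy = [max(range(n_actions), key=lambda a: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Hypothetical toy problem: two states, two actions (0 = stay, 1 = move).
# Only staying in state 1 is rewarded.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: stay -> 0, move -> 1
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: stay -> 1, move -> 0
]
R = [
    [0.0, 0.0],  # rewards in state 0
    [1.0, 0.0],  # rewards in state 1
]

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this toy problem the resulting policy moves out of state 0 and stays in state 1, and the state values converge to approximately 9 and 10 respectively. Each sweep applies the same backup to every state, which is practical only when the state and action sets are small enough to enumerate.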
 
==Partially observable Markov decision process==