CLCStudent (talk | contribs) m Reverted edits by 103.103.214.103 (talk) to last version by CLCStudent |
Line 31:
{{main|Markov decision process}}
A [[Markov decision process]] is a Markov chain in which state transitions depend on both the current state and an action vector that is applied to the system. Typically, a Markov decision process is used to compute a policy of actions that maximizes the expected cumulative reward. It is closely related to [[reinforcement learning]] and can be solved with [[value iteration]] and related methods.
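As an illustration, value iteration can be sketched for a small, made-up two-state process. The state names, action names, rewards, discount factor <code>gamma</code>, and the function <code>value_iteration</code> below are all assumptions chosen for the example and do not come from any particular library.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8):
    """Return (values, policy) for a finite Markov decision process.

    transitions[s][a] is a list of (probability, next_state) pairs.
    rewards[s][a] is the immediate reward for taking action a in state s.
    Both structures are hypothetical conventions used only in this sketch.
    """
    def q_value(s, a, values):
        # Expected return of taking action a in state s, then following `values`.
        return rewards[s][a] + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])

    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s in transitions:
            # Bellman optimality update: keep the best action's expected return.
            best = max(q_value(s, a, values) for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    # The greedy policy picks, in each state, the action with the highest value.
    policy = {s: max(transitions[s], key=lambda a: q_value(s, a, values))
              for s in transitions}
    return values, policy


# Illustrative two-state process (all numbers are made up).
transitions = {
    "low": {"wait": [(1.0, "low")], "work": [(1.0, "high")]},
    "high": {"wait": [(1.0, "high")]},
}
rewards = {
    "low": {"wait": 0.0, "work": -1.0},
    "high": {"wait": 1.0},
}

values, policy = value_iteration(transitions, rewards)
print(policy)
```

In this example, paying a one-time cost to move from <code>low</code> to <code>high</code> is worthwhile because the discounted stream of later rewards outweighs it, so the computed policy selects <code>work</code> in the <code>low</code> state.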
==Partially observable Markov decision process==