Revision as of 18:18, 7 May 2018 by Nitpicking polish (m: Reference uniformity. Cleaned up using AutoEd, General formatting by script)
Revision as of 18:52, 17 June 2018 by Loraof (→Markov decision process: ce)
Line 31:
{{main\|Markov decision process}}
A [[Markov decision process]] is a Markov chain in which state transitions depend on the current state and an action vector that is applied to the system. Typically, a Markov decision process is used to compute a policy of actions that will maximize some utility with respect to expected rewards. It is closely related to [[~~Reinforcement~~reinforcement learning]], and can be solved with [[value iteration]] and related methods.

==Partially observable Markov decision process==
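The value iteration mentioned in the Markov decision process paragraph above can be sketched as follows. This is a minimal illustration and not part of either revision: the two-state transition table `P`, the discount factor `gamma`, the tolerance `tol`, and the helper `q_value` are all assumptions chosen for the example.

```python
# Hypothetical two-state, two-action MDP, used only for illustration.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}


def q_value(P, V, s, a, gamma):
    """Expected discounted return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Repeat the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in P}
    for _ in range(max_iter):
        delta = 0.0
        for s in P:
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update of the state value
        if delta < tol:
            break
    # The greedy policy picks, in each state, the action with the highest value.
    policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration(P)
print(policy)
```

The returned `policy` maps each state to the action that maximizes expected discounted reward under the assumed transition table, which is the "policy of actions" the paragraph refers to.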

Markov model: Difference between revisions