Revision as of 11:22, 11 December 2020 by WikiCleanerBot (m): v2.04b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation). Tag: WPCleaner
Revision as of 12:30, 21 December 2020 by 102.167.99.249 (→Markov decision process): MDPs are not "closely related" to "reinforcement learning". Reinforcement learning is simply a non-mathematically rigorous application which tries to emulate MDPs.
Line 31:
{{main|Markov decision process}}
A [[Markov decision process]] is a Markov chain in which state transitions depend on the current state and an action vector that is applied to the system. Typically, a Markov decision process is used to compute a policy of actions that will maximize some utility with respect to expected rewards~~. It is closely related to [[reinforcement learning]], and can be solved with [[value iteration]] and related methods~~.
==Partially observable Markov decision process==
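To make the Markov decision process paragraph above concrete, the following is a minimal sketch of value iteration, the method named in the older revision, on a toy problem. The function name `value_iteration`, the discount factor `gamma`, the nested-list layout of `P` and `R`, and the two-state example are all assumptions chosen for illustration; none of them come from the article.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Sketch of value iteration for a small, finite MDP.

    P[a][s][t] is the assumed probability of moving from state s to
    state t under action a; R[a][s] is the assumed expected immediate
    reward for taking action a in state s.
    """
    n_actions = len(P)
    n_states = len(P[0])

    def q_value(V, a, s):
        # Expected return of taking action a in state s, then following V.
        return R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup: best achievable value in each state.
        V_new = [max(q_value(V, a, s) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(V, a, s))
              for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: action 0 stays in place, action 1 swaps states.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: move to the other state
]
# Reward 1 only for staying in state 1.
R = [
    [0.0, 1.0],  # action 0
    [0.0, 0.0],  # action 1
]

V, policy = value_iteration(P, R)
```

In this toy problem the computed policy moves out of state 0 and stays in state 1, which is the "policy of actions that will maximize some utility with respect to expected rewards" that the paragraph describes.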

Markov model: Difference between revisions