→Policy gradient: anchor REINFORCE
Line 27:
== REINFORCE ==
{{Anchor|REINFORCE}}
=== Policy gradient ===
Line 196 ⟶ 197:
* [[Reinforcement learning]]
* [[Deep reinforcement learning]]
* [[Actor-critic method]]