* {{Cite book |last=Szepesvári |first=Csaba |title=Algorithms for Reinforcement Learning |date=2010 |publisher=Springer International Publishing |isbn=978-3-031-00423-0 |edition=1 |series=Synthesis Lectures on Artificial Intelligence and Machine Learning |location=Cham}}
* {{Cite journal |last1=Mohamed |first1=Shakir |last2=Rosca |first2=Mihaela |last3=Figurnov |first3=Michael |last4=Mnih |first4=Andriy |date=2020 |title=Monte Carlo Gradient Estimation in Machine Learning |url=https://www.jmlr.org/papers/v21/19-346.html |journal=Journal of Machine Learning Research |volume=21 |issue=132 |pages=1–62 |arxiv=1906.10652 |issn=1533-7928}}
== External links ==
* {{Cite web |last=Weng |first=Lilian |date=2018-04-08 |title=Policy Gradient Algorithms |url=https://lilianweng.github.io/posts/2018-04-08-policy-gradient/ |access-date=2025-01-25 |website=lilianweng.github.io |language=en}}
* {{Cite web |title=Vanilla Policy Gradient — Spinning Up documentation |url=https://spinningup.openai.com/en/latest/algorithms/vpg.html |access-date=2025-01-25 |website=spinningup.openai.com}}
[[Category:Reinforcement learning]]