correcting according to "Articles for possible copyedit from 2025-2-20 dump" – "is,This", "gradientwhich", "estimatorand", "sinceby", "tryinguntil", "advantageunder", "advantagewhere"
merge hidden blocks
Line 49:
{{hidden begin|style=width:100%|ta1=center|border=1px #aaa solid|title=Proof}}
{{Math proof|title=Proof of Lemma|proof=
Use the [[reparameterization trick#REINFORCE estimator|reparameterization trick]].
Line 82:
\end{aligned}
</math>
}}
{{Math proof|title=Proof of two identities|proof=
{{Math proof|title=Proof|proof=
Applying the [[reparameterization trick#REINFORCE estimator|reparameterization trick]],