Talk:Policy gradient method

Latest comment: 6 months ago by Hector in topic REINFORCE algorithm

REINFORCE algorithm

I would erase the index subscript in the expectation. Do you agree? Thanks! Hector (talk) 15:07, 4 February 2025 (UTC)