Revision as of 00:56, 6 December 2023 by Tsesea (104 edits) →Algorithms Tag: Reverted
[[File:Challenges and Tricks of Deep RL.jpg|thumb|Challenges and tricks in deep reinforcement learning algorithms]]
Deep reinforcement learning algorithms can start from a blank policy candidate and achieve superhuman performance in many complex tasks, including Atari games, StarCraft and Go. Mainstream DRL algorithms include Deep Q-Network (DQN), Dueling DQN, Double DQN (DDQN), Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), Asynchronous Advantage Actor-Critic (A3C), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Soft Actor-Critic (SAC), and Distributional SAC (DSAC). Each of these algorithms applies one or more of the tricks mentioned above to alleviate one or more of the challenges.<ref name="Li-2023">{{cite book |last1=Li |first1=Shengbo |title=Reinforcement Learning for Sequential Decision and Optimal Control |date=2023 |location=Springer Verlag, Singapore |isbn=978-9-811-97783-1 |pages=1–460 |doi=10.1007/978-981-19-7784-8 |s2cid=257928563 |edition=First |url=https://link.springer.com/book/10.1007/978-981-19-7784-8}}</ref>

== Research ==

Deep reinforcement learning: Difference between revisions