Deep reinforcement learning: Difference between revisions

Tsesea (talk | contribs)
Tag: Reverted
Line 71:
[[File:Challenges and Tricks of Deep RL.jpg|thumb|Challenges and tricks in deep reinforcement learning algorithms]]
 
Deep reinforcement learning algorithms can start from an untrained policy and achieve superhuman performance in many complex tasks, including Atari games, StarCraft and Go. Mainstream DRL algorithms include Deep Q-Network (DQN), Dueling DQN, Double DQN (DDQN), Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), Asynchronous Advantage Actor-Critic (A3C), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Soft Actor-Critic (SAC) and Distributional SAC (DSAC). Each of these algorithms incorporates one or more of the tricks described above to alleviate one or more of the challenges.<ref name="Li-2023">{{cite book |last1=Li |first1=Shengbo |title=Reinforcement Learning for Sequential Decision and Optimal Control |date=2023 |location=Springer Verlag, Singapore |isbn=978-9-811-97783-1 |pages=1–460 |doi=10.1007/978-981-19-7784-8 |s2cid=257928563 |edition=First |url=https://link.springer.com/book/10.1007/978-981-19-7784-8}}</ref>
 
== Research ==