{| class="wikitable"
! Algorithm !! Model !! Policy !! Action space !! State space !! Operator
|-
|Distributional Soft Actor-Critic ||Model-free ||Off-policy ||Continuous ||Continuous ||Value distribution
|}
 
It was once assumed that deep reinforcement learning (DRL) would follow naturally from combining tabular RL with deep neural networks, and that its design would be a trivial task. In practice, DRL is fundamentally complicated because it inherits serious challenges from both reinforcement learning and deep learning. Some of these challenges, including non-i.i.d. sequential data, easy divergence, value overestimation, and sample inefficiency, can be particularly destructive if not properly addressed. A number of empirical but useful tricks have been proposed to address these issues, and they form the basis of many advanced DRL algorithms. These tricks include experience replay (ExR), parallel exploration (PEx), separated target network (STN), delayed policy update (DPU), constrained policy update (CPU), clipped actor criterion (CAC), double Q-functions (DQF), bounded double Q-functions (BDQ), distributional return function (DRF), entropy regularization (EnR), and soft value function (SVF).
[[File:Challenges and Tricks of Deep RL.jpg|thumb|Challenges and tricks in deep reinforcement learning algorithms]]
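As an illustration, two of these tricks, experience replay (ExR) and the separated target network (STN), are commonly combined in DQN-style value learning. The following minimal sketch is an illustration only: the network sizes, buffer capacity, and helper names such as <code>store</code> and <code>sync_target</code> are arbitrary choices for the example, not part of any particular algorithm's specification. It shows how a replay buffer supplies near-i.i.d. minibatches while a slowly synchronized target network stabilizes the bootstrapped targets.

<syntaxhighlight lang="python">
# Minimal sketch of experience replay (ExR) and a separated target network
# (STN) in a DQN-style update. Dimensions and hyperparameters are
# placeholders chosen for the example only.
import random
from collections import deque

import torch
import torch.nn as nn

obs_dim, n_actions, gamma = 4, 2, 0.99

q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())   # STN: a frozen copy of the online network
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

replay = deque(maxlen=10_000)                     # ExR: buffer of past transitions


def store(s, a, r, s_next, done):
    """Append one transition to the replay buffer."""
    replay.append((s, a, r, s_next, done))


def update(batch_size=32):
    """One gradient step on a random minibatch drawn from the buffer."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)     # ExR: break temporal correlation
    s, a, r, s_next, done = map(torch.tensor, zip(*batch))
    s, s_next = s.float(), s_next.float()

    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                         # STN: bootstrap from the slow target network
        q_next = target_net(s_next).max(dim=1).values
        target = r.float() + gamma * (1 - done.float()) * q_next

    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


def sync_target():
    """Refresh the target network, typically every few hundred updates."""
    target_net.load_state_dict(q_net.state_dict())
</syntaxhighlight>

Sampling uniformly from the buffer makes consecutive gradient steps less correlated than learning directly from the sequential stream of experience, while updating the target network only occasionally keeps the regression target from shifting at every step.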
 
== Research ==