IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks

Michael Luo; Jiahao Yao; Richard Liaw; Eric Liang; Ion Stoica

doi:10.48550/arxiv.1912.00167

Abstract

The practical use of reinforcement learning agents is often bottlenecked by training time. To accelerate training, practitioners often turn to distributed reinforcement learning architectures that parallelize the training process. However, modern methods for scalable reinforcement learning (RL) often trade off the throughput of samples that an RL agent can learn from (sample throughput) against the quality of learning from each sample (sample efficiency). In these scalable RL architectures, as one increases sample throughput (e.g., by increasing parallelization in IMPALA), sample efficiency drops significantly. To address this, we propose a new distributed reinforcement learning algorithm, IMPACT. IMPACT extends IMPALA with three changes: a target network for stabilizing the surrogate objective, a circular buffer, and truncated importance sampling. In discrete action-space environments, we show that IMPACT attains higher reward than IMPALA while reducing training wall-clock time by up to 30%. For continuous control environments, IMPACT trains faster than existing scalable agents while preserving the sample efficiency of synchronous PPO.
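The three changes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `CircularBuffer`, the functions `truncated_ratio` and `clipped_surrogate`, and the parameters `max_replays`, `rho_max`, and `eps` are hypothetical names chosen for illustration. The surrogate shown is a generic PPO-style clipped objective taken relative to a target policy, which may differ in detail from the objective defined in the paper.

```python
import math
from collections import deque


class CircularBuffer:
    """Fixed-capacity buffer; each batch may be replayed a bounded number of times.

    Assumption: the oldest batch is overwritten when the buffer is full, and a
    batch is evicted once it has been sampled `max_replays` times.
    """

    def __init__(self, capacity, max_replays):
        self.max_replays = max_replays
        self._slots = deque(maxlen=capacity)  # oldest entry is dropped when full

    def add(self, batch):
        self._slots.append([batch, 0])  # [data, times_sampled]

    def sample(self):
        """Return the oldest batch still under its replay limit, or None."""
        while self._slots:
            slot = self._slots[0]
            if slot[1] < self.max_replays:
                slot[1] += 1
                return slot[0]
            self._slots.popleft()  # replay budget exhausted, evict
        return None


def truncated_ratio(logp_num, logp_den, rho_max=1.0):
    """Importance ratio exp(logp_num - logp_den), truncated from above at rho_max."""
    return min(math.exp(logp_num - logp_den), rho_max)


def clipped_surrogate(logp_learner, logp_target, advantage, eps=0.2):
    """PPO-style clipped surrogate, with the ratio taken against a target policy."""
    ratio = math.exp(logp_learner - logp_target)
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    buf = CircularBuffer(capacity=2, max_replays=2)
    for name in ("a", "b", "c"):
        buf.add(name)  # "a" is overwritten once the buffer is full
    print([buf.sample() for _ in range(5)])
    print(truncated_ratio(math.log(4.0), 0.0))
    print(clipped_surrogate(math.log(2.0), 0.0, advantage=1.0))
```

In this sketch the target policy's log-probabilities would come from a periodically updated copy of the learner's network, so the ratio inside the clip changes more slowly than it would against fast-moving worker policies.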


