Ape-X

Introduced by Horgan et al. in Distributed Prioritized Experience Replay

Ape-X is a distributed architecture for deep reinforcement learning. The algorithm decouples acting from learning: the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory; the learner replays samples of experience and updates the neural network. The architecture relies on prioritized experience replay to focus learning on the most significant data generated by the actors.
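The actor/learner split around a shared prioritized replay can be sketched in a single process. This is a minimal illustration, not the paper's implementation: the class name `PrioritizedReplay`, the list-based O(n) sampling, the exponent `alpha=0.6`, and the FIFO eviction are all assumptions chosen for brevity, and a real system would run actors and the learner as separate processes.

```python
import random


class PrioritizedReplay:
    """Toy shared replay with proportional prioritized sampling (hypothetical)."""

    def __init__(self, capacity, alpha=0.6, seed=0):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling (assumed value)
        self.items = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add_batch(self, transitions, priorities):
        """Actors call this with a batch of transitions and initial priorities."""
        for transition, priority in zip(transitions, priorities):
            self.items.append(transition)
            self.priorities.append(float(priority))
        # Assumption: evict the oldest entries once capacity is exceeded.
        overflow = len(self.items) - self.capacity
        if overflow > 0:
            del self.items[:overflow]
            del self.priorities[:overflow]

    def sample(self, batch_size):
        """The learner calls this; index i is drawn with probability ~ p_i ** alpha."""
        weights = [p ** self.alpha for p in self.priorities]
        indices = self.rng.choices(range(len(self.items)), weights=weights, k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update_priorities(self, indices, new_priorities):
        """The learner writes back fresh priorities (e.g. absolute TD errors)."""
        for i, priority in zip(indices, new_priorities):
            self.priorities[i] = float(priority)


# Two "actors" push experience; one "learner" replays it.
replay = PrioritizedReplay(capacity=1000)
replay.add_batch(["actor0-step0", "actor0-step1"], [0.5, 2.0])
replay.add_batch(["actor1-step0", "actor1-step1"], [1.0, 0.1])
indices, batch = replay.sample(batch_size=2)
replay.update_priorities(indices, [0.3] * len(indices))
```

In this toy version, indices returned by `sample` are only valid until the next eviction; a concurrent implementation would need stable keys for priority updates.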

In contrast to Gorila, Ape-X uses a shared, centralized replay memory and, instead of sampling uniformly, prioritizes so that the most useful data is sampled more often. All communication with the centralized replay is batched, which increases efficiency and throughput at the cost of some latency. Because it learns off-policy, Ape-X can also combine data from many distributed actors and give each actor a different exploration policy, broadening the diversity of the experience they jointly encounter.
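Two of these ideas, per-actor exploration and batched communication, can be illustrated with a short sketch. The epsilon schedule follows the form described in the paper for epsilon-greedy actors, epsilon_i = epsilon^(1 + alpha * i / (N - 1)), with epsilon = 0.4 and alpha = 7 used here as assumed defaults. The `LocalBuffer` class, its `flush_size` parameter, and the `send_fn` callback are hypothetical names introduced only for this example.

```python
def actor_epsilons(num_actors, base_eps=0.4, alpha=7.0):
    """Return one exploration rate per actor, from most to least exploratory."""
    if num_actors == 1:
        return [base_eps]
    return [
        base_eps ** (1 + alpha * i / (num_actors - 1))
        for i in range(num_actors)
    ]


class LocalBuffer:
    """Accumulates transitions on the actor and sends them in batches (hypothetical)."""

    def __init__(self, flush_size, send_fn):
        self.flush_size = flush_size
        self.send_fn = send_fn      # stand-in for the call to the central replay
        self.pending = []

    def add(self, transition, priority):
        self.pending.append((transition, priority))
        # Trade latency for throughput: only communicate once enough data is queued.
        if len(self.pending) >= self.flush_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.send_fn(list(self.pending))
            self.pending.clear()


# Each actor gets its own epsilon and its own local buffer.
epsilons = actor_epsilons(num_actors=4)
received = []                       # stands in for the centralized replay
buffers = [LocalBuffer(flush_size=3, send_fn=received.append) for _ in epsilons]
```

A larger `flush_size` means fewer, larger messages to the replay, at the cost of experience reaching the learner later.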

Source: Distributed Prioritized Experience Replay

Tasks

Task	Papers	Share
Reinforcement Learning (RL)	6	31.58%
Atari Games	3	15.79%
Navigate	1	5.26%
Decision Making	1	5.26%
Multi-agent Reinforcement Learning	1	5.26%
OpenAI Gym	1	5.26%
Game of Football	1	5.26%
Real-Time Strategy Games	1	5.26%
Starcraft	1	5.26%

Components

Component	Type
Prioritized Experience Replay	Replay Memory

Categories

Distributed Reinforcement Learning