QDagger

https://arxiv.org/abs/2206.01626
https://docs.cleanrl.dev/rl-algorithms/qdagger/

QDagger is an extension of the DQN algorithm that reuses previously computed results, such as a teacher policy and the teacher's replay buffer, to help train a student policy. Because the student does not have to learn from scratch, this approach improves sample efficiency and reduces the computational cost of training a new policy.
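The paragraph above does not spell out how the teacher guides the student, so the following is only a minimal JAX sketch of one common formulation, based on the approach described in the linked paper: the usual DQN temporal-difference loss plus a distillation term that pulls the student's softmax policy toward the teacher's. The function names `qdagger_loss` and `distill_coefficient`, the `temperature` parameter, and the return-based schedule for the distillation weight are illustrative assumptions, not this project's actual API.

```python
import jax
import jax.numpy as jnp


def qdagger_loss(student_q, teacher_q, actions, td_targets,
                 distill_coeff=1.0, temperature=1.0):
    """Hypothetical sketch: TD loss + distill_coeff * distillation loss.

    student_q, teacher_q: [batch, num_actions] Q-values for the sampled states
    actions:              [batch] integer actions taken in those states
    td_targets:           [batch] precomputed targets, e.g.
                          r + gamma * max_a' Q_target(s', a')
    """
    # Standard DQN term: squared error on the Q-value of the action taken.
    chosen_q = jnp.take_along_axis(student_q, actions[:, None], axis=1).squeeze(-1)
    td_loss = jnp.mean((chosen_q - td_targets) ** 2)

    # Distillation term: cross-entropy between the teacher's and the student's
    # softmax policies over Q-values (assumed form, with a shared temperature).
    teacher_probs = jax.nn.softmax(teacher_q / temperature, axis=-1)
    student_log_probs = jax.nn.log_softmax(student_q / temperature, axis=-1)
    distill_loss = -jnp.mean(jnp.sum(teacher_probs * student_log_probs, axis=-1))

    return td_loss + distill_coeff * distill_loss


def distill_coefficient(student_return, teacher_return):
    """Assumed schedule: shrink the teacher's influence as the student catches up.

    Assumes teacher_return is positive; other cases need separate handling.
    """
    return jnp.maximum(1.0 - student_return / teacher_return, 0.0)
```

In a setup like this, batches would typically be drawn from the teacher's replay buffer first and from the student's own experience later, with `distill_coeff` decaying toward zero so that the student ends up trained by the plain DQN objective.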