hindsight experience replay
Author(s): Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: manipulation, humanoid robotics, reinforcement learning, neural networks
Expert Opinion: HER addresses the issue of sample inefficiency in DRL, especially for those problems with sparse and binary reward functions. It has become one of the most effective algorithms for learning problems with multiple goals which have the potential to solve many challenging manipulation tasks. The idea of "EVERY experience is a good experience for SOME task" is a powerful insight that succinctly reflects how we teach our children to be lifelong learners. We should teach our robots the same way.