learning and generalization of motor skills by learning from demonstration

Author(s): Peter Pastor, Heiko Hoffmann, Tamim Asfour, and Stefan Schaal
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2009
Keywords: planning, learning from demonstration
Expert Opinion: Not the first DMP paper, but the most understandable one, and it fixes some annoying problems with the original formulation. Incredibly simple idea, but that's the nice thing about it -- it is a great starting point for talking about what generalization means in policy learning, and about how a restricted policy representation with the right inductive bias can allow you to learn something meaningful from a single trajectory, as well as learn quickly from practice.

reinforcement learning: an introduction

Author(s): Richard S. Sutton and Andrew G. Barto
Venue: Book
Year Published: 2018
Keywords: mobile robots, reinforcement learning, unsupervised learning, optimal control, genetic algorithms
Expert Opinion: Great introductory textbook to the underpinnings of a lot of the modern approaches in ML/RL for robotics.

maximum entropy inverse reinforcement learning

Author(s): Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell, and Anind K. Dey
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2008
Keywords: probabilistic models, learning from demonstration, reinforcement learning
Expert Opinion: This work provides a novel and useful approach to the problem of inverse reinforcement learning. It is commonly used in practice and has influenced many follow up works in modeling humans in human-robot interaction.
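The core idea can be stated in one formula (sketched here in standard notation, not copied from the paper: $f_\tau$ is the summed feature vector of a trajectory, $\tilde{f}$ the empirical expert feature expectation, $D_s$ the expected state visitation frequency):

```latex
P(\tau \mid \theta) = \frac{\exp\!\left(\theta^\top f_\tau\right)}{Z(\theta)},
\qquad
\nabla_\theta \log \mathcal{L}(\theta) = \tilde{f} - \sum_{s} D_s\, f_s
```

Trajectories are exponentially more likely the higher their cumulative reward under $\theta$, and the gradient reduces to matching expert feature counts against the model's expected visitation-weighted features.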

from skills to symbols: learning symbolic representations for abstract high-level planning

Author(s): George Konidaris, Leslie Pack Kaelbling, Tomas Lozano-Perez
Venue: Journal of Artificial Intelligence Research
Year Published: 2018
Keywords: probabilistic models, planning
Expert Opinion: There exists a representational gap between the continuous sensorimotor world of a robot and the discrete symbols used by advanced AI planning methods. Many existing studies assume the existence of pre-coded planning symbols and investigate how to learn the relations between these symbols and the continuous world of the robot. A few others argue that symbols should be formed from the agent's own sensorimotor experience. This paper presents a structured approach, built on the Markov decision process formalism, to discover abstract symbolic representations from low-level, high-dimensional, continuous sensorimotor experience. The learned symbols and rules can be automatically and effectively expressed in PDDL, a canonical high-level planning domain language, enabling high-level planning with traditional off-the-shelf AI planners.

policy gradient reinforcement learning for fast quadrupedal locomotion

Author(s): Nate Kohl, Peter Stone
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2004
Keywords: reinforcement learning, policy gradients, locomotion, legged robots
Expert Opinion: The paper is one of the first impressive applications of policy gradient algorithms on real robots. The policy gradient algorithm is rather simple, but is able to optimize the gait of the AIBO robot efficiently.

end-to-end training of deep visuomotor policies

Author(s): Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel
Venue: Journal of Machine Learning Research
Year Published: 2016
Keywords: manipulation, probabilistic models, planning, locomotion, learning from demonstration, reinforcement learning, neural networks, visual perception
Expert Opinion: The paper by Sergey Levine, Chelsea Finn et al. shows how the perception and control system of a vision-based manipulation robot can be trained jointly with a trajectory-centric reinforcement learning approach (building on the guided-policy-search framework). In particular, this paper promotes the idea of "end-to-end training" in robot learning, as well as the use of deep neural networks. Both of these aspects have been very influential in the area and inspired many follow-up works. Overall, this paper (or the series of works around it) significantly pushed the state of the art in robot learning and constitutes one of the most powerful results and methods in modern RL.

autonomous helicopter aerobatics through apprenticeship learning

Author(s): Pieter Abbeel, Adam Coates and Andrew Y. Ng
Venue: International Journal of Robotics Research
Year Published: 2010
Keywords: learning from demonstration, optimal control, dynamical systems
Expert Opinion: This paper presents a beautiful and compelling demonstration of the strength of learning dynamical models and using optimal control to master complex tasks on intrinsically unstable systems, even if the learned models are rather crude and the optimal controllers are based on linearization, both strong approximations of reality. Furthermore, it addresses the problem of learning from demonstrations and improving on those demonstrations to beat human performance. To the best of my knowledge, it is one of the first papers demonstrating the combined use of learning from demonstration, model learning, and optimal control to achieve acrobatic tasks.

pilco: a model-based and data-efficient approach to policy search

Author(s): Marc Peter Deisenroth, Carl Edward Rasmussen
Venue: International Conference on Machine Learning
Year Published: 2011
Keywords: state estimation, reinforcement learning, probabilistic models, gaussians, dynamical systems, visual perception, policy gradients
Expert Opinion: Demonstrates how to efficiently learn a task through model-based reinforcement learning, using probabilistic (Gaussian-process) dynamics models to achieve remarkable data efficiency.

hindsight experience replay

Author(s): Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2017
Keywords: manipulation, humanoid robotics, reinforcement learning, neural networks
Expert Opinion: A really nice, simple idea for learning parameterized skills (building on UVFAs) and efficiently dealing with sparse rewards. I think Learning Parameterized Motor Skills on a Humanoid Robot (da Silva et al.) has a much better description of the parameterized skill-learning problem than the HER or UVFA papers, but the HER paper has better practical ideas.
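The relabeling trick at the heart of HER is simple enough to sketch in a few lines. This toy version (all names illustrative, using the "final" goal-selection strategy on a tiny discrete task) replaces each transition's goal with the state actually achieved at the end of the episode, so even a failed rollout yields informative sparse-reward signal:

```python
# Minimal sketch of hindsight relabeling; not the authors' implementation.
def relabel_with_hindsight(episode):
    """episode: list of (state, action, next_state, goal) tuples.
    Returns transitions whose goal is replaced by the final achieved state,
    with the sparse reward recomputed against that hindsight goal."""
    achieved_goal = episode[-1][2]  # state reached at the end of the episode
    relabeled = []
    for state, action, next_state, _original_goal in episode:
        # Sparse reward: 0 when the (hindsight) goal is achieved, -1 otherwise.
        reward = 0.0 if next_state == achieved_goal else -1.0
        relabeled.append((state, action, next_state, achieved_goal, reward))
    return relabeled

# A two-step episode that failed to reach its original goal (3,):
episode = [((0,), 1, (1,), (3,)), ((1,), 1, (2,), (3,))]
extra = relabel_with_hindsight(episode)  # goals rewritten to (2,)
```

The relabeled transitions are added to the replay buffer alongside the originals; any off-policy learner (DQN, DDPG) can then consume them unchanged.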

a survey on policy search for robotics

Author(s): Marc Peter Deisenroth, Gerhard Neumann, Jan Peters
Venue: Book
Year Published: 2013
Keywords: survey, reinforcement learning
Expert Opinion: For learning optimal robot behavior, reinforcement learning is an essential tool. Whereas the standard textbook by Sutton & Barto mainly covers value-function-based methods, this survey covers policy-based methods, which are especially popular in robotics applications.

probabilistic movement primitives

Author(s): Alexandros Paraschos, Christian Daniel, Jan Peters, and Gerhard Neumann
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2013
Keywords: manipulation, probabilistic models, gaussians, planning, learning from demonstration
Expert Opinion: This work proposes a probabilistic movement primitive representation that can be trained through least-squares regression from demonstrations. The most important feature of this model is its ability to represent coupled systems: by exploiting the learned covariance between limbs or other dimensions, whole-body motion can be completed and predicted. The approach also provides a closed-form solution for the optimal feedback controller at each time step, assuming local Gaussian models.
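The model described above can be summarized compactly (standard notation, sketched from the description rather than the paper's exact equations): each trajectory is a weighted combination of basis functions, with a Gaussian over the weights capturing variability and coupling,

```latex
y_t = \Phi_t^\top w + \epsilon_y,
\qquad
w \sim \mathcal{N}(\mu_w, \Sigma_w)
```

where each demonstration's weight vector $w$ is fit by (ridge) least squares on the basis features $\Phi_t$, and $\mu_w$, $\Sigma_w$ are estimated across demonstrations. Conditioning this Gaussian on the observed dimensions or time steps yields the completion and prediction of the remaining ones.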

alvinn: an autonomous land vehicle in a neural network

Author(s): Dean A. Pomerleau
Venue: MITP
Year Published: 1989
Keywords: mobile robots, learning from demonstration, neural networks
Expert Opinion: On the theoretical side, the first paper to recognize covariate shift in imitation learning and to provide a simple data-augmentation-style strategy to mitigate it. On the implementation side, a genuine self-driving milestone that led to "No Hands Across America".

probabilistic robotics

Author(s): Sebastian Thrun, Wolfram Burgard, Dieter Fox
Venue: Book
Year Published: 2005
Keywords: probabilistic models
Expert Opinion: Probabilistic Robotics is a tour de force, replete with material for students and practitioners alike.

dynamical movement primitives: learning attractor models for motor behaviors

Author(s): Auke Jan Ijspeert, Jun Nakanishi, Heiko Hoffmann, Peter Pastor, Stefan Schaal
Venue: Neural Computation (Volume 25, Issue 2)
Year Published: 2013
Keywords: planning, learning from demonstration, dynamical systems, nonlinear systems
Expert Opinion: The right parametrization is often the key in a learning system. Dynamical movement primitives (Ijspeert, Nakanishi, Schaal, 2003) are a very successful way to encode movements in robots. The idea is to use dynamical systems with desired properties, such as stable attractors or rhythmic solutions, as building blocks. This provides a low-dimensional parametrization, and combining the building blocks linearly allows for effective learning. So far they have mainly been used for learning from demonstration.
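As a concrete illustration of the attractor idea described above, here is a minimal one-dimensional discrete DMP rollout in Python (parameter names and values are illustrative, not the authors' implementation): a spring-damper system pulled toward the goal, plus a phase-dependent forcing term that shapes the trajectory on the way there.

```python
# Toy one-dimensional discrete DMP; gains chosen for critical damping.
def rollout_dmp(y0, g, forcing, alpha=25.0, beta=6.25, alpha_x=3.0,
                dt=0.001, steps=1000):
    y, yd, x = y0, 0.0, 1.0  # position, velocity, canonical phase
    for _ in range(steps):
        # Transformation system: spring-damper toward g plus forcing term,
        # scaled by the phase x and the movement amplitude (g - y0).
        ydd = alpha * (beta * (g - y) - yd) + forcing(x) * x * (g - y0)
        yd += ydd * dt
        y += yd * dt
        x += -alpha_x * x * dt  # canonical system: phase decays to zero
    return y

# With zero forcing, the attractor dynamics alone converge to the goal:
final = rollout_dmp(y0=0.0, g=1.0, forcing=lambda x: 0.0)
```

Learning from demonstration then amounts to fitting the forcing term (typically a linear combination of radial basis functions of the phase) so the rollout reproduces a demonstrated trajectory; since the forcing vanishes as the phase decays, convergence to the goal is preserved.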

intrinsic motivation systems for autonomous mental development

Author(s): Pierre-Yves Oudeyer, Frederic Kaplan, and Verena V. Hafner
Venue: IEEE Transactions on Evolutionary Computation (Volume 11, Issue 2)
Year Published: 2007
Keywords: reinforcement learning, evolution, neural networks
Expert Opinion: This article describes some of the first successful experiments about "curious robots" and intrinsic motivation. It is one of the foundational articles in the "developmental robotics" field and inspired hundreds of papers about intrinsic motivation in reinforcement learning.

robotic grasping of novel objects using vision

Author(s): Ashutosh Saxena, Justin Driemeyer, Andrew Y. Ng
Venue: International Journal of Robotics Research
Year Published: 2008
Keywords: neural networks, dynamical systems, visual perception, learning from demonstration, manipulation, planning
Expert Opinion: This paper led a generation of PhD students to reimagine how grasping, and manipulation more generally, could be approached as a machine learning problem. Treating grasp learning as a supervised learning problem without explicit human demonstrations or reinforcement learning, Saxena and colleagues' work stood as an example of how manipulation could be approached from a perceptual angle. A decade before deep learning made a splash in robotics, this work showed how robots could be trained to manipulate previously unseen objects without a need for complete 3D or dynamics models. While the learning techniques and features may have changed, the general formulation still stands as the initial approach many researchers take when implementing a grasp planning algorithm.

apprenticeship learning via inverse reinforcement learning

Author(s): Pieter Abbeel, Andrew Y. Ng
Venue: International Conference on Machine Learning
Year Published: 2004
Keywords: reinforcement learning, learning from demonstration
Expert Opinion: Provided a convincing demonstration of the usefulness of inverse reinforcement learning.

a reduction of imitation learning and structured prediction to no-regret online learning

Author(s): Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell
Venue: 14th International Conference on Artificial Intelligence and Statistics
Year Published: 2011
Keywords: neural networks, learning from demonstration, dynamical systems
Expert Opinion: This paper provides the first formal analysis of the (dynamic) covariate shift problem, where the suboptimal execution behavior of a policy drives the system to different states than those observed during training. While the general problem itself was well known at the time ("Behavioral Cloning: A Correction," Michie 1995; "ALVINN: An Autonomous Land Vehicle in a Neural Network," Pomerleau 1989), a disciplined analysis was lacking in the community. Ross et al. use a regret analysis to analyze and theoretically control the effects of dynamic covariate shift. The theory and algorithmic tools proposed in this work are still an active area of research today.
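The algorithmic core of the paper (DAgger) fits in a short loop, sketched here on a deliberately trivial one-dimensional task (the expert, the tabular "learner", and the dynamics are all illustrative stand-ins): the learner's own policy drives the state distribution, the expert labels the visited states, and the policy is refit on the aggregated dataset each iteration.

```python
from collections import defaultdict

def expert_policy(state):
    return 1 if state < 5 else 0  # toy expert: step right until state 5

def fit_policy(dataset):
    # "Training" is a majority-vote lookup table over visited states.
    votes = defaultdict(list)
    for s, a in dataset:
        votes[s].append(a)
    table = {s: max(set(acts), key=acts.count) for s, acts in votes.items()}
    return lambda s: table.get(s, 0)

def dagger(n_iters=5, horizon=8):
    dataset, policy = [], (lambda s: 0)  # start from an uninformed policy
    for _ in range(n_iters):
        state = 0
        for _ in range(horizon):
            action = policy(state)                 # learner's action drives the state
            dataset.append((state, expert_policy(state)))  # expert labels visited state
            state = min(state + action, 9)
        policy = fit_policy(dataset)               # retrain on aggregated data
    return policy

learned = dagger()
```

The key point the loop makes concrete: unlike behavioral cloning, the states being labeled are the ones the learner actually reaches, so each iteration extends expert supervision one step further along the learner's own state distribution.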

movement imitation with nonlinear dynamical systems in humanoid robots

Author(s): Auke Jan Ijspeert, Jun Nakanishi, Stefan Schaal
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2002
Keywords: probabilistic models, nonlinear systems, dynamical systems, learning from demonstration, humanoid robotics
Expert Opinion: This paper introduced Dynamic Movement Primitives (DMPs), a very prominent representation for robot motion used in many learning approaches. While originally introduced as an imitation learning approach, DMPs have gone on to become central to many reinforcement learning papers. Many modern approaches that end with the word "primitive" are descendants of this work, including Probabilistic Movement Primitives (ProMPs) and Interaction Primitives.