supersizing self-supervision: learning to grasp from 50k tries and 700 robot hours

Author(s): Lerrel Pinto, Abhinav Gupta
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2016
Keywords: manipulation, reinforcement learning, neural networks
Expert Opinion: This paper demonstrated that it's possible to have a robot interact in a self-supervised way with the environment in order to learn useful tasks, like grasping. By running a robot for a long period of time, it's possible to collect enough data to train policies using simple algorithms. This led the way for a lot of follow-up work from Google and others, and is likely an area where we'll see a lot of interest in the future.
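
To make the recipe concrete, here is a minimal sketch of such a self-supervised collection loop. The robot and camera interfaces (execute_grasp, object_in_gripper, capture) and the workspace bounds are hypothetical placeholders, not the paper's actual code; the point is that the success label comes from the robot's own sensing, with no human annotation.

```python
import random

def sample_random_grasp():
    """Random planar grasp: (x, y) in a made-up workspace plus wrist angle."""
    return (random.uniform(0.0, 0.5),
            random.uniform(-0.3, 0.3),
            random.uniform(0.0, 3.14159))

def collect_grasp_dataset(robot, camera, num_tries=50_000):
    """The robot labels its own attempts: success = something in the gripper."""
    dataset = []
    for _ in range(num_tries):
        image = camera.capture()             # image patch of the scene
        x, y, theta = sample_random_grasp()
        robot.execute_grasp(x, y, theta)
        label = robot.object_in_gripper()    # e.g. from gripper force feedback
        dataset.append((image, (x, y, theta), float(label)))
    return dataset                           # train a grasp classifier on this
```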

pilco: a model-based and data-efficient approach to policy search

Author(s): Marc Peter Deisenroth, Carl Edward Rasmussen
Venue: International Conference on Machine Learning
Year Published: 2011
Keywords: state estimation, reinforcement learning, probabilistic models, gaussians, dynamical systems, visual perception, policy gradients
Expert Opinion: Demonstrates a way to efficiently learn a task through model-based reinforcement learning.

movement imitation with nonlinear dynamical systems in humanoid robots

Author(s): Auke Jan Ijspeert, Jun Nakanishi, Stefan Schaal
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2002
Keywords: probabilistic models, nonlinear systems, dynamical systems, learning from demonstration, humanoid robotics
Expert Opinion: In this work, a robust and scalable movement primitive learning approach is proposed. The key insight is the embedding of motion trajectories in a second-order dynamical system. Goal attractors enable generalization to different targets and simplify the learning of the model parameters from rewards. Complex motions can be learned through least-squares regression from demonstrations.
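
To make the representation concrete, here is a minimal one-dimensional sketch, assuming the common transformation system tau*v' = alpha*(beta*(g - x) - v) + f(s) with canonical system tau*s' = -alpha_s*s, and a least-squares fit of the forcing term f to one demonstration. The gains, basis-function settings, and Euler phase integration are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

alpha, beta, alpha_s, tau = 25.0, 25.0 / 4.0, 3.0, 1.0
n_basis = 20
centers = np.exp(-alpha_s * np.linspace(0, 1, n_basis))  # spread in phase space
widths = n_basis ** 1.5 / centers / alpha_s              # a common heuristic

def features(s):
    """Normalized Gaussian basis functions, gated by the phase variable s."""
    psi = np.exp(-widths * (s - centers) ** 2)
    return psi * s / psi.sum()

def fit_forcing_term(demo, dt):
    """Least-squares fit of the forcing term to a demonstrated trajectory."""
    x = np.asarray(demo, dtype=float)
    v = np.gradient(x, dt) * tau           # scaled velocity
    v_dot = np.gradient(v, dt) * tau       # tau * dv/dt
    g, s = x[-1], 1.0                      # goal attractor = final demo state
    Phi, f_target = [], []
    for t in range(len(x)):
        f_target.append(v_dot[t] - alpha * (beta * (g - x[t]) - v[t]))
        Phi.append(features(s))
        s += (-alpha_s * s / tau) * dt     # Euler step of the canonical system
    w, *_ = np.linalg.lstsq(np.array(Phi), np.array(f_target), rcond=None)
    return w
```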

dynamical movement primitives: learning attractor models for motor behaviors

Author(s): Auke Jan Ijspeert, Jun Nakanishi, Heiko Hoffmann, Peter Pastor, Stefan Schaal
Venue: Neural Computation (Volume 25, Issue 2)
Year Published: 2013
Keywords: planning, learning from demonstration, dynamical systems, nonlinear systems
Expert Opinion: DMPs proved to be a very useful representation for robot learning. Written ten years after DMPs were invented, this paper gives their clearest presentation.

intrinsic motivation systems for autonomous mental development

Author(s): Pierre-Yves Oudeyer, Frederic Kaplan, and Verena V. Hafner
Venue: IEEE Transactions on Evolutionary Computation (Volume 11, Issue 2)
Year Published: 2007
Keywords: reinforcement learning, evolution, neural networks
Expert Opinion: This work contributes to the general question of obtaining life-long learning robotic systems. A large body of the existing robot learning literature focuses on methods that enable robots to learn particular pre-defined skills and achieve particular tasks. Life-long learning, on the other hand, requires robots to learn skills and adapt to situations that were not (and cannot be) foreseen. Inspired by human development, intrinsic motivation is an important drive that guides a robot towards regions that can be most effectively and efficiently learned with the capabilities developed so far, exploiting metrics such as novelty, curiosity, and diversity. This paper, in particular, is a seminal study that exploits maximization of learning progress in a real robot that explores its continuous sensorimotor space. It nicely shows that the robot exhibits stage-like development, learning easy tasks first and focusing on more complex problems later, progressively developing more advanced skills.
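
As a toy illustration of the learning-progress drive, here is a sketch in which the agent keeps a window of recent prediction errors per sensorimotor region and prefers the region whose error is dropping fastest. The fixed region set and window size are simplifying assumptions; the paper splits regions adaptively.

```python
import numpy as np
from collections import defaultdict, deque

class LearningProgressExplorer:
    """Prefer the sensorimotor region whose prediction error drops fastest."""

    def __init__(self, n_regions, window=20):
        self.n_regions = n_regions
        self.window = window
        self.errors = defaultdict(lambda: deque(maxlen=2 * window))

    def record(self, region, prediction_error):
        self.errors[region].append(prediction_error)

    def progress(self, region):
        e = list(self.errors[region])
        if len(e) < 2 * self.window:
            return float("inf")        # under-sampled regions stay attractive
        # learning progress = drop in mean error between the two half-windows
        return np.mean(e[:self.window]) - np.mean(e[self.window:])

    def choose_region(self):
        return max(range(self.n_regions), key=self.progress)
```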

probabilistic robotics

Author(s): Sebastian Thrun, Wolfram Burgard, Dieter Fox
Venue: Book
Year Published: 2005
Keywords: probabilistic models
Expert Opinion: Probabilistic Robotics is a tour de force, replete with material for students and practitioners alike.

maximum entropy inverse reinforcement learning

Author(s): Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell, and Anind K. Dey
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2008
Keywords: probabilistic models, learning from demonstration, reinforcement learning
Expert Opinion: This work provides a novel and useful approach to the problem of inverse reinforcement learning. It is commonly used in practice and has influenced many follow up works in modeling humans in human-robot interaction.

from skills to symbols: learning symbolic representations for abstract high-level planning

Author(s): George Konidaris, Leslie Pack Kaelbling, Tomas Lozano-Perez
Venue: Journal of Artificial Intelligence Research
Year Published: 2018
Keywords: probabilistic models, planning
Expert Opinion: As we get better at low-level robotic control, the community will need to start thinking more about longer-horizon problems and how to smoothly flow between reasoning at different levels of abstraction. This paper presents a theoretically grounded formal treatment of the problem, proves some nice results about what constitutes necessary and sufficient symbols for various types of planning, and shows some nice demos on a real robot. It is by far the best analysis of hierarchical learning/planning that I know of and provides a much-needed theoretical foundation for moving this area of research forward.

a reduction of imitation learning and structured prediction to no-regret online learning

Author(s): Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell
Venue: International Conference on Artificial Intelligence and Statistics (AISTATS)
Year Published: 2011
Keywords: neural networks, learning from demonstration, dynamical systems
Expert Opinion: Imitation learning is a very appealing approach to learning robot skills. This paper shows that the straightforward technique of 'behavioral cloning' (simply copying the expert demonstrations) is actually not a good idea in sequential tasks. The reason is an effect of accumulating errors: once the learning agent strays away from states seen in the demonstration, its learned policy is no longer accurate, causing it to stray even further from the demonstration. The beauty of the paper is in capturing this idea mathematically, using a no-regret theoretical framework, and suggesting a simple algorithmic solution to the problem. The method, dubbed Dataset Aggregation (DAgger), asks for additional expert actions *on states visited by the policy*. The idea of controlling the distribution shift between the expert and the learner has since been fundamental to robotic imitation learning, and has manifested in various other methods.
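
Here is a compact sketch of the DAgger loop, assuming a generic gym-style environment and learners exposing act and fit; these interfaces are illustrative rather than the paper's notation, and the paper's beta-mixing schedule is reduced to "roll out the expert on the first iteration".

```python
def dagger(env, expert, policy, n_iters=10, horizon=200):
    """Iteratively aggregate expert labels on states the learner visits."""
    states, actions = [], []
    for it in range(n_iters):
        s = env.reset()
        for _ in range(horizon):
            states.append(s)
            actions.append(expert.act(s))           # expert labels every state
            actor = expert if it == 0 else policy   # that the *learner* visits
            s, _, done, _ = env.step(actor.act(s))
            if done:
                break
        policy.fit(states, actions)   # supervised learning on the aggregate
    return policy
```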

probabilistic movement primitives

Author(s): Alexandros Paraschos, Christian Daniel, Jan Peters, and Gerhard Neumann
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2013
Keywords: manipulation, probabilistic models, gaussians, planning, learning from demonstration
Expert Opinion: I chose this and the follow-up papers using ProMPs because they provide a very nice formulation for representing probabilistic movement primitives. ProMPs have many advantages, and I found them better than classical DMPs in many robotics applications, from gestures to whole-body manipulation.

reinforcement learning: an introduction

Author(s): Richard S. Sutton and Andrew G. Barto
Venue: Book
Year Published: 2018
Keywords: mobile robots, reinforcement learning, unsupervised learning, optimal control, genetic algorithms
Expert Opinion: Somewhat repeating myself from the last suggestion: for learning robot behavior, reinforcement learning is an essential tool. While Sutton & Barto do not focus specifically on the case of robotics, their book is a very accessible text that nevertheless manages to cover many aspects, techniques, and challenges in reinforcement learning.

learning and generalization of motor skills by learning from demonstration

Author(s): Peter Pastor, Heiko Hoffmann, Tamim Asfour, and Stefan Schaal
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2009
Keywords: planning, learning from demonstration
Expert Opinion: DMPs (Dynamic Movement Primitives) are a good representation for learning robot movements from demonstration, as well as for doing reinforcement learning based on demonstrations. This paper explains a variant of the original DMP formulation that makes them stable when generalizing movements to accommodate new goals or obstacles in the robot's path. It then shows how the new DMPs can be used for one-shot learning of tasks such as pick-and-place operations or water serving. More robust than just a trajectory, and less complex than learning with many trials, this is a nice tool to have in your robot learning toolkit.

alvinn: an autonomous land vehicle in a neural network

Author(s): Dean A. Pomerleau
Venue: Advances in Neural Information Processing Systems (NeurIPS)
Year Published: 1989
Keywords: mobile robots, learning from demonstration, neural networks
Expert Opinion: This work was pioneering with respect to machine learning in robotics broadly, learning from demonstration specifically, and also autonomous driving. It applied a neural net to learn steering angles from examples of human driving (even online!), way back in 1989. By today's deep learning standards the net was tiny (5 hidden units) and the sensor input extremely limited (30x32 image pixels), but it worked... and at a time when robots rarely operated outside of the lab or factory, and machine learning was rarely deployed on real hardware. It is the first* example of using demonstration-based learning for high-stakes control that required (comparatively) fast sampling (25Hz) and operated a large van at regular road speeds (20mph). The vehicle was part of NavLab, which was the precursor to CMU's DARPA Grand and Urban Challenges entries in the early 2000s, and those challenges in turn played a big role in accelerating today's driverless car boom. *To my knowledge! ...also, there actually are two papers (and my words above mix the two): [first publication] D. Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network. In Advances in Neural Information Processing Systems, 1989. [real-world driving results] D. Pomerleau. Efficient Training of Artificial Neural Networks for Autonomous Navigation. Neural Computation, 3(1), 88-97, 1991.
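
Just to convey the scale, here is a sketch of an ALVINN-sized network matching the numbers above (a 30x32 input retina and 5 hidden units). The 30-bin steering output, the activations, and the initialization are assumptions for illustration, and the weights are of course untrained.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (30 * 32, 5))  # 30x32 retina -> 5 hidden units
W2 = rng.normal(0.0, 0.1, (5, 30))       # hidden -> 30 steering-direction bins

def steering_bin(image):
    """Map a 30x32 image to the index of the most active steering unit."""
    h = np.tanh(image.reshape(-1) @ W1)   # a size sketch, not a trained model
    return int(np.argmax(h @ W2))
```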

hindsight experience replay

Author(s): Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2017
Keywords: manipulation, humanoid robotics, reinforcement learning, neural networks
Expert Opinion: HER addresses the issue of sample inefficiency in deep reinforcement learning, especially for problems with sparse, binary reward functions. It has become one of the most effective algorithms for learning problems with multiple goals, which have the potential to solve many challenging manipulation tasks. The idea that "EVERY experience is a good experience for SOME task" is a powerful insight that succinctly reflects how we teach our children to be lifelong learners. We should teach our robots the same way.
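
A minimal sketch of hindsight relabeling with the 'final' strategy: a failed episode is stored a second time with the state it actually reached substituted as the goal, turning it into a success for *some* task. The Transition layout and reward_fn signature are assumptions made for illustration.

```python
from collections import namedtuple

Transition = namedtuple("Transition", "state action reward next_state goal")

def relabel_with_hindsight(episode, reward_fn):
    """Return a copy of the episode with the achieved final state as the goal."""
    achieved = episode[-1].next_state   # the goal the agent *did* reach
    return [t._replace(goal=achieved,
                       reward=reward_fn(t.next_state, achieved))
            for t in episode]           # stored alongside the original episode
```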

end-to-end training of deep visuomotor policies

Author(s): Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel
Venue: Journal of Machine Learning Research
Year Published: 2016
Keywords: manipulation, probabilistic models, planning, locomotion, learning from demonstration, reinforcement learning, neural networks, visual perception
Expert Opinion: This work has shown a robot performing a variety of contact-rich manipulation tasks with learned controllers that close the loop around RGB images. This work spawned a flurry of research in reinforcement and representation learning.

apprenticeship learning via inverse reinforcement learning

Author(s): Pieter Abbeel, Andrew Y. Ng
Venue: International Conference on Machine Learning
Year Published: 2004
Keywords: reinforcement learning, learning from demonstration
Expert Opinion: Provided a convincing demonstration of the usefulness of inverse reinforcement learning.

policy gradient reinforcement learning for fast quadrupedal locomotion

Author(s): Nate Kohl, Peter Stone
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2004
Keywords: reinforcement learning, policy gradients, locomotion, legged robots
Expert Opinion: The paper is one of the first impressive applications of policy gradient algorithms on real robots. The policy gradient algorithm is rather simple, but is able to optimize the gait of the AIBO robot efficiently.
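
In the spirit of that simplicity, here is a central-difference sketch of one policy-gradient step on a gait-parameter vector. The paper itself estimates the gradient from batches of randomly perturbed parameter vectors evaluated on the robot, and measure_speed here stands in for timing a real walk on hardware; step sizes are illustrative.

```python
import numpy as np

def finite_difference_step(theta, measure_speed, eps=0.05, lr=0.1):
    """Estimate the gradient of walking speed w.r.t. gait parameters, step uphill."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = eps
        # each call corresponds to timing a short walk on the real robot
        grad[i] = (measure_speed(theta + e) - measure_speed(theta - e)) / (2 * eps)
    return theta + lr * grad
```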

robotic grasping of novel objects using vision

Author(s): Ashutosh Saxena, Justin Driemeyer, Andrew Y. Ng
Venue: International Journal of Robotics Research
Year Published: 2008
Keywords: neural networks, dynamical systems, visual perception, learning from demonstration, manipulation, planning
Expert Opinion: This paper led a generation of PhD students to reimagine how grasping, and manipulation more generally, could be approached as a machine learning problem. By treating grasp learning as a supervised learning problem, without explicit human demonstrations or reinforcement learning, Saxena and colleagues' work stood as an example of how manipulation could be approached from a perceptual angle. A decade before deep learning made a splash in robotics, this work showed how robots could be trained to manipulate previously unseen objects without a need for complete 3D or dynamics models. While the learning techniques and features may have changed, the general formulation still stands as the initial approach many researchers take when implementing a grasp planning algorithm.

a survey on policy search for robotics

Author(s): Marc Peter Deisenroth, Gerhard Neumann, Jan Peters
Venue: Book
Year Published: 2013
Keywords: survey, reinforcement learning
Expert Opinion: A great unifying view on policy search.
