Next

robotic grasping of novel objects using vision

Author(s): Ashutosh Saxena, Justin Driemeyer, Andrew Y. Ng
Venue: International Journal of Robotics Research
Year Published: 2008
Keywords: neural networks, dynamical systems, visual perception, learning from demonstration, manipulation, planning
Expert Opinion: This is one of the first works in literature that utilized machine learning for the robotic manipulation problem. The proposed framework is still useful to design similar robot learning solutions. The particular importance of this work is to identify local features that are related to manipulation planning

a reduction of imitation learning and structured prediction to no-regret online learning

Author(s): Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell
Venue: 14th International Conference on Artificial Intelligence and Statistics
Year Published: 2011
Keywords: neural networks, learning from demonstration, dynamical systems
Expert Opinion: Dagger points to a problem that keeps popping up in everyone's research. Every robot learning person should know about it.

probabilistic robotics

Author(s): Sebastian Thrun, Wolfram Burgard, Dieter Fox
Venue: Book
Year Published: 2005
Keywords: probabilistic models
Expert Opinion: It laid out basis for robotics in uncertain real world.

from skills to symbols: learning symbolic representations for abstract high-level planning

Author(s): George Konidaris, Leslie Pack Kaelbling, Tomas Lozano-Perez
Venue: Journal of Artificial Intelligence Research
Year Published: 2018
Keywords: probabilistic models, planning
Expert Opinion: There exists a representational gap between the continuous sensorimotor world of a robot and the discrete symbols used by advanced AI planning methods. Many existing studies typically assume the existence of precoded planning symbols, and investigate how to learn the relations between these pre-coded symbols and continuous world of the robot. Few others argue that symbols should be formed in relation to the experience of agents through their sensorimotor experience. This paper presents a structured approach, which is built on Markov-decision process formalism, to discover symbolic abstract representations from low-level high-dimensional continuous sensorimotor experience. The learned symbols and rules can automatically and effectively expressed in PDDL, a canonical high-level planning domain language, enabling high-level planning with traditional off-the-shelf AI planners.

intrinsic motivation systems for autonomous mental development

Author(s): Pierre-Yves Oudeyer, Frederic Kaplan, and Verena V. Hafner
Venue: IEEE Transactions on Evolutionary Computation (Volume 11, Issue 2)
Year Published: 2007
Keywords: reinforcement learning, evolution, neural networks
Expert Opinion: This work contributes to the general question of obtaining life-long learning robotic systems. Large body of the existing robot learning literature mostly focus on methods that enable the robots to learn particular pre-defined skills and achieve particular tasks. Life-long learning, on the other hand, requires the robots to learn skills and adapt to situations that were not (and cannot be) foreseen. Inspired from human development, intrinsic motivation is an important drive that guides the robots towards regions that can be most effectively and efficiently learned with the capabilities developed so far; exploiting metrics such as novelty, curiosity, diversity, etc. This paper, in particular, is a seminal study that exploits maximization of learning progress in a real robot that explores its continuous sensorimotor space. It nicely shows that the robot exhibits stage-like development, learning easy tasks first, and focusing to more complex problems later; progressively developing more advanced skills.

learning and generalization of motor skills by learning from demonstration

Author(s): Peter Pastor, Heiko Hoffmann, Tamim Asfour, and Stefan Schaal
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2009
Keywords: planning, learning from demonstration
Expert Opinion: Not the first DMP paper, but the most understandable and with fixes to some annoying problems with the original formulation. Incredibly simple idea, but that's the nice thing about it -- it is a great starting point for talking about what generalization means in policy learning and how a restricted policy representation with the right inductive bias can allow you to learn something meaningful from a single trajectory, as well as learn quickly from practice.

movement imitation with nonlinear dynamical systems in humanoid robots

Author(s): Auke Jan Ijspeert, Jun Nakanishi, Stefan Schaal
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2002
Keywords: probabilistic models, nonlinear systems, dynamical systems, learning from demonstration, humanoid robotics
Expert Opinion: In this work, a robust and scaleable movement primitive learning approach is proposed. The key insight is the embedding of motion trajectories in a 2nd order dynamical system. Goal attractors enable the generalization to different targets and simplify the learning of the model parameters from rewards. Complex motion can be learned through least squares regression from demonstrations.

end-to-end training of deep visuomotor policies

Author(s): Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel
Venue: Journal of Machine Learning Research
Year Published: 2016
Keywords: manipulation, probabilistic models, planning, locomotion, learning from demonstration, reinforcement learning, neural networks, visual perception
Expert Opinion: This work has drawn attention to end-to-end learning with neural networks, which I think was the beginning of the big boom of the deep learning in robotics.

autonomous helicopter aerobatics through apprenticeship learning

Author(s): Pieter Abbeel, Adam Coates and Andrew Y. Ng
Venue: International Journal of Robotics Research
Year Published: 2010
Keywords: learning from demonstration, optimal control, dynamical systems
Expert Opinion: The helicopter stunts achieved in this work are some of the most compelling examples in robotics of both imitation learning and reinforcement learning. (The combination of the two is called apprenticeship learning.) In this work, multiple, imperfect trajectory demonstrations are used to generate ideal trajectories, and then reinforcement learning is used to learn sequences of linear feedback controllers that reproduce those trajectories. When people say things like "but there haven't really been many successes in using reinforcement learning on *real* robots, right?" you can point to this work and say, "sure there are! Have you *seen* these crazy helicopter tricks?"

pilco: a model-based and data-efficient approach to policy search

Author(s): Marc Peter Deisenroth, Carl Edward Rasmussen
Venue: International Conference of Machine Learning
Year Published: 2011
Keywords: state estimation, reinforcement learning, probabilistic models, gaussians, dynamical systems, visual perception, policy gradients
Expert Opinion: The paper by Marc Deisenroth and Carl Rasmussen promotes the use of Gaussian processes (GPs) for model-based reinforcement learning and proposes the PILCO algorithm, one of the most influential algorithms in recent reinforcement learning. GPs are by now heavily used in control and robotics communities. While this paper wasn't the first to use GPs in this context, it's arguably one of the most influential ones. Moreover, this work addresses the problem of data-efficiency in RL, which is of crucial importance for RL in the real world (such as in robotics). The PILCO algorithms has since been used in many different applications and extended in many ways. I consider the PILCO algorithm (or the underlying approach to model-based RL) as one of the state-of-the-art methods in modern RL.

maximum entropy inverse reinforcement learning

Author(s): Brian D. Ziebart, Andrew Maas, J.Andrew Bagnell, and Anind K. Dey
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2008
Keywords: probabilistic models, learning from demonstration, reinforcement learning
Expert Opinion: This is a seminal paper for IRL. It has not only become a standard way to think about IRL, but the observation model for a demonstration given the reward has propagated to many other related areas, like goal inference, human prediction, etc.

dynamical movement primitives: learning attractor models for motor behaviors

Author(s): Auke Jan Ijspeert, Jun Nakanishi, Heiko Hoffmann, Peter Pastor, Stefan Schaal
Venue: Neural Computation (Volume 25, Issue 2)
Year Published: 2013
Keywords: planning, learning from demonstration, dynamical systems, nonlinear systems
Expert Opinion: Foundation for motion planning using iterative learning methods

alvinn: an autonomous land vehicle in a neural network

Author(s): Dean A. Pomerleau
Venue: MITP
Year Published: 1989
Keywords: mobile robots, learning from demonstration, neural networks
Expert Opinion: This work was pioneering with respect to machine learning in robotics broadly, learning from demonstration specifically, and also autonomous driving. It applied a neural net to learn steering angles from examples of human driving (even online!), way back in 1989. By today's deep learning standards the net was tiny (5 hidden units) and the sensor input extremely limited (30x32 image pixels), but it worked... and at a time when robots rarely operated outside of the lab or factory, and machine learning was rarely deployed on real hardware. It is the first* example of using demonstration-based learning for high-stakes control, that required (comparatively) fast sampling (25Hz) and operated a large van at regular road speeds (20mph). The vehicle was a part of NavLab, which was the precursor to CMU's DARPA Grand and Urban Challenges entires in the early 2000's, and those challenges in turn played a big role in accelerating today's driverless car boom. * To my knowledge! ...also, there actually are two papers (and my words above mix the two): [first publication] D. Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network. In Advances in Neural Information Processing Systems, 1989. [real world driving results] D. Pomerleau. Efficient Training of Artificial Neural Networks for Autonomous Navigation. Neural Compuation, 3(1), 88-97, 1991.

a survey on policy search for robotics

Author(s): Marc Peter Deisenroth, Gerhard Neumann, Jan Peters
Venue: Book
Year Published: 2013
Keywords: survey, reinforcement learning
Expert Opinion: A great unifying view on policy search

supersizing self-supervision: learning to grasp from 50k tries and 700 robot hours

Author(s): Lerrel Pinto, Abhinav Gupta
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2015
Keywords: manipulation, reinforcement learning, neural networks
Expert Opinion: Pinto et al., were the first paper to exploit deep learning techniques to process large amounts of data collected by a robot running 24x7 for significantly improving the grasping accuracy without making any object specific assumptions or requiring 3D models of objects. This paper inspired several works in using large scale data to learn intuitive physics, manipulation of deformable objects and also impressive grasping works such as Google's arm farm and DexNet.

reinforcement learning: an introduction

Author(s): Richard S. Sutton and Andrew G. Barto
Venue: Book
Year Published: 2018
Keywords: mobile robots, reinforcement learning, unsupervised learning, optimal control, genetic algorithms
Expert Opinion: Lays the foundation for all RL work, and therefore often referred to as "the book". Should not be forgotten in times of DeepRL!

policy gradient reinforcement learning for fast quadrupedal locomotion

Author(s): Nate Kohl, Peter Stone
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2004
Keywords: reinforcement learning, policy gradients, locomotion, legged robots
Expert Opinion: The work is practical in that it allowed the authors to improve the walking speed of Aibos, something essential to creating top-flight robocup players. The reason I adore this work and frequently cite it in my talks on machine learning is the fantastic way it allowed the robots to learn autonomously. In particular, for the Aibo robots to succeed in robocup, they need to be able to localize on the field based on their perception of provided markers. The authors enabled the robots to measure their own walking speed leveraging this capability. By marching a team of robots back and forth across the width of the pitch, experimenting with and evaluating different gaits each time, the robots were able to find movement patterns that surpassed hand-designed ones. It's a beautiful example of exploiting measurable quantities to drive learning---a key enabling technology for robot learning.

hindsight experience replay

Author(s): Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: manipulation, humanoid robotics, reinforcement learning, neural networks
Expert Opinion: HER addresses the issue of sample inefficiency in DRL, especially for those problems with sparse and binary reward functions. It has become one of the most effective algorithms for learning problems with multiple goals which have the potential to solve many challenging manipulation tasks. The idea of "EVERY experience is a good experience for SOME task" is a powerful insight that succinctly reflects how we teach our children to be lifelong learners. We should teach our robots the same way.

probabilistic movement primitives

Author(s): Alexandros Paraschos, Christian Daniel, Jan Peters, and Gerhard Neumann
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2013
Keywords: manipulation, probabilistic models, gaussians, planning, learning from demonstration
Expert Opinion: This work proposes a probabilistic movement primitive representation that can be trained through least squares regression from demonstrations. The most important feature of this model is its ability to model coupled systems. Thus, through exploiting the learned covariance between limbs or other dimensions whole body motion can be completed and predicted. Also the approach provides a closed form solution of optimal feedback controller in each time step assuming local Gaussian models.

Next