
learning attractor landscapes for learning motor primitives

Author(s): Auke Jan Ijspeert, Jun Nakanishi, and Stefan Schaal
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2003
Keywords: manipulation, planning, learning from demonstration, reinforcement learning, humanoid robotics
Expert Opinion: This is the basis for a large body of work on learning movement primitives. The first paper on the topic was published by the same authors at ICRA 2002.

trust region policy optimization

Author(s): John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
Venue: International Conference on Machine Learning
Year Published: 2015
Keywords: policy gradients, reinforcement learning
Expert Opinion: Schulman et al. introduced an iterative method for optimizing policies that guaranteed monotonic improvement. At the time it was developed, it was significantly more stable (lower variance) than other on-policy and off-policy methods, such as Deep Q-learning, for learning large nonlinear policies. To date, TRPO requires substantially less parameter tuning than other deep reinforcement learning algorithms, and its variants such as PPO are a popular choice. While model-free learning algorithms have not yet met with big successes in robot learning, we might not be far away.
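For reference, the trust-region update behind that guarantee is commonly written as the constrained problem below. The notation is an assumption introduced here, not taken from this list: $\theta$ for the policy parameters, $A$ for the advantage function under the old policy, and $\delta$ for the trust-region radius.

```latex
\max_{\theta} \;
\mathbb{E}_{s,a \sim \pi_{\theta_{\mathrm{old}}}}
\left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\mathrm{old}}}(a \mid s)} \,
A^{\pi_{\theta_{\mathrm{old}}}}(s, a) \right]
\quad \text{subject to} \quad
\mathbb{E}_{s \sim \pi_{\theta_{\mathrm{old}}}}
\left[ D_{\mathrm{KL}}\!\left( \pi_{\theta_{\mathrm{old}}}(\cdot \mid s) \,\middle\|\, \pi_{\theta}(\cdot \mid s) \right) \right] \le \delta
```

Bounding the average KL divergence keeps each new policy close to the one that collected the data, which is what limits destructive policy updates.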

on learning, representing and generalizing a task in a humanoid robot

Author(s): Sylvain Calinon, Florent Guenter and Aude Billard
Venue: IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) (Volume 37, Issue 2)
Year Published: 2007
Keywords: humanoid robotics, gaussians
Expert Opinion: First work to explicitly represent motion variance for motion generation.

optimization-based iterative learning for precise quadrocopter trajectory tracking

Author(s): Angela Schoellig, Raffaello D'Andrea
Venue: Autonomous Robots Journal
Year Published: 2012
Keywords: state estimation, optimal control
Expert Opinion: The authors propose an iterative approach to improve the flight controller of a quadrocopter during repetitive task execution. While at a high level the paper has the same general setting as recent work in policy learning and robotics, it takes a very different approach that is grounded in control theory and state estimation. In my opinion, this paper is one of the best examples of "model-based" learning in robotics from both an algorithmic and a systems perspective. The task studied is dynamic and has non-trivial dynamic disturbances, and the proposed control technique is theoretically justified while being simple enough to analyze. Also, it worked.

simultaneous adversarial multi-robot learning

Author(s): Michael Bowling, Manuela Veloso
Venue: International Joint Conference on Artificial Intelligence
Year Published: 2003
Keywords: learning from demonstration
Expert Opinion: This paper is one of the earliest works demonstrating multiagent reinforcement learning on a real robot. The work was also an early demonstration of improving policies learned in a physics simulation using learning on robots (sim2real). Finally, the video of the robots' learned policies was one of the first robot demo videos to be slowed down on playback rather than sped up.

model-agnostic meta-learning for fast adaptation of deep networks

Author(s): Chelsea Finn, Pieter Abbeel, Sergey Levine
Venue: International Conference on Machine Learning
Year Published: 2017
Keywords: policy gradients, reinforcement learning, neural networks, locomotion
Expert Opinion: Most people probably wouldn't compare MAML with HER (Hindsight Experience Replay) because the algorithms and the problems they address are vastly different: MAML tackles transfer learning while HER tackles sparse reward issues. But from a certain perspective, HER and MAML can make a very complementary pair of parents. HER is like an encouraging parent who, in hindsight, thinks everything the child did is a useful learning experience. MAML, on the other hand, thinks in foresight and wants the child to only learn skills for future job prospects. Does that sound like your parents?

automatic gait optimization with gaussian process regression

Author(s): Daniel Lizotte, Tao Wang, Michael Bowling, Dale Schuurmans
Venue: International Joint Conference on Artificial Intelligence
Year Published: 2007
Keywords: locomotion, legged robots, gaussians
Expert Opinion: This paper is from the line of papers on Aibo gait optimization started by Kohl and Stone in 2004. This paper introduced the idea of using Gaussian process regression for learning so as to avoid local optima, make full use of all historical data, and explicitly model noise in gait evaluation. The authors achieved impressive results for optimizing both speed and smoothness with dramatically fewer gait evaluations than prior approaches.
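To make the idea concrete, here is a minimal sketch of Gaussian process regression over gait evaluations. It is not the authors' implementation: the function names (`rbf_kernel`, `gp_posterior`), the kernel hyperparameters, and the toy data are all assumptions made for illustration. The final selection step uses a simple upper-confidence-bound rule as a stand-in and should not be read as the paper's selection criterion.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3, signal_var=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=0.01):
    """Posterior mean and variance of a zero-mean GP at x_query.

    noise_var models the noise in each gait evaluation explicitly.
    """
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Toy data, made up for illustration: one gait parameter and noisy walk speeds.
gait_params = np.array([[0.1], [0.4], [0.7], [0.9]])
speeds = np.array([0.20, 0.55, 0.60, 0.30])

# All past evaluations inform the prediction at every candidate gait.
candidates = np.linspace(0.0, 1.0, 21)[:, None]
mean, var = gp_posterior(gait_params, speeds, candidates)

# Assumed selection rule (upper confidence bound), not the paper's criterion.
next_gait = candidates[np.argmax(mean + np.sqrt(var))]
```

Because the posterior is computed from every past evaluation and carries its own uncertainty estimate, the next gait to try can be chosen globally rather than by following a local gradient.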

applied nonlinear control

Author(s): Jean-Jacques E Slotine, Weiping Li
Venue: Book
Year Published: 1991
Keywords: nonlinear systems, optimal control
Expert Opinion: It laid the basis for adaptive nonlinear control commonly used in robotic control.

efficient reinforcement learning with relocatable action models

Author(s): Bethany R. Leffler, Michael L. Littman, Timothy Edmunds
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2007
Keywords: reinforcement learning, learning from demonstration, dynamical systems
Expert Opinion: This paper from 2007 was the culmination of a thread of reinforcement learning research on relocatable action models. The premise is that states can be clustered into types such that actions taken in states of the same type have similar effects. This paper was the first to study implementation of such an idea on a real robot, and shows impressive results. It's a great example of the creative robot demonstrations from Michael Littman's lab at a time when most learning was limited to simulation.
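The premise can be illustrated with a small data structure that pools action outcomes by state type rather than by individual state. This is a simplified sketch, not the paper's algorithm: the class name `RelocatableActionModel`, the `terrain` type function, and the use of the single most frequent displacement as the prediction are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

class RelocatableActionModel:
    """Shares action-outcome statistics across all states of the same type."""

    def __init__(self, state_type):
        # state_type: hypothetical function mapping a state to its type label.
        self.state_type = state_type
        # (type, action) -> counts of observed relative displacements.
        self.outcomes = defaultdict(Counter)

    def update(self, state, action, next_state):
        """Record the displacement caused by an action, keyed by state type."""
        delta = tuple(n - s for s, n in zip(state, next_state))
        self.outcomes[(self.state_type(state), action)][delta] += 1

    def predict(self, state, action):
        """Predict the next state from the most frequent outcome for this type."""
        counts = self.outcomes.get((self.state_type(state), action))
        if not counts:
            return None  # nothing observed yet for this type and action
        delta, _ = counts.most_common(1)[0]
        return tuple(s + d for s, d in zip(state, delta))

def terrain(state):
    """Hypothetical typing: grid cells with x < 5 are carpet, the rest are ice."""
    return "carpet" if state[0] < 5 else "ice"

model = RelocatableActionModel(terrain)
model.update((0, 0), "forward", (1, 0))  # on carpet the robot moves one cell
model.update((6, 0), "forward", (8, 0))  # on ice it slides two cells

# State (3, 2) was never visited, but it shares a type with (0, 0).
prediction = model.predict((3, 2), "forward")
```

The point of the sketch is the sample-efficiency argument: experience gathered in one state immediately transfers to every other state of the same type, so far fewer real-robot trials are needed.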

human-level control through deep reinforcement learning

Author(s): Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
Venue: Nature
Year Published: 2015
Keywords: neural networks, reinforcement learning
Expert Opinion: Although not strictly a robotics paper, this one made a breakthrough in end-to-end deep reinforcement learning.

belief space planning assuming maximum likelihood observations

Author(s): Robert Platt Jr., Russ Tedrake, Leslie Kaelbling, Tomas Lozano-Perez
Venue: Robotics: Science and Systems VI
Year Published: 2010
Keywords: manipulation, dynamical systems, planning, gaussians
Expert Opinion: This isn't a learning paper, but a planning paper. Nonetheless, I feel the need to include it, as it strongly influenced my thinking about manipulation learning. It was the first work that I had seen make POMDPs work for real robotics problems. It teaches students that reasoning about uncertainty is important, but that you need to make the right assumptions in order to make it work for any reasonably sized problem. Whereas PILCO aims to reduce model uncertainty, this work assumes a correct model and leverages information-gathering actions to reduce state uncertainty. Thus, these papers are complementary when discussing uncertainty.

a survey of iterative learning control

Author(s): D.A. Bristow, M. Tharayil, A.G. Alleyne
Venue: IEEE Control Systems Magazine (Volume 26, Issue 3)
Year Published: 2006
Keywords: learning from demonstration, survey, nonlinear systems, gaussians
Expert Opinion: The paper gives the reader a broad perspective on the important ideas, potential, and limitations of iterative learning control (ILC). Besides the design techniques, it discusses problems in stability, performance, learning transient behavior, and robustness.
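As a rough illustration of the basic ILC idea, the sketch below repeats the same tracking task and corrects the feedforward input with the error from the previous trial. The plant `x[t+1] = a*x[t] + b*u[t]`, its parameters, the learning gain, and the function names are assumptions chosen for this example; they are not taken from the survey. With these values the learning gain satisfies `|1 - gain*b| < 1`, and the error shrinks from trial to trial.

```python
import numpy as np

def simulate(u, a=0.3, b=0.5):
    """Roll out x[t+1] = a*x[t] + b*u[t] from x[0] = 0 and return x[1..N]."""
    x = 0.0
    y = np.empty_like(u)
    for t, u_t in enumerate(u):
        x = a * x + b * u_t
        y[t] = x
    return y

def run_ilc(reference, iterations=30, gain=1.0):
    """P-type ILC: u_{k+1}[t] = u_k[t] + gain * e_k[t], where e_k[t] is the
    tracking error of the output that u_k[t] directly affects."""
    u = np.zeros_like(reference)
    max_errors = []
    for _ in range(iterations):
        error = reference - simulate(u)          # error from this trial
        max_errors.append(float(np.max(np.abs(error))))
        u = u + gain * error                     # correct the next trial's input
    return u, max_errors

# Reference trajectory for the repeated task (illustrative).
reference = np.sin(np.linspace(0.0, 2.0 * np.pi, 50))
u_final, max_errors = run_ilc(reference)
```

Note that the update uses no plant model beyond the choice of gain. A poorly chosen gain can cause the error to grow during early trials, which is the learning-transient issue mentioned above.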

aibo ingenuity

Author(s): Michael Littman, Tom Walsh, Ali Nouri, Bethany Leffler, Timothy Edmunds, Jill Littman, Cameron Detulleo, Christopher Metcalf, Jonathan Metcalf, Charles Isbell, Joni Isbell
Venue: YouTube
Year Published: 2007
Keywords: learning from demonstration
Expert Opinion: This brief video from 2007 summarizes some very creative demonstrations of learning on Aibo robots in an environment that requires exploration. I love this work because they created a sort of "playpen" for the robot that had interesting features at just the right level of difficulty to allow the robot to learn interesting behaviors, in the same way that playpen environments are designed for human infants. It's a great example of the creative robot demonstrations from Michael Littman's lab at a time when most learning was limited to simulation. Though not associated with any publication, some related research, also including robot experiments, appears in

everyday robotic action: lessons from human action control

Author(s): Roy de Kleijn, George Kachergis, Bernhard Hommel
Venue: Frontiers in NeuroRobotics
Year Published: 2014
Keywords: planning, manipulation
Expert Opinion: Roboticists are not the only researchers working on motion representation and generation. Researchers in human motor control approach the problem from a different angle, and their work has often served as an inspiration for me. This paper provides a very nice, easy-to-understand overview of some topics in the field of human action control, with many interesting citations to follow up.

robot learning from demonstration

Author(s): Atkeson and Schaal
Venue: International Conference on Machine Learning
Year Published: 1997
Keywords: learning from demonstration, state estimation
Expert Opinion: An early work with a stunning result: a real robot learning to swing up a pendulum from very few demonstrations and a little bit of trial and error.

coordination of multiple behaviors acquired by vision-based reinforcement learning

Author(s): Minoru Asada, Eiji Uchibe, Shoichi Noda, Sukoya Tawaratsumida, Koh Hosoda
Venue: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Year Published: 1994
Keywords: reinforcement learning, dynamical systems
Expert Opinion: This paper is representative of a body of work from Minoru Asada's lab in the mid-1990s that included some of the first examples of machine learning in the robot soccer domain, some of the first reinforcement learning on real robots, some of the first RL from visual inputs, and some of the first multirobot learning, all wrapped into one. They introduced several methods for aggressively reducing the sample complexity needed for learning on real robots, such as "learning from easy missions."

stanley: the robot that won the darpa grand challenge

Author(s): Sebastian Thrun, Mike Montemerlo, Hendrik Dahlkamp, David Stavens, Andrei Aron, James Diebel, Philip Fong, John Gale, Morgan Halpenny, Gabriel Hoffmann, Kenny Lau, Celia Oakley, Mark Palatucci, Vaughan Pratt, and Pascal Stang
Venue: Journal of Robotic Systems
Year Published: 2006
Keywords: gaussians, state estimation
Expert Opinion: There would not be this much focus on robotics and learning if not for self-driving cars. Self-driving cars would not be a thing without Stanley.

relative entropy policy search

Author(s): Jan Peters, Katharina Mülling, Yasemin Altün
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2010
Keywords: policy gradients, reinforcement learning, probabilistic models
Expert Opinion: This work proposes an information-theoretic, gradient-based policy learning algorithm with adaptive step sizes. These adaptive step sizes, or learning rates, are essential for real robot implementations, where large jumps in policy updates might damage a real system.

robot programming by demonstration

Author(s): Aude Billard, Sylvain Calinon, Ruediger Dillmann, Stefan Schaal
Venue: Book
Year Published: 2008
Keywords: humanoid robotics, learning from demonstration, dynamical systems
Expert Opinion: Provides a clear presentation of robot learning from demonstration from the authors who made the approach popular.

autonomous helicopter control using reinforcement learning policy search methods

Author(s): J. Andrew Bagnell, Jeff G. Schneider
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2001
Keywords: learning from demonstration, reinforcement learning, dynamic programming
Expert Opinion: One of the first real demonstrations of RL on an actual robot performing a complex control problem.
