
belief space planning assuming maximum likelihood observations

Author(s): Robert Platt Jr., Russ Tedrake, Leslie Kaelbling, Tomas Lozano-Perez
Venue: Robotics: Science and Systems VI
Year Published: 2010
Keywords: manipulation, dynamical systems, planning, gaussians
Expert Opinion: This isn't a learning paper, but a planning paper. Nonetheless, I feel the need to include it, as it strongly influenced my thinking about manipulation learning. It was the first work that I had seen make POMDPs work for real robotics problems. It teaches students that reasoning about uncertainty is important, but that you need to make the right assumptions in order to make it work for any reasonably sized problem. Whereas PILCO aims to reduce model uncertainty, this work assumes a correct model and leverages information-gathering actions to reduce state uncertainty. Thus, these papers are complementary when discussing uncertainty.
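The key assumption can be written in one line: rather than branching on every possible observation, the planner propagates the belief through a Bayes filter using only the single most likely observation, which makes the belief dynamics deterministic. A sketch of the idea (notation mine, not taken from the paper), with belief $b_t$, control $u_t$, and generic filter $F$:

```latex
\hat{z}_{t+1} = \operatorname*{arg\,max}_{z}\; p\bigl(z \mid b_t, u_t\bigr),
\qquad
b_{t+1} = F\bigl(b_t, u_t, \hat{z}_{t+1}\bigr)
```

With deterministic belief dynamics, planning in belief space reduces to trajectory optimization, which is what makes the approach tractable for realistically sized problems.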

everyday robotic action: lessons from human action control

Author(s): Roy de Kleijn, George Kachergis, Bernhard Hommel
Venue: Frontiers in NeuroRobotics
Year Published: 2014
Keywords: planning, manipulation
Expert Opinion: Roboticists are not the only researchers working on motion representation and generation. Researchers on human motor control approach the problem from a different angle, and their work has often served as an inspiration for me. This paper provides a very nice, easy to understand overview of some topics in the field of human action control, with many interesting citations to follow up.

human-level control through deep reinforcement learning

Author(s): Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
Venue: Nature
Year Published: 2015
Keywords: neural networks, reinforcement learning
Expert Opinion: Because it demonstrates that with modern approaches to deep reinforcement learning one can achieve human-level performance even in end-to-end settings.

learning attractor landscapes for learning motor primitives

Author(s): Auke Jan Ijspeert, Jun Nakanishi, and Stefan Schaal
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2003
Keywords: manipulation, planning, learning from demonstration, reinforcement learning, humanoid robotics
Expert Opinion: This is the basis for a large body of work on learning movement primitives. The first paper on the topic was published by the same authors at ICRA 2002.
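For readers new to this line of work, one common form of the resulting dynamic movement primitive is a second-order attractor system modulated by a learned forcing term (notation varies across papers; this is a sketch, not necessarily the exact formulation used here):

```latex
\tau \dot{z} = \alpha_z\bigl(\beta_z (g - y) - z\bigr) + f(s),
\qquad
\tau \dot{y} = z,
\qquad
\tau \dot{s} = -\alpha_s s
```

Here $f(s)$ is a learned weighted sum of basis functions of the phase $s$; since $f \to 0$ as $s \to 0$, the system always reverts to a stable attractor at the goal $g$, which is what makes the representation safe to learn and generalize.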

stanley: the robot that won the darpa grand challenge

Author(s): Sebastian Thrun, Mike Montemerlo, Hendrik Dahlkamp, David Stavens, Andrei Aron, James Diebel, Philip Fong, John Gale, Morgan Halpenny, Gabriel Hoffmann, Kenny Lau, Celia Oakley, Mark Palatucci, Vaughan Pratt, and Pascal Stang
Venue: Journal of Robotic Systems
Year Published: 2006
Keywords: gaussians, state estimation
Expert Opinion: There would not be this much focus on robotics and learning if not for self-driving cars. Self-driving cars would not be a thing without Stanley.

model-agnostic meta-learning for fast adaptation of deep networks

Author(s): Chelsea Finn, Pieter Abbeel, Sergey Levine
Venue: International Conference on Machine Learning
Year Published: 2017
Keywords: policy gradients, reinforcement learning, neural networks, locomotion
Expert Opinion: Model-Agnostic Meta-Learning (MAML) has been a paradigm shift in robotics and has been used in a number of applications.
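The core of MAML fits in two lines: adapt to each task with a few gradient steps from shared parameters $\theta$, then meta-optimize $\theta$ so that those adapted parameters perform well (one inner step shown):

```latex
\theta_i' = \theta - \alpha \,\nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)
\quad\text{(inner, per-task adaptation)}
\qquad
\min_\theta \sum_{\mathcal{T}_i} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})
\quad\text{(outer, meta-objective)}
```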

applied nonlinear control

Author(s): Jean-Jacques E Slotine, Weiping Li
Venue: Book
Year Published: 2001
Keywords: nonlinear systems, optimal control
Expert Opinion: It laid the basis for adaptive nonlinear control commonly used in robotic control.

robot programming by demonstration

Author(s): Aude Billard, Sylvain Calinon, Ruediger Dillmann, Stefan Schaal
Venue: Book
Year Published: 2008
Keywords: humanoid robotics, learning from demonstration, dynamical systems
Expert Opinion: Provides a clear presentation of robot learning from demonstration from the authors who made the approach popular.

aibo ingenuity

Author(s): Michael Littman, Tom Walsh, Ali Nouri, Bethany Leffler, Timothy Edmunds, Jill Littman, Cameron Detulleo, Christopher Metcalf, Jonathan Metcalf, Charles Isbell, Joni Isbell
Venue: YouTube
Year Published: 2007
Keywords: learning from demonstration
Expert Opinion: This brief video from 2007 summarizes some very creative demonstrations of learning on Aibo robots in an environment that requires exploration. I love this work because they created a sort of "playpen" for the robot that had interesting features at just the right level of difficulty to allow the robot to learn interesting behaviors - in the same way that playpen environments are designed for human infants. It's a great example of the creative robot demonstrations from Michael Littman's lab at a time when most learning was limited to simulation. Though not associated with any publication, some related research, also including robot experiments, appears in https://link.springer.com/content/pdf/10.1007/s10994-010-5202-y.pdf

a survey of iterative learning control

Author(s): D.A. Bristow, M. Tharayil, A.G. Alleyne
Venue: IEEE Control Systems Magazine (Volume 26, Issue 3)
Year Published: 2006
Keywords: learning from demonstration, survey, nonlinear systems, gaussians
Expert Opinion: The paper provides the reader with a broad perspective on the important ideas, potential, and limitations of iterative learning control (ILC). Besides design techniques, it discusses problems in stability, performance, learning transient behavior, and robustness.
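As a toy illustration of the basic idea (my own sketch, not an example from the survey): a P-type ILC update u_{j+1}(t) = u_j(t) + gamma * e_j(t+1) feeds the tracking error from one trial back into the input for the next, driving the error on a repetitive task toward zero. The plant and gains below are assumed for illustration.

```python
import numpy as np

# Assumed first-order plant: x[t+1] = a*x[t] + b*u[t], output y[t] = x[t]
a, b = 0.5, 1.0
T = 20
r = np.sin(np.linspace(0.0, np.pi, T + 1))[1:]   # reference trajectory for t = 1..T

def rollout(u):
    """Run one trial from x[0] = 0 and return the output trajectory y[1..T]."""
    x, y = 0.0, np.empty(T)
    for t in range(T):
        x = a * x + b * u[t]
        y[t] = x
    return y

u = np.zeros(T)
gamma = 1.0 / b      # learning gain; 1/(C*B) gives monotone convergence for this plant
errs = []
for _ in range(25):
    e = r - rollout(u)            # tracking error recorded on this trial
    errs.append(np.abs(e).max())
    u = u + gamma * e             # P-type update: shift trial error onto the input
final_err = np.abs(r - rollout(u)).max()
```

Note the contrast with generic reinforcement learning: ILC exploits the repetitive structure of the task, so the error contracts trial over trial instead of requiring broad exploration.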

automatic gait optimization with gaussian process regression

Author(s): Daniel Lizotte, Tao Wang, Michael Bowling, Dale Schuurmans
Venue: International Joint Conference on Artificial Intelligence
Year Published: 2007
Keywords: locomotion, legged robots, gaussians
Expert Opinion: This paper is from the line of papers on Aibo gait optimization started by Kohl and Stone in 2004. It introduced the idea of using Gaussian process regression for learning so as to avoid local optima, make full use of all historical data, and explicitly model noise in gait evaluation. The authors achieved impressive results for optimizing both speed and smoothness with dramatically fewer gait evaluations than prior approaches.
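A minimal sketch of the idea on a toy 1-D "gait parameter" (all function names, kernels, and numbers here are assumed for illustration, not taken from the paper): fit a Gaussian process to noisy gait evaluations, then pick the next parameter by an upper confidence bound, which uses all historical data and trades off exploiting the current best region against exploring uncertain ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gait-quality landscape: a global optimum near 0.7
# and a weaker local optimum near 0.2, observed with evaluation noise.
def gait_speed(x, noise=0.01):
    f = np.exp(-(x - 0.7) ** 2 / 0.02) + 0.5 * np.exp(-(x - 0.2) ** 2 / 0.005)
    return f + noise * rng.standard_normal(np.shape(x))

def rbf(p, q, ell=0.15, var=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    return var * np.exp(-0.5 * ((p[:, None] - q[None, :]) / ell) ** 2)

def gp_posterior(xs, ys, xq, noise_var=1e-4):
    """Exact GP regression posterior mean and variance at query points xq."""
    K = rbf(xs, xs) + noise_var * np.eye(len(xs))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, ys))
    Ks = rbf(xs, xq)
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf(xq, xq).diagonal() - (v ** 2).sum(axis=0)
    return mean, np.maximum(var, 0.0)

grid = np.linspace(0.0, 1.0, 201)
xs = np.array([0.1, 0.5, 0.9])          # initial gait evaluations
ys = gait_speed(xs)
for _ in range(15):
    mean, var = gp_posterior(xs, ys, grid)
    x_next = grid[np.argmax(mean + 2.0 * np.sqrt(var))]   # UCB acquisition
    xs = np.append(xs, x_next)
    ys = np.append(ys, gait_speed(x_next))
```

Because the acquisition is optimistic where the model is uncertain, the search escapes the weaker local optimum without needing the many rollouts a local gradient method would.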

simultaneous adversarial multi-robot learning

Author(s): Michael Bowling, Manuela Veloso
Venue: International Joint Conference on Artificial Intelligence
Year Published: 2003
Keywords: learning from demonstration
Expert Opinion: This paper is one of the earliest works demonstrating multiagent reinforcement learning on a real robot. The work was also an early demonstration of improving policies learned in a physics simulation using learning on robots (sim2real). Finally, the video of the robots' learned policies was one of the first robot demo videos to be slowed down on playback rather than sped up.

robot learning from demonstration

Author(s): Christopher G. Atkeson, Stefan Schaal
Venue: International Conference on Machine Learning
Year Published: 1997
Keywords: learning from demonstration, state estimation
Expert Opinion: Introduces some of the key ideas in learning for robotics.

optimization-based iterative learning for precise quadrocopter trajectory tracking

Author(s): Angela Schoellig, Raffaello D'Andrea
Venue: Autonomous Robots Journal
Year Published: 2012
Keywords: state estimation, optimal control
Expert Opinion: The authors propose an iterative approach to improve the flight controller in a quadcopter during repetitive task execution. While at a high level the paper has the same general setting as recent work in policy learning and robotics, it takes a very different approach that is grounded in control theory and state estimation. In my opinion, this paper is one of the best examples of "model-based" learning in robotics from both an algorithmic and a systems perspective. The task studied is dynamic, has non-trivial dynamic disturbances, and the proposed control technique is theoretically justified while being simple enough to analyze, and it worked.

on learning, representing and generalizing a task in a humanoid robot

Author(s): Sylvain Calinon, Florent Guenter and Aude Billard
Venue: IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) (Volume 37, Issue 2)
Year Published: 2007
Keywords: humanoid robotics, gaussians
Expert Opinion: First work to explicitly represent motion variance for motion generation.

efficient reinforcement learning with relocatable action models

Author(s): Bethany R. Leffler, Michael L. Littman, Timothy Edmunds
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2007
Keywords: reinforcement learning, learning from demonstration, dynamical systems
Expert Opinion: This paper from 2007 was the culmination of a thread of reinforcement learning research on relocatable action models. The premise is that states can be clustered into types such that actions taken in states of the same type have similar effects. This paper was the first to study implementation of such an idea on a real robot, and shows impressive results. It's a great example of the creative robot demonstrations from Michael Littman's lab at a time when most learning was limited to simulation.
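The premise is easy to make concrete (a hypothetical sketch, not the authors' implementation): learn the distribution of relative outcomes per (state type, action) from experience, then reuse that model at every state of the same type, which slashes the number of samples needed compared to learning each state separately.

```python
from collections import defaultdict

# Outcome counts keyed by (state type, action); states of the same type are
# assumed to respond to an action with the same distribution of displacements.
counts = defaultdict(lambda: defaultdict(int))

def observe(state_type, action, displacement):
    """Record one observed relative outcome of taking `action` on this surface type."""
    counts[(state_type, action)][displacement] += 1

def predicted_effect(state_type, action):
    """Most likely relative outcome, relocatable to any state of this type."""
    outcomes = counts[(state_type, action)]
    return max(outcomes, key=outcomes.get)

# Toy experience: same action, different surface types, different effects.
observe("carpet", "forward", (0, 1))
observe("carpet", "forward", (0, 1))
observe("carpet", "forward", (0, 0))   # occasional slip
observe("ice", "forward", (0, 2))      # same action slides farther on ice
```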

coordination of multiple behaviors acquired by vision-based reinforcement learning

Author(s): Minoru Asada, Eiji Uchibe, Shoichi Noda, Sukoya Tawaratsumida, Koh Hosoda
Venue: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Year Published: 1994
Keywords: reinforcement learning, dynamical systems
Expert Opinion: This paper is representative of a body of work from Minoru Asada's lab in the mid-1990s that was some of the first examples of machine learning in the robot soccer domain, some of the first reinforcement learning on real robots, some of the first RL from visual inputs, and some of the first multirobot learning, all wrapped into one. They introduced several methods for aggressively reducing the sample complexity needed for learning on real robots, such as "learning from easy missions."

autonomous helicopter control using reinforcement learning policy search methods

Author(s): J. Andrew Bagnell, Jeff G. Schneider
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2001
Keywords: learning from demonstration, reinforcement learning, dynamic programming
Expert Opinion: I think this is one of the first works to show the potential of reinforcement learning in realistic robotics problems. This paper, together with "Inverted autonomous helicopter flight via reinforcement learning" by Andrew Y. Ng et al., is useful for helping students see the importance and impact of applying these concepts and methods in real-world settings.

trust region policy optimization

Author(s): John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
Venue: International Conference on Machine Learning
Year Published: 2015
Keywords: policy gradients, reinforcement learning
Expert Opinion: Schulman et al. introduced an iterative method for optimizing policies with guaranteed monotonic improvement. At the time it was developed, it was significantly more stable (lower variance) than other on- and off-policy methods, such as deep Q-learning, for learning large non-linear policies. To date, TRPO requires substantially less parameter tuning than other deep reinforcement learning algorithms, and its variants, such as PPO, remain a popular choice. While model-free learning algorithms have not yet met with big successes in robot learning, we might not be far away.
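Concretely, each TRPO iteration solves a KL-constrained surrogate problem (as in the paper, up to notation): maximize the expected advantage under the new policy while keeping it close to the old one.

```latex
\max_\theta \;
\mathbb{E}_{s,a \sim \pi_{\theta_{\mathrm{old}}}}
\left[ \frac{\pi_\theta(a \mid s)}{\pi_{\theta_{\mathrm{old}}}(a \mid s)}
       \, A^{\pi_{\theta_{\mathrm{old}}}}(s,a) \right]
\quad \text{s.t.} \quad
\mathbb{E}_{s}\!\left[ D_{\mathrm{KL}}\bigl(\pi_{\theta_{\mathrm{old}}}(\cdot \mid s)
  \,\|\, \pi_\theta(\cdot \mid s)\bigr) \right] \le \delta
```

The trust-region constraint $\delta$ is what yields the stability and low parameter-tuning burden mentioned above.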

relative entropy policy search

Author(s): Jan Peters, Katharina Mülling, Yasemin Altün
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2010
Keywords: policy gradients, reinforcement learning, probabilistic models
Expert Opinion: This work introduced for the first time an analytical KL bound to policy search algorithms in order to make them more stable and efficient. This work started a whole branch of new policy search algorithms and also many deep RL algorithms are inspired by this idea.
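In its simplest (bandit-style) form, the bound turns policy search into a constrained optimization with a closed-form solution (a sketch; the full formulation in the paper adds feature-matching constraints for the MDP case):

```latex
\max_{p} \; \mathbb{E}_{p}[R]
\quad \text{s.t.} \quad
D_{\mathrm{KL}}(p \,\|\, q) \le \epsilon
\qquad\Longrightarrow\qquad
p(a) \propto q(a)\, \exp\!\bigl(R(a)/\eta\bigr)
```

Here $q$ is the old policy and the temperature $\eta$ comes from the dual problem; bounding the KL step is exactly the ingredient later deep RL algorithms inherited.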
