Found 18 results.
learning control in robotics

Author(s): Stefan Schaal, Christopher G. Atkeson
Venue: IEEE Robotics & Automation Magazine
Year Published: 2010
Keywords: survey, reinforcement learning, policy gradients, optimal control, trajectory optimization
Expert Opinion: This review from Schaal and Atkeson does an excellent job of concisely covering the many approaches to learning control in robotics. It is useful not only as an overview of this subtype of robot learning, but also as a jumping-off point for further research, as the works cited are extensive. This paper is also of note because it considers the problem of robot learning from a control perspective, rather than the more common computer science or statistical perspectives. The authors also discuss practical aspects of learning control, such as the robustness of learned control policies to unexpected perturbations.

biped dynamic walking using reinforcement learning

Author(s): Hamid Benbrahim
Venue: University of New Hampshire
Year Published: 1997
Keywords: policy gradients, neural networks, reinforcement learning, dynamical systems, legged robots
Expert Opinion: A largely overlooked work that employs policy gradients with multiple nested CMAC neural networks. One could credit Benbrahim with having done much of what OpenAI later became famous for, twenty years earlier.

policy gradient reinforcement learning for fast quadrupedal locomotion

Author(s): Nate Kohl, Peter Stone
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2004
Keywords: reinforcement learning, policy gradients, locomotion, legged robots
Expert Opinion: The paper is one of the first impressive applications of policy gradient algorithms on real robots. The policy gradient algorithm is rather simple, but is able to optimize the gait of the AIBO robot efficiently.
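
The flavor of this kind of gait tuning can be sketched in a few lines: evaluate randomly perturbed parameter vectors, then nudge each gait parameter toward the perturbation sign that scored better. This is a hedged sketch in the spirit of Kohl and Stone's method, not their exact implementation; the `score` function and all constants are illustrative.

```python
import random

def estimate_gait_update(score, theta, eps=0.05, step_size=0.05, n_policies=15):
    """One iteration of a finite-difference policy gradient: evaluate
    randomly perturbed parameter vectors, then move each parameter toward
    the better-scoring perturbation direction."""
    deltas = [[random.choice((-eps, 0.0, eps)) for _ in theta]
              for _ in range(n_policies)]
    scores = [score([t + d for t, d in zip(theta, delta)]) for delta in deltas]
    new_theta = list(theta)
    for i in range(len(theta)):
        # Group trial scores by how parameter i was perturbed.
        groups = {-eps: [], 0.0: [], eps: []}
        for delta, s in zip(deltas, scores):
            groups[delta[i]].append(s)
        avg = {k: sum(v) / len(v) if v else float("-inf")
               for k, v in groups.items()}
        # Step only if some perturbed group beat the unperturbed one.
        if avg[eps] > avg[0.0] or avg[-eps] > avg[0.0]:
            new_theta[i] += step_size if avg[eps] > avg[-eps] else -step_size
    return new_theta
```

On a real robot, `score` would be a gait evaluation (e.g. measured walking speed); grouping by perturbation sign subtracts the common baseline, which is what makes the estimate usable with so few trials.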

relative entropy policy search

Author(s): Jan Peters, Katharina Mülling, Yasemin Altün
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2010
Keywords: policy gradients, reinforcement learning, probabilistic models
Expert Opinion: This work was the first to introduce an analytical KL bound into policy search algorithms in order to make them more stable and efficient. It started a whole branch of new policy search algorithms, and many deep RL algorithms are inspired by this idea.
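
The core idea can be illustrated with episodic-REPS-style sample weights: returns are exponentiated with a temperature chosen via the dual objective so that the reweighted sample distribution stays within a KL ball around the uniform one. This is a crude grid-search sketch of the principle, not the paper's solver.

```python
import math

def reps_weights(returns, epsilon=0.1):
    """REPS-style sample weights w_i proportional to exp(R_i / eta), with
    the temperature eta chosen by minimizing the dual objective over a
    crude grid, so the reweighted distribution stays (approximately)
    within KL radius epsilon of the uniform sample distribution."""
    n = len(returns)
    r_max = max(returns)  # shift for numerical stability
    best = None
    for k in range(40):
        eta = 0.05 * 1.5 ** k
        z = [math.exp((r - r_max) / eta) for r in returns]
        # Dual: g(eta) = eta*epsilon + eta*log(mean exp(R/eta)), shifted by r_max.
        g = eta * epsilon + eta * math.log(sum(z) / n) + r_max
        if best is None or g < best[0]:
            best = (g, z)
    _, z = best
    total = sum(z)
    return [zi / total for zi in z]
```

Larger `epsilon` lets the weights concentrate on high-return samples; a small `epsilon` keeps them close to uniform, which is exactly the stability/greediness trade-off the KL bound formalizes.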

reinforcement learning of motor skills with policy gradients

Author(s): Jan Peters, Stefan Schaal
Venue: Neural Networks (special issue on Robotics and Neuroscience)
Year Published: 2008
Keywords: manipulation, policy gradients, reinforcement learning, dynamical systems
Expert Opinion: This paper presents one of the early methods for learning high-dimensional policy parameters, which is important in robot manipulation. It is a non-trivial combination of the authors' earlier work on learning motor skills, and aside from showing how reinforcement learning can be used to learn high-dimensional parameters, it also provides a good short summary of relevant work up to that point.

policy search for motor primitives in robotics

Author(s): Jens Kober, Jan Peters
Venue: Machine Learning Journal
Year Published: 2009
Keywords: policy gradients, reinforcement learning, learning from demonstration, probabilistic models
Expert Opinion: This work was published before the recent AI boom, yet its imitation and reinforcement learning results remain impressive today.

a simple learning strategy for high-speed quadrocopter multi-flips

Author(s): Sergei Lupashin, Angela Schoellig, Michael Sherback, Raffaello D'Andrea
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2010
Keywords: policy gradients
Expert Opinion: This work is the first to show highly agile, multiple flips with state-of-the-art quadrotors. The proposed learning strategy leverages first-principles modeling to obtain a suitable parameterization for the quadrocopter multi-flips, and a systematic approach for finding optimal parameters from data. The method is thus an excellent example of combining the strengths of first-principles models and data-based learning. Each approach on its own (i.e., a purely model-based or a purely data-driven approach) would have had difficulty succeeding in this challenging application, because modeling all effects relevant for high-speed flips is extremely challenging, and purely data-based approaches cannot (easily) be applied because the problem is highly unstable. Another strength of the paper lies in the conceptual simplicity of the proposed learning strategy, which makes it widely applicable to learning problems in robotics.
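
The learning scheme described above can be caricatured as iterative parameter correction: a first-principles model supplies the parameterization and an initial guess, and each real trial corrects the parameters using the measured final-state error. All names below (`execute`, `target`, `gains`) are illustrative stand-ins, not the paper's interface.

```python
def iterative_parameter_correction(execute, target, params, gains,
                                   n_trials=20, tol=1e-3):
    """Sketch of model-initialized, data-corrected parameter learning:
    run a trial, measure the final-state error against the target, and
    correct each parameter with a fixed gain until the error is small."""
    for _ in range(n_trials):
        outcome = execute(params)                       # one real trial
        errors = [o - t for o, t in zip(outcome, target)]
        if max(abs(e) for e in errors) < tol:
            break                                       # converged
        params = [p - k * e for p, k, e in zip(params, gains, errors)]
    return params
```

The gains play the role of a learning rate per parameter; with a reasonable model-derived initial guess, only a handful of real trials are needed, which is what makes this practical on hardware.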

model-agnostic meta-learning for fast adaptation of deep networks

Author(s): Chelsea Finn, Pieter Abbeel, Sergey Levine
Venue: International Conference on Machine Learning
Year Published: 2017
Keywords: policy gradients, reinforcement learning, neural networks, locomotion
Expert Opinion: Model-Agnostic Meta-Learning (MAML) has been a paradigm shift in robotics and has been used in a number of applications.
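
Stripped to a scalar toy problem, the MAML idea looks like this: an inner gradient step adapts the parameters to each task, and the outer update differentiates through that step. Here the model is a single scalar with per-task loss (theta - c)^2, so the second-order term is available analytically; this is a didactic sketch, not the paper's deep-network setup.

```python
def maml_step(theta, tasks, alpha=0.1, beta=0.05):
    """One MAML meta-update for a scalar model with per-task loss
    L_c(theta) = (theta - c)^2. The inner loop adapts theta with one
    gradient step per task; the outer loop differentiates through that
    step (analytically here: d(adapted)/d(theta) = 1 - 2*alpha)."""
    meta_grad = 0.0
    for c in tasks:
        adapted = theta - alpha * 2.0 * (theta - c)   # inner gradient step
        # Chain rule through the inner step (the second-order MAML term):
        meta_grad += 2.0 * (adapted - c) * (1.0 - 2.0 * alpha)
    return theta - beta * meta_grad / len(tasks)
```

Iterating this drives theta to the point from which one inner step adapts best to every task; for a symmetric task set that is the task mean, which is the essence of learning a good initialization.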

natural actor-critic

Author(s): Jan Peters, Sethu Vijayakumar, Stefan Schaal
Venue: European Conference on Machine Learning
Year Published: 2005
Keywords: policy gradients, reinforcement learning
Expert Opinion: This paper established policy-search approaches as the algorithms most naturally (!) suited for motor skill learning, in part by pairing them with DMPs. That combination was a major advance and accounts for almost all notable successful instances of robot motor skill learning until deep nets. All existing robot policy learning algorithms, including the latest, most fashionable deep approaches, are technically descended from this algorithm, and many researchers were inspired by its immense success.

a brief survey of deep reinforcement learning

Author(s): Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, Anil Anthony Bharath
Venue: IEEE Signal Processing Magazine
Year Published: 2017
Keywords: survey, policy gradients, neural networks, reinforcement learning
Expert Opinion: This comprehensive survey covers the field of deep reinforcement learning approaches and algorithms well. It is written to be accessible to a wide audience and is generally easy to understand.

pilco: a model-based and data-efficient approach to policy search

Author(s): Marc Peter Deisenroth, Carl Edward Rasmussen
Venue: International Conference on Machine Learning
Year Published: 2011
Keywords: state estimation, reinforcement learning, probabilistic models, gaussians, dynamical systems, visual perception, policy gradients
Expert Opinion: it is a nice answer to the problem of learning models.

trust region policy optimization

Author(s): John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
Venue: International Conference on Machine Learning
Year Published: 2015
Keywords: policy gradients, reinforcement learning
Expert Opinion: Schulman et al. introduced an iterative method for optimizing policies with guaranteed monotonic improvement. At the time it was developed, it was significantly more stable (lower variance) than other on- and off-policy methods, such as deep Q-learning, for learning large non-linear policies. TRPO, to date, requires substantially less parameter tuning than other deep reinforcement learning algorithms, and its variants, such as PPO, are a popular choice. While model-free learning algorithms have not yet met with big successes in robot learning, we might not be far away.
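
The trust-region idea can be illustrated on a single categorical policy: move along the surrogate gradient, then backtrack the step size until the KL divergence to the old policy is within the trust region and the surrogate improves. This is a toy sketch of the principle, not Schulman et al.'s conjugate-gradient implementation.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_categorical(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def trust_region_step(logits, advantages, delta=0.01, max_backtracks=20):
    """TRPO-flavored update for one categorical policy: step along the
    surrogate gradient, backtracking until KL(old, new) <= delta and the
    surrogate objective improves over the old policy."""
    old = softmax(logits)
    # Gradient of the surrogate E_old[(pi/pi_old) * A] wrt the logits,
    # evaluated at the old policy: pi_i * (A_i - E_old[A]).
    baseline = sum(p * a for p, a in zip(old, advantages))
    grad = [p * (a - baseline) for p, a in zip(old, advantages)]
    step = 1.0
    for _ in range(max_backtracks):
        new_logits = [l + step * g for l, g in zip(logits, grad)]
        new = softmax(new_logits)
        surr = sum(p * a for p, a in zip(new, advantages))
        if kl_categorical(old, new) <= delta and surr > baseline:
            return new_logits
        step *= 0.5
    return logits  # no acceptable step found
```

The backtracking line search is what enforces the monotonic-improvement guarantee in practice: a step is only accepted if it both improves the surrogate and stays inside the KL trust region.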

deep reinforcement learning

Author(s): Yuxi Li
Venue: Under review for Morgan & Claypool: Synthesis Lectures in Artificial Intelligence and Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: A huge bottleneck in robot learning is supervision during training. A breakthrough came when deep neural networks could be leveraged for reinforcement learning. This paper is, I believe, a good introduction.

using inaccurate models in reinforcement learning

Author(s): Pieter Abbeel, Morgan Quigley, Andrew Y. Ng
Venue: International Conference on Machine Learning
Year Published: 2006
Keywords: reinforcement learning, policy gradients
Expert Opinion: This is another example that tries to make (usually data-inefficient) reinforcement learning techniques more feasible on real systems. While many techniques have been proposed to use simulators to train robot behaviors, this paper stands out to me in that it combines very pragmatic observations (we need simulators to learn complex behaviors, but they are often inaccurate), with precise theoretical insights into RL algorithms.

learning agile and dynamic motor skills for legged robots

Author(s): Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, and Marco Hutter
Venue: Science Robotics
Year Published: 2019
Keywords: policy gradients, neural networks, legged robots, locomotion, dynamical systems
Expert Opinion: Very nice work that combines supervised learning of internal models (deep networks) of the series-elastic actuator dynamics with reinforcement learning (specifically, Trust Region Policy Optimization) for learning locomotion policies. They obtained excellent locomotion gaits and were able to learn complex standing-up sequences.

an introduction to deep reinforcement learning

Author(s): Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Venue: Foundations and Trends in Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: There have been astounding achievements in Deep Reinforcement Learning in recent years with complex decision-making problems suddenly becoming solvable. This book is written by experts in the field and on top of that, it is free!

closing the sim-to-real loop: adapting simulation randomization with real world experience

Author(s): Y. Chebotar, A. Handa, V. Makoviychuk, M. Macklin, J. Issac, N. Ratliff, and D. Fox
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2019
Keywords: reinforcement learning, policy gradients, manipulation
Expert Opinion: There is more than one way to view this work. One, presented in the paper, is to close the sim2real gap by tuning a parametric simulator. Another is to embed a simulation model in the policy representation and close the loop through online learning.