reinforcement learning and optimal control

Author(s): Dimitri P. Bertsekas
Venue: Book
Year Published: 2019
Keywords: reinforcement learning, optimal control, dynamic programming, neural networks
Expert Opinion: An accessible take on reinforcement learning that pairs well with Bertsekas's classic and influential books on Dynamic Programming.

learning agile and dynamic motor skills for legged robots

Author(s): Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, and Marco Hutter
Venue: Science Robotics
Year Published: 2019
Keywords: policy gradients, neural networks, legged robots, locomotion, dynamical systems
Expert Opinion: Very nice work that combines supervised learning for learning internal models (deep networks) of the series-elastic actuators dynamics with reinforcement learning (specifically, Trust Region Policy Optimization) for learning locomotion policies. They obtained excellent locomotion gaits and were able to learn complex standing-up sequences.

a review of robot learning for manipulation: challenges, representations, and algorithms

Author(s): Oliver Kroemer, Scott Niekum, George Konidaris
Venue: arXiv
Year Published: 2019
Keywords: survey, probabilistic models, manipulation, reinforcement learning
Expert Opinion: This paper presents an incredibly extensive recent survey on learning for robot manipulation (440 citations!). Surveys are always useful, especially for new grad students. This one presents a single framework to formalise the robot manipulation problem.

closing the sim-to-real loop: adapting simulation randomization with real world experience

Author(s): Y. Chebotar, A. Handa, V. Makoviychuk, M. Macklin, J. Issac, N. Ratliff, and D. Fox
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2019
Keywords: reinforcement learning, policy gradients, manipulation
Expert Opinion: There is more than one way to view this work. One, presented in the paper, is to close the sim2real gap by tuning a parametric simulator. Another is to embed a simulation model in the policy representation and close the loop through online learning.

from skills to symbols: learning symbolic representations for abstract high-level planning

Author(s): George Konidaris, Leslie Pack Kaelbling, Tomas Lozano-Perez
Venue: Journal of Artificial Intelligence Research
Year Published: 2018
Keywords: probabilistic models, planning
Expert Opinion: As we get better at low-level robotic control, the community will need to start thinking more about longer-horizon problems and how to smoothly flow between reasoning at different levels of abstraction. This paper presents a theoretically-grounded formal treatment of the problem, proves some nice results about what constitutes necessary and sufficient symbols for various types of planning, and shows some nice demos on a real robot. It is by far the best analysis of hierarchical learning / planning that I know of and provides a much-needed theoretical foundation for moving this area of research forward.

reinforcement learning: an introduction

Author(s): Richard S. Sutton and Andrew G. Barto
Venue: Book
Year Published: 2018
Keywords: mobile robots, reinforcement learning, unsupervised learning, optimal control, genetic algorithms
Expert Opinion: Reinforcement learning is the branch of machine learning that is concerned with decision making under uncertainty, and can be treated as sitting at the intersection of stochastic optimal control theory and machine learning. As such, it is one of the primary tools that is used for learning on robots, where it has appeared in many forms from mobile robots learning to navigate, to manipulators learning to handle different kinds of objects. This book is really the primary text on reinforcement learning, and covers everything from the basic concepts in the field to more recent developments. It is a must-read for anyone interested in robot learning.
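
The basic concepts the book opens with fit in a few lines of code. Below is a minimal tabular Q-learning sketch on a toy chain environment; the environment and all constants are my own illustration, not an example taken from the book.

```python
import random

# Tabular Q-learning on a toy 5-state chain.
# States 0..4; action 0 steps left, action 1 steps right; reaching
# state 4 gives reward 1 and ends the episode.

N_STATES, ACTIONS = 5, (0, 1)
ALPHA, GAMMA = 0.5, 0.9

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    done = (s2 == N_STATES - 1)
    return s2, (1.0 if done else 0.0), done

random.seed(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
for _ in range(500):
    s, done = 0, False
    while not done:
        a = random.choice(ACTIONS)  # uniform behavior policy: Q-learning is off-policy
        s2, r, done = step(s, a)
        target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# The learned greedy policy should step right in every non-terminal state.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Note that the behavior policy here is purely random: because Q-learning bootstraps off the max over next actions, it still recovers the optimal greedy policy.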

hindsight experience replay

Author(s): Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: manipulation, humanoid robotics, reinforcement learning, neural networks
Expert Opinion: HER addresses the issue of sample inefficiency in DRL, especially for those problems with sparse and binary reward functions. It has become one of the most effective algorithms for learning problems with multiple goals which have the potential to solve many challenging manipulation tasks. The idea of "EVERY experience is a good experience for SOME task" is a powerful insight that succinctly reflects how we teach our children to be lifelong learners. We should teach our robots the same way.
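
The relabeling trick behind that insight is small enough to sketch directly; the toy 1-D reaching task and numbers below are illustrative, not the paper's setup.

```python
# A minimal sketch of hindsight relabeling.
# Sparse binary reward: 1 if the achieved position equals the goal.

def reward(achieved, goal):
    return 1.0 if achieved == goal else 0.0

goal = 5                                             # the agent aimed for 5 ...
trajectory = [(0, +1, 1), (1, +1, 2), (2, +1, 3)]    # ... but ended at 3
achieved_goal = trajectory[-1][2]                    # HER: pretend 3 was the goal

replay = []
for (s, a, s2) in trajectory:
    replay.append((s, a, s2, goal, reward(s2, goal)))                    # original
    replay.append((s, a, s2, achieved_goal, reward(s2, achieved_goal)))  # relabeled

# The failed episode carries no signal for the original goal, but the
# relabeled copy "succeeds" for the goal that was actually reached.
original_rewards = [r for (*_, g, r) in replay if g == goal]
relabeled_rewards = [r for (*_, g, r) in replay if g != goal]
print(original_rewards, relabeled_rewards)
```

Every stored transition is duplicated with a substituted goal, turning an all-zero-reward episode into useful training signal for a goal-conditioned learner.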

qt-opt: scalable deep reinforcement learning for vision-based robotic manipulation

Author(s): Dmitry Kalashnikov, Alex Irpan, Peter Pastor, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Mrinal Kalakrishnan, Vincent Vanhoucke, Sergey Levine
Venue: Conference on Robot Learning (CoRL)
Year Published: 2018
Keywords: reinforcement learning, manipulation, neural networks
Expert Opinion: The paper shows that replay buffer based RL methods can be successfully applied to large scale robotic applications. We will see more of this kind of work in the future.

deep reinforcement learning in a handful of trials using probabilistic dynamics models

Author(s): Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: reinforcement learning, dynamical systems, probabilistic models, optimal control
Expert Opinion: Model-based reinforcement learning had the image of not performing well in comparison to model-free methods on complicated problems. However, conceptually, model-based methods have many advantages when it comes to data-efficiency and transfer between tasks. This paper shows that with ensemble models state-of-the-art robotic learning benchmarks (Gym-Mujoco environments) can be solved with high performance in significantly fewer steps. The data-efficiency is particularly important for real robot applications.
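
The core mechanism, an ensemble of dynamics models whose disagreement signals epistemic uncertainty, can be sketched in miniature. The 1-D linear system and bootstrap scheme below are an illustration of the idea, not the paper's neural-network models.

```python
import random
import statistics

# Bootstrapped ensemble of dynamics models on a toy 1-D system.
# True dynamics: s' = 0.9 * s + noise. Each ensemble member fits the
# coefficient on its own bootstrap resample of the data; disagreement
# between members serves as an epistemic-uncertainty estimate.

random.seed(0)
TRUE_COEF = 0.9
inputs = [random.uniform(-1, 1) for _ in range(50)]
data = [(s, TRUE_COEF * s + random.gauss(0, 0.05)) for s in inputs]

def fit(sample):
    # Closed-form least squares for the model s' = theta * s.
    return sum(s * s2 for s, s2 in sample) / sum(s * s for s, _ in sample)

ensemble = []
for _ in range(5):
    boot = [random.choice(data) for _ in range(len(data))]
    ensemble.append(fit(boot))

mean_coef = statistics.mean(ensemble)        # ensemble prediction
disagreement = statistics.stdev(ensemble)    # uncertainty estimate
print(round(mean_coef, 2), round(disagreement, 4))
```

With plenty of data the members agree (low disagreement); in regions or regimes the data does not cover, they would diverge, which is exactly the signal a planner can use to avoid model exploitation.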

an algorithmic perspective on imitation learning

Author(s): Takayuki Osa, Joni Pajarinen, Gerhard Neumann, J. Andrew Bagnell, Pieter Abbeel, Jan Peters
Venue: Foundations and Trends in Robotics
Year Published: 2018
Keywords: survey, learning from demonstration, reinforcement learning, planning
Expert Opinion: A focused overview of imitation learning and perspectives from some of the leaders in the field. Not a complete review but an excellent highlighting of important contributions in the field and perspective on future challenges.

deep reinforcement learning

Author(s): Yuxi Li
Venue: Under review for Morgan & Claypool: Synthesis Lectures in Artificial Intelligence and Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: A huge bottleneck in Robot Learning is supervision during training. A breakthrough came when DNNs could be leveraged for reinforcement learning. This paper, I believe, is a good introduction.

world models

Author(s): David Ha, Jürgen Schmidhuber
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: neural networks
Expert Opinion: A different take on deep model-based RL, using RNNs to model the world so that the agent can train by "hallucinating" rollouts inside its own learned model.

an introduction to deep reinforcement learning

Author(s): Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Venue: Foundations and Trends in Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: There have been astounding achievements in Deep Reinforcement Learning in recent years with complex decision-making problems suddenly becoming solvable. This book is written by experts in the field and on top of that, it is free!

particle filter networks with application to visual localization

Author(s): Peter Karkus, David Hsu, Wee Sun Lee
Venue: Proceedings of The 2nd Conference on Robot Learning
Year Published: 2018
Keywords: state estimation, neural networks, mobile robots
Expert Opinion: makes clear how the algorithmic ideas from before and end-to-end learning can be combined

affordances in psychology, neuroscience and robotics: a survey

Author(s): Lorenzo Jamone, Emre Ugur, Angelo Cangelosi, Luciano Fadiga, Alexandre Bernardino, Justus Piater and Jose Santos-Victor
Venue: IEEE Transactions on Cognitive and Developmental Systems
Year Published: 2018
Keywords: survey, visual perception, mobile robots, reinforcement learning
Expert Opinion: Affordances is an important term for robot learning, but also one that tends to be overloaded and can lead to confusion. If an object allows an agent to perform an action, then the object is said to afford the action to that agent. Affordances can generally be learned autonomously and are thus a fundamental aspect of self-supervised learning for autonomous robots. The nuances of the term however are still widely discussed in robotics and other fields. As a result, one should be aware of the ambiguity and different perspectives regarding the term when talking about affordances. This survey paper discusses some of the nuanced interpretations of the term affordances.

trust region policy optimization

Author(s): John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
Venue: International Conference on Machine Learning
Year Published: 2017
Keywords: policy gradients, reinforcement learning
Expert Opinion: The TRPO paper led the way towards practical Reinforcement Learning (RL) for robotics (and other domains). It is much more sample-efficient and robust than previous approaches, and scales to high-dimensional continuous action spaces. It has become a very popular method for training RL policies in robotics, although it is slowly being replaced by descendants like Proximal Policy Optimization (PPO). TRPO has played a big part in reviving RL for robotics.
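
The mechanism behind that robustness can be stated compactly: TRPO maximizes an importance-weighted surrogate objective while a KL-divergence trust region keeps the updated policy close to the old one:

```latex
\max_{\theta} \;\; \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
  \left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}
         \, A^{\pi_{\theta_{\text{old}}}}(s, a) \right]
\quad \text{subject to} \quad
\mathbb{E}_{s}\left[ D_{\mathrm{KL}}\!\big( \pi_{\theta_{\text{old}}}(\cdot \mid s)
  \,\|\, \pi_{\theta}(\cdot \mid s) \big) \right] \le \delta
```

PPO, mentioned above, keeps the same probability ratio but replaces the hard KL constraint with a clipped objective that is simpler to optimize.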

model-agnostic meta-learning for fast adaptation of deep networks

Author(s): Chelsea Finn, Pieter Abbeel, Sergey Levine
Venue: International Conference on Machine Learning
Year Published: 2017
Keywords: policy gradients, reinforcement learning, neural networks, locomotion
Expert Opinion: Model-Agnostic Meta-Learning (MAML) has been a paradigm shift in robotics and has been used in a number of applications.
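
The inner/outer-loop structure of MAML can be sketched on a trivial task family. Note that this uses the first-order approximation (FOMAML), which drops the second-order gradient term of full MAML; the scalar task family and constants below are illustrative, not from the paper.

```python
import random

# First-order MAML (FOMAML) on a toy family of 1-D tasks.
# Each sampled task asks us to match a scalar target t with loss
# (theta - t)^2; MAML seeks an initialization theta that adapts to
# any task in a single inner gradient step.

random.seed(0)
ALPHA, BETA = 0.1, 0.05   # inner / outer learning rates
theta = 0.0               # meta-parameter: the shared initialization

def grad(theta, t):
    return 2 * (theta - t)    # d/dtheta of (theta - t)^2

for _ in range(2000):
    t = random.uniform(4, 6)                    # sample a task
    adapted = theta - ALPHA * grad(theta, t)    # inner adaptation step
    # Outer update on the post-adaptation loss; FOMAML ignores the
    # second-order term that full MAML would backpropagate through.
    theta -= BETA * grad(adapted, t)

# A good initialization sits near the task-distribution mean (~5),
# so one inner step gets close to any sampled target.
print(round(theta, 2))
```

The key point the sketch preserves: the outer loop optimizes performance *after* adaptation, not performance of the initialization itself.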

cad2rl: real single-image flight without a single real image

Author(s): Fereshteh Sadeghi, Sergey Levine
Venue: Robotics: Science and Systems Conference
Year Published: 2017
Keywords: neural networks, reinforcement learning, mobile robots
Expert Opinion: The CAD2RL paper demonstrated that it was possible to train a policy, using Reinforcement Learning, entirely in simulation and zero-shot transfer it to a real world robot. It paved the way for a lot of follow up work on domain transfer and domain randomization for robotic perception and control.
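
The domain-randomization recipe the paper popularized amounts to resampling the simulator's appearance every episode; a minimal configuration-sampling sketch is below. The parameter names and ranges are illustrative, not the paper's actual randomization set.

```python
import random

# Domain randomization sketch: each training episode draws a fresh
# random visual configuration, so the learned policy cannot overfit
# to any single rendering of the simulated world.

random.seed(0)

def sample_visual_domain():
    return {
        "wall_texture": random.choice(["brick", "wood", "plaster", "checker"]),
        "light_intensity": random.uniform(0.2, 1.5),
        "camera_fov_deg": random.uniform(60, 100),
        "floor_rgb": tuple(round(random.random(), 2) for _ in range(3)),
    }

# Train one episode per freshly randomized world.
episodes = [sample_visual_domain() for _ in range(3)]
for cfg in episodes:
    print(cfg)
```

If the randomization distribution is wide enough, the real world looks like just another sample from it, which is what enables zero-shot transfer.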

intrinsically motivated goal exploration processes with automatic curriculum learning

Author(s): Sebastien Forestier, Yoan Mollard, Pierre-Yves Oudeyer
Venue: arXiv
Year Published: 2017
Keywords: learning from demonstration
Expert Opinion: The paper shows how an agent/robot can find out by itself how to manipulate its environment from a simple intrinsic motivation. In my eyes, this is the first practical demonstration of Schmidhuber's idea of learning-progress maximization, which is probably one of the most powerful generic drives. The paper shows how an agent can discover more and more complex interactions with an environment without a specific task in mind. I believe these kinds of studies are important steps towards an intelligently learning robot.
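
The learning-progress drive can be sketched as a goal-selection rule: track competence per goal region over time and prefer the region where competence is changing fastest. The regions, competence values, and window size below are illustrative, not the paper's setup.

```python
# Learning-progress-based goal selection, in miniature.
# Competence history per goal region (most recent last). Region "B" is
# improving quickly; "A" is already mastered; "C" is currently unlearnable.

history = {
    "A": [0.95, 0.96, 0.95, 0.96],
    "B": [0.20, 0.35, 0.55, 0.70],
    "C": [0.05, 0.04, 0.06, 0.05],
}

def learning_progress(competences, window=2):
    recent = sum(competences[-window:]) / window
    older = sum(competences[:window]) / window
    return abs(recent - older)

# Pick the region where competence is changing fastest: neither the
# mastered region nor the hopeless one, but the learnable frontier.
chosen = max(history, key=lambda r: learning_progress(history[r]))
print(chosen)  # → B
```

Using the absolute change (rather than raw competence) is what steers the agent away from both trivial and impossible goals, yielding an automatic curriculum.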