reinforcement learning and optimal control

Author(s): Dimitri P. Bertsekas
Venue: Book
Year Published: 2019
Keywords: reinforcement learning, optimal control, dynamic programming, neural networks
Expert Opinion: An accessible take on reinforcement learning that pairs well with the classic and influential book(s) on dynamic programming by Bertsekas.

learning agile and dynamic motor skills for legged robots

Author(s): Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, and Marco Hutter
Venue: Science Robotics
Year Published: 2019
Keywords: policy gradients, neural networks, legged robots, locomotion, dynamical systems
Expert Opinion: Very nice work that combines supervised learning of internal models (deep networks) of the series-elastic actuator dynamics with reinforcement learning (specifically, Trust Region Policy Optimization) for learning locomotion policies. The authors obtained excellent locomotion gaits and were able to learn complex standing-up sequences.

a review of robot learning for manipulation: challenges, representations, and algorithms

Author(s): Oliver Kroemer, Scott Niekum, George Konidaris
Venue: arXiv
Year Published: 2019
Keywords: survey, probabilistic models, manipulation, reinforcement learning
Expert Opinion: This paper presents an incredibly extensive recent survey on learning in robot manipulation (440 citations!!). Surveys are always useful, especially for new grad students. This one presents a single framework to formalise the robot manipulation problem.

closing the sim-to-real loop: adapting simulation randomization with real world experience

Author(s): Y. Chebotar, A. Handa, V. Makoviychuk, M. Macklin, J. Issac, N. Ratliff, and D. Fox
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2019
Keywords: reinforcement learning, policy gradients, manipulation
Expert Opinion: There is more than one way to view this work. One, presented in the paper, is to close the sim2real gap by tuning a parametric simulator. Another is to embed a simulation model in the policy representation and close the loop through online learning.

from skills to symbols: learning symbolic representations for abstract high-level planning

Author(s): George Konidaris, Leslie Pack Kaelbling, Tomas Lozano-Perez
Venue: Journal of Artificial Intelligence Research
Year Published: 2018
Keywords: probabilistic models, planning
Expert Opinion: There exists a representational gap between the continuous sensorimotor world of a robot and the discrete symbols used by advanced AI planning methods. Many existing studies assume the existence of pre-coded planning symbols and investigate how to learn the relations between these pre-coded symbols and the continuous world of the robot. A few others argue that symbols should be formed from the agent's own sensorimotor experience. This paper presents a structured approach, built on the Markov decision process formalism, to discover abstract symbolic representations from low-level, high-dimensional, continuous sensorimotor experience. The learned symbols and rules can be automatically and effectively expressed in PDDL, a canonical high-level planning domain language, enabling high-level planning with traditional off-the-shelf AI planners.

reinforcement learning: an introduction

Author(s): Richard S. Sutton and Andrew G. Barto
Venue: Book
Year Published: 2018
Keywords: mobile robots, reinforcement learning, unsupervised learning, optimal control, genetic algorithms
Expert Opinion: Somewhat repeating myself from the last suggestion: for learning robot behavior, reinforcement learning is an essential tool. While Sutton & Barto do not focus specifically on the case of robotics, their book is a very accessible text that nevertheless manages to cover many aspects, techniques, and challenges in reinforcement learning.

hindsight experience replay

Author(s): Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: manipulation, humanoid robotics, reinforcement learning, neural networks
Expert Opinion: A really nice, simple idea for learning parameterized skills (building on UVFAs) and efficiently dealing with sparse rewards. I think Learning Parameterized Motor Skills on a Humanoid Robot (Castro Da Silva et al.) has a much better description of the parameterized skill learning problem than the HER or UVFA papers, but the HER paper has better practical ideas.
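
The hindsight relabeling idea behind HER can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the achieved goal is simply the next state, uses a sparse 0/-1 reward, and draws substitute goals from later steps of the same episode. The names `Transition`, `sparse_reward`, and `relabel_with_hindsight` are hypothetical.

```python
import random
from typing import List, NamedTuple, Optional, Tuple


class Transition(NamedTuple):
    state: Tuple[float, ...]
    action: int
    next_state: Tuple[float, ...]
    goal: Tuple[float, ...]
    reward: float


def sparse_reward(achieved: Tuple[float, ...], goal: Tuple[float, ...],
                  tol: float = 1e-6) -> float:
    """Return 0.0 if the goal is reached (within tol), otherwise -1.0."""
    reached = all(abs(a - g) <= tol for a, g in zip(achieved, goal))
    return 0.0 if reached else -1.0


def relabel_with_hindsight(episode: List[Transition], k: int = 4,
                           rng: Optional[random.Random] = None) -> List[Transition]:
    """Keep the original transitions and add k relabeled copies of each.

    Assumption: a state that was actually reached later in the same
    episode is substituted as the goal, so some copies earn reward 0.0
    even if the original goal was never achieved.
    """
    if rng is None:
        rng = random.Random(0)
    augmented = list(episode)
    for t, tr in enumerate(episode):
        future = episode[t:]  # steps from t onward in the same episode
        for _ in range(k):
            new_goal = rng.choice(future).next_state
            new_reward = sparse_reward(tr.next_state, new_goal)
            augmented.append(tr._replace(goal=new_goal, reward=new_reward))
    return augmented
```

The augmented list would typically be pushed into the replay buffer of an off-policy learner; the learner itself is outside the scope of this sketch.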

qt-opt: scalable deep reinforcement learning for vision-based robotic manipulation

Author(s): Dmitry Kalashnikov, Alex Irpan, Peter Pastor, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Mrinal Kalakrishnan, Vincent Vanhoucke, Sergey Levine
Venue: Conference on Robot Learning
Year Published: 2018
Keywords: reinforcement learning, manipulation, neural networks
Expert Opinion: The paper shows that replay buffer based RL methods can be successfully applied to large scale robotic applications. We will see more of this kind of work in the future.

deep reinforcement learning in a handful of trials using probabilistic dynamics models

Author(s): Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: reinforcement learning, dynamical systems, probabilistic models, optimal control
Expert Opinion: Model-based reinforcement learning had the image of not performing well in comparison to model-free methods on complicated problems. However, conceptually, model-based methods have many advantages when it comes to data-efficiency and transfer between tasks. This paper shows that, with ensemble models, state-of-the-art robotic learning benchmarks (Gym MuJoCo environments) can be solved with high performance in significantly fewer steps. The data-efficiency is particularly important for real robot applications.
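
To make the ensemble idea concrete, here is a deliberately tiny sketch. It is not the paper's method: the paper's neural network models are swapped for one-parameter linear models, and planning is omitted entirely. It assumes 1-D dynamics of the form next_state = state + k * action, fits each ensemble member on a bootstrap resample, and uses the spread of the members' predictions as a rough proxy for model uncertainty. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev
from typing import List, Tuple

# Each data point is (action, observed change in state).
Sample = Tuple[float, float]


def fit_gain(data: List[Sample]) -> float:
    """Least-squares fit of k in delta = k * action (line through the origin)."""
    num = sum(u * d for u, d in data)
    den = sum(u * u for u, _ in data)
    return num / den if den > 0 else 0.0


def fit_ensemble(data: List[Sample], n_members: int = 5, seed: int = 0) -> List[float]:
    """Fit each member on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_gain([rng.choice(data) for _ in data]) for _ in range(n_members)]


def predict(ensemble: List[float], state: float, action: float) -> Tuple[float, float]:
    """Return the mean prediction and the disagreement (std) across members."""
    preds = [state + k * action for k in ensemble]
    return mean(preds), pstdev(preds)
```

With noise-free data every member agrees and the disagreement is zero; with noisy or scarce data the members diverge, which is the signal a planner could use to avoid exploiting model errors.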

an algorithmic perspective on imitation learning

Author(s): Takayuki Osa, Joni Pajarinen, Gerhard Neumann, J. Andrew Bagnell, Pieter Abbeel, Jan Peters
Venue: Foundations and Trends in Robotics
Year Published: 2018
Keywords: survey, learning from demonstration, reinforcement learning, planning
Expert Opinion: A focused overview of imitation learning and perspectives from some of the leaders in the field. Not a complete review, but it does an excellent job of highlighting important contributions in the field and offers perspective on future challenges.

deep reinforcement learning

Author(s): Yuxi Li
Venue: Under review for Morgan & Claypool: Synthesis Lectures in Artificial Intelligence and Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: A huge bottleneck in Robot Learning is supervision during training. A breakthrough came when DNNs could be leveraged for reinforcement learning. This paper, I believe, is a good introduction.

world models

Author(s): David Ha, Jürgen Schmidhuber
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2018
Keywords: neural networks
Expert Opinion: A different take on deep model-based RL that uses RNNs to model the world so that the agent can train by hallucination.

an introduction to deep reinforcement learning

Author(s): Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Venue: Foundations and Trends in Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: There have been astounding achievements in Deep Reinforcement Learning in recent years with complex decision-making problems suddenly becoming solvable. This book is written by experts in the field and on top of that, it is free!

particle filter networks with application to visual localization

Author(s): Peter Karkus, David Hsu, Wee Sun Lee
Venue: Proceedings of The 2nd Conference on Robot Learning
Year Published: 2018
Keywords: state estimation, neural networks, mobile robots
Expert Opinion: Makes clear how earlier algorithmic ideas and end-to-end learning can be combined.

affordances in psychology, neuroscience and robotics: a survey

Author(s): Lorenzo Jamone, Emre Ugur, Angelo Cangelosi, Luciano Fadiga, Alexandre Bernardino, Justus Piater and Jose Santos-Victor
Venue: IEEE Transactions on Cognitive and Developmental Systems
Year Published: 2018
Keywords: survey, visual perception, mobile robots, reinforcement learning
Expert Opinion: "Affordance" is an important term for robot learning, but also one that tends to be overloaded and can lead to confusion. If an object allows an agent to perform an action, then the object is said to afford the action to that agent. Affordances can generally be learned autonomously and are thus a fundamental aspect of self-supervised learning for autonomous robots. The nuances of the term, however, are still widely discussed in robotics and other fields. As a result, one should be aware of the ambiguity and the different perspectives regarding the term when talking about affordances. This survey paper discusses some of the nuanced interpretations of the term.

trust region policy optimization

Author(s): John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
Venue: International Conference on Machine Learning
Year Published: 2017
Keywords: policy gradients, reinforcement learning
Expert Opinion: The TRPO paper led the way towards practical Reinforcement Learning (RL) for robotics (and other domains). It's much more sample efficient and robust than previous approaches, and scales to high-dimensional continuous action spaces. It has become a very popular method for training RL policies for robotics, although it's slowly being replaced by descendants like Proximal Policy Optimization (PPO). TRPO has played a big part in reviving RL for robotics.

model-agnostic meta-learning for fast adaptation of deep networks

Author(s): Chelsea Finn, Pieter Abbeel, Sergey Levine
Venue: International Conference on Machine Learning
Year Published: 2017
Keywords: policy gradients, reinforcement learning, neural networks, locomotion
Expert Opinion: Most people probably wouldn't compare MAML with HER because the algorithms and the problems they address are vastly different; MAML tackles transfer learning while HER tackles sparse reward issues. But from a certain perspective, HER and MAML can make a very complementary pair of parents. HER is like an encouraging parent who, in hindsight, thinks everything the child did is a useful learning experience. MAML, on the other hand, thinks in foresight and wants the child to only learn skills for future job prospects. Does that sound like your parents?

cad2rl: real single-image flight without a single real image

Author(s): Fereshteh Sadeghi, Sergey Levine
Venue: Robotics: Science and Systems Conference
Year Published: 2017
Keywords: neural networks, reinforcement learning, mobile robots
Expert Opinion: The CAD2RL paper demonstrated that it was possible to train a policy, using Reinforcement Learning, entirely in simulation and transfer it zero-shot to a real-world robot. It paved the way for a lot of follow-up work on domain transfer and domain randomization for robotic perception and control.

intrinsically motivated goal exploration processes with automatic curriculum learning

Author(s): Sebastien Forestier, Yoan Mollard, Pierre-Yves Oudeyer
Venue: arXiv
Year Published: 2017
Keywords: learning from demonstration
Expert Opinion: The paper shows how an agent/robot can find out by itself how to manipulate its environment from a simple intrinsic motivation. In my eyes, this is the first practical demonstration of Schmidhuber's idea of learning progress maximization, which is probably one of the most powerful generic drives. The paper shows how an agent can discover more and more complex interactions with an environment without a specific task in mind. I believe these kinds of studies are important steps towards an intelligently learning robot.