
efficient reinforcement learning with relocatable action models

Author(s): Bethany R. Leffler, Michael L. Littman, Timothy Edmunds
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2007
Keywords: reinforcement learning, learning from demonstration, dynamical systems
Expert Opinion: This paper from 2007 was the culmination of a thread of reinforcement learning research on relocatable action models. The premise is that states can be clustered into types such that actions taken in states of the same type have similar effects. This paper was the first to study an implementation of this idea on a real robot, and it shows impressive results. It's a great example of the creative robot demonstrations from Michael Littman's lab at a time when most learning was limited to simulation.
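
The core idea lends itself to a compact sketch. Below is a minimal, hypothetical Python illustration (the class and method names are mine, not the paper's): outcomes are recorded per state type rather than per state, so experience gathered anywhere transfers to every state of the same type.

    from collections import defaultdict
    import random

    class RelocatableActionModel:
        def __init__(self, typer):
            self.typer = typer                        # maps state -> type label
            self.counts = defaultdict(lambda: defaultdict(int))

        def update(self, state, action, next_state):
            # Store the *relative* outcome (a displacement), not the
            # absolute next state, keyed by (state type, action).
            outcome = tuple(n - s for n, s in zip(next_state, state))
            self.counts[(self.typer(state), action)][outcome] += 1

        def predict(self, state, action):
            # Sample an outcome from the per-type distribution and
            # relocate it onto the queried state.
            dist = self.counts[(self.typer(state), action)]
            if not dist:
                return state
            outcomes, weights = zip(*dist.items())
            delta = random.choices(outcomes, weights=weights)[0]
            return tuple(s + d for s, d in zip(state, delta))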

domain randomization for transferring deep neural networks from simulation to the real world

Author(s): Josh Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, Pieter Abbeel
Venue: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Year Published: 2017
Keywords: visual perception, dynamical systems, neural networks, reinforcement learning
Expert Opinion: The work focuses on one of the most important problems in applying CNNs to robotics: transferring policies from simulation to the real world. An effective solution is presented, together with promising results.
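
For readers new to the idea, here is a minimal sketch of the domain-randomization training loop, assuming a hypothetical simulator API (set_texture, set_light_position, and so on are illustrative stand-ins, not the paper's code): nuisance parameters are resampled every episode so the policy never overfits to any single appearance, and the real world ends up looking like just one more variation.

    import random

    def randomize(sim):
        sim.set_texture(random.choice(sim.texture_bank))   # surface textures
        sim.set_light_position(random.uniform(-1, 1),
                               random.uniform(-1, 1),
                               random.uniform(1, 3))       # lighting
        sim.set_camera_noise(random.uniform(0.0, 0.05))    # sensor noise

    def train(sim, policy, episodes):
        for _ in range(episodes):
            randomize(sim)              # new visual variation each episode
            rollout = sim.run_episode(policy)
            policy.update(rollout)      # any RL/supervised update works here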

reinforcement learning of motor skills with policy gradients

Author(s): Jan Peters, Stefan Schaal
Venue: Neural Networks (special issue on Robotics and Neuroscience)
Year Published: 2008
Keywords: manipulation, policy gradients, reinforcement learning, dynamical systems
Expert Opinion: This paper presents some of the early methods for learning high-dimensional parameters, which is important in robot manipulation. It is a non-trivial combination of the authors' earlier work on learning motor skills; beyond showing how reinforcement learning can be used to learn high-dimensional parameters, it also provides a good short summary of relevant work (up to that point).
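
For reference, the likelihood-ratio ("REINFORCE"-style) gradient at the heart of the methods surveyed can be written as follows (standard form; the notation is mine, with R_t the return from time t and b a variance-reducing baseline):

    \nabla_\theta J(\theta)
      = \mathbb{E}_{\tau \sim \pi_\theta}\Big[
          \sum_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,(R_t - b)
        \Big],
    \qquad R_t = \sum_{t' \ge t} r_{t'}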

supersizing self-supervision: learning to grasp from 50k tries and 700 robot hours

Author(s): Lerrel Pinto, Abhinav Gupta
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2016
Keywords: manipulation, reinforcement learning, neural networks
Expert Opinion: This paper demonstrated that it's possible to have a robot interact with the environment in a self-supervised way in order to learn useful tasks, like grasping. By running a robot for a long period of time, it's possible to collect enough data to train policies using simple algorithms. This led the way for a lot of follow-up work from Google and others, and it is likely an area where we'll see a lot of interest in the future.
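
The data-collection recipe is simple enough to sketch. The following hypothetical Python loop (robot, camera, and model are illustrative stand-ins, not the paper's code) shows the self-supervision: the robot labels its own grasp attempts from its gripper's feedback, so no human annotation is needed.

    def collect_and_train(robot, camera, model, n_attempts=50_000):
        data = []
        for _ in range(n_attempts):
            image = camera.capture()
            x, y, theta = robot.sample_grasp(image)       # grasp proposal
            success = robot.execute_grasp(x, y, theta)    # gripper force check
            data.append((image, (x, y, theta), success))  # self-labeled
            if len(data) % 1000 == 0:
                model.fit(data)  # simple supervised update on grasp success
        return model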

an introduction to deep reinforcement learning

Author(s): Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Venue: Foundations and Trends in Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: There have been astounding achievements in deep reinforcement learning in recent years, with complex decision-making problems suddenly becoming solvable. This book is written by experts in the field and, on top of that, it is free!

reinforcement learning and optimal control

Author(s): Dimitri P. Bertsekas
Venue: Book
Year Published: 2019
Keywords: reinforcement learning, optimal control, dynamic programming, neural networks
Expert Opinion: An accessible take on reinforcement learning that pairs well with Bertsekas's classic and influential books on dynamic programming.

one-shot imitation learning

Author(s): Yan Duan, Marcin Andrychowicz, Bradly C. Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2017
Keywords: learning from demonstration, reinforcement learning, neural networks
Expert Opinion: This paper focuses on a very challenging problem in robot learning: obtaining a generalized policy for a task from very limited user supervision. The presented framework has great potential for designing robot learning algorithms with realistic data expectations.
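
At the interface level, the idea reduces to conditioning a policy on a single demonstration. A minimal sketch, with embed and act as hypothetical stand-ins for the paper's learned networks:

    def one_shot_policy(embed, act):
        def policy(demo, observation):
            context = embed(demo)             # summarize the whole demonstration
            return act(context, observation)  # condition the action on it
        return policy

    # Usage: train embed and act end-to-end over many (demo, trajectory)
    # pairs from *different* task instances, then at test time hand the
    # policy one demo of a new instance and roll it out directly.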

reinforcement learning for robot soccer

Author(s): Martin Riedmiller, Thomas Gabel, Roland Hafner, Sascha Lange
Venue: Autonomous Robots
Year Published: 2009
Keywords: reinforcement learning, neural networks
Expert Opinion: This paper describes a decade's worth of research on reinforcement learning in the RoboCup competition by the group led by Martin Riedmiller. In essence, it describes the foundation of deep reinforcement learning before the modern hype started. This includes algorithms that have become a staple of deep RL, such as neural fitted Q-iteration. However, what makes this paper stand out even today is that it shows how to put together an entire system only out of RL-synthesized components. Martin and his team repeatedly won world championships using this approach, and he later went on to become an influential figure at DeepMind.
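
Neural fitted Q-iteration itself is simple to sketch: repeatedly regress a Q-function onto bootstrapped Bellman targets computed over a fixed batch of transitions. The hypothetical Python outline below assumes q_net is any regressor exposing fit/predict (the names are mine, not the paper's):

    def fitted_q_iteration(q_net, transitions, actions, gamma=0.99, iters=100):
        for _ in range(iters):
            inputs, targets = [], []
            for (s, a, r, s_next, done) in transitions:
                best_next = 0.0 if done else max(
                    q_net.predict(s_next, a2) for a2 in actions)
                inputs.append((s, a))
                targets.append(r + gamma * best_next)  # Bellman target
            q_net.fit(inputs, targets)  # full-batch supervised regression
        return q_net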

pilco: a model-based and data-efficient approach to policy search

Author(s): Marc Peter Deisenroth, Carl Edward Rasmussen
Venue: International Conference on Machine Learning
Year Published: 2011
Keywords: state estimation, reinforcement learning, probabilistic models, gaussians, dynamical systems, visual perception, policy gradients
Expert Opinion: The paper by Marc Deisenroth and Carl Rasmussen promotes the use of Gaussian processes (GPs) for model-based reinforcement learning and proposes the PILCO algorithm, one of the most influential algorithms in recent reinforcement learning. GPs are by now heavily used in the control and robotics communities. While this paper wasn't the first to use GPs in this context, it's arguably one of the most influential ones. Moreover, this work addresses the problem of data efficiency in RL, which is of crucial importance for RL in the real world (such as in robotics). The PILCO algorithm has since been used in many different applications and extended in many ways. I consider the PILCO algorithm (or the underlying approach to model-based RL) to be one of the state-of-the-art methods in modern RL.
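
The PILCO loop can be summarized in a few lines. This is a high-level paraphrase, not the authors' code; gp_fit, predict_return, and optimize are hypothetical stand-ins for the GP regression, the uncertainty-propagating policy evaluation, and the gradient-based policy improvement described in the paper:

    def pilco_loop(env, policy, gp_fit, predict_return, optimize, n_trials=10):
        data = []
        for _ in range(n_trials):
            data += env.rollout(policy)   # one real interaction per trial
            dynamics = gp_fit(data)       # GP posterior over f(s, a)
            # Improve the policy entirely inside the learned model,
            # propagating the GP's predictive uncertainty through the
            # whole planning horizon.
            policy = optimize(policy, lambda p: predict_return(dynamics, p))
        return policy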

maximum entropy inverse reinforcement learning

Author(s): Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell, Anind K. Dey
Venue: AAAI Conference on Artificial Intelligence
Year Published: 2008
Keywords: probabilistic models, learning from demonstration, reinforcement learning
Expert Opinion: This work is one of the first to connect probabilistic inference with robot policy learning. Maximum Entropy Inverse Reinforcement Learning poses the classical Inverse Reinforcement Learning problem, well-studied for several years before this work, as maximizing the likelihood of an observed state distribution under an agent that is noisily optimal with respect to an unknown reward function. The inference method, model, and general principles not only inspired future IRL works (such as RelEnt-IRL, GP-IRL, and Guided Cost Learning); they have also been applied in Human-Robot Interaction and general policy search algorithms.
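
In rough form, the model and its learning rule are as follows (with f_tau the feature counts of trajectory tau, \tilde{f} their empirical mean over demonstrations, and Z(theta) the partition function; notation approximately follows the paper):

    P(\tau \mid \theta) = \frac{\exp(\theta^{\top} f_{\tau})}{Z(\theta)},
    \qquad
    \nabla_\theta \log \mathcal{L}(\theta)
      = \tilde{f} - \sum_{\tau} P(\tau \mid \theta)\, f_{\tau}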

policy gradient reinforcement learning for fast quadrupedal locomotion

Author(s): Nate Kohl, Peter Stone
Venue: IEEE International Conference on Robotics and Automation (ICRA)
Year Published: 2004
Keywords: reinforcement learning, policy gradients, locomotion, legged robots
Expert Opinion: The paper is one of the first impressive applications of policy gradient algorithms on real robots. The policy gradient algorithm is rather simple, but is able to optimize the gait of the AIBO robot efficiently.
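
To give a flavor of how simple such a method can be, here is a crude finite-difference sketch in the spirit of (though simpler than) the paper's algorithm; evaluate is a hypothetical stand-in for a timed walk on the robot:

    import random

    def finite_difference_step(params, evaluate, eps=0.05, lr=0.1, n=15):
        grad = [0.0] * len(params)
        for _ in range(n):
            # Perturb each gait parameter by -eps, 0, or +eps at random.
            signs = [random.choice((-1, 0, 1)) for _ in params]
            perturbed = [p + s * eps for p, s in zip(params, signs)]
            score = evaluate(perturbed)        # e.g. measured walking speed
            for i, s in enumerate(signs):
                grad[i] += s * score / n       # crude per-dimension estimate
        return [p + lr * g for p, g in zip(params, grad)]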

cad2rl: real single-image flight without a single real image

Author(s): Fereshteh Sadeghi, Sergey Levine
Venue: Robotics: Science and Systems Conference
Year Published: 2017
Keywords: neural networks, reinforcement learning, mobile robots
Expert Opinion: The CAD2RL paper demonstrated that it was possible to train a policy, using reinforcement learning, entirely in simulation and zero-shot transfer it to a real-world robot. It paved the way for a lot of follow-up work on domain transfer and domain randomization for robotic perception and control.

biped dynamic walking using reinforcement learning

Author(s): Hamid Benbrahim
Venue: Ph.D. thesis, University of New Hampshire
Year Published: 1997
Keywords: policy gradients, neural networks, reinforcement learning, dynamical systems, legged robots
Expert Opinion: A totally overlooked work that employs policy gradients with multiple nested CMAC neural networks. One could credit Benbrahim with having done much of OpenAI's work twenty years earlier.
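
For readers unfamiliar with CMACs, here is a rough Python sketch of the tile-coding idea (my illustration, not the thesis code): several offset tilings hash an input to coarse cells, and the prediction is the sum of the cells' learned weights.

    from collections import defaultdict

    class CMAC:
        def __init__(self, n_tilings=8, resolution=0.1, lr=0.1):
            self.n, self.res, self.lr = n_tilings, resolution, lr
            self.weights = defaultdict(float)

        def _tiles(self, x):
            for t in range(self.n):
                offset = t * self.res / self.n   # each tiling is shifted
                yield (t, tuple(int((xi + offset) / self.res) for xi in x))

        def predict(self, x):
            return sum(self.weights[tile] for tile in self._tiles(x))

        def update(self, x, target):
            error = target - self.predict(x)
            for tile in self._tiles(x):
                self.weights[tile] += self.lr * error / self.n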

a review of robot learning for manipulation: challenges, representations, and algorithms

Author(s): Oliver Kroemer, Scott Niekum, George Konidaris
Venue: arXiv
Year Published: 2019
Keywords: survey, probabilistic models, manipulation, reinforcement learning
Expert Opinion: This paper presents an incredibly extensive recent survey on learning for robot manipulation (440 citations!). Surveys are always useful, especially for new grad students. This one presents a single framework to formalise the robot manipulation problem.

hindsight experience replay

Author(s): Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 2017
Keywords: manipulation, humanoid robotics, reinforcement learning, neural networks
Expert Opinion: HER addresses the issue of sample inefficiency in DRL, especially for problems with sparse, binary reward functions. It has become one of the most effective algorithms for learning problems with multiple goals, and it has the potential to solve many challenging manipulation tasks. The idea that "EVERY experience is a good experience for SOME task" is a powerful insight that succinctly reflects how we teach our children to be lifelong learners. We should teach our robots the same way.
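
The relabeling trick is small enough to sketch in full. Below is a minimal version of HER's "future" strategy (reward_fn and the transition layout are illustrative assumptions, not the paper's code): each transition is stored once with the intended goal and again pretending the goal was a state actually reached later, so even failed episodes carry learning signal.

    import random

    def hindsight_relabel(episode, reward_fn, k=4):
        relabeled = []
        for t, (s, a, s_next, goal) in enumerate(episode):
            relabeled.append((s, a, reward_fn(s_next, goal), s_next, goal))
            # "future" strategy: substitute goals achieved later in the
            # same episode.
            for _ in range(k):
                _, _, future_state, _ = episode[random.randint(t, len(episode) - 1)]
                relabeled.append((s, a, reward_fn(s_next, future_state),
                                  s_next, future_state))
        return relabeled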

guided policy search

Author(s): Sergey Levine, Vladlen Koltun
Venue: International Conference on Machine Learning
Year Published: 2013
Keywords: planning, trajectory optimization, reinforcement learning, neural networks
Expert Opinion: This paper, as well as its successors, tries to make learning complex behaviors from experience more tractable when only little data is available (which is of course a common situation for learning robots). In particular, I like that the paper combines well-established planning methods that have long been studied in AI and robotics with learning methods, establishing a new procedure that combines advantages from both worlds.
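
The alternation can be sketched compactly. This is my coarse paraphrase, with trajopt and regress as hypothetical stand-ins for the trajectory optimizer and the supervised policy-fitting step:

    def guided_policy_search(trajopt, regress, policy, tasks, iters=10):
        for _ in range(iters):
            # Planning step: an optimizer with privileged state access
            # produces "guiding" trajectories for each task instance.
            guides = [trajopt(task, policy) for task in tasks]
            # Supervised step: fit the general policy to the guiding
            # trajectories' (observation, action) pairs, pooled together.
            pairs = [(obs, act) for g in guides for (obs, act) in g]
            policy = regress(policy, pairs)
        return policy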

natural actor-critic

Author(s): Jan Peters, Sethu Vijayakumar, and Stefan Schaal
Venue: European Conference on Machine Learning
Year Published: 2005
Keywords: policy gradients, reinforcement learning
Expert Opinion: This paper established policy-search approaches as the algorithms most naturally (!) suited for motor skill learning, in part by pairing them with DMPs. That combination was a major advance and accounts for almost all notable successful instances of robot motor skill learning until deep nets. All existing robot policy learning algorithms - including the latest, most fashionable deep approaches - are technically descended from this algorithm, and many researchers were inspired by its immense success.
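
For reference, the "natural" in the title refers to the Fisher-preconditioned gradient, which follows the steepest-ascent direction in distribution space rather than in raw parameter space (standard form; notation mine):

    \tilde{\nabla}_\theta J(\theta) = F(\theta)^{-1}\, \nabla_\theta J(\theta),
    \qquad
    F(\theta) = \mathbb{E}\big[ \nabla_\theta \log \pi_\theta(a \mid s)\,
                \nabla_\theta \log \pi_\theta(a \mid s)^{\top} \big]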

deep reinforcement learning

Author(s): Yuxi Li
Venue: Under review for Morgan & Claypool: Synthesis Lectures in Artificial Intelligence and Machine Learning
Year Published: 2018
Keywords: neural networks, reinforcement learning, policy gradients, learning from demonstration
Expert Opinion: A huge bottleneck in Robot Learning is supervision during training. A breakthrough came when DNNs could be leveraged for reinforcement learning. This paper, I believe, is a good introduction.
