
a new approach to linear filtering and prediction problems

Author(s): R. E. Kalman
Venue: Transactions of the ASME–Journal of Basic Engineering
Year Published: 1960
Keywords: probabilistic models, optimal control, dynamical systems, state estimation
Expert Opinion: It is important to point out that this is a Bayesian (probabilistic) approach, proposed long before Bayesian approaches became popular in ML.
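
The predict/update cycle at the heart of the filter can be sketched in a few lines. This is a scalar toy, not the paper's formulation: the transition coefficient and noise variances below are invented for illustration.

```python
def kalman_step(x, p, z, a=1.0, q=0.01, r=0.25):
    """One predict/update cycle for a scalar state.

    x, p : prior state estimate and its variance
    z    : new measurement
    a    : state transition coefficient (toy value)
    q, r : process and measurement noise variances (toy values)
    """
    # Predict: propagate the estimate and grow its uncertainty.
    x_pred = a * x
    p_pred = a * p * a + q
    # Update: blend prediction and measurement via the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new

# Track a constant signal (true value 1.0) from noisy measurements.
x, p = 0.0, 1.0
for z in [1.1, 0.9, 1.05, 0.98, 1.02]:
    x, p = kalman_step(x, p, z)
```

The Bayesian reading: (x, p) is a Gaussian belief over the state, and each measurement shrinks the posterior variance p while pulling the mean toward the evidence.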

the coordination of arm movements: an experimentally confirmed mathematical model

Author(s): Tamar Flash, Neville Hogan
Venue: Journal of Neuroscience
Year Published: 1985
Keywords: optimal control, cognitive sciences, dynamical systems
Expert Opinion: This paper is part of a set of papers that outlines important points about synergy formation in neuroscience and robotics along a common thread running through the last 40 years: a) the coordination of multiple joints in goal-directed movement, b) the characterization of "biological motion", c) the equilibrium point hypothesis, d) the role of force fields for motor coordination, e) the extension of the equilibrium point hypothesis from real (overt) movements to imagined (covert) movements, f) the characterization of synergy formation as the simulation of an internal body model. In particular, this paper explained the bell shape of the speed profile in human arm reaching movements in terms of optimal control. The observation was first made in Morasso P (1981), Spatial control of arm movements, Experimental Brain Research.
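
The bell-shaped speed profile falls out of the minimum-jerk solution the paper derives. Below is a sketch for a unit-duration, unit-amplitude point-to-point reach, using the well-known closed form x(t) = 10t^3 - 15t^4 + 6t^5 (the sampling grid is arbitrary).

```python
def min_jerk_speed(t):
    """Speed along the minimum-jerk reach x(t) = 10t^3 - 15t^4 + 6t^5, t in [0, 1].

    Differentiating gives v(t) = 30t^2 - 60t^3 + 30t^4 = 30 t^2 (1 - t)^2,
    which is zero at both endpoints and symmetric about the midpoint:
    the bell shape observed in human reaching.
    """
    return 30 * t**2 - 60 * t**3 + 30 * t**4

# Sample the profile over the movement.
speeds = [min_jerk_speed(i / 100) for i in range(101)]
```

The peak sits exactly at the movement's midpoint, matching the symmetric, single-peaked velocity profiles reported for human arm reaches.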

vehicles: experiments in synthetic psychology

Author(s): Valentino Braitenberg
Venue: MIT Press
Year Published: 1986
Keywords: cognitive sciences
Expert Opinion: This book gave rise to the concept of Braitenberg Vehicles. It follows a sequence of simple thought experiments, starting with the idea that light sensors attached to motors can generate a machine that can simulate a desire for or aversion to light. These concepts are made progressively more complex, ranging up to machines that can display sophisticated behaviours. This is an interesting foundational take on emergent intelligence.
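
The simplest of these thought experiments can be sketched directly. The geometry, gains, and light model below are all invented for illustration; the one idea taken from the book is that crossed versus uncrossed sensor-to-motor wiring flips the machine between approaching and avoiding a light.

```python
import math

def turn_direction(vehicle_pos, heading, light_pos, crossed):
    """Sign of the vehicle's turn: positive = turns left, negative = turns right.

    Two light sensors, mounted at +/-45 degrees from the heading, each
    excite one wheel. Uncrossed wiring drives the wheel on the sensor's
    own side; crossed wiring drives the opposite wheel.
    """
    def intensity(sensor_angle):
        # Sensor position one unit from the vehicle's center (toy geometry).
        sx = vehicle_pos[0] + math.cos(heading + sensor_angle)
        sy = vehicle_pos[1] + math.sin(heading + sensor_angle)
        d2 = (sx - light_pos[0]) ** 2 + (sy - light_pos[1]) ** 2
        return 1.0 / (1.0 + d2)  # brighter when closer (toy falloff)

    left, right = intensity(math.pi / 4), intensity(-math.pi / 4)
    left_wheel, right_wheel = (right, left) if crossed else (left, right)
    # A faster right wheel turns the vehicle left, and vice versa.
    return right_wheel - left_wheel
```

With a light placed ahead and to the vehicle's left, the crossed machine turns toward it ("aggression") and the uncrossed one turns away ("fear"), purely from the wiring.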

on the adaptive control of robot manipulators

Author(s): J.J. Slotine and W. Li
Venue: International Journal of Robotics Research
Year Published: 1987
Keywords: dynamical systems, manipulation
Expert Opinion: I selected this paper for several reasons: 1) to remind us that learning system dynamics while simultaneously performing control is an old control topic, known as adaptive control, and that fundamental results in adaptive control, especially its fundamental limitations, will certainly carry over to other learning problems (parametric or not) when learning and control time-scales are mixed; 2) it is an incredibly elegant paper, its results have since been used on real robotic systems to provide fast adaptation to changing dynamics (e.g. the plane-catching experiments done at MIT using adaptive control in the early 90s), and they have also been extended to use non-parametric models (e.g. neural networks and Gaussian radial basis functions) in subsequent papers.
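
The core idea, estimating an unknown parameter while controlling with the current estimate, can be shown on a deliberately tiny example. This is a toy first-order plant of our own, not the paper's manipulator dynamics; only the flavor of the Lyapunov-based adaptation law is kept.

```python
def simulate(a_true=2.0, lam=2.0, gamma=5.0, dt=0.001, steps=5000):
    """Regulate xdot = a*x + u to x = 0 without knowing a (here a = 2, unstable).

    Control uses the current estimate a_hat (certainty equivalence), and
    a_hat evolves by a Lyapunov-motivated law a_hat_dot = gamma * e * x,
    with tracking error e = x since the target is zero.
    """
    x, a_hat = 1.0, 0.0              # state and parameter estimate
    for _ in range(steps):
        u = -a_hat * x - lam * x     # cancel estimated dynamics, add damping
        x_dot = a_true * x + u       # the plant itself uses the true a
        a_hat += gamma * x * x * dt  # adaptation law (Euler step)
        x += x_dot * dt              # plant update (Euler step)
    return x, a_hat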

alvinn: an autonomous land vehicle in a neural network

Author(s): Dean A. Pomerleau
Venue: MITP
Year Published: 1989
Keywords: mobile robots, learning from demonstration, neural networks
Expert Opinion: On the theoretical side, the first paper to recognize covariate shift in imitation learning and provide a simple data-augmentation style strategy to improve it. On the implementation side, a real self-driving first that led to "No Hands Across America".

using local models to control movement

Author(s): Christopher G. Atkeson
Venue: Advances in Neural Information Processing Systems
Year Published: 1990
Keywords: cognitive sciences
Expert Opinion: Seminal work on using learned local linear models for robot learning.

automatic programming of behavior-based robots using reinforcement learning

Author(s): Sridhar Mahadevan and Jonathan Connell
Venue: Artificial Intelligence
Year Published: 1991
Keywords: reinforcement learning
Expert Opinion: This is a seminal paper that really broke open the application of RL to learning policies. It was the first that I am aware of, and certainly the earliest high-profile case. It blew a lot of minds when it came out.

forward models: supervised learning with a distal teacher

Author(s): Michael I. Jordan, David E. Rumelhart
Venue: Cognitive Science
Year Published: 1992
Keywords: dynamical systems, neural networks
Expert Opinion: This landmark paper provides a very good exposition of learning internal models of the environment for selecting actions. The idea of a distal teacher, sitting in between learning from demonstration (i.e. the traditional supervised way of learning policies) and reinforcement learning (which relies on environment rewards), is to provide supervision for learning policies by matching the predictions of a forward model with observations of the expert's demonstration. The paper also summarizes the advantages and challenges of learning both forward and inverse models. Another fantastic and related paper is: MOSAIC Model for Sensorimotor Learning and Control.
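
The two-stage scheme can be illustrated with a deliberately linear toy (the plant gain, learning rates, and probe inputs below are all made up): first fit a forward model of the plant, then train the policy by differentiating the distal error through the frozen forward model.

```python
def distal_teacher(plant_gain=3.0, lr=0.05, steps=200):
    """Linear toy of the distal-teacher idea.

    Stage 1 fits a forward model y_hat = w * u from interaction data.
    Stage 2 trains a policy u = v * x so the *predicted* outcome matches
    the desired outcome y* = x, backpropagating through the frozen model.
    """
    # Stage 1: learn the forward model from (action, outcome) pairs.
    w = 0.0
    for _ in range(steps):
        u = 1.0                       # a probe action
        y = plant_gain * u            # observed plant response
        w += lr * (y - w * u) * u     # gradient step on (y - w*u)^2 / 2

    # Stage 2: the distal error lives in outcome space; the forward
    # model translates it into a gradient on the policy parameter.
    v = 0.0
    for _ in range(steps):
        x = 1.0                       # task input; desired outcome y* = x
        err = w * (v * x) - x         # predicted outcome minus desired
        v -= lr * err * w * x         # chain rule through the forward model
    return w, v
```

If all goes well the policy inverts the plant: the learned v approaches 1/plant_gain, so plant_gain * v is close to 1 and the executed outcome matches the desired one.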

coordination of multiple behaviors acquired by vision-based reinforcement learning

Author(s): Minoru Asada, Eiji Uchibe, Shoichi Noda, Sukoya Tawaratsumida, Koh Hosoda
Venue: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Year Published: 1994
Keywords: reinforcement learning, dynamical systems
Expert Opinion: This paper is representative of a body of work from Minoru Asada's lab in the mid-1990s that provided some of the first examples of machine learning in the robot soccer domain, some of the first reinforcement learning on real robots, some of the first RL from visual inputs, and some of the first multirobot learning, all wrapped into one. They introduced several methods for aggressively reducing the sample complexity needed for learning on real robots, such as "learning from easy missions."

evolution of corridor following behavior in a noisy world

Author(s): Craig W. Reynolds
Venue: International Conference on Simulation of Adaptive Behavior
Year Published: 1994
Keywords: genetic algorithms, evolution
Expert Opinion: The work features the automatic synthesis of a symbolic robot controller in a non-deterministic environment via genetic programming. Despite being an early paper on robot learning, it features a combination of many aspects that are often not found in modern papers, i.e., (1) learning of explainable, symbolic code, (2) automatic sensor placement, (3) strong non-determinism. Reynolds even goes to great lengths to analyse the code generated by the evolutionary process and identifies a more general framework for what a good solution looks like. Structure and interpretability play an important role in this paper.

evolving virtual creatures

Author(s): Karl Sims
Venue: SIGGRAPH
Year Published: 1994
Keywords: dynamical systems, genetic algorithms
Expert Opinion: This paper demonstrated that machine learning and optimization do not have to be restricted to generating the behavior of a robot. Rather, the morphology and shape of an agent can be changed and optimized in an automatic fashion, too. In doing so, the paper created some of the first complex (and extremely impressive, life-like) examples of artificial creatures whose brain and body are fully synthesized. The video accompanying this paper is one of the best research videos out there. The paper has also spawned a number of follow-ups, in particular by the group of Hod Lipson at Columbia.

adaptive representation of dynamics during learning a motor task

Author(s): Reza Shadmehr and Ferdinando A. Mussa-Ivaldi
Venue: The Journal of Neuroscience
Year Published: 1994
Keywords: dynamical systems, visual perception, planning
Expert Opinion: The reason I picked these articles and books is that I think robot learning cannot be separated from the cognitive architecture supporting the learning processes. The first two references highlight the importance and role of embodiment (in humans and robots) and the fact that in physical systems part of the learning process is embedded in the morphology and material.

optimal control and estimation

Author(s): Robert Stengel
Venue: Book
Year Published: 1994
Keywords: optimal control, state estimation
Expert Opinion: Robot learning practitioners must be aware of and understand optimal control. :)

a robot controller using learning by imitation

Author(s): Gillian Hayes, Yiannis Demiris
Venue: Neural Information Processing Systems Conference (NeurIPS)
Year Published: 1995
Keywords: reinforcement learning, learning from demonstration, dynamical systems
Expert Opinion: This paper introduced learning from imitation. This has proved useful in and of itself (i.e., as a means for non-expert users to program robots), and also as a means for initializing robot controllers (most prominently by Schaal and later Peters) to a reasonable policy that is later refined by learning. Schaal's work on this was probably more influential, but it was preceded, and possibly inspired, by Gillian Hayes's work.

reinforcement learning: a survey

Author(s): Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore
Venue: Journal of Artificial Intelligence Research
Year Published: 1996
Keywords: neural networks, survey, reinforcement learning, probabilistic models
Expert Opinion: This work provides a relatively short and easy-to-understand introduction to reinforcement learning. Although rather old, and therefore not covering the newer approaches to reinforcement learning, it covers the problem of RL very well. I usually ask beginning students interested in reinforcement learning to read this paper together with the more recent "Reinforcement Learning in Robotics: A Survey" by Jens Kober, Andrew Bagnell, and Jan Peters, as well as deep learning approaches to reinforcement learning.

active learning for vision based grasping

Author(s): Marcos Salganicoff, Lyle H. Ungar, Ruzena Bajcsy
Venue: Machine Learning, 23, 251-278
Year Published: 1996
Keywords: manipulation
Expert Opinion: This work is, as far as I can tell, the first to introduce active learning and forgetting for the perception-action paradigm. It grew out of Marcos Salganicoff's 1992 PhD thesis; the paper cited here (in fact one of several) was published later. The new approach allowed the learner to control when and where in the input space new examples should be gathered. It balances the cost of gathering the experiences (exploration) with the cost of misclassification and the execution of the task (exploitation).
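
The payoff of letting the learner choose its queries can be seen in a toy 1-D illustration (this is not Salganicoff's algorithm, just the general principle): always querying the most uncertain point finds a decision boundary with logarithmically few labels, where passive random sampling would need linearly many.

```python
def active_threshold(oracle, lo=0.0, hi=1.0, tol=1e-3):
    """Locate a 1-D decision boundary by querying where uncertainty is highest.

    oracle(x) answers a label request: True iff x is past the threshold.
    The learner always queries the midpoint of its uncertainty interval,
    halving that interval with every label (i.e., binary search).
    """
    queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if oracle(mid):
            hi = mid
        else:
            lo = mid
        queries += 1
    return (lo + hi) / 2, queries

# Recover a hidden threshold at 0.37 to 1e-3 precision.
boundary, n = active_threshold(lambda x: x >= 0.37)
```

Ten labels suffice for 1e-3 precision on the unit interval, versus roughly a thousand randomly placed labels for the same resolution, which is the exploration-cost argument the paper formalizes.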
