VeriDream: strengthening teaching for robots through dreams


Artificial intelligence helps robots perform tasks autonomously. They can learn and train while disconnected, working through simulations as if they were dreaming. However, differences from real-life situations and the infinite number of choices that can be made in real life limit the effectiveness of these methods. The VeriDream project, supported by the European Innovation Council (EIC), sets out to alleviate this problem.

A new European project aims to make robots dream, although no one knows for sure whether electric sheep are involved. Stéphane Doncieux, a professor at Sorbonne University and deputy director of the Institute of Intelligent Systems and Robotics (ISIR, CNRS/Sorbonne University), is leading the French contribution to the VeriDream project¹. This large-scale project has received funding from the EIC. "We help robots to learn through simulation phases," explains Stéphane Doncieux. "This consolidation is disconnected from reality and plays a role attributed to sleep and dreams in humans."

VeriDream follows on from the Dream project², which ran from 2015 to 2018 and used open-ended learning methods to help robots adapt to different situations. "We tell robots what to do but we don't tell them how," explains Stéphane Doncieux, who coordinated the initial project. "It is up to the robots to explore the most appropriate sequences of actions to find the right solution." The system works by rewarding robots that make the right choices, so that a policy is gradually established.

  • ¹ Vertical innovation in the domain of robotics enabled by artificial intelligence methods
  • ² Deferred restructuring of experience in autonomous machines
A policy is a function which tells a robot what to do when it finds itself in a given situation.
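
As a rough illustration of this definition, the sketch below builds a policy as a plain function from a situation to an action, chosen according to the rewards each action earned. The situations, actions and reward rule are hypothetical and are not taken from the Dream or VeriDream projects.

```python
# Hypothetical actions a robot arm could take (illustrative only).
ACTIONS = ["push", "throw"]


def reward(situation, action):
    # Assumed toy rule: pushing is right for a nearby object,
    # throwing is right for a distant target.
    right_choice = {"object_near": "push", "target_far": "throw"}
    return 1.0 if right_choice[situation] == action else 0.0


def learn_policy(situations):
    # Try every action in every situation and record the reward it earns.
    totals = {(s, a): reward(s, a) for s in situations for a in ACTIONS}
    # For each situation, remember the action that earned the most reward.
    table = {s: max(ACTIONS, key=lambda a: totals[(s, a)]) for s in situations}

    def policy(situation):
        # The policy itself: given a situation, return what to do.
        return table[situation]

    return policy


policy = learn_policy(["object_near", "target_far"])
print(policy("object_near"), policy("target_far"))
```

Real robots cannot try every action in every situation, which is why the exploration described in this article matters; the exhaustive loop here only keeps the example short.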
The Baxter robot from the DREAM project. Photo credit: Philippe Gauthier / ISIR

However, how effective a robot's behaviour is depends on the quality of its representations, namely the information it gathers from its environment. Each machine also has its own physical limitations on the actions it can perform. Moreover, when robots are faced with too many possibilities, they find it difficult to make choices and to adjust them.

The Dream project therefore took its inspiration from human and animal development while adding a reinforcement learning phase with simulations. Robots can train without being switched on, which saves considerable time and money while also avoiding wear on the machines. Robots with arms were given simple object manipulation tasks such as throwing or pushing…


VeriDream has taken up the same principle as the Dream project, but has opted to apply it to an industrial context. "Learning always takes place in a certain environment and simulations do not necessarily correspond to the real conditions the robot will be operating in," explains Stéphane Doncieux. To solve this issue, researchers will attempt to automatically detect failures in manually defined policies before the robot is even confronted with them.

We break down learning based on simulation into several stages combined with interactions with the real world.

"We test a policy generated beforehand then disconnect from reality to analyse what happened and explore new alternatives," continues Stéphane Doncieux. This work is based on evolutionary methods which test a neural network's behaviour without knowing what the ideal solution might be. The algorithms generate variations of a policy and then select the most interesting among these. This is very different from the supervised learning method currently very much in vogue in which we know what a network should do and correct it accordingly.

The project is coordinated by the DLR¹ in Munich and involves academic partners from Sorbonne University and ENSTA², along with the international companies Magazino, Synesis and GoodAI.

  • ¹ Deutsches Zentrum für Luft- und Raumfahrt (German Aerospace Center), the national aeronautics and space research centre
  • ² École Nationale Supérieure de Techniques Avancées (National School of Advanced Techniques, Paris-Saclay University)
Magazino's mobile robot autonomously selects, transports and stores items like boxes of shoes in a warehouse. Photo credit: Magazino GmbH

Contact

Stéphane Doncieux
Professor at Sorbonne Université, member of ISIR