Robots that Learn

Infants are born with the ability to imitate what other people do. Here, a ten-minute-old newborn sees another person stick their tongue out for the first time; in response, he sticks his own tongue out. Imitation allows humans to learn new behaviors rapidly. We would like our robots to be able to learn this way, too.

We've built a proof-of-concept system, trained entirely in simulation. We teach the robot a new task by demonstrating how to assemble block towers in the way we desire, in this case a single stack of six virtual blocks. The robot has previously seen other examples of manipulating blocks, but not this particular task. After a single demonstration it has learned the task, even though its movements have to differ from those in the demonstration, and it can replicate the task from a range of initial conditions. Teaching the robot a different block arrangement requires only a single new demonstration.

Here's how our system works. The robot perceives the environment with its camera and manipulates the blocks with its arm. At its core are two neural networks working together: the camera image is first processed by the vision network, and then, based on the recorded demonstration, the imitation network figures out what action to take next.

Our vision network is a deep neural net that takes a camera image and determines the positions of the blocks relative to the robot. To train it, we use only simulated data, applying domain randomization to learn a robust vision model: we generate thousands of different object locations, light settings and surface textures, and show these examples to the network. After training, the network can find the blocks in the physical world, even though it has never seen images from a real camera before.

Once the blocks have been located, the imitation network takes over. Its goal is to mimic the task shown by the demonstrator: the network is trained to predict what action the demonstrator would have taken in the same situation. On its own, it has learned to scan through the demonstration and pay attention to the relevant frames that tell it what to do next.

Nothing in our technique is specific to blocks. This system is an early prototype, and it will form the backbone of the general-purpose robotic systems we're developing at OpenAI.
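The perceive-then-act loop described above can be sketched in code. Everything here is a hypothetical stand-in: the function names, action format, and block count are assumptions for illustration, not OpenAI's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def vision_net(image):
    # Stand-in for the vision network: map a camera image to estimated
    # (x, y) block positions relative to the robot. A real implementation
    # would be a deep neural net; here we return random positions so the
    # control loop below runs end to end.
    n_blocks = 6
    return rng.uniform(-0.5, 0.5, size=(n_blocks, 2))

def imitation_net(block_positions, demonstration):
    # Stand-in for the imitation network: given the current block positions
    # and the recorded demonstration, predict the next action. This toy
    # policy just replays the first demonstrated placement.
    return {"pick": 0, "place": demonstration[0]["target"]}

# One perceive-then-act step of the control loop.
demonstration = [{"target": np.array([0.0, 0.0])}]
camera_image = np.zeros((128, 128, 3))        # placeholder camera frame
positions = vision_net(camera_image)
action = imitation_net(positions, demonstration)
```

The key structural point is the division of labor: the vision network reduces raw pixels to block positions, and the imitation network only ever reasons over those positions plus the demonstration.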
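Domain randomization, as described, amounts to sampling many training scenes with randomized object locations, lighting, and textures. A minimal sketch follows; the parameter names, ranges, and texture set are invented for illustration.

```python
import random

TEXTURES = ["wood", "checker", "marble", "noise"]  # hypothetical texture set

def sample_scene(rng, n_blocks=6):
    # One randomized training scene: random block locations, light
    # settings and a surface texture. Ranges are illustrative only.
    return {
        "block_positions": [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5))
                            for _ in range(n_blocks)],
        "light_intensity": rng.uniform(0.2, 1.0),
        "light_direction": tuple(rng.uniform(-1.0, 1.0) for _ in range(3)),
        "table_texture": rng.choice(TEXTURES),
    }

# Generate thousands of randomized examples; each would be rendered and
# shown to the vision network with its true block positions as labels.
rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(10_000)]
```

Because the network never sees the same lighting or texture twice, it is pushed to rely on cues that also hold in the physical world, which is why it transfers to real camera images.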
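The way the imitation network "scans through the demonstration and pays attention to the relevant frames" is naturally modeled as soft attention over demonstration frame embeddings. A minimal numpy sketch, where the embedding dimensions and dot-product scoring are assumptions:

```python
import numpy as np

def attend(query, demo_frames):
    # Score each demonstration frame embedding against the current
    # state (dot product), softmax the scores into weights, and return
    # the weighted average of frames as the attended context.
    scores = demo_frames @ query                       # shape (T,)
    scores = scores - scores.max()                     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()    # softmax over frames
    return weights @ demo_frames, weights

rng = np.random.default_rng(0)
T, d = 50, 8                          # 50 demo frames, 8-dim embeddings
demo_frames = rng.normal(size=(T, d))
query = rng.normal(size=d)            # embedding of the current state
context, weights = attend(query, demo_frames)
```

Attention keeps the input size manageable: rather than feeding every raw frame forward, the network learns weights that concentrate on the few frames relevant to the next action.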

10 Replies to “Robots that Learn”

  1. In 9+ months, only 18K views and 0 comments… really?!
    AI will be more transformational in our lifetime than electricity, computers and the internet combined. Videos such as this one offer us a glimpse into the workings of our shared future. By watching, commenting and understanding the content of these videos you are increasing the probabilities of a better future for us all.

  2. Could you explain any benefit of this system compared to a regular edge-detecting vision system combined with an industrial robot that can be hand-guided, like the KUKA LBR iiwa? Are you able to make a sloppy stack and then let the robot learn to create a perfect stack over time, or what is the point here? ☺

  3. Okay! The vision network predicts the locations of the blocks (and that part is trained beforehand). But how do you train the imitation network? That's confusing to me, because the demonstration contains thousands of images. Think about it: a 128×128 image at 24 fps over a roughly five-minute demonstration is 128 × 128 × 24 × 60 × 5 values, plus the locations of the blocks. Eww! That's a minimum, but a huge input to feed to the imitation network. My point is: how do you train it? I don't think backpropagation is feasible at that scale.

  4. I think the human brain is the perfect learning machine. Once machine learning is perfected, we'll almost certainly learn some lessons about the nature of the human mind by proxy. Great minds think alike, as they say.
