Researchers from Brown University and MIT have developed a method for helping robots plan for multi-step tasks by constructing abstract representations of the world around them. Their study, published in the Journal of Artificial Intelligence Research, is a step toward building robots that can think and act more like people. Planning is a monumentally difficult thing for robots, largely because of how they perceive and interact with the world.
A robot’s perception of the world consists of nothing more than the vast array of pixels collected by its cameras, and its ability to act is limited to setting the positions of the individual motors that control its joints and grippers. It lacks an innate understanding of how those pixels relate to what we might consider meaningful concepts in the world.
“That low-level interface with the world makes it really hard to decide what to do,” said George Konidaris, an assistant professor of computer science at Brown and the lead author of the new study.
“Imagine how hard it would be to plan something as simple as a trip to the grocery store if you had to think about each and every muscle you’d flex to get there, and imagine in advance and in detail the terabytes of visual data that would pass through your retinas along the way.
“You’d immediately get bogged down in the detail. People, of course, don’t plan that way. We’re able to introduce abstract concepts that throw away that huge mass of irrelevant detail and focus only on what is important.”
Even state-of-the-art robots aren’t capable of that kind of abstraction. When we see demonstrations of robots planning for and performing multi-step tasks, “it’s almost always the case that a programmer has explicitly told the robot how to think about the world in order for it to make a plan,” Konidaris said.
“But if we want robots that can act more autonomously, they’re going to need the ability to learn abstractions on their own.”
In computer science terms, these kinds of abstractions fall into two categories: “procedural abstractions” and “perceptual abstractions.” Procedural abstractions are programs made out of low-level movements composed into higher-level skills.
An example would be bundling all the little movements needed to open a door — all the motor movements involved in reaching for the knob, turning it and pulling the door open — into a single “open the door” skill.
Once such a skill is built, you don’t need to worry about how it works. All you need to know is when to run it. Roboticists — including Konidaris himself — have been studying how to make robots learn procedural abstractions for years, he says.
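To make the idea of a procedural abstraction concrete, here is a minimal Python sketch of the door example. It is an illustration only, not code from the study: the function names (`reach_for_knob`, `turn_knob`, `pull_door`, `make_skill`) are hypothetical, and each "motor movement" simply records what it would do rather than driving real hardware.

```python
from typing import Callable, List

# Hypothetical low-level movements. On a real robot these would command motors;
# here each one only appends a description to a log.
def reach_for_knob(log: List[str]) -> None:
    log.append("reach for knob")

def turn_knob(log: List[str]) -> None:
    log.append("turn knob")

def pull_door(log: List[str]) -> None:
    log.append("pull door open")

def make_skill(steps: List[Callable[[List[str]], None]]) -> Callable[[List[str]], None]:
    """Bundle a sequence of low-level movements into one callable skill."""
    def skill(log: List[str]) -> None:
        for step in steps:
            step(log)
    return skill

# The caller only needs to know when to run the skill, not how it works inside.
open_door = make_skill([reach_for_knob, turn_knob, pull_door])

log: List[str] = []
open_door(log)
print(log)
```

Once `open_door` exists, a planner can treat it as a single action and ignore the individual movements inside it.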
But according to Konidaris, there’s been less progress in perceptual abstraction, which has to do with helping a robot make sense of its pixelated surroundings. That’s the focus of this new research.
“Our work shows that once a robot has high-level motor skills, it can automatically construct a compatible high-level symbolic representation of the world — one that is provably suitable for planning using those skills,” Konidaris said.
For the study, the researchers introduced a robot named Anathema Device (or Ana, for short) to a room containing a cupboard, a cooler, a switch that controls a light inside the cupboard, and a bottle that could be left in either the cooler or the cupboard.
They gave Ana a set of high-level motor skills for manipulating the objects in the room: opening and closing both the cooler and the cupboard, flipping the switch and picking up a bottle.
Then they turned Ana loose to try out her motor skills in the room, recording the sensory data from her cameras and actuators before and after each skill execution. Those data were fed into the machine-learning algorithm developed by the team.
The researchers showed that Ana was able to learn a very abstract description of the environment that contained only what was necessary for her to perform a particular skill.
For example, she learned that in order to open the cooler, she needed to be standing in front of it and not holding anything (because she needed both hands to open the lid). She also learned the proper configuration of pixels in her visual field associated with the cooler lid being closed, which is the only configuration in which it’s possible to open it.
Just by executing her motor skills, Ana learned the appropriate configuration of pixels associated with a cooler having a closed lid, a necessary condition for her to run her “open the cooler” skill.
She learned similar abstractions associated with her other skills. She learned, for example, that the light inside the cupboard was so bright that it whited out her sensors. So in order to manipulate the bottle inside the cupboard, the light had to be off.
She also learned that in order to turn the light off, the cupboard door needed to be closed, because the open door blocked her access to the switch. The resulting abstract representation distilled all that knowledge down from high-definition images to a text file, just 126 lines long.
“These were all the important abstract concepts about her surroundings,” Konidaris said. “Doors need to be closed before they can be opened. You can’t get the bottle out of the cupboard unless it’s open, and so on. And she was able to learn them just by executing her skills and seeing what happens.”
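Why such a compact symbolic description is useful for planning can be sketched in a few lines of Python. The example below is a hedged illustration, not the researchers' algorithm or Ana's actual representation: the symbols (`light_on`, `cupboard_open`, and so on) and the skill definitions are hand-written assumptions based on the conditions described above, whereas the study's method learns its symbols from sensor data. The planner is a plain breadth-first search over symbolic states.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional

class Skill(NamedTuple):
    name: str
    requires: FrozenSet[str]  # facts that must be true before running
    forbids: FrozenSet[str]   # facts that must be false before running
    adds: FrozenSet[str]      # facts made true by the skill
    deletes: FrozenSet[str]   # facts made false by the skill

def skill(name: str, requires=(), forbids=(), adds=(), deletes=()) -> Skill:
    return Skill(name, frozenset(requires), frozenset(forbids),
                 frozenset(adds), frozenset(deletes))

# Hypothetical, hand-written stand-ins for the kind of conditions Ana learned.
SKILLS = [
    # The open cupboard door blocks the switch, so the door must be closed.
    skill("switch_light_off", requires=["light_on"], forbids=["cupboard_open"],
          deletes=["light_on"]),
    # A door needs to be closed before it can be opened.
    skill("open_cupboard", forbids=["cupboard_open"], adds=["cupboard_open"]),
    skill("close_cupboard", requires=["cupboard_open"], deletes=["cupboard_open"]),
    # The bright light whites out the sensors, so it must be off.
    skill("pick_up_bottle", requires=["cupboard_open", "bottle_in_cupboard"],
          forbids=["light_on", "holding_bottle"],
          adds=["holding_bottle"], deletes=["bottle_in_cupboard"]),
]

def plan(start: Iterable[str], goal: Iterable[str]) -> Optional[List[str]]:
    """Breadth-first search for a shortest skill sequence reaching the goal facts."""
    start_state, goal_facts = frozenset(start), frozenset(goal)
    frontier = deque([(start_state, [])])
    seen = {start_state}
    while frontier:
        state, steps = frontier.popleft()
        if goal_facts <= state:
            return steps
        for s in SKILLS:
            if s.requires <= state and not (s.forbids & state):
                nxt = (state - s.deletes) | s.adds
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [s.name]))
    return None  # no sequence of skills reaches the goal

# Bottle in a closed cupboard with the light on; the goal is to be holding it.
print(plan({"light_on", "bottle_in_cupboard"}, {"holding_bottle"}))
```

In this toy domain the planner has to switch the light off before opening the cupboard, because the open door would block the switch. It finds that ordering from the symbols alone, without looking at a single pixel.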
Image credit: Brown University.