Teaching chores to an AI
CSAIL's activity simulator could eventually teach robots tasks like
making coffee or setting the table
For many people, household chores are a dreaded, inescapable part of
life that we often put off or do with little care - but what if a robot
maid could help lighten the load?
Recently, computer scientists have been working on teaching machines to
do a wider range of tasks around the house. In a new paper spearheaded
by MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL)
and the University of Toronto, researchers demonstrate "VirtualHome," a
system that can simulate detailed household tasks and then have
artificial "agents" execute them, opening up the possibility of one day
teaching robots to do such tasks.
The AI agent setting the table.
The team trained the system using nearly 3,000 programs of various
activities, which are further broken down into subtasks for the computer
to understand. A simple task like "making coffee," for example, would
also include the step "grabbing a cup." The researchers demonstrated
VirtualHome in a 3-D world inspired by the Sims video game.
The team's AI agent can execute 1,000 of these interactions in the
Sims-style world, which has eight different scenes including a living
room, kitchen, dining room, bedroom, and home office.
"Describing actions as computer programs has the advantage of providing
clear and unambiguous descriptions of all the steps needed to complete a
task," says PhD student Xavier Puig, who was lead author on the paper.
"These programs can instruct a robot or a virtual character, and can
also be used as a representation for complex tasks with simpler
The project was co-developed by CSAIL and the University of Toronto
alongside researchers from McGill University and the University of
Ljubljana. It will be presented at the Computer Vision and Pattern
Recognition (CVPR) conference, which takes place this month in Salt Lake
City.
How it works
Unlike humans, robots need more explicit instructions to complete easy
tasks - they can't just infer and reason with ease.
For example, one might tell a human to "switch on the TV and watch it
from the sofa." Here, actions like "grab the remote control" and
"sit/lie on sofa" have been omitted, since they're part of the
commonsense knowledge that humans have.
To demonstrate these kinds of tasks to robots, the descriptions of
actions needed to be much more detailed. To get that detail, the team
first collected verbal descriptions of household activities, and then
translated them into simple code. A program like this might include
steps like: walk to the television, switch on the television, walk to
the sofa, sit on the sofa, and watch television.
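As a rough illustration, the steps above can be written as an ordered list of action/object pairs. This is only a sketch and not VirtualHome's actual program syntax; the `Step` type, the action names, and the `describe` helper are hypothetical and chosen for this example.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """One explicit instruction (hypothetical format for this sketch)."""
    action: str  # what the agent does
    target: str  # the object the action applies to


# The "watch television" activity from the text, one explicit step per entry.
watch_tv = [
    Step("walk", "television"),
    Step("switch_on", "television"),
    Step("walk", "sofa"),
    Step("sit", "sofa"),
    Step("watch", "television"),
]


def describe(program):
    """Render a program as a single human-readable string."""
    return "; ".join(f"{step.action} {step.target}" for step in program)


print(describe(watch_tv))
```

Writing each step out this way makes the commonsense parts of the activity, such as walking to the sofa before sitting on it, explicit rather than implied.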
Once the programs were created, the team fed them to the VirtualHome 3-D
simulator to be turned into videos. Then, a virtual agent would execute
the tasks defined by the programs, whether it was watching television,
placing a pot on the stove, or turning a toaster on and off.
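To give a feel for what executing a program step by step might involve, here is a toy sketch in which an agent works through a list of (action, object) pairs and refuses any step whose precondition is not met. The `run` function, the action names, and the state it tracks are all assumptions made for this example; they do not describe how the VirtualHome simulator itself works.

```python
def run(program):
    """Execute (action, target) steps in a toy world and return the final state.

    Raises ValueError when a step's precondition is not met, which is what
    happens if a "commonsense" step has been left out of the program.
    """
    state = {"near": None, "on": set(), "sitting": False, "log": []}
    for action, target in program:
        if action == "walk":
            # Walking moves the agent next to the target and stands it up.
            state["near"] = target
            state["sitting"] = False
        elif action == "switch_on":
            if state["near"] != target:
                raise ValueError(f"cannot switch on {target}: agent is not near it")
            state["on"].add(target)
        elif action == "sit":
            if state["near"] != target:
                raise ValueError(f"cannot sit on {target}: agent is not near it")
            state["sitting"] = True
        elif action == "watch":
            if target not in state["on"]:
                raise ValueError(f"cannot watch {target}: it is switched off")
        else:
            raise ValueError(f"unknown action: {action}")
        state["log"].append((action, target))
    return state


final = run([
    ("walk", "television"),
    ("switch_on", "television"),
    ("walk", "sofa"),
    ("sit", "sofa"),
    ("watch", "television"),
])
print(final["sitting"], sorted(final["on"]))
```

In this sketch, dropping the first `walk` step makes `run` raise an error at `switch_on`, which mirrors the point that a machine needs every step spelled out.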
The end result is not just a system for training robots to do chores,
but also a large database of household tasks described using natural
language. Companies like Amazon that are working to develop Alexa-like
robotic systems at home could eventually use data like this to train
their models to do more complex tasks.
The team's model successfully demonstrated that their agents could
learn to reconstruct a program, and therefore perform a task, given
either a description, such as "pour milk into glass," or a video
demonstration of the activity.
"[This] line of work could facilitate true robotic personal assistants in the
future," says Qiao Wang, a research assistant in arts, media, and
engineering at Arizona State University. "Instead of each task
programmed by the manufacturer, the robot can learn tasks just by
listening to or watching the specific person it accompanies. This allows
the robot to do tasks in a personalized way, or even some day invoke an
emotional connection as a result of this personalized learning process."
In the future, the team hopes to train the robots using actual videos
instead of Sims-style simulation videos, which would enable a robot to
learn simply by watching a YouTube video. The team is also working on
implementing a reward-learning system in which the agent gets positive
feedback when it does tasks correctly.
"You can imagine a setting where robots are assisting with chores at
home and can eventually anticipate personalized wants and needs, or
impending action," says Puig. "This could be especially helpful as an
assistive technology for the elderly, or those who may have limited