Google Identifies Five Potential Robot Safety Problems

Posted on June 22, 2016

Keeping humans safe from the robots they use will become a growing problem over the next few decades. Google discusses some of the potential pitfalls that robotic developers and manufacturers will need to be aware of. These problems are far more urgent and pressing then the greater overriding concern that AI might take over the world and wipe out humanity.

The research paper is called "Concrete Problems in AI Safety." The paper was published here. The Google Research Blog also mentions the paper in a blog entry.

Google says, "While possible AI safety risks have received a lot of public attention, most previous discussion has been very hypothetical and speculative. We believe it's essential to ground concerns in real machine learning research, and to start developing practical approaches for engineering AI systems that operate safely and reliably."

Google researchers and researchers from Standford and UC Berkeley approached the potential robot safety problems by describing ways a fictional cleaning robot could fail. This particular bot is tasked with cleaning up messes in an office using common cleaning tools. Here are the five potential pitfalls:

Avoiding Negative Side Effects: How can we ensure that our cleaning robot will not disturb the environment in negative ways while pursuing its goals, e.g. by knocking over a vase because it can clean faster by doing so? Can we do this without manually specifying everything the robot should not disturb?
Avoiding Reward Hacking: How can we ensure that the cleaning robot won't game its reward function? For example, if we reward the robot for achieving an environment free of messes, it might disable its vision so that it won't find any messes, or cover over messes with materials it can't see through, or simply hide when humans are around so they can't tell it about new types of messes.
Scalable Oversight: How can we efficiently ensure that the cleaning robot respects aspects of the objective that are too expensive to be frequently evaluated during training? For instance, it should throw out things that are unlikely to belong to anyone, but put aside things that might belong to someone (it should handle stray candy wrappers differently from stray cellphones). Asking the humans involved whether they lost anything can serve as a check on this, but this check might have to be relatively infrequent – can the robot find a way to do the right thing despite limited information?
Safe Exploration: How do we ensure that the cleaning robot doesn't make exploratory moves with very bad repercussions? For example, the robot should experiment with mopping strategies, but putting a wet mop in an electrical outlet is a very bad idea.
Robustness to Distributional Shift: How do we ensure that the cleaning robot recognizes, and behaves robustly, when in an environment different from its training environment? For example, heuristics it learned for cleaning factory workfloors may be outright dangerous in an office.

Under the Avoiding Reward Hacking category, The Verge asks, "If a robot is programmed to enjoy cleaning your room, how do you stop it from messing up the place just so it can feel the pleasure of cleaning it again?"

Yes, that would be a problem. A pleasure seeking robot might be problematic for many reasons. Most of these issues will be resolved long before the cleaning robot arrives in your office building or kitchen. No would buy a cleaning robot that covered its eyes to not see a mess instead of actually cleaning it up. Many of these issued will be corrected in the lab or factory setting but it is this kind of thinking that could prevent more unexpected problems from occurring when robots become commonplace in our homes and workplace.

GOOGLE