The Unavoidable Folly of Making Humans Train Self-Driving Cars
UPDATE Friday, June 22: Police in Tempe, Arizona, say the safety driver of an Uber autonomous test vehicle was streaming TV on her phone right up until a fatal crash. In a detailed report, police obtained records from Hulu which show an episode of The Voice was playing on Rafaela Vasquez’s account. County prosecutors will decide if she will face charges, which could include vehicular manslaughter. Uber says it’s cooperating with investigations, and conducting an internal safety review. “We have a strict policy prohibiting mobile device usage for anyone operating our self-driving vehicles,” a spokesperson says.
This story, about why easily distracted humans are ill-suited to tasks like monitoring autonomous vehicles, originally ran on March 24, 2018. We've updated it to include new details from both a National Transportation Safety Board preliminary report, released May 24, and the Tempe Police Department investigation.
The British Royal Air Force had a problem. It was 1943, and the Brits were using radar equipment to spot German submarines sneaking around off the western coast of France. The young men sitting in planes circling over the Bay of Biscay had more than enough motivation to keep a watchful eye for the telltale blips on the screens in front of them. Yet they had a worrying tendency to miss the signals they’d been trained to spot. The longer they spent looking at the screen, the less reliable they became.
The RAF could tell the radar operators’ skills deteriorated over time, but it wasn’t sure how long it was safe to keep them at their vital task. So it brought in Norman Mackworth. Mackworth brought his clock.
The British psychologist put RAF cadets alone in a sparse and silent wooden cabin, where they would sit 7 feet from a clock 10 inches in diameter. The clock had a single hand. Every second, the hand moved forward a third of an inch. But at random intervals, it moved twice that distance. The subject’s job was to watch the clock, and press a Morse key (that thing telegraph operators use) each time it made the double jump. Some of the cadets sat there for 30 minutes, others an hour, the unluckiest two hours. Mackworth worked in all sorts of variables—some subjects got telephone calls during the test, others got amphetamines—but the clear takeaway was it took less than half an hour for their attention to wander.
In Breakdown of Vigilance During Prolonged Visual Search, Mackworth traced the recognition of this phenomenon back to Shakespeare’s The Tempest:
"For now they are oppress’d with travel, they Will not, nor cannot, use such vigilance As when they are fresh."
Before and since Mackworth’s time, the “vigilance decrement” has caused trouble everywhere humans are asked to spend long periods of mostly uneventful time watching for easy-to-spot but impossible-to-predict signals. Security guards suffer from it. So do the people looking after nuclear reactors and Predator drones. Same goes for TSA agents and lifeguards.
And, as a fatal crash in Tempe, Arizona, made clear, the vigilance decrement affects the people sitting behind the wheel of Uber’s self-driving cars. On March 18, one of Uber’s autonomous Volvo XC90 SUVs hit and killed 49-year-old Elaine Herzberg as she was walking her bike across the street.
The National Transportation Safety Board's preliminary report from its investigation into the crash reveals that the car's sensors detected Herzberg about six seconds before the crash, and that the software classified her as an unknown object, then as a vehicle, and finally as a pedestrian. Less than two seconds before impact, the car determined it needed to stop. But it couldn't. "According to Uber, emergency braking maneuvers are not enabled while the vehicle is under computer control, to reduce the potential for erratic vehicle behavior," the report says. "The vehicle operator is relied on to intervene and take action. The system is not designed to alert the operator."
But video shows that the car's operator, Rafaela Vasquez, wasn’t watching the road in the moments leading to the crash. She told NTSB investigators she was looking at the system's interface, which is built into the center console. But police investigations have revealed that an episode of the TV show The Voice was being streamed via Vasquez's Hulu account right up to the moment of the crash.
And so, along with the entire notion that robots can be safer drivers than humans, the crash casts doubt on a fundamental tenet of this nascent industry: that the best way to keep everyone safe in these early years is to have humans sitting in the driver’s seat, ready to leap into action.
Dozens of companies are developing autonomous driving technology in the United States. They all rely on human safety drivers as backups. The odd thing about that reliance is that it belies one of the key reasons so many people are working on this technology. We are good drivers when we’re vigilant. But we’re terrible at being vigilant. We get distracted and tired. We drink and do drugs. We kill 40,000 people on US roads every year, and more than a million worldwide. Self-driving cars are supposed to fix that. But if we can’t be trusted to watch the road when we’re actually driving, how did anyone think we’d be good at it when the robot’s doing nearly all the work?
“Of course this was gonna be a problem,” says Missy Cummings, the director of the Humans and Autonomy Laboratory and Duke Robotics at Duke University. “Your brain doesn’t like to sit idle. It is painful.”
In 2015, Cummings and fellow researchers ran their own test. “We put people in a really boring, four-lane-highway driving simulator for four hours, to see how fast people would mentally check out,” she says. On average, people dropped their guard after 20 minutes. In some cases, it took just eight minutes.
Everyone developing self-driving tech knows how bad humans are at focusing on the road. That’s why many automakers have declined to develop semiautonomous tech, where a car drives itself in a simple scenario like highway cruising, but needs a person to supervise and grab the wheel when trouble seems imminent. That kind of system conjures the handoff problem, and as Volvo’s head of safety and driver-assist technologies told WIRED in 2016, “That problem’s just too difficult.”
The problem for the companies eager to skip that icky middle ground and go right for a fully driverless car is that they believe the only way to get there is by training on public roads—the testing ground that offers all the vagaries and oddities these machines must master. And the only reasonable approach—from a pragmatic and political point of view—to testing imperfect tech in two-ton vehicles speeding around other people is to have a human supervisor.
“I think, in good faith, people really thought the safety drivers were going to do a good job,” Cummings says. In a rush to move past the oh-so-fallible human, the people developing truly driverless cars doubled down on, yes, the oh-so-fallible human.
That’s why, before letting them on the road, Uber puts its vehicle operators through a three-week training course at its Pittsburgh R&D center. Trainees spend time in a classroom reviewing the technology and the testing protocols, and on the track learning to spot and avoid trouble. They even get a day at a racetrack, practicing emergency maneuvers at highway speeds. They’re taught to keep their hands an inch or two from the steering wheel, and the right foot over the brake. If they simply have to look at their phones, they’re supposed to take control of the car and put it in park first.
Working alone in eight-hour shifts (in Phoenix they earn about $24 an hour), the babysitters are then set loose into the wild. Each day, they get a briefing from an engineer: Here’s where you’ll be driving, here’s what to look for. Maybe this version of the software is acting a bit funky around cyclists, or taking one particular turn a little fast.
And constantly, they are told: Watch the road. Don’t look at your phone. If you’re tired, stop driving. Uber also audits vehicle logs for traffic violations, and it has a full-time employee who does nothing but investigate potential infractions of the rules. Uber has fired drivers caught (by other operators or by people on the street) looking at their phones.
Still, the vigilance decrement proves persistent. “There’s fatigue, there’s boredom,” says one former operator, who left Uber recently and requested not to be named. “There’s a sense of complacency when you’re driving the same loops over and over, and you trust the vehicle.” That’s especially true now that Uber’s cars are, overall, pretty good drivers. This driver said that by early this year, the car would regularly go 20 miles without requiring intervention. If you’re tooling around the suburbs, that might mean an hour or more. As any RAF cadet watching a broken clock in a cabin could tell you, that’s a long time to stay focused. “You get lulled into a false sense of security,” the driver says.
Moreover, Uber has no system in place to ensure its drivers keep their eyes on the road. That technology exists, though. Drivers who pay to have Autopilot on their Tesla or ProPilot Assist on their Nissan get vehicles that can handle themselves on the highway, but require human supervision. (They can’t handle things like stopped fire trucks in their lane, for instance.) To make sure they pay attention, the car uses a torque sensor in the steering wheel to determine that they’re occasionally touching it. Slackers get a warning beep or buzz.
Cadillac’s Super Cruise offers a more sophisticated solution, using an infrared camera on the steering column to watch the driver’s head position. Look away for too long, and it delivers a scolding. If Cadillac managed to work that into a commercial product, it’s easy to imagine Uber or one of its robocar competitors doing the same to keep their employees alert.
“Driver state management is a critical aspect to all vehicles that involve human operation,” says Bryan Reimer, who studies human-machine interaction at MIT. That can be a system like Cadillac’s, or even better, one that tracks eye movement. (You can point your head at the road and be asleep, after all.) Or, you can put two people in the car. That’s not guaranteed to help (ask the Northwest pilots who flew 150 miles past their destination in 2009), but it does mean you’ve got a backup if one person totally checks out.
(An Uber spokesperson said the company regularly reviews its testing procedures, but declined to speculate on how they would change in response to the fatal crash. Waymo and General Motors, the premier competitors in this space, did not reply to requests for comment on how they train and monitor their human operators.)
And if we can't use machines to monitor the humans who are monitoring the machines, Norman Mackworth might have a simpler, if sketchier, suggestion to offer: The cadets popping speed outperformed the rest of the class.