Just in case you were holding out any hope that Google didn’t let humans listen to voice recordings from Google Home and Google Assistant, stop doing that. One of the humans that Google hired to review voice recordings recently leaked over a thousand Assistant recordings to a Belgian news organisation, which published a story and video about the recordings this week. Google, of course, is very pissed about this.
The Flemish news report is quite something, mainly because you can actually hear a whole bunch of Google Assistant recordings from anonymous Flemish people. We’ve long known that Google employs humans to review and transcribe voice recordings in order to train the technology that makes the voice assistant work. (Amazon and Apple have admitted to doing the same thing, and we’ve previously reported on the uncomfortable truth about why humans are still necessary for voice assistants to work.) Unfortunately for Google, one of these humans sent a large cache of these recordings to VRT News in Belgium, and the news organisation published a story and video about them. The person, who works as a subcontractor for Google, also let the journalists look at the software they use to review the recordings. The report confirms what we already knew, but hearing the recordings is a vivid reminder that stuff you say to a voice assistant is recorded, stored, and inevitably at risk of being leaked to hackers, governments, or Belgian news organisations.
Google fired back on Friday with a blog post that frames the leak as a security breach. The company explained the review process as something that’s necessary for its products to work well in multiple languages, though the same review process exists for Assistant recordings in English. Inevitably, the blog post reads like a scolding:
We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data. Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.
It also goes on to confirm that “around 0.2 percent of all audio snippets” get sent to human reviewers. That seems like a small number until you remember there are 1 billion devices that can query Google Assistant.
What’s most concerning about this glimpse into what makes Google Assistant work, however, is the simple fact that many recordings happen by accident. Google Assistant and other voice assistants are supposed to start recording only after the user says a wake word or phrase, like “Hey Google.” However, the Belgian news report says, “VRT NWS listened to more than a thousand excerpts, 153 of which were conversations that should never have been recorded and during which the command ‘Okay Google’ was clearly not given.” That means as much as 15 percent of the recordings in that sample were stuff Google wasn’t supposed to record.
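For anyone who wants to check the share of accidental recordings in the VRT sample, here is a rough back-of-the-envelope sketch. The function name is made up for illustration, and the total of 1,000 is an assumption: VRT only says "more than a thousand," so the result is an upper bound on the sample, not a measurement of everything Google records.

```python
def accidental_share(accidental: int, total: int) -> float:
    """Return accidental recordings as a percentage of all reviewed excerpts."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * accidental / total


# VRT NWS reported 153 accidental recordings among "more than a thousand" excerpts.
# The exact total is not given, so 1,000 is an assumed lower bound on the total,
# which makes the computed share an upper bound.
upper_bound = accidental_share(153, 1000)
print(f"At most about {upper_bound:.1f} percent of the sample")
```

A larger true total would push the share down, which is why the figure is best read as a ceiling for this one leaked sample.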
So it’s unclear what happens next. Perhaps some people will be a little bit more cautious around their Google Home or their Amazon Echo or Apple HomePod, all of which amount to wiretapping devices according to some privacy experts. This analogy does make more and more sense as we learn about how these devices work. A Google Home does have microphones that are on by default and that, sometimes, record audio without your explicit consent. And then those recordings get sent to a subcontractor who might just get an itch to leak the recordings to the press. That has now happened.
Another possible outcome, of course, is that you just throw your Google Home or your Amazon Echo or your Apple HomePod into the ocean, scream at the clouds, and cry into the sand. Maybe this future isn’t the one you wanted or hoped for, but it’s the one you have to live in.