Why Siri and Alexa Will Succeed Where Clippy Failed

By James O Malley

Hey Siri, when did talking to computers start to feel normal? Talking to our devices is now something we think nothing of, but don’t forget – this isn’t an entirely new phenomenon.

Remember Clippy?

The Office Assistant, to give him his full name, was first introduced in Microsoft Office 97, and was a fixture of the product suite for a decade. He’d pop up whenever the software thought you might need help, such as when writing a letter. You could type and ask him questions. He was the spiritual grandfather of Siri, Alexa, Cortana and the Google Assistant.

And despite this, in 2007 the order was given that Clippy was not long for this world, and he was killed by Jensen Harris.

So why was this allowed to happen? And how can we be so sure that our modern assistants won’t eventually experience the same fate? To find out, I spoke to Jensen, who today is the co-founder and CTO of Textio, a text analysis start-up that isn’t technologically a million miles away from either Clippy or Siri.

Jensen first joined Microsoft in 1996, after the Office Assistant had already been created, but he was able to tell me a little about the genesis of the paperclip that we loved to hate.

It all started with a piece of software called Microsoft Bob. This was the company’s first attempt at creating what Jensen calls a “social user interface”. The idea was that it would live on top of a standard graphical user interface and would make it easier for the unfamiliar to get to grips with using a computer. For example, you would click on the desk to access work-related things, and click on the sofa to access media. Your virtual guide was a cartoon dog, who spoke using speech bubbles.

Sadly for Microsoft Research though, which developed the software, it was a critical and commercial flop. “It was a dismal failure and if you remember at the time it was widely ridiculed for being cartoony,” Jensen explains. “The tech press at the time was made up of enthusiasts of the PC industry and all that who didn't really see the goal in that.”

But in the technology, there was the genesis of Clippy. The lesson was that a social user interface could be useful when grafted on top of another product to make that product’s features more approachable, especially for a product as complicated and powerful as Office.

“So the dog became the Office Assistant in Office. Really the idea there was how do you take this product, with an immense feature-set, which really people only know how to do a few things with, and use the character to make the product more socially engaging, to make it more approachable, and to make it feel like it was your trusted adviser, versus just a monolithic product?”

It looks like you’re trying to write a death warrant...

So why did Jensen kill it? The answer, simply put, was that the technology was not good enough.

“The Office Assistant was ahead of its time,” he explains. “Mostly what it was ahead of was the technology. The technology wasn't there to actually make the Office Assistant contextually relevant.”

“Today we've got immense AI capabilities, we've got machine learning and all the offshoots of that, like deep learning, that can actually help a computer intuit what it is that you're trying to do.”

Jensen explains that Office simply wasn’t smart enough: the Assistant was driven by some extremely simple “heuristic” rules, so that once you had been using Microsoft Word for a couple of days you’d probably seen all that Clippy had to offer. And worse still: sometimes it was wrong.

“There's only so many times you can type the word ‘Dear’ and have it say ‘it looks like you're writing a letter’ before you say... it’s actually not that helpful to have this paperclip type out ‘Dear Mum and Dad’ for me, because I'm probably not writing a letter to Mum and Dad.”
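To make the idea of a simple heuristic rule concrete, here is a minimal sketch in Python. It is purely illustrative and is not how the Office Assistant was actually implemented: the function name `assistant_tip` and the pattern `LETTER_OPENING` are hypothetical, and the only assumption taken from the article is that typing “Dear” triggered the letter suggestion.

```python
import re

# Hypothetical rule (an assumption for illustration): fire whenever a line
# opens with the word "Dear". Nothing else about the document is considered.
LETTER_OPENING = re.compile(r"^\s*Dear\b", re.IGNORECASE)


def assistant_tip(document_text):
    """Return a tip if any line starts with 'Dear', otherwise None."""
    for line in document_text.splitlines():
        if LETTER_OPENING.match(line):
            return "It looks like you're writing a letter."
    return None


# The rule fires on a real letter...
print(assistant_tip("Dear Mum and Dad,\nHope you are well."))
# ...but also on text that merely starts the same way.
print(assistant_tip("Dear diary, today I fixed a bug."))
# And it stays silent on everything else.
print(assistant_tip("Meeting notes for Tuesday"))
```

A rule like this has no sense of context, which is why it repeats itself and misfires: it responds to the same surface pattern every time, whether or not the suggestion is useful.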

The trouble was, apparently, that though the heuristic rules about when Clippy would pop up were written based on research into what users actually want to do, they were also limited by what they were capable of detecting at the time. Jensen explained:

“For years Word had this issue with autocorrect and there are even things today when people will say ‘I started typing and it just put bullets in, this is not where I wanted bullets to go’. This is really more of an artificial intelligence problem. When you get it right, it's magical, and when you get it wrong people get frustrated. And the Office Assistant got it wrong a lot more than it got it right. And so the magic essentially wasn't there.”

In the end, the Assistant was replaced in part by the Ribbon, the system Jensen invented for organising all of the functions and tools built into Microsoft Office, which is still in use today. This is the name given to the series of tabs that replaced toolbars – and it is something that has been emulated and ripped off in countless other apps since.

Getting Siri-ous

Fast-forward a decade and it is clear that social user interfaces, though they may have stumbled with the Office Assistant, are here to stay. And this time… they work. “Now we're seeing everything from Siri to Google Now to Cortana to whatever, doing the same thing but better,” Jensen says.

So how has the technology improved? Why do Siri and Alexa succeed? Jensen puts it down to two major factors.

“There's the social user interface and then there's an artificial intelligence”, he explains. “Siri and Cortana are at the cross section of these things. Both have the social interface, so that people are able to talk to it, and it is supposed to be able to understand you, and it's going to come back with smart things and some based on rules, some based on AI.”

It is this latter factor which is also important, as it enables these tools to have what Jensen calls a predictive engine.

“Google Now doesn't have a character, but it can tell you that you should probably leave now to get to your next appointment across town. To know it has to predict how long you're going to take. [...] It knows your calendar, it knows a lot of content about you, and it can make very accurate predictions. And so I think the technology is there to do the predictive stuff in certain domains stunningly well.”
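The arithmetic behind a “leave now” prompt can be sketched in a few lines of Python. This is an assumption-laden illustration, not a description of how Google Now works: the functions `leave_by` and `should_leave_now` and the five-minute `buffer` are hypothetical, and the predicted travel time, which is the genuinely hard part, is simply passed in as an input.

```python
from datetime import datetime, timedelta


def leave_by(appointment_start, predicted_travel, buffer=timedelta(minutes=5)):
    """Latest departure time: start time minus predicted travel and a safety buffer."""
    return appointment_start - predicted_travel - buffer


def should_leave_now(now, appointment_start, predicted_travel):
    """True once the current time has reached the latest departure time."""
    return now >= leave_by(appointment_start, predicted_travel)


# Hypothetical example: a 15:00 appointment with a predicted 40-minute journey.
appointment = datetime(2016, 1, 1, 15, 0)
travel = timedelta(minutes=40)

print(leave_by(appointment, travel))
print(should_leave_now(datetime(2016, 1, 1, 14, 20), appointment, travel))
```

The subtraction itself is trivial; the quality of the prompt depends entirely on how accurate the travel-time prediction is, which is where the calendar and traffic data come in.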

It is this underlying AI which is arguably more important than the social interface, as it is prediction and detection that can make or break a product.

“It boggles my mind that I can leave Portland, which is a 2.5 hour drive from Seattle, and [Google Maps] can tell me within minutes that three hours from now exactly what time I'm going to arrive.”

Jensen’s current company, Textio, is a specialist text editor for writing job advertisements. The idea is that companies can tweak their pitch to potential candidates, down to specific words and phrases that could be used, to yield more responses by the sorts of people they want to attract.

“When you're trying to hire someone for your company, we can actually predict based just on the words that you are writing how many people are going to walk through the door and apply for your job. And also what their gender is going to be, also how qualified they're going to be.”

The suggestions are made by analysing thousands of other job listings, and using a sort of artificial intelligence to suggest alternatives based on this massive dataset.

This is a good illustration of the “magic” Jensen referred to, and an example of the sort of computation that Clippy couldn’t handle, but modern AI can.

“You see it actually come true. It's amazing. It's hard to imagine how amazing that is.”

And this is why this time around the Assistants are probably here to stay.

Jensen likens this to the birth of the internet. In 2016, it is almost impossible to imagine writing something without having an internet connection: how would you fact-check what you want to include? How would you collaborate on documents? And so on. In the future, we could be wondering how we coped without a descendant of Clippy looking down on us as we type.

“We're going to become more and more comfortable with the idea of having a few really trusted predictive engines that are part of their daily life and it will seem uncanny at first that you can actually predict things about the future and have them be true, but eventually it becomes like another form of human intelligence, in a sense.”

Clippy might be dead, but we could one day look back on him as the paperclip that changed everything.