In an Apocalyptic Mood, Stephen Hawking Overestimates the Evolutionary Future of Smart Machines


Stephen Hawking is a great physicist but he’s dead wrong in a co-authored article in The Independent, darkly warning against the temptation "to dismiss the notion of highly intelligent machines as mere science fiction." This, he says, "would be a mistake, and potentially our worst mistake in history."

Artificial-intelligence (AI) research is now progressing rapidly. Recent landmarks such as self-driving cars, a computer winning at Jeopardy! and the digital personal assistants Siri, Google Now and Cortana are merely symptoms of an IT arms race fuelled by unprecedented investments and building on an increasingly mature theoretical foundation. Such achievements will probably pale against what the coming decades will bring.

The peril derives from the prospect that "machines with superhuman intelligence could repeatedly improve their design even further, triggering what Vernor Vinge called a ‘singularity’":

One can imagine such technology outsmarting financial markets, out-inventing human researchers, out-manipulating human leaders, and developing weapons we cannot even understand. Whereas the short-term impact of AI depends on who controls it, the long-term impact depends on whether it can be controlled at all.

Stuart Russell, who co-wrote the definitive text on AI, Artificial Intelligence: A Modern Approach, with Peter Norvig, is listed as an author of the piece, along with Hawking and physicists Frank Wilczek and Max Tegmark (ugh). The physicists can be forgiven, but Russell should know better.

There is no "increasingly mature theoretical foundation." Or rather, there is one, with known limitations. The methods of Big Data, which I referred to yesterday, all show performance gains for well-defined problems, achieved by adding more and more input data, right up to saturation. "Model saturation," as it’s called, is the eventual flattening of a machine learning curve toward a horizontal asymptote, where there’s no further learning no matter how much more data you provide. Russell (one would hope) knows this, but the problem is not even mentioned in the piece, let alone explained. Instead, front and center is Hawking’s ill-defined worry about a future involving "super" intelligence. This is hype, at best.
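To make the shape of saturation concrete, here is a minimal sketch in Python. It assumes a power-law learning curve, which is a common way to describe such curves but is my illustration, not anything taken from the Independent piece; the function name and the parameters `ceiling`, `gap`, and `rate` are all hypothetical. It models only the flattening, not the cases where performance actually drops.

```python
def learning_curve(n_examples, ceiling=0.70, gap=0.60, rate=0.5):
    """Hypothetical power-law learning curve.

    Accuracy rises toward `ceiling` as training data grows but never
    reaches it. All parameter values are illustrative assumptions.
    """
    return ceiling - gap * n_examples ** (-rate)


# Training set sizes from 10 up to 10 million examples.
sizes = [10 ** k for k in range(1, 8)]
accuracies = [learning_curve(n) for n in sizes]

# Gain in accuracy from each tenfold increase in data.
gains = [later - earlier for earlier, later in zip(accuracies, accuracies[1:])]

for n, acc in zip(sizes, accuracies):
    print(f"{n:>10,d} examples -> accuracy {acc:.4f}")
```

Under these assumed numbers, each tenfold increase in data buys a smaller improvement than the last, and the curve crowds up against the 70 percent ceiling without crossing it.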

A learning model (here "model" means AI algorithm) might gain performance quickly up to, say, 70 percent accuracy on a particular task. (Performance on such tasks is often reported as an F-measure, the harmonic mean of precision and recall, rather than as raw accuracy.) But the gains will then slow, and inevitably saturate. At that point, training is done. In industry (say, at Google) the model then goes from the training phase to "production," where it’s used to generate results on new, previously unseen data.
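For readers who want the F-measure spelled out, the following sketch computes it from raw counts. The function name and argument names are my own, and the counts in the usage lines are made up for illustration.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # With no true positives the score is defined here as zero, which
    # also avoids dividing by zero below.
    if true_positives == 0 or predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    return 2 * precision * recall / (precision + recall)


# Illustrative counts: 70 correct hits, 30 false alarms, 30 misses.
balanced = f_measure(70, 30, 30)

# Perfect recall but only 50 percent precision.
lopsided = f_measure(50, 50, 0)
```

Because the harmonic mean is pulled toward the smaller of the two values, a system cannot score well by maximizing recall alone: the second example has perfect recall and still lands at about 0.67.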

For instance, if the model was trained for "learning" what music you like based on your music listening habits, it would be released on a music recommendation site to suggest new music samples for you. This is roughly how the music service Pandora works. More simply, if it was trained on, say, two groups of email data — one spam and one good — after saturation it would be released to label new, previously unseen emails as "Spam" or "Good" (or rather "Yes" or "No," as the decision is binary). And on and on. Since Big Data makes empirical or learning methods more effective, learning methods have effectively dominated approaches to AI.
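The spam example can be sketched in a few lines. What follows is a toy naive Bayes classifier, a standard textbook technique that I am swapping in for illustration; it is not a description of how any particular mail service works. The four training emails are invented, and the sketch assumes both labels appear in the training data.

```python
import math
from collections import Counter

LABELS = ("Spam", "Good")


def train(labeled_emails):
    """Count words per label from (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in labeled_emails:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["Spam"]) | set(word_counts["Good"])
    return word_counts, doc_counts, vocabulary


def classify(model, text):
    """Label a new, previously unseen email as "Spam" or "Good"."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        # Start from the log prior for this label.
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing so an unseen word/label pair is not fatal.
            smoothed = word_counts[label][word] + 1
            score += math.log(smoothed / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy training data: two spam emails and two good ones.
training_emails = [
    ("win cash prize now", "Spam"),
    ("cheap pills win big", "Spam"),
    ("meeting agenda for monday", "Good"),
    ("lunch on monday with the team", "Good"),
]
model = train(training_emails)

first_label = classify(model, "win a cash prize")
second_label = classify(model, "agenda for the team meeting")
```

The two phases in the text map onto the two functions: `train` is the training phase, and `classify` is the model in "production," applied to emails it has never seen.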

There is a confusion, then, at the heart of the vision that Stephen Hawking has somewhat oddly endorsed. Once a model has saturated, adding more data won’t help with these learning problems, and performance can even go down. This tells you something about the prospects for the continual "evolution" of smart machines.

But Hawking is a physicist. Let’s look at what the computer scientists say. Peter Norvig, Russell’s other half in their popular AI textbook, who is also Director of Research at Google, has admitted that progress on learning-based problems like machine translation (or spam detection, or recommendation, or photo recognition) will likely slow in the coming years, not ramp up exponentially. ("Learning," technically, is a form of numerical optimization.) Norvig conceded in an article in The Atlantic last year:

"We could draw this curve: as we gain more data, how much better does our system get?" he says. "And the answer is, it’s still improving — but we are getting to the point where we get less benefit than we did in the past."

This doesn’t sound like the imminent rise of the machines.

Facebook, meanwhile, has hired NYU computer scientist Yann LeCun to head its new AI Lab. LeCun spearheads a machine learning approach known as "Deep Learning." ATMs already use LeCun’s methods to automatically read checks. Facebook hopes his approach will help the company automatically recognize the content of photos posted on the site, like pictures of married couples, or pets.

Futurist and entrepreneur Ray Kurzweil, currently Director of Engineering at Google, popularized the notion of a coming "singularity" discussed in the Independent article. He made his fortune designing and patenting text-to-speech synthesizers and speech recognition systems, and helped design Apple’s voice recognition system, Siri. Indeed, the examples mentioned in the article (self-driving cars, Google Now, Siri) were all made possible by the application of fairly well-known learning algorithms, such as Hidden Markov Models for voice recognition, that have had new life breathed into them (so to speak) by the availability of massive datasets, the terabytes of text and image data on the Web.

This means, for one thing, that they have indeed outperformed older versions, and so progress in the last, say, ten years is apparent. It also means, as Norvig himself admitted, that performance gains will inevitably slow as the models saturate, as they must with the underlying learning approaches. This well-known consequence of empirical or learning approaches on today’s Web no doubt prompted the peculiar, almost defeatist concession by Norvig and Russell, recounted in the Atlantic piece:

Perhaps, as Russell and Norvig politely acknowledge in the last chapter of their textbook, in taking its practical turn, AI has become too much like the man who tries to get to the moon by climbing a tree: "One can report steady progress, all the way to the top of the tree."

An unfortunate product of hype about AI is the concession to what Kurzweil himself has called "Narrow" AI: the abandonment of the original goal of Artificial Intelligence, actually understanding human thinking, in favor of big-money applications that tell us little about the mind. Kurzweil views this as a stepping stone toward the eventual Singularity, as all those "narrow" AI applications, like recognizing a wedding photo, "scale up" to genuine, real-time intelligence. A more direct and evidence-based conclusion is that AI is in a bubble, and that as the Big Data methods saturate, excitement will crash and result in a hangover. That’s exactly what happened in the 1960s with early efforts on natural language understanding, and later in the 1980s with the failure of so-called expert systems.

AI is a tough racket; the human mind isn’t a computer program, it seems, though there’s never a dearth of people happy to promulgate this view.

Russell knows this. Norvig knows this. And Hawking, it appears, has his singularities mixed up. I love Dr. Hawking for his work on black holes and Hawking radiation. But for all his brilliance in astrophysics, the all-too-human complexities of a hype-driven field like AI have so far escaped him.

Photo credit: NASA HQ/Flickr.