More and more of modern life is steered by algorithms. But what are they exactly, and who is behind them? Tom Whipple follows the trail
There are many reasons to believe that film stars earn too much. Brad Pitt and Angelina Jolie once hired an entire train to travel from London to Glasgow. Tom Cruise’s daughter Suri is reputed to have a wardrobe worth $400,000. Nicolas Cage once paid $276,000 for a dinosaur head. He would have got it for less, but he was bidding against Leonardo DiCaprio.
Nick Meaney has a better reason for believing that the stars are overpaid: his algorithm tells him so. In fact, he says, with all but one of the above actors, the studios are almost certainly wasting their money. Because, according to his movie-analysis software, there are only three actors who make money for a film. And there is at least one A-list actress who is worth paying not to star in your next picture.
The headquarters of Epagogix, Meaney’s company, do not look like the sort of headquarters from which one would confidently launch an attack on Hollywood royalty. A few attic rooms in a shared south London office, they don’t even look as if they would trouble Dollywood. But my meeting with Meaney will be cut short because of another he has, with two film executives. And at the end, he will ask me not to print the full names of his analysts, or his full address. He is worried that they could be poached.
Worse though, far worse, would be if someone in Hollywood filched his computer. It is here that the iconoclasm happens. When Meaney is given a job by a studio, the first thing he does is quantify thousands of factors, drawn from the script. Are there clear bad guys? How much empathy is there with the protagonist? Is there a sidekick? The complex interplay of these factors is then compared by the computer to their interplay in previous films, with known box-office takings. The final calculation is what it expects the film to make. In 83% of cases, this estimate turns out to be within $10m of the total. Meaney, to all intents and purposes, has an algorithm that judges the value—or at least the earning power—of art.
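Meaney will not say how his software weighs those thousands of factors, but the general approach he describes, scoring a new script against a library of old ones with known takings, can be sketched in a few lines. Everything below is invented for illustration: the factor names, the films and the numbers.

```python
# A toy sketch, not Epagogix's model: estimate a new script's take as a
# similarity-weighted average of past films. All data here is invented.

past_films = [
    # ({script factors}, box office in $m)
    ({"clear_villain": 1.0, "protagonist_empathy": 0.8, "sidekick": 1.0}, 210),
    ({"clear_villain": 0.0, "protagonist_empathy": 0.3, "sidekick": 0.0}, 35),
    ({"clear_villain": 1.0, "protagonist_empathy": 0.5, "sidekick": 0.0}, 90),
]

def similarity(a, b):
    """Crude closeness of two factor profiles (higher means more alike)."""
    keys = set(a) | set(b)
    return 1.0 / (1.0 + sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys))

def predict_take(factors):
    """Weight each past film's take by how similar its factors are."""
    weighted = [(similarity(factors, f), take) for f, take in past_films]
    return sum(w * t for w, t in weighted) / sum(w for w, _ in weighted)

new_script = {"clear_villain": 1.0, "protagonist_empathy": 0.9, "sidekick": 1.0}
print(f"Predicted take: ${predict_take(new_script):.0f}m")
```

A real system would use thousands of factors and a far subtler measure of similarity; the shape of the calculation, though, is the same.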
To explain how, he shows me a two-dimensional representation: a grid in which each column is an input, each row a film. “Curiously,” Meaney says, “if we block this column…” With one hand, he obliterates the input labelled “star”, casually rendering everyone from Clooney to Cruise, Damon to De Niro, an irrelevancy. “In almost every case, it makes no difference to the money column.”
“For me that’s interesting. The first time I saw that I said to the mathematician, ‘You’ve got to change your program—this is wrong.’ He said, ‘I couldn’t care less—it’s the numbers.’” There are four exceptions to his rules. If you hire Will Smith, Brad Pitt or Johnny Depp, you seem to make a return. The fourth? As far as Epagogix can tell, there is an actress, one of the biggest names in the business, who is actually a negative influence on a film. “It’s very sad for her,” he says. But hers is a name he cannot reveal.
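The blocked-column test Meaney describes is what the trade calls feature ablation: remove one input and see whether the output moves. A toy version, on invented data, makes the idea concrete.

```python
# Feature ablation with invented data: block the "star" column and see
# whether the predicted take moves. Films and factor values are made up.

films = [
    # [star_power, clear_villain, empathy], box office in $m
    ([0.9, 1.0, 0.8], 210),
    ([0.1, 1.0, 0.7], 190),
    ([0.8, 0.0, 0.2], 40),
]

def mean_take(mask):
    """Average take over films matching `mask`; None entries are ignored."""
    takes = [take for feats, take in films
             if all(m is None or abs(f - m) < 0.5 for f, m in zip(feats, mask))]
    return sum(takes) / len(takes)

print("with star:   ", mean_take([0.9, 1.0, 0.8]))   # star included
print("star blocked:", mean_take([None, 1.0, 0.8]))  # column obliterated
```

In this invented data, as in Meaney's, obliterating the star column barely shifts the money column.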
IF YOU TAKE the Underground north from Meaney’s office, you will pass beneath the housing estates of south London. Thousands of times every second, above your head, someone will search for something on Google. It will be an algorithm that determines what they see; an algorithm that is their gatekeeper to the internet. It will be another algorithm that determines what adverts accompany the search—gatekeeping does not pay for itself.
Algorithms decide what we are recommended on Amazon, what films we are offered on Netflix. Sometimes, newspapers warn us of their creeping, insidious influence; they are the mysterious sciencey bit of the internet that makes us feel websites are stalking us—the software that looks at the e-mail you receive and tells the Facebook page you look at that, say, Pizza Hut should be the ad it shows you. Some of those newspaper warnings themselves come from algorithms. Crude programs already trawl news pages, summarise the results, and produce their own article, by-lined, in the case of Forbes magazine, “By Narrative Science”.
Others produce their own genuine news. On February 1st, the Los Angeles Times website ran an article that began “A shallow magnitude 3.2 earthquake was reported Friday morning.” The piece was written at a time when quite possibly every reporter was asleep. But it was grammatical, coherent, and did what any human reporter writing a formulaic article about a small earthquake would do: it went to the US Geological Survey website, put the relevant numbers in a boilerplate article, and hit send. In this case, however, the donkey work was done by an algorithm.
But it is not all new. It is also an algorithm that determines something as old-fashioned as the route a train takes through the Underground network—even which train you yourself take. An algorithm, at its most basic, is not a mysterious sciencey bit at all; it is simply a decision-making process: a flow chart, a computer program that can stretch to pages of code or be as simple as “If x is greater than y, then choose z”.
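Rendered literally in Python, that simplest case is still, by any definition, an algorithm.

```python
# "If x is greater than y, then choose z", rendered literally.
def decide(x, y, z, otherwise):
    return z if x > y else otherwise

print(decide(3, 2, "take the next train", "wait"))
```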
What has changed is what algorithms are doing. The first algorithm was created in the ninth century by the Arabic scholar al-Khwarizmi—from whose name the word is a corruption. Ever since, they have been mechanistic, rational procedures that interact with mechanistic, rational systems. Today, though, they are beginning to interact with humans. The advantage is obvious. Drawing in more data than any human ever could, they spot correlations that no human would. The drawbacks are only slowly becoming apparent.
Continue your journey into central London, and the estates give way to terraced houses divided into flats. Every year these streets inhale thousands of young professional singles. In the years to come, they will be gently exhaled: gaining partners and babies and dogs, they will migrate to the suburbs. But before that happens, they go to dinner parties and browse dating websites in search of that spark—the indefinable chemistry that tells them they have found The One.
And here again they run into an algorithm. The leading dating sites use mathematical formulae and computations to sort their users’ profiles into pairs, and let the magic take its probabilistically predicted course.
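The sites guard their formulae, so what follows is only a guess at the general shape: score every pair of users on what they share, then match the strongest pairs first. The profiles and the scoring rule are invented.

```python
# A guess at the general shape of a matching algorithm, with invented
# profiles: score pairs on shared interests, match strongest pairs first.
from itertools import combinations

profiles = {
    "ann":  {"dogs", "hiking", "jazz"},
    "ben":  {"dogs", "hiking", "opera"},
    "cara": {"cats", "jazz", "wine"},
    "dev":  {"cats", "wine", "hiking"},
}

def compatibility(a, b):
    """Fraction of combined interests the two users share (Jaccard index)."""
    return len(profiles[a] & profiles[b]) / len(profiles[a] | profiles[b])

matched = set()
for a, b in sorted(combinations(profiles, 2), key=lambda p: -compatibility(*p)):
    if a not in matched and b not in matched:
        print(f"{a} <-> {b} (score {compatibility(a, b):.2f})")
        matched.update({a, b})
```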
Not long after crossing the river, your train will pass the server farms of the Square Mile—banks of computers sited close to the fibre-optic cables, giving tiny headstarts on trades. Within are stored secret lines of code worth billions of pounds. A decade ago computer trading was an oddity; today a third of all deals in the City of London are executed automatically by algorithms, and in New York the figure is over half. Maybe, these codes tell you, if fewer people buy bananas at the same time as more buy gas, you should sell steel. No matter if you don’t know why; sell sell sell. In nanoseconds a trade is made, in milliseconds the market moves. And, when it all goes wrong, it goes wrong faster than it takes a human trader to turn his or her head to look at the unexpectedly red numbers on the screen.
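The banana-gas-steel caricature would, in code, be nothing more exotic than a conditional acting on two data feeds. A deliberately crude sketch, with invented signal names and thresholds:

```python
# The banana/gas/steel caricature as code: act on a back-tested correlation
# with no theory behind it. Signal names and thresholds are invented.

def steel_order(banana_sales_change, gas_sales_change):
    """Emit an order for steel from two unrelated-looking retail signals."""
    if banana_sales_change < 0 and gas_sales_change > 0:
        return "SELL"  # the historical correlation says so; nobody knows why
    return "HOLD"

print(steel_order(banana_sales_change=-0.03, gas_sales_change=+0.05))
```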
Finally, your train will reach Old Street—next door to the City, but a very different place. This is a part of town where every office seems to have a pool table, every corner a beanbag, every receptionist an asymmetric haircut. In one of those offices is TechHub. With its bare brick walls and website that insists on being your friend, this is the epitome of what the British government insists on calling Silicon Roundabout. After all, what America can do with valleys, surely Britain can do with traffic-flow measures.
Inside are the headquarters of Simon Williams’s company QuantumBlack. The world, Williams says, has changed in the past decade—even if not everyone has noticed. “There’s a ton more data around. There’s new ways of handling it, processing it, manipulating it, interrogating it. The tooling has changed. The speed at which it happens has changed. You’re shaping it, sculpting it, playing with it.”
QuantumBlack is, he says, a “data science” agency. In the same way as, ten years ago, companies hired digital-media agencies to make sense of e-commerce, today they need to understand data-commerce. “There’s been an alignment of stars. We’ve hit a crossover point in terms of the cost of storing and processing data versus ten years ago. Then, capturing and storing data was expensive, now it is a lot less so. It’s become economically viable to look at a shed load more data.”
When he says “look at”, he means analysing it with algorithms. Some may be as simple as spotting basic correlations. Some apply the same techniques used to spot patterns in the human genome, or to assign behavioural patterns to individual hedge-fund managers. But there is no doubt which of Williams’s clients is the most glamorous: Formula 1 teams. This, it is clear, is the part of the job he loves the most.
“It’s a theatre, an opera,” he says. “The fun isn’t in the race, it’s in the strategy—the smallest margins win or lose races.” As crucial as the driver is when that driver goes for a pit stop, and how his car is set up. This is what QuantumBlack advises on: how much fuel you put in, what tyres to use, how often to change those tyres. “Prior to the race, we look at millions of scenarios. You’re constantly exploring.”
He can’t say which team he is working with this season, but they are “generally at the front of the grid”. Using the tens of billions of calculations per second that are possible these days, his company might offer the team one strategy in which there is a slim chance of winning, but a greater chance of not finishing; another in which there is no chance of winning, but a good chance of coming third.
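QuantumBlack’s models are proprietary, but the flavour of “millions of scenarios” is easy to convey with a toy Monte Carlo simulation in which the only strategic decision is which lap to pit on. Every number below, from lap times to tyre wear to the cost of a stop, is invented.

```python
# A toy Monte Carlo strategy search: try every pit lap thousands of times
# under noisy lap times and pick the lap with the best expected race time.
import random

def race_time(pit_lap, laps=50, base=90.0, wear=0.15, pit_loss=22.0):
    """Total race time: tyres slow with age, a stop resets them at a cost."""
    total, tyre_age = 0.0, 0
    for lap in range(1, laps + 1):
        total += base + wear * tyre_age + random.gauss(0, 0.3)  # traffic noise
        tyre_age += 1
        if lap == pit_lap:
            total += pit_loss
            tyre_age = 0
    return total

def expected_time(pit_lap, trials=2_000):
    return sum(race_time(pit_lap) for _ in range(trials)) / trials

best = min(range(10, 41), key=expected_time)
print(f"Best pit lap in this toy model: {best}")
```

What the real teams add is thousands more variables, live telemetry and far richer models; the principle of simulating the race many times over is the same.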
This, however, is not where Williams’s algorithms really earn their money. To borrow a line from Carl von Clausewitz, the Prussian military strategist, no Formula 1 plan survives first contact with a corner. “You all line up, the lights go out. Three seconds later someone’s crashed or overtaken and plans go out of the window. So the real advantage is being able to pick out what’s happening on the track, learn and adapt. The teams that do that win.”
In real time Williams collects data from thousands of variables. Some are from sensors in his team’s cars. Other data are fuzzier: “We listen to the engine notes of competitors’ cars, on TV. That can tell us their settings. The braking profile of a car on GPS as it goes into a corner can also tell you all sorts of things.” His software then collates all the data, all the positions, and advises on pitting strategy. “If you are taking more than ten seconds to make a decision, you’re losing your advantage. Really you need to be under the eight-second mark. A human couldn’t take in that much data and process it fast.”
By analysing all this data with algorithms, not only can you find patterns no one thought existed, you can also challenge orthodoxies. Such as: a movie star is worth the money.
A FEW YEARS back, when Nick Meaney was just starting in the business, a Hollywood studio approached him confidentially to look at a script. You will have heard of this film. You may well have seen this film, although you might be reluctant to admit it in sophisticated company.
The budget for the film was $180m and, Meaney says, “it was breathtaking that it was under serious consideration”. There were dinosaurs and tigers. It existed in a fantasy prehistory—with a fantasy language. “Preposterous things were happening, without rhyme or reason.” Meaney, who will not reveal the film’s title because he “can’t afford to piss these people off”, told the studio that his program concurred with his own view: it was a stinker.
The difference is the program puts a value on it. Technically a neural network, with a structure modelled on that of our brain, it gradually learns from experience and then applies what it has learnt to new situations. Using this analysis, and comparing it with data on 12 years of American box-office takings, it predicted that the film in question would make $30m. With changes, Meaney reckoned they could increase the take—but not to $180m. On the day the studio rejected the film, another one took it up. They made some changes, but not enough—and it earned $100m. “Next time we saw our studio,” Meaney says, “they brought in the board to greet us. The chairman said, ‘This is Nick—he’s just saved us $80m.’”
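Epagogix’s network is a trade secret; what can be shown is the bare mechanism of any neural network, in which inputs are multiplied through layers of weights that are repeatedly nudged to shrink the error on known examples. The sketch below trains a tiny one on invented films and prices an invented script; it illustrates the idea, not Meaney’s system.

```python
# Not Meaney's system: a minimal neural network. Invented script factors go
# in, a predicted take comes out, and the weights are repeatedly adjusted
# to shrink the error on a training set of invented past "films".
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 5))                       # 200 past films, 5 factors
y = X @ np.array([80, 40, 10, 5, 120]) + 30.0  # invented "true" takings, $m
y_mean, y_std = y.mean(), y.std()
t = (y - y_mean) / y_std                       # normalised training target

W1 = rng.normal(0, 0.5, (5, 8)); b1 = np.zeros(8)   # hidden-layer weights
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)   # output-layer weights

lr = 0.01
for _ in range(5000):
    h = np.maximum(0.0, X @ W1 + b1)             # hidden activations (ReLU)
    pred = (h @ W2 + b2).ravel()
    g_pred = (2 * (pred - t) / len(t))[:, None]  # gradient of mean sq. error
    g_h = (g_pred @ W2.T) * (h > 0)              # backpropagate through ReLU
    W2 -= lr * h.T @ g_pred; b2 -= lr * g_pred.sum(0)
    W1 -= lr * X.T @ g_h;    b1 -= lr * g_h.sum(0)

script = rng.random(5)                           # a new, invented script
h = np.maximum(0.0, script @ W1 + b1)
estimate = (h @ W2 + b2).item() * y_std + y_mean
print(f"Predicted take: ${estimate:.0f}m")
```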
He might well have done, and Epagogix might well have the advantage of being the only company doing this in quite this way. But, Meaney says, it still sometimes feels as if they are “hanging on by our fingertips”. He has allies in the boardrooms of Hollywood, but they have to fight the prevailing culture. Calculations like Meaney’s tend to be given less weight than the fact that, say, the vibe in the room with Clooney was intense, or Spielberg is hugely excited.
Predicting a Formula 1 race, or the bankability of Brad Pitt, is arguably quite a simple problem. Predicting the behaviour of individuals is rather more complex. Not everyone is convinced, for all the claims, that algorithms are really able to do it yet—least of all when it comes to love. Earlier this year, a team of psychologists published an article in the journal Psychological Science in the Public Interest that looked into the claims made by dating websites for their algorithms. They wrote: “Ever since eHarmony.com, the first algorithm-based matching site, launched in 2000, sites such as Chemistry.com, PerfectMatch.com, GenePartner.com, and FindYourFaceMate.com have claimed that they have developed a sophisticated matching algorithm that can find singles a uniquely compatible mate.”
“These claims,” Professor Eli Finkel from Northwestern University wrote, “are not supported by credible evidence.” In fact, he said, there is not “a shred of evidence that would convince anybody with any scientific training”. The problem is, we have spent a century studying what makes people compatible—by looking at people who are already together, so we know they are compatible. Even looking at guaranteed, bona-fide lovebirds has produced only weak hypotheses—so using these data to make predictions about which people could fall in love with each other is contentious.
ALTHOUGH IT IS difficult to predict what attracts a human to another human, it turns out to be rather simpler to predict what attracts a human to a political party. Indeed, the destiny of the free world was, arguably, changed by an algorithm—and its ability to understand people. In October 2008 I was in Denver, covering the American election for the Times. I was embedded with the volunteers. For a month I worked surrounded by enthusiastic students and even more enthusiastic posters: “Hope”, “Change we can believe in”, and, a rare concession to irony, “Pizza we can believe in”.
At this stage in one of the most closely fought elections in American history, few people were going to have their minds changed. So the Obama campaign had no interest in sending us out to win over Republican voters. Our job was just to contact Democrats and get them to vote, and our tool for this was VoteBuilder.
Every night, churning away in the Democratic National Committee’s central servers, the VoteBuilder software combined the list of registered Democrats with demographics and marketing information. The company would not speak to me, but you can guess which data were relevant. Who was a regular churchgoer? Who lived in a city apartment block? Who lived in a city apartment block and had a Hispanic name? Who had lentils on their shopping list? Every morning, it e-mailed the results to us: a list of likely Democrats in Denver, to be contacted and encouraged to vote.
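Whatever VoteBuilder’s real models looked like, the pattern described above, combining demographic signals into a single targeting score and sorting by it, is simple to sketch. The names, the signals and above all the weights below are invented, not real correlations.

```python
# A sketch of the general pattern, not VoteBuilder itself: weight
# demographic signals into a targeting score and call the top of the list.
WEIGHTS = {                       # invented weights, not real correlations
    "registered_democrat": 5.0,
    "city_apartment": 1.5,
    "hispanic_name": 1.0,
    "regular_churchgoer": -0.5,
}

voters = [                        # invented voters
    {"name": "A. Garcia", "registered_democrat": 1, "city_apartment": 1,
     "hispanic_name": 1, "regular_churchgoer": 0},
    {"name": "B. Smith", "registered_democrat": 0, "city_apartment": 0,
     "hispanic_name": 0, "regular_churchgoer": 1},
]

def score(voter):
    """Sum each signal multiplied by its weight."""
    return sum(WEIGHTS[k] * voter.get(k, 0) for k in WEIGHTS)

for v in sorted(voters, key=score, reverse=True):  # the morning call list
    print(v["name"], round(score(v), 1))
```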
As Obama supporters arrived to volunteer from safe Democrat states across the country, the result of these algorithmic logistics was organised stalking on a colossal scale. I was not the only volunteer to call someone on their mobile to convince them to vote, only to discover that they were in the office with me. Some complained of getting five calls a day. One changed his answerphone message to, “I’m a Democrat, I’ve already voted and I volunteer for the campaign.”
But it was effective; tens of thousands of volunteers were mobilised across the country. And in four weeks, only once did VoteBuilder pair me with a likely Republican. Given that his lawn had a sign saying “McCain-Palin”—a flag marking a lonely, defiant Alamo in his part of town—I didn’t go in. Voting intentions, it seems, really can be narrowed down to simple criteria, drawn from databases and weighted in an algorithm. For all its success, which has subsequently been studied by political parties across the world, VoteBuilder was about volume. Somewhere, there would have been country-club members who liked guns but also believed in free health care, or wealthy creationists who favoured closing down Guantánamo. They might well have evaded our stalking.
Equally, when VoteBuilder made a mistake, the worst that would happen would have been an idealistic student finding themselves arguing with someone holding rather different beliefs. In a world controlled by algorithms, though, sometimes the most apparently innocuous of processes can have unintended consequences.
Recently an American company, Solid Gold Bomb, hit on what it thought was a clever strategy. It would be yet another play on the British wartime poster “Keep Calm and Carry On”, now a 21st-century meme. Using a program that trawled a database of hundreds of thousands of words to generate slogans, it then printed the results on T-shirts and sold them through Amazon. Solid Gold didn’t realise what the computer had come up with—until the former deputy prime minister Lord Prescott tweeted, “First Amazon avoids paying UK tax. Now they’re making money from domestic violence.”
T-shirts, it transpired, had been made and sold bearing the slogans “Keep Calm and Rape” and “Keep Calm and Grope a Lot”. The shirts were withdrawn, but the alert had been sounded. An algorithm had designed a T-shirt and put it up for sale on a website that directs users to items on the basis of algorithms, and it was only when it met its first human that anyone knew anything had gone wrong.
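The generator itself was almost certainly trivial, which is rather the point: a plausible reconstruction shows that the safeguard would have been a single line. The word lists below are invented stand-ins, not Solid Gold Bomb’s database.

```python
# A plausible reconstruction, not Solid Gold Bomb's actual code: combine
# words from a database into slogans. The BLOCKLIST check marked below is
# the line that was evidently missing.
from itertools import product

VERBS = ["carry", "dance", "shop", "rape"]  # trawled blindly from a word list
OBJECTS = ["on", "a lot", "all night"]
BLOCKLIST = {"rape", "grope", "hit"}        # the absent human judgment

for verb, obj in product(VERBS, OBJECTS):
    if verb in BLOCKLIST:
        continue                            # the missing safeguard
    print(f"Keep Calm and {verb.title()} {obj.title()}")
```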
The absence of humans in other processes has proven more fraught still. In 2000, Dave Cliff, professor of computer science at Bristol University, was responsible for designing one of the first trading algorithms. A decade later, he was responsible for writing a British government report into the dangers they posed to the world economy.
“Every now and then there were these interactions between trading systems,” he says of his early experience, working for Deutsche Bank. “They were interacting in ways we hadn’t foreseen—couldn’t have foreseen.” Designing a trading algorithm is, he says, “a bit like raising a kid. You can do it all right, but then you send them to the playground and you don’t know who they are going to meet.”
In October 2012, in under a minute, the market value of Kraft increased by 30%. In 2010, the now-infamous “flash crash” briefly wiped a trillion dollars off United States markets. Last March BATS, an American stock-exchange company, floated on its own market at over $15 a share—but there was a glitch. The official explanation, still disputed by some, is that the BATS market software malfunctioned for all stocks with ticker symbols beginning with anything from A to BF. “If they had popped a champagne bottle as they launched the shares,” Cliff says, “by the time the cork hit the floor their value was zero.”
Cliff’s report did find benefits to high-frequency algorithmic trading. It seems to increase liquidity and decrease transaction costs. The problem, he says, is that not enough people understand how it works yet, and there is no proper regulation. His report, which has been endorsed by John Beddington, Britain’s chief scientific adviser, recommends the creation of forensic software tools to analyse the market and help investigations. “The danger is an over-reliance on computer systems which are not well understood,” he said. “I have no problem with technologies. I like flying, I like to give my kids medicine. But I like my planes certified safe, my medicine tested. I prefer to be engaged in capital markets where there are similar levels of trust, and meaningful and incisive investigation when things go wrong.”
WHAT OF THE future of algorithms? In a sense, the question is silly. Anything that takes inputs and makes a decision is an algorithm of sorts. As computer-processing power increases and the cost of storing data decreases, their use will only spread. Almost every week a new business appears that is specifically algorithmic; they are so common that we barely comment on the fact they use algorithms.
Last year Target, an American retailer, yet again proved the power of algorithms, in a startling way. Its software tracks purchases to predict habits. Using this, it chooses which coupons to send customers. It seemed to have gone wrong when it began sending a teenage girl coupons for nappies, much to the anger of her father, who made an official complaint. A little later, the New York Times reported that the father had phoned the company to apologise. “It turns out,” he said, “there have been some activities in my house I haven’t been completely aware of.” He was going to be a grandfather—and an algorithm knew before he did.
Taken together, all this is a revolution. The production line standardised industry. We became a species that could have any colour Model T Ford as long as it was black. Later, the range of colours increased, but never to match the number of customers. Today, the chances are that the recommendations Amazon gives you will match no one else’s in the world.
Soon internet-shopping technology will come to the high street. Several companies are now producing software that can use facial recognition to change the advertising you see on the street. Early systems just spot if you are male or female and react accordingly. The hope—from the advertisers’ point of view, at least—is to correlate the facial recognition with Facebook, to produce a truly personalised advert.
But providing a service that adapts to individual humans is not the same as becoming like a human, let alone producing art like humans. This is why the rise of algorithms is not necessarily relentless. Their strength is that they can take in information in quantities and ways we cannot; but the fact that we cannot understand what they do with it is also a weakness. It is worth noting that trading algorithms in America now account for 10% fewer trades than they did in 2009.
Those who are most sanguine are those who use them every day. Nick Meaney is used to answering questions about whether computers can—or should—judge art. His answer is: that’s not what they’re doing. “This isn’t about good, or bad. It is about numbers. These data represent the law of absolute numbers, the cinema-going audience. We have a process which tries to quantify them, and provide information to a client who tries to make educated decisions.”
Such as? “I was in a meeting last week about the relative merits of zombies versus the undead.” Is there a difference? “The better question is, what is a grown, educated man doing in that discussion? But yes, there is a difference.” (Zombies are gross flesh-eaters; the undead, like Dracula, are photo-sensitive garlic-haters with no reflection in a mirror.)
Equally, his is not a formula for the perfect film. “If you take a rich woman and a poor man and crash them into an iceberg, will that film always make money?” No, he says. No algorithm has the ability to write a script; it can judge one—but only in monetary terms. What Epagogix does is a considerably more sophisticated version, but still a version, of noting, say, that a film that contains nudity will gain a restricted rating, and thereby have a more limited market.
But the hardest bit has already been done. “We presuppose competence.” In other words, all the scripts have the same presumed standard—you can assume dialogue is not overly dire, that special effects will not be catastrophically inept. This is a standard that requires talented people. And that, for actors who aren’t Pitt, Depp or Smith, is the crumb of algorithmic comfort. It is not that Robert De Niro or Al Pacino is worthless; it’s that in this program they are interchangeable. Even if zombies and the undead are not.