Gigantic Study of Fake News Online Finds the Enemy Is Humanity

By Rhett Jones

Over the last year, “fake news” has gone from being a niche concern that charlatans exploited for profit, to a code red existential threat to the fabric of society—or something in between. But our scientific understanding of how and why false stories spread is still limited. Researchers at MIT are diving in to correct that blind spot and for anyone looking to point a finger, we have some bad news.

A new paper published in Science on Thursday is the largest-ever longitudinal study of the spread of false news online. Much of the scientific work that’s been done to assess fake news and its spread through social networks has focused on the study of individual rumours. There’s little research to point to that comprehensively evaluates the differences in the spread of true and false news across a variety of topics, or that examines why false news may spread differently than the truth. MIT’s Soroush Vosoughi and his colleagues took a look at 126,000 rumour cascades tweeted by three million people more than 4.5 million times in order to better understand the qualities that go into an effectively viral news story.

The researchers emphasised to Gizmodo that they had to avoid the term “fake news” because it’s come to mean different things to different people. So, for the purposes of the study, they’ve limited their phrasing to “falsehood” and “false stories.” They set out to answer two primary questions: How do truth and falsity spread differently, and what factors of human judgement explain these differences? The answers aren’t particularly revolutionary, but in a time when pundits, tech giants, politicians, and the public are all flailing around making assertions about fake news, it’s important to have some factual grounding in what we’re talking about.

The researchers used numerous controls, established mathematical models, and systems of approach in their evaluation of a comprehensive dataset of every fact-checked rumour cascade that spread on Twitter from its inception in 2006 to 2017. Deb Roy, one of the paper’s co-authors, told us that Twitter was the obvious network choice for the study because, unlike Facebook, its data is open to the public and MIT “has had a multi-year relationship with Twitter where we have elevated access to that public data.” The authors point to a separate study of rumour cascades from 2014, which focused on Facebook and used a smaller dataset, as the only paper comparable to their approach.

They define a rumour cascade on Twitter as “when a user makes an assertion about a topic in a tweet, which could include written text, photos, or links to articles online.” They write in the paper, “if a rumour ‘A’ is tweeted by 10 people separately, but not retweeted, it would have 10 cascades, each of size one. Conversely, if a second rumour ‘B’ is independently tweeted by two people and each of those two tweets are retweeted 100 times, the rumour would consist of two cascades, each of size 100.”
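To make the counting concrete, here is a minimal sketch of the quoted example in Python. It is an illustration only, not the researchers’ code: the record layout (rumour, tweet ID, retweet count) and the function name `summarise_cascades` are assumptions made for this example. It counts one cascade per independent original tweet and records how many retweets each cascade attracted.

```python
from collections import defaultdict


def summarise_cascades(tweets):
    """Group original tweets by rumour; each original tweet starts one cascade.

    `tweets` is a list of (rumour_id, tweet_id, retweet_count) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    cascades = defaultdict(list)
    for rumour_id, _tweet_id, retweet_count in tweets:
        cascades[rumour_id].append(retweet_count)
    return dict(cascades)


# Rumour "A": tweeted by 10 people separately, never retweeted.
tweets = [("A", f"a{i}", 0) for i in range(10)]
# Rumour "B": tweeted independently by two people, each retweeted 100 times.
tweets += [("B", "b0", 100), ("B", "b1", 100)]

summary = summarise_cascades(tweets)
for rumour, retweet_counts in summary.items():
    print(rumour, "cascades:", len(retweet_counts), "retweets per cascade:", retweet_counts)
```

The point of the distinction is that a rumour repeated by many people who are each ignored looks very different from one that is posted twice and then passed along hundreds of times, even though both involve plenty of tweets.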


Six fact-checking organisations (Snopes, Politifact, Factcheck, Truth or Fiction, Hoax-Slayer, and Urban Legends) were used to determine whether a story was true, false, or mixed. The sample of rumour cascades was drawn from stories these organisations had investigated, and their verdicts on the stories’ veracity agreed between 95 and 98 percent of the time. Each cascade was then quantified using four measures of its diffusion.

Without getting too bogged down in the math and variables used, just know that overall the study found false stories spread farther, faster, deeper, and more broadly than the true stories. One area in which this study differs from others is in the researchers’ attempt to discover any major differences in the way rumours spread between different categories of news. Politics was the big winner in terms of the number of cascades analysed (45,000) and the speed at which its stories went viral. But in all categories, falsehood reigned as the king of virality.

Now, you’re probably thinking bots are to blame for the spread of false news. Devious Russian hackers pushing fake news and comments have become the political boogeymen of 2018, and everyone suddenly thinks they’re a bot expert. But the team at MIT found that bots really don’t seem to make a significant difference in whether or not more false stories are spread in the end.

To determine whether or not an account was a bot, the researchers used a well-regarded detection algorithm called Botometer, formerly known as Bot or Not. Part of Botometer’s appeal for Vosoughi, the study’s lead author, was that it doesn’t make a binary judgement on an account’s authenticity, but instead gives a score between 0 and 1. If the algorithm determined that there was more than a 50 percent chance that an account was a bot, the researchers treated it as one. Comparing the two categories of bots and humans, they found that each shared falsehoods at about the same rate.
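The thresholding step described above is simple enough to sketch in a few lines of Python. This is a rough illustration under stated assumptions, not the study’s code and not the Botometer API: the account names and scores are invented, and `is_bot` is a hypothetical helper. The only detail taken from the article is the cut-off, with scores above 0.5 treated as bots.

```python
BOT_THRESHOLD = 0.5  # cut-off described in the article: more than a 50 percent chance


def is_bot(score, threshold=BOT_THRESHOLD):
    """Return True when a 0-to-1 bot score exceeds the threshold."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return score > threshold


# Invented scores for illustration; these are not real Botometer results.
accounts = {"alice": 0.12, "newsbot42": 0.91, "carol": 0.50}

labels = {name: ("bot" if is_bot(score) else "human") for name, score in accounts.items()}
print(labels)
```

Note that an account scoring exactly 0.5 stays in the human group here, because the article describes the rule as “more than” a 50 percent chance. Where the line is drawn is a design choice, and a different cut-off would shift borderline accounts between the two groups.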

The research isn’t saying bots aren’t a factor in the spread of fake news; they simply don’t explain the difference between how false and true news spreads. “What we see is that when we remove the bots from our analysis, the difference between how false and true news spreads still stands,” Vosoughi says. He does say that bots do “move the needle” a little bit when they’re included in the data, spreading false stories rather than true ones just a little more than humans did, but the difference is minimal. And it makes sense: humans programmed the bots to act like humans, which brings us to the authors’ conclusion as to why false stories spread more than the truth: human nature.

Analysis of users’ comments on news found that false stories inspired fear, disgust, and surprise, while true stories inspired anticipation, sadness, joy, and trust. Above all, surprise was the biggest reaction to false news, which leads Vosoughi to believe that fake news has more to do with human nature and its attraction to novelty than anything else.

Roy acknowledges that it’s well known both anecdotally and through communication studies that people are more likely to share negative news. He points to the work of Claude Shannon, the father of information theory, and his formulations that have been summarised as “information is surprise.”
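For readers who want the idea behind “information is surprise” in concrete terms, information theory measures the surprise of an event with probability p as -log2(p) bits: the less likely the event, the more information it carries. The short Python sketch below illustrates that standard formula only. It is not part of the study, and the probabilities in it are made up for the example.

```python
import math


def surprisal_bits(p):
    """Shannon surprisal (self-information) in bits of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


# Made-up probabilities: a claim you half expect versus one you find very unlikely.
expected_claim = surprisal_bits(0.5)
unlikely_claim = surprisal_bits(0.01)

print(f"expected claim: {expected_claim:.2f} bits")
print(f"unlikely claim: {unlikely_claim:.2f} bits")
```

On this measure, a claim that contradicts what a reader already believes scores higher than one that merely confirms it, which fits the researchers’ point about novelty.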

But on a certain level, the idea that a false story is surprising to people could be a sign of hope. After all, in order for one to find false information surprising, they likely already have the knowledge that is being contradicted as a foundation. One could see that as an indication that people are better informed than we think, and maybe a false story is just a drop in the bucket. The authors aren’t so optimistic. Roy points out that this also requires false stories to become especially false because a story that contradicts a lot of what you know is more likely to be surprising—a fake news arms race ensues. He adds that the subject of politics may be especially vulnerable because people don’t have a lot of direct grounding and knowledge of the truth. “You don’t have to conform to reality,” he says. “It’s easier to surprise someone.”

As far as a next step towards adapting this research into some sort of executable solution that reduces the spread of false stories, Vosoughi suggests that we’re going to need to do more experiments with behavioural intervention. He points to possible reputation scores for users and news outlets that could be integrated into social media interfaces. This isn’t a new idea and has been part of several initiatives announced by Facebook—initiatives that have so far been poorly conceived and seem unrealistic.

I pointed out that the study found structural elements of the network didn’t seem to play much of a role in the spread of fake news. Major Twitter influencers were more likely to share true stories. False stories were more likely to be shared by users with low follower counts and no blue check mark, yet they still spread farther and faster. So, if people aren’t paying attention to the indicators of a source’s veracity that we already have, why would additional indicators make a difference?

Roy said he shares my scepticism but he thinks New York City’s requirement that restaurants post calorie counts on their menus is an interesting example of how scores based on reputation could have a positive impact. The regulation is largely seen as a failure in its effort to encourage New Yorkers to eat healthier. “Some people who are seeking to self-regulate their calorie intake actually did reduce their calorie intake as a result of those postings,” he said. “But maybe to the surprise of New York Health Commissioner, some people actually used that information to increase their calorie intake to get more efficient calories per dollar.” But an overlooked aspect of the program is that some studies found there was a broader network effect in which restaurants that display calorie counts began introducing lower-calorie items. He thinks it’s possible that we could see similar outcomes with labelling our news diet on social media. Some people who want that information will seek it out, others will prefer to wallow in the filth, Roy said, “and it may have interesting pressure on the content producers who care about their reputation.”

It’s clear that the reputational score of tech and media companies is dropping with the general public on a daily basis. As we get a greater understanding of how fake news spreads, we’ll see if anyone cares enough to do something about it. [Science]