2016 has not been a good year to be a celebrity. Since the start of the year it has felt like every few days we lose another big name. David Bowie died on January 10th, and Alan Rickman followed him less than a week later. Since then we’ve lost Terry Wogan, Keith Emerson, Sir George Martin, Nancy Reagan, Ronnie Corbett, Victoria Wood, most recently Prince, and countless others. Even Admiral Ackbar himself, Erik Bauersfeld, passed away. It makes you wonder… is there something weird going on?
Is 2016 cursed? Are more celebrities dying than usual? I decided to find out - and spoiler alert… it looks like the answer could indeed be YES.
The trouble with investigating this is that it is very hard to define who a “celebrity” is. David Bowie undoubtedly is - that’s easy. But what about Nancy Reagan, who died in March? She wasn’t a performer of any sort, but was probably as well known as Bowie by virtue of being the wife of a former President. And how well known does someone have to be before they count as a celebrity? Again, Bowie and Prince for sure. Victoria Wood and Ronnie Corbett? Definitely if you’re British - but it is less clear elsewhere. And what about Swiss-Lithuanian chess grandmaster Viktor Gavrikov, who died last month? How do we measure fame, and where do we draw the line?
Wikipedia to the Rescue
And this is where I had a realisation: Wikipedia neatly answers both of these questions.
To get a page on Wikipedia you need to be classed as “notable”. This is hugely subjective obviously, but the collective wisdom of Wikipedia’s editors seems like a reasonably fair way of judging it, as the inclusion of each page is argued about and voted on. There’s no one individual subjectively arbitrating, or saying that musicians count and politicians don’t because of their own personal preference. So we have a definition of “celebrity” - albeit one that is fairly broad. If you’re famous enough for Wikipedia, you count.
But what about measuring how famous someone is?
For this, we can steal a trick from Google. The (grossly simplified) way that Google ranks pages is by the number of inbound links to a page, as each link works as a proxy for an endorsement that the page is a good one that is worth looking at. Helpfully, Wikipedia has a tool to view the number of other wiki pages linking in - and this works as an easy means to see how impactful, and thus how famous, someone was.
For example, David Bowie’s page is linked to on over 7000 other wiki pages - unsurprising given his decades of success and huge influence on those who came after him. Someone less famous, like Canadian wrestler Mike Sharpe, who died on the 17th January, has only 50 links inbound. So thanks to this, we have a pretty good measure of someone’s fame too.
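As a rough illustration of the scoring idea, here is a minimal Python sketch that counts inbound links through the MediaWiki API's `list=backlinks` query. This is an assumption about how you might do it, not the code used for this article: the function names, the User-Agent string and the decision to count only article-namespace links are all made up for the example, and the numbers it returns may not match the "what links here" tool exactly.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Fetch one page of API results (hypothetical helper for this sketch)."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "celebrity-score-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_inbound_links(title, fetch=fetch_json):
    """Count article pages that link to `title`, following API continuation."""
    params = {
        "action": "query",
        "list": "backlinks",
        "bltitle": title,
        "blnamespace": 0,   # assumption: only count links from articles
        "bllimit": "max",
        "format": "json",
    }
    total = 0
    while True:
        data = fetch(params)
        total += len(data.get("query", {}).get("backlinks", []))
        continuation = data.get("continue")
        if not continuation:
            return total
        # Merge the continuation tokens into the next request.
        params = {**params, **continuation}
```

Calling `count_inbound_links("David Bowie")` would give that page's score under these assumptions. The `fetch` argument is there so the counting logic can be checked without touching the network.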
Of course, Wikipedia isn’t perfect: It could be biased towards certain fields or topics (the Star Trek and Doctor Who sections are particularly detailed, unsurprisingly), the data could conceivably be bad (literally anyone can edit the site), or the source data could have been edited unevenly - with more updates for some years than for others. But if we compare different years, it at least is comparing almost-like with almost-like. I’ve only gone back as far as 2010, as it is plausible that deaths since then have been recorded as they happened, rather than added retrospectively, which should hopefully mean the real volume of deaths is reflected.
There is also almost certainly a bias towards people who are famous in the English-speaking world on the English language Wikipedia. But this would at least be consistent with our own biases on who counts as a celebrity from our position as someone who lives in the English-speaking world.
I would argue that the data is broadly good enough for us to extract some meaning.
I’ve spent the last few weeks swearing at my computer whilst writing code to extract the data on around 33,000 listed individuals who died between January 2010 and April 2016, and tallying up the number of pages that link to each of their pages to give them a score.
So, finally, caveats out the way and methodology explained, let’s try and finally answer the question: is 2016 really as cursed as it seems?
What the Data Says
Let’s look first at the total number of notable individuals who have died in a given year. We’ll be looking at the first 121 days (the cut-off point for our research) of each year from 2010 to 2016, with the cumulative total of the number of celebrities who have died by each day. (Because of leap years, we couldn’t simply say January-April, as that would leave us with an “extra” day in two of the years.)
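To make the 121-day window concrete, here is a small Python sketch of the cumulative tally. It is an illustration under my own assumptions (the function name and the idea of passing in a plain list of dates are invented for the example), not the code behind the charts. Counting by day-of-year rather than by month is what keeps leap years and ordinary years comparable.

```python
from datetime import date
from itertools import accumulate

WINDOW_DAYS = 121  # the cut-off used in this article


def cumulative_deaths(death_dates, year, window=WINDOW_DAYS):
    """Return a list where entry i is the number of deaths in `year`
    on or before day i + 1 of that year, for the first `window` days."""
    daily = [0] * window
    for died in death_dates:
        day_index = died.timetuple().tm_yday - 1  # 1 January -> index 0
        if died.year == year and day_index < window:
            daily[day_index] += 1
    return list(accumulate(daily))


# Day 121 is 30 April in a leap year, but 1 May in an ordinary year.
totals_2016 = cumulative_deaths(
    [date(2016, 1, 10), date(2016, 1, 14), date(2016, 4, 30), date(2016, 5, 1)],
    2016,
)
```

In this toy example the last entry of `totals_2016` counts the first three dates only, because 1 May 2016 is day 122.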
As you can see from the pinkish line, this year has been a bad year for the number of celebrity deaths, but not the worst. In fact, by this point last year more famous names had died. Over the 121 day period we’d lost Philip Seymour Hoffman, Shirley Temple, Pete Seeger, Bob Hoskins, Tony Benn and… notorious firebrand Fred Phelps, Patriarch of the so-called “Most Hated Family in America”. Hey, we didn’t say that you have to like the celebrities.
What’s curious though is that it appears that as 2015 went on, the rate of famous people dying fell. Take the average number of notable deaths per day so far this year, and use it to project the number of notable deaths for the whole year: if celebrities keep dying at the current rate, this year will indeed be the worst year for celebrity deaths.
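The projection is just a straight-line extrapolation, which can be sketched in a few lines of Python. The function name is made up and the figures in the usage line are invented round numbers, not the real totals.

```python
import calendar


def project_year_total(deaths_so_far, days_elapsed, year):
    """Extrapolate a full-year total from the average daily rate so far."""
    days_in_year = 366 if calendar.isleap(year) else 365
    daily_rate = deaths_so_far / days_elapsed
    return daily_rate * days_in_year


# Invented example: 1,210 deaths in the first 121 days is 10 a day.
projected = project_year_total(1210, 121, 2016)
```

This assumes the death rate stays flat for the rest of the year, which is exactly the assumption that 2015's slowdown should make us wary of.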
Now what if we introduce the scoring system described above into the data? Sure, lots of people have been dying, but the totals for each year are still pretty close. Are the people who are dying this year more famous than those who died over the last few?
Here’s the first 121 days of the seven years we’re looking at plotted against each other, with their ‘score’ based on the number of other Wikipedia pages linking into theirs.
According to these numbers, perhaps the reason we think we’re losing a lot of big names this year is because the year started with Bowie, and has been bookended so far by Prince. (Alan Rickman is that smaller bump a week after Bowie - Wogan, Corbett and Wood are in the noise at the bottom of the chart.)
But other years also had huge notable deaths. In 2012 we lost Whitney Houston, and in 2013 we lost Margaret Thatcher and Venezuelan President Hugo Chávez. This chart also demonstrates one potential bias with this sort of analysis: The huge score for Roger Ebert isn’t just because he was a highly esteemed film critic, but also because his name features in the “critical reception” section of thousands of film pages on the site, perhaps bumping him up to a higher score than he would otherwise ‘deserve’.
Where this scoring data really starts to tell us something though is when it is plotted cumulatively. In other words, the combined scores of everyone who has died so far in each year.
Though more people had died by this point in 2015, overall, 2016’s deaths have received a higher score - suggesting that yes, the people who died this year are more famous. Which perhaps explains why the notable losses this year have felt so much more dramatic.
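The difference between counting deaths and adding up scores can be shown with a short Python sketch. Everything here is illustrative: the function name is hypothetical and the two "years" are invented toy data, chosen only to show how a year with fewer deaths can still end up with a higher combined score.

```python
from itertools import accumulate


def running_totals(records, window=121):
    """Given (day_of_year, score) pairs, return two lists: the cumulative
    number of deaths and the cumulative score for each day in the window."""
    counts = [0] * window
    scores = [0] * window
    for day, score in records:
        if 1 <= day <= window:
            counts[day - 1] += 1
            scores[day - 1] += score
    return list(accumulate(counts)), list(accumulate(scores))


# Invented data: three modestly linked people versus two heavily linked ones.
year_a = [(10, 50), (20, 60), (30, 40)]
year_b = [(10, 7000), (112, 5000)]

counts_a, scores_a = running_totals(year_a)
counts_b, scores_b = running_totals(year_b)
```

Here `counts_a` finishes ahead of `counts_b`, while `scores_b` finishes far ahead of `scores_a`.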
What About the Megastars?
So is what we’re collectively noticing not that more notable people are dying, but that more megastars are dying? Though the Canadian wrestler or chess grandmaster dying will be sad for their fans, it isn’t like they are household names: are the people who are dying the mega-famous ones?
To find this out, I filtered the data to only include celebrities scoring over 500 - who I’ll refer to as the megastars - and the results are perhaps even more striking.
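The megastar filter itself is a one-liner. Here is a hedged Python sketch: the record layout and function name are assumptions, Bowie's and Sharpe's scores are rounded from the figures quoted earlier, and the third entry is a made-up borderline case to show that "over 500" excludes exactly 500.

```python
MEGASTAR_THRESHOLD = 500  # the cut-off used in this article


def megastars(people, threshold=MEGASTAR_THRESHOLD):
    """Keep only the people whose inbound-link score is over the threshold."""
    return [person for person in people if person["score"] > threshold]


people = [
    {"name": "David Bowie", "score": 7000},       # "over 7000" in the text
    {"name": "Mike Sharpe", "score": 50},
    {"name": "Hypothetical Person", "score": 500},  # invented borderline case
]

megastar_names = [person["name"] for person in megastars(people)]
```

The filtered list can then be fed through the same cumulative counting and scoring as before.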
First, in terms of the absolute numbers of megastars who have died, according to the cumulative totals by the end of April we had indeed lost more than any of the other years we’re comparing against.
And this difference is even more striking when you compare the megastars’ scores.
Notice the massive gap in scores - suggesting we’re losing more megastars this year than in any other. Though, as we learned initially, by this point in 2015 we’d lost more notable people, it turns out that not as many of them were of the calibre of those we’ve lost so far this year.
You’re Not Going Mad
What this means - assuming you go along with all of the caveats described above - is that yes, there really are more famous people dying this year. And not just any famous people: really famous people. 2016 has been a brutal year so far, and now we have the evidence. Let’s just hope the rest of this year doesn’t continue like it began.