Another Psychological Study Fails the Reproducibility Test

By Jennifer Ouellette

The field of psychology is currently in the midst of a kind of civil war, with one side claiming a widespread reproducibility crisis and the other just as loudly proclaiming that such concerns are greatly exaggerated. There’s certainly evidence for the former. Last year, a University of Virginia initiative called the Reproducibility Project repeated 100 experiments and successfully replicated only about one-third of them.

Add yet another one to that list: a classic 30-year-old study concluding that people who smiled while holding a pen between their teeth thought cartoons were funnier.

It’s known as the “facial feedback hypothesis,” and it harks back to 19th-century American psychologist William James, who thought that bodily responses like sweaty palms or a rapidly beating heart weren’t the result of emotions such as anxiety, panic, and fear, but actually caused them. In other words, smiling makes us happier, and frowning makes us sadder. Subsequent research seemed to support this hypothesis, including the aforementioned German study, published in 1988.

Then Dutch researchers at the University of Amsterdam decided to replicate that experiment—notably at the suggestion of Fritz Strack, the original lead author—collaborating with several other labs around the world. In a new paper in Perspectives on Psychological Science, they reported a failure to replicate the 30-year-old findings “in a statistically compelling fashion,” concluding, “Overall, the results were inconsistent with the original result.”

The experimental setup. (Image: Quentin Gronau/Flickr)

While nine of the labs involved in the replication effort reported findings similar to the original 1988 study, the effects weren’t as statistically strong, and disappeared entirely when those results were combined with the findings of eight other labs that found no evidence for the hypothesis.

However, as Christian Jarrett observed in the British Psychological Society’s Research Digest, “[T]his does not mean the entire facial feedback hypothesis is dead in the water. Many diverse studies have supported the hypothesis, including research involving participants who have undergone botox treatment, which affects their facial muscles.”

The researchers did their best to recreate the original conditions for their pool of 1894 student subjects, right down to the use of cartoonist Gary Larson’s iconic Far Side cartoons. But there were some differences. Most notably, participants watched the instructions on video, and were videotaped as they performed the tasks to ensure they were doing so correctly.

Strack himself raised questions about the methodology and statistical analysis in a commentary accompanying the new paper, arguing that the findings should not be deemed conclusive just yet. For instance, the pool of participants was drawn from psychology students, who were very likely to have read about the original 1988 study in their textbooks and guessed the true purpose of the recreated experiment, which could have skewed the results.

Furthermore, Strack thought that videotaping the subjects might have made them more self-conscious, also skewing the results. He also questioned whether today’s students would relate to The Far Side’s uniquely 1980s brand of humour. “It is indicative that one of the four exclusion criteria was participants’ failure to understand the cartoons,” he wrote. [Perspectives on Psychological Science via Improbable Research]