Three Cheers for Failure!
Last week I vowed to pay more attention to replication in psychology experiments. Repeated experiments are an important test of whether a finding is “really out there” or an accident. So, as a number of psychologists have lately been telling the public, it is a problem that many experiments are never repeated, and that when they are, failures to replicate are often consigned to the file drawer. A new hypothesis often excites people; the “null hypothesis” (nothing’s out there, the prediction was wrong), not so much.
A big help in countering such prejudice is an online journal I hadn’t heard of until this week: the Journal of Articles in Support of the Null Hypothesis. As the name suggests, it publishes researchers’ reports that they fought the null and the null won, saving those papers from the file drawer and the wastebasket. Journals (and journalism even more) offer mostly a dance of research success (“they tried it, and, lo, it worked!”). This place is a haven for failure’s dull, unwelcome (but so necessary!) song.
Perusing the journal’s back issues, I see it’s my empirical duty (this really is not fun) to revisit some research I posted about here: experiments showing that people who feel bad about their moral status are more eager to wash physically and, conversely, that people asked to help out a stranger were more likely to do so if they had not had a chance to wash first. This “Macbeth effect” suggests that moral and physical cleanliness are associated in an alarming way: feeling good about the one seemed to incline people to give themselves a pass about the other.
It’s an intriguing idea, which makes sense to me (and which aligns with a number of other studies performed elsewhere). But in a paper last year in the JASNH (cited in full below), four psychologists who tried, in two separate projects, to replicate the “Macbeth effect” experiments report that they could not do it. They were not, they’re careful to say, doing exactly the same thing as the Macbeth authors, Chen-Bo Zhong and Katie Liljenquist. On the other hand, they used more people than the original studies, which is generally seen as a good way to avoid accidental effects.
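For readers who like to see the arithmetic behind that last point, here is a minimal sketch of why more participants leave less room for flukes. It uses the textbook standard error for the difference between two group means. The group sizes (20 and 80) and the standard deviation of 1.0 are made up for illustration and are not taken from either paper, and the function name is my own.

```python
import math


def se_of_mean_difference(sd: float, n_per_group: int) -> float:
    """Standard error of the difference between two independent group means.

    Assumes (for simplicity) that both groups have the same standard
    deviation and the same number of participants.
    """
    return sd * math.sqrt(2.0 / n_per_group)


# Hypothetical numbers, chosen only to show the pattern.
SD = 1.0
small_study = se_of_mean_difference(SD, 20)   # 20 people per group
large_study = se_of_mean_difference(SD, 80)   # 80 people per group

print(f"small study: chance wobble of about {small_study:.3f}")
print(f"large study: chance wobble of about {large_study:.3f}")
```

Under these assumptions, quadrupling the number of people per group halves the chance wobble in the measured difference, so a gap between groups that arises purely by accident tends to be smaller in the bigger study.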
This doesn’t mean the “Macbeth effect” is toast. It just means that the first paper’s finding—never mind that it’s novel, never mind that personally I find it appealing, never mind that it was published in Science—hasn’t been established as a sure thing. All perfectly normal for ongoing research, much as everyone (researchers and fans alike) would prefer a story of clear triumph or tragedy.
Fayard, J. V., Bassi, A. K., Bernstein, D. M., & Roberts, B. W. (2009). Is cleanliness next to godliness? Dispelling old wives’ tales: Failure to replicate Zhong and Liljenquist (2006). Journal of Articles in Support of the Null Hypothesis, 6 (2), 21-29. ISSN: 1539-8714
The paper that Fayard et al. failed to replicate:
Zhong, C. B., & Liljenquist, K. (2006). Washing away your sins: Threatened morality and physical cleansing. Science, 313 (5792), 1451-1452. PMID: 16960010