A major investigation into scores of claims made in psychology research journals has delivered a bleak verdict on the state of the science.
An international team of experts repeated 100 experiments published in top psychology journals and found that they could reproduce only 36% of original findings.
The study, which saw 270 scientists repeat experiments on five continents, was launched by psychologists in the US in response to rising concerns over the reliability of psychology research.
“There is no doubt that I would have loved for the effects to be more reproducible,” said Brian Nosek, a professor of psychology who led the study at the University of Virginia. “I am disappointed, in the sense that I think we can do better.”
“The key caution that an average reader should take away is any one study is not going to be the last word,” he added. “Science is a process of uncertainty reduction, and no one study is almost ever a definitive result on its own.”
All of the experiments the scientists repeated appeared in top ranking journals in 2008 and fell into two broad categories, namely cognitive and social psychology. Cognitive psychology is concerned with basic operations of the mind, and studies tend to look at areas such as perception, attention and memory. Social psychology looks at more social issues, such as self esteem, identity, prejudice and how people interact.
In the investigation, a whopping 75% of the social psychology experiments were not replicated, meaning that the originally reported findings vanished when other scientists repeated the experiments. Half of the cognitive psychology studies failed the same test. Details are published in the journal Science.
Even when scientists could replicate original findings, the sizes of the effects they found were on average half as big as reported the first time around. That could be due to scientists leaving out data that undermined their hypotheses, and to journals accepting only the strongest claims for publication.
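To see how selective publication alone could shrink effects on a second look, consider a rough simulation. It is only a sketch under stated assumptions, not the analysis the replication team ran: the true effect size (0.3), the group size (30 people) and the rule that only statistically significant results get published are all invented for illustration, as are the function names.

```python
import random
import statistics


def simulate_study(true_effect, n, rng):
    """Observed standardised difference between two groups of n people each."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    diff = statistics.mean(treated) - statistics.mean(control)
    pooled_sd = ((statistics.variance(control) + statistics.variance(treated)) / 2) ** 0.5
    return diff / pooled_sd


def published_vs_replicated(true_effect=0.3, n=30, studies=5000, seed=1):
    """Average effect in 'published' originals versus their replications.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Approximate significance cut-off: the standard error of the effect
    # size is roughly sqrt(2 / n), and only results beyond 1.96 standard
    # errors are assumed to reach print.
    threshold = 1.96 * (2.0 / n) ** 0.5
    originals = [simulate_study(true_effect, n, rng) for _ in range(studies)]
    published = [d for d in originals if d > threshold]
    # Each published study is repeated once, with no filter on the outcome.
    replications = [simulate_study(true_effect, n, rng) for _ in published]
    return statistics.mean(published), statistics.mean(replications)


if __name__ == "__main__":
    pub, rep = published_vs_replicated()
    print(f"mean published effect:  {pub:.2f}")
    print(f"mean replication effect: {rep:.2f}")
```

Because only the luckiest original results clear the bar, the published average sits well above the true effect, while the unfiltered replications fall back towards it. Nobody in this toy example has done anything wrong; the filter does the work.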
Despite the grim findings, Nosek said the results presented an opportunity to understand and fix the problem. “Scepticism is a core part of science and we need to embrace it. If the evidence is tentative, you should be sceptical of your evidence. We should be our own worst critics,” he told the Guardian. One initiative now underway calls for psychologists to submit their research questions and proposed methods to probe them for review before they start their experiments.
John Ioannidis, professor of health research and policy at Stanford University, said the study was impressive and that its results had been eagerly awaited by the scientific community. “Sadly, the picture it paints – a 64% failure rate even among papers published in the best journals in the field – is not very nice about the current status of psychological science in general, and for fields like social psychology it is just devastating,” he said.
But he urged people to focus on the positives. The results, he hopes, will improve research practices in psychology and across the sciences more generally, where similar problems of reproducibility have been found before. In 2005, Ioannidis published a seminal study that explained why most published research findings are false.
Marcus Munafo, a co-author on the study and professor of psychology at Bristol University, said: “I think it’s a problem across the board, because wherever people have looked, they have found similar issues.” In 2013, he published a report with Ioannidis that found serious statistical weaknesses were common in neuroscience studies.
Nosek’s study is unlikely to boost morale among psychologists, but the findings simply reflect how science works. In trying to understand how the world works, scientists must ask important questions and take risks in finding ways to try to answer them. Missteps are inevitable if scientists are not being complacent. As Alan Kraut at the Association for Psychological Science puts it: “The only finding that will replicate 100% of the time is likely to be trite, boring and probably already known: yes, dead people can never be taught to read.”
There are many reasons why a study might not replicate. Scientists could use a slightly different method the second time around, or perform the experiment under different conditions. They might fail to find the original effect by chance. None of these would negate the original finding. Another possibility is that the original result was a false positive.
Among the experiments that stood up was one that found people are equally adept at recognising pride in faces from different cultures. Another backed up a finding that revealed the brain regions activated when people were given fair offers in a financial game. One study that failed replication claimed that encouraging people to believe there was no such thing as free will made them cheat more.
Munafo said that the problem of poor reproducibility is exacerbated by the way modern science works. “If I want to get promoted or get a grant, I need to be writing lots of papers. But writing lots of papers and doing lots of small experiments isn’t the way to get one really robust right answer,” he said. “What it takes to be a successful academic is not necessarily that well aligned with what it takes to be a good scientist.”