We Tried to Publish a Replication of a Science Paper in Science. The Journal

Why is it not an error that an experiment cannot be repeated? Isn’t repeatability a basic requirement for any experiment for it to be considered valid?

Because figuring out why an experiment cannot be repeated is a rather large undertaking. It could be some sort of error in the original study, though none was reported. It could be the replication experiment had an error and the original was correct. It could be that the experimental conditions were subtly different in an unexpected way. Especially in social sciences the effects are small and the variables are hard to control.

Actually, no, not really. Repeatability in the sense that it can be theoretically repeated (specific methods and required code/data are often included in supplemental information) is a feature of science. However, that doesn’t mean in practice that research is actually repeated as a whole, and I would think very rarely before publication. For example,

In the work, published on 27 August in Nature Human Behaviour, researchers attempted to reproduce 21 social-science results reported in Science and Nature between 2010 and 2015 and were able to reproduce 62% of the findings. That’s about twice the rate achieved by an earlier effort that examined the psychology literature more generally, but the latest result still raises questions about two out of every five papers studied.

Sometimes reviewers will try to replicate the analysis, and they should review the methodology to see if there are any issues, but given that most experiments take large amounts of time and money, it makes sense that full reproduction is not common.

That doesn’t make any sense to me. If reproducibility is only around the 50% mark, then it makes sense to reproduce the important or expensive studies. That way, people don’t waste money going down a rabbit hole only to find out years later that the initial experiment’s results were not reliable or repeatable. Put a “P” value on the percentage of studies published in journals that can be confirmed independently. If a newspaper had such low reliability in its stories, nobody would trust it!

OK, so maybe I should qualify a bit. Reproducibility studies are fairly rarely published, depending on the field. Original research is where the money and “productivity” are, and it’s how people get more grants and tenure and such. Some fields, especially “fuzzy” ones like social sciences and medicine, are taking more pains to increase reproducibility, but it’s an uphill battle given how bad the starting point is.

However, in the course of doing science, you often rely on other people’s work. So, for instance, if I’m doing a 10-step synthesis of a chemical compound, I’m not going to make all 10 steps up out of thin air. I would go to the literature and find a published procedure for the step I want (or near to it) and then try it out. Often enough in synthetic chemistry, the first time you try a published procedure it won’t work. So, you try to figure out what they missed in the procedure. It could be as trivial as climate – some reactions depend on humidity, and if the original paper’s lab was in Florida and I’m in Colorado, then there could be a problem. In any case, once I figure out how to make the procedure work in my lab, I’m very unlikely to write up what worked for me and send it to the journal of the original procedure to be published.

[Edit] Forgot my original point with that story – even though journal articles are rarely formally reproduced (especially in entirety), the hope is they are used and informally reproduced in that way. If there’s something wrong, new original research should include a better method. Scientists love to prove each other wrong.


There is also the question of resources. When you apply for a research grant they usually want to see proposals for new and original research, not replications of previous work. At best, new research will indirectly test the accuracy of previous work, as @Jordan describes. Therefore, it is difficult to come up with the time and money to repeat other people’s work, and we often have to rely on experiments that build on previous work.

We should also point out that no study is 100% proof of anything. We always have to assess possible demographic effects, population sampling, population variability, and other possible sources of variability. Experiments are almost always based on a small sampling of a larger population.
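To put rough numbers on the sampling point, here is a minimal simulation sketch. Everything in it is an assumption for illustration, not something taken from the paper in the OP: the effect size of 0.3 standard deviations, the group sizes of 30 and 300, the normal-approximation cutoff of 1.96, and the function names `simulate_study` and `detection_rate`.

```python
import random
import statistics


def simulate_study(true_effect, n, rng):
    """Simulate one two-group study with n people per group.

    Returns True if the difference in means passes a rough
    z-test at the 5% level (normal approximation, assumed here
    for simplicity rather than an exact t-test).
    """
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    diff = statistics.mean(treated) - statistics.mean(control)
    # Standard error of the difference between two independent means.
    se = ((statistics.variance(control) + statistics.variance(treated)) / n) ** 0.5
    return abs(diff / se) > 1.96


def detection_rate(true_effect, n, trials=2000, seed=0):
    """Fraction of simulated studies that report a 'significant' effect."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = sum(simulate_study(true_effect, n, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (30, 300):
        rate = detection_rate(0.3, n)
        print(f"n={n} per group: effect detected in {rate:.0%} of studies")
```

Under these assumptions, the small studies should miss a real effect far more often than they find it, so two honest teams running the same small experiment can easily report opposite conclusions without either one making an error.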


If we look at the larger picture, these scientists are just pushing the costs down the line to other groups doing similar research. The overall financial loss as well as loss of time should be much higher in the long run. This is why other industries have costly quality control systems in place.
In cases where the loss due to poor QC is transferred to the public, such as in medicine or the automobile sector, governments take a hand in QC and insist on certain minimum standards being met.
I wonder if anyone has done an audit to find out what the loss to the general public and investors (people who give grants) is in terms of false results being accepted.

How would you put quality control in place for the types of studies described in the OP?

The results in the paper in the opening post weren’t false results. They were the true results for the group of people they tested.

Larger sampling sizes across a larger geographic spread… perhaps two teams doing the same study simultaneously.

The results were not representative of either group I guess…

That’s a lot of money and time. How large is large enough? What area of the country should the study be done in? Once you involve multiple institutional review boards and local regulatory authorities you are looking at a massive paperwork headache to boot. The logistics of modern research can be very overwhelming.

The results were representative of both groups. That’s the point.

Let’s ask a different question. What if the papers were published in the reverse order? Which paper would you be saying is incorrect? If the first paper published showed no correlation, and then the second paper did show a correlation, would you be saying that the group that showed no correlation was wrong?
