Does Science Work by Falsifiability?

I’m curious about @Puck_Mendelssohn’s and @David_MacMillan’s insight on the legal definition of science. Does it still include falsification?

I completely agree with this. For whatever reason, I was not steeped in Popperian inflexibility (which the discussion above shows was never even true of Popper himself), and count some very accomplished philosophers of science among my friends. (I have mentioned Del Ratzsch here before I think.)

My problem with most of these discussions is that they ask a question (“What is science?”) that is not very interesting to me by itself and is too often used to obscure much more important questions about an idea or a proposal. I made this point explicitly when discussing ID 10+ years ago. “Design” is an idea that can exist within science, regardless of how it’s defined, and that is worthy of deep discussion whether or not someone judges it to be “scientific.”

That doesn’t mean that falsifiability isn’t an important thing to consider when weighing an idea. I have argued that “design,” specifically by an omnipotent deity, is inherently unfalsifiable, in principle. I don’t care whether this makes ID “science” and that’s not my point. It does make ID (when involving omnipotent deities) a thing that no discovery can ever disprove in principle or in practice, and I feel strongly that this means that ID should be discussed and defended with that fact in mind.


I don’t think there really is a single “legal definition of science.” It depends too much on context. If we’re talking about qualification of experts, it’s one thing; if we’re talking about science/religion demarcation for purposes of creationism lawsuits, et cetera, it’s another; and I’m sure there are other contexts and other definitions I’m not thinking of. Bear in mind that the common law is the source of most of this stuff, and that means that the law is embodied in a large and often confusing body of decisions in individual cases, which may conflict with one another, especially from jurisdiction to jurisdiction.

At its best, the law tries to be informed BY definitions of such things as “science” or “religion” as they are employed by people outside; in practice it may sometimes impose its own peculiar constructs on things, but at its best it listens more than it speaks.


I think the linked blog post is right in that scientists, even though they might mention Popper a lot, don’t really believe that a theory is never verified, only falsified (strict Popperianism). First, an important element which is missing in that naive picture is the role of cumulative evidence: people are loath to give up a theory that can already explain a lot of things, and especially in physics, they would try to restrict the theory’s domain to what has been verified instead of throwing it out completely.

Second, the Duhem-Quine thesis (i.e. you don’t know if it’s the theory that is falsified, or one of your measuring tools) is a potential problem, but physicists have had ways to deal with it, again by invoking the notion of cumulative evidence. My field of research (precision measurement) is focused on testing the most fundamental and well-accepted theories, such as whether gravity really scales as r^{-2}, whether certain new hypothetical particles exist and whether the electron’s charge is really symmetric or not. In this field, we don’t just decide arbitrarily whether an established theory is really wrong or whether one of the measuring tools is defective. Instead, we try as much as we can to calculate and independently measure the precision and accuracy of our measuring tools and factor them into a systematic error “budget”. This calibration of our tools is done with techniques which are relatively simple, well-established and widely accepted. If this calibration cannot be done properly, or if it turns out that our tools are very imprecise, then sometimes we hold off judgment on the significance of our measurement until we can design a better experiment that can verify it more precisely.
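To make the idea of an error “budget” concrete, here is a minimal sketch. It assumes the individual uncertainties are independent and combines them in quadrature, which is a common convention but not necessarily the procedure used in any particular experiment. The budget entries, the numbers, and the function names are all hypothetical.

```python
import math

def total_uncertainty(budget):
    """Combine independent 1-sigma uncertainties in quadrature."""
    return math.sqrt(sum(sigma ** 2 for sigma in budget.values()))

def deviation_in_sigma(measured, predicted, budget):
    """Number of combined standard deviations between measurement and prediction."""
    return abs(measured - predicted) / total_uncertainty(budget)

# Hypothetical error budget, in the same arbitrary units as the measurement.
budget = {
    "statistical": 3.0,   # scatter of repeated readings (assumed value)
    "calibration": 4.0,   # uncertainty of the instrument calibration (assumed value)
}

sigma_total = total_uncertainty(budget)
pull = deviation_in_sigma(measured=10.0, predicted=0.0, budget=budget)

print(f"total uncertainty: {sigma_total:.1f}")
print(f"deviation from prediction: {pull:.1f} sigma")
```

With these made-up numbers the measurement sits two combined standard deviations from the prediction. Whether that counts as tension with the theory depends on the threshold a field has agreed on, which is a convention the sketch does not settle.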

The rigor of the above process is such that many experiments (including mine) in fact blind our final measured value during the entire process of running the experiment and doing the data analysis, so as to keep human bias from interfering in the case that an established theory is really false, for example. That’s falsification in action, I would say.

An example of this happening in action is the proton radius measurement (which I’m surprised hasn’t been analyzed from a philosophy or sociology of science perspective yet). There was a new, much more precise measurement that conflicted with previous results. Instead of just rejecting the newer result, what people did is construct a number of newer, more precise experiments (utilizing very different methods of measuring the same quantity) in order to put the extraordinary result to the test.

Third is the issue of ad hoc hypotheses. The author argues that falsification is a meaningless criterion to demarcate science from non-science, as both mainstream science and pseudoscience are equally guilty of being willing to conjure up ad hoc hypotheses to rescue their theory. But this is an overly simplified picture that doesn’t take into account different types of ad hoc hypotheses, and the testability of the ad hoc hypotheses themselves. It’s common in theoretical physics to revise one’s prediction of the mass of a new, undiscovered particle in response to the latest experimental results, but people don’t just accept that blindly. They evaluate each revision to see how it fits with other parts of physics that are already well-established.

In conclusion, I would say that most scientists today (at least in most of physics) still operate on the assumption that some type of falsification is a necessary but insufficient criterion for something to constitute a legitimate scientific hypothesis. Almost every time one proposes a new, speculative hypothesis, the next thing that is asked is, “How can we test this? What experimental evidence would this theory predict?”

Now, this may not apply in the same way to some really speculative areas (such as multiverse hypotheses or string theory), but I don’t think most people regard these theories as “established” in the same way that the Standard Model of particle physics or Big Bang cosmology is established, for example. You are much freer to claim that you don’t believe in a multiverse compared to claiming that you don’t believe in the robustness of the Standard Model. (Of course everyone also agrees that the SM has to fail at some point, but that’s a different matter…)


(Extended applause). In addition, lots of more recent philosophers of science have agreed with this. It is biologists who think that Karl Popper is the last word on how you do science. Philosophers of science know better.

In a similar situation, biologists and many other scientists think that Thomas Kuhn’s argument about “paradigms” is the last word on history and sociology of science. Historians and sociologists of science aren’t so sure.


To be more precise, it’s members of the Hennig Society.

They do, but there are many other biologists who do too. What is odd is that those other biologists are often happy to do statistical tests on their observations. If Popper were correct, they wouldn’t need to.

Even the members of the Hennig Society, at the peak of their fervor for hypothetico-deductive reasoning, didn’t take a single refutation as decisive – they counted refutations of different possible cladograms and chose the “least-refuted” one. Which isn’t Popper’s approach to refutation.

Thinking about this further, I realized that there is a pretty good example of how the law deals with this, which is sort of front-and-center in the ID Creationism conflict. When the Dover Area School District tried to get ID Creationism into the classroom, and the Kitzmiller lawsuit followed, the court did not come to the dispute with some pre-existing “legal definition” of “science” and scrutinize the evidence to see whether the criteria of that pre-existing legal definition were met. Rather, the court heard the evidence offered by the parties, which included expert testimony going, among other things, to the definition of “science.”

The creationist school board brought in Michael Behe as an expert on this point, and he proposed that the definitions of “science” and of “scientific theory” being applied by the plaintiffs’ experts were overly narrow. Plainly this tack was taken because by any conventional definition of a “scientific theory,” ID Creationism fails. But if you’re going to play this sort of shell game and insist that everybody else’s definition of a “scientific theory” is wrong, you’ve got to have your own definition, and that’s where the wheels came off. It was, to me, the “spit-take heard 'round the world,” when news stories of the trial reported this exchange:

Q. And using your definition, intelligent design is a scientific theory, correct?

A. Yes.

Q. Under that same definition astrology is a scientific theory under your definition, correct?

A. Under my definition, a scientific theory is a proposed explanation which focuses or points to physical, observable data and logical inferences. There are many things throughout the history of science which we now think to be incorrect which nonetheless would fit that – which would fit that definition. Yes, astrology is in fact one, and so is the ether theory of the propagation of light, and many other – many other theories as well.

Q. The ether theory of light has been discarded, correct?

A. That is correct.

Q. But you are clear, under your definition, the definition that sweeps in intelligent design, astrology is also a scientific theory, correct?

A. Yes, that’s correct. And let me explain under my definition of the word “theory,” it is – a sense of the word “theory” does not include the theory being true, it means a proposition based on physical evidence to explain some facts by logical inferences. There have been many theories throughout the history of science which looked good at the time which further progress has shown to be incorrect. Nonetheless, we can’t go back and say that because they were incorrect they were not theories. So many many things that we now realized to be incorrect, incorrect theories, are nonetheless theories.

As a practitioner who has had the sad experience of having an expert witness implode on the stand, I cannot read this strange exchange without having some sense of the horrid sinking feeling it must have involved, and some sympathy for the poor fundamentalist lawyer who represented the creationists in this thing and who had to know that his case was now fully in the fire. It was at this moment that it became clear that, even if the worst reports of Judge Jones’ character (which happily turned out to be untrue) were true, it would take a mighty strain and the will to ignore the clear weight of the evidence to figure out a way to rule in favor of the creationists on the board.

At any rate, my original point is illustrated here. Rather than applying some “legal definition” of science, or of scientific theory, the court looked to those who use these terms in their work for the relevant concepts, and the court got it right; though, after Behe’s testimony, it is hard to imagine any judge with integrity somewhere north of Roy Moore who could have ruled otherwise.


This question is somewhat inadequate because it uses the word “Science” in an unqualified way. This word means vastly different things depending on whether we are talking about repeatable experiments on the present workings of nature, or about claims concerning the past, which are not repeatable (e.g. forensics).

I wrote an essay on this for the Journal of Creation, but it’s not yet available for free online and requires a subscription.

Price, P., Examining the usage and scope of historical science—a response to Dr Carol Cleland and a defence of terminology, Journal of Creation 33 (2):121–127, 2019.

Cleland jumps on the bandwagon of attacking the criterion of falsifiability, but she does it based upon a mischaracterization of Popper’s position.

In its most basic formulation, falsification is nothing more than pure logic. If you state “Only A”, and then I present “Not A”, then I have proven you wrong. Without this basic understanding of logic, there is no way to do any science at all.

Historical science by nature is not subject to falsification (something with which Dr Cleland would agree), and that makes it fundamentally less trustworthy than operational science because of the “asymmetry of overdetermination” as Cleland puts it. With any given set of present-day clues, any number of possible sets of past circumstances could suffice to explain it.

You make a model that predicts measurements (rocks deposited here should contain X amount of Y because of hypothetical process Z). Go do the measurements, compare prediction to observation.

How is this different from comparing predictions to measurements done on some process that is still active? Whether the things you measure are happening now or are the result of something that happened in the past, you’re doing the same thing: comparing outputs of models to measurements.

That’s true even for things you supposedly “directly” observe, like a temperature, or weight measurement. You stick your thermometer into your pot and note down a value. Any number of possible factors can influence your “observational” measurement and explain why you obtain the value that you do.

I weigh droplets I deposit from a pipette because I want to estimate its level of precision (reported to be within some narrow range). On 19 of 20 measurements I get values within that narrow range, consistent with the reported precision of the instrument. Then on the 20th measurement I weigh a droplet that appears to be the exact same size, under (as best I can tell) the exact same conditions as the previous 19. But it suddenly weighs 20% more (according to the scale). Have I “falsified” the reported precision of the pipette? Entertain me.

A reasonable statement so far as it goes, but science is not math; the challenge is that empirical results in science, even the repeatable results of experimentation, do not always map cleanly to a logical syllogism. That is, even as scientific theories must always be held tentative, so must scientific data (facts) be held tentative, and most especially scientific data which is held to overturn otherwise widely supported theories must be held tentative.

Case in point: creationists made a pretty big deal over “missing solar neutrinos” a few years back, as falsifying the mainstream physics of solar fusion (thus implying that the sun could not have shone for billions of years). Physicists never had any real doubt that the fusion physics was correct and the neutrino data was in error, and that was not because of any closed-mindedness but rather because fusion is very well understood. Inevitably, the missing neutrinos were found with better detectors, and not only was the fusion problem resolved but physical theory advanced by the effort. None of this was any thanks to creationists, who were rather happy to have the doubt over fusion persist. When the missing neutrinos were found, YEC just pointed out that this does not prove the sun is old, and then went quiet. There was no questioning of their overall approach to testing in science.

Falsification is only as good as the empirical evidence is solid, and it can be a matter of some time and effort before resolution. YEC has a history of being premature, overstating and overly eager in jumping on anomalies as they appear.

Repeatability and observability. Yes, you can come up with a theory and go check and see if what you predicted is there. But even if it is there, that doesn’t prove your theory. It could simply be a coincidence, caused by other factors you didn’t consider. The opposite is also true. Your theory could be correct, but outside factors you didn’t consider could cause you to fail to find the evidence you expected. The past is not repeatable, and that is the fundamental distinction between operational and historical science.

Yes, unknown mitigating factors can influence your measurements. That is exactly why repeatability is so important. By having others attempt to get your same results, you lower the chances that your results were a fluke.

Popper’s idea of falsification was nuanced. He understood that flukes and errors can happen. This is again why repeatability matters. Keep doing it. Have others do it. If that anomaly you reported keeps happening, then yeah, I would consider the possibility it has been falsified, unless there is some factor you can isolate that was causing the anomaly.
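The pipette question and the “keep doing it” answer can be put into rough numbers. This is only a sketch under a stated assumption: that the reported precision means each reading independently falls outside the stated range with probability 0.05. Nobody in the thread specified that figure, and the function name is made up for illustration. The sketch also counts only whether a reading is out of range and ignores how far out it is.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more out-of-range readings in n independent trials,
    when each reading is out of range with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

P_OUT = 0.05   # assumed per-reading chance of falling outside the stated range
N = 20         # number of droplets weighed, as in the pipette example

p_one = prob_at_least(1, N, P_OUT)    # a single stray reading
p_five = prob_at_least(5, N, P_OUT)   # the anomaly keeps recurring

print(f"P(at least 1 of {N} out of range) = {p_one:.3f}")
print(f"P(at least 5 of {N} out of range) = {p_five:.4f}")
```

Under that assumption a single stray reading in 20 is more likely than not (about 0.64), so by itself it refutes nothing. Five or more strays in 20 would happen well under 1% of the time, which is the point at which the reported precision starts to look wrong.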

Then why call it “science”? If it is entirely scriptural there is no need to argue science at all.


You must have. Can you link me to what you found? I was unaware it was even available at this point.

I’ve got a meeting …

Please try to keep your comments and replies on this thread focused on the issue of falsification, and refrain from extended debates on the distinction between operational vs. historical science. Comments solely on the latter should be directed towards this thread: The "historical vs. operational science" distinction (@moderators take note).

You cannot use science to prove or disprove anything, period, even, say, relativity.

2 posts were merged into an existing topic: The “historical vs. operational science” distinction
