Running A Negative Control on ID Math

Read on …

Everybody is wrong, including @T_aquaticus :grinning:

The root of this problem is that the very premise of inferring anything with this sort of probability calculation is flawed. Counter-examples can demonstrate that there is a flaw, but not why all such arguments are flawed.

Bear with me for a bit of simplified Bayesian statistics, and I’ll do this as non-mathematically as I can:

Suppose there are two possible explanations, A and B, for an event E. Given some data for E we might be able to evaluate the probability of E for each explanation A and B. Bayesian statistics allow us to make some assumptions about the probability before we calculate it (a prior assumption, or just “prior”). Usually this prior assumption is “weak”, meaning it doesn’t greatly influence the final probability calculation (posterior probability), BUT nothing prevents the use of a strong prior that completely determines the resulting posterior probability.

We would like to calculate the probability of E under A and under B for the same data and compare them to see which posterior probability is larger. In Bayesian statistics the ratio of these probabilities is called a Bayes Factor, and it is related to the Likelihood Ratio from the usual “Frequentist” statistical theory. If the Bayesian prior assumption is very weak (non-informative) then the Likelihood Ratio and Bayes Factor are the same thing. With the preliminaries out of the way we can get on with the statistics.
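To make the arithmetic concrete, here is a minimal sketch. All the numbers are hypothetical, chosen only to illustrate the relationship; the point is that with a non-informative prior the posterior odds reduce to the Likelihood Ratio.

```python
# Hedged sketch with hypothetical numbers, just to show the arithmetic.
p_E_given_A = 0.02    # probability of the data under explanation A
p_E_given_B = 0.0005  # probability of the data under explanation B

# The Bayes Factor is the ratio of the two likelihoods:
bayes_factor = p_E_given_A / p_E_given_B
print(bayes_factor)  # 40.0

# Posterior odds = Bayes Factor * prior odds. With a non-informative
# prior the prior odds are 1, so the posterior odds equal the
# Likelihood Ratio from Frequentist theory.
prior_odds = 1.0
print(bayes_factor * prior_odds)  # 40.0
```

Note that the calculation requires a likelihood under *both* hypotheses; remove either one and there is no ratio to interpret.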

Given a sequence of data, the ID argument puts a probability on the event E based on an explanation from evolution (A). This is usually a flawed explanation of evolution, but that doesn’t matter because it’s the wrong flaw anyway! Next we do the same calculation of probability for E based on an explanation from ID (B) and compare them.

Except there is no calculation of probability for E based on an explanation from ID (B). ← This is where the ID argument goes wrong.
Sure the probability based on A might be incredibly small, but the probability based on B could be even smaller. We don’t know because it can’t be calculated.

That’s not where the ID argument stops though - the argument goes on by claiming the probability of A is so small that B must be the answer. AND this is where the prior assumption comes in. The ID argument is making a claim - an inference for Design - that has the same form and interpretation as the Bayes Factor does to a statistician.

Now for my own inference. The ID argument, based on the form of a Bayes Factor, concludes ID is more likely than evolution without any probability of ID. Therefore, I conclude the ID argument is making an unstated prior assumption that the probability of ID must be greater than that of evolution. This must be a very strong prior, because the data doesn’t influence the posterior probability at all. This unstated prior assumption forces the conclusion favoring ID, always. It’s not really a method or argument at all, merely a sneaky way of restating the hidden assumption.
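To see how a strong enough prior completely determines the posterior, here is a hedged sketch with made-up numbers (the hypothesis labels and probabilities are purely illustrative):

```python
# Hedged sketch (all numbers hypothetical): Bayes' theorem with a
# prior that already assigns probability 1 to hypothesis B ("design").
def posterior_prob_B(prior_B, p_E_given_A, p_E_given_B):
    """Posterior probability of B given the data E."""
    prior_A = 1.0 - prior_B
    evidence = prior_A * p_E_given_A + prior_B * p_E_given_B
    return prior_B * p_E_given_B / evidence

# With prior_B = 1.0 the posterior is 1.0 no matter what the data say:
for p_E_given_A in (0.9, 0.5, 1e-100):
    print(posterior_prob_B(1.0, p_E_given_A, 1e-300))  # prints 1.0 each time

# Contrast with a weak 50/50 prior, where the data actually matter:
print(posterior_prob_B(0.5, 0.9, 0.1))  # 0.1
```

With the dogmatic prior, even astronomically unfavorable data (here 1e-300) leave the conclusion untouched, which is exactly the hidden-assumption problem described above.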

TL;DR: This ID argument tacitly assumes that ID is the only possible conclusion.

6 Likes

This is not right. ID claims that observed proteins have limited substitutability in their observed state. This is determined by limits on their ability to mutate over long periods of time while retaining their current function, and is measured by comparing proteins separated by millions of years of evolution. Many observed proteins fall into this category.
Proteins are often interdependent, as Salvador pointed out. Once a protein is built, the sequence of any protein that can bind to it is constrained.

Fantastic explanation Dan!! Thanks very much!

2 Likes

Exactly. Can we enshrine this somewhere for future reference and linking?

4 Likes

Then state the ID hypothesis and cite the data that came from testing it.

Bill, you might have the core of a testable hypothesis here, but it lacks specificity and careful definition of terms. I’m not suggesting you correct this yourself, but rather the whole of ID has avoided testing hypotheses that might support ID by any conventional means. I outlined the mathematical basis for testing hypotheses in my comment above.

2 Likes

At least you didn’t go full Pauli and say that I wasn’t even wrong. :wink:

The one flaw I see in this argument is that both A and B could be wrong. For example, 200 years ago we might be weighing phlogiston and magic fire faeries, but both would be wrong. However, Bayesian statistics are useful and are cogent to this discussion.

Good point. It becomes a Designer of the Gaps argument.

Well, for any non-trivial system we know A and B are actually both wrong in practice. The question is which one is least wrong?

1 Like

You aren’t WRONG wrong, just missing an underlying assumption. Lots of people miss that underlying assumption!

Yes. There’s a quote to that effect:

What the use of P [the significance level] implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred.
– Sir Harold Jeffreys

Once your sequence has been produced, whatever it is and no matter how it was produced, it becomes a specification. So if I play your game after you first get your sequence and come across the same sequence, then I can safely conclude that your random sequence generator is biased toward producing your sequence. IOW, I can infer design.
You must understand that before you came across your sequence, said sequence had never existed anywhere at anytime in the universe. It was only a potentiality. But by playing your game, you gave birth to that potentiality, you gave birth to a very singular event that now constitutes a specification.

There is bias in naturally occurring systems, so I’m not sure what you are getting at here. If we saw that nearly all water molecules move downhill instead of an even split between uphill and downhill would we conclude that a designer is actively moving water in rivers?

You are also missing the point of my post. Even in cases where we know the results come from a random process your argument would state that it is designed. Your argument produces false positives.
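The false-positive problem can be sketched in a few lines. The cutoff value below is hypothetical, but any cutoff has the same problem: every particular sequence from a fair coin is equally, astronomically improbable.

```python
import random

# Hedged sketch of the false-positive problem. Any particular n-flip
# sequence from a fair coin has probability 2**-n, so treating the
# observed sequence as a "specification" after the fact flags every
# outcome as designed, even when we know the process was random.
random.seed(0)  # a process we KNOW is random
n = 100
sequence = [random.randint(0, 1) for _ in range(n)]

p_this_sequence = 2.0 ** -n   # about 7.9e-31 for n = 100
design_cutoff = 1e-20         # hypothetical "too improbable" threshold

# The inference fires on a sequence we just generated randomly:
print(p_this_sequence < design_cutoff)  # True - a false positive
```

Since the test returns True for literally every 100-flip outcome, it cannot distinguish design from chance at all.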

AND

You guys are correct, but my comment was already too long. I focused on the inferential gorilla in the room that usually gets ignored.

1 Like

You are trying to disqualify the design inference by appealing to a Bayesian approach. But design theorists such as Dembski infer design through a Fisherian, not a Bayesian, approach, for they see the latter as inadequate for this job. For more details, see the addendum on p. 35 of the paper below.

If they say this explicitly, that would be really interesting. Can anyone produce a quote?

As I said, you can look at the addendum on p. 35 of the linked paper above.

The text is too long to paste here, but here’s a web link to an abstract (well, the Abstract, actually, in this case) and a link to the document to which Gilbert refers.

https://fdocc.blogspot.com/2005/10/specification-pattern-that-signifies.html

So if I mutate a critical residue, say one that contacts that enzyme’s substrate, making the most radical size change possible, the enzyme should no longer perform the “specified” function, correct?

My comment above simply puts a common argument from ID in the context of a Bayes Factor. Whatever we call it, the “coins” argument, or fantastic improbability argument, the fact is this probability is interpreted as the odds against evolution, just as a Bayes Factor is interpreted as the relative odds between two hypotheses. All I’m doing is using the given probability and conclusion to fill in the missing piece, revealing the hidden assumption.

I am very familiar with Dembski’s 2005 paper. I studied it carefully (a year, off and on) before writing a blog post about it. Link below. That appendix is a long diatribe against Bayesian statistics, but Dembski still makes the same implicit Bayesian assumption in his interpretation. In the Fisherian (Frequentist) context Dembski is calculating a single Likelihood, which doesn’t mean anything by itself.

Actually, it’s not even a likelihood until you correct Dembski’s error. You can read all about it. :slight_smile:

2 Likes

I just read through that appendix again and it still makes no sense. Maybe I can find some good ones …

[…] the Bayesian approach, which is essentially comparative rather than
eliminative, comparing the probability of an event conditional on a chance hypothesis to its probability conditional on a design hypothesis, and preferring the hypothesis that confers the greater probability. I’ve argued at length elsewhere that Bayesian methods are inadequate for drawing design inferences. Among the reasons I’ve given is the need to assess prior probabilities in employing these methods, the concomitant problem of rationally grounding these priors, and the lack of empirical grounding in estimating probabilities conditional on design hypotheses.

Dembski’s primary objection seems to be that it would require a design hypothesis, and he knows he can’t provide one. Following this, Dembski goes on (and on and on) about the Caputo ballot-rigging example as justification for rejecting Bayesian methods. BUT a Bayesian hypothesis with a non-informative prior gives the Frequentist Likelihood Ratio, and that is NOT what Dembski is doing.

Among the oddities of this paper:

  1. A basic probability calculation that he gets wrong, leading to …
  2. Probabilities greater than one. And …
  3. A reversal of the Shannon and Kolmogorov definitions of information, so that a sequence of 1000 zeros would maximize CSI while a random sequence of zeros and ones minimizes it.

That last one I had as a note in the comments, but it seems that Blogger lost the comments when G+ went away (dammit Google!).
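For anyone who wants to see the third point concretely, here is a rough sketch using zlib compression as a crude stand-in for Kolmogorov complexity. Real Kolmogorov complexity is uncomputable, and the exact byte counts depend on the compressor, so treat this as illustrative only:

```python
import random
import zlib

# Rough sketch: a run of 1000 zeros is highly compressible (low
# algorithmic information), while a random 0/1 sequence is not.
# zlib is only a crude proxy for Kolmogorov complexity.
random.seed(42)
uniform = b"0" * 1000                                   # 1000 zeros
rand_bits = bytes(random.choice(b"01") for _ in range(1000))

# The run of zeros compresses to almost nothing; the random sequence
# stays comparatively large, i.e. it carries more algorithmic information.
print(len(zlib.compress(uniform)) < len(zlib.compress(rand_bits)))  # True
```

So by the standard definitions, the random sequence is the information-rich one, which is exactly the opposite of what the paper’s CSI measure implies.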

I think that not even Dembski knew what Dembski was doing. My educated guess is that Robert Marks quietly set Dembski straight on the theory, because this method was quietly abandoned after those two started collaborating.

Reference

Dembski, W. A. (2005). Specification: The pattern that signifies intelligence. Philosophia Christi, 7(2), 299–343.

3 Likes

The language here is fairly damaging. Bayesian methods, he says, are inadequate because the design hypothesis has no empirical grounding! This is not a problem with Bayesian methods but their strength, in that they ask us to make explicit our models and priors.

6 Likes