Dembski: Building a Better Definition of Intelligent Design

I’m on board with independent events from the same distribution. The trouble is Dembski only has a single event, and he would have us accept that 7 x 1/6 is a probability.
This was the case in his 2005 paper.

By "mutually exclusive do you mean independent, or a 1-to-1 mapping between Events and Descriptions?
Maybe it doesn’t matter? :wink:

IIRC, Elsberry and Shallit (2011) note that Dembski uses equiprobability in his examples showing CSI, and other probability distributions in the examples where it’s not CSI.

As I noted in my reply to Gil (just above), the assumption of equiprobability is dubious the moment we suspect an event is non-random. It would be far better to draw more samples and conduct a test to see if they deviate from the expected randomness.

I maintain that sort of multiplication is nuts, AND the factors chosen are arbitrary. :slight_smile:

In the end it seems to me that the whole thing is just an attempt to justify Dembski’s use of the length of the description in words - and I think it fails there.

I agree. It also misses any complex events that do not have short names.

I don’t accept that the particular multipliers chosen by Dembski are correct. All I am saying is that the probability of getting one of a number of mutually exclusive equiprobable results - on a single trial - can be got by multiplying the probability of a single result by the number of chosen results.
As in my example.
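To make that concrete, here is a toy check (my own numbers, not Dembski’s):

```python
from fractions import Fraction

# One roll of a fair die: six mutually exclusive, equiprobable outcomes.
p_single = Fraction(1, 6)

# P(the roll lands in a chosen set of k faces) = k * p_single.
# This stays a probability only while k <= 6; k = 7 gives 7/6 > 1.
for k in range(1, 8):
    p = k * p_single
    print(k, p, "ok" if p <= 1 else "NOT a probability")
```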

I don’t see the equiprobability assumption as particularly troublesome - if we can accept ASC as a rough estimate it isn’t too bad. The choice of multipliers is a bigger problem.

Dembski has never been worried about false negatives in his Design Inferences - and I’m more worried about false positives from those events that DO have short names (counted in words, for some reason). The sheer arbitrariness of it - plus the fact that short descriptions likely violate Dembski’s assumed inequality - seems a moderately significant problem.

But really this is just a tweak to the original CSI that attempts to address a genuine weakness - but leaves the major problem of impracticality untouched (possibly exacerbated by the push to get short descriptions).

1 Like

I’m pretty sure that a single trial can’t have 7 possible outcomes each with probability 1/6 - those probabilities would have to sum to 7/6.

If we had a reasonable application for equiprobability, then maybe, but Dembski abuses this by using arbitrarily long sequences, as I showed at comment #108.

If we can rule out intentional deceptiveness, then I agree the choice of multipliers is a bigger problem.

1 Like

You are absolutely correct, but this is what Dembski does in that older paper. He gives an example showing negative CSI, and if you work backward he started with a probability greater than one.
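Spelling out the arithmetic (reading CSI here as -log2(P), which I take to be the intent): a negative CSI forces P = 2^(-CSI) > 1.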

I have other complaints about that 2005 paper, but it’s all old news.

2 Likes

It seems to me that the problem there is not equiprobability but simply assuming the wrong chance hypothesis. The sequence was not generated by randomly selecting the 150 digits individually, with all digits equiprobable. (That’s not my only disagreement with that example, but it’s the main point here.)

Or to repeat my point: ASC is only defined for a particular chance hypothesis and can only be used to eliminate that hypothesis. If Dembski is presenting it as anything more, then that’s a major error.

The issue of identifying and eliminating all possible chance explanations is another of the serious flaws in CSI that ASC doesn’t address.

2 Likes

But the goal here is not to determine the SC of a specific molecule, but the SC of an event, the event here being that a molecule within a reference class of possibilities is able to bind ATP.

Of course there is no “SC of an event”; the SC is always relative to a chance hypothesis. What hypothesis would you use, and how would you calculate the SC for that hypothesis? And could it practically be done for all relevant chance hypotheses, as Dembski’s method requires?

1 Like

That’s not an event. That’s (at best) a potential event.

It bears no resemblance at all to your example.

It isn’t calculable.

You have not addressed the problem of your misuse of the SC equation.

You’re simply reinforcing the uselessness of your approach.

Yes, the probability of an event, and so its SC, is always relative to a chance hypothesis. Dembski is perfectly aware of this. Here is what he wrote in the piece I link below:
With Ewert’s lead, specified complexity, as an information measure, became the difference between Shannon information and Kolmogorov information. In symbols, the specified complexity SC for an event E was thus defined as SC(E) = I(E) - K(E). The term I(E) in this equation is just, as we saw in my last article, Shannon information, namely, I(E) = -log(P(E)), where P(E) is the probability of E with respect to some underlying relevant chance hypothesis.

And here is another passage from the second edition of the Design Inference (emphasis mine):
Note that it does not, strictly speaking, make sense to talk about the probability of E as such. Rather, we must always speak of the probability under some assumed hypothesis that might account for the event. Thus, when we see a probability like P(E), an underlying chance hypothesis has simply been left unexpressed.
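To pin down the arithmetic of that definition, here is a minimal sketch in code (both inputs - the event probability and the description-length estimate - have to come from somewhere else):

```python
import math

def specified_complexity(p_event: float, description_bits: float) -> float:
    """Sketch of SC(E) = I(E) - K(E): Shannon information under an
    assumed chance hypothesis, minus an estimated description length."""
    return -math.log2(p_event) - description_bits
```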

Well, you could proceed like this:
Create a library of random proteins and find out what proportion bind to ATP.
It happens that Keefe and Szostak did just that. They created a library of 6 x 10^12 proteins, each containing 80 contiguous random amino acids, and then estimated that approximately 1 in 10^11 random proteins bound to ATP.
From these numbers, Dembski, in the second edition of the Design Inference, calculated the SC of a protein (within the class of 80 aa long proteins) that binds to ATP. Here is the passage:
For the description that specifies this function, we will take the phrase “binds ATP”, which is two words long and thus we estimate would take around forty bits to describe (if we are generous in assigning 20 bits per word). Using the formula for specified complexity (see Section 6.4), we then calculate:
SC(X|H) = I(X|H) - D(X)
~ -log2(1/10^11) - 40
~ 36 - 40
~ -4 bits

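For what it’s worth, a quick numeric check of that arithmetic (taking the 1-in-10^11 estimate and the 40-bit description at face value):

```python
import math

p_binds_atp = 1e-11     # Keefe & Szostak's estimate for random 80-mers
description_bits = 40   # Dembski's estimate for the phrase "binds ATP"

shannon_info = -math.log2(p_binds_atp)   # ~36.5 bits
sc = shannon_info - description_bits     # ~ -3.5, i.e. roughly -4 bits
print(f"I(X|H) ~ {shannon_info:.1f} bits, SC ~ {sc:.1f} bits")
```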
So the chance hypothesis you are dealing with is some form of random assembly - and limited to quite short proteins. I hope you realise that that isn’t really adequate to conclude design.

And my understanding is that some degree of binding may well be more common than that.

3 Likes

Yes, I think that is key.

If I were trying to come up with a similar measure, I would start with the probability of an event in its current environment. A watch found in the middle of a pasture is a low probability event because that isn’t where we typically find timepieces. A flagellum on a bacterium is a high probability event because that is the natural place to find it. This is then a chance hypothesis conditioned on the environment. The calculation may still be problematic, but it should be possible to estimate it from data.
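As a toy sketch of the kind of estimate I have in mind (all counts invented for illustration):

```python
from collections import Counter

# Invented sighting counts for objects found in a pasture.
sightings = Counter({("pasture", "grass"): 99_990,
                     ("pasture", "cow"): 9,
                     ("pasture", "watch"): 1})

total = sum(n for (env, _), n in sightings.items() if env == "pasture")
p_watch = sightings[("pasture", "watch")] / total
print(f"P(watch | pasture) ~ {p_watch:.0e}")  # rare, so a surprising find
```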

I’ve been reading a recent paper from Lee Cronin’s group explaining the Assembly Index, and there is some overlap with the discussion here. I mean to get to that later.

Or to repeat my point: ASC is only defined for a particular chance hypothesis and can only be used to eliminate that hypothesis. If Dembski is presenting it as anything more, then that’s a major error.

An early version of CSI used a “rejection region” like a standard statistical test. I’ve never found a citation for it, and I think that method was junked before 2005.

The issue of identifying and eliminating all possible chance explanations is another of the serious flaws in CSI that ASC doesn’t address.

Yes, and in the unlikely event Dembski is correct, the statistical properties of CSI can only be terrible. (I keep writing this out in hopes the point will stick with those who need it).


I agree it does not describe a specific molecule.

But Dembski always calculates the probability of a single event, not any class of events. He uses the length of the sequence, which is easily cherry-picked to be long enough to make his equation work (arbitrary precision).

This ought to be a Big Red Flag, because hypotheses need to be carefully defined to be of any value. This has already been discussed above in relation to false positives and lack of power to (correctly) reject a hypothesis.

Here we might have a class of events, but all living cells have proteins that bind ATP. I think this is the wrong random hypothesis.

This goes to what I just mentioned to Paul about Cronin’s Assembly Index, which hypothesizes that Selection is responsible for the observation of large numbers of “unusual” molecules. Cronin’s purpose is different (detecting life), but there is some overlap in methods. I’ll have to follow up on this at a later time. The point for now is that Dembski’s random hypothesis will also reject if the event is a product of selection.

I think CSI cannot deal with any hypothesis other than random assembly, and this criticism goes back to Elsberry and Shallit (2011) at least (probably to about 5 minutes after someone with a math degree read about it in The Design Inference). The chance hypothesis should fail the moment we realize that cells are not assembling proteins by chance.

The expected objection here is, “but how does it evolve in the first place?” Dembski’s random hypothesis cannot test this; it can only test (ignoring other flaws) if the observed sequence can be assembled by chance. It’s testing the wrong random hypothesis.

Except that no evolutionary biologist hypothesises that evolution occurs through random assemblage of proteins. This is therefore not the “relevant chance hypothesis”, and invoking it is a strawman fallacy.

So Dembski tests a hypothesis nobody holds, and comes up with something that has nothing to do with evolution.

What Dembski is doing appears to be an excellent example of the Streetlight Effect:

A policeman sees a drunk man searching for something under a streetlight and asks what the drunk has lost. He says he lost his keys and they both look under the streetlight together. After a few minutes the policeman asks if he is sure he lost them here, and the drunk replies, no, and that he lost them in the park. The policeman asks why he is searching here, and the drunk replies, “this is where the light is”.

He is calculating the probability of random assemblage, not because that probability has any evolutionary meaning, but because that is the only probability he is able to calculate.

Dembski’s calculations are nothing but “humbug”.

1 Like

I have to rate The Design Inference as a massive failure. It did not provide a practical method for detecting design in non-trivial cases. It didn’t account for the issues introduced by using hindsight to identify patterns. It didn’t do a good job of explaining the flawed and impractical method it did provide. It did not even manage to elucidate how we detect design - an objective stated up front.

In fact we usually detect design by treating it as a positive hypothesis - and Dembski’s failure to acknowledge and use that is the source of many of his problems. Perhaps his desire to claim a “mathematical proof” of design also exacerbated the problems, although it was rather clear that he didn’t have one (how can you mathematically prove that all relevant “chance hypotheses” have been considered?).

3 Likes

It is the relevant chance hypothesis for computing the SC of a protein that binds ATP in the context of the Keefe and Szostak article. Dembski offers this example as a way to illustrate how SC can be computed, that’s all. He knows perfectly well that no evolutionary biologist hypothesises that evolution occurs through random assemblage of proteins (in chapter 6 of the second edition of the Design Inference, called “Biological Evolution”, he devotes many pages to this very point). I am wondering whether you are not the one who’s strawmanning here?

I’m sorry, but nothing in your post indicated that this was just another made-up “example” – with no more real-world relevance than your “Pi and Peter”.

I had briefly hoped that we were finally getting onto something that Dembski had done that actually had relevance to evolution. A futile hope, it would seem.

Does he, in this chapter, actually get around to discussing how he’d go about estimating probability distributions of real-world evolutionary hypotheses (and how this would differ from merely assuming de novo random assemblage)?

@Giltil (and anyone else interested),
I think I figured out how to properly calculate the Kolmogorov information for pi (\pi) and similar constants.

If the value of pi is background knowledge, then the message “pi to 53 decimal places” is 23 characters (and we could translate that to bits), and the message “pi to 99 decimal places” is the same length.
Note that if you only need ten digits of pi it is shorter to simply send “3.1415926535”.

If the value of pi is NOT background knowledge, then the algorithm [generate pi] must be included in the message in addition to the instruction “to N decimal places”.
Once again it may be shorter to send the actual digits than to send the algorithm and instructions.

The upshot is that KI will grow with the length of the message up to the point where compression becomes efficient, but will grow very slowly after that.
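A little sketch of that growth pattern (assuming, purely for illustration, 8 bits per character and a receiver that already knows how to compute pi):

```python
# Bits needed to send pi to n decimal places, two ways, at 8 bits/character.

def literal_bits(n: int) -> int:
    # Send the digits themselves: "3." plus n decimal digits.
    return 8 * (2 + n)

def reference_bits(n: int) -> int:
    # Send the instruction; only the length of n grows, and slowly.
    return 8 * len(f"pi to {n} decimal places")

for n in (5, 10, 53, 99, 1000):
    print(n, literal_bits(n), reference_bits(n))
```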

You write that as if it means something. I’ve looked at SC in every reasonable way - and several unreasonable ways - and it never makes any sense.

In one of those unreasonable ways, the one where we simply assume Dembski is correct, the statistical properties of SC can be little better than random guessing (by coin flips, dice, etc.). SC will generate more false results than correct results, and probably a LOT more. It can never be useful for its intended purpose.

That’s a gross misrepresentation of both what they found and the context in which they used that number.

Let’s start with the empirical fact that any protein binding to anything is not a binary property. Keefe and Szostak explicitly qualified this, but you removed that qualification. Why?

4 Likes