Gpuccio: Functional Information Methodology

The random walk is random. It is not, of course, the only component of the neo-darwinian model.

The model is RV + NS. I happen to know this. The probabilistic analysis applies only to RV. NS must be analyzed separately.

But for NS to act, there must be something that is naturally selectable. And that naturally selectable function must arise by RV alone. Can you agree on that?

My point is simply that this is not a peer review, but a confrontation between two very different paradigms.

I never made this error. Nor, as far as I know, has any important source of ID theory.

With all respect, you are using a strategy which is unfair and unacceptable. Because it generates only confusion in a discussion which is already confused enough. And not, I believe, because of me.

In brief, I asked for any possible counter-example to my statement that there exists no object, non biological and non designed, which exhibits high FI (more than 500 bits). I firmly stick to this statement. It is true, beyond any doubt.

So, you readily offer not one, but four counter-examples. Very good.

Now, I am not joking here. I mean what I say. And I am not here to waste time, or devise strategies. One counter-example will falsify my position. This point is very important, to me and to the discussion here.

So I take your examples very seriously. And I start from the first (the others are not essentially different). And I point to the reasons why it is not a counter-example. Indeed, it is rather obvious to me that it is not a counter-example at all, so much so that I really wonder why you are offering it.

At this point, you justify your choice using an old trick, that I know very well, and that indeed I had cautioned you against just before you used it: you use the bits to build an ad hoc function.

This is not new. I remember that many years ago, in a similar discussion (yes, I am not at all new at these confrontations and at these objections, whatever you seem to believe), Mark Frank, a very serious and intelligent friend from the other side, when challenged to offer one counter-example, did exactly the same thing: he used the bits to build an ad hoc function. And believe me, he was in perfect good faith.

IMO, this seems to show two things: how intelligent people can say and believe obviously wrong things when they have a desperate need to deny some important truth that is not comforting for their worldview, and how the same intelligent people (in this case MF and you) obviously cannot find a better argument, if they have to resort to this very indirect, and completely wrong, argument.

However, when I pointed out that you were using the wrong trick of using the bits to build an ad hoc function, you changed your position again: instead of simply saying whether you agree or disagree with my point, you “justify” it by saying that I did the same, and broke my rules in the same way.

Which is absolutely not true. So, when I point to the simple fact that I have never, never used the bits to define a function, which can be checked by anybody just by looking at everything that I have written in more than ten years, you “justify” your behaviour by saying that my methodology to estimate FI is based on microstates. Which is not true, either, but requires a more detailed answer, that I hope to give later.

So, when I point to the simple fact that one thing is the definition of the function, another thing the procedure to estimate FI, and that the discussion was about the first thing, not the second, you still do nothing to acknowledge my point and clarify your position.

I have appreciated your behaviour up to now. Very much. But I don’t like this. Not at all. This discussion is very serious for me, and for obvious reasons difficult to manage here. Tricks and strategies and unnecessary confusion are very bad.

So, I ask again.

Do you insist that the starry sky exhibits more than 500 bits of FI? Under my definition of FI?

Do you insist that I have broken my own rules in the definition of a function? Where?

Thank you.

(I will discuss the tornado separately with Art, at least for a last attempt)


I’m not convinced that you’re not mixing concepts of FI here, but I don’t think it matters. As you point out, novel functions, domains, genes, and gene families do occur in specific clades. Regardless of the details of the calculation or the precise definition of FI, these represent the appearance of a substantial amount of functional information.

I would rather focus on larger issues that I think do matter. To summarize your approach, you are trying to determine whether the biosphere has sufficient probabilistic resources to hit upon functional targets, given the ratio of the target space to the search space. As a way of setting an upper bound on this ratio (i.e. a lower bound on FI), you use the number of bases conserved across long evolutionary time.

The major problems I see with your approach (which have largely been raised by me or by others already):

  1. Sequence conservation is not a valid estimator of the ratio you’re interested in. Conservation tells you nothing about most of the search space; it tells you only about the immediate mutational neighborhood of the existing sequence, which is a vanishingly small fraction of the total. More importantly, it does not give information about the number of nearby states that possess the function in question. Instead, it gives information about the number of states with higher function. (Higher fitness, actually, which need not be the same thing, but that’s a minor concern here.) But in standard evolutionary theory, the theory you’re challenging, adaptation involves passing through states with lower fitness (less functional states) until the local maximum is reached, and not returning to those states. Conservation cannot tell you whether there are less functional but still selectable states nearby in mutation space, and therefore cannot tell you anything about the size of the target space. This alone invalidates conservation as a proxy for the ratio you’re interested in.

  2. The target space you’ve considered consists of a single function, the function of the gene you’re looking at. To the extent that evolution can be considered a search algorithm, though, it is not a search for “the function performed by gene X”. It is a search for any function that will increase fitness. The only target space that will let you assess whether evolution could have produced some gene without design, then, is the space of all possible beneficial functions. Considering the probability post facto of achieving the function that did arise is indeed the Texas sharpshooter fallacy.

  3. The claim that only processes that incorporate design can generate 500 bits of FI has been challenged by examples of two biological processes that observably produce large amounts of FI: cancer and the immune system. Those challenges have not been addressed.


For your metric or your claims to have any meaning at all for evolution, the “random walk” would have to resemble an evolutionary trajectory. In its simplest form, this (an evolutionary trajectory) is a random exploration of sequence space in the vicinity of a sequence, which then iterates as the “center” of that sequence space moves. This movement results from selection, or drift, as the population genotypes change over time. And so, as I’m sure you know, many “random walks” are highly nonrandom as they unfold. The most striking metaphor, at least to me, is the ascent of a fitness peak. I don’t know whether your model makes this mistake, and I hope not, but if you are calculating probabilities about “random walks” that are blind to the iterative and nonrandom nature of evolutionary change, and especially adaptation, then your math is not worth thinking about.

I will repeat what I and others have been saying over and over and over: for this FI thing to be interesting at all, it would have to say something new or interesting or challenging about evolution. If all you are doing is quantifying some metric of protein sequence divergence/conservation, then you’re not adding anything interesting to the world of ideas. Maybe this is the best time to note that even if the metric passed basic muster (most importantly, by being subjected successfully to negative controls), it would have to be informative. And I don’t see any indication at all that it’s informative. If it were, in fact, informative, then I would recommend a few places where you might either submit for publication or confer with experts (at, say, a small conference). As you know, this is how scientific ideas are vetted and honed and improved.

Oh. I was misinformed about the purpose of the conversation. I thought it was about your methodology, and I assumed we all shared some basic commitments regarding rational explanation, hypothesis testing, critical feedback, and revision after testing/feedback. I apologize for making those assumptions. You should probably count me out of any further discussion in this conversation. Thanks for inviting me, @swamidass.


Right. I do agree.

You raise classical objections, that I know very well. Probably, I have not had the time to discuss them here, up to now.

This objection can be summarized as follows: we are looking at optimized functions, and they are conserved in such an optimized form. But, of course, we believe that they started much simpler, and evolved gradually to the optimized state by RV + NS. Therefore, the target space for the initial, simpler form of the function must be much bigger. Is that fine?

This objection is very reasonable from the point of view of a fervent believer in the powers of NS, but irrelevant if we consider what NS can really do according to facts. Optimizations are short and scarcely important. New functions are complex already in their starting form. Even if some optimization can certainly occur, and does indeed occur, a complex function is complex, and cannot be deconstructed into simpler steps, each of them naturally selectable.

I cannot enter into details about that just now, given the bulk of things that I still have to clarify, but I have discussed this point in great detail in my OP about the limits of NS, already linked here. I would also recommend Behe’s last book, Darwin Devolves, which is essentially about this problem (how NS really works).

I know that these brief statements will immediately draw hundreds of fierce attacks here. So be it.

No, the Texas sharpshooter has nothing to do with this. This objection is often referred to by me as the “any possible function” objection. In brief, evolution is not searching for anything, so it can find any possible function. Therefore, it must be much more powerful than a search for a specific function.

Again, this seems to be a reasonable objection from the point of view of a good believer in the neo-darwinian algorithm, but it is irrelevant when we consider facts.

First of all, it is not “any possible function”: it is any change that gives some definite reproductive advantage, and can therefore be expanded and fixed, with reasonable probability, by positive NS. That is much more restricted than “any possible function”.

Moreover, the number of functions that can really be useful in a context is severely limited by the complex organization of the context itself. An existing set of complex functions, well organized, can use only a few new functions, which must anyway be well integrated into what already exists. Behe’s book, and known facts, show clearly that in known cases of NS acting, the variation is very simple, and it is variation of some already existing complex structure, with some impairment of its original function, but at the same time some collateral advantage in a specific environment. Like in antibiotic resistance.

Again, the main point is that complex functions are already complex even in their minimally complex form. And adding the target space of many complex functions changes only trivially the target space - search space ratio, when we are already above the 500 bits, even a lot above. The key point is: these are logarithmic - exponential values. But, again, I cannot deal with this point in greater detail, for lack of time.

The claim remains valid. The examples offered for non biological objects are simply wrong. I am trying to show why, even if I don’t expect to convince anyone here. This is, it seems, a very sensitive point, and it evokes terrible resistance. Again, so be it.

Regarding cancer and the immune system, I will treat those two cases in great detail. If I survive. After all, that is my field, much more than meteorology! :slight_smile:

So why the many questions about ID theory, which I have been answering?

Really? That’s just nonsense. Most of us don’t have animosity or bias against Behe. We have just examined his work carefully and found it lacking. Appealing to him undermines your case substantially. If your argument depends on his books, you just torpedoed your own battleship.

Read his last paragraph again. He is stating that he sees no commitment on your part to engage in rigorous scientific work.

Maybe he is wrong, but he does not think you are engaging in good faith inquiry, as we all expect from ourselves and from our scientific peers. That is why he is checking out.

Maybe he is wrong, and it is in your power to prove him wrong by showing a different side of you.


And I really agree with you on that. I hate repeating arguments, if they have been already clearly stated. To agree to disagree is certainly a much better option.

Look, I have been forced to slow down my comments here, because it was really too demanding. But I am available to continue the discussion, if it remains interesting. My only aim is to defend ID theory, as well as I can. And to get interesting and constructive intellectual confrontation with those who think differently.

Regarding the problem of FI in non biological, non designed systems, I remain firmly convinced of my statement: there is no example with non-trivial values.

I don’t think I will discuss further the starry sky, because I believe that I have already shown beyond any possible doubt that it is a completely wrong example. Unless, of course, Swamidass brings new arguments.

But I feel that I still owe you some better clarification about my position regarding the tornado example. I am absolutely convinced that your analysis of that system in terms of FI is wrong, but maybe I have not explained my points clearly enough.

So, I will make a last attempt at clarification, but I need some more time for that. After that, I leave the last word to you, and we can peacefully agree to disagree. :slight_smile:
One last point. I have read in my e-mail a comment by you about my arguments here and the semantic argument. I cannot find it here, maybe it is in the parallel thread.

However, I wanted to confirm that you are perfectly right about that: while I believe that the semantic argument is very important and valid, I have not used it here up to now, and I probably won’t. The reason is simple: the arguments I have presented here do not need it.

So, I confirm that all my arguments here, the statement that no high levels of FI can be found in non biological and non designed objects, the estimate of FI in proteins, the estimate of the biological resources of our biological planet, and everything else, do not depend in any way on the semantic argument, at least not in the form that I have expressed them. They could certainly be strengthened by semantic considerations, and I will probably mention in the future discussion a minor aspect that is probably pertinent to my discussion, but essentially all my reasonings here are independent from that.

I hope this clarifies the point.

Another point is that, while my argument is more easily shown for digital information, it perfectly applies to non digital systems, too. I will clarify that better in my final discussion about tornadoes.

Ah, so you already know that what you’re modeling is not standard evolutionary biology.

To summarize the situation as I see it: there are two distinct kinds of evolution being discussed here. One, call it evol_biol, is the evolution proposed by evolutionary biologists. In evol_biol, there are many possible new functions that a species could acquire, function is often a matter of degree, and adaptation for a single trait frequently can take many routes, each through multiple beneficial mutations. In the other, call it evol_gpuccio, each species has a single possible beneficial trait, function is binary (present or not), and it is achieved or lost by a single mutation.

Sequence conservation, and the entire machinery of BLAST searches and functional information, tests the probability of evol_gpuccio occurring. The argument against evol_biol, which is the kind of evolution everyone else is interested in, boils down to, “Read Behe’s book.”


And I had just praised your lucidity at UD! :slight_smile:


To all here:

However, reading Behe’s books will certainly help. I highly recommend it!

This came to me via Bill Cole.


If natural selection can add 60 bits of FI in a few weeks, why can’t it add 500 bits of FI over the course of (say) 20 million years?

I have no idea of what the “60 bits in a few weeks” is about, but at least we can clarify the math.

Let’s say that 60 bits are added in one week, whatever the source of this statement may be.

500 bits is a quantity which is 2^440 times bigger than 2^60.

We have about 2^38 weeks in 5 billion years.

So, at that rate, NS would be able to add about 98 bits of FI in 5 billion years.

I hope I made no errors. Just check the math.

Yes, your math is off by 37 orders of magnitude. You’re essentially calculating probabilities of independent low-probability events – create one improbable antibody this week, create another next week. The probability of creating both is the product of the individual probabilities, which is equivalent to adding the logs of the probabilities. So you add the number of bits. Generating 60 bits per week, and assuming independence, means that it will take a little over 8 weeks to generate 500 bits.
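The arithmetic in this correction can be sketched in a few lines (toy numbers from the thread, not a model of any real system):

```python
import math

# Independent probabilities multiply, so their logs (bits) add.
p_week = 2.0 ** -60            # probability of one 60-bit event in one week
p_two_weeks = p_week * p_week  # two independent weeks multiply
bits_two_weeks = -math.log2(p_two_weeks)  # logs add: 60 + 60 = 120 bits

# At 60 bits per week, reaching 500 bits takes a little over 8 weeks.
weeks_to_500_bits = math.ceil(500 / 60)   # 9 whole weeks
```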


We have read Behe’s three books @gpuccio. We have assessed them carefully. Have you read our response to Darwin Devolves?


Wow, you guys in the anti-ID field seem to be really fond of this error.

Let’s state things clearly:

10 objects with 50 bits of FI each are not, in any way, 500 bits of FI.

Which is what Rumracket (and maybe you) seems to believe when he says:

If natural selection can add 60 bits of FI in a few weeks, why can’t it add 500 bits of FI over the course of (say) 20 million years?

To make things more clear, I will briefly propose again here my example of the thief and the safes, that I used some time ago to make the same point with Joe Felsenstein.

It goes this way.

A thief enters a building, where he finds the following objects:

a) One set of 100 small safes.

b) One big safe.

The 100 small safes contain, each, 1/100 of the sum in the big safe.

Each small safe is protected by one electronic key of one bit: it opens either with 0 or with 1.

The big safe is protected by a 100 bit long electronic key.

The thief does not know the keys, any of them.

He can do two different things:

a) Try to open the 100 small safes.

b) Try to open the big safe.

What would you do, if you were him?

Rumracket, maybe, would say that there is no difference: the total sum is the same, and according to his reasoning (or your reasoning, maybe) we have 100 bits of FI in both cases.

My compliments to your reasoning! If the thief reasoned that way, he could choose to go for the big safe, and maybe spend his whole life without succeeding. He has to find one functional combination out of 2^100 (about 10^30). Not a good perspective.

On the other hand, if he goes for the small safes, he can open one in, what? one minute? Probably less. Even giving one more minute to take the cash, he would probably be out and rich after a few hours of honest work! :slight_smile:

So, you see, 100 objects with one bit of FI each do not make 100 bits of FI. One object with 100 bits of FI is the real thing. The rest is simply an error of reasoning.
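For what it is worth, the asymmetry in the thief example is easy to check with a toy simulation (illustrative code only; the safes and keys are the thread's hypothetical, and the sequential-guessing strategy is an assumption):

```python
import random

def tries_to_open_small_safes(n_safes=100, seed=1):
    """Count the guesses needed to open n one-bit safes, one at a time."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_safes):
        key = rng.randint(0, 1)  # each safe's key is a single bit
        total += 1               # first guess: try 0
        if key != 0:
            total += 1           # second guess: try 1 (now guaranteed)
    return total

small_safe_tries = tries_to_open_small_safes()  # at most 2 per safe: <= 200
big_safe_expected_tries = 2 ** 99               # ~6e29 expected tries for one 100-bit key
```

The small safes take at most two guesses each, so at most 200 guesses total, while the single 100-bit safe takes on the order of 2^99 expected guesses: the point of the example.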

You are quite correct. My mistake was in treating the 60 bits as representing the probability of finding a particular antibody per infection rather than per B cell. In the former case, my calculation would be correct. (In the 100 safes scenario, the correct analogy would be the probability of unlocking all 100 safes by flipping a coin once as the thief encounters each safe. That probability is indeed the same as that for guessing the 100-bit combination by flipping 100 coins.) But since the 60 bits is per B cell, the probability per infection is much higher.

So let’s ballpark some numbers for the real case. We’re assuming the probability of hitting on the correct antibody is ~1e-18, which is 60 bits worth. How many tries do the B cells get at mutating to hit the right antibody? Good question. There seem to be about 1e11 naive B cells in an adult human. Only a fraction of these are going to proliferate in most infections. Let’s say 10% of naive B cells each proliferate 100-fold. That gives 1e12 tries at a 1e-18 target, for a probability of randomly hitting the target of 1 in a million per infection. That corresponds to ~20 bits. So each week in this scenario only contributes 20 bits of probability, not 60, and the time to reach 500 bits is 25 weeks, not 8. (Note: this 500 bits represents the same probability as hitting a 500 bit target in a single try.) If my guess of the proliferation is off by an order of magnitude, knock off a few more bits. It still takes less than a year to get to 500 bits, and a lot less than 1e38.
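The ballpark above can be redone in a few lines (all figures are the rough assumptions stated in the post, not measured values):

```python
import math

p_antibody = 1e-18                   # assumed probability per B-cell try (60 bits)
naive_b_cells = 1e11                 # assumed naive B cells in an adult human
tries = 0.1 * naive_b_cells * 100    # 10% proliferate 100-fold -> 1e12 tries

p_per_infection = tries * p_antibody           # ~1e-6 per infection
bits_per_infection = -math.log2(p_per_infection)  # ~20 bits
weeks_to_500 = 500 / bits_per_infection           # ~25 weeks
```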



I have no time now, and will answer later.

For the moment, two things:

  1. Up to now, I had not considered the problem of what the initial 60 bits meant. I have only responded to your statement about probabilities in general, which is wrong. You are still wrong, and I will clarify better later.

  2. In the meantime, I realize that your example is referring to the working of the immune system. So, now it is a good time to deal with that, and I will do that later too. But I have real difficulties in understanding your ideas about the immune system, and how you apply probabilities to that example.

So, could you please give me a short layout of your thoughts? You have probably done that before, but I have not read it.

In particular, are you referring to the primary immune response, to antibody affinity maturation, or to the secondary response? Please, be as clear as possible.

Until later.

This is a new one. Probabilities are multiplicative, so information is additive. Information is the log of a probability. So yes, 10 objects with 50 bits of FI each are exactly 500 bits of FI.

There is a caveat here if the objects are dependent. In this case the total FI is less than 500. But that also means that the per object estimate of FI is deceptively high.

Why would you think differently?

This, it seems, would explain quite a bit of the disagreement between us. If FI is not additive (but it is!!) there is no reason to consider cumulative scenarios, which is why you have not. Except FI is additive.

Whatever error @glipsnort made, he is entirely correct here.


I think the confusion is that @gpuccio is not really interested in bits of information, but in probability of hitting targets with multiple attempts, and that isn’t multiplicative. Compare two cases, one where there are two independent events, each with probability p, and the other where there is a single event with probability p^2. The two cases have the same number of bits, and in the event of a single trial they have the same probability of success.

If there are multiple (n) trials, however, the probabilities are no longer equal. In case 1, Prob(success) = [1 - (1-p)^n] * [1 - (1-p)^n]. If np << 1, Prob(success) ~ (np)^2. In case 2, Prob(success) = 1 - (1-p^2)^n, which is ~ np^2.
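A quick numerical check of the two cases (the values of p and n are illustrative, not tied to any biological quantity):

```python
p, n = 1e-4, 10

# Case 1: two independent p-probability events, n trials each.
case1 = (1 - (1 - p) ** n) ** 2
# Case 2: a single p^2-probability event, n trials.
case2 = 1 - (1 - p ** 2) ** n

# Small-np approximations from the post above.
approx1 = (n * p) ** 2   # ~1e-6
approx2 = n * p ** 2     # ~1e-7
```

With multiple trials the two cases differ by roughly a factor of n, even though they carry the same number of bits.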


@glipsnort that is not a well defined example. What do you think of this?

Case 1. Two independent events, each with probability p; success requires both at the same time, and there is no benefit to one alone.

Case 2. Two independent events, each with probability p, and each event is independently useful, so it can be retained by negative selection when found.

Case 3. One event with probability p^2.

All else being equal, perhaps with some caveats to be clarified:

  1. The FI is the same for all three cases (success at all events).

  2. Single trial success is identical in all cases: p^2, with FI −2 log₂ p.

  3. Evolutionary wait time in Cases 1 and 3 is the same, scaling as 1/p^2, with FI −2 log₂ p.

  4. Evolutionary wait time in Case 2 is much less, approximately 2/p, with the same FI of −2 log₂ p. Note, the wait time is actually LESS than this, and it scales very well as we increase decomposability.

  5. Case 1 is equivalent to the strictest (and known to be false) version of irreducible complexity (IC1). Even Behe acknowledges that this is not how biology works.

  6. For very good reason, modern evolutionary theory works like Case 2, which has far lower wait times than Cases 1 and 3.

  7. FI does not correlate with wait time! The decomposability of the system breaks this relationship.

  8. This result does not depend on fitness landscapes at all, just random sampling (tornado in a junkyard) plus NEGATIVE selection, not Darwinian positive selection.

I bulleted out the points here so points can be disputed or affirmed more clearly. Everything I wrote here is directly verifiable with simulations, and experiments. We are not even including positive selection here, just negative selection.
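As one example of the kind of simulation mentioned, here is a toy comparison of the wait times in Cases 1/3 versus Case 2 (the value of p, the trial counts, and the retention mechanism are illustrative assumptions, not a model of any real population):

```python
import random

def wait_joint(p, rng):
    """Cases 1/3: both events must occur in the same trial (expected ~1/p^2)."""
    trials = 0
    while True:
        trials += 1
        if rng.random() < p and rng.random() < p:
            return trials

def wait_retained(p, rng):
    """Case 2: each event, once found, is kept by negative selection (expected ~2/p)."""
    trials = 0
    for _ in range(2):              # find the two events one after the other
        while rng.random() >= p:
            trials += 1
        trials += 1                 # count the successful trial too
    return trials

rng = random.Random(42)
p, reps = 0.02, 200
joint = sum(wait_joint(p, rng) for _ in range(reps)) / reps        # ~1/p**2 = 2500
retained = sum(wait_retained(p, rng) for _ in range(reps)) / reps  # ~2/p = 100
```

Both strategies correspond to the same FI, yet the decomposable (retained) case waits orders of magnitude less, which is the point of bullet 7.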

This seems to be the heart of @gpuccio’s conceptual misunderstanding of Behe and of evolutionary science. He is working off a strawman theory of evolution, and a long-ago-dropped formulation of IC. Recall also that he explains that his work depends on Behe’s work. He seems unaware that Behe acknowledges that IC1 does not reflect biology.