Gpuccio is a poster at Uncommon Descent who published a blog article at UD arguing that proteins have high functional information (FI), and must have been designed. His work is commonly referenced by ID proponents.
This is his post:
We asked him to explain his methodology further. @colewd asked him (comment 343), and gpuccio gave a response (comment 356). He has agreed to respond to reasonable critique.
OK, so here is a brief primer about my methodology to measure Functional Information in proteins:
a) I use Blast to measure sequence homology between proteins, in bits. I take the bitscore from the Blast algorithm as it is, with some consideration of the number of identities and similarities, too.
b) I am interested in homologies that are conserved throughout long evolutionary periods. I consider that kind of homology as a very good estimator of FI. The reason is very simple: a specific sequence can be conserved for those long time windows only if it is under very strong functional constraint, and is therefore preserved by negative (purifying) selection. In all other cases, sequence homologies are practically cancelled after a long evolutionary time because of the constant effect of neutral variation.
c) How long must the time window be so that sequence homology may be considered a good estimator of FI? I would say at least 200 – 400 million years. Better if 400. Why? Because that’s more or less the time window that is usually associated with “saturation” of synonymous sites, IOWs with the more or less complete loss of any detectable homology in neutral sequences.
d) In particular, I am interested in “information jumps” in proteins at specific evolutionary times, especially at the transition to vertebrates.
e) That transition is supposed to have happened more than 400 million years ago, therefore providing a good time window for our reasoning. Moreover, the split between pre-vertebrates and the first vertebrates, and the following split between cartilaginous fishes and bony fishes, presumably both happened within a relatively short time, more than 400 million years ago. As we will see, this is a very good context to measure information jumps in proteins.
f) So, let’s be practical. I take some protein in the human form. IOWs, I use the human sequence as my initial “probe”.
g) Then I measure, for the specific task of studying the transition to vertebrates, the sequence homology between the human protein and all pre-vertebrates, in particular deuterostomes and non-vertebrate chordates. I take the bitscore value of the best hit. This value represents the best assessment of sequence homologies existing before the appearance of vertebrates. The value can be very low, or medium, or high. Whatever it is, that sequence information was already there.
h) Then I measure the sequence homology between the human protein and the proteins in cartilaginous fishes. I take the bitscore value of the best hit. Again, it can be low, medium or high. This is the sequence homology that is present at the beginning of vertebrate history, before the split between cartilaginous fishes and bony fishes. That, again, is supposed to have happened 400+ million years ago. This value is important, because humans derive from bony fishes. Therefore, any homology found between cartilaginous fishes and humans predates the split between cartilaginous fishes and bony fishes. IOWs, any such homology was already present in the common ancestor of fishes, and therefore it has been conserved for 400+ million years.
i) Finally, I take the difference, in bits, between the bitscore from h) and the bitscore from g). This is the “information jump”, IOWs the sequence homology (to the human form) that has been “added” in the transition to vertebrates. And it is also a very good estimator of the functional information jump (more precisely, of the jump in human conserved FI), IOWs of the human conserved FI that was added to that protein in the transition to vertebrates, because both the homology measured from h) and the homology measured from g) are sequence homologies that have been conserved for 400+ million years.
This is the general idea.
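Steps g) to i) can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not gpuccio's actual tooling: the function names and the `(species, bitscore)` hit format are hypothetical, and treating "no hit at all" as a bitscore of 0 is an assumption that the description does not spell out.

```python
from typing import Iterable, Tuple

# Hypothetical hit format: (species name, BLAST bitscore against the human probe).
Hit = Tuple[str, float]


def best_bitscore(hits: Iterable[Hit]) -> float:
    """Return the highest bitscore among the hits.

    Assumption: if there are no hits, the homology is counted as 0 bits.
    """
    return max((score for _, score in hits), default=0.0)


def information_jump(pre_vertebrate_hits: Iterable[Hit],
                     cartilaginous_hits: Iterable[Hit]) -> float:
    """Step i): best hit from step h) minus best hit from step g), in bits."""
    return best_bitscore(cartilaginous_hits) - best_bitscore(pre_vertebrate_hits)


# Usage with the best hits reported in the post for the ATP synthase beta chain.
pre_vertebrate = [("Acanthaster planci", 866.0)]
cartilaginous = [("Callorhinchus milii", 929.0)]
print(information_jump(pre_vertebrate, cartilaginous))  # 63.0
```

In practice each list would hold every hit returned by the corresponding Blast search; only the maximum of each list enters the difference.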
Now, an example.
Let’s consider for a moment an old friend, the beta chain in ATP synthase. We know it is a highly conserved protein, with a very high homology between the human sequence and the sequence in bacteria. Certainly a lot of FI there.
Now, this is a 529 AA long protein. Let’s say that we want to apply our methodology to see what happens to that protein at the vertebrate transition.
OK, blasting the human sequence (P06576) against non-vertebrate Deuterostomia and Chordates, the best hit is 866 bits (Acanthaster planci, a starfish).
Now, let’s blast it against cartilaginous fishes. The best hit is 929 bits (Callorhinchus milii).
So, the information jump at the transition to vertebrates, for that protein, is 929 – 866 = 63 bits. 0.12 bits per AA (baa). Very low indeed.
That simply means that the protein was already almost identical to the human form in pre-vertebrates (87% identities). IOWs, the FI was already there, and no big information jump takes place at the vertebrate transition.
Now, let’s do that again with the protein CARMA1/CARD11, which we have discussed in this thread.
This is a 1554 AA long sequence, in the human form.
Again, let’s blast it against Deuterostomia and Chordates that are not vertebrates. The best hit is 234 bits (Saccoglossus kowalevskii). 0.15 baa. A very low score, for such a long protein. That means that the specific sequence found in humans was almost completely absent in pre-vertebrates.
Now, let’s blast it against cartilaginous fishes. The best hit is 1514 bits (Callorhinchus milii). That means that, even if the shark protein is still different from the human form, about half of the potential sequence information is already there. More than 400 million years ago.
So, how big is the jump in human conserved FI? Easy. 1514 – 234 = 1280. IOWs, about 1280 bits of FI have been added “de novo” in vertebrates, and then conserved for more than 400 million years. That’s a very big information jump.
IOWs, this protein was highly and specifically engineered during the transition to vertebrates, and that precious FI has then been preserved up to now.
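The two worked examples can be checked side by side with a short script. It is a sketch that only reproduces the arithmetic, using the protein lengths and best-hit bitscores exactly as stated in the post. The helper name `jump_summary` and the dictionary keys are made up for illustration.

```python
def jump_summary(name: str, length_aa: int, pre_bits: float, cart_bits: float) -> dict:
    """Summarise the information jump for one protein.

    pre_bits  : best bitscore against non-vertebrate deuterostomes/chordates
    cart_bits : best bitscore against cartilaginous fishes
    "baa" means bits per amino acid of the human sequence.
    """
    jump = cart_bits - pre_bits
    return {
        "protein": name,
        "pre_baa": round(pre_bits / length_aa, 2),
        "jump_bits": jump,
        "jump_baa": round(jump / length_aa, 2),
    }


# Lengths and bitscores are the ones quoted in the post.
examples = [
    jump_summary("ATP synthase beta chain", 529, 866, 929),
    jump_summary("CARMA1/CARD11", 1554, 234, 1514),
]

for row in examples:
    print(row)
```

Normalising by length makes the contrast explicit: the ATP synthase jump is 63 bits (0.12 baa), while the CARD11 jump is 1280 bits (about 0.82 baa).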