Gil grabs some ammunition and shoots down Doug Axe's 2004 extrapolation by a factor of more than 10^44

Sometimes the sequences are not so similar. We are looking at sequences based on similar or the same function. Beta lactamase is an example of a protein that has similar function but divergent sequences.

So what?


Your claim:

Is false.
We are looking at similar function.

No, it’s not false. Many (actually the vast, vast majority) of the functional annotations found in databases are based on sequence similarity (sometimes including synteny of surrounding genes and non-coding DNA). Scientists have generally not redone the complete biochemical assays that show the actual functions of these proteins for every new species they are discovered in.

And even where they have done assays that show the functions of different proteins, Gpuccio has not been deriving his numbers from different proteins with similar functions. He’s just done blast searches looking for proteins with similar sequences.


This is not the method we are using. The function of alpha actin is clear, as is the function of beta lactamase. One protein lines up well; the other does not. The comparison is based on functional similarity, not sequence alignment. I understand the maturity of the databases is an issue for certain comparisons.

This is false. You need to rethink your argument.

Hey, it gets worse. Gpuccio implicitly accepts the inference of incremental increases in protein size and complexity over the history of some clade. That’s how he derives his so-called “information jumps” from for example the evolution of some protein in an inferred ancestral vertebrate, to the human version of the sequence.

Hence Gpuccio implicitly accepts that this historical growth has occurred, and has an even deeper ancestry. Read the link.


Yes, that is the method you are using. You clearly have no idea what’s going on, again.

No, it’s not false. Please go and read what Gpuccio actually does, from his own posts. Click the link above.


After rereading this I see where you are coming from. He is looking at basically the same function and not similar function. I think this is where the disconnect is. Where I think you are still in error is claiming that

The sequences may be similar and may not be. It depends on the variation observed from the specific function in different animals.

No, the disconnect still is that you don’t understand that his method is based on similarity searches, because he literally, explicitly does similarity-based searches when he’s looking for homologous sequences in other species to the one he’s interested in.

There isn’t any way around this.

But I’m not, for reasons already explained.

Your subjective views on the degree to which the sequences are “similar” are not a relevant factor here, Bill.

What matters to the question of whether the method is based on similarity or not, is how Gpuccio finds and includes a sequence to use so as to extrapolate the amount of tolerated variation for some sequence to implement a function. He does that by doing similarity-based searches. That is what happens when you use a BLAST search tool.

Even were he to use the names of proteins from different species (in other words, let’s say he tries to avoid using BLAST), say he wants to find ATP synthase subunit beta in some unicellular eukaryote, he could just search for that (“ATP synthase subunit beta”) on UniProt, for example, and then sort the results by taxonomy. He’d find lots of candidate gene sequences that haven’t actually been experimentally characterized as ATP synthase subunit beta, but merely inferred to be that on the basis of some sort of similarity measure. That’s how these genes are oftentimes automatically annotated in these databases.

Only a very small subset of them have been experimentally characterized to function in some specific way expected from their similarity.
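To make that concrete, here is a minimal sketch of what similarity-based annotation transfer looks like. It is a toy, not how UniProt or any real pipeline works in detail: the sequences, the function labels, the `annotate` helper and the 0.6 threshold are all made up for illustration, and real tools use alignment scores rather than a plain identity count.

```python
def identity(a, b):
    """Fraction of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical entries whose function was actually assayed in the lab.
CHARACTERIZED = {
    "MKTAYIAKQR": "toy kinase",
    "GGSLPWDEHC": "toy transporter",
}

def annotate(seq, threshold=0.6):
    """Label a new sequence by copying the function of its most similar
    characterized entry. The threshold is an arbitrary assumption."""
    best_ref = max(CHARACTERIZED, key=lambda ref: identity(seq, ref))
    best = identity(seq, best_ref)
    if best >= threshold:
        # No assay was run on seq: the label is inferred, not measured.
        return (CHARACTERIZED[best_ref], "inferred from sequence similarity", best)
    return ("uncharacterized", "no evidence", best)

new_seq = "MKTAYLAKQK"      # made-up sequence from a "new species"
unrelated = "WWWWWWWWWW"    # made-up sequence with no close match

print(annotate(new_seq))
print(annotate(unrelated))
```

The point of the sketch is only that the label on `new_seq` comes from its similarity to something else, which is the situation for most database entries.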

Now, since these BLAST-based searches, used to collect homologous sequences for the extrapolation of FI, are in fact based on sequence similarity, calculating FI means implicitly accepting that the similar sequences you use are in fact homologous. It is then hypocritical to suddenly and arbitrarily reject similarity-based inferences of relatedness when those very same similarity-based searches can be used to show deeper ancestral relationships to proteins with different functions, or with simpler structures and shorter, more likely sequences.

You can’t have your cake and eat it too.


His claim has nothing to do with functional information.


How many different types of beta-lactamase are there, Bill? Are all of their sequences homologous?

How are you doing on coming up with a design explanation for all of those MYH7 alleles in healthy humans?


According to Bill, such experiments can be dismissed as “just so” stories. :smile:

No, the disconnect still is that you don’t understand that his method is based on similarity searches, because he literally, explicitly does similarity-based searches when he’s looking for homologous sequences in other species to the one he’s interested in.

The sequences in invertebrates are very different here. How does your argument work?

Gpuccio found them by similarity-based BLAST searches. So, that’s how.

Are you claiming that these are different functioning proteins?

No, why would I? LOL


How then are they labeled the same function with vastly different sequences?

Because they’re still similar enough for that inference to be made. Similarity comes in degrees, Bill; it is not only 100% or 0%. There are significant similarities below 100% and above 5% for protein sequences. Heck, there are even significant sequence similarities at 0% (no identical residues in pairwise alignment). I’ll let you figure out how that can be.
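One way this can happen, sketched below, is that alignment scoring gives credit for chemically similar residues and not only for identical ones. The score table here is a toy with values picked for illustration (in the spirit of BLOSUM-style matrices, not copied from any real matrix), and the two sequences are made up.

```python
# Toy scores for conservative substitutions (illustrative values only).
TOY_SCORES = {
    frozenset("IL"): 2,
    frozenset("IV"): 3,
    frozenset("KR"): 2,
    frozenset("DE"): 2,
    frozenset("FY"): 3,
    frozenset("ST"): 1,
}
MATCH = 4        # toy score for an identical pair
MISMATCH = -2    # toy score for any other non-identical pair

def percent_identity(a, b):
    """Percentage of positions with identical residues (pre-aligned input)."""
    assert len(a) == len(b)
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def similarity_score(a, b):
    """Sum of per-position scores; similar residues count, not just identical ones."""
    total = 0
    for x, y in zip(a, b):
        if x == y:
            total += MATCH
        else:
            total += TOY_SCORES.get(frozenset((x, y)), MISMATCH)
    return total

seq_a = "IIKDFS"
seq_b = "LVREYT"   # every position differs, but every swap is conservative

print(percent_identity(seq_a, seq_b))   # no identical residues at all
print(similarity_score(seq_a, seq_b))   # yet the toy score is clearly positive
```

Whether a score like that counts as significant depends on sequence length and on the statistics of the scoring scheme, which this toy does not model.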

Try going on UniProt and pulling up a lot of homologues of ATP synthase subunit beta, and you will discover that only a small minority have experimental biochemical evidence for the function of the sequence. It’s the same for all of these public sequence and genome databases that hold the sequences of millions of genes from hundreds of thousands of species. There’s no way to biochemically assay and characterize all these millions upon millions of genes, so functional annotation is oftentimes automated by similarity-based algorithms, or sometimes even done by mere alignments by hand.


Having looked at the plot, the immediate suspicion is that there isn’t actually an information jump at all (because such a big difference in information would require the human version of these proteins to be five times as long as the echinoderm ones) but the plotter is merely showing the level of difference between the human vs animal sequences, and fish/mice have higher numbers because they diverged from the human lineage more recently than echinoderms and tunicates.

Having found the source (a post by gpuccio on Uncommon Descent), it transpired that was exactly what was done:
The evolutionary history of those six protein is summarized in the following graph, realized as usual by computing the best homology bit score with the human protein in different groups of organisms.

So there is no information jump to explain - the echinoderm and human proteins contain about the same amount of information. The information is merely different, and the plot merely reflects divergence time. In fact, gpuccio doesn’t even confirm that the differences are due to changes in the human/mouse/fish lineages as opposed to changes in the metazoan/echinoderm/tunicate lineages! As far as that paper is concerned it may be showing an information ‘jump’ from chordates to echinoderms, and I’m sure that if gpuccio had calculated his ‘information’ values by comparisons to echinoderms rather than humans that is exactly what it would show.

So this isn’t merely a failure to plot anything relevant, it’s also a failure to understand that phylogenies are bush-like not ladder-like, and that echinoderm lineages have evolved for exactly as long as the human lineage since they split.
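A toy calculation shows how much the apparent ‘jump’ depends on which species is used as the reference. Everything in it is an assumption made for illustration: the split ages are in arbitrary units (not real dates), the score is assumed to drop linearly with the time separating two tips, and all lineages are assumed to change at the same rate.

```python
# Nested tree (((human, mouse), fish), echinoderm) with hypothetical split
# ages in arbitrary time units. Clades are listed from smallest to largest.
CLADES = [
    ({"human", "mouse"}, 1),
    ({"human", "mouse", "fish"}, 4),
    ({"human", "mouse", "fish", "echinoderm"}, 6),
]
TAXA = ["echinoderm", "fish", "mouse", "human"]
MAX_SCORE = 100     # toy score of a sequence against itself
LOSS_PER_UNIT = 5   # toy score lost per unit of time separating two tips

def mrca_age(a, b):
    """Age of the most recent common ancestor of two taxa."""
    if a == b:
        return 0
    for members, age in CLADES:
        if a in members and b in members:
            return age
    raise ValueError("unknown taxon")

def score_against(reference, taxon):
    # Both lineages keep changing after the split, hence the factor of 2.
    separation = 2 * mrca_age(reference, taxon)
    return MAX_SCORE - LOSS_PER_UNIT * separation

def profile(reference):
    """Score of every taxon against the chosen reference."""
    return {taxon: score_against(reference, taxon) for taxon in TAXA}

print(profile("human"))
print(profile("echinoderm"))
```

With human as the reference the scores climb from echinoderm to fish to mouse. With echinoderm as the reference, fish, mouse and human all get the same low score and the only ‘jump’ is at the echinoderm itself, even though every lineage in the toy has changed by the same amount.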

There’s nothing here for anyone to explain, unless it’s gpuccio and Bill explaining why they consistently fail to understand one of the most basic concepts of evolution.


He is measuring functional information not Shannon information.