No, why would I? LOL
How then are they labeled the same function with vastly different sequences?
Because they’re still similar enough for that inference to be made. Similarity comes in degrees, Bill; it is not only 100% or 0%. There are significant similarities below 100% and above 5% for protein sequences. Heck, there are even significant sequence similarities at 0% identity (no identical residues in a pairwise alignment). I’ll let you figure out how that can be.
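As a hint on the 0% case, here is a minimal sketch of the difference between identity and similarity. The residue groupings and the two toy sequences are illustrative assumptions only; real alignment tools score substitutions with matrices such as BLOSUM62 rather than hard groups.

```python
# Hypothetical grouping of amino acids by side-chain chemistry.
# This is a simplification for illustration; real tools use substitution
# matrices (e.g. BLOSUM62) that assign graded scores to each residue pair.
GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions holding the exact same residue."""
    assert len(a) == len(b), "sequences must already be aligned"
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def percent_similarity(a: str, b: str) -> float:
    """Percentage of aligned positions holding the same or a chemically similar residue."""
    assert len(a) == len(b), "sequences must already be aligned"

    def similar(x: str, y: str) -> bool:
        return x == y or any(x in g and y in g for g in GROUPS)

    return 100.0 * sum(similar(x, y) for x, y in zip(a, b)) / len(a)


# Made-up toy sequences: every position differs, but each substitution
# stays within one chemical group.
seq_a = "ILKDSFNV"
seq_b = "VIRETYQL"

print(percent_identity(seq_a, seq_b))    # no identical residues
print(percent_similarity(seq_a, seq_b))  # every substitution is conservative
```

Two sequences can share no identical residues and still align position by position with conservative substitutions, which is the kind of signal a substitution matrix rewards.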
Try going on UniProt and pulling up a lot of homologues of ATP synthase subunit beta, and you will discover that only a small minority have experimental biochemical evidence for the function of the sequence. It’s the same for all of these public sequence and genome databases, which hold millions of gene sequences from hundreds of thousands of species. There’s no way to biochemically assay and characterize all these millions upon millions of genes, so functional annotation is often automated by similarity-based algorithms, or sometimes even done by mere alignments by hand.
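To make the idea of similarity-based annotation concrete, here is a rough sketch. It is not how UniProt or any real pipeline works (those rely on proper alignment tools and curated rules); the function names, the 0.4 threshold, and the toy sequences are all hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions in a naive ungapped comparison."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def annotate(query: str, characterized: dict, threshold: float = 0.4):
    """Transfer the label of the most similar characterized sequence.

    `characterized` maps a function label to a sequence that has direct
    experimental evidence. The threshold is an arbitrary assumption.
    Returns (label or None, best score).
    """
    best_label, best_score = None, 0.0
    for label, seq in characterized.items():
        score = identity(query, seq)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score
    return None, best_score


# Made-up toy sequences, not real proteins.
characterized = {
    "ATP synthase subunit beta": "MKTAYIAKQR",
    "some kinase": "GGSPLLWWCC",
}
label, score = annotate("MKTAYLAKQK", characterized)
print(label, score)
```

The uncharacterized query inherits its label purely from sequence similarity, with no biochemical assay involved, which is the situation for most database entries.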
Having looked at the plot, the immediate suspicion is that there isn’t actually an information jump at all (because such a big difference in information would require the human version of these proteins to be five times as long as the echinoderm ones) but the plotter is merely showing the level of difference between the human vs animal sequences, and fish/mice have higher numbers because they diverged from the human lineage more recently than echinoderms and tunicates.
Having found the source (a post by gpuccio on Uncommon Descent), it transpired that was exactly what was done:
The evolutionary history of those six protein is summarized in the following graph, realized as usual by computing the best homology bit score with the human protein in different groups of organisms.
So there is no information jump to explain - the echinoderm and human proteins contain about the same amount of information. The information is merely different, and the plot merely reflects divergence time. In fact, gpuccio doesn’t even confirm that the differences are due to changes in the human/mouse/fish lineages as opposed to changes in the metazoan/echinoderm/tunicate lineages! As far as that post is concerned it may be showing an information ‘jump’ from chordates to echinoderms, and I’m sure that if gpuccio had calculated his ‘information’ values by comparison to echinoderms rather than humans, that is exactly what it would show.
So this isn’t merely a failure to plot anything relevant, it’s also a failure to understand that phylogenies are bush-like not ladder-like, and that echinoderm lineages have evolved for exactly as long as the human lineage since they split.
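A toy simulation illustrates the point. It is only a sketch under stated assumptions: a made-up four-taxon tree, a uniform substitution rate, random starting sequence, and plain fractional identity standing in for gpuccio’s BLAST bit score. All names and numbers are hypothetical.

```python
import random

AMINO = "ACDEFGHIKLMNPQRSTVWY"


def evolve(seq: str, steps: int, rng: random.Random, rate: float = 0.01) -> str:
    """Return a copy of seq after `steps` rounds of random substitutions."""
    s = list(seq)
    for _ in range(steps):
        for i in range(len(s)):
            if rng.random() < rate:
                s[i] = rng.choice(AMINO)
    return "".join(s)


def identity(a: str, b: str) -> float:
    """Fraction of positions with identical residues (no gaps in this model)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def simulate(length: int = 2000, seed: int = 1) -> dict:
    """Evolve one ancestral protein down a toy tree.

    Every tip is exactly 100 steps from the root, so no lineage is
    'older' or 'more evolved' than any other.
    """
    rng = random.Random(seed)
    root = "".join(rng.choice(AMINO) for _ in range(length))
    urchin = evolve(root, 100, rng)        # echinoderm lineage
    chordate = evolve(root, 40, rng)       # stem of fish + mouse + human
    fish = evolve(chordate, 60, rng)
    tetrapod = evolve(chordate, 40, rng)   # stem of mouse + human
    mouse = evolve(tetrapod, 20, rng)
    human = evolve(tetrapod, 20, rng)
    return {"urchin": urchin, "fish": fish, "mouse": mouse, "human": human}


seqs = simulate()
to_human = {k: identity(seqs["human"], v) for k, v in seqs.items() if k != "human"}
to_urchin = {k: identity(seqs["urchin"], v) for k, v in seqs.items() if k != "urchin"}
print("similarity to human: ", to_human)
print("similarity to urchin:", to_urchin)
```

Measured against human, the scores step up from urchin to fish to mouse, which looks like a ‘jump’ even though nothing in the model adds information. Measured against urchin, fish, mouse, and human all score about the same, because they are all equally distant from it.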
There’s nothing here for anyone to explain, unless it’s gpuccio and Bill explaining why they consistently fail to understand one of the most basic concepts of evolution.
He is measuring functional information not Shannon information.
All of those concerns were raised to Gpuccio himself when he was here about 6 months ago.
That doesn’t change the fact that what he’s doing is meaningless as an estimate of FI since all he measures is the degree of relatedness. He’s really just finding some other nebulous way to show that as you find increasingly distantly related organisms on the tree of life, you find increasingly dissimilar protein sequences. Calling this an “information jump” is nonsensical.
Based on the measure (human conserved sequences) we are observing a functional information jump closer to the human sequence 400 million years ago. It’s an observation and a viable explanation as the change does not appear to be gradual.
Does it ever cross your mind, Bill, that when people who are experts in fields about which you know absolutely nothing keep saying you are wrong, that you might actually be wrong?
Sure, I might be wrong. I learned from Rum that there is a weakness in the UniProt database. You may have noticed I liked his post. As for the information jump that gpuccio identified, there is no viable counter yet, only assertions.
He identified no “information jump.” It’s make believe. That’s what the people who understand this stuff as part of their jobs have been telling you. For months.
What is the argument that supports this assertion?
So you now realize that Gpuccio’s method is based on the implicit acceptance that nesting hierarchical structure in sequences of similar genes implies relatedness?
If you see similarity-based annotation as a weakness of the databases, then by implication you must also reject Gpuccio’s method of estimating FI (from similar sequences found in those databases) as having zero basis in fact. Hence you have no basis for claiming what the FI of any biological polymer is.
You can’t have it both ways.
I think you are letting your personal bias get the best of you. This is made clear by exaggerated statements like “zero basis in fact.” While your point about database maturity is an excellent one, saying there is no value to the measurement is a stretch.
We have a method and it needs the databases to mature for that method to improve.
No, it’s you who has to decide what you think can be derived based on shared similar sequences. If you don’t think it can be used to infer relatedness, then by implication you can’t do what Gpuccio is doing because that’s the only tool he has.
You’d need direct biochemical evidence for function for all these sequences, and you’re just never going to get that. There are too many species and too many genes. At best you’re going to get expressed protein sequences, which means you still only have a sequence-similarity-based inference of function.
But that is what you are implying when you suddenly and arbitrarily turn around and reject sequence-similarity based methods for inferring relatedness.
So you do accept the inference of relatedness based on nesting hierarchical structure in the sequences of shared similar genes? Then you must deal with the evidence for the simpler ancestries of these genes, or you are holding a hypocritical double standard.
He explained it very clearly.
Sorry, Faizal, but we all have sources of bias here. The claims your side is making are iffy at best.
Direct biochemical evidence would be preferred but the appearance of de novo sequences in vertebrates is not zero evidence.
I am not rejecting relatedness either by common descent or common design as possible inferences.
Let me think about this. The nested structure is positive evidence for common descent but this does not explain the appearance of complex sequences that are mutation resistant.
This is why that graph is meaningless. The sequences were literally preselected to give that “jump.”