I’m sorry, but that latter part doesn’t make sense as a response to what you quoted me as saying. As I was saying, if you are not okay with inferring the same function from sequence similarity alone, then yes, it certainly makes sense for you to want direct biochemical evidence for function.
But that has nothing to do with anything about “de novo sequences in vertebrates”. Who was talking about de novo sequences in vertebrates? Not me. You might have got something mixed up here and it’s not clear what that is. Is it that talk about information jumps?
The “information jump” in vertebrates that Gpuccio attempts to infer does not involve any de novo sequences. He’s deriving that idea from the observation of a large degree of change in existing sequences over some period of time; he’s not saying there’s some new protein sequence that suddenly pops up out of nowhere where before there was none.
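To make the distinction concrete, here is a rough sketch of what “degree of change in an existing sequence” means, as opposed to a sequence appearing from nothing. This is not Gpuccio’s actual procedure. It is just percent identity over a pre-aligned pair, and the sequences and taxon labels are made-up toy data, not real proteins.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, non-gap columns where the two residues match."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (invented for illustration only).
human = "MKTAYIAKQR"
other_vertebrate = "MKTAYIAKQK"
non_vertebrate = "MRSAFLGKEK"

print(percent_identity(human, other_vertebrate))
print(percent_identity(human, non_vertebrate))
```

In both comparisons there is a homologous sequence on each side. What differs is only how much it has changed, which is the point: a big drop in similarity is still a statement about divergence of an existing sequence, not about de novo origin.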
Good, that’s a first step I guess. But it’s also a bit too vague to be meaningful with respect to our argument here. You’re not rejecting it as a possible inference? Okay, but then when do you actually reject it, and why?
Do you reject that the sequences analyzed in this paper on the phylogenetic analysis of the P-loop NTPase superfamily are actually related? And if you do, why?
Leipe DD, Koonin EV, Aravind L. Evolution and classification of P-loop kinases and related proteins. J Mol Biol. 2003 Oct 31;333(4):781-815. DOI: 10.1016/j.jmb.2003.08.040
Abstract
Sequences and structures of all P-loop-fold proteins were compared with the aim of reconstructing the principal events in the evolution of P-loop-containing kinases. It is shown that kinases and some related proteins comprise a monophyletic assemblage within the P-loop NTPase fold. An evolutionary classification of these proteins was developed using standard phylogenetic methods, analysis of shared sequence and structural signatures, and similarity-based clustering. This analysis resulted in the identification of approximately 40 distinct protein families within the P-loop kinase class. Most of these enzymes phosphorylate nucleosides and nucleotides, as well as sugars, coenzyme precursors, adenosine 5’-phosphosulfate and polynucleotides. In addition, the class includes sulfotransferases, amide bond ligases, pyrimidine and dihydrofolate reductases, and several other families of enzymes that have acquired new catalytic capabilities distinct from the ancestral kinase reaction. Our reconstruction of the early history of the P-loop NTPase fold includes the initial split into the common ancestor of the kinase and the GTPase classes, and the common ancestor of ATPases. This was followed by the divergence of the kinases, which primarily phosphorylated nucleoside monophosphates (NMP), but could have had broader specificity. We provide evidence for the presence of at least two to four distinct P-loop kinases, including distinct forms specific for dNMP and rNMP, and related enzymes in the last universal common ancestor of all extant life forms. Subsequent evolution of kinases seems to have been dominated by the emergence of new bacterial and, to a lesser extent, archaeal families. Some of these enzymes retained their kinase activity but evolved new substrate specificities, whereas others acquired new activities, such as sulfate transfer and reduction. 
Eukaryotes appear to have acquired most of their kinases via horizontal gene transfer from Bacteria, partly from the mitochondrial and chloroplast endosymbionts and partly at later stages of evolution. A distinct superfamily of kinases, which we designated DxTN after its sequence signature, appears to have evolved in selfish replicons, such as bacteriophages, and was subsequently widely recruited by eukaryotes for multiple functions related to nucleic acid processing and general metabolism. In the course of this analysis, several previously undetected groups of predicted kinases were identified, including widespread archaeo-eukaryotic and archaeal families. The results could serve as a framework for systematic experimental characterization of new biochemical and biological functions of kinases.
I agree it does not. It is not supposed to. The ultimate origin of a sequence that subsequently diverges into a large superfamily of proteins is not explained merely by the inference that it diversified into that superfamily.
It does, however, allow us to make inferences about what it was that first originated, as in what the ancestral sequence and function were, and how they subsequently changed and evolved into the sequences we see today.