How many possible mutations are we talking about? If we took the human genome as our example, how many beneficial two base mutations are possible?
What dependent mutations are you referring to? What are you suddenly talking about?
Sure, amino acid substitutions that require double DNA mutations would occur at a lower rate than amino acid substitutions that require single DNA mutations. And? How many of those have happened, and is that a problem?
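For anyone who wants to check which amino acid substitutions need one, two, or three nucleotide changes, here is a minimal Python sketch. It assumes the standard genetic code (written in the usual TCAG order) and ignores rate differences such as transitions versus transversions. The names `CODON_TABLE`, `min_nucleotide_changes`, and `tally_pairs` are made up for this illustration.

```python
from collections import Counter
from itertools import combinations, product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codons_for(aa):
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, coded in CODON_TABLE.items() if coded == aa]


def min_nucleotide_changes(aa_from, aa_to):
    """Smallest number of nucleotide differences between any codon pair."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa_from)
        for c2 in codons_for(aa_to)
    )


def tally_pairs():
    """Count unordered amino acid pairs by minimum nucleotide changes needed."""
    amino_acids = sorted(set(AMINO) - {"*"})  # drop the stop symbol
    return Counter(
        min_nucleotide_changes(a, b) for a, b in combinations(amino_acids, 2)
    )


if __name__ == "__main__":
    print("Ala -> Val:", min_nucleotide_changes("A", "V"))
    print("Trp -> Phe:", min_nucleotide_changes("W", "F"))
    print("Pairs by minimum changes:", dict(tally_pairs()))
```

The tally only says how many changes a substitution needs at minimum; it says nothing about how often such substitutions have actually happened or been fixed.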
It is simply what I recommended Gil look at, if human-to-human variations due to mutations were found that appear neutral and are not fixed in the population as a whole.
That makes absolutely no sense.
No. It is clear that you do not understand fixation.
What about selection, Bill?
What about neutral mutations that were fixed tens of millions or hundreds of millions of years ago?
- Most natural functional proteins first have to adopt folded structures in order to be able to perform a function
- Protein folds are rare in sequence space
- In order to perform a function, it is not enough for a protein to adopt folded structures. In fact, only a small subset of folded structures will be able to perform a biological function.
From these 3 premises, it follows that natural functional proteins are rare in sequence space.
Because in order to evaluate FI, you must focus on conservation through deep time, which can only be assessed by analyzing homology across extant species whose lineages split more than 400 million years ago.
No. In order to test my hypothesis, I need to exclude even mutations that have a very small negative effect on fitness, since it is very unlikely that such mutations would have been conserved through deep time.
So 410 My is great, but 390 My is useless?
You’re channeling Nigel Tufnel again. You haven’t even made the smallest effort to show that (or to find out whether) gpuccio’s “FI” has any relationship at all with functional information.
Wrong, because the same allele can have a negative effect on fitness on one background and a positive effect on a different background.
Do you know what epistasis is? If so, why are you pretending it doesn’t exist?
Your hypothesis said nothing about which variants (not mutations) you’d consider. Do you not understand that you need to bake all of your assumptions into the hypothesis before you see the data, so you don’t cheat as you’re doing now?
If a given position of a given protein exhibits conservation through deep time, it means that a mutation at that position would very likely have a negative fitness effect on all the considered backgrounds.
That would be a hypothesis, because you haven’t bothered to look yet. Shall we test it?
BTW, they are called “residues,” not “positions.”
And you’ve denied the existence of epistasis.
All conservation can show you is how many changes can occur from one single starting point. Sequence conservation can’t tell you how many starting points there are.
Furthermore, ID supporters keep asking for examples of evolution producing high levels of FI in living populations. Obviously, this is a dishonest request, since they measure FI by using sequence conservation over hundreds of millions of years. In fact, if an ID supporter had been there hundreds of millions of years ago for the evolution of the ancestral sequence of what is now a highly conserved protein, they would claim that it had very low FI when it first evolved.
Most natural proteins adopt folded structures to perform their functions, sure. That doesn’t mean that a functional protein has to adopt a fold to function, nor that functional proteins are rare in sequence space.
You are again confusing functions with folds, and proteins in their extant forms with their ancestral states.
Are they? How rare is “rare”?
How do you know that? Have you tested some fold for all possible functions, under all possible physical conditions?
There are at least two problems with this. First of all, those premises merely beg the question. Second is your apparent, continued misconception that many of the functional proteins we see in extant life must have just popped into existence in their present form de novo, rather than evolving incrementally from simpler precursors, or by rearrangement and fusion of other protein fragments. You are ignoring phylogenetic evidence.
Well no, it’s worse than that. When the protein first evolved they would ignore it, wait 400 million years, and THEN claim it has high FI and that it couldn’t evolve.
But the amount of divergence the protein would have to undergo over those 400 million years, to have a large impact on the FI calculation, is effectively physically impossible. We would have to find more variants of the protein, having become fixed or at least detected in some lineage, than there are numbers of individual cells that have ever lived.
It is not physically possible for sequence divergence over 400 million years to even result in so many protein variants as would be required to bring the FI down from some value like 700, or 1000, to below the arbitrarily set threshold of 500 bits FI. As I have argued at length. Even if we don’t require they become fixed. Just to have existed for some fleeting moment.
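The scale involved can be checked with a few lines of arithmetic. The sketch below assumes FI is computed as -log2(functional sequences / all sequences), so lowering FI by a given number of bits means multiplying the count of known functional variants by 2 raised to that number. `log10_fold_increase_needed` is a hypothetical helper name, not part of anyone's published method.

```python
import math


def log10_fold_increase_needed(fi_start_bits, fi_target_bits):
    """Return log10 of the factor by which the number of known functional
    variants must grow to lower FI from fi_start_bits to fi_target_bits.

    Assumes FI = -log2(N_functional / N_total), so each bit removed
    requires doubling N_functional.
    """
    bits_to_remove = fi_start_bits - fi_target_bits
    return bits_to_remove * math.log10(2)


if __name__ == "__main__":
    for start in (700, 1000):
        exponent = log10_fold_increase_needed(start, 500)
        print(f"{start} -> 500 bits: about 10^{exponent:.1f}-fold more variants")
```

Under that assumption, getting from 700 bits to 500 takes roughly a 10^60-fold increase in known functional variants, and getting from 1000 bits to 500 takes roughly a 10^150-fold increase.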
Suppose we have a 300 amino acid protein and we only know of one variant of it. That’s 1297 bits of FI for that one variant. We then discover that every single cell that has ever existed in the history of life made its very own mutant version of that protein, so we have that much variation known. Let’s be extremely charitable and say as many as 10^50 cells have lived during the history of life (so one novel variant per cell, i.e. 10^50 different known variants of the protein). What does that bring the FI down to? 1130 bits.
With 10^50 different variants, we go from 1297 to 1130 bits. Let’s exaggerate to the extreme and propose that every single one of those 10^50 cells that ever lived, all produced one thousand different versions of it each, all of them different. So each of those 10^50 cells produced one thousand variants of the protein. What does that do to those 1130 bits? They become -log2(10^53/20^300) = 1120 bits FI.
Even if every single cell that ever lived produced one thousand of its very own sequence variants (detectable as homologous based on similarity) of the protein in question, that would not result in enough variation to make much of a dent in the FI calculation at those scales.
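The arithmetic above is easy to reproduce. This is a minimal sketch, assuming FI = -log2(N / 20^L) for N known functional variants of an L-residue protein. The function name, and the choice to pass N as a power of ten to avoid handling huge integers, are illustrative only.

```python
import math


def functional_information_bits(log10_variants, length, alphabet_size=20):
    """FI in bits, assuming FI = -log2(N / alphabet_size**length).

    log10_variants is log10 of N, the number of known functional variants
    (0 means a single known variant).
    """
    total_bits = length * math.log2(alphabet_size)    # log2 of sequence space
    variant_bits = log10_variants * math.log2(10)     # log2 of N
    return total_bits - variant_bits


if __name__ == "__main__":
    scenarios = {
        "one known variant": 0,
        "10^50 variants (one per cell)": 50,
        "10^53 variants (a thousand per cell)": 53,
    }
    for label, exponent in scenarios.items():
        bits = functional_information_bits(exponent, length=300)
        print(f"{label}: {bits:.1f} bits")
```

For a 300-residue protein this gives about 1297, 1130, and 1120.5 bits for the three scenarios, all far above 500.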
The estimation of FI based on extrapolation from alignments of homologous sequences appears to have been carefully designed to represent an impossible hurdle to overcome. It is not physically possible for all of life to have generated the amount of variation IDcreationists are demanding to be shown. Which means the very method simply begs the question against evolution.
Even if there really were so many different possible functional variants of the protein, there’s no physical way life could diverge and produce them during its entire history, even under completely unrealistically generous assumptions. The whole thing is fatuous in the extreme.
Gil’s claim is incoherent, because the function of many, many proteins is literally to change structure.
This claim from Gil also is incoherent. “Folds” are particular structures. Virtually all proteins, even randomly synthesized ones, fold, because some residues are hydrophilic and some hydrophobic.
I know I keep saying this, but it’s even worse! Many natural functional proteins have intrinsic disorder, yet are functional.
Lieutaud P, Ferron F, Uversky AV, Kurgan L, Uversky VN, Longhi S. How disordered is my protein and what is its disorder for? A guide through the “dark side” of the protein universe. Intrinsically Disord Proteins. 2016 Dec 21;4(1):e1259708. DOI: 10.1080/21690707.2016.1259708
In protein science, the existence of intrinsic disorder in proteins has been known for a long time. This is in spite of the fact that it contradicts the classical protein sequence-structure-function paradigm where the “lock-and-key” model is used to explain how a protein can achieve its biological function via folding into a unique, highly structured state determined by its amino acid sequence [14]. IDPs and IDPRs constitute a part of the “dark proteome” that includes entire proteins or protein regions for which the molecular conformation is entirely unknown [15]. Traditional ordered proteins have a relatively stable 3-D structure possess Ramachandran angles that vary only slightly around their equilibrium positions with occasional cooperative conformational switches. On the other hand, IDPs/IDPRs, despite being biologically active, fail to form specific 3D structures and exist as highly dynamic structural ensembles, either at the secondary or at the tertiary level [5,6,16-21]. Furthermore, intrinsic disorder is characterized by high structural heterogeneity.
In fact, it is now recognized that IDPs/IDPRs may contain collapsed disorder (where the intrinsic disorder is present in a molten globular form) and extended disorder (where intrinsic disorder is present in a form of random coil or pre-molten globule) under physiological conditions in vitro [5,20,22]. It has also been shown that, in addition to completely ordered and disordered regions, proteins may contain regions of semi-disorder; i.e., fragments that have ∼50% predicted probability to be ordered or disordered [23]. Such semi-disordered regions have been shown to play key roles in protein aggregation, and to participate in protein-protein interactions involving induced folding [23]. The currently available structural data has been used to suggest that the heterogeneous spatiotemporal structure of IDPs/IDPRs can be described as a set of foldons, inducible foldons, semi-foldons, non-foldons, and unfoldons [21,24]. The discovery of IDPs and IDPRs, which would not have been possible without bioinformatics, has drastically expanded the understanding of protein functionality, and exposed new and unexpected roles of dynamics, plasticity, and flexibility in the context of protein functions.
While there are IDPs/IDPRs that are able to perform their function while remaining completely disordered (e.g. entropic chains), many such proteins and regions experience a disorder-to-order transition after binding to their physiological partner(s), known as “induced folding” [65]. The functional relevance of disorder is the result of increased plasticity which allows for binding numerous and structurally distinct targets. Consequently, intrinsic disorder is a common and distinctive feature of “hub” proteins, with disorder acting as a measure of protein promiscuity [66]. As such, the majority of IDPs are involved in functions that involve multiple partner interactions, such as molecular assembly, molecular recognition, signal transduction and transcription, and cell cycle regulation [67].
Hilariously, natural proteins are on average MORE disordered than random protein sequences:
Yu JF, Cao Z, Yang Y, Wang CL, Su ZD, Zhao YW, Wang JH, Zhou Y. Natural protein sequences are more intrinsically disordered than random sequences. Cell Mol Life Sci. 2016 Aug;73(15):2949-57. DOI: 10.1007/s00018-016-2138-9
Of course! They are not as frequent as the ones that change conformation, though.
That would fit with much of biology involving transitions between METAstable structures. The prion protein is much more ordered and stable in the prion conformation than it is in its functional conformation and also quite deadly!
I have looked at 150 human substitutions in the database for the MYH7 protein, and all but one required only a single nucleotide substitution for the missense mutation. Most of these were associated with a health problem.
Most of the people harboring most of these variants are perfectly healthy people because of epistasis, so they have to be counted as working sequences.
If you’re not desperately trying to cheat. You wouldn’t do that, would you, Bill?
You don’t have a clue about epistasis, do you, Bill?
Very rare. You can see this in table 1 of the paper I referred to at 102. Here it is:
The rarity in the sequence space of the 10 protein folds investigated by the authors is given in column 4 under the label SC*.
It would be interesting to have a list of, say, 20 to 30 neutral missense SNPs for the human MYH7 protein, the prediction being, according to gpuccio’s reasoning, that the majority of these SNPs will land at positions that do not exhibit conservation through deep time. The problem is that I don’t have the resources to compile such a list.