Here is the UniProt list of human MYH7 variants. There are about 200 of them, and they are listed further down the page. The associated disease can be found by clicking the publication tab. If you google "genetic code" you will find a letter-to-amino-acid conversion table.
But is not the subset of folded structures able to perform biological functions subject to continuous change and adaptation to new functions? They are not necessarily fragile, exact entities that become useless at the slightest change. For instance, modifications to photopsins can produce useful differences in sensitivity to different light frequencies, and a case of a woman who exhibited functional tetrachromacy has been documented. A billion years or so of tweaking protein folding by the earth’s biosphere can come up with a lot of useful proteins. In fact, this seems to be going on all the time in biology, especially with the never-ending dance of infectious agents and organism membrane defenses, even apart from the immune system.
Most proteins have to adopt folded structures in order to perform a complex function. But not all proteins able to adopt folded structures will be able to perform a complex function. IOW, for most proteins, being able to adopt folded structures is a necessary condition for performing a complex function, but it is not sufficient. Now, once a protein has emerged that is able to perform a complex function, it will be very, very difficult for it to perform another complex function. What you may observe of course is some tweaking of the existing function, aka micro evolution. But that’s all.
Proteins excited by different light frequencies are variations within the opsin protein family, providing different spectral responses in organisms ranging from fish to hummingbirds, snakes, and humans. If fish to birds is an example of micro evolution, then I’m OK with it.
It’s not clear what you mean by a “complex function”, much less what “another” is supposed to mean exactly. If a protein binds one spot, or a DNA sequence, and then mutates to bind another, or act on another non-identical molecule, is it then enough to constitute “another” function in your view?
How different must one function A be from another function B before it counts, in your subjective opinion, as “another”, “different”, “macro” or whatever?
First of all, that’s an incorrect use of the term microevolution, which refers to evolution below the species level. It doesn’t really apply to protein evolution, since proteins can change function whether the change happens within a single species or occurs concomitant with species-level transitions and diversifications.
The second issue is that what you’re saying is flat-out wrong. Changes in protein function have been demonstrated at all levels of functional classification: structural/binding proteins changing into enzymes, changes between different enzymatic functions, and enzymes that break molecules apart changing into enzymes that assemble them.
For example you can take a look at this paper that gives a nice overview of the kinds of functional changes associated with particular structurally defined enzyme superfamilies:
Furnham N, Sillitoe I, Holliday GL, Cuff AL, Laskowski RA, Orengo CA, Thornton JM. Exploring the evolution of novel enzyme functions within structurally defined protein superfamilies. PLoS Comput Biol. 2012;8(3):e1002403. DOI: 10.1371/journal.pcbi.1002403
In order to understand the evolution of enzyme reactions and to gain an overview of biological catalysis we have combined sequence and structural data to generate phylogenetic trees in an analysis of 276 structurally defined enzyme superfamilies, and used these to study how enzyme functions have evolved. We describe in detail the analysis of two superfamilies to illustrate different paradigms of enzyme evolution. Gathering together data from all the superfamilies supports and develops the observation that they have all evolved to act on a diverse set of substrates, whilst the evolution of new chemistry is much less common. Despite that, by bringing together so much data, we can provide a comprehensive overview of the most common and rare types of changes in function. Our analysis demonstrates on a larger scale than previously studied, that modifications in overall chemistry still occur, with all possible changes at the primary level of the Enzyme Commission (E.C.) classification observed to a greater or lesser extent. The phylogenetic trees map out the evolutionary route taken within a superfamily, as well as all the possible changes within a superfamily. This has been used to generate a matrix of observed exchanges from one enzyme function to another, revealing the scale and nature of enzyme evolution and that some types of exchanges between and within E.C. classes are more prevalent than others. Surprisingly a large proportion (71%) of all known enzyme functions are performed by this relatively small set of 276 superfamilies. This reinforces the hypothesis that relatively few ancient enzymatic domain superfamilies were progenitors for most of the chemistry required for life.
Now if that doesn’t ram a freight-train through the idea that large-scale functional shifts can’t occur in protein evolution, or that it’s all just “microevolution”, then I don’t know what does.
Of course, they’re looking exclusively at enzyme functions here and have not characterized functional shifts between enzymes and non-enzymes (proteins that have functions other than catalyzing chemical reactions).
E.C. numbers attributed to these 276 superfamilies (including relatives where the domain is in different MDA contexts) account for 71% of the 2,676 E.C. numbers assigned to known enzymes, with the E.C. numbers associated with single domain enzymes accounting for approximately 36%. The high coverage of enzyme functionality from just 276 superfamilies, given that this represents only 15% of known domains, is surprising. Moreover, just 45 superfamilies account for 50%, of all sequences assigned E.C. numbers with 31 superfamilies in which the single domain accounts for 25%.
From this we can postulate that a limited repertoire of structural frameworks has evolved to carry out a large proportion of reactions required for all of life. Moreover, it is clear that generating new chemistry does not necessarily require large leaps, such as the evolution of novel protein structures or large structural re-arrangements, but can be made by small local changes e.g. residue substitutions or small insertions or deletions. Functional changes can also arise from changes in MDA and less frequently insertion/deletion of unstructured regions. This is perhaps not surprising since residue changes in the active site can easily induce changes in chemistry. Superfamilies supporting a wide range of enzyme functions predominantly adopt one of a few relatively highly populated superfamilies, such as the TIM barrel or Rossmann-like fold, which both possess large surface clefts likely to tolerate residue mutations.
We also observe that the addition of another domain or set of domains can bring a function associated solely with those domains and not with the superfamily domain (see Figure S9 and S10) i.e. acquisition of function by domain addition. These domains can bring confusion as to where the function is originating and the role (if any) that the superfamily domain under scrutiny contributes to that function. The contribution of these additional domains to the functional repertoire of a superfamily has been taken into account.
The Thornton lab has shown, using ancestral sequence reconstruction, how an ancient enzyme radically altered its function into a sort of structural protein that contributes to controlling the spatial direction of cell division in the tissues of multicellular organisms:
Anderson DP, Whitney DS, Hanson-Smith V, Woznica A, Campodonico-Burnett W, Volkman BF, King N, Thornton JW, Prehoda KE. Evolution of an ancient protein function involved in organized multicellularity in animals. Elife. 2016 Jan 7;5:e10147. DOI: 10.7554/eLife.10147
To form and maintain organized tissues, multicellular organisms orient their mitotic spindles relative to neighboring cells. A molecular complex scaffolded by the GK protein-interaction domain (GKPID) mediates spindle orientation in diverse animal taxa by linking microtubule motor proteins to a marker protein on the cell cortex localized by external cues. Here we illuminate how this complex evolved and commandeered control of spindle orientation from a more ancient mechanism. The complex was assembled through a series of molecular exploitation events, one of which - the evolution of GKPID’s capacity to bind the cortical marker protein - can be recapitulated by reintroducing a single historical substitution into the reconstructed ancestral GKPID. This change revealed and repurposed an ancient molecular surface that previously had a radically different function. We show how the physical simplicity of this binding interface enabled the evolution of a new protein function now essential to the biological complexity of many animals.
There’s a pretty good summary here:
For billions of years, life on Earth was made up of single cells. In the lineage that led to animals – and independently in those that led to plants and to fungi – multicellular organisms evolved as cells began to specialize and arrange themselves into tissues and organs. Although the evolution of multicellularity is one of the most important events in the history of animal life, very little is known about the molecular mechanisms by which it took place.
To form and maintain organized tissues, cells must coordinate how they divide relative to the position of their neighbours. One important aspect of this process is orientation of the mitotic spindle, a structure inside the dividing cell that distributes the chromosomes —and the genetic material they carry — between the daughter cells. When the spindle is not oriented properly, malformed tissues and cancer can result. In a diverse range of animals, the orientation of the spindle is controlled by an ancient scaffolding protein that links the spindle to “marker” proteins on the edge of the cell.
Anderson et al. have now used a technique called ancestral protein reconstruction to investigate how this molecular complex evolved its ability to position the spindle. First, the amino acid sequences of the scaffolding protein’s ancient progenitors, which existed before the origin of the most primitive animals on Earth, were determined. Anderson et al. did this by computationally retracing the evolution of large numbers of present-day scaffolding protein sequences down the tree of life, into the deep past. Living cells were then made to produce the ancient proteins, allowing their properties to be experimentally examined.
By experimentally dissecting successive ancestral versions of the scaffolding protein, Anderson et al. deduced how the molecular complex that it anchors came to control spindle orientation. This new ability evolved by a number of “molecular exploitation” events, which repurposed parts of the protein for new roles. The progenitor of the scaffolding protein was actually an enzyme, but the evolution of its spindle-orienting ability can be recapitulated by introducing a single amino acid change that happened many hundreds of millions of years ago.
How could a single mutation have conferred such a dramatically new function? Anderson et al. found that the ancient scaffolding protein uses the same part of its surface to bind to the spindle-orienting molecular marker as the ancient enzyme used to bind to its target substrate molecule, and the two partner molecules happen to share certain key chemical properties. This fortuitous resemblance between two unrelated molecules thus set the stage for the simple evolution of a function that is now essential to the complexity of multicellular animals.
The genetic simplicity of the evolutionary change in GKPID function is underscored by the fact that we found not one but two historical amino acid replacements from the relevant phylogenetic interval, either of which is sufficient to confer the GKPID’s derived functions on the ancestral enzyme. This finding indicates that GK acquired its new protein-binding function through a relatively simple, high-probability genetic path, rather than a long trajectory that required many specific mutations before the new function could be established.
GKPID’s dramatic evolutionary transition in function could take place through such a simple genetic mechanism because of its biophysical architecture. The gk enzyme’s simple binding site for GMP can also be occupied by a simple two-residue motif on the Pins peptide, which fortuitously has similar surface properties. In addition, a series of small hydrophobic patches, which happen to be adjacent, was available to bind the hydrophobic portion of the Pins peptide and increase affinity. All that was required to confer the protein’s new function was a single mutation that revealed this molecular surface, apparently by changing the protein’s conformational flexibility. In this way, the physical simplicity of an interaction between ancient molecules set the stage for the easy evolution of a novel molecular complex and, in turn, a cellular function that now plays an important role in the complex biology of multicellular animals.
Opsins of course belong to a larger family of proteins called G-protein coupled receptors (GPCRs), from which all opsins ultimately derive, and which have seen numerous rather large-scale functional shifts during the history of life.
Isn’t it interesting to consider that the physical senses touch, smell, taste, and sight, are all evolutionarily related at the molecular level? They all employ GPC receptors as part of the extracellular sensory mechanism.
So do I understand you correctly here, to be saying that even if we could somehow assemble a substantial library of random proteins that nevertheless fold into some structure, we’d still be very unlikely to find a biologically useful function?
In other words, that not only are folding proteins in general rare among all protein sequences, but even among the minority of protein sequences that do fold, biologically useful functions are rare too?
How did you determine that by looking at just the human sequences? You would need to look at all lineages that have the homologous protein and reconstruct ancestral sequences to determine how many mutations have accumulated in the lineage leading to humans.
From what I can see of MYH7, this protein is shared with fungi, so you would need to go to the common ancestor of fungi and animals, and then start tracking mutations through the entire phylogeny leading to humans.
I think he’s just trying to determine the minimum number of DNA base changes required to produce the observed amino acid substitutions. Say one protein in species 1 has the amino acid K, and another species has L at the same position (or it could be a difference between variants in the same species): how many DNA base substitutions would that require at minimum? It would require two nucleotide substitutions, because K is encoded by AAA and AAG, while L is encoded by UUA, UUG, and CUN, so there is no possible one-nucleotide substitution that could produce the amino acid substitution K<->L.
I think it is reasonable to say that, on average, we expect a preponderance of amino acid substitutions that only require a single base change over those that require two.
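For anyone who wants to check this kind of count themselves, here is a minimal Python sketch of the K<->L reasoning. It assumes the standard genetic code written in RNA letters; the function names (`codons_for`, `min_base_changes`, `tally_pairs`) are made up for illustration and don't come from any library.

```python
from collections import Counter
from itertools import combinations, product

# Standard genetic code, codons enumerated in U, C, A, G order at each position.
# "*" marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(aa):
    """Return every codon that encodes the one-letter amino acid `aa`."""
    return [codon for codon, encoded in CODON_TABLE.items() if encoded == aa]


def min_base_changes(aa1, aa2):
    """Smallest number of single-base substitutions that turns some codon
    for `aa1` into some codon for `aa2` (0 if they are the same residue)."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa1)
        for c2 in codons_for(aa2)
    )


def tally_pairs():
    """Count how many of the amino acid pairs need 1, 2, or 3 base changes."""
    residues = sorted(set(AMINO_ACIDS) - {"*"})
    return Counter(min_base_changes(a, b) for a, b in combinations(residues, 2))


print(min_base_changes("K", "L"))  # 2, matching the K<->L example above
print(tally_pairs())
```

Note that this only gives the minimum number of base changes between two residues. It says nothing about which changes actually happened in a lineage, which is where the phylogenetic reconstruction mentioned earlier comes in.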
Okay. But I guess you are not aware, then, that scientists have tested that in multiple concrete experiments? They have generated large libraries of variants of folding proteins, in total ignorance of whether they will have any useful functions, put them into biological organisms, and tested to see how they work.
Here’s a couple of papers on such experiments, where the proteins were tested for beneficial in vivo functions in living cells: Digianantonio KM, Korolev M, Hecht MH. A Non-natural Protein Rescues Cells Deleted for a Key Enzyme in Central Metabolism. ACS Synth Biol. 2017 Apr 21;6(4):694-700. DOI: 10.1021/acssynbio.6b00336
Digianantonio KM, Hecht MH. A protein constructed de novo enables cell growth by altering gene regulation. Proc Natl Acad Sci U S A. 2016 Mar 1;113(9):2400-5. DOI: 10.1073/pnas.1600566113
These papers are interesting because they highlight the unpredictable nature of evolution. Key genes involved in complex biological functions inside living cells are deleted, and large libraries of random but folding proteins are screened to see if any of these proteins are able to functionally take the place of the deleted genes.
You might naively predict that, in so far as they are able to, they do it by taking over and performing the function of the deleted genes. But it turns out instead that they function in entirely different ways. The proteins being tested turn out to function as transcription initiators (which means they have to interact directly with DNA, and/or bind other transcription factors, for example by suppressing other inhibitors, and/or bind to RNA polymerase) that upregulate expression of certain metabolic enzymes with low-level promiscuous side-reactions that they are normally not selected to perform. So this in turn proves two things:
First, it proves that already existing proteins are oftentimes functionally promiscuous, which means one protein can have many functions it has not evolved or been selected to perform, but which it can nevertheless perform at some low level. Under the right conditions those functions can become adaptive, which would then provide the basis for further enhancement of those functions by selection.
Second, it proves that among folding proteins in general, it cannot be the case that biologically useful functions are too unlikely to evolve.
There are many other such experiments, where naively designed proteins, made only to be able to fold, have been tested for in vitro functions, such as small molecule binding and enzymatic activities. Here’s a couple of those: Cherny I, Korolev M, Koehler AN, Hecht MH. Proteins from an unevolved library of de novo designed sequences bind a range of small molecules. ACS Synth Biol. DOI: 10.1021/sb200018e
Patel SC, Bradley LH, Jinadasa SP, Hecht MH. Cofactor binding and enzymatic activity in an unevolved superfamily of de novo designed 4-helix bundle proteins. Protein Sci. 2009 Jul;18(7):1388-400. DOI: 10.1002/pro.147
These papers of course demonstrate what was already implied above by the fact that native proteins in living organisms usually have many low-level side-functions (both enzymatic and otherwise) they haven’t evolved to perform, yet which nevertheless natively exist as a capacity of their sequence and structure. This also explains why so much of protein evolution has happened through gene duplication, as many proteins with multiple native functions have been duplicated and repurposed to enhance those functions as they became beneficial under the right circumstances.
There are also papers on random protein sequences that have not been explicitly designed to fold, being tested for adaptive biological functions in real living organisms. Such as this one: Knopp M, Gudmundsdottir JS, Nilsson T, König F, Warsi O, Rajer F, Ädelroth P, Andersson DI. De Novo Emergence of Peptides That Confer Antibiotic Resistance. mBio. 2019 Jun 4;10(3). pii: e00837-19. DOI: 10.1128/mBio.00837-19
This last one is interesting because they screen a library of a few hundred million small peptides, some of which are too small to yield actual protein folds (thus proving that you don’t need folds for biologically useful functions) and only really form secondary structures, such as single sheets or helices. They find multiple small proteins in the 22-25 amino acid range that function as membrane channels. These membrane channels depolarize the bacterial membrane, which reduces antibiotic uptake, leading to an almost 50-fold increase in the amount of antibiotic tolerated by the organism.
This of course also proves that a protein doesn’t need to fold to be functional, and that biologically relevant and useful functions can’t be too rare to evolve.