Another intriguing de novo gene evolution paper was recently published in Nature Communications:
Mutations that cause changes to the sequence or expression of established genes are typically constrained by preexisting selected effects, i.e. the specific physiological processes mediated by the gene products that are maintained by natural selection [20]. In contrast, emerging proto-genes are expected to mostly lack such constraints because they do not have selected effects. This would leave them more readily accessible to evolutionary changes that have the potential to increase fitness (adaptive changes) [3,4]. We reasoned that this initial potential for adaptive changes would give way as proto-genes mature and the adaptive changes engender novel selected effects, in turn increasing constraints and reducing the possibility of future change. This reasoning is akin to Sartre’s “existence precedes essence” dictum [21], and predicts that mutations affecting the sequence or regulation of proto-genes should impact fitness differently than mutations affecting the sequence or regulation of established genes. Specifically, proto-genes are predicted to evolve under weaker constraints, and thereby to display a higher potential for adaptive change, than established genes.
We find that most emerging ORFs can be disrupted without detectable fitness cost, consistent with a lack of selected effect. Approximately 10% of emerging ORFs show beneficial fitness effects when overexpressed, a 3-fold enrichment relative to established ORFs consistent with a higher potential for adaptive change. In emerging but not established ORFs, beneficial fitness effects are associated with a high propensity to encode transmembrane (TM) domains. Analyses of genome-wide TM propensities led us to hypothesize that novel adaptive TM peptides may spontaneously emerge when thymine-rich non-genic regions become translated: a “TM-first” model of gene birth. The plausibility of this model is supported by a detailed reconstruction of the evolutionary history of one locus where an ORF (YBR196C-A) emerged de novo in a thymine-rich ancestral non-genic region, accumulated substantial changes under positive selection and progressively increased its TM propensity to give rise to a protein that integrates into the membrane of the endoplasmic reticulum (ER) while retaining the potential for adaptive change. Overall, our results support an experiential model for de novo gene birth whereby a fraction of incipient proto-genes can subsequently mature and, as adaptive changes engender novel selected effects, progressively become established in genomes in a species-specific manner.
Our analyses suggest that a simple thymine bias suffices to generate a diverse reservoir of novel TM peptides (Fig. 5a–c), and that incipient proto-genes with TM domains are more likely to increase fitness than proto-genes without TM domains (Fig. 4). This could account for the observation that young ORFs have high TM propensities across multiple yeast species [3,40]. Beyond yeast, putative de novo genes with TM domains have also been characterized [51,52,53,54]. Furthermore, evidence suggests that the fitness-enhancing capacities of small TM proteins might extend to bacteria as well as to mouse [18,55,56,57]. Finally, unannotated TM sequences may also be pervasively translated in bacteria, insects and mammals [58,59,60]. The TM-first model could therefore represent a prevalent route of molecular innovation across phyla. The membrane environment might provide a natural niche for novel TM peptides, shielding them from degradation by the proteasome, and allowing subsequent evolution of specific local interactions while reducing the potential for deleterious promiscuous interactions throughout the cytoplasm. The TM domains may be lost in the subsequent stages of de novo gene evolution. We hope that future studies will quantify the prevalence of the TM-first mechanism and investigate the many exciting questions it raises.
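To get a feel for why a thymine bias alone might push random sequence toward TM-like peptides, here is a rough sketch of my own, not the authors' analysis. It draws random codons at different thymine fractions, translates them with the standard genetic code, and reports the mean Kyte-Doolittle hydropathy of the resulting residues. Hydropathy is only a crude stand-in for TM propensity, and the function names, sample size, and thymine fractions are all assumptions made for illustration.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1 layout).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy scale: higher values are more hydrophobic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def random_codons(t_fraction, n_codons, rng):
    """Draw codons from i.i.d. bases; T has probability t_fraction,
    and C, A, G share the remainder equally (an illustrative assumption)."""
    other = (1.0 - t_fraction) / 3.0
    weights = [t_fraction, other, other, other]  # matches BASES order T, C, A, G
    seq = rng.choices(BASES, weights=weights, k=3 * n_codons)
    return ["".join(seq[i:i + 3]) for i in range(0, len(seq), 3)]


def mean_hydropathy(t_fraction, n_codons=20000, seed=0):
    """Mean Kyte-Doolittle score of the translated residues, skipping stops."""
    rng = random.Random(seed)
    scores = [
        KD[CODON_TABLE[codon]]
        for codon in random_codons(t_fraction, n_codons, rng)
        if CODON_TABLE[codon] != "*"
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for t in (0.10, 0.25, 0.40, 0.50):
        print(f"T fraction {t:.2f}: mean hydropathy {mean_hydropathy(t):+.2f}")
```

The trend comes from the structure of the genetic code: every codon with T in the second position encodes Phe, Leu, Ile, Met or Val, so enriching for T enriches for hydrophobic residues. A real analysis would use a dedicated TM predictor and genomic base composition rather than this toy model.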
The paper is open access, so check it out!