Transcriptome studies reveal pervasive transcription of complex genomes, such as those of mammals. Despite popular arguments for functionality of most, if not all, of these transcripts, genome-wide analysis of selective constraints indicates that most of the produced RNA are junk. However, junk is not garbage. On the contrary, junk transcripts provide the raw material for the evolution of diverse long non-coding (lnc) RNAs by non-adaptive mechanisms, such as constructive neutral evolution. The generation of many novel functional entities, such as lncRNAs, that fuels organismal complexity does not seem to be driven by strong positive selection. Rather, the weak selection regime that dominates the evolution of most multicellular eukaryotes provides ample material for functional innovation with relatively little adaptation involved.
A second important implication of the weak selection regime in eukaryotic genomes that counters the above trend against the emergence of lncRNAs, is the continuous emergence of biochemically active but non-functional entities. The barrage of mutations, which the genome experiences, constantly generates short motifs that have biochemical activity, including transcription factor binding and recruitment of RNA polymerase, resulting in cryptic transcriptional start sites. This is not surprising given that transcription factor binding sites are typically very short (Stewart et al., 2012), and many random pieces of DNA can activate transcription (Gerber et al., 2013; Gosselin et al., 2016; Reinke et al., 2008; White et al., 2013). Under the weak selection regime, these cryptic transcriptional start sites will only be eliminated by purifying selection if their associated transcript has a negative fitness effect above the drift barrier. The potential negative effects of such transcription start sites are drastically diminished by quality control mechanisms that degrade spurious RNAs or at least prevent their efficient translation into proteins (see below). Thus, one would expect that genomes of multicellular eukaryotes, which evolve under weak selection, would inevitably produce numerous, low abundance non-coding RNAs that exert small (positive and negative) fitness effects. Spurious “genes” producing non-specific transcriptional noise are expected to be incessantly created and destroyed by neutral evolution. Thus, the evolutionary dynamics in complex eukaryotes necessarily produces a genome teeming with ever changing transcriptional noise.
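For anyone who wants to see what "above the drift barrier" means numerically, here is a minimal sketch using Kimura's diffusion approximation for the fixation probability of a new mutation in a diploid population. It is not from the paper: the function name, the assumption that census and effective population size are equal, and the example values of `s` and `n_e` are all illustrative choices.

```python
import math


def fixation_probability(s: float, n_e: float) -> float:
    """Kimura's diffusion approximation for a single new mutation.

    s   : selection coefficient (negative means deleterious)
    n_e : effective population size (ASSUMED equal to census size here)
    """
    if s == 0:
        # Neutral case: fixation probability is the initial frequency 1/(2N).
        return 1.0 / (2.0 * n_e)
    exponent = -4.0 * n_e * s
    if exponent > 700.0:
        # Strongly deleterious relative to drift: the probability is far
        # below floating-point range, so report it as zero.
        return 0.0
    # (1 - exp(-2s)) / (1 - exp(-4*Ne*s)), written with expm1 for accuracy
    # when s is tiny.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


if __name__ == "__main__":
    s = -1e-5  # hypothetical, slightly deleterious cryptic start site
    for n_e in (1e4, 1e7):  # illustrative small and large populations
        relative = fixation_probability(s, n_e) / fixation_probability(0.0, n_e)
        print(f"Ne = {n_e:.0e}: fixation probability relative to neutral = {relative:.3g}")
```

With the same small negative `s`, the mutation fixes at a rate not far below the neutral one in the smaller population and essentially never in the larger one. That is the sense in which a weak selection regime lets slightly deleterious transcription start sites persist.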
Another concept that is commonly glossed over is that the production of a low level of junk RNA is fully compatible with our current understanding of biochemistry. All enzymes as well as regulatory proteins possess a degree of promiscuity and can bind to, and act on, sub-optimal substrates (Copley, 2020; Tawfik, 2020). Thus, transcription factors, which typically recognize short degenerate DNA motifs, will bind not only to gene-regulatory regions, but also to many additional non-functional sites in the genome (Paris et al., 2013; Reilly and Noonan, 2016; Villar et al., 2014; Wong et al., 2015).
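A back-of-the-envelope sketch of why short degenerate motifs turn up all over a large genome by chance alone. The assumptions are mine, not the paper's: bases are equally frequent and independent, palindromic motifs are not treated specially, chromatin accessibility is ignored, and the genome length is a round figure of roughly human size. The function name is hypothetical.

```python
def expected_chance_sites(allowed_bases, genome_length, both_strands=True):
    """Expected number of chance matches to a motif in random DNA.

    allowed_bases : one integer per motif position giving how many of the
                    four bases are tolerated there (1 = fully specified,
                    4 = any base).
    genome_length : number of base pairs scanned.
    """
    p_match = 1.0
    for n_allowed in allowed_bases:
        if not 1 <= n_allowed <= 4:
            raise ValueError("each position must allow between 1 and 4 bases")
        # Probability that a random base satisfies this position.
        p_match *= n_allowed / 4.0
    # Number of windows of motif length that fit in the sequence.
    start_positions = max(genome_length - len(allowed_bases) + 1, 0)
    strands = 2 if both_strands else 1
    return strands * start_positions * p_match


if __name__ == "__main__":
    GENOME_LENGTH = 3_000_000_000  # ASSUMED round figure, roughly human-sized
    # Fully specified 6-bp motif.
    print(expected_chance_sites([1] * 6, GENOME_LENGTH))
    # 8-bp motif with one two-fold degenerate and one fully degenerate position.
    print(expected_chance_sites([1, 1, 2, 4, 1, 1, 1, 1], GENOME_LENGTH))
```

Under these assumptions a fully specified 6-bp motif comes out on the order of a million chance matches, which is why binding at many non-functional sites is the expectation rather than a surprise.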
There is so much good stuff in this paper it’s hard to pick something to highlight over something else.
We’ve talked about the limits of seeing DNA as computer code, and here is one key point. DNA tolerates quite a bit of noise, but computer programs don’t.
Moreover, the noise being described here is a thicket of complexity, like a messy room or a cluttered desk.
Agreed. The computer code analogies completely break down when it comes to describing the physics of interacting molecules. For example, there is nothing in computer code that is analogous to the relationship between GC or AT hydrogen bonding and binding affinity between complementary antiparallel strands of nucleic acids (and how this affects things like melting temperature). They just aren’t the same things and don’t function by the same principles.
I think this also explains why many people have a hard time understanding how biological macromolecules can evolve and change over time when they are time and again inappropriately analogized to “inert” macroscopic, mechanical objects, like vehicle engines with axles and pistons, and electronic devices.
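To put a rough number on the GC versus AT point: G-C pairs form three hydrogen bonds and A-T pairs two, and the old Wallace rule of thumb turns that into an estimated melting temperature of about 2 °C per A or T and 4 °C per G or C. The sketch below assumes a short oligonucleotide paired with its perfect complement and ignores salt, concentration and nearest-neighbour effects, so treat it as an illustration only. The function name is hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Only meaningful for short oligonucleotides; real estimates use
    nearest-neighbour thermodynamics and salt corrections.
    """
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    # 2 degrees per A/T, 4 degrees per G/C.
    return 2 * at_count + 4 * gc_count


if __name__ == "__main__":
    # Two 12-mers of equal length but opposite base composition.
    for oligo in ("ATATATATATAT", "GCGCGCGCGCGC"):
        print(oligo, wallace_tm(oligo))
```

Same length, same "alphabet", and the estimate doubles purely because of which bases are present. The sequence is not just a string of symbols; its physical behaviour depends on its chemistry.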
This report also is another refutation of the claim that there is some sort of informational barrier to evolution. The origins and functioning of lncRNAs are decidedly low information (information in the ID use of the term, that is).
“So the way this code works is it produces a 1, unless the room is under 20C, or you’ve gone more than 3 days between a power cycle, or you have any of the programs listed below installed. Then it’s a 0. Unless…”