I guess it depends on the percentage of junk DNA. Approaches like Behe's, which incorporate evolution, might predict similar levels of junk DNA.
I think lower percentages of junk DNA would point more towards importance in the developmental stage of embryos.
In such a scenario, the vast majority of negative selection will be taking place after fertilisation and before birth.
Of course, these are just my thoughts and could be totally wrong.
Over short distances, we've known that exact spacing can be critical to whether different transcriptional components come together to activate or inhibit expression. For example, DNA has a helical structure, so depending on the number of bases between two binding sites, the proteins that bind these sites can find themselves on different "sides" of the DNA. That depends on the rotational phase between the sites. Short stretches of DNA are sufficiently rigid that the phase limits how well two bound proteins can come together, and that interaction leads to differences in transcription (regulation). Over longer stretches, the bending and phase constraints lessen. Other effects can also come into play, such as spacing relative to histone wrapping. However, that tends to leave flexibility for long-distance spacing.
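To make the rotational phase point concrete, here is a rough sketch. It assumes idealized B-form DNA with about 10.5 bp per helical turn and ignores bending, twist flexibility, and protein size entirely. The function names and the 60-degree tolerance are made up for illustration, not taken from any published model.

```python
BP_PER_TURN = 10.5  # approximate helical repeat of B-form DNA (assumed constant here)


def rotational_phase_degrees(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Angle (0-360) by which the second site is rotated around the helix
    axis relative to the first, for a given centre-to-centre spacing."""
    return (spacing_bp % bp_per_turn) / bp_per_turn * 360.0


def same_face(spacing_bp, tolerance_deg=60.0, bp_per_turn=BP_PER_TURN):
    """True if the two sites sit on roughly the same side of the helix.

    tolerance_deg is an arbitrary illustrative cutoff, not a measured value.
    """
    phase = rotational_phase_degrees(spacing_bp, bp_per_turn)
    offset = min(phase, 360.0 - phase)  # angular distance from perfect alignment
    return offset <= tolerance_deg


if __name__ == "__main__":
    for spacing in (5, 10, 21, 26):
        print(spacing, round(rotational_phase_degrees(spacing), 1), same_face(spacing))
```

In this toy picture, moving a site by about half a turn (5 bp) flips it to the opposite face, while moving it by a whole number of turns (21 bp) leaves the phase unchanged. Over long distances the rigid-rod assumption breaks down, which is the flexibility mentioned above.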
Overall, we know of many regulatory or expression-altering effects from changing spacing between DNA sites. But that doesn't imply that all or even most portions of the genome are active or constrained in this way. We can't make that extrapolation from "some sequences are important in case 'X'" to "all sequences are important". I think there is some reporting bias in the nature of publishing results that can give the appearance of critical functionality. That is, we tend not to publish negative results. First, they're hard to positively demonstrate. Second, they're considered less interesting. For example, it's harder to publish a paper showing that increasing distances between two sites has little effect on expression. On the other hand, if changing something has an effect, that leaves something interesting to discuss about mechanisms.
An important point you're missing: the ENCODE project itself, in one of those 30 papers, came up with an estimate of the fraction of the genome that has any effect on fitness that was in the neighborhood of 10%. I've never seen any serious argument or data, from ENCODE or anywhere else, that most of the human genome has any sequence-specific effect on fitness.
@T_aquaticus could have given a simpler and less controversial explanation. Basically it all depends on the definition of function. Biologists define function in different ways.
For example, ENCODE is referring to biochemical function, i.e. 80% of the human genome has some kind of biochemical activity, such as being transcribed into RNA. This need not mean that the stretch of DNA does something useful for the organism.
Evolutionary biologists define function as something that gives an advantage that can be selected for (i.e. there is an advantage in terms of natural selection/reproductive success). Now obviously, not every change/feature of an organism need have an impact on reproductive success.
In simple terms, evolutionary biologists talk about function in circular terms that make no sense to a lot of laymen and lead to misunderstanding.
Edit: if we use @T_aquaticus's analogy of a TV, every quality of a TV that helps people decide to buy it would correspond to the evolutionary definition of function.
And the ENCODE definition would be more general, ascribing function to every part of the TV that interacts with electricity.
Obviously, the designer of the TV had a different perspective when talking about the function of the TV.
I guess they were talking about different things.
However, some of the genome might cause diseases that have no effect on fitness. This is especially true for diseases that affect older people, such as Alzheimer's. It might not have much impact on reproductive success; however, it's an important project in terms of health.
Most of the genome in eukaryotes is probably without function, at least at the level of the organism. As others have pointed out, genome size in eukaryotes varies significantly, with no immediate correlation to organismic complexity. I find this comparison especially instructive, as it shows that some unicellular eukaryotes (protozoa) have larger genomes than mammals (from here):
What I find interesting about non-functional DNA is how little bacteria have of it. As much as 90 percent of the genome may consist of genes for most bacterial species, according to one study. This makes sense; bacteria are fast reproducers and should be expected to undergo streamlining selection. Eukaryotes, in contrast, are the energy-rich behemoths of the cellular world, and they can afford the luxury of lugging a genome filled with non-functional DNA around.
Non-functional DNA isn't just a burden. It's also a reservoir of future genes and evolutionary potential. The Thr-Ala-Ala repeats in the antifreeze protein of codfish appear to derive from non-functional DNA, for example. Transposable elements have even been found to have taken on regulatory functions.
Non-functional DNA may be likened to the crumpled-up pieces of paper found in the wastebasket of an aspiring writer; by itself useless, but a necessary by-product of the creative process.
Yup. What ticked a lot of people off about ENCODE was that they studied one thing in their papers, and made it sound like in their press releases that they had studied something else.
Sure, there could be cases like that. On biological grounds, I would expect that most such things would be detrimental to health, and it would be a little odd to describe a stretch of sequence as having the function of causing disease.
There are other gray areas, too. Suppose there's a transcription factor that works by binding to some stretch of DNA, but that also binds weakly to DNA in a transposon. As the transposon copy number increases in the genome, the copies take up more and more of the TF, reducing its effectiveness. Eventually it becomes beneficial for the TF gene to add a new enhancer to increase expression to overcome the loss. At this point, if you take away the transposons you'll have overproduction of the TF, which could be deleterious, and if you take away the enhancer you'll have underproduction, so both are "functional" in an evolutionary sense, but in an engineering sense they're both useless, since their only role is to compensate for each other.
Actually, 100% of the human genome has some kind of biochemical activity, since being replicated is a biochemical activity. That should clue you in that this definition of function is useless.
It could, but it's another bad analogy. Junk DNA is not a necessary by-product of the creative process. That would imply that species with little junk are not creative and that species with lots of it are more creative than others. I for one welcome our new salamander and fern overlords.
The vast majority of biologists define function as impacting the fitness of the organism in some detectable way. In other words, function is defined as being selectable. As Dan Graur puts it:
The terms seem very straightforward to me. If you can randomly change the DNA sequence in a stretch of DNA without impacting the fitness of the organism then that stretch of DNA does not contain function.
I have a general question about the ENCODE project. Did it sift out the considerable amounts of transcripts that are quickly degraded after synthesis, like the "cryptic unstable transcripts" in Saccharomyces? It seems like these, if included, would dramatically increase what the project considered "functional" beyond what actually serves a purpose in the cell.
The converse is not true, however. If you can randomly change a sequence and it does impact fitness, that doesn't mean the sequence contains function. Mutations to junk DNA can be deleterious. Spurious transcription factor binding sites have already been mentioned.
This is very interesting. So you're saying "function" only applies to protein-encoding DNA sequences (a gene)? I would think that "function" would apply to both genes and sequences that affect transcription and translation of those genes.
Biologists do assign function to genes that arenât translated into protein. Function absolutely does apply to sequences that drive expression of those genes, both at the transcriptional and post-transcriptional levels.
I have studied the effect of microRNAs on protein translation, as one example. These are short sequences of RNA that bind to the 3' untranslated portion of messenger RNAs and can prevent their translation into proteins. As we would expect, microRNAs show strong conservation across species, and are definitely targets for evolutionary change.
I'm just trying to reconcile @John_Harshman's statements and yours to get a better idea of what's going on and how biologists talk about this stuff.
So functional vs non-functional sequences are not equal to genes vs junk? If that's right, what can we say about the relationship between these? Are they roughly equal but "fuzzy"?
First off, "gene" is not equal to "translated into protein". There are genes that are never translated into protein, such as microRNAs. A gene is best described as a functional and contiguous unit of DNA which includes its promoter, regulatory sequences, the transcribed/translated sequence, and a laundry list of other features. Using the TV analogy, a TV isn't just the pixels on the screen. A TV includes the cord, switches, plastic molding, and other bits. A gene has lots of parts as well, not just the parts that are translated into protein.
What we can say about all of these functional features in genes is that mutations have a chance of destroying that function. A mutation in the regulatory region of a gene can affect when the gene is turned on and the expression level of that gene, which can in turn affect the gene's function. The mutations in the regulatory region of the human lactase gene are a good example, where lactase expression is no longer shut down after weaning but continues into adulthood. A mutation in the 3' UTR of a gene can affect which microRNAs it binds, and hence how it is regulated. Because of this, we see conservation of sequence due to negative selection of deleterious mutations throughout a gene and its various parts.
We don't see sequence conservation in junk DNA. In fact, one way of searching for functional DNA is to look for conserved sequence. You can find candidates for functional sequence without even knowing what that DNA does, just by looking for sequence conservation. Of course, there can be false positives and false negatives when using conservation as a search tool, but it is still very useful and gives a good estimate of the amount of functional DNA in a genome.
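As a rough illustration of using conservation as a search tool, here is a minimal sketch. It assumes you already have a gap-padded multiple alignment as equal-length strings. The toy sequences, the window size, and the 0.9 threshold are all made up. Real analyses use phylogeny-aware methods and a neutral-rate model rather than simple column identity.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of sequences sharing the most common base.

    Gaps ('-') never count as agreement, so gappy columns score lower.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        bases = [c for c in column if c != "-"]
        if not bases:
            scores.append(0.0)
            continue
        most_common_count = Counter(bases).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def conserved_windows(scores, window=5, threshold=0.9):
    """Start indices of windows whose mean conservation meets the threshold."""
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Hypothetical toy alignment: first five columns identical, the rest diverged.
    toy = ["ACGTAGGCTA", "ACGTATCAGC", "ACGTACATCG"]
    scores = column_conservation(toy)
    print(conserved_windows(scores))
```

On the toy alignment only the window starting at column 0 is flagged as a candidate. With so few sequences a short stretch can look conserved by chance, which is one source of the false positives mentioned above.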