Junk, or Not Junk, that is the Question

I guess it depends on the percentage of junk DNA. Approaches like Behe’s, which incorporate evolution, might predict similar levels of junk DNA.

I think a lower percentage of junk DNA would point more towards importance during the developmental stages of embryos.
In such a scenario, the vast majority of negative selection will be taking place after fertilisation and before birth.

Of course, these are just my thoughts and could be totally wrong.

We’ve known that over short distances exact spacing can be critical to whether different transcriptional components come together to activate or inhibit expression. For example, DNA has a helical structure, so depending on the number of bases between two binding sites, the proteins that bind these sites can find themselves on different ‘sides’ of the DNA. That depends on the rotational phase between sites. Short stretches of DNA are sufficiently rigid that the phase limits how well two bound proteins can come together. That interaction leads to differences in transcription (regulation). Over longer stretches, the constraints from DNA bending and phase lessen. Other effects can also come into play, such as spacing relative to histone wrapping. However, that tends to leave flexibility for long-distance spacing.
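To make the phase argument concrete, here is a minimal Python sketch. It assumes a helical repeat of roughly 10.5 bp per turn for B-form DNA, and the 60-degree tolerance used in `same_face` is an arbitrary illustrative cutoff, not a measured value. The function names are made up for this example.

```python
BP_PER_TURN = 10.5  # assumed approximate helical repeat of B-form DNA


def rotational_offset_degrees(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Angle around the helix axis between two sites, folded into 0..180 degrees."""
    angle = (spacing_bp * 360.0 / bp_per_turn) % 360.0
    return min(angle, 360.0 - angle)


def same_face(spacing_bp, tolerance_deg=60.0):
    """True if the two sites sit on roughly the same side of the helix.

    tolerance_deg is a hypothetical cutoff chosen only for illustration.
    """
    return rotational_offset_degrees(spacing_bp) <= tolerance_deg


if __name__ == "__main__":
    for spacing in (5, 10, 16, 21):
        offset = rotational_offset_degrees(spacing)
        print(spacing, "bp:", round(offset, 1), "degrees, same face:", same_face(spacing))
```

Moving a site by about half a turn (around 5 bp) flips it to the opposite face, while a whole number of turns (around 21 bp) brings it back. The sketch ignores DNA flexibility, so it only applies to the short, rigid stretches described above.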

Overall, we know of many regulatory or expression altering effects from changing spacing between DNA sites. But that doesn’t imply that all or even most portions of the genome are active or constrained in this way. We can’t make that extrapolation from ‘some sequences are important in case “X”’ to ‘all sequences are important’. I think there is some reporting bias in the nature of publishing results that can give the appearance of critical functionality. That is, we tend not to publish negative results. First, they’re hard to positively demonstrate. Second, they’re considered less interesting. For example, it’s harder to publish a paper showing that increasing distances between two sites has little effect on expression. On the other hand, if changing something has an effect, that leaves something interesting to discuss about mechanisms.

I guess that’s to be expected… after all, a negative result in one type of cell need not mean a negative result in other types also…

Thanks for the info.

I repeat,

An important point you’re missing: the ENCODE project itself, in one of those 30 papers, came up with an estimate of the fraction of the genome that has any effect on fitness that was in the neighborhood of 10%. I’ve never seen any serious argument or data, from ENCODE or anywhere else, that most of the human genome has any sequence-specific effect on fitness.

What do you mean by that? Apparently you just think “junk” is a bad name for all the useless parts of the genome. But why?

@T_aquaticus could have given a simpler and less controversial explanation. Basically it all depends on the definition of function. Biologists define function in different ways.
For example, ENCODE is referring to biochemical function, i.e. 80% of the human genome has some kind of biochemical activity, such as being transcribed into RNA. This need not mean that the stretch of DNA does something useful for the organism.
Evolutionary biologists define function as something that gives an advantage that can be selected for (i.e. there is an advantage in terms of natural selection/reproductive success). Now obviously, not every change/feature of an organism needs to have an impact on reproductive success.

In simple terms, evolutionary biologists talk about function in circular terms that makes no sense to a lot of laymen and leads to misunderstanding.

Edit: if we use @T_aquaticus’s analogy of a TV, every quality of a TV that helps people decide to buy it would correspond to the evolutionary definition of function.
The ENCODE definition would be more general, ascribing function to every part of the TV that interacts with electricity.
Obviously, the designer of the TV had a different perspective when talking about the function of the TV.

I guess they were talking about different things.
However, some of the genome might cause diseases that have no effect on fitness. This is especially true for diseases that affect older people, such as Alzheimer’s. These might not have much impact on reproductive success, but the project is still important in terms of health.

Most of the genome in eukaryotes is probably without function, at least at the level of the organism. As others have pointed out, genome size in eukaryotes varies significantly, with no immediate correlation to organismic complexity. I find this comparison especially instructive, as it shows that some unicellular eukaryotes (protozoa) have larger genomes than mammals (from here):

What I find interesting about non-functional DNA is how little bacteria have of it. As much as 90 percent of the genome may consist of genes for most bacterial species, according to one study. This makes sense; bacteria are fast reproducers and should be expected to undergo streamlining selection. Eukaryotes, in contrast, are the energy-rich behemoths of the cellular world, and they can afford the luxury of lugging a genome filled with non-functional DNA around.

Non-functional DNA isn’t just a burden. It’s also a reservoir of future genes and evolutionary potentials. The Thr-Ala-Ala repeats in the antifreeze protein of codfish appear to derive from non-functional DNA, for example. Transposable elements have even been found to have taken on regulatory functions.

Non-functional DNA may be likened to the crumpled-up pieces of paper found in the wastebasket of an aspiring writer; by itself useless, but a necessary by-product of the creative process.

Yup. What ticked a lot of people off about ENCODE was that they studied one thing in their papers, and made it sound like in their press releases that they had studied something else.

Sure, there could be cases like that. On biological grounds, I would expect that most such things would be detrimental to health – and it would be a little odd to describe a stretch of sequence as having the function of causing disease.

There are other gray areas, too. Suppose there’s a transcription factor that works by binding to some stretch of DNA, but that also binds weakly to DNA in a transposon. As the transposon copy number increases in the genome, the copies take up more and more of the TF, reducing its effectiveness. Eventually it becomes beneficial for the TF gene to add a new enhancer to increase expression to overcome the loss. At this point, if you take away the transposons you’ll have overproduction of the TF, which could be deleterious, and if you take away the enhancer you’ll have underproduction, so both are ‘functional’ in an evolutionary sense, but in an engineering sense they’re both useless, since their only role is to compensate for each other.

Actually, 100% of the human genome has some kind of biochemical activity, since being replicated is a biochemical activity. That should clue you in that this definition of function is useless.

It could, but it’s another bad analogy. Junk DNA is not a necessary by-product of the creative process. That would imply that species with little junk are not creative and that species with lots of it are more creative than others. I for one welcome our new salamander and fern overlords.

Or just welcome us eukaryote overlords. :)

The vast majority of biologists define function as impacting the fitness of the organism in some detectable way. In other words, function is defined as being selectable. As Dan Graur puts it:

The terms seem very straightforward to me. If you can randomly change the DNA sequence in a stretch of DNA without impacting the fitness of the organism then that stretch of DNA does not contain function.

I have a general question about the ENCODE project. Did it sift out the considerable amounts of transcripts that are quickly degraded after synthesis like the “cryptic unstable transcripts” in Saccharomyces? It seems like these, if included, would dramatically increase what the project considered “functional” beyond what actually serves a purpose in the cell.

The converse is not true, however. If you can randomly change a sequence and it does impact fitness, that doesn’t mean the sequence contains function. Mutations to junk DNA can be deleterious. Spurious transcription factor binding sites have already been mentioned.

This appears to be the paper that outlined the methods and results for the RNA work:

https://www.nature.com/articles/nature11233

It’s quite dense, but it may be worth sifting through.

This is very interesting. So you’re saying “function” only applies to protein-encoding DNA sequences (a gene)? I would think that “function” would apply to both genes and sequences that affect transcription and translation of those genes.

Biologists do assign function to genes that aren’t translated into protein. Function absolutely does apply to sequences that drive expression of those genes, both at the transcriptional and post-transcriptional levels.

I have studied the effect of microRNAs on protein translation, as one example. These are short RNA sequences that bind to the 3’ untranslated region of messenger RNAs and can prevent their translation into proteins. As we would expect, microRNAs show strong conservation across species, and are definitely targets for evolutionary change.

I’m just trying to reconcile @John_Harshman’s statements with yours to get a better idea of what’s going on and how biologists talk about this stuff.

So functional vs non-functional sequences are not equal to genes vs junk? If that’s right, what can we say about the relationship between these? Are they roughly equal but “fuzzy”?

First off, “gene” is not equal to “translated into protein”. There are genes that are never translated into protein, such as those encoding microRNAs. A gene is best described as a functional and contiguous unit of DNA which includes its promoter, regulatory sequences, the transcribed/translated sequence, and a laundry list of other features. Using the TV analogy, a TV isn’t just the pixels on the screen. A TV includes the cord, switches, plastic molding, and other bits. A gene has lots of parts as well, not just the parts that are translated into protein.

What we can say about all of these functional features in genes is that mutations have a chance of destroying that function. A mutation in the regulatory region of a gene can affect when the gene is turned on and its expression level, which can in turn affect the gene’s function. The mutations in the regulatory region of the human lactase gene are a good example: lactase expression is no longer shut down in adolescence but continues into adulthood. A mutation in the 3’ UTR of a gene can affect which microRNAs bind it, and hence how it is regulated. Because of this, we see conservation of sequence due to negative selection against deleterious mutations throughout a gene and its various parts.

We don’t see sequence conservation in junk DNA. In fact, one way of searching for functional DNA is to look for conserved sequence. You can find candidates for functional sequence without even knowing what that DNA does, just by looking for sequence conservation. Of course, there can be false positives and false negatives when using conservation as a search tool, but it is still very useful and gives a good estimate of the amount of functional DNA in a genome.
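As a rough illustration of conservation as a search tool, here is a small Python sketch. It is a toy under stated assumptions, not how real comparative genomics pipelines work: it takes sequences that are already aligned, scores each column by the fraction of sequences sharing the most common base, and flags windows with high average scores. The function names, the window size, the 0.9 threshold, and the example sequences are all invented for illustration.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Per-column fraction of sequences that share the most common base.

    Assumes the sequences are already aligned; '-' marks a gap and never
    counts as a match.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    scores = []
    for column in zip(*aligned_seqs):
        bases = [base for base in column if base != "-"]
        if not bases:
            scores.append(0.0)
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_windows(scores, window=5, threshold=0.9):
    """Start indices of windows whose mean conservation meets the threshold.

    window and threshold are hypothetical values chosen for the toy example.
    """
    return [
        start
        for start in range(len(scores) - window + 1)
        if sum(scores[start:start + window]) / window >= threshold
    ]


if __name__ == "__main__":
    # Made-up aligned sequences standing in for four species.
    toy_alignment = ["ACGTACGTAA", "ACGTATTCGA", "ACGTAGAGCA", "ACGTAACATA"]
    scores = column_conservation(toy_alignment)
    print(scores)
    print(conserved_windows(scores))
```

A simple column count like this ignores how closely related the species are, which is one source of the false positives and false negatives mentioned above: recently diverged species look conserved everywhere simply because there has been little time for mutations to accumulate.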
