Junk DNA, High R, Pinnipeds, and the Multiverse

Well, of course the evidence is overwhelming and has been provided to you many times before. The greatest part comes from the nested hierarchical structure of the data.

But in fact we don’t need to assume common descent. There is other evidence. The spectrum of observed variation within the human species should be enough to show that some parts of the genome are more subject to neutral variation than others. And why, given separate creation, should 10% of the human genome be very similar to 10% of the mouse genome while the rest is much less similar? Are the important differences between the species 90% of the genome, while the commonalities are only 10%? That seems like a serious imbalance to me, because we’re much more alike than we are different.

Especially when one asks those questions and considers the answers BEFORE taking positions.

Why not? What would be the biochemical mechanisms underlying this alleged difference, particularly since many of those regulatory sequences function by interacting with proteins?

I think you’ve just stumbled on another empirical prediction of your ever-changing ID hypothesis…

But isn’t it the case that the nested hierarchical structure of the data mostly concerns the conserved parts, not the other parts? And in that case, it seems to me that the assumption of common descent for the 90% that lacks conservation is unwarranted, and so, it appears, is the claim that this 90% is under neutral evolution.

That seems absurd. Why should parts of DNA be related by common descent and other parts not? Please explain your reasoning. Also explain why the supposedly unrelated segments would happen to differ by the amounts expected from neutral drift.

The junk is in the trunk. It comes along for the ride.

Pseudogenes, transposons, ERVs, and relative degrees of similarity, even if not conserved, can be useful for determining phylogeny, so long as they’re not so degenerate as to no longer be identifiable.

No. Whether a part is conserved or not is a matter of how much it affects fitness under selection. If it has a great impact on how the organism functions, then only some changes are likely to survive long term. Changes that are not expressed in ways selection can act on can diffuse into the population unimpeded, but the new variants will still be derivatives of the ancestral gene.

When many sequences in a set have unique features not shared by the others, one can reasonably class the states shared by the majority as basal and the unique features as derived. Applying this rule to larger sets automatically yields a nested hierarchy: individual sequences have unique features of their own, plus features shared within a smaller group but not with the majority of the rest of the set, which renders the group itself derived relative to an even higher-order super-group, and so on.
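Here’s a minimal sketch of that rule in code (the sequences are hypothetical toy data): the majority state at each aligned position is treated as basal, minority states as derived, and sequences sharing a derived state fall into candidate nested groups.

```python
from collections import defaultdict

# Toy alignment (hypothetical data): five short, pre-aligned sequences.
seqs = {
    "s1": "AACGT",
    "s2": "AACGA",
    "s3": "ATCGA",
    "s4": "ATCCA",
    "s5": "AACGA",
}

length = len(next(iter(seqs.values())))
groups = defaultdict(set)  # (position, derived state) -> sequences sharing it

for pos in range(length):
    column = {name: seq[pos] for name, seq in seqs.items()}
    counts = defaultdict(int)
    for state in column.values():
        counts[state] += 1
    basal = max(counts, key=counts.get)  # majority state, treated as basal
    for name, state in column.items():
        if state != basal:               # minority states are derived
            groups[(pos, state)].add(name)

for (pos, state), members in sorted(groups.items()):
    print(f"derived state {state!r} at position {pos}: {sorted(members)}")
```

With this toy data, the shared derived state at position 1 groups s3 with s4, while s4’s unique state at position 3 nests it inside that group, which is exactly the kind of hierarchy described above.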

Whether, or to what extent, selection played a part in extinguishing certain variants does not alter the classification scheme. One may argue that up to this point it is still an arbitrary classification designed to produce some nested hierarchy no matter what. This is true. However, one can construct a hierarchical tree from each of many non-overlapping gene sequences and compare them. If each specific tree is a mere artifact of the classification scheme, then that any of them should align in any part at all is a matter of coincidence, and we should expect agreement at a rate no greater than, essentially, the autocorrelation of white noise at finite sample sizes. On the other hand, if the trees match almost perfectly all the time (‘almost’ only because we are still working with a finite sampling of a fundamentally random process, so a perfect match is rather improbable), it stands to reason that one of those trees, or one quite like them, is a correct reflection of an underlying relationship between the sequences. And since we are talking about gene sequences, the only such relationship available is descent.
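A minimal sketch of that comparison, with hypothetical toy topologies: each tree implies a set of groups (clades), and agreement between two trees can be scored as the fraction of clades they share. Trees built from independent sequences should agree no better than chance if the hierarchy were an artifact of the method.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a tree written as nested tuples."""
    if isinstance(tree, str):                  # a single leaf
        return frozenset([tree]), set()
    leaves, found = set(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    leaves = frozenset(leaves)
    found.add(leaves)                          # this node's own clade
    return leaves, found

def congruence(tree_a, tree_b):
    """Fraction of clades shared by two trees on the same leaf set."""
    _, a = clades(tree_a)
    _, b = clades(tree_b)
    return len(a & b) / len(a | b)

# Hypothetical topologies "inferred" from three non-overlapping sequences:
gene1 = ((("human", "chimp"), "gorilla"), ("mouse", "rat"))
gene2 = ((("human", "chimp"), "gorilla"), ("mouse", "rat"))
gene3 = ((("human", "mouse"), "gorilla"), ("chimp", "rat"))  # conflicting tree

print(congruence(gene1, gene2))  # 1.0: the trees agree on every grouping
print(congruence(gene1, gene3))  # ~0.14: only the trivial root clade matches
```

Note that the root clade (the full leaf set) is shared by any two trees on the same leaves, so even wholly conflicting trees score slightly above zero; that is the finite-sample baseline mentioned above.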

No. It’s impossible to identify a hierarchical structure in something that doesn’t change.

No, it is not. The non-conserved parts retain their hierarchical structure for many millions of years. One can certainly observe the structure throughout mammals or throughout birds, though the structure in junk DNA doesn’t reach any further. I can’t, for example, align introns between birds and crocodiles*.

*There is in fact a single exception to this that I know of. For some reason unknown to me, an intron in IRF2 (Interferon Regulatory Factor 2) evolves very slowly and can be aligned between birds and crocodiles. Not that this is relevant to the main point.

Is this because so much of the sequence changes that it becomes basically impossible to tell what’s derived and what’s basal anymore? Peak entropy, in a way? I suppose that in the long-term limit the same should happen to non-junk regions as well, shouldn’t it? Conservation isn’t a hard barrier, after all, but rather effectively prolongs the timescale characterizing the changes, and the reason we can still reconstruct a family tree almost all the way back to the beginning is that the time life has been around does not greatly surpass that degeneration timescale. Am I picturing this roughly correctly?

Yes, and the degree of conservation is not the only thing that matters here. Genes are also lost and gained. There is a small handful of genes (about 30) that are found in all known organisms, the only genes that are universally distributed. Most of these encode ribosomal proteins and ribosomal RNA, tRNA, and a few other components of the translation system.
These also appear to have been evolving slowly enough for a universal tree to be derived from their sequences. Pretty much all other genes have been either lost or gained somewhere along evolutionary history, and many of them have also been evolving much too quickly for a well-resolved tree to be derivable from their sequence (with some having diverged below the level of similarity expected by chance). As a rule of thumb, very deep phylogenies usually require protein sequences, because the DNA sequences that encode the proteins have long since become saturated.
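As a rough illustration (the rate and times are made-up values), under the standard Jukes-Cantor idealization the expected fraction of identical sites between two sequences that split t generations ago is 1/4 + (3/4)·exp(-8μt/3), which decays toward the 25% identity expected between unrelated sequences; that floor is what “saturated” means:

```python
import math

# Jukes-Cantor expectation (a standard idealization, toy parameter values):
# mu is the per-site substitution rate per generation in each lineage, and
# t is the number of generations since the two sequences diverged.
def expected_identity(mu, t):
    return 0.25 + 0.75 * math.exp(-(8.0 / 3.0) * mu * t)

for t in (10_000, 100_000, 1_000_000, 10_000_000):
    print(f"t = {t:>10,}  expected identity = {expected_identity(1e-6, t):.3f}")
```

Past the point where the curve flattens at 0.25, truly homologous sequences look no more alike than unrelated ones, and no tree can be recovered from the DNA.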

Methods for constructing trees from conservation of protein structure (which typically evolves even more slowly than protein sequence) are in development.

More than that. It becomes impossible to align, meaning that you can’t tell which positions are homologous. You can’t tell what’s changed because you can’t tell which sites to compare.

Depends on how conserved it is. There are sequences that can still be compared across all eukaryotes, even all life. But those are mostly protein sequences. I wouldn’t call this degeneration, because that would imply that the sequences are getting less functional. They’re just getting different.

No. It’s a continuum. More-conserved parts show more distant relationships, less-conserved ones show closer ones.

That’s why nonfunctional sequences are used for paternity testing, for example.

It’s even the case within protein families; myosin head (motor) domains are more conserved than tail domains.

But that case doesn’t hold. You really should ask questions and consider the answers (maybe even examine data!) BEFORE jumping to conclusions that you have to run away from defending.

Again, your “case” is pure fantasy. And your downgrading of a conclusion supported by petabytes of data to a mere “claim” is puerile.

There are fairly obvious (but very limited) exceptions, the most obvious of which is histocompatibility, for which differences increase fitness.

I cannot help but see an analogy to Radiometric Dating here – with shorter-lived isotopes (analogous to less conserved parts) being used for the more recent past, and longer-lived isotopes (analogous to more conserved parts) being used in deeper time. It’s all about using the (metaphoric) ‘clock’ that hasn’t ‘run out’ and gives you the greatest discrimination.

Of course, it is the same thing. Assume that a given very long sequence has a well-defined mutation rate that doesn’t change over time, that mutations occur uniformly at random loci, and that a base that mutated once cannot be restored to its original state. These are ultimately false assumptions, but at sufficiently “short” timescales, i.e. ones moderately shorter than the inverse of that mutation rate, still a fair first approximation. Then the expected fraction of loci retaining the original base after some time comes down to exactly the same negative exponential function as the expected fraction of a very large amount of a parent isotope that has not decayed after the same time. Indeed, we could define original sequence integrity to account for reverting mutations, too, and still keep the same exponential behaviour, albeit with a slightly corrected timescale.

Whether we call it a rate of mutation or a rate of radioactive decay, if one process can serve as a clock, then so can the other, and with the same limitations at timescales much shorter or much longer than the ticking period. And if we have at our disposal ‘clocks’ with distinct ticking rates, then we can cover a greater range of timescales, too.
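For what it’s worth, here is a quick numerical sketch of that correspondence (sequence length, rate, and times are all made-up values): under the no-reversion assumption above, the expected fraction of sites still in their original state after t generations is (1 - μ)^t ≈ exp(-μt), the same form as radioactive decay.

```python
import math
import random

random.seed(0)
L = 100_000   # number of sites (toy value)
mu = 1e-3     # per-site mutation probability per generation (toy value)

# Draw, for each site, the generation of its first mutation (a geometric
# waiting time, sampled by inversion of a uniform random number).
first_hit = [math.ceil(math.log(1.0 - random.random()) / math.log(1.0 - mu))
             for _ in range(L)]

for t in (250, 500, 1000, 2000):
    simulated = sum(h > t for h in first_hit) / L
    predicted = math.exp(-mu * t)
    print(f"t = {t:4d}  simulated = {simulated:.3f}  exp(-mu*t) = {predicted:.3f}")
```

The two columns track each other to within sampling noise, which is the sense in which a mutation count and a decay count can serve as the same kind of clock.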

Ok. But do we know what proportion of the non-conserved parts still exhibiting a hierarchical structure evolved at a truly neutral rate?

For example, because of horizontal transfer, viral infections, or retrotransposition.

Of course. But you shouldn’t identify the conserved parts with parts exhibiting perfect similarities.

Yes and no. There is no hard line between conserved and non-conserved. It’s a gradient. The hierarchical structure is observed in all parts, just to different depths because of the varying conservation strengths / mutation rates.

What on earth are “perfect similarities”? If a given sequence is completely conserved over some duration, then it undergoes no changes over that duration, and therefore no structure of progressive change can be identified in that sequence over that timescale. If the sequence is less than completely conserved, then it undergoes some random change, becoming ever less similar as time moves forward, branching into the hierarchical structure alongside its siblings. The less conserved the sequence is, the faster it changes, and thereby loses similarity to the initial sequence. The maximum rate at which a sequence changes, once normalized against factors that affect mutation rates across the board irrespective of selection, corresponds to the minimum conservation a sequence can be subject to. Because minimal conservation is either none, or something not experimentally distinguishable from none, that maximum rate of change is the same as a “truly neutral rate”.

I want to add that phylogenetic analysis does not rely on there being any sort of clock.

We can’t know, but it’s the way to bet. This is you again demanding certainty and being unable to comprehend any sort of probabilistic claim.