Quite so.
Remind us, what percentage of human DNA is genic?
For that scenario to work, non-genic DNA must be available to mutate.
There is plenty of non-genic DNA available, at least for the vast, vast majority of eukaryotes. Coding regions take up 2-3% of the human genome. If we extend our definition to transcription units, we are still talking about <20% of the human genome. There’s plenty of DNA open for de novo origin of genes.
Lots of it, in fact, with high initial sequence diversity.
I don’t understand why it needs to have high sequence diversity, especially when repetitive DNA can serve as a strong promoter.
Here’s a cite to get you started – I’m not coy, I’m just lazy:
If we are talking about orphan genes in the human lineage, wouldn’t the common ancestor between us and chimps have about the same size of genome, at around 3 billion bases in a haploid genome? Our lineage would have started with billions of bases available for de novo origin of genes. If we look at vertebrate genomes we once again see plenty of bases available. It seems that you are straining a gnat while swallowing a camel.
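As a rough back-of-the-envelope check on the figures quoted in this thread (a haploid genome of around 3 billion bases, 2-3% coding, under 20% in transcription units), here is a minimal sketch of the arithmetic. The function name is made up for illustration, and 3% and 20% are taken as upper bounds from the percentages above rather than as measured values.

```python
def bases_outside(genome_size: int, percent_inside: int) -> int:
    """Return the number of bases NOT covered by a category that
    occupies `percent_inside` percent of the genome (integer math)."""
    return genome_size * (100 - percent_inside) // 100


# Assumed inputs, taken from the approximate figures quoted in the thread.
GENOME_SIZE = 3_000_000_000   # haploid genome, about 3 billion bases
CODING_PERCENT = 3            # upper end of the "2-3%" coding estimate
TRANSCRIBED_PERCENT = 20      # upper bound for transcription units

non_coding = bases_outside(GENOME_SIZE, CODING_PERCENT)
non_transcribed = bases_outside(GENOME_SIZE, TRANSCRIBED_PERCENT)

print(f"Bases outside coding regions:      {non_coding:,}")
print(f"Bases outside transcription units: {non_transcribed:,}")
```

Even on the more conservative definition, that leaves on the order of 2.4 billion bases outside transcription units.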
In other words, Doolittle et al. are arguing for a scenario that we are all too familiar with here at PS: namely, that new protein-coding genes arise de novo, rather than through mixing and matching of extant protein-coding genes in some hypothetical ancestor.
Doolittle et al. were arguing for LGT:
However, genomes do differ substantially in size, so many genes must truly be present in some genomes and absent from others. When a genome contains a gene with no significant hit in any of several other ‘related’ genomes, but strong similarity to genes in a distant taxon, it seems perverse not to accept LGT as the best explanation.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1693099/pdf/12594917.pdf
I have no idea why @pnelson is bringing LGT between prokaryotes and archaea into the discussion. The paper he is citing appears to be saying that these metabolic capabilities evolved in different lineages and then were shared between lineages through LGT. Once again, the solution to Paul’s problem seems very apparent.
Doolittle et al. were arguing for LGT:
I guess I could have stated this better. In the excerpt @pnelson provided, Doolittle et al. made an argument that is consistent with the de novo appearance of new protein-coding genes (as opposed to a common ancestor with all possible genes).
Sorry for the confusion.
I have no idea why @pnelson is bringing LGT between prokaryotes and archaea into the discussion. The paper he is citing appears to be saying that these metabolic capabilities evolved in different lineages and then were shared between lineages through LGT. Once again, the solution to Paul’s problem seems very apparent.
I think @pnelson is focused on one particular excerpt from the paper, the one about the unseemly ancestor that would have had all of the genetic resources we see in extant life. I may be wrong, but I believe the criticism goes something like: Even if all these genes are being swapped around between different lineages, sometime and somewhere there must have been a common ancestor that had all of these genes. This ancestor must have had an untenably large genome for this to be true. I believe one can read the quote @pnelson provides as agreeing with this.
I’m more than a little confused. I also am having difficulty seeing why this is a problem. I did, however, find one other paper mentioning the “genome of Eden”. It also emphasizes the critical role of LGT.
Abstract: The complex pattern of presence and absence of many genes across different species provides tantalising clues as to how genes evolved through the processes of gene genesis, gene loss and lateral gene transfer (LGT). The extent of LGT, particularly in prokaryotes, and its implications for creating a ‘network of life’ rather than a ‘tree of life’ is controversial. In this paper, we formally model the problem of quantifying LGT, and provide exact mathematical bounds, and new computational results. In particular, we investigate the computational complexity of quantifying the extent of LGT under the simple models of gene genesis, loss and transfer on which a recent heuristic analysis of biological data relied. Our approach takes advantage of a relationship between LGT optimization and graph-theoretical concepts such as tree width and network flow.
Even if all these genes are being swapped around between different lineages, sometime and somewhere there must have been a common ancestor that had all of these genes.
That is exactly what the authors were arguing against. The genes originated in different common ancestors and were then shared across lineages. LGT solves the problem posed by @pnelson.
I also am having difficulty seeing why this is a problem.
Pictures help, so I made one.
The genes originated in different common ancestors and were then shared across lineages. LGT solves the problem posed by @pnelson.
Not really. Robbing Peter to pay Paul.
Let’s say taxon A has gene X, and transfers X to taxon B. Now X is no longer an ORFan, because it is shared by A and B (unless it can be eliminated entirely from the genome of A). If you diagram this (easy to do), you find that gene count does not go down, as it should under any hypothesis of universal common descent; it just moves around.
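One way to make the bookkeeping in this scenario explicit is a toy set model, sketched here under loud assumptions: the taxa, the gene labels and the function names are all hypothetical, and "ORFan" is simplified to mean a gene present in exactly one taxon of the sample. The sketch only counts genes before and after a single transfer. It says nothing about where X came from in the first place.

```python
from collections import Counter


def distinct_genes(genomes):
    """Set of all gene labels found in at least one taxon."""
    return set().union(*genomes.values())


def orfans(genomes):
    """Genes found in exactly one taxon of the sample (toy ORFan definition)."""
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    return {gene for gene, n in counts.items() if n == 1}


def transfer(genomes, gene, donor, recipient):
    """Return a copy of `genomes` in which `recipient` also carries `gene`.
    The donor keeps its copy, as in the scenario described above."""
    if gene not in genomes[donor]:
        raise ValueError(f"{donor} does not carry {gene}")
    updated = {taxon: set(genes) for taxon, genes in genomes.items()}
    updated[recipient].add(gene)
    return updated


# Hypothetical starting point: X is unique to A, Q is unique to B, P is shared.
before = {"A": {"X", "P"}, "B": {"P", "Q"}}
after = transfer(before, "X", "A", "B")

print(len(distinct_genes(before)), sorted(orfans(before)))
print(len(distinct_genes(after)), sorted(orfans(after)))
```

In this toy sample the number of distinct genes is the same before and after the transfer, while X drops out of the ORFan set because it is now shared by A and B.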
Pictures help, so I made one.
Thanks, Paul. That helped me, at least. In my mind, though, that takes us back to John’s comment earlier in the thread.
Fortunately for the scenario, there’s plenty of non-genic DNA. Still waiting for the mystery. Are you asking how non-genic DNA arises? There’s plenty of literature on that too.
Let’s say taxon A has gene X, and transfers X to taxon B. Now X is no longer an ORFan, because it is shared by A and B (unless it can be eliminated entirely from the genome of A).
Why is that a problem? This happens with horizontal transfer as well when two lineages share an orphan gene after a speciation event.
Why do you think it is a problem?
If you diagram this (easy to do), you find that gene count does not go down, as it should under any hypothesis of universal common descent; it just moves around.
Why should gene count go down under universal common descent combined with descent with modification? Do you think universal common descent precludes evolution of new genes in separate lineages?
Shouldn’t the last common ancestor in that diagram be “A B C D E F G H” or “A B C D E F G H Y Z”? Either that, or the 4 species should have extra letters that correspond to the parts of the ancestral genome that haven’t become ORFans.
Do you think universal common descent precludes evolution of new genes in separate lineages?
Universal common descent (UCD) by itself precludes nothing. “All organisms on Earth share a common ancestor, LUCA.” What does that rule out? Nichts, which is both the power of the theory, and its principal flaw.
Only when UCD is coupled with the rest of our biological knowledge, such as known constraints on cell function, can one derive testable predictions from UCD. The pervasiveness of non-orthologous gene displacement [NOGD] looks incompatible with UCD and what we know about functional constraints. In the face of ORFans and NOGD, I’d say UCD is in big trouble:
“As the genome database grows, it is becoming clear that NOGD reaches across most of the functional systems and pathways such that there are very few functions that are truly “monomorphic”, i.e. represented by genes from the same orthologous lineage in all organisms that are endowed with these functions. Accordingly, the universal core of life has shrunk almost to the point of vanishing.”
From Eugene Koonin, “Evolution of the genomic universe,” 2016.
This happens with horizontal transfer as well when two lineages share an orphan gene after a speciation event.
When two lineages share an orphan, it’s not an orphan any more (by definition).
Gotta run; hanging out here is fun but I really shouldn’t.
Let’s say taxon A has gene X, and transfers X to taxon B. Now X is no longer an ORFan, because it is shared by A and B (unless it can be eliminated entirely from the genome of A). If you diagram this (easy to do), you find that gene count does not go down, as it should under any hypothesis of universal common descent; it just moves around.
Perhaps you mean to say “If you diagram this (easy to do), you find that orfan count does not go down, as it should under any hypothesis of universal common descent; it just moves around.”? If not, then I agree with @T_aquaticus that this sentence doesn’t make much sense.
The pervasiveness of non-orthologous gene displacement [NOGD] looks incompatible with UCD and what we know about functional constraints.
I don’t see this at all. It would be nice to see this point expanded upon.
Gotta run; hanging out here is fun but I really shouldn’t.
Ha! It is probably about as bad as accidentally entering the wrong restroom! Not that I would know anything about that…
Universal common descent (UCD) by itself precludes nothing. “All organisms on Earth share a common ancestor, LUCA.” What does that rule out? Nichts, which is both the power of the theory, and its principal flaw.
UCD precludes completely different genetic and metabolic systems, as one example.
Only when UCD is coupled with the rest of our biological knowledge, such as known constraints on cell function, can one derive testable predictions from UCD. The pervasiveness of non-orthologous gene displacement [NOGD] looks incompatible with UCD and what we know about functional constraints. In the face of ORFans and NOGD, I’d say UCD is in big trouble:
We aren’t talking about UCD. We are talking about the de novo origin of genes. You can have the evolution of orphan genes and a lack of universal common descent. The evidence for UCD is a separate matter.
When two lineages share an orphan, it’s not an orphan any more (by definition).
So please tell me which of these steps are a problem for evolution.
Step 1: A mutation results in a putative promoter, resulting in a transcript from previously non-genic DNA that increases fitness.
Step 2: The mutation spreads through positive selection until it becomes fixed in the population.
Step 3: There is a speciation event, resulting in two new species. Both species carry the gene that evolved in the ancestral population.
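Step 2 above can be illustrated with a standard textbook recursion for a beneficial allele in a haploid population, where a carrier has relative fitness 1 + s and the frequency updates as p' = p(1 + s) / (1 + ps). This is a minimal deterministic sketch, not a model of any real gene: the starting frequency, the selection coefficient and the 0.99 cutoff are hypothetical values chosen for illustration, and genetic drift (which can easily eliminate a new mutation while it is still rare) is ignored.

```python
def next_frequency(p: float, s: float) -> float:
    """One generation of deterministic haploid selection.
    p: current frequency of the beneficial allele
    s: selection coefficient (carriers have relative fitness 1 + s)"""
    return p * (1.0 + s) / (1.0 + p * s)


def trajectory_to_threshold(p0: float, s: float,
                            threshold: float = 0.99,
                            max_generations: int = 1_000_000) -> list:
    """Iterate the recursion until the allele frequency reaches `threshold`
    (a stand-in for fixation) or `max_generations` is hit."""
    trajectory = [p0]
    while trajectory[-1] < threshold and len(trajectory) <= max_generations:
        trajectory.append(next_frequency(trajectory[-1], s))
    return trajectory


# Hypothetical parameters: the allele starts at 1 in 10,000 with a 1% advantage.
trajectory = trajectory_to_threshold(p0=1e-4, s=0.01)
generations = len(trajectory) - 1
print(f"Generations to reach 99% frequency: {generations}")
```

Once the allele is at or near fixation, any speciation event (Step 3) passes it to both daughter species by ordinary inheritance.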
For that scenario to work, non-genic DNA must be available to mutate. Lots of it, in fact, with high initial sequence diversity.
And it is. How much intronic sequence is there, for example? There are also transposable elements that become nonfunctional due to (from their perspective) deleterious mutations, and many other kinds of pseudogenes. There are many sources of gene duplication as well, and gene duplication doesn’t just duplicate protein-coding genes, of course.
Where is the mystery?