A problem with orphan genes?

scd · March 23, 2021, 1:50pm

If evolution is true we should probably find a correlation between time and new genes over time, since all species supposedly evolved from a common ancestor, as we can see in that figure:

now, when we look at the data, we can see that there is no correlation between time and evolution of unique genes, since basically any species has a similar percentage of them (between 10-20% on average):

It seems as if all those species were created separately and did not evolve from a common ancestor.

(image from https://www.researchgate.net/figure/Percentages-of-orphan-or-taxonomically-restricted-genes-TRGs-in-30-animal-genomes_fig1_26776433)

Roy · March 23, 2021, 1:56pm

All extant species have been evolving for the same amount of time, so there is nothing to correlate to.

Once again you’re failing to understand the basics of evolution. It produces a branching tree, not a linear scale.

scd · March 23, 2021, 2:02pm

yes there is. remember that we are talking here about orphan genes, which means that these genes are unique to each species. so just for the sake of the argument, suppose that humans have about 10% unique genes (so they supposedly evolved these genes in the last 6-8 my). we should not find that percentage in a lungfish, for instance, since lungfish have existed for at least 400 my.

Faizal_Ali · March 23, 2021, 2:08pm

When they speak of the number of “unique genes” in the lungfish genome, to whose genome is it being compared?

Art · March 23, 2021, 2:29pm

@pnelson , can you help @scd develop this train of thought?

T_aquaticus · March 23, 2021, 2:35pm

What about gene loss?

Compared to whom?

The figure you cited lists taxonomically restricted genes (TRGs). This means that older genes will have spread wider in the tree and so will no longer be counted as taxonomically restricted. Therefore, only recent new genes will be counted as TRGs, so it isn’t surprising that there are similar numbers in all of the species they looked at: we are looking at short time frames towards the tips of the tree.
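To make the counting rule above concrete, here is a minimal sketch with made-up numbers. It assumes each gene has a known origin age, ignores gene loss, and scores a gene as taxonomically restricted only if it arose after the split from the nearest sampled relative. The function `count_restricted` and both age lists are hypothetical and are not taken from the cited figure.

```python
def count_restricted(gene_origin_ages_my, split_from_nearest_relative_my):
    """Count genes that arose after the split from the nearest sampled relative.

    Genes older than the split are inherited by both descendant lineages
    (gene loss is ignored here), so they are not scored as restricted.
    """
    return sum(
        1 for age in gene_origin_ages_my
        if age < split_from_nearest_relative_my
    )


# Hypothetical gene origin ages in millions of years (my).
lineage_a = [1, 2, 5, 50, 300, 400]        # member of a "young" group
lineage_b = [0.5, 3, 4, 120, 350, 390]     # member of a "400 my old" group

# Both are compared to a relative that split off 6 my ago (an assumption).
print(count_restricted(lineage_a, 6))
print(count_restricted(lineage_b, 6))
```

In this toy setup both lineages score the same number of restricted genes, because the count depends on the distance to the nearest sampled relative rather than on how old the wider group is.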

T_aquaticus · March 23, 2021, 2:37pm

If they are unique for each species then they had to evolve recently, right?

If we are looking for genes found in just one species of lungfish, yes we should see a similar percentage since we would be looking at genes that evolved in just the last few million years.

Roy · March 23, 2021, 3:43pm

There is no such thing as the lungfish.

Understand then criticise. Not the other way around.

John_Harshman · March 23, 2021, 3:48pm

How can you know that unless you compare each species to its sister species, which your little graph doesn’t do? Your comparison is useless without a tree behind it.

You should understand that there are several extant species of lungfish, not separated from each other by 400 my. What was that lungfish compared to?

Rumraket · March 23, 2021, 3:57pm

There is no requirement of evolution that the total number of protein coding genes in a genome of an organism should just continue to accumulate indefinitely. Organisms aren’t continuously accumulating new genes without losing old ones too. The total number of protein coding genes has stayed relatively constant throughout metazoan evolution.
There is some core of central metabolic, developmental, and informational genes that stick around by purifying selection (the part shared among all animals in your figure), and then non-essential but fitness-contributing genes are replaced over time as old ones are no longer necessary, and new ones evolve to replace old ones.
The vast majority of predicted orphan genes are either false positives (they aren’t actually functional protein coding genes, just recently emerged but ultimately transient open reading frames being spuriously transcribed), or just very old genes that have been mistakenly identified as lacking a signal of relatedness because of homology detection failure (see: “Many, but not all, lineage-specific genes can be explained by homology detection failure”).

These factors explain why most species would have relatively similar numbers of TRGs: the fraction of all protein coding genes in the genome that consists of novel, non-essential genes stays relatively constant (though it is also significantly smaller than that identified in your older reference), and because these genes are usually non-essential they are subject to a lot of gene turnover, being lost and replaced quickly.

There you go.

davecarlson · March 23, 2021, 4:11pm

This bears repeating. As somebody who has assembled and annotated some complex genomes over the past couple years, I can’t emphasize enough that no newly published genome assembly/annotation is complete or free of errors. It takes years of effort and many iterations to “finalize” a set of predicted genes. Even genomes from well studied and economically important species are still being refined over time.

scd · March 23, 2021, 4:52pm

if it's indeed true, then how do you explain their high percentage? 15% of about 20,000 genes is about 3000 genes that supposedly evolved in the last few million years. compare that with about 6000 genes that supposedly evolved in the last 500-600 my. it doesn't make sense.

i dont think they checked lungfish genome. i just gave it as a theoretical example.

this because of my english.

Faizal_Ali · March 23, 2021, 5:16pm

Oh.

So I still don’t understand what argument you are trying to make. You claim there should be a “correlation between time and evolution of unique genes”, but what do you mean by “time”? “Time” from when to when?

evograd · March 23, 2021, 8:11pm

Presumably he means time to the last common ancestor used in the comparison. In other words, if 2 species that diverged 10 million years ago have “x” orphan genes relative to each other, then 2 species that diverged 50 million years ago should have “5x” orphan genes relative to each other. This is all assuming we’ve accounted for mutation rate and population size, and have sufficient outgroups to determine which genes were present in the common ancestor of each pair.

T_aquaticus · March 23, 2021, 8:32pm

It would depend on which species they are comparing as to how long the time period is. Also, taxonomically restricted genes may not even be new genes if there isn’t enough coverage in the phylogeny.

scd · March 23, 2021, 9:03pm

i don't think we need that. unless all of these species in the figure are equally close to each other, which i'm pretty sure is not the case.

true. so your scenario is that there is some limit on the number of genes evolving? if so, why do we have many more genes that are not unique?

it still doesn't explain the lack of correlation between time and gene count (even if they are “false genes”).

the same problem. if these are indeed genes that were lost, we should see a correlation between time and genes that were lost over time.

Witchdoc · March 23, 2021, 9:16pm

How much thought did you put into this?

Isn’t % DNA similarity an extremely good proxy for exactly what you are asking?

Rumraket · March 23, 2021, 9:54pm

It’s not so much a scenario as it is an inference from the data. There just appears to be no need to continuously expand the total number of protein coding genes over the course of (at least) metazoan evolution.

To the extent that there is variation in genome size (not to be confused with total gene number), it comes mostly from changes in the amount of non-coding DNA (facilitated mostly by changes in the activity of transposable elements), while a relatively limited and similar number of protein coding genes (roughly 20,000 to 30,000 in total) has remained throughout.

As above, it appears that this core set of shared genes for animals evolved relatively early, defines a set of basic animal functions that make for a somewhat successful organism across a broad range of environments, and only relatively little additional gene innovation is actually necessary to specialize this organism further to the many different environmental niches that exist on Earth (with any remaining specialization owing more to changes in regulation and non-coding DNA). For the most part, while these additional genes can have fitness-contributing functions, they are not essential, and so can often be lost and replaced with new ones at little to no cost.

If there is a relatively constant number of them over time, and they are quickly gained and lost again, hence continuously replaced, that really does explain why different species appear to have roughly equal numbers of them.
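The turnover argument can be sketched numerically. The snippet below is a minimal illustration, not a fitted model. It assumes new non-essential genes arise at a constant rate and that each one is lost at a constant per-gene rate, so the expected count follows dN/dt = gain - loss * N. The function name `young_genes` and both rate values are hypothetical.

```python
import math


def young_genes(t_my, gain_per_my, loss_per_my):
    """Expected number of young genes after t_my million years.

    Solves dN/dt = gain - loss * N with N(0) = 0, which gives
    N(t) = (gain / loss) * (1 - exp(-loss * t)).
    """
    equilibrium = gain_per_my / loss_per_my
    return equilibrium * (1.0 - math.exp(-loss_per_my * t_my))


# Hypothetical rates: 1000 new genes per my, each lost at 0.5 per my.
for t in (10, 50, 400):
    print(t, round(young_genes(t, 1000.0, 0.5)))
```

Under these assumed rates the count levels off near gain / loss (2000 here) within a few million years, so lineages of very different ages end up with nearly the same number. Without loss, the same gain rate would grow linearly with time instead.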

I thought about explaining this in more detail but they say a picture is worth:

I believe I have now explained this so everyone should be able to understand.

John_Harshman · March 24, 2021, 4:16am

That made no sense. You can’t say that a gene is present only in one species unless you compare it to all likely sister species. If you reject phylogeny, then I suppose you would have to have the complete genomes of every species in the world just to be sure.

Time? Where do you even get a measure of time without a phylogeny?

scd · March 24, 2021, 2:40pm

ok. so lets take a look at Drosophila phylogeny:

(image from https://www.researchgate.net/figure/Genus-Drosophila-phylogeny-Powell-1997_fig2_5857976)

the species simulans and sechellia supposedly split off about 3 my ago, and both have about 20% unique genes, which gives us about 3000 new genes in about 3 million years, or a single new gene per 1000 years. can you show me how we can get so many new genes (without any homologs) in such a short time?
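For reference, the arithmetic in this post can be checked with a short sketch. It only reproduces the calculation and takes the 20% figure at face value. The total of 15,000 genes is an assumption back-calculated from the post's own numbers (3000 is 20% of 15,000), and `implied_rate` is a hypothetical helper.

```python
def implied_rate(total_genes, unique_percent, divergence_years):
    """Return (unique gene count, years per new gene) for the given inputs."""
    unique_genes = total_genes * unique_percent / 100
    years_per_gene = divergence_years / unique_genes
    return unique_genes, years_per_gene


# Assumed total of 15,000 genes; 20% unique; split 3 million years ago.
unique, years_per_gene = implied_rate(15_000, 20, 3_000_000)
print(unique, years_per_gene)
```

The result depends entirely on the inputs: a lower share of genuinely new genes or a different divergence time changes the implied rate proportionally.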

see above: this is what they did. at least for the Drosophila.
