Genes that evolve from scratch expand protein diversity: Genomic analysis of...

Patrick · March 12, 2019, 8:50pm

Timothy_Horton · March 12, 2019, 8:52pm

Can’t wait to see the DI’s spin on this one.

Faizal_Ali · March 12, 2019, 9:05pm

So does this mean “ORFan genes” were not just magically created from pixie dust by a “designer”? I am disappoint.

Timothy_Horton · March 12, 2019, 9:08pm

It has to be “this is more evidence for ID!” since the lab experiment was obviously Designed.

pnelson · March 12, 2019, 9:16pm

I haven’t read the paper yet (just returned yesterday from an inspiring week in the Galapagos Islands, which I’ll be blogging about at ENV), so probably I shouldn’t be posting…but, what the heck, this is a freewheeling forum, right?

A challenge with “de novo” studies arises from determining the directionality of evolution in cases such as this. In 2016, Richard Buggs and I wrote the following; note in particular the second issue we raise:

Firstly, it is difficult to see these de novo genes as orphans sensu stricto, given that orthologs – albeit apparently non-functional orthologs – do exist. The presence of orthologous sequences in other taxa is prima facie difficult to reconcile with most operational definitions of TRGs and ORFans, in particular, the criterion of similarity threshold (see 2.2, above). A second difficulty lies in proving the direction of evolution in cases of de novo evolution: it could be that the non-coding orthologs of the functional orphan genes are simply pseudogenes which were previously functional. Given that de novo gene origination is unlikely as an evolutionary process because the probability of a functional protein sequence emerging from a random sequence is vanishingly small (Jacob, 1977, Ohno, 1970), pseudogenization may be a more parsimonious explanation for the patterns seen. Cardoso-Moreira and Long (2012, p. 170) caution that “the presence of a gene in a genome and its absence [as a coding sequence] in the genomes of close relatives does not necessarily imply that that gene evolved de novo…that gene could have been lost from all other genomes” (2012, p. 170). Siepel (2009, p. 1694) argues that this could be the case even if multiple pseudogenes are found “the possibility that apparent gene births were actually functional in ancestral genomes and were lost independently in multiple lineages, although remote for these genes, cannot be completely discounted. Mutational hotspots could lead to non-negligible probabilities of parallel (homoplastic) disabling mutations.”

(from here: https://www.researchgate.net/publication/304039133_Next_generation_apomorphy_The_ubiquity_of_taxonomically_restricted_genes)

evograd · March 12, 2019, 9:16pm

You beat me to it! I was literally just about to post this paper, having just finished reading it.

swamidass · March 12, 2019, 9:20pm

Ancient genomes should resolve that objection. Also, this study is on rice. It should be uncontroversial to presume all rice share common ancestors. Phylogenetic analysis can rule out one directionality over the other. So I’m not really sure how long this ORFan ID argument can be sustained…

evograd · March 12, 2019, 9:20pm

In the paper, they explicitly look at the “stepwise” process of de novo gene birth along the phylogeny, describing in detail 2 examples. It certainly wouldn’t be more parsimonious to consider in these cases the gene was “gradually pseudogenised” as you move in reverse up the phylogeny!

swamidass · March 12, 2019, 9:22pm

And there you have it. @pnelson, this does establish directionality.

evograd · March 12, 2019, 9:28pm

I will add that I think this figure is a little bit counter-intuitive. The appearance of the orange box in the ancestor of groups 1-5 is clear enough, but after that it gets a bit confusing. The green lines represent a frameshift, and in groups 1 and 2, where this line is absent, this is supposed to represent the frameshift occurring to give the final “gene state”. The same goes for the purple line: it represents a premature stop codon that was present in the ORF from the start, and its absence in groups 1-3 represents the loss of the stop codon (to a codon that can be read through) by a substitution from a T to a C.

In other words, the phylogeny shows a clear progression: first the ORF appears, then the premature stop codon is removed, then there is an insertion of 2 bases that causes a frame-shift, then transcription is gained.

There seems to be a similar progression identified for all 175 de novo genes analysed.

pnelson · March 12, 2019, 9:31pm

Like I said – haven’t read the paper yet.

But the dates on the phylogeny are really close together, so I’ll be interested to see how they constructed their branching order.

evograd · March 12, 2019, 9:41pm

The divergence times themselves are from timetree.org, so based on several studies each. From the sounds of it, the phylogeny of the genus itself is pretty well-accepted (in the paper they just say they used “the Oryza phylogenetic tree”).

T_aquaticus · March 12, 2019, 10:37pm

First off, you probably need something a little more recent for your estimation of function within random sequence.

Second, they are looking at 11 species that emerged over the last 3 million years, so it should be pretty easy to test your claims. From the article:

As @swamidass notes, it should be relatively simple to determine if a gene was present in the common ancestor and then became a pseudogene in all but one species over a very short period of time. The most parsimonious explanation in this case would be the emergence of a gene in one lineage instead of the production of the same pseudogene in parallel in 10 species over a short period of time.
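To make the parsimony comparison concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption, not data from the paper: the species names `sp1` to `sp11`, the ladder-shaped (caterpillar) tree, and the scoring of one unit per gain or loss. It counts the fewest events needed to explain a gene that is functional in only one of 11 species, first allowing a de novo gain, then forcing the gene to be ancestral and permitting only losses.

```python
import math

def caterpillar(labels):
    """Build a ladder-shaped tree of nested tuples (hypothetical topology)."""
    tree = labels[0]
    for name in labels[1:]:
        tree = (tree, name)
    return tree

def min_events(tree, leaf_state, root_state, gain_cost, loss_cost):
    """Sankoff-style parsimony: minimum total cost of gains (0->1) and
    losses (1->0) on the tree, with the root fixed to root_state."""
    def step(parent, child):
        if parent == child:
            return 0
        return gain_cost if child == 1 else loss_cost

    def below(node):
        # Returns [best cost if node is 0, best cost if node is 1].
        if isinstance(node, str):
            return [0 if leaf_state[node] == s else math.inf for s in (0, 1)]
        tables = [below(child) for child in node]
        return [
            sum(min(step(s, t) + table[t] for t in (0, 1)) for table in tables)
            for s in (0, 1)
        ]

    return below(tree)[root_state]

# Assumed scenario: functional gene (1) in sp1 only, absent (0) elsewhere.
species = [f"sp{i}" for i in range(1, 12)]
tree = caterpillar(species)
state = {name: 0 for name in species}
state["sp1"] = 1

# De novo origin: ancestor lacks the gene; gains and losses each cost 1.
de_novo = min_events(tree, state, root_state=0, gain_cost=1, loss_cost=1)

# Ancestral gene: root has the gene; re-gain is forbidden (infinite cost).
loss_only = min_events(tree, state, root_state=1, gain_cost=math.inf, loss_cost=1)

print("events, de novo gain:", de_novo)
print("events, ancestral gene with losses only:", loss_only)
```

On this assumed topology the de novo scenario needs a single gain, while the loss-only scenario needs ten independent losses. The exact loss count depends on the tree shape, so the real Oryza phylogeny could give a different number, but the asymmetry is the point of the argument.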

Faizal_Ali · March 12, 2019, 11:13pm

Doesn’t really solve your problem, though, does it? The fact would still remain that the existence of what appear to be “orphan” or “de novo” genes is just an artifact, whether the apparent absence of orthologous genes in related lineages is the result of neo-functionalization of non-coding sequences, or of inactivation of functioning genes.

Faizal_Ali · March 12, 2019, 11:23pm

But I now see that the scientists have shown you are wrong, anyway, so that’s that.

I suppose we’ll be hearing no more of this “de novo” gene argument from you ID creationists. Correct?

Ashwin_s · March 14, 2019, 8:06am

@Patrick: Thanks for the link to the article. It’s really intriguing.

Leaves me wondering what somebody like James Shapiro will make of this study. Seems right up his alley.

T_aquaticus · March 14, 2019, 3:19pm

Unfortunately, it does sound like something up James Shapiro’s alley. He will take the already known and widely accepted mechanisms, give them a new name, and then act as if he has discovered something new. That’s pretty much what he did with “natural genetic engineering”.

Ashwin_s · March 14, 2019, 3:26pm

And because of the great work done by science popularizers like Richard Dawkins, people will actually believe him…

What he is talking about is a far cry from what is generally taught in class rooms.

T_aquaticus · March 14, 2019, 3:29pm

No, it isn’t. What James Shapiro talks about are random mutations as they are taught in the class rooms. If you feel so inclined, you could start a thread on the subject and we could discuss more.

Ashwin_s · March 14, 2019, 3:31pm

Will take you at your word…
To clarify, I was referring to high school.
