Genes that evolve from scratch expand protein diversity: Genomic analysis of...


Can’t wait to see the DI’s spin on this one. :slightly_smiling_face:


So does this mean “ORFan genes” were not just magically created from pixie dust by a “designer”? I am disappoint.


It has to be “this is more evidence for ID!” since the lab experiment was obviously Designed. :wink:


I haven’t read the paper yet (just returned yesterday from an inspiring week in the Galapagos Islands, which I’ll be blogging about at ENV), so probably I shouldn’t be posting…but, what the heck, this is a freewheeling forum, right?

A challenge with “de novo” studies arises from determining the directionality of evolution in cases such as this. In 2016, Richard Buggs and I wrote the following; note in particular the second issue we raise:

Firstly, it is difficult to see these de novo genes as orphans sensu stricto, given that orthologs – albeit apparently non-functional orthologs – do exist. The presence of orthologous sequences in other taxa is prima facie difficult to reconcile with most operational definitions of TRGs and ORFans, in particular, the criterion of similarity threshold (see 2.2, above). A second difficulty lies in proving the direction of evolution in cases of de novo evolution: it could be that the non-coding orthologs of the functional orphan genes are simply pseudogenes which were previously functional. Given that de novo gene origination is unlikely as an evolutionary process because the probability of a functional protein sequence emerging from a random sequence is vanishingly small (Jacob, 1977, Ohno, 1970), pseudogenization may be a more parsimonious explanation for the patterns seen. Cardoso-Moreira and Long (2012, p. 170) caution that “the presence of a gene in a genome and its absence [as a coding sequence] in the genomes of close relatives does not necessarily imply that that gene evolved de novo…that gene could have been lost from all other genomes” (2012, p. 170). Siepel (2009, p. 1694) argues that this could be the case even if multiple pseudogenes are found “the possibility that apparent gene births were actually functional in ancestral genomes and were lost independently in multiple lineages, although remote for these genes, cannot be completely discounted. Mutational hotspots could lead to non-negligible probabilities of parallel (homoplastic) disabling mutations.”

(from here:


You beat me to it! I was literally just about to post this paper, having just finished reading it.


Ancient genomes should resolve that objection. Also, this study is on rice. It should be uncontroversial to presume all rice share common ancestors. Phylogenetic analysis can rule out one directionality over the other. So I’m not really sure how long this ORFan ID argument can be sustained…

In the paper, they explicitly look at the “stepwise” process of de novo gene birth along the phylogeny, describing in detail 2 examples. It certainly wouldn’t be more parsimonious to consider in these cases the gene was “gradually pseudogenised” as you move in reverse up the phylogeny!


And there you have it. @pnelson, this does establish directionality.


I will add that I think this figure is a little bit counter-intuitive. The appearance of the orange box in the ancestor of groups 1-5 is clear enough, but after that it gets a bit confusing. The green line represents a frameshift, and where this line is absent in groups 1 and 2, that is supposed to represent the frameshift occurring to give the final “gene state”. The same goes for the purple line: it represents a premature stop codon that was present in the ORF from the start, and its absence in groups 1-3 represents the loss of the stop codon (to a codon that can be read through) by a substitution from a T to a C.

In other words, the phylogeny shows a clear progression: first the ORF appears, then the premature stop codon is removed, then there is an insertion of 2 bases that causes a frame-shift, then transcription is gained.

There seems to be a similar progression identified for all 175 de novo genes analysed.

Like I said – haven’t read the paper yet.

But the dates on the phylogeny are really close together, so I’ll be interested to see how they constructed their branching order.

The divergence times themselves are from, so based on several studies each. From the sounds of it, the phylogeny of the genus itself is pretty well-accepted (in the paper they just say they used “the Oryza phylogenetic tree”).


First off, you probably need something a little more recent for your estimation of function within random sequence.

Second, they are looking at 11 species that emerged over the last 3 million years, so it should be pretty easy to test your claims. From the article:

As @swamidass notes, it should be relatively simple to determine if a gene was present in the common ancestor and then became a pseudogene in all but one species over a very short period of time. The most parsimonious explanation in this case would be the emergence of a gene in one lineage instead of the production of the same pseudogene in parallel in 10 species over a short period of time.
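To make the event counting concrete, here is a rough sketch of the comparison. Everything in it is a toy assumption for illustration: the species names `sp1` to `sp11`, the ladder-shaped tree, and the `min_events` helper are hypothetical and are not taken from the paper or from the real Oryza phylogeny. It simply counts how many changes each scenario needs when reversals are not allowed.

```python
def leaves(node):
    """Return the leaf names under a node (a leaf is a string, a clade is a tuple)."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def min_events(node, has_gene, ancestral):
    """Count changes away from the `ancestral` state, assuming no reversals.

    ancestral=False counts gains (gene born de novo);
    ancestral=True counts losses (gene old, then pseudogenised).
    """
    observed = {has_gene[leaf] for leaf in leaves(node)}
    if observed == {ancestral}:
        return 0  # whole clade kept the ancestral state
    if observed == {not ancestral}:
        return 1  # one change on the branch leading to this clade
    # Mixed clade: the changes must have happened further down the tree.
    return sum(min_events(child, has_gene, ancestral) for child in node)


# Hypothetical ladder-shaped tree of 11 species (an assumption, not real data).
tree = "sp1"
for i in range(2, 12):
    tree = (tree, f"sp{i}")

# Toy pattern: only sp1 carries the functional gene.
has_gene = {f"sp{i}": i == 1 for i in range(1, 12)}

gain_events = min_events(tree, has_gene, ancestral=False)
loss_events = min_events(tree, has_gene, ancestral=True)
print(gain_events, loss_events)
```

On this toy tree the de novo scenario needs a single gain, while the "it was lost everywhere else" scenario needs ten independent losses. The exact number of losses depends on the tree shape (a more balanced tree would need fewer), so treat the figures as an illustration of the logic rather than a result about rice.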


Doesn’t really solve your problem, though, does it? The fact would still remain that the existence of what appear to be “orphan” or “de novo” genes is just an artifact, whether the apparent absence of orthologous genes in related lineages is the result of neo-functionalization of non-coding sequences, or of inactivation of functioning genes.

But I now see that the scientists have shown you are wrong, anyway, so that’s that.

I suppose we’ll be hearing no more of this “de novo” gene argument from you ID creationists. Correct?

@Patrick: Thanks for the link to the article. It’s really intriguing.

Leaves me wondering what somebody like James Shapiro will make of this study. Seems right up his alley.

Unfortunately, it does sound like something up James Shapiro’s alley. He will take the already known and widely accepted mechanisms, give them a new name, and then act as if he has discovered something new. That’s pretty much what he did with “natural genetic engineering”.


And because of the great work done by science popularizers like Richard Dawkins, people will actually believe him…

What he is talking about is a far cry from what is generally taught in classrooms.

No, it isn’t. What James Shapiro talks about is random mutations as they are taught in classrooms. If you feel so inclined, you could start a thread on the subject and we could discuss it more.

Will take you at your word…
To clarify, I was referring to high school.