James Tour on Orphan Genes

Yes, one of my new research projects is in this area. I’m not ready to speak publicly about our results, but I’m hoping to have something ready for prime time by mid-fall. There isn’t a ton of work out there, so we’ve decided to reinvent the wheel and go at this in an entirely novel and unbiased way, tossing out almost all of the assumptions that tend to drive the way the questions are asked and concentrating on the novel DNA sequences themselves, rather than beginning with proteins or polyadenylated message. The results, as you can imagine, are quite different this way (from, e.g., the Ruiz-Orera paper). As @swamidass mentioned, most “orphans” aren’t really orphans at all and have clear orthologs. (Many have paralogs also.) In fact, even for genes that others have said have no orthologs, we find orthologs really easily, so I’m not even sure what to believe in this field at the moment.

True “de novo” genetic elements are indeed very rare, but we’ve found some and that’s what we’re focusing on mostly. Notice I did not say de novo “genes,” because most of the novel sequences we find are not genes. As for the mechanism, that’s a tougher question, but we’re making progress there also, and it looks like rearrangement (probably with the help of retroviruses) is the likely answer for at least the first few that we decided to bear down on. Sorry for being so vague with all of this, but it would be irresponsible to talk about results so preliminary that they are literally days, not weeks, old. 🙂 @evograd, once again, you are right on the money. Want to collaborate with us? 🙂

That sounds like some really exciting work. I can’t wait to see it in its developed form!

Granted, the only work I’ve done in this area is a tiny bit of reading, but I’m beginning to formulate a general perception of the “de novo” gene. Maybe I can put this in outline form and ask for correction or expansion, as needed. @NLENTS @swamidass @evograd @davecarlson and others I’m sure I’m missing, your input is heartily invited:

  1. Algorithms traditionally used to identify homologs of ORFs (either protein-coding or simply transcribed) missed a small percentage of candidates.
  2. The percentage of the missed candidates varies between organisms, and tends to inversely correlate with the abundance of genomic studies for particular organisms.
  3. Direct investigation of individual sequences that appear (according to the algorithm) to lack orthologs in closely related organisms typically identifies homologous sequences that were initially missed.
  4. There are a small number of true “de novo” genes (@swamidass’s “exceptions to the exceptions”) that can actually be identified, with enough work.
  5. These true “de novo” genes are noticeable due to the lack of introns (at least in higher eukaryotes). There are probably other hallmark signatures, so I’d be interested to hear more about these.

I think this is all accurate, but I’m certainly open to correction! If accurate, it seems there are a couple of very important notes that follow from these observations. First, the existence of these de novo genes is both predicted and required for large-scale evolution. Second, the existence of these de novo genes tends to refute arguments we’ve seen from Axe and @Agauger that protein-coding sequences are simply too complex to evolve from non-coding sequences. I suppose the ID camp could still argue that these genes are a product of direct divine action, but tracing their emergence would be quite interesting.

Yes, I think your perceptions are accurate. I would just say that this field is in its infancy and, as such, you will see lots of conflicting reports and numbers as we, collectively, figure out the best way to do this work. Even the terminology is still “evolving” and, to be honest, I’m letting almost no prior work influence my project. Although it can be frustrating and disorienting, it’s also exhilarating to be working in a field that is just getting started, knowing that you’re contributing to a foundation that will hopefully open up new avenues of research and understanding.

Predicted, yes. Required? Hmmm, I’m not really sure about that. Hadn’t thought about it as a “requirement” issue. I think the whole “needs an influx of new genetic information” bit has been vastly overblown.

Exactly. And this is the position they’ll always be able to retreat to as we decipher all the mechanisms that they claim could only be supernatural. Just this year, Behe chided Coyne with “you are confusing the question of WHAT happened with HOW it happened” (paraphrase). So once we show where the de novo genetic elements come from, they’ll move the goalposts to “yeah, but you can’t prove that it was an unguided process.” As I’ve said before, ultimately, there is no way to know whether or not a given mutation or genetic rearrangement was guided by someone or something.

I think you are probably right about it being overblown, but wouldn’t it still be necessary for new genes to arise as life forms developed in complexity?

Sure. But don’t confuse “new genes” with “de novo genes”.

You’re right, that’s an important distinction I really wasn’t making mentally. Thanks.

Yes, ^ what John said ^. The normal process of duplication and diversification has done way, way, way, way more to produce “new genes” than the incredibly rare birth of truly de novo genes.
