Beyond Reasonable Doubt? A Test for Common Ancestry

(Blogging Graduate Student) #1

Hi all, this is my first time starting a thread on this site, so I decided to go with something simple and uncontroversial, so we can all get along in unanimous agreement.

Universal common ancestry.

I’ve been browsing a few different threads here recently and one topic that I’ve noticed pop up repeatedly is formal tests of universal common ancestry - whether they have been performed, or are even possible. One paper that is often brought up (not surprisingly, given its title) is Douglas Theobald’s 2010 paper “A formal test of the theory of universal common ancestry”.

It’s worth a read if you haven’t already, but without getting into the details, I think it’s generally acknowledged at this point that Theobald’s statistical methods were flawed, so let’s put his paper to one side for a moment.

What other research exists to fill this void? Well, as it happens I wrote a blog post a while back outlining one such piece of research. I’ll let you read the blog post and/or paper to get the full explanation and data, but for now, here’s my brief summary:

In their 2013 paper, White and colleagues developed and implemented a test of common ancestry between different large clades using ancestral gene reconstruction. It rests on the principle that divergence from a common ancestor means that earlier members of each new lineage would be less genetically diverged than modern members of the respective clades. The illustration of this principle I give in the blog post involves humans and chimps. IF humans and chimps shared a common ancestor 6 million years ago, then if we could go back in time 3 million years and look at members of each lineage (the on-the-way-to-chimp lineage and the on-the-way-to-human lineage), we’d find that they were less genetically distinct from one another than extant humans and chimps are today. After all, they’d only had 3 million years to diverge, whereas extant humans and chimps have had 6 million years.

We can’t go back in time, but we can perform ancestral sequence reconstruction. By comparing the ancestral sequences, as well as modern sequences, we can see if these ancestral sequences really are more similar to one another than the modern sequences are, as would be predicted if the 2 groups in question really did share a common ancestor in the past, or not. White and colleagues performed this test between several major clades across the eukaryotic tree of life, and found in each case that indeed the ancestral sequences were significantly more similar to one another than the modern sequences, confirming the prediction of universal common ancestry.
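To make the intuition concrete, here is a toy simulation of the human/chimp illustration. It is only a sketch under my own assumptions, not the method from the paper: the sequence length, mutation counts and function names are all made up for illustration, and it reads off the true intermediate ancestors directly rather than reconstructing them from modern sequences as White and colleagues do.

```python
import random

BASES = "ACGT"

def mutate(seq, n_mutations, rng):
    """Return a copy of seq after n_mutations random single-site substitutions."""
    seq = list(seq)
    for _ in range(n_mutations):
        i = rng.randrange(len(seq))
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)

def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def simulate_divergence(length=2000, early=30, late=30, seed=0):
    """Two lineages split from one root (hypothetical parameters).

    'early' substitutions happen on each lineage before the midpoint,
    'late' substitutions happen between the midpoint and the present.
    """
    rng = random.Random(seed)
    root = "".join(rng.choice(BASES) for _ in range(length))
    anc_a = mutate(root, early, rng)    # midpoint ancestor, lineage A
    anc_b = mutate(root, early, rng)    # midpoint ancestor, lineage B
    modern_a = mutate(anc_a, late, rng)
    modern_b = mutate(anc_b, late, rng)
    return hamming(anc_a, anc_b), hamming(modern_a, modern_b)

ancestral_dist, modern_dist = simulate_divergence()
print("distance between midpoint ancestors:", ancestral_dist)
print("distance between modern sequences:  ", modern_dist)
```

Under common ancestry the midpoint ancestors come out closer to each other than the modern sequences do, which is the pattern the test looks for in reconstructed ancestral sequences.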

My blog post:

I link the paper in that post, but here’s the link again:

Where Theobald’s paper was met with critiques of the statistical methods, White et al.’s paper has been met with more enthusiasm. In their 2016 paper critiquing Theobald’s 2010 paper, “Infinitely long branches and an informal test of common ancestry”, Leonardo de Oliveira Martins and David Posada write:

“It is worth noticing that another method has been recently proposed that can more directly test for ancestral convergence (White et al. 2013). This method does not seem to suffer from the drawbacks of [Theobald’s] UCA test…”

A rave review if I’ve ever heard one!

I’m curious to hear the thoughts of the members of this forum on this research (as well as my blog post describing it), particularly those of the resident ID researchers. As I say at the beginning of the blog post, I’ve yet to see any rebuttal to the conclusions of this particular paper, despite the fact that it was published 5 years ago now, and I really have looked. The only reasonable conclusion seems to be that angiosperms share a common ancestor with gymnosperms, vertebrates share a common ancestor with echinoderms, etc.

One of the papers that cites this one is Baum et al.’s 2016 paper “Statistical evidence for common ancestry: application to primates”, published along with 2 complementary papers (on bioRxiv): “Statistical evidence for common ancestry: New tests of universal ancestry” and “Statistical Evidence for Common Ancestry: Testing for Signal in Silent Sites”. As the titles suggest, these too describe a suite of statistical tests for common ancestry. I haven’t delved into these as deeply as I have White et al.’s, but again, these are proposed tests that I’ve not really seen brought up when people discuss common ancestry, so maybe we can consider these methods in this thread as well.

Side Comments on The Dependency Graph of Life
(S. Joshua Swamidass) #2

Irony of the day?

(S. Joshua Swamidass) #3

Our test is based on the expectation that, under evolution, the ancestral sequence of one natural group of taxa will be more similar to the ancestral sequence of a second natural group of taxa, than to any sequence from the first group will be to any sequence from the second. In contrast, a variety of proposed non-evolutionary models either do not make this prediction, or require so many parameters that they cannot be said to make any testable predictions at all.

That is a really interesting premise. I’ll have to think about this…

(Mark M Moore) #4

You are a funny guy. We could use some more of that here.

(Bill Cole) #5

The issue is the selection of the null hypothesis. The data beats random chance but that tells us very little.

(Blogging Graduate Student) #6

The null model was separate origins as opposed to common origin of each pair of clades. The “random chance” referred to in the abstract is the chance that the same levels of ancestral convergence observed in the data would be observed in the “separate origins” null model.

(S. Joshua Swamidass) #7

It seems @evograd is correct. You may have misread the paper @colewd.

(S. Joshua Swamidass) #8

6 posts were split to a new topic: Side Comments on Beyond Reasonable Doubt: Evolution from DNA Sequences

(Paul A Nelson) #12

Evograd, have you looked at the alignments? Do you know if Bojian Zhong can still make them available? (I’m wondering how to reach him/her to obtain the aligned and unaligned sequences.)

Any analysis of this interesting paper would need to start with the alignment methods.


(Bill Cole) #13

The discussion topic is universal common descent, so for this to be true there must have been a transition from prokaryotic to eukaryotic cells. I agree Winston’s paper looks at a different line of evidence; however, it brings common design forward as an inference, which this paper also does.

I think this paper proposes a partial and reasonable test for common descent.

(Blogging Graduate Student) #14

I have not, nor have I had any contact with Bojian Zhong, so I’m afraid I can’t help you at all there.

(Blogging Graduate Student) #15

The discussion topic is really more restricted to statistical tests of universal common ancestry, not arguments for or against UCA per se. If you can think of a way to formalise your arguments about spliceosomes and other protein complexes into something that can be quantitatively tested, that might be more appropriate.

(Paul A Nelson) #16

OK, thanks. I’ll see if I can find him and obtain the data.

(Paul A Nelson) #17

Here’s why I am worried about the alignments (Bojian Zhong is senior author on this paper):

Notice that the phylogenetic signal from chloroplast sequences was “improved” by editing out GC sites which caused problems with a monophyletic hypothesis.

Such practices are widespread in molecular phylogenetics. These methods rarely elicit critical comment because monophyly is taken as given. One should only use “phylogenetically informative” sites in constructing evolutionary trees.

I should be able to reach Zhong by email later today or tomorrow (traveling right now, but will have some time tomorrow).

(Bill Cole) #18

The test in the paper seems to be independent of a transitional mechanism. Is this the case?

(Blogging Graduate Student) #19

Yes, the test has nothing to do with mechanisms of “transitions”, it’s just a test of CA versus SA. The “what”, not the “how”.

(Bill Cole) #20

I then agree with Joshua that the prokaryotic to eukaryotic transition is not a good example based on the paper’s test method, as so many new proteins were added during this transition. Comparing homologous proteins does not explain much.

This does limit the subject to common descent vs universal common descent.

(Blogging Graduate Student) #21

To be clear, they didn’t edit out GC sites that “caused problems with a monophyletic hypothesis”, they edited out GC sites that were contributing to the unresolved nature of classifications within Chlorophyta. The phylogeny of the classes within Chlorophyta (as well as the monophyly of some of these classes) was thus far unresolved, giving conflicting results and weak support in different studies. Fang et al. set out to resolve these conflicts - to find the real phylogeny, regardless of what it was. There’s nothing to suggest that the authors were cherry-picking sites to remove in order to support a particular phylogeny, as you seem to be suggesting.

Their reasoning holds up - if GC heterogeneous sites are known to be rapidly evolving and lacking any useful phylogenetic signal, it stands to reason that they be removed, reducing the noise in the rest of the analysis.

Are you suggesting that these phylogenetically “uninformative” sites are actually some kind of signature of separate ancestry? Can you elaborate on that?

(Blogging Graduate Student) #22

As I said in the OP, strictly speaking this isn’t about universal common descent because it’s not including all groups of organisms, but it is including a great many groups that creationists/IDers should be concerned with.

Edit: actually I noticed that I didn’t include the line about this not being strictly about universal common ancestry in the OP (I must have deleted it in the final draft), so apologies if that wasn’t clear.

(S. Joshua Swamidass) #25

That would be really helpful @pnelson. Thanks.

I’ve been thinking about this study, and have some questions about it. It is not clear they have done the right controls.

@evograd, honestly, I’m not sure this is a valid test. And this is coming from a scientist that affirms common descent. I am no anti-evolutionist. I could be wrong, but it seems it is fairly easy to build a generative model that would show evidence of common descent according to this test. Here is how we could do it.

  1. Define a relatively large convex space in sequence space as “functional” (e.g. a fixed set of differences from a centroid sequence).
  2. Sample sequences uniformly from this functional space to assign sequences to different species.

There is no reason to think it must be a convex space. I’m just saying that here to make it easy to visualize. I’m pretty sure the same finding would apply for a space of any shape, as long as it is large enough.

Now, look at the basic intuition here:

the expectation that, under evolution, the ancestral sequence of one natural group of taxa will be more similar to the ancestral sequence of a second natural group of taxa, than to any sequence from the first group will be to any sequence from the second.

Now, in this generative model, which does not include common descent, the same prediction is made. The ancestral sequence will be approximately the “average” of all the sequences in the taxa, and will statistically be much closer to the centroid sequence. So the ancestral sequences will always be closer to one another than the extant sequences are to one another.
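Here is a rough sketch of that generative model, so the claim can be checked numerically. Several things in it are my own simplifying assumptions rather than anything from the paper: the "functional space" is taken to be all sequences that differ from the centroid at exactly `n_diffs` sites, species are assigned to the two groups arbitrarily, and a per-site majority-rule consensus stands in for the "ancestral sequence" (real ancestral reconstruction uses a tree and a substitution model, so this is only a proxy). All names and parameter values are hypothetical.

```python
import random
from collections import Counter
from itertools import product

BASES = "ACGT"

def sample_functional(centroid, n_diffs, rng):
    """Sample a sequence differing from the centroid at exactly n_diffs random sites."""
    seq = list(centroid)
    for i in rng.sample(range(len(seq)), n_diffs):
        seq[i] = rng.choice([b for b in BASES if b != centroid[i]])
    return "".join(seq)

def consensus(seqs):
    """Per-site majority base; a crude stand-in for a reconstructed ancestor."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def run_model(length=1000, n_diffs=200, per_group=20, seed=0):
    """No common descent: every species is an independent draw around one centroid."""
    rng = random.Random(seed)
    centroid = "".join(rng.choice(BASES) for _ in range(length))
    group_a = [sample_functional(centroid, n_diffs, rng) for _ in range(per_group)]
    group_b = [sample_functional(centroid, n_diffs, rng) for _ in range(per_group)]
    # Distance between the two group "ancestors" (consensus sequences).
    anc_dist = hamming(consensus(group_a), consensus(group_b))
    # Mean distance over all cross-group pairs of "extant" sequences.
    cross = [hamming(a, b) for a, b in product(group_a, group_b)]
    return anc_dist, sum(cross) / len(cross)

consensus_dist, mean_extant_dist = run_model()
print("distance between group consensus sequences:", consensus_dist)
print("mean cross-group extant distance:          ", mean_extant_dist)
```

With these settings the two consensus sequences both land close to the centroid, so they are far closer to each other than the sampled sequences are, even though nothing in the model descends from anything else. Whether a proper tree-based reconstruction and the paper's actual null model behave the same way is a separate question that this sketch does not answer.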

Unless I am missing something, it appears that this is a flawed test. Did I miss anything @evograd? What do you think @pnelson?