Beyond Reasonable Doubt? A Test for Common Ancestry

evograd · August 6, 2018, 4:28pm

Hi all, this is my first time starting a thread on this site, so I decided to go with something simple and uncontroversial, so we can all get along in unanimous agreement.

Universal common ancestry.

I’ve been browsing a few different threads here recently, and one topic that I’ve noticed pop up repeatedly is formal tests of universal common ancestry - whether they have been performed, or are even possible. One paper that is often brought up (not surprisingly, given its title) is Douglas Theobald’s 2010 paper, A formal test of the theory of universal common ancestry.

It’s worth a read if you haven’t already, but without getting into the details, I think it’s generally acknowledged at this point that Theobald’s statistical methods were flawed, so let’s put his paper to one side for a moment.

What other research exists to fill this void? Well, as it happens I wrote a blog post a while back outlining one such piece of research. I’ll let you read the blog post and/or paper to get the full explanation and data, but for now, here’s my brief summary:

In their 2013 paper, White and colleagues developed and implemented a test of common ancestry between different large clades using ancestral gene reconstruction. It rests on the principle that divergence from a common ancestor means that earlier members of each new lineage would be less genetically diverged than modern members of the respective clades. The illustration of this principle I give in the blog post involves humans and chimps. IF humans and chimps shared a common ancestor 6 million years ago, then if we could go back in time 3 million years and look at members of each lineage (the on-the-way-to-chimp lineage and the on-the-way-to-human lineage), we’d find that they were less genetically distinct from one another than extant humans and chimps are today. After all, they’d only had 3 million years to diverge, whereas extant humans and chimps have had 6 million years.
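To make the intuition concrete, here is a minimal simulation sketch in Python. It is not taken from White et al. or from my blog post; the sequence length, the number of substitutions per step, and the helper names (`mutate`, `hamming`, `simulate_divergence`) are all illustrative assumptions. It evolves two lineages from a shared random ancestor and compares how far apart they are at the halfway point versus at the end.

```python
import random

BASES = "ACGT"

def mutate(seq, n_subs, rng):
    """Return a copy of seq after n_subs random point substitutions."""
    seq = list(seq)
    for _ in range(n_subs):
        i = rng.randrange(len(seq))
        # Always change the base, so every draw is a real substitution.
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def simulate_divergence(length=1000, subs_per_step=50, seed=0):
    """Evolve two lineages from one ancestor; return (halfway, final) distances."""
    rng = random.Random(seed)
    ancestor = "".join(rng.choice(BASES) for _ in range(length))
    # First half of the elapsed time: each lineage accumulates changes independently.
    mid_a = mutate(ancestor, subs_per_step, rng)
    mid_b = mutate(ancestor, subs_per_step, rng)
    # Second half: the same lineages keep diverging.
    end_a = mutate(mid_a, subs_per_step, rng)
    end_b = mutate(mid_b, subs_per_step, rng)
    return hamming(mid_a, mid_b), hamming(end_a, end_b)

mid_dist, modern_dist = simulate_divergence()
print("halfway distance:", mid_dist)
print("modern distance:", modern_dist)
```

With these assumed settings the halfway distance should come out well below the modern one, which is the pattern the test looks for when it compares reconstructed ancestral sequences with extant ones.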

We can’t go back in time, but we can perform ancestral sequence reconstruction. By comparing the ancestral sequences, as well as modern sequences, we can see if these ancestral sequences really are more similar to one another than the modern sequences are, as would be predicted if the 2 groups in question really did share a common ancestor in the past, or not. White and colleagues performed this test between several major clades across the eukaryotic tree of life, and found in each case that indeed the ancestral sequences were significantly more similar to one another than the modern sequences, confirming the prediction of universal common ancestry.

My blog post:

I link the paper in that post, but here’s the link again:

Where Theobald’s paper was met with critiques of the statistical methods, White et al’s paper has been met with more enthusiasm. In their 2016 paper critiquing Theobald’s 2010 paper, “Infinitely long branches and an informal test of common ancestry”, Leonardo de Oliveira Martins and David Posada write:

“It is worth noticing that another method has been recently proposed that can more directly test for ancestral convergence (White et al. 2013). This method does not seem to suffer from the drawbacks of [Theobald’s] UCA test…”

A rave review if I’ve ever heard one!

I’m curious to hear the thoughts of the members of this forum on this research (as well as my blog post describing it), particularly those of the resident ID researchers. As I say at the beginning of the blog post, I’ve yet to see any rebuttal to the conclusions of this particular paper, despite the fact that it was published 5 years ago now, and I really have looked. The only reasonable conclusion seems to be that angiosperms share a common ancestor with gymnosperms, vertebrates share a common ancestor with echinoderms, etc.

One of the papers that cites this one is Baum et al’s 2016 paper: “Statistical evidence for common ancestry: application to primates”, published along with 2 complementary papers (on bioRxiv): “Statistical evidence for common ancestry: New tests of universal ancestry” and “Statistical Evidence for Common Ancestry: Testing for Signal in Silent Sites”. As the titles suggest, these too describe a suite of statistical tests for common ancestry. I haven’t delved into these as deeply as I have White et al’s, but again, these are proposed tests that I’ve not really seen brought up when people discuss common ancestry, so maybe we can consider these methods in this thread as well.

swamidass · August 6, 2018, 4:31pm

Irony of the day?

swamidass · August 6, 2018, 4:37pm

Our test is based on the expectation that, under evolution, the ancestral sequence of one natural group of taxa will be more similar to the ancestral sequence of a second natural group of taxa, than to any sequence from the first group will be to any sequence from the second. In contrast, a variety of proposed non-evolutionary models either do not make this prediction, or require so many parameters that they cannot be said to make any testable predictions at all.

That is a really interesting premise. I’ll have to think about this…

anon46279830 · August 6, 2018, 5:22pm

You are a funny guy. We could use some more of that here.

colewd · August 6, 2018, 5:33pm

The issue is the selection of the null hypothesis. The data beats random chance but that tells us very little.

evograd · August 6, 2018, 6:13pm

The null model was separate origins, as opposed to a common origin, for each pair of clades. The “random chance” referred to in the abstract is the chance that the same levels of ancestral convergence observed in the data would be observed under the “separate origins” null model.

swamidass · August 6, 2018, 6:14pm

It seems @evograd is correct. You may have misread the paper @colewd.

swamidass · August 8, 2018, 4:25am

6 posts were split to a new topic: Side Comments on Beyond Reasonable Doubt: Evolution from DNA Sequences

pnelson · August 6, 2018, 7:40pm

Evograd, have you looked at the alignments? Do you know if Bojian Zhong can still make them available? (I’m wondering how to reach him/her to obtain the aligned and unaligned sequences.)

Any analysis of this interesting paper would need to start with the alignment methods.

Thanks

colewd · August 6, 2018, 7:46pm

The discussion topic is universal common descent, so for this to be true there must have been a transition from prokaryotic to eukaryotic cells. I agree Winston’s paper looks at a different line of evidence; however, it brings common design forward as an inference, which this paper also does.

I think this paper proposes a partial and reasonable test for common descent.

evograd · August 6, 2018, 7:50pm

I have not, nor have I had any contact with Bojian Zhong, so I’m afraid I can’t help you at all there.

evograd · August 6, 2018, 7:52pm

The discussion topic is really more restricted to statistical tests of universal common ancestry, not arguments for or against UCA per se. If you can think of a way to formalise your arguments about spliceosomes and other protein complexes into something that can be quantitatively tested, that might be more appropriate.

pnelson · August 6, 2018, 7:54pm

OK, thanks. I’ll see if I can find him and obtain the data.

pnelson · August 6, 2018, 8:05pm

Here’s why I am worried about the alignments (Bojian Zhong is senior author on this paper):

ncbi.nlm.nih.gov

Improving phylogenetic inference of core Chlorophyta using chloroplast sequences with strong phylogenetic signals and heterogeneous models.

L Fang, F Leliaert, PM Novis, Z Zhang, H Zhu, G Liu, D Penny and B Zhong, Molecular phylogenetics and evolution, Oct 2018

Phylogenetic relationships within the green algal phylum Chlorophyta have proven difficult to resolve. The core Chlorophyta include Chlorophyceae, Ulvophyceae, Trebouxiophyceae, Pedinophyceae and Chlorodendrophyceae, but the relationships among these classes remain unresolved and the monophyly of Ulvophyceae and Trebouxiophyceae are highly controversial. We analyzed a dataset of 101 green algal species and 73 protein-coding genes sampled from complete and partial chloroplast genomes, including six newly sequenced ulvophyte genomes (Blidingia minima NIES-1837, Ulothrix zonata, Halochlorococcum sp. NIES-1838, Scotinosphaera sp. NIES-154, Caulerpa brownii and Cephaleuros sp. HZ-2017). We applied the Tree Certainty (TC) score to quantify the level of incongruence between phylogenetic trees in chloroplast genomic datasets, and show that the conflicting phylogenetic trees of core Chlorophyta stem from the most GC-heterogeneous sites. With removing the most GC-heterogeneous sites, our chloroplast phylogenomic analyses using heterogeneous models consistently support monophyly of the Chlorophyceae and of the Trebouxiophyceae, but the Ulvophyceae was resolved as polyphyletic. Our analytical framework provides an efficient approach to reconstruct the optimal phylogenetic relationships by minimizing conflicting signals.

Notice that the phylogenetic signal from chloroplast sequences was “improved” by editing out GC sites which caused problems with a monophyletic hypothesis.

Such practices are widespread in molecular phylogenetics. These methods rarely elicit critical comment because monophyly is taken as given. One should only use “phylogenetically informative” sites in constructing evolutionary trees.

I should be able to reach Zhong by email later today or tomorrow (traveling right now, but will have some time tomorrow).

colewd · August 6, 2018, 8:06pm

The test in the paper seems to be independent of a transitional mechanism. Is this the case?

evograd · August 6, 2018, 9:15pm

Yes, the test has nothing to do with mechanisms of “transitions”, it’s just a test of CA versus SA. The “what”, not the “how”.

colewd · August 6, 2018, 9:50pm

I then agree with Joshua that the prokaryotic to eukaryotic transition is not a good example for the paper’s test method, as so many new proteins were added during this transition. Comparing homologous proteins does not explain much.

This does limit the subject to common descent vs universal common descent.

evograd · August 6, 2018, 9:56pm

To be clear, they didn’t edit out GC sites that “caused problems with a monophyletic hypothesis”; they edited out GC sites that were contributing to the unresolved nature of classifications within Chlorophyta. The phylogeny of the classes within Chlorophyta (as well as the monophyly of some of these classes) was thus far unresolved, giving conflicting results and weak support in different studies. Fang et al set out to resolve these conflicts - to find the real phylogeny, regardless of what it was. There’s nothing to suggest that the authors were cherry-picking sites to remove in order to support a particular phylogeny, as you seem to be suggesting.

Their reasoning holds up - if GC-heterogeneous sites are known to be rapidly evolving and lacking any useful phylogenetic signal, it stands to reason that they should be removed, reducing the noise in the rest of the analysis.

Are you suggesting that these phylogenetically “uninformative” sites are actually some kind of signature of separate ancestry? Can you elaborate on that?

evograd · August 6, 2018, 9:58pm

As I said in the OP, strictly speaking this isn’t about universal common descent because it’s not including all groups of organisms, but it is including a great many groups that creationists/IDers should be concerned with.

Edit: actually I noticed that I didn’t include the line about this not being strictly about universal common ancestry in the OP (I must have deleted it in the final draft), so apologies if that wasn’t clear.

swamidass · August 7, 2018, 2:49am

That would be really helpful @pnelson. Thanks.

I’ve been thinking about this study, and have some questions about it. It is not clear they have done the right controls.

@evograd, honestly, I’m not sure this is a valid test. And this is coming from a scientist who affirms common descent. I am no anti-evolutionist. I could be wrong, but it seems it is fairly easy to build a generative model without common descent that would still show evidence of common descent according to this test. Here is how we could do it.

Define a relatively large convex space in sequence space as “functional” (e.g. a fixed set of differences from a centroid sequence).
Sample sequences uniformly from this functional space to assign sequences to different species.

There is no reason to think it must be a convex space. I’m just saying that here to make it easy to visualize. I’m pretty sure the same finding would apply for any shaped space, as long as it is large enough.

Now, look at the basic intuition here:

the expectation that, under evolution, the ancestral sequence of one natural group of taxa will be more similar to the ancestral sequence of a second natural group of taxa, than to any sequence from the first group will be to any sequence from the second.

Now, in this generative model, which does not include common descent, the same prediction is made. The ancestral sequence will be approximately the “average” of all the sequences in the taxa, and will statistically be much closer to the centroid sequence. So the ancestral sequences will always be closer to one another than the extant sequences are to one another.
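Here is a toy sketch of that generative model in Python, under loudly stated assumptions: the functional space is simplified to all sequences exactly `k` substitutions away from the centroid, and a per-column majority consensus stands in for ancestral sequence reconstruction (real reconstruction uses a tree and a substitution model, so this is only a crude proxy). All names and parameter values are hypothetical.

```python
import random
from collections import Counter

BASES = "ACGT"

def sample_functional(centroid, k, rng):
    """Sample a sequence that differs from the centroid at exactly k random sites."""
    seq = list(centroid)
    for i in rng.sample(range(len(seq)), k):
        seq[i] = rng.choice([b for b in BASES if b != centroid[i]])
    return "".join(seq)

def consensus(seqs):
    """Per-column majority base: an assumed, crude proxy for an inferred ancestor."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

rng = random.Random(1)
length, k, n_species = 300, 60, 10  # illustrative values only
centroid = "".join(rng.choice(BASES) for _ in range(length))

# Two "groups" sampled independently from the same space: no tree, no descent.
group1 = [sample_functional(centroid, k, rng) for _ in range(n_species)]
group2 = [sample_functional(centroid, k, rng) for _ in range(n_species)]

# Distance between the two stand-in "ancestors".
consensus_dist = hamming(consensus(group1), consensus(group2))

# Mean distance over all cross-group pairs of "extant" sequences.
pairs = [(a, b) for a in group1 for b in group2]
mean_extant_dist = sum(hamming(a, b) for a, b in pairs) / len(pairs)

print("consensus-to-consensus distance:", consensus_dist)
print("mean extant cross-group distance:", mean_extant_dist)
```

In this sketch the two consensus sequences should sit far closer to each other than typical cross-group pairs do, even though nothing was inherited from a shared ancestor.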

Unless I am missing something, it appears that this is a flawed test. Did I miss anything @evograd? What do you think @pnelson?
