Gauger: A Single-Couple Human Origin is Possible

swamidass · October 21, 2019, 11:48pm

A Single Couple Human Origin is Possible

The problem of inferring history from genetic data is complex and underdetermined; there are many possible scenarios that would explain the same data. It can be made more tractable by making reasonable simplifications to the model, but it is continually important to remember what has been demonstrated and what is merely a parsimonious working assumption. In this paper we have chosen to model the demographic ancestry of humanity using the simplest of assumptions, with a homogeneous population whose size can vary over time. All other assumptions such as the mutation rates were standard, and no natural selection was in operation. Using a previously published backwards simulation method and some newly developed and faster algorithms, we run our single-couple origin model of humanity and compare the results to allele frequency spectra and linkage disequilibrium statistics from current genetic data. We show that a single-couple origin of humanity as recent as 500kya is consistent with data. With only minor modifications of our parsimonious model assumptions, we suggest that a single-couple origin 100kya, or more recently, is possible.

From @Agauger. What do you think?

John_Harshman · October 22, 2019, 1:57am

I’m assuming they didn’t consider the sharing of multiple alleles across primates.

swamidass · October 22, 2019, 11:06pm

I had a chance to read this and confer with a few people. I’m curious what @glipsnort thinks of it.

It seems to look only at SFS/AFS and LD, both of which are summary statistics that only indirectly test the hypothesis. In essence this replicates work by @glipsnort and by me, but TMR4A is a more powerful and direct way of testing the same thing, coming to essentially the same answer.

Of course, they did not engage the challenge of trans-species variation, nor did they engage the evidence for ancient admixture. These are weaker lines of evidence against a single couple bottleneck, but I don’t think they can be fully dismissed yet.

In summary, not sure what is new in this analysis, and this certainly does not handle all the data. So I don’t think the title is justified.

Perhaps just as important, many theologians are working through whether genetic ancestry even matters or not. Here is what @jack.collins says:

the genetic questions, important as they may be for some purposes, foists a misleading anachronism on the Biblical text.

Here is what @KenKeathley says:

Scripture does not speak of genetics, but it does emphasize genealogy, presenting Adam as the genealogical ancestor of the human race.

Perhaps genetic questions are important, but I don’t think we can take the genetic framing of the question for granted any more. In their future work, it would be important for them to explain why a genetic bottleneck is theologically important.

If genealogical ancestry is most important? Well, then… The Genealogical Adam and Eve. In that case, we beat them by 494,000 years.

Joe_Felsenstein · October 23, 2019, 3:48am

I love their use of “parsimonious”, as if to imply that a single couple is the simplest hypothesis, and therefore favored, while having two, or three, or 10,000 couples would be a much less simple assumption. Not clear whether having one member of the couple cloned from the rib of the other is even more parsimonious.

swamidass · October 23, 2019, 4:00am

I agree. Parsimony is not really a helpful term on this question. I think the best approach is hypothesis testing to figure out what can be ruled out and what cannot be ruled out by the evidence. But are they claiming a couple is most parsimonious?

On the de novo creation side, I don’t think they are saying the first pair was created without parents, at least not in this paper. So I’m not sure we’d want to make that critique. Of course, some of them do in fact think that Adam and Eve were created from scratch, but we should at least take this paper for what it is.

Chris_Falter · October 23, 2019, 4:32am

Hi Joshua,

This aspect of their model would seem to be highly counterfactual, if I am understanding correctly. We see natural selection in operation everywhere in the domain of biology, right? It’s no mere assumption, either - we see it today!

I find the reliance on no natural selection to be highly troubling. Moreover, I have never heard anyone in the YEC or OEC camps even contest the existence of natural selection. They often doubt the ability of natural selection to produce speciation, of course, but that’s not the same thing as excluding it entirely from a model.

But perhaps I am misunderstanding. Would any biologists care to enlighten me?

Thanks!
Chris

Chris_Falter · October 23, 2019, 4:36am

There is no such thing as a minor modification to parsimony. The addition of one extraneous variable is enough to invalidate a model. A drop of poison ruins the stew.

But if a scientist cares to disagree with my understanding, I’m all ears.

What do you all think?

Best,
Chris

swamidass · October 23, 2019, 5:08am

It seems like a reasonable approximation for what they are trying to do. @Joe_Felsenstein can correct me if I am wrong.

Natural selection shouldn’t affect the analysis by much. Seems very reasonable to neglect it. This is more of an exercise in the molecular clock, which on a genome-wide scale should not be overly affected by selection.

I think ancient DNA is sparring with Occam’s Razor. In untangling ancient human history, parsimony is a horrible guide.

glipsnort · October 23, 2019, 11:53am

Immediate reaction, without having read it in detail: they seem to rely exclusively on the folded allele frequency spectrum, which throws away an important constraint. They should be looking at the derived frequency spectrum. Derived frequencies above 50% constrain the contribution from initial diversity. (Take a look at Fig. 8b, and imagine projecting a 1/f distribution out to the right. Their initial diversity contribution will be far higher than the actual derived frequency spectrum as you get close to f = 100%.)
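For readers following along, here is a toy sketch of why folding throws away information. It is not the paper's code or data. The function names `fold_sfs` and `neutral_expected_sfs` are hypothetical, and the "mirrored" spectrum is a contrived contrast, not what any model in the paper predicts. The neutral expectation used is the standard one, where the expected number of sites with derived count i is proportional to 1/i.

```python
def fold_sfs(derived_counts):
    """Fold a derived (unfolded) site frequency spectrum.

    derived_counts[i - 1] is the number of sites where the derived allele
    appears on i of the n sampled chromosomes, for i = 1 .. n - 1.
    Returns minor-allele counts for classes 1 .. n // 2.
    """
    n = len(derived_counts) + 1
    folded = []
    for i in range(1, n // 2 + 1):
        j = n - i  # derived count j looks identical to i once folded
        count = derived_counts[i - 1]
        if j != i:
            count += derived_counts[j - 1]
        folded.append(count)
    return folded


def neutral_expected_sfs(n, theta):
    """Expected derived SFS under the standard neutral model: theta / i."""
    return [theta / i for i in range(1, n)]


n = 10
# Spectrum produced by new mutations: falls off as 1/i toward high frequency.
from_mutation = neutral_expected_sfs(n, theta=90.0)
# Contrived contrast: the same counts with derived frequencies reversed,
# so that most derived alleles sit near fixation.
mirrored = list(reversed(from_mutation))

print(from_mutation == mirrored)                      # the derived spectra differ
print(fold_sfs(from_mutation) == fold_sfs(mirrored))  # the folded spectra do not
```

The two derived spectra are very different above 50% frequency, yet they fold to the same thing, so a fit to the folded spectrum alone cannot tell them apart.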

davecarlson · October 23, 2019, 12:33pm

It’s not uncommon to neglect selection in coalescent models and simulations, although there are certainly times when selection should be (and is) included. John Wakeley has a nice discussion on the subject.

I’m more curious to know how including geographic structure in the model would influence the results.

NLENTS · October 23, 2019, 1:05pm

I agree that this is a misuse of parsimony, which generally isn’t a factor in pop gen studies because, as others have said, selection isn’t the main force to contend with since most loci that are (and can be) used are not under selection anyway. Drift and migration are key; selection isn’t, so parsimony doesn’t really apply. Ann mentions that she collaborated to bring the math expertise that she needed, but I think she should have brought someone in from population genetics too (a field whose complexity is underappreciated by even most of us scientists).

Also apropos: why resurrect this self-published journal for a paper that doesn’t even fit the mission? Quoting from the purpose:

BIO-Complexity is a peer-reviewed scientific journal with a unique goal. It aims to be the leading forum for testing the scientific merit of the claim that intelligent design (ID) is a credible explanation for life. Because questions having to do with the role and origin of information in living systems are at the heart of the scientific controversy over ID, these topics—viewed from all angles and perspectives—are central to the journal’s scope.

This paper has nothing to do with ID or origin of information in living systems.

Quoting from the scope:

Among the topics of interest are: the origin or characterization of complex biological sequences, structures, forms, functions and processes; pre-biotic chemistry and the origin of life; molecular or morphologic phylogenies and phylogenetic methods; new molecular or morphologic data including paleontological data; cladistics and systematics; biomimetic or engineering analyses of biological systems; in vitro and laboratory evolution; evolutionary simulation and computational evolution. Theoretical or mathematical treatments of complexity or information with clear relevance to the journal’s aims are also welcome.

I don’t really see how this paper fits in there, either. I suppose they would claim that this is an evolutionary simulation and/or a test of a “phylogenetic method,” but, um, it’s not.

Publishing articles that don’t even really fit is not exactly a good strategy for building credibility for your in-house journal. It also doesn’t build credibility to HAVE a journal just for publishing your own stuff, and to publish only there instead of in peer-reviewed journals.

And finally:

BIO-Complexity will consider for publication only work that adheres to widely accepted modes of scientific investigation and inference.

Oof.

Joe_Felsenstein · October 23, 2019, 2:27pm

An assumption that the sequences are evolving neutrally is necessary if you do calculations based on the mathematics of coalescents (including the “backwards” simulation they used). So this is quite commonly used, but you have to be aware that individual loci might be affected by “selective sweeps” as an allele arises and spreads by natural selection, and also that they may be affected by natural selection maintaining a polymorphism of multiple alleles. So careful examination of conflicting signals from different parts of the genome is in order.
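A minimal sketch of the kind of calculation that depends on neutrality, under assumptions the paper does not make: a constant population of N diploid individuals (the paper lets size vary) and the standard neutral coalescent. The function names and the value of N are hypothetical. While j lineages remain, the expected wait to the next coalescence is 4N / (j(j - 1)) generations; a selective sweep or balancing selection at a locus would change these waiting times, which is why conflicting signals across the genome matter.

```python
import random


def expected_time_to_k_lineages(n, k, diploid_size):
    """Expected generations for n sampled lineages to coalesce down to k.

    Standard neutral coalescent with constant diploid population size:
    with j lineages left, the expected wait is 4N / (j * (j - 1)).
    """
    return sum(4.0 * diploid_size / (j * (j - 1)) for j in range(k + 1, n + 1))


def simulate_time_to_k_lineages(n, k, diploid_size, rng):
    """Draw one random time for n lineages to coalesce down to k."""
    total = 0.0
    for j in range(n, k, -1):
        # Coalescence rate per generation while j lineages remain.
        rate = j * (j - 1) / (4.0 * diploid_size)
        total += rng.expovariate(rate)
    return total


rng = random.Random(1)
N = 10_000  # assumed constant diploid population size, for illustration only
draws = [simulate_time_to_k_lineages(20, 1, N, rng) for _ in range(5000)]

print(expected_time_to_k_lineages(20, 1, N))  # analytic mean time to one ancestor
print(sum(draws) / len(draws))                # simulated mean of the same quantity
print(expected_time_to_k_lineages(20, 4, N))  # mean time until 4 lineages remain
```

The last line shows that most of the depth of a neutral genealogy lies in the final few lineages: getting from 20 lineages down to 4 takes only a small share of the total expected time.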

Joe_Felsenstein · October 23, 2019, 2:31pm

OK, so if the data fits one couple, but also equally well fits 10,000 couples, your use of parsimony makes it absolutely, definitely true that it must be one couple, right?

swamidass · October 23, 2019, 3:31pm

I think her point is that the genetic evidence studied this way doesn’t tell us either way. That seems to be a valid conclusion.

If the argument is that a single couple should be preferred over 10,000 because of parsimony, I don’t think that argument is valid.

John_Harshman · October 23, 2019, 3:34pm

If he wants to make sense, he should say that Adam is a genealogical ancestor rather than the genealogical ancestor. Right?

Joe_Felsenstein · October 23, 2019, 3:41pm

As far as I can see, Chris Falter’s argument is that parsimony is all-powerful. I think I’ll dissent from that very strongly.

swamidass · October 23, 2019, 4:08pm

I think you are misreading him. He is questioning @Agauger’s use of parsimony (as he understands she is using it) not endorsing it, right? Or maybe I’m misunderstanding him!

davecarlson · October 23, 2019, 4:26pm

I have a friend and former colleague, Elise Lauterbur, who was interested in seeing how coalescent models perform for species of conservation concern, where effective population size and sample size are often extremely small. She found that coalescent methods give unreliable estimates of genetic diversity (as measured by Watterson’s theta) in these scenarios. She recently published this work on bioRxiv:

Coalescent models at small effective population sizes and population declines are positively misleading

Here’s a sample:

To determine if and how standard coalescent models influence estimates of genetic diversity in populations with small effective population sizes, this analysis compares genetic diversity estimates based on coalescent models to genetic diversity calculated in forward simulation ground-truths at a range of effective population sizes, sample sizes, and sampling times since a bottleneck. Coalescent models give unreliable estimates of genetic diversity, as measured by Watterson’s θ_W, regardless of the relationship between sample size and effective population size. This occurs particularly when the population is oversampled with respect to effective population size (sample size exceeds effective population size), and when sampled soon after a bottleneck.

The profound differences between coalescent estimates of θ_W and forward calculations of θ_W soon after a bottleneck show that applying standard coalescent models to bottlenecked populations can give misleading results. Overestimates would have the effect of fitting null coalescent models with incorrectly long times to the most recent common ancestor to estimates of θ_W from real data, thus artificially extending estimates of both bottleneck times and population split times into the past. Underestimates of θ_W that occur after prolonged small effective population sizes would artificially shorten estimates of bottleneck times or population split times.

I wonder what, if any, implications these conclusions have for the present discussion.
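For anyone unfamiliar with the statistic in that abstract, here is a small sketch of the textbook Watterson estimator, computed from sampled sequences as the number of segregating sites S divided by a_n = 1 + 1/2 + ... + 1/(n - 1). The function names and the four-haplotype sample are made up for illustration and are not from the preprint.

```python
def segregating_sites(haplotypes):
    """Count alignment columns where the sampled haplotypes are not all identical."""
    n_sites = len(haplotypes[0])
    return sum(1 for s in range(n_sites) if len({h[s] for h in haplotypes}) > 1)


def wattersons_theta(haplotypes):
    """Watterson's estimator: S / a_n, with a_n = sum of 1/i for i = 1 .. n - 1."""
    n = len(haplotypes)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(haplotypes) / a_n


# Hypothetical sample: 4 haplotypes, 4 sites, '0' = ancestral, '1' = derived.
sample = ["0000", "0100", "0110", "1100"]
print(segregating_sites(sample))
print(wattersons_theta(sample))
```

The divisor a_n comes from the expected total branch length of a neutral coalescent tree, so the estimator inherits that model's assumptions. That is the link to the preprint's concern about very small or recently bottlenecked populations.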

glipsnort · October 23, 2019, 4:31pm

But it’s a bad way of looking at the genetic evidence.
