Gauger: A Single-Couple Human Origin is Possible

A Single Couple Human Origin is Possible

The problem of inferring history from genetic data is complex and underdetermined; there are many possible scenarios that would explain the same data. It can be made more tractable by making reasonable simplifications to the model, but it is continually important to remember what has been demonstrated and what is merely a parsimonious working assumption. In this paper we have chosen to model the demographic ancestry of humanity using the simplest of assumptions, with a homogeneous population whose size can vary over time. All other assumptions such as the mutation rates were standard, and no natural selection was in operation. Using a previously published backwards simulation method and some newly developed and faster algorithms, we run our single-couple origin model of humanity and compare the results to allele frequency spectra and linkage disequilibrium statistics from current genetic data. We show that a single-couple origin of humanity as recent as 500kya is consistent with data. With only minor modifications of our parsimonious model assumptions, we suggest that a single-couple origin 100kya, or more recently, is possible.

From @Agauger. What do you think?


I’m assuming they didn’t consider the sharing of multiple alleles across primates.


I had a chance to read this and confer with a few people. I’m curious what @glipsnort thinks of it.

It seems to look only at SFS/AFS and LD, both of which are summary statistics that only indirectly test the hypothesis. In essence this replicates work by @glipsnort and by me, but TMR4A is a more powerful and direct way of testing the same thing, coming to essentially the same answer.

Of course, they did not engage the challenge of trans-species variation, nor did they engage the evidence for ancient admixture. These are weaker lines of evidence against a single couple bottleneck, but I don’t think they can be fully dismissed yet.

In summary, not sure what is new in this analysis, and this certainly does not handle all the data. So I don’t think the title is justified.

Perhaps just as important, many theologians are working through whether genetic ancestry even matters or not. Here is what @jack.collins says:

the genetic questions, important as they may be for some purposes, foists a misleading anachronism on the Biblical text.

Here is what @KenKeathley says:

Scripture does not speak of genetics, but it does emphasize genealogy, presenting Adam as the genealogical ancestor of the human race.

Perhaps genetic questions are important, but I don’t think we can take the genetic framing of the question for granted any more. In their future work, it would be important for them to explain why a genetic bottleneck is theologically important.

If genealogical ancestry is most important? Well, then… In that case, we beat them by 494,000 years. :slight_smile:

I love their use of “parsimonious”, as if to imply that a single couple is the simplest hypothesis, and therefore favored, while having two, or three, or 10,000 couples would be a much less simple assumption. Not clear whether having one member of the couple cloned from the rib of the other is even more parsimonious.


I agree. Parsimony is not really a helpful term on this question. I think the best approach is hypothesis testing to figure out what can be ruled out and what cannot be ruled out by the evidence. But are they claiming a couple is most parsimonious?

On the de novo creation side, I don’t think they are saying the first pair was created without parents, at least not in this paper. So I’m not sure we’d want to make that critique. Of course, some of them do in fact think that Adam and Eve were created from scratch, but we should at least take this paper for what it is.

Hi Joshua,

This aspect of their model would seem to be highly counterfactual, if I am understanding correctly. We see natural selection in operation everywhere in the domain of biology, right? It’s no mere assumption, either - we see it today!

I find the reliance on no natural selection to be highly troubling. Moreover, I have never heard anyone in the YEC or OEC camps even contest the existence of natural selection. They often doubt the ability of natural selection to produce speciation, of course, but that’s not the same thing as excluding it entirely from a model.

But perhaps I am misunderstanding. Would any biologists care to enlighten me?



There is no such thing as a minor modification to parsimony. The addition of one extraneous variable is enough to invalidate a model. A drop of poison ruins the stew.

But if a scientist cares to disagree with my understanding, I’m all ears.

What do you all think?



It seems like a reasonable approximation for what they are trying to do. @Joe_Felsenstein can correct me if I am wrong.

Natural selection shouldn’t affect the analysis by much. Seems very reasonable to neglect it. This is more of an exercise in the molecular clock, which on a genome-wide scale should not be overly affected by selection.

I think ancient DNA is sparring with Occam’s Razor. In untangling ancient human history, parsimony is a horrible guide.


Immediate reaction, without having read it in detail: they seem to rely exclusively on the folded allele frequency spectrum, which throws away an important constraint. They should be looking at the derived frequency spectrum. Derived frequencies above 50% constrain the contribution from initial diversity. (Take a look at Fig. 8b, and imagine projecting a 1/f distribution out to the right. Their initial diversity contribution will be far higher than the actual derived frequency spectrum as you get close to f = 100%.)
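To illustrate why folding throws away a constraint, here is a minimal sketch. It assumes the textbook neutral expectation that the number of sites with i derived copies is proportional to 1/i; the sample size and the "mirrored" spectrum are hypothetical and are not taken from the paper. Two very different derived spectra fold to the same minor-allele spectrum, so only the unfolded spectrum can tell them apart.

```python
import numpy as np

def neutral_sfs(n):
    """Expected derived (unfolded) SFS for n chromosomes under the standard
    neutral model: the class with i derived copies has weight proportional to 1/i."""
    i = np.arange(1, n)
    weights = 1.0 / i
    return weights / weights.sum()

def fold(sfs):
    """Fold an unfolded SFS (classes i = 1..n-1) into minor-allele classes."""
    n = len(sfs) + 1
    folded = np.zeros(n // 2)
    for i, p in enumerate(sfs, start=1):
        minor = min(i, n - i)  # minor-allele count for this class
        folded[minor - 1] += p
    return folded

n = 20  # hypothetical sample size
derived = neutral_sfs(n)
# Hypothetical alternative: the same spectrum mirrored, so that most
# variants sit at HIGH derived frequency instead of low.
mirrored = derived[::-1]

same_when_folded = np.allclose(fold(derived), fold(mirrored))
same_when_unfolded = np.allclose(derived, mirrored)
print(same_when_folded, same_when_unfolded)
```

The folded spectra match even though the derived spectra do not, which is the sense in which derived frequencies above 50% carry information that a folded analysis cannot see.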


It’s not uncommon to neglect selection in coalescent models and simulations, although there are certainly times when selection should be (and is) included. John Wakeley has a nice discussion on the subject.

I’m more curious to know how including geographic structure in the model would influence the results.


I agree that this is a misuse of parsimony, which generally isn’t a factor in pop gen studies because, as others have said, selection isn’t the main force to contend with since most loci that are (and can be) used are not under selection anyway. Drift and migration are key, selection isn’t, so parsimony doesn’t really apply. Ann mentions that she collaborated to bring the math expertise that she needed, but I think she should have brought someone in from population genetics too (a field whose complexity is underappreciated by even most of us scientists).

Also apropos: why resurrect this self-published journal for a paper that doesn’t even fit the mission? Quoting from the purpose:

BIO-Complexity is a peer-reviewed scientific journal with a unique goal. It aims to be the leading forum for testing the scientific merit of the claim that intelligent design (ID) is a credible explanation for life. Because questions having to do with the role and origin of information in living systems are at the heart of the scientific controversy over ID, these topics—viewed from all angles and perspectives—are central to the journal’s scope.

This paper has nothing to do with ID or origin of information in living systems.

Quoting from the scope:

Among the topics of interest are: the origin or characterization of complex biological sequences, structures, forms, functions and processes; pre-biotic chemistry and the origin of life; molecular or morphologic phylogenies and phylogenetic methods; new molecular or morphologic data including paleontological data; cladistics and systematics; biomimetic or engineering analyses of biological systems; in vitro and laboratory evolution; evolutionary simulation and computational evolution. Theoretical or mathematical treatments of complexity or information with clear relevance to the journal’s aims are also welcome.

I don’t really see how this paper fits in there, either. I suppose they would claim that this is an evolutionary simulation and/or a test of a “phylogenetic method,” but, um, it’s not.

It’s not exactly a good strategy to build credibility for your in-house journal to publish articles that don’t even really fit. Also, doesn’t really build credibility to HAVE a journal just for publishing your own stuff, and to only publish there instead of in peer-reviewed journals.

And finally:

BIO-Complexity will consider for publication only work that adheres to widely accepted modes of scientific investigation and inference.



An assumption that the sequences are evolving neutrally is necessary if you do calculations based on the mathematics of coalescents (including the “backwards” simulation they used). So this is quite commonly used, but you have to be aware that individual loci might be affected by “selective sweeps” as an allele arises and spreads by natural selection, and also that they may be affected by natural selection maintaining a polymorphism of multiple alleles. So careful examination of conflicting signals from different parts of the genome is in order.
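For readers who have not seen the neutral coalescent, here is a minimal sketch of what the neutrality assumption buys you. It assumes a constant-size diploid population of N individuals (a simplification; the paper lets population size vary), where the waiting time while k lineages remain is exponential with rate k(k-1)/(4N) per generation. The function names and parameter values are hypothetical.

```python
import random

def simulate_tmrca(n, N, rng):
    """Draw one time to the most recent common ancestor (in generations)
    for n sampled lineages under the standard neutral coalescent."""
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / (4.0 * N)  # coalescence rate with k lineages
        t += rng.expovariate(rate)
    return t

def expected_tmrca(n, N):
    """Closed-form expectation: 4N(1 - 1/n) generations."""
    return 4.0 * N * (1.0 - 1.0 / n)

rng = random.Random(1)      # fixed seed so the sketch is repeatable
n, N = 20, 10_000           # hypothetical sample and population sizes
draws = [simulate_tmrca(n, N, rng) for _ in range(20_000)]
mean_tmrca = sum(draws) / len(draws)
print(mean_tmrca, expected_tmrca(n, N))
```

The simulated mean should land close to the closed-form value. A selective sweep or balancing selection at a particular locus breaks the exponential waiting-time assumption, which is why conflicting signals from different parts of the genome deserve a careful look.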


OK, so if the data fits one couple, but also equally well fits 10,000 couples, your use of parsimony makes it absolutely, definitely true that it must be one couple, right?


I think her point is that the genetic evidence studied this way doesn’t tell us either way. That seems to be a valid conclusion.

If the argument is that a single couple should be preferred over 10,000 because of parsimony, I don’t think that argument is valid.


If he wants to make sense, he should say that Adam is a genealogical ancestor rather than the genealogical ancestor. Right?


As far as I can see, Chris Falter’s argument is that parsimony is all-powerful. I think I’ll dissent from that very strongly.


I think you are misreading him. He is questioning @Agauger’s use of parsimony (as he understands she is using it) not endorsing it, right? Or maybe I’m misunderstanding him!


I have a friend and former colleague, Elise Lauterbur, who was interested in seeing how coalescent models perform for species of conservation concern, where effective population size and sample size are often extremely small. She found that coalescent methods give unreliable estimates of genetic diversity (as measured by Watterson’s theta) in these scenarios. She recently published this work on bioRxiv:

Coalescent models at small effective population sizes and population declines are positively misleading

Here’s a sample:

To determine if and how standard coalescent models influence estimates of genetic
diversity in populations with small effective population sizes, this analysis compares
genetic diversity estimates based on coalescent models to genetic diversity calculated
in forward simulation ground-truths at a range of effective population sizes, sample
sizes, and sampling times since a bottleneck. Coalescent models give unreliable
estimates of genetic diversity, as measured by Watterson’s 𝜃, regardless of the
relationship between sample size and effective population size. This occurs particularly
when the population is oversampled with respect to effective population size (sample
size exceeds effective population size), and when sampled soon after a bottleneck.

The profound differences between coalescent estimates of 𝜃_W and forward calculations
of 𝜃_W soon after a bottleneck show that applying standard coalescent models to
bottlenecked populations can give misleading results. Overestimates would have the
effect of fitting null coalescent models with incorrectly long times to the most recent
common ancestor to estimates of 𝜃_W from real data, thus artificially extending estimates
of both bottleneck times and population split times into the past. Underestimates of 𝜃_W
that occur after prolonged small effective population sizes would artificially shorten
estimates of bottleneck times or population split times.

I wonder what, if any, implications these conclusions have for the present discussion.
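For anyone unfamiliar with the statistic, Watterson’s theta is the number of segregating sites S divided by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n-1), where n is the number of sampled sequences. Here is a minimal sketch of that direct calculation on a made-up toy alignment; the haplotypes and function names are hypothetical, and this is not the preprint’s simulation pipeline.

```python
def harmonic_sum(n):
    """a_n = sum of 1/i for i = 1..n-1, the denominator of Watterson's estimator."""
    return sum(1.0 / i for i in range(1, n))

def count_segregating_sites(haplotypes):
    """Count alignment columns where the sample carries more than one allele."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)

def wattersons_theta(haplotypes):
    """Watterson's estimator for the whole locus: S / a_n."""
    n = len(haplotypes)
    return count_segregating_sites(haplotypes) / harmonic_sum(n)

# Hypothetical toy sample: 4 haplotypes, 5 sites, 0 = ancestral, 1 = derived.
toy_sample = ["00100", "01100", "00001", "01000"]
S = count_segregating_sites(toy_sample)
theta_w = wattersons_theta(toy_sample)
print(S, theta_w)
```

Because the estimator depends on the sample only through S and n, anything that distorts the expected number of segregating sites for a given sample size feeds straight into the estimate.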


But it’s a bad way of looking at the genetic evidence.