John Harshman: Bottlenecks and Trans-Species Variation

Yes, I do see your point. Keep in mind that this is not in service of my personal view. I’m just trying to be honest about what the data shows.

What is interesting about this is that if we can’t rule out a bottleneck of two (and we can’t for anything more ancient than about 500 kya), we certainly can’t rule out a bottleneck of 10 or 20. That seems to be an important and overlooked point. Once again, tight bottlenecks might have been important in speciation of our lineage. That has been a live possibility at times, but then fell out of favor because of the genetic evidence. Perhaps some of those hypotheses deserve another look. A misunderstanding of the genetic evidence might have prematurely foreclosed them.

I’m sure we can rule out that hypothesis without need to send a probe. That may not be the case for a single couple bottleneck. It is just an issue of honesty. We shouldn’t say there is evidence against it if we can’t produce said evidence. Of course, there is evidence against it more recent than 500 kya, but before that point it becomes equivocal, at least for now.

You are missing the fact that I’m referring not to A-group alleles but to B-group alleles. Based solely on topology, there are at least 5 B-group alleles predating the split but only at least 1 A-group allele.

Incidentally, and irrelevant to my point, shouldn’t balancing selection result in reduced rather than increased branch lengths? One might get accelerated evolution early on but purifying selection after that. And shouldn’t recombination also reduce the apparent age of divergence rather than increasing it?

Sorry for the late reply, but I was apparently not allowed to speak for 24 hours.

Incidentally, why is a single couple the hypothesis of interest? If Eve was created from Adam’s rib, shouldn’t there be a single individual bottleneck? One may presume that the process, most simply, would have involved the slight engineering of Adam’s genome to excise the Y chromosome and duplicate the X.

One might also ask whether it’s possible to examine the Flood scenario, in which there was a complicated bottleneck: One Y chromosome, four X chromosomes, three mitochondrial genomes, and a possibility of 10 of each autosome. Come to think of it, an Adam & Eve scenario, of any sort, would also involve different bottlenecks for autosomes, sex chromosomes, and mitochondria.


There are two such mechanisms: founder flush and Mayr’s peripatric speciation. The latter seems unsupportable based on its counter-to-fact assumptions (the need for a genetic revolution due to coadapted gene complexes), and the former has not much in the way of observation to support it. On the other hand, there is much evidence favoring the plain vanilla version of allopatric speciation. Have you read Coyne & Orr’s book Speciation?


Ah, I see what you are saying. I might have made an error there. I’ll have to go back, and I want to confirm with a colleague that the A and B group alleles are the same locus. If that is the case, I owe you beer next time I’m in northern California.

Still, because these are exons, it remains unclear how much of this is convergent evolution. It is possible that exposure to the same pathogens causes alleles to evolve independently much more recently. There seems to be strong evidence of convergent evolution, and this analysis did not limit itself to neutral mutations.

I agree that a careful analysis of this data could eventually falsify a bottleneck. I also agree that I might have to revise one of my summaries of the literature. It looks like I might have made an error. However, I do not see the published analyses, as they stand, as solid and settled evidence against a bottleneck. At some point, someone needs to do some careful work to hash this out. I’m up for doing it too, as long as I have the right collaborators in place to do it really well.

Why Not Clones?

Turns out to be a straw man to say they are a clone, for two reasons.

  1. Buggs, who brought the question, made his point against the backdrop of common descent, and as a “bottleneck” of a couple.

  2. Both AIG and RTB are actively considering a genetic mosaic model, where Adam and Eve each have different genomes in each gamete.

So for the original question #1, Adam and Eve are not homozygous clones. For the other proposals #2, there is a massive loophole that some are trying to take. This is not a panacea, because they have to start specifying details of how many children Adam and Eve have, and I’m not sure the YEC model can work, but perhaps the RTB model can.

Of course, we can use the same methodology to look for TMR2A, though no one seems to care about this. It is a straw man argument, so I’d rather not contribute to confusion by giving it any air time.

Dreaming of Noah

This, however, brings us to the elephant in the room:

Exactly. I looked at this partly here, by looking at the TMR10A time…

This ignores the interbreeding between Neanderthals and Sapiens, which is established by independent means. The analysis is a bit complex, but it does seem to solidly rule out most of the current crop of YEC models, which want there to be a bottleneck of 5 with Noah, and can’t really plausibly make them genetic mosaics.

So What Could Work?

However, all they have to do is posit that (1) the flood was not global, just regional, and (2) God made people outside the Garden, and they have a solution. The major organizations are not going to do this for largely political reasons (not based in Scripture or theology), but it is possible some unaffiliated YECs might take that path. It is possible RTB could take this path, or a variation of it, and this would rescue their model.

Even if the A and B groups aren’t the same locus (and I’m pretty sure they are), there are 5 alleles in the B group alone.

I don’t see how convergence is relevant here. Is it your claim that the sole chimp B-group allele might be convergent on one of the human alleles, thus falsely appearing as its sister? That seems a stretch.

I don’t think the clonal Eve scenario is a strawman either. It just takes the genetic implications of a literal reading of the text. Now I can see why none of the YECs want to own it, as it’s just plain bizarre, but no other scenario fits biblical literalism. Other scenarios take extra miracles not suggested by the text, millions of such miracles in the case of the diverse gamete population. And of course the bottleneck of a previous population is right out, biblically. I don’t see how a YEC with any integrity could avoid the clonal Eve scenario.

What is TMR4A? Is that the X chromosome flood scenario?

I’d also say that the reasons for not adopting a local flood or people outside the garden would be that neither is compatible with Genesis, unless you torture the text.

It is a straw man if it does not fit their model. Their model does not appear falsified by the evidence.

TMR4A is about 500 kya. X chromosome I have not looked at, but we can presume about 120 kya (based on reduced divergence of X).

You might have found an error. I need to look at it more closely and get it right. Give me some time to look at the data, consult with some others, and settle a few details in my mind. I do not remember seeing 5 alleles in the B group.

The convergence is a real possibility. If most of the mutations were neutral, we could ignore it; however, at this locus most mutations are not neutral. Also, the evidence for convergence is strong, showing that the same alleles appear to be arising by multiple mutational pathways. The high number of non-neutral mutations means that we do not know how much of the similarity is due to common function (independently evolved) instead of common history.

I’m just echoing the 1998 paper that found Ayala to be in error. Unless you account for convergence here, you can’t make claims about trans-species variation. I’m also pointing to a way around the problem, to look at the introns. No one has done this yet (alongside non-human sequences). That seems to be the real way to solve this puzzle. At the very least we have a disagreement in the literature about how to interpret these results. Trans-species variation (after accounting for convergence) does not appear in the literature as a settled finding.

Someone needs to go sort this out with the right analysis, and also some careful simulation. That is not, as you point out, an easy study. Maybe it will falsify a bottleneck. Until then, we should be clear about the limits of settled science here.

I’m questioning the validity of those models for biblical literalists. If their models are un-biblical, they’d be some kind of converse of strawman models: picking a model that isn’t what they should logically be proposing because it’s harder to attack. Perhaps a brickman?

Why wouldn’t 4-fold degenerate sites do as well?

This, also, many of us would strongly dispute, including myself. There is strong evidence, both from historical interpretations and also from the text itself, that Genesis suggests there are people outside the garden.

Unless you are a biblical literalist, I’m not sure your opinion is determinative. Do you have in-depth knowledge of traditional theology and Biblical Hebrew? If not, you might be overly reliant on English versions of Genesis and cartoons you’ve heard from YEC polemicists. We’ve been engaging with more thoughtful YECs and OECs. The real test is what they end up thinking about this.

Turns out that many of them find it helpful. It could give us a new way forward. If your goal is to promote good science, this is good news. If your goal is to fight against religion, then it might be bad news.

I don’t see why one has to be a biblical literalist to understand what a literal interpretation would be. Is it your claim that the English translations are so faulty as to render their use unhelpful? Of course I don’t know biblical Hebrew, but one presumes the translators did. I would certainly be interested in seeing a thoughtful YEC defense of a local flood or people outside the garden. And I would even more like to see a thoughtful reason for rejecting the clonal Eve. (I mean from a YEC biblical literalist. I reject Genesis 1-2 as having any basis in reality.)

Because this can arise by selecting for diversity.

Balancing selection selects for diversity, so fixation rate will be faster than the mutation rate. HLA selection is driven by pathogen exposure, which is very unlikely to be a constant landscape over millions of years, let alone thousands of years. With a constantly changing fitness landscape, a shift to purifying selection is not likely.

More importantly, we can directly test this by looking at Ka to Ks ratios. This locus has among the highest ratios in the whole genome. Unlike at most other loci, most of the variants here change the protein sequence. Looking at neutral mutations alone, the clock is running much, much more slowly. Even then, because of mutational clusters and neutral draft, we know that the fixation rate for neutral mutations here is higher than the mutation rate.

Also, we know that most of these sequences do not follow a nested clade pattern because of convergent evolution and/or gene conversion. So there are a high number of discordant mutations, each of which artificially increases the apparent age of lineages.

Basically, most of the usually safe assumptions of the molecular clock are blatantly violated at this locus. There are several lines of evidence, including more not mentioned here, from which we can be reasonably sure that balancing selection makes alleles appear older.

Biblical questions I’ll answer tomorrow.

You will have to explain this. Why would 4-fold degenerate sites be selected for diversity?

Possibly, though it would seem to me that the main effect there would be formation of new alleles, not changes in existing alleles. The Ka/Ks could be driven by early changes. And again, why not look at silent changes alone?

Remember that mutations are not IID, but come in clusters. So there is a lot of neutral draft in this region because there is a lot of selection for diversity.

Nonetheless, we can estimate this with some high level statistics, if we neglect draft. Let’s say the Ka/Ks ratio is, for example, about 6 (as it is for some HLA loci like this one). That means that using the neutral changes (corresponding to Ks), the lineage age will be computed as roughly 1/6 to 1/7 of the age obtained when the whole sequence is used. I’m sure there is a better formula for this, but the point is that this certainly drops the lineage ages to well after the human/chimp divergence.
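As a rough sketch of that arithmetic (not a formula from any of the papers discussed), the Python below rescales an age computed from all sites to one based on synonymous sites alone. It assumes synonymous sites tick at the neutral rate, nonsynonymous sites tick Ka/Ks times faster, and draft is neglected. The function name and the `nonsyn_fraction` parameter are hypothetical, introduced only for illustration.

```python
def neutral_only_age(whole_sequence_age, ka_ks, nonsyn_fraction=1.0):
    """Rescale an age estimated from all sites to a synonymous-sites-only age.

    Assumptions (illustrative, not taken from the thread):
      - synonymous sites evolve at the neutral rate,
      - nonsynonymous sites evolve ka_ks times faster,
      - nonsyn_fraction of the compared sites are nonsynonymous,
      - hitchhiking / neutral draft is ignored.
    """
    if ka_ks <= 0:
        raise ValueError("ka_ks must be positive")
    if not 0.0 <= nonsyn_fraction <= 1.0:
        raise ValueError("nonsyn_fraction must lie in [0, 1]")
    # Average rate over all sites, in units of the neutral rate.
    inflation = nonsyn_fraction * ka_ks + (1.0 - nonsyn_fraction)
    return whole_sequence_age / inflation


# Age expressed as a fraction of the whole-sequence estimate (age = 1.0).
all_nonsyn = neutral_only_age(1.0, ka_ks=6.0)
mixed_sites = neutral_only_age(1.0, ka_ks=6.0, nonsyn_fraction=0.75)
print(f"all sites nonsynonymous: {all_nonsyn:.3f}")
print(f"75% nonsynonymous (assumed): {mixed_sites:.3f}")
```

With every site treated as nonsynonymous, a Ka/Ks of 6 gives 1/6 of the whole-sequence age. If only about three quarters of the sites are nonsynonymous (an assumed figure), the reduction is smaller, about 1/4.75. The actual Ka/Ks for the dataset would be needed to go beyond this.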

Take into account the increased rate of the clock in the context of neutral draft, and it will drop further. I am not sure we can estimate how much yet, but perhaps we can. I’m not sure we’ve settled the precise distributions of mutational clusters yet, as this is still very early days in de novo mutation sequencing. Regardless, the fact that the Ka/Ks ratios are greater than 1 is a strong indicator that using neutral sequences will substantially decrease the age.

Neutral draft on mutational clusters. Also there is a complex relationship between degeneracy and phylogeny. As you know, what is 4-fold degenerate now, might not have been 4-fold degenerate before an adjacent mutation. With balancing selection, the missense mutations are ticking at a high rate, so it is not plausible to assert these 4-fold degenerate sites are actually 4-fold degenerate over 10s of millions of years.
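To illustrate the point about degeneracy depending on neighboring bases, here is a minimal Python sketch using the standard genetic code. The helper name is hypothetical, and the example codons were chosen only for illustration; they are not taken from any HLA sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def third_position_is_fourfold(codon):
    """True if all four bases at the third position give the same amino acid."""
    prefix = codon[:2].upper()
    return len({CODON_TABLE[prefix + base] for base in BASES}) == 1


# CTT (Leu): any third base still encodes Leu, so the site is 4-fold degenerate.
print(third_position_is_fourfold("CTT"))
# After a first-position change to ATT (Ile), ATG encodes Met, so it no longer is.
print(third_position_is_fourfold("ATT"))
```

A site scored as 4-fold degenerate in a present-day codon therefore need not have been one before a missense change at the first or second position of the same codon.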

Remember, most of the assumptions we make for molecular clocks are not valid in sequences facing balancing selection. They are a rare class of sequences that follow different evolutionary rules. They are rare, so their evolution seems less settled in the literature. It seems to be an open (and possibly unresolvable) question how to calibrate the molecular clock in these regions.

Our best bet might be in looking at ancient DNA…

Look at the reference here to John Sanford’s work: Sanford and Carter: Allele Frequencies and a YEC Adam and Eve.

Though I hesitate to include a reference to AIG, look at what they write about it here: which is connected to the Designed Diversity hypothesis:

Basically no YEC scientist I know of claims Adam and Eve were clones, and many have been protesting that their model is being misrepresented. I agree with them here. There is no reason to misrepresent them anyway. The genetic evidence rules out their position either way. There is no defensible reason to strawman them.

When you say “neutral draft” are you referring to hitchhiking? But wouldn’t there also be some hitchhiking in introns too, assuming that selection on the exon sequence is strong enough?

More importantly, my main argument rests not on branch length but on topology. For that purpose, introns would be nice purely because they would provide a source of more neutrally evolving sites to improve the phylogeny. It would also be very helpful to include sequences from many more primates; the claimed reason for group A is that chimps have lost most of their ancient alleles. Even if other apes also have lost most of their alleles, we would expect them to have retained a few different ones. If group A is indeed old as claimed, some of those retained alleles should pop up within it. Any human allele with an ape sister group, and any descended from a node ancestral to that human allele, would have to predate the human-chimp split.

Does it? If I recall the supposed age of the first divergence within group A was 29ma, so even 1/7 of that is getting close to the divergence. And are we allowing Adam to be several million years old, presumably an australopithecine? I thought he was supposed to be at least “archaic Homo sapiens”.
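For concreteness, the short Python sketch below works out that division. The 29 Ma figure is the recalled value quoted above, not a checked number, and the scaling factors 6 and 7 come from the rough 1/6 to 1/7 estimate earlier in the thread. The function name is hypothetical.

```python
def rescaled_ages(whole_sequence_age_ma, factors):
    """Divide a whole-sequence age (in Ma) by each candidate scaling factor."""
    return {factor: whole_sequence_age_ma / factor for factor in factors}


RECALLED_AGE_MA = 29.0  # recalled first divergence within group A; unverified

ages = rescaled_ages(RECALLED_AGE_MA, factors=(6, 7))
for factor, age in ages.items():
    print(f"1/{factor} of {RECALLED_AGE_MA:g} Ma = {age:.2f} Ma")
```

This gives roughly 4.8 Ma and 4.1 Ma, which are the values to set against whatever date one accepts for the human-chimp split.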

Sure, but that’s both a fairly uncommon event and one that could be detected by the addition of more sequences.

Don’t see it. Ancient human DNA, including neandertals and denisovans, won’t be old enough. Ancient non-human DNA is so far not in evidence, nor is any DNA 5 million or more years old.

Yes, I mean hitchhiking, and that is a real challenge to dating by a single means. At least it will be diminished in the introns. More than just the molecular clock age, we have to see if there are shared lineages between multiple species.

Yeah I agree.

The reason introns would be valuable is that convergent evolution is not likely in play, so that removes an important confounder. Hitchhiking or neutral draft (whatever we want to call it) is still a problem, but the topology of the tree can provide independent verification of ancient lineages. This seems to be the way to go.

This is just a very rough approximation. We would have to look at the Ka/Ks for this specific sequence dataset, which can range all the way up to greater than 10 in some cases. That was not reported here, so we are just trying to get a sense of the magnitude of the effect. This shows it is a very large magnitude.

Moreover, this is just one independent source of inflated age. There are additional sources too, all of which work together. All of them have to be accounted for to be sure what is going on. Honestly, the way these papers are typically written, lineage age is probably better understood as “time-scaled divergence”, because the clock being used is not calibrated to take into account these issues.

You could do the analysis, but it seems the intron study would be far more direct and less prone to critique. Regardless, neither approach has been published in the literature, so we are only speaking speculatively here.

Once again, I am speaking speculatively. It might be possible to study the rate of the clock by looking at ancient genomes. It might require a large sample of ancient genomes. But it might be possible that a rate might be visible. One pitfall here is that the rate might be highly variable on short time scales. Because it is driven by selection to pathogens, I’m not sure at what time scale the stochasticity would even out into a stable rate.

This leaves the intron study as the most direct test, as I suggested from the very beginning.

Urk. No wonder you hesitate to quote AIG. No justification, either textual or scientific or logical, is offered, just a set of claims that one is apparently supposed to believe because AIG says so. I haven’t checked out Sanford so far; one hopes for something better.

I’m not trying to represent their model. I’m trying to point out that their model is inconsistent with the actual story, and that story is supposedly the source of the model. Even YECs are mangling the text in order to present a story that better fits the known data. That’s a brickman argument ™.


If you want to be more convincing, try Steelman Instead of Strawman. Your approach seems like it would provoke a lot of unnecessary conflict.

More convincing to whom? “Steelman” doesn’t fit the 3 little pigs analogy. The brickman is constructed because it stands up to the huffing and puffing of reality. Anyway, aren’t these scenarios chosen because they fit the facts a little better than straight adherence to Genesis 2?