It’s certainly true that genetics is modeling much more complex systems than is typical for physics, and genetics models are correspondingly less precise. Even so, your statement seems a little sweeping. QCD was considered a viable and valuable model for decades before it could be used to calculate something as basic as the mass of the proton to within an order of magnitude, wasn’t it?
You seem to have confused two different sigmas. The sigma you’re talking about is the sample standard deviation, while the relevant sigma is the standard error on the mean, which is sigma/sqrt(N), where N seems to be about 10 million. In HEP terms, it’s analogous to reconstructing 10 million decays of a broad resonance and asking how confident you are that you haven’t gotten the rest mass wrong by a factor of five. In reality, this case is a little different, since we really want to know the maximum true age of four non-coalesced lineages, since that sets a limit on when a two-person bottleneck could have occurred. As I recall, @swamidass used the median of the estimated ages, which seems a conservative choice.
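The distinction is easy to see numerically. Here is a minimal sketch (not the actual genomic data; the distribution shape and scale are hypothetical stand-ins) showing how the standard error on the mean relates to the sample standard deviation for N on the order of 10 million:

```python
import numpy as np

# Toy sketch only: draw a broad, skewed sample to stand in for a wide
# distribution of per-segment age estimates. Shape/scale are hypothetical.
rng = np.random.default_rng(0)
N = 10_000_000
sample = rng.gamma(shape=2.0, scale=250_000, size=N)

sd = sample.std(ddof=1)      # sample standard deviation: the width sigma
sem = sd / np.sqrt(N)        # standard error on the mean: sigma / sqrt(N)

print(f"sample SD ~ {sd:,.0f}")
print(f"SEM       ~ {sem:,.0f} (smaller by sqrt(N) ~ {np.sqrt(N):.0f})")
```

With N this large, the uncertainty on the mean is roughly three thousand times smaller than the width of the distribution itself, which is the point at issue.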
Note that the model of heterozygosity in the Mouflon sheep study assumed neutral evolution, which may well be wrong in this case. By contrast, the results that @swamidass describes make no assumption about neutrality, and in fact the researchers report finding loci with both purifying and directional selection. The primary assumption in those results is that the mutation rate has been more or less constant, something that we have multiple reasons for thinking must be true.
I’m not confusing sigmas. You cannot look at just the sigma on the mean; that is not the relevant thing here. You have to look at the sigma of the width, which gives the range of reasonable values. Again, if I’m interpreting things correctly, the statistical analysis used would never hold up in a physics master’s thesis. I’m not trying to be demeaning or antagonistic, just observing the analysis.
You are confusing perturbative QCD with non-perturbative QCD. Perturbative QCD has made mathematical predictions from the beginning. Non-perturbative QCD is a low-energy approximation, and we know it is only an approximation, which actually fits my previous statement precisely. It has been continually refined, with complexity added, because the early versions were not very accurate.
Sorry, but as a physicist I can’t make sense of this paragraph. Does the Mouflon sheep data support or conflict with the models used to make predictions about genetic diversity and the size of the human bottleneck? I’ve heard that the Mouflon sheep data differs from the predictions made by those models.
But you’re not interpreting things correctly. The values in @swamidass’s histogram are not independent estimates of the same quantity. They are estimates of the time to 4 lineages for different segments of the genome, each of which has its own history and its own time to that coalescence point. The distribution of those values has a very large variance, so there is indeed a large intrinsic width to this distribution. What we want to know – what sets the limit on the time to a 2-person bottleneck – is the true upper edge of that distribution, which has no obvious connection to what you’re calculating. Even if there were no intrinsic width, though, and these were all independent, noisy measurements of a single age of the genome, your use of sigma here would still be incorrect: the uncertainty on the mean would tell you the uncertainty on the time to the bottleneck, whereas the standard deviation on the pictured distribution would represent the uncertainty in any given measurement.
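The two situations behave very differently as data accumulate, which a toy simulation (again, hypothetical parameters, not the real genomic analysis) makes concrete: adding more segments shrinks the uncertainty on the mean, but it does not shrink the intrinsic width of the distribution or move its upper edge:

```python
import numpy as np

# Toy sketch: each "segment" has its own true coalescence time drawn from a
# broad distribution. More segments tighten the mean estimate, but the
# distribution's width and upper tail stay put. Parameters are hypothetical.
rng = np.random.default_rng(1)

for n_segments in (1_000, 100_000):
    times = rng.gamma(shape=2.0, scale=250_000, size=n_segments)
    sd = times.std(ddof=1)
    sem = sd / np.sqrt(n_segments)
    p95 = np.percentile(times, 95)  # proxy for the upper edge of the spread
    print(f"n={n_segments:>7}: SD ~ {sd:,.0f}, SEM ~ {sem:,.0f}, "
          f"95th pct ~ {p95:,.0f}")
```

Between the two runs the SD and the 95th percentile barely move, while the SEM drops by a factor of ten; only the latter is what shrinks with N, and it is not the quantity that bounds the bottleneck time.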
Don’t assume, by the way, that physicists are necessarily more competent at statistics than biologists. I’ve been both a high energy physicist and a geneticist, and I can assure you that good population geneticists know at least as much about statistics as experimental physicists, and some of them quite a bit more.
I didn’t confuse perturbative and non-perturbative QCD – I just said QCD, i.e. the full model. Perturbative QCD is an approximation of QCD that works well at high energies, while other approximations have to be used for the non-perturbative regime, including for calculating hadron masses. The latter could not be calculated until quite recently, using any approximation. The earliest reasonable calculation that I can find of the proton mass was in 2008. None of this affects my (quite unimportant) point, which is that physicists too are willing to accept models that can’t estimate some things accurately if there’s nothing better available. Anyway, this is a tangent and not worth pursuing.
The Mouflon sheep study is orthogonal to the methods that we’re discussing for setting limits on the timing of a tight human bottleneck. The model used in the sheep study, the one that was in conflict with observation, makes much stronger assumptions.
@MStrauss I’m sorry I have not been able to participate in this conversation; I’m currently out of pocket. I want to comment on a couple of things.
@glipsnort is a highly competent and thoughtful scientist in this area, who has consistently demonstrated himself honest with the evidence. I suggest engaging most directly with him.
You mention difficulty understanding the validity of the models involved. I’m happy to explain them to you sometime, so you can make sense of it. This really isn’t that complex, but it has often been explained in very poor ways.
You are looking at Yadam and Meve as a potential bottleneck. The problem, however, is that there is much more information to consider, such as the other 95% of the genome. That changes the conclusion substantially. Until we engage with that other 95% of the data, we can’t be certain of much from looking only at Yadam and Meve.