Explaining the shape of a typical COVID 19 epidemic curve

That doesn’t actually show that. It says bottlenecks might be common, which can mean quite a range of values significantly larger than 1. Reference 7 doesn’t seem to show the typical number of viral particles per droplet (as it’s a paper on modeling with ranges of parameters and conditions), but that’s not even the relevant measure anyway, since a person can come into contact with many droplets over a short time: one person can breathe many out and another can breathe many in, for example when stuck together in a bus, crowded restaurant, or other large gathering for minutes to hours.

The fact remains that with a replication error rate of ~1 mutation per genome per replication and a strong bottleneck whereby no more than 2 to 10 viral particles are transmitted from one host to the other, genetic entropy of SARS-CoV-2 is certain to occur quite rapidly.

You just don’t have any evidence that these values obtain, nor that Sanford’s distribution of fitness effects of mutations does.

Please show your work. Don’t forget to include the effects of purifying selection in both source and target, the fraction of mutants that are viable (and how you determined this), the fraction that are deleterious (ditto), and the effect of back mutations in the target.

It’s not my work, it’s Sanford’s.
You will find answers to most of your questions in the mat&meth section of his article below:

Let’s look at Sanford’s numbers. A key bit of his paper is this: “We model 10% of all mutations as being perfectly neutral, with the remainder of mutations being 99% deleterious and 1% beneficial [35] […] We use the well-accepted Weibull distribution for mutation effects (a natural, exponential-type distribution [26]). In this type of distribution, low-impact mutations are much more abundant than high-impact mutations.”

Reference 35 is to this paper. In that paper, the authors report on 91 synthetic mutants of an RNA virus. They found 24 produced no virus and can be presumed lethal. If we ignore those, 31 (46%) had no statistically significant effect on fitness, 32 (48%) were deleterious, and 4 (6%) were beneficial. If we assume (without justification) that all mutants with lower measured fitness were deleterious, even if the value was not statistically significant, we have 76% deleterious and 18% neutral. They also found that the distribution of fitness effects of detectably non-lethal deleterious mutations had a longer tail – more highly deleterious mutations – than could be fit well with an exponential-type distribution. Mean beneficial effect was 1%; mean/median deleterious effect (for non-lethals) was -13.9%/-9.2%.

So let’s compare the Sanford model with the empirical results from the source they cite. Fraction neutral: 10% (model) vs 18-34% (empirical); fraction beneficial: 0.9% (model) vs 4-6% (empirical); fraction deleterious: 89% (model) vs 46-76% (empirical); size of beneficial effect: maximum of 1% (model) vs mean of 1% (empirical). Sanford’s paper doesn’t give the mean fitness effect of their deleterious mutations, but does report that 10% had an effect > 10%. The empirical median value for deleterious mutations (assuming all of the non-significant ones were actually deleterious) is just under 10% (9.2%), which means that Sanford’s distribution is weighted far more toward mutations of small effect than the empirical values, i.e. weighted toward mutations that can accumulate rather than be purged quickly by selection.
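For anyone who wants to check the arithmetic behind this comparison, here is a minimal Python sketch. It uses only the counts and percentages quoted in this thread (91 mutants, 24 lethal, 31 with no significant effect, 32 deleterious, 4 beneficial, and the 10% / 99% / 1% split in Sanford's model). The function and variable names are made up for illustration, and the empirical fractions here cover only the "ignore the lethals, count only significant effects" case.

```python
# Counts for the reference-35 study, as quoted in this thread.
TOTAL_MUTANTS = 91
LETHAL = 24
NO_SIGNIFICANT_EFFECT = 31
DELETERIOUS = 32
BENEFICIAL = 4


def sanford_model_fractions(neutral=0.10, beneficial_share=0.01):
    """Fractions of all mutations in the model as quoted above.

    `neutral` is the perfectly neutral fraction; the remainder is split
    into beneficial (`beneficial_share`) and deleterious (the rest).
    """
    non_neutral = 1.0 - neutral
    return {
        "neutral": neutral,
        "beneficial": non_neutral * beneficial_share,
        "deleterious": non_neutral * (1.0 - beneficial_share),
    }


def empirical_fractions_excluding_lethals():
    """Fractions among viable mutants, counting only significant effects."""
    viable = TOTAL_MUTANTS - LETHAL
    return {
        "neutral": NO_SIGNIFICANT_EFFECT / viable,
        "beneficial": BENEFICIAL / viable,
        "deleterious": DELETERIOUS / viable,
    }


if __name__ == "__main__":
    for label, fractions in (
        ("model", sanford_model_fractions()),
        ("empirical (non-lethal)", empirical_fractions_excluding_lethals()),
    ):
        summary = ", ".join(f"{k}: {v:.1%}" for k, v in fractions.items())
        print(f"{label}: {summary}")
```

The model side works out to roughly 89% deleterious and 0.9% beneficial, which is where the figures in the comparison come from.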

In summary, Sanford and colleagues took (and cited) a study of empirical values for the very thing they’re modeling, and then discarded every single one of the values and replaced it with one that suited their thesis. I’m not going to bother looking at the rest of their model, and I think I’ll skip characterizing the quality of this effort so I don’t get banned from PS.


Even assuming your corrected values, I don’t see how you could avoid genetic entropy. In particular, I don’t see how shifting the distribution of deleterious mutations toward higher effect mutations can help you, quite the contrary.
And you also have to consider that fitness decline of RNA viruses due to Muller’s ratchet has been experimentally documented.

Once again: show your work. This time with real values.

The whole point of Sanford’s ‘genetic entropy’ is that it’s caused by deleterious mutations of small effect, since mutations of large effect are easily removed.

No one disputes that Muller’s ratchet can occur, especially under artificially tight bottlenecks in the lab, or that RNA viruses have a mutation rate that is so high that they are close to the edge of failure. In fact, it’s thought that coronaviruses can only afford their relatively large genome because they have a relatively low mutation rate. The claim, though, is that RNA viruses must and do degrade during ordinary transmission. So far you’ve offered nothing to support that claim.


I would bet that the bottlenecks in the real Covid19 epidemic are tighter than the artificial one used in the experiment I referred you to.

So you are acknowledging up front you are making a wild guess?

Not really. I’ve offered a reference that supports the idea of very strong bottlenecks in the Covid19 epidemic.

That isn’t evidence @Giltil. Now might be a good time to wind down this conversation. Let’s see how Sanford responds.

I am really looking forward to John’s answers, and I am grateful to you for taking the initiative to contact him on these issues of exceptional interest at this time of epidemic.

So how would you characterize his response?


Well, since @swamidass doesn’t seem to be chiming in, I’ll say that Sanford did respond by email. He declined to explain why he cited a source that gave quite different estimates than the ones he used, and he did not offer a justification for the numbers he did use.


That is an accurate summary.


You’ve asked JS the 3 following questions:

1. Responding to @glipsnort, how does he reconcile his choice of the parameters in his study with reference 35?
2. Does he believe that SARS-CoV-2 is being attenuated by the information loss mechanism he explains in this paper? What evidence can he point to that shows this is the case?
3. How did this new virus come to be? Why is it so effective if new functions cannot evolve? If evolution can only degrade functions, and it does so rapidly, shouldn’t this virus have gone extinct thousands of years ago?

Given that it seems that John has answered you, could you tell us how he has responded to these 3 points?

I don’t have permission to make his email public. It, unfortunately, does not engage these questions.

Sad to see he declined to explain anything. Meanwhile, just to reiterate something I posted earlier in the thread, and that others have mentioned: the shape of the curve that prompted Gil to start this thread has a perfectly good explanation in the widely used SIR model. There are many good introductions to this model, and to the curves that result from it, on the internet. Even just searching “SIR model” on YouTube gives lots of good explanations.
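To make the SIR point concrete, here is a minimal sketch of the standard SIR equations (dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I) integrated with a simple Euler step. The parameter values (beta = 0.3 per day, gamma = 0.1 per day, one initial case per million) are illustrative assumptions, not estimates for COVID-19, and the function name is made up for this example.

```python
def simulate_sir(beta, gamma, i0, days, dt=0.1):
    """Euler integration of the SIR model on population fractions.

    beta  : transmission rate per day (assumed value, for illustration)
    gamma : recovery rate per day (assumed value, for illustration)
    i0    : initial infected fraction of the population
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


if __name__ == "__main__":
    history = simulate_sir(beta=0.3, gamma=0.1, i0=1e-6, days=300)
    peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
    print(f"peak infected fraction {peak_infected:.3f} at day {peak_time:.0f}")
    print(f"infected fraction at day 300: {history[-1][2]:.2e}")
```

The infected fraction rises roughly exponentially, peaks once the susceptible fraction drops below gamma/beta, and then declines. No change in the virus itself is needed to get that rise-and-fall shape.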
