Does Jeanson's model match "the population growth data from 1000 BC to the present"?

Tim · June 30, 2022, 3:56pm

Given that this question is outside the ambit of the original topic, I thought I’d create a new one for it.

Given that Valerie did not respond to my question over where Jeanson makes this claim, I tried to track it down for myself, and came up with the following:

One problem is that this paper does not claim to match “the population growth data from 1000 BC to the present”:

However, it failed to capture the shape of population growth pre-A.D. 1150 (fig. 1; Supplemental table 1). In other words, the Y chromosome curve based on the Evo root captured only 27% of the history of global male population growth (fig. 1).

fig 1:

Another problem would appear to be that, post 1150 AD, both curves appear to approximate exponential growth curves. It’s been a long time since I’ve played with the underlying maths – but wouldn’t that mean that the two curves will take the same shape (assuming that the scales for each curve are chosen correctly), even if they are both growing exponentially for completely unrelated reasons?

I’d be interested to get comments both on this point, and on the validity of the paper in general.

And if this is the wrong paper, I’d appreciate a link to the right one.

John_Harshman · June 30, 2022, 5:17pm

Let’s just start by noting that any attempt to determine past population size purely by looking at the number of branches in various slices through a tree is based on a gross misunderstanding of just what trees are and how coalescence works. There are ways to use current diversity patterns to estimate past population history, but Jeanson didn’t use any of them.

Nesslig20 · June 30, 2022, 5:17pm

Both curves are - at some point - exponential, but they have no causal relationship to each other. Jeanson thinks a larger population size would lead to more Y-chromosome lineages, hence why he attempts to match these curves, but that’s not at all the case.

@dsterncardinale explains it in more detail here:

thoughtful · June 30, 2022, 7:25pm

Got sick last night so went to bed early and didn’t take time to respond. Thanks for finding it. Taking it easy today so gives me some time to re-read the paper. To be honest I skimmed the end - a lot of info…

From what I can tell, the point of this figure is to show that even the evolutionary root captures population growth at a YEC time scale, but as is pointed out in the part you quoted, it fails to capture the pre-1150 growth so Jeanson is saying the evolutionary root is wrong.

Thus, for the mismatch between the Evo root-based Y chromosome population growth curve and the historical population growth curve, the lack of plausible counter-explanations suggested that the mismatch represented real error—that the Evo root was not the actual root.

I checked the book, and the root that Jeanson settled on was the Epsilon root with a few modifications, so something close to this one (fig 4)

I don’t understand the question as they are both approximating population growth - how could the reasons be unrelated?

That’s really a misunderstanding of what he’s doing.

It will, because there are branches every generation.

John_Harshman · June 30, 2022, 10:13pm

Please explain what he’s doing, then. What’s wrong with my understanding?

And branches go extinct in every generation too. Did Jeanson account for that? Do you know what coalescence is?

dsterncardinale · June 30, 2022, 10:23pm

Credit where due, that’s a shorter version of what @Herman_Mays walked through in our conversation on Jeanson’s methodology.

Tim · July 1, 2022, 8:45am

There are multiple problems with this:

This chart does not show a good fit after 1400 AD.
Jeanson’s chart notwithstanding, there was actually a dip in population in the 14th century (due to the Black Death). The fact that his “Global Population Size (Male)” does not contain this dip suggests that his data is not remotely detailed (as we probably could have surmised from his smooth exponential curves). The fact that his “Y Chromosome Lineages” does not predict this dip suggests that it is probably not a particularly good predictor of population.

How do you know that they’re “both approximating population growth”? Because Jeanson says so? But Jeanson has no expertise in either genetics or historical demography. Unlike Jeanson, @John_Harshman actually has a background in genetics and genetic lineages. Yet instead of carefully considering his opinion you simply reflexively and thoughtlessly dismiss it:

What evidence do you have that Jeanson himself understands “what he is doing”?

What evidence do you have that “what he’s doing” isn’t simply making it up as he goes along with little or no scientific basis?

thoughtful · July 1, 2022, 8:45am

Because that method of counting through the tree can only give you minimum population size, looking at a slice of the tree can only tell you that you’re looking at branches that never went extinct. It obviously can’t count the lineages that went extinct. So looking at slices is pretty useless.

BUT because branches MUST multiply in order to achieve population growth (of course drift still happens, but the number of men overall gets larger over time in population growth so SOME branches must multiply) the tree can be used to infer population growth over large swaths of time. So you can draw a minimum growth curve using population sizes and the tree should follow it. As I tried to explain to @Herman_Mays, I understand you can’t use this method in a population that’s stable over a large swath of time. But no one is doing that. We already know these populations grew exponentially.

Yep.

Yep. Minimum population size.

I think so. Looking backward in time, coalescence is when genes or lineages coalesce at their most recent common ancestor.

John_Harshman · July 1, 2022, 3:27pm

Can you see why the number of branches must increase over time for any tree, regardless of what is happening to the population? Your “explanations” show that you have no comprehension of what Jeanson did. (To be fair, neither does Jeanson; not all your fault.) “Minimum population size” is a useless measure, as should be obvious from the data, and there’s no reason it should be even loosely correlated with the true population size.

evograd · July 1, 2022, 5:40pm

If we already know that, then what information is added by Jeanson’s method? If he’s just saying “if I assume that the population has grown exponentially, I get this exponential curve! Look at this - my exponential curve is similar to the exponential curve of population growth, so my method must be a good proxy for population growth!”, doesn’t it all seem a bit pointless and silly?

If he actually wants a method that can be used to infer historical population sizes, shouldn’t he be coming up with a method that can actually model real population size trends similar to PSMC?

thoughtful · July 1, 2022, 5:40pm

Jeanson explains it’s over 90% and based on 300 men out of billions. So it just means that those men’s lines that didn’t have as much population growth as other lineages weren’t sampled. A sample size larger than around .00003% may be needed.

You’re missing that it’s a smoothed population curve. From the book:

The point is that the y-chromosome tree can never capture a downturn because those lineages went extinct. It would just not show branching.

Not necessarily in all cases - drift over a period of time means that after a significant bottleneck, only one or a few lineages make it through that bottleneck and make the tree look like it is flatlining. I showed this on pencil and paper a few weeks ago in an earlier thread. You’re the one who helped me understand this about drift several months back, so…

I get your point, but you’re too focused on population size. The point is that the tree will follow population growth. The probability of having a boy or a girl hasn’t changed AFAIK. Even with drift, during growth, branching will follow growth.

Tim · July 1, 2022, 6:48pm

This actually “explains” nothing Valerie. “90%” of what? And what part do these “300 men” play? (Assuming of course you aren’t referring to the graphic novel or movie.)

I’d find this hilarity less misplaced if there was any evidence whatsoever that either you or Jeanson had any idea whatsoever as to what you’re talking about.

No Valerie. You’re missing the fact that smoothing this data removes any distinguishing features from it. What’s left is just yet another approximately exponential curve, that will have roughly the same shape as any other approximately exponential curve for the same time period – and so will be largely indistinguishable from them. This means that any similarity means nothing. And therefore that Jeanson has predicted nothing. It’s all just ‘smoke and mirrors’.

Your problem is that you base all your arguments on the assumption that Jeanson knows what he’s talking about. Given that Jeanson has no expertise in any of this, this is clearly an unsubstantiated assumption – and proving more and more obviously to be a false assumption.

John_Harshman · July 1, 2022, 6:48pm

You most certainly do not get my point. The tree does not follow population growth. To suppose so is to entirely ignore coalescence. Branching occurs with or without growth, and a tree necessarily narrows toward the root. Jeanson’s data, regardless of root, have nothing to say about population growth, and choosing a root based on how well meaningless data track an extrapolated, unjustified population growth curve is not a valid method. Please accept that you know nothing about this subject and that reliance on Jeanson for information is not a good way to learn.

Nesslig20 · July 2, 2022, 1:57am

However, in large populations, most new branches won’t spread throughout the population and many die off; i.e. some Y-chromosomes don’t get passed on when men don’t father sons.

The probability that two y-chromosomes coalesce in the previous generation = 1 / (Ne/2). Ne = effective population size, and it’s divided by 2 because ~50% of people have a y-chromosome. But we can simplify the formula by defining Ny as the Y-chromosomal population size, in which case P = 1/Ny. Conversely, the probability of NO coalescence in the previous generation = 1 - 1/Ny. The probability of NO coalescence within the previous t generations = (1 - 1/Ny)^t. Hence, the probability of coalescence within the previous t generations = 1 - (1 - 1/Ny)^t.
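As a minimal sketch of the two-lineage formula above (assuming a constant Y-chromosomal population and random mating; the function name `p_pair_coalesced` is only illustrative), the cumulative probability can be tabulated like this:

```python
def p_pair_coalesced(ny: int, t: int) -> float:
    """Probability that two lineages share an ancestor within the last
    t generations, in a constant Y-chromosomal population of size ny."""
    p_no_coalescence = (1.0 - 1.0 / ny) ** t
    return 1.0 - p_no_coalescence

# Larger populations need more generations to reach the same probability.
for ny in (50, 300):
    for t in (1, 10, 35, 200):
        print(f"Ny={ny:>3}  t={t:>3}  P={p_pair_coalesced(ny, t):.3f}")
```

For Ny = 50 the probability crosses 50% at roughly 35 generations, whereas for Ny = 300 it is still well below 50% at that point.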

On a graph, it looks like this:

Basically, this illustrates that the larger the population size, the more generations it takes for two lineages to coalesce. But we don’t have just two y-chromosomes in a population. What about all the others? When there are multiple individuals, there are many more ways for any two of them to coalesce. In this case, you need to add the following term to the equation: k(k-1)/2, the number of unique pairs among k lineages. Meaning, if you only have two y-chromosome branches to consider, there is only 2(2-1)/2 = 1 unique pair that can coalesce. However, if we have three branches, then that number is 3(3-1)/2 = 3. Thus, the probability that at least one pair among the k y-chromosomes coalesces one generation ago = 1 - (1 - 1/Ny)^(k(k-1)/2). And adding t (the number of generations) we get: 1 - ((1 - 1/Ny)^(k(k-1)/2))^t
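A short sketch of the k-lineage version follows. It treats each of the k(k-1)/2 pairs as coalescing independently, which is the same simplification used in the formula above; `n_pairs` and `p_any_coalesced` are hypothetical helper names.

```python
def n_pairs(k: int) -> int:
    """Number of unique pairs among k lineages: k(k-1)/2."""
    return k * (k - 1) // 2

def p_any_coalesced(ny: int, k: int, t: int = 1) -> float:
    """Approximate probability that at least one pair among k lineages
    coalesces within the last t generations (constant population ny)."""
    return 1.0 - (1.0 - 1.0 / ny) ** (n_pairs(k) * t)

print(n_pairs(2), n_pairs(3), n_pairs(16))   # pairs available to coalesce
print(round(p_any_coalesced(50, 16), 3))     # 16 lineages, Ny = 50, t = 1
print(round(p_any_coalesced(300, 16), 3))    # same lineages, Ny = 300
```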

Plotting this formula on two graphs for two different Ny population sizes.

Each line gives the probabilities for all k lineages for one given number of generations ago. For example, for Ny = 50, the probability that any of 16 lineages coalesce in the previous generation (t = 1) is over 90%. See here again that for larger population sizes, the probabilities are lower given the same values for t and k. See also how for larger k and t values, the probabilities of coalescence are higher.

The most important thing to note here is that, when k values are high, you don’t need a high t value for a high probability of coalescence. Meaning, many lineages coalesce only a few generations ago. However, when k values are low, then large values of t are needed for a decent probability of coalescence. Meaning, deep branches are more likely to coalesce many generations ago. Both of these effects mean that coalescent trees are top heavy; i.e. many branches coalesce at the tips, deep branches coalesce at the base.

For example, in graph Ny = 50, the probability of coalescence among 8 random lineages 1 or 2 generations ago is about 50%. Meaning, with a 50% probability, we would expect to see 8 lineages still separate 1 to 2 generations ago within a population of 50. The probability of 10 separate lineages is lower because the probability of coalescence is 60%, and we certainly would not expect to see 16 lineages all remain separate 1 generation ago, as the probability of coalescence is 90%. So, with a 50% probability, 16 lineages would have coalesced down to about 8 lineages 2 generations ago. Many branches coalesce at the top. If we go further, the probability of coalescence among 4 lineages reaches 50% between 5 and 8 generations ago. Regarding 3 lineages, 50% coalescence probability is reached 12 generations ago. However, for 2 lineages to coalesce in a population of 50, you would need to go back between 30 and 40 generations to get a probability of 50%. Deep branches coalesce at the base.

We can do the same for the other population size, Ny = 300, although here we need more generations to reach the same branch points. We would still expect to see about 15 lineages separate 1 generation ago. 8 lineages would remain separate between 5 and 8 generations ago. 4 lineages remain separate until 30 to 40 generations ago. And the last 2 lineages would remain separate until over 200 generations ago.
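The 50% points quoted above can be reproduced with a small search over t. This is only a sketch under the same independence approximation as the formula; `generations_to_half` is a hypothetical helper, not something from the original graphs.

```python
def p_any_coalesced(ny: int, k: int, t: int) -> float:
    """Approximate probability that at least one pair among k lineages
    coalesces within the last t generations (constant population ny)."""
    pairs = k * (k - 1) // 2
    return 1.0 - (1.0 - 1.0 / ny) ** (pairs * t)

def generations_to_half(ny: int, k: int, target: float = 0.5) -> int:
    """Smallest number of generations t at which the coalescence
    probability among k lineages reaches the target."""
    t = 1
    while p_any_coalesced(ny, k, t) < target:
        t += 1
    return t

for ny in (50, 300):
    row = {k: generations_to_half(ny, k) for k in (16, 8, 4, 3, 2)}
    print(f"Ny={ny}: {row}")
```

The fewer lineages remain, the further back one has to go, which is the "top heavy" shape described above.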

Thus, for each population size, we would expect to see something like the following coalescent trees:

Note here that the number of branches increases almost exponentially the closer you get to the present. Yet, we have only considered constant population sizes so far. So, it is already clear that Jeanson is wrong to say that an exponential increase in the number of branches corresponds to an exponential population growth. But we can go further and see what a coalescent tree would look like if the population is exponentially growing. We can calculate the population size t generations ago by the formula: Ny(t) = Ny0(1+r)^(-t), with Ny0 defined as the population size at t = 0, and r as the growth rate. Let’s have Ny0 = 2000 and use exponential growth rates of 0.01 and 0.04. We get the following in graph form:
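A rough sketch of how the growth model can be plugged into the coalescence formula: going back generation by generation, each generation uses its own population size Ny0(1+r)^(-t). The function names are hypothetical, the population is floored at 1 to keep the probability well defined, and the pairwise-independence approximation is carried over from the constant-size case.

```python
def pop_size(ny0: float, r: float, t: int) -> float:
    """Population size t generations ago under exponential growth at rate r."""
    return ny0 * (1.0 + r) ** (-t)

def p_any_coalesced_growing(ny0: float, r: float, k: int, t: int) -> float:
    """Approximate probability that at least one pair among k lineages
    coalesces within the last t generations of a growing population."""
    pairs = k * (k - 1) // 2
    p_none = 1.0
    for g in range(1, t + 1):
        n = max(pop_size(ny0, r, g), 1.0)  # assumption: floor at one individual
        p_none *= (1.0 - 1.0 / n) ** pairs
    return 1.0 - p_none

for r in (0.0, 0.01, 0.04):
    probs = [round(p_any_coalesced_growing(2000, r, 2, t), 3) for t in (10, 50, 100)]
    print(f"r={r}: P(coalesced) for 2 lineages at t=10, 50, 100 -> {probs}")
```

With r = 0 this reduces to the constant-size formula; with r > 0 the population shrinks going back in time, so coalescence piles up in the deeper past.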

Now the probabilities look very different. I did not bother to include the lines for t = 2 to 9 since the probability of coalescence occurring among 16 lineages within the last 10 generations is <50%. This is because in these generations, the population sizes are still quite large. Coalescences are more likely to occur further in the past, when population sizes were small. If we construct the expected coalescent trees for these like last time, we get the following.

Very different. Deep branches are more likely to coalesce more recently and recent branches are more likely to coalesce further into the past when the population is growing exponentially. Maybe the difference isn’t as obvious. Let’s take a very large population size (8 billion) and a large growth rate (10%). Then we get the following:

Most branches coalesce many generations ago when the population size was small. This is not what Jeanson expects from an exponential population growth. He wrongly thinks that more branching is an indication that a population is growing. BUT we have seen that a greater frequency of branching events (i.e. coalescence) correlates with smaller population sizes, NOT with a growing population. The way we actually use coalescence to reconstruct population size in the past is by examining times when coalescence rates were high (smaller population sizes, perhaps bottlenecks) and times when the rates were low (when the population sizes were larger). There have been such studies before, and those use far more complicated math and models that take into account things, such as non-random mating, that I did not account for in my simple calculations.

Here is an example for how mtDNA variation predicts population size. The outlier here is Australia+New Guinea and they discuss the implications in the paper.

thoughtful · July 2, 2022, 1:57am

Um…This one looks different. Perhaps you’re talking about the last 900 years. If you think so, show how it does. I challenged Mays to this in YouTube comments also.

How does it ignore coalescence? I’d like to understand that point.

Regarding what I was saying about drift earlier, I found the quote from you I was looking for (below). I understood you as saying that over time, y-chromosome drift will reduce male lineages over time so that there will be no branching. I’m only going a step further and saying, yes, exactly: with extreme drift and bottlenecks and branching every generation, branching we see in the resulting tree wasn’t eliminated because there was population growth.

“Further, given most reasonable assumptions, every strictly male lineage except one will eventually be eliminated because all its various branchings will end in daughters (or extinction, which is the same thing as far as Y chromosomes are concerned).”

New Jeanson Book: Traced Human DNA's Big Surprise Conversation

There’s your problem. Not every male passes on his Y chromosome. That only happens if he has sons. All the mutations that happen in males who have no sons are lost. Further, given most reasonable assumptions, every strictly male lineage except one will eventually be eliminated because all its various branchings will end in daughters (or extinction, which is the same thing as far as Y chromosomes are concerned). This is true whether the population is expanding, contracting, or stable. Then again, barring selection, that one remaining lineage will have experienced a number of substitutions per generation equal to the mutation rate per generation. And if there’s no selection, population size is irrelevant.

Nesslig20 · July 2, 2022, 1:57am

Hence, when for example the y-chromosome lineages went from 50 to 100 exponentially, that does not mean that this period corresponds to a population growth. During the time when there were 50 y-chromosome lineages, there could easily have been many others that went extinct.

SOME branches must multiply in growing populations. HOWEVER, the total multiplication in a population would only correspond to the multiplication events captured in coalescence in RECENT times, when almost every existent lineage is an extant lineage, because it becomes a tautology the closer you are to the present generation (i.e. at the present generation, every existent lineage is an extant lineage and thus the # of branches corresponds to the population size). But as you go further back in time, more extant lineages coalesce into fewer extant lineages, such that the proportion of extant lineages out of the total at the time becomes smaller. These extant lineages wouldn’t be representative of the total population. Hence, the number of branches does not correspond to the population size, and the branches that multiply at these times do not correspond to a growing population whatsoever. Even if the population is constant, you would still expect branching to multiply (see my explanation in another comment).
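The point that branching multiplies toward the present even in a constant population can be checked with a toy simulation. The sketch below assumes a simple haploid Wright-Fisher model (every lineage picks its father uniformly at random from a fixed-size previous generation); the population size, sample size and function name are illustrative choices, not values from Jeanson's data.

```python
import random

def lineages_through_time(n_pop: int, n_sample: int, generations: int, seed: int = 1):
    """Trace a present-day sample backward in a CONSTANT population and
    return counts[g] = number of distinct ancestral lineages g generations ago."""
    rng = random.Random(seed)
    current = set(range(n_sample))          # sampled men in the present generation
    counts = [len(current)]
    for _ in range(generations):
        # Each lineage chooses a father at random; shared fathers mean coalescence.
        current = {rng.randrange(n_pop) for _ in current}
        counts.append(len(current))
    return counts

counts = lineages_through_time(n_pop=1000, n_sample=300, generations=200)
for g in (0, 1, 5, 20, 50, 100, 200):
    print(f"{g:>3} generations ago: {counts[g]} ancestral lineages")
```

Read from the past toward the present, the lineage count climbs steeply to 300 although the population never changes size, so a rising branch count by itself says nothing about growth.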

Let’s also note that when Jeanson is counting the lineages multiplying from 1 (the Y-chr. MRCA) to 300, these multiplications are the oldest multiplications of the extant lineages; they occurred when the proportion of surviving lineages to total lineages was small, and thus the number of branches does not correspond to population size, as previously explained. This is even assuming Jeanson has the correct root, which I seriously doubt.

In short, it’s exactly what @John_Harshman said:
“The tree does not follow population growth. To suppose so is to entirely ignore coalescence. Branching occurs with or without growth, and a tree necessarily narrows toward the root.”

So it is certainly false to state that the tree can be used to infer population growth over large swaths of time in the way Jeanson is using it.

This is exactly backwards. A “downturn” or a bottleneck would INCREASE the likelihood of coalescence, i.e. more branching than you would expect relative to a constant or even a growing population size.
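To illustrate the bottleneck effect numerically, here is a minimal sketch that applies the per-generation coalescence formula to an arbitrary history of population sizes. The two histories and the helper name are made up for illustration, and pairs are again treated as independent.

```python
def p_any_coalesced_history(sizes, k: int) -> float:
    """Approximate probability that at least one pair among k lineages
    coalesces somewhere in a history of per-generation population sizes
    (sizes[0] is one generation ago, sizes[1] two generations ago, ...)."""
    pairs = k * (k - 1) // 2
    p_none = 1.0
    for ny in sizes:
        p_none *= (1.0 - 1.0 / ny) ** pairs
    return 1.0 - p_none

constant = [1000] * 50                               # no downturn
bottleneck = [1000] * 20 + [50] * 10 + [1000] * 20   # ten-generation crash

print("constant  :", round(p_any_coalesced_history(constant, 4), 3))
print("bottleneck:", round(p_any_coalesced_history(bottleneck, 4), 3))
```

Over the same 50 generations, the history with the crash gives a much higher coalescence probability, so a downturn shows up in the tree as extra branching around that time rather than as missing branches.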

John_Harshman · July 2, 2022, 1:57am

You understand that Jeanson’s population curves (the irregular blue lines, not the smooth black curve) are just taken from the Y-chromosome tree of 300 individuals just by counting the number of branches on the tree at some point assumed to be a particular time. It’s the vertical axis on the left — note how it ends at 300. The vertical axis on the right is the population estimate. The matching of scale between axes is chosen to fit the curves together. Smoke and mirrors, nothing else.

thoughtful · July 2, 2022, 1:01pm

Impressive replies. Thank you for taking that time! I will think more on them. (I had to figure out on my own earlier that simulations typically use static populations and the scenarios usually discussed are random mating. So thank you for creating growth scenarios and explaining they were random; I appreciate the clarity and specificity.)

I know there are not 8 billion men in the world, but I assume that halving that number would make the probabilities even more favorable for my position: I really liked the top left portion of this graph.

John_Harshman · July 2, 2022, 1:01pm

You understand wrong. The Y-chromosome lineage that remains will have originated deep in the past and will have branched since then. The point is that back then at the root of the current Y-chromosome tree there were many Y-chromosome lineages whose coalescence would have been much deeper in the past. Everything you say suggests that you understand nothing. And again, using Jeanson as a guide is just the blind leading the blind.

Tim · July 2, 2022, 1:01pm

No Valerie. My point was for your “fig 4”.

Any two data series undergoing (approximately) exponential growth over the same time period will have a very similar shape. If a new invasive species were introduced into Brazil and a new consumer technology were introduced into China, both in the year 2000, and they both happened to experience exponential growth thereafter, their graphs would fit very well – even though there was no relationship between their growth.

The question is not whether the exponential growth of two series matches – as spurious matches are dead easy to find. The question is whether deviations from a perfect exponential curve in one explains deviations from the perfect exponential curve in the other. This is why Jeanson’s smoothing is highly problematical – as it obliterates any signal that would allow us to distinguish between a real correlation and a spurious one.
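A small sketch of this distinction, using two made-up and completely independent series (the growth rates, noise level and names are arbitrary assumptions): the raw curves correlate strongly simply because both are roughly exponential, while their deviations from a fitted exponential trend do not.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def log_residuals(series):
    """Deviations of log(series) from its least-squares straight line,
    i.e. deviations from a perfect exponential curve."""
    n = len(series)
    t = list(range(n))
    logs = [math.log(v) for v in series]
    mt, ml = sum(t) / n, sum(logs) / n
    slope = sum((a - mt) * (b - ml) for a, b in zip(t, logs)) / sum((a - mt) ** 2 for a in t)
    return [l - (ml + slope * (a - mt)) for a, l in zip(t, logs)]

rng = random.Random(0)
years = range(200)
species = [math.exp(0.010 * t + rng.gauss(0, 0.05)) for t in years]  # invasive species
gadgets = [math.exp(0.015 * t + rng.gauss(0, 0.05)) for t in years]  # consumer technology

raw_corr = pearson(species, gadgets)
resid_corr = pearson(log_residuals(species), log_residuals(gadgets))
print(f"correlation of raw curves:       {raw_corr:.2f}")
print(f"correlation of trend deviations: {resid_corr:.2f}")
```

Smoothing a series throws away exactly the deviations that the second number is computed from, which is why a match between two smoothed exponentials carries so little information.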
