What is Effective Population Size

Ashwin_s · April 19, 2019, 1:48pm

I think it would be helpful to explain this in a separate peaceful science post.
It was only recently that I realised the significance of Ne (mainly because of recent reading on the subject).
A detailed explanation from your end would definitely help me and others grasp this better.

davecarlson · April 19, 2019, 4:10pm

Brian Charlesworth has a characteristically excellent review of effective population size and drift that you can read here.

To be (very) brief, effective population size (Ne) is a theoretical construct that allows scientists to assess the effect of genetic drift on real populations using various population genetics models. Ne is typically defined as the size of a hypothetical population that experiences the same amount of genetic drift (or other genetic property of interest to the researcher) as the actual population, given certain modeling assumptions, all of which can be relaxed to one extent or another to account for more realistic scenarios.

Ne is critical for understanding the expected amount of genetic diversity present, the time to the most recent common ancestor (time to coalescence) of various alleles in a population, the relative strength of drift vs. natural selection, etc.

swamidass · April 19, 2019, 6:20pm

The key issue here is that it is not a point estimate of census population size, nor is it the minimum population size over a window. It is rather the harmonic average over a large sliding window.
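To get a feel for how a harmonic mean behaves, here is a minimal sketch. The census sizes are made-up numbers for illustration only, and the ten-generation "window" is far shorter than anything used in real inference.

```python
from statistics import harmonic_mean, mean

# Hypothetical census sizes per generation (illustrative numbers only):
# nine generations at 10,000 and one generation at 100.
census_sizes = [10_000] * 9 + [100]

arithmetic = mean(census_sizes)
harmonic = harmonic_mean(census_sizes)

print(f"arithmetic mean: {arithmetic:.0f}")  # 9010
print(f"harmonic mean:   {harmonic:.0f}")    # 917
```

The harmonic mean sits well below the arithmetic mean because small sizes weigh heavily in it.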

@Ashwin_s, in the future just start a new thread. Don’t call for it. You’ve been around long enough to do this yourself.

Rumraket · April 19, 2019, 6:56pm

What is understood by a “sliding window” of population size? Is that the census population size within some set number of generations, or other defined period of time? And is Ne then the harmonic mean of the changing census population size in that window?

T_aquaticus · April 19, 2019, 7:00pm

I’m just starting to go through the article @davecarlson linked to, but this caught my eye right away:

Rumraket · April 19, 2019, 7:00pm

Exploding head

Ashwin_s · April 19, 2019, 7:03pm

Can you translate that?
Why is it significant?

T_aquaticus · April 19, 2019, 7:06pm

It means that different parts of the genome will give you different answers for effective population size. Therefore, you have to take the whole genome into account. This may be part of the “sliding window” that @swamidass is talking about. The other part may have to do with changes in population dynamics through time.

gbrooks9 · April 19, 2019, 7:07pm

Oh my …

Ashwin_s · April 19, 2019, 7:15pm

I get this.

Because the diversity levels are different… but then this should mainly be only in parts of the genome that have undergone more selective pressure… correct?

Can Ne be considered as the smallest hypothetical population size in which a set of genes must have gotten “fixed” in order to give the amount of variation currently observed?

swamidass · April 19, 2019, 7:17pm

Ne is essentially defined as the inverse of the coalescent rate (CR). Whatever increases CR decreases Ne, and vice versa.

Selection increases CR, and therefore decreases Ne in positively selected regions of genome.

Recombination decreases CR, and therefore increases Ne.

Isolated populations increase CR, and therefore decrease Ne.

Differential survival between males/females creates different CR for sex chromosomes, and therefore different Ne.

Immigration decreases CR, and therefore increases Ne.

Inbreeding increases CR and therefore decreases Ne.

CR is just a rate measured over time, a single number, but it is affected by all these things and more.
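The inverse relationship between CR and Ne can be sketched with a toy simulation. This assumes an idealized haploid Wright-Fisher population of constant size (for diploids the per-generation pairwise rate would be 1/(2N) instead of 1/N); the function name and parameters are hypothetical, chosen for illustration.

```python
import random

def mean_pairwise_coalescence_time(n_haploid, replicates, rng):
    """Average number of generations until two lineages share a parent."""
    total = 0
    for _ in range(replicates):
        generations = 1
        # Each generation back, both lineages pick a parent uniformly at random;
        # they coalesce when they pick the same one (probability 1/n_haploid).
        while rng.randrange(n_haploid) != rng.randrange(n_haploid):
            generations += 1
        total += generations
    return total / replicates

rng = random.Random(42)
mean_time = mean_pairwise_coalescence_time(n_haploid=50, replicates=10_000, rng=rng)
coalescent_rate = 1 / mean_time   # coalescences per generation for one pair
ne_estimate = 1 / coalescent_rate  # Ne read off as the inverse of CR

print(f"CR ~ {coalescent_rate:.4f} per generation, Ne ~ {ne_estimate:.1f}")
```

In this toy setting the estimate should land close to the true size of 50; anything that makes lineages find a common parent faster would push the estimate down.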

Well, for a moment, let’s ignore the “size” of the window, and some normalization details. I’ll add this back in a moment.

Think of a timeline stretching into the past. There is a tick mark wherever there is a coalescence. CR (which is just the inverse of Ne) is the average number of tick marks in a window along that line. The tick marks are coalescences in the phylogenetic tree. The time of each tick mark is estimated from the number of mutations along each leg, and there is a lot of noise here.

So add back the complications.

How big is the window? The window size increases exponentially as you go back in time, and it is not well defined beyond this. I’ve worked out some of the statistics for a yet to be published paper on this, but the key point is that the window size increases as you go back in time, and no one really tracks how large this is right now in population demographic inference.

What about the normalization details? Each coalescence event is weighted by how many active lineages there are. The more lineages, the less weight. The exact formula is given by the Kingman Coalescent. I can explain it if you like, but the key point is that each of the events “counts” a different amount. In the recent past, we have way more of them that count just a little, but in the distant past we have only a few that count a lot.
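As a rough illustration of why events "count" differently, here is a sketch of the expected waiting times under the standard Kingman coalescent, assuming a diploid population of constant size. With k active lineages the coalescence rate is k(k-1)/2 pairs times 1/(2Ne), so the expected wait is 4Ne/(k(k-1)) generations. The sample size and Ne below are arbitrary illustrative values, and this is not the weighting scheme of any particular inference program.

```python
def expected_waiting_times(n_lineages, ne):
    """Expected generations spent with k lineages, for k = n_lineages down to 2.

    Assumes the standard Kingman coalescent for a diploid population:
    with k lineages the coalescence rate is k*(k-1)/2 pairs times 1/(2*ne).
    """
    return {k: 4 * ne / (k * (k - 1)) for k in range(n_lineages, 1, -1)}

times = expected_waiting_times(n_lineages=10, ne=10_000)
for k, t in times.items():
    print(f"{k:2d} lineages: about {t:8.0f} generations")
print(f"expected time to MRCA: {sum(times.values()):.0f} generations")
```

With many lineages the intervals are short and frequent; the final interval with two lineages is by far the longest, which matches the picture of a few distant events that each carry a lot of weight.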

Now that we have settled that, it should be clear that Ne does not tell us much at all about brief tight bottlenecks. In fact it tells us just about diddly squat about them. It can only pick up bottlenecks that last for a large number of generations.

T_aquaticus · April 19, 2019, 7:24pm

That definition is a big help (also mentioned by @davecarlson).

glipsnort · April 19, 2019, 7:39pm

Note for observers: this is true because recombination controls the range over which selection affects the chromosome.

Sex chromosomes have a different Ne anyway because there are fewer of them in the population than autosomes (3/4 as many for the X, 1/4 for the Y.)
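The 3/4 and 1/4 figures are just a count of chromosome copies. A small sketch, assuming XY sex determination and an equal number of males and females (the function name is hypothetical):

```python
from fractions import Fraction

def copies_relative_to_autosomes(n_males, n_females):
    """Copies of X and Y in the population as a fraction of autosome copies."""
    autosomes = 2 * (n_males + n_females)  # two copies per individual
    x_copies = n_males + 2 * n_females     # XY males carry one X, XX females two
    y_copies = n_males                     # only males carry a Y
    return Fraction(x_copies, autosomes), Fraction(y_copies, autosomes)

x_ratio, y_ratio = copies_relative_to_autosomes(n_males=500, n_females=500)
print(x_ratio, y_ratio)  # 3/4 1/4
```

An unequal sex ratio changes these fractions, so they are a baseline rather than a fixed rule.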

It’s not really relevant here, but note that this is one way of defining Ne. What Ne means in general is the population size in whatever model you’re thinking of that would match an empirical measurement, i.e. it’s how big an ideal population would have to be to behave like the real population. Depending on the population’s history, you can get wildly different values for Ne if you measure different things. Immediately after a bottleneck, Ne based on diversity might be ~10,000 while Ne based on variance in allele frequencies might be ~10.
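For a sense of what an Ne "based on variance in allele frequencies" looks like, here is a minimal sketch. It assumes an ideal diploid Wright-Fisher population under pure drift, where the one-generation change in allele frequency has variance p(1-p)/(2Ne); the starting frequency, population size, and replicate count are arbitrary illustrative choices, and real estimators also have to correct for sampling error.

```python
import random
from statistics import pvariance

def next_generation_frequency(p, n_diploid, rng):
    """Allele frequency after one generation of Wright-Fisher sampling."""
    copies = 2 * n_diploid
    count = sum(1 for _ in range(copies) if rng.random() < p)
    return count / copies

rng = random.Random(1)
p0, n_diploid = 0.5, 50
changes = [next_generation_frequency(p0, n_diploid, rng) - p0 for _ in range(5_000)]

# Under pure drift, Var(delta p) = p0 * (1 - p0) / (2 * Ne), so solve for Ne.
variance_ne = p0 * (1 - p0) / (2 * pvariance(changes))
print(f"variance Ne estimate: {variance_ne:.1f}")
```

Because this quantity depends only on the most recent generation of sampling, it responds to a recent bottleneck immediately, while diversity-based measures still reflect the much longer history.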

swamidass · April 19, 2019, 7:42pm

I suppose the context I mean is population demographic inference. Do you know of any program or approach that does not define Ne as the inverse of CR? This is the way it is done for MSMC, SMC, and just about every algorithm I’ve looked at. Am I missing something?

Very true. And I suppose AFS is one way to do population inferences, although not the most powerful way once you get more ancient than, say, 10,000 years. Was that your point?

swamidass · April 19, 2019, 7:45pm

@jordan and @AJRoberts this is a good thread for you.

glipsnort · April 19, 2019, 7:53pm

In the context of long-term demographic inference, sure, that’s what you do. In the context of short-term inference (e.g. when we measured the variance effective size of the malaria population of Senegal a few years ago, following anti-malaria intervention), you should be using the variance Ne or something similar. As I said, in this context it doesn’t matter.
