I had a focused question for population geneticists. It is fairly technical, but @Joe_Felsenstein might have the answer tucked away somewhere.

The N_e (effective population size) is the number of individuals in an idealized (large/panmictic/diploid/constant population size) population that would reproduce the statistic we are looking at.

In reference to coalescence based methods of approximating N_e at different points in the past (e.g. SMC and PMC), consider a population that is idealized in all respects, but the population size is changing from generation to generation.

To fix our understanding here, consider this case. For 1000 generations it is 10,000 individuals, and then for 200 generations it is 5,000 individuals. Assuming an ideal coalescence-based method, working from 100 present-day samples, what do we theoretically expect the N_e over time to be estimated as?

As I understand it, N_e at generation g in the past will be a harmonic mean (1 / {\text{mean} [1/N] }) of the population size across a window of times. It seems that the width of that window should increase as we go back in time, and should be related to the number of coalescences near time g.
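To make the windowed harmonic mean concrete, here is a minimal Python sketch applied to the example above. It assumes the 5,000-individual phase is the most recent 200 generations (g counts generations back from the present), and the names `pop_size`, `windowed_harmonic_mean`, and `half_width` are purely illustrative. The window width is a free parameter here, which is exactly the quantity I am asking about.

```python
def pop_size(g):
    """Census size g generations before the present (assumed reading of the example)."""
    return 5_000 if g < 200 else 10_000


def windowed_harmonic_mean(g, half_width):
    """Harmonic mean of N over generations [g - half_width, g + half_width],
    clipped at the present (generation 0)."""
    lo = max(0, g - half_width)
    hi = g + half_width
    sizes = [pop_size(t) for t in range(lo, hi + 1)]
    # Harmonic mean: 1 / mean(1/N)
    return len(sizes) / sum(1.0 / n for n in sizes)


# Far from the size change the window sees a constant N;
# near generation 200 the value falls between 5,000 and 10,000.
for g in (50, 200, 600):
    print(g, round(windowed_harmonic_mean(g, half_width=100), 1))
```

With any fixed window, the step at generation 200 is smeared over the window's width, so the shape of the estimated N_e curve around the change depends entirely on how that width is chosen.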

It seems the window size should be a function of the number of un-coalesced lineages (from the samples) at time g, and it seems like there should be a theoretical result. Of course, thinking of this as a "window width" is somewhat of a simplification, and it might be better understood as a kernel function centered on time g that defines a weighted harmonic mean as the theoretical value of N_e in a varying N population.
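The kernel version can be sketched the same way. The Gaussian kernel and the `bandwidth` parameter below are placeholders I picked for illustration only, not a claim about the correct kernel. The sketch again assumes the 5,000-individual phase is the most recent 200 generations.

```python
import math


def pop_size(t):
    """Census size t generations before the present (assumed reading of the example)."""
    return 5_000 if t < 200 else 10_000


def kernel_harmonic_mean(g, bandwidth, max_gen=1200):
    """Weighted harmonic mean of N around generation g.

    Weights come from a Gaussian kernel centered on g (a placeholder choice);
    the result is sum(w) / sum(w / N), i.e. 1 / (weighted mean of 1/N).
    """
    weights = [math.exp(-0.5 * ((t - g) / bandwidth) ** 2) for t in range(max_gen)]
    total = sum(weights)
    return total / sum(w / pop_size(t) for t, w in enumerate(weights))


# A bandwidth that grows with g mimics a window that widens further back in time.
for g in (50, 200, 600):
    print(g, round(kernel_harmonic_mean(g, bandwidth=10 + 0.1 * g), 1))
```

What I am looking for is the theoretically justified replacement for the Gaussian weights, presumably expressed in terms of the number of un-coalesced lineages at g.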

So then, is there an analytic formula published anywhere that tells us the width of that sliding window (or the kernel function) at g?