The fundamental problems with Fisher's not-so-fundamental Fundamental Theorem of Natural Selection

At this point in time, it’s like beating a dead horse to attack Fisher’s Fundamental Theorem of Natural Selection.

This thread surveys problems outside of Basener and Sanford 2018. There are at least 4 problems, and the 4th is the deal breaker for me.

The first problem is that there was no clear and stable statement of the theorem. Not good for a theorem that was hailed so highly for so long.

Fisher’s Fundamental Theorem of Natural Selection was hailed as “biology’s central theorem” by Richard Dawkins. The theorem was first stated in the book The Genetical Theory of Natural Selection, which WD Hamilton said was “one of the greatest books of the present century” and “second in importance in evolution theory to Darwin’s Origin”. Price described how Fisher spoke of his own theorem:

He [Fisher] compared this result to the second law of thermodynamics, and described it as holding ‘the supreme position among the biological sciences’

But even as the theorem was given such accolades, even by Fisher himself who called it “fundamental”, the theorem was “viewed by most population geneticists as out-of-date and of modest historical interest” (Basener and Sanford 2018). Ewens and Lessard give a “negative assessment” of “the long term relevance of the theorem.” And Joe Felsenstein referred to it in almost derogatory terms as the “not so fundamental Fundamental Theorem of Natural Selection.”

Regarding the statement of the theorem, Price notes Edwards’s commentary:

only a few (eg. Edwards, 1967) have thought that Fisher may have been correct – if only we could understand what he meant!

So in fairness, how did Fisher himself state his theorem?

‘The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.’

My reading of this is that this statement is the most general continuous-generation model statement of the theorem.

Ewens and Lessard 2015 give what they call a discrete-generation version of Fisher’s Theorem in equation 13 here:

On the interpretation and relevance of the Fundamental Theorem of Natural Selection - ScienceDirect

They said:

the correct statement of the FTNS in the discrete generation case

My reading is that this is a multi-locus discrete-generation case, not a single-locus haploid case. So, I’ll state the single-locus haploid case of FTNS (which Joe Felsenstein doesn’t consider THE statement of Fisher’s theorem, so we can call it Fisher-like). But in my defense, Queller 2017 and Barton 2009 describe it as THE statement of the theorem. I think Ewens and Lessard give the more accurate description that this is merely a special-case application of FTNS. So this is the special-case statement of FTNS:

\LARGE {\bar{w}}^\prime-\bar{w}=\frac{Var\left(w\right)}{\bar{w}}

Just to clarify, in the discrete-generation case, \bar{w} is a function of the generation K, so to emphasize this I like to write \bar{w}(K).

So another way of stating FTNS in the simplest case is:

\LARGE \bar{w}\left(K+1\right)-\bar{w}\left(K\right)=\frac{Var\left(w,K\right)}{\bar{w}\left(K\right)}

But this leads to the 2nd problem of FTNS, namely, it’s a nothing burger!

Consider the example where we have a set of 3 numbers:

{1, 2, 3}

The mean is 2.0 and the sample variance is 1.0 (the sum of squared deviations from the mean is 2.0). But there are many sets of 3 numbers that possess the same mean and variance. For example, the following sets have the same mean and variance as the above set (within a few significant digits):

{ 0.845299462, 2.577350269, 2.577350269}


{ 0.941699476, 2.129150262, 2.929150262 }
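The claim that these three sets share a mean and a variance can be checked directly. The sketch below is only an illustration (the helper name `summarize` is made up for this post); it reports the sample variance, the population variance, and the raw sum of squared deviations, so the agreement does not depend on which convention one prefers.

```python
from statistics import mean, variance, pvariance

# The three sets quoted above.
sets = [
    [1.0, 2.0, 3.0],
    [0.845299462, 2.577350269, 2.577350269],
    [0.941699476, 2.129150262, 2.929150262],
]

def summarize(values):
    """Return (mean, sample variance, population variance, sum of squared deviations)."""
    m = mean(values)
    ss = sum((v - m) ** 2 for v in values)
    return m, variance(values), pvariance(values), ss

summaries = [summarize(s) for s in sets]
for s, (m, var_s, var_p, ss) in zip(sets, summaries):
    print(s, round(m, 6), round(var_s, 6), round(var_p, 6), round(ss, 6))
```

Up to rounding in the quoted digits, all three sets give the same mean and the same variance under either convention.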

Fisher’s theorem relates the variance in fitness at one point in time to the increase in fitness. But given that many sets of values of W_i might result in the same variance in relative fitness w_i, any statement of Fisher’s theorem in terms of variance is insufficient to determine the exact trajectory of the population sizes at any given time. If anything, information is lost by appealing to summary statistics such as variance of numbers rather than the numbers themselves!!!

So even if one rejects the simplified version of FTNS above as THE statement of FTNS, the inadequacy remains for FTNS to improve the computation of the system state at any time compared to simply using the fitness values themselves!!!

Why not just use the fitness values themselves to compute the state of the population rather than the variance of the fitness values, which doesn’t work for all time anyway??? FTNS is an interesting mathematical relationship, but it actually degrades and blurs the evolution of the system relative to the clarity the fitness values themselves provide.
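To make the point about trajectories concrete, here is a minimal sketch under stated assumptions: a haploid, discrete-generation model with the update p_i(K+1) = w_i p_i(K) / \bar{w}(K), equal starting frequencies, and two of the fitness sets from the example above treated as relative fitnesses. The function names `step` and `mean_fitness_trajectory` are hypothetical, chosen only for this illustration.

```python
def step(p, w):
    """One generation of haploid selection: p_i' = w_i * p_i / w_bar."""
    w_bar = sum(wi * pi for wi, pi in zip(w, p))
    return [wi * pi / w_bar for wi, pi in zip(w, p)]

def mean_fitness_trajectory(w, generations):
    """Mean fitness at generations 0..generations, from equal starting frequencies."""
    p = [1.0 / len(w)] * len(w)
    out = []
    for _ in range(generations + 1):
        out.append(sum(wi * pi for wi, pi in zip(w, p)))
        p = step(p, w)
    return out

# Two fitness sets with the same mean and the same variance (assumed example values).
w_a = [1.0, 2.0, 3.0]
w_b = [0.845299462, 2.577350269, 2.577350269]

traj_a = mean_fitness_trajectory(w_a, 3)
traj_b = mean_fitness_trajectory(w_b, 3)
for k, (a, b) in enumerate(zip(traj_a, traj_b)):
    print(k, round(a, 4), round(b, 4))
```

In this setup the two mean-fitness trajectories agree at generations 0 and 1, because the mean and variance match at generation 0, and they separate from generation 2 onward. The variance fixes one step of the mean; the fitness values fix the whole path.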

The 3rd problem is that although the idealization of infinite-sized populations makes many models tractable, for small finite-sized real populations of complex multi-locus organisms, the relevance and value of describing traits in terms of fitness becomes increasingly suspect.

The 4th problem, perhaps the worst, concerns W_i for allele i (or whatever i indexes): 99% of the “fit” types might be function-compromising. Framing fitness in terms of differential reproductive success doesn’t really speak to the long-term evolution of complexity that Darwin envisioned for Natural Selection. In fact it suggests, even on the assumption that the mean fitness is always increasing where:

\bar{w} \equiv \text{adaptedness}

functional compromise is always increasing!

So even if one argues fitness and adaptedness are increasing, it’s only an equivocation on the common-sense understanding of what it means to be improved or more fit.

Here is my derivation of the simple case of FTNS:




but it can be shown

\LARGE p_i\left(K+1\right)=\frac{w_ip_i\left(K\right)}{\sum_{j=1}^{N}{w_jp_j\left(K\right)}}=\frac{w_ip_i\left(K\right)}{\bar{w}(K)}


\LARGE \bar{w}(K+1) \equiv\sum{p_i\left(K+1\right)w_i}


\LARGE \bar{w}\left(K+1\right)=\sum{\frac{w_ip_i\left(K\right)}{\bar{w}(K)}w_i}=\sum\frac{p_i\left(K\right)w_i^2}{\bar{w}(K)}

We can state the above relations as:

\LARGE \bar{w}\left(K+1\right)-\bar{w}\left(K\right)=\left\{\sum\frac{p_i\left(K\right)w_i^2}{\bar{w}(K)}\right\}-\bar{w}(K)

but since:

\LARGE \bar{w}=\frac{{\bar{w}}^2}{\bar{w}}

we can say:

\LARGE \bar{w}\left(K+1\right)-\bar{w}\left(K\right)=\frac{\left\{\sum{p_i\left(K\right)w_i^2}\right\}-{\bar{w}\left(K\right)}^2}{\bar{w}\left(K\right)}

which reduces to:

\LARGE \bar{w}\left(K+1\right)-\bar{w}\left(K\right)=\frac{Var\left(w,K\right)}{\bar{w}\left(K\right)}
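The identity derived above can also be checked numerically for N alleles. The following is a small sketch, not a proof: the fitness values and frequencies are arbitrary random draws, the seed and N = 5 are arbitrary choices, and `check_identity` is a name invented for this illustration.

```python
import random

def check_identity(w, p):
    """Return (lhs, rhs) of w_bar(K+1) - w_bar(K) = Var(w, K) / w_bar(K)."""
    w_bar = sum(wi * pi for wi, pi in zip(w, p))
    # Haploid selection update: p_i(K+1) = w_i * p_i(K) / w_bar(K)
    p_next = [wi * pi / w_bar for wi, pi in zip(w, p)]
    w_bar_next = sum(wi * pi for wi, pi in zip(w, p_next))
    # Frequency-weighted variance of fitness at generation K
    var_w = sum(pi * wi ** 2 for wi, pi in zip(w, p)) - w_bar ** 2
    return w_bar_next - w_bar, var_w / w_bar

random.seed(0)  # arbitrary seed, only for reproducibility
n = 5           # arbitrary number of alleles
w = [random.uniform(0.5, 2.0) for _ in range(n)]
raw = [random.random() for _ in range(n)]
p = [x / sum(raw) for x in raw]  # normalize so the frequencies sum to 1

lhs, rhs = check_identity(w, p)
print(lhs, rhs)
```

The two printed numbers should agree to floating-point precision for any positive fitness values and any valid frequency vector.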


This is the second time in a day you’ve made that claim. Where did you get the number from?


No, it doesn’t, and isn’t supposed to. But out of curiosity, what “evolution of complexity that Darwin envisioned” are you referring to here exactly? Can you quote Darwin?


… and when the pipes in your house start leaking the Fundamental Theorem of Natural Selection won’t solve that problem either!!!

Also, interesting to see Sal’s proof of the haploid discrete-generations case of an FTNS-like theorem. For basically the same proof for the same case, see section II.7 of my online free textbook Theoretical Evolutionary Genetics, pages 89-90, equations (II-104) to (II-109).


Yours was the foundation, but I thought my version was more general and clear. You are welcome to include my improved version in your next edition.

My understanding was that the formula was by Sewall Wright, so I presume the proof of the theorem was his. However, I’m more than happy to attribute the original proof to Joe Felsenstein or whomever the original proof belongs to.

Joe’s version had only 2 alleles, and I generalized it to N alleles. Also, the notation, although conventional in pop gen, is not as directly accessible to the average math student. I would hope that for a DEPENDENT variable like \bar{w} it is at least made clear what it depends on, and so I made that explicit with \bar{w}(K).

Of course if a field doesn’t want too much scrutiny from its peers in other mathematical disciplines, being cryptic helps.

To no one’s surprise Sal keeps avoiding this question.


The current edition of my online book, the 2019 one, has the proof for haploids on page 93, equations (II-108) to (II-113). Sal says

Joe’s version had only 2 alleles, and I generalized it to N alleles.
That is incorrect: the version in my book is for an arbitrary number of alleles, k. As to whether the proof can be found in Sewall Wright, I am not sure of that. I am not at my office now so I cannot look it up, but I think a version for multiple alleles in haploids will be found in the pioneering textbook Population Genetics by C. C. Li in 1954. He may cite it to Sewall Wright, I don’t recall whether he does.

There is one thing that is disconcerting, the idea that mean relative fitness is a measure of adaptedness, informally:

\bar{w}(K) \equiv \text{measure of adaptedness, where a higher number means more adapted}

In the extreme case, \bar{w}(K) is maximized when all the alleles save one are gone (or at least practically so in an infinite population model). How is this necessarily good or a really more adapted population toward future environmental uncertainties? Populations without allelic diversity are often considered at risk.
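The extreme case can be illustrated with a short simulation. This is only a sketch under assumptions: a haploid, discrete-generation, infinite-population model, three alleles with made-up fitness values, and a hypothetical helper named `iterate`.

```python
def iterate(w, p, generations):
    """Apply p_i' = w_i * p_i / w_bar repeatedly and return the final frequencies."""
    for _ in range(generations):
        w_bar = sum(wi * pi for wi, pi in zip(w, p))
        p = [wi * pi / w_bar for wi, pi in zip(w, p)]
    return p

w = [1.0, 1.1, 1.2]      # assumed relative fitnesses, for illustration only
p0 = [1.0 / 3.0] * 3     # equal starting frequencies

p_final = iterate(w, p0, 500)
w_bar_final = sum(wi * pi for wi, pi in zip(w, p_final))
var_final = sum(pi * wi ** 2 for wi, pi in zip(w, p_final)) - w_bar_final ** 2

print(p_final, w_bar_final, var_final)
```

In this toy run the allele with the largest fitness value approaches frequency 1, the mean approaches that largest value, and the variance in fitness approaches zero, which is the loss of allelic diversity described above.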

Maybe it was good that Patrick Moran found out that Natural Selection doesn’t always maximize one set of alleles over another (i.e., doesn’t always maximize mean fitness in multi-locus models, as opposed to single-locus models).

Did someone claim it was?


Why would you think that a measure of adaptedness should include fitness with regard to “future environmental uncertainties”, or be a measure of how “good” the organism is in some unspecified sense?


From the pure math

\bar{w} = \text {measure of homogeneity}

I suggest we call things what they really are.

Perhaps it is better to avoid using words like fitness or adaptedness since those are quasi-philosophical metaphysical notions.

Maybe we can’t put a figure on adaptedness, or even what constitutes fitness! That’s the point.



\bar{w} = \text{mean of relative growth rates}

There isn’t anything in the “pure math” that says fitness is with regard to “future environmental uncertainties”.

The specific measure of fitness is reproductive success, it’s not whatever nebulous other notions of “adaptedness towards future environmental uncertainties”, or “complexity” you’ve dreamt up.

How can you be so confused about something so simple? I doubt that you really are.


Did someone claim it was?

The word “FIT” in the conventional sense from the dictionary:

of a suitable quality, standard, or type to meet the required purpose.

I’m suggesting using less metaphysically loaded words than “fit” or “adaptedness” to describe mathematical entities that simply measure a level of homogeneity.

Absolute Darwinian fitness is merely a growth rate W, and by extension relative Darwinian fitness is a relative growth rate. Why use the prejudicial term “fit” at all? Call the term what it is and has been ever since we found exponential functions as solutions to differential equations: some sort of growth constant! Don’t call it fitness, as the math term has been well known for centuries. This is equivocation and doublespeak.

Why see it as prejudicial?

Your mention of “exponential functions as solutions to differential equations” demonstrates your continuing confusion. You seem unaware that fitness is relative to an environmental niche.


It is a well-known phenomenon that the technical literature has non-standard usage of words.

So no, the “standard” use of the word fit in the dictionary isn’t actually claiming that the word “fitness” implies something is necessarily good.

Heck, even the standard definition of “fit” does not imply something to be “necessarily good” or “more adapted towards future environmental uncertainties” either.

To have a suitable quality, or to meet some required purpose (to be fit for some required purpose) is not “necessarily good”, nor “adapted toward future environmental uncertainties” and isn’t implied to.

You’re making shirt up.

Good for you, I wish you the best of luck in trying to change the established vernacular in a field of scientific study.



\bar{w} = \text{mean of relative growth rates}

That’s not the pop gen definition, as Lewontin notes, fitness is defined on the reproductive schedules themselves.

Equivocation abounds, and it never gets cleaned up.

That’s why Dawkins would call a nothing burger “biology’s central theorem”.

If you’re going to be quoting people, please give a specific reference, we have historical precedent for not taking your quotations at face value.


First, the Royal Society web page:

Ronald Fisher | The Royal Society

He therefore could be said to have provided researchers in biology and medicine with their most important research tools, as well as with the modern version of biology’s central theorem.

This sounds hauntingly like Dawkins here:

Evolutionary Zoologist, University of Oxford. Author, The Blind Watchmaker; The Greatest Show on Earth

I’ve listened to Armand Leroi’s talk: beautifully delivered (apart from his infuriating use of the historic present), but his nomination of Aristotle is obviously just contrarian for the sake of it. He is simply bending over backwards to contrive a way of answering the question with a name other than Darwin. And that’s a hopeless cause! I prefer to answer the question straight, making no attempt to be contrarian or ‘interesting’. Darwin, of course, is the greatest biologist ever.

Who is the greatest biologist since Darwin? That’s far less obvious, and no doubt many good candidates will be put forward. My own nominee would be Ronald Fisher. Not only was he the most original and constructive of the architects of the neo-Darwinian synthesis. Fisher also was the father of modern statistics and experimental design. He therefore could be said to have provided researchers in biology and medicine with their most important research tools, as well as with the modern version of biology’s central theorem.

And Fisher himself said it was a Fundamental Theorem, like the 2nd law of Thermodynamics! Who is anyone to dispute such an esteemed, honorable Christian (albeit a neo-Darwinist)?