Genetic evidence *against* common ancestry

Since @colewd seems determined to derail the thread “Evidence for common ancestry,” I’ll do him a favor and start this new thread, “Evidence against common ancestry.” Let’s begin by discussing the sequence and waiting time ‘problems’ which @colewd just brought up in the other thread.

Because they’re completely made up by ID advocates, they don’t exist in the real world.

The ‘sequence problem’

The so-called ‘sequence problem’ comes from a single study, Axe (2004), in which he estimated the proportion of functional β-lactamases out of all protein sequence space as 1 in 10^77. How did he do this? By taking an already crippled, temperature-sensitive β-lactamase domain and mutating it until it no longer worked as a β-lactamase. Needless to say, this is a bad way to test for function in sequence space. Yet he somehow extrapolated from this that any given specified function will exist in only about 1 in 10^77 polypeptides.

It should be obvious how bad this study was for determining the proportion of functional proteins in sequence space. But it’s even worse if you consider how many other experiments have directly refuted Axe. For example, Yamauchi et al. (2002) began with a library of only 10 random polypeptides 140 amino acids in length, and were able to evolve esterase function within just six ‘generations.’ This shows that esterases exist at a rather high proportion within random sequence space.

Nakashima et al. (2007) began with a library of ~10^6 random polypeptides 140 amino acids in length, and were able to evolve DNA-binding function within just five ‘generations.’ This shows that DNA-binding proteins exist at a rather high proportion within random sequence space, thus demonstrating how transcription factors can evolve de novo.

Shahsavarian et al. (2017) began with a library of 2.7 * 10^9 random polypeptides, and found 5 different β-lactamases, showing that β-lactamases exist at a proportion of approx. 1 in 5.4 * 10^8 within random sequence space. This is especially significant because β-lactamase is precisely the enzyme that Axe (2004) looked at, showing that his conclusions are off by about 68 orders of magnitude. This is not a small error.
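For anyone who wants to sanity-check the “68 orders of magnitude” figure, here’s the arithmetic in a few lines of Python. The numbers are the ones quoted above, not re-derived from the papers:

```python
import math

# Back-of-the-envelope check of the discrepancy described above.
axe_estimate = 1e-77       # Axe (2004): claimed frequency of functional beta-lactamases
library_size = 2.7e9       # Shahsavarian et al. (2017) library size
hits = 5                   # beta-lactamases recovered from that library

observed_proportion = hits / library_size          # ~1 in 5.4e8
discrepancy = math.log10(observed_proportion / axe_estimate)

print(f"observed proportion: 1 in {library_size / hits:.1e}")
print(f"discrepancy vs Axe:  ~{discrepancy:.0f} orders of magnitude")
```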

This research is even being used for important real-life applications. Wang et al. (2016) used the same technique, beginning with a library of 10^9 polypeptides 7 amino acids in length, and they found 7 peptides that bind selectively to ovarian cancer cells, with possible application for cancer treatment. Matsumura et al. (2010) found 19 peptides out of a library of 10^12 random sequences 16 amino acids in length with the ability to inhibit the Bcl-XL enzyme, showing that 1 in ~10^11 polypeptides of this length have this ability, with possible application for therapeutic drugs.

There are many, many more studies like this that I could have cited (for more, see @Rumraket’s post on this topic from The Skeptical Zone). But just one of these studies is enough to show that Axe is completely, utterly, horribly wrong. Many, many, many more than 1 in 10^77 random protein sequences have a specific function. Roughly 1 in 5 × 10^8 random protein sequences even have β-lactamase function, which is the specific function that Axe himself was testing for!

So it’s absolutely ridiculous to tout Axe’s paper, or any so-called ‘sequence problem,’ as evidence against common ancestry. Of course, I know that ID creationists aren’t going to let this go anytime soon (if ever), since once they get a paper published in a real peer-reviewed journal, they can’t admit that they could possibly have been wrong.

What about the so-called ‘waiting time problem’?

As it turns out, the ‘sequence problem’ and ‘waiting time problem’ are just two sides of the same coin. Every paper published on the ‘waiting time problem’ thus far has looked at the time until two (or more) pre-specified, coordinated mutations become fixed in a population. But there are many ways to achieve a desired result in real biology, as all of the studies cited above show. So the waiting time ‘problem’ is a non-problem.

However, it gets even worse. One of the recent ID papers published on this topic, by Hossjer et al. (2021), looks specifically at the waiting time in regulatory sequences. As such studies tend to do, they again assumed two pre-specified mutations were needed to achieve the desired function.

But another recent study, by Yona et al. (2018), discovered that no less than 10% of random, 100-nucleotide DNA sequences have the ability to serve as active promoters (regulatory sequences) in E. coli. Ten percent! It’s impossible to overstate these findings – ten percent of all sequences 100 nucleotides in length is about 10^59 sequences.
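The 10^59 figure is easy to verify: there are 4^100 possible 100-nucleotide sequences, and ten percent of that is about 10^59.

```python
import math

# Counting the sequences behind the "ten percent" figure.
total_sequences = 4 ** 100               # 4 bases, 100 positions
active_promoters = total_sequences // 10 # Yona et al. (2018): ~10% are active promoters

print(f"total 100-nt sequences: ~10^{math.log10(total_sequences):.0f}")
print(f"active promoters:       ~10^{math.log10(active_promoters):.0f}")
```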

Now, I’m not saying that any one of these 10^59 sequences would suffice for whatever desired result evolution is trying to achieve (speaking figuratively – evolution isn’t a purposeful process). But when you’re dealing with this many sequences, is it really reasonable to think that only a single pair of pre-specified mutations will achieve the desired result? No, that’s utterly ridiculous. So no one should be taking these ‘waiting time problem’ publications seriously, especially not if they’re trying to make some overarching conclusion about the real biological world.

Furthermore, the ‘waiting time problem’ assumes that all of the mutations required for a change have to occur simultaneously, presumably because individually they would be deleterious. But I don’t think this is a valid assumption, since many changes in evolution occur without deleterious intermediates.

One very interesting study that examines this idea is Neme et al. (2017). They inserted completely random genes, coding for random polypeptides 65 amino acids in length, into E. coli bacteria and observed what happened to the bacteria as a result. What they found is that 52% of the polypeptides decreased the fitness of the host bacterium (slowing its growth), whereas 23% were neutral in effect, and 25% actually increased the fitness of the host bacterium (speeding up its growth).

The same thing was shown by Tretyachenko et al. (2017). They generated ten thousand random 100-amino-acid protein sequences and inserted them into E. coli bacteria. What they found is that the polypeptides were well-tolerated by the E. coli, and that a high proportion (~25%) in fact had defined secondary structures (α-helices and β-sheets)! Furthermore, the random polypeptides were fairly well-ordered.

What does all this have to do with the ‘waiting time problem’? Mainly, I think it’s applicable because it shows that completely random gene sequences can exist without having any deleterious effect on the host (and indeed do so about 50% of the time). Sometimes they even have a beneficial effect. So it’s totally false to think that any mutation leading up to a functional protein will necessarily be deleterious. This invalidates the ‘waiting time problem’ entirely.

Finally, the ‘waiting time problem’ has also been shown, observationally, to be false. The adaptation which originally allowed the HIV-1 group N virus to transmit to humans involved four amino acid changes, each of which individually is deleterious, so they all would have had to occur at once (Sauter et al. 2012).

According to Behe’s ID book The Edge of Evolution, which rests entirely on the ‘waiting time problem,’ an adaptation involving four simultaneous mutations should be at or beyond the “edge of evolution,” since (according to Behe) such a change could never have occurred in the entire 4.5 billion year history of the earth. But this change, involving four simultaneous mutations, was observed to happen in just a few decades. Thus, this piece of observational evidence shows that all ‘waiting time’ calculations must be off by many orders of magnitude, most likely for the reasons laid out above.
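Here’s a deliberately crude, order-of-magnitude sketch of why even simultaneous quadruple mutants are reachable at retroviral population sizes. Every number below is an illustrative ballpark assumption of mine (retroviral per-site mutation rate, virion output per host, infected population size), not a figure taken from Sauter et al. (2012) or from Behe:

```python
# Toy, heavily hedged arithmetic: how long until four *simultaneous*
# specific point mutations appear somewhere in a large viral population?
mu = 3e-5               # assumed per-site mutation rate per replication (retroviral ballpark)
p_quad = mu ** 4        # all four specific changes arising in one replication event
virions_per_day = 1e10  # assumed virion production per infected host (illustrative)
hosts = 1e6             # assumed infected host population (illustrative)

days_to_expect_one = 1 / (p_quad * virions_per_day * hosts)
print(f"p(quadruple mutant per replication): ~{p_quad:.1e}")
print(f"expected days until one appears:     ~{days_to_expect_one:.0f}")
```

Under these toy assumptions the expected wait is on the order of months, not billions of years, which is at least consistent with the few-decades observation, and that’s before accounting for the non-simultaneous paths discussed above.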


The ‘sequence problem’ and ‘waiting time problem’ are non-problems. They have been shown observationally to be wrong on many levels, and (in YEC lingo) this is an example of ‘observational science,’ not ‘historical science,’ so creationists, if they were consistent, should not disagree with this.

In addition, @colewd, I’m not sure what any of this has to do with common ancestry in the first place. The original thread was about whether all organisms are related through ancestry, not whether evolution is guided or not. Those are two very different questions. Please enlighten us as to how, even if the ‘sequence problem’ and ‘waiting time problem’ are correct, this would demonstrate that all organisms do not share common ancestry.

@colewd, I am absolutely sure that you will take none of this to heart. You have shown yourself to be unwilling to consider evidence against your position time and time again, and I very much doubt that this will be the thing to change your mind. But I hope that someone somewhere will find all of this information useful and interesting.


Hi Andrew
I completely agree that basing an argument on one study is not convincing. I need some time to read at least the abstracts of all the papers you have cited that I have not seen before.

While I am gone, think about the process of a unique gene forming from a duplication event or some non-coding DNA. How do you get from here to a unique gene sequence? Looking at the Lenski experiment may be helpful.

ID “science”:


How is that relevant at all? The fact that a new gene hasn’t arisen in that experiment just tells us that there is no selective pressure for a new gene to form. It doesn’t somehow prove that the sequence and waiting time ‘problems’ are somehow correct, despite contradicting all this other experimental evidence.


A duplicated promoter did arise. What did not happen was the generation of a new gene from SNPs. The number of actual mutations was in the billions, but the number fixed in the population was 75, as I remember. This is after 60k generations.

To say these problems (sequence and fixation) are made up by creationists and have no substance is not credible. A sequence has a large number of possible arrangements, and function is a subset of those arrangements. The number of arrangements grows exponentially with sequence length.

This is the challenge: the number of possible functions varies depending on the application, and with many vertebrate proteins we see high levels of preservation in amino acid sequences, which means changes are often deleterious due to loss of function.

My assessment of your statement remains: “The fact that a new gene hasn’t arisen in that experiment just tells us that there is no selective pressure for a new gene to form. It doesn’t somehow prove that the sequence and waiting time ‘problems’ are somehow correct, despite contradicting all this other experimental evidence.”


The number of arrangements has little bearing on the proportion of functional sequences. Most of the studies I cited in the OP involved proteins ~140aa in length, although there were a few that looked at short (~10aa) polypeptides as well. The proportion seems to be similar in both long and short proteins. This is experimental, observational evidence.
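To make the distinction concrete: the number of arrangements does explode with length, but if the functional proportion stays roughly constant, the number of functional sequences explodes right along with it. The proportion below (~1 in 10^9, the order observed for β-lactamase in Shahsavarian et al. 2017) is used purely for illustration, not as a universal constant:

```python
# Arrangements vs. functional sequences: what matters for evolvability is
# the PROPORTION functional, which the cited experiments measure directly.
functional_proportion = 1e-9  # illustrative, order observed for beta-lactamase

for length in (10, 140):
    total = 20 ** length                       # 20 amino acids per position
    functional = total * functional_proportion
    print(f"L={length}: ~10^{len(str(total)) - 1} arrangements, "
          f"~{functional:.0e} functional at that proportion")
```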

That’s why highly constrained sequences aren’t the ones that evolve new functions…

Edit: But apparently they can evolve new functions anyway. See the article that evograd linked later in this thread. So your point is even more wrong than I thought.


It’s also important to note that the Lenski experiment was specifically designed to limit evolution. The populations of E. coli are all kept in the same environment, with the same selective pressures, etc. The fact that we’ve seen the amount of evolution that we have is nothing short of remarkable.

With populations in the natural world, which have variable selective pressures that change over time, we’d expect to see far more evolution than in the LTEE. The LTEE can be seen as a minimum (amount of evolution), not an average or a maximum as ID/creationists sometimes like to imagine.


We also didn’t see the appearance of an IC system created by an Intelligent Designer.


This seems like a good thread to mention this new paper, accepted but not yet typeset, reviewing the role of chance and simple processes in evolving protein complexity, from Joe Thornton and colleagues:


Simple mechanisms for the evolution of protein complexity
Proteins are tiny models of biological complexity: specific interactions among their many amino acids cause proteins to fold into elaborate structures, assemble with other proteins into higher-order complexes, and change their functions and structures upon binding other molecules. These complex features are classically thought to evolve via long and gradual trajectories driven by persistent natural selection. But a growing body of evidence from biochemistry, protein engineering, and molecular evolution shows that naturally occurring proteins often exist at or near the genetic edge of multimerization, allostery, and even new folds, so one or a few mutations can trigger acquisition of these properties. These sudden transitions can occur because many of the physical properties that underlie these features are present in simpler proteins as fortuitous by-products of their architecture. Moreover, complex features of proteins can be encoded by huge arrays of sequences, so they are accessible from many different starting points via many paths. Because the bridges to these features are both short and numerous, random chance can join selection as a key factor in explaining the evolution of molecular complexity.


From the article that @evograd linked:

The A and B domains of protein G from Streptococcus, each ~50 amino acids long, are unrelated in sequence and structure: domain A consists of 3 α-helices, and domain B consists of 4 β-strands and one helix, with a completely different tertiary fold and distinct sets of residues that make up the hydrophobic core. The authors gradually walked the two sequences towards each other in sequence space by swapping individual residues between the proteins without changing or losing the fold of either one. Ultimately, two protein sequences were achieved that differed by a single amino acid, each occupying one of the folds; the remaining mutation completely reorganized either fold into the other. Even though the “wild-type” A and B domains are as distant from each other in sequence space as is possible, a continuous network of sequences encodes each fold and, in at least this one location, the two networks are adjacent to each other.

Wow, that’s crazy amazing. And that just goes to show how protein sequence space can be super unintuitive, so it’s impossible to know how likely a functional protein is, or how far it is from another function in sequence space, just on intuition alone. Contrary to what Axe says in his ID book Undeniable.


Also, @colewd, this paper answers your unfounded assertion that because some proteins are under purifying selection, they cannot evolve new structures.

In a classic biochemical study, it took only two point mutations to reorganize the secondary structure and tertiary fold of the N-terminal portion of the Arc repressor. One of these mutations on its own resulted in partial occupancy of both folds, depending on experimental conditions. The bridge in sequence space between the folds is therefore only two mutations long, and the journey across that bridge does not require passing through an unstructured intermediate. The implication is that, even under purifying selection, one fold could rapidly evolve into the other.

So even though it doesn’t really matter for evolution whether highly conserved proteins can form new structures – after all, evolution has all of those less constrained proteins to work on – it turns out that they can regardless.


Here is a critique that you should read from @Art re Axe’s experiment: “Axe (2004) and the evolution of enzyme function.”

Yes, I’ve read it before. It’s pretty interesting, but what’s your point in bringing it up? That just goes against your point, it doesn’t help you.


This is not the claim I made. I am showing evidence of proteins with high levels of functional constraint.

How do you know evolution has all this material? Where are all these loose proteins in the primate populations we are studying?

Let’s mark your prediction here and see how it goes :slight_smile:

Yes, and “under high levels of functional constraint” equates to “under purifying selection.” If you can’t understand that, perhaps you should not be talking about highly conserved proteins. So this paper does answer that talking point of yours.

What does that mean? If you’re asking about less conserved protein-coding genes, consider duplicated genes, or genes from non-coding DNA, that do not yet have a conserved function. There are many of these in some species of grasses, for example.

Edit: When I refer to “genes from non-coding DNA,” I am speaking of de novo genes that arise via a frameshift mutation which creates an open reading frame. Such as the ones described in the article I linked.


Yeah, but as Axe also says in Undeniable, ya don’t need no pointy-headed scientists spewin’ words ya don’t understand ta get it! Evolution just seems really really really wrong, so common sense tells ya that it IS really really really wrong, consarn it all!

Seriously, Axe’s book was one of the most bizarre things I’ve ever seen in this field. And he did commit one of the weirdest howlers I’ve seen, claiming this about the views of modern evolutionary biologists:

“The current stance is that evolution was so successful that it perfected life to the point where modern forms no longer evolve, making the whole process even further removed from the category of observable phenomena.”

That was a mind-roaster, that one.


You agree that 75 mutations became fixed in the population out of about a billion generated. This is not the lab studies you cited above; it is real populations. The waiting time is real and the sequence problem is real.

All good. Now, how do you do experiments requiring many trials in a population?

Lol what? It is in a lab, that’s the whole point of the LTEE! Are you talking about the same thing that I am? Or do you legitimately not know what the LTEE actually is?

So provide your experimental evidence… I’m still waiting. Meanwhile, I’ll stick with the experimental evidence that says that they’re not real.

Hmm? We observe these less conserved proteins, that’s just a fact. How do you want us to do experiments? Do you know the point of experiments? They test hypotheses, not observations.


In today’s world, New York City would grind to a standstill if the power went out city wide. In fact, this has happened in the past. So how could the city have functioned 200 years ago without electricity?