Miller: Axe Decisively Confirmed?


(Mikkel R.) #13

It’s really odd. It would have made more sense for Axe to take some specialized enzyme, grow his bacteria on plates containing a closely related substrate the enzyme is not normally active on, and test whether he could select for activity on that related substrate. That would actually go some way towards indicating how far apart different functions are from each other. How many mutations and generations would it take to find that other function from the extant one?
Even then, it is difficult to extrapolate from a data point of one to a general case for all proteins.

The conclusion that this was done to generate the lowest number is hard to escape. It seems that if Axe really thought that he had globally-relevant data, he would have done the same sort of experiment with many other unrelated proteins in the last 14 years.

At the very least he could have tested the activity of his enzyme on a host of related substrates that beta-lactamases are known to have some cross-reactivity for.

(S. Joshua Swamidass) #14

As I understand it, he didn’t even do this experiment. It was done by a friend of ours working underneath him :smile:.

(Mikkel R.) #15

On the topic of @bjmiller’s EN&V post, he writes:

The rarity would then be less than 1/3 to the power of the sequence length. This estimate closely matches the result from Axe’s 2004 β-lactamase experiment that only 1 in 10^77 sequences corresponds to a functional fold/domain within the protein.

This is the typical impression ID proponents have taken from Axe’s work, because it is how the work is normally sold to them in the ID literature: as showing the prevalence of any protein with a functional domain (or worse, any functional protein at all). We can see this here: Imagine: 60 Million Proteins in One Cell Working Together

This paper is interesting because it relates to the work of Douglas Axe that resulted in a paper in the Journal of Molecular Biology in 2004. Axe answered questions about this paper earlier this year, and also mentioned it in his recent book Undeniable (p. 54). In the paper, Axe estimated the prevalence of sequences that could fold into a functional shape by random combinations. It was already known that the functional space was a small fraction of sequence space, but Axe put a number on it based on his experience with random changes to an enzyme. He estimated that one in 10^74 sequences of 150 amino acids could fold and thereby perform some function — any function.

But this is of course wrong. Very very wrong. Even Ann Gauger says this is not what Axe has shown, as we can see here:

Doug’s paper showed the rarity of a functional protein with a particular activity (B-lactam) and a particular structure (TEM-1 B-lactam) (that’s what he and I mean by a functional fold BTW). Out of all possible protein structures only 1 in 10^77 will have that structure and that enzymatic activity. It’s a way of answering the question, how many ways are there to make a protein that has that particular structure with that particular chemistry out of all possible proteins.
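For reference, the arithmetic behind the quoted "1/3 to the power of the sequence length" estimate can be checked in a few lines. This is only a sketch: the length of 150 residues is an assumption taken from the 150-amino-acid figure quoted above, the function name is made up for illustration, and nothing here says whether the 1/3-per-site premise is right.

```python
import math

def log10_rarity(per_site_fraction: float, length: int) -> float:
    """Return log10 of per_site_fraction ** length without float underflow."""
    return length * math.log10(per_site_fraction)

# Assumption: 150 residues, matching the 150-amino-acid figure quoted above.
exponent = log10_rarity(1 / 3, 150)
print(f"(1/3)^150 is about 10^{exponent:.1f}")

# Sequence length at which (1/3)^L would equal exactly 1 in 10^77.
length_for_1e77 = 77 / math.log10(3)
print(f"(1/3)^L equals 10^-77 at L of about {length_for_1e77:.0f}")
```

With these inputs, (1/3)^150 comes out near 10^-71.6, and the per-site figure of 1/3 reaches 10^-77 at a length of about 161 residues.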

(Mikkel R.) #16

I’ll just copy-paste the relevant parts of my previous response to @bjmiller’s citation of the Tokuriki and Tawfik paper (Bershtein et al 2006) here:

No, they don’t. One of them (Bershtein et al 2006) was deliberately set up to exclude several well-characterized mechanisms of evolutionary change in order to better understand, in isolation, the consequences of a single mechanism of change in the absence of the effects of the others. It only allowed the effects of mutations within the reading frame of the protein. Potentially compensatory chromosomal mutations were avoided by deliberately only mutating the plasmid genes with PCR, and then transforming competent cells to measure the fitness effects of those mutations.

The TEM-1 gene was cloned into a plasmid (as it occurs in nature) under its endogenous promoter. Recloning after each round of mutagenesis confined the mutational drift to the open reading frame of TEM-1. Our in vitro random mutagenesis protocol was optimized for high reproducibility and was calibrated to obtain, on average, two mutations per gene per round of mutagenesis. We maintained three populations of randomly drifting TEM-1 genes: one population under no selection (Lib0), and the rest under purifying selection at ‘high’ and ‘low’ stringencies. Each population, or plasmid library, was separately mutated, ligated into an empty vector and transformed into E. coli host cells; it then underwent purifying selection: ‘high’ selection pressure (250 µg ml⁻¹ ampicillin; Lib250; Supplementary Fig. 2), and ‘low’ selection pressure (12.5 µg ml⁻¹ ampicillin; Lib12.5). After growth on selection plates, plasmid DNA was extracted from the surviving E. coli colonies, and the TEM-1 genes were subjected to the next round of mutagenesis. Altogether, ten successive rounds of mutagenesis and purifying selection were performed. Loss of diversity was less than 50% per round, and a diversity of at least 10^6 variants per library was maintained throughout.
As expected, a rapid fitness decline was observed in Lib0 (no selection). The fitness of the selected populations (Lib12.5 and Lib250) remained unchanged under the threshold of selection, and decreased above that threshold (Supplementary Fig. 3).

This figure is from supplementary materials of Bershtein et al 2006.

This completely rules out the possibility of compensatory duplications, other forms of regulation of gene dosage, compensatory chromosomal mutations, and so on.

And even then, it is noteworthy that the aspect of the protocol that involved purifying selection was still able to maintain structural integrity of the protein against the prevalence of deleterious mutations.

Fig. 3. The fitness ‘landscape’ of the TEM-1 gene.
The fitness dynamics of the different TEM-1 libraries is presented as a function of mutational input. The average fitness (W) of a given population was defined as the fraction of β-lactamase variants that confer resistance at a given concentration of ampicillin (see Methods). Wild-type TEM-1 exhibited W=1 for all ampicillin concentrations ≤ 2500 µg/ml. All fitness measurements are detailed in Supplementary Table 1. The rapid fitness decline of the unselected library Lib0 is shown at 12.5 μg/ml of ampicillin (○). The fitness of the libraries subjected to purifying selection remained unchanged at concentrations under the applied selection thresholds, as exemplified here by Lib12.5 at 50 μg/ml ampicillin (∆), and Lib250 at 500 μg/ml (F). At concentrations exceeding the selection thresholds, constant decreases in fitness were observed, exemplified by Lib12.5 at 500 μg/ml ampicillin (◊). Note that the impact of ampicillin is much higher on freshly transformed cells (as in the purifying selections) than on ongrowing, replicated colonies (as in the fitness measurements). Thus, the threshold ampicillin concentration for the fitness measurements was found to be ≤100 μg/ml for Lib12.5 (selected with freshly transformed cells at 12.5 μg/ml ampicillin), and ≤1000 μg/ml for Lib250 (selected at 250 μg/ml).
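The definition of W in the caption can be written out directly. This is a minimal sketch, assuming each variant is summarized by the highest ampicillin concentration it tolerates. The function name and the numbers are invented for illustration and are not data from the paper.

```python
def average_fitness(tolerated_ug_per_ml, ampicillin_ug_per_ml):
    """Fraction of variants that confer resistance at the given concentration."""
    if not tolerated_ug_per_ml:
        raise ValueError("need at least one variant")
    resistant = sum(1 for t in tolerated_ug_per_ml if t >= ampicillin_ug_per_ml)
    return resistant / len(tolerated_ug_per_ml)

# Hypothetical library: highest tolerated ampicillin concentration per variant.
library = [0, 0, 25, 50, 100, 500, 2500, 2500]

for concentration in (12.5, 50, 500):
    w = average_fitness(library, concentration)
    print(f"W at {concentration} ug/ml = {w:.3f}")
```

The same made-up library gives W = 0.750, 0.625 and 0.375 at the three concentrations. In other words, W depends on the concentration at which it is measured, and each variant is scored as resistant or not at that concentration.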

The other paper you cited (Lundin et al 2018) explored the fitness effects of mutations and found, completely unsurprisingly, that most mutations are deleterious. They didn’t find anything which supports the view that protein evolution can only go downhill as mutations accumulate. Their protocol did not even include a lineage evolving under purifying selection. All mutations were created directly in DNA by PCR and then inserted in the bacterial chromosome and their fitness effects were tested. When the effects of multiple mutations in combination were tested, it was again in the absence of purifying selection.

Is Doug Axe Right about the Rarity of Proteins?
(John Mercer) #17

The contortions are interesting. Behe ignores neutral evolution, while Axe, @bjmiller, and @Agauger ignore selection.

(Brian Miller) #18

Since Axe first published his paper, critics have consistently raised certain key issues which have been repeated on this forum:

  1. Were other functions present in the protein even after the tested activity ceased?
  2. Is beta-lactamase rarity representative of most proteins?
  3. Could many other proteins perform the same function?

The cited research addresses all of these questions:

  1. The loss of function corresponds to the destabilizing of the protein, so all functions related to a stable protein must cease.

  2. The 21 studied proteins all show the same distribution of stability reduction for mutations, so nearly all globular proteins would have a similar minimum rarity in sequence space near an optimized sequence. And, nearly all would become entirely nonfunctional with nearly any random combination of mutations leading to a 10% sequence change.

  3. The number of “sequence families”/single domain architectures (SDAs) is increasing very slowly/“becoming saturated”, and the few hundred thousand “close families” which have been identified are typically combinations of the small number of SDAs. Therefore, a search through sequence space would never even enter the neighborhood of any protein families. The number of SDAs is far too small to find even one.

According to the standard model, at some point in the past, the first representative of an entirely new fold had to appear through a nonfunctional sequence exploring sequence space through random mutations. Natural selection cannot differentiate two nonfunctional sequences, so the search had to be fairly close to random.

All of the key criticisms of Axe’s research have been overturned, and his estimates of rarity are exceedingly optimistic. Yet, they still demonstrate the implausibility for evolution to produce even one novel protein of modest size with a distinctly different fold in the entire history of the earth. The research studies cited as counterexamples almost always relate to the challenge of slightly modifying an existing protein fold, not generating an entirely new one.

Rajendrani Mukhopadhyay, “Close to a miracle”
Also, see Ann Gauger
“Once you have identified an enzyme that has some weak, promiscuous activity for your target reaction, it’s fairly clear that, if you have mutations at random, you can select and improve this activity by several orders of magnitude,” says Dan Tawfik at the Weizmann Institute in Israel. “What we lack is a hypothesis for the earlier stages, where you don’t have this spectrum of enzymatic activities, active sites and folds from which selection can identify starting points. Evolution has this catch-22: Nothing evolves unless it already exists.” (Emphasis added).

(S. Joshua Swamidass) #19

@bjmiller thank you for your response. You did not yet confirm we were properly attributed.

(John Mercer) #20

Wow. So much theater, Brian!

This is hilarious! You’re now misrepresenting Axe’s paper completely, because he never assayed activity at all! If I had reviewed the paper, I would have rejected it immediately for that reason alone, regardless of the results.

Why can’t you even acknowledge the existence of that major problem with the paper? Is it a good idea to turn a continuous variable (activity) into a binary one? Even a physicist can answer that.

The paper didn’t establish that beta-lactamase activity is rare. That was a huge extrapolation.

Catalytic antibodies with measurable beta-lactamase activity are found in less than 10^8 samples of an unimmunized library. That ain’t rare.

But not the criticisms actually made, and it even misses the few you deem worthy of acknowledgement.

Perhaps you should read Axe’s paper. He was in no way looking at space near an optimized sequence, because he chose a temperature-sensitive mutant. Or perhaps you meant “optimized for near-instability”?

That’s directly addressed by 32 years of the catalytic antibody literature.

Both of those claims are simply false.

Why do you repeatedly misrepresent an extrapolation from an N of 1 as a global demonstration?

You’re not addressing catalytic antibodies, which do represent entirely new ones. And you are really missing a basic understanding of folds. Folds are structural classifications that typically encompass many different functions.

(S. Joshua Swamidass) #21

I ratify this concern. If we have already discovered all folds we don’t expect to find new folds, and that doesn’t matter at all. What matters is new functions. Novelty isn’t measured by folds.

(Brian Miller) #22

The challenge is finding novel folds from distant nonfunctional sequences. At some point, an entirely new fold had to appear even if one has to go back to the origin of life. In the middle of sequence space, none of the processes you cite will help a search find an exceedingly rare functional island.

(John Mercer) #23

No, it isn’t. The challenge is only to find novel functions. You’re conflating folds and functions in a way that makes absolutely no sense. They aren’t the same thing.

This is silly. Random sequences fold. It’s what proteins generally do.

Yet catalytic antibodies have been found for the last 32 years, but have never been cited by Axe or Gauger. Multiple hits in a library of 10^8 isn’t rare at all, and destabilizing existing proteins is not a model for a forward search.
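The bookkeeping behind "multiple hits in a library of 10^8" can be sketched as follows. The function names are hypothetical, and "multiple" is taken as 2 purely for illustration. The sketch only does the arithmetic; whether the antibody-library frequency and the 1 in 10^77 figure measure the same thing is exactly what is being argued here.

```python
import math

def log10_expected_hits(log10_library_size, log10_frequency):
    """log10 of the expected number of hits: library size times frequency."""
    return log10_library_size + log10_frequency

def naive_frequency_log10(hits, log10_library_size):
    """log10 of the naive frequency estimate, hits divided by library size."""
    return math.log10(hits) - log10_library_size

# Figures from the thread: a 10^8 library and a frequency of 1 in 10^77.
print(log10_expected_hits(8, -77))

# Assumption: "multiple hits" is taken as 2 for illustration.
print(round(naive_frequency_log10(2, 8), 2))
```

If a frequency of 10^-77 applied to such a library, the expected number of hits would be 10^-69. Two hits in 10^8 corresponds to a naive frequency of roughly 10^-7.7.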

(S. Joshua Swamidass) #24

So @bjmiller, there is the T-urf13 example put forward by @art, which appears to have arisen by constructive neutral evolution.

(Brian Miller) #25

I promise to cite you by name in my next article. Please keep listing your thoughts on counterexamples and other related issues, and I will respond to all of the points at once.

(John Mercer) #26

I don’t see how that makes up for the lack of attribution in the previous one. Do you not know that web pages can be edited?

(S. Joshua Swamidass) #27

Thanks @bjmiller, with the current blitz from ENV, it is important to me that we continue to highlight positive dialogue between us. This is an example of such positive dialogue. Peace.

There are five things I would like you to address.

  1. The first one is T-urf13 which has been put forward by @art, and is an example of a three part irreducibly complex system that appears to have arisen without selective benefit, by Constructive Neutral Evolution. This is not Darwinism, but neutral evolution, so you may need to reorient your rhetoric.

  2. How are you going to account for the point that @Rumraket has made, legitimately, that you are not taking negative selection into account in your analysis?

  3. It does appear that an irreducibly complex beta-lactamase can arise out of an unselected library of fewer than 10^10 sequences. We discussed this at length with @Agauger here: Antibody Enzymes and Sequence Space. If this is true (and I believe it is), it appears you are back to square one. This substantiates all the critique @Mercer has been making with hard evidence.

  4. I want to know your situation here, as a proxy for Axe. If you found that Axe was wrong would you be able to and willing to publicly admit it? Or do you need approval to do so? Or do you honestly plan to argue his case no matter what the evidence shows? I ask this question without judgement, but to understand what we can realistically expect from you in this exchange. If you can’t or wouldn’t ever admit error, even if you were convinced of it, it would be good to just know up front.

  5. Getting to the point on #4, can you acknowledge any errors in your own reasoning or in Axe’s reasoning? If not about this, can you provide any evidence of a place where you personally have admitted error publicly? This would go a long way to calming concerns I have, and to making me much more hopeful for progress.

I do not mean any of this as an insult or to be invasive. I plan to answer this last question for myself in a blog post soon. I have admitted error more than once after being corrected by ID proponents. I would like to know if we are being fair to each other now.

Finally, there is one personal question I have.

  1. Why is it that the DI has repeatedly called me a Darwinist? I am not a Darwinist and have explained this recently. Do they mean this as an insult? Are they intentionally trying to misrepresent me? What do you think should be done about this?

These are all the questions I have, and they are each important to me. Thank you for addressing them. I very much look forward to your response.

(S. Joshua Swamidass) #28

@bjmiller, please keep in mind that this field is immense. Do not fixate on merely the three papers under question in that thread. We are seeking understanding with you. I encourage you to bring your objections to @mercer privately or publicly, and allow him to present additional papers if required. This is a body of work that @agauger and Axe are not familiar with, so please accept our help.

(John Harshman) #29

Doesn’t the existence of orphan genes in humans (that have non-coding homologs in chimps) decisively refute the notion that new protein sequences are vanishingly rare?

(Mikkel R.) #30

Why is that the challenge? You wrote in an earlier post that the challenge was to evolve, for example, a flagellum protein from a protein with another function. But a protein with another function would not constitute a “distant nonfunctional sequence”. It’s not clear that a protein with another function would even be distant.

You brought up the example of the filament proteins of the flagellum as a supposed example of one of these proteins that would have been too unlikely to evolve from some other function because you could find homologues of the filament protein with as little as 20-30% amino acid sequence similarity.

I then pointed out that there are filament proteins that simultaneously function as adhesive proteins, and even more hilariously, there are filament proteins that are active enzymes. Filament proteins (flagellin), just like antibodies, have a hypervariable region which mutates a lot (for the reason that filament proteins are often the target of immune system antibodies in multicellular eukaryotes).

Moens S, Vanderleyden J. Functions of bacterial flagella. Crit Rev Microbiol. 1996;22(2):67-100. Review. PubMed PMID: 8817078.

Eckhard U, Bandukwala H, Mansfield MJ, Marino G, Cheng J, Wallace I, Holyoak T, Charles TC, Austin J, Overall CM, Doxey AC. Discovery of a proteolytic flagellin family in diverse bacterial phyla that assembles enzymatically active flagella. Nat Commun. 2017 Sep 12;8(1):521. doi: 10.1038/s41467-017-00599-0

This is evidence that the “islands” constituting different functions in “sequence space” overlap considerably, and evidence against the claim that they are impossibly rare and isolated. Three known functional “islands” overlap in flagellin, the filament protein of the flagellum: it is a filament protein, an adhesive protein, and an enzyme at the same time.

At some point, an entirely new fold had to appear even if one has to go back to the origin of life.

An entirely new fold is not the same as a new function. But proteins with novel folds typically evolve de novo from non-coding DNA.

In the middle of sequence space, none of the processes you cite will help a search find an exceedingly rare functional island.

This idea that functional islands are exceedingly rare and isolated is a fantasy. Taking an existing enzyme, then mutating it until it stops working doesn’t say anything about how frequent functional proteins are in sequence space.

More importantly, you have still not acknowledged your complete misunderstanding of the Bershtein et al 2006 paper you keep referencing.

(Mikkel R.) #31

I’m sorry, but this is false, and it appears you have again completely ignored my earlier response to this same assertion of yours. Proteins necessarily destabilize in response to accumulating mutations only when this happens in the absence of purifying selection. You simply cannot extend conclusions about how accumulating mutations affect protein function and stability in the absence of selection to protein evolution in general, as should be obvious from the Bershtein et al 2006 paper you’ve referenced.

  3. The number of “sequence families”/single domain architectures (SDAs) is increasing very slowly/“becoming saturated”, and the few hundred thousand “close families” which have been identified are typically combinations of the small number of SDAs. Therefore, a search through sequence space would never even enter the neighborhood of any protein families. The number of SDAs is far too small to find even one.

But that simply doesn’t follow. It is not even implied at all. How many folds have so far been discovered and utilized by life on Earth is not itself any indication of how many possible folds are “out there”. It could just as well be the case that, because species share common descent, even if there are still many new organisms to be discovered, they’re going to be related to already discovered life, so chances are they will just have variants of already discovered folds.

(S. Joshua Swamidass) #32

This is another important point I want to “second” here. There is no reason to think that evolution must proceed by evolving proteins from distant sequences.