Gauger and Mercer: Bifunctional Proteins and Protein Sequence Space

This is really helpful. Thanks.

I think the challenge for communicating this to the public @Art is how these numbers translate into statements about the difficulty of evolution. For this, we have to extrapolate to all possible trials and all possible functions. Working out some of that math might be instructive.


This is another point that Ann has muddied up. On the one hand, the ID community holds Axe (2004) as proof that protein function - period, full stop - is impossibly rare in sequence space. On the other hand, to avoid the flaws in Axe’s experimental approach, Ann states that Axe is only interested in the activity of a crippled beta-lactamase under very specific conditions in one strain of bacteria. Heck, this doesn’t even apply to different strains of E. coli. (Of course, her stance is taken to avoid the need to independently confirm, by direct enzyme assay, some of the assertions Axe is making.) I don’t expect Ann to resolve this matter - I will just state the contradiction that her avoidance leads to.

As far as accessibility to evolution, even 2x10^-49 is probably beyond the reach of evolutionary mechanisms. As we have seen above, on the other hand, 10^-10 is well within reach.
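To make "within reach" a bit more concrete, here is a minimal sketch of the chance of at least one success in N independent trials when a fraction p of sequences is functional. The trial counts are hypothetical placeholders chosen for illustration, not figures from this thread, and treating trials as independent random draws is itself an assumption.

```python
import math

def prob_at_least_one_hit(p, n_trials):
    """Chance that at least one of n_trials independent draws is functional.

    Computes 1 - (1 - p)**n_trials via log1p/expm1 so that a very small p
    is not rounded away in floating point.
    """
    return -math.expm1(n_trials * math.log1p(-p))

# Hypothetical trial counts, chosen only for illustration.
for p, n_trials in [(1e-10, 1e12), (2e-49, 1e30)]:
    chance = prob_at_least_one_hit(p, n_trials)
    print(f"p={p:g}, trials={n_trials:g}: chance of a hit = {chance:.3g}")
```

The result depends entirely on the assumed number of trials, which is why the extrapolation to all possible trials and all possible functions matters.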


More experiments using other proteins are always welcome. If anyone wants to take up the challenge and do similar experiments in other systems, that is definitely something worth doing. I myself am not in a position to do that work.

As for our 2011 paper, we claimed that getting a new function from an existing protein by nucleotide changes requires too many base changes to be feasible. By new function I mean a new catalytic activity, not an increase in an activity already present.

Mercer’s work demonstrated that a modification in the strength of binding of myosin to actin could be accomplished by enlarging a binding pocket in myosin’s active site to accommodate a modified ADP. This is a modification to the regulation of an existing activity. It’s not a new activity.

This is why I have not considered Mercer’s work as relevant to the question we asked. The question is: can an existing enzyme be made to have a genuine new catalytic activity with just a few mutations.

It occurs to me that the one way it might happen is by a translocation of domains, such that the new domain either causes the loss of an existing function or the gain of a new one, as happens in cancer. This kind of change can result in modification of growth regulation. A protein that once regulated repression of growth becomes a cause of an increase in growth. I am being very nonspecific here but this is the one case I can think of where a single mutation (not a base change) causes a radical change in function.

There are suggestions that such a thing may have happened in our evolutionary history. For example, a translocation plus additional material added to a protein created a new protein, an orphan if you will, that increases growth in the brain.

I am going to go back and find the quote that Mercer wanted me to address. I will make a comment in a little bit.


When the matter of dispute is the interpretation of paper number one in the light of paper number two, understanding both paper number one and paper number two is highly relevant. You are disputing my paper on the difficulty of producing a new catalytic activity by single mutations, i.e., the number of base changes required. You are claiming that your paper on myosin is relevant. But it seems you don’t know what I am arguing for. So understanding my paper is important.

@Mercer @swamidass

I couldn’t copy from Mercer’s post, but he said he wanted me to defend this statement, I think.

So unless functional sequences are easy to find (very common), and/or are clustered together (easily reachable from one functional island to another), explaining current protein diversity without design is impossible.

I don’t see why anyone disputes it. The main critique we got on the 2011 paper was that we didn’t start from an ancestral sequence, which had enough sequence similarity to acquire the new functions easily. People made this critique because they knew that existing proteins with different functions are far apart in sequence, stabilized by epistatic interactions among amino acids, and resistant to change toward a new function. It would require too many changes! Does Mercer think it’s easy to convert proteins from one function to another if they are far distant in sequence space?

The question to be disputed is whether they are far distant in sequence space. This may be a surprise to some, but all Doug shows is that functional folds are very rare in sequence space, not how they are arranged in sequence space. That is another question. Arguments can be made but I am not going to do that here. I have said as much as I care to about Doug’s paper.

Are our proteins in the 2011 paper proteins with shared structure far apart in sequence space? They certainly can be. Even proteins with the same function can be, because they have differing stabilization: different epistatic interactions to stabilize the fold. Every biologist knows this instinctively. No one was surprised when we couldn’t do a conversion. So then, let us ask the key question. If proteins are so established in their particular functions that they can’t shift function easily, where did all the diversity of protein function we have come from? Some kinds of change are possible. I have been discussing that here with Mercer. Others do not seem to be possible, yet we see evidence they must have happened in the protein record. I simply want to point out that the conclusion that they evolved from each other is not based in any experiment. It is derived from evolutionary thinking. It may be true. Perhaps with the simpler enzymes it could be demonstrated. But the experiment must be done rather than simply assume that the patterns we see are due to evolution. Simple point. Should be understandable.

My statement above was “if this, then that”. “This” may or may not be true, so “this” is what we need to establish—if this is true.

BTW, I don’t think anyone has demonstrated that it is possible to go from a hypothetical ancestral sequence to the modern form, all the way. They have shown the first few steps and that is all.



I disagree.

If there is a combinatorially large number of possible functions, then there is no reason to think this is outside the reach of evolutionary mechanisms. Think of protein-protein binding, as one example. There are a combinatorially large number of binding “functions” (it is an all-by-all set of possible pairs, each one another “function”).

Also, if functions are “composable” by (for example) recombination and exon shuffling, then we divide the exponent by the number of units we are composing. Combined with the fact that small peptides can do a lot of useful things, composing them into more complex functions makes the 10^-49 an illusory value. The key concept here, in addition to “composition,” is exaptation.
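The "divide the exponent" step can be sketched as follows. This assumes, as a simplification not spelled out in the post, that a function is built from k units that are equally rare and can each be found independently before being combined. The unit counts are illustrative only.

```python
def per_unit_exponent(total_exponent, n_units):
    """Base-10 exponent of each unit's rarity when a total rarity of
    10**total_exponent is split evenly across n_units independent units."""
    return total_exponent / n_units

# Illustrative unit counts; -49 is the exponent under discussion.
for n_units in (1, 2, 5, 7):
    exp_each = per_unit_exponent(-49, n_units)
    print(f"{n_units} unit(s): each unit has rarity of about 10^{exp_each:.1f}")
```

Under these assumptions, seven composable units would each need to be found at roughly 10^-7 rather than the whole at 10^-49.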


The EBG system in E. coli may be of interest to people in this thread. This is an artificially evolved β-galactosidase enzyme in E. coli. The mutations leading to substrate changes are known.

Adding a new protein to the mix might be overkill, but I thought it might offer something where the effects of mutations are known.



The matter of dispute is the GLOBAL claims you’re making about the prevalence of function in sequence space, on the basis of ignoring most of the literature in favor of your and Axe’s limited, flawed experiments.

You’re behaving as though no one else has produced anything of value in testing your global claims. If you’re right, none of what we tried should have worked, given the far greater complexity of myosins over your proteins of choice.

I think it should be obvious that I dispute the premises. Our results argue strongly against both of your premises, as do most others in protein engineering. As a far larger example, if proteins are as functionally fragile as you claim, how is it that so many of us can slap GFP (green fluorescent protein) on either end of so many proteins while retaining function?

Hey, what if someone inserted GFP randomly INSIDE proteins? Certainly that couldn’t possibly work if you are correct, wouldn’t you agree?

I’m still waiting to learn where you read this:

Myosin head domains, which are extraordinarily complex functionally, can differ by more than 70% without losing the basic functions of actin-dependent ATP hydrolysis and induction of actin binding in low ATP/ADP ratios and actin release in the converse condition. That’s how they move, and movement also is assayed experimentally.

So, I simply can’t imagine what you have read that would lead you to make such a claim.

It seems that you have yet to truly read the paper.

Wow. You seem to be trying very, very hard to misunderstand it.

To put it in the simplest terms possible, the wild-type Myo1c simply doesn’t notice N6(2-methylbutyl)ADP, nor N6(2-methylbutyl) ATP. The Y61G mutant does all the things that wild-type does with ADP and ATP with the N6(2-methylbutyl) analogs. How is that not a new activity?

And you keep harping on the fact that we didn’t lose the original functions in the mutant as though it’s a bug, when in fact it’s a feature. It’s a feature that blows a hole in your global claims about how easy function is to find in sequence space. I’m pretty sure that is why Swamidass chose the term “bifunctional proteins” to put in the title!

The reason why is that you don’t understand the most important features of the mutant.

And we showed that it can be done with a single mutation, without even giving up the original activity.

Did this occur to you only recently?

My memory was wrong. I found two papers. There may be an earlier paper with a different estimate, but these shown here roughly agree.

Experimental measurements in several different proteins indicate that the likelihood of mutation to be deleterious is in the order of 33–40% [2,7,13] (36%, on average). Hence, as mutations accumulate, protein fitness declines exponentially [2]: $W \approx e^{-0.36n}$ (2) (where n is the average number of mutations) or even more than exponentially (see section on ‘epistatic effects’). So by the time an average protein accumulates, on average, five mutations, its fitness will decline to <20%. Thus, although the initial stability of a protein can buffer some of the destabilizing effects of mutations (Figure 1a), stability appears to comprise the main factor (although clearly not the only one [6]) that dictates the rate of protein evolution [1,4], and possibly of whole organisms [14,15,16], in particular, but not only, in relation to the acquisition of new functions.
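As a quick check of the arithmetic in the quoted passage, here is a minimal sketch that assumes equation (2) is the simple exponential decay W ≈ e^(-0.36n), with 0.36 being the average deleterious fraction quoted above. The function name is made up for illustration, and the sketch ignores the epistatic effects the passage mentions.

```python
import math

def relative_fitness(n_mutations, deleterious_fraction=0.36):
    """Relative fitness W after n_mutations under W = exp(-f * n),
    where f is the assumed average fraction of deleterious mutations."""
    return math.exp(-deleterious_fraction * n_mutations)

# Tabulate the decay for the first ten mutations.
for n in range(0, 11):
    print(f"n={n:2d}  W={relative_fitness(n):.3f}")
```

At n = 5 this gives a value below 0.2, matching the "<20%" figure in the passage.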

Here is the figure I remember, but without numerical values.


Proteins can tolerate roughly 35-40% mutational substitutions. Note: this is not the same as comparing sequence differences between homologues, which have accumulated differences over time and stabilized them as they went. Rather this is the sudden introduction of mutations without time for stabilization. Here’s another.


Can you show what the genuine new catalytic activity for your myosin is? I don’t think we are using the terms the same way. Since you won’t look at my paper to see what I mean, I would like to hear from you what you mean.

Snide. And no.


Dr. Gauger,

Can you tell me why you keep moving the goalposts from what you claimed:


I’ve already explained that to you.

You seem to be unwilling to assess our work in the context of your claims about sequence space. It appears that “genuinely” and “folds” are just undefined rhetorical handles you use to make it easier to move the goalposts.

Let’s try again. What does our work show about your two premises:

  1. how easy functional sequences are to find; and
  2. how clustered functional sequences are together?

Nothing about your papers, please, and no use of “genuinely” and “folds.”


I don’t see how either of the papers is consistent with your claim of “complete catastrophic collapse.”

Do you think that a 5-fold reduction in an enzymatic activity is a big deal?

And how about stability: would a 10-fold reduction in stability cause problems in vivo?

Yes. The latter is far more representative of evolutionary processes. :smile:



You have consistently not engaged in a dialog with me. You have repeatedly refused to read what I say or answer my questions, which are relevant, and you then make non sequiturs, or move the goal posts yourself.

I have told you why I think your myosin experiment is not relevant. I will say it again. The change you made was to one amino acid. Did it change the myosin’s catalytic behavior, such that it carried out a substantially different chemical reaction? No.

To put it in the simplest terms possible, the wild-type Myo1c simply doesn’t notice N6(2-methylbutyl)ADP, nor N6(2-methylbutyl) ATP. The Y61G mutant does all the things that wild-type does with ADP and ATP with the N6(2-methylbutyl) analogs. How is that not a new activity?

Did it change the enzyme’s reaction type? For example, did it go from being an aminotransferase to an aminotransferase plus decarboxylase? No. It hydrolyzes ATP in the absence of the modified ADP, just as before. Your wild type myosin doesn’t interact with the modified ADP. Your mutant myosin does. The modified ADP causes the mutant myosin to freeze on actin. That’s new behavior, sure. But it’s because the myosin mutant, in the presence of the modified ADP, can’t complete the ATP hydrolysis.

Let’s try again. What does your work show about my two premises:

  1. How easy functional sequences are to find;

They are still not easy to find. Your example does not meet the criterion of a genuinely new chemistry, involving a different reaction type, which I have explained, to no avail.

  2. how clustered functional sequences are together?

In your case they are right next door, and they are functional. They just aren’t genuinely new chemistry, involving a different reaction type, which I have explained, to no avail.

I could say something rude here, but I won’t. I have answered you more than once why I think your paper does not bear on what I have said. You are entitled to disagree. But I have no reason to continue.


You seem to be determined to put the worst possible construction on everything I say. This topic first came up as a side comment apropos of nothing. But I looked up some references because you asked.

Do I think a 5-fold reduction in activity is a big deal? It depends on the particular sensitivity of the reaction, I would imagine.

Well, it certainly affected antibiotic resistance in vivo.


