Brian Miller: Co-option and Irreducible Complexity

I’m not sure about that. Say that we want to create a protein that can bind 2 molecules. We will need at least 2 binding sites. If a single binding site requires about 40-50 aa, then we will need about 100 aa for a functional protein. This is why most proteins are at least 100 aa long.

But what about functions that are not closely related? How many mutations would we need to change cytochrome c into histone H4, for instance?

That’s a bad example, because those are not thought to be actually related, so how many mutations that would take is irrelevant, as the case for evolution doesn’t rest on that specific transition having happened.

There are examples of radical transformations in protein function by even single mutations. For example, it has been shown that an enzyme can completely alter its function and become a sort of structural protein involved in coordinating the growth patterns of multicellular tissues through a single amino acid substitution. See this:

Anderson DP, Whitney DS, Hanson-Smith V, Woznica A, Campodonico-Burnett W,
Volkman BF, King N, Thornton JW, Prehoda KE. Evolution of an ancient protein
function involved in organized multicellularity in animals. Elife. 2016 Jan
7;5:e10147. doi: 10.7554/eLife.10147.


I’d like to thank you guys for taking the time to give this latest anti-science propaganda from the DI the thorough beatdown it deserves. It’s no secret I see the DI as a pox on the scientific community with their nefarious goal of undermining science education in the U.S. to get their religious views back into public schools. Please keep up the good work!

Actually, according to evolution almost every protein is related, since all of them (or most of them) are supposed to have evolved from a common ancient protein. So it’s possible to change a cytochrome c into histone H4 even according to evolution.

True, but these are similar proteins. I’m talking about very different proteins.

It is statements like this which disqualify ID supporters in the eyes of the scientific community. There are tons and tons of examples of proteins with homologous function that differ by more than 20%, as noted by other posters. Just randomly looking at a few conserved proteins will demonstrate this, such as human MMP9 being 40% different from zebrafish MMP9. Anyone who has done any simple search of sequence identity between widely conserved proteins can easily find examples that go well beyond the silly 20% limit given by ID supporters.

Will ID supporters change their act? Given their known history, it doesn’t bode well, but I am somehow naively optimistic.

Indeed, the disconnect between what is easily found in sequence databases and the interpretation by ID proponents of dedicated mutational analyses of protein structure and function is striking.

It would be very, very helpful to the community of “critics” here if @bjmiller (and perhaps @Agauger) could spend some time here to rationalize what appear to be some glaring contradictions. For starters, how might you reconcile your interpretations of Bershtein et al. (Nature 444(7121):929-32, 2007) with @T_aquaticus’s comment (which is entirely accurate)?

(In order to minimize jumping around, here is what @bjmiller said about Bershtein et al. in a prior thread:

The study demonstrates that after 1 to 2 mutations to a protein, about 2/3 of the following possible non-synonymous mutations could be “tolerated”. However, after a few more mutations (around 5-6), the likelihood of the following non-synonymous mutation being tolerated is roughly 1/3. This result closely matches Axe’s result for the rarity of functional sequences in the vicinity of a functional beta-lactamase: (1/3)^150 is around 1E-77.)

Thanks in advance for your continued discussion.
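
For readers who want to check the arithmetic in the quoted passage, here is a minimal sketch. It only evaluates the expression (1/3)^150 as written, using the two figures from the quote (a 1/3 tolerance per substitution and 150 positions); it does not address whether that model of mutational tolerance is appropriate. The variable names are mine, chosen for illustration.

```python
import math

# Figures taken from the quoted passage (assumed, not independently derived):
tolerated_fraction = 1 / 3   # chance that the next substitution is tolerated
positions = 150              # number of positions in the exponent

# Work in log10 so the result is easy to read as a power of ten.
log10_value = positions * math.log10(tolerated_fraction)

print(f"(1/3)^{positions} is about 10^{log10_value:.1f}")
```

Evaluated this way, the expression comes out nearer 10^-72 than 10^-77, so the two figures in the quote are roughly five orders of magnitude apart rather than a close match.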


For those who are interested, Homologene is a fun database to check out. Enter the gene of interest in the search bar and it will give you hits, including other genes that are similar. Click on a link, and then click on “Show Pairwise Alignment Scores” on the next page. You can check out the scores for cytochrome c (cycs) here, and wouldn’t you know it . . . human and yeast cytochrome c is only 58.4% similar at the protein level, and yet the protein functions fine in both species. According to ID supporters, we shouldn’t be making these observations.
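
If you would rather check a pairwise score yourself, the sketch below shows one common way to compute percent identity from two sequences that have already been aligned. The function name, the gap convention, and the toy sequences are all assumptions for illustration; the toy strings are made up and are not real cytochrome c, and the database may compute its scores differently (for example, by counting similar residues rather than identical ones).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments, NOT real protein sequences.
toy_a = "ACDEFGHIKL"
toy_b = "ACDQFGH-KV"
toy_identity = percent_identity(toy_a, toy_b)
print(f"toy identity: {toy_identity:.1f}%")
```

Excluding gapped columns from the denominator is only one convention; dividing by the full alignment length or by the shorter sequence gives somewhat different numbers, which is worth keeping in mind when comparing figures from different tools.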


I think you may have misread one of the articles. The Chatterjee et al. paper defined a step in terms of a single base change. A “step” in the Liu and Ochman article corresponds to gene duplication and then significant alterations to the sequence which could correspond to well over 100 nucleotide changes (steps).

As @Mercer is explaining, the notion that protein functionality requires long, unique polypeptides and folds is wrong. Direct and very interesting experiments show this in no uncertain terms. This means that any model in which the “probability” of protein functionality scales directly, or exponentially, with polypeptide length is not going to be correct.

I have examined abzyme research in more detail with help from experts of course, and I now realize I may have overstated their relevance to even understanding the active sites of proteins. The binding sites of abzymes and true enzymes are qualitatively different, and the former have much weaker activity:

It has also appeared that the efficiency of antibody catalysts to accelerate chemical reactions ((k_cat/K_M)/k_uncat or 1/K_TS) is much lower than that of natural enzymes. Indeed, k_cat/K_M values reported for catalytic antibodies range from 10^2 to 10^4 M^-1 s^-1, while those of natural enzymes range from 10^6 to 10^8 M^-1 s^-1. In addition, a significant fraction of hapten binders failed to catalyze the target reactions.
Rémy Ricoux, Jean-Pierre Mahy, in Comprehensive Natural Products II, 2010

The reason for their shortcomings is that the activity of true enzymes results from far more complex interactions between the substrate and the enzyme than simply binding to a substrate and transition states.
Pratul K Agarwal, Enzymes: An integrated view of structure, dynamics and function

The challenge is transforming one protein with a certain complex fold structure into another protein with another fold structure, not simply altering a binding site within an already existing larger structure which does not change.

Let me be more specific: has any experiment taken one structural protein and then continued to mutate it until a large portion of its sequence was altered without seeing a drop in function? The key issue is the difficulty of a random search finding a functional sequence. The research I referenced indicated that proteins degrade quickly on average with random mutations, and the average negative effect of mutations increases with the number. Does any research addressing that specific issue challenge those results?

The average effect of mutations could generally degrade function even if every single nucleotide could be changed to another nucleotide. As a quick example, imagine every position in a protein could have 1 of 2 amino acids without any loss of function, but any other change disables it. In that case, two fully functional versions of the protein could have 0% sequence similarity, but the rarity of that protein in sequence space would be 2/20 = 1/10 raised to the power of the length of the protein. So a 100 aa protein would be impossible to find through a random search even though two species could have completely different versions of it.
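
The toy model in the previous paragraph can be worked through exactly. This is only a sketch of that hypothetical (2 allowed residues out of 20 at every position, length 100); the constant names are mine, and nothing here says the model reflects real proteins.

```python
from fractions import Fraction

ALPHABET_SIZE = 20     # standard amino acids
ALLOWED_PER_SITE = 2   # the post's hypothetical: 2 acceptable residues per position
LENGTH = 100           # the post's example length

functional = ALLOWED_PER_SITE ** LENGTH   # sequences that work under the toy model
total = ALPHABET_SIZE ** LENGTH           # all sequences of this length
fraction = Fraction(functional, total)    # exact ratio, no floating-point rounding

# (2/20)^100 reduces exactly to 1/10^100.
print(fraction == Fraction(1, 10 ** LENGTH))
print(f"functional sequences under the toy model: about {functional:.3e}")
```

Under these assumptions the functional fraction is exactly 1 in 10^100, while the count of functional sequences is still 2^100, which is how two working versions can share no residues at all.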

Why in the world would anyone insist on this? It has little to do with how we think evolution progresses. Why do you think this is relevant @bjmiller? It appears you are arguing against a straw man version of evolution.


No, they won’t change. ID is still a religiously motivated political movement, not a scientific one. Their target audience is still uneducated laymen, not the actual scientific community. As long as the DI can keep donations flowing in from the religious True Believers they’ll keep pumping out the worst science-free dreck to tell the True Believers what they want to hear.

Which is not at all surprising when you don’t include natural selection to weed out deleterious mutations, and disallow any form of compensatory epistasis from anything other than more substitutions in the same protein.

I believe I already addressed this. The key concepts left out of your scenario are natural selection and compensatory epistasis.

One of the authors of the paper you cite (Dan Tawfik) has done extensive research on this. It is even discussed later in the paper he coauthored that you cite. I suggest you read the discussion there, the references, and then find citing and similar articles on Google Scholar. Perhaps you can read even earlier work of his, such as these:
Tokuriki N, Tawfik DS. Protein dynamism and evolvability. Science. 2009 Apr 10;324(5924):203-7. DOI:
Soskine M, Tawfik DS. Mutational effects and the evolution of new protein
functions. Nat Rev Genet. 2010 Aug;11(8):572-82. DOI:10.1038/nrg2808

If you include natural selection to weed out strongly deleterious mutations, and allow for compensatory epistasis due to things like increasing gene dosage through upregulation of expression, upregulation of chaperones, or increased copy numbers through duplication, the potentially less stable intermediates are compensated for, allowing further change to the protein than can be obtained merely through unconstrained mutation accumulation without selection or regulation changes. And then there’s compensatory epistasis from the rest of the genome, a phenomenon your cited studies were explicitly designed to exclude.


True, but it’s just another variation of cytochrome c. Even if we have about 10^50 possible variations of cytochrome c, out of 20^100 possible combinations it’s actually nothing.
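
To put a number on that comparison, here is a small sketch that takes the post's two figures at face value (10^50 variants, sequences of length 100 over 20 amino acids). Both inputs are the poster's assumptions, not measured values, and the variable names are mine.

```python
import math

variants_log10 = 50                  # 10^50 variants, as assumed in the post
space_log10 = 100 * math.log10(20)   # log10 of 20^100, all sequences of length 100

# Dividing the two counts is a subtraction in log10 space.
ratio_log10 = variants_log10 - space_log10

print(f"20^100 is about 10^{space_log10:.1f}")
print(f"10^50 / 20^100 is about 10^{ratio_log10:.1f}")
```

This only quantifies the ratio the post describes; whether a random draw from the whole space is the right comparison is the point the replies dispute.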

In a previous post, @bjmiller wrote:

It is pretty obvious that there are many proteins that can differ by more than 20% and still retain their original function. Wouldn’t you agree?


How do you determine that a protein has no function? That’s the first big problem. There are millions of possible substrates that a protein can interact with, so how do you determine if a protein has no function if you don’t test it against all possible substrates? This is one of the major failings of ID arguments. They blindly assume that a protein can only ever have one function. For example, Douglas Axe tested for beta-lactamase activity in his mutated proteins as if beta-lactams are the only substrate that exists in nature. After testing these proteins against just one substrate out of millions he declared they had no function. That makes no sense.

Sure, it may be possible. It depends on the protein and the function we are talking about. It also depends on the creature.

It’s not only possible, they EXIST IN NATURE.


Of course. Which experts?

That’s odd, as I was using them as a very large example of how you were grossly misrepresenting what is known (“all evidence”) about the prevalence of function in protein sequence space, not as “understanding the active sites of proteins.” Where did that even come from?

Indeed they are different, and that is obviously because they are highly constrained and must fit into the framework of an immunoglobulin fold (note, Brian, this is how real biologists use the term “fold”). The active site is like a pair of lips, while most unconstrained enzymes have pockets for their substrates.

You’re really missing the fact that this difference doesn’t help you, because it means that the 10^-8 frequency we see with catalytic antibody screens is a gross underestimate of the frequency with unconstrained proteins.

I see an actual testable hypothesis there, Brian. Do you?

These are true enzymes. They have lower activity because they are highly structurally constrained. Do you see the testable hypothesis yet?

No, it isn’t, as you didn’t even mention function!

The challenge is getting you to use the same terminology people in the field do.

We know with absolute certainty that the structural classification “folds” doesn’t correspond to activities. And you’re fudging by pretending that catalytic antibodies are mere binding sites. Do you even realize that binding is pretty much the basis of catalysis?

What’s the difference between a “complex fold structure,” a “fold structure,” a “fold,” and a “structure”? You’re tying yourself into semantic knots to avoid confronting the simple fact that function is easy to find in random sequence space, about one in 10^8 in literally hundreds of independent trials, even when we dramatically constrain the size and the supporting structure of the random sequences being screened.

Please explain how a single backwards model in which enzymatic activity was never even measured trumps this giant mass of evidence.

This is my assessment as well.