Brian Miller: Co-option and Irreducible Complexity


(Mikkel R.) #21

There are still no targets. Increasing the number of targets is still meaningless. Evolution isn’t trying to find some particular function, so whether you just use 1 or multiple targets for the same function is irrelevant. Anything that works is what evolves. There are many different ways of increasing fitness.

Think of it this way. Say there’s an antibiotic in the environment which, if transported into the cell, disrupts some key metabolic process. How can this problem be solved? We see that some bacterium has an enzyme that degrades the antibiotic. Now you come along and think this enzyme is the target. But the enzyme was never the target; it is the result of blindly sampling a huge space of possible solutions, of which enzymes that degrade the antibiotic are just one kind.
Some of those solutions could have been to change the transporter and stop taking the antibiotic in. Some could have been to change regulation of the metabolic process. And so on. Rather than the “target” being “this enzyme, with more peaks”, it is anything that reduces the effect of the antibiotic on fitness. The number of possible solutions goes way beyond the one enzyme that evolved, so while any particular solution looks unlikely in isolation, there are so many possible solutions that chances are one of them can and will be found.
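
A rough way to see why many alternative solutions change the picture: even if each route to resistance is individually unlikely, the chance that at least one of them turns up can still be large. The sketch below assumes the routes are independent and uses made-up probabilities purely for illustration; `prob_any_solution` is a hypothetical helper, not something taken from the discussion.

```python
def prob_any_solution(per_solution_probs):
    """Probability that at least one of several independent solutions is found.

    Assumption: each solution is found (or not) independently of the others.
    """
    prob_all_missed = 1.0
    for p in per_solution_probs:
        prob_all_missed *= (1.0 - p)
    return 1.0 - prob_all_missed


# Hypothetical numbers: one specific solution vs. a thousand equally unlikely ones.
single_target = prob_any_solution([0.001])
many_routes = prob_any_solution([0.001] * 1000)

print(f"one specific solution: {single_target:.4f}")
print(f"any of 1000 solutions: {many_routes:.4f}")
```

With these invented values, the single pre-specified solution is found 0.1% of the time, while some solution out of the thousand is found in roughly 63% of cases.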

Not that there’s any guarantee. Extinctions still happen and populations can fail to adapt to changing circumstances. But the target view is just not a realistic way to think about how evolution happens, and restricting your thinking to the idea that some particular protein couldn’t have evolved because it looks unlikely after the fact is a textbook example of the Texas sharpshooter fallacy.

(John Mercer) #22

The catalytic antibody literature provides a good estimate for finding a specific enzymatic activity from a highly preconstrained (structurally) random search. How much of that extensive literature did you read before reaching a conclusion?

Why would you only cite papers that deal with diminishing function? Why would they be more relevant to finding functions in a random search, when there’s so much literature that actually describes finding functions with a random search?

But you’re making a GLOBAL and negative claim, so citing only 3 papers that require massive handwaving (because they go in the opposite direction) to fit to your predetermined conclusion isn’t anywhere close to being enough.

That is not how science is done.

Then how do you explain the fact that in hundreds of cases, undirected searches find catalytic antibodies in a tiny fraction of possible sequence space?

But other studies don’t indicate that. The problem is that you’re ignoring literally thousands of studies that are far more directly relevant to your claim than the few you’re cherry-picking.

That is not how science is done. If you think that some of the data have been misinterpreted, you must directly address those papers instead of ignoring them.

(Brian Miller) #23

What do you feel would be the best studies which suggest that generating a protein comparable to one of the more complex flagellar ones would not be problematic?

(S. Joshua Swamidass) #24

What is the most complex flagellar protein? They all look pretty simple, and most are modified versions of one another.

(Mikkel R.) #25

This is target thinking again. The flagellar proteins and their homologues we see today are the results of billions of years of divergence, and the question seems to assume that flagellar proteins would only become functional, or succeed in transitioning to a novel function, once they had evolved to their present level of difference from their homologues.

(John Mercer) #26

Pretty much all of them. If you want to focus, start with the catalytic antibody literature, for its age and sheer volume. Why don’t you know about it?

(Arthur Hunt) #27

Actually, @bjmiller, the scenario discussed in the Liu and Ochman paper you cite in your ENV article fits this bill nicely. From Liu and Ochman:

These results show that core components of the bacterial flagellum originated through the successive duplication and modification of a few, or perhaps even a single, precursor gene.

It really looks to me as if the “regeneration process” can apply to the co-option model for the evolution of the bacterial flagellum. The works you cite, @bjmiller, in fact support my contention.

(Arthur Hunt) #28

As @Mercer is explaining, the notion that protein functionality requires long, unique polypeptides and folds is wrong. Direct and very interesting experiments show this in no uncertain terms. This means that any model in which the “probability” of protein functionality scales directly, or exponentially, with polypeptide length is not going to be correct.

@bjmiller, I strongly suggest that you take time to understand this, and to follow @Mercer’s discussion of this matter. It has a very large impact on so much of ID thought, and ID proponents need to come to grips with all of this.

(Arthur Hunt) #29

If everyone can pardon a self-reply, here is a thought experiment that tries to illustrate my point.

First, instead of proteins, consider a typical 4-base restriction enzyme recognition site. The fraction of all 4-mers that consist of any given such site is 1 in 256. However, essentially all 1000-mers will possess such a site; in other words, the relevant fraction is approximately 1, as is the probability of finding such a site in a given collection of 1000-mers.
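
The numbers in this thought experiment can be checked directly. The sketch below computes the probability that a uniformly random DNA sequence of a given length contains a chosen motif at least once, using a small pattern-matching automaton. The function name `prob_contains`, the example site GATC, and the assumption of independent, equally likely bases are illustrative choices, not part of the original post.

```python
def prob_contains(motif, length, alphabet="ACGT"):
    """Probability that a random sequence of `length` contains `motif`.

    Assumption: every position is drawn independently and uniformly
    from `alphabet`.
    """
    k = len(motif)
    if length < k:
        return 0.0

    # KMP failure table: longest proper prefix of motif[:i+1] that is also a suffix.
    fail = [0] * k
    j = 0
    for i in range(1, k):
        while j and motif[i] != motif[j]:
            j = fail[j - 1]
        if motif[i] == motif[j]:
            j += 1
        fail[i] = j

    def step(state, ch):
        # Number of motif characters matched after reading `ch` in `state`.
        while state and motif[state] != ch:
            state = fail[state - 1]
        if motif[state] == ch:
            state += 1
        return state

    p = 1.0 / len(alphabet)
    # dist[s]: probability of having matched s characters with no full hit yet.
    dist = [0.0] * k
    dist[0] = 1.0
    for _ in range(length):
        nxt = [0.0] * k
        for s, mass in enumerate(dist):
            if mass == 0.0:
                continue
            for ch in alphabet:
                t = step(s, ch)
                if t < k:  # a full match (t == k) leaves the "no hit" set
                    nxt[t] += mass * p
        dist = nxt
    return 1.0 - sum(dist)


for n in (4, 50, 200, 1000):
    print(n, prob_contains("GATC", n))
```

For a 4-mer the result is exactly 1/256, and it rises toward 1 as the sequences get longer; the simple independence approximation 1 - (255/256)^997 gives roughly 0.98 for 1000-mers.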

While protein function is a bit more complicated than this, the same principles apply, and these considerations cast ID proponents’ favorite probability calculations in a considerably different light. Thus, as one lengthens a given set of polypeptides to lengths that approximate that required for a particular enzymatic function (and these may be rather small or modest entities, on the order of 50-100 amino acids), the fraction of polypeptides that possess a specific functional motif will reflect the size and sequence of the motif, and will decrease as the motif gets larger. However, once the lengths of a population of polypeptides exceed that of a given functional motif, the fraction of functional polypeptides in such populations actually increases with polypeptide length.

This means that all of the calculations by ID proponents that scale probability inversely with polypeptide length are wrong.

(S. Joshua Swamidass) #30

Isn’t this all of their calculations?

(John Mercer) #31

And at the very least, delete the “All evidence” claim from your essay. It’s really, really arrogant, even more so because you’re so obviously unfamiliar with the most relevant and direct evidence.

(Brian Miller) #32

The key question is to what extent catalytic antibody (immunoglobulin) research is relevant to determining the rarity of structural proteins in molecular machines. In case you are not familiar with the research, here are an introductory video and a sample research paper:

As a quick overview, antibodies are complex proteins which contain constant and variable regions. The variable regions include the complementarity-determining regions (CDRs) in the antigen-binding sites which are randomized, so different antibodies can bind to different antigens. Antibodies are engineered to maintain a stable structure while allowing the binding sites to vary dramatically, so they can bind to an enormous variety of molecules (antigens).

The catalytic antibody research attempts to create antibodies with catalytic binding sites (abzymes) which can catalyze specific reactions. Normal enzymes catalyze reactions by stabilizing the transition state of a reaction, thus lowering the activation energy. The basic approach for much of the research is to create a transition state analog (TSA) for the reaction and combine it with a carrier molecule. This complex (hapten-carrier adduct) is injected into an animal which then starts generating antibodies until some have binding (active) sites which complement the TSA and thus have the targeted catalytic ability. The researchers could then continue by using molecular modeling to target changes to specific regions of the binding site (e.g. CDRs) combined with screening to enhance the efficiency.

This research is certainly valuable in determining how active sites can be modified, but it has little relevance to the question of structural protein rarity in molecular machines for several reasons:

  • The variable regions of antibodies are engineered to allow for great variability and to bind to a wide variety of molecules, so they do not make a good choice for a representative of the general properties of proteins.
  • The experiments use an enormous amount of investigator direction.
  • Most importantly, these experiments do not study the effect of randomly accumulating mutations in the constant region of the antibody. Therefore, they do not even address the issues of the rarity of functional antibodies in sequence space or the evolution of antibodies from some ancestral protein.

Other experiments do study the effects of mutations to the constant region, and they have demonstrated that even a few mutations can destabilize the structure and reduce performance:

Does anyone know of any research on the effects of accumulating mutations in proteins comparable to those in the flagellum, showing that a large percentage of their sequence (e.g. 20%) can be altered without loss of function?

Example flagellar proteins of interest could be the following:

  • FliC - Filament
  • FliD - Assembly Cap
  • FlgK - Joint

(John Mercer) #33

Brian, that is absurd.

I’m disputing your point about function in general. The only paper you cited (before claiming that “all evidence” supported your claim!) was in no way specific to structural proteins or molecular machines. If a paper that never specifies those classes is “some of the most relevant research,” how would my citing a general area of research with more than 5000 papers disqualify it from your consideration?

In case I’m not familiar? Brian, I’m the “you” pointing to this 32-year-old area of research that you ignored in writing your essay that concluded with “all evidence.”

Moreover, a “sample” paper that involves:

  1. screening of a library from an immunized animal; and
  2. evolution through multiple rounds of selection

is not at all a paper that is relevant to the question of the proportion of RANDOM sequences with function; only the first round of selection from a library from an unimmunized animal would be. Someone actually familiar with the literature would have chosen a paper lacking both of those characteristics given the context of our discussion.

So right there, you clearly haven’t looked carefully and thoughtfully to find what’s truly relevant to your claim. Or maybe you just completely missed the point?

We don’t need a quick overview with a condescending tone, particularly since you completely passed over the most relevant part of the process to your claim. You’re not convincing.

Very good! As such they represent a tiny, random fraction of sequence space. The frequency of function we find will therefore be an underestimate for the prevalence of function in all of sequence space.

Yes, so looking at the hit rate from libraries from UNimmunized animals gives us a floor for the frequency of a specific function in random sequence.

You picked an evolutionary paper with multiple rounds of selection and starting with an immunized library as a sample, so you seem to have completely missed that important point; immunization and additional selection rounds aren’t relevant to our discussion.

Now you’re contradicting yourself. They are random, as YOU stated above, so they are a subset of random sequence. They are not engineered, not even metaphorically. They are structurally constrained, so the hit rate is an underestimate.

Kindly explain the logic here.

We’re discussing how many members of a random library have a specific function. How is this using more direction than going in the opposite direction and mutagenizing a single protein? Which tells us more about random distributions of function in a forward search?

And if you object to direction, why didn’t you choose a sample paper illustrating selection from an unimmunized library?

If that’s most important, perhaps you can explain it instead of declaring it, because it makes no sense.

With catalytic antibodies, we’re directly looking at the prevalence of function in a tiny subset of random sequence space (V regions). You haven’t explained why mutating proteins already selected by evolution is a better model for looking for function in a universe of random sequences than screening a library from an unimmunized animal, which is literally looking for function in a subset of random sequences.

We predict reductions in function when we change something that occupies a highly-selected, and therefore nonrandom, functional peak that creates a functional structure. They did those experiments to understand the structure, not to look for function in random sequence space.

You, on the other hand, have made a grandiose claim (“all evidence”) about what evolution can do going in the opposite direction to find new functions in random sequence! Why would a model system that diminishes function of already selected proteins be a better model for that than literally finding new functions in random sequences?

(Arthur Hunt) #34

If I might add one thought to @Mercer’s reply - it pays to keep in mind that abzymes, and also random combinatorial phage display, are exactly the addition of new functionality to an existing protein fold. Not sort of, not similar to, but exactly. This is, I believe, a prime mechanism for co-option.

(John Mercer) #35

And if I may add a thought to your point, it illustrates that the attempt by @Agauger to claim that our myosin work wasn’t relevant to her claim on the basis of not creating a new fold made no sense. Folds don’t correspond to functions.

@bjmiller, here is a paper that I would have chosen as a far more pointed sample:

This shows how easy it is to find functional beta-lactamases (Axe’s favorite!) from an unimmunized library. In this case, we have a beta-lactamase in the context of an immunoglobulin (Ig) fold.

(John Mercer) #36

Yet primary screening of unimmunized random libraries routinely yields multiple desired target enzyme activities.

(S. Joshua Swamidass) #37

Why do you think these three portions are homologues of one another?

(Arthur Hunt) #38

The E. coli and Agrobacterium tumefaciens FlgK proteins are only 25% identical in sequence.

(Mikkel R.) #39

And the “without losing their function”-part is a red herring anyway. It is entirely plausible that the flagellum evolved through intermediate stages with different proteins having different functions.

This is one of the issues with Axe’s work in the beta-lactamases, where the enzyme was mutated until it stopped working as a beta-lactamase. From that result it was basically extrapolated how many mutations it takes to break a protein, but no attempts were made to test, or more relevantly select, for closely related functions. It is just not possible to conclude that some particular fold doing one particular function sits isolated in functional sequence space if you haven’t even bothered to test for alternative, closely related functions.

(John Mercer) #40

He didn’t really do that because he didn’t bother to do any enzymatic assays. He took a continuous variable (activity, the z axis in a landscape) and turned it into an uncalibrated binary. Therefore he has no idea when it stopped working, just that it was insufficient to work through the single antibiotic concentration at which the bacteria were plated.
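
To illustrate the point about collapsing a continuous variable into an uncalibrated binary, the sketch below scores a few variants by growth at a single cutoff. The activity values and cutoffs are invented for illustration and are not data from the experiments under discussion; `classify` is a hypothetical helper.

```python
def classify(activities, cutoff):
    """Map each variant to True (grows) or False (no growth) at one cutoff."""
    return {name: activity >= cutoff for name, activity in activities.items()}


# Hypothetical relative activities (wild type = 1.0); not measured values.
activities = {
    "wild_type": 1.00,
    "variant_a": 0.40,
    "variant_b": 0.09,
    "variant_c": 0.00,
}

at_high = classify(activities, cutoff=0.10)  # one plating concentration
at_low = classify(activities, cutoff=0.05)   # a lower concentration

print(at_high)
print(at_low)
```

At the higher cutoff, variant_b (some residual activity) and variant_c (none) are both scored as non-functional and become indistinguishable; at the lower cutoff variant_b counts as functional. The binary call depends on where the single cutoff happens to sit, which an activity assay would make explicit.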

Did Axe work on more than one beta-lactamase?

Exactly. Even worse, they are extrapolating from that what it would take to break ANY protein! It seems to me that if Axe and @Agauger truly believed that this was a valid estimate, they would have done something similar with other proteins that have different structural folds.