Antibody Enzymes and Sequence Space


(John Mercer) #61

@Agauger could have read one of the two original papers:

(S. Joshua Swamidass) #62

They did provide the data…

I wonder if there is some confusion here about the biochemical assay they ran. They did obtain and report quantitative kinetics. They just were unable to obtain an enzyme concentration level (as is very common in enzyme kinetics papers) and were therefore able only to show relative quantitation.

(John Mercer) #63

What about the other >5000 papers, Ann?

Do you not realize that I chose this individual paper because its subject is beta-lactamase, but there are plenty of other successful trials?

Do you not realize that I chose this individual paper because they successfully found activity in a nonimmunized library, falsifying your claim that this method requires multiple rounds of selection?

Do you not realize that phage display is a modification of the original scheme?

Do you not remember writing:

Does the beta-lactamase paper not answer this question “yes, at a frequency of more than 1 in 10^8?”

Does the original Schultz paper not address the criticisms you’ve brought up regarding the beta-lactamase paper?

Why have you avoided mentioning, and apparently avoided reading, any other catalytic antibody papers?

(Ann Gauger) #64

As a rough approximation for variation in a population, the mutation rate is 10^-8 per base pair per cell division; in about 1 ml of bacteria we have a minimum of 10^8 cells, so you expect a mutation in every base pair somewhere in that number of cells, and at least one revertant of any mutation you provide.

It’s standing variation. Even if bacterial cells are maintained by streaking from colonies, those colonies have a million cells, having gone through, what? 13-14 replications since their origin as a single cell. They are no longer truly clonal. In a genome of 4 million there will be 200 mutations per cell, every generation.

By the time you have grown a colony there will be 200 mutations x 10 gen x 10^6 cells or 2 x 10^9 mutations scattered throughout those million cells. If you were to grow 1 ml of cells from that colony you would have 10^9 cells after about ten doublings. Do the math. Then grow up a liter of cells… How much neutral or near neutral variation is that?

I am writing from the airport and this has been interrupted several times. I will move on to Mercer’s critique.

(Ann Gauger) #65


@Mercer. I am not a mind reader… if you want me to know something, tell me.

When Abzymes first came up here I went and looked at some papers. One prominent review said they would never substitute for enzymes so I quit looking.

I will look at the next papers you cite just as I looked at the first one. And then I will call a halt. It will take me several days.

I do not understand your animus toward me. You think me ill informed. In some areas, no doubt. But I am no idiot. And I operate in good faith, admitting mistakes when I make them. And I am willing to learn.

There is little percentage in continuing if this tone continues.

(John Mercer) #66

I pointed you to >5000 papers. You haven’t even sampled them.

Their clinical utility is completely irrelevant to the point we’re discussing.

That’s not how scientists work. It’s not up to me to point you to the massive amount of literature that you missed, paper by paper.

You’ve made a global claim to laypeople, and you clearly didn’t bother to review the literature carefully.

I think that you have an ethical duty to be highly informed in matters on which you pontificate to the public. Do you disagree?

Straw man. I have never called you an idiot.


(Mikkel R.) #67

No, there certainly won’t be 200 mutations per cell every generation.

If you have a per-base-pair mutation rate of 10^-8, that means you need on average 100 million base pairs to get one mutation. If the typical bacterial genome size is 4 Mbp, you need 25 cells to find one mutation. So in 100 million cells, you’ll get 4 million mutations (100x10^6 / 25 = 4x10^6). So one in 25 cells, meaning that, roughly speaking, there will be 24 clones for every 1 mutant.
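The arithmetic above can be checked with a short script. This is only a sketch: the constant and function names are made up for illustration, and the inputs are the round numbers used in this post (10^-8 per base pair per division, a 4 Mbp genome, 10^8 cells).

```python
# Back-of-the-envelope check; all inputs are the round numbers quoted in the post.
MUTATION_RATE_PER_BP = 1e-8   # mutations per base pair per cell division
GENOME_SIZE_BP = 4_000_000    # assumed "typical" bacterial genome, 4 Mbp
CELLS = 100_000_000           # 10^8 cells, roughly 1 ml of culture


def mutations_per_genome(rate_per_bp: float, genome_bp: int) -> float:
    """Expected new mutations in one genome per cell division."""
    return rate_per_bp * genome_bp


def cells_per_mutation(rate_per_bp: float, genome_bp: int) -> float:
    """Average number of cells you must look at to find one new mutation."""
    return 1.0 / mutations_per_genome(rate_per_bp, genome_bp)


def expected_mutations(rate_per_bp: float, genome_bp: int, cells: int) -> float:
    """Expected new mutations across a whole population in one generation."""
    return mutations_per_genome(rate_per_bp, genome_bp) * cells


per_genome = mutations_per_genome(MUTATION_RATE_PER_BP, GENOME_SIZE_BP)
per_mutant = cells_per_mutation(MUTATION_RATE_PER_BP, GENOME_SIZE_BP)
in_population = expected_mutations(MUTATION_RATE_PER_BP, GENOME_SIZE_BP, CELLS)

print(f"mutations per genome per division: {per_genome:.2f}")
print(f"cells per new mutation: {per_mutant:.0f}")
print(f"new mutations in {CELLS:.0e} cells: {in_population:.2e}")
```

With these inputs the expected number of new mutations per genome is far below 1, which is why the figure of 200 per cell per generation does not follow from the stated rate.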

We can assume we don’t get any parallel mutations. That’s probably not really accurate, but it will do for this kind of back-of-the-envelope calculation.
Further, a minority but still substantial fraction of the mutations that happen in ORFs will be synonymous, if not silent, because of codon degeneracy. You will probably only have sampled the immediate neighborhood of any particular protein, and for any given residue chances are that at most two other amino acids are sampled, out of the 19 other possible ones.

You will certainly have a lot of mutations in a trillion cells, but that variation will still only be sampling into the immediate neighborhood (meaning one amino acid difference) of any protein sequence, as is crucial to the point I tried to get across in my previous post. Which is that functions are found in clusters, and in so far as other functional clusters are out there you don’t have any guarantee that any particular protein in the genome you are using to sample sequence space, is that closely overlapping the function of interest at this moment in time.
This is why geological time really makes a difference, because if that other function is even as little as 4 or 5 mutations in the same protein away, you’re going to need a lot of generations to get that many substitutions into a particular protein just from letting the bacteria grow and divide. A number of generations we don’t have time to wait for in the lab because we don’t have a million years.

So researchers properly employ things like error-prone PCR on plasmids and phage display to get that kind of deep time-sampling power into the laboratory.

Don’t worry about it, you can write when you have time and interest. We all have other commitments. Have a nice weekend.

(Guy Coe) #68

Thanks for the good exchange, both of you, and let’s continue to strive for mutual understanding, rather than the old tactic of tag and blame. That gets old really fast, no?

(Bill Cole) #69

Correct thanks. Build a proper 3d fold along with a catalytic site that acts on the substrate.

(Ann Gauger) #70

This remark baffles me. Clinical utility? Their utility as enzymes has everything to do with what we are discussing. Do they make, or are they capable of making, good beta-lactamase enzymes? The answer appears to be no.

(S. Joshua Swamidass) #71

I don’t follow. It seems like the answer is “yes”. What did I miss?

(Ann Gauger) #72

“good beta lactamase enzymes” = “independent measurable enzyme kinetics” with soluble stable enzyme. You keep saying they had enzyme kinetics. I’d like to know where. There’s a number for the slope. Where’s the graph and error bars? Not even indirectly with number of phage particles from the same batch? That means no reproducibility.

It just seems totally unrealistic? incongruent? suspect? that it should be so easy to assemble a beta-lactamase active site and have had it take 20 years or more for resistance to become widespread in the population. Every 4th colony according to @Rumraket should already have the beginnings of a beta-lactamase enzyme.

(Ann Gauger) #73

I have not avoided anything.

I would have gotten to it sooner if I had known what you were trying to get me to see, because I have a long term interest in this topic. But I do have lots of other responsibilities, so hints and guessing games don’t register strongly on the to do list.

As I said earlier, I actually did look at some papers out of the 5000 in PubMed, and even the authors sounded discouraged about how little the abzymes could do. So it didn’t seem worth following up further.

But now I will make it a project and will write something up after I am done. I can’t guarantee you and I will agree, but… probably not.

(Mikkel R.) #74

I believe this follows from nothing I’ve said, and I have actually given reasons not to expect that.

It also doesn’t make sense given your ~10^8 cells/ml from earlier. Perhaps a more straightforward way of putting it is to apply the ~1 in 10^9 different proteins directly to all sampled variants of a particular protein.
If every nucleotide in a protein coding gene is mutationally sampled in that 1 ml of cells, such that you have one mutant per nucleotide, then for a protein 150 amino acids long you will have 450 mutant versions of that protein (ignoring codon degeneracy). That’s not even enough to sample all possible variants of the protein that only differ by 1 amino acid. 150x20=3000.

Suffice it to say we are nowhere near a billion different protein variants. It’s ~999 million fewer than the number of different proteins that the above-mentioned paper indicates need to be sampled, on average, to find the function of interest.

That means I need another two million ml of cells to sample a billion variants of that protein. And all this of course still ignores the problem of only having sampled the direct neighborhood of proteins no more than 1 or 2 amino acids different from the extant one. But to really sample a billion variants of it, we’re going to need to dip into triple-mutant versions of the protein. We need to approach geological time for that, which we don’t have, so instead use some way of speeding up protein evolution.
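The sampling numbers in this post can be reproduced with a few lines of Python. This is a rough sketch under the post's own simplifications (one mutant per nucleotide per ml, codon degeneracy ignored); the variable names are illustrative only.

```python
# Rough sketch of the sampling arithmetic; inputs are the post's round numbers.
PROTEIN_LENGTH_AA = 150           # example protein length from the post
NUCLEOTIDES_PER_CODON = 3
AMINO_ACIDS = 20
TARGET_VARIANTS = 1_000_000_000   # ~10^9 variants sampled per functional hit

# Coding nucleotides in the gene: 150 codons x 3 nucleotides.
coding_nucleotides = PROTEIN_LENGTH_AA * NUCLEOTIDES_PER_CODON

# Assumption taken from the post: one mutant per nucleotide in 1 ml of cells.
mutants_per_ml = coding_nucleotides

# Single-amino-acid neighbors. The post rounds to 150 x 20; excluding the
# original residue at each position gives 150 x 19.
single_aa_variants_rounded = PROTEIN_LENGTH_AA * AMINO_ACIDS
single_aa_variants_strict = PROTEIN_LENGTH_AA * (AMINO_ACIDS - 1)

# How far short of 10^9 variants one ml falls, and how many ml would be needed.
shortfall = TARGET_VARIANTS - mutants_per_ml
ml_needed = TARGET_VARIANTS / mutants_per_ml

print(f"mutants per ml: {mutants_per_ml}")
print(f"single-aa variants (rounded / strict): "
      f"{single_aa_variants_rounded} / {single_aa_variants_strict}")
print(f"shortfall versus 10^9: {shortfall:,}")
print(f"ml of culture needed: {ml_needed:,.0f}")
```

Either count of single-amino-acid neighbors is several times larger than the 450 mutants sampled per ml, and reaching 10^9 variants takes on the order of two million ml.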

Further still, a general statement about the average density of a particular function in protein sequence space does not entail that the functions will always overlap other functions that closely. Again, the functions will tend to cluster into “islands”. To pick some extreme examples, I wouldn’t expect much direct overlap between fat and water-soluble proteins in general. Sampling a few thousand single or double-mutants of a fat-soluble protein, looking for some function only possible for water soluble globular proteins is probably going to be mostly fruitless. It is generally going to take more substitutions for such a conversion. The number is an average.

The experiment you propose with just plating bacteria simply doesn’t constitute a valid test of the claim. The only way to do it since we don’t have a million years is to employ sped-up methods of generating protein variants.

(Ann Gauger) #75

Thank you. You have made Doug’s and my case admirably, showing clearly why proteins are rare and isolated in sequence and functional space. I did not mean to trick you. I was trying to show why it was foolish to expect to find an enzyme with beta lactamase activity so easily. You have outdone me.

(S. Joshua Swamidass) #76

@Agauger you seem to have misread @Rumraket.

(Ann Gauger) #77

@swamidass I will respond more fully when I am not on my phone. But no. Rumraket makes the point admirably. How far apart proteins are, unless you stumble on a cluster, for example, or that the only way we can hope to find new proteins is to vastly speed up the process. My reason for suggesting the 10^10 bacteria plating was to illustrate how ridiculous it was to say that an _enzyme_ would be found that easily. Rumraket argued ably why that wouldn’t work. My point. There is something about the phage system that is not natural. Maybe I’ll find out when I read Mercer’s papers.

(Mikkel R.) #78

But I haven’t made any arguments to the effect that functional proteins are isolated.

And a function at one in a billion can certainly be called rare, but that is not a point in your favor as that is still 68 orders of magnitude more frequent than Axe’s estimate. You seem to be deliberately vague here just so you can try to score some sort of rhetorical point.
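The size of that gap is a simple difference of exponents. A minimal sketch, with names chosen only for illustration and the two frequencies taken from the figures quoted in this thread (1 in 10^9 and 1 in 10^77):

```python
# Frequencies are expressed as powers of ten to avoid floating-point underflow.
OBSERVED_EXPONENT = -9    # ~1 in 10^9, the frequency discussed in this thread
AXE_EXPONENT = -77        # 1 in 10^77, Axe's estimate as quoted in this thread


def orders_of_magnitude_apart(higher_exp: int, lower_exp: int) -> int:
    """Number of factors of ten separating 10**higher_exp from 10**lower_exp."""
    return higher_exp - lower_exp


gap = orders_of_magnitude_apart(OBSERVED_EXPONENT, AXE_EXPONENT)
print(f"10^{OBSERVED_EXPONENT} is {gap} orders of magnitude above 10^{AXE_EXPONENT}")
```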

I did not mean to trick you.

I don’t see your unintended “trick” as applying to me, since I’m not falling for it. Rather I feel like you are exposing the poverty of your arguments by this misdirection, instead of engaging specifically with the points I raise.

First you make some basic miscalculations on the expected number of mutations per cell, or the sampled number of proteins in your proposed experiment, and now it looks like you’re trying to misdirect from the fact that?

In what way is the demonstration that you’d only be sampling a few thousand proteins in your proposed experiment somehow going to constitute substantiation that functions are found at a density of 1 in 10^77, or a falsification of the claim that they’re found at a rate of 1 in 10^9 on average?

I was trying to show why it was foolish to expect to find an enzyme with beta lactamase activity so easily.

What is “so easily”? Only by being deliberately vague can you maintain this charade that my arguments here constitute support for Axe’s extrapolation that functions only exist at a rate of 1 in 10^77 protein sequences.

You have outdone me.

I believe I must concur. Just not in the way you are attempting to make it appear.

(Mikkel R.) #79

But who claims it would be found “that easily”?

The claim that the function exists at a rate of approximately 1 in every billion proteins, as demonstrated, isn’t actually tested by the experiment you propose.

Then how can my argument against your faux thought-experiment constitute a point you’ve been making? Who do you think is fooled by this?

(Ann Gauger) #80

I wasn’t aiming at 10^-77. I was aiming at 10^-10.