Hunt's 2007 Critique of Axe

colewd · August 22, 2018, 12:42am

You’re assuming your conclusion here.

Do the genetic search algorithms you worked with include a target sequence?

I am not sure how you can have a sustainable process that reproduces itself without cell division. Can you describe something here that is testable?

colewd · August 22, 2018, 12:45am

Longer proteins often require chaperones to fold. This is the opposite of your prediction.

swamidass · August 22, 2018, 12:50am

Both viruses and prions are offered as examples of self-reproducing entities, at the border of life and non-life. Memes are another example too.

colewd · August 22, 2018, 1:30am

I am never surprised by your resourcefulness. Of course, these assume the existence of stable self-replicating organisms.

swamidass · August 22, 2018, 1:31am

That is missing his point @colewd. And not all longer proteins require chaperones to fold. It is not length that is at issue here.

colewd · August 22, 2018, 2:17am

Fair enough. I will wait for him to support his claim that longer proteins are easier to fold than smaller ones.

swamidass · August 22, 2018, 2:21am

That is not what he claimed precisely. He already supported it…

colewd · August 22, 2018, 2:46am

Based on this I agree he did support his claim. Thanks for the correction.

T_aquaticus · August 22, 2018, 4:04pm

That would be a case of moving goal posts. I thought we were talking about the chances of a protein fold forming in random sequence.

colewd · August 22, 2018, 5:26pm

I understand and I apologize. Can you recalibrate by defining a successful protein fold for this discussion?

T_aquaticus · August 22, 2018, 5:32pm

I am defining it as the formation of something like an alpha helix, and you could probably throw beta sheets in there as well. In other words, the classic secondary protein structures. Ultimately, the presence of these features would need to be determined empirically through x-ray crystallography or other means, but in our case a basic algorithm like SWISS-MODEL can be useful for giving us an idea if there is some potential for protein folding.

Added in edit:

As an illustration, you could check out the structure of the beta-lactamase from Mycobacterium here.

colewd · August 22, 2018, 5:38pm

Then I agree with your hypothesis that a longer random sequence is more likely to find some of these structures.

Dan_Eastwood · August 22, 2018, 9:57pm

Apologies, but I don’t fully grok the comment system here yet, and I seem to have fouled things up. I will fix it if I can.

DE: That would seem to be a reasonable prediction for evolved proteins, yes.
BC: You’re assuming your conclusion here.

DE: I don’t understand how you mean that. I think that interdependencies among proteins are not an unexpected product of evolution. Allow me to re-phrase: “I am (was “we are”) not surprised at interdependencies among proteins.”

DE: I’m a statistician not a biologist, but I’ve tinkered with genetic search algorithms, and I do not see how it would be possible to stop such interdependencies from forming if it favors fitness.
BC: Do the genetic search algorithms you worked with include a target sequence?

DE: That is not the setting where I applied GA. This was just a side project I did for fun, setting up a complex optimization problem with a simple fitness function and letting it fly.

DE: I am admittedly making a naive claim based on my limited experience with GA. BUT given what I know, if there exists the potential for protein dependencies, I am not surprised that genetic search is able to find them. GA is a very efficient search method (“big Oh” O[n*log(n)])!

DE: Depending on where we draw the line for “life”, I wouldn’t necessarily require cell division, only a sustainable process that reproduces itself.
BC: I am not sure how you can have a sustainable process that reproduces itself without cell division. Can you describe something here that is testable?

DE: At a minimum - A self-reproducing protein, perhaps a chain-molecule that lengthens itself with occasional errors, and periodically breaks into two pieces, both pieces continuing the process.

DE: Now I’m going to hedge a bit, because that minimal molecule might not have all the qualities of life. Call it a pre-biotic molecule if you like. The essential quality is the ability to reproduce imperfectly, allowing natural selection to come into play.

DE: Can I test that? No, not personally. Dr. S has a better answer here.

Is that testable? As I understand, a molecule with such properties is an object of current research into abiogenesis.

colewd · August 22, 2018, 10:21pm

If you can write one without a sequence target over 50 English letter characters, you will be the first that I know of.

The evolutionary problem is that proteins are sequence dependent. From Szostak’s work, the binding of ATP (the power molecule) requires 10^10 trials for a 70-amino-acid protein. How could a random process build a structure that required 13 proteins to bind together and produce a coordinated function?

He is clever, but his solution requires life to exist already. The origin of life, including the origin of proteins, is a monster problem. First you assume matter exists; then you need something that can sustain itself through reproduction and obtain energy on its own. The closest thing we can observe now to OOL is a bacterium. You need DNA that is organized to produce around 500 proteins to have an energy-consuming, reproducing organism. To maintain these proteins you need to produce hundreds if not thousands of AAs per hour. After they are produced initially they can be recycled, but again this requires proteins.

In order to do this you need enzymes that can produce the AAs. You also need proteins for the process of turning DNA into proteins. The chickens and the eggs need to show up at the same time. This is why I think the simple-to-complex model is broken. It started with complexity.

T_aquaticus · August 22, 2018, 10:32pm

70 amino acids is a really small protein. I have to wonder what the chances would be for “regular” sized proteins.

By starting with two proteins that bind together and produce a coordinated function.

That’s not very close. All bacteria we have now are the product of 3.5 billion years of evolution. You might as well claim that the first human civilizations had to build BMWs in order to survive.

Dan_Eastwood · August 22, 2018, 11:05pm

There was no target sequence in my application, only a function to optimize, which it did in some interesting ways. Likewise, if we consider a (hypothetical) function for evolutionary fitness, there is no target, only some interesting ways in which it might be optimized.

We are at risk of talking past each other - I think you missed the significance of the efficiency of genetic algorithms. “Big O of N times log(N)” is about as good as any search algorithm can do (see Burjorjee, 2015). Given a fitness function with the input of “50 English letter characters”, a GA should be able to reach a near-optimal solution in ~K*200 generations, for some constant K. Given reasonable computing resources for the task, that should not be too difficult. K in a practical application would depend on the actual function and parameters.
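To make the idea of a target-free GA concrete, here is a minimal Python sketch. It is not the application described above, and everything in it is an illustrative assumption: the 28-symbol alphabet, the toy fitness function (the count of adjacent character pairs in non-decreasing order, which many different strings maximize), the parameter values, and the names fitness, mutate, crossover, and evolve.

```python
import random
import string

# Assumed alphabet: 26 lowercase letters plus space and period (28 symbols).
ALPHABET = string.ascii_lowercase + " ."
LENGTH = 50


def fitness(s):
    """Toy fitness: number of adjacent pairs in non-decreasing order.

    No target string is stored anywhere; any sorted string scores the
    maximum of len(s) - 1, so there are many distinct optima.
    """
    return sum(1 for a, b in zip(s, s[1:]) if a <= b)


def mutate(s, rng, rate=0.02):
    """Replace each character with a random symbol with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else c for c in s)


def crossover(a, b, rng):
    """Single-point crossover of two equal-length strings."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(generations=200, pop_size=100, seed=0):
    """Run a simple GA with truncation selection; return (best, history)."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice(ALPHABET) for _ in range(LENGTH))
           for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]  # top half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness), history


if __name__ == "__main__":
    best, history = evolve()
    print(repr(best), fitness(best), "of", LENGTH - 1)
```

Calling evolve() returns the fittest string found and the best score per generation. Because the top half of each generation is carried over unchanged, that score never decreases. How quickly it climbs depends on the fitness function and the parameters, which is the role K plays in the estimate above.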

I don’t know enough about the Szostak example to try to frame that for a GA. Can you give me a reference?

OOL has not been the context of this discussion to this point. Please do not move the goalposts.

References
Burjorjee, K. M. (2015, January). Hypomixability elimination in evolutionary systems. In Proceedings of the 2015 ACM Conference on Foundations of Genetic Algorithms XIII (pp. 163-175). ACM.

colewd · August 22, 2018, 11:05pm

How do we get those?

You are speculating. Show a simpler life form is possible. This is what the Venter lab is trying to do and they are claiming between 400 and 500 proteins required for a living organism.

colewd · August 22, 2018, 11:22pm

I looked briefly at your reference and I would have to spend time with it. I apologize for moving the goalposts; I mistakenly thought that was where you were going.

To simulate a protein search we need to estimate how many functional sequences exist inside the 50 letters. Let’s call function a coherent sentence that we could recognize meaning from. For argument purposes, let’s assume one million, or 10^6, sentences we could recognize. So your algorithm has to find one of these with a search-space-to-functional-space ratio of about 28^46. Dawkins created a program to do this with a single target. Given the size of the search space, this seems like a very difficult problem without a target.
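As a rough check of the arithmetic in that estimate, the short Python sketch below computes the exponent x for which 28^x equals 28^50 / 10^6. The figures 28, 50, and 10^6 come from the post; the function name ratio_exponent is made up for illustration.

```python
import math

ALPHABET_SIZE = 28      # symbols per position, as in the post
LENGTH = 50             # characters per sequence
FUNCTIONAL = 10 ** 6    # assumed number of recognizable sentences


def ratio_exponent(alphabet_size, length, functional):
    """Return x such that alphabet_size**x == alphabet_size**length / functional."""
    return length - math.log(functional) / math.log(alphabet_size)


exponent = ratio_exponent(ALPHABET_SIZE, LENGTH, FUNCTIONAL)
print(f"ratio is about {ALPHABET_SIZE}^{exponent:.2f}")
print(f"which is about 10^{exponent * math.log10(ALPHABET_SIZE):.1f}")
```

The exponent comes out just under 46, so the 28^46 figure is a fair rounding under these assumptions. This is only the size of the ratio; it says nothing about how the functional sequences are distributed in the space.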

Dan_Eastwood · August 22, 2018, 11:32pm

That reference is very technical. I gave it to show that I wasn’t just making up this “Big O” stuff.

Now we are progressing into a discussion of fitness landscapes, which is more than I can attempt tonight, or maybe this week. I would need to review the topic myself, and you might do better to read about it on your own. I suggest we take a break for a few days, and consider starting a new topic so others might join in.

The answer to the problem you pose, however, depends on the smoothness of the fitness landscape. I’ll leave it at that.

colewd · August 23, 2018, 12:48am

I have had lots of discussions on this subject, and it is part of the discussion in Hunt’s article. I am happy to discuss this when you are ready.
