Strong evidence that a search algorithm can find high functionality in an astronomically large search space

Chris_Falter · September 14, 2021, 7:52pm

A common argument by Intelligent Design theoreticians (Dembski, Meyer, Behe) and forum participants (you know who you are!) is that the astronomically large DNA search space makes the probability of finding functionality by evolutionary mechanisms so vanishingly small as to be effectively impossible.

A very, very strong rebuttal is that evolution has an array of algorithms that allow it to search the space extremely efficiently. Dead ends can be cut off quickly and optima can be approached because existing functionality gradients can be traversed by purifying and positive selection, respectively.

A highly relevant example is the GPT-3 language model, which has 175B 16-bit parameters. The parameter search space is therefore 175,000,000,000¹⁶, which is approximately 10¹⁸⁰ in decimal notation. Importantly, this far exceeds Dembski’s universal probability bound of 1 in 10¹⁵⁰, by 30 orders of magnitude.
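The arithmetic can be checked with a short sketch. The first figure is the one used above. The second is an alternative count added here as an assumption: it treats every parameter as free to take any of its 2¹⁶ values independently, which gives an even larger space and so only strengthens the comparison with the bound.

```python
import math

PARAMS = 175_000_000_000  # parameter count quoted in the post
BITS = 16                 # bits per parameter quoted in the post
UPB_EXPONENT = 150        # universal probability bound: 1 in 10^150

# Figure used in the post: 175,000,000,000 raised to the 16th power.
quoted_exponent = BITS * math.log10(PARAMS)

# Alternative count (an assumption, not from the post): each parameter
# independently takes any of 2^16 values, giving 2^(16 * 175e9) settings.
joint_exponent = BITS * PARAMS * math.log10(2)

print(f"175e9^16       ~ 10^{quoted_exponent:.1f}")
print(f"2^(16 * 175e9) ~ 10^({joint_exponent:.3e})")
print(f"margin over the bound: {quoted_exponent - UPB_EXPONENT:.1f} orders of magnitude")
```

The first exponent comes out just under 180, and the second is on the order of 8.4 × 10¹¹, so either way of counting lands far beyond 10¹⁵⁰.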

So the standard ID logic would seem to predict that a search algorithm starting from randomized initial conditions in the GPT-3 search space, in which useful, functional peaks are extremely rare (and they are; I have examined the output of language models in their initialized state!), would be unable to do anything useful in geological time, much less in human lifetimes.

The success of the OpenAI GPT-3 project proves this ID logic to be unviable. The reason is that the search algorithm, as previously mentioned, can perform hill-climbing in order to find a highly effective functional optimum. It may or may not reach the global functional optimum in the search space, but the local optimum it finds is good enough to make it an extremely effective NLP model.
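To make the contrast concrete, here is a minimal toy sketch of hill-climbing versus blind sampling. Everything in it is an assumption chosen for illustration: the 200-bit "genome", the fitness function (count of 1-bits), and the step budget. The landscape is smooth by construction, and GPT-3 itself was trained with gradient-based optimization rather than this bit-flip loop, so this shows only the general principle.

```python
import random

N_BITS = 200  # toy genome length (assumed); the space holds 2**200 strings


def fitness(genome: int) -> int:
    """Toy fitness: number of 1-bits. Smooth by construction (an assumption)."""
    return bin(genome).count("1")


def hill_climb(rng: random.Random, max_steps: int = 20_000):
    """Flip one random bit per step; keep the mutant only if fitness improves."""
    genome = rng.getrandbits(N_BITS)
    for step in range(1, max_steps + 1):
        mutant = genome ^ (1 << rng.randrange(N_BITS))
        if fitness(mutant) > fitness(genome):
            genome = mutant
        if fitness(genome) == N_BITS:
            return step, fitness(genome)
    return max_steps, fitness(genome)


def blind_sample(rng: random.Random, tries: int = 20_000) -> int:
    """Draw independent random genomes and report the best fitness seen."""
    return max(fitness(rng.getrandbits(N_BITS)) for _ in range(tries))


rng = random.Random(42)
steps_used, climbed = hill_climb(rng)
sampled = blind_sample(rng)
print(f"hill-climbing: fitness {climbed}/{N_BITS} after {steps_used} steps")
print(f"blind sampling: best fitness {sampled}/{N_BITS} after 20000 draws")
```

With the same budget of evaluations, the climber reaches the peak while blind sampling stays close to the average of the space. Whether real landscapes offer comparable gradients is the empirical question.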

Similarly, natural selection provides an optimization function that facilitates hill-climbing in biology’s search for optima. This optimization, combined with efficient search strategies (copy-and-modify, reverse transcription, open reading frames, transpositions, etc.), provides a robust explanation of how and why evolution works, and how it is that biologists have been able to accumulate such strong evidence for evolution.

EDIT: Removed an erroneous comparison of search space sizes. HT @Rumraket

Dan_Eastwood · September 14, 2021, 8:00pm

I’ll add to this. Genetic algorithms are simple and highly efficient, capable of finding solutions in O(N log N) time. (https://web.mit.edu/16.070/www/lecture/big_o.pdf)
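As a hedged illustration of that kind of scaling, the sketch below times the simplest relative of a genetic algorithm (one parent, one single-bit mutation per step, keep the mutant only if it is fitter) on the OneMax toy problem, where fitness is the number of 1-bits. On this particular problem such searches are known to need on the order of N log N steps; that does not carry over to arbitrary problems. The problem sizes, run count, and function name are assumptions for the example.

```python
import math
import random


def steps_to_optimum(n_bits: int, rng: random.Random) -> int:
    """Mutation-plus-selection on OneMax; returns steps until all bits are 1."""
    genome = rng.getrandbits(n_bits)
    ones = bin(genome).count("1")
    steps = 0
    while ones < n_bits:
        steps += 1
        mask = 1 << rng.randrange(n_bits)
        if not genome & mask:
            # Flipping a 0 to a 1 raises fitness, so the mutant is kept.
            genome |= mask
            ones += 1
        # Flipping a 1 to a 0 lowers fitness, so that mutant is discarded.
    return steps


rng = random.Random(1)
RUNS = 50  # runs averaged per problem size (assumed)
ratios = {}
for n in (50, 100, 200, 400):
    mean_steps = sum(steps_to_optimum(n, rng) for _ in range(RUNS)) / RUNS
    ratios[n] = mean_steps / (n * math.log(n))
    print(f"N={n:4d}  mean steps={mean_steps:8.1f}  steps/(N ln N)={ratios[n]:.2f}")
```

If the ratio in the last column stays roughly constant as N grows, the step count is tracking N log N even though the search space doubles with every added bit.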

Rumraket · September 14, 2021, 9:38pm

The fundamental problem with the ID argument is that it flatly assumes that functions are so rare and so isolated from each other that you essentially need foreknowledge of where to look to find anything useful, regardless of what “strategy” you use to search. So as long as that strategy is essentially blind, random sampling of the space, it has no chance of finding anything of functional significance.

There’s just no reason to think that the space is the way they assume. It’s just an assumption they make.

The vastness of the space is not evidence of how rare useful things are within it. It just isn’t. Straightforward non-sequitur. The ID argument falls to the simplest possible logical fallacy: It doesn’t follow.

colewd · September 15, 2021, 1:05am

The evolutionary assumption has to be small steps to function. While this story may be believable for some cases where climbing a hill step by step results in new enzyme substrates, when you look at more advanced systems required for living function it is not credible IMO.

If you start from an existing enzyme and model the formation of a new enzyme in a population of bacteria, what would be the predicted waiting time if the change was nearby in sequence space? Just asking as a thought stimulator.

CrisprCAS9 · September 15, 2021, 4:01pm

And now, ladies and gentlemen, welcome once again @colewd, dancing the Irreducible Complexity Two Step:

Name a problem without examples
Name examples without a problem

Rumraket · September 15, 2021, 4:01pm

That’s an empirical fact, not an assumption. We had a whole thread about this here:

Functions are not so rare at all, and definitely not isolated, in sequence space of biopolymers

Found this paper on arXiv: Abstract: At odds with a traditional view of molecular evolution that seeks a descent-with-modification relationship between functional sequences, new functions can emerge de novo with relative ease. At early times of molecular evolution, random polymers could have sufficed for the appearance of incipient chemical activity, while the cellular environment harbors a myriad of proto-functional molecules. The emergence of function is facilitated by several mechanisms intrinsic to molecular organization, such as redundant mapping of sequences into structures, phenotypic plasticity, modularity, or cooperative associations between genomic sequences. It is the availability of niches in the molecular ecology that filters new potentially functional proposals. New phenotypes and subsequent levels of molecular complexity could be attained through combinatorial explorations of currently available molecular variants. Natural selection does the rest. It’s hard to pi…

So you just assume something despite having no evidence to base it on. You just have your opinion that it’s “not credible”.

That experiment has already been done. Try this classic experiment:

Also, do you remember we had a thread about this paper?

Duplicate Gene, New Tricks

The fact that functionally new genes appear is clear, but the evolutionary process that allows for a new gain of function is not well understood. Näsvall et al. (p. 384) present the innovation-amplification-divergence model which suggests that after gene duplication, ancestral function is maintained but that the duplicate copies can gain new function that is selected for through the accumulation of mutations or changes in expression. Experimental selection on Salmonella enterica allowed an ancestral gene to evolve new enzymatic function in fewer than 3000 generations.

Apparently the “waiting time” to evolve a new enzyme activity in an already existing enzyme that was incapable of it was, in both cases, a few months.

Even so, most enzymes can normally catalyze multiple different reactions already:

https://www.sciencedirect.com/science/article/pii/S0959440X17300982

It is now well accepted that most — and probably all — extant enzymes are, in fact, promiscuous [5, 6 ].

Recent large-scale studies, both computational and experimental, have opened our eyes to the enormous functional diversity among existing enzyme superfamilies, the vastness of ‘promiscuity space,’ and therefore the seemingly limitless potential for future evolutionary innovation. Baier et al. surveyed the functional diversity, as represented by Enzyme Commission (EC) numbers, in five common superfamilies [7•]. Each superfamily contained enzymes from all six of the EC classes (Figure 1a). Furnham et al. went further and used a phylogenetic approach [8] to reconstruct the evolutionary histories of 379 superfamilies from the Class, Architecture, Topology, Homology (CATH) database, and to ask how often a change in EC number was observed over the course of their evolution [9•]. While 81% of the functional changes were within an EC class, every possible change between EC classes was also observed (Figure 1b), with the exception of a change from a ligase (EC class 6) to an isomerase (EC class 5). These bioinformatics studies emphasize that there is little, if anything, that constrains particular catalytic chemistries to particular folds.

Four high-throughput experimental studies (reviewed in detail elsewhere [7•]) have reached a similar conclusion. Dozens of enzymes from within the cytosolic glutathione transferase [10], β-keto acid cleavage enzyme [11], metallo-β-lactamase [12], and haloalkanoate dehalogenase [13••] superfamilies were each tested for activity towards a range of different substrates. In each case, many enzymes were found to have multiple functions in vitro. In the most comprehensive study, 217 members of the haloalkanoate dehalogenase superfamily were expressed, purified, and screened for phosphatase or phosphonatase activity towards 167 substrates (most of which were naturally occurring metabolites). The authors discovered breathtakingly broad substrate specificities. A median of 15.5 substrates were recognized by each enzyme, 50 of the enzymes could utilize 40 or more substrates, and remarkably, one enzyme could utilize 143 [13••].

I really recommend you read that thread again. You are operating under a misapprehension. What seems credible to you is not based on facts.

Roy · September 15, 2021, 4:01pm

You have no evidence. Your opinion has no weight. So the ID argument still fails.

Mercer · September 15, 2021, 4:01pm

Since there’s no evidence that you have looked at any “more advanced systems required for living function,” your objection is not credible IMO.

Why do you never show any math, Bill?

colewd · September 15, 2021, 7:27pm

The hypothesis that a process that starts with random change can explain biodiversity is science’s to model and test.

Since you are dealing with combinatorial and exponential mathematics, along with mutational fixation waiting times, I am not optimistic you can defeat the design argument.

The ID claim that a mind is behind this mathematical complexity and the hypothesis that living animals are the starting point for science/evolutionary theory take away the mathematical roadblock.

Have you thought about the observation that living organisms at the cellular level work because of the principles of combinatorial and exponential mathematics? Is it possible this is just an accident?

Chris_Falter · September 15, 2021, 7:27pm

An interesting question, although a theoretical answer would require parameterization and intense computation.

But as Mikkel points out, we do know that under a particular set of parameters, nature has already computed an answer for us.

So there’s your answer, Bill.

Do you think the 2 papers cited by Mikkel were the product of such rare parameters/conditions that they never happened before and could never be replicated, ever again?

Or would you agree that the parameters/conditions described in the papers are not atypical for bacteria such as Pseudomonas and Salmonella?

Over to you, Bill.

Best,
Chris

Puck_Mendelssohn · September 15, 2021, 7:27pm

Credible IMO can be found here: Imo Sour Cream Substitute (16 oz) Delivery or Pickup Near Me - Instacart

I never liked the stuff, honestly.

Michael_Okoko · September 15, 2021, 8:33pm

For promoters, that was two weeks

CrisprCAS9 · September 15, 2021, 8:33pm

Yeah, and the models work. So… your turn.

Nope, we’re dealing with fractions. If 10% of sequence space is functional, that’s the fraction that is functional no matter how large the total space might be!
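A small simulation sketch of that point, using the same arbitrary 10% figure. The labeling of which sequences count as "functional" (the lowest tenth of the index range) and the function name are assumptions made purely for illustration; only the fraction matters.

```python
import random


def mean_draws_until_hit(space_size: int, one_in: int, trials: int,
                         rng: random.Random) -> float:
    """Average number of uniform random draws needed to land on a
    'functional' member, where 1 in `one_in` members is functional."""
    cutoff = space_size // one_in  # indices below the cutoff count as functional
    total = 0
    for _ in range(trials):
        draws = 1
        while rng.randrange(space_size) >= cutoff:
            draws += 1
        total += draws
    return total / trials


rng = random.Random(0)
results = {}
for exponent in (3, 30, 150):
    results[exponent] = mean_draws_until_hit(10**exponent, 10, 2000, rng)
    print(f"space size 10^{exponent:<3d}  mean draws to first hit: {results[exponent]:.2f}")
```

The average stays near ten draws whether the space holds a thousand members or 10¹⁵⁰, because the expected wait depends on the functional fraction and not on the size of the space.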

Waiting time is a nonsense non-problem resulting from either gross incompetence or deliberate dishonesty on the part of Behe and Sanford. Given what I know of Sanford, I’m guessing both.

The design argument defeats itself.

Is ludicrous, since the only minds for which we have any evidence are made of meat and products of biology. You know, the very thing you are trying to explain.

I try to avoid thinking about clearly false observations any more than what is required to recognize them as such.

A Pasteurized and Cultured Blend of Water

swamidass · September 16, 2021, 2:32am

The next iteration of GPT will have several trillion parameters:

Mercer · September 16, 2021, 2:52am

That’s no reason for you to never show any math, is it?

And yes, science has tested it. Why do you ignore the data produced by working scientists in favor of the rhetoric produced by those who have stopped doing the work, Bill?

We are dealing with fractions, Bill. Your excuse doesn’t work.

What did we show with Myo1c?

Click on this link, please, Bill:

Why does it say,

Quoted phrase not found: “mutational fixation waiting times”

Science is about testing hypotheses and producing data. Those data don’t support the design argument, and those who espouse it are unwilling to test their hypotheses. I don’t need to defeat anything.

Would you show the math supporting that alleged observation?

That would be math, not rhetoric.

No, as selection is not accidental.

Why do you never show any math?

swamidass · September 16, 2021, 2:52am

Also see:

colewd · September 16, 2021, 5:28pm

What does functional mean in this context? That 10% of all nucleotide sequences can build a vehicle for bacterial mobility?

Have you considered this hypothesis could be wrong?

CrisprCAS9 · September 16, 2021, 10:37pm

It certainly doesn’t mean ‘specific’, since that’s irrelevant to evolution.

10% was an arbitrary number used for a hypothetical.

You can build something impacting mobility with anything that has a conformation change. If ~20% of proteins that bind something have open-to-closed conformational change, then 20% of the fraction of proteins that bind something can be used for mobility. Ideally, binding something like ATP. Since random screens for ATP binding have been done, we can say the minimum probability of something that can be used for mobility is 10⁻¹², a frequent occurrence for bacteria.
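A rough sketch of why a per-sequence probability that small can still be frequent once the number of trials is large. Here a "trial" means one newly sampled sequence; the trial counts below are hypothetical round numbers, not measurements of any bacterial population.

```python
import math


def p_at_least_one(p_single: float, n_trials: float) -> float:
    """Probability of at least one success in n independent trials,
    computed as 1 - (1 - p)^n in a numerically stable way."""
    return -math.expm1(n_trials * math.log1p(-p_single))


P_SINGLE = 1e-12  # minimum per-sequence probability quoted above

# Hypothetical trial counts, chosen only to show the trend.
for n_trials in (1e9, 1e12, 1e15):
    chance = p_at_least_one(P_SINGLE, n_trials)
    print(f"trials = {n_trials:.0e}  P(at least one hit) = {chance:.4f}")
```

Once the number of trials is comparable to 1/p, at least one hit becomes more likely than not, and beyond that it is close to certain.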

Fun fact, by the way: The enzyme in question has a substantial conformational change associated with binding.

Chris_Falter · September 16, 2021, 10:37pm

I think he was giving an illustration, Bill.

He would probably estimate the universal functional fraction as something like 1 in 10⁸, a figure I have seen cited many times in the forum. However, I invite @CrisprCAS9 to speak for himself.

Estimating the fraction that could be applied to a particular function like mobility would be much harder, insofar as different kinds of mobility in different kinds of organisms have different kinds of constraints. In addition, the estimate would depend on the definition of the problem. For example, proteins used in metabolism are essential for mobility (how else to move a flagellum or limb?). Should genes with metabolic function, or regulation of metabolic genes, be counted?

Best,
Chris

Dan_Eastwood · September 22, 2021, 9:38pm

38 posts were split to a new topic: Strong evidence that topics can go off the rails
