Comments on Gpuccio: Functional Information Methodology

How small is it and how do you know?

and gpuccio preservation data is showing that window is exceedingly unlikely.

How does it show that?

When the dust settles you will realize the only known generator of 2335 functional linear bits is a mind.

I know this is what you believe. What’s missing is a demonstration that this is true. So far you’ve made four unsupported assertions, one piled on top of another.

When asked a question, you answer with yet another unsubstantiated claim that merely assumes the truth of what you’re being asked to demonstrate. You really need to understand the difference between arguing and asserting. You are asserting, you aren’t arguing.

5 Likes

One more unsupported and frankly silly assertion from the ID-Creationist camp. Sigh.

3 Likes

@gpuccio has precisely shown that such complex proteins with very high FI (>500 bits) exist. Now the question is whether their evolution by the traditional RV+NS mechanism is plausible. The answer is no. Why? Because although RV+NS can in principle produce very high FI (>500 bits), such a performance by RV+NS is only possible if a smooth fitness landscape exists that connects the final target (the complex protein exhibiting high FI) to the starting point. But, according to our best knowledge, such a favorable situation doesn’t exist for complex proteins. I invite you to read @gpuccio’s OP that I linked to at 272. It is quite a long read, but nonetheless a very rewarding one.

I know this. I know that even evolutionists are not ready to believe such nonsense. Rather, they believe that the RV+NS mechanism can do the job. But, as I explain at 272 and 328, this belief is not warranted.

Wrong.

1 Like

Okay. What is your evidence that complex proteins exhibiting high FI, such as the ATP synthase, can be produced by RV+NS?

He hasn’t actually done that. He has attempted to do that, but the problem is he’s taking sequence variations from the known diversity of life to be a proxy for the total “target space” above the minimum threshold.

There’s a big issue with that approach, namely that what we’re seeing is the product of history, a combination of drift and mostly negative selection, not the total possible diversity of functional sequences that could support life.

To pick just a simple example you can go into some database right now and find some human protein coding gene as Gpuccio did. It will have some canonical sequence listed as the main “normal” isoform that isn’t associated with any disease state.

Okay, then you can do the same for chimpanzee, and gorilla, and so on and so forth. And you can go through the database and find lots and lots of different sequences and collect a diverse set of similar sequences for this protein from several different lineages in some larger clade (like vertebrates, or all animals).

Gpuccio does this, and then defines this as his target space of functional sequences above the “minimum threshold”. Any other imaginable sequence variant is taken to be outside of the target space, as in below the minimum threshold for function (in the parlance of Hazen & Szostak).

Why is this a problem? Because many other, “non-canonical” sequences can actually support life, they just have lower fitness. That’s why we don’t see them at high frequencies in any population of course, they’ve been selected against as they’re associated with some disease state or lower performance. That also means they’re not likely to turn up in the kinds of simplistic searches Gpuccio is doing as he’s combing through some protein homologue database when he decides to sample 10-20 different species.

Going back to the human protein, how many actual mutant variants of that protein exist out there in the human population, some of which cause disease, others of which are neutral variants at low frequencies? They’re not included in Gpuccio’s calculation, and some of them probably haven’t even been sequenced.

Now realize the same applies to any species out there with a homologue of this protein. There will be chimpanzees with mutant versions of this protein who haven’t had their genomes sequenced, or whose versions didn’t make it into some database as the canonical or main isoform that Gpuccio used in his collection.

Same for gorilla, and orangutan, and so on through the entire diversity of life that carries a homologue of this protein.

As should be pretty obvious, Gpuccio is nowhere NEAR doing the kind of work he would have to be doing to even begin to make a case for knowing the actual FI of ANY protein. He would have to have good reasons for thinking he has very substantially sampled the total diversity of homologues of the protein that would meet the minimum threshold for function (instead of just variants with lower fitness): the ability to support life for the organism in question. But he has no reason for thinking he is anywhere near doing this.
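A minimal sketch of why the assumed size of the target space matters so much here. It uses the Hazen & Szostak definition, FI = -log2(fraction of sequences meeting the functional threshold). The protein length and the target-space sizes are hypothetical illustration values, not estimates for any real protein, and `fi_bits` is a name made up for this example.

```python
import math

def fi_bits(log10_functional: float, seq_len: int, alphabet: int = 20) -> float:
    """Functional information in bits: -log2(functional / total sequences).

    Works in log space so astronomically large counts do not overflow.
    log10_functional is log10 of the ASSUMED number of functional sequences.
    """
    log2_total = seq_len * math.log2(alphabet)          # log2 of all possible sequences
    log2_functional = log10_functional * math.log2(10)  # convert log10 to log2
    return log2_total - log2_functional

SEQ_LEN = 150  # hypothetical protein length

for log10_target in (10, 50, 100):  # hypothetical target-space sizes
    bits = fi_bits(log10_target, SEQ_LEN)
    print(f"target space ~1e{log10_target}: FI = {bits:.0f} bits")
```

Under these assumptions, each factor of ten by which the functional set is undercounted inflates the FI estimate by log2(10), about 3.3 bits, so the result is only as good as the sampling behind the target-space estimate.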

But there’s another problem here:

Now the question is whether their evolution by the traditional RV+NS mechanism is plausible. The answer is no. Why? Because although RV+NS can in principle produce very high FI (>500 bits), such a performance by RV+NS is only possible if a smooth fitness landscape exists that connects the final target (the complex protein exhibiting high FI) to the starting point.

It turns out most of the sequence differences between species are actually not due to positive natural selection; they’re neutral or very nearly so. Many sequence variants have been discarded by negative selection, but the differences weren’t fixed one after another as a protein was pushed up some smooth hill on the fitness landscape. When we see the differences between two proteins in two very distantly related species, we don’t have to think that natural selection drove the fixation of all these differences over long timescales. We don’t have to think that these proteins kept getting better as they were pushed uphill in the landscape through hundreds of mutations.

Rather, these proteins have for the most part been evolving under historical contingency and epistasis. Neutral mutations open the way for other neutral mutations, but once that second neutral mutation happens, the first one can’t change back because its ancestral state now has lower fitness in the context of the second neutral mutation. In this way, proteins evolve under a sort of neutral ratchet where they slowly become more and more dissimilar from their ancestors, and their cousins. Not because selection is somehow fine-tuning them to perform new functions (relatively few of all the mutations that occur when proteins evolve are fixed because they have any novel beneficial effects).

This also means we don’t have to posit that the proteins we see somehow evolved to their present state along a huge smooth slope to the top of some hill in the fitness landscape, because the shape of the hill isn’t actually static: it depends on the already existing sequence elsewhere in the protein. Neutral mutations can open up pathways which, in a different sequence context in the same protein, would have been deleterious (aka down some valley).

Ironically, one of your IDcreationist allies who posts around here, Brian Miller, has supplied some nice references that demonstrate how this works. Work has also been done on ancestor reconstruction(*), where biologists have been able to show how the proteins we see in life have evolved through these historically contingent neutral ratchets. While a few mutations of a novel beneficial character may explain the initial fixation of a novel allele, with a small number of beneficial mutations fixed by initial optimization, what subsequently happens over hundreds of millions of years is that these proteins just sort of drift apart as they slowly accumulate mutations that are initially neutral, but whose reversal becomes deleterious due to negative epistasis.

  • See for example: Starr TN, Flynn JM, Mishra P, Bolon DNA, Thornton JW. Pervasive contingency and entrenchment in a billion years of Hsp90 evolution. Proc Natl Acad Sci U S A. 2018 Apr 24;115(17):4453-4458. DOI: 10.1073/pnas.1718133115

Abstract

Interactions among mutations within a protein have the potential to make molecular evolution contingent and irreversible, but the extent to which epistasis actually shaped historical evolutionary trajectories is unclear. To address this question, we experimentally measured how the fitness effects of historical sequence substitutions changed during the billion-year evolutionary history of the heat shock protein 90 (Hsp90) ATPase domain beginning from a deep eukaryotic ancestor to modern Saccharomyces cerevisiae. We found a pervasive influence of epistasis. Of 98 derived amino acid states that evolved along this lineage, about half compromise fitness when introduced into the reconstructed ancestral Hsp90. And the vast majority of ancestral states reduce fitness when introduced into the extant S. cerevisiae Hsp90. Overall, more than 75% of historical substitutions were contingent on permissive substitutions that rendered the derived state nondeleterious, became entrenched by subsequent restrictive substitutions that made the ancestral state deleterious, or both. This epistasis was primarily caused by specific interactions among sites rather than a general effect on the protein’s tolerance to mutation. Our results show that epistasis continually opened and closed windows of mutational opportunity over evolutionary timescales, producing histories and biological states that reflect the transient internal constraints imposed by the protein’s fleeting sequence states.

7 Likes

Nice try at moving the goal posts, but I’m not biting.

It’s not really wrong, rather, it’s not necessary.

The problem with the creationist use of the fitness landscape metaphor for proteins is that they don’t understand it.

They think the base of the hill is smaller than it is (for example, when Gpuccio thinks he can take a handful of sequences and has sampled enough of them to get an idea of how many meet the minimum threshold for function).

They think there are way fewer hills than there are (not even considering the possibility that the sequence diversity observed in life is a product of historical sampling, as opposed to a gauge of what is really out there).

They think the hills rarely overlap or touch at all (discarding more distant homologues with different functions in the protein families they include in their homologue searches).

And they think the shape of the landscape is static (by saying the landscape is rugged, not smooth, so complex proteins can’t evolve).

All four are wrong. This is why they don’t understand evolution in “sequence space”.

7 Likes

Like other Creationists here you keep mistaking your unsupported assertions for explanations. But thanks for admitting your FI calculations have nothing to do with actual biology.

2 Likes

Yes, but you aren’t making an educated guess about the numbers.

That’s not a conservative approach. You’re faking it to get a lower FI because you know that this process, all by itself, refutes @gpuccio’s claim. You’re ignoring the FI in VDJ recombination because it also refutes @gpuccio’s claim, again by itself. I suspect you’ve tried some numbers for the whole process, but they were too absurd, so you’re omitting VDJ recombination, a much easier FI to calculate, and hoping that no one notices.

I don’t see a speck of evidence that your guess is educated, Gilbert. You’re clearly just making this up.

No, that would not be anywhere near an educated guess. You’re lowballing it to a truly insane level. Since we know that the doubling time is ~6 hours and that the number of cells in a typical follicle is 10000, a primary-school student can do the math that you’re eliding.

Five divisions get you only to 32 cells, Gilbert. Getting to 10K cells would require well over 15, because there’s significant negative selection happening. An actual educated guess would be >20.
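The doubling arithmetic can be checked with a short sketch. It assumes a single founder cell and no cell death, so it gives only a lower bound on the number of divisions; with negative selection removing cells, more divisions are needed. The function names are made up for this example, and the 6-hour doubling time and 10,000-cell target are the figures quoted above.

```python
import math

def cells_after(divisions: int, start_cells: int = 1) -> int:
    """Cell count after a number of synchronous doublings, assuming no cell death."""
    return start_cells * 2 ** divisions

def min_divisions(target_cells: int, start_cells: int = 1) -> int:
    """Smallest number of doublings that reaches at least target_cells."""
    return math.ceil(math.log2(target_cells / start_cells))

DOUBLING_HOURS = 6      # doubling time quoted in the thread
TARGET_CELLS = 10_000   # follicle size quoted in the thread

n = min_divisions(TARGET_CELLS)
print(cells_after(5))           # 32
print(n)                        # 14
print(n * DOUBLING_HOURS / 24)  # 3.5 (days, lower bound)
```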

Now, if you want to insist that your guess of 5 was educated, please point out from where you received your education.

Which, if you were being conservative and not a shill for @gpuccio, you would then multiply by the ~100 clones in which this is happening simultaneously, all with the identical function you defined.

But adjusting your estimate for reality would get you way over 500 bits, which variation and selection can do in less than two weeks without design. @gpuccio is trying to claim that it can’t be done in hundreds of millions of years.

And then you’d have to add the FIa for VDJ recombination.

But thanks for dropping the howler about how a universal feature of vertebrates that works throughout life is somehow rare.

2 Likes

You don’t know anything of the sort. It’s obnoxious to falsely attribute beliefs to other people, Gilbert. What you’re missing is that we scientists don’t believe anything of the sort. The evidence supports the conclusion. If you have evidence that falsifies it, present the evidence. Not a bunch of rhetoric with a tiny bit of evidence that was cherry-picked and distorted to support your claim.

It’s clear that you understand this at some level, because you clearly have no familiarity with the relevant evidence before claiming to understand things like the proportion of sequence space that has function.

1 Like

I do understand the pseudoscientific concepts used by IDers. They are:

  1. Avoid empirical testing of an ID hypothesis at all costs; it’s hard work and you might find that your hypothesis was false. Better to produce only rhetoric.
  2. Pretend that science doesn’t involve hypothesis testing–it’s only about interpreting existing data.
  3. Don’t even bother to look at most of the existing data before taking a public stand.

You and @gpuccio are following all of those concepts, correct?

Then it is completely decoupled from our understanding of genetics, biochemistry, and evolution.

This is why you run away from computing FIa, because a single B-cell produces a receptor with affinity for a specific antigen through a single cell division via random VDJ recombination. If you were looking for truth, you’d look at that first.

Even worse for you, this typically occurs about 100 times to produce different sequences with the same function (binding the same antigen), so that demolishes the idea that function is rare in sequence space.

Antibodies have high FI. They are routinely produced by RV+NS in two weeks. By immunizing with analogs of reaction intermediates, we can get specific enzyme activities from them.

2 Likes

Actually, I would take issue with this. We see crazy amounts of variation in the extremely important proteins required to make our human hearts beat, with most of the variants causing inherited primary cardiomyopathy only rarely (and usually after reproductive age), probably because of epistasis.

Again, one would think that @gpuccio, having been trained as a physician, would have spent some time with these data, instead of merely doing BLAST searches, to get a better handle on the “target space” before investing so much ego in his claim.

1 Like

@gpuccio doesn’t have any data to show, Bill. You are grossly misrepresenting his pseudoscience as real science.

If he really thought he was right and that we are wrong, he would test his hypothesis far more rigorously by producing actual data that you could talk about.

1 Like

I agree that variants that are unlikely to cause a disease state (because these too, as you suggest, depend on the background in which they occur) are not necessarily selected against, or at least are only weakly selected against. Neutral or nearly neutral variants would obviously be expected to be able to rise to higher frequencies than more deleterious variants.

And they’d be far better evidence for @gpuccio and @Giltil than some BLAST runs…

John, I respectfully disagree. He is producing real data. The measurements are indirect but they are real. I don’t think you are looking at the real problem evolution is facing. There is almost no window where evolution by selected random change works. Your example is not broad enough to represent the problem.

By testing the possible ways evolution might work. For prp8, if 80% of the amino acids per position have substitutability, evolution will still not be able to find a functional sequence. If they can create a functional fold that can splice, the chance of a random search finding this sequence is 10^-96 where all but two work in every position. I am skeptical, given this restriction, that a 2335 AA protein can even execute a successful 3D fold.
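For readers who want to check figures of this kind, here is a minimal sketch of the underlying arithmetic. It assumes every position tolerates the same number of residues, independently of all other positions, which is a strong simplification that the epistasis results discussed earlier in the thread call into question. The function name is hypothetical; the length of 2335 and the two tolerance levels (80% of residues, and all but two) are taken from the post above.

```python
import math

def log10_hit_probability(seq_len: int, tolerated_per_site: int, alphabet: int = 20) -> float:
    """log10 of the chance that one random sequence is functional.

    ASSUMPTION: each site independently accepts `tolerated_per_site`
    of the `alphabet` possible residues.
    """
    return seq_len * math.log10(tolerated_per_site / alphabet)

SEQ_LEN = 2335  # prp8 length quoted in the thread

for tolerated in (16, 18):  # 80% of residues, and "all but two"
    exponent = log10_hit_probability(SEQ_LEN, tolerated)
    print(f"{tolerated}/20 tolerated per site: p ~ 10^{exponent:.1f}")
```

The exponent scales linearly with the assumed length and is very sensitive to the assumed per-site tolerance, so any single headline number reflects those inputs rather than a measurement.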

There is essentially no reasonable window where evolution created the eukaryotic cell.

What possible case do you have that evolution built prp8, let alone the spliceosome?

Yet there are literally millions of empirical observations in hundreds of different scientific disciplines over the last 160 years which show that’s exactly how evolution works. Heck, you’re still ducking the empirical examples of the evolving soft robots where you can see them evolve right before your eyes.

You need to offer more than some silly calculations which we’ve already established have no relevance at all to actual evolutionary processes. “Bill wishes it was so” just ain’t gonna cut it. 🙂

3 Likes