Comments on Gpuccio: Functional Information Methodology

colewd · August 30, 2019, 5:43pm

You are assuming that there is no FI in the starting sequence. He is saying change in FI.

Mercer · August 30, 2019, 5:48pm

Indeed. I am particularly struck by what appears to be an unwillingness to make a serious effort to accurately estimate the prevalence of even a single function in sequence space, which is an essential factor in every mathematical assertion @gpuccio and @Giltil are making.

Here’s another actual biological function for which FI is meaningless: the function of MHC (major histocompatibility complex) is as a necessary component for the immune system’s ability to distinguish self from nonself. IOW, its function is literally to be different. There’s no way that I can see to calculate FI for that.

T_aquaticus · August 30, 2019, 6:45pm

So how do the FI calculations for ubiquitin factor in the FI of the proteins that preceded ubiquitin?

gpuccio · August 30, 2019, 6:56pm

Of course not. Most of the FI is already there. You only add what is missing.

If you had read my statements about random walks, you would know that FI expresses the probability of finding the target by a random walk from an unrelated state.

Rumraket · August 30, 2019, 6:59pm

But isn’t it then obvious that you don’t actually know whether any of the proteins you use as examples really contain those 500 bits? When you say they do, you are essentially claiming to know that there are no other proteins they evolved from, otherwise you would have to say most of the FI was already there because most of the sequence was present in the ancestral state.

Roy · August 30, 2019, 7:29pm

I see the safe analogy has made another appearance. Unfortunately, the bit about the large safe’s handle turning more and more as the thief gets closer to finding the right combination has been omitted.

T_aquaticus · August 30, 2019, 8:42pm

That’s not found in any of the calculations I have seen. Instead, the FI calculation focuses solely on the function that emerges by comparing the variation of that gene in different lineages. I have yet to see an FI calculation that factors in ancestral proteins that had a different function.

What is an “unrelated state” in this scenario? If we have a protein with no ubiquitin function that gains ubiquitin function through a single mutation would that meet your standards? Would the change in FI in this scenario be 4.3 bits?

Giltil · August 30, 2019, 8:51pm

In fact, I have revised my estimation of FIb and now think it is lower, around 50 bits. But this still remains a pretty large amount of FI that the IS cannot produce by a random walk. In order to produce these 50 bits of FI, the IS resorts to the RV + NS mechanism.

The question now is why RV + NS is able to produce high FI in a few weeks during the process of somatic hypermutation (SHM). The thing to see here is that the path from an antibody with low affinity for the antigen to an antibody with high affinity for this antigen consists of a succession of discrete selection steps. Moreover, the FI associated with each of these selection steps is very low, around 10 bits. Given the probabilistic resources of the IS during the SHM process, it is child’s play for the IS to produce 10 bits of FI by random walk and, as a result, to go through the different selection steps leading cumulatively to high FI. So it is true that RV + NS can produce high FI, but only in one particular and very special situation, i.e., when the final target exhibiting high FI can be reached incrementally through a series of small selective steps. Such a very special situation is quite rare in biology and doesn’t apply to complex proteins. This last point is well argued by @gpuccio in his OP below.
https://uncommondescent.com/intelligent-design/what-are-the-limits-of-natural-selection-an-interesting-open-discussion-with-gordon-davisson/
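
For readers who want the arithmetic behind the stepwise argument, here is a rough sketch. It assumes that k bits of FI correspond to a success probability of 2^-k per random trial, so the expected number of trials is 2^k, and that 50 bits are reached in five selectable steps of 10 bits each. The five-step split and the function name `expected_trials` are illustrative assumptions, not figures taken from the post.

```python
def expected_trials(bits: int) -> int:
    """Expected number of random trials to hit a target of `bits` bits of FI,
    assuming each trial succeeds independently with probability 2**-bits."""
    return 2 ** bits


# Assumed for illustration: 50 bits total, reached in 5 selectable steps of 10 bits.
step_bits = 10
steps = 5

# Each step is kept by selection before the next one starts, so costs add.
stepwise = steps * expected_trials(step_bits)

# With no selectable intermediates, all 50 bits must be found at once.
single_jump = expected_trials(step_bits * steps)

print(f"stepwise:    {stepwise} trials")
print(f"single jump: {single_jump} trials")
```

Under these assumptions the stepwise route costs 5,120 expected trials against roughly 1.1 × 10^15 for the single jump. The whole difference comes from whether selectable intermediates exist.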

Giltil · August 30, 2019, 8:58pm

Good guess👍

Giltil · August 30, 2019, 9:31pm

With all due respect, you are completely wrong here, as explained by @gpuccio below.

Gpuccio: Functional Information Methodology Conversation

Wow, you guys in the anti-ID field seem to be really fond of this error. Let’s state things clearly: 10 objects with 50 bits of FI each are not, in any way, 500 bits of FI. Which is what Rumracket (and maybe you) seems to believe when he says: If natural selection can add 60 bits of FI in a few weeks, why can’t it add 500 bits of FI over the course of (say) 20 million years? To make things more clear, I will briefly propose again here my example of the thief and the safes, that I used some time ago to make the same point with Joe Felsenstein. It goes this way. A thief enters a building, where he finds the following objects: a) One set of 100 small safes. b) One big safe. The 100 small safes contain, each, 1/100 of the sum in the big safe. Each small safe is protected by one electronic key of one bit: it opens either with 0 or with 1. The big safe is protected by a 100 bit long electronic key. The thief does not know the keys, any of them. He can do two different things…

glipsnort · August 30, 2019, 9:50pm

I was indeed wrong, but a lot less wrong than @gpuccio.

colewd · August 30, 2019, 10:54pm

If the scenario that you describe is real then the FI was already in the previous application as far as I can tell.

colewd · August 30, 2019, 10:56pm

Whoever ends up being right is immaterial, as the conversation is going to improve our understanding of functional information. Thanks for the thoughtful posts.

colewd · August 30, 2019, 11:36pm

@gpuccio here is a reference point for the discussion. This definition is from Hazen and Szostak.

I(Ex) = -log2[F(Ex)], where F(Ex) is the fraction of all possible configurations of the system that possess a degree of function ≥ Ex. Functional information, which we illustrate with letter sequences, artificial life, and biopolymers, thus represents the probability that an arbitrary configuration of a system will achieve a specific function to a specified degree. In each case we observe evidence for several distinct solutions with different maximum degrees of function, features that lead to steps in plots of information versus degree of function.
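
A minimal sketch of that definition in code may help. The function name `functional_information` and the example fractions are illustrative assumptions. The 1-in-20 case corresponds to one specific amino acid being required at a single position, which gives the roughly 4.3 bits mentioned earlier in the thread.

```python
import math


def functional_information(functional_fraction: float) -> float:
    """I(Ex) = -log2(F(Ex)), where F(Ex) is the fraction of all possible
    configurations with a degree of function of at least Ex."""
    if not 0.0 < functional_fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -math.log2(functional_fraction)


# One specific residue required out of 20 possible amino acids at one site.
one_site_bits = functional_information(1 / 20)

# A fraction of 2**-500 of sequence space corresponds to 500 bits.
five_hundred_bits = functional_information(2.0 ** -500)

print(f"{one_site_bits:.2f} bits, {five_hundred_bits:.0f} bits")
```

The value depends entirely on the fraction F(Ex). Any FI figure is therefore only as reliable as the estimate of how common the function is in sequence space.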

Dan_Eastwood · August 31, 2019, 12:52am

The safes example assumes independence (see 1 below). That is, the combinations and rewards (hence in $dollars) are part of a whole, not 101 independent parts. Each safe has $1. The first safe has a 1-bit combination, the second a 2-bit combo, etc., up to a 100-bit combo for the last safe.
The first small safe with a 1-bit combination is quickly opened by the thief, who gains $1 and 1 bit of the combination to the next safe. The second safe has a 2-bit combination, but using his 1-bit knowledge he only needs one more bit! The second safe is also soon opened, gaining another $1 and 1 more bit. The thief proceeds to open each safe in turn until all the safes are open, and walks out with $100.

Let’s make this harder - The thief does not know how the safes are ordered, so it is not clear in what order he should proceed.
He starts by entering “0” as the combination for all 100 safes; if none of them opens, he goes around again and enters “1” as the combination, and one must open. The thief has gained $1 and 1 bit. 99 safes remain.
The thief repeats his task, starting with the bit(s) he knows and adding 1 bit at a time until all the safes are open.

Additional notes:

1. I am making the error of assuming only a single function, or a single set of safes, when the thief may have many to choose from. A combination that does not open a safe for a particular function might open a safe containing some other new and unexpected function.
2. Another error! There is not just a single thief, but a population of thieves, each working to open the safes and sharing information.
3. If the safe combinations allow extra bits beyond the correct combination, the thief can guess the next two or three bits, possibly opening several safes with each pass, greatly speeding his task.
4. The thief ought to be flipping coins to choose bits instead of sequentially trying “0” and “1” bits. The thief will average 2 attempts per bit, but this does not substantially change the point I am making so I’m not going back to fix it!
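
The ordered version of the nested-safes scenario can be sketched in a few lines. This is an illustration under stated assumptions: safe k opens with the first k bits of one hidden 100-bit string, the thief tries “0” and then “1” for each new bit, and the names `open_nested_safes` and `secret` are hypothetical.

```python
import random


def open_nested_safes(secret):
    """Open safes 1..n in order, where safe k opens with secret[:k].

    The thief reuses the bits already learned and tries 0, then 1, for the
    single unknown bit. Returns (total attempts, recovered combination).
    """
    known = []
    attempts = 0
    for k in range(1, len(secret) + 1):
        for guess in (0, 1):
            attempts += 1
            if known + [guess] == secret[:k]:
                known.append(guess)  # safe k opens; one more bit learned
                break
    return attempts, known


rng = random.Random(0)  # fixed seed so the run is repeatable
secret = [rng.randint(0, 1) for _ in range(100)]

attempts, recovered = open_nested_safes(secret)
blind_search_space = 2 ** len(secret)  # keys to search with no feedback at all

print(f"nested safes opened in {attempts} attempts")
print(f"blind 100-bit search space: {blind_search_space} keys")
```

With feedback after every bit, the thief never needs more than 200 attempts. Without feedback there are 2^100 candidate keys for the last safe. That gap is the difference between the two versions of the analogy.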

Dan_Eastwood · August 31, 2019, 1:16am

Gpuccio is, I think, assuming there is nothing the thief can gain until some substantial number of correct bits are known. We could modify the example so the thief needs to guess more than 1 bit, at least at first. That would mitigate some of the easy gains for the thief I demonstrated, but opening the safes and gaining $100 is still far easier than claimed. I’m willing to give our busy thief a well-earned rest.

AndyWalsh · August 31, 2019, 3:28am

Looking at the original CARD11 example: The actual function of the matching protein in Saccoglossus kowalevskii seems to be unknown at this time, as it is only a predicted protein from analysis of the genome sequence (if I understand correctly). However, if it does have an analogous function, isn’t the divergence in sequence between Saccoglossus kowalevskii and humans suggestive of low functional information (as defined), since two very different proteins can achieve the same function? And if the two proteins don’t have analogous functions and we are just looking for the raw material to mutate into something that can carry out the function in humans/vertebrates, don’t we need to look at the DNA level rather than the protein level?

Ashwin_s · August 31, 2019, 4:25am

The situation is a little more dynamic than this. The environment/ecosystem changes (change in weather, animal migration patterns, pathogens etc).

So it is more like the safes’ codes keep changing every once in a while.
This is why evolution is a process that depended so much on “contingencies”.
In short, biodiversity is a miracle that normally shouldn’t have happened.

Roy · August 31, 2019, 8:09am

How do you know?

Rumraket · August 31, 2019, 8:20am

No, that’s not what I’m saying. I’m saying why can’t the accumulation in the same protein continue? There’s an antibody, it has some sequence. It mutates in the hypervariable region in a stretch of 6-10 amino acids, and over the course of a few weeks 50 FI (or whatever) is generated. Why could a larger portion of the antibody not continue mutating?

Why could the same thing not happen to 200 residues, out of a 600 amino acid protein over the course of 500 million years? In fact, isn’t this exactly what phylogenetics reveals to us has happened? We create a large tree of homologous proteins, and we see that along some lineages lots of mutations have accumulated in the protein during this time period?

Natural selection fixed the mutations in the antibody. Why can’t natural selection have fixed the mutations in the larger protein?
