You are assuming that there is no FI in the starting sequence. He is talking about the change in FI.
Indeed. I am particularly struck by what appears to be an unwillingness to make a serious effort to accurately estimate the prevalence of even a single function in sequence space, which is an essential factor in every mathematical assertion @gpuccio and @Giltil are making.
Here's another actual biological function for which FI is meaningless: the function of MHC (major histocompatibility complex) is as a necessary component for the immune system's ability to distinguish self from nonself. IOW, its function is literally to be different. There's no way that I can see to calculate FI for that.
So how do the FI calculations for ubiquitin factor in the FI of the proteins that preceded ubiquitin?
Of course not. Most of the FI is already there. You only add what is missing.
If you had read my statements about random walks, you would know that FI expresses the probability of finding the target by a random walk from an unrelated state.
But isn't it then obvious that you don't actually know whether any of the proteins you use as examples really contain those 500 bits? When you say they do, you are essentially claiming to know that there are no other proteins they evolved from, otherwise you would have to say most of the FI was already there because most of the sequence was present in the ancestral state.
I see the safe analogy has made another appearance. Unfortunately, the bit about the large safe's handle turning more and more as the thief gets closer to finding the right combination has been omitted.
That's not found in any of the calculations I have seen. Instead, the FI calculation focuses solely on the function that emerges by comparing the variation of that gene in different lineages. I have yet to see an FI calculation that factors in ancestral proteins that had a different function.
What is an "unrelated state" in this scenario? If we have a protein with no ubiquitin function that gains ubiquitin function through a single mutation would that meet your standards? Would the change in FI in this scenario be 4.3 bits?
In fact, I have revised my estimation of FIb and now think it is lower, around 50 bits. But this still remains a pretty large amount of FI that the IS cannot produce by a random walk. In order to produce these 50 bits of FI, the IS resorts to the RV + NS mechanism.
The question now is why RV + NS is able to produce high FI in a few weeks during the process of somatic hypermutation (SHM). The thing to see here is that the path from an antibody with low affinity for the antigen to an antibody with high affinity for that antigen consists of a succession of discrete selection steps. Moreover, the FI associated with each of these selection steps is very low, around 10 bits. Given the probabilistic resources of the IS during the SHM process, it is child's play for the IS to produce 10 bits of FI by random walk and, as a result, to go through the different selection steps leading cumulatively to high FI. So it is true that RV + NS can produce high FI, but only in one particular and very special situation, i.e., when the final target exhibiting high FI can be reached incrementally through a series of small selective steps. Such a very special situation is quite rare in biology and doesn't apply to complex proteins. This last point is well argued by @gpuccio in his OP below.
https://uncommondescent.com/intelligent-design/what-are-the-limits-of-natural-selection-an-interesting-open-discussion-with-gordon-davisson/
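To make the arithmetic behind that argument concrete, here is a small back-of-the-envelope sketch. The 50-bit total and ~10-bit steps are the figures quoted above; splitting the total into five equal, individually selectable steps is my own simplifying assumption for illustration only:

```python
# Rough comparison of the trials needed to hit a 50-bit target in one random
# jump versus reaching it through five selectable ~10-bit steps.
# (The step count and equal step sizes are illustrative assumptions.)
one_jump = 2 ** 50          # order of trials expected for a single 50-bit target
stepwise = 5 * 2 ** 10      # five 10-bit steps, each fixed by selection before the next
print(f"one jump: {one_jump:.2e} trials")
print(f"stepwise: {stepwise} trials")
print(f"ratio:    {one_jump / stepwise:.2e}")
```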
Good guess.
With all due respect, you are completely wrong here, as explained by @gpuccio below.
I was indeed wrong, but a lot less wrong than @gpuccio.
If the scenario that you describe is real then the FI was already in the previous application as far as I can tell.
Whoever ends up being right is immaterial, as the conversation is going to improve our understanding of functional information. Thanks for the thoughtful posts.
@gpuccio here is a reference point for the discussion. This definition is from Hazen and Szostak.
I(Ex) = −log2[F(Ex)], where F(Ex) is the fraction of all possible configurations of the system that possess a degree of function ≥ Ex. Functional information, which we illustrate with letter sequences, artificial life, and biopolymers, thus represents the probability that an arbitrary configuration of a system will achieve a specific function to a specified degree. In each case we observe evidence for several distinct solutions with different maximum degrees of function, features that lead to steps in plots of information versus degree of function.
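For concreteness, here is a minimal sketch of that definition in code. The peptide numbers in the example are purely illustrative, not measured values:

```python
import math

def functional_information(n_functional: int, n_total: int) -> float:
    """Hazen & Szostak: I(Ex) = -log2(F(Ex)), where F(Ex) is the fraction
    of all possible configurations with degree of function >= Ex."""
    return -math.log2(n_functional / n_total)

# Illustrative numbers only: suppose 1,000 of the 20**10 possible
# 10-residue peptides meet the activity threshold Ex.
print(functional_information(1_000, 20 ** 10))  # ≈ 33.3 bits
```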
The safes example assumes independence (see note 1 below). In this version, the combinations and rewards (in dollars) are part of a whole, not 101 independent parts. Each safe has $1. The first safe has a 1-bit combination, the second a 2-bit combination, and so on, up to a 100-bit combination for the last safe.
The first small safe with a 1-bit combination is quickly opened by the thief, who gains $1 and 1 bit of the combination to the next safe. The second safe has a 2-bit combination, but using his 1-bit knowledge he only needs one more bit! The second safe is also soon opened, gaining another $1 and 1 more bit. The thief proceeds to open each safe in turn until all the safes are open, and walks out with $100.
Let's make this harder: the thief does not know how the safes are ordered, so it is not clear in what order he should proceed.
He starts by entering "0" as the combination for all 100 safes; if none of them opens, he goes around again and enters "1" as the combination, and one must open. The thief has gained $1 and 1 bit. 99 safes remain.
The thief repeats his task, starting with the bit(s) he knows and adding one bit at a time until all the safes are open.
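Here is a small simulation of that incremental scenario, just a sketch under the assumptions stated above (nested prefix combinations drawn from one hidden master string; the function and variable names are mine):

```python
import random

def crack_nested_safes(n_safes=100, seed=1):
    """Toy model of the nested-safes scenario: safe k holds $1 behind a
    k-bit combination, and every combination is a prefix of one hidden
    n-bit master string, so each opened safe reveals one more bit of
    all the remaining combinations."""
    rng = random.Random(seed)
    master = [rng.randint(0, 1) for _ in range(n_safes)]
    combos = {k: master[:k] for k in range(1, n_safes + 1)}  # safe k's combination

    known = []                # bits the thief has learned so far
    closed = list(combos)     # safes still shut
    rng.shuffle(closed)       # the thief does not know which safe is which
    attempts = dollars = 0

    while closed:
        for bit in (0, 1):                  # guess the next unknown bit
            guess = known + [bit]
            hit = None
            for k in closed:                # try the guess on every closed safe
                attempts += 1
                if guess == combos[k]:
                    hit = k
                    break
            if hit is not None:             # a safe opened: bank the new bit and $1
                known.append(bit)
                closed.remove(hit)
                dollars += 1
                break

    return dollars, attempts

print(crack_nested_safes())  # all $100 after a few thousand guesses,
                             # versus ~2**100 guesses for a single 100-bit safe
```

Run as written, the thief collects all $100 in a few thousand guesses, which is the point of the dependence assumption: each opened safe pays for the next.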
Additional notes:
- I am making the error of assuming only a single function, or a single set of safes, when the thief may have many to choose from. A combination that does not open a safe for a particular function might open a safe containing some other new and unexpected function.
- Another error! There is not just a single thief, but a population of thieves, each working to open the safes and sharing information.
- If the safe combinations allow extra bits beyond the correct combination, the thief can guess the next two or three bits, possibly opening several safes with each pass, greatly speeding his task.
- The thief ought to be flipping coins to choose bits instead of sequentially trying "0" and "1" bits. The thief will average 2 attempts per bit, but this does not substantially change the point I am making, so I'm not going back to fix it!
Gpuccio is, I think, assuming there is nothing for the thief to gain until some substantial number of correct bits is known. We could modify the example so the thief needs to guess more than one bit, at least at first. That would mitigate some of the easy gains for the thief I demonstrated, but opening the safes and gaining $100 is still far easier than claimed. I'm willing to give our busy thief a well-earned rest.
Looking at the original CARD11 example: The actual function of the matching protein in Saccoglossus kowalevskii seems to be unknown at this time, as it is only a predicted protein from analysis of the genome sequence (if I understand correctly). However, if it does have an analogous function, isn't the divergence in sequence between Saccoglossus kowalevskii and humans suggestive of low functional information (as defined), since two very different proteins can achieve the same function? And if the two proteins don't have analogous functions and we are just looking for the raw material to mutate into something that can carry out the function in humans/vertebrates, don't we need to look at the DNA level rather than the protein level?
The situation is a little more dynamic than this. The environment/ecosystem changes (changes in weather, animal migration patterns, pathogens, etc.).
So it is more like the safes' combinations keep changing every once in a while.
This is why evolution is a process that depended so much on "contingencies".
In short, biodiversity is a miracle that normally shouldn't have happened.
How do you know?
No, that's not what I'm saying. I'm saying: why can't the accumulation in the same protein continue? There's an antibody, it has some sequence. It mutates in the hypervariable region in a stretch of 6-10 amino acids, and over the course of a few weeks 50 bits of FI (or whatever) are generated. Why could a larger portion of the antibody not continue mutating?
Why could the same thing not happen to 200 residues, out of a 600 amino acid protein, over the course of 500 million years? In fact, isn't this exactly what phylogenetics reveals to us has happened? We create a large tree of homologous proteins, and we see that along some lineages lots of mutations have accumulated in the protein during this time period?
Natural selection fixed the mutations in the antibody. Why can't natural selection have fixed the mutations in the larger protein?