Gpuccio: Functional Information Methodology

My comments regarding gpuccio’s work:

As far as I can tell, the goal is to estimate what gpuccio calls “human conserved FI”, the idea being that large amounts of “human conserved FI” will be suggestive of design. What confuses me is the method used to arrive at “human conserved FI”, and how the result relates to any parameter that may suggest design.

To illustrate: gpuccio estimates the “human conserved FI” for a given protein by taking the BLAST bit score for the human-shark comparison and subtracting the bit score for the human-Saccoglossus comparison (H:S - H:Sa). The problem with this is that one can obtain very different values if one changes the organisms that are plugged into the analysis. For example, replace sharks with chimpanzees (Ch) and one gets much, much larger values. However, also replace Saccoglossus with the mouse (M) and the value for “human conserved FI” (H:Ch - H:M) will be much, much smaller.
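To make the dependence on species choice concrete, here is a minimal Python sketch of the subtraction as I understand it. The bit scores are made-up placeholders, not real BLAST output; I chose them only so that closer relatives score higher against human. The names `conserved_fi`, `fi_for`, and `placeholder_bits` are my own and are not taken from gpuccio's posts.

```python
def conserved_fi(bits_ref_vs_ingroup: float, bits_ref_vs_outgroup: float) -> float:
    """Difference of two BLAST bit scores, e.g. H:S - H:Sa."""
    return bits_ref_vs_ingroup - bits_ref_vs_outgroup


# PLACEHOLDER bit scores for one hypothetical protein.
# These numbers are invented for illustration; they are not BLAST results.
placeholder_bits = {
    ("human", "chimpanzee"): 1000.0,
    ("human", "mouse"): 900.0,
    ("human", "shark"): 600.0,
    ("human", "saccoglossus"): 300.0,
}


def fi_for(ref: str, ingroup: str, outgroup: str, bits=placeholder_bits) -> float:
    """Look up the two pairwise scores and subtract them."""
    return conserved_fi(bits[(ref, ingroup)], bits[(ref, outgroup)])


# Same protein, same reference species, three different choices of partners.
shark_vs_sacco = fi_for("human", "shark", "saccoglossus")       # H:S  - H:Sa
chimp_vs_sacco = fi_for("human", "chimpanzee", "saccoglossus")  # H:Ch - H:Sa
chimp_vs_mouse = fi_for("human", "chimpanzee", "mouse")         # H:Ch - H:M

for label, value in [
    ("H:S  - H:Sa", shark_vs_sacco),
    ("H:Ch - H:Sa", chimp_vs_sacco),
    ("H:Ch - H:M ", chimp_vs_mouse),
]:
    print(f"{label} = {value:.0f} bits")
```

With these placeholder inputs the result moves up or down purely because different species were swapped in, which is the behavior I am describing above.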

To show why this doesn’t make much sense to me, consider instead the Saccharomyces species (cerevisiae and fragilis), and, as an “outgroup” to represent some unicellular predecessor, Plasmodium. Run gpuccio’s calculation (Sc:Sf - Sc:P) and one gets a result that would call for design in the origination of yeasts (probably, for the comparison I present here, the amount of “conserved FI” would be much greater for yeast than for humans when the latter is calculated using chimpanzees and mice).

Thus, as best I can tell, “conserved FI” is little more than a measure of evolutionary relatedness. One can rig the calculation to obtain pretty much any value one wants, and the value would reflect relationships between the three organisms used for the analysis. Nothing more, IMO.

Beyond this, it is not clear to me what the connection is between “conserved FI” and design. I suspect (but would welcome correction) that gpuccio is drawing on the work of Dembski, Axe, Behe, et al., who argue that information, defined in terms of how rare functional sequences are in sequence space, is suggestive of design when it is high. However, as many, many discussions here on PS have shown, the ID vanguard is wrong when it comes to their ideas about protein functionality and information. This calls into question gpuccio’s use of the term, and the conclusions drawn.

However, I will grant that I am not familiar with all of gpuccio’s posts on UD, and will grant the possibility that gpuccio is aware of these considerations and has developed a more correct formulation. If so, I would be interested in a summary, and more so in the ways that the metric has been calibrated and/or validated.

My 2c.