Gpuccio: Functional Information Methodology

Swamidass et al.:

I see that many comments here are about the relationship between my analysis and a possible phylogenetic analysis. I am trying to understand better what you mean and why you suggest that. Maybe some further feedback from you would help.

I will try to clarify a few points about my analysis that, apparently, are not well understood.

  1. My analysis focuses on the vertebrate transition only because it is very easy to study. A number of circumstances are particularly favorable, as I have tried to explain: in particular, the time pattern of the pertinent splits, the presence of sufficient protein sequences in the NCBI database to make the comparisons and, of course, the very good data about the human proteome.

But in no way am I trying to affirm that there is something special in the vertebrate transition. There is a lot of functional information added at that time, and we can easily check the sequence conservation of that information up to humans. However, the same thing probably happens at many other transitions.

So, why do I find a lot of FI at the vertebrate transition? It's because I am looking for FI specific to the vertebrate branch. Indeed, I am using human proteins as a probe, and humans are of course vertebrates. My analysis shows that a big part of the specific FI found in vertebrates was added at the initial transition. It does not compare that to what happens in other branches of natural history.

Just to be clear, I could analyze in a similar way the transition to Hymenoptera. In that case, I would take as probes the protein sequences in some bee, for example, and blast them against pre-Hymenoptera and some common ancestor of the main branches in that tree. I have not done that, but it can be done, and it would have the same meaning as my vertebrate analysis: to quantify how much specific FI was added at the beginning of that evolutionary branch.
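To make the procedure concrete, here is a minimal sketch of the bookkeeping in Python. It is only an illustration under stated assumptions, not a definitive implementation. It assumes the blast runs have already been done and that the "jump" at the start of a branch is taken as the difference between the best bitscore among the earliest organisms inside the branch and the best bitscore among organisms that split before it. The function names (`best_bitscore`, `information_jump`) and the numbers in the example are hypothetical.

```python
from typing import Dict, Tuple


def best_bitscore(hits: Dict[str, float]) -> float:
    """Return the highest bitscore among the hits, or 0.0 if there are none."""
    return max(hits.values(), default=0.0)


def information_jump(
    pre_branch_hits: Dict[str, float],
    early_branch_hits: Dict[str, float],
    probe_length: int,
) -> Tuple[float, float]:
    """Estimate the branch-specific jump for one probe protein.

    pre_branch_hits:   organism -> bitscore for organisms that split
                       before the branch of interest.
    early_branch_hits: organism -> bitscore for the earliest organisms
                       inside the branch.
    probe_length:      length of the probe protein in amino acids.

    Returns (jump in bits, jump in bits per amino acid).
    Assumption: the jump is the difference of the two best bitscores.
    """
    if probe_length <= 0:
        raise ValueError("probe_length must be positive")
    jump_bits = best_bitscore(early_branch_hits) - best_bitscore(pre_branch_hits)
    return jump_bits, jump_bits / probe_length


if __name__ == "__main__":
    # Hypothetical bitscores for a single probe protein of 1000 amino acids.
    pre = {"pre_branch_organism_1": 300.0, "pre_branch_organism_2": 410.5}
    early = {"early_branch_organism_1": 900.5}
    print(information_jump(pre, early, probe_length=1000))
```

The same function applies to any branch: only the probe (a human protein, a bee protein) and the two groups of organisms change.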

I am not saying that vertebrates are in any way special. I am not saying that humans are in any way special (well, they are, but for different reasons).

  2. It should be clear that my methodology is not measuring the absolute FI present in a protein. It is only measuring the FI conserved up to humans, and specific to the vertebrate branch.

So, let's say that protein A has 800 bits of human-conserved sequence similarity (conserved for 400+ million years). My methodology affirms that those 800 bits are a good estimator of specific FI. But let's say that the same protein A, in bees, has only 400 bits of sequence similarity with the human form. Does it mean that the bee protein has less FI?

Absolutely not. It probably just means that the bee protein has less vertebrate-specific FI. But it may well have a lot of Hymenoptera-specific FI. That can be verified by measuring, for that protein, the sequence similarity conserved in that branch for a few hundred million years.

OK, time has expired. More in next post.
