You attempted to cherry-pick by citing actin. Let me remind you of alpha actin, which is part of an irreducibly complex structure.
Mouse and human show 100% alignment.
Mice have 5 generations per year and this split was at least 50 million years ago. At 50 mutations per generation that’s roughly 10 billion neutral changes fixed in the population. There is real substantial functional constraint going on here.
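As a quick check on that arithmetic, here is a minimal sketch using only the figures quoted in the post. It assumes that all 50 mutations per generation are neutral and that neutral changes fix at the per-generation mutation rate (the standard neutral-theory result), and it counts one lineage only. The function and constant names are made up for illustration.

```python
# Back-of-envelope check; every input is a figure from the post, not measured data.
GENERATIONS_PER_YEAR = 5          # mouse generations per year, as stated
YEARS_SINCE_SPLIT = 50_000_000    # "at least 50 million years"
MUTATIONS_PER_GENERATION = 50     # assumed here to be entirely neutral


def neutral_changes_fixed(gens_per_year: int, years: int, mutations_per_gen: int) -> int:
    """Expected neutral substitutions along one lineage.

    Assumption: the neutral substitution rate per generation equals the
    neutral mutation rate per generation, independent of population size.
    """
    generations = gens_per_year * years
    return generations * mutations_per_gen


total = neutral_changes_fixed(
    GENERATIONS_PER_YEAR, YEARS_SINCE_SPLIT, MUTATIONS_PER_GENERATION
)
print(f"{total:,} expected neutral changes")  # 12,500,000,000
```

With these inputs the product is 12.5 billion, the same order of magnitude as the 10 billion quoted.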
There is. However, constraint is not information. Constraint as a proxy for function is the hypothesis that, according to you, should be worked on.
So, Bill, please calculate @gpuccio’s “FI” for actin. Then together, we’ll do so for the alpha- and beta-cardiac myosin heavy chains so that we can see which has more “FI” and which has more functional information.
We have common ground that gpuccio’s method tests for constraint. Constraint shows system failure when sequences are changed. We know this as mutations to critical protein functions lead to disease. We have millions of papers that support this hypothesis.
If I calculate the FI for skeletal muscle actin in mammals I would get about 1000 bits. This is converting 10^-377 to binary and backing off the number to account for error in the estimate. The exact same result is observed for smooth muscle actin and heart muscle actin. There are small sequence differences between these different actin proteins but all these are perfectly preserved in the mammals whose sequences have been compared. I find this result particularly interesting.
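For reference, the conversion described there can be sketched as follows. This assumes FI is taken as -log2 of the stated probability, which is how "converting to binary" is read here. It is not a reproduction of @gpuccio’s full procedure, and the function name is hypothetical.

```python
import math


def bits_from_power_of_ten(exponent: int) -> float:
    """Return -log2(10**-exponent), i.e. the bits equivalent of a 10^-exponent probability.

    Computed as exponent * log2(10), because 10**-377 itself underflows
    to 0.0 in double-precision floating point.
    """
    return exponent * math.log2(10)


bits = bits_from_power_of_ten(377)
print(f"{bits:.1f} bits")  # 1252.4 bits
```

The unrounded figure is about 1252 bits, so the "about 1000 bits" quoted above reflects the poster’s deliberate backing off rather than the raw conversion.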
The FI for cardiac myosin heavy chain based on this method is greater than 5000 bits. It is not perfectly preserved but it is highly preserved and its sequence length is 5x that of alpha actin.
No, BLASTing does that. @gpuccio’s hypothesis is that constraint is a measure of functional information. Let’s try to keep that goalpost planted. Your eagerness to move it reflects a lack of confidence in his hypothesis.
That would be a hypothesis that is disproved by the system you have chosen.
This system shows otherwise.
You haven’t read millions of papers, Bill. Your claims suggest that you have yet to read dozens, much less millions.
There’s no reason to back off, as we have great controls. Please provide the calculation precisely as specified by @gpuccio.
Why would you limit it to mammals?
I find your evasions to be far more interesting.
There’s not a single cardiac MHC, as I explicitly noted in the OP. We have alpha and beta isoforms, both of which work.
Why are you now pretending that there is only one?
But the C-terminal half of that sequence contains very little of the functional information. This alone is a major weakness of the hypothesis, correct?
He HYPOTHESIZES that. He is SHOWING nothing of the sort. Goalposts, Bill; let’s keep them planted.
Because functional information obviously is not evenly distributed WITHIN the protein. I’m just trying, against all odds, to get you to think about the hypothesis.
Your misrepresentation of the hypothesis as fact suggests that you lack sufficient confidence in it to test it. Is my inference correct?
So? In addition to the fact that 200K is not millions, searching PubMed is not an indication that you’ve read any of the papers you’ve found there. Your incoherent response here supports the inference that you don’t bother to read them.
I think that you should read my papers instead of my claims to contrast my level of understanding with yours.
I certainly understand the cardiac sarcomere at the system level, which is why I asked you why you were pretending that there was only one cardiac myosin heavy chain to analyze in this context.
Why did you not respond to that question, Bill?
Let’s do the calculations, with no estimates, starting with actin. Let’s see if @gpuccio’s hypothesis is correct.
This is untrue. He measures FI by his method and shows how it changes from major transitions. This is a test of his hypothesis.
Please get beyond assertion here.
Do you really think arguing this point is productive? If you are arguing from authority it shows weakness in your position.
I have read a few of your papers, but my area of interest is different.
This is not a request I am interested in. It shows your lack of understanding of his method. All measurements have error factors. When I am comparing 6 sequences there is an error factor due to the sample size.
So two different types of actins are adapted to slightly different functions, and they each work despite being different from each other. Hence the topology of the fitness surface for the protein sequence is dependent on the intra- and extracellular context in which it functions. This also shows there are at least two peaks, very close to each other in sequence space, with one variant being the local optimum for smooth muscle, and the other for heart muscle.
but all these are perfectly preserved in the mammals whose sequences have been compared.
I have to mirror the question Mercer posed: why stop at mammals? Go look at cnidarian actins, an ancient clade. Are they identical to mammalian actins? How well preserved are they between different cnidarian species? How does that conservation relate to how distantly related the cnidarian members are?
No, Bill, it is just a hypothesis that this “FI” calculation correlates with functional information.
The case that you have chosen shows that it does not.
I see that you lack sufficient faith in it to test it in real life.
I don’t think it’s a mere assertion to point out that the N-terminal half of myosin is the motor domain and the C-terminal half is the rod domain; it should be obvious that motors are far more functionally complex than rods.
Do you really find that to be a mere assertion?
You brought the point up. I’m not arguing from authority. You’re the one bringing it up to divert attention from testing @gpuccio’s hypothesis.
Your question had nothing to do with areas of interest. Please stop trying to deflect.
You’ve made your lack of confidence in the hypothesis quite clear, Bill. If you weren’t interested, why did you bring up the sarcomere in this context?
I understand it quite well. That’s why I am challenging you to apply it to the example you introduced.
There are no errors in the protein sequences of actin and the myosin heavy chains. They have all been sequenced many times.
No, there is no such “error factor.” You’re evading testing the hypothesis, Bill. Why?
What’s obvious is that you don’t understand the important difference between mutation and the fixation of mutant alleles. I’m trying to work with you to test the hypothesis that there is a correlation between a sequence being constrained (or critical) and the amount of functional information it contains.
You’re trying to avoid doing this. Why?
Then let’s calculate “FI” using both alpha and beta isoforms, because both are present.
The confidence level of the measurements. These sequence measurements have no errors, so that escape hatch is not available.
What’s the “FI” for human actin without any fudging, Bill?
For starters, because “statically sampling” is gibberish.
Let’s try to move on and limit ourselves to only human and mouse, which are sufficient to disprove the hypothesis. With 100% conservation, what’s the calculated “FI” for alpha-actin, with no estimations?