Durston: Functional Information

swamidass · November 4, 2018, 2:59pm

First, I am not interested in “non-functional” p53; we are interested in carcinogenic p53.

Still waiting…

Second, before we get to actual data, I want to work out this toy example. I note that you have not yet answered the question. What do you compute for H(cP53 | maxent) based on the observed sequences? It cannot be two or three numbers. It can only be a single number. The closest to a straight answer is here:

This seems to be an admission that the H(cP53) computed from sequences is 4 bits less than H(P53), even though it should be zero (in your mind). Strangely, you also wrote contradictory things on this point. It is a straightforward question. Looking at the extant sequences, what do you compute as H(cP53)? What do you compute as H(P53)? What do you compute as H(ground)? What is required here are the numbers computed from the extant sequences, because that is all you use to compute FSC.

Note also that this is not relative to anything. It is merely the number computed from the extant sequences. We should be able to compute the delta H between any two states once the H of all states are established. Why is it so difficult to give us this number?

Where to find cancer data

Third, when we do actually work with data, there are over 100,000 examples of normal p53 in ExAC, so try using that. There are over 10,000 examples of carcinogenic p53 in CTAG, so try using that. We have more than enough data to make sense of this. The numbers calculated from this data are very different from yours.

Why so difficult?

@Kirk, I’m not sure why it is taking so long to establish H(cP53) or why you are writing so much. I am just asking you to apply your formula to cancer in a toy example to produce a straightforward answer. It is a well-defined problem.

Show me how you computed this number from the sequences in this example. It appears that you are stating what FSC should be, rather than what it is actually computed to be from extant sequences. We need to know what FSC is computed as, not what you think it should be if it is a valid measure of FI. So I should repeat, for our toy example (not jumping ahead), using extant sequences and the FSC method you published, what are these quantities:

H(P53)
H(cP53)
H(ground)

I’ve asked this several times now:

swamidass:

Clarifying this toy example some more. Of course, this is just a simplified model and better data would improve this. P53 has 390 amino acids. Let us imagine (this is not a fact) that all these amino acids need to be precisely correct for it to function normally. Carcinogenic TP53 (which we will abbreviate cP53) requires mutating a specific amino acid to some other amino acid; any one will do.

We can obtain a large number of normal P53 sequences, and a large number of cP53 sequences. What will their H be?

H(P53)
H(cP53)
H(maxent ground state)

In this toy example, it seems that we will agree on this:

H(maxent ground state) = 390 * log2(20) ≈ 1686 bits
H(P53) = 0
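As a quick check on the ground-state figure, here is a minimal sketch of the arithmetic. It assumes the maxent ground state means every one of the 390 sites is uniform over the 20 standard amino acids, which is how the toy example is set up; the function name is just an illustrative label.

```python
import math

SEQ_LEN = 390   # P53 length used in the toy example
ALPHABET = 20   # standard amino acids

def maxent_ground_entropy(length=SEQ_LEN, alphabet=ALPHABET):
    """Entropy in bits when every site is uniform over the alphabet."""
    # Each site contributes log2(alphabet) bits; sites are treated as independent.
    return length * math.log2(alphabet)

h_ground = maxent_ground_entropy()
print(f"H(maxent ground state) = {h_ground:.2f} bits")
```

Rounded to the nearest bit, this gives the 1686 used above.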

This means that the FSC is going to be 1686 bits for normal p53. The divergence appears to be in how we compute H(cP53). What number do you compute? What do you think we will arrive at for the H of the cancer-function in P53 sequences?

H(cP53) = ?

I compute it at about 4 bits, which puts FSC(cP53) at 1682 bits. This is merely applying your formula to the extant sequences we have, which (in this toy example) will all match normal P53 except at one location. Any other number would require either (1) changing your formula for FSC, or (2) smuggling in sequences for cP53 that were not actually part of the data set. What do you compute?

So, based on the extant sequences as described in this example, I compute:

H(P53) = 0
H(cP53) = 4
H(maxent ground state) = 1686

This means the FSC computations are:

FSC(P53) = 1686
FSC(cP53) = 1682
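For anyone who wants to reproduce these numbers, here is a rough sketch of the toy computation. It assumes H is the sum of per-site Shannon entropies estimated from the observed column frequencies, and that FSC is H(ground) minus that sum, which is my reading of the method under discussion. The wild-type string is a placeholder (not the real P53 sequence), and `HOTSPOT` is a hypothetical index chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SEQ_LEN = 390
HOTSPOT = 174  # hypothetical position of the carcinogenic mutation

def site_entropy(column):
    """Shannon entropy (bits) of one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def alignment_entropy(seqs):
    """Sum of per-site entropies over equal-length sequences."""
    return sum(site_entropy(col) for col in zip(*seqs))

# Placeholder wild type: the toy example only needs one fixed sequence.
wild_type = "A" * SEQ_LEN

# Normal P53: every observed sequence is identical.
normal = [wild_type] * 100

# cP53: wild type everywhere except the hotspot, where any other residue occurs.
cancer = [
    wild_type[:HOTSPOT] + aa + wild_type[HOTSPOT + 1:]
    for aa in AMINO_ACIDS
    if aa != wild_type[HOTSPOT]
]

h_ground = SEQ_LEN * math.log2(len(AMINO_ACIDS))
h_p53 = alignment_entropy(normal)
h_cp53 = alignment_entropy(cancer)

fsc_p53 = h_ground - h_p53
fsc_cp53 = h_ground - h_cp53

print(f"H(P53)    = {h_p53:.2f}")
print(f"H(cP53)   = {h_cp53:.2f}")
print(f"FSC(P53)  = {fsc_p53:.2f}")
print(f"FSC(cP53) = {fsc_cp53:.2f}")
```

With 19 equally frequent substitutions at one site, H(cP53) comes out to log2(19), a little over 4 bits, so the unrounded FSC(cP53) is slightly below the 1682 quoted above. Either way it is nowhere near zero, which is the point.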

What numbers do you compute from the extant sequences in this example? Yes, I know you think FSC should equal zero to be valid. For your interpretation to hold, I agree. I am asking here instead what the actual FSC computation gives us, which is NOT zero, which means that FSC is not valid. If we can’t get a straight answer on this, it seems that this conversation is coming to a close.
