Information = Entropy and Chance = Choice

It will also help make sense of an earlier question you had…

Note that there are three types of information.

  1. Total information, or just the information content of an entity (the area of a circle)
  2. Mutual information, or the shared information between two entities (the overlap between the two circles)
  3. Conditional information, or the information required for one entity if we already know another entity (the non-overlapping area of one of the circles)
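As a sketch of how these three quantities fit together, here is one way to estimate them from paired samples of two discrete variables, using plug-in (empirical frequency) estimators in bits. The toy data and variable names are my own illustration, not anything from the thread; the point is the circle picture: total = mutual + conditional.

```python
# Empirical estimates of the three kinds of information for two
# discrete variables, from paired samples. Base-2 logs, so units are bits.
from collections import Counter
from math import log2

def entropy(samples):
    """Plug-in Shannon entropy H(X) in bits from a list of outcomes."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def joint_entropy(xs, ys):
    """H(X,Y): entropy of the paired outcomes."""
    return entropy(list(zip(xs, ys)))

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y): the overlap of the two circles."""
    return entropy(xs) + entropy(ys) - joint_entropy(xs, ys)

def conditional_entropy(xs, ys):
    """H(X|Y) = H(X,Y) - H(Y): the non-overlapping part of X's circle."""
    return joint_entropy(xs, ys) - entropy(ys)

# Toy paired data: B mostly copies A, so the circles overlap.
A = list("aabbaabb")
B = list("aabbaaba")

# The circle picture: H(A) = I(A;B) + H(A|B).
total = entropy(A)
shared = mutual_information(A, B)
unshared = conditional_entropy(A, B)
assert abs(total - (shared + unshared)) < 1e-9
```

The decomposition holds for any paired data, not just this toy example; only the estimates (not the identity) depend on sample size.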

So, in your case, if a single mutation changes a gene:

  1. In the most common understanding, the conditional information is guaranteed to increase (unless the mutation leaves the sequence unchanged). Here we mean the conditional information between the before-mutation and after-mutation sequences. It turns out to be fairly easy to calculate a reasonable estimate of this conditional information when it is a single mutation.
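One hypothetical back-of-envelope version of that estimate (my illustration, not a calculation from the thread): to describe the mutated sequence given the original, you need roughly enough bits to name the mutated position plus enough to name the new letter.

```python
# Rough conditional information of a single substitution, given the
# original sequence: log2(L) bits for the position plus log2(k - 1)
# bits for which of the other letters replaced it (k = alphabet size).
from math import log2

def point_mutation_bits(seq_len, alphabet_size=4):
    """Approximate bits to specify one substitution in a known sequence."""
    return log2(seq_len) + log2(alphabet_size - 1)

# A 1024-base sequence: about 10 + 1.58, i.e. roughly 11.6 bits.
print(round(point_mutation_bits(1024), 2))
```

Insertions and deletions need a slightly different count (position plus inserted letter, or position alone), but the same logic applies.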

  2. In the next most common understanding, the mutual information between the before-mutation and after-mutation sequences may or may not decrease… An insertion would keep the mutual information the same, but a deletion or point mutation would reduce it. If A and B are two unrelated sequences, the chance that mutual information will increase is about a 50/50 coin toss in many if not most cases, as we just learned: The EricMH Information Argument and Simulation - #117.
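One crude way to see the point-mutation case (a sketch of my own, using aligned positions as paired samples rather than a proper alignment-based or compression-based estimator): the empirical mutual information between a sequence and its mutated copy drops as substitutions accumulate.

```python
# Sketch: treat aligned positions of the before- and after-mutation
# sequences as paired samples and estimate mutual information between
# them. Point mutations erode the shared information.
import random
from collections import Counter
from math import log2

def entropy(samples):
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_information(xs, ys):
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

random.seed(0)
before = [random.choice("ACGT") for _ in range(2000)]

after = before[:]                              # identical copy
mi_same = mutual_information(before, after)    # maximal: equals H(before)

for i in random.sample(range(2000), 500):      # 500 point substitutions
    after[i] = random.choice([b for b in "ACGT" if b != before[i]])
mi_mutated = mutual_information(before, after)

assert mi_mutated < mi_same
```

Indels would shift the alignment, which is why they need a different treatment than this positionwise sketch provides.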

  3. In the least common understanding, the information would most likely increase, though there is a small chance it might decrease. The more mutations there are, the more certain we are that the information will increase. Deletions, however, might reduce the information content on average. Randomly mutating a sequence usually increases its entropy, and information = entropy.
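The third case above can be demonstrated in a few lines (a toy sketch, starting from a deliberately repetitive sequence): random point mutations push the empirical Shannon entropy of a low-entropy sequence upward.

```python
# Sketch: start from a maximally repetitive (zero-entropy) sequence and
# apply random point mutations; the empirical Shannon entropy climbs.
import random
from collections import Counter
from math import log2

def entropy(seq):
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())

random.seed(1)
seq = list("A" * 1000)     # all one letter: entropy is exactly 0 bits
h_before = entropy(seq)

for _ in range(300):       # 300 random point mutations
    i = random.randrange(len(seq))
    seq[i] = random.choice("ACGT")

assert entropy(seq) > h_before
```

A sequence that already sits near maximum entropy can drift slightly down as well as up, which is the "small chance it might decrease" in the text.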

There are several ways of quantifying or measuring these quantities empirically from data. There are also theoretical ways of proving results about the “true” information, the unmeasurable and unknowable versions of these three quantities. Of note, many (if not most) of the proofs about “true” information do not carry over to empirical information. That translation step between theoretical “true” information and empirical information cannot be neglected. Understanding this translation undergirds a strong intuition about how to interpret information measures of complexity.
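One concrete face of that theory-to-data gap (my illustration, using the standard plug-in estimator): the empirical entropy of a small sample systematically underestimates the true entropy of the source it was drawn from, even though both are called "entropy."

```python
# Sketch: the plug-in entropy of a small sample falls short of the true
# source entropy. True source here: uniform over 8 symbols, so H = 3 bits.
import random
from collections import Counter
from math import log2

def entropy(samples):
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

random.seed(2)
symbols = list(range(8))
small = [random.choice(symbols) for _ in range(15)]      # tiny sample
large = [random.choice(symbols) for _ in range(100000)]  # large sample

# 15 draws cannot cover 8 symbols evenly, so the estimate is below 3 bits;
# the large-sample estimate converges toward the true value.
assert entropy(small) < 3.0
assert abs(entropy(large) - 3.0) < 0.05
```

This downward bias of the plug-in estimator is one reason proofs about "true" information cannot simply be read off from empirical measurements.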

There are several other things that should become clear too. If we ever talk about mutual information or conditional information in a formal context, we must always specify relative to what the information is being measured: we can’t talk about the mutual or conditional information of A without specifying B. We also cannot claim to be talking about information theory and use the bare term “information” to mean anything other than total information, which equals entropy. Mutual information is the shared entropy between two objects; conditional entropy is the unshared entropy. So information is entropy, just as Shannon wrote in his seminal paper.