This is a good question, because I was intending to ask Eric about something very similar, following up on Explaining the Cancer Information Calculation - #42 by swamidass.
Let’s focus on the most recent and rather simple claim by Eric:
(Caveat: I’m not trained in information theory, and I have not been following the exchange between Eric and @Dan_Eastwood.)
First, it seems to me that Eric is using the word “information” in two different ways here: once for the mutual information between A and B (which, in my simple picture, I imagine as some degree of correlation between two sets A and B), and once for “information” in a general sense, something that exists in A independently, perhaps a measure of its entropy. This creates confusion.
Second, and more importantly, I don’t understand the idea of using a function F to sidestep this argument. In the simplest case, we can think of A and B as ordered strings of 0s and 1s. A function that produces B by copying A needs only a single instruction:
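To make the two senses concrete, here are the standard Shannon definitions as I understand them. This assumes the Shannon picture; Eric may have the algorithmic (Kolmogorov) versions in mind, in which case the same distinction applies with K in place of H.

$$
H(A) = -\sum_{a} p(a)\,\log_2 p(a), \qquad I(A;B) = H(A) + H(B) - H(A,B)
$$

The first quantity is a property of A alone; the second is a property of the pair (A, B) and is zero whenever A and B are independent.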
If A_n = X, then set B_n = X.
This “copying function” works no matter what A contains, so knowing F tells us essentially nothing about the contents of A. Thus, I(F:A) should be close to zero in most cases. Am I missing something here?
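Here is a minimal sketch of what I mean, in Python. The names `copy_bits`, `entropy`, and `mutual_information` are my own, and the estimate treats each position of the string as an independent sample, which is a simplifying assumption. It only estimates the Shannon quantity I(A;B); it does not compute I(F:A), which I take to be an algorithmic quantity and not something one can simply calculate.

```python
import math
import random
from collections import Counter

def copy_bits(a):
    """Hypothetical copying function F: set B_n = A_n for every position n."""
    return [x for x in a]

def entropy(symbols):
    """Empirical Shannon entropy, in bits per symbol, of a sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mutual_information(a, b):
    """Empirical I(A;B) = H(A) + H(B) - H(A,B), pairing symbols by position."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))

random.seed(0)  # fixed seed so the run is repeatable
a = [random.randint(0, 1) for _ in range(10_000)]  # an arbitrary input string A
b = copy_bits(a)                                   # B produced by F

h_a = entropy(a)
mi_ab = mutual_information(a, b)
print(f"H(A) = {h_a:.4f} bits, I(A;B) = {mi_ab:.4f} bits")
```

The definition of `copy_bits` never changes when A changes, which is the sense in which F seems to carry no information about A, even though B ends up sharing all of A's information (the estimated I(A;B) matches H(A)).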