@EricMH while you are thinking about it, may I ask for a clarification? Please forgive my clumsy notation below. My background in statistical theory lets me follow the IT literature well enough, but I’m not used to expressing myself in these terms.
For the law of Information non-growth in the setting of statistical theory we have
I(X; Y) ≥ I(X; T(Y))
which is familiar to me as an application of Jensen's Inequality. In the Algorithmic Information setting the same inequality holds, and it shows that the average message length is minimised when messages are coded for the true probability distribution rather than for any function T() of it. I agree with you this far, but here begins my question:
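To make sure I am reading the inequality the way you intend, here is a small numerical sketch of the statistical version (the data processing inequality). The joint distribution and the merging map T are my own toy examples, not anything from your argument; the point is only that for any deterministic T, I(X; T(Y)) cannot exceed I(X; Y).

```python
from math import log2

# A hypothetical joint distribution p(x, y) over X in {0, 1}, Y in {0, 1, 2}.
joint = {
    (0, 0): 0.3, (0, 1): 0.1, (0, 2): 0.1,
    (1, 0): 0.05, (1, 1): 0.25, (1, 2): 0.2,
}

def mutual_info(j):
    """I(X;Y) = sum over (x,y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) )."""
    px, py = {}, {}
    for (x, y), p in j.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in j.items() if p > 0)

def apply_T(j, T):
    """Push Y through a deterministic map T, giving the joint of (X, T(Y))."""
    out = {}
    for (x, y), p in j.items():
        out[(x, T(y))] = out.get((x, T(y)), 0.0) + p
    return out

T = lambda y: min(y, 1)  # merges symbols 1 and 2: information can only be lost
assert mutual_info(apply_T(joint, T)) <= mutual_info(joint) + 1e-12
```

Any other choice of T here gives the same direction of inequality, which matches how I understand the law of information non-growth.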
If I(X; Y) represents information for some biological function coded by the true probability distribution p, then are you asserting that I(X; Y) already represents the maximum state of biological fitness that can exist?
In this scenario any T() other than the identity will indeed yield lesser biological function, because no better state is possible. But this seems intuitively backwards from what is supposed of biological evolution: for some probability distribution q not equal to the true distribution p, there exists some function U() such that
I_q(X; Y) ≤ I_p(X; U(Y))
thus allowing an increase in fitness. That is, U() reduces the Kullback-Leibler distance between q and p. I would weaken this statement to say that U() might exist, because it is not clear we could find such a function universally.
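In case it helps to see what I mean by "U() reduces the Kullback-Leibler distance": here is a toy sketch where U is simply a relabelling of outcomes. The distributions p and q and the particular permutation are my own illustrative choices; the only point is that a suitable U can move q closer to p in KL terms.

```python
from math import log2

def kl(p, q):
    """Kullback-Leibler distance D(p || q) in bits; assumes matching support."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]  # hypothetical "true" distribution
q = [0.1, 0.2, 0.7]  # a mismatched model of p

# A hypothetical U(): relabel q's outcomes by swapping symbols 0 and 2.
U = [2, 1, 0]
q_after_U = [q[U[i]] for i in range(len(q))]

assert kl(p, q_after_U) < kl(p, q)  # U has reduced the KL distance to p
```

Of course, for an arbitrary q such a helpful relabelling need not exist, which is exactly why I weaken the claim to "U() might exist."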