Durston: Functional Information

@kirk, is it possible that you are mistaking this:

KL(p || q) = \sum_{x \in X} -p(x) \log {q(x)} \, - \, \sum_{x \in X} -p(x) \log {p(x)}

for this?

\Delta H =\sum_{x \in X} -q(x) \log {q(x)} \, - \, \sum_{x \in X} -p(x) \log {p(x)}

Notice the switch from p to q in the first term. KL looks a lot like a delta H, but it most definitely is not. The first summation in KL is neither H(p) nor H(q), because it involves both p and q (it is the cross-entropy of p and q). The second summation, however, is H(p).


Just to catch everyone up: if q is MaxEnt (the base state that @kirk is using), then KL = delta H. If q is not MaxEnt, the two generally differ. When q is MaxEnt, then for all p:

H(q) =\sum_{x \in X} -q(x) \log {q(x)} = \sum_{x \in X} -p(x) \log {q(x)} = \log {N}

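Here, N is the number of possible states. The equality holds because a MaxEnt q assigns q(x) = 1/N to every state, so \log q(x) is a constant and the weights p(x) just sum to 1. In this case it doesn’t matter that p is not q. Usually, however, it matters a great deal: if q is NOT MaxEnt, this is no longer true.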
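
For anyone who wants to check this numerically, here is a minimal Python sketch. The distributions `p`, `q_uniform`, and `q_skewed` are made-up illustrative values, not anything from @kirk's data, and the helper names are my own. It assumes natural logs and that q(x) > 0 wherever p(x) > 0.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats; zero-probability states contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """First summation in KL: -sum p(x) log q(x). Involves both p and q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl(p, q):
    """KL(p || q) = cross_entropy(p, q) - H(p)."""
    return cross_entropy(p, q) - entropy(p)

def delta_h(p, q):
    """Delta H = H(q) - H(p)."""
    return entropy(q) - entropy(p)

# Hypothetical distributions over N = 4 states (illustrative values only).
p = [0.7, 0.1, 0.1, 0.1]
q_uniform = [0.25, 0.25, 0.25, 0.25]  # MaxEnt base state
q_skewed = [0.4, 0.3, 0.2, 0.1]       # not MaxEnt

print("MaxEnt q:     KL =", kl(p, q_uniform), " delta H =", delta_h(p, q_uniform))
print("non-MaxEnt q: KL =", kl(p, q_skewed), " delta H =", delta_h(p, q_skewed))
```

With the uniform q the two printed numbers should agree up to floating-point error, and the cross term should equal log N. With the skewed q they come apart, which is the whole point: KL and delta H coincide only because of the special choice of base state.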
