I found it helpful to understand the connection to physics. I never get to see that side of the math, so the physical interpretation is not intuitive for me.
I tend to have a more practical view in my work in molecular biology. I usually use the "entropy is the ability to do work" definition and it gets me by, whether it is cooling agar plates or driving reactions with higher-energy phosphates (e.g. ATP).
I imagine most of us have heard the following anecdote. But, still, how can anyone claim to understand entropy better than von Neumann? I mean, the guy has a type of entropy named after him!
FWIW, I understand the article is targeted at people who know little about entropy in its various guises, and with that proviso it does a reasonable job of explaining the various concepts and how they are linked.
Plus it takes a stab at differentiating surprise (AKA surprisal), missing information, and entropy. Though not all might agree with how the author does that.
Take the game of craps. I am no more surprised when a seven is rolled than I am when snake-eyes is rolled, in spite of the difference in probability between the two outcomes. Now what would surprise me is if snake-eyes repeatedly came up six times as often as sevens.
@mung I would like to invite you to New Jersey to a private craps party I am hosting in your honor. You can bet on snake eyes and I will bet on sevens all day long. How much money do you have to test your understanding of probability? Note that the probability of snake eyes is 1/36 and the probability of a seven is 6/36, so I should clean you out before lunch. But since I am a good sport, I will buy you lunch, as you won't have any money left.
We each bet a dollar. You give me thirty-seven dollars if snake-eyes comes up. I give you six dollars if seven comes up. I'll be surprised if I run out of money before you do.
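To spell out why (a quick Python sketch, under my reading of that payout scheme, where any roll that is neither snake-eyes nor a seven is a push):

```python
from fractions import Fraction

# Probabilities for two fair dice, as stated above.
p_snake_eyes = Fraction(1, 36)
p_seven = Fraction(6, 36)

# Payouts as I read the proposal: I collect $37 when snake-eyes lands,
# pay out $6 when a seven lands, and every other roll is a push.
ev_per_roll = 37 * p_snake_eyes - 6 * p_seven
print(ev_per_roll)         # 1/36 of a dollar per roll in my favor
print(float(ev_per_roll))  # ~0.028
```

A small positive expectation per roll for the snake-eyes side, hence my lack of concern.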
This mixes in an expected value and a (-log) probability concept. Was that intentional? I know, I probably shouldn't ask. And I guess Patrick sort of started it.
I edited my first post responding to yours because I originally referred to something similar to Patrick's point: the Dutch book argument, which says you should construct your subjective probabilities so that they satisfy the probability axioms. Under that constraint, I think it is a consequence that in a series of hands of a game of chance, you should align your probabilities (and bets) with the overall probabilities of each hand.
But if there is just one trial, then that argument would not apply. That's where the principal principle comes in.
Still, if we use "surprise" to refer to a personal feeling, your personal feeling can certainly violate the -log p relationship. I think you are saying that your personal feeling of surprise is a step function, not something continuous like -log p.
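For concreteness, here is what -log p assigns to the two craps outcomes mentioned earlier (a quick sketch, assuming base-2 logs so the units are bits):

```python
import math

def surprisal_bits(p):
    """Surprisal (self-information) -log2(p) of an outcome with probability p, in bits."""
    return -math.log2(p)

# Two fair dice: snake-eyes has probability 1/36, a seven has probability 6/36.
print(surprisal_bits(1 / 36))  # ~5.17 bits
print(surprisal_bits(6 / 36))  # ~2.58 bits
# -log p changes smoothly as p changes; a personal feeling of surprise need not.
```

So the formula grades the two outcomes quite differently, whether or not anyone's feelings do.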
But then using "surprise" in that way for Shannon's work is the same as using "information" in its colloquial sense in the context of that work. That's why surprisal is a better term. Like "entropy", it's not so easy to anthropomorphize.
Too bad von Neumann didn't think of that word, too. You might have thought he would have. After all, he was a calculating machine!
I think my point is that "surprisal" seems like something subjective. And of course "average surprisal" seems like something utterly mysterious. In his dissatisfaction with "disorder" he's replaced it with something equally unsatisfactory (imo).
It [disorder] has a level of subjectivity that the other physical quantities don't.
Given some probability of the occurrence of an event, should you and I both be surprised by the same amount if that event takes place?
Note: There are of course, as I am sure you know, those who interpret Shannon entropy as an expected value. Whether they do that with the individual probabilities of the individual events I'd have to go back and look.
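For what it's worth, the expected-value reading is easy to check numerically: the Shannon entropy of a distribution is just the probability-weighted average of the individual surprisals. A sketch using the two-dice totals (base-2 logs again; the distribution is the standard one, not something from the article):

```python
import math

# Distribution of the total of two fair dice (totals 2 through 12).
counts = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}
probs = [n / 36 for n in counts.values()]

# Shannon entropy H = -sum p_i log2(p_i), i.e. the expected surprisal E[-log2(p)].
H = -sum(p * math.log2(p) for p in probs)
print(H)  # ~3.27 bits

# For comparison, the surprisal of two individual outcomes:
print(-math.log2(1 / 36))  # snake-eyes: ~5.17 bits
print(-math.log2(6 / 36))  # a seven:    ~2.58 bits
```

So H is a property of the whole distribution, while -log p is attached to a single outcome; the two only coincide when every outcome is equally probable.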
My point to Patrick was that he was proposing that we wager without saying what we would bet or be paid. Is he saying that we should wager based on my lack of surprise at seeing snake-eyes, or based on my surprise if snake-eyes shows up more often than a seven? If the latter, then I would truly be surprised and would gladly pay him to see it, but I would request that the game take place in a Las Vegas casino with casino dice.
I couldn't help but notice the formula -log(p). It looked suspiciously familiar. It was missing an H = on the left-hand side of the equation. Is that because he and @swamidass are talking about two different things and just happened to be using the same notation?
In his Figure 3, can we replace the S on the left with an H? It is, after all, Shannon entropy, and Shannon used an H. Am I mistaken?
So we have H = -p1 log(p1) - p2 log(p2) - … - pn log(pn) = -log p
I agree that the -log p is motivated by referring to subjective/personal surprise. I think it is a fair approach for the intended audience. But it can be misleading in a formal setting. That's why I'm happy that the term "surprisal" comes across as less subjective.
Yes, Shannon used H in his 1948 paper, and it's the same formula as the average surprise equation in the paper. But the -log p is the surprise, which is also called the "information gain".
The article is imprecise in how it uses "information gain". Figure 1 says information gain is -log p (i.e. surprisal). Then in the section "Average Surprise", the "Average Information Gain" (emphasis added) is defined as the usual Shannon entropy sum. But in the sentence following Figure 2, the paper says "what we are talking about is how much information we have before and after the experiment." Does that "information" refer to a single experiment, as in the initial definition of information gain as surprisal? Or does it refer to the average information gain, as in the immediately preceding figure and the definition of Shannon entropy? Or both?
Some people may be disturbed by that seeming imprecision. In fact, given how emotional some people get when they perceive a mistake on the internet, the existence of a thread devoted to arguing about that imprecision, perhaps on this very forum, would not be a surprise.
There are two things in error in the Entropy article:
1) Evolution can be in any direction… and still be evolution. Evolution does NOT require an increase in complexity. In fact, except for the obvious one-cell vs. multi-cell kind of distinctions, there isn't even a way to reliably MEASURE complexity.
And 2), when the sun shines energy down on a planet for 5 billion years (for our purposes, an eternal SOURCE of energy)… entropy is not a relevant objection to evolution.