Information Rate and Entropy Rate

On another thread, @nwrickert raises some interesting distinctions:

He is not defining information. He is defining a quantity which measures how much information. And that’s what I have always understood to be Shannon’s entropy.

how much information = entropy – Yes to that.
information = entropy – No to that.

And also:

Entropy can’t be rate. The rate would be in something like “bits per second”. But Shannon entropy is supposed to be a dimensionless ratio, so the “per second” part doesn’t fit.

@nwrickert, if you’d like, we can hash this out here. Entropy and information have a scale determined by the base of the logarithm used. So they do have dimension. With log base 2, they are in units of bits. With the natural log, they are often called “dimensionless”, but the fact is that information in bits can be trivially rescaled to be “dimensionless” precisely the same way as entropy.
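That rescaling claim is easy to check numerically. Here is a minimal sketch (plain Python; the fair-coin distribution is my own illustration, not from the thread):

```python
import math

# Entropy H = -sum(p * log p). The base of the logarithm fixes the
# unit: log base 2 gives bits, the natural log gives nats. The two
# differ only by the constant factor ln(2), so converting bits to
# "dimensionless" nats is a trivial rescaling.
p = [0.5, 0.5]  # a fair coin, for illustration

h_bits = -sum(q * math.log2(q) for q in p)  # entropy in bits
h_nats = -sum(q * math.log(q) for q in p)   # entropy in nats

# The same quantity, scaled by ln(2).
assert abs(h_nats - h_bits * math.log(2)) < 1e-12
```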

The equations for measuring entropy and for measuring information are identical. So we can talk about the rate of entropy across a channel in the same way as we talk about the rate of information across a channel. They are exactly the same thing, with a different semantic gloss.

In context, Shannon is just defining the total information transmitted in a period of time, which he divides by the number of seconds. The exact same thing can be done with entropy to get a rate. Information content = entropy amount.
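To make the dimensional point concrete, a small sketch (the figures are hypothetical, chosen only for illustration):

```python
# A source emits symbols carrying 1.5 bits of entropy each, at
# 1000 symbols per second (both figures are made up).
bits_per_symbol = 1.5
symbols_per_second = 1000

# Whether the per-symbol quantity is called "information" or
# "entropy", dividing the total by elapsed time yields the same
# rate, in bits per second.
rate_bits_per_second = bits_per_symbol * symbols_per_second
```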

What am I missing about what you are saying?


That’s not dimension in the sense in which physics talks of dimension.

My disagreement, though, is already there with the idea that information has a scale. The measurement of information has a scale. But the measurement of information is not itself the information.

I don’t know what it means to say that information has a scale. Information about weight uses a different scale from information about distance. Information has many scales, depending on what the information is about.

Agreed. But the measure of information is not the same as the information itself.

To use the example that EricMH is probably considering, the genome is information (as a string of base pairs). The length of the genome (the number of base pairs) is one possible measurement of information. But the genome is not the same thing as the length of the genome.

Yes, once you switch to talk of information rate, then you have switched to talking of a measurement of the information rather than of the information itself.

In another thread, you mentioned information, compression, entropy. We can use that in an example. In a computer we store information in a file. We can then compress the file. The compressed file contains the same information (assuming lossless compression). The information is what is contained in that compressed file. The entropy is, approximately, the length of that compressed file. So information is still not entropy.
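That compressed-length approximation is easy to demonstrate. A sketch using Python's standard zlib (the repetitive message is my own example):

```python
import zlib

# A highly repetitive message has low entropy, so lossless
# compression shrinks it dramatically; the compressed length is a
# rough stand-in for the entropy of the original.
message = b"abc" * 1000          # 3000 bytes of repetitive data
compressed = zlib.compress(message, 9)

# Same information in both (decompression restores it exactly),
# but the compressed length is far smaller than the raw length.
assert zlib.decompress(compressed) == message
assert len(compressed) < len(message) // 10
```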

But sure, if we talk about rates, then information rate and entropy rate are going to be the same. But information and entropy are not the same.


@nwrickert of course a measure of information in the genome is not the genome itself. However, the measure of information in a genome is precisely the same as the measure of entropy in the genome. This is because information is entropy. They are the same in this context.

Of course information has many other meanings, especially in non-technical speech, but this is the way it is used in information theory. So, in the context of information theory, information is entropy. Note, I’ve qualified this by noting that there are several differences in convention, such as using different bases for the logarithm.

Outside that context, information has additional meanings. So, when discussing “the information” in a message, in common parlance, we are not discussing the entropy. I agree. Is that all you are saying?


Yes. But that was part of the miscommunication with @EricMH.

Okay, I can agree with this. However, by that definition, we are not talking about the information in information theory.

Yes we are, assuming you are referring to Shannon’s theory – except that I don’t think Shannon ever defined “information” as a technical term. I’ll note that he called it a theory of communication. Shannon is concerned with communicating messages. So the information is the message, while the entropy is the length of the message (at least in the simple case).


Not true. There is information in the message, and the amount of information in the message can be calculated precisely. The information in the message is exactly the same as the entropy of the message. Shannon was concerned with how accurately the message can be moved (communicated) from the sender to the receiver. He showed that the information content of the message can be moved perfectly (without loss of information) if certain conditions are met. If the receiver got the information perfectly, the mutual information between sender and receiver would be MI = 1 bit. Note that bits are the unit of both information and entropy. We can convert bits into a number of states W. For example, if a system can be in one of two states (off or on), then W = 2 and log2(2) = 1 bit.
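The states-to-bits conversion at the end can be sketched in a few lines (Python; the helper name is my own, for illustration):

```python
import math

def bits_from_states(w):
    """Bits needed to identify one of w equally likely states: log2(w)."""
    return math.log2(w)

# A two-state system (off/on): W = 2 gives exactly 1 bit.
assert bits_from_states(2) == 1.0
# Eight equally likely states take 3 bits, and so on.
assert bits_from_states(8) == 3.0
```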

Have you ever seen Shannon’s box? It was built in the 1940s by Shannon. There is exactly one bit of information as the input (a person throwing the switch), and the system can be in one of two states (on or off).
When the system is turned on, it is the job of the system to shut itself off. There is exactly 1 bit of information in this system and 1 bit of entropy, as it has only two states: on or off.

This is a variation on Shannon’s box with the same 1 bit of entropy.


He did define “information” as a technical term. He was careful to clarify that his use of the word “information” was not how it is understood in normal speech. He was concerned that it would be misunderstood.

Going back and reading his 1948 paper again is really remarkable. It is a truly seminal paper. It was written 70 years ago, and it contains so many of the key components of information theory. He notes the equivalence of information and entropy. He initially makes an IID assumption, but then goes back and shows how to relax it, modeling language with a Markov chain. He recognizes the connection to compression. So much is embedded in this paper that it is no wonder it essentially spawned a new field of study.
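For the Markov-chain point, here is a sketch of the entropy rate of a stationary two-state chain, the simplest instance of a non-IID source (the transition probabilities are invented for illustration, not taken from the paper):

```python
import math

# P[i][j] = Pr(next state = j | current state = i); values invented.
P = [[0.9, 0.1],
     [0.4, 0.6]]

# Stationary distribution pi solves pi = pi * P; for a two-state
# chain it has a simple closed form.
pi0 = P[1][0] / (P[0][1] + P[1][0])
pi = [pi0, 1.0 - pi0]

def row_entropy(row):
    """Shannon entropy (bits) of one row of transition probabilities."""
    return -sum(p * math.log2(p) for p in row if p > 0)

# Entropy rate H = sum_i pi_i * H(P[i]), in bits per symbol; always
# less than 1 bit here because transitions are far from uniform.
H = sum(pi[i] * row_entropy(P[i]) for i in range(2))
```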

Are these really the same?

Further, the linking of information with entropy was profound. Entropy was a term from statistical mechanics. Shannon linked it forever with information. We can now calculate both the information content and the entropy of the universe, black holes, communication networks, and computers.


The amount of information in the message is exactly the same as the entropy of the message.

If that’s what you meant, then I agree. But I don’t agree with the original (quoted) wording.

For all of this arguing, we still have the same thing.


Those are three messages. Those three messages all have different information. They all have the same entropy. So information is not entropy.
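The point can be demonstrated concretely. A sketch with three invented messages that share the same symbol statistics:

```python
from collections import Counter
import math

def empirical_entropy(msg):
    """Shannon entropy (bits per symbol) of the symbol frequencies in msg."""
    n = len(msg)
    return -sum((c / n) * math.log2(c / n) for c in Counter(msg).values())

# Three different messages with identical symbol statistics (my own
# toy examples):
msgs = ["abc", "bca", "cab"]
entropies = [empirical_entropy(m) for m in msgs]

# Different messages (different information), same entropy.
assert len(set(msgs)) == 3
assert all(abs(h - entropies[0]) < 1e-9 for h in entropies)
```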

This divorces my claim from its context. I said that the equation to compute information is the same as the equation to compute entropy. That claim was disputed by people here. I said that information = entropy, and explained that by this I meant the two equations were identical. The fact that the two equations were identical was disputed.

I see your point @nwrickert, but it seems like it misses the point of the exchange in the first place.

But that’s not right. It needs to be “the equation to compute the amount of information is the same …”

Any equation at all can compute information. The result of any useful computation is information. The reason we do the computation is to compute the information (as distinct from computing the amount of information). That’s why you need to use “amount of information” for what you were trying to say.

Sometimes “amount of” is implicit from context. But that isn’t always the case, and that’s why there was miscommunication in the earlier thread.


If you look at the alternatives there, we were only looking at “amounts of”. That is not what caused the miscommunication.

At issue here also is not just the precise wording of a sentence, but the fact that there was an inability to understand the actual paper. Remember, the “misunderstanding” arose from a person saying I was in error. So the right way to put it is that he misunderstood.

Which I agree with also. So what exactly is your point?

I agree. What the entropy (SMI) gives is a value, a quantity, an amount of information. It’s a measure.

Shannon thought about calling it “information” but decided on “uncertainty.” Later he changed it to entropy.

I’ll post later the story of how it came to be called entropy. There was nothing profound about it. :slight_smile:

It was rather simple to see that your two equations were identical. But when asked for a source for your equation for information you pointed us to the equation for entropy!

I still have not found a single source that defines an equation for information that looks like the one you posted. The one source that did give a formal definition for information gave an equation that was not the same as yours.

On pages 10-11 of the 1948 paper, and in Appendix 2, he is deriving his measure of information, not of entropy. It is the same equation, and he recognizes this from the start.