Examples of Shannon information "codes"?

I don’t know what the definition of a Shannon information code is, but have you ever considered that Fraunhofer lines are effectively barcodes in light? These lines can be used to infer the elemental composition of shining objects from across the width of the observable universe. I’ve been told by physicists that the existence of Fraunhofer lines in the electromagnetic spectrum is predicted by quantum mechanics, and that their locations in the spectrum can be calculated from first principles.

This would then seem to be a sort of “code” that originates in the properties of atoms, or perhaps more correctly in the quantum mechanical laws that describe them.


Shannon information is just a property of random variables: you need a sample space (often a set of strings) and a probability distribution over it. Modeling something with a causal relation does not by itself require probabilities, so I think the example of inferring composition from Fraunhofer lines is not, on its own, a candidate for Shannon-information analysis.

However, we can often introduce probabilities when there is random noise between two variables that we take to have an underlying causal relation. For example, in neuroscience, we can measure how well neural spike trains encode external stimuli by estimating the mutual (Shannon) information between the two variables, with their probability distributions estimated by repeatedly sampling the responses to the same stimulus. This approach separates the correlation of response with stimulus (correlation in the MI sense) from the noise inherent in stimulus and response taken separately.
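As a sketch of that idea, here is a minimal plug-in estimator of mutual information from paired samples. The discretized stimulus/response values and the toy data are made up for illustration; real spike-train analyses need much more careful binning and bias correction:

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Plug-in estimate of I(S;R) in bits from (stimulus, response) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    s_counts = Counter(s for s, _ in pairs)
    r_counts = Counter(r for _, r in pairs)
    mi = 0.0
    for (s, r), c in joint.items():
        # p(s,r) * log2( p(s,r) / (p(s) p(r)) )
        mi += (c / n) * log2((c / n) / ((s_counts[s] / n) * (r_counts[r] / n)))
    return mi

# Toy data: a noisy "channel" where the response usually tracks the stimulus.
samples = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 1)] * 40 + [(1, 0)] * 10
print(mutual_information(samples))  # ≈ 0.278 bits
```

If the response were independent of the stimulus, the estimate would be near zero; if it copied the stimulus exactly, it would be a full bit.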


The OP uses the term “Shannon information codes”. There is a specific technical meaning for the term “Shannon code”: an encoding of the input messages that approaches the maximum possible compression by taking the probability of each message into account, assigning roughly -log2 p(i) bits to message i.

Morse code is an attempt at that, but it only looks at messages of length one (i.e., single English letters). Huffman coding is the modern formalization of that idea of optimal symbol-by-symbol compression.
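A minimal sketch of the Huffman construction, to make the idea concrete: repeatedly merge the two least frequent subtrees, so that frequent symbols end up near the root with short codewords. The input string is just an example:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code: frequent symbols get short codewords."""
    freq = Counter(text)
    # Heap entries: (frequency, tiebreak id, {symbol: codeword-so-far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, extending every codeword by one bit.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

code = huffman_code("abracadabra")
# 'a' occurs 5 times out of 11, so it gets the shortest codeword.
print(code)
```

Because the code is built from a tree, no codeword is a prefix of another, so the compressed bit stream can be decoded unambiguously.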

Fully optimal codes require looking at the entire message. In practice, this is approximated by looking at probabilities of pairs of letters, then triplets, then whole words, and so on. Shannon considers such Markov modeling in his 1948 paper.
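A rough illustration of why looking at pairs helps, using a toy sentence and plug-in estimates (the text and the order-1 identity H(X2|X1) = H(X1,X2) - H(X1) are just a sketch): conditioning on the previous letter lowers the per-letter entropy, so a code built on pair statistics can compress further.

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

text = "the cat sat on the mat and the rat ate the hat"

# Order 0: entropy of single letters, each treated independently.
h0 = entropy(Counter(text))

# Order 1: entropy of the next letter given the previous one,
# via H(X2|X1) = H(X1,X2) - H(X1), estimated from bigram counts.
h1 = entropy(Counter(zip(text, text[1:]))) - entropy(Counter(text[:-1]))

print(h0, h1)  # the conditional (order-1) entropy is noticeably lower
assert h1 < h0
```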

Modern compression algorithms like LZW build tables of the most frequent subsequences in a given input string and encode them as shorter sequences; this effectively gives the most frequent substrings the shortest encodings, which is the basic idea in Shannon coding. LZ-type coding is one of the compression techniques used in Zip compression.
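A toy version of the LZW idea (ASCII-only, emitting integer codes rather than packed bits): the dictionary grows as substrings are seen, so repeated material gets replaced by single codes.

```python
def lzw_compress(data):
    """LZW sketch: grow a table of seen substrings, emit one code per longest match."""
    table = {chr(i): i for i in range(256)}  # start with all single bytes
    current, out = "", []
    for ch in data:
        if current + ch in table:
            current += ch          # extend the current match
        else:
            out.append(table[current])          # emit code for longest match
            table[current + ch] = len(table)    # learn a new substring
            current = ch
    if current:
        out.append(table[current])
    return out

codes = lzw_compress("abababababab")
print(codes)
# Repeats get folded into dictionary entries, so fewer codes than characters.
assert len(codes) < len("abababababab")
```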

(As an aside, as I read the posts by the OP, they still assume the everyday usage of “information” as referring to human meaning and incremental knowledge. The OP does not address Shannon’s technical usage of ‘information’, which is a property of the probability distribution of a random variable. Specifically, Shannon’s technical definitions are built on -log2 p(i) and the expected value of this quantity, where p(i) is the probability of message i. This assumes countably many messages, so that a discrete distribution suffices.)
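To make those definitions concrete, a small worked example (the four-message distribution is made up): -log2 p(i) is the "surprisal" of each individual message, and the entropy is its expected value.

```python
from math import log2

# A made-up discrete distribution over four messages.
p = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

# Self-information of each message: -log2 p(i), in bits.
surprisal = {msg: -log2(prob) for msg, prob in p.items()}
print(surprisal)  # A carries 1 bit; B carries 2; the rare C and D carry 3 each

# Entropy is the expected self-information: the average bits per message
# that an optimal code for this distribution would need.
entropy = sum(prob * surprisal[msg] for msg, prob in p.items())
print(entropy)  # 1.75 bits
```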

Just for fun, on DNA and Shannon/Huffman codes. (You can find lots of articles with similar themes)