Tom Schneider attempted to tie evolution and Shannon information together in a computer simulation (his "ev" program).
As @nwrickert stated earlier, I think information in this context is an abstract concept that we humans have invented to understand the world. Even in Schneider’s work we could easily argue that there are situations where weaker binding between a transcription factor and a DNA binding region would actually be beneficial. We could apply the same concept to SARS-CoV-2 immune evasion, where a reduction in binding between specific antibodies and the S protein increases viral fitness. On the flip side, increases in binding between viral and host cell surface proteins could be beneficial.
What I don’t agree with is that there is a meaningful connection between the types of information modeled in these papers and actual thermodynamic entropy. I do think physical information could be real (atomic spin, velocity, mass), but I don’t think this abstract type of information is concrete, nor is it subject to the 2LoT.
Somewhere in my reading on IT I found mention of something that was roughly the equivalent of the 2LoT for Information Theory. It applies (IIRC) to a change of information relative to some starting point, not to an absolute increase or decrease as for physical entropy. If I can find it again I may have to get it tattooed on my arm for easy reference.
As I argued here, to the extent that life is some sum of macromolecular associations and interactions, it is entirely correct to assert that life exists, not in spite of the SLoT, but because of it. (Read the whole thread, including the part before my first comment.)
As measured here, the greater the SIE, the more information can be conveyed per base. With the small differences observed here, it would only take 12 additional nucleotides to get the total information back to the same level. And that’s well within the variation in genome length seen in SARS-CoV-2.
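For concreteness, here's the arithmetic as a minimal sketch. The entropy values are hypothetical placeholders chosen only to illustrate the size of the effect; total information is taken as per-base entropy times genome length:

```python
from math import log2

def shannon_entropy_per_base(seq):
    """Shannon information entropy in bits per base, from nucleotide frequencies."""
    n = len(seq)
    return -sum(seq.count(b) / n * log2(seq.count(b) / n) for b in set(seq))

# Hypothetical per-base entropies before and after mutation (illustrative only)
H_before, H_after = 1.9980, 1.9972
L = 29903  # SARS-CoV-2 reference genome length

# Extra bases needed so total information (H * length) returns to its old level
extra = H_before * L / H_after - L
print(f"{extra:.0f} additional nucleotides")  # ~12 for a drop of this size
```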
I didn’t scale by k_B because it obviously won’t change the trend, and when I did try it, I got numbers on a very different scale from what the figure showed. However, the table shows the information entropy without the k_B scaling, and my numbers were exactly the same as the table’s.
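To be explicit about why the trend is unaffected: converting bits to thermodynamic units just multiplies by a positive constant. One common convention is the Landauer factor k_B ln 2 (whether the paper uses that exact factor is an assumption on my part):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

H_bits = 1.9975  # hypothetical per-base Shannon entropy, in bits
# Multiplying by k_B * ln(2) converts bits to J/K; it's a positive constant,
# so values shrink to ~1e-23 scale but the ordering (the trend) is unchanged.
H_thermo = H_bits * k_B * math.log(2)
print(H_thermo)  # ~1.91e-23 J/K
```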
Once I had the data, I could also see that while the mutations increased over time, it was not a fully nested set of mutations. I got the impression the “careful selection” was intended to achieve that kind of nesting, but apparently not.
For the mutational bias hypothesis (which I also wondered about), here is the distribution of changes with respect to the reference sequence.
Then I went to NCBI for more sequences. I used their filtering to restrict to complete genomes with no ambiguous characters. That still left a lot more data than I wanted to download, so I kept the 29,903 nt length restriction and took only human samples from the oronasopharynx (no reason other than it was the only option that reduced the numbers further but left a useful sample). Could those choices introduce biases? Absolutely. I wouldn’t try to publish these results; I just want to quickly satisfy my curiosity.
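For anyone who wants to replicate the filtering locally, here's a minimal sketch of the post-download step (the file name is a placeholder; this assumes Biopython and a FASTA of candidate genomes):

```python
from Bio import SeqIO  # Biopython

VALID = set("ACGT")

def keep(record):
    seq = str(record.seq).upper()
    # Exact reference length and no ambiguous (non-ACGT) characters
    return len(seq) == 29903 and set(seq) <= VALID

# "sars2_candidates.fasta" stands in for whatever NCBI download you have
filtered = [r for r in SeqIO.parse("sars2_candidates.fasta", "fasta") if keep(r)]
print(f"{len(filtered)} sequences pass the filter")
```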
The overall trends are the same. I’d guess that the number of sequences thins out over time because some of the adaptive mutations are deletions, so the exact length of 29,903 becomes less common. The number of mutations in each sequence is counted with respect to the same reference sequence used in the paper. I could be underestimating by overlooking sequential mutations at the same location, which might be discernible with a more sophisticated analysis than just computing the number of string mismatches.
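The string-mismatch count I'm describing is just a Hamming distance, with the caveats above baked in (a sketch):

```python
def mismatches(seq, ref):
    """Count positional mismatches (Hamming distance) between equal-length sequences.
    Caveats: two sequential substitutions at the same site count as one change,
    and indels can't be seen at all, since both sequences must be the same length."""
    assert len(seq) == len(ref)
    return sum(a != b for a, b in zip(seq, ref))
```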
Since we’re not dealing with a strictly progressive increase in mutations, a more relevant trend might be the direct relationship between number of mutations and information entropy, rather than comparing them indirectly by looking at their trends over time.
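That direct comparison is a one-liner with SciPy; the arrays below are illustrative placeholders for the per-sequence values computed above, not real analysis output:

```python
from scipy.stats import spearmanr

# Placeholder data: mismatch counts vs. the reference, and per-base SIE,
# one entry per sequence (illustrative values only)
mut_counts = [2, 5, 9, 14, 21, 30, 42]
entropies  = [1.9981, 1.9979, 1.9978, 1.9976, 1.9975, 1.9973, 1.9972]

rho, p = spearmanr(mut_counts, entropies)
print(f"Spearman rho = {rho:.3f}, p = {p:.3g}")
```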
Oh, no, I don’t think it’s an ID paper. Apparently it is pretty bad though. It’s too bad we don’t have the authors here to explain what their methodology was and why they drew the conclusions that they did.
Fully agree. Without entropy life couldn’t exist. It is the movement of energy from low entropy to high entropy that drives everything life does. Without a direction for energy to move along we couldn’t perform even the simplest metabolic tasks. For that matter, heat may not even reach us from the Sun.
Thanks for your nice work. So it seems that your analyses confirm the authors’ conclusion, i.e., that the SIE of the SARS2 RNA genome decreases with time, don’t they?
My wild guess is that they are probably engineers, and not biologists. They have perhaps been paying attention to news reports of mutations with COVID. But the news reports are really of those mutations which reached near fixation, so that could be what they were looking at.
I’ve been doing some digging into the authors, and found this interesting article:
Currently, we produce ∼10^21 digital bits of information annually on Earth. Assuming a 20% annual growth rate, we estimate that after ∼350 years from now, the number of bits produced will exceed the number of all atoms on Earth, ∼10^50. After ∼300 years, the power required to sustain this digital production will exceed 18.5 × 10^15 W, i.e., the total planetary power consumption today, and after ∼500 years from now, the digital content will account for more than half Earth’s mass…
…In conclusion, we established that the incredible growth of digital information production would reach a singularity point when there are more digital bits created than atoms on the planet.
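The growth arithmetic roughly checks out. A quick sanity check (assuming the paper's ~350 years refers to cumulative rather than annual production, which moves the crossover about a decade earlier than the annual figure below):

```python
from math import log10, ceil

bits_per_year = 1e21   # current annual digital bit production (from the quote)
growth = 1.20          # 20% annual growth
atoms_on_earth = 1e50  # (from the quote)

# Years until a single year's production exceeds the number of atoms on Earth
years = (log10(atoms_on_earth) - log10(bits_per_year)) / log10(growth)
print(ceil(years))  # ~367; cumulative production crosses ~10 years sooner, near ~350
```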
I always save the contents of my bit bucket for recycling (CoSci joke).
My first CoSci instructor told many tales, including one about keeping an actual bucket in the computer cabinet so the senior operators could tell the new guys to “empty the bit bucket.”