Halting Oracles and the Law of Information Non-Growth

Continuing the discussion from Contradictory Points in ID and Information Theory Arguments?:

[Optimal] Compression is Uncomputable

A halting oracle comes into play when you have a process and you cannot tell whether it will ever terminate.

Take an arbitrary configuration of atoms in space, with perfectly known positions and velocities.

Our task is to compute the optimal compression of these atoms at a particular snapshot. We do not want near optimal; we want the precisely optimal compression. We do not want lossy compression; we want to be able to reconstruct the whole system exactly, down to the bit. We do not want just salient features of the atoms (e.g. their temperature); we want to be able to save and restore the system without a single wasted bit of storage. We do not care how long the compression/decompression algorithm takes to run, but it cannot run forever.

We do this by looking for patterns. Every pattern we find reduces the compressed size. We can also run a simulation forward (or backward) and see if the new snapshot is easier to compress; if so, we compress that snapshot along with its time offset from where we are, to get a better compression.
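The time-offset idea can be sketched concretely. This is just a toy illustration, not part of the original argument: zlib stands in for a real compressor, the physics is trivial free motion, and the quantization precision is an arbitrary choice. The point is only that a later snapshot, stored together with its time offset, can be a smaller representation of the original state.

```python
import zlib

def snapshot_bytes(positions, precision=3):
    """Serialize quantized positions into bytes suitable for compression."""
    return ",".join(f"{x:.{precision}f}" for x in positions).encode()

def best_timeshifted_compression(positions, velocities, steps=100, dt=0.1):
    """Evolve the system forward and keep the snapshot that compresses best.

    Reconstructing the original state means storing the winning compressed
    snapshot plus its time offset, then running the simulation backward.
    Returns (best_step, best_compressed_size).
    """
    best_step = 0
    best_size = len(zlib.compress(snapshot_bytes(positions)))
    xs = list(positions)
    for step in range(1, steps + 1):
        xs = [x + v * dt for x, v in zip(xs, velocities)]  # free motion
        size = len(zlib.compress(snapshot_bytes(xs)))
        if size < best_size:
            best_step, best_size = step, size
    return best_step, best_size

# Atoms drifting toward the origin: later snapshots occupy less space
# and tend to compress better than the initial one.
print(best_timeshifted_compression([10.0, -8.0, 6.0, -4.0],
                                   [-1.0, 0.8, -0.6, 0.4]))
```

Note the search only finds the best snapshot among the steps it tried; certifying that no better time offset exists anywhere in the infinite future is exactly the part that needs the oracle.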

Here is the algorithm to find this optimal compression:

def find_best_compression(X):
  for compression in all_possible_compressions:  # an infinite set
    if best_compression(compression, X): return compression

The problem is that “best_compression” is undecidable without a halting oracle. We could make a naive implementation of this like so:

def best_compression_looper(compression, X):
  for other_compression in all_possible_compressions:
    if other_compression(X) < compression(X): return False
  return True  # never reached: the loop over an infinite set runs forever

The problem is that all_possible_compressions is an infinite set. If we already have the best compression, the loop never finds anything better, so the function never returns. We will never find out, and find_best_compression will never halt. So we need to wrap this looper in a halting oracle, like this:

def best_compression(compression, X):
  return will_not_halt(best_compression_looper, args=(compression, X))

Now we have an algorithm that can compute the best possible compression of a sequence. However, it relies on a logically impossible entity, a halting oracle. So we cheated; this is not possible. That is why optimal compression is incomputable. How does this relate to MI? Well, “pattern” here is synonymous with MI (between different parts of the system), or redundant information in the system. We can never compute for certain how much MI there is because we do not have a halting oracle.

This itself is NOT the proof that compressibility is uncomputable; that is a different proof. Rather, this is an example implementation showing how a halting oracle could overcome the uncomputability problem. [note: this paragraph was corrected; the first draft erroneously claimed this was the proof of uncomputability]

In practice, however, we are not interested in this problem of finding the verifiably optimal compression. We are happy to get nearly optimal solutions, and that does not require a halting oracle.
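This distinction can be made concrete. The following sketch (my own illustration; zlib is just a stand-in for any off-the-shelf compressor) shows the computable side of the problem: any compressor gives us an upper bound on the information content in finite time, with no oracle. What we can never do is certify that the bound is tight.

```python
import zlib

def information_upper_bound(data: bytes) -> int:
    """A computable upper bound on the information content of `data`:
    the length of one particular compressed representation. No halting
    oracle is needed; we just cannot certify that it is minimal."""
    return len(zlib.compress(data, 9))

patterned = b"AB" * 500  # 1000 bytes with an obvious pattern
print(information_upper_bound(patterned))  # far below 1000
```

This is why practical compression is unaffected by the incomputability result: the impossibility bites only when we demand verified optimality.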

Law of Information Non-Growth

I already explained LoING.

We can see here that measuring the information in a system is equivalent to finding the optimal compression of a system. We cannot do this. It requires a logically impossible program. This means that we cannot actually measure the true amount of order in any system.

LoING and the 2nd Law of Thermodynamics

So information in the system can be staying constant (or decreasing), but we would have no way of telling. This is how you resolve the apparent paradox between the 2nd Law and LoING. The 2nd Law teaches that information/entropy is always increasing. LoING states that it is always constant (or decreasing in non-reversible systems). How can both be true? The entropy of the 2nd Law is a measurable macroscopic quantity describing the number of possible states a system can be in. The information of LoING is precisely which one of those states the system is in, which can never be known. So we have two types of information/entropy alongside one another. One is measurable in the empirical domain; the other is not, and is provably impossible to access.

That is how halting oracles and LoING are connected. It is not that halting oracles are required for new information. Rather, if we observe apparent information growth, our lack of a halting oracle prevents us from ever knowing whether it is real information growth. So LoING still applies in systems that apparently follow the 2nd Law of Thermodynamics.


If the number of atoms is finite, then isn’t the number of possible arrangements also finite? Would that mean that the number of patterns, and thus compressions, is also finite?

Nope. We can run the simulation infinitely far into the future or the past. At some point in the future, it is possible that all the atoms are in 50% of the volume, and therefore compressible with less binary space. Perhaps going even farther forward, they might be in just 49% of the space. We can’t know for sure without a halting oracle. There are an infinite number of possible compressions because there are an infinite number of possible patterns and time frames we can look through to find a better compression.

How should I understand the term “compression”?

At first, I thought it meant representing the vector describing the positions and velocities of the atoms with as few bits as possible. For example, if you have 50 atoms with velocities v_1 = 1, v_2 = 2, v_3 = 3, ..., that is more compressible than 50 atoms with random velocities and positions.
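That first reading can be checked directly with an ordinary compressor. This is my own toy check, with zlib standing in for a real compressor and the velocity lists serialized naively as comma-separated text; the patterned list compresses to fewer bytes than the scrambled one.

```python
import random
import zlib

def compressed_size(values):
    """Bytes needed by one off-the-shelf compressor (zlib) for the list."""
    return len(zlib.compress(",".join(map(str, values)).encode()))

ordered = list(range(1, 51))                            # v_i = i: highly patterned
rng = random.Random(0)                                  # seeded for repeatability
scrambled = [rng.randint(1, 10**9) for _ in range(50)]  # "random" velocities

print(compressed_size(ordered), compressed_size(scrambled))
```

The ordered list could of course be compressed even further by a program that just writes "v_i = i"; zlib only gives an upper bound, which is the recurring theme of this thread.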

However, it seems that here you are using the term compression in a more literal, physical space sort of way, where if the atoms are in a smaller volume, they are also more compressible - just like one would say when dealing with actual gases.

It could be that these two understandings are actually mathematically equivalent to each other. I’m not sure.


Take the case of a 1D universe of length 10 with two atoms. These are the initial conditions:
x_1 = 0, v_1 = 1
x_2 = 0, v_2 = 1

No matter how far we extrapolate into the past or future, we know exactly the position and velocities of these atoms. So the possible number of patterns describing their position and velocity is finite.

That being said, I understand we’re talking about statistical mechanics, and the above example is using Newtonian mechanics. But any statistical mechanical system large enough is in principle reducible to Newtonian mechanics, with enough computing power. (Let’s forget about quantum mechanics for the moment.)

@dga471 I want to point out another equivocation on the LoING. Notice the formula:

Determinism + randomness can never increase information.

That is true. However, remember that randomness is information too. So adding randomness to an existing system from outside can increase information; in fact, it must increase information. So imagine the world progressing by reversible, deterministic law from some initial state (random_init). Its information content is always:

deterministic + random_init

Now, let us say we inject some randomness from somewhere, perhaps from God, or perhaps from quantum mechanics in some non-hyper-deterministic conception of it. Now the information content of the system is:

deterministic + random_init + random_injection

Do you see how information just increased?


Yes. One way I can relate this to my 1D universe, 2 atoms example above: if one of the atoms gets a random velocity kick, the behavior of the two particles becomes more complicated - more entropy, thus more information. Intuitively, it becomes much harder to compress the system.


How should I understand the term “compression”?

That would be one way to represent the positions. However, it is not the most efficient way; it is not the most compressed way.

Perhaps some of the other atoms have identical coordinates. So you can store a reference to another part of the array to save some space.

Perhaps there is a configuration that three atoms are in that other atoms are in as well. So you can store the position/orientation/scale of that configuration instead of the atoms themselves to save some space.

These all arise from different types of MI or self-correlations between the system and itself. Synonymously, they arise from different patterns we detect in the atoms.

The task is to find the optimal way of representing all these atoms, that reduces the size to the minimum amount.

If they occupy less space, then we need fewer bits to represent their positions. Their positions are now correlated, so we need one fewer bit to encode each of them. That is all I was saying.
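The "one fewer bit" claim is just arithmetic on a discretized position space. A quick worked example (the cell count of 1024 is an arbitrary illustration, not from the discussion):

```python
import math

cells_full = 1024             # discretized positions in the full volume
cells_half = cells_full // 2  # atoms confined to half the volume

bits_full = math.log2(cells_full)  # 10 bits to encode one atom's position
bits_half = math.log2(cells_half)  # 9 bits: exactly one bit saved per atom
print(bits_full - bits_half)       # → 1.0
```

Halving the accessible volume halves the number of distinguishable positions, which saves exactly one bit per atom regardless of the starting cell count.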

Compression is Finite?

Yes, the compression is finite. Representations, in general, are finite. Just writing down the state of the system will be a finite vector.

However, the number of possible compressions is infinite. We want the minimum compression from this infinite set.


That is exactly right. You know all the debates about QM. There is no reason to be certain that the universe is a system without randomness introduced by QM. If there is such randomness, it would mean that information is increasing in the universe, at least in the sense that LoING means it. QM could be introducing new information into the world at every tick of the Planck clock (if that exists?).


I agree that the number of possible compressions is infinite. However, in the case of the 2-atom universe, you should be able to find the minimum compression using Newtonian mechanics without having to go through all possible compressions. It is:
x_1 = x_2.

I can’t imagine there is some way of representing the position-velocity of the atoms with less information than that. The minimum compression must at minimum require writing down two lines, one describing the velocities and the other the positions. And in this case, we can do that without going through all possible compressions. Do you agree?

It is also worth pointing out that all these claims are brittle. If we just care about a nearly optimal compression, none of this applies. If all we care about is a reasonable or useful compression, or a bound on the information content, none of this applies. If we restrict ourselves to specific classes of easy-to-solve systems, none of this applies.

In those cases, we do not need a halting oracle, so no logical contradiction is encountered. So the entire exercise depends on not relaxing things to a “partial halting oracle” (as one example that we heard).

Yes, there are boundary cases where we can get an answer. That is not a problem. You are hitting on another point here. Our optimal compression algorithm must be “complete” in that it applies to all possible inputs too. So the fact that we can find an optimal compression in one case is beside the point.

Very quickly, as systems grow and the details of their workings are hidden, optimal compression becomes uncomputable in practice.

I can. Just run that through bzip2, and it will be fewer bits.

Boundary cases might be possible to resolve. That is beside the point. Try resolving it for an MD simulation of a single protein. Even in that tiny system, it becomes impossible. Try resolving it in cryptography. Once again it becomes impossible. Try resolving it for a quart of air. Once again it becomes impossible.

Now, for DNA in a cell evolving in the wild? Impossible.


Well yes, but it remains to be seen if QM randomness has any macro-level effects that are relevant for biology. We don’t know. If not, then we are essentially left with the same universe as a classical one, where randomness is not truly random (?). But @PdotdQ has pointed out that even a classical universe is not fully predictable. (Though I think we agreed that it is deterministic? Need to dig that up again.)


Quantum mechanics is definitely important in the randomness of mutations. The absorption/non-absorption of UV light by DNA at precise atoms is governed by quantum randomness. This sets in motion a chain of events that stochastically may or may not lead to an inherited mutation. The precise positions of these mutations determine if a mutation is beneficial or not. The precise positions, recall, are governed by QM randomness.

At this level, QM definitely affects biology. (@gbrooks9 would be proud)


I think I’m starting to understand what you mean. However, I’m trying to understand whether this “impossibility” is fundamental, or something that arises simply because we don’t have the computing power to simulate the mechanics of atoms precisely.

I think I have convinced myself that the impossibility is fundamental. As you said, you have to be able to compress all possible inputs. So in the case of the 2-atom universe, you have to be able to find an algorithm to compress any 4-tuple (x_1, x_2, v_1, v_2) for any x_1, x_2, v_1, v_2 \in R. This seems fundamentally impossible - because it would basically mean discovering every possible theorem in number theory.


OK, but imagine we were living in 1910, before people knew much about QM. (OK, Shannon was not born yet.) So we have only a classical, mostly deterministic universe. (Yes, there are edge cases like Space Invaders and Norton’s Dome, but do we need to rely on them?) Would this have any impact on whether information could increase via determinism alone?


Remember too that most real numbers are incompressible because they have infinitely many significant digits and no obvious construction. So simpler numbers are more compressible. Usually we can ignore this detail by discretizing space…or by using a discrete problem (such as a sequence). Thinking about an MD simulation of fixed precision might be how this is done; however, keep in mind that such a system is no longer reversible. You need infinite precision for reversibility. However, you are a physicist, so I thought we’d use this example.

Just a side point. We can ignore this detail. LoING is a theoretical construct. It is a separate question whether or not it applies to the real world.


Yes, but going back to a QM universe, position and velocity are discretized, in the sense that it doesn’t make sense to talk about anything smaller than the Planck length. So even in our actual universe we should be able to represent positions and velocities with a finite number of significant digits.

That being said, I don’t think this breaks the fundamental impossibility of finding the optimal compression with a single formula. Even with a finite set of numbers, there could be an infinite number of theorems relating them. (Unless there is a mathematical proof against that.)


The real challenge is if I give you an arbitrary list of numbers, without explaining to you the process that constructed them. In that case, you will quickly find it is impossible to solve, even for very short lists.

Take, for example, this list: Open Challenge to ID Advocates. It is framed differently, but we can repose the problem. What is the optimal compression of those five sequences? Not an easy thing to solve. In fact, as we’ve seen, you cannot figure it out. I’m almost guaranteed (because I produced the numbers) to be able to come up with a better compression than you. However, even that is not guaranteed to be optimal. There might be a better compression I did not envision. It is an undecidable problem.
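The asymmetry described here, where any compressor yields only an upper bound and optimality stays undecidable, can be sketched directly. This is my own illustration (the quadratic-residue sequence is an arbitrary stand-in for a list with a hidden generating process):

```python
import bz2
import lzma
import zlib

def best_known_compression(data: bytes) -> int:
    """The best result from several off-the-shelf compressors. This is
    only an upper bound on the optimal compression; proving that no
    shorter representation exists is the undecidable part."""
    return min(len(f(data)) for f in (zlib.compress, bz2.compress, lzma.compress))

# A sequence with a hidden pattern (i^2 mod 256). Whoever knows the
# generating rule can always beat generic compressors by just writing
# the rule down, yet the compressors cannot certify they missed it.
sequence = bytes(i * i % 256 for i in range(10_000))
print(best_known_compression(sequence), "<=", len(sequence))
```

Adding more compressors can only lower the bound; no finite collection of them can ever turn the bound into a proof of optimality.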


True, but in the model we’ve been discussing here, you don’t even have to invoke QM for randomness to inject information into DNA. In this model, the only information is the digital information in the order of the bases in the DNA. Even if the chemistry that causes a mutation is deterministic, as far as this model is concerned the mutation represents new information from a random source.

As a number of people have observed here, it’s not just the mathematics of the model that matters; the mapping of the model elements to the physical world is just as important in drawing valid conclusions.


Definitely true.

That is precisely the point. It is also why appeals to “well it is just a proven established fact that information can’t grow” are hollow appeals. It reminds me more of Aristotelian philosophy than anything in science.