Eric Holloway: Algorithmic Specified Complexity

Design

(Eric Michael Holloway) #9

Yes, keeping in mind my caveats, it is a lower bound on ASC.

OASC is defined in @Winston_Ewert’s IEEE paper “Algorithmic Specified Complexity in the Game of Life”.

If we want to be sure beyond a shadow of a doubt your OASC algorithm is technically correct, then for P_X calculated from an initial text X, we calculate the OASC for any new X’ as follows:

OASC(X', P_X, C) = \ell(X')H_C(X) - L(X').
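One way to realize this bound in code (a sketch only, not Eric’s or Joshua’s actual implementation): estimate the complexity term from the letter frequencies of the initial text X, and bound the Kolmogorov complexity of X' with gzip. The helper names are illustrative.

```python
import gzip
import math
from collections import Counter

def neg_log_prob(x_new, x_model):
    """Complexity term: -log2 P_X(x_new) under a letter-frequency model
    estimated from the initial text x_model. Assumes every character of
    x_new appears at least once in x_model."""
    counts = Counter(x_model)
    total = len(x_model)
    return sum(-math.log2(counts[c] / total) for c in x_new)

def compressed_bits(x):
    """Specification term: gzip length in bits, an upper bound on K(x)."""
    return 8 * len(gzip.compress(x.encode()))

def oasc_lower_bound(x_new, x_model):
    """Observed ASC: at most the true ASC, since a real compressor can
    only overestimate Kolmogorov complexity."""
    return neg_log_prob(x_new, x_model) - compressed_bits(x_new)
```

Note that gzip carries a fixed header overhead, so on very short strings this bound is strongly negative.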


(S. Joshua Swamidass) #10

We will move forward with my implementation, but would you please make an implementation that is beyond a shadow of a doubt correct? You can fork my implementation and edit it as required.


I want to now go over the key claims about ASC as laid out by Marks. Tell me if you think I missed something important.

Marks is correct to point out that intuitive concepts of information do not map to information theory:

While humans operate using an intuitive concept of information, attempts to develop a theory of information have thus far fallen short of the intuitive concept.

This is the problem that he is trying to solve here, by providing a quantitative way of measuring the “information” content that is indicative of design. Marks points out the problem with Dembski’s Complex Specified Information (CSI) criterion for identifying design: there is no objective way to measure CSI. This is the problem that ASC seeks to solve…

One approach to information is specified complexity as expressed by Dembski (Dembski, 1998). Dembski’s concern is that of detecting design, the separation of that which can be explained by chance or necessity from that which is the product of intelligence. In order to infer design, an object must be both complex and specified.

The concept of specification has been criticized for being imprecisely defined and unquantifiable. … The goal of this paper is to present and defend a simple measure of specification that clearly alleviates these concerns. Towards this end, the authors propose to use conditional Kolmogorov complexity to quantify the degree of specification in an object. Conditional Kolmogorov complexity can then be combined with complexity as a measurement of specified complexity. This approach to measuring specified complexity is called algorithmic specified complexity.
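For reference, the measure defined in the paper combines the two terms (with P the chance hypothesis and C the context) as:

```latex
ASC(X, C, P) = -\log_2 P(X) - K(X \mid C)
```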

So the function I implemented is claimed to be a reliable way of detecting design. A process that produces high ASC should, therefore, be only possible if intelligence is being injected into the process for Marks to be correct.

Do you agree @EricMH?


(Eric Michael Holloway) #11

Yes, it is a variant of mutual information. If P(X) = 2^{-K(X)}, and we use the maximally compressed form of C (denoted by C^*), then ASC(X,P,C) = K(X) - K(X|C^*) = I(X:C), where I(X:C) is algorithmic mutual information. Per our last discussion, algorithmic mutual information cannot be generated by random and computable processes any better than by flipping a coin. Hence my claim that naturalism cannot create information that fits our intuitive notion of information.
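Written out step by step, the identity Eric describes is:

```latex
\begin{aligned}
ASC(X, P, C) &= -\log_2 P(X) - K(X \mid C^*) \\
             &= K(X) - K(X \mid C^*) \qquad \text{substituting } P(X) = 2^{-K(X)} \\
             &= I(X : C)
\end{aligned}
```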

If you properly formulate P based on all possible chance and necessity causes, then yes, X with high ASC would give high probability something other than chance and necessity created X.


(S. Joshua Swamidass) #12

In my implementation, is P properly formulated?


(Eric Michael Holloway) #13

Not if you are trying to account for all possible chance and necessity causes. In your case, high ASC just means the text was not generated by sampling characters according to frequency distribution. This leaves open many other probabilistic and algorithmic means of generating text, such as with Markov chains.
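Eric’s point can be made concrete: a sequence produced by a trivial deterministic chain scores enormous OASC under a letter-frequency P, even though chance and necessity generated it. A hypothetical sketch, with gzip standing in for the compressor:

```python
import gzip
import math
from collections import Counter

# A deterministic chain over {A, B, C, D}: each letter fixes the next,
# so a few lines of code generate the whole text.
text = "ABCD" * 2500          # 10,000 characters

# Letter-frequency model: each letter occurs with frequency 1/4,
# so the complexity term charges exactly 2 bits per character.
counts = Counter(text)
neg_log_p = sum(-math.log2(counts[c] / len(text)) for c in text)

# gzip immediately finds the repetition, so the compressed length is tiny.
comp_bits = 8 * len(gzip.compress(text.encode()))

# Enormous OASC despite a trivial chance-and-necessity generator.
oasc = neg_log_p - comp_bits
```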


(S. Joshua Swamidass) #14

So this gets a key claim that Marks makes, and I want to know if you agree with him:

It is not possible to calculate the Kolmogorov complexity of an object. However, it is possible to upper-bound the Kolmogorov complexity and thus lower-bound the algorithmic specified complexity. This means that something can be determined to be at least this specified, although the possibility that it is even more specified cannot be ruled out.

According to Marks, the consequences of using the wrong P is that ASC will be artificially low, and never artificially high. My ASC implementation is supposed to be a hard lower bound on CSI. Do you agree? If not, where did Marks go wrong?

After answering this question, I need you to choose either:

  1. If ASC (and my implementation) can never overestimate CSI, then we will keep my implementation and move to the next part.

  2. If you think we must carefully choose our implementation of ASC (the P) or we will overestimate CSI, that leaves you with two tasks: (1) explain where Marks went wrong and (2) give us the correct implementation by which we can compute ASC on biological sequences (DNA and/or protein sequences).

It seems a key point Marks is making is that even if we have the wrong compression algorithm, we are guaranteed to be underestimating CSI with the ASC measure. I think you agree, as this appears to be your idea. If not, that would be really interesting. Curious which choice you will take…


(Eric Michael Holloway) #15

I’m not seeing where you are reading this. In the passage you quote he is discussing the specification term K(X|C), not the complexity term -\log_2 P(X).

Yes, this is correct. If C is kept constant, then L(X|C) is always going to overestimate the conditional Kolmogorov complexity if it satisfies the Kraft inequality, and thus underestimate the ASC.


(S. Joshua Swamidass) #16

Great…you chose option one, right?

This means that the implementation I’ve given here for ASC is a lower bound on CSI. So tell me if I have the logical syllogism correct then. I’m using ASC in reference to OASC (my implementation), to be consistent with my code and Marks’ paper.

  1. Postulate: nothing else but intelligent design can produce objects with high CSI.
  2. Ergo, high amounts of CSI are a unique signature of intelligence.
  3. For a biological sequence X, you have demonstrated ASC(X) \leq CSI(X).
  4. Some biological sequences in nature have high ASC; ergo, they have high CSI.
  5. Ergo, these sequences were produced by an intelligent designer.

Is this the argument that Marks, Dembski and you are all making? How would you refine or tighten this syllogism? Or is it correct as written?


(S. Joshua Swamidass) #17

This is an important point I suppose. I mean to say that ASC will always be a lower bound of CSI, and it seems you agree. It will never be more than the true CSI. This is what you have proven about my implementation of ASC.


(Eric Michael Holloway) #18

It is a false dilemma since it is premised on the false claim that “the wrong choice of P can only underestimate ASC.” I’m not sure why you think the wrong P can only underestimate ASC, as that is not stated in the passage you quote.

It is easy to see your claim is incorrect. For example, I could pick a P that gives P(X)=0 for all events that occur, and P(X) > 0 for events that never occur. In which case, all events that occur will have ASC=\infty.

The assumption that “wrong choice of P can only underestimate ASC” also does not apply to your implementation of ASC. I can easily construct an example where letter frequency gives the wrong probability for a sequence, e.g. “ABABABABABABABABABAB” will have -\log_2 P(X)=20 in your scheme by sampling uniformly from \{\text{"A"},\text{"B"}\}. On the other hand, the sequence could have been generated by the program print “AB” * randint(10,137), which gives -\log_2P(X)=-\log_2 \frac{1}{128} = 7. In which case your choice of P highly overestimates the ASC.
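Eric’s arithmetic here checks out; a quick sketch of both models:

```python
import math

# The sequence from the example: 20 characters of alternating A and B.
seq = "AB" * 10

# Model 1: sample each character uniformly from {"A", "B"},
# so P(seq) = (1/2)**20 and the complexity term is 20 bits.
bits_letter_model = -math.log2((1 / 2) ** len(seq))

# Model 2: the program  print("AB" * randint(10, 137)).
# randint(10, 137) has 128 equally likely outcomes, so P(seq) = 1/128,
# giving a complexity term of only 7 bits.
bits_program_model = -math.log2(1 / 128)
```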


(S. Joshua Swamidass) #19

So this is interesting. Just to get some definitions straight here, we need to stick to the nomenclature definitions in the Marks paper.

I presented a choice between two options:

  1. ASC (or OASC) is guaranteed to be less than CSI.
  2. ASC implemented with the wrong P can be greater than CSI.

You respond that a wrong choice of P can cause ASC to overestimate.

This means, it appears, that:

  1. I presented a valid implementation of ASC (which is later called OASC).
  2. Marks argues that the implementation is guaranteed to produce a number no greater than CSI, and that this is an objective metric.
  3. Now you are saying that if I did not correctly choose P, ASC could be higher than CSI.

At face value, #2 and #3 are in contradiction. How can both be true? If a poor choice of P would increase the ASC above CSI, then how do we know if we have the right P or not? It seems that, instead, the claim you and Marks are making is:

ASC (i.e. OASC) is guaranteed to be less than CSI, provided the implementation uses the correct P. If the wrong P is used, then ASC might be higher than CSI.

Do you agree?

If You Agree…

If you do agree, then for ASC to be a confident lower bound on CSI there must be an unambiguous way of determining the correct P. Without such a way, there is always the possibility that a better choice of P will reduce ASC, demonstrating that the computed ASC was greater than CSI.

This becomes particularly challenging in cases where we do not know how a sequence was generated. So, then, this brings us to a fundamental and central set of questions:

  1. What is the correct P for biological sequences such that we can be certain ASC < CSI?
  2. If you cannot produce a guaranteed-correct implementation of P for biological sequences, how do you know that ASC < CSI?
  3. If you can produce an implementation of P, how do you know if it is correct?
  4. If you can’t produce an implementation of P or even determine if a given P is correct, how is ASC (OASC in all these references) guaranteed to be less than CSI?

If you have a general method of determining what the correct P is, I’d like to put it to the test. I can produce a few sets of sequences, and you determine which set has the highest CSI. If you have an objective strategy, without knowing how I generated these sequences (till the end), you should be able to determine which one has the highest CSI. I do not think this is possible. In fact, I think I can prove it is impossible, but we can see if you can prove me wrong.

If you can’t determine which sequence has the highest CSI, that means Marks is incorrect, and ASC is not an objective way of measuring/bounding CSI.

If You Disagree…

How are you certain that ASC will always be less than CSI? When looking at biological sequences, it is always possible we just have the wrong P, and the true CSI is much lower.

Because of this unresolvable question, it seems then that Marks is wrong: ASC is not an objective way of measuring/bounding CSI.


(S. Joshua Swamidass) #20

I want to signal that I agree with this too:

Your example appears to demonstrate this point:

The choice of P is critically important for computing ASC if we are to interpret it as a lower-bound of CSI. Your language here is confusing though. You mean the ASC computed using this choice of P will be much higher than the ASC using the correct P.


(S. Joshua Swamidass) #21

Finally, on some careful thought, I determined a general way of implementing P that will produce an ASC (on average) that is equal to or less than zero, no matter what the sequence. In fact, it might always produce an ASC of exactly zero.

If I can implement such a P, and demonstrate that it produces an ASC of zero on arbitrary sequences (including any you want to present), does this demonstrate that there is no CSI in biological sequences?


(Eric Michael Holloway) #22

There is a bit of ambiguity in this discussion. We need to distinguish between the fact of the matter and our understanding of the matter when we talk about P.

There is always a real P_{true} that accurately estimates the probability that chance and necessity produced X. This P will always result in an ASC that has the property \Pr\{ASC(X, P_{true}, C) \geq b\} \leq 2^{-b}.
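This tail bound can be sanity-checked empirically; a hypothetical sketch, with gzip standing in for the conditional complexity term (gzip’s fixed header overhead makes the check very conservative):

```python
import gzip
import random

random.seed(0)

def observed_asc(x: bytes, neg_log_p: float) -> float:
    # gzip length in bits stands in for the conditional complexity term
    return neg_log_p - 8 * len(gzip.compress(x))

# Draw X from the true P: 16 uniformly random bytes, so -log2 P(X) = 128.
trials, b = 2000, 10
exceed = sum(observed_asc(random.randbytes(16), 128.0) >= b
             for _ in range(trials))
# The bound predicts at most trials * 2**-b = ~2 exceedances; here gzip's
# overhead keeps every observed ASC well below b, so none occur.
```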

Then there is the P_{est.} we are talking about, which if we get it wrong could overestimate ASC. The key is we need to pick a P_{est.} such that it never underestimates the probability for any X. Mathematically, the criterion for a valid P_{est.} is:

valid(P_{est.}) := \forall X, P_{est.}(X) \geq P_{true}(X)

As long as this criterion is met, then OASC is guaranteed to be a lower bound. An example of a universal P_{est.} is P_{est.}(X) = 2^{-K(X)}, based on the assumption that all physical processes can be completely characterized by some combination of Turing computation and bits generated by a Bernoulli(\frac{1}{2}) distribution.

So, in regard to your challenge, it is certainly interesting and I would like to participate, but my failure will not indicate that Marks et al. are wrong about ASC. The key property of ASC is mathematically proven, and so empirical counter evidence can only mean we have erred in our construction of OASC, possibly in our selection of P_{est.}.

The more interesting claim you make is that it is provably impossible to craft a usable valid(P_{est.}). If true, this will not disprove ASC, but it will show it is not useful in a practical setting. I would like to see your proof.


(S. Joshua Swamidass) #23

The Point: ASC Not Practically Useful

There is no ambiguity in my mind here, though you seem to be getting my point. Marks’ claims are entirely about what can be computed from data. He claims that ASC is a good way of empirically measuring CSI. That is the whole purpose of the study. In this context we do not know the true P, but can only guess it. If the result is dependent on the choice of P, then his entire claim is false. Essentially, as you put it:

Which is to say that ASC is not useful in a practical setting, contra his claims. That is precisely the challenge I am making. If ASC is not practically useful, we still have no way of measuring CSI (as Marks admits in the paper). This leaves CSI as a very poorly defined concept without any way of engaging real data till this problem is solved. I know there are other attempts to solve this problem, but they all have analogous defeaters.

Valid Choice of P

Your criterion for a valid estimated P is interesting.

We can prove the criterion is satisfied if and only if \forall X, P_{est.}(X) = P_{true}(X). Remember that P is normalized, so the sum across all X equals 1. Too much density at one outcome has to be compensated by too little density at some other outcome, which would violate your inequality. Therefore, by your criterion, the estimated P is valid if and only if it exactly equals the true P. I certainly agree.
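The normalization argument can be checked on a toy distribution (a sketch; the numbers are illustrative):

```python
p_true = {"A": 0.5, "B": 0.3, "C": 0.2}

def valid(p_est):
    """Eric's criterion: P_est never underestimates P_true on any outcome."""
    return all(p_est[x] >= p_true[x] for x in p_true)

# Raising the density anywhere forces a shortfall somewhere else,
# because both distributions must sum to 1.
p_skewed = {"A": 0.6, "B": 0.3, "C": 0.1}   # still sums to 1
ok_skewed = valid(p_skewed)                 # False: underestimates "C"

# The only normalized distribution meeting the criterion is P_true itself.
ok_true = valid(dict(p_true))               # True
```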

The criterion you’ve put forward makes my point. The inequality ASC < CSI only applies if you know exactly what the true P is, but we do not know the true P for any biological sequence. ASC is, therefore, practically useless. Remember, the only valid P is the true P, so we have no way of constructing a valid ASC.

That was not my claim at first, though it appears to be true: finding a valid estimated P is equivalent to finding the true P, and that is impossible. Finding the true P is equivalent to finding the ideal compression, which we know is uncomputable. There is a simple algorithm for transforming the true P into the ideal compression, so if we could determine the true P, then compressibility would be computable. However, we already know that leads to a contradiction: compressibility is not computable. Ergo, determining the true P of, for example, DNA is provably impossible.

What is at Stake

It seems we are tracking very closely with the logic of the paper. You’d have to show where you deviated from the logic of the paper. Or perhaps, where they went wrong. This is, after all, your idea that they have developed.

That does disprove the claim that ASC is a valid way of measuring CSI. This is the whole point of the Marks paper on ASC: that he has found a practical way of measuring CSI, and therefore a practical way of detecting intelligence. If this turns out to be false, the entire point of the paper is overturned.

The Interesting Claim

That was not my earlier claim.

Rather, my earlier claim is that I can construct a sensible P which will almost always yield an ASC of zero (if not always). I can both empirically demonstrate this and prove it too. If that is true, one of two things is also true:

  1. Either ASC is not useful in a practical setting for measuring CSI, because it is more determined by our choice of P than by any signal in the data.

  2. Or CSI is precisely zero for all objects encodable as strings (i.e. everything), and we therefore find no evidence for intelligent design using CSI.

We will go further, but it will be interesting to see which path you take on this. Either ASC is practically useless for measuring CSI, leaving CSI as a signature of design but unmeasurable; or ASC is correct and practically useful, but it demonstrates that ASC does not empirically find CSI in anything. I’m honestly not sure which one is a better conclusion for ID. Both are going to be difficult to work through.


(S. Joshua Swamidass) #24

There are additional proofs I can make.

  1. If we know the true P, we can use this to construct an ideal compression algorithm. I can produce the construction algorithm to transform any P into a compression algorithm.

  2. We know that determining the ideal compression algorithm for arbitrary data is impossible (compressibility is uncomputable); therefore it is impossible to know the true P of arbitrary data. Of note, it is impossible to know the true P of biological sequences.

  3. The ASC requires a P and a compression algorithm to implement. If we choose the true P and the ideal compression algorithm constructed from the true P, then ASC measured on all sequences would be exactly zero in all cases. So the true ASC appears to be zero for all sequences.

I can make these proofs in a fairly straightforward way. They are not difficult proofs (in fact #2 is trivially true if #1 is true). Moreover, they are consistent with the proof that ASC is less than or equal to CSI, because zero would be less than or equal to CSI (if it is a real entity).

  4. Note, moreover, that IF we use the true P and the ideal compression algorithm to implement ASC, it seems I can prove that ASC = CSI. So this might demonstrate that CSI is merely an artifact of thought, and measured as zero in all objects, even if the empirical/observed ASC is greater than zero. All the CSI being measured by observed ASC is just an artifact of using the wrong compression algorithm and the wrong P. I think this may be difficult to establish because CSI is such a poorly defined concept. I suspect the most likely response would be to abandon ASC entirely so as to protect CSI.

This seems to be a complementary proof to the point that ASC is not practically useful. This point, it seems, demonstrates that the idealized ASC of any sequence is zero, and therefore not a meaningful quantity. If the proof that ASC corresponds with CSI is valid, it might even demonstrate that CSI is always zero.
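The construction in proof #1 above can be sketched concretely: map each outcome to a code length of ceil(-log2 P(x)), which the Kraft inequality guarantees is realizable as a prefix code. The distribution below is illustrative (dyadic, so the code is exactly ideal):

```python
import math

def code_lengths_from_p(p):
    """Turn any distribution P into prefix-code lengths l(x) = ceil(-log2 P(x)).
    By the Kraft inequality, a prefix code with these lengths exists, so any P
    yields a compressor within one bit of -log2 P(x) on every outcome."""
    return {x: math.ceil(-math.log2(px)) for x, px in p.items() if px > 0}

p = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
lengths = code_lengths_from_p(p)

# Kraft inequality: sum of 2**-l(x) <= 1, so the code is realizable.
kraft_sum = sum(2 ** -l for l in lengths.values())

# For this dyadic P the code lengths equal -log2 P(x) exactly, so
# -log2 P(X) - codelength(X) = 0 for every X: an ideal compressor built
# from the true P drives the observed ASC to zero.
ideal = all(lengths[x] == -math.log2(px) for x, px in p.items())
```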


(S. Joshua Swamidass) #25

The Puzzle Rules

If you can’t solve this puzzle with an objective metric, it means you can’t objectively determine the CSI of objects. I do not think this is a solvable puzzle, but because I generated the sequences I know the right answer. For each of the following sequences, I know the correct CSI. I can give a few hints, and a promise:

  1. Each sequence has a different CSI, either 5, 10, 100, 150 or 200 bits of CSI to be precise.

  2. Each sequence was constructed with about five lines of python code.

  3. I’m saving the code used to generate this, and the answer key, and can produce it when you are done trying.

Remember, it is not enough to just permute these 5 numbers and try all combinations. You have to produce an implementation of ASC that produces results (1) consistent with these numbers and (2) able to identify which ones have high CSI. Of course, you could submit a function that always emits zero. It would be less than CSI in all cases, but it would be unable to determine which sequences were high or low.

If you cannot solve this puzzle (and I think it is impossible to solve), then it seems we have demonstrated that ASC cannot practically identify high CSI.

The Sequences

98d50ad729eb9af79bedc1eb64d555083513057ddfa6e86892fc6360d843e168

ce98c8f3aaddcac18d724c93ccde9b5af87f1967fe4481700d0832c5b003f4a2

451834bb93b6de3e8f87255f14789b728b544304027343000e57eabba0573318

1319730bc751757d7ae1b1f21f81978b9a2ee6d96685f03afc93d257c7d21d66

19385699c280f99b6825aa165c38191b9d7b6f150cf504f25b9934e96b90261d


(Eric Michael Holloway) #26

If we are just looking at a subset of the possible X, then the estimated P does not need to equal the true P to be valid. In any practical scenario we are looking at a subset of physical phenomena, anyways.

This is great. Thanks for listing your proofs.

#1 is correct.

#2 While it is true compressibility is uncomputable, it does not follow that we cannot find a decent lossy compression for some arbitrary data. If we are only looking at a subset, then we can reproduce the signal in the subset, which is equivalent to an overestimate of the probability.

For example, if we are just looking at DNA, we don’t care about the specific atomic composition of the DNA strand, we just care about the very high level base pair coding. From this coding, we could theoretically create a new DNA strand with this coding, but the new DNA strand would not be atomically identical to the original, so there is a tremendous amount of information loss. But, we still think characterizing the DNA strand by the base pair coding is very useful, so that is the signal.

#3 Since P is an abstraction, it does not need to be a particularly good compression. So, it does not follow that ASC is always zero, if we are just comparing the P compression to the ideal compression of the abstraction.

I generally agree with #4 that ASC is a more concrete form of CSI, but CSI is still a more general definition that is mathematically valid. For example, CSI could potentially be greater than ASC if the semiotic agents are halting oracles.

Regarding your puzzle, I’ll go ahead and concede defeat.

This does not follow:

With that, I don’t see there is much more to add. I appreciate you sharing why you do not think ASC is useful. Unfortunately, I do not find the reasons very convincing. I still hold out hope for a refutation of ASC/CSI. ID has consumed over a decade of my life, and it would be great to finally disprove it and move onto something else.


#28

Why do you hold out hope for ASC? ASC hasn’t been shown to be a useful mathematical formulation that accurately describes anything in nature. ASC doesn’t seem to add anything to information theory either, and it certainly isn’t related to evolutionary science. Further, how does ASC have anything to do with ID?


(Eric Michael Holloway) #29

Those are good questions. Yes, ASC is technically a form of mutual information, so in a sense is not innovative. For that matter I’ve not discovered anything in ID that is innovative. But, that is also the whole point. The information theory aspect of ID is actually not controversial, and as far as I can tell makes no claim that hasn’t been made and proven elsewhere by more eminently qualified people. So, ID theory is settled mathematically. I really do not understand @swamidass’s contention with ID theory. The only reasonable point I get from our exchanges is there could be implementation issues in measuring mutual information, but so far I’ve not seen any insurmountable problems.

Furthermore, if @swamidass is correct, then ID is the least of his concerns. The whole field he is an expert in, information theory, is predicated on the notion that mathematical concepts like entropy and mutual information are not completely useless in practical settings. For example, the channel capacity is a form of mutual information, and Shannon proved we can create codes that get arbitrarily close to the channel capacity with arbitrarily small error. Our entire modern communication infrastructure is possible because of his insights regarding mutual information. But, if the practicality of mutual information is indeed the issue, then @swamidass needs to aim a lot higher than people like myself and the rest of the ID proponents.

As far as ID, ASC and CSI are concerned, the whole point is that mutual information cannot be generated by natural processes, for a suitably qualified definition of “natural process.” So, if mutual information exists in the genome, or any other aspect of physical reality, including the “natural processes” themselves, then it is put there by something that itself cannot be characterized as a natural process. That seems pretty significant, at least to me. Furthermore, practicality aside, it is also obvious mutual information exists. Any instance of order, such as this post on the message board, is an instance of mutual information.

So, in a nutshell, information theory tells us that because I wrote this post, we can know God exists and humans have immortal, immaterial souls.