So this is interesting. To get the definitions straight, we need to stick to the nomenclature in the Marks paper.
I presented a choice between two options:
- ASC (or OASC) is guaranteed to be less than CSI.
- ASC implemented with the wrong P can be greater than CSI.
You respond that:
This means, it appears that:
- I presented a valid implementation of ASC (which is later called OASC).
- Marks argues that the implementation is guaranteed to produce a number less than CSI, and that this is an objective metric.
- Now you are saying that if I did not correctly choose P, ASC could be higher than CSI.
At face value, the second and third points are in contradiction. How can both be true? If a poor choice of P would increase the ASC above CSI, then how do we know whether we have the right P or not? It seems that, instead, the claim of you and Marks is:
ASC (i.e. OASC) is guaranteed to be less than CSI, provided the implementation uses the correct P. If the wrong P is used, then ASC might be higher than CSI.
Do you agree?
If You Agree…
If you do agree, then for ASC to be a confident lower bound on CSI, there must be an unambiguous way of determining the correct P. Otherwise, there is always the possibility that a better choice of P will reduce ASC, demonstrating that the earlier ASC was greater than CSI.
This becomes particularly challenging in cases where we do not know how a sequence was generated. So, then, this brings us to a fundamental and central set of questions:
- What is the correct P for biological sequences such that we can be certain ASC < CSI?
- If you cannot produce a guaranteed-correct implementation of P for biological sequences, how do you know that ASC < CSI?
- If you can produce an implementation of P, how do you know if it is correct?
- If you can’t produce an implementation of P or even determine if a given P is correct, how is ASC (OASC in all these references) guaranteed to be less than CSI?
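To make the dependence on P concrete, here is a minimal sketch. It is not Marks's implementation. It assumes ASC is computed as -log2 P(x) minus a description length for x, and it uses zlib's compressed length as a crude stand-in for the uncomputable Kolmogorov term, with no context supplied. The helper `oasc_bits` and both probability models (`uniform_log2`, `cyclic_markov_log2`) are hypothetical names invented for illustration.

```python
import math
import zlib


def oasc_bits(seq, log2_prob):
    """Rough observed-ASC estimate in bits: surprisal under P minus compressed length.

    Assumption: zlib's output length stands in for K(seq | C) with an empty context.
    """
    surprisal = -log2_prob(seq)
    compressed_bits = 8 * len(zlib.compress(seq.encode("ascii"), 9))
    return surprisal - compressed_bits


def uniform_log2(seq):
    """Hypothetical P #1: every base is independent and equally likely (2 bits each)."""
    return len(seq) * math.log2(0.25)


def cyclic_markov_log2(seq, follow=0.97):
    """Hypothetical P #2: a Markov chain that usually steps A -> C -> G -> T -> A."""
    successor = {"A": "C", "C": "G", "G": "T", "T": "A"}
    total = math.log2(0.25)  # the first base is drawn uniformly
    for prev, cur in zip(seq, seq[1:]):
        # The expected successor gets `follow`; the other three bases share the rest.
        p = follow if successor[prev] == cur else (1.0 - follow) / 3.0
        total += math.log2(p)
    return total


seq = "ACGT" * 50
print("OASC with uniform P:      ", oasc_bits(seq, uniform_log2))
print("OASC with cyclic Markov P:", oasc_bits(seq, cyclic_markov_log2))
```

The compressed-length term is identical in both calls, so the entire gap comes from the choice of P: the same sequence scores far lower once P anticipates the repeat.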
If you have a general method of determining what the correct P is, I’d like to put that to the test. I can produce a few sets of sequences, and you determine which set has the highest CSI. If you have an objective strategy, you should be able to determine which one has the highest CSI without knowing how I generated these sequences (until the end). I do not think this is possible. In fact, I think I can prove it’s impossible, but we can see if you can prove me wrong.
If you can’t determine which set has the highest CSI, that means Marks is incorrect, and ASC is not an objective way of measuring/bounding CSI.
If You Disagree…
How are you certain that ASC will always be less than CSI? When looking at biological sequences, it is always possible that we just have the wrong P, and the true CSI is much lower.
Because of this unresolvable question, it seems then that Marks is wrong: ASC is not an objective way of measuring/bounding CSI.