Multiplying Probabilities

Continuing the discussion from Comments on Gpuccio: Functional Information Methodology:

Pardon me, but my pet peeve has been perked! @AllenWitmerMiller, bring me my soapbox! :slight_smile:

I am quoting @Gpuccio here, but only because this is the latest example of an error I see regularly, and darnit, I’m getting tired of correcting some basic probability calculations. The error is, probabilities do not multiply this way.

If X is some event and the probability of X occurring in a single trial is 0.1, then two trials generate the following probability distribution:

The probability X does not occur in either of two trials: (1.0-0.1)\times(1.0 - 0.1) = 0.9 \times 0.9 = 0.81

The probability X occurs in the first trial but not the second: (0.1)\times(1.0 - 0.1) = 0.1 \times 0.9 = 0.09

The probability X does not occur in the first trial but does in the second: (1.0-0.1)\times(0.1) = 0.9 \times 0.1 = 0.09

The probability X occurs in both the first and second trials: (0.1)\times( 0.1) = 0.1 \times 0.1 = 0.01

If we add up all these possibilities they sum to 1.0.
0.81 + 0.09 + 0.09 + 0.01 = 1.0

AND finally the probability that X occurs at least once in two trials is the sum of the last three terms:
0.09 + 0.09 + 0.01 = 0.19

OR we can take the complement; the complement of a probability p is one minus that probability (1-p). So the probability that X occurs at least once in two trials is one minus the probability that it does not occur in either trial:
1-0.81 = 0.19
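
If you'd rather check that arithmetic numerically than trust the algebra, here is a minimal Python sketch (my own illustration, using the same p = 0.1 as above) that simulates many pairs of trials and tallies the four outcomes:

```python
# Simulate pairs of trials with p = 0.1 and tally the four possible outcomes.
import random

p = 0.1
runs = 1_000_000
counts = {"neither": 0, "first only": 0, "second only": 0, "both": 0}

for _ in range(runs):
    first = random.random() < p
    second = random.random() < p
    if first and second:
        counts["both"] += 1
    elif first:
        counts["first only"] += 1
    elif second:
        counts["second only"] += 1
    else:
        counts["neither"] += 1

for outcome, n in counts.items():
    print(f"{outcome:12s}: {n / runs:.4f}")
# Analytic values: neither 0.81, first only 0.09, second only 0.09, both 0.01
print("at least once:", 1 - counts["neither"] / runs)  # ~0.19
```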

More on this complementary trick, but first a word from our sponsor, the Binomial distribution, which is where we end up if we generalize this example.

Now (back to our program and) the complement trick; with the same X and N trials instead of two.
The probability X does not occur in any of N trials: (1.0-0.1)^N = 0.9^N

and the complementary probability that X occurs at least once:
(1 - 0.9^N)

and if N=2:
1 - 0.9^2 = 1 - 0.81 = 0.19

and finally (FINALLY!) generalizing to some probability p: if the probability of X in a single trial is p, then the probability of X occurring at least once in N trials is:
(1 - (1-p)^N)

which approaches 1 but never exceeds it, no matter how many trials we allow.
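
A quick numerical sketch of that formula (again just an illustration, with p = 0.1) shows it climbing toward 1 but never passing it:

```python
# Evaluate P(at least one occurrence in N trials) = 1 - (1-p)**N for growing N.
p = 0.1
for N in (1, 2, 5, 10, 50, 100, 1000):
    at_least_once = 1 - (1 - p) ** N
    print(f"N = {N:4d}: P(at least once) = {at_least_once:.6f}")
# The values approach 1.0 as N grows, but never exceed it.
```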

So to my point, quoting Gpuccio with no malice intended,

Of course, it is 1 billion times more likely to find the target in 1 billion steps then in 1 step.

I’m not using Gpuccio’s stated probability because it does not matter; the statement is incorrect. If the original probability is greater than 10^{-9} = 0.000000001, then multiplying by one billion gives a value greater than one, which is NOT a probability by definition. This simple multiplication is not a valid probability calculation; it is actually the expected number of times X will occur in N trials.

So it is not “a billion times more likely to find a target” in this example. That might be correct to within an order of magnitude when p is much less than {1\over N}, but it is not correct in general.
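
To make the contrast concrete, here is a short sketch (illustrative numbers of my own choosing) comparing N \times p, the expected number of occurrences, with the actual probability of at least one occurrence:

```python
# Compare N*p (expected count of occurrences) with 1 - (1-p)**N
# (probability of at least one occurrence).
p = 0.01
N = 1_000_000_000

print("N * p            =", N * p)             # 10,000,000 -- not a probability
print("P(at least once) =", 1 - (1 - p) ** N)  # essentially 1.0

# The "N times more likely" shortcut is only reasonable when p << 1/N:
p_small = 1e-12
print("small-p case:", N * p_small, "vs", 1 - (1 - p_small) ** N)  # both ~0.001
```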

Remember this complement trick, because the same calculation comes up regularly in discussions of ID.

OK, rant over. :smile:

13 Likes

Sorry. I did laundry today and I still have socks soaking in your soapbox.

4 Likes

BUT WAIT - THERE’S MORE!!

I’m going to skip the algebra this time and cut to the chase.

If we set the final probability, that of observing X at least once in N trials, equal to {1 \over N}, what can we say about the probability of X in a single trial? Written out:
(1-(1-p)^N) = {1 \over N}
For clarity later, let’s assign a name, p_n = {1\over N}, so
(1-(1-p)^N) = p_n

It’s pretty easy to solve for p, giving,
p= 1-({{N-1}\over N})^{1\over N} = 1-\sqrt[N]{{{N-1}\over N}}

So if we have N=100 trials, and we assume the probability we will observe X at least once is p_n={1\over N} = 0.01, then p=0.0001005. Note that 0.01 is almost 100 times greater than 0.0001005. Both probabilities are still pretty small, but multiple trials make a big difference!
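
Here is a one-line check of that worked example (a sketch, using the N = 100 value from above); plugging the recovered p back into the at-least-once formula should return {1\over N}:

```python
# Solve p = 1 - ((N-1)/N)**(1/N) for N = 100 and verify it round-trips.
N = 100
p = 1 - ((N - 1) / N) ** (1 / N)
print("p            =", p)                  # ~0.0001005
print("1 - (1-p)**N =", 1 - (1 - p) ** N)   # ~0.01, i.e. 1/N
print("(1/N) / p    =", (1 / N) / p)        # ~99.5, "almost 100 times"
```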

Next let’s solve for N. This is much harder, but there is another trick! (Calculus!) For values of a close to one, the natural logarithm is nearly linear, so that
\ln(a) \approx a-1
For 0.9 < a < 1.1 this approximation is very good (and calculus is useful!)

Using this, and the fact that {{N-1}\over N} is close to one for large N and the probability p is very small (close to zero), I can give the following approximation:
N\approx{ 1\over\sqrt{p}}

So if p is… let’s say a *one-in-a-billion* probability (p=0.000000001), then N \approx 31,623, and p_n \approx {1 \over 31623} = 0.000031623,
which is a LOT more than one-in-a-billion.
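
A quick check of that approximation (sketch only, with the one-in-a-billion p from above): with N \approx 1/\sqrt{p} trials, the chance of seeing X at least once should land very close to 1/N.

```python
import math

# Check the N ~ 1/sqrt(p) approximation for p = one in a billion.
p = 1e-9
N = round(1 / math.sqrt(p))                # 31623 trials
print("N            =", N)
print("1-(1-p)**N   =", 1 - (1 - p) ** N)  # ~3.16e-5
print("target 1/N   =", 1 / N)             # ~3.16e-5
```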

This is a good place to stop. It should be obvious there are serious implications for ID probability arguments, but I’ll let people figure this out for themselves.

Next I’ll work this out to find the number of trials needed so the probability of observing X (with probability p) at least once is 0.5.

7 Likes

Your calculations and his statements both appear to be correct.

1 Like

How can that be, Bill?

That seems obvious to me, even without the algebra. It directly contradicts @gpuccio’s claim, which Bill is claiming to be correct.

6 Likes

With no further ado …

N = {\ln(0.5) \over \ln(1-p)} \approx {0.69315 \over p} \approx {0.7 \over p}

And that approximation should work very nicely for p<0.1.
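
For anyone who wants to see how good that shortcut is, here is a small sketch (probabilities chosen arbitrarily for illustration) comparing the exact trial count with 0.7/p:

```python
import math

# Trials needed for a 50% chance of at least one occurrence:
# exact N = ln(0.5)/ln(1-p), versus the 0.7/p shortcut.
for p in (0.1, 0.01, 0.001, 1e-6):
    exact = math.log(0.5) / math.log(1 - p)
    approx = 0.7 / p
    print(f"p = {p:<8g} exact N = {exact:12.1f}   0.7/p = {approx:12.1f}")
```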

Next on my list is Probabilistic Resources. :slight_smile:

Adding another useful relation derived in a similar way …

If the probability of an event is 1/N, and there are N trials where it may occur, the probability that the event will occur at least once is approximately 0.632. The probability converges to 1-1/e as N goes to infinity.
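
A sketch of that convergence (illustrative N values of my own choosing):

```python
import math

# 1 - (1 - 1/N)**N approaches 1 - 1/e ~ 0.632 as N grows.
for N in (10, 100, 1000, 1_000_000):
    print(f"N = {N:8d}: {1 - (1 - 1/N) ** N:.6f}")
print("limit 1 - 1/e =", 1 - 1 / math.e)
```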

Apropos of very little, I’ve actually published using that particular probability calculation.
Here.

3 Likes