company, drawing, pen

More About Probabilities: Negative Logic

There is a class of probability problems that I often see students struggling with. I think of them as having a trapdoor that makes them easy, but without finding it, such problems become hard and tedious. These are often associated with questions that involve some inequality. Anyone trying to solve these should be immediately tipped off when such words appear. For example:

  • “Toss a fair coin 10 times. What is the probability it comes heads at most 8 times?”
  • Draw 6 cards from a deck of cards. What is the probability at least one of them is a Spade?

You can solve them with brute force, but that is going to be a long walk in the woods. Instead, use negative logic, and take the shortcut home. Let’s see that with the above two examples.

Scenario 1: coin tosses

Let’s approach this first with Brute Force. We consider all sequences of 10 coin tosses and note them on the paper as sequences of HT. Equally, you could represent them as values of 1 (say, for Heads) and 0 (for Tails). Therefore, we can have binary strings of 10 bits. This gives 2^10 = 1024 different possible states.

We want to count all the strings that have at most nine 1s. By brute force, we compute:

  • all the strings with zero 1s
  • all the strings with one 1s
  • […]
  • all the strings with nine 1s

and sum the result.

To find the result for each line, we can apply combinatorics. Consider for example how to find the number of all sequences with exactly 4 Heads.

This is a combinatorics problem. We have 10 positions and want to distribute 4 1s in them. Let’s take a look at a simpler problem to gain some intuition. Suppose we have a set of four letters (ABCD, to make this simple) and want to make exactly two of these lower-case. How many modified sequences can we find?

From those 4 positions, we want to pick exactly 2 each time. We get:

  • ABcd
  • AbCd
  • AbcD
  • aBCd
  • aBcD
  • abCD

It’s irrelevant what letter we choose to change. What matters is what position we change. The letter just happens to be there. But in doing so, we have collected all pairs of letters from our initial group of 4. We extracted all different subsets of 2 elements from a set of 4 elements. I’m rephrasing this repeatedly because I want to bring home that this is exactly the same as computing combinations, and the most usual way to look at them is:

Pick groups of k elements out of a set of n elements.

That is what we want to do: to count all the sequences with 4 1s, we take all the groups of 4 positions out of a possible 10, and let them be the only ones with a bit 1; the other positions will have a bit 0.

After this circuitous, but hopefully more intuitive route, we get that the number of sequences with exactly 4 Heads is

\binom{10}{4}.

And so, all we have to do is add these binomial coefficients for k = \{0, \dots, 8\}:

\sum_{k=0}^9 \binom{10}{k} = \sum_{k=0}^8 \frac{10!}{k! (10 - k)!}.

Excited?

Neither am I.

These are all manageable terms, but we are dealing with small numbers. What if we had 100 tosses instead and required at most 98 heads? Would you really do nearly a hundred binomial coefficients and add them together? Remember that \binom{100}{50} is a number with 30 digits.

The point I want to make is that you want to reduce your calculations.

Inequalities and negations usually give you a way to do that: look at the failure side and see if you have less cases to deal with.

The problem I chose was deliberately set to make this extreme. You might say I cheated. I will say I only wanted a clear example. In this case, the trapdoor is this:

  • If a sequence of 10 tosses has at most 8 Heads, then it must have at least 2 Tails.
  • Invalid sequences are those with more than 8 Heads, or just at least 1Tail.

That’s only two cases, and very simple ones.

  • How many sequences have 0 Tails (ie: 10 Heads)?
  • How many sequences have 1 Tail (ie: 9 Heads)?

The answers are:

  • 1: every position is Heads
  • 10: the single Tail can appear at each of the 10 positions.

The probability of this happening is then \frac{11}{1024}.

But this is the probability of failing our initial experiment.

To obtain the probability of success, we use the cornerstone equality of probabilities:

P[FAIL] = 1 – P[SUCCESS]

That is, the answer to the original question, the one that could be obtained by summing all the binomial coefficients, is simply

1 - 11 / 1024 = 1013 / 1024 = 98.9%

Take away:

Check the limits of your inequality: if you have at most k out of n occurrences of some event in a successful state, you have at least (n-k+1) out of n non-occurrences of that event in a failed state. You should pick whichever side has the smaller of (n, n-k+1) to do your calculations. And if this is the failure side, remember to subtract the probability from 1, to get the original requested probability.

Scenario 2: Cards

Given the extensive analysis of the previous case, I think I can try to be more concise for this second one. If we try to follow the lead of the question and find the probability of having a Spade directly, we may try to do this

  • Count all sequences with at exactly one spade, exactly two spades, and so on and sum them all.
  • Count the probability that the first Spade happens in the first position, then in the second and so on, and sum the probabilities.

The first approach is the same strategy analysed above and it’s generally unpleasant, so I’ll avoid it entirely.

The second one seems a bit more interesting, and is in fact an application of compound event strategies as detailed in this post. Actually, it is interesting enough that it merits some attention.

There are 6 sub-events here: given that we want at least a Spade to appear, the first one can be any of the 6 cards we draw. Therefore, we have six events like: “First Spade is drawn in position 1”, “First Spade is drawn in position 2”, and so on. These events are NOT independent. The first Spade can only be drawn in position 2 if it was not drawn in position 1. So let’s proceed systematically.

  • The probability of having a Spade in the first position is 13/52.
  • The probability of having the first Spade in the second position is:
    • the probability of NOT having a Spade in the first: 39/52
    • times the probability of having it in the second position: 13/51
    • total: 39/52 x 13/51.
  • The probability of having the first Spade in the third position is:
    • the probability of NOT having a Spade in the first: 39/52
    • times the probability of NOT having a Spade in the second: 38/51 (notice how we have 1 card less in the deck, and one less non-Spade)
    • times the probability of having it in the third position: 13/50
    • total: 39/52 x 38/51 x 13/50.

The above is enough to give a hint for the general rule. The probability of the first spade appearing in position k is:

  • the probability of not appearing in positions 1 to k-1: \prod_{i=1}^{k-1} \frac{39-(i-1)}{52-(i-1)}
  • times the probability of appearing in position k: \frac{13}{52 - (k-1)}.

And the total probability, now grouping all events, is:

\sum_{k=1}^6 \left(\prod_{i=1}^{k-1} \frac{39-(i-1)}{52-(i-1)}\right) \frac{13}{52-(k-1)}

It’s rather more elegant than computing and summing binomials, and we can probably end up faster, but you won’t be doing yourself any favours going this way.

There is still a better way: using negative logic:

If this leads to fewer computations, find the probability of failure and from there the probability of success.

In this case, the probability of failure is extremely simple. If we don’t have at least one Spade, then it’s because we have no Spade at all. What is the probability of drawing 6 non-Spades in 6 draws?

  • For the first card, 39/52
  • For the second card, 38/51
  • […]
  • For the sixth card, 34 / 47

We multiply all of these to get the probability of failing, and then subtract from 1 to get the probability of success. The final formula is:

1 - \prod_{k=1}^6 \frac{39 - (k-1)}{52 - (k-1)}

This is equivalent to the previous equation with a sum, which is not even obvious at the start. But we got there in a much easier way.

Take Away:

It’s useful to keep in mind there are often several ways to compute a probability. But you should always pick the one with less terms and operations: that usually means a simpler concept justifying the probability, is less tedious and may even scale better. The trick is to find the strategy that has less sub-cases or where the probabilities of each sub-case are easier to compute.

That wraps it for today and for my reflections triggered by Day 2 of the HackerRank’s Statistics tutorial.

Leave a Reply