Probably, We Are Not Independent


“Getting our head around probabilities is very hard for humans. But in high-stakes cases like this, we have to get it right.”

[Humble Pi: A Comedy of Maths Errors, Matt Parker]

Humans are born prediction machines, so we naturally estimate the probability of everything. When amazing things happen in a row, we judge the streak to be improbable. That is because we usually estimate the probability of each event and multiply these probabilities in our heads, which yields a tiny number (close to zero). This is the simple product rule for a joint probability. However, the rule rests on a hidden assumption: all the events are independent (whether one event occurs does not change the probability that another occurs). Not all events in our lives are independent. Some are correlated, some have a cause-and-effect relationship, and some occur in the same environment. So, amazing things may happen in a row with a high probability.
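To make the hidden assumption concrete, here is a minimal sketch that compares the product rule with the true joint probability for two events. The joint table and its numbers are made up for illustration; they are not taken from the book.

```python
from fractions import Fraction

# Hypothetical joint distribution of two "amazing" events A and B.
# Keys are (A happened, B happened); the four values sum to 1.
joint = {
    (True, True): Fraction(8, 100),
    (True, False): Fraction(2, 100),
    (False, True): Fraction(2, 100),
    (False, False): Fraction(88, 100),
}

# Marginal probabilities: sum over the other event.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)

naive = p_a * p_b             # product rule, valid only under independence
actual = joint[(True, True)]  # what the table really says
p_b_given_a = actual / p_a    # conditional probability P(B | A)

print(f"P(A) = {p_a}, P(B) = {p_b}")
print(f"naive P(A) * P(B) = {naive}")
print(f"actual P(A and B) = {actual}")
print(f"P(B | A)          = {p_b_given_a}")
```

In this made-up table each event alone has a 1-in-10 chance, so the product rule says 1 in 100, yet the two events actually occur together 8 times in 100 because they are strongly correlated.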

If you want to predict an extreme outcome with such a joint-probability calculation, check first whether the events are independent. Only a correct probability calculation guards against a catastrophic result. For example, suppose you meet ten people in a row on the street, each taller than 6′4″. You would think that something strange is happening today. But if you met the same people in front of a basketball court, the situation would seem quite reasonable. The same people in a different place change your probability estimate (this is why we sometimes need a conditional probability). Hence, a careful joint-probability calculation can turn seemingly improbable things into probable ones (or vice versa). And even when you calculate the joint probability correctly and the result is very small, keep in mind that improbable things happen all the time.
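The basketball-court example can be sketched with conditional probabilities. The two per-person probabilities below are invented for illustration, and the sketch assumes that encounters are independent once the place is fixed.

```python
# Hypothetical chance that a randomly met person is taller than 6'4",
# given where you are. Both numbers are made up for illustration.
P_TALL_GIVEN_PLACE = {
    "street": 0.01,
    "basketball court": 0.5,
}


def streak_probability(p_single: float, n: int) -> float:
    """Probability of n such people in a row, treating encounters as
    independent once the place is fixed."""
    return p_single ** n


streaks = {
    place: streak_probability(p, 10)
    for place, p in P_TALL_GIVEN_PLACE.items()
}
for place, prob in streaks.items():
    print(f"{place}: {prob:.3g}")
```

Under these assumed numbers, the same ten-person streak is astronomically unlikely on the street but roughly a 1-in-1,000 event at the court.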

I Don’t Count on You When You Count Numbers


“The only downside is that you break the link between the number you are using to keep track of your counting with the number of things you are counting.”

[Humble Pi: A Comedy of Maths Errors, Matt Parker]

Since Adam was a boy, counting has been recognized as one of the most important survival skills for humans. So, I believe that you (even if you are a toddler) can count well. Let’s check. How many numbers can you count on your fingers? The answer is eleven (not ten), because you can also show zero with all fingers folded (a more complete answer is 1,024; search for “finger binary” on Google). Next, how many natural numbers are there from 10 to 99? The answer is 90 (not 89). Hooray, you got the right answers, I count on you!
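These three answers can be checked in a few lines. The helper name count_inclusive is just an illustrative choice.

```python
def count_inclusive(first: int, last: int) -> int:
    """Number of integers from first to last, both ends included."""
    return last - first + 1


FINGERS = 10
unary_values = count_inclusive(0, FINGERS)  # 0 through 10 raised fingers
binary_values = 2 ** FINGERS                # each finger is one bit: 0 through 1023
two_digit_numbers = count_inclusive(10, 99)

print(unary_values, binary_values, two_digit_numbers)
```

The "+ 1" in the helper is the whole trick: subtracting the endpoints counts the gaps between the numbers, not the numbers themselves.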

We get into more trouble when we try to count large numbers efficiently. For example, to count the total number of events when calculating a probability, we use tools such as permutations and combinations. However, these are tricky to apply correctly. So, when you make a decision based on probability (e.g., a Bayesian approach), miscounting the number of events gives a totally different probability and leads to a wrong decision. Please don’t count on yourself when you count numbers (especially when counting sheep to sleep or counting cards to win at blackjack).
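As a rough illustration of how much a miscount matters, the sketch below counts the outcomes of a hypothetical 6-of-45 lottery (an assumed example, not one from the book) both ways, using Python's math.perm and math.comb (Python 3.8+).

```python
import math

NUMBERS, PICKS = 45, 6  # hypothetical 6-of-45 lottery

ordered = math.perm(NUMBERS, PICKS)    # order matters: permutations
unordered = math.comb(NUMBERS, PICKS)  # order ignored: combinations

p_correct = 1 / unordered   # a ticket wins regardless of the draw order
p_miscounted = 1 / ordered  # wrong: treats each ordering as a separate outcome

print(f"combinations: {unordered:,}")
print(f"permutations: {ordered:,}")
print(f"miscounting shrinks the probability by a factor of {ordered // unordered}")
```

Confusing the two counts changes the answer by a factor of 6! = 720, which is more than enough to flip a decision.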

Improbable Things Happen All the Time

“The universe is big, and if you’re sufficiently attuned to amazingly improbable occurrences, you’ll find them. Improbable things happen a lot.”

[How Not to Be Wrong, Jordan Ellenberg]

You have a deck of cards and draw five cards from it. Surprisingly, the five cards you drew are the A, 2, 3, 4, and 5 of spades. (Congrats! You made a straight flush.) You might then suspect that this is a new deck that has not been shuffled yet, because drawing these five cards in a row seems improbable (or at least much less probable). However, improbable things happen all the time. Please go to Las Vegas and check this!
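For a sense of scale, here is a small sketch of the numbers involved. It assumes a fairly shuffled standard 52-card deck and ignores the order in which the cards are drawn.

```python
import math

total_hands = math.comb(52, 5)  # unordered five-card hands

# Any one specific hand, amazing or boring, is equally unlikely.
p_specific_hand = 1 / total_hands

# Straight flushes: 10 possible lowest cards (A through 10) in each of
# 4 suits, counting royal flushes as straight flushes.
straight_flushes = 10 * 4
p_any_straight_flush = straight_flushes / total_hands

print(f"five-card hands: {total_hands:,}")
print(f"P(exactly A-2-3-4-5 of spades) = {p_specific_hand:.2e}")
print(f"P(any straight flush)          = {p_any_straight_flush:.2e}")
```

Every specific hand, including a completely forgettable one, has the same chance of about 1 in 2.6 million. The spade run only feels more suspicious because we notice it.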

When analyzing results, we need to get used to BIG numbers in our fields. Our field of interest is pretty big, so you will see many improbable occurrences (we see lottery winners every week). Hence, we should be careful not to infer causality from a chance occurrence. In data science, even when a data-driven model finds patterns in Big Data, we should examine whether those patterns could have arisen by randomness alone. (It may be improbable that millions of people read this post and like it, but improbable things happen all the time!!)
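A minimal sketch of why big numbers make improbable things routine: the chance that a rare event happens at least once in many independent tries. The 6-of-45 lottery, the ticket count, and the 5% false-positive rate are all assumptions chosen for illustration.

```python
import math


def p_at_least_once(p: float, trials: int) -> float:
    """P(event occurs at least once in `trials` independent tries).

    Equals 1 - (1 - p)**trials, computed via log1p/expm1 so that it
    stays accurate when p is tiny.
    """
    return -math.expm1(trials * math.log1p(-p))


# Hypothetical 6-of-45 lottery with 8 million independently chosen tickets.
p_jackpot = 1 / math.comb(45, 6)
tickets = 8_000_000
p_some_winner = p_at_least_once(p_jackpot, tickets)

# Hypothetical pattern search: 100 candidate patterns tested on pure noise,
# each with a 5% chance of looking "significant" by luck.
alpha, patterns_tested = 0.05, 100
p_spurious_pattern = p_at_least_once(alpha, patterns_tested)

print(f"P(at least one jackpot winner)    = {p_some_winner:.3f}")
print(f"P(at least one spurious pattern)  = {p_spurious_pattern:.3f}")
```

Under these assumptions a given ticket almost never wins, yet somebody wins in most draws, and a search over 100 patterns in pure noise almost always turns up at least one that looks real.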