Improbable Things Happen All the Time

“The universe is big, and if you’re sufficiently attuned to amazingly improbable occurrences, you’ll find them. Improbable things happen a lot.”

[How not to be wrong, Jordan Ellenberg]

You have a deck of cards and draw five cards from it. Surprisingly, the five cards you drew are the A, 2, 3, 4, and 5 of spades. (Congrats! You made a straight flush.) You might then suspect that this is a brand-new deck that has not been shuffled yet, because drawing exactly these five cards seems improbable (or at least very unlikely). However, improbable things happen all the time. Please go to Las Vegas and check this!
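To put rough numbers on the card example, here is a minimal sketch. It assumes a standard 52-card deck and a fair shuffle. The figure of one million dealt hands is a hypothetical stand-in for "a busy casino", not a real statistic.

```python
from math import comb

# Number of distinct five-card hands from a standard 52-card deck.
total_hands = comb(52, 5)  # 2,598,960

# Any one specific hand (A-2-3-4-5 of spades, or a totally "boring" hand)
# has exactly the same probability.
p_specific_hand = 1 / total_hands

# Straight flushes: 10 possible runs per suit (A-5 up to 10-A), 4 suits.
# This count includes the royal flushes.
straight_flushes = 4 * 10
p_straight_flush = straight_flushes / total_hands


def prob_at_least_one(p, n):
    """Probability that an event with per-trial probability p
    happens at least once in n independent trials."""
    return 1 - (1 - p) ** n


# Hypothetical volume: one million independently dealt hands.
hands_dealt = 1_000_000

print(f"P(one specific hand)          = {p_specific_hand:.2e}")
print(f"P(straight flush, one deal)   = {p_straight_flush:.2e}")
print(f"P(at least one in {hands_dealt:,} deals) = "
      f"{prob_at_least_one(p_straight_flush, hands_dealt):.6f}")
```

The point of the sketch: a single deal almost never produces a straight flush, but given enough deals it would be more surprising if one never showed up.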

When analyzing results, we need to get used to BIG numbers in our fields. Our field of interest is pretty big, so we will see many improbable occurrences (we see lottery winners every week). Hence, we should be careful not to infer causality from a chance occurrence. In data science, even when a data-driven model finds some pattern in Big Data, we should examine whether this pattern could have been produced by randomness alone. (It may be improbable that millions of people read this post and like it, but improbable things happen all the time!!)
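Here is a small sketch of how a "pattern" can come out of pure randomness. Everything in it is made up for illustration: the target and the features are independent coin flips, and the function name `best_chance_match` and the sizes (10,000 features, 20 samples) are my own assumptions.

```python
import random


def best_chance_match(n_features, n_samples, seed=0):
    """Generate a random 0/1 target and n_features random 0/1 features,
    all independent of each other, and return the highest fraction of
    samples on which any single feature agrees with the target."""
    rng = random.Random(seed)
    target = [rng.randint(0, 1) for _ in range(n_samples)]
    best = 0
    for _ in range(n_features):
        feature = [rng.randint(0, 1) for _ in range(n_samples)]
        agree = sum(f == t for f, t in zip(feature, target))
        best = max(best, agree)
    return best / n_samples


# One random feature versus ten thousand random features.
print("best match with 1 feature:      ", best_chance_match(1, 20))
print("best match with 10,000 features:", best_chance_match(10_000, 20))
```

With only one random feature, the agreement is usually close to a coin flip. With thousands of them, the best one tends to look like a strong predictor even though, by construction, none of them has anything to do with the target.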

The Past is in the Past: the Law of Large Numbers

“That’s how the Law of Large Numbers works: not by balancing out what’s already happened, but by diluting what’s already happened with new data, until the past is so proportionally negligible that it can safely be forgotten.”

[How not to be wrong, Jordan Ellenberg]

You have a FAIR coin and toss it ten times. Surprisingly, ten heads in a row! Now you have to bet on heads or tails for the next toss. Where do you put your money? Fortunately, you have learned the law of large numbers, which says that the average over a large number of trials gets closer to the expected value. So the next toss will be “tails” to balance things out, right? BUT this is not true: the probability of getting heads or tails is still the same. This misconception is called the “Gambler’s Fallacy”. The law of large numbers CANNOT predict your future.
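If you do not trust the argument, you can check it with a simulation. This is a minimal sketch that assumes a fair coin modeled by Python's pseudo-random generator. The function name `heads_rate_after_streak` and the number of tosses are my own choices.

```python
import random


def heads_rate_after_streak(n_tosses, streak_len, seed=0):
    """Toss a simulated fair coin n_tosses times. Look only at tosses that
    come right after at least streak_len heads in a row, and return
    (fraction of those tosses that were heads, number of such tosses)."""
    rng = random.Random(seed)
    run = 0          # current run of consecutive heads
    followers = 0    # tosses made right after a long-enough streak
    heads_after = 0  # how many of those tosses were heads
    for _ in range(n_tosses):
        is_head = rng.random() < 0.5
        if run >= streak_len:
            followers += 1
            heads_after += is_head
        run = run + 1 if is_head else 0
    rate = heads_after / followers if followers else float("nan")
    return rate, followers


rate, count = heads_rate_after_streak(1_000_000, 10)
print(f"tosses made right after 10+ heads in a row: {count}")
print(f"fraction of those tosses that were heads:   {rate:.3f}")
```

If tails were really "due" after a streak, the printed fraction would sit clearly below 0.5. For a fair coin it should stay close to 0.5, up to sampling noise.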

We often mistakenly believe that previous independent results are strongly related to the next result. Before you think that way, first check whether the previous results are really related to your future decision. If not, please forget about the past. Please don’t draw a wrong causal conclusion from the law of large numbers. Even if the average of the trials so far is far from the expected value, that says nothing about the next trial. As Queen Elsa sings in Frozen’s famous song ‘Let It Go’: “The past is in the past.”
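The "diluting" in Ellenberg's quote is plain arithmetic. As a sketch, assume we start with ten heads out of ten tosses and then keep tossing a fair coin, so each new toss adds half a head on average. The helper name `expected_heads_share` is hypothetical.

```python
from fractions import Fraction


def expected_heads_share(initial_heads, initial_tosses, extra_tosses):
    """Expected overall share of heads after extra_tosses more fair tosses,
    given what has already been observed."""
    expected_heads = initial_heads + Fraction(extra_tosses, 2)
    return expected_heads / (initial_tosses + extra_tosses)


for extra in (0, 10, 100, 1_000, 100_000):
    share = expected_heads_share(10, 10, extra)
    print(f"{extra:>7} more tosses: expected share of heads = {float(share):.4f}")
```

The ten heads never get "paid back" with extra tails. They just matter less and less as the total number of tosses grows, which is why the share drifts toward one half.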

Do You Want to Be a Nonlinear Thinker?

“Nonlinearity is a real thing! … Thinking nonlinearly is crucial, because not all curves are lines.”

[How not to be wrong, Jordan Ellenberg]

Many people want to be nonlinear thinkers, who do not follow a step-by-step progression but try to find solutions outside the box. Hence, the phrase ‘nonlinear thinking’ suggests a somewhat special ability. But in the real world most curves are nonlinear (only a few are lines). That is, becoming a nonlinear thinker (maybe) means being mediocre.

When predicting future behavior from the past (as predictive models in AI do), we should keep in mind that almost no curves are lines. We should always allow for the possibility that the relationship we are predicting is nonlinear. Moreover, if the relationship turns out to be nonlinear, our optimal decision depends on where we already are on the curve. However, a linear prediction has real advantages: it lets us quickly find the pattern in the past and efficiently predict (near-)future behavior, because ALL smooth curves look like lines locally. Hence, a balance of linear and nonlinear thinking is essential in the age of Big Data.
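Both points, "where you are on the curve matters" and "curves look like lines locally", can be seen on a toy example. The curve below is purely hypothetical (it rises, peaks at 0.5, then falls), and the helper names are mine. It is a sketch, not a model of anything real.

```python
def toy_curve(x):
    """Hypothetical nonlinear curve: rises, peaks at x = 0.5, then falls."""
    return x * (1.0 - x)


def slope(f, x, h=1e-6):
    """Numerical slope of f at x (central difference)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def linear_prediction(f, x0, x):
    """Predict f(x) using only the value and slope of f at x0."""
    return f(x0) + slope(f, x0) * (x - x0)


# 1) The best move depends on where we already are on the curve.
for x0 in (0.2, 0.8):
    direction = "increase x" if slope(toy_curve, x0) > 0 else "decrease x"
    print(f"at x = {x0}: slope = {slope(toy_curve, x0):+.2f}, so {direction}")

# 2) A line fitted at x0 = 0.2 works nearby and fails far away.
for x in (0.21, 0.3, 0.9):
    error = abs(linear_prediction(toy_curve, 0.2, x) - toy_curve(x))
    print(f"predicting x = {x} from x0 = 0.2: error = {error:.4f}")
```

The same "increase x" advice that helps on the left side of the peak hurts on the right side, and the straight-line forecast is only trustworthy for the near future.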