Some Numbers to Represent You

“When you create a model from proxies, it is far simpler for people to game it. This is because proxies are easier to manipulate than the complicated reality they represent.”

[Weapons of Math Destruction, Cathy O’Neil]

This is not a story about numerology, though it is somewhat related. Can you believe that a handful of numbers can stand in for your whole life? In data science, we call these numbers “proxies”. To keep a model simple, we often use a few numbers in place of your actual characteristics, which makes it efficient to predict things about you and your future.

An AI model for college admissions may screen applicants using only SAT scores (and the number of completed APs). An AI model for personal loans may use only the five digits of your zip code. However, we should keep in mind that these proxies (numbers) cannot fully represent our lives.
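
To make the zip code example concrete, here is a minimal Python sketch of a proxy-only loan score. Everything in it is hypothetical: the Applicant fields, the ZIP_DEFAULT_RATE table and its values, and the proxy_score function are invented for illustration and do not describe any real lender's model.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    name: str
    zip_code: str          # the proxy: the only thing the model looks at
    on_time_payments: int  # part of the reality the model never sees
    missed_payments: int


# Hypothetical historical default rates per zip code (made-up numbers).
ZIP_DEFAULT_RATE = {"10001": 0.05, "60601": 0.20}


def proxy_score(applicant: Applicant) -> float:
    """Score an applicant using the zip code alone (higher is better)."""
    return 1.0 - ZIP_DEFAULT_RATE[applicant.zip_code]


# Alice and Bob have very different payment histories but share a zip code.
alice = Applicant("Alice", "60601", on_time_payments=48, missed_payments=0)
bob = Applicant("Bob", "60601", on_time_payments=10, missed_payments=12)
# Carol has Bob's payment history but lives in a "better" zip code.
carol = Applicant("Carol", "10001", on_time_payments=10, missed_payments=12)

for person in (alice, bob, carol):
    print(person.name, proxy_score(person))
```

In this sketch, Alice and Bob receive the same score even though their payment histories differ, while Carol outscores Alice purely because of where she lives. The number is easy to compute, and it misses most of the person.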

Make Your Crystal Ball Shine

“These mathematical models were opaque, their workings invisible to all but the highest priests in their domain: mathematicians and computer scientists. Their verdicts, even when wrong or harmful, were beyond dispute or appeal.”

[Weapons of Math Destruction, Cathy O’Neil]

Fortune-tellers show us our future in their crystal balls. We don’t know how the crystal ball works; we simply believe its prediction (or not). In fact, the fortune-tellers don’t know either. The only thing they can do is make their crystal balls shine.

Most predictive models based on machine learning are black boxes, like crystal balls, so we cannot see what happens inside. Some people worry about this opacity, but it may also eliminate prejudice and bias. The one thing we can do is feed these models unbiased and accurate data.