Ai Cheat Sheet

Central Limit Theorem

In many fields, including the natural and social sciences, the normal distribution is commonly used to model a random variable whose true distribution is unknown.

The central limit theorem (CLT) explains why the normal distribution can be used in such cases. According to the CLT, as the size of each sample grows, the distribution of the sample mean tends towards a normal distribution regardless of the shape of the population distribution, provided the observations are independent and the population has a finite variance.
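Stated a little more formally, the claim above can be written as follows. This is the classical i.i.d. form of the theorem; the symbols ($X_i$, $\mu$, $\sigma^2$, $n$) are notation introduced here for illustration and do not appear in the original text.

```latex
% Assumptions: X_1, ..., X_n are independent draws from the same population,
% with mean \mu and finite variance \sigma^2 > 0.
\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i,
\qquad
\frac{\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)}{\sigma}
\;\xrightarrow{\;d\;}\; \mathcal{N}(0, 1)
\quad \text{as } n \to \infty .
```

Equivalently, for large $n$ the sample mean is approximately normal with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$, whatever the shape of the population distribution.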

Suppose we want to learn about the heights of all 20-year-old people in a country. Collecting this data for everyone is almost impossible and certainly not practical. Instead, we take samples of 20-year-old people across the country and calculate the average height in each sample. The CLT states that, as the samples get larger, the sampling distribution of these averages gets close to a normal distribution centered on the population's average height.
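The behaviour described above can be checked with a small simulation. The sketch below is illustrative only: it swaps the height data for a deliberately skewed exponential population (mean 1, standard deviation 1), and the helper names `sample_means` and `skewness`, the sample sizes, and the seed are assumptions made for this example. It uses only the Python standard library.

```python
import math
import random
import statistics


def sample_means(n, num_samples, rng):
    """Return num_samples averages, each computed from n exponential draws."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


def skewness(values):
    """Sample skewness; it is 0 for a symmetric distribution such as the normal."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (1, 5, 30, 200):
    means = sample_means(n, 5000, rng)
    results[n] = {
        "mean": statistics.fmean(means),
        "sd": statistics.stdev(means),
        "expected_sd": 1.0 / math.sqrt(n),  # sigma / sqrt(n) with sigma = 1
        "skew": skewness(means),
    }
    print(n, results[n])
```

If the CLT holds here, the printed skewness should shrink towards 0 as `n` grows, while the spread of the averages should track `1 / sqrt(n)`. A histogram of `means` for each `n` would show the same thing visually.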

10 Must-Know Statistical Concepts for Data Scientists

Medium

Last modified 1yr ago
