The Central Limit Theorem


The sample mean x̄ of a group of observations is an estimate of the population mean 𝜇. Given a sample size n, consider n independent random variables X1, X2, ..., Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean 𝜇 and standard deviation 𝜎. The sample mean is defined to be:

$$\bar{x} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Moreover, by the properties of means and variances of random variables, the mean and variance of the sample mean x̄ are the following:

$$\mu_{\bar{x}} = \mu$$

$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$$
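
Both results follow from the linearity of expectation and, for the variance, from the independence of the observations. A short derivation, writing x̄ for the sample mean and assuming each X_i has mean 𝜇 and variance 𝜎² as stated above:

```latex
\begin{align*}
E[\bar{x}] &= E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
            = \frac{1}{n}\sum_{i=1}^{n} E[X_i]
            = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
            = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}.
\end{align*}
```

The variance step uses independence: the variance of a sum of independent random variables is the sum of their variances.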

 

Although the mean of the distribution of the sample mean x̄ is identical to the mean of the population, its variance is smaller by a factor of n, so the sample mean clusters more tightly around 𝜇 as the sample size grows.
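
The shrinking variance can be checked with a small simulation. The sketch below is illustrative only: the population values `MU` and `SIGMA`, the normal population, the number of trials, and the helper name `sample_mean_stats` are all assumptions chosen for the example, not part of the original notes.

```python
import random
import statistics

MU = 10.0     # hypothetical population mean
SIGMA = 2.0   # hypothetical population standard deviation

def sample_mean_stats(n, trials=5000, seed=0):
    """Simulate `trials` sample means of size n; return their mean and variance."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.fmean(means), statistics.pvariance(means)

# Compare the simulated variance of the sample mean with sigma^2 / n.
for n in (1, 4, 16, 64):
    mean_of_means, var_of_means = sample_mean_stats(n)
    print(f"n={n:2d}  mean={mean_of_means:.3f}  "
          f"variance={var_of_means:.4f}  sigma^2/n={SIGMA**2 / n:.4f}")
```

For each n, the simulated mean should stay close to `MU`, while the simulated variance should track `SIGMA**2 / n`, up to simulation noise.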

 

Distribution of the sample mean

 

When the distribution of the population is normal, the distribution of the sample mean is also normal. For a normal population distribution with mean 𝜇 and standard deviation 𝜎, the distribution of the sample mean is normal, with mean and standard deviation:

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

This result follows from the fact that any linear combination of independent normal random variables is also normally distributed. This means that for two independent normal random variables X and Y and any constants a and b, aX + bY will be normally distributed.
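
As a rough illustration of this property, Python's standard-library `statistics.NormalDist` supports arithmetic that treats its operands as independent normal variables, so aX + bY can be formed directly and compared with a simulation. The parameters of X and Y and the constants a and b below are made-up values for the example.

```python
import math
from statistics import NormalDist

# Hypothetical independent normal variables and constants.
X = NormalDist(mu=1.0, sigma=2.0)
Y = NormalDist(mu=-3.0, sigma=1.5)
a, b = 2.0, -1.0

# NormalDist arithmetic assumes independence:
# mean = a*mu_X + b*mu_Y, variance = a^2*sigma_X^2 + b^2*sigma_Y^2.
combo = a * X + b * Y

# Cross-check by simulation: draw from X and Y separately and combine.
N = 50_000
simulated = [a * x + b * y
             for x, y in zip(X.samples(N, seed=1), Y.samples(N, seed=2))]
fitted = NormalDist.from_samples(simulated)

print(f"theory:     mean={combo.mean:.3f}  stdev={combo.stdev:.3f}")
print(f"simulation: mean={fitted.mean:.3f}  stdev={fitted.stdev:.3f}")
```

The simulation only compares the mean and standard deviation; it does not by itself prove normality, which is the content of the stated result.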

 

The Central Limit Theorem

 

The most important result about sample means is the Central Limit Theorem. Simply stated, this theorem says that for a large enough sample size n, the distribution of the sample mean x̄ will approach a normal distribution. This is true for a sample of independent random variables from any population distribution, as long as the population has a finite standard deviation 𝜎.
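
A simulation can make the theorem concrete for a clearly non-normal population. The sketch below assumes an exponential population with rate 1 (mean 1, standard deviation 1), standardizes each simulated sample mean, and compares the fraction falling at or below -1.5 with the corresponding standard normal probability. The choice of population, the cutoff, the trial count, and the helper name `lower_tail_fraction` are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

MU = 1.0      # mean of an exponential population with rate 1
SIGMA = 1.0   # its standard deviation

def lower_tail_fraction(n, z_cut=-1.5, trials=4000, seed=0):
    """Fraction of standardized sample means that fall at or below z_cut."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.expovariate(1.0) for _ in range(n))
        z = (xbar - MU) / (SIGMA / math.sqrt(n))
        if z <= z_cut:
            hits += 1
    return hits / trials

# Standard normal probability of falling at or below the same cutoff.
target = NormalDist().cdf(-1.5)
for n in (1, 5, 30, 200):
    print(f"n={n:3d}  simulated={lower_tail_fraction(n):.4f}  normal={target:.4f}")
```

For n = 1 the simulated fraction is zero, because an exponential observation is never negative and so cannot lie 1.5 standard deviations below the mean. As n grows, the fraction should move toward the normal value.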

 

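A formal statement of the Central Limit Theorem is the following:


If x̄ is the mean of a random sample X1, X2, ..., Xn of size n from a distribution with a finite mean 𝜇 and a finite positive variance 𝜎², then the distribution of

$$\frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

is N(0,1) in the limit as n approaches infinity. This means that for large n the sample mean x̄ is approximately normally distributed with mean 𝜇 and standard deviation 𝜎/√n:

$$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{(approximately)}$$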
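
In practice, this approximation is what lets probabilities about a sample mean be computed from the normal distribution. The following sketch works one example with made-up numbers (population mean 50, standard deviation 12, sample size 36, threshold 53), none of which come from the original notes, and it assumes n = 36 is large enough for the normal approximation to be reasonable.

```python
import math
from statistics import NormalDist

# Hypothetical population parameters and sample size.
mu, sigma, n = 50.0, 12.0, 36

# Approximate distribution of the sample mean: N(mu, sigma / sqrt(n)).
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

# Probability that the sample mean exceeds 53.
threshold = 53.0
p_exceeds = 1.0 - xbar_dist.cdf(threshold)

# The same calculation through the standardized variable.
z = (threshold - mu) / (sigma / math.sqrt(n))
p_via_z = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(sample mean > {threshold}) is about {p_exceeds:.4f}")
```

Here the standard deviation of the sample mean is 12/√36 = 2, so the threshold of 53 lies 1.5 standard deviations above 𝜇, and both routes give the same probability.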