Why is the sampling distribution of the sample mean approximately normal even when the population is not?

Topic 5.3 The Central Limit Theorem: state and apply the central limit theorem, that the sampling distribution of the sample mean becomes approximately normal as the sample size grows, regardless of the population's shape.

A focused answer to AP Statistics Topic 5.3, on the central limit theorem, why the sample mean's distribution becomes normal as n grows regardless of population shape, the large-sample guideline, and its role in inference, with a worked application.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. What the central limit theorem says
  3. Center, spread, and the role of n
  4. Why the CLT is the engine of inference
  5. Stating and checking the condition
  6. Try this

What this topic is asking

The College Board (Topic 5.3) wants you to state and apply the central limit theorem (CLT): that the sampling distribution of the sample mean becomes approximately normal as the sample size grows, regardless of the shape of the population.

What the central limit theorem says

The striking word is regardless. The population can be skewed, bimodal, or otherwise non-normal, and yet the distribution of its sample means still becomes approximately bell-shaped as $n$ grows. The averaging process smooths out the population's irregular shape: extreme individual values get diluted when many are averaged, so the means pile up roughly symmetrically around $\mu$. This is one of the most important results in statistics, because it makes the normal model applicable far beyond normal populations.
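A small simulation can make this concrete. The sketch below is illustrative only: the exponential population, the sample sizes, the number of repetitions, and the helper name `sample_mean_skewness` are assumptions chosen for the demonstration. It measures the skewness of simulated sample means, where a value near 0 indicates a symmetric distribution.

```python
import random
import statistics


def sample_mean_skewness(n, reps=5000, seed=0):
    """Skewness of `reps` simulated sample means, each from n exponential draws."""
    rng = random.Random(seed)
    # Exponential with rate 1: strongly right-skewed, population mean 1.
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    centre = statistics.fmean(means)
    spread = statistics.pstdev(means)
    # Standardized third moment: 0 for a perfectly symmetric distribution.
    return statistics.fmean(((m - centre) / spread) ** 3 for m in means)


skew_by_n = {n: sample_mean_skewness(n) for n in (2, 10, 50)}
for n, skew in skew_by_n.items():
    print(f"n = {n:2d}: skewness of sample means = {skew:.2f}")
```

Under these assumptions the printed skewness should shrink toward 0 as $n$ grows, which is the averaging effect described above seen numerically.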

Center, spread, and the role of n

Two effects of larger $n$ work together. The CLT makes the shape more normal, and the $\dfrac{\sigma}{\sqrt{n}}$ formula makes the spread smaller. So a bigger sample gives a sample mean that is both more normally distributed and more tightly clustered around the true mean, which is precisely why larger samples give more reliable estimates.
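The shrinking spread can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `standard_error` and the value $\sigma = 4$ are illustrative assumptions.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if sigma < 0 or n < 1:
        raise ValueError("need sigma >= 0 and n >= 1")
    return sigma / math.sqrt(n)


sigma = 4.0  # illustrative population standard deviation
for n in (16, 64, 256):
    se = standard_error(sigma, n)
    print(f"n = {n:3d}: standard deviation of the sample mean = {se:.2f}")
```

Because $n$ sits under a square root, each fourfold increase in sample size only halves the spread (here 1.00, then 0.50, then 0.25).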

Why the CLT is the engine of inference

The central limit theorem is the reason the rest of the course works for means. Inference procedures (confidence intervals and significance tests) rely on knowing the sampling distribution of the statistic, and on that distribution being approximately normal so that z-scores and critical values apply. The CLT provides this for the sample mean whenever $n$ is large enough, without requiring the population to be normal, which would be an impossibly strong demand in practice (real populations of incomes, waiting times, and lifetimes are usually skewed). So when Unit 7 builds a confidence interval for a mean, it leans on the CLT to claim the sample mean is approximately normal; the "large sample size" or "normal population" condition you will check there is exactly the CLT condition. The theorem also explains a practical asymmetry: for a symmetric population, even a small sample's mean is nearly normal, but for a strongly skewed population you need a larger $n$ before the approximation is good. Knowing this lets you judge whether a given sample size is adequate for the population at hand, a judgement the exam frequently asks for.
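The asymmetry between symmetric and skewed populations can be illustrated by asking how often a simulated sample mean lands more than two standard deviations of $\bar{x}$ above $\mu$, and comparing that share with the normal model's prediction. The sketch below is a hedged demonstration, not a prescribed method: the two populations (uniform and exponential), the small sample size $n = 5$, the repetition count, and the helper name `upper_tail_share` are all assumptions.

```python
import math
import random
from statistics import NormalDist, fmean


def upper_tail_share(draw, mu, sigma, n, reps=20000):
    """Share of simulated sample means more than 2 standard errors above mu."""
    cutoff = mu + 2 * sigma / math.sqrt(n)
    hits = sum(fmean(draw() for _ in range(n)) > cutoff for _ in range(reps))
    return hits / reps


rng = random.Random(1)
n = 5  # deliberately small sample size

# Uniform(0, 1): symmetric, mean 1/2, standard deviation sqrt(1/12).
symmetric_tail = upper_tail_share(
    lambda: rng.uniform(0, 1), 0.5, math.sqrt(1 / 12), n
)
# Exponential(rate 1): strongly right-skewed, mean 1, standard deviation 1.
skewed_tail = upper_tail_share(lambda: rng.expovariate(1.0), 1.0, 1.0, n)
# What the normal model predicts for the same tail: P(Z > 2).
normal_tail = 1 - NormalDist().cdf(2)

print(f"normal model:                {normal_tail:.4f}")
print(f"symmetric population, n = {n}: {symmetric_tail:.4f}")
print(f"skewed population, n = {n}:    {skewed_tail:.4f}")
```

With these choices the symmetric population's tail share should sit close to the normal prediction even at $n = 5$, while the skewed population's share should be noticeably larger, so a normal-based probability would understate the upper tail there.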

Stating and checking the condition

On the exam, applying the CLT means stating the condition and then using the normal model. If $n \ge 30$ (or the population is stated to be approximately normal), you may treat the sampling distribution of $\bar{x}$ as approximately normal with mean $\mu$ and standard deviation $\dfrac{\sigma}{\sqrt{n}}$, and proceed with z-score calculations. If $n$ is small and the population is skewed or unknown, you should not assume normality, and the normal-based answer is unjustified. A full-credit response names the CLT, cites the sample size (or population shape) as the justification, gives the correct center and spread, and only then computes the probability. This discipline (justify normality first, then standardize) mirrors Topic 5.2 and is the template for every mean-based inference to come. The CLT is thus both a beautiful theoretical result and a practical permission slip: it tells you when you are allowed to use the normal model on a sample mean.
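The "justify first" step can be written as a tiny decision rule. This is a sketch of the guideline as stated above, not an official College Board procedure: the function name `normal_model_justified`, its parameters, and the wording of the returned reasons are hypothetical.

```python
def normal_model_justified(n, population_approx_normal=False, threshold=30):
    """Return (ok, reason) for treating the sample mean as approximately normal.

    Encodes the guideline: a stated approximately normal population, or
    n >= threshold (30 by convention), justifies the normal model.
    """
    if population_approx_normal:
        return True, "population is stated to be approximately normal"
    if n >= threshold:
        return True, f"n = {n} >= {threshold}, so the CLT applies"
    return False, f"n = {n} < {threshold} and the population is skewed or unknown"


for n, is_normal in [(64, False), (10, False), (10, True)]:
    ok, reason = normal_model_justified(n, population_approx_normal=is_normal)
    print(f"n = {n:2d}: {'use' if ok else 'do not use'} the normal model ({reason})")
```

Only when the rule returns a justification should the z-score calculation follow, which matches the order a full-credit response uses.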

Try this

Q1. State what the central limit theorem guarantees about the sample mean for large $n$. [2 points]

  • Cue. Its sampling distribution is approximately normal regardless of the population's shape, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Q2. A population is strongly skewed and $n = 10$. Can you assume $\bar{x}$ is approximately normal? Explain. [1 point]

  • Cue. No; $n$ is small and the population is skewed, so the CLT approximation is not yet good; a larger sample would be needed.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice). A population is strongly right-skewed. According to the central limit theorem, the sampling distribution of the sample mean for large $n$ is (A) right-skewed like the population (B) approximately normal (C) uniform (D) left-skewed

The correct answer is (B).

The central limit theorem says that for a large enough sample size, the sampling distribution of the sample mean is approximately normal regardless of the population's shape, even a strongly skewed one.

(A) and (D) wrongly assume the sample mean keeps the population's skew; the CLT removes it as $n$ grows. (C) is not implied. Large $n$ makes the sample mean's distribution approximately normal.

AP 2021 (style), 4 marks, Section II (free response). A population of waiting times is strongly right-skewed with mean $\mu = 6$ minutes and standard deviation $\sigma = 4$ minutes. A random sample of $n = 64$ times is taken. (a) Describe the shape, center, and spread of the sampling distribution of the sample mean. (b) Justify the shape using the central limit theorem. (c) Find the probability the sample mean exceeds $7$ minutes.

A 4-point central-limit-theorem question.

(a) (2 points) Center: $\mu_{\bar{x}} = \mu = 6$ minutes; spread: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{4}{\sqrt{64}} = \dfrac{4}{8} = 0.5$ minutes (1 point); shape: approximately normal (1 point).
(b) (1 point) Although the population is right-skewed, $n = 64$ is large, so by the central limit theorem the sampling distribution of $\bar{x}$ is approximately normal.
(c) (1 point) $z = \dfrac{7 - 6}{0.5} = 2$, so $P(\bar{x} > 7) = P(Z > 2) \approx 0.0228$, about $2.3\%$.

Markers reward the correct mean and standard deviation of $\bar{x}$, citing the CLT to justify approximate normality despite skew, and the upper-tail probability.
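The arithmetic in this worked answer can be double-checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the normal model is assumed to apply because $n = 64$ is large.

```python
import math
from statistics import NormalDist

mu, sigma, n = 6.0, 4.0, 64  # values from the question
threshold = 7.0              # minutes

se = sigma / math.sqrt(n)    # standard deviation of the sample mean
z = (threshold - mu) / se    # standardized distance above the mean
# Upper-tail probability under the approximate normal model for x-bar.
p = 1 - NormalDist(mu, se).cdf(threshold)

print(f"standard deviation of x-bar = {se}")
print(f"z = {z}")
print(f"P(x-bar > 7) = {p:.4f}")
```

The computed values agree with parts (a) and (c): a spread of 0.5 minutes, $z = 2$, and an upper-tail probability of about 0.0228.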
