

How is the sampling distribution of the difference between two sample proportions described?

Topic 5.6 Sampling Distributions for Differences in Sample Proportions: describe the mean, standard deviation, and shape of the sampling distribution of the difference between two independent sample proportions, and check the conditions for the normal model.

A focused answer to AP Statistics Topic 5.6, on the mean, standard deviation, and approximately normal shape of the difference between two independent sample proportions, the conditions, and finding probabilities, with full worked calculations.

Generated by Claude Opus 4.8, 10 min answer, updated 2026-06-04

Reviewed by: AI editorial process; not yet individually human-reviewed


Quick answer

For two independent random samples with true proportions $p_1$ and $p_2$ and sizes $n_1$ and $n_2$, the sampling distribution of the difference $\hat{p}_1 - \hat{p}_2$ has:

mean $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$;
standard deviation $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$ (add the two variances, then square-root);
an approximately normal shape when the large-counts condition holds for both samples ($n_1 p_1, n_1(1-p_1), n_2 p_2, n_2(1-p_2)$ all $\ge 10$), the samples are independent, and each is at most $10\%$ of its population.

The standard deviation adds variances because the samples are independent, exactly the combining rule from Topic 4.9.

Jump to a section

What this topic is asking
Center and spread of the difference
Why variances add for a difference
The conditions, doubled
Why this matters for inference
Try this

What this topic is asking

The College Board (Topic 5.6) wants you to describe the mean, standard deviation, and shape of the sampling distribution of the difference between two independent sample proportions $\hat{p}_1 - \hat{p}_2$, and to check the conditions for the normal model.

Center and spread of the difference

The mean of the difference is simply the difference of the individual means, $\mu_{\hat{p}_1} - \mu_{\hat{p}_2} = p_1 - p_2$. The standard deviation is the one to internalise: it adds the two separate variances $\dfrac{p_1(1-p_1)}{n_1}$ and $\dfrac{p_2(1-p_2)}{n_2}$ and then takes the square root. This is a direct use of Topic 4.9's rule that variances add for independent variables, even for a difference.
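The two rules can be sketched as a small helper. This is a minimal illustration, not part of the course materials; the function name `diff_mean_sd` and the input values are assumptions chosen for the example.

```python
from math import sqrt

def diff_mean_sd(p1, n1, p2, n2):
    """Mean and standard deviation of p-hat_1 - p-hat_2 for independent samples."""
    mean = p1 - p2
    # Variances add for independent samples, even though we take a difference.
    variance = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    return mean, sqrt(variance)

# Illustrative values: p1 = 0.60 with n1 = 50, p2 = 0.45 with n2 = 50.
mean, sd = diff_mean_sd(0.60, 50, 0.45, 50)
print(round(mean, 4), round(sd, 4))  # prints 0.15 0.0987
```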

Why variances add for a difference

The recurring trap is to subtract the variances (because it is a difference) or to add or subtract the standard deviations directly. None of these is correct. Variability accumulates when independent quantities are combined regardless of the sign, so the variance of the difference is the sum of the variances, and the standard deviation is the square root of that sum. This is the single most important computational point of the topic.
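One way to convince yourself is to compute the variance of the difference exactly, by enumerating every pair of binomial outcomes, and compare it with the sum and with the (incorrect) difference of the two variances. The sketch below assumes small illustrative sample sizes so the enumeration stays quick; the helper names are hypothetical.

```python
from math import comb

def binom_pmf(n, p):
    """Probabilities P(X = k) for k = 0..n when X is binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def exact_var_of_difference(p1, n1, p2, n2):
    """Exact variance of p-hat_1 - p-hat_2 for independent binomial counts."""
    pmf1, pmf2 = binom_pmf(n1, p1), binom_pmf(n2, p2)
    mean = 0.0
    second_moment = 0.0
    for x1, w1 in enumerate(pmf1):
        for x2, w2 in enumerate(pmf2):
            d = x1 / n1 - x2 / n2   # one possible value of the difference
            weight = w1 * w2        # independence: joint probability is the product
            mean += weight * d
            second_moment += weight * d * d
    return second_moment - mean**2

# Illustrative values only.
p1, n1, p2, n2 = 0.60, 20, 0.45, 25
var1 = p1 * (1 - p1) / n1
var2 = p2 * (1 - p2) / n2

exact = exact_var_of_difference(p1, n1, p2, n2)
print("exact variance of difference:", exact)
print("sum of variances:            ", var1 + var2)
print("difference of variances:     ", var1 - var2)
```

The exact value agrees with the sum of the variances up to floating-point error, and not with their difference.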

The conditions, doubled

The conditions are the same as for a single proportion, but they must hold for both samples, plus an independence condition between the samples. The large-counts condition requires at least about $10$ expected successes and $10$ expected failures in each sample ($n_1 p_1 \ge 10$, $n_1(1-p_1) \ge 10$, and the same for sample 2), which makes each $\hat{p}$ approximately normal so their difference is too. The $10\%$ condition must hold for each sample separately, so that within each, the standard deviation formula is valid. And the two samples must be independent of each other (separate random samples, or two randomly assigned treatment groups), because the add-the-variances formula depends on independence. A complete answer verifies all of these before invoking the normal model. This doubling of conditions is the main way the two-sample topic differs from the one-sample Topic 5.5; the underlying logic (large counts for shape, $10\%$ for the standard deviation) is identical, just applied twice and supplemented by between-sample independence.
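The numeric checks can be sketched in a few lines. The function name and the population sizes below are hypothetical, chosen only for illustration; independence between the samples is a matter of study design and cannot be verified from the numbers, so the sketch does not attempt it.

```python
def check_normal_conditions(p1, n1, p2, n2, pop1, pop2, min_count=10):
    """Check large counts and the 10% condition for both samples.

    Independence between the two samples comes from how the data were
    collected, so it is deliberately not checked here.
    """
    expected_counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    large_counts_ok = all(count >= min_count for count in expected_counts)
    ten_percent_ok = n1 <= 0.10 * pop1 and n2 <= 0.10 * pop2
    return {
        "expected_counts": expected_counts,
        "large_counts": large_counts_ok,
        "ten_percent": ten_percent_ok,
    }

# Hypothetical population sizes (1200 and 900), assumed for illustration.
result = check_normal_conditions(0.60, 50, 0.45, 50, pop1=1200, pop2=900)
print(result)

# A case where large counts fails: 100 * 0.05 = 5 expected successes.
small = check_normal_conditions(0.05, 100, 0.50, 100, pop1=5000, pop2=5000)
print(small["large_counts"])
```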

Why this matters for inference

Topic 5.6 is the sampling-distribution foundation for comparing two proportions, one of the most common inference tasks (Unit 6). A confidence interval for $p_1 - p_2$ is centered at $\hat{p}_1 - \hat{p}_2$ with a width built from this same added-variances standard deviation (as a standard error), and a two-proportion significance test computes a z-score using it. The ability to answer "how likely is a difference this large by chance?" comes directly from knowing that $\hat{p}_1 - \hat{p}_2$ is approximately normal with the mean and standard deviation above. A particularly instructive question type asks for the probability that the difference is negative even when $p_1 > p_2$, which shows that sampling variability can make the second sample proportion exceed the first on a given pair of samples, a reminder that a single observed difference is one draw from a distribution, not the true difference. Working through the center, the added-variances spread, the conditions, and a probability cements the template for two-proportion inference.

Difference of two sample proportions

In school A, $60\%$ of students walk to school ($p_A = 0.60$); in school B, $45\%$ do ($p_B = 0.45$). Independent random samples of $n_A = 50$ and $n_B = 50$ are taken. Let $D = \hat{p}_A - \hat{p}_B$. (a) Find the mean and standard deviation of $D$. (b) Check conditions. (c) Find $P(D > 0.25)$.

step 1 Center and spread (part a)

$\mu_D = p_A - p_B = 0.60 - 0.45 = 0.15$. Add the variances:

$$\sigma_D = \sqrt{\frac{0.60(0.40)}{50} + \frac{0.45(0.55)}{50}} = \sqrt{\frac{0.24}{50} + \frac{0.2475}{50}} = \sqrt{0.0048 + 0.00495} = \sqrt{0.00975} \approx 0.0987.$$

step 2 Check conditions (part b)

Large counts: $n_A p_A = 30$, $n_A(1-p_A) = 20$, $n_B p_B = 22.5$, $n_B(1-p_B) = 27.5$, all $\ge 10$. The samples are independent and each is under $10\%$ of its school's population, so $D$ is approximately normal.

step 3 Standardize and find the area (part c)

$z = \dfrac{0.25 - 0.15}{0.0987} = \dfrac{0.10}{0.0987} \approx 1.01$. So $P(D > 0.25) = P(Z > 1.01) \approx 1 - 0.8438 = 0.1562$, about $15.6\%$.

step 4 Interpret

The difference in walking proportions is approximately normal, centered at $0.15$ with standard deviation about $0.099$. There is roughly a $15.6\%$ chance the observed gap exceeds $0.25$. The standard deviation came from adding the two variances and square-rooting, the essential move for a difference of independent proportions.
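The arithmetic in this worked example can be checked with Python's standard library. This is a sketch under the same normal-model assumption as the example; the variable names are illustrative. Because the code keeps the unrounded $z$, its probability differs slightly in the third decimal place from the table-based $0.1562$, which uses $z$ rounded to $1.01$.

```python
from math import sqrt
from statistics import NormalDist

p_a, n_a = 0.60, 50   # school A
p_b, n_b = 0.45, 50   # school B

mu_d = p_a - p_b
# Add the two variances, then take the square root.
sigma_d = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

z = (0.25 - mu_d) / sigma_d
# Upper-tail area under the approximate normal model for D.
prob = 1 - NormalDist(mu_d, sigma_d).cdf(0.25)

print(f"mean = {mu_d:.2f}, sd = {sigma_d:.4f}")
print(f"z = {z:.2f}, P(D > 0.25) is about {prob:.3f}")
```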

Try this

Q1. Write the standard deviation formula for $\hat{p}_1 - \hat{p}_2$ and state why variances are added. [2 points]

Cue. $\sigma = \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$; variances add because the samples are independent (and add even for a difference).

Q2. List the conditions needed for $\hat{p}_1 - \hat{p}_2$ to be approximately normal. [1 point]

Cue. Large counts ($\ge 10$ expected successes and failures) in both samples, the two samples independent, and each sample at most $10\%$ of its population.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice). For two independent sample proportions, the standard deviation of $\hat{p}_1 - \hat{p}_2$ is found by

(A) subtracting the two standard deviations
(B) adding the two standard deviations
(C) adding the two variances, then taking the square root
(D) averaging the two standard deviations


The correct answer is (C).

For independent random variables, variances add (even for a difference), so $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\sigma_{\hat{p}_1}^2 + \sigma_{\hat{p}_2}^2}$: add the two variances, then square-root.

(A) and (B) wrongly operate on standard deviations directly. (D) is not a valid rule. Adding variances and rooting gives (C).

AP 2022 (style), 4 marks, Section II (free response). In population 1, $p_1 = 0.50$; in population 2, $p_2 = 0.40$. Independent random samples of $n_1 = 100$ and $n_2 = 80$ are taken. Let $D = \hat{p}_1 - \hat{p}_2$. (a) Find the mean and standard deviation of $D$. (b) Check the conditions for normality. (c) Find the probability that $\hat{p}_1 - \hat{p}_2$ is less than $0$, and interpret in context.


A 4-point question on the difference of two proportions.

(a) (2 points) $\mu_D = p_1 - p_2 = 0.50 - 0.40 = 0.10$ (1 point); $\sigma_D = \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}} = \sqrt{\dfrac{0.25}{100} + \dfrac{0.24}{80}} = \sqrt{0.0025 + 0.003} = \sqrt{0.0055} \approx 0.0742$ (1 point).
(b) (1 point) Large counts in each sample: $n_1 p_1 = 50$, $n_1(1-p_1) = 50$, $n_2 p_2 = 32$, $n_2(1-p_2) = 48$, all $\ge 10$; samples independent and each under $10\%$ of its population, so $D$ is approximately normal.
(c) (1 point) $z = \dfrac{0 - 0.10}{0.0742} \approx -1.35$, so $P(D < 0) = P(Z < -1.35) \approx 0.0885$; about an $8.9\%$ chance the first sample proportion is below the second despite $p_1 > p_2$.

Markers reward the mean and the add-the-variances standard deviation, checking large counts in both samples, and the probability with interpretation.
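As an optional check on part (c), the normal-model probability can be compared with an exact calculation that enumerates every pair of binomial counts. This is a sketch rather than anything the exam expects; it assumes the two sample counts are independent binomial variables, and the helper name `binom_pmf` is hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(n, p):
    """Probabilities P(X = k) for k = 0..n when X is binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

p1, n1 = 0.50, 100
p2, n2 = 0.40, 80

# Normal model: mean p1 - p2, standard deviation from the added variances.
sigma_d = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
normal_prob = NormalDist(p1 - p2, sigma_d).cdf(0)

# Exact probability that x1/n1 < x2/n2, written as x1*n2 < x2*n1
# so the comparison uses integers and avoids rounding issues.
pmf1, pmf2 = binom_pmf(n1, p1), binom_pmf(n2, p2)
exact_prob = sum(
    w1 * w2
    for x1, w1 in enumerate(pmf1)
    for x2, w2 in enumerate(pmf2)
    if x1 * n2 < x2 * n1
)

print(f"normal model: {normal_prob:.4f}")
print(f"exact binomial: {exact_prob:.4f}")
```

With large counts satisfied in both samples, the two values should be close, which is what justifies using the normal model in the worked answer.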


Sources & how we know this

AP Statistics Course and Exam Description — College Board (2020)