How do you compute the chi-square statistic and P-value and conclude a goodness-of-fit test?

Topic 8.3 Carrying Out a Chi-Square Test for Goodness of Fit: compute the chi-square statistic from observed and expected counts, find the P-value using k minus 1 degrees of freedom, and state a conclusion in context.

A focused answer to AP Statistics Topic 8.3, on computing the chi-square statistic from observed and expected counts, finding the P-value with k minus 1 degrees of freedom, and stating a conclusion in context, with a full worked test.

Generated by Claude Opus 4.8 · 11 min answer · Updated 2026-06-04

Reviewed by: AI editorial process; not yet individually human-reviewed

What this topic is asking

The College Board (Topic 8.3) wants you to carry out and conclude a goodness-of-fit test: compute the chi-square statistic from observed and expected counts, find the P-value with $k - 1$ degrees of freedom, compare to $\alpha$, and state a conclusion in context.

The chi-square statistic

The statistic is $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where $O$ is a category's observed count and $E$ is its expected count under $H_0$. Each term $(O - E)^2/E$ is large when a category's observed count is far from what the null predicts, and dividing by $E$ scales the discrepancy relative to the count expected there. Summing over all categories gives a single measure of total surprise. A statistic near $0$ means observed matches expected closely (consistent with $H_0$); a large statistic means big discrepancies (evidence against $H_0$). Because each term is a square divided by a positive expected count, $\chi^2$ is always non-negative.
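The sum described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the College Board material: the function name `chi_square_statistic` is hypothetical, and the counts are the spinner data from the free-response practice question in this guide.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must list the same categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Spinner counts from the practice question: red, blue, green, yellow.
spinner_observed = [14, 26, 22, 18]
spinner_expected = [20, 20, 20, 20]

spinner_statistic = chi_square_statistic(spinner_observed, spinner_expected)
perfect_match = chi_square_statistic(spinner_expected, spinner_expected)

print(round(spinner_statistic, 4))  # 4.0, as in the worked answer
print(perfect_match)                # 0.0 when observed equals expected
```

The second call shows the lower bound: when every observed count equals its expected count, every term is zero.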

Degrees of freedom and the upper-tail P-value

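For a goodness-of-fit test with $k$ categories, $df = k - 1$, and the P-value is the area to the right of the observed statistic under the chi-square distribution with that $df$. Unlike $z$ or $t$ tests, a chi-square test is never two-tailed: any departure from the claim (in any direction, across any categories) produces a larger statistic, so all the evidence against $H_0$ lives in the upper tail. There is no "doubling." A larger $\chi^2$ gives a smaller P-value. Read the area from a calculator or chi-square table at $df = k - 1$.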
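On the exam the upper-tail area comes from a calculator or table, but the same area can be reproduced with Python's standard library. The sketch below is an assumption-laden illustration, not an AP-prescribed method: the function name `chi_square_upper_tail` is hypothetical, and it relies on closed-form expressions for the chi-square tail that hold only when $df$ is a positive whole number.

```python
import math


def chi_square_upper_tail(statistic, df):
    """Upper-tail area P(chi-square with df degrees of freedom > statistic).

    Assumes df is a positive whole number, which is always the case for
    goodness-of-fit (df = k - 1).
    """
    if not isinstance(df, int) or df < 1:
        raise ValueError("df must be a positive whole number")
    if statistic <= 0:
        return 1.0  # the whole distribution lies to the right of 0
    half = statistic / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^j / j! for j = 0 .. df/2 - 1
        series = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * series
    # Odd df: erfc(sqrt(x/2)) plus exp(-x/2) * sum of
    # (x/2)^(j - 1/2) / Gamma(j + 1/2) for j = 1 .. (df - 1)/2
    series = sum(
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


p_small_statistic = chi_square_upper_tail(4.0, 3)
p_large_statistic = chi_square_upper_tail(7.815, 3)

print(round(p_small_statistic, 3))  # about 0.261
print(round(p_large_statistic, 3))  # about 0.05 (7.815 is the table critical value)
```

The two calls illustrate the point made above: moving the statistic further into the right tail shrinks the P-value, and nothing is doubled.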

Decision and conclusion

Compare the P-value to $\alpha$. If P-value $\le \alpha$, reject $H_0$: there is convincing evidence the true distribution differs from the claimed one. If P-value $> \alpha$, fail to reject $H_0$: there is not convincing evidence of a difference, and the data are consistent with the claimed distribution. The conclusion states the decision, ties it to the P-value-versus-$\alpha$ comparison, and interprets in context ("there is convincing evidence the distribution of ... is not as claimed").

A rejected goodness-of-fit test says the distribution differs somewhere, but does not by itself say which category drives it. A common follow-up is to examine the individual $(O - E)^2/E$ contributions: the largest terms show where the discrepancy lies.
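The decision rule and the contribution follow-up can be sketched as below. The helper names `conclude` and `ranked_contributions` are hypothetical, and the counts and P-value are taken from the candy example worked through in this guide. In that example the test does not reject $H_0$, so the ranking is shown only to illustrate the mechanics of the follow-up.

```python
def conclude(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 exactly when P-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


def ranked_contributions(labels, observed, expected):
    """Pair each category with its (O - E)^2 / E term, largest first."""
    terms = [
        (label, (o - e) ** 2 / e)
        for label, o, e in zip(labels, observed, expected)
    ]
    return sorted(terms, key=lambda pair: pair[1], reverse=True)


labels = ["brown", "red", "yellow", "green"]
observed = [54, 48, 32, 66]
expected = [60, 40, 40, 60]

decision = conclude(0.221)  # P-value reported in the candy example
ranking = ranked_contributions(labels, observed, expected)

print(decision)
for label, term in ranking:
    print(label, round(term, 2))
```

Here red and yellow contribute the most to the statistic, so those are the categories a discussion of "where the discrepancy lies" would focus on.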

Complete goodness-of-fit test

A candy company claims its bags contain colors in the proportions $30\%$ brown, $20\%$ red, $20\%$ yellow, $30\%$ green. A random sample of $200$ candies gives brown $54$, red $48$, yellow $32$, green $66$. Test at $\alpha = 0.05$ whether the true distribution differs from the claim.

Step 1: Hypotheses and conditions

$H_0$: the true color distribution is $30\%, 20\%, 20\%, 30\%$. $H_a$: it is not as claimed (at least one proportion differs). Random sample assumed; expected counts checked below (all $\ge 5$).

Step 2: Expected counts

$E_{\text{brown}} = 0.30(200) = 60$, $E_{\text{red}} = 0.20(200) = 40$, $E_{\text{yellow}} = 0.20(200) = 40$, $E_{\text{green}} = 0.30(200) = 60$. (Sum $= 200$.) All $\ge 5$: condition met.

Step 3: Chi-square statistic

$$\chi^2 = \frac{(54-60)^2}{60} + \frac{(48-40)^2}{40} + \frac{(32-40)^2}{40} + \frac{(66-60)^2}{60}$$

$$= \frac{36}{60} + \frac{64}{40} + \frac{64}{40} + \frac{36}{60} = 0.6 + 1.6 + 1.6 + 0.6 = 4.4, \qquad df = 4 - 1 = 3.$$

Step 4: P-value

Upper tail: $\text{P-value} = P(\chi^2_3 > 4.4) \approx 0.221$.

Step 5: Conclusion in context

Since P-value $0.221 > 0.05 = \alpha$, fail to reject $H_0$. There is not convincing evidence that the true color distribution differs from the company's claimed $30\%, 20\%, 20\%, 30\%$; the observed counts are consistent with the claim.
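The five steps can be checked end to end with a short script. This is a sketch under stated assumptions rather than a required method: the variable names are illustrative, and the P-value line uses a closed-form expression for the chi-square upper tail that is valid only for $df = 3$, which is the case here.

```python
import math

# Claimed proportions and observed counts from the candy example.
claimed = {"brown": 0.30, "red": 0.20, "yellow": 0.20, "green": 0.30}
observed = {"brown": 54, "red": 48, "yellow": 32, "green": 66}
alpha = 0.05

# Step 2: expected count = claimed proportion * sample size.
n = sum(observed.values())
expected = {color: p * n for color, p in claimed.items()}
large_counts_ok = all(e >= 5 for e in expected.values())

# Step 3: chi-square statistic and degrees of freedom.
statistic = sum(
    (observed[color] - expected[color]) ** 2 / expected[color]
    for color in observed
)
df = len(observed) - 1

# Step 4: upper-tail area. This formula is specific to df = 3 (assumption).
assert df == 3
p_value = (
    math.erfc(math.sqrt(statistic / 2))
    + math.sqrt(2 * statistic / math.pi) * math.exp(-statistic / 2)
)

# Step 5: decision.
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(large_counts_ok, round(statistic, 2), df, round(p_value, 3), decision)
```

The script reproduces the values in the worked test: a statistic of $4.4$ on $3$ degrees of freedom, a P-value near $0.221$, and a decision not to reject $H_0$. The conclusion in context still has to be written in words.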

Try this

Q1. Write the chi-square statistic formula and the goodness-of-fit degrees of freedom for $k$ categories. [2 points]

Cue. $\chi^2 = \sum \dfrac{(O - E)^2}{E}$; $df = k - 1$.

Q2. Why is a chi-square P-value always an upper-tail area? [1 point]

Cue. Any departure from the claim increases $\chi^2$, so all evidence against $H_0$ is in the right tail; the test is never two-sided.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2019 (style), 1 mark, Section I (multiple choice). A goodness-of-fit test on $6$ categories gives a chi-square statistic. The degrees of freedom are

(A) $6$
(B) $5$
(C) $4$
(D) depends on $n$

The correct answer is (B).

For a goodness-of-fit test with $k$ categories, $df = k - 1 = 6 - 1 = 5$. The degrees of freedom do not depend on the sample size $n$.

(A) uses $k$ instead of $k - 1$. (C) subtracts $2$ instead of $1$, giving $k - 2 = 4$. (D) is wrong: $df$ depends on the number of categories, not $n$.

AP 2022 (style), 4 marks, Section II (free response). A spinner is claimed to land on its four colors equally. In $80$ spins the counts are red $14$, blue $26$, green $22$, yellow $18$. Test at $\alpha = 0.05$ whether the spinner is fair. State hypotheses, give expected counts, compute the chi-square statistic and P-value (use $df = 3$), and state and justify a conclusion in context.

A 4-point goodness-of-fit test.

(1) (1 point) $H_0$: the four colors are equally likely ($p = 0.25$ each); $H_a$: the distribution is not equal. Random/independent assumed; expected counts all $\ge 5$.
(2) (1 point) Expected count each $= \dfrac{1}{4}(80) = 20$.
(3) (1 point) $\chi^2 = \dfrac{(14-20)^2}{20} + \dfrac{(26-20)^2}{20} + \dfrac{(22-20)^2}{20} + \dfrac{(18-20)^2}{20} = \dfrac{36 + 36 + 4 + 4}{20} = \dfrac{80}{20} = 4.0$, $df = 3$. P-value $= P(\chi^2_3 > 4.0) \approx 0.261$.
(4) (1 point) Since P-value $0.261 > 0.05$, fail to reject $H_0$. There is not convincing evidence that the spinner is unfair; the observed counts are consistent with an equal distribution.

Markers reward the expected counts, the chi-square sum, the P-value with $df = 3$, and a contextual conclusion.
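As a sanity check on the P-value, the spinner can also be simulated. This is a different technique from the chi-square table or calculator method the exam expects: it estimates, by brute force, how often a fair spinner produces counts at least as far from $20$ each as the observed ones. The seed, the number of trials, and all names below are arbitrary illustrative choices.

```python
import random

observed = [14, 26, 22, 18]   # red, blue, green, yellow
n = sum(observed)             # 80 spins
k = len(observed)             # 4 colors
expected_each = n // k        # 20 per color under H0


def squared_deviation_total(counts):
    """Sum of (O - E)^2. Dividing by the common E would give chi-square."""
    return sum((c - expected_each) ** 2 for c in counts)


# Working with whole-number totals avoids floating-point ties at exactly 4.0.
observed_total = squared_deviation_total(observed)  # 80, i.e. chi-square 4.0

rng = random.Random(2022)     # fixed seed so the run is repeatable
trials = 10_000
at_least_as_extreme = 0
for _ in range(trials):
    spins = rng.choices(range(k), k=n)             # one fair 80-spin session
    counts = [spins.count(color) for color in range(k)]
    if squared_deviation_total(counts) >= observed_total:
        at_least_as_extreme += 1

simulated_p = at_least_as_extreme / trials
print(simulated_p)
```

The simulated proportion should land in the neighborhood of the chi-square P-value of $0.261$ and lead to the same decision. It will not match exactly, because the simulation is random and the chi-square curve is only an approximation to the distribution of the statistic.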

Sources & how we know this

AP Statistics Course and Exam Description — College Board (2020)