How do you compute the chi-square statistic and P-value and conclude a goodness-of-fit test?

Topic 8.3 Carrying Out a Chi-Square Test for Goodness of Fit: compute the chi-square statistic from observed and expected counts, find the P-value using k minus 1 degrees of freedom, and state a conclusion in context.

A focused answer to AP Statistics Topic 8.3, on computing the chi-square statistic from observed and expected counts, finding the P-value with k minus 1 degrees of freedom, and stating a conclusion in context, with a full worked test.

Generated by Claude Opus 4.811 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. The chi-square statistic
  3. Degrees of freedom and the upper-tail P-value
  4. Decision and conclusion
  5. Try this

What this topic is asking

The College Board (Topic 8.3) wants you to carry out and conclude a goodness-of-fit test: compute the chi-square statistic from observed and expected counts, find the P-value with $k - 1$ degrees of freedom, compare it to $\alpha$, and state a conclusion in context.

The chi-square statistic

The statistic is $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, summed over all categories, where $O$ is a category's observed count and $E$ is the count expected under $H_0$. Each term $(O - E)^2/E$ is large when a category's observed count is far from what the null predicts, and dividing by $E$ scales the discrepancy relative to the size expected there. Summing over all categories gives a single measure of total surprise. A statistic near $0$ means observed matches expected closely (consistent with $H_0$); a large statistic means big discrepancies (evidence against $H_0$). Because the terms are squared, $\chi^2$ is always non-negative.
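The sum described above can be sketched in a few lines of Python. This is an illustration only: the function name `chi_square_statistic` and the three-category counts are assumptions made up for the example, not data from this page.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustrative data only).
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi_square_statistic(observed, expected)
print(f"chi-square statistic = {stat:.3f}")
```

When observed and expected counts agree exactly, every term is zero and the function returns $0$, matching the "statistic near $0$" reading above.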

Degrees of freedom and the upper-tail P-value

Unlike $z$ or $t$ tests, chi-square is never two-tailed: any departure from the claim (in any direction, across any categories) produces a larger statistic, so all the evidence against $H_0$ lives in the upper tail. There is no "doubling." A larger $\chi^2$ gives a smaller P-value. Read the area from a calculator or chi-square table at $df = k - 1$.
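To check a table or calculator value, the upper-tail area can be computed with Python's standard library alone. The sketch below is one possible approach, not something the exam requires. The name `chi2_upper_tail` is an assumption for illustration, and the function relies on the standard recurrence for the regularized upper incomplete gamma function, so it only handles positive integer degrees of freedom.

```python
import math


def chi2_upper_tail(stat, df):
    """Upper-tail area P(chi-square with df degrees of freedom > stat).

    Assumes df is a positive integer. Uses Q(a + 1, y) = Q(a, y)
    + y**a * exp(-y) / gamma(a + 1), with a = df / 2 and y = stat / 2.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if stat <= 0:
        return 1.0  # the whole distribution lies above zero

    y = stat / 2.0
    if df % 2 == 1:
        a = 0.5
        q = math.erfc(math.sqrt(y))  # exact tail for df = 1
    else:
        a = 1.0
        q = math.exp(-y)             # exact tail for df = 2

    # Step a up by 1 until it reaches df / 2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# A statistic of 4.0 with k = 4 categories, so df = k - 1 = 3.
p_value = chi2_upper_tail(4.0, 3)
print(f"P-value = {p_value:.3f}")
```

Increasing `stat` while holding `df` fixed always shrinks the returned area, which is the "larger $\chi^2$ gives a smaller P-value" relationship stated above.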

Decision and conclusion

Compare the P-value to $\alpha$. If P-value $\le \alpha$, reject $H_0$: there is convincing evidence the true distribution differs from the claimed one. If P-value $> \alpha$, fail to reject $H_0$: the data are consistent with the claimed distribution. The conclusion states the decision, ties it to the comparison of the P-value with $\alpha$, and interprets in context ("there is convincing evidence the distribution of ... is not as claimed"). A rejected goodness-of-fit test says the distribution differs somewhere, but does not by itself say which category drives it; examining the individual $(O - E)^2/E$ contributions (the largest terms) is how you discuss where the discrepancy lies, a common follow-up.
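The decision rule and the follow-up look at individual contributions can be sketched as below. The category labels and counts are hypothetical, and the helper names `contributions` and `decide` are assumptions for illustration. With three categories $df = 2$, and for that case only the upper-tail area has the exact closed form $e^{-\chi^2/2}$, which keeps the example free of external libraries.

```python
import math


def contributions(observed, expected):
    """Per-category terms (O - E)^2 / E, in the same order as the inputs."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


def decide(p_value, alpha=0.05):
    """Reject H0 when the P-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical three-category data (illustrative only).
labels = ["A", "B", "C"]
observed = [25, 50, 25]
expected = [40, 40, 20]

terms = contributions(observed, expected)
stat = sum(terms)

# Exact upper-tail area for df = 2 only: P(chi-square_2 > x) = exp(-x / 2).
p_value = math.exp(-stat / 2)

decision = decide(p_value, alpha=0.05)
largest_term, largest_label = max(zip(terms, labels))

print(f"chi-square = {stat:.3f}, P-value = {p_value:.4f}, decision: {decision}")
print(f"largest contribution: category {largest_label} ({largest_term:.3f})")
```

Printing the terms alongside the decision mirrors the written answer: the decision comes from the P-value, and the largest term points to the category that departs most from the claim.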

Try this

Q1. Write the chi-square statistic formula and the goodness-of-fit degrees of freedom for $k$ categories. [2 points]

  • Cue. $\chi^2 = \sum \dfrac{(O - E)^2}{E}$; $df = k - 1$.

Q2. Why is a chi-square P-value always an upper-tail area? [1 point]

  • Cue. Any departure from the claim increases $\chi^2$, so all evidence against $H_0$ is in the right tail; the test is never two-sided.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2019 (style), 1 mark, Section I (multiple choice). A goodness-of-fit test on $6$ categories gives a chi-square statistic. The degrees of freedom are (A) $6$ (B) $5$ (C) $4$ (D) depends on $n$

The correct answer is (B).

For a goodness-of-fit test with $k$ categories, $df = k - 1 = 6 - 1 = 5$. The degrees of freedom do not depend on the sample size $n$.

(A) uses $k$ instead of $k - 1$. (C) subtracts $2$ from $k$, but only one degree of freedom is lost, because the counts must sum to $n$. (D) is wrong: $df$ depends on the number of categories, not $n$.

AP 2022 (style), 4 marks, Section II (free response). A spinner is claimed to land on its four colors equally often. In $80$ spins the counts are red $14$, blue $26$, green $22$, yellow $18$. Test at $\alpha = 0.05$ whether the spinner is fair. State hypotheses, give expected counts, compute the chi-square statistic and P-value (use $df = 3$), and state a conclusion justified in context.

A 4-point goodness-of-fit test.

(1) (1 point) $H_0$: the four colors are equally likely ($p = 0.25$ each); $H_a$: at least one color's probability differs from $0.25$. Conditions: the spins are assumed random and independent, and all expected counts are $\ge 5$.
(2) (1 point) Expected count for each color $= \dfrac{1}{4}(80) = 20$.
(3) (1 point) $\chi^2 = \dfrac{(14-20)^2}{20} + \dfrac{(26-20)^2}{20} + \dfrac{(22-20)^2}{20} + \dfrac{(18-20)^2}{20} = \dfrac{36 + 36 + 4 + 4}{20} = \dfrac{80}{20} = 4.0$, with $df = 3$. P-value $= P(\chi^2_3 > 4.0) \approx 0.261$.
(4) (1 point) Since the P-value $0.261 > 0.05$, fail to reject $H_0$. There is not convincing evidence that the spinner is unfair; the observed counts are consistent with an equal distribution.

Markers reward the expected counts, the chi-square sum, the P-value with $df = 3$, and a contextual conclusion.
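The arithmetic in the spinner answer can be checked with a short script. This is a verification sketch, not part of what an exam response needs. The variable names are assumptions, and the P-value line uses a closed form that holds only for $df = 3$: $P(\chi^2_3 > x) = \operatorname{erfc}\!\left(\sqrt{x/2}\right) + \sqrt{2x/\pi}\, e^{-x/2}$.

```python
import math

# Observed spinner counts from the question.
counts = {"red": 14, "blue": 26, "green": 22, "yellow": 18}

n = sum(counts.values())          # total spins
expected = n / len(counts)        # equal share per color under H0
df = len(counts) - 1              # k - 1 degrees of freedom

# Chi-square statistic: sum of (O - E)^2 / E over the four colors.
stat = sum((o - expected) ** 2 / expected for o in counts.values())

# Upper-tail area; this closed form is valid only when df = 3.
p_value = (math.erfc(math.sqrt(stat / 2))
           + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))

alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"n = {n}, expected per color = {expected:.0f}, df = {df}")
print(f"chi-square = {stat:.1f}, P-value = {p_value:.3f}, decision: {decision}")
```

For a different number of categories the P-value line would need a general chi-square tail function or a calculator, since the closed form here is specific to three degrees of freedom.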
