How do you compute the chi-square statistic and P-value and conclude a goodness-of-fit test?
Topic 8.3 Carrying Out a Chi-Square Test for Goodness of Fit: compute the chi-square statistic from observed and expected counts, find the P-value using k minus 1 degrees of freedom, and state a conclusion in context.
A focused answer to AP Statistics Topic 8.3, on computing the chi-square statistic from observed and expected counts, finding the P-value with k minus 1 degrees of freedom, and stating a conclusion in context, with a full worked test.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 8.3) wants you to carry out and conclude a goodness-of-fit test: compute the chi-square statistic from observed and expected counts, find the P-value with $k - 1$ degrees of freedom, compare it to $\alpha$, and state a conclusion in context.
The chi-square statistic
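The statistic adds up, over all categories, the squared difference between the observed count $O$ and the expected count $E$, divided by the expected count:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Each term is large when a category's observed count is far from what the null predicts, and dividing by $E$ scales the discrepancy relative to the size expected there. Summing over all categories gives a single measure of total surprise. A statistic near $0$ means observed matches expected closely (consistent with $H_0$); a large statistic means big discrepancies (evidence against $H_0$). Because each difference is squared and each expected count is positive, $\chi^2$ is always non-negative.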
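As a minimal sketch of the sum described above, the Python below computes the statistic for a made-up four-category example. The function name `chi_square_statistic` and the counts are illustrative assumptions, not data from this page.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one count per category")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts (assumption for illustration): 100 trials, 4 categories,
# and a null hypothesis that claims probability 0.25 for each category.
observed = [30, 20, 25, 25]
expected = [100 * 0.25] * 4  # 25 expected in each category

statistic = chi_square_statistic(observed, expected)
print(statistic)  # 2.0
```

Only the first two categories contribute here (each adds 25/25 = 1); the last two match their expected counts exactly and add nothing.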
Degrees of freedom and the upper-tail P-value
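With $k$ categories, the statistic has $df = k - 1$ degrees of freedom, and the P-value is the area to the right of the observed statistic under that chi-square distribution. Unlike $z$ or $t$ tests, chi-square is never two-tailed: any departure from the claim (in any direction, across any categories) produces a larger statistic, so all the evidence against $H_0$ lives in the upper tail. There is no "doubling." A larger $\chi^2$ gives a smaller P-value. Read the area from a calculator or chi-square table at $df = k - 1$.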
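To show what "upper-tail area" means numerically, here is a hedged sketch that uses only Python's standard library. It evaluates the chi-square upper tail through a series for the regularized incomplete gamma function; the function name `chi_square_upper_tail` is an assumption made for this illustration, and on the exam you would use a calculator or table instead.

```python
import math


def chi_square_upper_tail(statistic, df):
    """Area to the right of `statistic` under a chi-square curve with `df` degrees of freedom."""
    if statistic <= 0:
        return 1.0  # the whole distribution lies to the right of 0
    a = df / 2.0
    x = statistic / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x):
    # P(a, x) = x^a e^(-x) / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# The same degrees of freedom with a larger statistic gives a smaller P-value.
for stat in (2.0, 4.0, 8.0):
    print(stat, round(chi_square_upper_tail(stat, df=3), 4))
```

The series is adequate for the moderate statistics seen in classroom problems; for very large statistics a dedicated statistics library would be more accurate.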
Decision and conclusion
Compare the P-value to $\alpha$. If P-value $\leq \alpha$, reject $H_0$: there is convincing evidence the true distribution differs from the claimed one. If P-value $> \alpha$, fail to reject $H_0$: the data are consistent with the claimed distribution. The conclusion states the decision, ties it to the P-value-versus-$\alpha$ comparison, and interprets in context ("there is convincing evidence the distribution of ... is not as claimed"). A rejected goodness-of-fit test says the distribution differs somewhere, but does not by itself say which category drives it; examining the individual contributions (the largest $\frac{(O - E)^2}{E}$ terms) is how you discuss where the discrepancy lies, a common follow-up.
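The decision rule and the follow-up look at individual contributions can be sketched as below. The helper names `decide` and `contributions`, the category labels, the counts, and the rule "reject when the P-value is at most alpha" are assumptions made for this illustration.

```python
def decide(p_value, alpha):
    """Formal decision: compare the P-value to the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


def contributions(observed, expected):
    """Per-category terms (O - E)^2 / E; the largest ones show where the misfit lies."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical data (assumption): 100 observations, four equally likely categories.
labels = ["A", "B", "C", "D"]
observed = [40, 20, 22, 18]
expected = [25, 25, 25, 25]

terms = contributions(observed, expected)
largest = max(zip(terms, labels))[1]  # label of the biggest contribution

for label, term in zip(labels, terms):
    print(label, round(term, 2))
print("largest contribution:", largest)
print(decide(p_value=0.03, alpha=0.05))  # the P-value here is a made-up input
```

In this made-up data, category A accounts for most of the statistic, so a written conclusion would point to A as the main source of the discrepancy.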
Try this
Q1. Write the chi-square statistic formula and the goodness-of-fit degrees of freedom for $k$ categories. [2 points]
- Cue. $\chi^2 = \sum \frac{(O - E)^2}{E}$; $df = k - 1$.
Q2. Why is a chi-square P-value always an upper-tail area? [1 point]
- Cue. Any departure from the claim increases $\chi^2$, so all evidence against $H_0$ is in the right tail; the test is never two-sided.
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2019 (style). 1 mark. Section I (multiple choice). A goodness-of-fit test on categories gives a chi-square statistic. The degrees of freedom are (A) (B) (C) (D) depends on $n$
The correct answer is (B).
For a goodness-of-fit test with $k$ categories, $df = k - 1$. The degrees of freedom do not depend on the sample size $n$.
(A) uses $k$ instead of $k - 1$. (C) is incorrect. (D) is wrong: $df$ depends on the number of categories, not $n$.
AP 2022 (style). 4 marks. Section II (free response). A spinner is claimed to land on its four colors equally often. In spins the counts are red , blue , green , yellow . Test at whether the spinner is fair. State hypotheses, give expected counts, compute the chi-square statistic and P-value (use $df = 3$), and conclude in context.
A 4-point goodness-of-fit test.
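(1) (1 point) $H_0$: the four colors are equally likely ($p = 0.25$ each); $H_a$: the colors are not all equally likely. Random/independent assumed; expected counts all at least $5$.
(2) (1 point) Expected count for each color: one quarter of the total number of spins.
(3) (1 point) $\chi^2 = \sum \frac{(O - E)^2}{E}$ summed over the four colors, $df = 4 - 1 = 3$. The P-value is the upper-tail area to the right of the computed statistic.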
(4) (1 point) Since P-value $> \alpha$, fail to reject $H_0$. There is not convincing evidence that the spinner is unfair; the observed counts are consistent with an equal distribution.
Markers reward the expected counts, the chi-square sum, the P-value with $df = 3$, and a contextual conclusion.
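For readers who want to see the four marked steps run end to end, here is a hedged sketch for a four-color spinner. The counts, the 80 spins, and $\alpha = 0.05$ are hypothetical values chosen for this illustration and are not the figures from the question above; `upper_tail_df3` is an assumed helper that uses the closed form of the chi-square upper tail for exactly 3 degrees of freedom.

```python
import math


def upper_tail_df3(statistic):
    """Upper-tail area for a chi-square distribution with 3 degrees of freedom only."""
    if statistic <= 0:
        return 1.0
    half = statistic / 2.0
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * statistic / math.pi) * math.exp(-half)


# Hypothetical spinner data (assumption, not the question's counts).
counts = {"red": 26, "blue": 18, "green": 22, "yellow": 14}
alpha = 0.05  # assumed significance level

# Step 1: H0 says each color has probability 0.25; Ha says not all are 0.25.
n = sum(counts.values())

# Step 2: expected count per color, plus the large-counts check.
expected = n * 0.25
large_counts_ok = expected >= 5

# Step 3: chi-square statistic with df = 4 - 1 = 3, then the upper-tail P-value.
statistic = sum((o - expected) ** 2 / expected for o in counts.values())
p_value = upper_tail_df3(statistic)

# Step 4: decision, to be restated in the context of the spinner.
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"expected per color = {expected}, large counts ok = {large_counts_ok}")
print(f"chi-square = {statistic:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```

With these made-up counts the statistic is about 4.0 and the P-value is about 0.26, so the sketch ends with "fail to reject H0": the hypothetical spins are consistent with a fair spinner.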
Related dot points
- Topic 8.2 Setting Up a Chi-Square Goodness of Fit Test: state the hypotheses for a goodness-of-fit test, compute expected counts from a claimed distribution, and verify the conditions.
A focused answer to AP Statistics Topic 8.2, on stating the hypotheses for a goodness-of-fit test, computing expected counts from a claimed distribution, and checking the random, large-counts (expected at least 5), and 10% conditions.
- Topic 8.1 Introducing Statistics: Are My Results Unexpected?: explain why comparing observed counts across several categories to expected counts motivates the chi-square family of tests.
A focused answer to AP Statistics Topic 8.1, on why comparing observed counts across several categories to expected counts motivates chi-square tests, extending proportion inference to variables with more than two categories.
- Topic 8.6 Carrying Out a Chi-Square Test for Homogeneity or Independence: compute the chi-square statistic from a two-way table, find the P-value using (rows minus 1)(columns minus 1) degrees of freedom, and state a conclusion in context.
A focused answer to AP Statistics Topic 8.6, on computing the chi-square statistic from a two-way table, finding the P-value with (r minus 1)(c minus 1) degrees of freedom, and stating a conclusion in context, with a full worked test.
- Topic 8.4 Expected Counts in Two-Way Tables: compute the expected count for each cell of a two-way table under the null hypothesis using the row total times column total divided by the grand total.
A focused answer to AP Statistics Topic 8.4, on computing expected counts in a two-way table under the null of no association, using row total times column total over the grand total, and why this formula encodes independence.
- Topic 6.5 Interpreting P-Values: define the P-value as the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed, and interpret it in context.
A focused answer to AP Statistics Topic 6.5, on defining the P-value as the probability under the null of a result at least as extreme as observed, interpreting small and large P-values, and avoiding common misreadings, with a worked interpretation.
Sources & how we know this
- AP Statistics Course and Exam Description — College Board (2020)