
How do you state the hypotheses, compute expected counts, and check conditions for a chi-square goodness-of-fit test?

Topic 8.2 Setting Up a Chi-Square Goodness of Fit Test: state the hypotheses for a goodness-of-fit test, compute expected counts from a claimed distribution, and verify the conditions.

A focused answer to AP Statistics Topic 8.2, on stating the hypotheses for a goodness-of-fit test, computing expected counts from a claimed distribution, and checking the random, large-counts (expected at least 5), and 10% conditions.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this topic is asking

The College Board (Topic 8.2) wants you to set up a chi-square goodness-of-fit test: state the distributional hypotheses, compute the expected counts from a claimed distribution, and check the conditions (random, all expected counts at least $5$, and $10\%$).

Distributional hypotheses

These hypotheses are about the entire distribution, not a single proportion. Write $H_0$ as the claimed split ("equal across categories," "the $9:3:3:1$ ratio," or specific percentages) and $H_a$ as "the distribution is not as claimed." It is enough that some category's proportion differs; you do not specify which. Stating $H_a$ as "all proportions differ" is wrong; "at least one differs" is correct.
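
Written symbolically, the hypotheses might look like the sketch below. The symbols $p_i$ (true proportion in category $i$) and $p_{i,0}$ (proportion claimed by the null) are notation assumed here for illustration; any equivalent wording in context earns the same credit.

```latex
\begin{aligned}
H_0 &: p_1 = p_{1,0},\ p_2 = p_{2,0},\ \dots,\ p_k = p_{k,0} \\
H_a &: \text{at least one } p_i \ne p_{i,0}
\end{aligned}
```

For the $9:3:3:1$ claim, the null values would be $\tfrac{9}{16}, \tfrac{3}{16}, \tfrac{3}{16}, \tfrac{1}{16}$.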

Computing expected counts

Expected counts are what the null predicts you would see on average. For an "equal" claim with $k$ categories, each $E_i = n/k$. For a ratio like $2:3:5$, the parts sum to $10$, so the proportions are $0.2, 0.3, 0.5$ and the expected counts are $0.2n$, $0.3n$, and $0.5n$. The expected counts need not be whole numbers, and they should sum to $n$, a useful check. They are the backbone of both the condition check and the test statistic in Topic 8.3.
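
The ratio-to-expected-count arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `expected_counts` and the sample size $n = 45$ are assumptions chosen for the example, and exact fractions are used only so the "sums to $n$" check is not affected by rounding.

```python
from fractions import Fraction

def expected_counts(ratio, n):
    """Expected count for each category under a claimed integer ratio (hypothetical helper)."""
    total_parts = sum(ratio)
    # Each category's null proportion is its share of the ratio parts.
    return [Fraction(part, total_parts) * n for part in ratio]

# Assumed example: the 2:3:5 ratio with a sample of n = 45.
counts = expected_counts([2, 3, 5], 45)
print([float(c) for c in counts])  # [9.0, 13.5, 22.5]
print(sum(counts) == 45)           # True: expected counts sum to n
```

Note that two of the three expected counts here are not whole numbers, which is fine; they should not be rounded.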

Checking the conditions

The large-counts condition for chi-square is "all expected counts $\ge 5$," which differs from the proportion test's "$np_0 \ge 10$ and $n(1-p_0) \ge 10$." Use the expected counts you just computed; if any falls below $5$, the chi-square approximation is unreliable and categories may need combining. Checking expected (not observed) counts is the distinctive requirement here and a common slip.
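
As a rough sketch, the large-counts check amounts to testing every expected count against $5$. The helper names `large_counts_ok` and `small_categories` are hypothetical, and the second list of expected counts is an assumed example (a $9:3:3:1$ claim with $n = 64$) chosen to show a failing case.

```python
def large_counts_ok(expected, minimum=5):
    """True when every expected count is at least the minimum (5 for chi-square)."""
    return all(e >= minimum for e in expected)

def small_categories(expected, minimum=5):
    """Indices of categories whose expected count falls below the minimum."""
    return [i for i, e in enumerate(expected) if e < minimum]

# 9:3:3:1 with n = 160: every expected count is at least 5.
print(large_counts_ok([90, 30, 30, 10]))   # True

# Assumed example, 9:3:3:1 with n = 64: the last expected count is 4.
print(large_counts_ok([36, 12, 12, 4]))    # False
print(small_categories([36, 12, 12, 4]))   # [3]
```

The inputs are expected counts, never observed counts; a category flagged by the second helper is a candidate for combining with another category.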

Try this

Q1. A $1:2:1$ claim is tested with $n = 120$. Find the three expected counts. [2 points]

  • Cue. Parts sum to $4$, so proportions are $0.25, 0.5, 0.25$; expected counts $30, 60, 30$.

Q2. Which counts must be at least $5$ for the condition, observed or expected? [1 point]

  • Cue. Expected counts; every expected count must be at least $5$.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice). A goodness-of-fit test claims a $1:1:1:1:1$ distribution across $5$ categories with $n = 200$. The expected count in each category is (A) $5$ (B) $40$ (C) $200$ (D) $1$

The correct answer is (B).

Under an equal distribution, each category's expected count is $\dfrac{1}{5} \times 200 = 40$.

(A) is the minimum-expected-count condition value, not the expected count. (C) is the total. (D) is the ratio number, not a count. The expected count is $40$.

AP 2021 (style), 3 marks, Section II (free response). A geneticist claims offspring appear in a $9:3:3:1$ ratio across four phenotypes. A random sample of $160$ offspring is classified. (a) State the hypotheses for a goodness-of-fit test. (b) Compute the expected count for each phenotype. (c) Check the conditions.

A 3-point set-up question.

(a) (1 point) $H_0$: the true distribution of phenotypes follows the $9:3:3:1$ ratio. $H_a$: the true distribution is not the $9:3:3:1$ ratio (at least one proportion differs).
(b) (1 point) Total ratio parts $= 9 + 3 + 3 + 1 = 16$. Expected counts: $\dfrac{9}{16}(160) = 90$, $\dfrac{3}{16}(160) = 30$, $\dfrac{3}{16}(160) = 30$, $\dfrac{1}{16}(160) = 10$.
(c) (1 point) Random: stated random sample. Large counts: all expected counts ($90, 30, 30, 10$) are at least $5$. $10\%$: $160$ is plausibly under $10\%$ of all offspring.

Markers reward distributional hypotheses (not about a single proportion), correct expected counts from the ratio, and the all-expected-at-least-5 check.
