How do you choose the correct inference procedure for a categorical-data scenario?

Topic 8.7 Skills Focus: Selecting an Appropriate Inference Procedure for Categorical Data: choose among the one-proportion, two-proportion, and chi-square (goodness of fit, homogeneity, independence) procedures based on the scenario.

A focused answer to AP Statistics Topic 8.7, on choosing among one-proportion, two-proportion, and chi-square (goodness of fit, homogeneity, independence) procedures for categorical data, based on the number of variables, categories, and samples, with a worked decision.

Generated by Claude Opus 4.8. 10 min answer.

Reviewed by: AI editorial process; not yet individually human-reviewed

What this topic is asking

College Board Topic 8.7 is a skills focus: given a categorical-data scenario, choose the correct procedure among the one-proportion z-test, the two-proportion z-test, and the three chi-square tests (goodness of fit, homogeneity, independence), and justify the choice. It synthesizes the categorical inference of Units 6 and 8.

The categorical decision tree

Three questions partition every categorical scenario: how many variables are recorded, how many categories the variable has, and how many samples were taken. The trickiest boundaries are: (a) two-proportion z versus homogeneity. Both compare groups, but two-proportion z is for a two-category variable across two groups, while homogeneity handles two or more categories and two or more groups (a $2 \times 2$ homogeneity test and a two-sided two-proportion z-test actually agree, since $\chi^2 = z^2$ in that case); and (b) homogeneity versus independence. The table arithmetic is the same; what distinguishes them is several samples (homogeneity) versus one sample with two variables (independence).
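The three questions can be sketched as a small function. This is only an illustrative sketch: the function name `choose_procedure` and its parameter names are hypothetical, and it assumes that a design with several samples records a single response variable.

```python
def choose_procedure(n_samples: int, n_variables: int, n_categories: int) -> str:
    """Pick a categorical inference procedure from three counts.

    n_samples:    number of independent samples (groups) that were drawn
    n_variables:  number of categorical variables recorded per individual
    n_categories: number of categories of the response variable
    All names here are hypothetical, chosen for illustration.
    """
    if n_samples < 1 or n_variables not in (1, 2) or n_categories < 2:
        raise ValueError("need >= 1 sample, 1 or 2 variables, >= 2 categories")

    if n_samples >= 2:
        # Several separate samples: compare groups on one response variable.
        if n_variables != 1:
            raise ValueError("this sketch assumes one response variable per group")
        if n_samples == 2 and n_categories == 2:
            return "two-proportion z-test"
        return "chi-square test of homogeneity"

    if n_variables == 2:
        # One sample cross-classified by two variables.
        return "chi-square test of independence"

    # One sample, one variable.
    if n_categories == 2:
        return "one-proportion z-test"
    return "chi-square goodness-of-fit test"


# Four hospitals, outcome yes/no: several samples, one two-category variable.
print(choose_procedure(n_samples=4, n_variables=1, n_categories=2))
```

A two-group, two-category scenario returns the two-proportion z-test here, although a $2 \times 2$ homogeneity test would reach the same two-sided conclusion.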

Why goodness of fit is different

Goodness of fit stands apart: it tests one categorical variable's distribution against a claimed distribution (equal, a ratio, or stated percentages), using a one-way list of counts, not a two-way table. The cue is a sentence like "the company claims colors appear in a $2:3:5$ ratio": a single variable measured against an external claim. There are no groups to compare and no second variable; that is what separates it from homogeneity and independence.
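As a rough illustration of the one-way structure, the sketch below turns a claimed $2:3:5$ ratio into expected counts and computes the goodness-of-fit statistic. The observed counts are made up for the example, and the helper name `gof_statistic` is hypothetical.

```python
def gof_statistic(observed, ratio):
    """Return (expected counts, chi-square statistic, degrees of freedom)
    for one categorical variable tested against a claimed ratio."""
    n = sum(observed)
    total = sum(ratio)
    # Expected count for each category = sample size * claimed proportion.
    expected = [n * r / total for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # k - 1 for goodness of fit
    return expected, chi2, df


# Hypothetical sample of 100 items in three color categories.
observed = [25, 28, 47]
expected, chi2, df = gof_statistic(observed, ratio=[2, 3, 5])

print(expected)        # [20.0, 30.0, 50.0]
print(round(chi2, 3))  # 1.563
print(df)              # 2
```

The input is a single list of counts and a single list of claimed shares; nothing in the calculation refers to groups or to a second variable.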

Implement and communicate

Implementation details follow the choice: a proportion test uses $z$ and the large-counts condition (expected successes and failures of at least $10$); a chi-square test uses $\sum (O-E)^2/E$, expected counts, the condition that all expected counts are $\ge 5$, and degrees of freedom ($k - 1$ for goodness of fit, $(r-1)(c-1)$ for a table). The conclusion wording must match: "the proportion differs," "the distributions differ across groups" (homogeneity), or "the variables are associated" (independence). As always, association is not causation, and statistical significance is not practical importance.
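The table version of these calculations can be sketched as follows, using a made-up $2 \times 2$ table so that the chi-square statistic can be compared with the square of the pooled two-proportion $z$ statistic. The helper names `chi_square_table` and `two_prop_z` and the counts are hypothetical.

```python
import math


def chi_square_table(table):
    """Return (expected counts, chi-square statistic, df) for a two-way table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return expected, chi2, df


def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 = p2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


# Hypothetical data: rows are two groups of 100, columns are yes / no.
table = [[30, 70], [45, 55]]
expected, chi2, df = chi_square_table(table)
z = two_prop_z(30, 100, 45, 100)

# Condition check: every expected count should be at least 5.
condition_ok = all(e >= 5 for row in expected for e in row)

print(expected)          # [[37.5, 62.5], [37.5, 62.5]]
print(df, condition_ok)  # 1 True
print(chi2, z ** 2)      # both about 4.8
```

For a table with more rows or columns, the same function applies unchanged; only the $2 \times 2$ case has a matching $z$ statistic.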

Try this

Q1. Separate random samples from four hospitals are classified by infection outcome (yes/no). Which procedure? [1 point]

  • Cue. Chi-square test of homogeneity (several separate samples, one variable across groups).

Q2. What single cue separates goodness of fit from the two-way chi-square tests? [1 point]

  • Cue. Goodness of fit tests one variable against a claimed distribution (a one-way list of counts), with no second variable and no separate groups.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2019 (style), 1 mark, Section I (multiple choice). A single random sample is classified by two categorical variables, each with three levels, to see if they are related. The correct procedure is
(A) one-proportion z-test
(B) chi-square goodness of fit
(C) chi-square test of independence
(D) two-proportion z-test

The correct answer is (C).

One sample with two categorical variables (each multi-level) calls for a chi-square test of independence (are the variables associated?).

(A) and (D) are for one or two proportions (two-category situations). (B) is for one variable against a claimed distribution, not two variables.

AP 2021 (style), 4 marks, Section II (free response). For each scenario, name the appropriate categorical inference procedure and justify the choice.
(a) Testing whether a single coin's heads proportion differs from $0.5$.
(b) Comparing the proportion of two independent groups that prefer a product.
(c) Testing whether one die's six face counts match a uniform distribution.
(d) Taking one sample of people and testing whether handedness and eye dominance are related.

A 4-point procedure-selection question.

(a) (1 point) One-proportion z-test: one categorical variable with two outcomes, testing $p = 0.5$.
(b) (1 point) Two-proportion z-test (or interval): two independent groups, comparing a proportion.
(c) (1 point) Chi-square goodness-of-fit test: one variable, six categories, against a claimed (uniform) distribution.
(d) (1 point) Chi-square test of independence: one sample, two categorical variables, testing association.

Markers reward distinguishing two-category proportion procedures from multi-category chi-square ones, and goodness of fit (one variable vs a distribution) from independence (two variables, one sample).
