How do you compute expected counts in a two-way table under the assumption of no association?

Topic 8.4 Expected Counts in Two-Way Tables: compute the expected count for each cell of a two-way table under the null hypothesis using the row total times column total divided by the grand total.

A focused answer to AP Statistics Topic 8.4, on computing expected counts in a two-way table under the null of no association, using row total times column total over the grand total, and why this formula encodes independence.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. The expected-count formula
  3. Why the formula encodes "no association"
  4. Expected counts and the conditions
  5. Try this

What this topic is asking

The College Board (Topic 8.4) wants you to compute expected counts in a two-way table under the null of no association: for each cell,

$$E = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}.$$

This is the bridge from a one-variable goodness-of-fit test to the two-variable tests of homogeneity and independence.

The expected-count formula

This single formula generates the expected count for each cell: multiply the cell's row total by its column total, then divide by the grand total. Because the formula reuses the observed marginal totals, the expected table has the same margins as the observed table. That gives a reliable check: if your expected row and column sums do not match the observed margins, you have made an arithmetic error.
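As a minimal sketch of this step, the Python below builds the full expected table from the margins and then runs the margin check. The function name `expected_counts` is a hypothetical helper, not part of any library, and the margins are the ones from the survey question in the exam-style practice on this page (favor 180, oppose 120; young 150, old 150).

```python
def expected_counts(row_totals, col_totals):
    """Expected count per cell: row total * column total / grand total."""
    grand_total = sum(row_totals)
    if grand_total != sum(col_totals):
        raise ValueError("row and column totals must share one grand total")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Margins from the survey question: rows are favor/oppose, columns young/old.
row_totals = [180, 120]
col_totals = [150, 150]

expected = expected_counts(row_totals, col_totals)

# Margin check: the expected table must reproduce the observed margins.
row_sums = [sum(row) for row in expected]
col_sums = [sum(col) for col in zip(*expected)]

print(expected)                  # [[90.0, 90.0], [60.0, 60.0]]
print(row_sums == row_totals)    # True
print(col_sums == col_totals)    # True
```

Only the margins are needed here, which mirrors the point in the text: the expected table depends on the observed data solely through its row totals, column totals, and grand total.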

Why the formula encodes "no association"

The formula is not arbitrary: it is exactly what "the two variables are independent" predicts. Suppose the two variables are opinion and age group. If they were unrelated, the proportion favoring would be the same in every age group (equal to the overall proportion in the favor row), so each cell's expected count is the grand total multiplied by its row proportion and its column proportion. The expected table is therefore the "no-association" template against which the observed table is compared. The same formula serves both homogeneity (the same distribution across several groups) and independence (no association between two variables measured on one sample), because both nulls produce identical expected counts.
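The algebra behind this can be written in one line. The symbols are assumptions introduced here for illustration: $n$ is the grand total, $R_i$ is the total of row $i$, $C_j$ is the total of column $j$, and $E_{ij}$ is the expected count in the cell where they meet.

$$
E_{ij} = n \cdot \frac{R_i}{n} \cdot \frac{C_j}{n} = \frac{R_i \, C_j}{n}
$$

The middle expression is the grand total times the row proportion times the column proportion, which is what independence predicts; cancelling one factor of $n$ gives the row total times column total over grand total.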

Expected counts and the conditions

The expected counts you compute here are exactly what the large-counts condition for a two-way chi-square test checks: every expected count must be at least 5. They are also the denominators in the chi-square statistic $\sum (O - E)^2 / E$ of Topic 8.6. So Topic 8.4 is the shared computational core of the homogeneity and independence tests. Computing all expected counts carefully, and verifying each is at least 5, is a graded step before any chi-square value is found. If some expected count falls below 5, the chi-square approximation is unreliable and the analysis may need categories combined.
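A rough sketch of how the expected counts feed both the condition check and the statistic is given below. The observed counts are made up for illustration (they are not from any real survey), and the helper names `expected_counts`, `large_counts_ok`, and `chi_square_statistic` are hypothetical.

```python
def expected_counts(observed):
    """Expected table from the observed table's own margins."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def large_counts_ok(expected, minimum=5):
    """Large-counts condition: every expected count is at least `minimum`."""
    return all(e >= minimum for row in expected for e in row)


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Hypothetical observed counts, chosen only to illustrate the steps.
observed = [[100, 80],
            [50, 70]]

expected = expected_counts(observed)        # [[90.0, 90.0], [60.0, 60.0]]
condition_met = large_counts_ok(expected)   # True: smallest expected count is 60
statistic = chi_square_statistic(observed, expected)

print(expected, condition_met, round(statistic, 4))
```

The condition is checked on the expected counts, not the observed ones, so a table can contain a small observed cell and still satisfy it.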

Try this

Q1. A cell has row total 60, column total 90, grand total 300. Find the expected count. [1 point]

  • Cue. $E = \dfrac{60 \times 90}{300} = \dfrac{5400}{300} = 18$.

Q2. What null hypothesis are these expected counts computed under? [1 point]

  • Cue. No association (independence/homogeneity): the distribution of one variable is the same across categories of the other.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice). In a two-way table, a cell's row total is 80, its column total is 50, and the grand total is 200. The expected count for that cell is (A) 130 (B) 20 (C) 40 (D) 25

The correct answer is (B).

The expected count is $\dfrac{\text{row total} \times \text{column total}}{\text{grand total}} = \dfrac{80 \times 50}{200} = \dfrac{4000}{200} = 20$.

(A) adds the totals. (C) and (D) misapply the formula. The expected count is 20.

AP 2021 (style), 3 marks, Section II (free response). A survey of 300 people cross-classifies opinion (favor, oppose) by age group (young, old). The row totals are favor 180, oppose 120; the column totals are young 150, old 150. (a) Compute the expected count for the (favor, young) cell. (b) Explain what assumption this expected count is computed under. (c) Explain in words why the formula uses the row and column totals.

A 3-point expected-counts question.

(a) (1 point) $E_{\text{favor, young}} = \dfrac{180 \times 150}{300} = \dfrac{27000}{300} = 90$.
(b) (1 point) It is computed under the null hypothesis of no association (independence) between opinion and age: the proportion favoring is assumed the same across age groups.
(c) (1 point) Under independence, the expected proportion in a cell is (row proportion) times (column proportion); multiplying by the grand total and simplifying gives row total times column total over grand total. The marginal totals supply the overall row and column proportions, which the null hypothesis applies uniformly to every cell of the table.

Markers reward the correct expected count, naming the independence/no-association assumption, and explaining the role of the marginal totals.
