How do joint, marginal, and conditional proportions help us decide whether two categorical variables are associated?

Topic 2.3 Statistics for Two Categorical Variables: calculate joint, marginal, and conditional relative frequencies from a two-way table, and use conditional distributions to judge association.

A focused answer to AP Statistics Topic 2.3, on joint, marginal, and conditional relative frequencies from two-way tables, and using conditional distributions to assess association, with full worked proportion calculations.

Generated by Claude Opus 4.8
10 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. Three kinds of proportion
  3. Conditional distributions and association
  4. How to assess association cleanly
  5. Why the denominator decides everything
  6. Try this

What this topic is asking

The College Board (Topic 2.3) wants you to compute joint, marginal, and conditional relative frequencies from a two-way table, and to use conditional distributions to decide whether two categorical variables are associated.

Three kinds of proportion

The three differ in what is divided by what. A joint relative frequency is a single cell count divided by the grand total. A marginal relative frequency is a row or column total divided by the grand total. A conditional relative frequency is a cell count divided by its own row or column total. Getting the denominator right is the whole game; the most common error in the topic is dividing a conditional question by the grand total instead of by the group total.
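As a compact reference, the three proportions can be written in symbols. The notation here is an assumption introduced for illustration, not the College Board's: $n_{ij}$ is the count in row $i$ and column $j$, $n_{i\cdot}$ is the total of row $i$, $n_{\cdot j}$ is the total of column $j$, and $n$ is the grand total.

```latex
\[
\text{joint} = \frac{n_{ij}}{n}, \qquad
\text{marginal (row } i\text{)} = \frac{n_{i\cdot}}{n}, \qquad
\text{marginal (column } j\text{)} = \frac{n_{\cdot j}}{n}
\]
\[
\text{conditional (column } j \text{ given row } i\text{)} = \frac{n_{ij}}{n_{i\cdot}}, \qquad
\text{conditional (row } i \text{ given column } j\text{)} = \frac{n_{ij}}{n_{\cdot j}}
\]
```

Only the conditional forms change the denominator from $n$ to a group total.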

Conditional distributions and association

Comparing conditional distributions is the precise, computable version of "are the variables related?" from Topic 2.1. If 70% of treatment-A patients improve but only 60% of treatment-B patients do, the conditional distribution of outcome depends on treatment, so outcome and treatment are associated. If both groups improved at the same rate, there would be no association. Crucially, you compare conditional distributions, not raw counts, because the groups usually have different sizes.

How to assess association cleanly

A reliable exam routine is: pick the response variable, compute its conditional distribution within each category of the explanatory variable, and then compare those distributions. If they differ meaningfully, state that the variables are associated and describe how (which group has the higher rate, in context); if they are nearly identical, state that there is little or no association. Because conditional proportions already adjust for group size, this comparison is fair even when the groups are very unequal, which is exactly why conditional, not joint, proportions are the right tool. It helps to phrase your conclusion as a direct comparison ("a higher proportion of A than B improved"), echoing the comparative-language discipline of Unit 1. And as always in Unit 2, finding an association is not the same as proving cause: if the data are observational, you note that a lurking variable could explain the differing rates, so the association stands but causation does not.
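The routine above can be sketched in a few lines of Python. This is only an illustration, not an exam requirement: the function name `conditional_distributions` and the dictionary layout are assumptions made here, and the counts are taken from the treatment practice question in this guide (126 of 180 improved on A, 72 of 120 improved on B).

```python
# Counts from this guide's treatment practice question.
# Outer keys: explanatory variable (treatment); inner keys: response (outcome).
table = {
    "A": {"improved": 126, "not improved": 54},
    "B": {"improved": 72, "not improved": 48},
}


def conditional_distributions(counts):
    """Return the distribution of the response within each explanatory group."""
    result = {}
    for group, row in counts.items():
        group_total = sum(row.values())  # the denominator is the group total
        result[group] = {outcome: c / group_total for outcome, c in row.items()}
    return result


dists = conditional_distributions(table)

# Compare the same response category across groups.
for group, dist in dists.items():
    print(group, {outcome: round(p, 2) for outcome, p in dist.items()})

# Difference in the proportion improving, A minus B.
difference = dists["A"]["improved"] - dists["B"]["improved"]
print("difference in proportion improving:", round(difference, 2))
```

Each group is divided by its own total, so the comparison stays fair even though 180 patients received A and only 120 received B.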

Why the denominator decides everything

It is worth dwelling on why the three proportions answer different questions, because exam wording is designed to test whether you can tell them apart. "What proportion of all people are female coffee-drinkers?" is joint (divide the female-coffee cell by $n$). "What proportion of people prefer coffee?" is marginal (divide the coffee column total by $n$). "What proportion of females prefer coffee?" is conditional (divide the female-coffee cell by the female total). The little words "of all," "of females," and so on, signal the denominator, so reading the question slowly and identifying the group being conditioned on is the single most valuable habit in this topic. A natural follow-up the exam likes is to compare two conditional proportions ("of females versus of males") to judge association, which ties the whole topic together: you compute conditional distributions precisely so that you can compare them and decide whether the two categorical variables move together.
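To see the three denominators side by side, here is a minimal sketch using the coffee counts from the multiple-choice practice question in this guide (54 of 90 males and 44 of 110 females prefer coffee). The "not coffee" label and the variable names are assumptions added for illustration; the 36 and 66 counts are obtained by subtraction from the group totals.

```python
# Cell counts keyed by (sex, preference).
counts = {
    ("male", "coffee"): 54,
    ("male", "not coffee"): 36,     # 90 - 54
    ("female", "coffee"): 44,
    ("female", "not coffee"): 66,   # 110 - 44
}

n = sum(counts.values())  # grand total

# Joint: one cell divided by the grand total.
joint_female_coffee = counts[("female", "coffee")] / n

# Marginal: a column total divided by the grand total.
coffee_total = sum(c for (sex, pref), c in counts.items() if pref == "coffee")
marginal_coffee = coffee_total / n

# Conditional: one cell divided by its group (row) total.
female_total = sum(c for (sex, pref), c in counts.items() if sex == "female")
male_total = sum(c for (sex, pref), c in counts.items() if sex == "male")
coffee_given_female = counts[("female", "coffee")] / female_total
coffee_given_male = counts[("male", "coffee")] / male_total

print("joint (female and coffee):", round(joint_female_coffee, 2))
print("marginal (coffee):", round(marginal_coffee, 2))
print("conditional (coffee | female):", round(coffee_given_female, 2))
print("conditional (coffee | male):", round(coffee_given_male, 2))
```

The numerator 44 appears in both the joint and the female conditional proportion; only the denominator (200 versus 110) separates them.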

Try this

Q1. Of 80 students, 20 play sport and study music; 50 study music in total. What proportion of music students play sport (a conditional proportion)? [2 points]

  • Cue. Condition on music: $\frac{20}{50} = 0.40$, so 40% of music students play sport.

Q2. How do you decide, from a two-way table, whether two categorical variables are associated? [1 point]

  • Cue. Compare the conditional distributions of one variable across the categories of the other; if they differ, the variables are associated.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2019 (style), 1 mark. Section I (multiple choice). In a two-way table of 200 people, 54 of 90 males prefer coffee and 44 of 110 females prefer coffee. What is the conditional proportion of coffee preference among females? (A) 0.22 (B) 0.40 (C) 0.49 (D) 0.60

The correct answer is (B).

A conditional proportion fixes the condition (female) and divides within that group: $\frac{44}{110} = 0.40$. So 40% of females prefer coffee.

(A) $44/200$ wrongly divides by the grand total (that is a joint proportion). (C) $98/200$ is the marginal proportion for coffee. (D) is the male conditional proportion ($54/90$). Conditioning on female means dividing by the female total, 110.

AP 2022 (style), 4 marks. Section II (free response). A study classifies 300 patients by treatment (A, B) and outcome (improved, not improved). Of 180 on treatment A, 126 improved; of 120 on treatment B, 72 improved. (a) Find the conditional proportion improving for each treatment. (b) Use these to decide whether outcome appears associated with treatment, and explain. (c) State why this observational comparison alone does not prove treatment A causes more improvement.

A 4-point question on conditional distributions and association.

(a) (2 points) Treatment A: $126/180 = 0.70$ improved. Treatment B: $72/120 = 0.60$ improved (1 point each, or 1 for both with minor slip).
(b) (1 point) The conditional proportion improving differs (0.70 for A versus 0.60 for B), so outcome appears associated with treatment: patients on A improved at a higher rate. Because the conditional distributions of outcome differ across treatment, there is an association.
(c) (1 point) The data are observational (or not stated to be randomised): a lurking variable, such as patients on A being less severely ill, could explain the difference, so the comparison shows association, not proven causation.

Markers reward correct conditional proportions, a conclusion of association justified by the differing conditional distributions, and the observational-design caution.
