How do joint, marginal, and conditional proportions help us decide whether two categorical variables are associated?
Topic 2.3 Statistics for Two Categorical Variables: calculate joint, marginal, and conditional relative frequencies from a two-way table, and use conditional distributions to judge association.
A focused answer to AP Statistics Topic 2.3, on joint, marginal, and conditional relative frequencies from two-way tables, and using conditional distributions to assess association, with full worked proportion calculations.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 2.3) wants you to compute joint, marginal, and conditional relative frequencies from a two-way table, and to use conditional distributions to decide whether two categorical variables are associated.
Three kinds of proportion
The three differ in which count goes on top and which total goes underneath: a joint proportion divides a single cell by the grand total, a marginal proportion divides a row or column total by the grand total, and a conditional proportion divides a cell by its row or column total. Getting the denominator right is the whole game; the most common error in this topic is dividing by the grand total when the question asks for a conditional proportion, which needs the group total.
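In symbols, the three definitions can be written as follows. The notation is an assumption introduced here for illustration, not taken from the course description: $n_{ij}$ is the count in row $i$ and column $j$, $n_{i\cdot}$ is the total of row $i$, and $n$ is the grand total.

```latex
\[
\text{joint} = \frac{n_{ij}}{n},
\qquad
\text{marginal (row } i\text{)} = \frac{n_{i\cdot}}{n},
\qquad
\text{conditional (column } j \text{ given row } i\text{)} = \frac{n_{ij}}{n_{i\cdot}}
\]
```

The joint and conditional proportions share a numerator and differ only in the denominator, which is why a misread question so often produces the joint value when the conditional one was wanted.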
Conditional distributions and association
This is the precise, computable version of "are the variables related?" from Topic 2.1. If a higher proportion of treatment-A patients improve than of treatment-B patients, the conditional distribution of outcome depends on treatment, so outcome and treatment are associated. If both groups improved at the same rate, there would be no association. Crucially, you compare conditional distributions, not raw counts, because the groups usually have different sizes.
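A small sketch of why raw counts can mislead when group sizes differ. The counts below are invented purely for illustration and do not come from any study or from the questions on this page.

```python
# Hypothetical counts, invented for illustration only.
groups = {
    "A": {"improved": 40, "total": 50},
    "B": {"improved": 60, "total": 200},
}


def improvement_rate(group):
    """Conditional proportion improved, given membership of this group."""
    return group["improved"] / group["total"]


rates = {name: improvement_rate(g) for name, g in groups.items()}

for name, g in groups.items():
    print(f"Treatment {name}: {g['improved']} improved out of {g['total']} "
          f"(rate {rates[name]:.2f})")
```

In this made-up table, treatment B has more patients who improved, yet treatment A has the higher improvement rate, so only the conditional proportions give a fair comparison.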
How to assess association cleanly
A reliable exam routine is: pick the response variable, compute its conditional distribution within each category of the explanatory variable, and then compare those distributions. If they differ meaningfully, state that the variables are associated and describe how (which group has the higher rate, in context); if they are nearly identical, state that there is little or no association. Because conditional proportions already adjust for group size, this comparison is fair even when the groups are very unequal, which is exactly why conditional, not joint, proportions are the right tool. It helps to phrase your conclusion as a direct comparison ("a higher proportion of A than B improved"), echoing the comparative-language discipline of Unit 1. And as always in Unit 2, finding an association is not the same as proving cause: if the data are observational, you note that a lurking variable could explain the differing rates, so the association stands but causation does not.
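The routine above can be sketched in a few lines of Python. The function name, the variables (year group as explanatory, commute type as response), and every count are hypothetical, chosen only to show the steps.

```python
def conditional_distributions(table):
    """Return the distribution of the response within each explanatory category.

    `table` maps each explanatory category to a dict of response counts.
    """
    result = {}
    for category, counts in table.items():
        group_total = sum(counts.values())  # denominator is the group total
        result[category] = {resp: c / group_total for resp, c in counts.items()}
    return result


# Hypothetical two-way table: rows are year groups, columns are commute types.
table = {
    "junior": {"bus": 30, "car": 10, "walk": 10},
    "senior": {"bus": 20, "car": 50, "walk": 10},
}

dists = conditional_distributions(table)

for category, dist in dists.items():
    shares = ", ".join(f"{resp} {p:.3f}" for resp, p in dist.items())
    print(f"{category}: {shares}")
```

With these invented counts the two rows of proportions look quite different, which is the pattern you would describe in context as an association. The code deliberately does not apply any cut-off for "meaningfully different"; that judgment is left to the reader, as it is on the exam.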
Why the denominator decides everything
It is worth dwelling on why the three proportions answer different questions, because exam wording is designed to test whether you can tell them apart. "What proportion of all people are female coffee-drinkers?" is joint (divide the female-coffee cell by the grand total). "What proportion of people prefer coffee?" is marginal (divide the coffee column total by the grand total). "What proportion of females prefer coffee?" is conditional (divide the female-coffee cell by the female total). The little words "of all," "of females," and so on, signal the denominator, so reading the question slowly and identifying the group being conditioned on is the single most valuable habit in this topic. A natural follow-up the exam likes is to compare two conditional proportions ("of females versus of males") to judge association, which ties the whole topic together: you compute conditional distributions precisely so that you can compare them and decide whether the two categorical variables move together.
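To make the three denominators concrete, here is a minimal sketch that answers the three coffee questions from one table. All counts are hypothetical and are not the figures from any practice question on this page.

```python
# Hypothetical cell counts keyed by (sex, drink preference).
counts = {
    ("female", "coffee"): 30,
    ("female", "tea"): 20,
    ("male", "coffee"): 24,
    ("male", "tea"): 26,
}

grand_total = sum(counts.values())
female_total = sum(n for (sex, _), n in counts.items() if sex == "female")
coffee_total = sum(n for (_, drink), n in counts.items() if drink == "coffee")

# "What proportion of all people are female coffee-drinkers?"  (joint)
joint = counts[("female", "coffee")] / grand_total

# "What proportion of people prefer coffee?"  (marginal)
marginal = coffee_total / grand_total

# "What proportion of females prefer coffee?"  (conditional)
conditional = counts[("female", "coffee")] / female_total

print(f"joint = {joint:.2f}, marginal = {marginal:.2f}, "
      f"conditional = {conditional:.2f}")
```

The joint and conditional calculations use the same cell on top; only the total underneath changes, from everyone in the table to females alone.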
Try this
Q1. Of students, play sport and study music; study music in total. What proportion of music students play sport (a conditional proportion)? [2 points]
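- Cue. Condition on music: divide the number who both play sport and study music by the total number who study music; that quotient is the proportion of music students who play sport.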
Q2. How do you decide, from a two-way table, whether two categorical variables are associated? [1 point]
- Cue. Compare the conditional distributions of one variable across the categories of the other; if they differ, the variables are associated.
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2019 (style), 1 mark, Section I (multiple choice). In a two-way table of people, of males prefer coffee and of females prefer coffee. What is the conditional proportion of coffee preference among females? (A) (B) (C) (D)
The correct answer is (B).
A conditional proportion fixes the condition (female) and divides within that group: . So of females prefer coffee.
(A) wrongly divides by the grand total (that is a joint proportion). (C) is the marginal proportion for coffee. (D) is the male conditional proportion. Conditioning on female means dividing by the female total.
AP 2022 (style), 4 marks, Section II (free response). A study classifies patients by treatment (A, B) and outcome (improved, not improved). Of on treatment A, improved; of on treatment B, improved. (a) Find the conditional proportion improving for each treatment. (b) Use these to decide whether outcome appears associated with treatment, and explain. (c) State why this observational comparison alone does not prove treatment A causes more improvement.
A 4-point question on conditional distributions and association.
(a) (2 points) Treatment A: improved. Treatment B: improved (1 point each, or 1 for both with minor slip).
(b) (1 point) The conditional proportion improving differs ( for A versus for B), so outcome appears associated with treatment: patients on A improved at a higher rate. Because the conditional distributions of outcome differ across treatment, there is an association.
(c) (1 point) The data are observational (or not stated to be randomised): a lurking variable, such as patients on A being less severely ill, could explain the difference, so the comparison shows association, not proven causation.
Markers reward correct conditional proportions, a conclusion of association justified by the differing conditional distributions, and the observational-design caution.
Sources & how we know this
- AP Statistics Course and Exam Description — College Board (2020)