How do we compare two or more distributions of a quantitative variable fairly and completely?
Topic 1.9 Comparing Distributions of a Quantitative Variable: compare two or more distributions of a quantitative variable by shape, center, spread, and unusual features, in context, using comparative language.
A focused answer to AP Statistics Topic 1.9, on comparing two or more distributions by shape, center, spread, and unusual features using explicit comparative language, with a worked side-by-side comparison.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 1.9) wants you to compare two or more distributions of a quantitative variable, covering shape, center, spread, and unusual features, using explicitly comparative language ("higher than," "more variable than") and always in context. This is one of the most common free-response tasks in the whole course.
The comparison must be comparative
This is the rule that catches students out more than any other. The College Board's scoring guidelines repeatedly withhold credit when a response describes Group A fully, then describes Group B fully, without ever saying which is larger or more spread out. The fix is mechanical: every sentence should contain a comparison word and name both groups.
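As a rough illustration of how mechanical that check is, the sketch below flags sentences that fail it. The function name `is_comparative`, the word list, and the example values are all assumptions made up for illustration. This is a self-check aid, not a College Board rubric.

```python
import re

# Assumed, non-exhaustive set of comparison words; extend as needed.
COMPARISON_WORDS = {
    "higher", "lower", "greater", "larger", "smaller",
    "more", "less", "longer", "shorter", "whereas", "while",
}

def is_comparative(sentence, group_a, group_b):
    """Return True if the sentence names both groups and uses a comparison word."""
    text = sentence.lower()
    names_both = group_a.lower() in text and group_b.lower() in text
    words = set(re.findall(r"[a-z]+", text))
    return names_both and bool(words & COMPARISON_WORDS)

# Hypothetical values, invented for this example only.
separate = "Group A has a median of 70. Group B has a median of 60."
comparative = "Group A has a higher median (70) than Group B (60)."

print(is_comparative(separate, "Group A", "Group B"))
print(is_comparative(comparative, "Group A", "Group B"))
```

The first response lists two facts side by side and fails the check. The second names both groups and says which is higher, so it passes.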
Compare like with like, across SOCS
Run the SOCS checklist, but each item becomes a comparison:
- Shape: "Group A is roughly symmetric, whereas Group B is skewed right."
- Center: "Group A has a higher median () than Group B ()," so a typical Group A value is larger.
- Spread: "Group A has a larger IQR (more variable) than Group B."
- Unusual features: "Group B has a high outlier at , while Group A has none."
Always compare the same measure across groups: median to median and IQR to IQR. Mixing a mean with a median, or a range with an IQR, is not a valid comparison. When the groups have outliers or skew, prefer the resistant pair (median and IQR), exactly as in Topic 1.7.
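The like-with-like rule can be sketched in a few lines: compute the same measure for both groups, then put both values in one comparative sentence. The data, the city names, and the helper names below are hypothetical. The quartiles use the median-of-halves convention common in introductory courses, so calculators or software may give slightly different values.

```python
from statistics import median

def iqr(values):
    """IQR using the median-of-halves convention (overall median excluded when n is odd)."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(upper) - median(lower)

def compare_measure(measure, name_a, value_a, name_b, value_b, units,
                    words=("higher", "lower")):
    """Build one comparative sentence for a single measure, e.g. median vs median."""
    if value_a == value_b:
        return f"{name_a} and {name_b} have the same {measure} ({value_a} {units})."
    word = words[0] if value_a > value_b else words[1]
    return (f"{name_a} has a {word} {measure} ({value_a} {units}) "
            f"than {name_b} ({value_b} {units}).")

# Hypothetical commute times in minutes, invented for illustration.
city_p = [20, 22, 25, 27, 30, 33, 35]
city_q = [10, 12, 15, 16, 18, 20, 45]

center = compare_measure("median commute", "City P", median(city_p),
                         "City Q", median(city_q), "minutes")
spread = compare_measure("IQR", "City P", iqr(city_p),
                         "City Q", iqr(city_q), "minutes",
                         words=("larger", "smaller"))
print(center)
print(spread)
```

Because each call takes one measure and both groups, it is hard to mix a median with a mean or an IQR with a range by accident.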
Parallel boxplots are the natural tool
Parallel (side-by-side) boxplots on a shared axis are the standard display for comparison because they line up the five-number summaries, so differences in center, spread, and skew are visible at a glance. The positions of the median lines compare centers; the lengths of the boxes along the axis compare spreads (IQR); the whisker lengths and plotted outlier points compare skew and unusual features. Reading a comparison straight off parallel boxplots is a core exam skill, and the same logic applies to back-to-back stemplots or overlaid histograms. Whatever the display, the discipline is identical: line up the same feature in both groups and state the difference comparatively in context.
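What parallel boxplots put on a shared axis can be approximated in text by printing each group's five-number summary on aligned rows. The sketch below uses made-up data and hypothetical helper names. It assumes the median-of-halves quartile convention and the 1.5 times IQR rule for flagging outliers.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

def outliers(values):
    """Values beyond 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    fence = 1.5 * (q3 - q1)
    return [x for x in values if x < q1 - fence or x > q3 + fence]

# Hypothetical data, invented for illustration.
groups = {
    "Group A": [20, 22, 25, 27, 30, 33, 35],
    "Group B": [10, 12, 15, 16, 18, 20, 45],
}

print(f"{'group':<8} {'min':>5} {'Q1':>5} {'med':>5} {'Q3':>5} {'max':>5} {'IQR':>5}  outliers")
for name, values in groups.items():
    mn, q1, med, q3, mx = five_number_summary(values)
    print(f"{name:<8} {mn:>5} {q1:>5} {med:>5} {q3:>5} {mx:>5} {q3 - q1:>5}  {outliers(values)}")
```

Reading down a column compares one feature across groups, which is the same move as comparing median lines or box lengths on the plot.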
Writing a full-credit comparison
Because comparison questions carry several points, it is worth knowing how they are typically awarded: credit for each SOCS component is given only if the statement is genuinely comparative and in context. A reliable structure is therefore four sentences, one each for shape, center, spread, and unusual features, with every sentence naming both groups, the relevant measure, and a comparison word, with units attached.
For example: "The commute times in City X are roughly symmetric while those in City Y are skewed right (shape); City X has a higher median commute of minutes than City Y's minutes (center); City X is more variable, with an IQR of minutes against City Y's minutes (spread); and City Y has a high outlier at minutes whereas City X has none (unusual features)." That paragraph would earn full credit because it compares every component, quantifies where possible, and stays in context.
The most common ways to lose points are to describe the groups separately, to omit a component (often spread or unusual features), or to forget the context and units. A quick self-check against those three pitfalls before moving on is time well spent.
Try this
Q1. Rewrite "Group A has median . Group B has median ." as a proper comparison. [1 point]
- Cue. "Group A has a higher median () than Group B (), so a typical A value is larger."
Q2. When comparing two skewed distributions, which center and spread should you use, and why? [2 points]
- Cue. Median and IQR, because they are resistant to the skew and any outliers, giving a fair comparison of typical value and spread.
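A quick numerical check shows why the resistant pair is preferred. In the sketch below the data are invented, and the largest value of a small data set is replaced by an extreme one. The `iqr` helper is hypothetical and assumes the median-of-halves quartile convention.

```python
from statistics import mean, median, stdev

def iqr(values):
    """IQR using the median-of-halves convention (overall median excluded when n is odd)."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(upper) - median(lower)

# Hypothetical data: the second list swaps the largest value for an extreme one.
base = [10, 12, 15, 16, 18, 20, 22]
with_outlier = [10, 12, 15, 16, 18, 20, 80]

for label, data in (("base", base), ("with outlier", with_outlier)):
    print(f"{label:<13} mean={mean(data):.2f} sd={stdev(data):.2f} "
          f"median={median(data)} IQR={iqr(data)}")
```

The mean and standard deviation move substantially when the extreme value is introduced, while the median and IQR stay the same. That is why the median and IQR give the fairer comparison for skewed groups or groups with outliers.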
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2019 (style), 1 mark, Section I (multiple choice). Two groups' test scores are shown in parallel boxplots. Group A has median ; Group B has median . Which statement is a correct comparison?
(A) Group A scored
(B) Group B has a lower median than Group A, so a typical Group B student scored lower
(C) Group A is better
(D) The medians are about the same
The correct answer is (B).
A comparison must be explicitly comparative and tied to context: Group B's median () is lower than Group A's (), so a typical Group B student scored lower. The word "lower than" makes it a comparison.
(A) reports one value, not a comparison. (C) is vague and not statistical. (D) is false (). AP markers require explicitly comparative language, such as "greater than" or "lower than," not two separate descriptions.
AP 2022 (style), 4 marks, Section II (free response). Parallel boxplots show the commute times (minutes) for workers in City X and City Y. City X: median , IQR , roughly symmetric, no outliers. City Y: median , IQR , skewed right, one high outlier at . Compare the two distributions of commute time in context.
A 4-point comparison question requiring explicit comparative language across SOCS.
Award up to 4 points for comparing, in context and comparatively:
- Shape (1 point): City X is roughly symmetric while City Y is skewed right.
- Center (1 point): City X has a higher median commute ( minutes) than City Y ( minutes), so a typical City X worker commutes longer.
- Spread (1 point): City X has a larger IQR () than City Y (), so City X commute times are more variable.
- Unusual features (1 point): City Y has a high outlier at minutes, whereas City X has none.
Markers require explicitly comparative wording ("higher than," "more variable than") for each component, not two separate one-group descriptions, and everything stated in the context of commute time.
Sources & how we know this
- AP Statistics Course and Exam Description — College Board (2020)