
How do we compare two or more distributions of a quantitative variable fairly and completely?

Topic 1.9 Comparing Distributions of a Quantitative Variable: compare two or more distributions of a quantitative variable by shape, center, spread, and unusual features, in context, using comparative language.

A focused answer to AP Statistics Topic 1.9, on comparing two or more distributions by shape, center, spread, and unusual features using explicit comparative language, with a worked side-by-side comparison.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed


Jump to a section
  1. What this topic is asking
  2. The comparison must be comparative
  3. Compare like with like, across SOCS
  4. Parallel boxplots are the natural tool
  5. Writing a full-credit comparison
  6. Try this

What this topic is asking

The College Board (Topic 1.9) wants you to compare two or more distributions of a quantitative variable, covering shape, center, spread, and unusual features, using explicitly comparative language ("higher than," "more variable than") and always in context. This is one of the most common free-response tasks in the whole course.

The comparison must be comparative

This is the rule that catches students out more than any other. The College Board's scoring guidelines repeatedly withhold credit when a response describes Group A fully, then describes Group B fully, without ever saying which is larger or more spread out. The fix is mechanical: every sentence should contain a comparison word and name both groups.
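The mechanical check can be sketched in a few lines of Python. This is only an illustration: the word list and the function name `is_comparative` are assumptions made for this example, not part of any official scoring rubric, and a real reader judges meaning rather than keywords.

```python
import re

# Illustrative list of comparison words (an assumption, not an official rubric).
COMPARISON_WORDS = {
    "higher", "lower", "greater", "larger", "smaller",
    "more", "less", "fewer", "than", "whereas", "while",
}

def is_comparative(sentence, group_names):
    """True if the sentence names every group and uses a comparison word."""
    text = sentence.lower()
    names_present = all(name.lower() in text for name in group_names)
    words = set(re.findall(r"[a-z]+", text))
    return names_present and bool(words & COMPARISON_WORDS)

groups = ["Group A", "Group B"]
separate = "Group A has median 50."
comparative = "Group A has a higher median (50) than Group B (40)."

print(is_comparative(separate, groups))     # names only one group
print(is_comparative(comparative, groups))  # names both and compares
```

The first sentence fails because it never mentions Group B; the second passes because it names both groups and contains "higher" and "than".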

Compare like with like, across SOCS

Run the SOCS checklist, but each item becomes a comparison:

  • Shape: "Group A is roughly symmetric, whereas Group B is skewed right."
  • Center: "Group A has a higher median (75) than Group B (68)," so a typical Group A value is larger.
  • Spread: "Group A has a larger IQR (more variable) than Group B."
  • Unusual features: "Group B has a high outlier at 65, while Group A has none."

Always compare the same measure across groups: median to median and IQR to IQR. Mixing a mean with a median, or a range with an IQR, is not a valid comparison. When the groups have outliers or skew, prefer the resistant pair (median and IQR), exactly as in Topic 1.7.
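A small numerical sketch shows why the measures should match and why the median is the safer choice when an outlier is present. The two score lists below are invented for illustration only; they are not taken from any real data set.

```python
from statistics import mean, median

# Hypothetical scores, invented for this example.
group_a = [70, 72, 75, 78, 80]
group_b = [60, 64, 68, 70, 98]   # 98 is an unusually high value

def summarize(values):
    """Return the mean and median of a list of numbers."""
    return {"mean": mean(values), "median": median(values)}

a = summarize(group_a)
b = summarize(group_b)

# Like with like: median against median, mean against mean.
median_gap = a["median"] - b["median"]
mean_gap = a["mean"] - b["mean"]

print("Median gap (A - B):", median_gap)
print("Mean gap (A - B):", mean_gap)
```

In this made-up data the single high value pulls Group B's mean above its median, so the gap between the means is smaller than the gap between the medians. Comparing Group A's mean with Group B's median (or the reverse) would blend those two effects and describe neither.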

Parallel boxplots are the natural tool

Parallel (side-by-side) boxplots on a shared axis are the standard display for comparison because they line up the five-number summaries so differences in center, spread, and skew are visible at a glance. The position of the median lines compares centers; the box widths compare spreads (IQR); the whisker lengths and plotted outlier points compare skew and unusual features. Reading a comparison straight off parallel boxplots is a core exam skill, and the same logic applies to back-to-back stemplots or overlaid histograms. Whatever the display, the discipline is identical: line up the same feature in both groups and state the difference comparatively in context.
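The numbers behind a pair of parallel boxplots can be sketched as follows. Two things here are assumptions made for the example: the commute times are invented, and the quartiles are taken as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd), which is one common classroom convention; software may use a slightly different rule. Outliers are flagged with the usual 1.5 × IQR boxplot rule.

```python
from statistics import median

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of the two halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values below the middle position
    upper = data[half + n % 2:]     # values above the middle position
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }

def outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    s = five_number_summary(values)
    iqr = s["q3"] - s["q1"]
    low_fence = s["q1"] - 1.5 * iqr
    high_fence = s["q3"] + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical commute times in minutes, invented for illustration.
city_x = [8, 14, 18, 21, 23, 25, 27, 29, 32, 36, 42]
city_y = [12, 15, 17, 18, 19, 20, 22, 24, 25, 27, 65]

for name, times in (("City X", city_x), ("City Y", city_y)):
    s = five_number_summary(times)
    print(name, s, "IQR:", s["q3"] - s["q1"], "outliers:", outliers(times))
```

Lining up the two printed summaries is the numerical version of reading the plot: compare median with median, IQR with IQR, and the outlier lists with each other.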

Writing a full-credit comparison

Because comparison questions carry several marks, it is worth knowing exactly how markers award them: typically one point per SOCS component, given only if the statement is genuinely comparative and in context. So a reliable structure is four sentences, one each for shape, center, spread, and unusual features, every sentence naming both groups, the relevant measure, and a comparison word, with units attached. For example: "The commute times in City X are roughly symmetric while those in City Y are skewed right (shape); City X has a higher median commute of 25 minutes than City Y's 20 minutes (center); City X is more variable, with an IQR of 14 minutes against City Y's 8 minutes (spread); and City Y has a high outlier at 65 minutes whereas City X has none (unusual features)." That paragraph would earn full points because it compares every component, quantifies where possible, and stays in context. The commonest ways to lose marks are to describe the groups separately, to omit a component (often spread or unusual features), or to forget the context and units, so a quick self-check against those three pitfalls before moving on is time well spent.
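The four-sentence structure can be sketched as a small template. The `Summary` class, its field names, and the `compare` function are hypothetical names chosen for this example; the input values are the City X and City Y figures from the commute example. The generated wording is one possible phrasing, not a model answer.

```python
from dataclasses import dataclass, field

@dataclass
class Summary:
    """Values read off a display; field names are illustrative."""
    name: str
    shape: str                      # e.g. "roughly symmetric"
    median: float
    iqr: float
    outliers: list = field(default_factory=list)

def relation(x, y, more, less):
    """Pick the comparison phrase for two numbers."""
    if x > y:
        return more
    if x < y:
        return less
    return "the same as"

def unusual(a, b, units):
    """Comparative sentence about outliers in the two groups."""
    if not a.outliers and not b.outliers:
        return f"Neither {a.name} nor {b.name} has any outliers."
    parts = []
    for s in (a, b):
        if s.outliers:
            label = "an outlier" if len(s.outliers) == 1 else "outliers"
            listed = ", ".join(str(v) for v in s.outliers)
            parts.append(f"{s.name} has {label} at {listed} {units}")
        else:
            parts.append(f"{s.name} has no outliers")
    return ", whereas ".join(parts) + "."

def compare(a, b, variable, units):
    """Return four sentences: shape, center, spread, unusual features."""
    if a.shape == b.shape:
        shape = f"The {variable} in {a.name} and {b.name} are both {a.shape}."
    else:
        shape = (f"The {variable} in {a.name} are {a.shape}, "
                 f"whereas those in {b.name} are {b.shape}.")
    center = (f"The median of the {variable} in {a.name} ({a.median} {units}) is "
              f"{relation(a.median, b.median, 'higher than', 'lower than')} "
              f"the median in {b.name} ({b.median} {units}).")
    spread = (f"The IQR of the {variable} in {a.name} ({a.iqr} {units}) is "
              f"{relation(a.iqr, b.iqr, 'larger than', 'smaller than')} "
              f"the IQR in {b.name} ({b.iqr} {units}).")
    return [shape, center, spread, unusual(a, b, units)]

city_x = Summary("City X", "roughly symmetric", 25, 14)
city_y = Summary("City Y", "skewed right", 20, 8, [65])

sentences = compare(city_x, city_y, "commute times", "minutes")
for sentence in sentences:
    print(sentence)
```

Every generated sentence names both cities, states the measure, and carries the units, which mirrors the self-check against the three pitfalls.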

Try this

Q1. Rewrite "Group A has median 50. Group B has median 40." as a proper comparison. [1 point]

  • Cue. "Group A has a higher median (50) than Group B (40), so a typical A value is larger."

Q2. When comparing two skewed distributions, which center and spread should you use, and why? [2 points]

  • Cue. Median and IQR, because they are resistant to the skew and any outliers, giving a fair comparison of typical value and spread.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2019 (style), 1 mark, Section I (multiple choice). Two groups' test scores are shown in parallel boxplots. Group A has median 75; Group B has median 68. Which statement is a correct comparison?
(A) Group A scored 75
(B) Group B has a lower median than Group A, so a typical Group B student scored lower
(C) Group A is better
(D) The medians are about the same

The correct answer is (B).

A comparison must be explicitly comparative and tied to context: Group B's median (68) is lower than Group A's (75), so a typical Group B student scored lower. The phrase "lower than" makes it a comparison.

(A) reports one value, not a comparison. (C) is vague and not statistical. (D) is false (75 ≠ 68). AP markers require explicitly comparative language, such as "greater than" or "lower than," not two separate descriptions.

AP 2022 (style), 4 marks, Section II (free response). Parallel boxplots show the commute times (minutes) for workers in City X and City Y. City X: median 25, IQR 14, roughly symmetric, no outliers. City Y: median 20, IQR 8, skewed right, one high outlier at 65. Compare the two distributions of commute time in context.

A 4-point comparison question requiring explicit comparative language across SOCS.

Award up to 4 points for comparing, in context and comparatively:

  • Shape (1 point): City X is roughly symmetric while City Y is skewed right.
  • Center (1 point): City X has a higher median commute (25 minutes) than City Y (20 minutes), so a typical City X worker commutes longer.
  • Spread (1 point): City X has a larger IQR (14 minutes) than City Y (8 minutes), so City X commute times are more variable.
  • Unusual features (1 point): City Y has a high outlier at 65 minutes, whereas City X has none.

Markers require explicitly comparative wording ("higher than," "more variable than") for each component, not two separate one-group descriptions, and everything stated in the context of commute time.
