How do scatterplots display two quantitative variables, and how do we describe what we see?

Topic 2.4 Representing the Relationship Between Two Quantitative Variables: construct and describe scatterplots by direction, form, strength, and unusual features, in context.

A focused answer to AP Statistics Topic 2.4, on building scatterplots and describing them by direction, form, strength, and unusual features (the DUFS framework), in context, with a worked description.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. The scatterplot
  3. Describing a scatterplot
  4. Why describing comes before fitting
  5. Direction, strength, and the link to correlation
  6. Try this

What this topic is asking

The College Board (Topic 2.4) wants you to display two quantitative variables with a scatterplot and to describe it by direction, form, strength, and unusual features, always in context. This verbal description is the two-variable counterpart of Unit 1's SOCS.

The scatterplot

The axis convention matters: the explanatory variable goes on the x-axis and the response variable on the y-axis. This fixes the meaning of "positive" and "negative" direction and sets up the regression line later, whose slope and predictions are read off the same axes. Swapping the axes is a frequent small error that confuses everything downstream.

Describing a scatterplot

Run through four features, the two-variable analogue of SOCS:

  • Direction. Positive if y tends to increase as x increases (points rise left to right); negative if y tends to decrease (points fall). No clear trend means no association.
  • Form. Is the pattern linear (a straight-line trend) or curved (for example, exponential or quadratic)?
  • Strength. How tightly do the points cluster around the underlying pattern? Tight clustering is strong; a loose cloud is weak.
  • Unusual features. Outliers (points far from the pattern), clusters, or distinct subgroups that suggest a lurking categorical variable.
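
The "tends to increase" idea in the Direction bullet can be made concrete by comparing points in pairs. The following is a minimal sketch, not an AP-prescribed method: the function name `direction`, the 0.3 cutoff, and the data are all hypothetical choices made for illustration. It says nothing about form or unusual features, which still have to be read from the plot.

```python
from itertools import combinations

def direction(xs, ys, cutoff=0.3):
    """Rough direction check: do pairs of points mostly rise or mostly fall?

    The cutoff is an arbitrary illustrative threshold, not a standard value.
    """
    rising = falling = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        change = (x2 - x1) * (y2 - y1)
        if change > 0:
            rising += 1      # the point with larger x also has larger y
        elif change < 0:
            falling += 1     # the point with larger x has smaller y
    total = rising + falling
    if total == 0:
        return "no clear direction"
    balance = (rising - falling) / total
    if balance > cutoff:
        return "positive"
    if balance < -cutoff:
        return "negative"
    return "no clear direction"

# Made-up data for illustration only.
print(direction([1, 2, 3, 4, 5], [2, 4, 5, 4, 7]))  # positive
print(direction([1, 2, 3, 4, 5], [7, 4, 5, 4, 2]))  # negative
```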

Why describing comes before fitting

The order of operations in Unit 2 is deliberate: you describe the scatterplot first, then decide whether a line is appropriate, then fit and assess it. The reason is that the tools that follow, the correlation coefficient and the least-squares regression line, are summaries of a linear relationship. If the scatterplot is clearly curved, a correlation near zero can hide a strong non-linear relationship, and a straight-line fit will systematically miss the pattern, so you would need a transformation (Topic 2.9) instead. If there is a glaring outlier, it can distort both the correlation and the line, so you want to notice it before it quietly skews your numbers. This is why the College Board tests scatterplot description on its own: spotting non-linearity and outliers by eye is the gatekeeping judgement that decides whether the rest of the unit's machinery even applies. A student who fits a line to an obviously curved pattern without comment has missed the point of the whole sequence.
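
Both failure modes described above can be checked numerically. The sketch below uses made-up data and a hand-written helper, `pearson_r` (a hypothetical name, implementing the usual sample correlation formula), to show a perfect U-shaped curve with a correlation of zero and a single outlier dragging a perfect linear correlation down sharply.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r, computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# 1. A perfect U-shaped curve: y is completely determined by x, yet r is 0.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curved = pearson_r(curve_x, curve_y)

# 2. A perfect line, then the same line with one outlier added at (9, 0).
line_x = list(range(1, 9))
line_y = [2 * x + 1 for x in line_x]
r_clean = pearson_r(line_x, line_y)                  # 1, up to rounding
r_outlier = pearson_r(line_x + [9], line_y + [0])    # roughly 0.35

print(round(r_curved, 3), round(r_clean, 3), round(r_outlier, 3))
```

Neither number on its own would reveal the problem; only the scatterplot shows that the first data set is curved and the second has one point far from the pattern.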

Direction, strength, and the link to correlation

Direction and strength foreshadow the correlation coefficient r of the next topic, which puts a single number on exactly these two features for a linear pattern: the sign of r encodes direction and its magnitude (closeness to 1) encodes strength. But r only makes sense once you have confirmed the form is roughly linear, which is the verbal judgement you make here. Form is the feature r cannot capture: a strong curved relationship and a weak linear one can share the same r, so the scatterplot's form must be read by eye. Keeping direction, form, and strength as three separate ideas, rather than collapsing them into "strong," is what lets you describe and later model a relationship correctly, and it is precisely the distinction the exam probes when it shows a curved scatterplot with a deceptively small or large correlation.
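
A small numerical illustration of these points, using made-up data and a hand-written helper (`pearson_r` is a hypothetical name for the usual sample correlation formula): reversing the trend flips the sign of r without changing its size, and a perfectly curved arc can produce a larger r than a genuinely linear but scattered pattern.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r, computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]

# Sign encodes direction: the same scatter, rising and then falling.
r_rising = pearson_r(xs, [2, 4, 5, 4, 7])    # about 0.87
r_falling = pearson_r(xs, [7, 4, 5, 4, 2])   # about -0.87

# Form is invisible to r: a perfect upward-curving arc, y = x^2 on 0..10.
arc_x = list(range(11))
arc_y = [x ** 2 for x in arc_x]
r_arc = pearson_r(arc_x, arc_y)              # about 0.96, yet not linear

print(round(r_rising, 2), round(r_falling, 2), round(r_arc, 2))
```

The arc's r of about 0.96 looks like a "strong linear" result on paper, which is exactly why the form has to be judged from the plot before r is interpreted.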

Try this

Q1. Points on a scatterplot form a tight upward-curving arc. State the direction and form. [2 points]

  • Cue. Direction is positive (rising), but the form is curved (non-linear), so a straight-line summary would be inappropriate.

Q2. Why should you describe a scatterplot's form before computing a correlation? [1 point]

  • Cue. Correlation only measures linear strength; for a curved relationship a small r can hide a strong association, so you must confirm linearity first.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice). A scatterplot shows points falling tightly along a line that slopes downward from upper left to lower right. How is this relationship best described?
(A) Strong positive linear
(B) Strong negative linear
(C) Weak positive linear
(D) No association

The correct answer is (B).

Downward slope means a negative association (as x increases, y decreases); points falling tightly along a line mean strong and linear. So the relationship is strong, negative, and linear.

(A) is wrong on direction (downward is negative). (C) is wrong on both direction and strength. (D) ignores the clear linear pattern. Direction comes from the slope, strength from how tightly the points cluster.

AP 2021 (style), 4 marks, Section II (free response). A scatterplot plots fuel used (liters) against distance driven (km) for 30 trips. The points rise from lower left to upper right, lie fairly close to a straight line, and there is one point far above the pattern at a moderate distance.
(a) Describe the relationship in terms of direction, form, strength, and unusual features.
(b) Identify the explanatory and response variables.
(c) Explain why describing the scatterplot should come before fitting any line.

A 4-point scatterplot-description question.

(a) (2 points) Direction: positive (fuel increases with distance). Form: roughly linear. Strength: fairly strong (points lie close to a line). Unusual features: one point lies well above the pattern, a possible outlier. Award up to 2 points for covering direction, form, strength, and unusual features in context.
(b) (1 point) Explanatory variable: distance driven (horizontal axis); response variable: fuel used (vertical axis).
(c) (1 point) You describe the scatterplot first to check that the relationship is roughly linear (and to spot outliers) before fitting a line, because a least-squares line and correlation only summarize a linear pattern sensibly; fitting a line to a clearly curved or outlier-distorted pattern would mislead.

Markers reward a description covering direction, form, strength, and unusual features in context, correct variable roles, and the reason that linearity must be checked before fitting a line.
