What does the correlation coefficient r measure, and what are its limits?

Topic 2.5 Correlation: calculate and interpret the correlation coefficient r, understand its properties (range, unit-free, resistance), and recognize what it can and cannot tell you.

A focused answer to AP Statistics Topic 2.5, defining the correlation coefficient r, its range and properties (unit-free, symmetric, non-resistant), what it measures and misses, and the correlation-causation caution, with a worked interpretation.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. What r measures
  3. The properties of r
  4. What r cannot tell you
  5. Reading the size of r
  6. Try this

What this topic is asking

The College Board (Topic 2.5) wants you to calculate and interpret the correlation coefficient $r$, to know its properties (its range, that it is unit-free and symmetric, and that it is not resistant), and to understand exactly what $r$ measures and what it misses, including the correlation-causation caution.

What r measures
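
For reference, the sample correlation as it is usually written for AP Statistics (assumed here to be the formula the next paragraph refers to), where $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ and $s_y$ are the sample standard deviations, and $n$ is the number of data pairs:

$$
r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)
$$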

You will almost always get $r$ from technology rather than this formula, but the formula reveals two things: $r$ is built from how the two variables' standardized deviations move together, and because standardizing strips out units and scale, $r$ is unit-free, unaffected by changing units, and symmetric in $x$ and $y$.
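
A minimal Python sketch of the standardized-deviation idea, assuming the usual sample formula (sum of products of z-scores, divided by $n - 1$). The data and the names `correlation`, `heights_in`, and `weights_lb` are hypothetical, chosen only to illustrate that $r$ is unit-free and symmetric:

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical data: heights in inches and weights in pounds for five people.
heights_in = [60, 63, 66, 69, 72]
weights_lb = [115, 130, 138, 160, 172]

r = correlation(heights_in, weights_lb)

# Changing units (inches -> cm, pounds -> kg) should leave r unchanged.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_metric = correlation(heights_cm, weights_kg)

# Swapping which variable is x and which is y should also leave r unchanged.
r_swapped = correlation(weights_lb, heights_in)

print(r, r_metric, r_swapped)
```

All three printed values should agree up to floating-point rounding, because the z-scores are the same whatever the units and the products do not care which variable comes first.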

The properties of r

A few consequences are worth stating plainly. Because $r$ is symmetric, "the correlation of height with weight" equals "the correlation of weight with height," unlike a regression line, whose slope depends on which variable is the response. Because $r$ is not resistant, you should always look at the scatterplot before trusting $r$, since one stray point can inflate or deflate it. And because $r$ captures only linearity, it is silent about curvature.
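
To see non-resistance concretely, here is a small sketch with made-up data: eight points that lie close to a line, then the same points plus one stray point. The helper `correlation` uses the algebraically equivalent form $S_{xy} / \sqrt{S_{xx} S_{yy}}$, and all names and values are illustrative assumptions:

```python
from statistics import mean

def correlation(xs, ys):
    """Sample correlation via S_xy / sqrt(S_xx * S_yy), equivalent to the z-score form."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (s_xx * s_yy) ** 0.5

# Hypothetical data lying close to an upward-sloping line.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.1]
r_clean = correlation(xs, ys)

# Add a single stray point far below the trend.
r_with_outlier = correlation(xs + [9], ys + [0.0])

print(round(r_clean, 3), round(r_with_outlier, 3))
```

With these particular numbers, $r$ falls from roughly $0.998$ to roughly $0.31$ because of one point, which is why the scatterplot comes before the statistic.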

What r cannot tell you

Two limitations are examined relentlessly. First, correlation is not causation: a large $|r|$ shows that two variables move together linearly, but a lurking variable, reverse causation, or coincidence can produce that pattern, so you may never conclude cause from $r$ alone. The classic examples (churches and crime, ice cream and drowning) all feature a lurking variable that drives both. Second, $r$ measures only linear association, so a value near zero does not mean the variables are unrelated; a strongly curved relationship (a U-shape, say) can have $r \approx 0$ while clearly being a tight relationship, just not a straight-line one. The reverse trap also exists: a high $r$ confirms a strong linear fit only if the scatterplot is genuinely linear; computing $r$ for a curved cloud and reporting "strong relationship" is wrong. The discipline that protects you from both traps is the same one from Topic 2.4: always describe the scatterplot's form first, and only interpret $r$ once you have confirmed the pattern is roughly linear. The College Board returns to these two cautions, no causation and linear-only, in almost every regression question, so internalising them now pays off across the whole unit.
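
The linear-only caution can be checked with a tiny made-up example: points that lie exactly on the parabola $y = x^2$, symmetric about $x = 0$. The relationship is perfectly predictable, yet the correlation comes out as zero. The data and the helper name `correlation` are illustrative assumptions:

```python
from statistics import mean

def correlation(xs, ys):
    """Sample correlation via S_xy / sqrt(S_xx * S_yy)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (s_xx * s_yy) ** 0.5

# Hypothetical U-shaped data: y is determined exactly by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_u_shape = correlation(xs, ys)
print(r_u_shape)
```

The positive products on the right-hand side of the U cancel the negative products on the left, so $r$ sees no linear trend even though $y$ is completely determined by $x$.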

Reading the size of r

It helps to attach rough verbal labels to magnitudes, while remembering they are conventions, not hard rules. An $|r|$ around $0.9$ or above is usually called strong, around $0.5$ to $0.8$ moderate, and below about $0.3$ weak, with the exact cut-offs unimportant compared with reading the scatterplot. What matters on the exam is interpreting $r$ in context and pairing the number with the picture: "$r = 0.85$ indicates a strong positive linear association between distance and fuel used, consistent with the tight upward-sloping scatterplot." A common follow-up is the relationship between $r$ and $r^2$ (the coefficient of determination of Topic 2.8): $r^2$ is the fraction of the variation in $y$ explained by the linear model, so a correlation of $0.85$ corresponds to $r^2 = 0.7225$, meaning about $72\%$ of the variation in $y$ is accounted for by the linear relationship with $x$. Holding $r$ and $r^2$ as related-but-different ideas, one a measure of linear strength and the other a proportion of explained variation, prepares you for the regression topics that follow.
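
As a rough check on the link between $r$ and $r^2$, the sketch below fits a least-squares line to made-up distance and fuel data and compares $r^2$ with the fraction of variation in $y$ that the line explains, $1 - \text{SSE}/\text{SST}$. The data values and variable names are hypothetical; only the arithmetic $0.85^2 = 0.7225$ comes from the text above:

```python
from statistics import mean

# Hypothetical data: distance driven (miles) and fuel used (gallons).
distance = [10, 20, 30, 40, 50, 60]
fuel = [0.4, 0.9, 1.1, 1.7, 1.9, 2.5]

x_bar, y_bar = mean(distance), mean(fuel)
s_xx = sum((x - x_bar) ** 2 for x in distance)
s_yy = sum((y - y_bar) ** 2 for y in fuel)  # total variation in y (SST)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(distance, fuel))

r = s_xy / (s_xx * s_yy) ** 0.5

# Least-squares line: predicted fuel = a + b * distance.
b = s_xy / s_xx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(distance, fuel))  # unexplained variation

explained_fraction = 1 - sse / s_yy

print(r ** 2, explained_fraction)  # the two values should agree up to rounding
print(0.85 ** 2)                   # the text's example: about 0.7225
```

The agreement between `r ** 2` and `explained_fraction` holds for a least-squares line with one explanatory variable, which is the setting of this unit.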

Try this

Q1. State what the sign and the magnitude of $r$ each tell you. [2 points]

  • Cue. The sign gives the direction of the linear association (positive or negative); the magnitude (closeness of $|r|$ to $1$) gives its strength.

Q2. A scatterplot is strongly U-shaped, and $r = 0.02$. Does this mean no relationship? Explain. [1 point]

  • Cue. No; $r$ measures only linear association, so $r \approx 0$ here reflects the lack of a linear trend, not the absence of the strong non-linear (curved) relationship.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2019 (style), 1 mark, Section I (multiple choice)

A correlation of $r = -0.92$ between two variables indicates which of the following?

(A) A strong positive linear relationship
(B) A strong negative linear relationship
(C) A weak negative relationship
(D) That one variable causes the other to decrease

The correct answer is (B).

The sign of $r$ gives direction (negative here) and the magnitude gives strength; $|r| = 0.92$ is close to $1$, so the linear relationship is strong. Thus $r = -0.92$ means a strong negative linear relationship.

(A) has the wrong sign. (C) understates the strength. (D) wrongly infers causation; correlation never establishes cause. Sign for direction, magnitude for strength, and no causal claim.

AP 2022 (style), 4 marks, Section II (free response)

For $40$ towns, the correlation between number of churches and number of crimes is $r = 0.85$.

(a) Interpret this correlation.
(b) Explain why it would be wrong to conclude that building more churches causes more crime.
(c) A student computes $r$ for a scatterplot that is strongly curved and finds $r = 0.1$; explain what this small value does and does not tell you.

A 4-point question on interpreting and critiquing correlation.

(a) (1 point) Interpretation: there is a strong positive linear association between the number of churches and the number of crimes across these $40$ towns (towns with more churches tend to have more crimes).
(b) (2 points) Correlation is not causation (1 point); a lurking variable, population size, plausibly drives both, since larger towns have more churches and more crimes (1 point). So the association reflects town size, not a causal link.
(c) (1 point) A small $r$ ($0.1$) means a weak linear association, but because the pattern is strongly curved, $r$ near zero does not mean "no relationship"; there can be a strong non-linear relationship that $r$ fails to detect.

Markers reward a correct interpretation of strength and direction, the causation caution with a lurking variable, and the insight that $r$ measures linear association only.
