What does the correlation coefficient tell you, why does correlation not imply causation, and how do residuals show whether a linear model fits?

Interpret the correlation coefficient as a measure of the strength and direction of a linear association, distinguish correlation from causation, and use residuals to assess the fit of a linear model (MA.912.DP.2.6, MA.912.DP.2.8, MA.912.DP.2.9).

A B.E.S.T. Algebra 1 EOC answer on correlation (MA.912.DP.2): reading the correlation coefficient r, why correlation does not prove causation, what a lurking variable is, and how to use residuals to judge a linear fit.

Generated by Claude Opus 4.811 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. The correlation coefficient
  3. Correlation is not causation
  4. Residuals and fit
  5. How the B.E.S.T. EOC examines this topic
  6. Why correlation cannot establish causation
  7. Try this

What this topic is asking

MA.912.DP.2 asks you to interpret the correlation coefficient r, to understand why correlation does not imply causation, and to use residuals to judge whether a linear model fits. These are conceptual statistics points on the B.E.S.T. Algebra 1 EOC, tested mostly with multiple choice and short interpretation.

The correlation coefficient

The correlation coefficient r is a single number summarizing a linear association:

  • Sign: positive r means an upward trend; negative r means a downward trend.
  • Magnitude: |r| near 1 is strong (points hug a line); |r| near 0 is weak or no linear association.

So r = 0.9 is strong positive, r = −0.3 is weak negative, and r = 0.05 is essentially no linear association. Note that r measures only linear association; a strong curve can have r near 0.
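
To make the sign and magnitude rules concrete, here is a minimal Python sketch that computes r from its standard definition (sum of products of deviations divided by the square root of the product of the sums of squared deviations). The function name pearson_r and all three data sets are made up for illustration; on the EOC you read r rather than compute it by hand.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: the same x values paired with three different patterns.
xs = [-3, -2, -1, 0, 1, 2, 3]
line_up = [2 * x + 1 for x in xs]    # perfectly linear, rising
line_down = [-x + 4 for x in xs]     # perfectly linear, falling
parabola = [x ** 2 for x in xs]      # strong pattern, but curved

r_up = pearson_r(xs, line_up)        # 1.0
r_down = pearson_r(xs, line_down)    # -1.0
r_curve = pearson_r(xs, parabola)    # 0.0
print(r_up, r_down, r_curve)
```

The parabola is the important case: y is completely determined by x, yet r is 0 because the relationship is not linear.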

Correlation is not causation

A strong correlation means two variables move together, but it does not prove one causes the other. The classic reason is a lurking variable, a third factor driving both.

Only a controlled experiment (randomly assigning a treatment) can establish causation; observational correlation cannot.

Residuals and fit

A residual is the difference between an actual data value and the model's prediction:

residual = actual y − predicted y

A positive residual means the point is above the line; a negative residual means it is below. A residual plot graphs the residuals against x. If the residuals scatter randomly around zero with no pattern, the linear model is a good fit. If they form a clear pattern (a curve or a funnel), a line is not the right model.
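
As a rough illustration of a patterned residual plot, the Python sketch below fits a least-squares line to data that actually follows a curve and lists the residuals (actual minus predicted). The helper names fit_line and residuals and the data values are assumptions chosen for this example, not part of the standard.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Residual = actual y minus predicted y, one per data point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Hypothetical data that follows y = x^2, so a line is the wrong model.
xs = [1, 2, 3, 4, 5]
curved_ys = [1, 4, 9, 16, 25]

slope, intercept = fit_line(xs, curved_ys)   # slope 6.0, intercept -7.0
res = residuals(xs, curved_ys, slope, intercept)
print(res)   # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals are positive at both ends and negative in the middle, which is the U-shaped pattern that signals a poor linear fit.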

How the B.E.S.T. EOC examines this topic

  • Multiple choice. Interpret r (direction and strength), or identify a correlation-causation error.
  • Short interpretation. Explain a lurking variable, or why correlation is not causation.
  • Residual items. Compute a residual or read a residual plot to judge fit.

A clarifying idea: r and the line of best fit describe the same linear trend. The value of r scores how tightly the points follow the line, while the line gives the trend's equation. A high |r| means predictions from the line are more trustworthy.

Why correlation cannot establish causation

The reason a correlation alone never proves cause is that several different real situations produce the same observed pattern, and the data cannot tell them apart. Variable X might cause Y; Y might cause X; a lurking variable Z might cause both; or the link might be coincidence in a small sample. Ice-cream sales and drownings rise together not because either causes the other, but because summer heat (Z) lifts both. Since the scatter plot only records that X and Y move together, it is consistent with all four explanations, so concluding causation overreaches the evidence. To actually pin down cause, you need a controlled experiment that randomly assigns the supposed cause and holds other factors fixed, ruling out lurking variables and reverse causation. This is why the EOC rewards naming a plausible third variable and stating plainly that correlation does not imply causation.
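
The lurking-variable case can be sketched with a small, entirely made-up data set in Python. In the sketch, temperature sets the general level of both ice-cream sales and drownings, but within each temperature group the two are unrelated. The numbers and the pearson_r helper are illustrative assumptions, not real data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical daily records, split by the lurking variable (temperature).
cool_sales = [40, 60, 40, 60]        # ice-cream sales on cool days
cool_drownings = [1, 1, 3, 3]        # drownings on the same cool days
hot_sales = [140, 160, 140, 160]     # heat lifts sales...
hot_drownings = [6, 6, 8, 8]         # ...and also lifts drownings

# Pooled data: what an observer sees if temperature is ignored.
r_all = pearson_r(cool_sales + hot_sales, cool_drownings + hot_drownings)

# Holding temperature fixed: the association disappears.
r_cool = pearson_r(cool_sales, cool_drownings)
r_hot = pearson_r(hot_sales, hot_drownings)

print(round(r_all, 2), r_cool, r_hot)
```

With these numbers the pooled r is about 0.91, while r within each temperature group is 0. The strong pooled correlation comes entirely from temperature, which is the point of naming the lurking variable.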

Try this

Q1. Interpret r = 0.85. [1 point]

  • Cue. A strong positive linear association.

Q2. A residual plot shows a clear U-shaped pattern. Is a line a good model? [1 point]

  • Cue. No; a pattern in the residuals means a line does not fit well.

Exam-style practice questions

Practice questions written in the style of FLDOE exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

B.E.S.T. (style), 1 mark

Multiple choice. A correlation coefficient of r = −0.95 indicates:

  (A) a strong negative linear association
  (B) a weak negative association
  (C) a strong positive association
  (D) no association

The correct answer is (A).

The correlation coefficient r ranges from −1 to 1. The sign gives direction (negative here), and the magnitude gives strength: |−0.95| = 0.95 is close to 1, so the linear association is strong. Thus r = −0.95 is a strong negative linear association. A value near 0 would mean little or no linear association.

B.E.S.T. (style), 2 marks

Ice-cream sales and drowning incidents are strongly positively correlated across the year. Explain why this does not mean ice cream causes drownings.

The correlation is explained by a lurking variable, hot weather, not by ice cream causing drownings.

A strong correlation shows the two rise together, but a third variable, summer heat, independently increases both ice-cream sales (people buy more) and swimming (so more drownings). Correlation measures association, not cause; without a controlled experiment, you cannot conclude one variable causes the other. Markers reward naming the lurking variable and stating that correlation does not imply causation.
