What does the correlation coefficient measure, and why does correlation not imply causation?
Use the correlation coefficient to describe the strength and direction of a linear relationship and distinguish correlation from causation (NC.M1.S-ID.8, S-ID.6c).
An NC Math 1 EOC answer on correlation (NC.M1.S-ID.8, S-ID.6c): what the correlation coefficient r measures, reading its sign and size, why correlation does not imply causation, and assessing fit with residuals.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
NC.M1.S-ID.8 asks you to use the correlation coefficient to describe the strength and direction of a linear relationship and to recognize that correlation does not imply causation. NC.M1.S-ID.6c adds assessing fit qualitatively with residuals. This is about quantifying and interpreting a linear relationship responsibly.
What the correlation coefficient measures
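The value r packs direction and strength into one number between -1 and 1. The sign gives the direction of the linear relationship, and the closeness of r to -1 or 1 gives its strength.
So r near 1 is strong positive, r near -1 is strong negative, and r near 0 is essentially no linear relationship.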
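On the EOC, r normally comes from a calculator or is given to you, but seeing the arithmetic can make the number less mysterious. The following is a minimal sketch, assuming the standard Pearson formula for r; the function name `pearson_r` and both data sets are invented for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for two paired lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: positive when x and y tend to rise together,
    # negative when one tends to fall as the other rises.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations scale the result so it lands between -1 and 1.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, made up for this example.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]       # tends to increase, with some scatter
falling = [10, 8, 6, 4, 2]     # lies exactly on a downward line

r_up = pearson_r(hours, rising)
r_down = pearson_r(hours, falling)
print(round(r_up, 2), round(r_down, 2))  # prints: 0.77 -1.0
```

The first data set gives a positive r that is well short of 1 because the points scatter around an upward trend; the second gives exactly -1 because every point sits on one downward line.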
Reading r in context
Why correlation is not causation
This is the most-tested idea in the strand. A strong correlation means two variables move together, but the explanation could be:
- A lurking variable affecting both (hot weather raising both ice cream sales and drowning rates).
- Reverse causation (the assumed effect actually drives the cause).
- Coincidence in a particular data set.
Only a controlled experiment, not a correlation, can establish causation.
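The lurking-variable case can be sketched with numbers. This is a minimal illustration, not real data: the temperatures, sales, and incident counts below are all hypothetical, chosen so that temperature drives both of the other variables, and `pearson_r` is a helper name assumed for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for two paired lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical values for eight days, invented for illustration only.
temperature = [60, 65, 70, 75, 80, 85, 90, 95]   # lurking variable (degrees F)
ice_cream = [20, 26, 29, 37, 40, 44, 52, 55]      # cones sold; rises with heat
drownings = [1, 1, 2, 2, 3, 3, 4, 5]              # incidents; also rises with heat

r_ice_drown = pearson_r(ice_cream, drownings)
r_temp_ice = pearson_r(temperature, ice_cream)
r_temp_drown = pearson_r(temperature, drownings)

# All three correlations come out strong and positive, even though
# ice cream and drownings are linked only through temperature.
print(round(r_ice_drown, 2), round(r_temp_ice, 2), round(r_temp_drown, 2))
```

The strong r between ice cream and drownings is real as a description of the data, yet nothing in the calculation says why the two move together. That part has to come from reasoning about the context.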
Residuals and fit (S-ID.6c)
A residual is the difference between an actual data value and the value predicted by the line of best fit (observed minus predicted). If residuals are small and randomly scattered with no pattern, the line fits well; a clear pattern in the residuals suggests a line is not the right model. This is a qualitative check on the fit of a line of best fit.
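The residual check can be sketched in a few lines. This is a minimal example under stated assumptions: the line is fitted by ordinary least squares, the helper names `fit_line` and `residuals` are invented here, and both data sets are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys):
    """Observed minus predicted for each point, using the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
roughly_linear = [2.1, 3.9, 6.2, 7.8, 10.0]
curved = [1, 4, 9, 16, 25]   # follows y = x^2, so a line is the wrong model

straight_res = residuals(xs, roughly_linear)
curved_res = residuals(xs, curved)

print([round(e, 2) for e in straight_res])  # small values, signs switch back and forth
print([round(e, 2) for e in curved_res])    # prints: [2.0, -1.0, -2.0, -1.0, 2.0]
```

For the roughly linear data the residuals are small and change sign with no pattern, which supports a linear model. For the curved data they run positive, negative, positive in a U shape, the kind of clear pattern that signals a line is not the right model.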
How the NC Math 1 EOC examines this topic
- Multiple choice. Interpret an r value's sign and strength, or identify a correlation-causation error.
- Short reasoning. Explain why a correlation does not prove causation.
- Technology-enhanced. Match r values to scatter plots.
This completes the two-variable thread that begins with scatter plots and parallels association in two-way tables.
Why the causation caution matters
It is tempting to leap from "these move together" to "this causes that," but that leap is one of the most common statistical mistakes, and the EOC tests it deliberately. The correlation coefficient is honest about what it measures (the tightness and direction of a linear pattern) and silent about why the pattern exists. A lurking variable can manufacture a strong correlation between two effects of a common cause, as with ice cream sales and drownings both driven by summer heat. Holding on to this distinction protects you from a whole class of wrong conclusions and is exactly the reasoning S-ID.8 asks for: report what r shows (association), and refuse to claim what it cannot show (causation) without an experiment.
Try this
Q1. What does a value of r close to 0 indicate about a linear relationship? [1 point]
- Cue. Near 0: essentially no linear relationship.
Q2. Shoe size and reading ability correlate in children. Does bigger feet cause better reading? [1 point]
- Cue. No; age is a lurking variable (older children have bigger feet and read better).
Exam-style practice questions
Practice questions written in the style of NCDPI exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
NC Math 1 EOC (style), 1 mark
A data set has a correlation coefficient r close to -1. What does this indicate?
(A) weak positive (B) strong negative (C) no relationship (D) strong positive
The correct answer is (B), strong negative.
The correlation coefficient ranges from -1 to 1. A value near -1 (like the one in this question) means a strong negative linear relationship: as one variable increases, the other tends to decrease, and the points lie close to a downward line. The sign gives direction; the size (closeness to -1 or 1) gives strength.
NC Math 1 EOC (style), 2 marks
Ice cream sales and drowning rates both rise in summer, with high correlation. Does ice cream cause drowning? Explain.
No. The two are correlated but neither causes the other; a lurking variable (hot weather) drives both.
High correlation only means the two variables move together, not that one causes the other. Here warm weather increases both ice cream sales and swimming (and thus drownings), so weather is a lurking variable. This is the classic "correlation does not imply causation" point that S-ID.8 tests directly.
Related dot points
- Represent two quantitative variables on a scatter plot, fit a linear model, and interpret slope and intercept in context (NC.M1.S-ID.6, S-ID.7).
An NC Math 1 EOC answer on scatter plots and linear models (NC.M1.S-ID.6, S-ID.7): describing form and strength, fitting a line of best fit, using it to predict, and interpreting slope and intercept in context.
- Use statistics appropriate to the shape of the distribution to compare center and spread of two or more data sets (NC.M1.S-ID.2).
An NC Math 1 EOC answer on center and spread (NC.M1.S-ID.2): mean versus median, range and IQR, choosing measures based on shape and outliers, and comparing two data sets.
- Represent data with dot plots, histograms, and box plots, and interpret the shape of a distribution (NC.M1.S-ID.1, S-ID.3).
An NC Math 1 EOC answer on representing data (NC.M1.S-ID.1, S-ID.3): reading and building dot plots, histograms, and box plots, and describing distribution shape, symmetry, skew, and outliers.
- Summarize two-variable categorical data in two-way tables and interpret joint, marginal, and conditional relative frequencies (NC.M1.S-ID.5).
An NC Math 1 EOC answer on two-way frequency tables (NC.M1.S-ID.5): reading counts, computing joint, marginal, and conditional relative frequencies, and recognizing possible association between two categorical variables.
- Compare linear, quadratic, and exponential functions across representations and observe that exponential growth eventually exceeds the others (NC.M1.F-LE.3, F-IF.9).
An NC Math 1 EOC answer on comparing function families (NC.M1.F-LE.3, F-IF.9): distinguishing linear, quadratic, and exponential by their patterns of change, comparing across tables and graphs, and why exponential growth eventually dominates.
Sources & how we know this
- North Carolina Standard Course of Study for Mathematics — NC Department of Public Instruction (2024)
- EOC NC Math 1 and NC Math 3 Test Specifications — NC Department of Public Instruction (2024)