How does a linear regression model predict one variable from another, and how do we interpret its slope and intercept?
Topic 2.6 Linear Regression Models: write, interpret, and use a least-squares regression equation to predict a response, interpreting the slope and intercept in context, and recognizing the danger of extrapolation.
A focused answer to AP Statistics Topic 2.6, on the form of a regression equation, interpreting slope and intercept in context, making predictions, and the danger of extrapolation, with a worked prediction and interpretation.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 2.6) wants you to work with a least-squares regression model: write the prediction equation, interpret the slope and intercept in context, use the model to predict a response, and recognize the danger of extrapolation beyond the data.
The regression equation
The hat notation distinguishes the model's prediction from reality. An actual data point has an observed $y$; the line gives $\hat{y}$ for the same $x$; the gap between them is the residual, covered in the next topic. Writing $\hat{y}$ rather than $y$ in your equation is a small notation habit: the exam rewards it and penalizes its absence.
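As a reference point, the general form of the prediction equation can be written as below. The symbols $a$ (intercept) and $b$ (slope) are an assumption here, following the convention of the AP formula sheet; the residual line uses the "observed minus predicted" definition from Topic 2.7.

```latex
\hat{y} = a + bx
\qquad\text{and}\qquad
\text{residual} = y - \hat{y}
```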
Interpreting the slope
The slope is a rate of change, not a value of the response. It must say "predicted" because the line gives averages, not guarantees, and it must name both variables' units. Two recurring errors are dropping "predicted" (which wrongly implies every individual changes by exactly the slope) and reversing the variables (interpreting the slope as a change in $x$ per unit of $y$).
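One way to see the slope as a rate of change is to compare two predictions one unit apart. The sketch below borrows the ice-cream model $\hat{y} = 30 + 12x$ from the practice question on this page; the function names `predict` and `change_per_unit` are hypothetical, chosen only for illustration.

```python
# Illustrative model from the ice-cream practice question on this page:
# predicted sales (dollars) = 30 + 12 * temperature (degrees C).
INTERCEPT = 30.0
SLOPE = 12.0


def predict(x: float) -> float:
    """Return the predicted response (y-hat) for explanatory value x."""
    return INTERCEPT + SLOPE * x


def change_per_unit(x: float) -> float:
    """Return the change in the PREDICTED response when x rises by one unit."""
    return predict(x + 1) - predict(x)


if __name__ == "__main__":
    # The change is the same wherever you start, and it equals the slope.
    for temp in (12, 20, 33):
        print(f"from {temp} to {temp + 1} degrees: "
              f"predicted sales change by {change_per_unit(temp)} dollars")
```

The difference comes out the same at every starting temperature, which is why the interpretation is phrased "for each additional degree" rather than tied to one particular value of $x$.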
Interpreting the intercept
The intercept is the predicted response when $x = 0$. Sometimes this is meaningful (if $x = 0$ is a sensible, observed value), but very often it is not, because $x = 0$ lies far outside the data or is physically impossible. A regression of weight on height has its intercept at a height of $0$ cm, which is nonsense; you should interpret it as "the predicted weight when height is $0$ cm is the intercept value, but since no one is $0$ cm tall this is an extrapolation and not meaningful." Saying this, rather than pretending the intercept is a real prediction, demonstrates the understanding the exam is checking.
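Substituting $x = 0$ shows why the intercept is simply the prediction at zero. The first line assumes the generic form $\hat{y} = a + bx$ (AP formula-sheet notation, with $a$ the intercept and $b$ the slope); the second applies it to the ice-cream model $\hat{y} = 30 + 12x$ from the practice question on this page.

```latex
\hat{y} = a + b(0) = a
\qquad\text{for example}\qquad
\hat{y} = 30 + 12(0) = 30
```

Whether that value means anything still depends on whether $x = 0$ falls inside the observed range of the data.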
Prediction and the danger of extrapolation
To predict, substitute an $x$-value into the equation and compute $\hat{y}$. This is reliable only within the range of the observed data, where you have evidence the linear pattern holds. Extrapolation, predicting for an $x$ outside that range, is risky because there is no data to confirm that the relationship stays linear (or holds at all) out there. Ice-cream sales may rise linearly with temperature across the range you observed, but predicting sales at a temperature well above that range assumes a pattern you have never observed and that may break down (people might stay indoors).
The exam frequently sets up an extrapolation trap, giving a model with a stated valid range and then asking for a prediction outside it; the expected answer computes the value if asked but flags that it is an unreliable extrapolation. This is the regression version of "know the limits of your data," and it connects back to Topic 1.1's theme of what data can and cannot tell you.
Because slope and intercept interpretation, prediction, and the extrapolation caution together account for most regression marks on the exam, drilling the exact phrasing ("predicted," "per one-unit," units, and "outside the range of the data") turns these into reliable points.
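The "compute the value, then flag it" habit can be sketched in a few lines. This is a minimal illustration, not a standard library routine: the `Prediction` class and `predict_with_check` function are hypothetical names, and the numbers come from the ice-cream practice question on this page (model $\hat{y} = 30 + 12x$, observed temperatures from 10 to 35 degrees).

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    x: float             # explanatory value that was plugged in
    y_hat: float         # predicted response from the line
    extrapolated: bool   # True when x lies outside the observed range


def predict_with_check(x: float, intercept: float, slope: float,
                       x_min: float, x_max: float) -> Prediction:
    """Compute y-hat = intercept + slope * x and flag extrapolation.

    x_min and x_max are the smallest and largest observed x-values
    (assumed to be known from the data used to fit the line).
    """
    y_hat = intercept + slope * x
    extrapolated = not (x_min <= x <= x_max)
    return Prediction(x=x, y_hat=y_hat, extrapolated=extrapolated)


if __name__ == "__main__":
    for temp in (20, 50):
        p = predict_with_check(temp, intercept=30, slope=12, x_min=10, x_max=35)
        note = "extrapolation, treat as unreliable" if p.extrapolated else "within the data"
        print(f"x = {p.x}: predicted sales = {p.y_hat} ({note})")
```

The function still returns a number at 50 degrees, because the equation itself never refuses an input; the flag is what carries the statistical judgement.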
Try this
Q1. A line predicting test score from hours studied is . Interpret the slope. [2 points]
- Cue. For each additional hour studied, the predicted test score increases by points, on average.
Q2. Why is predicting outside the range of the data (extrapolation) unreliable? [1 point]
- Cue. There is no data out there to confirm the linear pattern continues, so the prediction rests on an untested assumption and may be badly wrong.
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2018 (style), 1 mark, Section I (multiple choice). A regression line predicting weight (kg) from height (cm) is . How is the slope interpreted? (A) A person cm tall weighs kg (B) For each additional cm of height, predicted weight increases by kg on average (C) Weight is times height (D) The correlation is
The correct answer is (B).
The slope is the predicted change in the response per one-unit increase in the explanatory variable: for each extra cm of height, predicted weight rises by kg, on average. Slope interpretation must include "predicted," "per one-unit increase," and units.
(A) describes the intercept (and meaninglessly here). (C) misreads the model as proportional. (D) confuses slope with correlation. The slope is a rate of change in the predicted response.
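AP 2021 (style), 4 marks, Section II (free response). The least-squares line relating ice-cream sales ($y$, in dollars) to temperature ($x$, in degrees C) is $\hat{y} = 30 + 12x$, based on temperatures from $10$ to $35$ degrees. (a) Interpret the slope. (b) Interpret the intercept. (c) Predict sales at $20$ degrees and explain why predicting sales at $50$ degrees would be unwise.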
A 4-point regression-interpretation question.
(a) (1 point) Slope: for each additional degree C of temperature, predicted ice-cream sales increase by $12, on average.
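(b) (1 point) Intercept: at $0$ degrees C, the model predicts \$30 in sales, but $0$ degrees is outside the observed range ($10$ to $35$ degrees), so it is not a meaningful prediction, just where the line crosses the axis.
(c) (2 points) Prediction at $20$ degrees: $\hat{y} = 30 + 12(20) = 30 + 240 = 270$, so predicted sales are \$270 (1 point). Predicting at $50$ degrees is unwise because $50$ is outside the observed range ($10$ to $35$); this is extrapolation, and the linear pattern may not hold there (1 point).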
Markers reward a slope interpretation with "predicted," "per one-unit," and units; an intercept interpretation noting the extrapolation; a correct prediction; and the extrapolation caution.
Related dot points
- Topic 2.4 Representing the Relationship Between Two Quantitative Variables: construct and describe scatterplots by direction, form, strength, and unusual features, in context.
A focused answer to AP Statistics Topic 2.4, on building scatterplots and describing them by direction, form, strength, and unusual features (the DUFS framework), in context, with a worked description.
- Topic 2.5 Correlation: calculate and interpret the correlation coefficient r, understand its properties (range, unit-free, resistance), and recognize what it can and cannot tell you.
A focused answer to AP Statistics Topic 2.5, defining the correlation coefficient r, its range and properties (unit-free, symmetric, non-resistant), what it measures and misses, and the correlation-causation caution, with a worked interpretation.
- Topic 2.7 Residuals: calculate and interpret residuals, construct and read residual plots, and use them to assess whether a linear model is appropriate.
A focused answer to AP Statistics Topic 2.7, defining the residual as observed minus predicted, interpreting positive and negative residuals, and using residual plots to judge whether a linear model is appropriate, with worked calculations.
- Topic 2.8 Least Squares Regression: determine the least-squares regression line from summary statistics, and interpret the coefficient of determination r-squared and the standard deviation of the residuals.
A focused answer to AP Statistics Topic 2.8, on why the least-squares line minimizes squared residuals, computing it from means, standard deviations, and r, and interpreting r-squared and s, with full worked calculations.
- Topic 2.9 Analyzing Departures from Linearity: identify outliers, high-leverage, and influential points in regression, and use transformations to model a non-linear relationship.
A focused answer to AP Statistics Topic 2.9, on regression outliers, high-leverage and influential points, and using transformations (logs and powers) to linearise a curved relationship, with a worked transformation example.
Sources & how we know this
- AP Statistics Course and Exam Description — College Board (2020)