When should you use the median instead of the mean?

Use the median for skewed data or data with outliers, because outliers pull the mean toward the tail but barely affect the median. For symmetric data with no outliers, the mean is a fair center. Match the measure to the shape of the distribution.

What is the difference between joint, marginal, and conditional frequencies?

In a two-way table, a joint frequency is a single inner cell divided by the grand total, a marginal frequency is a row or column total divided by the grand total, and a conditional frequency divides a cell by its row or column total. The phrase "of those who" signals a conditional frequency.

How do you interpret the slope of a line of best fit?

The slope is the predicted change in $y$ for each one-unit increase in $x$, a rate that should be stated in the units of the context. The y-intercept is the predicted $y$ when $x$ is zero. For example, in $y = 6x + 50$, where $x$ is hours studied and $y$ is a test score, each extra hour predicts about 6 more points and 50 is the predicted score at zero hours.

What does the correlation coefficient measure?

It measures the strength and direction of a linear relationship on a scale from $-1$ to $1$. The sign gives direction (positive for upward, negative for downward) and the absolute value gives strength (near 1 is strong, near 0 is weak). It only measures linear relationships, so a clearly curved pattern can still produce a value near 0.

Why does correlation not imply causation?

Two variables can move together because a lurking variable affects both, because of reverse causation, or by coincidence, without one causing the other. A classic example is ice cream sales and drowning rates, both driven by hot weather. Only a controlled experiment can establish causation.


NC Math 1: a complete guide to descriptive statistics

Q: What does the statistics strand cover on the NC Math 1 EOC?

It covers interpreting categorical and quantitative data (NC.M1.S-ID): representing one-variable data with dot plots, histograms, and box plots, comparing center and spread, two-way frequency tables, scatter plots and lines of best fit, and the correlation coefficient. Statistics and Probability is about 18 to 20 percent of the test.

A deep-dive NC Math 1 EOC guide to descriptive statistics (Statistics and Probability, about 18 to 20 percent of the test). Covers representing one-variable data with dot plots, histograms, and box plots, comparing center and spread, two-way frequency tables, scatter plots and lines of best fit, and the correlation coefficient with the correlation-versus-causation caution.

Generated by Claude Opus 4.8 | 14 min read | NC.M1.S-ID | Updated 2026-06-13

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section

What this strand demands
Representing one-variable data
Comparing center and spread
Two-way frequency tables
Scatter plots and lines of best fit
Correlation and causation
How this strand is examined
Check your knowledge

What this strand demands

This guide covers descriptive statistics on the NC Math 1 EOC, drawing on Interpreting Categorical and Quantitative Data (NC.M1.S-ID). The Statistics and Probability category is about 18 to 20 percent of the test, a reliable block that rewards careful reading of displays and clear interpretation. Each dot-point page has its own practice: representing data distributions, comparing center and spread, two-way frequency tables, scatter plots and linear models, and correlation and causation.

Representing one-variable data

Three displays show data on a single variable. A dot plot stacks a dot per value; a histogram groups values into bins and shows frequency; a box plot displays the five-number summary (minimum, $Q_1$, median, $Q_3$, maximum) with the box spanning the IQR. Shape is symmetric, skewed right (high-value tail), or skewed left (low-value tail), and may include outliers.
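As a rough illustration of the five-number summary behind a box plot, here is a minimal Python sketch. It assumes the "median of each half" convention for quartiles (the middle value is left out of both halves when the count is odd); other tools may use a different quartile rule and give slightly different values. The function name `five_number_summary` and the data are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: quartiles are the medians of the lower and upper halves,
    leaving the middle value out of both halves when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]        # values below the median position
    upper = data[n - half:]    # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Hypothetical data set, invented for illustration.
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]
minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1  # the width of the box in a box plot

print(minimum, q1, med, q3, maximum)  # 2 4 7.5 10 15
print("IQR:", iqr)                    # IQR: 6
```

The box of the box plot would run from 4 to 10, with the median line at 7.5 and whiskers reaching 2 and 15.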

Comparing center and spread

Center is the mean (sensitive to outliers) or median (resistant). Spread is the range (max minus min, sensitive to outliers) or IQR ($Q_3 - Q_1$, resistant). For skewed data or data with outliers, prefer the median and IQR. To compare two data sets, report both a center and a spread using the same measures for each.
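To see why the mean and range are called sensitive, the sketch below adds one extreme value to a small data set and recomputes the measures. The data and the helper name `center_and_range` are hypothetical; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median


def center_and_range(values):
    """Return (mean, median, range) for a list of numbers."""
    return (mean(values), median(values), max(values) - min(values))


# Hypothetical data, invented for illustration.
typical = [10, 12, 13, 14, 16]
with_outlier = typical + [60]   # the same data plus one extreme value

print(center_and_range(typical))
print(center_and_range(with_outlier))
```

With these numbers the mean moves from 13 to about 20.8 and the range from 6 to 50, while the median only shifts from 13 to 13.5.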

Two-way frequency tables

A two-way table cross-classifies two categorical variables. A joint frequency is one inner cell over the grand total; a marginal frequency is a margin total over the grand total; a conditional frequency divides a cell by its row or column total ("of those who..."). Comparing conditional frequencies across groups reveals association.
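The three kinds of relative frequency differ only in the denominator. The sketch below works through a small two-way table in Python; the counts, category labels, and function names are all hypothetical.

```python
# Hypothetical counts: rows are grade levels, columns are how students travel.
table = {
    "Grade 9": {"Bus": 30, "Car": 10},
    "Grade 10": {"Bus": 20, "Car": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())


def joint(row, col):
    """One inner cell over the grand total."""
    return table[row][col] / grand_total


def marginal_row(row):
    """A row total over the grand total."""
    return sum(table[row].values()) / grand_total


def marginal_col(col):
    """A column total over the grand total."""
    return sum(r[col] for r in table.values()) / grand_total


def conditional_given_row(row, col):
    """Of those in `row`, the fraction who are also in `col`."""
    return table[row][col] / sum(table[row].values())


print(joint("Grade 9", "Bus"))                   # 0.3
print(marginal_row("Grade 9"))                   # 0.4
print(marginal_col("Bus"))                       # 0.5
print(conditional_given_row("Grade 9", "Bus"))   # 0.75
print(conditional_given_row("Grade 10", "Bus"))  # roughly 0.33
```

Here 75 percent of Grade 9 students take the bus compared with about 33 percent of Grade 10 students. A gap like that between conditional frequencies is what suggests an association between the two variables.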

Scatter plots and lines of best fit

A scatter plot shows two numerical variables. Describe the direction (positive or negative), form (linear or not), and strength. Fit a line of best fit $y = mx + b$ to predict; interpret the slope as the predicted change in $y$ per unit of $x$ and the y-intercept as the predicted $y$ at $x = 0$.
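On the EOC a line of best fit is usually given or found with a calculator, but it can help to see where the numbers come from. The sketch below assumes the line is the ordinary least-squares line, which is what most calculators report. The data and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for the line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal, so no line can be fitted")
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [55, 63, 68, 73, 81]

m, b = fit_line(hours, scores)
predicted = m * 3.5 + b   # prediction inside the range of the data

print(f"slope = {m:.1f}, intercept = {b:.1f}")            # slope = 6.2, intercept = 49.4
print(f"predicted score at 3.5 hours = {predicted:.1f}")  # predicted score at 3.5 hours = 71.1
```

In context, the slope says each extra hour of study predicts about 6.2 more points, and the intercept says the predicted score at zero hours is about 49.4.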

Correlation and causation

The correlation coefficient $r$ (from $-1$ to $1$) measures the strength and direction of a linear relationship: sign for direction, size for strength. A strong correlation does not prove causation: a lurking variable, reverse causation, or coincidence can explain it. A residual (observed minus predicted) checks fit: small residuals scattered randomly around zero mean the line is a good fit.
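Both ideas in this section can be computed directly. The sketch below uses the standard Pearson formula for $r$ and the "observed minus predicted" definition of a residual. The data, the function names, and the line $y = 6.2x + 49.4$ (the least-squares line for this particular data) are illustrative assumptions.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


def residuals(xs, ys, m, b):
    """Observed y minus predicted y for the line y = m*x + b."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [55, 63, 68, 73, 81]

r = correlation(hours, scores)
print(f"r = {r:.3f}")  # r = 0.995

# Residuals for the line y = 6.2x + 49.4 fitted to this data.
errors = residuals(hours, scores, 6.2, 49.4)
print([round(e, 1) for e in errors])
```

Here $r$ is close to $1$, so the relationship is strong and positive. The residuals are all within about 1.2 points of zero and switch sign without a pattern, which supports using a line.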

How this strand is examined

Gridded response. Compute a mean, median, IQR, relative frequency, or prediction. Exact-match scoring.
Multiple choice and multiple select. Identify shape, choose a measure, interpret slope or $r$, or spot a correlation-causation error.
Technology-enhanced. Build a box plot or table, or match scatter plots to descriptions.

Check your knowledge

Work these as you would for credit on the EOC.

1. A five-number summary is $5, 8, 11, 16, 25$. Find the IQR. (1 point)
2. Find the mean and median of $3, 4, 4, 5, 39$, and say which is more representative. (2 points)
3. A histogram has a long tail toward high values. Name the shape. (1 point)
4. Of $60$ students, $24$ play music and also a sport. What is this joint relative frequency? (1 point)
5. Of $30$ musicians, $24$ play a sport. What is this conditional relative frequency? (1 point)
6. A line of best fit is $y = 5x + 12$. Interpret the slope. (1 point)
7. Using $y = 2x + 8$, predict $y$ when $x = 6$. (1 point)
8. What does $r = -0.92$ indicate? (1 point)

Solutions to the check-your-knowledge questions

Work through each question in order, matching the method to the type of data or display.

Step 1: Find the IQR from a five-number summary

The IQR (interquartile range) measures the spread of the middle half of the data. It is always $Q_3$ minus $Q_1$, not the full range:

$$\text{IQR} = Q_3 - Q_1 = 16 - 8 = 8.$$

Answer: $8$.

Step 2: Find the mean and median, and choose the more representative measure

Add all five values to find the sum, then divide by the count for the mean. The median is the middle value when the data are ordered:

$$\text{Mean} = \frac{3 + 4 + 4 + 5 + 39}{5} = \frac{55}{5} = 11; \qquad \text{Median} = 4.$$

The value $39$ is an outlier that pulls the mean up to $11$, far above most of the data. The median ($4$) is resistant to that outlier, so it better represents where the data actually cluster.

Answer: Mean $= 11$, median $= 4$; the median is more representative because of the outlier $39$.

Step 3: Name the shape of the histogram

A long tail stretching toward higher values means the distribution is pulled to the right. The bulk of data is at low values and the tail extends upward.

Answer: Skewed right.

Step 4: Calculate the joint relative frequency

A joint relative frequency divides a single inner cell by the grand total of all students. The cell count is $24$ and the grand total is $60$:

$$\frac{24}{60} = 0.40 = 40\%.$$

Answer: $40\%$ of all students both play music and play a sport.

Step 5: Calculate the conditional relative frequency

A conditional relative frequency restricts the denominator to a subgroup rather than the grand total. The condition is "of the $30$ musicians," so divide the $24$ who also play a sport by $30$:

$$\frac{24}{30} = 0.80 = 80\%.$$

Answer: $80\%$ of musicians also play a sport.

Step 6: Interpret the slope of the line of best fit

The slope of a line of best fit is the predicted change in $y$ for each one-unit increase in $x$. In $y = 5x + 12$, the slope is $5$.

Answer: Each one-unit increase in $x$ predicts about $5$ more units in $y$.

Step 7: Make a prediction using the line of best fit

Substitute $x = 6$ into the equation and evaluate. This gives the predicted $y$-value at that input:

$$y = 2(6) + 8 = 12 + 8 = 20.$$

Answer: $y = 20$.

Step 8: Interpret the correlation coefficient

The sign of $r$ shows direction and the size shows strength, on a scale from $-1$ to $1$. With $r = -0.92$, the sign is negative (as $x$ increases, $y$ tends to decrease) and the magnitude $0.92$ is close to $1$ (a strong relationship).

Answer: A strong negative linear relationship.
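The numerical answers above can be double-checked with a few lines of Python. This is only an arithmetic check using the numbers stated in the questions; the variable names are illustrative.

```python
from statistics import mean, median

# Question 1: IQR from the five-number summary 5, 8, 11, 16, 25.
q1, q3 = 8, 16
iqr = q3 - q1

# Question 2: mean and median of 3, 4, 4, 5, 39.
data = [3, 4, 4, 5, 39]
data_mean = mean(data)
data_median = median(data)

# Questions 4 and 5: the same cell count with different denominators.
joint = 24 / 60        # out of all students
conditional = 24 / 30  # out of musicians only

# Question 7: prediction from y = 2x + 8 at x = 6.
prediction = 2 * 6 + 8

print(iqr, data_mean, data_median, joint, conditional, prediction)
```

Questions 4 and 5 use the same count of $24$ and differ only in the denominator, which is the whole difference between a joint and a conditional relative frequency.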

Sources & how we know this

North Carolina Standard Course of Study for Mathematics — NC Department of Public Instruction (2024)
EOC NC Math 1 and NC Math 3 Test Specifications — NC Department of Public Instruction (2024)