How do we describe the shape, center, spread, and unusual features of a quantitative distribution?

Topic 1.6 Describing the Distribution of a Quantitative Variable: describe a quantitative distribution by its shape, center, spread, and unusual features (outliers, gaps, clusters) in context.

A focused answer to AP Statistics Topic 1.6, the SOCS framework for describing a quantitative distribution by shape, outliers, center, and spread, with the vocabulary of skew, modality, and clusters, and worked descriptions.

Generated by Claude Opus 4.8. 9 min answer. Updated 2026-06-04.

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section

What this topic is asking
The SOCS framework
Shape
Outliers and unusual features
Center and spread
Describing in context, completely
Try this

What this topic is asking

The College Board (Topic 1.6) wants you to describe a quantitative distribution in words, covering its shape, center, spread, and unusual features (outliers, gaps, clusters), always in context. This verbal description is examined heavily on free-response questions.

The SOCS framework

Describing a distribution is the verbal counterpart of drawing it. The display from Topic 1.5 shows the picture; SOCS (Shape, Outliers, Center, Spread) turns that picture into a precise, contextual description that another reader could act on.

Shape

Shape captures the overall form of the distribution:

Symmetric: the two halves roughly mirror each other (a bell shape is the famous case).
Skewed right (positive skew): a long tail stretches toward higher values; most data sit at the low end. Income and house prices are classic examples.
Skewed left (negative skew): a long tail stretches toward lower values; most data sit at the high end. Exam scores on an easy test can look like this.
Modality: unimodal (one peak), bimodal (two peaks, often hinting at two subgroups), or uniform (roughly flat).

A frequently tested fact is that a distribution is skewed in the direction of its tail, not its hump: the skew is named for the side where the data thin out, not the side where they pile up.
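The tail rule can be cross-checked numerically with a rough heuristic: a long tail usually pulls the mean toward it, so the mean tends to sit on the tail side of the median. The sketch below is an illustration only. The function name `skew_direction`, the `tolerance` parameter, and both data sets are invented for this example, and the mean-versus-median comparison is a rule of thumb, not a definition of skew. On the exam, read the direction of the tail from the display.

```python
from statistics import mean, median

def skew_direction(values, tolerance=0.05):
    """Guess the skew from mean vs. median (a heuristic, not a definition).

    `tolerance` is an arbitrary, assumed cut-off: differences smaller than
    this fraction of the range are treated as "roughly symmetric".
    """
    m, med = mean(values), median(values)
    value_range = max(values) - min(values)
    if value_range == 0 or abs(m - med) <= tolerance * value_range:
        return "roughly symmetric"
    # The long tail drags the mean toward it, past the median.
    return "skewed right" if m > med else "skewed left"

# Hypothetical data, invented for illustration only.
incomes = [28, 31, 33, 35, 36, 38, 40, 42, 45, 120]  # thousands of dollars
scores = [45, 82, 85, 88, 90, 91, 93, 94, 95, 96]    # marks on an easy test

print(skew_direction(incomes))  # skewed right (tail toward high values)
print(skew_direction(scores))   # skewed left (tail toward low values)
```

In both cases the label follows the tail: one very high income and one very low score set the direction, even though most values are piled up at the opposite end.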

Outliers and unusual features

Calling out gaps and clusters matters: a bimodal, clustered distribution often means you are really looking at two different groups mixed together (for example heights of a mixed-sex group), and naming that is genuine insight, not decoration.
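To see what "a gap" means in the raw numbers, the sketch below sorts a small data set and reports any place where consecutive values are further apart than a chosen threshold. It is a minimal illustration: `find_gaps`, the `min_gap` threshold, and the heights are all hypothetical, and the threshold is a judgment call, not a College Board rule.

```python
def find_gaps(values, min_gap):
    """Return (low, high) pairs of neighbouring sorted values that differ
    by more than `min_gap` (an assumed, user-chosen threshold)."""
    ordered = sorted(values)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > min_gap]

# Hypothetical heights (cm) for a mixed group, invented for illustration.
heights = [158, 160, 161, 163, 164, 176, 178, 179, 181, 183]

print(find_gaps(heights, min_gap=5))  # [(164, 176)]
```

The single gap between 164 cm and 176 cm splits the data into two clusters, which is the numerical trace of the "two groups mixed together" pattern described above.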

Center and spread

Center answers "what is a typical value?" You may cite the median (the middle value, resistant to skew and outliers) or the mean (the balance point, sensitive to skew). Spread answers "how variable are the values?" using the range, the interquartile range (IQR), or the standard deviation. The next two topics define these precisely; here you describe them in words and give an approximate value read from the display.

A crucial pairing rule, which the exam rewards, is to match your measures: for a skewed distribution or one with outliers, use the median and IQR (both resistant), and for a roughly symmetric distribution use the mean and standard deviation. Quoting a mean for badly skewed income data, where the long right tail drags the mean above what most people earn, is a textbook mistake.
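The difference between resistant and non-resistant measures is easy to see by changing one value into an extreme outlier. The sketch below uses Python's standard `statistics` module on invented income figures. The helper `summarize` and the data are assumptions for illustration, and `statistics.quantiles` uses its default "exclusive" method, which may differ slightly from the quartile convention in your textbook or calculator.

```python
from statistics import mean, median, quantiles, stdev

def summarize(values):
    """Return the two center/spread pairs for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default "exclusive" method
    return {
        "mean": mean(values),
        "sd": stdev(values),       # sample standard deviation
        "median": median(values),
        "iqr": q3 - q1,
    }

# Hypothetical household incomes in thousands of dollars (invented data).
incomes = [30, 35, 40, 45, 50, 55, 60, 65, 70]
# Same data, but the top income is replaced by one extreme value.
with_outlier = incomes[:-1] + [700]

before = summarize(incomes)
after = summarize(with_outlier)

print(before["median"], after["median"])  # median is unchanged at 50
print(before["iqr"], after["iqr"])        # IQR is unchanged at 25
print(before["mean"], after["mean"])      # mean jumps from 50 to 120
print(before["sd"] < after["sd"])         # True: the SD is inflated too
```

One extreme income leaves the median and IQR untouched while more than doubling the mean, which is why the median and IQR are the safer pair for skewed data or data with outliers.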

Describing in context, completely

The reason free-response questions on this topic lose so many marks is that students give a description that is either incomplete or generic. A description that says only "it is skewed right" misses center, spread, and unusual features, and a description that says "the center is about $40$ with moderate spread" floats free of the situation. The College Board wants all four SOCS components and the context: name the variable, give units, and tie each statement to the data. Compare "it is skewed" with "the distribution of household incomes is skewed right, with most households earning between \$30{,}000 and \$70{,}000 (a typical value around \$50{,}000), a spread of roughly \$40{,}000 across the bulk, and a few very high incomes forming a long right tail." The second answer would earn every available point because it addresses shape, unusual features, center, and spread, each anchored to the variable and its units. Building the habit of running through SOCS in order, out loud or on scratch paper, guarantees you never drop a component under exam pressure.

Describing a distribution with SOCS

A histogram shows the times (minutes) that $40$ customers waited in a queue. The bars rise to a peak around $4$ to $6$ minutes, then fall away gradually toward higher times, with one isolated bar at $20$ to $22$ minutes separated by a gap. Describe the distribution.

step 1 Shape

The distribution is unimodal and skewed right: a single peak at the low end (around $4$ to $6$ minutes) with a tail stretching toward longer wait times.

step 2 Outliers and unusual features

There is a gap before an isolated bar at $20$ to $22$ minutes, so those longest waits are outliers, clearly separated from the main body of the data.

step 3 Center

Because the distribution is skewed, the median is the better center; a typical customer waited roughly $6$ to $7$ minutes (read from where about half the data lie below).

step 4 Spread

Most customers waited between about $2$ and $12$ minutes (use the IQR for a resistant spread); including the outliers stretches the full range out to $22$ minutes.

step 5 Interpret in context

The wait times for these $40$ customers are unimodal and skewed right: a typical customer waited about $6$ to $7$ minutes, most waits fell between about $2$ and $12$ minutes, and a few unusually long waits near $21$ minutes form a separated outlier group, so the median and IQR best summarize the typical experience.

Try this

Q1. A distribution has a long tail toward low values and most data at the high end. State its skew. [1 point]

Cue. Skewed left (negative skew), because the long tail points toward the lower values.

Q2. For a strongly right-skewed distribution, which center and spread should you report, and why? [2 points]

Cue. Median and IQR, because both are resistant to the skew and outliers, whereas the mean and standard deviation are pulled by the long tail.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice)

A histogram of household incomes has most values bunched at lower amounts with a long tail stretching toward high incomes. How is this distribution best described? (A) Symmetric (B) Skewed left (C) Skewed right (D) Uniform

The correct answer is (C).

A distribution is skewed in the direction of its long tail. Here the tail stretches toward high incomes (to the right), so the distribution is skewed right (positively skewed). Income data are a classic example.

(A) symmetric would have matching tails. (B) skewed left would have the long tail toward low values. (D) uniform would have roughly equal frequencies across the range. The tail points right, so it is skewed right.

AP 2022 (style), 4 marks, Section II (free response)

A dotplot shows the ages (years) of $25$ members of a community choir. The values cluster between $30$ and $55$, with two isolated members aged $78$ and $80$, and a gap between $55$ and $78$. (a) Describe the distribution in context, addressing shape, center, spread, and unusual features. (b) Explain why the two oldest members are best described as outliers rather than simply the maximum values.

A 4-point question on a full distribution description.

(a) (3 points, one each for shape/unusual features, center, spread done in context): Shape and unusual features (1 point): the distribution is roughly clustered between $30$ and $55$ with a clear gap before two high outliers at $78$ and $80$; ignoring the outliers it is fairly symmetric or mildly skewed. Center (1 point): a typical chorister is around the low-to-mid $40$s (cite an approximate median). Spread (1 point): the bulk of ages spans roughly $30$ to $55$ (a range of about $25$ years), widening greatly if the two outliers are included.
(b) (1 point): the two oldest members are outliers because they are separated from the main cluster by a clear gap and lie far from the rest of the data, not merely the largest of a continuous spread; the gap is what marks them as unusual rather than just the high end of the range.

Markers reward addressing shape, center, and spread in context, and a justification that ties "outlier" to the gap and separation, not just to being the maximum.

Sources & how we know this

AP Statistics Course and Exam Description — College Board (2020)