How do we describe the shape, center, spread, and unusual features of a quantitative distribution?

Topic 1.6 Describing the Distribution of a Quantitative Variable: describe a quantitative distribution by its shape, center, spread, and unusual features (outliers, gaps, clusters) in context.

A focused answer to AP Statistics Topic 1.6, the SOCS framework for describing a quantitative distribution by shape, outliers, center, and spread, with the vocabulary of skew, modality, and clusters, and worked descriptions.

Generated by Claude Opus 4.89 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. The SOCS framework
  3. Shape
  4. Outliers and unusual features
  5. Center and spread
  6. Describing in context, completely
  7. Try this

What this topic is asking

The College Board (Topic 1.6) wants you to describe a quantitative distribution in words, covering its shape, center, spread, and unusual features (outliers, gaps, clusters), always in context. This verbal description is examined heavily on free-response questions.

The SOCS framework

Describing a distribution is the verbal counterpart of drawing it. The display from Topic 1.5 shows the picture; SOCS (shape, outliers, center, spread) turns that picture into a precise, contextual sentence that another reader could act on.

Shape

Shape captures the overall form of the distribution:

  • Symmetric: the two halves roughly mirror each other (a bell shape is the famous case).
  • Skewed right (positive skew): a long tail stretches toward higher values; most data sit at the low end. Income and house prices are classic examples.
  • Skewed left (negative skew): a long tail stretches toward lower values; most data sit at the high end. Exam scores on an easy test can look like this.
  • Modality: unimodal (one peak), bimodal (two peaks, often hinting at two subgroups), or uniform (roughly flat).

The single most-tested fact is that a distribution is skewed in the direction of its tail, not its hump.
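
One way to check a "which way is it skewed?" judgment numerically is the sign of the moment-based skewness coefficient: positive when the long tail is on the right, negative when it is on the left. The sketch below is illustrative only. The function name `skew_direction`, the cutoff of 0.5 for "roughly symmetric", and both data sets are assumptions made for this example, not part of the AP course description.

```python
from statistics import fmean, pstdev

def skew_direction(values, threshold=0.5):
    """Classify skew from the sign of the moment-based skewness coefficient.

    The threshold of 0.5 is an arbitrary illustrative cutoff, not an AP rule.
    """
    m = fmean(values)
    s = pstdev(values)
    if s == 0:
        return "no spread"
    # Average of cubed z-scores: a long right tail makes this positive,
    # a long left tail makes it negative.
    g1 = fmean([((x - m) / s) ** 3 for x in values])
    if g1 > threshold:
        return "skewed right"
    if g1 < -threshold:
        return "skewed left"
    return "roughly symmetric"

# Hypothetical data: incomes in thousands of dollars, and scores on an easy test.
incomes = [22, 25, 27, 30, 31, 33, 35, 38, 40, 120]
scores = [55, 88, 90, 91, 92, 93, 94, 95, 96, 98]

print(skew_direction(incomes))  # tail toward high values
print(skew_direction(scores))   # tail toward low values
```

On the exam you read skew from the display rather than compute it, so treat this purely as a way to confirm that the label follows the tail.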

Outliers and unusual features

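An outlier is a value that lies far from the rest of the data, usually separated from the main body by a gap. A gap is a stretch of the scale with no observations, and a cluster is a group of values bunched together. Calling out gaps and clusters matters: a bimodal, clustered distribution often means you are really looking at two different groups mixed together (for example heights of a mixed-sex group), and naming that is genuine insight, not decoration.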
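
The idea that a gap separates clusters can be sketched in a few lines: sort the values and start a new cluster wherever two neighbors are at least some chosen distance apart. The function name `split_clusters`, the `min_gap` parameter, and the heights below are hypothetical, and what counts as a "clear gap" is a judgment call rather than a fixed rule.

```python
def split_clusters(values, min_gap):
    """Split sorted values into clusters wherever neighbors differ by at least min_gap."""
    ordered = sorted(values)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev >= min_gap:
            clusters.append([curr])    # a gap: start a new cluster
        else:
            clusters[-1].append(curr)  # no gap: extend the current cluster
    return clusters

# Hypothetical heights (cm) from a mixed-sex group.
heights = [158, 160, 161, 163, 164, 176, 178, 179, 181, 183]

for cluster in split_clusters(heights, min_gap=10):
    print(min(cluster), "to", max(cluster), "cm:", len(cluster), "people")
```

With these made-up heights the split falls between 164 cm and 176 cm, which is the kind of feature worth naming in a written description.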

Center and spread

Center answers "what is a typical value?" You may cite the median (the middle value, resistant to skew and outliers) or the mean (the balance point, sensitive to skew). Spread answers "how variable are the values?" using the range, the interquartile range (IQR), or the standard deviation. The next two topics define these precisely; here you describe them in words and give an approximate value read from the display. A crucial pairing rule, which the exam rewards, is to match your measures: for a skewed distribution or one with outliers, use the median and IQR (both resistant), and for a roughly symmetric distribution use the mean and standard deviation. Quoting a mean for badly skewed income data, where the long right tail drags the mean above what most people earn, is a textbook mistake.
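
The pairing rule can be illustrated with Python's standard `statistics` module. This is a minimal sketch under stated assumptions: the helper `center_and_spread`, its `resistant` flag, and the income figures are invented for illustration, and `statistics.quantiles` uses one of several quartile conventions, so the IQR it gives may differ slightly from a hand or calculator method.

```python
from statistics import fmean, median, quantiles, stdev

def center_and_spread(values, resistant):
    """Return a matched pair of measures.

    resistant=True  -> median and IQR (for skewed data or data with outliers)
    resistant=False -> mean and sample standard deviation (roughly symmetric data)
    """
    if resistant:
        q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' method
        return {"measures": "median and IQR",
                "center": median(values),
                "spread": q3 - q1}
    return {"measures": "mean and standard deviation",
            "center": fmean(values),
            "spread": stdev(values)}

# Hypothetical right-skewed incomes, in thousands of dollars.
incomes = [22, 25, 27, 30, 31, 33, 35, 38, 40, 120]

print(center_and_spread(incomes, resistant=True))
print(center_and_spread(incomes, resistant=False))
```

In this made-up data set the single high income pulls the mean well above the median, which is the effect the pairing rule is meant to guard against.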

Describing in context, completely

The reason free-response questions on this topic lose so many marks is that students give a description that is either incomplete or generic. A description that says only "it is skewed right" misses center, spread, and unusual features, and a description that says "the center is about 40 with moderate spread" floats free of the situation. The College Board wants all four SOCS components and the context: name the variable, give units, and tie each statement to the data. Compare "it is skewed" with "the distribution of household incomes is skewed right, with most households earning between $30,000 and $70,000 (a typical value around $50,000), a spread of roughly $40,000 across the bulk, and a few very high incomes forming a long right tail." The second answer would earn every available point because it addresses shape, unusual features, center, and spread, each anchored to the variable and its units. Building the habit of running through SOCS in order, out loud or on scratch paper, guarantees you never drop a component under exam pressure.

Try this

Q1. A distribution has a long tail toward low values and most data at the high end. State its skew. [1 point]

  • Cue. Skewed left (negative skew), because the long tail points toward the lower values.

Q2. For a strongly right-skewed distribution, which center and spread should you report, and why? [2 points]

  • Cue. Median and IQR, because both are resistant to the skew and outliers, whereas the mean and standard deviation are pulled by the long tail.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice).
A histogram of household incomes has most values bunched at lower amounts with a long tail stretching toward high incomes. How is this distribution best described?
(A) Symmetric
(B) Skewed left
(C) Skewed right
(D) Uniform

The correct answer is (C).

A distribution is skewed in the direction of its long tail. Here the tail stretches toward high incomes (to the right), so the distribution is skewed right (positively skewed). Income data are a classic example.

(A) symmetric would have matching tails. (B) skewed left would have the long tail toward low values. (D) uniform would have roughly equal frequencies across the range. The tail points right, so it is skewed right.

AP 2022 (style), 4 marks, Section II (free response).
A dotplot shows the ages (years) of 25 members of a community choir. The values cluster between 30 and 55, with two isolated members aged 78 and 80, and a gap between 55 and 78.
(a) Describe the distribution in context, addressing shape, center, spread, and unusual features.
(b) Explain why the two oldest members are best described as outliers rather than simply the maximum values.

A 4-point question on a full distribution description.

(a) (3 points, one each for shape/unusual features, center, and spread, done in context): Shape and unusual features (1 point): the distribution is roughly clustered between 30 and 55 with a clear gap before two high outliers at 78 and 80; ignoring the outliers it is fairly symmetric or mildly skewed. Center (1 point): a typical chorister is around the low-to-mid 40s (cite an approximate median). Spread (1 point): the bulk of ages spans roughly 30 to 55 (a range of about 25 years), widening greatly if the two outliers are included.
(b) (1 point): the two oldest members are outliers because they are separated from the main cluster by a clear gap and lie far from the rest of the data, not merely the largest of a continuous spread; the gap is what marks them as unusual rather than just the high end of the range.

Markers reward addressing shape, center, and spread in context, and a justification that ties "outlier" to the gap and separation, not just to being the maximum.
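
To see why the worked answer leans on the median and treats the range with care, here is a small sketch using a made-up set of 25 ages that matches the description in the question (23 values between 30 and 55, plus 78 and 80). The question does not give the actual dotplot values, so every age below is an assumption for illustration only.

```python
from statistics import median

# Hypothetical ages consistent with the question's description, not the real dotplot.
ages = [30, 31, 33, 34, 35, 36, 38, 39, 40, 41, 42, 43, 44,
        45, 46, 47, 48, 49, 50, 52, 53, 54, 55, 78, 80]
main_cluster = [a for a in ages if a <= 55]  # drop the two isolated members

def value_range(values):
    """Range = maximum minus minimum."""
    return max(values) - min(values)

print("median, all members:", median(ages))
print("median, main cluster:", median(main_cluster))
print("range, all members:", value_range(ages))
print("range, main cluster:", value_range(main_cluster))
```

For these invented ages the median moves by only about a year when the two oldest members are dropped, while the range doubles, which is the sense in which the median is resistant and the range is not.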
