What sampling methods use chance to select a sample, and how do they differ?
Topic 3.3 Random Sampling and Data Collection: describe and distinguish simple random, stratified, cluster, and systematic random sampling, and explain why random selection supports generalization to a population.
A focused answer to AP Statistics Topic 3.3, describing simple random, stratified, cluster, and systematic random sampling, how each uses chance, their trade-offs, and why random selection allows generalization, with a worked SRS selection.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 3.3) wants you to describe and distinguish the random sampling methods, simple random, stratified, cluster, and systematic, and to explain why using chance to select a sample is what lets results generalize to a population.
The simple random sample
The SRS is the benchmark every other method is compared to. Its strength is fairness: with no systematic preference for any subgroup, the sample is, on average, a fair miniature of the population, which is exactly what licenses generalization. Its weakness is practical: it needs a list of the whole population (a sampling frame) and can be costly if individuals are spread out.
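As an illustration of the procedure, here is a minimal Python sketch of drawing an SRS from a sampling frame. The frame of 500 ID numbers, the sample size of 20, and the function name `simple_random_sample` are all hypothetical, chosen only to make the idea concrete.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals from the sampling frame.

    random.sample selects without replacement, so every possible
    group of n individuals is equally likely to be the sample.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: every individual numbered 1..500.
frame = list(range(1, 501))

# The seed only makes the run repeatable; it is not part of the method.
srs = simple_random_sample(frame, 20, seed=1)
print(sorted(srs))
```

The need for `frame` to exist as a complete list is the practical weakness described above: without a full sampling frame there is nothing to number and draw from.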
Stratified, cluster, and systematic sampling
A clean way to keep stratified and cluster apart: in stratified sampling you take some from every group; in cluster sampling you take everyone from some groups. Strata should be internally similar (so a few individuals from each capture the differences between strata), whereas clusters should each resemble the whole (so a few clusters capture the population's variety). Systematic sampling works differently again: you pick a random starting point and then take every kth individual after it.
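The "some from every group" versus "everyone from some groups" contrast can be sketched in Python. The school below (four grade levels of 30 students as strata, twelve homerooms of 10 students as clusters) and both function names are hypothetical, made up only for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Take a simple random sample of n_per_stratum from EVERY stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters whole clusters and take EVERYONE in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return chosen, sample

rng = random.Random(2)  # seeded only so the run is repeatable

# Hypothetical strata: grade levels, 30 students in each.
strata = {g: [f"G{g}-{i}" for i in range(1, 31)] for g in (9, 10, 11, 12)}

# Hypothetical clusters: 12 homerooms of 10 students each.
clusters = {f"HR{h}": [f"HR{h}-{i}" for i in range(1, 11)] for h in range(1, 13)}

strat = stratified_sample(strata, 5, rng)         # 5 students from each of the 4 grades
chosen, clus = cluster_sample(clusters, 3, rng)   # all students in 3 homerooms

print(len(strat), "students, every grade represented")
print(len(clus), "students, all from homerooms", chosen)
```

In the stratified sample chance acts inside every group; in the cluster sample chance acts only on which groups are chosen.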
Why random selection matters
Every one of these methods uses chance to decide who is in the sample, and that is the point. Chance selection removes the human tendencies that create bias: a researcher cannot favor agreeable respondents, convenient locations, or visible subgroups, because a random mechanism, not judgement, makes the choices. As a result the sample is representative in the long run, the difference between the sample statistic and the population parameter is purely random sampling error (which is quantifiable and shrinks with sample size), and the result can be generalized to the population. This is the bridge to inference: confidence intervals and significance tests in later units all assume the data came from a random sample, because only then is the chance behavior of the statistic known. A non-random sample breaks this chain, which is why Topic 3.4 catalogues the bias that follows when randomisation is missing.
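A small simulation can make the claim about random sampling error concrete. This is only a sketch under stated assumptions: the population of 10,000 values (drawn here from a normal model with mean 50 and standard deviation 10), the sample sizes 10 and 100, and the 2,000 repetitions are all hypothetical choices.

```python
import random
import statistics

rng = random.Random(3)  # seeded only so the run is repeatable

# Hypothetical population of 10,000 values with a known mean.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)

def sample_means(n, reps=2000):
    """Draw many simple random samples of size n and record each sample mean."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(reps)]

means_10 = sample_means(10)
means_100 = sample_means(100)

print("population mean:", round(mu, 2))
for n, means in ((10, means_10), (100, means_100)):
    print(
        f"n={n}: average of sample means = {statistics.mean(means):.2f}, "
        f"spread (SD) of sample means = {statistics.stdev(means):.2f}"
    )
```

Under these assumptions the sample means should centre on the population mean for both sample sizes (no systematic bias), while their spread should be noticeably smaller for the larger samples, which is the sense in which sampling error is quantifiable and shrinks with sample size.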
Choosing among the designs
The exam often asks you to pick or justify a design, so it helps to know the trade-offs. Use stratified sampling when the population has clear subgroups that differ on the variable of interest (men and women, grade levels, regions); sampling within each guarantees representation and usually gives a more precise estimate than an SRS of the same size. Use cluster sampling when the population is naturally grouped by location and travelling to scattered individuals would be expensive; surveying a few whole neighborhoods is cheaper, at the cost of some precision if clusters are not truly representative. Use systematic sampling when you lack a full list but can take every kth item off a production line or a queue, starting from a randomly chosen point. The SRS is the default when a full list exists and cost is not prohibitive. None of these is "biased"; they are all valid random methods with different cost-and-precision profiles, which is the judgement Topic 3.3 trains.
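Systematic random sampling is the one design here that needs no full list in advance, and a short Python sketch shows how chance enters only through the starting point. The production line of 200 items, the interval k = 10, and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(items, k, seed=None):
    """Choose a random start among the first k items, then take every k-th item."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random index from 0 to k-1
    return items[start::k]     # every k-th item from the random start

# Hypothetical production line: items numbered 1..200 in the order they come off.
line = list(range(1, 201))

sample = systematic_sample(line, 10, seed=4)
print(sample)
```

Once the start is fixed, every later selection is determined, so the method is quick to carry out on a queue or a line. It can mislead if the items follow a repeating pattern whose period matches the interval.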
Try this
Q1. State the difference between stratified and cluster sampling in one sentence each. [2 points]
- Cue. Stratified: split into similar groups and take some from every group. Cluster: split into representative groups and take everyone from a few randomly chosen groups.
Q2. Why does random selection let a sample result generalize to the population? [1 point]
- Cue. Chance selection removes systematic bias, so the sample is representative and the only error is quantifiable random sampling error.
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2019 (style), 1 mark, Section I (multiple choice). A school divides students into grade levels, then randomly selects students from each grade. This is an example of (A) a simple random sample (B) a stratified random sample (C) a cluster sample (D) a convenience sample
The correct answer is (B).
The school splits students into groups (strata) that share a characteristic (grade level), then takes a random sample from every stratum. That is a stratified random sample.
(A) In an SRS every group of the same size is equally likely; selecting a fixed number from each grade is not an SRS. (C) A cluster sample randomly selects whole groups and uses everyone in them, not a few from each. (D) There is randomisation, so it is not convenience. Sampling within every stratum makes it stratified.
AP 2022 (style), 4 marks, Section II (free response). A city has districts and wants to survey residents about a transport plan. (a) Describe how to select a simple random sample of residents from a list of all residents. (b) Describe how a cluster sample using the districts would be carried out. (c) Give one practical reason a city might prefer the cluster sample, and justify it in context.
A 4-point question on sampling designs.
(a) (1 point) Give every resident on the list a unique number, then use a random number generator (or table) to choose as many distinct numbers as the required sample size, ignoring repeats; the residents with those numbers form the SRS.
(b) (2 points) Treat each district as a cluster (1 point); randomly select one or more whole districts and survey every resident in the chosen district(s) (1 point).
(c) (1 point) A cluster sample is cheaper and easier because surveyors travel to only a few districts rather than residents scattered across the whole city, in context reducing travel cost and time.
Markers reward a correct SRS procedure using a list and random numbers, a correct cluster procedure (random whole clusters, all members surveyed), and a valid practical advantage in context.
Related dot points
- Topic 3.1 Introducing Statistics: Do the Data We Collected Tell the Truth? Recognize that the method of data collection determines the kinds of conclusions that can be drawn, and that poorly collected data cannot be fixed by analysis.
A focused answer to AP Statistics Topic 3.1, on why the data-collection method determines what conclusions are valid, the difference between random error and bias, and why analysis cannot rescue badly collected data.
- Topic 3.4 Potential Problems with Sampling: identify undercoverage, voluntary response, convenience, nonresponse, and response bias, explain how each distorts results, and recognize that bias is not reduced by a larger sample.
A focused answer to AP Statistics Topic 3.4, identifying undercoverage, voluntary response, convenience, nonresponse, and response bias, the direction each pushes results, and why bias persists no matter how large the sample.
- Topic 3.2 Introduction to Planning a Study: distinguish observational studies from experiments, identify explanatory and response variables, and recognize that only an experiment with imposed treatments can support a causal conclusion.
A focused answer to AP Statistics Topic 3.2, distinguishing observational studies from experiments, identifying explanatory and response variables and confounding, and explaining why imposing treatments is what enables causal claims.
- Topic 3.7 Inference and Experiments: use the presence or absence of random selection and random assignment to determine the scope of inference, that is, whether results generalize to a population and whether a causal conclusion is justified.
A focused answer to AP Statistics Topic 3.7, on the scope of inference, using random selection (generalization) and random assignment (causation) to decide what conclusions are valid, with a worked four-quadrant analysis.
- Topic 1.1 Introducing Statistics - What Can We Learn from Data?: identify questions to be answered, based on variation in one-variable data, and recognize what a data set can and cannot tell us.
A focused answer to AP Statistics Topic 1.1, on how variation in data raises statistical questions, what kinds of question data can answer, and the limits of what a single data set reveals, with worked examples.
Sources & how we know this
- AP Statistics Course and Exam Description — College Board (2020)