What kinds of bias can creep into a sample, and why does none of them shrink with sample size?
Topic 3.4 Potential Problems with Sampling: identify undercoverage, voluntary response, convenience, nonresponse, and response bias, explain how each distorts results, and recognize that bias is not reduced by a larger sample.
A focused answer to AP Statistics Topic 3.4: how to identify undercoverage, voluntary response, convenience, nonresponse, and response bias, the direction each pushes results, and why bias persists no matter how large the sample.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 3.4) wants you to do three things: identify the common sources of sampling bias (undercoverage, voluntary response, convenience, nonresponse, and response bias), explain how each distorts a result and in which direction, and recognize that bias is not reduced by a larger sample.
Bias from who is sampled
Undercoverage, voluntary response, and convenience sampling share a cause: the selection method systematically favors some people over others. An online poll undercovers those without internet; a call-in radio poll is voluntary response, dominated by the angry and the enthusiastic; surveying the first people you meet at a mall is convenience sampling, skewed toward whoever shops there at that time. In each case the sample is not a fair miniature of the population, so the statistic systematically misses the truth.
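A short worked calculation can make the direction of undercoverage concrete. The sketch below uses invented numbers (every value is an assumption for illustration, not data from a real survey): a town where 80% of residents have internet access, 70% of the online group supports a proposal to move town services online, and 20% of the offline group supports it.

```python
# Hypothetical population (all numbers are made up for illustration).
share_online = 0.80      # fraction of residents with internet access
support_online = 0.70    # fraction of online residents who support the proposal
support_offline = 0.20   # fraction of offline residents who support it

# True population proportion: weighted average over both groups.
true_support = (share_online * support_online
                + (1 - share_online) * support_offline)

# An online-only poll can reach only the online group, so even a perfectly
# random sample from that frame is aimed at the online group's proportion.
online_poll_target = support_online

# Positive bias means the poll overstates support.
bias = online_poll_target - true_support

print(f"true support:        {true_support:.2f}")
print(f"online poll targets: {online_poll_target:.2f}")
print(f"bias:                {bias:+.2f}")
```

Under these assumed numbers the online-only poll is aimed at 70% when the truth is 60%, so it overstates support by about 10 percentage points. The direction follows from which group is missing: the excluded offline residents are the ones less likely to support the proposal.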
Bias from who answers and how they answer
Nonresponse is about who answers among those selected; response bias is about the answers themselves being pushed off the truth. A neutrally worded survey of a random sample can still suffer nonresponse bias if half the selected people hang up and those who stay on the line differ from those who do not; a survey with full response can still suffer response bias if the question is leading. The two are independent problems with independent fixes (follow-up for nonresponse, neutral wording and confidentiality for response bias).
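To see that these two problems act separately, here is a minimal sketch with hypothetical numbers. The function name `observed_proportion` and all of its parameters are illustrative assumptions: `respond_if_yes` and `respond_if_no` model nonresponse (different groups answer at different rates), and `flip_no_to_yes` models response bias (a share of people who truly hold "no" report "yes", for example because of leading wording).

```python
def observed_proportion(true_p, respond_if_yes, respond_if_no, flip_no_to_yes):
    """Expected share of 'yes' answers among the people who respond."""
    # People who truly hold 'yes' and who respond.
    yes_from_yes = true_p * respond_if_yes
    # People who truly hold 'no', respond, but report 'yes' (response bias).
    yes_from_no = (1 - true_p) * respond_if_no * flip_no_to_yes
    # People who truly hold 'no', respond, and report 'no'.
    no_reported = (1 - true_p) * respond_if_no * (1 - flip_no_to_yes)
    total_respondents = yes_from_yes + yes_from_no + no_reported
    return (yes_from_yes + yes_from_no) / total_respondents

TRUE_P = 0.50  # hypothetical true proportion of 'yes' in the selected sample

scenarios = {
    "no problems":        observed_proportion(TRUE_P, 1.0, 1.0, 0.0),
    "nonresponse only":   observed_proportion(TRUE_P, 0.3, 0.6, 0.0),
    "response bias only": observed_proportion(TRUE_P, 1.0, 1.0, 0.2),
    "both":               observed_proportion(TRUE_P, 0.3, 0.6, 0.2),
}

for name, value in scenarios.items():
    print(f"{name:<20} {value:.3f}")
```

With these assumed rates, nonresponse alone pulls the result below the true 0.50 (the "yes" group answers less often), while the leading question alone pushes it above 0.50. Fixing one does nothing for the other, which is why each needs its own remedy.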
Why bias does not average out
The deepest idea in Topic 3.4 is the contrast between bias and random error. Random sampling error is the chance difference between a sample statistic and the population parameter; it is symmetric, sometimes high and sometimes low, and a larger sample makes it smaller because the ups and downs cancel. Bias is a consistent push in one direction built into the method, so every additional observation is pushed the same way, and a larger sample simply gives a more precise estimate of the wrong value. A famously large but biased poll can be confidently wrong, while a small random sample can be roughly right. This is why the exam treats "the sample was too small" as the wrong diagnosis for a biased study. The cure for bias is a better method (random selection to fix undercoverage and self-selection, vigorous follow-up to fix nonresponse, and neutral, confidential measurement to fix response bias), not more of the same flawed data.
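A quick simulation can illustrate the contrast. This is a sketch under stated assumptions, not a model of any real poll: the true population proportion is set to 0.60, and a flawed method is assumed to reach only a group whose proportion is 0.70. The names `simulate_poll`, `TRUE_P`, and `BIASED_P` are hypothetical.

```python
import random

def simulate_poll(n, p, rng):
    """Return the sample proportion of 'yes' among n respondents,
    each of whom answers 'yes' independently with probability p."""
    yes = sum(1 for _ in range(n) if rng.random() < p)
    return yes / n

TRUE_P = 0.60    # hypothetical population proportion
BIASED_P = 0.70  # hypothetical proportion in the group a flawed method reaches

rng = random.Random(2024)  # fixed seed so the run is reproducible

results = []
for n in (100, 10_000, 200_000):
    random_est = simulate_poll(n, TRUE_P, rng)    # random sample of everyone
    biased_est = simulate_poll(n, BIASED_P, rng)  # same size, flawed method
    results.append((n, random_est, biased_est))
    print(f"n={n:>7}  random sample: {random_est:.3f}  "
          f"biased method: {biased_est:.3f}")
```

As n grows, both columns settle down, but they settle on different values: the random sample closes in on 0.60, while the biased method closes in on 0.70. The larger sample removes the wobble, not the gap.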
Naming and directing bias on the exam
Free-response questions reward you for naming the specific bias and stating its direction, in context. It is not enough to say "this is biased"; you should say which bias and which way it skews the result. A leading question that calls a policy "wasteful" biases responses toward opposing it. An online-only survey on technology use overstates connectivity, because people who are offline cannot reply. A daytime landline survey undercovers workers and the young, skewing toward older, at-home residents. This three-step routine (identify the source, explain the mechanism, predict the direction) is exactly the skill the College Board scores, and it carries forward into every later question about whether a study's conclusion can be trusted.
Try this
Q1. Distinguish undercoverage from nonresponse. [2 points]
- Cue. Undercoverage means some groups have no chance of selection (left out of the frame); nonresponse means people who were selected decline to answer or cannot be contacted.
Q2. Why does increasing the sample size not reduce bias? [1 point]
- Cue. Bias is a systematic push in one direction, so every extra observation is skewed the same way; more data gives a precise estimate of the wrong value.
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2018 (style), 1 mark, Section I (multiple choice).
A survey on internet habits is conducted only through an online form posted on social media. Which bias is most clearly present?
(A) Nonresponse bias
(B) Response bias from question wording
(C) Undercoverage of people without internet access
(D) No bias, because anyone can respond
The correct answer is (C).
An online-only survey gives people without internet access no chance of being selected, so that part of the population is left out entirely. That is undercoverage.
(A) Nonresponse applies to people who were selected but did not answer; here people without internet access were never in the frame to begin with, which is undercoverage. (B) Nothing about question wording is described. (D) The method systematically excludes a group, so it is biased.
AP 2021 (style), 4 marks, Section II (free response).
A town surveys residents by phoning landlines during weekday working hours and asking, 'Do you agree that the irresponsible council should cut wasteful spending?'
(a) Identify one source of bias from the sampling method and explain its likely effect.
(b) Identify one source of bias from the question itself and explain its likely effect.
(c) Explain why surveying more people in the same way would not remove these biases, justifying in context.
A 4-point question on multiple bias sources.
(a) (1 point) Calling landlines during weekday working hours undercovers people at work and those without landlines (often younger residents), so the sample skews toward older, at-home residents and the result is biased toward their views.
(b) (1 point) The wording ('irresponsible council', 'wasteful spending') is leading and emotionally loaded, a response bias that pushes respondents toward agreeing, inflating the proportion in favor.
(c) (2 points) Bias is systematic, not random (1 point): every extra call uses the same skewed frame and the same loaded question, so a larger sample repeats the same distortion rather than averaging it out (1 point, in context).
Markers reward identifying a sampling bias with its direction, identifying a question-wording bias with its direction, and the insight that more data does not cure systematic bias.
Related dot points
- Topic 3.3 Random Sampling and Data Collection: describe and distinguish simple random, stratified, cluster, and systematic random sampling, and explain why random selection supports generalization to a population.
A focused answer to AP Statistics Topic 3.3, describing simple random, stratified, cluster, and systematic random sampling, how each uses chance, their trade-offs, and why random selection allows generalization, with a worked SRS selection.
- Topic 3.1 Introducing Statistics: Do the Data We Collected Tell the Truth? Recognize that the method of data collection determines the kinds of conclusions that can be drawn, and that poorly collected data cannot be fixed by analysis.
A focused answer to AP Statistics Topic 3.1, on why the data-collection method determines what conclusions are valid, the difference between random error and bias, and why analysis cannot rescue badly collected data.
- Topic 3.2 Introduction to Planning a Study: distinguish observational studies from experiments, identify explanatory and response variables, and recognize that only an experiment with imposed treatments can support a causal conclusion.
A focused answer to AP Statistics Topic 3.2, distinguishing observational studies from experiments, identifying explanatory and response variables and confounding, and explaining why imposing treatments is what enables causal claims.
- Topic 3.7 Inference and Experiments: use the presence or absence of random selection and random assignment to determine the scope of inference, that is, whether results generalize to a population and whether a causal conclusion is justified.
A focused answer to AP Statistics Topic 3.7, on the scope of inference, using random selection (generalization) and random assignment (causation) to decide what conclusions are valid, with a worked four-quadrant analysis.
- Topic 1.1 Introducing Statistics - What Can We Learn from Data?: identify questions to be answered, based on variation in one-variable data, and recognize what a data set can and cannot tell us.
A focused answer to AP Statistics Topic 1.1, on how variation in data raises statistical questions, what kinds of question data can answer, and the limits of what a single data set reveals, with worked examples.
Sources & how we know this
- AP Statistics Course and Exam Description — College Board (2020)