
What kinds of bias can creep into a sample, and why does none of them shrink with sample size?

Topic 3.4 Potential Problems with Sampling: identify undercoverage, voluntary response, convenience, nonresponse, and response bias, explain how each distorts results, and recognize that bias is not reduced by a larger sample.

A focused answer to AP Statistics Topic 3.4, identifying undercoverage, voluntary response, convenience, nonresponse, and response bias, the direction each pushes results, and why bias persists no matter how large the sample.

Generated by Claude Opus 4.8
9 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. Bias from who is sampled
  3. Bias from who answers and how they answer
  4. Why bias does not average out
  5. Naming and directing bias on the exam
  6. Try this

What this topic is asking

The College Board (Topic 3.4) wants you to do three things: identify the common sources of sampling bias (undercoverage, voluntary response, convenience, nonresponse, and response bias), explain how each distorts a result and in which direction, and recognize that bias is not reduced by a larger sample.

Bias from who is sampled

Undercoverage, voluntary response, and convenience sampling share a cause: the selection method systematically favors some people over others. An online poll undercovers those without internet; a call-in radio poll is voluntary response, dominated by the angry and the enthusiastic; surveying the first people you meet at a mall is convenience sampling, skewed toward whoever shops there at that time. In each case the sample is not a fair miniature of the population, so the statistic systematically misses the truth.
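The effect of a skewed selection method can be seen in a small simulation. The sketch below is illustrative only: the population, the 80% internet-access rate, and the streaming rates are invented numbers, and the variable names are hypothetical. It compares a simple random sample with an online-only poll drawn from the same population.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# Hypothetical population (all numbers invented for illustration):
# 80% have internet access; 70% of them stream video weekly,
# versus 5% of people without internet access.
population = []
for _ in range(100_000):
    online = random.random() < 0.80
    streams = random.random() < (0.70 if online else 0.05)
    population.append((online, streams))

def share_streaming(people):
    """Proportion of (online, streams) records where streams is True."""
    return sum(streams for _, streams in people) / len(people)

true_share = share_streaming(population)

# Fair method: a simple random sample from the whole population.
srs_share = share_streaming(random.sample(population, 1_000))

# Undercovering method: an online poll can only draw from people with internet.
online_frame = [person for person in population if person[0]]
online_share = share_streaming(random.sample(online_frame, 1_000))

print(f"true share     {true_share:.3f}")
print(f"random sample  {srs_share:.3f}")
print(f"online poll    {online_share:.3f}")
```

Under these assumptions the random sample should land close to the true share, while the online poll should overstate it, because the people least likely to stream were never in the frame.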

Bias from who answers and how they answer

Nonresponse is about who answers among those selected; response bias is about the answers themselves being pushed off the truth. A neutrally worded survey of a random sample can still suffer nonresponse if half the selected people hang up; a survey with full response can still suffer response bias if the question is leading. The two are independent problems with independent fixes (follow-up for nonresponse, neutral wording and confidentiality for response bias).
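A little arithmetic shows that the two problems really are separate. The sketch below uses two hypothetical helper functions and made-up rates: in the first, everyone answers honestly but the "yes" and "no" groups respond at different rates; in the second, everyone responds but a leading question moves some true "no" holders to say "yes".

```python
def estimate_with_nonresponse(true_share, rate_yes, rate_no):
    """Expected 'yes' share among respondents when the two groups
    respond at different rates (all answers are honest)."""
    yes_replies = true_share * rate_yes
    no_replies = (1 - true_share) * rate_no
    return yes_replies / (yes_replies + no_replies)

def estimate_with_response_bias(true_share, shift_to_yes):
    """Expected 'yes' share when everyone responds, but a fraction
    shift_to_yes of true 'no' holders answer 'yes' instead."""
    return true_share + (1 - true_share) * shift_to_yes

true_share = 0.50  # hypothetical true proportion answering 'yes'

# Nonresponse only: 'yes' holders reply 30% of the time, 'no' holders 60%.
nonresponse_est = estimate_with_nonresponse(true_share, rate_yes=0.30, rate_no=0.60)

# Response bias only: full response, but 10% of 'no' holders are swayed.
response_bias_est = estimate_with_response_bias(true_share, shift_to_yes=0.10)

print(round(nonresponse_est, 3))    # 0.333
print(round(response_bias_est, 3))  # 0.55
```

Neither estimate depends on how many people are surveyed. In this sketch the nonresponse distortion comes from the unequal response rates and vanishes if they are equal, whereas the response-bias distortion remains even with full response.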

Why bias does not average out

The deepest idea in Topic 3.4 is the contrast between bias and random error. Random sampling error is the chance difference between a sample statistic and the population parameter; it is symmetric, sometimes high and sometimes low, and a larger sample makes it smaller because the ups and downs tend to cancel. Bias is a consistent push in one direction built into the method, so every additional observation is pushed the same way, and a larger sample simply gives a more precise estimate of the wrong value. A large but biased poll can be confidently wrong, while a small random sample can be roughly right: the 1936 Literary Digest poll drew more than two million responses and still predicted a Landon victory in an election Roosevelt won in a landslide. This is why the exam treats "the sample was too small" as the wrong diagnosis for a biased study. The cure for bias is a better method, not more of the same flawed data: random selection to fix undercoverage and self-selection, vigorous follow-up to fix nonresponse, and neutral, confidential measurement to fix response bias.
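The contrast between shrinking random error and persistent bias can be checked with a short simulation. This is a sketch under stated assumptions: the true proportion (0.40) and the proportion the flawed method effectively samples from (0.55) are invented, and `simulate` is a hypothetical helper. For each sample size it repeats the survey 100 times and reports the average estimate and its spread.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

TRUE_P = 0.40    # hypothetical true population proportion
BIASED_P = 0.55  # hypothetical proportion the flawed method effectively samples from

def simulate(p, n, reps=100):
    """Draw reps samples of size n; return the mean and SD of the estimates."""
    estimates = [sum(random.random() < p for _ in range(n)) / n
                 for _ in range(reps)]
    return statistics.mean(estimates), statistics.stdev(estimates)

results = {}
for n in (100, 1_000, 10_000):
    fair_mean, fair_sd = simulate(TRUE_P, n)
    biased_mean, biased_sd = simulate(BIASED_P, n)
    results[n] = (fair_mean, fair_sd, biased_mean, biased_sd)
    print(f"n={n:>6}  random: {fair_mean:.3f} (SD {fair_sd:.3f})   "
          f"biased: {biased_mean:.3f} (SD {biased_sd:.3f})")
```

In both columns the spread should shrink as n grows, but only the random-sample column centers on 0.40. The biased column settles ever more tightly around 0.55, which is the "precise estimate of the wrong value" described above.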

Naming and directing bias on the exam

Free-response questions reward you for naming the specific bias and stating its direction, in context. It is not enough to say "this is biased"; you should say which bias and which way it skews the result. A leading question that calls a policy "wasteful" biases responses toward opposing it. An online-only survey on technology use overstates connectivity, because people who are offline cannot reply. A daytime landline survey undercovers workers and the young, skewing toward older, at-home residents. This three-step habit (identify the source, explain the mechanism, predict the direction) is exactly the skill the College Board scores, and it carries forward into every later question about whether a study's conclusion can be trusted.

Try this

Q1. Distinguish undercoverage from nonresponse. [2 points]

  • Cue. Undercoverage means some groups have no chance of selection (they are left out of the sampling frame); nonresponse means people who were selected decline to answer or cannot be contacted.

Q2. Why does increasing the sample size not reduce bias? [1 point]

  • Cue. Bias is a systematic push in one direction, so every extra observation is skewed the same way; more data gives a precise estimate of the wrong value.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2018 (style), 1 mark, Section I (multiple choice). A survey on internet habits is conducted only through an online form posted on social media. Which bias is most clearly present?
  (A) Nonresponse bias
  (B) Response bias from question wording
  (C) Undercoverage of people without internet access
  (D) No bias, because anyone can respond

The correct answer is (C).

An online-only survey gives people without internet access no chance of being selected, so that part of the population is left out entirely. That is undercoverage.

(A) Nonresponse occurs when people who were selected decline or cannot be contacted; here people without internet access never had a chance of being selected. (B) Nothing about wording is described. (D) The method systematically excludes a group, so it is biased.

AP 2021 (style), 4 marks, Section II (free response). A town surveys residents by phoning landlines during weekday working hours and asking, 'Do you agree that the irresponsible council should cut wasteful spending?'
  (a) Identify one source of bias from the sampling method and explain its likely effect.
  (b) Identify one source of bias from the question itself and explain its likely effect.
  (c) Explain why surveying more people in the same way would not remove these biases, justifying in context.

A 4-point question on multiple bias sources.

(a) (1 point) Calling landlines during weekday working hours undercovers people at work and those without landlines (often younger residents), so the sample skews toward older, at-home residents and the result overrepresents their views on council spending.
(b) (1 point) The wording ('irresponsible council', 'wasteful spending') is leading and emotionally loaded, a response bias that pushes respondents toward agreeing, inflating the proportion in favor.
(c) (2 points) Bias is systematic, not random (1 point): every extra call uses the same skewed frame and the same loaded question, so a larger sample repeats the same distortion rather than averaging it out (1 point, in context).

Markers reward identifying a sampling bias with its direction, identifying a question-wording bias with its direction, and the insight that more data does not cure systematic bias.
