What are Type I and Type II errors, and how do significance level, sample size, and effect size affect them?
Topic 6.7 Potential Errors When Performing Tests: distinguish Type I and Type II errors and their consequences, define the power of a test, and explain how significance level, sample size, and effect size affect error probabilities and power.
A focused answer to AP Statistics Topic 6.7, on Type I and Type II errors, their real-world consequences, the power of a test, and how alpha, sample size, and effect size change error rates and power, with worked reasoning in context.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 6.7) wants you to distinguish Type I and Type II errors, describe their consequences in context, define the power of a test, and explain how the significance level $\alpha$, the sample size $n$, and the effect size affect error probabilities and power.
The four outcomes
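Crossing the truth about $H_0$ (true or false) with the decision (reject or fail to reject) gives four outcomes: two correct decisions and two errors. A clear way to keep the errors straight: a Type I error is a false alarm (you claim an effect when there is none); a Type II error is a missed detection (you miss a real effect). Always describe each error in the direction of the specific test and name the context, because exam credit depends on saying what each error means in this setting, not on reciting the textbook definition.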
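The four outcomes can be laid out as a small lookup. This is a minimal sketch for illustration only; the function name `classify_outcome` and its labels are hypothetical, and in a real test the truth about $H_0$ is not known.

```python
def classify_outcome(null_is_true: bool, rejected_null: bool) -> str:
    """Label one test outcome from the (normally unknown) truth and the decision made."""
    if null_is_true and rejected_null:
        return "Type I error (false alarm)"
    if not null_is_true and not rejected_null:
        return "Type II error (missed detection)"
    if null_is_true:
        return "Correct decision (true H0 not rejected)"
    return "Correct decision (false H0 rejected)"


# Print all four combinations of truth and decision.
for null_is_true in (True, False):
    for rejected_null in (True, False):
        label = classify_outcome(null_is_true, rejected_null)
        print(f"H0 true: {null_is_true!s:5}  rejected: {rejected_null!s:5}  ->  {label}")
```

Each error sits in its own row of the truth: a Type I error is only possible when $H_0$ is true, and a Type II error is only possible when $H_0$ is false.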
Consequences drive the trade-off
Which error is worse depends on context, and naming the real-world consequence of each is routinely examined. In a medical screen, a Type I error (false positive) might cause needless treatment, while a Type II error (false negative) might leave a disease untreated; the relative harms decide whether you set $\alpha$ low or high. There is no universally "safer" choice; you balance the costs. This is why $\alpha$ is chosen before the data, as a policy about acceptable false-alarm risk.
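The idea that $\alpha$ sets the false-alarm risk can be checked with a simulation. The sketch below is an illustration under assumed values, not part of the course description: it uses an upper-tailed one-sample z-test for a proportion with a hypothetical null value of 0.5 and samples of 200, generates data for which $H_0$ is true, and counts how often the test rejects anyway.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_type_i_rate(alpha, p0=0.5, n=200, trials=4000, seed=1):
    """Estimate how often an upper-tailed z-test rejects H0: p = p0 when H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    se0 = sqrt(p0 * (1 - p0) / n)             # standard error computed from the null value
    rejections = 0
    for _ in range(trials):
        # Data are generated with the true proportion equal to p0, so H0 is true.
        successes = sum(rng.random() < p0 for _ in range(n))
        z = (successes / n - p0) / se0
        if z > z_crit:
            rejections += 1                   # every rejection here is a Type I error
    return rejections / trials


for alpha in (0.01, 0.05, 0.10):
    rate = simulate_type_i_rate(alpha)
    print(f"alpha = {alpha:.2f}: simulated Type I error rate = {rate:.3f}")
```

The simulated rates should land close to each chosen $\alpha$ rather than exactly on it, because the count of successes is discrete and the simulation has its own sampling noise.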
Power and what raises it
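For reference, power is the probability of correctly rejecting a false $H_0$. Written in symbols (the symbol $\beta$ for the Type II error probability is a common convention assumed here, not taken from the text above):

```latex
\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}) = P(\text{Type I error})
\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false}) = P(\text{Type II error})
\text{Power} = P(\text{reject } H_0 \mid H_0 \text{ is false}) = 1 - \beta
```

So anything that raises power lowers the Type II error probability by the same amount, for a given true value of the parameter.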
Three factors raise power:
- Larger sample size $n$. The biggest controllable lever. More data shrink the standard error, so a true effect produces a more extreme statistic and is detected more often, all without changing $\alpha$. This is the standard answer to "how can power be increased without raising the Type I error rate?"
- Larger true effect (effect size). The farther the true parameter is from the null value, the easier it is to detect, so power rises. This is not under the experimenter's control, but it explains why small effects need large samples.
- Larger $\alpha$. Loosening the rejection threshold rejects more readily, raising power, but at the cost of a higher Type I error rate. So this lever trades one error for the other rather than improving the test for free.
Reduced variability (for example, a less variable population or a better design) also raises power by shrinking the standard error, the same mechanism as a larger $n$.
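The three levers can be compared numerically. The sketch below is one way to do it, assuming an upper-tailed one-sample z-test for a proportion and the normal approximation; the function name `power_one_prop` and the baseline values (null value 0.50, true proportion 0.55, $n = 200$, $\alpha = 0.05$) are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_one_prop(p0, p_true, n, alpha):
    """Approximate power of the upper-tailed one-sample z-test of H0: p = p0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)  # reject when p-hat exceeds this value
    se_true = sqrt(p_true * (1 - p_true) / n)       # spread of p-hat under the true p
    # Power is the chance that p-hat lands above the cutoff when p = p_true.
    return 1 - std_normal.cdf((cutoff - p_true) / se_true)


baseline = dict(p0=0.50, p_true=0.55, n=200, alpha=0.05)
scenarios = {
    "baseline": baseline,
    "larger n (800)": {**baseline, "n": 800},
    "larger effect (p = 0.60)": {**baseline, "p_true": 0.60},
    "larger alpha (0.10)": {**baseline, "alpha": 0.10},
}
for name, kwargs in scenarios.items():
    print(f"{name:26s} power = {power_one_prop(**kwargs):.3f}")
```

Each change raises power relative to the baseline, but only the larger $\alpha$ does so by accepting more false alarms. When `p_true` equals `p0` the function returns $\alpha$ itself, which matches the definition of the Type I error rate.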
Try this
Q1. Define a Type II error and name its consequence in a test of whether a treatment works. [2 points]
- Cue. Failing to reject $H_0$ when the treatment truly works (a missed effect); consequence: an effective treatment is not adopted, so its benefits are lost.
Q2. Name two ways to increase power, and which one also raises the Type I error rate. [2 points]
- Cue. Increase the sample size (does not change $\alpha$) and increase $\alpha$ (which does raise the Type I error rate). A larger true effect also raises power.
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2019 (style), 1 mark, Section I (multiple choice). In a test, rejecting $H_0$ when $H_0$ is actually true is (A) a Type II error (B) a Type I error (C) the power (D) a correct decision
The correct answer is (B).
A Type I error is rejecting a true $H_0$ (a false positive). Its probability equals the significance level $\alpha$.
(A) is the reverse: a Type II error is failing to reject a false $H_0$. (C) power is the probability of correctly rejecting a false $H_0$. (D) describes a correct decision, not an error.
AP 2021 (style), 4 marks, Section II (free response). A drug regulator tests $H_0$: a new drug is no more effective than the standard, against $H_a$: it is more effective. (a) Describe a Type I and a Type II error in this context. (b) State a real-world consequence of each. (c) The regulator wants to reduce the chance of a Type II error without increasing $\alpha$. Justify in context one change that would do this.
A 4-point errors-in-context question.
(a) (2 points) Type I error: concluding the new drug is more effective when it actually is not. Type II error: failing to conclude the new drug is more effective when it actually is.
(b) (1 point) Type I consequence: a useless (or costly) drug is approved or promoted, exposing patients to risk or cost for no benefit. Type II consequence: a genuinely better drug is not adopted, so patients miss out on improved treatment.
(c) (1 point) Increase the sample size. A larger $n$ increases the power of the test (reduces the Type II error rate) without changing $\alpha$, because it shrinks the standard error and makes a true effect easier to detect.
Markers reward correct directional descriptions of each error, plausible consequences, and identifying larger $n$ (or a larger true effect) as the way to raise power without raising $\alpha$.
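The argument in part (c) can be illustrated with numbers. This is a sketch under assumptions that are not in the question: the comparison is simplified to an upper-tailed one-sample z-test for a proportion, with a hypothetical standard success rate of 0.60, a hypothetical true rate of 0.70 for the new drug, and $\alpha$ held at 0.05. The function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_rate(p0, p_true, n, alpha):
    """Approximate P(fail to reject H0: p = p0) when the true proportion is p_true > p0."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)  # reject only if p-hat exceeds this
    se_true = sqrt(p_true * (1 - p_true) / n)
    return STD_NORMAL.cdf((cutoff - p_true) / se_true)


def required_n(p0, p_true, alpha, max_beta):
    """Smallest n (normal approximation) that keeps the Type II rate at or below max_beta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - max_beta)
    root_n = (z_alpha * sqrt(p0 * (1 - p0))
              + z_beta * sqrt(p_true * (1 - p_true))) / (p_true - p0)
    return ceil(root_n ** 2)


P0, P_TRUE, ALPHA = 0.60, 0.70, 0.05  # hypothetical values, alpha fixed throughout
for n in (50, 100, 200, 400):
    print(f"n = {n:3d}: Type II error rate = {type_ii_rate(P0, P_TRUE, n, ALPHA):.3f}")

n_needed = required_n(P0, P_TRUE, ALPHA, max_beta=0.10)
print(f"Smallest n with Type II rate at most 0.10: {n_needed}")
```

The Type II error rate falls as $n$ grows while $\alpha$ never changes, which is the justification the worked answer gives.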
Sources & how we know this
- AP Statistics Course and Exam Description — College Board (2020)