
What does a P-value measure, and how is it interpreted in the context of a test?

Topic 6.5 Interpreting P-Values: define the P-value as the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed, and interpret it in context.

A focused answer to AP Statistics Topic 6.5, on defining the P-value as the probability under the null of a result at least as extreme as observed, interpreting small and large P-values, and avoiding common misreadings, with a worked interpretation.

Generated by Claude Opus 4.8 · 9 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed


Jump to a section
  1. What this topic is asking
  2. The precise definition
  3. Small versus large P-values
  4. The interpretation that loses marks
  5. What a P-value does not do
  6. Try this

What this topic is asking

The College Board (Topic 6.5) wants you to define and interpret a P-value: the probability, assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one observed (in the direction of $H_a$). You must phrase this correctly in context and avoid the standard misreadings.

The precise definition

Three parts must appear in any interpretation: (1) assuming $H_0$ is true, (2) the probability of a result at least as extreme, (3) as the one observed. Drop any one and the interpretation is wrong. The P-value is a measure of surprise: how unusual is our data under the null? The smaller it is, the harder it is to explain the data by chance alone if $H_0$ holds.
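To make the three parts concrete, here is a minimal sketch of an exact calculation for an upper-tailed test of a proportion. The data (60 heads in 100 flips) and the function name `upper_tail_p_value` are hypothetical, chosen only for illustration; the calculation uses the binomial distribution directly rather than the normal approximation.

```python
from math import comb

def upper_tail_p_value(successes: int, n: int, p0: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(n, p0).

    (1) p0 is the value assumed under H0,
    (2) the sum runs over every count at least as extreme,
    (3) starting from the count actually observed.
    """
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical data: 60 heads in 100 flips, H0: p = 0.5 versus Ha: p > 0.5.
p_value = upper_tail_p_value(60, 100, 0.5)
print(f"P-value = {p_value:.4f}")
```

In words: if the coin really were fair, the printed value is the probability of seeing 60 or more heads in 100 flips.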

Small versus large P-values

The P-value is a sliding scale of evidence, not a verdict on its own; converting it into a decision requires comparing it to $\alpha$ (Topic 6.6). A useful mental anchor: "if the null were true, how often would chance alone produce data this extreme?" If that fraction is tiny, the null looks like a poor explanation.
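The mental anchor can be acted out directly with a simulation. The sketch below assumes hypothetical data (60 heads in 100 flips, testing whether a coin favors heads) and a hypothetical helper `simulated_p_value`; it repeatedly flips a coin that obeys the null and counts how often chance alone produces a result at least as extreme as the one observed.

```python
import random

def simulated_p_value(observed_heads, n_flips, p0=0.5, n_sims=10_000, seed=1):
    """Estimate P(heads >= observed_heads) when the null value p0 is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # One simulated sample from a world in which H0 is true.
        heads = sum(rng.random() < p0 for _ in range(n_flips))
        if heads >= observed_heads:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims

# Hypothetical data: 60 heads in 100 flips.
estimate = simulated_p_value(60, 100)
print(f"Fraction of null samples this extreme: {estimate:.3f}")
```

The printed fraction is an estimate of the P-value. It is only an approximation, and it changes slightly with the seed and the number of simulated samples.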

The interpretation that loses marks

The single most penalized error in the course is calling the P-value "the probability that $H_0$ is true" (or "the probability the result is due to chance"). The P-value is a probability about the data, conditional on $H_0$, not a probability about the hypothesis. $H_0$ is either true or false; it has no probability in this framework. Likewise, a P-value of 0.03 does not mean "a 3% chance the result is a fluke"; it means "if $H_0$ were true, results this extreme would occur 3% of the time." Saying it correctly, every time, is worth real marks.

What a P-value does not do

A P-value does not measure the size of an effect, only how surprising the data are under $H_0$. With a very large sample, a tiny, practically meaningless departure from the hypothesized value $p_0$ can produce a small P-value; with a small sample, a large real effect can give a non-significant P-value. So statistical significance is not the same as practical importance. A large P-value never proves $H_0$; it only fails to provide evidence against it. These limits are why Topic 6.7 (errors) and confidence intervals (which show effect size) accompany P-values rather than replace them.
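The sample-size effect is easy to see numerically. The following sketch uses the one-proportion z statistic with a normal approximation; both scenarios and the function name `one_prop_z_p_value` are hypothetical, chosen so that the small effect comes with a huge sample and the large effect with a small one.

```python
from math import erfc, sqrt

def one_prop_z_p_value(p_hat: float, p0: float, n: int) -> float:
    """Upper-tailed P-value for H0: p = p0 versus Ha: p > p0 (normal approximation)."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return 0.5 * erfc(z / sqrt(2))  # P(Z >= z) for a standard normal Z

# Hypothetical scenario A: tiny departure (0.505 vs 0.5), enormous sample.
tiny_effect_big_sample = one_prop_z_p_value(0.505, 0.5, 1_000_000)

# Hypothetical scenario B: large departure (0.65 vs 0.5), only 20 observations.
big_effect_small_sample = one_prop_z_p_value(0.65, 0.5, 20)

print(f"A: P-value = {tiny_effect_big_sample:.2e}")
print(f"B: P-value = {big_effect_small_sample:.3f}")
```

Scenario A gives a P-value far below 0.05 even though the departure from $p_0$ is half a percentage point, while scenario B stays above 0.05 despite a 15-point departure. The P-value alone cannot tell these situations apart in terms of practical importance.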

Try this

Q1. A test gives P-value 0.62. Interpret it in one sentence (generic context). [1 point]

  • Cue. If $H_0$ were true, there would be a 62% chance of getting a result at least as extreme as the one observed, so the data are consistent with $H_0$.

Q2. Why is "the P-value is the probability $H_0$ is true" wrong? [1 point]

  • Cue. The P-value is a probability about the data assuming $H_0$ is true; it says nothing about the probability of the hypothesis itself.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2019 (style) · 1 mark · Section I (multiple choice)

A test of $H_0: p = 0.5$ against $H_a: p > 0.5$ gives P-value 0.03. This means
(A) there is a 3% chance $H_0$ is true
(B) if $p = 0.5$, there is a 3% chance of a sample proportion at least as large as observed
(C) there is a 3% chance the result is due to chance
(D) $p = 0.03$

The correct answer is (B).

The P-value is computed assuming $H_0$ is true; it is the probability, under $p = 0.5$, of getting a test statistic (here a sample proportion) at least as extreme as the one observed in the direction of $H_a$.

(A) is the classic error: the P-value is not the probability $H_0$ is true. (C) is a vague misstatement of the same error. (D) confuses the P-value with the parameter $p$.

AP 2021 (style) · 3 marks · Section II (free response)

A researcher tests whether a coin is unfair, $H_0: p = 0.5$ versus $H_a: p \ne 0.5$, where $p$ is the probability of heads, and obtains a P-value of 0.21.
(a) Interpret this P-value in context.
(b) At $\alpha = 0.05$, what does the P-value imply about the evidence against $H_0$?
(c) Explain why a large P-value does not prove the coin is fair.

A 3-point interpretation question.

(a) (1 point) Assuming the coin is fair ($p = 0.5$), there is a 0.21 probability of obtaining a sample result at least as far from 0.5 (in either direction) as the one observed.
(b) (1 point) Since 0.21 > 0.05, the result is not surprising under $H_0$; there is not convincing evidence against $H_0$, so we fail to reject it.
(c) (1 point) Failing to reject $H_0$ means the data are consistent with $p = 0.5$, but they are also consistent with values near 0.5; absence of evidence against fairness is not proof of fairness.

Markers reward the conditional "assuming $H_0$ true," the at-least-as-extreme phrasing, the comparison to $\alpha$, and the point that a large P-value does not prove $H_0$.
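The phrase "in either direction" in part (a) corresponds to adding both tails. The question does not give the researcher's data, so the sketch below uses hypothetical counts (56 heads in 100 flips) and a hypothetical helper `two_sided_p_value_fair_coin`; its result is not expected to equal 0.21. It computes an exact two-sided P-value for a fair-coin null.

```python
from math import comb

def two_sided_p_value_fair_coin(heads: int, n: int) -> float:
    """Return P(|X - n/2| >= |heads - n/2|) for X ~ Binomial(n, 0.5)."""
    distance = abs(heads - n / 2)  # how far the observed count is from the fair-coin centre
    # Under p = 0.5 every sequence of n flips is equally likely, so count
    # the sequences whose head count is at least as far out in either tail.
    extreme_outcomes = sum(
        comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance
    )
    return extreme_outcomes / 2**n

# Hypothetical data, not taken from the question: 56 heads in 100 flips.
p_value = two_sided_p_value_fair_coin(56, 100)
print(f"Two-sided P-value = {p_value:.3f}")
```

Counts of 56 or more and counts of 44 or fewer both contribute, which is why a two-sided P-value is larger than the corresponding one-sided value for the same data.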
