How do we display two categorical variables together, and what do two-way tables and segmented bar graphs reveal?

Topic 2.2 Representing Two Categorical Variables: construct and interpret two-way (contingency) tables and segmented or side-by-side bar graphs for two categorical variables.

A focused answer to AP Statistics Topic 2.2, on building and reading two-way tables and segmented or side-by-side bar graphs for two categorical variables, with marginal totals and a worked table.

Generated by Claude Opus 4.8
9 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this topic is asking
  2. The two-way table
  3. Joint and marginal distributions
  4. Segmented and side-by-side bar graphs
  5. Reading association from the displays
  6. Try this

What this topic is asking

The College Board (Topic 2.2) wants you to display two categorical variables together with a two-way (contingency) table and with segmented or side-by-side bar graphs, and to read the joint and marginal information these displays carry.

The two-way table

A complete table has every cell filled and every margin computed, with the row totals and column totals each summing to the grand total $n$. Filling in margins is not decoration: the marginal totals are needed to compute the proportions of the next topic, and a missing margin is a common reason a table-construction question loses marks.
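The margin check described above can be sketched in a few lines of Python. This is only an illustration, not part of the AP course: the dictionary layout and the helper name `add_margins` are assumptions made for the example, and the counts are the ones from the survey in the exam-style practice question on this page.

```python
# Cell counts from the survey in the practice question on this page:
# 90 males (54 coffee, 36 tea) and 110 females (44 coffee, 66 tea).
counts = {
    "male": {"coffee": 54, "tea": 36},
    "female": {"coffee": 44, "tea": 66},
}


def add_margins(cells):
    """Return (row_totals, col_totals, grand_total) for a dict-of-dicts table.

    Hypothetical helper written for this illustration.
    """
    row_totals = {row: sum(cols.values()) for row, cols in cells.items()}
    col_totals = {}
    for cols in cells.values():
        for col, count in cols.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return row_totals, col_totals, grand_total


row_totals, col_totals, n = add_margins(counts)

# In a complete table both sets of margins add up to the same grand total n.
assert sum(row_totals.values()) == n
assert sum(col_totals.values()) == n

print(row_totals)  # {'male': 90, 'female': 110}
print(col_totals)  # {'coffee': 98, 'tea': 102}
print(n)           # 200
```

If either assertion fails, a cell or a margin has been miscounted, which is the same check you would do by hand before moving on to proportions.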

Joint and marginal distributions

The table carries two kinds of information at once. The joint distribution is the set of cell counts (or cell proportions out of the grand total): it answers "how many are in both this row and this column?" The marginal distributions are given by the margins: the row totals describe the row variable on its own, and the column totals describe the column variable on its own. For example, the column totals tell you the overall split of the column variable across everyone, regardless of the row variable. Distinguishing joint from marginal is foundational, because conditional proportions (the next topic) build on both.
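To make the joint versus marginal distinction concrete, here is a minimal Python sketch. The variable names (`joint`, `marginal_sex`, `marginal_beverage`) are chosen for this example only, and the counts again come from the survey in the practice question on this page.

```python
# Cell counts from the survey in the practice question on this page.
counts = {
    "male": {"coffee": 54, "tea": 36},
    "female": {"coffee": 44, "tea": 66},
}

# Grand total n: every individual is counted in exactly one cell.
n = sum(sum(cols.values()) for cols in counts.values())

# Joint distribution: each cell as a proportion of the grand total.
joint = {
    (row, col): count / n
    for row, cols in counts.items()
    for col, count in cols.items()
}

# Marginal distribution of the row variable (sex): row totals over n.
marginal_sex = {row: sum(cols.values()) / n for row, cols in counts.items()}

# Marginal distribution of the column variable (beverage): column totals over n.
beverage_totals = {}
for cols in counts.values():
    for col, count in cols.items():
        beverage_totals[col] = beverage_totals.get(col, 0) + count
marginal_beverage = {col: total / n for col, total in beverage_totals.items()}

print(joint[("male", "coffee")])  # 0.27
print(marginal_sex)               # {'male': 0.45, 'female': 0.55}
print(marginal_beverage)          # {'coffee': 0.49, 'tea': 0.51}
```

The joint proportions sum to 1 over all four cells, and each marginal distribution sums to 1 over its own categories, which is a quick way to check the arithmetic.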

Segmented and side-by-side bar graphs

To display two categorical variables graphically, two bar-graph variants are standard. A segmented (stacked) bar graph draws one bar per category of the first variable, divided into segments whose heights show the proportions of the second variable within that category; comparing the segment patterns across bars reveals whether the variables are associated. A side-by-side (clustered) bar graph instead places the second variable's bars next to each other within each group, which can make exact comparisons easier when there are only a few categories. Both displays are usually drawn with relative frequencies (proportions) rather than raw counts when the groups have different sizes, so that the comparison is fair, exactly as in Unit 1. The choice between segmented and side-by-side is largely about readability: segmented bars emphasize parts of a whole, while side-by-side bars emphasize direct category-to-category comparison.
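The segment heights of a segmented bar graph are just within-group proportions. The sketch below computes them and draws a rough text version of the bars; it is an illustration under assumptions, with hypothetical helper names (`segment_proportions`, `text_bar`), an arbitrary bar width of 20 characters, and the counts from the survey in the practice question on this page.

```python
# Cell counts from the survey in the practice question on this page.
counts = {
    "male": {"coffee": 54, "tea": 36},
    "female": {"coffee": 44, "tea": 66},
}

BAR_WIDTH = 20  # arbitrary number of characters per bar, chosen for this sketch


def segment_proportions(group_counts):
    """Proportion of each category within one group (one bar)."""
    total = sum(group_counts.values())
    return {category: count / total for category, count in group_counts.items()}


def text_bar(group_counts, symbols, width=BAR_WIDTH):
    """Draw one segmented bar as text, one symbol per category.

    Each segment is rounded to whole characters, so for other data
    a bar may come out one character longer or shorter than `width`.
    """
    total = sum(group_counts.values())
    bar = ""
    for category, count in group_counts.items():
        bar += symbols[category] * round(width * count / total)
    return bar


symbols = {"coffee": "C", "tea": "T"}
for sex, group_counts in counts.items():
    print(f"{sex:<7}{text_bar(group_counts, symbols)}")
```

Because every bar has the same length, the different group sizes (90 and 110) no longer affect the picture: the male bar is 12 C and 8 T, the female bar is 8 C and 12 T, so the differing segment pattern is visible at a glance.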

Reading association from the displays

The reason two-way tables and these bar graphs matter is that they reveal association between the two categorical variables. If the proportion preferring coffee is much higher among males than females, the segmented bars for the two sexes look different, and that visible difference is the association. If the variables were unrelated, the breakdown within each group would look the same. So when you read one of these displays, you are really asking "does the distribution of one variable change as I move across the categories of the other?" A strong exam answer names the variables, points to the differing proportions, and states the association in context, while remembering Topic 2.1's caution that an association in observational data is not proof of cause. The numerical version of this comparison, using conditional proportions, is exactly what Topic 2.3 formalises, so a clean two-way table here sets up the next topic directly.
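The question "does the distribution of one variable change across the categories of the other?" can be checked numerically. The sketch below is an illustration only: `share_within_groups` is a hypothetical helper, the `survey` counts are from the practice question on this page, and the `no_association` table is invented for contrast and is not data from the source.

```python
def share_within_groups(table, category):
    """Proportion choosing `category` within each group (row) of the table."""
    return {
        group: cells[category] / sum(cells.values())
        for group, cells in table.items()
    }


# Counts from the survey in the practice question on this page.
survey = {
    "male": {"coffee": 54, "tea": 36},
    "female": {"coffee": 44, "tea": 66},
}

# Hypothetical table, invented for contrast: same group sizes (90 and 110),
# but the coffee/tea split is identical in both groups.
no_association = {
    "male": {"coffee": 45, "tea": 45},
    "female": {"coffee": 55, "tea": 55},
}

survey_shares = share_within_groups(survey, "coffee")
flat_shares = share_within_groups(no_association, "coffee")

# A gap between the groups' proportions is the association the bars display.
survey_gap = survey_shares["male"] - survey_shares["female"]
flat_gap = flat_shares["male"] - flat_shares["female"]

print({group: round(share, 2) for group, share in survey_shares.items()})
print(round(survey_gap, 2))
print(round(flat_gap, 2))
```

In the survey, 60% of males but only 40% of females prefer coffee, so the segmented bars would look different. In the invented table both groups sit at 50%, so the bars would look the same and there would be no visible association. Neither result says anything about cause.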

Try this

Q1. In a two-way table, what do the column totals describe? [1 point]

  • Cue. The marginal distribution of the column variable: that variable on its own, ignoring the row variable.

Q2. Why are relative frequencies preferred over counts in a segmented bar graph comparing two groups of different sizes? [2 points]

  • Cue. Proportions put both groups on a per-total scale, so a difference in group size does not distort the visual comparison of the second variable's breakdown.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2017 (style). 1 mark. Section I (multiple choice).
In a two-way table, the totals in the right-hand margin (the row totals) give which distribution?
(A) The joint distribution
(B) The conditional distribution of the column variable
(C) The marginal distribution of the row variable
(D) The relative frequencies of each cell

The correct answer is (C).

The row totals in the right margin summarize one variable on its own, ignoring the other; this is the marginal distribution of the row variable. The margins of a two-way table give the marginal distributions.

(A) The joint distribution is the set of cell counts (or cell proportions). (B) A conditional distribution fixes one category of a variable and looks within it. (D) Cell relative frequencies are joint, not marginal. Only the margin totals define marginal distributions.

AP 2020 (style). 4 marks. Section II (free response).
A survey of 200 people records sex (male, female) and whether they prefer tea or coffee. Of 90 males, 54 prefer coffee; of 110 females, 44 prefer coffee.
(a) Construct the complete two-way table with marginal totals.
(b) State the marginal distribution of beverage preference.
(c) Describe one advantage of a segmented bar graph for displaying these data.

A 4-point question on building and reading a two-way table.

(a) (2 points) Males: coffee 54, tea $90 - 54 = 36$. Females: coffee 44, tea $110 - 44 = 66$. Row totals: male 90, female 110. Column totals: coffee $54 + 44 = 98$, tea $36 + 66 = 102$. Grand total 200.

            Coffee    Tea    Total
  Male        54       36      90
  Female      44       66     110
  Total       98      102     200

(1 point for cells, 1 point for correct margins.)
(b) (1 point) Marginal distribution of beverage: coffee $98/200 = 0.49$, tea $102/200 = 0.51$ (proportions of the whole sample, ignoring sex).
(c) (1 point) A segmented bar graph shows, within each sex, the proportion choosing each beverage, making it easy to compare the conditional distributions of preference between males and females visually.

Markers reward a correct table with margins, the marginal beverage distribution as proportions, and a sensible advantage of the segmented display.
