Statistics for Two Categorical Variables

Noah Martinez
Study Guide Overview
This study guide covers exploring categorical data, focusing on two-way tables, relative frequencies (joint, marginal, conditional), and assessing associations between categorical variables. It also includes visualizing data with side-by-side, segmented, and mosaic plots, plus vocabulary practice and exam tips covering multiple-choice and free-response questions.
#AP Statistics: Unit 2 - Exploring Categorical Data
Hey there! Let's get you prepped for the AP Stats exam with a deep dive into categorical data. This guide is designed to be your go-to resource for a quick, effective review. We'll break down key concepts, practice questions, and exam strategies to ensure you're feeling confident and ready to ace it! Let's jump in!
#2.2 - Exploring Two-Way Tables
#Relative Frequencies: More Than Meets the Eye
Remember those two-way tables? They're not just for counting! We can extract a ton of info using relative frequencies. Let's break it down:
- Joint Relative Frequency: The proportion of all observations that fall into a specific combination of categories. (e.g., the proportion of all respondents who are male and think they have a "50-50 chance").
- Marginal Relative Frequency: The proportion of observations that fall into a specific category, regardless of other categories. (e.g., the overall proportion of people who think they have a "50-50 chance").
- Conditional Relative Frequency: The proportion of observations that fall into a specific category, given that they are already in another category. (e.g., the proportion of males who think they have a "50-50 chance" out of all males).
#Marginal Relative Frequency
- The marginal relative frequency is the proportion of all individuals who fall into a specific category. It comes from the totals in the margins of your two-way table.
- How to calculate: Divide the row or column total by the grand total of the table.
- Example: In the table below, the marginal relative frequency of "50-50 chance" is 1416/4826 ≈ 0.293. This means 1416 out of all 4826 respondents gave that response.
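Here's a minimal Python sketch of the marginal calculation above. It uses only the two counts quoted in the example (1416 and 4826); the function name `marginal_relative_frequency` is just an illustrative choice.

```python
def marginal_relative_frequency(category_total, grand_total):
    """Row or column total divided by the grand total of the table."""
    if grand_total <= 0:
        raise ValueError("grand_total must be positive")
    return category_total / grand_total

# Counts quoted in the example: 1416 "50-50 chance" responses out of 4826.
fifty_fifty = marginal_relative_frequency(1416, 4826)
print(f"{fifty_fifty:.3f}")  # 0.293
```

Notice that the denominator is always the grand total, never a single row or column total.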
#Conditional Relative Frequency
- The conditional relative frequency is the relative frequency of a category given that the subject is already in another category. Think of it as focusing on a specific subgroup within your data.
- How to calculate: Divide the frequency of the intersection of the two categories by the total frequency of the given category.
- Example: The conditional relative frequency of "50-50 chance" given male is 720/2459 ≈ 0.293. Out of the 2459 males, 720 said "50-50 chance."
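The denominator is where most mistakes happen, so here's a short Python sketch that puts the conditional and joint versions side by side. It uses the counts quoted in this guide (720 males who said "50-50 chance", 2459 males, 4826 respondents overall); the function and variable names are illustrative assumptions.

```python
def conditional_relative_frequency(joint_count, given_total):
    """Count in both categories divided by the total of the given category."""
    if given_total <= 0:
        raise ValueError("given_total must be positive")
    return joint_count / given_total

male_and_fifty_fifty = 720  # males who answered "50-50 chance"
male_total = 2459           # all males (the "given" group)
grand_total = 4826          # all respondents

conditional = conditional_relative_frequency(male_and_fifty_fifty, male_total)
joint = male_and_fifty_fifty / grand_total  # same numerator, different denominator

print(f"50-50 given male: {conditional:.3f}")  # 0.293
print(f"50-50 and male:   {joint:.3f}")        # 0.149
```

Same numerator, different denominator: "given male" divides by the male total, while "and male" divides by the grand total.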
#2.3 - Associations in Categorical Data
#Determining Associations
- We use marginal and conditional relative frequencies to determine if two categorical variables are associated.
- Association: Two variables are associated if the conditional relative frequencies for one variable differ across the categories of the other variable.
- Independence: Two variables are independent if the conditional relative frequencies for one variable are the same across all categories of the other variable. In other words, knowing the category of one variable doesn't change the distribution of the other.
#Example
Let's see if "gender" and "opinion" are associated using the table below:
- Compare the conditional relative frequency of "50-50 chance" given male with the marginal relative frequency of "50-50 chance". If the two are roughly equal (and the same holds for the other groups and responses), the variables are likely independent. If they differ noticeably, the variables are associated.
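As a rough sketch of that comparison in Python, using only the four counts quoted earlier in this guide: the "everyone else" group is an assumption here, obtained by subtracting the male counts from the totals, and the check covers a single response category rather than the full table.

```python
# Counts quoted in this guide.
fifty_fifty_total = 1416
grand_total = 4826
male_fifty_fifty = 720
male_total = 2459

# Assumption: all remaining respondents form one comparison group,
# so their counts come from simple subtraction.
other_fifty_fifty = fifty_fifty_total - male_fifty_fifty  # 696
other_total = grand_total - male_total                    # 2367

marginal = fifty_fifty_total / grand_total
given_male = male_fifty_fifty / male_total
given_other = other_fifty_fifty / other_total

print(f"marginal:         {marginal:.3f}")     # 0.293
print(f"given male:       {given_male:.3f}")   # 0.293
print(f"given not male:   {given_other:.3f}")  # 0.294
```

The three proportions are nearly identical, so for this one response there is little sign of an association. A full answer would repeat the comparison for every response category before calling the variables independent.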
#Visualizing Categorical Data
#Graphs for Categorical Data
- Side-by-side bar graphs: Compare distributions of a categorical variable across different groups.
- Segmented bar graphs: Show the distribution of a categorical variable within each group.
- Mosaic plots: Visualize the relationship between two categorical variables, where the area of each rectangle is proportional to the joint relative frequency.
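To see what a segmented bar graph actually plots, here's a small Python sketch. The counts are hypothetical, invented purely for illustration (they are not from this guide), and the text "bars" are only a rough stand-in for a real graph.

```python
# Hypothetical counts, invented purely for illustration (not from this guide).
counts = {
    "Group A": {"Yes": 30, "No": 70},
    "Group B": {"Yes": 45, "No": 15},
}

def conditional_distribution(group_counts):
    """Turn one group's counts into proportions that sum to 1."""
    total = sum(group_counts.values())
    return {category: n / total for category, n in group_counts.items()}

# Each segmented bar is the conditional distribution within one group.
segments = {group: conditional_distribution(c) for group, c in counts.items()}

for group, dist in segments.items():
    # Rough 20-character text bar: one letter per 5% of the group.
    bar = "".join(category[0] * round(share * 20) for category, share in dist.items())
    print(f"{group}: {bar}")
```

Every bar has the same total length because each one sums to 100% of its group. When the segments differ a lot from bar to bar, as they do with these made-up counts, that points toward an association. A side-by-side bar graph shows the same proportions as separate bars instead of stacked segments.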
#Vocabulary Practice (2.1 to 2.3)
Let's solidify your understanding with a quick matching exercise!
Match the number with the letter that corresponds to its description!
1. Two-way tables
2. Side-by-side bar graphs
3. Mosaic plots
4. Segmented bar graphs
5. Categorical variable
6. Quantitative variable
7. Bivariate variable
8. Marginal relative frequency
9. Conditional relative frequency
A. A graphical display that shows the relationship between two categorical variables by dividing the area of a rectangle into tiles that represent the different categories of both variables.
B. A graphical display that shows the distribution of a categorical variable by displaying the frequency or relative frequency of each category as a bar.
C. A graphical display that shows the relationship between two categorical variables by dividing the bars in a bar graph into segments that represent the different categories of one of the variables.
D. A statistical table that shows the frequencies or relative frequencies of two categorical variables in a cross-tabulated format, with one variable represented by rows and the other by columns.
E. A variable that can take on a limited number of categories or values, such as "male" or "female," but cannot be meaningfully ordered or measured on a continuous scale.
F. A variable that can be measured or ordered on a continuous scale, such as height or weight.
G. The relative frequency of a particular category of a categorical variable within another category, calculated by dividing the frequency or relative frequency of the first category within the second category by the total frequency or relative frequency of the second category.
H. The frequency or relative frequency of a particular category of a categorical variable, calculated by dividing the frequency or relative frequency of that category by the total frequency or relative frequency for the entire sample or population.
I. A statistical concept that refers to the relationship between two variables, often used to describe the association between two categorical variables.
#Answers
1. D
2. B
3. A
4. C
5. E
6. F
7. I
8. H
9. G
#Final Exam Focus
#High-Priority Topics
- Two-way tables: Master calculating joint, marginal, and conditional relative frequencies. Know how to use these values to assess association.
- Graphical displays: Be able to interpret and compare side-by-side bar graphs, segmented bar graphs, and mosaic plots.
- Association vs. Independence: Understand the difference and be able to determine if two categorical variables are associated or independent.
#Common Question Types
- Multiple Choice: Expect questions that ask you to calculate relative frequencies, interpret graphs, and identify associations.
- Free Response: Be prepared to construct and interpret two-way tables and graphs, and to justify your conclusions about association using statistical evidence.
#Last-Minute Tips
- Time Management: Don't spend too long on any one question. If you're stuck, move on and come back later.
- Common Pitfalls: Double-check your calculations, especially when dealing with conditional relative frequencies. Make sure you're using the correct denominator.
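- Strategies: When interpreting graphs of categorical data, compare the proportions in each category across groups and point out the largest differences (shape, center, and spread describe quantitative data, not categorical data). When assessing association, make sure to compare conditional distributions.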
#Practice Questions
#Multiple Choice Questions
1. A survey asked 100 students whether they prefer math or science. The results are shown in the table below:
| | Math | Science | Total |
|---|---|---|---|
| Male | 20 | 30 | 50 |
| Female | 30 | 20 | 50 |
| Total | 50 | 50 | 100 |
What is the conditional relative frequency that a student prefers math, given that the student is male?
(A) 20/100
(B) 20/50
(C) 50/100
(D) 30/50
(E) 20/30
2. Which of the following is NOT a way to display the relationship between two categorical variables?
(A) Two-way table
(B) Side-by-side bar graph
(C) Scatterplot
(D) Mosaic plot
(E) Segmented bar graph
3. If two categorical variables are independent, which of the following statements must be true?
(A) The marginal relative frequencies are equal.
(B) The conditional relative frequencies are equal.
(C) The joint relative frequencies are equal.
(D) The two variables have a strong positive association.
(E) The two variables have a strong negative association.
#Free Response Question
A researcher is studying the relationship between a person's favorite color and their personality type. They surveyed 200 people and categorized their favorite color as either red, blue, or green, and their personality type as either introverted or extroverted. The results are shown in the table below:
| | Red | Blue | Green | Total |
|---|---|---|---|---|
| Introverted | 20 | 40 | 30 | 90 |
| Extroverted | 40 | 30 | 40 | 110 |
| Total | 60 | 70 | 70 | 200 |
(a) Calculate the marginal relative frequency of people whose favorite color is blue.
(b) Calculate the conditional relative frequency that a person is introverted, given that their favorite color is red.
(c) Based on your calculations, is there an association between favorite color and personality type? Justify your answer.
#Scoring Rubric
(a) Marginal Relative Frequency (1 point)
- 1 point for correctly calculating the marginal relative frequency: 70/200 = 0.35
(b) Conditional Relative Frequency (1 point)
- 1 point for correctly calculating the conditional relative frequency: 20/60 ≈ 0.33
(c) Association (3 points)
- 1 point for stating whether or not there is an association.
- 1 point for providing a comparison of conditional relative frequencies.
- 1 point for providing a justification based on the comparison.
Example Answer: There appears to be an association between favorite color and personality type. The conditional relative frequency of being introverted given that the favorite color is red (20/60 ≈ 0.33) is not equal to the conditional relative frequency of being introverted given that the favorite color is blue (40/70 ≈ 0.57). Since the conditional distributions are not the same, we can say that there is an association between the two variables in this sample.
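If you want to check the rubric numbers yourself, here's a minimal Python sketch built from the free-response table above. The dictionary layout and variable names are illustrative assumptions; the counts are the ones given in the question.

```python
# Counts from the free-response table (rows: personality, columns: color).
table = {
    "Introverted": {"Red": 20, "Blue": 40, "Green": 30},
    "Extroverted": {"Red": 40, "Blue": 30, "Green": 40},
}
colors = ["Red", "Blue", "Green"]

# Column totals are the denominators for "given color" calculations.
column_totals = {c: sum(row[c] for row in table.values()) for c in colors}
grand_total = sum(column_totals.values())

# (a) Marginal relative frequency of blue.
marginal_blue = column_totals["Blue"] / grand_total

# (b), (c) Conditional relative frequency of introverted, given each color.
introverted_given_color = {
    c: table["Introverted"][c] / column_totals[c] for c in colors
}
marginal_introverted = sum(table["Introverted"].values()) / grand_total

print(f"marginal blue: {marginal_blue:.2f}")  # 0.35
for c in colors:
    print(f"introverted given {c}: {introverted_given_color[c]:.3f}")
print(f"introverted overall: {marginal_introverted:.3f}")  # 0.450
```

The conditional values come out to about 0.333 (red), 0.571 (blue), and 0.429 (green), compared with 0.450 overall. Because they differ across colors, the comparison supports the association claimed in the example answer.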
#Answers
Multiple Choice Answers:
1. (B)
2. (C)
3. (B)
That's it! You've reviewed the key concepts for exploring categorical data. You've got this!