Chi Square

Noah Martinez
Study Guide Overview
This guide covers chi-square tests for the AP Statistics exam, including the goodness of fit, independence, and homogeneity tests. It explains how to set up the tests, calculate expected counts, and verify test conditions. The SPDC (State, Plan, Do, Conclude) framework is emphasized for free-response questions. Practice questions and exam tips are also provided.
AP Statistics: Chi-Square Tests - Your Ultimate Guide
Hey there, future AP Stats master! Let's dive into the world of chi-square tests, a crucial topic for your exam. This guide is designed to be your go-to resource, especially the night before the test. We'll make sure you're not just prepared, but confident!

Introduction to Chi-Square Tests
Remember Unit 6? We tackled inference for proportions. Now, in Unit 8, we're leveling up with chi-square tests. These are your go-to tools when you have two or more categories and want to analyze relationships between them. Think of it as moving beyond simple proportions to explore the connections between different groups.
Chi-square tests help us determine whether observed counts for a categorical variable differ significantly from the counts we would expect under the null hypothesis. This is a crucial concept for both multiple-choice and free-response questions.
Types of Chi-Square Tests
There are three main types of chi-square tests, each with a specific purpose:
- Chi-Square Test for Goodness of Fit: Examines if a sample distribution matches a hypothesized distribution. Think of this as testing if your data "fits" a particular model.
- Chi-Square Test for Independence: Determines if two categorical variables are independent of each other. Are they related, or is it just a coincidence?
- Chi-Square Test for Homogeneity: Compares distributions of a categorical variable across different populations or treatments. Are the groups similar or different?
Understanding when to use each type of chi-square test is crucial. Questions often test your ability to identify the correct test based on the scenario. Pay close attention to the wording of the problem.
The Core Idea
At its heart, a chi-square test compares observed counts (what you actually see in your data) with expected counts (what you'd expect if the null hypothesis were true, for example if there were no relationship between the variables). If the differences are large enough, we have evidence to reject our null hypothesis.
For example, let's say you are examining the relationship between political affiliation and state of residence. You would compare the number of actual Republican voters in California to the number of Republican voters you would expect if state of residence had no effect on party affiliation.
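The comparison of observed and expected counts can be sketched in a few lines of Python. This is a minimal illustration, not a required method: the function name `chi_square_statistic` and the three-category counts are hypothetical and do not come from the guide.

```python
def chi_square_statistic(observed, expected):
    """Add up (observed - expected)^2 / expected across all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (made up for illustration).
observed = [18, 22, 20]
expected = [20, 20, 20]

print(chi_square_statistic(observed, expected))  # 0.4
```

A statistic near zero means the observed counts sit close to the expected counts; larger values mean a bigger overall gap.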
Setting Up Your Chi-Square Test
What You Need
To get started, you'll need either a two-way table (for independence and homogeneity tests) or a one-way frequency table (for goodness of fit). These tables organize your categorical data, making it easier to calculate expected counts and perform the test.
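For a one-way table, each expected count is the sample size multiplied by the hypothesized proportion for that category. Here is a minimal sketch using the candy-mix claim from the practice questions in this guide (20% red, 30% blue, 20% green, 30% yellow, sample of 500). The function name `expected_counts_gof` is a hypothetical helper, and percentages are kept as whole numbers to avoid rounding noise.

```python
def expected_counts_gof(sample_size, percentages):
    """Expected count per category = sample size * hypothesized percentage / 100."""
    return {category: sample_size * pct / 100 for category, pct in percentages.items()}


# Claimed distribution from the candy practice question, as whole-number percentages.
claimed_percentages = {"red": 20, "blue": 30, "green": 20, "yellow": 30}

expected = expected_counts_gof(500, claimed_percentages)
print(expected)  # {'red': 100.0, 'blue': 150.0, 'green': 100.0, 'yellow': 150.0}
```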
Conditions for Chi-Square Tests
Just like other inference procedures, chi-square tests have conditions that must be met:
- Randomness: Your sample must be randomly selected, or treatments must be randomly assigned in an experiment. Random sampling lets you generalize to the population; random assignment lets you draw conclusions about the treatments.
- Large Counts: All expected counts must be at least 5. This condition ensures the sampling distribution of our test statistic is approximately chi-square.
Forgetting the "Large Counts" condition is a common error. Always calculate expected counts and check that they're all 5 or greater. This is a make-or-break step!
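The Large Counts check is easy to automate once the expected counts are known. The sketch below assumes the expected counts are already in a flat list; the helper name `large_counts_ok` and the example numbers are hypothetical.

```python
def large_counts_ok(expected_counts, minimum=5):
    """Return True only if every expected count is at least `minimum`."""
    return all(count >= minimum for count in expected_counts)


# Hypothetical expected counts (made up for illustration).
print(large_counts_ok([12.5, 8.0, 5.0]))   # True: every count is at least 5
print(large_counts_ok([12.5, 4.2, 9.3]))   # False: 4.2 is below 5
```

Note that the check uses expected counts, not observed counts; an observed count below 5 does not by itself violate the condition.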
Example: Voting Preferences
Let's revisit our voting example. In the 2020 election, Biden received 51.3% of the national vote, and Trump received 46.9%. If Alabama had voted like the nation as a whole, we'd expect Biden to receive about 1.2 million of its roughly 2.3 million votes (51.3% of 2.3 million), but he received only about 849,000. A gap this large suggests a relationship between state of residence and vote choice. Chi-square tests help us quantify how strong the evidence for that relationship is.
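The arithmetic behind that expected count can be written out directly. This sketch uses only the rounded figures quoted above (51.3%, 2.3 million, 849,000); the variable names are illustrative.

```python
national_share_biden = 0.513       # Biden's share of the national vote
alabama_total_votes = 2_300_000    # rounded Alabama total from the example
observed_biden_votes = 849_000     # rounded observed Biden votes in Alabama

# Expected count if Alabama matched the national distribution.
expected_biden_votes = national_share_biden * alabama_total_votes
shortfall = expected_biden_votes - observed_biden_votes

print(round(expected_biden_votes))  # about 1.18 million, i.e. roughly 1.2 million
print(round(shortfall))             # how far the observed count falls below it
```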
Test Taking Template: SPDC
Here's a super helpful template to follow on test day, especially for FRQs: SPDC
- State: Clearly define your parameter of interest and state your null and alternative hypotheses. What are you testing, and what are your claims?
- Plan: Verify the conditions for inference (Randomness and Large Counts). Don't skip this step!
- Do: Calculate your chi-square test statistic and find your p-value. Use calculator shortcuts if you're comfortable with them.
- Conclude: Make a conclusion based on your p-value. Is there enough evidence to reject the null hypothesis?
Using SPDC (State, Plan, Do, Conclude) or a similar template consistently will help you stay organized and ensure you don't miss any crucial steps in your FRQ responses. It's like having a checklist for success!
Final Exam Focus
Okay, let's talk about what's most important for the exam:
- Identifying the Correct Test: Know when to use goodness-of-fit, independence, or homogeneity tests. This is the most common point of confusion.
- Conditions: Always check the randomness and large counts conditions. This is a common place to lose points.
- Expected Counts: Make sure you know how to calculate expected counts correctly. This is essential for the chi-square statistic.
- P-values: Understand what a p-value means and how to use it to make conclusions.
- SPDC: Use this template to structure your FRQs and ensure you hit all the key points.
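Remember, a p-value below your significance level (for example, α = 0.05) means we have convincing evidence against the null hypothesis, so we reject it. A larger p-value means we fail to reject the null hypothesis; it never means we accept it.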
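The decision rule can be expressed as a tiny helper. This is only a sketch: the function name `conclude` and the default significance level of 0.05 are assumptions for illustration, and on the exam you still need to state the conclusion in context.

```python
def conclude(p_value, alpha=0.05):
    """Apply the usual decision rule: reject H0 only when p-value < alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


print(conclude(0.01))  # reject H0
print(conclude(0.20))  # fail to reject H0
```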
Last-Minute Tips
- Time Management: Don't spend too much time on one question. Move on if you're stuck and come back later.
- Common Pitfalls: Be careful with calculator inputs and double-check your calculations.
- FRQ Strategies: Show all your work, even if you use a calculator. Partial credit is your friend!
Practice Questions
Here are some practice questions to solidify your understanding. Remember, practice makes perfect!
Multiple Choice Questions
- A researcher wants to determine if there is an association between a person's favorite color and their preferred type of music. Which test should they use? (a) One-sample z-test for proportions (b) Two-sample z-test for proportions (c) Chi-square test for goodness of fit (d) Chi-square test for independence (e) Chi-square test for homogeneity
- A company claims that the distribution of colors in their candy mix is 20% red, 30% blue, 20% green, and 30% yellow. A sample of 500 candies is taken, and the observed counts are different from the claimed distribution. Which test should be used to determine if the company's claim is correct? (a) One-sample z-test for proportions (b) Two-sample z-test for proportions (c) Chi-square test for goodness of fit (d) Chi-square test for independence (e) Chi-square test for homogeneity
- A study compares the distribution of political affiliations (Democrat, Republican, Independent) across three different states. Which test should be used? (a) One-sample z-test for proportions (b) Two-sample z-test for proportions (c) Chi-square test for goodness of fit (d) Chi-square test for independence (e) Chi-square test for homogeneity
Free Response Question
A researcher is investigating whether there is a relationship between a student's preferred learning style (visual, auditory, kinesthetic) and their academic performance (high, medium, low). They collect data from a random sample of 300 students and organize the data in the following two-way table:
| | Visual | Auditory | Kinesthetic | Total |
|---|---|---|---|---|
| High | 40 | 30 | 20 | 90 |
| Medium | 35 | 40 | 35 | 110 |
| Low | 25 | 30 | 45 | 100 |
| Total | 100 | 100 | 100 | 300 |
(a) State the null and alternative hypotheses for this test.
(b) Calculate the expected counts for each cell in the table. Show your work.
(c) Check if the conditions for performing a chi-square test are met.
(d) Calculate the chi-square test statistic. You may use a calculator.
(e) Calculate the p-value. You may use a calculator.
(f) State your conclusion in the context of the problem.
Scoring Rubric
(a) Hypotheses (1 point)
- 1 point for correct null and alternative hypotheses
- H0: There is no association between preferred learning style and academic performance.
- Ha: There is an association between preferred learning style and academic performance.
(b) Expected Counts (2 points)
- 1 point for showing the correct formula or method for calculating expected counts
- 1 point for correct expected counts
- Expected Count = (Row Total × Column Total) / Grand Total
- Example: Expected count for High/Visual = (90 × 100) / 300 = 30
- Expected counts:

| | Visual | Auditory | Kinesthetic |
|---|---|---|---|
| High | 30 | 30 | 30 |
| Medium | 36.67 | 36.67 | 36.67 |
| Low | 33.33 | 33.33 | 33.33 |
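The expected counts for part (b) can be reproduced from the observed two-way table with the row-total × column-total / grand-total formula. This is a minimal sketch in plain Python; the variable names are illustrative.

```python
# Observed counts from the FRQ table (rows: High, Medium, Low;
# columns: Visual, Auditory, Kinesthetic).
row_labels = ["High", "Medium", "Low"]
observed = [
    [40, 30, 20],
    [35, 40, 35],
    [25, 30, 45],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = (row total * column total) / grand total, for every cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for label, row in zip(row_labels, expected):
    print(label, [round(value, 2) for value in row])
# High [30.0, 30.0, 30.0]
# Medium [36.67, 36.67, 36.67]
# Low [33.33, 33.33, 33.33]
```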
(c) Conditions (2 points)
- 1 point for checking randomness
- 1 point for checking large counts
- Randomness: The problem states that a random sample of 300 students was taken.
- Large Counts: All expected counts are at least 5 (the smallest is 30).
(d) Chi-Square Statistic (1 point)
- 1 point for correct chi-square statistic
- χ² = Σ [(Observed - Expected)² / Expected]
- χ² ≈ 6.67 (High row) + 0.45 (Medium row) + 6.50 (Low row) ≈ 13.62, with df = (3 - 1)(3 - 1) = 4
(e) P-value (1 point)
- 1 point for correct p-value
- p-value ≈ 0.0086
(f) Conclusion (1 point)
- 1 point for correct conclusion in context
- Since the p-value (0.0086) is less than the significance level (e.g., 0.05), we reject the null hypothesis. There is sufficient evidence to suggest that there is an association between preferred learning style and academic performance.
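To check parts (d) and (e) without a graphing calculator, the statistic, degrees of freedom, and p-value can be recomputed from the observed table. This is a sketch under stated assumptions: the helper `chi_square_sf_even_df` is hypothetical and uses a closed-form upper-tail formula that is valid only when the degrees of freedom are even (true here, since df = 4). It is not a general replacement for a statistics library or your calculator's χ² test.

```python
import math


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with EVEN df.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only works for positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Observed counts from the FRQ table (rows: High, Medium, Low).
observed = [
    [40, 30, 20],
    [35, 40, 35],
    [25, 30, 45],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell of the table.
statistic = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp_count = row_totals[i] * col_totals[j] / grand_total
        statistic += (obs - exp_count) ** 2 / exp_count

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_sf_even_df(statistic, df)

print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.4f}")
```

On the exam, a calculator's χ² test on the same matrix of observed counts should report matching values.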
You've got this! Remember to stay calm, use your resources, and trust your preparation. You're ready to ace this exam!
