Setting Up a Test for the Difference of Two Population Proportions

Isabella Lopez

10 min read

Study Guide Overview

This study guide covers two-proportion z-tests for comparing two population proportions. It explains hypotheses (null and alternative), parameters, and the necessary conditions for inference (random, independent, normal). It details the pooled proportion calculation, the large counts condition, calculating the test statistic and p-value, and interpreting results in context. Practice questions and an answer key are included.

AP Statistics: Two-Proportion Z-Test Study Guide 🚀

Hey there, future AP Stats superstar! Let's get you prepped for the exam with a focused review of the two-proportion z-test. This guide is designed to be your go-to resource for a quick, effective review. Let's dive in!

Significance Tests for Two Proportions

A significance test helps us determine if the difference between two sample proportions is statistically significant, or just due to random chance. It's all about seeing if there's a real difference between two groups. Think of it as a detective tool for data! 🕵️‍♀️

Key Concept

Core Idea

  • We're comparing two population proportions (<math-inline>p_1</math-inline> and <math-inline>p_2</math-inline>) to see if there's a genuine difference.
  • We use sample data to make inferences about the population.
  • The goal is to decide whether observed differences are statistically significant or just due to chance.

Hypotheses and Parameters

  • Null Hypothesis (<math-inline>H_0</math-inline>): There is no difference between the two population proportions. Always written as <math-inline>H_0: p_1 = p_2</math-inline>.
  • Alternative Hypothesis (<math-inline>H_a</math-inline>): There is a difference. This can be one-sided (<math-inline>H_a: p_1 > p_2</math-inline> or <math-inline>H_a: p_1 < p_2</math-inline>) or two-sided (<math-inline>H_a: p_1 \neq p_2</math-inline>).
  • Parameters: Clearly define what <math-inline>p_1</math-inline> and <math-inline>p_2</math-inline> represent in the context of the problem. For example, <math-inline>p_1</math-inline> = proportion of people who prefer Brand A, <math-inline>p_2</math-inline> = proportion of people who prefer Brand B.

Memory Aid

Remember: The null hypothesis always assumes no difference (<math-inline>p_1 = p_2</math-inline>). The alternative hypothesis is what you're trying to find evidence for (either <math-inline>></math-inline>, <math-inline><</math-inline>, or <math-inline>\neq</math-inline>).

Conditions for Inference

Before we jump into calculations, we need to make sure our data is good to go. Here are the conditions we need to check:

(1) Random

- Samples must be randomly selected from their respective populations. This is crucial for avoiding bias. If it's not random, your results can't be generalized to the population. 😞

(2) Independence

- **10% Condition:** When sampling without replacement, each population must be at least 10 times as large as its sample. This lets us treat the observations within each sample as approximately independent. 🔟
- **Random Assignment:** If it's a randomized experiment, the random assignment of treatments ensures independence.

(3) Normal

- **Large Counts Condition:** We need to check that the expected numbers of successes and failures in each sample are all at least 10. But here's the twist: we use a *pooled* proportion (<math-inline>\hat{p}_c</math-inline>) to calculate these expected values. 🎩

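  • Pooled Proportion Formula: <math-inline>\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}</math-inline>, where <math-inline>x_1</math-inline> and <math-inline>x_2</math-inline> are the number of successes in each sample, and <math-inline>n_1</math-inline> and <math-inline>n_2</math-inline> are the sample sizes.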

  • Check: <math-inline>n_1\hat{p}_c \geq 10</math-inline>, <math-inline>n_1(1-\hat{p}_c) \geq 10</math-inline>, <math-inline>n_2\hat{p}_c \geq 10</math-inline>, and <math-inline>n_2(1-\hat{p}_c) \geq 10</math-inline>

Memory Aid

Think of it like a swimming pool: We're combining our two samples into one big pool to get a better estimate of the overall proportion. 🏊

Example: MJ vs. Lebron (Again!) 🏀

Let's revisit our basketball legends. MJ made 836/1623 shots, and Lebron made 622/1493 shots. Let's use a 2-Prop Z-Test.

Hypotheses and Parameters

  • <math-inline>H_0: p_{MJ} = p_{L}</math-inline>
  • <math-inline>H_a: p_{MJ} > p_{L}</math-inline>
  • Where <math-inline>p_{MJ}</math-inline> is the proportion of shots made by MJ, and <math-inline>p_{L}</math-inline> is the proportion of shots made by Lebron.

Memory Aid

Use meaningful subscripts (like MJ and L) to keep track of which proportion belongs to which group. It makes everything clearer!

Conditions

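  • Random: Assume these shots are a random sample of each player's shots (the problem does not state this explicitly).
  • Independent: Assume MJ took at least 16,230 shots and Lebron took at least 14,930 shots (10 times each sample size), so the 10% condition holds and the shots within each sample can be treated as independent.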
  • Normal:
    • Calculate the pooled proportion: <math-inline>\hat{p}_c = \frac{836 + 622}{1623 + 1493} \approx 0.468</math-inline>
![Pooled Proportion Calculation Image](https://zupay.blob.core.windows.net/resources/files/0baca4f69800419293b4c75aa2870acd_8709a4_102.png?alt=media&token=3bc1e071-d7ab-44f6-a30a-b087794cfff1)
    - Check large counts:
        - <math-inline>1623(0.468) = 759.6 > 10</math-inline> ✔️
        - <math-inline>1623(0.532) = 863.4 > 10</math-inline> ✔️
        - <math-inline>1493(0.468) = 698.7 > 10</math-inline> ✔️
        - <math-inline>1493(0.532) = 794.3 > 10</math-inline> ✔️
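
The condition checks above can also be scripted. The following is a minimal sketch in Python, assuming the shot counts from this example; the function and variable names are made up for illustration and are not part of any calculator or AP formula sheet.

```python
# Sketch: 10% and large counts checks for the MJ vs. Lebron example.
# All names here are illustrative assumptions, not a standard API.

def pooled_proportion(x1, n1, x2, n2):
    """Combined success rate across both samples: (x1 + x2) / (n1 + n2)."""
    return (x1 + x2) / (n1 + n2)


def expected_counts(n1, n2, p_c):
    """Expected successes and failures in each sample if p1 = p2 = p_c."""
    return [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]


# Shot counts from the example: made shots and attempts.
x_mj, n_mj = 836, 1623
x_l, n_l = 622, 1493

p_c = pooled_proportion(x_mj, n_mj, x_l, n_l)
counts = expected_counts(n_mj, n_l, p_c)

print(f"pooled proportion: {p_c:.3f}")
print("expected counts:", [round(c, 1) for c in counts])
print("large counts met:", all(c >= 10 for c in counts))
# The 10% condition needs each population to be at least 10 times its sample.
print("minimum population sizes:", 10 * n_mj, "and", 10 * n_l)
```

Because this sketch keeps the unrounded pooled proportion, its expected counts differ slightly from the hand calculation (for instance, about 759.4 instead of 759.6). Either way, every count is far above 10.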

Test Statistic and P-value

  • Test Statistic: Calculate the z-score using the formula: <math-inline>z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}</math-inline>
  • P-value: Find the probability, assuming the null hypothesis is true, of getting a z-score at least as extreme as the one you calculated, in the direction(s) given by the alternative hypothesis. Use your calculator or a z-table.
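
For anyone wondering where the denominator of the test statistic comes from, here is a short derivation sketch. It assumes the two samples are independent and writes <math-inline>p</math-inline> for the common proportion under <math-inline>H_0</math-inline>, a symbol introduced here only for this derivation.

```latex
\begin{aligned}
\operatorname{Var}(\hat{p}_1 - \hat{p}_2)
  &= \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}
  && \text{(independent samples)} \\
  &= p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)
  && \text{(under } H_0\text{: } p_1 = p_2 = p\text{)} \\
\text{SE}
  &= \sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}
  && \text{(estimate } p \text{ with } \hat{p}_c\text{)}
\end{aligned}
```

This is why the pooled proportion shows up in both the large counts check and the test statistic: the null hypothesis says there is only one proportion to estimate.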

Conclusion

  • Decision: If the p-value is less than your significance level (usually <math-inline>\alpha = 0.05</math-inline>), reject the null hypothesis: there is convincing evidence for the alternative hypothesis. Otherwise, fail to reject the null hypothesis; this does not prove that the null hypothesis is true.
  • Context: Always state your conclusion in the context of the problem. Don't just say "reject the null." Explain what that means in real-world terms. 🧪
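
The test statistic, p-value, and decision steps can be sketched in Python as well. This is an illustrative sketch, not the calculator's built-in test: `two_prop_z_test` and `decide` are hypothetical helper names, and the normal probabilities come from `statistics.NormalDist` in the standard library (Python 3.8 or later).

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="greater"):
    """Pooled two-proportion z-test; returns (z, p_value).

    alternative: "greater" (p1 > p2), "less" (p1 < p2), or "two-sided".
    """
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # pooled proportion
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p_hat1 - p_hat2) / se

    cdf = NormalDist().cdf  # standard normal cumulative probability
    if alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


def decide(p_value, alpha=0.05):
    """Compare the p-value to the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# MJ vs. Lebron shot counts from the example, with Ha: p_MJ > p_L.
z, p_value = two_prop_z_test(836, 1623, 622, 1493, alternative="greater")
print(f"z = {z:.2f}, p-value = {p_value:.2g}, decision: {decide(p_value)}")
```

With the example's counts, this sketch gives a z-score of roughly 5.5 and a p-value far below 0.05. The code only makes the decision; the conclusion in context (convincing evidence that MJ's true shooting proportion is higher than Lebron's) still has to be written out in words.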

Final Exam Focus

  • High-Priority Topics:
    • Hypotheses (null and alternative), parameters definition
    • Conditions for inference (random, independent, normal)
    • Pooled proportion calculation and large counts condition
    • Interpreting p-values and making conclusions in context
  • Common Question Types:
    • Multiple choice questions testing conceptual understanding of hypothesis testing and conditions.
    • Free response questions requiring you to perform a full significance test, including stating hypotheses, checking conditions, performing calculations, and writing a conclusion in context.
  • Time Management:
    • Quickly identify the type of test needed (two-proportion z-test).
    • Focus on correctly stating hypotheses and checking conditions first.
    • Use calculator shortcuts to find test statistics and p-values.
  • Common Pitfalls:
    • Forgetting to define parameters.
    • Incorrectly checking the normal condition (not using the pooled proportion).
    • Misinterpreting the p-value or not stating the conclusion in context.

Exam Tip

Remember: It's not just about getting the right answer; it's about showing your understanding of the process. Always show your work and explain your reasoning.

Practice Questions

Multiple Choice Questions

  1. A researcher is testing the effectiveness of a new drug. In a randomized experiment, 200 patients are given the new drug, and 200 are given a placebo. The proportion of patients who showed improvement was 0.65 in the drug group and 0.55 in the placebo group. Let <math-inline>p_1</math-inline> and <math-inline>p_2</math-inline> be the true proportions of patients who would improve with the drug and with the placebo, respectively. Which of the following is the correct set of hypotheses for a two-proportion z-test? (A) <math-inline>H_0: p_1 - p_2 = 0</math-inline>, <math-inline>H_a: p_1 - p_2 \neq 0</math-inline> (B) <math-inline>H_0: p_1 = p_2</math-inline>, <math-inline>H_a: p_1 > p_2</math-inline> (C) <math-inline>H_0: p_1 = p_2</math-inline>, <math-inline>H_a: p_1 \neq p_2</math-inline> (D) <math-inline>H_0: \hat{p}_1 = \hat{p}_2</math-inline>, <math-inline>H_a: \hat{p}_1 > \hat{p}_2</math-inline> (E) <math-inline>H_0: p_1 = p_2</math-inline>, <math-inline>H_a: \hat{p}_1 - \hat{p}_2 \neq 0</math-inline>

  2. A survey was conducted to compare the proportions of adults who support a new policy in two different cities. In City A, 150 out of 250 adults supported the policy, while in City B, 120 out of 200 adults supported the policy. Before performing a two-proportion z-test, what condition must be checked regarding the sample sizes? (A) The sample sizes must be equal. (B) Both sample sizes must be greater than 30. (C) The populations must be at least 10 times the sample sizes. (D) The sample proportions must be approximately normal. (E) The sample sizes must be less than 10% of the population.

  3. When conducting a two-proportion z-test, why is a pooled proportion used when checking the normal condition? (A) To simplify calculations. (B) To account for the difference in sample sizes. (C) Because the null hypothesis assumes the two population proportions are equal. (D) To increase the power of the test. (E) To reduce the chance of a Type I error.

Free Response Question

A study was conducted to investigate the effectiveness of a new teaching method. A group of 200 students was randomly assigned to either the new method or the traditional method, with 100 students in each group. After a semester, the proportion of students who passed the final exam was recorded for each group. 70 out of 100 students using the new method passed the exam and 60 out of 100 using the traditional method passed the exam. Conduct a significance test to determine if there is evidence that the new method is more effective than the traditional method. Use a significance level of <math-inline>\alpha = 0.05</math-inline>.

Scoring Rubric:

(a) Hypotheses and Parameters (2 points):
  • 1 point for stating correct null and alternative hypotheses with proper notation.
  • 1 point for defining the parameters in context.

(b) Conditions (3 points):
  • 1 point for stating the random sample/assignment condition.
  • 1 point for stating the independence condition and verifying it.
  • 1 point for stating and verifying the normal condition using the pooled proportion.

(c) Calculations (2 points):
  • 1 point for calculating the test statistic.
  • 1 point for calculating the p-value.

(d) Conclusion (1 point):
  • 1 point for making a correct conclusion in context, based on the p-value and significance level.

Answer Key:

Multiple Choice:

  1. (B)
  2. (C). Option (E) describes the same 10% condition in different words.
  3. (C)

Free Response:

(a) Hypotheses and Parameters:
  • <math-inline>H_0: p_1 = p_2</math-inline>
  • <math-inline>H_a: p_1 > p_2</math-inline>
  • Where <math-inline>p_1</math-inline> is the proportion of students who pass the exam using the new method and <math-inline>p_2</math-inline> is the proportion of students who pass the exam using the traditional method.

(b) Conditions:
  • Random: The problem states that students were randomly assigned to the two methods, so the random condition is met.
  • Independent: Because this is a randomized experiment, random assignment lets us treat the two groups as independent. The 10% condition applies to sampling without replacement from a population, so it is not needed here.
  • Normal: <math-inline>\hat{p}_c = \frac{70 + 60}{100 + 100} = 0.65</math-inline>, and all four expected counts are at least 10:
    • <math-inline>n_1\hat{p}_c = 100(0.65) = 65 \geq 10</math-inline>
    • <math-inline>n_1(1-\hat{p}_c) = 100(0.35) = 35 \geq 10</math-inline>
    • <math-inline>n_2\hat{p}_c = 100(0.65) = 65 \geq 10</math-inline>
    • <math-inline>n_2(1-\hat{p}_c) = 100(0.35) = 35 \geq 10</math-inline>

(c) Calculations:
  • Test Statistic: <math-inline>z = \frac{(0.7 - 0.6) - 0}{\sqrt{0.65(0.35)\left(\frac{1}{100} + \frac{1}{100}\right)}} \approx 1.48</math-inline>
  • P-value: <math-inline>P(z > 1.48) \approx 0.069</math-inline>

(d) Conclusion:
  • Since the p-value (about 0.069) is greater than the significance level (0.05), we fail to reject the null hypothesis. There is not enough evidence to conclude that the new teaching method is more effective than the traditional method.
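
To double-check the arithmetic in this answer, the whole test can be recomputed from the raw counts. The sketch below assumes 100 students per group, as in the answer key, and uses illustrative variable names with Python's standard library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the free response question: students who passed / group size.
x_new, n_new = 70, 100    # new teaching method
x_trad, n_trad = 60, 100  # traditional method
alpha = 0.05

# Normal condition: expected counts based on the pooled proportion.
p_c = (x_new + x_trad) / (n_new + n_trad)
expected = [n_new * p_c, n_new * (1 - p_c), n_trad * p_c, n_trad * (1 - p_c)]
large_counts_ok = all(count >= 10 for count in expected)

# Test statistic and one-sided p-value for Ha: p1 > p2.
se = sqrt(p_c * (1 - p_c) * (1 / n_new + 1 / n_trad))
z = (x_new / n_new - x_trad / n_trad) / se
p_value = 1 - NormalDist().cdf(z)

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"pooled proportion = {p_c:.2f}, large counts met: {large_counts_ok}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

Recomputing from counts like this is a quick way to catch rounding slips before writing the conclusion in context.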

You've got this! Keep practicing, stay confident, and you'll ace that exam. Good luck! 🌟
