Glossary
Alternative Hypothesis (Ha)
A statement that contradicts the null hypothesis, proposing an effect, a difference, or a relationship between variables.
Example:
If the null hypothesis states no association, the Alternative Hypothesis would state that there is an association between age group and preferred social media platform.
Chi-Square Test Statistic (χ²)
A calculated value that quantifies the discrepancy between the observed frequencies and the expected frequencies in a chi-square test.
Example:
A large Chi-Square Test Statistic value means the observed counts are far from the expected counts, which gives evidence against the null hypothesis and may lead to rejecting it.
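As a rough sketch of the calculation, the statistic adds up (observed - expected)² / expected over every category. The function name `chi_square_statistic` and the coin-flip counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories or cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 flips give 60 heads and 40 tails; a fair coin expects 50 and 50.
statistic = chi_square_statistic([60, 40], [50, 50])
print(statistic)  # 10^2 / 50 + 10^2 / 50 = 4.0
```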
Chi-Square Test for Goodness of Fit
A chi-square test used to determine if an observed sample distribution matches a hypothesized or theoretical population distribution for a single categorical variable.
Example:
A candy company claims their bags contain 30% red, 20% blue, 50% green candies; you'd use a Chi-Square Test for Goodness of Fit to see if a sample bag matches this claim.
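One way the candy example might be worked through is sketched below. The observed counts are invented for illustration (a hypothetical bag of 100 candies), and the p-value shortcut `exp(-x / 2)` is only valid because three categories give 2 degrees of freedom; other degrees of freedom need a chi-square table or statistical software.

```python
import math

claimed = {"red": 0.30, "blue": 0.20, "green": 0.50}  # the company's claim
observed = {"red": 36, "blue": 18, "green": 46}        # hypothetical sample bag

n = sum(observed.values())
# Expected count for each color = sample size * claimed proportion.
expected = {color: n * p for color, p in claimed.items()}

chi_sq = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in claimed)
df = len(claimed) - 1  # number of categories minus 1

# For df = 2 only, the chi-square upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(round(chi_sq, 2), df, round(p_value, 3))  # 1.72 2 0.423
```

With these made-up counts the p-value is well above 0.05, so the sample would not give convincing evidence against the company's claim.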
Chi-Square Test for Homogeneity
A chi-square test used to compare the distributions of a single categorical variable across two or more different populations or groups.
Example:
If you want to see if the distribution of political party affiliation is the same for voters in California, Texas, and New York, you'd use a Chi-Square Test for Homogeneity.
Chi-Square Test for Independence
A chi-square test used to determine if there is a statistically significant association or relationship between two categorical variables in a single sample.
Example:
To investigate if there's a link between a person's favorite genre of movie (action, comedy, drama) and their preferred streaming service, you would use a Chi-Square Test for Independence.
Chi-Square Tests
A family of statistical tests used to analyze relationships between two or more categorical variables or to determine if observed data fits a hypothesized distribution.
Example:
A researcher might use a Chi-Square Test to see if there's a relationship between a student's major and their preferred study location (library, dorm, coffee shop).
Expected Frequencies
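The counts of outcomes that would be anticipated if the null hypothesis were true. In a goodness-of-fit test, each expected count is the sample size times the hypothesized proportion; in a two-way table, each is (row total × column total) / grand total.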
Example:
If a coin is fair, and you flip it 100 times, you'd have an Expected Frequency of 50 heads.
Frequency Table Distribution
A table that summarizes the counts of observations for each category of a single categorical variable.
Example:
A table listing the number of students who chose each color (red, blue, green) as their favorite is a Frequency Table Distribution.
Large Counts Condition
A condition for chi-square tests requiring that all expected counts (in a one-way or two-way table) are at least 5, so that the sampling distribution of the test statistic is approximately chi-square.
Example:
If you calculate the expected number of people who prefer classical music in a survey and it's 3, the Large Counts Condition is not met, so the chi-square approximation may be unreliable for that table.
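The check itself is simple enough to sketch in a few lines. The helper name `large_counts_ok` and the example counts are hypothetical; note that the condition is checked on expected counts, not observed counts.

```python
def large_counts_ok(expected_counts, minimum=5):
    """Return True when every expected count is at least `minimum`."""
    return all(e >= minimum for e in expected_counts)


print(large_counts_ok([30, 20, 50]))  # True
print(large_counts_ok([3, 42, 55]))   # False: the expected count of 3 is below 5
```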
Null Hypothesis (H0)
A statement of no effect, no difference, or no relationship between variables, which is assumed true until evidence suggests otherwise.
Example:
For a chi-square test of independence, the Null Hypothesis would state that there is no association between a person's age group and their preferred social media platform.
Observed Frequencies
The actual counts of outcomes recorded in a sample or experiment. Chi-square calculations use these counts, not proportions or percentages.
Example:
If you survey 100 people and find 60 prefer coffee, 60 is the Observed Frequency for coffee preference.
P-value
The probability of obtaining a test statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true.
Example:
If your P-value is 0.03, then when the null hypothesis is true there is a 3% chance of getting a test statistic at least as extreme as yours. At the common 0.05 significance level, that is enough evidence to reject the null.
Randomness Condition
A condition for inference procedures requiring that the data come from a random sample or a randomized experiment. A random sample supports generalizing to the population; random assignment in an experiment supports attributing differences to the treatments.
Example:
Before conducting a survey on student opinions, ensuring that every student has an equal chance of being selected for the sample satisfies the Randomness Condition.
SPDC
An acronym representing a structured template for answering free-response questions in AP Statistics: State (hypotheses), Plan (conditions), Do (calculations), Conclude (interpretation).
Example:
When tackling an FRQ, following the SPDC framework helps ensure you address all necessary components for full credit.
Significance Level (α)
A predetermined threshold (commonly 0.05) used to decide whether to reject the null hypothesis; if the p-value is less than or equal to this level, the null hypothesis is rejected.
Example:
Setting the Significance Level at 0.01 means you require very strong evidence (a p-value of 0.01 or less) to reject the null hypothesis.
Two-way Table
A table that displays the counts of observations for two categorical variables, with rows representing categories of one variable and columns representing categories of the other.
Example:
A table showing the number of students who prefer online vs. in-person classes, broken down by grade level (freshman, sophomore, etc.), is a Two-way Table.
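When a two-way table feeds a chi-square test, each expected count is usually computed as (row total × column total) / grand total. The sketch below assumes a small invented table of class-format preference by grade level; the counts are not real data.

```python
# Hypothetical counts: rows are grade levels, columns are (online, in-person).
table = [
    [30, 20],  # freshman
    [25, 25],  # sophomore
    [20, 30],  # junior
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(table, expected)
    for obs, exp in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1) * (columns - 1)

print(expected)    # [[25.0, 25.0], [25.0, 25.0], [25.0, 25.0]]
print(chi_sq, df)  # 4.0 2
```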