Glossary
10% Condition
A condition for inference tests stating that, when sampling without replacement, the sample size should be no more than 10% of the population size so that the observations can be treated as approximately independent.
Example:
If you survey 50 students from a high school with 800 students, the 10% Condition (50 < 0.10 * 800 = 80) is met, allowing you to proceed with certain inference procedures.
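The check in this example is a single comparison. A minimal Python sketch follows; the function name `meets_ten_percent_condition` is an illustrative choice, not part of any library, and integer arithmetic is used to avoid floating-point rounding at the boundary.

```python
def meets_ten_percent_condition(sample_size, population_size):
    """Return True when the sample is at most 10% of the population."""
    # n <= 0.10 * N is rewritten as 10 * n <= N to stay in integers.
    return 10 * sample_size <= population_size


# The high school example: 50 students sampled from 800.
print(meets_ten_percent_condition(50, 800))  # True, since 50 <= 80
```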
Alpha (α)
The predetermined significance level, representing the maximum probability of making a Type I error (rejecting a true null hypothesis) that a researcher is willing to accept.
Example:
Alpha (α) is commonly set at 0.05; if your p-value is less than this Alpha (α), you would reject the null hypothesis.
Alternative Hypothesis (Ha)
The statement that there is an effect, a difference, or an association between variables; it's what the researcher is trying to find evidence for.
Example:
If testing a new fertilizer, the Alternative Hypothesis (Ha) would be: 'The new fertilizer increases plant growth compared to the old fertilizer.'
Chi-Square Statistic (χ²)
A test statistic that measures the discrepancy between observed frequencies and expected frequencies under the null hypothesis, indicating how well the observed data fits the expected pattern.
Example:
A high Chi-Square Statistic (χ²) value suggests a large difference between what was observed and what was expected, providing evidence against the null hypothesis.
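The statistic is usually computed as the sum of (O - E)² / E over every category or cell. The sketch below assumes that standard formula; the die-roll counts are made-up illustration data and `chi_square_statistic` is a hypothetical helper name.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: a die rolled 60 times, so a fair die expects 10 per face.
observed = [8, 12, 9, 11, 14, 6]
expected = [10] * 6
stat = chi_square_statistic(observed, expected)
print(round(stat, 2))  # 4.2
```

Faces far from their expected count contribute the most: here the faces observed 14 and 6 times account for 3.2 of the total 4.2.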
Chi-Square Test for Homogeneity
A type of chi-square test used to determine if the distribution of a single categorical variable is the same across multiple independent populations.
Example:
If a school wants to know if the distribution of after-school activity participation (sports, clubs, none) is the same for freshmen, sophomores, and juniors, they would conduct a Chi-Square Test for Homogeneity.
Chi-Square Test for Independence
A type of chi-square test used to determine if there is a statistically significant relationship between two categorical variables within a single population.
Example:
To investigate if there's an association between a person's zodiac sign and their favorite ice cream flavor, you would use a Chi-Square Test for Independence.
Chi-Square Tests
Statistical tests used to analyze categorical data, determining if there's a significant association between variables or if observed data fits an expected pattern.
Example:
A researcher might use a Chi-Square Test to see if there's a relationship between a student's favorite subject and their preferred learning style.
Degrees of Freedom (df)
A value that determines the specific shape of a chi-square distribution, calculated as (number of rows - 1) × (number of columns - 1) for chi-square tests of independence or homogeneity.
Example:
For a 3x4 contingency table, the Degrees of Freedom (df) would be (3-1) * (4-1) = 2 * 3 = 6.
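The same arithmetic as a small Python sketch, reading the row and column counts from the table itself. The helper name `degrees_of_freedom` and the placeholder table are assumptions made for illustration.

```python
def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a two-way table given as a list of rows."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)


# Placeholder 3x4 table; only its shape matters for df.
table = [[0] * 4 for _ in range(3)]
print(degrees_of_freedom(table))  # 6
```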
Expected Frequency (E)
The count that would be anticipated in each category or cell of a contingency table if the null hypothesis were true and there were no association or difference.
Example:
If a fair coin is flipped 100 times, the Expected Frequency (E) for heads would be 50.
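For a two-way contingency table, the expected count in a cell is conventionally computed as (row total × column total) / grand total. The glossary does not state this formula, so the sketch below is supplementary; the 2x2 counts are invented and `expected_counts` is a hypothetical helper name.

```python
def expected_counts(table):
    """Expected cell counts under no association, for a list-of-rows table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts for two groups and two response categories.
observed = [[30, 20],
            [20, 30]]
print(expected_counts(observed))  # [[25.0, 25.0], [25.0, 25.0]]
```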
Fail to Reject the Null Hypothesis
The decision made when the p-value is greater than the significance level (α), indicating insufficient evidence to support the alternative hypothesis.
Example:
If a study finds no significant difference between two teaching methods (p > 0.05), the conclusion is to Fail to Reject the Null Hypothesis, meaning there's not enough evidence to say one method is better.
Large Counts Condition
A condition for chi-square tests requiring that all expected counts in the contingency table are at least 5, ensuring the chi-square distribution is a good approximation for the test statistic.
Example:
Before performing a chi-square test on survey data, you must check the Large Counts Condition by calculating all expected frequencies and confirming none are below 5.
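A minimal Python sketch of that check, assuming the expected counts have already been computed and stored as a list of rows. The function name and the sample tables are illustrative only.

```python
def meets_large_counts_condition(expected, minimum=5):
    """Return True when every expected count is at least `minimum`."""
    return all(count >= minimum for row in expected for count in row)


# Hypothetical expected-count tables.
print(meets_large_counts_condition([[25.0, 25.0], [25.0, 25.0]]))  # True
print(meets_large_counts_condition([[4.2, 10.8], [9.8, 25.2]]))    # False, 4.2 < 5
```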
Null Hypothesis (H0)
The statement of no effect, no difference, or no association between variables; it's the 'status quo' that researchers look for evidence against.
Example:
For a study on coffee preference and study habits, the Null Hypothesis (H0) would state: 'There is no association between a student's coffee preference (e.g., black, latte) and their study habits (e.g., morning, night).'
Observed Frequency (O)
The actual count of occurrences in each category or cell of a contingency table, as collected from the sample data.
Example:
In a survey of 100 people, if 30 reported preferring cats, then 30 is the Observed Frequency (O) for the 'cats' category.
P-value
The probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming the null hypothesis is true.
Example:
If a study yields a P-value of 0.01, it means there's only a 1% chance of seeing results at least this extreme if the null hypothesis were actually true, which is strong evidence against the null.
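For a chi-square test, the P-value is the upper-tail area of the chi-square distribution beyond the test statistic. The sketch below computes that area with only Python's standard library, using the closed form for the first term and a standard recurrence for the upper incomplete gamma function; `chi2_sf` is a hypothetical helper that handles integer degrees of freedom only. In practice a statistics library (for example `scipy.stats.chi2.sf`) would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Starting value: Q(1, h) for even df, Q(1/2, h) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Recurrence Q(a + 1, h) = Q(a, h) + h^a * e^(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


# 3.841 is the familiar 5% critical value for df = 1.
print(round(chi2_sf(3.841, 1), 3))  # approximately 0.05
```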
Random Condition
A condition for inference tests requiring that data come from a random sample or a randomized experiment to ensure generalizability and valid statistical inference.
Example:
Before analyzing survey results about student opinions, it's crucial to verify the Random Condition by ensuring the students were selected via a simple random sample.
Reject the Null Hypothesis
The decision made when the p-value is less than or equal to the significance level (α), indicating statistically significant evidence to support the alternative hypothesis.
Example:
If a new drug significantly outperforms a placebo (p < 0.05), a researcher would Reject the Null Hypothesis, concluding the drug has an effect.
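The two decisions in this glossary (reject when the p-value is at most α, fail to reject otherwise) can be written as one comparison. A minimal Python sketch, with `decide` as a hypothetical function name and 0.05 as an assumed default significance level:

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= alpha."""
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


print(decide(0.01))  # reject H0
print(decide(0.20))  # fail to reject H0
```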