Glossary
Alternative Hypothesis (Ha)
A statement that contradicts the null hypothesis, proposing that there is an effect, a difference, or a relationship.
Example:
In the fertilizer test described under Null Hypothesis (H0), the alternative hypothesis might be that the fertilizer increases plant growth (e.g., average plant height with fertilizer > average plant height without fertilizer).
Chi-Square Test
A statistical test used for categorical data to determine if there is a significant association between two categorical variables (independence) or if observed frequencies fit expected frequencies (goodness-of-fit).
Example:
A marketing team wants to know if there's a relationship between a customer's preferred social media platform and their age group; they would use a chi-square test for independence.
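As a rough illustration of the independence version of the test, the sketch below computes expected counts and the chi-square statistic for a two-way table. The counts, the age groups, and the function name chi_square_independence are hypothetical. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail area of the chi-square distribution is exp(-x/2); other tables would need a chi-square table or a statistics library.

```python
from math import exp

def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count = (row total * column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical counts: rows = two age groups, columns = three platforms
observed = [[30, 20, 10], [20, 20, 20]]
statistic, df, expected = chi_square_independence(observed)

# Closed-form upper-tail area, valid only when df == 2
p_value = exp(-statistic / 2) if df == 2 else None
print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value}")
```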
Confidence Interval
A range of plausible values for an unknown population parameter, constructed with a specified level of confidence (e.g., 95% or 99%).
Example:
A 95% confidence interval for the average commute time in a city might be (25 minutes, 35 minutes), meaning we are 95% confident the true average commute time falls within this range.
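One way to see what "95% confident" refers to is to simulate many samples and count how often the resulting intervals capture the true mean. The sketch below is a simplified illustration: it assumes a normal population with a known standard deviation (so a z-interval can be used instead of a t-interval), and the population values, sample size, and function name coverage_rate are all hypothetical.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu=30.0, sigma=5.0, n=40, reps=2000, confidence=0.95, seed=1):
    """Fraction of simulated z-intervals that capture the true mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z_star * sigma / n ** 0.5  # known-sigma margin of error (assumption)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

rate = coverage_rate()
print(f"Share of intervals capturing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95, which is the sense in which the method, rather than any single interval, has a 95% success rate.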
Independent Condition
A condition for inference requiring that individual observations or groups of observations are independent of each other. When sampling without replacement, it is usually checked with the 10% condition: the sample size should be less than 10% of the population size.
Example:
When sampling students from a large university, the independent condition is met if the sample size is small enough (e.g., 100 students from a population of 20,000, since 100 is less than 10% of 20,000, which is 2,000).
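The 10% check is simple arithmetic, sketched here with a hypothetical helper name:

```python
def meets_ten_percent_condition(sample_size, population_size):
    """True when the sample is less than 10% of the population."""
    return sample_size < 0.10 * population_size

# The university example: 100 students out of 20,000
print(meets_ten_percent_condition(100, 20_000))    # 100 < 2,000
# A sample that is too large relative to the population
print(meets_ten_percent_condition(3_000, 20_000))  # 3,000 is not < 2,000
```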
Inference
The process of drawing conclusions or making predictions about a population based on data collected from a sample.
Example:
Using a sample of 500 voters to predict the outcome of a national election is an example of statistical inference.
Matched Pairs t-test
A specific type of t-test used when data are collected from dependent groups, such as before-and-after measurements on the same subjects or pairs of similar subjects.
Example:
To assess if a new exercise program reduces blood pressure, researchers measure each participant's blood pressure before and after the program, then analyze the differences using a matched pairs t-test.
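A minimal sketch of the calculation, using made-up blood pressure readings: the test works on the list of within-subject differences and treats it like a single sample. The function name matched_pairs_t and the data are hypothetical, and with only five subjects the normal condition would need to be checked before trusting the result.

```python
from math import sqrt
from statistics import mean, stdev

def matched_pairs_t(before, after):
    """Return (t statistic, degrees of freedom) for H0: mean difference = 0."""
    diffs = [a - b for b, a in zip(before, after)]  # after minus before
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical systolic readings for five participants
before = [150, 142, 138, 160, 155]
after = [144, 140, 135, 150, 151]
t_stat, df = matched_pairs_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

A negative t statistic here points toward lower readings after the program; the p-value would come from a t distribution with the returned degrees of freedom.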
Normal Condition
A condition for inference requiring that the sampling distribution of the statistic is approximately normal, often checked by a large sample size (Central Limit Theorem) or by examining the data's distribution for skewness/outliers.
Example:
For a small sample of quantitative data, you'd check the normal condition by creating a normal probability plot or histogram to ensure no strong skew or outliers.
Null Hypothesis (H0)
A statement of no effect, no difference, or no relationship between variables, which is assumed to be true until evidence suggests otherwise.
Example:
In a test of a new fertilizer, the null hypothesis would be that the fertilizer has no effect on plant growth (e.g., average plant height with fertilizer = average plant height without fertilizer).
P-Value
The probability of obtaining sample data as extreme as, or more extreme than, the observed data, assuming the null hypothesis is true.
Example:
If a study yields a p-value of 0.03, it means there's a 3% chance of seeing results like these (or more extreme) if the null hypothesis were actually true.
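For a test whose statistic follows a standard normal distribution under the null hypothesis, the p-value is a tail area and can be sketched as follows. The function name p_value_from_z and the z value of 1.88 are illustrative choices, not taken from any particular study.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="greater"):
    """Tail area under the standard normal curve for a given z statistic."""
    std_normal = NormalDist()
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# A z statistic of 1.88 gives a one-sided p-value of roughly 0.03
p_one_sided = p_value_from_z(1.88, "greater")
p_two_sided = p_value_from_z(1.88, "two-sided")
print(f"one-sided: {p_one_sided:.4f}, two-sided: {p_two_sided:.4f}")
```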
Probability
The measure of the likelihood that an event will occur, ranging from 0 (impossible) to 1 (certain).
Example:
The probability of rolling a 6 on a fair six-sided die is 1/6.
Random Condition
A condition for inference requiring that the data come from a random sample or a randomized experiment. Random sampling supports generalizing the results to the population; random assignment supports cause-and-effect conclusions about the treatments.
Example:
Before performing a t-test, you must verify the random condition by checking if the subjects were randomly selected or assigned to treatments.
SPDC Template
A four-step framework (State, Plan, Do, Conclude) used to structure responses for free-response questions involving significance tests or confidence intervals.
Example:
When tackling a free-response question on the AP exam, always remember to follow the SPDC Template to ensure all necessary components are included for full credit.
Sampling Distribution
The distribution of a statistic (like a sample mean or proportion) obtained from all possible samples of the same size taken from a given population.
Example:
If you repeatedly take samples of 30 students and calculate their average height, the distribution of all those sample averages would form the sampling distribution of the mean height.
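Taking every possible sample is rarely practical, but a simulation can approximate the idea. The sketch below draws many samples of 30 from an assumed normal population of heights (mean 170 cm, standard deviation 10 cm; both values and the function name are hypothetical) and summarizes the resulting sample means.

```python
import random
from statistics import mean, stdev

def simulate_sample_means(mu=170.0, sigma=10.0, n=30, reps=2000, seed=3):
    """Return the sample mean from each of `reps` simulated samples of size n."""
    rng = random.Random(seed)
    return [mean([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(reps)]

sample_means = simulate_sample_means()
print(f"center of sample means: {mean(sample_means):.2f}")
print(f"spread of sample means: {stdev(sample_means):.2f}")
print(f"sigma / sqrt(n):        {10.0 / 30 ** 0.5:.2f}")
```

The sample means should center near the population mean, and their spread should be close to sigma divided by the square root of n, which is much smaller than the spread of individual heights.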
Significance Level (alpha)
A predetermined threshold (e.g., 0.05 or 0.01) used to decide whether to reject the null hypothesis; if the p-value is less than alpha, the results are considered statistically significant.
Example:
Setting the significance level at 0.05 means that we are willing to accept a 5% chance of incorrectly rejecting a true null hypothesis.
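The link between alpha and the chance of a false rejection can be checked by simulation. In this sketch the null hypothesis is true by construction (samples come from a normal population with mean 0 and known standard deviation 1), and a two-sided z-test is run on each sample. The setup and the function name type_one_error_rate are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def type_one_error_rate(alpha=0.05, n=25, reps=2000, seed=7):
    """Share of simulated tests that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) * n ** 0.5  # (x_bar - 0) / (1 / sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:  # decision rule: reject when p-value < alpha
            rejections += 1
    return rejections / reps

error_rate = type_one_error_rate()
print(f"Share of true null hypotheses rejected: {error_rate:.3f}")
```

The printed share should be close to 0.05, matching the chosen significance level.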
Significance Test (Hypothesis Test)
A formal procedure used to evaluate the evidence provided by sample data against a null hypothesis about a population parameter.
Example:
A company performs a significance test to determine if a new marketing campaign has increased their product's market share.
Study Design
The overall plan for collecting data, including methods like observational studies, experiments, and sampling techniques.
Example:
Deciding whether to conduct a randomized controlled experiment or a survey to gather data about a new teaching method is part of the study design phase.
t-test/interval for Means (One-Sample)
A statistical procedure used to test a hypothesis or construct a confidence interval about the mean of a single population when the population standard deviation is unknown.
Example:
To determine if the average weight of a new potato chip bag is truly 10 ounces, a quality control manager would use a one-sample t-test on a sample of bags.
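A minimal sketch of the one-sample calculation with made-up bag weights. The function name one_sample_t is hypothetical, and the critical value t* = 2.776 (95% confidence, 4 degrees of freedom) is supplied by hand from a t table because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu_0, t_star):
    """Return (t statistic, degrees of freedom, confidence interval)."""
    n = len(sample)
    x_bar = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # uses the sample standard deviation
    t_stat = (x_bar - mu_0) / standard_error
    interval = (x_bar - t_star * standard_error, x_bar + t_star * standard_error)
    return t_stat, n - 1, interval

# Hypothetical weights (ounces) of five sampled bags
weights = [9.6, 9.8, 10.0, 9.8, 9.8]
t_stat, df, interval = one_sample_t(weights, mu_0=10.0, t_star=2.776)
print(f"t = {t_stat:.3f}, df = {df}")
print(f"95% interval: ({interval[0]:.3f}, {interval[1]:.3f})")
```

With these numbers the interval lies entirely below 10 ounces, which agrees with a two-sided test at the 0.05 level rejecting the claim; a sample this small would still need the normal condition checked.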
t-test/interval for Means (Two-Sample)
A statistical procedure used to compare the means of two independent populations, often to see if there's a significant difference between them.
Example:
A researcher wants to compare the average test scores of students taught by two different methods; they would use a two-sample t-test if the groups are independent.
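The two-sample statistic can be sketched as below, using the unpooled standard error. The scores and the function name two_sample_t are hypothetical, and the degrees of freedom use the conservative choice (the smaller sample size minus 1) rather than the more exact formula that calculators and software apply.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(sample_1, sample_2):
    """Return (t statistic, conservative degrees of freedom) for H0: mu1 = mu2."""
    n1, n2 = len(sample_1), len(sample_2)
    # Unpooled standard error of the difference in sample means
    standard_error = sqrt(variance(sample_1) / n1 + variance(sample_2) / n2)
    t_stat = (mean(sample_1) - mean(sample_2)) / standard_error
    return t_stat, min(n1, n2) - 1

# Hypothetical test scores for two independent groups
method_a = [78, 82, 85, 88, 92]
method_b = [68, 72, 75, 78, 82]
t_stat, df = two_sample_t(method_a, method_b)
print(f"t = {t_stat:.3f}, conservative df = {df}")
```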
z-test/interval for Proportions (One-Sample)
A statistical procedure used to test a hypothesis or construct a confidence interval about the proportion of a single population.
Example:
If a political pollster wants to estimate the percentage of voters who support a certain candidate, they would construct a one-sample z-interval for a proportion.
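A sketch of the interval calculation, assuming a hypothetical poll in which 520 of 1,000 sampled voters support the candidate. The function name one_prop_z_interval is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) bounds of a one-sample z-interval for a proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 520 supporters out of 1,000 voters
lower, upper = one_prop_z_interval(520, 1000)
print(f"95% interval for the proportion: ({lower:.3f}, {upper:.3f})")
```

Here the interval runs from about 0.489 to 0.551, so it includes 0.50 and the poll alone would not settle whether the candidate has majority support.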
z-test/interval for Proportions (Two-Sample)
A statistical procedure used to compare the proportions of two independent populations, often to see if there's a significant difference between them.
Example:
To see if the success rate of a new drug is different for men versus women, a pharmaceutical company might use a two-sample z-test for proportions.
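A minimal sketch of the two-sample test with invented counts (60 successes among 100 men, 45 among 100 women). Because the null hypothesis says the two proportions are equal, the standard error uses the pooled proportion. The function name two_prop_z_test is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(successes_1, n1, successes_2, n2):
    """Return (z statistic, two-sided p-value) for H0: p1 = p2."""
    p_hat_1 = successes_1 / n1
    p_hat_2 = successes_2 / n2
    pooled = (successes_1 + successes_2) / (n1 + n2)  # combined proportion under H0
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p_hat_1 - p_hat_2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical trial results: men 60/100, women 45/100
z, p_value = two_prop_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

With these counts the p-value falls below 0.05, so at that significance level the difference in success rates would be considered statistically significant.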