Glossary
Central Limit Theorem (CLT)
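The CLT states that when the sample size is sufficiently large (n ≥ 30 is a common rule of thumb, not a strict cutoff), the sampling distribution of the sample mean will be approximately normal, regardless of the shape of the population distribution. Strongly skewed populations may need larger samples before the approximation is good.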
Example:
Even if the distribution of individual test scores is skewed, if you take random samples of 30 or more students, the Central Limit Theorem says that the distribution of the average test scores from those samples will be roughly normal.
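A short simulation can make this concrete. The sketch below is illustrative only: the exponential population, the seed, and the 2,000 repetitions are assumptions chosen for the demo, not part of the definition. It draws many samples of size 30 from a right-skewed population and compares the center and spread of the sample means with what theory predicts.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    # Exponential(rate=1) is strongly right-skewed, with mean 1 and SD 1
    # (an assumed population, chosen only for illustration).
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 30
means = [sample_mean(n) for _ in range(2000)]

center = statistics.fmean(means)   # should land near the population mean, 1
spread = statistics.stdev(means)   # should land near 1 / sqrt(n)

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f}")
print(f"theory, 1/sqrt(n):    {1 / n ** 0.5:.3f}")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the individual values are skewed.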
Independence (10% Condition)
When sampling without replacement, this condition lets you treat individual observations as approximately independent by requiring that the sample be no more than 10% of the population (equivalently, the population size is at least 10 times the sample size).
Example:
If you're sampling 30 M&M's from a bag, the 10% condition means the bag should contain at least 300 M&M's to ensure that picking one M&M doesn't significantly change the probability of picking the next.
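The check itself is simple arithmetic. Here is a minimal sketch; the function name `meets_ten_percent_condition` is made up for this example.

```python
def meets_ten_percent_condition(sample_size, population_size):
    # The sample may be at most 10% of the population, i.e.
    # 10 * n <= N. Integer arithmetic avoids rounding surprises.
    return 10 * sample_size <= population_size

# The M&M example: 30 candies from a bag of 300 just meets the condition.
print(meets_ten_percent_condition(30, 300))
# A bag of 250 is too small for a sample of 30.
print(meets_ten_percent_condition(30, 250))
```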
Large Counts Condition (for Proportions)
This condition checks that the sampling distribution of the sample proportion is approximately normal by requiring at least 10 expected successes (np ≥ 10) and at least 10 expected failures (n(1-p) ≥ 10).
Example:
To use a normal model for the proportion of students who own a smartphone in a sample of 50, you'd need to check if at least 10 students are expected to own one and at least 10 are expected not to, satisfying the large counts condition.
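As a sketch of the check, the helper below computes both expected counts. The function name and the values of p are hypothetical, picked only to show one passing and one failing case for a sample of 50.

```python
def large_counts_ok(n, p):
    # Expected successes and expected failures must each be at least 10.
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= 10 and expected_failures >= 10

# Hypothetical ownership rates for a sample of 50 students.
print(large_counts_ok(50, 0.60))  # expected counts: 30 and 20
print(large_counts_ok(50, 0.85))  # expected counts: 42.5 and 7.5
```

The second case fails because only 7.5 students are expected not to own a smartphone, so a normal model would not be appropriate there.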
Random Condition (for Proportions and Means)
This condition requires that the sample be randomly selected from the population, which protects against selection bias and justifies generalizing the results to that population.
Example:
To study student opinions on a new school policy, a principal must use a truly random selection method, like drawing names from a hat, rather than just asking the first 50 students they see.
Sampling Distribution
A sampling distribution is the distribution of a statistic (like a sample mean or proportion) obtained from all possible samples of the same size drawn from a population.
Example:
Imagine repeatedly taking samples of 50 students from your school and calculating the average GPA for each sample; the distribution of all those average GPAs would be the sampling distribution of the sample mean GPA.
Sampling Distribution of Means
This is the distribution of sample means from all possible samples of the same size taken from a population, used when dealing with numerical data.
Example:
If you repeatedly draw random samples of 25 adults and record the average height in each sample, the distribution of all those average heights would be the sampling distribution of means.
Sampling Distribution of Proportions
This is the distribution of sample proportions from all possible samples of the same size taken from a population, used when dealing with categorical data.
Example:
If you repeatedly survey 100 people and record the proportion who prefer coffee over tea, the distribution of those proportions forms the sampling distribution of proportions.
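The repeated-survey idea can be simulated directly. This is a sketch under assumptions: the true proportion p = 0.6, the seed, and the 2,000 repetitions are invented for illustration. It compares the spread of the simulated sample proportions with the standard formula sqrt(p(1-p)/n).

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

p = 0.6   # assumed true proportion who prefer coffee (hypothetical)
n = 100   # people per survey

def sample_proportion(n, p):
    # Each respondent prefers coffee with probability p.
    return sum(random.random() < p for _ in range(n)) / n

props = [sample_proportion(n, p) for _ in range(2000)]

center = statistics.fmean(props)
spread = statistics.stdev(props)
theory = (p * (1 - p) / n) ** 0.5

print(f"mean of sample proportions: {center:.3f}")
print(f"SD of sample proportions:   {spread:.4f}")
print(f"theory, sqrt(p(1-p)/n):     {theory:.4f}")
```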
Sampling Distributions for Differences (in Means and Proportions)
These distributions describe the variability of the difference between two sample statistics (means or proportions) when comparing two populations.
Example:
To compare the average study times of students at two different universities, you would look at the sampling distribution for the difference in means between samples from each university.
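For two independent samples, the spread of this sampling distribution is usually summarized by the standard error sqrt(s1²/n1 + s2²/n2). The sketch below computes it; the function name, standard deviations, and sample sizes are hypothetical numbers, not data from any real universities.

```python
import math

def se_diff_means(sd1, n1, sd2, n2):
    # Standard error of (mean1 - mean2) for two independent samples:
    # the variances of the two sample means add.
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Hypothetical study-time data: SDs in hours per week, sample sizes per school.
se = se_diff_means(sd1=5.0, n1=40, sd2=6.0, n2=50)
print(f"standard error of the difference: {se:.2f}")
```

Note that the variances add even though the means are subtracted, so the difference is always more variable than either sample mean alone.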