Glossary
10% condition
A rule stating that when sampling without replacement, the sample size (n) should be no more than 10% of the population size (N) so that observations can be treated as approximately independent. When the sample is this small relative to the population, removing each individual barely changes the makeup of what remains.
Example:
If you're sampling 50 students from a high school, the 10% condition requires that the school has at least 500 students to assume independence.
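As a minimal sketch, the check can be written in Python. The function names are hypothetical, and the condition is treated as n ≤ 0.10·N, which matches the 50-student example above.

```python
def ten_percent_condition(n, N):
    """Return True when the sample is at most 10% of the population."""
    # Integer arithmetic avoids floating-point surprises at the boundary.
    return 10 * n <= N

def min_population(n):
    """Smallest population size for which a sample of n meets the condition."""
    return 10 * n

ok_500 = ten_percent_condition(50, 500)  # 50 students from a school of 500
ok_400 = ten_percent_condition(50, 400)  # 50 students from a school of 400
needed = min_population(50)
print(ok_500, ok_400, needed)
```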
Central Limit Theorem (CLT)
A fundamental theorem stating that the sampling distribution of the sample mean (or sum) is approximately normal, regardless of the shape of the population distribution, as long as the sample size is sufficiently large (a common rule of thumb is n ≥ 30).
Example:
Even if the individual incomes in a city are heavily skewed, the Central Limit Theorem tells us that the distribution of sample mean incomes from many large samples will be approximately normal.
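A small simulation can make this concrete. The sketch below is only illustrative: it assumes an exponential distribution with mean 1 as a stand-in for a heavily right-skewed population (it is not real income data), and the sample size of 50 and the 2,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50    # n for each sample
NUM_SAMPLES = 2000  # how many samples to draw

def sample_mean(n):
    """Mean of n draws from a right-skewed (exponential, mean 1) population."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.mean(means)   # expected to be near the population mean, 1
spread = statistics.stdev(means)  # expected to be near 1 / sqrt(50)
theory_sd = 1 / SAMPLE_SIZE ** 0.5

# For a normal distribution, about 95% of values lie within 1.96 SDs of the mean.
share_within = sum(abs(m - 1) <= 1.96 * theory_sd for m in means) / NUM_SAMPLES
print(round(center, 3), round(spread, 3), round(share_within, 3))
```

If the sample means behave roughly normally, the printed share should land close to 0.95 even though the individual draws are strongly skewed.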
Confidence Intervals for the Difference of Two Means
A range of plausible values for the true difference between two population means (μ₁ - μ₂), constructed from sample data. The confidence level describes the long-run share of intervals built this way that capture the true difference.
Example:
A researcher constructs a confidence interval for the difference of two means to estimate how much taller, on average, plants grown with a new fertilizer are compared to those with an old one.
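One way to sketch the calculation from summary statistics is shown below. The function name, the plant-height numbers, and the multiplier `t_star=2.0` are all hypothetical values chosen for easy arithmetic; in practice t* comes from a t-table or calculator for the chosen confidence level and degrees of freedom.

```python
import math

def two_sample_t_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """Unpooled confidence interval for mu1 - mu2 from summary statistics."""
    point_estimate = mean1 - mean2
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_star * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical plant heights in cm: new fertilizer (group 1) vs. old (group 2).
low, high = two_sample_t_interval(
    mean1=24.0, s1=3.0, n1=36,
    mean2=21.0, s2=4.0, n2=64,
    t_star=2.0,  # illustrative multiplier, not looked up for these data
)
print(round(low, 2), round(high, 2))
```

With these made-up inputs the interval is roughly 1.59 cm to 4.41 cm, so every plausible value for the difference is positive.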
Critical t-value (t*)
A value from the t-distribution that defines the boundary for a given confidence level and degrees of freedom. It is used in the calculation of the margin of error for confidence intervals involving means when the population standard deviation is unknown.
Example:
For a 95% confidence interval with 20 degrees of freedom, you'd look up the appropriate critical t-value in a t-table to determine the multiplier for the standard error.
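For readers without a t-table at hand, the lookup can be approximated numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and finds t* by bisection. The function names are hypothetical, and a statistics package or calculator would normally do this step directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3  # the area left of 0 is exactly one half

def critical_t(confidence, df):
    """Find t* so that the central area under the curve equals `confidence`."""
    target = 1 - (1 - confidence) / 2  # cumulative probability at t*, e.g. 0.975
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection: halve the bracket each pass
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_star = critical_t(0.95, 20)
print(round(t_star, 3))
```

For 95% confidence and 20 degrees of freedom this gives about 2.086, the value a standard t-table lists.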
Independence (Condition)
A condition for inference ensuring that the observations within each sample are independent of each other, and the two samples are independent of each other. For sampling without replacement, this is checked using the 10% condition.
Example:
When comparing the average heights of male and female students, the selection of a male student should not influence the selection of a female student, demonstrating independence between the two samples.
Interpretation (of Confidence Interval)
The process of explaining what a calculated confidence interval means in the context of the problem, including the confidence level, the parameter being estimated, and the specific populations. It explains the meaning of the interval's bounds.
Example:
An interpretation of a confidence interval might state, 'We are 90% confident that the true difference in mean reaction time (coffee drinkers minus non-coffee drinkers) is between -50 milliseconds and -20 milliseconds.' Because the entire interval is negative, it suggests that coffee drinkers have the shorter mean reaction time.
Margin of Error
The amount added to and subtracted from the point estimate to construct a confidence interval, accounting for the variability in the sampling distribution. It quantifies the precision of the estimate.
Example:
A poll reports a candidate's support at 55% with a margin of error of ±3%, meaning the true support is likely between 52% and 58%.
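For the difference of two means with unpooled variances, the margin of error is the critical t-value times the standard error. Written with the symbols used elsewhere in this glossary (t*, s₁, s₂, n₁, n₂, x̄₁, x̄₂), and assuming the unpooled form:

```latex
\text{ME} = t^{*}\sqrt{\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}},
\qquad
\text{interval} = (\bar{x}_1 - \bar{x}_2) \pm \text{ME}
```

The poll example follows the same "estimate ± margin" pattern, with a multiplier and standard error suited to a proportion instead of a difference of means.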
Normality (Condition)
A condition for inference requiring that the sampling distribution of the sample means (or their difference) is approximately normal. This can be met if the populations are normal, if the sample sizes are large enough (n ≥ 30) for the Central Limit Theorem to apply, or, for smaller samples, if graphs of the sample data show no strong skewness or outliers.
Example:
Before constructing a confidence interval for mean test scores, a teacher checks the normality condition by looking at a histogram of the sample scores to ensure no strong skewness or outliers.
Point Estimate
A single value calculated from sample data that serves as the best guess or approximation for an unknown population parameter. For the difference of two means, it's simply the difference between the two sample means (x̄₁ - x̄₂).
Example:
If one sample of batteries has a mean life of 10 hours and another has a mean life of 8 hours, the point estimate for the difference in mean battery life is 10 - 8 = 2 hours.
Pooled vs. Not Pooled (for 2-SampTInt)
A choice in calculator functions for two-sample t-intervals/tests, referring to whether the population variances are assumed to be equal ('pooled') or unequal ('not pooled'). In AP Statistics, it is generally recommended to choose 'not pooled' unless there's strong evidence or a specific instruction to pool.
Example:
When using the TI-84's 2-SampTInt function, selecting 'No' for the Pooled option means you are not assuming that the underlying population standard deviations are equal.
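To see what the choice changes, the sketch below computes the standard error and degrees of freedom both ways. It assumes the usual textbook formulas: the Welch-Satterthwaite approximation for the unpooled case and n₁ + n₂ - 2 for the pooled case. The function names and summary statistics are hypothetical.

```python
import math

def unpooled(s1, n1, s2, n2):
    """Standard error and Welch-Satterthwaite df, no equal-variance assumption."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df

def pooled(s1, n1, s2, n2):
    """Standard error and df when the population variances are assumed equal."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return se, df

# Hypothetical summary statistics with clearly different spreads.
se_u, df_u = unpooled(s1=2.0, n1=10, s2=6.0, n2=20)
se_p, df_p = pooled(s1=2.0, n1=10, s2=6.0, n2=20)
print(round(se_u, 3), round(df_u, 1), round(se_p, 3), df_p)
```

With spreads this different, the two approaches give noticeably different standard errors (about 1.48 versus 1.96) and degrees of freedom (about 25.7 versus 28), which is why the unpooled choice is the safer default.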
Randomness (Condition)
A crucial condition for inference requiring that both samples are obtained through a random process or, for experiments, that subjects are randomly assigned to treatments. Random sampling supports generalizing the results to the population, while random assignment supports conclusions about cause and effect; both reduce bias.
Example:
To compare two teaching methods, students must be randomly assigned to either Method A or Method B to ensure the groups are comparable.
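Random assignment like this can be sketched in a few lines of Python. The roster, the function name, and the seed are hypothetical; the seed is fixed only so the split can be reproduced.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical roster of 20 students.
students = [f"student_{i:02d}" for i in range(1, 21)]
method_a, method_b = randomly_assign(students, seed=42)
print(len(method_a), len(method_b))
```

Every student ends up in exactly one group, and chance alone decides which one.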
Sample sizes (n₁, n₂)
The number of observations or individuals included in each of the two independent samples. These values are crucial for checking conditions (such as Normality and the 10% condition) and for calculating the standard error.
Example:
To compare two different exercise programs, a researcher might enroll 30 participants in Program A (n₁ = 30) and 35 in Program B (n₂ = 35).
Sample standard deviations (s₁, s₂)
Measures of the spread or variability within each of the two collected samples. These are used in the formula for the standard error of the difference between two sample means.
Example:
If one group of students has very consistent test scores and another has widely varying scores, their respective sample standard deviations would reflect this difference in spread.
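The sketch below computes both sample standard deviations and the resulting standard error, assuming the unpooled form sqrt(s₁²/n₁ + s₂²/n₂). The test scores are made up to mirror the example: one consistent group and one widely varying group.

```python
import math
import statistics

# Hypothetical test scores: one consistent group, one widely varying group.
consistent = [78, 80, 79, 81, 82]
varied = [60, 95, 72, 88, 85]

s1 = statistics.stdev(consistent)  # sample SD, divides by n - 1
s2 = statistics.stdev(varied)

# Standard error of the difference between the two sample means (unpooled).
standard_error = math.sqrt(s1**2 / len(consistent) + s2**2 / len(varied))
print(round(s1, 2), round(s2, 2), round(standard_error, 2))
```

Both groups have the same mean of 80, yet the standard deviations are about 1.58 and 13.95, and the larger spread dominates the standard error.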