Glossary
10% condition
A rule of thumb for independence in sampling without replacement, stating that the sample size should be no more than 10% of the population size to ensure that the probability of selecting subsequent items remains approximately constant.
Example:
If you sample 100 students from a school, the 10% condition requires that the school has at least 1000 students.
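The check itself is a single comparison. A minimal sketch, where the function name `meets_ten_percent_condition` is hypothetical:

```python
def meets_ten_percent_condition(sample_size: int, population_size: int) -> bool:
    """Return True when the sample is at most 10% of the population."""
    # Integer comparison avoids floating-point rounding at the boundary.
    return 10 * sample_size <= population_size


print(meets_ten_percent_condition(100, 1000))  # True
print(meets_ten_percent_condition(100, 900))   # False
```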
Alternative Hypothesis (Ha)
A statement that contradicts the null hypothesis, proposing that there is an effect, difference, or relationship between population parameters, which the researcher seeks to find evidence for.
Example:
If a researcher believes a new fertilizer will increase crop yield, their alternative hypothesis would state that the mean yield with the new fertilizer is greater than with the old one.
Central Limit Theorem (CLT)
A fundamental theorem in statistics stating that, for a sufficiently large sample size, the sampling distribution of the sample mean will be approximately normal, regardless of the shape of the population distribution (provided the population has a finite variance).
Example:
Even if the distribution of individual incomes is skewed, the Central Limit Theorem ensures that the distribution of average incomes from many large samples will be approximately normal.
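A small simulation can make this concrete. The sketch below uses an exponential distribution as a stand-in for a skewed population; the sample size of 50, the number of samples, and the helper `skewness` are illustrative assumptions, not part of the theorem.

```python
import random
import statistics


def skewness(values):
    """Mean cubed z-score: near 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


rng = random.Random(42)  # fixed seed so the run is repeatable

# A strongly right-skewed "population": exponential with mean 1.
raw_draws = [rng.expovariate(1.0) for _ in range(20_000)]

# 2,000 samples of size 50; keep only each sample's mean.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(50))
    for _ in range(2_000)
]

print("skewness of individual values:", round(skewness(raw_draws), 2))
print("skewness of sample means:     ", round(skewness(sample_means), 2))
```

The individual values should show clear right skew, while the sample means should be much closer to symmetric, which is the pattern the theorem describes.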
Degrees of Freedom (df)
A value that specifies the number of independent pieces of information available to estimate a parameter, which determines the specific shape of the t-distribution used in hypothesis testing.
Example:
For a two-sample t-test, the degrees of freedom are often approximated conservatively as the smaller sample size minus one, or computed by technology using a more complex formula (the Welch-Satterthwaite approximation).
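Both approaches can be sketched in a few lines. The function names are hypothetical, and the "more complex formula" is assumed here to be the Welch-Satterthwaite approximation, which is what most software reports for an unpooled two-sample t-test.

```python
def conservative_df(n1: int, n2: int) -> int:
    """Smaller sample size minus one."""
    return min(n1, n2) - 1


def welch_df(s1: float, n1: int, s2: float, n2: int) -> float:
    """Welch-Satterthwaite approximation from sample SDs and sample sizes."""
    v1 = s1**2 / n1  # squared standard error contributed by sample 1
    v2 = s2**2 / n2  # squared standard error contributed by sample 2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Made-up summary statistics, for illustration only.
print(conservative_df(12, 15))
print(round(welch_df(4.0, 12, 6.0, 15), 2))
```

The Welch value always falls between the conservative value and n1 + n2 - 2, so the conservative choice gives a slightly wider t-distribution.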
Fail to reject the null hypothesis
The decision made in a hypothesis test when the p-value is greater than or equal to the chosen significance level, indicating insufficient statistical evidence to conclude that the alternative hypothesis is true.
Example:
If the p-value for a new teaching method is 0.15 (and α=0.05), you would fail to reject the null hypothesis, meaning there's not enough evidence to say it's better.
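The decision rule can be written as a short function. This is a sketch: the name `decide` and the default significance level of 0.05 are assumptions made for illustration.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with the significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    # p_value >= alpha: the evidence is insufficient.
    return "fail to reject the null hypothesis"


print(decide(0.15))   # fail to reject the null hypothesis
print(decide(0.001))  # reject the null hypothesis
```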
Independent condition
A condition for inference stating that observations within each sample and between the two samples are independent of each other, meaning the outcome of one does not influence another.
Example:
When comparing the average weights of two different dog breeds, the weight of one dog should be independent of another dog's weight.
Normal condition
A condition for inference requiring that the sampling distribution of the sample mean (or difference in means) is approximately normal, which can be met if the population is normal, or by a sufficiently large sample size due to the Central Limit Theorem.
Example:
If your sample size is small, you might need to check a boxplot for severe skewness or outliers to ensure the normal condition is met for a t-test.
Null Hypothesis (H0)
A statement of no effect, no difference, or no relationship between population parameters, which is assumed to be true until evidence suggests otherwise.
Example:
In a study comparing two new medications, the null hypothesis would state that there is no difference in their average effectiveness.
P-value
The probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming that the null hypothesis is true.
Example:
A p-value of 0.02 means there is a 2% chance of observing a difference in means as large as, or larger than, the one found, if the true population means were actually equal.
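In practice a calculator or statistics package supplies the p-value. To show what is being computed, here is a standard-library sketch that integrates the t-distribution's density numerically. It assumes a two-sided test, and the function names and the choice of Simpson's rule are illustrative.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # lgamma avoids overflow in the gamma function for large df.
    log_coef = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat: float, df: float, steps: int = 2000) -> float:
    """P(|T| >= |t_stat|) under the null, via Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t_stat|)
    return max(0.0, 1.0 - 2.0 * central_area)


# 2.228 is the familiar two-sided 5% critical value for df = 10,
# so the result should be close to 0.05.
print(round(two_sided_p_value(2.228, 10), 3))
```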
Parametric test
A type of statistical test that makes specific assumptions about the parameters of the population distribution from which the data are drawn, such as normality and equal variances.
Example:
The two-sample t-test is a parametric test because it assumes the underlying populations are normally distributed.
Quantitative data
Data that consists of numerical values or measurements, allowing for mathematical operations like calculating means and standard deviations.
Example:
The heights of plants in centimeters or the number of hours students spend studying are examples of quantitative data.
Random condition
A crucial condition for inference requiring that samples are randomly selected from the population or treatments are randomly assigned in an experiment, ensuring representativeness and allowing for valid conclusions.
Example:
To ensure the results of a survey are generalizable, participants must be selected using a random method such as simple random sampling.
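Both forms of randomness are easy to carry out with a random number generator. A sketch using Python's `random` module follows; the ID lists and group sizes are made up for illustration.

```python
import random

rng = random.Random(7)  # seeded only so the example is repeatable

# Hypothetical sampling frame: ID numbers for a population of 1,000 people.
population_ids = list(range(1, 1001))

# Simple random sample: every set of 50 IDs is equally likely to be chosen.
survey_sample = rng.sample(population_ids, k=50)

# Random assignment: shuffle the volunteers, then split them into two groups.
volunteers = list(range(1, 21))
rng.shuffle(volunteers)
treatment, control = volunteers[:10], volunteers[10:]

print(sorted(survey_sample)[:5])
print(sorted(treatment), sorted(control))
```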
Reject the null hypothesis
The decision made in a hypothesis test when the p-value is less than the chosen significance level, indicating sufficient statistical evidence in favor of the alternative hypothesis.
Example:
If the p-value for a new drug's effectiveness is 0.001 (and α=0.05), you would reject the null hypothesis, concluding the drug is effective.
Significance level (α)
A pre-determined threshold (commonly 0.05 or 0.01) representing the maximum probability of making a Type I error (rejecting a true null hypothesis); if the p-value is less than this level, the result is considered statistically significant.
Example:
If a researcher sets their significance level at 0.05, they are willing to accept a 5% chance of incorrectly rejecting the null hypothesis.
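The link between α and the Type I error rate can be checked by simulation. The sketch below repeatedly tests a null hypothesis that is true by construction and counts how often it is rejected. To stay within the standard library it uses a one-sample z-test with known standard deviation rather than a t-test; that substitution, along with the sample size and number of studies, is an assumption made for simplicity.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
N = 30          # observations per simulated study
STUDIES = 4000  # number of simulated studies

rng = random.Random(1)
std_normal = NormalDist()

rejections = 0
for _ in range(STUDIES):
    # The null hypothesis (mean = 0) is true by construction; sigma = 1 is known.
    sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
    z = fmean(sample) * N ** 0.5                 # (sample mean - 0) / (1 / sqrt(N))
    p_value = 2 * (1 - std_normal.cdf(abs(z)))   # two-sided p-value
    if p_value < ALPHA:
        rejections += 1

type_i_rate = rejections / STUDIES
print(f"share of true nulls rejected: {type_i_rate:.3f}")
```

The printed share should land near 0.05, varying somewhat with the seed.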
Test Statistic
A standardized value calculated from sample data during a hypothesis test, which measures how many standard errors the observed sample result is from the value stated in the null hypothesis.
Example:
In a t-test, the test statistic (t-value) quantifies the difference between the observed sample means relative to the variability within the samples.
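For two independent samples, a common form of this statistic divides the difference in sample means by an unpooled standard error. A minimal sketch, assuming the null hypothesis states a difference of zero; the function name and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(sample_a, sample_b):
    """Unpooled t statistic: difference in sample means over its standard error."""
    se = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


# Made-up test scores, for illustration only.
method_a = [78, 85, 90, 72, 88, 81]
method_b = [70, 75, 80, 68, 74, 77]
print(round(two_sample_t(method_a, method_b), 2))
```

A value far from zero, in either direction, means the observed difference is large relative to the sampling variability.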
Two-sample t-test
A statistical hypothesis test used to compare the means of two independent groups to determine if they are significantly different from each other.
Example:
To determine if the average test scores of students taught by Method A are significantly different from those taught by Method B, a two-sample t-test would be appropriate.