Glossary
Bias
A systematic error in a study design or data collection that causes a sample statistic to consistently overestimate or underestimate a population parameter.
Example:
If a survey about healthy eating habits is only given to people at a gym, the results might have bias because that group is not representative of the general population.
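The gym survey can be sketched as a small simulation. Everything in it is an assumption made for illustration: the population is invented, and gym-goers are assumed to eat more healthy meals per week than everyone else. The point is only that the gym-only estimate misses in the same direction every time, while a random sample does not.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of (goes_to_gym, healthy_meals_per_week) pairs.
# Assumption for illustration: gym-goers average 12 meals, others average 7.
population = []
for _ in range(10_000):
    goes_to_gym = random.random() < 0.2
    center = 12 if goes_to_gym else 7
    population.append((goes_to_gym, random.gauss(center, 2)))

# The parameter we are trying to estimate.
true_mean = statistics.mean(meals for _, meals in population)

# Surveying only at the gym means sampling only from this subgroup.
gym_only = [meals for gym, meals in population if gym]

# Repeat each kind of survey 200 times with 100 respondents each.
gym_means = [statistics.mean(random.sample(gym_only, 100)) for _ in range(200)]
srs_means = [
    statistics.mean(meals for _, meals in random.sample(population, 100))
    for _ in range(200)
]

print(f"population mean:            {true_mean:.2f}")
print(f"average gym-only estimate:  {statistics.mean(gym_means):.2f}")
print(f"average random-sample est.: {statistics.mean(srs_means):.2f}")
```

Taking a larger gym-only sample would not help here: the estimates would cluster more tightly, but around the wrong value.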
Confounding Variables
Extraneous variables that influence both the dependent variable and the independent variable, potentially leading to a spurious association between them.
Example:
In a study on coffee consumption and anxiety, stress level could be a confounding variable if people who drink more coffee also tend to have higher stress, making it hard to isolate coffee's true effect.
Experimental Design
The overall plan for conducting an experiment, including how treatments are assigned and how data will be collected to establish cause-and-effect relationships.
Example:
A well-thought-out experimental design for a drug trial would include a placebo group, blinding, and random assignment to ensure reliable results.
Generalizability
The extent to which the findings from a study can be applied to a larger population or different settings beyond the specific sample studied.
Example:
If a study on a new diet only involved young, healthy adults, its generalizability to older individuals or those with health conditions might be limited.
P-value
The probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming the null hypothesis is true.
Example:
A low p-value (e.g., 0.005) means that a result at least as extreme as the one observed would be very unlikely if the null hypothesis were true, providing strong evidence against the null hypothesis. It is not the probability that the null hypothesis is true.
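As a worked illustration of the definition, the sketch below computes an exact one-sided p-value for a hypothetical coin-flip experiment: 16 heads in 20 flips, with a null hypothesis that the coin is fair. The data and the function name are assumptions introduced here, not part of the glossary.

```python
from math import comb

def binomial_upper_tail_p_value(successes, trials, null_prob=0.5):
    """Return P(X >= successes) for X ~ Binomial(trials, null_prob)."""
    return sum(
        comb(trials, k) * null_prob**k * (1 - null_prob) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 16 heads in 20 flips; the null hypothesis is a fair coin.
p_value = binomial_upper_tail_p_value(16, 20)
print(f"one-sided p-value = {p_value:.4f}")  # prints "one-sided p-value = 0.0059"
```

The sum runs over 16, 17, 18, 19, and 20 heads because the p-value counts every outcome at least as extreme as the observed one, not just the observed outcome itself.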
Parameter
A numerical value that describes a characteristic of the entire population, typically unknown and estimated by a statistic.
Example:
The true average lifespan of a specific brand of lightbulb, if you could test every single one ever made, would be a parameter.
Population
The entire group of individuals or objects that a researcher is interested in studying and from which a sample is drawn.
Example:
If you want to know the average height of all professional basketball players, then all professional basketball players constitute the population.
Random Assignment
The process of randomly distributing experimental units into different treatment groups in an experiment to balance out pre-existing differences.
Example:
In a study testing a new fertilizer, plants are randomly assigned to either the fertilizer group or a control group so that growth differences can be attributed to the fertilizer rather than to pre-existing differences between the plants.
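One simple way to carry out random assignment is to shuffle the experimental units and split the shuffled list. The sketch below assumes 20 plants and two equal-sized groups; the plant labels and the function name are hypothetical.

```python
import random

def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"fertilizer": shuffled[:half], "control": shuffled[half:]}

# Hypothetical experimental units.
plants = [f"plant_{i}" for i in range(1, 21)]
groups = randomly_assign(plants, seed=42)

print("fertilizer:", sorted(groups["fertilizer"]))
print("control:   ", sorted(groups["control"]))
```

Because chance alone decides which plant lands in which group, traits such as initial height or soil quality tend to balance out across the groups.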
Random Selection
The process of choosing a sample from a population in such a way that every individual has an equal chance of being chosen, allowing for inferences about the population.
Example:
To ensure a survey's results represent the entire student body, a researcher uses random selection to pick participants from a list of all enrolled students.
Sample
A smaller, manageable subset of the population from which data is actually collected and analyzed.
Example:
To estimate the average amount of sleep college students get, you survey 200 randomly chosen students, making them your sample.
Sampling Methods
Systematic procedures used to choose a sample from a population, aiming to ensure the sample is representative and allows for valid inferences.
Example:
Simple random sampling, stratified sampling, and cluster sampling are three different sampling methods.
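The three methods named above can be compared side by side in a short sketch. The roster below is hypothetical (400 students in grades 9 to 12, grouped into 20 homerooms of 20), and the function names are introduced here for illustration only.

```python
import random
from collections import defaultdict

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical roster: 100 students per grade, 20 students per homeroom.
students = [
    {"id": i, "grade": 9 + i // 100, "homeroom": i // 20}
    for i in range(400)
]

def simple_random_sample(pop, n):
    """Every set of n individuals is equally likely to be chosen."""
    return rng.sample(pop, n)

def stratified_sample(pop, key, n_per_stratum):
    """Split the population into strata, then randomly sample within each one."""
    strata = defaultdict(list)
    for unit in pop:
        strata[unit[key]].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(pop, key, n_clusters):
    """Randomly choose whole clusters and keep everyone in the chosen clusters."""
    clusters = defaultdict(list)
    for unit in pop:
        clusters[unit[key]].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]

srs = simple_random_sample(students, 40)
stratified = stratified_sample(students, "grade", 10)   # 10 students per grade
clustered = cluster_sample(students, "homeroom", 2)     # 2 whole homerooms

print(len(srs), len(stratified), len(clustered))  # prints "40 40 40"
```

All three samples contain 40 students, but they differ in what is guaranteed: the stratified sample always includes every grade, while the cluster sample always consists of exactly two complete homerooms.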
Sampling Variability
The natural tendency for statistics calculated from different random samples of the same population to differ from each other.
Example:
If five different groups each take a random sample of 30 high school students to estimate average daily screen time, their average times will likely show some sampling variability.
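The screen-time example can be sketched as a simulation. The population below is invented (2,000 students with normally distributed screen time), so the specific numbers mean nothing; the point is that five random samples of 30 from the same population give five different means.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: daily screen time in hours for 2,000 students.
population = [max(0.0, rng.gauss(4.5, 1.5)) for _ in range(2000)]

# Five groups each draw their own random sample of 30 students.
sample_means = [
    statistics.mean(rng.sample(population, 30)) for _ in range(5)
]

for i, m in enumerate(sample_means, start=1):
    print(f"group {i}: mean screen time = {m:.2f} hours")
print(f"population mean = {statistics.mean(population):.2f} hours")
```

Increasing the sample size from 30 to a larger number would make the five means agree more closely, since larger samples reduce sampling variability.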
Statistic
A numerical value that describes a characteristic of a sample, used to estimate a population parameter.
Example:
If you measure the average lifespan of 100 lightbulbs from a specific brand and find it to be 1,200 hours, 1,200 hours is a statistic.
Statistical Inference
The process of using data from a sample to draw conclusions or make predictions about a larger population.
Example:
A political pollster surveys 1,000 randomly selected likely voters to make an inference about the voting preferences of all likely voters in the country.
Statistical Significance
A result is statistically significant if the observed difference or effect is large enough that it would be unlikely to occur by random chance alone if there were no real effect (that is, if the null hypothesis were true).
Example:
If students taught with a new method score higher on a test than students taught with the traditional method, and the difference is statistically significant, chance alone is an unlikely explanation for the difference, which is evidence that the new method is genuinely effective.
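One way to check whether a difference like this could plausibly be due to chance is a permutation test, sketched below. The test scores are made up for illustration, the 0.05 cutoff is a common convention rather than a rule, and the function name is hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """One-sided permutation test for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so reshuffle the scores and recompute the difference in means.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_permutations

# Hypothetical test scores, for illustration only.
new_method = [78, 85, 90, 88, 92, 81, 87, 95, 84, 89]
traditional = [72, 80, 75, 83, 78, 70, 77, 82, 74, 79]

observed_diff, p_value = permutation_p_value(new_method, traditional)
alpha = 0.05  # conventional significance level (an assumption, not a rule)

print(f"observed difference in means: {observed_diff:.1f} points")
print(f"approximate p-value: {p_value:.4f}")
print("statistically significant" if p_value < alpha else "not statistically significant")
```

The test asks how often a difference at least as large as the observed one appears when group labels are assigned purely at random; a small proportion is what makes the result statistically significant.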