Glossary
Bias
A systematic error in data collection or analysis that favors certain outcomes, leading to conclusions that are not representative of the true population.
Example:
A survey conducted only among people who frequent a specific gym might have a bias towards health-conscious individuals, not representing the general population's fitness habits.
Blinding
A technique used in experiments to prevent participants, researchers, or both from knowing who is receiving which treatment, reducing bias from expectations.
Example:
When testing a new pain reliever, participants are given pills that look identical, so they don't know if they received the drug or a placebo, implementing blinding.
Cluster Sampling
A sampling method where the population is divided into groups (clusters) that are each internally heterogeneous, ideally resembling the population in miniature, and then a random sample of entire clusters is selected.
Example:
To assess the quality of textbooks across a large school district, a researcher randomly selects 5 schools (clusters) and surveys every teacher in those selected schools, employing cluster sampling.
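A minimal Python sketch of the selection step, assuming a made-up district of 12 schools with 4 teachers each (the names `cluster_sample` and `district` are illustrative and not part of the glossary):

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters whole clusters at random and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical district: 12 schools (clusters) with 4 teachers each.
district = {
    f"school_{i}": [f"teacher_{i}_{j}" for j in range(4)]
    for i in range(12)
}

# Randomly select 5 schools, then survey every teacher in them.
surveyed = cluster_sample(district, n_clusters=5, seed=0)
```

The randomness applies to which clusters are chosen, not to individuals inside a chosen cluster.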
Confounding Variables
Variables that are related to both the independent and dependent variables in a study, making it difficult to determine if the independent variable truly causes changes in the dependent variable.
Example:
In a study linking coffee consumption to heart disease, stress level could be a confounding variable if stressed people drink more coffee and also have higher heart disease risk.
Control Group
In an experiment, the group that does not receive the treatment or intervention being studied, serving as a baseline for comparison.
Example:
In a study testing a new fertilizer, one set of plants receives the new fertilizer, while another set receives only water, acting as the control group.
Convenience Sampling
A non-random sampling method where individuals are selected because they are easily accessible or readily available to the researcher.
Example:
A student surveying their classmates in their first-period class about school lunch preferences is using convenience sampling, as they are easy to reach.
Correlation does not imply causation
A fundamental statistical principle stating that just because two variables are associated or move together, it does not mean one variable causes the other.
Example:
Finding that cities with more churches also have more crime doesn't mean churches cause crime; this is an example of how correlation does not imply causation.
Counts Instead of Percentages
A misleading data presentation method where raw counts are used to compare groups of different sizes, potentially distorting the true proportional differences.
Example:
Reporting that 50 students from School A and 40 students from School B passed an exam, without mentioning School A has 500 students and School B has 80, misleads by using counts instead of percentages.
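Converting the counts in this example to pass rates shows the comparison reversing. A short Python sketch (the helper name `pass_rate` is illustrative):

```python
# (passed, enrolled) for each school, taken from the example above.
schools = {"School A": (50, 500), "School B": (40, 80)}

def pass_rate(passed, enrolled):
    """Return the percentage of enrolled students who passed."""
    return 100 * passed / enrolled

rates = {name: pass_rate(*counts) for name, counts in schools.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.0f}% passed")
# School A: 10% passed
# School B: 50% passed
```

School A has the larger count but the smaller pass rate.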
Double-Blind Study
An experimental design where neither the participants nor the researchers administering the treatment know who is in the treatment group and who is in the control group.
Example:
In a clinical trial for a new antidepressant, neither the patients nor the doctors dispensing the pills know whether a given patient is receiving the actual drug or a placebo, making it a double-blind study.
Experimental Design
The systematic process of planning and conducting a study to investigate cause-and-effect relationships, typically involving random assignment, control groups, and blinding.
Example:
A pharmaceutical company uses rigorous experimental design to test a new drug, ensuring proper controls and randomization to determine its effectiveness.
Manipulated Axes
A misleading visual technique where the scale or starting point of a graph's axis is altered to exaggerate or minimize differences in data.
Example:
A company's sales chart might use a manipulated axis that starts just below the plotted values instead of at 0, making a small increase to $95,000 look like a massive jump in sales.
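One way to quantify the distortion is to compare how tall two bars are drawn relative to where the axis starts. The sketch below assumes an earlier sales figure of $92,000 and an axis start of $90,000; both are hypothetical values chosen for illustration, and only the $95,000 figure comes from the example.

```python
def visual_ratio(earlier, later, axis_start):
    """Ratio of the two bar heights as drawn when the axis begins at axis_start."""
    return (later - axis_start) / (earlier - axis_start)

# Hypothetical earlier value; only 95,000 appears in the glossary example.
earlier, later = 92_000, 95_000

from_zero = visual_ratio(earlier, later, axis_start=0)
truncated = visual_ratio(earlier, later, axis_start=90_000)  # assumed axis start

print(f"axis from 0: later bar is {from_zero:.2f}x as tall")
print(f"axis from 90,000: later bar is {truncated:.2f}x as tall")
# axis from 0: later bar is 1.03x as tall
# axis from 90,000: later bar is 2.50x as tall
```

Under these assumptions, an increase of about 3% is drawn as a bar two and a half times the height of its neighbor.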
Omitted Variable Bias
A bias that occurs when an important variable that influences both the independent and dependent variables is left out of a study, leading to a misleading relationship.
Example:
Observing a positive correlation between ice cream sales and drowning incidents without accounting for temperature, which drives both, is a case of omitted variable bias and would lead to a false conclusion about causation.
Population
The entire group of individuals or instances about which we want to gather information and draw conclusions.
Example:
If a researcher wants to study the average height of all high school students in a city, then all high school students in that city constitute the population.
Random Assignment
A technique used in experiments to distribute participants into treatment and control groups by chance, aiming to create groups that are as similar as possible before the intervention.
Example:
In a study testing a new study technique, a coin is flipped for each student to decide whether they use the new method or the old one, ensuring random assignment to groups.
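The coin-flip procedure can be sketched in Python as follows. The roster of 20 students and the names `assign_by_coin_flip`, `new_method`, and `old_method` are assumptions made for illustration.

```python
import random

def assign_by_coin_flip(participants, seed=None):
    """Assign each participant to a group with an independent fair coin flip."""
    rng = random.Random(seed)
    groups = {"new_method": [], "old_method": []}
    for person in participants:
        key = "new_method" if rng.random() < 0.5 else "old_method"
        groups[key].append(person)
    return groups

# Hypothetical roster of 20 students.
students = [f"student_{i}" for i in range(20)]
groups = assign_by_coin_flip(students, seed=1)
```

Independent coin flips can leave the groups unequal in size; shuffling the roster and splitting it in half is a common alternative when equal group sizes matter.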
Random Sampling
A method of selecting individuals from a population where every member has an equal chance of being chosen, which makes the sample likely to be representative.
Example:
To survey student opinions, a researcher puts all student IDs into a hat and draws 100 at random, using random sampling to get a representative group.
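Drawing IDs from a hat corresponds to sampling without replacement. A minimal Python sketch, assuming a hypothetical school of 1,200 students with IDs 1 through 1200:

```python
import random

# Hypothetical population: student IDs 1..1200.
student_ids = list(range(1, 1201))

rng = random.Random(42)  # fixed seed only so the draw can be reproduced
sample = rng.sample(student_ids, k=100)  # 100 distinct IDs, each equally likely
```

`random.sample` never returns the same ID twice, which matches drawing slips from a hat without putting them back.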
Sample
A subset of individuals selected from a larger population, from which data is collected to make inferences about the entire population.
Example:
From the entire student body of a university, a researcher selects 200 students to participate in a survey; these 200 students form the sample.
Self-Selection Bias
A type of bias that occurs when individuals volunteer to participate in a study, often leading to a sample that is not representative of the population.
Example:
An online poll asking 'Do you love statistics?' will likely suffer from self-selection bias, as only those passionate enough to seek out and answer the poll will respond.
Stratified Sampling
A sampling method where the population is divided into subgroups (strata) that are homogeneous with respect to some characteristic, and then a simple random sample is drawn from each stratum.
Example:
To survey student satisfaction, a school divides students into grade levels (freshman, sophomore, etc.) and then randomly selects 20 students from each grade, using stratified sampling.
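A rough Python sketch of the grade-level example, assuming 150 students per grade (the enrollment figure and the name `stratified_sample` are hypothetical):

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of per_stratum members from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }

# Hypothetical school: 150 students in each grade level.
grades = {
    grade: [f"{grade}_{i}" for i in range(150)]
    for grade in ("freshman", "sophomore", "junior", "senior")
}

picked = stratified_sample(grades, per_stratum=20, seed=0)
```

Unlike cluster sampling, every stratum contributes to the sample, and the random draw happens inside each one.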
Systematic Sampling
A sampling method where individuals are selected from a list at a fixed interval after a random starting point; it works well as long as the list order has no pattern that matches the interval.
Example:
From a list of 1000 students, a researcher randomly picks a starting point between 1 and 10, say 7, and then selects every 10th student (7th, 17th, 27th, etc.), using systematic sampling.
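The same procedure as a short Python sketch. The list of 1000 students and the interval of 10 come from the example; the function name `systematic_sample` is illustrative.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Pick a random start in 1..interval, then take every interval-th item."""
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # 1-based position of the first pick
    return start, items[start - 1::interval]

# Students numbered 1..1000, as in the example.
students = list(range(1, 1001))
start, chosen = systematic_sample(students, interval=10, seed=3)
```

With 1000 students and an interval of 10, any starting point from 1 to 10 yields exactly 100 selections.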
Voluntary Response Bias
A specific type of self-selection bias where individuals choose to respond to a survey or call for participation, often resulting in extreme opinions being overrepresented.
Example:
A radio station asking listeners to call in and vote on a controversial topic will likely get voluntary response bias, as only those with strong feelings will take the time to call.