Glossary
Correlation
A statistical measure that describes the extent to which two variables move together, indicating both the strength and direction of their relationship.
Example:
When analyzing data, you might find a strong correlation between the amount of time spent studying and the score received on an exam.
Correlation DOES NOT equal causation
A critical principle stating that a statistical association between two variables does not imply that one variable causes the other; other factors or coincidences might be involved.
Example:
Ice cream sales and drowning incidents both rise in summer, so the two are correlated, but correlation DOES NOT equal causation; neither causes the other, and warm weather (a lurking variable) drives both.
Correlation coefficient (r)
A numerical value, ranging from -1 to 1, that quantifies the strength and direction of the linear relationship between two quantitative variables.
Example:
An AP Stats student calculated an r value of 0.92, indicating a very strong positive linear relationship between hours of sleep and alertness.
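As a rough illustration of how r is computed, here is a minimal Python sketch. The function name `pearson_r` and the sleep and alertness numbers are made up for this example; they are not data from any real study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (numerator of r).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (used in the denominator of r).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours of sleep and an alertness rating.
sleep = [5, 6, 7, 8, 9]
alertness = [52, 60, 63, 75, 80]

r = pearson_r(sleep, alertness)
print(round(r, 3))  # close to 1: a very strong positive linear relationship
```

The sign of the result gives the direction and its distance from 0 gives the strength; points that fall exactly on an increasing line would give r = 1.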
DiagnosticOn
A setting on a graphing calculator (like TI-84) that must be enabled to display the correlation coefficient (r) and coefficient of determination (r²) when performing linear regression calculations.
Example:
Before running LinReg on your calculator to find 'r', enable DiagnosticOn from the CATALOG (2nd, 0), or on newer TI-84 models set STAT DIAGNOSTICS to ON in the MODE menu, so that the correlation value is displayed.
Direction (of correlation)
Indicates whether the linear relationship between two variables is positive (as one increases, the other tends to increase) or negative (as one increases, the other tends to decrease).
Example:
If the number of hours spent exercising increases and body fat percentage tends to decrease, this shows a negative direction in their relationship.
Linear relationships
A type of relationship between two variables where the data points tend to follow a straight line when plotted on a scatterplot.
Example:
The relationship between the number of hours a car is driven and the amount of gas consumed is typically a strong linear relationship.
Negative Correlation
A relationship where as one variable increases, the other variable tends to decrease, resulting in a downward-sloping pattern on a scatterplot.
Example:
As the number of hours spent watching TV increases, the number of hours spent exercising tends to decrease, showing a negative correlation.
No Correlation
A situation where there is no clear linear pattern between two variables on a scatterplot; the points appear randomly scattered.
Example:
Plotting a person's shoe size against their IQ would likely show no correlation.
No linear correlation (r = 0)
Indicates that there is no straight-line trend between two variables. The points may be scattered randomly on a scatterplot, but r = 0 can also occur when the variables follow a strong curved (nonlinear) pattern.
Example:
If you plot the last digit of a person's phone number against their height, you would likely find no linear correlation.
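A minimal Python sketch, using made-up points that lie exactly on a parabola, suggests why r = 0 speaks only to straight-line patterns: y is completely determined by x here, yet r comes out as zero. The helper name `pearson_r` is an assumption for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y = x squared, a perfect but nonlinear relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print(r_curve)  # zero: the positive and negative products cancel out
```

This is why a scatterplot should be checked alongside r before concluding that two variables are unrelated.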
Outliers (effect on r)
Data points that deviate markedly from the overall pattern of the other data points. Because r is not resistant, a single outlier can substantially strengthen or weaken it, so r may no longer represent the bulk of the data.
Example:
A single student who studied very little but scored perfectly on the exam could be an outlier that significantly weakens the observed correlation between study time and scores.
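The effect can be sketched in Python by computing r with and without one unusual point. The study hours, scores, and the helper name `pearson_r` below are invented for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied and exam scores for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [50, 58, 65, 72, 80, 88]
r_without = pearson_r(hours, scores)

# Add one outlier: a student who studied half an hour and scored 100.
r_with = pearson_r(hours + [0.5], scores + [100])

print(round(r_without, 2), round(r_with, 2))  # r drops sharply with the outlier
```

With these made-up numbers, one point out of seven is enough to pull a nearly perfect correlation down to a weak one.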
Perfect negative correlation (r = -1)
Occurs when all data points on a scatterplot lie exactly on a decreasing straight line, representing a perfect inverse linear relationship.
Example:
Imagine a scenario where for every degree the temperature drops, the heating bill increases by a fixed amount; this would be a perfect negative correlation.
Perfect positive correlation (r = 1)
Occurs when all data points on a scatterplot lie exactly on an increasing straight line, representing a perfect direct linear relationship.
Example:
If every additional minute of sunlight always results in exactly one more millimeter of plant growth, that would be a perfect positive correlation.
Positive Correlation
A relationship where as one variable increases, the other variable also tends to increase, resulting in an upward-sloping pattern on a scatterplot.
Example:
The more hours a student spends studying, the higher their exam score tends to be, illustrating a positive correlation.
Scatterplot
A graphical display used to show the relationship between two quantitative variables, where each point represents a pair of values from the dataset.
Example:
To visually assess the relationship between daily temperature and ice cream sales, you would create a scatterplot.
Strength (of correlation)
Refers to how closely the points on a scatterplot follow a straight line, indicating the consistency or predictability of the linear relationship.
Example:
A correlation coefficient of 0.90 shows a high strength in the relationship, meaning the points are tightly clustered around a line.
Z-scores (in context of r formula)
Standardized values that indicate how many standard deviations a data point is from the mean, used in the calculation of the correlation coefficient to measure relative position.
Example:
The correlation formula multiplies each point's z-score for x by its z-score for y, adds these products, and divides by n - 1. The result shows how consistently the paired values fall on the same side (or opposite sides) of their respective means.
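Written out, the z-score form is r = (1 / (n - 1)) × Σ (z_x × z_y). A minimal Python sketch of that calculation follows; the function name `r_from_z_scores` and the data are assumptions for this example, and the sample standard deviation (dividing by n - 1) is used throughout.

```python
from statistics import mean, stdev

def r_from_z_scores(xs, ys):
    """Correlation coefficient computed from paired z-scores."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    # Standardize each value: how many standard deviations from its mean.
    zx = [(x - mean_x) / sd_x for x in xs]
    zy = [(y - mean_y) / sd_y for y in ys]
    # Sum the products of paired z-scores and divide by n - 1.
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Hypothetical data: hours studied and exam scores.
study_hours = [1, 2, 3, 4, 5]
exam_scores = [55, 65, 70, 80, 90]

r_z = r_from_z_scores(study_hours, exam_scores)
print(round(r_z, 3))  # close to 1: above-average hours pair with above-average scores
```

Pairs that are both above or both below their means contribute positive products, which is what pushes r toward 1 here.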