Glossary

Clustering

Criticality: 2

The phenomenon where data points in a scatterplot form distinct groups, often indicating underlying categories or segments within the data.

Example:

Observing Clustering in a scatterplot of income versus education level might reveal distinct groups for different educational attainment levels.

Clusters

Criticality: 2

Distinct groups or concentrations of points within a scatterplot that suggest different categories or subgroups in the data.

Example:

A scatterplot of student heights and weights might show two distinct Clusters, one for middle schoolers and one for high schoolers.

Confidence intervals

Criticality: 1

A range of values within which the true value of a prediction or parameter is expected to fall, stated with a chosen level of confidence (for example, 95%).

Example:

When predicting a student's future GPA, providing a Confidence interval like '3.0 to 3.4' gives a more realistic estimate than a single point prediction.

Correlation DOES NOT equal causation

Criticality: 3

A critical principle stating that just because two variables are related or move together, it does not mean one variable causes the other.

Example:

Ice cream sales and drowning incidents both increase in summer, but Correlation DOES NOT equal causation: neither causes the other, and both rise with hot weather.

Extrapolating

Criticality: 3

Making predictions using the line of best fit for x-values that are outside the range of the observed data.

Example:

Using a line of best fit from data on ages 5-10 to predict the height of a 50-year-old would be Extrapolating and potentially unreliable.
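
The check described above can be sketched in Python. This is a minimal illustration, not a standard routine: the function name `predict_with_range_check`, the ages, and the slope and intercept values are all hypothetical.

```python
def predict_with_range_check(x, observed_xs, slope, intercept):
    """Return (prediction, is_extrapolation) for the line y = slope * x + intercept."""
    prediction = slope * x + intercept
    # A prediction is an extrapolation when x lies outside the observed x-range.
    is_extrapolation = x < min(observed_xs) or x > max(observed_xs)
    return prediction, is_extrapolation


# Hypothetical data: ages 5-10 and a made-up line for height in centimeters.
ages = [5, 6, 7, 8, 9, 10]
slope, intercept = 6.0, 80.0

inside = predict_with_range_check(8, ages, slope, intercept)
outside = predict_with_range_check(50, ages, slope, intercept)
print(inside)   # (128.0, False)
print(outside)  # (380.0, True)
```

The second result shows why extrapolation is risky: the line keeps rising at the childhood growth rate and predicts an impossible adult height.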

Form (of relationship)

Criticality: 2

Refers to the overall shape or pattern that the points in a scatterplot tend to follow.

Example:

The Form of the relationship between the age of a car and its resale value is typically nonlinear, showing a curve.

Gaps

Criticality: 1

Areas in a scatterplot where there are no data points, indicating a lack of observations within a certain range of the variables.

Example:

A scatterplot of house prices versus square footage might show Gaps if no houses of a certain size were sold in the dataset.

Line of best fit (trendline)

Criticality: 3

A straight line drawn through the center of a scatterplot's data points that best represents the overall linear relationship between the variables.

Example:

After plotting student data, drawing a Line of best fit helps predict a student's potential score based on their study time.
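
One common way to compute a line of best fit is the least-squares method. The glossary does not say which method is intended, so treat the following Python sketch as an assumption; the function name `fit_line` and the study-time data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line: returns (slope, intercept) for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied and test scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 68, 71]

slope, intercept = fit_line(hours, scores)
predicted = slope * 3.5 + intercept  # predicted score for 3.5 hours of study
print(f"{slope:.2f} {intercept:.2f} {predicted:.1f}")  # 4.80 47.60 64.4
```

The prediction uses an x-value (3.5 hours) inside the observed range of 1 to 5 hours, so it is not an extrapolation.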

Linear (relationship)

Criticality: 3

A type of relationship where the points in a scatterplot tend to follow a straight line pattern.

Example:

The relationship between the number of items sold and the total revenue often shows a Linear pattern.

Moderate Correlation

Criticality: 2

A correlation that shows a general trend with some noticeable variation, meaning the points are somewhat spread out but still follow a direction.

Example:

A scatterplot of daily steps taken and reported energy levels might show a Moderate Correlation, indicating a general trend but with individual differences.

Negative Correlation

Criticality: 3

A specific type of negative relationship where points generally move downwards from left to right, indicating an inverse association.

Example:

As the number of hours spent playing video games increases, a student's sleep duration tends to decrease, demonstrating a Negative Correlation.

Negative Relationship

Criticality: 3

A pattern in a scatterplot where as the values of one variable increase, the values of the other variable tend to decrease.

Example:

Observing that a student's GPA tends to decrease as the number of hours spent watching TV increases indicates a Negative Relationship.

No Correlation

Criticality: 2

A situation where points on a scatterplot are scattered randomly with no discernible direction or pattern, suggesting no linear relationship.

Example:

There is typically No Correlation between the number of pets a person owns and the number of hours of music they listen to each week.

No Relationship

Criticality: 2

A pattern in a scatterplot where there is no clear trend or connection between the two variables; points appear randomly scattered.

Example:

A scatterplot comparing the last digit of a person's phone number to their height would likely show No Relationship.

Nonlinear (relationship)

Criticality: 2

A type of relationship where the points in a scatterplot follow a curved pattern rather than a straight line.

Example:

The growth of bacteria over time often exhibits a Nonlinear (exponential) relationship.

Outliers

Criticality: 3

Individual data points in a scatterplot that lie far away from the general pattern of the other points.

Example:

On a scatterplot of student test scores versus study hours, a student who studied very little but scored exceptionally high would be an Outlier.
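
The glossary gives no numeric rule for how far is "far away," so the following Python sketch simply assumes a fixed tolerance around a given line. The function name `find_outliers`, the line (slope 5, intercept 50), the tolerance of 15 points, and the data are all hypothetical.

```python
def find_outliers(points, slope, intercept, tolerance):
    """Return the points whose vertical distance from the line exceeds tolerance."""
    outliers = []
    for x, y in points:
        # Vertical distance between the observed y and the line's prediction.
        residual = y - (slope * x + intercept)
        if abs(residual) > tolerance:
            outliers.append((x, y))
    return outliers


# Hypothetical (hours studied, test score) pairs; the last student studied
# one hour but scored 98.
points = [(1, 55), (2, 61), (3, 64), (4, 71), (5, 74), (1, 98)]

flagged = find_outliers(points, slope=5, intercept=50, tolerance=15)
print(flagged)  # [(1, 98)]
```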

Perfect Linearity

Criticality: 1

A rare scenario where all data points in a scatterplot fall exactly on a straight line, indicating an exact linear relationship.

Example:

If you plot the circumference of a circle against its diameter, you would observe Perfect Linearity.
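
The circle example can be checked numerically. This short Python sketch uses arbitrary, hypothetical diameters and confirms that every point satisfies C = pi * d, which is the equation of a straight line through the origin.

```python
import math

# Hypothetical diameters; any positive values would work.
diameters = [1.0, 2.0, 3.5, 5.0, 10.0]
circumferences = [math.pi * d for d in diameters]

# Every point satisfies C = pi * d, so each ratio C / d equals pi.
ratios = [c / d for c, d in zip(circumferences, diameters)]
all_on_line = all(math.isclose(r, math.pi) for r in ratios)
print(all_on_line)  # True
```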

Positive Correlation

Criticality: 3

A specific type of positive relationship where points generally move upwards from left to right, indicating a direct association.

Example:

The more time a student spends on practice problems, the higher their test scores tend to be, showing a Positive Correlation.

Positive Relationship

Criticality: 3

A pattern in a scatterplot where as the values of one variable increase, the values of the other variable also tend to increase.

Example:

A scatterplot showing that people who spend more hours exercising generally burn more calories illustrates a Positive Relationship.

Quantitative variables

Criticality: 2

Variables that can be measured numerically, allowing for mathematical operations and meaningful comparisons.

Example:

Height, weight, age, and test scores are all examples of Quantitative variables that can be plotted on a scatterplot.

Residual analysis

Criticality: 1

The process of examining the differences between the observed y-values and the y-values predicted by the line of best fit to assess the model's accuracy.

Example:

Performing Residual analysis can help determine if a linear model is appropriate for the data or if a different type of relationship exists.
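
As a rough illustration, each residual is the observed y-value minus the predicted y-value. In the Python sketch below, the function name `residuals`, the data, and the line (slope 4.8, intercept 47.6, a least-squares fit to this made-up data) are hypothetical.

```python
def residuals(xs, ys, slope, intercept):
    """Observed y minus predicted y for each point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data: hours studied and test scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 68, 71]

res = residuals(hours, scores, slope=4.8, intercept=47.6)
print([round(r, 1) for r in res])  # [-0.4, 0.8, -1.0, 1.2, -0.6]
```

Here the residuals are small and switch sign with no obvious curve, which is consistent with a linear model. Residuals that trace a clear curved pattern would suggest a nonlinear relationship instead.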

Slope

Criticality: 3

The 'm' value in the equation y = mx + b, representing the predicted change in the y-variable for each one-unit increase in the x-variable.

Example:

If the Slope of a line of best fit for hours studied vs. test scores is 5, it means for every additional hour studied, the test score is predicted to increase by 5 points.
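
The example above can be verified with a short Python sketch. The slope of 5 comes from the example; the intercept of 60 and the function name `predict_score` are hypothetical, added only so the line can be evaluated.

```python
def predict_score(hours, slope=5, intercept=60):
    """Predicted test score from the line y = slope * x + intercept."""
    return slope * hours + intercept


# One additional hour of study changes the prediction by exactly the slope.
increase = predict_score(4) - predict_score(3)
print(increase)  # 5
```

The same difference of 5 appears between any two x-values that are one hour apart, because the slope of a straight line is constant.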

Strength (of correlation)

Criticality: 3

Describes how closely the points in a scatterplot follow a particular trend or pattern.

Example:

If all data points fall almost perfectly on a straight line, the Strength of the correlation is very high.
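
Strength is often quantified with the Pearson correlation coefficient r, which ranges from -1 to 1. The glossary does not define r, so its use here is an assumption; the function name `pearson_r` and both data sets in this Python sketch are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
tight = [2, 4, 6, 8, 10]  # points exactly on a straight line
loose = [3, 1, 6, 4, 8]   # same upward direction, more scatter

r_tight = pearson_r(xs, tight)
r_loose = pearson_r(xs, loose)
print(round(r_tight, 2), round(r_loose, 2))  # 1.0 0.76
```

Values of r near 1 or -1 correspond to points that hug a line, and values near 0 correspond to widely scattered points. The sign of r gives the direction (positive or negative), not the strength.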

Strong (correlation)

Criticality: 2

Indicates that the points in a scatterplot are tightly clustered around a clear trend, suggesting a consistent and predictable relationship.

Example:

If a scatterplot of study hours and exam scores shows points very close to a straight line, it indicates a Strong correlation.

Weak (correlation)

Criticality: 2

Indicates that the points in a scatterplot are widely spread out around a trend, suggesting a loose or inconsistent relationship.

Example:

In a scatterplot showing a Weak correlation between daily coffee intake and hours of sleep, the points would be widely dispersed around the trend.

Y-intercept

Criticality: 3

The 'b' value in the equation y = mx + b, representing the predicted value of the y-variable when the x-variable is zero.

Example:

In a line of best fit for temperature vs. ice cream sales, the Y-intercept would represent the predicted ice cream sales when the temperature is 0 degrees.

x-axis

Criticality: 2

The horizontal axis on a scatterplot, typically representing the independent variable or the first quantitative variable being observed.

Example:

When plotting hours studied versus exam scores, the number of hours studied would usually be displayed on the x-axis.

y-axis

Criticality: 2

The vertical axis on a scatterplot, typically representing the dependent variable or the second quantitative variable being observed.

Example:

In a scatterplot showing temperature and ice cream sales, the amount of ice cream sold would be plotted on the y-axis.