Glossary

E

Exponential Models

Criticality: 3

Nonlinear models of the form ŷ = abˣ, in which the response variable grows or decays by a constant percentage for each unit increase in x. They are linearized by taking the natural logarithm of the y-values, because ln ŷ = ln a + x·ln b is linear in x.

Example:

In its early stages, the spread of a virus often follows an exponential model: the number of infected individuals grows by roughly the same percentage in each time period.
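The linearization step in this definition can be sketched in a few lines of Python. This is only an illustration: the case counts are made up (5 cases doubling each day), and `fit_line` is a hypothetical helper written here, not part of any library.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: 5 initial cases that double every day.
days = [0, 1, 2, 3, 4, 5]
cases = [5 * 2 ** d for d in days]

# Linearize: ln(y) = ln(a) + x * ln(b), then fit a straight line.
slope, intercept = fit_line(days, [math.log(c) for c in cases])

# Undo the logarithm to recover the exponential model's parameters.
a = math.exp(intercept)
b = math.exp(slope)
print(f"y-hat = {a:.2f} * {b:.2f}^x")
```

Because the made-up data are exactly exponential, the fit should recover a close to 5 and b close to 2. Real data would scatter around the line.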

H

High-Leverage Points

Criticality: 3

Points whose x-values are far from the other x-values in the dataset, which gives them the potential to pull the regression line toward themselves.

Example:

When analyzing car weight vs. fuel efficiency, a data point for a monster truck (extremely high weight) would be a high-leverage point.
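One way to put a number on "far from the other x-values" is the leverage statistic for simple linear regression, h = 1/n + (x − x̄)² / Σ(x − x̄)². The glossary does not use this formula, so treat the sketch as a supplementary illustration. The car weights are hypothetical.

```python
# Hypothetical car weights in pounds; the last entry stands in for the monster truck.
weights = [2800, 3000, 3200, 3400, 3600, 10000]

n = len(weights)
mean_w = sum(weights) / n
sxx = sum((w - mean_w) ** 2 for w in weights)

# Leverage in simple linear regression depends only on the x-values:
# h_i = 1/n + (x_i - mean)^2 / Sxx
leverages = [1 / n + (w - mean_w) ** 2 / sxx for w in weights]

for w, h in zip(weights, leverages):
    print(f"weight={w:>6}  leverage={h:.3f}")
```

The monster truck's leverage is far larger than the others because its weight sits far from the mean weight. Its fuel efficiency plays no role in this calculation.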

I

Influential Points

Criticality: 3

Data points that, if removed, would significantly change the slope, y-intercept, or correlation of the regression model.

Example:

In a study of study hours vs. test scores, a student who studied 100 hours and scored 50% might be an influential point because their data drastically pulls the regression line down.
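A simple check for influence is to fit the line twice, with and without the suspect point, and compare the results. The sketch below does this with made-up scores. `fit_line` is a hypothetical helper, and only the 100-hour, 50% student comes from the example above.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical class: more study hours go with higher scores.
hours = [2, 4, 6, 8, 10]
scores = [60, 68, 75, 83, 90]

slope_without, intercept_without = fit_line(hours, scores)

# Add the student who studied 100 hours and scored 50%.
slope_with, intercept_with = fit_line(hours + [100], scores + [50])

print(f"without the point: slope = {slope_without:.2f}")
print(f"with the point:    slope = {slope_with:.2f}")
```

With these numbers the slope changes sign once the extra student is included, which is the kind of change that marks a point as influential.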

N

Nonlinear Models

Criticality: 2

Models used when the relationship between two variables is curved rather than linear. Fitting them with linear regression often requires transforming the data first to achieve linearity.

Example:

The growth of a bacterial colony over time often follows a curve, requiring a nonlinear model like an exponential one to accurately predict future population sizes.

O

Outliers

Criticality: 3

Points that lie far from the regression line in the y-direction, meaning they have large residuals.

Example:

If most students score between 70% and 90% on a test, but one student scores 20% despite studying an average amount, that student's score would be an outlier.
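Since an outlier is defined by its residual, one rough way to spot it is to fit the line and look for the residual with the largest magnitude. The data below are hypothetical and `fit_line` is a helper written for this sketch.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: the student with 4 hours (an average amount) scored 20%.
hours = [1, 2, 3, 4, 5, 6, 7]
scores = [70, 73, 77, 20, 83, 87, 90]

slope, intercept = fit_line(hours, scores)

# Residual = observed y minus predicted y.
residuals = [y - (intercept + slope * x) for x, y in zip(hours, scores)]

# Index of the point with the largest residual in absolute value.
worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
print(f"largest residual: hours={hours[worst]}, residual={residuals[worst]:.1f}")
```

Here the flagged point has an ordinary x-value, so it is an outlier without being a high-leverage point.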

P

Power Models

Criticality: 3

Nonlinear models of the form ŷ = axᵇ, in which the response variable is proportional to a power of the explanatory variable. They are linearized by taking the natural logarithm of both the x- and y-values, because ln ŷ = ln a + b·ln x is linear in ln x.

Example:

The relationship between the area of a square and its side length (Area = side²) is a simple power model.
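The log-log linearization can be checked on the square example, where the expected answer is known in advance (a = 1, b = 2). This is a minimal sketch, and `fit_line` is a hypothetical helper, not a library function.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

sides = [1, 2, 3, 4, 5]
areas = [s ** 2 for s in sides]

# Linearize: ln(y) = ln(a) + b * ln(x), so take logs of BOTH variables.
log_sides = [math.log(s) for s in sides]
log_areas = [math.log(area) for area in areas]

b, log_a = fit_line(log_sides, log_areas)  # the slope is the exponent b
a = math.exp(log_a)
print(f"y-hat = {a:.2f} * x^{b:.2f}")
```

The slope of the log-log line is the exponent of the power model, and the exponential of the intercept is the coefficient a.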

R

Residual Plots

Criticality: 3

A scatterplot of the residuals (observed y minus predicted y) against the explanatory variable (x) or the predicted y-values, used to assess whether a regression model is appropriate. A good fit leaves residuals scattered randomly around zero with no pattern.

Example:

If a residual plot shows a clear U-shaped pattern, it indicates that a linear model is not a good fit for the data, and a nonlinear model might be more appropriate.
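The U-shaped pattern can be seen numerically without drawing a plot. The sketch below fits a straight line to data that actually follow y = x² (a made-up example) and prints the residuals. `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Curved data deliberately fitted with the wrong (linear) model.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, r in zip(xs, residuals):
    print(f"x={x}  residual={r:+.1f}")
```

The residuals are positive at both ends and negative in the middle. Plotted against x, they trace the U shape that signals a linear model is missing curvature.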

R² Value

Criticality: 3

The coefficient of determination, which represents the proportion of the variance in the dependent variable that is predictable from the independent variable(s).

Example:

An R² value of 0.85 for a model predicting house prices based on square footage means that 85% of the variation in house prices can be explained by the linear relationship with square footage.
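For a least-squares line, R² can be computed as 1 − SSE/SST, where SSE is the sum of squared residuals and SST is the total sum of squared deviations of y from its mean. The house data below are invented for illustration, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical houses: square footage and price in thousands of dollars.
sqft = [1000, 1500, 2000, 2500, 3000]
price = [200, 270, 310, 400, 430]

slope, intercept = fit_line(sqft, price)
predicted = [intercept + slope * x for x in sqft]
mean_price = sum(price) / len(price)

sse = sum((y - p) ** 2 for y, p in zip(price, predicted))  # unexplained variation
sst = sum((y - mean_price) ** 2 for y in price)            # total variation
r_squared = 1 - sse / sst
print(f"R^2 = {r_squared:.3f}")
```

The ratio SSE/SST is the share of variation the line leaves unexplained, so subtracting it from 1 gives the share that is explained.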

T

Transform (data transformation)

Criticality: 3

The process of applying a mathematical function to data (e.g., logarithm, square root) to make a nonlinear relationship linear, allowing for linear regression analysis.

Example:

To analyze the relationship between a planet's distance from the sun and its orbital period, astronomers might transform both variables using logarithms to reveal a linear pattern.
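The planetary example can be tried with approximate published values for six planets (distance as semi-major axis in astronomical units, period in Earth years). The sketch below is illustrative only. `fit_line` is a hypothetical helper, and the figures are rounded.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Approximate values: Mercury, Venus, Earth, Mars, Jupiter, Saturn.
distance_au = [0.387, 0.723, 1.000, 1.524, 5.203, 9.537]
period_years = [0.241, 0.615, 1.000, 1.881, 11.862, 29.457]

# Transform both variables with the natural logarithm.
log_distance = [math.log(d) for d in distance_au]
log_period = [math.log(t) for t in period_years]

slope, intercept = fit_line(log_distance, log_period)
print(f"ln(period) = {intercept:.3f} + {slope:.3f} * ln(distance)")
```

On the transformed scale the points fall almost exactly on a line with slope near 1.5, which corresponds to period being proportional to distance raised to the 3/2 power.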