Glossary

25th Percentile (Q1)

Criticality: 2

The value below which 25% of the data falls when ordered from least to greatest. It is also known as the first quartile.

Example:

If the 25th percentile for commute times is 15 minutes, it means 25% of commuters take 15 minutes or less to get to work.
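
As a rough illustration, the first quartile can be computed as the median of the lower half of the sorted data. This is only one of several conventions; textbooks and software may give slightly different values on small datasets. The function name first_quartile and the commute times are hypothetical, chosen for this sketch.

```python
import statistics

def first_quartile(values):
    """Q1 taken as the median of the lower half of the sorted data."""
    ordered = sorted(values)
    # Integer division drops the middle value when the count is odd.
    lower_half = ordered[: len(ordered) // 2]
    return statistics.median(lower_half)

# Hypothetical commute times in minutes (illustrative, not from the glossary).
commute_minutes = [10, 12, 15, 18, 20, 25, 30, 35]
q1 = first_quartile(commute_minutes)
print(q1)  # 13.5
```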

Box Plot

Criticality: 3

A graphical display that summarizes the distribution of a quantitative variable using five key values: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. It's excellent for comparing distributions side by side, and a modified box plot also marks outliers as individual points.

Example:

To quickly compare the spread and typical values of daily temperatures in two different cities over a month, you could use a box plot for each city.
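
The five values behind a box plot can be sketched with Python's standard statistics module. The temperatures below are made up, the helper five_number_summary is a hypothetical name, and the "inclusive" quartile method is an assumption; other quartile conventions may give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # method="inclusive" is one of several quartile conventions.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))

# Hypothetical daily high temperatures in °F for two cities.
city_a = [60, 62, 65, 68, 70, 71, 75, 80, 90]
city_b = [50, 55, 58, 60, 61, 63, 64, 66, 70]

summary_a = five_number_summary(city_a)
summary_b = five_number_summary(city_b)
print("City A:", summary_a)
print("City B:", summary_b)
```

Placing the two summaries next to each other shows that city A is both warmer (higher median) and more variable (wider gap between Q1 and Q3) in this made-up data.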

Center (of a distribution)

Criticality: 3

Represents the typical or central value of a distribution, often measured by the mean or median.

Example:

The center of the distribution of commute times for employees at a company might be around 25 minutes, indicating a typical commute length.

Comparing Distributions

Criticality: 3

The process of analyzing and contrasting the characteristics (shape, center, spread, outliers) of two or more datasets, often using graphical displays.

Example:

A researcher might compare the distributions of test scores for students who used a new study method versus those who used a traditional method to see which is more effective.

Histogram

Criticality: 3

A graphical display that uses bars to show the frequency or relative frequency of data values within defined intervals (bins). It's useful for visualizing the shape of larger datasets.

Example:

A histogram might display the distribution of heights of all students in a high school, showing how many students fall into height ranges such as 60 to under 65 inches, 65 to under 70 inches, and so on, so that each height belongs to exactly one bin.
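
The binning step behind a histogram can be sketched as follows. The heights, the helper name bin_counts, and the choice of half-open bins (a value on a boundary goes into the higher bin) are assumptions made for this illustration.

```python
def bin_counts(values, start, width, n_bins):
    """Count values in half-open bins [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside all bins are ignored
            counts[index] += 1
    return counts

# Hypothetical student heights in inches.
heights_in = [61, 63, 64, 65, 66, 67, 68, 69, 70, 72, 74]
counts = bin_counts(heights_in, start=60, width=5, n_bins=3)

# Print a simple text histogram, one row per bin.
for k, count in enumerate(counts):
    low = 60 + 5 * k
    print(f"{low}-{low + 5}: {'#' * count}")
```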

Interquartile Range (IQR)

Criticality: 3

The range of the middle 50% of the data, calculated as the difference between the third quartile (Q3) and the first quartile (Q1). It is a resistant measure of spread.

Example:

If the interquartile range of house prices in a neighborhood is $50,000, it means the prices of the middle half of the houses span an interval $50,000 wide.
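
A small sketch of why the IQR is called resistant: replacing the most expensive house with an extreme value changes the range drastically but leaves the IQR unchanged. The prices (in thousands of dollars), the helper name iqr, and the "inclusive" quartile method are assumptions for illustration only.

```python
import statistics

def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical house prices in thousands of dollars.
prices_k = [200, 210, 225, 240, 250, 260, 275, 290, 300]
# Same neighborhood, but the priciest house is replaced by a mansion.
with_mansion = prices_k[:-1] + [2000]

iqr_before = iqr(prices_k)
iqr_after = iqr(with_mansion)
range_before = max(prices_k) - min(prices_k)
range_after = max(with_mansion) - min(with_mansion)

print(iqr_before, range_before)  # 50.0 100
print(iqr_after, range_after)    # 50.0 1800
```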

Mean

Criticality: 3

The arithmetic average of a dataset, calculated by summing all values and dividing by the number of values. It is sensitive to outliers and skewness.

Example:

If you add up all the points scored by a basketball team in a season and divide by the number of games, you get the mean points per game.
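
The calculation, and the mean's sensitivity to a single unusual value, can be sketched like this. The point totals are invented for the example.

```python
# Hypothetical points scored in five games.
points = [98, 102, 95, 110, 105]
mean_points = sum(points) / len(points)
print(mean_points)  # 102.0

# One unusually low-scoring game drags the mean down noticeably.
with_low_game = points + [30]
mean_with_low = sum(with_low_game) / len(with_low_game)
print(mean_with_low)  # 90.0
```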

Median

Criticality: 3

The middle value in an ordered dataset, dividing the data into two equal halves. When the number of values is even, the median is the average of the two middle values. It is resistant to outliers and skewness.

Example:

In a list of student heights, if you arrange them from shortest to tallest, the height of the student exactly in the middle is the median height.
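
A minimal sketch of the calculation, covering both an odd and an even number of values. The heights (in centimeters) are hypothetical; Python's built-in statistics.median does the same job.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical student heights in centimeters.
odd_heights = [150, 160, 155, 170, 165]
even_heights = odd_heights + [180]

median_odd = median(odd_heights)
median_even = median(even_heights)
print(median_odd)   # 160
print(median_even)  # 162.5
```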

Outlier

Criticality: 3

A data point that lies unusually far from the other values in a dataset. Outliers can significantly affect the mean and range.

Example:

If most students finish a 60-minute quiz in 15-20 minutes, but one student takes 45 minutes, that 45-minute time would likely be considered an outlier.
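
The definition above does not fix a cutoff for "unusual". One widely used convention, assumed here rather than taken from this glossary, flags any value more than 1.5 × IQR below Q1 or above Q3. The quiz times, the helper name iqr_outliers, and the "inclusive" quartile method are illustrative choices.

```python
import statistics

def iqr_outliers(values):
    """Return values outside the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low_fence = q1 - 1.5 * spread
    high_fence = q3 + 1.5 * spread
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical quiz completion times in minutes.
quiz_minutes = [15, 16, 17, 17, 18, 18, 19, 20, 45]
outliers = iqr_outliers(quiz_minutes)
print(outliers)  # [45]
```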

Range

Criticality: 2

The difference between the maximum and minimum values in a dataset, providing a simple measure of the overall spread.

Example:

If the highest temperature recorded in a week was 90°F and the lowest was 60°F, the range of temperatures for that week was 30°F.

SOCS

Criticality: 3

An acronym used to remember the four key aspects to discuss when describing or comparing distributions: Shape, Outliers, Center, and Spread.

Example:

When asked to describe the distribution of student test scores, a student should remember to address SOCS: its shape (e.g., symmetric), any outliers, its center (e.g., median score), and its spread (e.g., IQR).

Shape (of a distribution)

Criticality: 3

Describes the overall form of a distribution, including its symmetry, skewness, and number of peaks (modes).

Example:

When looking at a histogram of exam scores, you might describe its shape as 'skewed to the left' if most students scored high and a few low scores form a long tail on the left.

Skewed to the Left (Negatively Skewed)

Criticality: 3

A distribution where the tail extends further to the left, indicating that most data values are concentrated on the higher end, with a few lower values typically pulling the mean to the left of the median.

Example:

The distribution of scores on an easy exam might be skewed to the left, with most students scoring high and only a few scoring low.

Skewed to the Right (Positively Skewed)

Criticality: 3

A distribution where the tail extends further to the right, indicating that most data values are concentrated on the lower end, with a few higher values typically pulling the mean to the right of the median.

Example:

The distribution of household incomes is often skewed to the right, as most households earn moderate incomes, but a few very high incomes pull the average up.
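
The relationship between mean and median in skewed data can be checked on two small, made-up datasets: incomes with one very high value (right skew) and exam scores with one very low value (left skew). The numbers are hypothetical and chosen only to make the pattern visible.

```python
import statistics

# Hypothetical household incomes in thousands of dollars (long right tail).
incomes_k = [30, 35, 40, 45, 50, 55, 60, 400]
income_mean = sum(incomes_k) / len(incomes_k)
income_median = statistics.median(incomes_k)
print(income_mean, income_median)  # 89.375 47.5 -> mean above median

# Hypothetical scores on an easy exam (long left tail).
scores = [40, 85, 88, 90, 92, 95, 96, 98]
score_mean = sum(scores) / len(scores)
score_median = statistics.median(scores)
print(score_mean, score_median)  # 85.5 91.0 -> mean below median
```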

Spread (of a distribution)

Criticality: 3

Describes the variability or dispersion of data values within a distribution, often measured by range, interquartile range (IQR), or standard deviation.

Example:

If the spread of test scores in one class is much wider than another, it means there's more variability in performance among students in that class.
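
As a sketch, two classes can share the same mean score and still differ sharply in spread. The scores are invented, and statistics.stdev computes the sample standard deviation (dividing by n - 1), which is an assumption about which version of the formula is wanted.

```python
import statistics

# Hypothetical test scores; both classes have a mean of 75.
class_a = [70, 72, 75, 78, 80]
class_b = [50, 60, 75, 90, 100]

range_a = max(class_a) - min(class_a)
range_b = max(class_b) - min(class_b)
sd_a = statistics.stdev(class_a)  # sample standard deviation
sd_b = statistics.stdev(class_b)

print(f"Class A: range={range_a}, sd={sd_a:.2f}")
print(f"Class B: range={range_b}, sd={sd_b:.2f}")
```

Both measures agree that class B is more variable, even though the centers are identical.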

Stem-and-Leaf Plot (Stem Plot)

Criticality: 2

A graphical display that shows the shape of the distribution while preserving the individual data values. Each value is split into a 'stem' (the leading digit or digits) and a 'leaf' (the final digit).

Example:

A stem-and-leaf plot could show the ages of participants in a survey, with stems representing tens (e.g., '2' for 20s) and leaves representing units (e.g., '2 | 3 5 8' for 23, 25, 28).
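
Building such a plot can be sketched in a few lines, assuming non-negative whole-number data with tens as stems and units as leaves. The ages and the function name stem_and_leaf are hypothetical. Stems with no leaves are still printed so that gaps in the data stay visible.

```python
def stem_and_leaf(values):
    """Return display lines such as '2 | 3 5 8' for non-negative integers."""
    leaves_by_stem = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # tens digit(s) and units digit
        leaves_by_stem.setdefault(stem, []).append(leaf)

    lines = []
    # Include every stem between the smallest and largest, even empty ones.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = " ".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical survey participant ages.
ages = [23, 25, 28, 31, 34, 34, 42, 47]
plot_lines = stem_and_leaf(ages)
print("\n".join(plot_lines))
```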

Symmetric (Distribution)

Criticality: 2

A distribution where the left and right sides are approximate mirror images of each other around the center. For symmetric distributions, the mean and median are approximately equal.

Example:

The distribution of weights of a specific brand of potato chips, if the manufacturing process is consistent, should be roughly symmetric around the target weight.

Unimodal

Criticality: 2

A distribution that has a single, distinct peak or mode.

Example:

A histogram showing the heights of adult women would likely be unimodal, with one central peak around the average height.