Glossary
COSS
An acronym used to remember the four key aspects for describing the distribution of quantitative data: Center, Outliers, Spread, and Shape.
Example:
When asked to describe a histogram of student test scores, remember to address all parts of COSS to provide a complete description.
Categorical Data
Data that represents qualities or characteristics that cannot be measured numerically but can be divided into groups or categories. It is often described using proportions or percentages.
Example:
A survey asking students about their favorite subject (e.g., Math, English, Science) collects categorical data because the responses are categories, not numbers you can average.
Center
A characteristic of a quantitative data distribution that describes its typical or central value. Common measures include the mean and median.
Example:
The center of the distribution of daily temperatures in July might be around 85 degrees Fahrenheit, indicating a typical warm day.
Context
The real-world setting or background of a statistical problem. In AP Statistics, it is crucial to relate numerical results back to the specific situation being studied.
Example:
When analyzing the average height of trees, simply stating 'the mean is 15' is insufficient; you must provide context by saying 'the mean height of the oak trees is 15 feet'.
Data
Information, especially facts or numbers, collected to be examined and considered. It is the raw material for statistical analysis.
Example:
Before starting a new study, researchers first need to collect relevant data, such as survey responses or experimental measurements.
Mean
A measure of the center of a quantitative dataset, calculated by summing all values and dividing by the number of values. It is commonly referred to as the average.
Example:
If five friends scored 80, 85, 90, 75, and 90 on a test, their mean score is (80+85+90+75+90)/5 = 84.
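The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative only.

```python
# Test scores from the example above.
scores = [80, 85, 90, 75, 90]

# Mean = sum of all values divided by the number of values.
mean_score = sum(scores) / len(scores)
print(mean_score)  # 84.0
```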
Median
A measure of the center of a quantitative dataset, representing the middle value when the data is ordered from least to greatest (or the mean of the two middle values when the number of values is even). It is less affected by extreme values than the mean.
Example:
For the test scores 75, 80, 85, 90, 90, the median score is 85, as it's the middle value.
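A minimal Python sketch, using the standard library's statistics module, reproduces this median. The second list is a hypothetical variation (not part of the example) in which the lowest score is replaced by an extreme value, to show that the median holds steady while the mean shifts.

```python
import statistics

# Test scores from the example above.
scores = [75, 80, 85, 90, 90]
median_score = statistics.median(scores)  # middle of the sorted values
print(median_score)  # 85

# Hypothetical variation: the lowest score becomes an extreme value.
scores_with_extreme = [20, 80, 85, 90, 90]
median_with_extreme = statistics.median(scores_with_extreme)
mean_with_extreme = statistics.mean(scores_with_extreme)
print(median_with_extreme)  # 85 (unchanged)
print(mean_with_extreme)    # 73 (pulled down from 84)
```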
Outliers
Data points in a distribution that are unusually far away from the other data points. They can significantly affect measures like the mean.
Example:
If most students score between 70 and 95 on a test, but one student scores 20, that 20 would be considered an outlier.
Proportions
Fractions of a whole, often expressed as decimals or percentages, used to describe the distribution of categorical data. A proportion indicates the relative frequency of a specific category.
Example:
If 30 out of 100 students prefer pizza, the proportion of students who prefer pizza is 0.30 or 30%.
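As a quick sketch, the same proportion can be computed in Python. The variable names are illustrative only.

```python
# Counts from the example above.
preferring_pizza = 30
total_students = 100

# Proportion = count in the category divided by the total count.
proportion = preferring_pizza / total_students
percent_label = f"{proportion:.0%}"
print(proportion)     # 0.3
print(percent_label)  # 30%
```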
Quantitative Data
Numerical data that represents counts or measurements. It is data for which arithmetic operations like calculating an average make sense.
Example:
The number of text messages a student sends in a day (e.g., 50, 120, 75) is quantitative data because you can calculate an average number of messages.
Shape
A characteristic of a quantitative data distribution that describes its overall form, such as symmetric, skewed left, or skewed right. It is often visualized using histograms or box plots.
Example:
If a histogram of exam scores shows a long tail to the left, the shape of the distribution is skewed left: most scores are clustered at the high end, with a few unusually low scores stretching the tail.
Spread
A characteristic of a quantitative data distribution that describes the variability or dispersion of the data. Common measures include the range, standard deviation, and interquartile range (IQR).
Example:
A wide spread in the distribution of house prices in a neighborhood indicates a large variation between the cheapest and most expensive homes.
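Two of these measures, the range and the interquartile range, can be sketched in Python. The house prices below are hypothetical values chosen only for illustration. Quartile conventions differ between textbooks, calculators, and software, so the interquartile range computed here (using the default method of statistics.quantiles) may not match a hand calculation exactly.

```python
import statistics

# Hypothetical house prices in thousands of dollars (illustrative only).
prices = [200, 250, 300, 350, 400, 450, 500, 900]

# Range = largest value minus smallest value.
price_range = max(prices) - min(prices)
print(price_range)  # 700

# Interquartile range = third quartile minus first quartile.
# statistics.quantiles with n=4 returns the three quartile cut points.
q1, _, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1
print(iqr)
```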
Standard Deviation
A measure of the typical distance or spread of data points from the mean in a quantitative dataset. A larger standard deviation indicates greater variability.
Example:
If the standard deviation of test scores is very small, it means most students scored very close to the average score.
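For a concrete number, the sketch below computes the standard deviation of the five test scores from the Mean entry (80, 85, 90, 75, 90). It assumes the sample formula (dividing by n - 1), which is what statistics.stdev uses; the population formula (dividing by n) is shown for comparison.

```python
import statistics

# Test scores from the Mean entry; their mean is 84.
scores = [80, 85, 90, 75, 90]

# Sample standard deviation: divides the squared deviations by n - 1.
sample_sd = statistics.stdev(scores)
# Population standard deviation: divides by n instead.
population_sd = statistics.pstdev(scores)

print(round(sample_sd, 2))      # 6.52
print(round(population_sd, 2))  # 5.83
```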
Statistics
The science of collecting, analyzing, interpreting, and presenting data. It involves using numerical data to draw conclusions and make informed decisions about the world.
Example:
A company uses statistics to analyze customer feedback data and determine which product features are most popular.
Univariate Data
Data that consists of observations on a single variable. It focuses on describing the characteristics of one attribute at a time.
Example:
A study recording only the heights of students in a class is collecting univariate data because only one characteristic (height) is being measured.