Glossary
Bivariate Data
Data collected on two variables for each individual or observation. It is used to explore the relationship or association between these two variables.
Example:
Recording both the SAT score and the high school GPA of each student, to see whether the two are connected, produces bivariate data.
Categorical Data
Data that represents qualities or characteristics, often grouped into categories or labels. It cannot be averaged meaningfully but can be described by percentages or proportions.
Example:
The different types of music genres (e.g., rock, pop, classical) preferred by students in a survey are examples of categorical data.
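Because categorical data is summarized with proportions rather than averages, a short sketch may help. The survey responses and the helper name `relative_frequencies` below are hypothetical, made up purely for illustration.

```python
from collections import Counter

def relative_frequencies(labels):
    """Return the proportion of observations in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {category: count / total for category, count in counts.items()}

# Hypothetical survey responses (not real data).
genres = ["rock", "pop", "rock", "classical", "pop", "rock", "pop", "rock"]
proportions = relative_frequencies(genres)

# Print each genre with its share of the responses as a percentage.
for genre, share in sorted(proportions.items()):
    print(f"{genre}: {share:.1%}")
```

Taking the mean of the labels themselves would be meaningless, which is why the summary is a set of proportions that add up to 1.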
Correlation vs. Causation
The principle that a correlation (an association) between two variables does not, by itself, show that one variable causes the other. A lurking variable that affects both may be responsible for the pattern.
Example:
Observing a strong correlation between ice cream sales and drowning incidents doesn't mean ice cream causes drowning; both are likely influenced by hot weather, illustrating correlation vs. causation.
Frequency Charts
Visualizations, often bar charts, that show the proportion or percentage of observations falling into different categories or intervals. They are useful for comparing relative frequencies.
Example:
A frequency chart might display the percentage of AP Statistics students who prefer online learning versus in-person classes.
Histograms
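Graphs that display the distribution of a single quantitative variable. Each bar covers an interval (bin) of values, and its height shows the count or proportion of observations in that interval. Unlike the bars of a bar chart, which represent categories, the bars of a histogram touch because the intervals are contiguous.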
Example:
A histogram could show the number of students who scored within different ranges (e.g., 70-79, 80-89) on a recent math test.
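The binning step behind a histogram can be sketched as follows. The test scores, the 10-point bin width, and the helper name `bin_counts` are assumptions chosen for illustration, and the text bars only approximate what a plotted histogram would show.

```python
from collections import Counter

BIN_WIDTH = 10  # assumed interval width, e.g. 70-79, 80-89

def bin_counts(values, width=BIN_WIDTH):
    """Count how many values fall into each interval [start, start + width)."""
    counts = Counter((value // width) * width for value in values)
    return dict(sorted(counts.items()))

# Hypothetical integer test scores (not real data).
scores = [72, 75, 78, 81, 84, 85, 88, 89, 91, 95]

# Draw one text bar per interval; the bar length is the count in that interval.
for start, count in bin_counts(scores).items():
    print(f"{start}-{start + BIN_WIDTH - 1}: {'#' * count}")
```

Changing `BIN_WIDTH` changes how the same scores are grouped, so the apparent shape of the distribution depends partly on the interval width chosen.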
Mosaic Plots
Graphical displays used to visualize the relationship between two categorical variables. The area of each rectangle in the plot is proportional to the number of observations in that category combination.
Example:
A mosaic plot could illustrate the relationship between a student's favorite subject (Math, English, Science) and their preferred learning style (Visual, Auditory, Kinesthetic).
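The rectangle areas in a mosaic plot come from the counts in a two-way table. As a rough sketch, with hypothetical responses and a hypothetical helper `cell_proportions`, the share of observations in each combination of categories could be computed like this:

```python
from collections import Counter

def cell_proportions(pairs):
    """Return the share of observations in each (subject, style) combination."""
    counts = Counter(pairs)
    total = len(pairs)
    return {cell: count / total for cell, count in counts.items()}

# Hypothetical (favorite subject, learning style) responses, not real data.
responses = [
    ("Math", "Visual"), ("Math", "Visual"), ("Math", "Kinesthetic"),
    ("English", "Auditory"), ("English", "Visual"),
    ("Science", "Kinesthetic"), ("Science", "Kinesthetic"), ("Science", "Visual"),
]

areas = cell_proportions(responses)
for (subject, style), share in sorted(areas.items()):
    print(f"{subject:8} {style:12} {share:.3f}")
```

Each printed share corresponds to the relative area of one rectangle, so larger values mean larger rectangles in the plot.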
Negative Relationship
A pattern observed in bivariate data where as one variable increases, the other variable tends to decrease. On a scatterplot, points generally trend downwards from left to right.
Example:
You might observe a negative relationship between the number of hours a student spends playing video games and their GPA; as gaming hours increase, GPA tends to decrease.
No Relationship
A situation in bivariate data where there is no discernible pattern or association between the two variables. Changes in one variable do not consistently correspond to changes in the other.
Example:
There would likely be no relationship between a student's shoe size and their score on the AP Statistics exam.
Positive Relationship
A pattern observed in bivariate data where as one variable increases, the other variable also tends to increase. On a scatterplot, points generally trend upwards from left to right.
Example:
There's often a positive relationship between the amount of time spent exercising and a person's overall fitness level; more exercise tends to mean higher fitness.
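One common way to put a number on the direction of a linear relationship is the Pearson correlation coefficient r, which this glossary does not define; it is brought in here only as an illustration. The sketch below assumes small made-up data sets: r comes out positive for an upward trend and negative for a downward trend.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (not real measurements).
exercise_hours = [1, 2, 3, 4, 5]
fitness_score = [52, 58, 61, 70, 74]   # rises as exercise rises
gaming_hours = [1, 2, 3, 4, 5]
gpa = [3.8, 3.6, 3.5, 3.1, 2.9]        # falls as gaming rises

r_up = pearson_r(exercise_hours, fitness_score)
r_down = pearson_r(gaming_hours, gpa)
print(f"exercise vs. fitness: r = {r_up:.2f}")
print(f"gaming vs. GPA:       r = {r_down:.2f}")
```

The sign of r indicates only the direction of a linear trend. It says nothing about whether one variable causes the other.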
Quantitative Data
Numerical data that represents counts or measurements, allowing for mathematical operations like averaging. It answers questions about 'how much' or 'how many'.
Example:
The number of hours a student spends studying for the AP Statistics exam is quantitative data because you can calculate an average study time.
Scatterplots
Graphs used to display the relationship between two quantitative variables. Each point on the plot represents a pair of values for an individual observation.
Example:
A scatterplot could show the relationship between the number of hours a student sleeps and their score on a pop quiz, with sleep hours on one axis and quiz score on the other.