Correlation

Ava Garcia
Study Guide Overview
This study guide covers correlation, including the correlation coefficient (r), its interpretation (strength and direction), and calculation. It emphasizes the distinction between correlation and causation. The guide also provides visualization techniques using scatterplots, calculator instructions, practice problems, and exam tips.
Correlation: Your Night-Before-the-Exam Guide
Hey there, future AP Stats master! Let's break down correlation so you're totally ready for anything the exam throws your way. Think of this as your cheat sheet, but way more awesome.
What is Correlation?
Correlation is all about understanding how two variables move together. It's like watching two friends: do they walk in the same direction, opposite directions, or with no pattern at all? We measure this relationship using the correlation coefficient, r, which tells us both the strength and direction of the linear relationship.
- Strength: How closely the points on a scatterplot follow a straight line.
- Direction: Whether the line slopes upwards (positive) or downwards (negative).
The correlation coefficient, r, ranges from -1 to 1:
- r = 1: Perfect positive correlation (points form an exact increasing line).
- r = -1: Perfect negative correlation (points form an exact decreasing line).
- r = 0: No linear correlation (points are scattered).
Remember: Correlation only measures linear relationships. If the relationship is curved, r can be close to 0 even when the two variables are strongly related, so always look at the scatterplot first.
Correlation vs. Causation
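To see why a curved pattern can fool r, here is a minimal Python sketch. The helper `pearson_r` and the two small data sets are made up for illustration; they are not part of the AP formula sheet or any calculator routine.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs]      # perfect U-shaped (nonlinear) pattern
ys_line = [2 * x + 1 for x in xs]    # perfect increasing line

r_curve = pearson_r(xs, ys_curve)    # essentially 0: no LINEAR pattern
r_line = pearson_r(xs, ys_line)      # essentially 1: exact increasing line
print(r_curve, r_line)
```

The U-shaped data are perfectly predictable from x, yet r is essentially 0, because the left half of the curve cancels the right half.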
This is HUGE: Correlation DOES NOT equal causation! Just because two variables are related doesn't mean one causes the other. There could be other factors at play, or it could be a coincidence.
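Here is a small sketch of how a lurking variable can create a strong correlation between two things that don't cause each other. All numbers are invented for illustration: both lists were built to rise with daily temperature, which is the hidden third variable. The helper `pearson_r` is a hypothetical name, not a library function.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up daily data; both series were constructed to follow temperature.
temperature = [60, 65, 70, 75, 80, 85, 90]           # lurking variable
ice_cream_sales = [84, 92, 112, 120, 143, 153, 171]
sunburn_cases = [11, 11, 15, 19, 18, 23, 25]

r_sales_sunburn = pearson_r(ice_cream_sales, sunburn_cases)
print(round(r_sales_sunburn, 2))
```

The correlation between sales and sunburns comes out strong and positive, but banning ice cream would not prevent sunburns. Hot weather drives both.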
Visualizing Correlation
Here's what each pattern looks like on a scatterplot:
- Positive Correlation: As one variable increases, the other tends to increase. The scatterplot slopes upwards from left to right.
- Negative Correlation: As one variable increases, the other tends to decrease. The scatterplot slopes downwards from left to right.
- No Correlation: There's no clear pattern; the points look like a random cloud.
Key Things to Remember
- Linearity: r only measures linear relationships. A weak r (near 0) doesn't mean there's no relationship, just no linear one.
- Causation: Correlation does not imply causation.
- Outliers: r is not resistant to outliers. A single outlier can drastically change the value of r.
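The outlier warning is easy to check for yourself. Below is a minimal sketch with invented data: eight points that follow a clear upward trend, then the same points plus one extreme point. The helper `pearson_r` is a hypothetical name used only for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data with a clear upward trend
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 4, 5, 4, 6, 7, 8, 9]
r_clean = pearson_r(xs, ys)

# Same data plus a single extreme point far from the trend
r_with_outlier = pearson_r(xs + [20], ys + [1])

print(round(r_clean, 2), round(r_with_outlier, 2))
```

One added point is enough to drag a strong positive r down past zero, which is why you should always check the scatterplot before trusting r.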
Calculating the Correlation Coefficient
Okay, here's the formula. Don't panic! You'll rarely calculate this by hand, but it's good to understand what's going on:
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$
Think of it as the average product of the z-scores. You're standardizing each variable, multiplying the standardized values for each point, and then averaging those products (dividing by n - 1 rather than n).
- $\bar{x}$ and $\bar{y}$ are the means of the x and y variables.
- $s_x$ and $s_y$ are the sample standard deviations of the x and y variables.
- $n$ is the number of data points.
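The "average product of the z-scores" idea can be sketched in a few lines of Python. This is an illustration only: the function name `correlation_from_z_scores` and the five data points are made up, and the sketch assumes sample standard deviations (dividing by n - 1), which is the usual AP Statistics convention.

```python
from statistics import mean, stdev

def correlation_from_z_scores(xs, ys):
    """r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample SDs (divide by n - 1)
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)   # z_x times z_y for one point
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_from_z_scores(xs, ys)
print(round(r, 3))   # prints 0.775
```

Points where both z-scores share a sign (both above or both below their means) push r up, and points with opposite signs pull it down.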
Using Your Calculator
Your TI-84 is your best friend here. Follow these steps:
1. Enter your x-values in L1 and y-values in L2.
2. Go to STAT > CALC > LinReg(ax+b).
3. Make sure "DiagnosticOn" is enabled in the MODE menu; otherwise the output won't show the r value.
Double-check that "DiagnosticOn" is enabled. It's a simple step that can save you from missing the all-important r value.
Practice Problems
Let's test your knowledge with some practice questions:
Practice Question
(1) Multiple Choice:
A study examined the relationship between hours of exercise per week and body mass index (BMI). Here's the scatterplot (image not reproduced here):
Based on the scatterplot, which statement is true?
(A) There is a strong positive correlation between hours of exercise per week and BMI.
(B) There is a strong negative correlation between hours of exercise per week and BMI.
(C) There is a moderate positive correlation between hours of exercise per week and BMI.
(D) There is a moderate negative correlation between hours of exercise per week and BMI.
(E) There is no correlation between hours of exercise per week and BMI.
(2) True/False:
Determine whether each statement is true or false:
- A scatterplot is a graphical representation of the relationship between two variables.
- A correlation coefficient of 1 indicates a strong positive correlation between two variables.
- A correlation coefficient of -1 indicates a strong positive correlation between two variables.
- A correlation coefficient of 0 indicates no linear correlation between two variables.
- The correlation coefficient only measures linear relationships between two variables.
- The correlation coefficient indicates the strength and direction of the relationship between two variables.
- The correlation coefficient indicates the cause and effect relationship between two variables.
- Correlation implies causation, meaning that if two variables are correlated, one variable must cause the other.
- A scatterplot can show nonlinear relationships between two variables.
- A scatterplot can be used to predict the value of one variable based on the value of the other variable.
(3) Free Response Question:
A researcher is studying the relationship between the number of hours students study per week and their exam scores. They collect data from a sample of 10 students. The data is shown below:
Hours of Study (X) | Exam Score (Y) |
---|---|
5 | 60 |
10 | 75 |
15 | 82 |
20 | 90 |
25 | 95 |
8 | 68 |
12 | 78 |
18 | 88 |
22 | 92 |
28 | 98 |
(a) Calculate the correlation coefficient r between hours of study and exam scores. Show your work using calculator steps.
(b) Interpret the correlation coefficient in the context of the problem.
(c) Does this correlation imply that studying more causes higher exam scores? Explain.
Answer Key:
(1) (D) There is a moderate negative correlation between hours of exercise per week and BMI.
(2) T, T, F, T, T, T, F, F, T, T
(3)
(a) Calculator Steps:
1. Enter hours of study (X) into L1 and exam scores (Y) into L2.
2. Go to STAT > CALC > LinReg(ax+b).
3. Ensure "DiagnosticOn" is enabled in MODE.
4. The correlation coefficient is r ≈ 0.978.
(b) Interpretation:
The correlation coefficient of approximately 0.978 indicates a strong, positive linear relationship between hours of study per week and exam scores. This means that as the number of hours students study increases, their exam scores tend to increase as well.
(c) Causation:
No, this correlation does not imply that studying more causes higher exam scores. While the data suggests a strong association, there could be other factors influencing both study hours and exam scores (e.g., prior knowledge, study habits, etc.). Correlation does not prove causation. We would need a well-designed experiment to prove causation.
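If you want to cross-check part (a) without a TI-84, here is a minimal Python sketch that recomputes r from the table using the deviation-sum form of the formula. The variable and function names are made up for this illustration; only the ten data pairs come from the problem.

```python
from math import sqrt

# Data from the free response table
hours = [5, 10, 15, 20, 25, 8, 12, 18, 22, 28]
scores = [60, 75, 82, 90, 95, 68, 78, 88, 92, 98]

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r_study = pearson_r(hours, scores)
print(round(r_study, 3))
```

The printed value should match what LinReg(ax+b) reports for r with the same lists.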
Final Exam Focus
Alright, let's zero in on what's most important for the exam:
- Interpreting r: Make sure you can explain what a given r value means in context. Remember both strength and direction.
- Correlation vs. Causation: This is a classic trap! Always be ready to explain why correlation doesn't imply causation.
- Calculator Skills: Be fluent with using your calculator to find r. Don't forget to turn on diagnostics!
Last-Minute Tips
- Time Management: Don't spend too long on one question. If you're stuck, make a note and come back to it later.
- Read Carefully: Pay close attention to the wording of each question. Little details can make a big difference.
- Show Your Work: Even if you use your calculator, show the steps you took. You can get partial credit even if your final answer is wrong.
- Stay Calm: You've got this! Take a deep breath and trust in your preparation.
By now, you should feel like a correlation pro. You've got the knowledge, the tools, and the confidence to ace this topic on the AP Statistics exam. Go get 'em!
