Glossary
68% Rule
In a normal distribution, approximately 68% of the data falls within one standard deviation of the mean.
Example:
If the average IQ is 100 with a standard deviation of 15, then about 68% of people have an IQ between 85 and 115 according to the 68% Rule.
95% Rule
In a normal distribution, approximately 95% of the data falls within two standard deviations of the mean.
Example:
Using the IQ example, the 95% Rule suggests that about 95% of people have an IQ between 70 and 130.
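The 68% and 95% figures can be checked numerically. The sketch below is illustrative only: it assumes IQ scores follow an exact normal distribution with mean 100 and standard deviation 15 (the numbers from the examples above), and `share_within` is a helper name invented here.

```python
from statistics import NormalDist

# Assumption: IQ is modeled as exactly normal, mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

def share_within(dist, k):
    """Proportion of the distribution lying within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

within_one = share_within(iq, 1)  # IQ between 85 and 115
within_two = share_within(iq, 2)  # IQ between 70 and 130

print(f"Within 1 SD: {within_one:.1%}")
print(f"Within 2 SD: {within_two:.1%}")
```

Under these assumptions the two proportions come out at roughly 68.3% and 95.4%, which is why both rules are stated as approximations.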
Bimodal Distribution
A frequency distribution that has two distinct peaks or modes, suggesting the presence of two different groups or categories within the data.
Example:
A survey on preferred sleep times might show a bimodal distribution if there are two common preferences, like early risers and night owls.
Correlation
A statistical measure that describes the strength and direction of a relationship between two variables.
Example:
Observing that students who spend more time studying tend to get higher grades suggests a correlation between study time and academic performance.
Correlation Coefficient
A numerical index that quantifies the strength and direction of a linear relationship between two variables, ranging from -1 to +1.
Example:
A correlation coefficient of +0.9 indicates a very strong positive relationship, such as the one that might be found between hours spent exercising and calories burned.
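One common correlation coefficient is Pearson's r. The sketch below computes it from scratch so each step is visible. The exercise data are made up for illustration, and `pearson_r` is a helper name chosen here, not a library function.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Hypothetical data: hours exercised and calories burned for five people.
hours = [1, 2, 3, 4, 5]
calories = [110, 190, 320, 390, 510]

r = pearson_r(hours, calories)
print(f"r = {r:.2f}")  # close to +1: a strong positive linear relationship

# A perfectly inverse relationship gives r = -1.
r_negative = pearson_r([1, 2, 3], [3, 2, 1])
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak or absent one.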
Correlation vs. Causation
A critical distinction in research, emphasizing that just because two variables are related (correlated) does not mean one causes the other.
Example:
Finding a correlation between ice cream sales and crime rates doesn't mean ice cream causes crime; both might increase in hot weather.
Descriptive Statistics
Statistical methods used to summarize and describe the main features of a dataset, such as measures of central tendency and variation.
Example:
Calculating the average score of a class on a psychology exam is an example of using descriptive statistics to understand the class's performance.
Frequency Distribution
A summary of how often different scores or values occur in a dataset, often displayed in tables or graphs.
Example:
A psychologist might create a frequency distribution to show how many students scored within specific ranges on a personality test.
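A frequency distribution like this can be built by sorting scores into ranges and counting each range. The sketch below uses invented scores and a bin width of 10; `bin_label` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical test scores (all two-digit, so the labels sort correctly as text).
scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 94]
bin_width = 10

def bin_label(score, width):
    """Return the score range a value falls into, e.g. 74 -> '70-79'."""
    low = (score // width) * width
    return f"{low}-{low + width - 1}"

frequency = Counter(bin_label(s, bin_width) for s in scores)

# Print a simple text histogram, one row per score range.
for label in sorted(frequency):
    print(f"{label}: {'#' * frequency[label]}")
```

Each row shows how many scores fall into that range, which is the same information a frequency table or histogram would display.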
Inferential Statistics
Statistical methods used to make generalizations or draw conclusions about a larger population based on data collected from a sample.
Example:
A researcher uses inferential statistics to determine if the results from a study on a small group of participants can be applied to all teenagers.
Mean
The arithmetic average of a dataset, calculated by summing all values and dividing by the number of values.
Example:
If a student's test scores are 80, 90, and 70, their mean score is 80.
Median
The middle value in a dataset when the values are arranged in numerical order. When the dataset has an even number of values, the median is the average of the two middle values.
Example:
In the dataset 1, 3, 5, 8, 10, the median is 5, as it's the central value.
Mode
The value that appears most frequently in a dataset.
Example:
In a survey where more people chose 'blue' as their favorite color than any other color, 'blue' would be the mode.
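The three measures of central tendency (mean, median, and mode) can be computed with Python's standard `statistics` module. The sketch below reuses the numbers from the examples above; the list of color choices is invented for illustration.

```python
import statistics

# Mean: sum of the values divided by how many there are.
test_scores = [80, 90, 70]
mean_score = statistics.mean(test_scores)

# Median: the middle value once the data are sorted.
values = [1, 3, 5, 8, 10]
median_value = statistics.median(values)

# With an even number of values, the two middle values are averaged.
median_even = statistics.median([1, 3, 5, 8])

# Mode: the most frequent value; it also works for non-numeric data.
favorite_colors = ["blue", "red", "blue", "green", "blue", "red"]  # hypothetical survey
mode_color = statistics.mode(favorite_colors)

print(mean_score, median_value, median_even, mode_color)
```

Unlike the mean and median, the mode can describe categories such as colors, which have no numerical order.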
Negative Correlation
A relationship between two variables where as one variable increases, the other tends to decrease.
Example:
A negative correlation might exist between the number of hours spent watching TV and academic grades, where more TV time is associated with lower grades.
Negatively Skewed
A distribution where the tail extends to the left, indicating that most scores are higher, and the median is typically greater than the mean.
Example:
The distribution of scores on an easy exam might be negatively skewed, with most students scoring high and only a few scoring low.
No Correlation
A lack of a consistent relationship between two variables, meaning changes in one variable are not predictably associated with changes in the other.
Example:
There is typically no correlation between a person's shoe size and their intelligence level.
Normal Distribution
A symmetrical, bell-shaped curve that represents the distribution of many natural phenomena, where most data points cluster around the mean.
Example:
Human height often follows a normal distribution, with most people being of average height and fewer people being extremely tall or short.
Positive Correlation
A relationship between two variables where both variables tend to increase or decrease together.
Example:
There is a positive correlation between the amount of time spent practicing a musical instrument and proficiency in playing it.
Positively Skewed
A distribution where the tail extends to the right, indicating that most scores are lower, and the mean is typically greater than the median.
Example:
Income distribution in many countries is often positively skewed, with most people earning lower incomes and a few earning very high incomes.
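The effect of skew on the mean and median can be seen in a small example. Both datasets below are invented: one with a single very high income (positive skew) and one with a single very low exam score (negative skew).

```python
import statistics

# Hypothetical incomes in thousands: one very high earner stretches the right tail.
incomes = [28, 30, 32, 35, 38, 40, 45, 250]
income_mean = statistics.mean(incomes)
income_median = statistics.median(incomes)

# Hypothetical scores on an easy exam: one very low score stretches the left tail.
exam_scores = [35, 82, 85, 88, 90, 92, 94, 96]
exam_mean = statistics.mean(exam_scores)
exam_median = statistics.median(exam_scores)

print(f"Incomes: mean {income_mean}, median {income_median}")
print(f"Exam:    mean {exam_mean}, median {exam_median}")
```

In each case the mean is pulled toward the tail while the median stays near the bulk of the data, which is why the median is often preferred for describing skewed distributions.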
Range
The difference between the highest and lowest values in a dataset.
Example:
If the highest score on a quiz was 95 and the lowest was 60, the range of scores is 35.
Standard Deviation
A measure of how far scores in a dataset typically deviate from the mean, calculated as the square root of the average squared deviation; it indicates the spread or variability of the data.
Example:
A low standard deviation in test scores means most students scored very close to the class average, indicating consistent performance.
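Two classes can share the same average yet differ greatly in spread. The sketch below uses invented scores and `statistics.pstdev`, which treats the data as a whole population; `statistics.stdev` would instead give the sample version, which divides by n - 1.

```python
import statistics

# Hypothetical scores for two classes with the same mean of 80.
consistent_class = [78, 80, 82, 79, 81]
varied_class = [60, 100, 70, 95, 75]

sd_consistent = statistics.pstdev(consistent_class)
sd_varied = statistics.pstdev(varied_class)

# Range: highest value minus lowest value.
range_consistent = max(consistent_class) - min(consistent_class)
range_varied = max(varied_class) - min(varied_class)

print(f"Consistent class: SD {sd_consistent:.2f}, range {range_consistent}")
print(f"Varied class:     SD {sd_varied:.2f}, range {range_varied}")
```

The first class has a small standard deviation and range because every score sits close to 80; the second has much larger values for both, even though its mean is identical.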
Statistical Significance
A determination that a research result is unlikely to have occurred by chance, suggesting a real effect or relationship.
Example:
A drug trial showing a statistically significant reduction in symptoms suggests that the improvement is unlikely to be explained by random variation alone and is more plausibly due to the drug.
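One way to make "unlikely to have occurred by chance" concrete is a permutation test, sketched below. This is only one of several possible approaches and not necessarily the one a given study would use. The symptom scores are invented, `permutation_p_value` is a hypothetical helper, and the 0.05 cutoff is a common convention rather than a fixed rule.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often a random regrouping yields a mean difference
    at least as large as the one actually observed."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign scores to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for rounding error
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical symptom scores (lower is better).
drug_group = [4, 5, 3, 4, 5, 3, 4, 4]
placebo_group = [7, 8, 6, 7, 8, 7, 6, 8]

p_value = permutation_p_value(drug_group, placebo_group)
print("Statistically significant at 0.05:", p_value < 0.05)
```

A small p-value means that random regrouping rarely produces a difference as large as the observed one, so chance alone is an unconvincing explanation. It does not by itself show that the effect is large or important.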