Glossary

Categorical Data

Criticality: 2

Data that represents characteristics or qualities, which can be divided into categories or groups, often without a natural order.

Example:

The types of cars people drive (sedan, SUV, truck) or their favorite ice cream flavors (vanilla, chocolate, strawberry) are examples of categorical data.

Chi-Squared Tests

Criticality: 3

A family of statistical tests used to analyze categorical data, typically to determine if observed frequencies differ significantly from expected frequencies or if there's an association between categorical variables.

Example:

An AP Stats student might use a chi-squared test to see if there's a relationship between a person's favorite color and their preferred music genre.

Chi-squared test statistic (χ²)

Criticality: 3

A calculated value that measures the discrepancy between the observed and expected frequencies in a chi-squared test, computed as the sum of (observed − expected)² / expected over all categories or cells.

Example:

A chi-squared test statistic that is large relative to its degrees of freedom indicates that the observed counts are far from the expected counts; if the resulting p-value falls below the significance level, the null hypothesis is rejected.
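The calculation behind the statistic can be sketched in a few lines of Python. This is a minimal illustration, not a full test: the function name chi_squared_statistic and the die-roll counts are hypothetical and chosen only for this example.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, so 10 rolls are expected per face
# if the die is fair.
observed_rolls = [8, 12, 9, 11, 6, 14]
expected_rolls = [10] * 6

statistic = chi_squared_statistic(observed_rolls, expected_rolls)
print(round(statistic, 3))  # 4.2
```

Each term is divided by its expected count, so a given gap between observed and expected matters more in a category where few observations were expected.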

Conditions for Inference (Chi-Squared)

Criticality: 3

Assumptions that must be met for the results of a chi-squared test to be valid, including random sampling, independence of observations, and large expected counts (typically ≥ 5 in each cell).

Example:

Before conducting a chi-squared test, a student must verify the conditions for inference, such as ensuring all expected counts are at least 5.
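Of the three conditions, only the large-counts condition can be checked from the numbers alone; random sampling and independence depend on how the data were collected. A minimal sketch of that check, with a hypothetical helper name and made-up expected counts:

```python
def expected_counts_large_enough(expected, minimum=5):
    """Return True when every expected count is at least `minimum`."""
    return all(count >= minimum for count in expected)


# Hypothetical expected counts for two different studies.
ok_counts = [12.5, 7.0, 5.0, 20.5]
small_counts = [12.5, 3.2, 9.3]

print(expected_counts_large_enough(ok_counts))     # True
print(expected_counts_large_enough(small_counts))  # False
```

Note that the check applies to the expected counts, not the observed counts.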

Degrees of Freedom (df)

Criticality: 3

A value that indicates the number of independent pieces of information used to calculate a statistic, influencing the shape of the chi-squared distribution.

Example:

In a chi-squared test for independence with a 2×3 table, the degrees of freedom are (2 − 1)(3 − 1) = 2.
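The degrees-of-freedom rules for the chi-squared tests in this glossary can be written as two small functions. The function names are hypothetical; the formulas are the standard ones (number of categories minus one for goodness of fit, and (rows − 1)(columns − 1) for a two-way table).

```python
def df_goodness_of_fit(num_categories):
    """Degrees of freedom for a goodness of fit test."""
    return num_categories - 1


def df_two_way(num_rows, num_cols):
    """Degrees of freedom for a test of independence or homogeneity."""
    return (num_rows - 1) * (num_cols - 1)


print(df_goodness_of_fit(6))  # six-sided die: 5
print(df_two_way(2, 3))       # 2x3 table: 2
```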

Expected Counts

Criticality: 3

The frequencies that would be anticipated in each cell of a one-way or two-way table if the null hypothesis were true (i.e., if the claimed distribution held, or if there were no association or difference).

Example:

When testing if a die is fair, the expected count for each face is one-sixth of the total number of rolls.
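For a single categorical variable, each expected count is the total sample size multiplied by the proportion claimed under the null hypothesis. The sketch below assumes a hypothetical 120 rolls and uses exact fractions to avoid rounding noise; the function name expected_counts is illustrative only.

```python
from fractions import Fraction


def expected_counts(total, proportions):
    """Expected count per category: total * hypothesized proportion."""
    if abs(sum(proportions) - 1) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    return [total * p for p in proportions]


# A fair die gives each of the six faces a proportion of 1/6.
fair_die = [Fraction(1, 6)] * 6
expected = expected_counts(120, fair_die)
print([float(e) for e in expected])  # [20.0, 20.0, 20.0, 20.0, 20.0, 20.0]
```

Expected counts do not have to be whole numbers; with 100 rolls each face would have an expected count of about 16.67.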

Goodness of Fit Test

Criticality: 3

A chi-squared test used to determine if an observed frequency distribution for a single categorical variable matches an expected or theoretical distribution.

Example:

A candy company might use a Goodness of Fit Test to see if the color distribution of candies in their new batch matches the advertised proportions.

Hypotheses (Null and Alternative)

Criticality: 3

The null hypothesis (H₀) states there is no effect or no difference, while the alternative hypothesis (Hₐ) states there is an effect or a difference.

Example:

For a study on a new drug, the null hypothesis might be that the drug has no effect, while the alternative hypothesis is that it reduces symptoms.

P-value

Criticality: 3

The probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming the null hypothesis is true.

Example:

A p-value of 0.01 means that, if the null hypothesis were true, there would be a 1% chance of obtaining a test statistic at least as extreme as the one observed. It is not the probability that the null hypothesis is true.
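For a chi-squared test, "more extreme" means a larger statistic, so the p-value is the area in the upper tail of the chi-squared distribution. In practice this comes from a calculator or a table; the sketch below is one way to compute it with only the Python standard library, using closed forms for 1 and 2 degrees of freedom and a standard recurrence for the incomplete gamma function. The function name is hypothetical, and the approach is intended for the small df values typical of classroom problems.

```python
import math


def chi_squared_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for chi-squared with df."""
    if df < 1 or statistic < 0:
        raise ValueError("df must be >= 1 and statistic must be >= 0")
    half_x = statistic / 2
    if df % 2 == 1:
        a = 0.5
        tail = math.erfc(math.sqrt(half_x))  # exact tail for df = 1
    else:
        a = 1.0
        tail = math.exp(-half_x)             # exact tail for df = 2
    # Step up two degrees of freedom at a time:
    # Q(a + 1, x) = Q(a, x) + x^a * e^(-x) / Gamma(a + 1)
    while a < df / 2:
        tail += half_x ** a * math.exp(-half_x) / math.gamma(a + 1)
        a += 1
    return tail


# 5.991 is the familiar table critical value for df = 2 at the 0.05 level.
print(round(chi_squared_p_value(5.991, 2), 3))  # 0.05
```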

Significance Level (α)

Criticality: 3

A pre-determined threshold (e.g., 0.05) used to decide whether to reject the null hypothesis; if the p-value is less than α, the null hypothesis is rejected.

Example:

Setting the significance level at 0.05 means you are willing to accept a 5% chance of making a Type I error (rejecting a true null hypothesis).

Test for Homogeneity

Criticality: 3

A chi-squared test used to determine if the distribution of a single categorical variable is the same across two or more different populations or groups.

Example:

A marketing team might use a Test for Homogeneity to compare if the distribution of customer satisfaction ratings is the same across three different store locations.

Test for Independence

Criticality: 3

A chi-squared test used to determine if there is a statistically significant association between two categorical variables from a single sample.

Example:

Researchers could use a Test for Independence to investigate if there's a relationship between a student's chosen major and their participation in extracurricular activities.
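The mechanics of the test can be sketched as follows: under independence, the expected count for a cell is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over every cell. The function names and the 2×2 counts below are hypothetical, made up for illustration.

```python
def expected_from_margins(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def independence_statistic(table):
    """Return the chi-squared statistic and degrees of freedom."""
    expected = expected_from_margins(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: rows are two majors, columns are
# extracurricular participation (yes, no).
observed = [[30, 20],
            [20, 30]]

statistic, df = independence_statistic(observed)
print(statistic, df)  # 4.0 1
```

The same arithmetic applies to a test for homogeneity; the two tests differ in how the data are collected and how the hypotheses are worded, not in the calculation.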

Two-way table

Criticality: 2

A table that displays the counts of observations for two categorical variables, with rows representing categories of one variable and columns representing categories of the other.

Example:

A survey collecting data on gender and preferred social media platform would typically summarize its findings in a two-way table.
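Building a two-way table from raw survey responses amounts to counting how often each pair of categories occurs. A minimal sketch, assuming each response is recorded as a (row category, column category) pair; the function name and the six responses are hypothetical.

```python
from collections import Counter


def two_way_table(pairs):
    """Tally (row, column) pairs into a table of counts."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


# Hypothetical survey responses: (gender, preferred platform).
responses = [
    ("female", "platform A"),
    ("female", "platform B"),
    ("male", "platform A"),
    ("male", "platform A"),
    ("female", "platform A"),
    ("male", "platform B"),
]

rows, cols, table = two_way_table(responses)
print(rows)   # ['female', 'male']
print(cols)   # ['platform A', 'platform B']
print(table)  # [[2, 1], [2, 1]]
```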