# Correlations

# Stats: Modeling the World ©2004

Bock, Velleman, De Veaux

### Correlated to: Advanced Placement® (AP®) Statistics Standards (Grades 9–12)

SE = Student Edition

TE = Teacher Edition

## I. EXPLORING DATA

Exploratory analysis of data makes use of graphical and numerical techniques to study patterns and departures from patterns. Emphasis should be placed on interpreting information from graphical and numerical displays and summaries.

| Standard | Pages |
| --- | --- |
| A. Interpreting graphical displays of distributions of univariate data (dotplot, stemplot, histogram, cumulative frequency plot) | |
| 1. Center and spread | SE/TE: 39–44 |
| 2. Clusters and gaps | SE/TE: 41–42, 233 |
| 3. Outliers and other unusual features | SE/TE: 41–42, 61–62, 117–122 |
| 4. Shape | SE/TE: 39–43 |
| B. Summarizing distributions of univariate data | |
| 1. Measuring center: median, mean | SE/TE: 57–65 |
| 2. Measuring spread: range, interquartile range, standard deviation | SE/TE: 60–67, 83–84 |
| 3. Measuring position: quartiles, percentiles, standardized scores (z-scores) | SE/TE: 84–92, 387–388, 427–430 |
| 4. Using boxplots | SE/TE: 60–61, 70–72 |
| 5. The effect of changing units on summary measures | SE/TE: 45–46, 98–99 |
| C. Comparing distributions of univariate data (dotplots, back-to-back stemplots, parallel boxplots) | |
| 1. Comparing center and spread: within group, between group variation | SE/TE: 38–43, 67–70 |
| 2. Comparing clusters and gaps | SE/TE: 41–42, 233 |
| 3. Comparing outliers and other unusual features | SE/TE: 41–42, 61–62, 117–122 |
| 4. Comparing shapes | SE/TE: 39–40 |
| D. Exploring bivariate data | |
| 1. Analyzing patterns in scatterplots | SE/TE: 115–123 |
| 2. Correlation and linearity | SE/TE: 119–123, 125, 130, 139–140 |
| 3. Least-squares regression line | SE/TE: 139–142, 547–551, 553–556 |
| 4. Residual plots, outliers, and influential points | SE/TE: 138, 167–169, 174 |
| 5. Transformations to achieve linearity: logarithmic and power transformations | SE/TE: 45–46, 190–193 |
| E. Exploring categorical data: frequency tables | |
| 1. Marginal and joint frequencies for two-way tables | SE/TE: 17–23 |
| 2. Conditional relative frequencies and association | SE/TE: 21–22, 37, 276, 292–293 |

## II. PLANNING A STUDY: DECIDING WHAT AND HOW TO MEASURE

Data must be collected according to a well-developed plan if valid information on a conjecture is to be obtained. This plan includes clarifying the question and deciding upon a method of data collection and analysis.

| Standard | Pages |
| --- | --- |
| A. Overview of methods of data collection | |
| 1. Census | SE/TE: 229–230 |
| 2. Sample survey | SE/TE: 227–240 |
| 3. Experiment | SE/TE: 246–262 |
| 4. Observational study | SE/TE: 246–247, 260 |
| B. Planning and conducting surveys | |
| 1. Characteristics of a well-designed and well-conducted survey | SE/TE: 227–234 |
| 2. Populations, samples, and random selection | SE/TE: 229–234 |
| 3. Sources of bias in surveys | SE/TE: 227, 235, 238–242 |
| 4. Simple random sampling | SE/TE: 228, 231–235 |
| 5. Stratified random sampling | SE/TE: 232, 241 |
| C. Planning and conducting experiments | |
| 1. Characteristics of a well-designed and well-conducted experiment | SE/TE: 248–251 |
| 2. Treatments, control groups, experimental units, random assignments, and replication | SE/TE: 231–232, 247–249, 254 |
| 3. Sources of bias and confounding, including placebo effect and blinding | SE/TE: 239–242, 254–255, 258–260 |
| 4. Completely randomized design | SE/TE: 247–251 |
| 5. Randomized block design, including matched pairs design | SE/TE: 250, 256–257, 492, 499–500 |
| D. Generalizability of results from observational studies, experimental studies, and surveys | |

## III. ANTICIPATING PATTERNS: PRODUCING MODELS USING PROBABILITY THEORY AND SIMULATION

Probability is the tool used for anticipating what the distribution of data should look like under a given model.

| Standard | Pages |
| --- | --- |
| A. Probability as relative frequency | |
| 1. "Law of large numbers" concept | SE/TE: 276–277 |
| 2. Addition rule, multiplication rule, conditional probability, and independence | SE/TE: 291–297, 300–304 |
| 3. Discrete random variables and their probability distributions, including binomial | SE/TE: 309–313, 316–319, 329–332 |
| 4. Simulation of probability distributions, including binomial and geometric | SE/TE: 218–221, 327–332 |
| 5. Mean (expected value) and standard deviation of a random variable, and linear transformation of a random variable | SE/TE: 139–140, 311–315 |
| B. Combining independent random variables | |
| 1. Notion of independence versus dependence | SE/TE: 119–120, 295–297 |
| 2. Mean and standard deviation for sums and differences of independent random variables | SE/TE: 85–86, 316–318, 320 |
| C. The normal distribution | |
| 1. Properties of the normal distribution | SE/TE: 83–89 |
| 2. Using tables of the normal distribution | SE/TE: 91–95, Table in SE A70–73; Table in TE A104 |
| 3. The normal distribution as a model for measurements | SE/TE: 83–86, 96–98, 349–350 |
| D. Sampling distributions | |
| 1. Sampling distribution of a sample proportion | SE/TE: 347–350 |
| 2. Sampling distribution of a sample mean | SE/TE: 352–354, 358 |
| 3. Central Limit Theorem | SE/TE: 354–357 |
| 4. Sampling distribution of a difference between two independent sample proportions | SE/TE: 347–360, 365–377 |
| 5. Sampling distribution of a difference between two independent sample means | SE/TE: 352–354 |
| 6. Simulation of sampling distributions | SE/TE: 352–354 |

## IV. STATISTICAL INFERENCE: CONFIRMING MODELS

Statistical inference guides the selection of appropriate models.

| Standard | Pages |
| --- | --- |
| A. Confidence intervals | |
| 1. The meaning of a confidence interval | SE/TE: 366–371 |
| 2. Large sample confidence interval for a proportion | SE/TE: 371–376 |
| 3. Large sample confidence interval for a mean | SE/TE: 449–454 |
| 4. Large sample confidence interval for a difference between two proportions | SE/TE: 424–430 |
| 5. Large sample confidence interval for a difference between two means (unpaired and paired) | SE/TE: 466–477, 491–498 |
| B. Tests of significance | |
| 1. Logic of significance testing, null and alternative hypotheses; p-values; one- and two-sided tests; concepts of Type I and Type II errors; concept of power | SE/TE: 383–390, 392–396, 404–405, 409–414 |
| 2. Large sample test for a proportion | SE/TE: 350–355 |
| 3. Large sample test for a mean | SE/TE: 358–361 |
| 4. Large sample test for a difference between two proportions | SE/TE: 421–425, 428–432 |
| 5. Large sample test for a difference between two means (unpaired and paired) | SE/TE: 466–477, 491–498 |
| 6. Chi-square test for goodness of fit, homogeneity of proportions, and independence (one- and two-way tables) | SE/TE: 521–524, 527–530, 532–538; Table in SE A73; Table in TE A107 |
| C. Special case of normally distributed data | |
| 1. t-distribution | SE/TE: 443–446 |
| 2. Single sample t procedures | SE/TE: 452–457 |
| 3. Two sample (independent and matched pairs) t procedures | SE/TE: 466–477, 491–499 |
| 4. Inference for the slope of least-squares regression line | SE/TE: 139–144, 547–551, 553–556 |

AP® is a trademark registered and/or owned by the College Board, which was not involved in the production of, and does not endorse, this site.