# Stats: Modeling the World ©2004

Bock, Velleman, De Veaux

SE = Student Edition
TE = Teacher Edition

## I. EXPLORING DATA

Exploratory analysis of data makes use of graphical and numerical techniques to study patterns and departures from patterns. Emphasis should be placed on interpreting information from graphical and numerical displays and summaries.

A. Interpreting graphical displays of distributions of univariate data (dotplot, stemplot, histogram, cumulative frequency plot)

1. Center and spread (SE/TE: 39–44)
2. Clusters and gaps (SE/TE: 41–42, 233)
3. Outliers and other unusual features (SE/TE: 41–42, 61–62, 117–122)
4. Shape (SE/TE: 39–43)

B. Summarizing distributions of univariate data

1. Measuring center: median, mean (SE/TE: 57–65)
2. Measuring spread: range, interquartile range, standard deviation (SE/TE: 60–67, 83–84)
3. Measuring position: quartiles, percentiles, standardized scores (z-scores) (SE/TE: 84–92, 387–388, 427–430)
4. Using boxplots (SE/TE: 60–61, 70–72)
5. The effect of changing units on summary measures (SE/TE: 45–46, 98–99)

C. Comparing distributions of univariate data (dotplots, back-to-back stemplots, parallel boxplots)

1. Comparing center and spread: within group, between group variation (SE/TE: 38–43, 67–70)
2. Comparing clusters and gaps (SE/TE: 41–42, 233)
3. Comparing outliers and other unusual features (SE/TE: 41–42, 61–62, 117–122)
4. Comparing shapes (SE/TE: 39–40)

D. Exploring bivariate data

1. Analyzing patterns in scatterplots (SE/TE: 115–123)
2. Correlation and linearity (SE/TE: 119–123, 125, 130, 139–140)
3. Least-squares regression line (SE/TE: 139–142, 547–551, 553–556)
4. Residual plots, outliers, and influential points (SE/TE: 138, 167–169, 174)
5. Transformations to achieve linearity: logarithmic and power transformations (SE/TE: 45–46, 190–193)

E. Exploring categorical data: frequency tables

1. Marginal and joint frequencies for two-way tables (SE/TE: 17–23)
2. Conditional relative frequencies and association (SE/TE: 21–22, 37, 276, 292–293)

## II. PLANNING A STUDY: DECIDING WHAT AND HOW TO MEASURE

Data must be collected according to a well-developed plan if valid information on a conjecture is to be obtained. This plan includes clarifying the question and deciding upon a method of data collection and analysis.

A. Overview of methods of data collection

1. Census (SE/TE: 229–230)
2. Sample survey (SE/TE: 227–240)
3. Experiment (SE/TE: 246–262)
4. Observational study (SE/TE: 246–247, 260)

B. Planning and conducting surveys

1. Characteristics of a well-designed and well-conducted survey (SE/TE: 227–234)
2. Populations, samples, and random selection (SE/TE: 229–234)
3. Sources of bias in surveys (SE/TE: 227, 235, 238–242)
4. Simple random sampling (SE/TE: 228, 231–235)
5. Stratified random sampling (SE/TE: 232, 241)

C. Planning and conducting experiments

1. Characteristics of a well-designed and well-conducted experiment (SE/TE: 248–251)
2. Treatments, control groups, experimental units, random assignments, and replication (SE/TE: 231–232, 247–249, 254)
3. Sources of bias and confounding, including placebo effect and blinding (SE/TE: 239–242, 254–255, 258–260)
4. Completely randomized design (SE/TE: 247–251)
5. Randomized block design, including matched pairs design (SE/TE: 250, 256–257, 492, 499–500)

D. Generalizability of results from observational studies, experimental studies, and surveys

## III. ANTICIPATING PATTERNS: PRODUCING MODELS USING PROBABILITY THEORY AND SIMULATION

Probability is the tool used for anticipating what the distribution of data should look like under a given model.

A. Probability as relative frequency

1. "Law of large numbers" concept (SE/TE: 276–277)
2. Addition rule, multiplication rule, conditional probability, and independence (SE/TE: 291–297, 300–304)
3. Discrete random variables and their probability distributions, including binomial (SE/TE: 309–313, 316–319, 329–332)
4. Simulation of probability distributions, including binomial and geometric (SE/TE: 218–221, 327–332)
5. Mean (expected value) and standard deviation of a random variable, and linear transformation of a random variable (SE/TE: 139–140, 311–315)

B. Combining independent random variables

1. Notion of independence versus dependence (SE/TE: 119–120, 295–297)
2. Mean and standard deviation for sums and differences of independent random variables (SE/TE: 85–86, 316–318, 320)

C. The normal distribution

1. Properties of the normal distribution (SE/TE: 83–89)
2. Using tables of the normal distribution (SE/TE: 91–95; Table in SE A70–73; Table in TE A104)
3. The normal distribution as a model for measurements (SE/TE: 83–86, 96–98, 349–350)

D. Sampling distributions

1. Sampling distribution of a sample proportion (SE/TE: 347–350)
2. Sampling distribution of a sample mean (SE/TE: 352–354, 358)
3. Central Limit Theorem (SE/TE: 354–357)
4. Sampling distribution of a difference between two independent sample proportions (SE/TE: 347–360, 365–377)
5. Sampling distribution of a difference between two independent sample means (SE/TE: 352–354)
6. Simulation of sampling distributions (SE/TE: 352–354)

## IV. STATISTICAL INFERENCE: CONFIRMING MODELS

Statistical inference guides the selection of appropriate models.

A. Confidence intervals

1. The meaning of a confidence interval (SE/TE: 366–371)
2. Large sample confidence interval for a proportion (SE/TE: 371–376)
3. Large sample confidence interval for a mean (SE/TE: 449–454)
4. Large sample confidence interval for a difference between two proportions (SE/TE: 424–430)
5. Large sample confidence interval for a difference between two means (unpaired and paired) (SE/TE: 466–477, 491–498)

B. Tests of significance

1. Logic of significance testing, null and alternative hypotheses; p-values; one- and two-sided tests; concepts of Type I and Type II errors; concept of power (SE/TE: 383–390, 392–396, 404–405, 409–414)
2. Large sample test for a proportion (SE/TE: 350–355)
3. Large sample test for a mean (SE/TE: 358–361)
4. Large sample test for a difference between two proportions (SE/TE: 421–425, 428–432)
5. Large sample test for a difference between two means (unpaired and paired) (SE/TE: 466–477, 491–498)
6. Chi-square test for goodness of fit, homogeneity of proportions, and independence (one- and two-way tables) (SE/TE: 521–524, 527–530, 532–538; Table in SE A73; Table in TE A107)

C. Special case of normally distributed data

1. t-distribution (SE/TE: 443–446)
2. Single sample t procedures (SE/TE: 452–457)
3. Two sample (independent and matched pairs) t procedures (SE/TE: 466–477, 491–499)
4. Inference for the slope of least-squares regression line (SE/TE: 139–144, 547–551, 553–556)

AP® is a trademark registered and/or owned by the College Board, which was not involved in the production of, and does not endorse, this site.