+ Chapter 2: Modeling Distributions of Data
Section 2.1 Describing Location in a Distribution
The Practice of Statistics, 4th edition - For AP*
STARNES, YATES, MOORE

In Chapter 1, we developed a kit of graphical and numerical tools for describing distributions. Now, we’ll add one more step to the strategy.

Exploring Quantitative Data
1. Always plot your data: make a graph.
2. Look for the overall pattern (shape, center, and spread) and for striking departures such as outliers.
3. Calculate a numerical summary to briefly describe center and spread.
4. Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.

+ Density Curves
Definition: Density Curve
A density curve is a curve that
•is always on or above the horizontal axis, and
•has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution. The area under the curve and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval. AREA = PROBABILITY = PROPORTION.

The overall pattern of this histogram of the scores of all 947 seventh-grade students in Gary, Indiana, on the vocabulary part of the Iowa Test of Basic Skills (ITBS) can be described by a smooth curve drawn through the tops of the bars.

+ Mean and Median of a Density Curve
Symmetric: Mean = Median
Skewed Left: Mean < Median
Skewed Right: Mean > Median
The median of a density curve is the equal-areas point, where ½ of the area is to the left and ½ of the area is to the right.
The mean of a density curve is the balance point, where the curve would balance if it were made of solid material.

+ Examples
For each of these density curves, which line represents the mean? The median?

+ Quartiles
For any density curve…
How much area is to the left of the first quartile?
How much area is to the right of the first quartile?
How much area is between the first and third quartiles?

+ The uniform distribution
The uniform distribution is a special kind of density curve. It looks like a rectangle: it has a constant height (a horizontal line) over some interval of values. So, this density curve describes a variable whose values are distributed evenly (UNIFORMLY) over that interval.

+ Example
Accidents on a level, 3-mile bike path occur uniformly along the length of the path.
A) Show that this density curve satisfies the two requirements for a density curve.
B) The proportion of accidents that occur in the first mile of the path is the area under the density curve between 0 miles and 1 mile. What is this area?
C) Sue’s property adjoins the bike path between the 0.8 mile mark and the 1.1 mile mark. What proportion of accidents happen in front of Sue’s property?

+ Normal Distributions
Density curves have total area 1 underneath them and are always on or above the horizontal axis. Normal curves are a special type of density curve.
T/F All density curves are normal curves.
T/F All normal curves are density curves.

Characteristics of Normal Curves
Symmetric
Single-peaked (also called unimodal)
Bell-shaped
The mean, μ, is located at the center of the curve.
The standard deviation, σ, is the distance from the center to the inflection points of the curve, which sit at μ − σ and μ + σ.

Parameters of the Normal Curve
The same way a line is defined by its slope and y-intercept, a normal curve is defined by its mean and standard deviation.

+ Why Be Normal?
Normal curves are good descriptions for lots of real data: SAT test scores, IQ, heights, length of cockroaches (yum!).
Normal curves approximate the results of many random experiments, like the number of heads when tossing a coin many times.
Not all data are normal (or even approximately normal). Income data, for example, are skewed right.

Notation
We abbreviate the Normal distribution with mean µ and standard deviation σ as N(µ,σ).
IQ scores on the WISC-IV are distributed Normally with a mean of 100 and a standard deviation of 15. IQ ~ N(100,15).
Women’s heights ~ N(64.5 in, 2.5 in)

+ Remember z-scores?
What’s the formula to standardize a value?
If a person’s IQ has a z-score of 0, what does that mean?
What does a z-score of -1 mean?
On the normal curve, we draw 3 standard deviations on either side of the mean.
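
One way to check your answers to the z-score questions is a short Python sketch. It assumes the usual standardization, z = (x − μ)/σ, and uses the two Normal models from the Notation slide. The function names `z_score` and `sd_marks` are made up for this illustration; they do not come from the textbook.

```python
def z_score(x, mean, sd):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


def sd_marks(mean, sd, k=3):
    """Axis marks at mean - k*sd, ..., mean, ..., mean + k*sd."""
    return [mean + i * sd for i in range(-k, k + 1)]


# IQ ~ N(100, 15)
print(z_score(100, 100, 15))  # 0.0  -> an IQ exactly at the mean
print(z_score(85, 100, 15))   # -1.0 -> one standard deviation below the mean
print(sd_marks(100, 15))      # [55, 70, 85, 100, 115, 130, 145]

# Women's heights ~ N(64.5, 2.5), in inches
print(sd_marks(64.5, 2.5))    # [57.0, 59.5, 62.0, 64.5, 67.0, 69.5, 72.0]
```

The lists returned by `sd_marks` are the values you would label along the horizontal axis when sketching each normal curve, with the mean in the middle.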