Chapter 3 (Introducing density curves)

• When given a histogram or a list of data, we are often asked to estimate the relative position of a particular data point.

What percent of scores were under 40? What percent of scores were equal to or over 60? What percent were equal to 40 or larger but less than 50 (i.e., between 40 and 50)?

Is a score of 12 considered unusual? Explain. Explain why a score of 44 is not considered unusual.

Chapter 3 (cont’d)

• Explain why a height of 70.5 inches (using the above chart) is considered unusual.

Here is a histogram of vocabulary scores of 947 seventh graders. The smooth curve drawn over the histogram is a mathematical model for the distribution (specifically, a Normal curve).

Is a score of 3 unusual? Quantify that answer, i.e., what is the “percent of scores” above 3? How to answer that question is the subject of Ch 3!

Ch 3 (cont’d)

• In Chapter 3, instead of using “sample” data (histograms), we will use an entire population to describe the relative position of a data point.
• To do this, we will need to define and use “density curves”.

DENSITY CURVES:

• A density curve is an “idealized” mathematical description of a population’s distribution.
• The total area under the curve is exactly 1.
• In a density curve, the area under the curve to the left (or to the right) of a value represents the fraction of data values to the left (or right) of that value. (Think of a histogram without vertical lines between the intervals!)

Density Curve for IQs

Another “idealized” model of a population’s distribution: the IQ “normal curve”, µ = 100, σ = 15.

The area under the curve to the left (or to the right) of a value represents the fraction of data values to the left (or right) of that value (e.g., P(x > 130) ≈ 2.5%, P(x < 100) = 50%).

How did I get the above percentages?

1st: Some Definitions/Notation

Population mean: µ, “mu”. A measure of the center of the entire population. It is the arithmetic mean of ALL the data points, $\mu = \frac{\sum x_i}{N}$, i.e.
(sum all the data values in the population, then divide by N). In practice, a VERY hard number to obtain.

Population standard deviation: σ, “sigma”. A measure of the spread of the entire population’s data values:

$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$

It is seldom calculated. Instead, a large sample is taken and the sample standard deviation, “s”, is used as the “best estimate” for σ.

Chapter 3 - Some Definitions/Notation

x ~ N(µ, σ): “The variable x is Normally distributed with a mean of µ and a standard deviation of σ.”

x ~ N(7, 1.5) implies that there is a population whose mean is 7 and whose standard deviation is 1.5.

IQ “normal curve”: x ~ N(100, 15)

Let’s look at some “rules of thumb” for normal curves.

68-95-99.7 Rule for Any Normal Curve

[Figure: Normal curve with 68% of the area between µ − σ and µ + σ, 95% between µ − 2σ and µ + 2σ, and 99.7% between µ − 3σ and µ + 3σ.]

Essential Statistics, Chapter 3

Health and Nutrition Examination Study of 1976-1980

• Heights of adult men, aged 18-24
– mean: 70.0 inches
– standard deviation: 2.8 inches
– heights follow a normal distribution, so heights of men are N(70, 2.8).
– DO ON BOARD

• 68-95-99.7 Rule for men’s heights
68% are between 67.2 and 72.8 inches [µ ± σ = 70.0 ± 2.8]
95% are between 64.4 and 75.6 inches [µ ± 2σ = 70.0 ± 2(2.8) = 70.0 ± 5.6]
99.7% are between 61.6 and 78.4 inches [µ ± 3σ = 70.0 ± 3(2.8) = 70.0 ± 8.4]

• What proportion of men are less than 72.8 inches tall?
72.8 inches is one standard deviation above the mean of 70. By the 68-95-99.7 rule, 68% of heights lie within one standard deviation of the mean, which leaves 16% in each tail. So the proportion below 72.8 inches is 68% + 16% = 84%.

• What proportion of men are less than 68 inches tall?
How many standard deviations is 68 from 70?

Weds – Using Table A

• That’s it for the 68-95-99.7 rule.
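
As a recap of the rule for the men's heights example, N(70, 2.8), the intervals and areas can be checked numerically. This is only a sketch: it uses Python's standard-library `statistics.NormalDist` in place of the board work, and the helper names `interval_within` and `proportion_within` are made up for illustration.

```python
from statistics import NormalDist

# Men's heights from the example: mean 70.0 inches, standard deviation 2.8 inches.
MU, SIGMA = 70.0, 2.8
heights = NormalDist(mu=MU, sigma=SIGMA)

def interval_within(k):
    """Return (low, high) for the interval from mu - k*sigma to mu + k*sigma."""
    return MU - k * SIGMA, MU + k * SIGMA

def proportion_within(k):
    """Area under the Normal curve within k standard deviations of the mean."""
    low, high = interval_within(k)
    return heights.cdf(high) - heights.cdf(low)

# Compare the exact areas with the rounded 68-95-99.7 figures.
for k in (1, 2, 3):
    low, high = interval_within(k)
    print(f"within {k} sd: {low:.1f} to {high:.1f} in, area = {proportion_within(k):.4f}")

# Proportion shorter than 72.8 inches (one sd above the mean).
# The rule of thumb gives 68% + 16% = 84%.
below_72_8 = heights.cdf(72.8)
print(f"P(height < 72.8) = {below_72_8:.4f}")
```

The exact areas differ slightly from 68, 95 and 99.7 because the rule uses rounded values.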
• We will see how to find proportions (to the left of, to the right of, or in between) any score(s).
• Note: HW is on StatsPortal.

Z scores

ALL Normal curves are the same if we measure in units of size σ from the mean µ. Finding the distance a data point is from its mean, in standard deviations, is called “standardizing”.

How do we find the “proportion” of scores to the left or right of a data point?

• ANSWER: By finding the number of standard deviations a score is AWAY from the mean, i.e., find the “z” score, then use Table A.
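
The standardizing step can be sketched in code using the usual formula z = (x − µ)/σ. In this sketch the standard Normal cumulative distribution from Python's `statistics.NormalDist` stands in for Table A (an assumption: the table itself is not reproduced here), and the function names `z_score` and `proportion_below` are hypothetical. The numbers come from the men's heights example, N(70, 2.8).

```python
from statistics import NormalDist

# The standard Normal curve N(0, 1); its cdf plays the role of Table A here.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

def proportion_below(x, mu, sigma):
    """Proportion of the population to the left of x under N(mu, sigma)."""
    return STANDARD_NORMAL.cdf(z_score(x, mu, sigma))

# Men's heights: what proportion are less than 68 inches tall?
z = z_score(68, 70.0, 2.8)                  # 68 is below the mean, so z is negative
left = proportion_below(68, 70.0, 2.8)      # area to the left of 68
right = 1 - left                            # area to the right of 68
between = proportion_below(72.8, 70.0, 2.8) - left  # area between 68 and 72.8

print(f"z = {z:.2f}")
print(f"left = {left:.4f}, right = {right:.4f}, between = {between:.4f}")
```

A printed table that rounds z to two decimals will give values close to, but not exactly equal to, these.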