WHY DO WE NEED TO KNOW MODELS?

Scenario I: A patient visits his doctor complaining of a number of symptoms. The doctor suspects the patient is suffering from some disease and performs a diagnostic test to check for it. High responses on the test support that the patient may have the disease. The patient’s test response is 200. What does this say? The doctor has a frame of reference, that is, a model for the responses of the diagnostic test for “healthy subjects,” as shown in the accompanying figure:

[Figure: Model for Healthy Subjects. Horizontal axis: test response, from 80 to 200.]

Based on this model, it is very unlikely that a test response of 200 or greater would have occurred if the subject were actually healthy. Thus, either the patient has this disease or a very unlikely event has occurred.

Scenario II: Suppose we wish to compare two drugs, Drug A and Drug B, for relieving arthritis pain. Subjects suitable for the study are randomized to one of the two drug groups and are given instructions for dosage and how to measure their “time to relief.” Results of the study are summarized by presenting the models for the time to relief for the two drugs. Which drug is better overall?

Consider any point in time, say time = T as indicated on the above axis. A higher proportion of subjects treated with Drug A have felt “relief” by this time point as compared to those treated with Drug B. If the study design was sound, and the models based on the study results adequately portray the models for the populations, then we might conclude that Drug A appears to be better than Drug B in terms of having a quicker time to relief. We may wish to assess whether the difference between these two drugs is statistically significant by conducting a more formal statistical test.
MODELING CONTINUOUS VARIABLES

[Figure: Histogram 6.1. Horizontal axis: Age, from 30 to 55. Vertical axis: Proportion, from 0.02 to 0.08.]

If we draw a curve through the tops of the bars in Histogram 6.1 and require the smoothed curve to have total area under it equal to 1, we would have what is called a density function, also called a density curve. The key idea when working with density functions is that area under the curve, above an interval, corresponds to the proportion of units with values in the interval.

NOTATION: Since we will be discussing models for populations, the mean and standard deviation for a density curve or model will be represented by $\mu$ (mu) and $\sigma$ (sigma), respectively.

DEFINITION: A density function is a (nonnegative) function or curve that describes the overall shape of a distribution. The total area under the entire curve is equal to 1, and proportions are measured as areas under the density function.

Let's Do It! 1 Lifetime Density Function

Let the variable X represent the length of life, in years, for an electrical component. The following figure is the density curve for the distribution of X.

(a) What proportion of electrical components lasts longer than 6 years?
(b) What proportion of electrical components lasts longer than 1 year?
(c) Describe the shape of the distribution.

Normal Distributions

[Figure: A normal distribution, with a point of inflection marked.]

Three members of the family of normal distributions:

Distribution #1: Normal with a mean of 50 and a standard deviation of 10
Distribution #2: Normal with a mean of 80 and a standard deviation of 10
Distribution #3: Normal with a mean of 80 and a standard deviation of 5

[Figure: the three normal curves on a common horizontal axis, from 20 to 100.]

General Notation: X is $N(\mu, \sigma)$ means that the variable or characteristic X is normally distributed with mean $\mu$ and standard deviation $\sigma$.

Example 1 IQ Scores

Problem: Let the variable X represent IQ scores of 12-year-olds. Suppose that the distribution of X is normal with a mean of 100 and a standard deviation of 16; that is, X is N(100, 16).
Jessica is a 12-year-old and has an IQ score of 132. We would like to determine the proportion of 12-year-olds that have IQ scores less than Jessica’s score of 132. Since the area under the density curve corresponds to proportion, we want to find the area to the left of 132 under an N(100, 16) curve. Sketch this curve and show the corresponding area that represents this proportion.

[Figure: IQ scores have a normal distribution with mean 100 and standard deviation 16. Horizontal axis: IQ score, with ticks at 68, 84, 100, 116, 132. Area to the left of 132 = ?]

How to Calculate Areas under a Normal Distribution

DEFINITION: If X is $N(\mu, \sigma)$, the standardized normal variable $Z = \dfrac{X - \mu}{\sigma}$ is $N(0, 1)$.

DEFINITION: The z-score or standard score for an observed value tells us how many standard deviations the observed value is from the mean; that is, it tells us how far the observed value is from the mean in standard-deviation units. It is computed as follows:

$$Z = \frac{X - \mu}{\sigma} = \text{number of standard deviations that } X \text{ differs from the mean}$$

If Z > 0, then the value of X is above (greater than) its mean.
If Z < 0, then the value of X is below (less than) its mean.
If Z = 0, then the value of X is equal to its mean.

Example 2 Standard IQ Score

Problem: Recall the distribution of IQ scores for 12-year-olds: normally distributed with a mean of 100 and a standard deviation of 16.

(a) Jessica had a score of 132. Compute Jessica’s standardized score.
(b) Suppose Jessica has an older brother, Mike, who is 20 years old and has an IQ score of 144. It wouldn’t make sense to directly compare Mike’s score of 144 to Jessica’s score of 132. The two scores come from different distributions due to the age difference. Assume that the distribution of IQ scores for 20-year-olds is normal with a mean of 120 and a standard deviation of 20. Compute Mike’s standardized score.
(c) Relative to their respective age groups, who had the higher IQ score, Jessica or Mike?

Solution:

$$\text{Jessica's standard score} = \frac{132 - 100}{16} = 2, \qquad \text{Mike's standard score} = \frac{144 - 120}{20} = 1.2$$

Thus, relative to their respective age groups, Jessica has a higher IQ score than Mike.

Example 3 Finding Proportions for the Standard Normal Distribution

Problem: Finding proportions under a normal distribution involves standardization and then finding the corresponding proportion (area) under the standard normal distribution. Let’s first work on finding areas under a standard normal N(0, 1) distribution.

(a) Find the area under the standard normal distribution to the left of z = 1.22. Sketch a picture of the corresponding area and use either Table 3 page 825 or your TI-84 to find the area.

Solution Using TI:

(b) Find the area under the standard normal distribution to the right of z = 1.22.

Let's Do It! 2 6.2 More Standard Normal Areas

(a) Find the area under the standard normal distribution between z = 0 and z = 1.22. Sketch the area and use Table E or your calculator to find the area.

[Sketch axis: Z, centered at 0.]

(b) Find the area under the standard normal distribution to the left of z = -2.55. Sketch the area and use Table E or your calculator to find the area.

[Sketch axis: Z, centered at 0.]

(c) Find the area under the standard normal distribution between z = -1.22 and z = 1.22. Sketch the area and use Table E or your calculator to find the area.

[Sketch axis: Z, centered at 0.]

Let's Do It! 3 6.3 IQ Scores

We will continue with the model for IQ scores of 12-year-olds. In answering the following questions, remember to use the symmetry of the normal distribution and the fact that the total area under the curve is 1. It may also be very useful to draw a picture of the area you are trying to find so you can establish a frame of reference (for example, should it be larger or smaller than 50%?) and see how to approach getting the answer. If you will be using Table II, you will need to first compute the corresponding z-scores.

X = IQ score (12-year-olds) has an N(100, 16) distribution.

(a) What proportion of the 12-year-olds has IQ scores below 84? Sketch it.

[Sketch axis: IQ score, with ticks at 52, 68, 84, 100, 116, 132, 148.]

(b) What proportion of the 12-year-olds has IQ scores 84 or more? Sketch it.

[Sketch axis: IQ score, with ticks at 52, 68, 84, 100, 116, 132, 148.]

(c) What proportion of the 12-year-olds has IQ scores between 84 and 116? Sketch it.

[Sketch axis: IQ score, with ticks at 52, 68, 84, 100, 116, 132, 148.]

Example The Top 1% of the IQ Distribution

Problem: Recall the N(100, 16) model for IQ scores of 12-year-olds. What IQ score must a 12-year-old have to place in the top 1% of the distribution of IQ scores?

(a) Draw a picture to show what IQ score you are trying to find.
(b) What percentile do you want to find for the IQ distribution?
(c) Find the percentile using Table E in reverse or your calculator.

Again it may be helpful to draw a picture:

[Figure: normal curve centered at 100. Horizontal axis: IQ score. The area to the left of the unknown score (marked ?) is 0.99.]

Many calculators have the ability to find various percentiles of a normal distribution. The TI has a built-in function called invNorm under the DIST menu. You must first specify the desired area to the left, then the mean and the standard deviation for the normal distribution. The steps for finding the 99th percentile of our N(100, 16) distribution are as follows:

To use Table 3 in reverse:

Step 1: Standardize the score x.
Step 2: Find the Z score on the margins that corresponds to the area 0.99.

Finally, set the expression from Step 1 equal to the value of Z from Step 2 and solve for x:

$$z = \frac{x - 100}{16} = 2.33 \quad\Longrightarrow\quad x = 100 + (2.33)(16) = 137.28$$

A 12-year-old must have an IQ score of at least 137.28 to place in the top 1%.

Let's Do It! 4 6.7 Freestyle Swim Times

The finishing times for 11–12-year-old male swimmers performing the 50-yard freestyle are normally distributed with a mean of 35 seconds and a standard deviation of 2 seconds.

(a) The sponsors of a swim meet decide to give certificates to all 11–12-year-old male swimmers who finish their 50-yard race in under 32 seconds. If there are 50 such swimmers entered in the 50-yard freestyle event, approximately how many certificates will be needed?
(b) In what amount of time must a swimmer finish to be in the “top” (fastest) 2% of the distribution of finishing times?

Let's Do It! 5 7 Hours per Week

According to a study, men in the US devote an average of 16 hours per week to housework. Assume that the number of hours men devote to housework is normally distributed with a standard deviation of 3.5 hours.

a. Suppose that the lower 10% of men in the distribution devote fewer than x hours per week. Find the value of x.
b. Suppose that the upper 5% of men in the distribution devote more than x hours per week. Find the value of x.

Let's Do It! 6 Middle portion of the normal distribution

If a one-person household spends an average of $40 per month on medications and doctor visits, find the maximum and minimum dollar amounts spent per month for the middle 50% of one-person households. Assume that the standard deviation is $5 and that the amount spent is normally distributed.

Homework Page 138: 1-9 all, 12-16 all, 31-33 all
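
Checking the worked examples in software: the table and TI lookups used in Examples 2 and 3 and in the Top 1% example can be cross-checked with a short script. The sketch below is not part of the original handout. It assumes Python's standard-library `statistics.NormalDist`, and the helper `z_score` and all variable names are chosen here for illustration only.

```python
from statistics import NormalDist

# Models taken from the worked examples. The second parameter is the
# standard deviation (not the variance), matching the N(mu, sigma) notation.
iq_12 = NormalDist(mu=100, sigma=16)   # IQ scores of 12-year-olds
iq_20 = NormalDist(mu=120, sigma=20)   # IQ scores of 20-year-olds
std_normal = NormalDist()              # standard normal, N(0, 1)


def z_score(x, dist):
    """Number of standard deviations that x lies from the mean of dist."""
    return (x - dist.mean) / dist.stdev


# Example 2: standardized scores for Jessica (132) and Mike (144).
jessica_z = z_score(132, iq_12)
mike_z = z_score(144, iq_20)

# Example 3: areas under the standard normal curve around z = 1.22.
area_left = std_normal.cdf(1.22)   # area to the left of z = 1.22
area_right = 1 - area_left         # total area is 1, so the right tail is the rest

# Example 1: proportion of 12-year-olds scoring below Jessica's 132.
below_jessica = iq_12.cdf(132)

# Top 1% example: the 99th percentile of the N(100, 16) model.
top_1_cutoff = iq_12.inv_cdf(0.99)

print(f"Jessica z = {jessica_z:.2f}, Mike z = {mike_z:.2f}")
print(f"Area left of z = 1.22:  {area_left:.4f}")
print(f"Area right of z = 1.22: {area_right:.4f}")
print(f"Proportion below 132:   {below_jessica:.4f}")
print(f"99th percentile cutoff: {top_1_cutoff:.2f}")
```

The software cutoff for the top 1% comes out slightly below the table answer of 137.28 (about 137.22), because the table method rounds the z value to 2.33 before solving for x. Either answer is acceptable at the precision used in these notes.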