5, 8, 13, 17, 22, 24, 25, 27, 29, 30
8, 10, 22, 24, 25, 25, 26, 27, 45, 72

Graph & Describe
22, 27, 33, 39, 57, 88, 110

Modified Boxplot
• Mild outliers are represented by shaded circles.
• Extreme outliers are represented by open circles.
• Whiskers extend only to the most extreme values that are not outliers.

Create a Modified Boxplot
56 54 75 64 76 22 81 78 66 87 62 80 68 72 59 45

An article on peanut butter reported the following scores (quality ratings on a scale of 0 to 100) for various brands. Construct a comparative stem-and-leaf plot and compare the graphs.
Creamy: 56 65 40 44 45 50 62 40 56 36 56 30 39 68 22 53 41 50 30
Crunchy: 62 62 80 53 52 47 75 50 56 42 34 62 47 42 40 36 34 75

20, 22, 23, 24, 24, 25, 25, 27, 35
• Are there any outliers?
• Draw a skeleton boxplot.
• Draw a modified boxplot.

Chebyshev's & The Empirical Rule
Describing data in terms of the standard deviation.
Test: Mean = 80, St. Dev. = 5

Chebyshev's Rule
The percent of observations that are within k standard deviations of the mean is at least
$100\left(1 - \frac{1}{k^2}\right)\%$

Facts about Chebyshev
• Applicable to any data set, whether it is symmetric or skewed.
• The actual percentage is often higher than the bound (for example, more than 75% within 2 standard deviations); the rule is a very conservative estimate.

# St. Dev. (k)    % w/in k st. dev. of mean, $100\left(1 - \frac{1}{k^2}\right)\%$
2                 ____
3                 ____
4                 ____
4.472             ____
5                 ____
10                ____

Interpret using Chebyshev
Test: Mean = 80, St. Dev. = 5
1. What percent are between 75 and 85?
2. What percent are between 60 and 100?

• Collect wrist measurements (in).
• Create a distribution.
• Find the st. dev. and mean.
• What percent is within 1 st. dev. of the mean?

Practice Problems
1. Using Chebyshev, solve the following problem for a distribution with a mean of 80 and a st. dev. of 10.
   a. At least what percentage of values will fall between 60 and 100?
   b. At least what percentage of values will fall between 65 and 95?

Normal Distributions
These are special density curves. They all have the same overall shape:
• Symmetric
• Single-peaked
• Bell-shaped
Each one is completely described by giving its mean (μ) and its standard deviation (σ).
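Before continuing with Normal curves, the Chebyshev bound from the slides above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `chebyshev_bound` is a hypothetical helper chosen for illustration. It evaluates the bound for k = 2 and k = 3, which give the 75% and 88.89% figures that appear in these notes.

```python
def chebyshev_bound(k):
    """Minimum percent of observations within k standard deviations of the mean.

    Hypothetical helper for illustration; implements 100 * (1 - 1/k^2).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # For k <= 1 the formula gives 0 or less, so the rule guarantees nothing.
    return max(0.0, 100 * (1 - 1 / k**2))


for k in (2, 3):
    print(f"k = {k}: at least {chebyshev_bound(k):.2f}%")
```

To use it on an interval such as "between 60 and 100" with mean 80, first convert the endpoints to a number of standard deviations: k = (upper endpoint - mean) / st. dev.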
We abbreviate it N(μ, σ).

Normal Curves….
• Changing the mean without changing the standard deviation simply moves the curve horizontally.
• The standard deviation controls the spread of a Normal curve.

Standard Deviation
• It's the natural measure of spread for Normal distributions.
• It can be located by eye on a Normal curve.
• The points at which the curve changes from concave down to concave up lie one standard deviation on either side of the mean.

Why is the Normal Curve Important?
• Normal curves are good descriptions for some real data, such as:
  • Test scores like SAT, IQ
  • Repeated careful measurements of the same quantity
  • Characteristics of biological populations (height)
• They are good approximations to the results of many kinds of chance outcomes.
• They are used in many statistical inference procedures.

Empirical Rule
Can only be used if the data can be reasonably described by a Normal curve. Approximately:
• 68% of the data is within 1 st. dev. of the mean
• 95% of the data is within 2 st. dev. of the mean
• 99.7% of the data is within 3 st. dev. of the mean

Empirical Rule
What percent do you think……
www.whfreeman.com/tps4e

Empirical Rule (68-95-99.7 Rule)
In the Normal distribution with mean (μ) and standard deviation (σ):
• ≈ 68% of the observations fall within 1σ of μ
• ≈ 95% of the observations fall within 2σ of μ
• ≈ 99.7% of the observations fall within 3σ of μ

The distribution of batting average (proportion of hits) for the 432 Major League Baseball players with at least 100 plate appearances in the 2009 season is approximately Normal, N(0.261, 0.034).
• Sketch a Normal density curve for this distribution of batting averages. Label the points that are 1, 2, and 3 standard deviations from the mean.
• What percent of the batting averages are above 0.329?
• What percent are between 0.227 and 0.295?

Scores on the Wechsler Adult Intelligence Scale (a standard IQ test) for the 20 to 34 age group are approximately Normally distributed, N(110, 25).
• What percent are between 85 and 135?
• What percent are below 185?
• What percent are below 60?

2. A sample of the hourly wages of employees who work in restaurants in a large city has a mean of $5.02 and a st. dev. of $0.09.
   a. Using Chebyshev's Rule, find the range in which at least 75% of the data will fall.
   b. Using the Empirical Rule, find the range in which approximately 68% of the data will fall.

The mean of a distribution is 50 and the standard deviation is 6. Using the Empirical Rule, find the percentage that will fall between 38 and 62.

A sample of the labor costs per hour to assemble a certain product has a mean of $2.60 and a standard deviation of $0.15. Using Chebyshev's Rule, find the values between which at least 88.89% of the data will lie.

Measures of Position
• Percentiles
• Z-scores

The following represents my results when playing an online sudoku game…at www.websudoku.com.
[Figure: sudoku results plotted on an axis from 0 min to 30 min]

Introduction
• A student gets a test back with a score of 78 on it.
• A 10th-grader scores 46 on the PSAT Writing test.
Isolated numbers don't always provide enough information…what we want to know is where we stand.

Where Do I Stand?
Let's make a dotplot of our heights from 58 to 78 inches.
• How many people in the class have heights less than yours?
• What percent of the students in the class have heights less than yours?
• This is your percentile in the distribution of heights.

Finishing….
• Calculate the mean and standard deviation.
• Where does your height fall in relation to the mean: above or below?
• How many standard deviations above or below the mean is it?
• This is the z-score for your height.

Let's discuss
• What would happen to the class's height distribution if you converted each data value from inches to centimeters? (2.54 cm = 1 in)
• How would this change of units affect the measures of center, spread, and location (percentile & z-score) that you calculated?

National Center for Health Statistics
Look at Clinical Growth Charts at www.cdc.gov/nchs

Percentiles
The rth percentile is the value such that r% of the observations in the data set fall at or below that value.
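A minimal Python sketch of this definition, offered as an illustration rather than as part of the original slides. It uses the 25 AP test scores from the stemplot example in these notes (Jenny's score of 86); the function name `percentile_of` and its `inclusive` flag are hypothetical. By default it counts observations strictly below the value, which is the convention the Jenny example uses; `inclusive=True` switches to the "at or below" wording of the definition.

```python
def percentile_of(value, data, inclusive=False):
    """Percent of observations below `value` (or at or below, if inclusive).

    Hypothetical helper for illustration only.
    """
    if inclusive:
        count = sum(1 for x in data if x <= value)
    else:
        count = sum(1 for x in data if x < value)
    return 100 * count / len(data)


# The 25 test scores read off the stemplot in these notes.
scores = [67, 72, 73, 73, 74, 75, 77, 77, 77, 78, 79, 79,
          80, 80, 81, 82, 83, 83, 83, 84, 85, 86, 89, 90, 93]

print(percentile_of(86, scores))  # 84.0, since 21 of the 25 scores are below 86
```

The two conventions differ only when the value itself appears in the data, so it is worth stating which one is in use before comparing percentiles.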
If you are at the 75th percentile, then 75% of the students had heights less than yours.

Test scores on last AP Test. Jenny made an 86. How did she perform relative to her classmates?

6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03

Her score was greater than 21 of the 25 observations. Since 21 of the 25, or 84%, of the scores are below hers, Jenny is at the 84th percentile in the class's test score distribution.

Find the percentiles for the following students, using the stemplot above….
• Mary, who earned a 74.
• Two students who earned scores of 80.

Cumulative Relative Frequency Table: Age of First 44 Presidents When They Were Inaugurated

Age      Frequency   Relative frequency   Cumulative frequency   Cumulative relative frequency
40-44     2           2/44 =  4.5%          2                     2/44 =   4.5%
45-49     7           7/44 = 15.9%          9                     9/44 =  20.5%
50-54    13          13/44 = 29.5%         22                    22/44 =  50.0%
55-59    12          12/44 = 27.3%         34                    34/44 =  77.3%
60-64     7           7/44 = 15.9%         41                    41/44 =  93.2%
65-69     3           3/44 =  6.8%         44                    44/44 = 100%

Cumulative Relative Frequency Graph:
[Figure: cumulative relative frequency graph. x-axis: Age at inauguration (40 to 70); y-axis: Cumulative relative frequency (%) (0 to 100)]

Interpreting…
• When does it slow down? Why?
• Why does it get very steep beginning at age 50?
• What percent were inaugurated before age 70?
• What's the IQR?
• Obama was 47….

Interpreting Cumulative Relative Frequency Graphs
Describing Location in a Distribution
[Figure: the same cumulative relative frequency graph, with the values 11 and 65 marked on the percent axis and 47 and 58 marked on the age axis]
Use the graph from page 88 to answer the following questions.
• Was Barack Obama, who was inaugurated at age 47, unusually young?
• Estimate and interpret the 65th percentile of the distribution.

What is the relationship between percentiles and quartiles?

Z-Score (standardized score)
• It represents the number of standard deviations from the mean.
• If it's positive, then the value is above the mean.
• If it's negative, then the value is below the mean.
• It standardizes measurements, since it's expressed in terms of st. deviations.

Discovery: Mean = 90, St. dev = 10
Find the z-score for:
• 80
• 95
• 73

Z-Score Formula
$z = \frac{x - \text{mean}}{\text{standard deviation}}$

Compare…using z-score.
History Test: Mean = 92, St. Dev = 3, My Score = 95
Math Test: Mean = 80, St. Dev = 5, My Score = 90

Compare
Math: mean = 70, x = 62, s = 6
English: mean = 80, x = 72, s = 3

Be Careful!
Being better is relative to the situation. What if I wanted to compare race times?

Find the following percentiles.

X           3      4      5      6      7      8      9      10
Rel. Freq   0.05   0.12   0.23   0.08   0.02   0.18   0.24   0.08
C.F.        0.05   0.17   0.40   0.48   0.50   0.68   0.92   1.00

1. 40th percentile?
2. 17th percentile?
3. 70th percentile?
4. 25th percentile?

Homework
Worksheet
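
The z-score formula from these notes can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `z_score` is a hypothetical helper, and the numbers are the Discovery values from the slides (mean 90, st. dev. 10).

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean.

    Hypothetical helper for illustration only.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Discovery example: Mean = 90, St. dev = 10
for x in (80, 95, 73):
    print(x, z_score(x, mean=90, sd=10))
```

The same function handles the "Compare" slides: compute one z-score per test, then compare the two results, keeping in mind that whether a larger z-score is better depends on the situation (for race times, lower is better).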