1.2 Describing Distributions with Numbers

Age of Presidents at Inauguration

President        Age    President        Age    President          Age
Washington       57     Buchanan         65     Coolidge           51
J. Adams         61     Lincoln          52     Hoover             54
Jefferson        57     A. Johnson       56     F. D. Roosevelt    51
Madison          57     Grant            46     Truman             60
Monroe           58     Hayes            54     Eisenhower         61
J. Q. Adams      57     Garfield         49     Kennedy            43
Jackson          61     Arthur           51     L. Johnson         55
Van Buren        54     Cleveland        47     Nixon              56
W. H. Harrison   68     B. Harrison      55     Ford               61
Tyler            51     Cleveland        55     Carter             52
Polk             49     McKinley         54     Reagan             69
Taylor           64     T. Roosevelt     42     Bush               64
Fillmore         50     Taft             51     Clinton            46
Pierce           48     Wilson           56     Bush               54
                        Harding          55     Obama              47

Describe the histogram in terms of center, shape, spread, and outliers.

[Figure: histogram "Ages of Presidents at Inauguration". x-axis: Age at Inauguration, in classes 40-44, 45-49, 50-54, 55-59, 60-64, 65-69. y-axis: Number of Presidents, 0 to 14.]

Mean:
• The most common measure of center (a.k.a. the average)
• Denoted by x̄
• The mean is considered non-resistant because it is sensitive to extreme values, which may or may not be outliers.
• On the calculator, use 1-Var Stats to get the mean.

Median:
• The middle value of the set of data, once the observations are put in order
• Denoted by M
• If the number of observations is odd, the median is the center observation.
• If the number of observations is even, take the mean of the two center observations.
• The median is resistant to extreme values.
• On the calculator, use 1-Var Stats to get the median.
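The two definitions above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes, which rely on the calculator's 1-Var Stats. The list `ages` is transcribed from the table of presidents, and the helper name `median_by_hand` is made up for this example.

```python
from statistics import mean, median

# Ages at inauguration, transcribed from the table (Washington through Obama).
ages = [
    57, 61, 57, 57, 58, 57, 61, 54, 68, 51, 49, 64, 50, 48,      # Washington .. Pierce
    65, 52, 56, 46, 54, 49, 51, 47, 55, 55, 54, 42, 51, 56, 55,  # Buchanan .. Harding
    51, 54, 51, 60, 61, 43, 55, 56, 61, 52, 69, 64, 46, 54, 47,  # Coolidge .. Obama
]


def median_by_hand(values):
    """Median via the odd/even rule in the notes (hypothetical helper name)."""
    ordered = sorted(values)          # the rule only works on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the single center observation
        return ordered[mid]
    # even count: the mean of the two center observations
    return (ordered[mid - 1] + ordered[mid]) / 2


x_bar = mean(ages)                    # sum of the observations divided by their count
M = median_by_hand(ages)

print(f"n = {len(ages)}, mean = {x_bar:.1f}, median = {M}")
```

With 44 observations the count is even, so the median is the mean of the 22nd and 23rd ordered values. The standard-library `statistics.median` applies the same rule and can be used to check the hand-written helper.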
Example 1: Find x̄ and M for the set of data.

Number of Hysterectomies performed by a male doctor in one year
20 25 25 27 28 31 33 34 36 37 44 50 59 85 86

x̄ = 41.3    M = 34

Example 2: Find x̄ and M for the set of data.

Number of Hysterectomies performed by a female doctor in one year
5 7 10 14 18 19 25 29 31 33

x̄ = 19.1    M = 18.5

Comparison of x̄ and M
• If the distribution is:
  ◦ Symmetrical: x̄ and M are very similar (close in value)
  ◦ Skewed: x̄ is farther out in the tail than the median
  ◦ Exactly symmetrical: x̄ and M are exactly the same

Measuring Spread: Range & the Quartiles
• Range = Largest Value − Smallest Value
• Quartiles:
  ◦ Q1, Lower Quartile: median of the observations that lie below the median in the ordered list
  ◦ Q2: the median
  ◦ Q3, Upper Quartile: median of the observations that lie above the median in the ordered list
• IQR, Interquartile Range: IQR = Q3 − Q1
• Outliers fall more than 1.5 × IQR below Q1 or above Q3.
** 1-Var Stats on your calculator gives them all to you.

5-Number Summary
• The 5-number summary consists of the smallest and largest observations from a set of data along with Q1, M, and Q3.
• The 5-number summary leads to a new graph called the box-and-whisker plot (boxplot).
• Boxplots are best used for comparing two sets of data.

Example 3: Find any outliers for the set of data.

Number of Hysterectomies performed by a male doctor in one year
20 25 25 27 28 31 33 34 36 37 44 50 59 85 86

• IQR = Q3 − Q1 = 50 − 27 = 23
• 1.5 × IQR = 34.5
• Q3 + 34.5 = 84.5    Q1 − 34.5 = −7.5
• Therefore, the observations 85 and 86 (both above 84.5) are outliers for the set of data. No observation falls below −7.5.

Example 4: Create a boxplot for each set of data. What can you conclude?

Number of Hysterectomies performed by a male doctor in one year
20 25 25 27 28 31 33 34 36 37 44 50 59 85 86

[Figure: boxplot of the male doctor data, labeled Min, Q1, M, Q3, Max]

Number of Hysterectomies performed by a female doctor in one year
5 7 10 14 18 19 25 29 31 33

[Figure: boxplot of the female doctor data, labeled Min, Q1, M (18.5), Q3, Max]

Standard Deviation
• Measures spread by looking at how far the observations are from the mean.
• Denoted by s
• ** 1-Var Stats / Sx

Properties of Standard Deviation
• s measures spread about the mean and should be used only when the mean is used.
• As s gets larger, the observations are more spread out from the mean.
• s is highly influenced by outliers.

Example 5: Find the standard deviation for the set of data.

Number of Hysterectomies performed by a male doctor in one year
20 25 25 27 28 31 33 34 36 37 44 50 59 85 86

s = 20.6

*** The 5-number summary is usually better than the mean and standard deviation for describing a skewed distribution. Use the mean and standard deviation for data that are reasonably symmetric.
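The calculator steps in Examples 3 to 5 can be reproduced with a short Python sketch. It is offered only as an illustration under stated assumptions: the function names `five_number_summary` and `iqr_outliers` are invented for this example, the quartiles follow the rule in these notes (the median of the observations on each side of the median), and `statistics.stdev` is assumed to correspond to the calculator's Sx because it divides by n − 1.

```python
from statistics import median, stdev

# Data from the examples in these notes.
male = [20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86]
female = [5, 7, 10, 14, 18, 19, 25, 29, 31, 33]


def five_number_summary(values):
    """Return (min, Q1, M, Q3, max); hypothetical helper for illustration."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # observations below the median's position
    upper = ordered[half + (n % 2):]    # observations above it (skip the middle one if n is odd)
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


def iqr_outliers(values):
    """Observations more than 1.5 x IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


for label, data in (("male", male), ("female", female)):
    print(label, "5-number summary:", five_number_summary(data))
    print(label, "outliers:", iqr_outliers(data))
    # Sample standard deviation (divides by n - 1).
    print(label, "s =", round(stdev(data), 1))
```

For the male doctor data this reproduces Q1 = 27, Q3 = 50, the outliers 85 and 86, and s = 20.6 from the worked examples. Other software may use a different quartile convention, so quartiles from a spreadsheet or statistics package can differ slightly from the hand rule used here.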