CLASSWORK 1.9    NAME:    PERIOD:

EXAMINING THE EFFECT OF OUTLIERS

In this worksheet you will investigate how an outlier affects the mean and median of a set of data. By the end of the lesson you will be able to explain which measure of central tendency most accurately represents a set of data with an outlier.

DATA SET 1: Rushing Yards Gained by San Diego Chargers Football Players

The table below shows the rushing yards gained by San Diego Chargers football players during the 2006 season.

Player                 Rushing Yards
LaDainian Tomlinson    1815
Michael Turner         502
Lorenzo Neal           140
Philip Rivers          49
Andrew Pinnock         25
Erick Parker           19
Vincent Jackson        16
Charlie Whitehurst     13
Keenan McCardell       8
Brandon Manumaleuna    1
Billy Volek            -3
Mike Scifres           -7

Which player is an outlier in the data?

How many rushing yards did he have?

CALCULATIONS: Calculate the mean and median for the rushing yards, but DO NOT include the outlier in your calculations. Show your work below.

Mean

Mean =

Median

Median =

Now, recalculate the mean and median for the rushing yards, but this time INCLUDE the outlier in your calculations. Show your work below.

Mean

Mean =

Median

Median =

SUPPORTING QUESTIONS: Answer all supporting questions in complete sentences and justify your answers by referring back to your calculations.

1) Look at your calculations for the mean and median when you DID NOT include the outlier.
   How many players had a rushing total that was less than the mean?
   How many players had a rushing total that was greater than the mean?
   How many players had a rushing total that was less than the median?
   How many players had a rushing total that was greater than the median?

2) Look at your calculations for the mean and median when you DID include the outlier.
   How many players had a rushing total that was less than the mean?
   How many players had a rushing total that was greater than the mean?
   How many players had a rushing total that was less than the median?
   How many players had a rushing total that was greater than the median?

3) Look at your answers for questions #1 and #2. If you wanted to accurately represent the number of yards that a TYPICAL San Diego Charger gained rushing, should you use the mean or the median to report the data? Justify your answer with supporting details.

DATA SET 2: Populations of the 10 Largest Cities in Maryland

The table below shows the populations of the 10 largest cities in Maryland.

City                 Population
Baltimore            651,154
Columbia             88,254
Silver Spring        76,540
Dundalk              62,306
Wheaton-Glenmont     57,694
Ellicott City        56,397
Germantown           55,419
Bethesda             55,277
Frederick            52,816
Gaithersburg         52,455

Which city is an outlier in the data?

What is the population?

CALCULATIONS: Calculate the mean and median for the populations, but DO NOT include the outlier in your calculations. Show your work below.

Mean

Mean =

Median

Median =

Now, recalculate the mean and median for the populations, but this time INCLUDE the outlier in your calculations. Show your work below.

Mean

Mean =

Median

Median =

Finally, calculate how the outlier affected your mean and median. Calculate the difference between the second calculations and the first calculations.

Mean (Mean with outlier – Mean without outlier)

Difference Between Mean Populations =

Median (Median with outlier – Median without outlier)

Difference Between Median Populations =

SUPPORTING QUESTIONS: Answer all supporting questions in complete sentences and justify your answers by referring back to your calculations.

1) Look at your calculations for the difference between the two mean populations. Did the outlier have a significant effect on the value of the mean population? If so, what was the effect?

2) Look at your calculations for the difference between the two median populations. Did the outlier have a significant effect on the value of the median population? If so, what was the effect?
3) Look at your answers for questions #1 and #2. Summarize how an outlier affects the mean and median of a set of data.

DATA SET 3: Gross Domestic Product (GDP) of the 10 Wealthiest Countries

Record the name of each country and its GDP. Report the GDP in billions. For example (United States), $11,667,515,000,000.00 would be 11,667 billion dollars. For another example (Spain), $991,442,000,000.00 would be 991 billion dollars.

Country              GDP (in billions of dollars)

Which country is an outlier in the data?

What is the GDP of that country?

CALCULATIONS: Calculate the mean and median for the GDP, but DO NOT include the outlier in your calculations. Show your work below.

Mean

Mean =

Median

Median =

Now, recalculate the mean and median for the GDP, but this time INCLUDE the outlier in your calculations. Show your work below.

Mean

Mean =

Median

Median =

SUPPORTING QUESTIONS: Answer all supporting questions in complete sentences and justify your answers by referring back to your calculations.

1) Look at your calculations for the mean and median when you DID NOT include the outlier.
   How many countries had a GDP less than the mean GDP?
   How many countries had a GDP greater than the mean GDP?
   How many countries had a GDP less than the median GDP?
   How many countries had a GDP greater than the median GDP?

2) Look at your calculations for the mean and median when you DID include the outlier.
   How many countries had a GDP less than the mean GDP?
   How many countries had a GDP greater than the mean GDP?
   How many countries had a GDP less than the median GDP?
   How many countries had a GDP greater than the median GDP?

3) Look at your answers for questions #1 and #2. When the GDP of the United States is included in the calculations, which measure of central tendency (mean or median) most accurately represents the GDP of a TYPICAL country in the top ten?
CONCLUDING QUESTIONS: Now that you have examined three sets of data, you are ready to make some general conclusions. Answer each question in complete sentences and justify your answer by referring back to calculations you made with the data sets.

1) When there is an outlier in a data set, how is the value of the mean affected? How is the value of the median affected? Does the outlier have a greater effect on the mean or the median? Remember to justify your answer with examples from your calculations.

2) You want to accurately represent a typical number in a data set. If there is an outlier in the data, which measure of central tendency (mean or median) should you use to represent the data?

BONUS: In all our data sets the outlier was significantly higher than the rest of the data points. An outlier can also be a data point that is significantly lower than the rest of the data. How do you think an outlier that is lower than the rest of the data will affect the mean? How will it affect the median?
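
CHECKING YOUR WORK: Once the calculations have been done by hand, the with-outlier and without-outlier comparison used throughout this worksheet can be sketched in Python. This is only a minimal sketch and is not part of the original classwork. The function name compare_with_and_without and the sample numbers are hypothetical, chosen for illustration; the only library calls are mean and median from Python's standard statistics module.

```python
from statistics import mean, median


def compare_with_and_without(values, outlier):
    """Return the mean and median computed without and with the outlier.

    `values` is the full data set (outlier included); `outlier` is the
    single value to leave out for the "without" calculations.
    """
    without = list(values)   # copy so the caller's list is not changed
    without.remove(outlier)  # drop one occurrence of the outlier
    return {
        "mean_without": mean(without),
        "median_without": median(without),
        "mean_with": mean(values),
        "median_with": median(values),
    }


# Hypothetical numbers, NOT taken from any data set in this worksheet.
# Here 100 plays the role of the outlier.
sample = [2, 4, 6, 8, 100]
result = compare_with_and_without(sample, 100)

for name, value in result.items():
    print(name, "=", value)
```

To check one of the worksheet's data sets, replace the sample list with the values from the table and pass the value you identified as the outlier. The differences asked for in Data Set 2 are then result["mean_with"] - result["mean_without"] and result["median_with"] - result["median_without"].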