ECO 72 - INTRODUCTION TO ECONOMIC STATISTICS
Topic 2
Measures of Central Tendency

These slides are copyright © 2003 by Tavis Barr. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).

Measures of Central Tendency
This chapter looks at three different concepts of how we describe a “typical” element of a data set.
● Mean
● Median
● Mode
There is no one “best” concept for all cases; we will discuss the advantages and disadvantages of each.

Mean
● The mean is what is most commonly called the average.
● If a population is finite, of size N, we can write the population mean as

\[
\frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}
\]

EXAMPLE:
● There are three countries in North America (N = 3).
● Their land areas are:

Country   Land area (km²)
Canada    9,093,507
Mexico    1,923,040
U.S.      9,161,923
Total     20,178,470

● Average land area: 20,178,470/3 = 6,726,157 km²
Source: 2005 CIA World Factbook

Mean – sample mean
● For a sample of size n, we can write the sample mean as

\[
\frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}
\]

Example:
● Ten people are asked how many hours of TV they watched last night.
● Their responses are 1, 2, 0.5, 0, 4, 0, 2, 1.5, 0, 3.
● Mean: (1 + 2 + 0.5 + 0 + 4 + 0 + 2 + 1.5 + 0 + 3)/10 = 14/10 = 1.4

Advantages of the Sample Mean
1. It takes all values in the sample into account.
2. It is unique: each sample and each population has only one mean.
3. The deviations of the observations from the mean sum to zero, so the mean acts as a “balancing point.”

Disadvantages of the Mean
1. It exists only for quantitative data.
   ● What is the mean of good, fair, and poor?
   ● Of red, yellow, and blue?
2. It can be strongly affected by outliers.
   ● Example: In Whoville, there are 10 people who earn $10,000 a year and one person who earns $1,000,000.
   ● What is the mean? Is it a typical income?

Weighted Mean
● Weighted means occur when we wish to place more importance on some observations than on others.
● They require a weighting variable that indicates the importance to place on a given observation.
● We denote the original variable by X_i and the weighting variable by W_i.

Weighted Mean – Formula
Formula for the weighted mean:

\[
\frac{\sum_{i=1}^{n} W_i X_i}{\sum_{i=1}^{n} W_i} = \frac{W_1 X_1 + W_2 X_2 + \cdots + W_n X_n}{W_1 + W_2 + \cdots + W_n}
\]

Weighted Mean – Example
Life expectancy in a group of northern African countries:

Country    Life Expectancy   Population (mil)   LE x Popn
Algeria    68                31                 2108
Egypt      66                70                 4620
Libya      73                5                  365
Morocco    67                29                 1943
Nigeria    41                126                5166
Sudan      55                31                 1705
Tunisia    72                10                 720
Sum        442               302                16627

Mean: 442/7 = 63.14
Weighted Mean: 16627/302 = 55.06

Median
● Looks at the midpoint of the data when they are sorted from lowest to highest.
● If there is an even number of observations, take the average of the two middle values.
● Example: Hours of television watched, sorted: 0, 0, 0, 0.5, 1, 1.5, 2, 2, 3, 4
● Median: (1 + 1.5)/2 = 1.25

Advantages of Median
1. Works on ordered data as well as quantitative data.
   ● Example: 20 Opinions of Hillwood Cafe
     Excellent: 3
     Good: 6
     Fair: 7
     Poor: 4
   ● Pick the midpoint from: Poor, Poor, Poor, Poor, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Good, Good, Good, Good, Good, Good, Excellent, Excellent, Excellent.

Advantages of Median (cont'd)
2. Median is largely unaffected by outliers:
   – There are 11 people in Whoville; ten make $10,000 per year and one makes $1 million per year. What is the median income?
3. Median is unique: a sample has only one.

Disadvantage of Median
● Disadvantage: Not affected by changes in the data away from the center.
   – Example: In Whoville, what would happen to the median income if the millionaire suddenly started making only $25,000?

Mode
● Asks which value is observed most often.
● Example: 20 Opinions of Hillwood Cafe
  Excellent: 3
  Good: 6
  Fair: 7
  Poor: 4
  Here, “Fair” is the most common response.

Mode – Advantage
● Advantage: Works on categorical data.
● Example: Ethnic groups in Ethiopia (millions)

Group             Population (millions)
Amharic/Tigray    22.6
Oromo             28.2
Shankella         4.2
Sidamo            6.3
Somali            4.2
Other             4.9

Source: 2005 CIA World Factbook

Disadvantages of Mode
● May not be unique.
● Consider the following sample of 10 people:

Favorite Flavor of Ice Cream   # of people
Vanilla                        1
Chocolate                      4
Strawberry                     4
Coffee                         1

Disadvantages of Mode (cont'd)
● May not even exist in a meaningful way for continuous data. Consider the life expectancy data:

Country    Life Expectancy
Algeria    68
Egypt      66
Libya      73
Morocco    67
Nigeria    41
Sudan      55
Tunisia    72

One could say that every value is a mode, or that none is.

Disadvantages of Mode (cont'd)
● May not lie near the center of the data at all in ordered data. Consider our answers about how many hours of television people watched last night: 0, 0, 0, 0.5, 1, 1.5, 2, 2, 3, 4
The modal response (0 hours) is not a typical one.

Geometric Mean
● Used when looking at growth rates.
● For example, economic growth, interest rates, population growth.
● Asks what growth rate, if it were constant each year, would get you from the starting value to the ending value.
● We won't use it in this class, but keep it in mind if you're working with time-series data such as financial data.
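
The idea of a constant growth rate that gets you from the starting value to the ending value can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names `average_growth_rate` and `geometric_mean_rate` and the series (100 growing by 50%, then shrinking by 20%) are hypothetical, chosen only to show how the geometric mean differs from the ordinary mean of the growth rates.

```python
def average_growth_rate(start_value, end_value, periods):
    """Constant per-period growth rate that takes start_value to end_value."""
    if start_value <= 0 or end_value <= 0 or periods <= 0:
        raise ValueError("values and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


def geometric_mean_rate(rates):
    """Geometric mean of per-period growth rates (0.10 means 10%)."""
    product = 1.0
    for r in rates:
        product *= 1.0 + r  # convert each rate into a growth factor
    return product ** (1.0 / len(rates)) - 1.0


# Hypothetical series: 100 grows by 50% to 150, then shrinks by 20% to 120.
rates = [0.50, -0.20]

geo = geometric_mean_rate(rates)            # about 0.0954
arith = sum(rates) / len(rates)             # 0.15, overstates the growth
implied = average_growth_rate(100, 120, 2)  # same as geo, about 0.0954

print(f"geometric mean growth rate:  {geo:.4f}")
print(f"arithmetic mean growth rate: {arith:.4f}")
print(f"rate implied by 100 -> 120:  {implied:.4f}")
```

Growing 100 at the geometric mean rate for two periods lands on 120, the actual ending value, while growing it at the arithmetic mean of 15% would give 132.25. That gap is why the geometric mean is the one to use for growth rates.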