Lecture 4
Data analysis
Measure of Dispersion

Measures of location (mean, mode and median) describe one characteristic of data, namely central tendency. In market research, however, we also wish to know about another characteristic of data: the variation or scatter among the values. One can easily visualize two variables having identical central tendencies but very different spreads. For example, suppose the annual spending of two groups of households on tea looks like this:

Members    1      2      3      4      5
Group A    3000   1500   2100   1800   2800
Group B    3600   972    1319   2700   2609

The mean of both groups is $ 2240, but the variation within the two groups is not the same. This information is important to a tea manufacturer when making a distribution plan. The different measures of dispersion are described below.

Range:
The range is the difference between the largest and smallest values in the data. In the previous example, the range of Group A is $ 3000 – $ 1500 = $ 1500, while the range of Group B is $ 3600 – $ 972 = $ 2628.

Standard Deviation and Variance
The distance between the mean and an observed value is the deviation from the mean. When we square these deviations and take the mean of the squared deviations, the result is called the variance. When the data are widely scattered the variance is large; when the data are clustered around the mean the variance is small.

©St. Paul’s University

Sample Variance
The sample variance is the sum of the squared deviations of the values from the mean \(\bar{X}\), divided by n – 1:

\[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

Standard Deviation of Sample
The standard deviation of a sample, “s”, is the square root of the sample variance. As a formula:

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]

Example
ABC Real Estate in Karachi wants to know how long it takes for its listed homes to sell. The Director of the firm took a sample of 10 homes listed last year and recorded the number of weeks each house took to be sold.
The data revealed that the sampled homes took the following numbers of weeks to sell (rounded to the nearest whole week): 21, 6, 9, 23, 1, 10, 8, 11, 5, 7. What were the mean and standard deviation of the time taken to sell the homes listed with ABC Real Estate last year?

Solution
The values sum to 101, so the mean is 101 / 10 = 10.1 weeks. The squared deviations from the mean sum to 426.9, so the sample variance is 426.9 / 9 ≈ 47.43 and the sample standard deviation is s ≈ 6.89 weeks.

Rules of Standard Deviation
For data that are approximately normally distributed, about 68% of the population falls within ±1 standard deviation of the mean, about 95% falls within ±2 standard deviations, and about 99.7%, or virtually all of the population, falls within ±3 standard deviations.

Example
Let us take another example to illustrate the concepts. A bank branch located in the heart of the city developed a process to serve customers during the lunch hour (from 1:00 to 2:00 p.m.). The waiting time of each customer (from the time the customer entered the bank until the transaction was completed) was noted in minutes during the lunch hour for a week. A random sample of 12 customers was selected and their waiting times were recorded as follows:

4, 5, 3, 5, 6, 2, 7, 2, 4, 3, 2

Compute the arithmetic mean, median, mode, variance and standard deviation. A customer walks into the branch and asks the manager how long he can expect to wait. The manager replies, “Almost certainly not more than six minutes.” What do you say about the accuracy of this statement?

Solution
Median: arrange the values in descending order: 7, 6, 5, 5, 5, 4, 4, 3, 3, 2, 2, 2
Median = (4 + 4) / 2 = 4
Mode = 5
The manager's statement is correct because the range of ± one standard deviation around the mean is (4 – 1.82) to (4 + 1.82), that is, 2.18 to 5.82 minutes. The majority of customers take not more than 5.82 minutes.

Coefficient of Variation
The coefficient of variation (CV) is the ratio of the standard deviation to the mean, expressed by the following formula:

\[ CV = \frac{s}{\bar{X}} \times 100\% \]

CV is expressed as a percentage and is useful when the variable is measured on a ratio scale. It is a useful measure of relative dispersion when means are positive, and it allows sets of numbers with different magnitudes to be compared.
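The two worked examples above can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the helper name `describe` is hypothetical. The raw list of bank waiting times shows only 11 of the 12 values, so the sketch assumes the 12 values read from the sorted list in the solution (taking its final entries as three 2s). On those assumed values the sample standard deviation comes out near 1.65 rather than the 1.82 quoted, and both 2 and 5 occur three times, so the figures in the text may rest on data that did not survive intact.

```python
from statistics import mean, median, multimode, stdev, variance


def describe(sample):
    """Return the sample statistics used in this lecture as a dict."""
    return {
        "n": len(sample),
        "mean": mean(sample),
        "median": median(sample),
        "modes": sorted(multimode(sample)),  # every most-frequent value
        "range": max(sample) - min(sample),
        "variance": variance(sample),  # sample variance, divides by n - 1
        "std_dev": stdev(sample),      # square root of the sample variance
    }


# Weeks to sell, ABC Real Estate example (10 homes).
weeks_to_sell = [21, 6, 9, 23, 1, 10, 8, 11, 5, 7]

# Waiting times in minutes. ASSUMPTION: these are the 12 values of the
# sorted list in the solution, because the raw list shows only 11 values.
waiting_minutes = [7, 6, 5, 5, 5, 4, 4, 3, 3, 2, 2, 2]

homes = describe(weeks_to_sell)
bank = describe(waiting_minutes)

for label, stats in (("homes", homes), ("bank", bank)):
    print(label)
    for name, value in stats.items():
        print(f"  {name}: {value}")
```

`variance` and `stdev` use the n – 1 divisor, matching the sample formulas in this lecture; `pvariance` and `pstdev` would divide by n instead.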
Example:
The standard deviations of the closing prices of two shares, X and Y, were $ 5 and $ 50, and their mean closing prices during a week were $ 10 and $ 1000 respectively. In which share should we invest?

Solution
If we look only at the standard deviation, we might decide to invest in share X because it appears less volatile. But when we look at the mean prices and work out the ratio of standard deviation to mean, i.e. the coefficient of variation, the picture is different:

\[ CV_X = \frac{5}{10} \times 100\% = 50\%, \qquad CV_Y = \frac{50}{1000} \times 100\% = 5\% \]

Now we change the decision in favor of Y, because the fluctuation in Y's price relative to its mean is much smaller than that of X. So the coefficient of variation is a good measure for comparing riskiness in this case.
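The comparison of the two shares can be reproduced in a few lines. This is a minimal sketch; the function name `coefficient_of_variation` is hypothetical, and the check for a positive mean reflects the condition stated in the text rather than any library behaviour.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the CV as a percentage; assumes a positive mean."""
    if mean <= 0:
        raise ValueError("CV is only used here for a positive mean")
    return 100 * std_dev / mean


# Standard deviation and mean closing price of each share, from the example.
cv_x = coefficient_of_variation(5, 10)
cv_y = coefficient_of_variation(50, 1000)

print(f"CV of X: {cv_x:.0f}%")  # CV of X: 50%
print(f"CV of Y: {cv_y:.0f}%")  # CV of Y: 5%
```

The share with the lower CV has the smaller fluctuation relative to its own price level, which is why Y is preferred even though its standard deviation is larger in absolute terms.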