Lecture 4
Data analysis
Measure of Dispersion
Measures of location (mean, mode and median) describe one characteristic of data, i.e. central
tendency. However, in market research we also wish to know about another characteristic of the data:
the variation or scatter among the values. One can easily visualize two variables having identical
central tendencies but very different spreads.
For example, suppose the annual spending (in $) of two groups of households on tea looks like this:

Member    Group A    Group B
1         3000       3600
2         1500       972
3         2100       1319
4         1800       2700
5         2800       2609

The mean of both groups is $2240, but the variation within the two groups is not the same. This
information is important to a tea manufacturer when making a distribution plan.
There are different measures of dispersion which are described below.
Range: The range is the difference between the largest and smallest values in the
data. In the previous example, the range of the data in Group A is
$3000 – $1500 = $1500,
while the range of Group B is
$3600 – $972 = $2628.
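As a quick check of the figures above, the following Python sketch recomputes the mean and the range of each group. The names `group_a`, `group_b` and `value_range` are illustrative choices, not part of the lecture.

```python
from statistics import mean

# Annual tea spending (in $) of the five households in each group,
# copied from the example above.
group_a = [3000, 1500, 2100, 1800, 2800]
group_b = [3600, 972, 1319, 2700, 2609]


def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


for name, data in (("Group A", group_a), ("Group B", group_b)):
    print(name, "mean:", mean(data), "range:", value_range(data))
```

Both groups share the same mean, while the range of Group B is considerably wider than that of Group A.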
Standard Deviation and Variance
The distance between an observed value and the mean is its deviation from the mean. When we
square these deviations and take the mean of the squared deviations, the result is called the variance.
When the data are scattered to a great extent the variance is large, but if the data are clustered
around the mean, the variance is small.
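The definition above (the mean of the squared deviations) can be sketched in a few lines of Python. The function name `mean_squared_deviation` is an assumption made for illustration; the data are the two tea-spending groups from the earlier example.

```python
def mean_squared_deviation(values):
    """Variance as defined above: the mean of the squared deviations."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / len(values)


# Tea-spending data (in $) from the earlier example.
group_a = [3000, 1500, 2100, 1800, 2800]
group_b = [3600, 972, 1319, 2700, 2609]

var_a = mean_squared_deviation(group_a)
var_b = mean_squared_deviation(group_b)
print("Group A variance:", var_a)
print("Group B variance:", var_b)
```

Group B, whose values are more widely scattered, comes out with the larger variance even though both groups have the same mean.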
©St. Paul’s University
Sample Variance
The sample variance s² is the sum of the squared deviations of the values from the sample mean X̄, divided by n – 1:

$$ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} $$
Standard Deviation of Sample
The standard deviation of a sample, “s”, is the square root of the sample variance. As a
formula:

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} $$
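A minimal from-scratch sketch of the sample variance and sample standard deviation, following the verbal definition (squared deviations from the mean, divided by n – 1, then the square root). The function names are hypothetical, and the Group A tea-spending figures are reused only as convenient test data. The result is cross-checked against Python's standard `statistics` module.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Group A tea spending (in $), reused here purely as example data.
group_a = [3000, 1500, 2100, 1800, 2800]

print("sample variance:", sample_variance(group_a))
print("sample std dev :", round(sample_std(group_a), 2))

# Cross-check against the standard library implementations.
print(math.isclose(sample_variance(group_a), statistics.variance(group_a)))
print(math.isclose(sample_std(group_a), statistics.stdev(group_a)))
```

Dividing by n – 1 rather than n is what distinguishes the sample variance from the plain mean of squared deviations.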
Example
ABC Real Estate in Karachi wants to know how long its listed homes take to sell. The
Director of the firm took a sample of 10 homes listed last year and recorded the number of weeks each
house took to sell. The sample homes took the following numbers of weeks (rounded
to the nearest whole week): 21, 6, 9, 23, 1, 10, 8, 11, 5, 7.
What were the mean and standard deviation of the time taken to sell the homes listed with ABC
Real Estate last year?
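One way to work this example is with Python's standard `statistics` module, whose `stdev` function uses the n – 1 (sample) formula. This is a sketch for checking the arithmetic, not the lecture's own worked solution; the variable names are illustrative.

```python
from statistics import mean, stdev

# Weeks taken to sell each of the 10 sampled homes (from the example).
weeks = [21, 6, 9, 23, 1, 10, 8, 11, 5, 7]

avg_weeks = mean(weeks)   # arithmetic mean
s_weeks = stdev(weeks)    # sample standard deviation (divides by n - 1)

print(f"mean = {avg_weeks:.1f} weeks")
print(f"s    = {s_weeks:.2f} weeks")
```

With these data the mean is 10.1 weeks and the sample standard deviation is roughly 6.89 weeks.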
Rules of Standard Deviation
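For data that are approximately normally distributed:
About 68% of the population falls within ±1 standard deviation of the mean.
About 95% of the population falls within ±2 standard deviations of the mean.
About 99.7%, or virtually all of the population, falls within ±3 standard deviations of the mean.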
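These percentages are the ones that hold for a normal distribution. Assuming that is the intended setting, they can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist()


def share_within(k):
    """Share of a normal population lying within +/- k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within +/-{k} sd: {share_within(k):.1%}")
```

For data that are strongly skewed or otherwise far from normal, the actual shares can differ noticeably from these figures.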
Example
Let us take another example to illustrate the concepts.
A bank branch located in the heart of the city developed a process to serve customers during
the lunch hour (from 1:00 to 2:00 p.m.). The waiting time of each customer (from the time the customer
entered the bank until the transaction was completed) was noted in minutes during the
lunch hour for a week. A random sample of 12 customers was selected, and the waiting times of
these individuals were recorded as follows.
4,5,3,5,6,2,7,2,4,3,2
Compute the arithmetic mean, median, mode, variance and standard deviation.
A customer walks into the branch and asks the manager how long he should expect to wait.
The manager replies, “Almost certainly not more than six minutes.” What do you say
about the accuracy of this statement?
Solution
Median: Arrange in descending order
7, 6, 5, 5, 5, 4, 4, 3, 3, 2, 2, 2
Median = (4 + 4) / 2 = 4
Mode = 5
The manager's statement is correct because the range of ± one standard deviation around the mean is (4 – 1.82) to (4 + 1.82), i.e. 2.18 to 5.82 minutes.
The majority of customers wait no more than 5.82 minutes.
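The interval used in this answer can be reproduced with a short Python sketch. It takes the mean of 4 minutes and the standard deviation of 1.82 minutes as quoted in the solution; the function name `sd_interval` and the rounding to two decimals are assumptions made for illustration.

```python
# Mean and standard deviation of the waiting time (minutes), as quoted
# in the solution above.
mean_wait = 4.0
sd_wait = 1.82


def sd_interval(mean, sd, k=1):
    """Return (mean - k*sd, mean + k*sd), rounded to two decimals."""
    return (round(mean - k * sd, 2), round(mean + k * sd, 2))


for k in (1, 2):
    low, high = sd_interval(mean_wait, sd_wait, k)
    print(f"+/-{k} sd: {low} to {high} minutes")
```

The manager's six-minute figure sits just above the upper end of the ±1 standard deviation interval and well inside the ±2 interval.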
Coefficient of Variation
Coefficient of variation (CV) is the ratio of the standard deviation to the mean, expressed by the
following formula:

$$ CV = \frac{s}{\bar{X}} \times 100\% $$
CV is expressed as a percentage and is useful when the variable is measured on a ratio scale.
It is a useful measure of relative dispersion when means are positive, and it allows sets
of numbers with different magnitudes to be compared.
Example:
The standard deviations of the closing prices of two shares, X and Y, were $5 and $50, and
their mean closing prices during a week were $10 and $1000 respectively. In which share should we
invest?
Solution
If we look only at the standard deviation, we might decide to invest in share X because
it appears less volatile. But when we look at the mean prices and work out the ratio of standard
deviation to mean, i.e. the coefficient of variation, the picture is different: the CV of X is
5/10 = 50%, while the CV of Y is 50/1000 = 5%.
We now change the decision in favor of Y, because the fluctuation in Y's price relative to its mean is much
smaller than that of share X. The coefficient of variation is therefore a good measure for comparing riskiness in this
case.
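The comparison of the two shares can be checked with a small Python sketch. The function name `coefficient_of_variation` is illustrative, and the guard against non-positive means reflects the earlier remark that CV is useful when means are positive.

```python
def coefficient_of_variation(sd, mean):
    """CV in percent: standard deviation divided by mean, times 100."""
    if mean <= 0:
        raise ValueError("CV is only meaningful for a positive mean")
    return 100 * sd / mean


# Standard deviations and mean closing prices (in $) from the example.
cv_x = coefficient_of_variation(sd=5, mean=10)
cv_y = coefficient_of_variation(sd=50, mean=1000)

print(f"CV of X: {cv_x:.0f}%")
print(f"CV of Y: {cv_y:.0f}%")
```

Share Y has the larger standard deviation in absolute terms but the much smaller coefficient of variation, which is why the decision changes.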