Lab 3

E370 2013 Spring
Chapter 3 Summary Statistics
Summarizing - Distribution
Measures of Central Location/Central Tendency
Mean, Median, Mode
Measures of Variability/Dispersion
Range, Standard Deviation, Variance, Coefficient of Variation
Measures of Shape
Skewness (e.g. Pearson’s 2nd Skewness)
Three Measures of Central Tendency
Statistic: Mean
  Formula: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
  Excel Formula: =AVERAGE(Data)
  Pro: Familiar and uses all the sample information.
  Con: Influenced by extreme values.
Statistic: Median
  Formula: Middle value in sorted array
  Excel Formula: =MEDIAN(Data)
  Pro: Robust when extreme data values exist.
  Con: Ignores extremes and can be affected by gaps in data values.
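To illustrate the pro and con columns, here is a minimal Python sketch (not part of the original lab, which uses Excel). The salary figures are made up for illustration. It shows the mean being pulled by one extreme value while the median barely moves.

```python
import statistics

# Hypothetical sample: five salaries in thousands of dollars.
data = [40, 42, 45, 47, 50]
# The same sample with one extreme value added.
with_outlier = data + [500]

mean_clean = statistics.mean(data)
median_clean = statistics.median(data)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print("without outlier:", mean_clean, median_clean)
print("with outlier:   ", mean_outlier, median_outlier)
```

The mean jumps from 44.8 to about 120.7, while the median only moves from 45 to 46.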
Variance
• The population variance ($\sigma^2$) is defined as the sum of squared deviations around the mean $\mu$ divided by the population size $N$.
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
• For the sample variance ($s^2$), we divide by n - 1 instead of n, otherwise $s^2$ would tend to underestimate the unknown population variance $\sigma^2$.
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Note: the denominator is the sample size (n) minus one!
• Drawback: the variance is in squared units of the data, which makes it hard to interpret.
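The two variance definitions differ only in the denominator. The following Python sketch (an illustration, not part of the Excel lab) computes both by hand on a made-up data set and checks them against the standard library's `statistics.pvariance` and `statistics.variance`.

```python
import math
import statistics

# Hypothetical data set, chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n
# Sum of squared deviations around the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in data)

pop_var = sum_sq_dev / n          # divide by N (population)
samp_var = sum_sq_dev / (n - 1)   # divide by n - 1 (sample)

print("population variance:", pop_var)
print("sample variance:    ", samp_var)

# The hand computation should agree with the library functions.
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(samp_var, statistics.variance(data))
```

For this data the mean is 5 and the squared deviations sum to 32, so the population variance is 32/8 = 4 and the sample variance is 32/7, which is slightly larger.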
Variance (cont’d)
• Excel’s built-in functions are:
Statistic            Excel population formula   Excel sample formula
Variance             =VAR.P(Array)              =VAR.S(Array)
Standard deviation   =STDEV.P(Array)            =STDEV.S(Array)
Coefficient of Variation
• The coefficient of variation (CV) of a set of observations is the standard deviation of the observations divided by their mean, that is:
$CV = \frac{s}{\bar{x}}$
• This coefficient provides a unit-free measure of variation.
• It measures relative dispersion, and is useful for comparing the dispersion of variables measured in different units or with different means.
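The unit-free property can be checked with a short Python sketch (an illustration outside the Excel lab). The helper name `coefficient_of_variation` and the height values are assumptions made for this example; the sample standard deviation is used. Converting the same heights from centimeters to inches changes the standard deviation and the mean, but not their ratio.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (hypothetical helper)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical heights, first in centimeters, then converted to inches.
heights_cm = [160, 165, 170, 175, 180]
heights_in = [h / 2.54 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_in = coefficient_of_variation(heights_in)

print("std dev (cm):", statistics.stdev(heights_cm))
print("std dev (in):", statistics.stdev(heights_in))
print("CV (cm):", cv_cm)
print("CV (in):", cv_in)
```

Both CV values come out at roughly 0.0465, even though the standard deviations differ by a factor of 2.54.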
Measure of Skewness
• Pearson’s Skewness Coefficients
  • First: Sk = (mean - mode) / standard deviation (sample or population)
  • Second: Sk = 3(mean - median) / standard deviation (sample or population)
• Characteristics of Pearson’s Second Skewness Coefficient:
  • Usually lies between -3 and +3.
  • Zero means symmetric.
  • Negative means negative (left) skewness.
  • Positive means positive (right) skewness.
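As a rough illustration in Python (not part of the Excel lab), the sketch below computes Pearson's second skewness coefficient, assuming the usual definition with the sample standard deviation in the denominator. The function name `pearson_second_skewness` and both data sets are made up for this example.

```python
import statistics

def pearson_second_skewness(values):
    """3 * (mean - median) / sample standard deviation (hypothetical helper)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    return 3 * (mean - median) / statistics.stdev(values)

# Hypothetical data: one symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 2, 3, 4, 5, 10]

sk_symmetric = pearson_second_skewness(symmetric)
sk_right = pearson_second_skewness(right_tail)

print("symmetric sample: ", sk_symmetric)
print("right-tail sample:", sk_right)
```

The symmetric sample gives exactly zero because its mean equals its median, while the long right tail pulls the mean above the median and gives a positive coefficient.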
Mean, Median, Mode
If a distribution is right-skewed (positive) it is often true:
MEAN > MEDIAN > MODE
If a distribution is left-skewed (negative) it is often true:
MODE > MEDIAN > MEAN
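A quick Python check of these orderings on two small made-up samples (an illustration only; the orderings are tendencies, not guarantees, and other data sets can violate them):

```python
import statistics

# Hypothetical samples: one with a long right tail, one with a long left tail.
right_skewed = [1, 2, 2, 3, 4, 5, 10]
left_skewed = [0, 5, 6, 7, 8, 8, 10]

def center(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )

r_mean, r_median, r_mode = center(right_skewed)
l_mean, l_median, l_mode = center(left_skewed)

print("right-skewed:", r_mean, r_median, r_mode)  # mean > median > mode
print("left-skewed: ", l_mean, l_median, l_mode)  # mode > median > mean
```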
Excel
=AVERAGE(Array): Returns the arithmetic mean.
=MEDIAN(Array): Returns the median of an array. The array does not need to be sorted before use; Excel orders the values internally.
=MODE.SNGL(Array): Returns the most frequent value in an array; if there are several modes, only the first one found is returned.
=MODE.MULT(Array): Returns multiple modes if they exist in an array.
=MIN(Array): Returns the smallest value in an array.
=MAX(Array): Returns the largest value in an array.
=VAR.P(Array): Returns the population variance of an array.
=VAR.S(Array): Returns the sample variance of an array.
=STDEV.P(Array): Returns the population standard deviation of an array.
=STDEV.S(Array): Returns the sample standard deviation of an array.
Data==>Data Analysis==>Descriptive Statistics: Generates a table of
statistics for one or more variables.
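For readers working outside Excel, a rough analogue of that summary table can be sketched in Python. This is an assumption-laden illustration, not a reproduction of the Excel tool: the function name `describe` and the data are made up, and only some of the statistics covered in this lab are included.

```python
import statistics

def describe(values):
    """Return a few of the summary statistics discussed in this lab."""
    return {
        "Mean": statistics.mean(values),
        "Median": statistics.median(values),
        "Mode": statistics.mode(values),
        "Standard Deviation": statistics.stdev(values),  # sample version
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": len(values),
    }

# Hypothetical data for one variable.
data = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(data)

# Print one statistic per row, like a two-column table.
for name, value in summary.items():
    print(f"{name:<20}{value}")
```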