Busn210ch03 - Highline Community College

Slides by John Loucks, St. Edward’s University
Slide 1
Chapter 3, Part A
Descriptive Statistics: Numerical Measures


Measures of Location
Measures of Variability
Slide 2
Measures of Location

Mean

Median
Mode



Percentiles
Quartiles
If the measures are computed
for data from a sample,
they are called sample statistics.
If the measures are computed
for data from a population,
they are called population parameters.
A sample statistic is referred to
as the point estimator of the
corresponding population parameter.
Slide 3
Mean


The mean of a data set is the average of all the data
values.
The sample mean $\bar{x}$ is the point estimator of the
population mean $\mu$.
Slide 4
Sample Mean $\bar{x}$
$$\bar{x} = \frac{\sum x_i}{n}$$
where $\sum x_i$ is the sum of the values of the n observations and n is the number of observations in the sample.
Slide 5
Population Mean $\mu$
$$\mu = \frac{\sum x_i}{N}$$
where $\sum x_i$ is the sum of the values of the N observations and N is the number of observations in the population.
Slide 6
Sample Mean
 Example: Apartment Rents
Seventy efficiency apartments were randomly
sampled in a small college town. The monthly rent
prices for these apartments are listed below.
445 440 465 450 600 570 510 615 440 450
470 485 515 575 430 440 525 490 580 450
490 590 525 450 472 470 445 435 435 425
450 475 490 525 600 600 445 460 475 500
535 435 460 575 435 500 549 475 445 600
445 460 480 500 550 435 440 450 465 570
500 480 430 615 450 480 465 480 510 440
Slide 7
Sample Mean
 Example: Apartment Rents
$$\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80$$
445 440 465 450 600 570 510 615 440 450
470 485 515 575 430 440 525 490 580 450
490 590 525 450 472 470 445 435 435 425
450 475 490 525 600 600 445 460 475 500
535 435 460 575 435 500 549 475 445 600
445 460 480 500 550 435 440 450 465 570
500 480 430 615 450 480 465 480 510 440
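As a quick check of the arithmetic on this slide, the sketch below recomputes the sample mean in Python. The 70 rents are the ones from the example (listed here in ascending order); the variable names are illustrative and not part of the original slides.

```python
# The 70 monthly rents from the apartment example, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)            # number of observations in the sample
total = sum(rents)        # sum of the values of the n observations
sample_mean = total / n   # x-bar = (sum of x_i) / n

print(n, total, f"{sample_mean:.2f}")  # prints: 70 34356 490.80
```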
Slide 8
Median
 The median of a data set is the value in the middle
when the data items are arranged in ascending order.
 Whenever a data set has extreme values, the median
is the preferred measure of central location.
 The median is the measure of location most often
reported for annual income and property value data.
 A few extremely large incomes or property values
can inflate the mean.
Slide 9
Median
 For an odd number of observations:
26 18 27 12 14 27 19 (7 observations)
12 14 18 19 26 27 27 (in ascending order)
The median is the middle value.
Median = 19
Slide 10
Median
 For an even number of observations:
26 18 27 12 14 27 30 19 (8 observations)
12 14 18 19 26 27 27 30 (in ascending order)
The median is the average of the middle two values.
Median = (19 + 26)/2 = 22.5
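The odd and even cases above can be sketched in a few lines of Python. The function name `median` is an illustrative choice; the two data sets are the ones shown on the slides.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                               # odd count: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle two

odd_example = [26, 18, 27, 12, 14, 27, 19]        # 7 observations
even_example = [26, 18, 27, 12, 14, 27, 30, 19]   # 8 observations

print(median(odd_example))    # prints: 19
print(median(even_example))   # prints: 22.5
```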
Slide 11
Median
 Example: Apartment Rents
Averaging the 35th and 36th data values:
Median = (475 + 475)/2 = 475
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Note: Data is in ascending order.
Slide 12
Mode
 The mode of a data set is the value that occurs with
greatest frequency.
 The greatest frequency can occur at two or more
different values.
 If the data have exactly two modes, the data are
bimodal.
 If the data have more than two modes, the data are
multimodal.
Slide 13
Mode
 Example: Apartment Rents
450 occurred most frequently (7 times)
Mode = 450
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Note: Data is in ascending order.
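A small Python sketch of the mode follows. It returns every value tied for the greatest frequency, so bimodal and multimodal data are handled too. The helper name `modes` and the short bimodal list are hypothetical illustrations; the rents are the example data.

```python
from collections import Counter

def modes(values):
    """Return (sorted list of the most frequent values, their frequency)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top), top

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

rent_modes, rent_freq = modes(rents)
print(rent_modes, rent_freq)             # prints: [450] 7

# Hypothetical bimodal data (not from the slides): 2 and 3 each occur twice.
bimodal_modes, _ = modes([1, 2, 2, 3, 3])
print(bimodal_modes)                      # prints: [2, 3]
```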
Slide 14
Using Excel to Compute
the Mean, Median, and Mode
 Excel Formula Worksheet
     A          B                 C    D        E
1    Apartment  Monthly Rent ($)
2    1          525                    Mean     =AVERAGE(B2:B71)
3    2          440                    Median   =MEDIAN(B2:B71)
4    3          450                    Mode     =MODE(B2:B71)
5    4          615
6    5          480
Note: Rows 7-71 are not shown.
Slide 15
Using Excel to Compute
the Mean, Median, and Mode
 Value Worksheet
     A          B                 C    D        E
1    Apartment  Monthly Rent ($)
2    1          525                    Mean     490.80
3    2          440                    Median   475.00
4    3          450                    Mode     450.00
5    4          615
6    5          480
Note: Rows 7-71 are not shown.
Slide 16
Percentiles
 A percentile provides information about how the
data are spread over the interval from the smallest
value to the largest value.
 Admission test scores for colleges and universities
are frequently reported in terms of percentiles.

The pth percentile of a data set is a value such that at
least p percent of the items take on this value or less
and at least (100 - p) percent of the items take on this
value or more.
Slide 17
Percentiles
Arrange the data in ascending order.
Compute index i, the position of the pth percentile.
i = (p/100)n
If i is not an integer, round up. The pth percentile
is the value in the ith position.
If i is an integer, the pth percentile is the average
of the values in positions i and i+1.
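The procedure above can be sketched in Python as follows. This is a minimal illustration that assumes p is a whole number from 1 to 99 and uses integer arithmetic so that "is i an integer?" is tested exactly; the function name `percentile` is an illustrative choice, and the rents are the example data.

```python
def percentile(values, p):
    """p-th percentile by the index rule i = (p/100)n, for whole-number p in 1..99."""
    ordered = sorted(values)                 # arrange the data in ascending order
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)    # i = whole + remainder/100
    if remainder:                            # i is not an integer: round up
        return ordered[whole]                # 1-based position whole+1 is index `whole`
    # i is an integer: average the values in positions i and i+1
    return (ordered[whole - 1] + ordered[whole]) / 2

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(percentile(rents, 80))   # i = 56 (integer): (535 + 549)/2, prints 542.0
print(percentile(rents, 75))   # i = 52.5, rounded up to 53: prints 525
```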
Slide 18
80th Percentile
 Example: Apartment Rents
i = (p/100)n = (80/100)70 = 56
Averaging the 56th and 57th data values:
80th Percentile = (535 + 549)/2 = 542
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Note: Data is in ascending order.
Slide 19
80th Percentile
 Example: Apartment Rents
“At least 80% of the items take on a value of 542 or less.” 56/70 = .8 or 80%
“At least 20% of the items take on a value of 542 or more.” 14/70 = .2 or 20%
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Slide 20
Using Excel’s Rank and Percentile Tool
to Compute Percentiles and Quartiles
 Using Excel’s Percentile Function
The formula Excel uses to compute the location (Lp)
of the pth percentile is
Lp = (p/100)n + (1 – p/100)
Excel would compute the location of the 80th
percentile for the apartment rent data as follows:
L80 = (80/100)70 + (1 – 80/100) = 56 + .2 = 56.2
The 80th percentile would be
535 + .2(549 - 535) = 535 + 2.8 = 537.8
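The location formula on this slide can be sketched in Python as below. It is an illustration of the formula as stated here (locate L_p, then interpolate between the two neighboring values), not a claim about Excel's internal code; the function name `excel_style_percentile` is hypothetical.

```python
import math

def excel_style_percentile(values, p):
    """Interpolated percentile using L_p = (p/100)n + (1 - p/100)."""
    ordered = sorted(values)
    n = len(ordered)
    location = (p / 100) * n + (1 - p / 100)   # 1-based position, may be fractional
    lower = math.floor(location)               # position of the value just below
    fraction = location - lower                # how far to move toward the next value
    if lower >= n:                             # p = 100: the largest value
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

p80 = excel_style_percentile(rents, 80)   # L80 = 56.2: 535 + .2(549 - 535)
p75 = excel_style_percentile(rents, 75)   # L75 = 52.75: 515 + .75(525 - 515)
print(round(p80, 1), round(p75, 1))
```

Note that this interpolated result (537.8) differs from the 542 obtained with the round-up-or-average index rule; both are common conventions.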
Slide 21
Using Excel’s Rank and Percentile Tool
to Compute Percentiles and Quartiles
 Excel Formula Worksheet
     A          B                 C    D    E
1    Apartment  Monthly Rent ($)            80th Percentile
2    1          525                         =PERCENTILE(B2:B71,.8)   (80th percentile)
3    2          440
4    3          450
5    4          615
6    5          480
Note: Rows 7-71 are not shown.
It is not necessary
to put the data
in ascending order.
Slide 22
Using Excel’s Rank and Percentile Tool
to Compute Percentiles and Quartiles
 Excel Value Worksheet
     A          B                 C    D    E
1    Apartment  Monthly Rent ($)            80th Percentile
2    1          525                         537.8
3    2          440
4    3          450
5    4          615
6    5          480
Note: Rows 7-71 are not shown.
Slide 23
Quartiles
 Quartiles are specific percentiles.
 First Quartile = 25th Percentile
 Second Quartile = 50th Percentile = Median
 Third Quartile = 75th Percentile
Slide 24
Third Quartile
 Example: Apartment Rents
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, which rounds up to 53
Third quartile = 525
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Note: Data is in ascending order.
Slide 25
Third Quartile
 Using Excel’s Quartile Function
Excel computes the locations of the 1st, 2nd, and 3rd
quartiles by first converting the quartiles to
percentiles and then using the following formula to
compute the location (Lp) of the pth percentile:
Lp = (p/100)n + (1 – p/100)
Excel would compute the location of the 3rd quartile
(75th percentile) for the rent data as follows:
L75 = (75/100)70 + (1 – 75/100) = 52.5 + .25 = 52.75
The 3rd quartile would be
515 + .75(525 - 515) = 515 + 7.5 = 522.5
Slide 26
Third Quartile
 Excel Formula Worksheet
     A          B                 C    D    E
1    Apartment  Monthly Rent ($)            Third Quartile
2    1          525                         =QUARTILE(B2:B71,3)   (3rd quartile)
3    2          440
4    3          450
5    4          615
6    5          480
It is not necessary
to put the data
in ascending order.
Note: Rows 7-71 are not shown.
Slide 27
Third Quartile
 Excel Value Worksheet
     A          B                 C    D    E
1    Apartment  Monthly Rent ($)            Third Quartile
2    1          525                         522.5
3    2          440
4    3          450
5    4          615
6    5          480
Note: Rows 7-71 are not shown.
Slide 28
Excel’s Rank and Percentile Tool
Step 1 Click the Data tab on the Ribbon
Step 2 In the Analysis group, click Data Analysis
Step 3 Choose Rank and Percentile from the list of
Analysis Tools
Step 4 When the Rank and Percentile dialog box appears
(see details on next slide)
Slide 29
Excel’s Rank and Percentile Tool
Step 4 Complete the Rank and Percentile dialog
box as follows:
Slide 30
Excel’s Rank and Percentile Tool
 Excel Value Worksheet
     B      C    D       E      F      G
1    Rent        Point   Rent   Rank   Percent
2    525         4       615    1      98.50%
3    440         63      615    1      98.50%
4    450         35      600    3      92.70%
5    615         42      600    3      92.70%
6    480         49      600    3      92.70%
7    510         56      600    3      92.70%
8    575         28      590    7      91.30%
9    430         21      580    8      89.80%
10   440         7       575    9      86.90%
Note: Rows 11-71 are not shown.
Slide 31
Geometric Mean (GM)
• The Geometric Mean is useful in finding the averages of increases in:
– Percents
– Ratios
– Indexes
– Growth Rates
• The Geometric Mean will always be less than or equal to (never more than) the arithmetic mean
• The GM gives a more conservative figure that is not drawn up by large values in the set
Slide 32
Geometric Mean
• The GM of a set of n positive numbers is defined as the nth root of the product of the n values. The formula is:
$$GM\% + 1 = \sqrt[n]{(X_1)(X_2)(X_3)\cdots(X_n)}$$
$$GM\% = \sqrt[n]{(X_1)(X_2)(X_3)\cdots(X_n)} - 1$$
Define Variables & Symbols
GM = Geometric Mean
X1 = A particular number (1 + %)
X2 = A particular number (1 + %)
n = Number of positive numbers in set
Slide 33
Geometric Mean Example 1: Percentage Increase
Starting Salary: $41,000.00
Increase in salary Year 1: 5%
Increase in salary Year 2: 15%
$$GM = \sqrt[2]{(1.05)(1.15)} = 1.09886$$
In Excel:
1.05 * 1.15 = 1.2075
GM = 1.2075 ^ (1/2) - 1 = 9.886%
Slide 34
Verify Geometric Mean Example
Verify 1:
Raise 1 = $41,000.00 * 5% = $2,050.00
Raise 2 = 43,050.00 * 15% = 6,457.50
Total = $8,507.50
Verify 2:
Raise 1 = $41,000.00 * 0.09886 = $4,053.39
Raise 2 = 45,053.39 * 0.09886 = 4,454.11
Total = $8,507.50
If we used the Arithmetic Mean, (5% + 15%)/2 = 10%:
Raise 1 = $41,000.00 * 10% = $4,100.00
Raise 2 = 45,100.00 * 10% = 4,510.00
Total = $8,610.00
Slide 35
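The salary verification can be reproduced with a short Python sketch. It compares the actual two-year result with what the geometric-mean rate and the arithmetic-mean rate would each give; the variable names are illustrative.

```python
import math

start = 41_000.00
growth_factors = [1.05, 1.15]   # 5% raise, then 15% raise, each written as (1 + %)

# Geometric mean: nth root of the product of the n growth factors, minus 1.
gm_factor = math.prod(growth_factors) ** (1 / len(growth_factors))
gm_rate = gm_factor - 1          # about 0.09886, i.e. 9.886%

actual_final = start * 1.05 * 1.15           # salary after the two actual raises
gm_final = start * gm_factor ** 2            # same GM rate applied twice
am_final = start * (1 + 0.10) ** 2           # arithmetic-mean rate (10%) applied twice

print(f"GM rate: {gm_rate:.5f}")
print(f"Total raises, actual: {actual_final - start:,.2f}")   # 8,507.50
print(f"Total raises, GM:     {gm_final - start:,.2f}")       # 8,507.50
print(f"Total raises, AM:     {am_final - start:,.2f}")       # 8,610.00
```

Only the geometric-mean rate reproduces the actual total; the arithmetic mean overstates it.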
Another Use Of GM:
Ave. % Increase Over Time
• Another use of the geometric mean is to
determine the percent increase in sales,
production or other business or economic series
from one time period to another
• Where n = number of periods
$$GM = \sqrt[n]{\frac{\text{Value at end of all the periods}}{\text{Value at beginning of all the periods}}} - 1$$
Slide 36
Example for GM: Ave. % Increase Over
Time
• The total number of females enrolled in
American colleges increased from 755,000 in
1992 to 835,000 in 2000. That is, the geometric
mean rate of increase is 1.27%.
$$GM = \sqrt[8]{\frac{835{,}000}{755{,}000}} - 1 = .0127$$
• The annual rate of increase is 1.27%
• For the years 1992 through 2000, the rate of female enrollment growth at American colleges was 1.27% per year
Slide 37
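The enrollment calculation can be sketched in Python as below. The function name `average_growth_rate` is a hypothetical helper; the inputs are the figures from the example, with n = 8 periods between 1992 and 2000.

```python
def average_growth_rate(begin_value, end_value, periods):
    """Geometric mean rate of increase per period: (end/begin)^(1/n) - 1."""
    return (end_value / begin_value) ** (1 / periods) - 1

rate = average_growth_rate(755_000, 835_000, 8)
print(f"{rate:.4f}")   # prints: 0.0127
```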
Measures of Variability
 It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
 For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.
Slide 38
Measures of Variability
 Range
 Interquartile Range
 Variance
 Standard Deviation
 Coefficient of Variation
Slide 39
Range
 The range of a data set is the difference between the
largest and smallest data values.
 It is the simplest measure of variability.
 It is very sensitive to the smallest and largest data
values.
Slide 40
Range
 Example: Apartment Rents
Range = largest value - smallest value
Range = 615 - 425 = 190
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Note: Data is in ascending order.
Slide 41
Interquartile Range
 The interquartile range of a data set is the difference
between the third quartile and the first quartile.
 It is the range for the middle 50% of the data.
 It overcomes the sensitivity to extreme data values.
Slide 42
Interquartile Range
 Example: Apartment Rents
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Note: Data is in ascending order.
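The range and interquartile range for the rent data can be checked with the sketch below. It assumes the quartiles are found with the index rule i = (p/100)n from the percentile slides; the helper name `percentile` is illustrative and takes whole-number p from 1 to 99.

```python
def percentile(values, p):
    """p-th percentile by the index rule i = (p/100)n, for whole-number p in 1..99."""
    ordered = sorted(values)
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)
    if remainder:                                    # i not an integer: round up
        return ordered[whole]
    return (ordered[whole - 1] + ordered[whole]) / 2  # i an integer: average i and i+1

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

data_range = max(rents) - min(rents)   # largest value - smallest value
q1 = percentile(rents, 25)             # first quartile
q3 = percentile(rents, 75)             # third quartile
iqr = q3 - q1                          # range of the middle 50% of the data

print(data_range, q1, q3, iqr)         # prints: 190 445 525 80
```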
Slide 43
Variance
The variance is a measure of variability that utilizes
all the data.
It is based on the difference between the value of
each observation ($x_i$) and the mean ($\bar{x}$ for a sample,
$\mu$ for a population).
Slide 44
Variance
The variance is the average of the squared
differences between each data value and the mean.
The variance is computed as follows:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$ for a sample
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$ for a population
Slide 45
Standard Deviation
The standard deviation of a data set is the positive
square root of the variance.
It is measured in the same units as the data, making
it more easily interpreted than the variance.
Slide 46
Standard Deviation
The standard deviation is computed as follows:
$$s = \sqrt{s^2}$$ for a sample
$$\sigma = \sqrt{\sigma^2}$$ for a population
Slide 47
Coefficient of Variation
The coefficient of variation indicates how large the
standard deviation is in relation to the mean.
The coefficient of variation is computed as follows:
$$\left(\frac{s}{\bar{x}} \times 100\right)\%$$ for a sample
$$\left(\frac{\sigma}{\mu} \times 100\right)\%$$ for a population
Slide 48
Sample Variance, Standard Deviation,
And Coefficient of Variation
 Example: Apartment Rents
• Variance
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16$$
• Standard Deviation
$$s = \sqrt{s^2} = \sqrt{2996.16} = 54.74$$
• Coefficient of Variation
$$\left(\frac{s}{\bar{x}} \times 100\right)\% = \left(\frac{54.74}{490.80} \times 100\right)\% = 11.15\%$$
(the standard deviation is about 11% of the mean)

Slide 49
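The sample variance, standard deviation, and coefficient of variation for the rent data can be recomputed with Python's standard `statistics` module, as sketched below. `statistics.variance` and `statistics.stdev` use the n - 1 divisor, matching the sample formulas; the variable names are illustrative.

```python
from statistics import mean, stdev, variance

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

sample_mean = mean(rents)
sample_variance = variance(rents)        # sum of squared deviations / (n - 1)
sample_std = stdev(rents)                # positive square root of the variance
cv = sample_std / sample_mean * 100      # standard deviation as a percent of the mean

print(f"{sample_variance:.2f} {sample_std:.2f} {cv:.2f}")  # prints: 2996.16 54.74 11.15
```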
Using Excel to Compute the Sample Variance,
Standard Deviation, and Coefficient of Variation
 Formula Worksheet
     A          B                 C    D          E
1    Apartment  Monthly Rent ($)
2    1          525                    Mean       =AVERAGE(B2:B71)
3    2          440                    Median     =MEDIAN(B2:B71)
4    3          450                    Mode       =MODE(B2:B71)
5    4          615                    Variance   =VAR(B2:B71)
6    5          480                    Std. Dev.  =STDEV(B2:B71)
7    6          510                    C.V.       =E6/E2*100
Note: Rows 8-71 are not shown.
Slide 50
Using Excel to Compute the Sample Variance,
Standard Deviation, and Coefficient of Variation
 Value Worksheet
     A          B                 C    D          E
1    Apartment  Monthly Rent ($)
2    1          525                    Mean       490.80
3    2          440                    Median     475.00
4    3          450                    Mode       450.00
5    4          615                    Variance   2996.16
6    5          480                    Std. Dev.  54.74
7    6          510                    C.V.       11.15
Note: Rows 8-71 are not shown.
Slide 51
Using Excel’s
Descriptive Statistics Tool
Step 1 Click the Data tab on the Ribbon
Step 2 In the Analysis group, click Data Analysis
Step 3 Choose Descriptive Statistics from the list of
Analysis Tools
Step 4 When the Descriptive Statistics dialog box appears:
(see details on next slide)
Slide 52
Using Excel’s
Descriptive Statistics Tool
 Excel’s Descriptive Statistics Dialog Box
Slide 53
Using Excel’s
Descriptive Statistics Tool
 Excel Value Worksheet (Partial)
Columns A-B (worksheet rows 1-8):
Row  Apartment  Monthly Rent ($)
2    1          525
3    2          440
4    3          450
5    4          615
6    5          480
7    6          510
8    7          575

Columns D-E (Descriptive Statistics output):
Monthly Rent ($)
Mean                 490.8
Standard Error       6.542348114
Median               475
Mode                 450
Standard Deviation   54.73721146
Sample Variance      2996.162319
Note: Rows 9-71 are not shown.
Slide 54
Using Excel’s
Descriptive Statistics Tool
 Excel Value Worksheet (Partial)
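Columns A-B (worksheet rows 9-16):
Row  Apartment  Monthly Rent ($)
9    8          430
10   9          440
11   10         450
12   11         470
13   12         485
14   13         515
15   14         575
16   15         430

Columns D-E (Descriptive Statistics output, continued):
Kurtosis   -0.334093298
Skewness   0.924330473
Range      190
Minimum    425
Maximum    615
Sum        34356
Count      70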
Note: Rows 1-8 and 17-71 are not shown.
Slide 55
End of Chapter 3, Part A
Slide 56
Chapter 3, Part B
Descriptive Statistics: Numerical Measures

Measures of Distribution Shape, Relative Location,
and Detecting Outliers

Exploratory Data Analysis
Slide 57
Measures of Distribution Shape,
Relative Location, and Detecting Outliers





Distribution Shape
z-Scores
Chebyshev’s Theorem
Empirical Rule
Detecting Outliers
Slide 58
Distribution Shape: Skewness

An important measure of the shape of a distribution
is called skewness.

The formula for computing skewness for a data set is
somewhat complex.

Skewness can be easily computed using statistical
software.

Excel’s SKEW function can be used to compute the
skewness of a data set.
Slide 59
Distribution Shape: Skewness
Symmetric (not skewed)
• Skewness is zero.
• Mean and median are equal.
[Histogram: relative frequency (0 to .35); Skewness = 0]
Slide 60
Distribution Shape: Skewness
Moderately Skewed Left
• Skewness is negative.
• Mean will usually be less than the median.
[Histogram: relative frequency (0 to .35); Skewness = -.31]
Slide 61
Distribution Shape: Skewness
Moderately Skewed Right
• Skewness is positive.
• Mean will usually be more than the median.
[Histogram: relative frequency (0 to .35); Skewness = .31]
Slide 62
Distribution Shape: Skewness

Highly Skewed Right
• Skewness is positive (often above 1.0).
• Mean will usually be more than the median.
[Histogram: relative frequency (0 to .35); Skewness = 1.25]
Slide 63
Distribution Shape: Skewness

Example: Apartment Rents
Seventy efficiency apartments were randomly
sampled in a college town. The monthly rent prices
for the apartments are listed below in ascending order.
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Slide 64
Distribution Shape: Skewness
Example: Apartment Rents
[Histogram: relative frequency (0 to .35); Skewness = .92]
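The slides leave the skewness formula to software. As a hedged sketch, the Python below uses the adjusted sample-skewness formula that Excel documents for its SKEW function, n/((n-1)(n-2)) times the sum of cubed standardized deviations. The function name `sample_skewness` is illustrative.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - xbar)/s)^3)."""
    n = len(values)
    xbar = mean(values)
    s = stdev(values)                      # sample standard deviation (n - 1 divisor)
    cubed = sum(((x - xbar) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

skew = sample_skewness(rents)
print(f"{skew:.2f}")   # positive value: the rents are skewed right
```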
Slide 65
z-Scores
The z-score is often called the standardized value.
It denotes the number of standard deviations a data
value xi is from the mean.
$$z_i = \frac{x_i - \bar{x}}{s}$$
Slide 66
z-Scores
 An observation’s z-score is a measure of the relative
location of the observation in a data set.
 A data value less than the sample mean will have a
z-score less than zero.
 A data value greater than the sample mean will have
a z-score greater than zero.
 A data value equal to the sample mean will have a
z-score of zero.
Slide 67
z-Scores
 Example: Apartment Rents
• z-Score of Smallest Value (425)
$$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$$
Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
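The standardized values can be regenerated with the short Python sketch below, which applies z = (x - mean)/s to each rent using the sample standard deviation. The variable names are illustrative.

```python
from statistics import mean, stdev

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

xbar = mean(rents)     # sample mean, 490.80
s = stdev(rents)       # sample standard deviation, about 54.74

# Number of standard deviations each rent lies from the mean.
z_scores = [(x - xbar) / s for x in rents]

print(f"{z_scores[0]:.2f}")    # smallest rent (425): prints -1.20
print(f"{z_scores[-1]:.2f}")   # largest rent (615): prints 2.27
```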
Slide 68
Chebyshev’s Theorem
At least $(1 - 1/z^2)$ of the items in any data set will be
within z standard deviations of the mean, where z is
any value greater than 1.
Slide 69
Chebyshev’s Theorem
At least 75% of the data values must be
within z = 2 standard deviations of the mean.
At least 89% of the data values must be
within z = 3 standard deviations of the mean.
At least 94% of the data values must be
within z = 4 standard deviations of the mean.
Slide 70
Chebyshev’s Theorem
 Example: Apartment Rents
Let z = 1.5 with $\bar{x}$ = 490.80 and s = 54.74
At least (1 - 1/(1.5)^2) = 1 - 0.44 = 0.56 or 56%
of the rent values must be between
$\bar{x}$ - z(s) = 490.80 - 1.5(54.74) = 409
and
$\bar{x}$ + z(s) = 490.80 + 1.5(54.74) = 573
(Actually, 86% of the rent values
are between 409 and 573.)
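The comparison between Chebyshev's guaranteed minimum and the actual share of rents in the interval can be sketched in Python as follows. The function name `chebyshev_bound` is an illustrative choice.

```python
from statistics import mean, stdev

def chebyshev_bound(z):
    """Minimum fraction of any data set within z standard deviations (z > 1)."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

z = 1.5
xbar, s = mean(rents), stdev(rents)
low, high = xbar - z * s, xbar + z * s           # about 409 and 573
inside = sum(low <= x <= high for x in rents)     # rents that fall in the interval
actual_share = inside / len(rents)

print(f"guaranteed: at least {chebyshev_bound(z):.0%}")   # at least 56%
print(f"actual: {actual_share:.0%} ({inside} of {len(rents)})")  # 86% (60 of 70)
```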
Slide 71
Empirical Rule
For data having a bell-shaped distribution:
68.26% of the values of a normal random variable
are within +/- 1 standard deviation of its mean.
95.44% of the values of a normal random variable
are within +/- 2 standard deviations of its mean.
99.72% of the values of a normal random variable
are within +/- 3 standard deviations of its mean.
Slide 72
Empirical Rule
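[Figure: bell-shaped curve over x with the axis marked at μ - 3σ, μ - 2σ, μ - 1σ, μ, μ + 1σ, μ + 2σ, and μ + 3σ; 68.26% of values lie within ±1σ, 95.44% within ±2σ, and 99.72% within ±3σ]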
Slide 73
Detecting Outliers
 An outlier is an unusually small or unusually large
value in a data set.
 A data value with a z-score less than -3 or greater
than +3 might be considered an outlier.
 It might be:
• an incorrectly recorded data value
• a data value that was incorrectly included in the
data set
• a correctly recorded data value that belongs in
the data set
Slide 74
Detecting Outliers
 Example: Apartment Rents
• The most extreme z-scores are -1.20 and 2.27
• Using |z| > 3 as the criterion for an outlier, there
are no outliers in this data set.
Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
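The |z| > 3 criterion can be sketched in Python as below. The helper name `z_score_outliers` and the extra rent of 900 are hypothetical, added only to show what a flagged value looks like; the 70 rents are the example data.

```python
from statistics import mean, stdev

def z_score_outliers(values, threshold=3):
    """Values whose z-score is below -threshold or above +threshold."""
    xbar, s = mean(values), stdev(values)
    return [x for x in values if abs((x - xbar) / s) > threshold]

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

rent_outliers = z_score_outliers(rents)
print(rent_outliers)                        # prints: []

# Hypothetical: one extra, unusually large rent is flagged.
with_extreme = z_score_outliers(rents + [900])
print(with_extreme)                         # prints: [900]
```

A flagged value still needs review, since it may be a recording error or a legitimate observation.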
Slide 75
Exploratory Data Analysis
 Five-Number Summary
 Box Plot
Slide 76
Five-Number Summary
1. Smallest Value
2. First Quartile
3. Median
4. Third Quartile
5. Largest Value
Slide 77
Five-Number Summary
 Example: Apartment Rents
Lowest Value = 425
First Quartile = 445
Median = 475
Third Quartile = 525
Largest Value = 615
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Slide 78
Box Plot
 Example: Apartment Rents
• A box is drawn with its ends located at the first and
third quartiles.
• A vertical line is drawn in the box at the location of
the median (second quartile).
[Box plot on an axis from 400 to 625, with Q1 = 445, Q2 = 475, and Q3 = 525 marked]
Slide 79
Box Plot
 Limits are located (not drawn) using the interquartile
range (IQR).
 Data outside these limits are considered outliers.
 The location of each outlier is shown with the
symbol * .
Slide 80
Box Plot
 Example: Apartment Rents
• The lower limit is located 1.5(IQR) below Q1.
Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325
• The upper limit is located 1.5(IQR) above Q3.
Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645
• There are no outliers (values less than 325 or
greater than 645) in the apartment rent data.
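The five-number summary and the box plot limits can be computed together, as in the Python sketch below. It assumes the quartiles come from the index rule i = (p/100)n used earlier in the chapter; the helper name `percentile` is illustrative and takes whole-number p from 1 to 99.

```python
def percentile(values, p):
    """p-th percentile by the index rule i = (p/100)n, for whole-number p in 1..99."""
    ordered = sorted(values)
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)
    if remainder:                                    # i not an integer: round up
        return ordered[whole]
    return (ordered[whole - 1] + ordered[whole]) / 2  # i an integer: average i and i+1

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

q1, q2, q3 = (percentile(rents, p) for p in (25, 50, 75))
summary = (min(rents), q1, q2, q3, max(rents))    # five-number summary

iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr                      # 445 - 1.5(80)
upper_limit = q3 + 1.5 * iqr                      # 525 + 1.5(80)
outliers = [x for x in rents if x < lower_limit or x > upper_limit]

# Whiskers extend to the smallest and largest values inside the limits.
inside = [x for x in rents if lower_limit <= x <= upper_limit]
whiskers = (min(inside), max(inside))

print(summary)                       # (425, 445, 475.0, 525, 615)
print(lower_limit, upper_limit)      # 325.0 645.0
print(outliers, whiskers)            # [] (425, 615)
```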
Slide 81
Box Plot
 Example: Apartment Rents
• Whiskers (dashed lines) are drawn from the ends
of the box to the smallest and largest data values
inside the limits.
[Box plot on an axis from 400 to 625: smallest value inside limits = 425; largest value inside limits = 615]
Slide 82
End of Chapter 3, Part B
Slide 83