Describing Data: Numerical Measures - Statistics Presentation

Describing Data:
Numerical Measures
Chapter 3
Copyright © 2015 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
Learning Objectives
LO3-1 Compute and interpret the mean, the median,
and the mode.
LO3-2 Compute a weighted mean.
LO3-3 Compute and interpret the geometric mean.
LO3-4 Compute and interpret the range, variance, and
standard deviation.
LO3-5 Explain and apply Chebyshev’s theorem and the
Empirical Rule.
LO3-6 Compute the mean and standard deviation of
grouped data.
LO3-1 Compute and interpret the
mean, the median, and the mode.
[1] Measures of Location
The purpose of a measure of location is to pinpoint the
center of a distribution of data.
◼ There are many measures of location. We will consider
three:
1- First: The mean
(a) The arithmetic mean:
    1- ungrouped data (raw data), as a population parameter or a sample statistic
    2- grouped data
(b) The weighted mean
(c) The geometric mean
2- Second: The median
3- Third: The mode
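First: The mean
(A) The arithmetic mean
a. Population Mean
For ungrouped data, the population mean is the sum of all
the population values divided by the total number of
population values:
$\mu = \frac{\sum X}{N}$
where $\mu$ is the population mean, $N$ is the number of values in the population, and $\sum X$ is the sum of the population values.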
Example – Population Mean
There are 42 exits on I-75 through the state of Kentucky.
Listed below are the distances between exits (in miles).
1. Why is this information a population?
2. What is the mean number of miles between exits?
Solution
1. Why is this information a population?
Answer: This is a population because we are considering
all the exits on I-75 in Kentucky.
2. What is the mean number of miles between exits?
Answer:
$\mu = \frac{\sum X}{N} = \frac{1 + \cdots + 10 + 4 + 11}{42} = \frac{192}{42} = 4.57$ miles
Parameter versus Statistic
PARAMETER A measurable characteristic of a population.
STATISTIC A measurable characteristic of a sample.
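b. Sample Mean
For ungrouped data, the sample mean is the sum of all
the sample values divided by the number of sample
values:
$\bar{X} = \frac{\sum X}{n}$
where $\bar{X}$ is the sample mean and $n$ is the number of values in the sample.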
Example – Sample Mean
Solution
Properties of the Arithmetic Mean
1. Every set of interval-level and ratio-level data has a
mean.
2. All the values are included in computing the mean.
3. The mean is unique.
4. The sum of the deviations of each value from the mean is
zero:
$\sum (X - \bar{X}) = 0$
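The fourth property of the arithmetic mean can be checked numerically. The following is a minimal Python sketch; the four values are made up for the demonstration and are not data from the slides.

```python
import math

# Illustrative values only (not data from the slides).
values = [3, 8, 4, 5]

mean = sum(values) / len(values)          # arithmetic mean
deviations = [x - mean for x in values]   # X minus the mean, for each value

# The positive and negative deviations cancel out.
total_deviation = sum(deviations)

# isclose guards against floating-point rounding in less tidy data sets.
print(mean, math.isclose(total_deviation, 0.0, abs_tol=1e-9))
```

The same check should hold for any list of numbers, which is why the mean is often described as the balance point of the data.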
Weakness of the Arithmetic Mean
The arithmetic mean is very sensitive to extreme values.
If one or two of the data values are either
extremely large or extremely small compared to the majority
of the data, the mean might not be an appropriate average to
represent the data.
The arithmetic mean also treats all the individual observations
equally. In a portfolio of stocks, for example, it is
highly unlikely that all stocks have the same weight and
therefore the same impact on the total performance of the
portfolio.
LO3-6 Compute the mean and
standard deviation of grouped data.
The Arithmetic Mean of Grouped Data
Example - The Arithmetic Mean of
Grouped Data
Recall in Chapter 2, we constructed a
frequency distribution for Applewood
Auto Group profit data for 180
vehicles sold. The information is
repeated in the table. Determine the
arithmetic mean profit per vehicle.
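The mean of grouped data is usually estimated as $\bar{X} = \frac{\sum fM}{n}$, where $M$ is the midpoint of each class, $f$ is the class frequency, and $n$ is the total number of observations. The Python sketch below illustrates that calculation under stated assumptions: the frequency table is hypothetical (it is NOT the Applewood table, which is not reproduced in this text), and the name `grouped_mean` is introduced only for this example.

```python
# Hypothetical frequency table, NOT the Applewood data:
# each entry is (lower class limit, upper class limit, frequency).
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

def grouped_mean(table):
    """Estimate the mean from class midpoints M and frequencies f."""
    n = sum(f for _, _, f in table)                         # total frequency
    total = sum(f * (lo + hi) / 2 for lo, hi, f in table)   # sum of f * M
    return total / n

print(grouped_mean(classes))
```

Because every observation in a class is represented by the class midpoint, the result is an estimate of the mean of the raw data, not its exact value.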
LO3-2 Compute a weighted mean.
(B) Weighted Mean
The weighted mean of a set of numbers X1, X2, ..., Xn,
with corresponding weights w1, w2, ..., wn, is computed
with the following formula:
$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum wX}{\sum w}$
Example – Weighted Mean
The Carter Construction Company pays its hourly
employees $16.50, $19.00, or $25.00 per hour. There are
26 hourly employees: 14 are paid at the $16.50 rate, 10
at the $19.00 rate, and 2 at the $25.00 rate.
What is the mean hourly rate paid for the 26 employees?
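One way to work this example is to treat the number of employees at each rate as the weight. The Python sketch below applies the weighted mean formula to the figures given above; the function name `weighted_mean` is just a label chosen for this illustration.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

rates = [16.50, 19.00, 25.00]   # hourly rates from the example
counts = [14, 10, 2]            # number of employees paid each rate (the weights)

mean_rate = weighted_mean(rates, counts)
print(round(mean_rate, 2))
```

The weighted total is 14(16.50) + 10(19.00) + 2(25.00) = 471, so the mean hourly rate is 471 / 26, or about $18.12. It sits closer to $16.50 than a simple average of the three rates would, because most employees are paid the lowest rate.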
LO3-3 Compute and interpret
the geometric mean.
(C) The Geometric Mean
◼ Useful in finding the average change of percentages,
ratios, indexes, or growth rates over time.
◼ It has wide application in business and economics
because we are often interested in finding the percentage
changes in sales, salaries, or economic figures, such as
the GDP (Gross Domestic Product).
◼ The geometric mean will always be less than or equal to
the arithmetic mean.
The Geometric Mean: Finding the
average rate of return over time
EXAMPLE:
The return on investment earned by Atkins Construction
Company for four successive years was: 30 percent, 20
percent, -40 percent, and 200 percent. What is the
geometric mean rate of return on investment?
$GM = \sqrt[4]{(1.3)(1.2)(0.6)(3.0)} - 1 = \sqrt[4]{2.808} - 1 = 0.294$
The geometric mean rate of return is 29.4%.
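The calculation behind the 29.4% figure can be sketched in Python. Each percentage return is first converted to a growth factor (30% becomes 1.30, -40% becomes 0.60), the factors are multiplied, and the fourth root is taken. The function name `geometric_mean_rate` is an assumption made for this illustration, and `math.prod` requires Python 3.8 or later.

```python
import math

def geometric_mean_rate(percent_changes):
    """Geometric mean of percentage changes, returned as a percentage."""
    factors = [1 + p / 100 for p in percent_changes]   # 30% -> 1.30, -40% -> 0.60
    gm = math.prod(factors) ** (1 / len(factors))      # n-th root of the product
    return (gm - 1) * 100

returns = [30, 20, -40, 200]   # yearly returns from the example, in percent
print(round(geometric_mean_rate(returns), 1))
```

By comparison, the arithmetic mean of the four returns is 52.5 percent, which is consistent with the geometric mean never exceeding the arithmetic mean.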
The Geometric Mean: Finding an
Average Percent Change Over Time
EXAMPLE:
The population of Las Vegas, Nevada increased from 258,295 in 1990 to
584,539 in 2011. This is an increase of 326,244 people, or a 126.3
percent increase over the period. What is the average annual increase?
$GM = \sqrt[21]{\frac{584{,}539}{258{,}295}} - 1 = 0.0397$
The population increased at a rate of 3.97% per year.
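When only the starting and ending values are known, the average percent change per period is the n-th root of their ratio, minus 1, where n is the number of periods. The Python sketch below applies this to the Las Vegas figures with n = 2011 - 1990 = 21 years; the function name `average_annual_change` is a label chosen for this example.

```python
def average_annual_change(start, end, periods):
    """Average percent change per period: (end / start) ** (1 / periods) - 1."""
    return ((end / start) ** (1 / periods) - 1) * 100

# Population figures from the example; 21 annual periods from 1990 to 2011.
rate = average_annual_change(258_295, 584_539, 2011 - 1990)
print(round(rate, 2))
```

Note that dividing the total percent increase by 21 would overstate the annual rate, because it ignores compounding.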
32- 3.83%, found by $\sqrt[5]{\frac{151{,}812{,}000}{125{,}821{,}000}} - 1$
Second: The Median
MEDIAN The midpoint of the values after they have
been ordered from the minimum to the maximum values.
Properties of the median:
1. There is a unique median for each data set.
2. It is not affected by extremely large or small values
and is therefore a valuable measure of central
tendency when such values occur.
3. It can be computed for ratio-level, interval-level, and
ordinal-level data.
4. It can be computed for an open-ended frequency
distribution if the median does not lie in an open-ended class.
Examples - Median
The ages for a sample of five college students are:
21, 25, 19, 20, 22
Arranging the data in ascending order gives:
19, 20, 21, 22, 25.
Thus the median is 21.
The heights of four basketball players, in inches, are:
76, 73, 80, 75
Arranging the data in ascending order gives:
73, 75, 76, 80.
Thus the median is 75.5.
Note that:
❑ If there is an odd number of values, the median is the value
in the middle, with the same number of values below and above it.
❑ If there is an even number of values, the median is found by
adding the middle pair together and dividing by two.
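The odd and even cases can be sketched in a few lines of Python, using the ages and heights from the median examples. The function `median_of` is written out here only to show the two cases; Python's standard `statistics.median` gives the same results.

```python
import statistics

def median_of(values):
    """Median: middle value if n is odd, mean of the middle pair if n is even."""
    ordered = sorted(values)          # order from minimum to maximum
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the middle pair

ages = [21, 25, 19, 20, 22]
heights = [76, 73, 80, 75]
print(median_of(ages), median_of(heights))
print(statistics.median(ages), statistics.median(heights))  # cross-check
```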
Third: The Mode
MODE The value of the observation that appears
most frequently.
Note: The mode is especially useful in summarizing nominal-level
data.
Example - Mode
Using the data measuring the distance in miles between exits on I-75
through Kentucky, what is the modal distance?
Organize the distances into a frequency table and select the distance
with the highest frequency.
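The frequency-table approach can be sketched in Python with `collections.Counter`. The distances below are made up for illustration, since the actual I-75 exit distances are not reproduced in this text.

```python
from collections import Counter

# Illustrative distances in miles (made up; NOT the I-75 data).
distances = [2, 1, 5, 2, 3, 2, 1, 4]

frequency = Counter(distances)                     # maps each distance to its count
modal_value, count = frequency.most_common(1)[0]   # entry with the highest frequency

print(sorted(frequency.items()))   # the frequency table, ordered by distance
print(modal_value, count)
```

If two or more values tie for the highest frequency, `most_common(1)` returns only one of them, so a data set with several modes would need the full table to be inspected.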
The Relative Positions of the Mean,
Median and the Mode
Note: If the distribution is highly skewed, the mean is not
a good measure to use. The median and mode are
more representative.
[2] Dispersion
LO3-4 Compute and interpret the range,
variance, and standard deviation.
▪ A measure of location, such as the mean or the median, does not tell us
anything about the spread of the data.
▪ For example, if your nature guide told you that the river ahead averaged
3 feet in depth, would you want to wade across on foot without
additional information? Probably not. You would want to know
something about the variation in the depth.
◼ A small value for a measure of dispersion indicates that the data are
clustered closely, say, around the mean. The mean is therefore
considered representative of the data.
◼ A large measure of dispersion indicates that the mean is not reliable.
▪ A second reason for studying the dispersion in a set of data is to
compare the spread in two or more distributions.
Suppose an LCD computer monitor is assembled in Baton Rouge and in Tucson.
The arithmetic mean hourly output in both the Baton Rouge plant
and the Tucson plant is 50.
Based on the two means, we might conclude that the distributions
of the hourly outputs are identical. But production records reveal
that this conclusion is not correct (see Chart 3–6).
• Baton Rouge production varies from 48 to 52 assemblies per hour.
• Production at the Tucson plant is more erratic, ranging from 40 to 60 per hour.
Therefore, the hourly output for Baton Rouge is clustered near the
mean of 50; the hourly output for Tucson is more dispersed.
Measures of Dispersion
◼ Range
◼ Variance
◼ Standard Deviation
Example – Range
The number of cappuccinos sold at the Starbucks location in
the Orange County Airport between 4 and 7 p.m. for a
sample of 5 days last year were 20, 40, 50, 60, and 80.
Determine the range for the number of cappuccinos sold.
Range = Maximum value – Minimum value
= 80 – 20
= 60
Limitation of Range: It is based on just two values, the
maximum and the minimum; it does not take into
consideration all of the values. The variance does this.
Variance and Standard Deviation
VARIANCE: The arithmetic mean of the squared
deviations from the mean.
STANDARD DEVIATION: The square root of the variance.
◼ The variance and standard deviation are nonnegative and
are zero only if all observations are the same.
◼ For populations whose values are near the mean, the
variance and standard deviation will be small.
◼ For populations whose values are dispersed from the mean,
the variance and standard deviation will be large.
◼ The variance overcomes the weakness of the range by using
all the values in the population.
Computing the Variance
Steps in computing the variance:
Step 1: Find the mean.
Step 2: Find the difference between each observation and
the mean, and square that difference.
Step 3: Sum all the squared differences found in Step 2.
Step 4: Divide the sum of the squared differences by the
number of items in the population.
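The four steps translate directly into code. The Python sketch below is one possible rendering; as an assumption made purely to have numbers to work with, it treats the five cappuccino counts from the range example (20, 40, 50, 60, 80) as if they were a complete population.

```python
import math

def population_variance(values):
    """Population variance, following the four steps listed above."""
    mean = sum(values) / len(values)                    # Step 1: find the mean
    squared_diffs = [(x - mean) ** 2 for x in values]   # Step 2: squared differences
    total = sum(squared_diffs)                          # Step 3: sum them
    return total / len(values)                          # Step 4: divide by N

# Assumption: cappuccino counts treated as a whole population for illustration.
counts = [20, 40, 50, 60, 80]
variance = population_variance(counts)
std_dev = math.sqrt(variance)   # standard deviation is the square root of the variance
print(variance, std_dev)
```

Here the mean is 50, the squared differences are 900, 100, 0, 100, and 900, their sum is 2,000, and dividing by N = 5 gives a variance of 400 and a standard deviation of 20.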
Example – Variance and Standard
Deviation
The number of traffic citations (tickets of traffic offense) issued during the
last twelve months in Beaufort County, South Carolina, is reported below:
What is the population variance?
Step 1: Find the mean.
$\mu = \frac{\sum X}{N} = \frac{19 + 17 + \cdots + 34 + 10}{12} = \frac{348}{12} = 29$
Step 2: Find the difference between each observation and
the mean of 29, and square that difference.
Step 3: Sum all the squared differences found in Step 2.
Step 4: Divide the sum of the squared differences by the
number of items in the population.
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1{,}488}{12} = 124$
Sample Variance
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Example – Sample Variance
The hourly wages for a
sample of part-time
employees at Home Depot
are: $12, $20, $16, $18,
and $19.
The sample mean is $17.
What is the sample variance?
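A minimal Python sketch of this example follows. It assumes the usual sample variance formula, in which the sum of squared deviations is divided by n - 1 rather than n, and it cross-checks the hand calculation against `statistics.variance` from the standard library, which uses the same divisor.

```python
import math
import statistics

wages = [12, 20, 16, 18, 19]   # hourly wages from the example, in dollars

n = len(wages)
x_bar = sum(wages) / n                                 # sample mean
s2 = sum((x - x_bar) ** 2 for x in wages) / (n - 1)    # sample variance, divisor n - 1
s = math.sqrt(s2)                                      # sample standard deviation

print(x_bar, s2, round(s, 2))
print(statistics.variance(wages))   # library cross-check
```

The squared deviations from the mean of $17 are 25, 9, 1, 1, and 4, which sum to 40; dividing by n - 1 = 4 gives a sample variance of 10, and its square root, about 3.16, is the sample standard deviation in dollars.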
Sample Standard Deviation
$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
where:
$s^2$ is the sample variance
$x$ is the value of each observation in the sample
$\bar{x}$ is the mean of the sample
$n$ is the number of observations in the sample
Self Review
LO3-6
Example - Standard Deviation of Grouped Data
Refer to the frequency distribution for the Applewood Auto Group data
used earlier. Compute the standard deviation of the vehicle profits.
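For grouped data, the sample standard deviation is usually estimated as $s = \sqrt{\frac{\sum f(M - \bar{x})^2}{n - 1}}$, where $M$ is the class midpoint, $f$ is the class frequency, and $\bar{x}$ is the grouped mean. The Python sketch below illustrates the calculation on a hypothetical frequency table; it is NOT the Applewood table, which is not reproduced in this text, and the name `grouped_std` is introduced only for this example.

```python
import math

# Hypothetical frequency table, NOT the Applewood data:
# each entry is (lower class limit, upper class limit, frequency).
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

def grouped_std(table):
    """Estimate the sample standard deviation from midpoints and frequencies."""
    n = sum(f for _, _, f in table)
    mids = [((lo + hi) / 2, f) for lo, hi, f in table]   # (midpoint M, frequency f)
    mean = sum(f * m for m, f in mids) / n               # grouped mean
    ss = sum(f * (m - mean) ** 2 for m, f in mids)       # sum of f * (M - mean)^2
    return math.sqrt(ss / (n - 1))                       # sample divisor n - 1

print(round(grouped_std(classes), 2))
```

As with the grouped mean, the result is an estimate, since each observation is represented by the midpoint of its class.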
LO3-5 Explain and apply Chebyshev’s
theorem and the Empirical Rule.
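Chebyshev’s Theorem
For any set of observations (sample or population), the proportion of the values that lie within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$, where k is any value greater than 1.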
Example
Dupree Paint Company employees contribute a mean of $51.54 to the
company’s profit-sharing plan every two weeks. The standard deviation of
biweekly contributions is $7.51. At least what percent of the
contributions lie within plus 3.5 standard deviations and minus 3.5
standard deviations of the mean, that is, between $25.26 and $77.83?
Solution
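Chebyshev's theorem gives a lower bound of 1 - 1/k² on the proportion of values within k standard deviations of the mean, for any k greater than 1. The Python sketch below applies that bound to the Dupree Paint figures with k = 3.5; the function name `chebyshev_min_proportion` is a label chosen for this illustration.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k ** 2

mean, std_dev, k = 51.54, 7.51, 3.5   # figures from the example

lower = mean - k * std_dev   # about $25.26 after rounding up to the cent
upper = mean + k * std_dev   # about $77.83 after rounding up to the cent

print(round(chebyshev_min_proportion(k), 2), lower, upper)
```

With k = 3.5 the bound is 1 - 1/12.25, so at least about 92% of the contributions lie in that interval, whatever the shape of the distribution.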
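The Empirical Rule
For a symmetrical, bell-shaped frequency distribution:
◼ About 68% of the observations lie within plus and minus one standard deviation of the mean.
◼ About 95% of the observations lie within plus and minus two standard deviations of the mean.
◼ Practically all (99.7%) lie within plus and minus three standard deviations of the mean.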
Example
A sample of the monthly rentals at University Park Apartments approximates a
symmetrical, bell-shaped distribution. The sample mean is $500; the
standard deviation is $20. Using the Empirical Rule, answer these questions:
1. About 68% of the monthly rentals are between what two amounts?
2. About 95% of the monthly rentals are between what two amounts?
3. Almost all of the monthly rentals are between what two amounts?
Solution
1. About 68% are between 500 - 20 and 500 + 20, that is, $480 and $520.
2. About 95% are between 500 - 2(20) and 500 + 2(20), that is, $460 and $540.
3. Almost all (99.7%) are between 500 - 3(20) and 500 + 3(20), that is, $440 and $560.
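The three intervals in the solution follow one pattern: the mean plus and minus 1, 2, and 3 standard deviations. A minimal Python sketch, with `empirical_rule_intervals` as a hypothetical helper name:

```python
def empirical_rule_intervals(mean, std_dev):
    """Intervals of 1, 2 and 3 standard deviations around the mean."""
    coverage = {1: "about 68%", 2: "about 95%", 3: "about 99.7%"}
    return {
        label: (mean - k * std_dev, mean + k * std_dev)
        for k, label in coverage.items()
    }

# Sample mean and standard deviation from the rental example, in dollars.
intervals = empirical_rule_intervals(500, 20)
for label, (low, high) in intervals.items():
    print(label, low, high)
```

Unlike Chebyshev's theorem, these percentages are approximations that apply only when the distribution is roughly symmetrical and bell-shaped.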
55- About 69%, found by $1 - \frac{1}{(1.8)^2}$
56- About 84%. Each income level lies 2.5 standard deviations from the mean, so $1 - \frac{1}{(2.5)^2} = 0.84$
TRY & CHECK
(1)
Here are six cards. There is a number on each card.
Two of the numbers are hidden.
4, 5, ?, 6, 3, ?
The mode of the six numbers is 4.
The mean of the six numbers is 5.
Work out the two numbers that are hidden.
TRY & CHECK
(2)
TRY & CHECK
(3)
TRY & CHECK
(4)
Three numbers have a mode of 6 and a mean of 7.
Find the three numbers.
TRY & CHECK
(5)
Mark ran a mean distance of 13.2 km per day over five days.
The next day Mark ran 20 km.
Find the mean distance Mark ran per day over the six days.