Business Statistics:
A First Course
5th Edition
Chapter 6
The Normal Distribution
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.
Learning Objectives
In this chapter, you learn:

To compute probabilities from the normal distribution

To use the normal probability plot to determine whether a set of data is approximately normally distributed
Continuous Probability Distributions

A continuous random variable is a variable that can assume any value on a continuum (it can assume an uncountable number of values). Examples:

thickness of an item
time required to complete a task
temperature of a solution
height, in inches

These can potentially take on any value, depending only on the ability to measure precisely and accurately.
The Normal Distribution
‘Bell Shaped’
Symmetrical
Mean, Median and Mode are Equal

Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell-shaped curve f(X) against X, centered at μ with spread σ; Mean = Median = Mode]
The Normal Distribution
Density Function
The formula for the normal probability density function is

$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$$

Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
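The density formula can be evaluated directly. The sketch below is only an illustration: the function name `normal_pdf` and the parameter values (μ = 100, σ = 50) are assumptions chosen for the example, not part of the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density f(X) of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coeff * math.exp(exponent)

# Illustrative parameters (an assumption for this sketch): mu = 100, sigma = 50.
peak = normal_pdf(100, 100, 50)   # density at the mean, the highest point of the curve
left = normal_pdf(50, 100, 50)    # one standard deviation below the mean
right = normal_pdf(150, 100, 50)  # one standard deviation above the mean
print(peak, left, right)
```

Because the curve is symmetric about μ, `left` and `right` come out equal, and both are smaller than `peak`.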
Many Normal Distributions
By varying the parameters μ and σ, we obtain different normal distributions
The Normal Distribution
Shape
Changing μ shifts the distribution left or right.
Changing σ increases or decreases the spread.
[Figure: normal curve f(X) against X, with μ and σ marked]
The Standardized Normal



Any normal distribution (with any mean and standard deviation combination) can be transformed into the standardized normal distribution (Z)
Need to transform X units into Z units
The standardized normal distribution (Z) has a mean of 0 and a standard deviation of 1
Translation to the Standardized
Normal Distribution

Translate from X to the standardized normal (the “Z” distribution) by subtracting the mean of X and dividing by its standard deviation:
$$Z = \frac{X-\mu}{\sigma}$$
The Z distribution always has mean = 0 and standard deviation = 1
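The claim that standardized values have mean 0 and standard deviation 1 can be checked on a small data set. This is a minimal sketch: the helper `standardize` and the sample values are hypothetical, and the population standard deviation is used so the result is exactly 1 up to rounding.

```python
import statistics

def standardize(values):
    """Apply Z = (X - mu) / sigma to every value, using the data's own mean and standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]

sample = [3.0, 7.5, 8.0, 9.5, 12.0]  # hypothetical values, not from the slides
z_values = standardize(sample)
print(statistics.fmean(z_values), statistics.pstdev(z_values))
```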
The Standardized Normal
Probability Density Function

The formula for the standardized normal probability density function is
$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^{2}}$$
Where
e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
Z = any value of the standardized normal distribution
The Standardized
Normal Distribution



Also known as the “Z” distribution
Mean is 0
Standard Deviation is 1
[Figure: standardized normal curve f(Z) against Z, centered at 0 with standard deviation 1]
Values above the mean have positive Z-values; values below the mean have negative Z-values.
Example

If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is
$$Z = \frac{X-\mu}{\sigma} = \frac{200-100}{50} = 2.0$$
This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.
Comparing X and Z units
[Figure: the same normal curve shown on two scales; X = 100 and X = 200 on the X scale (μ = 100, σ = 50) correspond to Z = 0 and Z = 2.0 on the Z scale (μ = 0, σ = 1)]
Note that the shape of the distribution is the same; only the scale has changed. We can express the problem in original units (X) or in standardized units (Z).
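One way to see that only the scale changes is to compute the same probability in both units. The sketch below uses Python's standard-library `statistics.NormalDist` with the values μ = 100, σ = 50 and X = 200 from this example; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 100, 50
x = 200

z = (x - mu) / sigma                 # X = 200 expressed in Z units
p_x = NormalDist(mu, sigma).cdf(x)   # P(X < 200) in original units
p_z = NormalDist(0, 1).cdf(z)        # P(Z < 2.0) in standardized units
print(z, round(p_x, 4), round(p_z, 4))
```

Both probabilities agree, which is why a single table for Z is enough for any normal distribution.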
Finding Normal Probabilities
Probability is measured by the area under the curve
P(a ≤ X ≤ b) = P(a < X < b)
(Note that the probability of any individual value is zero)
[Figure: normal curve f(X) against X, with the area between a and b marked]
Probability as
Area Under the Curve
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean and half is below
P(−∞ < X < μ) = 0.5
P(μ < X < ∞) = 0.5
P(−∞ < X < ∞) = 1.0
[Figure: normal curve f(X) against X, with area 0.5 on each side of μ]
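The area statements can be checked numerically. The sketch below is an illustration only: it integrates the standardized normal density with a simple trapezoidal rule, and it treats ±10 standard deviations as a stand-in for ±∞ (an assumption; the area beyond that is negligible).

```python
import math

def std_normal_pdf(z):
    """Standardized normal density f(Z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def area(f, a, b, n=20000):
    """Trapezoidal-rule approximation of the area under f between a and b."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

total_area = area(std_normal_pdf, -10, 10)           # approximates P(-inf < Z < inf)
lower_half = area(std_normal_pdf, -10, 0, n=10000)   # approximates P(-inf < Z < 0)
print(round(total_area, 6), round(lower_half, 6))
```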
The Standardized Normal Table
The Cumulative Standardized Normal table in the textbook (Appendix Table E.2) gives the probability less than a desired value of Z (i.e., from negative infinity to Z)

Example:
P(Z < 2.00) = 0.9772
[Figure: standardized normal curve with the area 0.9772 to the left of Z = 2.00]
The Standardized Normal Table
(continued)
The row shows the value of Z to the first decimal point
The column gives the value of Z to the second decimal point
The value within the table gives the probability from Z = −∞ up to the desired Z value

Z      0.00     0.01     0.02 …
0.0
0.1
.
.
.
2.0    .9772

P(Z < 2.00) = 0.9772
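A table entry can be reproduced from the error function in Python's `math` module. This is a sketch, not the textbook's method of constructing Table E.2; the helper name `std_normal_cdf` is an assumption.

```python
import math

def std_normal_cdf(z):
    """P(Z < z) for the standardized normal, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Rebuild the start of the row for Z = 2.0: columns 0.00, 0.01 and 0.02.
row = [round(std_normal_cdf(2.0 + col / 100), 4) for col in range(3)]
print(row)
```

The first entry of `row` corresponds to the table lookup P(Z < 2.00) = 0.9772.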
General Procedure for Finding
Normal Probabilities
To find P(a < X < b) when X is distributed normally:

Draw the normal curve for the problem in terms of X

Translate X-values to Z-values

Use the Standardized Normal Table
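The last two steps of this procedure can be sketched in code, with `statistics.NormalDist(0, 1).cdf` standing in for the Standardized Normal Table. The function name `prob_between` and the example values (a = 50, b = 150, μ = 100, σ = 50) are assumptions for illustration.

```python
from statistics import NormalDist

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a normal X: translate to Z-values, then look up cumulative probabilities."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    std = NormalDist(0, 1)  # plays the role of the Standardized Normal Table
    return std.cdf(z_b) - std.cdf(z_a)

# Hypothetical problem: mu = 100, sigma = 50, probability between 50 and 150.
p = prob_between(50, 150, 100, 50)
print(round(p, 4))
```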
Finding Normal Probabilities


Let X represent the time it takes to download an image file from the internet.
Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6)
[Figure: normal curve against X, with 8.0 and 8.6 marked]
Finding Normal Probabilities
(continued)
$$Z = \frac{X-\mu}{\sigma} = \frac{8.6-8.0}{5.0} = 0.12$$
[Figure: two curves. On the X scale (μ = 8, σ = 5), the area P(X < 8.6) lies to the left of 8.6. On the Z scale (μ = 0, σ = 1), the same area P(Z < 0.12) lies to the left of 0.12]
Solution: Finding P(Z < 0.12)
Standardized Normal Probability Table (Portion)

Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255

P(X < 8.6) = P(Z < 0.12) = .5478
[Figure: standardized normal curve with the area 0.5478 to the left of Z = 0.12]
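The table lookup can be cross-checked in software. The sketch below assumes Python's standard-library `statistics.NormalDist` as a substitute for the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

z = (8.6 - mu) / sigma                    # translate X = 8.6 into Z units
p_exact = NormalDist(mu, sigma).cdf(8.6)  # P(X < 8.6) without the table
print(round(z, 2), round(p_exact, 4))
```

Rounded to four decimals, the computed probability matches the table value of .5478.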
Finding Normal
Upper Tail Probabilities


Suppose X is normal with mean 8.0 and standard deviation 5.0.
Now find P(X > 8.6)
[Figure: normal curve against X, with 8.0 and 8.6 marked]
Finding Normal
Upper Tail Probabilities
(continued)

Now find P(X > 8.6)…
P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12)
= 1.0 - 0.5478 = 0.4522
[Figure: two standardized normal curves. The first shows the area 0.5478 to the left of Z = 0.12 out of a total area of 1.000; the second shows the remaining upper-tail area 1.0 - 0.5478 = 0.4522]
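The complement rule for an upper tail takes one line in code. This is a minimal sketch using `statistics.NormalDist`; the name `download` for the distribution object is an assumption.

```python
from statistics import NormalDist

download = NormalDist(mu=8.0, sigma=5.0)

# The cumulative function gives P(X <= 8.6); the upper tail is its complement.
p_upper = 1.0 - download.cdf(8.6)
print(round(p_upper, 4))
```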
Finding a Normal Probability
Between Two Values
Suppose X is normal with mean 8.0 and
standard deviation 5.0. Find P(8 < X < 8.6)

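Calculate Z-values:
$$Z = \frac{X-\mu}{\sigma} = \frac{8-8}{5} = 0$$
$$Z = \frac{X-\mu}{\sigma} = \frac{8.6-8}{5} = 0.12$$
P(8 < X < 8.6) = P(0 < Z < 0.12)
[Figure: the interval from 8 to 8.6 on the X scale corresponds to the interval from 0 to 0.12 on the Z scale]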
Solution: Finding P(0 < Z < 0.12)
Standardized Normal Probability Table (Portion)

Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255

P(8 < X < 8.6)
= P(0 < Z < 0.12)
= P(Z < 0.12) - P(Z ≤ 0)
= 0.5478 - 0.5000 = 0.0478
[Figure: standardized normal curve with area 0.5000 to the left of Z = 0.00 and area 0.0478 between Z = 0.00 and Z = 0.12]
Probabilities in the Lower Tail


Suppose X is normal with mean 8.0 and
standard deviation 5.0.
Now find P(7.4 < X < 8)
[Figure: normal curve against X, with 7.4 and 8.0 marked]
Probabilities in the Lower Tail
(continued)
Now find P(7.4 < X < 8)…
P(7.4 < X < 8)
= P(-0.12 < Z < 0)
= P(Z < 0) - P(Z ≤ -0.12)
= 0.5000 - 0.4522 = 0.0478
The Normal distribution is symmetric, so this probability is the same as P(0 < Z < 0.12)
[Figure: normal curve with area 0.4522 to the left of X = 7.4 (Z = -0.12) and area 0.0478 between X = 7.4 and X = 8.0 (Z = -0.12 and Z = 0)]
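The symmetry argument can be verified by computing both intervals. The sketch below assumes `statistics.NormalDist` in place of the table; the variable names are illustrative.

```python
from statistics import NormalDist

download = NormalDist(mu=8.0, sigma=5.0)

# Interval just below the mean and its mirror image just above the mean.
p_lower = download.cdf(8.0) - download.cdf(7.4)   # P(7.4 < X < 8)
p_upper = download.cdf(8.6) - download.cdf(8.0)   # P(8 < X < 8.6)
print(round(p_lower, 4), round(p_upper, 4))
```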
The Empirical Rule
What can we say about the distribution of values around the mean? For any normal distribution:
μ ± 1σ encloses about 68.26% of X’s
[Figure: normal curve f(X) against X, with 68.26% of the area between μ-1σ and μ+1σ]
The Empirical Rule
(continued)

μ ± 2σ covers about 95% of X’s

μ ± 3σ covers about 99.7% of X’s
[Figure: two normal curves; 95.44% of the area lies within μ ± 2σ, and 99.73% lies within μ ± 3σ]
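These coverage percentages follow from the cumulative probabilities at ±1, ±2 and ±3 standard deviations. The sketch below is a check using `statistics.NormalDist`, which computes without table rounding, so the last printed digit can differ slightly from figures built from four-decimal table entries.

```python
from statistics import NormalDist

std = NormalDist(0, 1)

# Area between -k and +k standard deviations for k = 1, 2, 3.
coverage = {k: std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"mu +/- {k} sigma: {100 * p:.2f}%")
```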
Given a Normal Probability
Find the X Value
Steps to find the X value for a known probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:
$$X = \mu + Z\sigma$$
Finding the X value for a
Known Probability
(continued)
Example:



Let X represent the time it takes (in seconds) to download an image file from the internet.
Suppose X is normal with mean 8.0 and standard deviation 5.0
Find X such that 20% of download times are less than X.
[Figure: normal curve with lower-tail area 0.2000; the cutoff is unknown on both the X scale (mean 8.0) and the Z scale (mean 0)]
Find the Z value for
20% in the Lower Tail
1. Find the Z value for the known probability

Standardized Normal Probability Table (Portion)

Z      …  .03    .04    .05
-0.9   …  .1762  .1736  .1711
-0.8   …  .2033  .2005  .1977
-0.7   …  .2327  .2296  .2266

20% area in the lower tail is consistent with a Z value of -0.84
[Figure: normal curve with lower-tail area 0.2000 to the left of Z = -0.84; the corresponding X value is still to be found (mean 8.0)]
Finding the X value
2. Convert to X units using the formula:
$$X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$$
So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80
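The two steps can be mirrored in code. This sketch compares the table-based answer with `statistics.NormalDist.inv_cdf`, which inverts the cumulative distribution directly; the variable names are assumptions. The direct inverse uses an unrounded Z value, so its result is slightly below the table-based 3.80.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

# Step 1 (table): a lower-tail area of 0.20 corresponds to Z of about -0.84.
z_table = -0.84
# Step 2: convert back to X units with X = mu + Z * sigma.
x_table = mu + z_table * sigma

# Direct inverse of the cumulative distribution, without rounding Z to two decimals.
x_exact = NormalDist(mu, sigma).inv_cdf(0.20)
print(round(x_table, 2), round(x_exact, 2))
```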
Evaluating Normality

Not all continuous distributions are normal

It is important to evaluate how well the data set is approximated by a normal distribution.

Normally distributed data should approximate the theoretical normal distribution:

The normal distribution is bell shaped (symmetrical) where the mean is equal to the median.

The empirical rule applies to the normal distribution.

The interquartile range of a normal distribution is approximately 1.33 standard deviations.
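The interquartile range benchmark can be derived from the quartiles of the standardized normal distribution. The sketch below uses `statistics.NormalDist.inv_cdf`; note that the unrounded ratio comes out nearer 1.35, so the 1.33 figure quoted here is best read as the textbook's rounded rule of thumb.

```python
from statistics import NormalDist

std = NormalDist(0, 1)

# Third quartile minus first quartile of Z, i.e. the IQR measured in standard deviations.
iqr_in_sigmas = std.inv_cdf(0.75) - std.inv_cdf(0.25)
print(round(iqr_in_sigmas, 2))
```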
Evaluating Normality
(continued)
Comparing data characteristics to theoretical properties
Construct charts or graphs

For small- or moderate-sized data sets, construct a stem-and-leaf display or a boxplot to check for symmetry
For large data sets, does the histogram or polygon appear bell-shaped?

Compute descriptive summary measures

Do the mean, median and mode have similar values?
Is the interquartile range approximately 1.33 σ?
Is the range approximately 6 σ?
Evaluating Normality
(continued)
Comparing data characteristics to theoretical properties
Observe the distribution of the data set

Do approximately 2/3 of the observations lie within mean ±1 standard deviation?
Do approximately 80% of the observations lie within mean ±1.28 standard deviations?
Do approximately 95% of the observations lie within mean ±2 standard deviations?

Evaluate normal probability plot

Is the normal probability plot approximately linear (i.e. a straight line) with positive slope?
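The three coverage questions can be answered by counting observations. The sketch below is illustrative only: the helper `share_within` is hypothetical, and the data are simulated from a normal distribution (mean 8.0, standard deviation 5.0, fixed seed) rather than taken from any real data set.

```python
import random
import statistics

def share_within(values, k):
    """Fraction of observations within k sample standard deviations of the sample mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    inside = [x for x in values if abs(x - mean) <= k * sd]
    return len(inside) / len(values)

# Simulated data (an assumption for this sketch), so the checks should roughly pass.
rng = random.Random(0)
data = [rng.gauss(8.0, 5.0) for _ in range(5000)]

for k, benchmark in ((1, 2 / 3), (1.28, 0.80), (2, 0.95)):
    print(k, round(share_within(data, k), 3), "benchmark:", round(benchmark, 3))
```

For real data, shares far from the benchmarks would suggest a departure from normality.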
Constructing
A Normal Probability Plot

Normal probability plot

Arrange data into ordered array

Find corresponding standardized normal quantile values (Z)

Plot the pairs of points with observed data values (X) on the vertical axis and the standardized normal quantile values (Z) on the horizontal axis

Evaluate the plot for evidence of linearity
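The first three construction steps can be sketched as a function that returns the (Z, X) pairs to plot. Everything named here is an assumption for illustration: the helper `normal_plot_points`, the sample data, and in particular the plotting position i / (n + 1) for the i-th smallest value, which is one common convention and may differ from what a given software package uses.

```python
from statistics import NormalDist

def normal_plot_points(values):
    """Return (Z quantile, observed X) pairs for a normal probability plot."""
    ordered = sorted(values)                  # step 1: ordered array
    n = len(ordered)
    std = NormalDist(0, 1)
    # Step 2: assumed plotting position, cumulative proportion i / (n + 1).
    z_quantiles = [std.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    # Step 3: pair Z (horizontal axis) with X (vertical axis).
    return list(zip(z_quantiles, ordered))

returns = [12.0, 3.5, 8.0, 15.5, 10.0, 6.5, 18.0]  # hypothetical data
for z, x in normal_plot_points(returns):
    print(f"Z = {z:6.2f}   X = {x}")
```

Plotting these pairs and judging how close they fall to a straight line is the final step.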
The Normal Probability Plot
Interpretation
A normal probability plot for data from a normal distribution will be approximately linear:
[Figure: normal probability plot with X (30 to 90) on the vertical axis and Z (-2 to 2) on the horizontal axis]
Normal Probability Plot
Interpretation
(continued)
Nonlinear plots indicate a deviation from normality
[Figure: three normal probability plots, titled Left-Skewed, Right-Skewed and Rectangular; each plots X (30 to 90) against Z (-2 to 2)]
Evaluating Normality
An Example: Mutual Funds Returns
[Figure: Boxplot of 2006 Returns; horizontal axis Return 2006, from -10 to 40]
The boxplot appears reasonably symmetric, with four lower outliers at -9.0, -8.0, -8.0, -6.5 and one upper outlier at 35.0. (The normal distribution is symmetric.)
Evaluating Normality
An Example: Mutual Funds Returns
(continued)
Descriptive Statistics
• The mean (12.5142) is slightly less than the median (13.1). (In a normal distribution the mean and median are equal.)
• The interquartile range of 9.2 is approximately 1.46 standard deviations. (In a normal distribution the interquartile range is approximately 1.33 standard deviations.)
• The range of 44 is equal to 6.99 standard deviations. (In a normal distribution the range is approximately 6 standard deviations.)
• 72.2% of the observations are within 1 standard deviation of the mean. (In a normal distribution this percentage is 68.26%.)
• 87% of the observations are within 1.28 standard deviations of the mean. (In a normal distribution this percentage is 80%.)
Evaluating Normality
An Example: Mutual Funds Returns
(continued)
[Figure: Probability Plot of Return 2006 (Normal); vertical axis Percent from 0.01 to 99.99, horizontal axis Return 2006 from -10 to 40]
Plot is approximately a straight line except for a few outliers at the low end and the high end.
Evaluating Normality
An Example: Mutual Funds Returns
(continued)

Conclusions

The returns are slightly left-skewed
The returns have more values concentrated around the mean than expected
The range is larger than expected (caused by one outlier at 35.0)
The normal probability plot is a reasonably straight line
Overall, this data set does not greatly differ from the theoretical properties of the normal distribution
Chapter Summary

Presented normal distribution

Found probabilities for the normal distribution

Applied normal distribution to problems