Business Statistics for Managerial Decision

Probability Theory

The mathematics of probability can provide models to describe the flow of traffic through a highway system, a telephone interchange, or a computer processor; the product preferences of consumers; the spread of epidemics or computer viruses; and the rate of return on risky investments.
We are interested in probability because of
its usefulness in statistics.
General Probability Rules

Rule 1: $0 \le P(A) \le 1$ for any event A.
Rule 2: $P(S) = 1$, where S is the sample space.
Rule 3 (complement rule): for any event A,
$P(A^c) = 1 - P(A)$
Rule 4 (addition rule): if A and B are disjoint events, then
P(A or B) = P(A) + P(B)
General addition rule: for any events A and B,
P(A or B) = P(A) + P(B) - P(A and B)
Independence and the Multiplication Rule

Two events A and B are independent if
knowing that one occurs does not change
the probability that the other occurs. If A
and B are independent,
P(A and B) = P(A) × P(B)
Conditional Probability

The following table contains counts (in thousands) of persons aged 16 to 24 who are enrolled in school, classified by gender and employment status.

            Employed   Unemployed   Not in labor force   Total
Male            3927          520                 4611    9058
Female          4313          446                 4357    9116
Total           8240          966                 8968   18174





Randomly choose a person aged 16 to 24 who is enrolled
in school.
What is the probability that the person is employed?
Now we are told that the person chosen is female. What is
the probability that this person is employed?
This is a conditional probability: the probability of one event (the person chosen is employed) under the condition that we know another event has occurred (the person is female).
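The two questions above can be answered directly from the table counts. The following is a minimal sketch, not part of the original slides; the dictionary layout and variable names are illustrative assumptions.

```python
# Counts in thousands, copied from the table of enrolled persons aged 16 to 24.
counts = {
    "Male": {"Employed": 3927, "Unemployed": 520, "Not in labor force": 4611},
    "Female": {"Employed": 4313, "Unemployed": 446, "Not in labor force": 4357},
}

# Grand total and marginal totals.
total = sum(sum(row.values()) for row in counts.values())
employed = sum(row["Employed"] for row in counts.values())
female = sum(counts["Female"].values())

# Unconditional probability: employed persons out of everyone.
p_employed = employed / total

# Conditional probability: restrict attention to the female row only.
p_employed_given_female = counts["Female"]["Employed"] / female

print(f"P(employed) = {p_employed:.4f}")
print(f"P(employed | female) = {p_employed_given_female:.4f}")
```

Conditioning on "female" simply changes the denominator from the grand total to the female row total.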
Definition of conditional probability

When P(A) > 0, the conditional probability of B given A is
$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$$
Two events A and B are independent if
$$P(B \mid A) = P(B)$$
Example: Prosperity and education

Call a household prosperous if its income exceeds
$100,000. Call the household educated if the
householder completed college. Select an
American household at random, and let A be the
event that the selected household is prosperous
and B the event that it is educated. According to the
Current Population Survey, P(A) = .134, P(B) =
.254, and the probability that a household is both
prosperous and educated is P(A and B) = .080.




Draw a Venn diagram that shows the relation
between the events A and B. What is the
probability P(A or B) that the household selected
is either prosperous or educated?
In the diagram, shade the event that the household
is educated but not prosperous. What is the
probability of this event?
Find the conditional probability that a household
is educated, given that it is prosperous.
Are the events A and B independent? How do you
know?
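One way to check answers to these questions is to apply the general addition rule and the definition of conditional probability to the three given values. This is a hedged sketch; the variable names are illustrative and not from the slides.

```python
# Probabilities given in the example (Current Population Survey).
p_a = 0.134        # P(A): household is prosperous
p_b = 0.254        # P(B): household is educated
p_a_and_b = 0.080  # P(A and B)

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = p_a + p_b - p_a_and_b

# Educated but not prosperous: the part of B that lies outside A.
p_b_not_a = p_b - p_a_and_b

# Definition of conditional probability: P(B | A) = P(A and B) / P(A).
p_b_given_a = p_a_and_b / p_a

# A and B are independent only if P(B | A) equals P(B).
independent = abs(p_b_given_a - p_b) < 1e-9

print(f"P(A or B) = {p_a_or_b:.3f}")
print(f"P(B and not A) = {p_b_not_a:.3f}")
print(f"P(B | A) = {p_b_given_a:.3f}")
print(f"Independent? {independent}")
```

Because P(B | A) is much larger than P(B), knowing that a household is prosperous changes the probability that it is educated, so the events are not independent.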
The Binomial Distribution



A store sells 10 computers with 1-year warranties.
How many will not need repair within 1 year?
A company's human resources manager asks 100
employees if job stress is affecting their personal
lives. How many will say "yes"?
In all these situations, we want a probability
model for a count of successful outcomes.
The Binomial Setting





There is a fixed number n of observations.
The n observations are all independent. That is,
knowing the result of one observation tells you
nothing about the other observations.
Each observation falls into one of just two
categories, which for convenience we call
“success” and “failure”.
The probability of a success, call it p, is the same
for each observation.
Example

Tossing a coin n = 15 times
Binomial Distribution

The distribution of the count X of successes in a binomial setting is the binomial distribution with parameters n and p. The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are the whole numbers from 0 to n.
Binomial Probabilities


Suppose we toss a coin 20 times. Let X be
the number of heads.
What is the probability that X = 8?
Finding Binomial Probabilities: Formula

If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, 3, …, n. If k is any one of these values,
$$P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k} = \frac{n!}{k!\,(n-k)!}\, p^{k} (1-p)^{n-k}$$


We tossed a coin 20 times, and X is the number of heads.
What is the probability that X = 8?
In this example n = ----- and p = -----.
Using the binomial formula,
$$P(X = 8) = \frac{20!}{8!\,(20-8)!}\,(0.5)^{8}(1-0.5)^{20-8} = 0.1201$$
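The binomial formula can be evaluated in a few lines of code. The sketch below uses Python's standard `math.comb` for the binomial coefficient; the function name `binomial_pmf` is a hypothetical helper chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial count X with n observations and success probability p."""
    # math.comb(n, k) is the binomial coefficient n! / (k! (n - k)!).
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Coin example: 20 tosses of a fair coin, probability of exactly 8 heads.
p_eight_heads = binomial_pmf(8, 20, 0.5)
print(f"P(X = 8) = {p_eight_heads:.4f}")
```

The same helper works for any whole number k from 0 to n.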
Finding Binomial Probabilities: Tables


The formula given on the previous slide is practical for hand calculations when n is small.
In practice, we use either statistical packages or Table C in your Moore, McCabe, Duckworth, and Sclove textbook.
Example: Inspecting switches



The quality engineers inspect an SRS of 10 switches from a large shipment of which 10% fail to conform to specifications. What is the probability that no more than 1 of the 10 switches in the sample fails inspection?
The count X of nonconforming switches in the sample has approximately the binomial distribution with n = ----- and p = -----.
What is the probability that exactly 4 in the sample of 10 fail to conform to specifications?
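As a check on table lookups, both switch probabilities can be computed by summing binomial terms. This is a minimal sketch assuming the binomial model stated in the example (10 switches, 10% nonconforming); the helper names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial count."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """P(X <= k): add up the probabilities of 0, 1, ..., k successes."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1))

n_switches = 10      # sample size from the example
p_fail = 0.10        # proportion of nonconforming switches in the shipment

p_at_most_one = binomial_cdf(1, n_switches, p_fail)
p_exactly_four = binomial_pmf(4, n_switches, p_fail)

print(f"P(X <= 1) = {p_at_most_one:.4f}")
print(f"P(X = 4)  = {p_exactly_four:.4f}")
```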
Binomial Mean and Standard Deviation


If a count X has the binomial distribution based on n observations with probability p of success, what is the average count of successes in very many repetitions of the binomial setting?
If a count X has the binomial distribution with number of observations n and probability of success p, the mean and standard deviation of X are
$$\mu = np$$
$$\sigma = \sqrt{np(1-p)}$$
Example: Inspecting switches

The count X of bad switches is binomial with n = 10 and p = 0.1. The mean and standard deviation of this binomial distribution are
$$\mu = np = (10)(0.1) = 1$$
$$\sigma = \sqrt{np(1-p)} = \sqrt{10(0.1)(0.9)} = \sqrt{0.9} = 0.9487$$
The Normal Approximation to Binomial
Distribution


The binomial probability formula and tables are practical only when the number of trials n is small.
When n is large, we can use Normal probability calculations to approximate hard-to-calculate binomial probabilities.
Normal Approximation for Binomial
Distribution

Suppose that a count X has the binomial distribution with n trials and success probability p. When n is large, the distribution of X is approximately Normal,
$$N\left(np,\ \sqrt{np(1-p)}\right)$$
As a rule of thumb, we will use the Normal approximation when n and p satisfy $np \ge 10$ and $n(1-p) \ge 10$.
Example: Is clothes shopping frustrating?


Sample surveys show that fewer people enjoy shopping
than in the past. A recent survey asked a nationwide
random sample of 2500 adults if they agreed or disagreed
that “I like buying new clothes, but shopping is often
frustrating and time consuming.” The population that the
poll wants to draw conclusions about is all the U.S.
residents aged 18 and over. Suppose that 60% of all adult
U.S. residents would say “agree” if asked the same
question.
What is the probability that 1520 or more of the sample agree?
[Figure: histogram of 1000 binomial counts (n = 2500, p = 0.6) with the Normal density curve that approximates this binomial distribution.]
$$\mu = np = (2500)(0.6) = 1500$$
$$\sigma = \sqrt{np(1-p)} = \sqrt{2500(0.6)(0.4)} = 24.49$$
X is approximately N(1500, 24.49).
The probability that 1520 or more of the sample agree is
$$P(X \ge 1520) = P\left(Z \ge \frac{1520 - 1500}{24.49}\right) = P(Z \ge 0.82) = 1 - 0.7939 = 0.2061$$
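The Normal approximation for the shopping survey can also be computed without a table. The sketch below is an illustration only: it evaluates the standard Normal upper tail with `math.erfc` from Python's standard library, and the function name `normal_upper_tail` is a hypothetical helper.

```python
import math

def normal_upper_tail(x, mean, sd):
    """P(X >= x) for a Normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    # For a standard Normal Z, P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))

n, p = 2500, 0.6
mu = n * p                           # mean of the binomial count
sigma = math.sqrt(n * p * (1 - p))   # standard deviation of the binomial count

# Rule of thumb for using the Normal approximation.
rule_of_thumb_ok = n * p >= 10 and n * (1 - p) >= 10

p_tail = normal_upper_tail(1520, mu, sigma)
print(f"mu = {mu:.0f}, sigma = {sigma:.2f}, rule of thumb satisfied: {rule_of_thumb_ok}")
print(f"P(X >= 1520) is approximately {p_tail:.4f}")
```

The result (roughly 0.207) differs slightly from the table-based 0.2061 because the table calculation rounds z to 0.82 before the lookup.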
The Poisson Distributions





It is common to meet counts that are open ended.
A bank counts the number of automatic teller
machine (ATM) customers arriving at a particular
ATM between 2:00 p.m. and 4:00 p.m.
A railyard counts the number of work injuries that
happen in a month.
What are the possible outcomes for these
examples?
The Poisson distribution is another model for random variables that are counts.
The Poisson setting



The number of events (call them successes) that
occur in any unit of measure is independent of the
number of successes that occur in any nonoverlapping unit of measure.
The probability that a success will occur in a unit
of measure is the same for all units of equal size
and is proportional to the size of the unit.
The probability that 2 or more successes will
occur in a unit approaches 0 as the size of the unit
becomes smaller.
Poisson Distribution

The distribution of the count X of successes in the Poisson setting is the Poisson distribution with mean $\mu$. The parameter $\mu$ is the mean number of successes per unit of measure. The possible values of X are the whole numbers 0, 1, 2, 3, …. If k is any whole number 0 or greater, then
$$P(X = k) = \frac{e^{-\mu}\mu^{k}}{k!}$$
The standard deviation of the distribution is $\sqrt{\mu}$.
Example: Flaws in carpets


A carpet manufacturer knows that the number of
flaws per square yard in a type of carpet material
varies with an average of 1.6 flaws per square
yard. The count X of flaws per square yard can be
modeled by the Poisson distribution with $\mu = 1.6$.
The unit of measure is a square yard of carpet
material.
What is the probability of no more than 2 defects
in a randomly chosen square yard of this material?
With $\mu = 1.6$, the Poisson formula $P(X = k) = e^{-\mu}\mu^{k}/k!$ gives
$$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$$
$$= \frac{e^{-1.6}(1.6)^{0}}{0!} + \frac{e^{-1.6}(1.6)^{1}}{1!} + \frac{e^{-1.6}(1.6)^{2}}{2!}$$
$$= 0.2019 + 0.3230 + 0.2584 \approx 0.7834$$
The Role of Probability in Statistical
Inference

A statistic from a random sample will take different values if we take more samples from the same population. That is, sample statistics are random variables.
The values of a statistic in many samples (its sampling distribution) have a regular pattern.
We will use the language of probability to examine the sampling distribution of a sample mean $\bar{X}$.
Example: Does this wine smell bad?


Sulfur compounds such as dimethyl sulfide (DMS) are sometimes present in wine. DMS causes "off-odors" in wine, so winemakers want to know the odor threshold, the lowest concentration of DMS that the human nose can detect. Different people have different thresholds, so we start by asking about the mean threshold $\mu$ in the population of all adults.
The number $\mu$ is a parameter that describes this population.




To estimate $\mu$, we present tasters with both natural wine and the same wine spiked with DMS at different concentrations to find the lowest concentration at which they can identify the spiked wine.
Here are the odor thresholds (measured in micrograms of DMS per liter of wine) for 10 randomly selected subjects:
28 40 28 33 20 31 29 27 17 21
The mean threshold for these subjects is $\bar{X} = 27.4$.
This sample mean is a statistic that we use to estimate the parameter $\mu$. It is probably not exactly equal to $\mu$, and a different 10 subjects would give us a different $\bar{X}$.
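The sample mean quoted above is easy to verify from the ten thresholds. A minimal sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Odor thresholds (micrograms of DMS per liter) for the 10 subjects in the example.
thresholds = [28, 40, 28, 33, 20, 31, 29, 27, 17, 21]

# The sample mean is the statistic used to estimate the population mean.
sample_mean = statistics.mean(thresholds)
print(f"Sample mean = {sample_mean}")
```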
Statistical Estimation and the Law of
Large Numbers




A parameter, such as the mean threshold $\mu$ of all adults, is in practice a fixed but unknown number.
A statistic, such as the mean threshold $\bar{X}$ of a random sample of 10 adults, is a random variable.
We use $\bar{X}$ to estimate $\mu$.
An SRS should fairly represent the population, so the mean $\bar{X}$ of the sample should be somewhere near the mean $\mu$ of the population (that is, $\bar{X}$ is an unbiased estimate of $\mu$).


If $\bar{X}$ is rarely exactly right and varies from sample to sample, why is it nonetheless a reasonable estimate of the population mean $\mu$?
The answer: if we keep on taking larger and larger samples, the statistic $\bar{X}$ is guaranteed to get closer and closer to the parameter $\mu$.
That is, if we can afford to keep on measuring more subjects, eventually we will estimate the mean odor threshold of all adults very accurately.
This fact is known as the law of large numbers.
The Law of Large Numbers


Draw independent observations at random from any population with finite mean $\mu$. As the number of observations drawn increases, the mean $\bar{X}$ of the observed values gets closer and closer to the mean $\mu$ of the population.
In fact, the distribution of odor thresholds among all adults has mean $\mu = 25$. As we take more observations, the sample mean $\bar{X}$ eventually approaches the mean $\mu$ of the population.
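The law of large numbers can be illustrated with a short simulation. The sketch below assumes a Normal population with mean 25 and standard deviation 7 (the values this deck uses for the odor-threshold example); the function name, seed, and sample sizes are arbitrary illustrative choices.

```python
import random

def running_means(n_obs, mu=25.0, sigma=7.0, seed=0):
    """Return the sample mean after each of n_obs simulated observations."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    means = []
    for i in range(1, n_obs + 1):
        total += rng.gauss(mu, sigma)  # one simulated odor threshold
        means.append(total / i)        # mean of the first i observations
    return means

means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: sample mean = {means[n - 1]:.3f}")
```

With small n the sample mean can wander noticeably; as n grows, it settles near 25.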
Sampling Distributions


The law of large numbers assures us that if we measure enough subjects, the statistic $\bar{X}$ will eventually get very close to the unknown parameter $\mu$.
In our example we had a sample of 10 subjects.
What can we say about $\bar{X}$ from 10 subjects as an estimate of $\mu$?
That is, what would happen if we took many samples of 10 subjects from this population?



To answer this question:
Take a large number of samples of size 10 from the same population.
Calculate the sample mean $\bar{X}$ for each sample.
Make a histogram of the values of $\bar{X}$. This histogram shows how $\bar{X}$ varies in many samples.
The histogram of values of the statistic approximates the sampling distribution that we would see if we kept on sampling forever.
One reason for studying probability is that the laws of probability can tell us about sampling distributions without the need to actually choose or simulate a large number of samples.
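The three steps above can be sketched as a simulation. This is an illustration under stated assumptions: the population is taken to be Normal with mean 25 and standard deviation 7 (the odor-threshold values used elsewhere in this deck), and the number of samples, the seed, and the function name are arbitrary choices.

```python
import math
import random
import statistics

def simulate_sample_means(n_samples=10_000, n=10, mu=25.0, sigma=7.0, seed=1):
    """Draw n_samples samples of size n and return the mean of each sample."""
    rng = random.Random(seed)
    sample_means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]  # step 1: one sample of size n
        sample_means.append(sum(sample) / n)               # step 2: its sample mean
    return sample_means

sample_means = simulate_sample_means()

# Step 3 (summarised numerically instead of as a histogram):
center = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)
print(f"Mean of the sample means: {center:.2f}")
print(f"Standard deviation of the sample means: {spread:.2f}")
print(f"sigma / sqrt(n) for comparison: {7 / math.sqrt(10):.2f}")
```

The simulated means center near 25, and their spread is much smaller than the population standard deviation of 7.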
The Mean and Standard Deviation of $\bar{X}$

Suppose that $\bar{X}$ is the mean of an SRS of size n drawn from a large population with mean $\mu$ and standard deviation $\sigma$. Then the mean of the sampling distribution of $\bar{X}$ is $\mu$ and its standard deviation is $\sigma/\sqrt{n}$.
Sampling Distribution of a Sample Mean

If a population has the $N(\mu, \sigma)$ distribution, then the sample mean $\bar{X}$ of n independent observations has the $N(\mu, \sigma/\sqrt{n})$ distribution.
Example: Estimating Odor Threshold

Adults differ in the smallest amount of DMS they can detect in wine. Extensive studies have found that the DMS odor threshold of adults follows roughly a Normal distribution with mean $\mu = 25$ μg/l and standard deviation $\sigma = 7$ μg/l. Because the population distribution is Normal, the sampling distribution of $\bar{X}$ is also Normal.
Both distributions have the same mean.
But means ($\bar{X}$) from a sample of 10 adults vary less than do measurements on individual adults.
The standard deviation of $\bar{X}$ is
$$\frac{\sigma}{\sqrt{n}} = \frac{7}{\sqrt{10}} = 2.21\ \mu\text{g/l}$$
[Figure: the distribution of single observations compared with the distribution of the mean $\bar{X}$ of 10 observations.]
Averages are less variable than individual observations.
Central Limit Theorem




What happens when the population distribution is not Normal?
As the sample size increases, the distribution of $\bar{X}$ changes shape: it looks less like that of the population and more like a Normal distribution.
When the sample is large enough, the distribution of $\bar{X}$ is very close to Normal.
This important fact of probability is called the central limit theorem.
The Central Limit Theorem in Action

[Figure: the distribution of means $\bar{X}$ from a strongly non-Normal population becomes more Normal as the sample size increases. (a) The distribution of 1 observation. (b) The distribution of $\bar{X}$ for 2 observations. (c) The distribution of $\bar{X}$ for 10 observations. (d) The distribution of $\bar{X}$ for 25 observations.]
Central Limit Theorem

Draw an SRS of size n from any population with mean $\mu$ and finite standard deviation $\sigma$. When n is large, the sampling distribution of the sample mean $\bar{X}$ is approximately Normal:
$$\bar{X} \text{ is approximately } N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right)$$
Example: Flaws in carpets

The number of flaws per square yard in a type of carpet
material varies with mean 1.6 flaws per square yard and
standard deviation 1.2 flaws per square yard. The
population distribution cannot be normal, because a count
takes only whole number values. An inspector samples 200
square yards of material, records the number of flaws
found in each square yard, and calculates $\bar{X}$, the mean
number of flaws per square yard inspected. Use the central
limit theorem to find the approximate probability that the
mean number of flaws exceeds 2 per square yard.
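One way to carry out this calculation is to treat the sample mean as approximately Normal with mean 1.6 and standard deviation 1.2/√200, as the central limit theorem suggests. The sketch below is an illustration, not the slides' own solution; it uses `math.erfc` for the Normal tail, and the variable names are assumptions.

```python
import math

mu = 1.6      # population mean flaws per square yard
sigma = 1.2   # population standard deviation
n = 200       # number of square yards inspected

# Central limit theorem: the sample mean is approximately N(mu, sigma / sqrt(n)).
sd_of_mean = sigma / math.sqrt(n)

# Standardize the cutoff of 2 flaws per square yard.
z = (2.0 - mu) / sd_of_mean

# Upper tail of the standard Normal: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
p_exceeds_two = 0.5 * math.erfc(z / math.sqrt(2))

print(f"Standard deviation of the sample mean = {sd_of_mean:.4f}")
print(f"z = {z:.2f}")
print(f"Approximate P(mean > 2) = {p_exceeds_two:.2e}")
```

A z-score this far above 4 lies beyond the range of most printed Normal tables, so the probability is effectively zero for practical purposes.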