Chapter 1 Making Economic Decisions

Chapter 3
Basic Concepts in Statistics
and Probability
3.1 Probability
Definition: An experiment is a process that
results in an outcome that cannot be predicted
in advance with certainty.
Examples:
• rolling a die
• tossing a coin
• weighing the contents of a box of cereal.
Sample Space
Definition: The set of all possible outcomes of an
experiment is called the sample space for the
experiment.
Examples:
• For rolling a fair die, the sample space is {1, 2, 3, 4, 5, 6}.
• For a coin toss, the sample space is {heads, tails}.
• Imagine that a hole punch with a diameter of 10 mm punches
holes in sheet metal. Because of variation in the angle of
the punch and slight movements in the sheet metal, the
diameters of the holes vary between 10.0 and 10.2 mm.
For this experiment of punching holes, a reasonable
sample space is the interval (10.0, 10.2).
More Terminology
Definition: A subset of a sample space is called an
event.
• A given event is said to have occurred if the
outcome of the experiment is one of the
outcomes in the event. For example, if a die
comes up 2, the events {2, 4, 6} and {1, 2, 3}
have both occurred, along with every other
event that contains the outcome “2”.
Combining Events
The union of two events A and B, denoted
A ∪ B, is the set of outcomes that belong either
to A, to B, or to both.
In words, A ∪ B means “A or B.” So the event
“A or B” occurs whenever either A or B (or both)
occurs.
Example: Let A = {1, 2, 3} and B = {2, 3, 4}.
What is A ∪ B?
Intersections
The intersection of two events A and B, denoted
by A ∩ B, is the set of outcomes that belong to A
and to B. In words, A ∩ B means “A and B.”
Thus the event “A and B” occurs whenever both
A and B occur.
Example: Let A = {1, 2, 3} and B = {2, 3, 4}.
What is A ∩ B?
Complements
The complement of an event A, denoted A^c, is
the set of outcomes that do not belong to A. In
words, A^c means “not A.” Thus the event “not
A” occurs whenever A does not occur.
Example: Consider rolling a fair six-sided die. Let A be
the event: “rolling a six” = {6}.
What is A^c = “not rolling a six”?
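The three event operations above map directly onto set operations. A minimal sketch, assuming the sample space is a single roll of a six-sided die (the variable names are illustrative only):

```python
# Sample space for one roll of a six-sided die.
S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3}
B = {2, 3, 4}

union = A | B            # outcomes in A, in B, or in both
intersection = A & B     # outcomes in both A and B

six = {6}
not_six = S - six        # complement of "rolling a six", taken relative to S

print(union, intersection, not_six)
```

Note that the complement is always taken relative to the sample space, which is why S must be stated explicitly.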
Mutually Exclusive Events
Definition: The events A and B are said to be mutually
exclusive if they have no outcomes in
common.
More generally, a collection of events A1, A2, …, An
is said to be mutually exclusive if no two of them have
any outcomes in common.
Sometimes mutually exclusive events are referred to as
disjoint events.
Probabilities
Definition: Each event in the sample space has a
probability of occurring. Intuitively, the
probability is a quantitative measure of
how likely the event is to occur.
Given any experiment and any event A:
• The expression P(A) denotes the probability that
the event A occurs.
• P(A) is the proportion of times that the event A
would occur in the long run, if the experiment
were to be repeated over and over again.
Axioms of Probability
1. Let S be a sample space. Then P(S) = 1.
2. For any event A, 0 ≤ P(A) ≤ 1.
3. If A and B are mutually exclusive events, then
P(A ∪ B) = P(A) + P(B). More generally, if
A1, A2, … are mutually exclusive events, then
P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + …
A Few Useful Things
• For any event A, P(A^c) = 1 – P(A).
• Let ∅ denote the empty set. Then P(∅) = 0.
• If S is a sample space containing N equally likely
outcomes, and if A is an event containing k
outcomes, then P(A) = k/N.
• Addition Rule (for when A and B are not mutually
exclusive): P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
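The k/N rule and the addition rule can be checked on a small case. The sketch below assumes a fair six-sided die and reuses the events A = {1, 2, 3} and B = {2, 3, 4} from the earlier slides; the helper `prob` is a hypothetical name, not part of any library.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # equally likely outcomes of a fair die (assumed)

def prob(event, space=S):
    """P(event) = k/N when the N outcomes in the space are equally likely."""
    return Fraction(len(event & space), len(space))

A = {1, 2, 3}
B = {2, 3, 4}

lhs = prob(A | B)                          # P(A ∪ B)
rhs = prob(A) + prob(B) - prob(A & B)      # P(A) + P(B) - P(A ∩ B)
complement_ok = prob(S - A) == 1 - prob(A) # P(A^c) = 1 - P(A)

print(lhs, rhs, complement_ok)
```

Exact fractions are used so that both sides of the addition rule compare equal without rounding error.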
Conditional Probability and
Independence
Definition: A probability that is based on part of the
sample space is called a conditional probability.
Let A and B be events with P(B) ≠ 0. The conditional
probability of A given B is
P(A | B) = P(A ∩ B) / P(B).
Conditional Probability
Venn Diagram
Independence
Definition: Two events A and B are independent if
the probability of each event remains the same
whether or not the other occurs.
• If P(A) ≠ 0 and P(B) ≠ 0, then A and B are
independent if P(B|A) = P(B) or, equivalently,
P(A|B) = P(A).
• If either P(A) = 0 or P(B) = 0, then A and B are
independent.
• These concepts can be extended to more than
two events.
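As a rough illustration of conditional probability and the independence check, the sketch below again assumes a fair six-sided die. The events `even`, `low`, and `small` are hypothetical examples chosen for this sketch.

```python
from fractions import Fraction

S = set(range(1, 7))   # fair six-sided die (assumed)

def prob(event):
    return Fraction(len(event), len(S))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B); only defined when P(B) != 0."""
    if not b:
        raise ValueError("P(B) must be nonzero")
    return prob(a & b) / prob(b)

def independent(a, b):
    """Equivalent product form of independence: P(A ∩ B) = P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
low = {1, 2, 3, 4}     # roll is at most 4
small = {1, 2, 3}      # roll is at most 3

print(cond_prob(even, low), prob(even), independent(even, low))
print(cond_prob(even, small), prob(even), independent(even, small))
```

Knowing the roll is at most 4 leaves the chance of an even number unchanged, so those two events are independent; knowing it is at most 3 changes that chance, so those two are not.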
The Multiplication Rule
• If A and B are two events and P(B) ≠ 0, then
P(A ∩ B) = P(B)P(A|B).
• If A and B are two events and P(A) ≠ 0, then
P(A ∩ B) = P(A)P(B|A).
• If P(A) ≠ 0, and P(B) ≠ 0, then both of the above
hold.
• If A and B are two independent events, then
P(A ∩ B) = P(A)P(B).
Extended Multiplication Rule
• If A1, A2, …, An are independent events, then for
each collection Aj1, …, Ajm of events
P(Aj1 ∩ Aj2 ∩ … ∩ Ajm) = P(Aj1) P(Aj2) … P(Ajm)
• In particular,
P(A1 ∩ A2 ∩ … ∩ An) = P(A1) P(A2) … P(An)
Example
A system contains two components, A and B,
connected in series. The system will function only
if both components function. The probability that A
functions is 0.98 and the probability that B
functions is 0.95. Assume A and B function
independently. Find the probability that the system
functions.
Example
A system contains two components, C and D,
connected in parallel. The system will function if
either C or D functions. The probability that C
functions is 0.90 and the probability that D
functions is 0.85. Assume C and D function
independently. Find the probability that the system
functions.
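The series and parallel examples can both be worked with the multiplication rule for independent events. A minimal sketch, where `series` and `parallel` are hypothetical helper names and independence of the components is assumed, as the examples state:

```python
def series(*probs):
    """System works only if every component works: multiply the probabilities."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def parallel(*probs):
    """System works if at least one component works: 1 - P(all components fail)."""
    all_fail = 1.0
    for p in probs:
        all_fail *= (1.0 - p)
    return 1.0 - all_fail

p_series = series(0.98, 0.95)      # components A and B in series
p_parallel = parallel(0.90, 0.85)  # components C and D in parallel

print(round(p_series, 4), round(p_parallel, 4))
```

The parallel case uses the complement rule: the system fails only when both C and D fail, which has probability (0.10)(0.15) = 0.015.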
Example
P(A)= 0.995; P(B)= 0.99
P(C)= P(D)= P(E)= 0.95
P(F)= 0.90; P(G)= 0.90, P(H)= 0.98
Random Variables
Definition: A random variable assigns a numerical
value to each outcome in a sample space.
Definition: A random variable is discrete if its
possible values form a discrete set.
Probability Mass Function
• The description of the possible values of X and
the probabilities of each has a name: the
probability mass function.
Definition: The probability mass function (pmf) of a
discrete random variable X is the function
p(x) = P(X = x).
• The probability mass function is sometimes called
the probability distribution.
Probability Mass Function
Example
Cumulative Distribution Function
• The probability mass function specifies the
probability that a random variable is equal to a given
value.
• A function called the cumulative distribution
function (cdf) specifies the probability that a
random variable is less than or equal to a given
value.
• The cumulative distribution function of the random
variable X is the function F(x) = P(X ≤ x).
More on a
Discrete Random Variable
Let X be a discrete random variable. Then
• The probability mass function of X is the function
p(x) = P(X = x).
• The cumulative distribution function of X is the
function F(x) = P(X ≤ x).
• $F(x) = \sum_{t \le x} p(t) = \sum_{t \le x} P(X = t)$.
• $\sum_x p(x) = \sum_x P(X = x) = 1$, where the sum is over all the
possible values of X.
Mean and Variance for Discrete
Random Variables
• The mean (or expected value) of X is given by
$\mu_X = \sum_x x P(X = x)$,
where the sum is over all possible values of X.
• The variance of X is given by
$\sigma_X^2 = \sum_x (x - \mu_X)^2 P(X = x) = \sum_x x^2 P(X = x) - \mu_X^2$.
• The standard deviation is the square root of the
variance.
Example
The graph of the probability mass function
will balance if supported at
the population mean.
The Probability Histogram
• When the possible values of a discrete random
variable are evenly spaced, the probability mass
function can be represented by a histogram, with
rectangles centered at the possible values of the
random variable.
• The area of the rectangle centered at a value x is
equal to P(X = x).
• Such a histogram is called a probability
histogram, because the areas represent
probabilities.
Probability Histogram for the
Number of Flaws in a Wire
The pmf is: P(X = 0) = 0.48, P(X = 1) = 0.39,
P(X=2) = 0.12, and P(X=3) = 0.01.
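Using the pmf for the number of flaws in a wire, the mean, variance, standard deviation, and cdf follow directly from the definitions on the earlier slides. A minimal sketch (the dictionary layout and function name are illustrative choices):

```python
import math

# pmf for the number of flaws in a wire, taken from the slide.
pmf = {0: 0.48, 1: 0.39, 2: 0.12, 3: 0.01}

total = sum(pmf.values())                                   # should equal 1
mean = sum(x * p for x, p in pmf.items())                   # sum of x P(X = x)
variance = sum(x**2 * p for x, p in pmf.items()) - mean**2  # sum of x^2 P(X = x) - mean^2
std_dev = math.sqrt(variance)

def cdf(x):
    """F(x) = P(X <= x), the sum of p(t) over all t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

print(round(mean, 4), round(variance, 4), round(std_dev, 4), round(cdf(1), 4))
```

For this pmf the mean is 0.66 flaws and the variance is 0.96 − 0.66² = 0.5244.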
Probability Mass Function
Example
Continuous Random Variables
• A random variable is continuous if its probabilities
are given by areas under a curve.
• The curve is called a probability density function
(pdf) for the random variable. Sometimes the pdf
is called the probability distribution.
• The function f(x) is the probability density function
of X.
• Let X be a continuous random variable with
probability density function f(x). Then
$\int_{-\infty}^{\infty} f(x)\,dx = 1.$
Continuous Random Variables:
Example
Computing Probabilities
Let X be a continuous random variable with
probability density function f(x). Let a and b
be any two numbers, with a < b. Then
$P(a \le X \le b) = P(a \le X < b) = P(a < X \le b) = \int_a^b f(x)\,dx.$
In addition,
$P(X \le a) = P(X < a) = \int_{-\infty}^{a} f(x)\,dx$
$P(X \ge a) = P(X > a) = \int_a^{\infty} f(x)\,dx.$
More on
Continuous Random Variables
• Let X be a continuous random variable with
probability density function f(x). The cumulative
distribution function of X is the function
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt.$
• The mean of X is given by
$\mu_X = \int_{-\infty}^{\infty} x f(x)\,dx.$
• The variance of X is given by
$\sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu_X)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu_X^2.$
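These integrals can be approximated numerically. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (not taken from the slides) and a simple midpoint rule; both are illustrative choices only.

```python
def f(x):
    """Hypothetical pdf: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

total = integrate(f, 0.0, 1.0)                                   # area under the pdf
mean = integrate(lambda x: x * f(x), 0.0, 1.0)                   # integral of x f(x)
variance = integrate(lambda x: x * x * f(x), 0.0, 1.0) - mean**2 # integral of x^2 f(x) - mean^2
p_interval = integrate(f, 0.25, 0.5)                             # P(0.25 <= X <= 0.5)

print(round(total, 6), round(mean, 6), round(variance, 6), round(p_interval, 6))
```

Because f is zero outside [0, 1], integrating over that interval is the same as integrating over the whole real line. For this density the exact values are mean 2/3 and variance 1/18.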
Two Independent Random
Variables
If X and Y are independent random
variables, and S and T are sets of numbers,
then P(X ∈ S and Y ∈ T) = P(X ∈ S) P(Y ∈ T).
More generally, if X1, …, Xn are independent
random variables, and S1, …, Sn are sets,
then
P(X1 ∈ S1, X2 ∈ S2, …, Xn ∈ Sn)
= P(X1 ∈ S1) P(X2 ∈ S2) … P(Xn ∈ Sn).
Variance Properties
If X1, …, Xn are independent random variables,
then the variance of the sum X1 + … + Xn is given
by
$\sigma^2_{X_1 + X_2 + \cdots + X_n} = \sigma^2_{X_1} + \sigma^2_{X_2} + \cdots + \sigma^2_{X_n}.$
If X1, …, Xn are independent random variables
and c1, …, cn are constants, then the variance of
the linear combination c1 X1 + … + cn Xn is given
by
$\sigma^2_{c_1 X_1 + c_2 X_2 + \cdots + c_n X_n} = c_1^2 \sigma^2_{X_1} + c_2^2 \sigma^2_{X_2} + \cdots + c_n^2 \sigma^2_{X_n}.$
More Variance Properties
If X and Y are independent random variables
with variances $\sigma_X^2$ and $\sigma_Y^2$, then the variance of
the sum X + Y is $\sigma^2_{X+Y} = \sigma_X^2 + \sigma_Y^2$.
The variance of the difference X – Y is
$\sigma^2_{X-Y} = \sigma_X^2 + \sigma_Y^2$.
Independence and Simple
Random Samples
Definition: If X1, …, Xn is a simple random
sample, then X1, …, Xn may be treated as
independent random variables, all from the
same population.
Properties of $\bar{X}$
If X1, …, Xn is a simple random sample from a
population with mean μ and variance σ², then the
sample mean $\bar{X}$ is a random variable with
$\mu_{\bar{X}} = \mu$
$\sigma^2_{\bar{X}} = \sigma^2 / n.$
The standard deviation of $\bar{X}$ is
$\sigma_{\bar{X}} = \sigma / \sqrt{n}.$
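These properties of the sample mean can be verified exactly on a small case. The sketch below assumes the population is the set of outcomes of a fair six-sided die and enumerates every independent sample of size n = 2; the setup is an illustration, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # fair six-sided die as the population (assumed)

mu = Fraction(sum(faces), 6)                               # population mean
sigma2 = sum((Fraction(x) - mu) ** 2 for x in faces) / 6   # population variance

n = 2
# Every equally likely ordered sample of size n (independent draws).
means = [Fraction(sum(s), n) for s in product(faces, repeat=n)]

mean_of_xbar = sum(means) / len(means)
var_of_xbar = sum((m - mean_of_xbar) ** 2 for m in means) / len(means)

print(mean_of_xbar, mu)            # both equal the population mean
print(var_of_xbar, sigma2 / n)     # both equal sigma^2 / n
```

Exact fractions avoid rounding, so the two sides of each property compare equal.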
3.2 Sample versus Population
Definitions:
• A population is the entire collection of objects
or outcomes about which information is sought.
• A sample is a subset of a population, containing
the objects or outcomes that are actually
observed.
• A simple random sample (SRS) of size n is a
sample chosen by a method in which each
collection of n population items is equally likely
to comprise the sample, just as in the lottery.
Sampling (cont.)
Definition: A sample of convenience is a sample
that is not drawn by a well-defined random
method.
Things to consider with convenience samples:
• They may differ systematically in some way from the
population.
• Use them only when it is not feasible to draw a
random sample.
Simple Random Sampling
• An SRS is not guaranteed to reflect the
population perfectly.
• SRSs always differ in some ways from each
other; occasionally a sample is substantially
different from the population.
• Two different samples from the same population
will vary from each other as well.
• This phenomenon is known as sampling
variation.
Tangible Population
• Populations that consist of actual
physical objects (customers, blocks, balls)
are called tangible populations.
• Tangible populations are always finite.
• After we sample an item, the population
size decreases by 1.
More on
Simple Random Sampling
Definition: A conceptual population consists of
items that are not actual objects.
• For example, a geologist weighs a rock several
times on a sensitive scale. Each time, the scale
gives a slightly different reading.
• Here the population is conceptual. It consists of
all the readings that the scale could in principle
produce.
Simple Random Sampling
(cont.)
• The items in a sample are independent if knowing
the values of some of the items does not help to
predict the values of the others.
• Items in a simple random sample may be treated
as independent in most cases encountered in
practice. The exception occurs when the
population is finite and the sample comprises a
substantial fraction (more than 5%) of the
population.
Types of Sampling
• Weighted Sampling
• Stratified Random Sampling
• Cluster Sampling
3.3 Location
Measures used to describe “location” of data:
(Measure of center) or (Measure of central tendency)
• Median
• Mean (Average)
Robust estimators:
• Trimmed average: 10% of the observations in a
sample are trimmed from each end
3.4 Variation
Variation:
• Natural causes
• Assignable causes
Measures of variation:
• Range (using only the extreme values)
• Variance
• Standard deviation
• Covariance
Variation Calculation
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ (3.1)
$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$ (3.2)
$S_{xy} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$ (3.3)
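Equations (3.1) to (3.3) translate almost line for line into code. A minimal sketch on made-up data (the values of `x` and `y` are hypothetical, chosen only so the arithmetic is easy to follow):

```python
def sample_mean(xs):
    """Equation (3.1): sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Equation (3.2): squared deviations from the mean, divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_covariance(xs, ys):
    """Equation (3.3): products of paired deviations, divided by n - 1."""
    mx, my = sample_mean(xs), sample_mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

x = [2.0, 4.0, 6.0, 8.0]   # hypothetical paired observations
y = [1.0, 3.0, 2.0, 6.0]

print(sample_mean(x), sample_variance(x), sample_covariance(x, y))
print(sample_variance(x) ** 0.5)   # the standard deviation is the square root of the variance
```

Note the divisor n − 1 rather than n in the variance and covariance, exactly as in (3.2) and (3.3).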
3.5 Discrete Distributions
• Random Variable: “Something that varies in a
random manner”
• Discrete Random Variable: “Random variable that
can assume only a finite number of possible
values (usually integers)”
Discrete Random Variable Example
• Experiment: Tossing a single coin twice and
recording the number of heads observed
• Repeated 16 times
• X= number of heads observed in each experiment
• 0211200121101101120
• Empirical distribution
• Theoretical distribution
3.5.1 Binomial Distribution
• We use the Bernoulli distribution when we
have an experiment which can result in one
of two outcomes. One outcome is labeled
“success,” and the other outcome is labeled
“failure.”
• The probability of a success is denoted by
p. The probability of a failure is then 1 – p.
• Such a trial is called a Bernoulli trial with
success probability p.
Examples of Bernoulli Trials
1. The simplest Bernoulli trial is the toss of a coin.
The two outcomes are heads and tails. If we
define heads to be the success outcome, then p is
the probability that the coin comes up heads. For a
fair coin, p = 1/2.
2. Another Bernoulli trial is a selection of a component
from a population of components, some of which
are defective. If we define “success” to be a
defective component, then p is the proportion of
defective components in the population.
Binomial Distribution
If a total of n Bernoulli trials are conducted,
and
• The trials are independent.
• Each trial has the same success probability p.
• X is the number of successes in the n trials.
then X has the binomial distribution with
parameters n and p, denoted X ~ Bin(n,p).
Probability Mass Function of
a Binomial Random Variable
If X ~ Bin(n, p), the Probability Mass
Function of X is
$p(x) = P(X = x) = \begin{cases} \dfrac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x}, & x = 0, 1, \ldots, n \\ 0, & \text{otherwise} \end{cases}$ (3.5)
Binomial Probability Histogram
(a) Bin(10, 0.4) (b) Bin(20, 0.1)
Example
The probability that a newborn baby is a girl
is approximately 0.49. Find the probability
that of the next five single births in a
certain hospital, no more than two are
girls.
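One way to work this example is to evaluate the binomial pmf (3.5) at x = 0, 1, 2 and add the results. A minimal sketch, where `binom_pmf` is a hypothetical helper written for this illustration:

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial pmf: C(n, x) p^x (1 - p)^(n - x) for x = 0..n, else 0."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.49   # five births, P(girl) = 0.49
p_at_most_two = sum(binom_pmf(x, n, p) for x in range(3))   # P(X <= 2)

print(round(p_at_most_two, 4))
```

“No more than two” means X ≤ 2, so the three terms P(X = 0), P(X = 1), and P(X = 2) are summed; the result is about 0.5187.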
Another Use of the Binomial
Assume that a finite population contains items of
two types, successes and failures, and that a
simple random sample is drawn from the
population. Then if the sample size is no more
than 5% of the population, the binomial
distribution may be used to model the number of
successes.
Example
A lot contains several thousand components, 10%
of which are defective. Nine components are
sampled from the lot. Let X represent the
number of defective components in the sample.
Find the probability that exactly two are
defective.
Software Functions for
Binomial Probabilities
Excel:
BINOM.DIST(number_s, trials, probability_s, cumulative)
Minitab:
Calc → Probability Distributions → Binomial
Example
Of all the new vehicles of a certain model that are
sold, 20% require repairs to be done under
warranty during the first year of service. A
particular dealership sells 14 such vehicles.
What is the probability that fewer than five of
them require warranty repairs?
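For readers without Excel or Minitab, the sketch below is a pure-Python stand-in that mirrors the argument order of the BINOM.DIST call shown earlier. The function `binom_dist` is a hypothetical helper written for this illustration, not an Excel or library API. It is applied to the defective-components example and to the warranty-repairs example.

```python
from math import comb

def binom_dist(number_s, trials, probability_s, cumulative):
    """P(X = number_s) if cumulative is False, else P(X <= number_s),
    for X ~ Bin(trials, probability_s)."""
    def pmf(k):
        return comb(trials, k) * probability_s**k * (1 - probability_s) ** (trials - k)
    if cumulative:
        return sum(pmf(k) for k in range(number_s + 1))
    return pmf(number_s)

# Nine components sampled, 10% defective: P(exactly two defective).
p_two_defective = binom_dist(2, 9, 0.10, False)

# 14 vehicles, 20% need repairs: "fewer than five" means X <= 4.
p_fewer_than_five = binom_dist(4, 14, 0.20, True)

print(round(p_two_defective, 4), round(p_fewer_than_five, 4))
```

The only subtle step is translating “fewer than five” into the cumulative probability at 4.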
Mean and Variance of
a Binomial Random Variable
• E(X) = np
E(Bernoulli trial) = (1)p + (0)(1 – p) = p
• Var(X) = np(1 – p)
• Var(Bernoulli trial) = (1 – p)²p + (0 – p)²(1 – p)
= (1 – 2p + p²)p + p²(1 – p)
= p – 2p² + p³ + p² – p³
= p – p²
= p(1 – p)
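3.5.2 Beta-Binomial Distribution
• The binomial distribution often underestimates the variation
• Beta-Binomial
$P(x) = \binom{n}{x} \dfrac{B(r + x,\; n + s - x)}{B(r, s)}$
• where B(r, s) is the beta function
$B(r, s) = \dfrac{\Gamma(r)\,\Gamma(s)}{\Gamma(r + s)} = \dfrac{(r-1)!\,(s-1)!}{(r+s-1)!}$
(the factorial form applies when r and s are positive integers)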
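The claim that the binomial understates variation can be seen numerically. The sketch below evaluates the beta-binomial pmf with hypothetical parameters n = 10, r = 2, s = 3 (chosen only for illustration) and compares its variance with a binomial that has the same mean. Log-gamma is used for the beta function to avoid overflow.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_pmf(x, n, r, s):
    """P(x) = C(n, x) B(r + x, n + s - x) / B(r, s) for x = 0..n."""
    return comb(n, x) * exp(log_beta(r + x, n + s - x) - log_beta(r, s))

n, r, s = 10, 2.0, 3.0   # hypothetical parameters
pmf = [beta_binom_pmf(x, n, r, s) for x in range(n + 1)]

total = sum(pmf)
mean = sum(x * p for x, p in enumerate(pmf))
variance = sum(x * x * p for x, p in enumerate(pmf)) - mean**2

p = r / (r + s)                      # binomial success probability with the same mean
binomial_variance = n * p * (1 - p)

print(round(total, 6), round(mean, 6), round(variance, 6), binomial_variance)
```

With these parameters both distributions have mean 4, but the beta-binomial variance is noticeably larger than the binomial value np(1 − p).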
3.5.3 Poisson Distribution
• One way to think of the Poisson distribution is
as an approximation to the binomial distribution
when n is large and p is small.
• When n is large and p is small,
the mass function depends almost entirely on the
mean np, and very little on the specific values of n
and p.
• We can therefore approximate the binomial mass
function by one that depends on n and p only through
the quantity λ = np; this λ is the
parameter in the Poisson distribution.
Probability Mass Function, Mean,
and Variance of Poisson Dist.
• If X ~ Poisson(λ), the probability mass function of
X is
$p(x) = P(X = x) = \begin{cases} e^{-\lambda}\, \dfrac{\lambda^x}{x!}, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$ (3.6)
• Mean: $\mu_X = \lambda$
• Variance: $\sigma_X^2 = \lambda$
Note: X must be a discrete random variable and λ
must be a positive constant.
Poisson Probability Histogram
Figure 4.2 (a) Poisson(1) (b) Poisson(10)
Poisson Probabilities
Excel:
POISSON.DIST(x, mean, cumulative)
Minitab:
Calc → Probability Distributions → Poisson
Example
Particles are suspended in a liquid medium at a
concentration of 6 particles per mL. A large
volume of the suspension is thoroughly agitated,
and then 3 mL are withdrawn. What is the
probability that exactly 15 particles are
withdrawn?
3.5.4 Geometric Distribution
• The geometric distribution and the negative binomial distribution
are referred to as “waiting time” distributions.
• The geometric distribution deals with the number of trials required for a single success.
• Each outcome is either a success or a failure. Trials continue until
a success (defect) occurs for the first time.
– Useful for manufacturing where the line will be shut down for
recalibration upon first defect.
Geometric Distribution
• Geometric Distribution:
$P(n) = p(1 - p)^{n - 1}$
n: The number of trials required to produce 1 success in a geometric
experiment.
p: The probability of success on an individual trial.
1- p: The probability of failure on an individual trial.
Geometric Distribution
Mean and Variance
$\mu = \dfrac{1}{p}$: the average no. of trials required to produce 1 success
$\sigma^2 = \dfrac{1 - p}{p^2}$
Geometric Distribution
Example
Bob is a high school basketball player. He is a 70% free throw
shooter. That means his probability of making a free throw is
0.70. What is the probability that Bob makes his first free
throw on his fifth shot?
Solution:
Probability of success (p) is 0.70, the number of trials (x)
is 5, and the number of successes (r) is 1. We enter these
values into the geometric formula, with q = 1 – p = 0.30.
$P(x) = p q^{x-1} = (0.7)(0.3)^4 = 0.00567$
Geometric Distribution
Example
A military contractor is producing nuts that must be within 0.04
mm of the specified diameter. If a nut exceeds the limit, the line
must be shut down and adjusted. The probability that the
diameter of a nut will exceed the allowable error is 0.0014.
• What is the probability the machine will be shut down
exactly after the 100th nut is produced?
• What is the probability the machine will be shut down
exactly after the 200th nut is produced?
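Both geometric examples reduce to one formula. The sketch below assumes that “shut down exactly after the nth nut” means the first out-of-tolerance nut is the nth one produced, and that nuts are independent; `geometric_pmf` is a hypothetical helper name.

```python
def geometric_pmf(n, p):
    """P(first success occurs on trial n) = p (1 - p)^(n - 1), for n >= 1."""
    if n < 1:
        return 0.0
    return p * (1 - p) ** (n - 1)

# Free-throw example: first made shot on the fifth attempt, p = 0.70.
p_bob = geometric_pmf(5, 0.70)

# Nut example: first defective nut is the 100th, or the 200th, with p = 0.0014.
p_shutdown_100 = geometric_pmf(100, 0.0014)
p_shutdown_200 = geometric_pmf(200, 0.0014)

print(round(p_bob, 5), p_shutdown_100, p_shutdown_200)

# Mean number of trials until the first defective nut: 1 / p.
print(1 / 0.0014)
```

Because p is so small in the nut example, the probabilities decay slowly with n, so the 200th-nut probability is only moderately smaller than the 100th-nut one.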
3.5.5 Negative Binomial Distribution
A negative binomial experiment is a statistical experiment
that has the following properties:
• The experiment consists of x repeated trials.
• Each trial can result in just two possible outcomes, a success
and a failure.
• The probability of success, denoted by p, is the same on
every trial.
• The trials are independent; that is, the outcome on one trial
does not affect the outcome on other trials.
• The experiment continues until r successes are observed,
where r is specified in advance.
Negative Binomial Distribution
• A negative binomial random variable is the number X of
repeated trials to produce r successes in a negative binomial
experiment.
• The negative binomial distribution is also known as the
Pascal distribution.
Negative Binomial Distribution
$P(n) = \binom{n - 1}{r - 1}\, p^r (1 - p)^{n - r}$
n: The number of trials required to produce r successes in a negative
binomial experiment.
r: The number of successes in the negative binomial experiment.
p: The probability of success on an individual trial.
1-p: The probability of failure on an individual trial.
Negative Binomial Distribution
Mean and Variance
$\mu = \dfrac{r}{p}$: the average no. of trials required to produce r successes
$\sigma^2 = \dfrac{r(1 - p)}{p^2}$
Negative Binomial Distributions
Example
• Bob is a high school basketball player. He is a 70% free
throw shooter. That means his probability of making a free
throw is 0.70. During the season, what is the probability that
Bob makes his third free throw on his fifth shot?
• Solution: The probability of success (p) is 0.70, the number
of trials (x) is 5, and the number of successes (r) is 3.
$P(x) = {}_{x-1}C_{r-1}\, p^r q^{x-r} = {}_{4}C_{2}\,(0.7)^3 (0.3)^2 = 0.1852$
3.5.6 Hypergeometric Distribution
• A sample of size n is randomly selected without
replacement from a population of N items.
• In the population, r items can be classified as successes,
and N - r items can be classified as failures.
• A hypergeometric random variable, x, is the number of
successes that result from a hypergeometric experiment
Hypergeometric Probability Distribution
$P(x) = \dfrac{\binom{D}{x} \binom{N - D}{n - x}}{\binom{N}{n}}$
Where
N = total number of elements in the population
D = number of successes in the population
N-D = number of failures in the population
n = number of trials (sample size)
x = number of successes in the n trials
n-x = number of failures in the n trials
Hypergeometric Distribution
Mean and Variance
$\mu = np$, where p = r/N (r is the number of successes in the population, also written D)
$\sigma^2 = np(1 - p)\left(\dfrac{N - n}{N - 1}\right)$
Hypergeometric Probability Distribution
Example
Suppose we select 5 cards from an ordinary deck of playing cards. What is
the probability of obtaining 2 or fewer hearts?
Solution:
N = 52; since there are 52 cards in a deck.
r = 13; since there are 13 hearts in a deck.
n = 5; since we randomly select 5 cards from the deck.
x = 0 to 2; since our selection includes 0, 1, or 2 hearts.
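We plug these values into the hypergeometric formula as follows:
$P(x = 0) = \dfrac{{}_{13}C_{0} \cdot {}_{39}C_{5}}{{}_{52}C_{5}} = 0.2215$
$P(x = 1) = \dfrac{{}_{13}C_{1} \cdot {}_{39}C_{4}}{{}_{52}C_{5}} = 0.4114$
$P(x = 2) = \dfrac{{}_{13}C_{2} \cdot {}_{39}C_{3}}{{}_{52}C_{5}} = 0.2743$
Adding these gives P(x ≤ 2) = 0.2215 + 0.4114 + 0.2743 = 0.9072.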
Hypergeometric Probability
in MINITAB
• Acceptance testing of ice cream cones: An ice cream parlor
checks a batch of 400 waffle cones by checking 50 of
them. They will not buy the batch if more than 3 cones are
broken.
• What is the probability that the parlor will buy the cones if
35 of the 400 cones are broken?
– Define N, n, D, N-D, x
– In MINITAB select: Calc → Probability Distributions → Hypergeometric
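As an alternative to MINITAB, the sketch below sets up the same problem in Python. It assumes N = 400, D = 35, n = 50, and that the parlor buys when the number of broken cones in the sample is at most 3. The helper name is hypothetical. The mean and variance formulas are cross-checked by direct summation.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """P(x successes in a sample of n drawn without replacement
    from N items, D of which are successes)."""
    if x < max(0, n - (N - D)) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

N, D, n = 400, 35, 50   # batch size, broken cones in the batch, cones checked

# The parlor buys the batch when X <= 3 broken cones are found.
p_buy = sum(hypergeom_pmf(x, N, D, n) for x in range(4))

# Cross-check the mean and variance formulas by summing over all possible x.
xs = range(0, min(n, D) + 1)
mean = sum(x * hypergeom_pmf(x, N, D, n) for x in xs)
variance = sum(x * x * hypergeom_pmf(x, N, D, n) for x in xs) - mean**2

p = D / N
formula_mean = n * p
formula_variance = n * p * (1 - p) * (N - n) / (N - 1)

print(p_buy)
print(mean, formula_mean)
print(variance, formula_variance)
```

Since the sample is 12.5% of the batch, well above the 5% guideline given earlier, the hypergeometric model is used here rather than the binomial approximation.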