Math 435 Lecture 1
Ari Wijetunga
PROBABILITY
A probability space with a finite or countably infinite number of
outcomes or sample points is called a discrete sample space. Let
E_i, i = 1, 2, 3, ... be the sample points. Then the probability of the
sample point E_i is denoted by P(E_i). P must satisfy the following
two conditions:
(i) 0 ≤ P(E_i) ≤ 1 for each E_i in the sample space, and
(ii) P(E_1) + P(E_2) + ... = ∑ P(E_i) = 1.
If the sample points are equally likely and if there are k sample
points in the sample space, then P(E_i) = 1/k for i = 1, 2, ..., k. This
is sometimes called the uniform probability function.
An event A is defined as a finite or infinite collection of sample
points from the sample space. The collection of all subsets of the
sample space is called the event space. This will include the null
and sure events.
The probability of event A is the sum of the probabilities of the
sample points in A:
P(A) = ∑ P(E_i), where the sum runs over all sample points E_i in A.
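For readers who want to check such sums numerically, here is a minimal Python sketch of a discrete probability space; the dictionary P, the helper prob, and the fair-die example are just illustrative names, not part of the notes.

from fractions import Fraction

# A discrete sample space for one fair six-sided die: each sample point has probability 1/6.
P = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

def prob(event):
    """P(A) = sum of the probabilities of the sample points in A."""
    return sum(P[e] for e in event)

A = {2, 4, 6}            # the event "an even number is tossed"
print(prob(A))           # 1/2
print(sum(P.values()))   # 1, condition (ii)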
If an experiment can result in an outcome which can be any real
number in some interval, the experiment gives outcomes which can
be regarded as a continuous sample space. The probability
assignment on such a space is called a continuous probability
space. Suppose you are recording the time it takes to complete a
task. Let us assume that the minimum time required is 10 minutes
and the maximum is 30 minutes. Each outcome is a real number in
the interval (10, 30). In this continuous sample space, we assign
probabilities to subintervals rather than to individual points. For
example, if every completion time in (10, 30) is equally likely (a
uniform assignment), the probability that a person finishes the task
between 12 and 15 minutes is the relative length of the subinterval,
(15 − 12)/(30 − 10) = 3/20.
Probability Rules
Let S be the entire sample space and ∅ be the null event, which has
no outcomes.
1. P(∅) = 0
2. P(S) = 1
3. If A_1, A_2, ..., A_n are mutually exclusive, then
P(A_1 ∪ A_2 ∪ ... ∪ A_n) = P(A_1) + P(A_2) + ... + P(A_n)
4. 0 ≤ P(A) ≤ 1 for any event A
5. If A is a subevent of B (A ⊂ B), then P(A) ≤ P(B)
6. For any two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
7. For three events A, B, and C,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(A ∩ C) + P(A ∩ B ∩ C)
8. If A′ is the complement of the event A, then P(A′) = 1 − P(A)
9. P(A) = P(A ∩ B) + P(A ∩ B′)
10. If B_i, i = 1, 2, 3, ..., n is a partition of the sample space, then for any
event A,
P(A) = P(A ∩ B_1) + P(A ∩ B_2) + ... + P(A ∩ B_n)
11. For any events A_1, A_2, ..., A_n,
P(A_1 ∪ A_2 ∪ ... ∪ A_n) ≤ P(A_1) + P(A_2) + ... + P(A_n), with equality
holding iff the events are mutually exclusive.
12. P(A′ ∪ B′) = P((A ∩ B)′)
13. P(A′ ∩ B′) = P((A ∪ B)′)
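These rules are easy to verify numerically on a small sample space. The short Python sketch below is an illustrative check (the events A and B and the uniform die space are chosen arbitrarily) of rule 6 and the De Morgan-type rule 13.

from fractions import Fraction

S = set(range(1, 7))                      # fair-die sample space
P = lambda E: Fraction(len(E), len(S))    # uniform probability function

A, B = {1, 2, 3}, {2, 4, 6}
# Rule 6: P(A u B) = P(A) + P(B) - P(A n B)
assert P(A | B) == P(A) + P(B) - P(A & B)
# Rule 13: P(A' n B') = P((A u B)')
assert P((S - A) & (S - B)) == P(S - (A | B))
print(P(A | B))                           # 5/6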
Examples:
1. A customer visiting a store buys product A 30% of the time
and buys product B 40% of the time. She buys neither A nor B
35% of the time. Determine the probability that a customer buys
both products.
P(A) = .3, P(B) = .4,
P(A′ ∩ B′) = P((A ∪ B)′) = .35, so P(A ∪ B) = .65.
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = .3 + .4 − P(A ∩ B)
P(A ∩ B) = .7 − .65 = 0.05
2. You are given
P(A ∪ B) = .7 and P(A ∪ B′) = 0.9.
Find P(A).
.7 = P(A) + P(B) − P(A ∩ B), and .9 = P(A) + 1 − P(B) − P(A ∩ B′).
Adding these and using P(A ∩ B) + P(A ∩ B′) = P(A) gives
P(A ∪ B) + P(A ∪ B′) = P(A) + 1, so P(A) = 1.6 − 1 = .6.
3. Consider the three events A, B and C. Each event can occur
only together with one of the other events; for example, if A has
occurred, then A must have occurred with B or A must have
occurred with C. All three events cannot occur at the same time,
so P(A ∩ B ∩ C) = 0. P(A) = 1/4, P(B) = 1/3 and P(C) = 5/12.
Find P(A ∪ B ∪ C).
P(A ∩ B) + P(A ∩ C) = 1/4, P(B ∩ C) + P(B ∩ A) = 1/3, P(C ∩ A) + P(C ∩ B) = 5/12.
Adding, P(A ∩ B) + P(A ∩ C) + P(B ∩ C) = (1/4 + 1/3 + 5/12)/2 = 1/2.
Therefore P(A ∪ B ∪ C) = 1/4 + 1/3 + 5/12 − 1/2 + 0 = 1/2.
4. A company offers two savings plans A and B to its employees.
It is found that 25% have participated in both plans and 10%
participated in neither plan. The probability that an employee
participates in plan A is 0.15 higher than that for plan B. What is the
probability that an employee participates in plan B?
P(A ∩ B) = .25, P(A′ ∩ B′) = P((A ∪ B)′) = .10, P(A) = P(B) + .15.
P(A ∪ B) = .90, so P(A) + P(B) − .25 = .90 and P(A) + P(B) = 1.15.
With P(A) = P(B) + .15 this gives 2P(B) = 1, so P(B) = .5.
5. A survey of students shows the following: 31% took course A,
27% took course B, 29% took course C, 7% took A and B, 10%
took B and C, 6% took A and C, and 2% took all three courses.
What percent took none of the three courses?
Use a Venn diagram, or inclusion-exclusion:
P(A ∪ B ∪ C) = .31 + .27 + .29 − .07 − .10 − .06 + .02 = .66, so
P(A′ ∩ B′ ∩ C′) = 1 − .66 = .34; 34% took none of the three courses.
Example. A six-sided red die and a six-sided green die are to be
rolled. Assume that each of these dice is fair. An elementary event
can be described by a vector:
(Number on the top face of the red die, Number on the top face of
the green die)
For a gambler, this example is of utmost importance.
The following is a natural distribution on the resultant sample
space.
Elementary events and their probabilities:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)    each with probability 1/36
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)    each with probability 1/36
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)    each with probability 1/36
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)    each with probability 1/36
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)    each with probability 1/36
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)    each with probability 1/36
In this example, we could use the phrase that all the elementary
events are equally likely to occur.
Let us consider the following events.
A = The sum of the numbers on the top faces is 2
B = The sum of the numbers on the top faces is 3
C = The sum of the numbers on the top faces is 4
D = The sum of the numbers on the top faces is 5
E = The sum of the numbers on the top faces is 6
F = The sum of the numbers on the top faces is 7
G = The sum of the numbers on the top faces is 8
H = The sum of the numbers on the top faces is 9
I = The sum of the numbers on the top faces is 10
J = The sum of the numbers on the top faces is 11
K = The sum of the numbers on the top faces is 12
Event   Description                                      Pr.
A       (1,1)                                            1/36
B       (1,2) (2,1)                                      2/36
C       (1,3) (2,2) (3,1)                                3/36
D       (1,4) (2,3) (3,2) (4,1)                          4/36
E       (1,5) (2,4) (3,3) (4,2) (5,1)                    5/36
F       (1,6) (2,5) (3,4) (4,3) (5,2) (6,1)              6/36
G       (2,6) (3,5) (4,4) (5,3) (6,2)                    5/36
H       (3,6) (4,5) (5,4) (6,3)                          4/36
I       (4,6) (5,5) (6,4)                                3/36
J       (5,6) (6,5)                                      2/36
K       (6,6)                                            1/36
Analysis
The following facts emerge from our computations.
1. The sum 7 is the most likely sum.
2. The eleven sums (events) are mutually exclusive events.
3. The eleven sums (events) are exhaustive.
4. The sum 7 is six times more likely to occur than the sum 2.
5. The sum 7 is twice as likely to occur as the sum 4.
6. The sum 5 and the sum 9 are equally likely.
One would have guessed that the sum 7 is more likely than the sum
5. The above computations clearly spell out quantitatively how
much more likely the sum 7 is than the sum 5.
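The whole table of sums can be reproduced by brute-force enumeration of the 36 equally likely outcomes. The short Python sketch below is an illustrative check of these computations.

from fractions import Fraction
from itertools import product
from collections import Counter

# Enumerate the 36 equally likely (red, green) outcomes and tally the sums.
counts = Counter(r + g for r, g in product(range(1, 7), repeat=2))
for total in range(2, 13):
    print(total, Fraction(counts[total], 36))   # e.g. 7 -> 1/6, the most likely sum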
Efron’s dice
Bradley Efron, a professor of statistics at Stanford University,
created the following dice.
Take any four six-sided blank fair dice. Put numbers on the dice as
outlined below.
Die A:  0  0  4  4  4  4
Die B:  3  3  3  3  3  3
Die C:  2  2  2  2  6  6
Die D:  1  1  1  5  5  5
He suggested a game based on these dice.
1. You pick up a die.
2. I pick up a die from the remaining three dice.
3. Let us roll the dice at the same time.
4. Whoever gets a larger number on his die gets a dollar from
the other player.
Before you participate in this game, you would like to calculate the
probabilities of one die beating another. Let us do this.
Pr(Die A beats Die B) = 2/3
Pr(Die B beats Die C) = 2/3
Pr(Die C beats Die D) = 2/3
Here is one way of calculating such a probability, for Die C against Die D.
                Die D
Die C      1   1   1   5   5   5
   2       W   W   W   L   L   L
   2       W   W   W   L   L   L
   2       W   W   W   L   L   L
   2       W   W   W   L   L   L
   6       W   W   W   W   W   W
   6       W   W   W   W   W   W
Legend:
W = Win for Die C
L = Loss for Die C
Out of 36 possible cases, 24 cases result in a win for Die C.
Consequently, Pr(Die C beats Die D) = 2/3.
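All four "beats" probabilities can be obtained by the same enumeration of 36 equally likely face pairs. Here is a minimal Python sketch; the dictionary dice and the function p_beats are illustrative names, and the face lists come from the definitions above.

from fractions import Fraction
from itertools import product

dice = {
    "A": [0, 0, 4, 4, 4, 4],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [2, 2, 2, 2, 6, 6],
    "D": [1, 1, 1, 5, 5, 5],
}

def p_beats(x, y):
    """Probability that die x shows a larger number than die y."""
    wins = sum(1 for a, b in product(dice[x], dice[y]) if a > b)
    return Fraction(wins, 36)

for x, y in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]:
    print(f"Pr(Die {x} beats Die {y}) =", p_beats(x, y))   # each equals 2/3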
Let us take a pause. By looking at these probabilities, it seems to
me that Die A is the best. Die A beats Die B comfortably (more
than 50% probability). Die B beats Die C comfortably. Die C beats
Die D comfortably.
A surprise. Pr(Die D beats Die A) = 2/3.
How do you play this game now? What are the implications of
these calculations? You should refuse to be the first person to pick
up a die.
Transitivity of a relation is well-known. For example, suppose
Margaret is taller than John and John is taller than Brenda. We can
say that Margaret is taller than Brenda. This is an example of
transitivity. If we define a relation in terms of probability,
transitivity could fail. For example, suppose we define that Die 1 is
better than Die 2 if Die 1 beats Die 2 with more than 50%
probability. This relation is not transitive. We have seen that as per
this definition,
Die A is better than Die B, Die B is better than Die C, and Die C is
better than Die D. This does not mean that Die A is better than Die
D.
History. More than a hundred years ago, some Hungarian
mathematicians created dice (different from Efron’s) for which
transitivity fails. A number of papers were written on this type of
paradoxical dice.
Counting rules
1. A tree diagram is a nice way of counting outcomes when the
experiment has only a small number of outcomes. You must
have done enough problems in Math 335 to master this
technique.
2. Multiplication rule
Consider an experiment with k trials. Trial i can be
performed in n_i ways, i = 1, 2, ..., k. Then the total number of
outcomes in the experiment can be obtained as
n(S) = (n_1)(n_2) ... (n_k)
Example: Four people each rank 3 beers as 1, 2, and 3 at random.
The score of a beer is the sum of its rankings. What is the
probability that beer A gets a total score of 4? Each of the four
people must rank A first, so the probability is (1/3)^4 = 1/81; see
the enumeration sketch below.
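As a brute-force check of this answer, the Python sketch below enumerates all assignments of rankings by the four people and counts those in which beer A's ranks sum to 4; the variable names are purely illustrative.

from fractions import Fraction
from itertools import permutations, product

# Each person independently assigns the ranks 1, 2, 3 to beers A, B, C at random.
rankings = list(permutations([1, 2, 3]))         # 6 possible rankings per person
outcomes = list(product(rankings, repeat=4))     # 6**4 = 1296 equally likely outcomes

# Beer A's score is the sum of the first entry of each person's ranking.
favorable = sum(1 for o in outcomes if sum(r[0] for r in o) == 4)
print(Fraction(favorable, len(outcomes)))        # 16/1296 = 1/81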
3. Permutations and combinations
Consider n distinct objects. Any arrangement with order is called a
permutation and any arrangement without order is called a
combination. Permutations and combinations can be made with
repeated elements (with replacement) or without repetition (without
replacement).
Permutations with replacement:
Consider n distinct objects. Suppose we want to make
permutations of order k. Use the multiplication rule. The first place
can be filled in n ways, the second place can be filled in n ways,
and the k-th place can be filled in n ways. Therefore
n(S) = (n)(n) ... (n) = n^k
Suppose instead the n objects are not all distinct: there are r items
alike and s items alike. Then the number of distinguishable
arrangements of all n objects is
n(S) = n!/(r! s!)
Permutations without replacement:
Consider n distinct objects. Suppose we want to make
permutations of order n. Use the multiplication rule. The first place
can be filled in n ways, the second place can be filled in (n − 1)
ways, and the n-th place can be filled in n − (n − 1) = 1 way. Therefore
n(S) = n(n − 1)(n − 2) ... (3)(2)(1) = n!
If the order is k, then n(S) = n(n − 1)(n − 2) ... (n − (k − 1)) = n!/(n − k)!
When the items are not distinct, you have to count them depending
on the situation. This is somewhat complicated.
Combinations without replacement
Consider n distinct objects. We can make just one combination if
the order is n. If the order is k, we can make
n(S) = n!/((n − k)! k!) = C(n, k),
where C(n, k) denotes the binomial coefficient "n choose k".
Combinations with replacement
Consider n distinct objects. The number of combinations with
repetition of the n objects of order k is given by
n(S) = (n + k − 1)!/(k! (n − 1)!) = C(n + k − 1, n − 1)
This formula is hard to derive. It is done using a result in number
theory. See "Probability" by Marcel F. Neuts, page 35, for a proof.
Example: Consider 5 objects. The number of combinations of order 2
with replacement is (5 + 2 − 1)!/(2! (5 − 1)!) = 15.
They are: AA, BB, CC, DD, EE, AB, AC, AD, AE, BC, BD, BE,
CD, CE, DE
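These counts are easy to confirm in Python; the sketch below is an illustrative check that lists the 15 combinations with replacement and evaluates the formula (the five letters stand for the 5 objects).

from math import comb, factorial
from itertools import combinations_with_replacement

n, k = 5, 2
objects = "ABCDE"

combos = list(combinations_with_replacement(objects, k))
print(len(combos))                                                  # 15
print(factorial(n + k - 1) // (factorial(k) * factorial(n - 1)))    # 15, the formula
print(comb(n + k - 1, n - 1))                                       # 15, as a binomial coefficient
print(["".join(c) for c in combos])                                 # AA, AB, ..., DE, EE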
Partitioning
Consider n objects. Suppose we want to separate these into two
piles, one consisting of k objects and the other consisting of (n − k)
objects. This is called a partition. How many such partitions are
there? We can argue as follows using the multiplication rule.
There are C(n, k) ways to choose any k items. After choosing k items,
from the remaining (n − k) items, we choose n − k items in
C(n − k, n − k) = 1 way.
Total number of partitions = C(n, k) · C(n − k, n − k) = C(n, k)
The number of partitions of n objects into r piles with k_i objects in
the i-th pile (i = 1, 2, ..., r) is
C(n, k_1) C(n − k_1, k_2) C(n − k_1 − k_2, k_3) ... C(n − (k_1 + k_2 + ... + k_{r−1}), k_r)
= n!/(k_1! k_2! ... k_r!)
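A quick numerical check of the multinomial formula: the illustrative Python sketch below partitions n = 10 objects into piles of sizes 5, 3 and 2 (an assumed example), counting both by the product of binomial coefficients and by the factorial formula.

from math import comb, factorial

n, piles = 10, [5, 3, 2]

# Product of binomial coefficients C(n, k1) C(n-k1, k2) ...
count, remaining = 1, n
for k in piles:
    count *= comb(remaining, k)
    remaining -= k

# Factorial form n!/(k1! k2! ... kr!)
multinomial = factorial(n)
for k in piles:
    multinomial //= factorial(k)

print(count, multinomial)   # 2520 2520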
Conditional Probability and Independence
Consider the following example. A survey shows that 40% of the
people in a city read the “Time” magazine. Out of those who read
the “Time” magazine 20% are females. Define the events T= A
person reads the “Time” magazine, and F= A person is a female.
Then P(T) = 0.40. Notice that T and F are not in the same sample
space. We need a notation to denote the event F in T. We write
P (F|T) =0.20. This means the sample space of F is T. This
probability is called the conditional probability of F given T. It
can be shown that
P(F | T) = P(F ∩ T)/P(T).
Similarly,
P(F ∩ T) = P(F | T) P(T).
Two events A and B are said to be independent if P(A | B) = P(A)
or P(B | A) = P(B).
This gives
P(A ∩ B) = P(A) P(B).
In general, for any n events A_1, A_2, ..., A_n in the sample space, we can
extend the conditional probability rule as follows:
P(A_1 ∩ A_2 ∩ ... ∩ A_n) = P(A_1) P(A_2 | A_1) P(A_3 | A_1 ∩ A_2) P(A_4 | A_1 ∩ A_2 ∩ A_3) ...
P(A_n | A_1 ∩ A_2 ∩ ... ∩ A_{n−1})
Example: Consider tossing a die once. Define A = "the number
tossed is less than or equal to 3", B = "the number tossed is even",
and C = "the number tossed is a 1 or a 2."
P(A | B) = P(A ∩ B)/P(B) = P({1,2,3} ∩ {2,4,6})/P({2,4,6})
= P({2})/P({2,4,6}) = (1/6)/(1/2) = 1/3
Surely, the two events are not independent, which you can check by
using any one of the rules. Notice that P(A | B) = 1/3 ≠ P(A) = 1/2, or
P(A ∩ B) = 1/6 ≠ P(A) P(B) = (1/2)(1/2) = 1/4.
Are B and C independent? P(B ∩ C) = P({2,4,6} ∩ {1,2}) = 1/6 =
P(B) · P(C) = (1/2)(1/3) = 1/6. Therefore they are independent.
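Both checks can be automated for events in a finite uniform space. Here is a minimal Python sketch using the die events A, B, C above; the helper functions P and cond are illustrative names.

from fractions import Fraction

S = set(range(1, 7))
P = lambda E: Fraction(len(E), len(S))

A, B, C = {1, 2, 3}, {2, 4, 6}, {1, 2}

def cond(E, F):
    """Conditional probability P(E | F) = P(E n F) / P(F)."""
    return P(E & F) / P(F)

print(cond(A, B), P(A))        # 1/3 vs 1/2 -> A and B are not independent
print(P(B & C), P(B) * P(C))   # 1/6 = 1/6 -> B and C are independent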
It is not always possible to check whether two events are independent
by looking at the outcomes or the nature of the events; you must use
one of the rules given above. On the other hand, if you consider
rolling two dice with A = "rolling a 1 on the first die" and
B = "rolling a 1 on the second die," surely A and B must be
independent because A has no influence on B and vice versa.
Example: There are 3 friends A, B, and C. A tells the truth with
probability 1/3. If A tells the truth, the probability that B will also
tell the truth is 1/2. If A and B both tell the truth, C will lie with
probability 2/3. If you ask a question of A, then B, and then C,
what is the probability that all of them will tell the truth, assuming
that they know the answer to the question?
P(A ∩ B ∩ C) = P(A) P(B | A) P(C | A ∩ B) = (1/3)(1/2)(1/3) = 1/18
Theorem of Total Probability
Let B_i, i = 1, 2, 3, ..., k be a partition of the sample space. This means
that B_1 ∪ B_2 ∪ ... ∪ B_k = S, and B_i ∩ B_j = ∅ if i ≠ j. Let A ⊂ S. Then
P(A) = P(A | B_1) P(B_1) + P(A | B_2) P(B_2) + ... + P(A | B_k) P(B_k).
Important rule: P(A) = P(A | B) P(B) + P(A | B′) P(B′)
Example: An insurance company estimates that 40% of policy
holders who have only an auto policy will renew next year and
60% of policyholders who have only a homeowners policy will
renew next year. The company estimates that 80% of
policyholders who have both an auto and a homeowners policy
will renew at least one of those policies next year. Company
records show that 65% of policyholders have an auto policy, 50%
of policyholders have a homeowners policy, and 15% of
policyholders have both an auto and a homeowners policy. Using
the company’s estimates, calculate the percentage of policyholders
that will renew at least one policy next year.
Let A = policyholder has only an auto policy,
H = policyholder has only a homeowners policy,
B = policyholder has both policies.
Notice that the three events above form a partition of the sample
space, since every policyholder has at least one policy.
P(A) = .65 − .15 = .5, P(H) = .50 − .15 = .35, P(B) = .15. Let R = policyholder
renews at least one policy next year.
P(R) = P(R | A) P(A) + P(R | H) P(H) + P(R | B) P(B)
P(R) = .4(.5) + .6(.35) + .8(.15) = .53
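The same computation, written out in Python as an illustrative sketch (the dictionary keys mirror the three partition events used above):

# Partition of policyholders: only auto, only homeowners, both.
p = {"only auto": 0.65 - 0.15, "only home": 0.50 - 0.15, "both": 0.15}
p_renew_given = {"only auto": 0.40, "only home": 0.60, "both": 0.80}

# Theorem of total probability: P(R) = sum over the partition of P(R | B_i) P(B_i)
p_renew = sum(p_renew_given[b] * p[b] for b in p)
print(round(p_renew, 4))   # 0.53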
Most of the time we combine this theorem with the rule for
conditional probability; the combination is called Bayes' theorem.
Bayes Theorem
Consider the same setup as in the theorem of total probability. What is
P(B_i | A)?
P(B_i | A) = P(A | B_i) P(B_i) / P(A), where P(A) is given by the theorem
of total probability.
Example:
Identical twins come from the same egg and hence are of the same sex.
Fraternal twins are equally likely to be of the same sex or opposite sex.
Among twins, the probability of a fraternal set is p and an identical set is
q=1-p. If the next set of twins are of the same sex, what is the probability
that they are identical?
Let A = "the next set of twins are of the same sex" and B = "the next set
of twins are identical."
We need P(B | A) = P(A | B) P(B) / P(A). P(A | B) = 1, P(B) = q,
P(A) = P(A | identical twins) P(identical twins) + P(A | fraternal twins) P(fraternal twins)
= 1(q) + .5(p) = .5(1 + q)
P(B | A) = q/(.5(1 + q))
Some additional Rules
P(A′ | B) = 1 − P(A | B)
P(A ∪ B | C) = P(A | C) + P(B | C) − P(A ∩ B | C)
If A ⊂ B, then P(A | B) = P(A)/P(B) and P(B | A) = 1
Theorem: If A and B are independent events, then A′ and B are independent,
A and B′ are independent, and A′ and B′ are independent.
∅ is independent of any event A.
P(A | B) does not provide information about P(A | B′).
Example:
A study of automobile accidents produced the following data:
Model year    Proportion of all vehicles    P(accident)
1997          0.16                          0.05
1998          0.18                          0.02
1999          0.20                          0.03
other         0.46                          0.04
An automobile from one of the model years 1997, 1998 and 1999 was
involved in an accident. Determine the probability that the model is 1997.
Let A = "the automobile is involved in an accident" and let the year denote
the model.
P(1997 | A) = P(A | 1997) P(1997) / P(A), with P(A | 1997) = 0.05 and P(1997) = .16.
P(A) = P(A | 1997) P(1997) + P(A | 1998) P(1998) + P(A | 1999) P(1999) + P(A | other) P(other)
= (.05)(.16) + (.02)(.18) + (.03)(.20) + (.04)(.46) = .036
P(1997 | A) = (.05)(.16)/.036 ≈ 0.22
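The posterior probability of every model year can be computed at once with Bayes' theorem. The Python sketch below is illustrative and simply re-uses the numbers from the table above.

prior = {"1997": 0.16, "1998": 0.18, "1999": 0.20, "other": 0.46}   # proportion of all vehicles
p_accident = {"1997": 0.05, "1998": 0.02, "1999": 0.03, "other": 0.04}

# Total probability of an accident, then Bayes' theorem for each model year.
p_a = sum(p_accident[y] * prior[y] for y in prior)                  # 0.036
posterior = {y: p_accident[y] * prior[y] / p_a for y in prior}
print(round(p_a, 3), {y: round(v, 3) for y, v in posterior.items()})
# 0.036 {'1997': 0.222, '1998': 0.1, '1999': 0.167, 'other': 0.511}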
Example: A blood test indicates the presence of a particular disease 95% of
the time when the disease is actually present. The same test indicates the
presence of the disease 0.5% of the time when the disease is not present.
One percent of the population actually has the disease. Calculate the
probability that the person has the disease given that the test indicates the
presence of the disease.
Let D = "the person has the disease" and T = "the test indicates the presence
of the disease."
We need
P(D | T) = P(T | D) P(D) / P(T) = P(T | D) P(D) / [P(T | D) P(D) + P(T | D′) P(D′)]
= (0.95)(.01) / [.95(.01) + .005(.99)] ≈ 0.657
Example: The probability that a randomly chosen male has a circulation
problem is 0.25. Males who have a circulation problem are twice as likely
to be smokers as those who do not have a circulation problem. What is the
conditional probability that a male has a circulation problem given that he is
a smoker?
Let C = a male has a circulation problem and M = he is a smoker. We need
P(C | M). We are given P(C) = 0.25 and P(M | C) = 2 P(M | C′). Let
P(M | C) = p, so P(M | C′) = p/2.
P(C | M) = P(M | C) P(C) / P(M) = p(.25) / [p(.25) + (p/2)(.75)] = .4
Example: A doctor is studying the relationship between blood pressure and
heartbeat abnormalities in her patients. She tests a random sample of her
patients, noting their blood pressures (high, low, or normal) and their
heartbeats (regular or irregular). She finds that
(i) 14% have high blood pressure
(ii) 22% have low blood pressure
(iii) 15% have an irregular heartbeat
(iv) of those with an irregular heartbeat, one third have high blood pressure
(v) of those with normal blood pressure, one-eighth have an irregular
heartbeat.
What portion of the patients selected has a regular heartbeat and low blood
pressure?
Let R=patient has a regular heartbeat, L=patient has low blood pressure,
H=patient has high blood pressure, and N= patient has normal blood
pressure.
We need P(R ∩ L).
We are given P(H | R′) = 1/3, P(R′ | N) = 1/8, P(H) = .14, P(L) = .22, and
P(R′) = .15, so P(N) = 1 − .14 − .22 = .64.
Draw a Venn diagram (or a two-way table) to do this problem.
P(R′ ∩ H) = (1/3)(.15) = .05 and P(R′ ∩ N) = (1/8)(.64) = .08, so
P(R ∩ H) = .14 − .05 = .09, P(R ∩ N) = .64 − .08 = .56, and
P(R ∩ L) = P(R) − P(R ∩ H) − P(R ∩ N) = .85 − (.09 + .56) = .20
Bayes' theorem is used in medical diagnostics.
Problem. A 50-year-old woman comes to her primary physician for
a routine physical checkup. (She comes once every year for a
physical checkup.) The doctor has to check whether or not the
woman has breast cancer. A Gold Standard procedure is available:
Breast Biopsy. This procedure is foolproof. This is a definitive test.
One can determine with certainty whether or not the woman has
breast cancer. This procedure is invasive, time-consuming, and
expensive. Further, it cannot be done every year.
An alternative simple procedure is available: Mammogram. This
procedure is not as reliable as the gold standard procedure. I will
now explain how the reliability of this simple diagnostic procedure
is evaluated.
Select a woman at random from a population of interest. In our
example, the population consists of all women who are 50 or older.
Let A1 denote the event that the woman has breast cancer and A2
the event that she is free of breast cancer. The events A1 and A2 are
mutually exclusive and exhaustive. The woman takes a
mammogram (test). Let A denote the event that the mammogram is
positive and A* the event that the mammogram is negative. Note
that the events A and A* are mutually exclusive and exhaustive.
Define
Sensitivity of the test = P(A │A1) = Conditional probability that
the test comes out positive given that she has breast cancer.
Specificity of the test = P(A* │A2) = Conditional probability that
the test comes out negative given that she does not have breast
cancer.
If sensitivity is 1.0 and specificity 1.0, the test is definitive. Ideally,
for any diagnostic procedure, we would like to have sensitivity 1.0
and specificity 1.0. For gold standard procedures, sensitivity is 1.0
and specificity is 1.0. For mammograms,
sensitivity = 0.85 and specificity = 0.80.
Interpretation: For women with breast cancer, mammograms come
out positive in 85% of cases and come out negative (false
negatives) in 15% of cases.
False negative: Mammogram is negative when the woman has
breast cancer.
For women free of breast cancer, mammograms come out negative
in 80% of cases and come out positive (false positives) in 20%
of cases.
False positive: Mammogram is positive when the woman is free of
breast cancer.
How are sensitivity and specificity determined in practice?
How does one get those numbers 85% and 80%?
A medical researcher selected randomly 1,000 women from the
population of women 50 years or older who all had breast cancer.
(She knew they all had breast cancer. She conducted breast biopsy
on these women.) She asked each and every woman to take a
mammogram. She found in 850 cases mammograms were positive.
She declared that the sensitivity of the mammogram is 85%.
She also selected randomly 1,000 women who were free of breast
cancer. (She knew none of these women had breast cancer. She
conducted biopsy.) She asked each and every one of these women
to take a mammogram. In 20% of the cases mammograms came
out to be positive. She declared that the specificity of mammogram
is 80%.
Prevalence of breast cancer = P(A1) = proportion of women in the
population who suffer from breast cancer. This is the a priori
probability of breast cancer. This figure can be obtained from the
demographic data and vital statistics the US government publishes.
The prevalence varies from region to region. For New York City,
the prevalence of breast cancer is 0.0025, i.e., 25 cases out of 10,000
women who are 50 or older.
Scenario 1: A woman comes for a routine physical checkup. Her
doctor orders a mammogram. The mammogram is positive. Does
this mean she has breast cancer? Not necessarily. We would like to know
Predictive Value Positive = PVP = Pr(A1 │A) = The conditional
probability that the woman has breast cancer given that the test
(mammogram) is positive. Using Bayes’ Theorem, one can
compute PVP using the formula
PVP = Pr(A1 │A)
= Pr(A1) × Pr(A │A1) / [Pr(A1) × Pr(A │A1) + Pr(A2) × Pr(A │A2)]
= [Prevalence × Sensitivity] / [Prevalence × Sensitivity + (1 − Prevalence)(1 − Specificity)].
For New York City,
PVP = (0.85 × 0.0025) / (0.85 × 0.0025 + (1 − 0.80)(1 − 0.0025))
= 0.002125 / (0.002125 + 0.1995) = 0.002125 / 0.201625 = 0.0105.
There is a 1% chance that she has breast cancer given that the
mammogram is positive.
Scenario 2: A woman comes for a routine physical checkup. The
doctor orders a mammogram. It comes out to be negative. Does
this mean that she is free of breast cancer? Not necessarily. We
would like to compute
Predictive Value Negative = PVN = P(A2 │A*) = The conditional
probability that she is free of breast cancer given that the
mammogram is negative.
Using Bayes’ theorem, one has a formula for PVN.
PVN = Pr(A2) × Pr(A* │A2) / [Pr(A1) × Pr(A* │A1) + Pr(A2) × Pr(A* │A2)]
= [(1 − prevalence) × specificity] /
[prevalence × (1 − sensitivity) + (1 − prevalence) × specificity].
Check the formula.
For New York City,
PVN = (0.80 × (1 − 0.0025)) / (0.80 × (1 − 0.0025) + (1 − 0.85) × 0.0025)
= 0.798 / (0.798 + 0.000375) = 0.798 / 0.798375 = 0.9995.
If the mammogram is negative, it is extremely unlikely that the
woman has breast cancer.
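The two scenarios can be packaged into one small function of prevalence, sensitivity and specificity. The Python sketch below is illustrative (the function name predictive_values is just a convenient label) and reproduces the New York City numbers.

def predictive_values(prevalence, sensitivity, specificity):
    """Return (PVP, PVN) from Bayes' theorem."""
    pvp_num = prevalence * sensitivity
    pvp = pvp_num / (pvp_num + (1 - prevalence) * (1 - specificity))
    pvn_num = (1 - prevalence) * specificity
    pvn = pvn_num / (prevalence * (1 - sensitivity) + pvn_num)
    return pvp, pvn

pvp, pvn = predictive_values(prevalence=0.0025, sensitivity=0.85, specificity=0.80)
print(round(pvp, 4), round(pvn, 4))   # 0.0105 0.9995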
What do physicians do in practice?
Scenario: A woman comes for a routine physical checkup. She is
either 50 years of age or older.
1. Ask the woman to take a mammogram.
2. If the test is positive, conduct a biopsy.
3. If the test is negative, tell her that the chances she has breast
cancer are extremely remote.
Economics of Mammograms
An HMO has on its rolls 10,000 women who are 50 or older. The
US government strongly urges women in this age group to go for
periodic checkups to detect breast cancer. A biopsy costs about
$1,000. If the HMO conducts a biopsy for every woman who comes
in for a checkup, it would cost 10 million dollars. This is not
economically viable for the HMO. On the other hand, a
mammogram costs about $70, for a total cost of $700,000.
Examples.
1. A poker hand consists of 5 cards. If the cards have distinct
consecutive values and they are not all from the same suit, the hand
is called a straight. For example, nine of clubs, eight of clubs,
seven of hearts, 10 of hearts, and a jack of any suit is a straight. What
is the probability that one is dealt a straight?
n(S) = C(52, 5), and we assume all hands are equally likely. Let us first look
at the number of hands with ace, two, three, four and five (the suits being
irrelevant). Since each card can come from any of the four suits, the number
of such hands = 4^5. If all 5 such cards are from the same suit, the hand is
called a straight flush; therefore, we need to take these out. There are 4 such
hands. Now the number of hands with these values = 4^5 − 4. How many
such runs of consecutive values are there? Ten, jack, queen, king and ace is
another choice, and each of the 10 possible runs gives 4^5 − 4 straights. The
total number of straights is therefore 10(4^5 − 4). Thus the desired
probability is
10(4^5 − 4)/C(52, 5) ≈ .004
A hand is said to be a full house if 3 cards are of one denomination and the
other 2 cards are of another denomination. A full house is three of a kind plus
a pair. For example, two tens and three jacks is a full house. What is the
probability of this event?
There are 4 aces, 4 kings, 4 queens, 4 jacks, 4 tens, ..., 4 twos in the deck.
Take 4 aces and 4 kings. We can make C(4, 3) C(4, 2) + C(4, 2) C(4, 3) = 48
hands which qualify as a full house. We can choose 2 such denominations out
of the 13 denominations in C(13, 2) ways. The required probability is
C(13, 2) × 48 / C(52, 5) ≈ .0014
 13 
4 
Also, P ( flush) =P( all 5 cards are of the same suit) = 525   .0020
 
 
5
P(one pair) = P(the cards have denominations a, a, b, c, d, where a, b, c, and d
are all distinct) = C(13, 1) C(4, 2) C(12, 3) C(4, 1) C(4, 1) C(4, 1) / C(52, 5) ≈ .42.
We argue this way: n(E) = choose one denomination out of 13 and choose
2 cards from its four, then choose 3 different denominations from the
remaining 12 denominations and choose one card from the 4 cards in each
of those denominations.
P(two pairs) = P(a, a, b, b, c, where a, b, c are all distinct) =
C(13, 2) C(4, 2) C(4, 2) C(11, 1) C(4, 1) / C(52, 5) ≈ .0475
P(three of a kind) = P(a, a, a, b, c, where a, b, c are all distinct) =
C(13, 1) C(4, 3) C(12, 2) C(4, 1) C(4, 1) / C(52, 5) ≈ .021
P(four of a kind) = P(a, a, a, a, b) = C(13, 1) C(4, 4) C(12, 1) C(4, 1) / C(52, 5) ≈ .00024
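All of these poker probabilities can be reproduced with math.comb. The sketch below is an illustrative check of each count against the C(52, 5) = 2,598,960 equally likely hands.

from math import comb

hands = comb(52, 5)                                    # 2,598,960 equally likely hands

straight      = 10 * (4**5 - 4)                        # 10 runs of 5 consecutive values, minus straight flushes
full_house    = comb(13, 2) * 48
flush         = 4 * comb(13, 5)
one_pair      = comb(13, 1) * comb(4, 2) * comb(12, 3) * 4**3
two_pairs     = comb(13, 2) * comb(4, 2)**2 * comb(11, 1) * comb(4, 1)
three_of_kind = comb(13, 1) * comb(4, 3) * comb(12, 2) * 4**2
four_of_kind  = comb(13, 1) * comb(12, 1) * comb(4, 1)

# Prints, e.g., straight ~ 0.00392 and four of a kind ~ 0.00024.
for name, n in [("straight", straight), ("full house", full_house), ("flush", flush),
                ("one pair", one_pair), ("two pairs", two_pairs),
                ("three of a kind", three_of_kind), ("four of a kind", four_of_kind)]:
    print(name, round(n / hands, 5))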
Example 2
Suppose a box contains 4 blue, 5 white, 6 red, and 7 green balls. If we take a
sample of size 5, what is the probability that every color is represented?
n(S) = C(22, 5) = 26334
n(E) = C(4, 2) C(5, 1) C(6, 1) C(7, 1) + C(4, 1) C(5, 2) C(6, 1) C(7, 1)
+ C(4, 1) C(5, 1) C(6, 2) C(7, 1) + C(4, 1) C(5, 1) C(6, 1) C(7, 2) = 7560
P(E) = 7560/26334 ≈ 0.287
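As a final illustrative check in Python, the answer can be confirmed both by the combinatorial count above and by direct enumeration of the C(22, 5) samples (the color list below just encodes the contents of the box).

from math import comb
from itertools import combinations

colors = ["blue"] * 4 + ["white"] * 5 + ["red"] * 6 + ["green"] * 7   # 22 balls
total = comb(22, 5)                                                   # 26334

# Combinatorial count: exactly one color contributes two balls.
n_e = (comb(4, 2) * 5 * 6 * 7 + 4 * comb(5, 2) * 6 * 7
       + 4 * 5 * comb(6, 2) * 7 + 4 * 5 * 6 * comb(7, 2))             # 7560

# Direct enumeration over all samples of size 5.
count = sum(1 for s in combinations(colors, 5) if len(set(s)) == 4)
print(n_e, count, round(n_e / total, 3))                              # 7560 7560 0.287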