Chapter 4 Probability: Studying Randomness

Randomness and Probability
• Random: Process where the outcome in a
particular trial is not known in advance,
although a distribution of outcomes may be
known for a long series of repetitions
• Probability: The proportion of time a
particular outcome will occur in a long
series of repetitions of a random process
• Independence: When the outcome of one
trial does not affect the probabilities of outcomes
of subsequent trials
Probability Models
• Probability Model:
– Listing of possible outcomes
– Probability corresponding to each outcome
• Sample Space (S): Set of all possible outcomes of
a random process
• Event: Outcome or set of outcomes of a random
process (subset of S)
• Venn Diagram: Graphic description of a sample
space and events
Rules of Probability
• The probability of an event A, denoted P(A), must
lie between 0 and 1 ($0 \le P(A) \le 1$)
• For the sample space S, P(S)=1
• Disjoint events have no common outcomes. For 2
disjoint events A and B, P(A or B) = P(A) + P(B)
• The complement of an event A is the event that A
does not occur, denoted $A^c$. $P(A) + P(A^c) = 1$
• The probability of any event A is the sum of the
probabilities of the individual outcomes that make
up the event when the sample space is finite
Assigning Probabilities to Events
• Assign probabilities to each individual outcome and
add up probabilities of all outcomes comprising the
event
• When each outcome is equally likely, count the number
of outcomes corresponding to the event and divide by
the total number of outcomes
• Multiplication Rule: A and B are independent events if
knowledge that one occurred does not affect the
probability that the other has occurred. If A and B are
independent, then P(A and B) = P(A)P(B)
• Multiplication rule extends to any finite number of
events
Example - Casualties at Gettysburg
• Results from the Battle of Gettysburg

Counts
Army    Killed   Wounded   Captured/Missing   Safe Survival    Total
North     3155     14525               5365           72324    95369
South     2592     12709              12227           49972    77500

Proportions
Army    Killed   Wounded   Captured/Missing   Safe Survival    Total
North   0.0331    0.1523             0.0563          0.7584   1.0000
South   0.0334    0.1640             0.1578          0.6448   1.0000

Killed, Wounded, and Captured/Missing are all considered casualties.
What is the probability that a randomly selected Northern soldier was
a casualty? A Southern soldier? Obtain the distribution across
armies
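
The casualty question can be answered by adding the three disjoint casualty outcomes and dividing by each army's total. The following is a minimal sketch using the counts from the table; the names `counts`, `casualty_probability`, and `p_casualty` are illustrative and not part of the original slides.

```python
# Counts from the Gettysburg table: (killed, wounded, captured/missing, total).
counts = {
    "North": (3155, 14525, 5365, 95369),
    "South": (2592, 12709, 12227, 77500),
}


def casualty_probability(killed, wounded, captured_missing, total):
    """P(casualty): the three casualty outcomes are disjoint, so their counts add."""
    return (killed + wounded + captured_missing) / total


p_casualty = {army: casualty_probability(*row) for army, row in counts.items()}

for army, p in p_casualty.items():
    print(f"{army}: P(casualty) = {p:.4f}")
```

The same result follows from the complement rule: one minus the Safe Survival proportion for each army.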
Random Variables
• Random Variable (RV): Variable that takes on the value of
a numeric outcome of a random process
• Discrete RV: Can take on a finite (or countably infinite)
set of possible outcomes
• Probability Distribution: List of values a random variable
can take on and their corresponding probabilities
– Individual probabilities must lie between 0 and 1
– Probabilities sum to 1
• Notation:
– Random variable: X
– Values X can take on: x1, x2, …, xk
– Probabilities: P(X=x1) = p1 … P(X=xk) = pk
Example: Wars Begun by Year (1482-1939)
Distribution of Numbers of wars started by year
X = # of wars started in a randomly selected year
Levels: x1=0, x2=1, x3=2, x4=3, x5=4
Probability Distribution:
#Wars   Probability
0       0.5284
1       0.3231
2       0.1070
3       0.0328
4       0.0087
[Figure: Histogram of the number of wars begun per year; horizontal axis: Wars (0, 1, 2, 3, 4, More), vertical axis: number of years (0 to 300)]
Masters Golf Tournament 1st Round Scores
[Figure: Histogram of 1st round scores; horizontal axis: Score (63 to 90), vertical axis: Frequency (0 to 600)]
Score   Frequency   Probability
63              1      0.000288
64              2      0.000576
65              6      0.001728
66             16      0.004608
67             46      0.013249
68             67      0.019297
69            151      0.043491
70            238      0.068548
71            337      0.097062
72            428      0.123272
73            467      0.134505
74            498      0.143433
75            397      0.114343
76            293      0.084389
77            203      0.058468
78            125      0.036002
79             78      0.022465
80             50      0.014401
81             28      0.008065
82             17      0.004896
83              7      0.002016
84              7      0.002016
85              4      0.001152
86              3      0.000864
87              1      0.000288
88              2      0.000576
Continuous Random Variables
• Variable can take on any value along a continuous
range of numbers (interval)
• Probability distribution is described by a smooth
density curve
• Probabilities of ranges of values for X correspond to
areas under the density curve
– Curve must lie on or above the horizontal axis
– Total area under the curve is 1
• Special case: Normal distributions
Means and Variances of Random Variables
• Mean: Long-run average a random variable will take on
(also the balance point of the probability distribution)
• Expected Value is another term for the mean, although
a single realization of X will not necessarily be
close to its mean. Notation: E(X)
• Mean of a discrete random variable:
$E(X) = \mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_{i=1}^{k} x_i p_i$
Examples - Wars & Masters Golf
#Wars   Probability   x*p
0       0.5284        0.0000
1       0.3231        0.3231
2       0.1070        0.2140
3       0.0328        0.0983
4       0.0087        0.0349
Sum     1.0000        0.6703
μ = 0.67
Score   prob        x*p
63      0.000288     0.0181
64      0.000576     0.0369
65      0.001728     0.1123
66      0.004608     0.3041
67      0.013249     0.8877
68      0.019297     1.3122
69      0.043491     3.0009
70      0.068548     4.7984
71      0.097062     6.8914
72      0.123272     8.8756
73      0.134505     9.8188
74      0.143433    10.6141
75      0.114343     8.5757
76      0.084389     6.4136
77      0.058468     4.5020
78      0.036002     2.8082
79      0.022465     1.7748
80      0.014401     1.1521
81      0.008065     0.6532
82      0.004896     0.4015
83      0.002016     0.1673
84      0.002016     0.1694
85      0.001152     0.0979
86      0.000864     0.0743
87      0.000288     0.0251
88      0.000576     0.0507
Sum     1           73.54
μ = 73.54
Statistical Estimation/Law of Large Numbers
• In practice we won’t know $\mu$ but will want to estimate it
• We can select a sample of individuals and observe the
sample mean $\bar{x}$
• By selecting a large enough sample size we can be very
confident that our sample mean will be arbitrarily close
to the true parameter value
• Margin of error is an upper bound (holding with a high
level of confidence) on our sampling error. It decreases
as the sample size increases
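
A small simulation can illustrate the law of large numbers. This sketch assumes the wars distribution from the earlier example is the true population and draws independent samples from it; the seed, the sample sizes, and the function name `sample_mean` are arbitrary choices made for illustration.

```python
import random

# Assumed "true" distribution: wars begun per year (rounded probabilities).
values = [0, 1, 2, 3, 4]
probs = [0.5284, 0.3231, 0.1070, 0.0328, 0.0087]
mu = sum(x * p for x, p in zip(values, probs))  # true mean, about 0.67


def sample_mean(n, rng):
    """Draw n independent observations and return their average."""
    draws = rng.choices(values, weights=probs, k=n)
    return sum(draws) / n


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
for n in (10, 100, 10_000, 100_000):
    x_bar = sample_mean(n, rng)
    print(f"n = {n:>7}: sample mean = {x_bar:.4f}, error = {abs(x_bar - mu):.4f}")
```

The printed errors will typically shrink as n grows, although any individual small sample can land close to or far from the true mean by chance.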
Rules for Means
• Linear Transformations: a + bX (where a and b are
constants): $E(a+bX) = \mu_{a+bX} = a + b\mu_X$
• Sums of random variables: X + Y (where X and Y are
random variables): $E(X+Y) = \mu_{X+Y} = \mu_X + \mu_Y$
• Linear Functions of Random Variables:
$E(a_1X_1 + \cdots + a_nX_n) = a_1\mu_1 + \cdots + a_n\mu_n$
where $E(X_i) = \mu_i$
Example: Masters Golf Tournament
• Mean by Round (Note ordering):
$\mu_1 = 73.54$, $\mu_2 = 73.07$, $\mu_3 = 73.76$, $\mu_4 = 73.91$
Mean Score per hole (18) for round 1:
$E((1/18)X_1) = (1/18)\mu_1 = (1/18)(73.54) = 4.09$
Mean Score versus par (72) for round 1:
$E(X_1 - 72) = \mu_{X_1 - 72} = 73.54 - 72 = +1.54$ (1.54 over par)
Mean Difference (Round 1 - Round 4):
$E(X_1 - X_4) = \mu_1 - \mu_4 = 73.54 - 73.91 = -0.37$
Mean Total Score:
$E(X_1 + X_2 + X_3 + X_4) = \mu_1 + \mu_2 + \mu_3 + \mu_4 = 73.54 + 73.07 + 73.76 + 73.91 = 294.28$ (6.28 over par)
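
The rules for means reduce each of these calculations to arithmetic on the round means. A minimal sketch, with the variable names (`mu`, `PAR`, `HOLES`, and the result names) chosen here for illustration:

```python
# Mean score by round, taken from the slide.
mu = {1: 73.54, 2: 73.07, 3: 73.76, 4: 73.91}
PAR = 72     # par for one round
HOLES = 18   # holes per round

# Linear transformation with a = 0, b = 1/18: E(X1 / 18) = mu_1 / 18
mean_per_hole = mu[1] / HOLES

# Linear transformation with a = -72, b = 1: E(X1 - 72) = mu_1 - 72
mean_vs_par = mu[1] - PAR

# Linear function with coefficients +1 and -1: E(X1 - X4) = mu_1 - mu_4
mean_diff_1_4 = mu[1] - mu[4]

# Sum of random variables: E(X1 + X2 + X3 + X4) = sum of the four means
mean_total = sum(mu.values())
mean_total_vs_par = mean_total - 4 * PAR

print(f"per hole: {mean_per_hole:.2f}")
print(f"vs par:   {mean_vs_par:+.2f}")
print(f"R1 - R4:  {mean_diff_1_4:+.2f}")
print(f"total:    {mean_total:.2f} ({mean_total_vs_par:+.2f} vs par)")
```

None of these results require X1, ..., X4 to be independent; the rules for means hold regardless of any correlation between rounds.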
Variance of a Random Variable
• Variance: Measure of the spread of the probability
distribution. Average squared deviation from the mean
• Standard Deviation: (Positive) Square Root of Variance
$V(X) = \sigma_X^2 = (x_1 - \mu_X)^2 p_1 + \cdots + (x_k - \mu_X)^2 p_k = \sum_i (x_i - \mu_X)^2 p_i$
$= \sum_i x_i^2 p_i - \mu_X^2 = E(X^2) - \mu_X^2$
(useful when X takes on integer values)
Rules for Variances (X, Y RVs; a, b constants)
$V(a + bX) = \sigma_{a+bX}^2 = b^2 \sigma_X^2$
$V(aX + bY) = \sigma_{aX+bY}^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab\rho\,\sigma_X \sigma_Y$
where $\rho$ is the correlation between X and Y
Special Cases:
• X and Y are independent (outcome of one does not alter the
distribution of the other): $\rho = 0$, last term drops out
• a=b=1 and $\rho = 0$: $V(X+Y) = \sigma_X^2 + \sigma_Y^2$
• a=1, b=-1 and $\rho = 0$: $V(X-Y) = \sigma_X^2 + \sigma_Y^2$
• a=b=1 and $\rho \ne 0$: $V(X+Y) = \sigma_X^2 + \sigma_Y^2 + 2\rho\,\sigma_X \sigma_Y$
• a=1, b=-1 and $\rho \ne 0$: $V(X-Y) = \sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X \sigma_Y$
Wars & Masters (Round 1) Golf Scores
Wars (x)   Prob     (x-μ)     (x-μ)^2   ((x-μ)^2)*p
0          0.5284   -0.6703    0.4493   0.2374
1          0.3231    0.3297    0.1087   0.0351
2          0.1070    1.3297    1.7681   0.1892
3          0.0328    2.3297    5.4275   0.1780
4          0.0087    3.3297   11.0869   0.0965
Sum        1.0000                       0.7362
σ^2 = .7362, σ = .8580
Score   prob       (x-μ)^2    ((x-μ)^2)p
63      0.000288   111.0916   0.031996
64      0.000576    91.0116   0.052426
65      0.001728    72.9316   0.126034
66      0.004608    56.8516   0.261989
67      0.013249    42.7716   0.566674
68      0.019297    30.6916   0.592263
69      0.043491    20.6116   0.896415
70      0.068548    12.5316   0.859021
71      0.097062     6.4516   0.626207
72      0.123272     2.3716   0.292352
73      0.134505     0.2916   0.039222
74      0.143433     0.2116   0.030350
75      0.114343     2.1316   0.243734
76      0.084389     6.0516   0.510691
77      0.058468    11.9716   0.699952
78      0.036002    19.8916   0.716143
79      0.022465    29.8116   0.669731
80      0.014401    41.7316   0.600974
81      0.008065    55.6516   0.448803
82      0.004896    71.5716   0.350437
83      0.002016    89.4916   0.180427
84      0.002016   109.4116   0.220588
85      0.001152   131.3316   0.151304
86      0.000864   155.2516   0.134146
87      0.000288   181.1716   0.052181
88      0.000576   209.0916   0.120444
Sum     1                     9.474503
σ^2 = 9.47
σ = 3.08
Masters Scores (Rounds 1 & 4)
$\mu_1 = 73.54$, $\mu_4 = 73.91$, $\sigma_1^2 = 9.48$, $\sigma_4^2 = 11.95$, $\rho = 0.24$
• Variance of Round 1 scores vs Par: $V(X_1 - 72) = \sigma_1^2 = 9.48$
• Variance of Sum and Difference of Round 1 and Round 4 Scores:
Sum ($X_1 + X_4$): $V(X_1 + X_4) = \sigma_1^2 + \sigma_4^2 + 2\rho\,\sigma_1 \sigma_4$
$= 9.48 + 11.95 + 2(0.24)\sqrt{(9.48)(11.95)} = 9.48 + 11.95 + 5.11 = 26.54$
$\sigma_{X_1 + X_4} = \sqrt{26.54} = 5.15$
Difference ($X_1 - X_4$): $V(X_1 - X_4) = \sigma_1^2 + \sigma_4^2 - 2\rho\,\sigma_1 \sigma_4$
$= 9.48 + 11.95 - 2(0.24)\sqrt{(9.48)(11.95)} = 9.48 + 11.95 - 5.11 = 16.32$
$\sigma_{X_1 - X_4} = \sqrt{16.32} = 4.04$
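
The sum and difference calculations are both instances of the rule for V(aX + bY), with b = +1 or b = -1. A minimal sketch, where `var_linear_combo` is a hypothetical helper name and the inputs are the Round 1 and Round 4 values quoted on the slide:

```python
import math


def var_linear_combo(a, b, var_x, var_y, rho):
    """V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab * rho * sd(X) * sd(Y)."""
    return a * a * var_x + b * b * var_y + 2 * a * b * rho * math.sqrt(var_x * var_y)


var_r1, var_r4, rho = 9.48, 11.95, 0.24

var_sum = var_linear_combo(1, 1, var_r1, var_r4, rho)    # V(X1 + X4)
var_diff = var_linear_combo(1, -1, var_r1, var_r4, rho)  # V(X1 - X4)

print(f"V(X1 + X4) = {var_sum:.2f}, sd = {math.sqrt(var_sum):.2f}")
print(f"V(X1 - X4) = {var_diff:.2f}, sd = {math.sqrt(var_diff):.2f}")
```

Because the correlation is positive here, the sum is more variable than the difference; with rho = 0 the two variances would be equal.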
General Rules of Probability
• Union of a set of events: The event that at least one of
the events occurs
• Disjoint events: Events that share no common sample
points. If A, B, and C are pairwise disjoint, the
probability of their union is: P(A)+P(B)+P(C)
• Intersection of two (or more) events: The event that
both (all) events occur.
• Addition Rule: P(A or B) = P(A)+P(B)-P(A and B)
• Conditional Probability: The probability B occurs
given A has occurred: P(B|A)
• Multiplication Rule (generalized to conditional prob):
P(A and B)=P(A)P(B|A)=P(B)P(A|B)
Conditional Probability
• Usually of interest when one event precedes
another in time (although this is not necessary)
• When P(A) > 0 and P(B) > 0 (otherwise trivial):
$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
• Contingency Table: Table that cross-classifies individuals or
probabilities across 2 or more event classifications
• Tree Diagram: Graphical description of cross-classification of 2
or more events
John Snow London Cholera Death Study
• 2 Water Companies (Let D be the event of death):
– Southwark&Vauxhall (S): 264913 customers, 3702 deaths
– Lambeth (L): 171363 customers, 407 deaths
– Overall: 436276 customers, 4109 deaths
$P(D) = \frac{4109}{436276} = .0094$ (94 per 10000 people)
$P(D \mid S) = \frac{3702}{264913} = .0140$ (140 per 10000 people)
$P(D \mid L) = \frac{407}{171363} = .0024$ (24 per 10000 people)
Note that probability of death is almost 6 times higher for S&V
customers than Lambeth customers (was important in showing how
cholera spread)
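
The three probabilities and the "almost 6 times" comparison can be reproduced directly from the customer and death counts. This is a minimal sketch; the dictionary and variable names are illustrative assumptions.

```python
# Customers and cholera deaths by water company, from the slide.
customers = {"S&V": 264913, "Lambeth": 171363}
deaths = {"S&V": 3702, "Lambeth": 407}

# Marginal probability of death: all deaths over all customers.
p_death = sum(deaths.values()) / sum(customers.values())

# Conditional probability of death given the company: P(D | company).
p_death_given = {c: deaths[c] / customers[c] for c in customers}

# Relative risk of S&V customers compared with Lambeth customers.
risk_ratio = p_death_given["S&V"] / p_death_given["Lambeth"]

print(f"P(D)     = {p_death:.4f}")
for company, p in p_death_given.items():
    print(f"P(D | {company}) = {p:.4f}")
print(f"risk ratio S&V / Lambeth = {risk_ratio:.2f}")
```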
John Snow London Cholera Death Study
Cholera Death   S&V              Lambeth          Total
Yes             3702 (.0085)     407 (.0009)      4109 (.0094)
No              261211 (.5987)   170956 (.3919)   432167 (.9906)
Total           264913 (.6072)   171363 (.3928)   436276 (1.0000)
Contingency Table with joint probabilities (in body of table) and
marginal probabilities (on edge of table)
John Snow London Cholera Death Study
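Water user, first branch by Company, then by Death:
S&V (.6072)
    D   (.0140): joint probability (.0085)
    D^c (.9860): joint probability (.5987)
Lambeth, L (.3928)
    D   (.0024): joint probability (.0009)
    D^c (.9976): joint probability (.3919)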
Tree Diagram obtaining joint probabilities by multiplication rule
Example: Florida lotto
• You select 6 distinct numbers from 1 to 53 (no replacement)
• State randomly draws 6 numbers from 1 to 53
• Probability you match all 6 numbers:
– First state draw: P(match 1st) = 6/53
– Given you match 1st, you have 5 left and state has 52 left:
P(match 2nd given matched 1st) = 5/52
– Process continues: P(match 3rd given 1&2) = 4/51
– P(match 4th given 1&2&3) = 3/50
– P(match 5th given 1&2&3&4) = 2/49
– P(match 6th given 1&2&3&4&5) = 1/48
Multiplication rule: $P(\text{match all}) = \left(\frac{6}{53}\right)\left(\frac{5}{52}\right)\left(\frac{4}{51}\right)\left(\frac{3}{50}\right)\left(\frac{2}{49}\right)\left(\frac{1}{48}\right) = \frac{1}{22{,}957{,}480}$
Bayes’s Rule - Updating Probabilities
• Let A1,…,Ak be a set of events that partition a sample
space such that (mutually exclusive and exhaustive):
– each event has known P(Ai) > 0 (each event can occur)
– for any 2 events Ai and Aj, P(Ai and Aj) = 0 (events are disjoint)
– P(A1) + … + P(Ak) = 1 (each outcome belongs to one of the events)
• If C is an event such that
– 0 < P(C) < 1 (C can occur, but will not necessarily occur)
– We know the probability that C will occur given each event Ai: P(C|Ai)
• Then we can compute probability of Ai given C occurred:
$P(A_i \mid C) = \frac{P(A_i \text{ and } C)}{P(C)} = \frac{P(C \mid A_i) P(A_i)}{P(C \mid A_1) P(A_1) + \cdots + P(C \mid A_k) P(A_k)}$
Northern Army at Gettysburg
Regiment       Label   Initial #   Casualties   P(Ai)    P(C|Ai)   P(C|Ai)*P(Ai)   P(Ai|C)
I Corps        A1          10022         6059   0.1051   0.6046    0.0635          0.2630
II Corps       A2          12884         4369   0.1351   0.3391    0.0458          0.1896
III Corps      A3          11924         4211   0.1250   0.3532    0.0442          0.1828
V Corps        A4          12509         2187   0.1312   0.1748    0.0229          0.0949
VI Corps       A5          15555          242   0.1631   0.0156    0.0025          0.0105
XI Corps       A6           9839         3801   0.1032   0.3863    0.0399          0.1650
XII Corps      A7           8589         1082   0.0901   0.1260    0.0113          0.0470
Cav Corps      A8          11501          852   0.1206   0.0741    0.0089          0.0370
Arty Reserve   A9           2546          242   0.0267   0.0951    0.0025          0.0105
Sum                        95369        23045   1                  0.2416 = P(C)   1.0002
• Regiments: partition of soldiers (A1,…,A9). Casualty: event C
• P(Ai) = (size of regiment) / (total soldiers) = (Column 3)/95369
• P(C|Ai) = (# casualties) / (regiment size) = (Col 4)/(Col 3)
• P(C|Ai) P(Ai) = P(Ai and C) = (Col 5)*(Col 6)
• P(C) = sum(Col 7)
• P(Ai|C) = P(Ai and C) / P(C) = (Col 7)/.2416
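
The column-by-column steps above can be sketched in a few lines. The code starts from the Initial # and Casualties columns and rebuilds the remaining ones; the list names (`prior`, `likelihood`, `joint`, `posterior`) are labels chosen for this illustration. Because it works with unrounded values, the last digit may differ slightly from the table.

```python
labels = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"]
initial = [10022, 12884, 11924, 12509, 15555, 9839, 8589, 11501, 2546]
casualties = [6059, 4369, 4211, 2187, 242, 3801, 1082, 852, 242]

total = sum(initial)

# P(Ai): share of all soldiers belonging to unit i.
prior = [n / total for n in initial]

# P(C | Ai): casualty rate within unit i.
likelihood = [c / n for c, n in zip(casualties, initial)]

# P(Ai and C) = P(C | Ai) * P(Ai).
joint = [lik * pr for lik, pr in zip(likelihood, prior)]

# P(C): sum of the joint probabilities over the partition.
p_c = sum(joint)

# Bayes's rule: P(Ai | C) = P(Ai and C) / P(C).
posterior = [j / p_c for j in joint]

print(f"P(C) = {p_c:.4f}")
for label, pr, post in zip(labels, prior, posterior):
    print(f"{label}: P(Ai) = {pr:.4f}, P(Ai|C) = {post:.4f}")
```

Comparing the two printed columns shows the update: I Corps held roughly 10.5% of the soldiers but accounts for roughly 26% of the casualties.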
Independent Events
• Two events A and B are independent if
P(B|A)=P(B) and P(A|B)=P(A) , otherwise
they are dependent or not independent.
• Cholera Example:
P(D) = .0094, P(D|S) = .0140, P(D|L) = .0024
Not independent (which firm would you prefer?)
• Union Army Example:
P(C) = .2416, P(C|A1) = .6046, P(C|A5) = .0156
Not independent: Almost 40 times higher risk for A1 than for A5