The binomial distribution

Supplementary information
The binomial distribution is the probability distribution of the number of successes in
a sequence of n independent experiments.
Each experiment is a Bernoulli trial, meaning that it has only two possible outcomes:
a success or a failure. The probability of a success in an experiment is denoted
p (so the probability of a failure is 1 - p). The probability p of success is the same
for all n experiments.
Let X be the random variable representing the number of successes among the n experiments. X
follows a binomial distribution with parameters n and p. The expected value of X is
E(X) = np.
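One way to see where this expected value comes from, sketched here with indicator variables X_i that are introduced only for illustration: let X_i = 1 if experiment i is a success and X_i = 0 otherwise, so that E(X_i) = p. Then

$$X = \sum_{i=1}^{n} X_i, \qquad E(X) = \sum_{i=1}^{n} E(X_i) = \sum_{i=1}^{n} p = np.$$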
The probability of getting exactly k successes (and so n - k failures) is

$$P(X = k) = C_n^k \, p^k (1 - p)^{n - k}, \qquad \text{where } C_n^k = \frac{n!}{k!\,(n - k)!}$$

(k! is the factorial of k and is equal to k! = 1 × 2 × 3 × ... × (k - 1) × k, with the exception 0! = 1).
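As a minimal sketch, the formula for exactly k successes can be evaluated in Python. The function name `binomial_pmf` is an illustrative choice and does not come from the paper; `math.comb` (Python 3.8 or later) computes the binomial coefficient.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent Bernoulli trials.

    Illustrative helper (name assumed, not from the paper):
    P(X = k) = C(n, k) * p**k * (1 - p)**(n - k).
    """
    if not 0 <= k <= n:
        return 0.0  # k lies outside the possible numbers of successes
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Example with a fair coin tossed 4 times: P(exactly 2 heads) = 6/16
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

Summing `binomial_pmf(k, n, p)` over k = 0, ..., n should give 1 up to rounding, which is a quick sanity check on any such implementation.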
In this paper, whether or not an individual presents a de novo mutation can be
considered a Bernoulli trial with a probability of success of 2.5 × 10⁻⁵. The studied family
has 92 relatives, and whether one individual presents a de novo mutation is
independent of whether another individual does. The total
number of de novo mutations among the 92 relatives therefore follows a binomial distribution with
n = 92 and p = 2.5 × 10⁻⁵.
In a given family of 92 relatives, the expected number of de novo mutations is
np = 92 × 2.5 × 10⁻⁵ = 0.0023 (on average, about one mutation per 435 families).
The probability of observing no de novo mutation is:

$$P(X = 0) = C_{92}^{0} \, p^{0} (1 - p)^{92 - 0} = \frac{92!}{0!\,92!} \left(2.5 \times 10^{-5}\right)^{0} \left(1 - 2.5 \times 10^{-5}\right)^{92} = \left(1 - 2.5 \times 10^{-5}\right)^{92} = 0.997703$$

The probability of observing one de novo mutation is:
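
$$P(X = 1) = C_{92}^{1} \, p^{1} (1 - p)^{92 - 1} = \frac{92!}{1!\,91!} \left(2.5 \times 10^{-5}\right)^{1} \left(1 - 2.5 \times 10^{-5}\right)^{91} = 92 \left(2.5 \times 10^{-5}\right) \left(1 - 2.5 \times 10^{-5}\right)^{91} = 0.002295$$
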
The probability of observing two de novo mutations is:
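
$$P(X = 2) = C_{92}^{2} \, p^{2} (1 - p)^{92 - 2} = \frac{92!}{2!\,90!} \left(2.5 \times 10^{-5}\right)^{2} \left(1 - 2.5 \times 10^{-5}\right)^{90} = 4186 \left(2.5 \times 10^{-5}\right)^{2} \left(1 - 2.5 \times 10^{-5}\right)^{90} = 2.6 \times 10^{-6}$$
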
So the probability of observing at least three de novo mutations is:
P X  3  1  P( X  3)  1  P( X  0)  P( X  1)  P( X  2)




 1  1  2.5  10 5  92 2.5  10 5 1  2.5  10 5

91

 4186 2.5  10 5
 1  2.5  10 
2
 5 90
 2.0  10 9
This probability corresponds to about one family in 511 million families of 92
individuals.
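The probability of at least three de novo mutations can also be checked numerically. Subtracting numbers close to 1 from 1 can lose some precision in floating-point arithmetic, so this sketch also sums the terms for k = 3, ..., 92 directly; the direct sum is an alternative route to the same quantity, not the calculation used in the text, and the helper name `pmf` is illustrative.

```python
from math import comb

n = 92        # number of relatives in the studied family
p = 2.5e-5    # probability of a de novo mutation for one individual


def pmf(k: int) -> float:
    """P(X = k) for the binomial distribution with the n and p above."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# P(X >= 3) summed term by term over the upper tail
tail = sum(pmf(k) for k in range(3, n + 1))

# The same quantity via the complement 1 - P(X = 0) - P(X = 1) - P(X = 2)
complement = 1 - (pmf(0) + pmf(1) + pmf(2))

print(f"P(X >= 3), direct sum: {tail:.2e}")
print(f"P(X >= 3), complement: {complement:.2e}")
print(f"about one family in {1 / tail:,.0f}")
```

Both routes should give roughly 2.0 × 10⁻⁹, that is, on the order of one family in 500 million.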
 2.6 10  6