Section 7 - Winona State University

7 - Discrete Random Variables
This is an empirical discrete probability distribution for X = # of pigs in a litter.
A discrete probability density function (pdf) is defined as $f(x) = P(X = x)$.
The Prob column contains estimated values for the probability function. If this
study were repeated with a different sample of n = 378 litters, these estimates would
be different.
Binomial and Poisson Distributions
Binomial Distribution and Random Variable
A binomial random variable X is defined to be the number of “successes” in n independent
trials where P(success) = p is constant. From this definition, the following
conditions need to be satisfied for a binomial experiment:
1. There is a fixed number n of trials carried out.
2. The outcome of a given trial is either a “success” or a “failure”.
3. The probability of success (p) remains constant from trial to trial:
$P(\text{success}) = p$ and $P(\text{failure}) = 1 - p = q$.
4. The trials are independent, i.e. the outcome of a trial is not affected by the outcome of any
other trial.
Binomial Probability Function
$$f(x) = P(X = x) = \binom{n}{x} p^{x} q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}, \quad x = 0, 1, \ldots, n$$
The coefficient in front denotes the number of ways to obtain x successes in n trials.
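To make the counting interpretation of the coefficient concrete, the following minimal Python sketch lists every possible sequence of successes (S) and failures (F) for a small case and checks that the number of sequences with exactly x successes equals n!/(x!(n − x)!). The value n = 4 and the function name `count_sequences` are illustrative assumptions, not part of the notes.

```python
from itertools import product
from math import comb

def count_sequences(n, x):
    """Count the S/F sequences of length n that contain exactly x successes."""
    return sum(1 for seq in product("SF", repeat=n) if seq.count("S") == x)

n = 4  # small illustrative value (an assumption, not taken from the notes)
for x in range(n + 1):
    # comb(n, x) computes n! / (x! (n - x)!)
    print(x, count_sequences(n, x), comb(n, x))
```

Brute-force enumeration is only practical for small n, which is why the closed-form coefficient is used in the probability function.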
Example 1:
A drug company claims that 10% of patients taking the drug will experience adverse side
effects. To test this claim a researcher administers the drug to a random sample of 20
patients. Let X = the number of patients in our sample who experience adverse side effects.
a) What is the probability that exactly 4 patients experience side effects?
b) What is the probability that 2 patients or less experience side effects?
c) What is the probability that 5 patients or less experience side effects?
d) What is the probability that 6 or more patients experience side effects?
e) Suppose that in our sample 6 patients experience side effects. What do you think about the
claim made by the drug company on the basis of this result?
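One way to check parts (a) through (d) numerically is sketched below in Python, assuming X is binomial with n = 20 and p = 0.10 as the company claims. The helper names `binom_pmf` and `binom_cdf` are hypothetical and introduced only for this illustration; the course itself uses JMP tables.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x), obtained by summing the probability function from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 20, 0.10                   # values from Example 1
prob_a = binom_pmf(4, n, p)       # (a) P(X = 4)
prob_b = binom_cdf(2, n, p)       # (b) P(X <= 2)
prob_c = binom_cdf(5, n, p)       # (c) P(X <= 5)
prob_d = 1 - binom_cdf(5, n, p)   # (d) P(X >= 6) = 1 - P(X <= 5)

for label, prob in zip("abcd", (prob_a, prob_b, prob_c, prob_d)):
    print(f"({label}) {prob:.4f}")
```

For part (e), the value from (d) indicates how unusual 6 or more side effects would be if the 10% claim were true.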
For a binomial random variable X, the mean, variance and standard deviation of the
number of successes are given by:
$$\mu = np, \qquad \sigma^2 = npq, \qquad \sigma = \sqrt{npq}$$
f) Suppose we gave the drug to 100 patients. If the drug company’s claim regarding side
effects is true, how many side effects do we expect to observe, i.e. what is the mean or
expected value of X?
g) What is the standard deviation of the number of side effects?
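As a worked sketch of parts (f) and (g), assuming the company's claim p = 0.10 holds (so q = 0.90) and using the standard binomial results for the mean, np, and the standard deviation, √(npq):

$$\mu = np = 100(0.10) = 10, \qquad \sigma = \sqrt{npq} = \sqrt{100(0.10)(0.90)} = \sqrt{9} = 3$$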
When n is sufficiently large, X = # of “successes” in the n trials is approximately
normally distributed. As an example consider the histogram below, which shows the
simulated results of 10,000 clinical trials where for each trial n = 100 patients are given a
drug which has p = P(side effect) = .10, or a 10% chance of causing a side effect, and the
number of patients experiencing side effects is observed.
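A simulation of this kind can be sketched in a few lines of Python. This is not the code used to produce the histogram in the notes; the function name `simulate_counts`, the seed, and the text-based histogram are assumptions made for illustration.

```python
import random
from collections import Counter

def simulate_counts(n_patients=100, p=0.10, n_trials=10_000, seed=1):
    """Simulate n_trials clinical trials; return the side-effect count from each."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is repeatable
    counts = []
    for _ in range(n_trials):
        # Each patient independently has a side effect with probability p
        side_effects = sum(1 for _ in range(n_patients) if rng.random() < p)
        counts.append(side_effects)
    return counts

counts = simulate_counts()
freq = Counter(counts)
for x in sorted(freq):
    # Crude text histogram: one '#' per 25 simulated trials
    print(f"{x:3d} {'#' * (freq[x] // 25)}")
```

The printed bars should look roughly bell-shaped and centered near 10, in line with the normal approximation described above.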
h) How many side effects would you have to observe to be convinced that the claimed 10% side
effect rate is wrong and that the true side effect rate is greater?
Binomial Distribution Table for n = 100 and p = .10
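A table of this kind can be regenerated with a short Python sketch. The column layout chosen here (x, P(X = x), P(X ≤ x), P(X ≥ x)) and the function name `binomial_table` are assumptions; the original table in the notes may be arranged differently.

```python
from math import comb

def binomial_table(n, p):
    """Return rows (x, P(X = x), P(X <= x), P(X >= x)) for x = 0, ..., n."""
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    rows = []
    cumulative = 0.0
    for x, prob in enumerate(pmf):
        # P(X >= x) is 1 minus the total probability strictly below x;
        # max() guards against tiny negative values from rounding
        upper = max(0.0, 1.0 - cumulative)
        cumulative += prob
        rows.append((x, prob, cumulative, upper))
    return rows

table = binomial_table(100, 0.10)
print("  x  P(X=x)  P(X<=x) P(X>=x)")
for x, prob, lower, upper in table[:26]:  # counts above 25 are negligible here
    print(f"{x:3d}  {prob:.4f}  {lower:.4f}  {upper:.4f}")
```

Reading down the last column shows how quickly the upper-tail probability shrinks as the observed count moves above the expected value of 10, which is the information part (h) relies on.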
Example 2: Effect of Togetherness on Heart Rates of Rats
A researcher was interested in determining if the heart rate of rats increases when they
are in a cage with other rats versus when they are in a cage by themselves. The
researcher thought this would be the case but wanted to conduct a study to determine if
this hypothesis was supported empirically.
The following results were obtained:
Rat   Alone (A)   Together (T)   Difference (= T - A)   Sign of Difference
1     463         523
2     462         494
3     462         461
4     456         535            79
5     450         476            26
6     426         454            28
7     418         448            30
8     415         408            -7
9     409         470            61
10    402         437            35
What did we conclude regarding the research hypothesis?
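The "Sign of Difference" column suggests a sign-test style argument: if togetherness had no effect, each difference would be positive with probability 1/2, so the number of positive differences would be binomial with p = 0.5. The Python sketch below works through that reading; treating the example as a one-sided sign test is an assumption, since the notes do not name the method.

```python
from math import comb

# Heart rates from the table (rats 1 to 10)
alone    = [463, 462, 462, 456, 450, 426, 418, 415, 409, 402]
together = [523, 494, 461, 535, 476, 454, 448, 408, 470, 437]

differences = [t - a for a, t in zip(alone, together)]
n_positive = sum(1 for d in differences if d > 0)
n = sum(1 for d in differences if d != 0)  # zero differences, if any, are dropped

# Under "no effect", X = number of positive signs is binomial(n, 0.5).
# One-sided p-value: P(X >= n_positive)
p_value = sum(comb(n, k) for k in range(n_positive, n + 1)) / 2**n

print(differences)
print(n_positive, "positive differences out of", n)
print(f"P(X >= {n_positive}) = {p_value:.4f}")
```

A small value of this probability would support the research hypothesis that heart rate increases when rats are together.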
Poisson Distribution and Random Variable
A Poisson random variable X = # of occurrences in specified time or space unit.
The assumptions we make about the process generating X are as follows:
1. Occurrences are independent.
2. Any number of occurrences is possible in a given time/space unit.
3. The probability of a single occurrence in a given interval is proportional to the
length/size of the interval.
4. No simultaneous occurrences.
5. The expected number of occurrences during any one space/time unit is denoted by
$\mu$. This is the same for all space/time units.
Another case where the Poisson distribution is used to model the number of occurrences
is when we are in a binomial experiment situation where the probability of success (p) is
small and the number of trials (n) is large. In this case we can treat X = # of successes in n
trials as having a Poisson distribution with $\mu = np$.
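The quality of this approximation can be checked numerically. The Python sketch below compares binomial and Poisson probabilities side by side; the values n = 1000 and p = 0.004 are illustrative assumptions chosen so that np = 4, and the helper names are hypothetical.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mean):
    """Poisson probability P(X = x) for the given mean number of occurrences."""
    return exp(-mean) * mean**x / factorial(x)

n, p = 1000, 0.004  # illustrative: large n, small p
mean = n * p        # Poisson mean used for the approximation

print(" x  binomial  Poisson")
for x in range(11):
    print(f"{x:2d}  {binom_pmf(x, n, p):.4f}    {poisson_pmf(x, mean):.4f}")

# Largest pointwise gap over x = 0..30 (probabilities beyond 30 are negligible)
max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, mean)) for x in range(31))
print("largest gap:", max_gap)
```

With p this small, the two columns should agree to about three decimal places.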
Poisson Probability Function
$$f(x) = P(X = x) = \frac{e^{-\mu}\mu^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$
Example 1: Sampling Organisms from a Pond
Suppose that the average number of a particular organism is expected to be 4 per 1 ml
sample, i.e. $\mu = 4$/ml.
a) What is the probability of observing 2 organisms in a 1 ml sample?
b) What is the probability of observing fewer than 3 organisms?
c) What is the probability of observing more than 5 organisms?
d) Suppose a feedlot began operation near the pond. Two months after it first began
operation a 1-ml sample of water was taken and 10 organisms were found.
What do you conclude on the basis of this result?
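Parts (a) through (d) can be checked with a short Python sketch, assuming the organism count in a 1 ml sample is Poisson with mean 4. The helper names are hypothetical, and for part (d) the sketch computes the chance of 10 or more organisms under the original mean, which is one reasonable way to judge the feedlot result.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """Poisson probability P(X = x) for the given mean."""
    return exp(-mean) * mean**x / factorial(x)

def poisson_cdf(x, mean):
    """P(X <= x), summing the probability function from 0 to x."""
    return sum(poisson_pmf(k, mean) for k in range(x + 1))

mean = 4                             # organisms per 1 ml sample
prob_a = poisson_pmf(2, mean)        # (a) P(X = 2)
prob_b = poisson_cdf(2, mean)        # (b) P(X < 3) = P(X <= 2)
prob_c = 1 - poisson_cdf(5, mean)    # (c) P(X > 5) = 1 - P(X <= 5)
prob_d = 1 - poisson_cdf(9, mean)    # (d) P(X >= 10) if the mean were still 4

for label, prob in zip("abcd", (prob_a, prob_b, prob_c, prob_d)):
    print(f"({label}) {prob:.4f}")
```

A very small value in (d) would suggest that the mean organism count has increased since the feedlot began operating.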
Example 2: Birth Defects in the Counties Surrounding a Nuclear Power Plant in
Hanford, Washington.
One of the important issues in assessing nuclear energy is whether there are excess
disease risks in the communities surrounding nuclear power plants. A study was
undertaken in the community surrounding Hanford, Washington, looking at the
prevalence of selected congenital malformations in the counties surrounding the nuclear
test facility in Hanford.
a) In a study conducted by Sever et al. (1988), 27 cases of Down’s syndrome were found
and only 19 were expected based on the Birth Defects Monitoring Program prevalence
estimates conducted in the states of Washington, Idaho, and Oregon. Is there a
significant excess in the number of cases in the area surrounding the nuclear power
plant?
b) Suppose that 12 cases of cleft palate are observed, while only 7 are expected based on
Birth Defects Monitoring Program estimates. Does this represent a significant excess in
the number of children born with cleft palates?
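One way to approach both parts is to treat the observed count as Poisson with mean equal to the expected number of cases and compute the probability of seeing at least as many cases as were observed. The Python sketch below does this; the Poisson model, the one-sided tail, and the function name `poisson_upper_tail` are assumptions made for illustration.

```python
from math import exp, factorial

def poisson_upper_tail(x, mean):
    """P(X >= x) for a Poisson random variable with the given mean."""
    below = sum(exp(-mean) * mean**k / factorial(k) for k in range(x))
    return 1 - below

# (a) Down's syndrome: 27 observed, 19 expected
p_down = poisson_upper_tail(27, 19)
# (b) Cleft palate: 12 observed, 7 expected
p_cleft = poisson_upper_tail(12, 7)

print(f"(a) P(X >= 27 | mean 19) = {p_down:.4f}")
print(f"(b) P(X >= 12 | mean 7)  = {p_cleft:.4f}")
```

Whether either probability counts as a "significant" excess depends on the cutoff used, for example 0.05.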
Binomial Table Generator in JMP
To use the Binomial Table Generator file in JMP you simply need to change the number
of trials (n) and the probability of success (p), which is labeled as p in the table below. To
change the n and p values, right-click at the top of the column and change the number in
the formula to your desired values. The table will then automatically update and give the
probabilities in the last four columns.
Poisson Table Generator in JMP
To use the Poisson Table Generator file in JMP you simply need to change the mean rate
of arrival or occurrence ($\mu$), which is labeled as mu in the table below. To change the
value, right-click at the top of the column and change the number in the formula to your
desired value. The table will then automatically update and give the probabilities in the
last four columns.
Additional Examples
1 – In the U.S. in 2007, 7.6% of infants born had birth weights classified as low (i.e.
weight < 2,500 g). In a sample of n = 123 infants born to women who smoked during
pregnancy it was found that 14 had low birth weights (< 2,500 g). Does this provide
evidence that the percentage of infants born with low birth weights to women who
smoked during pregnancy exceeds the national rate?
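This question can be framed as a binomial calculation: if the national rate of 7.6% applied to these infants, how likely would 14 or more low birth weights be in a sample of 123? The Python sketch below computes that probability; the one-sided tail and the function name `binom_upper_tail` are assumptions made for illustration.

```python
from math import comb

def binom_upper_tail(x, n, p):
    """P(X >= x) for a binomial random variable with n trials and success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

n, p0, observed = 123, 0.076, 14   # sample size, national rate, observed count
sample_rate = observed / n          # observed proportion of low birth weights
expected = n * p0                   # expected count if the national rate applies
p_value = binom_upper_tail(observed, n, p0)

print(f"sample rate = {sample_rate:.3f}, expected count = {expected:.2f}")
print(f"P(X >= {observed}) = {p_value:.4f}")
```

The smaller this probability, the stronger the evidence that the rate among infants of smokers exceeds the national rate.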
2 – A study in Woburn, MA, in the 1970’s looked at possible excess cancer risk in
children, with a particular focus on leukemia. This study was later portrayed in the book
and movie titled A Civil Action. An important environmental issue in the investigation
concerned the possible contamination of the town’s water supply. Specifically, 12 cases
of childhood leukemia were diagnosed in Woburn during the 1970’s (Jan. 1st 1970 – Dec.
31st, 1979). A key statistical issue is whether this represents an excessive number of
leukemia cases, assuming that Woburn has had a constant 12,000 child residents during
this period and that the incidence rate of leukemia in children nationally is 5 cases per
100,000 person-years. Can we conclude that there is a significant excess in the number
of childhood leukemia cases in Woburn, MA in the 1970’s?
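A possible line of attack, sketched below in Python, is to convert the national incidence rate into an expected number of cases for Woburn over the decade and then treat the observed count as Poisson with that mean. The Poisson model and the variable names are assumptions made for illustration.

```python
from math import exp, factorial

children = 12_000                 # constant child population assumed in the problem
years = 10                        # Jan. 1970 through Dec. 1979
rate = 5 / 100_000                # national incidence per person-year
observed = 12

person_years = children * years
expected = person_years * rate    # expected cases if Woburn matches the national rate

# P(X >= observed) for a Poisson random variable with mean `expected`
below = sum(exp(-expected) * expected**k / factorial(k) for k in range(observed))
p_value = 1 - below

print(f"expected cases = {expected:.1f}")
print(f"P(X >= {observed}) = {p_value:.4f}")
```

A small tail probability would indicate that 12 cases is more than chance variation around the expected count would typically produce.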