Calculus for Biologists Lab Math 1180-002 Spring 2012

Lab #10 - The Poisson Process and Distribution
Report due date: Tuesday, April 3, 2012 at 9 a.m.
Goal: To simulate a Poisson process and understand the related distribution. You will also explore the relationship
between the Poisson and binomial distributions.
• Create a new script, either in R (laptop) or with a text editor (Linux computers).
Greedy pigs and irate birds
Once upon a time, there lived a population composed of birds and pigs. There were other animals too, but that’s
irrelevant. The pigs obtained the highest level of satiety from the birds’ eggs. Too lazy to look for other highly
satisfying foods, the pigs took to thievery of these coveted eggs to relieve their hunger pains. Of course, such acts
failed to go unnoticed by the birds, leading to a swelling anger. Each individual bird reached a certain threshold of
severely blinding ire before taking action to avenge their stolen eggs. For a time, the birds independently carried
out uncoordinated attacks on the pig population. These attacks occurred at a constant rate over a period of
days. We will look at a snapshot of these events during the course of a single day to see when and how many
attacks occurred.
The simulation
Notice that this story describes a Poisson process with respect to the number of attacks carried out. We will
break the 24-hour day into 1-hour intervals. Assume that the attack rate is 2.25 per hour. In Poisson-land, this
is equivalent to Λ = 2.25.
Download the process.R file from the lab website. Execute the lines of code in R. This defines a function that
will create a series of attack times through the course of a day. Recall that the waiting times between attacks for
a Poisson process are exponentially distributed. That said, there is no specified number of attacks that should
occur within a 24-hour period. As such, if you implement the code multiple times, the number of attacks will not
necessarily be the same each time. In most runs, though, the last attack time should fall somewhere between
23 and 24.
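If you are curious how a function like process could work, here is a minimal sketch. It is not the code in process.R, which may be written differently; the name process.sketch and its arguments are made up for illustration. The idea is to add up exponentially distributed waiting times (R's rexp) until the running total passes the end of the window.

```r
## Hypothetical stand-in for process(); the real process.R may differ.
process.sketch = function(rate, window) {
  times = c()
  now = rexp(1, rate)            ## waiting time until the first event
  while (now <= window) {
    times = c(times, now)        ## record the event time
    now = now + rexp(1, rate)    ## add the next exponential waiting time
  }
  times
}

set.seed(1)                      ## only so the sketch is reproducible
sketch.times = process.sketch(2.25, 24)
length(sketch.times)             ## number of simulated attacks
max(sketch.times)                ## time of the last attack
```

If you remove the set.seed line and run the sketch several times, the number of attacks should change from run to run.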
Simulate the events of the day.
Lambda = 2.25                            ## attack rate
window = 24                              ## time frame we care about
attack.times = process(Lambda,window)    ## simulate times, not to exceed 'window'
N.attacks = length(attack.times)         ## count total number of attacks
You now have a list of attack times, which you can view in R. To determine the number of attacks that occur
within each 1-hour interval, we can use the hist command. In previous labs, we have used this command to view
histograms. But, if we assign it to a variable, we can obtain a lot more information.
a.hist = hist(attack.times,breaks=0:window,plot=FALSE)
View a.hist. The output has several named components. In particular, the $counts component is the
information we need: it gives the number of attacks that occurred in each of the 24 hour-long intervals. To
make life easier later, we will save the maximum and minimum of these counts.
max.ct = max(a.hist$counts)
min.ct = min(a.hist$counts)
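To see what hist does to a list of times, it can help to try it on a tiny example first. The vector toy.times below is made up for illustration and is not part of the lab data. With breaks=0:4, each time is sorted into one of four one-hour bins.

```r
## Six made-up event times over a 4-hour window
toy.times = c(0.2, 0.7, 1.5, 3.1, 3.4, 3.9)

## Count events per 1-hour interval without drawing a plot
toy.hist = hist(toy.times, breaks=0:4, plot=FALSE)

toy.hist$counts        ## 2 1 0 3
max(toy.hist$counts)   ## 3
min(toy.hist$counts)   ## 0
```

Two times fall in the first hour, one in the second, none in the third, and three in the fourth, which is exactly what $counts reports.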
And now, as always, we can plot some things.
par(mfrow=c(2,1),mar=c(4,4,2,1),new=F)
plot(attack.times,rep(1,N.attacks), type="h", xlim=c(0,window), axes=F,
ylab="", xlab="Hour", main="Simulated attacks")
abline(v=seq(0,window),col="dodgerblue")
axis(1,at=0:window)
plot(0, type="n", ylab="Number of attacks", xlab="Hour", main="Simulation summary",
axes=F, xlim=c(0,window), ylim=c(min.ct,max.ct))
abline(h=min.ct:max.ct,col="gray")
par(new=T)
plot(a.hist,add=T,col="dodgerblue")
par(new=F)
axis(1,at=0:window,pos=0)
axis(2,pos=0)
You should see two plots in the figure. The top plot shows when attacks occur, indicated by the black lines.
The blue lines simply separate the time axis into one-hour intervals. The bottom plot is the visual realization of
a.hist$counts.
Plot 10.1: Save this plot to include in your assignment.
To determine the frequency of the number of attacks, we can use the information in a.hist$counts and create
yet another histogram. The goal is to see how well the simulation compares to the histogram we’d generate from
the actual Poisson distribution. Recall that the distribution has the following p.d.f.:
$$p(k; \Lambda) = \frac{e^{-\Lambda}\,\Lambda^{k}}{k!}$$
Define the function pois(L,k). Guess what the R command factorial(n) does....
pois = function(L,k) (L^k)*exp(-L)/factorial(k)
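As a sanity check, you can compare pois against R's built-in Poisson function dpois. This comparison is an optional extra, and the name k.check is made up for illustration. For example, with Λ = 2.25 the probability of zero attacks in an hour is e^(-2.25), which is about 0.105.

```r
pois = function(L,k) (L^k)*exp(-L)/factorial(k)

k.check = 0:10

## Hand-written formula versus R's built-in version
all.equal(pois(2.25, k.check), dpois(k.check, lambda=2.25))

pois(2.25, 0)          ## probability of no attacks in an hour
sum(pois(2.25, 0:50))  ## probabilities should add up to (very nearly) 1
```

If all.equal returns TRUE, the function is defined correctly.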
Plot the simulated distribution alongside the Poisson distribution.
par(mfrow=c(1,2))
hist(a.hist$counts,breaks=min.ct:(max.ct+1)-0.5,
xlab="Number of attacks in 1 hour", ylab="Probability", main="Results summary",
axes=FALSE,freq=F,col="dodgerblue")
axis(1,pos=0)
axis(2)
true.pois = rep(min.ct:max.ct,100*pois(Lambda,min.ct:max.ct))
hist(true.pois,breaks=min.ct:(max.ct+1)-0.5,
xlab="Number of attacks in 1 hour", ylab="Probability", main="Poisson distribution",
axes=FALSE,freq=F,col="limegreen")
axis(1,pos=0)
axis(2)
Plot 10.2: Save this figure to include in your assignment.
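The two histograms can also be compared as numbers. The sketch below is only an illustration: it uses rpois to generate 24 made-up hourly counts as a stand-in for a.hist$counts, and the names sim.counts, k.vals, observed and expected are hypothetical. In your own script you could try the same idea with a.hist$counts instead.

```r
set.seed(42)                           ## only so the sketch is reproducible
sim.counts = rpois(24, lambda=2.25)    ## stand-in for a.hist$counts

k.vals = 0:max(sim.counts)

## Fraction of hours with exactly k attacks, for each k
observed = as.vector(table(factor(sim.counts, levels=k.vals)))/length(sim.counts)

## Poisson probabilities for the same values of k
expected = dpois(k.vals, lambda=2.25)

round(cbind(k=k.vals, observed, expected), 3)
```

With only 24 hours of data, the observed and expected columns will usually be similar but not identical.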
The aftermath: the binomial/Poisson connection
In the aftermath of the attacks, the surviving pigs were scanned for injuries by sympathetic parties. Suppose all
injuries are manifest with a swollen, oozing, black eye and that we wish to know the number of black eyes we
would see in fifty pigs (total of 100 eyes). Let n denote the total number of eyes scanned, and let p be the
probability of seeing a black eye.
This situation can be described by a yes/no process, which screams binomial distribution. Therefore, we can
define the binomial p.d.f. as we did last week. Be sure to define n appropriately. The count of black eyes is
denoted by k, which can range from 0 to n.
n = ## ???
k = 0:n
binom = function (p) choose(n,k)*(p^k)*(1-p)^(n-k)
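You can check the binomial formula in the same spirit by comparing it with R's built-in dbinom. The sketch below deliberately uses a small made-up example (n.demo = 10 trials and p = 0.3) with hypothetical names, so it does not replace the n you need to define above.

```r
n.demo = 10            ## made-up number of trials, for illustration only
k.demo = 0:n.demo

binom.demo = function (p) choose(n.demo,k.demo)*(p^k.demo)*(1-p)^(n.demo-k.demo)

## Hand-written formula versus R's built-in version
all.equal(binom.demo(0.3), dbinom(k.demo, size=n.demo, prob=0.3))

sum(binom.demo(0.3))   ## probabilities should add up to (very nearly) 1
```

Note that binom in the lab code picks up n and k from your workspace, so both must be defined before you call it.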
Assuming we know there exists some connection between the binomial and Poisson distributions, it is only fair
that we plot them on the same graph. You will complete this process for a variety of probabilities p: 0.5, 0.2,
0.05, 0.01. Define a vector p that contains these four values.
p = ## ???
Take some time to think about what it means to assume a Poisson distribution here, since we are no longer
looking at what happens over time. What is the event? What is the “rate” of occurrence (Hint: it’s not Lambda
from earlier)?
Now complete the missing information in the following code to create a figure with 4 graphs corresponding to the
different values of p. For each graph, we will evaluate binom(p) and pois(rate,k).
par(mfrow=c(4,1),mar=c(3,4,3,1),oma=c(2,2,0,0))
for (??? in 1:???){
matplot(k,cbind(binom(???),pois(???,???)),
type='l', lwd=2, col=c("black","dodgerblue"),lty=1,ylab="", xlab="")
title(main=paste("p = ",p[i]))
}
legend("topright",c("binomial","Poisson"),fill=c("black","dodgerblue"))
mtext(side=1,outer=T,"Number of black eyes counted")
mtext(side=2,outer=T,"Probability")
Plot 10.3: Save this figure to include in your assignment.
• Save your script so that you can use it for your assignment.