Calculus for Biologists Lab
Math 1180-002, Spring 2012
Lab #10 - The Poisson Process and Distribution
Report due date: Tuesday, April 3, 2012 at 9 a.m.

Goal: To simulate a Poisson process and understand the related distribution. You will also explore the relationship between the Poisson and binomial distributions.

- Create a new script, either in R (laptop) or with a text editor (Linux computers).

Greedy pigs and irate birds

Once upon a time, there lived a population composed of birds and pigs. There were other animals too, but that's irrelevant. The pigs obtained the highest level of satiety from the birds' eggs. Too lazy to look for other highly satisfying foods, the pigs took to thievery of these coveted eggs to relieve their hunger pains. Of course, such acts failed to go unnoticed by the birds, leading to a swelling anger. Each individual bird reached a certain threshold of severely blinding ire before taking action to avenge its stolen eggs. For a time, the birds independently carried out uncoordinated attacks on the pig population. These attacks occurred at a constant rate over a period of days. We will look at a snapshot of these events during the course of a single day to see when and how many attacks occurred.

The simulation

Notice that this story describes a Poisson process with respect to the number of attacks carried out. We will break the 24-hour day into 1-hour intervals. Assume that the attack rate is 2.25 per hour. In Poisson-land, this is equivalent to Λ = 2.25.

Download the process.R file from the lab website and execute its lines of code in R. This defines a function, process, that creates a series of attack times over the course of a day. Recall that the waiting times between attacks in a Poisson process are exponentially distributed, so the number of attacks in a 24-hour period is not fixed in advance. If you run the code multiple times, the number of attacks will not necessarily be the same each time.
But, in each case, the last attack time will usually fall between 23 and 24: with an attack rate of 2.25 per hour, the final hour contains no attack with probability exp(-2.25), about 0.11.

Simulate the events of the day.

    Lambda = 2.25                            ## attack rate
    window = 24                              ## time frame we care about
    attack.times = process(Lambda,window)    ## simulate times, not to exceed 'window'
    N.attacks = length(attack.times)         ## count total number of attacks

You now have a list of attack times, which you can view in R. To determine the number of attacks that occur within each 1-hour interval, we can use the hist command. In previous labs, we have used this command to view histograms. But if we assign its result to a variable, we can obtain a lot more information.

    a.hist = hist(attack.times,breaks=0:window,plot=FALSE)

View a.hist. You should see several named components in the output. In particular, the $counts vector is the information we need. It gives the number of attacks that occurred in each of the 24 one-hour intervals. To make life easier later, we will save the maximum and minimum of these counts.

    max.ct = max(a.hist$counts)
    min.ct = min(a.hist$counts)

And now, as always, we can plot some things.

    par(mfrow=c(2,1),mar=c(4,4,2,1),new=F)

    ## top panel: one vertical line per attack, with hour boundaries in blue
    plot(attack.times,rep(1,N.attacks), type="h", xlim=c(0,window),
         axes=F, ylab="", xlab="Hour", main="Simulated attacks")
    abline(v=seq(0,window),col="dodgerblue")
    axis(1,at=0:window)

    ## bottom panel: number of attacks in each hour
    plot(0, type="n", ylab="Number of attacks", xlab="Hour",
         main="Simulation summary",
         axes=F, xlim=c(0,window), ylim=c(min.ct,max.ct))
    abline(h=min.ct:max.ct,col="gray")
    par(new=T)
    plot(a.hist,add=T,col="dodgerblue")
    par(new=F)
    axis(1,at=0:window,pos=0)
    axis(2,pos=0)

You should see two plots in the figure. The top plot shows when attacks occur, indicated by the black lines. The blue lines simply separate the time axis into one-hour intervals. The bottom plot is the visual realization of a.hist$counts.

Plot 10.1: Save this plot to include in your assignment.
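The contents of process.R are not reproduced in this handout. As a rough illustration of the idea only, the sketch below builds attack times by accumulating exponential waiting times until the window is exceeded, then counts attacks per hour with hist. The function name simulate_attacks, the variable names, and the seed are assumptions made for this example; the lab's actual process function may be written differently.

```r
## Hypothetical stand-in for the lab's process(); the real process.R may differ.
simulate_attacks <- function(rate, window) {
  times <- numeric(0)
  t <- rexp(1, rate)               # waiting time until the first attack
  while (t <= window) {
    times <- c(times, t)           # record this attack
    t <- t + rexp(1, rate)         # add the next exponential waiting time
  }
  times
}

set.seed(1180)                     # arbitrary seed so the run is repeatable
attack_times <- simulate_attacks(2.25, 24)

## Count attacks in each of the 24 one-hour intervals, as a.hist$counts does
counts <- hist(attack_times, breaks = 0:24, plot = FALSE)$counts
length(counts)                     # one count per hour
sum(counts) == length(attack_times)  # every attack lands in exactly one hour
```

Changing or removing the seed gives a different day each time, which is why the total number of attacks varies from run to run.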
To determine how often each number of attacks per hour occurs, we can use the information in a.hist$counts to create yet another histogram. The goal is to see how well the simulation compares to the histogram we'd generate from the actual Poisson distribution. Recall that the distribution has the following probability mass function:

$$p(k; \Lambda) = \frac{e^{-\Lambda}\,\Lambda^{k}}{k!}$$

Define the function pois(L,k). Guess what the R command factorial(n) does....

    pois = function(L,k) (L^k)*exp(-L)/factorial(k)

Plot the simulated distribution alongside the Poisson distribution.

    par(mfrow=c(1,2))

    ## left panel: relative frequency of hourly attack counts in the simulation
    hist(a.hist$counts,breaks=min.ct:(max.ct+1)-0.5,
         xlab="Number of attacks in 1 hour", ylab="Probability",
         main="Results summary", axes=FALSE,freq=F,col="dodgerblue")
    axis(1,pos=0)
    axis(2)

    ## right panel: about 100 values repeated in proportion to the Poisson
    ## probabilities (rep truncates fractional repeat counts)
    true.pois = rep(min.ct:max.ct,100*pois(Lambda,min.ct:max.ct))
    hist(true.pois,breaks=min.ct:(max.ct+1)-0.5,
         xlab="Number of attacks in 1 hour", ylab="Probability",
         main="Poisson distribution", axes=FALSE,freq=F,col="limegreen")
    axis(1,pos=0)
    axis(2)

Plot 10.2: Save this figure to include in your assignment.

The aftermath: the binomial/Poisson connection

In the aftermath of the attacks, the surviving pigs were scanned for injuries by sympathetic parties. Suppose every injury manifests as a swollen, oozing, black eye and that we wish to know the number of black eyes we would see in fifty pigs (a total of 100 eyes). Let n denote the total number of eyes scanned, and let p be the probability of seeing a black eye. This situation can be described by a yes/no process, which screams binomial distribution. Therefore, we can define the binomial probability mass function as we did last week. Be sure to define n appropriately. The count of black eyes is denoted by k, which can range from 0 to n.

    n = ## ???
    k = 0:n
    binom = function (p) choose(n,k)*(p^k)*(1-p)^(n-k)

Assuming we know there exists some connection between the binomial and Poisson distributions, it is only fair that we plot them on the same graph.
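Before plotting, it can help to confirm that hand-written functions like these agree with R's built-in dpois and dbinom. The sketch below is one way to do that. The values n_demo = 10 and p_demo = 0.3, and the names ending in _demo, are hypothetical and chosen only for illustration; they are not the values this lab asks you to work out.

```r
## Poisson p.m.f. written as in the lab
pois <- function(L, k) (L^k) * exp(-L) / factorial(k)

## Illustrative binomial setup (hypothetical values, not the lab's n or p)
n_demo <- 10
p_demo <- 0.3
k_demo <- 0:n_demo
binom_demo <- function(p) {
  choose(n_demo, k_demo) * (p^k_demo) * (1 - p)^(n_demo - k_demo)
}

## Compare against the built-in distribution functions
all.equal(pois(2.25, 0:10), dpois(0:10, lambda = 2.25))
all.equal(binom_demo(p_demo), dbinom(k_demo, size = n_demo, prob = p_demo))

## Binomial probabilities over k = 0, ..., n should add up to 1
sum(binom_demo(p_demo))
```

If either all.equal call reports a difference, the most likely culprit is a misplaced parenthesis or exponent in the hand-written formula.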
You will complete this process for a variety of probabilities p: 0.5, 0.2, 0.05, 0.01. Define a vector p that contains these four values.

    p = ## ???

Take some time to think about what it means to assume a Poisson distribution here, since we are no longer looking at what happens over time. What is the event? What is the "rate" of occurrence (Hint: it's not Lambda from earlier)?

Now complete the missing information in the following code to create a figure with 4 graphs corresponding to the different values of p. For each graph, we will evaluate binom(p) and pois(rate,k).

    par(mfrow=c(4,1),mar=c(3,4,3,1),oma=c(2,2,0,0))
    for (??? in 1:???){
      matplot(k,cbind(binom(???),pois(???,???)), type='l', lwd=2,
              col=c("black","dodgerblue"),lty=1,ylab="", xlab="")
      title(main=paste("p = ",p[i]))
    }
    legend("topright",c("binomial","Poisson"),fill=c("black","dodgerblue"))
    mtext(side=1,outer=T,"Number of black eyes counted")
    mtext(side=2,outer=T,"Probability")

Plot 10.3: Save this figure to include in your assignment.

- Save your script so that you can use it for your assignment.