1180:Lab8 1 Generating Exit Times James Moore

1180:Lab8
James Moore
March 5th, 2012
1 Generating Exit Times
In this lab we will use R to visualize some probability data. The example that I am using is similar
to section 6.6 from your book. Throughout this entire assignment we will consider the very basic
process of molecules diffusing out of (i.e. leaving) a cell.
First we generate the leaving times. We’ll sample from an exponential distribution with rate .13
(we’ll see what that means later on). An exponentially distributed leaving time is a continuous random variable.
ltimes=rexp(n=100000,rate=.13)
The following commands give some information about what we just generated.
> head(ltimes)
[1] 16.3208468  0.6565472  7.5878192  1.5364116 15.8764461  9.4162347
> summary(ltimes)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.00002  2.21700  5.32000  7.68300 10.67000 81.97000
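As a quick check on what the rate controls: the mean of an exponential distribution is 1/rate, so with rate .13 the average leaving time should be near 1/.13, or about 7.69 minutes, which is close to the Mean that summary reports. The sketch below is a self-contained way to see this. The seed and the variable names expected_mean and sample_mean are assumptions added for illustration and are not part of the lab.

```r
set.seed(1180)                          # assumed seed, only so the run is reproducible
ltimes <- rexp(n = 100000, rate = .13)  # same sampling call as in the lab

expected_mean <- 1 / .13                # theoretical mean of an exponential with rate .13
sample_mean <- mean(ltimes)             # mean of the simulated leaving times

print(c(expected = expected_mean, sample = sample_mean))
```

With 100000 samples the two numbers should agree to within a few hundredths of a minute.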
The simplest way to visualize this data is with a histogram. This is fantastically easy in R.
hist(ltimes,xlab="Leaving Time (Minutes)",main="Leaving Time",freq=F)
Let’s compare this with what the distribution ‘should’ look like which is given by the probability
density function. We can get the probability density function of any distribution by replacing the
r (for random) with a d (for density). In this case, rexp becomes dexp.
tlist<-seq(0,80,.1)
lines(tlist,dexp(tlist,rate=.13),col="red")
Save this plot (#1). The histogram shows the distribution of our 100000 random samples; the
red curve shows the expected distribution if we took infinitely many samples.
Right now it looks like the curve lies right on top of our histogram. However, we can increase
the number of bins to ‘expose’ the randomness of our samples.
hist(ltimes,xlab="Leaving Time (Minutes)",main="Leaving Time",freq=F,breaks=500)
lines(tlist,dexp(tlist,rate=.13),col="red")
Save this plot (#2). This histogram should show some small deviation from the red curve. When
we take wider bars, these variations tend to cancel out.
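To put a rough number on the effect of bar width, one option is to ask hist for the bar heights without drawing anything (plot=FALSE) and compare them with the density at the bar midpoints. This is only a sketch: the seed and the helper name max_gap are assumptions made for illustration, and comparing at midpoints is an approximation, since a wide bar averages the density over its whole width.

```r
set.seed(1180)                          # assumed seed, only so the run is reproducible
ltimes <- rexp(n = 100000, rate = .13)

# Hypothetical helper: largest distance between the bar heights of a
# histogram object and the exponential density at the bar midpoints
max_gap <- function(h) {
  max(abs(h$density - dexp(h$mids, rate = .13)))
}

h_wide <- hist(ltimes, plot = FALSE)                 # default number of bars
h_narrow <- hist(ltimes, breaks = 500, plot = FALSE) # many narrow bars

gap_wide <- max_gap(h_wide)
gap_narrow <- max_gap(h_narrow)
print(c(wide = gap_wide, narrow = gap_narrow))
```

Each narrow bar contains fewer samples, so its height is noisier and the largest gap is typically bigger than with the default bars.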
Another way to look at the data is with a cumulative distribution function. We’ll plot the
fraction of molecules that have already left the cell as a function of time. The code to do this is below.
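# Fraction of simulated molecules that have left before time t
accumulation<-function(t){
N=length(ltimes)
length(ltimes[ltimes<t])/N
}
tlist2=seq(0,80,length=20) # 20 evenly spaced time points
approxcdf=seq(1,20) # placeholder vector, overwritten in the loop
for(i in seq(1,20)){approxcdf[i]<-accumulation(tlist2[i])}
plot(tlist2,approxcdf,type="h")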
We can check this against the expected curve with integration. The probability density function
(dexp) is actually just $.13e^{-.13t}$. Integrate this from 0 to T and then add the theoretical curve to the
plot created by the supplied code. Save this plot (#3). Using your calculated distribution
function, estimate the probability that a molecule has left by 1 minute. Do
the same calculation for 4 minutes and 8 minutes.
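One way to sanity-check a hand-computed integral is to let R integrate the density numerically and compare the result with the fraction of simulated molecules that have left. The sketch below does this at the three requested times. It is a check rather than a substitute for the calculation, and the seed and the names pdf_exp, check_times, numeric_cdf and empirical_cdf are assumptions added for illustration.

```r
set.seed(1180)                          # assumed seed, only so the run is reproducible
ltimes <- rexp(n = 100000, rate = .13)

pdf_exp <- function(t) dexp(t, rate = .13)   # the density used throughout the lab
check_times <- c(1, 4, 8)                    # times (minutes) asked about in the text

# Numerical integral of the density from 0 to each time
numeric_cdf <- sapply(check_times, function(t_end) {
  integrate(pdf_exp, lower = 0, upper = t_end)$value
})

# Fraction of simulated molecules that have left before each time
empirical_cdf <- sapply(check_times, function(t_end) mean(ltimes < t_end))

print(data.frame(time = check_times, integral = numeric_cdf, simulated = empirical_cdf))
```

The two columns should agree to about two decimal places, and your own formula evaluated at T = 1, 4 and 8 should match the integral column.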
2 Looking at individual cells
The example of molecules leaving cells is just an illustration of the process in general.
However, you might actually be interested in it if you are studying how the cell exports molecules.
Suppose that you have a population of 1000 cells and you know that each cell holds ten molecules
(yes it’s not a perfect example). By attaching a fluorescent bead to the molecules you can see how
many are in each cell at any point. We perform this simulation to know what to expect if it were
simple diffusion.
We have simulated the leaving times of 100000 molecules. Now we wish to conduct the following
experiment.
1. Choose 10 random molecules
2. See how many have not left at t = 1, 4, 8
3. Repeat 1000 times.
4. Plot the results in a histogram.
The code below should get you started.
par(mfrow=c(1,3))#This line allows you to put three plots side by side
greaterthanone<-seq(1,1000)
for(i in seq(1,1000)){
miniltimes=sample(ltimes,size=10)
greaterthanone[i]<-sum(miniltimes>1)
}
hist(greaterthanone,breaks=seq(-.5,10.5),main="Molecules in Cell at t=1",freq=F)
Create a histogram that shows the distribution at each of the three times and put
them on the same plot (#4).
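Since the same loop is needed at three different times, it may help to wrap it in a function of t. The sketch below is one possible way to do that. The function name count_remaining, its arguments ncells and nmol, and the seed are all assumptions made for illustration, not part of the supplied code.

```r
set.seed(1180)                          # assumed seed, only so the run is reproducible
ltimes <- rexp(n = 100000, rate = .13)

# Hypothetical helper: for each of ncells cells, draw nmol leaving times
# and count how many are greater than t (molecules still in the cell)
count_remaining <- function(t, ncells = 1000, nmol = 10) {
  replicate(ncells, sum(sample(ltimes, size = nmol) > t))
}

remaining_at_4 <- count_remaining(4)
print(table(remaining_at_4))            # how many cells hold 0, 1, ..., 10 molecules
```

The returned vector can be passed to hist with the same breaks=seq(-.5,10.5) and freq=F arguments as in the supplied code, once for each of the three times.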
3 Comparing with Binomial Distribution
The leaving time of a molecule is a continuous random variable. If we count the number of
molecules left at a particular time then we get a discrete random variable. Using your cumulative
distribution function, you should have figured out the probability that a given molecule has left at
various times. Suppose, for example, you found that 30 percent have left by 1 minute. That
implies that 70 percent remain. If we assume that there are ten molecules in a cell and each has a
70 percent chance of remaining after a minute, then there should typically be around 7 molecules
left. However, it could be 6 or 8, or even 5 or 9. The distribution for this random variable is exactly
a binomial. Now we can regenerate the previous graph and add the expected values, like so:
greaterthanone<-seq(1,1000)
for(i in seq(1,1000)){
miniltimes=sample(ltimes,size=10)
greaterthanone[i]<-sum(miniltimes>1)
}
hist(greaterthanone,breaks=seq(-.5,10.5),main="Molecules in Cell at t=1",freq=F)
points(seq(0,10),dbinom(seq(0,10),size=10,prob=.7),col="red",pch=15)#added line
Note the line added at the end. You’ll have to change the value of ‘prob’. Recreate plot #4
with the expected values added on to each panel (#5).
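The comparison can also be made numerically, by tabulating the simulated counts next to the binomial probabilities. In the sketch below the probability of remaining is estimated from the simulated sample rather than from the integral, which is an assumption made to keep the example self-contained. The seed and the names t_obs, p_remain, simulated and expected are likewise illustrative.

```r
set.seed(1180)                          # assumed seed, only so the run is reproducible
ltimes <- rexp(n = 100000, rate = .13)

t_obs <- 1                              # observation time in minutes
p_remain <- mean(ltimes > t_obs)        # estimated chance a molecule is still in the cell

# Number of molecules (out of 10) still present in each of 1000 simulated cells
counts <- replicate(1000, sum(sample(ltimes, size = 10) > t_obs))

# Observed fraction of cells with k = 0..10 molecules, and the binomial prediction
simulated <- as.numeric(table(factor(counts, levels = 0:10))) / 1000
expected <- dbinom(0:10, size = 10, prob = p_remain)

print(round(data.frame(k = 0:10, simulated, expected), 3))
```

Changing t_obs to 4 or 8 gives the corresponding comparison for the other two panels.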