λ μ μ - CSUS

Quiz 9 & 10 (20 points)
Due: Friday, April 17, 2008 beginning of class
Name_Solution__________________
Exploring the Distribution of Sample Means by Computer Simulation
This is an individual assignment. You are allowed to seek help from persons other than me for programming
questions only. I reserve the right to verbally question you about your responses and assign a grade of zero if it
becomes apparent that the work was not your own. R has been installed in BRH-205, if you need to use
campus computing facilities.
Purpose: to investigate the distribution of sample means under different conditions using computer
simulations instead of theory.
1. In this exercise, you will explore the distribution of sample means when the samples are drawn from
an Exp(2) distribution.
a. Using R, draw one random sample of size 3 from the Exp(2) distribution. You will use the
command rexp(n=3,rate=2). In R “rate” is what we call λ, n is the sample size and “rexp”
stands for random generation from the exponential distribution. You can store your sample in
an object called “data” using the command, data <- rexp(n=3,rate=2). Type data to
view your sample.
Write your sample here: __answers will vary 0.9867846 0.1419717 0.4805372___
The sample mean is: _0.5364312_____ (In R, use the command: mean(data) )
b. Repeat part (a). Write the resulting sample and sample mean here: __ 0.7672375, 0.2571384,
0.2458986___sample mean = 0.4234248_____________
c. To understand the behavior of all possible sample means from samples of size 3, we need to
repeat part (a) many, many times and record the resulting sample means. This is tedious to do
by hand, so use the following lines of code to generate 1000 samples of size 3 from an Exp(2)
distribution. Note that # is the comment symbol in R. R will ignore everything on a line after #.
simdata <- rexp(n=3000,rate=2) #generate 3000 random values from Exp(2) (1000 samples of size 3)
matrixdata <- matrix(simdata,nrow=1000,ncol=3) #format simdata as matrix
Now type matrixdata to see the random samples you just generated. Note each row is one sample
of size 3. Since there are 1000 rows, we have 1000 samples of size 3.
Now get the sample mean of each row of data:
means.exp <- apply(matrixdata,1,mean) #takes the mean of each row
means.exp #print the 1000 sample means to the screen
hist(means.exp) #histogram of the 1000 means from samples of size 3
mean(means.exp) #mean of the 1000 sample means
sd(means.exp) #standard deviation of the means
Estimate $\mu_{\bar{X}}$ by the mean of the 1000 sample means: __0.4999__
How does the estimate compare to the true value of $\mu_{\bar{X}} = \mu = \frac{1}{\lambda}$ = __0.5__? Very close.
Estimate $\sigma_{\bar{X}}$ using the 1000 sample means: __0.2848__
How does it compare to the true value of $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{1}{\lambda\sqrt{n}}$ = __1/(2*sqrt(3)) = 0.289__?
Attach either a printout or a sketch of the histogram of the 1000 sample means. Do the sample
means appear to be normally distributed?__attached___No, the sample means do not appear to be
normally distributed. The distribution is right skewed.___
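A minimal sketch of the comparison in exercise 1, placing the simulated estimates next to the theoretical values 1/λ and 1/(λ√n). The seed and the names lambda, reps, est.mean, est.sd, true.mean and true.sd are illustrative choices, not part of the assignment; rowMeans() is used here as a shortcut for apply(matrixdata,1,mean).

```r
set.seed(1)      # arbitrary seed, only so the run is repeatable
lambda <- 2      # rate of the exponential distribution
n <- 3           # sample size
reps <- 1000     # number of simulated samples

# One sample of size n per row, as in the matrix built in part (c)
samples <- matrix(rexp(n = reps * n, rate = lambda), nrow = reps, ncol = n)
means.exp <- rowMeans(samples)   # same result as apply(samples, 1, mean)

est.mean <- mean(means.exp)          # estimate of the mean of the sample mean
est.sd <- sd(means.exp)              # estimate of its standard deviation
true.mean <- 1 / lambda              # theoretical mean
true.sd <- 1 / (lambda * sqrt(n))    # theoretical standard deviation

print(round(c(est.mean = est.mean, true.mean = true.mean), 4))
print(round(c(est.sd = est.sd, true.sd = true.sd), 4))
```

The printed estimates will differ from run to run (and from the values written above) unless the same seed is used.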
2. Now repeat exercise 1 for samples of size 15. So start by generating 1000 random samples of size 15
from Exp(2).
simdata <- rexp(n=15000,rate=2)
matrixdata <- matrix(simdata,nrow=1000,ncol=15)
Type matrixdata[1:2,] to view the first two rows of the matrix
Use the same R commands as before to obtain the histogram, mean and standard deviation of the
1000 sample means for samples of size 15.
Estimate $\mu_{\bar{X}}$ by the mean of the 1000 sample means: __0.4972355__
How does the estimate compare to the true value of $\mu_{\bar{X}} = \mu = \frac{1}{\lambda}$ = __0.5__?
Estimate $\sigma_{\bar{X}}$ using the 1000 sample means: __0.1264290__
How does it compare to the true value of $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{1}{\lambda\sqrt{n}}$ = __1/(2*3.87) = 0.129__? The estimate is close to the true value.
Attach either a printout or a sketch of the histogram of the 1000 sample means. Does it look
normal?__attached______
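A histogram can be hard to judge by eye, so one hedged way to support the "does it look normal?" answers in exercises 1 and 2 is to compute the sample skewness of the simulated means (0 for a symmetric distribution). The helpers sim.means() and skew() are hypothetical names introduced for this sketch, the seed is arbitrary, and 10,000 repetitions are used instead of 1000 only to make the skewness estimates less noisy. For the mean of n Exp(λ) observations the theoretical skewness is 2/√n, so it shrinks as n grows but is not yet zero at n = 15.

```r
set.seed(2)  # arbitrary seed

# Hypothetical helper: returns `reps` sample means, each from n Exp(rate) draws
sim.means <- function(n, reps = 10000, rate = 2) {
  rowMeans(matrix(rexp(n = reps * n, rate = rate), nrow = reps, ncol = n))
}

# Hypothetical helper: sample skewness (third central moment over sd cubed)
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

means3 <- sim.means(3)
means15 <- sim.means(15)
skew3 <- skew(means3)
skew15 <- skew(means15)

print(round(c(skew.n3 = skew3, theory.n3 = 2 / sqrt(3),
              skew.n15 = skew15, theory.n15 = 2 / sqrt(15)), 3))
```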
3. Now draw 1000 samples of size 3 from a N(0,1) distribution. Use the R commands:
simdata <- rnorm(n=3000,mean=0,sd=1)
matrixdata <- matrix(simdata,nrow=1000,ncol=3)
Estimate $\mu_{\bar{X}}$ by the mean of the 1000 sample means: __-0.0329809__
How does the estimate compare to the true value of $\mu_{\bar{X}} = \mu$ = __0__? Estimate is
close to true value.
Estimate $\sigma_{\bar{X}}$ using the 1000 sample means: __0.5747191__
How does it compare to the true value of $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{1}{\sqrt{n}}$ = __0.5774__?
Attach either a printout or a sketch of the histogram of the 1000 sample means. Does it look
normal?__yes ______
4. Lastly, draw 1000 samples of size 15 from a N(0,1) distribution. Use the R commands:
simdata <- rnorm(n=15000,mean=0,sd=1)
matrixdata <- matrix(simdata,nrow=1000,ncol=15)
Estimate $\mu_{\bar{X}}$ by the mean of the 1000 sample means: __-0.01284968__
How does the estimate compare to the true value of $\mu_{\bar{X}} = \mu$ = __0__? The estimate is
close to the true value.
Estimate $\sigma_{\bar{X}}$ using the 1000 sample means: __0.2559351__
How does it compare to the true value of $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{1}{\sqrt{n}}$ = __0.2582__?
Attach either a printout or a sketch of the histogram of the 1000 sample means. Does it look
normal?___yes_____
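The two N(0,1) cases in exercises 3 and 4 can be tabulated side by side. This is only a sketch: summarise.n() is a hypothetical helper name, the seed is arbitrary, and rowMeans() stands in for apply(matrixdata,1,mean).

```r
set.seed(3)  # arbitrary seed
reps <- 1000

# Hypothetical helper: simulate `reps` means of size-n samples from N(0,1)
# and return the estimates next to the theoretical standard deviation 1/sqrt(n)
summarise.n <- function(n) {
  means.norm <- rowMeans(matrix(rnorm(reps * n, mean = 0, sd = 1),
                                nrow = reps, ncol = n))
  c(n = n, est.mean = mean(means.norm), est.sd = sd(means.norm),
    true.sd = 1 / sqrt(n))
}

# One row per sample size
results <- t(sapply(c(3, 15), summarise.n))
print(round(results, 4))
```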
5. Suppose $\bar{X}$ is the mean of a random sample of size 15 drawn from a population that has the N(0,1)
distribution.
a. Calculate $P(\bar{X} < 0.25)$ using theory to obtain the exact probability. (To use R, look up the
command pnorm, i.e. type ?pnorm)
$P(\bar{X} < 0.25) = P\left(Z < \frac{0.25 - 0}{1/\sqrt{15}}\right) = P(Z < 0.97) = 0.8340$
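The exact probability in part (a) can be checked with pnorm, as the hint suggests. A short sketch follows (the variable names are illustrative). R does not round z to 0.97, so its answer differs slightly from the table value 0.8340 in the fourth decimal place.

```r
n <- 15
mu <- 0
sigma <- 1

se <- sigma / sqrt(n)     # standard deviation of the sample mean
z <- (0.25 - mu) / se     # standardized cutoff, about 0.97

p.exact <- pnorm(0.25, mean = mu, sd = se)  # no rounding of z
p.table <- pnorm(0.97)                      # z rounded to two decimals, as with a table

print(round(c(z = z, p.exact = p.exact, p.table = p.table), 4))
```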
b. Approximate $P(\bar{X} < 0.25)$ using the 1000 sample means simulated in problem 4. (Hint: sorting
the sample means in ascending order might help, use the R command sort(x), where x is the
name of the vector containing the sample means.)
$P(\bar{X} < 0.25) \approx \frac{\text{number of sample means less than 0.25}}{\text{number of simulated sample means}} = \frac{844}{1000} = 0.844$ (answers will vary)
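Instead of sorting and counting by hand, the proportion can be computed directly, since a logical comparison in R can be summed or averaged. A sketch under the same setup as problem 4 (the seed and the names means.norm, count.below, p.sim and p.exact are illustrative):

```r
set.seed(5)  # arbitrary seed

# 1000 sample means, each from a sample of size 15 drawn from N(0,1)
means.norm <- rowMeans(matrix(rnorm(n = 15000, mean = 0, sd = 1),
                              nrow = 1000, ncol = 15))

count.below <- sum(means.norm < 0.25)   # number of sample means less than 0.25
p.sim <- mean(means.norm < 0.25)        # same count divided by 1000

p.exact <- pnorm(0.25, mean = 0, sd = 1 / sqrt(15))  # theoretical value for comparison
print(c(count.below = count.below, p.sim = p.sim, p.exact = round(p.exact, 4)))
```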
6.
a. Redo problem #21a in section 4.5 of the Navidi text using a simulation. Compare your
approximation to the exact theoretical answer.
Bulb A life = X ~ N(800, sd = 100)
Bulb B life = Y ~ N(900, sd = 150)
P(Y > X) = P(Y - X > 0), where Y - X ~ N(100, sd = sqrt(100^2 + 150^2) = 180.3)
= P(Z > (0 - 100)/180.3) = P(Z > -0.55) = 0.7088
Simulation: simulate 1000 values of X and 1000 values of Y. Calculate Y-X. Then
$P(Y - X > 0) \approx \frac{\text{number of simulated } Y - X \text{ greater than 0}}{\text{total number of } Y - X \text{ simulated}} = \frac{726}{1000} = 0.726$ (answers will vary but
should be close to 0.7088)
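One possible version of this simulation in R, assuming X and Y are independent (as the variance calculation for Y - X does). The seed and variable names are illustrative. Because pnorm does not round z to -0.55, its exact value is about 0.7104 rather than the table value 0.7088.

```r
set.seed(6)  # arbitrary seed
reps <- 1000

x <- rnorm(reps, mean = 800, sd = 100)   # simulated lifetimes of Bulb A
y <- rnorm(reps, mean = 900, sd = 150)   # simulated lifetimes of Bulb B

p.sim <- mean(y - x > 0)                 # proportion of simulated Y - X above 0

# Exact value: Y - X ~ N(100, sd = sqrt(100^2 + 150^2)) under independence
p.exact <- pnorm(0, mean = 100, sd = sqrt(100^2 + 150^2), lower.tail = FALSE)

print(round(c(p.sim = p.sim, p.exact = p.exact), 4))
```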
b. Let X = life of Bulb A and Y = life of Bulb B, use a simulation to determine if the following
random variables are approximately normal
i. Y/X Close to normal, but a little skewed. Graphs attached.
ii. sin(X) Not normal at all. Graph attached.
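A hedged sketch of how the two variables in part (b) might be simulated and summarized numerically alongside the graphs. It assumes X and Y are independent, uses 10,000 draws (rather than 1000) to steady the summaries, and introduces the hypothetical helpers skew() and kurt(). A normal distribution has skewness 0 and kurtosis 3.

```r
set.seed(7)  # arbitrary seed
reps <- 10000

x <- rnorm(reps, mean = 800, sd = 100)   # Bulb A lifetimes
y <- rnorm(reps, mean = 900, sd = 150)   # Bulb B lifetimes
ratio <- y / x
sinx <- sin(x)

# Hypothetical helpers: sample skewness and kurtosis
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3
kurt <- function(v) mean((v - mean(v))^4) / sd(v)^4

print(round(c(skew.ratio = skew(ratio), kurt.ratio = kurt(ratio),
              skew.sinx = skew(sinx), kurt.sinx = kurt(sinx)), 3))

# To draw the graphs, uncomment:
# hist(ratio); qqnorm(ratio)
# hist(sinx);  qqnorm(sinx)
```

A positive skewness for Y/X is in line with "a little skewed", and a kurtosis for sin(X) well below 3 reflects its U-shaped histogram, with values piling up near -1 and 1.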
c. Use a simulation to approximate the probability that Bulb B lasts over 10% longer than Bulb A.
$P(Y > 1.1X) = P(Y/X > 1.1) \approx \frac{\text{number of simulated } Y/X \text{ over 1.1}}{\text{total number of } Y/X \text{ simulated}} = \frac{547}{1000} = 0.547$ (answers will vary)
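A sketch of the part (c) simulation, again assuming X and Y are independent, with an arbitrary seed and illustrative names. As a cross-check that is not asked for in the problem: X is essentially never negative here, so Y/X > 1.1 is the same event as Y - 1.1X > 0, and under independence Y - 1.1X is normal with mean 900 - 880 = 20 and sd sqrt(150^2 + 110^2).

```r
set.seed(8)  # arbitrary seed
reps <- 1000

x <- rnorm(reps, mean = 800, sd = 100)   # Bulb A lifetimes
y <- rnorm(reps, mean = 900, sd = 150)   # Bulb B lifetimes

p.sim <- mean(y / x > 1.1)               # proportion of simulated Y/X over 1.1

# Cross-check: Y - 1.1*X ~ N(20, sd = sqrt(150^2 + 110^2)) under independence
p.exact <- pnorm(0, mean = 900 - 1.1 * 800,
                 sd = sqrt(150^2 + (1.1 * 100)^2), lower.tail = FALSE)

print(round(c(p.sim = p.sim, p.exact = p.exact), 4))
```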
d. In general, when using computer simulation methods to approximate probabilities, how can you
improve the accuracy of your approximation? For example, in part (c) what can you do to
increase the accuracy of your answer? Simulate more repetitions of the experiment. So in part
(c), simulate more values of Y/X.
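To illustrate why more repetitions help, the sketch below repeats the part (c) simulation with increasing numbers of simulated values and reports the usual standard error of a simulated proportion, sqrt(p(1-p)/N). The sizes, seed and names are illustrative assumptions, and X and Y are taken to be independent.

```r
set.seed(9)  # arbitrary seed
sizes <- c(1e3, 1e4, 1e5, 1e6)   # numbers of simulated Y/X values to try

# Estimate P(Y/X > 1.1) once for each simulation size
estimate <- sapply(sizes, function(N) {
  x <- rnorm(N, mean = 800, sd = 100)
  y <- rnorm(N, mean = 900, sd = 150)
  mean(y / x > 1.1)
})

# Standard error of each estimated proportion
std.error <- sqrt(estimate * (1 - estimate) / sizes)

print(data.frame(reps = sizes, estimate = round(estimate, 4),
                 std.error = signif(std.error, 2)))
```

The standard error shrinks in proportion to 1/sqrt(N), so 100 times as many repetitions buys roughly one more decimal place of accuracy.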
e. Are sample means always normally distributed? No. The sample means in problem 1 are not
normally distributed. Sample means will be approximately normally distributed if the sample
size is large. If the sample size is small, we only know sample means are normally distributed
if the sample is drawn from a normally distributed population.
[Attached graphs for problems 1), 2), 3), 4) and 6b) not reproduced here.]