Stat 401B Fall 2015 Lab #4 (Due September 24)

1. Start RStudio, type the attached Code Set #1 into the upper left pane, and run it.
This is a simulation version of the "propagation of error"/"error analysis" problem of
Exercise 2 of Section 5.5 (page 321 of V&J).
a) Approximately what mean and standard deviation should be used to describe what is known about
the coefficient of linear expansion of brass based on the values given in this exercise?
b) One simulation-based way of trying to assess the importance of uncertainty in the various measured
values in producing an overall uncertainty in $\alpha$ is to set all but one value at exactly their means and
consider only variability in that one input. Do this in turn for each of $L_1$, $L_2$, $T_1$, and $T_2$. Uncertainty in
which of these seems to be the biggest contributor to overall uncertainty in $\alpha$? (Compare the 4 standard
deviations.)
c) Compare the root of the sum of the squares of your 4 standard deviations in b) to your standard
deviation in a). Does it seem that non-linearity of $\alpha$ in the inputs is important here?
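The comparison in part c) rests on the usual first-order (linearized) propagation of error approximation. As a sketch, assuming independent inputs, taking the form of $\alpha$ from the `alpha` function in Code Set #1, and writing $\sigma_{L_1}$, $\sigma_{L_2}$, $\sigma_{T_1}$, $\sigma_{T_2}$ for the input standard deviations (notation introduced here for illustration):

```latex
\alpha = \frac{L_2 - L_1}{L_1\,(T_2 - T_1)}, \qquad
\sigma_\alpha^2 \approx
  \left(\frac{\partial \alpha}{\partial L_1}\right)^2 \sigma_{L_1}^2
+ \left(\frac{\partial \alpha}{\partial L_2}\right)^2 \sigma_{L_2}^2
+ \left(\frac{\partial \alpha}{\partial T_1}\right)^2 \sigma_{T_1}^2
+ \left(\frac{\partial \alpha}{\partial T_2}\right)^2 \sigma_{T_2}^2
```

Here each partial derivative is evaluated at the input means. The approximation would be exact if $\alpha$ were linear in its inputs, so the size of any gap between the two quantities compared in c) is one informal indication of how much the non-linearity matters.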
2. Using Code Set #1 and problem 1 above as guides, use simulation to do Exercise 5 of
Section 5.5 (page 322 of V&J).
3. Type Code Set #2 into the upper left pane and run it. This is a simulation and some summaries
for the distributions of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \quad\text{and}\quad \hat{Z} = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

for $n = 5$ and $X_1, X_2, X_3, X_4, X_5$ independent and identically distributed (iid) $U(0,1)$ (so that $\mu = .5$
and $\sigma = \sqrt{1/12}$).
a) How do the $n = 5$ distributions of $Z$ and $\hat{Z}$ compare to each other and to the standard normal
distribution?
b) Appropriately modify the code and rerun it for the case of $n = 100$. How do the $n = 100$
distributions of $Z$ and $\hat{Z}$ compare to each other, the $n = 5$ distributions, and the standard normal
distribution?
4. Type Code Set #3 into the upper left pane and run it. This is a simulation intended to study the
effectiveness of the confidence interval formulas

$$\bar{X} \pm z\,\frac{\sigma}{\sqrt{n}} \quad\text{and}\quad \bar{X} \pm z\,\frac{s}{\sqrt{n}}$$

in the case that the distribution being sampled is $U(0,1)$ and $n = 5$. In its initial form it uses the value
$z = 1.96$.
a) What is the target confidence level for two-sided intervals with the end-points above and $z = 1.96$?
Explain.
b) What are the actual approximate confidence levels achieved in this context? Which is closer to
your answer for a)?
c) Which ones (if any) of the first 10 samples of size 5 fail to produce intervals covering $\mu$? (Answer
this for both the "known sigma" and "unknown sigma" interval formulas.)
d) Why is your answer to b) consistent with the results of part 3a) above?
e) Appropriately modify the code and rerun it for the case of $n = 100$. Now how do the actual
approximate confidence levels achieved compare to your answer for a)?
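For part a), the nominal level attached to a given multiplier can be read off the standard normal cumulative distribution function. A minimal sketch using only base R's `pnorm` (the variable names `z` and `level` are illustrative and do not come from the code sets):

```r
# Nominal two-sided level for a standard normal multiplier z
z <- 1.96
level <- pnorm(z) - pnorm(-z)  # P(-z < Z < z) for standard normal Z
level
```

The same two lines can be rerun with a different `z` to see how the target level changes with the multiplier.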
Code Sets for Stat 401B Laboratory #4
#Code Set 1
#Here is some code for Exercise 2 Section 5.5
alpha<-function(l1,l2,t1,t2){
(l2-l1)/(l1*(t2-t1))
}
L1<-rnorm(10000,mean=1,sd=.00005)
L2<-rnorm(10000,mean=1.00095,sd=.00005)
T1<-rnorm(10000,mean=50,sd=.1)
T2<-rnorm(10000,mean=100,sd=.1)
a<-rep(0,10000)
for(i in 1:10000) {a[i]<-alpha(L1[i],L2[i],T1[i],T2[i])}
summary(a)
sd(a)
hist(a)
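One possible way to organize the "one input at a time" comparison from problem 1b) is sketched below. It assumes the same means and standard deviations as Code Set 1. The seed, the vectors `means` and `sds`, and the helper `one_at_a_time_sd` are illustrative names introduced here, not part of the original code.

```r
# Sketch: vary one input at a time, holding the others at their means
set.seed(401)  # assumption: seed added only so the run is reproducible
alpha <- function(l1, l2, t1, t2) {
  (l2 - l1) / (l1 * (t2 - t1))
}
N <- 10000
means <- c(L1 = 1, L2 = 1.00095, T1 = 50, T2 = 100)
sds   <- c(L1 = .00005, L2 = .00005, T1 = .1, T2 = .1)

# Standard deviation of alpha when only the input called `name` is random
one_at_a_time_sd <- function(name) {
  inputs <- as.list(means)  # every input fixed at its mean ...
  inputs[[name]] <- rnorm(N, mean = means[[name]], sd = sds[[name]])  # ... except one
  sd(alpha(inputs$L1, inputs$L2, inputs$T1, inputs$T2))
}

sd_each <- sapply(names(means), one_at_a_time_sd)
sd_each                 # the 4 standard deviations compared in part b)
sqrt(sum(sd_each^2))    # root of the sum of squares used in part c)
```

Because `alpha` uses only vectorized arithmetic, it can be applied to whole vectors at once, so no explicit loop is needed here.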
#Code Set 2
#Here is some code for studying the actual distribution of
#some approximately standard normal variables built in "iid"
#(random sampling from a fixed distribution) models
M<-matrix(runif(50000,min=0,max=1),nrow=10000,byrow=TRUE)
av<-1:10000
for (i in 1:10000){
av[i]<-mean(M[i,])
}
z<-1:10000
for (i in 1:10000){
z[i]<-(av[i]-.5)*sqrt(60)
}
hist(z,freq=FALSE)
curve(dnorm(x),add=TRUE)
plot(ecdf(z))
curve(pnorm(x),add=TRUE)
#Now use the sample standard deviation rather than the model
#sigma=1/sqrt(12)
s<-1:10000
for (i in 1:10000){
s[i]<-sd(M[i,])
}
z<-1:10000
for (i in 1:10000){
z[i]<-(av[i]-.5)*sqrt(5)/s[i]
}
hist(z,freq=FALSE)
curve(dnorm(x),add=TRUE)
plot(ecdf(z))
curve(pnorm(x),add=TRUE)
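For problem 3b), one way to avoid editing several hard-coded constants is to make the sample size a variable. The sketch below is a vectorized variant of Code Set 2 under that assumption. The names `n`, `reps`, `sigma`, and `zhat`, and the seed, are introduced here for illustration.

```r
# Sketch: Code Set 2 with the sample size n as a parameter
set.seed(401)   # assumption: seed added only for reproducibility
n    <- 100     # sample size (set to 5 to mimic the original run)
reps <- 10000   # number of simulated samples

M  <- matrix(runif(reps * n, min = 0, max = 1), nrow = reps, byrow = TRUE)
av <- rowMeans(M)        # sample mean of each row
s  <- apply(M, 1, sd)    # sample standard deviation of each row

sigma <- 1 / sqrt(12)    # model standard deviation of U(0,1)
z    <- (av - .5) / (sigma / sqrt(n))  # standardized using the model sigma
zhat <- (av - .5) / (s / sqrt(n))      # standardized using the sample s

hist(z, freq = FALSE);    curve(dnorm(x), add = TRUE)
hist(zhat, freq = FALSE); curve(dnorm(x), add = TRUE)
```

With `n <- 5`, the factor `1/(sigma/sqrt(n))` equals `sqrt(60)`, which matches the constant used in the original loop.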
#Code Set 3
#Here is some code for making and checking the performance
#of CIs for mu (for U(0,1) observations)
#First Use the model Standard deviation
M<-matrix(runif(50000,min=0,max=1),nrow=10000,byrow=TRUE)
#Initialize av here so this code set also runs in a fresh session
av<-rep(0,10000)
Low<-rep(0,10000)
Up<-rep(0,10000)
chk<-rep(0,10000)
for (i in 1:10000){
av[i]<-mean(M[i,])
}
for(i in 1:10000) {Low[i]<-av[i]-1.96*sqrt(1/60)}
for(i in 1:10000) {Up[i]<-av[i]+1.96*sqrt(1/60)}
for(i in 1:10000) {if((Low[i]<.5)&(.5<Up[i])) chk[i]<-1}
cbind(Low[1:10],Up[1:10],chk[1:10])
mean(chk)
#Now use the sample standard deviation
Low<-rep(0,10000)
Up<-rep(0,10000)
chk<-rep(0,10000)
s<-rep(0,10000)
for (i in 1:10000){
s[i]<-sd(M[i,])
}
for(i in 1:10000) {Low[i]<-av[i]-1.96*s[i]*sqrt(1/5)}
for(i in 1:10000) {Up[i]<-av[i]+1.96*s[i]*sqrt(1/5)}
for(i in 1:10000) {if((Low[i]<.5)&(.5<Up[i])) chk[i]<-1}
cbind(Low[1:10],Up[1:10],chk[1:10])
mean(chk)
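For problem 4e), the coverage check can be wrapped in a function of the sample size so that n = 5 and n = 100 are run with the same code. This is a sketch under the same U(0,1) model as Code Set 3. The function name `coverage` and its arguments are hypothetical, introduced only for this illustration.

```r
# Sketch: estimated coverage of both interval formulas for U(0,1) samples
coverage <- function(n, z = 1.96, reps = 10000) {
  M  <- matrix(runif(reps * n, min = 0, max = 1), nrow = reps, byrow = TRUE)
  av <- rowMeans(M)        # sample means
  s  <- apply(M, 1, sd)    # sample standard deviations
  half_known   <- z * sqrt(1 / (12 * n))  # half-width with model sigma
  half_unknown <- z * s / sqrt(n)         # half-width with sample s
  # An interval covers mu = .5 exactly when |av - .5| < half-width
  c(known_sigma   = mean(abs(av - .5) < half_known),
    unknown_sigma = mean(abs(av - .5) < half_unknown))
}

set.seed(401)  # assumption: seed added only for reproducibility
coverage(5)
coverage(100)
```

The condition `abs(av - .5) < half_width` is equivalent to the original check `(Low<.5)&(.5<Up)`, since `Low` and `Up` are `av` minus and plus the half-width.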