Assignment #5 Answers
STAT 992
Spring 2015
Complete the following problems below. Within each part, include your R program output with code
inside of it and any additional information needed to explain your answer. Your R code and output
should be formatted in the exact same manner as in the lecture notes.
1) (20 total points) The Rayleigh distribution is often used to help model wind speed (Celik, Energy
Conversion and Management, 2004, p. 1735-1747). Let X be a random variable denoting the
average wind speed during a February day in Lincoln, where X has a Rayleigh distribution defined
as
$$f(x) = \begin{cases} \dfrac{1}{\beta^2}\, x\, e^{-\frac{1}{2}(x/\beta)^2} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$$
where β > 0. The data file wind_speed.csv contains the observed daily average wind speed values
for 2000 – 2004 in Lincoln during each February. Assume that each observation is independent
(the autocorrelations are all nonsignificant, except for the first autocorrelation in 2002).
a) Perform the derivations below without using a computer! All answers should be given in terms
of symbols rather than numerical values that use the observed data.
i) (2 points) Derive the maximum likelihood estimator of β.
The likelihood function is
$$L(x_1,\ldots,x_n\,|\,\beta) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{\beta^2}\, x_i\, e^{-\frac{1}{2}(x_i/\beta)^2} = \frac{1}{\beta^{2n}}\, e^{-\frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2} \prod_{i=1}^{n} x_i$$
The log-likelihood function is
$$\log[L(x_1,\ldots,x_n\,|\,\beta)] = \log\left[\frac{1}{\beta^{2n}}\, e^{-\frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2} \prod_{i=1}^{n} x_i\right] = -2n\log(\beta) - \frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n}\log(x_i)$$
The derivative of the log-likelihood function with respect to β is
$$\frac{\partial}{\partial\beta}\log[L(x_1,\ldots,x_n\,|\,\beta)] = \frac{\partial}{\partial\beta}\left[-2n\log(\beta) - \frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n}\log(x_i)\right] = -\frac{2n}{\beta} + \frac{1}{\beta^3}\sum_{i=1}^{n} x_i^2$$
Equating the above result to 0 and solving for β produces
$$-\frac{2n}{\beta} + \frac{1}{\beta^3}\sum_{i=1}^{n} x_i^2 = 0 \;\Rightarrow\; \frac{1}{\beta^3}\sum_{i=1}^{n} x_i^2 = \frac{2n}{\beta} \;\Rightarrow\; \beta^2 = \frac{\sum_{i=1}^{n} x_i^2}{2n}.$$
Therefore, the MLE is
$$\hat{\beta} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{2n}}.$$
ii) (1 point) Derive the method of moment estimator for β. Note that E(X) = β√(π/2).
The method of moments estimator is found by equating the sample mean to the expected
value of X:
$$\bar{x} = \beta\sqrt{\frac{\pi}{2}} \;\Rightarrow\; \tilde{\beta} = \bar{x}\sqrt{\frac{2}{\pi}}$$
iii) (2 points) Through using the asymptotic normality of maximum likelihood estimators, find
the estimated asymptotic variance of the maximum likelihood estimator. Note that E(X²) = 2β².
One can show that
$$\frac{\partial^2}{\partial\beta^2}\log[f(x_i\,|\,\beta)] = \frac{2}{\beta^2} - \frac{3}{\beta^4}x_i^2.$$
The Fisher information for a single observation is found from
$$-E\left[\frac{\partial^2}{\partial\beta^2}\log[f(X_i\,|\,\beta)]\right] = -E\left[\frac{2}{\beta^2} - \frac{3}{\beta^4}X_i^2\right] = -\frac{2}{\beta^2} + \frac{3}{\beta^4}E(X_i^2) = -\frac{2}{\beta^2} + \frac{3}{\beta^4}\cdot 2\beta^2 = \frac{4}{\beta^2}.$$
This leads to a variance of
$$\left\{-E\left[\frac{\partial^2}{\partial\beta^2}\log[f(X_i\,|\,\beta)]\right]\right\}^{-1} = \frac{\beta^2}{4}$$
based on a single observation, so the asymptotic variance of the MLE from n observations is β²/(4n). The estimated asymptotic variance for the MLE is therefore β̂²/(4n). One could also obtain the same result from
$$\left\{-E\left[\frac{\partial^2}{\partial\beta^2}\log[f(X_1,\ldots,X_n\,|\,\beta)]\right]\right\}^{-1}$$
evaluated at β̂, which again gives β̂²/(4n).
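As an optional numerical sanity check (not required by the problem), integrate() can verify the moment E(X²) = 2β² that drives the Fisher information result above. The value β = 3 below is an arbitrary choice used only for illustration.
f.rayleigh <- function(x, beta) { (x/beta^2) * exp(-0.5*(x/beta)^2) }  #Rayleigh PDF
beta0 <- 3  #Arbitrary beta value for the check
integrate(f = function(x) x^2 * f.rayleigh(x = x, beta = beta0), lower = 0, upper = Inf)$value
2*beta0^2  #The integral should be close to this value (18 here)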
b) (2 points) Find the numerical values for the method of moment estimate, the maximum
likelihood estimate, and the estimated asymptotic variance.
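The code below assumes the wind speed data have already been read into a vector x with n = length(x). A minimal sketch of that setup is given here; the column name wind.speed is only an assumption about the layout of wind_speed.csv, so replace it with whatever name the file actually uses.
wind <- read.csv(file = "wind_speed.csv")  #Observed daily average wind speeds
x <- wind$wind.speed  #Column name is an assumption about the file layout
n <- length(x)  #Number of observations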
> beta.mle <- sqrt(sum(x^2)/(2*n))
> beta.mom <- mean(x) * sqrt(2/pi)
> data.frame(beta.mle, beta.mom)
beta.mle beta.mom
1 7.871982 8.138423
> est.var <- beta.mle^2/(4*n)
> est.var
[1] 0.1090988
β̂ = 7.8720, β̃ = 8.1384, and AsVar(β̂) = 0.1091.
c) (2 points) Plot the log-likelihood function and first derivative of the log-likelihood function.
Include a line on each plot that shows the location of the maximum likelihood estimate relative
to the function.
> log.lik.rayleigh <- function(beta, x) {
n <- length(x)
-2*n*log(beta) - 1/(2*beta^2) * sum(x^2) + sum(log(x))
}
> part.log.lik.rayleigh <- function(beta, x) {
n <- length(x)
-2*n/beta + 1/beta^3 * sum(x^2)
}
> #Test
> log.lik.rayleigh(beta = beta.mle, x = x)
[1] -412.3285
> part.log.lik.rayleigh(beta = beta.mle, x = x)
[1] 7.105427e-15
> y <- x #Need to change x here due to how curve() works
> curve(expr = log.lik.rayleigh(beta = x, x = y), xlim = c(5,20), xlab =
expression(beta), ylab = "log likelihood", col = "red")
> abline(v = beta.mle, lty = "dotted")
> curve(expr = part.log.lik.rayleigh(beta = x, x = y), xlim = c(5,20), xlab =
expression(beta), ylab = "partial log(L)", col = "red")
> abline(v = beta.mle, lty = "dotted")
> abline(h = 0)
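The renaming of x to y above is only needed because curve() reserves the name x for the plotting variable. An equivalent sketch that avoids the renaming evaluates the log-likelihood on a grid of β values and plots it directly; the grid limits simply match the xlim used above.
beta.grid <- seq(from = 5, to = 20, by = 0.1)  #Grid of beta values
loglik.grid <- sapply(X = beta.grid, FUN = log.lik.rayleigh, x = x)  #log(L) at each grid value
plot(x = beta.grid, y = loglik.grid, type = "l", col = "red",
  xlab = expression(beta), ylab = "log likelihood")
abline(v = beta.mle, lty = "dotted")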
d) The purpose of this problem is to calculate some of the derivatives in part a) numerically using
fderiv() of the pracma package with the function's defaults.
i) (3 points) Find the first derivative of the log-likelihood function using fderiv(). Evaluate
this numerical derivative at β equal to 7, the maximum likelihood estimate, and 8. Compare
these values with the actual values produced by the results in part b).
> log.lik.rayleigh2 <- function(beta, x1) {
n <- length(x1)
-2*n*log(beta) - 1/(2*beta^2) * sum(x1^2) + sum(log(x1))
}
> library(pracma)
> eval.at <- c(7, beta.mle, 8)
> eval1 <- fderiv(f = log.lik.rayleigh2, x = eval.at, n = 1, x1 = x)
> eval2 <- part.log.lik.rayleigh(beta = eval.at, x = x)
> data.frame(eval.at, eval1, eval2)
   eval.at    eval1         eval2
1 7.000000 10.73743  1.073743e+01
2 7.871982  0.00000  7.105427e-15
3 8.000000 -1.12707 -1.127070e+00
The first derivatives are essentially the same!
ii) (3 points) Find the second derivative of the log-likelihood function using fderiv().
Evaluate this numerical derivative at the maximum likelihood estimate and then use it to
compute the estimated asymptotic variance of the maximum likelihood estimator. Compare
the estimated variance to that produced by the results in part b).
> eval1 <- -1/fderiv(f = log.lik.rayleigh2, x = beta.mle, n = 2, x1 = x)
> eval2 <- beta.mle^2/(4*n)
> data.frame(beta.mle, eval1, eval2)
beta.mle
eval1
eval2
1 7.871982 0.1090989 0.1090988
The estimated asymptotic variances are essentially the same!
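If the installed version of pracma provides hessian(), the same numerical second derivative can be obtained in a single call; this is only an optional cross-check of the fderiv() computation above.
#Numerical Hessian of the log-likelihood at the MLE (a 1x1 matrix here)
-1/hessian(f = log.lik.rayleigh2, x0 = beta.mle, x1 = x)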
e) (4 points) Through using optim() with the data, find the maximum likelihood estimate for β
and the corresponding estimated asymptotic variance. Verify your answers match the results
given in part b). Use the method of moment estimate as the initial value for β.
> beta.mle.optim <- optim(par = beta.mom, fn = log.lik.rayleigh, x = x, method =
"L-BFGS-B", control = list(fnscale = -1), hessian = TRUE, lower = 0, upper = Inf)
> beta.mle.optim
$par
[1] 7.871982
$value
[1] -412.3285
$counts
function gradient
       6        6
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
$hessian
[,1]
[1,] -9.166007
> -1/beta.mle.optim$hessian #Estimated variance
[,1]
[1,] 0.1090988
β̂ = 7.8720 and AsVar(β̂) = 0.1091, as we have obtained before.
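Because β is the only parameter, optimize() could be used in place of optim(); a brief sketch is below. The search interval c(1, 20) is an arbitrary choice wide enough to contain the estimate, and optimize() does not return a Hessian, so the asymptotic variance would still come from the result in part a) or a numerical second derivative.
opt1 <- optimize(f = log.lik.rayleigh, interval = c(1, 20), x = x, maximum = TRUE)
opt1$maximum  #Should be very close to beta.mle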
2) (9 total points) Continuing 1), the Rayleigh distribution is a special case of the Weibull distribution.
One representation of the Weibull PDF is
$$f(x) = \begin{cases} \dfrac{\gamma}{\lambda}\left(\dfrac{x}{\lambda}\right)^{\gamma-1} e^{-(x/\lambda)^{\gamma}} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$$
where λ > 0 and γ > 0. Note that when γ = 2 and λ = √2·β, the Weibull simplifies to the Rayleigh
distribution. Complete the following.
a) (3 points) Find the method of moment estimates for λ and γ using the data. Note that
$$E(X) = \lambda\,\Gamma(1 + 1/\gamma), \qquad E(X^2) = \lambda^2\,\Gamma(1 + 2/\gamma), \qquad Var(X) = \lambda^2\left[\Gamma\!\left(1 + \frac{2}{\gamma}\right) - \Gamma\!\left(1 + \frac{1}{\gamma}\right)^{2}\right].$$
Show all derivations.
Setting the sample estimates equal to the expected values leads to x̄ = λΓ(1 + 1/γ) and
$$\frac{1}{n}\sum_{i=1}^{n} x_i^2 = \lambda^2\,\Gamma(1 + 2/\gamma).$$
Solving for λ in the first moment equation, we obtain
$$\lambda = \frac{\bar{x}}{\Gamma(1 + 1/\gamma)}.$$
Substituting this expression for λ into the second moment equation, we obtain
$$\frac{1}{n}\sum_{i=1}^{n} x_i^2 = \left(\frac{\bar{x}}{\Gamma(1 + 1/\gamma)}\right)^{2} \Gamma(1 + 2/\gamma).$$
Now, we can use the uniroot() function to solve for γ in
$$\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \left(\frac{\bar{x}}{\Gamma(1 + 1/\gamma)}\right)^{2} \Gamma(1 + 2/\gamma) = 0.$$
> mom.func <- function(gam, x) {
mean(x^2) - (mean(x)/gamma(1 + 1/gam))^2 * gamma(1 + 2/gam)
}
> #Look for range of values to use in uniroot()
> curve(expr = mom.func(gam = x, x = y), xlim = c(1, 20), xlab = expression(gamma),
ylab = "Function", col = "red", lwd = 2)
> abline(h = 0)
> #Root is somewhere between 1 and 10
> gamma.mom <- uniroot(f = mom.func, interval = c(1,10), x = x)
> gamma.mom
$root
[1] 2.440065
$f.root
[1] 2.428406e-05
$iter
[1] 9
$estim.prec
[1] 6.103516e-05
I obtain a value of 2.44. Using this value for γ in λ = x̄/Γ(1 + 1/γ), the method of moment estimate
for λ is equal to 11.50.
> lambda.mom <- mean(x) / gamma(1 + 1/gamma.mom$root)
> data.frame(gamma.mom$root, lambda.mom)
  gamma.mom.root lambda.mom
1       2.440065   11.50239
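As a quick optional check, these estimates should reproduce the two sample moments they were derived from.
lambda.mom * gamma(1 + 1/gamma.mom$root)  #Should be close to mean(x)
lambda.mom^2 * gamma(1 + 2/gamma.mom$root)  #Should be close to mean(x^2)
c(mean(x), mean(x^2))  #Sample moments for comparison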
b) (3 points) Find the maximum likelihood estimates for λ and γ using optim() and the observed
data. State the estimated covariance matrix.
> log.lik.weibull <- function(theta, x) {
lambda <- theta[1]
gamma <- theta[2]
sum(dweibull(x = x, shape = gamma, scale = lambda, log = TRUE))
}
> weibull.mle.optim <- optim(par = c(lambda.mom, gamma.mom$root), fn =
log.lik.weibull, x = x, method = "L-BFGS-B", control = list(fnscale = -1),
hessian = TRUE, lower = c(0,0), upper = c(Inf, Inf))
> weibull.mle.optim
$par
[1] 11.533164  2.449769
$value
[1] -407.7613
$counts
function gradient
       6        6
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
$hessian
          [,1]       [,2]
[1,] -6.406810   5.606374
[2,]  5.606374 -45.976706
> -solve(weibull.mle.optim$hessian) #Estimated covariance matrix
           [,1]       [,2]
[1,] 0.17472828 0.02130627
[2,] 0.02130627 0.02434822
The maximum likelihood estimates are λ̂ = 11.53 and γ̂ = 2.45. The estimated covariance
matrix is
$$\begin{pmatrix} 0.1747 & 0.0213 \\ 0.0213 & 0.0243 \end{pmatrix}.$$
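As an optional cross-check (assuming the MASS package is available), fitdistr() maximizes the same Weibull log-likelihood numerically. Note that it reports the parameters in the order (shape, scale), i.e., (γ, λ), so the rows and columns of its covariance matrix are ordered accordingly.
library(MASS)
weibull.fitdistr <- fitdistr(x = x, densfun = "weibull")  #ML fit of the Weibull
weibull.fitdistr$estimate  #shape = gamma, scale = lambda
weibull.fitdistr$vcov  #Estimated covariance matrix (shape, scale ordering)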
c) (3 points) Construct contour and 3D plots of the log-likelihood function with the maximum
likelihood estimates plotted at their appropriate locations.
> gamma.val <- seq(0.25,15,0.1)
> lambda.val <- seq(0.25,25,0.1)
> pars <- as.matrix(expand.grid(lambda.val, gamma.val))
> #Evaluate log(L) values#
> loglik <- matrix(data = NA, nrow = nrow(pars), ncol = 1)
> for(i in 1:nrow(pars)){
theta <- pars[i,]
loglik[i,1] <- c(log.lik.weibull(theta = theta, x = x))
}
> #Put log(L) into matrix - need to be careful about ordering
> loglik.mat <- matrix(data = loglik, nrow = length(lambda.val), ncol =
length(gamma.val), byrow = FALSE)
> head(loglik)
[,1]
[1,] -736.7222
[2,] -720.2409
[3,] -709.4374
[4,] -701.6771
[5,] -695.7738
[6,] -691.1027
> loglik.mat[1:5, 1:5]
          [,1]      [,2]      [,3]       [,4]       [,5]
[1,] -736.7222 -793.9590 -934.0530 -1184.7586 -1591.6411
[2,] -720.2409 -754.1775 -852.1910 -1030.9258 -1318.6144
[3,] -709.4374 -728.5994 -800.6398  -936.1773 -1154.3482
[4,] -701.6771 -710.4907 -764.7066  -871.2199 -1043.6857
[5,] -695.7738 -696.8739 -738.0199  -823.6095  -963.6965
> log.lik.weibull(theta = c(0.25, 0.25), x = x)
[1] -736.7222
> log.lik.weibull(theta = c(0.25, 0.35), x = x)
[1] -793.959
> log.lik.weibull(theta = c(0.35, 0.25), x = x)
[1] -720.2409
> par(pty = "s", mfrow = c(1,1))
> contour(x = lambda.val, y = gamma.val, z = loglik.mat, levels = c(-1000, -600,
-500, -450, -430, -420, -410), xlab = expression(lambda), ylab =
expression(gamma), main = "Contour plot of log(L)")
> abline(h = weibull.mle.optim$par[2], col = "blue")
> abline(v = weibull.mle.optim$par[1], col = "blue")
> max(loglik)
[1] -407.7622
> library(package = rgl)
> open3d() #Open plot window
wgl
  6
> #Plot with different scale
> lambda2<-seq(10,13,0.1)
> gamma2 <- seq(1,4,0.1)
> pars <- as.matrix(expand.grid(lambda2, gamma2))
> loglik <- matrix(data = NA, nrow = nrow(pars), ncol = 1)
> for(i in 1:nrow(pars)){
theta <- pars[i,]
loglik[i,1] <- c(log.lik.weibull(theta = theta, x = x))
}
> loglik.mat <- matrix(data = loglik, nrow = length(lambda2), ncol =
length(gamma2), byrow = FALSE)
> persp3d(x = lambda2, y = gamma2, z = loglik.mat, xlab = "lambda", ylab = "gamma",
zlab = "log(L)", ticktype = "detailed", col = "red")
> spheres3d(x = weibull.mle.optim$par[1], y = weibull.mle.optim$par[2], z =
max(loglik), radius = 5)
> lines3d(x = c(weibull.mle.optim$par[1], weibull.mle.optim$par[1]),
y = c(weibull.mle.optim$par[2], weibull.mle.optim$par[2]),
z = c(min(loglik), max(loglik)))
> grid3d(c("x", "y+", "z"))
3) (3 points) Continuing 1) and 2), construct histograms and EDF plots of the data to examine
whether a Rayleigh and/or Weibull distribution work well for the data set. Comment on which
distribution is better.
> win.graph(width = 12, pointsize = 16)
> par(mfrow = c(1,2))
> #PDF
> hist(x = x, xlab = "x", freq = FALSE, ylim = c(0, 0.13), main = "Histogram")
> curve(expr = dweibull(x = x, shape = 2, scale = sqrt(2)*beta.mle), lty = "dotted",
col = "blue", add = TRUE)
> curve(expr = dweibull(x = x, shape = weibull.mle.optim$par[2], scale =
weibull.mle.optim$par[1]), lty = "dashed", col = "red", add = TRUE)
> legend(x = 12, y = 0.10, legend = c("Rayleigh", "Weibull"), lty = c("dotted",
"dashed"), col = c("blue", "red"), bty = "n")
> #CDFs
> plot.ecdf(x = x, verticals = TRUE, do.p = FALSE, main = "CDFs", lwd = 2, col =
"black", xlab = "x", ylab = "CDF")
> curve(expr = pweibull(q = x, shape = 2, scale = sqrt(2)*beta.mle), lty = "dashed",
col = "blue", add = TRUE)
> curve(expr = pweibull(q = x, shape = weibull.mle.optim$par[2], scale =
weibull.mle.optim$par[1]), lty = "dotted", col = "red", add = TRUE)
> legend(x = 10, y = 0.2, legend = c("Rayleigh", "Weibull"), lty = c("dashed",
"dotted"), col = c("blue", "red"), bty = "n")
The Rayleigh looks pretty good! Its fitted PDF and CDF do a decent job of approximating the
corresponding plots for the observed data. The Weibull does a slightly better job, especially in the left
tail.
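A simple numerical comparison (optional, beyond the plots requested) is to compare the maximized log-likelihoods while penalizing the Weibull for its extra parameter via AIC.
loglik.rayleigh <- log.lik.rayleigh(beta = beta.mle, x = x)  #Rayleigh: 1 parameter
loglik.weibull <- weibull.mle.optim$value  #Weibull: 2 parameters
data.frame(model = c("Rayleigh", "Weibull"),
  logLik = c(loglik.rayleigh, loglik.weibull),
  AIC = c(-2*loglik.rayleigh + 2*1, -2*loglik.weibull + 2*2))
Using the maximized log-likelihood values reported earlier (-412.33 and -407.76), the AIC values work out to roughly 826.7 for the Rayleigh and 819.5 for the Weibull, which agrees with the visual impression that the Weibull fits slightly better.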