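Assignment #5 Answers
STAT 992
Spring 2015

Complete the following problems below. Within each part, include your R program output with code inside of it and any additional information needed to explain your answer. Your R code and output should be formatted in the exact same manner as in the lecture notes.

1) (20 total points) The Rayleigh distribution is often used to help model wind speed (Celik, Energy Conversion and Management, 2004, p. 1735-1747). Let X be a random variable denoting the average wind speed during a February day in Lincoln, where X has a Rayleigh distribution defined as

$$f(x) = \begin{cases} \dfrac{1}{\beta^2}\, x\, e^{-\frac{1}{2}(x/\beta)^2} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$$

where $\beta > 0$. The data file wind_speed.csv contains the observed daily average wind speed values for 2000 – 2004 in Lincoln during each February. Assume that each observation is independent (the autocorrelations are all nonsignificant, except for the first autocorrelation in 2002).

a) Perform the derivations below without using a computer! All answers should be given in terms of symbols rather than numerical values that use the observed data.

i) (2 points) Derive the maximum likelihood estimator of $\beta$.

The likelihood function is

$$L(x_1,\ldots,x_n \mid \beta) = \prod_{i=1}^{n} f(x_i \mid \beta) = \prod_{i=1}^{n} \frac{1}{\beta^2}\, x_i\, e^{-\frac{1}{2}(x_i/\beta)^2} = \frac{1}{\beta^{2n}} \prod_{i=1}^{n} x_i\, e^{-\frac{1}{2}(x_i/\beta)^2} = \frac{1}{\beta^{2n}}\, e^{-\frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2} \prod_{i=1}^{n} x_i$$

The log-likelihood function is

$$\log L(x_1,\ldots,x_n \mid \beta) = \log\left[\frac{1}{\beta^{2n}}\, e^{-\frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2} \prod_{i=1}^{n} x_i\right] = -2n\log(\beta) - \frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2 + \log\left(\prod_{i=1}^{n} x_i\right) = -2n\log(\beta) - \frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} \log(x_i)$$

The derivative of the log-likelihood function with respect to $\beta$ is

$$\frac{\partial}{\partial\beta} \log L(x_1,\ldots,x_n \mid \beta) = \frac{\partial}{\partial\beta}\left[-2n\log(\beta) - \frac{1}{2\beta^2}\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} \log(x_i)\right] = -\frac{2n}{\beta} + \frac{1}{\beta^3}\sum_{i=1}^{n} x_i^2$$

Equating the above result to 0 and solving for $\beta$ produces

$$-\frac{2n}{\beta} + \frac{1}{\beta^3}\sum_{i=1}^{n} x_i^2 = 0 \;\Rightarrow\; \frac{2n}{\beta} = \frac{1}{\beta^3}\sum_{i=1}^{n} x_i^2 \;\Rightarrow\; \beta^2 = \frac{\sum_{i=1}^{n} x_i^2}{2n} \;\Rightarrow\; \beta = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{2n}}$$

Therefore, the MLE is $\hat{\beta} = \sqrt{\sum_{i=1}^{n} x_i^2 / (2n)}$.

ii) (1 point) Derive the method of moment estimator for $\beta$. Note that $E(X) = \beta\sqrt{\pi/2}$.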
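
The method of moments estimator is found by equating the sample mean to the expected value of X:

$$\bar{x} = \tilde{\beta}\sqrt{\pi/2} \;\Rightarrow\; \tilde{\beta} = \bar{x}\sqrt{2/\pi}$$

iii) (2 points) Through using the asymptotic normality of maximum likelihood estimators, find the estimated asymptotic variance of the maximum likelihood estimator. Note that $E(X^2) = 2\beta^2$.

One can show that

$$\frac{\partial^2}{\partial\beta^2} \log[f(x_i \mid \beta)] = \frac{2}{\beta^2} - \frac{3}{\beta^4}\, x_i^2 .$$

The Fisher information is found from

$$F(\beta) = -E\left\{\frac{\partial^2}{\partial\beta^2} \log[f(X_i \mid \beta)]\right\} = -E\left[\frac{2}{\beta^2} - \frac{3}{\beta^4}\, X_i^2\right] = -\frac{2}{\beta^2} + \frac{3}{\beta^4}\, E(X_i^2) = -\frac{2}{\beta^2} + \frac{3}{\beta^4}\, 2\beta^2 = -\frac{2}{\beta^2} + \frac{6}{\beta^2} = \frac{4}{\beta^2}$$

This leads to a variance of

$$F(\beta)^{-1} = \left\{-E\left[\frac{\partial^2}{\partial\beta^2} \log[f(X_i \mid \beta)]\right]\right\}^{-1} = \frac{\beta^2}{4}$$

The estimated asymptotic variance for the MLE is $\hat{\beta}^2/(4n)$. One could also obtain the same results by

$$\left\{-E\left[\frac{\partial^2}{\partial\beta^2} \log[f(X_1,\ldots,X_n \mid \beta)]\right]\right\}^{-1}\Bigg|_{\beta=\hat{\beta}} = \frac{\hat{\beta}^2}{4n}$$

b) (2 points) Find the numerical values for the method of moment estimate, the maximum likelihood estimate, and the estimated asymptotic variance.

> beta.mle <- sqrt(sum(x^2)/(2*n))
> beta.mom <- mean(x) * sqrt(2/pi)
> data.frame(beta.mle, beta.mom)
  beta.mle beta.mom
1 7.871982 8.138423
> est.var <- beta.mle^2/(4*n)
> est.var
[1] 0.1090988

$\hat{\beta} = 7.8720$, $\tilde{\beta} = 8.1384$, and $\widehat{AsVar}(\hat{\beta}) = 0.1091$

c) (2 points) Plot the log-likelihood function and first derivative of the log-likelihood function. Include a line on each plot that shows the location of the maximum likelihood estimate relative to the function.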

> log.lik.rayleigh <- function(beta, x) {
    n <- length(x)
    -2*n*log(beta) - 1/(2*beta^2) * sum(x^2) + sum(log(x))
  }

> part.log.lik.rayleigh <- function(beta, x) {
    n <- length(x)
    -2*n/beta + 1/beta^3 * sum(x^2)
  }

> #Test
> log.lik.rayleigh(beta = beta.mle, x = x)
[1] -412.3285
> part.log.lik.rayleigh(beta = beta.mle, x = x)
[1] 7.105427e-15

> y <- x #Need to change x here due to how curve() works
> curve(expr = log.lik.rayleigh(beta = x, x = y), xlim = c(5,20),
    xlab = expression(beta), ylab = "log likelihood", col = "red")
> abline(v = beta.mle, lty = "dotted")

> curve(expr = part.log.lik.rayleigh(beta = x, x = y), xlim = c(5,20),
    xlab = expression(beta), ylab = "partial log(L)", col = "red")
> abline(v = beta.mle, lty = "dotted")
> abline(h = 0)

d) The purpose of this problem is to calculate some of the derivatives in part a) numerically using fderiv() of the pracma package with the function's defaults.

i) (3 points) Find the first derivative of the log-likelihood function using fderiv(). Evaluate this numerical derivative at $\beta$ equal to 7, the maximum likelihood estimate, and 8. Compare these values with the actual values produced by the results in part b).

> #The data argument is named x1 so that it does not clash with the x argument of fderiv()
> log.lik.rayleigh2 <- function(beta, x1) {
    n <- length(x1)
    -2*n*log(beta) - 1/(2*beta^2) * sum(x1^2) + sum(log(x1))
  }

> library(pracma)
> eval.at <- c(7, beta.mle, 8)
> eval1 <- fderiv(f = log.lik.rayleigh2, x = eval.at, n = 1, x1 = x)
> eval2 <- part.log.lik.rayleigh(beta = eval.at, x = x)
> data.frame(eval.at, eval1, eval2)
   eval.at    eval1         eval2
1 7.000000 10.73743  1.073743e+01
2 7.871982  0.00000  7.105427e-15
3 8.000000 -1.12707 -1.127070e+00

The first derivatives are essentially the same!

ii) (3 points) Find the second derivative of the log-likelihood function using fderiv(). Evaluate this numerical derivative at the maximum likelihood estimate and then use it to compute the estimated asymptotic variance of the maximum likelihood estimator. Compare the estimated variance to that produced by the results in part b).

> eval1 <- -1/fderiv(f = log.lik.rayleigh2, x = beta.mle, n = 2, x1 = x)
> eval2 <- beta.mle^2/(4*n)
> data.frame(beta.mle, eval1, eval2)
  beta.mle     eval1     eval2
1 7.871982 0.1090989 0.1090988

The estimated asymptotic variances are essentially the same!

e) (4 points) Through using optim() with the data, find the maximum likelihood estimate for $\beta$ and the corresponding estimated asymptotic variance. Verify your answers match the results given in part b). Use the method of moment estimate as the initial value for $\beta$.

> beta.mle.optim <- optim(par = beta.mom, fn = log.lik.rayleigh, x = x,
    method = "L-BFGS-B", control = list(fnscale = -1), hessian = TRUE,
    lower = 0, upper = Inf)
> beta.mle.optim
$par
[1] 7.871982

$value
[1] -412.3285

$counts
function gradient
       6        6

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

$hessian
          [,1]
[1,] -9.166007

> -1/beta.mle.optim$hessian #Estimated variance
          [,1]
[1,] 0.1090988

$\hat{\beta} = 7.8720$ and $\widehat{AsVar}(\hat{\beta}) = 0.1091$, as we have obtained before.

2) (9 total points) Continuing 1), the Rayleigh distribution is a special case of the Weibull distribution. One representation of the Weibull PDF is

$$f(x) = \begin{cases} \dfrac{\gamma}{\lambda}\left(\dfrac{x}{\lambda}\right)^{\gamma-1} e^{-(x/\lambda)^{\gamma}} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$$

where $\gamma > 0$ and $\lambda > 0$. Note that when $\gamma = 2$ and $\lambda = \sqrt{2}\,\beta$, the Weibull simplifies to the Rayleigh distribution. Complete the following.

a) (3 points) Find the method of moment estimates for $\gamma$ and $\lambda$ using the data. Note that $E(X) = \lambda\Gamma(1 + 1/\gamma)$, $E(X^2) = \lambda^2\Gamma(1 + 2/\gamma)$, and $Var(X) = \lambda^2\left[\Gamma(1 + 2/\gamma) - \Gamma(1 + 1/\gamma)^2\right]$. Show all derivations.

Setting the sample estimates equal to the expected values leads to

$$\bar{x} = \lambda\Gamma(1 + 1/\gamma) \quad\text{and}\quad n^{-1}\sum_{i=1}^{n} x_i^2 = \lambda^2\Gamma(1 + 2/\gamma).$$

Solving for $\lambda$ in the first moment equation, we obtain

$$\lambda = \frac{\bar{x}}{\Gamma(1 + 1/\gamma)}$$

Substituting this expression for $\lambda$ into the second moment equation, we obtain

$$\frac{1}{n}\sum_{i=1}^{n} x_i^2 = \left[\frac{\bar{x}}{\Gamma(1 + 1/\gamma)}\right]^2 \Gamma(1 + 2/\gamma)$$

Now, we can use the uniroot() function to solve for $\gamma$ in

$$\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \left[\frac{\bar{x}}{\Gamma(1 + 1/\gamma)}\right]^2 \Gamma(1 + 2/\gamma) = 0$$

> mom.func <- function(gam, x) {
    mean(x^2) - (mean(x)/gamma(1 + 1/gam))^2 * gamma(1 + 2/gam)
  }

> #Look for range of values to use in uniroot()
> curve(expr = mom.func(gam = x, x = y), xlim = c(1, 20), xlab = expression(gamma),
    ylab = "Function", col = "red", lwd = 2)
> abline(h = 0)

> #Root is somewhere between 1 and 10
> gamma.mom <- uniroot(f = mom.func, interval = c(1,10), x = x)
> gamma.mom
$root
[1] 2.440065

$f.root
[1] 2.428406e-05

$iter
[1] 9

$estim.prec
[1] 6.103516e-05

I obtain a value of 2.44 for $\tilde{\gamma}$. Using this value for $\gamma$ in $\lambda = \bar{x}/\Gamma(1 + 1/\gamma)$, the method of moment estimate for $\lambda$ is equal to 11.50.

> lambda.mom <- mean(x) / gamma(1 + 1/gamma.mom$root)
> data.frame(gamma.mom$root, lambda.mom)
  gamma.mom.root lambda.mom
1       2.440065   11.50239

b) (3 points) Find the maximum likelihood estimates for $\gamma$ and $\lambda$ using optim() and the observed data. State the estimated covariance matrix.

> #theta = (lambda, gamma), where lambda is the scale and gamma is the shape
> log.lik.weibull <- function(theta, x) {
    lambda <- theta[1]
    gamma <- theta[2]
    sum(dweibull(x = x, shape = gamma, scale = lambda, log = TRUE))
  }

> weibull.mle.optim <- optim(par = c(lambda.mom, gamma.mom$root), fn = log.lik.weibull,
    x = x, method = "L-BFGS-B", control = list(fnscale = -1), hessian = TRUE,
    lower = c(0,0), upper = c(Inf, Inf))
> weibull.mle.optim
$par
[1] 11.533164  2.449769

$value
[1] -407.7613

$counts
function gradient
       6        6

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

$hessian
          [,1]       [,2]
[1,] -6.406810   5.606374
[2,]  5.606374 -45.976706

> -solve(weibull.mle.optim$hessian) #Estimated covariance matrix
           [,1]       [,2]
[1,] 0.17472828 0.02130627
[2,] 0.02130627 0.02434822

The maximum likelihood estimates are $\hat{\lambda} = 11.53$ and $\hat{\gamma} = 2.45$.
The estimated covariance matrix is

$$\begin{bmatrix} 0.1747 & 0.0213 \\ 0.0213 & 0.0243 \end{bmatrix}$$

c) (3 points) Construct contour and 3D plots of the log-likelihood function with the maximum likelihood estimates plotted at their appropriate locations.

> gamma.val <- seq(0.25,15,0.1)
> lambda.val <- seq(0.25,25,0.1)
> pars <- as.matrix(expand.grid(lambda.val, gamma.val))

> #Evaluate log(L) values
> loglik <- matrix(data = NA, nrow = nrow(pars), ncol = 1)
> for(i in 1:nrow(pars)){
    theta <- pars[i,]
    loglik[i,1] <- c(log.lik.weibull(theta = theta, x = x))
  }

> #Put log(L) into matrix - need to be careful about ordering
> loglik.mat <- matrix(data = loglik, nrow = length(lambda.val), ncol = length(gamma.val),
    byrow = FALSE)
> head(loglik)
          [,1]
[1,] -736.7222
[2,] -720.2409
[3,] -709.4374
[4,] -701.6771
[5,] -695.7738
[6,] -691.1027
> loglik.mat[1:5, 1:5]
          [,1]      [,2]      [,3]       [,4]       [,5]
[1,] -736.7222 -793.9590 -934.0530 -1184.7586 -1591.6411
[2,] -720.2409 -754.1775 -852.1910 -1030.9258 -1318.6144
[3,] -709.4374 -728.5994 -800.6398  -936.1773 -1154.3482
[4,] -701.6771 -710.4907 -764.7066  -871.2199 -1043.6857
[5,] -695.7738 -696.8739 -738.0199  -823.6095  -963.6965
> log.lik.weibull(theta = c(0.25, 0.25), x = x)
[1] -736.7222
> log.lik.weibull(theta = c(0.25, 0.35), x = x)
[1] -793.959
> log.lik.weibull(theta = c(0.35, 0.25), x = x)
[1] -720.2409

> par(pty = "s", mfrow = c(1,1))
> contour(x = lambda.val, y = gamma.val, z = loglik.mat,
    levels = c(-1000, -600, -500, -450, -430, -420, -410),
    xlab = expression(lambda), ylab = expression(gamma), main = "Contour plot of log(L)")
> abline(h = weibull.mle.optim$par[2], col = "blue")
> abline(v = weibull.mle.optim$par[1], col = "blue")
> max(loglik)
[1] -407.7622

> library(package = rgl)
> open3d() #Open plot window
wgl
  6

> #Plot with different scale
> lambda2 <- seq(10,13,0.1)
> gamma2 <- seq(1,4,0.1)
> pars <- as.matrix(expand.grid(lambda2, gamma2))
> loglik <- matrix(data = NA, nrow = nrow(pars), ncol = 1)
> for(i in 1:nrow(pars)){
    theta <- pars[i,]
    loglik[i,1] <-
      c(log.lik.weibull(theta = theta, x = x))
  }
> loglik.mat <- matrix(data = loglik, nrow = length(lambda2), ncol = length(gamma2),
    byrow = FALSE)
> persp3d(x = lambda2, y = gamma2, z = loglik.mat, xlab = "lambda", ylab = "gamma",
    zlab = "log(L)", ticktype = "detailed", col = "red")
> spheres3d(x = weibull.mle.optim$par[1], y = weibull.mle.optim$par[2], z = max(loglik),
    radius = 5)
> lines3d(x = c(weibull.mle.optim$par[1], weibull.mle.optim$par[1]),
    y = c(weibull.mle.optim$par[2], weibull.mle.optim$par[2]),
    z = c(min(loglik), max(loglik)))
> grid3d(c("x", "y+", "z"))

3) (3 points) Continuing 1) and 2), construct histograms and EDF plots of the data to examine whether a Rayleigh and/or Weibull distribution works well for the data set. Comment on which distribution is better.

> win.graph(width = 12, pointsize = 16)
> par(mfrow = c(1,2))

> #PDF
> hist(x = x, xlab = "x", freq = FALSE, ylim = c(0, 0.13), main = "Histogram")
> #The Rayleigh is a Weibull with shape = 2 and scale = sqrt(2)*beta
> curve(expr = dweibull(x = x, shape = 2, scale = sqrt(2)*beta.mle), lty = "dotted",
    col = "blue", add = TRUE)
> curve(expr = dweibull(x = x, shape = weibull.mle.optim$par[2],
    scale = weibull.mle.optim$par[1]), lty = "dashed", col = "red", add = TRUE)
> legend(x = 12, y = 0.10, legend = c("Rayleigh", "Weibull"), lty = c("dotted", "dashed"),
    col = c("blue", "red"), bty = "n")

> #CDFs
> plot.ecdf(x = x, verticals = TRUE, do.p = FALSE, main = "CDFs", lwd = 2, col = "black",
    xlab = "x", ylab = "CDF")
> #Line types are chosen to match the legend: Rayleigh dotted, Weibull dashed
> curve(expr = pweibull(q = x, shape = 2, scale = sqrt(2)*beta.mle), lty = "dotted",
    col = "blue", add = TRUE)
> curve(expr = pweibull(q = x, shape = weibull.mle.optim$par[2],
    scale = weibull.mle.optim$par[1]), lty = "dashed", col = "red", add = TRUE)
> legend(x = 10, y = 0.2, legend = c("Rayleigh", "Weibull"), lty = c("dotted", "dashed"),
    col = c("blue", "red"), bty = "n")

The Rayleigh looks pretty good! The fitted PDF and CDF do a decent job of approximating the histogram and EDF of the observed data.
The Weibull does a slightly better job, especially in the left tail.
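
The visual comparison in problem 3 could also be summarized numerically. The sketch below is not part of the original answer: it measures the largest vertical gap between the EDF and each fitted CDF (a Kolmogorov-Smirnov type distance). Because wind_speed.csv is not reproduced here, it assumes simulated stand-in data; the object x.sim, the seed, the simulation parameters, the starting values, and the helper max.gap() are all hypothetical names and values chosen only for illustration.

```r
# Assumption: x.sim is simulated stand-in data, NOT the wind_speed.csv values
set.seed(2015)
x.sim <- rweibull(n = 142, shape = 2.4, scale = 11.5)  # hypothetical parameters

# Rayleigh MLE from the closed form derived in problem 1
beta.hat <- sqrt(sum(x.sim^2) / (2 * length(x.sim)))

# Weibull MLE through optim(), following problem 2; theta = (lambda, gamma)
log.lik.weibull <- function(theta, x) {
  sum(dweibull(x = x, shape = theta[2], scale = theta[1], log = TRUE))
}
fit <- optim(par = c(mean(x.sim), 2), fn = log.lik.weibull, x = x.sim,
  method = "L-BFGS-B", control = list(fnscale = -1),
  lower = c(0.01, 0.01), upper = c(Inf, Inf))

# Largest gap between the EDF and a fitted CDF; the EDF jumps at each
# observation, so both sides of every jump are checked
max.gap <- function(x, cdf, ...) {
  x.sort <- sort(x)
  n <- length(x.sort)
  F.fit <- cdf(x.sort, ...)
  max(abs((1:n)/n - F.fit), abs((0:(n - 1))/n - F.fit))
}

gap.rayleigh <- max.gap(x = x.sim, cdf = pweibull, shape = 2,
  scale = sqrt(2) * beta.hat)
gap.weibull <- max.gap(x = x.sim, cdf = pweibull, shape = fit$par[2],
  scale = fit$par[1])
data.frame(gap.rayleigh, gap.weibull)
```

With the real data, x.sim would be replaced by x, and beta.mle and weibull.mle.optim$par from problems 1 and 2 could be used directly. A smaller gap indicates that the fitted CDF tracks the EDF more closely, although a single maximum gap will not show where along the distribution (for example, the left tail) the two fits differ.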