Stat 579: Problem-Solving in R
Ranjan Maitra
2220 Snedecor Hall
Department of Statistics
Iowa State University.
Phone: 515-294-7757
maitra@iastate.edu

Additional Tips on Writing Functions in R

When a function expects another function as an argument, an entire function definition can be written in place of that argument:

    m <- matrix(rnorm(n = 20), ncol = 5)
    apply(m, 2, function(x) sum(x^3))   # sum of cubes within each column

In a function, it is possible to assign a default value to an argument (alpha, say) by stating the argument in the form name=value.

    cint2 <- function(y, alpha = .05) {
      n <- length(y)
      ybar <- mean(y)
      s <- sqrt(var(y) / n)               # standard error of the mean
      tvalue <- qt(1 - alpha / 2, n - 1)  # t quantile with n - 1 degrees of freedom
      w <- tvalue * s                     # half-width of the interval
      return(list(Lower = ybar - w, Upper = ybar + w))
    }
    attach(chickwts)
    cint2(weight, 0.05)
    cint2(weight)                # same as the previous call
    cint2(weight, alpha = 0.01)

Passing Varying Arguments to Functions

A varying number of arguments can be passed into a function by including the three dots, i.e., ..., as an argument, usually as the last one in the argument list.

This technique is suitable when it is not known in advance how many arguments will be passed to the function. This typically occurs when another function is used inside the definition of the original function and the number of parameters needed to execute that second function is not known in advance.

In the body of a function definition, ... is usually used as an argument to another function called inside the definition. In this case, any such arguments are passed directly on to the second function exactly as they were specified in the original function call.

Passing Variable number of Arguments to Functions – Example

As a simple example, consider writing a function named oneway() that creates boxplots for each level of a factor and also returns an analysis of variance table.

It is assumed that the second argument to oneway() is an R factor object. Here the ... argument is used to provide additional information to the plot() function, and is therefore passed directly on to plot():

    oneway <- function(y, trt, data = NULL, ...) {
      plot(trt, y, ...)                # boxplots of y for each level of trt
      aov.out <- aov(y ~ trt, data)
      return(summary(aov.out))
    }
    oneway(weight, feed, data = chickwts)
    oneway(weight, feed, data = chickwts, ylab = "Weight", main = "Boxplot by Feed")

Maximum Likelihood Estimation (MLE)

More details will be supplied in Stat 342, 447 or 542.

MLE is a primary tool for estimation in parametric models. Most of the likelihood functions you will see will involve one or two parameters.

Methods of calculus are used to maximize the logarithm of the likelihood function ("loglikelihood") as a function of the parameters. The first derivatives of the loglikelihood with respect to the parameters are set equal to zero, and the solutions are the maximum likelihood estimates of the parameters, provided they maximize the loglikelihood.

In many of these problems, closed-form solutions can be found, so iterative methods are not necessary. Here we look at a problem where an iterative method is needed to maximize the loglikelihood.

MLE for Cauchy Distribution – I

The Cauchy density with scale parameter equal to one is

$$f(x) = \frac{1}{\pi\,[1 + (x - \theta)^2]}, \qquad -\infty < x < \infty.$$

Based on n observations $x_1, x_2, \ldots, x_n$, the likelihood function is

$$L(\theta) = \prod_{i=1}^{n} f(x_i)$$

and the loglikelihood function is

$$\log L(\theta) = \ell(\theta) = \sum_{i=1}^{n} \log f(x_i) = -n \log \pi - \sum_{i=1}^{n} \log[1 + (x_i - \theta)^2].$$

MLE for Cauchy Distribution – II

The first and second derivatives are, respectively,

$$\frac{\partial \ell}{\partial \theta} = \sum_{i=1}^{n} \frac{2(x_i - \theta)}{1 + (x_i - \theta)^2}$$

$$\frac{\partial^2 \ell}{\partial \theta^2} = \sum_{i=1}^{n} \frac{2(x_i - \theta)^2 - 2}{(1 + (x_i - \theta)^2)^2}$$

Suppose that 1.8, 10.2, 3.0, 5.2, −.5, −11, 2.4, 2.9, 3.9, 1.8 is a sample from this distribution.
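The two derivative formulas can be checked numerically before they are relied on. The sketch below compares each analytic derivative with a central finite difference at θ = 2, using the sample just given. The helper names (cauchy_loglik, cauchy_score, cauchy_hess), the evaluation point theta0 and the step size h are illustrative assumptions, not part of the original notes.

```r
# Sanity check of the Cauchy loglikelihood derivatives (illustrative sketch).
# All names below are hypothetical helpers chosen for this check only.
x_chk <- c(1.8, 10.2, 3.0, 5.2, -0.5, -11, 2.4, 2.9, 3.9, 1.8)

# Loglikelihood and its analytic first and second derivatives in theta
cauchy_loglik <- function(theta, x) sum(-log(pi) - log(1 + (x - theta)^2))
cauchy_score  <- function(theta, x) sum(2 * (x - theta) / (1 + (x - theta)^2))
cauchy_hess   <- function(theta, x) sum((2 * (x - theta)^2 - 2) / (1 + (x - theta)^2)^2)

theta0 <- 2      # assumed evaluation point
h <- 1e-4        # assumed finite-difference step

# Central differences: of the loglikelihood (for the score) and of the
# score (for the second derivative)
score_fd <- (cauchy_loglik(theta0 + h, x_chk) - cauchy_loglik(theta0 - h, x_chk)) / (2 * h)
hess_fd  <- (cauchy_score(theta0 + h, x_chk) - cauchy_score(theta0 - h, x_chk)) / (2 * h)

# Absolute discrepancies between numerical and analytic derivatives
score_err <- abs(score_fd - cauchy_score(theta0, x_chk))
hess_err  <- abs(hess_fd - cauchy_hess(theta0, x_chk))
print(c(score_err = score_err, hess_err = hess_err))
```

Both discrepancies should be tiny relative to the derivative values themselves; a large discrepancy would point to a sign or algebra slip in the formulas.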
    logc <- function(theta, x) sum(-log(pi) - log(1 + (x - theta)^2))
    x <- c(1.8, 10.2, 3.0, 5.2, -.5, -11, 2.4, 2.9, 3.9, 1.8)
    theta <- seq(0, 10, length.out = 100)
    plot(theta, sapply(theta, logc, x), type = "l", ylab = "LogL of Cauchy")

To obtain the maximum likelihood estimate of θ, solve

$$g(\theta) = \frac{\partial \ell}{\partial \theta} = 0$$

using Newton-Raphson, noting that $g'(\theta) = \dfrac{\partial^2 \ell}{\partial \theta^2}$.

MLE for Cauchy Distribution in R – I

First create R functions g() and derg() to compute $g$ and $g'$ at a given θ, and then a function newton2() that carries out the Newton-Raphson iteration:

    g <- function(theta) sum(2 * (x - theta) / (1 + (x - theta)^2))
    derg <- function(theta) sum((2 * (x - theta)^2 - 2) / (1 + (x - theta)^2)^2)

    newton2 <- function(fun, derf, x0, eps, nlim) {
      iter <- 0
      repeat {
        iter <- iter + 1
        if (iter > nlim) {
          cat(" Iteration Limit Exceeded: Current = ", iter, fill = TRUE)
          x1 <- NA
          break
        }
        x1 <- x0 - fun(x0) / derf(x0)   # Newton-Raphson update
        # stop when the step is small or fun is essentially zero
        if (abs(x0 - x1) < eps || abs(fun(x1)) < 1.0e-12) break
        x0 <- x1
        cat("****** Iter. No: ", iter, " Current Iterate = ", x1, fill = TRUE)
      }
      return(x1)
    }

MLE for Cauchy Distribution in R – II

    newton2(g, derg, 2.0, .00001, 100)

Also try starting values −11, −1, 0, 4, 8, and 38.

In the above implementation of newton2(), we are taking advantage of the fact that the data object x is available in the global environment, where g() and derg() find it when they are evaluated for each value of θ inside newton2(). To see this, let us first remove x and then try newton2() again:

    rm(x)
    newton2(g, derg, 2.0, .00001, 100)   # fails: object 'x' not found

Thus, strictly, this is not good programming practice. The data object x needs to be passed into the Newton-Raphson function as an additional argument, to be used by both g() and derg(). We will do this using the three dots, i.e., ..., as an argument, as follows:

MLE for Cauchy Distribution in R – III

    newton3 <- function(fun, derf, x0, eps, nlim, ...) {
      iter <- 0
      repeat {
        iter <- iter + 1
        if (iter > nlim) {
          cat(" Iteration Limit Exceeded: Current = ", iter, fill = TRUE)
          x1 <- NA
          break
        }
        # arguments in ... (here, the data) are handed on to fun and derf
        x1 <- x0 - fun(x0, ...) / derf(x0, ...)
        if (abs(x0 - x1) < eps || abs(fun(x1, ...)) < 1.0e-12) break
        x0 <- x1
        cat("****** Iter. No: ", iter, " Current Iterate = ", x1, fill = TRUE)
      }
      return(x1)
    }

Now the data vector can be passed as an additional argument to newton3(). However, we need to redefine g() and derg() so that they take the data as a second argument:

MLE for Cauchy Distribution in R – IV

    gg <- function(theta, x) sum(2 * (x - theta) / (1 + (x - theta)^2))
    dergg <- function(theta, x) sum((2 * (x - theta)^2 - 2) / (1 + (x - theta)^2)^2)
    x <- c(1.8, 10.2, 3.0, 5.2, -.5, -11, 2.4, 2.9, 3.9, 1.8)
    newton3(gg, dergg, 2.0, 0.00001, 100, x)

    ****** Iter. No: 1 Current Iterate = 2.63186549863111
    ****** Iter. No: 2 Current Iterate = 2.61528673874304
    2.61527699859479

Alternatively, the R univariate optimization function optimize() could be used to find the maximum of log L(θ) directly. We have two possible ways of calling this function: wrapping logc() in an anonymous function of θ alone, or passing the data by name so that optimize() hands it on to logc() through its ... argument:

    optimize(function(theta) sapply(theta, logc, x), c(0, 10), maximum = TRUE)
    optimize(logc, interval = c(0, 10), x = x, maximum = TRUE)

Scaling a Computing Problem

Recall the function myexp():

    myexp <- function(x) {
      y <- abs(x); i <- 0; eps <- 1.e-10; sum <- 1
      repeat {
        i <- i + 1
        term <- y^i / factorial(i)   # i-th term of the Taylor series
        sum <- sum + term
        if (term <= eps) break
      }
      if (x < 0) sum <- 1 / sum
      return(sum)
    }

While this gives the correct answer for small values of x, it does not work for large values:

    myexp(5)
    myexp(100)

The reason is that, for relatively large values of x, the numerator and denominator of the term $y^i/i!$ both grow extremely large as i increases, overflowing before their ratio can become sufficiently small.

Scaling a Computing Problem – continued

For this reason, the problem needs to be scaled before the Taylor series expansion is applied.

One way to do this is to express x in the form $x = N \log 2 + g$. This could be done by choosing N such that $N = \lfloor x / \log 2 \rfloor$.

This transforms x to g, where $0 \le g < \log 2$ (rounding $x/\log 2$ to the nearest integer instead gives $|g| \le (\log 2)/2$), so that myexp() can now be applied to g and $\exp(x)$ obtained from the relation $\exp(x) = \exp(g) \times 2^N$.

    options(digits = 15)
    100 / log(2)
    g <- 100 - 144 * log(2)
    myexp(g)
    myexp(g) * 2^144
    exp(100)
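The scaling step can also be wrapped into a single function. The sketch below is one possible arrangement, not the course's own code: taylor_exp() is a hypothetical stand-in for myexp() (renamed so the block is self-contained), and myexp_scaled() is a hypothetical wrapper. It assumes a scalar x and uses round() to pick N, so that |g| ≤ (log 2)/2.

```r
# Illustrative sketch: range reduction followed by a Taylor series.
# taylor_exp() and myexp_scaled() are hypothetical names for this example.

# Taylor series for exp(x), intended for small |x| only (scalar input)
taylor_exp <- function(x, eps = 1e-10) {
  y <- abs(x)
  i <- 0
  total <- 1
  repeat {
    i <- i + 1
    term <- y^i / factorial(i)   # i-th term of the series
    total <- total + term
    if (term <= eps) break
  }
  if (x < 0) total <- 1 / total
  total
}

# Write x = N * log(2) + g with N the nearest integer to x / log(2),
# then use exp(x) = exp(g) * 2^N
myexp_scaled <- function(x) {
  N <- round(x / log(2))
  g <- x - N * log(2)
  taylor_exp(g) * 2^N
}

# Relative errors against the built-in exp()
rel_err_pos <- abs(myexp_scaled(100) / exp(100) - 1)
rel_err_neg <- abs(myexp_scaled(-100) / exp(-100) - 1)
print(c(rel_err_pos = rel_err_pos, rel_err_neg = rel_err_neg))
```

Because the series is only ever evaluated at a small g, neither the powers nor the factorials come anywhere near overflow, and the multiplication by a power of two introduces no rounding error as long as the result stays within the floating-point range.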