Precourse Exercises

Instructions
There are 7 exercises and the task is to complete as many of these as possible within 4
hours. Your solutions are to be handed in at the first lecture and we will go through
the exercises together. The exercises contain all necessary mathematics used
throughout the course.
Please note that some of the exercises are quite difficult and I do not expect you to be
able to complete all the exercises, but it is important that you work at least 5 hours on
this task. The reason that I want you to do these exercises is that the students taking
the course have different backgrounds and I think it is important that we establish a
common “mathematical language” as early as possible.
It is sufficient if your solutions are written with pen and paper, and they do not have
to be neatly written.
Notation
Matrix notation
scalars (i.e. single valued parameters): small letters in italic, e.g. x
vectors: small letters in bold, e.g. v
matrices: capital letters in bold, e.g. A
transpose of a matrix: A^T or A'
determinant of a matrix: |A| or det(A)
inverse of a matrix: A^-1
trace of a matrix (i.e. sum of the diagonal elements): tr(A)
matrix rank: r(A)

element i in vector v: v_i
element in row i and column j in matrix A: A_ij
Statistical notation
random variables: capital letters in italic, e.g. X
vector of random variables: small letters in bold, e.g. y
expected value: E(X)
variance: var(X)
covariance: cov(X,Y)
variance-covariance matrix: V(y)
X is normally distributed with mean μ and variance σ_e^2: X ~ N(μ, σ_e^2)


Exercise 1: Matrix Algebra
x  2y  7
3x  4 y  5
a) Solve the set of equations above
b) Write the set of equations in matrix form
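As a numerical cross-check (assuming the system reads x + 2y = 7, 3x + 4y = 5, as reconstructed above), a minimal pure-Python sketch using Cramer's rule:

```python
# Cramer's rule for a 2x2 system [a11 a12; a21 a22][x; y] = [c1; c2].
# The coefficients below are the reconstructed Exercise 1 system.

def solve_2x2(a11, a12, a21, a22, c1, c2):
    """Solve the 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular coefficient matrix")
    x = (c1 * a22 - a12 * c2) / det
    y = (a11 * c2 - c1 * a21) / det
    return x, y

x, y = solve_2x2(1, 2, 3, 4, 7, 5)
print(x, y)  # -9.0 8.0
```

Substituting back (x + 2y = -9 + 16 = 7) confirms the solution.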
Exercise 2: Some More Matrix Algebra
    [1 2]
A = [3 4]

    [5]
b = [7]
a) Calculate det(A)
b) Calculate A^-1
c) Use A^-1 to find the solution to Ax = b
d) Find the eigenvalues of A
e) Check that the sum of eigenvalues equals tr(A), and that the product of
eigenvalues equals det(A).
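A pure-Python sketch of parts a)-e), assuming the reconstructed A = [1 2; 3 4] and b = (5, 7)^T; for a 2x2 matrix the eigenvalues solve the characteristic equation λ^2 - tr(A)λ + det(A) = 0:

```python
import math

# Reconstructed Exercise 2 inputs (an assumption from the garbled source).
A = [[1, 2], [3, 4]]
b = [5, 7]

det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]          # a) det(A)

# b) 2x2 inverse: A^-1 = (1/det) [[d, -b], [-c, a]] for A = [[a, b], [c, d]]
A_inv = [[ A[1][1] / det_A, -A[0][1] / det_A],
         [-A[1][0] / det_A,  A[0][0] / det_A]]

# c) x = A^-1 b
x = [A_inv[0][0] * b[0] + A_inv[0][1] * b[1],
     A_inv[1][0] * b[0] + A_inv[1][1] * b[1]]

# d) eigenvalues from lam^2 - tr(A) lam + det(A) = 0
tr_A = A[0][0] + A[1][1]
disc = math.sqrt(tr_A**2 - 4 * det_A)
lam1, lam2 = (tr_A + disc) / 2, (tr_A - disc) / 2

# e) sum of eigenvalues = tr(A), product of eigenvalues = det(A)
print(det_A, x)        # -2 [-3.0, 4.0]
print(lam1 + lam2, lam1 * lam2)
```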
Exercise 3: Variances and Variance-Covariance Matrices
Let X~N(0,1)
a) var(2X) = ?
b) Let U_i be independent and identically distributed (i.e. "iid") from N(0,1),
   i = 1, 2, 3. Let

           [U_1]
       u = [U_2]
           [U_3]

   i)  V(u) = ?

   ii) Let

               [1 1 0]
           Z = [1 0 1]
               [0 0 1]
               [0 0 1]

       Calculate V(Zu)
Exercise 4: Ordinary Least Squares
1 1 0
1 1
10
1 1 0
1 1
 
W  1 0 1 X  1 0
y  12



 
17
1 0 1
1 0
11
Suppose we have the linear model y = Xb + e, with iid e_i ~ N(0, σ_e^2)

a) Estimate b from the formula b̂ = (X^T X)^(-1) X^T y

b) Calculate r(X)
c) Estimate the residual variance from the formula
   σ̂_e^2 = (y - Xb̂)^T (y - Xb̂) / (N - r(X)), where N = 4
d) Find r(W). Solve this question by making a good guess or by calculating the
number of non-zero eigenvalues of W^T W.
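A pure-Python sketch of parts a) and c), using the X and y above; since X has only two columns, the normal equations reduce to inverting a 2x2 matrix:

```python
# OLS for the 4-observation design in Exercise 4.
X = [[1, 1], [1, 1], [1, 0], [1, 0]]
y = [10, 12, 17, 11]
N = len(y)

# X^T X (2x2) and X^T y (2-vector)
XtX = [[sum(X[n][i] * X[n][j] for n in range(N)) for j in range(2)]
       for i in range(2)]
Xty = [sum(X[n][i] * y[n] for n in range(N)) for i in range(2)]

# b_hat = (X^T X)^-1 X^T y, via the explicit 2x2 inverse
det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
b_hat = [( XtX[1][1] * Xty[0] - XtX[0][1] * Xty[1]) / det,
         (-XtX[1][0] * Xty[0] + XtX[0][0] * Xty[1]) / det]

# residual variance with r(X) = 2 (the two columns are linearly independent)
resid = [y[n] - (X[n][0] * b_hat[0] + X[n][1] * b_hat[1]) for n in range(N)]
sigma2_hat = sum(r * r for r in resid) / (N - 2)
print(b_hat, sigma2_hat)  # [14.0, -3.0] 10.0
```

The fitted residual variance of 10 matches the value that Exercise 7 treats as known.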
Exercise 5: Numerical methods
Suppose we have the equation f(x) = 0, where f(x) is some function of x. We can find
a solution to this problem with the iterative procedure of Newton-Raphson:
x_(i+1) = x_i - f(x_i) / f'(x_i)

where the subscript denotes the iteration number and x_0 is the starting value.
Let f(x) = x^2 - 2x
a) For which values of x is f(x) = 0?
b) Find an approximation of these values by calculating x_2 in the Newton-Raphson procedure with x_0 = -1
c) Repeat the previous calculations with x_0 = 3
d) How can one use the Newton-Raphson procedure to find an approximate value
of x that maximizes y = (1/3)x^3 - x^2 + 7?
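A sketch of the iteration for parts b) and c); each step applies the update formula above to f(x) = x^2 - 2x:

```python
# Newton-Raphson for f(x) = x^2 - 2x, whose exact roots are x = 0 and x = 2.
def newton_raphson(x0, f, fprime, steps):
    """Run `steps` Newton-Raphson updates starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

f = lambda x: x**2 - 2*x
fp = lambda x: 2*x - 2

print(newton_raphson(-1, f, fp, 2))  # x_2 = -0.025, approaching the root 0
print(newton_raphson(3, f, fp, 2))   # x_2 = 2.025, approaching the root 2
```

For part d), note that a stationary point of y satisfies y'(x) = x^2 - 2x = 0, so the same iteration applied to f = y' locates the candidate maximizer.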
Exercise 6: Matrixdifferentiation
    [1 4]
M = [3 3]

b = [3 5]

    [x_1]
x = [x_2]
Let c = bx, p = Mx and q = x^T M x

Calculate the following derivatives:

a) ∂c/∂x_1

b) ∂c/∂x

c) ∂p/∂x

d) ∂q/∂x
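The standard results are ∂c/∂x = b^T, ∂p/∂x = M, and ∂q/∂x = (M + M^T)x. A finite-difference sketch verifying the quadratic-form rule at an arbitrary point, assuming the reconstructed M = [1 4; 3 3]:

```python
# Check d q/d x = (M + M^T) x numerically for q = x^T M x.
M = [[1, 4], [3, 3]]   # reconstructed from the garbled source (an assumption)
x = [2.0, -1.0]        # arbitrary test point
h = 1e-6               # finite-difference step

def q(x):
    # q = x^T M x
    return sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2))

# analytic gradient: (M + M^T) x
grad = [sum((M[i][j] + M[j][i]) * x[j] for j in range(2)) for i in range(2)]

# central finite differences, one coordinate at a time
nums = []
for i in range(2):
    xp = list(x); xp[i] += h
    xm = list(x); xm[i] -= h
    nums.append((q(xp) - q(xm)) / (2 * h))

print(grad, nums)  # the two lists should agree
```

When M is symmetric, (M + M^T)x simplifies to the familiar 2Mx.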

Exercise 7: Likelihood Theory
Suppose we have the same linear model as in Exercise 4:
y = Xb + e, with iid e_i ~ N(0, σ_e^2).
But in the present exercise we assume that the residual variance is known: σ_e^2 = 10.
The likelihood L is the probability of observing y given a parameter value of b,
i.e. L = P(y | b). We can also define a likelihood for each single observation,
L_i = P(y_i | b).
Since we assume that the residuals are normally distributed we have that

L_i = (1/√(2πσ_e^2)) exp(-(y_i - X_i b)^2 / (2σ_e^2)), where X_i is the i-th row of X
Then the likelihood for all observations is the product of L_1, L_2, L_3 and L_4. That is

L = Π_{i=1}^{N} L_i
We wish to estimate b by maximizing the likelihood L, which is equivalent to
maximizing the logarithm of the likelihood, l = log(L). So we can get the maximum
likelihood estimates by solving ∂l/∂b = 0.
a) Show that l = N log(1/√(2πσ_e^2)) - (1/(2σ_e^2)) Σ_{i=1}^{N} (y_i - X_i b)^2
b) Check that l = -(N/2) log(2πσ_e^2) - (1/(2σ_e^2)) (y - Xb)^T (y - Xb)
c) Find the maximum likelihood estimate of b
d) ∂l/∂b is called the gradient of l. The matrix of second derivatives is called the
   Hessian and is denoted H. The Hessian is symmetric, and in our example we have

       [∂^2 l/∂b_1^2      ∂^2 l/∂b_1∂b_2]
   H = [∂^2 l/∂b_2∂b_1    ∂^2 l/∂b_2^2  ]

   Show that H = -(1/σ_e^2) X^T X in our example.
e) The observed Fisher information matrix is defined as I = -H, and maximum
   likelihood theory gives that b̂ ~ N(b, I^(-1)). Find the variance-covariance
   matrix of b̂. What are the standard errors of b̂_1 and b̂_2?
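A sketch of parts d)-e): with H = -(1/σ_e^2) X^T X, the variance-covariance matrix of b̂ is I^(-1) = σ_e^2 (X^T X)^(-1), computed here with the Exercise 4 design matrix and the given σ_e^2 = 10:

```python
import math

# Variance-covariance matrix of b_hat: I^-1 = sigma2 * (X^T X)^-1.
X = [[1, 1], [1, 1], [1, 0], [1, 0]]   # design matrix from Exercise 4
sigma2 = 10                            # residual variance, given as known

XtX = [[sum(X[n][i] * X[n][j] for n in range(4)) for j in range(2)]
       for i in range(2)]
det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
XtX_inv = [[ XtX[1][1] / det, -XtX[0][1] / det],
           [-XtX[1][0] / det,  XtX[0][0] / det]]

# variance-covariance matrix of b_hat and the standard errors (sqrt of diagonal)
V = [[sigma2 * XtX_inv[i][j] for j in range(2)] for i in range(2)]
se = [math.sqrt(V[0][0]), math.sqrt(V[1][1])]
print(V)   # [[5.0, -5.0], [-5.0, 10.0]]
print(se)  # standard errors sqrt(5) and sqrt(10)
```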