OLLSCOIL NA hÉIREANN MÁ NUAD
NATIONAL UNIVERSITY OF IRELAND, MAYNOOTH

B.SC. COMPUTER SCIENCE EXAMINATION
B.SC. COMPUTER SCIENCE AND SOFTWARE ENGINEERING EXAMINATION
MASTER OF COMPUTER SCIENCE EXAMINATION

AUTUMN 2003

PAPER CS416
ADAPTIVE MODELLING AND PREDICTION OF THE REAL WORLD

Dr. R. Procter, Prof. R. Reilly, Dr. R. Shorten, Prof. D. Leith

Attempt any THREE questions. Time allowed: 2 hours.

1 (a) Explain the following terms: [6 marks]
      (i) recursive least squares;
      (ii) persistence of excitation;
      (iii) a regressor matrix.

  (b) Explain the principle of least squares. [6 marks]

  (c) Use the principle of least squares to fit a line of the form [15 marks]

          y = ax + b

      to the following data points.

          x | -1   0   1   2   3   4   5   6
          y | -10  1   1   2   4   3   1   1

  (d) Explain how the recursive least squares algorithm can be modified to identify the parameters of a time-varying linear-in-parameters model. Explain clearly the effect of the forgetting factor. [6 marks]

2 (a) Explain the following terms: [6 marks]
      (i) an initial value problem;
      (ii) global truncation error;
      (iii) a system of ODEs.

  (b) Explain briefly the Euler and Runge-Kutta methods for approximating the solution to an initial value problem. [6 marks]

  (c) The fourth-order Runge-Kutta formula for solving the first-order initial value problem [15 marks]

          dy/dt = f(t, y),    y(t_0) = y_0,

      yields the following equations:

          ŷ_{k+1} = ŷ_k + (h/6)(k_1 + 2k_2 + 2k_3 + k_4)

      where

          k_1 = f(t_k, ŷ_k)
          k_2 = f(t_k + h/2, ŷ_k + h k_1 / 2)
          k_3 = f(t_k + h/2, ŷ_k + h k_2 / 2)
          k_4 = f(t_k + h, ŷ_k + h k_3)

      Apply the fourth-order Runge-Kutta method to approximate the solution to the initial value problem

          dy/dt = t + y,    y(0) = 0,    h = 0.1

      at t = 0, 0.1, 0.2 seconds respectively.

  (d) The Euler method is called a first-order method. Why? [5 marks]

3 (a) Define what is meant by a linear difference equation. Write down the explicit solution to an unforced first-order linear difference equation. When is the system stable, unstable, or marginally stable? [6 marks]

  (b) The solutions to linear difference equations and linear differential equations are closely related. Discuss. [6 marks]

  A difference equation relates an output y_n to its previous values (y_{n-1}, y_{n-2}, ..., y_{n-m}). Say we have

          y_i = Σ_{j=1}^{m} θ_j y_{i-j}

  Suppose that the parameters θ_j are unknown and we would like to estimate them from observed data. Letting Y_{i-1} denote the vector (y_{i-1}, y_{i-2}, ..., y_{i-m}), we have a sample of N pairs of data (y_i, Y_{i-1}), i = 1, 2, ..., N, contaminated by Gaussian white noise with mean zero and variance σ².

  (c) How could this estimation task be formulated as a linear regression problem? What are the least-squares parameter estimates? [10 marks]

  (d) Let θ̂_j denote the estimated parameter values. Decompose the mean square error E[(θ_j − θ̂_j)²] into bias and variance components. Is the bias zero? How does working with a dynamic system differ from working with a static one in terms of the impact on the bias and variance? [11 marks]
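For question 1(c), the fit can be checked numerically. The following is an illustrative Python sketch rather than a model answer; it assumes numpy and builds the regressor matrix for the model y = ax + b directly from the data in the question:

```python
import numpy as np

# Data points from question 1(c).
x = np.array([-1, 0, 1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([-10, 1, 1, 2, 4, 3, 1, 1], dtype=float)

# Regressor matrix for y = a*x + b: one column of x values,
# one column of ones for the intercept b.
X = np.column_stack([x, np.ones_like(x)])

# Least-squares estimate: theta = (X^T X)^{-1} X^T y.
theta, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
a, b = theta
print(f"a = {a:.4f}, b = {b:.4f}")
```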
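For question 1(d), a minimal sketch of one recursive least squares update with a forgetting factor λ might look as follows (numpy assumed; the function name and the choice λ = 0.98 are illustrative). With λ < 1, old data is discounted exponentially, so the estimator can track time-varying parameters:

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=0.98):
    """One recursive least squares step with forgetting factor lam.

    theta : current parameter estimate, shape (n,)
    P     : current covariance matrix, shape (n, n)
    phi   : regressor vector for this sample, shape (n,)
    y     : observed output (scalar)
    """
    phi = phi.reshape(-1, 1)
    # Gain vector.
    K = P @ phi / (lam + phi.T @ P @ phi)
    # Prediction error (innovation).
    e = y - float(phi.T @ theta.reshape(-1, 1))
    # Parameter and covariance updates; dividing P by lam keeps the
    # covariance from shrinking to zero, so adaptation never stops.
    theta = theta + K.flatten() * e
    P = (P - K @ phi.T @ P) / lam
    return theta, P
```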
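The Runge-Kutta equations in question 2(c) translate almost line for line into code. A self-contained Python sketch, assuming the right-hand side f(t, y) = t + y from the question:

```python
def rk4_step(f, t, y, h):
    """One fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# Initial value problem from question 2(c): dy/dt = t + y, y(0) = 0, h = 0.1.
f = lambda t, y: t + y
t, y, h = 0.0, 0.0, 0.1
for _ in range(2):  # advance to t = 0.1 and t = 0.2
    print(f"t = {t:.1f}, y ~ {y:.6f}")
    y = rk4_step(f, t, y, h)
    t += h
print(f"t = {t:.1f}, y ~ {y:.6f}")
```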
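The formulation asked for in question 3(c) amounts to stacking the lagged outputs Y_{i-1} as rows of a regressor matrix and solving an ordinary least-squares problem. A hedged Python sketch (numpy assumed; the helper name and the synthetic test data are illustrative):

```python
import numpy as np

def fit_ar_parameters(y, m):
    """Least-squares estimate of theta in y_i = sum_j theta_j * y_{i-j}.

    Each row of the regressor matrix Phi is Y_{i-1} = (y_{i-1}, ..., y_{i-m}),
    so the model becomes the linear regression y = Phi @ theta.
    """
    y = np.asarray(y, dtype=float)
    Phi = np.column_stack([y[m - j : len(y) - j] for j in range(1, m + 1)])
    theta, *_ = np.linalg.lstsq(Phi, y[m:], rcond=None)
    return theta

# Illustrative use on data from y_i = 0.5*y_{i-1} - 0.2*y_{i-2} + noise.
rng = np.random.default_rng(0)
y = [1.0, 0.5]
for _ in range(200):
    y.append(0.5 * y[-1] - 0.2 * y[-2] + 0.1 * rng.standard_normal())
print(fit_ar_parameters(y, m=2))  # estimates near (0.5, -0.2)
```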
4 Suppose y is related to the inputs (x_1, x_2, ..., x_p) by

          y = f(x_1, x_2, ..., x_p)

  Let X denote the vector of inputs (x_1, x_2, ..., x_p). The function f is unknown, but we have a sample of N pairs of measurements (y_i, X_i), i = 1, 2, ..., N. Using this sample, our task is to estimate f.

  (a) Write down short pseudo-code for a k-nearest-neighbour estimator. [6 marks]

  (b) The k-nearest-neighbour estimator can be biased near the edges of the training data. Briefly discuss the reasons for this. [6 marks]

  (c) Local linear regression can be used to reduce such edge bias. How would the k-nearest-neighbour pseudo-code need to be modified to implement a local linear regression estimator? [10 marks]

  (d) k-nearest-neighbour estimators rely on the function being roughly constant over a neighbourhood, which in turn requires that the neighbourhood be "local" or "small". Suppose that the input points X_i are distributed uniformly over a unit p-dimensional cube. Suppose also that our nearest-neighbour estimator uses a neighbourhood which includes a fraction r of the data points. How do the size and volume of this neighbourhood vary as the input dimension p increases? What are the implications for the k-nearest-neighbour estimator? [11 marks]
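Questions 4(a) and 4(c) ask for pseudo-code; one possible concrete Python rendering (numpy assumed, illustrative rather than a model answer) is sketched below. The local linear variant fits a plane over the same k-point neighbourhood and evaluates it at the query point, which removes the first-order edge bias of the plain average:

```python
import numpy as np

def knn_estimate(X, y, x0, k):
    """Plain k-nearest-neighbour estimate of f(x0): average the y values
    of the k training points closest to x0 (Euclidean distance)."""
    d = np.linalg.norm(X - x0, axis=1)
    idx = np.argsort(d)[:k]
    return y[idx].mean()

def local_linear_estimate(X, y, x0, k):
    """Local linear regression over the same neighbourhood: fit an
    intercept plus a linear term in the centred inputs, then read off
    the intercept, which is the fitted value at x0."""
    d = np.linalg.norm(X - x0, axis=1)
    idx = np.argsort(d)[:k]
    A = np.column_stack([np.ones(k), X[idx] - x0])
    beta, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return beta[0]
```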
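For question 4(d): a cubical neighbourhood capturing a fraction r of points distributed uniformly over the unit p-cube must itself have volume r, so its edge length is r^(1/p). A few lines of Python make the growth with p concrete (the choice r = 0.01 is illustrative):

```python
# Edge length of a cubical neighbourhood of volume r in the unit p-cube.
r = 0.01  # neighbourhood holds 1% of the data
for p in (1, 2, 10, 100):
    print(f"p = {p:3d}: edge length = {r ** (1 / p):.3f}")
# Even for 1% of the data, the edge length approaches 1 as p grows, so
# the "neighbourhood" spans almost the entire range of every input: it
# is no longer local, and the roughly-constant assumption breaks down.
```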