LINEAR ALGEBRA IN ENSEMBLE FORECAST METHOD

Melissa Ackerman
NASA Ames Research Center
September 6, 2004
Reviewed by NASA USRP Mentor Samson Cheung, NAS Division

Abstract

In this paper, the relationship between the eigenvalues of the matrix representing the weather prediction model and the singular values generated by the ensemble forecast technique is reported. Understanding this relationship may allow us to enhance the ensemble forecast technique.

Introduction

The weather and climate on Earth affect both our everyday activities and our long-term plans. In a matter of hours the weather can change from sunny to rainy, the temperature can drop from 90 degrees to 60 degrees, and tornados can sweep through town. So how do we figure out whether lunchtime will be good for a picnic, or whether we should take an umbrella to work? After decades of research and observation of the Earth's atmosphere, scientists have developed various mathematical models for predicting the weather from several hours up to several days ahead. One of the techniques used by meteorologists is called ensemble forecasting. The forecasting process begins with some initial data, taken either from a previous calculation or from satellite and/or ground measurements. This physical data ranges from temperature to pressure to moisture content, and more, at different places around the globe. From this initial collection of information, a mathematical weather prediction model can be used to project what will occur through time: the forecast. However, the forecast can never be entirely precise, because numerical error exists in the mathematical (numerical) model and the atmosphere itself is unstable. In this paper, the method of ensemble forecasting is presented. The method involves tracking the growth of the instability of the numerical model, and rescaling the error in order to prevent chaotic behavior in the model.
The tracking of the numerical instability uses the Singular Value Decomposition. On the other hand, the matrix representing the numerical model is associated with a set of eigenvalues. The goal of this study is to observe whether a pattern exists between the eigenvalues and the singular values generated by the ensemble forecast, and to see if there is some way to use this pattern to aid in the prediction of the weather.

The Model

In this study we use a simple mathematical model to represent atmospheric behavior: the Lorenz equation [1],

    dx_i/dt = x_(i-1) * (x_(i+1) - x_(i-2)) - x_i + F        (1)

for i = 1, 2, ..., 40, where the indices wrap around (x_41 = x_1, and so on) so that the variables form a cyclic chain. The above equation is a set of forty ordinary differential equations (ODEs). Each variable, x_1 through x_40, represents an unspecified scalar meteorological quantity, such as temperature, at forty equally spaced grid sites on a latitude circle, shown in figure 2A. A Matlab program solving the Lorenz equations is in Appendix I. The model shares certain properties with many complex atmospheric models, such as the nonlinear advection term, the linear term representing loss of energy to thermal dissipation, and the constant forcing term F to provide energy. The model is chaotic for certain values of the forcing parameter.

Figure 1: Graphs produced from the model of x_1 versus x_2 for different forcing constants, F, when advanced through time. The graph on the left has a forcing constant of two and the one on the right is for F equal to eight.

Figure 1 is composed of two graphs generated by solving the ODE (equation 1) using ode45 in Matlab. In order to eliminate the transient state of the solution, we throw away the first 1000 time steps; thus, figure 1 shows what the model looks like from time steps 1000 to 5000 for different forcing constants F. The variable x_1 is on the abscissa and x_2 is on the ordinate.
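Equation (1) is straightforward to integrate numerically. The Python sketch below is a stand-in for the paper's Matlab program (Appendix I), using SciPy's solve_ivp in place of ode45; the initial condition and time span are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz96(t, x, F):
    # Equation (1): dx_i/dt = x_(i-1)*(x_(i+1) - x_(i-2)) - x_i + F.
    # np.roll implements the cyclic chain (x_41 = x_1, etc.).
    return np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - x + F

F = 8.0
x0 = F * np.ones(40)
x0[19] += 0.01                     # nudge one site off the uniform fixed point

t_eval = np.linspace(0.0, 50.0, 5001)
sol = solve_ivp(lorenz96, (0.0, 50.0), x0, t_eval=t_eval, args=(F,), rtol=1e-8)

# Discard the transient before plotting x_1 against x_2 (cf. figure 1).
x1, x2 = sol.y[0, 1000:], sol.y[1, 1000:]
```

With F = 8 the trajectory wanders chaotically over the attractor, while F = 2 settles into the more regular pattern described above.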
The graph on the left has a forcing constant of two and the one on the right has a forcing constant of eight. Visibly, the graph with the higher forcing constant is a "purer" chaos, whereas the graph with the lower forcing constant is somewhat predictable. This result of more chaos at higher levels of energy is to be expected. However, in reality the forcing is usually large. Therefore, we must find a way to control the chaotic behavior.

In numerical weather forecasting, scientists are particularly interested in how errors grow in time and how to control them; understanding the growth of errors may help to control it. To observe the growth of a numerical error, one can first identify a solution of the numerical model. This solution is often called the "control" solution. One can add a small error onto this control solution, i.e., perturb the control solution slightly, then advance both the control and the perturbed solution with the forecast model. The deviation between the control and the perturbed solutions can then be observed, revealing the error growth of the numerical model.

Figure 2A: A pictorial view of the model, where the stars represent the forty locations around the world where data is collected. Figure 2B: A graphical representation of the data collected at each location, such as temperature and pressure. Figure 2C: A pictorial view of the advancement of a control solution and its perturbation solutions through time.

The large sphere of figure 2A represents the world. The stars on the rim of the sphere represent the forty different positions around the world where solutions of a forecast model are calculated. For instance, one star could represent San Francisco, California; another could be London, England; and another Moscow, Russia. At each position, the forecast solution, such as the temperature and pressure, would be calculated.
Graphically, the data would appear two-dimensionally for temperature and pressure, as shown in figure 2B. If we added moisture content, it would be a three-dimensional graph. The graphics in figure 2C represent the evolution through time of a control solution of the ODE (the larger star) and its perturbation solutions (the smaller stars). The original set of perturbations is bounded in phase space by a forty-dimensional sphere, since each perturbation has forty components. As time advances, we can see that the ellipse being generated is expanding in only a few dimensions; the cloud of perturbations is collapsing onto a few directions. Our goal is to prevent this by rescaling the ellipse if any one of its axes becomes too large.

Ensemble Forecasting

Ensemble forecasting is a process. The first step is to take a solution of the model, the control, and advance it through time. The next step is to mimic the existence of measurement error by perturbing the control solution. The vector difference between any perturbed solution and the control is known as a bred vector. Initially, all of the bred vectors are randomly oriented in the mathematical vector space, each vector pointing from the control to a perturbation. These bred vectors are bounded by a sphere with forty dimensions in the vector space. Suppose there are P perturbations from the control. Evolving the P perturbations and the control in time is known as breeding. We generate a bred vector matrix B by inserting the bred vectors as the columns of the matrix. The next step is to advance the control and the perturbations through time. As shown in figure 2C, as time evolves the perturbations wander away from the control solution, i.e., the norms of the bred vectors grow. At this point, the bred vectors are bounded by an ellipse, rather than a sphere, in the vector space. Next, we take the singular value decomposition (SVD) of B.
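The breeding cycle just described can be sketched in Python; the paper's own implementation is the Matlab script in Appendix II. The `advance` helper, initial state, and 0.5-unit breeding interval below are simplifying assumptions of this sketch, not the paper's exact code.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz96(t, x, F=8.0):
    # Equation (1) with cyclic indexing handled by np.roll.
    return np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - x + F

def advance(x, dt=0.5):
    # March a state forward dt time units (stand-in for ode45).
    return solve_ivp(lorenz96, (0.0, dt), x, rtol=1e-8).y[:, -1]

rng = np.random.default_rng(0)
N, P = 40, 40
control = 8.0 + rng.standard_normal(N)                       # control solution
perturbed = control[:, None] + 0.1 * (rng.random((N, P)) - 0.5)

# One breeding step: advance the control and all P perturbations,
# then collect the bred vectors as the columns of B.
control = advance(control)
perturbed = np.column_stack([advance(perturbed[:, p]) for p in range(P)])
B = perturbed - control[:, None]

# The singular values of B are the semi-axes of the bred-vector ellipse.
sigma = np.linalg.svd(B, compute_uv=False)
```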
The result is B = UΣV^T, where U and V are orthogonal matrices containing the left singular vectors and right singular vectors, respectively, and Σ is a diagonal matrix whose elements are the singular values of B. The singular values can be viewed as the lengths of the axes of a new ellipse, giving us a way to evaluate the variance in the bred vectors [2]. We then calculate the volume of the ellipse via the equation [3]

    V = ( π^(P/2) / Γ(P/2 + 1) ) * ∏_(i=1)^P σ_i        (2)

where the σ_i are the singular values and P = 40. If one singular value is particularly large compared to the rest, we know that the bred vector ellipse has collapsed, as shown in figure 2C.

To avoid this collapse of the ellipse, we inflate the ellipse when it becomes too flat. The singular values of B measure the extent to which the left singular vectors (the columns of U) span the space of the bred vectors. So we rescale the bred vectors when any one of them becomes too large; we define "too large" as a vector norm exceeding the average bred vector norm by 10 percent. To rescale, we divide each bred vector by its own norm, so that the new ellipse surrounding the control has at most unit extent in any direction.

Figure 3: A graphical representation of the evolution of the volume of the ellipse, in phase space, through time for F = 8. The blue line is for the original bred vector matrix and the green line is for the rescaled matrix.

Figure 3 (see Matlab code in Appendix II) shows the original ellipse volume, for F equal to eight, in blue and the rescaled ellipse volume in green. As we can see, there is a significant difference between the two volumes: 10^30 versus 10^-30. This effect of controlling the volume is similar for all values of F. From this graph we can see how rescaling the bred vectors controls the chaotic behavior. This process of controlling the volume is the ensemble forecasting method.

Eigenvalues

Next, we look at the eigenvalues associated with the model.
As shown in figure 4, after graphing the eigenvalues we see that only the largest one, the green line, increases and fluctuates as the chaotic behavior increases. The second largest eigenvalue, the blue line, remains at the same low level. Since this is the second largest, we know that all the other eigenvalues are less than or equal to the value of the blue line. This leads us to compare the graphs of the largest eigenvalues to the graphs of the largest singular values.

Figure 4: A graph of the largest and second largest eigenvalues of the model for F=8.

Real-world models are often represented by matrices with dimensions in the tens of thousands, and thus with tens of thousands of associated eigenvalues. For such a large matrix, the eigenvalues are difficult to compute. Therefore, we use the Arnoldi method (see Appendix III) to reduce the matrix of the model to a much smaller upper Hessenberg matrix, from which we can then find approximate eigenvalues. However, because our model is simple, we can find all forty eigenvalues directly.

Linear Algebra

We now go back to look at the linear algebra involved. Starting with a matrix M that represents the model (the ODE), the initial data ψ_0 can be written as a linear combination of the eigenvectors,

    ψ_0 = Σ_(i=1)^N c_i * v_i        (3)

where the c's are constants and the v's are the eigenvectors of M. Advancing ψ_0 through time creates the next state of the control, ψ_1 = M * ψ_0. We create the perturbations by subtracting small vectors

    e^(p) = Σ_(i=1)^N d_i^(p) * v_i^T        (4)

where the d's are randomly generated small constants, the v's are the eigenvectors, and p runs from 1 to P, the number of perturbations. The e's form the rows of a matrix E. Then we subtract the transpose of E from ψ_0 to form a set of P perturbations. Advancing the perturbations through time gives the next states,

    Φ = M * (ψ_0 - E^T).        (5)
On the other hand, if we look back at the bred vector matrix B, we see that each column is equal to the control solution minus a perturbation solution, B(:,p) = ψ_1 - φ_p. Expanding this using (3) and (4), we see that

    B(:,p) = Σ_(i=1)^N c_i * λ_i * v_i  -  Σ_(i=1)^N λ_i * v_i * (c_i - d_i^(p))
           = Σ_(i=1)^N λ_i * v_i * d_i^(p)        (6)

or, in matrix form, B = W Λ D^T, where W is the matrix whose columns are the eigenvectors v, Λ is the diagonal matrix of eigenvalues, and D is the matrix composed of the d_i^(p) [4]. Recall the singular value decomposition B = UΣV^T. From (6), we can conclude that UΣV^T = W Λ D^T.

Figure 5: A graph of all the singular values as the off-diagonal of the matrix D is increased. The dark blue line is the largest singular value and the green line is the second largest.

After examining the linear algebra, we discovered that the matrix D is a major factor. When the off-diagonal of D is small, the largest singular value stays under control; as the off-diagonal increases, more chaotic behavior appears. Looking at the remaining singular values, we see that they are maintained at lower levels. Since one singular value increases far beyond the others, the ellipse will collapse. Therefore, we know that in order to control the chaotic behavior, we need to maintain a strong diagonal in the matrix D.

Figure 6: A graph of the largest singular value versus the largest eigenvalue over time, as the forcing constant F increases, while the matrix D has a strong off-diagonal.

Conclusions

From this study we see that a relationship between the eigenvalues and singular values of interest seems to exist.
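The role of D in equation (6) can be illustrated numerically. In the sketch below, the orthonormal W, the stand-in eigenvalues, and the noise model for D are all hypothetical choices of this sketch, not the paper's data; it builds B = W Λ D^T for a nearly diagonal D and for a D with a strong off-diagonal, and compares the largest singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 40
# W: orthonormal columns standing in for the eigenvectors v_i.
W, _ = np.linalg.qr(rng.standard_normal((N, N)))
Lam = np.diag(np.linspace(0.5, 2.0, N))    # stand-in eigenvalue matrix Lambda

def largest_singular_value(off_diag_size):
    # D = identity plus off-diagonal noise of the given size.
    noise = rng.standard_normal((N, N))
    np.fill_diagonal(noise, 0.0)
    D = np.eye(N) + off_diag_size * noise
    B = W @ Lam @ D.T                      # B = W Lambda D^T as in (6)
    return np.linalg.svd(B, compute_uv=False)[0]

s_weak = largest_singular_value(0.01)      # strong diagonal: under control
s_strong = largest_singular_value(1.0)     # strong off-diagonal: blows up
```

Consistent with figure 5, the largest singular value grows sharply once the off-diagonal of D becomes large, even though the eigenvalues in Λ are unchanged.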
At this stage of the project we do not yet have a concrete formula describing the relationship between these two quantities. However, when we plot the largest singular value and the largest eigenvalue together as they evolve through time for different values of F, we see that the largest eigenvalue remains essentially the same while the largest singular value increases. This growth of the singular value appears to be exponential.

Goals for the Future

Next, we will fix D and change the eigenvalues, based on different values of F, to see the resulting changes in the singular values. We will also look into finding some d such that D^T * V is equal to the identity matrix, fix the eigenvalues, and observe the changes in the matrix U^T. We have seen what happens when looking at the largest eigenvalue versus the largest singular value through time for different values of F; however, this was only for a strong off-diagonal in the matrix D. A third experiment will be to rerun this test for a D with a strong diagonal. After gathering the data from all of these experiments, the next step will be to apply what we have learned to a real-life weather model.

Acknowledgements

Acknowledgements for this work go to my mentor Samson Cheung and my alternate mentor Sherry Chang. I would also like to acknowledge the NASA USRP program and the NASA Ames Research Center.

References

[1] Lorenz, E. and Emanuel, K., "Optimal Sites for Supplementary Weather Observations: Simulation with a Small Model," Journal of the Atmospheric Sciences, Vol. 55, Feb. 1998.
[2] Kalnay, E., "Atmospheric Modeling, Data Assimilation and Predictability," Cambridge University Press, 2003.
[3] "Standard Mathematical Tables," CRC Press, 1987.
[4] Strang, Gilbert, "Linear Algebra and Its Applications," Academic Press, New York, 1976.
Appendix I

The following is the Lorenz equation model script created in Matlab:

function dx=model(t,x,F)
%
% Model for Project
%
dx=zeros(40,1);
% Special conditions -- at the beginning and end of the chain.
i=40; dx(i)=x(i-1)*(x(1)-x(i-2))-x(i)+F;
i=1;  dx(i)=x(40)*(x(i+1)-x(39))-x(i)+F;
i=2;  dx(i)=x(i-1)*(x(i+1)-x(40))-x(i)+F;
for i=3:39
    dx(i)=x(i-1)*(x(i+1)-x(i-2))-x(i)+F;
end

Appendix II

The following is the script file for the ensemble method created in Matlab:

function [ellipse,psi,fS] = New5b(F)
% Script File: New5b
%
% Rescale each bred vector by its norm if the norm is greater than
% 10 percent above the average of the norms -- while marching through time.
%
% USAGE: Calculate once using beta=10000.10 to create an old ellipse.
%        Save: oldellipse=ellipse
%        Calculate again with beta=1.10 to create the modified
%        ellipse. Then plot the graph with the old and modified
%        ellipses on the same graph.
%
% Initial Condition Control -- start after first 1000 time steps.
% Initial state produced by If statement for F:
X0 = ifelse(F);
dt=.1; step=.01; options=[];
% Perturb the initial condition 40 times and march both
% the initial condition (the control) and the perturbations
% forward in time using steps of .01 for 10 units.
P=40; e=(rand(P)-.5)*(1/5);
C0 = X0;
% Create Forecast
for j=1:P
    F0(j,:)=C0+e(:,j)';
end
count = 1;
for t=0:dt:6
    tspan=[t:step:t+dt];
    [Contl,Fore,B] = Bred(C0,F0,tspan,options,F);
    % *** NOTE -- CHANGE INPUT VALUE FOR BETA ***
    [B, fac] = scaleB(B,1.1);
    % Take the Singular Value Decomposition of B.
    [U,S,V]=svd(B);
    ellipse(count)=vol(S);
    time(count)=t;
    fS(count)=S(1,1);
    C0 = Contl;
    F0 = Fore;
    count=count+1;
end
% *** NOTE -- WAIT TO PLOT ... FOLLOW USAGE INSTRUCTIONS ***
% Plot the ellipse produced by the perturbations over time.
figure;
semilogy(time, ellipse)
if fac==10000.1
    title('Un-scaled Ellipse Over Time');
else
    title('Re-scaled Ellipse Over Time');
end
% figure;
% semilogy(time,oldellipse, time,ellipse)
% title('Original Ellipse vs Scaled Ellipse Through Time');
figure;
plot(time, fS)
if fac==10000.1
    title('Un-scaled Largest Singular Value Over Time');
else
    title('Largest Singular Value Over Time');
end

Appendix III

The following is the script file of the Arnoldi method used to find the eigenvalues:

function sEIG = runArnoldi(F)
%
% A function to run the Arnoldifun file where the input is
% a function -- the Model -- and values of F, the forcing function.
%
% Prints the eigenvalues of the upper Hessenberg matrix
% created by the Arnoldi method (in ascending order).
% It also plots the eigenvalues over time.
%
% Initial state produced by If statement for F:
X0 = ifelse(F);
% Initial conditions.
dt=.1; step=.01; options=[];
m=20;
count = 1;
for t=0:dt:6
    tspan=[t:step:t+dt];
    % Run modified Arnoldi Method.
    [H,E,XF]=Arnoldifun(m,tspan,X0,options,F);
    % Calculate and sort the eigenvalues of the Hessenberg matrix.
    EIG=eig(H);
    sEIG=sort(EIG);
    e(:,count)=sEIG;
    X0=XF;
    % Move counters.
    time(count)=t;
    count=count+1;
end
% Plot the biggest two eigenvalues.
figure;
plot(time,e(m-1,:), time,e(m,:))
title('Largest & Next Largest Eigenvalues');
figure;
plot(time,e(m,:))
title('Largest Eigenvalue Over Time');
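For readers without Matlab, the Arnoldi reduction driven by Appendix III can be sketched in Python. The matrix A below is a generic symmetric stand-in for the model matrix (the time-stepping details of Arnoldifun are omitted), and the sizes are illustrative; the point is that the small Hessenberg matrix H reproduces the extreme eigenvalues well.

```python
import numpy as np

def arnoldi(A, b, m):
    """Reduce A to an m-by-m upper Hessenberg matrix H whose eigenvalues
    (Ritz values) approximate the extreme eigenvalues of A."""
    n = b.size
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ Q[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt step
            H[i, j] = Q[:, i] @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:           # Krylov space exhausted early
            break
        Q[:, j + 1] = w / H[j + 1, j]
    return H[:m, :m]

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200))
A = A + A.T                               # symmetric, so eigenvalues are real
H = arnoldi(A, rng.standard_normal(200), 40)
ritz = np.sort(np.linalg.eigvalsh(H))     # cf. eig(H), sort(EIG) above
exact = np.sort(np.linalg.eigvalsh(A))
```

As in the Matlab script, only the few largest Ritz values are of interest; the largest one closely tracks the largest true eigenvalue of A.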