Numerical Methods in Finance. Part A. (2010-2011)
Paul Clifford, Sebastian Van Strien and Oleg Zaboronski (olegz@maths.warwick.ac.uk)
October 6, 2010

Contents

0 Preface . . . iv
  0.1 Aims, objectives, and organisation of the course . . . iv

1 Linear models: growth and distribution . . . 2
  1.1 Matrix computations in Matlab . . . 2
  1.2 Non-negative matrices: modeling growth . . . 6
    1.2.1 Models with an age profile . . . 6
    1.2.2 The asymptotic behaviour depends on age-structure . . . 8
  1.3 Markov Models . . . 12
    1.3.1 Mood fluctuations of a Markovian market . . . 12
    1.3.2 Another Markov model: a random walk on a graph . . . 15
    1.3.3 Matlab Project for week 2 . . . 17

2 Linear models: stability and redundancy . . . 18
  2.1 SVD or Principal Components . . . 18
    2.1.1 Stability of eigenvalues . . . 18
    2.1.2 Singular value decomposition . . . 20
    2.1.3 Application of the singular value decomposition to solving linear equations . . . 23
    2.1.4 Least square methods . . . 26
    2.1.5 Further applications of SVD: Principal Component Analysis . . . 26
    2.1.6 Pairs trade by exploiting correlations in the stock market . . . 27
  2.2 MATLAB exercises for Week 3 . . . 29
    2.2.1 Leslie Matrices . . . 29
    2.2.2 Markov matrices . . . 30
    2.2.3 Solving equations . . . 31
  2.3 Ill-conditioned systems: general theory . . . 31
  2.4 Numerical computations with matrices . . . 34
  2.5 Matlab exercises for week 4 (Linear Algebra) . . . 39
    2.5.1 Ill posed systems . . . 39
    2.5.2 MATLAB capabilities investigation: sparse matrices . . . 40
    2.5.3 Solving Ax = b by iteration . . . 40

3 Gambling, random walks and the CLT . . . 42
  3.1 Random variables and laws of large numbers . . . 42
    3.1.1 Useful probabilistic tools . . . 43
    3.1.2 Weak law of large numbers . . . 44
    3.1.3 Strong law of large numbers . . . 44
  3.2 Central Limit Theorem . . . 46
  3.3 The simplest applications of CLT and the law of large numbers . . . 49
    3.3.1 Monte Carlo Methods for integration . . . 49
    3.3.2 MATLAB exercises for Week 5 . . . 51
    3.3.3 Analysing the value of the game . . . 51
    3.3.4 Portfolio optimization via diversification . . . 54
  3.4 Risk estimation and the theory of large deviations . . . 60
    3.4.1 Week 6 MATLAB exercises . . . 63
    3.4.2 An example of numerical investigation of CLT and the law of large numbers for independent Bernoulli trials . . . 63
  3.5 The law of large numbers for Markov chains . . . 67
    3.5.1 The Markov model for crude oil data . . . 68
  3.6 FTSE 100: clustering, long range correlations, GARCH . . . 70
    3.6.1 Checking the algebraic tails conjecture numerically . . . 77
    3.6.2 MATLAB exercises for week 7 . . . 79
  3.7 The Gambler's Ruin Problem . . . 81
    3.7.1 Nidhi's game . . . 88
  3.8 Cox-Ross-Rubinstein model and Black-Scholes pricing formula for European options . . . 89
    3.8.1 The model . . . 89
    3.8.2 Solving the discrete BS equation using binomial trees . . . 93
    3.8.3 The continuous limit of CRR model . . . 94
    3.8.4 Matlab Exercises for Weeks 8, 9 . . . 100

4 Numerical schemes for solving Differential and Stochastic Differential Equations . . . 101
  4.1 Systems of ODE's . . . 101
    4.1.1 Existence and Uniqueness . . . 101
    4.1.2 Autonomous linear ODE's . . . 103
    4.1.3 Examples of non-linear differential equations . . . 107
    4.1.4 Numerical methods for systems of ODE's . . . 114
  4.2 Stochastic Differential Equations . . . 125
    4.2.1 Black-Scholes SDE and Ito's lemma . . . 125
    4.2.2 The derivation of Black-Scholes pricing formula as an exercise in Ito calculus . . . 128
    4.2.3 Numerical schemes for solving SDE's . . . 129
    4.2.4 Numerical example: the effect of stochastic volatility . . . 133
    4.2.5 Some popular models of stochastic volatility . . . 137

Chapter 0

Preface

0.1 Aims, objectives, and organisation of the course

The aim of this module is to learn to think about modeling in finance: to practice thinking about what drives cause and effect in some simple models, and to analyze these models numerically. In order to do this we shall have the following objectives. We shall learn

• how to use Matlab to explore models;
• the basics of numerical linear algebra, ill-posed systems, numerical iteration schemes;
• the simplest Markov chain models in finance;
• the law of large numbers, central limit theorem, large deviations theory and risk, Markowitz portfolio theory;
• discrete stochastic volatility models and distributions with 'fat tails';
• CRR and Black-Scholes models: theory and numerics;
• general properties of systems of ODE's and stochastic ODE's, the simplest numerical schemes for solving (S)DE's, stochastic volatility in continuous time.

The course was originally written by Sebastian van Strien and completely redesigned in 2007 by Oleg Zaboronski.
Paul Clifford made a significant contribution to designing the course project. We shall use mainly the following texts, and you will not regret buying the first two.

• W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Numerical Recipes in C, Cambridge (1992)
• D.J. Higham and N.J. Higham, Matlab Guide, SIAM, Society for Industrial & Applied Mathematics (2005)
• P. Brandimarte, Numerical Methods in Finance: a MATLAB-based Introduction, John Wiley & Sons, New York (2001)
• D. Higham, An Introduction to Financial Option Valuation: Mathematics, Stochastics and Computation, Cambridge University Press (2004)

The following could be useful as background reading:

• G.R. Grimmett and D.R. Stirzaker, Probability and Random Processes, Oxford Science Publications (1992)
• Ya.G. Sinai, Probability Theory, Springer (1991)
• P.E. Kloeden, E. Platen and H. Schurz, Numerical Solution of SDE Through Computer Experiments, Springer (1997)
• E. Prisman, Pricing Derivative Securities: an Interactive Dynamic Environment Using Maple and Matlab, Academic Press
• B. Ripley, Stochastic Simulation, Wiley
• H.G. Schuster, Deterministic Chaos: an Introduction, Physik-Verlag (1988)
• Pärt-Enander, Sjöberg, Melin and Isaksson, The Matlab Handbook, Addison-Wesley
• J.Y. Campbell, A.W. Lo and A.C. MacKinlay, The Econometrics of Financial Markets, Princeton (1997)

Notes will be handed out during each teaching session. At some point the notes will also become available on the web site:
http://www2.warwick.ac.uk/fac/sci/maths/people/staff/oleg_zaboronski/fm

The organisation of the course is as follows. Each Thursday you will be given a 2-hour lecture with a 5-minute break in the middle. Each Wednesday afternoon 13:00-15:00 you will have a computer session during which we will learn to programme in Matlab, followed by a C++ tutorial at 15:00-17:00. For this you need to have your university computer account name and password.
So you will NOT be using your WBS computer account on Wednesday. The exercises for these computer sessions are included in the lecture notes. You will find that the exercises in this text will help you to get familiar with the material. The projects will not cost you a huge amount of time, especially if you do them during the practice session. There will also be a term project contributing 50% towards the final mark for this component of the course. More details on how the entire project mark will count towards the total mark can be found in the hand-book.

IMPORTANT NOTE: you are allowed to discuss your project with a friend. However, you must hand in YOUR OWN work. This means you must be able to defend and explain what is written, and why you have written it like that. All projects must be clearly written by you. Copying is not allowed. Please remember that university rules about plagiarism are extremely strict.

Chapter 1

Linear models: growth and distribution

The purpose of this chapter is to give a first introduction to Matlab, and to explain why it is particularly useful when considering linear models (so, for example, a vector describing a portfolio)¹.

1.1 Matrix computations in Matlab

Matlab Examples. Let us show how to use Matlab to do matrix multiplications. Enter the matrices in the following way:

    > A=[1, 2; 3, 4], B=[1, 1; 1, 1];
    A =
         1     2
         3     4

Note that the semi-colon after the last command suppresses the output. Now use * for multiplication of the two matrices:

    > A*B - B*A
        -1    -3
         3     1

Square matrices can be iterated, but not necessarily inverted:

    > A^2*B
        17    17
        37    37

    > det(B), B^(-1)
         0
       Inf   Inf
       Inf   Inf

As you know, a matrix is invertible iff its determinant is not equal to zero. Many important properties of linear models are determined by the eigenvalues of matrices. Take a square n × n matrix A.

¹ Portfolio: a collection of investments held by an institution or a private individual (Wikipedia)
A real or complex number t is called an eigenvalue of A if there exists a vector v ∈ Cⁿ such that Av = tv and v ≠ 0, or equivalently if (A − tI)v = 0, where I is the identity matrix (v is called the eigenvector associated to t). Eigenvalues and eigenvectors in general need not be real. For the equation (A − tI)v = 0 to hold, the matrix A − tI must be degenerate, hence det(A − tI) = 0, which is an algebraic equation of degree n for the eigenvalues of the matrix A (the characteristic equation). Therefore, there are always n eigenvalues (counted with multiplicities), but a matrix does not necessarily have n linearly independent eigenvectors (if some of the eigenvalues appear with multiplicity). Matlab can easily compute the eigenvalues of a matrix:

    > A=[5,-1;3,1]; eig(A)
         4
         2

Eigenvectors together with the corresponding eigenvalues can be computed as follows:

    > [V,D]=eig(A)
    V =
        0.7071    0.3162
        0.7071    0.9487
    D =
         4     0
         0     2

The terms on the diagonal of the matrix D are the eigenvalues of A, while V gives the eigenvectors (the first column of V gives the first eigenvector and so on):

    > A*[0.7071;0.7071]
        2.8284
        2.8284

which is 4 times the initial vector (0.7071, 0.7071). The 2nd eigenvector is mapped to 2 times itself. N.B. Matlab always normalises eigenvectors to have unit L2 norm, Σₖ |vₖ|² = 1. Such a normalisation is not necessarily the best choice in the context of stochastic matrices, see below.

The equations Avₖ = tₖvₖ for k = 1, 2, . . . , n can be written in the matrix form AV = V D, where D has on its diagonal the eigenvalues of A and V is an n × n matrix the columns of which are the eigenvectors of A. If the eigenvectors of A are linearly independent, the matrix V is invertible and we have the following matrix decomposition:

    A = V D V^(-1)

Note that even if the matrix elements of A are real, its eigenvalues can be complex, in which case V and D are complex.
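The eig output above can be sanity-checked by hand: multiplying A by each eigenvector must scale it by the corresponding eigenvalue. A minimal sketch (in Python rather than the course's Matlab; v1 = (1, 1) and v2 = (1, 3) are the unnormalised versions of the unit vectors Matlab reports):

```python
# Verify that v1=(1,1) and v2=(1,3) are eigenvectors of A=[5,-1;3,1]
# with eigenvalues 4 and 2.

def matvec(A, v):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[5.0, -1.0],
     [3.0, 1.0]]

for v, t in [([1.0, 1.0], 4.0), ([1.0, 3.0], 2.0)]:
    Av = matvec(A, v)
    # A v must equal t v component by component
    assert all(abs(av - t * x) < 1e-12 for av, x in zip(Av, v))
    print(f"A{v} = {Av} = {t} * {v}")
```

Dividing each vector by its Euclidean length reproduces Matlab's normalised columns (0.7071, 0.7071) and (0.3162, 0.9487).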
If A does not have a basis of eigenvectors (which can happen if several eigenvalues coincide) then one has to allow D to be of a more general (Jordan) form (Jordan decomposition theorem). D will then have a block-diagonal form: the diagonal entries of each Jordan block are equal to an eigenvalue of A, the superdiagonal consists of 1's, and the remaining block elements are zero. The expression A = V D V^(-1) implies A^n = V D^n V^(-1). But since D is such a special matrix, it is easy to compute its powers, and from this we can prove the theorem below.

Theorem 1.1.1. If the norm of each eigenvalue of the diagonalisable matrix A is < 1, then for each vector v one has A^n v → 0 as n → ∞. If there is an eigenvalue of A with norm > 1, then for 'almost all' vectors v (i.e., for all vectors which do not lie in the span of the eigenspaces corresponding to the eigenvalues with norm ≤ 1) one has |A^n v| → ∞ as n → ∞. If A has an eigenvalue equal to one, then different things can happen depending on the matrix.

Idea of the proof (part 1): Let λmax be an eigenvalue of A with the largest norm, |λmax| < 1. For v ≠ 0,

    ||A^n v|| ≤ ||A^n|| · ||v|| ≤ ||v|| · ||V|| · ||V^(-1)|| · ||D^n|| ≤ ||v|| · ||V|| · ||V^(-1)|| · |λmax|^n → 0,   n → ∞.

Important remarks:

• If A is a symmetric n × n square matrix (for example a covariance matrix) then it always has n real eigenvalues and a basis of n real eigenvectors which form an orthogonal basis. In this case we can choose V so that its columns form an orthogonal basis (the eigenvectors of A). Then V is an orthogonal matrix: V^(-1) is equal to V^tr.

• Eigenvectors of a non-symmetric matrix do not necessarily form a basis. For example, the non-symmetric matrix A = [1 1; 0 1] has only 1 as an eigenvalue, and A has only one linearly independent eigenvector. Even if A has a basis of eigenvectors, they need not be orthogonal, and so in general V^(-1) is not equal to V^tr.
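Theorem 1.1.1 is easy to probe numerically. A small sketch (Python; the 2 × 2 matrix below is a made-up example whose eigenvalues, 0.8 and 0.5, both lie inside the unit circle, so the iterates A^n v must decay to zero):

```python
# Iterate v -> A v for a matrix whose eigenvalues have norm < 1;
# by Theorem 1.1.1 the iterates converge to the zero vector.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[0.7, 0.2],
     [0.1, 0.6]]   # trace 1.3, det 0.40 => eigenvalues 0.8 and 0.5

v = [1.0, 1.0]
for n in range(200):
    v = matvec(A, v)

print(max(abs(x) for x in v))   # of order 0.8^200: numerically zero
assert max(abs(x) for x in v) < 1e-12
```

Replacing an eigenvalue by one larger than 1 in norm makes the same loop blow up, illustrating the second part of the theorem.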
In this chapter we shall discuss in particular two types of square matrices with non-negative entries:

• general non-negative matrices;
• stochastic matrices.

1.2 Non-negative matrices: modeling growth

1.2.1 Models with an age profile

We shall now discuss a simple model that can be used to predict the size of a population, the size of a portfolio and so on. To keep it simple, let us consider an example of a portfolio of highly risky stocks². Very similar models can be used to model the size of the human population, the spread of some disease (e.g. swine flu), or the set of your clients who recommend your services to others but who can go out of business. In general, think of a situation where one has an age profile, with a birth and a death rate. (Often there are non-linearities, but in these models we shall ignore them.)

Assume that we are venture capitalists investing in high-tech start-up companies. We insist that all companies we invest in have an exit strategy for the end of the third year. We also immediately re-invest all the returns into new start-ups. Let N0(n) be the amount of capital³ we invest in the new companies in year n, N1(n) the amount of capital tied in the one-year-old companies in year n and, finally, N2(n) the amount of capital tied in the 'old' companies in year n. (For definiteness, let us use US dollars to measure the amount of capital.)

Suppose that on average 3/4 of the investment into the new companies is lost due to bankruptcy in the first year. Suppose also that only 1/2 of the capital tied in the one-year-old companies is transferred to two-year-old companies due to bankruptcy or early exit in their second year of operation.

² In financial markets, stock is capital raised by a corporation or joint-stock company through the issuance and distribution of shares (Wikipedia)
³ In finance and accounting, capital generally refers to financial wealth, especially that used to start or maintain a business (Wikipedia)
Then N1(n + 1) = N0(n)/4 and N2(n + 1) = N1(n)/2. Assume that the return from one-year-old companies due to an early exit is 2N1(n) dollars in year n, whereas the return from the two-year-old companies, which must exit at the end of their third year of operation, is 4N2(n) dollars. Then the total amount we re-invest in new companies at the end of year n is N0(n + 1) = 2N1(n) + 4N2(n) dollars. In matrix form,

    [N0(n+1)]   [ 0    2    4 ] [N0(n)]
    [N1(n+1)] = [1/4   0    0 ] [N1(n)]        (1.2.1)
    [N2(n+1)]   [ 0   1/2   0 ] [N2(n)]

This transformation determines the 'age structure' of the population in year n + 1 once it is known what it is in year n. Let us give the above matrix a name:

    L = [ 0    2    4 ]
        [1/4   0    0 ]
        [ 0   1/2   0 ]

Iterating expression (1.2.1) we get N(n + 1) = L N(n) and N(n) = L^n N(0), where N(n) is the column vector [N0(n), N1(n), N2(n)]′. The vector N(0) describes the initial distribution of the capital between the start-ups we manage. For example, the vector N(0) = [C, 0, 0]′ corresponds to our fund investing C dollars in new start-ups.

So what is the growth-rate associated to this model? If this year the total capital tied in the start-ups is N0(n) + N1(n) + N2(n) dollars, then next year it is 2N1(n) + 4N2(n) + (1/4)N0(n) + (1/2)N1(n) dollars. So the growth-rate is

    Rn = [2N1(n) + 4N2(n) + (1/4)N0(n) + (1/2)N1(n)] / [N0(n) + N1(n) + N2(n)].

The survival of our investment fund depends on whether this rate is less than or greater than one. Unfortunately the answer is not obvious from the above formula. We will develop the machinery needed for answering such questions in the next section. Notice that one can also use similar models to predict the age structure of people in a country (the prediction that in the West the average age of the population will continue to grow over the next few decades is based on similar models; in this case the model specifies reproduction and death rates for each age group of the population).

1.2.2 The asymptotic behaviour depends on age-structure
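Before analysing the asymptotics, it is instructive simply to iterate (1.2.1) numerically. A minimal sketch (in Python rather than the course's Matlab; the starting vector of 1000 dollars in new start-ups is an arbitrary choice):

```python
# Iterate the capital vector N(n+1) = L N(n) for the venture-capital model
# and track the year-on-year growth rate R_n = total(n+1) / total(n).

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

L = [[0.0,  2.0, 4.0],
     [0.25, 0.0, 0.0],
     [0.0,  0.5, 0.0]]

N = [1000.0, 0.0, 0.0]    # invest everything in new start-ups in year 0
for year in range(100):
    total_before = sum(N)
    N = matvec(L, N)
    R = sum(N) / total_before

print(N, R)
# R settles at the largest eigenvalue of L, which is 1 (no long-run growth),
# and the age profile N0 : N1 : N2 settles into the proportion 8 : 2 : 1.
assert abs(R - 1.0) < 1e-6
assert abs(N[0] / N[2] - 8.0) < 1e-6 and abs(N[1] / N[2] - 2.0) < 1e-6
```

The two facts checked by the assertions, the asymptotic growth rate 1 and the limiting 8 : 2 : 1 profile, are exactly what the eigenvalue analysis of this subsection explains.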
Simulations with Matlab indicate that for n large (say n > 30) the matrix L^n is approximately equal to

    [0.4    1.6   1.6]
    [0.1    0.4   0.4]
    [0.05   0.2   0.2]

First we should note that the limiting matrix has columns which are all multiples of each other, i.e., all are equal to some constant times the vector [8, 2, 1]′. But there is something else which is important: the size of your total capital in the future depends on the age structure⁴ of the companies you own now, not just on the total capital. For example,

    L^100 [100, 100, 800]′ = [1480, 370, 185]′   while   L^100 [800, 100, 100]′ = [640, 160, 80]′,

and therefore the total capital in the first case is equal to 2035 million dollars as opposed to 880 million dollars in the second case, even though in both cases your initial investment was 1 billion dollars. (Of course, if we take our example seriously, it may well be that the only starting point available to us is N(0) = [C, 0, 0]′, in which case we will lose almost half of the money in the long run: Σ_{k=1..3} (L^100 N(0))_k = 0.55C. Who said being a venture capitalist is easy?)

Moreover, since in the limit each of the columns is a multiple of [8, 2, 1]′, the sizes of the age groups are eventually approximately in this proportion. So in these models one can only make predictions if one knows the age structure of the population, not just its size. Why do you get this limit, and why are all the columns of the matrix multiples of each other? This is easy to explain by writing the matrix in terms of its basis of eigenvectors:

    [V,D]=eig(L)
    V =
       -0.9631    0.9177              0.9177
       -0.2408   -0.2294 - 0.2294i   -0.2294 + 0.2294i
       -0.1204   -0.0000 + 0.2294i   -0.0000 - 0.2294i
    D =
        1.0000    0                   0
        0        -0.5000 + 0.5000i    0
        0         0                  -0.5000 - 0.5000i

Since L = V D V^(-1), we have L^n = V D^n V^(-1), and since D^n tends to

    [1 0 0]
    [0 0 0]
    [0 0 0],

for n large V D^n is approximately equal to (−0.1204) times the matrix

    [8 0 0]
    [2 0 0]
    [1 0 0]

(the resulting first column is proportional to the first eigenvector of L).

⁴ By the age structure we understand the distribution of capital between different companies.
This implies that for each v, lim V D^n V^(-1) v is a multiple of the vector [8, 2, 1]′. From this we obtain the special structure of the limit matrix. So the key point here is that there is precisely one eigenvalue which dominates all the others. In this situation it is possible to answer all sorts of questions about the model, including the question about the growth rate posed at the end of the previous section: as n → ∞, N(n) ≈ w(N(0)) v, where v = [8, 2, 1]′ is an eigenvector corresponding to the eigenvalue λ = 1 and w(N(0)) is a scalar which depends on the initial vector. Therefore, the total growth rate is asymptotically 1 (the largest eigenvalue), i.e. there is no growth. However, the total growth achieved by some large time n is

    Γ = (1/C) Σ_k (L^n N(0))_k = Σ_k (L^n [p0, p1, p2]′)_k,

where C is the total initial investment and p_i is the share of the investment into the companies which are i years old. As is easy to see from the limiting form of L^n, Γ = 0.55 is minimal for p0 = 1, p1 = 0, p2 = 0, and Γ = 2.2 is maximal for p0 = 0, p1 + p2 = 1.

Our analysis was very easy as it happened that the matrix L had a real eigenvalue the norm of which was larger than the norm of any other eigenvalue. As a result we didn't really have to calculate large powers of L explicitly. Were we particularly lucky or is this situation generic? The answer is 'the latter', given that we are dealing with irreducible matrices. We say that a non-negative matrix P is irreducible if there exists an integer k ≥ 1 such that P^k has only strictly positive coefficients.

Theorem 1.2.1. If P is an irreducible non-negative matrix, then:
(1) There is one real positive eigenvalue ρ > 0 which is larger in norm than all others.
(2) If ρ > 1 then for almost all vectors v one gets that |P^n v| → ∞ as n → ∞.
(3) If ρ ∈ (0, 1) then for all vectors v, P^n v → 0 as n → ∞.
(4) If ρ = 1 then P^n tends to a matrix Q for which each column of Q is some multiple of one column vector w with positive entries.
The column vector w is the eigenvector of P associated with eigenvalue 1. The matrix L from this section is irreducible: L^5 > 0. So L^n converges to a non-zero matrix because the largest (in norm) eigenvalue of L is equal to one. If instead of L we consider another irreducible matrix,

    > A=[0,2,3;1/4,0,0;0,1/2,0]
         0     2     3
       1/4     0     0
         0   1/2     0

then, because the eigenvalue of A with the largest norm is 0.9466, the matrix A^n tends to zero as n → ∞ with this rate.

Such a remarkable statement as Theorem 1.2.1 deserves some comments. Parts (2) and (3) follow easily from Theorem 1.1.1. Parts (1) and (4) are a consequence of the famous Perron-Frobenius Theorem:

Theorem 1.2.2. Let A = (a_ij) be a real n × n matrix with positive entries a_ij > 0. Then the following statements hold:
(1) There is a positive real eigenvalue r of A: any other eigenvalue λ satisfies |λ| < r.
(2) The eigenvalue r is simple: r is a simple root of the characteristic polynomial of A. In particular, both the right and the left eigenspaces associated with r are one-dimensional.
(3) There are left and right eigenvectors associated with r which have positive entries.
(4) min_j Σ_i a_ij ≤ r ≤ max_j Σ_i a_ij.

For the proof of the Perron-Frobenius (PF) theorem, see the course's webpage. Here we will simply show how parts (1) and (4) of Theorem 1.2.1 follow from Theorem 1.2.2. Let k be such that the matrix A^k has only positive entries. As A has only non-negative entries and is irreducible, A^(k+1) also has only positive entries. The largest in norm eigenvalues of A^k and A^(k+1) have the form r^k and r^(k+1), where r is an eigenvalue of A. By the PF theorem, r^k and r^(k+1) are real and positive; therefore r is real and positive. Also, any other eigenvalue λ of A satisfies |λ|^k < r^k by PF applied to A^k. Therefore r > |λ|, where λ is an arbitrary eigenvalue of A distinct from r. Part (1) is proved.
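Before turning to part (4), note that the irreducibility hypothesis is easy to certify numerically: for the matrix L of the previous section one checks directly that L^5 has strictly positive entries (while L^4 still contains a zero). A small Python sketch of this check:

```python
# Certify that L = [0 2 4; 1/4 0 0; 0 1/2 0] is irreducible by verifying
# that L^5 > 0 entrywise (L^4 still has a zero entry).

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

L = [[0.0,  2.0, 4.0],
     [0.25, 0.0, 0.0],
     [0.0,  0.5, 0.0]]

powers = {1: L}
P = L
for n in range(2, 6):
    P = matmul(P, L)
    powers[n] = P

assert any(x == 0.0 for row in powers[4] for x in row)   # L^4 has a zero entry
assert all(x > 0.0 for row in powers[5] for x in row)    # L^5 > 0: irreducible
print(powers[5])
```

The same loop, applied to a reducible matrix such as the five-state example of Section 1.3.2, never produces an all-positive power, which is why a finite computation can certify irreducibility but not disprove it.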
To prove part (4), we notice that by the Jordan decomposition theorem there is a basis in which the matrix A has the form

    A = [1   0^T]
        [0   A0 ],

where all eigenvalues of A0 lie inside the unit circle. Hence, in virtue of Theorem 1.1.1,

    lim_{k→∞} A^k = [1   0^T]
                    [0   0  ].

In an arbitrary basis,

    lim_{k→∞} A^k = V [1   0^T] V^(-1),
                      [0   0  ]

which is a rank-1 matrix whose columns are proportional to the eigenvector associated with the eigenvalue 1. Part (4) is proved.

1.3 Markov Models and stochastic matrices

1.3.1 Mood fluctuations of a Markovian market

Let us introduce a linear model for the 'mood of the stock-market' and study its behavior using the methods we have just learned. Suppose that on a given day the market can either move 'up' or 'down'. Let o(n) be the probability that on the n-th day the market is moving up (the market is 'optimistic'). Let p(n) be the probability that on the n-th day the market is moving down (the market is 'pessimistic'). Now assume that the probability distribution of the market's mood on the n-th day depends only on the market's state on the (n − 1)-st day; in other words, assume that the fluctuations in the mood of the market are Markovian. Due to the Markovian assumption, this dependence is also stationary, i.e. independent of the day n. Then the model is specified completely by stating 4 conditional probabilities:

    [o(n)]   [Pr(Mn = ↑ | Mn−1 = ↑)   Pr(Mn = ↑ | Mn−1 = ↓)] [o(n−1)]
    [p(n)] = [Pr(Mn = ↓ | Mn−1 = ↑)   Pr(Mn = ↓ | Mn−1 = ↓)] [p(n−1)]        (1.3.2)

The above expression is a well-known probabilistic sum rule conveniently re-written in matrix form. To be definite, we assume that an optimistic day is followed by an optimistic or a pessimistic day with probability 2/3 and 1/3 respectively. After a pessimistic day these probabilities are 1/4 and 3/4 respectively. The transition matrix of our Markov model of the mood of the market takes the form

    P = [2/3   1/4]
        [1/3   3/4]

Matrix P is also called a Markov or a stochastic matrix.
The latter term is used for non-negative matrices such that the sum of the coefficients in each column is equal to one. It is easy to see that a matrix P built out of conditional probabilities is stochastic. We want to compute the probability that the stock market is in a certain state on day n, i.e. the numbers o(n) and p(n). Since these are probabilities, o(n) + p(n) = 1 and each of these numbers is greater than or equal to zero. The vector [o(n), p(n)]′ is called a probability vector. Question: can you check that our model is consistent, in the sense that a stochastic matrix maps a probability vector to a probability vector? Iterating (1.3.2) we get

    [o(n)]        [o(0)]
    [p(n)] = P^n  [p(0)].

To study stochastic matrices we need the following corollary of the Perron-Frobenius Theorem 1.2.2.

Theorem 1.3.1. If P is an irreducible stochastic matrix, then precisely one of the eigenvalues of P is equal to 1 and all others have norm < 1. As n → ∞ the matrices P^n converge to a matrix Q whose columns are all the same. This column vector is the eigenvector w corresponding to eigenvalue 1. Its coefficients are positive and add up to 1 (a stochastic vector). Each column of Q is equal to w.

Firstly, we have to prove that the largest eigenvalue of P is equal to 1. This follows from part (4) of the PF theorem, as for any stochastic matrix

    min_j Σ_i p_ij = max_j Σ_i p_ij = 1.

Secondly, we need to verify that all columns of the limiting matrix are not just proportional but equal to the (stochastic) eigenvector associated with eigenvalue 1. This is true as the limiting matrix is also stochastic, therefore the matrix elements in each column must sum to 1. The matrix P is irreducible as every matrix element is positive. As we have just proved, lim_{n→∞} P^n v = w for each probability vector v, where w is the stochastic eigenvector associated with eigenvalue 1. One says that w describes the equilibrium state, which is unique for irreducible stochastic matrices.
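For the market-mood matrix the equilibrium vector can be computed by simple iteration. A minimal sketch (Python rather than Matlab; solving Pw = w together with o + p = 1 by hand gives w = (3/7, 4/7) ≈ (0.4286, 0.5714)):

```python
# Iterate v -> P v for the market-mood chain: the iterates converge to the
# equilibrium vector w = (3/7, 4/7) regardless of the starting distribution.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

P = [[2/3, 1/4],
     [1/3, 3/4]]

v = [1.0, 0.0]            # start from a surely optimistic day
for n in range(200):
    v = matvec(P, v)

print(v)                   # close to [0.4286, 0.5714]
assert abs(v[0] - 3/7) < 1e-12 and abs(v[1] - 4/7) < 1e-12
# consistency: a stochastic matrix maps probability vectors to probability vectors
assert abs(sum(v) - 1.0) < 1e-12
```

Starting instead from a surely pessimistic day, v = [0, 1], gives the same limit, which is the uniqueness of the equilibrium state in action.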
So, for large n, P^n v does not depend much on the initial probability vector v. How fast P^n → Q is essentially determined by the eigenvalue(s) of P which is/are closest to the unit circle. For example, if

    P = [0.99   0.01]
        [0.01   0.99]

then

    P^10 = [0.9   0.1]
           [0.1   0.9]

is still far from stationary. (The eigenvalues of P are 0.98 and 1.) Note that irreducibility of P is crucial for the uniqueness of the equilibrium state. If, for example,

    P = [1    0     0]
        [0   0.9    0]
        [0   0.1    1]

then

    lim_{n→∞} P^n = [1   0   0]
                    [0   0   0]
                    [0   1   1],

and so, in particular, lim_{n→∞} P^n v does seriously depend on v.

Note that in the example from the beginning of this section P is irreducible, and so P^n tends to a matrix with all columns the same. In fact, P^10 is roughly equal to the stationary matrix:

    Q = [0.4286   0.4286]
        [0.5714   0.5714]

Hence, for the example at hand, the market in its equilibrium state is up with probability 0.4286 and down with probability 0.5714. It is often convenient to describe Markov models using trellis diagrams:

    [Trellis diagram: Up → Up with probability 2/3, Up → Down with 1/3; Down → Up with 1/4, Down → Down with 3/4.]

Remark. The discussed 'market mood' model is a projection of the celebrated Cox-Ross-Rubinstein (or binomial) model used to estimate the price of American options⁵. In the simplest version of the model the value of an asset on day n may go up by a factor u > 1 with probability p or down by a factor d = 1/u with probability 1 − p. Therefore, the pay-off for the American call option⁶ on the n-th day is max(0, S0 · u^(Nu(n)−Nd(n)) − K), where S0 is the initial value of the asset, K is the exercise price, Nu(n) is the number of upward ('optimistic') days among the n days, and Nd(n) is the number of 'pessimistic' days.

⁵ An option is called "American" if it may be exercised at any time before the expiry date. (Wikipedia)
⁶ A call option is a contract which gives its owner the right to purchase a prescribed asset in the future at an agreed price (exercise or strike price).
The model can be generalized by allowing the probability of the move on day n to depend on what happened on the previous day. We see that the pay-off depends on the number of 'ups' and 'downs' only. These are exactly the quantities whose statistics we studied in the 'market mood' model.

1.3.2 Another Markov model: a random walk on a graph

Let us represent the n states visited by a random walker by labeled points on the plane. Let us construct a graph by connecting state i to state j with a directed edge if the probability of the walker jumping from i to j is non-zero. In matrix form, the probability of moving from state i to state j is given by p_ji, the matrix element of the n × n transition matrix P which belongs to the i-th column and the j-th row. If v describes the probability vector associating a mass to each of the n states, then P^k v is the mass you have after k steps. Using the sum rule it is easy to check that (in general) the ji-th entry of P^k gives the transition probability of moving from i to j in precisely k steps. For example, take n = 5 and the transition matrix

    P = [1/2    0     0     0     0 ]
        [ 0    1/2    0    1/3   1/4]
        [ 0     0    1/2    0    1/4]
        [1/2   1/2    0    1/3   1/4]
        [ 0     0    1/2   1/3   1/4]

(Actually, many people prefer to define the transition matrix as the transpose of what is done here, in which case one has to make some minor changes.) We can visualize the model using the state diagram shown on the next page. The transition matrix is actually not irreducible. This statement is difficult to prove using MATLAB (can you see why?). The state diagram makes it easy: as it is impossible to get to site 1 from any other site, (P^n (0, v2, v3, v4, v5)^T)_1 = 0 for any vector with v1 = 0. Hence, the first row of any power of P must have the form ((P^n)_11, 0, 0, 0, 0). The sub-matrix which you get by ignoring the 1st state is irreducible. So what can be said about the probability distribution of the walker after many steps?
[State diagram: five labeled nodes, 1-5, with directed edges corresponding to the non-zero entries of P.]

An interpretation one could have in mind is five investment funds where the Markov matrix describes the proportion in which they invest in each other. The matrix describing the transition from time zero to time k (with k equal to, say, 20) is

    [0.0000   0        0        0        0     ]
    [0.3332   0.3334   0.3332   0.3333   0.3333]
    [0.1112   0.1111   0.1112   0.1111   0.1111]
    [0.3332   0.3334   0.3332   0.3333   0.3333]
    [0.2223   0.2222   0.2223   0.2222   0.2222]

So regardless of how we structure our initial investment (described by a probability vector v), the eventual return would be the same. But the first fund dies out, because there is no transfer of cash into this fund. The death of the first fund is only possible due to reducibility: as is easy to check, the vector (0, 0.33, 0.11, 0.33, 0.22)^T is an eigenvector of P associated with eigenvalue 1. For an irreducible stochastic matrix such an eigenvector would have only positive coefficients, due to Theorem 1.3.1.

What happens if the coefficient in the top left corner is replaced by 0.6? (The transition matrix is no longer stochastic, but we can still use our investment-funds analogy to say that the first fund gets a constant injection of extra cash.) A simulation shows that (for k roughly 20) the transition matrix from time 0 to time k is given by

    [0.0000   0        0        0        0     ]
    [0.4164   0.3334   0.3332   0.3333   0.3333]
    [0.1391   0.1111   0.1112   0.1111   0.1111]
    [0.4164   0.3334   0.3332   0.3333   0.3333]
    [0.2779   0.2222   0.2223   0.2222   0.2222]

So in this case it would be best to put all your money in the first fund, despite the fact that eventually this fund dies out. It is the first fund that actually makes money, because the sum of the elements of the first column of the modified transition matrix is > 1 (the sum of the first column is 1.1, while the sum of each of the other columns is 1). This last example can serve as a light-hearted description of the economy of the former Soviet Union: the government subsidizes the industry; the subsidies get stolen and moved abroad; the state eventually collapses.
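The effect of the cash injection can be reproduced in miniature. The sketch below is a made-up 3-fund analogue (Python; not the 5-state matrix above): fund 1 keeps 0.6 of its cash each step and passes 0.5 to fund 2, so its column sums to 1.1, while fund 2 leaks 10% per step into the absorbing fund 3.

```python
# A toy 3-fund analogue of the injection example: fund 1's column sums to
# 1.1 (injection), yet fund 1 itself dies out; the injected cash accumulates
# in the absorbing fund 3.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

M = [[0.6, 0.0, 0.0],
     [0.5, 0.9, 0.0],
     [0.0, 0.1, 1.0]]

v = [1.0, 0.0, 0.0]        # put all the money into fund 1
for k in range(400):
    v = matvec(M, v)

print(v, sum(v))
# Funds 1 and 2 die out, yet the total grows to 0.5 / (1 - 0.6) = 1.25:
# the injection makes money even though the injected fund itself vanishes.
assert v[0] < 1e-12 and v[1] < 1e-12
assert abs(sum(v) - 1.25) < 1e-9
```

Starting the same loop with all the money in fund 2 or fund 3 gives a total of exactly 1, mirroring the observation above that only the first column 'makes money'.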
Further in the course we shall discuss Monte Carlo methods, and we shall see why it matters that some of the other eigenvalues of P may be close to 1: in that case P^n is not close to the limit Q even for large n, which leads to slow convergence of Monte-Carlo-based numerical schemes.

1.3.3 Matlab Project for week 2

During the Matlab Project in week 2 you will do the sections 'Basic syntax and command-line exercises' and 'Basic array exercises' of the exercises from
http://www.facstaff.bucknell.edu/maneval/help211/exercises.html
Answers are also given on this website, which might give you some tips, but you need to go through these exercises yourself.