Finance Math Refresher

This paper contains some basic math to refresh your human memory and prepare you for the math content of the book Correlation Risk Modeling and Management – An Applied Guide including the Basel III Correlation Framework. There are problems at the end of each chapter. The solutions are available upon request. For comments and questions, please email Gunter Meissner at gunter@dersoft.com.

1. Number Theory

a) Natural numbers N, also called whole numbers, are a set of numbers that take positive values and have no decimals. If the 0 is excluded they are termed positive natural numbers; if the 0 is included they are termed non-negative natural numbers. Formally, for non-negative natural numbers, we have

N = {0, 1, 2, 3, …}

Graphically, the natural number set can be displayed as a vector

0   1   2   3   4   …   ∞

For more on vectors see section 2.2.

b) Integers Z are positive and negative natural numbers. Formally

Z = {…, -3, -2, -1, 0, 1, 2, 3, …}

(Z comes from “Zahl”, which means ‘number’ in German, which is very important to know.)

c) Rational numbers Q are numbers that can be expressed as a quotient (also called fraction) of two integers with a non-zero denominator. Formally, Q = p / q where $p, q \in Z$ (reads: p and q are elements of the Integer number set Z) and q ≠ 0. As a consequence, rational numbers have either finite decimals or decimals that repeat periodically, for example 1/3 = 0.333…

d) Irrational numbers are numbers that cannot be expressed as a quotient of integers with a denominator q ≠ 0. In other words, irrational numbers have infinite and non-repeating decimals. Examples of irrational numbers are Euler’s number e = 2.71828…, π = 3.1415926…, or $\sqrt{2}$ = 1.414213…

e) Real numbers R include all number sets above, i.e. the Natural number set N, the Integer number set Z, the rational number set Q, as well as the irrational numbers.

f) Imaginary numbers are numbers whose square is less than 0. For example, $\sqrt{-10}$ is an imaginary number since -10 is smaller than 0. A special imaginary number is i, defined as $i = \sqrt{-1}$.
We do use imaginary numbers in finance, e.g. in Fourier analysis.

g) Complex numbers C are a combination of real numbers and imaginary numbers, i.e. a + bi, where $a, b \in R$ and $i = \sqrt{-1}$. Complex numbers are cool: they expand the one-dimensional number line by adding a second dimension, which creates a number plane. Complex numbers provide a solution to equations where no real solution can be found.

We can relate the number sets above by expressing them as subsets:

$$N \subset Z \subset Q \subset R \subset C$$

This reads: The Natural number set N is a subset of the Integer set Z, which is a subset of the rational number set Q, which is a subset of the real number set R, which is a subset of the complex number set C.

Dividing by zero

In standard algebra the quotient $\frac{a}{0}$, $a \in R$, is not defined. So $\frac{a}{0}$ is not an element of any of the number sets above. Compare the calculus chapter 3, where we apply limits, i.e. a denominator is decreasing to an infinitesimally small unit, see equation (3.1).

Problems for Chapter 1: Number Theory
(The answers are available upon request. Email Gunter Meissner at gunter@dersoft.com)

1.1 Name 3 examples of sets in the real world, which can only be measured (counted) by natural numbers.
1.2 Negative numbers were rejected until the 17th century by many mathematicians. Name 2 examples of negative numbers appearing in the financial world.
1.3 “Every natural number is also a rational number, but not vice versa”. Is this statement true?
1.4 Name a rational number, which is not a natural number.
1.5 Name a real number that is not a natural number.
1.6 “Irrational numbers are irrational”. Comment on this statement.
1.7 Name an irrational number that is not a rational number.
1.9 “Imaginary numbers have no analytical solution. Therefore they don’t make sense”. Comment on this statement.
1.10 Can a complex number be expressed as a real number?
1.11 “Only God can divide by zero, humans can’t”. Comment.

2.
Algebra

There are six basic algebraic operations: addition and subtraction; multiplication and division; exponentiation and extracting roots. When these operations are combined, they have to be performed in the order of: first exponentiation and extracting roots, then multiplication and division, then addition and subtraction. So if $y = a + b\,c^d$, then $c^d$ has to be performed first, then the multiplication with b, then ‘a’ is added.

Example: What is the solution of $y = 1 + 2 \times 3^4$? It is $81 \times 2 + 1 = 163$. This is the only correct solution.

Many rules exist in Algebra. Here are the most applied ones.

$a^0 = 1$, where $a \in R$ and a ≠ 0
$(a + b)^2 = a^2 + 2ab + b^2$
$(a - b)^2 = a^2 - 2ab + b^2$
$a^m a^n = a^{(n+m)}$
$a^m / a^n = a^{(m-n)}$
$(a\,b)^n = a^n b^n$
$\left(\frac{a}{b}\right)^2 = \frac{a^2}{b^2}$
$a^{-n} = \frac{1}{a^n}$ and $a^n = \frac{1}{a^{-n}}$
$a^{(1/p)} = \sqrt[p]{a}$
$a^{(q/p)} = \sqrt[p]{a^q}$

For more algebraic rules, see http://orion.math.iastate.edu/dept/links/formulas/form1.pdf

2.1 On the Logarithm

Logarithms can help to solve algebraic equations. In particular, they can solve an equation for the exponent. Logarithms in finance often help to more conveniently display exponential growth rates. So let’s discuss them.

The idea of a logarithm is to reverse the operation of exponentiation. Simply put, the logarithm asks the question: What is the exponent to which we have to raise a given number (the base) to get another given number (x)? The notation of a logarithm is

$$y = \log_b(x) \qquad (2.1)$$

b is the base. We are trying to find y. y is the exponent to which b has to be raised to find the given number x. So if b = 10 and x = 1,000, the logarithm is y = 3, since $10^3$ = 1,000. Formally, $\log_{10}(1{,}000) = 3$. We read this as “the logarithm of 1,000 to the base 10 is 3”.

We often use the natural logarithm ‘ln’ in finance. This means the base b = e = 2.71828…. The notation is

$y = \log_e(x)$ or $y = \ln(x)$

Example: If x = 10, what is ln(x)? Well, we can just throw it into Excel and get ln(10) = 2.3026. This is correct since the base e = 2.71828 raised to the power of 2.3026 = 10.
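The order of operations and the logarithm as the reverse of exponentiation can also be checked outside of Excel. The following is a minimal sketch in Python (our choice of language here, the text itself works with Excel and MatLab). It only uses the standard `math` module, and the variable names are illustrative assumptions.

```python
import math

# Order of operations: exponentiation first, then multiplication, then addition
y = 1 + 2 * 3 ** 4          # 3**4 = 81, then 81 * 2 = 162, then 162 + 1

# Logarithm to the base 10: which exponent turns 10 into 1,000?
log10_of_1000 = math.log10(1000)

# Natural logarithm: which exponent turns e into 10?
ln_of_10 = math.log(10)

# Reversing the operation: e raised to the power ln(10) gives back 10
back_to_10 = math.exp(ln_of_10)

print(y, log10_of_1000, round(ln_of_10, 4), round(back_to_10, 6))
```

The last printed value shows that exponentiation undoes the logarithm, which is exactly the statement behind equation (2.1).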
Logarithms are also quite convenient mathematically, i.e. they are typically easy to differentiate and integrate. For example, if y = ln(x) ⇒ $\frac{dy}{dx} = \frac{1}{x}$. (⇒ stands for ‘it follows that’). For more on differentiation, see chapter 3.

Logarithmic Rules

There are some nice logarithmic rules, which come in handy to solve algebraic problems:

$\ln(e^a) = a \ln(e) = a$, and more generally $\ln(x^a) = a \ln(x)$

This equation helps us solve for exponents. For example, if we have the equation $y = x^a$ and y and x are given, we can solve for the exponent a by using $\ln(y) = \ln(x^a)$ or $\ln(y) = a \ln(x)$ or $a = \ln(y) / \ln(x)$.

Example: What is the solution of $100 = 7^a$? It is ln(100) = a ln(7) or a = ln(100) / ln(7) = 2.36659. (We can look up ln(100) and ln(7) easily on every calculator, Excel, MatLab, etc.)

Other helpful rules are

ln(ab) = ln(a) + ln(b)
ln(a/b) = ln(a) – ln(b)

We also use the natural logarithm to more conveniently display growth rates. Growth rates are relative changes, expressed as $(S_1 - S_0)/S_0$, where $S_t$ is the price of an asset at time t. For example if $S_1$ = 110 and $S_0$ = 100, the relative change is (110-100)/100 = 0.1 = 10%. We often approximate relative changes as

$$(S_1 - S_0)/S_0 \approx \ln(S_1/S_0) \qquad (2.2)$$

This is a good approximation for small differences between $S_1$ and $S_0$. The values $\ln(S_1/S_0)$ are called log-returns. The advantage of using log-returns is that they can be added over time. Relative changes are not additive over time. Let’s show this in an example.

Example: A stock price at t0 is $100. From t0 to t1, the stock increases by 10%. Hence the stock increases to $110. From t1 to t2 the stock increases again by 10%. So the stock price increases to $110 x 1.1 = $121. This increase of 21% is higher than the sum of the percentage increases of 10% + 10% = 20%. Hence percentage increases are not additive over time.

Let’s look at the log-returns. The log-return from t0 to t1 is ln(110/100) = 9.531%. From t1 to t2 the log-return is ln(121/110) = 9.531%. When adding these returns, we get 9.531% + 9.531% = 19.062%.
This is the same as the log-return from t0 to t2, i.e. ln(121/100) = 19.062%. Hence log-returns are additive in time.¹

¹ We could have also solved for the absolute value 121, which matches a log-return of 9.531%: ln(x/110) = 9.531%, or ln(x) - ln(110) = 9.531%, or ln(x) = ln(110) + 9.531%. Raising e to the power of both sides, we get $e^{\ln(x)} = x = e^{(\ln(110) + 0.09531)} = 121$.

We also often display exponential functions in finance on the logarithmic scale. If we have a stock growing exponentially with $e^x$, we have

[Figure 2.1: $y = e^x$ with respect to x]

Displaying this graph on a natural logarithmic scale, applying $\ln(e^x) = x$, we get

[Figure 2.2: The exponential stock price growth displayed on a logarithmic scale]

2.2 Vector and Matrix Algebra

We use vectors and especially matrices in finance. Figure 1 already displayed the natural number set as a vector. We use matrices in investments and risk management to display the covariance matrix of the assets in a portfolio. In finance, a covariance matrix measures how asset prices move together in time. We also use default correlation matrices in risk management. A default correlation matrix measures how probable the joint default of two entities is within a certain time period, for example a year.

A vector is a geometric entity, which is characterized by two properties: a) magnitude, which is measured by its length, and b) direction. Typically vectors have an origin (starting point) and an end point (but they can also be infinite as our number set in Figure 1).

In finance we often apply row vectors and column vectors. The notation for a row vector or horizontal vector is $\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix}$, where $a_x \in R$. The notation for a column vector or vertical vector is $\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$, $b_x \in R$.

Multiplying a row vector with a column vector results in a scalar (which is just the term for a single number in vector algebra, it comes from ‘scale’):

$$\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = a_1 b_1 + a_2 b_2 + a_3 b_3$$

Example: What is the result of $\begin{pmatrix} 1 & 2 & -3 \end{pmatrix} \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}$? It is 1 x 2 + 2 x 4 – 3 x 6 = -8.

A row vector and a column vector are special cases of a matrix. The row vector is a one-row matrix and the column vector is a one-column matrix. However, matrices can have several rows and columns. For example, a square matrix can have two rows and two columns, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, where $a, b, c, d \in R$.

Matrices can only be multiplied if the number of columns of the first matrix is identical to the number of rows of the second matrix. Matrix multiplication is done by

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \times \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix} \qquad (2.2)$$

Example:

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \times \begin{pmatrix} -2 & 3 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} -2 + 2 & 3 - 2 \\ -6 + 4 & 9 - 4 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -2 & 5 \end{pmatrix}$$

A matrix A can be transposed by exchanging the rows with the columns. The notation of a transposed matrix is $A^T$.

Example: What is the transpose of matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$? $A^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$

An Eigenvector (“eigen” is the German word for “own” or “self”, which is very important to know) is a special type of vector. A vector x is an eigenvector if it satisfies the condition A x = λ x, where A is a matrix, x is the eigenvector, and λ is the eigenvalue. Let’s look at an example:

Let $A = \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}$ and $x = \begin{pmatrix} 2 \\ -2 \end{pmatrix}$. x is an eigenvector since

$$A x = \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix} \begin{pmatrix} 2 \\ -2 \end{pmatrix} = \begin{pmatrix} 8 - 4 \\ 4 - 8 \end{pmatrix} = \begin{pmatrix} 4 \\ -4 \end{pmatrix} = 2 \begin{pmatrix} 2 \\ -2 \end{pmatrix}$$

It follows that the eigenvalue is λ = 2, since A x = 2 x (equivalently, x = ½ A x).

Geometrically, multiplying an eigenvector x with A leaves x unchanged, stretches it, shrinks it, flips it (so that it points in the exact opposite direction), or flips and stretches or flips and shrinks it. These are the only changes that can occur from multiplying a scalar (λ) and a vector (x).

For more on matrices and vectors, go to www.wiley.com/go/correlationriskmodeling, ‘Mathrefresher’.

Problems for Chapter 2: Algebra
(The answers are available upon request.
Email Gunter Meissner at gunter@dersoft.com)

1.1 Solve 3 x 4 + 1 x 24-2
1.2 Solve 4y + 5z. Is this a trick question?
1.3 Solve 4y x 5z
1.4 Solve 1 2 x2 2
1.5 Solve $(2 + 3)^2$
1.6 Solve $(3 - 4)^2$
1.7 Solve $2^2 \times 2^3 - 3^2$
1.8 Solve 1.8 Solve 778 777 1 Another trick question? 3 4 4 42 2 2 1.10 4 2 Solve x 2 4 1.12 4 Solve 4 4
1.12 Solve 20b = 9 for b 10 11
1.13 Solve ln(3/2)
1.14 a c 3 Solve b d 4 1.15 3 a c Solve 4 Enough with the trick questions already! b d 5
1.16 Given is the matrix $\begin{pmatrix} 8 & 4 \\ 4 & 8 \end{pmatrix}$. Is the vector $x = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$ an eigenvector? If yes, what is the eigenvalue?

3. Calculus

Finally calculus! Everyone likes calculus since we can deal with infinities and other cool stuff like finding optima, calculating surfaces etc. Calculus has two main operations: 1) Differentiation and 2) Integration. We use differentiation a lot in Finance, for example to calculate the risk parameters (called Greeks) of options, to see the marginal impact of a parameter change (such as volatility or asset price) on a portfolio, to optimize a portfolio etc.

3.1 Differentiation

Definition: A mathematical derivative measures how one variable changes, if another variable changes by an infinitesimally small amount.

The derivative of a function y(x) is the slope of that function for an infinitesimally small change in x, formally $\frac{dy}{dx}$. $\frac{dy}{dx}$ is the ‘Leibniz notation’ by Gottfried Leibniz. Other notations are the Lagrange notation f’(x) or y’, or the Newton notation $\dot{y}$. Hence $\frac{dy}{dx} \equiv f'(x) \equiv y' \equiv \dot{y}$. ≡ stands for ‘is equivalent to’.

Let’s derive $\frac{dy}{dx}$ graphically in Figure 3.

[Figure 3: The tangent of a function y(x), with the discrete changes ∆x and ∆y at the point $x_1$]

$\frac{dy}{dx}$ is the slope of the tangent in point $x_1$. We can now find the derivative by letting ∆x, the discrete change of x, get smaller and smaller, formally

$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx} \qquad (3.1)$$

Equation (3.1) reads: The limit of $\frac{\Delta y}{\Delta x}$, if ∆x approaches 0, is $\frac{dy}{dx}$. dx is now an infinitesimally small change of x. The slope of the function y(x) in dx, i.e. in the point $x_1$, is $\frac{dy}{dx}$.

3.1.1.
Differentiation rules

There are several major rules for finding a derivative.

1) Power rule

If $y = x^n$ → $\frac{dy}{dx} = n\,x^{n-1}$, n ≠ 0. Example: If $y = x^3$ → $\frac{dy}{dx} = 3x^2$

2) Constant factor rule

This rule allows us to leave a constant ‘a’ unchanged when differentiating. Hence we have

If $y = a\,x^n$ → $\frac{dy}{dx} = a\,n\,x^{n-1}$, n ≠ 0. Example: If $y = 2x^3$ → $\frac{dy}{dx} = 2 \times 3x^2 = 6x^2$

3) Sum rule

This rule states that in a function which consists of two or more sum terms, each sum term can be differentiated individually. Hence for two sum terms, we have

If y(x) = u(x) + v(x) → $\frac{dy}{dx} = \frac{du}{dx} + \frac{dv}{dx}$. Example: If $y = 2x^3 + 3x^2$ → $\frac{dy}{dx} = 6x^2 + 6x$

4) Product rule

If y(x) = u(x) v(x) → $\frac{dy}{dx} = \frac{du}{dx}\,v(x) + u(x)\,\frac{dv}{dx}$

Example: $y(x) = 2x^2 \cdot 3x$ → $\frac{dy}{dx} = 4x \cdot 3x + 3 \cdot 2x^2 = 12x^2 + 6x^2 = 18x^2$

How could we have derived this result faster? See problem 3.5.

5) Quotient rule

If $y(x) = \frac{u(x)}{v(x)}$ → $\frac{dy}{dx} = \frac{\frac{du}{dx}\,v(x) - u(x)\,\frac{dv}{dx}}{[v(x)]^2}$

Example: If $y(x) = \frac{3x^2}{2x}$ → $\frac{dy}{dx} = \frac{6x \cdot 2x - 2 \cdot 3x^2}{4x^2} = \frac{12x^2 - 6x^2}{4x^2} = \frac{6x^2}{4x^2} = \frac{3}{2}$

How could we have derived this result faster? See problem 3.7.

6) Chain rule

If y = f(u) and u = g(x) → $\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}$

Example: If $y(x) = \sqrt{1 + 2x^3} = (1 + 2x^3)^{1/2}$ → $\frac{dy}{dx} = \frac{1}{2}(1 + 2x^3)^{-1/2} \cdot 6x^2 = \frac{3x^2}{\sqrt{1 + 2x^3}}$

7) Some specific differentiation rules often applied in Finance

If y(x) = ln(x) → $\frac{dy}{dx} = \frac{1}{x}$

If $y(x) = e^{u(x)}$ → $\frac{dy}{dx} = e^{u(x)}\,\frac{du}{dx}$. Example: If $y(x) = e^{2x}$ → $\frac{dy}{dx} = e^{2x} \cdot 2 = 2e^{2x}$

If $y(x) = e^x$ → $\frac{dy}{dx} = e^x$

Let’s look at this convenient derivative geometrically. The function $y(x) = e^x$ is displayed in Figure 3.1.

[Figure 3.1: The function $y(x) = e^x$]

From Figure 3.1 we can observe that if $y(x) = e^x$, then $\frac{dy}{dx} = e^x$. In particular,

y(-5) ≈ 0 as well as $\frac{dy}{dx}$ at x=-5 ≈ 0. I.e. at x=-5 the function $e^x$ has the y-axis value of close to zero and the slope at y(-5) is also close to zero as seen from the tangent (in red).

y(0) = 1 as well as $\frac{dy}{dx}$ at x=0 = 1. I.e. at x=0 the function $e^x$ has the y-axis value of 1, and the slope at y(0) is also 1 as seen from the tangent (in blue).

y(1) = e as well as $\frac{dy}{dx}$ at x=1 = e. I.e.
at x=1 the function $e^x$ has the y-axis value of e = 2.71828… and the slope at y(1) is also e = 2.71828… as seen from the tangent (in green).

3.1.2 Partial Mathematical Derivative

Definition: Given is a function with several independent variables. A partial mathematical derivative is the derivative of that function with respect to one variable, assuming the other variables are constant.

The notation for the partial derivative operator is typically ∂, pronounced as ‘d’ or ‘del’ or ‘partial’. ∂ is not a letter from the Greek alphabet and should not be confused with the Greek delta δ or sigma σ.

Let’s look at an example of a partial derivative.

If y(x,z) = 2x + xz + 4z → $\frac{\partial y}{\partial x} = 2 + z$. I.e. the partial derivative of the function y(x,z) = 2x + xz + 4z with respect to x is 2 + z, assuming the variable z is constant.

If y(x,z) = 2x + xz + 4z → $\frac{\partial y}{\partial z} = x + 4$. I.e. the partial derivative of the function y(x,z) = 2x + xz + 4z with respect to z is x + 4, assuming the variable x is constant.

We use partial derivatives a lot in Finance. For example we partially differentiate the Nobel Prize-winning Black-Scholes-Merton option model. For a call option, we have

$$C = S_0\,e^{-qT} N(d_1) - K\,e^{-rT} N(d_2) \qquad (3.2)$$

where

$$d_1 = \frac{\ln\left(\frac{S_0 e^{-qT}}{K e^{-rT}}\right) + \frac{1}{2}\sigma^2 T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}$$

The first partial derivative of equation (3.2) with respect to S, $\frac{\partial C}{\partial S}$, is called ‘Delta’. It tells us how the call price changes for an infinitesimally small change of S, assuming all other variables q, T, r, and σ are constant. See problem 3.12.

3.1.3 Finding the maximum and minimum of a function with a mathematical derivative

We can find the maximum or minimum of a function by differentiating the function, setting the derivative to zero and solving for x. If the second derivative is positive, we found a minimum; if the second derivative is negative, we found a maximum.

Let’s look at an example. We have the function $y(x) = x^2$ → $\frac{dy}{dx} = 2x$. We set this to zero, i.e. 2x = 0. The solution is x=0. So at x=0, we have a minimum or maximum of the function $y(x) = x^2$. The second derivative is $\frac{d^2y}{dx^2} = 2$. Since 2 > 0, we have found a minimum at x=0, which is verified by Figure 3.2.

[Figure 3.2: The function $y(x) = x^2$ with a minimum at x=0]

For the function $y(x) = x^{-2}$, we have a maximum at x=0, see problem 3.13.

3.2 Integration

Integration is the reverse operation to differentiation. It was developed by Isaac Newton and Gottfried Leibniz in the 17th century. In fact, there was quite a quarrel between the two as to who the primary inventor was. Leibniz published his results first, but may have peeked at Newton’s notes while in London. This reminds us to be honest. You don’t want to go into history as a plagiarizer…

There are several different types of integration concepts such as the Riemann Integral, the Lebesgue Integral, the Riemann-Stieltjes Integral and more. Generally, we can derive a heuristic (means non-mathematical) definition as

Definition: A mathematical integral measures the area of a function, which is bounded horizontally by y = f(x) and the x-axis, and vertically bounded by x=a and x=b.

In Riemann notation we express an integral as

$$\int_a^b f(x)\,dx \qquad (3.2)$$

In (3.2) the integral sign ∫ is a stretched letter ‘s’, coming from the word ‘sum’. In fact, in a sum we add discrete units ∆x, whereas in an integral we add infinitesimally small units dx. Actually, the Riemann integral can be derived by starting with the ‘Riemann sum’, which adds units of ∆x and then shrinks the ∆x to get the dx. See http://en.wikipedia.org/wiki/Riemann_integral for more details.

dx in the integral (3.2) is just notation, indicating that we are summing up the infinitesimally small values dx. x is a place holder, also called a ‘dummy variable’, since it is replaced by the limits a and b during the process of integration. x=a and x=b are vertical limits, the beginning and end of the domain of integration.
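The idea of adding up ever smaller units ∆x can be sketched numerically. The following Python snippet is only an illustration and not the formal Riemann construction: it uses a midpoint sum, and the function name `riemann_sum` as well as the slice counts are our own assumptions. It is applied to y(x) = x² on the domain a=1, b=2.

```python
def riemann_sum(f, a, b, n):
    """Approximate the integral of f from a to b with n slices of width dx.

    Each slice contributes the area f(midpoint) * dx (midpoint rule).
    """
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def square(x):
    return x ** 2


# The approximation improves as the slices dx get smaller
coarse = riemann_sum(square, 1.0, 2.0, 10)
fine = riemann_sum(square, 1.0, 2.0, 1000)
print(coarse, fine)
```

With more slices the sum approaches the area under the curve, which is the intuition behind replacing the discrete ∆x by the infinitesimal dx.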
Example: Let’s look again at the function $y(x) = x^2$. Let’s graphically show the integral of $y(x) = x^2$ in the domain a=1 and b=2:

[Figure 3.3: The integral of the function $y(x) = x^2$ for the domain a=1, b=2 is $\int_1^2 x^2\,dx$]

Let’s calculate the integral $\int_1^2 x^2\,dx$. We have

$$\int_a^b f(x)\,dx = F(b) - F(a) \qquad (3.3)$$

where, for a function $f(x) = x^q$, F(x) is

$$F(x) = \frac{x^{q+1}}{q+1} + C, \quad q \neq -1 \qquad (3.4)$$

and C is the arbitrary constant of integration. Applying equation (3.4) to our example $y(x) = x^2$, we have q=2. Let’s apply equations (3.3) and (3.4) to integrate the function $y(x) = x^2$ in the domain a=1, b=2, with C=0:

$$\int_a^b f(x)\,dx = \int_1^2 x^2\,dx = F(2) - F(1) = \frac{2^3}{3} - \frac{1^3}{3} = \frac{7}{3} = 2\tfrac{1}{3}$$

This means that the area under the function $y(x) = x^2$ from x=1 to x=2 is $2\tfrac{1}{3}$, compare Figure 3.3.

3.2.1 The one-dimensional Integral

The domain of the integral is often an area, so it is two-dimensional. However, the domain of integration can also be higher dimensional, such as a volume (three-dimensional) or n-dimensional, n>3. Also, we can apply integration for a one-dimensional function, i.e. a real line. We do this in Finance when we create the expected value of an asset which follows the Geometric Brownian motion (GBM). The GBM is

$$\frac{dS}{S} = \mu_S\,dt + \sigma_S\,\varepsilon_t \sqrt{dt} \qquad (3.5)$$

where S is an asset such as a stock price, hence dS/S is the relative change of S (see chapter 2), $\mu_S$ is the average growth rate of S, $\sigma_S$ is the volatility of S, and $\varepsilon_t$ is a random drawing from a standardized normal distribution at time t (see chapter 4 for more details).

Let’s find the expected value of the asset S in equation (3.5). The expected value of the normally distributed variable ε is 0, formally E(ε) = 0. Equation (3.5) then reduces to

$$\frac{dS}{S} = \mu_S\,dt \qquad (3.6)$$

To derive the expected value of S at a future time T, $E(S_T)$, we sum up, i.e. integrate, the infinitesimally small units in time $\mu_S\,dt$. Hence we have

$$\int_0^T \frac{dS}{S} = \int_0^T \mu_S\,dt \qquad (3.7)$$

For the left side of equation (3.7), we apply $\int \frac{dS}{S} = \ln(S)$. For the right side of equation (3.7), we apply that the integral of a constant a is $\int a\,dx = a\,x$.²
So for the right side we have $\int_0^T \mu_S\,dt = F(T) - F(0) = \mu_S T - \mu_S \times 0 = \mu_S T$. Hence, when integrating equation (3.7), we derive

$$\ln[E(S_T)] = \mu_S T + \ln(S_0)$$

where $\ln(S_0)$ is the integral constant C. Raising e to the power of both sides of this equation, we derive

$$e^{\ln[E(S_T)]} = e^{[\mu_S T + \ln(S_0)]}$$

Applying $e^{\ln(x)} = x$ and $e^{(x+y)} = e^x e^y$, we derive for the expected value of S at time T

$$E(S_T) = S_0\,e^{\mu_S T} \qquad (3.8)$$

Equation (3.8) states that the expected value of the asset S at time T is simply the starting value $S_0$ (today’s value) multiplied with $e^{\mu_S T}$. For example, if a stock today has a price of $S_0$ = $100 and the expected growth rate of the stock is 10%, in T=1 year the stock price is expected to be $S_T$ = 100 x 2.71828 ^ 0.1 = $110.52.

² We could apply equation (3.4) to show that $\int a\,dx = a\,x$. From equation (3.4), we can write $\int a\,x^0\,dx = a\,\frac{x^{0+1}}{0+1} + C$. Since $x^0 = 1$, we have $\int a\,dx = a\,x + C$.

3.2.2 Some popular Integrals in Finance

Here is a list of some often applied integrals in Finance:

$\int a\,dx = a\,x + C$, where a is a constant, $a \in R$, and C is the constant of integration
$\int e^x\,dx = e^x + C$ (see chapter 3.1.1. for details)
$\int e^{ax}\,dx = \frac{1}{a}\,e^{ax} + C$
$\int \ln(x)\,dx = x \ln(x) - x + C$
$\int \varphi(x)\,dx = \Phi(x) + C$, where φ is the pdf (probability density function) and Φ is the cdf (cumulative distribution function) of a standard normal distribution, see chapter 4 for details.

Problems for Chapter 3: Calculus
(The answers are available upon request. Email Gunter Meissner at gunter@dersoft.com)

3.1 What does the derivative $\frac{dy}{dx}$ tell us?
3.2 Explain the equation $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx}$ briefly.
3 3.3 Differentiate x 2 3 3.4 Differentiate x 4 4x 2 2x 3
3.5 If $y(x) = 2x^2 \cdot 3x$ → $\frac{dy}{dx} = 4x \cdot 3x + 3 \cdot 2x^2 = 12x^2 + 6x^2 = 18x^2$. We could have derived this result faster by using $2x^2 \cdot 3x = 6x^3$. Differentiating $6x^3$ = ?
3.6 Differentiate $y(x) = x^3 \ln(x)$
3.7 If $y(x) = \frac{3x^2}{2x}$ → $\frac{dy}{dx} = \frac{6x \cdot 2x - 2 \cdot 3x^2}{4x^2} = \frac{12x^2 - 6x^2}{4x^2} = \frac{6x^2}{4x^2} = \frac{3}{2}$. We could have derived this result faster by using $\frac{3x^2}{2x} = \frac{3}{2}x$. Differentiating $\frac{3}{2}x$ = ?
3.8 Differentiate y(x) x3 .
2x 2
3.9 Differentiate y(x) (2x 1)3
3.10 Differentiate y(x) ex ln( x )
3.11 Differentiate y(x, z) 4x 2 3z partially with respect to z
3.12 OK. This is for the courageous student. In finance, we partially differentiate the Nobel Prize-winning Black-Scholes-Merton option pricing model

$$C = S_0\,e^{-qT} N(d_1) - K\,e^{-rT} N(d_2) \qquad (3.2)$$

where

$$d_1 = \frac{\ln\left(\frac{S_0 e^{-qT}}{K e^{-rT}}\right) + \frac{1}{2}\sigma^2 T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}$$

to find the ‘Greeks’. The Greeks consist of the Delta $\frac{\partial C}{\partial S}$, the Gamma $\frac{\partial^2 C}{\partial S^2}$, the Vega $\frac{\partial C}{\partial \sigma}$, and the Theta $\frac{\partial C}{\partial T}$. Give it a try… don’t get frustrated now.
3.13 Derive the function $y(x) = \frac{1}{x^2}$ in an Excel spreadsheet. Find the maximum of the function by differentiating $\frac{1}{x^2}$, setting it to zero and solving for x. Why did you find a maximum?
3.14 Solve $\int_a^b 2x^2\,dx$ for the domain a = -1, b = 1
3.13 Solve $\int a\,dx$, where $a \in R$
3.15 Solve $\int 0\,dx$. Is this a trick question? Ahhh, not really.
3.16 Solve $\int e^{2x}\,dx$

4. Statistics

Statistics can be fun. We get to draw colorful graphs called distributions, figure out the math for them, integrate them and see if they fit the real world. Statistics can be divided into two main branches: 1) Descriptive statistics, which deals with collecting and interpreting data, and 2) Analytical or mathematical statistics, which is mainly probability theory, but also the design of experiments such as how to forecast election results. In this refresher we will concentrate on Descriptive Statistics.

4.1. Distributions and Moments

Informal Definition: A probability density function (pdf) is a distribution, which assigns probabilities to the outcomes of random events.

Importantly, pdfs are non-negative everywhere and the sum of the probabilities of all outcomes, i.e. the integral of the entire function, is 1.

In Finance, for convenience, we often use the normal distribution to model variables such as stock prices, interest rates, commodity prices etc.
The normal distribution, also called the bell-shaped or Gaussian curve, named after Carl Friedrich Gauss, looks as follows:

[Figure 4.1: PDF of the standard normal distribution]

The normal distribution is quantified with

$$f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} \qquad (4.1)$$

where μ is the mean, $\sigma^2$ is the variance, and σ is the standard deviation. In Figure 4.1, we see a special case of the normal distribution, the standard normal distribution with a mean μ = 0 and a variance $\sigma^2$ = 1. In this case, equation (4.1) reduces to

$$f(x; 0, 1) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2} \qquad (4.2)$$

In stochastic processes (stochastic means unknown, so non-deterministic), we often sample from a standard normal distribution. This means that we randomly draw a sample from the x-axis of a normal distribution. The notation for the sample is typically ε. It can be derived as =normsinv(rand()) in Excel or randn() in MatLab. As seen from Figure 4.1, the expected value of ε is E(ε) = 0. See also problem 4.7.

If we integrate a probability density function, we derive the cumulative distribution function (CDF), as seen in Figure 4.2:

[Figure 4.2: CDF of a standard normal distribution]

Importantly, a CDF at a certain point x* gives the probability of the random variable falling in the interval (-∞, x*]. In other words, it is the probability of the event to have a value of ≤ x*.

The CDF of a normal distribution, shown for the standard normal case in Figure 4.2, cannot be quantified with elementary functions, but we can use the error function erf to derive it

$$F(x; \mu, \sigma^2) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right] \qquad (4.3)$$

where $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$. t is just a place holder, the dummy variable of integration, which is replaced with the limits 0 and x in the process of integration (the attentive reader remembers this from chapter 3!!!)

In Finance we often use the log-normal distribution. The PDF of a lognormal distribution is

$$f_l(x; \mu, \sigma) = \frac{1}{x\,\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{\ln(x) - \mu}{\sigma}\right)^2} \qquad (4.4)$$

where μ and σ are the mean and standard deviation of ln(x).
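Equations (4.1), (4.3) and (4.4) can be evaluated with a few lines of code. The following Python sketch is an illustration only: the function names are our own assumptions, and it relies on `math.erf` from the standard library for the error function.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """PDF of the normal distribution, equation (4.1)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the normal distribution via the error function, equation (4.3)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def lognormal_pdf(x, mu=0.0, sigma=1.0):
    """PDF of the lognormal distribution, equation (4.4); only defined for x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


# Peak of the standard normal pdf, probability of a value below 0,
# and the lognormal pdf at x = 1 (where ln(x) = 0)
print(normal_pdf(0.0), normal_cdf(0.0), lognormal_pdf(1.0))
```

The defaults μ=0 and σ=1 give the standard normal case of equation (4.2) and the lognormal case plotted with μ=0 and σ=1.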
[Figure 4.3: The PDF of a lognormal distribution with μ=0 and σ=1.]

We often assume in Finance that an asset such as a stock grows in time according to a lognormal distribution. In Figure 4.4 we observe that the asset S is expected to grow with the growth rate μ to $E(S_T)$. The value of $E(S_T)$ was derived in equation (3.8). We also observe from Figure 4.4 that the value which an asset price can take in the future falls within the lognormal distribution. In particular, we observe from Figure 4.4 that the asset price S can increase sharply, however with a low probability P1. The asset price can also decrease sharply, however with the low probability P2. In a lognormal distribution the asset price cannot become negative, which is in line with most asset prices such as stocks, bonds and commodities in reality. Hence altogether, many researchers believe that the lognormal distribution is a good representation of financial assets in the real world. However, this is an empirical question and depends on the asset, time frame, and geography. Some researchers may disagree and prefer the normal distribution to model assets. See problem 4.8 for more.

[Figure 4.4: An asset S, represented in time with the log-normal distribution.]

Moments

Statistical distributions are characterized by their moments. The first four moments are

1st Moment: Mean; represented by the expected value of a distribution, $E(S_T)$ in Figure 4.4 (for a calculation see below)
2nd Moment: Variance; loosely speaking how ‘wide’ the distribution is (for a calculation see below)
3rd Moment: Skewness, i.e. how lopsided or asymmetric a distribution is
4th Moment: Excess Kurtosis, i.e. how fat the tails of a distribution are

By definition the standard normal distribution of Figure 4.1 and 4.2 has a first, third, and fourth moment (defined as excess kurtosis, not kurtosis) of zero and the second moment is 1.
The standard lognormal distribution in Figure 4.3 and 4.4 has a

1st Moment (mean) of $e^{\mu + \frac{1}{2}\sigma^2}$. Hence with μ=0 and σ=1, it follows that the mean is 1.65, compare Figure 4.1.

2nd Moment (variance) of $(e^{\sigma^2} - 1)\,e^{2\mu + \sigma^2}$. Hence with μ=0 and σ=1, it follows that the variance is 4.6708.

3rd Moment (skewness). From Figure 4.3 and 4.4 we observe that the skewness is bigger than 0, since the distribution is skewed to the right. In fact the skewness of a log-normal distribution is $(e^{\sigma^2} + 2)\sqrt{e^{\sigma^2} - 1}$. Hence with μ=0 and σ=1, we derive a skewness of 6.1849.

4th Moment (kurtosis). From Figure 4.3 and 4.4, we can already conclude that the excess kurtosis of the lognormal distribution is positive, since the distribution shows a fat right tail. The excess kurtosis of a log-normal distribution is $e^{4\sigma^2} + 2e^{3\sigma^2} + 3e^{2\sigma^2} - 6$. For σ=1, this results in a kurtosis of 110.94, showing indeed that the lognormal distribution has a very fat tail. We recommend a diet.

4.2 Time Series and Correlation

In Finance we often analyze time series of financial assets such as stocks, bonds, currencies, commodities etc. We typically look at the correlation between these asset time series to assess the profit potential and the risk.

Definition: Financial correlations measure how two or more financial assets move together in time.

Let’s analyze two time series in an example: Let’s assume we have a portfolio of 2 stocks, A* and B*. They have performed as in Table 4.1:

Year    Asset A*    Asset B*    Asset A* return in %    Asset B* return in %
2008    100         200
2009    120         230         20.00%                  15.00%
2010    108         460         -10.00%                 100.00%
2011    190         410         75.93%                  -10.87%
2012    160         480         -15.79%                 17.07%
2013    280         380         75.00%                  -20.83%

Table 4.1: Performance of a portfolio with two assets

The return of an asset is expressed as a percentage. I.e.

$$\text{Return}(S_t) = \frac{S_t - S_{t-1}}{S_{t-1}} \qquad (4.5)$$

where $S_t$ is the price of asset S at time t. So the return of asset A* at the end of 2009 is (120-100)/100 = 0.2 = 20%. See also chapter 2.1 on the approximation of the return in equation (4.5) with the logarithm.
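The return columns of Table 4.1 can be reproduced from the prices with equation (4.5). The Python sketch below is an illustration, with function and variable names of our own choosing. It also computes the log-returns of chapter 2.1 for comparison and uses their additivity over time.

```python
import math

# Year-end prices 2008 to 2013 from Table 4.1
prices_a = [100, 120, 108, 190, 160, 280]  # Asset A*
prices_b = [200, 230, 460, 410, 480, 380]  # Asset B*


def simple_returns(prices):
    """Percentage returns (S_t - S_{t-1}) / S_{t-1}, equation (4.5)."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def log_returns(prices):
    """Log-returns ln(S_t / S_{t-1}), see chapter 2.1."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


ret_a = simple_returns(prices_a)
ret_b = simple_returns(prices_b)

# Log-returns are additive: their sum equals the log-return over the whole period
total_log_a = sum(log_returns(prices_a))

print([f"{r:.2%}" for r in ret_a])
print([f"{r:.2%}" for r in ret_b])
print(total_log_a, math.log(prices_a[-1] / prices_a[0]))
```

Note that the yearly moves in this example are large, so the individual log-returns differ noticeably from the percentage returns; the approximation of chapter 2.1 is meant for small changes.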
Let’s define the return of asset A* as A, and the return of asset B* as B. The average return of asset A* for the time frame 2009 to 2013 is $\mu_A$ = 29.03%; for asset B* the average return is $\mu_B$ = 20.07%. If we assign a weight to asset A, $w_A$, and a weight to asset B, $w_B$, the portfolio return is

$$\mu_{Port} = w_A\,\mu_A + w_B\,\mu_B \qquad (4.6)$$

where $w_A + w_B = 1$.

The standard deviation of returns, termed volatility, is derived for an asset A with equation

$$\sigma_A = \sqrt{\frac{1}{n-1} \sum_{t=1}^{n} (A_t - \mu_A)^2} \qquad (4.7)$$

where $A_t$ is the return of asset A* at time t and n is the number of observed time units. A standard deviation or volatility measures how far the numbers in the time series diverge from their mean. From our example in Table 4.1, we find that the standard deviation of the returns of asset A* is $\sigma_A$ = 44.51% and that of asset B* is $\sigma_B$ = 47.58% (try to derive this yourself. If you can’t, I can send a spreadsheet with the solution).

The covariance of returns for assets A and B is derived with equation

$$COV_{AB} = \frac{1}{n-1} \sum_{t=1}^{n} (A_t - \mu_A)(B_t - \mu_B) \qquad (4.8)$$

The covariance tells us how asset return A and asset return B move together in time. For our example in Table 4.1 we derive $COV_{AB}$ = -0.1568. Since the covariance is negative, we can conclude that on average, if A increases, B decreases, and vice versa.

An even easier to interpret correlation measure is the Pearson correlation coefficient ρ, which is a standardized covariance, i.e. it takes values between -1 and +1. The equation for ρ is

$$\rho = \frac{COV_{AB}}{\sigma_A\,\sigma_B} \qquad (4.9)$$

For our example in Table 4.1, ρ = -0.1568 / (0.4451 x 0.4758) = -0.7403, confirming that the asset returns A and B are negatively correlated. In fact the negative correlation is quite strong, since -0.7403 is quite close to -1.

We can calculate the standard deviation for the returns of our two-asset portfolio as

$$\sigma_{Port} = \sqrt{w_A^2\,\sigma_A^2 + w_B^2\,\sigma_B^2 + 2\,w_A\,w_B\,COV_{AB}} \qquad (4.9)$$

With equal weights, i.e. $w_A = w_B = 0.5$, the example in Table 4.1 results in $\sigma_{Port}$ = 16.66%.
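For readers who prefer code to a spreadsheet, the numbers of this example can be retraced as follows. This is a minimal Python sketch under our own naming assumptions, using only the standard library; it starts from the prices in Table 4.1 and applies equations (4.5) to (4.9) with the n-1 divisor.

```python
import math

# Year-end prices 2008 to 2013 from Table 4.1
prices_a = [100, 120, 108, 190, 160, 280]  # Asset A*
prices_b = [200, 230, 460, 410, 480, 380]  # Asset B*


def returns(prices):
    """Percentage returns, equation (4.5)."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance with divisor n-1, equation (4.8)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


ret_a, ret_b = returns(prices_a), returns(prices_b)
mu_a, mu_b = mean(ret_a), mean(ret_b)

# The variance is the covariance of a series with itself, equation (4.7)
sigma_a = math.sqrt(covariance(ret_a, ret_a))
sigma_b = math.sqrt(covariance(ret_b, ret_b))
cov_ab = covariance(ret_a, ret_b)
rho = cov_ab / (sigma_a * sigma_b)          # Pearson correlation

# Equal-weighted two-asset portfolio
w_a = w_b = 0.5
mu_port = w_a * mu_a + w_b * mu_b           # equation (4.6)
sigma_port = math.sqrt(
    w_a**2 * sigma_a**2 + w_b**2 * sigma_b**2 + 2 * w_a * w_b * cov_ab
)

print(f"mu_A={mu_a:.2%} mu_B={mu_b:.2%} sigma_A={sigma_a:.2%} sigma_B={sigma_b:.2%}")
print(f"cov={cov_ab:.4f} rho={rho:.4f} sigma_Port={sigma_port:.2%}")
```

Changing `w_a` and `w_b` (keeping their sum at 1) shows how the portfolio volatility reacts to the weights.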
Importantly, the standard deviation (or its square, the variance) is interpreted in finance as risk. The higher the standard deviation, the higher is the risk of an asset or a portfolio. Is standard deviation a good measure of risk? The answer is: It’s not great, but it’s pretty much the only one we have. A high standard deviation may mean high upside potential! So it penalizes possible profits! But high standard deviation naturally also means high downside risk. In particular, risk averse investors will not like a high standard deviation, i.e. high fluctuation of their returns.

An informative performance measure of an asset or a portfolio is the risk-adjusted return, i.e. the return/risk ratio. For a portfolio, it is µPort/σPort, with µPort from equation (4.6) and σPort from the portfolio standard deviation equation (4.9). In Figure 4.5 we observe one of the few ‘free lunches’ in finance: The lower, preferably negative, the correlation of the assets in a portfolio, the higher is the return/risk ratio. For a rigorous proof, see Markowitz (1952) and Sharpe (1964).

[Figure: "Mue/Sigma with respect to Correlation". x-axis: Correlation, from -1 to 1; y-axis: Mue/Sigma, from 0% to 250%.]

Figure 4.5: The negative relationship between return µ / risk σ with respect to the correlation of the assets in the portfolio ρ.

Figure 4.5 shows the high impact of correlation on the return/risk ratio. A high negative correlation results in a return/risk ratio of close to 250%, whereas a high positive correlation results in a 50% ratio.

Problems for Chapter 4: Statistics
(The answers are available upon request. Email Gunter Meissner at gunter@dersoft.com)

4.1 What are the characteristics of a statistical distribution?

4.2 What is the difference between a pdf and a cdf?

4.3 What are the 4 moments of a distribution and what information does each moment contain?

4.4 What are the numerical values of the 4 moments of a standard normal distribution?

4.5 Let’s assume we have two assets A and B.
They have performed as follows:

Year    Asset A    Asset B    Asset A return in %    Asset B return in %
2008    100        200
2009     90        230
2010    130        200
2011    180        220
2012    160        240
2013    200        200

4.5a Calculate the return of each asset.

4.5b Which asset has performed better? (calculate the mean return)

4.5c Which asset is riskier? (calculate the volatility, i.e. standard deviation of returns)

4.5d What is the correlation between the asset returns? (calculate the covariance and Pearson’s ρ)

4.5e Calculate the overall portfolio risk, assuming equal weights.

4.5f Calculate the risk-adjusted return of the portfolio, i.e. µPort/σPort.

4.6 For extra credit (who gets that credit anyway?): Which weights wA and wB of asset A and B minimize the portfolio risk?

Hint: Start with the portfolio standard deviation equation (4.9), $\sigma_{Port} = \sqrt{w_A^2 \sigma_A^2 + w_B^2 \sigma_B^2 + 2 w_A w_B COV_{AB}}$. Using wB = 1 - wA, differentiate this equation with respect to wA. Set the derivative to 0 and solve for wA. Insert σA, σB and COVAB and voilà, you will have wAmin, the weight of asset A that minimizes the portfolio volatility.

4.7 Derive the random drawing from a standard normal distribution ε in an Excel spreadsheet via =NORMSINV(RAND()). Build a histogram to show that ε is standard normally distributed.

4.8 There is a discussion about which distribution fits asset prices better, the normal distribution or the lognormal distribution. What tool could we provide traders to address this problem?