Trinity Business School
Econometrics Lecture 5: Generalised Linear Models and Event Studies
Lecturer: Niamh Wylie, Michaelmas Term 2023/24

Non-Linear Dependent Variables: Logit, Probit and Tobit Models

Discrete Choice Models
• Sometimes we need to work with models where the dependent variable is not continuous. Where the dependent variable is a choice of two or more options from a menu, it is said to be discrete, and OLS is inappropriate for estimation.
• Greene (2011) states that discrete choice models are appropriate when the economic outcome to be modelled is a discrete choice among a set of alternatives, rather than a continuous measure of some activity.
• The set of alternatives is called the choice set; it must be mutually exclusive (only one alternative can be selected) and exhaustive.
• Discrete choice variables are also known as limited dependent variables.

Binary Models
• Many of the choices that firms, individuals and governments make are of an 'either/or' nature. Such choices can be represented by a binary variable that takes the value 1 if one outcome is chosen and 0 otherwise. When the dependent variable is binary, OLS cannot be used.
• Assume an individual has to choose between two alternatives. Let U_ji be the utility that individual i (i = 1, 2, ..., N) gets if alternative j (j = 0 or 1) is selected. The individual makes choice 1 (j = 1) if U_1i ≥ U_0i and choice 0 otherwise. The choice depends on the difference in utilities across the two alternatives: Y_i* = U_1i − U_0i.
• If the choice is between public and private transport, the individual will choose whichever yields the higher utility. Y_i* will depend on the characteristics X_i of each individual: Y_i* = α + βX_i + ε_i. Y_i* is unobservable, but we can observe the choice made by the individual.
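The latent-utility setup above can be sketched numerically. This is a minimal illustration in Python rather than Stata, and the coefficient values (alpha, beta) and shock values are invented for the example:

```python
# Hypothetical latent-variable model: y* = alpha + beta*x + eps,
# and we observe y = 1 if y* >= 0, else y = 0.
# The coefficient values below are invented for illustration.
alpha, beta = -1.0, 0.5

def observed_choice(x, eps):
    y_star = alpha + beta * x + eps  # unobservable utility difference U1 - U0
    return 1 if y_star >= 0 else 0

# An individual with x = 3 and a small positive shock picks alternative 1:
print(observed_choice(3.0, 0.2))   # y* = -1 + 1.5 + 0.2 = 0.7 >= 0, so y = 1
# With x = 1 and no shock, y* = -0.5 < 0, so y = 0:
print(observed_choice(1.0, 0.0))
```

The point of the sketch is that the econometrician never sees y_star, only the 0/1 outcome it induces.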
Binary Models (cont.)
• In particular, we observe Y_i = 1 if individual i makes choice 1 and Y_i = 0 if choice 0 is made. The relationship between Y_i* and Y_i is: Y_i = 1 if Y_i* ≥ 0; Y_i = 0 if Y_i* < 0. This can be modelled using either a probit or a logit model.

Limited Dependent Variable Models
• Dummy variables are used as explanatory variables to capture qualitative effects numerically (gender, day of the week). When the explained variable is a qualitative effect, the qualitative information is coded as a dummy.
• A discrete choice variable may take certain discrete integer values; a binary choice variable may take only the values 0 or 1.
• Examples: where firms choose to list their shares (Nasdaq or NYSE); why some stocks pay dividends while others do not; factors affecting sovereign debt defaults.

Linear Probability Model
• The simplest way of dealing with binary dependent variables. The probability p of the event occurring is linearly related to a set of explanatory variables:
  P_i = p(y_i = 1) = β1 + β2 x2i + β3 x3i + ... + βk xki + u_i,  i = 1, 2, ..., N
• The fitted values of the regression are the estimated probabilities p(y_i = 1) for each observation i.
• The slope estimates give the change in the probability that y_i = 1 for a unit change in the given explanatory variable, holding all other x's fixed.

Linear Probability Model: Example
• Model the probability that firm i pays a dividend as a function of its market capitalisation: P_i = −0.3 + 0.012 x2i, where P_i denotes the estimated probability for firm i.
• Interpretation: for every $1m increase in size, the probability that the firm pays a dividend increases by 1.2%. A firm whose stock is valued at $50m will have a −0.3 + 0.012 × 50 = 0.30, or 30%, probability of making a dividend payment.
[Figure: fitted line y_i = −0.3 + 0.012 x2i; probability of paying a dividend vs. market cap]
• Simple to estimate and intuitive to interpret. But what is the problem? For market caps below $25m or above $109m, the fitted probability will be < 0 or > 1!
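A quick numerical check of the dividend example makes the out-of-range problem concrete. This Python sketch uses only the fitted coefficients from the slide:

```python
def lpm_prob(market_cap_m):
    """Fitted linear probability model from the slide: P = -0.3 + 0.012*x,
    with x the market capitalisation in $m."""
    return -0.3 + 0.012 * market_cap_m

print(lpm_prob(50))    # a $50m firm has a 30% fitted "probability"
print(lpm_prob(10))    # below $25m the fitted value is negative
print(lpm_prob(120))   # well above $108m it exceeds one
```

Any straight line through probabilities must eventually leave the [0, 1] interval, which is exactly what the logit and probit transformations are designed to prevent.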
Linear Probability Model: Problems
• Solution: use truncation, setting fitted values below 0 to 0 and above 1 to 1. However, if truncation is used, too many observations take the value of exactly zero or one. It is not plausible that a firm's probability of paying a dividend is exactly zero or one; can we be that certain?
• Econometrically, if the dependent variable can only take a binary value, then the disturbance term can only take one of two values, and hence cannot plausibly be assumed to be normally distributed:
  if y_i = 1, then u_i = 1 − β1 − β2 x2i − β3 x3i − ... − βk xki
  if y_i = 0, then u_i = −β1 − β2 x2i − β3 x3i − ... − βk xki
• The disturbances are heteroscedastic, so robust standard errors should always be used.

The Logit Model
• Overcomes the limitations of the linear probability model by transforming the regression model so that the fitted values are bounded within the (0, 1) interval.
• The logistic function F of a random variable z_i, where e is the exponential:
  F(z_i) = 1 / (1 + e^(−z_i))
  As z_i → ∞, e^(−z_i) → 0 and F(z_i) → 1; as z_i → −∞, e^(−z_i) → ∞ and F(z_i) → 0.
• Under the logit approach:
  P_i = 1 / (1 + e^(−(β1 + β2 x2i + β3 x3i + ... + βk xki + u_i)))
  where P_i is the probability that y_i = 1.
• 0 and 1 are asymptotes of the function: probabilities will never be exactly 0 or 1, but can come infinitesimally close.
[Figure: logistic curve; probability vs. market cap]
• Clearly this model is not linear, so it is not estimable by OLS; instead we use maximum likelihood.

Maximum Likelihood
• Besides least squares, there are other broad approaches to parameter estimation. Maximum likelihood chooses the set of parameter values that are most likely to have produced the observed data.
• First form a likelihood function (LF). This is a multiplicative function of the actual data, which is difficult to maximise w.r.t. the parameters. Therefore the log is taken, turning the LF into an additive function, the log-likelihood function (LLF):
  LF = ∏_{i=1}^{n} F_x(x_i),  LLF = ∑_{i=1}^{n} ln F_x(x_i)
  where x1, x2, ..., xn is a set of i.i.d.
observations with individual probability density functions F_x(x).

Maximum Likelihood (cont.)
• The value of the parameter θ that maximises the probability of observing the data is called the maximum likelihood estimate. Denote the maximised value of the LLF under unconstrained ML by L(θ); at the unrestricted MLE, the slope of the LLF is zero.
• All methods work by searching over the parameter space until the values of the parameters that maximise the LLF are found; most software packages employ an iterative technique.
• Drawback: for non-linear models such as GARCH, the LLF can have many local maxima.

Example: Pecking Order Hypothesis
• Theory suggests that corporations, when financing activities, should use the cheapest methods of financing first and switch to more expensive methods when the cheapest has been exhausted (Myers, 1984).
• The pecking order theory of capital structure: firms prefer to issue debt rather than equity if internal finance is insufficient. Ordering: internal funding, then debt, then equity.
• Helwege and Liang (1996) used a logit model on new US firm listings in 1983 and tracked their additional funding decisions from 1984 to 1992, determining the factors that affected the probability of raising external financing.
• Dependent variable: 1 if the firm raises external capital, 0 if it does not. The independent variables reflect the degree of information asymmetry and the riskiness of the firm.
• Findings: the probability of obtaining external funding does not depend on the size of the firm's cash deficit.
• The larger the firm's surplus cash, the less likely it is to seek external financing, providing limited support for the pecking order hypothesis.

Probit Model
• Similar to the logit model, but instead of the cumulative logistic function, the cumulative distribution function of a standard normally distributed random variable is used to transform the model:
  F(z_i) = (1/√(2π)) ∫_{−∞}^{z_i} e^(−z²/2) dz
• As with the logistic approach, this function provides a transformation to ensure the fitted probabilities lie between 0 and 1.
• As with the logit model, the effect on the probability of a unit change in x3i is given by β3 f(z_i), where f is the standard normal density, β3 is the parameter attached to x3i, and z_i = β1 + β2 x2i + β3 x3i + ... + βk xki + u_i.

Logit vs. Probit
• Both give very similar characterisations of the data because the densities are similar; fitted regression plots will be virtually indistinguishable.
• The implied relationships between the explanatory variables and the probability that y_i = 1 will be very similar, and both models are preferred to the linear probability model.
• The only time they might give marginally different results is when the split between y_i = 0 and y_i = 1 is unbalanced.
• Traditionally the logit was preferred because evaluating the probit's integral made it more time-consuming, but this is now handled by statistical software (Stock & Watson, 2011).

Parameter Interpretation: Logit Model
• The logit model is non-linear and therefore cannot be estimated by OLS. It is incorrect to say that a 1-unit increase in x2i causes a 100 × β2 % increase in the probability that y_i = 1; that is correct only for the linear model P_i = β1 + β2 x2i + u_i.
• Instead, a logit model is P_i = F(β1 + β2 x2i + u_i), where F represents the non-linear logistic function. To obtain the required relationship between changes in x2i and P_i, we differentiate F w.r.t.
x2i; this derivative is F(z_i)(1 − F(z_i)). Therefore a 1-unit increase in x2i causes a β2 F(z_i)(1 − F(z_i)) increase in the probability.

Parameter Interpretation: Numerical Example
• Take the fitted model
  P_i = 1 / (1 + e^(−(0.1 + 0.3 x2i − 0.6 x3i + 0.9 x4i)))
  so β1 = 0.1, β2 = 0.3, β3 = −0.6, β4 = 0.9.
• Calculate F(z_i) at the means of the explanatory variables. Suppose x̄2 = 1.6, x̄3 = 0.2, x̄4 = 0.1. Then the estimate of P(y_i = 1) is
  P_i = 1 / (1 + e^(−(0.1 + 0.3(1.6) − 0.6(0.2) + 0.9(0.1)))) = 1 / (1 + e^(−0.55)) = 0.63.
• Therefore a 1-unit increase in x2 causes an increase in the probability that y_i = 1 of 0.3 × 0.63 × 0.37 = 0.07; for x3 the effect is −0.14 and for x4 it is 0.21. These estimates are called marginal effects.
• A more intuitive interpretation uses odds ratios; see the Stata output.

Odds Ratios
• How do we estimate p_i = 1 / (1 + e^(−z_i)) when it is non-linear not only in the x's but also in the parameters β?
• A simple transformation makes the model linear: take p_i / (1 − p_i), the odds that y_i = 1 relative to the odds that y_i = 0, i.e. the odds ratio. We obtain
  p_i / (1 − p_i) = e^(z_i)
  L_i = ln(p_i / (1 − p_i)) = z_i = β1 + β2 x2i + β3 x3i + ... + βk xki + u_i
• Hence L_i is the logit model (the log of the odds ratio).
• It is interesting that the linear probability model assumes p_i is linearly related to x_i, whereas the logit model assumes the log of the odds ratio is linearly related to x_i. This makes estimation of the regression parameters much simpler!

Goodness of Fit Measures
• The standard RSS, R² or adjusted R² cease to have any real meaning: fitted values can take any value, but actual values can only be 0 or 1.
• Maximum likelihood maximises the value of the LLF instead of minimising the RSS, so two alternative goodness-of-fit measures are used.
• 1: The percentage of y_i values correctly predicted = 100 × (number of observations correctly predicted) / (total number of observations).
• The higher this number, the better the fit of the model.
• 2: Pseudo-R² = 1 − LLF/LLF₀, where LLF is the maximised value of the log-likelihood function and LLF₀ is that of a restricted model with all slope parameters set to 0. As model fit improves, the LLF becomes less negative and the pseudo-R² rises.

Tobit Model
• In certain applications the dependent variable is continuous, but its range is constrained: it might be zero for a substantial part of the population but positive for the rest. Examples include consumption of goods and hours of work.
• OLS is not appropriate here; instead an approach based on maximum likelihood must be used: the Tobit model (Tobin, 1958).

Tobit Model: Examples
• Right-censored (upper limit): model the demand for IPO shares as a function of income (x2i), age (x3i), education (x4i) and region of residence (x5i), where the number of shares per individual is capped at 250:
  y_i* = β1 + β2 x2i + β3 x3i + β4 x4i + β5 x5i + u_i
  y_i = y_i* if y_i* < 250; y_i = 250 if y_i* ≥ 250.
• Left-censored (lower limit): model charitable donations made by individuals:
  y_i = y_i* if y_i* > 0; y_i = 0 if y_i* ≤ 0.
  In essence, the individual will either donate €y_i* or zero.

Multinomial Choice
• In probit and logit models the individual chooses between two alternatives, but we are often faced with choices involving more than two alternatives. These are called multinomial choice situations.
• Previously, with only two choices, we used one equation to capture the probability that either one would be chosen. With three choices, we estimate two equations, with the third choice acting as a reference point. For m possible choices, we use m − 1 equations.
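The "m − 1 equations" idea can be sketched directly: with three choices, two linear indices are estimated against a reference category whose index is fixed at zero. This is a minimal Python sketch, and the coefficient values are invented for illustration:

```python
import math

def multinomial_probs(x, betas):
    """Multinomial logit with the first alternative as the reference category.

    betas holds m-1 (intercept, slope) pairs; the reference category's
    index is fixed at 0, so exp(0) = 1 appears in the denominator sum.
    """
    scores = [0.0] + [b0 + b1 * x for (b0, b1) in betas]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Invented coefficients for the two non-reference alternatives:
p = multinomial_probs(1.0, [(0.2, 0.5), (-0.3, 0.1)])
print(p)        # three probabilities, one per alternative...
print(sum(p))   # ...which sum to 1 by construction
```

Only m − 1 coefficient sets are needed because the probabilities must sum to one; the reference category's probability is determined residually.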
Pecking Order Hypothesis Revisited
• The previous hypothesis considered the choice between external finance or not. But what about the type of financing: equity, public debt or private debt? Instead of a binary logit model, a multinomial model is more appropriate.
• Returning to Helwege and Liang's study, estimate equations for equity and public debt, with private debt as the reference point. For example, a positive equity parameter means an increase in the probability that the firm would choose to issue equity over private debt.
• Results suggest firms in good financial health are more likely to issue equities and bonds rather than private debt; riskier or venture-backed firms and non-financial firms are more likely to issue equities or bonds; larger firms are more likely to issue bonds.

Ordered Choice Models
• The choice options in the multinomial logit model have no natural ordering, but in some cases choices are ordered in a particular way: bond ratings (AAA, AA), student grades (A, B) and employee performance (good, poor) are ordered in a hierarchy.
• When modelling these outcomes, numerical values are assigned to them, but the values are ordinal and reflect only the ranking of the outcomes.
• As with all the earlier models examined, OLS is not suitable for such data, as it would treat the dependent variable as having numerical meaning when it does not.
• The most common example in finance is credit ratings, where there is a monotonic increase in credit quality, denoted by ordinal numbers. The model is set up so that the boundary values between each rating are estimated along with the model parameters.
• Poon (2003) investigated bias in unsolicited vs. solicited ratings using an ordered probit model, on a pooled sample of the annual issuer list of the S&P 500 from 1998 to 2000: 295 firms across 15 countries, 595 observations.
• Findings: half of the sample ratings were unsolicited, with lower ratings on average. The financial characteristics of firms with unsolicited ratings were significantly weaker than those of firms with requested ratings.

Count Data Models
• Often the dependent variable in a model is a count of a number of occurrences. Consider the count data model Y_i = α + βX_i + ε_i, where Y_i is the number of visits to a GP by individual i in a year and X_i is a vector of explanatory variables.
• We may be interested in explaining a probability, such as the probability that an individual makes two or more GP visits in the year. A Poisson regression model would be used to estimate this model.

Stata: Logit Models
  logit y x1 x2 x3
• Heavier cars are less likely to be foreign at the 1% significance level; cars yielding better gas mileage are less likely to be foreign at the 10% level.

Stata: Logit Models with Odds Ratios (for easier interpretation)
• Odds ratios > 1 correspond to positive effects, because they increase the odds; odds ratios between 0 and 1 decrease the odds; an odds ratio of 1 corresponds to no association.
• The odds of the car being foreign are predicted to shrink by a factor of 0.004 for each unit increase in weight, and by a factor of 0.155 for each unit increase in mileage.

Stata: Probit Models
• Use robust standard errors.
  probit y x1 x2 x3
• Pr(foreign = 1) = N(β1 + β2 weight + β3 MPG), where N is the cumulative normal distribution function.

Stata: Tobit Models
• Create a censored variable: mileage ranges from 12 to 41, so suppose we only observe mileage ratings above 17.
  replace mpg=17 if mpg<=17
  tobit mpg weight, ll
• ll stands for lower limit (ul = upper limit); ll(0) sets the lower limit to 0.

Stata: Poisson Models
• irr stands for incidence-rate ratios: smokers have 1.43 times the mortality rate of non-smokers.
  poisson deaths smokes i.agecat, exposure(pyears) irr

Simulation Methods: Monte Carlo Simulations and Bootstrapping

Monte Carlo Simulations
• Real data is messy: fat tails, structural breaks and bi-directional causality. A simulation enables an econometrician to determine the effect of changing one factor while leaving all others equal; it offers complete flexibility.
• Simulation can be used to: (i) quantify the simultaneous equations bias from treating an endogenous variable as exogenous; (ii) determine the critical values for a Dickey-Fuller test; (iii) determine the effect of heteroscedasticity on the size and power of an autocorrelation test; (iv) address problems in finance such as exotic option pricing, the effects of macro changes on financial markets, and stress testing risk models.
• The central idea is random sampling from a given distribution, replicated N times:
  1. Generate the data according to the desired data generating process, deciding the distribution of the errors.
  2. Run the regression and calculate the test statistic.
  3. Save the test statistic or parameter of interest.
  4. Repeat N times.

Variance Reduction Techniques
• Say an average value of a parameter x is calculated over 1,000 replications, and another researcher conducts an almost identical study with different sets of random draws; a different average value for x is sure to result.
• The sampling variation in a Monte Carlo study is measured by the standard error estimate
  S_x = √(Var(x)/N)
  where Var(x) is the variance of the estimates of interest over the N replications.
• To reduce the standard error by a factor of 10, the number of replications must be increased by a factor of 100. To achieve acceptable accuracy, N may need to be set to infeasibly high levels; an alternative is to use a variance reduction technique.
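The S_x = √(Var(x)/N) relationship above can be verified with a small experiment. This Python sketch uses a deliberately simple estimator (the sample mean of 50 standard normal draws) as the quantity being replicated:

```python
import random
import statistics

def mc_standard_error(n_reps, seed=0):
    """Replicate a simple estimator (sample mean of 50 N(0,1) draws)
    n_reps times and return the standard error of the replications."""
    rng = random.Random(seed)
    estimates = [statistics.mean(rng.gauss(0, 1) for _ in range(50))
                 for _ in range(n_reps)]
    return statistics.stdev(estimates) / (len(estimates) ** 0.5)

se_small = mc_standard_error(100)
se_large = mc_standard_error(10_000)   # 100x the replications...
print(se_small, se_large)              # ...roughly a 10x smaller standard error
```

The cost of brute-force precision grows quadratically, which is exactly why antithetic and control variates are attractive.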
Antithetic Variates
• It may take many repeated sets of sampling before the entire probability space is adequately covered. What is really required is for successive replications to cover different parts of the probability space, spanning the entire spectrum of possibilities.
• The antithetic variate technique involves taking the complement of a set of random numbers and running a parallel simulation on those, in order to reduce the sampling error. If the driving force of a set of T draws is u_t for each replication, an additional replication with errors −u_t can be used.
• This purposely induces negative correlation between the replications, which reduces the standard error and results in a smaller confidence interval.

Control Variates
• The application of control variates involves employing a variable y similar to the variable of interest x, but whose properties are known prior to the simulation. The simulation is conducted on x and also on y, with the same sets of random draws employed in both cases.
• Denoting the simulation estimates x̂ and ŷ, a new estimate of x can be derived: x* = y + (x̂ − ŷ). It can be shown that the sampling error of x* is less than that of x̂ provided a certain condition holds.
• Control variates help reduce the Monte Carlo variation of particular sets of random draws by using the same draws on a problem whose solution is known.

Bootstrapping
• Bootstrapping is related to simulation but with a crucial difference: instead of constructing the data artificially, bootstrapping involves sampling repeatedly, with replacement, from the actual data.
• It is used to obtain a description of the estimator's properties using the sample points themselves, with the advantage of allowing the researcher to make inferences without strong distributional assumptions. Applications in finance and econometrics have increased rapidly in recent years.
• Re-sampling the data:
  1. Generate a sample of size T from the original data by sampling with replacement.
  2. Calculate β*, the coefficient estimate from that sample.
  3. Repeat stage 1 with a different sample N times; a distribution of β* will result.

Disadvantages of the Simulation Approach
• Computationally expensive: the number of replications may need to be very large, and the setup complex.
• Results may be imprecise if the assumptions are unrealistic, and can be hard to replicate.
• Simulation results are experiment-specific: they apply only to the exact type of data investigated. The solution is to run simulations using as many different, relevant data generating processes as possible.

Stata Practice: Monte Carlo Simulations
• Permutation tests determine the significance of the observed value of a test statistic in light of rearranging (permuting) the order of the observed values of a variable.
• Test the t-statistic, beta coefficient and standard error for regressing headroom on foreign cars:
  webuse auto.dta
  reg headroom foreign
• Check the t-statistic:
  permute foreign t=_b[foreign]/_se[foreign], rep(1000): reg headroom foreign
• Only 4 permutations had a lower t-statistic!

Stata Practice: Bootstrapping with shufflevar
  ssc install shufflevar
  shufflevar headroom
• No significance for headroom once it is randomly shuffled.

Event Studies: Market Models and CAPM

Event Study
• Very useful in finance research and extremely common in the literature: gauging the effect of an identifiable event on a financial variable, usually stock returns (dividends, stock splits, listing/delisting on a stock market).
• Simple to understand and conduct, but the approach must be rigorous. There is a multitude of approaches, but the main groundwork was established by Ball and Brown (1968) and Fama et al.
(1969).

Basic Approach
• Define precisely the date(s) on which the event occurred, then gather the sample data. For N events, define an 'event window': the period of time over which the impact of the event is investigated, e.g. 10 trading days before vs. 10 trading days after; long-term windows could be months or years.
• Data frequency is important: daily data carry greater power than weekly or monthly data (MacKinlay, 1997).
• Define the return R_it for each firm i on each day t during the event window. To separate the impact of the event from other, unrelated movements in prices, the abnormal return AR_it is calculated by subtracting the expected return from the actual return: AR_it = R_it − E(R_it).
• The expected return E(R_it) can be calculated from a sample of data before the event: 100 to 300 days or 24 to 60 months (Armitage, 1995). Longer estimation windows can increase the precision of the estimated parameters, but this must be balanced against the possibility of structural breaks in the data.
• If the event window is short (one to a few days), the expected return is likely to be close to zero, and it could be acceptable to use the actual return only. There is usually a gap between the estimation period and the event window to account for anticipation or leakage of the event.
• The simplest method for E(R_it) is the mean return of each stock over the estimation window. Brown and Warner (1980, 1985) found that historical return averages outperform many more complicated approaches.

Market Model
• The most common approach for computing expected returns is the market model: regress the return of stock i on a constant and the return on the market portfolio:
  R_it = α_i + β_i R_mt + u_it
• The expected return for firm i on any day t is the beta estimate times the market return on day t.
• Should we include alpha? Yes, as per Fama et al. (1969), but caution is needed: an alpha that is too high or too low is problematic, and it may be preferable to assume alpha = 0.
• The FTSE All Share or S&P 500 is typically used as the market proxy; firm size or other characteristics could also be added. An alternative is to set up a control portfolio with characteristics close to the event firm (size, beta, industry, price-to-book ratio, etc.) and use its returns as the expected returns. Armitage (1995) used Monte Carlo simulations to compare various models for event studies.

Event Study Hypothesis
• H0: the event has no effect on the stock price, i.e. the abnormal return = 0.
• It is likely there will be variation in returns across the days within the event window, so we may calculate the cumulative abnormal return over a multi-period event window from T1 to T2:
  CAR_i(T1, T2) = Σ_{t=T1}^{T2} AR_it
• Test statistic:
  SCAR_i(T1, T2) = CAR_i(T1, T2) / √(σ²(CAR_i(T1, T2))) ~ N(0, 1)
  where the variance in the denominator is the sum of the daily variances.
• It is usual to examine a pre-event window (e.g. t−10 to t−1) and a post-event window (e.g. t+1 to t+10), with t as the event date.

Event Study Complications
• Cross-sectional dependence: the approach assumes returns are independent across firms. The solution is not to aggregate across firms but to construct test statistics on an event-by-event basis.
• Changing variances of returns: the variance (volatility) of returns will increase over the event window, but the measured variance is based on the estimation window, so the null is incorrectly rejected too often. The solution is to estimate the variance of the abnormal returns over the event window itself.
• Weighting the stocks: the approach above will not give equal weight to the stocks; therefore standardise each individual firm's abnormal return, or take the unweighted average of the standardised cumulative abnormal returns:
  SCAR(T1, T2) = (1/N) Σ_{i=1}^{N} SCAR_i(T1, T2)
• Long event windows: can lead to large errors in the calculation of the abnormal return and of the impact of the event.

Event Study in Stata
• It is common to use Excel for an event study; however, Pacicco, Vena and Venegoni (2018) devised a method using Stata.
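The abnormal-return arithmetic from the slides above (market-model expected returns, then AR and CAR) can be sketched in a few lines. This is a Python illustration; the alpha, beta and return series are invented, and in practice alpha and beta would come from a regression over the estimation window:

```python
# Market-model event study arithmetic: AR_it = R_it - (alpha + beta * R_mt),
# and CAR is the sum of the ARs over the event window.
# The parameter estimates and returns below are invented for illustration.
alpha_hat, beta_hat = 0.0005, 1.2

stock_returns  = [0.010, -0.004, 0.025, 0.012]   # event-window returns
market_returns = [0.008, -0.002, 0.005, 0.010]   # market returns, same days

abnormal = [r - (alpha_hat + beta_hat * rm)
            for r, rm in zip(stock_returns, market_returns)]
car = sum(abnormal)
print([round(a, 4) for a in abnormal])
print(round(car, 4))
```

In a real study, `car` would then be standardised by its estimated variance (the sum of the daily variances) to form the test statistic.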
• The command estudy performs an event study, permitting the user to: (i) work with multiple varlists, computing the abnormal returns (ARs), average abnormal returns (AARs), cumulative abnormal returns (CARs), and cumulative average abnormal returns (CAARs); (ii) specify up to six event windows; (iii) customise the length of the estimation window; (iv) select the model for the calculation of normal or abnormal returns; (v) specify the diagnostic test, from among the parametric and non-parametric tests most commonly used in the literature; (vi) customise the output table and store the results in an Excel file and in a Stata data file.

• Let's test the effects of the announcement of the Covid pandemic (15 March 2020):
  estudy ln_SPX ln_Gold ln_Bitcoin (ln_Tesla ln_Apple ln_Microsoft), datevar(Date) evdate(03152020) dateformat(MDY) modtype(HMM) lb1(-3) ub1(3) lb2(-2) ub2(2) lb3(-1) ub3(3) eswlb(-30) eswub(-5)

• Now the reaction to the announcement of the Covid vaccine (9 November 2020):
  estudy ln_SPX ln_Gold ln_Bitcoin (ln_Tesla ln_Apple ln_Microsoft), datevar(Date) evdate(11092020) dateformat(MDY) modtype(HMM) lb1(-3) ub1(3) lb2(-3) ub2(7) lb3(-3) ub3(10) lb4(-2) ub4(3) lb5(0) ub5(7) lb6(1) ub6(10) eswlb(-30) eswub(-5)

• The date Elon Musk announced he was buying Twitter (14 April 2022):
  estudy ln_SPX ln_Gold ln_Bitcoin (ln_Tesla ln_Apple ln_Microsoft), datevar(Date) evdate(04142022) dateformat(MDY) modtype(HMM) lb1(-3) ub1(3) lb2(-3) ub2(7) lb3(-3) ub3(10) lb4(-2) ub4(3) lb5(0) ub5(7) lb6(1) ub6(10) eswlb(-30) eswub(-5)

CAPM Testing
• The CAPM states that the expected return on any stock i is equal to the risk-free rate of interest R_f plus a risk premium [E(R_m) − R_f] scaled by β_i, the riskiness of the stock:
  E(R_i) = R_f + β_i [E(R_m) − R_f]
• Tests of the CAPM are done in two steps: (i) estimate the stock betas; (ii) test the model.
• The CAPM is an equilibrium model, so we would not expect it to hold in every time period; but if it is a good model, it should hold on average.
• A stock market index is used as the proxy for the market return, and yields on short-term Treasury bills as the risk-free rate.

CAPM Testing: Calculate the Betas
• To calculate beta, run a simple time-series regression of the excess stock returns on the excess market returns; the slope estimate will be beta:
  R_i,t = α_i + β_i R_m,t + u_i,t,  i = 1, 2, ..., N; t = 1, 2, ..., T
  where N is the total number of stocks and T the number of time-series observations.
• α_i is 'Jensen's alpha', which measures how much the stock under- or over-performed given its level of market risk.

CAPM Testing: Test the CAPM
• Example: N = 100, T = 60 (5 years of monthly data).
  1. Run 100 time-series regressions (one for each stock).
  2. Run a single cross-sectional regression of the average stock returns on the estimated betas:
     R̄_i = λ0 + λ1 β_i + v_i,  i = 1, ..., N
• Stage 2 involves actual returns, not excess returns. Essentially the CAPM says that stocks with higher betas are riskier and should command higher average returns.
• If the CAPM is a valid model, λ0 should be close to the risk-free rate, and λ1 close to the average market risk premium.
• Two other implications can be validated: (i) the relationship between a stock's return and its beta should be linear; (ii) beta alone should explain the cross-sectional variation in returns. These can be tested by running the augmented model:
  R̄_i = λ0 + λ1 β_i + λ2 β_i² + λ3 σ_i² + v_i
  where σ_i² is the idiosyncratic risk and β_i² the squared beta for stock i. Both λ2 and λ3 should be zero if the CAPM is valid.

CAPM Tests and the Fama-French Methodology
• Research indicates that the CAPM is not a complete model of stock returns: returns have been found to be systematically higher for small-cap and value stocks. This can be tested using the augmented model (Fama & French, 1992):
  R̄_i = λ0 + λ1 β_i + λ2 MV_i + λ3 BTM_i + v_i
  where MV_i is the market value and BTM_i the book-to-market ratio of stock i.
• For the CAPM to be supported, both λ2 = 0 and λ3 = 0.
• Three issues: (i) non-normality of returns in finite samples, since normality is needed for valid hypothesis testing; (ii) likely heteroscedasticity in the returns (recent CAPM testing has used GMM; Cochrane, 2005); (iii) measurement error in beta (to minimise it, prefer to base tests on portfolios rather than single stocks).

Extreme Value Theory
• Much of classical statistics is focused on accurately estimating the 'average' value of a series, or the average relationship between two or more series (OLS); the Central Limit Theorem concerns the sampling distribution of means.
• However, in many cases it is the extreme or rare events that are of interest. EVT was adopted in finance in the 1990s with the realisation that asset returns deviate systematically from normality, and that assuming normality can lead to severe underestimates of the probability of large price movements, and consequently to severe losses.
• Example: Levine (2009) looked at monthly returns of medium-maturity A-rated corporate bonds from January 1980 to August 2008. The lowest return was −10.84%; using an extreme value distribution, the probability of a return less than or equal to this was 1.4%. Under the normal distribution, the probability would be 8×10⁻⁷, more than 16,000 times smaller!
• Note: EVT should only be applied to the tails, not to the centre of the distribution.

Block Maximum Approach
• Take a series y of total length T observations and separate it into m blocks of data, each of length n, so that m × n = T. Let the maximum value of the series in block k be M_k.
• The distribution of the normalised maxima converges asymptotically to a generalised extreme value (GEV) distribution as m and n tend to infinity (Fisher & Tippett, 1928; Gnedenko, 1943).
• Drawbacks: how should the data be fitted into blocks? If blocks are too long, the number
of maxima small and lead to inaccurate estimation of parameters with high variance (standard errors) If blocks are too short, potential for maxima not to be extreme values, causing a bias in the parameter estimate. Hence trade-off between bias and efficiency. Long blocks less bias, more inefficiency, Short blocks more bias, less inefficiency. Trinity Business School Peaks Over Threshold Approach • Preferred approach in empirical finance. • An arbitrary threshold U is specified and any observed value exceeding this is defined as extreme. • As threshold U tends towards infinity , the distribution of a normalised series tends towards the generalised Pareto distribution (GPD) • The generalised extreme value (GEV) and generalised pareto distributions (GDP) are closely related. In the limit in both cases, the GEV is the distribution of the standardised maxima and GDP is the distribution of the standardised data over a threshold, and the parameters from one approach can be estimated from another and vice versa. Trinity Business School Value at Risk Popular method for measuring financial risk. Estimates expected losses for a change in market prices. More formally, loss in monetary terms that is expected to occur over a predetermined time horizon with a pre-determined degree of confidence. i.e. A company states its one-day 99% VaR is $10,000,000, meaning the company is 99% confidence the maximum loss on its portfolio of assets in a one-day period is $10,000,000. Advantages: Simplicity of calculation Ease of Interpretation Can be suitably aggregated across a firm to produce a single figure to encompass the risk positions of the firm as a whole. Trinity Business School Value at Risk The calculated VaR is used as a way to select the most appropriate minimum capital risk requirement (MCRR) to ensure they hold enough liquid assets to cover the expected losses should they arise. Calculating VAR 1. Delta Normal Method: Assume losses are normally distributed. 
VaR = σ Z_α
where Z_α is the critical value of the normal distribution at the chosen significance level α (e.g. α = 0.05) and σ is the standard deviation of the data. Multiply the result by the value of the portfolio.
2. Historical Simulation: sort the portfolio returns and select the appropriate percentile (the 5th or 1st percentile for 95% or 99% VaR respectively), then multiply by the value of the portfolio.

Calculating VaR (continued)
3. Extreme Value Theory: uses all of the data points defined as being in the tail. Using the peaks-over-threshold approach, VaR is calculated as a function of the shape and scale parameters estimated from the sample data and of the ratio of total observations to observations exceeding the threshold.
• Confidence intervals can be constructed for VaR estimates arising from extreme value distributions (McNeil, 1998).
• The approach assumes the data are i.i.d. If they are not, one can estimate an ARMA-GARCH model, take the i.i.d. residuals, and estimate the parameters of the extreme value distribution on those.
• It can be extended to the multivariate case using copula functions.

Generalised Method of Moments
GMM is a generalisation of the conventional method of moments estimator and has widespread use in finance, for asset pricing, interest rate models and market microstructure (Jagannathan, Skoulakis & Wang, 2002). GMM can be applied to time series, cross-section and panel data. OLS, GLS, instrumental variables, two-stage least squares and ML are all special cases of the GMM estimator.
The method of moments dates back to Pearson (1895) and works by computing the moments of the sample data and setting them equal to the population values implied by an assumed probability distribution. For a normal distribution, we calculate the mean and variance. By the law of large numbers, the sample moments converge to their population counterparts asymptotically.
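As a quick illustration of the method-of-moments idea just described, the sketch below simulates normal data (the mean, variance and sample size are invented for illustration) and estimates σ² from two different sample moments. For a normal distribution the fourth central moment equals 3σ⁴, so it yields a second, over-identifying estimator of the variance.

```python
# Method of moments on simulated normal data: estimate sigma^2 from both
# the 2nd and the 4th sample moments. Illustrative only; numbers are invented.
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 4.0                         # true population variance
y = 10.0 + np.sqrt(sigma2) * rng.standard_normal(200_000)

ybar = y.mean()
m2 = np.mean((y - ybar) ** 2)        # 2nd central moment -> sigma^2
m4 = np.mean((y - ybar) ** 4)        # 4th central moment -> 3 * sigma^4 (normal)

sigma2_from_m2 = m2                  # estimator from the 2nd moment
sigma2_from_m4 = np.sqrt(m4 / 3.0)   # invert E[(y - mu)^4] = 3 * sigma^4

print(sigma2_from_m2, sigma2_from_m4)   # two estimates of the same parameter
```

With a large sample, both estimates are close to the true value of 4.0, but they are not identical: the system is over-identified (two moment conditions, one unknown), which is exactly the situation GMM resolves by weighting the moment conditions.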
Method of Moments
For observed data y_t with population mean μ_0:
• 1st moment: E[y_t] = μ_0, i.e. (1/T) Σ_{t=1}^{T} y_t − μ_0 → 0 as T → ∞, so the sample mean ȳ estimates μ_0.
• 2nd moment: E[(y_t − μ_0)²] = σ². An estimator of σ² is (1/T) Σ_{t=1}^{T} (y_t − ȳ)².
• 4th moment (for the normal distribution): E[(y_t − μ_0)⁴] = 3σ⁴. This gives a second estimator of σ², obtained as the square root of (1/T) Σ_{t=1}^{T} (y_t − ȳ)⁴ / 3.

Generalised Method of Moments (continued)
But how do we determine the best estimator of σ²? We have more moment equations than unknowns (σ²), so the system is over-identified: there are multiple solutions for σ², and we must choose among them. A natural way to do this is to choose the parameter estimates that minimise the variance of the moment conditions. Effectively, a weighting matrix gives higher weight to moment conditions with lower variance. The necessity of choosing a weighting matrix is a disadvantage of GMM.
The method of moments relies on the assumption that the explanatory variables are orthogonal to the disturbances in the model: E[u_t x_t] = 0. The method of moments is a consistent estimator but is not always efficient.

Summary of Learnings
• Discrete Choice Models
• Logit and Probit Models
• Multinomial Choice, Ordered Choice and Poisson Models
• Monte Carlo Simulations
• Bootstrapping
• Event Studies
• CAPM Testing
• Extreme Value Theory
• Maximum Likelihood
• Generalised Method of Moments

Thank you. Any questions?
Niamh Wylie
wylien@tcd.ie
Trinity Business School