Market Structure, Trading, and Liquidity
FIN 2340
Dr. Michael Pagano, CFA
Econometric Topics
Adapted and Excerpted from Slides by: Dr. Ian W. Marsh, Cass College, Cambridge U. and CEPR

Overview of Key Econometric Topics
(1) Two-Variable Regression: Estimation & Hypothesis Testing
(2) Extensions of the Two-Variable Model: Functional Form
(3) Estimating Multivariate Regressions
(4) Multivariate Regression Inference Tests & Dummy Variables

Introduction
• Introduction to Financial Data and Financial Econometrics
• Ordinary Least Squares Regression Analysis – What is OLS?
• Ordinary Least Squares Regression Analysis – Testing Hypotheses
• Ordinary Least Squares Regression Analysis – Diagnostic Testing

Econometrics
• Literally means "measurement in economics"
• More practically, it means "the application of statistical techniques to problems in economics"
• In this course we focus on problems in financial economics
• Usually, we will be trying to explain the behavior of a financial variable

Econometric Model Building
1. Understand finance theory
2. Derive an estimable model
3. Collect data
4. Estimate the model
5. Evaluate the estimation results
If the results are satisfactory: interpret the model and assess the implications for theory.
If they are unsatisfactory: re-estimate the model using better techniques, collect better data, or reformulate the model, and repeat.

Financial Data
• What sorts of financial variables do we usually want to explain?
– Prices: stock prices, stock indices, exchange rates
– Returns: stock returns, index returns, interest rates
– Volatility
– Trading volumes
– Corporate finance variables
• Debt issuance, use of hedging instruments

Time Series Data
• Time-series data are data arranged chronologically, usually at regular intervals
– Examples of problems that could be tackled using a time series regression:
• How the value of a country's stock index has varied with that country's macroeconomic fundamentals
• How a company's stock returns have varied when it announced the value of its dividend payment
• The effect on a country's currency of an increase in its interest rate

Cross-Sectional Data
• Cross-sectional data are data on one or more variables collected at a single point in time
• e.g. a sample of bond credit ratings for UK banks
– Examples of problems that could be tackled using a cross-sectional regression:
• The relationship between company size and the return to investing in its shares
• The relationship between a country's GDP level and the probability that the government will default on its sovereign debt

Panel Data
• Panel data have the dimensions of both time series and cross-sections
• e.g. the daily prices of a number of blue chip stocks over two years
– It is common to denote each observation by the letter t and the total number of observations by T for time series data,
– and to denote each observation by the letter i and the total number of observations by N for cross-sectional data.
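As a purely illustrative picture of these three data layouts, the following Python/pandas sketch builds a tiny time series, cross-section, and panel. Pandas, and all of the names and values used here, are our own additions and do not come from the original slides.

import pandas as pd

# Time series: one variable observed at t = 1, ..., T (here, monthly index returns)
dates = pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31", "2020-04-30"])
ts = pd.Series([0.012, -0.004, 0.021, 0.007], index=dates, name="index_return")

# Cross-section: many units i = 1, ..., N observed at a single point in time
cs = pd.DataFrame({"bank": ["A", "B", "C"], "credit_rating": ["AA", "A", "BBB"]})

# Panel: the same units observed repeatedly over time (here, 2 stocks x 2 dates)
panel = pd.DataFrame(
    {"price": [10.0, 10.2, 35.1, 34.9]},
    index=pd.MultiIndex.from_product([["STK1", "STK2"], dates[:2]], names=["i", "t"]),
)

print(ts.shape, cs.shape, panel.shape)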
Econometrics versus Financial Econometrics
– Little difference between econometrics and financial econometrics beyond emphasis
– Data samples
• Economics-based econometrics often suffers from a paucity of data
• Financial economics often suffers from information overload ("infoglut") and signal-to-noise problems even in short data samples
– Time scales
• Economic data releases are often regular calendar events
• Financial data are likely to be real-time or tick-by-tick

Economic Data versus Financial Data
• Financial data have some defining characteristics that shape the econometric approaches that can be applied:
– outliers
– trends
– mean-reversion
– volatility clustering

(Charts: Outliers; Trends; Mean-Reversion (with Outliers); More Mean-Reversion; Volatility Clustering.)

Basic Data Analysis
• All pieces of empirical work should begin with some basic data analysis
– Eyeball the data
– Summarize the properties of the data series
– Examine the relationship between data series
• The most powerful analytic tools are your eyes and your common sense
– Computers still suffer from "garbage in, garbage out"

Basic Data Analysis
• Eyeballing the data helps establish the presence of
– trends versus mean reversion
– volatility clusters
– key observations
• outliers – data errors?
• turning points
• regime changes

Basic Data Analysis
• Summary statistics
– Average level of the variable
• Mean, median, mode
– Variability around this central tendency
• Standard deviations, variances, maxima/minima
– Distribution of the data
• Skewness, kurtosis
– Number of observations, number of missing observations

Basic Data Analysis
• Since we are usually concerned with explaining one variable using another
– "trading volume depends positively on volatility"
• relationships between variables are important
– cross-plots, multiple time-series plots
– correlations (covariances)
– multi-collinearity

Basic Data Manipulations
• Taking natural logarithms
• Calculating returns
• Seasonally adjusting
• De-meaning
• De-trending
• Lagging and leading

The basic story
• y is a function of x
• y depends on x
• y is determined by x
"the spot exchange rate depends on relative price levels and interest rates…"

Terminology
• y is the
– predictand
– regressand
– explained variable
– dependent variable
– endogenous variable
– left-hand-side variable
• The x's are the
– predictors
– regressors
– explanatory variables
– independent variables
– exogenous variables
– right-hand-side variables

Data
• Suppose we have n observations on y and x:
– cross section: y_i = α + βx_i + u_i, i = 1, 2, …, n
– time series: y_t = α + βx_t + u_t, t = 1, 2, …, n

Errors
• Where does the error come from?
– Randomness of (human) nature
• men and markets are not machines
– Omitted variables
• men and markets are more complex than the models we use to describe them; everything else is captured by the error term
– Measurement error in y
• unlikely in financial applications
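To make the role of the disturbance term concrete, here is a minimal simulation sketch in Python/NumPy (not from the slides; the parameter values are arbitrary). Data are generated from a known line plus random noise, which is exactly the situation the model y_t = α + βx_t + u_t describes.

import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 2.0, 0.5           # "true" intercept and slope, chosen for illustration
T = 100
x = rng.normal(10.0, 3.0, T)     # explanatory variable
u = rng.normal(0.0, 1.0, T)      # disturbance: everything the model leaves out
y = alpha + beta * x + u         # observed dependent variable

# Without u, every (x_t, y_t) would sit exactly on the line y = 2 + 0.5x;
# with u, the points scatter around it, and OLS tries to recover alpha and beta.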
Objectives
• to get good point estimates of α and β given the data
• to understand how confident we should be in those estimates
• both will allow us to make statistical inferences on the true form of the relationship between y and x ("test the theory")

Simple Regression: An Example
• We have the following data on the excess returns on a fund manager's portfolio ("fund XXX") together with the excess returns on a market index:

Year, t:                                        1      2      3      4      5
Excess return on fund XXX, r_XXX,t − r_f,t:    17.8   39.0   12.8   24.2   17.2
Excess return on market index, r_m,t − r_f,t:  13.7   23.2    6.9   16.8   12.3

• We want to find whether there is a relationship between x and y given the data that we have. The first stage would be to form a scatter plot of the two variables.

Graph (Scatter Diagram)
(Chart: scatter plot of the excess return on fund XXX against the excess return on the market portfolio.)

Finding the Line of Best Fit
• We can use the general equation for a straight line, y = α + βx, to get the line that best "fits" the data.
• But this equation (y = α + βx) is completely deterministic.
• Is this realistic? No. So what we do is add a random disturbance term, u, into the equation:
y_t = α + βx_t + u_t, where t = 1, 2, 3, 4, 5

Determining the Regression Coefficients
• So how do we determine what α and β are?
• Choose α̂ and β̂ so that the distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible)
• The most common method used to fit a line to the data is known as OLS (ordinary least squares).

Ordinary Least Squares
• What we actually do is
1. take each vertical distance between the data point and the fitted line,
2. square it, and
3. minimize the total sum of the squares (hence least squares).

(Chart: the scatter plot with the fitted line y = −1.7366 + 1.6417x.)

Algebra Alert!!!!!
• Tightening up the notation, let
• y_t denote the actual data point t
• ŷ_t denote the fitted value from the regression line
• û_t denote the residual, y_t − ŷ_t

How OLS Works
• So minimise û_1² + û_2² + û_3² + û_4² + û_5², or minimise Σ û_t². This is known as the residual sum of squares.
• But what was û_t? It was the difference between the actual point and the line, y_t − ŷ_t.
• So minimising Σ û_t² is equivalent to minimising Σ (y_t − ŷ_t)² with respect to α̂ and β̂.

Coefficient Estimates
RSS = Σ_{t=1}^{T} (y_t − ŷ_t)² = Σ_{t=1}^{T} (y_t − α̂ − β̂x_t)²
Differentiating the RSS with respect to α̂ and β̂ and setting the derivatives equal to zero gives the OLS estimates:
S_xx = Σ (x_t − x̄)² = Σ x_t² − T x̄²
S_xy = Σ (x_t − x̄)(y_t − ȳ) = Σ x_t y_t − T x̄ ȳ
β̂ = S_xy / S_xx   (the estimated value of β)
α̂ = ȳ − β̂ x̄   (the estimated value of α)

What do we Use α̂ and β̂ For?
• In the CAPM example used above, optimising would lead to the estimates
• α̂ = −1.74 and
• β̂ = 1.64.
• We would write the fitted line as: ŷ_t = −1.74 + 1.64 x_t

What do we Use α̂ and β̂ For?
• If an analyst tells you that she expects the market to yield a return 20% higher than the risk-free rate next year, what would you expect the return on fund XXX to be?
• Solution: We can say that the expected value of y = "−1.74 + 1.64 × value of x", so plug x = 20 into the equation to get the expected value for y:
ŷ = −1.74 + 1.64 × 20 = 31.06
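The OLS formulas above can be checked directly on the fund XXX data. This minimal NumPy sketch is our own illustration (not part of the original slides); it computes S_xx, S_xy, the coefficient estimates, and the prediction at x = 20.

import numpy as np

y = np.array([17.8, 39.0, 12.8, 24.2, 17.2])   # excess returns on fund XXX
x = np.array([13.7, 23.2, 6.9, 16.8, 12.3])    # excess returns on the market index
T = len(y)

S_xx = np.sum(x**2) - T * x.mean()**2
S_xy = np.sum(x * y) - T * x.mean() * y.mean()

beta_hat = S_xy / S_xx                         # roughly 1.64
alpha_hat = y.mean() - beta_hat * x.mean()     # roughly -1.74

print(alpha_hat, beta_hat)                     # fitted line: y_hat = -1.74 + 1.64x
print(alpha_hat + beta_hat * 20)               # about 31.1 (31.06 with the rounded coefficients)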
Is Using OLS a Good Idea?
• Yes, since given some assumptions (see later) least squares is BLUE – the best linear unbiased estimator
• OLS is consistent
– as the sample size increases, the estimated coefficients tend towards their true values
• OLS is unbiased
– even in small samples, the estimated coefficients are on average equal to the true values

Is Using OLS a Good Idea? (cont.)
• OLS is efficient
– no other linear estimator has a smaller variance around the estimated coefficient values
– some non-linear estimators may be more efficient

Testing Hypotheses
• Once you have regression estimates (assuming the regression is a "good" one) you take the results to the theory:
"Theory says that the intercept should be zero"
"Theory says that the coefficient on prices should be unity"
"Theory says that the coefficient on domestic money should be unity and the coefficient on foreign money should be minus unity"

Testing Hypotheses (cont.)
• Testing these statements is called hypothesis testing
• This involves comparing the estimated coefficients with what theory suggests
• In order to say whether the estimates are "too far" from theory we need some measure of the precision of the estimated coefficients

Standard Errors
• Based on a sample of data, you have estimated the coefficients α̂ and β̂
• How much are these estimates likely to alter if different samples are chosen?
• The usual measure of this degree of uncertainty is the standard error of the coefficient estimates

Standard Errors (cont.)
• Algebraically, given some crucial assumptions, the standard errors can be computed as follows:
SE(α̂) = s √( Σ x_t² / ( T (Σ x_t² − T x̄²) ) )
SE(β̂) = s √( 1 / (Σ x_t² − T x̄²) )

Error Variance
• σ² is the variance of the error or disturbance term, u
• this is unobservable
• we approximate it with the variance of the residual terms, s²:
s² = Σ û_t² / (T − 2),  so  s = √( Σ û_t² / (T − 2) )

Standard Errors
• The standard errors are smaller as
– T increases
• more data makes the precision of the estimated coefficients higher
– the variance of x increases
• more dispersion of the independent variable about its mean makes the estimated coefficients more precise
– s decreases
• the better the fit of the regression (smaller residuals), the more precise are the estimates

Null and Alternative Hypotheses
• So now you have the coefficient estimates and the associated standard errors.
• You now want to test the theory.
Five-Step Process:
Step 1: Draw up the null hypothesis (H0)
Step 2: Draw up the alternative hypothesis (H1 or HA)

Null Hypothesis
• Usually, the null hypothesis is what theory suggests: e.g. testing the ability of fund managers to outperform the index:
R_jt − R_ft = α_j + β_j (R_mt − R_ft) + u_jt
where R_jt − R_ft is the excess return of fund j at time t and α_j is the expected risk-adjusted return.
• The EMH suggests α_j = 0,
• so, H0: α_j = 0 (fund managers earn zero risk-adjusted excess returns)

Alternative Hypothesis
• The alternative is more tricky
• Usually the alternative is just that the null is wrong:
– H1: α ≠ 0 (fund managers earn non-zero risk-adjusted excess returns; fund managers under-perform or out-perform)
• But sometimes it is more specific
– H1: α < 0 (fund managers under-perform)

Confidence Intervals
• Suppose our point estimate for α is 0.058 for fund XXX and the associated standard error is 0.025, based on 20 observations
• Has fund XXX outperformed?
• Can we be confident that the true α is different to zero?
Step 3: Choose your level of confidence
Step 4: Calculate the confidence interval
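Continuing the fund XXX example, the standard-error formulas above can be evaluated directly. This NumPy sketch is our own illustration (the slides do not report these standard errors); the printed values depend only on the five observations.

import numpy as np

y = np.array([17.8, 39.0, 12.8, 24.2, 17.2])
x = np.array([13.7, 23.2, 6.9, 16.8, 12.3])
T = len(y)

S_xx = np.sum(x**2) - T * x.mean()**2
beta_hat = (np.sum(x * y) - T * x.mean() * y.mean()) / S_xx
alpha_hat = y.mean() - beta_hat * x.mean()

resid = y - (alpha_hat + beta_hat * x)           # residuals u_hat_t
s = np.sqrt(np.sum(resid**2) / (T - 2))          # estimate of the error standard deviation

se_alpha = s * np.sqrt(np.sum(x**2) / (T * S_xx))   # roughly 4.1 for these five observations
se_beta = s * np.sqrt(1.0 / S_xx)                   # roughly 0.26 for these five observations
print(se_alpha, se_beta)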
Confidence Interval (cont.)
• Convention is to use 95% confidence levels
• The confidence interval is then
( α̂ − t_critical × SE(α̂) , α̂ + t_critical × SE(α̂) )
– t_critical is the appropriate percentile (e.g. the 97.5th) of the t-distribution with T − 2 degrees of freedom
• the 97.5th percentile since this is a two-sided test
• 2 degrees of freedom were lost in estimating 2 coefficients

Confidence Interval (cont.)
• We are now 95% confident that the true value of alpha lies between
0.058 − 2.1009 × 0.025 = 0.0055 and 0.058 + 2.1009 × 0.025 = 0.1105

Making inferences
Step 5: Does the value under the null hypothesis lie within this interval?
• No (the null was that alpha = 0)
– So we can reject the null hypothesis that fund XXX earns a zero risk-adjusted excess return
– and accept the alternative hypothesis
– we reject the restriction implied by theory

Making inferences (cont.)
• Suppose our standard error was 0.03
• The confidence interval would have been −0.005 to 0.121
• The value under the null is within this range
– We cannot reject the null hypothesis that fund XXX only earns a zero risk-adjusted return
– NOTE: we never accept the null - hypothesis testing is based on the doctrine of falsification

Significance Tests
• Instead of calculating confidence intervals we could calculate a significance test
Step 4: Calculate the test statistic
test statistic = (α̂ − α*) / SE(α̂) = (0.058 − 0.0) / 0.025 = 2.32
where α* is the value under the null
Step 5: Compare the test statistic to the critical value, t_critical

Significance Tests (cont.)
Step 6: Is the test statistic in the non-rejection ("acceptance") region?
(Chart: t-distribution with a 95% acceptance region between −2.1009 and +2.1009 and a 2.5% rejection region in each tail.)

One-Sided Tests (cont.)
• Suppose the alternative hypothesis was
– H1: α < 0
• We then perform one-sided tests
– The confidence interval is ( −∞ , α̂ + t_critical × SE(α̂) )
– The significance test statistic is compared with −t_critical
– t_critical is based on the 95th percentile (not the 97.5th)

Type I and Type II Errors
• Where did the 95% level of confidence come from?
– Convention
• What does it mean?
– We are going to reject the null when it is actually true 5% of the time
– This is a Type I error

Type I and Type II Errors (cont.)
– A Type II error is when we fail to reject the null when it is actually false
– To reduce Type I errors, we could use a 99% confidence level
• this would widen our CI and raise the critical value
• making it less likely that we reject the null by mistake
• but also making it less likely that we correctly reject the null
• so it raises Type II errors

Which is Worse?
• This depends on the circumstances
– In Texas, the null hypothesis is that you are innocent, and if the null is rejected you are executed
• Type I errors are very important (to the accused)
– But if tests are "weak" there is low power to discriminate and econometrics cannot inform theory

Statistical Significance I

               Coefficients   Standard Error   t Stat     P-value     Lower 95%   Upper 95%
Intercept      2.02593034     0.127404709      15.90153   4.835E-12   1.7582628   2.29359791
X Variable 1   0.48826704     0.044344693      11.01072   1.99E-09    0.3951022   0.58143186

• Can we say with any degree of certainty that the true coefficient is statistically different from zero?
• t-statistic and P-value
– the t-stat is the coefficient estimate divided by its standard error
– the rule of thumb is that |t-stat| > 2 means we can be 95% confident that the true coefficient is not equal to zero
– the P-value gives the probability of obtaining an estimate at least this far from zero, given its standard error, if the true coefficient were zero
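The confidence-interval and significance-test arithmetic in the fund-manager example above can be reproduced with SciPy. This small sketch is not in the original slides; the point estimate 0.058, standard error 0.025, and T = 20 are taken from that example.

from scipy import stats

alpha_hat, se, T = 0.058, 0.025, 20
t_crit = stats.t.ppf(0.975, df=T - 2)          # about 2.1009 (two-sided, 95%)

ci = (alpha_hat - t_crit * se, alpha_hat + t_crit * se)   # about (0.0055, 0.1105)

t_stat = (alpha_hat - 0.0) / se                # about 2.32; the value under H0 is zero
p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df=T - 2))    # about 0.03, so reject H0 at the 5% level
print(t_crit, ci, t_stat, p_value)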
Statistical Significance II
(Same regression output as above.)
• Can we say with any degree of certainty that the true coefficient is statistically different from zero?
• Confidence intervals
– give the range within which we can be 95% confident that the true coefficient lies
• the actual coefficients are 2.0 and 0.5

Statistical Inference II (cont.)
• t-test of coefficient = 0.5 is (0.488 − 0.5)/0.044 = −0.26
• t-test of coefficient = 0.6 is (0.488 − 0.6)/0.044 = −2.52
• the critical value of the t-test (17 d.f.) is 2.11
– cannot reject the null that the true coefficient is 0.5
– can reject the null that the true coefficient is 0.6 in favour of the alternative that the true coefficient is different to 0.6
• with 95% confidence
• or at the 5% level of significance

Economic Significance
(Same regression output as above.)
• As x increases by 1 unit, y on average increases by 0.488 units
• The econometrician has to decide whether this is economically important
– Depends on the magnitudes of x and y and the variability of x and y
– It is very important in finance to check the economic importance of results

Some Real Data
• annually from 1800+ to 2001
• spot cable [dollars per pound] (spot)
• consumer price indices (ukp, usp)
• long-term interest rates (uki, usi)
• stock indices (ukeq, useq)
• natural log of each series (l…)
• log differences of each series (d…)

Excel Regression Output
Regression Statistics: Multiple R 0.245486731; R Square 0.060263735; Adjusted R Square 0.038906093; Standard Error 0.092479061; Observations 181

            Coefficients    Standard Error   t Stat     P-value      Lower 95%      Upper 95%
Intercept   -0.00790941     0.007344336      -1.07694   0.2829809    -0.02240372    0.00658489
dukp        -0.28954448     0.11451444       -2.52845   0.01233588   -0.51554277   -0.06354619
dusp         0.393294362    0.140052233       2.808198  0.00554476    0.116896338   0.66969239
duki        -0.06224627     0.083163402      -0.74848   0.45516881   -0.22637218    0.10187964
dusi        -0.02335464     0.069677157      -0.33518   0.73788573   -0.16086497    0.11415569

Assumptions of OLS
• OLS is BLUE if the error term, u, has:
– zero mean: E(u_i) = 0 for all i
– common variance: var(u_i) = σ² for all i
– independence: u_i and u_j are independent (uncorrelated) for all i ≠ j
– normality: the u_i are normally distributed for all i

Problems with OLS
• What the problem means
• How to detect it
• What it does to our estimates and inference
• How to correct for it
• Key problems: multi-collinearity, non-normality, heteroskedasticity, serial correlation.

Multi-collinearity
• What it means
– Regressors are highly intercorrelated
• How to detect it
– If the economics of the model is good: a high R-squared with lots of individually insignificant (t-stats) but jointly significant (F-tests) regressors. Also, via high VIF values (> 10).
• What it does
– Inference is hard because the standard errors are blown up
• How to correct for it
– Get more data; sequentially drop regressors.
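For the VIF > 10 rule of thumb just mentioned, statsmodels provides a variance_inflation_factor helper. A hedged sketch on made-up data (not from the slides), where one regressor is deliberately built to be nearly collinear with another:

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Illustrative data: x2 is constructed to be nearly a multiple of x1
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2}))

# One VIF per regressor (the constant's VIF is usually ignored)
vifs = {col: variance_inflation_factor(X.values, i) for i, col in enumerate(X.columns)}
print(vifs)   # x1 and x2 should both show VIFs well above 10 here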
Non-Normality
• What it means
– Over and above the assumptions about the mean and variance of the regression errors, OLS also assumes they are normally distributed
• How to detect non-normality
– If normal, skewness = 0 and kurtosis = 3
– Jarque-Bera test
• J-B = n[S²/6 + (K − 3)²/24]
• distributed χ²(2), so the critical value is approximately 6.0
– J-B > 6.0 => non-normality

Non-Normality (cont.)
• What it does (loosely speaking)
• skewness means coefficient estimates are biased
• excess kurtosis means standard errors are understated
• How to correct for it
– skewness can be reduced by transforming the data
• take natural logs
• look at outliers
– kurtosis can be accounted for by adjusting the degrees of freedom used in standard tests of the coefficient on x
• use k(T − 2) d.f. instead of (T − 2)
• 1/k = 1 + [(K_u − 3)(K_x − 3)]/2T

Heteroskedasticity
• What it means
– OLS assumes a common variance, or homoscedasticity
• var(u_i) = σ² for all i
– Heteroscedasticity is when the variance varies
• often the variance gets larger for larger values of x

Detecting Heteroskedasticity
– Plot the residuals as a time series or against x
– White test
• Regress the squared residuals on the x's, x²'s and cross products
– RESET test
• Regress the residuals on fitted y², y³, etc. Significance indicates heteroscedasticity
– Goldfeld-Quandt test
• Split the sample into large x's and small x's, fit separate regressions and test for equality of the error variances
– Breusch-Pagan test
• σ² = a + bx + cz …, so test b = c = 0

White Test
White Heteroskedasticity Test: F-statistic = 1.497731 (Probability 0.192863); Obs*R-squared = 7.427565 (Probability 0.190734)
Auxiliary regression – dependent variable: RESID^2; regressors: C, DUKP, DUKP^2, DUKP*DUSP, DUSP, DUSP^2; R-squared 0.041036; Mean dependent var 0.008361; Log likelihood 397.3308; Durbin-Watson stat 1.774378
Neither test statistic is significant at the 5% level, so the null of homoscedasticity is not rejected here.

Implications of Heteroskedasticity
• OLS coefficient estimates are unbiased
• OLS is inefficient
– it has a higher variance than it should
• OLS estimated standard errors are biased
– if σ² is positively correlated with x² (the usual case) then the estimated standard errors are too small
– so inference is wrong
• we become too confident in our estimated coefficients

Correcting for Heteroskedasticity
• If we know the nature of the heteroskedasticity it is best to take this into account in the estimation
– use weighted least squares
– "deflate" the variable by the appropriate measure of "size"
• Usually, we don't know the functional form
– so correct the standard errors so that inference is valid
– White standard errors alter the OLS standard errors and asymptotically give reasonable inference properties

White Standard Errors
Dependent Variable: DSPOT
Method: Least Squares
Date: 07/17/03  Time: 21:41
Sample (adjusted): 1821 2001; Included observations: 181 after adjusting endpoints
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -0.007288     0.006605     -1.103411     0.2713
DUKP       -0.310380     0.102816     -3.018788     0.0029
DUSP        0.379539     0.197376      1.922921     0.0561

R-squared 0.055141; Adjusted R-squared 0.044525; S.E. of regression 0.092208; Sum squared resid 1.513423; Log likelihood 176.1352; Durbin-Watson stat 2.115206; Mean dependent var -0.006391; S.D. dependent var 0.094332; Akaike info criterion -1.913097; Schwarz criterion -1.860083; F-statistic 5.193965; Prob(F-statistic) 0.006422
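White-corrected standard errors and the Jarque-Bera statistic are both available in statsmodels. A minimal sketch on synthetic data that stands in for the cable regression above (the variable names mirror the output shown, but the data here are simulated, so the numbers will not match):

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import jarque_bera

# Synthetic stand-in for the cable data; the errors are made heteroskedastic on purpose
rng = np.random.default_rng(2)
n = 181
dukp = rng.normal(0.0, 0.05, n)
dusp = rng.normal(0.0, 0.05, n)
u = rng.normal(0.0, 1.0, n) * (0.02 + 2.0 * np.abs(dukp))   # error variance grows with |dukp|
dspot = -0.007 - 0.31 * dukp + 0.38 * dusp + u

X = sm.add_constant(pd.DataFrame({"dukp": dukp, "dusp": dusp}))
ols = sm.OLS(dspot, X).fit()                     # conventional OLS standard errors
robust = sm.OLS(dspot, X).fit(cov_type="HC0")    # White heteroskedasticity-consistent SEs
print(ols.bse)
print(robust.bse)    # same coefficients, different (usually larger) standard errors

jb_stat, jb_pvalue, skew, kurt = jarque_bera(ols.resid)
print(jb_stat, jb_pvalue)   # JB above ~6 (p < 0.05) points to non-normal residuals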
Serial Correlation
• OLS assumes no serial correlation
– u_i and u_j are independent for all i ≠ j
• In cross-section analysis, residuals are likely to be correlated across individuals
– e.g. common shocks
• In time series analysis, today's error is likely to be related to (correlated with) yesterday's residual
– autocorrelation or serial correlation
– maybe due to autocorrelation in omitted variables

Detecting Serial Correlation
• Durbin-Watson test statistic, d
– assumes the errors u_t and u_t−1 have (positive) correlation ρ
– tests for the significance of ρ on the basis of the correlation between the residuals û_t and û_t−1
– only valid in large samples
– only tests first-order correlation
– only valid if there are no lagged dependent variables (y_t−i) in the regression

Detecting Serial Correlation (cont.)
• d lies between 0 and 4
– d = 2 implies the residuals are uncorrelated
• Durbin and Watson provide upper and lower bounds for d
– if d < d_L then reject the null of no serial correlation
– if d > d_U then do not reject the null hypothesis of no serial correlation
– if d_L < d < d_U then the test is inconclusive

Durbin-Watson
From the DSPOT regression output shown above (with White heteroskedasticity-consistent standard errors): Durbin-Watson stat = 2.115206, which is close to 2.

Implications of Serial Correlation
• With no lagged dependent variable (so that d is a valid test)
– OLS coefficient estimates are unbiased
– but inefficient
– the estimated standard errors are biased
• so inference is again wrong

Correcting for 1st Order Serial Correlation
• Rule of thumb:
– if d < R² then estimate the model in first-difference form:
y_t = α + βx_t + u_t
y_t−1 = α + βx_t−1 + u_t−1
y_t − y_t−1 = β(x_t − x_t−1) + (u_t − u_t−1)
– so we can recover the regression coefficients (but not the intercept).

Implications of Serial Correlation
• With lagged dependent variables in the regression (when the DW test is invalid):
• OLS coefficient estimates are inconsistent
– even as the sample size increases, the estimated coefficient does not converge on the true coefficient (i.e. it is biased)
• So inference is wrong.

Using a Dummy Variable to Test Changes in Slope & Intercept
• What if α and β change over time? What we do is add two additional RHS variables:
y_t = α + βx_t + γD_t + δ(D_t × x_t) + u_t, where t = 1, 2, 3, … T
D_t = 1 for the 2003-2004 period, 0 otherwise
γ measures the change in the intercept during 2003-2004
δ measures the change in the slope during 2003-2004
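A sketch of estimating this dummy-variable specification with statsmodels on simulated data (everything here is illustrative; only the idea of a 2003-2004 dummy with an interaction term comes from the slide):

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Quarterly observations 1998Q1-2006Q4; the intercept and slope are shifted during 2003-2004
rng = np.random.default_rng(3)
periods = pd.period_range("1998Q1", "2006Q4", freq="Q")
x = rng.normal(10.0, 2.0, len(periods))
D = ((periods.year >= 2003) & (periods.year <= 2004)).astype(float)  # 1 in 2003-2004, 0 otherwise
y = 1.0 + 0.5 * x + 2.0 * D + 0.3 * D * x + rng.normal(0.0, 0.5, len(periods))

X = sm.add_constant(pd.DataFrame({"x": x, "D": D, "D_x": D * x}))
fit = sm.OLS(y, X).fit()
print(fit.params)   # const ~ alpha, x ~ beta, D ~ shift in intercept, D_x ~ shift in slope

# Durbin-Watson on the residuals: values near 2 suggest no first-order serial correlation
print(durbin_watson(fit.resid))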