ECONOMETRICS MODULE 2

ABSTRACT: one paragraph that summarises what you did. INTRODUCTION (more detail): what you are planning to do and why you think it is important, exploring the relationship between CO2 and GDP in your country. CONCLUSION: summarise everything you have done.

CHAPTER 3 (20/04/2022)

REGRESSION: we make a specific assumption about causality. In regression we treat the dependent variable (y) and the independent variable(s) (x's) very differently (asymmetrically). The y variable is assumed to be random, or "stochastic" in some way, i.e. to have a probability distribution. The x variables are, however, assumed to have fixed ("non-stochastic") values in repeated samples.

y (dependent var.) = α + βx (independent var.) + ε

y is the variable that I want to explain (the regressand) and x (the regressor) is the variable that I am using to explain y; e.g. more square metres, higher price.

CORRELATION: we make no assumption about causality. If we say y and x are correlated, it means we are treating y and x in a completely symmetrical way; no causal direction is assumed. Correlation takes values between -1 and 1.

SCATTER DIAGRAM OF REGRESSION: the regression is the best line through our observations. α and β refer to the population, so their true values are impossible to observe; I can only estimate them from the sample, obtaining α̂ (alpha hat) and β̂ (beta hat). Alpha hat and beta hat are estimates, so they carry uncertainty, measured by the standard errors of α̂ and β̂: the lower the standard error, the more precise the estimate. The error term is included because we cannot explain everything. The disturbance term can capture a number of features:
- we always leave out some determinants of y_t
- there may be errors in the measurement of y_t that cannot be modelled
- random outside influences on y_t which we cannot model

OLS tries to minimise the distance between the observations and the regression line.

PROJECT
1. TIME SERIES PLOT: after highlighting both variables, choose separate small graphs.
2. CORRELATION: right-click after highlighting both variables.
3. SCATTERPLOT: inversion of the tendency; at first, to become richer you need to pollute more, but in the late 1900s getting richer means better technologies which emit less CO2.
4. File, Function packages, On server: color_x_y_plot, right-click, Install. *Copy the command and paste it into the console.
5. RUN A REGRESSION (what is the relation between pollution and wealth?): Model, OLS; dependent variable CO2, independent variable GDP. Constant term: α̂ is in the first column, first row, with the standard error of α̂ next to it (it measures the uncertainty); α̂ / SE(α̂) = the t-ratio. Coefficient 1: β̂ is in the first column, second row.
6. GRAPHS: FITTED, ACTUAL PLOT. In this graph, the red points are y, all the observations we have; the blue line represents ŷ; and y - ŷ = û (the residual). With OLS you are trying to minimise the û's (the errors) and to fit the best line.

α̂ and β̂ come from the sample, a subset of the population. The standard errors tell me how precise these parameter estimates are. The STANDARD ERROR is relatively low when all the points are very close to the regression line.

POPULATION REGRESSION FUNCTION (PRF): y = α + βx + u
SAMPLE REGRESSION FUNCTION (SRF): ŷ = α̂ + β̂x, with û = y - ŷ
Lower standard errors: less uncertainty, better estimates.

We usually employ OLS when we have linearity. One way of estimating linear models when things are not linear is to include LOGARITHMIC TRANSFORMATIONS:
1. Highlight the variables, then Add, Logs of selected variables.
2. Highlight everything, right-click, time series plot, 4 separate graphs. This shows that the scale on the vertical axis is significantly reduced.
3. Scatter plot of the logs.
4. Compare model 1 (variables in levels) and model 2 (variables in logs).
5. R-squared: in model 1, how much of the variation of annual CO2 is explained by the variation of GDP per capita; in model 2, how much of the percentage variation of emissions is explained by the percentage variation of GDP per capita. From model 1 to model 2 the R-squared increases a lot. By taking logs we run a non-linear model in a linear way.

ESTIMATORS: the formulae used to calculate the coefficients. ESTIMATES: in Gretl we observe that the estimates are the actual numerical values of the coefficients.

CLRM ASSUMPTIONS
1. E(u) = 0 (the expected value of the error is zero). Test in Gretl: TEST NORMALITY OF RESIDUALS gives the distribution of the residuals and their mean value, which must be close to zero (two or three leading zeros after the decimal point are enough to call the number very close to zero).
2. The variance of the errors is constant: HOMOSKEDASTICITY.
3. The errors are statistically independent of one another: NO AUTOCORRELATION (yesterday's errors are not correlated with today's errors).
4. There is no relationship between the error and x: u is random and not related to the regressors.
5. u follows a normal distribution.

IF THESE ASSUMPTIONS HOLD, THE OLS ESTIMATORS ARE BLUE (α̂ and β̂ are very close/equal to the true values; BEST means minimum variance).
Consistency: as the sample size increases, I get better estimates. Unbiasedness: on average my estimates are equal to the true values. Reliability and precision: the standard errors of α̂ and β̂ tell me; they should be as small as possible.

(2:17:23, two graphs: a cloud of points on the left and a near-straight line of points on the right.) On the left graph there are no data for low values of x, so α̂ could be problematic: I have no information there. E.g. if x is square metres and y is the price of a flat, β is how much more I pay for an extra square metre, while α measures the value of an apartment with zero square metres: IMPOSSIBLE. When x equals zero, y equals α.

The larger T (the sample size), the smaller the coefficient variances: the estimates are more precise and the standard errors are reduced significantly.

T-STATISTIC: α̂ / SE(α̂). The lower the standard error, the more precise the estimate. WE NEED HYPOTHESIS TESTING TO CHECK HOW RELIABLE THE ESTIMATES ARE.

TEST OF SIGNIFICANCE APPROACH. T-RATIO: the higher it is in absolute terms, the more significant the variable. If the p-value is very low, near zero, then I reject the null that zero is a plausible value for α; in other words, α is statistically significant.

CONFIDENCE INTERVALS APPROACH: to build confidence intervals go to Analysis, Confidence intervals for coefficients. You get the minimum and maximum value for the parameter and its central value.

21/04/2022
Test of significance approach: test statistic and p-value. Alternative: confidence intervals.
TEST OF SIGNIFICANCE APPROACH: test statistic = (β̂ - β*) / SE(β̂). Levels of significance: 1%, 5%, 10%.
H0: β = 0. This is an important statement because we are assessing whether x is significant in explaining y: when a coefficient can take the value 0, the variable (x) attached to that coefficient is not helping me in explaining y.
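The GUI steps above can also be run as a short Gretl (hansl) script. A minimal sketch, assuming the series are named CO2 and GDP_pc (hypothetical names; use whatever your dataset calls them):

# Model 1: linear specification (Model > OLS in the GUI)
ols CO2 const GDP_pc

# t-ratio and 95% confidence interval for the slope, built by hand
scalar b  = $coeff(GDP_pc)
scalar se = $stderr(GDP_pc)
scalar tc = critical(t, $df, 0.025)   # two-sided 5% critical value
printf "t-ratio = %.3f, 95%% CI = [%.4f, %.4f]\n", b/se, b - tc*se, b + tc*se

# Model 2: log-log specification ("Add > Logs of selected variables")
logs CO2 GDP_pc                       # creates l_CO2 and l_GDP_pc
ols l_CO2 const l_GDP_pc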
NORMAL DISTRIBUTION: bell shape, mean 0, variance 1. Relation between the t distribution and the normal distribution: both are symmetric and centred around 0, but the t distribution has an extra parameter, the degrees of freedom, and we employ it when we have a relatively small sample. The t distribution has fatter tails: extreme values are more frequent than under the normal distribution. There is an infinite number of t distributions, indexed by the DEGREES OF FREEDOM = number of observations less the number of coefficients estimated (n - k). Critical values from the t are always higher than the critical values from the normal.

THE CONFIDENCE INTERVAL includes all the plausible values of a coefficient/parameter: minimum, maximum and central value. Usually two-sided. How to build one: calculate α̂, β̂ and the standard errors, choose a level of significance, and apply β̂ ± t_crit × SE(β̂). The critical value on the t distribution depends on the degrees of freedom. (See recording 00:23.)

CONFIDENCE INTERVAL, DIFFERENT TESTS: β = 1, reject; β = 0, do not reject, because 0 is contained in the confidence interval. FOR THE PROJECT, DO BOTH THE TEST OF SIGNIFICANCE AND THE CONFIDENCE INTERVAL.

ERRORS IN HYPOTHESIS TESTS
1. Type I: rejecting a null that is true.
2. Type II: not rejecting a null that is false.
The best remedy for type I and type II errors is to increase the sample size; you need comparable data if observations are missing.

T-RATIO: tells you whether a coefficient is significant or not. The constant term can be insignificant when we have no data about what is happening around the intercept. EXAMPLE: if the t-ratio in absolute terms is greater than 1.96, we reject H0. (Skip the other examples.)

THE EXACT SIGNIFICANCE LEVEL OR P-VALUE: the p-value gives us the plausibility of the null hypothesis.

CHAPTER 4: MULTIPLE REGRESSION
When we have more than one variable on the right-hand side. We can add more variables if we find something that helps us explain CO2 emissions.
F STATISTIC: used when you want to test several H0's, several restrictions, simultaneously.

NON-LINEAR MODEL: maybe needed because the relationship is non-linear. HOW TO DO THIS IN GRETL: highlight GDP per capita (not the logs), Add, Squares of selected variables, then Model, OLS, and put squared GDP among the regressors. The model we are running is in fact non-linear: as the country becomes richer and richer we see CO2 emissions declining. We have roughly 3 phases:
1. you need to pollute to become richer;
2. pollution is constant;
3. you grow richer and emissions decline.
This is the ENVIRONMENTAL KUZNETS CURVE. If your analysis is about a rich country you will probably see all of this, while for a poor country you see just phase 1 and maybe phase 2.

F-test to test whether the two coefficients (GDP per capita and sq_GDP_pc) are simultaneously zero. Null: b1 and b2 are simultaneously zero. Tests, Omit variables: highlight the two coefficients you want to test, Wald test (or sequential testing), and look at the p-value.

F DISTRIBUTION: also depends on degrees of freedom. The F distribution is related to the t distribution, while the chi-square distribution is related to the normal distribution. Can I use an F-test for a single hypothesis? Yes: any test that can be done as a t-test can be done as an F-test too.

R^2: how well the model fits the data; the square of the correlation between y and ŷ. R^2 is the explained sum of squares divided by the total sum of squares. Problem: R^2 keeps going up when you increase the number of variables; this is why we use the adjusted R^2, which penalises extra variables. If we want to compare 2 models, we have to use the adjusted R^2.
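A sketch of the quadratic (Kuznets-curve) model and the joint F-test in script form, with the same hypothetical series names as before:

square GDP_pc                  # creates sq_GDP_pc ("Add > Squares of selected variables")
ols CO2 const GDP_pc sq_GDP_pc

# F-test of H0: the coefficients on GDP_pc and sq_GDP_pc are jointly zero
omit GDP_pc sq_GDP_pc --wald   # --wald: test without re-estimating the smaller model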
COMPARE THE ADJUSTED R^2 OF THE LINEAR, LOGARITHMIC AND NON-LINEAR MODELS TO SEE WHICH MODEL IS BETTER.
AKAIKE CRITERION: the smaller the value, the better the model.

DUMMY VARIABLE: takes only the values 1 and 0. It gives you additional explanatory power when something extreme, like a war, is happening. Then watch the t-ratio.

QUANTILE REGRESSION. Standard regression approaches effectively model the (conditional) mean of the dependent variable: we can calculate from the fitted regression line the value that y takes for any value of the explanatory variables. But this is basically an extrapolation of the behaviour of the relationship between x and y at the mean to the remainder of the data. This is a suboptimal approach because OLS measures the average response. For example, OLS cannot help if you focus on what happens when y is relatively low or relatively high. Likewise, there might be a non-linear relationship between x and y: estimating a standard linear regression model may lead to seriously misleading estimates of this relationship, as it will "average" the positive and negative effects. It would be possible to include non-linear (i.e. polynomial) terms in the regression model (squared, cubic, ... terms), but quantile regressions represent a more natural and flexible way to capture the complexities, by estimating models for the conditional quantile functions. Quantile regressions can be conducted in both time-series and cross-sectional contexts. It is usually assumed that the dependent variable, often called the response variable, is independently distributed and homoskedastic. Quantile regressions are more robust to outliers and non-normality than OLS regressions.

HOW TO IDENTIFY OUTLIERS IN THE PROJECT: Graphs, residual plot, against time; I can identify big drops or big jumps. How can I take outliers into account? I can create 2 dummy variables, one for each year: go to Add, Observation range dummy (one dummy for each year), then Model, OLS, including the 2 dummies as regressors; the output statistics tell you whether they are statistically significant or not. If not (no asterisks), we did not identify the outliers correctly.

QUANTILE REGRESSION, PART 2
• Quantile regression is a non-parametric technique, since no distributional assumptions are required to optimally estimate the parameters.
• The notation and approaches commonly used in quantile regression modelling are different from those we are familiar with in financial econometrics.
• Increased interest in modelling the "tail behaviour" of series has spurred applications of quantile regression in finance; here, what happens when CO2 emissions per capita are relatively low or relatively high.
• A common use of the technique is value-at-risk modelling. This seems natural given that such models are based on estimating a quantile of the distribution of possible losses.
• Quantiles, denoted τ, refer to the position where an observation falls within an ordered series for y. For example: the median is the observation in the very middle; the (lower) tenth percentile is the value that places 10% of observations below it (and therefore 90% of observations above). More precisely, we can define the τ-th quantile, Q(τ), of a random variable y having cumulative distribution F(y) as

Q(τ) = inf{ y : F(y) ≥ τ }

where inf refers to the infimum, or "greatest lower bound": the smallest value of y satisfying the inequality.
• By definition, quantiles must lie between zero and one.
• Quantile regressions effectively model the entire conditional distribution of y given the explanatory variables.
• The OLS estimator finds the mean value by minimising the RSS; minimising the sum of the absolute values of the residuals yields the median. The absolute value function is symmetrical, so the median always has the same number of data points above it as below it. If the absolute residuals are weighted differently depending on whether they are positive or negative, we can calculate the other quantiles of the distribution: to estimate the τ-th quantile, we set the weight on positive residuals to τ and that on negative residuals to 1 - τ.
• We can select the quantiles of interest; common choices are 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95. The fit is not good for values of τ too close to its limits of 0 and 1.

HOW NOT TO DO IT
• As an alternative to quantile regression, it would be tempting to partition the data and run separate regressions on each part; for example, dropping the top 90% of the observations on y and the corresponding data points for the x's, and running a regression on the remainder.
• However, this process, tantamount to truncating the dependent variable, would be wholly inappropriate: it could lead to potentially severe sample-selection biases.
• In fact, quantile regression does not partition the data: all observations are used in the estimation of the parameters for every quantile.

(2:27) FOR THE PROJECT: Model, Robust estimation, Quantile regression. Get rid of the dummies; use CO2 emissions per capita. Desired quantiles: 0.05/0.25/0.50/0.75/0.95; choose all, and tick "compute confidence intervals" so we can see whether a coefficient is statistically significant or not. Click OK. For GDP_pc I get 5 coefficients: the first one represents the response when CO2 is relatively low, the last one when CO2 is relatively high, and the third of the five is where CO2 is in the middle. To see what is happening, go to Graphs, tau sequence, GDP_pc: this graph compares OLS (blue; the dotted lines above and below are the OLS confidence interval) with the coefficients from the QUANTILE REGRESSION (black). OLS tells us what is happening in the middle, the average response, while quantile regression tells us whether there is a significant asymmetry when CO2 emissions are relatively high or relatively low. The difference between QR and OLS for the UK is not very important; it is relatively limited. Each GDP_pc coefficient corresponds to one point on the QR line of the graph: we are comparing OLS, which assumes an average symmetric response, with QR, which looks at different segments of the distribution.
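The same estimation as a script; a minimal sketch with the usual hypothetical series names:

# quantile regression at five taus, with confidence intervals
# (Model > Robust estimation > Quantile regression in the GUI)
matrix taus = {0.05, 0.25, 0.5, 0.75, 0.95}
quantreg taus CO2 const GDP_pc --intervals
# the tau-sequence plot (QR coefficients against the OLS band) is then
# available from the model window under Graphs > tau sequence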
As here the blue line is very close to the black line, it seems that the average response is a good approximation of what is happening in the different segments of the distribution (i.e. when CO2 is relatively low or relatively high). The blue line is the average response, which assumes symmetry, while the black line allows asymmetry. QR says that the response of CO2 emissions to a change in GDP is not a single number, as OLS says, but a function of whether CO2 emissions are relatively high or relatively low (a different response depending on the case).

WORRY ABOUT OUTLIERS WHEN YOU ARE USING OLS, NOT WHEN USING QR: because quantile regression looks at the different segments of the distribution, it can take OUTLIERS into account in a more efficient way; outliers mostly affect what happens in one specific segment.

Differences depending on the number of quantiles: you get different responses depending on the number of quantiles you choose.

22/04/2022. PROJECT + PPP; color_xy installed.

QUANTILE REGRESSION: QR can examine the potential asymmetry between 2 variables. We need quantile regression because it looks at the entire conditional distribution, while OLS just focuses on the mean, the average response/value; it doesn't tell us what happens when values are relatively low or high. OLS measures average responses: symmetry/linearity (one β). QR: more β's, one for each quantile; asymmetric, looking at the entire distribution via the conditional quantile functions. The dependent variable is usually called the response variable, and we assume it is independently distributed and homoskedastic. The advantage is that QR is robust to outliers, so we don't need DUMMY variables. It is a non-parametric technique, and the notation is different! Tail behaviour = relatively high and low values of the response variable (y). A very powerful tool if we are in finance or the environment. Quantiles are denoted by TAU: when tau is 0.10 it refers to the first ten per cent of observations (relatively low values); when it is 0.90 it refers to relatively high values. REMEMBER: the observations (y) are ordered. Quantiles lie between 0 and 1.

QUANTILE REGRESSION CAN MODEL THE ENTIRE CONDITIONAL DISTRIBUTION OF Y GIVEN THE EXPLANATORY VARIABLES (CO2 per capita given GDP). (USE THIS FOR THE PROJECT: we need quantile regression to investigate whether there is a difference between relatively high CO2 emissions and relatively low CO2 emissions; this justifies why we are using QR.)

WRONG ALTERNATIVE: separate regressions for relatively high values of y and relatively low values; this is wrong. Quantile regression gives different WEIGHTS to different observations, but it does not partition the data.

CASE STUDY (0:33, same graph as the previous lesson): they compared OLS with QR. If you bootstrap the standard errors you get more reliable results (FOR THE PROJECT), and we can also build a table like the one above with the different quantiles; Q(0.5), the median, can be compared directly with OLS.

(00:37) Modelling in GRETL (0:43, IMPORTANT):
MODEL 1: QR. Regressors: const, GDP_pc; click on "robust standard errors/intervals" and on "compute confidence intervals"; 5 quantiles; Graphs, tau sequence, GDP_pc.
MODEL 2: the same quantile regression but in logs, so the dependent variable is the log of annual CO2 and the regressors are const and the log of GDP; confidence intervals and robust standard errors; Graphs, tau sequence; CHECK FOR DIFFERENCES WITH MODEL 1.

EXPLANATION. OLS results: the dependent variable is annual CO2 and the independent variable is GDP per capita; a 1% increase in GDP per capita leads to about a 0.49% increase in CO2. This is the AVERAGE RESPONSE. QR results (at the median): about a 0.41% increase in CO2 for a 1% increase in GDP.
When you are relatively rich, and increases in CO2 are consequently relatively high, we get an asymmetric response: CO2 emissions increase less when you are rich and more when you are poor (I can see that from the coefficients across increasing quantiles). ASYMMETRIC RESPONSE. To compare OLS and QR, I look at the median response.

PROJECT (1:07). TO INSERT THE DATA VIA EXCEL: download the data, filter them, use underscores in the names so Gretl can read them, open the file, and delete all data except the 3 columns with year, CO2 and GDP.

CHAPTER 5: VIOLATIONS OF THE ASSUMPTIONS OF THE CLRM (*CLRM: these are the tests you need to do for all your models)
Tests for VIOLATIONS:
1. E(u) = 0, the average value of u is zero: NORMALITY OF RESIDUALS TEST
2. HOMOSKEDASTICITY: WHITE TEST
3. NO AUTOCORRELATION: AUTOCORRELATION TEST
4. No relationship between the errors and the X matrix
5. u follows a normal distribution: NORMALITY OF RESIDUALS TEST

Consequences of VIOLATIONS: in general, we could encounter any combination of 3 problems:
- the coefficient estimates are wrong
- the associated standard errors are wrong
- the distribution that we assumed for the test statistics is inappropriate
Solutions: either transform things so that the assumptions are no longer violated, or work around the problem by using alternative techniques that are still valid.

STATISTICAL DISTRIBUTIONS FOR DIAGNOSTIC TESTS
• Often an F-version and a chi-square version of a test are available.
• The F-test version involves estimating a restricted and an unrestricted version of a test regression and comparing the RSS's (the F needs degrees of freedom).
• The chi-square version is sometimes called an "LM" test, and has only one degrees-of-freedom parameter: the number of restrictions being tested, m.
• Asymptotically the two tests are equivalent, since the chi-square is a special case of the F-distribution.
• For small samples, the F-version is preferable. Degrees of freedom = number of observations minus number of estimated coefficients. Chi-square (the LM test, one degrees-of-freedom parameter) for relatively big samples; the F-test (which uses degrees of freedom) is our focus because we have a relatively small sample.

ASSUMPTION 1: the mean of the disturbances is zero. The mean of the residuals will always be zero provided there is a constant term in the regression; satisfied when we include a constant term. DIAGNOSTIC TEST 1: NORMALITY OF RESIDUALS. If the p-value is (near) zero, they are not normally distributed.

ASSUMPTION 2: HOMOSKEDASTICITY (the variance of the errors is constant). To test this we use the White test: we start with the linear model, take the residuals, run the auxiliary regression, and obtain its R². DIAGNOSTIC TEST 2: the White test, from the logarithmic or the levels model (1:29).

If I have heteroskedasticity, the VARIANCE OF THE ERRORS IS NOT CONSTANT: the standard errors are inappropriate, so I cannot use t-tests, p-values etc., and I need to update the standard errors to take heteroskedasticity into account. We can reduce heteroskedasticity by: 1. taking logs; 2. using standard errors that allow for it: Edit, Modify model, click on ROBUST, and you will see the standard errors increase. If I run the White test again the result will be the same, because heteroskedasticity affects only the standard errors, not the coefficients. WE GET WIDER STANDARD ERRORS, hence wider CONFIDENCE INTERVALS, so we are MORE CONSERVATIVE.

CREATE A LAGGED VARIABLE: Y(t-1). ΔY_t is usually the growth rate of the variable.
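These diagnostics can be run in one short script; a sketch under the same naming assumptions as before:

ols CO2 const GDP_pc

modtest --normality     # assumptions 1 and 5: normality of the residuals
modtest --white         # assumption 2: White's test for heteroskedasticity

# if heteroskedasticity shows up, re-estimate with robust (HC) standard errors
ols CO2 const GDP_pc --robust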
ASSUMPTION 3: NO AUTOCORRELATION = YOUR MODEL IS GOOD IF THERE ARE NO PATTERNS IN THE RESIDUALS. Negative/positive autocorrelation vs NO AUTOCORRELATION (everything random = your model is good). TO TEST FOR AUTOCORRELATION we use the DURBIN-WATSON TEST (first-order autocorrelation, so just one lag: t and t-1). If DW is near 2 there is little evidence of AUTOCORRELATION; near 0 means POSITIVE autocorrelation, near 4 NEGATIVE autocorrelation, and there are regions where the test is INCONCLUSIVE. We can read the test statistic directly from the OLS output (you have to do this for all your models), but we also need the DW p-value: go to Tests, Durbin-Watson p-value, and look at the p-values for negative and positive autocorrelation. H0 IS THAT THERE IS NO AUTOCORRELATION (1:48). In the professor's example: against H1.1 I reject the null, against H1.2 I do not reject the null, so there is positive autocorrelation. Conclusion: we have evidence of positive autocorrelation.

Alternatively: the BREUSCH-GODFREY TEST, a test for higher-order autocorrelation. TEST IN GRETL: Autocorrelation, 2 lags; p-value near zero, so we reject the null of no autocorrelation.

CONSEQUENCES OF IGNORING AUTOCORRELATION IF IT IS PRESENT:
• The coefficient estimates derived using OLS are still unbiased, but they are inefficient, i.e. they are not BLUE, even in large sample sizes.
• Thus, the standard error estimates are inappropriate, and there exists the possibility that we could make the wrong inferences.
• R² is likely to be inflated relative to its "correct" value for positively correlated residuals.

REMEDIES:
• If the form of the autocorrelation is known, we could use a GLS procedure, i.e. an approach that allows for autocorrelated residuals, e.g. Cochrane-Orcutt.
• But such procedures that "correct" for autocorrelation require assumptions about the form of the autocorrelation. If these assumptions are invalid, the cure would be more dangerous than the disease! See Hendry and Mizon (1978).
• However, it is unlikely that the form of the autocorrelation is known, and a more "modern" view is that residual autocorrelation presents an opportunity to modify the regression: WE INCLUDE Y(t-1), the lagged value of the dependent variable, in the model. So: Edit, Modify, click on Lags, 1 lag of the dependent variable, OK (robust standard errors if the White test showed heteroskedasticity), and re-run the model with the lagged value. Then REDO the tests of normality, heteroskedasticity and autocorrelation; all the results will be stored under "My models". IF AUTOCORRELATION IS STILL PRESENT, I HAVE TO MODIFY THE MODEL AGAIN, for example by inserting the dummies. In the professor's case the 2 dummies helped solve it. Dummies make all your diagnostics behave better and increase your adjusted R² significantly.

Interpreting the coefficient of a DUMMY: e.g. in 1921 the coefficient is -0.4013, so in that year only we have a significant reduction in annual CO2 emissions of 0.4013 because of the war.

GRAPHS, ANALYSIS, INFLUENTIAL OBSERVATIONS: you can see whether some observations are more influential than others. (2:30) BOOTSTRAP: Analysis, Bootstrap, choose a coefficient, and this will bootstrap your standard error. I go with the WILD bootstrap, which gives me the empirical distribution of the coefficient.

WHY DO WE WANT TO INCLUDE LAGS?
- to get rid of autocorrelation
- inertia of the dependent variable
- over-reaction
- measuring time series as overlapping moving averages
However, other problems with the regression could also cause the null hypothesis of no autocorrelation to be rejected:
- omission of relevant variables, which are themselves autocorrelated
- a "misspecification" error from using an inappropriate functional form
- autocorrelation resulting from unparameterised seasonality
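A sketch of the autocorrelation checks and the lagged-variable remedy in script form (same hypothetical names):

ols CO2 const GDP_pc --robust
printf "Durbin-Watson = %.3f\n", $dw   # near 2 = little evidence of autocorrelation
modtest 2 --autocorr                   # Breusch-Godfrey test, 2 lags

# remedy: add the first lag of the dependent variable, then re-test
ols CO2 const GDP_pc CO2(-1) --robust
modtest 2 --autocorr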
(We use the dummies only if all the tests and fixes above fail to correct the autocorrelation. You can spot the outliers from the graph, one dummy per point; check Analysis, Residuals, and see from the residuals what happened in each year.)

The next step is to build the model in FIRST DIFFERENCES.

FOR EACH MODEL (LINEAR, NON-LINEAR, LOGARITHMIC) WE NEED ALL THE INTERPRETATION, THE STATISTICAL SIGNIFICANCE (t-test, p-value), THE OUTLIERS, THE TIME SERIES PLOTS, AND SO ON. IN THE INTRODUCTION YOU CAN DISCUSS WHY YOU ARE EXPECTING CO2 AND GDP TO BE RELATED, WHAT YOU KNOW FROM THEORY ABOUT THEIR RELATIONSHIP!

26/04/2022. REMEDIES FOR AUTOCORRELATION
AUTOCORRELATION: when we have correlation between 2 subsequent points in time, it creates problems both for the standard errors and for inference. It is an opportunity to MODIFY THE REGRESSION: ADD LAGS TO THE RIGHT-HAND SIDE OF THE CAUSALITY EQUATION. Sometimes this change is enough, sometimes NOT. No need to re-run the White test; instead: TEST FOR AUTOCORRELATION (Breusch-Godfrey test), UPDATE THE STANDARD ERRORS, ADD THE LAGGED VARIABLE OF Y, and RE-TEST for autocorrelation to see whether it has been taken into account. (Re-listen to this part of the recording; the sequence above may not be exact.)

DYNAMIC MODELS: so called because we have the lagged value of the dependent variable on the right-hand side.
Y: dependent variable, CO2 emissions per capita
X_t: independent variable, GDP per capita
Y(t-1): lagged value of CO2 emissions per capita
Why do we have the lagged value of the dependent variable? Because we introduce it to remedy autocorrelation. If you cannot solve the autocorrelation, you use DUMMIES for the OUTLIERS: are the dummy coefficients statistically significant? You can see the outliers from the time series plot.

WHY INTRODUCE LAGS? LAGS MEASURE INERTIA, i.e. how much the past (past observations) influences the present (current observations). B2, the coefficient of the lagged variable, measures the autocorrelation.

Possible reasons why we may have autocorrelation:
- omission of relevant variables
- a misspecification from using an inappropriate functional form (WE USE LOGS OF THE VARIABLES FOR THIS)
- unparameterised seasonality (remove it by DESEASONALISING THE DATA)
- outliers

THEODOR'S EXAMPLE (review, around 16:28): check the outliers; he also added 2 lags to the OLS model, and the autocorrelation got better. In his case the Breusch-Godfrey test was fine even at 2nd order (the LAGS were significant). So you should try checking the outliers and adding lags of the outlier dummies (t-2).

MODELS IN FIRST-DIFFERENCE FORM
From LEVELS (Y_t, X_t) to FIRST DIFFERENCES (ΔY, ΔX), where ΔY_t = Y_t - Y(t-1). If Y and X are in LOGS, then ΔY and ΔX measure the GROWTH RATES OF Y and X. So by a model in FIRST DIFFERENCES we mean a model that tries to explain, e.g., THE GROWTH RATE OF CO2 EMISSIONS.

HOW TO DO THAT IN GRETL (00:28):
1. Select the two variables, Add, Log differences of selected variables; OR select the log variables and choose First differences of selected variables: same result.
2. Select 4 variables (the levels and the log differences of annual CO2 emissions and of GDP per capita, which I just created), create 4 graphs, time series plot in separate small graphs: you can see the levels of the variables and the growth rates of GDP per capita and of CO2.
3. Starting again from the 4 selected variables: View, Summary statistics, show full statistics.
4. SELECT JUST THE GROWTH-RATE VARIABLES (FIRST DIFFERENCES), right-click, XY scatter plot: you can see the growth-rate relation and identify the outliers. (MAKE A NOTE THAT THE OUTLIERS YOU IDENTIFY IN LEVELS, the ones we have identified so far, ARE NOT NECESSARILY THE SAME OUTLIERS AS IN THE FIRST-DIFFERENCE MODEL.)
5. RUN AN OLS REGRESSION with robust standard errors of the growth rate of CO2 emissions on the growth rate of GDP (= first differences). From that you can see whether the growth rate of GDP is statistically significant.
6. FILE, FUNCTION PACKAGES, on server: "confidence band plot"; then go back to the model above, GRAPHS, Confidence band plot. IT PLOTS THE REGRESSION LINE TOGETHER WITH THE CONFIDENCE BAND; you need to define the confidence level (he uses 0.95).
7. TEST FOR NORMALITY.
8. TEST FOR AUTOCORRELATION.
9. THE DURBIN-WATSON P-VALUE TELLS ME WHETHER THERE IS FIRST-ORDER AUTOCORRELATION.
10. TEST FOR HETEROSKEDASTICITY.

PROBLEM: MODELS IN FIRST DIFFERENCES DO NOT HAVE LONG-RUN SOLUTIONS. ΔY_t = Y_t - Y(t-1), and usually all the variables are in logs. IN THE LONG RUN Y_t = Y(t-1): Y is at its equilibrium value, so ΔY_t is equal to zero in the long run. So, to find the long-run solution of a model in first differences:
1. we drop the time subscripts, writing not X_t and Y(t-1) but just X and Y
2. ΔY = 0
3. ΔX = 0
4. X2(t-1) = X2
5. Y(t-1) = Y
6. u_t = 0
which leaves something like 0 = b1 + b3·X2 + b4·Y.
FIRST DIFFERENCES tell you how the model behaves in the SHORT RUN; LEVELS tell you how it behaves in the LONG RUN.

Problems with Adding Lagged Regressors to "Cure" Autocorrelation:
• Inclusion of lagged values of the dependent variable violates the assumption that the right-hand-side variables are non-stochastic (a price we need to pay).
• What does an equation with a large number of lags actually mean? A model with lots of lags is difficult to interpret.
• Note that if there is still autocorrelation in the residuals of a model including lags, then the OLS estimators will not even be consistent.

MULTICOLLINEARITY: a problem that occurs when the explanatory variables are very highly correlated with each other. If two right-hand-side variables are correlated, then we have multicollinearity. In our case all we have on the right-hand side are dummies and lagged variables, but if we try to add variables that could explain CO2 emissions, we can get multicollinearity. For example, we could add industrial production (find the data) and argue that it could also increase CO2 emissions; industrial production and GDP are correlated, so there we should check for multicollinearity. There is no formal test for multicollinearity, so we just look at the correlation matrix of the right-hand-side variables; if two or more variables move together, we have to drop one of them or, even better, TRANSFORM the model.
• Perfect multicollinearity: we cannot estimate all the coefficients; e.g. suppose x3 = 2·x2 and the model is y_t = β1 + β2·x2t + β3·x3t + β4·x4t + u_t.
• Problems if near multicollinearity is present but ignored: R² will be high but the individual coefficients will have high standard errors; the regression becomes very sensitive to small changes in the specification; thus confidence intervals for the parameters will be very wide because of the high standard errors, and significance tests might therefore give inappropriate conclusions.
We identify the problem by looking at the correlation matrix, or we transform the model in order to solve the multicollinearity.
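The first-difference model and its diagnostics in script form (hypothetical names again); `corr` gives the correlation matrix mentioned for the multicollinearity check:

ldiff CO2 GDP_pc                 # creates ld_CO2 and ld_GDP_pc (growth rates)

ols ld_CO2 const ld_GDP_pc --robust
modtest --normality
modtest 1 --autocorr             # first-order autocorrelation
modtest --white

corr CO2 GDP_pc ld_CO2 ld_GDP_pc # correlation matrix of candidate regressors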
"Traditional" approaches, such as ridge regression or principal components, usually bring more problems than they solve. Some econometricians argue that if the model is otherwise OK, just ignore it. The easiest ways to "cure" the problem are:
- drop one of the collinear variables
- transform the highly correlated variables into a ratio
- go out and collect more data, e.g. a longer run of data, or switch to a higher frequency

(1:19) ADOPTING THE WRONG FUNCTIONAL FORM: RAMSEY'S RESET TEST
• We have previously assumed that the appropriate functional form is linear. This may not always be true.
• We can formally test this using Ramsey's RESET test, a general test for misspecification of the functional form. We have seen 4 functional forms so far: linear, logarithmic, first differences and non-linear. Which of these is the most realistic for our data? The Ramsey RESET test will give us the answer.
• Essentially the method works by adding higher-order terms of the fitted values (ŷ², ŷ³, etc.) to an auxiliary regression of the residuals on powers of the fitted values:

û_t = β0 + β1·ŷ_t² + β2·ŷ_t³ + ... + β(p-1)·ŷ_t^p + v_t

so we have the residuals on the left-hand side and ŷ squared, ŷ to the power 3 and so on on the right-hand side. Obtain R² from this regression; the test statistic is T·R² and is distributed as chi-square(p - 1).
• If the value of the test statistic is greater than the chi-square(p - 1) critical value, reject the null, which is the hypothesis that the functional form is correct.

IN GRETL: OLS with CO2 emissions on the left and GDP on the right. I am going to test for functional-form misspecification using Ramsey RESET for the linear OLS model: Tests, Ramsey RESET test, keep the default option (squares and cubes), OK. An auxiliary regression for the RESET test also comes out, with ŷ² and ŷ³ among the regressors. Model and test: in the professor's case the p-value of the Ramsey test is almost zero; the null hypothesis is that the functional form is correct, and we reject it, so this is not a good model. There are no guarantees that I will be able to find a good model, but we try with each one. (1:30) In the auxiliary regression output, the p-value is highlighted in yellow.
• The RESET test gives us no guide as to what a better specification might be; we have to experiment with different specifications.
• One possible cause of rejection is that the true model is non-linear, e.g. y_t = β1 + β2·x2t + β3·x2t² + β4·x2t³ + u_t. In this case the remedy is obvious.
• Another possibility is to transform the data into logarithms. This will linearise many previously multiplicative models into additive ones: y_t = A·x_t^β·e^(u_t) becomes ln y_t = α + β·ln x_t + u_t.

TESTING FOR NORMALITY
• Why did we need to assume normality? For hypothesis testing.
• The Bera-Jarque normality test follows a chi-square distribution.
• A normal distribution is not skewed and is defined to have a coefficient of kurtosis of 3; so its excess kurtosis (b2 - 3) is zero.
• Skewness and kurtosis are the (standardised) third and fourth moments of a distribution. The first moment of a distribution is the expectation and the second moment is the variance.
NORMAL DISTRIBUTION: symmetric. SKEWED DISTRIBUTION: asymmetric. LEPTOKURTIC: many observations around the mean (a higher peak than the normal).
TEST FOR NORMALITY IN GRETL: Tests, Normality of residuals, look at the p-value. H0: THE ERRORS ARE NORMALLY DISTRIBUTED. IF YOU REJECT THE NULL: NON-NORMALITY IS NOT A BIG PROBLEM, AND DUMMIES CAN IMPROVE THE MODEL HERE TOO.
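The RESET test as a script; a sketch with the same assumed names:

ols CO2 const GDP_pc
reset                  # Ramsey RESET with squares and cubes of y-hat (the default)
reset --squares-only   # variant with y-hat squared only
reset --cubes-only     # variant with y-hat cubed only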
WHAT DO WE DO WITH NON-NORMALITY?
• It is not obvious what we should do! We could use a method which does not assume normality, but that is difficult, and what are its properties?
• It is often the case that one or two very extreme residuals cause us to reject the normality assumption.
• An alternative is to use dummy variables: e.g. say we estimate a monthly model of asset returns over 1980-1990, plot the residuals, and find a particularly large outlier for October 1987. OF COURSE WE NEED A THEORETICAL REASON FOR ADDING A DUMMY VARIABLE.

Omission of an Important Variable
• Consequence: the estimated coefficients on all the other variables will be biased and inconsistent, unless the excluded variable is uncorrelated with all the included variables.
• Even if this condition is satisfied, the estimate of the coefficient on the constant term will be biased.
• The standard errors will also be biased.

Inclusion of an Irrelevant Variable
• Coefficient estimates will still be consistent and unbiased, but the estimators will be inefficient.

PARAMETER STABILITY TESTS
• So far, we have implicitly assumed that the parameters (β1, β2 and β3) are constant for the entire sample period. There is no reason to assume that this is the case, so we can test this implicit assumption using parameter stability tests. The idea is essentially to split the data into sub-periods, estimate up to three models, one for each of the sub-parts and one for all the data, and then "compare" the RSS (residual sum of squares) of the models. For example, if you have data from 1800-2020, you might divide it into 3 parts because some events changed things: 1. 1800-1939; 2. 1940-2020; 3. 1800-2020 (the entire period).
• There are two types of test we can look at: the Chow test (an analysis-of-variance test) and predictive failure tests.

THE CHOW TEST. The steps involved are:
1. Split the data into two sub-periods. Estimate the regression over the whole period and then for the two sub-periods separately (3 regressions). Obtain the RSS for each regression.
2. The restricted regression is the regression for the whole period, while the "unrestricted regression" comes in two parts: one for each sub-sample. We can thus form an F-test based on the difference between the RSS's. The statistic is

F = [ (RSS - (RSS1 + RSS2)) / (RSS1 + RSS2) ] × (T - 2k) / k

where: RSS = RSS for the whole sample; RSS1 = RSS for sub-sample 1; RSS2 = RSS for sub-sample 2; T = number of observations; 2k = number of regressors in the "unrestricted" regression (since it comes in two parts); k = number of regressors in each part of the "unrestricted" regression.
3. Perform the test. If the value of the test statistic is greater than the critical value from the F-distribution, which is an F(k, T - 2k), then reject the null hypothesis that the parameters are stable over time.
NULL HYPOTHESIS: the parameters are stable over time (no structural break). The alternative is that the two sub-samples are different.

IN GRETL: Tests, Chow test, choose the observation/year at which to split (this depends on theory and on the country). IF WE REJECT, WE SHOULD REDEFINE THE BREAKS, AND WE NEED TO ESTIMATE ONE MODEL BEFORE A CERTAIN DATE AND A DIFFERENT MODEL AFTER IT. OR WE CAN FIND A MODEL THAT TAKES INTO ACCOUNT THAT THE PARAMETERS ARE CHANGING OVER TIME!

CUSUM TEST: an indication of the stability of the model.
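In script form, the stability tests look like this (the break year 1939 is only an illustration, not a recommendation for your country):

ols CO2 const GDP_pc
chow 1939    # Chow test for a structural break at a chosen year
cusum        # CUSUM test for parameter stability
qlrtest      # QLR: in effect a Chow test at every admissible break point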
ANOTHER WAY TO TEST THE STABILITY OF THE PARAMETERS IS THE CUSUM TEST: we need it to see how the coefficients are changing over time, and whether from some point in time onwards the coefficients change, which is another indication that the stability of the model is not supported by the data.

QUESTIONS: 1. MULTICOLLINEARITY. In our case we have just one independent variable, so I do not have a multicollinearity problem; BUT if I add variables, I can check for multicollinearity.

CHOW TEST EXAMPLE (2:10). CUSUM TEST: the null is parameter stability. If you want to find a possible year at which to split the sample, you can use the CUSUM test to see whether and when there is a break in the sample.

QLR TEST: you can also run a QLR test, which performs the test sequentially for all possible observations; it is like running a Chow test for every possible year (2:14). We don't know when this break (in which year) is taking place; the only thing we set is the trimming of the sample: we can exclude some observations (like the first 15%). If the p-value = 0 there is no stability, and a potential break year is, e.g., 1896 (given by the likelihood-ratio statistic, similar to the Chow test).

HOW TO DECIDE ON THE SUB-PARTS. As a rule of thumb, we could use all or some of the following:
1. Plot the dependent variable over time and split the data according to any obvious structural changes in the series.
2. Split the data according to any known important historical events (e.g. a stock market crash, a new government elected).
3. Use all but the last few observations and do a predictive failure test on those.

MEASUREMENT ERROR: maybe we need alternative CO2 measures. If there is measurement error in one or more of the explanatory variables, this will violate the assumption that the explanatory variables are non-stochastic. Sometimes this is also known as the errors-in-variables problem. Measurement errors can occur in a variety of circumstances, e.g.:
- macroeconomic variables are almost always estimated quantities (GDP, inflation, and so on), as is most information contained in company accounts;
- sometimes we cannot observe or obtain data on a variable we require and so we need to use a proxy variable; for instance, many models include expected quantities (e.g. expected inflation), but we cannot typically measure expectations.
From this slide, the rest of the slides are skipped until the GENERAL-TO-SPECIFIC APPROACH.

PROJECT SECTION 3
1. LINEAR MODEL AND DIAGNOSTICS (autocorrelation, White, Ramsey RESET, Durbin-Watson, Chow test, ...)
2. NON-LINEAR MODEL, SAME DIAGNOSTICS
3. LOGARITHMIC
4. FIRST DIFFERENCES
5. QR

QUESTIONS AT THE END OF THE LESSON. Remove all the years for which we do not have observations for both variables. To check the decimals (problem of the decimal point): go to the Excel file again, filter the data, check you did it right, paste it carefully, and use only the years with observations for both variables. Once you have a clean Excel file, insert all your rows and tell Gretl that the first row holds the years (the time dimension). FOR QR, do we have to run diagnostic tests on it or just compare it with the linear model? The only test you can do is for normality of residuals (plus the statistical significance of the coefficients). Extra variables are optional.

27/04/2022. color_xy graph. To identify OUTLIERS use a BOX PLOT: insert annual CO2 in levels, logs and first differences.
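As a script, assuming the log and log-difference series created earlier:

# box plots of the level, log and growth-rate series, to spot outliers
boxplot CO2 l_CO2 ld_CO2 --output=display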
MATRIX OF CORRELATIONS: do this if you want to check the correlation between two variables (in case you added another variable).
NON-LINEAR MODEL, question about autocorrelation (0:43): I can keep increasing the number of lags; if that doesn't work, you report that we cannot solve the autocorrelation problem even after trying log values and first differences.

CHAPTER 6: VARIABLES OPERATING IN THE TIME SERIES DIMENSION
TIME SERIES MODEL: a model in which the lagged values of the dependent variable help us explain and forecast it.

WEAKLY STATIONARY PROCESS: a series y is weakly (or covariance) stationary if it satisfies the following three conditions:
1. E(y_t) = μ, t = 1, 2, ... (constant mean)
2. E[(y_t - μ)²] = σ² < ∞ (constant variance)
3. E[(y_t1 - μ)(y_t2 - μ)] = γ(t2 - t1) for all t1, t2 (the covariance depends only on the distance between the two dates)

However, the values of the autocovariances depend on the units of measurement of y_t. That is why we usually EMPLOY autocorrelations. Autocorrelations are normalised autocovariances: WE divide each autocovariance by the variance to obtain a correlation (the TAUs). Can I compare the autocovariances of two variables? NO. Can we compare covariances? NO: the answer is no, because all covariances depend on the units of measurement of y. Can I compare autocorrelations? YES. So if we take the autocorrelations and plot them, this is called the autocorrelation function, or CORRELOGRAM.

CORRELOGRAM: measures how strong the memory of a process is, i.e. HOW IMPORTANT LAGGED VALUES, PAST OBSERVATIONS, ARE FOR A PROCESS. (1:12) IN GRETL: right-click on the variable and find the correlogram; you can leave Gretl's default options. AUTOCOVARIANCES ARE NOT COMPARABLE, BUT AUTOCORRELATIONS (which take values between -1 and 1 and measure the dependence between two observations) ARE; the latter are obtained by dividing the autocovariances by the variance.

WHITE NOISE PROCESS: random, with no structure; zero autocovariances: what happened in the past does not affect what is happening now. We need a white noise process because it gives us confidence bands for insignificant autocorrelations: THE NULL HYPOTHESIS IS THAT THE TRUE VALUE OF THE AUTOCORRELATION IS ZERO. The blue lines around zero are the confidence bands given by the white noise process, SO EVERYTHING INSIDE THE BLUE BAND IS NOT SIGNIFICANT, EVERYTHING OUTSIDE IS SIGNIFICANT. (WE ARE JUST LOOKING AT THE FIRST GRAPH RIGHT NOW, THE ACF.)

Now I go to the FIRST DIFFERENCES (just of annual CO2) and create a correlogram there. IF THE AUTOCORRELATIONS FALL OUTSIDE THE WHITE-NOISE BAND, THEY ARE SIGNIFICANT.

JOINT HYPOTHESIS TEST ON THE AUTOCORRELATION FUNCTION: we can do this with the Q statistic or Q*; the difference is that Q* is used when we have a small sample. The Q statistic in GRETL is shown in the correlogram output, with its p-value right next to it. EXAMPLE: in this case only the first autocorrelation is significant, the others are not, given that confidence band. There is significant correlation between those two subsequent points in time. We can also do a joint test with Q and Q*, and here we would find that the coefficients are jointly not significant.

MOVING AVERAGE PROCESS, MA: y_t = μ (constant term) + θ1 (the coefficient that I want to estimate) times the lagged value of the error + ε_t (error term). So from this equation I have to estimate μ̂ and θ̂1. WHAT AN MA PROCESS IS TELLING US: the variable that we are trying to explain depends on lagged values of the random error term.
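The correlogram and an MA model for the growth rate, sketched as a script (ld_CO2 is the hypothetical log-difference series from before):

corrgm ld_CO2 12               # ACF and PACF, with Ljung-Box Q statistics and p-values
arima 0 0 2 ; ld_CO2 const     # MA(2) for the CO2 growth rate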
IN GRETL: from the log differences (which should be the first differences), Model, Univariate time series, ARIMA: AR = 0 (because we haven't done the autoregressive model yet) and 2 for MA. You can put the dummies under regressors or not; it's the same. I use my y variable, which is the growth rate of CO2 emissions per capita. Then in the output I have theta_1 and theta_2, as in the previous slide: THE CONSTANT TERM IS NOT STATISTICALLY SIGNIFICANT, THETA_1 IS SIGNIFICANT, THETA_2 IS NOT. The MA model is telling me whether the lagged errors help me explain and forecast the growth rate of CO2 emissions.

FROM THE MA MODEL: ANALYSIS, FORECASTS, and I CAN FORECAST CO2 emissions. (I drop one observation: I remove one year by going to SAMPLE on the variables page and lowering the end by one year, so I can forecast that observation and see how good my forecast is. I NEED TO COMPARE MY FORECAST WITH SOMETHING, THE REALISED VALUE, SO THAT I KNOW HOW GOOD MY FORECAST IS. SO I DROP THE LAST YEAR AND THEN RE-ESTIMATE THE MODEL, AND THEN I USE THIS NEW MODEL TO FORECAST WHAT IS HAPPENING IN 2020: ANALYSIS, FORECASTS for that missing year. (1:36) If the analysis looks bad, try selecting "dynamic forecast" at the start.) You then compare the moving average model with the autoregressive one.

(1:41) AUTOREGRESSIVE PROCESS: a process in which the lagged values of the variable itself are used to explain y. y_t is the variable that I am trying to explain, μ is the constant term, y(t-1) is the lagged value of y (we can also have 2 lags, adding another variable y(t-2)), and u_t is the error. THE DIFFERENCE BETWEEN MA(1) AND AR(1) is that in the first case we concentrate on lagged values of the random error, while in the AR process we concentrate on lagged values of the variable itself. (2:04)

IN GRETL. MOVING AVERAGE: select the log difference of annual CO2, Model, Univariate time series, ARIMA. AUTOREGRESSIVE: same as before, Model, Univariate time series, ARIMA, but in the orders table on the right put 1 for AR and 0 for MA. MA (1, left) vs AR (2, right). DO THIS IN THE PROJECT: the constant term in the AR is NOT statistically significant, PHI_1 is significant; the same pattern holds for the constant term and the regressor of the moving average.

STATIONARITY CONDITION FOR AN AR MODEL: you can use AR and MA models only when the variable is stationary, so in our case we should use these two models only for the growth rate of CO2 emissions.

Skipping until PARTIAL AUTOCORRELATION FUNCTION (PACF)
• Measures the correlation between an observation k periods ago and the current observation, after controlling for observations at intermediate lags (i.e. all lags < k): the effect (in the last example) of t-5 once you have removed the effect of everything in between (t-1, t-2, t-3, t-4).
IN GRETL: log difference of annual CO2, right-click, correlogram, 23 lags (or go with the default): the output gives the PACF as numbers and as a graph.

ARMA(1,1): a combination of AR and MA. GRETL: Model, Univariate time series, ARIMA, and then put 1 and 1.

COMPARING AR, MA, ARMA: estimate the AR, MA and ARMA models; to compare them you need the INFORMATION CRITERIA, which allow me to compare the models: the best model is the one that minimises the information criterion. AIC is the AKAIKE CRITERION (look for the most negative value), and the SCHWARZ criterion is the SBIC.

28/04/2022. QUESTIONS: add lags, re-estimate the model, watch for outliers (0:19). DIAGNOSTICS FOR MA, AR, ARIMA: the main test you can do is for autocorrelation.
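In script form, the AR/ARMA comparison and the one-year hold-out forecast described above might look like this (a sketch, same hypothetical ld_CO2 series):

arima 1 0 0 ; ld_CO2 const     # AR(1)
scalar aic_ar = $aic
arima 1 0 1 ; ld_CO2 const     # ARMA(1,1)
printf "AIC: AR(1) %.2f, ARMA(1,1) %.2f\n", aic_ar, $aic

smpl ; -1                      # drop the last year from the estimation sample
arima 1 0 0 ; ld_CO2 const     # re-estimate without it
fcast --out-of-sample          # forecast the held-out year, to compare with the actual value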
CHAPTER 8. STATIONARITY: defined by three properties: 1. constant mean; 2. constant variance; 3. constant covariance. The stationarity of a series matters for its behaviour and its properties (e.g. in a non-stationary process the persistence of shocks is infinite, so shocks affect the series forever).

SPURIOUS REGRESSION: when you regress one non-stationary variable on another non-stationary variable, this is called a spurious regression. Two variables trending over time is the hallmark of non-stationarity. A regression of non-stationary variables on each other will produce a high R² WITH NO MEANING; the t-ratios will not follow a t distribution, and HYPOTHESIS TESTING IS NOT VALID. So: when 2 variables are not correlated, I GET AN R² OF ZERO; BUT in this case, because of the non-stationarity of the 2 variables, I get an R² that is higher than zero, as if there were a causal relationship, and there is NOT.

TWO TYPES OF NON-STATIONARITY: 1. THE RANDOM WALK MODEL; 2. THE DETERMINISTIC TREND PROCESS.

GENERAL CASE OF THE RANDOM WALK MODEL (slide 6): a model where phi_1 can take any value we want. We concentrate on:
- PHI > 1: the explosive case (a growth rate that keeps going up), NON-STATIONARY;
- PHI = 1: NON-STATIONARY, a RANDOM WALK;
- PHI < 1: the variable is STATIONARY.

CASE 1, PHI < 1 (stationary): SHOCKS TO THE SYSTEM WILL GRADUALLY DIE AWAY. The HALF-LIFE depends on PHI: if PHI is 0.95 you are forgetting very slowly; if PHI is 0.1 you are forgetting very fast (the closer to zero, the faster). Financial variables forget shocks very easily: within two weeks a shock can be absorbed by the system. Quite the opposite for macroeconomic variables.
CASE 2, PHI = 1: the shock persists forever, the RANDOM WALK.
CASE 3, PHI > 1: the explosive case.

HOW DO WE MAKE A NON-STATIONARY VARIABLE STATIONARY? If it is a RANDOM WALK (non-stationary), BY TAKING FIRST DIFFERENCES (we imply that the variable is in logs). IN GRETL: GDP_PC IS NON-STATIONARY, FOR EXAMPLE; I TAKE THE LOGARITHMIC FIRST DIFFERENCE and obtain the growth rate of GDP_pc, which is STATIONARY. A white noise process is random, has no structure, and is stationary. HOW TO TURN SOMETHING NON-STATIONARY INTO STATIONARY: TAKE FIRST DIFFERENCES OF THE LOGS.

UNIT ROOTS. An I(0) SERIES IS A STATIONARY SERIES and can be used in a regression. An I(1) SERIES CONTAINS ONE UNIT ROOT: it is non-stationary and I cannot use it in a regression; to make it stationary I take first differences of the logs. GDP is an I(1) variable, ΔGDP is I(0); CO2 is I(1), ΔCO2 is I(0). For I(2) series, to make them stationary you need to take differences twice.

HOW DO WE TEST FOR A UNIT ROOT? (slide 19) (1:16) We have an AR model (the theoretical Dickey-Fuller regression).
H0: THE SERIES CONTAINS A UNIT ROOT, PHI = 1, THE VARIABLE IS NOT STATIONARY.
H1: THE SERIES IS STATIONARY, PHI < 1.
Looking at the AR model, it can itself be SPURIOUS, and if this is a spurious regression I cannot do hypothesis testing. We solve this issue by estimating the EMPIRICAL Dickey-Fuller regression (1:19) and using the DF TEST STATISTIC. Sometimes it suffers from autocorrelation; you solve that by adding lags (1:21).

IN GRETL. UNIT ROOT TEST, CHECK FOR STATIONARITY: GDP value (in levels), Variable, Unit root tests, Augmented Dickey-Fuller (ADF). HOW MANY LAGS? UP TO 14, AND YOU CAN CHOOSE EITHER THE AKAIKE CRITERION OR SCHWARZ. HOW MANY LAGS MINIMISE THE AKAIKE INFORMATION CRITERION, i.e. WHICH IS THE LOWEST AIC? Check it (1:28).

UNIT ROOT TEST FOR THE GROWTH RATE OF GDP PER CAPITA: I EXPECT THIS VARIABLE TO BE STATIONARY. I DO THE SAME AS BEFORE (14 LAGS, AKAIKE CRITERION); THE LAG THAT MINIMISES THE AIC IS THE 6TH ONE, SO WE HAVE A MODEL WITH 6 LAGS. The P-VALUE OF THE DF TEST is about zero, so the null hypothesis (non-stationarity) is rejected: THE VARIABLE IS STATIONARY.
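The unit root tests in script form; a sketch under the usual naming assumptions:

# ADF test on the level of GDP per capita: H0 = unit root (non-stationary),
# up to 14 lags, letting gretl pick the lag length that minimises the AIC
adf 14 GDP_pc --c --test-down=AIC

# the growth rate should come out stationary (null rejected)
adf 14 ld_GDP_pc --c --test-down=AIC

# KPSS as a complement: note that its null and alternative are reversed
kpss 4 GDP_pc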
WE HAVE TO CHECK FOR STATIONARITY OF BOTH THE CO2 AND GDP VARIABLES (in levels): if both of them are non-stationary, then the first OLS regression we ran is a spurious regression. IF I FIND THAT THE MODEL IS SPURIOUS, I HAVE TO SAY THAT THE MOST APPROPRIATE MODEL IS THE ONE THAT USES ONLY STATIONARY VARIABLES, I.E. THE FIRST-DIFFERENCE MODEL! I can also check that the first differences of CO2 and GDP are stationary (maybe include this too, just to be safe). For example, in the professor's example CO2 emissions are not stationary but GDP is. SO: TEST EVERYTHING FOR STATIONARITY. Remember we are adding lags to correct for autocorrelation (slide 23).

FIGURES: time series plot, scatter plot and, from Add, the color_xy plot (2:11). Plot everything together? For the residual plot you should go to Analysis.

TESTING FOR HIGHER ORDERS OF INTEGRATION: when a variable could be I(2), you have to take first differences again and redo the DF test.

THE PHILLIPS-PERRON TEST (a unit root test): its correction for autocorrelation is different FROM the DF test; PP uses a non-parametric correction for autocorrelation. First-difference the GDP variable, then Variable, Unit root tests, Phillips-Perron test; IF IT IS NOT AVAILABLE, GO TO FILE, FUNCTION PACKAGES, ON SERVER. The ADF (Dickey-Fuller) test, by contrast, ADDS LAGS TO CORRECT FOR AUTOCORRELATION (2:15).

06/05/2022. THEO AND EDO'S PRESENTATION
- Do the color_xy plot for everything.
- In the project, PUT AUTOCORRELATION AS THE FIRST DIAGNOSTIC TEST, BECAUSE IT IS THE MOST IMPORTANT.
- ARE THE BREAKS FROM THE KPSS TEST SIMILAR TO THE OUTLIERS OR NOT? In their case, completely different (around 00:15).
- Cointegration concerns the long-run findings.
- FILE, FUNCTION PACKAGES, ON SERVER: ketvals. The environmental cost of being rich is decreasing over time.
- The unit root test for stationarity should in principle be done at the beginning, to understand whether the correlation between the 2 variables is meaningful; that would be more correct, but the professor wants to do it this way.

BEA AND MARCUS (00:30): Hamilton filter for the graphs; LINEAR MODEL + OUTPUT + CONFIDENCE INTERVAL; NON-LINEAR MODEL + OUTPUT + CONFIDENCE INTERVALS; ketvals.
CHIARA AND LISA: do the actual-fitted plot of GDP per capita against time.
ISA AND SABRI: Hodrick-Prescott filter; time series plot; XY scatterplot; color_xy (YOU HAVE TO PUT CO2 ON THE VERTICAL AXIS); do the Chow test just for the structural break and the special years; a comparison like this is nice.
NADARAYA-WATSON: log of annual CO2, with const and the log of GDP as regressors (the LOGARITHMIC MODEL): Model, Robust estimation, NADARAYA-WATSON, a NON-PARAMETRIC MODEL; it produces a fit that you can inspect by clicking on the graph. File, Function packages, On server: ketvals, install; then File, Function packages, On local machine, right-click, execute. Then click on the graph in the output and choose "get time-varying coefficients": you get two time-varying-coefficient graphs.