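Formulas: Handout 1

Expectation, variance and covariance
· $E(a + bX + cY) = a + bE(X) + cE(Y) = a + b\mu_X + c\mu_Y$
· $Var(X) = E[(X - \mu_X)^2] = E(X^2) - [E(X)]^2 = \sigma_X^2$
· $E(Y^2) = \sigma_Y^2 + \mu_Y^2$
· $Var(a + bY) = b^2 Var(Y)$
· $Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X, Y)$
· $Cov(a + bX + cV, Y) = b\,Cov(X, Y) + c\,Cov(V, Y)$
· $E(XY) = Cov(X, Y) + E(X)E(Y) = \sigma_{XY} + \mu_X \mu_Y$
· population mean ($E(Y)$ or $\mu_Y$): $\sum_i y_i p_i$

Distribution of the sample mean
· Since the observed data are random variables, the sample mean $\bar{X}$ is also a random variable, so it has a distribution of its own.
· If the observations are i.i.d., then $E(\bar{X}) = \mu_X$ and $Var(\bar{X}) = \sigma_X^2 / n$.
· standard error: $SE(\bar{X}) = s_X / \sqrt{n}$, the point estimate of the standard deviation of the sample mean.
· exact distribution: if the $X_i$ are normal, $\bar{X} \sim N(\mu_X, \sigma_X^2 / n)$.
· approximation of distribution (asymptotic approximation): relies on the value of $n$ being large.
· law of large numbers (LLN): as $n \to \infty$, the sample mean converges in probability to the population mean, $\bar{X} \xrightarrow{p} \mu_X$.
· central limit theorem (CLT): as $n \to \infty$, the distribution of the standardized sample mean $(\bar{X} - \mu_X) / \sqrt{\sigma_X^2 / n}$ is approximately normal, $N(0, 1)$.

Sample statistics
· sample mean / sample average ($\bar{X}$): $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
· sample variance ($s_X^2$): $s_X^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
· sample standard deviation ($s_X$): square root of the sample variance.
· efficient: if $Var(\bar{X}) < Var(\tilde{X})$, then $\bar{X}$ is more efficient than $\tilde{X}$.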
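The Handout 1 definitions can be checked numerically. The sketch below is only an illustration: the data values and the helper names `mean`, `sample_cov`, `sample_var` and `standard_error` are made up for this example. It computes the sample moments with the $n - 1$ divisor and confirms that the rule $Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X, Y)$ also holds for the sample versions.

```python
import math


def mean(xs):
    """Sample mean: (1/n) * sum of the observations."""
    return sum(xs) / len(xs)


def sample_cov(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def sample_var(xs):
    """Sample variance: the sample covariance of a variable with itself."""
    return sample_cov(xs, xs)


def standard_error(xs):
    """Standard error of the sample mean: s_X / sqrt(n)."""
    return math.sqrt(sample_var(xs) / len(xs))


# Hypothetical data, chosen only so the arithmetic is easy to follow.
x = [2.0, 4.0, 4.0, 6.0, 9.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
a, b = 2.0, -3.0

# Left side: variance of the linear combination aX + bY.
z = [a * xi + b * yi for xi, yi in zip(x, y)]
lhs = sample_var(z)

# Right side: a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
rhs = a**2 * sample_var(x) + b**2 * sample_var(y) + 2 * a * b * sample_cov(x, y)

print("sample mean of x:", mean(x))
print("sample variance of x:", sample_var(x))
print("standard error of the mean:", standard_error(x))
print("Var(aX + bY) two ways:", lhs, rhs)
```

Defining the variance through the covariance helper keeps the two on the same divisor, which is what makes the identity hold exactly rather than only approximately.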
· sample covariance ($s_{XY}$): $s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$, the estimator of the population covariance $\sigma_{XY}$.
· sample correlation ($r_{XY}$): $r_{XY} = \frac{s_{XY}}{s_X s_Y}$

Properties of estimators
· unbiased: equal to the true value on average.
· consistent: converges to the true value as $n \to \infty$; the distribution becomes more clustered around the true value.
· asymptotically normal: if $W$ is the estimator, $\frac{W - E(W)}{\sqrt{Var(W)}} \xrightarrow{d} N(0, 1)$.
· We care more about consistency than bias.

Handout 2

Univariate regression
· population regression line: $E(Y \mid X) = \beta_0 + \beta_1 X$ when $E(u_i \mid X_i) = 0$.
· the actual observation $Y_i = \beta_0 + \beta_1 X_i + u_i$ is scattered around the population regression line.
· sample regression line: $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i$, where $\hat\beta_0$ and $\hat\beta_1$ are estimated using OLS (ordinary least squares), which minimizes the squared prediction error: $\min_{b_0, b_1} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$.

OLS assumptions
· $E(u_i \mid X_i) = 0$ (correct specification)
· $(X_i, Y_i)$ are i.i.d. (simple random sampling)
· no extreme outliers: $0 < E(X_i^4) < \infty$, $0 < E(u_i^4) < \infty$ (so $Var(X_i) < \infty$)
Under these assumptions the OLS estimators are unbiased, consistent and asymptotically normal.

OLS estimators
· $\hat\beta_1 = \frac{s_{XY}}{s_X^2}$
· $\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X}$

Goodness of fit
· total sum of squares (TSS): $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$
· estimated (explained) sum of squares (ESS): $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$, the explained variation.
· residual sum of squares (SSR): $\sum_{i=1}^{n} \hat{u}_i^2$, the unexplained variation.
· $R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS} = r_{XY}^2$

Handout 3

Asymptotic normality
· $\sqrt{n}(\hat\beta_1 - \beta_1) \xrightarrow{d} N\left(0, \frac{\sigma_v^2}{\sigma_X^4}\right)$, where $v = (X - \mu_X)u$ and $\sigma_v^2 = Var(v)$.
· $\frac{\hat\beta_1 - \beta_1}{SE(\hat\beta_1)} \xrightarrow{d} N(0, 1)$
· NOTE: the standard error approaches 0 as $n \to \infty$, and more variation in $X$ reduces the variation in $\hat\beta_1$.

Homoskedasticity and heteroskedasticity
· homoskedasticity: $Var(u_i \mid X_i) = \sigma_u^2$ (constant). If the errors are homoskedastic, OLS is BLUE.
· heteroskedasticity: $Var(u_i \mid X_i) = f(X_i)$.
· homoskedasticity-only formula, asymptotic variance of $\hat\beta_1$: $\sigma_{\hat\beta_1}^2 = \frac{\sigma_u^2}{n \sigma_X^2}$
· estimated standard error: $SE(\hat\beta_1) = \sqrt{\frac{s_{\hat{u}}^2}{n s_X^2}}$

Example: $H_0: \beta_1 = b$, $H_1: \beta_1 \neq b$, tested with t-values, p-values and the t-test.
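The OLS estimator, the goodness-of-fit measures and the homoskedasticity-only standard error can be sketched together, followed by a two-sided test of $H_0: \beta_1 = b$. This is a minimal illustration under stated assumptions: the function names `ols_fit` and `two_sided_test` and the data are hypothetical, the standard error uses $s_{\hat{u}}^2 = SSR / (n - 2)$ divided by $\sum (X_i - \bar{X})^2$ (which is close to $n s_X^2$ when $n$ is large), and the test uses the large-sample normal critical value even though the toy sample is tiny.

```python
import math
from statistics import NormalDist


def ols_fit(x, y):
    """Univariate OLS: slope, intercept, R^2 and homoskedasticity-only SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)  # (n - 1) * s_X^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    beta1 = sxy / sxx                 # same as s_XY / s_X^2
    beta0 = y_bar - beta1 * x_bar

    residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
    ssr = sum(u ** 2 for u in residuals)      # unexplained variation
    tss = sum((yi - y_bar) ** 2 for yi in y)  # total variation
    r2 = 1 - ssr / tss

    s_u2 = ssr / (n - 2)              # residual variance (assumed n - 2 divisor)
    se_beta1 = math.sqrt(s_u2 / sxx)  # homoskedasticity-only standard error
    return {"beta0": beta0, "beta1": beta1, "r2": r2, "se_beta1": se_beta1}


def two_sided_test(beta1_hat, se, b=0.0, alpha=0.05):
    """Two-sided test of H0: beta1 = b using the standard normal distribution."""
    std_normal = NormalDist()
    t = (beta1_hat - b) / se
    p_value = 2 * (1 - std_normal.cdf(abs(t)))   # area in both tails
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    ci = (beta1_hat - z_crit * se, beta1_hat + z_crit * se)
    return {"t": t, "p_value": p_value, "ci": ci, "reject": p_value < alpha}


# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

fit = ols_fit(x, y)
test = two_sided_test(fit["beta1"], fit["se_beta1"], b=0.0)
print(fit)
print(test)
```

With a sample this small the notes call for the t distribution rather than the normal one, so the printed p-value should be read as a demonstration of the mechanics only.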
Hypothesis testing guide
· t-statistic: $t = \frac{\hat\beta_1 - b}{SE(\hat\beta_1)}$, that is, (estimator minus tested value) divided by the standard error.
· When to use which? Use the t-test when $n < 30$ and the z-value when $n > 30$. Both follow the same general idea.
· After finding your t-value, compare it with the critical value $z_{\alpha/2}$, where $\alpha$ is the significance level. It is $\alpha/2$ because it is a two-tailed test. If your t-value lies in the shaded (rejection) area, $|t| > z_{\alpha/2}$, reject the null hypothesis.
· p-value: after getting the t-value, find the probability of the shaded area (using the z-table) and multiply it by 2 because it is a two-tailed test. Compare with $\alpha$: if $p < \alpha$, reject the null.

Confidence & acceptance intervals
· confidence interval (at a 95% confidence level): $\hat\beta_1 \pm 1.96 \cdot SE(\hat\beta_1)$. Reject the null hypothesis if $b$ is outside the confidence interval.
· acceptance region: $b \pm 1.96 \cdot SE(\hat\beta_1)$. Reject the null if $\hat\beta_1$ is outside the acceptance region.

Handout 4

Omitted variable bias (OVB)
· conditions for omitted variable bias:
  (1) the variables (included and omitted) must be correlated ($\sigma_{X_1 X_2} \neq 0$);
  (2) the omitted variable should have an effect on the dependent variable ($\beta_2 \neq 0$).
· When there is an OVB, the omitted regressors are absorbed by the error term, so the assumption $E(u_i \mid X_i) = 0$ fails: $E(u_i \mid X_i) \neq 0$.
· The OLS estimator $\hat\beta_1$ is biased and inconsistent, no matter the size of $n$: $\hat\beta_1 \xrightarrow{p} \beta_1 + \beta_2 \frac{\sigma_{X_1 X_2}}{\sigma_{X_1}^2}$.
· It is inconsistent because the bias term $\beta_2 \frac{\sigma_{X_1 X_2}}{\sigma_{X_1}^2}$ depends on the value of $\sigma_{X_1 X_2}$. The bias is high if either $\beta_2$ or $\sigma_{X_1 X_2}$ is high.
· If $\beta_2 \sigma_{X_1 X_2}$ is negative, then $\hat\beta_1$ underestimates $\beta_1$, and vice versa.
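The omitted variable bias term can be made concrete with a small deterministic check. The sketch below assumes hypothetical coefficients and data and deliberately leaves out the error term, so the sample analogue of $\beta_1 + \beta_2 \sigma_{X_1 X_2} / \sigma_{X_1}^2$ holds exactly; with real data and noise it would only hold approximately in large samples.

```python
def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


# Hypothetical regressors: x2 is positively correlated with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

# Hypothetical "true" model with no error term: Y = b0 + b1*X1 + b2*X2.
beta0, beta1, beta2 = 1.0, 2.0, 3.0
y = [beta0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]

# Short regression of Y on X1 only, with X2 omitted.
short_slope = cov(x1, y) / cov(x1, x1)

# Slope predicted by the OVB formula: beta1 + beta2 * s_X1X2 / s_X1^2.
predicted_slope = beta1 + beta2 * cov(x1, x2) / cov(x1, x1)

print("slope with X2 omitted:", short_slope)
print("beta1 + bias term:   ", predicted_slope)
print("true beta1:          ", beta1)
```

Because both $\beta_2$ and the covariance between the regressors are positive here, the short regression overstates the effect of $X_1$; flipping the sign of either one would make it understate the effect instead.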
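Multivariate regression
· population regression line: $E(Y_i \mid X_{1i}, X_{2i}, \ldots, X_{ki}) = \beta_0 + \beta_1 X_{1i} + \ldots + \beta_k X_{ki}$
· population regression model: $Y_i = \beta_0 + \beta_1 X_{1i} + \ldots + \beta_k X_{ki} + u_i$
· sample regression line: $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_{1i} + \ldots + \hat\beta_k X_{ki}$
· $E(u_i \mid X_{1i}, \ldots, X_{ki}) = 0$
· homoskedasticity: $Var(u_i \mid X_{1i}, \ldots, X_{ki}) = \sigma_u^2$
· additional OLS assumption for multivariate regression: no perfect multicollinearity, that is, the regressors are not linear combinations of each other.

Rescaling and shifting
· Rescaling: if you scale $X$ by a constant $k$, then the slope is divided by $k$: $Y = \beta_0 + \frac{\beta_1}{k}(kX)$. If you scale $Y$ by $m$, the coefficients are multiplied by $m$.
· Shifting: if you shift $X$ by a constant $c$, the slope stays the same and the y-intercept shifts: $Y = (\beta_0 - \beta_1 c) + \beta_1 (X + c)$. If you shift $Y$, the y-intercept shifts.

Goodness of fit pt. 2
· adjusted $R^2$: $\bar{R}^2 = 1 - \frac{n - 1}{n - k - 1} \cdot \frac{SSR}{TSS} = 1 - \frac{s_{\hat{u}}^2}{s_Y^2}$, where $\frac{n - 1}{n - k - 1}$ is the adjustment factor and $k$ is the number of regressors.

Multicollinearity
· perfect multicollinearity: one regressor is a linear combination of the others, which violates the additional OLS assumption.
· imperfect multicollinearity: the regressors are highly collinear but not perfectly collinear. Signs: a high $R^2$ value with few significant t-ratios, or a high sample correlation between regressors ($r_{X_j X_k} > 0.8$).
· The higher the degree of imperfect multicollinearity, the more difficult it is to separate the individual effects and estimate the impact of each $\beta_j$.
· How to solve: (i) impose restrictions (maintaining equality); (ii) or, if you only wish to find out the overall effect but not the individual effects, do nothing!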
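The adjusted $R^2$ formula is easy to evaluate directly. A minimal sketch follows, assuming a hypothetical helper name `adjusted_r2` and made-up values for SSR, TSS, $n$ and $k$; it shows how the adjustment factor $(n - 1)/(n - k - 1)$ pulls the adjusted value below the plain $R^2$.

```python
def adjusted_r2(ssr, tss, n, k):
    """Adjusted R^2 = 1 - (n - 1) / (n - k - 1) * SSR / TSS.

    n is the number of observations, k the number of regressors.
    """
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1 - (n - 1) / (n - k - 1) * ssr / tss


# Hypothetical values for illustration only.
ssr, tss, n, k = 20.0, 100.0, 11, 2

r2 = 1 - ssr / tss
r2_adj = adjusted_r2(ssr, tss, n, k)
print("R^2:", r2)
print("adjusted R^2:", r2_adj)
```

Since the adjustment factor is greater than 1 whenever $k \geq 1$, the adjusted value is always below $R^2$ when SSR is positive, and the gap widens as more regressors are added for a fixed sample size.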