Descriptive Statistics and Linear Regression

Topics in Microeconometrics
William Greene
Department of Economics
Stern School of Business

Model Building in Econometrics
• Parameterizing the model
    • Nonparametric analysis
    • Semiparametric analysis
    • Parametric analysis
• Sharpness of inferences follows from the strength of the assumptions

A Model Relating (Log)Wage to Gender and Experience

Application: Is there a relationship between investment and capital stock?

Nonparametric Regression
Kernel regression of y on x

Semiparametric Regression
Least absolute deviations regression of y on x

Parametric Regression
Least squares – maximum likelihood – regression of y on x

Cornwell and Rupert Panel Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are

EXP   = work experience
WKS   = weeks worked
OCC   = occupation, 1 if blue collar
IND   = 1 if manufacturing industry
SOUTH = 1 if resides in south
SMSA  = 1 if resides in a city (SMSA)
MS    = 1 if married
FEM   = 1 if female
UNION = 1 if wage set by union contract
ED    = years of education
BLK   = 1 if individual is black
LWAGE = log of wage = dependent variable in regressions

These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text.

A First Look at the Data: Descriptive Statistics
• Basic Measures of Location and Dispersion
• Graphical Devices
    • Histogram
    • Kernel Density Estimator

Histogram for LWAGE

The kernel density estimator is a histogram (of sorts).

$$\hat{f}(x_m^*) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{B}\,K\!\left[\frac{x_i - x_m^*}{B}\right], \quad \text{for a set of points } x_m^*$$

B  = "bandwidth" chosen by the analyst
K  = the kernel function, such as the normal or logistic pdf (or one of several others)
x* = the point at which the density is approximated.
This is essentially a histogram with small bins.

Kernel Estimator for LWAGE

Kernel Density Estimator
The curse of dimensionality

$$\hat{f}(x_m^*) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{B}\,K\!\left[\frac{x_i - x_m^*}{B}\right], \quad \text{for a set of points } x_m^*$$

B  = "bandwidth"
K  = the kernel function
x* = the point at which the density is approximated.

$\hat{f}(x^*)$ is an estimator of $f(x^*)$:

$$\frac{1}{n}\sum_{i=1}^{n} Q(x_i \mid x^*) = \bar{Q}(x^*).$$

But $\mathrm{Var}[\bar{Q}(x^*)] \neq \frac{1}{N} \times$ something. Rather, $\mathrm{Var}[\bar{Q}(x^*)] = \frac{1}{N^{3/5}} \times$ something.
I.e., $\hat{f}(x^*)$ does not converge to $f(x^*)$ at the same rate as a mean converges to a population mean.

Objective: Impact of Education on (log) wage
• Specification: What is the right model to use to analyze this association?
• Estimation
• Inference
• Analysis

Simple Linear Regression
LWAGE = 5.8388 + 0.0652*ED

Multiple Regression

Specification: Quadratic Effect of Experience

Partial Effects
Education:   .05544
Experience:  .04062 – 2*.00068*Exp
FEM:         –.37522

Model Implication: Effect of Experience and Male vs. Female

Hypothesis Test About Coefficients
• Hypothesis
    • Null: Restriction on β: Rβ – q = 0
    • Alternative: Not the null
• Approaches
    • Fitting Criterion: Does $R^2$ decrease under the null?
    • Wald: Is Rb – q close to 0 under the alternative?

Hypotheses
All Coefficients = 0?
R = [0 | I], q = [0]
ED Coefficient = 0?
R = [0,1,0,0,0,0,0,0,0,0,0,0], q = 0
No Experience effect?
R = [0,0,1,0,0,0,0,0,0,0,0,0
     0,0,0,1,0,0,0,0,0,0,0,0], q = [0, 0]'

Hypothesis Test Statistics
Subscript 0 = the model under the null hypothesis
Subscript 1 = the model under the alternative hypothesis

1. Based on the Fitting Criterion $R^2$:

$$F = \frac{(R_1^2 - R_0^2)/J}{(1 - R_1^2)/(N - K_1)} = F[J,\, N - K_1]$$

2. Based on the Wald Distance (note, for linear models, W = JF):

$$\text{Chi Squared} = (Rb - q)'\left[R\,s^2 (X_1'X_1)^{-1} R'\right]^{-1}(Rb - q)$$

Hypothesis: All Coefficients Equal Zero
All Coefficients = 0?
R = [0 | I], q = [0]
$R_1^2$ = .42645
$R_0^2$ = .00000
F = 280.7 with [11, 4153] degrees of freedom
Wald = $b_{2\text{-}12}'\,[V_{2\text{-}12}]^{-1}\,b_{2\text{-}12}$ = 3087.83355
Note that Wald = JF = 11(280.7)

Hypothesis: Education Effect = 0
ED Coefficient = 0?
R = [0,1,0,0,0,0,0,0,0,0,0,0], q = 0
$R_1^2$ = .42645
$R_0^2$ = .36355 (not shown)
F = 455.396
Wald = $(.05544 - 0)^2/(.0026)^2$ = 455.396
Note F = $t^2$ and Wald = F for a single hypothesis about 1 coefficient.

Hypothesis: Experience Effect = 0
No Experience effect?
R = [0,0,1,0,0,0,0,0,0,0,0,0
     0,0,0,1,0,0,0,0,0,0,0,0], q = [0, 0]'
$R_0^2$ = .34101, $R_1^2$ = .42645
F = 309.33
Wald = 618.601 (W* = 5.99, the 5% critical value of chi-squared with 2 degrees of freedom)

A Robust Covariance Matrix
The White Estimator

$$\text{Est.Var}[b] = (X'X)^{-1}\left[\sum_i e_i^2\, x_i x_i'\right](X'X)^{-1}$$

• What does robustness mean?
• Robust to:
    • Heteroscedasticity
• Not robust to:
    • Autocorrelation
    • Individual heterogeneity
    • The wrong model specification
• 'Robust inference'

Robust Covariance Matrix

Heteroscedasticity Robust Covariance Matrix
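As a hedged illustration of the test statistics and the White estimator above, the following Python sketch (NumPy only) first recomputes the fitting-criterion F statistics from the R-squared values reported in the slides, then uses simulated data to check that W = JF in a linear model and to form the heteroscedasticity-robust covariance matrix. The sample size N = 4165 is inferred from 595 individuals over 7 years, and K1 = 12 from the reported degrees of freedom [11, 4153]. The function names and the simulated data are assumptions made for illustration, not part of the original analysis.

```python
import numpy as np


def f_from_r2(r2_1, r2_0, J, N, K1):
    """F statistic based on the fitting criterion (loss of R^2 under the null)."""
    return ((r2_1 - r2_0) / J) / ((1.0 - r2_1) / (N - K1))


def ols(X, y):
    """Least squares coefficients, residuals, and (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    e = y - X @ b
    return b, e, XtX_inv


def r_squared(y, e):
    """R^2 for a regression that includes a constant term."""
    return 1.0 - (e @ e) / np.sum((y - y.mean()) ** 2)


def wald(b, V, R, q):
    """Wald distance (Rb - q)' [R V R']^{-1} (Rb - q)."""
    d = R @ b - q
    return float(d @ np.linalg.solve(R @ V @ R.T, d))


def white_cov(X, e, XtX_inv):
    """White estimator (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}."""
    meat = (X * (e ** 2)[:, None]).T @ X
    return XtX_inv @ meat @ XtX_inv


# Part 1: F statistics from the R^2 values reported in the slides.
N, K1 = 4165, 12  # inferred: 595 individuals x 7 years, 12 coefficients
F_all = f_from_r2(0.42645, 0.00000, 11, N, K1)  # slides report 280.7
F_ed = f_from_r2(0.42645, 0.36355, 1, N, K1)    # slides report 455.396
F_exp = f_from_r2(0.42645, 0.34101, 2, N, K1)   # slides report 309.33
print("F (all, ED, experience):", F_all, F_ed, F_exp)

# Part 2: simulated data (hypothetical, heteroscedastic by construction).
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 3))
X = np.column_stack([np.ones(n), Z])
y = 1.0 + 0.5 * Z[:, 0] + (1.0 + np.abs(Z[:, 1])) * rng.normal(size=n)

b, e, XtX_inv = ols(X, y)
s2 = (e @ e) / (n - X.shape[1])

# Null hypothesis: the last two slopes are zero (J = 2 restrictions).
R = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
q = np.zeros(2)

# Wald statistic with the conventional covariance matrix s^2 (X'X)^{-1}.
W = wald(b, s2 * XtX_inv, R, q)

# F statistic from the restricted and unrestricted R^2.
_, e0, _ = ols(X[:, :2], y)
F_sim = f_from_r2(r_squared(y, e), r_squared(y, e0), 2, n, X.shape[1])
print("W and J*F:", W, 2 * F_sim)

# The same Wald distance, using the White covariance matrix instead.
V_white = white_cov(X, e, XtX_inv)
W_robust = wald(b, V_white, R, q)
print("Robust Wald:", W_robust)
```

The F statistics in the first part differ slightly from the slide values only because the R-squared inputs are rounded to five digits. The identity W = JF holds for the conventional covariance matrix; once the White matrix is substituted, the robust Wald statistic generally no longer equals JF.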
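Returning to the kernel density estimator defined earlier, a minimal Python sketch of that formula with a normal kernel might look as follows. The simulated sample is a hypothetical stand-in for LWAGE, and the bandwidth comes from Silverman's rule of thumb, which is an assumption here since the slides say only that B is chosen by the analyst.

```python
import numpy as np


def normal_kernel(u):
    """Standard normal pdf, used as the kernel function K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def kernel_density(x, grid, bandwidth, kernel=normal_kernel):
    """f_hat(x*_m) = (1/n) sum_i (1/B) K[(x_i - x*_m) / B] at each grid point."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Rows index the evaluation points x*_m, columns index the observations x_i.
    u = (x[None, :] - grid[:, None]) / bandwidth
    return kernel(u).mean(axis=1) / bandwidth


# Hypothetical data standing in for LWAGE (not the Cornwell and Rupert sample).
rng = np.random.default_rng(0)
sample = rng.normal(loc=6.5, scale=0.5, size=2000)

# Assumed bandwidth: Silverman's rule of thumb, 1.06 * sd * n^(-1/5).
B = 1.06 * sample.std(ddof=1) * sample.size ** (-0.2)

grid = np.linspace(sample.min() - 4 * B, sample.max() + 4 * B, 400)
f_hat = kernel_density(sample, grid, B)

# A density estimate should be nonnegative and integrate to roughly one.
area = float(np.sum(f_hat) * (grid[1] - grid[0]))
print("bandwidth:", B, "approximate area under f_hat:", area)
```

A smaller B gives a rougher estimate that tracks individual observations, while a larger B smooths more heavily, which is the sense in which the estimator behaves like a histogram with small bins.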
