Lecture 21

Econ 140
Binary Response
Lecture 21

Today's plan
• Three models:
  – Linear probability model
  – Probit model
  – Logit model
• L21.xls provides an example of a linear probability model and a logit model

Discrete choice variable
• Defining variables:
  – $Y_i = 1$ if the individual: takes BART / buys a car / joins a union
  – $Y_i = 0$ if the individual: does not take BART / does not buy a car / does not join a union
• The discrete choice variable $Y_i$ is a function of individual characteristics:
  $Y_i = a + bX_i + e_i$

Graphical representation
• X = years of labor market experience
• Y = 1 if the person joins a union, Y = 0 if the person does not
[Figure: observed data (Y equal to 0 or 1) plotted against X, with the OLS regression line $\hat{Y}$]

Linear probability model
• The OLS regression line in the previous slide is called the linear probability model
  – it predicts the probability that an individual will join a union given their years of labor market experience
• Using the linear probability model, we estimate the equation:
  $\hat{Y} = \hat{a} + \hat{b}X$
  – using $\hat{a}$ and $\hat{b}$ we can predict the probability

Linear probability model (2)
• Problems with the linear probability model:
  1) Predicted probabilities don't necessarily lie within the 0 to 1 range
  2) We get a very specific form of heteroskedasticity
     • errors for this model are $e_i = Y_i - \hat{Y}_i$
     • note: $\hat{Y}_i$ values lie along the continuous OLS line, but $Y_i$ values jump between 0 and 1; this creates large variation in the errors, and the error variance changes with $X_i$
  3) Errors are non-normal
• We can use the linear probability model as a first guess
  – its estimates can be used as start values in a maximum likelihood problem

McFadden's Contribution
• Suggestion: a curve that runs strictly between 0 and 1 and tails off at the boundaries
[Figure: S-shaped curve rising from Y = 0 to Y = 1]

McFadden's Contribution (2)
• Recall the probability density function (PDF) and cumulative distribution function (CDF) for a standard normal
[Figure: standard normal PDF, and its CDF rising from 0 to 1]

Probit model
• The probit model uses the CDF of the standard normal
• The density function for the standard normal is:
  $f(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} Z^2\right)$, where $Z = a + bX$
• For the probit model, we want to find
  $\Pr(Y_i = 1) = F(Z_i)$
  – $f(Z_i)$ is the PDF and $F(Z_i)$ is the CDF, with $F(z) = \Pr(Z \le z)$

Probit model (2)
• The probit model imposes the distributional form of the CDF in order to estimate a and b
• The values $\hat{a}$ and $\hat{b}$ have to be estimated as part of the maximum likelihood procedure

Logit model
• The logit model uses the logistic distribution
  – Density: $g(z) = \frac{e^z}{(1 + e^z)^2}$
  – Cumulative: $G(z) = \frac{1}{1 + e^{-z}} = \frac{e^z}{1 + e^z}$
  – Note that $g(z) = G(z)\,(1 - G(z))$
[Figure: standard normal CDF F(Z) and logistic CDF G(Z) plotted together]

Maximum likelihood
• An alternative estimation method that assumes you know the form of the population distribution
• Using maximum likelihood, we specify the model as part of the distribution

Maximum likelihood (2)
• For example: a Bernoulli distribution with parameter $\theta$, where
  $\Pr(Y = 1) = \theta$, $\Pr(Y = 0) = 1 - \theta$
• We have an outcome 1110000100
• The probability expression is:
  $\theta^3 (1-\theta)^4 \, \theta \, (1-\theta)^2 = \theta^4 (1-\theta)^6$, which is maximized at $\hat{\theta} = 0.4$
• We pick a sample $Y_1, \dots, Y_n$ with
  $\Pr(Y_i = 1) = \theta$, $\Pr(Y_i = 0) = 1 - \theta$

Maximum likelihood (3)
• The probability of getting the observed $Y_i$ is based on the form we've assumed:
  $\theta^{Y_i} (1-\theta)^{1-Y_i}$
• If we multiply across the observed sample:
  $\prod_{i=1}^{n} \theta^{Y_i} (1-\theta)^{1-Y_i}$
• Given that an outcome of one occurs r times:
  $\hat{\theta}^{\,r} (1-\hat{\theta})^{\,n-r}$

Maximum likelihood (3, continued)
• If we take logs, we get
  $L(\hat{\theta}) = r \log \hat{\theta} + (n - r) \log(1 - \hat{\theta})$
  – This is the log-likelihood
  – We can differentiate this and obtain a solution for $\hat{\theta}$: setting $\frac{r}{\hat{\theta}} - \frac{n-r}{1-\hat{\theta}} = 0$ gives $\hat{\theta} = r/n$

Maximum likelihood (4)
• In a more complex example, the logit model gives
  $\Pr(Y_i = 1) = G(Z_i)$, $\Pr(Y_i = 0) = 1 - G(Z_i)$, with $Z_i = a + bX_i$
• Instead of looking for an estimate of $\theta$ we are looking for estimates of a and b
• Think of $G(Z_i)$ as $\theta$:
  – we get a log-likelihood $L(a, b) = \sum_i \left[ Y_i \log(G_i) + (1 - Y_i) \log(1 - G_i) \right]$
  – solve for a and b

Example
• Data on union membership and years of labor market experience (L21.xls)
• To build the maximum likelihood form, we can think of:
  – intercept: a
  – coefficient on experience: b
• There are three columns:
  – Predicted value Z
  – Estimated probability (on the CDF)
  – Estimated likelihood as given by the model
• The Solver from the Tools menu calculates estimates of a and b

Example (2)
• How the Solver works:
  – Define a and b using start values
  – Choose start values of a and b equal to zero
  – Define our model: $Z = a + bX$
  – Define the predicted probabilities: $G(z) = \frac{1}{1 + e^{-z}}$
  – Define the log-likelihood and sum it
  – Use Solver to change the values of a and b until the summed log-likelihood is maximized

Comparing parameters
• How do we compare parameters across these models?
• The linear probability form is: $Y = a + bX$
  – where $\frac{\partial \Pr}{\partial X} = b$
• Recall the graphs associated with each model
  – For the nonlinear models the slope depends on where we are on the curve: $\frac{\partial \Pr}{\partial X} = g(\hat{Z}_i)\, b$
  – The same form holds for the probit and logit models, using the normal density f for probit and the logistic density g for logit

L21.xls example
• Predicting with the linear probability model:
  $\hat{U} = 0.281 + 0.005\,EXPER$
• Note the value of the estimated coefficient: b = 0.005
• For the logit form:
  – use the logistic distribution: $G(z) = \frac{e^z}{1 + e^z}$, with density $g(z) = G(z)\,(1 - G(z))$
  – the estimated logit equation is: $Z = U = -0.923 + 0.020\,EXPER$

L21.xls example (2)
• At 20 years of experience:
  – $Z = U = -0.923 + 0.020(20) = -0.523$
  – $e^Z = e^{-0.523} \approx 0.590$
  – $G(Z) \approx 0.590 / (1 + 0.590) = 0.371$ (the predicted probability of union membership)
• The slope at 20 years of experience is the density times the coefficient:
  $g(Z)\, b = G(Z)\,(1 - G(Z))\, b \approx 0.371 \times 0.629 \times 0.020 \approx 0.005$
• Note the similarity to the OLS value (0.005); for other examples the difference can be notable
• Most software (e.g. Stata) will report either the logit coefficient or the slope (marginal effect)
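The slope calculation at 20 years of experience can be checked with a short Python sketch. It uses only the rounded coefficients quoted from L21.xls (-0.923 and 0.020) and takes the logistic density to be G(z)(1 - G(z)); the function and variable names are illustrative and do not come from the spreadsheet. Because the coefficients are rounded, the printed values may differ from the spreadsheet's in the third decimal place.

```python
import math

# Rounded logit estimates quoted in the lecture (L21.xls).
A_HAT = -0.923   # intercept
B_HAT = 0.020    # coefficient on years of experience


def logistic_cdf(z):
    """G(z) = 1 / (1 + e^{-z}), the predicted probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """g(z) = G(z) * (1 - G(z)), the logistic density."""
    g_cdf = logistic_cdf(z)
    return g_cdf * (1.0 - g_cdf)


def logit_slope(exper, a=A_HAT, b=B_HAT):
    """Change in Pr(Y = 1) per extra year of experience: g(Z) * b."""
    z = a + b * exper
    return logistic_pdf(z) * b


z20 = A_HAT + B_HAT * 20
prob20 = logistic_cdf(z20)
slope20 = logit_slope(20)

print(f"Z at 20 years       = {z20:.3f}")
print(f"Pr(union) at 20 yrs = {prob20:.3f}")
print(f"slope at 20 years   = {slope20:.4f}")
```

Calling `logit_slope` at other experience levels shows how the logit slope varies along the curve, whereas the linear probability model's slope stays fixed at its coefficient.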

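Returning to the Bernoulli example from the maximum likelihood slides, the sketch below evaluates the log-likelihood for the observed outcome 1110000100 and compares the closed-form estimate r/n with a simple grid search. The grid search is only an illustrative stand-in for a numerical optimizer such as the spreadsheet Solver; all names are assumptions made for this example.

```python
import math

# Observed Bernoulli outcome from the lecture.
outcome = "1110000100"
y = [int(ch) for ch in outcome]
n = len(y)   # sample size
r = sum(y)   # number of ones


def log_likelihood(theta):
    """L(theta) = r*log(theta) + (n - r)*log(1 - theta)."""
    return r * math.log(theta) + (n - r) * math.log(1.0 - theta)


# Closed-form solution from setting the derivative to zero.
theta_closed = r / n

# Illustrative grid search over (0, 1), standing in for a numerical optimizer.
grid = [k / 1000 for k in range(1, 1000)]
theta_grid = max(grid, key=log_likelihood)

print(f"n = {n}, r = {r}")
print(f"closed-form estimate = {theta_closed}")
print(f"grid-search estimate = {theta_grid}")
```

Both approaches agree here because the Bernoulli log-likelihood has a single peak; the logit log-likelihood L(a, b) is maximized in the same spirit, but over two parameters and without a closed-form solution.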