Comprehensive Exam: Public Finance
Wednesday, June 26, 2013

Part I: 100 points total

1) [30 points total, 10 points each part] Suppose you are interested in examining children’s educational outcomes, y, as a function of the number of siblings (n) and of parental characteristics (x).

a) What is an argument for using twin births or gender composition as an IV for n? What is an argument against?
b) What is an argument for introducing birth order into the specification? What is a potential problem with controlling for both birth order and family size?
c) Suppose you find little effect of family size on children’s educational outcomes when you control for birth order or instrument with twin births, but find a big effect of birth order. If you take your results at face value, what would you conclude about existing fertility models? Be explicit about what these existing models are.

2) [42 points total, 14 points each part]

a) Most empirical studies find a negative correlation between health status and the demand for medical services, and also between education and the demand for medical services. How does this contradict the Grossman model?
b) What is the Barker hypothesis, and what challenges does it pose to the Grossman model?
c) What implications does the Heckman research agenda have for the Grossman model?

3) [28 points total, 7 points each part] Teacher ability has fallen since the 1960s. Two potential explanations are the rise in alternative opportunities for women and the unionization of teachers. You can assume that ability is the same whether the person is in the teaching or non-teaching sector.

a) Very briefly explain why opportunities for women have improved since the 1960s.
b) If unionization raises teacher pay across the board (for teachers of all abilities), what does a Roy model imply about who sorts into teaching?
c) If unionization compresses the wage distribution (pay for ability), what does a Roy model imply about who sorts into teaching?
Would your answer change if the best teachers are not necessarily the best non-teachers?
d) If alternative opportunities improve for women of all abilities, what does a Roy model imply about who sorts into teaching?

Part II: 100 points total

1. Twins and the Return to Computer Use [66 points]

An analyst estimates the following model for the log of wages (w) for workers using data from a recent Current Population Survey:

$$\log w = \alpha + \beta S + \gamma\,\mathrm{Exp} + \delta\,\mathrm{Exp}^2 + \theta\,\mathrm{Male} + \lambda\,\mathrm{Use\_Computer} + \varepsilon$$

where (α, β, γ, δ, θ, λ) are coefficients, S is years of schooling, Exp represents potential labor market experience, Male is a dummy variable equal to 1 for male workers and 0 for females, and Use_Computer is a dummy variable equal to 1 for workers who state that they use a computer on the job. The analyst obtains estimates of λ = 0.15 and β = 0.08 (both are very precisely estimated). Fitting a similar model that omits the Use_Computer variable, the analyst obtains an estimate of β = 0.10.

(a) Use the omitted variables formula to explain the connection between the estimates of β when Use_Computer is included and excluded. What must be true about the correlation between computer use and schooling?
(b) Discuss a set of assumptions under which the analyst’s estimates represent causal estimates of the effect of schooling and of computer use.
(c) Some analysts have questioned the causal interpretation of the λ coefficient. Discuss alternative interpretations. What evidence would you suggest using to support or refute these interpretations?
(d) Another approach to addressing omitted variable bias has been the use of data on twins. There are two approaches to using twin data: differencing and correlated random effects. Briefly explain them both.
(e) Show how you would use twin data to estimate the returns to computer use. Are you worried about measurement error in this context? How does first differencing affect the influence of classical measurement error?
(f) What does a twin estimator implicitly assume about the choice of computer use within the family? Given this assumption, when is a within-family estimator indeed better than a between-family estimator? How could you try to assess this assumption using data on observable characteristics (you can follow the suggestion by Ashenfelter and Rouse)?
(g) One way to use twin information is to include the average propensity of computer use among both twins as a separate regressor in the individual model for earnings. How is this similar to the control function approach?
(h) In class, we have shown how the twin estimator is a special case of a matching estimator. Write down the general matching estimator of the effect of computer use on wages. Define the propensity score. What is its role in matching estimators and what justifies its use?

2. IV and the Return to Computer Use [34 points]

(a) The computer-use regression in Question 1 is akin to a schooling regression. How have labor economists typically dealt with omitted variables in the context of schooling? Describe how you would use the regression discontinuity (RD) approach to estimate the return to schooling, and give a real-life example of when an RD approach could be used.
(b) Explain the concept of the Local Average Treatment Effect (LATE). Under what circumstances does an instrumental variable regression estimate LATE? How does LATE help rationalize the results typically obtained in instrumental variable studies of the returns to schooling?
(c) Suppose somebody suggests using the presence of a computer lab in the local high school as an instrument for computer use among recent high school graduates in the area. Briefly show how you would implement this procedure. Discuss threats to the validity of this exercise, i.e., argue whether this is a good instrument or not, considering all three properties of instruments.
(d) How would you calculate the standard errors in a typical regression of log wage on schooling to account for the presence of group-level error components?

Part III: 100 points total (25 pts each question)

1. Answer the following questions with a short paragraph. For full credit, be sure to cite relevant papers where appropriate.

(a) Describe the causes and consequences of asymmetric information in insurance markets.
(b) How have researchers tested for asymmetric information in insurance markets? What is the theoretical basis for these tests?
(c) Summarize the state of the evidence that researchers have found using the above tests.
(d) What theories have been proposed to explain your answer to (c)? What evidence is there to support these theories?

2. Consider the Rothschild-Stiglitz model of competitive insurance markets. An individual faces an accident with probability $p$. If the accident does not occur, the individual’s wealth is $W$; if the accident does occur, the individual’s wealth is $W - d$. Insurance companies are risk neutral, and insurance contracts take the form $(W_1, W_2)$, where $W_1$ is the promised wealth in the no-accident state and $W_2$ is the promised wealth in the accident state. The individual is risk averse and seeks to maximize expected utility over accident states, given by:

$$(1 - p)\,U(W_1) + p\,U(W_2)$$

(a) Suppose that there is no asymmetric information. Characterize the optimal contract.
(b) Now suppose there are two types of individuals, H and L, with accident probabilities $p_H$ and $p_L$ respectively. Insurance companies cannot distinguish between these two types, but know that the probability of a high type is λ. Assume $p_H > p_L$. Illustrate using a graph why a pooling equilibrium cannot exist.
(c) On another graph, characterize the separating equilibrium, assuming one exists.
(d) Use another graph to illustrate why a separating equilibrium may not exist.
(e) What are the welfare consequences of asymmetric information in the separating equilibrium?
Based on your answer, are you surprised that so much of the discussion on health insurance policy is about the high-risk individuals? Why do you think this is?
(f) Employer-sponsored health insurance (ESHI) is a policy designed to encourage pooling between risk types. Fang and Gavazza (2011) argue that ESHI can give rise to dynamic inefficiencies in health investment. Explain their argument in a few sentences.

3. You wish to test for the presence of asymmetric information in car insurance markets. You have a cross-section of auto insurance policies from a large insurer. The data include the policy’s deductible and premium, as well as the driver’s age, profession, driving record, region, and stated use of vehicle, and the insured vehicle’s make, model and year. The data also show whether a claim was ever filed on the policy. You decide to run the following regression with your data:

$$\mathrm{Deductible}_i = X_i \beta + \alpha\,\mathrm{Claim}_i + \varepsilon_i \qquad (1)$$

where $\mathrm{Deductible}_i$ is the chosen deductible, $\mathrm{Claim}_i$ is an indicator for whether a claim was ever filed, and $X_i$ are controls.

(a) If asymmetric information were present, what would you expect to find? If you don’t find that evidence, does this mean there is no asymmetric information?
(b) What variables might you want to put in $X_i$? Would it make sense to leave anything out? Would you want to condition your sample in any way?
(c) What was Chiappori and Salanié’s (2000) main objection to specification (1) as a test for asymmetric information?
(d) Give an alternative specification that corrects the problem in (c).

4. In 1993, New York implemented community rating and guaranteed issue reforms for health insurance. A nearby state, Pennsylvania, did not pass these reforms. As a reminder, community rating is a restriction on the observable factors that health insurance providers may price on, and guaranteed issue requires insurance providers to issue policies to anyone willing to pay the offered premium.
(a) It has often been argued that community rating can lead to an “adverse selection death spiral”. Explain what you think this statement means in the context of a competitive insurance market. Based on the Rothschild-Stiglitz model, if the goal is to get good risks to subsidize bad risks, do you think community rating and guaranteed issue, by themselves, can accomplish this task?

Suppose you have a repeated cross section surveying the populations of New York and Pennsylvania in each year. The data include health insurance status (whether covered or not, type of coverage) and demographic characteristics of the individual.

(b) Describe a method to estimate the causal effect of the reform on the share of the population with health insurance.
(c) What assumptions are required for your method in part (b) to be valid? How would you use the data to establish the validity of these assumptions?
(d) Suppose you do not find any effect of community rating on health insurance coverage. Do you think the reform may have affected some other margin? Use theory to explain.
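
Illustrative sketch for Question 4(b). One candidate method, not necessarily the intended answer, is a difference-in-differences comparison of New York and Pennsylvania before and after 1993. The Python sketch below shows the 2x2 version of that calculation on simulated repeated cross-section data. The function name `did_estimate`, the sample size, the baseline coverage rates, and the "true" effect of -0.03 are all hypothetical values chosen for illustration; none of them come from the exam or from real data.

```python
import numpy as np


def did_estimate(treated, post, outcome):
    """2x2 difference-in-differences of cell means.

    Returns (treated post - treated pre) - (control post - control pre).
    """
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    def cell_mean(is_treated, is_post):
        # Mean outcome in one of the four state-by-period cells.
        mask = (treated == is_treated) & (post == is_post)
        return outcome[mask].mean()

    change_treated = cell_mean(True, True) - cell_mean(True, False)
    change_control = cell_mean(False, True) - cell_mean(False, False)
    return change_treated - change_control


# Simulated repeated cross section (all numbers are hypothetical).
rng = np.random.default_rng(0)
n = 40_000
new_york = rng.integers(0, 2, n).astype(bool)  # True = NY, False = PA
year = rng.integers(1990, 1997, n)             # survey years 1990-1996
post = year >= 1993                            # reform in force from 1993
true_effect = -0.03                            # assumed effect on coverage

# Coverage probability: state gap + common time shift + reform effect.
p_insured = 0.85 - 0.02 * new_york - 0.01 * post + true_effect * (new_york & post)
insured = rng.random(n) < p_insured

estimate = did_estimate(new_york, post, insured)
print(f"DiD estimate of the reform effect on coverage: {estimate:.4f}")
```

The estimate is interpretable as a causal effect only if coverage in the two states would have moved in parallel absent the reform. With several pre-reform years, one informal check is to apply the same function to pre-1993 data only, using a placebo reform year, and confirm that the estimate is close to zero.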