Suggested answers, Problem Set 1
ECON 30331
Dan Hungerman
Fall 2009

1. Part a: $\int_0^1 2x\,dx = \left.\frac{2x^2}{2}\right|_0^1 = \left.x^2\right|_0^1 = 1 - 0 = 1$

Part b: $E[x] = \int_0^1 x f(x)\,dx = \int_0^1 2x^2\,dx = \left.\frac{2x^3}{3}\right|_0^1 = \frac{2}{3} - 0 = \frac{2}{3}$

Part c: $\Pr(x < 0.25) = F(0.25) = \int_0^{0.25} 2x\,dx = \left.\frac{2x^2}{2}\right|_0^{0.25} = 0.25^2 = 0.0625$

2. The probability the car will fail after the warranty expires is Pr(x>2), which equals 1-Pr(x≤2), where F(2)=Pr(x≤2). From the handout, we know that the CDF for the exponential distribution is F(a) = 1-exp(-λa), so F(2) = 1-exp(-0.25(2)) = 0.394 and Pr(x>2) = 1-Pr(x≤2) = 1-0.394 = 0.606.

3. a. Prob(Z ≤ -1.65) = Φ(-1.65) = 0.0495
b. Prob(Z > 1.25) = 1 - Prob(Z ≤ 1.25) = 1 - Φ(1.25) = 1 - 0.8944 = 0.1056
c. Prob(-1.02 < Z ≤ 0.53) = Φ(0.53) - Φ(-1.02) = 0.7019 - 0.1539 = 0.5480

4. We want to calculate the threshold below which 90% of the GPAs lie; in terms of the CDF, we want the K for which Prob(GPA ≤ K) = 0.9. GPA is normal but not standard normal, so we need to standardize it by subtracting off the mean and dividing by the standard deviation: Prob[Z = (GPA-µ)/σ ≤ (K-µ)/σ = a] = 0.90. Looking at the standard normal table, Pr(Z ≤ a) = 0.90 when a = 1.28. Since a = (K-µ)/σ, K = aσ+µ = 1.28(0.5)+2.7 = 3.34. Kids with a 3.34 and above will be in the top 10% of their class.

5. Graphing out the relationship between X and Y, we see that these variables are linearly related, which means that knowing X you know exactly what the value of Y is, and vice versa. The values are also negatively related. Because X predicts Y perfectly and the variables are negatively related, the correlation coefficient between X and Y is -1.

6. Start with the definition of the correlation coefficient: $\rho_{fc} = \sigma_{fc}/(\sigma_f \sigma_c)$. Note that $\mathrm{Var}(F) = E[(F - \mu_f)^2] = \sigma_f^2$ and, looking at the value for C, $E[C] = \mu_c = E[-17.78 + 0.556F] = -17.78 + 0.556E[F] = -17.78 + 0.556\mu_f$.
So $C - \mu_c = 0.556(F - \mu_f)$. Therefore: $\sigma_{fc} = E[(F - \mu_f)(C - \mu_c)] = E[(F - \mu_f)(0.556F - 0.556\mu_f)] = 0.556E[(F - \mu_f)^2] = 0.556\sigma_f^2$.

Notice as well that $\mathrm{Var}(C) = E[(C - \mu_c)^2] = E[(0.556F - 0.556\mu_f)^2] = E[0.556^2 (F - \mu_f)^2] = 0.556^2 E[(F - \mu_f)^2] = 0.556^2 \sigma_f^2$, so $\sigma_c = 0.556\sigma_f$. Therefore: $\rho_{fc} = \sigma_{fc}/(\sigma_f \sigma_c) = 0.556\sigma_f^2/(\sigma_f (0.556\sigma_f)) = \sigma_f^2/\sigma_f^2 = 1$.

7. The admission index is I = 800GPA + SAT. The expected index is $E[I] = 800\mu_{GPA} + \mu_{SAT} = 800(3.6) + 1100 = 3980$. The variance is $\mathrm{Var}(I) = 800^2\sigma_{GPA}^2 + \sigma_{SAT}^2 + 2(800)(1)\sigma_{GPA}\sigma_{SAT}\rho_{g,s} = 800^2(1)^2 + 100^2 + 2(800)(1)(1)(100)(0.5) = 730{,}000$, so $\sigma_I = 854$.

8. This problem is easiest if you construct the 2x2 table, fill in what you know, then fill in the blanks. Here is what you are given:

                     Die in 5 years
Poverty            No      Yes     Row totals
Yes                        0.01    0.14
No
Column totals              0.03    1.00

By construction, Pr(not die) = 0.97 and Pr(not in poverty) = 0.86. The columns must sum to the column totals, so Pr(Die in 5 ∩ not in poverty) = 0.03 - 0.01 = 0.02. The completed table is:

                     Die in 5 years
Poverty            No      Yes     Row totals
Yes                0.13    0.01    0.14
No                 0.84    0.02    0.86
Column totals      0.97    0.03    1.00

a. Pr(Die in 5 | in poverty) = Pr(Die in 5 ∩ in poverty)/Pr(in poverty). You are given that the numerator is 0.01 and the denominator is 0.14, so Pr(Die in 5 | in poverty) = 0.01/0.14 = 0.071.

b. Pr(Die in 5 | not in poverty) = Pr(Die in 5 ∩ not in poverty)/Pr(not in poverty). Since 14% of the population is in poverty, it must be the case that 86% is not in poverty. Likewise, all people who die within five years are either in poverty or not. Since 3% will die and 1% will die in poverty, it must be the case that Pr(Die in 5 ∩ not in poverty) = 0.02. Therefore, Pr(Die in 5 | not in poverty) = Pr(Die in 5 ∩ not in poverty)/Pr(not in poverty) = 0.02/0.86 = 0.023.

c. In this case, Pr(Die in 5 | in poverty) ≠ Pr(Die in 5): 0.071 versus 0.03. Poverty conveys a lot of information about your probability of death, so the events are NOT independent. Notice that Pr(Die in 5 | in poverty)/Pr(Die in 5 | not in poverty) = 0.071/0.023 ≈ 3.

9. N=25, x̄=98.1 and s=1.
Test the null hypothesis H0: µ = 98.6 in this sample. The definition of the confidence interval is $\bar{x} \pm t_{\alpha/2}(n-1)\, s/n^{0.5}$, and in this case the critical value of the t-distribution for a 95% confidence interval is $t_{\alpha/2}(n-1) = t_{0.025}(24) = 2.064$. Therefore, the 95% CI is $98.1 \pm 2.064(1)/25^{0.5} = (97.69, 98.51)$. Since 98.6 is NOT in the confidence interval, we can reject the null.

The t-statistic is given as $\hat{t} = (\bar{x} - a)/[s/n^{0.5}]$, where a is the value of the mean under the null hypothesis. In this case, $\hat{t} = (98.1 - 98.6)/[1/25^{0.5}] = -2.5$. Since $|\hat{t}| > 2.064$, the critical value of the t-distribution, we can reject the null.

10. In this case, $\bar{x}_a = -16$ and $\bar{x}_l = -7$, so $\hat{\Delta} = \bar{x}_a - \bar{x}_l = -16 - (-7) = -9$. The definition of the confidence interval is

$\hat{\Delta} \pm t_{0.025}(n_a + n_l - 2)\, s_p \left(\frac{1}{n_a} + \frac{1}{n_l}\right)^{0.5}$

where $t_{\alpha/2}(n_a + n_l - 2) = t_{0.025}(38) = 2.024$. The pooled variance is

$s_p^2 = [(n_a - 1)s_a^2 + (n_l - 1)s_l^2]/[n_a + n_l - 2] = [19(12^2) + 19(8^2)]/38 = 104$

so $s_p = 10.2$. Plugging these values into the confidence interval, we find that the 95% confidence interval for Δ is (-15.5, -2.5). Since 0 is not in the interval, we can reject the null that the weight loss is the same. With a 99% confidence level, $t_{\alpha/2}(n_a + n_l - 2) = t_{0.005}(38) = 2.712$. The t-test is defined as

$\hat{t} = \frac{\hat{\Delta}}{s_p \left(\frac{1}{n_a} + \frac{1}{n_l}\right)^{0.5}} = \frac{-9}{10.2\left(\frac{1}{20} + \frac{1}{20}\right)^{0.5}} = -2.79$

Since $|\hat{t}| = 2.79 > 2.712$, we can again reject the null, even at the 99% level.

11. Given a linear combination of random variables, $Z = a + bY_1 + cY_2$, the handout demonstrates that $\mathrm{Var}(Z) = b^2\mathrm{Var}(Y_1) + c^2\mathrm{Var}(Y_2) + 2bc\,\mathrm{Cov}(Y_1, Y_2)$. In this case $Z = Y_1 + Y_2$, so a=0, b=1 and c=1. We are also given that $Y_1$ and $Y_2$ have the same variance $\sigma_y^2$ and that $Y_1$ and $Y_2$ are independent, so we also know that $\mathrm{Cov}(Y_1, Y_2) = \sigma_{12} = 0$. Plugging all these values into the equation above, $\mathrm{Var}(Z) = 2\sigma_y^2$.

12. Given the results above, if $Z = Y_1 + Y_2 + Y_3 + \dots + Y_n$ and each $Y_i$ is independent of $Y_j$ for i ≠ j, then $\mathrm{Var}(Z) = \mathrm{Var}(Y_1 + Y_2 + Y_3 + \dots + Y_n) = \mathrm{Var}(Y_1) + \mathrm{Var}(Y_2) + \dots + \mathrm{Var}(Y_n) = n\sigma_y^2$.
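The arithmetic in the one-sample and two-sample t calculations above can be re-checked with a short script. This is only a sketch, not part of the original answer key: the function names are hypothetical, the critical values 2.064 and 2.024 are copied from the answers rather than computed, and the group sizes (20 and 20) and standard deviations (12 and 8) are read off the pooled-variance line.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """Return (t statistic, standard error) for H0: mean == mu0."""
    se = s / math.sqrt(n)
    return (xbar - mu0) / se, se


def pooled_two_sample(mean_a, mean_l, s_a, s_l, n_a, n_l):
    """Return (difference, pooled sd, standard error, t statistic).

    Assumes both groups share a common variance (pooled estimate).
    """
    diff = mean_a - mean_l
    pooled_var = ((n_a - 1) * s_a**2 + (n_l - 1) * s_l**2) / (n_a + n_l - 2)
    s_p = math.sqrt(pooled_var)
    se = s_p * math.sqrt(1 / n_a + 1 / n_l)
    return diff, s_p, se, diff / se


def conf_interval(center, crit, se):
    """Two-sided interval: center +/- crit * se."""
    return center - crit * se, center + crit * se


# One-sample test: n = 25, sample mean 98.1, s = 1, null mean 98.6.
t9, se9 = one_sample_t(98.1, 98.6, 1.0, 25)
ci9 = conf_interval(98.1, 2.064, se9)  # 2.064 taken from the text (t table, 24 df)

# Two-sample test: means -16 and -7, sds 12 and 8, 20 observations per group.
diff, s_p, se10, t10 = pooled_two_sample(-16, -7, 12, 8, 20, 20)
ci10 = conf_interval(diff, 2.024, se10)  # 2.024 taken from the text (t table, 38 df)

print("one-sample t:", round(t9, 2), "95% CI:", tuple(round(v, 2) for v in ci9))
print("pooled sd:", round(s_p, 1), "two-sample t:", round(t10, 2),
      "95% CI:", tuple(round(v, 1) for v in ci10))
```

Hard-coding the table values keeps the check in the standard library; if a statistics package is available, the critical values could be computed from the t distribution instead of being typed in.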