NAME: ______________________    I.D. #: ______________________

ECONOMICS 2900
Economics and Business Statistics
SPRING, 2005
MIDTERM EXAMINATION
Thursday, February 17, 2005
Weight: 35%

NOTE: You have 70 minutes to complete the exam; budget your time accordingly. Please answer all questions in this exam booklet. Calculators used must not have the ability to program alphabetic characters (whole words or sentences).

GOOD LUCK

** Please do not mark the tables **

Question # 1 30 marks

The following data describe U.S. passenger car travel and fuel consumption from 1995 through 1999. The data represent billions of gallons consumed and billions of miles traveled.

Year               1995      1996      1997      1998      1999
Fuel consumed     68.07     69.22     69.87     71.70     73.16
Miles traveled  1438.29   1469.85   1501.82   1549.58   1569.27

                  Total      Mean   Variance
Fuel consumed    352.02      70.4        4.1
Miles traveled  7528.81    1505.8     2952.7

Covariance (fuel consumed, miles traveled): 86.78

A) Calculate the regression equation with fuel consumed as your dependent variable and miles traveled as your independent variable. Interpret the coefficients.
B) Is there enough evidence to suggest that the number of miles traveled is linearly related to fuel consumption? (Use alpha = .10.)
C) What is the coefficient of determination? What does this statistic tell you?
D) Predict, with 95% confidence, the amount of fuel consumed for the year 2000, when there were 1600 billion miles traveled.

Question 2 (25 marks)

The general manager of the Cleveland Indians baseball team is in the process of determining which minor-league players to draft. He is aware that his team needs home-run hitters and would like to find a way to predict the number of home runs a player will hit. Being an astute statistician, he gathers a random sample of players and records the number of home runs each player hit in his first two full years as a major-league player, the number of home runs he hit in his last full year in the minor leagues, his age, and the number of years of professional baseball.
An example of the first few lines of data, along with the initial regression printout, appears below.

Major HR   Minor HR   Age   Years Pro
      19         13    19           3
      23         15    21           3
       6          4    22           5

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.592560205
R Square              0.351127597
Adjusted R Square     0.335171718
Standard Error        6.992104843
Observations          126

ANOVA
              df             SS          MS          F   Significance F
Regression     3    3227.612245    1075.871   22.00616      1.85592E-11
Residual     122    5964.522676    48.88953
Total        125    9192.134921

             Coefficients   Standard Error     t Stat    P-value      Lower 95%
Intercept    -1.969977822      9.547049398   -0.20634   0.836866   -20.86933228
Minor HR      0.665838264      0.087149184   7.640212   5.46E-12    0.493317598
Age           0.135727743      0.524087215   0.258979   0.796088   -0.901756157
Years Pro     1.176370911      0.670625334    1.75414   0.081917   -0.151200086

a. What is the regression equation? Interpret each of the coefficients.
b. How well does the model fit?
c. Test at alpha = .05 whether the model is useful. Explain how your test result relates to "Significance F" on the regression printout.
d. Does each of the independent variables belong in the model? How can you tell?
e. Predict with 95% confidence the number of home runs in the first two years of a player who is 25 years old, has played professional baseball for 7 years, and hit 22 home runs in his last year in the minor leagues.

Question 3 (25 marks)

The administrator of a school board in a large county was analyzing the average mathematics test scores in the schools under her control. She noticed that there were dramatic differences in scores among the schools. In an attempt to improve the scores of all the schools, she set out to determine the factors that account for the differences. Accordingly, she took a random sample of 40 schools across the county and, for each, determined the mean test score last year, the percentage of teachers in each school who have at least one university degree in mathematics, the mean age of the mathematics teachers, and the mean annual income of the mathematics teachers.
An example of the first few lines of data, along with the initial regression printout, appears below.

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.5975122
R Square              0.3570209
Adjusted R Square     0.3034393
Standard Error        7.724526
Observations          40

ANOVA
              df             SS         MS          F   Significance F
Regression     3    1192.732105   397.5774   6.663125      0.001076925
Residual      36    2148.058895    59.6683
Total         39       3340.791

              Coefficients   Standard Error     t Stat    P-value     Lower 95%
Intercept        35.677618      7.278849159   4.901547   2.03E-05   20.91544713
Math Degree      0.2474816      0.069845662   3.543263   0.001115     0.1058282
Age              0.2448306      0.185213036   1.321886   0.194545   -0.13079835
Income           0.1332967      0.152818937   0.872253   0.388851   -0.17663405

Correlation matrix:
              Test Score   Math Degree        Age   Income
Test Score             1
Math Degree     0.506626             1
Age             0.332495      0.076597          1
Income          0.311981      0.099351   0.869752        1

a. What is the regression model? Do these coefficients make sense?
b. Overall, does this model fit the data well?

[Figures: histogram of the residuals (frequency by bin); plot of the residuals against observation number; plot of the residuals against the predicted values]

c. What are the required conditions regarding the error variable? Are these conditions satisfied? Explain in detail.
d. What is multicollinearity? Why is it a problem? Is multicollinearity a problem in this model? Should we fix multicollinearity when we do find evidence of it?
e. How would you suggest making this model better?

Question # 4 20 marks

QUEBEC REFERENDUM VOTE: WAS THERE ELECTORAL FRAUD?*

Quebecers have been debating whether to separate from Canada and form an independent nation. A referendum was held on October 30, 1995, in which the people of Quebec voted not to separate. The vote was extremely close, with the "No" side winning by only 52,448 votes. A large number of "No" votes were cast by the non-Francophone (non-French-speaking) people of Quebec, who make up about 20% of the population and who very much want to remain Canadians.
The remaining 80% are Francophones, a majority of whom voted "Yes". After the votes were counted, it became clear that the tallied vote was much closer than it should have been. Supporters of the "No" side charged that poll scrutineers, all of whom were appointed by the pro-separatist provincial government, rejected a disproportionate number of ballots in ridings where the percentage of "Yes" votes was low and where there are large numbers of Allophone (people whose first language is neither English nor French) and Anglophone (English-speaking) residents. (Electoral laws require the rejection of ballots that do not appear to be properly marked.) They were outraged that in a strong democracy like Canada, votes would be rigged much as they are in many nondemocratic countries around the world.

If, in ridings where there was a low percentage of "Yes" votes, there was a high percentage of rejected ballots, this would be evidence of electoral fraud. Moreover, if, in ridings where there were large percentages of Allophone and/or Anglophone voters, there were high percentages of rejected ballots, this too would constitute evidence of fraud on the part of the scrutineers and possibly the government.
In order to determine the veracity of the charges, the following variables were recorded for each riding:

Percentage of rejected ballots in referendum
Percentage of "Yes" votes
Percentage of Allophones
Percentage of Anglophones

A multiple regression analysis to determine how the percentages of "Yes" votes, Allophones, and Anglophones were related to the percentage of rejected ballots was performed and is summarized below.

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.372093
R Square              0.138453
Adjusted R Square     0.117092
Standard Error        0.981088
Observations          125

ANOVA
              df         SS         MS          F   Significance F
Regression     3   18.71651   6.238838   6.481686      0.000418362
Residual     121   116.4665   0.962533
Total        124    135.183

             Coefficients   Standard Error     t Stat    P-value
Intercept        1.565643         0.739285    2.11778    0.03624
Pct Yes          0.000262         0.012072    0.02169   0.982731
Pct Allo         0.036747         0.010332   3.556591   0.000537
Pct Anglo        -0.00904          0.01299   -0.69592   0.487811

Can we infer that electoral fraud took place? If so, how did it manifest itself?

________________________________
* This case is based on "Voting Irregularities in the 1995 Referendum on Quebec Sovereignty," by Jason Cawley and Paul Sommers, Chance, Vol. 9, No. 4, Fall 1996. We are grateful to Dr. Paul Sommers, Middlebury College, for his assistance in writing this case.
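
The summary figures in a regression printout like those in Questions 2 to 4 can be recomputed from the printout's own ANOVA and coefficient rows. The sketch below is a minimal illustration using the Question 4 values. The helper names `anova_summary` and `t_stat` are hypothetical and introduced here only for illustration, and the check assumes the usual least-squares definitions (R Square = SSR/SST, F = MSR/MSE, t = coefficient / standard error).

```python
import math


def anova_summary(ss_reg, ss_res, df_reg, df_res):
    """Recompute fit statistics from the Regression and Residual ANOVA rows.

    Hypothetical helper: inputs are the SS and df values as printed.
    """
    ss_total = ss_reg + ss_res
    df_total = df_reg + df_res
    ms_reg = ss_reg / df_reg          # mean square for regression
    ms_res = ss_res / df_res          # mean square for residuals (error)
    r_square = ss_reg / ss_total
    return {
        "r_square": r_square,
        "multiple_r": math.sqrt(r_square),
        "adj_r_square": 1 - ms_res / (ss_total / df_total),
        "std_error": math.sqrt(ms_res),
        "f_stat": ms_reg / ms_res,
    }


def t_stat(coefficient, standard_error):
    """t statistic for testing that a single coefficient equals zero."""
    return coefficient / standard_error


# Values copied from the Question 4 printout.
q4 = anova_summary(ss_reg=18.71651, ss_res=116.4665, df_reg=3, df_res=121)

# (coefficient, standard error) pairs copied from the Question 4 printout.
q4_coefficients = {
    "Intercept": (1.565643, 0.739285),
    "Pct Yes": (0.000262, 0.012072),
    "Pct Allo": (0.036747, 0.010332),
    "Pct Anglo": (-0.00904, 0.01299),
}
q4_t_stats = {name: t_stat(b, se) for name, (b, se) in q4_coefficients.items()}

if __name__ == "__main__":
    for name, value in q4.items():
        print(f"{name:>14}: {value:.6f}")
    for name, value in q4_t_stats.items():
        print(f"t({name}): {value:.5f}")
```

The recomputed values should agree with the printed ones up to rounding. The same two helpers can be pointed at the Question 2 and Question 3 printouts by substituting their SS, df, coefficient, and standard-error values.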