Assignment #6: Adding one variable at a time
Psych 5510/6510
Spring, 2009

We will continue to work with the ‘Equity Data.’

Variable Descriptions:
Name of the State
X1  expenditure per student in public elementary and secondary schools.
X2  average student/teacher ratio in public elementary and secondary schools.
X3  annual salary of teachers in public elementary and secondary schools.
X4  percent of eligible students taking the SAT.
Y   average total score on the SAT.

1. Using SPSS, regress Y on X1, X2, X3, and X4 (all four at once). Write the regression equation, filling in the values for the various b’s.

2.
a) Write out Model C and Model A (using β’s) for adding X1 to a Model C that has all the other predictor variables (X2, X3, and X4).
b) Write H0 and HA in terms of β1.
c) Write the ‘t’ and ‘p’ values for determining whether or not to reject H0. Then state your decision regarding H0.
d) Write the partial correlation coefficient for the relationship between X1 and Y, using the correct symbol (i.e. Ry1.234 = ?).
e) State the PRE for moving from Model C to Model A.

3. Repeat parts ‘a’ through ‘e’ of question 2, but this time for adding X2 to a model that has all the other predictor variables (X1, X3, and X4). Note: we are talking about β2 now, and the symbol in part ‘d’ will change. (You should have the picture by now, so we will skip doing the same thing for X3 and X4.)

4. For the following questions I recommend you shade in the areas in question. To make it simpler to hand in your answer, I’ll just ask for the parts of the diagram that are involved (e.g. ‘a and b’, or ‘c and d and f’).
a) Use horizontal lines to shade in the area of Y that is not explained by X2, X3, and X4, then list those parts of the diagram.
b) Use vertical lines to shade in the area of X1 that is not redundant with X2, X3, and X4. Some part of the diagram should now be cross-hatched (i.e. have both vertical and horizontal lines); list those parts of the diagram.
Note: The partial correlation coefficient is the correlation between the lined part of Y and the lined part of X1. The coefficient of partial determination equals the cross-hatched part divided by the total part of Y that is lined (including the cross-hatched area); in other words, it is the proportion of the lined area of Y that is cross-hatched.

5. Now we are going to do the partial correlation the long way. We will be working with the partial correlation between X1 and Y, but the same process could be applied to any of the predictor variables.

Determine the part of Y that cannot be predicted by X2, X3, and X4
a) Using SPSS, regress Y on X2, X3, and X4 (all three at once), and write the regression equation, filling in the values of the b’s. Now, using that regression equation and the ‘Transform>>Compute…’ menu in SPSS, create a new variable that consists of the predicted values of Y; call it ‘Y_Predicted’. Then create another variable, this one consisting of the errors of the predictions (Y - Y_Predicted), and call it ‘Y_Residuals’. You’ll need this in a second.

Determine the part of X1 that is not redundant with X2, X3, and X4
b) Regress X1 on those three predictor variables, and write down the regression equation, filling in the values of the b’s. Now, using that regression equation and the ‘Transform>>Compute…’ menu in SPSS, create a new variable that consists of the predicted values of X1; call it ‘X1_Predicted’. Then create another variable, this one consisting of the errors of the predictions (X1 - X1_Predicted), and call it ‘X1_Residuals’.

Now, determine the relationship between the part of Y that could not be predicted by X2, X3, and X4 and the part of X1 that is unique (not redundant with X2, X3, and X4)
c) Go to the Graphs>>Scatter… menu and create a scatter plot with Y_Residuals on the Y axis and X1_Residuals on the X axis. Have SPSS put the regression line on the chart.
Include the scatter plot with your answer sheet (cut and paste it into your answer sheet, or print it out as a pdf and attach it). Look at that graph: the correlation between those two variables is the ‘partial correlation coefficient’ from question 2, and the slope of the regression line is the ‘partial regression coefficient’ for X1 found in the regression equation for regressing Y on all of the predictor variables (see question 1 above).

Now, regress Y_Residuals on X1_Residuals. Use the output of the analysis to answer the following:
d) State the slope of the regression line. (Double check your work by comparing it to the partial regression coefficient for X1 from question 1; they should be the same except for minor differences due to rounding.)
e) The partial correlation between Y and X1 is the same as the correlation between Y_Residuals and X1_Residuals. State that correlation. (Double check your work by comparing it to the partial correlation coefficient for X1 found in question 2; again, expect only minor differences due to rounding.)
f) The PRE of using X1_Residuals to predict Y_Residuals is the same as the PRE of adding X1 to a model that already contains the other predictor variables. State the PRE (and double check it by looking back at question 2).
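
The residual-based procedure in question 5 can also be checked outside SPSS. What follows is only a minimal sketch in Python, not part of the assignment itself: it uses made-up synthetic data rather than the Equity Data, and the helper names (solve, ols, correlation, sse) are hypothetical names chosen for this illustration. It fits the regressions by ordinary least squares and confirms the three equalities from parts d, e, and f.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, predictors):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Returns (coefficients, residuals); coefficients[0] is the intercept.
    """
    rows = [[1.0] + list(vals) for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b = solve(xtx, xty)
    resid = [yi - sum(bi * xi for bi, xi in zip(b, r)) for r, yi in zip(rows, y)]
    return b, resid

def correlation(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (c - mv) for a, c in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((c - mv) ** 2 for c in v)
    return suv / (suu * svv) ** 0.5

def sse(resid):
    """Sum of squared errors."""
    return sum(e * e for e in resid)

# Synthetic stand-in data (NOT the Equity Data): X1 overlaps with X2, X3, X4.
rng = random.Random(0)
n = 50
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [rng.gauss(0, 1) for _ in range(n)]
x4 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.5 * a + 0.3 * b - 0.2 * c + rng.gauss(0, 1) for a, b, c in zip(x2, x3, x4)]
y = [2.0 * p - 1.0 * a + 0.5 * b - 3.0 * c + rng.gauss(0, 2)
     for p, a, b, c in zip(x1, x2, x3, x4)]

# Question 1 analogue: Model A, Y regressed on all four predictors.
b_full, resid_a = ols(y, [x1, x2, x3, x4])

# Question 5a and 5b analogues: residuals after removing X2, X3, and X4.
_, y_resid = ols(y, [x2, x3, x4])     # Y_Residuals (also Model C's errors)
_, x1_resid = ols(x1, [x2, x3, x4])   # X1_Residuals

# Question 5d, 5e, 5f analogues.
b_res, _ = ols(y_resid, [x1_resid])
slope = b_res[1]
partial_r = correlation(y_resid, x1_resid)
pre = (sse(y_resid) - sse(resid_a)) / sse(y_resid)

print("slope from residuals:       ", slope)
print("b1 from the full regression:", b_full[1])
print("partial r squared:          ", partial_r ** 2)
print("PRE, Model C to Model A:    ", pre)
```

In this sketch the slope from regressing the residuals matches the coefficient for X1 in the full regression, and the squared partial correlation matches the PRE, up to floating-point error. The same pattern should hold for your SPSS output, apart from rounding in the coefficients you type into ‘Transform>>Compute…’.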