UNIVERSITY OF LOUISVILLE

Gender Differences and the SAT as a Predictor of College GPA

by Joanne A. Jeuch

Introduction to Research Methods and Statistics, EDFD 600
Dr. M. P. Benver
December, 1999

Abstract

For at least the past 60 years, educators have investigated the validity of the Scholastic Aptitude Test (SAT) as a predictor of college grade point average (GPA). While the SAT is one of the most commonly used tools to predict academic success at the university level, it remains one of the most debated: some believe it to be completely objective; others point to numerous external influences that affect the test's ability to adequately predict a student's academic success. This review focuses on studies of gender differences in the validity of the SAT as a predictor of GPA. Several studies have indicated gender bias in the predictive validity of the SAT, such that male students with a given SAT score will have a lower GPA than their female counterparts with the same score. Several options have been proposed to lessen or explain the effects of this biased prediction, such as combining high school GPA with SAT scores, or considering other, less tangible variables that the SAT does not measure. Synthesized as a whole, this information suggests that the SAT, while far from an ideal predictor of academic success, can be of great assistance to educators when used in conjunction with high school GPA.

Introduction

For at least the past 60 years, educators have investigated the validity of the Scholastic Aptitude Test (SAT) as a predictor of college grade point average (GPA). While the SAT is one of the most commonly used tools to predict academic success at the university level, it remains one of the most debated. Supporters of the SAT, like G.
Hanford (as cited in Jenkins, 1992), believe that it creates a level playing field on which all students can be compared, regardless of differences in schools, grading practices, or other aspects of their educational background. Those challenging the validity of the SAT believe that it is not objective, especially with regard to external differences such as primary language, personality, home life, disabilities, and gender. J. Crouse (as cited in Jenkins, 1992) even went so far as to say that the SAT is a better predictor of where a student will attend college than of the student's academic success once there.

This review focuses on studies of gender differences in the validity of the SAT as a predictor of GPA. Several studies have indicated gender bias in the predictive validity of the SAT, such that male students with a given SAT score will have a lower GPA than their female counterparts with the same score (e.g., Wainer & Steinberg, 1992). According to Stricker, Rock, & Burton (1991), this underprediction, though statistically significant, is relatively small: -0.10 on a 4-point GPA scale. However, any systematic bias is of concern when SAT scores are so often used as the sole tool in determining university admission, course assignments, tuition grants, and other awards. While the predictive validity of the SAT is in question, especially as it pertains to gender bias, there is no readily available replacement tool. Until the need for a replacement can be truly assessed and a replacement developed, the College Board will continue to use the SAT as a predictor of future academic success. Several options have been presented to lessen or explain the effects of the biased prediction of GPA, such as the combined use of high school GPA and SAT scores, or the consideration of other, less tangible variables that are not tested by the SAT. These options and explanations are the focus of this review.
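The underprediction these studies report is typically measured by fitting a single regression of college GPA on SAT score for the pooled sample and comparing the residuals of men and women. The following sketch illustrates the idea on synthetic data; all numbers, including the simulated +0.10 advantage for women, are assumptions chosen to echo the magnitude reported above, not values from any of the studies.

```python
import numpy as np

# Illustration of "underprediction" on synthetic data: fit one pooled
# regression of college GPA on SAT score, then compare mean residuals
# by gender. The +0.10 effect for women is an assumed, illustrative value.
rng = np.random.default_rng(0)
n = 5000
sat = rng.normal(500, 100, size=n)
female = rng.integers(0, 2, size=n)          # 1 = female, 0 = male
gpa = 1.0 + 0.004 * sat + 0.10 * female + rng.normal(0, 0.5, size=n)

slope, intercept = np.polyfit(sat, gpa, 1)   # pooled prediction line
residuals = gpa - (intercept + slope * sat)

# The pooled line splits the difference between the groups: women sit
# above it (underpredicted) and men below it (overpredicted).
print(round(residuals[female == 1].mean(), 3))   # positive
print(round(residuals[female == 0].mean(), 3))   # negative
```

Because a single pooled equation splits the difference between the two groups, women's grades land above the prediction (positive mean residual) and men's below it, which is exactly what "underprediction of women's GPA" means in this literature.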
Findings

One possible explanation of the gender bias in the SAT is presented by Bridgeman & Wendler (1991), who thought that the bias could be explained, at least in part, by the way men and women sort themselves into the various types of college mathematics courses: algebra, pre-calculus, and calculus. For instance, there tend to be more men in engineering majors, which require calculus; therefore, it is likely that there will be more men than women in a calculus class. In their study across multiple universities and mathematics courses, however, they found that this was not the case. A consistent tendency to underpredict a woman's course grade, or conversely, to overpredict a man's, based on their respective scores on the mathematics portion of the SAT (SAT-M), was found across all course levels at all of the universities. The bias was almost completely eliminated when high school GPA was used in conjunction with SAT-M scores.

Bridgeman & Wendler called upon a study performed by Hyde, Fennema, & Lamon (as cited in Bridgeman & Wendler, 1991) to offer an explanation of their results. That study found that, in the years before high school, women are better at computational tasks than men; after high school, there is no notable difference in computational ability between the sexes. For problem-solving tasks, however, men tend to develop a moderate advantage over women during high school. Since the SAT-M focuses more on problem-solving tasks, while college mathematics courses focus more on computational tasks, the difference in college grade point averages for men and women with the same SAT-M scores can be explained.

Wainer & Steinberg (1992) also conducted a study across various levels of mathematics courses to assess the magnitude and direction of any gender bias in the mathematics segment of the SAT. Wainer & Steinberg, however, chose to approach the problem in reverse.
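On synthetic data, that reversed (retrospective) comparison, which matches men and women on the grade they earned and then compares their SAT-M scores, can be sketched as follows; every effect size below is an illustrative assumption, not a value from the study.

```python
import numpy as np

# Retrospective sketch on synthetic data: group students by course grade
# and compare mean SAT-M scores of men and women within each grade.
rng = np.random.default_rng(1)
n = 4000
female = rng.integers(0, 2, size=n)
sat_m = rng.normal(550, 90, size=n)

# Simulated course performance: depends on SAT-M plus a small advantage
# for women that the SAT-M does not capture (assumed for illustration).
latent = 0.01 * (sat_m - 550) + 0.3 * female + rng.normal(0, 0.8, size=n)
grade = np.clip(np.round(2.5 + latent), 0, 4).astype(int)   # 0-4 scale

gaps = []
for g in range(5):
    men = sat_m[(grade == g) & (female == 0)]
    women = sat_m[(grade == g) & (female == 1)]
    if len(men) > 30 and len(women) > 30:    # skip sparse grade cells
        gaps.append(men.mean() - women.mean())

# Positive average gap: in this simulation, men needed higher SAT-M
# scores than women to earn the same grade.
print(round(float(np.mean(gaps)), 1))
```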
Instead of predicting college GPA from SAT-M scores, they examined the SAT-M scores of men and women who achieved the same grade in a given mathematics course. Using data on approximately 47,000 students, enrolled in numerous colleges, who had taken the SAT-M between 1982 and 1986, they found a consistent tendency for women to have lower SAT-M scores than their male counterparts, matched by course level and grade. This difference was, on average, about 33 points. While their results were clearly of interest, Wainer & Steinberg were subjected to significant scrutiny and criticism over their method of retrospective analysis.

Bridgeman & Lewis, in their 1996 publication, challenged Wainer & Steinberg's choice of a retrospective analysis and brought forth two additional issues not considered by Wainer & Steinberg: (a) Wainer & Steinberg did not account for fundamental differences among calculus courses, such as those between a liberal arts calculus course and one designed for engineering and science majors, and (b) Wainer & Steinberg treated the SAT-M as the sole predictor of college GPA, without considering the potential effects of high school GPA on the regression analysis. Bridgeman & Lewis conducted a reanalysis of the data used by Wainer & Steinberg, but used only the data for students who had taken the SAT-M in 1985, to control for possible variation in grading standards over time. The data were further restricted to colleges that distinguished between liberal arts calculus and engineering calculus courses. The reanalysis was conducted to predict college mathematics grade from SAT-M score, a direction that, according to Bridgeman & Lewis, is of more interest to educators.
They found that approximately 25% of the difference reported by Wainer & Steinberg was due to differences in the gender populations and grading standards between the two types of calculus courses. Unlike Wainer & Steinberg, Bridgeman & Lewis also analyzed high school GPA as a predictor of college mathematics grade. They found significant differences similar in magnitude to those found for SAT-M scores, but in the opposite direction: high school GPA appeared to favor the men. As in the results seen by Bridgeman & Wendler (1991), the gender difference was virtually eliminated when high school GPA and SAT-M scores were considered together.

All of the previous studies found gender to be a significant factor when using SAT scores to predict college GPA, but none of them ventured to test other possible explanations for the gender difference. Wright, Palmer & Miller (1996) conducted multiple regression analyses on two identical groups of students taking an introductory marketing course. They controlled as many variables as possible; for instance, grades were based entirely on three multiple-choice, computer-graded exams, which eliminated variation due to test differences, grading inconsistencies or errors, instructor-student interactions, and the like. Through regression analysis, Wright, Palmer & Miller found that students' performance on both the verbal and mathematics portions of the SAT predicted the final course grade (p = .032 for verbal and a marginal p = .056 for mathematics), but gender did not (p = .238). A key finding of this analysis, however, was that the resulting regression equation explained only a portion of the variation in men's and women's course grades. While the researchers recognized this fact, they only theorized about what other variables could be significant.

Stricker, Rock & Burton, in their 1991 study, ventured to theorize about other variables, but also went on to test their theories.
This study investigated two possible explanations for gender bias in the SAT: first, the nature of the college grading criteria, and second, other, less tangible variables associated with scholarly pursuits. The researchers created a means of adjusting raw GPA scores for perceived differences in grading criteria. Results of the regression analysis showed a small but consistent gender difference for both the raw and adjusted scores. Adjusting the GPA for grading differences did not account for the observed gender differences; neither was the difference correlated with SAT level (high scorers versus low scorers), college, or ethnic group. Of the fifteen explanatory variables that Stricker, Rock & Burton examined, only five reduced the gender bias when introduced into the regression equation. Three of the five produced significant reductions: average high school grades, percentage of required reading completed, and percentage of required assignments completed. The remaining two, years of high school courses taken and study methods, produced smaller reductions. Overall, when the five explanatory variables were included in the regression equation, gender was no longer a significant variable.

Discussion

Educators and researchers alike are struggling with the predictive validity of the Scholastic Aptitude Test. The studies reviewed here clearly indicate that in most scenarios some type of gender bias exists; however, one must consider whether the gender bias stems from the test itself or results from other external variables, which the test is not equipped to measure. The studies conducted by Bridgeman & Wendler (1991), Wainer & Steinberg (1992), and Bridgeman & Lewis (1996) did not all agree on analysis methods, but they did all agree that there is a small but consistent tendency for the SAT to underpredict the college grades of female students.
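Several of the studies above also report that this underprediction shrinks or disappears once high school GPA is added as a second predictor. A toy regression on synthetic data illustrates the mechanism; the variable names and effect sizes are illustrative assumptions, not values from the studies.

```python
import numpy as np

# Toy regression on synthetic data: a gender gap that appears when
# college GPA is regressed on SAT alone, and collapses once high school
# GPA enters the model. All coefficients are assumed for illustration.
rng = np.random.default_rng(2)
n = 8000
female = rng.integers(0, 2, size=n).astype(float)
sat = rng.normal(500, 100, size=n)
# Assume high school GPA carries information the SAT misses, with a
# +0.2 advantage for women (an illustrative assumption).
hs_gpa = 2.0 + 0.002 * sat + 0.2 * female + rng.normal(0, 0.3, size=n)
# College GPA depends on SAT and high school GPA, not on gender directly.
col_gpa = 0.5 + 0.002 * sat + 0.5 * hs_gpa + rng.normal(0, 0.4, size=n)

ones = np.ones(n)
X1 = np.column_stack([ones, sat, female])           # SAT only
X2 = np.column_stack([ones, sat, hs_gpa, female])   # SAT + HS GPA
b1, *_ = np.linalg.lstsq(X1, col_gpa, rcond=None)
b2, *_ = np.linalg.lstsq(X2, col_gpa, rcond=None)

print(round(b1[2], 3))   # female coefficient, SAT only: roughly +0.10
print(round(b2[3], 3))   # female coefficient with HS GPA added: near zero
```

With SAT alone, the pooled equation leaves a positive coefficient on the female indicator, the apparent underprediction; once high school GPA, which in this simulation carries the information the SAT misses, enters the model, that coefficient collapses toward zero.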
This tendency transcends course, major, and university to produce the systematic bias that is of so much concern to researchers and educators alike. In turn, the studies conducted by Wright, Palmer & Miller (1996) and Stricker, Rock & Burton (1991) suggest sources for the apparent gender bias other than the SAT itself. When all of this information is synthesized as a whole, it suggests that the SAT, while far from an ideal predictor of academic success, can be of great assistance to educators allocating resources for admissions or tuition grants, provided it is taken with the proverbial "grain of salt": the high school GPA.

References

Bridgeman, B., & Lewis, C. (1996). Gender differences in college mathematics grades and SAT-M scores: A reanalysis of Wainer and Steinberg. Journal of Educational Measurement, 33(3), 257-270.

Bridgeman, B., & Wendler, C. (1991). Gender differences in predictors of college mathematics performance and in college mathematics course grades. Journal of Educational Psychology, 83(2), 275-284.

Jenkins, N. J. (1992). The Scholastic Aptitude Test as a predictor of academic success: A literature review. (ERIC Document Reproduction Service No. ED 354 243)

Stricker, L. J., Rock, D. A., & Burton, N. W. (1991). Sex differences in SAT predictions of college grades (Report No. 91-38). Princeton, NJ: Educational Testing Service. (ERIC Document Reproduction Service No. ED 350 336)

Wainer, H., & Steinberg, L. S. (1992). Sex differences in performance on the mathematics section of the Scholastic Aptitude Test: A bidirectional validity study. Harvard Educational Review, 62(3), 323-336.

Wright, R. E., Palmer, J. C., & Miller, J. C. (1996). An examination of gender-based variations in the predictive ability of the SAT. College Student Journal, 30, 81-84.