A Concentration Analysis of Student Responses on the 1995 Version of the Force Concept Inventory

Nicole DiGironimo
University of Delaware
2007 AERA Annual Meeting - Poster Session Conference Paper

Introduction

There have been substantial efforts made towards improving basic physics courses, especially since Halloun and Hestenes' (1985a) survey of calculus-based and non-calculus-based physics students at the University of Arizona. Their findings were consistent with other studies questioning the effectiveness of traditional methods of teaching physics, suggesting that traditional teaching methods were not successful at imparting physics knowledge to students (Arons, 1997; Hake, 1998; Halloun & Hestenes, 1985a; Hestenes, 1979). Currently, the best-known and most widely used student testing instrument in physics education research is the Force Concept Inventory (FCI). The FCI is a multiple-choice test designed to categorize students' understanding of basic Newtonian physics concepts (Hestenes et al., 1992). However, as with any standardized testing instrument, some researchers critiqued the exam's design and use (Griffiths, 1997; Huffman & Heller, 1995); this prompted the development of a variety of methods for analyzing FCI data. This paper adds to the literature base by reporting on the implementation of a previously developed analysis method on the newest version of the FCI.

A Short History of the Force Concept Inventory

The Mechanics Diagnostic Test (MDT) was first published in 1985 (Halloun & Hestenes, 1985a). The purpose of the exam was to identify the various levels of student knowledge present in any college introductory physics course. It accomplished this goal with open-answer questions that required students to utilize their physics skills. Subsequent analysis and coding of the student responses to the open-ended questions determined recurring student alternative conceptions.
Halloun and Hestenes (1985b), disappointed with the haphazard misconceptions literature of the time, recognized a need for a comprehensive taxonomy of alternative conceptions of Newtonian physics. Their review of the literature and of MDT results uncovered elements of Aristotelian and Impetus theories present in student thinking, and they used these findings to build categories of common alternative conceptions (Hestenes et al., 1992). These categories were used to construct a multiple-choice version of the MDT, designed to present alternative-conception distracters to the students in order to draw out the students' physics conceptions. To establish reliability and validity, various versions of the multiple-choice MDT were administered to over 1000 college students, as well as to physics professors and graduate students. The results of these tests were affirmative, and physics education researchers began using this evaluation tool immediately. Successes with the multiple-choice version of the MDT led to the development of the FCI in 1992, another multiple-choice test with distracters representing the common alternative student conceptions. Only a few improvements were made to the MDT to create the FCI; half of the FCI questions were MDT questions, and most of the changes were to the language used in the test questions. Validity and reliability tests were not repeated for the FCI because the FCI scores were comparable with the MDT scores and the FCI was designed as an improvement to the MDT (Hestenes et al., 1992; Savinainen & Scott, 2002). Although the FCI, like the MDT, probed students' beliefs about Newtonian concepts of force and motion, its main purpose was to "evaluate the effectiveness of instruction" (Hestenes & Halloun, 1995, p. 502). As education researchers' interests evolved from basic instruction improvement to student conceptual understanding, cognition, and epistemology, the analysis and use of the FCI evolved as well.
In 1995, the current version of the FCI was developed as a revision to the 1992 version; it has 30 multiple-choice items, compared to 29 on the original (Halloun et al., 1995).

Theoretical Framework

Research shows that students enter an introductory physics course with pre-defined physics beliefs (Halloun & Hestenes, 1985a). The literature also indicates that instructor and textbook authority alone are not enough for students to dismiss their common-sense physics alternative conceptions (Halloun & Hestenes, 1985a; Pintrich et al., 1993). These pre-defined beliefs, or concepts, are a system used to explain the physical world. The theoretical framework used in this paper to conceptualize student belief systems follows Ioannides and Vosniadou's (2002) framework theory and diSessa and Sherin's (1998) concept system. The framework theory is defined as a relatively well-established theory, or "an explanatory system with some coherence" (Ioannides & Vosniadou, 2002, p. 4), about the physical world that begins to form when we are very young and is complete by the time we start school. Ioannides and Vosniadou state that the "framework theory is based on everyday observations and information provided by the culture, as this information is interpreted by the human cognitive system" (Ioannides & Vosniadou, 2002, p. 4). Ioannides and Vosniadou's study involved young children's ideas about 'force' and the authors claimed that,

if there is a framework theory that guides children's interpretation of the word force, then we should expect children to answer questions about force in a relatively uniform and internally consistent manner. If not, we should expect logically inconsistent responses guided by a multiplicity of fragmented interpretations of the meaning of force. (Ioannides & Vosniadou, 2002, p. 5)

This definition is especially useful for analyzing the FCI.
As FCI data are analyzed, patterns in the students' answers will provide insight into their common ideas about Newtonian physics. If a student, or a group of students, consistently answers questions that probe the same physics topic correctly, then, using a framework theory foundation, one could conclude that a coherent belief system exists. Researchers commonly refer to incorrect pre-defined belief systems as misconceptions or alternative conceptions. An interesting fact is that the most common alternative conceptions, the Aristotelian and Impetus theories, were, in pre-Newtonian times, advocated by scientists (Halloun & Hestenes, 1985b). Instructors, therefore, should not only have a way to identify their students' alternative conceptions but should also take all alternative conceptions seriously. Each alternative conception should be considered a valid student hypothesis, and physics courses should be structured to evaluate the alternative conceptions by scientific procedures. Structuring a course in this way can provide students with the experimental proof, scientific reasoning, and time needed to revise their beliefs (Chinn & Brewer, 1993; Halloun & Hestenes, 1985a; Ioannides & Vosniadou, 2002). diSessa and Sherin (1998) provided insight into the cognitive aspects of this research. Their paper tackled the difficult task of defining 'concept' in the context of conceptual change and understanding. Their definition did not describe a 'concept' as a singular idea or as a small group of ideas; rather, like Ioannides and Vosniadou, they described the model of a concept as "more like a knowledge system" (diSessa & Sherin, 1998, p. 15). It is a student's comprehension of Newtonian topics that defines his/her basic knowledge system, or concept system, of Newtonian physics. A concept system derived from personal experience and very little formal training will differ distinctly from the Newtonian knowledge system of a trained physicist (Bransford et al., 1999).
diSessa and Sherin claimed that, "instead of stating that one either has or does not have a concept, we believe it is necessary to describe specific ways in which a learner's concept system behaves like an expert's - and the ways and circumstances in which it behaves differently" (diSessa & Sherin, 1998, pp. 15-16). All basic physics courses cover the Newtonian theory of physics. Newtonian theory enables us to identify the basic elements in the conceptualization of motion. The kinematical elements are position, distance, motion, time, velocity, and acceleration. The dynamical elements are inertia, force, resistance, vacuum, and gravity. These topics were chosen for inclusion in the FCI for their ability to illuminate the differences between Aristotelian, Impetus, and Newtonian thinkers. It is this difference between expert (Newtonian) and novice understanding that the FCI attempts to bring to light through its well-designed distracters; this difference is also evident in the results of the analyzed FCI exams in this study.

Purposes of this Study

Two of the standard methods used to reveal useful information from FCI scores were developed and implemented by Hake (1998) and by Bao and Redish (2001). Bao and Redish's method is called a concentration analysis, which measures student response distributions on multiple-choice exams. They applied their method to the 1992 version of the FCI. The purpose of this study was to use the concentration analysis on the 1995 version of the FCI. The theory behind the concentration analysis is that, if students have well-defined ideas about the subject being tested, and if the multiple-choice options represent these common alternative conceptions as distracters, then student responses should be concentrated on the distracters appropriate to the physics concept defined in the student's mind (Bao & Redish, 2001; Ioannides & Vosniadou, 2002).
As already stated, the FCI is intended to create this exact situation; the way in which a student responds to each question should yield some information about their alternative conceptions, or lack thereof. After applying the concentration analysis to the FCI exams, the main purpose of this study was to use the concentration analysis data to investigate the students' responses with the hope that patterns would reveal themselves. Bao and Redish (2001) claimed their concentration analysis could determine whether students who take the FCI possess common correct or incorrect physics concepts and, therefore, allow one to determine whether the FCI is effective in detecting the students' physics concepts. This study set out to authenticate the first part of Bao and Redish's claim. The latter claim was beyond the scope of this study.

Methodology: The Concentration Analysis

To understand the concentration analysis, first consider an example where 100 students answer the same multiple-choice question, choosing between choices A, B, C, D, or E. Bao and Redish (2001) maintained that the student responses will correspond to one of three types of outcomes, illustrated in Table I.

1. A type I response pattern represents an extreme case where all the responses are evenly distributed across all of the choices.
2. A type II pattern represents a more typical situation where there is a higher distribution on some choices than on others.
3. A type III pattern is another extreme case where every student has selected the same answer, presumably, although not necessarily, the correct answer.

Table I
Possible Distributions for a Multiple-Choice Question

                          Choices
Type of Pattern     A     B     C     D     E
I                  20    20    20    20    20
II                 35    10    50     0     5
III                 0     0   100     0     0

Note. Adapted from "Concentration Analysis: A Quantitative Assessment of Student States," by L. Bao and E.F. Redish, 2001, American Journal of Physics, 65(7), p. 45.

The concentration factor, C, is a function of student responses.
This function takes on values in the interval [0, 1], where 1 represents a Type III, perfectly correlated pattern and 0 represents a Type I pattern. The concentration factor is calculated for each exam question as

C = (√m / (√m − 1)) × ( √(Σᵢ nᵢ²) / N − 1/√m )

where m represents the number of choices for the question (for the FCI, this number is always equal to 5), N is the number of students who answered the question, nᵢ is the number of students who selected choice i, and the sum runs over the m choices. Student response patterns are formed by combining the question's concentration factor with the question's score, the percentage of students who answered a particular question correctly. Like the concentration factor, the score is a continuous value with a range of [0, 1]. Bao and Redish created a coding scheme, illustrated in Table II, to label the student response patterns.

Table II
Coding Scheme for Score and Concentration Factor

Score (S)    Level    Concentration Factor (C)    Level
0 ~ 0.4        L      0 ~ 0.2                       L
0.4 ~ 0.7      M      0.2 ~ 0.5                     M
0.7 ~ 1.0      H      0.5 ~ 1.0                     H

Note. Adapted from "Concentration Analysis: A Quantitative Assessment of Student States," by L. Bao and E.F. Redish, 2001, American Journal of Physics, 65(7), p. 50.

Although they are illuminating, the codes for the score and concentration factor carry no great weight on their own. Table III shows how combining the codes for the score and concentration factor provides the student response patterns for each multiple-choice question. Table III also indicates how each response pattern can be used to interpret the students' understanding of physics, their concept system. These response patterns are the intended products of the concentration analysis.
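As an illustration of the computation just described, here is a minimal Python sketch of the concentration factor together with Bao and Redish's coding scheme. The function names are my own, and the treatment of boundary values (assigned here to the higher level) is my choice, since the source tables leave it unspecified.

```python
import math

def concentration(counts):
    """Concentration factor C in [0, 1] for one multiple-choice question.

    counts -- number of students selecting each choice.
    C = 0 for a perfectly even spread; C = 1 when everyone picks one choice.
    """
    m = len(counts)              # number of choices (always 5 on the FCI)
    N = sum(counts)              # number of students who answered
    rm = math.sqrt(m)
    return rm / (rm - 1) * (math.sqrt(sum(n * n for n in counts)) / N - 1 / rm)

def level(value, lo, hi):
    """Code a value in [0, 1] as L, M, or H using two cut points."""
    return "L" if value < lo else ("M" if value < hi else "H")

def response_pattern(counts, correct_index):
    """Two-letter pattern from the score code (cuts 0.4 / 0.7, per Table II)
    and the concentration code (cuts 0.2 / 0.5)."""
    score = counts[correct_index] / sum(counts)
    return level(score, 0.4, 0.7) + level(concentration(counts), 0.2, 0.5)

# The three example distributions from Table I (100 students, C correct):
type_I   = [20, 20, 20, 20, 20]   # even spread       -> C = 0
type_II  = [35, 10, 50, 0, 5]     # typical           -> C ~ 0.31
type_III = [0, 0, 100, 0, 0]      # all on one answer -> C = 1

print(response_pattern(type_II, correct_index=2))   # MM
print(response_pattern(type_III, correct_index=2))  # HH
```

For question 17 in this study, for example, S = 0.14 and C = 0.76 code as L and H, reproducing the LH pattern reported in Table V.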
Table III
Student Response Patterns and Interpretation of the Patterns

Pattern Type    Response Pattern    Interpretation of the Pattern
One-peak        HH                  One correct concept system
One-peak        LH                  One dominant incorrect concept system
Two-peak        LM                  Two incorrect concept systems
Two-peak        MM                  Two concept systems (one correct and one incorrect)
Non-peak        LL                  Three or more concept systems represented somewhat evenly

Note. Adapted from "Concentration Analysis: A Quantitative Assessment of Student States," by L. Bao and E.F. Redish, 2001, American Journal of Physics, 65(7), p. 50.

The purpose of Bao and Redish's study was to introduce and evaluate the concentration analysis method. The results from their study presented three conclusions regarding the effectiveness of using a concentration analysis to investigate students' ideas about physics. The first conclusion was that a concentration analysis can help detect erroneous student concept systems, especially when combined with student interviews. The fundamental nature of this analysis is the ability to find patterns in the students' thinking. If a student consistently chooses distracters that represent a particular alternative physics concept, then the instructor or researcher can draw some conclusions about the student's physics understanding (Bao & Redish, 2001; Ioannides & Vosniadou, 2002). When followed by interviews, the student's concept systems can be more easily identified. The second conclusion was that a concentration analysis helps to identify test questions with ineffective distracters. This outcome would be identified by an LL response pattern, which indicates that none of the available distracters are particularly attractive to a majority of the students. This could happen if none of the distracters reflect a common student concept; however, a great deal of research went into the development of the FCI and it is unlikely that any of the test questions lack the necessary distracters.
Two other, more likely, explanations are that there is no common student concept system for the context of the question, or that all the choices correspond well with student concept systems and the students are using all the concept systems equally. These possibilities would indicate that the group of students lacks a strong understanding of the subject matter: either the students are guessing at the answer (causing a nearly random distribution of student responses) or the students lack the experience needed to properly categorize the problem and solve it using the appropriate methods. Bao and Redish suggested that LL responses indicate a need for additional research (Bao & Redish, 2001). The last conclusion Bao and Redish made was a purely practical one: the results of a concentration analysis could be used in test construction. For all of the reasons already mentioned, a concentration analysis provides useful information regarding how students interpret the exam questions. For an effective diagnostic exam, a concentration analysis should reveal high concentrations paired with low (or high) scores. These results indicate effective distracters and coherent student concept systems.

Sample

The target population for this study is all students who could potentially take the FCI; however, the sample was drawn from an accessible population (Gall et al., 2003). The sample was drawn from an entire introductory physics course at a private, urban university. However, only 22 of the 41 students enrolled in the course chose to participate in the voluntary study. One student took the FCI but did not sign the consent form and, accordingly, was not included in the analysis, leaving 21 participants, or 51.2% of the class. The test was taken voluntarily and anonymously; therefore, there are no demographics for the sample. However, some information is known about the entire class.
There were 12 freshmen (29.3%), 24 sophomores (58.5%), 3 juniors (7.3%), and 2 seniors (4.9%) in the introductory physics course. The majors represented by the students include mechanical engineering, computer science, teacher education, mathematics, and economics. The largest group (22%) of the students were computer science majors, and 80.5% of the class was male. This information could be used in an analysis of the results but it was not used in this study.

Results and Discussion

This section contains two tables containing relevant data from the study. Table V lists the score, concentration, and response pattern for each question on the FCI; the revealing response patterns are color-coded in Table V to increase its utility. The actual distributions of the students' responses for each FCI question are in Table VI. The following sections of this paper investigate some interesting themes discovered in the data. First, there were seven instances of LL response patterns in the FCI data. Discussing the possible meanings behind LL student response patterns is usually illuminating, although, as will be seen, the LL patterns in this study proved difficult to interpret. Second, as indicated in the literature on expert and novice understanding (Bransford et al., 2000), the data demonstrate examples of student miscategorization of FCI questions. The third and fourth themes discovered within the data are examples of sample-wide understandings and misunderstandings of particular physics topics. The practical implications of these themes are discussed later in this paper. Finally, the students' total scores on the FCI are presented with a discussion of the meaning and implications of these data.
Table V
Score and Concentration for each FCI Question

Question   Q1    Q2    Q3    Q4    Q5    Q6    Q7    Q8    Q9    Q10   Q11   Q12   Q13   Q14   Q15
S          0.67  0.24  0.57  0.71  0.24  0.86  0.71  0.60  0.38  0.57  0.33  0.67  0.43  0.62  0.57
C          0.51  0.16  0.30  0.58  0.15  0.75  0.52  0.38  0.18  0.30  0.21  0.51  0.26  0.38  0.34
Pattern    MH    LL    MM    HH    LL    HH    HH    MM    LL    MM    LM    MH    MM    MM    MM

Question   Q16   Q17   Q18   Q19   Q20   Q21   Q22   Q23   Q24   Q25   Q26   Q27   Q28   Q29   Q30
S          0.85  0.14  0.24  0.38  0.43  0.35  0.76  0.42  0.67  0.33  0.19  0.48  0.67  0.86  0.29
C          0.74  0.76  0.16  0.28  0.14  0.11  0.60  0.19  0.51  0.16  0.15  0.26  0.48  0.76  0.36
Pattern    HH    LH    LL    LM    ML    LL    HH    ML    MH    LL    LL    MM    MM    HH    LM

Table VI
Distribution of Responses for each FCI Question (the number of students selecting each of the five choices, A-E, on each of the 30 questions, with per-question response totals of 19-21)

Note. The correct answer is noted in bold.

Non-Peak Situations

A noticeable result worth discussing is the seven questions that had an LL response pattern. As mentioned earlier, Bao and Redish (2001) suggested that LL response patterns could mean one of three things: (1) none of the distracters for that question reflect a common student concept system; (2) there is no common student concept system for the context of the question; or (3) all the choices correspond well with student concept systems and the students are using all the systems equally. It has already been noted in this paper that the first possibility is unlikely for the FCI.
Therefore, for each of the LL patterns in the data, either there was no common student concept system within the sample or all of the choices somewhat evenly represented the sample's concept systems. Unfortunately, a close look at the data reveals that the seven questions do not probe the students' understanding of the same physics topics. That would have been an advantageous outcome; had it occurred, it might have been easier to make claims regarding the students' overall understanding of that particular topic. But this is not the case. As an example, questions 2 and 9 require that the students understand basic kinematics (i.e., the equations of motion; distinguishing between position, velocity, and acceleration; trajectories of motion). However, these questions set up extremely different scenarios; it will be difficult to draw conclusions about the students' understanding of kinematics. This is compounded by the fact that several other questions on the FCI require an understanding of kinematics (questions 1, 8, 10, 12, 14, 19, 20, and 23). Determining what caused the different response patterns for questions 2 and 9 is difficult. It is possible the students did not understand the questions or miscategorized them; more examples of miscategorization will be discussed in the next section. A look into the other questions with LL response patterns reveals similar results. The only conclusion one can make at this stage is that there is not enough data to make a conclusion. Without additional information, specifically follow-up interviews that probe student concept systems further, little can be said about why the students answered these seven questions the way they did (Bao & Redish, 2001). The context of each of these questions is different. It is possible that, given the context, the students did not know the physics needed to solve the problems. The purpose of this study was to discover coherent conceptions of physics.
Looking at each of these LL response pattern questions individually may yield interesting results, but it will not provide information about coherent student understandings of Newtonian physics, at least not without additional data. However, many of these LL questions are useful in the context of the other findings in this paper.

Miscategorization of the Problem

The students in this sample were not trained physicists and, therefore, they were not expected to score well on the FCI. However, there are instances where the students answered certain questions correctly but answered other, conceptually equivalent questions incorrectly. The significance of these results lies in the questions themselves: expert categorization of these questions finds that the physics principles needed to solve them are the same. Bransford et al. (2000) reported previous findings that compared the differences in how physics experts and novices sort problems according to their appropriate problem-solving method. The research found, "experts' problem piles are arranged on the basis of the principles applied to solve the problems" (Bransford et al., 2000, p. 38). Conversely, "novices tend to characterize physics problems as being solved similarly if they 'look the same' (that is, share the same surface features)" (Bransford et al., 2000, p. 39). As seen in the data gathered in this study, there were several occurrences of miscategorization by the students. A number of questions on the FCI test for an understanding of Newton's First Law, but they do so using different scenarios. Questions 10 and 24 describe situations where there is a moving object with no external forces acting on it (in the direction of the object's motion); therefore, the object does not experience any acceleration. Both questions ask about the speed of the object; the correct response is that the speed of the object is constant.
For question 10, the majority of the students did not choose the correct answer; this question received an MM response pattern using the concentration analysis. Using Table VI, we see that for question 10 exactly 60% of the students believed the object's speed would continuously increase. Conversely, the majority of the students answered question 24 correctly: 67% of the responses were correct and this question had an MH response pattern. Question 17 also tested for an understanding of Newton's First Law. This question described an elevator moving at a constant speed and asked about the relationship between the tension force and the gravitational force. The correct answer is that the forces are equal; a constant speed indicates no acceleration. The majority of students (86%) answered incorrectly; they selected an answer that indicated their lack of understanding of Newton's First Law. This raises an interesting question: what differences did the students see in these questions to produce such different answers? A Newtonian thinker would recognize these questions as identical, yet the physics students did not. Question 25 also covers, in a slightly different context, Newton's First Law. The question sets up a scenario where a woman is applying a constant force to a box that moves the box forward at a constant speed. The question asks the students to relate the force applied by the woman to the other forces present (though not identified explicitly) in the problem. A Newtonian thinker would recognize that a constant speed implies there is no net force on the box; experts would know that the force exerted by the woman was being "canceled out" by a resistive force, like friction. Although this problem also involved applying Newton's First Law, the students were distributed fairly evenly across the possible answers. The plurality of students (38%) believed the force applied by the woman exceeded the total force that resists the motion of the box.
It is worth noting, though, that 33% of the students got this question correct. The other 6 students related the force applied by the woman to the weight of the box, implying either a misunderstanding of the two-dimensional nature of forces or a misunderstanding of friction's role in this situation. Do the students not understand Newton's First Law, or are they mischaracterizing the problem (as perhaps a Newton's Second Law problem, or something else entirely)? It is difficult to say conclusively without additional data. But it does appear that something peculiar was occurring that caused the students to perform well on some questions and poorly on other, indistinguishable questions.

Sample-wide Understandings

FCI results can provide evidence of coherent student concept systems and there are examples in this study (Savinainen & Viiri, 2003). The results for questions 4 and 29 have an interesting similarity; the students' responses were distributed on only two of the five choices for both of these questions. Most of the students selected the right answer (71% and 86%, respectively) and the remaining students selected the same incorrect answer. These results tell us something interesting about the small percentage of students who answered incorrectly: they had the same alternative conceptions of physics, in the context of this question. An interesting question, which cannot be answered with the data collected in this study, is whether they entered the physics course with these identical concept systems or whether something in the instruction of the course developed these alternative conceptions about physics. Another similarity between questions 4 and 29 is that they both require an understanding of Newton's Third Law. Question 4 relates to Newton's Third Law through a scenario of a head-on collision between a (heavy) truck and a (light) car.
Question 29 describes a chair sitting motionless on a floor and asks the students to identify the forces the chair experiences (the downward force of gravity, the weight, and the upward force exerted by the floor, the normal force). The students appear to recognize problems involving Newton's Third Law and they seem to be applying the physics ideas appropriately. But do any other questions on the FCI probe the students' knowledge of Newton's Third Law? Both questions 15 and 16 relate to Newton's Third Law, though they do so using a slightly different context than questions 4 and 29. In these two questions, a small car is pushing a large truck along a road. For any scenario like this, the forces exerted by each object on the other object will always be equal in magnitude. Question 15 complicates the situation by including acceleration. This problem says the small car is accelerating as it pushes the large truck, but this fact is irrelevant for Newton's Third Law. This piece of extraneous information is included intentionally; the question attempts to mislead a non-Newtonian thinker into believing the force exerted by the car on the truck is less than (or greater than) the force exerted by the truck on the car. For question 15, some students were tempted by a distracter and indicated that they believed the car exerted more force on the truck (24%) than the truck exerted on the car. The majority of the students, however, answered both questions 15 and 16 correctly (57% and 81%, respectively). It seems that regardless of the context, the students were able to recognize and apply the appropriate physics rules to problems involving Newton's Third Law. The only other question not yet discussed that assesses the students on Newton's Third Law is question 28. This question had an MM response pattern and 67% of the students answered correctly. The concentration factor for question 28 was 0.48, which is very close to the 0.50 borderline for "high" concentrations.
It is not a stretch to say the findings support the conclusion that there was class-wide understanding of Newton's Third Law.

Sample-wide Misunderstandings

The results for questions 5, 6, 7, and 18 are extremely interesting. These four questions are the only questions on the FCI that involve the physics of circular motion. Questions 6 and 7 have an HH response pattern; the majority of the students (86% and 71%, respectively) chose the correct answer. Questions 5 and 18, though, involve the same physics phenomenon and have LL response patterns. How can the students be correct, with a high concentration factor, only half of the time? What distinguishes these four questions from one another? Questions 5 and 18 require the students to identify the forces acting on an object in circular motion. Questions 6 and 7 ask the students to choose the correct trajectory for two different objects undergoing circular motion. Conceptually, there are no differences between these two types of problems, and a physics expert would recognize this, but this is not as obvious to the novice student. These results seem to be another example of student miscategorization, but they are also an interesting example of a sample-wide misunderstanding of the forces present in basic circular motion problems. These two incorrectly answered circular motion questions make up less than 7% of the entire FCI, but one cannot ignore the significance of these results. Clearly this sample of students did not have a correct understanding of the forces involved in circular motion problems. As mentioned earlier, question 17 requires an understanding of Newton's First Law. This question uses a scenario involving an elevator moving at a constant velocity; the tension force in the elevator cable and the gravitational force on the elevator cancel and, therefore, the elevator has no acceleration.
This question was the only question with an LH response pattern in the data; most of the students (86%) selected the same wrong answer, only three students selected the right answer, and no other choices were selected. This response distribution clearly illustrates a strong and coherent sample-wide misunderstanding of Newton's First Law, at least in the context of this question. Furthermore, since the students all chose the same incorrect answer, the alternative student concept system can be identified: 86% of the students believed the upward (tension) force exerted by the cable is larger than the downward force of gravity. It is clear that, in the context of this question, the students did not understand that the elevator could have an upward velocity without also having unequal forces; therefore, they misunderstood Newton's First Law.

Total Scores on the FCI

Figure I is a histogram of student scores on the FCI. Measured out of 100%, these scores are the percentage of correct answers on the FCI. One of the developers of the 1995 version of the FCI claimed it has "fewer ambiguities and a smaller likelihood of false positives" (Hake, 1998, p. 14) than its predecessor. The previous version of the FCI and the MDT were known to have very few occurrences of correct answers for incorrect reasons (false positives); it is reasonable to assume the same is true for the revised version. Interestingly, many physicists look at the test and claim the questions are too obvious and hardly worth asking (Hestenes & Halloun, 1995). Of course, this makes a poor FCI score, referred to as a negative response, all the more illuminating. A positive response to a single question is much less informative than a negative one; the chance that an answer is a false positive decreases as more questions are answered correctly. But false positives are difficult to remove completely; even random choices have a 20% chance of being false positives (Hestenes & Halloun, 1995).
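The 20% figure is simply the chance that one of five choices is right. A short binomial calculation (a sketch, assuming the 30-question, five-choice format of the 1995 FCI) shows how quickly random false positives wash out over the whole test:

```python
import math

def p_random_at_or_above(threshold, n=30, p=0.2):
    """Binomial upper tail: chance that pure guessing (p = 1/5 per
    question) scores at or above `threshold` (a fraction of n)
    on an n-question test."""
    k_min = math.ceil(threshold * n)
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

# A single guessed question is a false positive 20% of the time, but a
# guesser's chance of reaching even a 60% overall score is negligible:
print(p_random_at_or_above(0.60))
```

This is why a positive response to a single question says little while a consistently high total score says much more.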
However, a completely non-Newtonian thinker may tend to score even lower because of the carefully constructed distracters (Hake, 1998). Regardless, a near perfect score is considered a strong indicator of Newtonian thinking; an FCI score of 85% is considered the 'mastery threshold' for Newtonian thinking, and a score of 60% is regarded as the 'entry threshold' to Newtonian physics (Hestenes & Halloun, 1995; Hestenes & Wells, 1992). Below that limit, students' grasp of Newtonian concepts is insufficient for effective problem solving. Few doubt the 60% 'entry threshold' for Newtonian thinking; it is clear that the FCI can tell researchers when a student does not understand Newtonian force concepts, but a high score does not seem to guarantee complete mastery of the subject (Huffman & Heller, 1995).

Figure I - Histogram of Student Scores (counts of students in ten-point score bins, 0-10 through 91-100)

This issue was particularly interesting to Sanjoy Mahajan at Cambridge University after he administered the FCI to a small group of students (Mahajan, 2003). All ten of his students performed extremely well on the exam; the average score was 92% and the lowest score was 83%. He was rather suspicious of these results, so he attempted to substantiate them by assigning free-response problems, which required Newtonian thinking, to the students in weekly sessions. He found that the students still had some serious issues with Newtonian thinking; many harbored some of the same common sense alternative conceptions the FCI is intended to identify. Most of the assigned free-response problems were more difficult than FCI problems, but Mahajan's findings still raise concerns about whether the FCI can be used to identify mastery in Newtonian thinking. It is possible that a high score on the FCI does not translate to Newtonian mastery in unfamiliar contexts.
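The two thresholds can be expressed as a small classifier. The level names below are my own shorthand for the 85% 'mastery' and 60% 'entry' thresholds of Hestenes and Halloun (1995):

```python
def newtonian_level(score, entry=0.60, mastery=0.85):
    """Classify a fractional FCI score against the Hestenes & Halloun
    (1995) thresholds: 60% entry, 85% mastery.  Level names here are
    shorthand, not the original authors' terminology."""
    if score >= mastery:
        return 'mastery'
    if score >= entry:
        return 'entry'
    return 'below entry'

# Mahajan's lowest scorer (83%) clears the entry threshold but not mastery:
print(newtonian_level(0.83))   # 'entry'
```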
For this study, the majority of the students (62%) performed at the entry threshold level or below. Two of the students (10%) scored a 90% on the FCI, a surprisingly high score for introductory physics students. These two students surpassed the mastery threshold and could be considered Newtonian thinkers. There was no follow-up to verify student understandings, as there was in Mahajan's study; therefore, it is impossible to say with certainty that these students are Newtonian thinkers. For this reason, it is more useful for the purposes of this study to keep to the concentration analysis findings. Not only are there interesting comparisons to be made, but the concentration analysis allows the researcher to analyze a smaller unit of data, the individual question.

Implications for Instruction

As alluded to earlier, there are enormous practical implications for research of this sort. In fact, the study described in this paper could be replicated in any introductory physics classroom and would yield valuable results for the instructor. There are three categories of physics concept systems - Aristotelian, Impetus, and Newtonian - represented on the FCI. The choices made by each student on each question inform the instructor (or researcher) of the student's physics concept system. Better still, the results of a concentration analysis provide an instructor with overall demographics for the physics concept systems present in his classroom. These demographics would provide incomparable information that the instructor could use to plan his lessons. The data on sample-wide understandings and misunderstandings can provide the instructor with useful information about his class's concept systems. A typical introductory physics course at a large university enrolls many students; it would be impossible for the instructor to thoroughly assess the prior knowledge of each one.
Having the students take the FCI, and then performing a concentration analysis on the data, could provide insight into the class's conceptual understanding of basic Newtonian physics, much as it did in this study. Additionally, instructors could use the FCI to gauge the quality of their instruction. This was, in fact, the FCI's primary purpose upon its introduction into the physics education research community (Hake, 1998; Hestenes & Halloun, 1995). By administering the FCI as a posttest in a basic physics course, an instructor could see how many of the Newtonian physics concepts were retained by the students. He could then use this information to improve his lessons before teaching his next group of students. There are a variety of statistical analyses that can be done with data like those collected in this study; unfortunately, the sample used here was too small for many of them. In the future, a study of this type could include a 30-dimensional factor analysis to investigate whether the data reduce to fewer dimensions. A reduction could indicate that there are as few as, say, five predictors for student understanding of Newton's Laws. This could be extremely useful information for practicing physics instructors. By knowing these predictors, instructors could respond to student difficulties more efficiently and build stronger, more productive curricula.

Summary

Clearly there are a plethora of uses for the FCI in physics education research. This study illuminates only the smallest area of this field. Future research in this area should replicate this study with a much larger sample. As mentioned before, there are a variety of interesting statistics that could be applied to the data and would likely reveal enlightening information. Future research should also attend to the main limitation of this study: the lack of qualitative data.
Most strong studies involving the FCI include interview data from the students. Mixed-methods studies are becoming more popular in educational research (Howe, 2004); educational researchers value the way qualitative data can support (and question) quantitative data. This replication study would have had stronger and more applicable results if some of the students (particularly those with the lowest and highest scores) had been interviewed.

References

Arons, A. B. (1997). Teaching Introductory Physics. New York: Wiley.

Bao, L. & Redish, E. F. (2001). Concentration Analysis: A Quantitative Assessment of Student States. American Journal of Physics, 69(7), S45-S53.

Bransford, J. D., Brown, A. L., & Cocking, R. R. (Eds.) (1999). How People Learn: Brain, Mind, Experience, and School (Expanded edition). Washington, D.C.: National Academy Press.

Chinn, C. A. & Brewer, W. F. (1993). The role of anomalous data in knowledge acquisition: A theoretical framework and implications for science instruction. Review of Educational Research, 63, 1-49.

diSessa, A. & Sherin, B. (1998). What changes in conceptual change? International Journal of Science Education, 20(10), 1155-1191.

Gall, M. D., Borg, W. R., & Gall, J. P. (2003). Educational Research: An Introduction (7th edition). Boston, MA: Allyn & Bacon.

Griffiths, D. (1997). Millikan Lecture 1997: Is there a text in this class? American Journal of Physics, 65, 1141-1143.

Hake, R. (1998). Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses. American Journal of Physics, 66, 64-74.

Halloun, I. B., Hake, R., Mosca, E. & Hestenes, D. (1995). Force Concept Inventory (Revised 1995). Online at http://modeling.la.asu.edu/modeling.html. Accessed on 21 October 2004 (password protected).

Halloun, I. B. & Hestenes, D. (1985a). The initial knowledge state of college physics students. American Journal of Physics, 53(11), 1043-1055.

Halloun, I. B. & Hestenes, D. (1985b). Common sense concepts about motion. American Journal of Physics, 53(11), 1056-1065.

Hestenes, D. (1979). Wherefore a Science of Teaching? The Physics Teacher, 17, 235-242.

Hestenes, D. & Halloun, I. B. (1995). Interpreting the Force Concept Inventory: A response to Huffman and Heller. The Physics Teacher, 33, 502-506.

Hestenes, D. & Wells, M. (1992). A Mechanics Baseline Test. The Physics Teacher, 30, 159-166.

Hestenes, D., Wells, M. & Swackhamer, G. (1992). Force Concept Inventory. The Physics Teacher, 30, 141-158.

Howe, K. (2004). A Critique of Experimentalism. Qualitative Inquiry, 10(1), 42-61.

Huffman, D. & Heller, P. (1995). What does the Force Concept Inventory actually measure? The Physics Teacher, 33(3), 138-143.

Ioannides, C. & Vosniadou, S. (2002). The changing meanings of force. Cognitive Science Quarterly, 2(1), 5-62.

Mahajan, S. (2003). Observations on Teaching First-Year Physics. A report to the Cavendish teaching committee, Cambridge University. Online at http://wol.ra.phy.cam.ac.uk/sanjoy/teaching/tc-report/. Accessed on 30 October 2004.

Pintrich, P. R., Marx, R. W., & Boyle, R. A. (1993). Beyond Cold Conceptual Change: The Role of Motivational Beliefs and Classroom Contextual Factors in the Process of Conceptual Change. Review of Educational Research, 63(2), 167-199.

Savinainen, A. & Scott, P. (2002). The Force Concept Inventory: a tool for monitoring student learning. Physics Education, 37(1), 45-52.

Savinainen, A. & Viiri, J. (2003). Using the Force Concept Inventory to Characterise Students' Conceptual Coherence. In L. Haapasalo & K. Sormunen (Eds.), Towards Meaningful Mathematics and Science Education: Proceedings of the IXX Symposium of the Finnish Mathematics and Science Education Research Association. Bulletin of the Faculty of Education, No. 86, University of Joensuu, pp. 142-152.