This article was downloaded by [Calvin College Seminary] on 14 September 2011. Publisher: Routledge (Informa Ltd).

International Journal of Science Education
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tsed20

A Research Methodology for Studying What Makes Some Problems Difficult to Solve
Ozcan Gulacar (Department of Chemistry, Southern Connecticut State University, New Haven, USA) and Herb Fynewever (Department of Chemistry and Biochemistry, Calvin College, Grand Rapids, USA)
Available online: 16 Dec 2009

To cite this article: Ozcan Gulacar & Herb Fynewever (2010): A Research Methodology for Studying What Makes Some Problems Difficult to Solve, International Journal of Science Education, 32:16, 2167-2184
To link to this article: http://dx.doi.org/10.1080/09500690903358335
RESEARCH REPORT

A Research Methodology for Studying What Makes Some Problems Difficult to Solve

Ozcan Gulacar (Department of Chemistry, Southern Connecticut State University, New Haven, USA) and Herb Fynewever* (Department of Chemistry and Biochemistry, Calvin College, Grand Rapids, USA)

We present a quantitative model for predicting the level of difficulty subjects will experience with specific problems. The model explicitly accounts for the number of subproblems a problem can be broken into and the difficulty of each subproblem. Although the model builds on previously published models, it is uniquely suited for blending with qualitative methods for the study of problem-solving processes rather than being limited to examination of final answers only. We illustrate the usefulness of the model by analysing the written solutions and think-aloud protocols of 17 subjects engaged with 25 chemical stoichiometry problems. We find that familiar themes for subject difficulty are revealed, including mapping of surface features, lack of an interconnected knowledge hierarchy, and algorithmic operations at the expense of conceptual understanding.

Keywords: Chemistry education; Qualitative research; Quantitative research; Reasoning; Science education

Introduction and Model

While it is relatively easy to find problems that are difficult for learners to solve, it is much trickier to pin down what it is about difficult problems that makes them difficult. Indeed, to sort a set of problems from easiest to most difficult, one can simply give the problems to a set of learners and rank the problems according to the students' rate of success.
More sophisticated analysis can evaluate the problems according to their ability to discern between high-achieving and low-achieving learners. But to understand why some problems are difficult or discerning requires a much deeper analysis. In this paper we will examine some of the available models for analysing and predicting problem difficulty. We will then build on these models to present a new model which we feel can naturally be coupled with qualitative methods to facilitate detailed study into what makes some problems more difficult than others. We will go on to illustrate the use of this model by applying it to a set of chemical stoichiometry problems.

*Corresponding author. Department of Chemistry and Biochemistry, Calvin College, 1746 Knollcrest Circle, Grand Rapids, MI 49546, USA. Email: herb.fynewever@calvin.edu
ISSN 0950-0693 (print)/ISSN 1464-5289 (online)/10/162167-18 © 2010 Taylor & Francis DOI: 10.1080/09500690903358335

Throughout this paper we will focus on multistep, modular problems. By this we mean problems whose successful solution requires the solution of several smaller subproblems. For example, consider the traditional jump-rope rhyme (Abrahams, 1969):

Teddy Bear, Teddy Bear, go upstairs,
Teddy Bear, Teddy Bear, say your prayers,
Teddy Bear, Teddy Bear, turn out the light,
Teddy Bear, Teddy Bear, say "good-night."

This problem—going to bed—can be thought of as a modular, multistep problem with four smaller subproblems. One might say that the teddy bear only successfully solves the problem of properly going to bed if all four subproblems are done successfully. This problem is modular in the sense that several of the subproblems are used in the context of other problems—for example, there likely are many other problems in the teddy bear's life where it might be necessary or appropriate to go upstairs, say prayers, or turn out lights.
Multistep, modular problems are common in science classrooms and laboratories across disciplines. For example, in mechanics problems students may have to draw free-body diagrams, add force vectors, apply Newton's second law to calculate an acceleration, and apply equations of motion to find a final velocity. In a biology lab, students may have to prepare a buffer, pulverise tissue, extract DNA, filter the extraction, and precipitate the DNA from solution using cold ethanol. When doing chemistry homework, students may have to complete and balance a chemical equation, convert reactant mass to quantity of matter of reactants, use a stoichiometric ratio to find quantity of matter of product, and convert product quantity of matter to mass. In each of these examples, several subproblems must be done to successfully solve the problem. Yet each problem is modular in the sense that the ability to accomplish each subproblem will likely be useful to the students for some other problems in other contexts within the discipline.

In developing an understanding of what can make certain multistep, modular problems difficult to solve, we first consider the model developed by Johnstone and El-Bana (Johnstone, 1984; Johnstone & El-Bana, 1986, 1989). Consistent with other research in problem solving (Sherrill, 1983; Stamovlasis & Tsaparlis, 2000; Sweller, 1988), this model presumes that the primary driver behind multistep problem difficulty is problem complexity. The Johnstone–El-Bana model predicts that as problems require more and more steps (i.e. subproblems) for the learner to come to a solution, the problem will be more and more difficult for the learner to solve. The rationale behind the Johnstone–El-Bana model centres on the learner's mental capacity.
It is assumed that in order to solve a problem of many subproblems, the learner must use working memory to simultaneously recall needed knowledge from long-term memory and to process the demands of each subproblem (Johnstone, 1997). And so, Johnstone and El-Bana predict, when a high number of subproblems overwhelms the mental capacity of a learner, the learner will almost certainly fail to solve the problem. To understand the Johnstone–El-Bana model in a simple context, let's return to the teddy bear's task of going to bed. As the teddy bear gains in fine motor skills, it might be required to brush its own teeth and put on its own pajamas. But now the "going to bed problem" is a six-step problem rather than a four-step problem. According to the Johnstone–El-Bana model, the teddy bear is predicted to fail in going to bed properly (e.g. not all six steps will be done well, if at all) if its mental capacity only allows for success on four-step problems. In large-scale, quantitative studies the Johnstone–El-Bana model has been shown to make excellent predictions. As seen in Figure 1, when the student success rate is graphed versus the number of subproblems in a problem, the success rate shows a precipitous drop-off. This drop-off occurs at the point where the number of subproblems exceeds the average mental capacity of the subjects in the study (mental capacity was independently measured by the digit span backwards test (Miller, 1956)).

Figure 1. Johnstone and El-Bana model (Johnstone, 1984). Used with permission from the Journal of Chemical Education, Vol. 61, No. 1, 1984, pp. 847–849, copyright © 1984, Division of Chemical Education, Inc.
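The qualitative shape of the drop-off described above can be captured in a toy sketch. Everything numeric here is hypothetical (the plateau values are ours, not data from any study); the sketch only illustrates the claim that success collapses once the number of subproblems exceeds mental capacity.

```python
# Illustrative sketch of the Johnstone-El-Bana prediction: a solver whose
# mental capacity (e.g. a digit-span-backwards score) is X succeeds reliably
# on problems with Z <= X subproblems and rarely when Z > X.
# The 0.9 / 0.1 plateau values are hypothetical, chosen only to mimic the
# step shape of the published success-rate curve.

def predicted_success_rate(z_subproblems: int, mental_capacity: int) -> float:
    """Return a hypothetical success rate (0-1) for a Z-step problem."""
    return 0.9 if z_subproblems <= mental_capacity else 0.1

capacity = 4  # e.g. the teddy bear handles four-step bedtime routines
rates = {z: predicted_success_rate(z, capacity) for z in range(1, 7)}
# Success stays flat through Z = 4, then drops sharply at Z = 5.
```

The single threshold is of course a caricature; the published curves show a steep but finite drop around the average capacity of the cohort.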
Despite its success, the Johnstone–El-Bana model has some important limitations. The model implicitly assumes that all of the subproblems involved are familiar to the students and of equal difficulty. This excludes many situations of interest, such as when problem assignments introduce new material or reinforce relatively new or difficult material. As stated by Johnstone elsewhere, a lack of familiarity with problem-solving methods is a major component of what prevents students from demonstrating expertise (Johnstone, 2001). The Johnstone–El-Bana model is also limited in that it only concerns itself with a right or wrong final solution. The model cannot be used, in its present form, to study the process a subject goes through when coming to a solution. Consider the following simple example. Suppose a student with a mental capacity of "four" successfully completes four subproblems out of a problem that requires five subproblems. The student would not get a correct final solution and the Johnstone–El-Bana model would (correctly) predict the student's failure. Still, this analysis glosses over perhaps important information regarding the student's success on 80% of the subproblems of the problem. And, as we shall see later, examining partial solutions can provide useful insight for the educator.

Tsaparlis built upon the Johnstone–El-Bana model and removed the limiting approximation that all subproblems be treated as equally difficult (Tsaparlis, 1998). In his model, the difficulty of each subproblem is measured with separate, one-step exercises and is accounted for explicitly.
The percent probability that a subject will succeed on a given problem with Z subproblems is predicted to be:

P_p = (1/100)^(Z-1) ∏_{i=1}^{Z} P_i = (1/100)^(Z-1) P_1 P_2 ⋯ P_Z

where P_p is the predicted percentage of students who will get the problem right, P_i is the percentage of students who get subproblem i correct, and Z is the total number of subproblems in the problem. This probabilistic model implicitly assumes that the P_i for each subproblem can be determined independently of its context, i.e. that each factor in the product is decoupled from the other factors.

To motivate the Tsaparlis model, let's consider again the original four-step "going to bed" problem presented to the teddy bear. Imagine that the teddy bear is tested on each of the four subproblems in isolation and is found to have a success rate of 90% for going upstairs; that is, when asked simply to go upstairs, the teddy bear makes it to the top of the stairs 9 out of 10 times without mishap. Similarly, suppose success rates are 70% for saying prayers, 95% for turning out lights, and 100% for saying good night when required. The Tsaparlis model would then predict that on any given night, when presented with the going to bed problem, the teddy bear's probability of success is just the product of the probability of doing each of the four problems in isolation: P = (1/100)^3 (90)(70)(95)(100)% ≈ 60%.

In comparison to the Johnstone–El-Bana model, the Tsaparlis model is attractive in that it acknowledges that not all subproblems are equally difficult and that students' performances can be inconsistent from one problem to the next: for example, they might sometimes be able to solve six-step problems but sometimes fail to solve seemingly uncomplicated two-step problems. The Tsaparlis model has not been studied as extensively as the Johnstone–El-Bana model, but it has been shown to make accurate predictions.
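The Tsaparlis prediction is straightforward to compute. The sketch below reproduces the teddy-bear example from the text; the function and variable names are ours.

```python
from math import prod

def tsaparlis_success(subproblem_rates_percent):
    """Predicted percent success on a Z-step problem, given the percent
    success rate for each subproblem measured in isolation:
    Pp = (1/100)^(Z-1) * P1 * P2 * ... * PZ."""
    z = len(subproblem_rates_percent)
    return (1 / 100) ** (z - 1) * prod(subproblem_rates_percent)

# The four-step "going to bed" problem: 90% upstairs, 70% prayers,
# 95% lights, 100% good night.
p = tsaparlis_success([90, 70, 95, 100])  # ~59.85, i.e. about 60%
```

Note that a one-step problem reduces to its single subproblem rate, since the (1/100)^(Z-1) factor becomes 1.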
And, although more sophisticated than the Johnstone–El-Bana model, the Tsaparlis model is still limited to evaluating final answers as right or wrong. For this reason, neither model is appropriate for direct examination of subject problem-solving processes. To more finely parse where students are successful and where they are not, we propose a new model that builds on those of Johnstone and El-Bana and of Tsaparlis but with some important differences:

● Similarly to Tsaparlis, we go beyond the Johnstone–El-Bana model in that we do not assume that all of the subproblems are equally difficult. Instead, each subproblem type has its own success rate that figures into the success rate for each problem.
● Unlike both models, we determine the success rate for each subproblem type from subproblems that are embedded within the problems being studied, most of which are multistep problems. Hence, in an average way, our model accounts for how the context of a problem can affect student performance on a subproblem over and above what that performance would be on the same type of subproblem in an isolated single-step problem.
● Our model does not predict an overall success rate for final answers to a problem. Rather, by considering all subproblems of each problem explicitly, we predict an average "partial credit" score for each problem. With this approach, rather than an "all or nothing" approach, we are better able to study the process of problem solving. This innovation retains more information for each problem solution and is naturally integrated with qualitative methods which study each solution in detail.

Our partial credit model is:

P_PC = (1/Z) ∑_{i=1}^{Z} P_i = (P_1 + P_2 + ⋯ + P_Z)/Z

where P_PC is the predicted percent partial credit the students will obtain for a given problem, Z is the number of subproblems needed to solve the problem, and P_i is the percentage chance that students who get a subproblem of the same type as subproblem i will solve it correctly.
Each P_i is calculated by taking an average of the percent success rates for all subproblems of the same type in all problems for all students. The P_i values can only be determined a posteriori. We note that P_PC for a given problem is simply the average of the P_i for all the subproblems that make up the problem.

The usefulness of our model comes primarily from the comparison of the predicted partial credit (PPC) with the actual partial credit (APC) for a given problem. The actual partial credit for a given problem is calculated in the same way as PPC, but using P_i data only from the problem under consideration (i.e. there is an average over all students but not over all problems). A comparison of PPC with APC gives a measure of the unexpected difficulty for a given problem. We signify this difference with a capital delta:

Δ = PPC − APC

The larger the Δ is, the greater the level of unexpected difficulty in the problem. By "unexpected," we mean difficulty that is beyond what might be expected because of the number of subproblems in the problem or because of the difficulty of the subproblems in the problem. We emphasise that both of these sources are taken into account in the calculation of PPC. Once the Δs for all problems have been calculated, those with a high Δ are effectively flagged for further study. These are the problems that will likely be most interesting to educational researchers who are engaged in evaluating what makes some problems particularly difficult for students. By coupling the calculation of Δ with another technique, such as a think-aloud protocol, one can make significant progress towards understanding what makes some problems more difficult than others. We will illustrate the use of this method as it can be applied to chemical stoichiometry problems in this paper.
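The whole PPC/APC/Δ pipeline can be sketched in a few lines. The function names are ours, and the example problem at the bottom is hypothetical (it is not one of the 25 problems in the study), though its per-type rates are taken from Table 1 later in the paper.

```python
from statistics import mean

def predicted_partial_credit(subproblem_types, type_rates):
    """PPC: average of the global per-type success rates P_i over the Z
    subproblems that make up the problem."""
    return mean(type_rates[t] for t in subproblem_types)

def unexpected_difficulty(subproblem_types, type_rates, observed_rates):
    """Delta = PPC - APC, where APC averages the success rates observed on
    this problem's own subproblems (averaged over students, not problems)."""
    ppc = predicted_partial_credit(subproblem_types, type_rates)
    apc = mean(observed_rates)
    return ppc - apc

# Hypothetical example: a problem built from a mole-concept step, a
# stoichiometric-ratio step, and an equation-writing step, using the
# global rates reported in Table 1 (MC 77%, SR 66%, WEQ 44%).
global_rates = {"MC": 77.0, "SR": 66.0, "WEQ": 44.0}
types = ["MC", "SR", "WEQ"]
observed = [60.0, 50.0, 30.0]  # hypothetical rates seen on this problem only
delta = unexpected_difficulty(types, global_rates, observed)
# PPC ~62.3, APC ~46.7, so Delta ~15.7: flagged for qualitative follow-up.
```

A large positive Δ would flag the problem for closer study of the think-aloud transcripts, which is exactly how the four problems discussed below were selected.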
We note that our study is focused only on the effects of the number of subproblems and the difficulty of those subproblems. There have been important developments regarding both the Johnstone–El-Bana model and the Tsaparlis model that show that a comprehensive understanding of what makes problems difficult to solve must include other factors. In particular, it has been shown that problem novelty makes it more likely that the Johnstone–El-Bana model is valid (Tsaparlis & Angelopoulos, 2000). Further studies have shown that the logical structure of problems can be more important than the number of subproblems (Niaz & Robinson, 1992) and that the effects of this logical structure varied with student developmental level (Tsaparlis, Kousathana, & Niaz, 1998). Finally, a student's ability to disembed information in problems is sometimes more important than the student's mental capacity (Tsaparlis, 2005). We stress that our study is only indirectly related to working memory capacity in that it is focused on the number and difficulty of the subproblems. We certainly acknowledge that this wider body of literature must inform a more comprehensive understanding of what makes problems difficult to solve.

In the remainder of our paper, we describe how we use our model to address the research question: "What makes some problems particularly difficult or easy for subjects to solve?" In our analysis we will revisit some classic research related to this question, including:

● Subjects find problems which require conceptual understanding more difficult than problems which simply require application of an algorithm (Nakhleh & Mitchell, 1993).
● Subjects have an easier time with familiar problems which allow them to "chunk" several steps into a smaller number of steps (Newell & Simon, 1972).
● Subjects have difficulty with problems for which they are ignorant of some basic, low-level facts (Frazer & Sleet, 1984; Herron & Greenbowe, 1986).
We feel it is also important to consider an often-studied research question that is a corollary to our own: "How do successful (or expert) problem solvers differ from unsuccessful (or novice) problem solvers?" This question, too, has many classic answers, including:

● Experts have a more highly connected cognitive structure (Shavelson, 1972).
● Novices have trouble categorising problems according to deep principles, and instead categorise according to surface features (Chi, Feltovich, & Glaser, 1981).

Each of these findings about problem solvers suggests a corollary about problems. For example, if experts have a more highly connected cognitive structure, it seems likely that problems that require the connection of several different concepts will be more difficult. Similarly, problems that require qualitative analysis are probably difficult, as are problems that defy easy categorisation.

Methodology

Subjects

Recruited from a regional, primarily undergraduate, Midwestern university in the USA, the 17 subjects were all registered for a second-semester general chemistry class in spring 2006. Neither of the authors was an instructor for any of the subjects at the time of the research. Originally, 18 subjects were recruited, but one dropped out early in the research due to other time demands.

Think-Aloud Protocol

Our primary method of data collection was think-aloud protocols. During the protocols, all subjects were given several stoichiometry problems. There were four sessions in total, of approximately 45 minutes each. For the duration of these problem-solving sessions, we asked the subjects to think aloud and verbalise their thoughts as much as possible (Heyworth, 1999; Nakhleh & Mitchell, 1993; Tingle & Good, 1990). We also asked some probing questions similar to those used in explicitation interviews (Potvin, 2005).
In this type of interview the interviewer is looking essentially for descriptions of "what is going on in the head" of the subject. An example of a frequently asked question during this type of interview would be: "When you said this (a prediction, for instance), what did you say to yourself at that moment?" As is often said during these interviews, there are no right or wrong answers, just true ones.

During the problem-solving sessions we left open the possibility that the researcher could give a subject a hint if the subject was completely stuck at some point in the development of a solution. We made this choice because we anticipated that some problems might begin with difficult subproblems, and if the students were stuck early in the solution to these problems our ability to collect data on interesting subsequent subproblems would be limited. As we describe below, our analysis of student solutions takes into account the effect of the hint and does not give students credit for subproblems for which they required a hint. All sessions were recorded using a digital voice recorder and digital videocassettes. The audio recordings were later transcribed for further analysis. The digital video recordings were used to understand what students were referring to when making physical gestures.

Determination of the Number of Subproblems

There is no single, consensus method in the literature for quantifying the number of subproblems needed to solve a problem. The method we use follows that of Frazer and Sleet (1984). In their work, Frazer and Sleet treat each problem as a network of subproblems.
These subproblems are defined as "any problem which tests the ability to derive a single item of information by reasoning with the information immediately preceding that item in the network." This method also shares aspects with both the Tsaparlis model (Tsaparlis, 1998) and that used in the Johnstone–El-Bana model (Johnstone, 1984; Johnstone & El-Bana, 1986, 1989). Note that an exact match of methods is not required unless one were to make a direct quantitative comparison between studies, which is not our purpose here. What we do require for this study is a method that can be consistently applied to all of our data.

To determine the number of subproblems for each problem, one of us (Ozcan Gulacar) first solved each problem. This set of solutions was verified by the other author (Herb Fynewever) and was then used as the "standard solutions" from which the number of subproblems would be determined. We broke each solution into subproblems and sorted them into categories according to the method of Frazer and Sleet. Altogether, the 25 problem solutions comprised 116 subproblems in 11 categories.

Coding of Each Step

Altogether we have transcripts and written solutions from 17 students solving 25 problems (425 total solutions). Each of these has been coded on a subproblem-by-subproblem basis, with the solution to each subproblem receiving one of three labels. A subproblem solution was labelled "successful" if the subject performed it correctly. A subproblem was labelled "unsuccessful" for many possible reasons, including the solution being explicitly incorrect, the subproblem being omitted because the subject did not realise it was necessary, the subproblem solution being correct but only because of a guess (i.e. there is evidence in the transcript that the subject was simply guessing), or the subject requiring a hint to solve the subproblem.
Finally, because some of the problems can be solved by multiple methods, the code "not required" was applied to subproblems when a student did not use the same solution method as the standard solutions. Once each subproblem solution was coded, the percent success rate for each subproblem was calculated using:

P_subproblem = 100 N_S / (N_S + N_U)

where N_S and N_U represent the number of students who were successful and unsuccessful with the subproblem, respectively. Note that the responses for which the subproblem was "not required" are not considered in the calculation of the percent success rate.

Results and Discussion

A basic result of our study is the measurement of student success on the different types of subproblems encountered within the 25 problems studied. Table 1 lists these subproblems in order of increasing success.

Table 1. Success rates by subproblem type

Subproblem type        Abbreviation   Number of times encountered   Average success rate (P_i)
Writing equation       WEQ            11                            44%
Empirical formula      EF             3                             45%
Limiting reactant      LR             2                             59%
Percent yield          PY             5                             63%
Stoichiometric ratio   SR             24                            66%
Conservation of mass   CM             3                             76%
Mass percent           MP             8                             77%
Mole concept           MC             48                            77%
Molecular formula      MF             1                             88%
Balancing equation     BEQ            11                            89%

We note that success varies significantly from the most difficult subproblem type (44%) to the least difficult (89%). We take this as confirmation that it is appropriate that a model predicting problem difficulty does not treat each subproblem equally when making predictions. One limitation of these data, however, is that a few subproblem types were encountered only a small number of times in the problems studied. Data for the difficulty of these subproblem types should be considered valid only within the context of these problems and may not be generalisable to stoichiometry problems as a whole.

The usefulness of our model is not so much in cases where it correctly predicts the level of student success. On the contrary, when a problem is much more difficult than would be predicted by the model, we can use this information to "discover" sources of subject difficulty that are beyond the simplistic assumptions of the model. We will illustrate four such discoveries in the discussion of the four problems below as we examine why they are more difficult than the model predicts. The four problems were selected because they (1) had high Δ values, and (2) can all be understood with relatively little chemical knowledge by an audience of general scientists (such as the readership of this journal). As each problem is presented, we will discuss how the subject difficulties discovered are consistent with previous research.

Problem A

An unknown element (M) combines with iodine (I) to form a compound expected, from its location on the Periodic Table, to have the formula M6I2. If 102.58 g of the compound is analysed and found to contain 55.81 g of iodine, what is the atomic mass of M? (Karol, 2005)

This problem turns out to be significantly more difficult than might be expected given the subproblems involved (see Figure 2, Δ = 11.0%). Objectively speaking, it should be one of the easier problems because it includes only four subproblems, and three of those are among the most commonly used types (mole concept and stoichiometric ratio). What is it about this problem that is so difficult? Looking at the details of the student transcripts, we see a striking number of students all making the same mistake. Of the 17 subjects, seven made the most common mistake.
The seven students all started off productively: they subtracted to get the right mass for the metal. But when finishing the problem, unexpectedly, all seven simply divided this mass by the "6" of M6I2 as if it were the quantity of matter (measured in moles) of M that they had.

Figure 2. Typical solution for Problem A. In this and subsequent examples, each box shows the P_subproblem/P_i ratio to illustrate how performance on subproblems in the context of this problem (P_subproblem) compares to the performance on that subproblem type (P_i) averaged over all problems.

They reported this as the molar mass of the metal and were done. Why do they take this path? Two of the students were explicitly not confident. They explicitly state that they don't know what to do with the subscript:

I'm not exactly sure what the 2 or the 6 are for. I don't know if it's the number of moles or the number of molecules … I guess I would go with that … because this number is the amount of M in the compound so then just M would be just dividing it by the sub[script] number. (613)

[Does it] have anything to do with the 6? I'm not sure if this is right, but like if this was on a test or anything I'd just divide this by 6 'cause this would be M6 and that would be the atomic mass. I don't know if that's right though, I don't think it is, but I don't know what else to do. (607)

It seems the students know that they need the quantity of matter of M, they know that the 6 is associated with the M, and they know that the subscript sometimes has something to do with quantity of matter (e.g.
when converting from quantity of matter of one atom in a molecule to quantity of matter of another atom), so they "take a short cut" and just use the subscript for the quantity of matter of M. This is a mapping of surface features, a strategy that novices adopt when stuck (Chi et al., 1981). What is particularly striking about this instance is that in many other problems in our study the students did properly use subscripts. But others who have studied the use of subscripts in detail (Friedel & Maloney, 1995) have shown that an algorithmic facility with subscripts is common in students even when they do not have a deep enough understanding to transfer this knowledge to an unfamiliar situation. They do not have a highly connected cognitive structure (Shavelson, 1972).

The question remains: What is it about the use of subscripts in this particular problem that is unfamiliar? The subjects' solutions provide evidence that the difficulty comes partly from having to work towards an unfamiliar goal: molar mass. Usually, molar mass is a given or is easily found because the compound is a given. This is an unusual problem in that the identity of the M in M6I2 is unknown. And so the subjects will be working with a familiar relationship but will have to turn this relationship on its head: see Figure 3.

Figure 3. Schematic for molar mass calculation

Given that the goal is unfamiliar, the path is unfamiliar and, in this case, difficult to blaze. Perhaps if the mass and the quantity of matter were simply given, the students would be able to find the molar mass. To get to these, though, the subjects must start with a single number (the mass of iodine) and use it in two separate calculations to find the mass of M (by subtraction from total mass) and the quantity of matter of M (by conversion).
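The branching solution path just described can be sketched numerically. The iodine molar mass (126.90 g/mol) is a standard value we supply, since the problem statement leaves atomic masses to a periodic table; the variable names are ours.

```python
# Problem A: M6I2, total sample mass 102.58 g, of which 55.81 g is iodine.
molar_mass_I = 126.90  # g/mol, assumed standard value

mass_I = 55.81
mass_M = 102.58 - mass_I          # branch 1: subtraction -> ~46.77 g of M
moles_I = mass_I / molar_mass_I   # branch 2: conversion -> ~0.4398 mol I
moles_M = moles_I * (6 / 2)       # stoichiometric ratio from M6I2
atomic_mass_M = mass_M / moles_M  # ~35.4 g/mol

# The common wrong turn seen in seven transcripts was, by contrast, simply
# mass_M / 6, treating the subscript itself as the quantity of matter of M.
```

Both branches start from the same single datum (the iodine mass), which is exactly the structure the subjects found unfamiliar.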
This sort of "branching" from one piece of data into two separate calculations constitutes an unfamiliar path. Subjects were more successful with problems that chain forwards using successive conversions. Through the lens of previous research, we take this as confirmation that the unsuccessful problem solvers do not have the highly connected cognitive structure that the successful ones do (Shavelson, 1972). Because they are working towards a goal that is usually a given, the subjects are required to use familiar relationships in an unfamiliar way. This is enough to make the problem difficult, despite the small number of subproblems and the familiarity of each subproblem.

Problem B

Isobutylene is a hydrocarbon used in the manufacture of synthetic rubber. When 0.847 g of isobutylene was analysed by combustion, the gain in mass of the CO2 absorber was 2.657 g and that of the H2O absorber was 1.089 g. What is the empirical formula of isobutylene? (Silberberg, 2006)

What makes this problem difficult (see Figure 4, ∆ = 11.4%)? We note that none of the students knew the formula of isobutylene from memory, so they all pursued the use of the quantitative data—as was presumably the intention of the problem. Only 4 out of 17 students get all parts of this problem correct. Of the remaining 13, five have miscellaneous approaches while eight have nearly identical partial solutions. All eight begin the problem and without hesitation (correctly) find the quantity of matter of the combustion products (moles of CO2 and H2O). These students go on to recognise that the mole ratio for these two molecules is 1:1. After this, however, they make no further progress. This may be surprising, given that the only step remaining is to realise that a 1:1 ratio of CO2 to H2O as products implies a 1:2 ratio of C to H atoms in the hydrocarbon reactant, that is, an empirical formula of CH2.
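The intended solution path is short. Using the data given in the problem and standard molar masses (44.01 g/mol for CO2, 18.02 g/mol for H2O), the whole calculation runs as follows:

```python
# Sketch of the intended solution to Problem B: combustion analysis of
# isobutylene. The conceptual step the eight subjects missed is the conversion
# from moles of product molecules (CO2, H2O) to moles of atoms (C, H).

moles_CO2 = 2.657 / 44.01   # each CO2 molecule carries 1 C atom
moles_H2O = 1.089 / 18.02   # each H2O molecule carries 2 H atoms

moles_C = moles_CO2 * 1
moles_H = moles_H2O * 2

ratio_H_per_C = moles_H / moles_C
print(round(ratio_H_per_C))  # 2, i.e. an empirical formula of CH2

# Sanity check: the C and H found should account for the 0.847 g sample,
# confirming isobutylene contains only carbon and hydrogen.
print(round(moles_C * 12.01 + moles_H * 1.008, 3))  # ≈ 0.847
```

The 1:1 mole ratio of CO2 to H2O that the students correctly found already contains the answer; the missing move is reading the subscripts of the products back onto the reactant.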
This follow-through is so simple mathematically that one might expect the students to be able to work from the 1:1 ratio of CO2 to H2O to the final answer of CH2 in their heads. Why can they not do this?

Figure 4. Typical solution for Problem B

Further examination of the transcripts reveals that, although most of the students are familiar with finding an empirical formula, they are not familiar with finding one within this context (a combustion analysis):

I don't think I've ever learned how to find an empirical formula for information like this, so I'm kind of winging it. [I'm] not sure if these two numbers are significant.

I need to eventually use this number [gestures to quantity of matter of CO2 and H2O] to find the empirical formula. I don't know where to go from here. I don't know how I can use this information to find it.

CO2? H2O? I'm not really sure at this point if the mole has really told me a whole lot.

As further evidence that the students have an algorithm which is context-bound, consider that of the eight who could not find an empirical formula from the 1:1 ratio of CO2 to H2O, all were successful in finding the molecular formula from the mass percent data presented in another problem under study:

During physical activity, lactic acid (M = 90.08 g/mol) forms in muscle tissue and is responsible for muscle soreness. Elemental analysis shows that this compound contains 40.0% C, 6.71% H, and 53.3% O by mass. Determine the molecular formula of lactic acid.

This is a much more familiar type of presentation for an empirical formula problem and, as might be expected, the subjects did very well (∆ = –5.5%). Again we can see connections between our results and previous research.
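The familiar mass-percent algorithm that all eight subjects executed successfully for the lactic acid problem can be sketched directly from the data above:

```python
# Sketch of the mass-percent route to lactic acid's molecular formula
# (M = 90.08 g/mol; 40.0% C, 6.71% H, 53.3% O by mass).

# Moles of each element in a 100 g sample.
n_C = 40.0 / 12.01
n_H = 6.71 / 1.008
n_O = 53.3 / 16.00

# Divide by the smallest to get the empirical ratio: 1 : 2 : 1, i.e. CH2O.
smallest = min(n_C, n_H, n_O)
ratio = [round(n / smallest) for n in (n_C, n_H, n_O)]
print(ratio)  # [1, 2, 1]

# Scale the empirical formula up to the molecular mass: CH2O is about
# 30.03 g/mol, and 90.08 / 30.03 ≈ 3, giving C3H6O3.
empirical_mass = 12.01 + 2 * 1.008 + 16.00
print(round(90.08 / empirical_mass))  # 3
```

Every step here is a rehearsed conversion, which is consistent with the subjects' uniform success on this presentation of the task.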
Our subjects differ from experts in that their knowledge is context dependent and they cannot solve problems that require deep connections in their cognitive structure (Shavelson, 1972). But when presented with another problem of the same type in a more familiar context, they have no difficulty "chunking" the subproblems together to make it into a doable problem (Newell & Simon, 1972). The students are able to begin the problem insofar as it begins with a very familiar algorithm of converting mass to quantity of matter, but they cannot take the more conceptual step of seeing that the quantity of matter of products (moles of CO2 and H2O) implies the quantity of matter of atoms (moles of C and H) in the reactant. That is, they have less trouble with the algorithmic part of the problem than with the conceptual part (Cracolice, Deming, & Ehlert, 2008; Nakhleh & Mitchell, 1993).

Problem C

Methane and ethane are the two simplest hydrocarbons. What is the mass percent of carbon in a mixture that is 40% methane and 60% ethane by mass? (Silberberg, 2006)

While this problem is admittedly complex, the lack of success exceeds even what might be expected (see Figure 5, ∆ = 17.0%). Of the 17 subjects, 10 did not reach a correct solution. Of those 10, five had identical difficulty: they were able to find the quantity of matter of each of the molecules present (moles of CH4 and C2H6) but could not take the next step of finding the quantity of matter of carbon atoms present in each. As with Problem B, this involves only simple ratios (1 C atom : 1 CH4 molecule and 2 C atoms : 1 C2H6 molecule).

Figure 5. Typical solution for Problem C

Why is it that students can find the quantity of matter of a molecule but cannot continue on to find the quantity of matter of atoms within that molecule?
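For reference, the full calculation Problem C requires can be sketched as follows, taking a 100 g sample of the mixture:

```python
# Sketch of the solution to Problem C: mass percent of carbon in a mixture
# that is 40% methane (CH4) and 60% ethane (C2H6) by mass.

M_C, M_H = 12.01, 1.008
M_CH4 = M_C + 4 * M_H       # ≈ 16.04 g/mol
M_C2H6 = 2 * M_C + 6 * M_H  # ≈ 30.07 g/mol

# Moles of each molecule in 100 g of mixture.
n_CH4 = 40.0 / M_CH4
n_C2H6 = 60.0 / M_C2H6

# The step that stopped five subjects: moles of molecules -> moles of C atoms,
# using the subscripts (1 C per CH4, 2 C per C2H6).
n_C_atoms = n_CH4 * 1 + n_C2H6 * 2

mass_C = n_C_atoms * M_C
print(round(mass_C, 1))  # ≈ 77.9 g of C per 100 g, i.e. 77.9% carbon by mass
```

Only the commented step involves any conceptual content; the rest is the mass-to-moles algorithm the subjects executed without hesitation.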
This confirms again that student knowledge is tightly bound to the context in which it was learned and that this knowledge does not have the deep connectedness associated with expertise (Shavelson, 1972), with the case of chemical formula subscripts being a confirmed example (Friedel & Maloney, 1995). Details of this confusion are apparent in the transcripts of the think-aloud protocol. One student simply reached a "dead end" upon coming to the unfamiliar part of the problem:

[After having found quantity of matter of CH4] I don't know how to get the C out, the carbon. I don't know how to just isolate for that.

Other students misapplied the use of subscripts or mass percent in an attempt to convert the quantity of matter of CH4 molecules into quantity of matter of carbon:

I don't know if you can do this, but since there's one C and four H's—a total of 5 parts—I just divided by 5. So then moles of C is 0.5. But I wasn't sure if you could do that, though.

I have 2.5 moles of CH4 … I have five atoms here and that can also be used as a mole ratio. 2.5 times—no divided—by 5. So that equal 0.5.

2.5 moles total of methane, of which 75% is carbon, so I multiply it by 0.75 [and] I get 1.875 moles of carbon.

The fact that subjects are not able to connect from quantity of molecules to quantity of atoms within the molecules shows a truly shallow and unconnected understanding of what a molecule is and what quantity of matter is. One student who initially found the right answer abandons it and reveals a false understanding that the moles themselves need to be conserved:

That doesn't make sense either, because if you put in 3 moles of carbon [including CH4 and C2H6] you have more carbon in this—or moles—more moles in this than in the whole thing [i.e. more moles of carbon atoms than moles of molecules].

Another common mistake in Problem C is confusing % by mass with % by quantity of matter, % by number of atoms within a molecule, or % by volume.
Four out of the 10 incorrect solutions contain variants of this error:

If there is one mole total of everything, then there is 0.4 moles of methane in there.

Four methanes and six ethanes equals 10 total. So 40% and 60%. So that gives me 10 C and 52 H total mixture. Out of that 100% my ratio is 1–4 and 2–6, which would give me 3 [carbons]. So it would be 30%.

I'm just gonna say I have 10 mL of solution, 40% would be 4 mL of methane and 6 mL of ethane.

We see here that students have a very un-nuanced understanding of measures of quantity and how the various measures relate to each other. While it is certainly true that mass, quantity of matter, number of atoms, and volume can all be used to measure a quantity of material, these cannot of course be used interchangeably. This shallow understanding might not be revealed by familiar problems, but it is readily exposed when an unfamiliar problem demands a solution utilising deep understanding and an interconnected knowledge structure.

Problem D

Write a balanced equation for the chemical statement: The destruction of marble statuary by acid rain: aqueous nitric acid reacts with calcium carbonate to form carbon dioxide, water, and aqueous calcium nitrate. (Silberberg, 2006)

Although many instructors may not be terribly concerned with how well their students learn low-level (rote memorised) facts, ignorance of these facts can sometimes be a roadblock to solving a problem. The subproblem type with the lowest level of success in our study was the Writing Equations subproblem. The lack of success comes primarily because this subproblem usually requires the subjects to translate chemical names into chemical formulas, and the students often lack the low-level knowledge to do this successfully. Problem D had the highest unexpected failure rate among these sorts of problems (see Figure 6, ∆ = 16.2%).
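For reference, the equation the problem asks for is 2 HNO3(aq) + CaCO3(s) → CO2(g) + H2O(l) + Ca(NO3)2(aq). Once the formulas are recalled, verifying the balance is purely mechanical, as this sketch shows (atom counts per formula unit are written out by hand):

```python
from collections import Counter

# Atom counts for each species in the Problem D equation:
#   2 HNO3(aq) + CaCO3(s) -> CO2(g) + H2O(l) + Ca(NO3)2(aq)
# Recalling these formulas was the step most students could not complete.
HNO3    = Counter({"H": 1, "N": 1, "O": 3})
CaCO3   = Counter({"Ca": 1, "C": 1, "O": 3})
CO2     = Counter({"C": 1, "O": 2})
H2O     = Counter({"H": 2, "O": 1})
CaNO3_2 = Counter({"Ca": 1, "N": 2, "O": 6})

def total(side):
    """Sum atom counts over (coefficient, formula) pairs for one side."""
    atoms = Counter()
    for coeff, formula in side:
        for element, n in formula.items():
            atoms[element] += coeff * n
    return atoms

reactants = total([(2, HNO3), (1, CaCO3)])
products = total([(1, CO2), (1, H2O), (1, CaNO3_2)])
print(reactants == products)  # True: the equation is balanced
```

The point of the sketch is that balancing itself is algorithmic; the bottleneck observed below is recall of the formulas and ionic charges, not the balancing procedure.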
Fifteen out of 17 subjects were unsuccessful in writing the equation for this problem. In every case, the source of their difficulty was simple ignorance of one or more low-level facts about the names and formulas of the compounds involved. We note that none of these compounds are rare or unusual: the students have without doubt encountered them before, and it is likely that they were required to memorise these formulas and names at some point. Their mistakes are summarised in Table 2.

Figure 6. Typical solution for Problem D

Table 2. Subject difficulties with the Writing Equation subproblem type (nomenclature)

Difficulty                  Number of subjects with difficulty
Formula for nitric acid     12
Formula for carbonate ion    9
Formula for nitrate ion      7
Charge for carbonate ion     6
Charge for nitrate ion      10

Evidence from transcripts (subject quotations):

Formula for nitric acid: "Nitric acid and calcium carbonate. I don't know what those two are. Nitric acid would be nitrogen and hydrogen? HN?" "I have no idea what the formula for nitric acid is." "Nitric acid, I think it's like N, N–I don't remember what the formula is. NH something? NH? NH2?" "A really strong acid is HCl, but so would nitric acid be NCl?" "HNO, let's see. Is it NO4 or NO3?" "The reactant is aqueous nitric acid, which I believe is H3."

Formula for carbonate ion: "Carbonate, hm. Trying to think. [I] don't really remember the, like, carbonates, the nitrates, all that stuff." "I forgot what carbonate was. Calcium carbonate, so then this is CaCO2 I think."

Formula for nitrate ion: "[I] don't really remember the, like, carbonates, the nitrates, all that stuff." "'ate' means it has 3? Is it NO3?" "Nitrate would be like an NO2 or something." "I think I remember sulphate being SO4, so I'm thinking maybe all the 'ates' are SO4 [sic] so maybe I'll put NO4."

Charge for carbonate ion: "I'm almost positive that carbonate is negative 2." "Charge of the carbonate? I don't know." "I don't remember the charge on carbonate, which is sad."

Charge for nitrate ion: "I don't really remember what the NO3 charge is. HNO3. But nitrate is still minus 2?"

The high failure rate on this sort of problem serves as a reminder that novice students can be unsuccessful even on tasks that require only low-level thinking skills. We believe that educational research rightly focuses on the higher-order thinking skills and the highly connected knowledge structure that come with expertise. Still, as educators, we should not overlook the inclusion and integration of low-level content.

Contribution and Limitations

We have developed a new methodology for determining why some problems are more difficult than others for learners. Our methodology builds on previous ones, which take into account the number of subproblems in a problem and the difficulty of each subproblem. The new contributions made by our model are that it examines the level of success on each subproblem rather than simply making predictions about final answers, and that the difficulty of each subproblem is determined within the context of the problems themselves through a posteriori examination. This more detailed quantitative model couples naturally with qualitative methods such as think-aloud protocols. We have illustrated our methodology by applying it to solutions of chemical stoichiometry problems. We have confirmed its usefulness by detecting, within the context of stoichiometry, phenomena that have previously been established in other areas of science learning. These include the context-bound nature of novices' knowledge, the effectiveness of "chunking," and the ease of algorithmic operations relative to conceptual operations. We also highlight how a lack of low-level facts can hinder the solution of some problems, as exemplified by chemical problems that require the use of ionic nomenclature.
Although we anticipate that our methodology will be useful to others, we note that it focuses on the number of subproblems in a problem and the difficulty of those subproblems. Other factors, such as problem novelty, logical structure, and students' developmental level and disembedding ability, should also be considered for a comprehensive view of what makes problems difficult to solve.

References

Abrahams, R. D. (1969). Jump rope rhymes: A dictionary. Austin, TX: University of Texas Press.
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5(2), 121–152.
Cracolice, M. S., Deming, J. C., & Ehlert, B. (2008). Concept learning versus problem solving: A cognitive difference. Journal of Chemical Education, 85(6), 873–878.
Frazer, M. J., & Sleet, R. J. (1984). A study of students' attempts to solve chemical problems. European Journal of Science Education, 6(2), 141–152.
Friedel, A. W., & Maloney, D. P. (1995). Those baffling subscripts. Journal of Chemical Education, 72(10), 899–905.
Herron, J. D., & Greenbowe, T. J. (1986). What can we do about Sue: A case-study of competence. Journal of Chemical Education, 63(6), 528–531.
Heyworth, R. M. (1999). Procedural and conceptual knowledge of expert and novice students for the solving of a basic problem in chemistry. International Journal of Science Education, 21(2), 195–211.
Johnstone, A. H. (1984). New stars for the teacher to steer by. Journal of Chemical Education, 61(10), 847–849.
Johnstone, A. H. (1997). Chemistry teaching: Science or alchemy? Journal of Chemical Education, 74(3), 262–268.
Johnstone, A. H. (2001). Can problem solving be taught? University Chemical Education, 5, 69–73.
Johnstone, A. H., & El-Banna, H. (1986). Capacities, demands and processes: A predictive model for science education. Education in Chemistry, 23, 80–84.
Johnstone, A. H., & El-Banna, H. (1989).
Understanding learning difficulties: A predictive research model. Studies in Higher Education, 14(2), 159–168.
Karol, P. (2005). Carnegie Mellon University document. Retrieved August 1, 2008, from http://www.andrew.cmu.edu/course/09-105/GIF97_1/F05.MIwo.3.gif
Miller, G. A. (1956). The magical number 7, plus or minus 2: Some limits on our capacity for processing information. Psychological Review, 63(2), 81–97.
Nakhleh, M. B., & Mitchell, R. C. (1993). Concept-learning versus problem-solving: There is a difference. Journal of Chemical Education, 70(3), 190–192.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice Hall.
Niaz, M., & Robinson, W. R. (1992). Manipulation of logical structure of chemistry problems and its effect on student performance. Journal of Research in Science Teaching, 29(3), 211–226.
Potvin, P. (2005). The use of explicitation interview in the study of non-scientific conceptions. Paper presented at the annual meeting of the National Association for Research in Science Teaching, April, Dallas, TX.
Shavelson, R. J. (1972). Some aspects of the correspondence between content structure and cognitive structure in physics instruction. Journal of Educational Psychology, 63(3), 225–234.
Sherrill, J. M. (1983). Solving textbook mathematical word-problems. Alberta Journal of Educational Research, 29(2), 140–152.
Silberberg, M. S. (2006). Chemistry: The molecular nature of matter and change (4th ed.). New York: McGraw-Hill.
Stamovlasis, D., & Tsaparlis, G. (2000). Non-linear analysis of the effect of working-memory capacity on organic-synthesis problem solving. Chemistry Education Research and Practice in Europe, 1, 375–380.
Sweller, J. (1988). Cognitive load during problem-solving: Effects on learning. Cognitive Science, 12(2), 257–285.
Tingle, J. B., & Good, R. (1990).
Effects of cooperative grouping on stoichiometric problem solving in high-school chemistry. Journal of Research in Science Teaching, 27(7), 671–683.
Tsaparlis, G. (1998). Dimensional analysis and predictive models in problem solving. International Journal of Science Education, 20(3), 335–350.
Tsaparlis, G. (2005). Non-algorithmic quantitative problem solving in university physical chemistry: A correlation study of the role of selective cognitive factors. Research in Science and Technological Education, 23(2), 125–148.
Tsaparlis, G., & Angelopoulos, V. (2000). A model of problem solving: Its operation, validity, and usefulness in the case of organic-synthesis problems. Science Education, 84(2), 131–153.
Tsaparlis, G., Kousathana, M., & Niaz, M. (1998). Molecular-equilibrium problems: Manipulation of logical structure and of M-demand, and their effect on student performance. Science Education, 82(4), 437–454.