Hidden Stages of Cognition Revealed in Patterns of Brain Activation

John R. Anderson, Aryn Pyke, and Jon M. Fincham
Carnegie Mellon University

This work was supported by National Science Foundation grant DRL-1420008 and a James S. McDonnell Scholar Award. Correspondence concerning this article should be addressed to John R. Anderson, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213 (email: ja@cmu.edu). The data and code for the analyses in this paper are available at http://act-r.psy.cmu.edu/?post_type=publications&p=19042.

Abstract

A new method is described that uses fMRI brain activation to identify when participants are engaged in different cognitive stages on individual trials. The method combines multi-voxel pattern analysis (MVPA) to identify cognitive stages and hidden semi-Markov models (HSMM) to identify their durations. This method, applied to a problem-solving task, identified four distinct stages: Encoding, Planning, Solving, and Responding. This study tests whether these stages correspond to their ascribed functions by testing whether they are affected by the appropriate factors. The duration of the Planning stage increases as the method for solving the problem becomes less obvious, while the duration of the Solving stage increases with the number of calculations needed to produce the answer. The Responding stage increases with the difficulty of the motor actions required to produce the answer. These results confirm the power of HSMM-MVPA to parse complex tasks into meaningful stages and allow us to study the factors affecting each stage.

One of the striking features of the human mind is its ability to engage in complex intellectual operations. Analyzing the structure of such activities has been a challenge for any method, behavioral or neuroimaging.
Given the success of multi-voxel pattern analysis (MVPA) at identifying the structure of concepts (Just, Cherkassky, Aryal, & Mitchell, 2010; Norman, Polyn, Detre, & Haxby, 2006; Pereira, Mitchell, & Botvinick, 2009), it is natural to ask whether it can also be used to track the sequence of mental stages in a complex task. We have developed a method that can do just this -- penetrate into the sequential structure of complex cognition. MVPA can be used to recognize the distinct brain patterns that occur in different mental stages, but to do so one must deal with the highly variable duration of these stages. This paper will consider a problem-solving task in which participants took anywhere from a couple of seconds to 30 seconds to solve the problems. If we knew when participants were in different stages, we could train a pattern recognizer to identify the brain patterns associated with the stages, but we do not know a priori the temporal boundaries of the stages. If we knew what brain patterns characterized each stage, we could use them to estimate when participants were in different stages, but we do not know a priori what these patterns are. The situation we faced was even more uncertain because we were not even sure how many cognitive stages were involved during solving. This new method simultaneously discovers the stages, their associated brain patterns, and their durations on individual trials. Figure 1 shows the output of this method, a parse of 4 trials (out of over 4000) illustrating four stages in solving novel mathematical problems. The color coding reflects the durations of these stages. These particular problems in Figure 1 were chosen because each took seven 2-second scans (14 seconds), but the solvers spent different durations in the 4 stages. For illustration, the figure shows brain regions in a particular slice that tend to be most active in each of these stages.
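The chicken-and-egg problem described above (stage boundaries are needed to learn brain patterns, and brain patterns are needed to find boundaries) is resolved by estimating both at once. The following is a toy sketch of that idea using hard boundary assignments, two stages, and synthetic one-dimensional "scans" -- the actual method instead sums over all possible partitions, uses 20-dimensional patterns, and models durations with gamma distributions; all names here are illustrative:

```python
def best_split(trial, m1, m2):
    # Given the two stage means, choose the boundary k (1..len-1) that
    # minimizes squared error -- i.e., the maximum-likelihood split under
    # equal-variance Gaussian noise.
    best_k, best_err = None, float("inf")
    for k in range(1, len(trial)):
        err = (sum((x - m1) ** 2 for x in trial[:k])
               + sum((x - m2) ** 2 for x in trial[k:]))
        if err < best_err:
            best_k, best_err = k, err
    return best_k

def fit_two_stages(trials, iters=10):
    # Alternate between (a) estimating the shared stage means from the
    # current per-trial boundaries and (b) re-estimating each trial's
    # boundary from the current means.
    splits = [len(t) // 2 for t in trials]  # initialize at the midpoints
    for _ in range(iters):
        s1 = [x for t, k in zip(trials, splits) for x in t[:k]]
        s2 = [x for t, k in zip(trials, splits) for x in t[k:]]
        m1, m2 = sum(s1) / len(s1), sum(s2) / len(s2)
        splits = [best_split(t, m1, m2) for t in trials]
    return m1, m2, splits
```

On synthetic trials whose first stage hovers near 0 and second stage near 2, the alternation recovers both the shared stage patterns and each trial's individual boundary, even though the trials differ in stage durations.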
This figure illustrates how we can use brain activation patterns to identify the period of time when a participant is in a particular mental stage. The method uses a novel combination of hidden semi-Markov models (HSMMs -- Yu, 2010) and MVPA to determine on a trial-by-trial basis how to associate fMRI scans with solving stages. Essentially, the HSMMs complement the MVPA by performing temporal pattern recognition. Earlier work (Anderson, Betts, Ferris, & Fincham, 2010, 2012; Anderson, Fincham, Yang, & Schneider, 2012) showed that this HSMM-MVPA method could accurately recover the stages in tasks where there were external markers of the stage boundaries (e.g., they coincided with overt actions or stimulus events at known time points). More recent work (Anderson & Fincham, 2014; Anderson, Lee, & Fincham, 2014) went a step further and applied these methods to identify the stages in tasks that did not afford such external markers. Without independent information about stage boundaries against which to train our system, we identified the stage structure that minimized the unpredicted variability in the imaging data. We went into this effort without strong expectations about the stage structure that would emerge. Rather, these methods allowed us to discover the stages. While this was promising, we would like to be sure that the stages are meaningful rather than artifacts of the statistical estimation. Therefore, in this paper, we turn to testing predictions about these stages to establish their psychological reality.

Figure 2. An illustration of HSMM-MVPA producing a 4-stage solution. Input to the HSMM-MVPA are scans organized within trials. Each scan consists of the 20 PCA scores. The parameters estimated for each stage are: the 20 PCA means that define the brain signature (individual scans are normally distributed around these means) and the parameters that define the gamma distribution for a stage's duration. Also calculated are the stage occupancies (probabilities that the participant is in each stage on a particular scan).

Figure 2 illustrates what the HSMM-MVPA approach can extract from fMRI data. The process estimates a set of parameters for each stage, including whole-brain patterns of activation called the brain signatures of the stages. A stage's brain signature is the mean activation pattern in that stage (across trials and participants) and can be used to discriminate it from other stages. The HSMM further parses each trial into the scan-by-scan probabilities that the participant is in a particular stage. These stage occupancy profiles provide estimates of stage durations for a trial. The estimated duration of a stage on a trial is calculated as the sum of that stage's probabilities over the scans that define the trial, multiplied by the length of a scan. These trial-by-trial estimates allow us to determine which experimental factors affect the length of which stages.

Current Research

The goal of the current research is to verify the psychological reality of the stages that are revealed by these HSMM-MVPA methods by testing whether experimental manipulations had their effects on the appropriate stages. Participants were taught two new mathematical operators, designated by ↓ and ↑, that mapped pairs of numbers onto values. Down-arrow problems, denoted b↓n=X, can be solved by adding a sequence of n decreasing terms starting with b -- for instance, 8↓3 = 8+7+6 = 21. Up-arrow problems, denoted b↑n=X, can be solved by adding a sequence of n increasing terms starting with b -- for instance, 8↑3 = 8+9+10 = 27.
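The two operators, and the closed-form equations taught to the other half of the participants (given below), can be sketched as follows; function names are illustrative, and the handling of negative n follows the rule from Table 1 that a negative n reverses the direction of addition:

```python
def down(b, n):
    # b↓n: add |n| terms starting at b, decreasing for positive n
    # (a negative n reverses the direction of addition)
    step = -1 if n > 0 else 1
    return sum(b + step * i for i in range(abs(n)))

def up(b, n):
    # b↑n: add |n| terms starting at b, increasing for positive n
    step = 1 if n > 0 else -1
    return sum(b + step * i for i in range(abs(n)))

# The closed forms taught to the Equation group:
# b↓n = b*|n| - (n/2)(|n|-1) and b↑n = b*|n| + (n/2)(|n|-1).
# n*(|n|-1) is always even, so integer division is exact.
def down_eq(b, n):
    return b * abs(n) - n * (abs(n) - 1) // 2

def up_eq(b, n):
    return b * abs(n) + n * (abs(n) - 1) // 2
```

For example, down(8, 3) gives 8+7+6 = 21 and up(8, 3) gives 8+9+10 = 27; the repeated-addition and equation versions agree for positive and negative operands alike.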
However, there are also formulas for solving these problems, and half the participants were not taught the addition rule but rather were taught to solve the problems using the equations:

b↓n = b*|n| - (n/2)(|n|-1)

b↑n = b*|n| + (n/2)(|n|-1)

These algebraic expressions might seem rather intimidating at first sight, but participants learned to solve problems using these equations with slightly greater facility than the participants who were performing repetitive addition. The down-arrow problems are similar to the problems used in the research of Anderson and Fincham (2014), who discovered 4 stages in the solution of these problems -- the Encoding, Planning, Solving, and Responding stages illustrated in Figure 1. We expected to find the same stages in the solution of the problems in this experiment. The goal of the current study was to test our understanding of these stages by seeing if experimental manipulations would affect the duration of the different stages. This is like the additive factors logic of Sternberg (1969), except that activation patterns are used to measure the duration of each stage rather than inferring it from patterns of total latency. This study focused on two manipulations intended to affect the duration of the two middle stages, Planning and Solving.

1. Type of Problem. The day before the scanner trials, participants learned their assigned method for solving arrow problems and practiced calculating the value of up- and down-arrow problems with positive single-digit operands. On the second day, when they were in the scanner, they solved three types of problems (see Table 1). Regular problems were just like the ones they had been solving on the first day and served as a reference point for two types of transfer problems.
The first kind of transfer problems were Computational problems, which required solving for different terms (e.g., 2↑X=5 -- answer X = 2) or working with negative operands (e.g., -2↓5=X -- answer X = -20), but which could still be solved by repetitive addition or equation manipulation. The second kind were Relational problems (e.g., 31↑4=31↓X -- answer X = -4) that could not be easily solved by the standard procedure but were fairly easy to solve if the participant came to an insight about the solution procedure. Thus, Relational problems should be harder to plan for but easier to solve, leading to a predicted interaction between the type of transfer problem and the duration of the Planning versus Solving intervals.

2. Second Operand. We varied the second operand of Regular and Computational problems (but not Relational problems) from 2 to 5. This value should strongly affect the duration of the Solving interval for participants using the addition algorithm because it determines the number of additions. Thus, we predicted a three-way interaction between the size of the operand, the type of algorithm, and the Planning versus Solving stage: There should be a strong linear relation between operand size and duration only for the Solving stage of the Addition algorithm. Note that both of these predictions are more precise than simply predicting variation in total time; rather, they test the ability of the HSMM-MVPA method to localize these effects in particular stages.

Method

Participants. The data for the analyses come from 80 participants (44% female, mean age = 23.6, SD = 4.9), with 40 using the Addition algorithm and 40 using the Equation algorithm. A sample of this size is required for the HSMM-MVPA method. Half of each group were shown a spatial referent appropriate for their algorithm during training but not during imaging. Pyke et al. (2015) reported the empirical results, but not a stage analysis, for the 20 participants who used the addition algorithm with a spatial referent and the 20 participants who used the equation algorithm without a spatial referent. With the factors now crossed in the current data set, we were able to determine that there were strong effects of the algorithm and no significant effects of the referent. Thus, our analysis will focus on the two groups of 40 defined by algorithm.

Procedure. In a separate session prior to the scanner trials, participants practiced solving Regular problems. These training problems were of the form b↓n = X and b↑n = X, where participants solved for X, and b and n were integers from 1-9 and 2-6, respectively. The 90 unique training problems (45 per operation) were partitioned into 8 blocks, the first and last with 12 problems each and the rest with 11 problems each. Each block had about equal numbers of ↓ and ↑ problems. During the scanner trials, problems were presented one at a time on the screen in black font on a white background. Learners pressed ENTER when they had solved the problem and typed their answers using a keypad. A problem timed out if learners took longer than 30 s to compute an answer or longer than 5 s to type it. Otherwise, learners' answers appeared in blue font below the problem. Upon pressing ENTER after their response, their answer turned green if correct and red if incorrect, and the X in the problem was replaced with the correct answer, framed in a box. They had 7 seconds to study this feedback. To distract participants from thinking about the prior problem and to allow brain activity to return to a relatively constant level between problems, a repetition detection task (12 s) was inserted after the feedback on each trial: A fixation cross (3 s) was followed by a series of letters (each for 1.25 seconds), and participants were instructed to press ENTER when the same letter appeared twice in a row.

Image Analysis.
Images were acquired using gradient-echo echo-planar imaging (EPI) on a Siemens 3T Verio scanner. We acquired 34 axial slices on each TR using a 3.2-mm-thick, 64×64 matrix. This produces voxels that are 3.2 mm high and 3.125 × 3.125 mm in plane. The anterior commissure-posterior commissure (AC-PC) line was on the 11th slice from the bottom. Acquired images were pre-processed and analyzed using AFNI (Cox, 1996; Cox & Hyde, 1997). Functional images were motion-corrected using 6-parameter 3D registration (AIR; Woods et al., 1998). All images were then co-registered to a common reference structural MRI by means of a 12-parameter 3D registration and smoothed with a 6 mm full-width-at-half-maximum 3D Gaussian filter to accommodate individual differences in anatomy. We assume a BOLD response is produced by the convolution of an underlying activity signal with a hemodynamic response. The hemodynamic function is the SPM difference of gammas (Friston et al., 1998 -- g = gamma(6,1) - gamma(16,1)/6). A Wiener filter (Glover, 1999) with a noise parameter of .1 was used to deconvolve the BOLD response into an inferred activity signal. To an approximation, this results in shifting the BOLD signal to the left by 2 scans (4 seconds) to compensate for its lag relative to neural activity. We have used this simple scan shift in earlier studies (e.g., Anderson et al., 2010, 2012) where there was not a constant baseline activity interspersed at regular intervals. In complex tasks it proves useful to perform MVPA on whole-brain activity. However, as a step of dimension reduction and to accommodate variations in anatomy over participants that may not be dealt with in co-registration, we aggregated the voxels in each slice into larger 2×2-voxel regions. There are 12,481 such regions. Some of these regions show an excess of extreme values for some participants, probably reflecting differences in anatomy.
These were regions mainly on the top and bottom slices, as well as some regions around the perimeter of the brain. Eliminating these regions resulted in keeping 8755 regions.

HSMM-MVPA Analysis. The HSMM-MVPA algorithm and its mathematical basis are described in detail in Anderson et al. (2014). Figure 2 illustrates the method. The brain pattern that defines a stage is called its brain signature. To deal with the highly correlated nature of brain signals in different regions, we performed a spatial principal component analysis (PCA) on the data and used the first 20 PCs. We z-scored these PCs for each participant so that they had an overall mean of 0 and standard deviation of 1. The MVPA estimates brain patterns assuming that each trial yields a sample from a 20-dimensional normal distribution (with a standard deviation of 1 for each independent dimension) around a set of mean values that define the brain signature for that stage. Whole-brain activation patterns for a stage can be reconstructed from the PC means that define its brain signature. A traditional Markov model would assume that stage durations do not vary; however, on different trials participants can spend different amounts of time in a stage, making the current model semi-Markov. Variability in each stage's duration is represented by a gamma distribution. There are many possible ways to partition the scans of a particular trial into different stages.1 The HSMM can efficiently compute the summed probability of all such partitions given the means estimated for the brain signatures and the gamma distributions. The stage occupancies in Figure 2 are the summed probabilities over all such partitions that a scan is in a stage. The standard expectation-maximization algorithm associated with HSMMs (Yu, 2010) is used to estimate the brain signatures and gamma distributions for each stage so as to maximize the probability of the data on all trials.
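Two of the quantities above are easy to sketch in code: the number of possible partitions of a trial's scans into stages (the stars-and-bars count given in footnote 1) and the conversion of per-scan stage occupancies into an expected stage duration. Function names are illustrative:

```python
from math import comb

def n_partitions(n_scans, n_stages):
    # Number of ways to divide n ordered scans into contiguous
    # (possibly empty) stages: (n + m - 1)! / (n! (m - 1)!).
    return comb(n_scans + n_stages - 1, n_stages - 1)

def expected_duration(occupancies, scan_len=2.0):
    # occupancies: probability that each scan of a trial belongs to one
    # particular stage; summing them and scaling by the scan length
    # (2 s here) gives the expected time spent in that stage on that trial.
    return scan_len * sum(occupancies)
```

For example, a 7-scan trial parsed into 4 stages admits n_partitions(7, 4) = 120 possible partitions, and a stage with occupancies [1, 1, 0.6, 0, 0, 0, 0] over those scans has an expected duration of 5.2 s.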
Results

Figure 3 displays the basic behavioral results: accuracy and latency broken down by problem type and algorithm. Spatial referent had no effect on either accuracy or latency, nor did it participate in any significant interactions. The factors of interest both showed effects on accuracy -- problem type (F(2,156) = 246.44, p < .0001) and algorithm (F(1,78) = 15.80, p < .0005). While participants were overall more accurate given the Addition algorithm, there was an interaction (F(2,156) = 35.86, p < .0001) such that the advantage was strongest for Relational problems. Equation participants were slightly more accurate than Addition participants on Regular problems, but this was not significant (t(78) = -1.36, p > .10). Addition participants were marginally more accurate on Computational problems (t(78) = 2.08, p < .05), but showed a highly significant advantage for Relational problems (t(78) = 6.91, p < .0001). One participant (in the Equation group) failed to correctly solve any Relational problems. Figure 3b gives the mean latency for correct trials from the remaining participants. There was no significant main effect in the overall mean times of the two algorithm groups (F(1,77) = 0.16). There was a highly significant effect of problem type (F(2,154) = 126.94, p < .0001) and a significant interaction between problem type and algorithm (F(2,154) = 6.72, p < .005). Mirroring the pattern for accuracy, Addition participants were slower on Regular problems (8.19 sec versus 7.56 sec), but faster on Computational problems (11.43 sec versus 11.61 sec) and Relational problems (9.89 sec versus 10.97 sec). The effect for Relational problems was significant (t(77) = 2.00, p < .05), while the other pairwise comparisons were not significant.

Figure 3. Mean accuracy and latency for correctly solved problems.

1 If there are n scans and m stages, the number of possible partitions into stages is (n+m-1)!/(n! × (m-1)!).
While analyses of total latency do not address the stage predictions in the introduction, these results do establish clear effects of both problem type and algorithm. It remains to be determined whether these factors affect the Planning and Solving stages as predicted. But first we will present evidence for the existence of these four stages.

Evidence for 4 Stages

To verify that participants go through 4 cognitive stages in solving these problems, we fit models with different numbers of stages to the data of half of the participants (the first 10 from each of the four training conditions) and then used the estimated parameters to "predict" (fit) the data of the other 40 participants. Since the model parameters specify the probability of any data pattern, we are predicting the 40 participants in the sense that we expect them to display high-probability patterns. A more complex model (with more stages) will fit the first half of the participants better but, if it is overfitting, it will not predict the second half of the participants better. Figure 4 shows how the mean log likelihood of the predicted participants varies with the number of assumed stages. Starting from a single-stage model, prediction accuracy increased substantially as additional stages were added to the model, up until 5 stages.

Figure 4. Average likelihood of the data obtained using models with different numbers of stages. The likelihood of each of the second-half 40 participants was calculated using model parameters based on the first-half 40 participants. The average log likelihood of participants for different numbers of stages is expressed as the difference in the log likelihood for that number of stages minus the log likelihood for a model with just 1 stage.

We measured the significance of the improvement by how many participants were better fit (sign test) with more stages. Up until 5 stages, each additional stage results in an improvement in the fit of most participants.
The least significant case is between 4 and 5 stages, where 27 of the 40 participants are predicted better with 5 stages (p < .05 by a sign test). In no case does the addition of more stages significantly improve upon 5 stages (the closest comparison being 8 versus 5 stages, where 8 stages predicts 25 of the 40 participants better, p ~ .077). The conclusion that a 5-stage model fits best replicates the results of Anderson and Fincham (2014). Anderson and Fincham (2014) show that the change from 4 to 5 stages involves the insertion of a stage to better fit a 'hybrid' scan straddling the transition between stages 3 and 4 in the 4-stage model. Hybrid scans can occur because stage boundaries do not necessarily align with scan boundaries. If the majority of a scan falls in one stage, it can be accommodated by that stage in the model. However, if enough scans happen to fall somewhat centered across two particular stages, the modeling process will achieve better prediction if it treats the hybrid brain patterns of these scans as a stage in its own right. This reflects no theoretically interesting variation, however. Therefore, as in Anderson and Fincham, we will focus on the 4-stage solution, confident that the data point to the existence of at least this many stages.

Figure 5 shows the correspondence between the activation patterns for the stages in this study and in Anderson and Fincham (2014). While there is similarity among all stages, in each case the corresponding stage signatures between experiments are more correlated than the non-corresponding stages (mean r of .93 versus .78). Figure 1 gives some sense of the regions that are more active in one stage than in others. The Encoding stage has high activity in parietal regions involved in visual attention, the Planning stage has the highest prefrontal and lateral parietal activation, and the Responding stage has the highest activation in the left motor region involved in controlling the right hand.

Figure 5. The brain signatures of the 4 stages. Dark red reflects activity 1% above baseline while dark blue reflects activity 1% below baseline. The z coordinate displayed for a brain slice is at x = y = 0 in Talairach space. Note brain images are displayed in radiological convention with left on right. Compared are the results from Anderson and Fincham (A&F, 2014) and the current study.

Effects on Stage Duration

Having established that there are at least 4 stages, we can test whether their durations vary as predicted. As discussed earlier, the HSMM-MVPA method leads to estimates of how much time was spent in each stage on each trial of each participant. Even though the same model was fit to all the data, these times vary with condition. Table 2 reports 2-way analyses of variance of the 4 stage durations (on correct trials) with the factors algorithm (Addition, Equation) and type of problem (Regular, Computational, Relational). Since one participant in the Spatial Equation condition did not correctly solve any Relational problems, there were only 79 participants in this analysis. No effects of these factors were predicted for the Encoding and Responding stages. There are no significant effects for the Encoding stage. There are some effects on Responding time, which we will consider after examining the effects on Planning and Solving time that were the focus of our manipulations and predictions. Planning and Solving times both show a strong effect of Problem Type. Solving time also shows a strong interaction with algorithm.

Effect of Problem Type. Figure 6 shows the duration of the 4 stages for the three problem types and two algorithms. As indicated in the ANOVA, there was no effect on the Encoding stage. While there is a significant effect on the Responding stage, it is tiny compared to the effects on the Planning and Solving stages.
Not surprisingly, the practiced Regular problems are the fastest during the Planning and Solving stages. As predicted, Computational problems are faster than Relational problems in the Planning stage (t(78) = 2.70, p < .01), but much slower in the Solving stage (t(78) = 5.21, p < .0001).

Figure 6. Effects of problem type on the four stages of problem solving, plotted separately for Addition and Equation algorithms.

Effect of Second Operand. Figure 7 displays the effect of the second operand, averaged over Regular and Computational problems. As we had predicted, there is a linear effect of the second operand (n) on the Solving stage only for participants using the Addition algorithm. This graph surely satisfies the intraocular test of significance. More formally, it is statistically supported by a highly significant 3-way interaction between stage, second operand, and algorithm (F(9,684) = 42.53, p < .0001). While it is not surprising that the size of the second operand affects overall solution time using the addition algorithm, it is striking that the HSMM-MVPA analysis so clearly localizes the effect in the Solving stage.

Figure 7. Effects of second operand on stage durations for Regular and Computational problems.

Keying Difficulty. The effect of problem type on the Responding stage in Table 2 reflects at least in part differences in the difficulty of keying responses, which is not equated across problem types. Answers to Regular problems average 1.76 keys, Computational 1.60 keys, and Relational 1.65 keys. Negative answers occur only for Computational problems (24.3%) and Relational problems (15.1%). The negative key, located near the upper right edge of the keypad, was somewhat awkward to reach, particularly for small hands. For example, in Figure 1, the last problem is the only one with a negative answer (-15) and is characterized by a noticeably longer responding time.
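The additive structure of these keying effects can be sketched as a simple linear predictor. The per-key and negative-sign costs below are the estimates from the mixed-effects analysis of Responding times reported in this paper (.34 s per key pressed, an additional .49 s for the negative key); the intercept is a hypothetical value chosen only to make the sketch runnable:

```python
# Coefficients from the paper's mixed-effects analysis of Responding
# times; BASE is a hypothetical intercept (not reported in the paper),
# included only so the sketch produces concrete numbers.
BASE = 1.9          # assumed, for illustration
PER_KEY = 0.34      # seconds per key pressed
NEG_PENALTY = 0.49  # extra seconds when the negative key is needed

def responding_time(n_digits, negative=False):
    # A negative answer requires one extra keypress (the minus sign)
    # in addition to the awkward-reach penalty for that key.
    keys = n_digits + (1 if negative else 0)
    return BASE + PER_KEY * keys + NEG_PENALTY * (1 if negative else 0)
```

Under this sketch, a 2-digit negative answer takes .34 + .49 = .83 s longer than a 2-digit positive answer, roughly matching the differences between the corresponding columns of Table 3.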
To sort out the effects of problem type, number of keys in the answer, and use of the negative key, we performed a mixed-effects analysis of the Responding times. Including number of keys resulted in a decisive 853-point decrease in BIC, including the presence of a negative answer resulted in a 790-point decrease, and including both resulted in a 1014-point decrease. In contrast, adding problem type to the model did not decrease the BIC score but rather increased it by 47. Thus, it would appear that the only effects in the Responding stage involve keying difficulty. Each key pressed resulted in a .34-second increase in Responding time, and if the negative key had to be pressed there was an additional .49-second impact on keying. Table 3 gives the average duration of the Responding stage for different combinations of 1- and 2-digit answers with and without a negative sign.

Discussion

Problem solving is highly variable in duration, with large variation among participants and among problems, and even large variation within the same individual on similar problems. Nonetheless, the HSMM-MVPA method is able to parse out a common sequence of stages characterized by distinct brain patterns. As Figure 1 illustrates, even for problems with the same overall response time, there is considerable variability in the duration of the different stages. The goal of this research was to establish that the variability in the duration of these stages was not just random but was related to the characteristics of the problem being solved and the solution algorithm. These analyses have provided strong evidence for the assumed identity of three of the stages. The manipulated factors had their predicted effects on the Planning and Solving stages.
While the experiment did not involve a balanced manipulation of keying difficulty, a mixed-effects analysis revealed that the controlling factors affecting the duration of the Responding stage were indeed the number of keys and the difficulty of keying negative signs. The HSMM-MVPA method was able to localize the effects of conditions to appropriate stages even though the analysis had no information about what condition specific problems were in. How general are the four problem-solving stages (Figure 5) that have been identified in this research? One would expect that similar Encoding and Responding stages would be found in other tasks that present a visual problem and require a manual response. On the other hand, the Planning and Solving stages presumably reflect the characteristics of the problems being solved. In part, these stages reflect general features of mathematical reasoning. For instance, both stages show high activation in the horizontal intraparietal sulcus, which has been associated with the representation of quantity (Dehaene et al., 2004). Lee et al. (2014) studied a different mathematical problem-solving task with a different logical structure, but, like the current task, it involved visual problem presentation and manual response. They found evidence for similar Encoding and Responding stages. They also found evidence for what they called Compute and Transform stages, which bear similarities to our Planning and Solving stages but also reflected the distinctive characteristics of the different tasks. Anderson, Fincham, Yang, and Schneider (2012) studied a task that involved turns in a game requiring participants either to solve equations or to solve anagrams. They found distinctive brain signatures between stages across these two activities. Their Solving stage for equation trials showed overlap with the parietal and spatial regions active in this experiment, but their anagram Solving stage did not.
While this paper examined a mathematical problem-solving task, the results are encouraging for the general use of the HSMM-MVPA methods to parse complex cognition into distinct stages. The distinctive cognitive demands of each stage will produce a brain pattern that can be used to identify the bounds of that stage on a trial. One can then take these stage durations and determine which factors of interest affect the durations of different cognitive stages. We need not be left just looking at total latency.

References

Anderson, J. R., Betts, S., Ferris, J. L., & Fincham, J. M. (2010). Neural imaging to track mental states while using an intelligent tutoring system. Proceedings of the National Academy of Sciences, USA, 107, 7018-7023.

Anderson, J. R., Betts, S. A., Ferris, J. L., & Fincham, J. M. (2012). Tracking children's mental stages while solving algebra equations. Human Brain Mapping, 33, 2650-2665.

Anderson, J. R., & Fincham, J. M. (2014). Extending problem-solving procedures through reflection. Cognitive Psychology, 74, 1-34.

Anderson, J. R., Fincham, J. M., Yang, J., & Schneider, D. W. (2012). Using brain imaging to track problem solving in a complex state space. NeuroImage, 60, 633-643.

Anderson, J. R., Lee, H. S., & Fincham, J. M. (2014). Discovering the structure of mathematical problem solving. NeuroImage, 97, 163-177.

Dehaene, S., Molko, N., Cohen, L., & Wilson, A. J. (2004). Arithmetic and the brain. Current Opinion in Neurobiology, 14(2), 218-224.

Friston, K. J., Fletcher, P., Josephs, O., Holmes, A. P., Rugg, M. D., & Turner, R. (1998). Event-related fMRI: Characterising differential responses. NeuroImage, 7, 30-40.

Glover, G. H. (1999). Deconvolution of impulse response in event-related BOLD fMRI. NeuroImage, 9, 416-429.

Just, M. A., Cherkassky, V. L., Aryal, S., & Mitchell, T. M. (2010). A neurosemantic theory of concrete noun representation based on the underlying brain codes. PLoS ONE, 5, e8622.

Norman, K. A., Polyn, S. M., Detre, G. J.
& Haxby, J. V. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10, 424-430.

Pereira, F., Mitchell, T., & Botvinick, M. (2009). Machine learning classifiers and fMRI: A tutorial overview. NeuroImage, 45, S199-S209.

Pyke, A., Betts, S., Fincham, J. M., & Anderson, J. R. (2015). Visuospatial referents facilitate the learning and transfer of mathematical operations: Extending the role of the angular gyrus. Cognitive, Affective, & Behavioral Neuroscience, 15, 229-250.

Sternberg, S. (1969). The discovery of processing stages: Extensions of Donders' method. Acta Psychologica, 30, 276-315.

Yu, S.-Z. (2010). Hidden semi-Markov models. Artificial Intelligence, 174, 215-243.

Table 1. Examples of problem types used in the experiment, with solution principle and number of items per participant.

Problem Type                     Example Problems                  Principle and/or Solution Process

Regular (16 items)               3↑2=X    3↓2=X                    Half ↑ problems and half ↓ problems; see Figure A1 for algorithms

Computational Transfer
  Unknown b (8 items)            X↑2=5    X↓2=7                    Learners may guess the X value, then check it by computing the answer in the same way
  Unknown n (8 items)            3↑X=5    3↓X=7                    Learners may guess the X value, then check it by computing the answer in the same way
  Negative b (8 items)           -3↑2=X   -2↓X=6                   Algorithms apply as for Regulars
  Negative n (8 items)           4↑-2=X   2↓-5=X                   Equation accommodates negative n; a negative n reverses the direction of addition

Relational Transfer
  Relating Up & Down (8 items)   31↑4=31↓X     19↑4=X↓4            B↑N = B↓(-N); B↓N = (B-N+1)↑N
  Consecutive Operand (8 items)  35↓3=(34↓2)+X    26↓15=(X↓14)+26  B↓N = (B-1)↓(N-1) + B
  Mirror Problems (8 items)      30↓61=X    5↓5=X                  B↓(2B+1) = 0; B↓B = B(B+1)/2
  Rule Problems (8 items)        5↑X=7↓X    5↑2=11↑X               For any integer B, B↑0 = 0 and B↓0 = 0; B↑1 = B and B↓1 = B

Table 2.
Analysis of Variance of the Potential Factors Affecting Each Stage

Stage         Algorithm      Type           Interaction
              F(1,77)        F(2,154)       F(2,154)
Encoding      0.12           0.53           0.71
Planning      0.75           131.19**       0.93
Solving       0.02           75.52**        13.93**
Responding    0.77           8.42*          10.01**

* p < .001. ** p < .0001.

Table 3. Effects of number of digits and presence of a negative sign in the answer on the duration of the Responding stage. Note that there were no negative answers to Regular problems.

                     Positive Answer          Negative Answer
Problem Type         1-digit    2-digit       1-digit    2-digit
Regular              2.38 s     2.55 s
Computational        2.32 s     2.61 s        3.23 s     3.40 s
Relational           2.19 s     2.69 s        3.00 s     3.37 s

Figure Captions

Figure 1. Illustration of the stage durations for four trials on which different problems were solved. For each stage, the axial slice (z = 28 mm at x = 0, y = 0 in Talairach space) highlights brain regions whose activation in that stage is significantly greater than the average activation during problem solving. Note that brain images are displayed in radiological convention, with the left brain on the right.

Figure 2. An illustration of HSMM-MVPA producing a 4-stage solution. The input to the HSMM-MVPA consists of scans organized within trials, each scan comprising 20 PCA scores. The parameters estimated for each stage are the 20 PCA means that define the brain signature (individual scans are normally distributed around these means) and the parameters that define the gamma distribution for the stage's duration. Also calculated are the stage occupancies (the probabilities that the participant is in each stage on a particular scan).

Figure 3. Mean accuracy and latency for correctly solved problems.

Figure 4. Average likelihood of the data obtained using models with different numbers of stages. The likelihood for each of the 40 second-half participants was calculated using model parameters estimated from the 40 first-half participants.
The average log likelihood of participants for each number of stages is expressed as the difference between the log likelihood for that number of stages and the log likelihood for a model with just one stage.

Figure 5. The brain signatures of the four stages. Dark red reflects activity 1% above baseline, and dark blue reflects activity 1% below baseline. The z coordinate displayed for each brain slice is at x = y = 0 in Talairach space. Note that brain images are displayed in radiological convention, with left on right. The results from Anderson and Fincham (A&F, 2014) are compared with those of the current study.

Figure 6. Effects of problem type on the four stages of problem solving, plotted separately for Addition and Equation algorithms.

Figure 7. Effects of the second operand on stage durations for Regular and Computational problems.
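The per-trial computation summarized in the Figure 2 caption can be sketched compactly. The toy implementation below is an illustrative assumption, not the authors' code: the function name, the spherical-Gaussian emission model, and all parameter values are invented. It computes the likelihood of one trial's scans under a fixed sequence of stages, with each stage emitting scans normally distributed around its brain-signature mean and lasting a gamma-distributed number of scans, using dynamic programming over the possible stage boundaries.

```python
import numpy as np
from scipy.stats import gamma, norm
from scipy.special import logsumexp

def hsmm_trial_loglik(scans, means, shapes, scales, sigma=1.0):
    """Log-likelihood of one trial's scans under a fixed-order HSMM sketch.

    scans  : (T, D) array of PCA-reduced scan vectors
    means  : (K, D) brain-signature means, one per stage
    shapes, scales : per-stage gamma parameters for duration (in scans)
    """
    T, D = scans.shape
    K = len(means)
    # Log emission density of every scan under every stage's signature.
    emit = np.stack([norm.logpdf(scans, loc=m, scale=sigma).sum(axis=1)
                     for m in means])                       # (K, T)
    cum = np.concatenate([np.zeros((K, 1)),
                          np.cumsum(emit, axis=1)], axis=1)  # (K, T+1)
    # Discretize each gamma to scan counts: P(d) ~ cdf(d) - cdf(d-1).
    dur_logp = np.full((K, T + 1), -np.inf)
    for k in range(K):
        p = (gamma.cdf(np.arange(1, T + 1), shapes[k], scale=scales[k])
             - gamma.cdf(np.arange(0, T), shapes[k], scale=scales[k]))
        p = p / p.sum()
        dur_logp[k, 1:] = np.log(np.maximum(p, 1e-300))
    # DP over boundaries: A[k, t] = log P(stages 1..k cover scans 1..t).
    A = np.full((K + 1, T + 1), -np.inf)
    A[0, 0] = 0.0
    for k in range(1, K + 1):
        for t in range(k, T + 1):
            d = np.arange(1, t - k + 2)  # possible durations of stage k
            terms = (A[k - 1, t - d] + dur_logp[k - 1, d]
                     + cum[k - 1, t] - cum[k - 1, t - d])
            A[k, t] = logsumexp(terms)
    return A[K, T]
```

In the actual method the signature means, gamma parameters, and number of stages would themselves be estimated across all trials (Figure 4's comparison of stage numbers rests on such fits); this sketch shows only the per-trial likelihood that such an estimation would maximize, and the same dynamic-programming table yields the stage occupancies.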