Promoting the development of embodied and integrated geometry concepts: Gestural depictions in a computer game environment

Jonathan Michael Vitale (JMV2125@TC.Columbia.Edu)
Department of Human Development, Teachers College, 525 W. 120th Street
New York, NY 10027 USA

Michael Swart (MIS2125@TC.Columbia.Edu)
Department of Human Development, Teachers College, 525 W. 120th Street
New York, NY 10027 USA

John B. Black (Black@Exchange.TC.Columbia.Edu)
Department of Human Development, Teachers College, 525 W. 120th Street
New York, NY 10027 USA

Abstract

In two experiments we examined the effectiveness of a computer-based geometry learning tool, implementing four cognitive design principles, with 3rd and 4th grade students. The four principles reflect our hypotheses that geometry learning tasks should (1) access core knowledge, (2) incorporate diverse stimuli, (3) promote embodied representations of higher-level concepts, and (4) prompt conceptual integration through challenging activities. In both experiments, children in two conditions constructed a set of common four-sided figures to fit a set of visual constraints. In experiment 1, children in the treatment condition (only) were additionally required to verify the presence of parallel segments, congruent segments, or right angles embedded in their figures with the assistance of animated depictions of gestures that conveyed the spatial significance of those properties. Following training we conducted six identification tasks in which participants attempted to discriminate two valid members of a given polygon class from four displayed polygons. In all six instances children in the treatment condition were more likely to correctly identify both polygon class members in a trial than children in the control condition.
In experiment 2, children in the control condition were additionally required to verify the presence of parallel segments, congruent segments, or right angles with corresponding numerical representations of those properties. The treatment condition remained the same. As in experiment 1, significant differences were found in 5 out of 6 posttest identification tasks, demonstrating support for our principles of embodiment and conceptual integration.

Geometry is a critical, yet often overlooked, element of mathematics education. Although the National Council of Teachers of Mathematics (2000) grants geometry equal status with other domains of mathematical instruction – including number and operations, algebra, and data analysis – U.S. students often perform poorly on geometry items of standardized assessments, relative to students in other developed nations (Ginsburg, Cooke, Leinwand, Noell, & Pollock, 2005). On the 2003 TIMSS in particular, eighth grade U.S. students answered an average of 45% of geometry questions correctly, compared with scores above 70% for their East Asian counterparts from Japan, Chinese Taipei, Hong Kong, Korea, and Singapore. While underwhelming U.S. performance, in general, is due to a number of variables, particularly SES (Berliner, 2006), the relative weakness in geometry performance by U.S. students likely reflects inadequate emphasis in the curriculum – particularly, as concepts become more advanced. For years, educational researchers have bemoaned the insufficient rigor of geometry curricula that overwhelmingly rely on static images and simplistic tasks, and, consequently, do not prepare students for higher-level geometric reasoning (Clements, 2004; Clements & Battista, 1992). As a potential remedy, interactive geometry software, such as Logo, was heralded as an instructional breakthrough (Papert, 1980; Papert, Watt, DiSessa, & Weir, 1979).
Unfortunately, this early enthusiasm was dampened by mixed or negative efficacy findings (Howe, O’Shea, & Plane, 1979; Hughes & Greenbough, 1995; Johnson, 1986; Noss & Hoyles, 1992; Simmons & Cope, 1990). However, evidence suggests that Logo is effective when paired with a well-structured curriculum (Clements, Battista, & Sarama, 2001; Littlefield, Delclos, Bransford, Clayton, & Franks, 1989). Regardless of specific findings, Logo – and descendent technologies, such as NetLogo and Scratch – maintains strong theoretical support as a primary instantiation of the view that mathematical development is best facilitated with the use of “concrete manipulatives” (Clements, 1999) or “spatial tools” (Mix, 2009). Based on the work of developmental and educational theorists, such as Bruner (e.g. 1966), Montessori (e.g. 1912), and Piaget (e.g. 1970) many constructivist educators and researchers assume, as an essential principle, that discovery-based activity with concrete manipulatives is necessary to make abstract ideas accessible to younger children. However, like Logo, there has been a history of mixed findings with concrete manipulatives (Ball, 1992; Fuson & Briars, 1990; Gravemeijer, 1991; Resnick & Omanson, 1987; Thompson & Thompson, 1990). A lack of clear support suggests that continued research, with greater emphasis on how children use the tools – in ways intended and unintended by designers – is necessary. Recent theory posits that spatial tools – in similarly effective physical or virtual form (Triona & Klahr, 2003) – may be more effective when introduced in a properly guided and constrained activity; thereby, focusing learners’ attention to the critical mathematical structures of the problem space and away from superficial or irrelevant features (Brown, McNeil, & Glenberg, 2009; McNeil & Jarvin, 2007; McNeil & Uttal, 2009; Sarama & Clements, 2009; Uttal, O’Doherty, Newland, Hand, & DeLoache, 2009). 
Consequently, designing learning tools that navigate learners along the optimal developmental trajectory requires detailed knowledge of both the domain and its inherent misconceptions. In terms of geometry instruction, extending Piaget’s stage-based theory (Piaget & Inhelder, 1956; Piaget, Inhelder, & Szeminska, 1960), contemporary developmental researchers have made some important discoveries regarding developmental milestones in geometry learning, particularly with regard to the sensorimotor nature of early concepts (e.g. Clements, Swaminathan, Hannibal, & Sarama, 1999). Yet, recent cognitive theory posits an ongoing, reciprocal relationship between the sensorimotor-based knowledge of novices and the abstract knowledge of experts (Lakoff & Núñez, 2000), which may suggest new ways of approaching geometry research and instructional design with more advanced learners. Our purpose here is to detail the development of a novel geometry learning tool from an embodied perspective and to demonstrate its effects on the skill of polygon identification, a common task on many general mathematical assessments (e.g. the TIMSS and NAEP). We further discuss how these findings may be implemented in a mathematics curriculum in general.

The emerging cognitive science of geometry

For years, cognitive developmental researchers, such as Piaget (1956), Bruner (1960), and their adherents (e.g. Papert, 1980), have emphasized that learning is deeply rooted in physical activity and interaction with the environment. Furthermore, many view the qualitative shift from sensorimotor-based knowledge to conceptual/abstract knowledge as a hallmark of conceptual development, e.g. development of geometry concepts from highly perceptual to symbolic (Clements et al., 1999; van Hiele, 1986).
Generally, with this perspective, the development of higher-level knowledge has been addressed by information processing theorists as a matter of memory capacity (Sweller, 1988) and efficient production systems (Anderson, Boyle, Corbett, & Lewis, 1987), best addressed by direct instruction (Kirschner, Sweller, & Clark, 2006). Tools reflecting this view, such as cognitive tutors (Koedinger & Anderson, 1993), take advantage of the “symbolic advantage” that occurs with complex problem solving (Koedinger, Alibali, & Nathan, 2008). While these tools are necessary and successful in many contexts – such as with high school students studying formal Euclidean geometry – recent emphasis on embodied (or grounded) cognition (Barsalou, 2010; Clark, 1999) suggests that abstract concepts require an initial grounding in perceptual-motor experience. In particular, Lakoff and Núñez (2000) describe mathematical thinking in terms of conceptual metaphors that connect abstract words and symbols to common physical experiences. Specifically, “grounding metaphors” map physical objects or activities to target mathematical concepts (e.g. actions on object collections map onto arithmetic operations).

The most compelling evidence for this perspective comes from studies of early numeracy. Both neuroimaging and behavioral data demonstrate a critical association between basic numeracy skills (e.g. comparing two numbers) and spatial processes (Dehaene, Piazza, Pinel, & Cohen, 2003; Dehaene, Spelke, Pinel, Stanescu, & Tsivkin, 1999). According to Spelke (2000), basic numeracy is built upon core systems of knowledge, including the ability to subitize – i.e., recognize the numerosity of a small array of objects without counting – and approximate the numerosity of large magnitudes (Feigenson, Dehaene, & Spelke, 2004).
In turn, the strength of these spatial-numerical associations has been shown to be predictive of a wide range of mathematical skills, including those commonly found on standardized tests (Booth & Siegler, 2006; Halberda, Mazzocco, & Feigenson, 2008; Holloway & Ansari, 2009). In particular, investigations of number line estimation demonstrate that while associations between space and number emerge early, this mapping is initially inaccurate and biased towards overrepresentation of well-known, small numbers (Booth & Siegler, 2006; Siegler & Booth, 2004; Siegler & Opfer, 2003). In a series of interventions, young children (pre-k) learned to estimate accurately by playing a corresponding board game (Ramani & Siegler, 2008; Siegler & Ramani, 2008, 2009). This research clearly demonstrates that engaging and accessible tools can produce robust improvements in the underlying spatial representation of mathematical concepts. Likewise, researchers are beginning to address geometry in terms of spatial processes in core systems, and their development. Spelke, Lee, and Izard (2010) claim that concepts are grounded in two core systems associated with spatial navigation and object perception. From this perspective, the concept of a shape, e.g. rectangle, is derived from the experience of viewing rectangular objects, such as the face of a block, or navigating along a rectangular path. The idea that initial concepts of geometric shapes are (at least partially) rooted in our perceptual, object recognition system is both intuitively appealing and empirically supported. Many studies of object perception and recognition have utilized common geometric figures as stimuli (e.g. triangles, trapezoids) to investigate how concepts are encoded in and retrieved from memory. 
For example, in a study by Bomba and Siqueland (1983), 3-month-olds developed a prototypical representation of a triangle by viewing a series of triangles, which varied randomly from the prototype, without ever actually seeing the prototype. In other words, frequent exposure to the natural variation of a shape may lead to the formation of an idealized conceptual representation. While this example suggests how initial associations between mathematical concepts and perceptual experiences are formed, as the number line example suggests, these initial mappings often diverge from expert knowledge in important ways. Specifically, in formal geometry, the shape-as-physical object mapping can produce misconceptions. While a rectangle – as a mathematical concept – must contain perpendicular sides, the natural environment offers an abundance of blocks with slightly non-perpendicular sides, which may nonetheless contribute to the concept of a rectangle. Features other than those that formally define the geometric class, such as symmetry (Quinlan & Humphreys, 1993), may drive identification and classification. In a study in which participants classified a series of four-sided figures based on perceived similarity, Behrman and Brown (1968) found that “dispersion” (irregularity), “elongation”, and “jaggedness” emerged as the most significant factors in classification. Similar features were discovered in subjects’ classification of U.S. state shapes (Shepard & Chipman, 1970). While these features are clearly important in everyday manipulation of objects, they may interfere with the learning of formal concepts. For example, a polygon is no more or less a rhombus if it is elongated or square. Furthermore, these real-world, object-based features affect identification of common polygons in educational settings. Young children often apply informal labels, such as “slanty”, “pointy”, or “skinny” to describe common shapes (Clements et al., 1999).
In identification tasks, young children often miss non-prototypical, but valid instances of a target shape in favor of invalid instances that share common, holistic features with the prototype (Burger & Shaughnessy, 1986; Clements et al., 1999). For example, while identifying rectangles many children incorrectly selected “long” (non-rectangular) parallelograms as rectangles, while missing squares. Though one may suspect that this kind of error is isolated to young children, Mach (1886/1959), a forebear of gestalt psychology, demonstrated that an object would be perceived differently – as either a square or diamond – depending upon its spatial orientation. Clearly our initial knowledge of geometry is founded – at least, in part – on implicit natural object perception, but requires further development to support higher-level skills.

Embodied learning in geometry instruction

The tendency for younger children to identify objects based on superficial characteristics, rather than formally defining properties, is not unique to geometry. Children often focus on holistic, characteristic features while classifying common objects such as taxi cabs, islands, and robbers (Keil & Baterman, 1984). On the other hand, adults or domain experts tend to classify objects based on abstract or hidden structure (Carey, 1985; Chi, Feltovich, & Glaser, 1981). This is often described as a general perceptual to conceptual shift in development (Bruner, 1960; Piaget, 1952). On this account, mature concepts are produced through analytical systems that depart qualitatively from simple perceptual, association systems (Carey, 1985; Mandler, 1992; Sloman, 1996). This approach is reflected in Clements, Swaminathan, Hannibal, and Sarama’s (1999) description of higher stages of geometry development as a synthesis of imagistic and declarative knowledge.
Alternatively, conceptual development may be viewed as the reorganization of perceptual and motor systems in accordance with the functional features, or affordances, of objects (Gibson, 1969; Goldstone, 1998; Goldstone, Landy, & Son, 2010; Thelen & Smith, 1994). From this perspective, what was seen as a developmental shift from perception to conception is reframed as a perceptual realignment, in which salient, but irrelevant features are disregarded in favor of more subtle, relevant features (Goldstone & Barsalou, 1998; Quinn & Eimas, 1997; Schyns, Goldstone, & Thibaut, 1998). In the case of geometry, learning about a square, for example, goes beyond assembling a verbal list of features – e.g. four right angles, four congruent sides, etc. – to seeing the shape in terms of these features. Because the enrichment and refinement of initial representations likely involves both perception and action systems, we refer to this process, generally, as embodied learning. How then might embodied learning of geometry concepts be promoted? One possibility is to simply provide feedback on a repetitive perceptual classification task. For instance, Goldstone and colleagues demonstrated successful perceptual learning in a categorization task, with feedback, by presenting multiple, diverse examples (and non-examples) of a target concept over a long series of trials (Goldstone et al., 2010; Son, Smith, & Goldstone, 2011). While this approach may be appropriate for a highly controlled laboratory setting, it may be unsuitable for more dynamic settings with younger children. Alternatively, Glenberg, Gutierrez, Levin, Japuntich, and Kaschak (2004) demonstrated that physical interaction with a concrete model of a narrative improves comprehension more than simply re-reading a story. Likewise, students learn challenging physics concepts better with an interactive computer simulation than by viewing animations of the same system (Chan & Black, 2006).
While using interactive tools to promote conceptual development maintains strong theoretical support, mixed results suggest that these tools can be misapplied. Unguided and unconstrained activity with either a virtual or physical manipulative is likely to lead to rote proceduralization (Hiebert & Wearne, 1986), non-reflective trial-and-error (Simmons & Cope, 1990), and little transfer (Ball, 1992). Rather, a strong curriculum constrains the learner to confront the target concepts directly (Clements, Battista, & Sarama, 2001). In the case of geometry, learners must be constrained to generate polygons according to defining features (e.g. parallel lines, congruent line segments, right angles), while permitting their figures to vary across salient, but non-determining features (e.g. orientation, elongation). Furthermore, guidance is necessary to provide children with initial grounding metaphors for concepts. Specifically, defining features need to be represented in ways that are clear and accessible. While work with indigenous Amazonians (Dehaene & Izard, 2006) indicates that these concepts are intuitively recognized, the mapping between multiple spatial concepts and formal mathematical concepts is often complex. Therefore, the central challenge of a geometric learning tool is to promote the organization and integration of multiple embodied concepts into unified, formal concepts.

Cognitive Design Principles

Given the theoretical perspective detailed above, we constructed the following four basic cognitive design principles, attuned towards the development of geometric knowledge.

Principle 1: Provide an intuitive interface for polygon construction that taps into children’s core knowledge of shapes. As described above, children’s knowledge of shapes extends out of their everyday experiences interacting with solid objects and navigating through the world (Spelke et al., 2010).
Incorporating this type of experience into a learning activity connects the geometry that children are learning in formal settings with the geometry they are practicing in informal settings. Often, with highly symbolic, contextually-abstract learning activities children achieve proficiency with formal mathematics without developing a strong sense of how this knowledge might be applied to authentic settings (Lave, 1988). On the other hand, providing children with a context that is unfamiliar and overly-contextualized, while engaging, may limit the potential for transfer (Son & Goldstone, 2009). By incorporating simple, recognizable objects and/or activities in the game we reinforce the relationship between formal geometry and commonplace tasks. As to the specific guiding metaphor for shape construction, the representation may either elicit the core system for navigation or object-perception. However, as Spelke et al. (2010) note, these systems may ground different aspects of geometric knowledge – i.e., navigation grounds directional sense, object recognition grounds angle, and both ground distance. In choosing a theme that focuses solely on one of these systems, children may find activities grounded in the other system unnecessarily challenging. Yet, as it happens, Logo’s application of a third-person, overhead perspective of an agent tracing a path may represent a suitable blend between these two metaphors. If children imagine the agent’s perspective as it traverses a path, they may access their core navigational system – providing an intuitive sense of direction. On the other hand, if children focus on the shape that emerges as a product of the agent’s actions, they may access their core object recognition system – providing an intuitive sense of angle. Although our activities focus on angle, rather than direction, we chose a similar theme to provide a robust basis for a wide range of potential activities. 
Principle 2: Provide goals and constraints that promote the construction of a diverse set of valid polygons, rather than highly idealized, prototypical polygons. This principle is simply a matter of good curriculum design. U.S. students’ poor performance in geometry may be fundamentally the result of a curriculum that is “a mile wide and an inch deep” (Schmidt & McKnight, 1998; Schmidt, McKnight, & Raizen, 2007, 1997) and devotes little attention to the complexity of geometry. A curriculum that provides only a limited set of shape instances – likely as a result of a limited set of physical blocks or drawing templates – encourages children to develop an overly rigid concept of shape (e.g. a rectangle is always elongated). On the other hand, activities that include a diverse array of stimuli facilitate the “pick-up” (Gibson, 1966) of critical information about shapes, while disregarding superficialities. Our approach to implementing this principle is to focus on a limited number of shapes (six) across a greater number of construction activities (twenty-six). Furthermore, by providing some basic constraints within the game to prevent the construction of prototypical shapes – in some cases – we ensure that children will be exposed to a perceptually diverse set of instances of any specific shape.

Principle 3: Provide instruction that fosters an “embodied” understanding of formal, defining features of polygons. While repeated exposure to a diverse set of shapes could lead to the development of implicit higher-level knowledge, given time constraints – particularly, considering the constructive nature of the task – some degree of direct instruction is necessary to address higher-level concepts efficiently (Klahr & Nigam, 2004). Yet, at this stage, we do not abandon spatially-grounded tools in favor of symbolic procedures.
While advanced geometric procedures – such as deductive proof – rely upon the use of symbolic representations of formal properties, these properties are fundamentally spatial in nature, and require the development of an embodied internal representation. Fostering the development of a spatial representation can be done in a number of ways. Multimedia representations that incorporate text and (potentially animated) images are common, effective, and well-researched (Mayer, 2001; Moreno & Mayer, 1999). Another, more recent procedure is to direct children’s gesture in congruence with the target representation (Broaders et al., 2007; Segal, 2011). Given previous geometry studies’ mention of spontaneous gesture in representing common concepts (Clements & Burns, 2000; Clements et al., 2001), there appears to be a natural relationship between our target concepts (i.e., parallel lines, congruent line segments, and right angles) and gesture. Yet, because of the nature of our research protocol, which limits external guidance and behavioral constraint, enforcing the use of gesture is unfeasible. Rather, by combining multimedia elements and gesture we developed gestural depictions, in which animated hand icons demonstrate a gesture applied to the figure on screen. While these depictions do not ensure that the children will perform their own gestures, or even interpret the gestures as such, they provide a baseline for instruction across all participating children.

Principle 4: Incorporate challenges within the game that promote integration of basic concepts of polygons with higher-level concepts regarding their properties. By implementing Principles 1 and 2, children are responsible for the basic construction of a diverse set of shapes. By implementing Principle 3, children are provided a means of conceptualizing higher-level features of shapes. Therefore, children are guided to develop a grounding metaphor (Lakoff & Núñez, 2000) for both polygons and their properties.
However, these two independent sets of representations must be integrated, or “blended” (Fauconnier & Turner, 1998), into a new representation. Developing an integrated representation is non-trivial, and likely requires an activity exhibiting “desirable difficulties” (Bjork, 1994; Bjork, 2006; Bjork & Linn, 2006) – i.e. procedures which may reduce performance during the learning task, but result in more robust learning. Specifically, in conceptual domains, activities should promote reflection upon relationships between multiple components of a concept (Clark & Linn, 2003; Linn, 2000; Linn, Chiu, Zhang, & McElhaney, 2010). Promoting reflection in a learning activity is a significant design challenge. One approach may be to simply ask children to talk about or write about their ideas. For example, Kurtz, Miao, and Gentner (2001) asked children to write an integrated description of two visual representations of an abstract physical concept. However, the addition of a writing task may not be relevant for all activities, particularly games. Here we choose to incorporate a challenge that is more implicit to the goals of the game. Specifically, children are constrained to accurately construct figures that meet both overall visual constraints as well as specific, formal constraints (described in the next section). By navigating between these two sets of requirements we hope to encourage reflection and development of an integrated representation. The digital geometry game In this section we detail the design of a digital game that implements our four cognitive design principles. While these principles guided the overall construction of the game, some specific choices – not dictated by the principles – were necessary. In most cases these decisions were either arbitrary or reflected some practical considerations given the software development tools or the projected participant sample. 
For example, the choice of a robot navigation theme was made in light of our ongoing work with Lego robotics and previous work with Logo. When potentially consequential, the rationales for specific design decisions are explained below. Initial construction of the shape. Given Principle 1, we chose an easily comprehended narrative premise in which children navigate a robotic agent through an obstacle course; collecting goal objects while avoiding danger objects (Figure 1). Placement of the goals and dangers is designed to constrain learners to the production of a target polygon. By increasing constraint (e.g. less obstacle-free space, more goal objects) we manipulate the difficulty of the exercise as well as the diversity of figures produced by an individual participant from exercise to exercise (i.e. implementing Principle 2). Most importantly, a central set of obstacles, varied between exercises, forms a schematic image of the target polygon. In the planning phase the player is told the name of the target polygon and simply views the array of goals and obstacles, overlaying a grid, with the objective of planning a navigational course for his or her agent. In this phase the player cannot construct his or her figure. The player proceeds by pressing the “build” button when ready. Figure 1. Screenshot of game in the planning phase. Players plan a course of motion starting at the circle (protruding segment indicates initial heading), around the obstacles (rocks and trees), through the goals (flags), and back to the starting circle. In this case the player is planning a parallelogram. In the subsequent construction phase players attempt to produce the imagined path without access to the visual layout of obstacles and goals. The grid does persist to help players coordinate between planning and construction. A figure is built by dragging the mouse to construct lengths and rotating the mouse about a center pivot to construct angles (Figure 2). 
Upon closing the figure to form a polygon, vertices may be adjusted freely via “drag-and-drop”, to refine the overall shape.

Figure 2. Screenshot of game in the construction phase. Players build a closed polygon to represent the path of their agent. In this case the player is attempting to construct an angle of a parallelogram.

The choice to separate planning and construction, and thereby eliminate access to the visual layout of obstacles during construction, was intended to provide a challenge for the game player. If the player were granted a persistent view of obstacles and goals, then producing a successful figure would be a trivial matter of tracing a figure around the obstacles and through the goals. By requiring players to construct a figure from memory, we facilitate the internalization of an instance of the target polygon. While advanced players may mentally encode their projected figure in terms of higher-order properties (e.g. parallel sides, right angles, etc.), we expect that the majority of learners will simply encode their projected figure in terms of an overall, holistic appearance. Therefore the initial construction of the figure is simply intended to promote a basic-level conceptual representation of the target polygon.

Integrating higher-level features. At this point the player has worked with implicit constraints to produce a polygon. While these constraints are intended to suggest the construction of a specific polygon, we cannot ensure that the produced figure will actually be a valid instance of the target polygon, or that the learner has attended to the defining features of the polygon. Consequently, as Principles 3 and 4 assert, learners must be explicitly guided to develop embodied representations of these features, and provided with means for integrating these concepts with their core representation of the target polygon. To implement these principles we introduce a third phase in the construction process – feature validation.
In this phase learners are asked to verify that defining features of the target polygon are exhibited by their constructed figure – including parallel lines, congruent line segments, and right angles. To assist in the development of embodied representations of these defining features, when selected from a menu, a player-controlled animation depicts a gestural representation of the concept, with some additional visual guides (Figure 3). These gestural depictions not only serve as a conceptual grounding for defining features, but function as the feedback mechanism for the validity of defining features in the player’s polygon. Specifically, in the course of manipulating the gestural depictions the player also selects the sides or angles of the figure that he or she believes exhibit the target feature. If correct, within a margin of error, a large check mark appears on screen, and the sides or angles are given standard geometric markings indicating their property (e.g. arrowheads along parallel segments). If incorrect, a large “x” appears on screen, and the child is given an opportunity to either try again (with other sides or angles) or adjust the figure. For parallel lines, players guide two hand icons, which move along the length of two displayed parallel lines, to select the two sides of the polygon that are potentially parallel. The orientation of the hands is automatically adjusted by the computer to match nearby or chosen sides. For congruent segments players guide a pair of hand icons to a side of their polygon. The hands are automatically adjusted by the computer to mark the length of the side. The learner then guides these hands, with their distance fixed, to a second segment for comparison. For right angles, the learner guides two hands, oriented perpendicularly, to fit a green square in the corner of the figure. Margins of error for successful validation are kept small to ensure that this phase of the game provides a challenge for players. 
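To make the validation criteria concrete, the three checks described above can be sketched in code. This is a minimal illustration under stated assumptions, not the game's actual implementation: the function names, the representation of sides as pairs of (x, y) points, and the tolerance values (3 degrees for angles, 5% of length for congruence) are all hypothetical, since the text specifies only that the margins of error are small.

```python
import math

# Hypothetical tolerances; the paper says only that margins of error are small.
ANGLE_TOL_DEG = 3.0
LENGTH_TOL_RATIO = 0.05


def is_parallel(a1, a2, b1, b2, tol_deg=ANGLE_TOL_DEG):
    """True if segment a1-a2 and segment b1-b2 are parallel within tol_deg."""
    ang_a = math.atan2(a2[1] - a1[1], a2[0] - a1[0])
    ang_b = math.atan2(b2[1] - b1[1], b2[0] - b1[0])
    # Segments have no direction, so compare orientations modulo 180 degrees.
    diff = abs(ang_a - ang_b) % math.pi
    diff = min(diff, math.pi - diff)
    return math.degrees(diff) <= tol_deg


def is_congruent(a1, a2, b1, b2, tol_ratio=LENGTH_TOL_RATIO):
    """True if the two segments have equal length within a relative tolerance."""
    len_a = math.hypot(a2[0] - a1[0], a2[1] - a1[1])
    len_b = math.hypot(b2[0] - b1[0], b2[1] - b1[1])
    return abs(len_a - len_b) <= tol_ratio * max(len_a, len_b)


def is_right_angle(prev, vertex, nxt, tol_deg=ANGLE_TOL_DEG):
    """True if the angle at `vertex` between its two adjacent sides is ~90 degrees."""
    v1 = (prev[0] - vertex[0], prev[1] - vertex[1])
    v2 = (nxt[0] - vertex[0], nxt[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    angle = abs(math.degrees(math.atan2(cross, dot)))
    return abs(angle - 90.0) <= tol_deg


# Example figures: a 4-by-2 rectangle and a slanted parallelogram.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
parallelogram = [(0, 0), (4, 0), (5, 2), (1, 2)]
```

Under these assumptions, the rectangle's opposite sides pass both the parallel and congruent checks and its corners pass the right-angle check, whereas the parallelogram passes the parallel check but fails the right-angle check. A tighter tolerance would make validation harder for the player, which is the kind of challenge the narrow margins are meant to create.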
As indicated by Principle 4, we believe that the challenge in coordinating between the physical obstacles in the construction space and the narrow spatial definitions of defining features will prompt reflection by players. For example, in the parallelogram example shown in Figure 1, a player will need to reflect on a way to construct a side that passes through the flag, but maintains a slope that can be matched by its opposite side. Finally, after successful feature validation the learner attempts to test his or her constructed path in the obstacle course. This testing phase also grants players an additional opportunity to view the arrangement of obstacles. After the player directs the initial heading, the robot navigates along the constructed path. If the robot successfully navigates the obstacle course (i.e. collects all goals without intersecting any obstacle), then the player proceeds to a new exercise. If the player is unsuccessful he or she returns to the construction phase to adjust the figure.

Figure 4. Screenshot of game in the testing phase. Players set the initial heading and then watch as their agent navigates the obstacle course according to their design.

Application in research. In the following sections we detail two experiments using this geometry learning game. The main objective of these experiments is to provide supporting evidence for our cognitive design principles. In particular, because research on basic geometric concepts with young children has been conducted successfully elsewhere (e.g. Sarama & Clements, 2002, 2004), we focus on the latter two principles regarding the development of higher-level concepts. In each study we compare students who were randomly assigned to either the full version of the software, with an embodied, feature validation phase, or an alternate version of the software. After completing each of three sets of unit exercises in the game, participants completed a shape identification posttest.
The primary aim of these studies is to determine how a geometry learning task that provides an embodied representation of concepts and facilitates their integration into one’s previously existing representation (i.e., Principles 3 and 4, respectively) will promote conceptual transfer to a simple, but critical identification task. Furthermore, we intend to use in-game data, gathered by the game software, to determine whether the source of differences between conditions strictly relates to the cognitive design principles, or other, unexpected factors of the game-play experience. Experiment 1 As an initial test of our embodied, integration paradigm we compared a group of children with the full feature validation phase – i.e., embodied integration (EI) condition – to a group of children without feature validation – i.e., no integration (NI) condition. Features implementing the first and second principles – i.e., the initial construction task and the curricular sequence of exercises – were the same between groups. Method Participants Twenty-one fourth grade children were initially recruited from an after-school program located in a low-income, predominantly Hispanic neighborhood of New York City. The children were randomly assigned to either the embodied integration or no integration condition. The final embodied integration condition consisted of ten children (M = 9.4 years, SD = .16, 40% female, 90% Hispanic, 10% African American). The no integration condition also consisted of ten children (M = 9.6 years, SD = .28, 50% female, 100% Hispanic). Two children (one from each condition) were native Spanish speakers but could communicate sufficiently in English, and showed little difficulty understanding and completing the tasks. Additionally, one child, originally assigned to the embodied integration condition (but not reflected in the statistics above), was not included in the study due to prolonged absence. Materials and Procedure Game and curriculum. 
All study-related tasks were conducted in the context of a weekly afterschool robotics course. However, in some instances in which students had no other assigned classes, we conducted multiple sessions in a week. Aside from the tools described here the children also engaged in physical construction of Lego-based robots. This supplementary activity did not affect the structure of this study, except to inform the visual design of the learning tool. Children were randomly assigned to either condition at the start of the first class, thereby determining the version of the software they would use throughout the experiment. In the embodied integration condition children learned with the full version of the software, described above. Specifically, children in this condition were required to validate specific defining features in their constructions, within a small threshold of deviation, to complete each exercise. The software validated parallel sides by computing the absolute difference between the two selected sides’ percent gradients (0%: horizontal; 100%: vertical; 50%: diagonal rising left-to-right; -50%: diagonal rising right-to-left). Sides within a 3% gradient difference were considered parallel. Congruent sides were validated by computing the absolute difference between segment lengths (in pixels). Sides within a 23 pixel difference (approximately 7 mm on screen) were considered congruent. Finally, right angles were validated by computing the difference of the degree measure of the internal angle, at the selected vertex, from 90°. Angles within 4° of 90° were considered right. In the no integration condition children did not validate defining features of their constructed figures. There was no explicit constraint on the defining features of the polygon, beyond those implicit in the physical layout of obstacles and goals.
Therefore, while a successful figure may resemble a trapezoid by overall appearance, its opposite sides need not be, necessarily, parallel (Figure 8, in the discussion, shows some examples of formally invalid, yet successful polygons constructed by NI participants). In either condition the children completed a series of exercises across three units of study, focusing on three defining features of polygons: parallel sides, congruent adjacent sides, and right angles, respectively. In the first unit children completed a set of 10 construction exercises, focusing on trapezoids and parallelograms. In the second unit children completed 6 exercises, focusing on kites and rhombi. Finally, in the third unit children completed 6 exercises, focusing on rectangles and squares. Each exercise was designed, via the placement of obstacles and goals, to promote construction of a path resembling the target polygon. Exercises within each unit generally progressed from the most prototypical versions of the polygon to the most irregular, yet still valid, versions (see Appendices A, B, and C for layouts of all levels). To avoid contamination between conditions each learning session was divided into two halves, in which the embodied integration and no integration groups alternated order from session to session. Children not currently engaging in study activities completed homework in a separate room. Advancement through the series of exercises occurred at the individual pace of the participant, causing some students to finish more rapidly than others. During these learning sessions, the experimenters (two of the authors) circulated among the children giving assistance and providing motivational support. As an additional motivational tool, children were supplied with a “stamp certificate” in which each completed exercise earned the child a star sticker or stamp on a personal document. 
We acknowledge that, unlike traditional laboratory tools for psychological research, our learning tool has many moving parts – all of which could affect learning. To assist in the analysis of the learning process, the software provided a numerical record sufficient to reconstruct all operations undertaken by the learner. For this experiment we analyze some general summary statistics regarding the difficulty of the task and the strategies applied by children. If there are significant differences in the effort or strategy undertaken by a child, group differences on the posttest measures will need to be addressed in terms of alternative explanations.

Shape identification. The main assessment measure was individually administered, immediately following a student’s completion of a unit. Each unit posttest included two subtests with 30 trials, for a total of six subtests and 180 trials. The goal of each trial is to identify the two positive examples of a target polygon from four simultaneously presented polygons (see Figure 5). During each trial we recorded which figures were selected, as well as other interactions with the interface (e.g. “mouseovers”, “mouse-clicks”). To encourage more interaction, figures are initially displayed in light gray without a border, but are darkened and highlighted with a black border when “moused-over” – providing a clearer image of the polygon. Upon clicking on a polygon, its internal area is filled blue as a visual indicator of currently selected figures. Figures may be selected and de-selected freely in the process of decision making. Once two figures are selected, the participant may then click on a central button to advance to the next trial. In some cases post-trial feedback may be provided (detailed below). The rationale for this trial structure, with four figures – as opposed to a simpler choice between two – was to make common features of the four polygons explicit and comparable.
Given the evidence that children, without an appropriate intervention, are more likely to classify geometric figures based on holistic similarity than defining features, this trial structure allows us to identify what feature the participant most likely attended to and what feature he or she most likely disregarded. Figure 5. Screenshot of shape identification task. Participants choose 2 valid examples of a polygon (trapezoid) from four available figures. Here, two figures are currently selected (in blue), and one is being highlighted (gray, outlined in black) through mouse interaction. The light gray, un-outlined polygon is neither highlighted nor selected. Six subtests assessed participants’ ability to identify trapezoids, parallelograms (following unit 1), rhombi, isosceles triangles/trapezoids (following unit 2), rectangles and right triangles/trapezoids (following unit 3). The triangle/trapezoid blocks were included to test whether the participant’s concept of the target feature would extend to novel stimuli. To generate a set of stimuli – for trapezoids, parallelograms, rhombi, and rectangles – we began with three valid members of the polygon class, oriented in a prototypical manner (“upright”). We then created three more valid examples by rotating each of the three original one-quarter turn. From each of these six valid polygons we created two characteristically similar, but formally invalid polygons by slightly altering a defining feature (i.e., parallel sides, congruent adjacent sides, or right angles) consistently across all six positive examples. For example, in Appendix F, the positive examples of rhombi (1st and 3rd rows) were altered to become nonexamples (2nd and 4th rows) by lengthening two of four sides. The six positive examples, paired twice with invalid derivations, produced two sets of six valid-invalid pairs (see appendix D – F, H for general structure). 
From each of two sets of six valid-invalid pairs, fifteen combinations of four shapes (i.e., pairs of pairs) were produced for a total of thirty trials. In the case of the mixed triangle and trapezoid blocks, six valid examples of trapezoids were paired with a single invalid derivation, while six valid examples of triangles were paired with a single invalid derivation to produce the two sets of six valid-invalid pairs (see appendix G, I, for general structure). Once again, from each of two sets of six valid-invalid pairs, fifteen combinations were produced for a total of thirty trials. An instruction page, displayed on-screen prior to identification trials, provided a written description of the defining feature-based rules of inclusion for the target polygon class. For example, “A trapezoid is a four-sided figure with one pair of parallel sides”, or in the case of a mixed triangle/trapezoid block, “A shape is right if it contains at least one right angle”. Inclusion of this rule ensured that all participants had sufficient information to identify polygons correctly. Furthermore, in the case of subtests utilizing figures similar to those given in the learning exercises (i.e., trapezoids, parallelograms, rhombi, and rectangles), this instruction page displayed both a valid and invalid example of the polygon. The participants were asked to click the sides or angles corresponding to the target defining feature (parallel sides, congruent sides, right angles) within the valid example. In the case of mixed triangle/trapezoid blocks, which were intended to be novel visual stimuli for the given target feature, no visual example was provided. 
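The trial counts described above follow from simple combinatorics: choosing two of six valid-invalid pairs gives fifteen four-shape trials per set, and two sets give thirty. The following is a minimal sketch of that enumeration, not the authors' stimulus-generation code; the shape labels and the `build_trials` helper are hypothetical names introduced only for illustration.

```python
from itertools import combinations


def build_trials(pair_set):
    """Combine every two valid-invalid pairs into one four-shape trial."""
    return [pair_a + pair_b for pair_a, pair_b in combinations(pair_set, 2)]


# Hypothetical labels: six valid polygons, each paired with one invalid
# derivation per set (two different derivations across the two sets).
set_one = [(f"valid_{i}", f"invalid_{i}_a") for i in range(1, 7)]
set_two = [(f"valid_{i}", f"invalid_{i}_b") for i in range(1, 7)]

trials = build_trials(set_one) + build_trials(set_two)
print(len(build_trials(set_one)), len(trials))  # 15 30
```

Each resulting trial contains exactly two valid and two invalid figures, which matches the task of selecting two class members from four displayed polygons.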
Because of recent work showing that some complex, productive learning tasks are often best suited to preparing students for future learning, rather than transferring conceptual knowledge directly (Hammer & Black, 2009; Schwartz & Bransford, 1998; Schwartz & Martin, 2004), there is a possibility that embodied integration condition participants would show differences from no integration condition participants in learning during the posttest (i.e., changes from performance in the first half of the test to the second half). In this case, post-trial feedback might be necessary to trigger group differences. With an opportunity to work with children across a number of learning units we introduced some variations in feedback across subtests to get a sense of how directly concepts developed in the learning task transferred to the posttest. Specifically, in the first four subtests, following learning Units 1 and 2, participants were informed, following each trial, of the number of correct polygons they had selected (0, 1, or 2). In the final two subtests, following Unit 3, no feedback was given.

Other assessments. To measure the children’s overall mathematical ability, and to ensure that the groups were roughly equivalent, we administered the Woodcock Johnson III Calculation and Mathematical Fluency subtests to each child. The test was administered individually, separately from children performing study-related tasks. This measure was taken following the completion of the study.

Results

Woodcock Johnson. Standardized scores on the Woodcock Johnson III age-normed subtests reveal that both groups were approximately average in performance, although bordering on high average, for Calculation (EI: M = 107.7, SD = 9.7; NI: M = 111.0, SD = 10.0) and average in Math Fluency (EI: M = 104.7, SD = 14.0; NI: M = 99.1, SD = 12.7). No significant difference was found between the two conditions on either Calculation (t[18] = 1.04, p > .1) or Math Fluency (t[18] < 1).
Shape identification. On any given identification trial, in which two out of four shapes were selected, a participant could correctly identify 0, 1, or 2 figures. While in most testing situations one correct would be deemed better than zero correct, here one correct suggests that the participant judged the polygons according to the incorrect, distracter feature (e.g. choosing both “upright” figures). Therefore we coded trials with both figures selected correctly as accurate and trials with one or zero figures selected correctly as inaccurate. We then calculated the percentage of accurate (both-correct) trials within each subtest. On any given trial chance performance is 1 out of 4, or 25%. Figure 6 displays the distribution of participants’ accuracy scores across all six subtests.

Figure 6. Distributions of % accuracy scores in experiment 1. For each of six subtests – (1) Trapezoid, (2) Parallelogram, (3) Rhombus, (4) Isosceles Triangle/Trapezoid, (5) Rectangle, (6) Right Triangle/Trapezoid – the EI condition is displayed on the left and the NI condition is displayed on the right. The results are displayed separately for each block. The x-axis represents the percent of trials in which a participant selected both figures correctly. The y-axis represents the number of participants with the given level of accuracy.

We do not assume that the accuracy scores are necessarily distributed normally. For example, it may be the case that children’s performance was skewed by ceiling/floor effects. Therefore, to compare participants between conditions directly, we performed a series of non-parametric, one-tailed Mann-Whitney tests. Significant results indicate that the distribution of EI condition participants’ accuracy scores is shifted positively from the distribution of the NI condition participants’ accuracy scores. We also compared conditions on mean trial duration.
Because there is no reason to suspect that duration scores should be non-normal we applied a two-tailed t-test in each of the six subtests.

Table 1
Posttest summary measures

Subtest              Condition   Percent of trials correct   Trial duration (secs.)
                                 Median   Min    Max         Mean    SD
Trapezoid            EI          60%      27%    97%         12.4    5.0
                     NI          42%      27%    83%         12.3    5.7
Parallelogram        EI          78%      50%    100%        9.7     5.1
                     NI          48%      7%     90%         10.0    3.3
Rhombus              EI          80%      56%    97%         8.3     2.6
                     NI          58%      13%    90%         11.4    2.8
Isosceles tri/trap   EI          73%      13%    93%         9.6     2.5
                     NI          13%      3%     97%         9.3     4.0
Rectangle            EI          75%      56%    100%        8.4     3.0
                     NI          48%      0%     73%         8.6     2.9
Right tri/trap       EI          23%      0%     60%         11.9    5.3
                     NI          8%       0%     2%          9.3     3.8
Overall              EI          64%      50%    85%         10.1    3.0
                     NI          37%      19%    69%         10.1    2.8

As Table 1 reveals, across all subtests the median percent of trials correct is higher for the EI condition than the NI condition. An overall comparison of conditions, pooling all subtest items and applying a one-tailed Mann-Whitney test, reveals a significant positive shift in the distribution of EI participants’ accuracy scores, compared to NI participants’ accuracy scores (W = 90, p < .01). Additionally, comparisons at each subtest reveal similar advantages for EI condition participants (Trapezoid: W = 74, p < .05; Parallelogram: W = 82, p < .01; Rhombus: W = 80.5, p < .01; Isosceles: W = 83.5, p < .05; Rectangle: W = 89, p < .01; Right: W = 72.5, p < .05). Somewhat surprisingly, there was little difference between mean durations of conditions – overall, pooling all subtest items, average durations were the same. In the case of rhombi a two-tailed t-test revealed a significant difference between conditions, with participants in the EI condition completing trials more rapidly (t[18] = -2.3, p < .05). In all other cases there were no significant differences between conditions (Trapezoid: t[18] = .03, n.s.; Parallelogram: t[18] = -.15, n.s.; Isosceles: t[18] = .20, n.s.; Rectangle: t[18] = -.11, n.s.; Right: t[18] = 1.3, n.s.).
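As background for the Mann-Whitney comparisons reported above, a rank-based statistic of this kind is commonly computed by counting, over all cross-group pairs of participants, how often a score from the first group exceeds a score from the second, with ties counted as one half. The sketch below assumes that definition; the paper does not state which software or variant produced its W values, and the scores used here are made up purely for illustration.

```python
def mann_whitney_u(x, y):
    """Count pairs (xi, yj) with xi > yj, scoring each tie as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Made-up accuracy scores for illustration only; not data from the study.
ei_scores = [0.60, 0.70, 0.80, 0.90]
ni_scores = [0.40, 0.50, 0.60, 0.70]

print(mann_whitney_u(ei_scores, ni_scores))  # 14.0 out of a possible 16
```

Under this definition the statistic ranges from 0 to the product of the two group sizes, so with ten participants per condition the maximum would be 100, reached when every score in the first group exceeds every score in the second.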
To address whether participants’ accuracy improved from the first half of subtests to the second half of subtests, thereby indicating learning within the test itself, our approach was to perform a repeated measures ANOVA with subtest-half as a within-subjects factor and condition as a between-subjects factor. However, as described above, we cannot assume that all accuracy distributions were normal. Therefore, we performed Shapiro-Wilk tests of normality to determine whether the distributions of accuracy scores – separated by condition and subtest – were significantly non-normal. For the EI condition the isosceles triangle/trapezoid subtest showed some departure from normality (W[10] = .82, p < .05), while the other subtests did not (Trapezoid: W[10] = .97, n.s.; Parallelogram: W[10] = .96, n.s.; Rhombus: W[10] = .96, n.s.; Rectangle: W[10] = .98, p > .1; Right: W[10] = .94, n.s.). In the NI condition the trapezoid (marginally) and isosceles triangle/trapezoid blocks showed some departure from normality (Trapezoid: W[10] = .85, p < .1; Isosceles: W[10] = .80, p < .05) while the other subtests did not (Parallelogram: W[10] = .93, n.s.; Rhombus: W[10] = .96, n.s.; Rectangle: W[10] = .89, n.s.; Right: W[10] = .89, n.s.). Given some evidence of non-normal distributions (trapezoid and isosceles triangle/trapezoid subtests) we chose to perform ANOVAs on the four remaining, potentially normal subtests. Reflecting the non-parametric tests, above, a significant effect of condition, favoring the EI condition, emerged (parallelogram: F[1,18] = 8.6, p < .01; rhombus: F[1,18] = 7.2, p < .05; rectangle: F[1,18] = 12.5, p < .01; right: F[1,18] = 4.7, p < .05). In the two subtests with post-trial feedback, as expected, subtest-half proved to be a significant predictor of accuracy, indicating learning from first to second subtest half (parallelogram: F[1,18] = 20.8, p < .001; rhombus: F[1,18] = 19.8, p < .001).
Surprisingly, in the rectangle subtest – with no feedback – subtest-half was also a significant predictor of accuracy (F[1,18] = 12.5, p < .01), while this was not the case with the right triangle/trapezoid subtest (F[1,18] = 1.1, n.s.). The only significant interaction between subtest-half and condition occurred on the rhombus subtest (p < .05). Post-hoc, Bonferroni-corrected t-tests reveal that this interaction is the result of a significant difference between conditions in the first half (t[18] = 3.5, p < .01) and a non-significant difference in the second half (t[18] = 1.1, n.s.). Therefore, feedback afforded the NI condition participants an opportunity to “catch up” with the EI condition. In the other cases, if learning from first to second half occurred, the change was parallel between conditions (parallelogram: F[1,18] = 1.7, n.s.; rectangle: F[1,18] < 1, n.s.; right: F[1,18] = 2.0, n.s.).

Game analysis. Every child successfully completed all 26 game exercises. Due to experimenter error some data files were lost in the second unit of study (3 participants from the control condition). Analysis was conducted on all remaining available data. From game-play data we sought to determine whether conditions were equivalent in terms of difficulty and participants’ strategy. To get a sense of differences in the difficulty of the task we computed the mean duration of each exercise. Likewise, we computed a mean count of the number of “transforms” (i.e. vertex drags) during each exercise. The assumption for these two variables is that a more challenging exercise will require greater time to complete and more adjustments to the shape. To assess whether there were some general strategic differences between conditions we computed two summary variables: mean area and relevant error of the completed figure.
With regard to area, our intention in designing exercises was to elicit the production of polygons that closely resembled the internal set of obstacles (see appendices A-C). While this layout of internal obstacles was intended to be suggestive of a target shape, it did not prohibit other shapes from being constructed. Rather, goal objects and non-internal obstacles were placed to limit the variation of shapes. Yet, even within these constraints, the possibility of large variations remained. The area of the shape (i.e., pixels²) is an indirect measure of how closely a participant’s figure resembled the layout of the internal set of obstacles. We suspect that a smaller area indicates closer attention to the suggested figure. Our summary variable was calculated for each participant by averaging the area of all completed figures within a unit. The mean relevant error assesses the degree to which the defining features of a completed figure deviated from ideal. This statistic reflects the same computation used within the game to validate the accuracy of defining features. In the first unit, focusing on parallel sides, we calculated the slope of relevant sides as percent gradients. We then calculated the absolute difference between pairs of values corresponding to opposite sides (0%: parallel – 100%: perpendicular). In the case of parallelograms, where both pairs of opposite sides are parallel, we averaged this measure across both pairs of sides. In the second unit, focusing on congruent adjacent sides, we calculated the difference between the length of relevant sides (0 – 840 pixels [the screen width]). We computed this value for two pairs of sides (kites) or all four pairs of sides (rhombi), and then averaged. Finally, in the third unit – focusing on right angles – we calculated the absolute difference of all four angles from 90°, and then averaged.
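The three relevant-error computations described above can be sketched as follows. This is an illustrative reconstruction rather than the game's actual code. It assumes that percent gradient is linear in the angle from horizontal (so that 45° maps to 50%, as the scale reported for the validation phase suggests), that differences are folded so +100% and -100% both count as vertical, and that figures are convex quadrilaterals given as four (x, y) vertices. All function and constant names are hypothetical; the tolerances are those reported for feature validation.

```python
import math

# Tolerances reported for the game's feature validation.
PARALLEL_TOL = 3.0    # percent gradient
CONGRUENT_TOL = 23.0  # pixels
RIGHT_TOL = 4.0       # degrees


def percent_gradient(p, q):
    """Direction of segment pq as a percent: 0 = horizontal, +/-100 = vertical.

    Assumption: the scale is linear in the angle from horizontal.
    """
    angle = math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))
    # A side has no direction, so fold the angle into (-90, 90].
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle / 90 * 100


def sides(quad):
    """Sides as (start, end) point pairs: 0 = AB, 1 = BC, 2 = CD, 3 = DA."""
    return [(quad[i], quad[(i + 1) % 4]) for i in range(4)]


def parallel_error(quad, side_pairs=((0, 2), (1, 3))):
    """Mean absolute gradient difference over the given pairs of sides."""
    s = sides(quad)
    diffs = []
    for i, j in side_pairs:
        d = abs(percent_gradient(*s[i]) - percent_gradient(*s[j]))
        diffs.append(min(d, 200.0 - d))  # +100% and -100% are both vertical
    return sum(diffs) / len(diffs)


def congruent_error(quad, side_pairs):
    """Mean absolute length difference (pixels) over the given pairs of sides."""
    s = sides(quad)
    diffs = [abs(math.dist(*s[i]) - math.dist(*s[j])) for i, j in side_pairs]
    return sum(diffs) / len(diffs)


def interior_angle(prev_pt, vertex, next_pt):
    """Angle at vertex in degrees (the interior angle for a convex polygon)."""
    ax, ay = prev_pt[0] - vertex[0], prev_pt[1] - vertex[1]
    bx, by = next_pt[0] - vertex[0], next_pt[1] - vertex[1]
    return math.degrees(math.atan2(abs(ax * by - ay * bx), ax * bx + ay * by))


def right_angle_error(quad):
    """Mean absolute deviation of all four angles from 90 degrees."""
    angles = [interior_angle(quad[i - 1], quad[i], quad[(i + 1) % 4])
              for i in range(4)]
    return sum(abs(a - 90.0) for a in angles) / 4


rectangle = [(0, 0), (200, 0), (200, 100), (0, 100)]
print(parallel_error(rectangle) <= PARALLEL_TOL)  # True
print(right_angle_error(rectangle) <= RIGHT_TOL)  # True
```

For a trapezoid only one pair of opposite sides is relevant, so `side_pairs` would contain a single pair, for example `((0, 2),)`. For a rhombus, `congruent_error` would be called with all four adjacent pairs, `((0, 1), (1, 2), (2, 3), (3, 0))`. Screen coordinates with the y-axis pointing down flip the sign of each gradient but leave the differences unchanged.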
Finally, the summary mean was calculated for each participant by averaging the relevant error of all completed figures within a unit. These statistics are not comparable between units. Participants in the EI condition were explicitly constrained to keep each of these error values small through the feature validation process. Therefore, we expect the mean relevant error of EI condition participants to be small, necessarily. However, it may be the case that participants in the NI condition spontaneously produced figures that adhered to the target’s defining features.

Table 2
Game summary statistics

Unit                           Condition        Exercise duration (mins.)   Transform count   Area (pixels²/1000)   Relevant error (a)
                                                Mean    SD                  Mean    SD        Mean    SD            Mean    SD
(1) Parallel sides             treatment (EI)   11.4    2.5                 22.0    7.2       122     6.3           0.88    0.22
                               control (NI)     6.7     2.3                 11.2    5.7       128     7.0           8.8     2.9
(2) Congruent adjacent sides   treatment (EI)   12.2    3.2                 26.6    6.4       132     5.0           10.2    1.8
                               control (NI)     5.8     2.1                 8.5     6.0       134     6.7           57.0    30
(3) Right angles               treatment (EI)   10.7    5.5                 26.3    9.9       128     3.0           1.1     0.34
                               control (NI)     3.6     1.2                 6.4     1.3       128     3.5           3.0     0.60

a. Relevant error for each unit: (1) abs. difference in % gradients; (2) abs. difference in pixel length; (3) abs. difference from 90°

As shown in Table 2, there are clear differences between conditions. Across all units, two-tailed t-tests reveal that EI condition participants, on average, took longer to complete exercises (Unit 1: t[18] = 4.5, p < .001; Unit 2: t[15] = 4.6, p < .001; Unit 3: t[18] = 4.2, p < .001), and performed more transformation operations than NI condition participants (Unit 1: t[18] = 3.7, p < .01; Unit 2: t[15] = 5.8, p < .001; Unit 3: t[18] = 6.3, p < .001). As expected, figures produced in the EI condition had less error than the figures produced in the NI condition (Unit 1: t[18] = -8.6, p < .001; Unit 2: t[15] = -5.0, p < .001; Unit 3: t[18] = -8.3, p < .001).
The extent of this difference is also noteworthy – figures produced in the NI condition contained approximately ten, five, and three times as much error as EI figures for parallel lines, congruent sides, and right angles, respectively. On the other hand, the differences in area between conditions were less prominent, suggesting a roughly similar adherence to the structure of the internal obstacles. Only in the first unit was there a small significant difference between conditions (t[18] = -2.1, p < .05). On the second and third unit the areas of the completed figures were roughly equivalent (Unit 2: t[15] = -.85, n.s.; Unit 3: t[18] = -.35, n.s.). This change is likely due to a greater use of external obstacles in Unit 2 and Unit 3 layouts, limiting the size and variability of resulting figures.

Discussion

The results of the shape identification task demonstrate that children in the embodied integration condition were more likely than children in the no integration condition to select polygons based on defining features, rather than holistic visual similarity. Given this result we can conclude that the inclusion of the feature validation phase produced some change in the children’s perception/conception of polygons. Furthermore, given that the isosceles triangle/trapezoid and right triangle/trapezoid subtests asked the children to apply their concept of congruent sides and right angles, respectively, to new figures, this result extends beyond stimuli experienced during the learning task. In addition, the embodied integration condition’s advantage was immediate, and did not accelerate due to learning within the test. This suggests that the learning game promoted the development of concepts that transferred intact to the assessment task. In the single case of an interaction, post-trial feedback appeared to attenuate group differences.
However, due to the differences in the in-game data we hesitate to make the stronger conclusion in definitive support of our third and fourth cognitive design principles, regarding the embodiment and integration of higher-level concepts. First and foremost, as the results displayed in Table 2 indicate, embodied integration participants applied considerably more time and effort to successfully completing each exercise. Taking all learning units into account, on average, participants in the embodied integration condition spent twice as much time and performed twice as many transform operations as no integration participants. Additionally, we observed that children in both conditions utilized gestural strategies to produce roughly accurate initial figures, rapidly. In particular, during the planning phase many children would place their fingers directly on the screen at imagined vertices. This technique, presumably, reduced the need to maintain a mental image in working memory to guide the initial construction of the figure. Often, in the case of no integration participants, this initial construction was sufficiently accurate to successfully navigate the obstacle course. In cases where success was not immediate, figures were often constructed in a series of tests and minor revisions, focusing on a single vertex at a time. This strategy of local refinement likely made the attention to the shape, as a whole, unnecessary. On the other hand, in the embodied integration condition the rough placement of fingers typically was not sufficient to produce successful initial constructions. More so, in the case where further refinements were required, participants could not focus on one vertex or side exclusively – as all defining features measure reciprocal relationships between shape components – and therefore were unable to apply the same process of iterative local refinements to produce a successful figure as no integration participants. 
Rather, these participants needed to attend to multiple components of the figure, while simultaneously coordinating with a memory representation of the obstacle layout. This additional layer of challenge may have produced the type of desirable difficulty needed to elicit stronger mental representations (Christina & Bjork, 1991). Yet, even if participants in the no integration condition successfully encoded their figures into memory, large error values suggest that these figures were often poor examples of the target polygon. Figure 7, shown below, provides six examples of trapezoids constructed by participants from both conditions. These constructions were chosen to demonstrate how figures produced in either condition may look holistically similar, while differing in visually subtle, but important ways. Specifically, in each case of a completed trapezoid from the no integration condition both pairs of sides deviate from parallel beyond the threshold required by the embodied integration condition.

Figure 7. Selected completed figures for unit 1, exercise 6. The top row displays three figures constructed by participants in the treatment (embodied integration) condition. The bottom row displays similar figures constructed by participants in the control (no integration) condition.

For the no integration condition, successful completion of exercises with polygons containing deviant defining features may have reinforced the association between specific shapes and holistic characteristics. This difference between conditions undermines an assumption of this experiment that the exposure to a diverse set of valid polygons (i.e., Principle 2) would be held constant between conditions. Ironically, children’s rote, non-reflective process of shape construction may have mitigated further entrenchment of misconceptions.
While this experiment demonstrates that children, left to a minimally constrained learning environment, are unlikely to engage in effective learning practices, we cannot make a stronger case for our specific principles supporting the design of the experimental EI condition. Rather, it may be the case that by producing or viewing a greater volume of valid polygons, regardless of the instructional format, children learn formal concepts more efficiently. In experiment 2 we seek to eliminate these differences between conditions to test our mechanisms of embodiment and integration directly.

Experiment 2

Given the potential confounds discussed in experiment 1, we conducted a second experiment with a strengthened control group. Specifically, children in this new control group were constrained to produce valid polygons, with the same precision as the treatment group, by using a numerical mechanism for feature validation – i.e., the numerical integration (NuI) condition. This additional feature ensured that the participants in the numerical integration condition engaged in a similarly rigorous exercise as participants in the embodied integration (EI) condition. While the results of experiment 1 allow us to conclude, tentatively, that the validation constraints produced superior performance at posttest, experiment 2 is aimed at exploring whether these gains are specifically related to the embodied nature of the task.

Method

Participants

A class of sixteen third grade and three fourth grade students was recruited from the same afterschool program as experiment 1. The children were randomly assigned to either the embodied integration or numerical integration condition. The treatment condition consisted of nine children (M = 9.3 years, SD = .76, 67% female, 100% Hispanic). The numerical integration condition consisted of ten children (M = 9.3 years, SD = .30, 50% female, 100% Hispanic).
Two children in the embodied integration condition were native Spanish speakers, but could communicate sufficiently in English and showed little difficulty understanding and completing the tasks. Additionally, one child, originally assigned to the embodied integration condition (but not reflected in the statistics above), chose not to engage in assessment materials following the first unit and was not included in further elements of the study.

Materials and Procedure

Game and curriculum. As discussed above, the strategic use of fingers for initial constructions was effective for game purposes, but in some cases may have reduced the need to engage in mental imagery and, consequently, the internalization of concepts. Furthermore, some of the younger children in experiment 1 simply found the process of reconstructing the figure from memory overwhelming. Therefore, given the younger sample of children in experiment 2, we assisted in initial figure construction by providing a printed sheet of “thumbnail” images, corresponding to each exercise in the unit. With this guide participants had continuous access to the layout of the obstacle course. Yet, because the image was scaled down (50%), participants would still need to engage in some level of mental imagery to project the illustrations to the screen. The software elements of the planning, construction, and testing phases remained the same as in experiment 1. In the embodied integration version of the game the validation procedure remained the same as in experiment 1. However, in the numerical integration version of the game we introduced an additional panel of numerical values situated below the construction space (Figure 8). This panel displays a series of rectangular blocks with statistical measure(s) associated with each component of the constructed figure, including sides (length, percent gradient) and vertices (degree measure of internal angle).

Figure 8. Screenshot of control version of game used in experiment 2.
(a) Screenshot during construction phase. (b) Screenshot during parallel sides validation. (c) Zoomed-in image of numerical representation. Only “% grade” is (fully) visible during a parallel check. Likewise, only length and degree are (fully) visible during side congruency and right angle checks, respectively. Note: 1.0 length is equivalent to 8 grid boxes, or 23 pixels.

As a participant constructs or adjusts a figure component the corresponding numerical values reflect this change in real time. To validate defining features participants clicked on the boxes associated with the target feature. For example, to validate parallel sides in the parallelogram of Figure 9 a participant could select either the two blocks with “0% grade” (i.e., horizontal) or the two blocks with “-64% grade” and “-65% grade.” The precision thresholds for both conditions were identical (i.e., 3% gradient, 0.1 grid length [or 23 pixels ≈ 7 mm], and 4°).

Shape identification. The main structure of the subtests was retained from experiment 1 – i.e., trapezoid, parallelogram, rhombus, isosceles triangle/trapezoid, rectangle, and right triangle/trapezoid. Given results from experiment 1 that showed no significant advantage of post-trial feedback for either condition, we opted to remove all post-trial feedback from these posttests. The instructions, including text and (in some cases) figures with selectable sides or angles, remained the same as in experiment 1. While the process of constructing trials remained the same as in experiment 1, some polygon stimuli were slightly altered to maximize screen space and increase the variation between valid polygons. These revised stimuli are provided in the appendices.

Other assessments. As in experiment 1, the children were assessed with the Woodcock Johnson III Calculation and Math Fluency subtests, tested in random order over the course of the last three weeks of the study.

Results

Woodcock Johnson.
Standardized scores with Woodcock Johnson III age-normed subtests reveal that both groups were approximately average in Calculation performance (Treatment SS: M = 96.1, SD = 19.8; Control SS: M = 102.0, SD = 16.2) and average in Math Fluency, although bordering on low average for the treatment group (Treatment SS: M = 90.0, SD = 16.6; Control SS: M = 96.0, SD = 12.7). No significant difference was found between the two conditions on either Calculation (t[17] < 1) or Math Fluency (t[17] < 1).

Shape identification. As in experiment 1, we considered the percentage of trials in which participants selected both shapes correctly to be the best indicator of accuracy. Additionally, as in experiment 1, we did not assume that these scores would be distributed normally. As such, we performed strictly nonparametric, one-tailed, Mann-Whitney tests to analyze participant accuracy. Also, as in experiment 1, we compared conditions for mean trial duration using two-tailed t-tests.

Table 3
Posttest summary measures

                                 Percent of trials correct    Trial duration (secs.)
Subtest              Condition   Median    Min     Max        Mean     SD
Trapezoid            EI          37%       13%     90%        11.6     3.8
                     NuI         20%       3%      47%        9.5      1.4
Parallelogram        EI          27%       3%      83%        11.2     2.8
                     NuI         10%       0%      90%        6.8      1.0
Rhombus              EI          56%       23%     97%        9.9      2.6
                     NuI         13%       0%      87%        7.8      1.5
Isosceles tri/trap   EI          39%       3%      53%        13.2     2.8
                     NuI         7%        0%      40%        6.9      1.4
Rectangle            EI          63%       3%      90%        8.9      1.6
                     NuI         5%        0%      77%        6.1      1.3
Right tri/trap       EI          23%       0%      30%        8.9      3.5
                     NuI         8%        0%      33%        6.4      1.7
Overall              EI          37%       19%     67%        10.6     1.4
                     NuI         11%       4%      40%        7.3      0.9

As Table 3 displays, across all subtests, the median percent of trials correct is higher for the EI condition than the NuI condition. In an overall comparison of conditions, pooling all subtest items, a one-tailed Mann-Whitney test reveals a significant positive shift in the distribution of EI participants’ accuracy scores, compared to NuI participants’ accuracy scores (W = 76, p < .01).
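For readers unfamiliar with the test, the following is a minimal sketch of a one-tailed Mann-Whitney comparison on small samples. It assumes that the reported W is the usual count of (EI, NuI) pairs in which the EI score is larger, with ties counted as one half; the source does not name the software used, so this is an assumption. The function names and the accuracy values are hypothetical toy inputs, not data from the study.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """Count pairs (xi, yj) with xi > yj; a tie contributes 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


def exact_one_tailed_p(x, y):
    """Exact permutation p-value for the alternative 'x tends to exceed y'.

    Enumerates every way of splitting the pooled scores into groups of the
    original sizes, which is feasible only for small samples.
    """
    pooled = list(x) + list(y)
    observed = mann_whitney_u(x, y)
    indices = range(len(pooled))
    hits = 0
    total = 0
    for chosen in combinations(indices, len(x)):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if mann_whitney_u(group_x, group_y) >= observed:
            hits += 1
    return hits / total


# Hypothetical toy accuracy scores (NOT taken from the experiment).
toy_ei = [0.50, 0.60, 0.70, 0.80]
toy_nui = [0.10, 0.20, 0.30, 0.40]

print(mann_whitney_u(toy_ei, toy_nui))      # 16.0: every toy EI score is larger
print(exact_one_tailed_p(toy_ei, toy_nui))  # 1 of the 70 possible splits
```

With the group sizes reported here (9 and 10), the statistic can range from 0 to 90, so a value of 76 means that most EI-NuI pairings favored the EI participant.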
Additionally, comparisons within five of six subtests reveal similar advantages for EI condition participants (Trapezoid: W = 68.5, p < .05; Parallelogram: W = 69.5, p < .05; Rhombus: W = 74, p < .01; Isosceles: W = 71.5, p < .05; Rectangle: W = 73, p < .05). Only in the case of right triangles/trapezoids was there no significant difference (W = 57, n.s.), likely reflecting the low overall scores for participants in both conditions for this subtest (median accuracy scores at or below chance). Unlike experiment 1 there were several large differences in durations between conditions. Overall, pooling all subtest trials, two-tailed t-tests reveal that EI condition mean durations were greater than NuI condition mean durations (t[17] = 6.2, p < .001). Additionally, comparisons within individual subtests reveal significantly greater durations for EI condition participants in four cases (Parallelogram: t[17] = 4.6, p < .001; Rhombus: t[17] = 2.1, p < .05; Isosceles: t[17] = 6.3, p < .001; Rectangle: t[17] = 4.2, p < .001). In the case of right triangles/trapezoids the EI condition participants showed a trend towards longer trial duration (t[17] = 2.0, p < .1) than NuI condition participants. Finally, in the case of trapezoids there was no significant difference between conditions (t[17] = .03, n.s.).

Game analysis. As stated in the discussion of experiment 1, large differences in learning game performance confounded interpretation of posttest results. In experiment 2, the numerical representation was introduced to offer a comparable alternative method to gestural depictions.

Table 4
Game summary statistics

                                            Exercise duration   Transform       Area                Relevant
                                            (mins.)             count           (pixels²/1000)      error (a)
Unit                            Condition   Mean     SD         Mean    SD      Mean     SD         Mean    SD
(1) Parallel sides              treatment   19.9     3.9        36.2    9.3     124      4.4        1.0     0.16
                                control     17.3     3.2        29.7    12.3    126      6.7        0.86    0.2
(2) Congruent adjacent sides    treatment   13.9     3.9        25.1    11.8    128      2.6        12.6    8.4
                                control     10.2     3.2        15.2    7.9     131      6.7        7.0     2.8
(3) Right angles                treatment   14.6     4.1        27.5    11.6    126      5.9        1.5     0.39
                                control     10.2     3.0        11.4    4.3     125      6.1        1.0     0.16

(a) Relevant error for each unit: (1) abs. difference in % gradients; (2) abs. difference in pixel length; (3) abs. difference from 90°.

As shown in Table 4, there were some differences between conditions. Two-tailed t-tests reveal that EI condition participants, on average, took longer to complete exercises in Unit 2 and Unit 3 (Unit 2: t[17] = 2.3, p < .05; Unit 3: t[16] = 2.3, p < .05), but not in Unit 1 (t[17] = 1.6, n.s.). Similarly, EI condition participants showed a trend towards performing more transformation operations than NuI condition participants in Unit 2 (t[17] = 2.0); performed significantly more transformation operations in Unit 3 (t[16] = 3.9, p < .01); and did not differ significantly in the number of transformation operations in Unit 1 (t[17] = 1.3, n.s.). As in experiment 1, the differences in the areas of completed figures were small, and in this case revealed no significant differences (Unit 1: t[17] = -.86, n.s.; Unit 2: t[17] = -1.4, n.s.; Unit 3: t[16] = .42, n.s.). Most strikingly, unlike experiment 1 the differences between conditions in relevant error were small. Only in Unit 2 did NuI condition participants show a trend towards less error than EI condition participants (t[17] = 2.0, p < .1). In the other two units there were no significant differences between conditions (Unit 1: t[17] = 1.5, n.s.; Unit 3: t[16] = 1.5, n.s.).

Discussion

Like experiment 1, experiment 2 showed that participants in the embodied integration condition were more likely to identify polygons accurately than participants in the numerical integration condition.
Specifically, in one-tailed tests of group differences, embodied integration participants outperformed numerical integration participants in five out of six subtests. Clearly, the subtest in which both conditions performed worst, i.e., right triangles/trapezoids, was simply too challenging for these children. Given that this experiment was conducted on a set of end-of-year third graders, instead of beginning-to-mid-year fourth graders, we are not surprised to discover decreased accuracy, overall. The subtle differences between valid-invalid pairs proved taxing at times for children in both conditions. For future studies, when addressing similarly young children, we plan to make discrepancies in defining features more pronounced. In addition to differences between conditions in posttest accuracy, large differences in posttest duration emerged in experiment 2, which were not present in experiment 1. The faster response times of participants in the numerical integration condition, in combination with their at- or below-chance (25%) accuracy in all subtests, suggest that these children were responding quickly to figures with high visual similarity. This echoes Behrman and Brown’s (1968) findings, discussed earlier, that individuals are likely to classify shapes based on salient, yet informal characteristics, such as elongation. On the other hand, seeking differences in the defining features provided a greater perceptual challenge, and required greater time-on-trial. While these results favor the embodied integration manipulation, recall that in experiment 1 large differences between conditions in the learning task tempered interpretation in terms of our cognitive design principles. In particular, large errors in the completed figures of no integration participants may have reinforced inappropriate associations between polygons and characteristic features.
In contrast, in experiment 2, the game data show a much smaller difference in the errors between conditions, supporting the efficacy of the manipulation. In the single significant case the numerical integration participants, on average, constructed more accurate figures than embodied integration participants, despite their reduced posttest scores. This suggests that the attended information (i.e., numerical representations of geometric properties) facilitated the construction of more accurate shapes but did not seem to change their concept of the shape, as measured by the identification task. However, some differences in time (in units 2 and 3) and number of constructions (in unit 2, marginally, and unit 3) did emerge between conditions. Yet, as noted in experiment 1, embodied integration participants’ exercise durations were, on average, more than double those of the control participants, while in experiment 2 these differences were less than 50%. To some extent this difference in time-on-task may reflect inherent and unavoidable differences between spatial and numerical processes. For example, while reducing a 4% difference in gradients to a 3% difference, to meet margin of error constraints, is perceptually challenging in the embodied task, the corresponding task is trivial with numerical values. In many cases children in the numerical integration condition discovered numerical margins of error explicitly, and were able to apply this knowledge while adjusting their figures. This likely reduced the number of transform operations they needed to perform. For example, in the third learning unit – which showed a large difference between conditions in number of transforms – participants in the numerical integration condition could adjust a single vertex while attending to the continuously updating numerical representation of an angle until it became exactly 90°.
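To make the margin-of-error constraints concrete, the sketch below checks the three defining features against the thresholds stated in the method (3% gradient, 0.1 length, 4°). It is an illustration under stated assumptions, not the game's actual code: the function names are hypothetical, coordinates are assumed to be in the game's length units, the thresholds are assumed to be inclusive, and vertical sides (whose percent grade is undefined) are simply rejected.

```python
import math

# Thresholds taken from the text; treating them as inclusive is an assumption.
GRADIENT_TOL = 3.0  # percentage points of grade
LENGTH_TOL = 0.1    # game length units
ANGLE_TOL = 4.0     # degrees


def percent_grade(p, q):
    """Slope of side p-q as a percentage (rise over run times 100)."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    if dx == 0:
        raise ValueError("vertical side: percent grade is undefined")
    return 100.0 * dy / dx


def side_length(p, q):
    """Euclidean length of side p-q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def vertex_angle(prev_pt, vertex, next_pt):
    """Angle in degrees between the two sides meeting at a vertex.

    For a convex vertex this equals the internal angle.
    """
    ax, ay = prev_pt[0] - vertex[0], prev_pt[1] - vertex[1]
    bx, by = next_pt[0] - vertex[0], next_pt[1] - vertex[1]
    cos_value = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))


def is_parallel(side_a, side_b):
    """True when the two grades differ by no more than the tolerance."""
    return abs(percent_grade(*side_a) - percent_grade(*side_b)) <= GRADIENT_TOL


def is_congruent(side_a, side_b):
    """True when the two lengths differ by no more than the tolerance."""
    return abs(side_length(*side_a) - side_length(*side_b)) <= LENGTH_TOL


def is_right_angle(prev_pt, vertex, next_pt):
    """True when the vertex angle is within the tolerance of 90 degrees."""
    return abs(vertex_angle(prev_pt, vertex, next_pt) - 90.0) <= ANGLE_TOL


# Sides with -64% and -65% grade pass the parallel check, as in the text's example.
side_64 = ((0, 0), (100, -64))
side_65 = ((0, 50), (100, -15))
print(is_parallel(side_64, side_65))
```

A participant in the numerical condition can read such differences directly from the panel, whereas a participant in the embodied condition must judge them perceptually and then test the figure.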
On the other hand, participants in the embodied integration condition received no such immediate assurances that their angle was accurate. Rather, these participants would need to commit to a particular angle, attempt to validate it, and then apply another transformation if incorrect. Therefore, we suggest that these differences likely reflect increased procedural encumbrance in the embodied integration condition rather than any real difference in conceptual rigor.

General discussion

Results from experiment 1 and experiment 2 demonstrate how a computer game may promote learning of geometric concepts. However, the goal of this study was not simply to demonstrate the efficacy of our particular learning tool, but to study general principles of learning as applied to geometry. The computer game simply represents one possible instantiation of a set of four cognitive design principles. Specifically, the game (1) utilized an intuitive mechanism for initial construction, (2) exposed children to a diverse set of valid shapes, (3) promoted embodied representations of higher-level concepts, and (4) incorporated a challenging activity to promote integration of multiple embodied representations. The experiments described here assumed the validity of the first two principles (based upon prior research) and tested the latter two principles. Our choice to test these two principles together, rather than separately, reflects our opinion that both principles play a necessary, reciprocal role in developing children’s knowledge. Without an integrating mechanism, even strong embodied representations are simply disconnected – and easily forgotten – pieces of knowledge. Conversely, without embodied representations, higher-level concepts lose their connection to intuitive representations.
While these speculations will need to be tested directly, we suggest that the results of this study, and potential future studies, may contribute to literature regarding both (1) the role of embodiment in learning and (2) conceptual development in geometry.

Embodiment in learning. While the term “embodied” seems to suggest that physical movement – perhaps even whole-body movement – should be the primary source for learning, we take a broader view of the term – perhaps best encapsulated by Barsalou’s (2008) “grounded” approach – in which the notion of embodiment or grounding serves as a counterbalance to classical views of cognition that portray abstract thinking as qualitatively dissimilar to processes in the perception and action systems. Rather, higher-level thinking emerges from a wealth of experience with systems that interface with the physical environment. From this perspective, embodied knowledge may be strictly perceptual, strictly motor, or some combination of the two. With this in mind, the focus of our embodied integration condition was to facilitate any number of embodied representations of target concepts, which could vary across this perception-action spectrum. While the gestural depictions were completely visual, by the end of each learning unit all children had either spontaneously discovered the corresponding hand gestures or seen another child or experimenter model those gestures. Yet, the frequency of and purpose behind the use of gestures varied among children. In some cases, gestures appeared to serve as an initial reminder of the meaning of some geometric concept, but faded as children became increasingly fluent. For some children gestures were adopted as a tool for accurate polygon construction – i.e., by matching their hands with components of the constructed figures.
Similarly, some children, recognizing the imprecision of these gestures, applied physical artifacts from the classroom – such as a straight edge or the corner of a sheet of paper (for right angles) – to increase their precision in shape construction. Finally, in some cases children showed little use of external tools or gestures. Given the breadth of these behaviors, several questions arise regarding the proper course of learning. First, is there a single, optimal strategy for any particular concept? Should children be taught and expected to use appropriate gestures or are these gestures simply a poor substitute for more precise tools? On the other hand, does the use of precise tools, and to a lesser extent gestures, interfere with the construction of robust internal representations? In other words, will a child who learned to measure right angles with a protractor be unable to recognize this feature in the absence of the tool? Although we did not track gestures explicitly, we did observe some general patterns in their application. We noticed that children with stronger initial concepts of parallelism, congruency, and perpendicularity more readily demonstrated corresponding gestures – at least in the beginning of a learning unit. For struggling students, a demonstration of the corresponding hand gesture, by peer or experimenter, was often required – and in many cases, multiple times for the same child. Taken together, these observations support Beilock and Goldin-Meadow's (2010) notion of a reciprocal relationship between conceptual thought and gesture. In several cases, children who successfully learned to apply gesture as a tool in the learning game applied this strategy during posttest. Yet, often these students found that the smaller size of the posttest stimuli, relative to figures constructed during the learning task, made the accurate use of gestures for measurement prohibitive.
As such, children often resorted to a largely perceptual strategy, with varying degrees of success. To assess the necessity and utility of gestures directly, a possible avenue for future research may lie in emerging gesture-based technologies, such as multi-touch screens and motion-capture systems. With these systems our research may shift from merely suggesting a gesture to enforcing the use of a specific gesture. For example, with a motion-capture system a learner might align his or her arms with opposite sides of a figure to validate parallel lines. Similarly, with a multi-touch screen, a user might simply run two fingers, in parallel, along the length of opposite sides. In either case a specific gesture, requiring some form of direct instruction, becomes a necessary element of the task. Consequently, does the invention of a gestural or non-gestural strategy – even a sub-optimal one – facilitate conceptualization better than direct instruction? In some cases, engaging in “productive failure” – where discovery and invention are attempted unsuccessfully – may benefit students more than being taught a procedure directly (Kapur, 2008). By teaching students to use a single gesture or tool, do we deprive them of an important opportunity to critically examine their own ideas? For example, a student might benefit from constructing his or her own gesture or visual depiction for parallel lines. While the spontaneous strategy constructed by a learner is likely to be physical in nature, there is, perhaps, an important distinction between those strategies that internalize knowledge into embodied representations and those that rely upon external tool use. As Martin and Schwartz (2005) discovered, features of the environment may support performance during learning at the cost of long-term retention and transfer.
Likewise, in our work here, several students, who successfully relied upon a physical tool to produce right angles (the corner of a sheet of paper), did not succeed when their tool was not available during testing. In these cases, failure may not reflect poor performance during the learning task, but simply an arbitrary choice of strategy. Therefore, while the use of tools, gesture, or fully internalized (e.g. perceptual) strategies are all valid, practical considerations dictate that learning tools support strategies and representations which are most portable to novel activities – including standardized assessments. Development of mature, integrated concepts in geometry. Unless we adopt a highly amodal, abstract interpretation of mental imagery and representation (e.g. Pylyshyn, 1973), then we must assume that geometric concepts are at least partially built upon perceptual primitives. Yet, are these imagistic primitives like a set of exemplar snapshots, or perhaps something more schematic in nature, such as “perceptual symbols” (Barsalou, 1999)? In the case of number, representation appears to be spatial in nature and independent of any particular modal system (Dehaene, 1997). Likewise, it may be the case that mature geometric representations abstract to this same spatial level of representation – facilitating both perception and action. For example, a child who can easily recognize a rectangle visually likely can also draw a rectangular box or walk a rectangular path. In the process of developing this conceptual-spatial representation exposure to a narrow range of stimuli can lead to misconceptions. For example, the child (or adult) viewing an obliquely oriented square may have difficulty seeing the shape as anything but a rhombus. Clearly, learning to identify defining features is an important remedy for this misconception. 
Yet, does the process of learning features fundamentally change the way one perceives a shape or is it simply a supplementary strategy, independent of the initial recognition process? In the former case, the child learns to see the square in terms of right angles and has little difficulty identifying a “rotated square” correctly as a square. In the latter case, the child still sees the “rotated square” only as a rhombus (or “diamond”), but after a highly controlled search for right angles concludes that the shape must also be a square, contrary to his or her own intuition. In other words, how closely integrated are higher-level concepts with intuitive concepts? We suspect that when children are first introduced to defining features, the processes of recognizing a shape by appearance and recognizing defining features are initially loosely coupled, and their association needs continued external reinforcement. However, given sufficient experience, the representation guiding shape recognition deeply integrates these new features. So, while a child may have to be initially reminded, repeatedly, to search for right angles to identify rectangles, eventually he or she will simply see right angles as salient features in rectangular figures. In this process, gesture might serve both as a means to conceptualize and as a source of informal feedback for instructors. As a child’s concept of a shape increasingly shifts to reflect defining features we might expect to see less explicit use of gesture. In a simple example with counting and arithmetic, children often use fingers as a tool to support cognitive processes; yet, as they grow more competent they begin to internalize these processes (Siegler & Robinson, 1982). In the case of a 4th grade, Logo-based geometry task, Clements and Burns (2000) observed spontaneous gestures corresponding to angles and turns, which the children later “curtailed” – i.e., physically scaled down – with increased expertise.
Of course, gesture is only one indicator of an explicit cognitive process of coordinating multiple geometry concepts. In general, as this explicit, deliberative process – in whatever form it takes – becomes unnecessary, the duration of the corresponding tasks should decrease. While we cannot compare participants in our two experiments directly, due to some differences in stimuli, we note that for younger children (experiment 2) – who found the task more difficult, overall – trial durations were significantly greater in the treatment condition than in the control condition (for 4 out of 6 subtests). For the older children (experiment 1) this was not the case – and in fact the reverse was true in one subtest. This suggests a general U-shaped pattern of development in which children with a novice understanding of shape (i.e., children in the control condition of both experiments) quickly respond according to their intuitive, everyday-object-based concept of shape. When first beginning to learn defining feature concepts, the process of shape identification requires multiple deliberative steps that result in longer trial durations (EI condition in the second experiment). Finally, when children begin to master these new concepts and integrate them into their central representation for a shape, the identification process becomes more rapid (EI condition in the first experiment). It is likely that the 4th grade children had more prior experience with concepts of parallel lines, congruent sides, and right angles than the 3rd grade children, and therefore integrated these concepts more efficiently in experiment 1. To test this hypothesis directly, in future research, we may introduce more iterations of the identification task in a microgenetic design. To be clear, we do not claim that greater expertise will curtail all gesture, but simply reduce children’s application of gesture as an explicit tool to “lighten the cognitive load” (Goldin-Meadow, Nusbaum, Kelly, & Wagner, 2001).
Gesture is likely to remain as a general product of thinking about spatial concepts. Yet, if a natural reduction of gesture – in either abundance, necessity, or scale – is indeed a hallmark of conceptual enrichment, should it be promoted explicitly, or allowed to emerge according to the individual child? In the case of numerical development, typically developing children tend to shift from finger-based strategies to verbal or mental strategies as they progress through early primary school. In fact, delay of this shift is frequently associated with mathematical disability (Geary, Hamson, & Hoard, 2000; Geary, Hoard, & Hamson, 1999). However, the transition to verbal and mental strategies is often achieved through an auto-regulative process of strategic variation in which the more efficient strategies are eventually adopted (Siegler, 1994). Attempting to enforce such curtailment with geometric concepts may contribute to misunderstandings. On the other hand, the gradual removal of external supports (scaffolds) is a well-reasoned form of instruction (Vygotsky, 1978). In the case of teaching abstract concepts Goldstone and Son (2005) apply an instructional strategy of “concreteness fading”, in which initially high-context stimuli are reduced in complexity over the learning process to promote understanding that is not bound to a specific environment. Would a similar technique with gesture also be appropriate? Using some of the gesture-based designs discussed above, we plan to address this issue in future studies.

Curriculum recommendations. This study reinforces the need to promote the development of robust spatial representations with engaging, productive, and challenging activities. While this may seem nearly self-evident, geometry instruction is often quite superficial and lacks these practices (Clements, 2004). For example, children often engage in activities with cut-out or physical “pattern blocks”.
Though these blocks provide a potential grounding for important geometric concepts, they are often quite limited and prototypical. As we see in experiment 2, simple exposure to valid shapes does not entail that children are attending to and discovering critical features of the concept. While technology can facilitate the implementation of research-based curriculum, it is not essential for classroom instruction. Following our four cognitive design principles, children can easily draw, with the aid of a straightedge, a diverse set of polygons that fit given physical constraints – including the “obstacle course” design applied here. Furthermore, an instructor may demonstrate hand gestures to represent defining features of shapes, and show how to use a precise tool, such as a protractor or ruler, to validate the accuracy of these defining features. Providing students with the responsibility to validate their own shapes, or those of their classmates, may even enhance this “analog” activity. Finally, we hope that this work will encourage others within the cognitive and educational fields to address the topic of geometry with the rigor and attention afforded to other areas of mathematics, such as numerical development. Geometry was, of course, an essential element in classical education and worthy of lengthy study by Piaget (Piaget et al., 1960). Geometry integrates a wide range of cognitive processes – including numeracy, spatial sense, object recognition, navigation, and language – and represents an ideal domain to study the development of complex mathematical concepts.

References

Anderson, J. R., Boyle, C. F., Corbett, A., & Lewis, M. W. (1987). Cognitive principles in the design of computer tutors. In P. Morris (Ed.), Modeling cognition (pp. 93-134). New York: Wiley.
Ball, D. L. (1992). Magical hopes: Manipulatives and the reform of math education. American Educator, 16(1), 14-19.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577-609.
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59(1), 617-645.
Barsalou, L. W. (2010). Grounded cognition: Past, present, and future. Topics in Cognitive Science, 2(4).
Behrman, B. W., & Brown, D. R. (1968). Multidimensional scaling of forms: A psychological analysis. Perception & Psychophysics, 4(1), 19-25.
Beilock, S. L., & Goldin-Meadow, S. (2010). Gesture changes thought by grounding it in action. Psychological Science, 21(11), 1605-1610.
Berliner, D. (2006). Our impoverished view of educational reform. Teachers College Record, 108(6), 949-995.
Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In J. Metcalfe & A. Shimamura (Eds.), Metacognition: Knowing about knowing (pp. 185-205). Cambridge, MA: MIT Press.
Bjork, R. A. (2006). Desirable difficulties. Learning.
Bjork, R. A., & Linn, M. C. (2006). The science of learning and the learning of science: Introducing desirable difficulties. APS Observer, 19(3), 29-39.
Bomba, P. C., & Siqueland, E. R. (1983). The nature and structure of infant form categories. Journal of Experimental Child Psychology, 35, 294-328.
Booth, J. L., & Siegler, R. S. (2006). Developmental and individual differences in pure numerical estimation. Developmental Psychology, 42(1), 189-201.
Broaders, S. C., Cook, S. W., Mitchell, Z., & Goldin-Meadow, S. (2007). Making children gesture brings out implicit knowledge and leads to learning. Journal of Experimental Psychology: General, 136(4), 539-550.
Brown, M. C., McNeil, N. M., & Glenberg, A. M. (2009). Using concreteness in education: Real problems, potential solutions. Child Development Perspectives, 3(3), 160-164.
Bruner, J. S. (1960). The culture of education. Cambridge, MA: Harvard University Press.
Bruner, J. S. (1966). Toward a theory of instruction. Cambridge, MA: Harvard University Press.
Burger, W.
F., & Shaughnessy, J. M. (1986). Characterizing the van Hiele levels of development in geometry. Journal for Research in Mathematics Education, 17(1), 31-48.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: MIT Press.
Chan, M., & Black, J. B. (2006). Direct-manipulation animation: Incorporating the haptic channel in the learning process to support middle school students in science learning and mental model acquisition (pp. 64-70). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5(2), 121-152.
Christina, R. W., & Bjork, R. A. (1991). Optimizing long-term retention and transfer. In D. Druckman & R. A. Bjork (Eds.), In the mind’s eye: Enhancing human performance (pp. 23-56). Washington, DC: National Academy Press.
Clark, A. (1999). An embodied cognitive science? Trends in Cognitive Sciences, 3(9), 345-351.
Clark, D., & Linn, M. C. (2003). Designing for knowledge integration: The impact of instructional time. Journal of the Learning Sciences, 12(4), 451-493.
Clements, D. H. (2004). Geometric and spatial thinking in early childhood education. Engaging young children in mathematics: Standards for early childhood education (pp. 267-297). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Clements, D. H., & Battista, M. T. (1992). Geometry and spatial reasoning. In A. E. Kelly & R. A. Lesh (Eds.), Handbook of research on mathematics teaching and learning (pp. 420-464). New York: Macmillan.
Clements, D. H., & Burns, B. A. (2000). Students’ development of strategies for turn and angle measure. Educational Studies in Mathematics, 41(4), 31-45.
Clements, D. H., Battista, M. T., & Sarama, J. (2001). Logo and geometry. Journal for Research in Mathematics Education. Monograph, 10, i-177.
Clements, D. H., Swaminathan, S., Hannibal, M. A. Z., & Sarama, J. (1999).
Young children’s concept of shape. Journal for Research in Mathematics Education, 30(2), 192-212.
Dehaene, S. (1997). The number sense: How the mind creates mathematics. New York: Oxford University Press.
Dehaene, S., & Izard, V. (2006). Core knowledge of geometry in an Amazonian indigene group. Science, 311(5759), 381-384.
Dehaene, S., Piazza, M., Pinel, P., & Cohen, L. (2003). Three parietal circuits for number processing. Cognitive Neuropsychology, 20(3-6), 487-506.
Dehaene, S., Spelke, E. S., Pinel, P., Stanescu, R., & Tsivkin, S. (1999). Sources of mathematical thinking: Behavioral and brain-imaging evidence. Science, 284, 970-974.
Fauconnier, G., & Turner, M. (1998). Conceptual integration networks. Cognitive Science, 22(2), 133-187.
Feigenson, L., Dehaene, S., & Spelke, E. S. (2004). Core systems of number. Trends in Cognitive Sciences, 8(7), 307-314.
Fuson, K. C., & Briars, D. J. (1990). Using a base-ten blocks learning/teaching approach for first- and second-grade place-value and multidigit addition and subtraction. Journal for Research in Mathematics Education, 21(3), 180-206.
Geary, D. C., Hamson, C. O., & Hoard, M. K. (2000). Numerical and arithmetical cognition: A longitudinal study of process and concept deficits in children with learning disability. Journal of Experimental Child Psychology, 77(3), 236-263.
Geary, D. C., Hoard, M. K., & Hamson, C. O. (1999). Numerical and arithmetical cognition: Patterns of functions and deficits in children at risk for a mathematical disability. Journal of Experimental Child Psychology, 74(3), 213-239.
Gibson, E. J. (1969). Principles of perceptual learning and development. New York: Appleton-Century-Crofts.
Gibson, J. J. (1966). The senses considered as perceptual systems. New York: Houghton Mifflin.
Ginsburg, A., Cooke, G., Leinwand, S., Noell, J., & Pollock, E. (2005). International mathematics performance: New findings from the 2003 TIMSS and PISA.
Washington, DC: American Institutes for Research.
Glenberg, A. M., Gutierrez, T., Levin, J. R., Japuntich, S., & Kaschak, M. P. (2004). Activity and imagined activity can enhance young children’s reading comprehension. Journal of Educational Psychology, 96(3), 424-436.
Goldin-Meadow, S., Nusbaum, H., Kelly, S. D., & Wagner, S. (2001). Explaining math: Gesturing lightens the load. Psychological Science, 12(6), 516-522.
Goldstone, R. L. (1998). Perceptual learning. Annual Review of Psychology, 49, 585-612.
Goldstone, R. L., & Barsalou, L. W. (1998). Reuniting perception and conception. Cognition, 65(2-3), 231-262.
Goldstone, R. L., & Son, J. Y. (2005). The transfer of scientific principles using concrete and idealized simulations. The Journal of the Learning Sciences, 14(1), 69-110.
Goldstone, R. L., Landy, D. H., & Son, J. Y. (2010). The education of perception. Topics in Cognitive Science, 2(2), 265-284.
Gravemeijer, K. P. E. (1991). An instruction-theoretical reflection on the use of manipulatives. In L. Streefland (Ed.), Realistic mathematics education in primary school (pp. 57-76). Technipress.
Halberda, J., Mazzocco, M. M. M., & Feigenson, L. (2008). Individual differences in non-verbal number acuity correlate with math achievement. Nature, 455(7213), 665-668.
Hammer, J., & Black, J. (2009). Games and (preparation for future) learning. Educational Technology, 49(2), 29-34.
Hiebert, J., & Wearne, D. (1986). Procedures over concepts: The acquisition of decimal number knowledge. In J. Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 199-223). Hillsdale, NJ: Erlbaum.
Holloway, I. D., & Ansari, D. (2009).
Mapping numerical magnitudes onto symbols: The numerical distance effect and individual differences in children’s mathematics achievement. Journal of Experimental Child Psychology, 103(1), 17-29.
Howe, J., O’Shea, T., & Plane, F. (1979). Teaching mathematics through Logo programming: An evaluation study. Computer assisted learning: Scope, progress, and limits. Roehampton, England.
Hughes, M., & Greenbough, P. (1995). Feedback, adult intervention, and peer collaboration in initial LOGO learning. Cognition, 13(4), 525-539.
Johnson, P. A. (1986). Effects of computer-assisted instruction compared to teacher-directed instruction on comprehension of abstract concepts by the deaf. Unpublished doctoral dissertation, Northern Illinois University.
Kapur, M. (2008). Productive failure. Cognition and Instruction, 26(3), 379-424.
Keil, F. C., & Batterman, N. (1984). A characteristic-to-defining shift in the development of word meaning. Journal of Verbal Learning and Verbal Behavior, 23, 221-236.
Kirschner, P. A., Sweller, J., & Clark, R. E. (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist, 41(2), 75-86.
Klahr, D., & Nigam, M. (2004). The equivalence of learning paths in early science instruction: Effects of direct instruction and discovery learning. Psychological Science, 15(10), 661-667.
Koedinger, K. R., & Anderson, J. R. (1993). Reifying implicit planning in geometry: Guidelines for model-based intelligent tutoring system design. In S. P. Lajoie & S. J. Derry (Eds.), Computers as cognitive tools: Technology in education (pp. 15-46). New York: Lawrence Erlbaum Associates, Inc.
Koedinger, K. R., Alibali, M. W., & Nathan, M. J. (2008). Trade-offs between grounded and abstract representations: Evidence from algebra problem solving. Cognitive Science, 32(2), 366-397.
Kurtz, K.
J., Miao, C.-H., & Gentner, D. (2001). Learning by analogical bootstrapping. Journal of the Learning Sciences, 10(4), 417-446.
Lakoff, G., & Núñez, R. E. (2000). Where mathematics comes from: How the embodied mind brings mathematics into being. New York: Basic Books.
Lave, J. (1988). Cognition in practice: Mind, mathematics, and culture in everyday life. New York: Cambridge University Press.
Linn, M. C. (2000). Designing the knowledge integration environment. International Journal of Science Education, 22(8), 781-796.
Linn, M. C., Chiu, J., Zhang, H., & McElhaney, K. (2010). Can desirable difficulties overcome deceptive clarity in scientific visualizations? In A. Benjamin (Ed.), Successful remembering and successful forgetting: A Festschrift in honor of Robert A. Bjork (pp. 1-35). Routledge.
Littlefield, J., Delclos, V. R., Bransford, J. D., Clayton, K. N., & Franks, J. J. (1989). Some prerequisites for teaching thinking: Methodological issues in the study of LOGO programming. Cognition and Instruction, 6(4), 331-366.
Mach, E. (n.d.). The analysis of sensations and the relation of the physical to the psychical. New York: Dover.
Mandler, J. M. (1992). How to build a baby: II. Conceptual primitives. Psychological Review, 99, 587-604.
Martin, T., & Schwartz, D. L. (2005). Physically distributed learning: Adapting and reinterpreting physical environments in the development of fraction concepts. Cognitive Science, 29(4), 587-625.
Mayer, R. E. (2001). Multimedia learning. New York: Cambridge University Press.
McNeil, N. M., & Jarvin, L. (2007). When theories don’t add up: Disentangling the manipulatives debate. Theory Into Practice, 46(4), 309-316.
McNeil, N. M., & Uttal, D. H. (2009). Rethinking the use of concrete materials in learning: Perspectives from development and education. Child Development Perspectives, 3(3), 137-139.
Mix, K. S. (2009). Spatial tools for mathematical thought.
The Spatial Foundations of Language and Cognition, 1(9), 40-66. Oxford Scholarship Online Monographs.
Montessori, M. (1912). The Montessori method. New York: Frederick A. Stokes Company.
Moreno, R., & Mayer, R. E. (1999). Multimedia-supported metaphors for meaning making in mathematics. Cognition and Instruction, 17(3), 215-248.
National Council of Teachers of Mathematics (NCTM). (2000). Principles and standards for school mathematics. Reston, VA: Author.
Noss, R., & Hoyles, C. (1992). Afterword: Looking back and looking forward. In C. Hoyles & R. Noss (Eds.), Learning mathematics and Logo (pp. 427-468). Cambridge, MA: MIT Press.
Papert, S. (1980). Mindstorms: Children, computers, and powerful ideas. New York: Basic Books.
Papert, S., Watt, D., DiSessa, A. A., & Weir, S. (1979). Final report of the Brookline Logo Project: Parts I and II (Logo memos nos. 53 and 54). Cambridge, MA: M.I.T. Artificial Intelligence Laboratory.
Piaget, J. (1952). The origins of intelligence in children. New York: International Universities Press.
Piaget, J. (1970). Science of education and the psychology of the child. New York: Orion Press.
Piaget, J., & Inhelder, B. (1956). The child’s conception of space. New York: Norton.
Piaget, J., Inhelder, B., & Szeminska, A. (1960). The child’s conception of geometry. New York: W. W. Norton.
Pylyshyn, Z. W. (1973). What the mind’s eye tells the mind’s brain: A critique of mental imagery. Psychological Bulletin, 80, 1-24.
Quinlan, P. T., & Humphreys, G. W. (1993). Perceptual frames of reference and two-dimensional shape recognition: Further examination of two-dimensional shape recognition. Perception, 22, 1343-1364.
Quinn, P. C., & Eimas, P. D. (1997). A reexamination of the perceptual-to-conceptual shift in mental representations. Review of General Psychology, 1(3), 271-287.
Ramani, G. B., & Siegler, R. S. (2008).
Promoting broad and stable improvements in low-income children’s numerical knowledge through playing number board games. Child Development, 79(2), 375-394.
Resnick, L., & Omanson, S. (1987). Learning to understand arithmetic. In R. Glaser (Ed.), Advances in instructional psychology (Vol. 3, pp. 41-95). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Sarama, J., & Clements, D. H. (2002). Building blocks for young children’s mathematical development. Journal of Educational Computing Research, 27(1&2), 93-110.
Sarama, J., & Clements, D. H. (2004). Building Blocks for early childhood mathematics. Early Childhood Research Quarterly, 19(1), 181-189.
Sarama, J., & Clements, D. H. (2009). “Concrete” computer manipulatives in mathematics education. Child Development Perspectives, 3(3), 145-150.
Schmidt, W. H., & McKnight, C. C. (1998). What can we really learn from the TIMSS? Science, 282(5395), 1830-1831.
Schmidt, W. H., McKnight, C. C., & Raizen, S. A. (2007). A splintered vision: An investigation of U.S. science and mathematics education: Executive summary. Wisconsin Teacher of Mathematics, 48(2), 4-9.
Schmidt, W. H., McKnight, C., & Raizen, S. (1997). A splintered vision: An investigation of U.S. science and mathematical education. Wisconsin Teacher of Mathematics, 48(2), 4-9. Kluwer Academic Publishers.
Schwartz, D. L., & Bransford, J. D. (1998). A time for telling. Cognition and Instruction, 16(4), 475-522.
Schwartz, D. L., & Martin, T. (2004). Inventing to prepare for future learning: The hidden efficiency of encouraging original student production in statistics instruction. Cognition and Instruction, 22(2), 129-184.
Schyns, P. G., Goldstone, R. L., & Thibaut, J.-P. (1998). The development of features in object concepts. Behavioral and Brain Sciences, 21(1), 1-54.
Segal, A. (2011). Do gestural interfaces promote thinking? Embodied interaction: Congruent gestures and direct touch promote performance in math.
Unpublished doctoral dissertation, Columbia University, New York.
Shepard, R. N., & Chipman, S. (1970). Second-order isomorphism of internal representation: Shapes of states. Cognitive Psychology, 1, 1-17.
Siegler, R. S. (1994). Cognitive variability: A key to understanding cognitive development. Current Directions in Psychological Science, 3, 1-5.
Siegler, R. S., & Booth, J. L. (2004). Development of numerical estimation in young children. Child Development, 75(2), 428-444.
Siegler, R. S., & Opfer, J. E. (2003). The development of numerical estimation: Evidence for multiple representations of numerical quantity. Psychological Science, 14(3), 237-243.
Siegler, R. S., & Ramani, G. B. (2008). Playing linear numerical board games promotes low-income children’s numerical development. Developmental Science, 11(5), 655-661.
Siegler, R. S., & Ramani, G. B. (2009). Playing linear number board games—but not circular ones—improves low-income preschoolers’ numerical understanding. Journal of Educational Psychology, 101(3), 545-560.
Siegler, R. S., & Robinson, M. (1982). The development of numerical understanding. In H. W. Reese & L. P. Lipsitt (Eds.), Advances in Child Development and Behavior (Vol. 16, pp. 241-312). New York: Academic Press.
Simmons, M., & Cope, P. (1990). Fragile knowledge of angle in Turtle Geometry. Educational Studies in Mathematics, 21, 375-382.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin, 119(1), 3-22.
Son, J. Y., & Goldstone, R. L. (2009). Contextualization in perspective. Cognition and Instruction, 27(1), 51-89.
Son, J. Y., Smith, L. B., & Goldstone, R. L. (2011). Connecting instances to promote children’s relational reasoning. Journal of Experimental Child Psychology, 108(2), 260-277.
Spelke, E. S. (2000). Core knowledge. American Psychologist, 55, 1233-1243.
Spelke, E. S., Lee, S. A., & Izard, V. (2010). Beyond core knowledge: Natural geometry. Cognitive Science, 34(5), 863-884.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257-285.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.
Thompson, P. W., & Thompson, A. G. (1990). Salient aspects of experience with concrete manipulatives. In F. Hitt (Ed.), Proceedings of the 14th annual meeting of the International Group for the Psychology of Mathematics (Vol. 3, pp. 337-343). Mexico City: International Group for the Psychology of Mathematics Education.
Triona, L. M., & Klahr, D. (2003). Point and click or grab and heft: Comparing the influence of physical and virtual instructional materials on elementary school students’ ability to design experiments. Cognition and Instruction, 21(2), 149-173.
Uttal, D. H., O’Doherty, K., Newland, R., Hand, L. L., & DeLoache, J. (2009). Dual representation and the linking of concrete and symbolic representations. Child Development Perspectives, 3(3), 156-159.
Vygotsky, L. S. (1978). Mind in society. Cambridge, MA: Harvard University Press.
van Hiele, P. M. (1986). Structure and insight: A theory of mathematics education. Orlando, FL: Academic Press.

Appendix A. Unit 1 exercise layouts: trapezoids and parallelograms. Players validate parallel sides.

Appendix B. Unit 2 exercise layouts: kites and rhombi. Players validate congruent adjacent sides.

Appendix C. Unit 3 exercise layouts: rectangles and squares. Players validate right angles and congruent adjacent sides (for squares only).

Appendix D. Shape identification, trapezoid stimuli, experiment 2. Top-left (outlined in green) are valid-upright, top-right (yellow) are valid-rotated, bottom-left (orange) are invalid-upright, bottom-right are invalid-rotated.

Appendix E. Shape identification, parallelogram stimuli, experiment 2. Top-left (outlined in green) are valid-upright, top-right (yellow) are valid-rotated, bottom-left (orange) are invalid-upright, bottom-right are invalid-rotated.

Note: Stimuli pools were constructed by combining each of six pairs with every other pair to generate 15 combinations of 4 polygons in set “a” and 15 combinations of 4 polygons in set “b”, for a total of 30 trials. No trials included pairs from both sets. The same is true for all of the following stimuli as well. In the cases of isosceles or right triangle/trapezoid stimuli, set “a” represents all triangles, while set “b” represents all trapezoids.

Appendix F. Shape identification, rhombus stimuli, experiment 2. Top-left (outlined in green) are valid-upright, top-right (yellow) are valid-rotated, bottom-left (orange) are invalid-upright, bottom-right are invalid-rotated.

Appendix G. Shape identification, isosceles triangle/trapezoid stimuli, experiment 2. 1st and 3rd rows on left (outlined in green) are valid-upright, 1st and 3rd rows on right (yellow) are valid-rotated, 2nd and 4th rows on left (orange) are invalid-upright, 2nd and 4th rows on right are invalid-rotated.

Appendix H. Shape identification, rectangle stimuli, experiment 2. Top-left (outlined in green) are valid-upright, top-right (yellow) are valid-rotated, bottom-left (orange) are invalid-upright, bottom-right are invalid-rotated.

Appendix I. Shape identification, right triangle/trapezoid stimuli, experiment 2. 1st and 3rd rows on left (outlined in green) are valid-upright, 1st and 3rd rows on right (yellow) are valid-rotated, 2nd and 4th rows on left (orange) are invalid-upright, 2nd and 4th rows on right are invalid-rotated.
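
The trial counts given in the note following Appendix E can be checked with a short sketch. It assumes only what the note states: six polygon pairs per stimulus set, each pair combined with every other pair from the same set. The pair labels (a1 to a6, b1 to b6) and the function name are hypothetical placeholders, not names used in the study materials.

```python
from itertools import combinations

# Hypothetical labels: six polygon pairs in each stimulus set ("a" and "b").
PAIRS_PER_SET = 6
stimulus_sets = {
    name: [f"{name}{i}" for i in range(1, PAIRS_PER_SET + 1)]
    for name in ("a", "b")
}


def build_trials(pairs):
    """Combine every pair with every other pair from the same set.

    Each unordered combination of two pairs yields one 4-polygon trial.
    """
    return list(combinations(pairs, 2))


# Trials are built within each set, so no trial mixes pairs from both sets.
trials = {name: build_trials(pairs) for name, pairs in stimulus_sets.items()}
total_trials = sum(len(t) for t in trials.values())

print(len(trials["a"]), len(trials["b"]), total_trials)  # 15 15 30
```

Choosing 2 of 6 pairs without regard to order gives 15 combinations per set, which matches the 15 + 15 = 30 trials reported in the note.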