Running head: TRAINING EFFECTIVENESS OF A COMPUTER GAME

A Formative Evaluation of the Training Effectiveness of a Computer Game
A Proposal
Submitted to: Dr. Harold O’Neil (Chair)
Dr. Richard Clark
Dr. Edward Kazlauskas
by
Hsin-Hui (Claire) Chen
University of Southern California
4325 Renaissance Dr., #307
San Jose, CA 95134
(408) 434-0773;
hsinhuic@usc.edu
In Partial Fulfillment of the Requirements for the Ed.D. in Learning and Instruction
November 24, 2003
Table of Contents

ABSTRACT ..... 5
CHAPTER I: INTRODUCTION ..... 6
    Background of the Problem ..... 6
    Purpose of the Study ..... 8
    Significance of the Study ..... 8
CHAPTER II: LITERATURE REVIEW ..... 10
    Relevant Studies ..... 10
        Games and Simulations ..... 10
            Theories of Games and Simulations ..... 15
            Game Selection ..... 18
            Design of Games and Simulations ..... 20
        Training Effectiveness of Games ..... 21
            Promotion of Motivation ..... 23
            Enhancement of Thinking Skills ..... 25
            Facilitation of Metacognition ..... 27
            Improvement of Knowledge ..... 28
            Building Attitudes ..... 31
        Summary ..... 32
    Evaluation ..... 33
        Models of Evaluation ..... 34
            Summative Evaluation ..... 34
            Formative Evaluation ..... 36
            Kirkpatrick’s Four-Level Evaluation ..... 38
        Game Evaluation ..... 43
        Summary ..... 46
    Problem Solving ..... 47
        Definition of Problem Solving ..... 47
        Significance of Problem-Solving Skills ..... 50
        Assessment of Problem Solving ..... 51
            Measurement of Content Understanding ..... 52
            Measurement of Problem-Solving Strategies ..... 61
            Measurement of Self-Regulation ..... 62
        Summary ..... 64
    Summary of the Literature ..... 64
CHAPTER III: METHODOLOGY ..... 67
    Research Hypotheses ..... 67
    Research Design ..... 67
    Pilot Study ..... 67
        Formative Evaluation ..... 68
            Participants ..... 69
            Puzzle-Solving Game ..... 69
            Knowledge Map ..... 71
            Feedback ..... 77
        Measures ..... 77
            Content Understanding Measure ..... 77
            Domain-Specific Problem-Solving Strategies Measure ..... 79
            Self-Regulation Questionnaire ..... 81
        Procedure ..... 81
            Time Chart of the Main Study ..... 82
        Data Analysis ..... 82
    Main Study ..... 82
        Method of the Main Study ..... 82
            Participants ..... 83
            Game ..... 83
        Measures ..... 83
            Knowledge Map ..... 83
            Domain-Specific Problem-Solving Strategies Measure ..... 83
            Self-Regulation Questionnaire ..... 84
        Procedure ..... 84
            Computer-Based Knowledge Map Training ..... 84
            Game Playing ..... 84
            Feedback on Game Playing Strategies ..... 84
        Data Analysis ..... 85
REFERENCES ..... 86
Appendix A: Self-Regulation Questionnaire ..... 103
ABSTRACT
Despite the potential power of computer games and simulations in instruction and training, research on their training effectiveness for adults is limited and a framework for their evaluation is lacking; therefore, more analysis and study of their evaluation is needed (O’Neil & Fisher, 2002; O’Neil, Baker, & Fisher, 2002; Ruben, 1999). In addition, previous studies suggest that a computer game may be one of the most effective tools for improving problem-solving. Problem-solving is one of the most significant competencies in both job settings and schools, and, as a result, teaching and assessing problem-solving have become among the most significant educational objectives (Mayer, 2002). Therefore, the researcher plans to conduct a formative evaluation of a computer game in terms of its effectiveness in enhancing learners’ problem-solving, including content understanding, domain-specific problem-solving strategies, and self-regulation.
In the first part of this proposal, the author reviews the relevant literature on computer games and simulations, evaluation models, and problem-solving. The second part of the proposal is devoted to a pilot study and a main study of the formative evaluation of a computer puzzle-solving game in terms of its effectiveness in enhancing players’ problem-solving.
CHAPTER I
INTRODUCTION
Background of the Problem
As pointed out by Ruben (1999), researchers such as Abt (1970), Coleman (1969), Boocock and Schild (1968), Gamson (1969), Greenblat and Duke (1975), Pfeiffer and Jones (1969-1977), Ruben (1978), Ruben and Budd (1975), and Tansey and Unwin (1969) started to notice the potential effects of simulations and games in instruction decades ago. The merits of computer games include facilitating learning by doing (e.g., Mayer, Mautone, & Prothero, 2002) and triggering motivation and enjoyment. In addition, computer simulation games engage learners in a simulated experience of the real world, which makes learning practical (Martin, 2000; Stolk, Alexandrian, Gros, & Paggio, 2001). Because of these merits, games and simulations have been applied in various fields and settings, such as business, K-16 organizations, and military organizations. Furthermore, as pointed out by Stolk et al. (2001), computer games and simulations are helpful for training in settings where practice and exercises in real situations are expensive and dangerous. For example, the military has applied computer-based training tools, such as war games and simulators, for task training. The same situation holds in environmental crisis management: practicing responses to natural disasters and industrial emergencies is usually very expensive and dangerous, so it is necessary to apply instructional gaming (Stolk et al., 2001).
However, few studies have shown the empirical effects of games and simulations on training and learning (O’Neil & Fisher, 2002). According to O’Neil and Fisher, the effects of computer games and simulations can be generally divided into five categories: promotion of motivation, enhancement of thinking skills, facilitation of metacognition, enhancement of knowledge, and attitudes. They also indicated that despite the potential power of computer games in instruction and training, research on their training effectiveness is limited, and there was little gaming literature that was helpful in designing a formative evaluation of games. As pointed out by Ruben (1999), there is not enough research on the evaluation of games’ instructional effectiveness and on its validity and reliability. According to researchers (e.g., O’Neil, Baker, & Fisher, 2002; Quinn, 1996), one of the critical concerns is time and expense. Therefore, more investment should be put into analysis and studies of computer game evaluation (O’Neil et al., 2002; O’Neil & Fisher, 2002; Quinn, 1996; Ruben, 1999).
According to previous research, problem-solving is one of the most critical competencies, whether for lifelong learning or for accomplishing tasks, and whether in job settings, academic settings (e.g., Dugdale, 1998), or any other setting. Although substantial previous research reveals the utility of problem solving (e.g., Mayer, 2002), the methods used to assess problem-solving skills still need to be refined. For example, when teachers assess students’ problem-solving by giving them a test of separate and unconnected multiple-choice questions, they are not accurately assessing students’ problem-solving abilities. Further, traditional standardized tests do not tell teachers or students which problem-solving processes they should emphasize, or why. Although the most useful measures of problem-solving competence can be found in the cognitive science literature, these measures (e.g., think-aloud protocols) are inefficient for assessing performance for diagnostic purposes, since their scoring is laborious and time-consuming (O’Neil, 1999). As a result, the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) has developed a problem-solving assessment model to measure content understanding, problem-solving strategies, and self-regulation, the three elements of problem-solving.
Purpose of the Study
Games and simulations have potential uses in teaching and learning and have been applied in business, academic organizations, and military settings, and, as argued by Quinn (1991), computer games may provide effective environments for problem-solving. However, there is little research on games’ training effectiveness. This researcher will therefore conduct a study focusing on the evaluation of a computer game with regard to its effectiveness in improving problem-solving.
The evaluation to be conducted in this study will be a formative one, that is, one applied while a program or system is happening or forming. A formative evaluation is conducted to judge the worth of a program or to determine the adjustments needed to attain its objectives while it is in progress, rather than at its end. The researcher will apply the problem-solving assessment model developed by the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) to measure content understanding, problem-solving strategies, and self-regulation, the three elements of problem-solving ability, because of the validity and reliability indicated in previous literature (Herl, O’Neil, Chung, & Schacter, 1999; Mayer, 2002). Therefore, the main purpose of this study is to find out whether game playing helps increase players’ problem-solving ability.
Significance of the Study
Baker and Mayer (1999) indicate that educational assessment has at least three distinct uses in instructional improvement. First, assessment results keep educational organizations and students alert to academic goals and, at the same time, motivate schools and students to achieve academic standards. Second, assessment outcomes provide useful objective information, helping teachers plan or revise their instruction and assisting administrators in allocating resources to remedy deficiencies. Third, assessment outcomes promote deeper understanding, according to the present viewpoint of “learning as an activity in which students seek to make sense out of presented material.”
Some educators, such as Amory (2001), have developed instructional games or software containing aspects of human evaluation and evaluation of the learning environment; however, as pointed out by researchers such as O’Neil and Fisher (2002), O’Neil, Baker, and Fisher (2002), and Ruben (1999), the effort to evaluate games’ training effectiveness is small compared to the enthusiasm and effort devoted to taking advantage of games’ potential power in training. Furthermore, evaluation is significant for program designers and executors in determining a program’s effectiveness, its value, and the further improvements needed; therefore, more analysis and studies of the evaluation of games need to be conducted.
This researcher focuses on formative evaluation since it not only documents the computer game’s effects on training but also examines the implicit feedback on puzzle-solving strategies designed into the game, which offers significant information for future trainers and developers who select, apply, or design computer games for training. Since a framework for evaluating games’ effectiveness in training is lacking, as pointed out by previous researchers, the other purpose of this study is to create a framework for evaluating computer games’ effectiveness in improving problem-solving that can be applied in future studies.
CHAPTER II
LITERATURE REVIEW
Relevant Studies
Games and Simulations
As defined by Gredler (1996), “games consist of rules that describe allowable player moves, game constraints and privileges (such as ways of earning extra turns), and penalties for illegal (nonpermissible) actions.” In addition, the rules of games do not have to obey those of real life and can be imaginative. On the other hand, Driskell and Dwyer (1984) defined a game as a rule-governed, goal-focused, microworld-driven activity incorporating principles of gaming and computer-assisted instruction, and a simulation game is one that consists of real settings, participants’ roles, playing rules, and scoring systems (Christopher, 1999). A microworld, as defined by constructivists, is a small but complete subset of reality in which a person can acquire knowledge of a specific domain by exploring it as a knowledge construction tool (Rieber, 1996). As pointed out by Gredler (1996), games and simulations differ in both surface structure and deep structure; surface structure refers to the observable characteristics, while deep structure is defined as the psychological mechanisms operating in the exercise. The surface structures of games, according to Gredler, are activities like “drawing cards, moving pieces around a board, and so on,” while the surface structure of a simulation is “a scenario or a set of data” to be addressed by the participant (Gredler, 1996, p. 522). On the other hand, Gredler points out that their major differences in deep structure are: (1) while a game player intends to win the game through competition, a participant in a simulation of a specific setting is executing serious responsibilities, deliberating feasible job procedures and possible consequences; (2) the event sequence of a game is typically “linear,” whereas a simulation sequence is “branching,” which means that actions and decisions made earlier influence or result in the situations and problems that follow; and (3) the rules and settings of games are not necessarily realistic or matched to the real world, but those of simulations are authentic and closely related to the real world. Finally, games are usually more fun-driven than simulations. The primary characteristics of games and simulations are shown in Table 1.
Table 1
Primary Characteristics of Games and Simulations. Adapted and modified from Gredler, M. E. (1996)

Characteristic                                                        Applies to
Setting: students are transported to another world or environment    games, simulations
Purpose: fun                                                          games
Purpose: competition and winning                                      games
Purpose: fulfilling a professional role                               simulations
Purpose: executing a professional task                                simulations
Event sequence: typically linear                                      games
Event sequence: nonlinear or branching                                simulations
Mechanisms that determine consequences: sets of rules
  (may be imaginative)                                                games
Mechanisms that determine consequences: dynamic set of authentic
  causal relationships among two or more variables                    simulations
Participant is a component of the evolving scenario and executes
  the responsibilities of his or her role                             games, simulations
Participant interacts with a database or sets of processes to
  discover scientific principles, explain or predict events, and
  confront misconceptions                                             simulations
As seen in Table 1, a common feature of games and simulations is that they transport the players to another world. For example, a game player may deliberate strategies for a chess board game, while a simulation participant may diagnose problems and apply plausible policies as the mayor of a simulated city. Another similarity is that a participant is part of the evolving situation and performs the duties of his or her role.
The mechanisms of games and simulations, as shown in Table 1, are distinct. The rules of games may be imaginative, while those of simulations stay close to real-life situations. That is, simulations are designed with a dynamic set of authentic causal relationships among two or more variables. In addition, players of simulations may interact with a database or sets of processes to discover scientific principles, explain or predict events, and confront misconceptions (Gredler, 1996).
Although researchers debate the definitions of simulation and game, Martin (2000) summarized the debate in his article as follows:
Games generally have rules and an expectation of competitive behavior toward winning (Jones, 1998a) and often include a large degree of chance (Jones, 1998). They sound appealing to students, although perhaps at the expense of learning (Jones, 1989; Lane, 1995). Simulation typically emphasizes a more academic and thoughtful exercise, often involves a model of a process, and typically supports learning specific content or about decision making. Shubik (1983) considers gaming to be primarily people centered and simulations to be primarily computer oriented. Lane (1995) agrees, describing games as interactive whereas simulations are described as models that can be left to run. Klein (1985) neatly describes a simulation as a game without players. (p. 457)
Examples of the application of simulations in the business sector are found in previous studies (e.g., Schank, 1997; Washbush & Gosen, 2001). Washbush and Gosen (2001) studied the effect of learning in an enterprise simulation, MICROMATIC, with undergraduate students. In this study, students’ simulation performance was measured at the end of play using the simulation’s scoring procedure, based on net income, return on sales, and return on assets, and their learning was measured with multiple-choice and short-essay questions. The researchers found that although there was no significant relationship between simulation performance and learning, learning did occur from simulation play, especially when participants perceived their teams to be well-organized. The results showed that students began to master the skills and concepts presented and that the simulation is a valid learning methodology. Also, Schank (1997) found that computer simulation games could be used to help adults learn business skills.
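To illustrate the kind of end-of-play performance score described above, the following is a minimal sketch in Python. The team names, financial figures, standardization, and equal weighting of the three indicators are hypothetical illustrations, not Washbush and Gosen’s (2001) actual scoring procedure.

```python
# Hypothetical sketch (not the MICROMATIC scoring procedure): combine net income,
# return on sales, and return on assets into one end-of-play performance score by
# standardizing each indicator across teams and averaging the z-scores.
from statistics import mean, pstdev

teams = {
    "Team A": {"net_income": 120_000, "return_on_sales": 0.08, "return_on_assets": 0.05},
    "Team B": {"net_income": 310_000, "return_on_sales": 0.12, "return_on_assets": 0.09},
    "Team C": {"net_income": 190_000, "return_on_sales": 0.10, "return_on_assets": 0.06},
}

def z_scores(values):
    """Standardize raw values: z = (x - mean) / standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s else 0.0 for v in values]

metrics = ["net_income", "return_on_sales", "return_on_assets"]
columns = {m: z_scores([teams[t][m] for t in teams]) for m in metrics}

for i, team in enumerate(teams):
    composite = mean(columns[m][i] for m in metrics)  # equal weights, illustrative only
    print(f"{team}: composite performance z-score = {composite:+.2f}")
```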
Furthermore, examples of simulations used in military settings have been documented by O’Neil and Andrews (2000). As the researchers indicated, simulations have been used as an important tool in aircrew training and assessment, whether for individuals or teams.
“Simulations, games, and other experience-based instructional methods have had a substantial impact on teaching concepts and applications during this period” (Ruben, 1999, p. 498). Also, researchers have pointed out that simulations and games are widely accepted as a powerful alternative to traditional ways of teaching and learning, with the merits of facilitating learning by doing (e.g., Mayer et al., 2002; Rosenorn & Kofoed, 1998), triggering motivation and enjoyment, and engaging learners in a simulated experience of the real world (Martin, 2000). For example, O’Neil and Fisher (2002) argue that computer games would be cost-effective for many leader development applications and could offer sufficient practice and feedback opportunities if designed appropriately.
Finally, Gredler (1996) defines a phrase for the mixture of games’ and simulations’ features: simulation games, or gaming simulations. In this study, game will refer to either Gredler’s definition of a game or a gaming simulation. Amory (2001) argues that simulation games are applied in educational environments more often than other types of games, since when playing simulation games learners can focus on single goals, with decreased competition between learners, and work at their own pace.
Theories of Games and Simulations
One of the most important theories supporting games and simulations is experience-based learning (Ruben, 1999). Experience-based learning is an important approach focusing on increasing the student’s control and autonomy, an aspect of constructivism. Experience-based learning is the process whereby students learn something by doing it and connecting it with their prior experience, as in hands-on laboratory experiments, studio performances, and practical training. Computer games and simulations facilitate experience-based learning by transporting learners to “another world,” where they are in control of the action, and by providing opportunities for learners to interact with a knowledge domain (Gredler, 1996). As pointed out by Ruben (1999), experience-based instructional methods have the potential to address many of the limitations of traditional teaching methods. Traditional instructional methods have several limitations: learning and teaching are hard to separate; knowledge and skills are not practiced and applied appropriately; learning tends to be individual work, while its application, which occurs outside the classroom, is usually social; traditional teaching is lacking in creativity and vividness; and, finally, the acquisition of problem solving is usually not emphasized. On the other hand, as pointed out by Ruben, experience-based instruction is an effective learning approach supported by several empirical studies. It provides more pluralistic and multivariate approaches to learning and promotes collaboration and interactivity (O’Neil et al., 2002). Most importantly, experiential learning facilitates cognitive thinking and active learning. Furthermore, as pointed out by Mayer et al. (2002), when learning with pictorial aids such as multimedia and games, students learn by doing, working on realistic tasks instead of merely being told by teachers.
Furthermore, Adams (1998) points out that a game satisfies learners’ visual and auditory senses and provides flexibility in learning, which makes it an attractive tool for teaching and learning, based on the perspectives of constructivism (Amory, 2001) and dual-coding theory (Mayer & Sims, 1998). According to constructivism, learning must be active; teachers should guide learners in the construction of mental models, and the guidance is based on the individual learner’s background knowledge. Based on the concept of constructivism, new knowledge is constructed by a learner with his or her unique background knowledge and beliefs, by making sense of the knowledge, in multiple ways and in various contexts. In addition, constructivist learning is both an active and a reflective process, triggered by social interaction; it is internally controlled and mediated by the learner (Bruning, Schraw, & Ronning, 1999). As pointed out by researchers (e.g., Amory, 2001; Stolk, Alexandrian, Gros, & Paggio, 2001), a game player does not study a particular domain but becomes part of the scenario, which promotes active and meaningful learning and stimulates self-regulation in learning. For instance, Stolk et al. (2001) argued that instructional simulations for teaching environmental crisis management provide an alternative to exercises, since it is usually very expensive and dangerous to practice in real situations.
Experience-based learning is one practical way of integrating constructivist methods into instruction. Experience-based instruction is assumed to be better than traditional teaching, since learners form a skill or acquire knowledge by doing. The other reason that games provide effective alternatives to traditional lectures is that they can facilitate learning (e.g., Adams, 1998; O’Neil & Fisher, 2002; Ruben, 1999). Simulation-based learning is an effective way to learn and apply knowledge and skills quickly. Furthermore, previous research on emerging technologies has shown that computer-based learning enhances problem solving (e.g., Dugdale, 1998; Mulqueen, 2001; Poedubicky, 2001) and decision-making skills (Poedubicky, 2001), and studies on transfer of learning (Ritchie & Dodge, 1992) have revealed that simulations facilitate transfer more effectively than traditional methods of instruction (e.g., Adams, 1998; Fery & Ponserre, 2001; Mayer et al., 2002).
The significant results of the empirical study conducted by Dugdale (1998) provide one example of computer-based learning enhancing students’ mathematical problem-solving. The researcher conducted the study through 15 days of systematic observation of participants’ use of technology to approach the problems assigned; each use of a computer was recorded, along with the role of each participant in the computer use and a description of how the computer was applied to the problem. For example, the rated computer appropriateness of each day’s problem-solving assignment could be Y (appropriate) or N (not appropriate), and the comfort level and number of each participant’s applications of computers to problem-solving were recorded. In addition, each participant’s maximum role in computer problem solving was rated at one of four levels: none, passive, active, or central. While gaining experience in applying computer investigation methods, participants also increased learner-initiated applications of technology to problem solving and effective computer use.
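A minimal sketch of how such a daily observation log might be recorded and tallied is given below; the field names, ratings, and data are hypothetical and are not Dugdale’s (1998) actual instrument.

```python
# Hypothetical sketch of a daily observation log of the kind described above:
# computer appropriateness of the day's problem (Y/N) and each participant's
# maximum role in computer use, rated at one of four levels.
from collections import Counter

ROLE_LEVELS = ["none", "passive", "active", "central"]

observations = [
    {"day": 1, "computer_appropriate": "Y", "role": "passive"},
    {"day": 2, "computer_appropriate": "N", "role": "none"},
    {"day": 3, "computer_appropriate": "Y", "role": "active"},
    {"day": 4, "computer_appropriate": "Y", "role": "central"},
]

appropriate_days = sum(obs["computer_appropriate"] == "Y" for obs in observations)
role_counts = Counter(obs["role"] for obs in observations)

print("Days with computer-appropriate problems:", appropriate_days)
for level in ROLE_LEVELS:
    print(f"  maximum role = {level}: {role_counts.get(level, 0)} day(s)")
```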
In Mulqueen’s (2001) study on the effects of computer-based training on teachers’ interdisciplinary problem-solving ability, the self-reported level of problem-solving skills improved in the second year of training. In addition, it was found that teachers became more willing to accept new learning challenges. In the study conducted by Ritchie and Dodge (1992), a computer microworld was used to simulate symbolic, physical phenomena. When high school students interacted with this simulated environment, they were able to grasp the key principles. Their test performance improved, their teamwork was fostered, and their subject skills across the curriculum improved.
Other researchers, such as Alessi (2000a, 2000b), point out four critical elements of an instructional game: knowledge attributes, learners’ attributes, simulation attributes, and learners’ encoding, representing, and using of knowledge. There are two ways to acquire instructional games and simulations for research purposes: one is to buy off-the-shelf software, and the other is to develop them (Alessi, 1998, 2000b; Amory, 2001; O’Neil & Fisher, 2002).
Game Selection
Researchers have pointed out that play associated with games is an important construct of learning (Quinn, 1994; Rieber, 1996). Rieber (1996) argued that a game may be a more meaningful way to present microworlds to learners than a simulation, and Amory argued that an effective instructional game should be pertinent to the learning objectives. According to O’Neil and Fisher (2002), a game designer and a trainer characterize a computer game in terms of different models or specifications. The former characterizes a game in terms of (1) the type of platform that supports the game, (2) the type of players, (3) the contractor, (4) the genre of the game, (5) the purpose of the game, and (6) key milestones. A researcher or trainer, however, characterizes a game in terms of five different specifications regarding the domain knowledge to be learned: (1) learning objectives, (2) training use, (3) learners, (4) practice, and (5) feedback.
Very often a training game lacks objectives, since most games are created for fun. Customers have to generate the training applications from the game they choose according to their objectives and game specifications, and evaluate the effects by themselves.
For example, in the three-phased game feasibility study conducted by Wainess and O’Neil (2003), the researchers selected three appropriate games from more than five hundred off-the-shelf games in order to use one of them as a platform for research on cognitive and affective issues related to games in education. The researchers (Wainess & O’Neil, 2003) managed the selection process based on the research needs and learning objectives of problem-solving. According to the objectives and needs, the games selected for further consideration should have several characteristics, such as being adult oriented, allowing single-user play, and being suitable for problem-solving research.
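The sketch below illustrates, in Python, how such criteria-based screening of off-the-shelf games might look; the game titles, attribute names, and values are hypothetical and are not taken from Wainess and O’Neil (2003).

```python
# Illustrative criteria filter for screening candidate off-the-shelf games,
# using the characteristics mentioned above (adult oriented, single-user play,
# suitable for problem-solving research). All entries are invented.
candidates = [
    {"title": "Game 1", "adult_oriented": True,  "single_user": True,  "problem_solving": True},
    {"title": "Game 2", "adult_oriented": False, "single_user": True,  "problem_solving": True},
    {"title": "Game 3", "adult_oriented": True,  "single_user": False, "problem_solving": False},
]

required = ("adult_oriented", "single_user", "problem_solving")
selected = [g["title"] for g in candidates if all(g[k] for k in required)]
print("Games retained for further consideration:", selected)
```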
In some previous studies on games, researchers have argued that games affect learning only when appropriate games are selected and tailored for training (e.g., Baird & Silvern, 1999; Dawes & Dumbleton, 2001; Rabbitt, Banerji, & Szymanski, 1989). In Rabbitt et al.’s empirical research on a video game’s training effects, participants were trained with two different instructional games for practice. It was found that practicing different games resulted in distinct results on the IQ test. Rabbitt et al. concluded that a video game could be tailored to be an efficient training tool for complex industrial and military tasks. Another example of game selection is the empirical study conducted by Dawes and Dumbleton (2001) on selecting games to support some aspects of learning in schools. In this study, eleven purchased computer games were considered based on several factors, such as technical issues, language comprehension, content suitability, the teacher’s role, time constraints, and types of feedback. In addition, the types and amount of guidance to be given to learners should be considered when applying multimedia and gaming as teaching aids (e.g., Mayer et al., 2002).
Design of Games and Simulations
It is very difficult to create an instructionally effective game because comprehensive design paradigms derived from learning principles and well-designed empirical research on instructional simulations and games are lacking. Nevertheless, Quinn developed a game design model, supported by educational theories, for designing a game based on the system or its users, which encompasses entertaining factors and the procedure for designing a game. However, he did not collect any data as to its effectiveness.
Amory, Naicker, and Vincent (1999) established the Game Object Model for game design, which includes components that promote educational objectives and computer interfaces. Amory (2001) also points out that the development of an instructional game is composed of three perspectives: the research it is based on, the development of resources, and software components. Resource development here includes activities such as tool and software selection, story line conception, object placement, image generation, game page creation, game player analysis, and game level testing; software development here means developing the game page editor, playback engine, and puzzle creation. Based on these, Amory developed Zadarh, an educational adventure game that addresses misconceptions held by biology students and presents information that could foster discussion and other interactive learning activities. It was found that students who played the game performed approximately the same on multiple-choice biology test questions as those who did not play the game; the difference was not significant. In addition, Martin (2000) points out that purpose, reality, timeline, feedback, decisions, participants, role, and a close match of the simulation/game with the learning objectives are the main elements to consider when designing a game.
To design a game for training or instruction, the first issue is to identify the learning goal and objectives, which serve not only as the guideline to follow but also as the criteria on which feedback and assessment will be based (Alessi, 2000c; Stolk et al., 2001). Since different goals and objectives require different types of feedback and assessment measures, game developers should design feedback forms and types, and assessment tools, to find out whether the game really helps learners or trainees achieve the learning goal and objectives, and how efficiently. For example, if the training or learning goal is to increase learners’ problem-solving ability, then each element of problem-solving ability, including content understanding, problem-solving strategies, and self-regulation (O’Neil, 1999), should be taught and assessed in the game context (O’Neil, 2003).
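As a minimal illustration, the sketch below records the three components of problem-solving ability per learner. The component names follow O’Neil (1999); the 0-100 scale, the example scores, and the simple averaging rule are hypothetical and are not part of the CRESST model.

```python
# Minimal sketch: one learner's scores on the three components of
# problem-solving ability. Scale and aggregation rule are illustrative only.
from dataclasses import dataclass

@dataclass
class ProblemSolvingScores:
    content_understanding: float        # e.g., from a knowledge-map measure
    problem_solving_strategies: float   # e.g., from a domain-specific strategies measure
    self_regulation: float              # e.g., from a self-regulation questionnaire

    def overall(self) -> float:
        """Unweighted mean of the three components (illustrative rule)."""
        return (self.content_understanding
                + self.problem_solving_strategies
                + self.self_regulation) / 3

learner = ProblemSolvingScores(72.0, 65.0, 80.0)
print(f"Overall problem-solving score: {learner.overall():.1f}")
```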
Furthermore, training goals may also affect self-efficacy, achievement, and the use of self-regulatory strategies in learning (Schunk & Ertmer, 1999). Thus, there are four other essential factors to consider in order to design effective instructional games and simulations: (1) a structure designed to reinforce the knowledge and skills targeted by the objectives (also Stolk et al., 2001), (2) learners’ prior knowledge (also Stolk et al., 2001), (3) the complexity of problem solving (Gredler, 1996), and (4) the types and amount of guidance given to learners (e.g., Mayer et al., 2002). In the study conducted by Stolk et al. (2001), the researchers developed a simulation game to support the training of environmental crisis management. The game was developed according to learners’ relevant prior knowledge, and the game scenarios and crises designed into it resemble those that happen in the natural environment.
Training Effectiveness of Games
The potential power of computer games in training and instruction has drawn the attention of educators for decades (e.g., Donchin, 1989; Malone, 1981; Quinn, 1991; Ruben, 1999; Thomas & Macredie, 1994). Games have been applied in various subjects such as geography (Mayer et al., 2002; Moreno & Mayer, 2000; Tkacz, 1998), law, business (e.g., King & Morrison, 1998; Shank, 1997), physics (e.g., White & Frederiksen, 1998), and therapeutic situations (Ruben, 1999). As pointed out by O’Neil and Fisher (2002), computer games have been found beneficial for instruction and training due to four characteristics: “(a) complex and diverse approaches to learning processes and outcomes; (b) interactivity; (c) ability to address cognitive as well as affective learning issues; and perhaps most importantly, (d) motivation for learning” (p. 6). Although the main purpose of computer games has been entertainment, more and more people apply computer games and simulations to training and instruction. As argued by Quinn (1991, 1996), computer games are effective tools for training problem-solving: computer adventure games are part of many learners’ experience, they are motivating, they are enjoyed by a wide range of age groups, and they provide an engaging and familiar environment in which problems are embedded; the games are also a source of information on the question of what strategies subjects bring to bear to solve the problems. Further, computer games and simulations have been used to develop workers’ financial and banking skills in business settings (e.g., Adams, 1998; Faria, 1998; Lane, 1995; Washbush & Gosen, 2001). In Adams’s (1998) empirical study of a computer simulation game’s effectiveness in urban geographic education, students were evaluated by how they performed in the simulation game and by essay and multiple-choice questions. For example, one of the questions was “What do you think SimCity teaches people about urban processes?” He found that students who used the simulation game became cognitively aware of urban geographic problems and became more curious about the urban fabric and the complicated repercussions of changes in the urban system in the real world. Furthermore, more than one third of the students wrote in their projects with a new appreciation of urban planning and the difficulties of managing urban funds. He points out that an urban geography class at a State University of New York rated the simulation their favorite project of the semester when compared to nine other projects of a conventional lecture/exam class; further, due to the attractive graphics and flexibility of the urban simulation model, the game has become an attractive tool for teaching urban geography and planning concepts. Unfortunately, the researcher did not mention in the report whether the results of the study were statistically significant.
The military sector also uses simulation-based games to train flight and combat skills, and even to recruit new members (Chambers, Sherlock, & Kucik, 2002; O’Neil & Andrews, 2000; Rhodenizer, Bowers, & Bergondy, 1998). Other researchers (Galimberti, Ignazi, Vercesi, & Riva, 2001) found that a networked videogame enhanced social interaction and cooperation.
The effects of instructional games and simulations can be generally divided into five categories: promotion of motivation, enhancement of thinking skills, facilitation of metacognition, enhancement of knowledge, and building of attitudes (O’Neil & Fisher, 2002).
Promotion of motivation.
Motivation is the psychological feature that arouses, directs, and maintains an organism’s action (Woolfolk, 2001). Motivation has been found to have a positive influence on performance (e.g., Clark, 1998; Emmons, 2000; Ponsford & Lapadat, 2001; Rieber, 1996; Urdan & Midgley, 2001; Ziegler & Heller, 2000). Ricci, Salas, and Cannon-Bowers (1996) pointed out that dynamic interaction, competition, and novelty are three characteristics of computer-based gaming that contribute to its motivational appeal, and these three characteristics can produce significant differences in learner attitude. Furthermore, O’Neil and Fisher pointed out that computer games provide diversity, interactivity, and, importantly, motivation for learning, and they have therefore been applied in instruction in different sectors, such as business (e.g., Adams, 1998; Washbush & Gosen, 2001), the military (e.g., O’Neil & Andrews, 2000), and academic sectors (e.g., Adams, 1998; Amory, 2001; Amory, Naicker, Vincent, & Adams, 1999; Barnett, Vitaglione, Harper, Quackenbush, Steadman, & Valdez, 1997; Ricci et al., 1996; Santos, 2002).
Furthermore, Amory (2001) indicates that games and simulations can not only combine theoretical concepts and practice but also trigger intrinsic motivation and self-regulated learning. Several previous researchers have pointed out that the use of computer games increases learners’ intrinsic or extrinsic motivation (e.g., Amory, 1999; Quinn, 1996; Rieber, 1996); the former is associated with inner feelings, while the latter is triggered by external factors such as rewards and punishments (Woolfolk, 2001).
In addition, Malone (1981) pointed out that intrinsic motivation is significant for problem-solving; a task cannot be accomplished when intrinsic motivation is absent, even if a learner tries his or her best. He further pointed out that games possess the characteristics of challenge and elements of fantasy, and therefore trigger players’ motivation and interest in the game world.
For instance, the results of a study of six games’ instructional effects conducted by the British Educational Communications and Technology Agency (Dawes & Dumbleton, 2001) showed that the use of computer games in instruction enhances students’ motivation. Learners in this study were observed to work positively and to persist in their engagement with the software, continuing their work after lesson times. For example, some of the games were used voluntarily at breaks and lunchtimes, and learners who started with easy levels volunteered to move on to more difficult levels, which required extra time and effort. Also, the research conducted by Amory, Naicker, Vincent, and Adams (1999) showed that participants were intrinsically motivated by playing computer games, especially simulation and adventure games. In other articles (Amory, 2001; Quinn, 1996; Rieber, 1996), the authors argue that the play associated with games triggers intrinsic motivation. For example, Rieber (1996) points out that the challenge, curiosity, fantasy, and controllability of games trigger intrinsically motivated learning.
Enhancement of thinking skills.
Thinking skills are skills of information processing, reasoning, enquiry, creative thinking, learning strategies, and evaluation (Dawes & Dumbleton, 2001; O’Neil, 1978). According to previous studies, computer games are assumed to enhance thinking skills. For example, in the study conducted by Mayer et al. (2002), the researchers used a transfer test to measure the impact of the Profile Game, a computer simulation game, on geology learning. It was found that the computer game helped improve geology students’ thinking skills and visual-spatial thinking in geology and facilitated learning by doing. The study revealed that the computer game helped students think like geologists. Quinn (1991) measured learners’ usage of problem-solving strategies through transcriptions of verbal protocols and through computer traces; the number of attempts each participant made to solve each problem, the number of times each character died, and the number of “go’s” a participant made were recorded by the computers. The transcripts were examined for evidence of the subjects’ use of four categories of strategy: “recall,” “cause,” “trial and error,” and “other.” Quinn suggested that a game’s problem-solving environment could be used not only to investigate the cognitive skills involved but also as an environment within which to learn and practice these skills.
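A minimal sketch of tallying such computer-trace measures and strategy codes appears below; the event log, the strategy codes, and their counts are invented for illustration and are not Quinn’s (1991) data.

```python
# Hypothetical sketch: tally attempts, character deaths, and "go" commands from
# a game trace, and count coded strategy categories from verbal protocols.
from collections import Counter

trace_log = [
    {"problem": 1, "event": "attempt"}, {"problem": 1, "event": "go"},
    {"problem": 1, "event": "death"},   {"problem": 1, "event": "attempt"},
    {"problem": 2, "event": "attempt"}, {"problem": 2, "event": "go"},
]
strategy_codes = ["recall", "trial and error", "cause", "trial and error", "other"]

events = Counter(e["event"] for e in trace_log)
strategies = Counter(strategy_codes)

print("Attempts:", events["attempt"], "| Deaths:", events["death"], "| Go's:", events["go"])
print("Strategy use:", dict(strategies))
```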
Dawes and Dumbleton (2001) reported a study of six computer games, “Age of Empires,” “Brain Teasing Games,” “Championship Manager,” “City Trader,” “Sim City 3000,” and “The Sims,” that were assigned to different schools. According to the observation results, all of the games were found to enhance thinking skills or problem-solving skills, including information processing, reasoning, enquiry, creative thinking, and evaluation skills. In the report, the researchers concluded that if the level of challenge and the type of game were appropriate for the students, their problem-solving and critical-thinking skills could be facilitated. For example, SimCity, a computer simulation game, and Age of Empires, a real-time strategy game, were found complex and flexible enough to enable students to apply different strategies and to require them to think logically about the interaction of a range of variables.
In Adams’ (1998) empirical study of teaching urban geography with a computer simulation game, students were asked to write down the changes in the city they were in charge of and the amount of money in their coffers, first after playing the game following the hints built into the game (experiment B) and then after playing the game without following those hints (experiment C). The students were then assessed with essay and multiple-choice questions. It was found that the simulations enhanced students’ computer literacy, their knowledge of geographical phenomena and processes, and their ability to critique a city’s development from different aspects, such as social, political, philosophical, scientific, and economic situations. However, the researcher did not mention whether this result was significant.
In addition, computer games’ effects on improving reasoning skills, facilitating complex problem-solving, and enhancing transfer of learning have been documented in previous articles (e.g., Adams, 1998; Crisafulli & Antonietti, 1993).
Also, there is evidence showing that playing computer games improves cognitive processes, since it increases the flexibility and variety of knowledge representations, such as visual and auditory representations. For example, Okagaki and Frensch (1994) conducted a study of Tetris, a video game, using undergraduate students, none of whom had prior experience with the game, and assessed participants’ mental rotation, spatial visualization, and perceptual speed with paper-and-pencil tests before and after playing the game. Using these pencil-and-paper tasks, the researchers found that spatially oriented video games have the potential to improve late adolescents’ mental rotation and spatial visualization skills and to bring players cognitive benefits during their playing. However, the positive results were significant only for male, not for female, participants. Also, in two experiments conducted by Greenfield, DeWinstanley, and Kirkpatrick (1994), participants’ divided attention was measured with choice reaction time in a luminance detection task, using response time to targets of varying probabilities at two locations on a computer screen. The significant results in the two experiments showed that video games strengthen strategies of divided attention, which implies that computer games can be applied to train tasks that require monitoring of multiple visual locations.
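To illustrate the pre/post logic of such studies, the sketch below computes a paired t statistic for hypothetical mental-rotation scores, analyzed separately by group; the scores and group sizes are invented, and a real analysis would use a full statistical package.

```python
# Illustrative pre/post comparison (Okagaki & Frensch-style design): paired t
# statistic on post - pre differences, computed separately for each group.
from statistics import mean, stdev
from math import sqrt

def paired_t(pre, post):
    """Paired t statistic for post - pre differences (illustrative only)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

groups = {
    "male":   {"pre": [12, 15, 11, 14, 13], "post": [16, 18, 15, 17, 16]},
    "female": {"pre": [13, 14, 12, 15, 13], "post": [13, 15, 13, 15, 14]},
}

for name, scores in groups.items():
    t = paired_t(scores["pre"], scores["post"])
    print(f"{name}: paired t = {t:.2f} (df = {len(scores['pre']) - 1})")
```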
Facilitation of Metacognition.
Woolfolk (2001) defined metacognition as knowledge about our own thinking processes, which includes three kinds of knowledge: first, declarative knowledge about strategies for learning, memorizing, and performing well; second, procedural knowledge about how to use the strategies; and third, conditional knowledge about when and why to apply the former two kinds of knowledge. On the other hand, O’Neil and Abedi (1996) defined metacognition as planning and self-checking, which enables people to utilize various strategies to accomplish a goal.
It has been shown in previous studies that metacognition facilitates knowledge and skill acquisition (e.g., Pirolli & Recker, 1994). Playing computer games has the potential benefit of enhancing metacognitive skills (Baird & Silvern, 1999; Bruning et al., 1999; O’Neil & Fisher, 2002; Pillay, Browlee, & Wilss, 1999). For instance, Bruning et al. (1999) and Pillay, Browlee, and Wilss (1999) found in their qualitative studies that game playing offers players an opportunity to apply metacognitive skills. When playing a game, players checked their own actions, activated their schemata, found relations and connections, and formed hypotheses. The researchers claimed that the frequent monitoring of thinking by game players is an application of a metacognitive approach.
Improvement of knowledge.
Knowledge includes domain-specific knowledge and domain-general knowledge, both of which include declarative and procedural knowledge (Bruning, Schraw, & Ronning, 1999). While domain-specific knowledge is helpful for a specific subject or activity, general knowledge is used for a very wide range of activities. Declarative knowledge is organized factual knowledge, and procedural knowledge is “knowing how” knowledge that facilitates a specific activity. As Bruning, Schraw, and Ronning (1999) pointed out, general knowledge is complementary to domain knowledge; however, their roles shift depending on the task focus.
Several studies have shown evidence that computer games can enhance learning and the retention of knowledge. For instance, Westbrook and Braithwaite (2001) provided evidence that a health care game was an effective tool for improving learning outcomes such as information-seeking skills and factual knowledge. In their study, the researchers applied pre- and post-game self-report questionnaires, consisting of learner demographics, learners’ reactions toward the game, learners’ knowledge of the health system, and learners’ experience with computer games, to evaluate a health care simulation game designed to promote information-seeking skills and interaction with the health system. Comparing the pre- and post-survey responses, the researchers found that participants’ domain knowledge of the health system, Medicare, and private insurance was significantly higher than before the game.
In the study conducted by Ricci et al. (1996), it was found that military trainees who were presented subject matter on chemical, biological, and radiological defense in computer-gaming form scored significantly higher on a multiple-choice retention test than those who were presented the subject matter in paper-based form. The researchers used a trainee reaction questionnaire containing five statements about the training task, rated on a 5-point Likert scale, and found significant positive correlations between reaction and retention test scores. That is, participants who "(a) perceived their form of study as enjoyable, (b) felt they learned a lot about CBD during their training, and (c) felt confident that they would remember what they learned during training" tended to score significantly higher on the retention test than those who did not.
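The reaction-retention relationship reported above is a simple correlation; the sketch below shows how such a Pearson correlation might be computed, using invented reaction ratings and retention scores rather than Ricci et al.’s (1996) data.

```python
# Hypothetical sketch: Pearson correlation between trainees' mean reaction
# ratings (5-point scale) and their multiple-choice retention-test scores.
from statistics import mean
from math import sqrt

reaction  = [4.2, 3.1, 4.8, 2.5, 3.9, 4.5]   # mean of five 5-point Likert items
retention = [82,  70,  90,  61,  78,  88]    # retention-test scores (invented)

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

print(f"r(reaction, retention) = {pearson_r(reaction, retention):.2f}")
```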
In addition, Betz (1995-1996) found that students who learned by both reading a text and playing a computer simulator about the planning and management of a complex city system scored significantly higher on an examination consisting of multiple-choice and true/false questions than those who learned by only reading the text, even though the examination questions were based on the content and application of the text only.
Also, Fery and Ponserre (2001) found that skills learned by playing a golf video game can transfer significantly to actual golf playing when the video game provides reliable demonstrations of actual putts and when players have the intention to learn golf. Subjects may simply enjoy playing the video game, or they may use it to improve their knowledge and skills for playing real-world golf by analogizing the knowledge and skills acquired when playing the video game to the situations of the virtual golf game. The participants’ golf-playing knowledge and skills were measured by an experimenter, who indicated the correct posture and gave alignment references. The distances of the actual putts and the direction of the error in the force of the putts during pre- and post-tests were collected and compared. The results showed that participants who played the simulation golf game with an intention to learn golf significantly outperformed participants who played the video game only for entertainment, as well as participants who did not play the video game.
In Adams’s (1998) research on a computer simulation game’s educational effectiveness in urban geography, the game helped students develop computer literacy and knowledge of geographical phenomena and processes. Amory et al. (1999) used pre- and post-game multiple-choice tests to measure students’ knowledge of environmental biology learned from game playing. The difference between students’ pre- and post-test results was not significant.
Another example is found in an empirical study of a computer game used to train cadets at an Israeli Air Force flight school (Gopher, Weil, & Bareket, 1994). Transfer effects from game training to actual flying were tested during several flights, from the transition stage to the high-performance jet trainer. The outcomes were analyzed in terms of two types of knowledge: the first is the specific skills involved in performing the game; the second is the general ability of trainees to cope with the high processing and response demands of the flight task and to acquire better strategies of attention. Results for game skills showed that the training-with-game group performed significantly better than the training-without-game group in test flights. Gopher et al. (1994) concluded that the game maintained its relevance and was easier to generalize when variables were changed or new tasks with a similar context were encountered, and it was therefore integrated into the regular training program of the Air Force.
Santos (2002) developed a simulation game to help students understand monetary policy. In the study, the researcher gave a survey to the participants after they completed the Internet-based, interactive teaching aid, which introduces undergraduate students to the domestic and international consequences of monetary policy. According to the survey outcomes, 91 percent of the students believed that their participation in the game improved their understanding of central bank policy and its effects on a global economy, and 90 percent of the students felt the need to include the simulation game in the money and banking course. Furthermore, the additional written comments at the end of the survey also fully supported these findings.
Building attitudes.
Attitudes are commonly viewed as summary evaluations of objects (e.g., oneself, other people, issues) along a dimension ranging from positive to negative (e.g., Petty, Priester, & Wegener, 1994). For evaluating attitudes toward computer games, the Computer Game Attitude Scale (CGAS), which evaluates student attitudes toward educational computer games, was created to assist computer game designers and teachers in the evaluation of educational software games. Chappell and Taylor (1997) further found evidence of its reliability and validity. In the two studies conducted by Westbrook and Braithwaite (2001) and Amory et al. (1999), learner attitudes were measured with questionnaires. Comparing pre- and post-questionnaires, Westbrook and Braithwaite found that learners’ interest in the health care system was significantly enhanced after completing the game.
A study by Adams (1998) showed that the most important learning associated with using computer games is not the learning of facts but rather the development of certain attitudes acquired through interaction with the software (e.g., becoming aware of the complexity of a task, developing respect for decision makers in the real world, and developing humility toward accomplishing the task). In this study, the participants’ attitudes were measured by open-ended questions and were found to change positively and significantly. For example, students’ answers to the questions revealed that their interest in, appreciation of, and respect for urban planning and planners were promoted. In addition, Wellington and Faria (1996) found that when playing LAPTOP, a marketing simulation game specifically designed for use in introductory marketing courses, participants’ attitudes changed significantly. In their study, participants’ changes in attitudes were measured along with each of the decisions they made in the game.
Summary
A game is a rule-governed, goal-focused activity incorporating principles of gaming and computer-assisted instruction; a simulation game is one that consists of real settings, participants’ roles, playing rules, and scoring systems. Simulations, games, and other experience-based instructional methods have had an impact on teaching concepts and applications in recent decades. Despite games and simulations’ potential power in instruction and training, research on their training effectiveness is limited; therefore, more analysis and studies of their evaluation need to be conducted.
Alessi (2000a, 2000b) points out four critical elements of an instructional simulation game: knowledge attributes, learners’ attributes, simulation attributes, and learners’ encoding, representing, and using of knowledge. There are two ways to acquire instructional games and simulations: one is to buy off-the-shelf software, and the other is to develop them. There are four criteria for media selection: simulation of all necessary conditions of the job setting, sensory-mode information, feedback, and cost. Amory (2001) points out that the development of an instructional game is composed of three elements: the research on which it is based, the development of resources, and software components.
The effects of computer games on training and instruction have been found beneficial in some cases due to some of the games’ characteristics. These effects can be generally divided into five categories: promotion of motivation, enhancement of thinking skills, facilitation of metacognition, enhancement of knowledge, and building of attitudes.
Evaluation
Evaluation is the process of determining achievement, significance, or value; it usually involves decision-making about performance and about appropriate strategies after prudent appraisal and study (Woolfolk, 2001). Evaluation is the analysis and comparison of current progress or outcomes against a prior condition, often conducted in order to improve a program or make further decisions; it can be conducted on persons or on a program, and it answers the questions “How well did we do?”, “How much did we accomplish?”, “Should we go on?”, and “What should we improve?” based on specific goals, objectives, or standards (e.g., O’Neil et al., 2002; Quinn, Alem, & Eklund, 1997). As pointed out by Quinn et al. (1997), for example, when learning with technology is designed, there are educational objectives intended to be achieved, so the “learning effectiveness assessment” offers a means of measuring the attainment of the objectives “against a set of both design and acceptance criteria.”
Models of Evaluation
Based on the timing, content, usage, and purpose of the information collected, evaluations are categorized into two types. The first is summative evaluation, which verifies the value and merits of the program itself. The second is formative evaluation, which identifies and corrects problems and thereby improves the program. However, some researchers argue that all evaluation is formative evaluation, since the drawbacks of a program that are uncovered often result in changes to it (O'Neil et al., 2002). On the other hand, Peat and Franklin (2002) claimed that it is beneficial and effective to adopt a mix of formative and summative evaluation using on-line computer-based assessment. The following sections describe different models of evaluation.
Summative Evaluation
The most common method of evaluation is summative evaluation (Scriven, 1967), which judges the value of a program at its end, with an emphasis on its final outcome. For example, Peat and Franklin (2002) pointed out that many universities have introduced computer-based assessment for summative evaluations, with the purpose of providing immediate outcomes of a program. When conducting a summative evaluation, evaluators compare participants' performance and application before and after a program, and analyze its cost and effect. As pointed out by Kirkpatrick (1994), summative evaluation verifies the worth and merits of the training itself, placing emphasis on the overall results of a program in terms of its performance levels, time, and cost-effectiveness. In education, for example, it may be conducted to investigate the effects and efficiency of a new educational program (Dugard & Todman, 1995).
The evaluators may further compare the outcome of that particular program with other alternatives, analyze the results, and then make decisions or changes for the future. Otherwise, they may make comparisons between participants who receive the treatment (program) and those who do not receive it or who receive a different treatment (program); that is, they may use a comparison group.
The drawback of summative evaluation is that it is not helpful for diagnosing problems that occur in the process or formation of a program. That is, if the outcome of a summative evaluation is found to be less than ideal, the summative evaluation will not offer ways to find out what the problem is or what to do to improve it. As pointed out by O'Neil et al. (2002):
“Given that this state is most common in early stages of development, comparative, summative-type evaluations are usually mis-timed and may create an unduly negative environment for productivity. Furthermore, because summative evaluation is typically not designed to pinpoint weaknesses and explore potential remedies, it provides almost no help in the development/improvement cycle that characterizes the systematic creation of new methods.” (p. 15)
Summative evaluation is typically followed by a formal report about the cost and effect of a program. For instance, an evaluation will usually report which activities had which effects on which participants. This information is used to decide whether the program is worth continuing in terms of its costs and effectiveness.
For example, a summative evaluation was conducted by Morris (2001). The researcher designed a summative evaluation of a computer-assisted learning program that was designed to be used by students to improve their understanding of psychology. As pointed out by the researcher, the summative evaluation recorded detailed data on the students' interaction with the learning program, such as their actions at the interface, the screens they had visited, their responses to the learning activities, and the specific feedback they received for each activity. The results of the pre-test and post-test control group design of this summative evaluation were examined in order to find out the effects of the computer-assisted learning program and to compare them with the effects of paper-based instruction. The researcher also noted the implication that the development of the computer-assisted learning program should involve formative evaluation, conducted to ensure that the program is easy for students to use and to provide information for improving the learning materials.
Formative Evaluation
A formative evaluation is typically conducted at the outset or during the development of a program; its purpose is to judge the worth of a program while the program is forming and to provide information for the developer to improve that program and its process (Baker & Alkin, 1973; Baker & Herman, 1985; Kirkpatrick, 1994; O'Neil et al., 2002). As O'Neil et al. (2002) pointed out, a formative evaluation focuses on the effectiveness of the development process, from which the program developer can decide whether similar approaches may also be feasible and efficient. It helps the program developer or manager find out whether the program is working out as planned and uncover any obstacles, barriers, or unexpected opportunities; in this it differs from summative evaluation, which is typically conducted only at the end of the program for the overall results. O'Neil also points out that the purpose of the formative evaluation method is to identify the possibility of success and failure of each part and element of the program; "this approach requires that data be developed to permit the isolation of elements for improvement and, ideally, the generation of remedial options to assure that subsequent revisions have a higher probability of success" (2002, p. 15).
In addition, some researchers (e.g., Baker & Alkin, 1973; Baker & Herman, 1985; Scriven, 1967) argue that formative evaluation is conducted to provide information for the development of a program or for internal use by program managers; it is a structured method that provides program staff with additional feedback about their work in order to fine-tune the implementation and ensure the success of the program. For example, Barootchi and Keshavarz (2002) conducted a formative evaluation of English-as-a-foreign-language learners by establishing evaluation portfolios to assess their progress, achievement, and reactions, in order to keep the evaluation an ongoing activity for program planning and improvement.
In some cases, the formative evaluation may be conducted before a program is implemented formally, and feedback will be collected from the participants many times during the development of the program in order to revise it as needed. As pointed out by O'Neil et al. (2002), "Interactive formative evaluation would be accomplished during the project, not at the completion." In other cases, formative evaluation may be conducted throughout the life of a program as guidance for continuous program improvement.
When a formative evaluation is conducted for a training or instructional program, the evaluator serves as a "quasi third party" who must be familiar with the objectives, procedures, and limitations of that program, so that the evaluation goes deeper for program improvement and is not simply an outcome assessment (O'Neil et al., 2002).
Kirkpatrick’s Four-Level Evaluation
One of the most popular models of evaluation is Kirkpatrick's (1996) four-level evaluation (Arthur, Tubre, Paul, & Edens, 2003). Level 1 of the model evaluates reaction; that is, it measures users' or students' feelings about a program (e.g., Mehrotra, 2001; Naugle, Naugle, & Naugle, 2000; Weller, 2000), or what we call "customer satisfaction." According to Kirkpatrick, negative feelings toward a program reduce its effects, so a positive result at this level is a prerequisite for the program. Level 2 of the four-level model evaluates learning; in other words, it evaluates the degree to which participants have obtained the required materials or objective knowledge and the extent to which participants have changed their attitudes (e.g., Mehrotra, 2001). The third level evaluates behavior/application, which is participants' ability to transfer what they have acquired from the program to real life or practical use (e.g., Salas, 2001). The purpose of Level 4 evaluation is to find out the results of the program (e.g., Salas, 2001). The results include the impact of the program on the organization, such as improved quality, decreased costs, reduced mistakes, increased profits, and higher return on investment (ROI).
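To make the structure of the framework concrete, the four levels can be organized in a short illustrative sketch (written here in Python purely for exposition). The level names follow Kirkpatrick (1994), but the example questions and instruments are assumptions drawn from the studies reviewed in this section, not part of Kirkpatrick's model.

# Illustrative sketch of Kirkpatrick's four-level framework.
# Level names follow Kirkpatrick (1994); the example questions and
# instruments are assumptions drawn from the studies reviewed above.
KIRKPATRICK_LEVELS = {
    1: {"name": "Reaction",
        "question": "How did participants feel about the program?",
        "example_instruments": ["self-report satisfaction questionnaire"]},
    2: {"name": "Learning",
        "question": "What knowledge, skills, or attitudes were acquired?",
        "example_instruments": ["knowledge test", "knowledge map",
                                "pre/post attitude questionnaire"]},
    3: {"name": "Behavior/Application",
        "question": "Is the learning transferred to the job or real life?",
        "example_instruments": ["behavioral observation", "interview"]},
    4: {"name": "Results",
        "question": "What organizational outcomes followed?",
        "example_instruments": ["cost, quality, or accident-rate records"]},
}

def describe(level: int) -> str:
    info = KIRKPATRICK_LEVELS[level]
    return f"Level {level} ({info['name']}): {info['question']}"

if __name__ == "__main__":
    for lvl in sorted(KIRKPATRICK_LEVELS):
        print(describe(lvl))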
Level 1 evaluation is usually conducted with a self-report questionnaire completed by the participants (e.g., Adams, 1998; Ricci et al., 1996; Salas, 2001). As pointed out by Kirkpatrick (1994), there are several reasons to conduct Level 1 evaluation. First, adults feel interested and learn better when they can relate the program to their prior experience; conversely, they feel bored or even reluctant to go on when they feel the program is irrelevant. Second, confusion can be discovered using the evaluation. Third, it has potential for pointing out missing content (e.g., Weller, 2000). Fourth, it can find out whether participants feel engaged. Fifth, it can gauge participants' overall feelings about the program (e.g., Weller, 2000).
Although learners' or trainees' favorable feelings toward the program do not ensure learning (e.g., Arthur et al., 2003; Forsetlund, Talseth, Bradley, Nordheim, & Bjorndal, 2003), they do influence whether a program will be supported or implemented in the future (Kirkpatrick, 1994). For example, as pointed out by Arthur et al. (2003), students' ratings of instructors' teaching effectiveness have received a great deal of attention in the psychological and educational literature. In addition, Marsh and Roche (1997) found that teaching effectiveness and learners' reactions are positively correlated. Furthermore, in a study conducted by Mehrotra (2001) to evaluate a training program to increase faculty productivity in aging research, it was found that participants' satisfaction with the training program motivated them to conduct research.
Another example of Level 1 evaluation is a study conducted by Weller (2000). In his study, student reaction to a computer-mediated communication tutor group for a university distance-learning course was examined with questionnaires. The results not only showed participants' feelings about the program but also revealed both the program's benefits and drawbacks, such as a low level of tutor involvement. Motivation was also measured with a questionnaire in the study.
Ricci et al. (1996) used a trainee reaction questionnaire containing five statements with a 5-point Likert scale on the training task and found significant positive correlations between reaction and retention test scores; participants who "(a) perceived their form of study as enjoyable, (b) felt they learned a lot about CBD during their training, and (c) felt confident that they would remember what they learned during training" scored significantly higher on the retention test than those who did not. Furthermore, in this study, participants who received the training in computer-based form performed significantly better than those who received the training in paper-and-pencil form, and participants who received the training in computer-based form also reacted more positively to the training program.
Level 2 evaluation is based on the predefined objectives; evaluators have to make sure there is no confusion between after-program performance and on-the-job performance. It can be conducted through performance tests such as role playing, paper-and-pencil checklist tests (e.g., Henderson, Klemes, & Eshet, 2000; Mayer, 2002; Mayer & Moreno, 1998; Mayer & Wittrock, 1996), multiple-choice, matching, and test sheets, and computer-based knowledge maps (e.g., Baker & Mayer, 1999; Chuang, 2003; Herl et al., 1999; Mayer, 2002; Schacter et al., 1999; Schau & Mattern, 1997). However, developing an effective test with validity and reliability is challenging.
Furthermore, Salas (2001) pointed out that assessment of learning via attitude change is the most popular form of assessing learning. For example, Salas (2001) assessed the change in aviators' attitudes toward an aircrew training program with pre- and post-training self-report questionnaires, and pointed out that learning could also be evaluated by measuring the knowledge learned by trainees against predefined criteria. The researcher further pointed out that multiple measures provide stronger evidence of learning outcomes. Mehrotra's (2001) research on the research training program is also an example of evaluating changed attitudes, one component of Level 2 evaluation. In that study, it was found that the training program had energized participants and enhanced their motivation to increase faculty productivity in aging research.
Level 3 evaluation is a complicated evaluation designed to ensure that the program or training has a positive influence on job performance (e.g., Harrell, 2001). It can be conducted through one-on-one interviews or questionnaires; interviews are more costly but tend to give the most useful information. On the other hand, Salas (2001) pointed out that studies that gathered behavioral data tended to use a combination of various tools, such as behavioral observation forms, behavioral checklists, analysis of crew communication, and peer or self-evaluations/reports. In Salas's (2001) study, the researcher also assessed whether aviators transferred the behaviors that were previously learned to operations in the cockpit; the aviators' behavioral data were collected. As pointed out by Salas, the most common method of assessing behavioral change was measuring behaviors related to the training objectives while participants performed in a simulation environment. Further, researchers (e.g., Naugle, Naugle, & Naugle, 2000) pointed out that educational settings and state departments of education are already addressing the issue of whether students who had completed a program used the new skills and knowledge in real-world situations. The researchers (Naugle, Naugle, & Naugle, 2000; Salas, 2001) also pointed out that behavior should be assessed before and after the program, and could be assessed not only by the program implementer but also by the participants, using self or peer assessment.
Level 4 evaluation, results, is the highest level of evaluation in Kirkpatrick's (1994) framework and is the most complicated one; therefore, despite its high value, very few evaluations have been conducted at this level (Salas, 2001). Salas pointed out the difficulty of collecting information about program results "in terms of time, resources, identification of a clear criterion, and low occurrences of accidents and mishaps" (p. 651). When conducting Level 4 evaluation, evaluators look for evidence instead of a direct and simple result; we may find evidence that the program has an influence on the organization by comparing pre- and post-tests or experimental and control groups, while recognizing the possibility that other variables could have contributed to the result. The evidence of positive results may be higher productivity, increased sales, reduced costs, improved quality, and so on, but we may not be certain that the program is the only cause. For example, Naugle, Naugle, and Naugle pointed out that in educational settings the desired results are often less explicit or measurable; therefore evaluators need to look for evidence. Also, in applying Kirkpatrick's framework to evaluate an aircrew training program (Salas, 2001), the researcher reviewed 58 published accounts of the training program to determine its cost-effectiveness and found the results uncertain. However, as pointed out by the researcher, several pieces of evidence showing the effectiveness of the program had been found, such as a reduction in accident rates.
Due to the timing and attributes of the four levels, Levels 1 and 2 of Kirkpatrick's evaluation are typically applied as formative evaluation conducted during the course of a program or plan, while Levels 3 and 4 are usually applied as summative evaluation conducted at its end to find out the final results. In addition, an evaluation need not be conducted at all of the levels; evaluators may apply only one or two levels as needed (Blanchard, Thacker, & Way, 2000; Kirkpatrick, 1996). For example, in Harrell's (2001) report on the evaluation of training effectiveness, he emphasized only the third level, to find out whether trainees behaved differently on the job after a training program. Furthermore, as pointed out previously in the section on games' training effectiveness, Arthur et al. (2003) applied only reaction and learning evaluation, Levels 1 and 2, in their empirical study of a simulation game's training and testing effectiveness for students' visual attention.
Game Evaluation
Despite the rush to embrace instructional games, there has been a lack of evaluation, whether summative or formative; there is limited evidence as to the training effectiveness of games, so a framework for evaluating the learning results of games needs to be built up (O'Neil & Fisher, 2002; Quinn, 1997; Ruben, 1999). For an instructional game, a summative evaluation would be conducted to find out whether the game will last for a specific period of time or is intended to be constantly adapted and upgraded as new software options appear, to account for all other induced costs, to compare the uncovered outcomes and impact, and finally to determine its return on investment (ROI).
The study done by Parchman, Ellis, Christinaz, and Vogel (2000) is among the few studies that are helpful in designing an evaluation of an instructional game. In their study, the researchers conducted a formative evaluation of four alternative instructional methods for teaching navy electronic technology: "computer-based instruction," a "computer-based adventure game," traditional "classroom instruction," and "computer-based drill and practice." The evaluation of the effectiveness outcomes was limited to Kirkpatrick's Level 2 evaluation. Participants' knowledge was assessed with a 40-item multiple-choice test, and their motivation was assessed with a motivation questionnaire. However, the results for the game-based training group were not better than those of the other three groups. The researchers pointed out that some game elements of challenge, fantasy, and curiosity may detract from, rather than enhance, the instruction.
The study conducted by Westbrook and Braithwaite (2001) is also one of the few studies that are helpful for designing a game evaluation. In that study, the researchers applied pre- and post-questionnaires consisting of learner demographics, learners' reactions toward the game, learners' knowledge of the health system, and learners' experience with computer games, to evaluate a health care simulation game designed to promote information-seeking skills and interaction with the health system.
In addition, games with different goals/objectives should be evaluated with different assessment measures, to find out whether the game really helps learners or trainees achieve the learning goal and objectives, and how efficiently it does so (O'Neil et al., 2002; Quinn, 1996). For example, if the training/learning goal is to increase learners' problem-solving ability, measures assessing problem-solving ability, including content understanding, problem-solving strategies, and self-regulation, can be applied (O'Neil, 1999).
The evaluation of a game can be formative or summative, according to the needs and purposes of the evaluation. If the evaluation is intended to identify and correct problems and thereby improve the game, a formative evaluation should be conducted. However, if the purpose is to verify the value and benefits of the game, then a summative evaluation should be made. For example, O'Neil et al. (2002) developed a framework for formative evaluation of games. This framework is for evaluation conducted during the process of a project, not after it is implemented to find out the outcome. As seen in Table 2, the procedure involves multiple steps. The formative evaluation starts with examining whether the design of the game is consistent with its specifications and ends with a period of revisions. Activities 4 to 9, as pointed out by O'Neil et al., involve new data collection. The researchers indicate that the framework should be modified based on the need to provide useful and timely information to the developers.
Table 2
Formative Evaluation Activities (adapted from O'Neil et al., 2002)
1. Check the system design against its specifications.
2. Check the design of assessments for outcome and diagnostic measurement against specifications; design and try out measures.
3. Check the validity of instructional strategies embedded in the system against the research literature.
4. Conduct feasibility review with the instructors.
   • Are the right tasks being trained?
   • Review to be conducted with instructors.
5. Conduct feasibility tests with the students.
   • One-on-one testing with protocol analysis.
   • Small-group testing.
6. Assess instructional effectiveness.
   • Cognitive: e.g., does it improve domain knowledge (e.g., information technology skills), transfer of problem-solving skills, self-regulation?
   • Affective: e.g., does it improve self-efficacy?
   • Are there differential effects for identifiable subgroups?
7. Does more game-based training lead to better game performance (e.g., loss/exchange ratio)?
   • Need to track a player's performance across multiple games.
8. Do experts and novices differ in performance?
9. Does more training lead to better performance?
10. Revise based on activities 1-9.
Conducting an evaluation of an instructional game is challenging, since it needs to uncover whether the separate components created by developers and course designers interact appropriately and effectively when combined. Furthermore, the evaluators must understand the training objectives and the focus of the evaluation. As pointed out by O'Neil et al. (2002), "Evaluation is a mix of requirements driven and technology push factors. For example, if it is requirements-driven, then the training objective/assessments are critical; on the other hand, if it is technology driven, then fun/challenge/fantasy issues are critical." In addition, the evaluation of a computer game needs to find out whether all facilities work well together and how long they will last, considering the relevant costs, including software, hardware, maintenance, and update fees, in order to determine the relationship between cost and effect.
Summary
Evaluation is the process of determining achievement, significance, or value; it is the analysis and comparison of current progress or outcomes against a prior condition based on specific goals/objectives or standards. While summative evaluation focuses on outcomes and is typically conducted at the end of a program, formative evaluation is normally conducted throughout the program to evaluate the process of program development. Kirkpatrick's (1994) four-level evaluation model is the most commonly used model; the four levels are reaction, learning, behavior/application, and results. There is limited evidence as to the training effectiveness of games for adults, so a framework for evaluating the learning results of games needs to be developed. Since different goals/objectives and games should be evaluated with different assessment measures, game developers should design appropriate assessment tools to find out whether a game really helps learners or trainees achieve the learning goal and objectives, and how efficiently it does so.
For example, this study will evaluate a game whose training/learning goal is to increase learners' problem-solving ability, so measures assessing problem-solving ability, including content understanding, problem-solving strategies, and self-regulation, can be applied (O'Neil, 1999) in this study. In a pilot study, we will use O'Neil's framework of formative evaluation and apply the first two levels of Kirkpatrick's (1994) evaluation model. In the main study, we will evaluate the impact of a game on problem solving, as discussed in the following section.
Problem Solving
Definition of Problem Solving
Problem solving is cognitive processing directed at achieving a goal when no solution method is obvious to the problem solver (Mayer & Wittrock, 1996; Simon, 1973). According to Baker and Mayer (1999), it has four characteristics: it is cognitive, process-based, directed, and personal. Baker and Mayer (1999) further indicated that problem solving has four steps. The first step is "problem translation": the problem solver identifies the information available in the situation where the problem occurs and translates it into his or her mental model. The second step is "problem integration": the problem solver puts the pieces of information together into a structure. The last two steps of problem solving are "solution planning" and "solution execution": developing a feasible plan and implementing it to solve the problem. The first two components constitute the problem representation phase of problem solving, while the latter two constitute the problem solution phase. Further, Sternberg and Lubart (2003) explained that the analytic part of human intelligence initially recognizes and structures problems and evaluates the ideas that occur during the process of problem solving, while the practical part of intelligence figures out which ideas may work well and which ideas will further result in good ideas.
The National Center for Research on Evaluation, Standards, and Student Testing (CRESST) concluded that problem-solving ability is composed of three elements: content understanding, problem-solving strategies, and self-regulation. The elements and their hierarchical order can be seen in Figure 1.
Figure 1. National Center for Research on Evaluation, Standards, and Student Testing (CRESST) model of problem solving. [The figure depicts problem solving at the top, branching into content understanding; problem-solving strategies (domain-specific and domain-independent); and self-regulation, which comprises metacognition (planning and self-monitoring) and motivation (effort and self-efficacy).]
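For readers who prefer a compact notation, the hierarchy in Figure 1 can also be restated as a small nested structure; the following sketch (in Python, for illustration only) adds nothing beyond the CRESST model shown in the figure.

# The CRESST problem-solving model restated as a nested structure.
CRESST_PROBLEM_SOLVING = {
    "content understanding": [],
    "problem solving strategies": [
        "domain-specific problem solving strategies",
        "domain-independent problem solving strategies",
    ],
    "self-regulation": {
        "metacognition": ["planning", "self-monitoring"],
        "motivation": ["effort", "self-efficacy"],
    },
}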
In addition, problem-solving strategies can be further categorized into two types: domain-independent and domain-specific problem-solving strategies. The former are general strategies such as the application of multiple representations, mental simulation, and analogy to problems. As pointed out by researchers (van Merrienboer, Clark, & de Croock, 2002), cognitive schemata enable problem solvers to solve a new problem by serving as an analogy. On the other hand, domain-specific problem-solving strategies are task-dependent strategies (O'Neil, 1999) that guide problem solving in the domain by reflecting the way problems may be solved effectively (van Merrienboer et al., 2002). Examples of task-dependent strategies are using Boolean search strategies in a search task, applying a computer language to write a computer program, or applying equation-solving strategies to solve a math problem (O'Neil, 1999). Baker and Mayer (1999) explain that domain-specific aspects of problem-solving strategies are those that are unique to a specific subject or field such as geometry, geology, or genealogy, and they involve the specific content understanding, procedural knowledge, and discourse of the subject domain.
The third element of problem solving is self-regulation, which includes two sub-categories, metacognition and motivation; the former comprises self-checking and self-planning, and the latter is composed of effort and self-efficacy. Self-efficacy refers to cognitive judgments of one's capabilities within a specific domain or on a specific task, and it affects achievement in that domain or on that task (Bong & Clark, 1999).
Significance of Problem-Solving Skills
Previous researchers have pointed out the significance of problem solving (e.g., Cheung, 2002; Mayer, 1998; O'Neil, 1999). For example, O'Neil (1999) pointed out that problem-solving skills have been suggested by many researchers to be among the critical competencies required of students, college graduates, and employees. Cheung (2002) pointed out that problem-solving ability is important for a person's psychological and social functioning. Mayer (1998) argues that cognitive, metacognitive, and motivational skills are required for successful problem solving in academic settings. Furthermore, due to rapid technological change, not only schools and colleges but also companies will transform into learning organizations, where individual problem-solving skills are critical. For example, according to previous research, there is evidence that problem-solving skills affect the bottom line; high-paying jobs are usually those requiring higher-level thinking skills (O'Neil, 1999). Therefore, as pointed out by researchers (e.g., Mayer, 2002; Moreno & Mayer, 2000), promoting problem-solving transfer has become one of the most important educational objectives. In Moreno and Mayer's (2000) study, for example, it was found that personalized messages in a multimedia science lesson produced better performance in problem-solving transfer and retention. Mayer (1998) also points out that although routine problem solving has been promoted successfully, educators need to spend more effort on teaching non-routine problem-solving skills. As a result, educators need an assessment program that tests validly and efficiently how much students have learned (retention) and how well they are able to apply it (transfer) (e.g., Day, Arthur, & Gettman, 2001; Moreno & Mayer, 2000). Further, Dugdale (1998) also points out that the recent literature on education has emphasized problem solving as a focus of school mathematics.
Assessment of Problem Solving
Substantial previous research reveals the significance of problem solving for study, institutions, the workforce, and tasks of all kinds, and teaching for problem-solving transfer is, as a result, an important educational objective. However, an assessment framework that is valid and efficient needs to be built, and the methods used to assess problem-solving skills still need to be refined (Mayer, 2002; O'Neil & Fisher, 2002; O'Neil et al., 2002). For example, when teachers assess students by giving them a test of separate and unconnected multiple-choice questions, they are not accurately assessing students' problem-solving abilities, and traditional standardized tests do not report to teachers or students which problem-solving and thinking processes they should emphasize. Although useful measures of problem-solving competence can be found in the cognitive science literature, such as think-aloud protocols (e.g., Day, Arthur, & Gettman, 2001), those measures are inefficient performance assessments, requiring extensive human scoring and a great amount of time (O'Neil, 1999).
According to Baker and Mayer (1999), two aspects of problem-solving ability need to be tested as a whole: retention and transfer. Retention involves what learners have retained or remembered from what they have been presented, while transfer involves how much learners can apply the learned knowledge or skills in a brand-new situation; retention is tested with routine problems, which are problems that learners have learned to solve, and transfer is tested with non-routine problems, which are problems that learners haven't solved in the past (Mayer, 1998). According to the researchers, the assessment of problem-solving transfer should be the current emphasis of education, since learners need not only to memorize the materials but also to apply them in a novel situation or in the real world. In addition, problem-solving ability may be assessed by checking the entire process while a task is being solved or the final outcome, "contrasting expert-novice performance." Also, Day et al. (2001) pointed out that the more similar novices' knowledge structures are to experts' structures, the higher the level of novices' skill acquisition.
The National Center for Research on Evaluation, Standards, and Student Testing (CRESST) has developed a problem-solving assessment model composed of content understanding, problem-solving strategies, and self-regulation, the three elements of problem-solving ability. The measurement of each element is described in the following sections.
Measurement of Content Understanding
Mayer and Moreno (1998) assessed content understanding with retention and transfer questions. In their study on the split-attention effect in multimedia learning, they gave participants a retention test and a matching test containing questions designed to assess the extent to which participants remembered the knowledge delivered by multimedia with animation and narration, or with animation and on-screen text.
An alternative way to measure content understanding is knowledge maps. Knowledge maps have been used as an effective tool to learn complex subjects (Herl et al., 1996) and to facilitate critical thinking (West, Pomeroy, Park, Gerstenberger, & Sandoval, 2000). Several studies have also revealed that knowledge maps are not only useful for learning but also a reliable and efficient measurement of content understanding (Herl et al., 1999; Ruiz-Primo, Schultz, & Shavelson, 1997). For example, Ruiz-Primo et al. (1997) proposed a framework for conceptualizing knowledge maps as a potential assessment tool in science. Students need to learn how to locate, organize, and discriminate between concepts, and to use information stored in various formats to make decisions, solve problems, and continue their learning when formal instruction is no longer provided.
A knowledge map is a structural representation that consists of nodes and links. Each node represents a concept in the domain of knowledge. Each link, which connects two nodes, is used to represent the relationship between them; that is, the relationship between the two concepts. As Schau and Mattern (1997) point out, learners should be aware not only of the concepts but also of the connections among them. A set of two nodes and their link is called a proposition, which is the basic and smallest unit in a knowledge map.
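Because the node-link-proposition structure is central to the measures used later in this proposal, a minimal illustrative sketch is given below (in Python). The concept and link labels shown are examples taken from the environmental science maps described later in this section; the sketch is not the CRESST software itself.

from typing import NamedTuple, Set

class Proposition(NamedTuple):
    # A proposition is the smallest unit of a knowledge map:
    # two concept nodes joined by a labeled link.
    source: str   # concept node, e.g., "photosynthesis"
    link: str     # relationship label, e.g., "produces"
    target: str   # concept node, e.g., "oxygen"

class KnowledgeMap:
    def __init__(self) -> None:
        self.propositions: Set[Proposition] = set()

    def add(self, source: str, link: str, target: str) -> None:
        self.propositions.add(Proposition(source, link, target))

    @property
    def concepts(self) -> Set[str]:
        return ({p.source for p in self.propositions}
                | {p.target for p in self.propositions})

# Example: a two-proposition map using terms from the environmental science domain.
student_map = KnowledgeMap()
student_map.add("photosynthesis", "produces", "oxygen")
student_map.add("sunlight", "used for", "photosynthesis")
print(len(student_map.concepts), len(student_map.propositions))  # 3 concepts, 2 propositions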
Ruiz-Primo et al. (1997) claimed that, as an assessment tool, a knowledge map is a combination of three components: (a) a task that allows a student to demonstrate his or her content understanding in the specific domain, (b) a format for the student's response, and (c) a scoring system by which the student's knowledge map can be accurately evaluated. Chuang (2003) modified their framework to serve as an assessment specification using a concept map. Table 3 lists the elements and characteristics of the knowledge maps identified in her study.
Table 3
Domain Specifications Embedded in Chuang's (2003) Study, Adapted from Hsieh's (2001) Study
(Each entry gives the general domain specification followed by its instantiation in this software.)

Scenario: Create a knowledge map on environmental science by exchanging messages in a collaborative environment and by searching for relevant information in a simulated World Wide Web environment.
Participants: Student team (two members).
   • Leader: the one who does the knowledge mapping.
   • Searcher: the one who accesses the simulated World Wide Web environment to find relevant information and ask for feedback.
Knowledge map terms (nodes): Predefined – 18 important ideas identified by content experts: atmosphere, bacteria, carbon dioxide, climate, consumer, decomposition, evaporation, food chain, greenhouse gases, nutrients, oceans, oxygen, photosynthesis, producer, respiration, sunlight, waste, and water cycle.
Knowledge map terms (links): Predefined – 7 important relationships identified by content experts: causes, influences, part of, produces, requires, used for, and uses.
Simulated World Wide Web environment: Contains over 200 Web pages with over 500 images and diagrams about environmental science and other topic areas.
Training: All students went through the same training session, which included how to construct the map, how to search, and how to communicate with the other group member.
Feedback: Feedback is based on comparing the group's knowledge map performance to that of the expert's map (a scoring sketch follows the table).
Adapted knowledge of response feedback: Includes knowledge-of-response feedback; messages about how much improvement students have accomplished in the current map compared with the previous map are provided, but no search strategy for electronic information seeking is included. Representation: graphics plus text.
Task-specific adapted knowledge of response feedback: Includes knowledge-of-response feedback; messages about how much improvement students have accomplished in the current map compared with the previous map are provided, as well as useful search strategies for electronic information seeking. Representation: graphics plus text.
Timing of feedback: Both kinds of feedback used in this study can be either immediate or delayed because feedback access is controlled by the searchers.
Type of learning: Collaborative problem solving.
Problem-solving measures:
   • Knowledge map – content understanding and structure: (a) semantic content score, (b) the number of concepts, and (c) the number of links.
   • Information seeking – browsing and searching.
   • Feedback – the number of times students request feedback for their knowledge maps.
   • Self-regulation – planning, self-checking, self-efficacy, and effort.
   • Team processes – adaptability, coordination, decision making, interpersonal skills, leadership, and communication.
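As indicated in the feedback entry of Table 3, a group's map is compared with an expert's map. The following minimal sketch illustrates one way such a comparison could be computed; the scoring rule shown (simple proposition overlap) is an assumption made for illustration and is not necessarily the semantic scoring procedure used by CRESST.

# Minimal sketch: score a student/group map against an expert map by
# proposition overlap. The real CRESST semantic content score is more
# elaborate; this overlap rule is only an illustrative assumption.
def score_map(student_props, expert_props):
    student, expert = set(student_props), set(expert_props)
    matched = student & expert
    return {
        "num_concepts": len({c for s, _, t in student for c in (s, t)}),
        "num_links": len(student),
        "overlap_with_expert": len(matched) / len(expert) if expert else 0.0,
    }

expert = {("photosynthesis", "produces", "oxygen"),
          ("sunlight", "used for", "photosynthesis"),
          ("bacteria", "part of", "food chain")}
group = {("photosynthesis", "produces", "oxygen"),
         ("sunlight", "used for", "photosynthesis")}
print(score_map(group, expert))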
Researchers have successfully applied knowledge maps to measure students' content understanding in science, for both high school students and adults (e.g., Chuang, 2003; Herl et al., 1999; Schacter et al., 1999; Schau et al., 2001). Schau et al. (2001) used select-and-fill-in knowledge maps to measure secondary and postsecondary students' content understanding of science in two studies. In the first study, students' performance on the knowledge map correlated significantly with their performance on a multiple-choice test, a traditional measure (r = .77 for eighth grade and r = .74 for seventh grade). According to this result, the knowledge map is a valid assessment tool. In the other study, Schau et al. compared the results of knowledge maps with both traditional multiple-choice tests and relatedness ratings. Further, the mean of the map scores increased significantly, from 30% correct at the beginning of the semester (SD = 11%) to 50% correct at the end (SD = 19%). Finally, the correlation between knowledge map scores and multiple-choice test scores, and the correlation between concept scores and relatedness ratings, were both high.
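The validity evidence reported by Schau et al. rests on correlating knowledge map scores with scores on a traditional measure. A minimal sketch of that kind of analysis is shown below; the scores used are invented solely for illustration and are not Schau et al.'s data.

import statistics

def pearson_r(x, y):
    # Pearson product-moment correlation between two score lists.
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores for eight students (not Schau et al.'s data).
map_scores = [30, 42, 55, 38, 61, 47, 52, 35]   # knowledge map (% correct)
mc_scores  = [40, 50, 62, 45, 70, 55, 58, 44]   # multiple-choice test (% correct)
print(round(pearson_r(map_scores, mc_scores), 2))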
Recently, CRESST has developed a computer-based knowledge mapping system, which measures the deeper understanding of individual students and teams, reflects thinking processes in real time, and economically reports student thinking-process data back to teachers and students (Chung et al., 1999; O'Neil, 1999; Schacter et al., 1999). The computer-based knowledge map has been used in at least four studies (Chuang, in preparation; Chung et al., 1999; Hsieh, 2001; Schacter et al., 1999). In these four studies, the map contained 18 concepts of environmental science and seven relationship links, such as causes, influences, and used for. Students were asked to create a knowledge map in a computer-based environment. In the study conducted by Schacter et al. (1999), students were evaluated by creating individual knowledge maps after searching the simulated World Wide Web. On the other hand, in the studies conducted by Chung et al. (1999), Hsieh (2001), and Chuang (2003), two students constructed a group map cooperatively through networked computers, and their results showed that using networked computers to measure group processes was feasible.
An example of a concept map is shown in Figures 2 and 3. As seen in Figure 2, the computer screen was divided into three major parts. The numbered buttons located at the lower right of the screen are message buttons for communication between group members; all predefined messages were numbered and listed on handouts distributed to participants. When a participant clicked a button on the computer screen, the corresponding message was shown instantly on his/her and his/her partner's computers simultaneously. The lower left part of the screen was where messages were displayed in the order sent by members. As seen in Figure 2, the top left part was the area where the map was constructed.
Figure 2. User Interface for the System.
As seen in Figure 3, in this system (e.g., Chuang, in preparation; Hsieh, 2001), only a leader can add concepts to the knowledge map and make connections among concepts, by clicking the "Add Concept" icon on the menu bar and pressing the "Link" button, respectively. There are 18 environmental science concepts under "Add Concept," such as "atmosphere," "bacteria," "carbon dioxide," and "water cycle," and seven link labels (i.e., causes, influences, part of, produces, requires, used for, uses). The leader was asked to use these terms and links to construct a concept map using the computer mapping system. In addition, the leader could move concepts and links to make changes to the map. In contrast, the searcher in each group could seek information from the simulated World Wide Web environment and access feedback regarding the result of the group's concept map. Therefore, to construct a concept map successfully, both the searcher and the leader of a group must collaborate well.
Figure 3. Add Concepts and Links.
Measurement of Problem Solving Strategies
Problem-solving strategies can be categorized as domain-independent/general and domain-dependent/specific (Alexander, 1992; Bruning, Schraw, & Ronning, 1999; O'Neil, 1999; Perkins & Salomon, 1989). Domain-specific knowledge is knowledge about a particular field of study or subject, such as the application of equations in a math question, the application of a formula in a chemistry problem, or the specific strategies needed to be successful in a game. On the other hand, domain-general knowledge is a broad array of knowledge that is not linked with a specific domain, such as the application of multiple representations and analogies in a problem-solving task or the use of Boolean search strategies in a search task (e.g., Chuang, in preparation).
CRESST has created a simulated Internet Web space to evaluate problem-solving strategies such as information-searching strategies and feedback-inquiring strategies (Herl et al., 1999; Schacter et al., 1999). In the study conducted by Schacter et al., they found that students' problem-solving strategies, such as information browsing, focused searching, and feedback requests, improved significantly from pretest to posttest. Mayer and Moreno (1998) conducted a study on the split-attention effect in multimedia learning and the dual processing systems in working memory, and assessed participants' problem-solving strategies with a list of transfer questions. The results of their experiments showed that students who received concurrent narration describing the target pictures performed better on transfer tests than those who received concurrent on-screen text involving the same words, in terms of both their content understanding and their problem-solving strategies.
Measurement of Self-Regulation
According to Bruning, Schraw, and Ronning (1999), some researchers believe that self-regulation includes three core components: metacognitive awareness, strategy use, and motivational control. In an alternative framework, CRESST's model of problem solving (O'Neil, 1999), self-regulation is composed of metacognition and motivation. Metacognition encompasses two subcategories, planning and self-checking/monitoring (Hong & O'Neil, 2001; O'Neil & Herl, 1998; Pintrich & DeGroot, 1990), and motivation is indicated by effort and self-efficacy (Zimmerman, 1994, 2000). O'Neil and Herl (1998) developed a trait self-regulation questionnaire examining the four components of self-regulation. Of the four components, planning is the first step, because one must have a plan to achieve the proposed goal. In addition, self-efficacy is one's belief in his or her capability to accomplish a task, and effort is how hard one is willing to work on a task. In the trait self-regulation questionnaire developed by O'Neil and Herl (1998), planning, self-checking, self-efficacy, and effort are each assessed using eight questions. The reliability of this self-regulation inventory has been shown in previous studies (Hong & O'Neil, 2001). For example, in the research conducted by Hong and O'Neil (2001), the reliability estimates (coefficient α) of the four subscales of self-regulation (planning, self-checking, effort, and self-efficacy) were .76, .06, .83, and .85 respectively, and the research also provided evidence of construct validation.
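Because each of the four components is assessed with eight questionnaire items and reliability is reported as coefficient α, a brief sketch of how such an estimate is computed is given below. The item responses shown are invented for illustration, and the formula is the standard Cronbach's alpha rather than code taken from the CRESST instrument.

import statistics

def cronbach_alpha(item_scores):
    # item_scores: list of items, each a list of respondents' ratings.
    k = len(item_scores)                                 # number of items (e.g., 8)
    item_vars = [statistics.pvariance(item) for item in item_scores]
    totals = [sum(resp) for resp in zip(*item_scores)]   # each respondent's total score
    total_var = statistics.pvariance(totals)
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 4-point ratings from five respondents on an eight-item subscale.
items = [
    [3, 4, 2, 3, 4],
    [3, 3, 2, 4, 4],
    [2, 4, 1, 3, 3],
    [3, 4, 2, 4, 4],
    [4, 3, 2, 3, 4],
    [3, 4, 1, 3, 3],
    [2, 3, 2, 4, 4],
    [3, 4, 2, 3, 4],
]
print(round(cronbach_alpha(items), 2))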
While the self-regulation questionnaire is used by CRESST, another measure, think-aloud, is also used to assess self-regulation (Winne & Perry, 2000); participants speak their thinking process aloud while solving a problem. The documented data are then analyzed psychologically, and the potential underlying thought processes are inferred (Manning, Glasner, & Smith, 1996).
For example, in the study conducted by O'Neil and Abedi, in which metacognition is considered "conscious and periodic self-checking of whether one's goal is achieved and, when necessary, selecting and applying different strategies" (pp. 3-4), the researchers developed a framework to assess state metacognition directly and explicitly. State metacognition as assessed in that study is considered situation-specific and can vary rapidly. The framework consists of a set of self-reported, domain-independent items, and the researchers found it reliable and valid.
To evaluate problem-solving ability, previous researchers (e.g., Baker & Mayer, 1999; Baker & O'Neil, 2002; Mayer, 2002; O'Neil, 1999) further point out that computer-based assessment combines the validity of well-constructed test items with the efficiency of computer technology as a means of presenting and scoring tests.
Summary
Problem solving is cognitive processing directed at achieving a goal when no solution method is obvious to the problem solver. Problem-solving strategies can be further categorized into two types: domain-independent and domain-specific strategies. Also, self-regulation includes two sub-categories, metacognition and motivation; the former comprises self-checking and self-planning, and the latter comprises effort and self-efficacy.
The knowledge map is not only useful for teaching and learning but also a reliable and efficient measurement of content understanding. In addition, CRESST has created a simulated Internet Web space to evaluate problem-solving strategies such as information-searching strategies and feedback-inquiring strategies. Finally, while CRESST's self-regulation questionnaire assesses self-regulation, think-aloud is another method to assess domain-dependent metacognition.
Computer-based problem-solving assessments are economical, efficient, and valid measures that employ contextualized problems requiring students to think for extended periods of time and to indicate the problem-solving heuristics they were using and why; they provide students with access to information with which to solve the problem, and they offer detailed feedback to teachers, students, and their parents about each student's problem-solving processes.
Summary of the Literature
Simulations, games, and other experience-based instructional methods have had a substantial impact on the teaching of concepts and applications. Despite the potential power of games and simulations in instruction and training, research on their training effectiveness is limited; therefore, more analysis and studies on their evaluation need to be conducted.
There are two ways to apply instructional games and simulations: one is to buy off-the-shelf software, and the other is to develop them. There are four criteria for media selection: simulation of all necessary conditions of the job setting, sensory-mode information, feedback, and cost. On the other hand, Amory (2001) points out that the development of an instructional game is composed of three elements: the underlying research, resource development, and software components.
The effects of computer games and simulations on training and instruction shown in previous studies can be generally divided into five categories: promotion of motivation, enhancement of thinking skills, facilitation of metacognition, enhancement of knowledge, and building of attitudes.
Evaluation is the process of determining achievement, significance, or value; it is the analysis and comparison of current progress or outcomes against a prior condition based on specific goals/objectives or standards. While summative evaluation focuses on outcomes and is typically conducted at the end of a program, formative evaluation is normally conducted throughout the program to evaluate the process of program development. In addition, Kirkpatrick's (1994) four-level evaluation model is the most common model applied in different fields, such as business and academic settings. The four levels are reaction, learning, behavior/application, and results.
There is limited evidence as to the training effectiveness of games for adults, so a framework for evaluating the learning results of games needs to be developed. Since different goals/objectives and games should be evaluated with different assessment measures, game developers should design appropriate assessment tools to find out whether a game really helps learners or trainees achieve the learning goal and objectives, and how efficiently it does so. For example, this study will evaluate a game whose training/learning goal is to increase learners' problem-solving ability, so measures assessing problem-solving ability, including content understanding, problem-solving strategies, and self-regulation, can be applied (O'Neil, 1999). In this study, we will apply the first two levels of Kirkpatrick's (1994) four-level evaluation model for the formative evaluation.
Problem-solving ability is one of the most critical skills for work and learning in almost every setting. However, its assessment measures still need to be further refined. The knowledge map is not only useful for teaching and learning but also a reliable and efficient measurement of content understanding. In addition, CRESST has created a simulated Internet Web space to evaluate problem-solving strategies such as information-searching strategies and feedback-inquiring strategies. Finally, while CRESST's self-regulation questionnaire assesses self-regulation, think-aloud (e.g., Day, Arthur, & Gettman, 2001) is another method for assessing domain-dependent metacognition. Furthermore, computer-based problem-solving assessments are economical, efficient, and valid measures that employ contextualized problems requiring students to think for extended periods of time and to indicate the problem-solving heuristics they were using and why; they provide students with access to information with which to solve the problem, and they offer detailed feedback to teachers, students, and their parents about each student's problem-solving processes.
CHAPTER III
METHODOLOGY
Research Hypothesis
Research Question: Will participants increase their problem-solving ability after playing a game (i.e., SafeCracker)?
Research Design
The research will consist of a pilot study and a main study. The pilot study will focus
on a formative evaluation. The main study will focus on the impact of the game on problem
solving.
Pilot Study
A pilot study is a small-scale trial conducted before the main research, with the purpose of developing and examining the measures or procedures that will be used in the main study (Gall, Gall, & Borg, 2003). There are several advantages to conducting a pilot study. First, a pilot study permits preliminary testing of the hypotheses, which leads to more precise hypotheses in the main study and brings researchers new ideas or alternative measures not anticipated before the pilot study. In addition, it permits a thorough examination of the planned research procedures and reduces the number of treatment errors. Also, researchers may obtain feedback from the participants of the pilot study (Isaac & Michael, 1997).
In previous studies of games' effects, researchers conducted an initial trial to examine the utility of the target software and to determine whether the computer environment/interface was understandable to the subjects (e.g., Amory, 1999; Greenfield et al., 1994; Quinn, 1991). For example, Quinn (1991) used an adventure game similar to the one to be used in the main study in a pilot experiment, to find out the applicability of the adventure game and the need for revision. In the preliminary study conducted by Amory et al. (1999), the researchers tried to identify a game type and game characteristics appropriate for education, and based on the results, the researchers designed an educational game for further study.
There are four purposes of the pilot study in this dissertation. First, it will be used to assess the functionality of the computer system. Second, it will be used to determine whether the environment is feasible and understandable for the participants. Third, it will be used to assess whether the predicted time is suitable for participants to construct the knowledge map, play the game, and finish the tests. Finally, the pilot study will be conducted to find out participants' feelings toward the game and the whole process (e.g., Amory, 1999; Quinn, 1991).
Formative Evaluation
For this study, the researchers will apply the framework of formative evaluation in a pilot study (O'Neil et al., 2002). According to O'Neil et al., such an evaluation is conducted to find out the feasibility of an educational technology program and to improve the program by offering information on its implementation and procedure. The study will follow a modified version of the O'Neil methodology to conduct a formative evaluation of a game, as seen in Table 4.
Table 4
Formative Evaluation Activities (adapted from O'Neil et al., 2002)
1. Check the game design against its specifications.
2. Check the design of assessments for outcome and measurement; design and try out measures.
3. Check the validity of instructional strategies embedded in the game against the research literature.
4. Conduct feasibility review with the students.
   • Review to be conducted with students.
   • Small-group testing (n = 3-5).
5. Implement revisions.
Participants
The participants in the pilot study will be four college students at the University of Southern California, aged 20 to 35. The pilot study will be conducted after receiving approval from the USC Review of Human Subjects. All participants will be selected to have no prior experience playing SafeCracker.
Puzzle-Solving Game
The computer game selected for this study needs to have the following characteristics: (1) it should be adult-oriented, single-user, and suitable for problem-solving research, since this research aims to find out a game's effect on an adult's problem-solving ability; (2) it should be one that participants can learn to play in a couple of minutes, since this study is an initial trial that will not be prolonged; and (3) it should be able to be replayed many times within one hour.
The selection of SafeCracker was based on a study by Wainess and O’Neil (2003).
They evaluated the research feasibility of 525 potential video games in three
categories: puzzle games, strategy games, and educational games. The appropriate game was
then sought among the puzzle games, because their properties provide an appropriate
platform for studying games’ effectiveness in enhancing problem-solving ability. A
participant in a puzzle-solving game is placed in a specific setting or story background and
tries to reason out possible task procedures and consequences. Failing to solve a puzzle
encountered earlier may cause problems later in the game.
Their criteria included the following: the selected game should not be violent or appeal
to one gender more than the other, and it should not appeal mainly to people with special
interests. For example, baseball- or wrestling-related games tend to interest males and sports
fans more than females and non-sports fans. In addition, the game should not favor people with
particular skills, such as motor skills, rapid responses, or background knowledge; for example,
a game that is all about music, specially designed for legal practitioners, or entirely biology
related should not be selected. Finally, the game must allow the player to control its pacing,
as in chess and other traditional computer board games.
SafeCracker, a puzzle-solving game, was the final choice of Wainess and O’Neil
(2003) because it facilitates problem solving with right and wrong solutions and does not
require special background knowledge or extraordinary visual-spatial ability. In addition, the
pacing of SafeCracker, which is designed mainly for adults, is controlled by the player. Another
significant reason is that SafeCracker is not as popular as many other potential games.
However, according to Wainess and O’Neil (2003), although it is the most suitable choice of game
for this study, SafeCracker has three main drawbacks: (1) it may not be appropriate for
testing transfer and retention outside the game per se; (2) players’ actions within the program
cannot be tracked; and (3) it is impossible to modify the game scenarios. The lack of source code
or an editor for SafeCracker is a major reason for these drawbacks.
A player in SafeCracker is a candidate for a position as head of security
development at a world-famous security-systems firm and therefore needs to accomplish a
task given by the boss. The task is to open the safes in a mansion within 12 hours, without any
help from others. There are 35 safes scattered among about 60 rooms of the mansion. To open all
of the safes, the player not only needs to perform mathematical calculations, logical reasoning, and
trial-and-error guessing, but also has to have a good sense of direction and memory. For
example, to open the safe in the Kitchen (room 21), a player needs to solve a math/science
problem of temperature conversion, and to crack the safe in the Technical Design room
(room 27), a player needs to solve an electricity/science problem involving circuits and current.
However, before solving the problem in the Kitchen (room 21), the player needs to go to the
Chief Engineer’s room (room 6) to find the temperature conversion diagram; before
solving the problem in Technical Design (room 27), the player needs to go to the
Constructor’s Office (room 5) to find the electric circuit diagram. The player is not offered
any tools in advance; by cracking safes one after another, he/she obtains the tools and
combinations needed to crack some of the subsequent safes.
SafeCracker’s specifications, following Wainess and O’Neil’s game specification framework,
are shown in Table 5.
Table 5
Game Evaluation Specifications (SafeCracker)

Purpose/domain: Puzzle solving, with a focus on logical inference and trial-and-error
Type of game platform: PC/CD-ROM, Mac
Analogous games: Pandora’s Box; Jewels of the Oracle
Commercialization intent: Primary
Contractor: Dreamcatcher
Genre (a): Puzzle
Training use: Recreational use
Length of game: Unlimited (except the very beginning part of the game)
Terminal learning objectives: TBD
Players/Learners: Candidate for the position of security leader of a major company
Type of learning (b): Problem solving
Domain knowledge: Math, history, physics, information searching, location/direction, science
Type of play:
Time to learn (c): 5 minutes (game’s interface, game rules)
Availability of tutorial or other types of training supported: No
Manual: No
What is user perspective? First person
Is it fun? (d): Primary
Availability of cheats/hints: 8 internet sites (e)
Time frame: Modern
Plan of Instruction: No
Feedback in game: Implicit
After Action Review: No
Nature of practice: One scenario per game play
Single user vs. multiple user: Single user
a Action, role playing, adventure, strategy games, goal games, team sports, individual sports
(Laird & VanLent, 2001).
b Domain knowledge, problem solving, collaboration or teamwork, self-regulation,
communication (Baker & Mayer, 1999).
c Basic game play, i.e., an educated user, not winning strategies.
d Challenge, fantasy, novelty, complexity.
e http://www.cheatguide.com/cheats/pc/s/safecracker.shtml.
http://www.gamexperts.com/index.php?cheat_id=2178
http://home.planet.nl/~laan0739/adventure/games/safe.html.
http://faqs.ign.com/articles/424/424105p1.html
http://fourfatchicks.com/Reviews/Safecracker/Safecracker.shtml
http://www.thecomputershow.com/computershow/walkthroughs/safecrackerwalk.htm
http://www.balmoralsoftware.com/safecrak/safecrak.htm
http://www.uhs-hints.com/uhsweb/safecrkr.php
http://www.justadventure.com/thejave/html/Games/GamesS/Safecracker/JAVE_SafecrackerExtras.shtml
Knowledge Map
A knowledge map is a structural representation that consists of nodes and links. Each
node represents a concept in the domain of knowledge. Each link, which connects two
nodes, is used to represent the relationship between them. As Schau and Mattern (1997) point
out, learners should not only be aware of the concepts but also of the connections among
them. A set of two nodes and their link is called a proposition, which is the basic and the
smallest unit in a knowledge map. Previous studies have indicated that knowledge maps are a
reliable and efficient measure of content understanding (Herl et al., 1999; Ruiz-Primo, Schultz,
& Shavelson, 1997).
Ruiz-Primo et al. (1997) suggested that, as an assessment tool, a knowledge map is a
combination of three components: (a) a task that allows a student to demonstrate his or her
content understanding in a specific domain, (b) a format for the student’s response, and (c) a
scoring system by which the student’s knowledge map can be accurately evaluated.
In this research, participants will be asked to create a knowledge map in a computer-based
environment so that their content understanding can be evaluated before and after playing
SafeCracker and receiving feedback on domain-specific strategies.
Table 6 lists the concept map specifications that will be used in this study (modified
and adapted from Chuang, in preparation).
Table 6
Concept Map Specifications
General Domain Specification / This Software

Scenario: Create a knowledge map on the content understanding of science individually, and by playing SafeCracker, a puzzle-solving game.
Participants: College students; each works on his/her own, doing the knowledge mapping and playing the game.
Knowledge map terms (nodes): Predefined – 12-15 important ideas identified by content experts.
Knowledge map terms (links): Predefined – 3-5 important relationships identified by content experts.
SafeCracker, a puzzle-solving game: Contains over 50 rooms with about 30 puzzles and information about science, mathematics, and other topic areas.
Training: All students will go through the same training, which includes the following elements:
• how to construct the map
• how to play the selected puzzle-solving game
Type of learning: Problem solving
Problem-solving measures:
• Knowledge map: content understanding and structure, including (a) semantic content score, (b) the number of concepts, and (c) the number of links
• Problem-solving strategy questions: includes questions of problem-solving retention and transfer
• Self-regulation questionnaire: planning, self-checking, self-efficacy, and effort
Feedback: Implicit feedback in the game
Feedback
Feedback provides information following an action or a response and allows a learner
to evaluate the adequacy of the action/response (Brunning, Schraw, & Ronning, 1999;
Kulhavy & Wager, 1993). In addition, feedback has significant influence on learning
efficiency, motivation and self-regulation (Bandura, 2001; Nabors, 1999).
Based on complexity, feedback can be categorized into three types: knowledge of
response feedback, knowledge of correct response feedback, and elaborated feedback
(Clariana, Ross, & Morrison, 1991; Dempsey, Driscoll, & Swindell, 1993). While knowledge
of response feedback only tells the learner whether his/her performance was correct,
knowledge of correct response feedback shows the learner the correct answer (e.g., Bangert-Drowns,
Kulik, Kulik, & Morgan, 1991; Clark & Dwyer, 1998; Pridemore & Klein, 1995). In
addition, feedback can be delivered immediately after the learner’s action or delayed for a while
(Clariana et al., 1991; Hannafin & Reiber, 1989; Kulhavy & Stock, 1989; Kulik & Kulik,
1988).
Feedback can be presented visually in the form of graphics or pictures, verbally in
the form of text or words, or covertly within a program, which is implicit feedback. Also,
feedback can be provided directly by the instructor/trainer, by other students, or simply
implied in a program, where participants have to infer the information conveyed and
figure out the solution. Additionally, feedback can be outcome feedback or cognitive
feedback (Brunning, Schraw, & Ronning, 1999); the former provides specific information
about performance, while the latter emphasizes the relationship between performance and
the task. Furthermore, Sales (1993) introduced adapted feedback, personalized feedback in a
computer-based or computer-assisted program, which was found more effective than non-personalized
feedback for learning higher-level cognitive items with educational technology
(Albertson, 1986).
For the purpose of this dissertation, the characteristics of the feedback will be as
follows: 1) the feedback is implicit, since the feedback on game-playing strategies is covert in
the game rather than given by the researchers; when a player fails to crack a safe, he/she may
infer that a previous step was inappropriate and try another solution (in SafeCracker, players
cannot solve a subsequent puzzle unless they proceed in the predefined sequence); 2) the
feedback provided in the game is delayed, because a player learns of an error only upon finding
that a subsequent problem cannot be solved; and 3) the feedback is conveyed by the game itself,
since a player finds out whether the previous steps were right or wrong only when trying to
crack a subsequent safe.
Measures
Content Understanding Measure
Content understanding measures will be computed by comparing the semantic content
score of a participant’s knowledge map to the semantic scores of a set of two experts’ maps.
The experts will be Wainess and Chen; a sample knowledge map for one room, developed by the
author (Chen), is shown in Figure 4. The following description shows how these outcomes will
be scored. First, the semantic score
is calculated based on the semantic propositions, two concepts connected by one link, in the
experts’ knowledge maps. Every proposition in a participant’s knowledge map is compared
against each proposition in the two experts’ maps. One match is scored as one point, and the
average score across the two experts is the semantic score of that proposition. For example, as
seen in Table 7, if a participant makes a proposition such as “room contains key,” this
proposition is compared with the two experts’ propositions. A score of one means the
proposition matches a proposition in an expert’s map, and a score of zero means it does not
match any of that expert’s propositions. Table 7 shows that “room contains key” received a
score of one from both experts, so its average score is 1, whereas “crack results from key”
received a score of one from only the first expert, so its average score is 0.5. The sum of the
average scores across all propositions is the semantic score of a participant’s knowledge map;
in the example in Table 7, the total score is 2.
Figure 4
Sample Knowledge Map
[Figure 4 shows a sample knowledge map with the nodes “room,” “key,” “crack,” “map,” and “clue,” connected by the links “contains,” “results in,” “results from,” and “causes.”]
Table 7
An Example of Scoring Map
Concept 1    Links           Concept 2    Expert 1    Expert 2    Average
Room         contains        key          1           1           1
Crack        results from    key          1           0           0.5
Clue         causes          crack        0           1           0.5
Total                                                             2.00
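To make the scoring rule explicit, the sketch below computes a semantic score following the procedure just described: each proposition in the participant's map is checked against each expert map, scored one point per expert whose map contains it, averaged across the two experts, and the per-proposition averages are summed. This is a minimal sketch for illustration only, not the CRESST scoring software; the two small expert maps are invented so that the example reproduces the scores shown in Table 7.

```python
# Minimal sketch of the semantic scoring rule illustrated in Table 7 (not the
# actual CRESST scoring software). Maps are sets of (concept, link, concept)
# propositions; the two expert maps below are invented so that the example
# reproduces the scores shown in Table 7.
def semantic_score(student_map, expert_maps):
    """Sum, over the student's propositions, of the mean match (0/1) across experts."""
    total = 0.0
    for proposition in student_map:
        matches = [1 if proposition in expert_map else 0 for expert_map in expert_maps]
        total += sum(matches) / len(expert_maps)
    return total

expert_1 = {("room", "contains", "key"), ("crack", "results from", "key")}
expert_2 = {("room", "contains", "key"), ("clue", "causes", "crack")}
participant = {
    ("room", "contains", "key"),       # matched by both experts  -> 1.0
    ("crack", "results from", "key"),  # matched by expert 1 only -> 0.5
    ("clue", "causes", "crack"),       # matched by expert 2 only -> 0.5
}
print(semantic_score(participant, [expert_1, expert_2]))  # 2.0
```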
Domain-Specific Problem-Solving Strategies Measure
In this study, the researcher will modify Mayer and Moreno’s (1998) problem-solving
question list to measure domain specific problem-solving strategies. In Mayer and Moreno’s
(1998) research on the split-attention effect in multimedia learning and the dual processing
systems in working memory, participants’ problem-solving strategies were assessed with a
set of retention and transfer questions.
Mayer and Moreno judged a participant’s retention score by counting the number
of predefined major idea units correctly stated by the participant, regardless of wording.
Examples of the answer units for retention were “air rises,” “water condenses,” “water and
crystals fall,” and “wind is dragged downward.”
In addition, Mayer and Moreno (1998) scored the transfer questions by counting the
number of acceptable answers that the participant produced across all of the transfer
problems. For example, the acceptable answers for the first transfer question about
decreasing lightning intensity included “removing positive ions from the ground”, and one of
the acceptable answers for question two, about the reason for the presence of clouds without
lightning, is “the tops of the clouds might not be high enough to freeze.”
The problem-solving strategy questions designed for this dissertation research will
be relevant to the selected safes/problems in SafeCracker, the selected puzzle-solving game.
Specifically, the questions will address the puzzle-solving and safe-cracking strategies that
participants may acquire while trying to solve the problems in the rooms pre-selected by the
researcher from the 60 rooms of the game. The following retention and transfer problem-solving
strategy questions will be used in this dissertation research:
Retention questions:
– Write an explanation of how you solved the puzzle in the first room.
– Write an explanation of how you solved the puzzle in the second room.
Transfer questions:
– List some ways to improve the play in room 1.
– List some ways to improve the play in room 2.
– List some ways to improve the fun or challenge of playing the game in room 1.
– List some ways to improve the fun or challenge of playing the game in room 2.
Participants’ retention scores will be computed by counting the number of predefined major
idea units correctly stated by the participant, regardless of wording. Examples of the answer
units for retention are “follow map,” “find clues,” “find key,” “differentiate rooms,” and
“tools are cumulative.”
In addition, participants’ transfer questions will be scored by counting the number of
acceptable answers that the participant produces across all of the transfer problems. For
example, the acceptable answers for the first transfer question, about ways to improve the
play in room 1, include “jot down notes,” and one of the acceptable answers for question
three, about ways to improve the fun or challenge of playing the game in room 1, is “increase
the clues needed to crack a safe.”
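As a concrete illustration of the counting rule, the sketch below scores a free-text answer by checking which of the predefined idea units it mentions, regardless of any other wording. It is only a simplified sketch: in the study the scoring will be done by raters judging meaning, and the substring check used here is an assumption made purely for illustration.

```python
# Simplified sketch of the retention scoring rule: count how many predefined
# major idea units appear in a participant's written explanation. In the study,
# raters will judge meaning regardless of wording; the substring check below is
# only an illustration of the counting rule, not the actual rating procedure.
RETENTION_IDEA_UNITS = [
    "follow map",
    "find clues",
    "find key",
    "differentiate rooms",
    "tools are cumulative",
]

def retention_score(answer, idea_units=RETENTION_IDEA_UNITS):
    text = answer.lower()
    return sum(1 for unit in idea_units if unit in text)

answer = "I tried to follow map directions and then find clues hidden in the drawers."
print(retention_score(answer))  # 2 ("follow map" and "find clues")
```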
Self-Regulation Questionnaire
The trait self-regulation questionnaire designed by O’Neil and Herl (1998) will be
used in this study to assess participants’ degree of self-regulation, one of the components
of problem-solving ability. Sufficient reliability of the self-regulation questionnaire,
ranging from .89 to .94, was reported in a previous study (O’Neil & Herl, 1998). The 32 items
comprise eight items for each of four factors: planning, self-checking, self-efficacy, and
effort. For example, item 1, “I determine how to solve a task before I begin,” is designed to
assess participants’ planning, and item 2, “I check how well I am doing when I solve a task,”
is designed to assess participants’ self-checking. The response for each item ranges from
almost never and sometimes to often and almost always.
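To show how the 32 responses could be summarized, the sketch below averages the items belonging to each of the four factors. The item-to-factor assignment it uses is an assumption inferred from the example items in Appendix A (the factors appear to cycle every four items) rather than O'Neil and Herl's published scoring key, and the function name is the author's illustrative choice.

```python
# Sketch of summarizing the 32-item trait self-regulation questionnaire into
# four subscale means (planning, self-checking, effort, self-efficacy) on the
# 1-4 response scale. NOTE: the item-to-factor assignment below (factors cycling
# every four items) is inferred from the example items in Appendix A and is an
# assumption, not O'Neil and Herl's (1998) published scoring key.
FACTORS = ["planning", "self-checking", "effort", "self-efficacy"]

def subscale_means(responses):
    """responses: a list of 32 integers (1-4), in item order 1-32."""
    assert len(responses) == 32
    sums = {factor: 0 for factor in FACTORS}
    for index, value in enumerate(responses):
        sums[FACTORS[index % 4]] += value
    return {factor: total / 8 for factor, total in sums.items()}

# Example: a participant who answers "often" (3) on every item.
print(subscale_means([3] * 32))
# {'planning': 3.0, 'self-checking': 3.0, 'effort': 3.0, 'self-efficacy': 3.0}
```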
Procedure
Time Chart of the Pilot Study
Activity                                         Time
Introduction                                     2-3 minutes
Self-regulation questionnaire                    6-8 minutes
Introduction on knowledge mapping                8 minutes
Game introduction                                5 minutes
Knowledge map (pre)                              5 minutes
Problem-solving strategy questions (pre)         2 minutes
Game playing (rooms 1 & 2)                       20 minutes
Knowledge map (post)                             5 minutes
Problem-solving strategy questions (post)        2 minutes
Debriefing                                       2 minutes
Total                                            57-60 minutes
Data Analysis
Based on the outcomes of the pilot study, some modifications will be made for
the main study. For example, the time allotted may be adjusted if the participants in the pilot
study feel they did not have enough time to construct the map. The pilot will also examine whether
the new programming of the knowledge mapper works successfully and whether the instructional
lesson is appropriate. Further, the researcher will identify problems with the computer system
that may occur during the main study, such as system crashes, and make the necessary
adjustments. In addition, the researcher will find out whether the problem-solving task is
interesting to the participants and how they feel about it.
Main Study
Method of the Main Study
Participants
There will be 30 young adults, aged 20 to 35, participating in the main study. The
main study will be conducted in a lab at USC after receiving approval from the USC Review of
Human Subjects. All participants will be selected to have no prior experience playing
SafeCracker or other puzzle-solving games.
Game
The same puzzle-solving game, SafeCracker, will be used in the main study;
however, some adjustments may be made according to the results of the pilot study. For
example, where (in which room) to start the game, the number of rooms, the time allotted for
participants to play, or the game instructions given before play may be adjusted.
Measures
Knowledge Map
A knowledge map is a structural representation that consists of nodes and links. Each
node represents a concept in the domain of knowledge. Previous literature has shown its
validity and reliability for assessing content understanding (Herl et al., 1999; Mayer, 2002;
Ruiz-Primo, Schultz, & Shavelson, 1997). The same knowledge maps used in the pilot
study will be used in the main study; however, the time allowed for participants to draw the
maps may be adjusted. In the main study, subjects will be required to create a knowledge
map in a computer mapping test before and after game playing.
The content understanding measures will be computed by comparing the semantic content
score of a participant’s knowledge map to the semantic scores of a set of two experts (Schacter et
al., 1999), in the same way as in the pilot study.
Domain-Specific Problem-Solving Strategies Measures
The same retention and transfer problem-solving strategy questions adapted by the
researcher from Mayer and Moreno’s (1998) problem-solving question list and
used in the pilot study will be used in the main study. The problem-solving questions are
related to the puzzle-solving strategies for the two selected rooms in SafeCracker. The same
scoring system of counting acceptable answers will also be used in the main study.
Self-Regulation Questionnaire
In the main study, subjects’ self-regulation, one of the components of problem-solving
ability, will be assessed using the same 32-item self-regulation questionnaire (O’Neil & Herl,
1998) that is used in the pilot study.
Procedure
The same procedure used in the pilot study (modified) will be used in the main study.
Computer-Based Knowledge Map Training
Participants will be trained in how to use the computer-based knowledge mapper, including
adding/erasing concepts and creating/deleting links between concepts.
Game Playing
The main study of this research will be done on SafeCracker, a computer puzzle-solving
game. The participants will be asked to play two specific rooms in SafeCracker on two occasions:
once after the first drawing of the knowledge map, and once after the second drawing of the
knowledge map and the provision of task-specific feedback. Each game-playing session
will last thirty minutes.
Feedback on Game Play Strategies
The same implicit feedback used in the pilot study will be used in the main study.
Data Analysis
The descriptive statistics will be means, standard deviations, and correlation
coefficients. A t-test will be used to examine the differences between outcomes before
and after game playing.
By using t-tests to compare the knowledge mapping and problem-solving checklist scores
obtained before and after playing SafeCracker, the researcher will determine whether playing
SafeCracker enhances participants’ problem-solving ability in terms of content understanding
and problem-solving strategies.
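As an illustration of the planned analysis, the sketch below computes descriptive statistics and runs a paired (dependent-samples) t-test on hypothetical pre- and post-game knowledge map scores using numpy and scipy; the score values are invented placeholders for the example and are not results of this study.

```python
# Sketch of the planned pre/post comparison: descriptive statistics and a paired
# t-test on knowledge map semantic scores before and after game playing. The
# score values below are invented placeholders, not data from the study.
import numpy as np
from scipy import stats

pre_scores = np.array([2.0, 1.5, 3.0, 2.5, 2.0, 1.0, 2.5, 3.5])
post_scores = np.array([3.0, 2.0, 3.5, 3.0, 2.5, 2.0, 3.0, 4.0])

# Descriptive statistics: means, standard deviations, and the pre/post correlation.
print("means:", pre_scores.mean(), post_scores.mean())
print("SDs:", pre_scores.std(ddof=1), post_scores.std(ddof=1))
print("pre/post correlation:", np.corrcoef(pre_scores, post_scores)[0, 1])

# Paired t-test on the pre/post difference.
t_stat, p_value = stats.ttest_rel(post_scores, pre_scores)
print("t = %.2f, p = %.3f" % (t_stat, p_value))
```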
REFERENCES
Adams, P. C. (1998). Teaching and learning with SimCity 2000 [Electronic Version].
Journal of Geography, 97(2), 47-55.
Albertson, L. M. (1986). Personalized feedback and cognitive achievement in computer-assisted instruction. Journal of Instructional Psychology, 13(2), 55-57.
Alessi, S. M. (2000a). Building versus using simulations. In J. M. Spector & T. M. Anderson
(Eds.), Integrated and holistic perspectives on learning, instruction technology:
Improving understanding in complex domains (pp. 175-196). Dordrecht, The
Netherlands: Kluwer.
Alessi, S. M. (2000b). Simulation design for training and assessment. In H. F. O’Neil, JR. &
D. H. Andrews(Eds.), Aircrew training and assessment (pp. 197-222). Mahwah, NJ:
Lawrence Erlbaum Associates.
Alessi, S. M. (2000c). Simulation design for training and assessment. In H. F. O’Neil, Jr. &
D. H. Andrews (Eds.), Aircrew training and assessment (pp. 197-222)., Mahwah, NJ:
Lawrence Erlbaum Associates.
Alexander, P. A. (1992). Domain knowledge: Evolving themes and emerging concerns.
Educational Psychologist, 27(1), 33-51.
Amory, A. (2001). Building an educational adventure game: Theory, design, and lessons.
Journal of Interactive Learning Research, 12(2/3), 249-263.
Amory, A., Naicker, K., Vincent, J., & Adams, C. (1999). The use of computer games as an
educational tool: Identification of appropriate game types and game elements. British
Journal of Educational Technology, 30(4), 311-321.
Anderson, C. A., & Bushman, B. J. (2001, September). Effects of violent video games on
aggressive behavior, aggressive cognition, aggressive affect, physiological arousal,
and prosocial behavior: A meta-analytic review of the scientific literature.
Psychological Science, 12(5), 353-358.
Arthur, W. Jr, Strong, M. H., Jordan, J. A., Williamson, J. E., Shebilske, W. L., & Regian, J.
W. (1995). Visual attention: individual differences in training and predicting complex
task performance, Acta Psychologica, 88, 3-23.
Arthur, W. Jr., Tubre, T., Paul, D. S., & Edens, P S. (2003). Teaching effectiveness: The
relationship between reaction and learning evaluation criteria. Educational
Psychology, 23(3), 275-285.
Baird, W. E., & Silvern, S. B. (1999) Electronic games: Children controlling the cognitive
environment. Early Child Development & Care, 61, 43-49.
Baker, E. L., & Alkin, M. C. (1973). Formative evaluation of instructional development. AV
Communication Review, 21(4), 389-418. (ERIC Document Reproduction Service No.
EJ091462)
Baker, E. L., & Herman, J. L. (1985). Educational evaluation: Emergent needs for research.
Evaluation Comment, 7(2), 1-12.
Baker, E. L. & Mayer, R. E. (1999). Computer-based assessment of problem solving.
Computers in Human Behavior, 15, 269-282.
Baker, E. L., & O’Neil, H. F. Jr. (2002). Measuring problem solving in computer
environments: Current and future states. Computers in Human Behavior, 18(6), 609-622.
Bandura, A. (2001). Impact of Guided Exploration and Enactive Exploration on Self-Regulatory Mechanisms and Information Acquisition Through Electronic Search.
Journal of Applied Psychology, 86 (6), 1129-1141.
Bangert-Drowns, R. L., & Pyke, C. (2001). A taxonomy of student engagement with
educational software: An exploration of literate thinking with electronic text. Journal
of Educational Computing Research, 24(3), 213-234.
Barnett, M. A., Vitaglione, G. D., Harper, K. K. G., Quackenbush, S. W., Steadman, L. A., &
Valdez, B. S. (1997). Late adolescents’ experiences with and attitudes toward
videogames. Journal of Applied Social Psychology, 27(15), 1316-1334.
Barootchi, N, & Keshavarz, M. H. (2002) Assessment of achievement through portfolio and
teacher-made tests. Educational Research, 44(3), 279-288.
Betz, J. A. (1995-96). Computer games: Increase learning in an interactive multidisciplinary
environment. Journal of Educational Technology Systems, 24, 195-205.
Blanchard, P. N., Thacker, J. W., & Way, S A. (2000). Training evaluation: Perspectives and
evidence from Canada. International Journal of Training and Development, 4(4),
295-304.
Bong, M., & Clark, R. E. (1999). Comparison between self-concept and self-efficacy in
academic motivation research. Educational Psychologist, 34(3), 139-153.
British Educational Communications and Technology Agency. Computer Games in
Education Project. Retrieved from http://www.becta.org.uk
Brunning, R. H., Schraw, G. J., & Ronning, R R. (1999). Cognitive psychology and
instruction (3rd ed.). Upper Saddle River, NJ: Merrill.
Chambers, C., Sherlock, T. D., & Kucik III, P. (2002). The Army Game Project. Army,
52(6), 59-62.
Chappell, K. K., & Taylor, C. S. (1997). Evidence for the reliability and factorial validity of
the computer game attitude scale. Journal of Educational Computing Research,
17(1), 67-77.
Cheung, S. (2002). Evaluating the psychometric properties of the Chinese version of the
Interactional Problem-Solving Inventory. Research on Social Work Practice, 12(4),
490-501.
Christopher, E. M. (1999). Simulations and games as subversive activities. Simulation &
Gaming, 30(4), 441-455.
Chuang, S., (in preparation). The role of search strategies and feedback on a computer-based
collaborative problem-solving task. Unpublished doctoral dissertation. University of
Southern California.
Chung, G. K. W. K., O’Neil H. F., Jr., & Herl, H. E. (1999). The use of computer-based
collaborative knowledge mapping to measure team processes and team outcomes.
Computers in Human Behavior, 15, 463-493.
Clariana, R. B., Ross, S. M., & Morrison, G. R. (1991). The effects of different feedback
strategies using computer-administered multiple-choice questions as instruction.
Educational Technology Research and Development, 39(2), 5-17.
Clark, R. E. (1998). Motivating performance: Part 1—diagnosing and solving motivation
problems. Performance Improvement, 37(8), 39-47.
Clark, K., & Dwyer, F. M. (1998). Effects of different types of computer-assisted feedback
strategies on achievement and response confidence. International Journal of
Instructional Media, 25(1), 55-63.
Crisafulli, L., & Antonietti, A. (1993). Videogames and transfer: An experiment on
analogical problem-solving. Ricerche di Psicologia, 17, 51-63.
Day, E. A., Arthur, W, & Gettman, D. (2001). Knowledge structures and the acquisition of a
complex skill. Journal of Applied Psychology, 86(5), 1022-1033.
Dawes, L., & Dumbleton, T. (2001). Computer games in education. BECTA. Retrieved from
http://www.becta.org.uk/technology/software/curriculum/computergames/docs/report.pdf
Dempsey, J. V., Driscoll M. P., and Swindell, L. K. (1993). Text-based feedback. In J. V.
Dempsey & G.C. Sales (Eds.), Interactive instruction and feedback (pp.21-54).
Englewood, NJ: Educational Technology publications.
Donchin, E. (1989). The learning strategies project. Acta Psychologica, 71, 1-15
Driskell, J. E., & Dwyer, D. J. (1984). Microcomputer videogame based training.
Educational Technology, 11-16.
Dugard, P. & Todman, J. (1995). Analysis of pre-test-post-test control group designs in
educational research. Educational Psychology, 15(2), 181-198.
Dugdale, S. (1998). Mathematical problem solving and computers: a study of learner-initiated application of technology in a general problem-solving context. Journal of
Research on Computing in Education, 30(3), 239-253.
Enman, M., & Lupart, J. (2000). Talented female students’ resistance to science: an
exploratory study of post-secondary achievement motivation, persistence, and
epistemological characteristics. High Ability Studies, 11(2), 161-178.
Faria, A. J. (1998). Business simulation games: current usage levels-an update. Simulation &
Gaming, 29, 295-308.
Fery, Y. A., & Ponserre S. (2001). Enhancing the control of force in putting by video game
training. Ergonomics, 44, 1025-1037.
Forsetlund, L., Talseth, K. O., Bradley, P., Nordheim, L, & Bjorndal, A. (2003). Many a slip
between cut and lip: Process evaluation of a program to promote and support
evidence-based public health practice. Evaluation Review, 27(2), 179-209.
Galimberti, C., Ignazi, S., Vercesi, P., & Riva, G. (2001). Communication and cooperation in
networked environments: An experimental analysis. Cyber Psychology & Behavior,
4(1), 131-146.
Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Educational research. An introduction (7th
ed.). New York: Allyn & Bacon.
Gopher, D., Weil, M., & Bareket, T. (1994). Transfer of skill from a computer game trainer
to flight. Human Factors, 36, 387-405.
Gredler, M. E. (1996). Educational games and simulations: A technology in search of a
(research) paradigm. In D. Jonassen (Ed.). Handbook of Research for Educational
Communications and Technology (pp. 521-540). New York: Macmillan.
Greenfield, P.M., DeWinstanley, P., Kilpatrick H., & Kaye D. (1994). Action video games
and informal education: Effects on strategies for dividing visual attention. Journal of
Applied Developmental Psychology, 15, 105-123.
Hannafin, M. J., & Reiber, L. P. (1989). Psychological foundations of instructional
technologies: Part I. Educational Technology Research and Development, 37(2), 91-101.
Harrell, K. D. (2001). Level III training evaluation: Considerations for today’s organizations.
Performance Improvement, 40 (5), 24-27.
Henderson, L., Klemes, J., & Eshet, Y. (2000). Just playing a game? Educational simulation
software and cognitive outcomes. Journal of Educational Computing Research,
22(1), 105-129.
Herl, H. E., Baker, E. L., & Niemi, D. (1996). Construct validation of an approach to
modeling cognitive structure of U.S. history knowledge. Journal of Educational
Psychology, 89(4), 206-218.
Herl, H. E., O’Neil, H. F., Jr., Chung, G., & Schacter, J. (1999) Reliability and validity of a
computer-based knowledge mapping system to measure content understanding.
Computer in Human Behavior, 15, 315-333.
Hong, E., & O’Neil, H. F. Jr. (2001). Construct validation of a trait self-regulation model.
International Journal of Psychology, 36(3), 186-194.
Hsieh, I. (2001). Types of feedback in a computer-based collaborative problem-solving
Group Task. Unpublished doctoral dissertation. University of Southern California.
Isaac, S, & Michael, W. B. (1997). Handbook in research and evaluation for education and
the behavioral sciences (3rd ed.). San Diego, CA: EdITS.
King, K. W., & Morrison M. (1998). A media buying simulation game using the Internet.
Journalism & Mass Communication Education,53(3), 28-36.
Kirkpatrick, D. L. (1994). Evaluating training program. The four levels. San Francisco, CA:
Berrett-Koehler Publishers.
Kirkpatrick, D. L. (1996, January). Great ideas revisited. Training and Development Journal,
54-59.
Kulhavy, R. W., & Wager, W. (1993). Feedback in programmed instruction: Historical
context and implication for practice. In J. V. Dempsey & G.C. Sales (Eds.),
Interactive instruction and feedback (pp.3-20). Englewood, NJ: Educational
Technology publications.
Kulhavy, R. W., & Stock, W. A. (1989). Feedback in written instruction: The place of
response certitude. Educational Psychology of Review, 1, 279-308.
Kulik, J. A., & Kulik, C. C. (1988). Timing of feedback and verbal learning. Review of
Educational Research, 58(1), 79-97.
Lane, D. C. (1995). On a resurgence of management simulations and games. Journal of the
Operational Research Society, 46, 604-625.
Malone, T. W. (1981). Toward a theory of intrinsically motivating instruction. Cognitive
Science, 4, 333-369.
Manning, B. H., Glasner, S. E., & Smith, E. R. (1996). The self-regulated learning aspect of
metacognition: A component of gifted education. Roeper Review, 18(3), 217-223.
Marsh, H. W., & Roche, L. A. (1997). Making students’ evaluations of teaching
effectiveness effective: The critical issues of validity, bias, and utility. American
Psychologist, 52(11), 1187-1197.
Martin, A. (2000). The design and evaluation of a simulation/game for teaching information
systems development. Simulation & Gaming, 31(4), 445-463.
Mayer, R. E. (1998). Cognitive, metacognitive, and motivational aspects of problem solving.
Instructional Science, 26, 49-63.
Mayer, R. E. (2001). Multimedia learning. New York: Cambridge University Press.
Mayer, R. E. (2002). A taxonomy for computer-based assessment of problem-solving.
Computer in Human Behavior, 18, 623-632.
Mayer, R. E., & Moreno, R. (1998). A split-attention effect in multimedia learning: evidence
for dual processing systems in working memory. Journal of Educational Psychology,
90(2), 312-320.
Mayer, R. E., Moutone, P., & Prothero, W. (2002). Pictorial aids for learning by doing in a
multimedia geology simulation game. Journal of Educational Psychology, 94(1),
171-185.
Mayer, R. E., & Sims, V. K. (1994). For whom is a picture worth a thousand words?
Extensions of a dual-coding theory of multimedia learning. Journal of Educational
Psychology, 86, 389-401.
Mayer, R. E., & Wittrock, M. C. (1996). Problem-solving transfer. In D. C. Berliner, &
Calfee, R.C. (Ed.), Handbook of educational psychology (pp. 47-62). New York, NJ:
Macmillian Library Reference USA, Simon & Schuster Macmillan.
Mehrotra, C. M. (2001). Evaluation of a training program to increase faculty productivity in
aging research. Gerontology & Geriatrics Education, 22(3), 79-91.
Moreno, R., & Mayer, R. E. (2000). Engaging students in active learning: The case for
personalized multimedia messages. Journal of Educational Psychology, 92(4), 724-733.
Morris, E. (2001). The design and evaluation of Link: A computer-based teaching system for
correlation. British Journal of Educational Technology, 32(1), 39-52.
Mulqueen, W. E. (2001). Technology in the classroom: lessons learned through professional
development. Education, 122(2), 248-256.
Nabors, Martha L. (1999). New functions for “old Macs”: providing immediate feedback for
student teachers through technology. International Journal of Instructional Media,
26(1) 105-107.
Naugle, K. A., Naugle, L. B., & Naugle, R. J. (2000). Kirkpatrick’s evaluation model as a
means of evaluating teacher performance. Education, 121(1), 135-144.
Novak, J. D. (1990). Knowledge maps and Vee diagrams: Two metacognitive tools to
facilitate meaningful learning. Instructional Science, 19(1), 29-52.
Okagaki, L. & Frensch, P.A. (1994). Effects of video game playing on measures of spatial
performance: Gender effects in late adolescence. Journal of Applied Developmental
Psychology, 15, 33-58.
O’Neil, H. F., Jr. (Ed.). (1978). Learning strategies. New York: Academic Press.
O’Neil, H. F., Jr. (1999). Perspectives on computer-based performance assessment of
problem-solving. Computers in Human Behavior, 15, 225-268.
O’Neil, H. F., Jr. (2003). What works in distance learning. Los Angeles: University of
Southern California; UCLA/National Center for Research on Evaluation, Standards,
and Student Testing (CRESST).
O’Neil, H. F., Jr., & Abedi, J. (1996). Reliability and validity of a state metacognitive
inventory: Potential for alternative assessment. Journal of Educational Research, 89,
234-245.
O’Neil, H. F., Jr., & Andrews, D. (Eds). (2000). Aircrew training and assessment. Mahwah,
NJ: Lawrence Erlbaum Associates.
O’Neil, H. F., Jr., Baker, E. L., & Fisher, J. Y.-C. (2002). A formative evaluation of ICT
games. Los Angeles: University of Southern California; UCLA/National Center for
Research on Evaluation, Standards, and Student Testing (CRESST).
O’Neil, H. F., Jr., & Fisher, J. Y.-C. (2002). A technology to support leader development:
Computer games. In Day, V. D., & Zaccaro, S. J. (Eds.), Leadership development for
transforming organization. Mahwah, NJ: Lawrence Erlbaum Associates.
O’Neil, H. F., Jr., & Herl, H. E. (1998). Reliability and validity of a trait measure of self-regulation. Los Angeles: University of California, Center for Research on Evaluation,
Standards, and Student Testing (CRESST).
O’Neil, H. F., Jr., Mayer, R. E., Herl, H. E., Niemi, C., Olin, K, & Thurman, R A. (2000).
Instructional strategies for virtual aviation training environments. In H. F. O’Neil, Jr.,
& D. H. Andrews (Eds.), Aircrew training and assessment, (pp. 105-130). Mahwah,
NJ: Lawrence Erlbaum Associates.
Parchman, S. W., Ellis, J. A., Christinaz, D., & Vogel, M. (2000). An evaluation of three
computer-based instructional strategies in basic electricity and electronics training.
Military Psychology, 12(1), 73-87.
Peat, M., & Franklin, S. (2002). Supporting student learning: The use of computer-based
formative assessment modules. British Journal of Educational Technology, 33(5),
515-523.
Perkins, D. N., & Salomon, G. (1989). Are cognitive skills context bound? Educational
Researcher, 18, 16-25.
Petty, R. E., Priester, J. R., & Wegener, D. T. (1994). Handbook of social cognition.
Hillsdale, NJ: Lawrence Erlbaum Associates.
Pillay, H. K., Brownlee, J., & Wilss, L. (1999). Cognition and recreational computer games:
implications for educational technology. Journal of Research on Computing in
Education, 32, 203-216.
Pintrich, P. R., & DeGroot, E. V. (1990). Motivational and self-regulated learning
components of classroom academic performance. Journal of Educational
Psychology, 82, 33-40.
Pirolli, P, & Recker, M. (1994). Learning strategies and transfer in the domain of
programming. Cognition & Instruction, 12(3), 235-275.
Poedubicky, V. (2001). Using technology to promote healthy decision making. Learning and
Leading with Technology, 28(4), 18-21.
Ponsford, K. R., & Lapadat, J. C. (2001). Academically capable students who are failing in
high school: Perceptions about achievement. Canadian Journal of Counselling, 35(2),
137-156.
Pridemore, D. R., & Klein, J. D. (1995). Control of practice and level of feedback in
computer-based instruction. Contemporary Educational Psychology, 20, 444-450.
Quinn, C. N. (1991). Computers for cognitive research: A HyperCard adventure game.
Behavior Research Methods, Instruments, & Computers, 23(2) 237-246.
Quinn, C. N. (1996). Designing an instructional game: Reflections on “Quest for
independence.” Education and Information Technologies, 1, 251-269.
Quinn, C. N., Alem, L., & Eklund, J. (1997). Retrieved August 30, 2003, from
http://www.testingcentre.com/jeklund/interact.htm
Rabbitt, P., Banerji, N., Szymanski, A. (1989). Space fortress as an IQ test? Predictions of
learning and of practiced performance in a complex interactive video-game. Acta
Psychologica Special Issue: The Learning Strategies Program: An Examination of the
Strategies in Skill Acquisition, 71(1-3), 243-257.
Rhodenizer, L., Bowers, C., & Bergondy, M. (1998). Team practice schedules: What do we
know? Perceptual and Motor Skills, 87, 31-34.
Ricci, K. E., Salas, E., & Cannon-Bowers, J. A. (1996). Do computer-based games facilitate
knowledge acquisition and retention? Military Psychology, 8, 295-307.
Rieber, L. P. (1996). Animation as feedback in computer simulation: Representation matters.
Educational Technology Research and Development, 44(1), 5-22.
Rieber, L.P. (1996). Seriously considering play: Designing interactive learning environments
based on the blending of microworlds, simulations, and games. Educational
Technology, Research and Development, 44, 43-58.
Ritchie, D., & Dodge, B. (1992, March). Integrating technology usage across the curriculum
through educational adventure games. (ED 349 955).
Rosenorn, T. Kofoed, L. B. (1998). Reflection in Learning Processes through
simulation/gaming. Simulation & Gaming, 29(4), 432-440.
Ross, S. M., & Morrison, G. R. (1993). Using feedback to adapt instruction for individuals.
In J. V. Dempsey & G. C. Sales (Eds.), Interactive instruction and feedback (pp. 177-195). Englewood, NJ: Educational Technology Publications.
Ruben, B. D. (1999, December). Simulations, Games, and experience-based learning: The
quest for a new paradigm for teaching and learning. Simulation & Gaming, 30(4),
498-505.
Ruiz-Primo, M. A., Schultz, S. E., and Shavelson, R. J. (1997). Knowledge map-based
assessment in science: Two exploratory studies (CSE Tech. Rep. No. 436). Los
Angeles: University of California, Center for Research on Evaluation, Standards, and
Student Testing (CRESST).
Salas, E. (2001). Team training in the skies: does crew resource management (CRM) training
work? Human Factors, 43(4), 641-674.
Sales, G. C. (1993). Adapted and adaptive feedback in technology-based instruction. In J.
V. Dempsey & G.C. Sales (Eds.), Interactive instruction and feedback (pp.159-176).
Englewood, NJ: Educational Technology publications.
Santos, J. (2002). Developing and implementing an Internet-based financial system
simulation game. Journal of Economic Education, 33(1) 31-40.
Schacter, J., Herl, H. E., Chung, G., Dennis, R. A., O’Neil, H. F., Jr. (1999). Computer-based
performance assessments: A solution to the narrow measurement and reporting of
problem-solving. Computers in Human Behavior, 13, 403-418.
Schank, R. C. (1997). Virtual learning: A revolutionary approach to build a highly skilled
workforce. New York: McGraw-Hill Trade.
Schau, C. & Mattern, N. (1997). Use of map techniques in teaching applied statistics
courses. American statistician, 51, 171-175.
Schau, C., Mattern, N., Zeilik, M., Teague, K., & Weber, R. (2001). Select-and-fill-in
knowledge map scores as a measure of students' connected understanding of science.
Educational & Psychological Measurement, 61(1), 136-158.
Schunk, D. H., & Ertmer, P A. (1999). Self-regulatory processes during computer skill
acquisition: Goal and self-evaluative influences. Journal of Educational Psychology,
91(2).
Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagné, & M.
Scriven (Eds.), Perspectives of curriculum evaluation (American Educational
Research Association Monograph Series on Curriculum Evaluation, No. 1, pp. 39-83). Chicago: Rand McNally.
Simon, H. A. (1973). The structure of ill structured problems. Artificial Intelligence, 4, 181-201.
Sternberg, R. J., & Lubart, T. E. (2003). The role of intelligence in creativity. In M. A. Runco
(Ed.), Critical Creative Processes. Perspectives on Creativity Research (pp. 153-187). Cresskill, NJ: Hampton Press.
Stolk, D., Alesandrian, D., Gros, B., & Paggio, R. (2001). Gaming and multimedia
applications for environmental crisis management training. Computers in Human
Behavior, 17, 627-642.
Thomas, P., & Macredie, R. (1994). Games and the design of human-computer interfaces.
Educational Technology, 31, 134-142.
Thornburg, D. G. & Pea, R. D. (1991). Synthesizing instructional technologies and
educational culture: Exploring cognition and metacognition in the social studies.
Journal of Educational Computing Research, 7(2), 121-164.
Tkacz, S. (1998). Learning map interpretation: Skill acquisition and underlying abilities.
Journal of Environmental Psychology, 18 (3), 237-249.
Urdan, T., & Midgley, C. (2001). Academic self-handicapping: What we know, what more
there is to learn. Education Psychology Review, 13, 115-138.
van Merrienboer, J. J. G., Clark, R. E., & de Croock, M. B. M. (2002). Blueprints for
complex learning: The 4C/ID-Model. Educational Technology Research &
Development, 50(2), 39-64.
White, B. Y., & Frederiksen, J. R. (1998). Inquiry, modeling, and metacognition: Making
science accessible to all students. Cognition and Instruction, 16(1), 3-118.
Washbush, J., & Gosen, J. (2001). An exploration of game-derived learning in total
enterprise simulations. Simulation & Gaming, 32(3), 281-296.
Weller, M. (2000). Implementing a CMC tutor group for an existing distance education
course. Journal of Computer Assisted Learning, 16(3), 178-183.
Wellington, W. J., & Faria, A. J. (1996). Team cohesion, player attitude, and performance
expectations in simulation. Simulation & Gaming, 27(1).
West, D. C., Pomeroy, J. R., Park, J. K., Gerstenberger, E. A., Sandoval, J. (2000). Critical
thinking in graduate medical education. Journal of the American Medical
Association, 284(9), 1105-1110.
Westbrook, J. I., & Braithwaite, J. (2001). The health care game: An evaluation of heuristic,
web-based simulation. Journal of Interactive Learning Research, 12(1), 89-104.
Winne, P. H., & Perry, N. E. (2000). Measuring self-regulated learning. In M. Boekaerts, &
P. R. Pintrich (Eds.), Handbook of Self-regulation (pp. 531-566). San Diego, CA:
Academic Press
Woolfolk, A. E. (2001). Educational Psychology (8th ed.). Needham Heights, MA: Allyn
and Bacon.
Ziegler, A., & Heller K. A. (2000). Approach and avoidance motivation as predictors of
achievement behavior in physics instruction among mildly and highly gifted eighth-grade students. Journal for the Education of the Gifted, 23(4), 343-359.
Zimmerman, B. J. (1994). Dimensions of academic self-regulation: A conceptual framework
for education. In D. H. Schunk, & B. J. Zimmerman (Eds.), Self-regulation of
learning and performance (pp. 3-21). Hillsdale, NJ: Erlbaum.
Zimmerman, B. J. (2000). Self-efficacy. An essential motive to learn. Contemporary
Educational Psychology, 25(1), 82-91.
Appendix A
Self-Regulation Questionnaire
Name (please print): _________________________________________________________________
Directions: A number of statements which people have used to describe themselves are given
below. Read each statement and indicate how you generally think or feel on learning tasks by
marking your answer sheet. There are no right or wrong answers. Do not spend too much time on
any one statement. Remember, give the answer that seems to describe how you generally think
or feel.
Response scale for all items: 1 = Almost Never, 2 = Sometimes, 3 = Often, 4 = Almost Always.

1. I determine how to solve a task before I begin.
2. I check how well I am doing when I solve a task.
3. I work hard to do well even if I don't like a task.
4. I believe I will receive an excellent grade in this course.
5. I carefully plan my course of action.
6. I ask myself questions to stay on track as I do a task.
7. I put forth my best effort on tasks.
8. I'm certain I can understand the most difficult material presented in the readings for this course.
9. I try to understand tasks before I attempt to solve them.
10. I check my work while I am doing it.
11. I work as hard as possible on tasks.
12. I'm confident I can understand the basic concepts taught in this course.
13. I try to understand the goal of a task before I attempt to answer.
14. I almost always know how much of a task I have to complete.
15. I am willing to do extra work on tasks to improve my knowledge.
16. I'm confident I can understand the most complex material presented by the teacher in this course.
17. I figure out my goals and what I need to do to accomplish them.
18. I judge the correctness of my work.
19. I concentrate as hard as I can when doing a task.
20. I'm confident I can do an excellent job on the assignments and tests in this course.
21. I imagine the parts of a task I have to complete.
22. I correct my errors.
23. I work hard on a task even if it does not count.
24. I expect to do well in this course.
25. I make sure I understand just what has to be done and how to do it.
26. I check my accuracy as I progress through a task.
27. A task is useful to check my knowledge.
28. I'm certain I can master the skills being taught in this course.
29. I try to determine what the task requires.
30. I ask myself, how well am I doing, as I proceed through tasks.
31. Practice makes perfect.
32. Considering the difficulty of this course, the teacher, and my skills, I think I will do well in this course.

Copyright © 1995, 1997, 1998, 2000 by Harold F. O’Neil, Jr.