Investigating the Role of Simulation Fidelity in
Laparoscopic Surgical Training
by
Hyun K Kim
B. S., Mechanical Engineering
Massachusetts Institute of Technology, 2000
Submitted to the Department of Mechanical Engineering in
Partial Fulfillment of the Requirements for the Degree of
Master of Science in Mechanical Engineering
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
August 2002
© 2002 Massachusetts Institute of Technology
All rights reserved
Author ......................................................................................................
Department of Mechanical Engineering
August 9, 2002

Certified by ...............................................................................................
Dr. Mandayam A. Srinivasan
Senior Research Scientist, Mechanical Engineering
Thesis Supervisor

Accepted by ...............................................................................................
Prof. Lallit Anand
Professor of Mechanical Engineering
Chair, Department Committee for Graduate Students
Investigating the Role of Simulation Fidelity in
Laparoscopic Surgical Training
by
Hyun K Kim
Submitted to the Department of Mechanical Engineering on
August 9, 2002 in Partial Fulfillment of the Requirements for the Degree of
Master of Science in Mechanical Engineering
ABSTRACT
Minimally invasive surgery (MIS), with its aptitude for quick recovery and minimal scarring, has
revolutionized surgery over the past few years. As a result, the development of a VR-based
surgical trainer for MIS has been a popular area of research. However, there still remains a
fundamental question of how realistic the simulation has to be for effective training. On the one
hand, learning surgical practices with an unrealistic model may lead to negative training transfer.
On the other hand, because of the learning abilities and perceptual limitations of the sensory, motor, and
cognitive systems of the human user, a perfect simulation is unnecessary. Furthermore, given the
large variations in human anatomy and physiology, there is no single perfect model. The question
is how simple a simulation we can get away with, while at the same time preserving a level of
fidelity between the virtual and real organ behavior that leads to positive training transfer.
A dual station experimental platform was set up for this study. The two stations consisted of a
real environment testing station and a virtual environment training station. The fidelity of the
simulation could easily be adjusted in the virtual training station so that subjects could be treated
with different modes of training. With the dual station setup the real environment performance of
a subject before and after VE training could be measured.
A first round of experiments was conducted on the setup to investigate the effects of haptic fidelity
and of part task training on surgical training. Haptic fidelity was adjusted by modeling a
material of non-linear stiffness to different degrees of accuracy. Subjects were initially tested on
the real station performing a bimanual pushing and cutting task. They were then trained on the
virtual station, with one of three different levels of haptic fidelity or the part task trainer. Once
the training was complete, the subjects were again evaluated on the real environment station to
gauge their improvement in skill level.
Initial results showed a marked difference in level of skill improvement between training with
haptics and without. However, there was no significant difference in the training effectiveness of
the higher fidelity and lower fidelity models of elasticity. Also, part task training proved to be an
equally effective method of training for the surgical task chosen.
Experiments with modeling nonlinear materials are one of many studies that can
be done on this platform, including adjusting other modes of haptic fidelity such as viscoelasticity, and experiments with graphic fidelity. Results from such experiments can serve
as the basis of future surgical simulation development by providing guidelines on
environment fidelity required for positive training transfer to occur.
Table of Contents
TABLE OF CONTENTS ...................................................................................................... 1
LIST OF FIGURES AND TABLES .................................................................................... 4
1   INTRODUCTION .......................................................................................................... 6
    1.1   INTRODUCTION ................................................................................................. 6
    1.2   SUMMARY OF PREVIOUS WORK ................................................................... 7
    1.3   MOTIVATION ..................................................................................................... 8
    1.4   PARADIGM ........................................................................................................ 10
    1.5   ANALOGIES WITH FLIGHT SIMULATOR DEVELOPMENT ....................... 11
2   DESIGN OF PLATFORM AND EXPERIMENTAL METHODS ............................... 13
    2.1   DESIGN OF THE EXPERIMENT PLATFORM ................................................ 13
          2.1.1   Overview .................................................................................................. 13
          2.1.2   Design of Task ......................................................................................... 17
          2.1.3   Implementation of Tasks ......................................................................... 21
          2.1.4   Performance Measurements .................................................................... 22
    2.2   DESIGN OF VALIDATING EXPERIMENTS ................................................... 24
          2.2.1   Factors that Affect Virtual Environment Training ................................... 24
          2.2.2   Haptic Fidelity ......................................................................................... 25
          2.2.3   Part Task vs. Whole Task Training ......................................................... 27
    2.3   EXPERIMENTAL PROCEDURE ...................................................................... 28
3   EXPERIMENTAL RESULTS ...................................................................................... 30
    3.1   RESULTS FROM TESTING STATION ............................................................. 30
    3.2   TOTAL SCORE AND SKILL IMPROVEMENT ............................................... 32
    3.3   LEARNING CURVE FOR TRAINING .............................................................. 37
4   ANALYSIS AND DISCUSSIONS ............................................................................... 41
    4.1   TRAINING EFFECTIVENESS .......................................................................... 41
    4.2   STATISTICAL ANALYSIS ............................................................................... 45
    4.3   LEARNING CURVE .......................................................................................... 48
    4.4   RESULTS FROM FURTHER SUBJECTS ......................................................... 50
5   CONCLUSIONS AND FUTURE WORK .................................................................... 53
    5.1   CONCLUSIONS ................................................................................................. 53
    5.2   FUTURE WORK ................................................................................................ 54
REFERENCES ................................................................................................................... 56
List of Figures and Tables
Figure 1.1:  Experimental setup with the real station and virtual station .......................... 10
Figure 2.1:  Experiment setup showing (a) the real environment test station and (b) the
             virtual environment training station ............................................................... 14
Figure 2.2:  Two Phantoms (a 1.5A model and a 1.0A model) connected to the end of
             the surgical tools in the virtual station ............................................................ 15
Figure 2.3:  System for real-time simulation with both graphic and haptic feedback ....... 16
Figure 2.4:  Hybrid system which produces "virtual" forces from the Phantom and
             "real" forces from the abdominal wall ............................................................ 17
Figure 2.5:  Sequence of operation for the tasks chosen in the experiment platform.
             The figure shows the tasks done in the virtual environment for ease of
             presentation. The tasks and objects in the real environment station are
             the same ......................................................................................................... 19
Figure 2.6:  (a) The operation scene in the real environment test station. The blocks
             are mounted on linear sliding bearings, spring loaded and connected to
             LVDTs ........................................................................................................... 21
Figure 2.7:  Surgical tool with pen tip installed at the end ............................................... 22
Figure 2.8:  Factors that affect general virtual environment training ............................... 24
Figure 2.9:  Force-displacement plot for rubber spring material ...................................... 26
Figure 2.10: (a) Linear approximation to force-displacement curve, (b) nonlinear
             (exponential) approximation to force-displacement curve. The order of
             magnitude of the stiffness is in agreement with what has been measured
             experimentally for a pig's esophagus (0.005 N/mm - 0.02 N/mm) ................ 26
Figure 3.1:  Sample of raw data from the real environment test station for subject P3,
             before and after training. The data shows (a) profile of block position,
             (b) incision marks, (c) time and (d) number of obstacles hit .......................... 31
Figure 3.2:  Bar chart of the total performance score before and after training using
             the nonlinear elastic model ............................................................................ 35
Figure 3.3:  Bar chart of the total score before and after training for the (a) linear
             haptic trainer, (b) no haptic trainer, (c) part task trainer ................................. 36
Figure 3.4:  Learning curves for subjects who received training on the nonlinear
             elastic model VR trainer ................................................................................ 38
Figure 3.5:  Learning curves for subjects who received training on the linear elastic
             model VR trainer ........................................................................................... 39
Figure 3.6:  Learning curves for subjects who received training on the VR trainer
             without haptics ............................................................................................... 40
Figure 4.1:  Average training effectiveness plot .............................................................. 42
Figure 4.2:  Performance improvement shown with initial skill level group .................... 44
Figure 4.3:  Average learning curve for training on the VR trainer. Error bars show
             standard deviation among subjects ................................................................. 49
Figure 4.4:  Average training effectiveness plot (revised) ............................................... 51
Table 1:     Performance metrics ..................................................................................... 23
Table 2:     Experimental design table showing how the subjects were divided and
             trained. N1, N2, etc. represent the subjects .................................................... 28
Table 3:     Sample scoring table for evaluation test conducted on the real environment
             test station before and after training using the non-linear elastic model.
             (For 5 subjects.) ............................................................................................. 34
Table 4:     Improvement in performance score (training effect) after virtual reality
             training .......................................................................................................... 37
Table 5:     ANOVA table calculations for two-factor factorial test with n replicates
             of each treatment combination [44]. n=1, a=4 and b=5 for this
             experiment ..................................................................................................... 47
Table 6:     ANOVA table of results ................................................................................ 47
1 Introduction
1.1 Introduction
The potential of developing a virtual reality based surgical simulation has captivated the
imagination of scientists and engineers across multiple disciplines in recent years. Although
simulations of various surgical procedures have been attempted, the majority of research has
concentrated on simulating minimally invasive surgery (MIS). Many surgeons agree that MIS has
revolutionized surgery over the past decade. Over 2.5 million surgeries were conducted through
minimally invasive procedures in the US in the past year alone.
Minimally invasive surgery is performed through small incisions made on the outer layer (usually
the abdominal wall or skull) of the operating region. An endoscope and surgical tools are inserted
through these incisions. The operation is done using long slender surgical tools that are pivoted
by a trocar at the outer layer. The endoscope provides the visual feedback of the scene of
operation onto a CRT screen. The majority of the advantages of MIS can be
attributed to the fact that very small incisions are made. This results in much less pain to the
patient than in traditional surgery. Also, the recovery time is shortened considerably, allowing the
patient to leave the hospital in a matter of days after the operation. Not only is this an obvious
advantage to the patient, but it also benefits the hospital by shortening hospital stay, reducing the
load on sick beds and nurses.
However, there are difficulties in performing these procedures on the surgeon's part. Most
endoscopes only provide a 2-D view, making visual depth perception difficult. Also, the
surgeons can only view the operation scene on a remote CRT screen. This results in difficulties in
hand-eye coordination. There is also a problem of minimal tactile feedback from tool-tissue
interaction, mainly because the surgeon is feeling the organs through the end of a long tool and
the forces at the trocar/abdominal wall interface dominate. Not only is it hard to detect small
forces on the tool tip from the soft organ tissue, but it is also hard to manipulate the long surgical tools
compared to the direct use of hands. Finally, the high frequency tremors of the hand are amplified
at the end of the long tool, making precise tool control even more difficult.
Due to such difficulties, training for MIS is a long and arduous process. The traditional method of
training has been apprenticeship, where the surgical resident learns by observing and assisting an
expert surgeon until they are ready to perform surgical procedures on their own. However,
there are obvious disadvantages to this type of training. First of all, there is risk to the patient as
surgeons who have not completed their training hone their skills in the actual operating room.
Also, the accessibility of training can be a problem for the residents. The training can only be
done when a patient is available, and not necessarily when the trainee requires it. Furthermore, the feedback to
the trainee is qualitative and the quality of the feedback could vary widely depending on the
surgeon overseeing the training.
Such shortcomings of the traditional method have led to the development of virtual reality based
surgical simulators. If MIS procedures could be simulated using a PC and simple visual and
haptic interfaces, there would certainly be numerous advantages over traditional methods. First of
all, a simulator would provide an environment for training without risk of injury to the patient.
Also, the training would always be available as often as the trainee desires, and the feedback from
the training sessions could be immediate and quantitative. Customized software could be used to
simulate various and even rare surgical procedures. Also, the simulation would allow the surgical
environment to be controlled to cater for the specific conditions that are desired. Therefore an
effective virtual reality based trainer would be an ideal platform for training both novice and
expert surgeons, and thus a great deal of research has been done in this area over the past 10
years.
1.2 Summary of Previous Work
Extensive effort has been spent in developing surgical simulators to help users study anatomy and
practice medical procedures [1-3]. Satava [4] developed one of the first surgical simulators,
which included a model of the abdominal anatomy that could be viewed in 3-D using a head-mounted display. The early simulators such as Satava's and others [5] provided visual feedback
only. However as more effective haptic devices became available, force feedback was
incorporated into the surgical simulations. Initial models of the human organ were based on
simple lumped parameter models [6, 7]. However, as computing speeds increased, physically
based finite element models [8-10], and other meshless methods [11] have been used as a way of
modeling the human organ. Also, part task training simulators have been developed as a tool for
training perceptual motor and spatial skills, without constructing complicated models of organs
[12-15]. In fact, a part task trainer named MIST VR is currently commercially available and
is used sparingly in teaching hospitals.
Another area of research that has relevance to this study is the measurement of transfer of training
from the virtual environment. Virtual reality is being used in many disciplines as a method of
training. Consequently, there has been a vast number of studies investigating training in the virtual
environment. Adams et al. [16] showed that virtual environment training improved performance
in manual assembly tasks involving lego blocks. However, force feedback did not appear to
improve the training significantly. Earlier, Kozak et al. [17] showed that low fidelity simulations
could lead to zero or negative training transfer to the real environment. Also, similar VR training
transfer studies have been done with spatial navigation training [18, 19], post-stroke rehabilitation
[20], and flight simulations [21-24]. These studies showed that virtual reality training could have
mixed results depending on the fidelity of the simulation and the task being trained for.
On a related topic, Wagner, Howe and Stylopoulos [25] conducted experiments to see the effect of
different levels of force feedback on teleoperation performance. Operation with force feedback resulted in
better performance compared with operation without. However, the difference between
augmented (x 1.5) force feedback and reduced (x 0.75) feedback was not significant.
1.3 Motivation
Previous research on surgical simulations has shown one shared conclusion. It is difficult to
simulate the surgical environment accurately. First of all, it is almost impossible to model
tool-tissue interaction exactly. Human organs are nonlinear, anisotropic, viscoelastic, and
non-homogeneous, and their boundary conditions are not well known. On top of all this, there are
physiological effects that are difficult to model, such as breathing and blood flow. Also, to model
organs accurately, in vivo material properties of human organs need to be known, and these are not
easy to measure. Furthermore, even if an accurate model can be established, real time simulations
require very fast computation. As the model becomes more and more complicated, a longer
computation time would be required to render the object. Therefore, modeling the interaction
between the organ and the surgical tool is not trivial. It is safe to say that an exact real-time
simulation of tool-tissue interaction cannot be accomplished in the near future.
However, it turns out that an exact simulation is not required. Due to the learning abilities and
perceptual limitations of the sensory, motor, and cognitive system of the human user, perfect
simulation is not necessary. Furthermore, given the large variations in human anatomy and
physiology, there is no single perfect model, and wide variations exist in geometry and material
properties of organs. In fact, it has been seen in many other domains where virtual reality has
been used as a training method that a low fidelity simulation can give positive training transfer. The
main question is how simple a simulation we can get away with for surgical simulations, while
the same time preserving the level of fidelity between the virtual and real organ behavior that
leads to positive training transfer.
A study measuring training transfer under various levels of simulation fidelity has not been conducted up
to this point. The difficulty lies in measuring the improvement of real world surgical skills from
virtual environment training. Measuring surgical skills within the virtual environment can be
done easily. However, some of these skills might pertain to the virtual environment alone,
and the training may have no significant or even negative effects on real world skills. Therefore,
what is truly required is the measurement of improvement in real world skills. The ideal way of
measuring this would be to have simulators of various fidelity and to train novice surgeons on
these simulators alone and see how the subjects' surgical skills differ during real surgery.
However, the surgical trainers that are available currently are not mature enough for such a study
to take place without risk to the patient. A current method that is widely used to validate low
fidelity simulators is to measure the inverse transfer of training [26]. This involves comparing the
performance of an expert surgeon with a novice surgeon on the simulator, and if the expert
surgeon performs significantly better the simulator is deemed to be effective [27, 28]. However,
such inverse arguments are not sufficient to conclude that the simulator gives positive
training transfer to the real environment. Therefore, there is a need for an alternate method for
measuring training effectiveness directly.
As it was indicated previously, a low fidelity surgical simulation could be effective in training a
surgeon. However, there remains the question: how accurate do the haptics have to be for the
simulation to be effective? How accurate do the graphics have to be? There are also questions
that do not concern fidelity, such as: what is the best method (part task vs. whole task) for
training? What is the most effective method of feedback? Questions such as these are
fundamental issues that need to be addressed even before any further development of surgical
simulators can be done. To answer these questions, a method of measuring training effectiveness
under various training conditions is required. However, there is a missing link between the virtual
and real environment that precludes the measurement of training effectiveness for laparoscopic
surgical simulators. The research done for this thesis proposes to provide this missing link.
1.4 Paradigm
A two-station experiment platform was set up as the test bed. The two stations are laparoscopic
stations with analogous surgical tasks. However, in one station the surgical tasks are performed
on real world objects, whereas the other station contains virtual objects. The real world station
contains simple inanimate objects, for which the shape and material properties are well known.
This allows the real objects to be very accurately modeled in the virtual environment, so that a
very high fidelity simulation of the real environment is possible. This provides an
opportunity to vary the fidelity of the simulation from very high to very low.
Figure 1.1:
Experimental setup with the real station and virtual station
With this setup, it is now possible to measure training transfer from the virtual training
environment to the real environment. The real world performance of a subject before and after
VE training can be now measured, since the same surgical tasks are performed in both stations.
Therefore, various factors can be varied within the virtual environment to see the effect on real
world skills. These factors can be haptic fidelity, graphic fidelity and type of training, among
others. Therefore, this experiment platform provides the missing link between the virtual and real
environment that is key to answering
the fundamental
questions
concerning training
effectiveness.
1.5 Analogies with Flight Simulator Development
Although this is one of the first studies done on measuring training transfer for VR surgical
simulators, there has been a great deal of research done on measuring training transfer in other
domains of VR training, as previously mentioned. Flight simulators are a good example of a case
where there has been extensive research on training effects of virtual environments [29-31]. The
requirements for a faithful simulation are the same for both surgical and flight simulators.
(1) A complete model, expressed mathematically, of the response to all inputs from the
operator
(2) A means of solving these equations in real time
(3) A means of presenting the output of this solution to the operator by means of haptic, visual
and aural responses
Over their long history, flight simulators have matured to a stage where they are
commonly and effectively used today for training novice aviators. However, when early flight
simulators were being developed in the first half of the 20th century, engineers were faced with
similar questions that we are faced with today for surgical simulations. The physics of aviation
was not known well enough to have an accurate model, and the sensor, actuator and computing
technology was not developed enough to provide satisfactory real-time responses. Therefore, the
early simulators were very low fidelity models of flight [32]. The central issue of whether or not
these low fidelity simulations could achieve positive training transfer was as pressing back then
as it is now for surgical simulations. The analogy does not end there. The obstacles to measuring
training transfer were similar. The ideal way of measuring training transfer would have been to
train a novice pilot solely on the simulator and see how well they perform in real flight situations.
However, such a study was not realistic due to the risk of accidents and the availability and cost
of aircraft. The only method of validating the simulators was by inverse transfer of training,
where expert pilots attested to the similarity of the simulator to real flight [32, 33]. However, such
methods are never enough to show the effectiveness of a simulation as a training tool.
Consequently, the early simulators did not have much of an impact on flight training because
most aviators were not convinced of their usefulness. This is similar to what has been observed
so far with surgical simulators, where surgeons are not completely won over by the effectiveness
of training with low fidelity simulations.
The initial uncertainty about the training effectiveness of flight simulators was answered over
time. By WWII the mechanics of flight was much better understood, producing more accurate
mathematical models of flight. There was also significant improvement in simulator technology
from the initial mechanical devices to more complex electromechanical systems with analog and
eventually digital computation. Such development ultimately shaped a very accurate simulation
of flight, to a degree to which there is no doubt today that a flight simulation is an effective tool
for training. In fact, present day flight simulations are at the core of flight training, with strict
FAA regulations governing the fidelity requirements [34, 35].
With the development of surgical simulations still in its infancy, there are several lessons that can
be learnt from the history of flight simulations. For instance, due to the fact that there was no easy
way to measure training effectiveness, a lot of money and man-hours were spent to perfect the
flight simulation using state of the art technology. However, in many cases the improvements
did not enhance the training effectiveness due to limits of human learning ability and
sensorimotor skills. Although a more accurate simulation may have been achieved, these were
cases where resources were wasted. Such waste could have been avoided if training effectiveness
studies were done to determine the level of fidelity required for the desired degree of training.
This is one of the lessons learnt that is central to this study and that is why measuring training
transfer at an early stage is important for funneling resources in the right direction for surgical
simulators. There are also encouraging signs that history provides. The many similarities in the
issues concerning their development give hope that one day surgical simulators will play a central
role in training surgeons as flight simulators do for flight training today.
2 Design of Platform and Experimental Methods
The first part of this chapter describes how the experiment platform was designed and built.
Once the platform was set up, experiments were done to validate the usefulness of the platform.
The latter part of this chapter describes how these experiments were designed and performed.
2.1 Design of the Experiment Platform
Laparoscopic surgery is minimally invasive surgery performed on the abdominal region. Surgical
simulation development at the Touch Lab mainly focuses on simulators for laparoscopic surgery.
Therefore, the training transfer experiments for this study were also done in a laparoscopic setup.
2.1.1 Overview
The experiment platform consists of two stations. The real environment laparoscopic setup acts as
the testing station and the virtual environment setup acts as the training station.
Figure 2.1: Experiment setup showing (a) the real environment test station and (b) the virtual
environment training station
In the real environment station a rubber model of an abdominal wall (Limbs and Things, Inc.)
covers the operation scene. Two laparoscopic tools are inserted through the abdominal wall via
trocars, as done in real laparoscopic procedures. The objects that are to be operated on are
instrumented and are placed inside the abdominal area. A laparoscope, attached to a camera, is
also inserted through the abdominal wall to provide visual feedback onto a CRT screen, again as
in real surgery. The station is set up so that the subject can perform laparoscopic tasks on the
objects, while the data from the tasks are recorded.
The virtual environment station includes the same rubber abdominal wall, with surgical tools and
trocars inserted at the same position as in the real station. However, underneath the abdominal
wall, the surgical tools are connected to Phantom (SensAble Technologies) haptic interface
devices (Figure 2.2). The objects are generated in the virtual world by the computer and displayed
to the user graphically through the computer monitor and haptically through the Phantom devices.
Both the real station and the virtual station contain foot pedals to activate the harmonic scalpels
that are on the end of the surgical tools.
Figure 2.2: Two Phantoms (a 1.5A model and a 1.0A model) connected to the end of the surgical
tools in the virtual station.
For real-time simulation to be possible with the computer and Phantom devices, the system
shown in Figure 2.3 had to be set up. The Phantom devices were connected to a 550MHz Pentium
III PC for this setup.
Figure 2.3: System for real-time simulation with both graphic and haptic feedback
As the figure shows, the force feedback device sends the tip position information back to the
CPU; the processor determines whether or not the tip is in collision with a virtual object and
calculates the relevant deformation and force feedback for the object. This information is sent to
the graphic and haptic interfaces to be displayed to the user. For the simulation to seem realistic
in real-time, a refresh rate of 1 kHz was used for the haptics and 30 Hz was used for the graphics
for this setup.
The rubber abdominal walls were used in the two stations because it was determined that the
horizontal resistance forces exerted by the wall onto the tools were very important. Laparoscopic
surgeons attested that the forces from the abdominal wall at the trocar are the dominant forces
during surgery, much greater than the smaller forces between the tools and the organs. It would
be technically challenging to simulate both the abdominal wall forces and the contact forces at
the same time. Therefore, a rubber abdominal model was used so that modeling only needed to be
done for the forces between the tools and the objects underneath the wall.
This type of hybrid model for laparoscopic simulation was used for the first time in this setup.
The hybrid system appears to be the most effective setup for laparoscopic simulation to date,
under the assumption that the purchased rubber abdominal model is a faithful model of the real
abdominal wall.
Figure 2.4: Hybrid system which produces "virtual" forces from the Phantoms and "real" forces
from the abdominal wall.
Finally, the proposed experimental procedure using the setup was as follows. A subject would
initially be tested on the real environment test station to measure their skill level prior to training.
They would then be trained for several sessions on the virtual reality trainer. After the training
was complete, the subject would be brought back to the real environment to be evaluated for the
final time. The measure of training effectiveness would be the increase in skill level within the
real environment. Using this setup, various factors can be adjusted within the virtual environment
to observe their effect on real-world training.
2.1.2 Design of Task
With the hardware in place, it was important to choose the appropriate tasks for the subjects to be
trained and evaluated on. There are numerous procedures that are performed by laparoscopic
techniques. The procedures are mostly a combination of a set of sub-tasks that are commonly
performed. Choosing the right combination of sub-tasks was crucial for this study to be relevant
and the results to be meaningful. The following criteria were used to select the final task.
(1) The tasks had to be relevant to laparoscopic surgery. Although the operations were
performed on inanimate objects and not human organs, the actions had to be very similar
to what is done in real laparoscopic surgery. This was mainly verified by expert
laparoscopic surgeons from MGH, who were collaborators on this project.
(2) The chosen task needed to provide a graded "mimicability" of the real tasks in the virtual
environment. One of the main areas of interest for this training effectiveness study was to
see how environment fidelity affects training. Therefore it was necessary to be able to
adjust the fidelity within the virtual environment. Choosing tasks and objects that could
be modeled to varying degrees of accuracy, both haptically and graphically, was
important.
(3) The tasks needed to have an appropriate level of difficulty so that a suitable number of
training sessions could take place. If the tasks were too easy, the subjects would be able
to perform them without much training, and it would be difficult to distinguish between
the different training effects. On the other hand, if the tasks were too difficult, there
would not be much improvement in subject skill level even after training, and again it
would be difficult to see the effects of the different trainers.
(4) Finally, the tasks had to have a significant number of metrics so that the performance of
the subjects could be quantified during both training and evaluation.
Besides these criteria, there were some other desirable traits. It would be ideal if performance
on the chosen tasks showed a small variance across subjects. It was also hoped that haptic
feedback would play an important role in performing the chosen tasks, since the role of haptic
fidelity in training is one of the main interests of this study.
In choosing a task that fits the above criteria, operations that are commonly done in real
laparoscopic surgery were divided into the following building blocks of sub-tasks:
1) Positioning and orienting the tool.
2) Obstacle avoidance.
3) Palpation.
4) Piercing.
5) Cutting (scissors or harmonic scalpels).
6) Pushing (moving organs out of the way).
7) Pulling.
8) Wrapping.
A combination of these sub-tasks was chosen, but they could not all be combined into one main
task; some were therefore included and some were not. Pushing was chosen because it would be
possible to have a graded fidelity in modeling pushing operations. Positioning and obstacle
avoidance were incorporated because these tasks are generic to all laparoscopic procedures.
Wrapping was eliminated because the task would be difficult to simulate accurately at high
fidelity. Pulling was not chosen because it is similar to pushing, and piercing was eliminated
because force feedback is minimal when performing the task. Cutting was chosen because it
requires precise tool control and is a very important part of laparoscopic surgery.
In the end, a bimanual pushing and cutting task, similar to what is done in Heller's myotomy, was
chosen. Heller's myotomy involves cutting muscle fibers in the esophagus to relieve stress that
can cause difficulty in swallowing. The scene and task chosen for the experiment platform are
shown in Figure 2.5.
Figure 2.5:
Sequence of operation for the tasks chosen in the experiment platform. The figure
shows the tasks done in the virtual environment for ease of presentation. The tasks and objects in the
real environment station are the same.
There are three layers in the scene. The top red layer is the obstacle in this task, with the subjects
instructed to avoid touching this layer as much as possible. The layer below the obstacle consists
of two sliding blocks. Both blocks are spring loaded. The blue block (top block) can be pushed to
the right and the pink block (bottom block) to the left. When the blocks are pushed, the bottom
layer is uncovered. The bottom layer has a rectangular grid area where the incision marks are
supposed to be made. The sequence of operation is as shown above.
(a) The scene is shown in its original configuration with the tools in their starting position.
(b) The left hand tool is used to push the spring-loaded blue block, uncovering the grid
underneath.
(c) The edge of the block, once pushed, needs to be maintained between two finely spaced
lines (the red warning line and the outer edge of the rectangular grid mark). The block is
then transferred from the left hand tool to the right while maintaining the position
between the two lines.
(d) Once the tool transfer is complete, the left hand tool becomes free. Pressing the left pedal
activates the left scalpel. An incision mark is made on the first grid with the scalpel
activated. The incision mark has to be as straight as possible and as long as possible,
without going outside the boundaries of the grid. Also, the harder and longer the tool is
pushed, the thicker the incision mark would come out. The incision mark has to be as thin
and consistent in thickness as possible. All this has to be done while maintaining the
correct position of the block.
(e) With the first incision mark completed, the block is let go and the tools return to the start
position. The same actions from (b) to (d) are repeated for the next incision mark on the
second rectangular area under the same blue block. The order of incision is top left
rectangle, bottom left rectangle, top right, then bottom right. When the incision marks are
to be made on one of the top rectangles, the left tool should push the top of the block and
the bottom of the block should be held with the right tool, so that the tools do not
collide when making the incision. The reverse applies when making incisions on the
bottom rectangles.
(f)-(h) Once all four incision marks are made under the blue block, the subject switches
hands and performs the same tasks on the pink block. The subject returns the tools back
to the start position after each incision mark. A total of eight incisions are made, four
with the left and four with the right.
2.1.3 Implementation of Tasks
With the task selected, it needed to be implemented in both the real environment test
station and the virtual environment training station. For implementation in the real
station, two plastic blocks were mounted on linear sliding bearings to form the moving doors.
The blocks were spring loaded using tissue-like material purchased from Limbs &
Things, Inc. DC 750-1000 LVDTs (Linear Variable Differential Transformers) from
Macro Sensors were calibrated and fixed onto the blocks so that the displacement of the
blocks could be recorded. The LVDTs sent signals to a Data Translation DT 300 A/D
card so that the signal could be read, plotted and saved by the computer. The top layer of the
scene was manufactured by water-jetting the desired pattern from an aluminum sheet,
bending it to the desired shape and attaching a layer of tissue-like material to the top surface.
Figure 2.6: The operation scene in the real environment test station. The blocks are
mounted on linear sliding bearings, spring loaded and connected to LVDTs.
A pen tip, with its own ink source, was installed at the end of the tool. The grid was made out of
ink-absorbing paper and placed underneath the sliding blocks. Therefore, the incision marks
become thicker as the user pushes harder and longer onto the grid, as desired. After each test
session was complete, the grid was removed and the incision marks digitized through a scanner.
The straightness, length, thickness and position of the digitized incisions could be evaluated
using algorithms written with MATLAB's Image Processing Toolbox.
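The kind of analysis described above can be sketched as follows. The thesis used MATLAB's Image Processing Toolbox; this NumPy version only mirrors the idea of extracting straightness, length and thickness from a scanned binary image, and the function name and conventions are illustrative assumptions, not the actual routines.

```python
import numpy as np

def incision_metrics(mask):
    """Metrics for one digitized incision mark from a binary scan
    (mask[r, c] is True where ink is present). Straightness is taken as
    the standard deviation of the per-column ink centerline and thickness
    as the per-column ink count; these conventions are assumptions."""
    cols = np.where(mask.any(axis=0))[0]                      # inked columns
    centers = np.array([np.where(mask[:, c])[0].mean() for c in cols])
    thickness = mask[:, cols].sum(axis=0)
    return {
        "length": int(len(cols)),                             # horizontal extent [px]
        "straightness": float(np.std(centers)),               # lower = straighter
        "mean_thickness": float(thickness.mean()),
        "thickness_consistency": float(np.std(thickness)),    # lower = more even
    }

# A perfectly straight, uniformly 3 px thick line scores zero on both
# straightness and thickness-consistency penalties:
line = np.zeros((20, 50), dtype=bool)
line[9:12, 5:45] = True
m = incision_metrics(line)
```

In this scoring scheme lower values are better, which matches the overall convention of the thesis's metrics.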
Figure 2.7: Surgical tool with a pen tip installed at the end.
Implementation of the task for the virtual environment mainly involved software issues. First, all
the shapes, dimensions and masses of the real environment were accurately measured so that they
could be exactly simulated in the virtual environment. Traditional point-based collision detection
methods were used for detecting collisions between the tool tip and the virtual objects (the
collision detection algorithm already developed in the GHOST SDK, SensAble Technologies,
was used). On top of this, a ray-based collision detection algorithm was developed specifically
for this task so that it could be determined when the sides of the tools hit the edge of the obstacle.
Conventionally, ray-based collision detection is only used when six-degree-of-freedom force
feedback devices are available. However, for the first time, ray-based collision detection was
implemented in this setup for three-degree-of-freedom devices. This was possible because the
rubber abdominal wall acts as a pivot for the tool (see Appendix 1 for details).
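The pivot-based idea can be sketched as follows: because the trocar point is fixed by the abdominal wall, the whole shaft is determined by the 3-DOF tip position alone. The point-to-segment formulation and all names below are illustrative assumptions, not the actual algorithm in Appendix 1.

```python
import numpy as np

def shaft_hits_edge(pivot, tip, edge_point, tool_radius):
    """Side-of-shaft collision test. The shaft is modeled as the segment
    pivot->tip (the trocar fixes the pivot, the haptic device reports the
    tip), and a hit is flagged when an obstacle edge point lies within
    tool_radius of that segment."""
    pivot, tip, p = (np.asarray(a, dtype=float) for a in (pivot, tip, edge_point))
    d = tip - pivot
    # Parameter of the closest point on the segment, clamped to [0, 1]:
    t = np.clip(np.dot(p - pivot, d) / np.dot(d, d), 0.0, 1.0)
    closest = pivot + t * d
    return float(np.linalg.norm(p - closest)) <= tool_radius

# An edge point 1 unit off the shaft midline is hit by a tool of radius 2;
# a point 10 units away is not:
hit = shaft_hits_edge((0, 0, 0), (0, 0, -100), (1, 0, -50), tool_radius=2.0)
miss = shaft_hits_edge((0, 0, 0), (0, 0, -100), (10, 0, -50), tool_radius=2.0)
```

The key design point is that no orientation sensing is needed: the fixed pivot supplies the missing degrees of freedom.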
The most important step in implementing the task in the VE was modeling the dynamic block.
The dynamics of the block were modeled by the equation

    m·x'' + f_b(x', t) + f_k(x, t) = F_ToolTip    (2.1)

where x is the block position, f_b is the damping force, f_k is the elastic spring force and
F_ToolTip is the force applied by the tool tip.
It could be seen from this equation that the dynamics of the block could be modeled as linear or
nonlinear, time variant or time invariant, with or without damping, etc. Therefore, this equation
gives a means for adjusting the fidelity of the haptic model. This makes studies of the effect of
haptic fidelity possible.
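As a sketch of how equation (2.1) can be integrated at the 1 kHz haptic rate, the semi-implicit Euler step below treats the damping and spring terms as swappable callables, which is exactly the fidelity knob the equation provides. All parameter values are illustrative assumptions, not measured ones.

```python
def step_block(x, v, f_tool, t=0.0, dt=1e-3, m=0.05,
               f_b=lambda v, t: 0.5 * v,     # damping force [N], b = 0.5 N*s/m (assumed)
               f_k=lambda x, t: 20.0 * x):   # spring force [N], k = 20 N/m (assumed)
    """One haptic-rate (1 kHz) update of m*x'' + f_b(x',t) + f_k(x,t) = F_ToolTip.
    Replacing f_b/f_k (linear, nonlinear, time-varying, zero damping, ...)
    changes the fidelity of the haptic model without touching the integrator."""
    a = (f_tool - f_b(v, t) - f_k(x, t)) / m   # acceleration from (2.1)
    v = v + a * dt                             # integrate velocity first
    x = x + v * dt                             # then position (semi-implicit Euler)
    return x, v

# Push with a constant 0.4 N tool force for 2 s of simulated time;
# the block settles near the static equilibrium F/k = 0.02 m.
x, v = 0.0, 0.0
for i in range(2000):
    x, v = step_block(x, v, f_tool=0.4, t=i * 1e-3)
```

Semi-implicit Euler is a common choice for haptic loops because it stays stable at these rates for stiff spring-damper systems.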
2.1.4 Performance Measurements
With the task selected and implemented, a set of performance metrics was required to quantify
the performance of the subjects. Traditionally, the measurement of surgical skill is more
qualitative than quantitative. However, for this study, a numeric measurement of performance
was necessary for a more absolute evaluation of skill improvement. In fact, there have been
numerous prior studies on establishing metrics for laparoscopic surgery [13, 15, 36-40].
However, those performance measurements were relevant to particular tasks or involved
establishing hidden Markov models. For this experiment platform, an original set of metrics was
established to gauge performance; it is shown in Table 1.
Performance      What is measured?                What is its significance?
Measurement
---------------  -------------------------------  -----------------------------------------
Time (T)         Completion time;                 Gives a measurement of economy of effort
                 inter-task time                  and coordination.
Push             Block position                   Measurement of how well the subject
Accuracy (P)                                      controls the force and position of the
                                                  tool tip.
Cut              Straightness; depth;             Measurement of how well the subject
Accuracy (C)     depth consistency;               orients, controls, and coordinates the
                 accuracy                         two tools for accurate incisions.
Tool             Obstacle avoidance;              General positioning and control of the
Control (O)      positioning of tool              tool prior to pushing and incision.

Performance = s1·T + s2·P + s3·C + s4·O

Table 1: Performance metrics
The total score was a scaled, normalized sum of the above four metrics. The scales were
determined through discussions with expert laparoscopic surgeons from MGH about what is
important in surgery. The total performance score tried to reflect what expert surgeons defined as
a good surgical operation. In the end, a 1-4-3-2 weighting was given to time, cut accuracy, push
accuracy and tool control, respectively. Push accuracy and cut accuracy were given the highest
weightings because these were tasks where tool coordination and depth perception were
important. Also, these two metrics were the quantities that described the success or failure of the
main objective of the tasks. Time is not an important factor in most laparoscopic procedures,
since most cases are not ones in which emergency surgery is required. Therefore, time was given
a relatively smaller weighting.
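The resulting weighted sum can be written out as a minimal sketch; only the 1-4-3-2 weights come from the text, while the dict-based interface is an illustrative assumption.

```python
# Weights from the text: time 1, cut accuracy 4, push accuracy 3,
# tool control 2 (lower scores are better throughout this scoring system).
WEIGHTS = {"time": 1.0, "push": 3.0, "cut": 4.0, "tool_control": 2.0}

def total_score(sub):
    """Performance = s1*T + s2*P + s3*C + s4*O over normalized sub-scores."""
    return sum(WEIGHTS[k] * sub[k] for k in WEIGHTS)

score = total_score({"time": 0.5, "push": 0.2, "cut": 0.1, "tool_control": 0.3})
# 1*0.5 + 3*0.2 + 4*0.1 + 2*0.3 = 2.1
```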
Before the start of each experiment, the subjects were given a detailed description of the task.
They were also advised on how their performance would be scored, including the weighting of
each metric.
2.2 Design of Validating Experiments
2.2.1 Factors that Affect Virtual Environment Training
The above section concludes the description of the experimental setup. What remains is a
preliminary round of experiments to show the usefulness of the test bed. However, before any
experiments could be done on the platform, the factors that can affect training needed to be
considered, so that it could be determined which factors should be experimented with in order to
clarify their role in training.
[Diagram of trainer-side factors: interface fidelity (haptic device's inertia, stiffness, friction and bandwidth, ...), environment fidelity (haptic and graphic accuracy of the mechanical model, photo-realistic texture) and training method (part-task vs. whole-task, frequency and duration).]
Figure 2.8: Factors that affect general virtual environment training
As Figure 2.8 shows, in all training environments there is going to be some type of interaction
between the trainer and the user. There are many factors that can affect the quality of the training
from both the user side and the trainer side. Therefore, both the trainer's ability to deliver an
effective mode of training and the user's ability to respond to the training are important.
However, for this study, only the factors that affect training from the trainer side are under
consideration. There are three major factors that can affect training from the trainer side: interface
fidelity, environment fidelity and training method. Interface fidelity is influenced by the accuracy
of the haptic and graphic interfaces. Inaccuracies can occur from the inertia, stiffness, resolution,
friction and bandwidth of the haptic interface and the resolution and refresh rate of the graphic
interface, among others. Environment fidelity is determined by how accurately the real
environment is modeled physically in the virtual environment. The accuracy of the environment
can be divided largely into haptic accuracy and graphic accuracy. These mainly involve shape,
texture, force and deflection for haptics and color, texture, shade, shape and deflection for the
graphics. The third factor shown in the figure is the training method, which is general to all forms
of training and not just VR training. Factors such as frequency and duration of training, type of
training (part task or whole task) are categorized under training methods.
The roles in training of all the factors noted in Figure 2.8 can be investigated using the experiment
platform. For the first round of experiments to validate the effectiveness of this platform, two of
the above factors were investigated: the effect of haptic fidelity on training transfer and the
effectiveness of part task training versus whole task training.
2.2.2 Haptic Fidelity
There are many aspects of haptic fidelity that could be experimented with. One of the key
questions that remains about haptic fidelity is how accurately the nonlinearity of material
elasticity needs to be modeled. Human organs act as nonlinear springs when they are pushed or
pulled. However, the actual force-displacement properties are not well known, and characterizing
the in-vivo force-displacement properties of organs is not a trivial problem. Therefore, the
majority of surgical simulations up to this point have used simple linear elastic models. Efforts
are continuously being made to characterize tissue properties accurately. However, it is not
known whether the user of a surgical simulator can actually tell the difference between a linear
and a nonlinear spring. Even if they could, the difference in training effect may not be significant.
Therefore, for the first experiments on the platform, the effect of modeling a nonlinear elastic
material to varying degrees of accuracy was investigated.
For this investigation, a nonlinear spring was loaded onto the sliding blocks in the real station.
The material used for the spring was a tissue-like rubber material purchased from Limbs &
Things, Inc. The force-displacement data for the spring and block assembly was measured using
standard weights on a low-friction pulley and is plotted in Figure 2.9.
[Plot: force (N, 0 to 1.6) versus displacement x (mm, 0 to 40) for the rubber spring material.]
Figure 2.9: Force-displacement plot for the rubber spring material
The figure shows the typical behavior observed in most tissue material, where the stiffness
increases as the displacement increases. For the fidelity study, two approximations were made to
this curve: a linear approximation and a nonlinear approximation, both obtained using a
least-squares curve fitting technique.
[Plots: force (N, 0 to 1.6) versus displacement x (mm, 0 to 40) with (a) a linear fit and (b) an exponential fit, y = 0.1391·e^(…x), R² = 0.9973.]
Figure 2.10: (a) Linear approximation to the force-displacement curve, (b) nonlinear (exponential)
approximation to the force-displacement curve. The order of magnitude of the stiffness is in agreement
with what has been measured experimentally for a pig's esophagus (0.005 N/mm - 0.02 N/mm).
With the above approximations, three different levels of haptic fidelity were possible for
modeling the spring in the training station: the nonlinear approximation was the most accurate,
high fidelity model; the linear approximation was the medium fidelity model; and the lowest
fidelity was a model without force feedback. For the haptic-less model, the block would only
move graphically as it was pushed. Comparison of the training effects of these three models
would give insight into the role of haptic fidelity in training.
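The three fidelity levels can be summarized in a hedged sketch of the spring-force law. The 0.1391 coefficient is the exponential-fit value reported with Figure 2.10; the ~0.06 /mm exponent and the 0.0375 N/mm linear slope are assumptions chosen so both curves reach roughly 1.4-1.5 N at 40 mm, and offsetting the exponential so the force vanishes at zero displacement is also an assumption.

```python
import math

def spring_force(x_mm, fidelity):
    """Spring force [N] at displacement x_mm [mm] for one of the three
    haptic fidelity levels used in the training station. Exponent, linear
    slope and zero-displacement offset are assumed values, not the fits."""
    if fidelity == "nonlinear":            # high fidelity model
        return 0.1391 * (math.exp(0.06 * x_mm) - 1.0)
    if fidelity == "linear":               # medium fidelity model
        return 0.0375 * x_mm
    if fidelity == "none":                 # lowest fidelity: graphics only
        return 0.0
    raise ValueError(f"unknown fidelity level: {fidelity}")

f_high = spring_force(40.0, "nonlinear")   # ~1.39 N at full travel
f_mid = spring_force(40.0, "linear")       # 1.5 N at full travel
```

Selecting the fidelity level then amounts to passing a different `fidelity` string (or callable) into the block model, with everything else in the simulation unchanged.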
There were some predictions that could be made from prior knowledge even before the
experiments were done. For the sensory resolution of the hand, the JND (Just Noticeable
Difference) for force varies from 5-15% depending on the force magnitude, muscle system and
experimental method [41]. The resolution deteriorates at forces below 0.5 N, with a minimum
resolution of 0.06 N at these small forces [42]. The maximum force difference in the operating
region between the linear and nonlinear approximations is 0.16 N. This difference would be
amplified three to four times at the hands due to the long, pivoted tool. Also, the stiffness
difference between the two models varies from 0-70% in the operating region, while the JND for
stiffness is known to be 23% [43]. Therefore, if a simple discrimination experiment were done,
the subject would most likely be able to discriminate between the nonlinear and linear elastic
models. However, whether this small difference has any effect on training is another matter,
which is why it is worth investigating. The initial prediction was that training would not be
affected by the small difference in linearity, both because there are additional forces on the tool,
from the friction of the haptic device and the abdominal wall, that may be much larger than the
contact forces, and because the subjects are concerned with performing the task rather than trying
to discriminate between the two models.
2.2.3 Part Task vs. Whole Task Training
A second experiment, comparing the effectiveness of part task training with whole task training,
was done in parallel with the above haptic fidelity experiment. Part task training is training done
one sub-task at a time, where the desired complete task is a combination of the sub-tasks. Part
task training is used in many fields where training is commonly required, such as sports, aviation
and surgery, and has been found to be especially effective for training beginners with no prior
skill. Experiments were done to determine whether this is also true for laparoscopic surgery
training.
The main task described in section 2.1.2 was divided into four part tasks: a positioning task, a
pushing task, a cutting task and obstacle avoidance. The subjects would train on one part task at a
time. It was important to design the part task training sessions such that the total amount of
training received, in terms of the actions needed and the time taken, was the same as in whole
task training; otherwise a valid comparison could not be made between the effects of part task
and whole task training. The positioning task consisted of simply positioning the tool tip at
desired locations. The obstacle avoidance task involved placing the tools in a position to push the
blocks without touching the upper layer. The pushing task was composed of pushing the block to
the desired location and maintaining the location while transferring the tool, as done in the whole
task. Finally, the cutting task was simply making incision marks on the grid as straight, long, thin
and consistent as possible. At the end of the training, the subjects would be evaluated on the
whole task in the real environment, not on the part tasks.
2.3 Experimental Procedure
Twenty subjects with no prior surgical training were used for the first round of experiments. The
subjects were given the same detailed description of the tasks and scoring. Once they consented
to the experiment and the briefing was complete, they were initially evaluated at the test station
performing the described tasks. The subjects were then divided into five initial skill level groups
depending on their initial performance score. One subject from each skill level group was
assigned to each of the four training treatments. This set up the 5 x 4 matrix shown in Table 2,
with which a two-factor analysis could be done on the results. The analysis assumes that the two
factors affecting the training performance are the initial skill level and the type of trainer used.
                              Type of Trainer
Initial Skill   Nonlinear Haptics   Linear Haptics   No Haptics   Part Task
     1                 N1                L1             NH1           P1
     2                 N2                L2             NH2           P2
     3                 N3                L3             NH3           P3
     4                 N4                L4             NH4           P4
     5                 N5                L5             NH5           P5

Table 2: Experimental design table showing how the subjects were divided and trained. N1,
N2, etc. represent the subjects.
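The blocked assignment behind this design can be sketched as follows; how ties and within-group order were actually handled is not stated, so this is an illustration rather than the thesis's procedure.

```python
def assign_treatments(initial_scores, treatments=("N", "L", "NH", "P")):
    """Blocked 5 x 4 assignment: rank the 20 subjects by initial score,
    slice them into 5 skill-level groups of 4, and give each member of a
    group a different trainer. Within-group order is by rank here, which
    is an assumption."""
    ranked = sorted(range(len(initial_scores)), key=lambda i: initial_scores[i])
    design = {}
    for block in range(len(ranked) // len(treatments)):
        for j, trainer in enumerate(treatments):
            subject = ranked[block * len(treatments) + j]
            design[subject] = (block + 1, trainer)   # (skill group, trainer)
    return design

design = assign_treatments(list(range(20)))  # placeholder initial scores
```

Blocking on initial skill is what later allows the two-factor analysis to separate the trainer effect from differences in starting ability.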
The training for each subject lasted seven sessions. Each session involved completing the task on
the virtual trainer from beginning to end. For the part task training, a session consisted of going
through each part task exercise once. Once the training was complete, the subjects were brought
back to the real environment test station to be evaluated for the final time. The measure of
training effectiveness was the improvement in skill between the initial evaluation at the test
station and the final evaluation, also at the test station.
    Training_Effect = perf_final - perf_initial    (2.2)
The details of the tool path, block position, time to completion and incision marks during the
training and testing sessions were recorded as described in the previous sections. The data was
then used to calculate the performance scores needed for evaluating the results.
3 Experimental Results
In this chapter, the results from the first round of validating experiments are presented. The
results are from 20 subjects who were divided into four training treatment groups. Each group
was trained with one of the three levels of haptic fidelity or by part task training.
3.1 Results from Testing Station
Figure 3.1 shows data from the real environment testing station, recorded during the initial skill
level test and the final test, before and after training, for each subject. The data required from the
test station were the block position, incision mark accuracy, time taken and the number of
obstacles hit.
[Panels for subject P3 before training: time 236.9, obstacles hit 25; after training: time 136.2, obstacles hit 9.]
Figure 3.1: Sample of raw data from the real environment test station for subject P3, before
and after training. The data show (a) the profile of the block position, (b) the incision marks, (c) the
time and (d) the number of obstacles hit.
The graphs in part (a) of Figure 3.1 show the profile of the block position. The two green
(horizontal) lines mark the boundaries within which the block was supposed to be maintained.
The scanned image in part (b) shows the digitized incision marks on the grid.
3.2 Total Score and Skill Improvement
The total scores before and after training were calculated from the raw data using the metrics
described in section 2.1.4. Table 3 is a sample table showing the scores for the five subjects
who received training with the nonlinear elastic model. The score for pushing was calculated by
integrating the area of the block profile that lay outside the desired region. The number of times
the block slipped off the tool was also added to the push accuracy score. The cut accuracy score
likewise had several components: the straightness was measured by the standard deviation of the
center of the incision, and the mean thickness and thickness consistency (standard deviation)
were also determined. These values were combined with the length of the line and the number of
pixels outside the rectangular boundary to produce the total score for cut accuracy. Each
sub-score was normalized so that the inner 75 percent of the data would fall between 0 and 1. The
sub-scores were weighted, as described previously, and then summed to produce the overall
score. A lower score signified a better performance in this scoring system.
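A sketch of the percentile-based normalization, under the assumption that the "inner 75 percent" of the data means the 12.5th-87.5th percentile band; the thesis does not give the exact formula, so both that reading and the interpolation scheme below are assumptions.

```python
def normalize(values, lo_pct=12.5, hi_pct=87.5):
    """Rescale raw sub-scores so the inner 75 percent of the data maps
    onto [0, 1]; values outside that band fall below 0 or above 1."""
    s = sorted(values)

    def pct(p):  # percentile via linear interpolation between order statistics
        k = (len(s) - 1) * p / 100.0
        f = int(k)
        g = min(f + 1, len(s) - 1)
        return s[f] + (s[g] - s[f]) * (k - f)

    lo, hi = pct(lo_pct), pct(hi_pct)
    return [(v - lo) / (hi - lo) for v in values]

norm = normalize(list(range(9)))  # 12.5th percentile -> 0, 87.5th -> 1
```

Normalizing each sub-score onto a common scale is what makes the 1-4-3-2 weighted sum of otherwise incommensurable quantities (seconds, millimeters, pixel counts) meaningful.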
[Table: sub-scores for subjects N1-N5 in the initial and final skill level tests: Time (inter-task time, task time, normalized), Cut Accuracy (straightness, thickness consistency, thickness deviation, length, outside the lines, total, normalized), Push Accuracy (cross red, cross back, loss block, total, normalized), Tool Control (obstacle avoidance, missed combo, total, normalized) and TOTAL.]
Table 3: Sample scoring table for the evaluation tests conducted on the real environment test
station before and after training using the nonlinear elastic model (for 5 subjects).
Figure 3.2 shows a bar chart of the total scores before and after training, as calculated in Table 3.
As mentioned previously, a smaller bar represents a better performance.
Figure 3.2: Bar chart of the total performance score before and after training using the nonlinear
elastic model.
The results from the other trainers, namely the two lower fidelity trainers and the part task trainer,
were evaluated in the manner described above. A scoring table identical to Table 3 was
constructed from the raw data for each of the other three trainers. The total scores were again
plotted as bar charts, shown in Figure 3.3.
Figure 3.3: Bar chart of the total score before and after training for the (a) linear haptic trainer,
(b) no haptics trainer, (c) part task trainer.
As mentioned in the previous chapter, training effectiveness was calculated by subtracting the
initial performance score from the final performance score. This improvement in skill level is
shown in Table 4 for each subject.
    Training_Effect = perf_final - perf_initial    (3.1)
                              Type of Trainer
Initial Skill   Nonlinear Haptics   Linear Haptics   No Haptics   Part Task
     1               1.502              -0.214          -0.154        1.267
     2               3.590               2.297           3.937        4.777
     3               4.894               3.173          -0.091        4.324
     4               6.031               4.735           3.712        4.209
     5               7.183               7.692           6.561       10.356

Table 4: Improvement in performance score (training effect) after virtual reality training.
3.3 Learning Curve for Training
The performance score could also be measured during the training sessions on the virtual trainer.
By plotting the performance for each of the seven training sessions, a learning curve could be
charted for each subject. Figures 3.4-3.6 show the learning curves for the 15 subjects who
received whole task training. The learning curves for the 5 subjects who underwent part task
training are not shown here, because there are no grounds for comparison between scores from
part task operations and the whole task scores shown.
Figure 3.4: Learning curves for subjects who received training on the nonlinear elastic model
VR trainer.
Figure 3.5: Learning curves for subjects who received training on the linear elastic model VR
trainer.
Figure 3.6: Learning curves for subjects who received training on the VR trainer without
haptics.
4 Analysis and Discussions
Analysis of the results shown in the previous chapter was carried out to see how each training
treatment affected the performance improvement. Initially, general observations were made from
the average improvement scores in each training group. Then, a statistical analysis was done to
see whether the differences in training effect were statistically significant. Finally, the learning
curves from the training sessions are discussed in the last step of the analysis.
The analysis will show that the number of subjects as originally planned was not enough to
provide very definite conclusions. Therefore, additional experiments were done as an extension
to the first round of experiments, and the combined results were more conclusive, as will be seen
at the end of this chapter.
4.1 Training Effectiveness
The average improvement in performance was calculated for each of the four training treatment groups. The results are shown in Figure 4.1, from left to right, for the nonlinear elastic (high fidelity), linear elastic, no haptics and part-task training.
Figure 4.1: Average training effectiveness plot.
The y-axis in the above figure shows the improvement in performance score, and each group of bars on the x-axis represents a training method. The tallest bar on the right end of each group represents the average improvement in the total score. The smaller bars to the left of the total show the improvement in each sub-score (time, pushing, cutting and obstacle avoidance). The total is the sum of the sub-metric bars to its left (the sub-metrics were plotted in their weighted form). The error bars on the chart represent the standard deviation of the scores. On first impression, the standard deviation appears to be very large. However, the experiment was designed with a two-factor analysis in mind. The standard deviation as plotted on this bar chart is expected to be large because, for each training treatment, subjects from five different initial skill level groups were used. Moreover, it was previously assumed that the initial skill level affects the improvement in skill. This was taken into account later in the two-factor statistical analysis. Therefore, for the time being, the error bars can be ignored in the discussion of the mean values.
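The group averages behind Figure 4.1 can be recomputed directly from the training-effect scores in Table 4. The following Python sketch is not part of the original analysis (which was done in MATLAB); the dictionary keys are shorthand labels introduced here for illustration.

```python
from statistics import mean, stdev

# Training-effect scores from Table 4, one list per training treatment
# (five subjects each, ordered by initial skill level group 1-5).
table4 = {
    "nonlinear haptics": [1.502, 3.590, 4.894, 6.031, 7.183],
    "linear haptics":    [-0.214, 2.297, 3.173, 4.735, 7.692],
    "no haptics":        [-0.154, 3.937, -0.091, 3.712, 6.561],
    "part task":         [1.267, 4.777, 4.324, 4.209, 10.356],
}

for trainer, scores in table4.items():
    # The plotted error bars correspond to the (large) standard deviation.
    print(f"{trainer}: mean = {mean(scores):.3f}, sd = {stdev(scores):.3f}")
```

The resulting means reproduce the ordering visible in Figure 4.1: improvement falls as haptic fidelity falls (nonlinear > linear > no haptics), with the part-task group highest overall.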
One obvious trend that can be seen in Figure 4.1 is that as fidelity decreased, so did the improvement in total skill level. This supports the assertion that haptic fidelity, more specifically the fidelity in modeling nonlinear stiffness, plays an important role in laparoscopic training. In fact, not only is there a difference between training with haptics and without, there is also a notable difference between the training effect of the nonlinear model and the linear approximation. In the previous chapter it was predicted that the subject might barely be able to discriminate between a linear and nonlinear spring of this magnitude, but that the difference in training effect would be negligible. However, the results seem to show otherwise. Not only did the subjects seem to be able to distinguish between the two models, the training effects of the two models seemed to differ. On the other hand, if the results are studied more closely, there are signs that the difference between the linear and nonlinear models is merely coincidental and more likely due to variance among subjects. Such an argument is supported by the fact that the bar charts for the sub-scores show that the main difference was in the cutting accuracy. However, the cutting task was modeled exactly the same for both trainers. Therefore, two arguments could be made. The difference in score may simply be due to the sample variance, or it could be that the spring modeling had an indirect effect on the cutting because the spring-loaded block had to be held while making the incision. Further insight can be obtained from the statistical analysis shown in the next section.
Another observation that can be made from Figure 4.1 is that the total skill improvement for the part-task training is larger than those for the whole-task training. The spring in the part-task trainer was modeled with the nonlinear fit. Therefore, the only direct comparison that can be made is between the nonlinear whole-task training (first set of bar charts) and part-task training (last set of bar charts). Although the total score improvement was greater for part-task training, the difference between the two was not significant, as will be shown later in the statistical analysis. However, what is interesting is that the improvement in obstacle avoidance was notably greater in part-task training than in any of the whole-task training results. This could be attributed to the fact that when performing the total combination of tasks the subjects concentrate mainly on controlling the block and making an accurate incision and care less about avoiding the obstacle. On the other hand, in part-task training there is a training session solely devoted to avoiding obstacles, and thus the subjects become more accustomed to obstacle avoidance. Whatever the case, it can be said that part-task training was an effective method of training for this set of tasks, especially since the training effect was greater than or equivalent to whole-task training even though the subjects were not trained even once on the complete combination of tasks.
There were also some notable trends that can be seen from training without haptics. The average improvement in total score was smaller for those who trained without haptics, as expected. The score improvements for cutting accuracy were especially smaller than in other training groups. This was probably because the cutting task required the most precise tool control, and haptic cues during training were useful for the subjects in determining depth and controlling the tool tip. Another interesting result from the training without haptics can be seen in Figure 4.2.
Figure 4.2: Performance improvement shown with initial skill level group.
The above figure shows the total score improvement plotted for each subject, with the number above each data point representing their initial skill level group. Therefore, a lower number means that the subject performed better in their initial test. A very structured trend can be seen from the plot for the nonlinear and linear haptic training. The subjects who initially had the least skill showed the most improvement, and the initially most skilled subjects showed the least improvement. The subjects in between are spaced fairly equally in perfect order. However, for the training without haptics, the subjects are clustered here and there in random order. A similar effect can be seen for part-task training. What this shows is that the subjects responded to the training to varying degrees. For example, some subjects were not able to perform the task at all when there was no haptic feedback in the virtual environment. These subjects may continue their training without much improvement, and this could be seen in some of the learning curves shown in chapter 3. On the other hand, some subjects may initially start off struggling on the trainer without force feedback but steadily master the technique as the number of training sessions increases. Therefore, for those who were able to perform the task well without haptics, performing the task at the real environment test station became much easier in comparison. Subjects NH3, NH2 and NH5 from Figure 4.2 are examples of those who were able to fairly master the task without force feedback, and subjects NH1 and NH4 were those who struggled throughout. Similarly for the part-task trainer, some subjects were able to perform the combination of tasks well from training each part task separately, and some were not. It depended on how well the subject was able to translate the skill acquired for the part tasks into the skills needed for performing the whole combination of tasks. Ideally, a trainer needs to be consistent in being able to train all subjects. Therefore, a whole-task trainer with force feedback proved to be better in that aspect.
4.2 Statistical Analysis
It has been speculated above that the different training treatments had different training effects. However, for any of the conclusions to be mathematically valid, a statistical analysis of the results had to be done. If the means do not differ by much and the standard deviations are large, it cannot be concluded that the difference in the means is due to the trainer and not the sample variation.
A two-factor ANOVA (analysis of variance) test was conducted on the results. As previously
mentioned, it was assumed that the increase in performance was influenced by two factors. The
factors were the training treatment received and the initial skill level of the subject. Thus, the
assumed relationship for the total performance increase is shown by the following equation.
y_ijk = μ + α_i + β_j + γ_ij + ε_ijk        (4.1)
y_ijk is the performance increase for a subject who underwent training treatment i, belongs to initial skill level group j, and has repetition index k. μ is the common effect, α_i is the training effect of trainer i, β_j is the effect of the subject's initial skill level group j, γ_ij is the interaction between factors α and β, and ε_ijk is the uncontrolled variation for this specific subject. Using this relationship, two hypotheses can be set up. The first hypothesis is that the effects of all four training treatments were the same.
H_0: α_1 = α_2 = α_3 = α_4        (4.2)
The second hypothesis is that the effects of the five initial skill level groups were the same.
H_1: β_1 = β_2 = β_3 = β_4 = β_5        (4.3)
The main interest is to show that the first null hypothesis can be rejected at a significant confidence level. For this analysis, Table 4 shown in the previous chapter can be used.
For the two-factor ANOVA test, the following sums of squares (SS), mean squares (MS) and degrees of freedom (df) needed to be calculated.

Source        df           SS                                                     MS
α (columns)   a−1          SS_α = Σ_i y_i..² / (bn) − y...² / (abn)               MS_α = SS_α / df
β (rows)      b−1          SS_β = Σ_j y_.j.² / (an) − y...² / (abn)               MS_β = SS_β / df
γ             (a−1)(b−1)   SS_γ = Σ_i Σ_j y_ij.² / n − y...² / (abn) − SS_α − SS_β   MS_γ = SS_γ / df
Error         ab(n−1)      SS_E = SS_T − SS_α − SS_β − SS_γ                       MS_E = SS_E / df
Total         abn−1        SS_T = Σ_i Σ_j Σ_k y_ijk² − y...² / (abn)

Table 5: ANOVA table calculations for a two-factor factorial test with n replicates of each treatment combination [44]. n=1, a=4 and b=5 for this experiment.
With the above calculations, the test statistic (F-statistic) could be found for each effect.
F = MS_α / MS_E    for factor α        (4.4)
F = MS_β / MS_E    for factor β        (4.5)
From the statistic and degrees of freedom, the confidence level for rejecting the null hypothesis could be found. The calculations were done for the experiment results obtained in Table 4 using MATLAB's statistics toolbox. The results of the ANOVA calculations are shown below.
Source       SS        df    MS         F        Prob>F
Columns       15.275    3     5.0917     2.83    0.0832
Rows         109.383    4    27.3458    15.21    0.0001
Error         21.581   12     1.7984
Total        146.239   19

Table 6: ANOVA table of results
The second hypothesis, H_1, could easily be rejected at the 5% significance level (p=0.0001<0.05). Therefore, it can be concluded that the initial skill level had a significant effect on the training effectiveness. However, the null hypothesis, H_0, which was of more interest, could not be rejected at the 5% significance level (p=0.0832>0.05). Although there was a notable difference in means as shown in Figure 4.1, the variance was unfortunately too large for the differences to be statistically significant. Large variations are very common in most human factors studies, and such results were somewhat expected in this experiment since the sample size was fairly small. This was accentuated by the fact that a two-factor test requires a division of subjects into a 4 by 5 matrix. Therefore, although 20 subjects were used, there was only one subject per cell.
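The computation described above can be sketched in Python using the Table 5 formulas with n = 1. This is a minimal, stdlib-only sketch, not the thesis's own analysis (which was done with MATLAB's statistics toolbox); it stops at the F-statistics and leaves the p-values to an F-distribution table, and because the scores here are transcribed from Table 4, the printed values need not match Table 6 to the last digit.

```python
# Two-factor ANOVA without replication (n = 1), following the Table 5
# formulas. Rows are the b = 5 initial skill level groups; columns are the
# a = 4 training treatments (nonlinear, linear, no haptics, part task).
scores = [
    [1.502, -0.214, -0.154,  1.267],
    [3.590,  2.297,  3.937,  4.777],
    [4.894,  3.173, -0.091,  4.324],
    [6.031,  4.735,  3.712,  4.209],
    [7.183,  7.692,  6.561, 10.356],
]
b = len(scores)          # skill level groups (rows)
a = len(scores[0])       # training treatments (columns)

grand = sum(sum(row) for row in scores)
correction = grand ** 2 / (a * b)            # y...^2 / (abn), with n = 1

col_sums = [sum(row[j] for row in scores) for j in range(a)]
row_sums = [sum(row) for row in scores]

ss_alpha = sum(c ** 2 for c in col_sums) / b - correction   # treatment effect
ss_beta = sum(r ** 2 for r in row_sums) / a - correction    # skill level effect
ss_total = sum(y ** 2 for row in scores for y in row) - correction
ss_error = ss_total - ss_alpha - ss_beta     # interaction doubles as error when n = 1

df_alpha, df_beta = a - 1, b - 1
df_error = df_alpha * df_beta
ms_error = ss_error / df_error
f_alpha = (ss_alpha / df_alpha) / ms_error   # test statistic for H0 (trainers)
f_beta = (ss_beta / df_beta) / ms_error      # test statistic for H1 (skill levels)
print(f"F(trainers) = {f_alpha:.2f}, F(skill levels) = {f_beta:.2f}")
```

With only one subject per cell, the interaction term cannot be separated from the error term, which is why the error degrees of freedom are (a−1)(b−1) = 12.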
However, the fact that the statistics are not conclusive does not make the observations made in the previous section invalid. There were clear trends that could be noted, such as higher fidelity models producing better average improvement. What is needed is more subjects. If in fact the observations made in the previous sections were correct representations of the whole population, a larger sample size would decrease the variance and make the conclusions statistically valid. Thus, experiments were done with 12 additional subjects and the results of the final analysis will be shown in a later section.
A pairwise comparison of the four training treatments using a Tukey test was also conducted. The training effect of a trainer was denoted τ, and the Tukey test was done to show whether each training treatment was significantly different from the others. The results are shown in Table 7.
Table 7: Tukey pairwise comparison. Probabilities for the comparisons τ_nonlinear = τ_linear, τ_nonlinear = τ_no-haptics, τ_nonlinear = τ_part-task, and τ_linear = τ_no-haptics.
The comparisons show that none of the training effects can be concluded to be different from one another at the 5% significance level. This was partly expected, since the ANOVA test did not show a significant enough difference. However, the same test with more subjects showed much better results, as will be presented later.
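A Tukey comparison of this kind can be sketched as follows. This is an illustrative reconstruction, not the thesis's own calculation: the group means are the Table 4 column averages, MSE and its 12 error degrees of freedom are taken from the ANOVA table, and the critical studentized-range value q(0.05; 4, 12) ≈ 4.20 is read from a standard table.

```python
import itertools
import math

# Per-treatment mean improvements (column averages of Table 4).
means = {
    "nonlinear": 4.640,
    "linear": 3.537,
    "no_haptics": 2.793,
    "part_task": 4.987,
}
n_per_group = 5          # subjects per treatment
mse = 1.7984             # error mean square from the ANOVA table
q_crit = 4.20            # studentized range, alpha = 0.05, k = 4 groups, 12 error df

# Tukey's honestly-significant-difference threshold.
hsd = q_crit * math.sqrt(mse / n_per_group)

results = {}
for g1, g2 in itertools.combinations(means, 2):
    diff = abs(means[g1] - means[g2])
    results[(g1, g2)] = diff > hsd           # True would mean a significant pair
    print(f"{g1} vs {g2}: |diff| = {diff:.3f}, HSD = {hsd:.3f}")
```

No pairwise difference exceeds the HSD threshold, in line with Table 7's conclusion that no two training effects differ at the 5% level.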
4.3 Learning Curve
Figure 4.3 shows the average learning curve for the individual learning curves shown in section 3.3.
Figure 4.3: Average learning curve for training on the VR trainer. Error bars show standard deviation among subjects.
The learning curves for the linear elastic haptic model and the nonlinear model were virtually identical. In fact, there is no reason why the curves should be dissimilar, since subjects are performing the same tasks with the only difference being the model of the spring. The magnitudes of the force exerted by the two spring models were approximately the same in the operating region, so whether the spring is linear or nonlinear should not matter in the learning curve if the subject was trained solely on one spring. (The performance of subjects trained on the two different spring models might not be the same when they are evaluated at the real environment station. That would be an entirely different matter, since the subject would be evaluated on a nonlinear spring and the performance might differ depending on whether they were trained on a linear or nonlinear elastic model.) The learning curves shown for the two haptic models have the conventional shape seen in most learning curves. The curve is steep in the first few trials and levels off towards the end as the learning saturates. It is plainly obvious from the curves that training took place within the virtual environment. There is a significant increase in performance from beginning to end. However, as mentioned in the first chapter, how this training transfers to the real environment needs to be measured separately.
The learning curve for the training without haptics was somewhat different. The overall performance score was at a significantly lower level than the other two curves. Also, the curve was more erratic, with less of a steady upward trend. This shows how difficult it was to perform the tasks in the virtual environment without force feedback. Also, the response to the training without haptics varied widely among subjects, as shown by the large error bars. Some subjects were able to grasp how to control the tools without haptics within a few sessions, and others struggled from the start to the end of their training sessions. This resulted in the training effect having a high variance, as discussed previously.
4.4 Results from Further Subjects
Further experiments were conducted with 9 more subjects in an attempt to obtain better results.
Although the initial round of experiments showed some encouraging results, the statistical
analysis showed that the data was not entirely satisfactory in making clear-cut conclusions. The
results obtained from the additional subjects are presented here in a separate section because these
experiments were conducted after the initial analysis of the results and were not part of the
original experimental design.
The 9 subjects were initially evaluated at the real environment test station. The subjects were then
divided into three training treatment groups of nonlinear haptics, linear haptics and no haptics.
None of the subjects were treated with part task training because the first round results already
showed that part task training was an effective method of training. Therefore, the additional
subjects were used solely to determine the difference in training effectiveness of the three
haptic models. The experimental methods were exactly the same as those described in the
previous chapters.
For the statistical analysis, the subjects were divided into eight initial skill level groups so that it was possible to have one subject per group for each of the three training treatments. A total of 24 subjects were needed for this matrix. The results for the 15 subjects from the first round were kept, and the 9 additional results were combined to give results for 24 subjects.
The final results of the experiments are shown in Table 8 and Figure 4.4. (These are in the same format as the results from the first round.)
                             Type of Trainer
Initial Skill    Nonlinear    Linear     No
Level Group      Haptics      Haptics    Haptics
1                 1.502        2.709     -0.515
2                 1.507       -0.214     -0.496
3                 3.590        2.297      3.263
4                 5.203        6.198      2.164
5                 4.894        3.173     -0.091
6                 6.031        4.734      3.712
7                 7.183        7.692      4.026
8                13.625       13.345      6.5609
mean              5.4420       4.9917     2.5947

Table 8: Improvement in performance score after virtual reality training (Revised).
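As a sanity check, the per-trainer means in the last row of Table 8 follow from averaging each column. A small sketch, using the scores as tabulated (the dictionary keys are shorthand labels introduced here, not the thesis's notation):

```python
from statistics import mean

# Training-effect scores from Table 8, one list per trainer
# (eight subjects each, ordered by initial skill level group 1-8).
table8 = {
    "nonlinear": [1.502, 1.507, 3.590, 5.203, 4.894, 6.031, 7.183, 13.625],
    "linear": [2.709, -0.214, 2.297, 6.198, 3.173, 4.734, 7.692, 13.345],
    "no_haptics": [-0.515, -0.496, 3.263, 2.164, -0.091, 3.712, 4.026, 6.5609],
}

for trainer, scores in table8.items():
    print(f"{trainer}: mean improvement = {mean(scores):.4f}")
```

The two haptic trainers come out nearly equal (about 5.44 and 4.99), while the no-haptics group trails well behind, which is the pattern the Tukey comparison in Table 9 confirms.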
Figure 4.4: Average training effectiveness plot (Revised). For each training model (nonlinear, linear, no haptics), the bars show the improvement in the total score and in the sub-scores (time, cut accuracy, push accuracy, obstacle avoidance); error bars show the Tukey interval.
The difference in the total score improvement between the haptic training modes and the training without haptics was more noticeable than before. Another notable change was that there was no significant difference between the score improvements for the nonlinear elastic and linear elastic models. The improvement in each sub-metric score was comparable for the two haptic training modes, and again cutting and pushing accuracy seemed to deteriorate the most among the sub-scores when force feedback was absent.
In Figure 4.4, the error bars represent the interval within which a Tukey pairwise comparison would show a significant difference between the means. Therefore, if the intervals are disjoint, it can be concluded that the training effects are different at the 5% significance level. Showing this statistic as the error bar seemed more appropriate than plotting the standard deviation. The standard deviation was somewhat meaningless because the performance increase was affected by two factors, while only one factor was plotted on the x-axis.
A two-factor ANOVA test was conducted on the results shown in Table 8. This time the null hypothesis (the hypothesis that the training effects are the same),
H_0: α_1 = α_2 = α_3        (4.2)
could easily be rejected at the 5% significance level (p=0.026). The pairwise comparison also showed improved results.
               τ_nonlinear = τ_linear    τ_nonlinear = τ_no-haptics    τ_linear = τ_no-haptics
Probability    0.81                      0.003                         0.011

Table 9: Tukey pairwise comparison (Revised)
The comparison shows that the training effect of the trainer without haptics was significantly
lower than that of the trainers with force feedback. Also, there was no significant difference
between the training transfer of the nonlinear elastic model and the linear elastic model.
Therefore, as predicted in chapter 2, the accuracy of the elastic model did not appear to affect the
training performance for a bimanual pushing task. However, force feedback did appear to be
essential in effective training for surgical tasks in the virtual environment.
5 Conclusions and Future Work
5.1 Conclusions
The overall contribution of this thesis can be divided into two parts: the design of the experiment platform and the results from the initial rounds of experiments conducted on the platform. The experiments yielded several conclusions. First of all, it was demonstrated that positive training transfer can occur from virtual environment training for laparoscopic operations. This may seem like a trivial conclusion, but it is in fact an important one. This is the first time it has been shown that real environment surgical skills can be improved through virtual environment training alone. Such a conclusion could be made because the improvement in skill was a measure of performance in the real environment and not a measurement of virtual world skills.
The next general conclusion that can be made is that force feedback appears to be important for effective surgical training. The skill level improvement for the trainers with haptic feedback was significantly higher than that of the trainer without haptics. There are currently many surgical simulations that rely solely on graphic feedback without haptics. These systems have their advantages, such as simpler hardware, lower computational requirements and lower cost. Also, the results seem to show that a virtual trainer without force feedback would in fact give a positive training transfer on average. However, there is no doubt that haptics enhances the training effect considerably and that performance is improved on a more consistent basis. Also, training with force feedback seemed to become more important for surgical tasks that require more accuracy and delicate tool control. Therefore, it is safe to say that an effective surgical simulation is one that provides both graphic and haptic feedback.
The training effects of the linear elastic and the nonlinear elastic model were not significantly different for these tasks. This suggests that modeling the nonlinear elasticity of tissue is not important for bimanual pushing tasks. A simple linear approximation of the stiffness seems to result in training effects that are not significantly reduced, for stiffnesses comparable to organ tissue stiffnesses. Therefore, it seems that accurate characterization of in-vivo force-displacement properties for implementation of nonlinear models is not necessary in training for surgical pushing and pulling tasks. Due to hardware limitations, the additional forces on the surgical tools from the friction and inertia of the haptic interface and the resistive forces of the abdominal wall are too large for the small differences in stiffness to have any significant effect on training.
Finally, part-task training proved to be an equally effective method of training for simple surgical tasks. Therefore, training with part tasks should be considered a viable alternative to whole-task methods for entry-level training. Obviously, the effectiveness of part-task training will decline as the whole task becomes more complicated. Therefore, there are limits to what part-task training as a stand-alone training device can achieve. However, part-task training as a precursor to real environment whole-task training may be effective. In many cases, creating an entire environment modeling human anatomy for a complete set of surgical tasks is challenging. However, modeling parts of organs where specific part tasks are to be performed is a much simpler undertaking. Therefore, developing part-task trainers may be a way to maximize training effect under the technical limitations that developers currently face.
The preliminary results are promising. However, the more important contribution of this thesis up to this point is the design of the experimental platform. Through the design of this setup, a tool is now available for measuring the training effectiveness of virtual environment simulation in terms of real world skills. It may seem that the significance of such measurements is debatable, since the objects in the simulation are not human organs and the tasks are not real surgical procedures. However, the intention of the platform is not to train subjects to perform real surgery or to measure how much the subject's surgical skill has improved. The purpose of the setup is to provide a test bed for adjusting various parameters within the virtual environment to observe their effect on training transfer. The setup appears to serve that purpose well. Now, various training experiments can be done on this platform to answer questions about the fidelity of simulation required and the training methods that are most suitable for laparoscopic VR training. Thus, this platform has provided "the missing link" between the real and virtual environments, whose absence has been the stumbling block for VR training effectiveness studies in the past.
5.2 Future Work
First priority is given to extending the initial study by experimenting with more levels of elastic fidelity. It would be interesting to see at what level of stiffness fidelity the training effect starts to deteriorate. This would give insight into whether or not the accuracy of modeling elasticity is important and to what degree, given the limitations of the haptic interfaces used.
The experiments that were conducted with modeling nonlinear elasticity and part-task versus whole-task training are the first of many experiments that can be done on this platform. Other aspects of haptic fidelity can be investigated, such as viscoelasticity, damping, and simulation of wet surfaces, among many others. Experiments can also be done with the fidelity of the graphics. Photo-realistic texture, realistic shading and glistening could be implemented to various degrees of accuracy in order to investigate their effect on training through this platform. Results from such further experiments would be used to set guidelines for the fidelity required to achieve a particular level of training. These guidelines would be useful for developers designing surgical simulators in determining the degree of fidelity they need to aim for in designing an effective simulator. This would allow designers to focus their resources without wasting time and effort striving to make the simulation as realistic as possible.
The results from the experiments performed on this platform will also serve as the basis for future surgical simulation at the Touchlab. The development of an effective surgical simulation is one of the ultimate goals of the lab. The general procedure for surgical simulation development is to measure in-vivo material properties of human organs and establish a suitable model of the organ. The model would then be incorporated into a surgical simulator setup similar to the hardware shown in this study. The last step would be to test the simulator with subjects to validate the training transfer. This process is already underway at the lab for a simulation of Heller's myotomy.
References
[1]
I. Hunter, T. Doukoglu, S. Lafontaine, P. Charatte, L. Jones, M. Sagar, G. Mallison, and
P. Hunter, "A teleoperated microsurgical robot and associated virtual environment for
eye surgery," Presence, vol. 2, pp. 265-280, 1993.
[2]
D. Ota, B. Loftin, T. Saito, R. Lea, and J. Keller, "Virtual reality in surgical education,"
Computters in Biology andMedicine, vol. 25, pp. 127-137, 1995.
[3]
J. Rosen, A. Lasko-Harvill, and R. Satava, "Virtual reality and surgery," in ComputerIntegrates Surgery: Technology and ClinicalApplications, R. Taylor, S. Lavellee, and G.
Burdea, Eds.: The MIT Press, 1996, pp. 231-243.
[4]
R. Satava, "Virtual Reality Surgical Simulator: The First Steps," Journal of Surgical
Endoscopy, vol. 7, pp. 203-205, 1993.
[5]
S. Delp, J. Loan, M. Hoy, F. Zajac, E. Topp, and J. Rosen, "An interactive graphic-based
model of the lower extremity to study orthapedic
surgical procedures,"
IEEE
Transactionson BiomedicalEngineering,vol. 37, pp. 757-767, 1990.
[6]
U. Kuhnapfel, C. Kuhn, M. Hubner, H. Krumm, H. Maab, and B. Neisius, "The
Karlsruhe Endoscopic Surgery Trainer as an example for Virtual Reality in Medical
Education," Minimally Invasive Therapy and Alliea Techologies (MITAT), vol. 6, pp.
122-125, 1997.
[7]
C. Basdogan, C. Ho, and M. A. Srinivasan, "Force Interaction in Laparoscopic
Simulation
Haptic Rendering of Soft Tissue," presented at Medicine Meets Virtual
Reality, 1998.
[8]
S. Cotin, H. Delingette, and N. Ayache, "Real-time elastic deformations of soft tissue for
surgery simulation," IEEE Trans. On Visualization and computer graphics, vol. 5, pp.
62-73, 1999.
[9]
M. Bro-Nielsen, "Finite Element Modeling in Surgery Simulation," Proceedingof IEEE,
vol. 86, pp. 490-503, 1998.
[10]
S. Pieper, J. Rosen, and D. Zeltzer, "Interactive Graphics for Plastic Surgery : A Task
Level Analysis and Implementation," Proceeding of Computer Graphics, pp. 127-134,
1992.
[11]
S. De, J. Kim, and M. A. Srinivasan, "A Meshless Numerical Technique for Physically
Based Real Time Medical Simulations," presented at Medicine Meets Virtual Reality,
2001.
[12]
F. Tendick, M. C. Cavusglu, and e. al., "A Virtual Environment Testbed for Training
Laparoscopc Surgical Skill," Presence, vol. 9, pp. 236-255, 2000.
[13]
A. Derossis, G. Fried, M. Abrahamowicz, H. Sigman, J. Barkun, and J. Meakins,
"Development of a Model for Training and Evaluation of Laparoscopic Skills," The
American JournalofSurgery, vol. 175, pp. 482-487, 1998.
[14]
C. Sutton, R. McCloy, A. Middlebrook, P. Chater, M. Wilson, and R. Stone, "A
laparoscopic Surgery Procedures Trainer and Evaluator," presented at Medicine Meets
Virtual Reality, 1997.
[15]
S. Payandeh, A. Lomax, J. Dill, C. Mackenzie, and C. Cao, "On Defining Metrics for
Assesing Laparoscopic Surgical Skills in a Virtual Training Environment," presented at
Medicine Meets Virtual Reality, 2002.
[16]
R. Adams, D. Klowden, and B. Hannaford, "Virtual Training for a Manual Assembly
Task," Haptics-e, vol. 2, 2001.
[17]
J. Kozak, P. Hancock, E. Arthur, and S. Chrysler, "Transfer of training from virtual
reality," Ergonomics,vol. 36, pp. 777-784, 1993.
[18]
B. Witmer, J. Bailey, B. Knerr, and K. Parsons, "Virtual spaces and real world places:
Transfer of route knowledge," InternationalJournal of Human-Computer Studies, vol.
45, pp. 413-428, 1996.
[19]
J. Bliss, P. Tidwell, and M. Guest, "The effectiveness of virtual reality for administrating
spatial navigation training for firefighters," Presence, vol. 6, pp. 73-86, 1997.
[20]
R. Boian, A. Sharma, C. Han, A. Merians, G. Burdea, S. Adamovich, M. Recce,
M.Termaine, and H. Poizner, "Virtual Reality-Based Post-Stroke Hand Rehabilitation,"
presented at Medicine Meets Virtual Reality, 2002.
[21]
T. Carretta and R. Dunlap, "Transfer of effectiveness in flight simulation: 1986 to 1997,"
: Air Force Research Laboratory, NTIS, 1998.
[22]
G. Lintern, S. Roscoe, J. Koonce, and L. Segal, "Transfer of landing skills in beginning
flight training," Human Factors,vol. 32, pp. 319-327, 1990.
[23]
A. G. f. A. R. a. Development, "Fidelity of simulation for pilot training," NATO 1980.
[24]
D. Kurts and C. Gainer, "The Use of a Dedicated Testbed to Evaluate Simulator Training
Effectiveness," .
[25]
C. Wagner, N. Stylopoulos, and R. Howe, "The Role of Force Feeback In Surgery:
Analysis of Blunt Dissection," presented at 10th Annual Haptic Symposium, Orlando,
2002.
[26]
C. Lathan, M. Tracey, M. Sebrechts, D. Clawson, and G. Higgins, "Using Virtual
Environments as Training Simulators: Measuring Transfer," in Handbook of Virtual
Environments, K. Stanney, Ed.: Lawrence Erlbaum Associates, 2002.
[27]
N. Taffinder, C. Sutton, R. Fishwick, I. MacManus, and A. Darzi, "Validation of Virtual
Reality To Teach and Assess Psychomotor Skills in Laparoscopic Surgery: Results from
Randomised Controlled Studies Using the MIST VR Laparoscopic Simulator," presented
at Medicine Meets Virtual Reality, 1998.
[28]
A. Chaudhry, C. Sutton, J. Wood, R. Stone, and R. McCloy, "Learning rate for
laparoscopic surgical skills on MIST VR, a virtual reality simulator: quality of
human-computer interface," Ann R Coll Surg Engl, vol. 81, pp. 281-286, 1999.
[29]
G. Chubb and P. Macy, "Microsoft Flight Simulator Suitability for Cross Country
Exercises for Private Pilot Training," presented at AIAA Modeling and Simulation
Technologies Conference, 1997.
[30]
G. Anderson, "A method for aircraft simulation verification and validation developed at
the United States Air Force flight simulation facility," presented at AGARD, Flight
Simulation, 1986.
[31]
M. Bonner and D. Gingras, "Evaluation of the Navy's F/A-18 A/D Powered Approach
Aerodynamics Model," presented at AIAA Modeling and Simulation Technologies
Conference, New Orleans, LA, 1997.
[32]
J. Rolfe and K. Staples, "Flight Simulation," in Cambridge Aerospace Series. Cambridge:
Cambridge University Press, 1986.
[33]
D. Kurts and C. Gainer, "The Use of a Dedicated Testbed to Evaluate Simulator Training
Effectiveness," presented at AGARD, Piloted Simulation Effectiveness, 1992.
[34]
M. O'Rourke, J. Ralston, J. Bell, and S. Lash, "PC-Based Simulation of the F-16/MATV,"
presented at AIAA Modeling and Simulation Technologies Conference,
New Orleans, LA, 1997.
[35]
K. Neville, "Industry Initiative for Revised Training Simulator Validating Process,"
presented at AIAA Modeling and Simulation Technologies Conference, New Orleans,
LA, 1997.
[36]
J. Rosen, M. MacFarlane, C. Richards, B. Hannaford, and M. Sinanan, "Surgeon-Tool
Force/Torque Signatures - Evaluation of Surgical Skills in Minimally Invasive Surgery,"
presented at Proceedings of the MMVR Conference, 1999.
[37]
R. O'Toole, R. Playter, T. Krummel, W. Blank, N. Cornelius, W. Roberts, W. Bell, and
M. Raibert, "Assessing Skill and Learning in Surgeons and Medical Students Using a
Force Feedback Surgical Simulator," presented at MICCAI, Cambridge, MA, 1998.
[38]
J. Rosser, L. Rosser, and R. Savalgi, "Skill Acquisition Assessment for Laparoscopic
Surgery," Arch Surg, vol. 132, pp. 200-204, 1997.
[39]
L. Moody, C. Barber, and T. Arvanitis, "Objective Surgical Performance Evaluation
based on Haptic Feedback," presented at Medicine Meets Virtual Reality, 2002.
[40]
J. Rosen, C. Richards, B. Hannaford, and M. Sinanan, "Hidden Markov Models of
Minimally Invasive Surgery," presented at Medicine Meets Virtual Reality, 2000.
[41]
M. Srinivasan, "Virtual Reality: Scientific and Technical Challenges," in Report of the
Committee on Virtual Reality Research and Development, N. Durlach and A. Mavor,
Eds.: National Research Council, National Academy Press, 1995.
[42]
S. Kilbreath and S. Gandevia, "Neural and biomechanical specialization of human thumb
muscles revealed by matching weights and grasping objects," Journal of Physiology, vol.
472, pp. 537-556, 1993.
[43]
L. A. Jones and I. W. Hunter, "Influence of the Mechanical Properties of a
Manipulandum on Human Operator Dynamics; Part 1. Elastic Stiffness," Biol. Cybern.,
vol. 62, pp. 299-307, 1990.
[44]
W. Gardiner and G. Gettinby, Experimental Design Techniques in Statistical Practice:
A Practical Software-Based Approach. Horwood Publishing Ltd., 1998.