EVALUATING SAFETY PROTOCOLS FOR
MANNED-UNMANNED ENVIRONMENTS THROUGH
AGENT-BASED SIMULATION
by
Jason C. Ryan
B.S. in Aerospace Engineering, University of Alabama (2007)
S.M. in Aeronautics and Astronautics, Massachusetts Institute of Technology (2011)
Submitted to the Engineering Systems Division
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Human Systems Engineering
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2014
© Jason C. Ryan, MMXIV. All rights reserved.
The author hereby grants to MIT permission to reproduce and to distribute publicly
paper and electronic copies of this thesis document in whole or in part in any
medium now known or hereafter created.
Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Engineering Systems Division
August 2014
Certified by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Mary L. Cummings
Visiting Professor of Aeronautics and Astronautics
Thesis Supervisor
Certified by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
R. John Hansman
T. Wilson Professor of Aeronautics and Astronautics and Engineering Systems
Certified by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Julie A. Shah
Assistant Professor of Aeronautics and Astronautics
Accepted by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Munther A. Dahleh
William A. Coolidge Professor of Electrical Engineering and Computer Science
Acting Director of MIT Engineering Systems Division
EVALUATING SAFETY PROTOCOLS FOR MANNED-UNMANNED
ENVIRONMENTS THROUGH AGENT-BASED SIMULATION
by
Jason C. Ryan
Submitted to the Engineering Systems Division
in August 2014, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy in Human Systems Engineering
Abstract
Recent advances in autonomous system capabilities have improved their performance sufficiently to
make the integration of unmanned and autonomous vehicle systems into human-centered civilian
environments a realistic near-term goal. In these environments, such as the national highway system, mining operations, and manufacturing operations, unmanned and autonomous systems will be required
to interact with large numbers of other unmanned vehicle systems as well as with manned vehicles
and other human collaborators. While prior research provides information on the different methods
of controlling unmanned vehicles and the effects of these methods on individual vehicle behavior, it
has typically focused on only a small number of unmanned systems acting in isolation. The qualities that provide the desired behavior of an autonomous system in isolation may not be
the same as the characteristics that lead to desirable performance while interacting with multiple
heterogeneous actors. Additionally, the integration of autonomous systems may include constraints
on operations that limit interactions between manned and unmanned agents. It is not clear which
constraints might be most effective in promoting safe operations and how these constraints may
interact with unmanned system control architectures.
Examining the effects of unmanned systems in these large, complex systems in reality would
require significant capital investment and the physical construction and implementation of the unmanned vehicles of interest. Both of these aspects make empirical testing either difficult or impossible
to perform and may also limit the ability of testing to fully examine all the parameters of interest
in a safe and efficient manner. The objective of this thesis is the creation of a simulation environment that can replicate the behavior of the unmanned vehicle systems, manned vehicles, and
human collaborators in order to explore how parameters related
to individual actor behavior and actor interactions affect performance. The aircraft carrier flight
deck is chosen as an example domain, given that current operations require significant interactions
between human collaborators and manned vehicles and current research addresses the development
of unmanned vehicle systems for flight deck operations. Because the complexity of interactions between actors makes the creation of closed-form solutions of system behavior difficult, an agent-based
modeling approach is taken. First, a list of actors and their characteristic tasks, decision-making
processes, states, and parameters for current aircraft carrier flight deck operations was generated.
Next, these models were implemented in an object-oriented programming language, enabling the
definition of independent tasks, actors, parameters, and states. These models were later extended
to incorporate features of unmanned vehicle control architectures by making minor modifications
to the state, logic functions, or parameters of current agents (or tasks). This same tactic can be applied by future researchers to examine other influential aspects of system performance or to adapt the model to other domains.
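As a purely illustrative sketch of this object-oriented decomposition (the thesis does not reproduce its source code here; the class names Task and Agent below are hypothetical, not the actual MASCS types), an agent can be represented as an object that owns its own parameters, state, and task queue and is advanced one simulation time step at a time:

    from dataclasses import dataclass, field

    @dataclass
    class Task:
        # A discrete activity with a duration and a completion state.
        name: str
        duration: float      # seconds, e.g. drawn from an empirical distribution
        elapsed: float = 0.0

        def step(self, dt: float) -> bool:
            # Advance the task by dt seconds; return True when it has finished.
            self.elapsed += dt
            return self.elapsed >= self.duration

    @dataclass
    class Agent:
        # An actor (aircraft/pilot, Director, Handler) with its own parameters,
        # state, and queue of tasks, updated once per simulation time step.
        name: str
        parameters: dict = field(default_factory=dict)   # e.g. taxi speed, latency
        state: dict = field(default_factory=dict)        # e.g. position, assignment
        tasks: list = field(default_factory=list)

        def step(self, dt: float) -> None:
            # Work on the current task and retire it when complete.
            if self.tasks and self.tasks[0].step(dt):
                self.tasks.pop(0)

    # A control-architecture variant can then be represented by small changes to an
    # agent's parameters or task list, mirroring the "minor modifications" above.
    uav = Agent("UAV-1", parameters={"taxi_speed_mps": 3.0, "command_latency_s": 0.8})
    uav.tasks.append(Task("taxi_to_catapult", duration=45.0))
    for _ in range(60):          # advance the simulation in one-second increments
        uav.step(1.0)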
This model, the Multi-Agent Safety and Control Simulation (MASCS), was then compared to
data for current flight deck operations to calibrate and partially validate simulation outputs, first
addressing an individual vehicle task before proceeding to mission tasks utilizing many vehicles at
once. The MASCS model was extended to incorporate models of different unmanned vehicle control
architectures and different safety protocols that affect vehicle integration. These features were then
tested at different densities of mission operations on the flight deck and compositions (unmanned vs.
manned) of missions in order to fully explore the interactions between variables. These results suggest
that productivity on the flight deck is more heavily influenced by the safety protocols governing
vehicle integration than by the type of unmanned vehicle control architecture employed.
Vehicle safety is improved by increasing the number of high-level constraints on operations (e.g.
separating unmanned and manned aircraft spatially or temporally), but these high-level constraints
may conflict with implicit constraints that are part of crew-vehicle interactions. Additional testing
explored the use of MASCS in understanding the effects of changes to the operating environment,
independent of changes to unmanned vehicle control architectures and safety protocols, as well as
how the simulation can be used to explore the vehicle design space. These results indicate that, at
faster operational tempos, latencies in vehicle operations drive significant differences in productivity
that are exacerbated by the safety protocols applied to operations. In terms of safety, a tradeoff
between slightly increased vehicle safety and significant increases in the risk rate of crew activity
is created at faster tempos in this environment. Lastly, the limitations and generalizability of the
MASCS model for use in other Heterogeneous Manned-Unmanned Environments (HMUEs) are
discussed, along with potential future work to expand the models.
Thesis Supervisor: Mary L. Cummings
Title: Visiting Professor of Aeronautics and Astronautics
Acknowledgments
I owe thanks to many different people not only for their help and support during my dissertation
work, but for all the steps that got me to MIT and to this point.
First, to Missy Cummings, my dissertation advisor, thank you for pushing me every step of the
way and making sure that I was doing the best possible work. I know working with me has been
difficult at times, but I am incredibly lucky to have made it to MIT in time to be your student.
You have taught me more about this field of research than I ever expected I could learn, and I will
be forever indebted to you for allowing me to work with you. You have been a great teacher of the
discipline, a great mentor for all the other aspects of my personal and professional life, and a great
friend over these past five years.
To John Hansman, thank you for stepping a bit outside your typical area of research to serve
on my dissertation committee. Having the outside perspective that you bring has been valuable
in making me question my own assumptions and perspectives about what I am doing, what is
important, and how it should be interpreted. To Julie Shah, thank you for providing the automation's
point of view; it has helped me avoid getting too stuck in the humanist perspective on things. Your
enthusiasm in our discussions has also been wonderful to see.
To Sally, for helping me manage all the craziness surrounding my work, the work of the lab,
and my travels back and forth across the country, you have been a terrific help. Most especially, your
assistance in tracking down Missy and John was always appreciated.
To Beth Milnes, our intrepid ESD academic administrator, your advice and encouragement in
helping me navigate the muddy waters of ESD was always appreciated. You have also been a great
influence and help to ESS; even though I was never an officer in the organization, your help in
directing ESS, planning the retreat, and making sure that we are all still sane is more than valuable.
To ESS, for giving me a forum to think and talk about my research, for helping with Generals
preparation, and helping me get out of my bubble every now and then to relax a bit, thank you.
To the labmates, the whole crazy lot of you over the past few years, I love you all. To Alex, Mark,
and Kathleen, we were all here at the end of HAL, and having you to keep me company
these past two semesters has been great. To all the rest: Andrew, Paul, Jackie, Yves, Luca, Kris,
Erin, Jamie, and Armen, thanks for all the great times, the laughs, the interesting discussions, and
the support.
To my parents, Steve and Jackie, thank you for all the love and support over these many years.
You’ve always encouraged me to succeed to the best of my abilities, even though most of the time
we were never really sure what that meant and where it would take me. I know this has been a
really interesting and novel experience for the both of you, but I know I could never have done it
without you backing me all the way. I love you both dearly, and this all is dedicated to you.
To my loving girlfriend Kim, I’m sorry for all the stressful nights, and days, and lost weekends,
and lost evenings, and preventing you from watching Game of Thrones and House of Cards and
all those other things. You’ve been working a real job while I was still hanging out in school, but
you’ve always been behind me 110% of the way. I love you and I always will, and I cannot tell you
how much you have meant to me during this process. I cannot wait for the adventures that await
us in the years to come.
Contents
1 Introduction . . . 39
    1.1 Autonomous Robotics in Complex Sociotechnical Systems . . . 40
    1.2 Research Statement . . . 42
        1.2.1 Research Approach . . . 42
        1.2.2 Research Questions . . . 44
    1.3 Thesis Organization . . . 44
2 Literature Review . . . 47
    2.1 Safety Protocols for Heterogeneous Manned-Unmanned Environments . . . 48
        2.1.1 Safety Protocols for Unmanned Vehicles . . . 49
        2.1.2 Safety Protocols in Related Domains . . . 50
        2.1.3 Defining Classes of Safety Principles . . . 52
    2.2 Human Control of Unmanned Vehicle Systems . . . 55
        2.2.1 Manual Control . . . 55
        2.2.2 Teleoperation of Unmanned Vehicle Platforms . . . 57
        2.2.3 Advanced Forms of Teleoperation . . . 60
        2.2.4 Human Supervisory Control of Unmanned Vehicles . . . 62
        2.2.5 A Model of Control Architectures . . . 64
    2.3 Modeling Heterogeneous Manned-Unmanned Environments . . . 68
        2.3.1 Models of Human and Unmanned Vehicle Behavior . . . 69
        2.3.2 Agent-Based Simulations of Safety Protocols . . . 71
    2.4 Chapter Summary . . . 72
3 Model Development . . . 73
    3.1 Modern Aircraft Carrier Flight Deck Operations . . . 74
    3.2 Model Construction . . . 80
        3.2.1 Aircraft/Pilot Models . . . 83
        3.2.2 Aircraft Director Models . . . 87
        3.2.3 Deck Handler Model . . . 89
        3.2.4 Deck and Equipment Models . . . 92
    3.3 Simulation Execution . . . 94
    3.4 Chapter Summary . . . 95
4 Model Validation . . . 97
    4.1 Validating Agent-Based Models . . . 98
    4.2 Single Aircraft Calibration . . . 99
    4.3 Mission Validation Testing . . . 105
    4.4 Mission Sensitivity Analyses . . . 109
        4.4.1 Sensitivity to Input Parameters . . . 109
        4.4.2 Sensitivity to Internal Parameters . . . 112
    4.5 Subject Matter Expert Review . . . 115
    4.6 Discussion . . . 117
    4.7 Chapter Summary . . . 118
5 MASCS Experimental Program . . . 119
    5.1 Experimental Plan . . . 120
        5.1.1 Safety Protocols . . . 121
        5.1.2 Control Architectures . . . 125
        5.1.3 Mission Settings: Composition and Density . . . 136
    5.2 Unmanned Aerial Vehicle Model Verification . . . 137
    5.3 Dependent Variables and Measures . . . 139
    5.4 Experimental Hypotheses . . . 141
    5.5 Results . . . 144
        5.5.1 Effects on Mission Expediency . . . 144
        5.5.2 Effects on Mission Safety . . . 149
        5.5.3 Effects on Mission Efficiency . . . 157
    5.6 Discussion . . . 159
    5.7 Chapter Summary . . . 161
6 Exploring Design Options . . . 163
    6.1 Exploration of Operational Changes . . . 164
        6.1.1 Effects of Launch Preparation Times on Unmanned Aerial Vehicle (UAV) Performance . . . 167
        6.1.2 Effects of Automating Launch Preparation Tasks . . . 173
        6.1.3 Interactions with Safety Protocols . . . 176
    6.2 Exploring Control Architecture Parameters . . . 183
    6.3 Possible Future Architectural Changes . . . 187
    6.4 Limitations of the MASCS Model . . . 190
    6.5 Chapter Summary . . . 192
7 Conclusions . . . 195
    7.1 Modeling HMUEs . . . 196
        7.1.1 The MASCS Model . . . 196
        7.1.2 Model Confidence . . . 201
        7.1.3 Model Applications . . . 203
    7.2 Future Work . . . 205
        7.2.1 Additional Phases of Operations . . . 207
        7.2.2 Deck Configuration . . . 208
        7.2.3 Responding to Failures . . . 209
        7.2.4 Extensions to Other Domains . . . 210
    7.3 Contributions . . . 210
References . . . 215
Appendices . . . 229
A Safety Principles from Carrier and Mining Domains . . . 231
B MASCS Agent Model Descriptions . . . 239
    B.1 Task Models and Rules . . . 239
    B.2 Director models and rules . . . 242
C MASCS Input Data . . . 247
D Example Scenario Configuration Text . . . 249
    D.1 Example 1: Manually piloted Aircraft . . . 249
    D.2 Example 2: Crew . . . 249
    D.3 Example 3: Deck equipment . . . 250
E Center for Naval Analyses (CNA) Interdeparture Data . . . 251
F MASCS Single Aircraft Test Results . . . 255
G Description of Mission Queuing . . . 257
H MASCS Mission Test Results . . . 261
I UAV Models . . . 267
    I.1 Local Teleoperation . . . 267
        I.1.1 Local Teleoperation Safety Implications . . . 268
    I.2 Remote Teleoperation . . . 269
        I.2.1 Remote Teleoperation Safety Implications . . . 270
    I.3 Gestural Control (advanced teleoperation) . . . 270
        I.3.1 Gesture Control Safety Implications . . . 271
    I.4 Human Supervisory Control Systems . . . 272
        I.4.1 Human Supervisory Control (HSC) Safety Implications . . . 273
J UAV Verification . . . 275
K Launch Event Duration (LD) Results . . . 281
    K.1 Effects due to Safety Protocols . . . 281
    K.2 Effects due to Control Architectures . . . 282
    K.3 Effects due to Interactions . . . 283
    K.4 Launch event Duration (LD) Effects Summary . . . 286
L Statistical Tests on Launch Event Duration (LD) . . . 289
M Analysis of Additional Safety Metrics . . . 299
    M.1 Results for Duration of Tertiary Halo Incursions due to Aircraft (DTHIA) . . . 299
    M.2 Results for Primary Halo Incursion (PHI) . . . 302
    M.3 Results for Duration of Primary Halo Incursions (DPHI) . . . 305
N Statistical Tests for Pareto Frontier Results . . . 311
    N.1 Supplemental statistical tests for Tertiary Halo Incursions (THI) . . . 311
    N.2 Full Color Pareto Charts . . . 325
O Statistical Tests for Design Exploration . . . 359
List of Figures
1-1 Notional diagram of Heterogeneous Manned-Unmanned Environments (HMUEs). . . . 41
2-1 Adaptation of Reason's "Swiss Cheese" model of failures (Reason, 1990). . . . 48
2-2 An overview of the variations between the five different control architectures described in this section. . . . 56
2-3 Human Information Processing loop (Wickens & Hollands, 2000). . . . 56
2-4 Human Supervisory Control Diagram, adapted from Sheridan and Verplank (Sheridan & Verplank, 1978). . . . 57
2-5 Histogram of accuracies of gesture control systems. . . . 62
2-6 Variations in effects of control architectures, categorized along axes of control input source, sensing and perception failure rate, response selection failure rate, response execution failure rate, field of view, and sources of latency. . . . 66
2-7 Aggregate architecture of manned-unmanned environments. . . . 67
3-1 Digital map of the aircraft carrier flight deck. Labels indicate important features for launch operations. The terms "The Fantail" and "The Street" are used by the crew to denote two different areas of the flight deck. The Fantail is the region aft of the tower. The Street is the area forward of the tower, between catapults 1 and 2 and starboard of catapults 3 and 4. . . . 76
3-2 Picture of the USS Eisenhower with aircraft parked prior to launch operations. Thirty-four aircraft and one helicopter are parked on the flight deck (U.S. Navy photo by Mass Communication Specialist Seaman Patrick W. Mullen III, obtained from Wikimedia Commons). . . . 78
3-3 (Top) Picture of the MASCS flight deck with 34 aircraft parked, ready for launch operations. (Bottom) General order in which areas of the deck are cleared by the Handler's assignments. . . . 78
3-4 Example path, broken into individual actions, of an aircraft moving from a starboard parking location to launch at catapult 3. . . . 82
3-5 Diagram of the Director alignment process. . . . 85
3-6 Flow diagram of the collision avoidance process. . . . 87
3-7 Map of the network of connections between Directors on the flight deck. Green dots denote Directors responsible for launching aircraft at the catapults ("Shooters"). Blue dots denote Directors that control the queuing position behind the jet blast deflectors ("Entry" crew). Yellow dots denote other Directors responsible for taxi operations through the rest of the flight deck. Shooters and Entry Directors may also route aircraft through the rest of the deck if not currently occupied with aircraft in queue at their assigned catapult. . . . 89
3-8 Map of the flight deck noting the general parking areas of aircraft (gray), the order in which they are cleared, and to which catapults (forward or aft) they are allocated. . . . 91
3-9 Top: Picture of the MASCS flight deck with 22 aircraft parked, ready for launch operations. Bottom: Picture of the MASCS flight deck with 22 aircraft parked, in the midst of operations. Three aircraft have already launched, four are currently at their designated catapults, four more are taxiing to or in queue at their catapult, and eleven are currently awaiting their assignments. . . . 94
4-1 Simple notional diagrams for Discrete Event, Systems Dynamics, and Agent-Based
Models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4-2 Diagram showing the path taken by the aircraft in the observed mission scenario. The
figure also details the breakdown of elements in the mission: Taxi, Launch Preparation, and Takeoff. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4-3 Results of preliminary single aircraft calibration test. Simulation data appear in blue
with error bars depicting ±1 s.d. Empirical observations are in red. . . . . . . . . . . 102
4-4 Sensitivity analysis of single aircraft Manual Control Case. . . . . . . . . . . . . . . . 104
4-5 Changes in parameters during sensitivity analyses. . . . . . . . . . . . . . . . . . . . 104
4-6 Histogram of the number of aircraft observed in sorties during the CNA study (Jewell,
1998). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4-7 Cumulative Density Function (CDF) of launch interdepartures from (Jewell, 1998).
This CDF can be converted into a Probability Density Function (PDF) in the form
of a negative exponential, λe^{-λt}, with λ = 0.01557. . . . . . . . . . . . . . . . . . 107
4-8 Graphical depiction of launch interdeparture negative exponential fits for the final
round of mission validation testing. Triangles denote λ values. Vertical lines denote
95% confidence intervals. Note that the mean value of launch interdepartures across
all scenarios and replications is 66.5 seconds. . . . . . . . . . . . . . . . . . . . . . . 108
4-9 Notional diagram of allocations of aircraft to catapults in a 22 aircraft mission. Dots
represent assigned aircraft launches; in the image, catapults 3 and 4 are paired and
launches alternate between the two. Catapult 1 is inoperable, and catapult 2 processes
all assignments in that section in series. . . . . . . . . . . . . . . . . . . . . . . . . . 110
4-10 Plot of event durations versus number of aircraft, sorted by three- (blue) vs. four-catapult (red) scenarios. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4-11 Results of final sensitivity analysis. Labels indicate number of aircraft-number of catapults, the parameter varied, and the percentage variation of the parameters. Dagger
symbols (†) indicate an inverse response to changes in parameter values is expected.
Asterisks (*) indicate significant differences in ANalysis of VAriance (ANOVA) tests. 114
5-1 Experimental Matrix for MASCS evaluation. . . . . . . . . . . . . . . . . . . . . . . 121
5-2 Example of the Area Separation protocol assuming identical numbers of UAVs and
aircraft. In this configuration, all manned aircraft parked forward are only assigned to
the forward aircraft pair, while all unmanned aircraft parked aft are only sent to the
aft catapults. When one group finishes all its launches, these restrictions are removed. 123
5-3 Example of the Temporal Separation protocol assuming identical numbers of UAVs
and aircraft. In this configuration, all manned aircraft parked forward are assigned
to catapults first but are allowed to access the entire deck. After all manned aircraft
reach their assigned catapults, the unmanned vehicles are allowed to taxi and are also
given access to the entire deck. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5-4 Example of the Temporal Separation protocol assuming identical numbers of UAVs
and aircraft. In this configuration, all manned aircraft parked forward are assigned
to catapults first but are only allowed to taxi to forward catapults. After all manned
aircraft reach their assigned catapults, the unmanned vehicles are allowed to taxi and
are given access to the entire deck. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5-5 Variations between control architectures, categorized along axes of operator location,
operator control inputs, and operator input modality. . . . . . . . . . . . . . . . . . 126
5-6 Variations in effects of control architectures, categorized along axes of routing method,
sensing and perception failure rate, response selection failure rate, response execution
failure rate, field of view, and sources of latency. . . . . . . . . . . . . . . . . . . . . 128
5-7 Diagram of worst-case scenario for Remote Teleoperation (RT) latency effects for a
five-second task. Upper timeline shows Director and Pilot actions, bottom timeline
shows vehicle response to Pilot commands. If the Director provides the command late
(at the 5-second mark) and the Pilot does not preemptively stop the aircraft, the
1.6-second round-trip latency (communicating the "STOP" command to the vehicle,
the Pilot issuing the command, and the command being executed) causes the vehicle
to stop late. The task completes at 6.6 seconds. . . . . . . . . . . . . . . . . . . . . 132
5-8 Diagram of optimal and worst-case early scenarios for RT latency effects for a five-second task. Upper timeline shows Director and Pilot actions, bottom timeline shows
vehicle response to Pilot commands. Perfect execution requires preemptive commands
by either the Director or the Pilot. If the Pilot further pre-empts the Director’s
actions, the task may complete early. . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5-9 Column chart comparing time to complete single aircraft mission across all five UAV
types. Original results from Manual Control modeling are also presented. . . . . . . 138
5-10 Diagram of “halo” radii surrounding an aircraft. Primary incursions are within one
half wingspan of the vehicle. Secondary incursions are within one wingspan; tertiary are within two. In the image shown, the aircraft has its wings folded for taxi
operations. A transparent image of the extended wings appears in the background. . 140
5-11 Hypotheses for performance across Control Architectures. For example, System-Based
Human Supervisory Control (SBSC) is expected to perform best (smallest values) across
all productivity measures and all duration-related safety measures. Gestural Control (GC)
is expected to perform worst (largest values) across all time-related measures. . . . . . 142
5-12 Hypotheses for performance of Safety Protocols within each Control Architecture. Example: for Local Teleoperation (LT), the Temporal Separation safety protocol is expected to perform best (smallest values) across all measures of productivity and safety.
Area+Temporal is expected to provide the worst productivity measures (largest values), while Area Separation is expected to be the worst for safety. . . . . . . . . . . 143
5-13 Column charts of effects of Safety Protocols (left) and Control Architectures (right)
for 22 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5-14 Column chart comparing Launch event Duration (LD) values across all Control Architectures (CAs) sorted by Safety Protocol (SP). Whiskers indicate ±1 standard
error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5-15 Diagrams of queuing process at catapults: (a) start of operations with full queues;
(b) after first launch; (c) after second launch. . . . . . . . . . . . . . . . . . . . . . 148
5-16 Average taxi times, average launch preparation time, and average time to clear a
queue of three aircraft for selected 22 aircraft missions. . . . . . . . . . . . . . . . . . 149
5-17 Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 22 aircraft missions. . . . 150
5-18 Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 22 aircraft Dynamic Separation missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5-19 Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for
22 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5-20 Pareto frontier plots of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for
22 aircraft missions for Local Teleoperation (LT, left) and System-Based Supervisory
Control (SBSC, right). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5-21 Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for
34 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
5-22 Pareto frontier plot of Total Taxi Distance (TTD) versus LD for 22 aircraft missions. 158
5-23 Pareto frontier plot of Total Aircraft taXi Time (TAXT) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6-1 Effects of altering launch preparation mean and standard deviation on mean launch
event duration for 22-aircraft Manual Control case. . . . . . . . . . . . . . . . . . . . 166
6-2 Line chart of mean Launch event Duration (LD) for the Manual Control (MC), Gestural Control (GC), and System-Based Human Supervisory Control (SBSC) control
architectures for decreases in launch preparation time standard deviation. Whiskers
indicate ±1 standard error in the data. . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6-3 Column graphs depicting Launch event Duration (LD) values for all control architectures
under the 50% standard deviation, original mean launch preparation time model for
22 (left) and 34 (right) aircraft. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6-4 Line chart of Launch event Duration (LD) values for Manual Control (MC), Gestural
Control (GC), and System-Based Human Supervisory Control (SBSC) missions using
22 aircraft varying the launch preparation mean. Whiskers indicate ±1 standard error. . . . 171
6-5 Column graphs depicting Launch event Duration (LD) values for all control architectures
under the 50% mean, 50% standard deviation launch preparation time model for 22
(left) and 34 (right) aircraft. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6-6 Column graphs depicting Launch event Duration (LD) values for all control architectures
under the "automated" launch preparation time model for 22 (left) and 34 (right)
aircraft. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6-7 Column graphs depicting Launch event Duration (LD) values for the Gestural Control
(GC) and SBSC Control Architectures run at the 22 (left) and 34 (right) aircraft
Density levels under the “automated” launch preparation time for all Safety Protocols
(SPs). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
6-8 Percent increase in Launch event Duration when moving from 22 to 34 aircraft. . . . 179
6-9 Column charts comparing Tertiary Halo Incursions due to Aircraft (THIA) values
between the “automated” launch preparation time and the original launch preparation
time for all treatments in the SBSC control architecture. Left: 22 aircraft cases. Right:
34 aircraft cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6-10 Column charts comparing Tertiary Halo Incursions due to Aircraft (THIA) values
between the “automated” launch preparation time and the original launch preparation
time for all treatments in the GC control architecture. Left: 22 aircraft cases. Right:
34 aircraft cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6-11 Column charts comparing Primary Halo Incursion (PHI) values between the “automated” launch preparation time and the original launch preparation time for all
treatments in the GC control architecture. Left: 22 aircraft cases. Right: 34 aircraft
cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6-12 Column charts comparing Primary Halo Incursion (PHI) values between the “automated” launch preparation time and the original launch preparation time for all
treatments in the SBSC control architecture. Left: 22 aircraft cases. Right: 34
aircraft cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
6-13 Column charts comparing Duration of Primary Halo Incursions (DPHI) values between the “automated” launch preparation time and the original launch preparation
time for all treatments in the GC control architecture. Left: 22 aircraft cases. Right:
34 aircraft cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
6-14 Column charts comparing Duration of Primary Halo Incursions (DPHI) values between the “automated” launch preparation time and the original launch preparation
time for all treatments in the SBSC control architecture. Left: 22 aircraft cases.
Right: 34 aircraft cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
6-15 Column charts of mean launch event duration for variations of the Gestural Control
(GC) model at both 22 (left) and 34 (right) aircraft Density level using the original
launch preparation time model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
6-16 Plot of Launch Event Duration Means against Gestural Control (GC) failure rates
under the original launch preparation time model. Whiskers indicate ±1 standard
error in the data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
7-1 Picture of a Deck Handler, resting his elbows on the "Ouija board," a tabletop-sized map
of the flight deck used for planning operations. Scale metal cutouts of aircraft are
used to update aircraft positions and layouts, while colored thumbtacks are used to
indicate fuel and weapons statuses. . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
C-1 Probability density function of the modeled launch preparation time, with empirical
observations overlaid. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
C-2 Probability density function of the modeled launch acceleration time, with empirical
observations overlaid. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
E-1 Cumulative Density Function (CDF) of launch interdepartures from (Jewell, 1998). . 252
E-2 PDF of launch interdepartures from (Jewell, 1998). . . . . . . . . . . . . . . . . . . 253
G-1 Pattern of allocation to catapults. Blue/green coloring corresponds to pairing of
catapults 1/2 and 3/4, respectively. Red indicates the launch is in process. A black
diagonal line indicates that the launch has occurred and the aircraft is no longer in
the system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
G-2 Pattern of allocation to a set of three catapults. Blue/green coloring corresponds to
pairing of catapults 1/2 and 3/4, respectively. Red indicates the launch is in process.
A black diagonal line indicates that the launch has occurred and the aircraft is no
longer in the system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
G-3 Build-out of 22 aircraft launch events with four (left) and three (right) available
catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
J-1 Mean launch event duration time for Local Teleoperation (LT) control architecture
with incremental introduction of features. Whiskers indicate ±1 standard deviation. . . . 276
J-2 Mean launch event duration time for Remote Teleoperation (RT) control architecture
with incremental introduction of features. Whiskers indicate ±1 standard deviation. . . . 277
J-3 Mean launch event duration time for Gestural Control (GC) control architecture with
incremental introduction of features. Whiskers indicate ±1 standard deviation. . . . 277
J-4 Mean launch event duration time for Vehicle-Based Human Supervisory Control
(VBSC) control architecture with incremental introduction of features. Whiskers
indicate ±1 standard deviation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
J-5 Mean launch event duration time for System-Based Human Supervisory Control
(SBSC) control architecture with incremental introduction of features. Whiskers indicate ±1 standard deviation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
J-6 Mean launch event duration time for all control architectures with incremental introduction of features. Whiskers indicate ±1 standard deviation. . . . . . . . . . . . . . 280
J-7 Mean taxi duration time for Local Teleoperation (LT) control architecture with incremental introduction of features. Whiskers indicate ±1 standard deviation. . . . . 280
K-1 Boxplots of Launch event Duration (LD) for each Safety Protocol at the 22 (left) and
34 (right) aircraft Density levels. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
K-2 Boxplots of Launch event Duration (LD) for each Control Architecture at the 22 (left)
and 34 (right) aircraft Density levels, with Temporal+Area safety protocols excluded. 283
K-3 Boxplots of Launch event Duration (LD) for each Control Architecture at the 100%
Composition settings for 22 (left) and 34 (right) aircraft Density levels. . . . . . . 284
K-4 Column chart comparing Launch event Duration (LD) values across all Control Architectures (CAs) sorted by Safety Protocol (SP)-Composition pairs. Whiskers indicate
±1 standard error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
L-1 SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus SP,
with T+A safety protocols excluded. . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
L-2 SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 22
aircraft cases versus SP, with T+A safety protocols excluded. . . . . . . . . . . . . . 289
L-3 SPSS output of Levene Test of Equal Variances for the 34 aircraft cases versus SP,
with T+A safety protocols excluded. . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
L-4 SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 34
aircraft cases versus SP, with T+A safety protocols excluded. . . . . . . . . . . . . . 290
L-5 SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA,
with T+A safety protocols excluded. . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
L-6 SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 22
aircraft cases versus CA, with T+A safety protocols excluded. . . . . . . . . . . . . . 291
L-7 SPSS output of Levene Test of Equal Variances for the 34 aircraft cases versus CA,
with T+A safety protocols excluded. . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
L-8 SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 34
aircraft cases versus CA, with T+A safety protocols excluded. . . . . . . . . . . . . . 292
L-9 SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA
for Dynamic Separation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
L-10 SPSS output of an ANOVA test for the 22 aircraft cases versus CA for Dynamic
Separation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
L-11 SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA
for Area Separation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
L-12 SPSS output of an ANOVA test for the 22 aircraft cases versus CA for Area Separation. . . . 293
L-13 SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA
for Temporal Separation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
L-14 SPSS output of an ANOVA test for the 22 aircraft cases versus CA for Temporal
Separation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
L-15 SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus the
full treatment definition, with T+A safety protocols excluded. . . . . . . . . . . . . . 294
L-16 SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 22
aircraft cases versus the full treatment definition, with T+A safety protocols excluded. . . . 294
L-17 SPSS output of Levene Test of Equal Variances for the 34 aircraft cases versus the
full treatment definition, with T+A safety protocols excluded. . . . . . . . . . . . . . 295
L-18 SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 34
aircraft cases versus the full treatment definition, with T+A safety protocols excluded. . . . 295
M-1 Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 22 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
M-2 Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 22 aircraft Dynamic Separation missions. . . . . . . . . . . . . . . . . 301
M-3 Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions, excluding Temporal+Area cases. . . . . . . . . . . . . . . . . . . . . . . . . . . 303
M-4 Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions. . . . 305
M-5 Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft Remote
Teleoperation missions, including Temporal+Area cases. . . . . . . . . . . . . . . . . 306
M-6 Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 34 aircraft missions. . . . 307
M-7 Pareto frontier plot of Duration of Primary Halo Incursions (DPHI) versus LD for 34
aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
N-1 SPSS output for Levene test of equal variances for the 22 aircraft THI values against
SPs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
N-2 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft THIA values against SPs. . . . . . . . . . . . . . . . . . . . . . . . . . 311
N-3 SPSS output for Levene test of equal variances for the 34 aircraft THIA values against
SPs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
N-4 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft THIA values against SPs. . . . . . . . . . . . . . . . . . . . . . . . . . 312
N-5 SPSS output for Levene test of equal variances for the 22 aircraft THIA values against
CAs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
N-6 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft THIA values against CAs. . . . . . . . . . . . . . . . . . . . . . . . . 313
N-7 SPSS output for Levene test of equal variances for the 34 aircraft THIA values against
CAs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
N-8 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft THIA values against CAs. . . . . . . . . . . . . . . . . . . . . . . . . 313
N-9 SPSS output for Levene test of equal variances for the 22 aircraft THIA values against
SP-Composition pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
N-10 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft THIA values against SP-Composition pairs. . . . . . . . . . . . . . 314
N-11 SPSS output for Levene test of equal variances for the 22 aircraft DTHIA values
against CAs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
N-12 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft DTHIA values against CAs. . . . . . . . . . . . . . . . . . . . . . . . 316
N-13 SPSS output for Levene test of equal variances for the 22 aircraft DTHIA values
against SPs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
N-14 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft DTHIA values against SPs. . . . . . . . . . . . . . . . . . . . . . . . . 317
N-15 SPSS output for Levene test of equal variances for the 22 aircraft PHI values against
SP-Composition pairs, excluding T+A safety protocols. . . . . . . . . . . . . . . . . 317
N-16 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft PHI values against SP-Composition pairs, excluding T+A safety protocols. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
N-17 SPSS output for Levene test of equal variances for the 22 aircraft Secondary Halo
Incursions (SHI) values against SPs-Composition interactions. . . . . . . . . . . . . . 318
N-18 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft PHI values against SPs-Composition pairs. . . . . . . . . . . . . . . . 318
N-19 SPSS output for Levene test of equal variances for the 34 aircraft PHI values against
SP-Composition pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
N-20 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft PHI values against SP-Composition pairs, excluding T+A safety protocols. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
N-21 SPSS output for Levene test of equal variances for the 22 aircraft PHI values against
SP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
N-22 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft PHI values against SP. . . . . . . . . . . . . . . . . . . . . . . . . . . 321
N-23 SPSS output for Levene test of equal variances for the 34 aircraft PHI values against
SP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
N-24 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft PHI values against SP. . . . . . . . . . . . . . . . . . . . . . . . . . . 322
N-25 SPSS output for Levene test of equal variances for the 22 aircraft TTD values against
SP-Composition pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
N-26 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft TTD values against SP-Composition pairs. . . . . . . . . . . . . . . . 322
N-27 SPSS output for Levene test of equal variances for the 22 aircraft TTD values against
SP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
N-28 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft TTD values against SP. . . . . . . . . . . . . . . . . . . . . . . . . . . 323
N-29 SPSS output for Levene test of equal variances for the 34 aircraft TTD values against
SP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
N-30 SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft TTD values against SP. . . . . . . . . . . . . . . . . . . . . . . . . . . 324
N-31 Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 22 aircraft missions. . . . 325
N-32 Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 34 aircraft missions. . . . 326
N-33 Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for
22 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
N-34 Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for
34 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
N-35 Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 22 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
N-36 Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 34 aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
N-37 Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions, excluding Temporal+Area cases. . . . . . . . . . . . . . . . . . . . . . . . . . . 331
N-38 Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions. . . . 332
N-39 Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 34 aircraft missions. . . . 333
N-40 Pareto frontier plot of Duration of Primary Halo Incursions (DPHI) versus LD for 22
aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
N-41 Pareto frontier plot of Duration of Primary Halo Incursions (DPHI) versus LD for 34
aircraft missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
N-42 Pareto frontier plot of Total Taxi Distance (TTD) versus LD for 22 aircraft missions. 336
N-43 Pareto frontier plot of Total Taxi Distance (TTD) versus LD for 34 aircraft missions. 337
N-44 Pareto frontier plot of Total Aircraft taXi Time (TAXT) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
N-45 Pareto frontier plot of Total Aircraft taXi Time (TAXT) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
N-46 Pareto frontier plot of Total Aircraft Active Time (TAAT) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
N-47 Pareto frontier plot of Total Aircraft Active Time (TAAT) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
N-48 Pareto frontier plot of Primary Collision warnings (PC) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
N-49 Pareto frontier plot of Primary Collision warnings (PC) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
N-50 Pareto frontier plot of Secondary Collision warnings (SC) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
N-51 Pareto frontier plot of Secondary Collision warnings (SC) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
N-52 Pareto frontier plot of Tertiary Collision warnings (TC) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
N-53 Pareto frontier plot of Tertiary Collision warnings (TC) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
N-54 Pareto frontier plot of Landing Strip Foul Time (LSFT) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
N-55 Pareto frontier plot of Landing Strip Foul Time (LSFT) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
N-56 Pareto frontier plot of Number assigned to Street (NS) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
N-57 Pareto frontier plot of Number assigned to Street (NS) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
N-58 Pareto frontier plot of Number assigned to Fantail (NF) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
N-59 Pareto frontier plot of Number assigned to Fantail (NF) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
N-60 Pareto frontier plot of Duration of Vehicles in Street (SD) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
N-61 Pareto frontier plot of Duration of Vehicles in Street (SD) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
N-62 Pareto frontier plot of Duration of Vehicles in Fantail (FD) versus LD for 22 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
N-63 Pareto frontier plot of Duration of Vehicles in Fantail (FD) versus LD for 34 aircraft
missions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
O-1 SPSS output for Levene test of equal variances for LD values for all control architectures at 22 aircraft under the 50% standard deviation launch preparation time. . . . 359
O-2 SPSS output for an ANOVA of LD values for all control architectures at 22 aircraft
under the 50% standard deviation launch preparation time. . . . . . . . . . . . . . . 359
O-3 SPSS output for Tukey HSD test of LD values for all control architectures at 22
aircraft under the 50% standard deviation launch preparation time. . . . . . . . . . . 360
O-4 SPSS output for Levene test of equal variances for LD values for all control architectures at 34 aircraft under the 50% standard deviation launch preparation time. . . . 360
O-5 SPSS output for an ANOVA of LD values for all control architectures at 34 aircraft
under the 50% standard deviation launch preparation time. . . . . . . . . . . . . . . 361
O-6 SPSS output for Levene test of equal variances for LD values for all control architectures at 22 aircraft under the launch preparation time model with 50% reduction in
mean and 50% reduction in standard deviation. . . . . . . . . . . . . . . . . . . . . . 361
O-7 SPSS output for a Tukey HSD test of LD values for all control architectures at 34
aircraft under the 50% standard deviation launch preparation time. . . . . . . . . . . 362
O-8 SPSS output for an ANOVA test for LD values for all control architectures at 22
aircraft under the launch preparation time model with 50% reduction in mean and
50% reduction in standard deviation. . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
O-9 SPSS output for Tukey HSD test for LD values for all control architectures at 22
aircraft under the launch preparation time model with 50% reduction in mean and
50% reduction in standard deviation. . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
O-10 SPSS output for Levene test of equal variances for LD values for all control architectures for 34 aircraft under the launch preparation time model with 50% reduction in
mean and standard deviation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
O-11 SPSS output for a Kruskal-Wallis test of LD values for all control architectures for 34
aircraft under the launch preparation time model with 50% reduction in mean and
standard deviation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
O-12 SPSS output for Levene test of equal variances for LD values for all control architectures at 22 aircraft under the “automated” launch preparation time. . . . . . . . . . 364
O-13 SPSS output for Kruskal-Wallis nonparametric analysis of variance for LD values for
all control architectures at 22 aircraft under the “automated” launch preparation time.364
O-14 SPSS output for Levene test of equal variances for LD values for all control architectures at 34 aircraft under the “automated” launch preparation time. . . . . . . . . . 364
O-15 SPSS output for Kruskal-Wallis nonparametric analysis of variance for LD values for
all control architectures at 34 aircraft under the “automated” launch preparation time.365
O-16 SPSS output for Levene test of equal variances for LD values for the GC architecture
across all safety protocols at 22 aircraft under the “automated” launch preparation
time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
O-17 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the GC architecture across all safety protocols at 22 aircraft under the “automated”
launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
O-18 SPSS output for Levene test of equal variances for LD values for the GC architecture
across all safety protocols at 34 aircraft under the “automated” launch preparation
time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
O-19 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the GC architecture across all safety protocols at 34 aircraft under the “automated”
launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
O-20 SPSS output for Levene test of equal variances for LD values for the SBSC architecture
across all safety protocols at 22 aircraft under the “automated” launch preparation
time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
O-21 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the SBSC architecture across all safety protocols at 22 aircraft under the “automated”
launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
O-22 SPSS output for Levene test of equal variances for LD values for the SBSC architecture
across all safety protocols at 34 aircraft under the “automated” launch preparation
time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
O-23 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the SBSC architecture across all safety protocols at 34 aircraft under the “automated”
launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
O-24 SPSS output for Levene test of equal variances for LD values for the GC architecture across different parameter settings at 22 aircraft under the “automated” launch
preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
O-25 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the GC architecture across different parameter settings at 22 aircraft under the
“automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . 368
O-26 SPSS output for Levene test of equal variances for LD values for the GC architecture across different parameter settings at 34 aircraft under the “automated” launch
preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
O-27 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the GC architecture across different parameter settings at 34 aircraft under the
“automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . 369
O-28 SPSS output for Levene test of equal variances for LD values for the GC architecture across different parameter settings at 22 aircraft under the “automated” launch
preparation time, including the Low Penalty case. . . . . . . . . . . . . . . . . . . . 370
O-29 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the GC architecture across different parameter settings at 22 aircraft under the
“automated” launch preparation time, including the Low Penalty case. . . . . . . . . 370
O-30 SPSS output for Levene test of equal variances for LD values for the GC architecture
across different failure rates at 22 aircraft under the “automated” launch preparation
time, using Low Latency and Low Penalty. . . . . . . . . . . . . . . . . . . . . . . . 371
O-31 SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the GC architecture across different failure rates at 22 aircraft under the “automated”
launch preparation time, using Low Latency and Low Penalty. . . . . . . . . . . . . 371
List of Tables
2.1 Examples of accidents, hazards, and safety protocols from the mining and aircraft carrier domains. . . . . . . 52
2.2 The three general classes of safety principles elicited from HMUE source materials (Rio Tinto, 2014; Topf, 2011) and observations. . . . . . . 53
2.3 The three general classes of safety protocols that will be applied to unmanned carrier flight deck operations. . . . . . . 54
3.1 List of actors involved in flight deck operations and descriptions of their primary roles in deck operations. . . . . . . 75
3.2 List of resources involved in flight deck operations. . . . . . . 75
3.3 List of primary Actions used in the MASCS model of launch operations. . . . . . . 83
3.4 List of important parameters and logic functions for Aircraft/Pilot modeling. . . . . . . 84
3.5 List of important parameters and logic functions for Aircraft Director modeling. . . . . . . 88
3.6 Heuristics of Deck Handler planning and re-planning developed as part of previous work (Ryan, 2011; Ryan, Banerjee, Cummings, & Roy, 2014). . . . . . . 90
3.7 List of important parameters and logic functions for Deck Handler modeling. . . . . . . 92
3.8 List of important parameters and logic functions for Catapult modeling. . . . . . . 93
4.1 List of scenarios tested in mission validation (Number of aircraft-number of catapults). . . . . . . 107
4.2 Matrix of scenarios tested during internal parameter sensitivity testing. . . . . . . 113
5.1 Description of effects of Safety Protocol-Composition interactions on mission evolution. . . . . . . 137
5.2 List of primary performance metrics for MASCS testing. . . . . . . 141
6.1 Settings for launch preparation time testing. . . . . . . 166
6.2 Mean launch durations and percent reductions for the 34 aircraft Manual Control case for variations in launch preparation (LP) model. . . . . . . 167
6.3 Means and standard deviations of wait times in catapult queues at both the 22 and 34 aircraft Density levels under the original and automated launch preparation models. . . . . . . 174
6.4 Results of Wilcoxon Rank Sum Tests comparing THIA values between the original launch preparation time and the automated launch preparation time for the GC and SBSC Control Architectures. A value of p < 0.013 indicates significant differences between cases. . . . . . . 181
6.5 Results of a Steel-Dwass nonparametric simultaneous comparisons test of Gestural Control (GC) failure rate variations. . . . . . . 187
A.1 List of aircraft carrier safety principles, part 1 of 3. . . . . . . . . . . . . . . . . . . 232
A.2 List of aircraft carrier safety principles, part 2 of 3. . . . . . . . . . . . . . . . . . . . 233
A.3 List of aircraft carrier safety principles, part 3 of 3. . . . . . . . . . . . . . . . . . . . 234
A.4 List of mining safety principles, part 1 of 3. . . . . . . . . . . . . . . . . . . . . . . . 235
A.5 List of mining safety principles, part 2 of 3. . . . . . . . . . . . . . . . . . . . . . . . 236
A.6 List of mining safety principles, part 3 of 3. . . . . . . . . . . . . . . . . . . . . . . . 237
C.1 Observed turn rates for F-18 aircraft during manually-piloted flight deck operations. . . 247
C.2 Observations of launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . 247
C.3 Observations of takeoff (acceleration to flight speed) time. . . . . . . . . . . . . . . . 247
C.4 Results of Easyfit fit of Lognormal distribution to takeoff acceleration time data. . . 248
E.1 Results of F test of model fit for JMP PRO 10 fit of CNA CDF data with 3rd order
polynomial. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
E.2 Results of t tests for coefficients of JMP PRO 10 fit of CNA CDF data with 3rd order
polynomial. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
E.3 Results of F test of model fit for JMP PRO 10 fit of CNA PDF data with exponential
function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
E.4 Results of t tests for coefficients of JMP PRO 10 fit of CNA PDF data with exponential
function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
F.1 Test results for distribution fits of the MASCS single aircraft calibration testing. . . 255
H.1 Results of distribution fits of interdeparture times for CNA and MASCS test data.
Results include λ and p-values for fits from Easyfit along with λ values and 95%
confidence intervals from JMP PRO 10. . . . . . . . . . . . . . . . . . . . . . . . . . 261
H.2 JMP PRO 10 output for F-test of model fit for LD values using three catapults. . . . 262
H.3 JMP PRO 10 output for t-test of coefficients for linear fit of LD values using three
catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
H.4 JMP PRO 10 output for F-test of model fit for LD values using four catapults. . . . 262
H.5 JMP PRO 10 output for t-test of coefficients for linear fit of LD values using four
catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
H.6 JMP Pro 10 output of Levene test of equal variances for launch preparation sensitivity
tests using 22 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . 262
H.7 JMP Pro 10 output of an ANOVA test for launch preparation sensitivity tests using
22 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
H.8 JMP Pro 10 output of Tukey test for launch preparation sensitivity tests using 22
aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
H.9 JMP Pro 10 output of Levene test of equal variances for taxi speed sensitivity tests
using 22 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
H.10 JMP Pro 10 output of an ANOVA test for taxi speed sensitivity tests using 22 aircraft
and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
H.11 JMP Pro 10 output of Levene test of equal variances for collision prevention parameter
sensitivity tests using 22 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . 263
H.12 JMP Pro 10 output of an ANOVA test for collision prevention parameter sensitivity
tests using 22 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . 263
H.13 JMP Pro 10 output of Levene test of equal variances for taxi speed sensitivity tests
using 22 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
H.14 JMP Pro 10 output of an ANOVA test for taxi speed sensitivity tests using 22 aircraft
and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
H.15 JMP Pro 10 output of a Levene test of equal variances for collision prevention parameter sensitivity tests using 22 aircraft and 4 catapults. . . . . . . . . . . . . . . . 264
H.16 JMP Pro 10 output of an ANOVA test for collision prevention parameter sensitivity
tests using 22 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . 264
H.17 JMP Pro 10 output of a Tukey test for collision prevention parameter sensitivity tests
using 22 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
H.18 JMP Pro 10 output of a Levene test of equal variance for taxi speed sensitivity tests
using 34 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
H.19 JMP Pro 10 output of an ANOVA test for taxi speed sensitivity tests using 34 aircraft
and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
H.20 JMP Pro 10 output of a Tukey test for taxi speed sensitivity tests using 34 aircraft
and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
H.21 JMP Pro 10 output of a Levene test of equal variance for collision prevention parameter sensitivity tests using 34 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . 265
H.22 JMP Pro 10 output of an ANOVA test for collision prevention parameter sensitivity
tests using 34 aircraft and 3 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . 265
H.23 JMP Pro 10 output of a Levene test of equal variance for collision prevention parameter sensitivity tests using 34 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . 265
H.24 JMP Pro 10 output of an ANOVA for launch preparation time sensitivity tests using
34 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
H.25 JMP Pro 10 output of a Tukey test for launch preparation time sensitivity tests using
34 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
H.26 JMP Pro 10 output of Levene test of equal variance for taxi speed sensitivity tests
using 34 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
H.27 JMP Pro 10 output for an ANOVA test for taxi speed sensitivity tests using 34 aircraft
and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
H.28 JMP Pro 10 output of a Tukey test for taxi speed sensitivity tests using 34 aircraft
and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
H.29 JMP Pro 10 output of Levene test of equal variance for collision prevention parameter
sensitivity tests using 34 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . 266
H.30 JMP Pro 10 output for an ANOVA test for collision prevention parameter sensitivity
tests using 34 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . 266
H.31 JMP Pro 10 output for a Tukey test for collision prevention parameter sensitivity
tests using 34 aircraft and 4 catapults. . . . . . . . . . . . . . . . . . . . . . . . . . . 266
K.1 Descriptive statistics for SP*Composition settings across all Control Architectures
(CAs) at the 22 aircraft Density level. . . . . . . . . . . . . . . . . . . . . . . . . . . 285
K.2 Results of ANOVAs for LD results across control architectures for each SP*Composition
setting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
L.1 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for 22 aircraft cases
versus SP, with T+A safety protocols excluded. . . . . . . . . . . . . . . . . . . . . . 290
L.2 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for 34 aircraft cases
versus SP, with T+A safety protocols excluded. . . . . . . . . . . . . . . . . . . . . . 290
L.3 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for 22 aircraft cases
versus CA, with T+A safety protocols excluded. . . . . . . . . . . . . . . . . . . . . 290
L.4 JMP PRO 10 output for Tukey HSD Multiple Comparisons tests for the 22 aircraft
cases versus CA for Dynamic Separation. . . . . . . . . . . . . . . . . . . . . . . . . 293
L.5 JMP PRO 10 output for Tukey HSD Multiple Comparisons tests for the 22 aircraft
cases versus CA for Temporal Separation, with T+A safety protocols excluded. . . . 294
L.6 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests (significant results
only) for 22 aircraft cases versus full treatment definitions, with T+A safety protocols
excluded. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
L.7 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests (significant results
only) for 34 aircraft cases versus full treatment definitions, with T+A safety protocols
excluded (part 1). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
L.8 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests (significant results
only) for 34 aircraft cases versus full treatment definitions, with T+A safety protocols
excluded (part 2). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
N.1 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo
Incursions due to Aircraft (THIA) 22 aircraft cases versus Safety Protocols. . . . . . 312
N.2 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo
Incursions due to Aircraft (THIA) 34 aircraft cases versus Safety Protocols. . . . . . 312
N.3 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo
Incursions due to Aircraft (THIA) 22 aircraft cases versus Control Architectures. . . 313
N.4 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo
Incursions due to Aircraft (THIA) 34 aircraft cases versus Control Architectures. . . 314
N.5 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo Incursions due to Aircraft (THIA) 22 aircraft cases versus Safety Protocols-Composition
pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
N.6 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Duration of
Tertiary Halo Incursions due to Aircraft (DTHIA) 22 aircraft cases versus Control
Architectures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
N.7 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Duration of
Tertiary Halo Incursions due to Aircraft (DTHIA) 22 aircraft cases versus Safety
Protocols. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
N.8 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo
Incursion (PHI) 22 aircraft cases versus Safety Protocols-Composition pairs, excluding
T+A cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
N.9 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo Incursions due to Aircraft (THIA) 22 aircraft cases versus Safety Protocols-Composition
pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
N.10 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo
Incursion (PHI) 34 aircraft cases versus Safety Protocols-Composition pairs. . . . . . 320
N.11 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo
Incursion (PHI) 22 aircraft cases versus Safety Protocols. . . . . . . . . . . . . . . . 321
N.12 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo
Incursion (PHI) 22 aircraft cases versus Safety Protocols. . . . . . . . . . . . . . . . 322
N.13 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Total Taxi Distance (TTD) 22 aircraft cases versus SP-Composition pairs. . . . . . . . . . . . . . . 323
N.14 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Total Taxi Distance (TTD) 22 aircraft cases versus SP. . . . . . . . . . . . . . . . . . . . . . . . . . 324
N.15 JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Total Taxi Distance (TTD) 34 aircraft cases versus SP. . . . . . . . . . . . . . . . . . . . . . . . . . 324
O.1 JMP PRO 10 output for a Steel-Dwass non-parametric simultaneous comparisons test
of LD values for all control architectures for 34 aircraft under the launch preparation
time model with 50% reduction in mean and standard deviation. . . . . . . . . . . . 361
O.2 JMP Pro 10 output for Steel-Dwass nonparametric simultaneous comparisons test of
LD values for all control architectures at 22 aircraft under the “automated” launch
preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
O.3 JMP Pro 10 output for Steel-Dwass nonparametric simultaneous comparisons test of
LD values for all control architectures at 34 aircraft under the “automated” launch
preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
O.4 SPSS output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the GC architecture across all safety protocols at 22 aircraft under the
“automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . 366
O.5 SPSS output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the GC architecture across all safety protocols at 34 aircraft under the
“automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . 367
O.6 SPSS output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the SBSC architecture across all safety protocols at 22 aircraft under the
“automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . 368
O.7 SPSS output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the SBSC architecture across all safety protocols at 34 aircraft under the
“automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . . . . . . . 368
O.8 JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test
of LD values for the GC architecture across different parameter settings at 22 aircraft
under the “automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . 369
O.9 JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test
of LD values for the GC architecture across different parameter settings at 34 aircraft
under the “automated” launch preparation time. . . . . . . . . . . . . . . . . . . . . 369
O.10 JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test
of LD values for the GC architecture across different parameter settings at 22 aircraft
under the “automated” launch preparation time, including the Low Penalty case. . . 370
O.11 JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test
for LD values for the GC architecture across different failure rates at 22 aircraft under
the “automated” launch preparation time, using Low Latency and Low Penalty. . . . 371
List of Acronyms
A Area Separation
ABS Agent-Based Simulation
ANOVA ANalysis of VAriance
AUVSI Association of Unmanned Vehicle Systems International
CA Control Architecture
CDF Cumulative Density Function
CNA Center for Naval Analyses
DCAP Deck operations Course of Action Planner
DPHI Duration of Primary Halo Incursions
DS Dynamic Separation
DTHIA Duration of Tertiary Halo Incursions due to Aircraft
FAA Federal Aviation Administration
FD Duration of Vehicles in Fantail
GC Gestural Control
GCS Ground Control Station
GUI Graphical User Interface
HMUE Heterogeneous Manned-Unmanned Environment
HSC Human Supervisory Control
K-S Kolmogorov-Smirnov
LD Launch event Duration
LOS Line-Of-Sight
LP Launch Preparation
LSFT Landing Strip Foul Time
LT Local Teleoperation
MASCS Multi-Agent Safety and Control Simulation
MC Manual Control
ME Mission Effectiveness
NF Number assigned to Fantail
NHTSA National Highway Traffic Safety Administration
NTSB National Transportation Safety Board
NS Number assigned to Street
PDF Probability Density Function
PHI Primary Halo Incursion
PC Primary Collision warnings
RT Remote Teleoperation
SA Situational Awareness
SC Secondary Collision warnings
SD Duration of Vehicles in Street
SBSC System-Based Human Supervisory Control
SHI Secondary Halo Incursions
SME Subject Matter Expert
SP Safety Protocol
T Temporal Separation
T+A Temporal+Area Separation
TAAT Total Aircraft Active Time
TAXT Total Aircraft taXi Time
TC Tertiary Collision warnings
THI Tertiary Halo Incursions
THIA Tertiary Halo Incursions due to Aircraft
TTD Total Taxi Distance
UAV Unmanned Aerial Vehicle
UGV Unmanned Ground Vehicle
USV Unmanned Surface Vehicle
UxV Unmanned Vehicle
VBSC Vehicle-Based Human Supervisory Control
Chapter 1
Introduction
While unmanned and autonomous vehicle systems have existed for many years, it is only recently
that these technologies have reached a sufficient maturity level for full-scale integration into human-centered work environments to be feasible. Already, integration of these vehicles into military (Office
of the Secretary of Defense, 2013; Naval Studies Board, 2005) and commercial environments (Scanlon, 2009; Rio Tinto, 2014) is beginning to occur. Most previous implementations of automated and
unmanned vehicles, however, have been predicated on a complete separation of unmanned vehicles
from the rest of the environment (Federal Aviation Administration, 2014a). For these heterogeneous
manned-unmanned systems, which include mining, warehouse operations, and aircraft carrier flight
decks, this strategy of complete separation will not be feasible: these domains will require close
physical coordination and interaction between the humans, manned vehicles, and unmanned or autonomous vehicles in the world. Increasing the safety of operations will require a different tactic: the
implementation of safety protocols that dictate when and where human-vehicle interactions occur
and tasks can be performed.
The scale and complexity of these systems imply that conducting empirical evaluations would be both
time-consuming and resource-intensive, even assuming that the unmanned or autonomous vehicles of
interest have already been constructed and are available for testing. This dissertation
describes a simulation and assessment methodology constructed specifically for evaluating system
design tradeoffs in these dynamic human/autonomous vehicle work environments. In particular, it
explores the relationship between the safety protocols, the human-machine interface, and system
performance in terms of both safety and productivity. The simulation is constructed for a single
representative environment — the aircraft carrier flight deck — but the methodology can be used
to construct models of other domains, allowing system designers and other stakeholders to estimate
the likely outcomes of unmanned vehicle integration in their own domains.
1.1 Autonomous Robotics in Complex Sociotechnical Systems
The terms “unmanned” and “autonomous” are often used interchangeably to describe a variety of complex vehicle systems, and both must further be differentiated from “automated” systems. The
most recent Department of Defense roadmap for autonomy defines an “unmanned” aircraft as one
“that does not carry a human operator and is capable of flight under remote control or autonomous
programming” (Office of the Secretary of Defense, 2013). When remotely-controlled by a human
operator, other applicable terms include “teleoperated” (Sheridan, 1992) and “Remotely-Piloted”
or “Remotely-Operated” aircraft (Federal Aviation Administration, 2014b). In these remotely-controlled systems, a human operator is still the primary source of lower-level control inputs to the
vehicle system. Alternatively, “autonomous” and semi-autonomous systems offload some of these
activities to the vehicle system, allowing it “to be automatic or, within programmed boundaries,
‘self-governing’ ” (Defense Science Board, 2012). This promotes the human operator to a role of
“supervisor” in which they provide instructions as to where the vehicle should travel (waypoints)
rather than directly controlling steering, acceleration, and braking. The term “automated” then
refers to the assignment of a singular, often repetitive, task to an individual machine.
While various automated technologies are common in manufacturing enterprises, autonomous or
semi-autonomous robots are not commonly seen traveling through the halls of a hospital, rolling along
an urban sidewalk, or flying through the skies loaded with cargo. However, the success of DARPA’s
Grand Challenge (Seetharaman, Lakhotia, & Blasch, 2006) and Urban Challenge (Buehler2009,
2009) as well as the development of Google’s autonomous car (Markoff, 2010; Thrun, 2010) have
demonstrated that fully-autonomous road vehicles are not far away. Unmanned aerial vehicles, using
only a limited form of autonomy to communicate with and execute commands from a remote pilot,
have been used for years in U.S. military operations; in Japan, remotely piloted helicopters have
been used in large numbers for agricultural crop dusting (Sato, 2003). The Association of
Unmanned Vehicle Systems International (AUVSI) forecasts massive growth potential for unmanned
systems in U.S. agriculture and public safety (Jenkins & Vasigh, 2013), and the FAA’s recent awarding of six Unmanned Aerial Vehicle (UAV) test sites should aid in the commercial deployment of
such vehicles within the United States. In a very different context, a variety of work is also currently
addressing the personal service robotics field, where a major focus is the development of robots to
care for the growing elderly population (Roy et al., 2000), as well as a new generation of manufacturing robots that can better perceive the actions of their human collaborators, improving their
ability to understand human intentions in the context of work tasks (Nikolaidis & Shah, 2012).
Two primary concerns exist for the integration of unmanned and autonomous systems into these
human-centered environments: safety and productivity. Autonomous vehicles must be capable of
safely moving near to and interacting not only with the humans that can control or influence their
behavior, but also with other bystanders (both humans and manually controlled vehicles) that otherwise
have no influence over them. They must also be able to do so without significantly reducing the
efficiency of operations: for instance, in highway driving, autonomous cars must navigate effectively
in traffic, merging in seamlessly with both human-controlled vehicles and pedestrians on or near
the roadway. Aerial vehicles in public domains must adhere to the same sense-and-avoid actions
that human pilots take, similar to the “rules of the road” that autonomous surface vessels must
adhere to in navigating the seas amid other ships. Personal robots must attend to whatever tasks
are required of them without running into, or over, their human companions or other items within
the household. These environments can all be characterized as Heterogeneous Manned-Unmanned
Environments (HMUEs), highlighting the fact that different types of both manned and unmanned
vehicles (including humans) operate in close proximity within a given area. Figure 1-1 provides a
notional representation of these domains.

Figure 1-1: Notional diagram of Heterogeneous Manned-Unmanned Environments (HMUEs).

While a great deal of research currently addresses the development of individual unmanned
or autonomous systems, or small teams of them working together, very little research currently
examines their integration into these larger HMUEs. These large-scale systems, in both size and
number of active agents, are problematic to test for several reasons. First, the number of actors in
systems like the national highway system is so large that real-world testing at appropriate scales
requires the participation of many individuals (in and of itself a challenge to organize), would incur
significant expenses in terms of time and money, and would require substantial infrastructure to
support operations. Additionally, because the autonomous systems are unproven, there may be
substantially increased risk levied on the human participants in the system or to bystanders in
the area. While small-scale tests could occur, for many of these complex sociotechnical systems,
properties at the smaller scale (one or only a few vehicles) do not typically scale to larger systems
(several dozen systems) due to non-linear relationships and trends in the system. Furthermore, the
systems of interest may only be in the prototype (or earlier) stage and not actually be available for
physical testing in the world, even if the other obstacles could be overcome.
One potential method of addressing these limitations is the creation of realistic simulations of
HMUEs that replicate the logic and behavior of unmanned vehicles, manned vehicles, and human
operators and bystanders in the world. A valid simulation of such environments would allow designers
to explore the effects of design choices in both the unmanned vehicles and the rules that govern their
implementation and integration into the system to be tested more quickly, less expensively, and over
more possible scenarios than empirical, real-world testing would allow. This dissertation examines
the use of agent-based modeling, a bottom-up approach that attempts to model the individual
motion and decision-making of actors in the environment, to create simulations of these complex
sociotechnical systems for the purposes of exploring the effects of safety protocols (the methods of
integrating unmanned vehicles into operations) and methods of robotic vehicle control on system
safety and productivity.
1.2 Research Statement
The goal of this dissertation is to develop a simulation of aircraft carrier flight deck operations
as a case study in exploring the effects of implementing unmanned vehicles into Heterogeneous
Manned-Unmanned Environments (HMUEs). In this exploration, the key interest is in how unmanned vehicles affect the safety and productivity of the system as a whole, rather than focusing
on individual vehicles. Variations in these measures amongst different types of unmanned vehicle
control architectures enable the identification of the key design parameters that drive performance
along metrics of safety and productivity.
1.2.1 Research Approach
The differences between unmanned vehicles integrated into HMUEs lie in the low-level interactions
between vehicles with the world, with other vehicles, and with other human collaborators in the
world. This thesis utilizes agent-based modeling as its approach, as agent-based models are constructed
in a bottom-up fashion by modeling low-level behaviors and interactions of actors in the environment.
When the simulation model is executed, it begins with a set of initial conditions for the agents and the
world; the behavior rules and decision-making procedures of the agents, all working in a decentralized
fashion, then dictate how the simulation evolves from that point forward. If properly constructed,
when given a set of initial conditions, an agent-based model should evolve in the same fashion as
the real system it is intended to replicate.
In addition to the behavioral and decision-making rules, defining agents in the system also
requires defining the key states and parameters that characterize tasks that are performed in the
world and how agents observe the world. Agent states define their current location in the world,
speed, orientation, as well as other features such as current goal and health. Parameters define the
range of speeds at which agents may travel or the range of times a task may take when
executed. The behavioral rules then define, given the current state of the agent and observable states
of the environment and other agents, (1) what its current goal should be, (2) the tasks it should
perform to achieve that goal, and (3) whether or not it can execute those tasks. The defined agent
parameters then describe how agents perform the prescribed tasks. Agent models may include not
only the actors in the system (people, vehicles, etc.) but machinery, equipment, and other resources
within the environment as well as the environment itself. Together, the definitions of agents, tasks,
and states should characterize not only individual behavior, but the interactions between individuals
in the environment.
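As an illustration of this structure, the sketch below shows one way an agent's states, parameters, and three-part decision rule could be expressed in code. It is a minimal, hypothetical Python example: the World class, its goal-assignment and path-clearance methods, and all numeric values are assumptions made here for illustration and are not drawn from the MASCS implementation.

import math
import random

class World:
    """Toy environment used only for this sketch; goal assignment and the
    path-clearance check are placeholder assumptions, not the MASCS model."""
    def __init__(self, goals):
        self.goals = list(goals)

    def next_open_task(self, position):
        # Hand the agent the nearest unassigned goal location, if any remain
        if not self.goals:
            return None
        goal = min(self.goals, key=lambda g: math.dist(position, g))
        self.goals.remove(goal)
        return goal

    def path_is_clear(self, start, end):
        # Placeholder: a real model would check other agents, obstacles, and rules
        return True

class Agent:
    def __init__(self, name, position, speed_range):
        # States: current location and current goal
        self.name = name
        self.position = position
        self.goal = None
        # Parameters: bounds on how fast this agent may travel
        self.speed_range = speed_range

    def step(self, world, dt=1.0):
        # Behavioral rule: (1) choose a goal, (2) derive the task (move toward it),
        # (3) check whether the task can be executed before acting
        if self.goal is None:
            self.goal = world.next_open_task(self.position)
        if self.goal is None or not world.path_is_clear(self.position, self.goal):
            return
        speed = random.uniform(*self.speed_range)     # stochastic task performance
        dx = self.goal[0] - self.position[0]
        dy = self.goal[1] - self.position[1]
        dist = math.hypot(dx, dy)
        move = min(speed * dt, dist)
        if dist > 0:
            self.position = (self.position[0] + move * dx / dist,
                             self.position[1] + move * dy / dist)
        if move == dist:
            self.goal = None     # goal reached; a new goal is chosen on the next tick

# Decentralized evolution from initial conditions: each agent applies its own rules each tick
world = World(goals=[(50.0, 20.0), (10.0, 80.0)])
agents = [Agent("aircraft-1", (0.0, 0.0), (2.0, 4.0)),
          Agent("aircraft-2", (5.0, 5.0), (2.0, 4.0))]
for _ in range(30):
    for agent in agents:
        agent.step(world)

When such rules are exercised over many ticks, system-level properties such as congestion or queueing emerge from the local decisions rather than being scripted directly, which is the sense in which agent-based models are built from the bottom up.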
Agent-based modeling has been used to explore a variety of systems, ranging from pure decision-making models in economics to models of human and vehicle motion in the world. In one study
of human motion, researchers explored ways to improve fire safety evacuations of crowds. The
non-intuitive result from this work was that placing a column directly in front of the exit door
improved egress performance: agents self-organized into a more efficient and orderly exit pattern
than observed in simulations without the column (Helbing, Farkas, & Vicsek, 2000). Agent-based
models have also been used extensively in highway traffic safety modeling (Gettman & Head, 2003;
Archer, 2004; Yuhara & Tajima, 2006) in order to better understand accident prevention without
risking the safety of potential test subjects or bystanders.
A difficulty in modeling futuristic HMUEs is that, by definition, such environments do not yet exist, and models of them cannot be directly validated. However,
if a current operational environment is similar to the futuristic one, then a valid model of current
operations can be used as a template for future operations and lend validity to these futuristic
models. One such example of this is aircraft carrier flight deck operations, which this dissertation
takes as its point of investigation. The modern U.S. aircraft carrier flight deck is an example of a
heterogeneous manned environment, where humans and manned vehicles interact and coordinate
within close physical spaces. The United States Navy is currently investigating the inclusion of
unmanned aerial vehicles into flight deck operations, where the vehicles will be required to interact
with the flight deck crew and other manned aircraft in a similar fashion as manned aircraft currently
do. The modeling of futuristic aircraft carrier flight deck operations begins with the modeling and
calibration of an agent-based model of current aircraft carrier flight operations before proceeding to
the development and integration of unmanned vehicles into the simulation. This final simulation is
then used as an experimental testbed, comparing the performance of various forms of UAV control
and safety protocols in flight deck operations.
1.2.2 Research Questions
This thesis addresses several questions concerning the simulation of heterogeneous manned-unmanned
environments and the examination of safety protocols and control architectures therein:
1. What are the specific agents and their related states, parameters, and decision-making rules
required to construct an agent-based model of carrier flight deck operations?
2. What are the key parameters that describe variations in unmanned vehicle control architectures
and their interactions with the world, in the context of an agent-based model of flight deck
operations?
3. In the context of the modeled control architectures, how do these key parameters drive performance on the aircraft carrier flight deck? How do these parameters interact with the structure
of the environment, and how do the effects scale when using larger numbers of aircraft in the
system?
4. If high-level safety protocols are applied to operations, what effects do they have on the safety
of operations? Do the safety protocols lead to tradeoffs between safety and productivity in
the environment?
1.3 Thesis Organization
This dissertation is organized as follows:
• Chapter 1, Introduction, describes the motivation, objectives, and research questions.
• Chapter 2, Literature Review, describes prior research into unmanned vehicle control, safety
protocols applied to unmanned vehicles, and prior simulation methods for both human interaction with unmanned systems and safety.
• Chapter 3, Model Development, provides background on current aircraft carrier flight deck
operations and the construction of the agent-based model of this environment. This includes
descriptions of the empirical data used to describe vehicle behavior, the stochastic variables
used to model vehicle performance, and the logic functions used to replicate decision-making
by humans and vehicles in the environment.
• Chapter 4, Model Validation, describes the process of validating aspects of this model, including
the empirical data used in testing, the series of tests executed, and key results that support
that the model is a reasonable simulation of flight deck operations.
• Chapter 5, Multi-Agent Safety and Control Simulation (MASCS) Experimental Program, describes the implementation of new unmanned vehicle and safety protocols into the simulation
model, the experimental matrix used in the testing process (the independent variables varied
throughout testing), performance metrics used in the analysis, and the results of testing.
• Chapter 6, Exploring Design Options, describes other potential applications of the MASCS
model in examining the interactions between Control Architectures, Safety Protocols, and
mission settings in the context of a design space exploration.
• Chapter 7, Conclusions, describes the important conclusions from this work. This includes
key results regarding the integration of unmanned vehicles into aircraft carrier flight deck
operations, how these results inform unmanned system integration into other environments,
and lessons learned regarding the use of agent-based modeling methods in this work.
Chapter 2
Literature Review
“... To cast the problem in terms of
humans versus robots or automatons is
simplistic... We should be concerned with
how humans and automatic machines can
cooperate.”
Telerobotics, Automation, and Human
Supervisory Control (1992)
Thomas B. Sheridan
A Heterogeneous Manned-Unmanned Environment (HMUE) requires close physical interaction
between humans, manned vehicles, and unmanned, autonomous, or robotic vehicles. Currently,
HMUE environments are rare, but there exist several heterogeneous manned environments (human
collaborators and manned vehicles) that are seeking to integrate unmanned vehicles into their operations. These range from airborne operations, such as aircraft carrier flight decks and other military
air operations, to ground operations, such as autonomous “driverless” cars and automated forklifts, and to
human-centered environments, such as interactive manufacturing robots and assistive robotics for
personal and medical use. In each case, unmanned vehicles bring with them new capabilities and
limitations, and the rules that traditionally govern interaction with manned vehicles may no longer
be appropriate. Ultimately, the effectiveness of the unmanned robotic systems is contingent upon
their capability to perform tasks and interact with the environment and the safety with which these
operations and interactions occur. This chapter reviews prior research in these two areas. The first
section examines prior research on the safety of unmanned vehicle operations in terms of the rules
that govern their interaction with the environment — the “Safety Protocols” that dictate when,
where, and how vehicles interact with the world. The second section addresses the types of “Control
Architectures” used to control unmanned vehicle systems, which dictate their capabilities to move,
observe, and interact with the world. The third section then describes how the nature of Safety
Protocols and Control Architectures relates to the use of agent-based simulation models.
2.1 Safety Protocols for Heterogeneous Manned-Unmanned Environments
Over the past few decades, a variety of competing ideas regarding the causes of accidents in complex systems have been developed. Some have argued that complex systems always eventually
fail (Perrow, 1984), while others have demonstrated that this is not necessarily true for certain
systems (Rochlin, La Porte, & Roberts, 1987; Weick & Roberts, 1993; LaPorte & Consolini, 1991).
Others have explored the distinctly human contributions to error, including problems that develop
over time due to both individual and organizational factors (Reason, 1990; Reason, Parker, & Lawton, 1998; Dekker, 2005). More recently, a systems perspective has been applied to safety (Leveson,
2011, 2004). The common theme within this research is that preserving safety — preventing accidents — is a behavioral concern. Errors (incorrect actions taken by individuals), hazards, and
accidents ultimately occur because certain aspects of the system — human or otherwise — did not
behave as expected and the constraints and processes put in place to prevent accidents all have
failed. Additionally, failure to execute the processes, as well as human behavioral factors, cause the
system to degrade over time. Reason referred to this as the generation of “latent” failures in the
system that lie in wait for the right set of circumstances to become active (Reason, 1990). This is
encapsulated in Fig. 2-1.

Figure 2-1: Adaptation of Reason’s “Swiss Cheese” model of failures (Reason, 1990).

To prevent the occurrence of an accident, a behavioral constraint at some level must intervene.
These can be applied at the organizational and supervisory levels, dealing with topics such as
“just culture” and proper training of staff. In the context of human interaction with unmanned
vehicles, significant concerns also exist at the lower levels, in that operators and vehicles must
avoid unsafe acts and preconditions to those unsafe acts. Unsafe acts in these contexts most often
correspond to failures of vehicles to stop in time to prevent a collision (be it not receiving the order
to stop, attempting to stop and not being able to, or incorrectly interpreting the stop command) and
failures of the crew and supervisors to command that certain actions not occur in parallel (failures
of communication, simple human error, and similar cases). While these are known issues, there
is again little past literature on how unsafe control actions (and other hazards and accidents) are
avoided or otherwise mitigated. The following sections review research into unmanned vehicle safety
as well as examples of safety protocols used currently within heterogeneous manned environments.
Within the context of the latent failure framework, these sections attempt to characterize common
methods of increasing safety in unmanned and robotic operations.
2.1.1 Safety Protocols for Unmanned Vehicles
Research concerning safe autonomous robotic systems addresses a variety of cases, but the primary focus is on a single vehicle interacting with the world. Various studies have addressed automated manipulator robots (Heinzmann & Zelinsky, 1999; Kulic & Croft, 2005; Tonietti, Schiavi,
& Bicchi, 2005; Alami et al., 2006; Kulic & Croft, 2006; Nikolaidis & Shah, 2012), Unmanned
Aerial Vehicles (UAVs) (Schouwenaars, How, & Feron, 2004; Teo, Jang, & Tomlin, 2004; Sislak,
Volf, Komenda, Samek, & Pechoucek, 2007), Unmanned Ground Vehicles (UGVs) (Johnson, Naffin, Puhalia, Sanchez, & Wellington, 2009; Yoon, Choe, Park, & Kim, 2007; Burdick et al., 2007;
Campbell et al., 2007), Unmanned Surface Vehicles (USVs) (Larson, Bruch, Halterman, Rogers, &
Webster, 2007; Bandyophadyay, Sarcione, & Hover, 2010; Elkins, Sellers, & Monach, 2010; Huntsberger, Aghazarian, Howard, & Trotz, 2011), and personal robots (Jolly, Kumar, & Vijaykumar,
2009). In the context of safety, the major concern is the motion of the unmanned systems near to
human collaborators and other manned and unmanned vehicles. As such, the focus of safety research
lies on path planning and navigation systems of the vehicles — the primary decision-making action
of vehicles in the aforementioned studies. For collisions between unmanned robotic systems and
other humans, vehicles, and objects, the system attempts to avoid the hazardous states that could
lead to collisions. Examples of hazards might include (1) being too close and not having sufficient
stopping distance, (2) turning the wrong direction (applying an incorrect action), or (3) not seeing
an obstacle (and thus being unable to apply the correct control action). Safety protocols can act to avoid
this hazardous state entirely (by maintaining an appropriate distance between vehicles), move out
of a hazardous state if it exists (by moving away or past the obstacle), or at worst, attempt to
minimize the damage caused if an accident actually happens (by decreasing speed).
Three general methods of Unmanned Vehicle (UxV) collision avoidance are observed in the
literature: a) path planning prior to the start of movement, b) reactive collision avoidance during
movement, and c) damage/injury mitigation at contact if collision occurs. Each applies some form
of constraint to vehicle behavior, either restricting their possible paths or speeds, in order to prevent
the occurrence of the hazardous state of being too close to other objects. Ideally, option a) is taken,
and the motions of all vehicles in the environment are planned such that no collisions are ever likely
to occur (Kulic & Croft, 2005; Schouwenaars et al., 2004; Teo et al., 2004; Johnson et al., 2009). To
do so requires knowledge of the current position and trajectory of all nearby agents in the world and
accurate predictions of how they will move over time. Given this information, paths for the vehicles
are constructed simultaneously with the minimum safe “buffer” distance applied through all paths
at all times. However, these plans will likely not evolve as intended given uncertainty in sensor data
and stochasticity in both the environment and task execution. As such, many autonomous vehicles
will also utilize software that employs method b) — a reactive collision avoidance system working in
real time, continually monitoring the motion of nearby obstacles and adjusting the vehicle’s path to
avoid them (again using the declared buffer distance), otherwise referred to as “detect, sense, and
avoid” (Kulic & Croft, 2006; Sislak et al., 2007; Yoon et al., 2007; Burdick et al., 2007; Campbell
et al., 2007; Larson et al., 2007; Bandyophadyay et al., 2010). In cases when contact is inevitable,
such as a robotic manipulator arm working closely with human colleagues, the system ensures that
collisions occur at low kinetic energy levels in order to minimize injury or damage (Heinzmann &
Zelinsky, 1999; Tonietti et al., 2005; Alami et al., 2006; Jolly et al., 2009). In general, however,
these protocols attempt to do two things: maximize the distance between vehicles at all times and
minimize the speed of impact if a collision cannot be avoided. However, these two concepts are fairly
limited in scope, applying only to collisions between vehicles. Complex systems such as the HMUEs
of interest in this work can have a large host of other hazardous states that lie outside of simple
vehicle collisions and may require additional protocols. Some examples of these are reviewed in the
next section.
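Before turning to those examples, the following sketch illustrates the core of method b) described above: a reactive check that projects positions over a short horizon and commands a stop whenever any obstacle would fall inside a declared buffer distance. It is a minimal, self-contained illustration under stated assumptions; the buffer value, time step, and the stop-only response are choices made here for brevity (a fielded system would typically replan its path rather than simply halt).

import math

BUFFER_DISTANCE = 15.0   # assumed minimum separation (meters); illustrative value only

def reactive_avoidance(position, velocity, obstacles, dt=0.5):
    """Method b) sketch: project own and obstacle positions one step ahead and
    stop if any predicted separation drops below the declared buffer distance."""
    next_pos = (position[0] + velocity[0] * dt, position[1] + velocity[1] * dt)
    for obs_pos, obs_vel in obstacles:
        obs_next = (obs_pos[0] + obs_vel[0] * dt, obs_pos[1] + obs_vel[1] * dt)
        if math.dist(next_pos, obs_next) < BUFFER_DISTANCE:
            return (0.0, 0.0)   # hazard predicted: command a stop
    return velocity             # separation maintained: keep the current command

# Example: an oncoming vehicle closes inside the buffer, so the agent holds position
print(reactive_avoidance((0.0, 0.0), (5.0, 0.0), [((12.0, 0.0), (-5.0, 0.0))]))  # (0.0, 0.0)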
2.1.2 Safety Protocols in Related Domains
While HMUEs are currently rather rare in practice, there are a variety of similar domains that
can provide some guidance as to common hazards of human-machine interaction and the protocols
that are used to prevent them. A first example is the manufacturing domain, which has utilized
automated systems for several years. These systems lack the intelligence and path planning capabilities of the systems in the previous section, and thus a different tactic has been utilized in order
to ensure safe operation near human beings — physical isolation (Marvel & Bostelman, 2013). One
common practice is to place these systems within a physical cage where no human coworkers are
allowed, preventing any possible collisions between human and machine (Fryman, 2014). However,
the International Organization for Standardization (ISO) is currently developing standards for more
integrated human-robot interaction that address both manipulator arms and mobile robotic systems
(ISO/DST15066 (ISO, unpublished)). In these draft standards, four potential safe operating modes
are identified: (1) stopping in a predetermined position when a human enters the area, (2) manual
guiding of robot activities by the human, (3) speed and separation monitoring, and (4) power and
force limiting (Fryman, 2014). The first option is, in essence, still a form of isolation: the robot and
human are not operating in the same space at the same time. The second is a form of teleoperation,
with the human in full control of the robot’s activity from a safe distance away. In the third and fourth,
the robot is allowed to operate autonomously near the human collaborator, with key behavioral rules
encouraging safety. In the third option, attempts are made to ensure no contact ever takes place; in
the fourth, contact is allowed, but only at low impact forces.
The general rule for manufacturing systems, then, has been to keep human operators separate
from the robotic systems when at all possible. Enabling greater collaboration means allowing closer interaction while ensuring that collisions either do not occur or pose little danger if they do. These
same principles of separation have also been used in more advanced mobile robotic systems like the
Kiva “mobile-robotic fulfillment system” used in some Amazon.com and Zappos warehouses (Scanlon, 2009). In this warehouse management system, rather than requiring human workers to move
about the facility locating and loading packages onto carts or forklifts, Kiva’s robots retrieve small
shelving units from the shop floor and deliver them to human packers. These packers remain in a
fixed location in the warehouse; the robots include sophisticated path planning algorithms that
allow them to safely move around each other, but they cannot account for the presence of a human
coworker. In these systems, it is the human who is kept isolated from the robots, staying within a
specific area while shelves and packages are brought to them.
Two final domains — naval aviation and commercial mining — both involve the use of very
large, mobile vehicles that directly require human interaction to perform tasks in the world. On the
aircraft carrier deck, upwards of forty aircraft interact with over one hundred human crewmembers
and a dozen support vehicles to fuel, load, launch, and land aircraft in rapid fashion. In open
pit mining, 300-ton “haul trucks” work in close spaces with front loaders, dumping chutes, and
human supervisors traversing the site in smaller vehicles (typically small pickup trucks and sport
utility vehicles) to move loose ore from blast pits to processing sites. These tasks typically require
close interaction and maneuvering of vehicles, as well as navigation of roadways and intersections
similar to that on more typical civilian road systems. Additionally, each of these two domains is
actively investigating the introduction of UxV systems into operations. The United States Navy is
currently engaged in developing an unmanned, autonomous combat vehicle for use on the aircraft
carrier deck (McKinney, 2014; Hennigan, 2013) while the mining industry (Rio Tinto, 2014; Topf,
2011) is exploring the development and use of autonomous hauling trucks for use in open pit mining.
Given the dangerous nature of operations in these environments, a high priority is placed on safety in
operations, resulting in the development of numerous safety protocols to safeguard system operations.
Published standards were not located for the mining and aircraft carrier domains, but a variety
of information on operations was obtained from training manuals, interviews with subject matter
experts, and observations of operations. These sources of information were used to compile a set of
safety protocols that work within the system. In reviewing this documentation, a framework was
defined to describe an accident in terms of the potential hazards that generated the accident and the safety protocols (rules of behavior) put in place to avoid those hazards. This resulted in 33 different
combinations for aircraft carrier operations (10 different accident types, 25 different hazards, and 18
safety protocols) and 35 for mining operations (7 general accident types, 15 different hazards, and
11 different safety protocols). Examples of these appear in Table 2.1 below, with a full list appearing
in Appendix A. Together with the examples from prior sections, three general areas of focus appear
to emerge, based on the types of hazards being mitigated in each case. These three major safety
“principles” of Collision Prevention, Exclusive Areas, and Task Scheduling are explained in more
detail in the next section.
2.1.3 Defining Classes of Safety Principles
The types of safety protocols that act within manufacturing, aircraft carrier, and mining environments appear to fall within three general categories, listed within Table 2.2. These three safety
principles provide a framework both for classifying current safety protocols observed in heterogeneous vehicle environments and for creating new protocols that could be applied. The first principle,
Collision Prevention, includes safety protocols that govern the spacing of vehicles when in motion.
Table 2.1: Examples of accidents, hazards, and safety protocols from the mining and aircraft carrier
domains.
Hazard | Accident | Aircraft Carrier Flight Deck | Mining
Vehicle collision while in transit | Collision between vehicles | (Observation) During transit on deck, Aircraft Directors maintain a large spacing between vehicles. | In travel mode, automated trucks in the mine site are not allowed within 50 m of human workers.
Interference with a dangerous task | Collision, fire, human injury and death | Do not enter catapult area if aircraft is on catapult. | Automated trucks may never enter within 5 m of a stationary human worker.
Interference between dangerous tasks | Collision, fire, human injury and death | Crew are instructed to never fuel and load weapons onto an aircraft at the same time. | Excavators only load one truck at a time.
Accidental vehicle motion | Collision between vehicles, human death | Tie down aircraft if inactive (prevent accidental motion). | Lockdown vehicles that are in use (prevent accidental operation).
This category would include the “buffer” areas included in the unmanned vehicle motion protocols
covered in Section 2.1.1 and the “speed and separation monitoring” operating mode in the draft
ISO (ISO, unpublished) standards discussed in Section 2.1.2.
The second safety principle is Exclusive Areas, in which certain areas should not be accessed
while certain systems are operating. One example from aircraft carrier operations is that the area
immediately surrounding a catapult is off-limits while an aircraft is being attached and launched;
this decreases the risk that a crewmember or other vehicle will be in the launch path and will be
struck during a launch. From the ISO standards, the first operating mode — which requires that
robotic systems not be in operation while a human collaborator is near — would also fit within this
classification.
Task Scheduling protocols, the third class, describe constraints on task execution – that two
actions either should not be performed at the same time, should not occur in a certain order, or
should not occur within a certain period of time of one another. In the aircraft carrier domain, one
example is that aircraft fueling and weapons loading do not occur at the same time; a fuel accident
and fire occurring while weapons are being loaded could be catastrophic. This can also include
cases in which a task is inserted solely to reduce risk in operations. For instance, an example from
the aircraft carrier environment concerns waiting for appropriate crew to escort a vehicle across
the deck. While an aircraft could begin a taxi motion without the full escort group, doing so may
significantly increase the likelihood of an accident occurring on the deck. The first and second modes
of operation in the draft ISO standard might also fit in this category, as the robotic systems would not
be allowed to work simultaneously with humans in the environment. The primary difference between
Task Scheduling and Exclusive Areas is in how many different regions exist within the environment:
if multiple areas exist, Exclusive Areas would allow parallel operations in each individual region.
Otherwise, Task Scheduling would describe the ordering of tasks within the single region.
Table 2.2: The three general classes of safety principles elicited from HMUE source materials (Rio
Tinto, 2014; Topf, 2011) and observations.
Principle Class | Constraints
Collision Prevention | Safety is enforced through sensing of nearby vehicles and obstacles. Vehicles in motion must maintain a minimum safe separation distance from other vehicles at all times.
Exclusive Areas | Safety is enforced through physical segregation. The working environment is divided into separate regions for manned and unmanned operations. Vehicles must operate within their prescribed regions at all times and should not exit their prescribed regions.
Task Scheduling | Safety is enforced through scheduling. Tasks are performed only in a specified order to reduce risk in operations. This may apply to multiple tasks for a single vehicle or for separating entire groups of vehicles/humans.
The three safety principles listed above all share a common feature: each describes a type of
constraint on individual behavior (human and vehicle, manned and autonomous) in the environment,
addressing when, where, and how agents perform tasks in the world. Collision Prevention protocols require additional logic and decision making about the paths vehicles are allowed to travel and require them to monitor the world around them, looking for conditions where they should adjust
their speed or path. Exclusive Area protocols require additional logical functions to know when
vehicles can enter a given area, if ever. Task Scheduling protocols address the logic behind task
execution, describing when and under what conditions tasks should or should not be performed.
The common feature of these constraints is that behavioral rules describe what the vehicle should do: these rules are features of the software of the unmanned and autonomous vehicles, not direct properties of the physical hardware. As such, modeling the effects of safety protocols in HMUEs
requires modeling the logic behind these actions, in the context of the motion of agents in the world
and their interactions with others.
These safety principles, when applied to a specific domain and specific set of operations, form
the set of safety protocols (Table 2.3) that apply to specific vehicle behaviors and tasks in the world.
In this work, these principles are applied to aircraft carrier flight deck operations (which will be
described further in Chapter 3). In applying these principles, protocols take the form provided in
Table 2.3, described in terms of vehicle separation. The Collision Prevention principle describes
the “Dynamic Separation” of vehicles in motion, where aircraft should maintain a minimum safe
distance between themselves and other aircraft in front of them. This protocol would be used in
both manned-only and heterogeneous manned-unmanned environments, although the minimum safe
distance may vary between vehicle systems and operating conditions. Exclusive Areas translates
into an “Area Separation” protocol in which the flight deck is divided into forward and aft sections.
Manned and unmanned aircraft begin in separate areas and are only allowed to operate within
those areas. Task Scheduling becomes a “Temporal” separation, with operations organized such
that manned aircraft begin operations first with unmanned aircraft remaining inactive until manned
operations complete. Additional details as to how these rules are encoded for specific actors and
agents will be provided in Chapter 5.
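As a notional illustration of how these three protocols might be encoded as behavioral constraints, the Python sketch below expresses each as a predicate that a proposed aircraft state must satisfy. The class, the deck geometry, and the numeric thresholds are invented for the sketch and are not the encodings developed in Chapter 5.

```python
from dataclasses import dataclass

VEHICLE_LENGTH_M = 19.0  # assumed nominal aircraft length for this sketch

@dataclass
class Aircraft:
    ident: str
    manned: bool
    x: float                  # position along the deck (m), 0 = stern
    moving: bool
    done_taxiing: bool = False

def dynamic_separation_ok(ac: Aircraft, ahead: Aircraft) -> bool:
    """Dynamic Separation: keep at least one vehicle length behind the aircraft ahead."""
    return abs(ahead.x - ac.x) >= VEHICLE_LENGTH_M or not ac.moving

def area_separation_ok(ac: Aircraft, deck_split_x: float) -> bool:
    """Area Separation: manned aircraft stay aft of the split, unmanned stay forward."""
    return ac.x <= deck_split_x if ac.manned else ac.x >= deck_split_x

def temporal_separation_ok(ac: Aircraft, fleet: list[Aircraft]) -> bool:
    """Temporal Separation: unmanned aircraft may not taxi until manned taxiing is done."""
    if ac.manned:
        return True
    return all(m.done_taxiing for m in fleet if m.manned)

# Example: an unmanned aircraft asking to taxi while a manned aircraft is still active.
fleet = [Aircraft("M1", True, 40.0, True),
         Aircraft("U1", False, 120.0, False)]
print(temporal_separation_ok(fleet[1], fleet))          # False: manned taxi not complete
print(area_separation_ok(fleet[0], deck_split_x=80.0))  # True: manned aircraft is aft
```

A simulation agent would evaluate predicates of this kind before committing to a motion or task, which is the sense in which the protocols act as constraints on behavior rather than as hardware properties.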
Testing the effects of these Safety Protocols on productivity and safety of HMUE operations
requires generating models of the manned and unmanned aircraft that operate within them. Models
of manned, or manually controlled, vehicles can be created through empirical observations and
measurements that characterize their decisions and movements within the world. However, because
by definition futuristic unmanned vehicle systems do not yet exist, their models must be constructed
through different means. Many of these systems exist only within laboratory environments or as basic simulations designed to explore new graphical user interfaces, both far from being at sufficient fidelity
for observation and modeling. However, this research does provide sufficient background to describe
Table 2.3: The three general classes of safety protocols that will be applied to unmanned carrier
flight deck operations.
Safety Protocol | Definition
Dynamic Separation | Aircraft in motion should maintain a minimum safe distance of one vehicle length forward of the aircraft while in motion on the flight deck.
Area Separation | Operations will be planned such that manned and unmanned aircraft begin parked in separate regions of the flight deck. Operations are planned such that aircraft are only assigned to and taxi within their specified areas of operation.
Temporal Separation | Operations are planned such that manned aircraft are the first to operate on the flight deck. Unmanned aircraft are not allowed to begin taxi operations until manned aircraft complete their taxi operations.
how these systems might be utilized in the future and provide some guidance as to their capabilities.
The next section describes several different classes of unmanned vehicle systems, classified according
to how a human operator interacts with them — their “Control Architectures.”
2.2 Human Control of Unmanned Vehicle Systems
The key element of HMUEs is the presence of unmanned vehicles in the system. However, the
term “unmanned” can describe a variety of different systems with varying types and degrees of
capabilities. Some systems have highly advanced autonomous capabilities and can navigate the
world largely on their own; other unmanned vehicles are simply remote-controlled systems, with
little onboard intelligence and requiring the same inputs from human operators that manned aircraft
do. The changes in these systems can be characterized along three different axes: the location of
the human operator, what control inputs they provide, and how they provide them. Changes along
these axes have significant effects on the performance of the human-vehicle team in terms of how
they acquire information on the state of their surroundings, determine what tasks to perform, and
execute those tasks in the world.
This section covers five main types of control architectures: Local Teleoperation (LT), Remote
Teleoperation (RT), Gestural Control (GC), Vehicle-Based Human Supervisory Control (VBSC),
and System-Based Human Supervisory Control (SBSC). These are compared to a baseline concept
of Manual Control (MC), where a human provides control inputs directly to the vehicle while seated
within it. An overview of the differences between the systems appears in Figure 2-2. Because local
and remote teleoperation are identical (except for the distance from operator to vehicle), they are
combined in a single column at this time. Beginning with the first row (operator location), all
unmanned vehicle systems involve the relocation of the human operator from the vehicle to some
location outside of the vehicle. For the remaining two rows (control inputs and input modality),
moving from left to right moves from physical input of low-level control inputs (acceleration, steering,
etc.) to more abstract definitions of tasks (“go to this area and monitor for activity”) input through
a Graphical User Interface (GUI) using symbolic interactions. Subsequent sections explain how
these differences in control architecture definition affect performance of these systems. Together,
these changes shift the operator from providing highly-detailed control inputs to a single vehicle to
providing highly abstract instructions to multiple vehicles, moving from managing the behavior of
one vehicle to optimizing the performance and interaction of many.
2.2.1 Manual Control
In this work, Manual Control (MC) is defined as a system where the operator is seated within the
vehicle and provides low level control inputs directly to the vehicle through physical devices. Consider
a basic general aviation aircraft: the pilot controls pitch, yaw, and roll by directly manipulating the
yoke, throttle, and rudder pedals. The pilot thus dictates the direction and speed of the vehicle,
performing all tasks related to guidance and navigation, and is responsible for reacting to the world
and all external inputs and closing all control loops.1 These actions can be characterized in terms
of three main functions: sensing the world around them, selecting a response given those sensor
inputs, and executing the selected response. The general pattern of behavior of the human operator
is characterized by Wicken’s information processing loop (Wickens & Hollands, 2000), shown in
Figure 2-3.
The operator acquires information on the world through their senses, which is then translated
into general perceptions about the state of the world and its relationship to the operator’s intended
goals. Operators then move through a process of response selection, supported by both short-term
(storing current information) and long-term (recalling similar experiences) memory. Once the action
1 While, in certain systems, automatic control systems and autopilots may also be used, they are not the default
state of the system and are not necessarily active throughout the entirety of vehicle activity.
Figure 2-2: An overview of the variations between the five different control architectures described
in this section.
Figure 2-3: Human Information Processing loop (Wickens & Hollands, 2000).
is selected, it is implemented physically, providing control inputs to the vehicle, which then responds
to the control inputs. While depicted as a staged process, the operator is actually in a continual state
of monitoring the vehicle and all relevant inputs, often monitoring multiple sources of information
at once.
This processing loop is also accurate for unmanned vehicles, with the key change being that the
implementation of actions is no longer done through physical artifacts attached to the vehicle. The
processing loop thus extends to include the vehicle as a separate entity and various intermediaries
connecting operator to vehicle. The most commonly cited model of this is Sheridan’s Human Supervisory Control (HSC) loop, pictured in Figure 2-4 (Sheridan & Verplank, 1978; Sheridan, 1992). A
variety of systems are architected as shown in the figure, with the human operator providing input
to a computer or other mediator that transmits commands to the vehicle platform (which itself
may include a computer). This can take a variety of forms; in some cases, it is simply a Ground
Control Station (GCS) with a communications link, while in others, it is a more complex computer
system with intelligent planning algorithms. These changes in communications patterns change what
information the operator provides, what they receive, and what is required of the vehicles in the
world. The remaining sections describe different types of these control systems, beginning with the
ones most similar to manual control. Teleoperated systems require the operator to provide the same
control inputs in the same manner as MC, but to do so from a GCS that may be either nearby or
beyond line-of-sight communications.
2.2.2 Teleoperation of Unmanned Vehicle Platforms
A teleoperated system “extends a person’s sensing and/or manipulating capability to a location
remote from that person” (Sheridan, 1992). This might refer to teleoperated manipulator arms or
to mobile vehicles. In current military operations, most unmanned aircraft are teleoperated, with
Figure 2-4: Human Supervisory Control Diagram, adapted from Sheridan and Verplank (Sheridan
& Verplank, 1978).
the human operator providing largely the same control inputs to the vehicle, using the same physical
devices, as they would in manual control. As such, these are often referred to as “remotely piloted
aircraft,” or RPAs. For these systems, the Ground Control Station (GCS) used by the human operator
includes a throttle, yoke, and rudder pedals similar to those of the actual aircraft, which are used in
the same fashion as before. In this case, however, the control inputs are then translated by the GCS
into computer data, transmitted to the vehicle in some fashion, and then reinterpreted into
control actuator commands. This same transmission mechanism is also used to provide information
on the state of the vehicle back to the operator, which is then displayed on the GCS. All information
on the state and orientation of the vehicle, including views of the surrounding world as conveyed by
onboard cameras, is received through the GCS. The pilot’s perception of the world is thus limited
in three ways: their visual field is limited by both the onboard cameras and the size of the displays
on the GCS, while their physical location removes any haptic and proprioceptive information about
accelerations and orientation.
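The round trip just described can be sketched schematically as follows; the message format and field names are hypothetical simplifications, as real GCS data links use dedicated protocols and telemetry standards.

```python
import json

def encode_controls(throttle: float, yoke_pitch: float, yoke_roll: float,
                    rudder: float) -> bytes:
    """GCS side: translate physical control inputs into a transmittable message."""
    msg = {"thr": throttle, "pit": yoke_pitch, "rol": yoke_roll, "rud": rudder}
    return json.dumps(msg).encode("utf-8")

def decode_to_actuators(packet: bytes) -> dict:
    """Vehicle side: reinterpret the received message as actuator commands."""
    msg = json.loads(packet.decode("utf-8"))
    return {"engine_power": msg["thr"],
            "elevator_deflection": msg["pit"],
            "aileron_deflection": msg["rol"],
            "rudder_deflection": msg["rud"]}

# Example round trip for one control frame.
packet = encode_controls(throttle=0.6, yoke_pitch=-0.1, yoke_roll=0.0, rudder=0.05)
print(decode_to_actuators(packet))
```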
Just as in Manual Control, the human operator has the responsibility of closing all control
loops on the vehicle; anything that happens in the environment requires the operator to determine
and implement the appropriate response. The earliest teleoperation systems, and the majority
of early research, concerned nuclear materials handling, exploration of the moon, and underwater
exploration (Sheridan & Ferrell, 1963) – tasks not easily or safely performed by a human being.
The purpose was not to increase the performance of the system, but to remove the human operator
from physical danger. Given this goal, the costs of implementation — a reduced ability to observe
the system and, in many cases, the presence of substantial latency in data transmission — were
acceptable. Both of these elements (visibility and signal lag) have been shown to be detrimental
to the ability of the human-vehicle combination to perform tasks in the world in studies involving
telemanipulation systems (McLean & Prescott, 1991; McLean, Prescott, & Podhorodeski, 1994) and
ground vehicles (telerovers) (Spain & Hughes, 1991; Scribner & Gombash, 1998; Oving & Erp, 2001;
Smyth, Gombash, & Burcham, 2001). These tasks are similar to what will occur with teleoperated
aircraft on the flight deck and are chosen as the focal point for modeling. In each of these studies,
the lone change in the system was a change from human operator “direct viewing” (seated within the
vehicle) to some alternative form of visual display; these ranged from helmet-mounted systems linked
to cameras on top of the vehicle (such that the operator was still seated within) to the mounting of
computer displays within the vehicle to the use of remote ground stations.
In general, the experimental tasks for telerovers involved navigating a road course of some form,
with performance measured by time to complete the task and error measured by the number of
course deviations or markers hit by the vehicle. Similar measures were used for the telemanipulation
tasks. Results from those studies (McLean & Prescott, 1991; McLean et al., 1994; Spain & Hughes,
1991; Scribner & Gombash, 1998; Oving & Erp, 2001; Smyth et al., 2001) demonstrated a range
of results in both measures. The increase in task completion time for teleoperated systems ranged
from 0% to 126.18%, with a mean of 30.27% and a median of 22.12%. The tasks used in these
experiments typically required drivers to navigate a marked course that required several turning
maneuvers. In these tasks, drivers showed a significantly higher error rate (defined as the number of
markers or cones struck during the test) as compared to non-teleoperated driving. Percent increases
in errors ranged from 0% to 3650.0% with a median of 137.86% and a mean of 60%. While these
elements are difficult to translate to other types of tasks, conceptually, they signify a difficulty in
high-precision alignment tasks — a skill required of several tasks on an aircraft carrier flight deck.
The studies described above were all instances of “local” teleoperation, with delays in communications between vehicle and operator being much less than one second; the movement of the operator
to a location beyond Line-Of-Sight (LOS) communication — such as in the U.S. military’s current
use of “Remote-Split” operations with Predator UAVs — introduces a non-trivial amount of latency
into signal transmission. This delay has two significant repercussions on human control of teleoperated vehicles. The first is simply the existence of the delay in communications: transmitting data
from the vehicle to the remote operating station requires a period of time (X) to travel between
locations. Transmitting commands back from the remote station to the vehicle requires a second
period of time (Y ), with X + Y being equal to the round-trip delay in communications from time
of data transmission by the vehicle to the vehicle’s receipt of a new command from the operator.
The more interesting ramification is in how operators respond to the presence of lag in the system
and change their strategies to compensate. A naive teleoperator inputs tasks assuming there is no
lag, continually overshooting their target and requiring multiple corrective actions to finally reach
the target (and much greater amounts of time). Sheridan and Ferrell were the first to observe that
more experienced teleoperators used an “open-loop” control pattern, inputting tasks anticipating
the lag, then waiting for the tasks to complete before issuing new commands (Sheridan & Ferrell,
1963). While this strategy is not as efficient as operations with zero latency, it avoids exacerbating
the problem by continually inputting (erroneous) corrective actions. In both cases, the time required
to complete tasks increases at a rate higher than the amount of lag added to operations (it is a
multiplicative function) — one second of latency translates into much more than one second of
additional time. Fifteen different papers, totaling 56 different experimental tests, show that the
percentage change in task completion time per second of lag ranges from 16% to 330%, with a
mean of 125% and a median of 97% (Kalmus, Fry, & Denes, 1960; Sheridan & Ferrell, 1963; Hill,
1976; Starr, 1979; Mar, 1985; Hashimoto, Sheridan, & Noyes, 1986; Bejczy & Kim, 1990; Conway,
Volz, & Walker, 1990; Arthur, Booth, & Ware, 1993; Carr, Hasegawa, Lemmon, & Plaisant, 1993;
MacKenzie & Ware, 1993; Hu, Thompson, Ren, & Sheridan, 2000; Lane et al., 2002; Sheik-Nainar,
Kaber, & Chow, 2005; Lum et al., 2009). Thus, at the median, one second of lag results in a near
doubling of the time to complete a task (a 100% increase); two seconds of lag equals a 200% increase.
These tests addressed both cases where operators were directly observing the system they were controlling and cases where they received information via a video or computer display.
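To make the scale of this effect concrete, the short calculation below applies the median figure above (roughly 97% additional completion time per second of lag) to an arbitrary 30-second baseline task; the linear-in-lag form and the baseline time are assumptions used only for illustration.

```python
def completion_time_with_lag(baseline_s: float, lag_s: float,
                             pct_increase_per_s: float = 97.0) -> float:
    """Estimate task completion time given round-trip lag, using a linear-in-lag
    percentage increase set to the median of the reviewed studies."""
    return baseline_s * (1.0 + (pct_increase_per_s / 100.0) * lag_s)

# A task that takes 30 s with no lag nearly doubles with 1 s of lag
# and roughly triples with 2 s of lag at the median rate.
for lag in (0.0, 1.0, 2.0):
    print(lag, round(completion_time_with_lag(30.0, lag), 1))
```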
These studies imply that the nature of human teleoperation of unmanned vehicles involves two
distinct systems that must be modeled. In the first, termed Local Teleoperation (LT), the main
effects are that the operator’s view of the world is artificially constrained by the cameras onboard
the vehicle and the screens on the operator’s GCS, which leads to increases in the time required
to complete tasks (essentially, a decrease in vehicle speed). In the second, termed Remote Teleoperation (RT), additional latency is introduced into the signal, further degrading performance while
also forcing operators to adopt new strategies of interactions. For these systems, the preservation
of the safety of the human pilot, now a remote operator, comes at a significant cost of vehicle performance. These changes, however, do not guarantee any benefits to the vehicles themselves, or the
people that interact with them within the HMUEs. It is quite possible that the effects of latency on
operations lead these vehicles to be less safe than before, or greatly reduce the productivity of operations. Safety protocols applied to these vehicles must adequately address these changes in operator
capabilities but are ultimately limited by the properties of the vehicles. Otherwise, improving the
safety and efficiency of the unmanned vehicles in the system requires changing the form of operator
interaction that occurs. One potential upgrade from teleoperation involves the inclusion of limited
autonomy onboard the unmanned vehicle that negates the need for a ground control station, instead
allowing individuals within the environment to command actions directly. These advanced forms of
teleoperation are discussed in the next section.
2.2.3 Advanced Forms of Teleoperation
The more traditional teleoperation systems described in the previous section still rely heavily on a
single human operator to close the control loop on the unmanned vehicle under control; this forces
a variety of constraints into the design of the system, requiring a communications link to a GCS
and the transmission of video and vehicle state data to the operator for review. These architectural
constraints may lead to significant performance degradations because of the limitations inherent
in the system. More advanced forms of these systems might require no pilot involvement at all,
instead possessing sufficient autonomous capabilities to recognize task instructions from local human
collaborators, no longer requiring a human remote pilot to be part of the information processing
loop. Two forms of such systems are gesture- and voice-based control, in which the human operator
communicates commands rather than control inputs (“move forward” instead of “throttle up”).
In these systems, the computers onboard the vehicle are responsible for decomposing the verbal or
gestural command inputs into lower-level control inputs from a pre-defined list of possibilities. In the
context of Figure 2-2, these changes occur in the second and third rows, moving from control inputs
to commands and the modality from physical interfaces to “natural” gesture and voice commands.
Voice-based command of robotic systems has been utilized in a variety of fields, including surgical robotics (Mettler, Ibrahim, & Jonat, 1998), unmanned vehicles (Sugeno, Hirano, Nakamura,
& Kotsu, 1995; Doherty et al., 2000; Perzanowski et al., 2000; Kennedy et al., 2007; Bourakov &
Bordetsky, 2009; Teller et al., 2010), and motorized wheelchairs (Simpson & Levine, 2002). However,
these studies mostly focus on discussing the system as a proof of concept without reporting comparison to other control architectures, or even reporting the accuracy of the voice recognition system
itself. The results that are reported demonstrate error rates ranging from as small as 2 to 4% to upwards of 45% (Atrash et al., 2009; Wachter et al., 2007; Sha & Saul, 2007; Fiscus, Ajot, & Garofolo,
2008; Furui, 2010; Kurniawati, Celetto, Capovilla, & George, 2012). However, these studies also
deal with large sets of vocabulary words and are not directly focused on command language. A few
of these studies are intended for precisely this function, with several studies demonstrating speech
recognition rates above 95% (Ceballos, Gómez, Prieto, & Redarce, 2009; Gomez, Ceballos, Prieto, &
Redarce, 2009; Matsusaka, Fujii, Okano, & Hara, 2009; Park, Lee, Jung, & Lee, 2009; Atrash et al.,
2009). While perhaps promising in some applications, these systems require low levels of background
noise and are unlikely to be useful for aircraft carrier flight deck operations, the system of interest
in this work.
Gestural control systems utilize stereo-vision cameras and sophisticated computing algorithms
to first detect human operators, then track them within the local environment. Their actions are
then converted into data points that are compared with databases of previously-recorded motions
to determine the command being issued. These systems rely heavily on machine learning techniques
and training on previous data sets in order to address both between- and within-person variability
in motions. As such, the systems should get better over time, but may struggle with motions that
are not exaggerated. However, these systems may be particularly useful in flight deck operations,
as gestural communication is already the primary method in which crew communicate instructions
to pilots. Effective gesture control technologies would mean that the integration of UAVs would
have a minimal effect on the deck crew in terms of training and adaptation. Currently, most
research involves the use of human motion data sets — large databases of videos of people performing
various movements. The datasets most often observed in the literature search are the KTH (Schuldt,
Laptev, & Caputo, 2004), Weizmann (Blank, Gorelick, Shechtman, Irani, & Basri, 2005), and Keck
(“Gesture”) datasets (Lin, Jiang, & Davis, 2009). However, these data sets are typically based
around motions that are interesting to the researchers (for instance, running, boxing, and hand
waving) rather than any concepts of direct commands to a vehicle system. The data sets also typically
use a very small set of gestures (6, 9, and 14 for the sets), reflective of the current immaturity of the
technology.
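As a minimal sketch of the matching step described above, the snippet below classifies an observed motion by nearest-neighbor comparison against a database of previously recorded motions. The feature vectors, gesture labels, and distance metric are all invented for illustration and are far simpler than the learned models used in the cited work.

```python
import math

# Hypothetical database: each recorded motion is a small feature vector with a label.
GESTURE_DB = [
    ([0.9, 0.1, 0.0], "come_ahead"),
    ([0.1, 0.9, 0.0], "turn_left"),
    ([0.0, 0.1, 0.9], "stop"),
]

def classify_gesture(features: list[float]) -> tuple[str, float]:
    """Return the label of the closest recorded motion and its distance."""
    best_label, best_dist = None, float("inf")
    for ref, label in GESTURE_DB:
        dist = math.dist(features, ref)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

# An observed motion close to the "stop" exemplar classifies as "stop".
print(classify_gesture([0.05, 0.2, 0.85]))
```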
Even so, research on these sets provide a good picture of the state-of-the-art. A set of papers,
all from the 2005-2010 time frame and each referencing many other studies as comparison points,
were reviewed (Blank et al., 2005; Lin et al., 2009; Jhuang, Serre, Wolf, & Poggio, 2007; Scovanner,
Ali, & Shah, 2007; Laptev, Marszalek, Schimd, & Rozenfeld, 2008; Schindler & Gool, 2008; Yuan,
Liu, & Wu, 2009; Song, Demirdjian, & Davis, 2012) and their accuracy ratings were compiled into
the histogram in Fig. 2-5. Of these 146 studies reported, 91 report better than 90% accuracy,
37 report results better than 95%, and 11 report results of 100% accuracy in classifying human
gestures according to their definition. These results suggest that, at least in terms of accuracy, these
systems are quickly approaching the point at which they would be viable for field use. However,
as noted above, most of these are for random, common human gestures — not command type
gestures that would be used in flight deck operations. Song et al. are building a gesture-recognition
system specifically for this purpose, which requires tracking not only the human, but tracking the
arms and hands independently in order to understand different hand positions (thumbs up/down,
hand open/closed) (Song et al., 2012). This is a significantly more difficult task, and many of
the motions are quite similar to one another, at least for the algorithms that track and describe
motion. Currently, their system reports accuracies near 90%, but at a three-second computational
time. Results for other systems using simpler data sets in the literature vary significantly, ranging
from sub-second to up to ten seconds. Given that these systems are also using relatively “clean”
data — stable background environments, little to no visual occlusion of the human being observed,
no sun glare — for these simple and small gesture sets, these fast computational times and high
accuracies are not likely to occur in flight deck operations.
While substantial work is still needed for these systems, they may provide a more suitable replacement for manually controlled vehicles than remotely teleoperated systems. Additionally, further
increases in vehicle autonomy might enable further increases in performance while also removing
some of the potential failure modes of gesture control systems. Centrally networking vehicles together, relying on shared sensor data rather than visual perception, enables new capabilities and
efficiencies in behavior unavailable for other types of vehicles. These are referred to as “human supervisory control” systems and grant a single operator the ability to control many vehicles at once.
They are discussed in the next section.
Figure 2-5: Histogram of accuracies of gesture control systems.
2.2.4 Human Supervisory Control of Unmanned Vehicles
While the term “supervisory control” may be applied to any of the systems described in this chapter,
for the remainder of this thesis, the term will refer solely to systems where a single human operator
controls one or more highly-autonomous vehicles through the use of a computerized display and
GUI. In such systems, operators issue abstract task commands, such as “go to this area and look
for a vehicle” to the unmanned platform; the platform is then capable of determining where it
currently is, how to navigate to the area of interest, and how to execute the monitoring task on
its own, possibly to the point of alerting the operator when something of interest has occurred.
Typically, this requires substantial intelligence onboard the vehicle, including a variety of high-fidelity sensors to observe the world (GPS, LIDAR, and other range and sensing systems), cameras,
and high-accuracy inertial measurement devices, along with high-powered computer systems running
intelligent planning algorithms to handle vehicle motion and translate operator commands into low
level control inputs. Because of the capabilities of these vehicles, operators are no longer constrained
to operating one vehicle at a time and can issue commands to multiple vehicles for simultaneous
execution. For these systems, the human operator is more of a “supervisor” of the actions of the
largely-independent vehicles, rather than a “controller” or “operator.” The tasks performed by this
supervisor are dependent on exactly how capable the vehicles are: how many tasks can be assigned to
them at a given time, how reliably they execute those tasks, and how often they require additional
instructions.
Typically, the GUIs for these systems utilize a system-centric, map-based representation
(see (Trouvain, Schlick, & Mevert, 2003; Roth, Hanson, Hopkins, Mancuso, & Zacharias, 2004;
Malasky, Forest, Khan, & Key, 2005; Parasuraman, Galster, Squire, Furukawa, & Miller, 2005;
Forest, Kahn, Thomer, & Shapiro, 2007; Rovira, McGarry, & Parasuraman, 2007) for examples),
rather than a vehicle-centric one, to provide supervisors with sufficient Situational Awareness (SA)
of the vehicles and states of the world. In many cases, a representation of the system schedule
is also utilized, providing SA on the current state of vehicle task execution and when they may
require operator intervention (see (Howe, Whitley, Barbulescu, & Watson, 2000; Kramer & Smith,
2002; Cummings & Mitchell, 2008; Calhoun, Draper, & Ruff, 2009; Cummings, How, Whitten, &
Toupet, 2012) for examples). Beyond these features, there are two broad ways of classifying how
the operators provide tasks to the vehicles. The first form is referred to as Vehicle-Based Human
Supervisory Control (VBSC), in which supervisors supply commands to a single vehicle at a time,
working iteratively through the queue of vehicles waiting for instruction. These systems require
operators to define tasks (Malasky et al., 2005; Ruff, Narayanan, & Draper, 2002; Ruff, Calhoun,
Draper, Fontejon, & Guilfoos, 2004; Cummings & Mitchell, 2005; Ganapathy, 2006; Cummings &
Guerlain, 2007) or destinations (Trouvain et al., 2003; Calhoun et al., 2009; Dixon & Wickens,
2003, 2004; Dixon, Wickens, & Chang, 2005; Olsen & Wood, 2004; Crandall, Goodrich, Olsen, &
Nielsen, 2005; Cummings et al., 2012) for vehicles, possibly in terms of a preprogrammed travel
route. As the autonomy onboard the vehicles now handles low level control, operators might also
attend to additional tasks, such as error and conflict resolution (Cummings & Mitchell, 2005) or
target assignment and identification (Rovira et al., 2007; Calhoun et al., 2009; Ruff et al., 2002;
Cummings et al., 2012; Wright, 2002; Kaber, Endsley, & Endsley, 2004). The performance of the
system as a whole is dependent on the supervisor’s service rate and whether or not they can keep
up with the incoming rate of tasks from vehicles, which in turn is dependent on the aforementioned
vehicle capabilities. Past research has shown that operators can control up to eight ground vehicles
or twelve aircraft at a time (Cummings & Mitchell, 2008), suitable for a flight deck where no more
than 8-12 vehicles are taxiing at once.
Controlling more than eight to twelve vehicles can be more effectively achieved through the
inclusion of additional autonomy in the system, although not on the vehicles themselves. System-Based Human Supervisory Control (SBSC) systems utilize intelligent planning algorithms to assist
the human supervisor in replanning tasks for many vehicles simultaneously (see (Roth et al., 2004;
Forest et al., 2007; Howe et al., 2000; Kramer & Smith, 2002; Cummings et al., 2012; Ryan,
Cummings, Roy, Banerjee, & Schulte, 2011; Anderson et al., 1999; Parasuraman, Galster, & Miller,
2003; Bruni & Cummings, 2005) for examples). In these systems, the supervisor is interacting with
a planning algorithm, providing guidance in terms of the desired performance of the schedule in
the form of both performance goals and constraints on activities. The algorithm then attempts to
optimize all tasks for all vehicles in accordance with these inputs. Replanning typically occurs in
three different instances – pre-mission planning (Howe et al., 2000; Kramer & Smith, 2002; Kaber,
Endsley, & Onal, 2000; Scott, Lesh, & Klau, 2002) that creates a schedule prior to any vehicle
activity, event-based replanning (Roth et al., 2004; Ruff et al., 2002; Cummings & Mitchell, 2005;
Ganapathy, 2006; Kaber et al., 2000) that requires new schedules after certain events (like failures)
occur, and dynamic replanning (Cummings et al., 2012; Ryan et al., 2011; Cummings, Nehme,
& Crandall, 2007; Ryan, 2011; Maere, Clare, & Cummings, 2010), where supervisors create new
schedules periodically to increase the overall performance of the system. The latter case is needed
because the information and uncertainties in information used to create the original schedule evolve
over time; the expected performance of the schedule degrades over time as new uncertainties emerge
and the environment as a whole changes.
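A skeletal version of this periodic (dynamic) replanning pattern is sketched below; the greedy nearest-vehicle assignment stands in for the intelligent planning algorithm, and every task, vehicle, and position is invented for illustration.

```python
import math

def greedy_assign(vehicles: dict[str, tuple[float, float]],
                  tasks: dict[str, tuple[float, float]]) -> dict[str, str]:
    """Assign each pending task to the closest currently known vehicle position."""
    plan = {}
    for task, t_pos in tasks.items():
        plan[task] = min(vehicles, key=lambda v: math.dist(vehicles[v], t_pos))
    return plan

def dynamic_replan(epochs: list[dict[str, tuple[float, float]]],
                   tasks: dict[str, tuple[float, float]]) -> None:
    """Recompute the schedule each epoch as vehicle positions (and uncertainty) evolve."""
    for i, vehicle_positions in enumerate(epochs):
        print(f"epoch {i}:", greedy_assign(vehicle_positions, tasks))

# Two replanning epochs with drifting vehicle positions; assignments may change.
dynamic_replan(
    epochs=[{"V1": (0.0, 0.0), "V2": (50.0, 0.0)},
            {"V1": (45.0, 5.0), "V2": (50.0, 0.0)}],
    tasks={"T1": (10.0, 0.0), "T2": (60.0, 0.0)},
)
```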
Very few instances of field testing of human supervisory control systems have occurred because, in
most cases, the level of autonomy required of such systems is not yet feasible for real world conditions.
In the few cases that have been observed, they have not included any empirical comparison to other
control architectures. The papers referenced within this section provide a rich set of examples of
how these systems can be architected, along with an understanding of the level of accuracy required
for these systems to be functional, but there is no empirical data with which to ground a definition
of vehicle task performance in the world. The data provided in this section can be used to create a
general model that describes how these various unmanned vehicle architectures are defined within the
world. This framework can then serve as a guide for simulation development and help to clarify what
the differences in performance among systems might be. Such a model is described and constructed
in the next section, followed by a discussion about how such a model might be translated into a
simulation environment.
2.2.5 A Model of Control Architectures
This section began with an overview of several different forms of “control architectures” defining
how human operators and supervisors control Unmanned Vehicles (UxVs), describing changes in the
architectures in terms of operator location, operator control inputs, and operator input modality
(Fig. 2-2). Those sections provided details on the different types of architectures, describing variations in performance and implementation that result from the changes listed in Fig. 2-2. Where
possible, the expected performance of the architecture was described quantitatively in terms of prior
research results. However, a significant missing piece in prior research is an empirical comparison of
the different control architectures to one another. These comparisons have typically been pairwise
in nature, with no research located that provides comparisons of the full spectrum of unmanned
vehicle control architectures. The near complete lack of empirical tests of HSC systems in field
trials also leaves a substantial gap in comparing system performance. However, we can still describe
relative comparisons between the different control architectures given a basic understanding of the
tasks required of HMUE domains.
In this dissertation, five specific unmanned vehicle control architectures are defined for the aircraft carrier flight deck based on the overview provided in the previous sections. In addition to a
baseline of Manual Control (MC), models of Local Teleoperation (LT), Remote Teleoperation (RT),
Gestural Control (GC), Vehicle-Based Human Supervisory Control (VBSC), and System-Based Human Supervisory Control (SBSC) are created and described in terms of their differences from MC.
LT systems describe systems that are teleoperated, but where communications latencies are very
small and not noticeable by the operator. These operations are largely the same as Manual Control,
but will have restrictions on the operator’s ability to observe the world through the vehicle’s cameras. RT systems do include a noticeable latency in communications, which should result in higher
error rates in task execution as well as other changes in operator strategy in executing tasks. GC
is utilized as the advanced form of teleoperation as opposed to voice-recognition, as the significant
noise level of aircraft carrier operations makes any vocal communication extremely difficult. GC
replaces the human operator entirely, but should result in increased chances of failing to sense the
environment, of failing to select the correct response given certain inputs, and of failing to execute tasks in the world.
For each of these three unmanned vehicle control architectures (LT, RT, GC), control inputs
are provided by individual, decentralized operators as opposed to the single, centralized controller
that provides control inputs to all vehicles in Vehicle-Based Human Supervisory Control (VBSC)
and System-Based Human Supervisory Control (SBSC) systems. VBSC operators assign tasks to individual vehicles in series, while SBSC operators utilize a planning and scheduling system to replan all tasks for all vehicles simultaneously. For VBSC systems, a model of operator interaction
times for defining tasks must be included, as well as a queuing policy for prioritizing aircraft.
Figure 2-6 provides an overview of the six key variables (rows) that define the differences between
the manual control and the five unmanned vehicle control architectures (columns in the figure).
Chapter 5 will re-examine these variables and provide additional detail on their definitions for
models of flight deck operations. The source of control inputs appears in the first row in the figure,
defining whether commands are provided by a central operator (the two supervisory control systems)
or from decentralized, individual operators. The ability of a vehicle to sense and perceive the world
appears in the second line, given as a probability of failing to recognize specific elements in the
world. For MC, LT, RT, and GC systems, this is a primarily visual process requiring a human
pilot and/or a set of cameras on the vehicle. The third row in the figure describes the operator’s or vehicle’s ability to select the appropriate response based on that visual information. The two supervisory control systems would be networked to the central system and
acquire information on the world through other means, thus making the sense/perceive and response
selection elements not applicable. The fourth line describes the ability of the vehicle in the world to
execute a task once selected for execution, in terms of a probability of failing to properly complete
that task.
The final two rows are physical constraints based on the design of the system. The Field of
View variable relates to systems relying on human eyesight (MC) and/or using cameras onboard the
Figure 2-6: Variations in effects of control architectures, categorized along axes of control input
source, sensing and perception failure rate, response selection failure rate, response execution failure
rate, field of view, and sources of latency.
vehicle (LT, RT, and GC), describing how much of the environment is visible at any given time. This
parameter is also not applicable for VBSC and SBSC systems, given their reliance on the central
network. The final variable corresponds to the types of latencies present in vehicles operations. RT
systems include the latency in signal communications, GC systems include computational delays as
well as the potential for repeated time penalties due to failures, and VBSC systems include delays due to
waiting in the operator queue.
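Purely for illustration, these six variables could be captured in a small parameter record per architecture, as in the sketch below; the numeric values are placeholders, not the model parameters developed in Chapter 5.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ControlArchitecture:
    """The six variables of Figure 2-6 for one control architecture."""
    name: str
    centralized_input: bool              # central supervisor vs. individual operators
    p_sense_failure: Optional[float]     # probability of failing to perceive an element
    p_select_failure: Optional[float]    # probability of selecting the wrong response
    p_execute_failure: float             # probability of failing to complete a task
    field_of_view_deg: Optional[float]   # None when the vehicle relies on the network
    latency_s: float                     # signal, computation, or queuing delay

# Placeholder instances for two of the architectures discussed in this section.
remote_teleop = ControlArchitecture("RT", False, 0.05, 0.02, 0.03, 90.0, 1.5)
sbsc = ControlArchitecture("SBSC", True, None, None, 0.01, None, 0.0)
print(remote_teleop, sbsc, sep="\n")
```

A record of this kind makes the pairwise differences among architectures explicit and gives a simulation a single place from which to draw each vehicle's behavioral parameters.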
The remaining knowledge required to model these various control architectures is a representation
of the Heterogeneous Manned-Unmanned Environment (HMUE) and its safety protocols, linking different agents and actors together through the passage of information between agents and elements of
the system. In this case, the transmitted “information” includes direct commands from the operator
to the interface then to the vehicles, interactions between humans and vehicles while collaborating
on tasks, and the effects of tasks on the world at large. Figure 2-7 contains this aggregated model
describing Heterogeneous Manned-Unmanned Environments (HMUEs), based on the descriptions
of the control architectures in the previous sections and an understanding of the environments in
which they work. The intent of this framework was not to abstract away from the details of these
environments; rather, the goal was to include all details together into a single cohesive framework
that can capture any of the control architectures described in this chapter.
The core of the diagram is the Human Supervisor block in the upper center of the figure. The
Human Supervisor communicates with the Supervisory System block and its potential components
in a process of Activity Selection, defining tasks and actions for vehicles to take. In this, the operator
Figure 2-7: Aggregate architecture of manned-unmanned environments.
performs the same processes described in Fig. 2-3 and Fig. 2-4, acquiring information on the current
state of the world and using it to determine tasks that agents under the supervisor’s control should
accomplish. The contents of this supervisory control block may vary substantially, given the form of
control architecture involved. All, broadly, involve the definition of “Activities” for agents to execute,
an assignment of these activities to individual Task Queues, which the agents then independently
execute in the world. For Local and Remote Teleoperation, individual UAV operators determine
activities for their vehicles, hold the tasks within their own mental task queue, and are responsible
for executing the tasks in the queue. In Supervisory Control systems, this process might instead
involve the operator defining Activities for the system, which are stored in a Global task queue before
being assigned to individual vehicles by a planning algorithm. In a VBSC system, the operator may
perform a similar process without the aid of the planning algorithm.
Once activities and tasks are assigned, the task queues are executed by Human Crew, Manned
Vehicles, and Unmanned Vehicles in the world (this is a replica of Fig. 1-1 in Section 1.1). This
occurs collaboratively with one another within the Environment, under the auspices of the Safety Protocols that govern their behavior. Additionally, though not discussed previously, there are two additional groups of agents that are not actively assigned tasks: “Bystanders” that are expected to
be involved in operations and “Interlopers” that are not. It is up to the humans and vehicles in
the system to observe and account for these other classes of agents. As agents complete tasks in
the Environment, they generate System Events that have ramifications in the world — tasks are
completed, resulting in changes in the world and in the system itself — and are affected by Vehicle
Events, such as failures, that affect a vehicle’s ability to perform tasks. World Events, exogenous
to the system and outside the control of the Supervisor or the Agents in the Environment (such as
weather, changes in operational priorities, etc.) might also occur and lead to changes in the system.
Both System and World Events must also be displayed to the Supervisor so that they might employ
their best judgment in guiding the system.
This framework should be considered as a roadmap for modeling a number of HSC systems
within a simulation environment — to appropriately examine a control architecture’s performance,
the appropriate elements from the framework diagram in Fig. 2-7 should all be modeled within the
simulation environment. In general, this diagram is a recharacterization of Sheridan’s original HSC
loop shown in Fig. 2-4, with extensions made to include various possible pathways for assigning tasks
to vehicles and including the interaction of the vehicle with the world, both necessary for correctly
modeling the effectiveness of unmanned vehicle systems in the real world. Key to this diagram
is the idea that control of unmanned vehicles is a process of transferring information in terms of
commands, activities, states, and performance data. Given information on the state of the world,
supervisors assign tasks to vehicles, which require further information on the state of the world in
order to execute their tasks appropriately. These vehicles work independently, each making its own
observations of the world and executing tasks individually based on these observations. Human
collaborators and manned vehicles work similarly, simultaneously, and in a decentralized fashion.
Because these actions occur in the same shared space, the actions of one vehicle affect the actions
of all others. Safety protocols are also put into place, affecting the behavior of all vehicles in the
system and forming an additional constraint in the logic and actions of agents. A proper simulation
of these environments should include all of these features, adequately replicating how agents in the
world independently observe, decide, and act. In total, these suggest the use of an Agent-Based
Simulation (ABS) environment, whose properties and past work are discussed in the next section.
2.3 Modeling Heterogeneous Manned-Unmanned Environments
The previous sections have outlined how both safety protocols and control architectures act to affect
the performance of actors in a Heterogeneous Manned-Unmanned Environment (HMUE). Safety
protocols serve to constrain behavior in order to preserve a safe system state, while control architectures change the level of performance of the vehicles within the environment, with or without
the presence of the safety protocols. The combination of these two factors contributes to the performance of the entire system. Because each of these elements affects vehicle behavior, the type
of simulation environment used in modeling HMUEs must also address individual vehicle behavior.
This section discusses the applicability of Agent-Based Simulation (ABS) modeling to model human
and unmanned vehicle behavior in Heterogeneous Manned-Unmanned Environments (HMUEs).
ABS modeling is a relatively new modeling technique that focuses on the actions of the individual, specifically on decision making, motion, and situational awareness in the world (Bonabeau,
2002). ABS models are based on the assumption that if the behavior of actors and their interactions
with the world and with one another are accurately modeled, the simulation should evolve in the
same fashion as in the real world. ABS methods are most appropriate when the actions of individuals
have significant repercussions within the system, either in terms of motion in the world or in terms
of strategic decision-making. In such systems, developing closed-form solutions that describe the
interarrival rates at processing servers (as required for discrete-event simulations) may be difficult,
if possible at all. Furthermore, in such systems, the interarrival rates may not be independent and
identically distributed, a primary assumption in the development of the model.
Similar issues are faced with systems dynamics models; mapping the complex interactions between agents physically in the world to the stocks and flows required to define the model may be
incredibly difficult. Even if this could be accomplished, it is difficult to know how changes to the
parameters that describe unmanned vehicle control architectures alter these characteristic, closed-form equations for different systems. Agent-based models offer an alternative approach and have
seen wide use in the biology, economics, and sociology domains due to their ability to capture the
emergent behaviors that occur in these systems (Epstein, 1999).
The nature of the safety protocols (which affect individual decision-making, motion, and behavior) and control architectures (which affect individual behavior and ability to follow safety protocols)
in HMUEs makes them candidates for modeling in an agent-based simulation. The remainder of this
section reviews a variety of studies that involved modeling aspects of HMUEs in order to demonstrate
the utility of ABS models for this dissertation. This is not intended to be a comprehensive review of
the content of agent-based models nor of techniques in their development; for the latter, the reader
is directed to Heath et al., who provide a review of 279 ABS over the decade ending in 2008 (Heath,
Hill, & Ciarallo, 2009). The intent, rather, is to provide a review of studies that demonstrate similar
characteristics to the HMUEs of concern in this work; the first section reviews agent-based models of
both human and unmanned vehicle behavior, followed by a review of agent-based models for testing
the effectiveness of safety protocols.
2.3.1
Models of Human and Unmanned Vehicle Behavior
ABS models have been used extensively in the biology, economics, and sociology domains because
of their ability to independently model human behavior and demonstrate the nonlinear and emergent properties expected from these domains (Bonabeau, 2002). Agent-based models have also appeared within the health (Kanagarajah, Lindsay, Miller, & Parker, 2008; Laskowski & Mukhi, 2008;
Laskowski, McLeod, Friesen, Podaima, & Alfa, 2009), automotive (traffic) (Peeta, Zhang, & Zhou,
2005; Eichler, Ostermaier, Schroth, & Kosch, 2005; Kumar & Mitra, 2006; Lee, Pritchett, & Corker,
2007), and military fields (Gonzales, Moore, Pernin, Matonick, & Dreyer, 2001; Bonabeau, Hunt, &
Gaudiano, 2003; Lauren, 2002) for the testing of different tactics, behaviors, and system capabilities.
General models of human crowd motion have been developed (Hu & Sun, 2007; Carley et al., 2003;
Patvivatsiri, 2006; Helbing et al., 2000) in addition to models of human decision-making (Sycara &
Lewis, 2008; Edmonds, 1995; Macy & Willer, 2002; Tesfatsion, 2003; Edmonds, 2001).
Models examining human behavior have typically used two different frameworks: explicitly
changing the behavior of individual actors or changing the environment in which actors behave
(or both). The latter may include physical changes to the environment, or changes in the number of agents, resources, or activities that are available in the system. Studies in the health field
for emergency room performance (Kanagarajah et al., 2008; Laskowski & Mukhi, 2008; Laskowski
et al., 2009) have addressed how different staffing protocols affect the ability of the ER to process
patients. Studies in the automotive (traffic) and aerospace (air traffic control) domains (Peeta et al.,
2005; Eichler et al., 2005; Kumar & Mitra, 2006; Lee et al., 2007) have examined how changes in
both the behavior and the awareness of human operators affect accident rates. Models of airport
terminals (Schultz, Schulz, & Fricke, 2008) and emergencies in buildings and aircraft (Zarboutis &
Marmaras, 2004; Sharma, Singh, & Prakash, 2008; Pan, Han, Dauber, & Law, 2007; Batty, Desyllas,
& Duxbury, 2003; Chen, Meaker, & Zhan, 2006; Hu & Sun, 2007; Carley et al., 2003; Patvivatsiri,
2006) have examined how changing the physical environment affects the ability of individuals to
move through the environment quickly and efficiently.
Military research has addressed similar elements, examining decision-making and performance
in novel command and control methods (Gonzales et al., 2001; Bonabeau et al., 2003), military
operations (including operations-other-than-war, OOTW; see (Milton, 2004; Hakola, 2004; Efimba,
2003; Erlenbruch, 2002) for examples), and team performance (Sycara & Lewis, 2008). These studies
have all addressed understanding the behavior of actors in the world under circumstances not easily
observable in the world. A similar task is undertaken in this dissertation — modeling HMUEs
for the purpose of investigating the effects of varying safety protocols and control
architectures.
These studies have not, however, addressed any combination of UAV control architectures and safety
protocols, although each of these has been modeled individually. Agent-based modeling of UAVs has
been used to validate implementations of novel control or collaborative sensing algorithms (Lundell,
Tang, Hogan, & Nygard, 2006; Rasmussen & Chandler, 2002; Liang, 2005; Chen & Chang, 2008;
Mancini et al., 2008; Cicirelli, Furfaro, & Nigro, 2009; Rucker, 2006). These studies have primarily
addressed systems involving the collaboration of multiple vehicles in performing system tasks, such
as searching for and tracking targets. Another common, and related, area of interest is in the
modeling and control of UAV “swarms” (large groups of robots coordinating on a
specific task) (Ho, 2006; Corner & Lamont, 2004; Munoz, 2011; Vaughn, 2008). Both swarms and
cooperative search require agents to sense the world and make individual decisions, making them
candidates for agent-based modeling. In cooperative search, agents make decisions on how best to
achieve the overall system goal via the partitioning of tasks among the vehicles in the system (for
example, tracking multiple targets individually). Swarms typically consider how best to collaborate
on a single task, such as tracking a single target over many hours. Agent-based models have also
been built to test unmanned vehicle survivability (McMindes, 2005; Weibel & Hansman Jr., 2004;
Clothier & Walker, 2006), single UAV search (Schumann, Scanlan, & Takeda, 2011; Steele, 2004),
and the performance of intelligent munitions (Altenburg, Schlecht, & Nygard, 2002), each of which
involves modeling the effectiveness of individual awareness and behavior.
Human interaction with unmanned vehicles is an additional domain whose characteristics make it
a candidate for ABS modeling. However, little work has been done on this topic overall. The studies
that do include human models of reasonable fidelity fall into two categories:
discrete event models for cognitive processing or mathematical estimates of how many vehicles a
human operator can manage (the “fan-out” problem). The first topic attempts to build models
of how human operators cognitively interact with unmanned vehicle control systems, using data
from experiments to build representative models of behavior that can be repeatedly tested (Nehme,
2009; Mkrtchyan, 2011; Mekdeci & Cummings, 2009). The latter topic attempts to build definitive
equations that describe the maximum number of vehicles a human operator can attend to in the
environment (Olsen & Wood, 2004; Cummings et al., 2007; Cummings & Mitchell, 2008). However,
none of the references listed here include any concern for safety protocols active in the environment,
nor do they address alternate vehicle control methods or human interaction outside of control.
Agent-based models addressing safety protocol performance are discussed in the next section.
2.3.2
Agent-Based Simulations of Safety Protocols
The way safety protocols act to modify actor behavior makes them especially appropriate for agent-based
modeling, but few simulations have been built to explore the effectiveness of safety protocols. These studies come primarily from two environments: traffic safety (Archer, 2004; Wahle &
Schreckenberg, 2001; Paruchuri, Pullalarevu, & Karlapalem, 2002; Yuhara & Tajima, 2006; Gettman
& Head, 2003) and airspace safety (Pritchett et al., 2000; Xie, Shortle, & Donahue, 2004; Blom,
Stroeve, & Jong, 2006; Stroeve, Bakker, & Blom, 2007). Traffic safety models have focused both
on driver interactions at intersections (Archer, 2004) and highway traffic (Wahle & Schreckenberg,
2001; Paruchuri et al., 2002). Airspace safety analyses have dealt with both the national airspace system in general (Pritchett et al., 2000) as well as more localized models of runway interactions at
airports (Xie et al., 2004; Blom et al., 2006; Stroeve et al., 2007).
As with the references in the earlier section, these environments are all characterized by agents
acting independently in terms of certain objectives. In these studies, human drivers are left to act
independently, with researchers examining how changes in the behavior of drivers affect accidents,
near-accidents, and efficiency on the roadway (Archer, 2004; Wahle & Schreckenberg, 2001; Paruchuri
et al., 2002; Yuhara & Tajima, 2006). Airspace safety simulations have also investigated how changes
in behavior affect the system. In these cases, however, the behavior in question is not only that of
the pilots in the aircraft (Blom et al., 2006; Stroeve et al., 2007), but also that of the air traffic
controllers that provide structure to the environment in terms of scheduling and routing (Xie et al.,
2004).
Interestingly, no studies have considered the testing of safety protocols for unmanned vehicles
in heterogeneous manned-unmanned environments, even though the prior work referenced in this
section provides successful examples of human, unmanned vehicle, and safety protocol modeling.
The likely reason for the absence of literature on the modeling of safety protocols in HMUEs is the
relative immaturity of these systems; few examples of these environments exist in the real world.
Additionally, substantial hurdles exist in terms of implementing these environments in public
domains. In most cases, researching human control of unmanned vehicles or the performance of the
autonomous control systems takes precedence over modeling the implementation of the vehicle in
the system. However, modeling the behavior of the vehicle in the system operating under a variety of
safety protocols may better inform the design of the human interaction with and control algorithms
embedded within the unmanned vehicles. If increased autonomy on the vehicle provides no tangible
benefits in terms of safety or productivity, it may not be worth the investment on behalf of the
institution funding the project.
2.4
Chapter Summary
This chapter first reviewed literature regarding the nature of safety protocols and how they are
applied to unmanned vehicle systems. In this section, four primary classes of safety “principles” were described, providing general guidance in the implementation of safety protocols for future
systems. The second section reviewed different forms of human control of unmanned vehicles, moving
from teleoperated to more advanced systems and comparing them to the baseline of manual human
control. From these, a general model of performance was created, as well as a general framework
describing the relationships and flow of information between elements of Heterogeneous Manned-Unmanned Environments (HMUEs). The last section reviewed how agent-based modeling can be
used in creating simulation models of these HMUEs and prior work in modeling human/unmanned
vehicle behavior and safety protocol usage and performance.
Chapter 3
Model Development
This chapter describes the process by which aircraft carrier flight deck operations, a representative
Heterogeneous Manned-Unmanned Environment (HMUE), are modeled through Agent-Based Simulation (ABS) techniques. The aircraft carrier flight deck is selected for two primary reasons. First, it
is a form of heterogeneous manned environment, where human collaborators work in close physical
proximity to manned aircraft and other manned vehicles. More specifically, aircraft on the flight
deck are fueled, equipped, and given taxi instructions by crew traversing the flight deck on foot.
These crew communicate with pilots, supervisors, and each other to coordinate activities on the
flight deck. Second, the Navy is actively researching the development of unmanned fighter aircraft
(and other unmanned vehicles) that would be integrated directly into deck operations. In fact, the
Navy recently conducted shipboard flight testing of one such vehicle (Hennigan, 2013).
As described by Eric Bonabeau, agent-based modeling is as much a perspective as it is a methodology (Bonabeau, 2002). Whereas the use of discrete-event or systems dynamics models elicits specific
representations of structure and constituent components, agent-based models may vary widely in
content and in methods of application. Chapter 2 previously provided a list of examples of agent-based simulation models, ranging from transportation systems involving both ground and aerial
vehicles to human decision-making to medical systems involving human motion and scheduling.
What matters is that, in each case, the agent-based modeling paradigm views the world “from the
perspective of its constituent units” (Bonabeau, 2002) — that the agents retain the same independence and employ decision-making strategies similar to those of the real-world entities they are intended to
replicate. The agents are then placed into the simulated environment, given a set of goals (which
may be self-generated during the simulation), and allowed to make decisions independently from that
point forward. A canonical example is that of Epstein and Axtell’s work on artificial ant colonies,
the topic of their book “Growing Artificial Societies” (Epstein & Axtell, 1996). In Chapter 1 of
their book, they describe that their models include: (1) “agents,” the active entities in the system,
comprised of sets of internal states and behavioral rules; (2) the environment they act within; and
(3) an object-oriented implementation, which preserves the independence of the agents and their
behaviors. Rules defined for each individual artificial “ant” describe under what conditions they
move within the world, based on the amount of sugar they “see” in nearby locations in the grid
world they exist within. The world is initialized with both a population of virtual ants (which may
appear in random locations) and a topography of virtual sugar that the ants seek out. From this
initial configuration, the simulation is allowed to evolve based solely on the rules defined for agents
(the “ants”) and the environment.
This general template — independent agents modeled in an object-oriented fashion, containing
sets of states, parameters, and behavioral rules, initialized into a world and allowed to evolve over time —
is applicable to all ABS and forms the basis of the Multi-Agent Safety and Control Simulation
(MASCS) model described in this work. The remainder of this chapter describes how the different
elements of MASCS are modeled, addressing the environment, agents, states, parameters, and rules
that populate the system. This modeling focuses on aircraft carrier launch operations, in which
upwards of 20 aircraft are taxied from parking spaces to launch catapults with the assistance of
multiple crew, and whose assignments are planned by supervisors on the flight deck. The next
section provides a description of flight deck launch operations, characterizing the specific agents
required, the tasks they are responsible for planning and executing, and important relationships and
interactions amongst agents in the system.
3.1
Modern Aircraft Carrier Flight Deck Operations
The modern aircraft carrier flight deck is sometimes described as “organized chaos” (Roberts, 1990);
it is a complex environment that requires a variety of simultaneous interactions between crew and
vehicles within a shared and densely populated physical space, making it difficult to know how the
actions of one individual may impact future operations. Multiple vehicles are active simultaneously,
their actions governed by a network of human crew attempting to execute a flight plan developed by
their supervisor, the Deck Handler. The pilots manning the aircraft receive instructions from crew
on the flight deck via hand gestures; these crew dictate all tasks given to the pilot — move forward,
stop, turn, unfold wings, open fuel tank, and many others — and work together as a team to move
aircraft around the flight deck. The crew work in a form of “zone coverage,” with a single Aircraft
Director guiding an aircraft through his assigned area before passing it on to another Director in
an adjacent zone on the way to the aircraft’s assigned catapult. Four different launch catapults are
available for use (two forward and two aft), but do not work completely in parallel and may be
periodically inaccessible due to aircraft currently parked or transiting through the vicinity. Once an
aircraft arrives at a catapult, the pilot and crew complete preparatory tasks for launch after which
the aircraft is accelerated down the catapult to flight speed and into the air. Launch preparation
requires that several manual tasks be performed in parallel by multiple crewmembers. The aircraft
must be physically connected to the catapult, the catapult properly set for the aircraft’s weight,
the catapult then advanced to place the attachment under tension (to prevent the assembly from
breaking when the aircraft is accelerating), and control surfaces on the vehicle tested.
Building an agent-based model of flight deck operations requires modeling each of these aspects —
the movement of crew and vehicles on the flight deck, the execution of tasks (time to complete, failure
rates, etc.), and decision-making on the part of both pilots and other crew. Table 3.1 provides a
list of the primary actors involved in flight deck launch operations and their roles in executing
operations, each of which must be properly modeled in order to replicate flight deck operations.
Table 3.2 provides a list of the primary resources used by those actors, which may require rules
describing when and how they can be used, or may be referenced by other decision-making rules.
The remainder of this section describes these roles more fully and provides additional information
regarding the important parameters and decision-making functions required by each.
A digitized map of the flight deck appears in Figure 3-1. Labels in the figure highlight the
features of the flight deck relevant to launch operations. Four launch catapults (orange lines) are
used to accelerate aircraft from a stationary position to flight speed and into the air. The catapults
are numbered one to four, from right to left (starboard to port) and forward to back (fore to aft). A
single landing strip (parallel green lines) is the lone runway for recovering aircraft. During landing
(“recovery”) operations, four steel cables, called arresting wires, are laid over the aft area of the
landing strip; aircraft must catch one of these wires with their tailhook in order to land. Note that
the landing strip occupies the same space as the aft catapults, Catapults 3 and 4. As such, only
one form of operations can occur in this area at any time — the deck is either landing aircraft or
launching them from this area, and switching between operations introduces a time cost due to the
need to prepare the related equipment.

Table 3.1: List of actors involved in flight deck operations and descriptions of their primary roles in
deck operations.

  Aircraft/Pilot: Primary agent in launch operations; goal of flight deck system is to launch all
  aircraft in minimum possible time.
  Aircraft Director: Relays instructions on task execution (primarily taxi) to Aircraft/Pilot.
  Deck Handler: Responsible for scheduling launch assignments and arranging aircraft on flight deck.

Table 3.2: List of resources involved in flight deck operations.

  Catapult: Resource used by aircraft in launching from the deck.
  Parking Spaces: Parking locations on the flight deck allocated to aircraft at the start of operations.
Each catapult is capable of launching a single aircraft at a time, but the four catapults are not
able to operate in parallel for two reasons. The first is that a single set of crew operates each pair of
catapults and can only manage one catapult at a time. The second, and more important, is that the
catapults are angled towards one another — simultaneous launches would result in a collision and
loss of both aircraft. This necessitates that the catapults alternate launches within each pair. The
separate pairs can operate in parallel, although they still try to avoid simultaneous launches (a
spacing of a few seconds is allowable).
Each catapult can queue two aircraft at a time in the immediate area — one on the catapult
preparing for launch and a second queued safely behind a jet blast deflector that rises from the deck
prior to launch. This process of queuing is an important feature in the planning and implementation
of the schedule of launches. Queuing aircraft at catapults minimizes the time between launches, as
the next aircraft is already at the catapult ready to begin its launch preparations. Queuing also
forms a major constraint in terms of planning — once a catapult has two aircraft assigned to it,
all available slots are filled. Aircraft needing to launch must be sent elsewhere, or must wait for an
opening at this, or another, catapult.
The process of launching an aircraft requires that several tasks be done in parallel. Physically,
the aircraft must taxi forward onto the catapult, aligning its front wheel (nose gear) with the catapult's
centerline. The nose wheel is then physically attached to the catapult mechanism (the “pendant”);
while this is occurring, other crew are confirming the aircraft’s current fuel level and total weight
so as to properly calibrate the catapult for launch. The pilot will also complete a set of control
surface checks to ensure that the ailerons, elevators, and rudder are all functioning.

Figure 3-1: Digital map of the aircraft carrier flight deck. Labels indicate important features for
launch operations. The terms “The Fantail” and “The Street” are used by the crew to denote two
different areas of the flight deck. The Fantail is the region aft of the tower. The Street is the area
forward of the tower, between catapults 1 and 2 and starboard of catapults 3 and 4.

Once all of these actions are complete, the catapult is triggered and the pendant begins pushing the nose gear
and the aircraft forward up to flight speed. Once the launch is complete, the jet blast deflector is
lowered, the next aircraft is taxied forward onto the catapult, and another aircraft is taxied behind
the jet blast deflector. Within MASCS, catapults will be modeled as independent agents, just as
the crew and aircraft. The nature of catapult queuing, constraints on launches between adjacent
catapults, and the nature of the preparatory tasks and action of accelerating down the catapult are
all important features that must be modeled within the MASCS environment.
The goal of the crew in managing the allocation of aircraft to catapults, as well as the management
of the traffic patterns that result, is to make sure that the catapult queues are always fully populated.
At minimum, a second aircraft should always arrive prior to the launch of the aircraft currently at
the catapult. So long as this happens, the deck maintains a high level of efficiency, but ensuring
that this occurs requires both planning of the layout of the aircraft prior to operations and proper
management and assignment of aircraft once operations begin. Figure 3-2 shows one potential layout
of aircraft on the flight deck prior to the start of launch operations; Subject Matter Experts (SMEs)
report the general arrangement of aircraft in the picture to be fairly standard across the carrier
fleet. Importantly, the Deck Handler attempts to make taxi tasks as easy as possible by relocating
as many aircraft as possible from the interior of the flight deck, parking them on the edge of the deck,
and parking them such that they are pointed towards the interior of the deck. However, parking
spaces on the edge of the deck are limited, and some aircraft will be parked in places that obstruct
taxi paths for other aircraft; the five aircraft parked in the interior of the deck in Fig. 3-2 are one
example, as well as the aircraft at the top left of the image. This latter group is parked on top of
one of the launch catapults (Catapult 1), preventing it from being used in operations. While this is
not ideal, these aircraft will also be some of the first sent to launch, eventually freeing the catapult
for use and increasing the flexibility of operations. The aircraft parked in the interior of the deck
(in “The Street”) may also be launched early in the process.
Deck Handlers plan the parking of aircraft on the flight deck with the knowledge of the general
priority in which aircraft should launch. Aircraft highest in the priority list are parked so that they
are the first to depart their parking spaces with as clear a taxi path as possible. The locations
and priority of aircraft, then, form the initial conditions of launch operations. A set of operational
heuristics guides the manner in which areas of the flight deck are cleared during the launch mission.
These heuristics were developed as part of previous work (Ryan, 2011; Ryan et al., 2014), and
Figure 3-3 provides a general guide as to the order in which areas of the flight deck are cleared under
the heuristics. As noted above, aircraft parked in obstructing areas are moved first (labeled as “1’s”
in the figure). Clearing these areas early opens more taxi paths on the flight deck and increases
the efficiency of operations. At next highest priority is the region near the front of catapult 1 and
an area in the aft of the flight deck called “The Fantail” (Fig. 3-1). This area sits directly on top
of the landing strip.

Figure 3-2: Picture of the USS Eisenhower with aircraft parked prior to launch operations. Thirty-four
aircraft and one helicopter are parked on the flight deck (U.S. Navy photo by Mass Communication
Specialist Seaman Patrick W. Mullen III, obtained from Wikimedia Commons).

While the current focus of the MASCS model is not on landing (“recovery”)
operations, clearing this area is a priority for the crew. Clearing this area as early as possible allows
any airborne aircraft that require an emergency landing to return to the carrier.
While the figure denotes the general order of clearance of these areas, it is not a strict ordering.
Areas of priority 1 will be cleared first, based on the available catapults and the current geometry
of the deck. Depending on the current allocation of aircraft to catapults and where aircraft are currently
parked, some assignments of aircraft to catapults may not be feasible.

Figure 3-3: (Top) Picture of the MASCS flight deck with 34 aircraft parked, ready for launch operations. (Bottom) General order in which areas of the deck are cleared by the Handler's assignments.

For instance, if aircraft are parked in The Street (Fig. 3-1), aircraft parked forward cannot be assigned to aft catapults. As such,
the Handler would continue to assign aft aircraft from lower priority areas to the aft catapults to keep
launches happening. If a catapult is open, the highest priority aircraft able to access that catapult
is assigned. The Handler and his staff execute those heuristics dynamically as the mission evolves,
reacting to conditions in real time so as to best compensate for the stochasticity and dynamics of
operations. As such, these heuristics are one of the most important features of the MASCS simulation
and its Deck Handler models. The locations of aircraft prior to launch operations, although a task
of the Deck Handler, are initial conditions and could be modeled as static templates to provide the
same initial conditions across multiple different scenarios.
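To make the assignment rule above concrete, the following is a minimal sketch of the "highest-priority aircraft able to access an open catapult" heuristic. It is illustrative only; the interface, method names, and priority encoding are hypothetical, and the actual MASCS rule base (discussed in Section 3.2.3) is considerably richer.

    import java.util.Comparator;
    import java.util.List;
    import java.util.Optional;

    // Illustrative sketch of the Handler heuristic: when a catapult opens, assign the
    // highest-priority unassigned aircraft that can reach it. Interface and names are hypothetical.
    class HandlerHeuristicSketch {
        interface DeckAircraft {
            int clearancePriority();           // 1 = highest-priority deck area (Fig. 3-3)
            boolean canReach(int catapultId);  // is a taxi path to this catapult currently feasible?
            boolean isAssigned();
        }

        static Optional<DeckAircraft> nextAssignment(List<DeckAircraft> aircraft, int openCatapult) {
            return aircraft.stream()
                    .filter(a -> !a.isAssigned() && a.canReach(openCatapult))
                    .min(Comparator.comparingInt(DeckAircraft::clearancePriority));
        }
    }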
While the Deck Handler is responsible for determining the parking locations of aircraft prior
to launch operations and in managing the assignment of aircraft to catapults during operations,
the Handler is not an active, moving entity on the flight deck. Rather, he remains within the
Tower at deck level, utilizing a tabletop map of the flight deck and aircraft-shaped metal cut-outs to
determine catapult assignments for aircraft. Given the Handler’s assignments, it is the responsibility
of members of the crew to facilitate the taxiing of aircraft from their parking locations and to their
assigned catapults. The most important members of the crew for these tasks are the yellow-jerseyed
Aircraft Directors, who provide all instructions (taxi commands and otherwise) to pilots during deck
operations. These instructions are provided entirely through hand gestures, using a codified set of
over one hundred individual gestures to describe all possible actions that can be taken (e.g., move
forward, stop, turn left, or unfold wings). To provide some flexibility in operations, Directors work
in a form of “zone coverage” in which the individual Directors each manage a specific area of the
flight deck. Directors taxi an aircraft through their zone before handing off to a nearby Director in
an adjacent zone. This minimizes the amount of travel that an individual Director must do, while
also ensuring that aircraft always have crew available nearby to provide guidance (that is, they are
not waiting for a Director to return from the far end of the ship).
Pilots are not allowed to take any action independently, except for stopping to avoid an accident.
If a pilot cannot see a Director, they are instructed to stop immediately. To maintain appropriate
lines of sight, Directors align themselves to one of the outside wingtips of the aircraft some distance
away from the aircraft, so that they are visible past the nose of the vehicle. This means that
there is a defined area in the front of the vehicle within which a pilot is looking for his current
Director. If the Director has not gained the attention of the pilot and is not within this area, the
Director will move into this area as quickly as possible. While vehicles are turning, or while moving
forward, Directors may also move in the same direction to maintain their alignment and spacing.
Directors must also coordinate amongst the team to route aircraft through the deck, recognizing
where aircraft are heading, which Directors are adjacent and able to accept incoming aircraft, and
how to properly pass vehicles between one another. They must also monitor nearby aircraft and
foot traffic to ensure that their aircraft is not taxiing toward a collision, or taxiing to a position in
which they block another aircraft’s taxi path. Along with the Deck Handler’s heuristics, modeling
this “zone coverage” routing is an extremely important feature of the MASCS model, requiring both
an adequate modeling of how aircraft are routed in zone coverage, as well as how crew and pilots
interact during the routing.
Each of the elements described above — aircraft, crew, and supervisory agents,
their decision-making and coordination rules, the tasks that each perform both individually and
while interacting with others — must be modeled in order to construct an appropriate agent-based
model of flight deck operations. The next section describes this process of generating Agents that
represent the human crew, aircraft (and pilots), the physical flight deck and its equipment, and the
supervisory Deck Handler, Tasks that these agents perform, and Decision-making behaviors that
each works through in the course of the mission. The section begins by describing the nature and
modeling of an “agent” within MASCS.
3.2
Model Construction
As discussed in Chapter 2, agent-based simulation was chosen as the modeling methodology based
on its focus on modeling the decision-making and movement of individual actors and the information flow
between them, each of which is important in characterizing Unmanned Vehicle (UxV) control architectures and safety protocol application. These elements also play key roles in modeling crew and
pilot interactions in the unstructured and stochastic environment that is the aircraft carrier flight
deck. As noted at the beginning of this chapter, this requires adequately capturing the decision-making rules of the agents in the system, as well as the states and parameters required to execute
the decision rules. For modern aircraft carrier flight deck operations, this modeling approach focuses
on the launch phase of operations as described in the previous section, where sorties of 18-34 aircraft
are taxied from their parking spots to one of four launch catapults. While there are other aspects of
aircraft carrier operations that are potentially of interest, none require the same substantial degree
of human interaction with unmanned vehicles and manned vehicles. Launch operations provide the largest number of simultaneously active agents and, consequently, the highest level of interaction
between agents, making them desirable for the forms of safety analyses that are a focus of this work.
Section 3.1 provided a description of the type of agents involved in launch operations, their roles
in planning and executing the mission, and some of the general behavioral rules that govern the tasks
they perform. Each of these aspects must also be captured within MASCS, with a key focus being on
replicating the heuristics of agent behavior and their movement on the flight deck. To capture these
aspects in an object-oriented architecture, MASCS was constructed using the Java programming
language and the Golden T Game Engine (GTGE) Java library (http://goldenstudios.or.id/products/GTGE/),
which controls the updating of agents and animates their motions on-screen. The object-oriented
nature of Java allows agents to be defined independently, allowing rules and states to be defined
individually for different types of agents.
Agent states may either be defined as static variables (numeric or Boolean), or may be randomly
sampled from continuous distributions. These states are updated as the simulation executes, driven
by the GTGE package; state updates may be based on elapsed time and a rate of change, or by
logical rules that examine the states of the world or other nearby agents. For example, a catapult
cannot operate if aircraft are in a specified caution area around the catapult; the catapult’s update
function continually checks for the presence of any aircraft in this area, halting operations if one
appears. This is a key safety policy on the flight deck, but one that would not likely change so long
as crew are involved in flight deck operations.
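As a minimal sketch of this update-driven rule checking, the fragment below shows how a catapult agent's update method might halt launch processing whenever an aircraft is detected inside a caution area. The class and field names are hypothetical and are not drawn from the MASCS source.

    import java.util.List;

    // Minimal sketch of an update-driven safety check in the spirit of the MASCS catapult agent.
    // Class and field names here are hypothetical, not the actual MASCS source.
    class AircraftAgent {
        double x, y;                                    // current position on the deck
        AircraftAgent(double x, double y) { this.x = x; this.y = y; }
    }

    class CatapultAgent {
        private final double cautionX, cautionY, cautionRadius;   // caution area around the catapult
        private boolean halted = false;

        CatapultAgent(double x, double y, double radius) {
            this.cautionX = x; this.cautionY = y; this.cautionRadius = radius;
        }

        // Called every simulation tick by the main update routine.
        void update(double elapsedSeconds, List<AircraftAgent> aircraftOnDeck) {
            halted = false;
            for (AircraftAgent a : aircraftOnDeck) {
                if (Math.hypot(a.x - cautionX, a.y - cautionY) < cautionRadius) {
                    halted = true;        // an aircraft is inside the caution area; suspend launch tasks
                    break;
                }
            }
            if (!halted) {
                // advance launch preparation or trigger the launch here
            }
        }

        boolean isHalted() { return halted; }
    }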
The previous section also provided descriptions of the actions taken by crew and vehicles on
the flight deck. These exist alongside the states, parameters, and decision-making processes described
earlier and form the third primary component of the MASCS model. While these elements are unique
to each type of agent in the model, the tasks they perform are independent of any one agent. Actions
are defined as independent objects that can be assigned to any agent in MASCS; rules within each
Action determine, given the assigned agent, their current state, the current state of the world, and a
goal state, what series of tasks must be accomplished to reach the goal state. Actions exist to define
aircraft taxi tasks, refueling tasks, launch (both preparation at and acceleration down the catapult),
landing, and various other events. Agents maintain a list of Actions that they must accomplish; as
agents are updated by the main MASCS program, they in turn update their own Actions. When one
Action completes, the next one begins. The small increments of time applied during each update
and the rate at which updates occur mean that agent actions effectively happen in parallel.
Many of the actions used to describe flight deck operations (taxi, launching, etc.) are categorized
as “Abstract” actions, describing complex tasks occurring on deck. These Abstract actions can in
turn be broken down into combinations of other elemental actions, termed “Basic” actions. These
include behaviors such as “Move Forward,” “Turn,” or “Accelerate”: the smallest decomposition of
physical tasks, barring the modeling of internal cognitive processes of humans or of individual parts and
components within an aircraft. The relationship between these Abstract and Basic actions can be
explained in the context of Figure 3-4, in which an aircraft parked on the starboard side of the deck
is asked to taxi to catapult 3 and launch. The aircraft would be assigned a Taxi action, a Launch
Preparation action, and a Takeoff action. The Taxi action would declare a goal state of reaching
catapult 3; rules within the action would determine that, given the location of the aircraft, nearby
crew, and the state of the deck, reaching catapult 3 requires four “Move Forward” tasks and three
“Rotate” tasks in the sequence shown. These would be added into the queue of subtasks in the Taxi
action and executed in series. At the end of the last Basic action, the higher-level Taxi action checks
to see if the vehicle is at the desired goal state: if so, the Taxi action ends. Otherwise, additional
rules add in new Basic actions in order to reach the goal. The rules within the Abstract actions are
adaptable, capable of automatically solving the motions required for any aircraft at any location on
the flight deck to reach any of the four catapults.
As Basic actions are completed, they are removed from the Abstract action’s queue and the next
Basic action is begun. When the queue is empty, the Abstract action should be complete and, if the
goal state has been achieved, the Abstract action is removed from the agent’s queue and the next
Abstract action (if any) is begun. In the example above, once the Taxi action completes, the Launch
Preparation task would begin, with its rules defining the set of Basic actions that must occur to
satisfy its goal. When an Agent’s action queue is empty, it may either wait for further instructions to
be assigned or its own internal rule base may generate a new action (such as returning to its original
location). This process of defining and executing Actions occurs independently for each Agent in
MASCS, and rules within each agent and task dictate when and how tasks are executed. These
include how vehicles avoid collisions with other vehicles, how Aircraft Directors align to aircraft
in order to be visible, and other aspects that prevent tasks from being executed as designed. A
description of these rules, along with pseudocode for these rules and for several other elements, can
be found in Appendix B.
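The Abstract/Basic action pattern described above can be sketched as follows. The interfaces and names are illustrative rather than the actual MASCS classes; the intent is only to show how an Abstract action holds a queue of Basic actions, replans when the queue empties, and finishes once its goal state is reached.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Illustrative sketch of the Abstract/Basic action pattern; these are not the actual MASCS classes.
    interface BasicAction {
        void update(double elapsedSeconds);   // advance the action by one small time step
        boolean isComplete();
    }

    abstract class AbstractAction {
        protected final Deque<BasicAction> subtasks = new ArrayDeque<>();

        protected abstract void planSubtasks();     // decompose the goal into Basic actions
        protected abstract boolean goalReached();   // has the agent reached the goal state?

        void update(double elapsedSeconds) {
            if (subtasks.isEmpty()) {
                if (goalReached()) return;           // Abstract action finished
                planSubtasks();                      // otherwise add new Basic actions toward the goal
                if (subtasks.isEmpty()) return;      // nothing to do this tick
            }
            BasicAction current = subtasks.peek();
            current.update(elapsedSeconds);
            if (current.isComplete()) subtasks.poll();   // completed Basic actions leave the queue
        }

        boolean isComplete() { return subtasks.isEmpty() && goalReached(); }
    }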
The individual Basic actions are all based on time, and the execution of a Basic action completes
when the agent has reached the expected execution time. For example, a motion-based task like
driving forward in a straight line can be decomposed into a total time given the agent’s current
position, movement speed, and final destination. The launch acceleration task that models the
motion of an aircraft down a catapult to flight speed can be solved similarly. For these motion
tasks, updating the agent’s position in the world is simply a function of applying elapsed time to
the rate of travel and decomposing the movement in the x- and y-axes along the vehicle’s current
orientation angle. The vehicle’s position is updated incrementally at very small time steps until the
total time of the motion has elapsed. Other tasks are purely time-based and require no
physical movement from vehicles; these are modeled as randomized parametric distributions, with
execution times independently sampled at the start of each task. The most important of these are
the launch preparation task (tasks performed at the catapult prior to launch) and the launch acceleration
task (physically accelerating the vehicle down the catapult) described previously. Table 3.3 notes
the parametric distributions for these two tasks, and Appendix C contains the observed times for
completing these actions used to build these distributions (Tables C.2 and C.3).

Figure 3-4: Example path, broken into individual actions, of an aircraft moving from a starboard
parking location to launch at catapult 3.
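A brief sketch of the two task types described in the preceding paragraph follows: an incremental position update for motion-based tasks and randomized durations for the time-based launch tasks. The distribution parameters are taken from Table 3.3; the class, method names, units, and clamping are illustrative assumptions.

    import java.util.Random;

    // Sketch of time-driven motion updates and randomized task durations. Distribution
    // parameters follow Table 3.3; everything else here is an illustrative assumption.
    class MotionAndTimingSketch {
        private static final Random RNG = new Random();

        // Advance a vehicle along its heading for one small time step, as in a MoveForward action.
        static void updatePosition(double[] pos, double headingRadians,
                                   double speed, double dtSeconds) {
            pos[0] += speed * Math.cos(headingRadians) * dtSeconds;  // x component of the motion
            pos[1] += speed * Math.sin(headingRadians) * dtSeconds;  // y component of the motion
        }

        // Launch preparation time in seconds, drawn from N(109.648, 57.803^2) per Table 3.3.
        static double sampleLaunchPreparationTime() {
            return Math.max(0.0, 109.648 + 57.803 * RNG.nextGaussian());  // clamp: negative draws are not physical
        }

        // Launch acceleration time, drawn from Lognormal(mu = 1.005, sigma = 0.0962) per Table 3.3.
        static double sampleLaunchAccelerationTime() {
            return Math.exp(1.005 + 0.0962 * RNG.nextGaussian());
        }
    }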
In summary, the MASCS model of deck operations is composed of a combination of independent
agents and modularized action classes. Each relies on the definition of internal rules and states
that govern its behavior. These may also reference states within other agents during execution in
order to mimic information gathering and sensing of actions on the flight deck. Actions are assigned
to vehicles, which are updated by the primary MASCS update routine and which in turn update
action execution. A variety of rules dictate when and how actions can be executed, and higher-level
rule bases dictate which of several lower-level decision-making processes should be utilized. The
preceding section has given a broad overview of how MASCS functions but has not discussed the
agent models in any significant detail. The next section describes the relevant rules and states
required for each class of agent models individually, as well as more detailed descriptions of certain
Actions used within the simulation.
3.2.1
Aircraft/Pilot Models
Section 3.1 described the basics of launch operations on the aircraft carrier flight deck. In the
context of flight deck operations, much of the description centered on the interaction between pilots
(seated within their aircraft) and the Aircraft Director crew that provided instructions to them. In
order to model this interaction in the form of the Actions described in the previous section, this
must be broken down into three components: (1) pilots’ visual sensing and perception of Aircraft
Directors, (2) pilot comprehension of information and instructions and correct response selection,
and (3) pilots’ execution of the commanded task. Modeling these requires defining a series of states
that relate to these motions (location, heading) and how instructions are being relayed (current
Director assigned to aircraft), parameters related to the motion of the aircraft (such as the field of
view of the pilot and range of speeds), and probabilities regarding failures. Additionally, other task
parameters regarding the execution of the launch preparation and launch acceleration times must be
defined, as well as logic regarding the avoidance of collisions on the flight deck under the Dynamic
Separation safety protocol. A list of these features appears in Table 3.4, and details for each are
provided in this section.

Table 3.3: List of primary Actions used in the MASCS model of launch operations.

  MoveForward (Basic): Moves aircraft forward in a straight line between two points. Inputs: initial
  point, destination point, speed of motion.
  Rotate (Basic): Rotates aircraft in place across a given angle. Inputs: initial angle, final angle,
  speed of rotation.
  Taxi (Abstract): Navigates aircraft across the flight deck; iteratively defines motions and Director
  assignments. Inputs: initial point, destination point, assigned Director, speed of motion.
  Launch Preparation (Basic): Replicates the series of tasks required to prepare aircraft for launch
  (catapult settings, control surface checks, physical attachment to catapult). Input: Preparation
  Time, N(109.648, 57.803²); see Appendix C.
  Acceleration (Basic): Replicates acceleration and motion of aircraft down the catapult to flight
  speed. Input: Acceleration Time, Lognormal(µ = 1.005, σ = 0.0962); see Appendix C.
The ability to observe a Director is determined by two things: first, whether the Director is in
an area visible to the pilot, and second, whether the pilot actually observes and recognizes the
Director once in that region. The visible area, referred to as the field-of-view of the pilot, is defined
as a state variable describing the angular area in front of the vehicle visible to the pilot. This is
stored within the aircraft class and depicted graphically in Figure 3-5. While a pilot can turn their
head to see a great deal of the deck area (close to 300◦ of radial area) or to signal agreement or
disagreement with instructions, operationally, Directors will attempt to move into and align to an
area in front of aircraft in order to be visible to the pilot within the aircraft. If they are not in this
area, the pilot may not be able to observe their instructions.
Initially, the field-of-view value was set to 180◦, but the resulting animations of Aircraft Director alignments were clearly inappropriate for deck operations. The field-of-view variable for manual control
was judged to perform appropriately at a value of 85◦ . As such, Aircraft Directors must be in front
of the aircraft within 42.5◦ of either side of its centerline. This requires specific behaviors on the part
of the Aircraft Directors (an alignment task)2 . Once in this area, there is a chance as to whether
or not the pilot visually acquires and recognizes the correct Director. To model such an event, a
static variable relating to the chance of failure is defined for the aircraft agent and referenced when
Director alignment tasks occur. However, while the chance of a failure of Director acquisition may
exist, SMEs report that the likelihood of failing to observe a Director is very low and does not have
a noticeable effect on operations. At this time, it is left at a 0% failure probability.
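The field-of-view rule above can be sketched as a simple geometric test: the assigned Director must fall within ±42.5◦ of the aircraft centerline for the pilot to accept instructions. The helper below is illustrative only; the coordinate conventions and names are assumptions, not the MASCS implementation.

    // Sketch of the Director-visibility test: the assigned Director must lie within ±42.5 degrees
    // of the aircraft centerline. Coordinate conventions and names are assumptions.
    class VisibilitySketch {
        static final double HALF_FOV_DEGREES = 42.5;   // per the MASCS field-of-view parameter

        // True if a Director at (dx, dy) is visible to the pilot of an aircraft at (ax, ay)
        // with the given heading in degrees (deck frame).
        static boolean directorVisible(double ax, double ay, double headingDegrees,
                                       double dx, double dy) {
            double bearing = Math.toDegrees(Math.atan2(dy - ay, dx - ax));
            double offCenter = ((bearing - headingDegrees + 540.0) % 360.0) - 180.0;  // normalize to (-180, 180]
            return Math.abs(offCenter) <= HALF_FOV_DEGREES;
        }
    }

If the test fails, the Taxi action would pause the aircraft while the Director agent moves back into the visible region.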
The ability of pilots to execute tasks involves both defining the time required to complete a task
and the probability of failing to execute that task. Just as with the other failure rates, SMEs
judged the likelihood of a pilot failing to properly align to the catapult as a very rare occurrence:
less than once in every 200 launches. When asked, SMEs were comfortable with leaving this as a
0% failure rate. In addition to failure rates, models describing the time required to complete a task
are also required. Two different types of tasks exist, relating to the Basic actions described earlier.
Motion-based actions address vehicle movement on the flight deck — taxiing, turning, accelerating,
and stopping. Time-based tasks require no motion, but still require a specific amount of time to
complete; this includes the launch preparation time and the launch acceleration time. Motion-based
tasks require a definition of vehicle speeds, sampled from random distributions to reflect the
variability in operator behavior on the flight deck.

2 Pseudocode describing both pilot behavior and Aircraft Director alignment appears in Appendix B.

Table 3.4: List of important parameters and logic functions for Aircraft/Pilot modeling.

  Location (state): Current (x,y) position on deck.
  Heading (state): Current orientation angle on deck.
  Current speed (state): Current assigned speed; sampled from a parametric distribution.
  Director assignment status (state): Is a Director currently assigned to the aircraft (true/false).
  Catapult assignment status (input): Is a catapult currently assigned to the aircraft (true/false).
  Catapult assignment (input): To which catapult the aircraft should be taxied (1-4).
  Aircraft Director assignment: Which Director is currently responsible for giving taxi instructions.
  Current taxi command (input): Current task prescribed by Director (turn, move forward, etc.).
  Speed (parameter): Vehicle movement speed on the flight deck (miles per hour); N(3.5, 1.0²).
  Turn rate (parameter): Rate at which vehicles rotate during taxi operations; 18◦ per second.
  Field of View (parameter): Area visible to vehicle and to which Directors align; ±42.5◦ from centerline.
  Probability of missing Director command (parameter): Likelihood of pilot failing to understand a task; 0%.
  Probability of failing to execute task (parameter): Likelihood of pilot failing to properly execute a task; 0%.
  Director visibility (logic): Is the assigned Director currently in the field of view; if not, stop immediately.
  Dynamic separation (logic): Replicates actions to stop aircraft from making physical contact; thresholds
  of one aircraft length forward and one aircraft width laterally.
  Forward collision threshold (parameter): Minimum safe distance forward; one aircraft length.
  Lateral collision threshold (parameter): Minimum safe distance laterally; one aircraft width.
Aircraft taxi speeds were reported by subject matter experts to be equivalent to brisk human
walking speed; this is captured in MASCS as a Normal distribution with mean 3.5 mph and a
standard deviation of 1.0 mph (N(3.5, 1.0²)) within a specified range of [3.0, 4.0]. An aircraft’s
speed is sampled randomly from this distribution and assigned at the start of its motion task, then
referenced by the Basic actions to update and animate the motion of the aircraft through the deck.
The modeled speeds were compared to observations of flight deck operations to verify that they were
relatively accurate; a more formal examination of taxi motion occurs in Chapter 4. Aircraft turning
rates were defined as a static variable at 18 degrees per second (10 seconds to turn 180◦) and were
compared to empirical observations in a similar fashion; a table of the estimated turn rates for a set
of observations can be found in Appendix C.

Figure 3-5: Diagram of the Director alignment process.
A final important aspect of the aircraft models is the inclusion of a collision prevention routine
for the Dynamic Separation safety protocol, which models a pilot’s actions to stop his aircraft from
running into another vehicle. This action is dictated by a rule base contained within the vehicle
model and referenced during Taxi action execution. A flow chart of the decision rules relating to
this appears in Fig. 3-6. On execution, the rule base compares the position of the aircraft agent to
nearby aircraft forward or adjacent to it. The distance between the two aircraft is calculated and
decomposed into forward (through the aircraft’s nose) and lateral (through the wing) components.
If either distance is below a certain threshold (set to be one aircraft length forward or one aircraft
width laterally), the vehicle is paused until the other vehicle moves out of the way.
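A sketch of this check is shown below, interpreting the rule as pausing whenever another aircraft falls inside a box one aircraft length deep and one aircraft width wide ahead of the nose. The specific dimensions and names are assumptions for illustration; the actual thresholds are expressed in aircraft lengths and widths as described above.

    // Sketch of the Dynamic Separation check: pause taxi when another aircraft falls inside a box
    // one aircraft length deep and one aircraft width wide ahead of the nose. The dimensions
    // below are assumed values, not taken from the thesis.
    class DynamicSeparationSketch {
        static final double AIRCRAFT_LENGTH = 56.0;   // assumed representative length
        static final double AIRCRAFT_WIDTH  = 32.0;   // assumed representative width

        // True if the taxiing aircraft at (ownX, ownY) should pause because of the other aircraft.
        static boolean mustPause(double ownX, double ownY, double headingRadians,
                                 double otherX, double otherY) {
            double dx = otherX - ownX, dy = otherY - ownY;
            // decompose the separation along the nose (forward) and wing (lateral) axes
            double forward =  dx * Math.cos(headingRadians) + dy * Math.sin(headingRadians);
            double lateral = -dx * Math.sin(headingRadians) + dy * Math.cos(headingRadians);
            return forward > 0 && forward < AIRCRAFT_LENGTH && Math.abs(lateral) < AIRCRAFT_WIDTH;
        }
    }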
Limiting the check to aircraft nearby and forward of the aircraft replicates both the role of
the pilot in taking active control of the vehicle and commands from the Directors to stop
and wait for other vehicles. MASCS does not, at this time, make any differentiation between the
two as both are reported by SMEs to be highly reliable — collisions during taxi are quite rare.
However, the behavior of these collision avoidance routines, even if working effectively, does have
repercussions on the flight deck. It is possible that chains of near-collisions occur: vehicle A might
stop because of the location of vehicle B, which is in turn stopped due to vehicle C, which is waiting
for another aircraft to launch from a catapult. While it effectively prevents collisions, it may have
other, downstream effects on operations that are difficult to predict. It should also be noted that
the collision prevention routine is supplemented by the dynamic routing of traffic on the flight deck
by the Aircraft Director agents; directors will attempt to route aircraft in ways that do not conflict
with one another. The decision-making process behind this is described in the following section,
along with other aspects of the Director models.
Figure 3-6: Flow diagram of the collision avoidance process.
3.2.2
Aircraft Director Models
Aircraft Directors are the primary members of the crew implemented on the flight deck, as they are
the most active during launch operations. While there is a significant number of other crew involved
in pre-launch preparations, they do not participate in launch activities and typically move to safe
locations that do not affect taxi operations. States and inputs for Aircraft Directors relate to how
they route aircraft through the flight deck, describing their current assignments, if they are available
for aircraft to be handed off, and to what other Directors they can route aircraft. Directors also
contain variables describing their speed of motion, identical to that of the aircraft models described
in the previous section. A summary of these features appears within Table 3.5.
As aircraft receive their catapult assignments, they are assigned to Directors; this relationship
is stored by both the aircraft and the Director (“Current aircraft assignment”). Once a Director
takes assignment of an aircraft, they must move to a location in which they are visible to the pilot
(Fig. 3-5). For the manual control operations modeled here, this is done by staying ahead and to the
side of the vehicle’s center line — SMEs note that Directors attempt to stay even with one wingtip of
the aircraft, sufficiently ahead of the aircraft that they are not hidden from the pilot by the aircraft’s
nose. Observations of flight deck operations showed that Directors do follow this general rule for
the majority of operations. A codified version of this rule, based on geometry and the destination of
the aircraft, dictates where Directors should move and is stored in the Director model. Additional rules
determine where Directors should move if they are initially not visible to the pilots and how to stay
in alignment while the aircraft turn and move forward. These additional rules were also based on
observations of operations and discussions with SMEs.

Table 3.5: List of important parameters and logic functions for Aircraft Director modeling.

  Location (state): Current (x,y) position on deck.
  Current speed (state): Current assigned speed; sampled from a parametric distribution.
  Assignment status (state): Is the Director currently assigned to an aircraft (true/false).
  Current aircraft assignment (input): Which aircraft the Director is responsible for.
  Zone/catapult assignment (input): Current Aircraft assigned; current Aircraft destination (catapult).
  Crew network topology (input): To which other Directors an aircraft can be passed.
  Speed (parameter): Rate at which crew move on the flight deck; N(3.5, 1.0²).
  Aircraft routing (logic): Determines which Director an aircraft should be passed to.
  Alignment to aircraft (logic): Determines where the Director should move to become visible to the pilot.
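Returning to the alignment rule described before Table 3.5, a Director's target position can be sketched as a point ahead of the nose and offset toward one wingtip, keeping the Director inside the pilot's field of view. The stand-off distances below are assumed values for illustration only.

    // Sketch of the Director alignment target: a point ahead of the nose and offset toward one
    // wingtip, keeping the Director inside the pilot's field of view. Stand-off distances are
    // assumed values for illustration.
    class DirectorAlignmentSketch {
        static final double AHEAD_OF_NOSE = 30.0;     // assumed stand-off ahead of the aircraft
        static final double WINGTIP_OFFSET = 20.0;    // assumed lateral offset toward a wingtip

        // Target (x, y) for a Director aligning to an aircraft at (ax, ay) with the given heading.
        static double[] alignmentPoint(double ax, double ay, double headingRadians, boolean portSide) {
            double side = portSide ? 1.0 : -1.0;
            double x = ax + AHEAD_OF_NOSE * Math.cos(headingRadians) - side * WINGTIP_OFFSET * Math.sin(headingRadians);
            double y = ay + AHEAD_OF_NOSE * Math.sin(headingRadians) + side * WINGTIP_OFFSET * Math.cos(headingRadians);
            return new double[] { x, y };
        }
    }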
While instructing pilots, Directors also monitor nearby traffic on the flight deck and attempt
to appropriately manage multiple conflicting vehicle paths. Because pilots of manned aircraft have
little knowledge of the catapult assignments of other aircraft on the flight deck (and may not even
know their own), it is left to the Directors, monitoring nearby traffic and communicating with each
other through hand gestures, to deconflict taxi actions across the flight deck. As described by SMEs,
if an aircraft is already in motion, it has priority in moving on the flight deck and any new taxi
commands should not interfere with those actions. However, if the two aircraft are sufficiently far
apart that the second can be taxied clear of the area before the first arrives, Directors will do so in
order to maximize the efficiency of operations.
Traffic deconflictions are replicated within MASCS through the use of a priority list of aircraft
assigned to taxi through a specific area, with a set of rules governing when aircraft are added to the
list, when they are removed, and in what cases no true conflict exists. The traffic management rules
rely on a series of lists that maintain the order in which aircraft request to taxi through the area.
The first aircraft in this list is by default allowed to taxi through the area; all aircraft not first in
the list must wait for this aircraft to exit the area before continuing.
There are a variety of exceptions to this rule that can occur, however, and the decision rules
examine each of these to see if no true conflicts exist. For example, if the first aircraft in the list
is taxiing to a forward catapult, and it has already driven by the second aircraft in the list, that
aircraft should be allowed to move. If it is heading forward, it will not catch up to the first aircraft,
and if heading aft, cannot cause any problems. Otherwise, the vehicle waits until it either becomes
the first aircraft in the priority list or satisfies one of the exceptions. A detailed discussion of these
rules appears in Appendix B.
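A minimal Python sketch of this priority-list check is shown below. The list structure, the destination_is_forward flag, and the single already_passed exception are simplified stand-ins for the full rule set documented in Appendix B.

from collections import namedtuple

Aircraft = namedtuple("Aircraft", "name destination_is_forward")

def may_taxi(aircraft, priority_list, already_passed):
    """Decide whether an aircraft may taxi through a contested deck area.

    priority_list  : aircraft in the order they requested transit through the area
    already_passed : True if the lead aircraft has already driven past this aircraft
    """
    if not priority_list or priority_list[0] is aircraft:
        return True                     # first in line taxis by default
    lead = priority_list[0]
    # Simplified exception: a lead aircraft bound for a forward catapult that has
    # already driven past this aircraft cannot create a true conflict.
    if lead.destination_is_forward and already_passed:
        return True
    return False                        # otherwise wait for the lead aircraft to exit

lead = Aircraft("AC1", destination_is_forward=True)
second = Aircraft("AC2", destination_is_forward=False)
print(may_taxi(second, [lead, second], already_passed=True))    # True: exception applies
print(may_taxi(second, [lead, second], already_passed=False))   # False: must wait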
Once the Director taxis the aircraft completely through his zone, the aircraft must be handed off
to a new Director in a nearby zone. Each Director maintains an understanding of how their zone
connects to other zones on the flight deck (the input of the crew network topology). Figure 3-7
shows the topology of possible connections within the Director network on the flight deck. Rules
dictate, given the available connections and the current destination of the aircraft, to which Director
the aircraft should be sent next. If no Directors are available, the aircraft waits until one of the
connected Directors finishes with the aircraft they are currently instructing and becomes available.
A more detailed description of the logic for this model appears in Appendix B.
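As a rough illustration of the handoff logic, the Python sketch below selects an available, connected Director whose zone lies nearest the aircraft's destination. The adjacency map, the availability flag, and the distance criterion are assumptions made for illustration; the actual handoff rules appear in Appendix B.

def next_director(current_zone, destination_xy, zone_adjacency, directors):
    """Return the connected, available Director whose zone lies nearest the destination.

    zone_adjacency : dict mapping a zone to the zones it connects to
    directors      : dict mapping a zone to {"available": bool, "zone_center": (x, y)}
    Returns None if no connected Director is free (the aircraft then waits).
    """
    candidates = [z for z in zone_adjacency[current_zone] if directors[z]["available"]]
    if not candidates:
        return None

    def dist_sq(zone):
        cx, cy = directors[zone]["zone_center"]
        dx, dy = destination_xy
        return (cx - dx) ** 2 + (cy - dy) ** 2

    return min(candidates, key=dist_sq)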
Together, this combination of rules — Director handoffs, alignment, and traffic management — governs the behavior of Directors in executing flight deck operations and facilitates the movement
of aircraft from parking spaces to launch catapults. However, implicit in each of these actions is
that the aircraft has already received an assignment to a launch catapult. These assignments are
provided by the Deck Handler agent and its decision-making rules, described in the next section.
3.2.3 Deck Handler Model
The most important supervisory-level agent in the simulation is the Deck Handler that determines
the assignments of aircraft to launch catapults on the flight deck. The codified rules that govern
these decisions were generated in earlier work, based on interviews with SMEs regarding the typical
strategies for allocating aircraft tasks (Ryan, 2011; Ryan et al., 2014). In this earlier work, SMEs with
many years of deck experience were interviewed to understand how Deck Handlers allocate aircraft
Figure 3-7: Map of the network of connections between Directors on the flight deck. Green dots
denote Directors responsible for launching aircraft at the catapults (“Shooters”). Blue dots denote
Directors that control the queuing position behind the jet blast deflectors (“Entry” crew). Yellow
dots denote other Directors responsible for taxi operations through the rest of the flight deck.
Shooters and Entry Directors may also route aircraft through the rest of the deck if not currently
occupied with aircraft in queue at their assigned catapult.
both at the start of a mission and in response to failures during a mission. In these interviews, SMEs
were presented with different arrangements of the deck and different potential failure conditions
(e.g., a catapult becomes unusable) and asked to describe how a new plan would be designed. From
iterative explanations of different scenarios, a set of general rules of planning were created, describing
the “rules of thumb” used by Handlers in planning operations under a variety of circumstances. These
rules appear in Table 3.6 and, when applied to a given scenario on the flight deck, describe specific tactics
for parking aircraft, assigning them to catapults, and routing them through the deck.
Rules 5 (make as many resources available as possible), 6 (maintain an orderly traffic flow),
and 8 (park for maximum availability) combine to govern where aircraft should be parked prior to
operations. As much as possible, aircraft are parked on the edges of the flight deck facing inwards
in order to make the taxi process easier. Aircraft are parked in the interior of the deck (the Street)
and on top of catapults 1 and 2 only when all other options have been utilized. Aircraft parked on
catapults and in the interior regions will also be the first to be assigned in order to open up taxi
paths on the deck and to make more catapults available. Because these areas are cleared first based
on Rule 5, higher-priority aircraft in the mission will be parked in these areas based on Rule 8.
Rules 1 (translated as “simplify operations”) and 6 result in Handlers assigning aircraft to the
nearest available catapults, ensuring the simplest and fastest taxi operations and minimizing any
risk of aircraft interfering with each other’s taxi routes. However, the goal of Rule 4 (distributing
workload) eventually means that aircraft must be assigned to taxi to catapults that are farther
away than desired; it is better to use all available catapults than to minimize taxi path distances.
Additionally, Rule 4 also means distributing priority aircraft around the deck to maximize their
likelihood of being assigned to catapults. If the highest-priority aircraft is parked forward, the second-highest might then be parked aft, and so forth, to balance priorities across the areas. Rules 7 and 9
are not applicable to the current MASCS modeling as major failures and landing operations are not
included in the simulation scenarios; however, for future extensions of the model, these heuristics
should be included. Rule 3 is included, however, as it relates to the Dynamic Separation protocols
described in the previous chapter.
Table 3.6: Heuristics of Deck Handler planning and re-planning developed as part of previous work (Ryan, 2011; Ryan et al., 2014)

General:
(1) Minimize Changes
(2) Cycle quickly, but maintain safety
(3) Halt operations if crew or pilot safety is compromised

Deck:
(4) Evenly distribute workload on deck
(5) Make as many resources available as possible
(6) Maintain an orderly traffic flow
(8) Park aircraft for maximum availability next cycle

Airborne:
(7) Populate landing order by fuel burn rate, fuel level, then misc.
(9) “True” vs “Urgent” emergencies in Marshal Stack
These rules were translated into sets of initial conditions for where aircraft should be parked and
a set of logical rules that describe, given the current aircraft on the deck, their priorities, and their
locations, where they should be assigned. The MASCS Deck Handler also requires an understanding
of what catapults are operational and whether a catapult has an open spot in its queue. The Handler
routine always attempts to assign an aircraft to the nearest available catapult, given the current
geometric constraints on the flight deck. The routine first attempts to assign aircraft to empty
catapults (those without a prior assignment), before proceeding to catapults with only one prior
assignment. A set of rules determines whether or not the vehicle has an accessible taxi path to the
catapult (whether or not aircraft are parked in the interior of the flight deck, blocking the path).
The priority listing of aircraft included in the initial conditions is organized in accordance with
Rules 5, 6, and 8, and when the Handler rules assign aircraft to the nearest catapults, it results
in areas of the deck being cleared in a specific order. This is shown in Figure 3-8. The “Street”
is typically among the first cleared. In a mission using 34 aircraft, its aircraft will largely be sent
forward. Aircraft parked on the forward, starboard side of the deck will also be sent forward so as
to remove them from the catapult they are blocking. Once these areas are cleared, aircraft parked
in the forward area of the deck are cleared in a front-to-back fashion. In the aft area of the deck, the
few aircraft parked near catapult four might be cleared first, followed by all of the aircraft parked
in the Fantail area. Aircraft parked on the starboard side of the deck would be allocated last from
this area. As operations progress, however, the Handler heuristics will assign aircraft whenever and
wherever possible, regardless of region. These heuristics prioritize assigning an aircraft to a nearby
catapult over taxiing an aircraft the length of the deck to reach an opening: it takes longer to taxi
the length of the deck than to complete a launch at a catapult. Additionally, taxiing the length of
the deck is disruptive to taxi operations along the entire deck and is avoided as much as possible.
A rigid definition of these rules breaks down when the highest-priority aircraft has no feasible
taxi path to the available catapults. If this situation arises, it is because aircraft in the Street are blocking
taxi routes and have divided the deck into forward and aft halves. Because this potential exists,
Figure 3-8: Map of the flight deck noting the general parking areas of aircraft (gray), the order in
which they are cleared, and to which catapults (forward or aft) they are allocated.
the Handler rules are allowed to examine the first n aircraft in the list, beginning with the first
(highest-priority) aircraft. The value n defines the lowest-ranked aircraft allowed to be assigned,
and a value of five was used in the system. This accounts for full assignment of aircraft to a forward
or aft catapult pair (four) plus making an additional assignment in the system (plus one). The
arrangement of priority aircraft in the initial parking locations should ensure that there is a balance
between the forward and aft sections of the flight deck and that some assignments are always possible.
Table 3.7 contains a list of the key inputs to the Deck Handler model, which are explained below. A
more detailed discussion of the logical rules comprising the Handler Allocation model can be found
in Appendix B.
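The following Python sketch captures the flavor of this allocation logic: walk the first five aircraft in the priority list and assign the first one that has an accessible taxi path, preferring empty catapults and then the nearest available catapult. The data structures and the helper predicates has_taxi_path_to and distance_to are assumptions for illustration; the complete rule set is given in Appendix B.

def assign_next_aircraft(priority_list, catapults, lookahead=5):
    """Assign one of the first `lookahead` unassigned aircraft to a catapult, if possible.

    priority_list : unassigned aircraft ordered by launch priority
    catapults     : dicts with "operational" (bool) and "queue" (list of aircraft)
    Returns (aircraft, catapult) or None if no feasible assignment exists.
    """
    # Prefer catapults with empty queues, then those with a single prior assignment.
    for max_queue in (0, 1):
        for aircraft in priority_list[:lookahead]:
            open_cats = [c for c in catapults
                         if c["operational"]
                         and len(c["queue"]) == max_queue
                         and aircraft.has_taxi_path_to(c)]
            if open_cats:
                # Nearest available catapult, in the spirit of Rules 1 and 6.
                cat = min(open_cats, key=aircraft.distance_to)
                cat["queue"].append(aircraft)
                priority_list.remove(aircraft)
                return aircraft, cat
    return None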
Once provided with a catapult assignment, the aircraft begins moving under the command of the
Aircraft Directors. As described previously, the aircraft completes its taxi action when it arrives at
its launch catapult queue, where it waits until being loaded onto the catapult. At that time, it begins
launch preparation before completing the launch acceleration action. Parts of each of these aspects
are handled by models for the deck and various deck equipment, described in the next section.
3.2.4 Deck and Equipment Models
The equipment agents of primary concern in launch operations are the four launch catapults and
the landing strip. Additional constructs were created for the parking locations on the flight deck
(gray areas in Fig. 3-8) to facilitate bookkeeping of data. For each of these elements, the most
important state variable concerns their availability (whether or not they are currently in use) and
Table 3.7: List of important parameters and logic functions for Deck Handler modeling.

Feature | Description | Value
States | - | -
Current list of unassigned aircraft (input) | Aircraft awaiting catapult assignments | -
Current priority of each unassigned aircraft (input) | Position of aircraft in priority list | -
Current functional status of each catapult (input) | Is the catapult operational | True/false
Current availability of each catapult (input) | Number of aircraft in catapult queue; if < 2 can accept a new aircraft | True/false
Status of parking spaces 8/9 (input) | Is parking space 8 or 9 occupied by a parked aircraft | True/false
Lowest priority assignment allowed (input) | Lowest ranked candidate for assignment to catapult | 5
Aircraft assignment (logic) | Attempts to assign highest priority aircraft to nearest available catapult based on where aircraft can taxi (status of parking spaces 8/9) | -
their operational status (whether or not they can be used). For parking spaces, this translates to
tracking whether or not a vehicle currently occupies the parking space. Rules track and update
these statuses according to the location of nearby vehicles. For catapults, the area in the immediate
vicinity of each, referred to as the “foul area,” should not be entered by any unauthorized crew or
vehicles during launch processes; doing so requires an immediate halt of launch operations. These
are noted by markings on the flight deck in the real world, which were converted into the simulated environment.
Other rules for catapults dictate how aircraft queue within the system and how launches are
executed. Each catapult can queue up to two aircraft: one on the catapult and another parked
behind. Each pair of catapults (one forward, one aft) alternates launches within the pair, but the
two pairs operate in parallel. The first aircraft to arrive at a catapult within a given pair begins
preparing for launch; any aircraft that arrive on the adjacent catapult do not begin preparing for
launch until the first aircraft launches. States in the models define whether or not a catapult is
actively “launching” and how many aircraft are in queue, as well as whether or not an active aircraft
is stationed on (“occupying”) a given catapult. An aircraft only begins preparing for launch if it
currently occupies a catapult launch position and the adjacent catapult is not actively preparing to
launch its own aircraft. Rules for the queuing process describe where aircraft should stop taxiing
near the catapult; aircraft advance only if the catapult is not currently occupied. Table 3.8 provides
an overview of the important inputs, states, and parameters of the catapult agents.
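A compact Python sketch of these queueing and alternation rules might look like the following; the dictionary-based catapult representation and the field names are assumptions made for illustration.

def can_accept_aircraft(catapult):
    """An aircraft may only taxi to a catapult whose queue holds fewer than two aircraft."""
    return catapult["operational"] and len(catapult["queue"]) < 2

def can_begin_launch_prep(catapult, adjacent):
    """Launch preparation may begin only when an aircraft occupies this catapult's launch
    position and the adjacent catapult in the pair is not actively preparing to launch."""
    return catapult["occupied"] and not adjacent["launching"]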
These rules, as with other rules and behavior models within MASCS, are all executed independently by agents in the system. The execution of a simulation run begins with a set of initial
conditions, with agent decisions and reactions dictating how the simulation evolves from that point.
The simulation execution process is described in the next section.
Table 3.8: List of important parameters and logic functions for Catapult modeling.

Feature | Description | Value
Operating (state) | Catapult is functional and can be used | True/false
Occupied (state) | Aircraft is currently stationed on catapult | True/false
Launching (state) | Aircraft occupying catapult is actively preparing for launch | True/false
Number of aircraft in queue (state) | Current number of vehicles in queue | [0,2]
Launching status of adjacent catapult (input) | Catapult can only be launching if adjacent catapult is not launching | -
Queueing rules (logic) | Aircraft only allowed to taxi to catapult if queue size is less than two; aircraft not allowed to begin launch preparation unless adjacent catapult is not launching and not occupied | -
3.3 Simulation Execution
The previous sections have described how the individual agents within the MASCS model of flight
deck operations make decisions within the world; all that is required is an initial configuration of
the world, the agents active within it, their starting locations, and their initial goal states (if any).
Also included are their order in the priority list, and crew assignments (to taxi zones or catapults).
Appendix D provides example text from the configuration files, while Figure 3-3 previously showed
the layout of the deck with 34 aircraft (the maximum number used). Figure 3-9 (top) shows another
possible configuration of the flight deck with only 22 aircraft.
From the initial configuration and priority listing submitted in the configuration files, the Deck Handler’s rule base begins to assign aircraft based on their priority in the launch order, the current allocations in the system, the current locations of aircraft, and the accessibility and geometry constraints on the flight deck. Each assignment both places the aircraft under the control of a Director and creates a Taxi action for the aircraft. The Director commands a motion action to the Aircraft, which is replicated within
the Taxi action being executed. When the aircraft completes the taxi action, the Director’s logic
for aircraft handoffs is called and the aircraft is assigned to a new Director, who creates a new Taxi,
and the process repeats. While the Taxi actions are executed, aircraft pilots employ their collision
prevention routine to avoid collisions while Directors attempt to deconflict traffic around the deck.
When aircraft reach their launch catapults, they complete the Taxi action and begin executing
Figure 3-9: Top: Picture of the MASCS flight deck with 22 aircraft parked, ready for launch
operations. Bottom: Picture of the MASCS flight deck with 22 aircraft parked, in the midst of
operations. Three aircraft have already launched, four are currently at their designated catapults,
four more are taxiing to or in queue at their catapult, and eleven are currently awaiting their
assignments.
the Launch Preparation task, randomly sampling the execution time from the provided distribution.
Once the required time has elapsed, the aircraft begins the Takeoff task, accelerating down the catapult in a time randomly sampled from its relevant distribution. All aircraft operate simultaneously,
executing their own actions and employing their own rule sets individually. Figure 3-9, bottom
shows one example of the 22 aircraft case in the middle of executing a run. Currently, aircraft at
catapults 2 and 3 are executing their launch preparation tasks, while aircraft are taxiing on to catapults 1 and 4. Additional aircraft are queued at catapults 2 and 3, and two aircraft in the aft area
of the deck are taxiing forward to queue at catapults 1 and 4. The remaining aircraft parked around
the edge of the deck will begin taxiing as soon as the Handler routine assigns them to catapults; the
priority designations are such that the remaining aircraft should be assigned from left to right. The
simulation completes when the last aircraft completes its Takeoff task at its assigned catapult.
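As an illustration of the stochastic task-time sampling described above, the Python sketch below draws launch preparation times from the N(109.6, 57.8²) model with the 30-second floor quoted elsewhere in this thesis, and taxi speeds from the [3.0, 4.0] mph bounds quoted in Chapter 4. The resampling-based truncation and the uniform speed distribution are assumptions, not necessarily the MASCS implementation.

import random

def sample_launch_prep_time(mean=109.6, sd=57.8, minimum=30.0):
    """Sample a launch preparation time (seconds) from the N(109.6, 57.8^2) model with
    a 30-second floor. Resampling below the floor is an assumed truncation scheme."""
    while True:
        t = random.gauss(mean, sd)
        if t >= minimum:
            return t

def sample_taxi_speed(low=3.0, high=4.0):
    """Sample a taxi speed (mph) within the [3.0, 4.0] bounds quoted in Chapter 4;
    the uniform distribution is an illustrative assumption."""
    return random.uniform(low, high)

print(sample_launch_prep_time(), sample_taxi_speed())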
3.4 Chapter Summary
This chapter has provided details on the nature of aircraft carrier flight deck operations and how these
operations are transformed into rules and states for the constituent agent models. The chapter began
with a discussion of the nature of agent-based modeling and its reliance on rules and states. This was
followed by a discussion of flight deck operations that attempted to highlight the aspects that relied
heavily on rule bases to define agent behavior. The architecture of MASCS, noting its uses of Action
classes to organize and define interactions between agents, was then discussed. The final sections
of the chapter described how the features of aircraft carrier flight deck operations were translated
into specific sets of rules and states for different agent types, including details such as the specific rule
processes and how certain state variables were defined. The next chapter describes the process of
validating this baseline manual control model. That model then serves as the template for the development and integration of the
unmanned vehicle control architectures as defined earlier in Chapter 2. A discussion of these models
and their formal definition will occur in Chapter 5.
Chapter 4
Model Validation
“Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.”
George E. P. Box, Empirical Model-Building and Response Surfaces (1987)
The previous chapter provided an overview of modern U.S. aircraft carrier flight deck operations
before describing how these operations could be modeled in an agent-based simulation. Chapter 3
outlined the different agents required for the simulation, their corresponding states, parameters, and
behavioral rules, and the models of agent tasks that were required. These models were based on
empirical data where possible, supplemented by information from Subject Matter Experts (SMEs)
with experience in flight deck operations. This chapter describes how this simulation model, with its
constituent agents, is tested against empirical data on flight deck operations. Demonstrating that
the simulation produces output similar to that of the modeled system provides confidence that the
simulation is a valid representation of that system: that the use of agent-based models is reasonable
and that those agent models and their interactions have been appropriately defined. Demonstrating
that the current Multi-Agent Safety and Control Simulation (MASCS) model described in Chapter 3 can replicate empirical data on current operations suggests that the modeling strategies are suitable for this environment and that the models are
applicable to modeling unmanned vehicles in future tests. A valid simulation of current operations
also serves as a future comparison point for other extensions of the simulation model, as the same
output variables can be encoded into each.
The testing process for the MASCS simulation model was conducted in two different phases.
In the first phase, the performance of a single aircraft proceeding from an initial parking location
to a launch catapult was compared to empirical observations of the same task. The second phase
of testing examined the performance of the simulation for missions using multiple aircraft, testing
launch events using 18 to 34 aircraft with each taxiing from an initial parking location to a launch
catapult. Tests within each of these phases addressed both the internal and the external validity of
the simulation: how well it replicates empirical data on operations (external validity) as well as its
response to changes in internal parameters (internal validity). These tests aim to provide confidence
that the MASCS simulation model defined in Chapter 3 is able to accurately replicate the behavior
of flight deck operations. An additional review of the simulation with experienced Naval personnel
provided additional confidence that the simulation was a reasonable model of operations, judging
whether the animated evolution of the simulation environment was qualitatively similar to that of
the real system.
However, limitations in the availability of data for flight deck operations mean that fully validating all aspects of the model remains unachievable at this time. Several elements of the model as
described in Chapter 3, such as the Deck Handler planning heuristics and other human behaviors,
have no data available for comparisons and thus cannot be guaranteed to be valid. As such, MASCS
can only be partially validated, with its outputs calibrated to the obtained performance data. The
tests that were performed suggest that the simulation is a reasonable model of flight deck operations.
This chapter discusses the validation tests that did occur, beginning first with a discussion of how
agent-based (and other) simulation models are validated and how this applies to the MASCS model.
4.1 Validating Agent-Based Models
Many authors describe the process of simulation construction in terms of several phases, beginning
with data gathering and proceeding through model construction, model verification, and model
validation (Sargent, 2009; Carley, 1996; Law, 2007). The tasks of data gathering and model construction were discussed in the previous chapter, while this chapter focuses on the processes of
model verification and validation. These two aspects describe similar processes at two levels of a
simulation environment. Generally, a simulation can be viewed as an aggregation of submodels
and functions that interact to produce a single coherent simulation of a system. Verification is the
process of ensuring that the submodels and subfunctions are properly defined and encoded; it is
often described as “making sure you’ve coded the X right.” This typically means ensuring that
the mathematical equations governing behavior are correct and that any logical decision structures
behave correctly. Model validation, then, is the process of “making sure you’ve coded
the right X,” demonstrating that the output of the verified submodels matches that of the aggregate
system. Correctly defining and implementing all submodels might still produce an invalid simulation
if the inputs to and interactions between those models were incorrect.
All models are simplifications of reality; to properly model all elements of a real world system at
their true level of complexity would require substantial time and resources. They are all “wrong” in
some sense, and the question is not how accurate can they be, but how inaccurate they are allowed
to be before they are considered inappropriate or not useful. This is often couched in terms of the
external validity of the simulation — the accuracy of its outputs as compared to the outputs of the
real system. Tests of validity can examine these outputs in a variety of ways, from point-to-point
comparisons to comparisons of distributions of data. Which is chosen depends on the available data,
the stochasticity of the real and simulated systems, and the precision with which data is collected.
Additionally, validation can also address the response of the system to changes in input and internal
parameters. This may either test the internal validity of the simulation — that small changes in
parameters should not generate extreme responses — as well as external validity, if the response of
the real system to the same changes in parameters is known.
Most texts that discuss simulation construction and validation focus on the use of systems dynamics and discrete event simulation, in which the movement from submodels and subfunctions to
the full system simulation occurs in one step: the submodels are created and verified, then directly
assembled into a model of the overall system. Agent-based models contain an additional intermediate level: submodels and subfunctions define agent behavior, with multiple agents interacting to
create system behavior (Figure 4-1 provides notional diagrams for the three model types). It is
useful in agent-based modeling to validate the individual agents’ behaviors before testing that of
the aggregate system. This demonstrates that the agents themselves are externally valid representations of real-world actors and ensures that any remaining errors in the simulation output come from agent interactions. While accurate simulation results might be achievable even with
faulty agent models, significant questions regarding the true external (or structural) validity of the
simulation would exist. Both aspects of model validation — individual agents and the aggregate
system model — for the MASCS simulation environment are described in this chapter, beginning in
the next section with single aircraft agent validation.
4.2 Single Aircraft Calibration
Chapter 3 defined the list of agents required in the MASCS model and characterized the states,
parameters, and decision-making rules required for each. These agent models must adequately
replicate not only the decision-making of agents on the flight deck, but their motion on the deck
and their interaction with other agents. Of primary concern is ensuring that models of pilot/aircraft
behavior are correct, as these are the primary focus of launch operations and will be the primary
comparison point in future Unmanned Aerial Vehicle (UAV) testing. This first section describes the
calibration of the behavior of an individual aircraft executing a launch mission on the flight deck,
using the minimum necessary number of crew. The ability of the MASCS simulation to replicate
this task provides confidence that the models of individual vehicle behaviors and the characteristic
distributions describing their speed of travel and the tasks they perform are all accurate. After
Figure 4-1: Simple notional diagrams for Discrete Event, Systems Dynamics, and Agent-Based
Models.
these models are correctly calibrated, testing can then proceed to the mission level using
multiple aircraft.
The tasks described in this chapter were made difficult by the fact that the United States Navy
does not make flight deck performance data publicly available in large quantities. From discussions
with SMEs, it appears that what data is collected on flight operations is done at the behest of officers
and senior enlisted on each ship and is not forwarded on to the Navy or any of its research affiliates.
As such, the data used in the tests in this chapter have been amalgamated from a variety of
government, military, and civilian sources.
For the single aircraft calibration testing, only a single data point for comparison was obtained,
from a video uploaded to Youtube.com (Lockergnome, 2009). Even though a large number of videos
of flight deck operations are available overall, only a small number provide full footage following
an aircraft for more than the final takeoff task, and of those, most involve multiple aircraft taxiing
simultaneously and interacting with one another. The video used in this testing is the only one
discovered that follows a single aircraft from its initial parking place through final takeoff without interacting with other aircraft. No published performance data was located that provides any
information on individual aircraft performance for a characteristic launch task. While this is not
ideal, these observations can still be used in calibrating the Aircraft agent to ensure that it is not
unreasonable.
In the recorded observation, the aircraft begins parked in the central aft area of the deck, then
taxis to Catapult 3 (the interior of the two aft catapults) and launches. This first requires a Taxi
task to move from parking to catapult (consisting of several individual Move Forward and Rotate
actions), a Launch Preparation task, and a Takeoff task. Figure 4-2 shows the starting location and
transit path of the vehicle. Times for each of these subtasks were calculated by reviewing the video
and manually coding event times; they are referred to as the Taxi Duration, Launch Preparation
Duration, and Takeoff Duration, respectively. Their values are annotated in the figure: 78 seconds
to Taxi, 157 seconds in Launch Preparation, 3 seconds Takeoff by accelerating down the catapult,
for a total of 238 seconds to complete the launch event. Note that the observed time to complete the
launch preparation task was not included in the original observations used to generate the launch
preparation time model (Appendix C, Table C.2). The observed takeoff duration was similarly not included in the data used to generate the takeoff time model, but several observations of 3.0 second accelerations did occur in that data.
If the MASCS model of operations is accurate, then the Taxi Duration, Launch Preparation
Duration, and Takeoff Duration from the empirical observation should be reasonably similar to
the average times produced by the simulation. At the same time, too closely matching this data
suggests overfitting. Generally, being within 1 standard deviation of the mean value suggests that a
data point comes from that distribution, given that 68% of all data points lie within one standard
deviation of the mean. Higher quality data was available at the mission level (Section 4.3), and tests
of MASCS mission performance will provide a more substantial validation of the simulation as a
whole.
To test the accuracy of the individual aircraft models within MASCS, an initial configuration
file was created that placed the test aircraft at the approximate starting location of the aircraft
Figure 4-2: Diagram showing the path taken by the aircraft in the observed mission scenario. The
figure also details the breakdown of elements in the mission: Taxi, Launch Preparation, and Takeoff.
from the video observation and directing the aircraft to launch from the same catapult (#3, the interior
aft catapult). This scenario was then replicated 300 times in the MASCS environment to ensure
a full exploration of the stochastic variables in the simulation (Taxi speed, Launch Preparation
time, and Takeoff acceleration time are all random variables). Data on these metrics were logged
and compiled in order to compare their values to the empirical observations. The results of these
comparisons appear in Figure 4-3. Simulation results appear in blue, with whiskers denoting ±1
standard deviation in the simulation results. The empirical data from the video recording appears
in green.
The results in Figure 4-3 indicate that the empirical observations for Launch event Duration (LD),
Taxi Duration, Launch Preparation Duration, and Takeoff Duration are all within one standard
deviation of the simulation means. Specifically, they are 0.56, -0.99, 0.75, and 0.97 s.d. away from
their respective means. Statistical tests also demonstrate that the resulting data are all Normally
distributed (Appendix F). The mean Taxi Duration for the simulation results (blue) is 6 seconds
greater than the empirical observation (green), while the mean Takeoff Duration for the simulation
results is 0.25 seconds smaller than the empirical observation. However, these two differences are far
smaller than the standard deviation observed in the Launch event Duration values (49.63 seconds)
and the Launch Preparation Duration (48.84 seconds in the simulation results). These data also
suggest that, because of its large mean and standard deviation, the Launch Preparation process is
the largest driver of variation in the total Launch event Duration (LD) for these tasks.
The two additional data sets in Fig. 4-3 (yellow and gray) examine combat “Surge” operations
that occur only rarely and for which no empirical data could be located. These data are included only
as comparison points of what might be possible in terms of improving operations. SMEs describe the
Figure 4-3: Results of preliminary single aircraft calibration test. Simulation data appear in blue
with error bars depicting ±1 s.d. Empirical observations are in red.
major change in Surge operations as being a streamlining of the launch preparation process, reducing
the time required to complete the process. Accordingly, simulation runs for “Surge” operations
(yellow) modified the launch preparation time model to be normal with a mean of 45 seconds, a
standard deviation of 10 seconds, and a range of 30-60 seconds (described by SMEs as being the
typical range of launch preparation times in Surge operations). Otherwise, all other parameters and
initial conditions were left unchanged. Modifying just this launch preparation process results in a
substantial improvement in LD values (55% decrease in mean). The fourth column in each chart
(gray) depicts the observed (for taxi and acceleration) or theoretical minimum values (for launch
preparation) for each measure, providing a view of the best possible results under the provided
distributions.
An additional test of the accuracy of the aircraft models can be done through the use of a
sensitivity analysis, which examines whether the model output changes in unexpected ways given a
plausible range of uncertainties (Sterman, 2000). In such tests, if the changes in output are not
extremely different from what would be expected, they provide confidence that the model is robust
to any errors in its provided parameters (Sterman, 2000). For individual aircraft within MASCS,
the two most important stochastic parameters are the launch preparation time and aircraft speed.
Increasing the mean launch preparation time should increase the total launch event duration while
increasing vehicle speed (a rate measure) should decrease it, and vice versa for each. The goal of
the sensitivity testing is to demonstrate that changes to these stochastic parameters do not generate
extreme changes in outputs: for this data, this concerns the changes in LD values. Since the Launch
event Duration is a simple summation of two stochastic parameters (taxi time and launch preparation
time) that are independently sampled, changes in LD values should be at or less than the percentage
changes to those parameters. That is, a 25% increase in taxi speed should produce a ≤25% change
in LD.
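This expectation can be made explicit. Writing the launch event duration as the sum of its two stochastic components, a relative perturbation δ applied to the launch preparation time changes the total by at most that same fraction:
\[
LD = T_{taxi} + T_{prep}, \qquad
\frac{\Delta LD}{LD} = \delta \cdot \frac{T_{prep}}{T_{taxi} + T_{prep}} \le \delta .
\]
The same bound applies to perturbations of the taxi component; for taxi speed the effect enters through $T_{taxi} \propto 1/v$, so a 25% speed increase shortens the taxi time by only 20%, which remains within the bound.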
A set of sensitivity analyses individually varied the mean values of the taxi speed and launch
preparation time models, running 300 replications for each case. Both the mean launch preparation
time and the mean taxi speed were individually varied at ±25%; a 25% change corresponds to roughly
half the standard deviation of the launch preparation time model (the larger of the two means), with
25% changes applied to the taxi model for consistency. Results for these tests, indicating the resulting
percent change in launch event duration mean, appear in Figure 4-4.
For both launch preparation time and taxi speed, the relative change in LD values is less than
the relative change (25%) in each parameter, as desired. However, the results also show that the
relative increases and decreases in LD for a given parameter were not equivalent in magnitude, which
is unexpected. Reviewing the output data from the simulation (Figure 4-5), it can be seen that the
changes in the means of the parameters did not produce symmetrical changes in the sampled values
during simulation runs: increasing the launch preparation mean by 25% resulted in samples that
Figure 4-4: Sensitivity analysis of single aircraft Manual Control Case.
averaged a 30% increase from the baseline, while decreasing the mean by 25% resulted in only an
8% decrease in values. This is a result of the boundaries placed on the parameter values (launch
preparation time has a minimum of 30 seconds, taxi speed bounds of [3.0, 4.0] miles per hour). The
imbalance in sampling partially explains the low percent changes in LD in Fig. 4-4 and part of the
imbalance in the results.
The direction of changes in LD in Fig. 4-4 was consistent with expectations, however, and
the relative changes in LD remained lower than 25%. Taxi speed values increased by nearly 30%
but resulted in only an 11.15% decrease in LD; launch preparation mean values were decreased
by nearly 30% and resulted in LD decreases of only 8.63%. For changes that would decrease LD
values (decreasing launch preparation times or increasing taxi speed), changes were near or less
than the resulting parameter changes. Taxi speeds were decreased by roughly 17% and increased
LD values by only 5%. Launch preparation values were increased by roughly 8.5%, with LD values
increasing by 10.22%. This is a relatively minor inconsistency that can be attributed to the taxi
speeds sampled (which were roughly 1% below average) and the interactions of random sampling of
these two stochastic processes.
Figure 4-5: Changes in parameters during sensitivity analyses.
Together, the sensitivity analyses and the comparison to empirical data both provide evidence
that the simulation is a reasonable model of individual aircraft behavior. The data points from the
single aircraft empirical observations all lie within one standard deviation of the simulation data,
suggesting that the empirical observation is a reasonably likely occurrence within the simulation.
Sensitivity analyses on the taxi speed and launch preparation parameters (varying their means by ±25%) produced changes in launch event duration of less than 25% that occurred in the correct directions. At worst
case, resulting changes in LD should have been equal to 25%; that they are lower demonstrates there
is some robustness in the simulation model to these errors. This, in turn, suggests that the models
of individual aircraft and their interactions with crew, developed in Chapter 3, are accurate for flight
deck operations and should be acceptable for use in mission simulations. This testing is described
in the next section, in which missions ranging from eighteen to thirty-four aircraft are executed in
order to examine the ability of MASCS to replicate real-world mission performance.
4.3 Mission Validation Testing
The previous section described the calibration of individual agent motion on the flight deck, providing
evidence that the individual aircraft models can adequately replicate empirical observations in the
world. However, the more important testing addresses the ability of the full MASCS simulation
to replicate mission scenarios using multiple different aircraft. In these simulations, eighteen or
more aircraft proceed from parking to launch catapults over a period of time, executing taxi tasks
simultaneously while interacting with multiple crew.
Validating these operations requires the acquisition of data that also describes flight deck launch
events. The primary source of information for this data is two reports from the Center for Naval
Analyses (CNA) (Jewell, 1998; Jewell et al., 1998). The most important data from these reports
concerns the number of aircraft used in a single mission sortie (mission Density) and a Cumulative
Density Function (CDF) of the interdeparture times of launches from the flight deck. The former
helps define the test conditions within the simulation (the number of aircraft that should be included
in test scenarios) while the latter provides a highly-detailed measure of system behavior. Figure 4-6
provides a histogram of the number of aircraft used in sorties during the CNA study. While it is
notable that the data can be fit by a Normal distribution (µ = 27.014, σ = 4.2614, Kolmogorov-Smirnov statistic = 0.09866, p = 0.47351), knowledge of the distribution is not necessary for current
testing. Rather, that the bulk of operations utilized between 18 and 34 aircraft serves as an important
input parameter for MASCS testing.
Figure 4-7 shows the second important data point from these CNA reports: a CDF of the
interdeparture times of deck operations over the course of one week of missions. An interdeparture
time is defined as the time between two successive launches (departures) from the flight deck across
any two catapults; if launches happen from catapult 1 at t = 25 and from catapult 3 at t = 40, the
Figure 4-6: Histogram of the number of aircraft observed in sorties during the CNA study (Jewell,
1998).
interdeparture time is 15 seconds. In queuing theory, interarrival and interdeparture times are key
inputs to system models and serve as a powerful metric of a system’s behavior. For this work, this
empirical distribution serves as an important comparison point for the MASCS model; if the results
of simulation testing for the mission profiles shown in Fig. 4-6 provide the same interdeparture
times as shown in Fig. 4-7, it provides evidence that MASCS is not an invalid model of operations.
The usefulness of the CNA distribution is further enhanced by the fact that it comes from a
variety of different mission profiles aggregated together, allowing it to be applied to a variety of
launch event profiles (primarily, the number of aircraft). Unfortunately, the authors of the CNA
reports do not clearly identify the types of operating conditions observed (day, night) nor the specific
distribution of mission sizes utilized. However, the authors did regard the data as representative
of fleet performance. The data contained in Figure 4-7 can be converted first into Probability
Density Function (PDF) form, then fitted with a negative exponential fit (the standard form of an
interdeparture rate) in the form λe^{−λt}, with λ = 0.01557; this provides an average interdeparture time of 64.22 seconds. Details on the process of converting the CDF data into PDF and the statistical
tests for goodness of model fit can be found in Appendix E. If the MASCS simulation is a valid
replication of activity on the flight deck, the interdeparture rates of the MASCS model should align
closely with the reported rate from the CNA data.
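For reference, the fitted form and the mean interdeparture time it implies are
\[
f(t) = \lambda e^{-\lambda t}, \qquad F(t) = 1 - e^{-\lambda t}, \qquad
E[t] = \frac{1}{\lambda} = \frac{1}{0.01557} \approx 64.2 \text{ seconds}.
\]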
The mission profiles listed in Fig. 4-6 were used as the template for mission cases for MASCS
validation. Table 4.1 contains the list of missions tested in this phase. Each mission scenario was
replicated thirty times in the simulation, with data on Launch event Duration (LD) and launch
interdeparture times tracked for each. A total of eighteen different mission profiles were tested,
Figure 4-7: Cumulative Density Function (CDF) of launch interdepartures from (Jewell, 1998).
This CDF can be converted into a Probability Density Function (PDF) in the form of a negative exponential of the form λe^{−λt} with λ = 0.01557.
varying from 18 to 34 aircraft and using either three or four catapults. The numbers of aircraft
used in the test program reflect the significant changes in deck organization that occur from 20 to
24 aircraft; once the mission includes 22 or more aircraft, vehicles must be parked in the interior
of the deck. As this occurs, taxi lanes on the flight deck begin to close off, and operations become
more complex in terms of allocating aircraft. While this is not expected to cause significant effects
in terms of mission duration (the planning heuristics of the handler and the structure of operations
should compensate for these effects), multiple runs were taken in this area to ensure that no adverse
interactions occurred.
Thirty replications were run for each of the missions listed in Table 4.1. Output variables included
the launch times for each aircraft (to generate interdeparture times) as well as total launch event
duration values. Launch interdeparture times were calculated for each mission independently and
fitted with negative exponential distributions in the EasyFit program. The corresponding λ values
and Kolmogorov-Smirnov (K-S) goodness of fit p-values were recorded. Distribution fitting was
also performed in the JMP PRO 10 software to obtain confidence intervals on the λ values, which
Table 4.1: List of scenarios tested in mission validation (Number of aircraft-number of catapults).

Mission Scenarios: 18-3, 18-4, 20-3, 20-4, 21-3, 21-4, 22-3, 22-4, 23-3, 23-4, 24-3, 24-4, 26-3, 26-4, 30-3, 30-4, 34-3, 34-4
EasyFit does not provide, with values of λ verified to match between the two programs. Full results
for these tests appear in Appendix F. Figure 4-8 shows the final λ values and confidence intervals
of the negative exponential fits of the data. The orange line denotes the λ value of the original fit
from the CNA reports.
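A Python sketch of this per-mission analysis, using SciPy in place of EasyFit and JMP, might look like the following. The input format (a list of launch timestamps) and the use of the fitted parameter in the Kolmogorov-Smirnov test mirror the procedure described above but are assumptions about the exact workflow.

import numpy as np
from scipy import stats

def fit_interdepartures(launch_times):
    """Fit a negative exponential to launch interdeparture times and test the fit.

    launch_times : launch timestamps (seconds) for one mission replication
    Returns (lambda_hat, ks_statistic, p_value).
    """
    times = np.sort(np.asarray(launch_times, dtype=float))
    gaps = np.diff(times)                                # interdeparture times
    lam = 1.0 / gaps.mean()                              # maximum-likelihood rate estimate
    # Kolmogorov-Smirnov test of the gaps against the fitted exponential.
    ks_stat, p_value = stats.kstest(gaps, "expon", args=(0.0, 1.0 / lam))
    return lam, ks_stat, p_value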
As the figure shows, λ values for all simulation results lie relatively close to the original CNA
value, which falls well within the confidence intervals on the simulation λ fits. Values for the four-catapult missions tend to provide higher λ values (slightly faster interdeparture rates), which would
be expected given the parallel nature of operations that occur between catapult pairs: adding an
additional active catapult provides additional opportunities for launches to occur nearly parallel
between the forward and aft catapults. For many aircraft Density levels (number of aircraft used
in the mission), the λ values at the three- and four-catapult cases bracket the original CNA λ value. In
general, these results support that the interdeparture times generated by MASCS are accurate with
respect to the empirical data obtained from the CNA reports. More specifically, it suggests that
the selection of the Launch Preparation time distribution and the modeling of catapult launches
are accurate with respect to the real system. However, this is only one aspect of operations in
an environment in which the launch task has significant influence. The next section examines the
robustness of the simulation with respect to changes in internal parameters, as well as the ability of
Figure 4-8: Graphical depiction of launch interdeparture negative exponential fits for the final
round of mission validation testing. Triangles denote λ values. Vertical lines denote 95% confidence
intervals. Note that the mean value of launch interdepartures across all scenarios and replications
is 66.5 seconds.
the simulation to adequately replicate launch event durations over a variety of input parameters.
4.4 Mission Sensitivity Analyses
Sensitivity analyses examine the response of the simulation’s output to changes in the parameters
of the system. The earlier sensitivity analysis of the single aircraft models examined changes to
parameters internal to the MASCS simulation model. Additionally, tests can also address changes
to the input parameters that define the initial conditions of the simulation. The earlier single aircraft
sensitivity analyses simply sought to determine whether or not the response of MASCS was robust
(non-extreme) to changes in these parameters. Tests can also examine how well the response of the
simulation agrees with the response of the real system, if this data is available (Carley, 1996). The
tests of robustness are a check of the internal validity of the simulation model, while tests against
empirical data are checks of the external validity of the model. Both forms of sensitivity analyses
were performed for mission-level testing in the MASCS environment and are discussed in this section.
The first section addresses the response of the simulation to changes in input parameters, comparing
its response to known expectations.
4.4.1 Sensitivity to Input Parameters
A sensitivity test utilizing the simulation’s input parameters examines the accuracy of the simulation
over various conditions. For MASCS, the primary input parameters of interest are the number of
aircraft included in the mission and the number of available catapults. Within these tests, parameters
regarding agent motion and agent decision-making (including the Deck Handler planning routines)
are left unchanged. Performing this test requires an understanding of the performance of the real
flight deck under these conditions.
Information on flight deck performance was obtained from two sources. First, the CNA reports
referenced earlier also include some general “rules of thumb” concerning how often aircraft launch
and land, providing general guidelines as to the expected time to complete a launch event, the
general range of times that are possible, and how flight deck performance responds to changes in
the number of aircraft used in the mission. These reports indicate that launches on four-catapult
carriers occur every 30 to 60 seconds (Jewell, 1998), although it is unclear as to whether this figure
includes both “Surge” and non-combat operations. This value compares favorably to the mean
launch interdeparture data reported in Fig. 4-7 (a mean of roughly 56 seconds). Given a specified
number of aircraft on the flight deck, and assuming that the catapults maintain uninterrupted queues
throughout the course of the mission, these values can be used to estimate the total time required of
a mission. This suggests that a 20-aircraft mission should range from 10 to 20 minutes in duration,
plus some additional time for taxi operations at the start of the event.
A third perspective views the flight deck as a queueing system in which launches are allocated
evenly between the forward and aft catapults and each pair of catapults works as an independent
server with independent queues. Because each catapult pair alternates launches within itself, it behaves as a single serial server; the time to complete a mission is then just a linear function
of the average number of launches between the catapult pairs. The launch preparation time was
defined in Chapter 3, based on previous data, as N(109.6, 57.8²). If aircraft are evenly allocated
across catapult pairs, then the estimated time to complete a mission for a given number of aircraft
N_aircraft is N_aircraft · µ/2 (the number of aircraft times an average departure interval of 54.8 seconds). This
model assumes that the queues of aircraft at catapults exist through the mission and does not
account for taxi operations prior to the first launch. It is useful, however, for understanding how
the system should respond to changes in the number of aircraft within the mission: if the allocation
of aircraft remains balanced between catapult pairs, then the time to complete the mission should
respond linearly.
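As a worked example of this estimate, taking µ = 109.6 seconds and an even split of aircraft across the two catapult pairs, a 20-aircraft mission would require roughly
\[
T \approx N_{aircraft} \cdot \frac{\mu}{2} = 20 \times 54.8 \text{ s} \approx 1096 \text{ s} \approx 18.3 \text{ minutes},
\]
which sits at the upper end of the 10 to 20 minute heuristic quoted earlier, before adding any allowance for the initial taxi operations.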
When using fewer than four catapults, the ability to balance assignments on the flight deck is
disrupted, as one catapult “server” can now hold only two aircraft in queue at a time (as opposed to
four). This disrupts the process of allocations in the system and should have an effect on operations.
The end result is that, for a mission utilizing only three catapults, aircraft are overallocated to the
catapult pair and underallocated to the lone catapult. Assuming that all launches require the same
amount of time, this results in requiring up to two extra launches from the pair while the lone
catapult sits idle. Appendix G provides a more detailed discussion of the general process of queuing
aircraft at catapults over the course of a mission, addressing how aircraft are assigned to and queue
at catapults; a summary diagram of the resulting allocation appears in Figure 4-9. Additionally,
barring any significant flaws in Deck Handler assignment logic, a three catapult case should not
occur in less time than a four catapult case; doing so would require a significant lack of balance in
assignments between catapult pairs.
Figure 4-10 plots the mean mission duration values, ±1 standard deviation, for the mission
scenarios listed in Table 4.1 by number of catapults used. As described above, all reported data
suggest that the response of the simulation to changes in the number of aircraft should be linear
with slopes similar to the average interdeparture time and average launch preparation time, and that a small but consistent difference should exist between the three- and four-catapult missions. The
Figure 4-9: Notional diagram of allocations of aircraft to catapults in a 22 aircraft mission. Dots
represent assigned aircraft launches; in the image, catapults 3 and 4 are paired and launches alternate
between the two. Catapult 1 is inoperable, and catapult 2 processes all assignments in that section in series.
data reported in the figure comes from the same runs used to generate the launch interdeparture
data presented in Section 4.3.
Within the figure, linear fits of the data are overlaid and the corresponding R² values reported.
Statistical tests on each linear model (F-test) showed statistically significant results (F (1, 7) =
1743.53, p = 0.0001 and F (1, 7) = 773.90, p = 0.0001, respectively). Model coefficients (66.6 and
68.4 seconds for the three- and four-catapult missions, respectively) are also significantly different
from zero (full results for these tests appear in Appendix H) and are close to the mean interdeparture value from the simulation test data (average = 66.5 seconds) but slightly higher
than the values reported in the CNA data (heuristic of up to 60 seconds, reported interdeparture of
64.1 seconds) and that predicted by the simple queueing model (half the launch preparation time,
55 seconds). However, while there are some differences, they are relatively small.
A relatively consistent difference between launch event duration values for the three- and four-catapult missions was also expected. The results show a mean difference of 100.2 seconds, slightly
less than the mean launch preparation time of 109.8 seconds. Viewing results on a per-mission basis,
differences between three- and four-catapult missions ranged from 50 seconds (slightly less than half of one
launch) to 142 seconds (1.29 launches), well within the expected difference of zero to two launches.
Additionally, the average duration of three catapult missions at a given mission Density (e.g., 20
or 24 aircraft) is never smaller than that of the four-catapult mission. However, there does appear
to be some interesting variations in launch event duration between 20 and 24 aircraft, likely due
to how aircraft are placed on the flight deck. Using more than 20 aircraft requires that aircraft be
parked in the center of deck, eliminating some of the available taxi routes on the flight deck. These
routes continue to be eliminated as additional aircraft are added, requiring changes in how aircraft
are prioritized and missions are planned and executed. Past 26 aircraft, the changes in patterns
stabilize and return to a more linear trend.
The results presented in this section provide confidence that the simulation responds reasonably
with regard to changes in the input parameters of number of aircraft and number of available catapults.
The behavior of the simulation with regard to both is in line with prior data and the expectations of
SMEs, suggesting that the simulation reacts in a similar fashion to the real-world system. Although
the differences between three- and four-catapult missions did vary, at no time was a three-catapult
case observed to require less time than a four-catapult case, nor were any substantial and unexpected
changes in average launch event duration observed. More specifically, these results suggest that the
modeling of the launch preparation process, the interactions between aircraft and catapults, the taxi
routing process, and the catapult queueing processes that were all defined in Chapter 3 is accurate
and reasonable with respect to the output variables tested within this section. However, as with
the single aircraft calibration, sensitivity analyses of internal parameters should also be performed in
order to determine the robustness of mission performance with regard to those modeled parameters.
Figure 4-10: Plot of event durations versus number of aircraft, sorted by three-catapult (blue) vs. four-catapult (red) scenarios.
This is described in the next section.
4.4.2
Sensitivity to Internal Parameters
Sensitivity analyses of internal parameters within the model examine the internal validity and robustness of the simulation to errors in the parameter models. The model should not be overly
sensitive to errors in the definition of its parameters, given that the estimates of those parameters,
based on real data, are likely to be imperfect. For the MASCS model of flight deck operations,
the most important internal parameters include the launch preparation time (defined in Chapter 3,
Section 3.2.1), vehicle speeds, and the collision prevention parameters. The first two of these were
also tested for single aircraft in Section 4.2; collision prevention parameters could not be examined
because they require at least two vehicles to be present.
For launch preparation time, mean values are adjusted by ± 10%, while taxi speeds and collision
prevention thresholds are adjusted by ± 25%. As demonstrated in the previous section, the response
of mission event durations is linear with respect to the number of aircraft and the mean launch
preparation time. As such, a 10% change in the launch preparation time mean should lead to a 10%
change in mission behavior and be detectable through statistical tests. However, the structure of
queueing on the flight deck should limit the effects of taxi speed and collision prevention parameters.
Increasing the speed of transit only makes vehicles arrive at queues earlier and increases their wait
time at the catapult; it will have little meaningful effect on operations. Reducing speeds should
have some effect, but the reduction must be substantial enough to interrupt the queueing process at the catapults.
Similar effects should occur for the collision prevention parameters, with reductions in their values
removing constraints on vehicle motion and speeding their transit to catapults. Initial tests using
only 10% changes produced almost no detectable differences, whereupon the variation was increased to
±25%.
Table 4.2 provides a list of the mission scenarios tested in this sensitivity analysis. For this phase
of testing, only values of 22 and 34 aircraft were utilized; SMEs report that 22 aircraft is a typical
mission profile, while 34 aircraft is the largest value seen in prior CNA reports (Jewell, 1998; Jewell
et al., 1998). The table lists scenarios in the format (Number of Aircraft)-(Available Catapults)
while also denoting the parameter changes involved in each scenario; a notation of X% denotes
that the parameter was varied by both +X% and −X% in two separate scenarios. Each scenario
was replicated 30 times at both the +X and −X settings. Note that at no time was more than
one parameter varied during a run. Given the relative certainty in how launch preparation changes
would affect operations, these were only conducted for two mission profiles (22-3 and 34-4).
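As an illustration of the structure of this test matrix (not the actual MASCS test harness, whose implementation is not described at this level of detail), the scenarios in Table 4.2 could be enumerated as follows; the field names are hypothetical.

# Hypothetical enumeration of the Table 4.2 internal-parameter sensitivity scenarios.
profiles = [(22, 3), (22, 4), (34, 3), (34, 4)]     # (number of aircraft, available catapults)
launch_prep_profiles = {(22, 3), (34, 4)}           # launch preparation varied only for these two
REPLICATIONS = 30

scenarios = []
for (n_aircraft, n_catapults) in profiles:
    variations = {"taxi_speed": 0.25, "collision_prevention": 0.25}
    if (n_aircraft, n_catapults) in launch_prep_profiles:
        variations["launch_preparation"] = 0.10
    for parameter, pct in variations.items():
        for sign in (+1, -1):                       # +X% and -X% are run as separate scenarios
            for rep in range(REPLICATIONS):
                # Only one parameter is varied in any given run.
                scenarios.append({"aircraft": n_aircraft, "catapults": n_catapults,
                                  "parameter": parameter, "delta": sign * pct,
                                  "replicate": rep})

print(len(scenarios))   # 600 runs: (4 profiles x 2 params + 2 profiles x 1 param) x 2 signs x 30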
Fig. 4-11 provides the results of these tests in the form of a “tornado” chart. Each set of bars
reflects the plus and minus deviations of a specific parameter for a specific mission, with results ranked
in terms of the greatest difference between plus and minus values. From this chart, it can be seen that the
changes in response to the launch preparation variations are near ±7%, smaller in magnitude than the changes
applied to the corresponding internal parameter, demonstrating a slight dampening effect. A review of
the output launch preparation time values demonstrated that, due to the bounding of values in
the random number generator, output values were shifted by roughly ±7% rather than ±10%. This means that
even though the stochastic launch preparation time generator did not provide the expected values,
the simulation responded appropriately to the values that were generated.
ANOVA tests reveal that significant differences exist for the launch preparation time tests at the
22-3 (F (2, 87) = 29.1344, p = 0.0001) and 34-4 (F (2, 87) = 34.4374, p = 0.0001) cases, reflecting
the significant effects of the launch preparation time in driving flight deck performance. Tukey
post-hoc tests show that significant pairwise differences exist across all three cases for each mission
setting at p = 0.0025 or less (full results are available in Appendix H). From the figure, the effects of
launch preparation on operations are clearly larger than the effects of changing the taxi and collision
prevention parameters, even though the changes to launch preparation time were smaller (in terms
of percent differences) than the changes to the speed and collision prevention parameters (10% vs. 25%).
Table 4.2: Matrix of scenarios tested during internal parameter sensitivity testing.

                 3 catapults                          4 catapults
22 aircraft      Taxi±25%, CA±25%, Launch±10%         Taxi±25%, CA±25%
34 aircraft      Taxi±25%, CA±25%                     Taxi±25%, CA±25%, Launch±10%
Figure 4-11: Results of final sensitivity analysis. Labels indicate number of aircraft-number of
catapults, the parameter varied, and the percentage variation of the parameters. Dagger symbols
(†) indicate an inverse response to changes in parameter values is expected. Asterisks (*) indicate
significant differences in ANOVA tests.
Significant results also appear for changes to aircraft taxi speed at the 34-3 (F (2, 87) = 4.9753, p =
0.009) and 34-4 settings (F (2, 87) = 3.8031, p = 0.0261). Tukey post-hoc tests reveal that, for both
of these cases, reductions in taxi speed provide significantly different results from increases in taxi
speed (p = 0.022 and p = 0.007, respectively), but neither is significantly different from the mean in
either case (full results again appear in Appendix H). These results show just how weak the effect of
taxi speed is on operations: even with changes of ±25% to the parameter, significant
differences exist only between the extremes. As described previously, this is likely an effect of the
launch preparation process regulating the flow of traffic on the flight deck. Interestingly, because
taxi speed only had an effect in the 34-aircraft cases, the effects of speed may accrue
over increasing numbers of aircraft. Alternatively, it may indicate that the way
missions evolve at the 34-aircraft level (more aircraft, fewer available taxi paths, and a difference in
the order in which areas are cleared) is more sensitive to vehicle transit speeds.
Tests for the effects of changing the collision prevention parameters show significant results only
at the 22-4 (F (2, 87) = 4.6278, p = 0.0123) and 34-4 (F (2, 87) = 5.4986, p = 0.0056) cases. Tukey post-hoc
tests (full results in Appendix H) show that for both cases, decreasing the collision prevention parameters significantly decreases mission completion times. For the 22-4 results, the collision prevention
-25% case is significantly different from the +25% case (p = 0.0116) and marginally significantly different from
the baseline (p = 0.0855), while for the 34-4 mission, the -25% collision prevention case is different
from both the +25% case (p = 0.0014) and the baseline (p = 0.0176). This suggests that when using
four catapults, in which more aircraft are moving on the flight deck, the current collision prevention
settings may slightly overconstrain operations. Reducing these constraints and providing aircraft
more freedom of motion provides slight benefits to operations, while increasing
these constraints appears to provide no further penalty, indicating some robustness
in the system. However, because the results of the earlier input sensitivity and launch interdeparture
tests were accurate, the baseline collision parameter settings were left unchanged.
The results described within this chapter thus far provide confidence that the MASCS model of
aircraft carrier flight deck operations is a reasonable representation of operations, partially validated
in terms of the output variables of launch interdeparture and launch event duration values and
its responses to changes in internal and input parameters. The system has been shown to respond
reasonably to changes in taxi speed, collision prevention parameters, and the launch
preparation. The latter produces an almost linear response to changes in the launch preparation
time distribution, as expected. The effects of the other parameters are limited due to the structure of
operations: parameters that increase the rate of taxi operations have no effect, as they are bounded
by the final launch tasks at catapults. Parameters that decrease the efficiency of operations have limited
effects that arise from specific interactions with system conditions, and these effects can be easily explained.
These results provide evidence that the agent and task models that comprise MASCS are accurate
and respond appropriately to system inputs, but as noted at the start of this chapter, models for the
Deck Handler and other human elements in the system cannot be fully validated at this time due to
a lack of data. Additional data collection at a much higher level of detail than currently performed
must occur to capture details about individual behaviors and decision-making on the flight deck
to ensure that agent behaviors are accurately captured. In light of these limitations, the accuracy
of these behaviors can also be reviewed by SMEs familiar with deck operations to critique their
representations at a high level. The next section describes the SME review that occurred during the
MASCS validation process.
4.5
Subject Matter Expert Review
After concluding the sensitivity analyses, the simulation environment and test results were presented
to a set of Subject Matter Experts (SMEs) for review. SMEs attending this meeting had a range
of experience on the flight deck, each over multiple years (two pilots each with over 800 flight hours
in fighter aircraft; another individual with over 3,000 hours as a Naval Flight Officer and 2 years
as a carrier-based launch and recovery officer). All were employed by Navy research organizations at
the time of the meeting. During the meeting, the simulation (which displays a live animation of the
evolution of the flight deck environment) was presented first to SMEs for their critiques. SMEs were
shown live executions of the MASCS environment using missions of 22 and 34 aircraft and using
either three or four catapults, allowing the simulation to execute completely.
SMEs were specifically asked to review all aspects of the simulation as it played out — the
accuracy of movement of the crew, their alignment to aircraft, the motion of aircraft on the flight
deck and how their paths are deconflicted, and the assignment of aircraft to catapults. They were
also asked to provide comments on any other aspects of the simulation outside of those factors.
The simulations were often paused or restarted in order for the SMEs to provide comments on what
was currently happening or to get a closer look at what had just occurred. This continued until
participants had no remaining comments, at which point the discussion shifted to the results
of validation testing described in Section 4.3. These results were presented in a series of slides to
the SMEs.
The SMEs agreed that, qualitatively, the animations of the simulation were accurate overall. The
SMEs expressed concern over what they considered to be large launch event duration values, but were
not overly worried given observations of the rest of the simulation. The SMEs also admitted that
they did not have a good understanding of how mission durations varied between combat “Surge”
operations, which occur at a higher tempo due to faster launch preparations, and the majority
of slower tempo, more typical operations. They also commented that they were largely biased toward
thinking in terms of Surge operations, but agreed that given the accuracy of the motion of crew,
aircraft, and the assignment of aircraft to catapults, the model should replicate Surge operations
with the correct (faster) launch preparation model.
In terms of sensitivities, SMEs explained that they had relatively poor mental models of these
effects. Aircraft only operate at one taxi speed, and they have no real understanding of how taxi
speeds might affect mission operations. In terms of the collision prevention parameters, aircraft
directors are all typically very risky when navigating aircraft. They described Directors as having
very little variation in how conservatively they navigated aircraft on the flight deck, but did agree
with the explanations of the sensitivity analyses when presented with that data. Additionally, they
also realized they had no real understanding of how the number of catapults affected the time to
complete the launch event: what is available is used, and extra catapults are thought to help clear
more space on deck.
That SMEs do not notice a difference is supported by the results of tests in Section 4.4. The
difference in performance between the three and four catapult missions averaged less than two
minutes in those tests, a difference not likely large enough to be noticed by crew. Crew also noted
that they do not often consider the effects of including more aircraft in the mission, as this number
is not really an option for the deck crew. Even so, conceptually, the SMEs did not disagree with the
results and approved the simulation as an accurate replication of flight deck operations.
4.6
Discussion
The testing described in this chapter has demonstrated that the MASCS model was able to adequately replicate launch interdepartures on the flight deck while also exhibiting responses to
changes in parameters in line with expectations, providing confidence that the MASCS model is a
reasonable model of flight deck operations. As such, it suggests that the modeling techniques used
in creating MASCS are also reasonable and are suitable for use as templates for generating models
of the unmanned vehicle control architectures and their integration into flight deck operations. The
results of the testing discussed in this chapter also provide insight into the important parameters concerning UAV implementation, as well as the use of SMEs in analyzing system performance. However,
as noted at the beginning of this chapter, insufficient data is available to suggest that the simulation
model is fully validated. The data used in this testing process has been used to successfully calibrate
the model, but a variety of aspects of the environment have not yet been addressed.
Further validation will not be possible until a much greater amount of detailed data is available
for operations. The most pressing need is data on the activity of individual crew on the flight deck,
including the Deck Handler and his planning strategies, as well as the individual interactions between
Aircraft Directors and aircraft and the routing topology they use on the deck. While the heuristics
that govern these processes were based on discussions with subject matter experts, training manuals,
and observations of operations, they can only be considered as representative models of operations.
Substantial variability may exist within these behaviors that has not been captured in the MASCS
model at this time. Even so, the results of calibrating this model suggest that it is useful in exploring
the sensitivity of the flight deck environment to vehicle behaviors as modeled in MASCS.
In particular, the sensitivity tests presented in Section 4.4 suggest that the main drivers of
performance, in terms of mean event duration, lie in the launch process and not in any parameters
relating to aircraft motion on the flight deck. However, changes to movement speed and collision
prevention (a vehicle’s freedom of motion) may still have an effect if they can disrupt the queueing process
at the catapults. This is one of several potential tradeoffs that exist within the system. Increasing
taxi speeds in fully autonomous environments may be possible, but will be of little utility if launch
preparations are not changed. Likewise, the benefits of greatly reducing launch preparation times
may be lost if vehicles cannot arrive at catapults at the proper times. Given the Navy’s goal to put
UAVs on carriers, it is unclear how UAVs will perform in this current environment, or what changes
will be needed to accommodate their inclusion. Given the sensitivity of the launch preparation time
parameter, these results suggest that launch preparation capabilities should be a primary concern
for UAV developers in terms of aircraft carrier integration.
Interviews with SMEs revealed important limitations on their knowledge of the system and their
mental heuristics. Their bias towards combat operations was already discussed and is not surprising,
given human nature and the impact of those memories. However, the limitations of their
knowledge of system sensitivity perhaps reveal some misunderstanding of flight deck dynamics. In
conversations, a variety of SMEs noted that extra catapults are always used in order to clear space
on the flight deck, with little understanding of how this actually affects flight deck performance. The
MASCS environment demonstrates that using the extra catapults provides little practical benefit in terms of
completing launch operations while increasing congestion on the flight deck, which should also be reflected in
significant changes in the safety metrics that will be examined later. The results of the testing at this
time suggest that it may be more beneficial, and perhaps safer, to commit resources to speeding the
launch process rather than to include an additional catapult. This is an interesting, and unexpected,
result of the simulation validation process that will be examined further in later chapters. This result
also demonstrates that the MASCS simulation may be useful in further examining the failures of
SME heuristics in understanding both current operations and future ones involving UAVs.
4.7
Chapter Summary
This chapter has described the validation testing process for the Multi-Agent Safety and Control
Simulation (MASCS) simulation environment. This chapter began with a discussion of the process
of validation and how it occurs for agent-based models. The chapter then described the phases
of simulation validation testing, beginning with the calibration of the aircraft agent model before
proceeding to validation testing of full missions utilizing 18 to 34 aircraft. In each section, the
results of both internal and external validation testing were discussed, describing the robustness of
the simulation to changes in different parameters and its ability to accurately replicate empirical
data from real-world testing. A final section described a review of the simulation by Subject Matter
Experts (SMEs) that deemed the behaviors of individual crew, aircraft, and the Deck Handler
planning to be accurate.
The results of this chapter all provide confidence that the MASCS simulation model is an accurate
representation of aircraft carrier flight deck operations. However, because data is not available for
all agents and for all aspects of flight deck operations in sufficient detail to validate all aspects of
the model, it cannot be claimed that MASCS is fully validated. At this time, it can only be considered
a partially-validated model calibrated against limited data on flight deck operations. However,
performance in the validation testing was sufficient to suggest that the MASCS model is ready for
the investigation of adding futuristic unmanned vehicles, new safety protocols, and changes to the
organizational structure of operations. Chapter 5 describes the experimental test program defined
for the MASCS simulation environment, including the unmanned vehicle control architectures and
safety protocols of interest, how they are modeled, and their effects on the safety and productivity
of flight deck operations.
Chapter 5
MASCS Experimental Program
“The Three Laws of Robotics: 1: A robot
may not injure a human being or, through
inaction, allow a human being to come to
harm; 2: A robot must obey the orders
given it by human beings except where
such orders would conflict with the First
Law; 3: A robot must protect its own
existence as long as such protection does
not conflict with the First or Second Law.”
I, Robot (1950)
Isaac Asimov
The previous chapter described the calibration and validation testing processes of the Multi-Agent Safety and Control Simulation (MASCS) model of aircraft carrier flight deck operations,
demonstrating that it is a reasonable replication of the real world system. The MASCS model
was assembled by modeling aircraft, crew, and supervisory staff as independent decision-making
entities, initialized and allowed to interact with the environment on their own through independently-modeled tasks executed by these agents. The process of creating these agent models was described
in Chapter 3, which also provided background on flight deck operations. This chapter describes how,
within the partially-validated MASCS environment, future unmanned vehicle control architectures
and their integration into flight deck operations can be modeled and evaluated over a variety of
operational conditions. This chapter also describes different forms of safety protocols that are
applicable to flight deck operations, how they can be integrated into the simulation environment
independent of the unmanned vehicle models, and what their effects on operations might be.
A test matrix is presented, which details the experimental runs that will be used to explore the
effects of different Control Architectures and Safety Protocols on flight deck performance and how
these are influenced by both the total number of aircraft included within a mission and the number of
Unmanned Aerial Vehicles (UAVs) within that total. This description of the experimental protocol
also includes the definition of dependent variables that describe both safety and productivity on
the flight deck. This chapter concludes with a review of the results of this experimental program.
The first section of this chapter begins with a description of the experimental matrix to provide an
overview of the test program; the elements of this matrix are described in subsequent sections.
5.1
Experimental Plan
The experimental plan is designed to understand how the interactions between UAV Control Architectures (CAs), operational Safety Protocols (SPs), the Density of mission operations (number of
aircraft utilized), and the Composition (the percentage of the mission that is unmanned or autonomous)
affect the safety and productivity of aircraft carrier flight deck operations. These four variables are
crossed in order to explore all possible interactions between these conditions. Control Architectures
and Safety Protocols are the most important independent variables of interest and are the main
focus of the test program. The Control Architecture factor includes six levels, modeling the control
architectures of Manual Control (MC) (partially validated in the previous chapter), Local Teleoperation (LT), Remote Teleoperation (RT), Gestural Control (GC), Vehicle-Based Human Supervisory
Control (VBSC), and System-Based Human Supervisory Control (SBSC).
Each control architecture involves changes both to individual vehicle behavior (the ability to execute
tasks and observe crew) and to how operations are conducted (how plans are generated and how taxi commands are issued), both of which may affect operational performance. Four
different Safety Protocols are tested, moving from unconstrained (Dynamic Separation) to moderately constrained (Temporal or Area Separation) to highly constrained (Temporal+Area Separation).
These Safety Protocols force adjustments to the planning and routing of aircraft on the flight deck,
altering when and how many unmanned vehicles are active at any given time.
The mission conditions will also affect how Safety Protocols are employed and cause further
changes to mission operations. In particular, as described in Chapter 3, taxi patterns become more
constricted and one or more catapults may not be accessible at increased mission Density. This may
have interesting interactions with the CAs and SPs defined in the previous section. Density is tested
at two levels: 22 (an average mission) or 34 (the largest typical mission). Lastly, the percentage of
UAVs within the mission has operational implications for how the Safety Protocols are defined
for operations and also causes fundamental changes in how the mission evolves. Composition is
tested at three levels: 100% (all UAV, examining futuristic operations), 50% (even mix), and 25%
(an early, conservative integration of vehicles). Figure 5-1 presents the experimental matrix that
results from this cross of variables.
Figure 5-1: Experimental Matrix for MASCS evaluation.
The matrix is not a full cross, as the Area, Temporal, and Temporal+Area protocols are not
applicable at the 100% Composition level (these protocols require heterogeneous vehicle types within
a mission; this is explained in the next section). In total, the matrix includes 90 test conditions, each of which
will be replicated thirty times in order to explore the stochastic distributions in the system. Two
additional runs examining Manual Control vehicles at the 22 and 34 aircraft levels are also part of
the data set but were already conducted in Chapter 4.
5.1.1
Safety Protocols
As discussed in Chapter 2, Safety Protocols (SPs) describe explicit changes to rules and procedures
within an environment in order to minimize the number of unsafe interactions between agents in
the system. For the Heterogeneous Manned-Unmanned Environments (HMUEs) that are the focus
of MASCS, this means changes that limit the physical interaction of unmanned aircraft, manned
aircraft, and crew on the flight deck. Chapter 2 also provided three main classes of safety protocols
that are applicable to unmanned aircraft on the flight deck environment. They are described in the
following sections.
Dynamic Separation Only
Dynamic Separation (DS) protocols attempt to maintain separation between vehicles during transit
operations; while vehicles are moving, these protocols act to ensure that they do not come too
close to one another by maintaining a minimum safe distance x between them. Dynamic
Separation protocols are utilized within both manned and unmanned vehicle systems, and a version
was instantiated for the original MASCS model (see Chapter 3, Section 3.2.1). This original collision
prevention routine worked as follows: for a vehicle Vi that is in motion, stop if any vehicle is within
a distance df forward or distance dl laterally, where df = one body length and dl = one body width.
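A minimal sketch of one reading of this rule, treating the forward and lateral distances as a rectangular buffer around the moving vehicle, is shown below; the coordinate transform and variable names are assumptions for illustration, not the MASCS implementation.

import math

def should_stop(ego_x, ego_y, ego_heading_rad, other_positions, body_length, body_width):
    """Dynamic Separation check: halt the moving vehicle if any other vehicle lies
    within one body length ahead or one body width to either side (simplified 2D)."""
    for (ox, oy) in other_positions:
        dx, dy = ox - ego_x, oy - ego_y
        # Express the other vehicle's position in the ego vehicle's body frame.
        forward = dx * math.cos(ego_heading_rad) + dy * math.sin(ego_heading_rad)
        lateral = -dx * math.sin(ego_heading_rad) + dy * math.cos(ego_heading_rad)
        if 0.0 <= forward <= body_length and abs(lateral) <= body_width:
            return True   # conflict detected: stop until it clears
    return False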
This same logic is used for each of the UAV control architectures described later, including
utilizing the same thresholds defined earlier for manual control operations. Implicit in this choice is
that crew and pilots on deck have the same level of trust in the UAVs that they do in other vehicles;
otherwise, it would be expected that the minimum safe distance for UAVs should be increased by
some amount. This assumption of equal trust in the UAVs is the basis of many assumptions in
the Control Architecture models discussed later — it allows the testing to focus on
the actual structural features of the architectures and to minimize or eliminate the effects due to
environmental or crew social factors. The Dynamic Separation protocol is always active in the
system, regardless of whether any other safety protocols are applied, and is applicable to all vehicle
models. Additionally, because it only affects an individual vehicle performing a specific task (and
does not affect planning or traffic routing on deck), it provides the fewest constraints on operations.
Area Separation
Chapter 2 previously defined the Area Separation protocol as enforcing the constraint that agents
of a certain class must remain within their designated area on the flight deck. The intent of such
Area Separation (A) protocols is to isolate the believed “unsafe” unmanned vehicles from manned
vehicles. In a more mathematical form, the working environment is divided into N regions, and
members of a group Gk must remain within their assigned region j, where j ∈ {1, . . . , N}. This corresponds to a set of
constraints in planning: when allocating members of group Gk to resources in the environment, only
assignments to resources within the corresponding region j are allowable. Operations within each
region are independent and can occur in parallel. However, because the intent of the protocol is to
separate groups of vehicles, there must be at least one vehicle from two or more groups present in
order for the protocol to be meaningful. As such, once the vehicles in the environment
are all from the same group, the constraints on operations can be relaxed.
In the strictest form, this means that members of every vehicle group Gk should remain within
their defined region j at all times. In flight deck operations, this would require dividing the carrier
flight deck into regions, altering initial parking configurations to place aircraft in the correct regions,
and ensuring that each region has access to at least one launch catapult. Because the experimental plan will
only address missions using two classes of vehicles (manned and one type of unmanned), only two
regions on the deck are required. Given that there are two sets of launch catapults (one forward, one
aft), segregating the deck into forward and aft halves is a sensible approach. Figure 5-2 demonstrates
how this might occur and how operations might evolve.
Manned aircraft (blue areas) will begin parked in the forward section of the flight deck, as crew
and the Handler would likely have higher trust in their ability to navigate the high-density Street
area of the deck. Unmanned aircraft (yellow areas) would then be placed in the more open Fantail
area in the aft. Initial parking configurations are altered to ensure aircraft start in their correct
areas, with priority orderings properly balanced to ensure that both groups are assigned in parallel
and that taxi paths within regions do not conflict with one another. In making these changes, as few
alterations as possible were made from the original MC validation scenarios in order to maintain as
much consistency as possible across initial conditions. The Deck Handler planning routine is altered
to include the constraints on assignments described above: unmanned aircraft are only assigned to
aft catapults (numbers 3 and 4) and manned aircraft only to the forward catapults (numbers 1 and 2),
so long as at least one vehicle of each class exists on the flight deck.
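A sketch of the resulting assignment constraint is shown below (hypothetical function and argument names; the actual Handler planning heuristics are described in Chapter 3).

FORWARD_CATAPULTS = {1, 2}
AFT_CATAPULTS = {3, 4}

def allowed_catapults(is_manned, n_manned_on_deck, n_unmanned_on_deck):
    """Area Separation: manned aircraft launch only from the forward pair and
    unmanned aircraft only from the aft pair, until one class is exhausted."""
    if n_manned_on_deck == 0 or n_unmanned_on_deck == 0:
        return FORWARD_CATAPULTS | AFT_CATAPULTS    # only one group remains: constraint relaxed
    return FORWARD_CATAPULTS if is_manned else AFT_CATAPULTS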
Temporal Separation
Chapter 2 previously defined the Temporal Separation protocol as enforcing the constraint that only
one group of agents is allowed to be active at any given time during operations. The intent is to
isolate groups of vehicles from one another, just as in the Area Separation protocol, but to do so
while retaining access to the full environment. For this protocol, vehicle groups are scheduled in
a pre-defined order, beginning with G1, then G2, and so forth through Gk. At any given
time, aircraft from only a single group are allowed to taxi on the flight deck. Once all members
of the current group complete taxi operations, the next group is allowed to begin taxi operations.
Figure 5-3 diagrams how this might be applied to flight deck operations. A logical implementation
of this protocol would involve operating manned aircraft (blue) first in the mission, followed by
unmanned aircraft (yellow) for the same reasons as in the Area Separation protocol. Although crew
should trust the UAV systems, given the choice, they would likely prefer that manned aircraft are
active first due to the high density of aircraft on the flight deck. Crew would also likely again
prefer that manned aircraft begin the mission parked in the congested, high-traffic Street area in
the forward section of the deck.
Figure 5-2: Example of the Area Separation protocol assuming identical numbers of UAVs and
aircraft. In this configuration, all manned aircraft parked forward are only assigned to the forward
catapult pair, while all unmanned aircraft parked aft are only sent to the aft catapults. When one
group finishes all its launches, these restrictions are removed.
Figure 5-3: Example of the Temporal Separation protocol assuming identical numbers of UAVs and
aircraft. In this configuration, all manned aircraft parked forward are assigned to catapults first but
are allowed to access the entire deck. After all manned aircraft reach their assigned catapults, the
unmanned vehicles are allowed to taxi and are also given access to the entire deck.
Integrating this into the MASCS simulation environment also requires modifications to the initial
conditions of test scenarios and the Handler planning routine. In the initial conditions, manned
aircraft must be parked such that they can easily taxi to catapults without requiring any of the
UAVs to be relocated. This moves the manned aircraft into the interior parking spaces on the
flight deck and places the UAVs on the edges and aft areas. Changes to the Deck Handler planning
heuristics enforce the constraint that manned aircraft should be assigned first; when the number
of manned aircraft without assignment reaches zero and all manned aircraft have arrived at their
assigned catapults, unmanned aircraft may then be assigned.
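The gating condition this implies might be sketched as follows (simplified; the dictionary fields are assumptions for illustration).

def may_assign_unmanned(manned_aircraft):
    """Temporal Separation: unmanned aircraft receive catapult assignments only after
    every manned aircraft has been assigned and has arrived at its catapult."""
    return all(a["assigned"] and a["at_catapult"] for a in manned_aircraft)

# Example: one manned aircraft is still taxiing, so the UAVs continue to hold.
manned = [{"assigned": True, "at_catapult": True},
          {"assigned": True, "at_catapult": False}]
print(may_assign_unmanned(manned))   # False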
Combining Protocols
It is also possible to combine the Area and Temporal Separation protocols, although this is likely
to overconstrain activity on the flight deck. Employing these simultaneously means that one group
of vehicles will launch first from one catapult pair, after which the remaining group will have full
access to the flight deck. During the initial phase of these combined Temporal+Area missions, the
fully-functioning aft catapults do not receive aircraft assignments; manned aircraft are only routed
to the forward pair of catapults. This means that during manned operations, the flight deck is
operating below its full capacity, as half the available catapults are not used. This should then
increase the total launch event duration as compared to the other safety protocols.
A reasonable method of implementing this follows from the previous descriptions of Area and
Temporal Separation: manned aircraft are parked forward, with UAVs parked aft in the last areas to
be cleared (Figure 5-4). Manned aircraft are operated first, being sent only to the forward catapults;
once the last manned aircraft is close to its destination catapult, the UAVs would be allowed to
begin operations and have access to the entire flight deck. This should minimize interactions amongst
aircraft and increase overall safety, but it will likely cause significant increases in total launch event
duration as opposed to the other safety protocol options. Additionally, the combined T+A should
also be highly sensitive to the number of manned aircraft included in the mission: the larger this
complement, the longer the deck remains in an inefficient state (as no assignments are made to the
aft catapult pair).
Given this, there are four distinct Safety Protocol cases that exist: Dynamic Separation (DS),
Area Separation (A), Temporal Separation (T), and Temporal+Area Separation (T+A), each of
which is applicable to any type of UAV system. However, it should be noted that the A, T, and
T+A protocols are only meaningful when the mission includes a heterogeneous set of vehicles. In the
MASCS testing program, the effectiveness of these protocols will be examined in terms of different
mission profiles and in terms of their interactions with the Control Architectures for the UAV
systems. As described in Chapter 2, five primary types of UAV Control Architecture exist, ranging
from simple teleoperation systems to more advanced supervisory control systems. The next section
characterizes how each of these five architectures are modeled within the MASCS environment as
modifications of the original Manual Control model presented in Chapter 3.
5.1.2
Control Architectures
Control Architecture (CA) is the second major factor in the experimental matrix in Figure 5-1, nested within Safety Protocols on the vertical axis. As described in Chapter 2, the Control
Architecture (CA) of an unmanned vehicle system refers to the way(s) in which human operators
provide control inputs to those vehicles. This was previously characterized in terms of operator
location, operator control inputs, and the modality of operator input, as shown in Figure 5-5. The
figure is architected such that, from left to right, the types of control inputs provided by the operator
become more abstract. For manual control and teleoperated systems, operators provide inputs
physically to control linkages in the system, manipulating the control yoke and rudder pedals to
direct the vehicle or moving a throttle to control speed. For gestural systems, operators command
an action to be executed (to turn, slow down, or stop) through physical gestures, which the vehicle’s onboard autonomy translates into control inputs.
Figure 5-4: Example of the combined Temporal+Area Separation protocol assuming identical numbers of UAVs and
aircraft. In this configuration, all manned aircraft parked forward are assigned to catapults first
but are only allowed to taxi to forward catapults. After all manned aircraft reach their assigned
catapults, the unmanned vehicles are allowed to taxi and are given access to the entire deck.
For supervisory control systems, operators
command more abstract goals (such as “taxi to this location”) through symbolic interactions in a
Graphical User Interface (GUI). These symbolic interactions are then decomposed into a set of more
discrete actions, which are then further decomposed into control inputs.
In the context of flight deck operations and models of agent behavior, the important activities
of a pilot that must be replicated by an unmanned vehicle control architecture can be summarized
in terms of three key aspects: visual detection of an Aircraft Director that is attempting to provide
instructions, correctly interpreting task instructions, and correctly executing the commanded tasks.
Each of these corresponds to a related variable: the probability of observing the Director if within
the vehicle’s field of view, the probability of correctly interpreting the command, and the probability
of correctly executing a task. For manual control, the likelihoods of these failures are so low that
subject matter experts would consider them nonexistent (0% likelihoods). A fourth difference in
behavior comes from the fact that the Director is providing instructions in this “zone coverage”
routing; this may not occur for some forms of UAV control, and thus how vehicles are routed forms
a fourth behavioral variable.
Two other variables also affect vehicle behavior, relating to the physical characteristics of the
vehicle: the field of view of the pilot/aircraft, which describes where the Director should be in
order to provide commands, and the latency in processing commands. The field of view for manual
operations is related to human sight range and the constraints of being in the cockpit; for unmanned
vehicles, field of view is related to the cameras used on the vehicle. The latency in processing
commands could come from a number of sources: delays in the communication of signals between
operator and vehicle, the time required for the vehicle’s computer system to process commands, or the
time required for the human operator to interact with the system.
Figure 5-5: Variations between control architectures, categorized along axes of operator location,
operator control inputs, and operator input modality.
Each of these can also be modeled as variables within the MASCS simulation environment, and a full list of these six design variables appears
below.
1. Vehicle Behavior variables
(a) Routing method: the topology of the flight deck and how aircraft are routed. Uses
either crew “zone coverage” methodology or central operator control.
(b) Visual sensing and perception: the probability of failing to detect an operator when
they are within the field of view. Applicable to MC, LT, RT, GC architectures. Bounded
[0, 1.0]; applied to any task commanded by an Aircraft Director (move forward, turn,
etc.).
(c) Response selection: the probability of failing to select the correct response given an
issued command. Applicable to MC, LT, RT, GC architectures. Bounded [0, 1.0]; applied
to any task commanded by an Aircraft Director (move forward, turn, etc.).
(d) Response execution (catapult alignment): the probability of failing to properly
align to a catapult. Applicable to all architectures. Bounded [0, 1.0]; applied to final
motion task to travel on to catapult.
2. System Design variables
(a) Field of View: visual field of the pilot and/or cameras onboard the vehicle. Applicable
to MC, LT, RT, GC architectures. Bounded [0°, 360°].
(b) Latencies and delays: delays that exist in operations. Applicable to RT (communications latency), GC (processing delays and time penalties due to failures), and VBSC
architectures (operator interaction times).
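One way these six design variables might be organized per architecture is sketched below; the container and field names are assumptions for illustration only, not the MASCS data model, and the values themselves are discussed in the remainder of this section.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ControlArchitectureModel:
    """Hypothetical container for the six design variables listed above."""
    routing: str                         # "zone coverage" (crew-directed) or "central" (operator/planner)
    p_sense_fail: Optional[float]        # P(fail to detect a Director in view); None if not applicable
    p_select_fail: Optional[float]       # P(fail to select the correct response); None if not applicable
    p_align_fail: float                  # P(fail to align to the catapult)
    field_of_view_deg: Optional[float]   # pilot or camera field of view, in degrees
    latency_source: Optional[str]        # e.g., communications lag, gesture processing, operator interaction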
Comparing the unmanned vehicle control architectures to the baseline manual control model,
Local Teleoperation (LT), Remote Teleoperation (RT), and Gestural Control (GC) systems are the
most similar to current operations: each still receives assignments from the Handler, who passes
these instructions to crew that relay taxi instructions to the vehicles through physical gestures. LT
and RT both utilize a remote human pilot to recognize and interpret those commands, while GC
replaces the pilot with a gestural-recognition system capable of identifying directors and interpreting
their commands. Modeling these systems only requires creating new vehicle models to replace the
original manually controlled F-18 models from Chapter 3. The two remaining Human Supervisory
Control (HSC) systems, VBSC and SBSC, require new vehicle models as well, but they also require
architectural changes to flight deck operations. Both of these systems are centrally controlled and
neither requires interactions with the crew for taxi instructions. Commands are instead provided by
a single operator through the central system.
For the VBSC system, assignments and taxi commands would be issued by the Deck Handler
through a GUI that allows them to issue catapult assignments and provide taxi waypoints to vehicles
individually. In such a system, the Handler behaves as a server, processing requests for instructions
from vehicles serially. Past research suggests that operator performance degrades as the number of
vehicles controlled in a VBSC system increases; as such, it would be likely that the VBSC operator
would only be given control over the VBSC vehicles, and manned aircraft would operate with crew as
they always have. For an SBSC system, a central planner would generate assignments for all aircraft
at once and issue them in parallel, no longer requiring the Handler to service vehicles individually.
However, optimization routines that work in this fashion work best when given control over all
elements of the system. For the SBSC model, manned aircraft would be included in the planning
system and would no longer be under the control of the Aircraft Directors on the flight deck. These
structural changes for the VBSC and SBSC systems have other repercussions as well, as will be
discussed later.
Each of these UAV control architectures can be modeled through changes to the variables listed
above, an overview of which appears in Figure 5-6 and will be elaborated on through the remainder
of this section. The first four lines of the figure describe the four major behavioral variables related to
human pilot operations; the last two lines describe the physical limitations of the control architectures
as they work in the world. In generating parameter values and logic routines, as much as possible, the
definitions were based on prior research work in unmanned vehicle development; however, the novel
nature of the aircraft carrier domain means that a variety of assumptions must be made in order
to provide specific values to vehicle parameters. As described earlier, many of these assumptions
reflect a “reasonable best-case scenario” for the unmanned systems, assuming that they are mature
technologies being used by expert users that are proficient in their operation and whose colleagues
trust their capabilities. These assumptions attempt to ensure a fair evaluation of the physical and
technological systems and attempt to avoid penalizing systems for perceived human social factors
such as distrust, dislike, or other issues.
Figure 5-6: Variations in effects of control architectures, categorized along axes of routing method,
sensing and perception failure rate, response selection failure rate, response execution failure rate,
field of view, and sources of latency.
Visual Acquisition Parameters
One of the key requirements of pilots in flight deck operations is to quickly visually identify the
Aircraft Director who is currently providing instructions: sense and perceive nearby Directors,
comprehend the instructions they provide through physical gestures and select the appropriate
response, then execute that response. For the non-HSC UAV models (LT, RT, and GC), each
aspect of this process of Sensing and Perception, Response Selection, and Response Execution may
be affected by the properties of the control architecture as defined. Each action may have a higher
error rate than that observed for manual control. Each error may also have an associated time
penalty. Additionally, visual identification may not occur as quickly for computer systems or remote
pilots as for pilots seated within an aircraft.
Local Teleoperation (LT), RT, and GC systems all rely heavily on the use of cameras to observe
the outside world, with previous studies reporting that LT and RT operators experience a restricted
view of the environment that is detrimental to performance. This constraint on field-of-view should
also be included within the MASCS system. The research reported previously in Chapter 2 utilized
cameras and displays ranging from 40◦ to 257◦ in field of view, while the stereo camera used by
Song et al. has models ranging from 43◦ to 100◦ (Song et al., 2012). Given that the manual control
case was calibrated to 85◦ in earlier testing, a smaller value will be used for the UAV systems; the
median stereo camera field-of-view of 65◦ is chosen here. Operationally, this means that Aircraft
Directors will have to move slightly closer to the centerline of LT, RT, and GC aircraft to be visible
during operations. However, the same alignment logic used previously is still applicable; all that
is required is an adjustment of the aircraft model’s field-of-view parameter. Vehicle-Based Human
Supervisory Control (VBSC) and System-Based Human Supervisory Control (SBSC) systems do
not utilize cameras for processing tasks or relaying information to a remote operator, implying that
this limitation in camera field-of-view has no effect on their operations and is not necessary in the
modeling.
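The visibility test this implies is a simple angular check; a sketch under these assumed geometries (not the MASCS code itself) is shown below.

import math

def director_visible(vehicle_heading_deg, bearing_to_director_deg, fov_deg=65.0):
    """True if the Director lies within the vehicle's field of view. With the narrower
    65-degree UAV camera FOV (versus the 85 degrees used for manual control), Directors
    must stand closer to the aircraft centerline to be seen."""
    offset = (bearing_to_director_deg - vehicle_heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0

print(director_visible(0.0, 40.0))         # False for the 65-degree UAV camera
print(director_visible(0.0, 40.0, 85.0))   # True for the manual-control field of view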
Once the Director is within the field of view, LT, RT, and GC systems must then visually identify
the Director and recognize that (1) this is their current Director that should be instructing them
and (2) the Director is commanding a specific task for them to perform. LT and RT systems both
continue to use a human pilot to perform this task, and prior research suggests that once an object
of interest is in the remote pilot’s camera view, they have a very high likelihood of detecting it.
As such, the probability of failing to notice a Director and the probability of failing to select the correct
response to a task are both left at 0%. For VBSC and SBSC systems, these variables are again not applicable as
these systems do not rely on optical systems for task acquisition and these parameters are again not
necessary.
GC systems, however, require substantial changes to the model. Gesture recognition technologies
are still maturing, and the ability to recognize the complicated physical gestures performed by
Directors in real-world settings is still an area of investigation. As noted earlier in Chapter 2, the
median failure rate reported in the research, which includes laboratory settings and far less complicated
gestures, is currently 11.16%. While performance will certainly improve in laboratory settings, a
variety of environmental conditions (sun glare, obstructions, large numbers of moving vehicles)
ultimately limit performance in the real world. The median failure rate in the research is quite likely
to be close to the real-world failure rates, even with enhanced processing and vision capabilities.
As such, the median failure rate of 11.16% observed in the research is used as the failure rate of
the baseline GC system; the effects of this parameter on GC behavior will be explored later on in
Chapter 6.
There are also technical differences between identifying a Director and selecting a response given
their commands, with the latter being much more difficult. Failing to recognize either a Director or
a Task would involve some form of time penalty, either for a Director to issue an override to obtain
control over the vehicle (perhaps through a handheld wireless device) or to issue a “stop” command
to the vehicle and attempt to issue new instructions, respectively. A logical GC implementation in
this form would imply that a failure to recognize a Director might only happen once, after which
the Director would have the authority to force control of the vehicle. A failure to recognize tasks
does not have such a solution, due to the significant number of tasks that might be assigned. It
thus may take several attempts for the vehicle to register the correct command, with each failure
accruing a time penalty. For the baseline MASCS model of Gestural Control, the penalty for failing
to acquire a new Director is set at 5 seconds, representing the time required to realize the vehicle has
not recognized the Director and for the Director to issue an override. The time penalty for failing
to acquire a task is modeled as a uniform distribution between 5 and 10 seconds, representing the
time for the operator to stop, issue a cancel command, and wait for the system to become ready to
accept new instructions (these times are also influenced by the expected latency in signal processing,
described later in Section 5.1.2). The number of failures in command comprehension and response
selection is determined at the start of each action commanded by a Director by a Bernoulli trial
generator, after which the total time penalty is solved by randomly sampling the uniform interval
[5, 10] once for each failure incurred.
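One plausible reading of this sampling scheme, for a single Director-commanded action, is sketched below (hypothetical helper function; the failure probability and penalty range come from the text above, and the separate gesture-processing latency is described later in this section).

import random

def gc_task_acquisition_penalty(p_fail=0.1116, penalty_range=(5.0, 10.0), rng=random):
    """Repeated Bernoulli trials until the GC system registers the commanded task;
    each failed attempt adds a uniformly sampled 5-10 second time penalty."""
    total_penalty = 0.0
    while rng.random() < p_fail:             # this attempt fails to register the command
        total_penalty += rng.uniform(*penalty_range)
    return total_penalty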
Task Execution
The last of the original performance parameters involves task execution, in terms of alignment to the
catapult for launch. Chapter 3 noted that the probability of pilots failing to align to catapults was
very low, occurring less than once in more than 200 launches. This was incorporated in the original Manual
Control model partially validated in Chapter 4. In transitioning from manual control to remotely
operated systems, previous research has reported that, even with large field-of-view cameras, the
remote pilots of LT and RT systems suffer degraded performance in their tasks. The common theme
in the literature reviewed in Chapter 2, although utilizing a variety of vehicles and tasks, was that
tasks that required accurate alignment of a vehicle (or robotic arm) were most difficult to perform
and were even more so when in the presence of signal lag. This suggests that both LT and RT
systems should suffer a higher failure rate than manual control, but the information available from
prior research does not provide a true estimate of failure probability (most often reporting numbers
of errors). An appropriate model may be to raise the LT failure rate to once per mission (5%, once
per 20 launches) and RT to roughly twice per mission (10%, once per 10 launches).
For the more advanced GC, SBSC, and VBSC systems, the limitations of the remote pilot to
sense his or her location in the world and align to a target are removed, but these systems instead
require some sort of technological intervention to ensure accurate alignment. It would be reasonable
to assume, for a future system being allowed full operational status on a carrier, that the performance
of these systems would be equal to or better than LT performance. Also, because VBSC and SBSC
systems would rely on greater autonomy and likely include enhanced sensor capabilities as opposed
to the rest, they would likely have the best non-MC failure rates. However, given the difficulties of
navigating on the flight deck, it is difficult to argue that the failure rate would be zero. These ideas
then present a relationship of failure rates between systems: MC < VBSC&SBSC < LT&GC < RT
(MC has a lower failure rate than VBSC and SBSC systems, which in turn have lower rates than LT and GC systems,
which are better than RT systems). Given that the chance of catapult alignment failure for manual
control is once per many missions (100+ launches), and that LT was once per mission, GC is also
set to once per mission (5%) while VBSC and SBSC are set to once per two missions (2.5%). This
failure, like the visual recognition failures, also includes a time penalty, which Directors report as
being in the range of 45-60 seconds.
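These assumed rates and the associated penalty could be summarized as follows (a sketch; the MC value is approximate, reflecting a failure rate well under once per 100+ launches, and modeling the 45-60 second penalty as a uniform draw is an assumption).

import random

# Baseline catapult-alignment failure probabilities per launch, as assumed in the text.
ALIGNMENT_FAILURE_RATE = {
    "MC": 0.005,     # approximate: well under once per 100+ launches
    "VBSC": 0.025,   # once per two missions
    "SBSC": 0.025,
    "LT": 0.05,      # once per mission (20 launches)
    "GC": 0.05,
    "RT": 0.10,      # roughly twice per mission
}

def alignment_penalty_seconds(architecture, rng=random):
    """Sample the realignment time penalty (45-60 s) if the alignment attempt fails."""
    if rng.random() < ALIGNMENT_FAILURE_RATE[architecture]:
        return rng.uniform(45.0, 60.0)
    return 0.0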
Latencies
Outside of the original parameters, certain UAV systems may introduce latency into the process of
response selection and task execution. This is expected to occur for RT, GC, and VBSC systems,
albeit for a different reason in each. VBSC systems incur a form of latency related to the Handler’s
ability to process tasks through the supplied GUI. This is described in the next section and will
be accompanied by a description of the VBSC system latencies. The source of latency in GC
systems comes from software processing as the system attempts to translate the hand gestures
of the Director into understandable action commands. Prior research shows this requires anywhere
from well under 1 second to 3 seconds. This is modeled as a uniform distribution over the range [0, 3] seconds, randomly sampled
at the start of every action commanded by a Director. Remote Teleoperation (RT) system latencies
arise from the transmission of signals from the vehicle to the remote operator, who issues commands
that take additional time to travel back. This means that there is a defined lag from the time the
Director commands a task to the time that the operator begins execution, as well as a potential lag
at the end of the task when the Director commands the next task. Prior reports have placed the
round-trip latency at 1.6 seconds (Vries, 2005), meaning that the start of each action is delayed
by 1.6 seconds.
The effects at the end of the action are not as straightforward, however. Prior work demonstrated that experienced operators adapt their strategies to the system, commanding tasks that
attempt to anticipate the lag, then wait for the tasks to complete execution before issuing new
commands (Sheridan & Ferrell, 1963; Sheridan & Verplank, 1978). Such a strategy would likely
be used by experienced remote pilots in flight deck operations in relation to when they expect the
Director to issue a command. This provides a continuum of possibilities ranging from ending the
task too early to ending the task too late; Figure 5-7 shows how this might occur for the “worst
case” late completion of a task. Two timelines are depicted, one for the actions of the Director and
Pilot (top), the second for the response of the vehicle (bottom). The vehicle begins its motion task
on the left at T = 0, and it should complete at T = 5. In the very worst case scenario, the pilot does
not anticipate the end of the task, waiting for the Director to issue the “STOP” command. If this
occurs at T = 5, when the moving aircraft has reached the desired end point, the round-trip latency
in the system would delay the end of the task by 1.6 seconds: the time required for the “STOP”
command to travel to the remote pilot and for this “STOP” signal to reach the vehicle. What was
supposed to be a 5-second task completes in 6.6 seconds.
In order to execute the task precisely as defined, preemptive action must be taken either by the
Director or the Pilot to compensate for the latency in the system; these actions appear in purple
towards the right side of Figure 5-8. The Director must provide a “STOP” command at least 1.6
seconds early to account for the round trip latency of 1.6 seconds. The remote Pilot must act 0.8
seconds early as they must only handle latency in one direction. Pilots might also overcompensate
for the lag in the system if they do not have an accurate mental model of the current latencies in
communications or if they are unsure of the final taxi destination. For experienced operators working with a mature system, it is doubtful that they would initiate actions any earlier than double the lag (3.2 seconds before the action should stop). Given that the latency in a one-way signal is 0.8 seconds, this places a minimum bound at tasks ending 2.4 seconds early. In Figure 5-8, this means that the desired 5-second task would end at T = 2.6 seconds. While this is a significant shift for a 5-second task, most taxi tasks on the flight deck take much longer than 5 seconds to complete, and the relative inaccuracy is not nearly as severe.

Figure 5-7: Diagram of worst-case scenario for RT latency effects for a five-second task. Upper timeline shows Director and Pilot actions; bottom timeline shows vehicle response to Pilot commands. If the Director provides the command late (at the 5-second mark) and the Pilot does not preemptively stop the aircraft, the round-trip 1.6-second latency (communicating the “STOP” command to the remote pilot, the pilot issuing the command, and the command reaching the vehicle) causes the vehicle to stop late. The task completes at 6.6 seconds.
Given the assumptions in the previous two paragraphs, we have a continuum of possibilities
ranging from tasks completing 2.4 seconds early to 1.6 seconds late. This can be modeled in
MASCS as an adjustment to the expected completion time of the task itself. To reflect the variability
in behavior, this takes the form of a uniform distribution over the range [−2.4, 1.6] sampled randomly
for each task and added to the expected end time.
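The RT latency model therefore has two parts: a fixed 1.6-second delay before each commanded task begins, and a per-task adjustment to the expected end time drawn uniformly from [−2.4, 1.6] seconds. A minimal sketch with placeholder names (not the MASCS classes) might look as follows.

import java.util.Random;

public class RemoteTeleopLatencyModel {
    private static final double ROUND_TRIP_LAG_SECONDS = 1.6;
    private final Random rng = new Random();

    // Fixed delay applied before a commanded task begins executing.
    public double startDelaySeconds() {
        return ROUND_TRIP_LAG_SECONDS;
    }

    // Adjustment added to the expected end time of each task, sampled once per task.
    public double endTimeAdjustmentSeconds() {
        return -2.4 + 4.0 * rng.nextDouble();  // uniform over [-2.4, 1.6] seconds
    }
}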
UAV Routing Methods
The final major difference between UAV Control Architectures involves how they are issued taxi
commands and routed through the flight deck. As described earlier, LT, RT, and GC systems still
operate under zone coverage, although their performance is affected by various factors described in
the previous sections. VBSC systems would rely on commands issued from the Handler through a
Graphical User Interface, which requires modeling the behavior of the Handler in operating such
a system. Prior work has demonstrated that operators of these systems behave as queuing servers,
processing vehicle tasks individually according to some form of queuing policy, with the time required
to issue tasks being modeled by a randomized distribution. For flight deck operations, experienced
Handlers would likely operate in a “highest priority first” fashion, prioritizing aircraft nearest to
catapults in order to keep traffic flowing on the deck.
Figure 5-8: Diagram of optimal and worst-case early scenarios for RT latency effects for a five-second task. Upper timeline shows Director and Pilot actions; bottom timeline shows vehicle response to Pilot commands. Perfect execution requires preemptive commands by either the Director or the Pilot. If the Pilot further pre-empts the Director’s actions, the task may complete early.

The time required to process tasks is taken from prior research by Nehme, whose work involved operators controlling a team of unmanned vehicles through a VBSC-style interface, requiring them to define waypoints that defined vehicle transit paths, among other task assignments (Nehme, 2009). Nehme characterized the “idle” task, in which operators assigned a new transit task to a waiting vehicle, as normally distributed with a mean of 3.19 seconds and standard deviation of 7.19 seconds.
The idle task described by Nehme is analogous to the action of assigning taxi tasks to idling aircraft
on the flight deck, so this same distribution will be used within MASCS for the baseline model of
VBSC systems. The combination of the Handler’s interaction time with each vehicle and the current number of vehicles in the queue generates latencies for VBSC systems: not only must a VBSC vehicle wait N(3.19, 7.19²) seconds at the start of each of its own tasks, it must wait for all other vehicles ahead of it in the queue to be processed. If 4 vehicles are ahead in the queue, the fifth vehicle must wait (on average) 3.19 × 5 = 15.95 seconds before starting its next task. Thus, there is a significant
penalty for including additional vehicles in the VBSC queuing system, which has been characterized
in many previous studies (see (Cummings & Guerlain, 2007; Cummings & Mitchell, 2008)).
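As an illustrative sketch of this queuing behavior (placeholder names, not the MASCS classes), the Handler can be modeled as a single server that processes waiting vehicles in priority order, drawing each interaction time from N(3.19, 7.19²) seconds; truncating negative samples to zero is an assumption of this sketch, since the normal distribution would otherwise occasionally produce negative service times.

import java.util.PriorityQueue;
import java.util.Random;

public class VbscHandlerQueue {
    public static class Request {
        final String vehicleId;
        final double priority;  // e.g., proximity to the assigned catapult

        public Request(String vehicleId, double priority) {
            this.vehicleId = vehicleId;
            this.priority = priority;
        }
    }

    // Highest-priority vehicle (closest to a catapult) is serviced first.
    private final PriorityQueue<Request> queue =
            new PriorityQueue<>((a, b) -> Double.compare(b.priority, a.priority));
    private final Random rng = new Random();

    public void enqueue(Request request) {
        queue.add(request);
    }

    // Services every waiting vehicle in priority order; returns the total time spent.
    // A vehicle at the back of the queue waits through all earlier interactions.
    public double serviceAllSeconds() {
        double elapsed = 0.0;
        while (!queue.isEmpty()) {
            queue.poll();
            elapsed += Math.max(0.0, 3.19 + 7.19 * rng.nextGaussian());
        }
        return elapsed;
    }
}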
Additionally, since this style of routing does not rely on the network of crew on the flight deck,
the Handler can also assign longer taxi tasks on the flight deck — taxiing from aft to forward,
barring other constraints, could be done in a single motion rather than requiring multiple handoffs
between Directors. This is essentially a change in the topology of the flight deck; as described in
Chapter 3, the crew “zone coverage” routing system defines how aircraft can move throughout the
flight deck. The network of possible routes is altered for the SBSC and VBSC models: the locations of the nodes in the network remain the same, but the two supervisory control models are allowed to make longer taxi motions between them. The general flow of traffic,
however, remains the same: Handlers have experience with flight deck operations and would likely
still employ the same general rules and routes of taxi actions as used today.
SBSC systems are also planned centrally, but instead use a centralized planning
algorithm to define tasks for all vehicles simultaneously. All assignments are made in parallel, and
the Handler no longer functions as a queuing server in processing tasks. Once assignments are made,
just as with VBSC vehicles, tasks are carried out without the assistance of the Aircraft Directors
in taxi operations. What types of planning algorithms are useful in the aircraft carrier environment
is still an area of ongoing research, and prior work in this area demonstrated that even complex
planning algorithms have trouble decomposing the geometric constraints on the flight deck and
provide little benefit over human planning (Ryan et al., 2014). The appropriate development and
testing of planning algorithms for the flight deck lies outside the scope of this research; for SBSC
systems, the same Handler planning heuristics applied to the other CAs, which performed as well or
better than other planning algorithms in prior work, are applied here. As such, all five UAV control
architectures utilize the same Deck Handler assignment heuristics for assigning aircraft to launch
catapults.
A final change for the VBSC and SBSC systems involves the use of crew on the flight deck. It
was previously noted that the Aircraft Directors are not needed for issuing taxi tasks to VBSC and
SBSC vehicles on the flight deck. However, given the Navy’s emphasis on safety in deck operations,
UAVs still would not likely be allowed to taxi entirely on their own. It is likely that white-shirted
Safety Officers, already used to observe operations on the current flight deck, might be transitioned
into roles as aircraft Escorts, accompanying VBSC and SBSC vehicles as they taxi along to ensure
that they maintain sufficient clearance to avoid hitting other aircraft. These Escort crew might
also be supplied with a “kill switch” to stop the vehicle temporarily if a collision is imminent and
the vehicle appears to not recognize the danger. These crew are modeled as new Agents within the MASCS system, requiring a model of how Escorts are assigned to specific vehicles as needed and using alignment logic similar to that of the Aircraft Directors. At the start of a vehicle’s taxi operations, an Escort is assigned that follows the vehicle for the duration of taxi operations; once the vehicle
launches, the Escort returns to a “home” position to await a new task. A total of ten Escorts are
placed in the system, with clusters of five appearing forward and aft. This provides the capability to
taxi eight vehicles to catapults while two others are returning from just-completed taxi operations
and is a sufficiently large number to not constrain operations.
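A minimal sketch of the Escort pool described above, assuming a simple idle pool from which an Escort is drawn when a vehicle begins taxiing and to which it returns after the launch; the class and identifiers below are placeholders rather than the MASCS agent models.

import java.util.ArrayDeque;
import java.util.Deque;

public class EscortPool {
    private final Deque<String> idleEscorts = new ArrayDeque<>();

    public EscortPool() {
        // Ten Escorts total: clusters of five forward and five aft.
        for (int i = 1; i <= 5; i++) idleEscorts.add("FWD-" + i);
        for (int i = 1; i <= 5; i++) idleEscorts.add("AFT-" + i);
    }

    // Called when a vehicle begins taxi operations; returns null if no Escort is idle.
    public String assignEscort() {
        return idleEscorts.poll();
    }

    // Called once the escorted vehicle launches; the Escort returns to its home position.
    public void releaseEscort(String escortId) {
        idleEscorts.add(escortId);
    }
}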
Other Effects
Additionally, for GC, VBSC, and SBSC systems, the shape of the vehicle is changed from an F-18
planform to an X-47B Pegasus unmanned vehicle. Qualitatively, F-18s are longer and have a smaller
wingspan than the X-47s. However, the thresholds of collision avoidance were adjusted to maintain
the same distances for each model to maintain a consistency of behavior. Additionally, the flight
crew is assumed to have enough experience with the UAVs to be comfortable with working with
them (they trust the vehicles to perform safely) and taxiing them at the same speeds as manual
control aircraft. The distribution of speeds for the UAV models is thus identical to that used for
Manual Control, which also removes any potential confounds due to speed differences.
Control Architectures Summary
Together, the changes in parameters described in the previous sections define the differences between
Manual Control aircraft and each of the five UAV Control Architectures defined for MASCS, as
detailed in Figure 5-6. If properly modeled, these changes to parameters and models should generate
significant differences in performance for individual vehicles taxiing on the flight deck. The movement
from Manual to Local Teleoperation introduces a minor constraint on the pilots’ field of view and
would be expected to worsen performance slightly. The move from Local Teleoperation to Remote
Teleoperation introduces (in this model) a small lag into communications that delays the start of any
task and affects the end of task execution, both of which should potentially degrade performance.
The move from Remote Teleoperation to Gestural Control replaces the pilot with a less-accurate
computerized system with a larger processing time at the start of each task and the chance to fail
multiple times. This worsens performance further and should provide the worst performance across
all systems.
Moving from Gestural Control to Vehicle-based Human Supervisory Control eliminates the inaccurate gesture-processing system but replaces it with a single supervisor processing tasks for all
vehicles in series. For a single aircraft, the delay in supervisor interaction may be less than the
delays due to the gesture recognition system; if this occurs, the VBSC system should produce better
performance. At higher numbers of aircraft, however, more vehicles are placed into the VBSC supervisor queue, greatly increasing the supervisor’s workload. Under these conditions, the wait times for
vehicles may increase significantly, leading VBSC systems to provide poorer performance than GC
systems at the mission level. Lastly, because System-based Human Supervisory Control provides
the ability to assign all tasks simultaneously, the limitation of the supervisor in servicing vehicles in series is removed, which should greatly improve the performance of SBSC systems. Together, these
differences in performance should appear as an inverted-U for individual aircraft models. This may
not be true at the mission level, where interactions with the crew routing methods, catapult queuing strategies, and the Deck Handler planning heuristics may either mitigate or exacerbate these
qualities. Section 5.2 tests each of the control architectures using a single aircraft mission to explore
whether or not the six variables described here drive variations in performance.
5.1.3 Mission Settings: Composition and Density
The remaining two independent variables address how the context of the mission itself affects performance; these appear on the horizontal axis in Figure 5-1. The first variable is mission Density, set to
either 22 or 34 aircraft, which examines how increases in the number of vehicles affect operations.
Subject Matter Experts describe a typical mission as including 22 aircraft, with 34 being a maximum
value for operations. At this 34 aircraft setting, the number of available taxi routes is reduced due to
the number of aircraft parked on the flight deck. Additionally, one of the forward catapults (catapult
1) is covered with parked aircraft and not usable. The Street area also contains more aircraft than
the 22 aircraft setting, leading to an increase of the number of vehicles performing close proximity
interactions. Due to these factors, the assignment of aircraft also evolves in a different fashion;
aircraft parked on catapult 1 must be taxied early in the mission, which means that the Street area
remains blocked for a significant amount of time, limiting the flexibility of aircraft assignments.
The Composition variable addresses the fact that the Area, Temporal, and Temporal+Area
Separation protocols described in Section 5.1.1 can only be applied while at least one manned and
one unmanned aircraft are present on the flight deck. The Composition variable describes the
relative numbers of unmanned aircraft involved in operations as a percentage of the Density value.
At the 22 aircraft Density level, a 25% Composition setting uses 5 unmanned aircraft and 17 Manual
Control aircraft. Differences in the values of the Composition variable affect the implementation
of the different safety protocols and how they affect the evolution of a mission. These effects are
summarized in Table 5.1. For example, under the Area Separation protocol, the two groups of
manned and unmanned aircraft work in parallel in specific, segregated areas of the flight deck and
have equal access to catapults, regardless of the Composition level. Under the 50% Composition
protocol using Area Separation, these two groups are equal in number and should launch at similar
rates; the Area Separation protocol should remain in effect for the entire mission. At the 25%
Composition level under Area Separation, the small set of unmanned aircraft will launch more
quickly than the larger set of manned aircraft; once the unmanned aircraft have launched, the
Area Separation constraint can be relaxed as the only aircraft that remain are manned. The same
constraints are applied to both the 25% and 50% Composition cases, but differences in the number
of unmanned aircraft lead to differences in how the missions evolve.
Under the Temporal Separation protocol, the differences between Composition values are only
in terms of when unmanned aircraft become active during a mission. In both cases, manned aircraft
operate first, starting from their positions in the forward area of the flight deck. The larger number
of manned aircraft in the 25% Composition level means that unmanned aircraft will become active
later in the mission than would occur under the 50% Composition level. This is also true for the
Temporal+Area protocol, with the only difference being that T+A cases send all manned aircraft
forward. Missions can also be run at a 100% Composition level, utilizing only a single type of vehicle
during a mission (only manned or only unmanned), but the Area, Temporal, and Temporal+Area
protocols will not be feasible as only one type of vehicle is present. Dynamic Separation is the only
applicable protocol for the 100% Composition cases, with all other entries crossed out in Fig. 5-1.
Before proceeding to the execution of the matrix, however, additional testing should be performed
in order to verify that the UAV control architecture models and safety protocol logic routines function
as desired. The next section replicates the testing done in Chapter 4 in single aircraft calibration
with the new UAV models. While the results cannot be directly validated against empirical data,
the results should demonstrate the expected inverted-U shape of performance.
Table 5.1: Description of effects of Safety Protocol-Composition interactions on mission evolution.

                                    Area Separation                            Temporal Separation
Deck Access (same for all           Restricted to specified area               Unrestricted
Composition values)
Operations (same for all            Parallel (across areas)                    Series (manned first, then unmanned)
Composition values)
Composition 50%                     Aircraft groups launch at similar rates.   Unmanned aircraft become active
                                    Constraints likely to remain in place      halfway through mission
                                    throughout mission
Composition 25%                     Smaller group of unmanned aircraft         Unmanned aircraft become active
                                    launches quickly. Constraints are          three-quarters through mission
                                    relaxed afterwards
5.2 Unmanned Aerial Vehicle Model Verification
The changes defined in the preceding sections were implemented in the MASCS model of flight deck operations,
defining new unmanned aircraft object classes and adjusting parameters for those agent models,
as well as making adjustments to existing logical structures within the system. Once these tasks
were completed, individual submodels within the code (e.g., the Bernoulli trial generator replicating
GC response selection failures) were verified for accuracy. After this was completed, a series of
verification tests of the single aircraft models were conducted using the same mission profile as
the single aircraft calibration testing described in Chapter 4, Section 4.2, again conducting 300
replications in each trial. Testing of the UAVs proceeded incrementally, introducing a single new
feature at a time in order to gauge its effects; these results are reported in Appendix J. The results
of the final model tests appear in Figure 5-9, which provides the mean Launch Duration value for
each vehicle class with whiskers indicating ±1 standard deviation.
As can be seen in the figure, the performance across unmanned vehicle control architectures
is that of the expected inverted-U shape. A Kruskal-Wallis non-parametric one-way analysis of
variance shows that significant differences exist among the data at the α = 0.05 level (χ²(5) = 197.1712, p < 0.0001). A Steel-Dwass simultaneous comparisons test¹ reveals significant differences
between multiple pairs of CAs. A total of nine significant pairwise differences exist at the α = 0.05
level, comprising all possible combinations between two sets of three CAs: the three “delayed” cases
of RT (mean rank= 981.15), GC (mean rank= 1153.39), and VBSC (mean rank= 1046.74), where
vehicles experience some sort of latency or lag in operations, and the three “non-delayed” cases of
MC (mean rank= 786.26), LT (mean rank= 771.84), and SBSC (mean rank= 663.63).
¹Steel-Dwass is a nonparametric version of the Tukey test; see (Conover, 1980).
Figure 5-9: Column chart comparing time to complete single aircraft mission across all five UAV
types. Original results from Manual Control modeling are also presented.
These results for individual aircraft tasks suggest that latencies and lags experienced in task
execution by these vehicles have significant effects on performance, although this was expected.
However, as observed in the sensitivity tests of the manual control model in Chapter 4, Section 4.4,
changes that affect taxi operations “upstream” of operations may not actually have an effect at the
mission level. The performance of the different UAV control architectures might also be influenced
by interactions with the Safety Protocols defined in this chapter in Section 5.1.1. Determining
the drivers of variation in safety and productivity on the flight deck is the focus of the MASCS
experimental testing program, described in the next section.
5.3 Dependent Variables and Measures
The focus of the MASCS experimental program is to examine the changes in safety and productivity
in aircraft carrier flight deck operations across different Control Architectures and Safety Protocols.
As noted in Chapter 4, very little information on flight deck operations is publicly available. As
such, there is little guidance in terms of what metrics are of importance for flight deck operations;
even if this guidance were available, it is not clear if these metrics would be appropriate for assessing
futuristic unmanned vehicle operations. Rather, inspiration for the performance metrics for use in
MASCS testing is taken from prior work on unmanned vehicles and driving research.
In prior work, Pina et al. defined five major classes of metrics for human supervisory control
systems that examine not only the performance of the entire human-unmanned vehicle system,
but also examine humans and unmanned systems individually, as well as their efficiency in
collaboration and communication (Pina, Cummings, Crandall, & Penna, 2008; Pina, Donmez, &
Cummings, 2008). While all five classes are applicable to assessing the UAVs used in MASCS, four of
the five classes correspond to settings within MASCS and were defined as part of the various control
architectures; only the category of Mission Effectiveness (ME) metrics would exhibit significant
variation in MASCS testing.
This class of Mission Effectiveness metrics describes measures of productivity within the aggregate
system. Pina et al. subdivided these measures into categories of Expediency (time to complete
tasks), Efficiency (resources consumed in completing tasks), and Safety (how much risk was incurred
during task execution). For MASCS, Expediency measures address how quickly the launch event
was completed, termed the Launch event Duration (LD). Efficiency measures address how well
operations were conducted, tracking parameters related to individual vehicle task execution and
helping to clarify differences in LD and the effects of the safety protocols on operations. Total
Taxi Distance (TTD) tracks the total travel distance of aircraft, providing an understanding of how
efficiently aircraft were routed on the flight deck, while Total Aircraft taXi Time (TAXT) and Total
Aircraft Active Time (TAAT) examine how long aircraft were actively executing taxi tasks or all
tasks (including launch preparation), respectively.
Safety measures address both collision and non-collision hazards. Traffic researchers have explored the use of “surrogate” safety measures as proxies for true accident values. These surrogate
measures track events that occur more often than true accidents (making them easier to track)
but remain highly correlated with them (Archer & Kosonen, 2000; Archer, 2004; Gettman & Head,
2003; Gettman, Pu, Sayed, & Shelby, 2008). These measures often address the distances between
vehicles and the time-to-collision, both typically in regards to specified thresholds (e.g., one car
length or within one second of collision). The first set of MASCS safety measures tracks the number
of times another vehicle or human is within a specified distance of an aircraft; every time another
agent moves within this distance of a vehicle, that vehicle’s counter increases by one. Three different distance values are utilized within MASCS, corresponding to Primary (PHI, half-wingspan),
Secondary (SHI, full wingspan), and Tertiary (THI, two wingspans) Halo Incursions (Figure 5-10).
The term “incursion” is borrowed from commercial aviation, in which the entry of an unapproved
aircraft into a runway (or other restricted space) is known as an “incursion.” While these restricted
areas are specified as enforceable constraints, “halos” in MASCS are purely descriptive measures
(but could be used as control features). The counts of incursions can also be subdivided by the
type of agent crossing the halo radius (aircraft or crewmember). While the inner-most Primary
Halo Incursion (PHI) measure is a small distance, it is common (although undesirable) for Aircraft
Directors to be this close to aircraft during operations.
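As a sketch of these counters (placeholder names, not the MASCS code), each aircraft can carry three counters keyed to half a wingspan, one wingspan, and two wingspans; counting only the innermost halo reached on each approach is a simplifying assumption of this sketch, as the text does not specify whether inner incursions also increment the outer counters.

public class HaloIncursionTracker {
    private final double wingspan;
    private int phi, shi, thi;  // Primary, Secondary, Tertiary incursion counts

    public HaloIncursionTracker(double wingspanMeters) {
        this.wingspan = wingspanMeters;
    }

    // Called when another agent moves from outside to inside range of this aircraft.
    public void recordApproach(double distanceMeters) {
        if (distanceMeters <= 0.5 * wingspan) {
            phi++;
        } else if (distanceMeters <= wingspan) {
            shi++;
        } else if (distanceMeters <= 2.0 * wingspan) {
            thi++;
        }
    }

    public int primaryCount()   { return phi; }
    public int secondaryCount() { return shi; }
    public int tertiaryCount()  { return thi; }
}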
Figure 5-10: Diagram of “halo” radii surrounding an aircraft. Primary incursions are within one half wingspan of the vehicle. Secondary incursions are within one wingspan; tertiary are within two. In the image shown, the aircraft has its wings folded for taxi operations. A transparent image of the extended wings appears in the background.

Companion measures to the Halo Incursions track the total amount of time that each “halo” contains an incursion, as it is not clear which variable is more highly correlated with accidents on the flight deck: the number of incursions or the total time that incursions exist. A variation of the halo incursion measure takes guidance from traffic research and views collisions from a predictive stance, tracking how many times a vehicle is within one second of colliding with another actor on the flight deck. This prediction measure also utilizes the radii from Figure 5-10, tracking predicted Primary (PC), Secondary (SC), and Tertiary (TC) collision warnings.
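A minimal sketch of the predicted-collision check, assuming a warning is counted whenever the current closing speed would carry another actor inside the relevant halo radius within one second; the names and the straight-line extrapolation are illustrative assumptions rather than the MASCS implementation.

public class PredictedCollisionChecker {
    private static final double WARNING_HORIZON_SECONDS = 1.0;

    // Returns true if the given halo radius would be penetrated within one second,
    // given the current separation and closing speed between two actors.
    public static boolean warns(double separationMeters, double closingSpeedMps,
                                double haloRadiusMeters) {
        if (closingSpeedMps <= 0.0) {
            return false;  // actors are not converging
        }
        double timeToHalo = (separationMeters - haloRadiusMeters) / closingSpeedMps;
        return timeToHalo <= WARNING_HORIZON_SECONDS;
    }
}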
Non-collision safety measures address the factors that influence other hazards on the flight deck,
taking inspiration from the idea of latent failures. The primary metric in this class is the total time
that vehicles are active within the landing strip area, termed the Landing Strip Foul Time (LSFT).
Blocking the landing strip may potentially delay airborne aircraft low on fuel from landing, directly
increasing their risk of accidents. Other metrics focus on congestion on the flight deck, tracking the
total number of aircraft that transit through the Street and Fantail area (NS, NF), the maximum
number of aircraft in those areas at one time (MS, MF), and the total amount of time that these
areas are occupied (DS, DF). A list summarizing these primary performance measures appears
in Table 5.2. The next section addresses the experimental hypotheses of performance for these
dependent measures in terms of the experimental matrix shown previously in Fig. 5-1. Hypotheses
are presented both in terms of safety protocol performance with a given control architecture and
across control architectures individually (independent of safety protocols).
5.4 Experimental Hypotheses
The hypotheses for the experimental matrix in Figure 5-1 are organized into two different matrices
in two figures below. Figure 5-11 shows the hypotheses across control architectures (independent
of safety protocols), while Figure 5-12 examines hypotheses for safety protocols within each control
architecture. For every dependent variable described in the previous section, lower scores are better,
indicating either lower risk of operations or faster/more efficient mission execution. Entries in the
cells in the figure denote whether performance should provide smaller values (better) or larger values
(worse), with blank cells denoting performance somewhere in between.
Table 5.2: List of primary performance metrics for MASCS testing.

Category     Metric                                                   Abbreviation
Expediency   Launch event Duration                                    LD
             Average interdeparture time                              ID
Efficiency   Total Aircraft taXi Time                                 TAXT
             Total Taxi Distance                                      TTD
             Catapult wait time
Safety       Primary, Secondary, and Tertiary Halo Incursions         PHI, SHI, THI
             Duration of Halo Incursions                              DPHI, DSHI, DTHI
             Predicted Collisions (Primary, Secondary, Tertiary)      PC, SC, TC
             Landing Strip Foul Time                                  LSFT
             Number of vehicles in Street (total, max at any time)    NS, MS
             Number of vehicles in Fantail (total, max at any time)   NF, MF
             Duration of time of vehicles in Street, Fantail          SD, FD

Within Figure 5-11, the hypotheses are that, regardless of SP settings, the detrimental effects of the delays and errors of the GC system will generate the worst performance in terms of Expediency
and Efficiency measures, while the lack of constraints afforded by the VBSC and SBSC systems
should generate the best performance (smallest values) on these measures. GC should also generate
the worst performance for Safety measures that deal with time (Duration of “Halo” incursions,
etc.), as longer mission times should also correlate to larger values for those measures. For Safety
metrics dealing with counts of near collisions and predicted collisions, the SBSC control architecture
is expected to provide the highest values (worst performance) because it lacks the constrained taxi
patterns and artificial spacing introduced by the “zone coverage” system. However, because removing
these constraints is expected to increase the speed of operations, it is expected that safety measures
involving durations of time would perform better (lower values) for SBSC. For all metrics considering
the number of vehicles sent to the Street and the Fantail, the choice of CA should have no effect;
greater effects for these should come from the Safety Protocols and are reviewed in Figure 5-12.
In terms of Figure 5-12, within individual Control Architectures, the Dynamic Separation only
protocol should provide the best performance in terms of productivity measures as it provides the
fewest constraints on operations. The Temporal Protocol should also perform well on these measures
as it retains the flexibility to allocate aircraft across the entire deck. Conversely, the constraints put
in place by the Area and Temporal+Area cases should be detrimental to productivity measures,
most especially the Temporal+Area case.
Figure 5-11: Hypotheses for performance across Control Architectures. E.g.: System-Based Human
Supervisory Control (SBSC) is expected to perform best (smallest values) for all productivity measures and safety measures (smallest) related to durations. Gestural Control (GC) is expected to
perform worst (largest values) across all measures related to time.
In terms of safety measures, however, these same constraints are expected to afford Temporal+Area the best performance, as it should minimize the interaction between vehicles of all types.
The Temporal protocol is also expected to fare better in safety measures than the Area Separation
protocol: because the Area Separation protocol confines aircraft into specific regions of the deck
(increasing the density of operations in each area), the collision-related measures are all expected
to perform worse in these cases. However, Area Separation is expected to provide the lowest use
of the Street area of the deck; conversely, the T+A protocol should utilize the Street most heavily
(worst performance) and the Fantail the least.
The only expected interaction between Safety Protocols and Control Architectures is that, because the VBSC and SBSC cases do not utilize the zone coverage routing methods, their productivity
performance is not expected to differ between safety protocols. An interaction is also expected between Composition and the Temporal+Area Separation protocol. Because this protocol isolates
manned aircraft in the forward section of the flight deck, neglecting the aft catapults, utilizing more
manned aircraft in the 25% Composition case should lead to longer Launch event Duration values.
Figure 5-12: Hypotheses for performance of Safety Protocols within each Control Architecture. Example: for Local Teleoperation (LT), the Temporal Separation safety protocol is expected to perform best (smallest values) across all measures of productivity and safety. Area+Temporal is expected to provide the worst productivity measures (largest values), while Area Separation is expected to be the worst for safety.

The dependent variables described in the previous paragraphs were integrated into the MASCS
system and verified before beginning experimental runs, with the system set to output all measures
to file (as well as provide logs of system behavior) at the end of each simulation iteration. Each
entry in the experiment matrix was replicated thirty times, taking between one hour and 2.5 hours
to complete each set of replications (running at 10x normal speed). These results were then compiled
and examined; the results are presented and discussed in the next section.
5.5 Results
The experimental matrix shown in Figure 5-1 was executed using a team of Windows 7 machines
using the MASCS Java codebase and the Eclipse Java IDE. Simulations were executed at identical
screen resolutions (1536x864) across machines, with confirmation runs executed prior to the start of
the test matrix to ensure that the results of the simulation were not affected by differences between
computer systems. Thirty replications of each entry in the experimental matrix were performed (92
x 30 = 2760 runs in total). This section reviews the differences in performance in terms of these
dependent variables for the MASCS simulation environment, examining how changes in Control
Architecture (CA) and Safety Protocol (SP) interact both with one another and with the mission
configuration (number of total aircraft and percentage of UAVs within the mission).
This section utilizes Pareto frontier graphs to describe the general trends in the data, with
statistical tests formally determining whether any significant differences exist between cases. The
variances were typically not equal across data sets, and the statistics employed in the analysis are
primarily non-parametric Kruskal-Wallis one-way analyses of variance and non-parametric Steel-Dwass simultaneous comparison tests (Conover, 1980). As the testing generated a very large amount
of data, not all of it can be shown in this section. Appendices K through N include additional
analyses of the collected data, details on the statistical tests performed, and additional data plots that are not covered in this chapter. The section begins by reviewing differences
in the Launch event Duration (LD) productivity measure before moving on to the performance of
the remaining safety and productivity metrics.
5.5.1 Effects on Mission Expediency
The time required to complete a launch event not only served as an important validation metric for the baseline manual control MASCS simulation, but is also the most critical evaluation metric for
Subject Matter Experts (SMEs) in carrier operations. In Section 5.4, it was stated that the Gestural
Control (GC) system, because of its delays and errors, should perform the worst overall while the
Vehicle-Based Human Supervisory Control (VBSC) and SBSC systems should perform the best and
possibly be equivalent to Manual Control (MC) performance. The Local Teleoperation (LT) and
Remote Teleoperation (RT) systems were expected to lie somewhere in between. In terms of Safety
Protocols, it was expected that for all productivity measures (Expediency and Efficiency), Temporal Separation (T) and Dynamic Separation (DS) would provide the best performance,
Temporal+Area Separation (T+A) the worst, and Area Separation (A) in between.
Figure 5-13 first provides column charts for LD performance for both Safety Protocols (left)
and Control Architectures (right) for the 22 aircraft Density level. This data does not differentiate
between the different Composition values.² From the data on Safety Protocol performance (left), the
detrimental effects of the T+A Safety Protocol can be observed, confirming the earlier hypothesis
regarding its performance. LD values for the remaining settings are all fairly similar, however:
Temporal, Area, and Dynamic Separation are not substantially different from one another. The
data for control architectures (right) also indicates that there is little variation across the different
cases.
Statistical tests show that there are significant differences amongst control architectures as well
as between safety protocols. A Kruskal-Wallis nonparametric one-way analysis of variance for the
Area, Temporal, and Dynamic separation safety protocols (ignoring the T+A case) reveals significant
differences at both the 22 aircraft level (χ²(2) = 34.2927, p < 0.0001) and the 34 aircraft level (χ²(2) =
100.6313, p < 0.0001). Post-hoc Steel-Dwass multiple comparisons tests reveal that at the 22 aircraft
level, the Temporal Separation protocol (mean rank = 629.62) is different from both Area (497.488)
and Dynamic Separation (511.682) protocols, with the latter two not significantly different from one
another. At the 34 aircraft level, significant differences exist for all pairwise comparisons between
the three protocols, with Temporal Separation (mean rank = 689.25) providing the largest values (different from both Area and Dynamic Separation at p < 0.0001). Area Separation (mean rank = 448.24) provides better performance than Dynamic Separation (mean rank = 505.195, p = 0.0435).

²A table of results for each individual SP-Composition pair can be found in Appendix K.

Figure 5-13: Column charts of effects of Safety Protocols (left) and Control Architectures (right) for 22 aircraft missions.
For control architectures, a Kruskal-Wallis test using CA as the grouping variable shows that significant differences exist at the 22 aircraft level (χ²(5) = 37.7272, p < 0.0001) but not at the 34 aircraft level (χ²(5) = 5.8157, p = 0.3246). A post-hoc Steel-Dwass non-parametric simultaneous
comparisons test for the 22 aircraft data reveals that significant pairwise differences exist between
the SBSC (mean rank = 438.6) case and each of the RT (605.7), GC (582.2), and VBSC (562.0)
cases. The LT (mean rank = 515.2) and RT cases are also shown to be significantly different
from one another. These results, at least for 22 aircraft, suggest that the presence of latencies and
delays in the control architectures — which occurs for the RT, GC, and VBSC systems — has some
effect on mission productivity. However, while these results are statistically significant, the practical
differences between control architectures are only on the order of a few minutes.
Figure 5-14 provides a column chart depicting the results for individual control architectures at
the 22 aircraft Density level, grouped by Safety Protocol but excluding the T+A cases. The data in
Figure 5-14 demonstrates that not only is there relatively little variation across safety protocols, there
is very little variation across control architectures within a single protocol. The most notable aspect
of the data in the figure is that the SBSC architecture appears to provide better performance than
the other architectures under the Dynamic Separation only and Area Separation safety protocols.
Additionally, RT, GC, and VBSC appear to cluster together as worst performers under the Temporal
Separation protocol. Statistical tests that compared control architecture performance separately
within each Safety Protocol were performed to further examine these differences.
Figure 5-14: Column chart comparing Launch event Duration (LD) values across all Control Architectures (CAs) sorted by Safety Protocol (SP). Whiskers indicate ±1 standard error.

For the Dynamic Separation protocol, an ANOVA shows significant differences amongst control architectures (F(5, 474) = 6.016, p < 0.0001). A Tukey HSD test across all pairwise comparisons shows that significant differences exist only between the SBSC architecture and the LT,
RT, GC, and VBSC architectures (all at p ≤ 0.0061). Under Dynamic Separation, SBSC and MC
are not significantly different from one another. For the Area Separation protocol, an ANOVA
shows no significant differences between architectures (F(5, 324) = 2.021, p = 0.0753). For the Temporal Separation protocol, an ANOVA shows significant differences amongst control architectures
(F (5, 324) = 3.947, p = 0.0017), with a Tukey test showing that only two significant differences
exist: Local Teleoperation (LT) provides smaller LD values than both the RT (p = 0.0477) and GC
(p = 0.0250) architectures. These combined results suggest that variations in control architecture
performance are limited, but that some important interactions exist between safety protocols and
control architectures.
An extended analysis of the LD results can be found in Appendix K, which presents a statistical
comparison across all individual run cases, no longer grouping data according to Safety Protocol
or Control Architecture. What these results demonstrated was that the significant main effects
of CA and SP described previously are due to differences between only a few specific missions:
the Temporal Separation protocol applied to the RT, GC, and VBSC architectures results in larger LD values, while the Dynamic and Area Separation protocols applied to the SBSC architecture result in smaller (better) LD values overall. The benefits of the SBSC architecture are likely due to the change in deck topology that occurs with the removal of the Aircraft Director crew and their “zone coverage”
routing system. Removing this crew removes constraints on when and where SBSC vehicles are
allowed to taxi, as they are not forced into the routing network in Fig. 3-7 in Section 3.2.2. This
leaves the SBSC architecture as the least-constrained architecture in terms of motion on the flight
deck, which appears to provide slight benefits in terms of launch event duration values.
However, as noted, the results show relatively little practical variation. This could be because an error exists within one or more components of the MASCS model that serves to
limit possible variations in system performance. One option is that the individual vehicle models are
flawed. However, the results of single aircraft testing in Section 5.2 did show significant differences
in performance between the architectures for individual aircraft. These differences are somehow
being obscured at the mission level. Errors may exist in how the crew “zone coverage” routing
architecture handles the simultaneous taxiing of aircraft, but the SMEs that reviewed the execution
of the MASCS simulation judged that the models of Aircraft Directors and their routing of vehicles
were accurate.
An alternative explanation involves the heuristics that govern the catapult queueing process and
the current launch preparation time model. As described in Chapter 3, Deck Handlers can assign
two aircraft to each catapult. Within a catapult pair, one aircraft prepares for launch while the other
three wait nearby (see Figure 5-15, part a). After the first aircraft launches (part b) the aircraft
on the adjacent catapult begins launching, the next aircraft at the original catapult moves forward,
and the Handler assigns a new aircraft to this queue. As additional launches occur, the Handler
continues to assign new aircraft to the catapult, often taxiing multiple aircraft in parallel towards
the same catapult pair (Figure 5-15, part c).
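A simplified sketch of this refill behavior (placeholder names, not the MASCS code): each catapult of a pair holds at most two aircraft, and every launch prompts the Handler to pull the next waiting aircraft toward that catapult.

import java.util.ArrayDeque;
import java.util.Deque;

public class CatapultPairQueue {
    private final Deque<String> catapultA = new ArrayDeque<>();  // at most two aircraft each
    private final Deque<String> catapultB = new ArrayDeque<>();
    private final Deque<String> waitingPool;  // aircraft not yet assigned to a catapult

    public CatapultPairQueue(Deque<String> waitingPool) {
        this.waitingPool = waitingPool;
    }

    public void launchFromA() { onLaunch(catapultA); }
    public void launchFromB() { onLaunch(catapultB); }

    // After a launch, the Handler immediately assigns a replacement, if one is waiting,
    // so that the catapult queue stays filled.
    private void onLaunch(Deque<String> catapult) {
        catapult.poll();                        // launched aircraft leaves the deck
        String replacement = waitingPool.poll();
        if (replacement != null) {
            catapult.add(replacement);          // replacement begins taxiing to this catapult
        }
    }
}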
It is an explicit goal of the Handler to keep these catapult queues filled: if queues are full, the
chance of any catapult sitting idle while waiting for a task is minimized. The crew “zone coverage”
routing architecture facilitates this by allowing decentralized and parallel routing of aircraft on the
flight deck, allowing the queues to be refilled quickly during operations. If the queues are kept filled
through the mission, then the time required to complete the mission is simply a matter of how
quickly the queues are emptied. This is an effect of the queuing policies at catapults and of the
launch preparation task, both of which were defined independently of the control architectures and
safety protocols of interest here. Differences in performance will only arise if an architecture fails to
provide aircraft to keep queues sufficiently filled.
Disrupting a queue requires that the new aircraft arrive at the catapults after the queue has
been emptied. That is, the three aircraft already at the catapults (Fig. 5-15, part b) must complete
their launches while the new aircraft is taxiing. Within MASCS, the time to complete the launch
preparation task averages roughly two minutes; this means that the queue is emptied in six minutes
on average. If the average taxi time for the new aircraft is less than six minutes, then aircraft will,
on average, arrive in time to keep queues filled. Figure 5-16 provides a selection of output data from
MASCS execution regarding the average taxi times and queue wait times for aircraft. Taxi times
range from 3 to 4 minutes, clearly less than the six required to clear the queues, which would suggest
that the queues remain in place throughout the mission. This is verified by the queue wait time data
on the right side of the chart, which shows that the average wait time in queue for aircraft ranges from 2 to 3.5 minutes. This occurs across all safety protocols and across all control architectures. The differences exhibited by aircraft in the single aircraft tests in Section 5.2 all occur upstream of the launch preparation process and are largely mitigated by the queueing process.

Figure 5-15: Diagrams of queuing process at catapults. a) Start of operations with full queues. b) After first launch. c) After second launch.
The queueing pattern that arises at the catapult pairs is the result of human-generated heuristics
that govern taxi routing and aircraft assignments, the launch preparation task (a task dominated by
manual human tasks, as described in Section 3.1), and the inability to launch adjacent catapults in
parallel (a physical constraint influenced by the heuristics governing the design of the deck). Until
this queueing process is altered, little can be learned about the true effectiveness of manned or
unmanned vehicles in flight deck operations. Chapter 6 will explore how the queueing and launch
preparation processes might be improved in order to eliminate the queueing phenomenon observed
here and how it affects variations in flight deck productivity. The next section continues the review
of results from the experimental matrix, describing the effects of the safety protocols and control
architectures on the safety of flight deck operations.
5.5.2 Effects on Mission Safety
Several different safety metrics were defined in Section 5.3, each addressing different ways of viewing
the interactions between vehicles and crew on the flight deck. Additionally, each of these measures
could be viewed either as an aggregate or be separated into independent measures for aircraft and
crew. The settings of the collision prevention routines for aircraft prevent the occurrence of Primary (PHI) and Secondary (SHI) Halo Incursions by other aircraft; vehicle interactions only appear at the Tertiary Halo Incursion (THI) level, which is reviewed in this first section.

Figure 5-16: Average taxi times, average launch preparation time, and average time to clear a queue of three aircraft for selected 22 aircraft missions.

Figure 5-17 presents a Pareto chart of THI plotted against LD for the 22 aircraft cases, with the points forming the Pareto frontier connected via a black line (the bottom left corner). This chart includes data points for each replication of every run setting (Control Architecture + Safety Protocol + Composition). Marker
shapes are consistent for a specific CA (e.g., GC appears as boxes, VBSC as diamonds), while marker
colors denote the Safety Protocol (SP) utilized (blue for Dynamic Separation, red for Temporal, etc.).
Because results were largely consistent across different Composition values, no differentiation is made
for this variable in order to improve the readability of the charts. Pareto charts using individual
colors for each SP-Composition pairing can be found in Appendix N.
Figure 5-17: Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 22 aircraft missions.

The general trend in the data in the figure follows the different control architectures. VBSC provides the worst performance (largest values), followed by GC, then a cluster of results for MC, LT, and RT, with SBSC providing the best results. This can be highlighted by looking at results for only the Dynamic Separation case in Figure 5-18. These results are in direct opposition to the hypotheses stated in Section 5.4, which expected that the SBSC case should produce the highest values in terms of Halo Incursion counts on the flight deck. Additionally, the wide disparity
between the VBSC and SBSC cases was not expected. The only major difference between the two
cases is the utilization of crew: under the SBSC protocol, only Escorts are utilized, while under
the VBSC protocol, a mix of Escorts and Directors are utilized. This suggests that, for these two
systems, the removal of the Aircraft Directors from the SBSC cases may have significant effects on
the THI metric. Since THI is an aggregate metric including both crew and aircraft contributions,
the contributions of one group may overshadow the other.
Figure 5-19 shows only the Tertiary Halo Incursions due to Aircraft (THIA) contribution to the
THI metric, and the trends in the THIA data are significantly different than the trends in THI in
Fig. 5-17. Most importantly, the only two main effects that appear to exist are that (1) the largest
counts of THIA values occur under the Dynamic Separation protocol and (2) that the T+A safety
protocols (bottom right corner, black markers) appear to generate the smallest values of THIA.
Additionally, the high THIA values under the Dynamic Separation protocol come only from a subset of the control architectures: GC (squares), VBSC (diamonds), and SBSC (triangles), each of which also shows a much wider range of THIA values than the MC, LT, and RT architectures. In fact, SBSC provides both some of the best (smallest) and some of the worst (largest) values in the results. Results for MC, LT, and RT all appear to cluster in the lower half of the chart; Figure 5-20 provides a side-by-side comparison of the LT and SBSC architectures to highlight these differences more clearly.

Figure 5-18: Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 22 aircraft Dynamic Separation missions.
A Kruskal-Wallis test of the data utilizing SP as a factor returns significant results (χ²(3) =
422.86, p < 0.0001) with the Dynamic Separation cases (mean rank = 922) returning the highest
overall values (worst performance), T+A returning the lowest (321), and Area (718) and Temporal
(660) in between. A Steel-Dwass multiple comparisons tests shows that the T+A and Dynamic
Separation cases are significantly different from all other safety protocols (all at p < 0.0001), while the
Temporal and Area cases are not significantly different from one another (p = 0.1386). These results
partially support the hypotheses stated in Section 5.4: as expected, the T+A protocol provides the
smallest overall values and DS the largest. However, these results fail to confirm the hypotheses that Temporal would be equivalent to T+A and Area would be equivalent to Dynamic Separation.

Figure 5-19: Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for 22 aircraft missions.

Figure 5-20: Pareto frontier plots of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for 22 aircraft missions for Local Teleoperation (LT, left) and System-Based Supervisory Control (SBSC, right).
In terms of mean THIA values, T+A (mean value = 54.54) provides a 24% improvement in
average THIA scores compared to the Dynamic Separation protocol (mean value = 71.33) and a
13% improvement as compared to Temporal (mean = 62.54) or Area Separation (mean = 63.26).
However, as pointed out in Section 5.5.1, this comes at a cost to LD: the average LD for the T+A
cases is 36.17 minutes, a 44.3% increase over the average 25.06 minute mission duration for the
combined Area, Dynamic Separation, and Temporal protocol results. Applying multiple high-level
constraints to operations with the combined Temporal+Area protocol does appear to provide a
benefit to vehicle safety (smaller THIA values), but this comes at a much larger relative cost to
mission productivity (increased LD values).
Significant differences also exist across the different control architectures (Kruskal-Wallis test,
χ²(5) = 362.0433, p < 0.0001), with Steel-Dwass multiple comparisons tests showing differences
between two sets of vehicles: the three “autonomous” UAV architectures of GC (mean rank =
1050.8), VBSC (972.8), and SBSC (811.74) all perform significantly worse than the “manned” cases
MC (mean rank = 260.27), LT (234.0), and RT (431.0), with GC also being significantly worse than
VBSC (VBSC v. MC, p = 0.0026; all others, p < 0.0001; full results appear in Appendix N). These
results are likely influenced by the poor performance of GC, VBSC, and SBSC under the Dynamic
Separation protocol. These results also fail to confirm the hypotheses from Section 5.4: while the
SBSC architecture does provide the largest values and worst performance (as expected), significant
differences were not expected between the other architectures.
The original hypothesis was that the iterative interactions of the Deck Handler in VBSC and
the use of the “zone coverage” routing structure in MC, LT, RT, and GC would each generate
sufficient spacing between vehicles to keep THIA values low. The statistical tests of the previous
paragraph suggest that this does occur for the MC, LT, and RT architectures, as tests demonstrated
that these three architectures were significantly different from, and provided smaller THIA values
than, the remaining architectures. For the MC, LT, and RT architectures, the combined Temporal+Area case also further reduced THIA values, but only slightly. This suggests that the “zone
coverage” mechanism used by these architectures may function as a different form of safety protocol
during operations that is incompatible with the Area and Temporal separation protocols.
Interestingly, the GC architecture provides significantly worse performance than LT and RT
although it also utilizes the crew’s zone coverage routing method. This poor performance is likely
due to the failures and processing latencies of the GC system; as shown previously in Section 5.2,
these failures and delays made GC the slowest of the UAV control architectures. This is partially
due to the topology of the zone coverage routing network, which requires multiple small taxi tasks
be performed as vehicles are passed between Aircraft Directors on the deck. When a GC vehicle is
handed off from one Director to another, the vehicle has a chance first to fail to recognize the new
Director, then multiple chances to fail to recognize his commands, as well as delays from processing
latencies. While this is occurring, the first Director is providing instructions to a new vehicle. If this
vehicle more readily finds the Director and accepts his commands, it will begin moving forward but
will be blocked by the first vehicle. The second aircraft will taxi forward until its collision prevention
routine commands that it stop, incurring a new THIA penalty. If this process continues repeatedly
for all vehicles on the flight deck, this would generate a higher rate of THIA values than LT and RT.
The poor performance of the VBSC and SBSC architectures is also likely explained by the
change in the topology of the deck due to their lack of use of the “zone coverage” crew routing
mechanism. This change in topology means that these two architectures do not experience the same
“natural spacing” as the other architectures. Instead, they are given greater freedom in taxiing on
the flight deck, coming as close to other vehicles as the Dynamic Separation collision prevention
routine will allow. Interestingly, the delays of the VBSC architecture do not exacerbate the effects
of the more open routing system; this may be due to the influence of the Deck Handler’s queuing
policy. The VBSC Handler prioritizes aircraft closest to their catapults specifically to avoid the
conditions described for GC: aircraft at the “front” of operations are given instructions first in order
to prevent them from obstructing traffic. The flow of traffic is kept moving, and the same effects
observed for GC should not occur for VBSC.
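The Handler’s ordering rule can be sketched as a simple priority queue keyed on distance to the assigned catapult; the aircraft identifiers and distances below are made up for illustration and are not MASCS data.

import heapq

# Sketch of the queuing policy described above: the Deck Handler instructs the aircraft
# closest to its catapult first so that vehicles at the "front" of operations do not block
# traffic behind them. Identifiers and distances are illustrative only.
aircraft = [("AC-3", 145.0), ("AC-1", 40.0), ("AC-2", 88.0)]  # (tail number, distance to catapult in meters)

handler_queue = [(distance, tail) for tail, distance in aircraft]
heapq.heapify(handler_queue)  # min-heap: smallest distance is served first

while handler_queue:
    distance, tail = heapq.heappop(handler_queue)
    print(f"Handler instructs {tail} (distance to catapult: {distance:.0f} m)")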
In general, these results for the THIA aircraft safety measure suggest that the topology of the
flight deck, as influenced by the use of the crew’s “zone coverage” routing methods, is the most
important driver of performance in this environment. Architectures using that routing method and
not experiencing significant numbers of failures in operations (MC, LT, RT) provided the smallest
values for the metric. The two architectures not using the crew routing (VBSC and SBSC), provided
some of the worst values. While GC also used the crew routing, its high failure rates interacted
with the crew routing to generate high THIA values. For these latter three architectures
exhibiting very poor performance, results suggest that the use of the safety protocols is beneficial in
lowering the rate of aircraft interactions on the flight deck. However, overconstraining operations in
the T+A protocol provides only a slight improvement over the Temporal or Area Separation cases
at a significant penalty to LD values.
Similar trends appear in data taken at the 34 aircraft Density level (Figure 5-21). Kruskal-Wallis
tests for SP main effects (χ2 (3) = 78.847, p < 0.0001) and CA main effects (χ2 (5) = 885.2522, p <
0.0001) again reveal significant results (full results appear in Appendix N). For Safety Protocols,
a Steel-Dwass multiple comparisons test reveals that T+A (mean rank = 547.07) is significantly
different from Area (mean rank = 738.10, p < 0.0001), Temporal (mean rank = 630.15, p = 0.0266),
and Dynamic Separation (mean rank = 788.11, p < 0.0001). Temporal is also significantly different
than Dynamic Separation (p < 0.0001) and Area (p = 0.0028), while Area and Dynamic Separation
are not significantly different from one another (p = 0.209).
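The omnibus tests reported throughout this section can be reproduced with standard statistical libraries. The sketch below uses randomly generated stand-in data rather than MASCS output; the Steel-Dwass pairwise comparisons used in the thesis are not available in scipy, so a Dunn test with a multiplicity correction is a common substitute in Python.

import numpy as np
from scipy import stats

# Kruskal-Wallis test of a safety metric grouped by Safety Protocol. The values below are
# randomly generated stand-ins; in the actual analysis each group would hold the THIA
# values of the thirty replications run under one protocol.
rng = np.random.default_rng(0)
groups = {
    "Dynamic Separation": rng.normal(60.0, 10.0, 30),
    "Area":               rng.normal(57.0, 10.0, 30),
    "Temporal":           rng.normal(55.0, 10.0, 30),
    "Temporal+Area":      rng.normal(52.0, 10.0, 30),
}

h_stat, p_value = stats.kruskal(*groups.values())
print(f"H({len(groups) - 1}) = {h_stat:.2f}, p = {p_value:.4f}")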
A Steel-Dwass test on the Control Architecture results shows that significant pairwise differences
exist between all cases except between MC and LT (p = 0.5905; MC and RT, p = 0.0003; VBSC
and GC, p = 0.0034; all others, p < 0.0001). Recall that at the 22 aircraft setting, differences
occurred primarily between the two groups of “manned” (MC, LT, RT) and “unmanned” (GC,
VBSC, SBSC) architectures, where comparisons within groups were not significant. These data
suggest that increasing the number of aircraft on the flight deck, which introduces new constraints
in how aircraft can be routed on the flight deck and changes how the Deck Handler allocates aircraft
through the mission, encourages further variations in the GC, VBSC, and SBSC architectures.
A review of results for the Duration of Tertiary Halo Incursions due to Aircraft (DTHIA) metric
was also conducted, which appears in Appendix M. However, these results demonstrated the same
general trend of results as the THIA metric. The trend observed in these metrics indicates that the
safety of aircraft interactions is influenced largely by the selection of the topology of the flight deck
and the role of crew in that topology. These results only partly support the hypotheses discussed in
Section 5.4. While, overall, the trends in performance for control architectures and safety protocols
were as expected, a variety of unexpected interactions existed within the system, due largely to the
differences in the topology of the flight deck between the “zone coverage” and supervisory control
routing methods. These were significant enough that the SBSC control architecture exhibited both some of the best and some of the worst performance.
Figure 5-21: Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for 34 aircraft missions.
While these interactions highlight important considerations about the effects of crew interactions
on aircraft safety, aspects of crew safety are a second important consideration in operations. Recall
that the Pareto chart of the aggregated THI metrics showed trends quite different from the THIA
metric. As noted previously, these differences are likely due to the crew component. However, the
models of crew behavior implemented within MASCS were quite simple and do not reflect the true
complexity of crew behavior when working in close proximity to aircraft. The PHI and Duration of
Primary Halo Incursions (DPHI) metrics do provide some insight as to how vehicle routing affects
the rate at which vehicles encounter crew on the deck, but they cannot truly offer any information
on crew-vehicle interactions. To better characterize those interactions, future work should address
the creation of more complex models of crew and vehicle motions at a much finer resolution than
currently done in MASCS. An analysis of the Primary Halo Incursion (PHI) and Duration of Primary
Halo Incursions (DPHI) measures for this current work can be found in Appendix M, which indicates
that once again, the primary effect on operations was the choice of crew routing topology.
Several other safety metrics were defined in Section 5.3 that have not been discussed here. While
these metrics were output from MASCS and were analyzed, the trends in performance are the same
as those presented within this section. In particular, the predictive Collision measures were virtually
identical to the Halo Incursion measures, which is likely an artifact of the slow speeds of transit
on the flight deck (it takes several seconds for an aircraft to travel its body length). Pareto charts
of these other metrics are provided in Appendix N. The next section reviews the final category of
dependent variables included in MASCS: measures of efficiency.
5.5.3 Effects on Mission Efficiency
The design of the safety protocols and their interactions with the Composition settings were expected
to drive significant variations in the efficiency measures, just as they did for the productivity and safety measures
reviewed in earlier sections. While these measures may not be correlated directly to Launch event
Duration, their performance is still of interest: longer taxi distances imply more wear on aircraft
tires and on the deck itself, as well as increased fuel consumption. While these measures of efficiency
may not be the primary concerns of stakeholders at this time, they may be sources of secondary
benefits to operations.
The first of these measures, Total Taxi Distance (TTD), sums the total travel distance of all
aircraft in the simulation, providing a measure of the efficiency with which aircraft were allocated
to catapults by the Deck Handler planning heuristics, as well as the efficiency of routing in the
zone coverage routing scheme. Figure 5-22 shows data for this metric at the 22 aircraft Density
level. Similar to previous measures of safety, it can be seen that the 50% Area cases provide very
low values of TTD, whilst the Temporal and Temporal+Area protocols provide the worst overall.
The performance of the Temporal cases is perhaps not surprising, given the previous discussions
about the rate of transition from fore to aft and vice versa. The increased values at the T+A
setting, however, come primarily from the unmanned vehicles that operate at the end of the mission
schedule: some subset of these vehicles are assigned to the forward catapults, incurring much larger
taxi distances than would be seen in the Area and Dynamic Separation cases.
A Kruskal-Wallis test using SP settings as the grouping variable reveals significant differences
within the data (χ2 (3) = 972.0598, p < 0.0001), and a Steel-Dwass test shows that the Temporal
and T+A safety protocols generate increased taxi distances on the flight deck (full results appear
in Appendix N). A slight trend does appear to exist for CAs within the larger pattern of SP, which
demonstrates that the architectures utilizing the X-47 aircraft model have shorter total taxi distances
on the flight deck. This is likely an artifact of the differences in shape of the vehicles: the X-47
aircraft used for GC, SBSC, and VBSC is shorter overall, and the gains in performance are likely
the result of minor differences in starting locations and ending locations (at catapults), and possibly
of routing actions, because of this change in size. The plot of results for the 34 aircraft
case appears in Appendix N but shows generally the same performance as the results shown here for 22
aircraft.
Two additional efficiency measures were also tracked: Total Aircraft taXi Time (TAXT)
values (Figure 5-23) summed the total time in which vehicles are actively moving or otherwise
available to move, whilst Total Aircraft Active Time (TAAT) measures time spent in motion as
well as time spent at the catapult performing the launch tasks.
Figure 5-22: Pareto frontier plot of Total Taxi Distance (TTD) versus LD for 22 aircraft missions.
An error in coding led to slightly different calculations of these metrics for the HSC vehicles (SBSC and VBSC) and the remaining
cases, based on the use of Aircraft Directors in zone coverage. For cases using zone coverage routing,
TAXT included all time during which vehicles had an assigned Director and were not stopped due
to the collision prevention or traffic management routines; if the vehicle did not have a Director
assigned, or was otherwise stopped from moving, the time was not logged. For the remaining SBSC
and VBSC vehicles, because Directors were not utilized, time spent waiting on taxi paths to open
for motion was included in the time logging. This results in the significant variances amongst the
CAs in Figure 5-23, which shows the results of TAXT for the 22 aircraft case. The data also appear
highly correlated, which is expected given the nature of flight deck operations. Otherwise, aside from
the significant disruptive effects of the T+A case, no major trends are apparent. Values of TAAT
are almost identical to these measures and are not shown here (graphs appear in Appendix N).
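The two accounting rules described above can be summarized in a short sketch; the field and function names are illustrative and do not correspond to the MASCS data model.

from dataclasses import dataclass

# Sketch of the two TAXT accounting rules described in the text. Field names are
# illustrative only; the point is the difference in which time increments get logged.
@dataclass
class TickState:
    has_director: bool       # an Aircraft Director is currently assigned to the vehicle
    stopped_by_safety: bool  # halted by the collision prevention or traffic management routines
    waiting_on_path: bool    # idle while waiting for a taxi path to open

def taxt_increment_zone_coverage(state: TickState, dt: float) -> float:
    # MC, LT, RT, GC: log time only while a Director is assigned and the vehicle is not halted.
    return dt if state.has_director and not state.stopped_by_safety else 0.0

def taxt_increment_supervisory(state: TickState, dt: float) -> float:
    # SBSC, VBSC: no Directors are used, so time spent waiting on taxi paths is also logged.
    return dt if not state.stopped_by_safety or state.waiting_on_path else 0.0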
5.6 Discussion
The Multi-Agent Safety and Control Simulation (MASCS) experimental matrix (Fig. 5-1) sought
to examine the interactions and effects of four different variables on the safety and productivity of
flight deck operations: Safety Protocols (SPs), Unmanned Aerial Vehicle (UAV) Control Architectures (CAs), mission Density (number of aircraft), and mission Composition (number of unmanned
vehicles used in the mission). The results of the testing described in this chapter have shown that
while variations in safety do occur in terms of the defined safety metrics, variations in mission productivity are minimal, especially for the key metric of Launch event Duration (LD). As a result,
tradeoffs between safety and productivity only truly occur when operations are heavily constrained
in the combined Temporal+Area Separation protocol. As compared to the baseline Manual Control (MC) case at 22 aircraft, the increased constraints of the T+A case increases the safety of
aircraft interactions by only 8% (MC average = 59.53, average of all T+A cases = 54.54) while
decreasing mission productivity by 44.3% (MC average = 24.97 minutes, T+A average = 36.17).
The causes of both the variations in safety measures and the lack of variation in productivity
measures can be attributed largely to the human-generated heuristics of flight deck operations that
dictate how aircraft are routed through the flight deck and queued at catapults. For trends in
safety measures, a key variable in aircraft interactions under the Tertiary Halo Incursions due to
Aircraft (THIA) metric was whether or not crew were being used in the current “zone coverage”
routing topology. For control architectures operating under this topology and that did not experience
high levels of failures or latencies in operations, the spacing afforded by the prior crew routing
topology improved safety. The choice of Safety Protocol had little effect in these cases, as the
routing topology already maintained sufficient spacing between vehicles. For control architectures
not using this prior crew routing strategy, or that did while experiencing significant delays and
latencies (the Gesture Control, GC, architecture), the safety of aircraft interactions was influenced
by the number of high-level constraints applied to operations.
Figure 5-23: Pareto frontier plot of Total Aircraft taXi Time (TAXT) versus LD for 22 aircraft missions.
The Dynamic Separation only case,
with no high-level constraints, provided the worst performance for these architectures. Applying one
high-level constraint in either Area or Temporal Separation provided improved performance, with
the combined Temporal+Area case providing the best.
Variations in productivity were also contingent on the role of crew, albeit in a different manner:
their role in supporting the launch preparation tasks on the flight deck, in combination with the
catapult queueing and Deck Handler mission planning. Here, the long average time to complete the
launch preparation task enables queues of aircraft to be maintained at catapults. These catapult
queues become the bottleneck in operations, as taxi operations appear to be efficient enough to
supply aircraft to catapults before queues can be emptied regardless of the safety protocols or
control architectures in use. However, further improvements to the queuing process and the launch
preparation task should allow variations in productivity to arise between control architectures.
When considering the goals of this dissertation — to understand the effects of integrating unmanned vehicle control architectures on the safety and productivity of flight deck operations — the
results of this chapter indicate that very little can be said about unmanned vehicle systems on their
own. Rather, the results indicate that the primary influences in safety and productivity in flight
deck operations come from the interactions of the architectures with the crew at both a low level
(interactions with crew as governed by heuristics for vehicle routing, the deck topology, the launch
preparation process, and the catapult queueing process) and at a high level (the locations of crew
on the deck and their role in routing vehicles). The selection of different control architectures or
safety protocols only has an effect in the context of how it changes vehicle interactions with crew.
Accordingly, changes to the models of crew behavior may generate very different trends in results.
Any future work addressing the results discussed in this chapter could place greater emphasis on
investigating the role of the crew in all aspects of operations. This might include examining formal
optimizations of crew behavior on the flight deck or simply altering the heuristics that guide their
behavior during operations. Chapter 6 will address some of these issues, but will focus primarily on
the launch preparation process and its effects on flight deck performance.
The data utilized in arriving at these conclusions was substantial — 92 runs, at thirty replications
each, and a minimum of 20 minutes per run. Collecting this data in real-world trials would have
required considerable resources and time in the field, disregarding the fact that many of the control
architectures tested here do not yet exist. This demonstrates the utility of the use of simulations, and
in particular, agent-based simulations for testing these systems. Modeling crew, aircraft, planning,
and task execution as independent elements of the system has afforded an ability to examine and
attribute results to the individual elements. These results have raised several additional questions,
however, including what role the launch preparation time plays in regulating deck performance
and whether any variations in control architecture productivity can be generated. The power of
simulations such as MASCS is that these changes can be made easily within the system and enable
an exploration of a variety of conditions quickly and efficiently. Chapter 6 explores how MASCS
can be used in further exploring the key drivers of performance for UAV control architectures and
for flight deck operations as a whole.
5.7 Chapter Summary
This chapter has described the process by which models of UAV performance on the aircraft carrier
flight deck were generated for the MASCS simulation environment, the verification of these models,
and the test and analysis of these models in terms of both the relative safety and productivity of
operations. The results of these tests demonstrated a wide range of effects, as well as showing how
varying definitions of safety metrics can provide wide-ranging results in terms of the performance
of different systems. Overall, these results failed to confirm many of the hypotheses described in
Section 5.4, most notably showing that the human heuristics that govern flight deck operations have
significant effects. While the choice of unmanned vehicle control architecture does result in variations
in safety metrics, these are primarily because of differences in when and how those heuristics are
applied. Variations in productivity (in terms of Launch event Duration) were largely not observed,
due to the bottleneck in catapult queueing that was created by the heuristics of the routing topology,
catapult queueing process, and the role of the crew in launch preparation. While these effects mean
that very little can be said about how the choice of unmanned vehicle control architecture affects
operations, these results highlight key concerns about the role of crew in flight deck operations that
should be addressed in future work. Some of these concerns will be addressed in the next chapter,
examining how the MASCS simulation can be used to explore improvements to the launch preparation time model that would remove the bottleneck in operations and how this affects productivity
and safety on the flight deck.
Chapter 6
Exploring Design Options
“Artificial intelligence is often described as
the art and science of ‘getting machines to
do things that would be considered
intelligent if done by people.’ ”
Alone Together: Why We Expect More
from Technology and Less from Each
Other (2012)
Sherry Turkle
Chapter 1 described the motivation for the development of the Multi-Agent Safety and Control Simulation (MASCS) environment: to develop a simulation methodology for simulating the
integration of futuristic unmanned vehicle systems into mixed human-vehicle environments. This
methodology was described in Chapter 3, calibrated and partially validated in Chapter 4, and used
to test a set of candidate unmanned vehicle control architectures and operational safety protocols in
Chapter 5. An added benefit of simulations like the MASCS environment is that it enables further
investigations into the specific vehicle parameters that drive variations in performance in flight deck
operations.
The results of Chapter 5 indicated that the interactions between agents in the system are affected
not only by the types and number of vehicles active in the environment but by the constraints
placed on their behavior due to safety protocols and to the structure and organization of operations.
Neither of these factors is related to the choice of unmanned vehicle control architecture, which
demonstrated significant and varied interactions with these factors. Understanding these interactions is especially important for systems that employ large numbers of unmanned vehicles, manned
vehicles, and human collaborators, as testing such environments in the real world is both costly
and may endanger the individuals involved in the test program. However, the utility of systems
like MASCS is that they can quickly and easily be altered to incorporate new vehicle models, crew
behaviors, environments, or other features of operations.
This chapter examines how a model like MASCS can be used to explore issues that affect the
performance of the flight deck system as a whole. Potential points of investigation include process-level changes to operations, parameter-level changes to Unmanned Aerial Vehicle (UAV) models,
and architectural changes to the environment itself. For example, MASCS can be used to examine
how reductions in the number of crew used in “zone coverage” routing operations affect performance
on the flight deck (an architectural change). MASCS can also be used to examine alternate versions
of the unmanned vehicle control architectures (behavioral changes) defined in Chapter 5, altering the
individual failure rates of vehicles, changing the parameters that define the latencies for vehicles in
the system, or examining how the speed of aircraft travel affects mission performance. Additionally,
reconfigurations of the flight deck itself, in terms of both the placement of aircraft (an operational
change) as well as the location of the catapults (an architectural change) might also be tested.
Many other potential changes could be tested, but this chapter will address only a few. Chapter 5
suggested that the launch preparation task may have a significant role in regulating the pace of
flight deck operations. The effects of modifying this task are examined first, reviewing how changes
to this task might improve the productivity of flight deck operations for current Manual Control
operations before continuing on to examine its effects on unmanned vehicle operations. This then
extends to include launch preparation effects on the safety of operations, with the last section
exploring individual parameter settings for the Gestural Control (GC) architecture to examine how
its performance could be improved.
6.1 Exploration of Operational Changes
The results of Chapter 5 previously suggested that the combination of Deck Handler allocation
strategies and large launch preparation times leads to the creation and continued existence of queues
at catapults for all control architectures. With these queues in existence, the performance of the
flight deck is dependent only on the launch preparation time, eliminating much of the possible
variance in launch event duration. Improving the launch preparation process, whose characteristics
are independent of any specific vehicle control architecture, should both reduce launch event duration
values and eventually allow variations in the launch event duration values of the unmanned vehicle
architectures to arise. This section explores modifications to this task for the baseline Manual
Control (MC) architecture first, as it was partially validated against empirical data in Chapter 4 and
is considered representative of current operations.
As described in Chapter 3, Section 3.2, the launch preparation task requires several tasks be
performed manually by crew members on the flight deck. Appendix C described its definition as
a normal distribution with a mean µ = 109.65 seconds and standard deviation σ = 57.80 seconds
(N(109.65, 57.80²)). Improvements to this task can be accomplished both by reducing the mean and
by reducing the standard deviation (square root of the variance) of the task. Reducing the mean
value of the launch preparation time should have a strong effect in reducing the average launch
event duration, as it should directly reduce the average time to launch an aircraft. Reducing the
variance of the launch preparation task decreases the occurrence of very large, unpredictable outlier
values and should also result in decreases in the average launch preparation time. Additionally,
removing these large outlier values might also make it more difficult to sustain queues of aircraft at
the catapults. Removing these outliers also has a side effect of increasing the predictability of the
task, a benefit for crew and supervisors on the flight deck.
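A minimal sketch of this task-time model and of a reduced variant is shown below; clipping negative draws at zero is an assumption made here only for illustration, not necessarily how MASCS handles the tail of the distribution.

import numpy as np

# Sketch of the launch preparation time model, N(109.65, 57.80^2) seconds, and a reduced
# variant. Clipping negative draws at zero is an illustrative assumption.
rng = np.random.default_rng(42)

def launch_prep_time(mean_s: float = 109.65, sd_s: float = 57.80) -> float:
    return max(0.0, rng.normal(mean_s, sd_s))

baseline = [launch_prep_time() for _ in range(10_000)]
reduced = [launch_prep_time(mean_s=54.82, sd_s=28.90) for _ in range(10_000)]
print(f"baseline mean: {np.mean(baseline):.1f} s, reduced mean: {np.mean(reduced):.1f} s")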
If the launch preparation process continues to require that crew perform actions manually, decreasing the mean might be accomplished by providing the crew with improved checklists, electronic
checklists, or better training. These changes might also drive down the standard deviation, but a
more effective way of reducing both the mean and the standard deviation would be by automating
aspects of the launch preparation process. Checking the aircraft’s weight and using it to set the
catapult strength could be performed by wireless data transmission, while the physical connection
of the aircraft to the catapult’s “pendant” that propels the aircraft forward could perhaps be done
through passive interlocking mechanisms that require no human intervention.
However, it is not clear how much the launch preparation time needs to be reduced to eliminate
the queueing process and whether or not taxi operations can keep pace with these reductions. In this
first section, the effects of varying the mean and standard deviation of the launch preparation task
are explored in order to determine their effects on operations and whether or not taxi operations
becomes a limiting factor in operations. The first series of tests systematically reduces the standard
deviation σ of the launch preparation time model in increments of ten percent while holding the mean
µ constant at 109.65 seconds. A second series holds standard deviation constant at 57.80 seconds
while decreasing the mean. A third examines the combined effects, varying the mean while holding
the standard deviation constant at its smallest value from the first set of tests (28.90 seconds).
The cases tested are listed in Table 6.1 and were applied to the baseline MC control architecture that was previously tested in Chapter 4. Tests occurred only at the 22 aircraft Density
setting, with thirty replications occurring for each launch preparation time model. Results for these
tests appear in Figure 6-1 in line chart form. At each plotted point, whiskers indicate ±1 standard
error in the data for the specified launch preparation model. The figure also includes linear fits for
all three data series, with both the fit and the resulting R2 value reported for each case.
As expected, reducing either the mean or the standard deviation has beneficial effects on the
Launch event Duration (LD) values, with reductions in the mean providing stronger effects. Decreasing the standard deviation by 50% provides only a 10.48% reduction in average LD from the
original launch preparation time model (Table 6.2). This same performance is provided by only a
20% reduction in the mean.
Table 6.1: Settings for launch preparation time testing.

            Reducing Standard Deviation   Reducing Mean        Reducing Mean, 50% Standard Deviation
Baseline    N(109.65, 57.80²)             N(109.65, 57.80²)    N(109.65, 28.90²)
-10%        N(109.65, 52.02²)             N(98.6832, 57.80²)   N(98.6832, 28.90²)
-20%        N(109.65, 46.24²)             N(87.7184, 57.80²)   N(87.7184, 28.90²)
-30%        N(109.65, 40.46²)             N(76.7536, 57.80²)   N(76.7536, 28.90²)
-40%        N(109.65, 34.68²)             N(65.7888, 57.80²)   N(65.7888, 28.90²)
-50%        N(109.65, 28.90²)             N(54.824, 57.80²)    N(54.824, 28.90²)
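The settings in Table 6.1 can be regenerated programmatically. Note that the table's mean entries appear to derive from a slightly less rounded baseline mean (approximately 109.648 seconds), so the sketch below, which uses the rounded 109.65-second figure from the text, differs from the table in the last decimal places.

# Sketch regenerating the three test series of Table 6.1: reducing the standard deviation,
# reducing the mean, and reducing the mean with the standard deviation already halved,
# each in 10% increments from the baseline model.
BASE_MEAN_S, BASE_SD_S = 109.65, 57.80

def launch_prep_cases():
    cases = []
    for step in range(6):  # baseline through -50%
        factor = 1.0 - 0.1 * step
        cases.append({
            "label": "Baseline" if step == 0 else f"-{10 * step}%",
            "reduce_sd": (BASE_MEAN_S, BASE_SD_S * factor),
            "reduce_mean": (BASE_MEAN_S * factor, BASE_SD_S),
            "reduce_mean_half_sd": (BASE_MEAN_S * factor, BASE_SD_S * 0.5),
        })
    return cases

for case in launch_prep_cases():
    print(case["label"], case["reduce_sd"], case["reduce_mean"], case["reduce_mean_half_sd"])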
Figure 6-1: Effects of altering launch preparation mean and standard deviation on mean launch
event duration for 22-aircraft Manual Control case.
Reducing the mean by 50% (and holding standard deviation constant)
leads to a 25% reduction in average LD from the original LP model (µ = 24.97 to µ = 18.86).
Reducing both the mean and standard deviation amplifies the beneficial effects of the two: at the
50% mean, 50% standard deviation condition, average LD is reduced by 43%, from 24.97 to 14.17
minutes.
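The reductions quoted above follow directly from the mean LD values; a quick check using the 22-aircraft figures reported in Table 6.2:

# Check of the percentage reductions quoted above, using the 22-aircraft mean LD values
# (baseline of 24.97 minutes under the original launch preparation model).
baseline_ld = 24.97
for label, mean_ld in [("SD -50%", 22.35), ("Mean -50%", 18.86), ("Mean -50%, SD -50%", 14.17)]:
    reduction = 100.0 * (baseline_ld - mean_ld) / baseline_ld
    print(f"{label}: {mean_ld:.2f} min ({reduction:.1f}% below baseline)")
# Prints roughly 10.5%, 24.5%, and 43.2%, consistent with the figures cited in the text.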
The four end conditions (the original Launch Preparation (LP) model, standard deviation-50%,
mean-50%, and mean-50% + standard deviation-50%) were repeated for the 34 aircraft case. The
results of these tests appear in Table 6.2 and show similar effects to those of the 22 aircraft tests.
The standard deviation-50% case provides a 9.75% decrease in average LD from the original launch
preparation time model, while the mean-50% case provides a 25.17% decrease in mean LD. Combining the two provides a 41.54% decrease in launch event duration. At this latter setting, the average
time to launch a mission of 34 aircraft (22.78 minutes) requires less time than a team of 22 aircraft
under the original launch preparation time model (24.97 minutes).
Table 6.2: Mean launch durations and percent reductions for the 22 and 34 aircraft Manual Control cases for variations in the launch preparation (LP) model.

                       22 aircraft                               34 aircraft
                       Mean Launch       % Reduction             Mean Launch       % Reduction
                       event Duration    in Mean                 event Duration    in Mean
Baseline               24.97                                     38.97
SD -50%                22.35             -42.64%                 35.17             -9.75%
Mean -50%              18.86             -51.61%                 29.16             -25.17%
Mean -50%, SD -50%     14.17             -63.65%                 22.78             -41.54%
These effects are of immediate practical significance to current Navy research and its development
of a next generation of electromagnetic (as opposed to current steam-powered) launch catapults for
the new Ford class of aircraft carriers. To generate the most benefit to operations, reducing both the
mean and the standard deviation is suggested. However, it is not clear that these improvements in
the launch preparation process are part of the objectives of the new catapult system being developed.
The EMALS project seeks to replace the steam catapult system with an electromagnetic one, but of
the six benefits listed by developer General Atomics on their website1 , only one relates to manpower
or launch rates (“reduced manning workload”). Even so, “reduced manning workload” is likely far
from complete automation of the process, which should further reduce the mean and the standard
deviation of task process times. The decreases to mean and standard deviation tested in this section
are modest compared to what might be possible with advanced automation of the launch preparation
task.
The results of this section have demonstrated that reducing both the mean and the standard
deviation of the launch preparation time improves LD values for flight deck operations. The most
substantial reductions in the launch preparation time model (reducing both the mean and standard
deviation by 50%) resulted in decreases in launch event duration of over 40% from the original LP
model. These reductions in launch preparation time may be strong enough to break the queuing
process for the different control architectures and allow variations in their LD values to arise. This
is the focus of the testing described in the next section.
6.1.1 Effects of Launch Preparation Times on UAV Performance
In the previous tests in Chapter 5, very little variation in average launch event duration values was
observed across the different unmanned vehicle control architectures. The only significant interactions that were observed occurred between the subset of the very best and very worst test cases. The
cause of this was hypothesized to be the continued existence of queues at the catapults, produced by the
long launch preparation times combined with Handler allocation strategies. Decreasing the launch
preparation time should at some point allow queues to clear rapidly enough that taxi operations
1 http://www.ga.com/emals
cannot keep pace, allowing variations in launch event durations for the different control architectures
to be observed. This is the focus of this section, beginning with an examination of the effects of
changing Launch Preparation (LP) parameters on control architecture performance.
Effects of Reducing Launch Preparation Standard Deviation
A first series of tests examined the effects of reducing the standard deviation in the launch preparation time model on the launch event duration performance of the different UAV Control Architectures. These tests address whether the reduction in standard deviation, which reduces the
largest launch preparation times that occur, reduces the ability of the control architectures to generate queues at the catapults. The Gestural Control (GC) and System-Based Human Supervisory
Control (SBSC) control architectures were tested first based on their status as the best- and worst-performing architectures, respectively. Each was tested utilizing the 22 aircraft Density level and
the 100% Composition level in order to maximize the effects of each control architecture. If reducing
the standard deviation of the launch preparation task has an effect on queuing, it should result in
significant differences between these two control architectures.
For each architecture, standard deviation values were decreased from the baseline value (σ =
57.80 seconds) in increments of 10%. Thirty replications were performed for each control architecture
at each standard deviation value. From the results of the previous section, the response of the GC
and SBSC architectures is expected to be linear with respect to decreases in launch preparation
standard deviation, although differences in performance may arise between the two. The results of
these tests are shown in Figure 6-2, which includes the results for Manual Control (MC) from the
previous section as a comparison point.
A fairly linear decrease is observed for each architecture, although there does appear to be
significant noise within the data. The SBSC and MC data also appear to converge at the smaller
standard deviation values (beginning at the -30% condition), suggesting that improving the launch
preparation process also removes the advantages from the more open routing methods of SBSC.
Conversely, it might also signify that the inefficiencies of the crew zone coverage routing are weaker
at the improved launch preparation tempo. The GC architecture, however, maintains its relatively
poor performance at each level.
At the smallest standard deviation value (28.90 seconds, original value -50%), the average time
to complete a mission across the three control architectures decreases from 24.67 minutes to 22.73
minutes (a 7.87% drop in mean LD). The GC results (mean = 23.09, standard error = 0.24 minutes)
appear to be significantly different from the SBSC results (mean = 22.41, s.e. = 0.22 minutes).
Additional tests were then conducted for the remaining control architectures (Local Teleoperation
(LT), Remote Teleoperation (RT), and Vehicle-Based Human Supervisory Control (VBSC)) using
the standard deviation -50% setting at the 22 aircraft Density levels, and tests for all architectures
were then conducted at the 34 aircraft level. These results appear in Figure 6-3 alongside results for
missions using the original launch preparation time model.
Figure 6-2: Line chart of mean Launch event Duration (LD) for the Manual Control (MC), Gestural Control (GC), and System-Based Human Supervisory Control (SBSC) control architectures for decreases in launch preparation time standard deviation. Whiskers indicate ±1 standard error in the data.
An ANOVA test of the results at the 22 aircraft level reveals significant differences across cases
(F (5, 174) = 5.1771, p = 0.0002). A Tukey HSD test reveals that significant pairwise differences exist
between VBSC (µ = 22.14 minutes) and RT (µ = 23.55, p = 0.0006), VBSC and LT (µ = 23.18,
p = 0.0067), MC (µ = 22.35) and RT (p = 0.0296), and SBSC (µ = 22.41) and RT. Additionally,
the GC (µ = 23.09) and VBSC cases are marginally different (p = 0.0738). At the 34 aircraft density
level, an ANOVA also reveals significant differences (F (5, 174) = 5.5665, p < 0.0001). A Tukey HSD
test shows significant pairwise differences only between the MC (µ = 35.17) and RT (µ = 36.79),
MC and GC (µ = 36.61), and LT (µ = 35.66) and RT cases.
Figure 6-3: Column graphs depicting Launch event Duration (LD) values for all control architectures under the 50% standard deviation, original mean launch preparation time model for 22 (left) and 34 (right) aircraft.
In terms of practical differences, the range of average launch event duration values does not
greatly change under the new reduced standard deviation launch preparation time model. Under
the original LP model, the difference between best and worst was only 1.38 minutes for the 22
aircraft missions and 1.56 minutes for the 34 aircraft missions. Under the new LP model with
standard deviation reduced by 50%, these differences increase slightly to 1.42 minutes for 22 aircraft
and 1.61 minutes for 34 aircraft. However, the slight decrease in average launch event duration
under the new launch preparation time model makes these differences significant in statistical tests.
Even so, these differences are limited to comparisons of the very best (MC, LT, and SBSC) and the
worst (GC and RT). Interestingly, at these settings, the VBSC architecture provides some of the
best results, in opposition to some of the results presented in Chapter 5.
One explanation for this is that the periodic high-value launch preparation times that occur
under the original, high standard deviation LP model cause an uneven rate of traffic flow of the
deck. Extremely long launch times will be followed by comparatively shorter launch times, requiring
that the Handler start operations for two vehicles in short succession. It is also likely that at least
one launch has occurred at the other catapult during this time, further increasing the number of
vehicles the Handler must begin taxiing. When this occurs, the iterative interactions required from
the Handler help to constrain operations. Under an LP model with a reduced standard deviation,
the variability of the launch preparation task is reduced and should provide a steadier pace of
operations. With launches occurring at a more consistent rate, the number of vehicles in the Handler
queue at any one time should be smaller and remain fairly constant, easing the Handler’s job of
coordinating vehicle tasks. This is one example in which decreasing the random variation in the
launch preparation process might benefit operations. However, given substantial reductions in the
mean launch preparation time, the Handler may not be able to keep pace with operations and his
performance may worsen as compared to the other architectures. These effects are examined in the
next section.
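The argument can be illustrated with a toy single-catapult queue in which aircraft arrive at a steady rate from taxi operations and launch preparation times are drawn from a normal distribution. The arrival interval and replication count below are arbitrary illustration values, not MASCS settings.

import numpy as np

# Toy single-catapult queue: with the same mean launch preparation time, a larger standard
# deviation produces occasional very long launches behind which aircraft stack up.
def mean_wait_in_queue(mean_s: float, sd_s: float, arrival_interval_s: float = 120.0,
                       n_aircraft: int = 22, n_reps: int = 200) -> float:
    rng = np.random.default_rng(0)
    replication_means = []
    for _ in range(n_reps):
        finish_time, waits = 0.0, []
        for i in range(n_aircraft):
            arrival = i * arrival_interval_s               # steady supply from taxi operations
            service = max(0.0, rng.normal(mean_s, sd_s))   # launch preparation time
            start = max(arrival, finish_time)              # wait if the catapult is still busy
            waits.append(start - arrival)
            finish_time = start + service
        replication_means.append(np.mean(waits))
    return float(np.mean(replication_means))

print(f"sigma = 57.80 s: mean wait {mean_wait_in_queue(109.65, 57.80):.1f} s")
print(f"sigma = 28.90 s: mean wait {mean_wait_in_queue(109.65, 28.90):.1f} s")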
Effects of Reducing Launch Preparation Mean
A second series of tests was conducted in which the standard deviation was held constant at its
original value (σ = 57.80) and the mean was decreased in increments of 10%. These tests were
then repeated, varying the mean while holding the standard deviation constant at the -50% settings
(σ = 28.90). Tests again began with the GC and SBSC architectures using 22 aircraft and running
only homogeneous (100% Composition) sets of aircraft under the Dynamic Separation (DS) protocol.
The MC architecture results are again used as a baseline comparison case. The results using both
the original and -50% standard deviation cases are shown in Figure 6-4.
As with the MC results shown previously, a linear response was expected for all control architectures at both standard deviation settings and is observed. The increasing benefits of reducing both
mean and standard deviation also appear for the GC and SBSC control architectures, and it again appears that these effects are more beneficial for the SBSC architecture.
Figure 6-4: Line chart of Launch event Duration (LD) values for Manual Control (MC), Gestural Control (GC), and System-Based Human Supervisory Control (SBSC) missions using 22 aircraft, varying the launch preparation mean. Whiskers indicate ±1 standard error.
standard deviation condition, the GC (mean = 15.55, s.e. = 0.20 minutes) case again appears to be
significantly different from the SBSC case (mean = 13.82, s.e. = 0.18 minutes), which also appears
to be consistently similar to the MC results.
All six control architectures were then tested at the 50% mean, 50% standard deviation settings
(N(54.82, 28.90²)) using both 22 and 34 aircraft Density settings to examine what differences exist
between architectures. The results of these tests appear in Figure 6-5. For 22 aircraft cases, an
ANOVA again shows significant differences across cases (F (5, 174) = 14.6874, p < 0.0001) with a
Tukey HSD test now showing ten significant or marginally significant pairwise differences (p < 0.0328
for all; full results appear in Appendix O) between cases. Of these, nine include comparisons
between the “delayed” (RT, GC, VBSC) and non-delayed (MC, LT, SBSC) architectures. Two of
these comparisons (LT vs. RT and LT vs. VBSC) are only marginally significant (p = 0.0691 and
p = 0.0723, respectively). The tenth comparison that shows significant differences involves the LT
and SBSC architectures (p = 0.0328). At the 34 aircraft level, a Kruskal-Wallis test shows significant
results (χ2 (5) = 88.54, p < 0.0001), with a Steel-Dwass non-parametric simultaneous comparisons
test showing nine significant pairwise differences, once again occurring between the sets of “delayed”
and “non-delayed” groups. All nine comparisons return highly significant differences (p = 0.0002 or
less).
Overall, the results reported in this and the previous section suggest that, as hypothesized in
Figure 6-5: Column graphs depicting Launch event Duration (LD) values for all control architectures
under the 50% mean, 50% standard deviation launch preparation time model for 22 (left) and 34
(right) aircraft.
Chapter 5, the definition of the launch preparation time forms a choke point on operations. Decreasing the mean and standard deviation of the time required to complete this task provides significant
benefits to operations. These changes also allow significant differences in control architectures to
arise, providing similar trends in performance as observed for the single aircraft tests in Chapter 5:
the “delayed” UAV control architectures of RT, GC, and VBSC perform worse than the remaining
“non-delayed” architectures. However, for these data, the differences in best and worst performance
of the control architectures at the mean-50%, standard deviation-50% cases (less than 2 minutes at
22 aircraft, less than 3 minutes at 34) are far smaller in magnitude than the benefits of changing the
launch preparation model. For stakeholders working in this and other similar domains, if the implementation of autonomy is being done to increase the productivity of the system, those stakeholders
should also question what other aspects of operations could benefit from increased automation. For
the aircraft carrier flight deck, there are greater benefits to operational productivity in improving
the launch preparation process than in automating individual vehicle behavior.
However, the launch preparation models used in this section might be improved further with
even more substantial automation upgrades. As described previously in this chapter and earlier in
Chapter 3 (Section 3.2), the tasks that are part of the launch preparation process are independent
of the type of vehicle being used and rely heavily on the crew. Crew must check the weight of the
vehicle, the settings of the catapult, and ensure that the nosewheel of the aircraft is properly aligned
to the catapult’s “pendant,” the portion of the catapult mechanism that connects to the nosewheel
and pulls the aircraft down the catapult. It is not unreasonable to imagine that these tasks could all
be automated: wireless data processing could accomplish the first two tasks, while improved sensor
systems and passive locking mechanisms could provide a secure, near-instant connection of aircraft to
catapult. Such automation, similar to that done in the manufacturing domain, has the potential to
even more significantly increase the performance of operations on the flight deck. The introduction
of automation into this process, rather than into aircraft via the different control architectures, is
explored in the next section.
6.1.2 Effects of Automating Launch Preparation Tasks
Extreme improvements to the launch preparation process could be achieved by completely automating all subtasks involved: automating the physical connection between the aircraft and the catapult
mechanism as well as wirelessly communicating all information between the aircraft and the catapult
control system. Doing so should enable large reductions in both the mean and the standard deviation of the launch task; at such levels, differences in the performance between the various control
architectures would be due solely to inefficiencies in the taxi operations process. An extreme upper
bound of performance might have an average of 5 seconds with a standard deviation of 1 second over
the bounds [3,7] seconds, decreasing the time required to complete the launch preparation far below
that of the values tested in the previous sections. This model (N(5, 1²) seconds) was input into the
MASCS model and a series of simulation trials were conducted using all control architectures at
both the 22 and 34 aircraft Density settings at the 100% Composition DS settings. Mean LD values
for these tests appear in Figure 6-6.
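A truncated normal distribution reproduces this automated-task model; the use of scipy's truncnorm below is one reasonable reading of the [3, 7] second bounds stated above, not necessarily the exact MASCS implementation.

from scipy import stats

# Sketch of the "automated" launch preparation model: N(5, 1^2) seconds bounded to [3, 7]
# seconds. scipy's truncnorm expects its bounds expressed in standard deviations about the mean.
mean_s, sd_s, lower_s, upper_s = 5.0, 1.0, 3.0, 7.0
a, b = (lower_s - mean_s) / sd_s, (upper_s - mean_s) / sd_s  # -2 and +2 standard deviations

automated_lp = stats.truncnorm(a, b, loc=mean_s, scale=sd_s)
samples = automated_lp.rvs(size=10_000, random_state=0)
print(f"min {samples.min():.2f} s, mean {samples.mean():.2f} s, max {samples.max():.2f} s")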
To begin, using an aggressive automated launch preparation time model results in substantial
decreases in time to complete the mission as compared to the original launch preparation time
model: an average of 8.57 minutes across all control architectures at the 22 aircraft level and 15.79
minutes at 34 aircraft. These are reductions of 65.7% and 59.5% from the average duration of
the MC architecture under the original launch preparation time model (mean values of 24.97 and
38.97, respectively). Under this highly automated task, the time to complete the mission is solely
dependent on the time required to taxi aircraft to catapults: aircraft launch so quickly that queues
are rarely able to form, and even then only exist for a few seconds. Table 6.3 provides the means and
Figure 6-6: Column graphs depicting Launch event Duration (LD) values for all control architectures
under the “automated” launch preparation time model for 22 (left) and 34 (right) aircraft.
standard deviations of total wait times in queue for all aircraft in the 22 and 34 aircraft missions
under both the original and automated launch preparation time models. While these values are
cumulatively over 60 and 90 minutes under the original launch preparation time models at the two
Density settings, respectively, average queue times are less than one minute under the automated
launch preparation model.
Note, also, in the lefthand side of Figure 6-6 that for the 22 aircraft results the MC, LT, and
SBSC architectures are again the best performers while RT, GC, and VBSC perform more poorly.
However, unlike other tests, the gap between the two groups is fairly small: the SBSC and RT
results are fairly close together. A Kruskal-Wallis test for data at the 22 aircraft Density level reveals
significant differences (χ2 (5) = 144.48, p < 0.0001), with a Steel-Dwass non-parametric simultaneous
comparisons test showing significant pairwise differences for thirteen of fifteen possible comparisons.
The only pairings that are not significantly different are GC (mean rank = 143.67) compared to
VBSC (mean rank = 147.5), the two worst performers, and MC (mean rank = 28.43) paired with
LT (mean rank = 37.2), the two best. All other comparisons are strongly statistically significant
(p = 0.0077 or less). This also means that the two groups of “delayed” (RT, GC, and VBSC) and
“non-delayed” architectures no longer exist as they have in previous results: MC and LT are both
significantly different from SBSC (mean rank = 81.0, both at the p < 0.0001 level), while GC and
VBSC are both different from RT (mean rank = 105.2, both at the p < 0.0001 level).
The “delayed” versus “non-delayed” trend of differences also disappears at the 34 aircraft level,
albeit in a different fashion. From the results in the figure, it appears that the SBSC architecture
is the best overall performer, while VBSC improves significantly and RT degrades substantially.
A Kruskal-Wallis test again shows significance (χ2 (5) = 147.01, p < 0.0001), with a Steel-Dwass
test also showing multiple significant pairwise differences. In these results, all but three pairwise
differences are strongly significant (p < 0.0001): MC (mean rank = 72.2), VBSC (mean rank =
74.7), and LT (mean rank = 78.17) are not significantly different from one another. However, SBSC
(mean rank = 17.83) has significantly smaller values than each of these cases and is the lone best
performer. Additionally, RT (mean rank = 159.93) and GC (mean rank = 140.17) are worse than
the remaining cases and significantly different from one another. At the 34 aircraft Density level,
one of the “delayed” architectures, VBSC, is now equivalent to two of the non-delayed architectures
(MC and LT), having flipped from being one of the worst performers at the 22 aircraft level to one of the best at the 34 aircraft level.
Table 6.3: Means and standard deviations of wait times in catapult queues at both the 22 and 34 aircraft Density levels under the original and automated launch preparation models.

                           Mean Wait Time in Queue (minutes)    Std Dev
22 aircraft   Original     69.9773                              12.3197
              Automated    0.6236                               0.4519
34 aircraft   Original     92.7235                              13.4196
              Automated    0.8987                               0.6173
Looking across results at the two Density settings (22 and 34 aircraft), MC and LT had equivalent
performance in each, which is generally to be expected given the minimal differences between the two
architectures. The generally poor performance by both RT and GC is also to be expected; that GC is
relatively better than RT at 34 aircraft and worse at 22 is an interesting result, however. Additionally,
the large relative improvement in performance for the VBSC and SBSC cases at 34 aircraft is striking,
especially for VBSC. These latter differences in performance are perhaps explained by the lack of
the “zone coverage” routing structure, the common feature differentiating VBSC and SBSC from
the other architectures. One explanation is that the “open” routing used by these two architectures
is not optimized for the 22 aircraft setting, and the lack of constraints on taxi operations allows
more aircraft to interfere with each other and trigger their respective collision prevention routines.
For the other architectures, the use of the zone coverage routing leaves sufficient constraints in place
that aircraft remain well-spaced and do not often conflict with one another. At the 34 aircraft level,
the limitations on taxi paths due to the larger number of aircraft may be enough that the VBSC and
SBSC do not encounter such issues and retain their beneficial performance. A similar explanation
might explain why RT and GC differ in performance between the two density levels — the effects
of GC delays and latencies are more strongly felt at the unconstrained 22 aircraft Density, with the
constraints at the 34 aircraft level limiting the detrimental effects of that architecture.
In general, however, the role of the launch preparation time model has been clearly demonstrated
by tests in this section both in terms of its effects on launch event duration values for flight deck
operations as well as its effects in regulating UAV control architecture performance. Once the launch
preparation time model was sufficiently reduced to eliminate the presence of queues on the flight
deck, the differences in control architecture performance finally became visible. However, one caveat
to these results is that the heuristics used by the Aircraft Director and Deck Handler agents were
not designed for such fast operational tempos. The heuristics are quite likely not optimal for these
operating conditions, and even better performance might be achievable if these heuristics are altered.
Additionally, the tests performed in this section highlight the importance of exploring all facets of
the broader system, including processes defined independently of any of the unmanned vehicle control
architectures. For a stakeholder selecting a control architecture based on mission performance, a
failure to recognize the effects of tasks like the launch preparation process on operations could lead to
errors in decision-making. The results observed in Chapter 5 under the original launch preparation
time model might lead a decision-maker to conclude that all architectures are equivalent, or that
they provide no overall benefit compared to manual control operations. However, if the broader
system is slated to transition to a faster operating tempo, selecting GC or RT systems would clearly
be incorrect for future operations. In the opposite direction as well, using the results from tests
utilizing the “automated” launch preparation time model may lead stakeholders to select the LT
or SBSC systems. If the system spends the majority of operations working within a slower tempo,
then the benefits expected from these systems will never be realized and the selection of a different
architecture might have been warranted.
To this point, this chapter has focused solely on launch event duration performance for the different UAV control architectures under the 100% Composition cases, to which the safety protocols
defined in Chapter 5 cannot be applied. However, just like for the control architectures, the results previously discussed in Chapter 5 showed little variation in performance amongst the Area,
Temporal, and Dynamic Separation safety protocols. It is likely that the lack of variation in safety
protocol performance was also due to the effects of the sustained queueing of aircraft at catapults,
and eliminating queuing through the use of the “automated” launch preparation time model should
make the effects of the different safety protocols on launch event duration more visible. This chapter
has also not yet explored the effects of the launch preparation time model on the results of the safety
metrics, which were heavily influenced by the selection of safety protocols. The next section explores
both of these topics, beginning with the effects of safety protocols on launch event durations under
the new, automated launch preparation time model.
6.1.3 Interactions with Safety Protocols
Just as the original launch preparation time, with its large mean and standard deviation, obscured
the effects of different control architectures, so too might it have obscured the effects of Safety
Protocols on operations. Chapter 5 demonstrated that for LD performance, other than the severely constrained Temporal+Area protocol, the only significant interactions between control architectures
and safety protocols occurred between the Temporal Separation protocol and the three “delayed”
UAV types. This was believed to have been caused by the increased use of forward-to-aft aircraft
routing on the flight deck, the effects of which should be magnified at higher operational tempos
and should further exacerbate errors and latencies in the three “delayed” control architectures. An
additional series of tests was executed for the GC and SBSC Control Architectures under all possible
settings of Safety Protocols (SPs), mission Density, and mission Composition settings using the new
“automated” launch preparation time model. This section reviews the results of these tests on the
Launch event Duration (LD) productivity measure and selected safety measures.
Effects on Productivity Measures
Figure 6-7 shows the average Launch event Duration (LD) values for the GC and SBSC architectures
across all safety protocols at both 22 (left) and 34 (right) Density levels using the new “automated”
launch preparation time. These figures also include the average LD of these architectures under
the original launch preparation time model. The trends in the performance of the safety protocols
under the automated launch preparation time model are generally the same for both mission Density
178
CHAPTER 6. EXPLORING DESIGN OPTIONS
levels: Dynamic and Area Separation provide the smallest LD values, while Temporal Separation
and Temporal+Area provide the largest. Interestingly, SBSC appears to provide smaller LD values
under Area and DS protocols, while GC performs slightly better at the Temporal and T+A settings.
Statistical tests verify that significant differences occur across Safety Protocols at all levels. For
GC results at 22 aircraft, a Kruskal-Wallis test returns significance (χ2 (3) = 194.78, p < 0.0001),
with a Steel-Dwass nonparametric multiple comparisons test showing that all possible pairwise
comparisons are significant at p = 0.0048 or less, with Area (mean rank = 58.97) providing the best
performance, followed by DS (89.84), Temporal (195.85), then T+A (220.17). Similar results are
seen for the SBSC data at 22 aircraft, with a Kruskal-Wallis test returning significance (χ2 (3) =
216.24, p < 0.0001), with a Steel-Dwass nonparametric multiple comparisons test showing that all
pairwise comparisons are significantly different with Area (mean rank = 59.817) providing the best
performance.
At the 34 aircraft level, a Kruskal-Wallis test for SP results for the GC control architecture
also shows significance (χ2 (3) = 209.13), where a Steel-Dwass test demonstrates that all but one
pairwise comparison is significantly different: the Dynamic Separation only (mean rank = 85.96)
and Area Separation (mean rank = 59.82) protocols are not significantly different from one another
(p = 0.0527). A Kruskal-Wallis test for SP results for the SBSC control architecture also shows
significance (χ2 (3) = 220.73, p < 0.0001) and a Steel-Dwass test also indicates all but one pairwise
comparison is strongly significantly different. In this case, however, the Temporal (mean rank =
202.27) and Temporal+Area (mean rank = 218.73) are shown to only be marginally significantly
different from one another (p = 0.0473).
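The omnibus and pairwise comparisons reported above can be reproduced on per-replication LD samples with standard nonparametric tools. The sketch below uses scipy's Kruskal-Wallis test and, as a simple stand-in for the Steel-Dwass procedure used in this work, pairwise Mann-Whitney U tests with a Bonferroni correction; the sample values are placeholders rather than MASCS output.

from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Placeholder per-replication Launch event Duration samples (minutes), one
# array per Safety Protocol; real values would come from MASCS output logs.
ld = {
    "Area":          rng.normal(9.5, 0.6, 30),
    "Dynamic":       rng.normal(9.8, 0.6, 30),
    "Temporal":      rng.normal(11.0, 0.7, 30),
    "Temporal+Area": rng.normal(11.4, 0.7, 30),
}

# Omnibus Kruskal-Wallis test across all Safety Protocols.
h_stat, p_omnibus = stats.kruskal(*ld.values())
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_omnibus:.4g}")

# Pairwise follow-up: Mann-Whitney U tests with a Bonferroni correction,
# used here as a simple stand-in for the Steel-Dwass procedure.
pairs = list(combinations(ld, 2))
for a, b in pairs:
    p_raw = stats.mannwhitneyu(ld[a], ld[b], alternative="two-sided").pvalue
    p_adj = min(1.0, p_raw * len(pairs))
    print(f"{a} vs {b}: adjusted p = {p_adj:.4f}")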
Figure 6-7: Column graphs depicting Launch event Duration (LD) values for the Gestural Control (GC) and SBSC Control Architectures run at the 22 (left) and 34 (right) aircraft Density levels under the “automated” launch preparation time for all Safety Protocols (SPs).

Taken together, these results support the earlier results from Chapter 5 that indicated that the
Temporal protocol had adverse effects on operations. These previous results from Chapter 5 showed
that these effects only occurred for the “delayed” UAV control architectures, however; the new results from this chapter suggest that the Temporal protocol is detrimental to operations in and of itself. That
this was not observed in Chapter 5 is likely due to the effects of the launch preparation time on
operations, which may have served to limit the effects of this safety protocol just as it limited the
differences in control architectures. Interestingly, the effects of the Temporal and T+A protocols are
felt more strongly by the SBSC architecture than by GC; while SBSC provides better performance
than GC under the Area Separation and Dynamic Separation protocols, it provides worse performance under Temporal and Temporal+Area. This likely reflects an interaction between the open routing structure of the SBSC architecture and the fact that it was not designed for this operational tempo.
This suggests that, while the zone coverage routing system does provide additional constraints on
operations, the lack of constraints under the more open routing method may allow aircraft to interfere more with one another’s actions, limiting some aspects of operations. The zone coverage system may be beneficial in structuring operations and preventing these interferences and
interactions from occurring.
The “automated” launch preparation time also has interesting interactions with mission Density.
Increasing mission Density directly increases the time to complete a launch event, but the relative
effects of mission Density differ between the original and automated launch preparation time models.
Figure 6-8 calculates the percent increase in mean LD values when moving from the 22 to 34 aircraft
Density level for the GC and SBSC control architectures under the two Launch Preparation (LP)
time models. Under the original LP model, increasing mission Density from 22 to 34 aircraft results in
a roughly 60% increase in LD for both the GC and SBSC architectures; this is basically equivalent
to the percent increase in number of aircraft (12/22 = 54.54%). Under the automated launch
preparation time model, the percent increase is closer to 90% for both architectures.
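The Density-scaling comparison in Figure 6-8 reduces to a simple ratio; a minimal sketch with placeholder means is shown below (the real values come from the simulation output).

def percent_increase(ld_22: float, ld_34: float) -> float:
    """Percent increase in mean Launch event Duration from 22 to 34 aircraft."""
    return (ld_34 - ld_22) / ld_22 * 100.0

# Reference scaling if LD grew only with the number of aircraft launched.
aircraft_scaling = (34 - 22) / 22 * 100.0      # = 54.5%

# Placeholder means (minutes) for illustration only.
print(percent_increase(25.0, 40.0))            # original LP model: ~60%
print(percent_increase(9.5, 18.0))             # automated LP model: ~90%
print(aircraft_scaling)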
This difference in Density effects can be attributed to the effects of eliminating queues at the
catapults. Under the original LP model, launch event durations are solely a function of the number
of launches that are required due to the sustained existence of queues at the catapults. Increasing
the number of aircraft in the mission simply adds to the number of consecutive launches that
take place at the catapults. Under the automated LP model, these queues are eliminated and
the efficiency of taxi operations becomes the critical driver of performance. The stronger effects
of increased Density under the automated launch preparation time model (increases of 90% from
the 22 aircraft Density level) likely indicate that the taxi operations currently modeled in MASCS
are inadequate for these higher-tempo operations. If taxi operations are appropriately redesigned
for each control architecture, further decreases in launch event duration might be obtainable at
both mission Density values. However, flight deck productivity is only one measure of performance
for flight deck operations, and the changes to the launch preparation time models may also affect
Figure 6-8: Percent increase in Launch event Duration when moving from 22 to 34 aircraft.
the results for the safety metrics. The next section reviews the effects of automating the launch
preparation process on the safety measures for flight deck operations defined previously in Chapter 5.
Effects on Safety Measures
In Chapter 5, the analyses of safety measures addressing aircraft-aircraft interaction focused on
the Tertiary halo radius (two wingspans away). Measures for crew interactions with vehicles were
also developed; while these Primary Halo Incursion (PHI) and Duration of Primary Halo Incursions (DPHI) metrics are imperfect given the crew behavior models included in MASCS, they were
analyzed in Appendix M. This section explores the effects of reducing the launch preparation mean
and standard deviation on the Tertiary Halo Incursions due to Aircraft (THIA), PHI, and DPHI
metrics. Figure 6-9 shows comparisons of THIA values between the original and automated launch
preparation time models for each of the four Safety Protocols for the SBSC architecture. The same
data is shown for GC in Figure 6-10.
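The THIA metric counts entries of one aircraft into another's tertiary halo (two wingspans). A minimal sketch of how such a count might be computed from logged position traces follows; the trace format, wingspan value, and entry-event definition are illustrative assumptions, not the MASCS implementation.

import numpy as np

def tertiary_halo_incursions(traces: np.ndarray, wingspan: float) -> int:
    """Count entry events into the tertiary halo (two wingspans) between aircraft.

    traces: array of shape (timesteps, n_aircraft, 2) of x/y deck positions.
    An incursion is counted each time a pair transitions from outside to
    inside the halo radius (plus any pair that starts inside it).
    """
    radius = 2.0 * wingspan
    _, n, _ = traces.shape
    incursions = 0
    for i in range(n):
        for j in range(i + 1, n):
            dists = np.linalg.norm(traces[:, i, :] - traces[:, j, :], axis=1)
            inside = dists < radius
            incursions += int(inside[0]) + int(np.sum(~inside[:-1] & inside[1:]))
    return incursions

# Tiny synthetic example: one aircraft taxis past another parked aircraft.
trace = np.zeros((5, 2, 2))
trace[:, 0, 0] = [0, 20, 40, 60, 80]   # aircraft 0 taxis along x
trace[:, 1, :] = [50, 0]               # aircraft 1 parked at x = 50
print(tertiary_halo_incursions(trace, wingspan=13.0))  # -> 1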
Statistical tests verify that there are significant differences in performance between the original
and automated launch preparation time models. The results of Wilcoxon Rank Sum tests comparing THIA values between the automated and original launch preparation time models at each Density
level appear in Table 6.4. As can be seen, all four comparisons are strongly statistically significant.
Practically, these results may be somewhat counterintuitive: that increasing the tempo of operations
actually increases the safety of operations; this is not something that would typically be expected
for most systems. It suggests, again, an interaction with the routing of aircraft on the deck. The
increased tempo of operations speeds the rate at which catapults launch; while the rate of assignments by the Handler also increases, the speed at which aircraft taxi is left constant, leaving fewer aircraft near the catapults and on the deck in general. This may also explain why THIA values for the automated launch preparation task at the 22 aircraft Density level appear to be invariant to Safety Protocols for both the GC and SBSC architectures: aircraft spend so little time on the deck that taxi operations are basically unconstrained and the safety protocols have little real effect.

Figure 6-9: Column charts comparing Tertiary Halo Incursions due to Aircraft (THIA) values between the “automated” launch preparation time and the original launch preparation time for all treatments in the SBSC control architecture. Left: 22 aircraft cases. Right: 34 aircraft cases.

Figure 6-10: Column charts comparing Tertiary Halo Incursions due to Aircraft (THIA) values between the “automated” launch preparation time and the original launch preparation time for all treatments in the GC control architecture. Left: 22 aircraft cases. Right: 34 aircraft cases.
A review of values for PHI (Figures 6-11 and 6-12) indicates only slight differences in performance
between the two launch preparation time models for the SBSC or GC architectures at either Density
level. The PHI results in Appendix M showed that the major influence was in the selection of vehicle
routing topology. As the launch preparation time does not affect this topology, the number of PHI
incursions would not be expected to change.
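The comparisons reported in Table 6.4 below are two-sample rank tests evaluated against a Bonferroni-adjusted threshold (0.05 spread over the four architecture/Density comparisons, hence the p < 0.013 criterion). A minimal sketch of one such comparison with placeholder samples, using scipy:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Placeholder THIA samples per replication for one architecture/Density cell.
thia_original = rng.poisson(14, 30)    # original launch preparation model
thia_automated = rng.poisson(9, 30)    # "automated" launch preparation model

stat, p = stats.ranksums(thia_original, thia_automated)

# Four architecture/Density comparisons share the 0.05 family-wise level.
alpha_adjusted = 0.05 / 4              # ~0.0125, reported as p < 0.013
print(f"rank-sum statistic = {stat:.2f}, p = {p:.4g}, "
      f"significant: {p < alpha_adjusted}")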
Table 6.4: Results of Wilcoxon Rank Sum Tests comparing THIA values between the original launch preparation time and the automated launch preparation time for the GC and SBSC Control Architectures. A value of p < 0.013 indicates significant differences between cases.

CA      Density   χ2(1)     DF   p-value
GC      22        376.92    1    < 0.0001*
GC      34        377.70    1    < 0.0001*
SBSC    22        328.34    1    < 0.0001*
SBSC    34        396.43    1    < 0.0001*

However, the routing does occur at a faster pace, and DPHI values are shown to substantially decrease between cases (column charts also appear in Appendix O). This leads to a particularly
interesting set of implications. Increasing the rate at which launches occur by decreasing the mean
and standard deviation of the launch preparation time reduces the total time of the mission, thereby
reducing the total possible time during which crew interactions can occur. It does not change how
aircraft are allocated, however, meaning that the total number of times crew interact with vehicles
should (and does) remain unchanged. This means that the overall rate of crew incursions per minute
of crew activity has increased, which in some environments might be considered the major driver
of crew risk. This exemplifies not only a tradeoff between safety and productivity, but a tradeoff
between different types of safety, one influenced by the speed of operations and one by the structure
in which those operations occur. Which of these measures should be considered most important
would require a detailed collection of data for current flight deck operations and accident rates that
is outside the scope of this thesis and a potential area of future work. However, these results do
provide evidence to support the idea that improving safety in one area may not improve safety in
all areas.
Thus far, MASCS has focused on the assessment of previously-defined unmanned vehicle control
architectures that were first described in Chapter 5 as models of mature futuristic systems.

Figure 6-11: Column charts comparing Primary Halo Incursion (PHI) values between the “automated” launch preparation time and the original launch preparation time for all treatments in the GC control architecture. Left: 22 aircraft cases. Right: 34 aircraft cases.

Figure 6-12: Column charts comparing Primary Halo Incursion (PHI) values between the “automated” launch preparation time and the original launch preparation time for all treatments in the SBSC control architecture. Left: 22 aircraft cases. Right: 34 aircraft cases.

Figure 6-13: Column charts comparing Duration of Primary Halo Incursions (DPHI) values between the “automated” launch preparation time and the original launch preparation time for all treatments in the GC control architecture. Left: 22 aircraft cases. Right: 34 aircraft cases.

Tests
have focused on how these mature versions of these futuristic systems interact with process-level
tasks in the environment, but the power of agent-based models like MASCS includes the ability
to examine the effects of individual vehicle parameters on performance. This may be of use to
stakeholders seeking to understand the relative cost-benefit ratio of investing in improvement to a
current UAV control architecture, or seeking to understand exactly what type of performance is
required for a current system to meet some performance threshold. The next section examines how
simulations like MASCS could be used in exploring the design of individual control architectures.
Figure 6-14: Column charts comparing Duration of Primary Halo Incursions (DPHI) values between
the “automated” launch preparation time and the original launch preparation time for all treatments
in the SBSC control architecture. Left: 22 aircraft cases. Right: 34 aircraft cases.
6.2 Exploring Control Architecture Parameters
MASCS might also be useful in determining the performance requirements of unmanned systems
being developed for flight deck operations. In Chapter 5, a description and justification of the
parameters used for the baseline control architectures was provided. However, any or all of these
parameters may be modified and improved upon in future systems. It is not known for which parameters, and at what values, significant differences in performance will occur. However, models such as
MASCS can also be used in an exploratory fashion: given a description of the desired performance,
one or more parameter settings for a given UAV control architecture could be varied until either the
desired performance is achieved or the parameter value becomes unrealistic.
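This exploratory loop can be expressed directly: sweep one parameter, run a batch of replications, and stop when the target performance is reached or the parameter value stops being credible. The sketch below assumes a hypothetical run_mascs() hook that returns a mean LD value; it illustrates the search strategy rather than the MASCS interface.

def sweep_failure_rate(run_mascs, target_ld: float, tol: float = 0.5):
    """Decrease the GC gesture-recognition failure rate until the mean Launch
    event Duration is within `tol` minutes of the target (e.g., the Manual
    Control baseline), or the value becomes unrealistically low.

    `run_mascs(failure_rate)` is a hypothetical hook that runs a batch of
    replications and returns the mean LD in minutes.
    """
    failure_rate = 0.1118            # baseline GC recognition failure rate
    while failure_rate >= 0.01:      # below ~1% is treated as unrealistic
        mean_ld = run_mascs(failure_rate)
        if mean_ld <= target_ld + tol:
            return failure_rate, mean_ld
        failure_rate -= 0.01         # step down in 1% increments
    return None                      # no realistic setting reached the target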
This can also serve as a way of providing direction for current autonomous systems research.
Consider the case of the Gesture Control system, which had the worst overall performance for an
individual aircraft and, at an increased operational tempo, the worst overall performance for a
control architecture in homogeneous missions (100% UAV Composition). The GC model differed in
several ways from the Manual Control baseline, and changes to one or more parameters may improve
performance sufficiently to make the performance of the two architectures equivalent. Testing can
also demonstrate at what point further improvements have no effect: a 95% success rate in gesture
recognition may be sufficient to provide acceptable performance, and further investment to reach
a 99% success rate would provide no additional benefits. Understanding each of these aspects of
performance — which parameters are most important, what parameter settings generate equivalent
performance, and at what point further improvements are no longer beneficial — could be used to
help better characterize vehicle performance requirements in future systems. This section explores
changes to these settings in the Gesture Control model, typically the worst-performing control
architecture, to uncover at what parameter settings GC performance becomes equivalent to the
baseline Manual Control architecture and how those settings might vary across mission conditions.
A first series of tests examines the effects of varying GC parameters under the original launch
preparation time model to determine whether or not significant effects are observed. Given the
previous results, it is expected that the original launch preparation time model will again prevent
any real differences in performance from appearing. For these tests, two different adjustments are
made to the original Gesture Control system model. First, a “Low Failure” case is created, in
which the failure rate is reduced from 11.18% to 1%. This assumes future systems that can reliably recognize gestures and cope with variability both between individual Aircraft Directors and within a single Director (the precision of whose actions may degrade over time). The failure rate is left at 1%, as GC systems are not likely to ever achieve perfect recognition rates for the complex flight deck
environment.
A “Low Latency” case can also be created, in which the processing latency of the system is reduced to a maximum of 0.5 seconds, reflecting the behavior of the fastest-processing systems reported in the literature. A third setting combines these two cases (Low Latency+Low Failure). Reducing the
failure rate of the system would be expected to have a stronger effect on operations than reductions
in processing latency, given the size of the time penalties for failures. However, it is doubtful
that changes to these parameters would generate significant differences under the original launch
preparation time, given its regulating influence as shown in previous tests.
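The three modified models amount to point changes in a small set of parameters. A configuration sketch follows; the failure-rate and latency values are those described above, while the container structure and the baseline latency value are assumptions for illustration.

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class GestureControlParams:
    failure_rate: float    # probability a gesture command is misrecognized
    max_latency_s: float   # worst-case command processing latency (seconds)

# Baseline GC model; the baseline latency value here is an assumed placeholder.
baseline = GestureControlParams(failure_rate=0.1118, max_latency_s=2.0)

low_failure = replace(baseline, failure_rate=0.01)
low_latency = replace(baseline, max_latency_s=0.5)
low_latency_low_failure = replace(low_failure, max_latency_s=0.5)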
The three modified Gesture Control models (Low Latency, Low Failure, and Low Latency+Low
Failure) were implemented in MASCS and tested at the 22 and 34 aircraft Density settings under
both the original and the “automated” launch preparation time models. Thirty replications were
performed for each combination of Gesture control model, mission Density, and launch preparation
time model. Average Launch event Duration (LD) values for these tests appear in Figure 6-15, along
with results for the baseline GC and MC models defined previously in Chapter 5 as a comparison.
Improvements to the GC system latency and failure rate should improve its performance from that
of the baseline GC model and move it closer to that of the baseline MC model.
As with previous results under the original Launch Preparation (LP) model, changes to the GC
system parameters have no real effect on operations due to the constraints of the catapult queueing
process. Once this queuing is eliminated under the aggressive settings of the automated LP model,
LD values improve from left to right. The Low Failure and Low Latency cases are both slightly
better than the baseline GC model, with further improvements occurring at the Low Latency+Low
Failure case. At both the 22 and 34 aircraft Density levels, the performance of the Low Latency+Low
Failure case is close to that of the baseline MC architecture.
Statistical tests show significant differences for the results at both the 22 (Kruskal-Wallis Test,
χ2(4) = 75.52, p < 0.0001) and 34 aircraft (Kruskal-Wallis test, χ2(4) = 111.75, p < 0.0001) Density levels.

Figure 6-15: Column charts of mean launch event duration for variations of the Gestural Control (GC) model at both the 22 (left) and 34 (right) aircraft Density levels using the original launch preparation time model.

A Steel-Dwass non-parametric simultaneous comparisons test at the 22 aircraft level reveals
that all but one pairwise comparison is statistically significantly different: the Low Failure and Low Latency conditions are not significantly different from one another (p = 1.0; all others, p < 0.05). The Manual Control results are significantly different from every GC model, including the Low Latency+Low Failure case (p = 0.308). This means that, for 22 aircraft, even significant
improvements to the accuracy and processing speeds of GC vehicles are not sufficient to provide
performance equivalent to Manual Control operations. However, in practical terms, the average
mission times differ by only 0.47 minutes.
Interestingly, at the 34 aircraft level, increased accuracy and processing speeds are sufficient
to provide equivalent performance: a Steel-Dwass test shows only one pairwise comparison that is
not significant: Manual Control and the Low Latency+Low Failure (p = 0.9915). All others are
significant at the p = 0.05 level (Low Latency and Low Failure, p = 0.0056; Low Latency and GC
Baseline, p = 0.0407; all others, p < 0.0001). This again may indicate that the relatively open 22
aircraft missions, as opposed to the somewhat constrained 34 aircraft missions, are more sensitive
to deficiencies in vehicle taxi behavior. Even though the failures and latencies have been reduced
for the GC models, the few failures that do occur, as well as their failures in catapult alignments,
may still have substantial effects on performance. The limitations on taxi paths that occur at the
34 aircraft case might serve to dampen some of these effects or to dampen the performance of the
MC model.
A final series of tests was conducted in which the last major variable in GC, the time penalties due
to gesture recognition failures, was reduced from 5-10 seconds to 1-2 seconds. Adding this case into
the previous Kruskal-Wallis test, results are again significant (χ2 (5) = 109.60, p < 0.0001) but a
Steel-Dwass test now shows the MC case (mean rank = 39.83) is not significantly different from
this new “Low Penalty” case (mean rank = 52.17, p = 0.9930). This Low Penalty case is also not
significantly different from the Low Latency+Low Failure case (mean rank = 66.08, p = 0.7722),
but the Low Latency+Low Failure case remains marginally significantly different from Manual Control
(p = 0.0454).
It is interesting that the 22 aircraft case required additional improvements to GC time penalties
to provide equal performance to Manual Control operations, but this is consistent with the results
of previous sections that indicated different trends in results between the two Density settings. As
noted in the previous discussions, this is likely an indication that the heuristics used by crew and the
Deck Handler are not optimal for the aggressive automated launch preparation time model. However,
even though the improved GC model is still not equivalent to Manual Control, the benefits to LD
times under this automated launch preparation process far outweigh the less than one minute difference
that remains between GC and MC. Although additional improvement to operations might occur
if the crew routing methods were redesigned, investment in the automation of the launch process
should become a priority for the Navy.
A final interesting question raised by these results is, for 22 aircraft missions using the lower
time penalties and processing times, what accuracy rate is required for the GC architecture in order
to generate performance equivalent to Manual Control? To answer this, additional trials were run
for the GC model using the Low Latency and Low Penalty settings, decreasing the failure rate from
the baseline 11.18% to 10%, then in increments of 1%. The results of a subset of these tests appear
in Figure 6-16, which also includes data for the baseline Gestural Control (GC) model used in
Chapter 5 (fail rate = 11.18%), the GC Low Latency case (fail rate = 11.18%, max process latency
= 0.5 seconds), and the baseline Manual Control (MC) model from Chapter 5.
Figure 6-16: Plot of Launch Event Duration Means against Gestural Control (GC) failure rates
under the original launch preparation time model. Whiskers indicate ±1 standard error in the data.
First, it can be seen that decreasing the latency provides an immediate benefit to operations;
however, decreasing the failure rate has a relatively small effect once this decrease in process latency is applied. Statistical tests do indicate that there are significant differences between cases,
however (Kruskal-Wallis test: χ2 (6) = 91.6848, p < 0.0001). The results from a Steel-Dwass posthoc nonparametric simultaneous comparisons test appear in Table 6.5. These results show that the
Manual Control baseline is significantly different from all cases except the 1% and 7% cases. This
is interesting, given that it would be expected that the 4% case should perform better than the 7%
case. These variations are likely due to small inefficiencies in the Handler heuristics and taxi routing
methods that have been described in other results. Although the 4% case is statistically significantly
different from Manual Control, the difference is only 0.46 minutes.
This and the preceding sections have addressed how changes to parameters defining the process-level task of launch preparation and parameters related to the unmanned vehicle control architectures
affect the performance of the flight deck system as a whole. Throughout these results, it has been
pointed out that many remaining causes of inconsistencies in results and variations within the data
are likely a function of the structure of operations. This includes both the behavior of the crew
and the planning of operations, as well as extending into how aircraft are arranged on the flight
deck prior to mission operations and the physical locations of the parking places and catapults.
Simulations like MASCS could also be used to examine changes to these elements, although the
redefinition of the decision-making rule bases required to do so would require more time and effort
than is allowed within the scope of this thesis. Possible options for exploring these architectural
changes are discussed in the next section.
6.3 Possible Future Architectural Changes
Much of Chapter 3 discussed current aircraft carrier flight deck operations, describing the organization of the flight deck, the structure of crew tasks and interactions, and the way in which supervisors
(the Deck Handler) plan assignments for aircraft. Each of these is a significant feature of operations
that required substantial research to model and implement within MASCS. They are also features
of operations that are a core part of a crew’s on-the-job education; they have grown organically over time and are not a formal part of a crew’s education about the flight deck.

Table 6.5: Results of a Steel-Dwass nonparametric simultaneous comparisons test of Gestural Control (GC) failure rate variations.

Level - Level                 Score Mean Difference   Std Err Dif   Z          p-Value
MC - GC 1% Failure Rate       -2.5667                 4.50925       -0.5692    0.9976
MC - GC 7% Failure Rate       -8.4333                 4.50925       -1.87023   0.5003
MC - GC 4% Failure Rate       -15.7667                4.50925       -3.49652   0.0086*
MC - GC 10% Failure Rate      -18.2333                4.50925       -4.04354   0.001*
MC - GC Low Latency           -18.2333                4.50925       -4.04354   0.001*
MC - GC                       -29.9667                4.509187      -6.64569   <.0001*

To test fundamental
changes to these schema would require significant retraining on the part of crew and may introduce
additional risks to operations, as crew are being asked to work within an unfamiliar system in an
unfamiliar fashion. However, because models of the current system were effectively constructed in
MASCS, changes to these systems can also be modeled and investigated.
Given the budgetary constraints of 21st century operations, reducing the number of crew required
for operations is a concern of the Navy. The new generation of Ford-class aircraft carriers already
aims to reduce the required crew by about 700 people (USNFord, 2014). While the Aircraft Directors
on the flight deck form only a small part of this complement of crew, MASCS could be used to explore
the effects of reducing the number of Directors utilized in operations. Removing crew from the
current “zone coverage” scheme would require a reconfiguration of Directors’ areas of responsibility
and how they connect to one another. MASCS currently hard-codes these connections into the
system, but these could either be converted into a configuration file loaded on start or replaced by
an algorithm that could determine connections automatically given the number and location of crew
on the deck. These could then be used to examine how the number and placement of crew on the flight deck affect total launch event Duration and crew workload (which has not
been a focus of this work). Similar investigations could also be made with the Escort crew, exploring
how changes to the number of Escorts utilized, the number of starting locations, and the location of
those starting locations affect the efficiency of operations. The configuration of parking locations is
also based on past deck crew experience and might also be examined further. However, the parking
locations currently utilized are closely tied to the planning heuristics of the Deck Handler, and
changing one likely requires changing the architecture of the other. Parking spaces can be easily
adjusted in system configuration files, but substantial changes to their organization might require
significant changes to the Handler planning process.
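Either option mentioned above, loading the Director hand-off connections from a start-up configuration file or deriving them from crew positions, could look like the sketch below; the file format and the distance-based adjacency rule are illustrative assumptions rather than MASCS interfaces.

import json
from itertools import combinations

def load_director_network(path: str) -> dict:
    """Load a hand-off network from a JSON configuration file of the form
    {"D1": ["D2", "D3"], ...} mapping each Director to its neighbors."""
    with open(path) as f:
        return json.load(f)

def derive_director_network(positions: dict, max_handoff_dist: float) -> dict:
    """Alternatively, derive the network from Director deck positions by
    connecting any two Directors within a hand-off distance of one another."""
    network = {name: [] for name in positions}
    for a, b in combinations(positions, 2):
        (xa, ya), (xb, yb) = positions[a], positions[b]
        if ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5 <= max_handoff_dist:
            network[a].append(b)
            network[b].append(a)
    return network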
The physical layout of the flight deck itself might also be explored: the locations and configuration
of the launch catapults and the landing strip could be altered, two changes that would also be impossible to achieve in a real-world test setting. The placement of catapults is of particular
interest: the limitation that launches within each catapult pair must alternate provides a significant constraint on operations. Deconflicting the catapults should provide significant benefits to
operations. If catapult operations could be effectively parallelized in the current deck layout (aircraft
could launch simultaneously without risking collision), parallel operations at the 22 aircraft Density
settings return an average launch duration of 16.28 minutes (reduction of 34.81% from baseline) and
28.47 minutes at the 34 aircraft level (a 26.95% reduction from the baseline). These are in line with
the results of the launch preparation mean −50% tests (18.86 and 29.16 minutes, respectively) from Section 6.1.1. Combining parallel operations with a reduced launch preparation time mean should generate further advantages and should finally push the system into a state where the
Aircraft Directors and the Handler cannot keep up with the pace of launches.
Other changes involving the location of catapults, as well as the total number on the flight deck,
could also be done easily within MASCS. How repositioning the catapults interacts with the current
Safety Protocols might also be of interest, and new protocols could be designed to exploit the features
of this new deck layout. For instance, the primary performance issue with the Temporal and T+A protocols was the increased number of forward-aft transitions of aircraft required on the
flight deck. A rearrangement of parking locations and/or catapults might result in an orientation
that does not require these disruptive actions. Extremely radical designs, such as a circular flight
deck with catapults arranged around the edges of the deck, could be easily implemented and tested
within MASCS. However, such changes may also require changes to the rest of the ship’s physical structure that could alter the shape of the hull and affect the ship’s performance in the water.
MASCS might also be employed in exploring the use of alternative sets of Deck Handler heuristics,
as it is not clear whether or not these assignment rules are optimal and whether or not they could be
improved. Intelligent planning algorithms such as integer linear programs, Markov-based processes,
auction-based algorithms, or other methods might provide alternative planning strategies that may
be beneficial to operations, or might be able to offload some of the cognitive load of the Handler to
the algorithm. The full development, testing, and verification of these algorithms would require a
lengthy iterative process, for which the fully automated nature of MASCS would be beneficial. These
planning algorithms might also be expected to have significant interactions with the layout of parking
spaces on the flight deck, which could be jointly tested using the same optimization architectures as
described above. The testing of novel route planning methods might also be conducted, replacing
the basic taxi rules of the “zone coverage” system with algorithms such as A* and examining their
effects on taxi time, taxi distances, and fuel consumption. Because of the modular nature of the
MASCS system, integrating these new path planning and scheduling algorithms would be relatively
simple: each could be written as its own independent piece of the codebase and adapted to take
in the correct inputs from the MASCS system regarding aircraft and crew locations and provide the
same outputs in terms of “tasks” as done in the current code. Within the other agent models of
MASCS, simply rerouting the decision-making scripts to employ these new algorithms is a trivial
adjustment.
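A drop-in route planner of the kind described here only needs to consume deck geometry and agent positions and emit waypoints for taxi tasks. The sketch below outlines an A* search over a simple occupancy grid; the grid representation and the example deck are assumptions about how such a module might plug in, not the actual MASCS interfaces.

import heapq

def astar_taxi_route(grid, start, goal):
    """A* over a 2-D occupancy grid (True = blocked). Returns a list of
    (row, col) waypoints from start to goal, or None if no path exists."""
    def h(cell):  # Manhattan-distance heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]
    came_from, cost = {start: None}, {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
                new_cost = g + 1
                if (nr, nc) not in cost or new_cost < cost[(nr, nc)]:
                    cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    heapq.heappush(frontier,
                                   (new_cost + h((nr, nc)), new_cost, (nr, nc)))
    return None

# Tiny example: route around a parked aircraft (the blocked center cell).
deck = [[False, False, False],
        [False, True,  False],
        [False, False, False]]
print(astar_taxi_route(deck, (0, 0), (2, 2)))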
Lastly, in line with the testing of novel planning algorithms, the concept of replanning schedules
to recover from failures is also an area in which MASCS could be employed. If, during operations, a
substantial mechanical failure or accident occurs on the flight deck, substantial modifications to the
original schedule may be required. Such crisis events are also notoriously difficult to test in the real
world in a way that replicates the same level of stress generated by real accidents. A model such as
MASCS provides an alternative method of investigation.
6.4 Limitations of the MASCS Model
As with any simulation environment, a variety of limitations exist within MASCS. The core issue
with any computerized simulation is that it is a system entirely based on mathematics and rules —
any element of the real world that is required within the model must be formulated in these terms,
as dictated by the specific type of computational model being generated. For an agent-based model
specifically, this requires translating agent motion and decision-making into accurate movement
models and accurate logical rule systems, with all requisite behavioral (and other) states included.
With this comes a desire for simplicity and parsimony — reducing the complexity of the real world
as much as possible in order to make modeling feasible. During the development of MASCS, many
of the purely random behaviors that can occur in the world — weather, mechanical failures, human
behavior outside the expected norm — were ignored in order to focus the analysis on the differences
in performance of the different unmanned vehicle models. As such, in its current instantiation, the
effects of interactions of these elements with other aspects of MASCS are not known. Additionally,
even for the features that are modeled, the boundaries placed on the randomized mathematical
distributions encoded in the system and the limited options in decision making processes limit the
ability of MASCS to produce truly unique events.
Additionally, the parameterized distributions used within MASCS were defined based on a limited set of data that may not adequately represent the true environment. As best as possible, these
parameters were based on empirical data from operations — videos of operations, live observations,
and interviews with Subject Matter Experts (SMEs) — but these data were not collected as part of
any internal Navy initiatives to study flight deck operations. Validation data, as well, was difficult
to obtain and only covered a limited number of aspects of operations. To the author’s knowledge,
detailed databases containing this information do not exist in any formal capacity. While MASCS
has been calibrated with respect to mission-level outputs of launch event duration and launch inter-departure rates, models of crew behavior and Handler planning heuristics have not been fully
validated as data is not available regarding these aspects. Until such time as fully detailed data
regarding individual crew behaviors and motion, Deck Handler planning, and possibly extending
to rare and critical events is obtained, any model of flight deck operations will be difficult to fully
validate.
The Control Architecture models integrated into MASCS also relied on a set of assumptions
to define models of mature, futuristic control architectures. Because these models address vehicle
systems that are not currently in existence, they cannot be formally validated against empirical data.
The models may also not be reflective of the actual control architectures built for future operations,
in terms of both low-level details of the architectures and in the settings of related parameters.
Similarly, the behavior of immature versions of these control architectures when initially fielded
could be substantially different from the architectures modeled here. As such, the results are limited
to the definitions of the unmanned vehicle control architectures described in this work and the key
parameters that define their behavior.
Even more importantly, the role of the crew and their presence in operations had substantial
effects on safety and productivity performance, as previously discussed in Chapter 5, Section 5.6.
The results and insights gained from the testing that was performed in this dissertation are limited
by the definitions of crew behavior and their heuristics, including the topology of the crew zone
coverage network. Alternative versions of any of these aspects may produce markedly different
results for flight deck operations. Furthermore, the generalizability of these results to other domains
is contingent on how similar human collaborator behavior in those environments is to that on the
aircraft carrier flight deck.
This also includes the Deck Handler model and the assignment heuristics used to allocate aircraft
on the flight deck. These heuristics were described previously in Chapter 3, Section 3.2.3 and were
based on earlier work (Ryan, 2011; Ryan et al., 2014). The heuristics encoded into MASCS are
meant to be representative of the general behavior of Handlers in the world. While the same set of
heuristics was applied to all five UAV control architectures and the MC model in order to provide a
consistent comparison, different sets of heuristics may provide different results. In addition, future
operations may entail the use of different heuristics for each control architecture or safety protocol
that could be formally optimized for each. However, previous research demonstrated that designing
effective algorithms for this environment is difficult and may provide results no better than the
human heuristics (Ryan et al., 2014).
Other structural aspects of the flight deck, in terms of the geometry of the space, initial vehicle
locations, and vehicle destinations may also differ for other environments. On the flight deck, vehicles
move only in the two-dimensional plane of the flight deck in what is technically an open space in
which vehicles can transit in any unobstructed direction. Crew also generally attempt to move
aircraft in the same direction in order to maintain an orderly flow of traffic, but this is not a hard
constraint like it is for highways and other roadways. These aspects all influenced the definition
of the safety protocols, and for other domains that do involve vertical motion (such as military
aviation) or that have more or different restrictions on transit paths (highways, roadways in mines),
the safety protocols defined here may not be entirely applicable and the results observed in MASCS
testing may not apply. However, the models of unmanned vehicles used in the MASCS model should
be easily transferable to other domains, and the general insight as to the role of crew in operations
should also be true for other environments.
The MASCS model also includes only a limited number of tasks in the system model, all of which
concern the movement of aircraft from parking locations to the launch catapults. Other environments may contain a richer set of actions that would need to be modeled, providing additional
opportunities for control architecture failures, latencies, or other interactions between elements.
However, it would be expected that the general features of the results described in Chapters 5 and
6, such as the effects of process level constraints and the ability of safety protocols to disrupt the
flow of traffic in operations, would likely hold in some form. At minimum, the development of the
current MASCS model of flight deck operations does provide a roadmap for the development and
testing of simulations for other environments.
An additional operational limitation in MASCS is the time to complete individual runs. Even at
ten times normal speed, a set of thirty replications takes between 60 and 180 minutes to complete,
depending on the specific settings of the mission profile. Because of the time to complete runs
and the significant standard deviation in results, using MASCS within an optimization architecture
would likely require many more runs to complete its exploration of a design space. Accelerating the
speed of operations of MASCS would be a viable option, but the current architecture of the system
means that the resolution of vehicle motion would decrease and a significant amount of information
would be lost. Some other mechanism of updating the motions and actions of vehicles would be
required in order to provide the performance necessary to facilitate deeper investigations into system
parameters and optimality.
6.5 Chapter Summary
This chapter has explored some of the implications from the results displayed in Chapter 5, utilizing
the MASCS simulation environment to explore the effects of changing both vehicle-independent,
process-level tasks and UAV control architecture parameters on safety and productivity. Chapter 4
previously demonstrated that the MASCS model provides results in agreement with prior published data on operations and with SME estimates. This chapter has demonstrated that modifying the
parameters of the launch preparation process eliminates the bottleneck it places on operations and
greatly improves flight deck performance. In making such changes, MASCS has demonstrated that
22 aircraft mission durations can be reduced from 20 minutes to less than 10 minutes with an
aggressive, automated launch preparation task. Furthermore, these results also indicate that at this
faster operating tempo, the safety of aircraft interactions becomes robust to the selection of safety
protocols and the total duration of crew interactions drops substantially.
Introducing this aggressive automated process also resulted in significant differences in Launch
event Durations for the different control architectures, although these again tended to cluster into
groups of “delayed” and non-delayed architectures. However, these results also suggest that at these
extreme settings, the density of operations and its interactions with the scheduling of tasks and
the taxi routing methods have significant effects on Launch event Duration values. The relative
effects of increasing mission Density from 22 to 34 aircraft were much larger under the automated
LP model than under the original, indicating that the heuristics used in these processes, which grew
organically over time under much different operating conditions, are not well-constructed for an
accelerated launch rate.
A review of safety metrics also demonstrated that increasing the speed of operations may not
necessarily imply a reduction in safety. The automated launch preparation process does moderately
decrease the number of aircraft safety incursions and greatly decrease the total duration of crew
safety incursions but has no effect on the total number of crew incursions. Thus, in terms of two
metrics, accelerating the pace of operations is a net benefit to safety, although this also implies the
rate of crew safety incursions increases drastically. Additionally, variations in measures across safety protocols also appear to be lessened, implying that the safety protocols provide little benefit at this increased operational tempo.
The ramifications of the results in this and the previous chapter demonstrate the importance
of systems like MASCS. Although the SBSC and GC systems are still years away from practical
development, the results of this and the previous chapter have provided some understanding of the
important features of their performance, how they might be integrated into operations, and what
features of operations are especially interactive with these systems. MASCS has been able to explore
what might be possible with candidate versions of these systems. Most generally, the results of this
and the previous chapter show that the best performance in flight deck operations comes from either
fully manual or fully autonomous operations: under either the MC or the SBSC architectures. The
control architectures that lie between these models in the automation spectrum exhibit some form of degraded performance, due both to the parameters defining their behavior and the architecture
of their interactions with the system.
To collect the data used to reach these conclusions in real-world field testing would consume an
incredible amount of time and resources, and the instrumentation of data collection for the metrics
defined here might also be problematic and difficult to achieve. By creating independent agent-based models that replicate the expected behavior of the different unmanned vehicle models, the
MASCS model has afforded a better understanding of flight deck operations than might be achieved
in years’ worth of field trials. Overall, MASCS has generated not only a better understanding of
how the flight deck works, but provided guidance on potential future research as to the choice of
UAV systems to incorporate into operations and how they interact with candidate safety protocols
for the environment.
Chapter 7
Conclusions
“Robots are emotionless, so they don’t get upset if their buddy is killed, they don’t commit crimes of rage and revenge. But they see an 80-year-old grandmother in a wheelchair the same way they see a T80 tank; they’re both just a series of zeros and ones.”

Peter Singer, “Military robots and the future of war,” TED Talk, Feb. 2009
As unmanned and autonomous vehicle systems move from environments in which they operate
primarily in isolation into Heterogeneous Manned-Unmanned Environments (HMUEs) that require
close interaction with human collaborators, properly understanding which characteristics of the
domain encourage safe and effective performance in these new domains is paramount. For a vehicle
operating in isolation, it important to know the location of the vehicle, its current destination, and
what transit path will move them to that point without overstressing the vehicle. For a vehicle
operating in an HMUE, the same issues are also at play; however, the transit path, speeds of
motions, and even the selection of the destination are influenced heavily by what types of actors are
nearby. The decision-making rules and behaviors that provide desirable performance for a manned or
unmanned vehicle operating on its own are not the same as the characteristics desirable for vehicles
operating in HMUEs.
Understanding which characteristics are desirable is further complicated by the large scale of
systems like the United States highway system. In most current practice, determining these characteristics occurs through a trial-and-error, one-factor-at-a-time style of testing: small adjustments
are made to a system in the field that is then tested over a period of time. This process, and its reliance on physical systems operating in the world, brings with it several deficiencies and limitations:
cost, safety concerns, limits on what can feasibly be tested with the current available hardware and
software systems, and limitations on what can accurately be measured in the field. To this end,
the Multi-Agent Safety and Control Simulation (MASCS) agent-based simulation environment was
developed in order to facilitate the exploration of manned and unmanned vehicle behavior in one
candidate HMUE, the aircraft carrier flight deck. This chapter reviews the motivation behind and
the development and testing of the MASCS simulation model, as well as the contributions of this
research and the questions that remain for future investigation.
7.1 Modeling HMUEs
7.1.1 The MASCS Model
The Multi-Agent Safety and Control Simulation (MASCS) model was constructed with the goal
of modeling manned and unmanned vehicles acting within a candidate Heterogeneous Manned-Unmanned Environment (HMUE), such that the performance of these vehicles could be evaluated
under realistic circumstances. This includes the effects not only of different vehicle control methods,
but the effects of changes in how they are integrated into operations and the mission conditions
under which they operate. These first two aspects, referred to as Control Architectures (CAs)
and Safety Protocols (SPs), respectively, served as the primary focus of MASCS development and
testing. A review of the existing research showed a spectrum of five different unmanned vehicle
control architectures that were candidates for inclusion within MASCS, alongside a model of Manual
Control that serves as a baseline comparison point. Unmanned vehicle control architectures ranged
from Local Teleoperation (LT) and Remote Teleoperation (RT), which still require a remote human
pilot, to Gestural Control (GC) (and other forms of advanced teleoperation) that require no pilot
but do require instructions from the crew, to Vehicle-Based Human Supervisory Control (VBSC)
and System-Based Human Supervisory Control (SBSC) that receive their instructions through a
centralized control system staffed by a senior supervisor (the Deck Handler). Slight variations in the
location of the human operator, the method of commanding the unmanned vehicle, and the types
of commands provided differentiate these Unmanned Aerial Vehicle (UAV) CAs from one another.
These seemingly slight differences in architectures, however, are sufficient to generate significant
differences in performance for individual vehicles.
An initial framework that mapped the general architecture of these systems in terms of the
communication of information between human supervisors, manned and unmanned vehicles, and
other human collaborators served as the foundation for the construction of the MASCS environment.
This framework was then applied to aircraft carrier flight deck operations, chosen as an example
HMUE based on its existence as a heterogeneous manned environment and the current research
concerning the development and integration of unmanned vehicles into flight deck operations. A
model of current manual flight operations was developed and partially validated based on data
obtained from previous research, observations of operations (both in-person and recorded), and
interviews with Subject Matter Experts (SMEs). This model then served as the template for the
implementation of unmanned vehicles into flight deck operations.
However, the introduction of unmanned vehicles might also entail changes to the structure of
operations in order to limit interactions between vehicles and other human collaborators. Past
experience has demonstrated that robotic, unmanned, and autonomous systems are not often given
free rein to operate in a system and are often isolated based on geographic area or in terms of
temporal scheduling. These forms of Safety Protocols suggest a few simple ways in which the
interactions of manned and unmanned vehicles can be limited to increase the safety of flight deck
operations. However, the true effectiveness of these protocols and their interactions with system
productivity are not readily known. Despite the ongoing development of unmanned vehicle systems
for flight deck operations, significant questions remain as to how UAVs will be inserted into the
mission planning and mission execution processes.
To incorporate both the variations between CAs and the different types of SPs, MASCS was
constructed as an agent-based model. Such models rely on the definition of independent “agents”
that replicate the decision-making behavior, task execution, and movement of the individual actors
in the real-world system. Differences in CA can be mapped to differences in decision-making rules
and parameters describing individual vehicle and crew behaviors, while changes in the SP are characterized by changes in the supervisory Deck Handler’s planning activities. The modular design of
the agents enabled these differences to be easily integrated into the simulation model. These features
also are the primary inputs to the MASCS model and are discussed in the next section, along with
the outputs of the simulation.
Model Inputs and Outputs
Chapter 3 detailed the creation of the original MASCS model of flight deck operations and the sources
of data used in its construction. Constructing an agent-based model of flight deck operations first
required the definition of (1) agents active within the environment, followed by the definition of
(2) states that described the current conditions of the agent, (3) decision-making rules that define
what actions should be taken to achieve its current goals, and (4) parameters that defined how those
actions are performed. Each of these could be tailored by future researchers to explore new control
architectures or new parameters for the currently defined architectures. Similar structures were also
provided for independent task models, as well as for the model of the physical environment (the
carrier flight deck).
• Agents: Creating MASCS required defining individual agents for all aspects of operations (a structural sketch of how agents, states, decision rules, and parameters fit together follows this list).
This included pilot/aircraft agents, crew on the flight deck, supervisory planning agents, equipment such as catapults, and the flight deck itself. Additional models were constructed for tasks
executed by these agents, constructed in a modular fashion to make them easily adaptable to
a variety of situations. For each of these agents and models, sets of states, parameters, and
decision rules were defined individually based on prior observations of operations, a review of
training manuals, and interviews with SMEs.
– Agent states describe the current characteristics of an agent in the world, including its
current position, heading, assignments, and health. States describe the important characteristics of agents relevant both for their own activities and for the activities of other
agents in the system.
– Agent decision rules describe how, given information on the current state of the agent and the observable states of other agents and the environment, the agent should achieve its goals.
These rules determine what the goal should be, then determine what tasks are required
to reach that goal, then whether or not those tasks can be accomplished. Important decision rules included modeling how Aircraft Directors move to place themselves within an aircraft’s field of view, how Directors pass aircraft between one another in the “zone coverage” routing system, and how the Handler plans catapult assignments for aircraft in the system. Each of these decision rules was instrumental in the design of the MASCS system and was shown to have significant effects on system behavior. These are each
system characteristics that might be altered under future investigations.
– Agent parameters define the characteristics of task performance and are often modeled
as parameterized random distributions. The definitions of these parameters, along with
changes to decision rules, define the differences between the various UAV control architectures modeled in MASCS. These include the speed at which aircraft can travel on
the flight deck, time distributions characterizing the time required to complete certain
tasks, and the failure rates of task execution. Failure rates and system processing latencies (caused by a variety of different factors) are also included here. Parameters for
Manual Control operations were defined based on empirical evidence and SME advice,
while UAV parameters were based on a combination of prior research and assumptions
of how mature system technologies would behave. Less-mature systems, or other types
of control architecture not considered here, may provide very different results than what
was observed in this testing program and are a point of future investigation.
• Task models: Models of agent task execution were also required, consisting of decision-making
rules, states, and parameters that describe under what conditions a task can be performed
as well as how that task is executed when conditions permit. Of primary importance were
models of agent motion in the world, which rely on interactions with agent states, the Handler
assignment decision-making rule, and the Aircraft Director “zone coverage” routing system.
Two other important tasks, the launch preparation time and the acceleration time, are vehicle-independent and are based on stochastic time distributions derived from real-world observations. These models are modular in nature, defined independently of any agents in the system. As such, new task models, or variations of the current models, could also be introduced. Current models
could also be adapted to new environments outside that of carrier flight deck operations.
• Mission and Deck Configurations: The characteristics of the mission to be executed and the
arrangement of the flight deck also serve as important factors in the system. The number of
aircraft included in the mission, mission Density, has fundamental effects on the taxi routes
and other actions available to agents in the system. The percentage of unmanned vehicles
used in the mission, mission Composition, affects how the Safety Protocols are executed.
Additionally, the locations of aircraft on the flight deck and the physical layout of the deck
itself form additional variables that were not explored in this work.
• Metrics: Analyzing operations required the definition of metrics in several categories addressing the productivity (expediency and efficiency) and safety of operations. The metrics provide
a breadth of perspectives that enabled a detailed analysis of differences in behavior of Safety
Protocols, Control Architectures, and their interactions with different mission configurations
(mission Density and Composition).
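To make the structure of these elements concrete, the following sketch illustrates how an agent might bundle states, parameters, and a simple if/then decision rule. It is a minimal illustration only: the class and attribute names (AircraftAgent, decide_next_task, taxi_speed) and the numeric values are hypothetical stand-ins and do not reproduce the actual MASCS implementation.

import random
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AircraftAgent:
    """Illustrative agent definition: states, parameters, and a decision rule.
    All names and values are hypothetical stand-ins for the MASCS models."""
    # States: observable characteristics of the agent in the world
    position: Tuple[float, float] = (0.0, 0.0)
    heading: float = 0.0
    assignment: Optional[str] = None     # e.g., an assigned catapult
    healthy: bool = True
    # Parameters: task-performance characteristics, often random distributions
    taxi_speed: float = 2.0              # mean taxi speed (assumed units)
    launch_prep_mean: float = 180.0      # launch preparation time, seconds (assumed)
    launch_prep_sd: float = 45.0
    failure_rate: float = 0.01           # probability a task execution fails

    def sample_launch_prep_time(self, rng: random.Random) -> float:
        """Draw a launch preparation time from a parameterized distribution."""
        return max(0.0, rng.gauss(self.launch_prep_mean, self.launch_prep_sd))

    def decide_next_task(self, catapult_free: bool) -> str:
        """Simple if/then decision rule: pick a goal, then check whether the
        task needed to reach it can currently be performed."""
        if not self.healthy:
            return "hold"
        if self.assignment is None:
            return "await_assignment"
        return "taxi_to_catapult" if catapult_free else "queue_at_catapult"

rng = random.Random(0)
agent = AircraftAgent(assignment="catapult_1")
print(agent.decide_next_task(catapult_free=True), round(agent.sample_launch_prep_time(rng), 1))

In a full model, separate task models would consume these states and parameters, which is what keeps the agents modular.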
Each of the elements described above could be altered by future researchers exploring different
facets of operations. Additionally, the models themselves could be fundamentally altered: as the
current focus of MASCS is on futuristic vehicle technologies using experienced operators, interactions between humans and aircraft were treated as settings for a combined human-vehicle system.
These models could be further differentiated, creating separate models for human pilots/operators
and vehicles in order to explore interactions between the two at a much finer level of detail. If these
models, which would more explicitly capture human cognition and decision-making, were implemented, many of the task execution and failure rate models defined in this work would become the
outputs of these new, more detailed models. Implementing more detailed human cognitive models
might also require substantial changes to the decision-making rules in order to adapt to this new
architecture. This modularity and the ability to easily introduce new agent models are among the
benefits of MASCS discussed further in the next section.
Model Benefits
MASCS is, to the author's knowledge, the first simulation environment that models a variety of candidate
unmanned vehicle control architectures in a partially validated model of a real-world HMUE. This
has afforded an ability to examine features of unmanned vehicle performance and interaction with
operational settings that has not been observed in other arenas. Due to the modular definition of
agents and tasks within MASCS, the characteristics of individual agents, how they execute tasks,
how they interact with one another, the physical organization of the flight deck, initial positions
of aircraft and equipment, and the planning methods of the Deck Handler and other staff can all
be investigated individually, without requiring modifications to other agents and decision-making
routines. This framework also allows for the easy insertion of models of the different UAV CAs
into the system, replacing manned aircraft with models of unmanned vehicle behavior and enabling
comparison of performance across cases without requiring the construction of an entirely new model of
the flight deck. Future models of unmanned aircraft could easily be generated and integrated with
the current MASCS model, along with new operational settings and conditions.
The ability to make these changes individually enabled the creation of a test program that
systematically varied the control architectures, safety protocols, and mission settings (both number
of aircraft and percentage of UAVs included in the mission) in order to characterize differences
in performance across cases as well as interactions between cases. This testing program and its
results were examined in detail in Chapter 5, providing significant insight into potential outcomes
of unmanned vehicle integration into flight deck operations. The most important lessons learned
concern the role of the crew in affecting operations. The use of human crew in routing aircraft
through the flight deck (the “zone coverage” system and its topology) was shown to have a significant
effect on the results of safety measures in those tests. Additionally, the role of crew in the vehicle-independent launch preparation process was shown to serve a regulatory function on operations,
limiting variations in the productivity of control architectures in terms of launch event duration.
Once these limitations in the launch process were removed by modeling an automated launch
preparation process, key differences in flight deck productivity were observed. The unconstrained and highly autonomous SBSC architecture provided some advantages in operations over
UAV control architectures experiencing latencies in operations (RT, GC, and VBSC). The architectures experiencing latencies also performed most poorly when combined with the
Temporal separation protocol, which itself provided poor results in general. The poor performance of
the Temporal protocol suggests that for Deck Handlers scheduling flight deck operations, maintaining flexibility in aircraft assignments (Temporal Separation) is less important than maintaining
a consistent flow of traffic from aft to forward (as occurs in Area Separation). Ultimately, for UAV
control architectures to have any benefits past those of the current Manual Control models, both the
flow of traffic must be preserved and the launch preparation time must be improved; otherwise, the
effects of these two elements greatly overshadow any differences in unmanned vehicle operations.
In terms of the safety of operations, significant interactions between control architectures, safety
protocols, and mission settings were also observed. Measures of safety that examined the distances
between aircraft (the Tertiary Halo Incursions due to Aircraft (THIA) measure) showed the best
performance under the combined Temporal+Area protocol. However, improved performance on this
measure came at the cost of mission expediency, as the heavy constraints of this protocol limited the
ability to launch aircraft from the flight deck. While these results apply specifically to the design
of modern flight deck operations, they do suggest a need to carefully question the organization and
conduct of operations in other domains as they relate to and interact with their own unmanned
vehicle systems.
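To make the flavor of such distance-based safety measures concrete, the sketch below counts pairs of aircraft whose positions fall within a notional halo radius at a single instant. It is a simplification under stated assumptions: the actual THIA metric is accumulated over full simulated trajectories using the halo definitions of the thesis, and the function name, radius, and coordinates here are hypothetical.

import itertools
import math

def count_halo_incursions(positions, halo_radius):
    """Count aircraft pairs closer than the halo radius in a single snapshot.
    Illustrative only; the real measure accumulates incursions over a mission."""
    return sum(1 for (x1, y1), (x2, y2) in itertools.combinations(positions, 2)
               if math.hypot(x1 - x2, y1 - y2) < halo_radius)

# Hypothetical snapshot of aircraft positions on the deck (arbitrary units).
snapshot = [(0.0, 0.0), (6.0, 2.0), (40.0, 15.0), (42.0, 14.0)]
print(count_halo_incursions(snapshot, halo_radius=8.0))  # -> 2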
The selection of control architecture also had interesting interactions with these safety measures,
primarily due to the nature in which vehicles are routed on the flight deck. Under the THIA metric
measuring the safety of aircraft interactions with one another, the SBSC systems provided both
some of the very best and some of the very worst performance, depending on the level of constraints
applied by the safety protocols. This also has repercussions for real-world operations: the MASCS
model did not assume that failures could occur in the collision prevention systems for any of the
UAV control architectures. These results suggest that, for SBSC systems reliant solely on their own
autonomy to avoid accidents, imperfect compliance with collision prevention may make these some
of the least safe systems to operate.
In general, MASCS has provided an increased understanding of the dynamics of flight deck
operations and of the key features that govern mission productivity and efficiency, and has pointed the way
towards a better understanding of which characteristics of unmanned vehicles and of safety protocols
generate desirable safety and productivity in this domain. The nature of vehicle interactions on
the flight deck is similar to that in highway driving and mining environments, and the results
described for SBSC here could extend to these domains in a limited fashion. Mining environments
are similar with the exception that vehicles travel cyclically between different sites within the mine.
Roadway driving is also similar, although the starting locations and destinations of vehicles
are all dissimilar; the models of unmanned vehicle control architectures, however, should be similar
to those used for flight deck operations. The modularity of MASCS and its agent and task models means that it
could easily be extended into models of different physical environments in order to verify exactly
how well these results transfer into these new domains.
7.1.2 Model Confidence
Chapter 4 presented information regarding the calibration and partial validation of the MASCS
model against data on current manual control flight deck operations, examining the performance
at both the individual vehicle and mission levels, as well as sensitivity analyses of performance at
both levels. These sensitivity analyses address the most important aspects of flight deck operations,
namely vehicle speed, collision avoidance parameters, and launch preparation time. These tests
examined the accuracy of the model with respect to replicating current operations as well as the
stability and robustness of the model given changes in the task execution parameters.
Model Accuracy
Validation testing for the MASCS model of current aircraft carrier operations relied on a combination
of reported performance data from Naval research institutions, video observations of operations,
and subjective evaluations by SMEs. The model of individual manual control aircraft behavior
was calibrated with recorded footage of an individual aircraft followed from initial parking location
through the completion of the launch process. This video was catalogued, breaking the entire launch
event into subtasks, each of which formed a comparison point for the simulation data. A replica
scenario was developed and executed within MASCS, demonstrating that the simulation was capable
of reproducing the video data within ±1 standard deviation on each metric.
More detailed data on mission performance was obtained from prior Navy studies. This data
reported the Cumulative Distribution Function (CDF) of launch interdeparture times on the flight deck
and expected values for the time to complete a mission of given size. A series of representative
missions were replicated within MASCS, demonstrating that the simulation was capable of providing
similar distributions of interdeparture times and total mission times in line with prior expectations
and reports. The results of individual vehicle testing suggested that the models of individual aircraft
motion and task execution were reasonable, while tests at the mission level demonstrated that the
architecture of flight deck operations, including the assignment of aircraft to catapults and the
routing of aircraft to those assignments, were also reasonable representations of the real system.
However, there are a variety of aspects of operations that have not been formally validated, with
individual crew behavior being the most prominent. While the model has been calibrated using data
on current mission operations, new data may reveal significant flaws in the model.
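As an illustration of the kind of mission-level distribution comparison described above, the sketch below contrasts a set of simulated launch interdeparture times against a reference sample using empirical CDF values and a two-sample Kolmogorov-Smirnov test. The data, the distributional forms, and the use of a KS test are assumptions introduced here for illustration; the actual validation compared MASCS output against the Navy-reported CDF and expected mission durations.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical stand-ins for reported and simulated interdeparture times (seconds).
reference = rng.lognormal(mean=np.log(45.0), sigma=0.40, size=500)
simulated = rng.lognormal(mean=np.log(47.0), sigma=0.42, size=500)

# Two-sample KS test: a small statistic (large p-value) indicates the simulated
# distribution is consistent with the reference distribution.
ks_stat, p_value = stats.ks_2samp(simulated, reference)
print(f"KS statistic = {ks_stat:.3f}, p = {p_value:.3f}")

# Empirical CDF values at a few query times, for side-by-side inspection.
for t in (30, 45, 60, 90):
    print(f"t = {t:3d} s: simulated CDF = {np.mean(simulated <= t):.2f}, "
          f"reference CDF = {np.mean(reference <= t):.2f}")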
Models of UAV behavior were based on prior research evidence and supplemented with assumptions about realistic performance for mature systems utilizing experienced operators. While the
performance of these vehicles cannot be validated in any formal sense as they do not yet exist, tests
of individual vehicles (replicating the individual vehicle mission used in validation) provide results
in line with expectations: an inverted-U shape. Moving from LT to SBSC across the spectrum of
automation complexity, performance degrades to a minimum at the GC architecture, which performs
worst due to its delays in processing and failures in recognizing Directors and task instructions,
before improving again. At the ends of the spectrum,
the LT and SBSC systems provide the best performance and were not shown to be significantly
different from Manual Control (MC). The remaining two cases of RT and VBSC were shown to be
significantly worse than MC but still better than GC.
Model Robustness
Additional tests in MASCS examined simulation behavior given changes in internal and input parameters within the model. These tests specifically addressed the reaction of the simulation to
changes in the number of aircraft and the number of catapults (input parameters) as well as changes
in the taxi speed of aircraft and the related collision prevention parameters. Changes in the input
parameters lead to substantial changes in how aircraft are prioritized and routed on the flight deck,
while changes to internal parameters affect the execution of taxi operations upstream of the launch
process. The tests conducted in Chapter 4 verified that the response of the flight deck system was as
expected: linear with respect to the number of aircraft used in the mission, a slight difference (1-2
launches) between using 3 and 4 catapults, and that responses of the system to changes to taxi and
motion behaviors, such as taxi speed and collision avoidance parameters, did not generate extreme
variations in operations. Because these two parameters affect tasks upstream of the launch preparation task, their effects on mission performance are ultimately limited: differences in performance
only occur if the disruptions to taxi operations break the maintenance of queues at catapults. The
effect of varying the mean value of the launch preparation time model, a vehicle-independent
task that occurs on the flight deck, was shown to be much stronger than that of either of these
other parameters, as would be expected.
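A one-factor-at-a-time sweep of this kind can be expressed compactly, as in the sketch below. The run_mission function is a toy placeholder (not the MASCS interface) that only mimics the qualitative behavior described here: the launch preparation time dominates mission duration while taxi speed has a limited effect.

import random
import statistics

def run_mission(taxi_speed: float, launch_prep_mean: float,
                n_aircraft: int = 22, seed: int = 0) -> float:
    """Toy placeholder for a MASCS run; returns a notional launch-event duration (s)."""
    rng = random.Random(seed)
    taxi = sum(rng.gauss(300.0 / taxi_speed, 10.0) for _ in range(n_aircraft))
    prep = sum(rng.gauss(launch_prep_mean, 0.2 * launch_prep_mean) for _ in range(n_aircraft))
    # Taxi time is largely absorbed by queueing upstream of the catapults.
    return 0.1 * taxi + prep

baseline = {"taxi_speed": 2.0, "launch_prep_mean": 180.0}
sweeps = {"taxi_speed": [1.5, 2.0, 2.5], "launch_prep_mean": [120.0, 180.0, 240.0]}

for param, values in sweeps.items():
    for value in values:
        settings = dict(baseline, **{param: value})
        durations = [run_mission(seed=s, **settings) for s in range(20)]
        print(f"{param} = {value:6.1f}: mean duration = {statistics.mean(durations):8.1f} s")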
These results demonstrated that the MASCS model is capable of replicating the performance
of the real world system, including its responses to changes in both internal and input parameters.
This, in turn, signifies that models of Deck Handler planning heuristics and Aircraft Director traffic
routing on the flight deck are both reasonable representations of the real system, including possessing the flexibility to handle different mission definitions. Establishing these decision-making rules
was a complicated aspect of generating the agent models, and the work described in this dissertation provides a template for future researchers generating agent-based models of similar domains.
However, as noted in Chapter 6, Section 6.4, one of the key limitations in MASCS development was
the limited amount of data available on operations. If future data collection programs demonstrate
that the data used in validating MASCS and in defining MASCS agent parameters was incorrect,
MASCS should no longer be considered a valid model of operations. Input parameters for MASCS
would have to be updated and the entire simulation revalidated to ensure its accuracy.
7.1.3 Model Applications
In general, MASCS is a complex simulation designed for “what-if” scenario testing of aircraft
carrier flight deck operations, flexible enough to test not only changes to the behavior
and performance of individual agents within the system but also changes to the execution
of tasks, to communication between agents, and to the methods of planning in the environment.
Chapter 5 presented one way in which the MASCS environment can be used, testing different
candidate UAV control architectures and safety protocols intended for use in these operations in
order to examine whether or not differences exist in the performance of these systems and, if so,
the magnitude of these differences and the sources of their variation. Chapter 6 explored another
way in which MASCS can be utilized: as a design space exploration tool used to examine the
effects of changing parameters for process-level tasks (the launch preparation time) and the control
architectures on safety and productivity in the flight deck. A first series of tests explored altering
the mean and the standard deviation of the launch preparation time, while a second series explored
improvements to parameters for the GC architecture.
These results demonstrated that, for systems like the flight deck, where all vehicles head to one
of many identical destination states and perform the same final terminal process, the characteristics
of that process may ultimately affect judgments of the performance of the modeled systems. As
shown in Chapter 6, at high mean, high standard deviation launch preparation times, no difference
was observed between control architectures. If a stakeholder used these settings solely as their frame
of reference, they would conclude that the use of unmanned vehicles not only had no impact on
operations, but that it did not matter what control architecture they selected. Under a lower mean,
lower standard deviation preparation time model, significant differences did arise between models,
leading to very different conclusions about which UAV systems were most effective. However, in
absolute terms, the differences in performance between cases were quite small.
Additional tests explored what improvements to control architecture parameters would be required to close the differences in performance that did exist. Chapter 6 examined in depth the two most important
parameters of the Gestural Control architecture, exploring how improvements in the processing latency and accuracy of these systems do or do not affect operations. This exploration led to several
interesting conclusions about the performance of these systems. Notably, under operations using the
original launch preparation time model with large mean and standard deviation, significant improvements in these parameters had no further beneficial effect on operations: GC performance remained
equivalent to MC performance due to the regulating effects of the launch preparation time. Only
once the duration of this task was decreased were changes in GC parameters shown to have any effect. To reach
performance equivalent to MC operations required substantial improvements to the GC parameters,
but the practical differences between large and moderate improvements are fairly small.
Other tests examined the effects of changes to these processing parameters on the performance of the different
Safety Protocols. Here, other key aspects of the determination of “safety” in these environments were
explored. Most intriguingly, results suggested that increasing the tempo of operations by reducing
the launch preparation mean and standard deviation might actually be beneficial to operations
overall, as it minimized the number of times aircraft come within a close distance of one another.
This increased tempo also decreases the total amount of time in which crew are placed in close range
to aircraft, but does not affect the total number of times this occurs. This implies an overall benefit
to the safety of other aircraft on the flight deck, but an increased rate of risk generation for crew; as
described above, balancing these considerations and understanding which are the most important
are a key concern for stakeholders making decisions for these systems in the future.
Each of these explorations would have taken a substantial amount of time and money to conduct
under real-world conditions. The ability of MASCS to quickly modify the conditions of the system,
execute a series of tests (typically at ten times normal speed), and then compile and review data provides
an investigative tool not available to most current stakeholders. Additionally, the nature of MASCS
provides the ability to test conditions that do not currently exist in the world: the effects of major
improvements to system technology can be examined without issue in such a simulation environment. As
such, MASCS affords an ability to investigate system settings that do not currently exist and may
not in the near term be achievable. This is one potential area of future work discussed in the next
section.
7.2 Future Work
MASCS is, and was designed to be, a fairly simple agent-based model of the world. Each and every
one of the logical decision-making rules utilized within it is based on if/then statements and other
simple heuristic-based solutions. These models were based on a relatively limited amount of data
on operations as compared to other environments. A significant source of future work in MASCS
can address improving the complexity of the agent models, which in turn requires increased data
collection on flight deck operations, crew behavior, and unmanned
vehicle performance.
As noted previously in Section 7.1.1, the pilot and aircraft models were treated as single entities
designed to be representative of mature technologies with experienced operators. By definition,
this precludes any examination of the implementation of UAV models with novice operators
without properly calibrated trust in the systems. This also precludes investigating any effects of
changes in trust or improvements in skill over time, and eliminates the ability to examine
how different human operator behaviors affect the performance of the systems. In order to create
such models, the pilot/aircraft models must be further differentiated into two independent agents
that interact with one another, potentially in the style of systems such as SOAR (Laird, 2008)
and ACT-R (Anderson & Schunn, 2000). This requires a much greater level of detail in both
human cognitive processes (memory, attention, decision-making and reaction times) as well as how
these processes interact with the vehicles themselves. A reasonable first step in this would be a
series of studies of naval aviators, understanding how they process information and execute tasks
on the flight deck, followed by similar exercises with individuals using prototype or simulated UAV
systems. Developing a model for any one of the control architectures alone may be considered a
thesis-level research topic, and any single model could be incorporated as an extension of MASCS.
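A minimal sketch of what such a differentiation might look like follows. The PilotModel and VehicleModel classes, their latency parameters, and the exponential delay assumption are all hypothetical; a genuine cognitive model in the spirit of SOAR or ACT-R would be far richer, but the interface idea (a pilot model producing latencies and errors that feed a separate vehicle model) is the same.

import random
from dataclasses import dataclass

@dataclass
class PilotModel:
    """Hypothetical stand-in for a cognitive model: produces perception and
    decision latencies that would otherwise be folded into the vehicle model."""
    perception_mean: float = 0.8   # s to recognize a Director signal (assumed)
    decision_mean: float = 1.2     # s to select a response (assumed)

    def react(self, rng: random.Random) -> float:
        return (rng.expovariate(1.0 / self.perception_mean)
                + rng.expovariate(1.0 / self.decision_mean))

@dataclass
class VehicleModel:
    """Vehicle dynamics separated from the operator; consumes the pilot latency."""
    accel_delay: float = 3.0       # s to begin taxiing once commanded (assumed)

    def time_to_move(self, pilot_latency: float) -> float:
        return pilot_latency + self.accel_delay

rng = random.Random(1)
pilot, vehicle = PilotModel(), VehicleModel()
response_times = [vehicle.time_to_move(pilot.react(rng)) for _ in range(5)]
print([round(t, 2) for t in response_times])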
Similar exercises could occur for the Aircraft Directors and Deck Handler, each of whom had a
single model of behavior developed for this work. While the models were shown to be flexible enough
to handle a variety of operating conditions on the flight deck, other alternative models may also be
possible. Errors on the part of these individuals were also largely eliminated from the model, and
it may be of significant interest to characterize their effects. However, this again requires extensive
additional data collection looking at individual behavior and the creation of more detailed models
of cognitive processes, decision making, and errors. Nonetheless, the generation of such a system would
be of extraordinary assistance in terms of diagnostic evaluations of flight deck operations: the effects
of an individual mistake could be demonstrated in simulation by simulating the mission both with
and without the error, examining the differences in mission evolution and performance that result.
This has significant potential as a training tool for new Directors and Deck Handlers, with the
possibility of integrating the system as a real-time assessment tool onboard carriers. In this sense,
the simulation could be converted into a decision-support system, allowing operators to gauge the
effects of their decisions in real time and make appropriate adjustments.
Additionally, as noted in Chapter 5, the models of close-proximity crew interactions with vehicles
were very simple. While the results of crew safety metrics were tracked and analyzed, the simplicity
of the crew models precludes any real judgments about the safety of crew interactions with aircraft.
To provide a true understanding of crew interactions with aircraft, more complex models that more
effectively capture how crew interact with aircraft on the order of inches and feet would be required;
the current iteration of MASCS does not address human or vehicle motion at that fidelity. However,
the construction of such models of crew would provide benefits to both current manual and future
unmanned operations, providing an understanding of the key behaviors of aircraft and pilots that
place crew in danger (and vice versa). Such information could then be used to either improve crew
and pilot training or improve the heuristics of vehicle routing on the flight deck.
In a similar vein, a point of ongoing investigation is the development of planning algorithms
for use in flight deck operations, as well as research into data mining and characterizing the key
features of operations. It would be relatively easy to convert MASCS into a data generator for use
in this research, replacing the Deck Handler with an automated scheduling algorithm or replacing
the “zone coverage” crew routing with alternative path planning algorithms. Additional changes
could introduce new metrics and measurement methods into the system for use in the data mining
of operational performance. However, in order to transition the system into a true research tool
usable by individuals other than the program designer, an automated scenario generation tool that
aids in the creation of initial configuration files for the system, along with automated generation of output
statistics across trials, would be useful. Making such changes would also facilitate the merging
of the MASCS codebase with optimization toolkits, which would enable a broader exploration
of the design space. Wrapping the MASCS system into a genetic algorithm or particle swarm
optimization routine, one with the authority to change the number and starting locations
of aircraft, the parameters within those aircraft, and possibly the type of planning algorithm
or heuristic used, would facilitate the search for a true “optimum” in the system. These searches could
also be conducted for aspects of flight deck operations outside of the launch process, which are
described in the next sections.
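The wrapping described above could take roughly the form sketched below, which uses a plain random search as a stand-in for a genetic algorithm or particle swarm routine. The evaluate_scenario function, its parameters, and the objective are hypothetical placeholders for calls into MASCS and its output metrics.

import random

def evaluate_scenario(prep_mean: float, taxi_speed: float, seed: int = 0) -> float:
    """Hypothetical objective: mean launch interval (s) for one MASCS-like run.
    A real wrapper would execute the simulation and aggregate its output metrics."""
    rng = random.Random(seed)
    n_aircraft = 22
    total = sum(rng.gauss(prep_mean, 0.2 * prep_mean) + 30.0 / taxi_speed
                for _ in range(n_aircraft))
    return total / n_aircraft

def random_search(n_iterations: int = 100, seed: int = 1):
    """Minimal search loop over scenario parameters; a GA or particle swarm would
    replace the proposal step with crossover/mutation or velocity updates."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iterations):
        params = {"prep_mean": rng.uniform(60.0, 240.0),
                  "taxi_speed": rng.uniform(1.0, 3.0)}
        score = evaluate_scenario(**params, seed=rng.randint(0, 10_000))
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

print(random_search())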
7.2.1 Additional Phases of Operations
Additionally, with budget concerns driving a desire to reduce staffing levels, simulations like MASCS
could be used to examine how changes in the number of crew involved in flight deck operations drive
efficiency. While the current instantiation of MASCS only utilized Aircraft Directors, numerous other
personnel support operations prior to and after the launch phase. The development of MASCS
focused on taxi operations prior to launch because this is the period of time on deck in which the
behavior of UAVs would have the greatest effect. Prior to and after taxi operations, a variety of
other processes that rely heavily on crew take place: fueling, loading weapons, landing the previous
mission’s aircraft, and the process of “respotting” the deck prior to the next series of launches.
However, most of these tasks do not require much activity on the part of the aircraft and instead
rely mostly on the actions of crew. Aspects of these operations do, however, influence the launch
mission and relate to an optimization of the placement of aircraft within it, which may be of interest
to other related domains.
Recovery Operations
A first significant step would be the inclusion of the landing process into operations. The Center
for Naval Analyses (CNA) report that provided guidance on the launch interdeparture rates (Jewell,
1998) also provides guidance on the landing interarrival rates. Landings on the aircraft carrier flight
deck proceed to the single landing strip, with landings occurring at a median interval of 57.4 seconds. After
this, aircraft are immediately taxied out of the landing area and parked in areas forward of the
landing strip, typically beginning with the parking spaces on catapults 1 and 2. Doing so ensures
that the next aircraft coming in to land has no obstacles that would force it to abort its landing, a
key issue when aircraft are potentially running low on fuel. The poor performance demonstrated by RT, GC, and
VBSC aircraft may pose a significant issue for these operations; extending MASCS
to include these operations might be of significant use in examining the combat readiness of the
different UAV types.
With the inclusion of recovery operations, an additional, more complicated type of operations
referred to as “flex deck” operations could be examined. In this rather rare condition (typically
occurring only during combat operations), the cyclic series of launch events is abandoned. Once
an aircraft lands, it is immediately taxied out of the landing strip, refueled, reloaded with weapons, and
launched again as fast as possible. SMEs report that this puts tremendous stress on the crew to
maintain order and adequately manage the flight deck. Under such extremely high-tempo operations,
the “delayed” UAV systems may pose a serious problem in maintaining the required rate of launches
and be a detriment to conducting this form of combat operations. To the knowledge of the author,
very little research on this type of operation has been conducted, perhaps due to the rarity of
this type of operations and the stress placed on crew during its implementation. Extending MASCS
to incorporate these types of missions affords a completely new ability to review performance under
these unique operational conditions and may reveal important information about their utility in
comparison to more standard cyclic operations.
Respotting Operations
As noted above, after aircraft land they immediately taxi out of the landing strip to the forward area
of the deck to clear space for the next aircraft arriving in the pattern. After all aircraft have landed, the
deck is reconfigured by the Deck Handler and his staff to set up for the next set of launch operations,
moving aircraft from their parking spots forward of the landing strip and reorienting them to better
position them for taxiing to catapults (the initial conditions under which the current iteration of
MASCS operates). This layout is a form of spatial optimization (how to arrange a set of aircraft so
as to provide open access paths and robustness in schedule execution) that, while done quite well by
Deck Handlers, is not well understood by other stakeholders. As MASCS already contains models
of vehicle size and shape and the ability to execute a given launch schedule, additional modifications
could be introduced to replicate the process of vehicle relocation during respotting operations,
as well as to provide tools to analyze the efficiency of the new parking arrangement. Consider a model
like MASCS existing within a tabletop configuration much like the “Ouija board” (Figure 7-1)
currently used by Deck Handlers, in which the agents within MASCS could be dragged and dropped
over the interface; a Handler could interact with the MASCS simulation much in the same fashion
as he currently does with the Ouija board, but would now have the ability to analyze the effectiveness of
his current arrangement of aircraft, identify potential pitfalls in the schedule, and adjust for them
accordingly.
7.2.2 Deck Configuration
Testing with MASCS to date purposefully used static templates of initial aircraft parking locations in an attempt to maintain consistency across conditions. In reality, there are a multitude
of possible parking arrangements that might be utilized by Deck Handlers, and testing the effects
of these arrangements falls easily within the capabilities of the MASCS environment. This could
be accomplished through the input of new configuration files, or through the use of the randomized
assignment generator described at the beginning of Section 7.2. Additionally, radical reconfigurations of the flight deck, including redesigns for future aircraft carriers, could also be explored
within MASCS. However, such a major alteration to the system would require not only redesigning
the physical geometry of the flight deck but also rebuilding the logic behind the Handler’s decision-making rules for assigning aircraft to catapults as well as the zone coverage method of routing
aircraft amongst crew. Neither of these latter actions is likely to be trivial, although intermediate versions
of a redesigned MASCS could be used as an aid to work with current operators to understand how
they might adapt their heuristics to future operational environments.

Figure 7-1: Picture of a Deck Handler, resting his elbows on the “Ouija board,” a tabletop-sized map
of the flight deck used for planning operations. Scale metal cutouts of aircraft are used to update
aircraft positions and layouts, while colored thumbtacks are used to indicate fuel and weapons
statuses.
7.2.3 Responding to Failures
Examining the responsiveness of the system to failures may be an additional extension of the MASCS
model. In general, the MASCS missions assumed a fairly cooperative environment — no inclement
weather, no mechanical failures on the part of aircraft or deck equipment, no random behaviors by
crew, and no major issues that require a new plan be developed. Interviews with SMEs suggest that
operations often require changes to plans in the midst of execution; consider a case where a launch
catapult experiences a major issue and must shut down for the rest of the launch event. All aircraft
currently assigned to the catapult must be reassigned and rerouted to the remaining catapults, which
might also require the relocation and movement of other aircraft on the flight deck to clear paths for
the vehicles to travel. Given the issues with the “delayed” set of unmanned vehicles, such a drastic
change in operations, requiring simultaneous action on the part of several agents, might place the
system in a very poor position and limit its effectiveness.
Other aspects of interest include the current utilization of the crew. This might address levels
of crew fatigue that occur during operations and the influences of both operational tempo and
external environmental factors (rain, sun glare, etc.), or it might address how many crew are
actually required to run the flight deck. Currently in MASCS, a team of sixteen Aircraft Directors
is utilized in the zone coverage routing system. However, observations of the system show that
the workload is not spread evenly amongst the crew, nor are many crew reaching peak utilization.
While there are other benefits to having more crew than necessary on the flight deck (robustness
of operations, safety concerns, etc.), systems such as MASCS could explore the effects of reducing
this to smaller numbers of crew. Of particular interest would be the effects of crew reduction on
the other aspects of flight deck operations, discussed earlier in Section 7.2.1, as well as the resulting
effects on crew fatigue. Incorporating models of the effects of fatigue and physiological state on crew
errors may also prove useful in better estimating the rates of injuries, accidents, and other aspects
of human performance on the flight deck.
7.2.4 Extensions to Other Domains
Lastly, and perhaps most importantly, alternative versions of this MASCS model could be developed
for other domains, although doing so would require substantially rewriting most, if not all, of the agents and tasks
defined within the system. However, the description of MASCS in Chapter 3 provides a template
for the creation of alternative simulation environments. Of particular interest are models of the
U.S. highway system or manufacturing environments, both of which are actively investigating the
development of autonomous systems. Models of these environments may be created through similar
methods as used in developing MASCS.
7.3 Contributions
The objective of this thesis was the development of an agent-based model of flight deck operations, a
candidate Heterogeneous Manned-Unmanned Environment (HMUE), into which models of futuristic
unmanned vehicle systems could be added and the effects of their integration into operations could be
analyzed. In working towards this objective, several contributions have been made to the modeling
and analysis of unmanned vehicle systems in complex sociotechnical systems, which are discussed
below. This thesis also partially fills a gap in existing research, in which models of unmanned
vehicle systems and their performance and interaction with human collaborators and other manned
vehicles in the context of real-world, heterogeneous operations have gone largely unexamined.
Chapter 1 posed four research questions that guided the development of the MASCS model. The
first two questions addressed the modeling of unmanned vehicle control architectures in HMUEs:
“What are the specific agents and their related states, parameters, and decision-making rules required to construct an agent-based model of carrier flight deck operations?” and “What are the key
parameters that describe variations in unmanned vehicle control architectures and their interactions
with the world [in that model]?” Answering these questions explored the creation of agent-based
models of Heterogeneous Manned-Unmanned Environments and provided the following contributions:
• The establishment of a general framework of manned and unmanned vehicle operations in
Heterogeneous Manned-Unmanned Environments that traces how information flows between
human collaborators, manned vehicles, unmanned vehicles, and supervisors in the environment.
This provided guidance as to the types of agents that should be defined within the system and
served as a template for the construction of the MASCS model of operations.
• A description of the important states, parameters, and decision-making rules that define the
key agents involved in flight deck operations, including models of motion and task execution
in the environment. This provides guidance for future researchers seeking to build models of
flight deck operations or of similar HMUEs.
• The identification of six key variables that define unmanned vehicle behavior in the context of
flight deck operations. New models of unmanned vehicles were defined as a series of changes to
a baseline manual control model, which was also created as part of this work. Verification tests
for these models showed the same type of performance differences generally expected from the
literature and demonstrated the utility of using an agent-based simulation for this research.
The final two sets of research questions addressed the effects of implementing unmanned vehicles
in flight deck operations. The first examined changes in productivity: “In the context of the modeled
control architectures, how do these key parameters drive performance on the aircraft carrier flight
deck? How do these parameters interact with the structure of the environment, and how do the
effects scale when using larger numbers of aircraft in the system?” The second set examined changes
in safety: “If high-level safety protocols are applied to operations, what effects do they have on the
safety of operations? Do the protocols lead to tradeoffs between safety and productivity in the
environment?” This required the development and execution of an experimental program for the
MASCS model, followed by additional tests that further explored the effects of different parameters
within the system. This resulted in the following contributions:
• A series of performance metrics addressing the expediency, efficiency, and safety of flight
deck operations were defined and implemented into the MASCS environment. These metrics
described the overall functionality of the flight deck system while also reviewing individual
vehicle behavior, the allocations of tasks across regions of the flight deck, and the proximity
to which vehicles come to both one another and to crew.
• Variations in safety were driven largely by differences in the flight deck topology due to the
heuristics of the crew zone coverage routing and the location of the crew in that topology.
For systems not using the crew zone coverage routing, improvements in aircraft safety were
contingent on increasing high-level constraints in operations (separating aircraft based on time
or schedule). Overconstraining operations provided some additional benefits to safety at a
high cost to productivity.
• Variations in productivity, outside of the effects of overconstraining operations, were minimal
due to the effects of the heuristics governing vehicle routing and catapult queueing and the
utilization of the crew in the launch preparation process. Tests in Chapter 5 were unable to
identify any real differences in productivity due to unmanned vehicle integration because of
these factors. Key areas of future research concern reexamining the role of crew throughout
these aspects of flight deck operations.
• When improvements to the catapult queueing process were made in Chapter 6, results showed
that the occurrence of delays and latencies in operations is the key limitation in unmanned
vehicle productivity. Further improvements to the planning heuristics and the topology of the
flight deck (how vehicles are routed) may have further benefits to performance.
• At a much faster tempo of operations, variations in safety metrics across safety protocols for
a given control architecture decreased, due to the reduction in time vehicles spent in queues
and on the flight deck overall. This suggests that faster tempo operations (with vehicle transit
speed left unchanged) may aid in minimizing the chances of accidents on the flight deck.
In all, this dissertation should serve as a guide to two potential populations. First, for researchers
interested in developing agent-based models of human, manned vehicle, and unmanned vehicle behavior and interaction in the world, the creation of MASCS serves as a template for future models
for similar environments. Second, for stakeholders interested in assessing the future effects of unmanned vehicle implementation into candidate environments, the results of the testing of MASCS
provide guidance as to the key features of the environment that affect operations. In particular,
these results have demonstrated that neither Safety Protocols nor Control Architectures can be examined on their own: significant interactions occur between them that are influenced by the context
of conditions within the environment, the utilization of the crew, and the mission being conducted.
The performance of the SBSC architecture, the most advanced system utilizing a high level of autonomy, suggests that increased autonomy does not necessarily imply increased safety across all types
of measures.
As the transition of unmanned and autonomous vehicle systems from laboratory settings into the
world at large accelerates, understanding exactly what the principal features of unmanned vehicle
behavior are and how they interact with the environment is of significant importance. Given the
significant cost of performing tests in the real world, the development of simulation models that
adequately capture not only the motion of vehicles in the world, but their decision-making behavior
and their interactions with other vehicles and people is of the utmost importance. MASCS provides
one method of doing so and has in turn provided some significant findings as to the true worth
of unmanned vehicles in one example domain and how they should be implemented. Extensions
of models like MASCS that include more complex models of not only human performance but of
vehicle hardware and sensors, even extended to hardware- or software-in-the-loop simulation, further
enhance these capabilities and provide an even better understanding of these environments. For
stakeholders such as the Federal Aviation Administration, the National Transportation Safety Board,
and the National Highway Traffic Safety Administration, such models for reviewing the
safe and effective integration of unmanned vehicles into their respective domains should be valuable not only in
promoting the development of effective unmanned vehicles, but of safe ones as well.
References
Aircraft Carriers–CVN. (2014). Retrieved from http://ipv6.navy.mil/navydata/fact display.asp?
cid=4200&tid=200&ct=4. (Last accessed: April 29, 2014)
Alami, R., Albu-Schaffer, A., Bicchi, A., Bischoff, R., Chatila, R., De Luca, A., De Santis, A., Giralt,
G., Guiochet, J., Hirzinger, G., Ingrand, F., Lippiello, V., Mattone, R., Powell, D., Sen, S.,
Siciliano, B., Tonietti, G., & Villani, L. (2006). Safe and dependable physical human-robot
interaction in anthropic domains: State of the art and challenges. In Proceedings of Intelligent
Robotics Systems (IROS’06). Beijing, China.
Altenburg, K., Schlecht, J., & Nygard, K. E. (2002). An Agent-Based Simulation for Modeling
Intelligent Munitions. In Proceedings of the Second WSEAS International Conference on Simulation, Modeling and Optimization. Skiathos, Greece.
Anderson, D., Anderson, E., Lesh, N., Marks, J., Perlin, K., Ratajczak, D., & Ryall, K. (1999).
Human-Guided Simple Search: Combining Information Visualization and Heuristic Search.
In Proceedings of the 1999 Workshop on New Paradigms in Information Visualization and
Manipulation in conjunction with the Eighth ACM International Conference on Information
and Knowledge Management. Kansas City, MO.
Anderson, J. R., & Schunn, C. D. (2000). Implications of the ACT-R Learning Theory: No Magic
Bullets (Technical Report). Department of Psychology, Carnegie Mellon University.
Archer, J. (2004). Methods for Assessment and Prediction of Traffic Safety at Urban Interactions
and their Application in Micro-simulation Modelling. Academic thesis, Royal Institute of
Technology, Stockholm, Sweden.
Archer, J., & Kosonen, I. (2000). The Potential of Micro-Simulation Modelling in Relation to Traffic
Safety Assessment. In Proceedings of the ESS Conference. Hamburg, Germany.
Arthur, K. W., Booth, K. S., & Ware, C. (1993). Evaluating 3D Task Performance for Fish Tank
Virtual Worlds. ACM Transactions on Information Systems, 11 (3), 239-265.
Atrash, A., Kaplow, R., Villemure, J., West, R., Yamani, H., & Pineau, J. (2009). Development and
Validation of a Robust Speech Interface for Improved Human-Robot Interaction. International
Journal of Social Robotics, 1, 345-356.
Bandyophadyay, T., Sarcione, L., & Hover, F. S. (2010). A Simple Reactive Collision Avoidance
Algorithm and its Application in Singapore Harbor. Field and Service Robotics, 62, 455-465.
Batty, M., Desyllas, J., & Duxbury, E. (2003). Safety in Numbers? Modelling Crowds and Designing
Control for the Notting Hill Carnival. Urban Studies, 40 (8), 1573-1590.
Bejczy, A. K., & Kim, W. S. (1990). Predictive Displays and Shared Compliance Control for Time-Delayed Telemanipulation. In Proceedings of the IEEE International Workshop on Intelligent
Robots and Systems ’90 (IROS ’90) ‘Towards a New Frontier of Applications’. Ibaraki, Japan.
Blank, M., Gorelick, L., Shechtman, E., Irani, M., & Basri, R. (2005). Action as Space-Time Shapes.
In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05).
Beijing, China.
Blom, H. A. P., Stroeve, S. H., & Jong, H. H. de. (2006). Safety Risk Assessment by Monte Carlo
Simulation of Complex Safety Critical Operations (Technical Report No. NLR-TP-2006-684).
National Aerospace Laboratory NLR.
Bonabeau, E. (2002). Agent-based Modeling: Methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences of the United States of America,
99 (Suppl. 3), 7280-7287.
Bonabeau, E., Hunt, C. W., & Gaudiano, P. (2003). Agent-Based Modeling for Testing and
Designing Novel Decentralized Command and Control System Paradigms. Paper presented at
The International Command and Control Research and Technology Symposium. Washington,
D.C.
Bourakov, E., & Bordetsky, A. (2009). Voice-on-Target: A New Approach to Tactical Networking
and Unmanned Systems Control via the Voice Interface to the SA Environment. In Proceedings of the 14th International Command and Control Research and Technology Symposium.
Washington, D.C.
Bruni, S., & Cummings, M. (2005). Human Interaction with Mission Planning Search Algorithms.
Paper presented at Human Systems Integration Symposium (HSIS’05). Arlington, VA.
Burdick, J. W., DuToit, N., Howard, A., Looman, C., Ma, J., Murray, R. M., & Wongpiromsarn,
T. (2007). Sensing, Navigation and Reasoning Technologies for the DARPA Urban Challenge
(Technical Report). California Institute of Technology/Jet Propulsion Laboratory.
Calhoun, G. L., Draper, M. H., & Ruff, H. A. (2009). Effect of Level of Automation on Unmanned
Aerial Vehicle Routing Task. In Proceedings of the Human Factors and Ergonomics Society
Annual Meeting (Vol. 53, p. 197-201). San Antonio, TX.
Campbell, M., Garcia, E., Huttenlocher, D., Miller, I., Moran, P., Nathan, A., Schimpf, B., Zych, N.,
Catlin, J., Chelatrescu, F., Fujishima, H., Kline, F.-R., Lupashin, S., Reitmann, M., Shapiro,
A., & Wong, J. (2007). Team Cornell: Technical Review of the DARPA Urban Challenge
Vehicle (Technical Report). Cornell University.
Carley, K. (1996). Validating Computational Models (Technical Report). Pittsburgh, PA: Carnegie
Mellon University.
Carley, K., Fridsma, D., Casman, E., Yahja, A., Altman, N., Chen, L.-C., Kaminsky, B., & Nave, D.
(2003). BioWar: Scalable Agent-Based Model of Bioattacks. IEEE Transactions on Systems,
Man and Cybernetics–Part A: Systems and Humans, 36 (2), 252-265.
Carr, D., Hasegawa, H., Lemmon, D., & Plaisant, C. (1993). The Effects of Time Delays on a
Telepathology User Interface. In Proceedings of the Annual Symposium on Computer Application in Medical Care. Washington, D.C.
Ceballos, A., Gómez, J., Prieto, F., & Redarce, T. (2009). Robot Command Interface Using an
Audio-Visual Speech Recognition System. In E. Bayro-Corrochano & J.-O. Eklundh (Eds.),
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications (p. 869-876). Springer Berlin / Heidelberg.
Chen, X., Meaker, J. W., & Zhan, F. B. (2006). Agent-Based Modeling and Analysis of Hurricane
Evacuation Procedures for the Florida Keys. Natural Hazards, 38, 321-338.
Chen, Y. M., & Chang, S. H. (2008). An Agent-Based Simulation for Multi-UAVs Coordinative
Sensing. International Journal of Intelligent Computing and Cybernetics, 1 (2), 269-284.
Cicirelli, F., Furfaro, A., & Nigro, L. (2009). An Agent Infrastructure over HLA for Distributed
Simulation of Reconfigurable Systems and its Application to UAV Coordination. Simulation,
85 (1), 17-32.
Clare, A. S. (2013). Modeling Real-time Human-automation Collaborative Scheduling of Unmanned
Vehicles. Doctoral dissertation, Massachusetts Institute of Technology.
Clothier, R. A., & Walker, R. A. (2006). Determination and Evaluation of UAV Safety Objectives.
In Proceedings of the 21st International Unmanned Air Vehicle Systems Conference (p. 18.1-18.16). Bristol, United Kingdom.
Conover, W. J. (1980). Practical Nonparametric Statistics. J. Wiley., New York.
Conway, L., Volz, R. A., & Walker, M. W. (1990). Teleautonomous Systems: Projecting and Coordinating Intelligent Action at a Distance. IEEE Transactions on Robotics and Automation,
6 (2), 146-158.
Corner, J. J., & Lamont, G. B. (2004). Parallel Simulation of UAV Swarm Scenarios. In Proceedings
of the 2004 Winter Simulation Conference (p. 355-363). Washington, D.C.
Crandall, J. W., Goodrich, M. A., Olsen, R. O., & Nielsen, C. W. (2005). Validating human-robot
interaction schemes in multitasking environments. IEEE Transactions on Systems, Man, and
Cybernetics–Part A: Systems and Humans, 35 (4), 438-449.
Cummings, M. L., Clare, A. S., & Hart, C. S. (2010). The Role of Human-Automation Consensus in
Multiple Unmanned Vehicle Scheduling. Human Factors: The Journal of the Human Factors
and Ergonomics Society, 52 (1).
Cummings, M. L., & Guerlain, S. (2007). Developing Operator Capacity Estimates for Supervisory
Control of Autonomous Vehicles. Human Factors, 49 (1), 1-15.
Cummings, M. L., How, J., Whitten, A., & Toupet, O. (2012). The Impact of Human-Automation
Collaboration in Decentralized Multiple Unmanned Vehicle Control. Proceedings of the IEEE,
100 (3), 660-671.
Cummings, M. L., & Mitchell, P. J. (2008). Predicting Controller Capacity in Remote Supervision
of Multiple Unmanned Vehicles. IEEE Transactions on Systems, Man, and Cybernetics–Part
A: Systems and Humans, 38 (2), 451-460.
Cummings, M. L., & Mitchell, P. M. (2005). Managing Multiple UAVs through a Timeline Display.
Paper presented at AIAA Infotech@Aerospace 2005. Arlington, VA.
Cummings, M. L., Nehme, C. E., & Crandall, J. W. (2007). Predicting Operator Capacity for
Supervisory Control of Multiple UAVs. In J. S. Chahl, L. C. Jain, A. Mizutani, & M. Sato-Ilic
(Eds.), Innovations in Intelligent Machines (Vol. 70, p. 11-37). Springer Berlin Heidelberg.
Defense Science Board. (2012). The Role of Autonomy in DoD Systems (Technical Report No.
July). Washington, D.C.: Department of Defense.
Dekker, S. W. A. (2005). Ten Questions About Human Error: A New View of Human Factors and
System Safety. Mahwah, NJ: Lawrence Erlbaum Associates.
Dixon, S., Wickens, C. D., & Chang, D. (2005). Mission Control of Multiple Unmanned Aerial
Vehicles: A Workload Analysis. Human Factors, 47 (3), 479-487.
Dixon, S. R., & Wickens, C. D. (2003). Control of Multiple-UAVs: A Workload Analysis. Paper
presented at The 12th International Symposium on Aviation Psychology. Dayton, OH.
Dixon, S. R., & Wickens, C. D. (2004). Reliability in Automated Aids for Unmanned Aerial Vehicle
Flight Control: Evaluating a Model of Automation Dependence in High Workload (Technical
Report). Savoy, IL: Aviation Human Factors Division, Institute of Aviation, University of
Illinois at Urbana-Champaign.
Doherty, P., Granlund, G., Kuchcinski, K., Nordberg, K., Skarman, E., & Wiklund, J. (2000). The
WITAS Unmanned Aerial Vehicle Project. In Proceedings of the 14th European Conference on
Artificial Intelligence. Berlin, Germany.
Edmonds, B. (1995). What is Complexity?–The philosophy of complexity per se with application to
some examples in evolution. In F. Heylighen & D. Aerts (Eds.), The Evolution of Complexity.
Dordrecht: Kluwer.
Edmonds, B. (2001). Towards a Descriptive Models of Agent Strategy Search. Computational
Economics, 18, 113-135.
Efimba, M. E. (2003). An Exploratory Analysis of Littoral Combat Ships’ Ability to Protect Expeditionary Strike Groups. Master’s thesis, Naval Postgraduate School, Monterey, CA.
Eichler, S., Ostermaier, B., Schroth, C., & Kosch, T. (2005). Simulation of Car-to-Car Messaging:
Analyzing the Impact on Road Traffic. In Proceedings of the 13th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems
(MASCOTS’05). Atlanta, GA.
Elkins, L., Sellers, D., & Monach, W. R. (2010). The Autonomous Maritime Navigation (AMN)
Project: Field Tests, Autonomous Behaviors, Data Fusion, Sensors, and Vehicles. Journal of
Field Robotics, 27 (6), 790-818.
Epstein, J. M. (1999). Agent-Based Computational Models and Generative Social Science. Complexity, 4 (5), 41-60.
Epstein, J. M., & Axtell, R. M. (1996). Growing Artificial Societies: Social Science from the Bottom
Up. Washington, D.C.: Brookings Institution Press.
Erlenbruch, T. (2002). Agent-Based Simulation of German Peacekeeping Operations for Units up
to Platoon Level. Master’s thesis, Monterey, CA.
Federal Aviation Administration. (2014a, January 6). Fact Sheet–Unmanned Aircraft Systems
(UAS). Retrieved from http://www.faa.gov/news/fact sheets/news story.cfm?newsId=14153.
(Last accessed: April 29, 2014)
Federal Aviation Administration. (2014b). Unmanned Aircraft (UAS) General FAQs (No. July 27).
Retrieved from http://www.faa.gov/about/initiatives/uas/uas faq/#Qn1. Washington, D.C.:
McGraw-Hill Companies, Inc. (Last accessed: July 27, 2014)
Fiscus, J. G., Ajot, J., & Garofolo, J. S. (2008). The Rich Transcription 2007 Meeting Recognition
Evaluation. In R. Stiefelhagen, R. Bowers, & J. Fiscus (Eds.), Multimodal Technologies for
Perception of Humans. Berlin: Springer-Verlag.
Fisher, S. (2008). Replan Understanding for Heterogenous Unmanned Vehicle Teams. Master’s
thesis, Cambridge, MA.
Forest, L. M., Kahn, A., Thomer, J., & Shapiro, M. (2007). The Design and Evaluation of Human-Guided Algorithms for Mission Planning. Paper presented at Human Systems Integration
Symposium (HSIS’07). Annapolis, MD.
Fryman, J. (2014). Updating the Industrial Robot Safety Standard. In Proceedings of ISR/Robotik
2014; 41st International Symposium on Robotics (pp. 704–707). Munich, Germany.
Furui, S. (2010). History and Development of Speech Recognition. In F. Chen & K. Jokinen (Eds.),
Speech Technology. New York: Springer Science+Business Media.
Ganapathy, S. (2006). Human-Centered Time-Pressured Decision Making in Dynamic Complex
Systems. Doctoral dissertation, Wright State University, Dayton, OH.
Gettman, D., & Head, L. (2003). Surrogate Safety Measures From Traffic Simulation Models Final
Report (Technical Report). McLean, VA: U. S. Department of Transportation.
Gettman, D., Pu, L., Sayed, T., & Shelby, S. (2008). Surrogate Safety Assessment Model and Validation: Final Report (Technical Report). McLean, VA: U. S. Department of Transportation.
Gomez, J.-B., Ceballos, A., Prieto, F., & Redarce, T. (2009). Mouth Gesture and Voice Command
Based Robot Command Interface. Paper presented at The 2009 IEEE International Conference
on Robotics and Automation (ICRA’09). Kobe, Japan.
Gonzales, D., Moore, L., Pernin, C., Matonick, D., & Dreyer, P. (2001). Assessing the Value
of Information Superiority for Ground Forces–Proof of Concept (Technical Report). Santa
Monica, CA: RAND Corporation.
Hakola, M. B. (2004). An Exploratory Analysis of Convoy Protection using Agent-Based Simulation.
Master’s thesis, Naval Postgraduate School, Monterey, CA.
Hashimoto, T., Sheridan, T. B., & Noyes, M. V. (1986). Effects of Predictive Information in
Teleoperation with Time Delay. Japanese Journal of Ergonomics, 22 (2).
Heath, B., Hill, R., & Ciarallo, F. (2009). A Survey of Agent-Based Modeling Practices (January
1998-July 2008). Journal of Artificial Societies and Social Simulation, 12 (4).
Heinzmann, J., & Zelinsky, A. (1999). A Safe-Control Paradigm for Human-Robot Interaction.
Journal of Intelligent and Robotic Systems, 25, 295-310.
Helbing, D., Farkas, I., & Vicsek, T. (2000). Simulating Dynamical Features of Escape Panic.
Nature, 407, 487-490.
Hennigan, W. J. (2013, July). Navy drone X-47B lands on carrier deck in historic first. Retrieved
from http://articles.latimes.com/2013/jul/10/business/la-fi-mo-navy-drone-x47b-20130709.
Los Angeles, CA. (Last accessed: April 29, 2014)
Hill, J. W. (1976). Comparison of Seven Performance Measures in a Time-Delayed Manipulation
Task. IEEE Transactions on Systems, Man, and Cybernetics, SMC-6 (4), 286-295.
Ho, S.-T. T. (2006). Investigating Ground Swarm Robotics Using Agent Based Simulation. Master’s
thesis, Naval Postgraduate School, Monterey, CA.
Howe, A. E., Whitley, L. D., Barbulescu, L., & Watson, J. P. (2000). Mixed Initiative Scheduling
for the Air Force Satellite Control Network. Paper presented at The 2nd International NASA
Workshop on Planning and Scheduling for Space. San Francisco, CA.
Hu, J., Thompson, J. M., Ren, J., & Sheridan, T. B. (2000). Investigations into Performance of
Minimally Invasive Telesurgery with Feedback Time Delays. Presence, 9 (4), 369-382.
Hu, X., & Sun, Y. (2007). Agent-Based Modeling and Simulation of Wildland Fire Suppression. In
Proceedings of the 2007 Winter Simulation Conference (p. 1275-1283). Washington, D.C.
Huntsberger, T., Aghazarian, H., Howard, A., & Trotz, D. C. (2011). Stereo-vision Based Navigation
for Autonomous Surface Vessels. Journal of Field Robotics, 28 (1), 3-18.
ISO. (unpublished). ISO/PDTS 15066: Robots and robotic devices - Industrial safety requirements -
Collaborative industrial robots (Technical Report).
Jenkins, D., & Vasigh, B. (2013). The Economic Impact of Unmanned Aircraft Systems Integration
in the United States (Technical Report). Arlington, VA: Association for Unmanned Vehicle
Systems International.
Jewell, A. (1998). Sortie Generation Capacity of Embarked Airwings (Technical Report). Alexandria,
VA: Center for Naval Analyses.
Jewell, A., Wigge, M. A., Gagnon, C. M. K., Lynn, L. A., Kirk, K. M., Berg, K. K., Roberts, T. A.,
Hale, A. J., Jones, W. L., Matheny, A. M., Hall, J. O., & Measell, B. H. (1998). USS Nimitz
and Carrier Airwing Nine Surge Demonstration (Technical Report). Alexandria, VA: Center
for Naval Analyses.
Jhuang, H., Serre, T., Wolf, L., & Poggio, T. (2007). A Biologically Inspired System for Action
Recognition. Paper presented at The IEEE 11th International Conference on Computer Vision.
Rio de Janeiro, Brazil.
Johnson, D. A., Naffin, D. J., Puhalia, J. S., Sanchez, J., & Wellington, C. K. (2009). Development
and Implementation of a Team of Robotic Tractors for Autonomous Peat Moss Harvesting.
Journal of Field Robotics, 26 (6-7), 549-571.
Jolly, K. G., Kumar, R. S., & Vijaykumar, R. (2009). A Bezier Curve Based Path Planning in
a Multi-agent Robot Soccer System without Violating the Acceleration Limits. Robotics and
Autonomous Systems, 57, 23-33.
Kaber, D. B., & Endsley, M. R. (2004). The Effects of Level of Automation and Adaptive Automation on Human Performance, Situation Awareness and Workload in a Dynamic
Control Task. Theoretical Issues in Ergonomics Science, 5 (2), 113-153.
Kaber, D. B., Endsley, M. R., & Onal, E. (2000). Design of Automation for Telerobots and the Effect
on Performance, Operator Situation Awareness and Subjective Workload. Human Factors and
Ergonomics in Manufacturing, 10 (4), 409-430.
Kalmus, H., Fry, D. B., & Denes, P. (1960). Effects of Delayed Visual Control on Writing, Drawing,
and Tracing. Language and Speech, 3 (2), 96-108.
Kanagarajah, A. K., Lindsay, P., Miller, A., & Parker, D. (2008). An Exploration into the Uses
of Agent-Based Modeling to Improve Quality of Healthcare. In Unifying Themes in Complex
Systems (p. 471-478). Springer.
Kennedy, W. G., Bugajska, M. D., Marge, M., Adams, W., Fransen, B. R., Perzanowski, D., Schultz,
A. C., & Trafton, J. G. (2007). Spatial Representation and Reasoning for Human-Robot
Collaboration. In Proceedings of the Twenty-Second Conference on Artificial Intelligence (p.
1554-1559). Vancouver, Canada.
Kramer, L. A., & Smith, S. F. (2002). Optimizing for Change: Mixed-Initiative Resource Allocation
with the AMC Barrel Allocator. Paper presented at The 3rd International NASA Workshop
on Planning and Scheduling for Space. Houston, TX.
Kulic, D., & Croft, E. A. (2005). Safe Planning for Human-Robot Interaction. Journal of Robotic
Systems, 22 (7), 383-396.
Kulic, D., & Croft, E. A. (2006). Real-time Safety for Human-Robot Interaction. Robotics and
Autonomous Systems, 54, 1-12.
Kumar, S., & Mitra, S. (2006). Self-Organizing Traffic at a Malfunctioning Intersection. Journal of
Artificial Societies and Social Simulation, 9 (4).
Kurniawati, E., Celetto, L., Capovilla, N., & George, S. (2012). Personalized Voice Command
Systems in Multi Modal User Interface. Paper presented at The 2012 IEEE International
Conference on Emerging Signal Processing Applications. Las Vegas, NV.
Laird, J. E. (2008). Extending the Soar Cognitive Architecture (Technical Report). Division of
Computer Science and Engineering, University of Michigan.
Lane, J. C., Carignan, C. R., Sullivan, B. R., Akin, D. L., Hunt, T., & Cohen, R. (2002). Effects of
Time Delay on Telerobotic Control of Neutral Buoyancy Vehicles. In Proceedings of the 2002
IEEE International Conference on Robotics & Automation (p. 2874-2879). Washington, D.C.
LaPorte, T. R., & Consolini, P. M. (1991). Working in Practice but Not in Theory: Theoretical
Challenges of “High-Reliability Organizations”. Journal of Public Administration Research
and Theory, 1 (1), 19-47.
Laptev, I., Marszalek, M., Schmid, C., & Rozenfeld, B. (2008). Learning Realistic Human Actions
from Movies. In IEEE Conference on Computer Vision and Pattern Recognition (p. 1-8).
Anchorage, AK.
Larson, J., Bruch, M., Halterman, R., Rogers, J., & Webster, R. (2007). Advances in Autonomous
Obstacle Avoidance for Unmanned Surface Vehicles. Paper presented at AUVSI Unmanned
Systems North America. Washington, D.C.
Laskowski, M., McLeod, R. D., Friesen, M. R., Podaima, B. W., & Alfa, A. S. (2009). Models of
Emergency Departments for Reducing Patient Waiting Times. PLoS ONE, 4 (7).
Laskowski, M., & Mukhi, S. (2008). Agent-Based Simulation of Emergency Departments with
Patient Diversion. In D. Weerasinghe (Ed.), ehealth 2008 (p. 25-37). Berlin: Springer.
Lauren, M. K. (2002). Firepower Concentration in Cellular Automaton Combat Models–An Alternative to Lanchester. The Journal of the Operational Research Society, 53 (6), 672-679.
Law, A. M. (2007). Simulation Modeling and Analysis (4th ed.). Boston: McGraw-Hill.
Lee, S. M., Pritchett, A. R., & Corker, K. (2007). Evaluating Transformations of the Air
Transportation System through Agent-Based Modeling and Simulation. Paper presented at
FAA/Eurocontrol Seminar on Air Traffic Management. Barcelona, Spain.
Leveson, N. G. (2004). Model-Based Analysis of Socio-Technical Risk (Technical Report). Cambridge, MA.
Leveson, N. G. (2011). Engineering a Safer World: Systems Thinking Applied to Safety. Cambridge,
MA: Massachusetts Institute of Technology.
Liang, L. A. H. (2005). The Use of Agent Based Simulation for Cooperative Sensing of the Battlefield.
Master’s thesis, Naval Postgraduate School, Monterey, CA.
Lin, Z., Jiang, Z., & Davis, L. S. (2009). Recognizing Actions by Shape-Motion Prototypes. Paper
presented at The 12th International Conference on Computer Vision. Kyoto, Japan.
Lockergnome. (2009, June). USS Nimitz Navy Aircraft Carrier Operations. Retrieved from http:
//www.youtube.com/watch?v=_2KYU97y2L8. (Last accessed: April 29, 2014)
Lum, M. J. H., Rosen, J., King, H., Friedman, D. C. W., Lendvay, T. S., Wright, A. S., Sinanan,
M. N., & Hannaford, B. (2009). Teleoperation in Surgical Robotics–Network Latency Effects
on Surgical Performance. Paper presented at The 31st Annual International Conference of the
IEEE EMBS. Minneapolis, MN.
Lundell, M., Tang, J., Hogan, T., & Nygard, K. (2006). An Agent-Based Heterogeneous UAV
Simulator Design. In Proceedings of the 5th WSEAS International Conference on Artificial
Intelligence, Knowledge Engineering, and Data Bases (p. 453-457). Madrid, Spain.
MacKenzie, I. S., & Ware, C. (1993). Lag as a Determinant of Human Performance in Interactive
Systems. In Proceedings of the INTERACT ’93 and CHI ’93 Conference on Human Factors
in Computing Systems (p. 488-493). Amsterdam, Netherlands: ACM.
Macy, M. W., & Willer, R. (2002). From Factors to Actors: Computational Sociology and Agent-Based Modeling. Annual Review of Sociology, 28, 143-166.
Maere, P. C., Clare, A. S., & Cummings, M. L. (2010). Assessing Operator Strategies for Adjusting Replan Alerts in Controlling Multiple Unmanned Vehicles. Paper presented at The 11th
IFAC/IFIP/IFORS/IEA Symposium on Analysis Design and Evaluation of Human Machine
Systems. Valenciennes, France.
Malasky, J., Forest, L. M., Khan, A. C., & Key, J. R. (2005). Experimental Evaluation of Human-Machine Collaborative Algorithms in Planning for Multiple UAVs. In The 2005 IEEE International Conference on Systems, Man and Cybernetics (Vol. 3, p. 2469-2475). Waikoloa,
HI.
Mancini, A., Cesetti, A., Iuale, A., Frontoni, E., Zingaretti, P., & Longhi, S. (2008). A Framework for
Simulation and Testing of UAVs in Cooperative Scenarios. Journal of Intelligent and Robotic
Systems, 54 (1-3), 307-329.
Mar, L. E. (1985). Human Control Performance in Operation of a Time-Delayed Master-Slave
Manipulator. Bachelor’s thesis, Massachusetts Institute of Technology, Cambridge, MA.
Markoff, J. (2010). Google Cars Drive Themselves, in Traffic. Retrieved from http://www.
nytimes.com/2010/10/10/science/10google.html?pagewanted=all&_r=0. Mountain View, CA:
The New York Times. (Last accessed: April 29, 2014)
Marvel, J., & Bostelman, R. (2013). Towards Mobile Manipulator Safety Standards. In Proceedings
of the 2013 IEEE International Symposium on Robotic and Sensors Environments (ROSE)
(pp. 31–36). Washington, D.C.: IEEE.
Matsusaka, Y., Fujii, H., Okano, T., & Hara, I. (2009). Health Exercise Demonstration Robot
TAIZO and Effects of Using Voice Command in Robot-Human Collaborative Demonstration.
Paper presented at The 18th IEEE International Symposium on Robot and Human Interactive
Communication. Toyama, Japan.
McKinney, B. (2014). Unmanned Combat Air System Carrier Demonstration (UCAS-D) (Technical
Report). Los Angeles, CA: Northrop Grumman Corporation. (Last accessed: April 29, 2014)
McLean, G. F., & Prescott, B. (1991). Teleoperator Task Performance Evaluation Using 3-D Video.
Paper presented at The IEEE Pacific Rim Conference on Communications, Computers, and
Signal Processing. Victoria, BC, Canada.
McLean, G. F., Prescott, B., & Podhorodeski, R. (1994). Teleoperated System Performance Evaluation. IEEE Transactions on Systems, Man, and Cybernetics, 24 (5), 796-804.
McMindes, K. L. (2005). Unmanned Aerial Vehicle Survivability: The Impacts of Speed, Detectability, Altitude, and Enemy Capabilities. Master’s thesis, Naval Postgraduate School, Monterey,
CA.
Mekdeci, B., & Cummings, M. L. (2009). Modeling Multiple Human Operators in the Supervisory
Control of Heterogeneous Unmanned Vehicles. Paper presented at The 9th Conference on
Performance Metrics for Intelligent Systems (PerMIS’09). Gaithersburg, MD.
Mettler, L., Ibrahim, M., & Jonat, W. (1998). One Year of Experience Working with the Aid of a
Robotic Assistant (the Voice-controlled Optic Holder AESOP) in Gynaecological Endoscopic
Surgery. Human Reproduction, 13 (10), 2748-2750.
Milton, R. M. (2004). Using Agent-Based Modeling to Examine the Logistical Chain of the Seabase.
Master’s thesis, Naval Postgraduate School, Monterey, CA.
Mkrtchyan, A. A. (2011). Modeling Operator Performance in Low Task Load Supervisory Domains.
Master’s thesis, Massachusetts Institute of Technology, Cambridge, MA.
Munoz, M. F. (2011). Agent-Based Simulation and Analysis of a Defensive UAV Swarm against an
Enemy UAV Swarm. Master’s thesis, Naval Postgraduate School, Monterey, CA.
Naval Studies Board. (2005). Autonomous Vehicles in Support of Naval Operations (Technical
Report). Washington, D.C.: Department of Defense.
Nehme, C. (2009). Modeling Human Supervisory Control in Heterogeneous Unmanned Vehicle
Systems. Doctoral dissertation, Massachusetts Institute of Technology, Cambridge, MA.
Nikolaidis, S., & Shah, J. (2012). Human-Robot Interactive Planning using Cross-Training – A
Human Team Training Approach. Paper presented at AIAA Infotech@Aerospace 2012. Garden
Grove, CA.
Office of the Secretary of Defense. (2013). Unmanned Systems Roadmap FY2013-2038 (Technical
Report). Washington, D.C.: Office of the Secretary of Defense.
Olsen, D. R., & Wood, S. B. (2004). Fan-out: Measuring Human Control of Multiple Robots. In
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. Vienna,
Austria.
Oving, A. B., & Erp, J. B. F. van. (2001). Driving with a Head-Slaved Camera System. In
Proceedings of the Human Factors and Ergonomics Society Annual Meeting (Vol. 45, p. 1372-1376). Minneapolis, MN.
Pan, X., Han, C. S., Dauber, K., & Law, K. H. (2007). A Multi-agent Based Framework for the
Simulation of Human and Social Behaviors during Emergency Evacuations. AI and Society,
22 (2), 113-132.
Parasuraman, R., Galster, S., Squire, P., Furukawa, H., & Miller, C. (2005). A Flexible Delegation-Type Interface Enhances System Performance in Human Supervision of Multiple Robots: Empirical Studies with RoboFlag. IEEE Transactions on Systems, Man, and Cybernetics, 35 (4),
481-493.
Parasuraman, R., Galster, S. M., & Miller, C. A. (2003). Human Control of Multiple Robots in the
RoboFlag Simulation Environment. Paper presented at The IEEE International Conference
on Systems, Man and Cybernetics. Washington, D.C.
Park, K., Lee, S. J., Jung, H.-Y., & Lee, Y. (2009). Human-Robot Interface Using Robust Speech
Recognition and User Localization Based on Noise Separation Device. Paper presented at
The 18th IEEE International Symposium on Robot and Human Interactive Communication.
Toyama, Japan.
Paruchuri, P., Pullalarevu, A. R., & Karlapalem, K. (2002). Multi Agent Simulation of Unorganized
Traffic. In Proceedings of Autonomous Agents and Multi-Agent Systems (AAMAS’02) (p.
176-183). Bologna, Italy.
Patvivatsiri, L. (2006). A Simulation Model for Bioterrorism Preparedness in an Emergency Room.
In Proceedings of the 2006 Winter Simulation Conference (p. 501-508). Monterey, CA.
Peeta, S., Zhang, P., & Zhou, W. (2005). Behavior-based Analysis of Freeway Car-Truck Interactions
and Related Mitigations Strategies. Transportation Research Part B, 39, 417-451.
Perrow, C. (1984). Normal accidents: Living with high-risk technologies. Princeton, NJ: Princeton
University Press.
Perzanowski, D., Schultz, A., Adams, W., Marsh, E., Gerhart, G. R., Gunderson, R. W., & Shoemaker, C. M. (2000). Using a Natural Language and Gesture Interface for Unmanned Vehicles.
In Proceedings of the Society of Photo-Optical Instrumentation Engineers (Vol. 4024).
Pina, P. E., Cummings, M. L., Crandall, J. W., & Penna, M. D. (2008). Identifying Generalizable
Metric Classes to Evaluate Human-Robot Teams. Paper presented at Metrics for Human-Robot
Interaction Workshop at the 3rd Annual Conference on Human-Robot Interaction. Amsterdam,
The Netherlands.
Pina, P. E., Donmez, B., & Cummings, M. L. (2008). Selecting Metrics to Evaluate Human
Supervisory Control Applications (Technical Report). Cambridge, MA: MIT Humans and
Automation Laboratory.
Pritchett, A. R., Lee, S., Huang, D., Goldsman, D., Joines, J. A., Barton, R. R., Kang, K., & Fishwick, P. A. (2000). Hybrid-System Simulation for National Airspace System Safety Analysis.
In Proceedings of the 2000 Winter Simulation Conference (p. 1132-1142). Orlando, FL.
Rasmussen, S. J., & Chandler, P. R. (2002). MultiUAV: A Multiple UAV Simulation for Investigation
of Cooperative Control. In Proceedings of the 2002 Winter Simulation Conference (p. 869-877).
San Diego, CA.
Reason, J. (1990). Human Error. Cambridge, UK: Cambridge University Press.
Reason, J. T., Parker, D., & Lawton, R. (1998). Organisational Controls and Safety: The varieties
of rule-related behaviour. Journal of Occupational and Organisational Psychology, 71, 289-304.
Rio Tinto. (2014). Mine of the Future. Retrieved from http://www.riotinto.com/ironore/mine-of-the-future-9603.aspx.
Roberts, K. H. (1990). Some Characteristics of One Type of High Reliability Organization. Organization Science, 1 (2), 160-176.
Rochlin, G. I., La Porte, T. R., & Roberts, K. H. (1987). The Self-Designing High-Reliability
Organization: Aircraft Carrier Flight Operations at Sea. Naval War College Review (42),
76-90.
Roth, E., Hanson, M. L., Hopkins, C., Mancuso, V., & Zacharias, G. L. (2004). Human in the Loop
Evaluations of a Mixed-initiative System for Planning and Control of Multiple UAV Teams. In
Proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting (p. 280-284).
New Orleans, LA.
Rovira, E., McGarry, K., & Parasuraman, R. (2007). Effects of Imperfect Automation on Decision
Making in a Simulated Command and Control Task. Human Factors, 49 (1), 76-87.
Roy, N., Baltus, G., Fox, D., Gemperle, F., Goetz, J., Hirsch, T., Margaritis, D., Montemerlo, M.,
Pineau, J., Schulte, J., & Thrun, S. (2000). Towards personal service robots for the elderly.
Paper presented at The Workshop on Interactive Robots and Entertainment (WIRE 2000)
(Vol. 25, p. 184).
Rucker, J. E. (2006). Using Agent-Based Modeling to Search for Elusive Hiding Targets. Master’s
thesis, Air Force Institute of Technology, Wright-Patterson Air Force Base, OH.
Ruff, H. A., Calhoun, G. L., Draper, M. H., Fontejon, J. V., & Guilfoos, B. J. (2004). Exploring
Automation Issues in Supervisory Control of Multiple UAVs. In Proceedings of the 2nd Human
Performance, Situation Awareness, and Automation Conference (HPSAA II). Daytona Beach,
FL.
Ruff, H. A., Narayanan, S., & Draper, M. H. (2002). Human Interaction with Levels of Automation
and Decision-Aid Fidelity in the Supervisory Control of Multiple Simulated Unmanned Air
Vehicles. Presence, 11 (4), 335-351.
Ryan, J. C. (2011). Assessing the Performance of Human-Automation Collaborative Planning
Systems. Master’s thesis, Massachusetts Institute of Technology, Cambridge, MA.
Ryan, J. C., Banerjee, A. G., Cummings, M. L., & Roy, N. (2014). Comparing the Performance of
Expert User Heuristics and an Integer Linear Program in Aircraft Carrier Deck Operations.
IEEE Transactions on Cybernetics, 44 (6), 761–773.
Ryan, J. C., Cummings, M. L., Roy, N., Banerjee, A. G., & Schulte, A. S. (2011). Designing an
Interactive Local and Global Decision Support System for Aircraft Carrier Deck Scheduling.
Paper presented at AIAA Infotech@Aerospace 2011. St. Louis, MO, USA.
Sargent, R. G. (2009). Verification and Validation of Simulation Models. In Proceedings of the 2009
Winter Simulation Conference (p. 162-176). Orlando, FL.
Sato, A. (2003). The RMAX Helicopter UAV (Technical Report). Shizuoka, Japan: Yamaha Motor
Co., LTD.
Scanlon, J. (2009). How KIVA Robots Help Zappos and Walgreens (No. April 10). Retrieved
from http://www.businessweek.com/innovate/content/apr2009/id20090415_876420.htm. New
York, NY: McGraw-Hill Companies, Inc. (Last accessed: April 29, 2014)
Schindler, K., & Gool, L. van. (2008). Action Snippets: How many frames does human action
recognition require? Paper presented at The IEEE Conference on Computer Vision and
Pattern Recognition. Anchorage, AK.
Schouwenaars, T., How, J., & Feron, E. (2004). Decentralized Cooperative Trajectory Planning
of Multiple Aircraft with Hard Safety Guarantees. Paper presented at The AIAA Guidance
Navigation and Control Conference and Exhibit. Providence, RI.
Schuldt, C., Laptev, I., & Caputo, B. (2004). Recognizing Human Actions: A Local SVM Approach.
In Proceedings of the International Conference on Pattern Recognition. Cambridge, UK.
Schultz, M., Schulz, C., & Fricke, H. (2008). Passenger Dynamics at Airport Terminal Environment.
In W. W. F. Klingsch, C. Rogsch, A. Schadschneider, & M. Schreckenberg (Eds.), Pedestrian
and Evacuation Dynamics 2008. Berlin: Springer-Verlag.
Schumann, B., Scanlan, J., & Takeda, K. (2011). Evaluating Design Decisions in Real-Time Using
Operations Modelling. Paper presented at The Air Transport and Operations Symposium.
Delft, The Netherlands.
Scott, S. D., Lesh, N., & Klau, G. W. (2002). Investigating Human-Computer Optimization. Paper
presented at CHI 2002. Minneapolis, MN.
Scovanner, P., Ali, S., & Shah, M. (2007). A 3-Dimensional SIFT Descriptor and its Application to
Action Recognition. Paper presented at ACM Multimedia. Augsburg, Bavaria, Germany.
Scribner, D. R., & Gombash, J. W. (1998). The Effect of Stereoscopic and Wide Field of View
Conditions on Teleoperator Performance (Technical Report No. ARL-TR-1598). Aberdeen
Proving Ground, MD.
Seetharaman, G., Lakhotia, A., & Blasch, E. P. (2006). Unmanned Vehicles Come of Age: The
DARPA Grand Challenge. Computer, 39 (12), 26-29.
Sha, F., & Saul, L. K. (2007). Large Margin Hidden Markov Models for Automatic Speech Recognition. In B. Scholkopf, J. C. Platt, & T. Hofmann (Eds.), Advances in Neural Information
Processing Systems 19. Cambridge, MA: MIT Press.
Sharma, S., Singh, H., & Prakash, A. (2008). Multi-Agent Modeling and Simulation of Human
Behavior in Aircraft Evacuations. IEEE Transactions on Aerospace and Electronic Systems,
44 (4), 1477-1488.
Sheik-Nainar, M. A., Kaber, D. B., & Chow, M.-Y. (2005). Control Gain Adaptation in Virtual
Reality Mediated Human-Telerobot Interaction. Human Factors and Ergonomics in Manufacturing, 15 (3), 259-274.
Sheridan, T. B. (1992). Telerobotics, Automation and Human Supervisory Control. Cambridge,
MA: The MIT Press.
Sheridan, T. B., & Ferrell, W. R. (1963). Remote Manipulative Control with Transmission Delay.
IEEE Transactions on Human Factors in Electronics, 4 (1), 25-29.
Sheridan, T. B., & Verplank, W. (1978). Human and Computer Control of Undersea Teleoperators
(Technical Report). Cambridge, MA: MIT Man-Machine Systems Laboratory.
Simpson, R. C., & Levine, S. P. (2002). Voice Control of a Powered Wheelchair. IEEE Transactions
on Neural Systems and Rehabilitation Engineering, 10 (2), 122-125.
Sislak, D., Volf, P., Komenda, A., Samek, J., & Pechoucek, M. (2007). Agent-Based Multi-Layer
Collision Avoidance to Unmanned Aerial Vehicles. In J. Lawton, J. Patel, & A. Tate (Eds.),
Knowledge Systems for Coalition Operation (p. 1-6). Waltham, MA.
Smyth, C. C., Gombash, J. W., & Burcham, P. M. (2001). Indirect Vision Driving with Fixed
Flat Panel Displays for Near-unity, Wide, and Extended Fields of Camera View (Technical
Report). Aberdeen Proving Ground, MD.
Song, Y., Demirdjian, D., & Davis, R. (2012). Continuous Body and Hand Gesture Recognition for
Natural Human-Computer Interaction. ACM Transactions on Interactive Intelligent Systems,
2 (1).
Spain, E. H., & Hughes, T. W. (1991). Objective Assessments of Mobility with an Early Unmanned
Ground Vehicle (UGV) Prototype Viewing System (Technical Report). San Diego, CA: Naval
Ocean Systems Center.
Starr, G. P. (1979). A Comparison of Control Modes for Time-Delayed Remote Manipulation. IEEE
Transactions on Systems, Man, and Cybernetics, SMC-9 (4), 241-246.
Steele, M. J. (2004). Agent-based Simulation of Unmanned Surface Vehicles. Master’s thesis, Naval
Postgraduate School, Monterey, CA.
Sterman, J. D. (2000). Business Dynamics: Systems Thinking and Modeling for a Complex World.
New York: Irwin/McGraw-Hill.
Stroeve, S. H., Bakker, G. J., & Blom, H. A. P. (2007). Safety Risk Analysis of Runway Incursion
Alert Systems in the Tower and Cockpit by Multi-Agent Systemic Accident Modelling. Paper
presented at The 7th USA/Europe ATM R&D Seminar. Barcelona, Spain.
Sugeno, M., Hirano, I., Nakamura, S., & Kotsu, S. (1995). Development of an Intelligent Unmanned
Helicopter. In Proceedings of 1995 IEEE International Conference on Fuzzy Systems, 1995
International Joint Conference of the Fourth IEEE International Conference on Fuzzy Systems
and The Second International Fuzzy Engineering Symposium (Vol. 5, p. 33-34). Yokohama,
Japan.
Sycara, K., & Lewis, M. (2008). Agent-Based Approaches to Dynamic Team Simulation (Technical Report No. NPRST-TN-08-9). Millington, TN: Navy Personnel Research, Studies, and
Technology Division, Bureau of Naval Personnel.
Teller, S., Correa, A., Davis, R., Fletcher, L., Frazzolia, E., Glass, J., How, J. P., Jeon, J., Karaman,
S., Luders, B., Roy, N., Sainath, T., & Walter, M. R. (2010). A Voice-Commanded Robotic
Forklift Working Alongside Humans in Minimally-Prepared Outdoor Environments. Paper
presented at The IEEE International Conference on Robotics and Automation. Anchorage,
AK.
Teo, R., Jang, J. S., & Tomlin, C. J. (2004). Automated Multiple UAV Flight–The Stanford
DragonFly UAV Program. Paper presented at The 43rd IEEE Conference on Decision and
Control. Atlantis, Paradise Island, Bahamas.
Tesfatsion, L. (2003). Agent-Based Computational Economics: Modeling Economies as Complex
Adaptive Systems. Information Sciences, 149, 263-269.
The DARPA Urban Challenge: Autonomous Vehicles in City Traffic. (2009). In M. Buehler,
K. Iagnemma, & S. Singh (Eds.), Springer Tracts in Advanced Robotics. Berlin Heidelberg:
Springer-Verlag.
Thrun, S. (2010). What we’re driving at. Retrieved from http://googleblog.blogspot.com/2010/10/
what-were-driving-at.html. (Last accessed: April 29, 2014)
Tonietti, G., Schiavi, R., & Bicchi, A. (2005). Design and Control of a Variable Stiffness Actuator for Safe and Fast Physical Human/Robot Interaction. In Proceedings of the 2005 IEEE
International Conference on Robotics and Automation. Barcelona, Spain.
Topf, A. (2011). Rio Tinto boosts driverless truck fleet for use in Pilbara (No. March 26). Retrieved
from http://www.mining.com/2011/11/02/rio-tinto-boosts-driverless-truck-fleet-for-use-in-pilbara/. Vancouver, BC, Canada: Mining.com. (Last accessed: April 29, 2014)
Trouvain, B., Schlick, C., & Mevert, M. (2003). Comparison of a Map- vs. Camera-based User Interface in a Multi-robot Navigation Task. In Proceedings of the IEEE International Conference
on Systems, Man, and Cybernetics (p. 3224-3231). Washington, D.C.
Vaughan, R. (2008). Massively Multi-Robot Simulation in Stage. Swarm Intelligence, 2,
189-208.
Vries, S. C. de. (2005). UAVs and Control Delays (Technical Report). Soesterberg, The Netherlands:
TNO Defence, Security, and Safety.
Wachter, M. D., Matton, M., Demuynck, K., Wambacq, P., Cools, R., & Compernolle, D. V. (2007).
Template Based Continuous Speech Recognition. IEEE Transactions on Audio, Speech, and
Language Processing, 15 (4), 1377-1390.
Wahle, J., & Schreckenberg, M. (2001). A Multi-Agent System for On-Line Simulations Based on
Real-World Traffic Data. In Proceedings of the 34th Annual Hawaii International Conference
on System Sciences. Grand Wailea, Maui, HI.
Weibel, R. E., & Hansman Jr., R. J. (2004). Safety Considerations for Operation of Different Classes
of UAVs in the NAS. Paper presented at The AIAA’s 4th Aviation Technology, Integration,
and Operations Forum. Chicago, IL.
Weick, K. E., & Roberts, K. H. (1993). Collective Mind in Organizations: Heedful Interrelating on
Flight Decks. Administrative Science Quarterly, 38 (3), 357-381.
Wickens, C. D., & Hollands, J. G. (2000). Engineering Psychology and Human Performance (3rd
ed.). Upper Saddle River, N.J.: Prentice Hall.
Wright, M. C. (2002). The Effects of Automation on Team Performance and Team Coordination.
Doctoral dissertation, North Carolina State University, Raleigh.
Xie, Y., Shortle, J., & Donahue, G. (2004). Airport Terminal-Approach Safety and Capacity Analysis
Using an Agent-Based Model. In Proceedings of the 2004 Winter Simulation Conference (p.
1349-1357). Washington, D.C.
Yoon, Y., Choe, T., Park, Y., & Kim, H. J. (2007). Safe Steering of UGVs in Polygonal Environments. Paper presented at The International Conference on Control, Automation and Systems.
Seoul, South Korea.
Yuan, J., Liu, Z., & Wu, Y. (2009). Discriminative Subvolume Search for Efficient Action Detection. Paper presented at The IEEE Conference on Computer Vision and Pattern Recognition.
Miami, FL.
Yuhara, N., & Tajima, J. (2006). Multi-Driver Agent-Based Traffic Simulation Systems for Evaluating the Effects of Advanced Driver Assistance Systems on Road Traffic Accidents. Cognition,
Technology, and Work, 8, 283-300.
Zarboutis, N., & Marmaras, N. (2004). Searching Efficient Plans for Emergency Rescue through
Simulation: the case of a metro fire. Cognition, Technology, and Work, 6, 117-126.
Appendices
Appendix A
Safety Principles from Carrier and
Mining Domains
This section contains tables of potential accidents in the aircraft carrier and mining domains, hazards
that could cause those accidents, the safety principles that help to prevent those hazards, and the
classes of safety principles to which they correspond. Tables A.1 through A.3 cover the aircraft
carrier domain and Tables A.4 through A.6 cover the mining domain.
Table A.1: List of aircraft carrier safety principles, part 1 of 3.
Table A.2: List of aircraft carrier safety principles, part 2 of 3.
Table A.3: List of aircraft carrier safety principles, part 3 of 3.
Table A.4: List of mining safety principles, part 1 of 3.
Table A.5: List of mining safety principles, part 2 of 3.
Table A.6: List of mining safety principles, part 3 of 3.

Each table lists, in four columns, an Accident, the Hazard that could lead to it, the Protocol that
mitigates that hazard, and the Class (or classes) of safety principle to which the protocol belongs.
Accidents in the aircraft carrier domain include collisions between aircraft, UAVs, deck vehicles,
equipment, and personnel; catapult strikes; explosions and fires; personnel falling overboard; and
aircraft losses due to insufficient fuel. Accidents in the mining domain include collisions among
large haul vehicles (manned and automated), light vehicles (LV), equipment, and personnel. The
hazards generally take one of a few forms: an agent is too close to another agent and cannot stop
in time; an agent does not notice or recognize another agent in the area; an agent enters an area
during an unsafe time period; or an agent moves unexpectedly or without recognizing the proximity
of others.

The protocols that address these hazards include: maintain proper spatial clearance; maintain
awareness of surroundings; stay below a safe driving speed; ensure the area is safe before beginning
actions; wait for permission before entering an area or using equipment; do not cross boundary
lines (deck edge, catapults, landing zone); minimize the presence of extraneous personnel; segregate
automated vehicles from manned vehicles; vehicles remain stationary unless directed by an Aircraft
Director (AD); only one AD controls an aircraft at any time; tie down aircraft that are not active;
apply the minimum power needed for movement or orientation; maintain sufficient fuel levels; do
not load fuel and weapons at the same time; defuel and disarm aircraft immediately after landing;
ensure safe transit through intersections; do not park vehicles next to larger vehicles; and give
priority to large vehicles. Each protocol corresponds to one or more of four classes of safety principle:
Restricted Areas, Collision Prevention, Task Scheduling, and Maintain Situation Awareness (SA).
Appendix B
MASCS Agent Model Descriptions
B.1 Task Models and Rules
Multi-Agent Safety and Control Simulation (MASCS) task models operate through the use of two
main functions. The first, the execute function, exists within all tasks and advances all appropriate
states by the time elapsed since the last update. An example of this appears in MoveForward,
which provides pseudocode for the "moveForward" action that replicates straight-line motion on the
flight deck. The task requires as input the current Agent to which the action is assigned, that agent's
initial position, desired final position, and current speed. From this, a series of variables are initialized
in lines 2-7; the most important of these are the total distance of travel totalDistance and the total
time of motion totalTime. The action tracks the elapsed execution time currentTime and at each
update compares its value to totalTime; task execution stops when currentTime ≥ totalTime.
The script first tests whether or not the task should be updated (line 12): for motion tasks, if the
Director is not visible to the pilot or if a collision is imminent, the task should not proceed. Otherwise,
the task proceeds: currentTime is incremented by elapsedTime and the vehicle position is updated
according to the elapsed time, speed, and angle of travel. This
then repeats until the appropriate time has elapsed.
MoveForward(Aircraft, init.x, init.y, final.x, final.y, speed, elapsedTime)
 1  // Calculate geometry and distance of motion
 2  currentTime = 0
 3  angle = arctan((final.y - init.y) / (final.x - init.x))
 4  totalDistance = sqrt((final.x - init.x)^2 + (final.y - init.y)^2)
 5  // Time equals distance divided by speed
 6  totalTime = totalDistance / speed
 7  // Total time of movement equals distance divided by speed
 8  while currentTime <= totalTime
 9      // Continue to operate until we reach totalTime (the total time of movement)
10      // If the Director is visible and the aircraft is not in danger of collision,
11      // update currentTime and the aircraft location
12      if (Aircraft.Director.visible == true) && (Aircraft.collision != true)
13          currentTime += elapsedTime
14          Aircraft.x = init.x + currentTime * speed * cos(angle)
15          Aircraft.y = init.y + currentTime * speed * sin(angle)
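For readers who prefer an executable form, the following Python sketch mirrors the MoveForward
logic as a per-update function that a simulation loop would call once per time step. It is illustrative
only; the Aircraft attributes (current_time, director_visible, collision, x, y) are assumed stand-ins
for the corresponding MASCS agent states rather than the actual MASCS implementation.

import math

def move_forward(aircraft, init_x, init_y, final_x, final_y, speed, elapsed_time):
    # Geometry of the motion; in MASCS these would be computed once when the task starts
    angle = math.atan2(final_y - init_y, final_x - init_x)
    total_distance = math.hypot(final_x - init_x, final_y - init_y)
    total_time = total_distance / speed

    # Advance the task only if the Director is visible and no collision is imminent
    if aircraft.director_visible and not aircraft.collision:
        aircraft.current_time += elapsed_time          # assumed initialized to 0 at task start
        aircraft.x = init_x + aircraft.current_time * speed * math.cos(angle)
        aircraft.y = init_y + aircraft.current_time * speed * math.sin(angle)

    # The task is complete once the full travel time has elapsed
    return aircraft.current_time >= total_time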
In the above example, checking the visibility of the Director and whether a collision is imminent
are both logical rule sets. Pseudocode for collision prevention appears in CollisionPrevention,
with pseudocode for Director visibility appearing in DirectorVisibility. Aircraft models include a
check for Director visibility while taxiing. The routine requires knowledge of the current aircraft,
its current position and orientation, and its available field of view. It also requires a maximum
distance threshold, describing the general area in which the Director should be (see Figure 3-5 in
Chapter 3). The routine then checks both the current angle between the Aircraft and the Director
and the distance between them; if either is not at an acceptable value, another routine not shown
here calculates the shortest straight-line path that moves the Aircraft Director to within the Aircraft's field
of view at the nearest aircraft wingtip.
DirectorVisibility(Aircraft, Director, distanceThreshold)
 1  // Calculate geometric relationships
 2  angle = arctan((Aircraft.y - Director.y) / (Aircraft.x - Director.x))
 3  diffAngle = Aircraft.angle - angle
 4  distance = sqrt((Aircraft.x - Director.x)^2 + (Aircraft.y - Director.y)^2)
 5  if (distance > distanceThreshold) || (|diffAngle| > Aircraft.fieldOfView)
 6      Run Aircraft.findAlignmentPoint
 7      // Determines the closest point at which distance <= distanceThreshold
        // and |diffAngle| <= Aircraft.fieldOfView
 8  else Continue
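A minimal Python sketch of the same visibility check is shown below; the heading, field_of_view,
and find_alignment_point names are assumptions for the example, and the repositioning routine
itself is not reproduced here.

import math

def director_visible(aircraft, director, distance_threshold):
    # Angle from the aircraft to the Director, compared against the aircraft heading
    angle_to_director = math.degrees(math.atan2(aircraft.y - director.y,
                                                aircraft.x - director.x))
    diff_angle = abs(aircraft.heading - angle_to_director)
    distance = math.hypot(aircraft.x - director.x, aircraft.y - director.y)

    if distance > distance_threshold or diff_angle > aircraft.field_of_view:
        # Director must move to a point satisfying both the range and field-of-view limits
        director.find_alignment_point(aircraft)
        return False
    return True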
While in motion, Aircraft are also continually applying a set of rules that act to prevent them
from colliding with other aircraft. Pseudocode for this appears in CollisionPrevention. The
routine first requires knowledge of the current aircraft in motion, its position and heading, as well as
a list of all aircraft currently on the flight deck and a distance threshold d. The routine then checks
the distance between the current aircraft in motion (Aircraft) and all other vehicles active on the
deck. If another aircraft is both within the distance d and in front of Aircraft (the angle between
them is 90 degrees or less), the current aircraft in motion must stop. Aircraft will remain stopped until the
other aircraft moves out of the way. As mentioned above, this routine is always in operation while
the aircraft is in motion.
CollisionPrevention(Aircraft, AircraftList, d)
 1  for i = 0 to AircraftList.size()
 2      other = AircraftList.get(i)
 3      if other == Aircraft
 4          continue
 5      angle = arctan((other.y - Aircraft.y) / (other.x - Aircraft.x))
 6      diffAngle = Aircraft.angle - angle
 7      distance = sqrt((other.x - Aircraft.x)^2 + (other.y - Aircraft.y)^2)
 8      if |diffAngle| <= 90 && distance < d
 9          // other is in front of us and in danger of collision
10          return true
11  // No aircraft is both in front of us and within the distance threshold
12  return false
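The same check can be written as a short Python function. This sketch assumes each aircraft
exposes x, y, and heading (in degrees) attributes, and it adds an explicit heading-difference
normalization that the pseudocode leaves implicit.

import math

def collision_imminent(aircraft, all_aircraft, stop_distance):
    for other in all_aircraft:
        if other is aircraft:
            continue
        bearing = math.degrees(math.atan2(other.y - aircraft.y, other.x - aircraft.x))
        # Normalize the heading difference into [-180, 180] degrees
        diff = (aircraft.heading - bearing + 180.0) % 360.0 - 180.0
        distance = math.hypot(other.x - aircraft.x, other.y - aircraft.y)
        if abs(diff) <= 90.0 and distance < stop_distance:
            # Another aircraft is in front of us and within the threshold: stop taxiing
            return True
    return False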
Time-based actions address non-motion tasks executed by aircraft and typically involve interactions with other agents (vehicle, crew, or equipment agents). For example, the set of tasks performed
during launch preparations requires no physical movement on the part of the aircraft or crew. The
Action that replicates these functions in MASCS simply counts up to a specific time value before
completing execution. The acceleration of the aircraft down the catapult is also time-based, using
the time required to complete the task to determine the acceleration needed to bring the vehicle to
flight speed. Data on these two parameters were compiled from video recordings of operations, drawn
primarily from the website YouTube.com. Each data set was then fit to a parametric
distribution, the results of which appear in Appendix C.
Executing these tasks is similar to executing motion-based tasks as shown in MoveForward. For
these tasks, however, the time is provided by a specified parametric distribution. Pseudocode for the
launch preparation time appears in Launch Preparation-Initialize and Launch Preparation-Execute;
pseudocode for the acceleration task is a combination of MoveForward and Launch
Preparation-Initialize, first sampling the completion time from a candidate distribution and then
iteratively updating vehicle speed and position over the provided time.
Launch Preparation-Initialize(mean, stddev, minimum)
 1  // Initialize variables
 2  distribution = new Gaussian(mean, stddev)
 3  totalTime = distribution.sample()
 4  // Randomly sample the distribution to get the task time, rejecting values more than
    // 3.9 standard deviations from the mean
 5  while (totalTime <= (mean - 3.9 * stddev)) || (totalTime >= (mean + 3.9 * stddev))
 6      totalTime = distribution.sample()
Launch Preparation-Execute(totalTime, elapsedTime)
 1  // Initialize variables
 2  currentTime = 0
 3  // Execution loop
 4  while currentTime <= totalTime
 5      currentTime += elapsedTime
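As an illustration, the two routines above can be combined in Python using the standard library's
Gaussian sampler; the 3.9-standard-deviation rejection bound follows the pseudocode, while the
function and parameter names are assumptions for the example.

import random

def sample_task_time(mean, stddev, n_sigma=3.9):
    # Re-sample until the value falls within n_sigma standard deviations of the mean
    while True:
        total_time = random.gauss(mean, stddev)
        if (mean - n_sigma * stddev) < total_time < (mean + n_sigma * stddev):
            return total_time

def execute_timed_task(total_time, elapsed_time):
    # Count up simulation time until the sampled task duration has elapsed
    current_time = 0.0
    while current_time <= total_time:
        current_time += elapsed_time
    return current_time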
B.2 Director Models and Rules
The collision prevention rule base was described above in CollisionPrevention. The logic behind
it is rather simple and is vehicle-centric in nature: it only describes what the behavior of an individual
aircraft should be and makes no judgments about the efficiency of the system. Employing only this
simple collision prevention rule base, aircraft can easily find themselves simultaneously blocking each
other and “locking” the deck. This requires action on the part of the Aircraft Directors to control
traffic on the flight deck, applying rules and priorities to aircraft to prevent any motion conflicts from
occurring. Pseudocode for this appears in TrafficDeconfliction. These rules rely on a
set of traffic management lists and assume that an aircraft has already been provided a catapult
assignment (this logic is described later in Handler-Assign). One management list controls taxi
operations in the Fantail area for aircraft heading to aft catapults, or for aircraft parked aft heading
to forward catapults. Another list manages the “Street” area, managing traffic for aircraft heading
to forward catapults and aircraft parked forward that are assigned to aft catapults. A third list
manages the “Box” area near catapults 1 and 2, managing operations of aircraft parked far forward
on the flight deck.
Each list operates similarly: as aircraft receive assignments, they are added to the appropriate
list(s). Once they have moved through the area, they are removed from the list. The first aircraft in
the list has top priority in taxi operations, although the lists are checked to ensure that aircraft with
higher overall launch priorities are ranked higher than others. While an aircraft is taxiing, these rules
are continually checked to ensure that it has permission to move, much in the same way that the
collision prevention routine is applied. Given a current aircraft in motion (Aircraft), its position
in the appropriate list is checked. If it is the first in the list, it is allowed to move. Otherwise,
a series of additional checks are performed to determine if there actually is a traffic conflict. For
instance, if the first aircraft in the list (firstAircraft) is assigned to a forward catapult, no conflict
exists if Aircraft is parked aft of firstAircraft and assigned to an aft catapult. The same is true
if firstAircraft is assigned aft and Aircraft is parked forward and assigned forward.
Operations may be allowed to occur if both are heading the same direction, however. If both
are assigned forward and Aircraft is aft of firstAircraft (following behind it), no conflict should
ever exist. If Aircraft is parked forward of firstAircraft (ahead of it), it must be several aircraft
lengths ahead of it in order to prevent a conflict from occurring. So long as this is true, Aircraft is
allowed to move.
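The list bookkeeping described above might be sketched in Python as follows; the launch_priority
attribute and the assumption that lower values indicate higher launch priority are illustrative, not
taken from MASCS.

class TrafficList:
    """Ordered list of aircraft transiting one deck region (Fantail, Street, or Box)."""

    def __init__(self):
        self.queue = []

    def add(self, aircraft):
        # Aircraft are added when they receive a catapult assignment, then reordered so
        # that higher overall launch priority is ranked first
        self.queue.append(aircraft)
        self.queue.sort(key=lambda a: a.launch_priority)   # lower value = higher priority (assumed)

    def remove(self, aircraft):
        # Aircraft are removed once they have moved through the region
        self.queue.remove(aircraft)

    def has_priority(self, aircraft):
        # The first aircraft in the list has top priority for taxi operations
        return bool(self.queue) and self.queue[0] is aircraft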
Other rules for Directors dictate how they transfer aircraft between one another on the flight
deck. This requires knowing the current assignment of the aircraft, its position and heading,
and the network of Directors on the deck. This network is known to each Director and includes
an understanding of their positions on the flight deck. Pseudocode for this routine appears in
DirectorHandoff, which functions as a search loop over all available connections. It begins
by defining a set of variables for search, the most important of which are the current distance
between the aircraft and its assigned catapult (destDistance) and the current angle to that catapult
(destAngle). The Director checks their connection network to see which options are available. The
routine then iterates over this list, calculating the distance and angle between the connection and
the assigned catapult. The difference between the aircraft's angle to the catapult and the new
Director's angle to the catapult is compared; this should be as small as possible. The distance
of the new Director to the catapult is also compared, and should be both as small as possible and
smaller than the aircraft’s distance from the catapult. Best values for each are stored during the
iteration, and the Director that is closest to the catapult and closest to the straight line path to the
catapult is selected.
DirectorHandoff(Aircraft, DirectorConnections, destination)
 1  // Initialize variables
 2  destAngle = arctan((Aircraft.y - destination.y) / (Aircraft.x - destination.x))
 3  bestDist = 1000000
 4  bestAngle = 180
 5  bestIndex = -1
 6  // Iterate through the list of connections to find the best option
 7  for i = 0 to DirectorConnections.size()
 8      currentDir = DirectorConnections.get(i)
 9      deltaX = currentDir.x - destination.x
10      deltaY = currentDir.y - destination.y
11      currentAngle = |arctan(deltaY / deltaX) - destAngle|
12      // currentAngle compares the angle between the Director and the destination to the
        // angle between the aircraft and the destination; we look for Directors closest to that path
13      currentDist = sqrt(deltaX^2 + deltaY^2)
14      if currentDist < bestDist && currentAngle < bestAngle
15          bestIndex = i
16          bestDist = currentDist
17          bestAngle = currentAngle
18  if bestIndex != -1
19      return DirectorConnections.get(bestIndex)
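A compact Python version of the handoff search is sketched below; it keeps the pseudocode's joint
condition that a candidate must improve on both the best distance and the best angle, and the
attribute names are assumptions for the example.

import math

def next_director(aircraft, director_connections, destination):
    dest_angle = math.atan2(aircraft.y - destination.y, aircraft.x - destination.x)
    best, best_dist, best_angle = None, float("inf"), math.pi
    for director in director_connections:
        delta_x = director.x - destination.x
        delta_y = director.y - destination.y
        # Compare the Director-to-catapult angle with the aircraft-to-catapult angle
        angle_diff = abs(math.atan2(delta_y, delta_x) - dest_angle)
        dist = math.hypot(delta_x, delta_y)
        if dist < best_dist and angle_diff < best_angle:
            best, best_dist, best_angle = director, dist, angle_diff
    return best   # None if no connection improves on the initial bounds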
The Deck Handler allocation routine requires the definition of two different heuristic sets: the first
(Handler-Assign) assigns aircraft to catapults and uses the second (Handler-RouteCheck) to
determine whether the route to the catapult is feasible. The route check (used by the assignment
heuristic) examines whether any aircraft are parked in the center parking spaces of the flight deck
between the "Street" and "Fantail" areas (identification numbers 8 and 9 in the MASCS simulation).
If any aircraft are parked in these areas, aircraft cannot be taxied from the forward area to the aft
area, or vice versa.
TrafficDeconfliction(Aircraft, AircraftList, threshold)
 1  if AircraftList.indexOf(Aircraft) != 0
 2      firstAircraft = AircraftList.getFirst()
 3      deltaX = firstAircraft.x - Aircraft.x
 4      deltaY = firstAircraft.y - Aircraft.y
 5      distance = sqrt(deltaX^2 + deltaY^2)
 6      if firstAircraft.catapult < 3
 7          // firstAircraft is assigned to head forward
 8          if (Aircraft.x - firstAircraft.x) < 0
 9              Continue
10              // Aircraft is aft of firstAircraft, which is heading forward; no conflict exists
11          elseif (Aircraft.catapult < 3) && (Aircraft.x - firstAircraft.x) > 0 && distance > threshold
12              Continue
13              // Aircraft is forward of firstAircraft, heading forward, and far enough
                // ahead of firstAircraft that no conflict exists
14          else Pause
15              // A routing conflict exists
16      else
17          // firstAircraft is assigned to head aft
18          if (Aircraft.x - firstAircraft.x) > 0
19              Continue
20              // Aircraft is forward of firstAircraft, which is heading aft; no conflict exists
21          elseif (Aircraft.catapult > 2) && (Aircraft.x - firstAircraft.x) < 0 && distance > threshold
22              Continue
23              // Aircraft is aft of firstAircraft, heading aft, and far enough ahead of
                // firstAircraft that no conflict exists
24          else Pause
25              // A routing conflict exists
Handler-RouteCheck(Aircraft, Catapult, ParkingSpaces)
    if Aircraft.parkingArea == aft && Catapult.number < 3
        // aircraft parked aft and heading to a forward catapult
        if ParkingSpace.8.occupied == true || ParkingSpace.9.occupied == true
            return false
        else return true
    elseif Aircraft.parkingArea == forward && Catapult.number > 2
        // aircraft parked forward and heading to an aft catapult
        if ParkingSpace.8.occupied == true || ParkingSpace.9.occupied == true
            return false
        else return true
    else return true
Handler-Assign(UnassignedAircraft, CatapultList)
    // Initialize variables
    bestDist = 1000000
    bestIndex = −1
    // Iterate over list of aircraft without assignments; only look at the five highest priority
    for i = 1 to 5
        currentAircraft = UnassignedAircraft.get(i)
        bestDist = 1000000
        bestIndex = −1
        // First pass: iterate over all catapults, considering only empty queues
        for j = 1 to 4
            if CatapultList(j).queue.size() == 0
                if Handler-RouteCheck(currentAircraft, CatapultList(j), ParkingSpaces) == true
                    deltaX = currentAircraft.x − CatapultList(j).x
                    deltaY = currentAircraft.y − CatapultList(j).y
                    distance = sqrt(deltaX² + deltaY²)
                    if distance < bestDist
                        bestDist = distance
                        bestIndex = j
        if bestIndex ≠ −1
            currentAircraft.catapult = bestIndex
            UnassignedAircraft.remove(currentAircraft)
        else
            // Second pass: consider catapults whose queues already hold one aircraft
            bestDist = 1000000
            for j = 1 to 4
                if CatapultList(j).queue.size() == 1
                    if Handler-RouteCheck(currentAircraft, CatapultList(j), ParkingSpaces) == true
                        deltaX = currentAircraft.x − CatapultList(j).x
                        deltaY = currentAircraft.y − CatapultList(j).y
                        distance = sqrt(deltaX² + deltaY²)
                        if distance < bestDist
                            bestDist = distance
                            bestIndex = j
            if bestIndex ≠ −1
                currentAircraft.catapult = bestIndex
                UnassignedAircraft.remove(currentAircraft)
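The two Handler heuristics can likewise be sketched in Python. The class layout, the blocking parking-space identifiers (8 and 9), and the two-pass preference for empty queues over single-occupancy queues follow the pseudocode above; the specific names and data structures are assumptions for illustration and do not reflect the MASCS source.

import math

FORWARD_CATAPULTS = {1, 2}   # catapults 1-2 are forward, 3-4 are aft, as in the pseudocode
BLOCKING_SPACES = {8, 9}     # center parking areas between the "Street" and "Fantail"

def route_is_feasible(aircraft, catapult, parking_spaces):
    """Handler-RouteCheck: fore/aft transits are blocked when spaces 8 or 9 are occupied."""
    crossing = (
        (aircraft.parking_area == "aft" and catapult.number in FORWARD_CATAPULTS) or
        (aircraft.parking_area == "forward" and catapult.number not in FORWARD_CATAPULTS)
    )
    if not crossing:
        return True
    return not any(parking_spaces[s].occupied for s in BLOCKING_SPACES)

def assign_aircraft(unassigned, catapults, parking_spaces):
    """Handler-Assign: give each of the five highest-priority aircraft the nearest
    feasible catapult, preferring empty queues over queues that already hold one aircraft."""
    for aircraft in list(unassigned)[:5]:
        best_catapult = None
        for allowed_queue in (0, 1):           # first pass: empty queues; second pass: size 1
            candidates = [c for c in catapults
                          if len(c.queue) == allowed_queue
                          and route_is_feasible(aircraft, c, parking_spaces)]
            if candidates:
                best_catapult = min(
                    candidates,
                    key=lambda c: math.hypot(aircraft.x - c.x, aircraft.y - c.y))
                break
        if best_catapult is not None:
            aircraft.catapult = best_catapult.number
            best_catapult.queue.append(aircraft)
            unassigned.remove(aircraft)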
Appendix C
MASCS Input Data
Turn rate observations:
Table C.1: Observed turn rates for F-18 aircraft during manually-piloted flight deck operations.
time (sec)    Estimated Angle (degrees)    Estimated Rate (degrees/second)    Source
7.50          180.00                       24.00                              http://youtu.be/RGQy5ImbnNI
12.00         180.00                       15.00                              http://youtu.be/RGQy5ImbnNI
3.50          90.00                        25.71                              http://youtu.be/7EaPe7zGZAw
1.00          15.00                        15.00                              http://youtu.be/ 2KYU97y2L8
10.00         90.00                        9.00                               http://youtu.be/7G3MujUry-I
Average                                    17.75
The Launch Preparation Time and Takeoff Time distributions discussed in Chapter 3 were based
on empirical observations from video recordings of operations. Table C.2 contains the observations
related to the launch preparation time, while Table C.3 contains observations related to the Takeoff
time.
Table C.2: Observations of launch preparation time.
61.00    74.00    141.00
73.00    75.00    153.00
(all values in seconds)
Table C.3: Observations of takeoff (acceleration to flight speed) time.
2.30    2.30    2.40    2.50    2.50    2.50    2.50
2.50    2.50    2.60    2.60    2.70    2.70    2.80
2.80    2.80    2.80    2.90    2.90    2.90    3.00
3.00    3.00    3.10    3.10    3.20    3.20    3.20
(all values in seconds)
Because of the low number of observations for launch preparation time, any distribution fit to the
data is likely to be inaccurate. For these data, a normal distribution was fit that placed the maximum
(153 seconds) and minimum (61 seconds) observations at specified intervals from the mean, informed
by an understanding of the maximum and minimum feasible launch times. The final selected distribution appears in Fig. C-1, with the six observed points overlaid. This distribution has a mean of
109.648 seconds and a standard deviation of 57.8 seconds, which places the maximum observation
(153 seconds) at roughly the 75th percentile and the minimum (61 seconds) at roughly the 20th. The minimum
feasible launch time is 30 seconds; within Multi-Agent Safety and Control Simulation (MASCS),
any random samples that fall below this threshold are discarded. The maximum allowed sample
is the mean plus 3.9 standard deviations, equal to 335 seconds (about 6.5
minutes). Beyond a distance of 3.9 standard deviations, values are considered extreme (99.99%
of all values lie within ±3.9 standard deviations of the mean).
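A minimal sketch of how such a bounded normal draw might be implemented appears below, using the mean, standard deviation, and cutoffs quoted above; the function name and structure are illustrative assumptions rather than the MASCS implementation.

import random

def sample_launch_prep_time(mean=109.648, sd=57.8, minimum=30.0, max_sigma=3.9):
    """Draw a launch preparation time (seconds) from a normal distribution,
    rejecting samples below the 30-second floor or beyond mean + 3.9 sigma (about 335 s)."""
    upper = mean + max_sigma * sd
    while True:
        sample = random.gauss(mean, sd)
        if minimum <= sample <= upper:
            return sample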
Figure C-1: Probability density function of the
modeled launch preparation time, with empirical observations overlaid.
Figure C-2: Probability density function of the
modeled launch acceleration time, with empirical observations overlaid.
Figure C-2 shows the resulting distribution of takeoff times, based on a Lognormal distribution
fit to the data in Table C.3. The distribution fit was performed in the Easyfit program, which
returned the following information from the Kolmogorov-Smirnov test (Table C.4):
Table C.4: Results of Easyfit fit of Lognormal distribution to takeoff acceleration time data.
                          Lognormal
K-S statistic             0.1513
p                         0.49598
Shape parameter σ         0.09891
Location parameter µ      1.0106
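The same style of fit can be reproduced approximately with scipy; the sketch below assumes the 28 observations from Table C.3 and is not a record of the Easyfit session.

import math
from scipy import stats

# Takeoff acceleration times (seconds) from Table C.3
times = [2.30, 2.30, 2.40, 2.50, 2.50, 2.50, 2.50, 2.50, 2.50, 2.60, 2.60, 2.70, 2.70,
         2.80, 2.80, 2.80, 2.80, 2.90, 2.90, 2.90, 3.00, 3.00, 3.00, 3.10, 3.10, 3.20, 3.20, 3.20]

# Fit a two-parameter lognormal (location pinned to zero, since Easyfit reports only shape and scale)
shape, loc, scale = stats.lognorm.fit(times, floc=0)
ks_stat, p_value = stats.kstest(times, "lognorm", args=(shape, loc, scale))
print(f"sigma = {shape:.4f}, mu = {math.log(scale):.4f}, K-S = {ks_stat:.4f}, p = {p_value:.4f}")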
Appendix D
Example Scenario Configuration Text
D.1 Example 1: Manually Piloted Aircraft
The first line of text below initializes an F-18 type aircraft (FMAC) named "F18-40" at parking space
9, spot 1, with a fuel level of 100% (1.0). Line two assigns a wait task of 5000 ms to the vehicle (spacing its addition into the simulation). Line 3 initializes a Taxi task that will send the vehicle to a
catapult; the assignment of -1 implies that the Deck Handler algorithm will make this assignment
at some time in the future. Line 4 assigns a Takeoff task to the aircraft, which will occur at the same
catapult as the Taxi task. Line 5 assigns the aircraft to fly away from the carrier to its Mission.
The launch preparation tasks discussed in Chapter 3 are contained within the Takeoff task structure.
Aircraft FMAC F18-40 Parked 0 0 9 1 1.0 0
Task Wait 5000 0
Task Taxi -1 Catapult
Task Takeoff -1 0
Task GotoMission 2000 North
D.2 Example 2: Crew
The first line below creates a "Crew" agent, named "D16" and assigned as an Aircraft Director.
It is initially placed at deck coordinates (1355, 1050) and assigned to the group Deck 3.
Crew D16 director 1355 1050
D.3 Example 3: Deck Equipment
The first line below creates a Catapult agent replicating catapult 1, marks it as operational (the second
numeral 1), and gives it a starting point of (1656, 1124) and an ending point of (2362, 1072) in deck coordinates.
The trio of lines that follow defines a parking region and a pair of parking spots on the flight deck. The first line initializes a parking space structure (a region on the flight deck) named "Parking Space 1" and assigns
a parking angle of −140° to aircraft within it. The second line defines parking spot 0 within that region, marks
it as available, and places it at coordinates (2340, 1100). The third line defines parking spot 1
within that space, at coordinates (2240, 1110).
Catapult 1 1 1656 1124 2362 1072
ParkingSpace 1 -140
ParkingSpot 0 1 2340 1100
ParkingSpot 1 1 2240 1110
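For illustration, a minimal parser for configuration lines of this form might look like the sketch below. The keyword set and field order follow the examples above; the dictionary layout, field names, and function name are assumptions for the example, not the MASCS scenario loader.

def parse_scenario_line(line):
    """Parse one scenario configuration line into a small dictionary.

    Handles the record types shown in the examples above (Aircraft, Task, Crew,
    Catapult, ParkingSpace, ParkingSpot); unknown types are returned unparsed.
    """
    tokens = line.split()
    kind = tokens[0]
    if kind == "Aircraft":
        # e.g. "Aircraft FMAC F18-40 Parked 0 0 9 1 1.0 0"
        return {"type": kind, "model": tokens[1], "name": tokens[2], "state": tokens[3],
                "parking_space": int(tokens[6]), "spot": int(tokens[7]), "fuel": float(tokens[8])}
    if kind == "Task":
        # e.g. "Task Taxi -1 Catapult" or "Task Wait 5000 0"
        return {"type": kind, "task": tokens[1], "value": tokens[2], "target": tokens[3]}
    if kind == "Crew":
        # e.g. "Crew D16 director 1355 1050"
        return {"type": kind, "name": tokens[1], "role": tokens[2],
                "x": float(tokens[3]), "y": float(tokens[4])}
    if kind == "Catapult":
        # e.g. "Catapult 1 1 1656 1124 2362 1072"
        return {"type": kind, "number": int(tokens[1]), "operational": tokens[2] == "1",
                "start": (float(tokens[3]), float(tokens[4])),
                "end": (float(tokens[5]), float(tokens[6]))}
    if kind == "ParkingSpace":
        # e.g. "ParkingSpace 1 -140"
        return {"type": kind, "number": int(tokens[1]), "angle": float(tokens[2])}
    if kind == "ParkingSpot":
        # e.g. "ParkingSpot 0 1 2340 1100"
        return {"type": kind, "number": int(tokens[1]), "available": tokens[2] == "1",
                "x": float(tokens[3]), "y": float(tokens[4])}
    return {"type": kind, "raw": tokens[1:]}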
Appendix E
Center for Naval Analyses (CNA)
Interdeparture Data
Figure E-1 shows the data reported by Jewell (Jewell, 1998) for launch interdeparture rates on the
aircraft carrier flight deck including a 3rd order polynomial fit of the data. Jewell notes that the
data comes from one week of operations on a carrier used in a previous study. Jewell’s original
presentation of this data did not include the 3rd order polynomial shown in the figure. In order to
extract the numerical values from this chart, the figure was imported into the Engauge Digitizer
software package1 . This software is capable of extracting numerical relationships from a chart given
the input image and specifications of the x- and y-axes and their ranges. The Engauge software
returned a set of 102 points comprising the information presented in Fig. E-1. These points were
then fit in the JMP PRO 10 statistical program with a 3rd-order polynomial fit, with both the F test
of model fit and coefficient tests returning significant results (Tables E.1 and E.2). The equation is
given as:
p = 0.3947704 + 0.004392*time − 3.99E−05*(time − 86.5334)² + 1.67E−07*(time − 86.5334)³    (E.1)
Table E.1: Results of F test of model fit for JMP PRO 10 fit of CNA CDF data with 3rd order
polynomial.
Source      DF    Sum of Squares    Mean Square    F Ratio      Prob > F
Model        3    8.7080979         2.9027         9492.7616    <.0001
Error       98    0.0299665         0.00031
C. Total   101    8.7380644

1 http://digitizer.sourceforge.net/
Figure E-1: Cumulative Density Function (CDF) of launch interdepartures from (Jewell, 1998).
Table E.2: Results of t tests for coefficients of JMP PRO 10 fit of CNA CDF data with 3rd order
polynomial.
Term               Estimate     Std Error    t Ratio    Prob > |t|
Intercept          0.3947704    0.007481     52.77      <.0001
time               0.004392     0.000078     56.28      <.0001
(time-86.5334)²    -3.99E-05    7.00E-07     -56.96     <.0001
(time-86.5334)³    1.67E-07     1.32E-08     12.59      <.0001
This equation was then converted into a Probability Density Function (PDF) by taking the first
derivative with respect to time. Since the data represent true interdeparture rates for a system,
they should be characterizable by a negative exponential function. The PDF developed from
differentiating the CDF equation is plotted in blue in Figure E-2. These data points were then
imported into JMP PRO 10 to formally fit a negative exponential to the data, done by transforming
the Y variable (in this case, the probability value) into log(Y). The results of this fit also returned
statistical significance (Tables E.3 and E.4) for the model

Log(P) = −4.128056 − 0.0155665*time    (E.2)

which is virtually identical to the more typical negative exponential form λ*e^(−λx),

P = 0.0155665*e^(−0.0155665*time)    (E.3)

which is treated as the appropriate negative exponential model. For this model, the value of λ
corresponds to the average departure rate per second; its inverse corresponds to the average
time between arrivals (for this model, 64.25 seconds).
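The conversion and refit can be sketched numerically as below; the time grid and the use of numpy are assumptions made for illustration, standing in for the Engauge/JMP workflow described above.

import numpy as np

# Fitted CDF from Eq. E.1
def cdf(t):
    return 0.3947704 + 0.004392 * t - 3.99e-05 * (t - 86.5334) ** 2 + 1.67e-07 * (t - 86.5334) ** 3

t = np.linspace(0.0, 175.0, 102)      # time between launches, seconds (assumed grid)
pdf = np.gradient(cdf(t), t)          # numerical first derivative of the CDF

# Fit a negative exponential by regressing log(pdf) on time, as was done in JMP
mask = pdf > 0
slope, intercept = np.polyfit(t[mask], np.log(pdf[mask]), 1)
lam = -slope                          # estimated departure rate per second
print(f"lambda = {lam:.4f}, mean interdeparture time = {1.0 / lam:.1f} s")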
Figure E-2: PDF of launch interdepartures from (Jewell, 1998)
Table E.3: Results of F test of model fit for JMP PRO 10 fit of CNA PDF data with exponential
function.
Source      DF    Sum of Squares    Mean Square    F Ratio     Prob > F
Model        1    78.076245         78.0762        8095.805    <.0001
Error      100    0.964404          0.0096
C. Total   101    79.040648
Table E.4: Results of t tests for coefficients of JMP PRO 10 fit of CNA PDF data with exponential
function.
Term         Estimate     Std Error    t Ratio    Prob > |t|
Intercept    -4.128056    0.017851     -231.2     <.0001
time         -0.015566    0.000173     -89.98     <.0001
Appendix F
Multi-Agent Safety and Control
Simulation (MASCS) Single
Aircraft Test Results
Table F.1: Test results for distribution fits of the MASCS single aircraft calibration testing.
                        Launch Duration    Taxi Duration    Takeoff Duration    Launch Preparation
K-S statistic           0.05773            0.06504          0.04409             0.05413
p                       0.25992            0.15113          0.5886              0.33108
Mean µ                  209.99             83.943           2.7513              120.16
Standard deviation σ    49.628             6.0198           0.25651             48.842
Appendix G
Description of Mission Queuing
This section provides a more extended description of the process of queuing at catapults on the flight
deck and how this process influences the execution of the mission. Typically, operations allow one
aircraft on the catapult and one more parked behind the jet blast deflector; this implies a queue size
of two for an individual catapult and a size of four for a pair. Aircraft are allocated to a catapult
as soon as a position in the queue opens. With four catapults open, each pair services an equal
number of aircraft and the expectations are as listed above: allocations should be equally divided
between forward and aft catapults. However, if only three catapults are used, the larger queue size
of the pair of catapults interacts with the allocation pattern; because the pair is able to "store" more
aircraft in its vicinity, the allocations may be slightly inefficient. At the end of the mission, if
all three catapults have full queues, the pair will take four rounds of launches to clear its queue of
four aircraft, while the single catapult requires only two rounds to clear its two aircraft. Figure G-1 shows
this allocation as time evolves. The leftmost image shows the initial allocation of launches to a
team of four catapults. The queue at each catapult, over time, is noted in the numbered columns
(catapult 1 on the left, to catapult 4 on the right). Colored blocks within each column denote (a)
the presence of an aircraft in that catapult's queue and (b) the order in which they were allocated.
Thus, block 1 was the first aircraft allocated, assigned to catapult 1. Block 6 was the sixth aircraft
allocated to launch, assigned to catapult 2. The staggered, "checkered" pattern in the table is due
to the alternating launches between pairs of catapults; while aircraft 2 was assigned second, it was
assigned to catapult 2 and must wait for catapult 1 to launch aircraft 1. Similar patterning is shown
for aircraft at catapults 3 and 4. In this case, aircraft 1 is shown as a red block; this is used to
indicate that the aircraft is in preparation to launch. In the middle image, aircraft 1 is now crossed out
with a black line, indicating it has launched. Now that there is a spot free at the catapult, aircraft 9
is assigned to take its place (recall from Chapter 3 that the Handler routine assigns aircraft as catapults
become available). This pattern continues moving forward; in the rightmost image, after aircraft 3
and 4 have launched, aircraft 10 and 11 have been assigned, respectively, and aircraft 4 and 5 begin
preparing for launch. Continuing this build-out as shown, each pair of catapults launches one half
of the allotted aircraft, and the time to complete the mission is approximately (N/2)*µlaunch.
Figure G-1: Pattern of allocation to catapults. Blue/green coloring corresponds to pairing of catapults 1/2 and 3/4, respectively. Red indicates the launch is in process. A black diagonal line
indicates that the launch has occurred and the aircraft is no longer in the system.
With only three catapults operating, the allocation pattern changes slightly (Figure G-2). One
catapult (in this case, catapult 2) is no longer paired and is free to operate on its own. The leftmost
image in the figure shows the initial allocation of aircraft; four go to catapults 3 and 4, staggered in
an alternating pattern. Only two can go to catapult 2, but they launch consecutively. The remaining two
images in the figure show how the allocation builds out from there; after aircraft 1 and 2 launch,
aircraft 3 and 4 begin processing, and aircraft 7 and 8 are allocated. This continues for the rest of
the allocation, with catapult 2 launching about as many aircraft as the paired catapults 3 and 4.
Figure G-2: Pattern of allocation to a set of three catapults. Blue/green coloring corresponds to
pairing of catapults 1/2 and 3/4, respectively. Red indicates the launch is in process. A black
diagonal line indicates that the launch has occurred and the aircraft is no longer in the system.
Figure G-3 shows how this allocation plays out through a 22 aircraft launch event. As described
above, the four catapult case maintains an even allocation across the pairs and both complete at
the same time (11 rounds of launches). For the three catapult case, because of the imbalance in
allocation that results from the queuing arrangement, catapult 2 finishes after only 10 launches and
the catapult 3/4 pair finishes after 12 rounds, one more than the four catapult case. This implies that
the three catapult case should require one additional round of launches compared to the four catapult case
under this idealized queue model of the flight deck. Given that there will be significant stochasticity
in the execution of the model, and that assignments are dependent on the geometric arrangement of
the flight deck at the time of assignment, it is not unreasonable to assume that this difference will
not be precisely equal to one, but it should be close.
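The idealized queue model described above can be checked with a short simulation. The round-based bookkeeping below is a simplification assumed for illustration: each unit (a single catapult or a paired set) launches one aircraft per round, and open queue slots are refilled from the unallocated pool after every round.

def rounds_to_finish(n_aircraft, unit_capacities):
    """Idealized round-based launch model.

    unit_capacities lists the queue capacity of each launching unit: a paired set of
    catapults has capacity 4, a single catapult has capacity 2.
    """
    remaining = n_aircraft
    queues = []
    for capacity in unit_capacities:
        take = min(capacity, remaining)      # initial allocation fills each queue
        queues.append(take)
        remaining -= take
    rounds = [0] * len(queues)
    while any(q > 0 for q in queues) or remaining > 0:
        for i in range(len(queues)):
            if queues[i] > 0:
                queues[i] -= 1               # one launch per unit per round
                rounds[i] += 1
        for i in range(len(queues)):         # refill open slots from the unallocated pool
            while remaining > 0 and queues[i] < unit_capacities[i]:
                queues[i] += 1
                remaining -= 1
    return rounds

print(rounds_to_finish(22, [4, 4]))          # four catapults (two pairs): [11, 11]
print(rounds_to_finish(22, [2, 4]))          # three catapults (one single, one pair): [10, 12]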
Figure G-3: Build-out of 22 aircraft launch events with four (left) and three (right) available catapults.
Appendix H
Multi-Agent Safety and Control
Simulation (MASCS) Mission Test
Results
Table H.1: Results of distribution fits of interdeparture times for CNA and MASCS test data.
Results include λ and p-values for fits from Easyfit along with λ values and 95% confidence intervals
from JMP PRO 10.
Data source    Easyfit λ value    EasyFit K-S test p-value    JMP λ     Lower range    Upper range
CNA            0.0156             0.0717                      0.0156    –              –
18-3           0.0144             0.2813                      0.0144    0.0108         0.0187
18-4           0.0164             0.1554                      0.0164    0.0123         0.0213
20-3           0.0153             0.1251                      0.0153    0.0114         0.0198
20-4           0.0160             0.1358                      0.0159    0.0120         0.0207
21-3           0.0141             0.2755                      0.0141    0.0105         0.0184
21-4           0.0152             0.1596                      0.0152    0.0113         0.0197
22-3           0.0145             0.4500                      0.0145    0.0108         0.0190
22-4           0.0154             0.2446                      0.0154    0.0115         0.0201
23-3           0.0144             0.1782                      0.0144    0.0108         0.0188
23-4           0.0153             0.2254                      0.0153    0.0115         0.0199
24-3           0.0146             0.0999                      0.0146    0.0109         0.0191
24-4           0.0159             0.2565                      0.0159    0.0119         0.0208
26-3           0.0150             0.1958                      0.0150    0.0110         0.0198
26-4           0.0159             0.2126                      0.0159    0.0117         0.0210
30-3           0.0147             0.1089                      0.0147    0.0110         0.0191
30-4           0.0154             0.1900                      0.0154    0.0115         0.0200
34-3           0.0149             0.1981                      0.0149    0.0111         0.0195
34-4           0.0153             0.3791                      0.0153    0.0114         0.0200
Table H.2: JMP PRO 10 output for F-test of model fit for LD values using three catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio     Prob > F
Model        1    254.67542         254.675        1735.526    <.0001
Error        7    1.0272            0.147
C. Total     8    255.70262

Table H.3: JMP PRO 10 output for t-test of coefficients for linear fit of LD values using three catapults.

Term         Estimate     Std Error    t Ratio    Prob > |t|
Intercept    2.4130162    0.65966      3.66       0.0081
Num air      1.1130865    0.026719     41.66      <.0001

Table H.4: JMP PRO 10 output for F-test of model fit for LD values using four catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio     Prob > F
Model        1    267.30216         267.302        776.9774    <.0001
Error        7    2.4082            0.344
C. Total     8    269.71036

Table H.5: JMP PRO 10 output for t-test of coefficients for linear fit of LD values using four catapults.

Term         Estimate     Std Error    t Ratio    Prob > |t|
Intercept    0.0860649    1.010041     0.09       0.9345
Num air      1.1403459    0.04091      27.87      <.0001
Table H.6: JMP Pro 10 output of Levene test of equal variances for launch preparation sensitivity tests using 22 aircraft and 3 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       0.7676     2        87       0.4672
Brown-Forsythe    0.2117     2        87       0.8097
Levene            0.2099     2        87       0.8111
Bartlett          0.7683     2        .        0.4638

Table H.7: JMP Pro 10 output of an ANOVA test for launch preparation sensitivity tests using 22 aircraft and 3 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
22-3         2    259.75601         129.878        29.1344    <.0001
Error       87    387.83637         4.458
C. Total    89    647.59238

Table H.8: JMP Pro 10 output of Tukey test for launch preparation sensitivity tests using 22 aircraft and 3 catapults.

Level        - Level      Difference    Std Err Dif    Lower CL    Upper CL    p-Value
22-3 +10%    22-3 -10%    4.155106      0.5451538      2.855189    5.455023    <.0001
22-3 +10%    22-3 base    2.27535       0.5451538      0.975433    3.575267    0.0002
22-3 base    22-3 -10%    1.879756      0.5451538      0.579839    3.179673    0.0025
Table H.9: JMP Pro 10 output of Levene test of equal variances for taxi speed sensitivity tests using 22 aircraft and 3 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       0.3054     2        87       0.7376
Brown-Forsythe    0.1777     2        87       0.8375
Levene            0.2357     2        87       0.7906
Bartlett          0.2962     2        .        0.7436

Table H.10: JMP Pro 10 output of an ANOVA test for taxi speed sensitivity tests using 22 aircraft and 3 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
22-3         2    12.48402          6.24201        1.0913     0.3403
Error       87    497.60072         5.71955
C. Total    89    510.08474

Table H.11: JMP Pro 10 output of Levene test of equal variances for collision prevention parameter sensitivity tests using 22 aircraft and 3 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       1.1804     2        87       0.312
Brown-Forsythe    1.4554     2        87       0.2389
Levene            1.4402     2        87       0.2425
Bartlett          1.1048     2        .        0.3313

Table H.12: JMP Pro 10 output of an ANOVA test for collision prevention parameter sensitivity tests using 22 aircraft and 3 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
22-3         2    17.85385          8.92693        1.5253     0.2233
Error       87    509.18622         5.85272
C. Total    89    527.04007

Table H.13: JMP Pro 10 output of Levene test of equal variances for taxi speed sensitivity tests using 22 aircraft and 4 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       0.184      2        87       0.8323
Brown-Forsythe    0.5919     2        87       0.5555
Levene            0.5318     2        87       0.5895
Bartlett          0.142      2        .        0.8676

Table H.14: JMP Pro 10 output of an ANOVA test for taxi speed sensitivity tests using 22 aircraft and 4 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
22-4         2    10.33376          5.16688        1.1123     0.3334
Error       87    404.14883         4.64539
C. Total    89    414.48259
Table H.15: JMP Pro 10 output of a Levene test of equal variances for collision prevention parameter sensitivity tests using 22 aircraft and 4 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       2.2512     2        87       0.1114
Brown-Forsythe    1.4837     2        87       0.2325
Levene            1.5309     2        87       0.2221
Bartlett          2.6178     2        .        0.073

Table H.16: JMP Pro 10 output of an ANOVA test for collision prevention parameter sensitivity tests using 22 aircraft and 4 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
22-4         2    43.43476          21.7174        4.6278     0.0123
Error       87    408.27087         4.6928
C. Total    89    451.70563

Table H.17: JMP Pro 10 output of a Tukey test for collision prevention parameter sensitivity tests using 22 aircraft and 4 catapults.

Level          - Level        Difference    Std Err Dif    Lower CL     Upper CL    p-Value
22-4 CA+25%    22-4 CA-25%    1.643544      0.5593311      0.309822     2.977267    0.0116
22-4 base      22-4 CA-25%    1.203622      0.5593311      -0.130101    2.537345    0.0855
22-4 CA+25%    22-4 base      0.439922      0.5593311      -0.893801    1.773645    0.7123

Table H.18: JMP Pro 10 output of a Levene test of equal variance for taxi speed sensitivity tests using 34 aircraft and 3 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       0.6872     2        87       0.5057
Brown-Forsythe    0.7932     2        87       0.4556
Levene            0.8184     2        87       0.4445
Bartlett          0.8447     2        .        0.4297

Table H.19: JMP Pro 10 output of an ANOVA test for taxi speed sensitivity tests using 34 aircraft and 3 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
34-3         2    83.3424           41.6712        4.9753     0.009
Error       87    728.6756          8.3756
C. Total    89    812.018
Table H.20: JMP Pro 10 output of a Tukey test for taxi speed sensitivity tests using 34 aircraft and 3 catapults.

Level           - Level         Difference    Std Err Dif    Lower CL    Upper CL    p-Value
34-3 taxi-25    34-3 taxi+25    2.328156      0.747243       0.546357    4.109954    0.007
34-3 base       34-3 taxi+25    1.483278      0.747243       -0.29852    3.265076    0.122
34-3 taxi-25    34-3 base       0.844878      0.747243       -0.93692    2.626676    0.4979

Table H.21: JMP Pro 10 output of a Levene test of equal variance for collision prevention parameter sensitivity tests using 34 aircraft and 3 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       1.3729     2        87       0.2588
Brown-Forsythe    1.6028     2        87       0.2072
Levene            1.7374     2        87       0.182
Bartlett          1.4627     2        .        0.2316

Table H.22: JMP Pro 10 output of an ANOVA test for collision prevention parameter sensitivity tests using 34 aircraft and 3 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
34-3 C       2    3.90398           1.95199        0.2293     0.7956
Error       87    740.75            8.51437
C. Total    89    744.65397

Table H.23: JMP Pro 10 output of a Levene test of equal variance for collision prevention parameter sensitivity tests using 34 aircraft and 4 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       0.6346     2        87       0.5326
Brown-Forsythe    0.4903     2        87       0.6141
Levene            0.4355     2        87       0.6484
Bartlett          0.5318     2        .        0.5876

Table H.24: JMP Pro 10 output of an ANOVA test for launch preparation time sensitivity tests using 34 aircraft and 4 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
34-4         2    517.4101          258.705        34.4374    <.0001
Error       87    653.5734          7.512
C. Total    89    1170.9835

Table H.25: JMP Pro 10 output of a Tukey test for launch preparation time sensitivity tests using 34 aircraft and 4 catapults.

Level        - Level      Difference    Std Err Dif    Lower CL    Upper CL    p-Value
34-4 +10%    34-4 -10%    5.873072      0.7076882      4.185593    7.560552    <.0001
34-4 +10%    34-4 base    2.964311      0.7076882      1.276831    4.651791    0.0002
34-4 base    34-4 -10%    2.908761      0.7076882      1.221281    4.596241    0.0003
Table H.26: JMP Pro 10 output of Levene test of equal variance for taxi speed sensitivity tests using 34 aircraft and 4 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       2.7878     2        87       0.0671
Brown-Forsythe    2.3031     2        87       0.106
Levene            2.6456     2        87       0.0767
Bartlett          2.9422     2        .        0.0528

Table H.27: JMP Pro 10 output for an ANOVA test for taxi speed sensitivity tests using 34 aircraft and 4 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
34-4         2    55.10413          27.5521        3.8031     0.0261
Error       87    630.28446         7.2446
C. Total    89    685.38859

Table H.28: JMP Pro 10 output of a Tukey test for taxi speed sensitivity tests using 34 aircraft and 4 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
34-4         2    55.10413          27.5521        3.8031     0.0261
Error       87    630.28446         7.2446
C. Total    89    685.38859

Table H.29: JMP Pro 10 output of Levene test of equal variance for collision prevention parameter sensitivity tests using 34 aircraft and 4 catapults.

Test              F Ratio    DFNum    DFDen    Prob > F
O'Brien[.5]       1.2733     2        87       0.2851
Brown-Forsythe    1.0896     2        87       0.3409
Levene            1.078      2        87       0.3448
Bartlett          1.1711     2        .        0.31

Table H.30: JMP Pro 10 output for an ANOVA test for collision prevention parameter sensitivity tests using 34 aircraft and 4 catapults.

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
34-4         2    76.49218          38.2461        5.4986     0.0056
Error       87    605.13776         6.9556
C. Total    89    681.62995

Table H.31: JMP Pro 10 output for a Tukey test for collision prevention parameter sensitivity tests using 34 aircraft and 4 catapults.

Level        - Level        Difference    Std Err Dif    Lower CL    Upper CL    p-Value
34-4 +25%    34-4 CA-25%    2.00675       0.6809604      0.383       3.630498    0.0114
34-4 base    34-4 CA-25%    1.900217      0.6809604      0.27647     3.523964    0.0176
34-4 +25%    34-4 base      0.106533      0.6809604      -1.51721    1.730281    0.9866
Appendix I
Unmanned Aerial Vehicle (UAV)
Models
Chapter 5 discussed the Multi-Agent Safety and Control Simulation (MASCS) experimental test
program and provided an overview of the Unmanned Aerial Vehicle (UAV) models encoded into
MASCS. This appendix provides additional information regarding those models.
I.1 Local Teleoperation
Local Teleoperation (LT) relocates the pilot from the cockpit of the aircraft to a nearby Ground Control
Station (GCS) that enables them to control the vehicle using the same control inputs
they would normally supply from within the vehicle ("stick and rudder" inputs). "Local"
teleoperation assumes that this control is done via Line-Of-Sight (LOS) communications and that
there is no significant latency in operations. Additionally, the use of a GCS implies that the operator
now receives visual information about the world through cameras on board the vehicle. These
may have a limited field of view and thus impair the operator's ability to observe the locations of
Directors and other vehicles in the world.
Research by several groups has addressed teleoperated driving (McLean & Prescott, 1991;
McLean et al., 1994; Spain & Hughes, 1991; Scribner & Gombash, 1998; Oving & Erp, 2001;
Smyth et al., 2001), a task similar to what will occur with teleoperated aircraft on the flight deck.
This research has shown that the move from direct viewing within the vehicle to a teleoperated
computer/video display has some detrimental effect. The increase in task
completion time for teleoperated systems ranged from 0% to 126.18%, with a mean of 30.27% and
a median of 22.12%. The tasks used in these experiments typically required drivers to navigate a
marked course that required several turning maneuvers. In these tasks, drivers showed a significantly
higher error rate (defined as the number of markers or cones struck during the test) as compared to
non-teleoperated driving. Percent increases in errors ranged from 0% to 3650.0% with a mean of
137.86% and a median of 60%. However, these values are difficult to translate into direct error rates
on the flight deck. Conceptually, they signify a difficulty in high-precision alignment tasks, a skill
required while aligning to and driving onto a catapult. A failure to align correctly to a catapult then
requires other crew to physically push the vehicle off of the catapult to realign it. Interviews and
observational data suggest this takes approximately 15 to 45 seconds.
On the aircraft carrier flight deck, teleoperated vehicles would still be directed around the deck
through the zone coverage routing structure. The research described above suggests that operators
would drive more slowly while under guidance from Directors and would have more difficulty
finding Directors on the deck due to their constrained field of view. The research does not suggest,
however, that operators would fail to notice their designated Directors once they lie within their
field of view, nor that they would fail to properly comprehend instructions. Additionally,
though not explored in the research, the operator's understanding of context should prevent her
from executing an incorrect action. However, because of the limitations of the view field and the lack
of tactile and proprioceptive sensory inputs, it is likely that local teleoperation systems will have
difficulty in properly aligning to catapults.
I.1.1 Local Teleoperation Safety Implications
Just as with manual control, safety in local teleoperation is provided by a combination of the zone
coverage routing structure, the planning of aircraft assignments, and reactive collision avoidance
(Dynamic Separation) on the part of the UAV operator. However, this latter property will be
affected by the limited view field of the cameras and the screen on the GCS and thus should be less
effective than manual control operations.
Among the available safety protocols, it would be expected that Area+Temporal would be the
worst selection for productivity, while Area Separation would be the worst for safety metrics. The
problems lie in confining vehicles to small areas: in Area+Temporal, manned aircraft are confined
to a small area and cannot exploit the efficiency of using all four catapults simultaneously. In Area
Separation, UAVs are confined to operating in a single highly-congested area where the limited
field of view should decrease their ability to avoid imminent collisions. Accordingly, it would be
expected that Temporal Separation would be best for both safety and productivity: it allows the
manned aircraft (able to safely taxi through high-congestion areas) to depart first, freeing up the remaining
space on the deck for the unmanned vehicles and increasing their efficiency. This both decreases
any detrimental safety effects on the part of the UAVs and increases their productivity
by providing an open deck. Under this same logic, Area+Temporal should also do well in safety
measures, as it operates in largely the same manner as Temporal Separation. Its performance in
productivity will be worse, as it overly constrains manned aircraft access to catapults during their
phase of operations.
I.2 Remote Teleoperation
Remote Teleoperation (RT) systems are almost identical to Local Teleoperation systems (a ground
control station with the view constrained by cameras), with the lone change being that the operator is
now much farther away (hundreds to thousands of miles) and cannot communicate with the vehicle
using LOS methods. This typically requires the use of satellite communications, at the cost of
significant latency in transmissions. The first effect is a delay in beginning to execute
actions. Once the Director issues a command to the UAV operator, the video feed must first travel
to the operator, whose response must then travel back to the vehicle for execution. Secondly, lag
affects how operators input tasks to the system. Research has shown that a naive teleoperator
inputs tasks assuming there is no lag, requiring multiple corrective actions to finally reach the target
and much greater amounts of time. Sheridan and Ferrell (Sheridan & Ferrell, 1963) observed that
more experienced teleoperators used an "open-loop" control pattern, inputting tasks anticipating
the lag, then waiting for the tasks to complete before issuing new commands. Using this strategy
prevents the error in one task from overly affecting the next task, thus reducing the overall number
of errors and the time required to complete the tasks at hand. However, it still requires much more time
than tasks without latency. This has important repercussions in modeling how operators execute taxi
commands on deck and is discussed in the next section.
A review of the literature regarding these experiments, including the use of remotely teleoperated
ground vehicles, shows that lag is multiplicative in its effects even in the best case: one second of latency translates into much more than one second of additional time. Fifteen different papers (Kalmus
et al., 1960; Sheridan & Ferrell, 1963; Hill, 1976; Starr, 1979; Mar, 1985; Hashimoto et al., 1986;
Bejczy & Kim, 1990; Conway et al., 1990; Arthur et al., 1993; Carr et al., 1993; MacKenzie & Ware,
1993; Hu et al., 2000; Lane et al., 2002; Sheik-Nainar et al., 2005; Lum et al., 2009), totaling 56
different experimental tests, show that the percentage change in task completion time per second of
lag ranges from 16% to 330%, with a mean of 125% and a median of 97%. Thus, at the median,
one second of lag results in a near doubling of the time to complete a task (a 100% increase); two
seconds of lag equals a 200% increase. These tests addressed both cases where operators were
directly observing the system they were controlling and cases where they received information via a video or
computer display.
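As a rough illustration of how such a latency penalty could enter a task-duration model, the sketch below applies the median multiplier quoted above (roughly a 97% increase per second of lag) to a nominal task time; the function and parameter names are assumptions for the example, not the MASCS implementation.

def latency_adjusted_duration(nominal_seconds, latency_seconds, penalty_per_second=0.97):
    """Scale a nominal task duration by the per-second-of-lag penalty.

    With the literature's median penalty of ~97% per second of lag, a 10-second task
    under 1.6 seconds of latency grows to roughly 10 * (1 + 0.97 * 1.6) = 25.5 seconds.
    """
    return nominal_seconds * (1.0 + penalty_per_second * latency_seconds)

# Example: a 10-second taxi segment under the 1.6-second latency assumed for Remote Teleoperation
print(latency_adjusted_duration(10.0, 1.6))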
Remotely teleoperated vehicles are still expected to operate in a zone coverage routing scheme,
with Aircraft Directors using hand gestures to provide instructions that are captured by vehicle cameras
and relayed to the remote teleoperators. The ability of UAV operators to detect Directors is affected
by the limited field of view of the onboard cameras, just as in local teleoperation, and they are still
expected to properly comprehend instructions when provided. Their ability to execute those actions
is affected by the signal latency; taxi speeds are likely to be greatly reduced, and high-accuracy
alignment tasks (aligning to the catapult) are likely to suffer much higher error rates and take more
time. Furthermore, the "open-loop" nature of control means that operators will likely not be perfect in
adhering to Director instructions as to when to stop taxiing or turning.
I.2.1 Remote Teleoperation Safety Implications
Just as with manual control and local teleoperation, safety in remote teleoperation is provided by
a combination of the zone coverage routing structure, the planning of aircraft assignments, and
reactive collision avoidance (Dynamic Separation) on the part of the UAV operator. However, this
latter property will be affected by the limited view field of the cameras, the screen on the GCS, and
also signal latency, making it even less effective than local teleoperation systems. The existence of
signal latency also means that vehicles will have trouble stopping in Dynamic Separation events,
and the likelihood of collisions in congested traffic areas will be greatly increased.
The performance of Remote Teleoperation safety protocols will be quite similar to that of Local
Teleoperation — the same concerns about operations apply to all phases. However, because of
the existence of latency in the signal, the performance of Remote Teleoperation along all measures
(productivity and safety) is expected to be worse than Local Teleoperation and worst out of all
available control architectures.
I.3 Gestural Control (advanced teleoperation)
The third control architecture, Gestural Control (GC), places the operator in the physical environment near the unmanned vehicle they are controlling. Here, control inputs are communicated via
hand and arm movements, providing basic action commands such as "drive forward" to the unmanned
vehicles. On the aircraft carrier flight deck, this means the elimination of the pilot/operator altogether: the Aircraft Directors on deck now assume direct control over the unmanned vehicles.
Gestural control UAVs would be designed to accept the same hand signals as human pilots, meaning that this would be a relatively simple transition in operations. These vehicles would
again be routed in zone coverages, and the evolution of operations on the flight deck would look very
much like that of manual control operations. However, the ability of vehicles to detect Directors,
comprehend their commands, and execute tasks will not be as reliable as that of human pilots.
Gestural-control systems rely on stereo cameras to detect visual and depth information about
the world. Onboard processing uses this information to develop a model of the world, denoting
what objects should be monitored, before tracking and interpreting their gestures. A review of
recent research shows many systems with very high accuracy rates for data sets involving only
a small number (less than 20) of simple actions in relatively simple, "clean" (very little clutter)
environments. For larger action sets, involving more complex hand and arm gestures, accuracy is
still fairly high, but often well below 100%. Over all of the literature reviewed, systems display a
mean failure rate of 11.16% and a median rate of 8.93% for mostly simple, clean data sets. However,
Song et al. (Song et al., 2012) provide an example of a system built specifically for the aircraft carrier
flight deck environment. Their most recent results achieve almost 90% accuracy for a limited set
of gestures and require 3 seconds of processing time to register commands. This system utilizes a
Bumblebee stereo camera, which has optional fields of view of 43°, 65°, or 100°.
Due to the nature of the data sets used most broadly in the research, it is difficult to define
exactly how well these systems would perform on the flight deck. A variety of issues, such as sun glare
off of the ocean, multiple similarly-dressed individuals, a continually changing background, and the
variability of the humans providing gestures, can substantially affect the systems' ability to detect
operators and comprehend the instructions they provide. Song et al.'s current 10% failure rate in
task comprehension would likely increase substantially when expanded to the full vocabulary of tasks (over
100 distinct commands) and placed in the real environment. Given this, maintaining a failure rate
equivalent to the mean error rate of 11.16% may be permissible, given certain safety features on
the vehicles. Typically, these systems include a default "stop" command, such that if the tracked
individual providing commands is lost, or a match for the gesture is not found, the vehicle comes
to a complete halt. Modeling in MASCS makes the assumption that this is a permissible error
rate for Director and vehicle detection (within the vehicle's field of view) and task comprehension.
For each case, a failure to detect or comprehend results in a penalty of 10 seconds, modeling the time
required for a Director to recognize the failure and re-issue the command or be recognized by the
system.
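The following sketch shows one way such a failure-and-penalty process could be drawn in a simulation, using the 11.16% failure rate and 10-second penalty described above; the function name, the fixed 3-second processing time per attempt, and the accumulation of repeated failures are illustrative assumptions.

import random

def gesture_command_delay(failure_rate=0.1116, penalty_seconds=10.0, processing_seconds=3.0):
    """Return the extra time (seconds) added to one gestural command.

    Each attempt takes the nominal processing time; a failed recognition adds the
    10-second re-issue penalty and forces another attempt.
    """
    delay = 0.0
    while True:
        delay += processing_seconds
        if random.random() >= failure_rate:
            return delay            # command recognized
        delay += penalty_seconds    # Director recognizes the failure and re-issues the command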
I.3.1 Gesture Control Safety Implications
Safety in Gestural Control, just as in local and remote teleoperation, is again predicated upon the
orderly assignment of aircraft tasks, the structure of zone coverage routing and traffic management,
and, in this case, the ability of the vehicles themselves to monitor nearby activity and avoid collisions
(Dynamic Separation). Just as in the teleoperation conditions, the ability of vehicles to do this is
degraded by the failure rate of recognition and the required processing time. An incoming vehicle
may be fully within the field of view, but if the vehicle fails to recognize it as a vehicle or takes too
long to process, it will fail to stop and a collision might still occur.
In terms of safety, it is unclear which safety protocols perform best for these systems.
Gesture-controlled vehicles, because of the default stop command, should be better able to taxi in
high-traffic areas, but with some cost to productivity (they will likely stop more often due to the sheer
number of things in their fields of view). However, as with the teleoperation systems, reducing the
density of vehicles on the deck by moving manned aircraft early in the mission (through Temporal
Separation or Area+Temporal) should increase both productivity and safety in the system. Temporal
Separation should also provide the best performance in productivity, moving manned aircraft first
and allowing UAVs more open space to operate on the flight deck. However, it is unclear which of
the remaining protocols will provide the worst overall performance.
I.4 Human Supervisory Control Systems
The last two forms of control architectures are variations on more modern "human supervisory
control" systems. In both cases, vehicles contain advanced autonomous capabilities and only require
abstract task descriptions, such as "go to this area and monitor for any activity," from the operator.
The vehicles include sufficient guidance, navigation, and control systems onboard to
understand where they are in the world, where their final destination is, and how to travel to that
destination, and to be capable of executing tasks upon arrival. This typically requires a suite of sensors
(GPS, inertial measurement units, and various cameras and range detection systems) to acquire data
for the logic systems onboard.

For these systems, a human "Supervisor" manages all vehicles through a centrally networked
computer system, meaning that Directors on deck are no longer responsible for communicating
instructions to vehicles. Information on the status of all vehicles is fed to the central system, where
the operator decides what tasks vehicles should perform at what times. In some cases, the operator
assigns tasks individually to each vehicle; in others, the operator may replan all tasks at once
(likely with the help of an intelligent planning algorithm). The latter may also include a path
planning algorithm that will attempt to pre-plan all vehicle paths to minimize the likelihood of
collisions or traffic congestion. The autonomy embedded in the vehicle is responsible for resolving
the abstract tasks provided by the Supervisor into low-level control inputs for the throttle, steering,
and other mechanisms. These vehicles would also contain reactive collision avoidance mechanisms
(Dynamic Separation systems) in case a crew member or other vehicle crosses their path
without warning. While cameras might be used for a variety of sensing, it is probable that all
crew and vehicles would also be equipped with GPS beacons and other sensors that broadcast
their location to the autonomous vehicle network, ensuring that they are seen and accounted for
before a collision. Despite these advanced capabilities, due to the environmental conditions on deck
(pitching, rolling, etc.) it is still likely that vehicles will have difficulty in high-precision tasks like
aligning to the catapult and navigating through close traffic. While the advanced autonomy does
not require crew on deck to provide tasks, crew may be required as safety escorts to monitor the
movement of the vehicles and help correct any errors in catapult alignment. This would require fewer
crew than standard manual control operations and, in combination with the advanced networked
planning, would allow unmanned vehicles to travel faster than in other control architectures. Speeds
equivalent to manual control cases are not unreasonable.
As noted above, these systems take one of two different forms. In the first (Vehicle-based Human
Supervisory Control, VH), the operator provides tasks to vehicles individually, addressing them in
series. Once a vehicle finishes its tasks, it must wait for the operator to provide new instructions.
Servicing each vehicle takes some variable amount of time dependent on the operator's current
workload, how well they are aware of the state of the system, and what tasks are being issued. Past
research has shown that operators can control up to 8 ground vehicles or 12 aircraft (Cummings
& Mitchell, 2008), suitable for a flight deck where no more than 8-12 vehicles are taxiing at any
given time. Previous work by Nehme (Nehme, 2009) modeled human operators performing a variety
of tasks, with values ranging from averages of 1-2 seconds to 25 seconds for various types of tasks
ranging from defining waypoints to verifying sensor data. In MASCS, waypoint definition is a
major responsibility of the Supervisor in Vehicle-Based Human Supervisory Control (VBSC), and
the Supervisor is responsible not only for assigning waypoints but also for defining vehicle paths on
the flight deck.
In system-based operations (SH), the operator uses a planning algorithm to replan tasks for
all vehicles simultaneously. SH systems also require the existence of a path planning agent to
handle the deconfliction of vehicle paths on the flight deck. The planning of tasks must happen
once at the beginning of the mission, and because schedules degrade over time due to uncertainty
in the world, may happen periodically throughout. Each time replanning occurs, it takes some
time to accomplish, during which vehicles may be waiting for instructions. This replanning may
also significantly change prior instructions, rerouting aircraft to new destinations. Such systems
typically assign a queue of tasks to a single vehicle for execution over a period of time, allowing the
operator to focus on monitoring the performance of the system as a whole, rather than individual
vehicles. In past work with the Deck operations Course of Action Planner (DCAP) system built for
carrier deck planning (Ryan et al., 2011; Ryan, 2011), the time required to replan ranged from 5 to
20 seconds, with replans only occurring in the event of failures (once per 20 minute mission). In a
similar system termed OPS-USERS (Fisher, 2008; Cummings, Clare, & Hart, 2010; Clare, 2013),
replanning times ranged from 2 to 15 seconds. Replans should not occur after the last vehicle is
allocated, however, as the cost of making changes outweighs the benefits.
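A simplified sketch of how these two interaction patterns could be represented in a discrete-event model appears below; the uniform distributions (drawn from the ranges cited above), the function names, and the example comparison are assumptions for illustration only.

import random

def vbsc_service_time():
    """Time (seconds) for the Supervisor to service one vehicle request in a vehicle-based
    scheme, drawn here uniformly from the 1-25 second range reported by Nehme (2009)."""
    return random.uniform(1.0, 25.0)

def sbsc_replan_time():
    """Time (seconds) for a system-wide replan in a system-based scheme, drawn here
    uniformly from the 5-20 second range observed with the DCAP planner."""
    return random.uniform(5.0, 20.0)

# Example: waiting overhead for 12 vehicles under each scheme
vbsc_overhead = sum(vbsc_service_time() for _ in range(12))   # one service event per vehicle
sbsc_overhead = sbsc_replan_time()                            # a single replan covers all vehicles
print(round(vbsc_overhead, 1), round(sbsc_overhead, 1))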
In both cases, vehicles do not engage in visual detection as in the previous control architectures;
they receive their tasks from the central network through direct communication and translation. This
also means that task comprehension does not occur in a fashion similar to the other architectures and
should be more reliable. For both of these elements, failure rates are not applicable to the system.
However, vehicles may still have issues in executing tasks. The difficulties affecting Gestural Control
automation in task execution are the same for the vehicle- and system-based systems, and failure rates should be the same.
I.4.1 Human Supervisory Control (HSC) Safety Implications
For both of these systems, Dynamic Separation is accomplished through collision avoidance systems
onboard the vehicles that utilize both local sensor information and data from the network to monitor
the locations of nearby vehicles. For avoiding other networked vehicles, these systems should be
flawless in their execution, performing as well as human operators in navigating the deck.

In terms of safety protocol performance, both VBSC and System-Based Human Supervisory
Control (SBSC) systems should perform better than all previous control architectures, given their
higher expected reliability and improved autonomy. Because of this performance, it is unclear whether
any one safety protocol will generate better performance than another in terms of either productivity
or safety. These systems are the closest, in terms of performance, to manually controlled vehicles and
do not have the same limitations and errors as the other systems. While some variation in performance
would not be surprising, it is unclear what these variations will be.
Appendix J
UAV Verification
Chapter 5 described the development of the five Control Architecture models and included results
showing the differences in performance of the full models of each. This appendix provides additional
data showing the effects of introducing individual elements to each of the models. As noted previously
in Figure 5-5, creating the UAV control architecture models required modifying parameters in, or
introducing new parameters to, the baseline Manual Control model. For some control architectures
this required few changes, while others required more substantial changes either to the models or to
the structure of operations. For each model, parameters were introduced individually and then tested
using the single aircraft calibration scenario, with 300 replications being performed. This appendix
presents the data for each of these individual vehicle models to highlight how changes in parameters
affected the performance of individual vehicle models.
For Local Teleoperation (LT), the two major changes were to slightly restrict the field of view
from 85° to 65° and to increase the rate of catapult alignment failures from 0% to 5%. The results
appear in Figure J-1, which plots three lines: one showing the results of the baseline Manual
Control model (gray), one showing the empirical observation used in calibration testing (orange), and
a third for the UAV model (blue), which includes whiskers for standard deviation. The LT model
has two data points: the first point, labeled "LT +FOV," denotes that the first variable changed
was the field of view of the aircraft. The next point, "LT +CatFail," adds the 5% chance to fail at
catapult alignment. As can be seen from the chart, neither of these changes makes much difference
to the results, which are quite similar to the original Manual Control (MC) baseline.
Figure J-1: Mean launch event duration time for Local Teleoperation (LT) control architecture with
incremental introduction of features. Whiskers indicate ±1 standard deviation.

Figure J-2 shows similar data for the Remote Teleoperation (RT) model. The first change here
introduced both the FOV and catapult alignment failure changes, given the lack of effect of field of view
on its own for the LT model. The second change introduced the 1.6 second latency at the start of
each commanded task, with the third introducing modifications to the end of tasks due to operator
behavior. For this model, the introduction of the initial latency has the greatest effect on results,
increasing mission duration by 13 seconds on average (216.21 to 228.96 seconds). However, data for
each case are still within one standard deviation of the original MC baseline.
Data for the Gestural Control (GC) model are shown in Figure J-3, with the first model again beginning
with changes to the field of view and catapult alignment failure rate. The second data point includes
the effects of failing to recognize a Director and the associated time penalties. This appears to
have no effect on results. The third model introduces the effects of failing to recognize commands,
which could occur multiple times in series with time penalties accruing for each. This clearly shows a
significant change in operations, increasing the average launch event duration from 200.65 seconds
to 221.66 seconds. The final model adds further latency into the system, modeling the processing
time of the GC vehicles (uniform, 0-3 seconds). This has even further effects, increasing the average
launch event duration to 248.62 seconds and moving nearly a full standard deviation away from the
MC baseline.
Introducing the VH model required not only changes to vehicle parameters but also changes to
their interactions with the crew. The first model in Figure J-4 removes the Director crew from
operations; vehicles are no longer constrained to the zone coverage routing system. These results
suggest that the removal of the crew may provide a slight benefit to operations, as the mean time to
complete the launch event is smaller than the MC baseline (196.38 seconds vs. 209.99 seconds). The
second model sets the rate of catapult alignment failures to 2.5%, again with only minor effects. The third model
introduces the Handler queueing system, in which vehicles must repeatedly request instructions
from the Handler. This results in a substantial increase in launch event duration, increasing the
average time from 198.39 seconds to 229.31 seconds, and has an effect similar to that of the GC
processing latency and the RT communications latency.

Figure J-2: Mean launch event duration time for Remote Teleoperation (RT) control architecture
with incremental introduction of features. Whiskers indicate ±1 standard deviation.

Figure J-3: Mean launch event duration time for Gestural Control (GC) control architecture with
incremental introduction of features. Whiskers indicate ±1 standard deviation.
Figure J-4: Mean launch event duration time for Vehicle-Based Human Supervisory Control (VBSC)
control architecture with incremental introduction of features. Whiskers indicate ±1 standard deviation.
The System-Based Human Supervisory Control (SBSC) model was the final model to be introduced
and requires only two real changes. The first model (Figure J-5) removes the Director crew from
operations and eliminates the use of the zone coverage routing scheme. This, as for the VH model,
seems to provide a slight benefit to the speed of operations, lowering the mean launch event duration
to 193.87 seconds. The second model increases the rate of catapult alignment failures to 2.5%, again
with very little difference in operations. However, because no latencies are introduced for this
model, it retains slightly better performance than the MC case in its full implementation.
Figure J-6 plots the data from the previous figures on the same axis, providing a comparison
of performance across all of the models. In general, a few conclusions can be drawn across these
cases. First, the modeled changes in field of view and catapult failures appear to have no real effect
on operations. Field of view results in only a minor change in how Aircraft Director crew align to
unmanned aircraft during operations; catapult alignment failures occur at a fairly low rate, and the
time penalties associated with these failures are on the order of the standard deviation of the launch
preparation time process. Additionally, these results suggest that latency in operations, be it from
communications, software processing, human interaction, or in the form of repeated failures with
time penalties, is quite detrimental to operations. The reason for this can be seen in Figure J-7,
which plots the taxi duration component of the launch event. Within this figure it can be seen
that significant differences occur between taxi times within the various models. Because the launch
preparation and launch acceleration time models do not vary for the unmanned vehicle control
architectures, changes in taxi time are the primary source of variation here. This has important
ramifications for mission testing, where delays in taxi operations may cause significant disruptions
to the flow of traffic on deck and, in the aggregate, severely hamper mission operations.

Figure J-5: Mean launch event duration time for System-Based Human Supervisory Control (SBSC)
control architecture with incremental introduction of features. Whiskers indicate ±1 standard deviation.
Figure J-6: Mean launch event duration time for all control architectures with incremental introduction of features. Whiskers indicate ±1 standard deviation.
Figure J-7: Mean taxi duration time for Local Teleoperation (LT) control architecture with incremental introduction of features. Whiskers indicate ±1 standard deviation.
Appendix K
Launch Event Duration (LD) Results
The time required to complete a launch event served not only as an important validation metric for
the baseline manual control Multi-Agent Safety and Control Simulation (MASCS) simulation but
is also the most critical evaluation metric for Subject Matter Experts (SMEs) in carrier operations.
Chapter 5 provided a summary of the results of testing, indicating that few meaningful differences in performance were found between Control Architectures, Safety Protocols, mission settings, and
interactions of each. This section provides additional detail on the tests that were performed.
K.1 Effects due to Safety Protocols
Figure K-1 provides boxplots of Launch Duration values according to Safety Protocol (SP). Presented in this fashion, the expected detrimental effect of the Temporal+Area Separation (T+A)
safety protocol is clearly seen. If these results are separated based on mission UAV Composition
(25% or 50%), the percent increase in Launch event Duration (LD) for the T+A cases as compared to the DS only baseline averaged 38.89% for the 50% Composition level and 57.45% for the
25% level. It also appears that the Temporal safety protocol produces slightly higher LD values
than the Area and DS only cases. Applying a Kruskal-Wallis nonparametric one-way analysis of
variance to the Area, Temporal, and DS only safety protocols (ignoring the T+A case) reveals significant differences at both the 22 aircraft level (χ2 (2) = 34.2927, p < 0.0001) and 34 aircraft levels
(χ2 (2) = 100.6313, p < 0.0001). Post-hoc Steel-Dwass multiple comparisons tests reveal that at the
22 aircraft level, the Temporal Separation protocol (mean rank = 629.62) is different from both Area
(497.488) and Dynamic Separation (511.682) protocols, with the latter two not significantly different
from one another. At the 34 aircraft level, significant differences exist for all pairwise comparisons
between the three protocols. This fails to confirm the hypothesis regarding Temporal protocol performance: the Temporal protocol was expected to have equivalent performance to the DS only safety
protocol, given that it preserved the same flexibility in planning operations. Area was expected to
perform more poorly given the constraints it places on the assignment of vehicles. These results suggest quite the opposite, that the preservation of the flow of vehicles on the deck (which is unchanged
under the Area protocol) is more beneficial than preserving flexibility in catapult assignments.
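For reference, the omnibus comparison reported above can be reproduced with standard open-source tools. The sketch below is a minimal illustration assuming LD samples grouped by protocol; it uses SciPy's Kruskal-Wallis test plus Bonferroni-corrected pairwise Mann-Whitney tests as a stand-in for the JMP Steel-Dwass procedure used in this work.

    from itertools import combinations
    from scipy import stats

    def compare_safety_protocols(ld_by_sp):
        # ld_by_sp: dict mapping protocol label (e.g. "DS", "A", "T") to a
        # list of launch event durations in minutes (hypothetical input).
        labels = list(ld_by_sp)
        h, p = stats.kruskal(*(ld_by_sp[k] for k in labels))
        print(f"Kruskal-Wallis: H = {h:.4f}, p = {p:.4g}")
        pairs = list(combinations(labels, 2))
        for a, b in pairs:
            raw_p = stats.mannwhitneyu(ld_by_sp[a], ld_by_sp[b]).pvalue
            # Bonferroni correction across all pairwise comparisons.
            print(a, "vs", b, "adjusted p =", min(1.0, raw_p * len(pairs)))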
K.2 Effects due to Control Architectures
Figure K-2 contains a second set of boxplots, now comparing LD values across Control Architectures,
excluding the T+A cases and their substantial effects. From this figure it can be seen that while
some fluctuations in mean LD do occur across architectures, there do not seem to be any large
deviations within the set. However, Kruskal-Wallis non-parametric one-way analysis of variance
tests using Control Architecture (CA) as the grouping variable show that significant differences
exist at the 22 aircraft level (χ2 (5) = 37.7272, p < 0.0001) but not at the 34 aircraft level (χ2 (5) =
5.8157, p = 0.3246). A post-hoc Steel-Dwass non-parametric simultaneous comparisons test for the
22 aircraft data reveals that significant pairwise differences exist between the System-Based Human
Supervisory Control (SBSC) (mean rank = 438.6) case and each of the Remote Teleoperation (RT)
(605.7), Gestural Control (GC) (582.2), and Vehicle-Based Human Supervisory Control (VBSC)
(562.0) cases. The Local Teleoperation (LT) (mean rank = 515.2) and RT cases are also shown to
be significantly different from one another, but no other comparisons result in significance. However,
in practical terms, the differences are quite small: the three largest differences in mean LD occur
between SBSC (µ = 24.27 minutes) and RT (µ = 25.50), GC (µ = 25.33 minutes), and VBSC
(µ = 25.30), providing differences of 1.23, 1.06, and 1.03 minutes, respectively.
These results only partly support the hypotheses regarding CA performance: the SBSC case
does trend towards better and the GC towards worse performance at the 22 aircraft level, but no
differences appear at the 34 aircraft level. However, the hypotheses did not expect significantly
poor performance for the RT and VBSC cases, as the delays generated in their operations were not expected to be as severe. Additionally, the lack of variation at the 34 aircraft level suggests that the changes in deck traffic flow that occur at higher numbers of aircraft (the obstruction of taxi routes and the temporary loss of the fourth catapult) may have a substantial dampening effect on operations.
Figure K-1: Boxplots of Launch event Duration (LD) for each Safety Protocol at the 22 (left) and 34 (right) aircraft Density levels.
Figure K-2: Boxplots of Launch event Duration (LD) for each Control Architecture at the 22 (left) and 34 (right) aircraft Density levels, with Temporal+Area safety protocols excluded.
Interestingly, these conclusions are not supported when examining only the homogeneous cases (where Composition = 100% and all vehicles within the mission are identical). Boxplots of these results appear in Figure K-3, with little visible variation in results. This is supported by statistical tests across each of these two sets of data: tests at the 22 aircraft level (ANOVA, F (5, 174) = 1.906, p = 0.096) and the 34 aircraft level (Kruskal-Wallis, χ2 (5) = 4.399, p = 0.493) reveal no significant differences amongst the data at the α = 0.05 level. This is an interesting result given the differences in the omnibus test on CA in the preceding paragraph and implies that the significant differences in the earlier test on CA must be due to interactions of the Control Architectures with the other settings.
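The analysis pattern used throughout Appendix L (a Levene test for equal variances, followed by either a parametric ANOVA or a non-parametric Kruskal-Wallis test) can be sketched as follows. This is a simplified illustration; the group lists and alpha threshold are assumptions, and the actual analyses were run in SPSS and JMP PRO 10.

    from scipy import stats

    def omnibus_ld_test(groups, alpha=0.05):
        # groups: list of lists of LD values, one list per Control Architecture.
        # Check the equal-variance assumption first, then choose the omnibus test.
        _, levene_p = stats.levene(*groups)
        if levene_p >= alpha:
            f_stat, p = stats.f_oneway(*groups)   # one-way ANOVA
            return "ANOVA", f_stat, p
        h_stat, p = stats.kruskal(*groups)        # non-parametric fallback
        return "Kruskal-Wallis", h_stat, p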
K.3 Effects due to Interactions
In order to better identify the root causes of these variations, another series of statistical tests was applied for each Density value (22 or 34 aircraft) using the full treatment specification (CA +
Composition + SP) as the grouping variable. Figure K-4 provides a column chart of these results,
with Table K.1 providing the means and standard deviations of each case. These data and the
statistical tests exclude the T+A cases, however, as their effects are already shown to be significant.
This results in one-way analyses of variance for 36 cases within each Density setting. Kruskal-Wallis
tests showed significant differences in the data at both the 22 aircraft (χ2 (35) = 108.999, p < 0.0001)
and 34 aircraft levels (χ2 (35) = 176.264, p < 0.0001). A post-hoc Steel-Dwass non-parametric
multiple comparisons test at the 22 aircraft level reveals seventeen significant pairwise comparisons
that all show tests using the SBSC aircraft to be significantly better than cases using the RT, GC, or VBSC aircraft. Most often, these comparisons involve the SBSC 25% DS only, 25% Area, and 100% DS only cases compared to the RT, GC, or VBSC 25% Temporal or 50% Temporal cases. A Steel-Dwass multiple comparisons test at the 34 aircraft Density level returns fifty-six significantly different pairwise comparisons, all but ten of which involve the 50% Temporal protocol applied to a Control Architecture (a full list of the significant pairwise differences can be found in Appendix L). Thirty-five of these forty-six 50% Temporal protocol comparisons involve the use of the RT and VBSC systems, with the remainder spread across the other UAV control architectures. Of the ten that do not include the 50% Temporal protocol, seven involve the 25% Temporal protocol.
Figure K-3: Boxplots of Launch event Duration (LD) for each Control Architecture at the 100% Composition settings for 22 (left) and 34 (right) aircraft Density levels.
These results suggest that the use of the 25% and 50% Temporal protocols is creating a significant detrimental effect in operations, with especially adverse interactions with the three “delayed” UAV cases of RT, GC, and VBSC. These performance variations are likely the result of changes in the manner in which aircraft are routed on the flight deck. The Deck Handler
heuristics, in the abstract, attempt to keep aircraft routed in an aft-to-forward direction, which in
turn means limiting the assignment of forward-parked aircraft to aft catapults. The design of the Temporal Separation protocol forces the Handler to make just these assignments, as all manned aircraft are
parked forward and must launch before UAVs begin operations. Under the Area Separation protocol,
even though the flexibility of aircraft assignments is limited by the division of the deck, aircraft
parked forward are still most often assigned to forward catapults, preserving this flow of traffic.
Furthermore, in light of the significant main effects of control architectures observed earlier across all cases and the distinct lack of differences in the homogeneous cases, it is likely that Temporal protocol performance is the main driver behind the differences observed across control architectures. More specifically, the interactions of the Temporal protocol with the “delayed” UAV types are substantial enough to create a main effect in that data.
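The full treatment grouping described above can be formed by concatenating the three independent variables into a single label and testing within each Density level. A minimal sketch, assuming a pandas DataFrame with hypothetical column names 'Density', 'CA', 'Composition', 'SP', and 'LD':

    import pandas as pd
    from scipy import stats

    def full_treatment_tests(df: pd.DataFrame):
        # Exclude the Temporal+Area cases, whose effect is already established.
        df = df[df["SP"] != "T+A"].copy()
        # Full treatment specification: CA + Composition + SP as one label.
        df["treatment"] = (df["CA"].astype(str) + " "
                           + df["Composition"].astype(str) + " "
                           + df["SP"].astype(str))
        results = {}
        for density, sub in df.groupby("Density"):
            groups = [g["LD"].to_numpy() for _, g in sub.groupby("treatment")]
            results[density] = stats.kruskal(*groups)   # (H statistic, p-value)
        return results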
Table K.1: Descriptive statistics for SP*Composition settings across all Control Architectures (CAs) at the 22 aircraft Density level.

Level                           Mean    Std Dev   Std Err Mean   Lower 95%   Upper 95%
100% Dynamic Separation Only    24.90   2.11      0.16           24.59       25.21
25% Dynamic Separation only     24.82   2.12      0.17           24.47       25.16
50% Dynamic Separation          24.75   1.94      0.16           24.44       25.06
25% Area Separation             24.42   2.24      0.18           24.06       24.78
50% Area Separation             25.09   2.55      0.21           24.68       25.50
25% Temporal Separation         25.38   2.24      0.18           25.02       25.74
50% Temporal Separation         26.07   2.24      0.18           25.71       26.43
25% Temporal+Area               38.63   3.33      0.27           38.09       39.17
50% Temporal+Area               33.70   3.85      0.31           33.08       34.32
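The columns of Table K.1 follow directly from the per-case LD samples; a minimal sketch of the computation (the confidence bounds are t-based intervals on the mean) is shown below.

    import numpy as np
    from scipy import stats

    def describe_ld(ld_values, confidence=0.95):
        # Mean, standard deviation, standard error of the mean, and the
        # lower/upper confidence bounds on the mean for one SP*Composition level.
        x = np.asarray(ld_values, dtype=float)
        mean = x.mean()
        sd = x.std(ddof=1)
        se = sd / np.sqrt(x.size)
        t_crit = stats.t.ppf(0.5 + confidence / 2.0, df=x.size - 1)
        return mean, sd, se, mean - t_crit * se, mean + t_crit * se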
Figure K-4: Column chart comparing Launch event Duration (LD) values across all Control Architectures (CAs) sorted by Safety Protocol (SP)-Composition pairs. Whiskers indicate ±1 standard
error.
Table K.2: Results of ANOVAs for LD results across control architectures for each SP*Composition setting.

SP*COMP        DF1   DF2   F        p value   Tukey results
100% DS        5     174   1.906    0.0956    n/a
25% DS         4     145   2.4906   0.0458*   RT v. SH, p=0.0503
50% DS         4     145   3.9162   0.0048*   RT v. SH, p=0.0027
25% Area       4     145   3.3537   0.0117*   GC v. SH, p=0.0106; VBSC v. SH, p=0.0285
50% Area       4     145   0.9684   0.4268    n/a
25% Temporal   4     145   2.6199   0.0374*   None
50% Temporal   4     145   2.0286   0.0934    n/a
K.4 LD Effects Summary
This section has reviewed the effects of Control Architectures (CAs), Safety Protocols (SPs), mission Density, and mission Composition on Launch event Duration, the primary productivity metric with which Subject Matter Experts are concerned. These data have demonstrated that significant effects
do exist between Control Architectures (CAs) and between Safety Protocols (SPs), but that these
differences are largely due to interactions of CA with the other independent variables. It was earlier
hypothesized that the choice of CA would provide a main effect in results, with the GC case being the
worst and SBSC being the best. While data did demonstrate a significant difference between these
two cases, further tests at finer resolutions indicated that these effects are more likely due to adverse
interactions of the GC setting with the Temporal Separation protocol using 50% Unmanned Aerial
Vehicle (UAV) Composition. Adverse interactions were also observed between the 50% Temporal
Separation setting for the RT and VBSC control architectures, which was not expected. In fact, at the 34 aircraft Density setting, significant effects were observed for the 50% Temporal case across all control architectures, and this case was the primary driver of variations at that level. No real main effects were observed for control architectures at the 34 aircraft level.
These results also fail to confirm the earlier hypothesis regarding the performance of the Safety
Protocols in relation to the Launch event Duration measure, which presumed that the Temporal
protocol, preserving flexibility in assignments, would outperform the T+A and Area protocol cases.
In fact, the reverse was proven true in the case of the Area protocol: preserving the flow of traffic on deck is of greater importance than flexibility in assignments. Additionally, results also
suggest that the structure of operations forms a significant constraint on the performance of the UAV
control architectures, as the effects at the 22 aircraft level often differed from the effects at the 34
aircraft level. Together with the effects of the Safety Protocol, these results suggest that preserving
a general aft-to-forward flow of traffic on the flight deck is the most important feature in operations.
Disruptions to this process, either due to delays on the part of UAV Control Architectures or to Safety Protocols, drive performance on these measures. Perhaps unsurprisingly, the three CAs that continued to show poor performance — RT, GC, and VBSC — are the three cases that involve
some sort of latency in operations. Even though the latencies were thought to have only a minimal
effect on operations, they appear to be substantial enough to fully disrupt the flow of traffic on
the flight deck. These disruptions also appear to be exacerbated by interactions with the Temporal
Separation safety protocol. This may be due to the fact that the Temporal protocol forces all UAVs
to be active at the same time; under the Area protocol, deck activity remains roughly split between
manned and unmanned.
Even so, when missions used only one type of aircraft (the 100% Composition setting), no significant differences in performance were observed. Even though the use of more of the “delayed” UAV cases might be thought to further compound the delays on the flight deck, allowing the planning system to retain its original flexibility and flow seems to mitigate the detrimental effects of the “delayed” UAV systems. However, observations of simulation execution do note significant
disruptions in taxi operations for these control architectures; in many cases, vehicles are arriving
at catapults just as the previous aircraft is launching. Although there is an effect, it is not felt
because the vehicle arrives in time (just barely) to prepare for its own launch. Reducing the launch
preparation time mean may break this ability to queue at catapults and lead to significant effects
at the 100% Composition levels (the effects of which will be explored in Chapter 6). In general,
however, these results suggest that the different control architectures and safety protocols do have
significant interactions with the structure of flight deck operations, at least in terms of the time
to complete launch events. The next section examines the performance of the MASCS simulation
in terms of the Safety metrics defined previously in Section 5.3, building Pareto frontiers of safety
metrics vs. LD and examining how the effects of CA and SP discussed in this section affect the level
of risk in flight deck operations.
Appendix L
Statistical Tests on Launch Event Duration (LD)
Figure L-1: SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus SP, with
T+A safety protocols excluded.
Figure L-2: SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 22
aircraft cases versus SP, with T+A safety protocols excluded.
Table L.1: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for 22 aircraft cases versus SP, with T+A safety protocols excluded.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
T       DS        87.37083                16.58243      5.268879   <0.0001
T       A         70.7                    14.15391      4.995085   <0.0001
DS      A         12.44479                16.58243      0.75048    0.7333
Figure L-3: SPSS output of Levene Test of Equal Variances for the 34 aircraft cases versus SP, with
T+A safety protocols excluded.
Figure L-4: SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 34
aircraft cases versus SP, with T+A safety protocols excluded.
Table L.2: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for 34 aircraft cases versus SP, with T+A safety protocols excluded.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
T       A         135.5767                14.15392      9.578739   <0.0001
T       DS        131.5627                16.58243      7.933861   <0.0001
DS      A         39.7692                 16.58243      2.398271   0.0435
Table L.3: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for 22 aircraft cases versus CA, with T+A safety protocols excluded.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
SBSC    GC        -57.1429                11.84624      -4.82371   <0.0001
SBSC    RT        -64.6952                11.84624      -5.46125   <0.0001
VBSC    SBSC      46.7238                 11.84624      3.94419    0.0011
RT      LT        35.5286                 11.84624      2.99914    0.0324
Figure L-5: SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA, with
T+A safety protocols excluded.
Figure L-6: SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 22
aircraft cases versus CA, with T+A safety protocols excluded.
Figure L-7: SPSS output of Levene Test of Equal Variances for the 34 aircraft cases versus CA, with
T+A safety protocols excluded.
Figure L-8: SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 34
aircraft cases versus CA, with T+A safety protocols excluded.
Figure L-9: SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA for
Dynamic Separation.
Figure L-10: SPSS output of an ANOVA test for the 22 aircraft cases versus CA for Dynamic
Separation.
Figure L-11: SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA for
Area Separation.
Table L.4: JMP PRO 10 output for Tukey HSD Multiple Comparisons tests for the 22 aircraft cases versus CA for Dynamic Separation.

Level   - Level   Difference   Std Err Dif   Lower CL   Upper CL   p-Value
RT      SBSC      93.04478     17.94067      41.7089    144.3807   <.0001
LT      SBSC      70.27722     17.94067      18.9413    121.6131   0.0014
MC      SBSC      67.49522     25.37194      -5.1047    140.0951   0.0854
GC      SBSC      64.62133     17.94067      13.2855    115.9572   0.0047
VBSC    SBSC      63.28444     17.94067      11.9486    114.6203   0.0061
RT      VBSC      29.76033     17.94067      -21.5755   81.0962    0.5598
RT      GC        28.42344     17.94067      -22.9124   79.7593    0.6094
RT      MC        25.54956     25.37194      -47.0503   98.1495    0.9155
RT      LT        22.76756     17.94067      -28.5683   74.1034    0.8018
LT      VBSC      6.99278      17.94067      -44.3431   58.3287    0.9988
LT      GC        5.65589      17.94067      -45.68     56.9918    0.9996
MC      VBSC      4.21078      25.37194      -68.3891   76.8107    1
MC      GC        2.87389      25.37194      -69.726    75.4738    1
LT      MC        2.782        25.37194      -69.8179   75.3819    1
GC      VBSC      1.33689      17.94067      -49.999    52.6728    1
Figure L-12: SPSS output of an ANOVA test for the 22 aircraft cases versus CA for Area Separation.
Figure L-13: SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus CA for
Temporal Separation.
Figure L-14: SPSS output of an ANOVA test for the 22 aircraft cases versus CA for Temporal
Separation.
Table L.5: JMP PRO 10 output for Tukey HSD Multiple Comparisons tests for the 22 aircraft cases versus CA for Temporal Separation, with T+A safety protocols excluded.

Level   - Level   Difference   Std Err Dif   Lower CL   Upper CL   p-Value
GC T    LT T      75.73367     24.38559      5.8235     145.6438   0.025
GC T    MC DS     74.867       29.86613      -10.7551   160.4891   0.1251
RT T    LT T      70.32617     24.38559      0.416      140.2363   0.0477
RT T    MC DS     69.4595      29.86613      -16.1626   155.0816   0.1868
VH T    LT T      67.619       24.38559      -2.2911    137.5291   0.0645
VH T    MC DS     66.75233     29.86613      -18.8697   152.3744   0.2246
GC T    SH T      59.43167     24.38559      -10.4785   129.3418   0.1467
RT T    SH T      54.02417     24.38559      -15.886    123.9343   0.2335
VH T    SH T      51.317       24.38559      -18.5931   121.2271   0.2876
SH T    LT T      16.302       24.38559      -53.6081   86.2121    0.9852
SH T    MC DS     15.43533     29.86613      -70.1867   101.0574   0.9955
GC T    VH T      8.11467      24.38559      -61.7955   78.0248    0.9995
GC T    RT T      5.4075       24.38559      -64.5026   75.3176    0.9999
RT T    VH T      2.70717      24.38559      -67.203    72.6173    1
MC DS   LT T      0.86667      29.86613      -84.7554   86.4887    1
Figure L-15: SPSS output of Levene Test of Equal Variances for the 22 aircraft cases versus the full
treatment definition, with T+A safety protocols excluded.
Figure L-16: SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 22
aircraft cases versus the full treatment definition, with T+A safety protocols excluded.
Table L.6: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests (significant results only) for 22 aircraft cases versus full treatment definitions, with T+A safety protocols excluded.

Level            - Level          Score Mean Difference   Std Err Dif   Z          p-Value
22 SBSC 25 A     22 GC 50 T       -23.5667                4.50925       -5.22629   <0.0001
22 SBSC 25 A     22 RT 50 T       -22.0333                4.50925       -4.88625   0.0006
22 SBSC 25 A     22 GC 25 T       -21.9                   4.50925       -4.85668   0.0007
22 SBSC 50 T     22 SBSC 25 A     21.0333                 4.50925       4.66449    0.0017
22 SBSC 25 A     22 RT 25 T       -20.3667                4.50925       -4.51664   0.0033
22 SBSC 25 A     22 RT 50 DS      -20.2333                4.50925       -4.48707   0.0038
22 SBSC 25 DS    22 GC 50 T       -19.7                   4.50925       -4.3688    0.0064
22 VBSC 50 T     22 SBSC 25 A     19.4333                 4.50925       4.30966    0.0082
22 SBSC 50 DS    22 GC 50 T       -19.1                   4.509187      -4.2358    0.0111
22 VBSC 25 T     22 SBSC 25 A     18.9667                 4.50925       4.20617    0.0125
22 SBSC 100 DS   22 GC 50 T       -18.9                   4.50925       -4.19138   0.0133
22 SBSC 25 DS    22 RT 50 T       -18.8333                4.50925       -4.1766    0.0141
22 SBSC 100 DS   22 RT 50 T       -18.3                   4.50925       -4.05832   0.0223
22 SBSC 100 DS   22 GC 25 T       -17.7667                4.50925       -3.94005   0.0347
22 SBSC 25 DS    22 GC 25 T       -17.7667                4.50925       -3.94005   0.0347
22 SBSC 50 DS    22 RT 50 T       -17.7                   4.509187      -3.92532   0.0366
22 SBSC 25 A     22 RT 100 DS     -17.5667                4.50925       -3.8957    0.0407
Figure L-17: SPSS output of Levene Test of Equal Variances for the 34 aircraft cases versus the full
treatment definition, with T+A safety protocols excluded.
Figure L-18: SPSS output of non-parametric Kruskal-Wallis one-way test of variances for the 34
aircraft cases versus the full treatment definition, with T+A safety protocols excluded.
Table L.7: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests (significant results only)
for 34 aircraft cases versus full treatment definitions, with T+A safety protocols excluded (part 1).
Level
-Level
Score Mean Difference
Std Err Dif
Z
p-Value
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
27.6333
25.2333
25.2333
24.1
24.0333
23.6333
23.1667
22.7667
22.3667
22.1
21.8333
21.1667
21.1667
20.9
20.9
20.9
20.7
20.5
20.3667
19.8333
19.6333
19.5
19.1667
18.8333
18.7667
18.4333
18.4333
18.3667
18.3
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
6.12814
5.59591
5.59591
5.34457
5.32979
5.24108
5.13759
5.04888
4.96017
4.90104
4.8419
4.69406
4.69406
4.63492
4.63492
4.63492
4.59056
4.54621
4.51664
4.39837
4.35401
4.32444
4.25052
4.1766
4.16182
4.08789
4.08789
4.07311
4.05832
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
0.0002
0.0003
0.0004
0.0005
0.0007
0.0015
0.0015
0.002
0.002
0.002
0.0024
0.0029
0.0033
0.0056
0.0068
0.0077
0.0104
0.0141
0.0149
0.0199
0.0199
0.0211
0.0223
VBSC 50 T
RT 50 T
VBSC 50 T
RT 50 T
VBSC 50 T
VBSC 50 T
VBSC 50 T
RT 50 T
RT 50 T
VBSC 50 T
VBSC 50 T
RT 50 T
VBSC 50 T
RT 50 T
VBSC 50 T
VBSC 50 T
VBSC 50 T
SBSC 25 T
RT 50 T
RT 50 T
VBSC 50 T
VBSC 50 T
VBSC 50 T
SBSC 25 T
VBSC 50 T
LT 50 T
RT 50 T
VBSC 50 T
RT 50 T
VBSC 50 DS
RT 50 A
RT 50 A
LT 25 A
VBSC 25 A
LT 25 A
RT 25 DS
RT 25 DS
MC 100 DS
MC 100 DS
SBSC 25 A
GC 50 DS
GC 50 DS
GC 25 DS
GC 25 DS
VBSC 50 A
GC 50 A
RT 50 A
GC 50 A
RT 50 DS
RT 50 DS
VBSC 25 DS
SBSC 50 A
LT 25 A
LT 50 A
LT 25 A
LT 50 A
RT 25 A
RT 25 A
Table L.8: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests (significant results only)
for 34 aircraft cases versus full treatment definitions, with T+A safety protocols excluded (part 2).
Level
-Level
Score Mean Difference
Std Err Dif
Z
p-Value
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
34
18.2333
18.2333
18.0333
17.9
17.7
-17.5
-17.5667
-17.6333
-17.7667
-17.9
-18.2333
-18.3667
-18.6333
-18.9
-19.3667
-19.3667
-19.5
-19.9
-19.9667
-20.5667
-20.8333
-21.5
-22.9
-23.1
-23.7667
-24.5
-28.2333
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.50925
4.04354
4.04354
3.99919
3.96962
3.92526
-3.88091
-3.8957
-3.91048
-3.94005
-3.96962
-4.04354
-4.07311
-4.13225
-4.19138
-4.29488
-4.29488
-4.32444
-4.41315
-4.42794
-4.561
-4.62013
-4.76798
-5.07845
-5.1228
-5.27065
-5.43328
-6.2612
0.0236
0.0236
0.0279
0.0312
0.0366
0.0429
0.0407
0.0386
0.0347
0.0312
0.0236
0.0211
0.0168
0.0133
0.0087
0.0087
0.0077
0.0053
0.0049
0.0027
0.0021
0.001
0.0002
0.0002
<0.0001
<0.0001
<0.0001
RT 50 T
SBSC 25 T
SBSC 50 T
VBSC 50 T
RT 50 T
RT 25 DS
SBSC 25 A
VBSC 50 DS
VBSC 50 DS
VBSC 25 A
VBSC 50 DS
VBSC 25 A
RT 50 A
VBSC 25 DS
SBSC 50 A
VBSC 25 A
VBSC 50 DS
VBSC 50 DS
RT 50 A
VBSC 50 A
VBSC 50 DS
SBSC 25 A
VBSC 50 DS
VBSC 50 DS
VBSC 50 DS
VBSC 25 A
VBSC 50 DS
LT 25 T
SBSC 25 A
RT 50 A
LT 25 T
LT 50 DS
GC 50 T
LT 50 T
GC 100 DS
SBSC 50 DS
LT 50 T
RT 25 T
GC 50 T
LT 50 T
RT 50 T
RT 50 T
SBSC 25 T
RT 100 DS
GC 25 T
GC 50 T
RT 50 T
SBSC 50 T
RT 50 T
LT 50 T
GC 50 T
SBSC 25 T
RT 50 T
RT 50 T
Appendix M
Analysis of Additional Safety Metrics
This appendix presents additional results and analyses for safety metrics not covered in Chapter 5.
M.1 Results for Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
Another view of safety looks at the total time during which halo incursions occur. It was previously
hypothesized that while System-Based Human Supervisory Control (SBSC) would have some of
the highest counts of Tertiary Halo Incursions due to Aircraft (THIA), it was expected to have
some of the lowest values for DTHIA because of the smaller expected Launch event Duration (LD).
Figure M-1 provides a Pareto plot of DTHIA versus LD for the 22 aircraft Density setting. The
trend in the data is similar to the THIA data earlier, with Dynamic Separation providing some of the
largest values (worst performance), while Temporal+Area provides the best overall. Kruskal-Wallis tests confirm significant differences for both SP (χ2 (3) = 108.3789, p < 0.0001) and CA (χ2 (5) = 736.3356, p < 0.0001).
For Safety Protocol (SP) settings, Steel-Dwass tests reveal similar performance to the THIA metric,
with Temporal (mean rank = 596.74) and Temporal+Area (mean rank = 574.38) providing the
best performance, followed by Area (mean rank = 665.52) and Dynamic Separation (mean rank =
837.29) measures, in line with prior expectations.
The trends in Control Architecture (CA) performance are also similar to those of the THIA measures, which can be better observed in Figure M-2. This figure shows results for all control architectures under the Dynamic Separation protocol only. Manual Control (MC), Local Teleoperation (LT), and Remote Teleoperation (RT) provide the best performance in the DTHIA measures,
followed by Gestural Control (GC), Vehicle-Based Human Supervisory Control (VBSC), and SBSC.
Figure M-1: Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 22 aircraft missions.
A Steel-Dwass test shows that only three pairwise comparisons are not significant: MC (mean rank = 311.7) vs. RT (424.8, p = 0.0631), MC vs. LT (287.9, p = 0.9571), and VBSC (893.6) vs. SBSC (859.1, p = 0.8122). All other comparisons are shown to be significant at p < 0.0015, with the “unmanned” CAs performing worse than their manned counterparts and with the GC architecture (mean rank = 1029.3) again showing the worst performance overall.
This is in line with the earlier THIA results but against the earlier hypotheses on performance:
it was expected that the SBSC cases would complete missions more quickly than other cases, thus
reducing the total possible time that halo incursions could exist. However, as shown in Section 5.5.1,
the benefits of the SBSC architecture on Launch event Duration (LD) were slight, so it is not
surprising that no effects were seen. It is true, however, that the SBSC case shows both the best performance on this metric and some of the worst (although the GC architecture performs
worst overall). These trends are largely the same at the 34 aircraft Density setting, whose Pareto
chart may be found in Appendix N.
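Pareto frontier plots such as Figure M-1 can be produced with a standard non-dominance sweep over the per-case (LD, DTHIA) values, where smaller is better on both axes. A minimal sketch (the input list of points is hypothetical):

    def pareto_frontier(points):
        # points: list of (ld, dthia) tuples, one per case; smaller is better
        # on both axes. Returns the non-dominated points sorted by LD.
        frontier, best_dthia = [], float("inf")
        for ld, dthia in sorted(points):
            if dthia < best_dthia:        # not dominated by any lower-LD point
                frontier.append((ld, dthia))
                best_dthia = dthia
        return frontier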
Figure M-2: Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 22 aircraft Dynamic Separation missions.
M.2 Results for Primary Halo Incursion (PHI)
This section reviews results for the Primary Halo Incursion (PHI) metric of crew incursions near aircraft during operations. Because the Tertiary Halo Incursions (THI) metric reviewed in Section 5.5.2
considered the largest halos around vehicles, crew incursions at that level would likely involve no
real danger to crew during operations. Reducing the size of the incursion radius to the Primary halo
level should provide a better understanding of how often crew are placed in potentially dangerous
situations. Whether or not these situations are truly dangerous and may lead to accidents is contingent on exactly how crew move in and near other aircraft on the flight deck, at a level of complexity not currently contained in Multi-Agent Safety and Control Simulation (MASCS). As
such, MASCS is not able to offer a truly detailed understanding of crew safety at this time. Future
work could involve more complex models of crew motion in order to address these limitations.
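Although the MASCS crew-motion model is not reproduced here, the halo incursion counts themselves reduce to a distance-threshold check on sampled positions. A generic sketch with hypothetical inputs (position tracks sampled at a common timestep and a halo radius in meters):

    import math

    def count_halo_incursions(crew_tracks, aircraft_tracks, halo_radius_m):
        # Count entry events: each time a crew member crosses into the halo
        # of any aircraft counts once, however long it remains inside.
        incursions = 0
        for crew in crew_tracks:
            for aircraft in aircraft_tracks:
                inside = False
                for (cx, cy), (ax, ay) in zip(crew, aircraft):
                    now_inside = math.hypot(cx - ax, cy - ay) < halo_radius_m
                    if now_inside and not inside:
                        incursions += 1
                    inside = now_inside
        return incursions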
Even so, this section reviews PHI results in the context of when crew are likely to work very near
to aircraft on the flight deck. In Chapter 5, the performance of the THIA measures was shown to be in partial agreement with hypotheses regarding safety measures on the flight deck, but given that
the safety protocols are not directly aimed at separating crew from unmanned vehicles and that the
crew are integral parts of a functioning flight deck, those results may not be consistent here. This
is apparent from the first set of data shown in this section; Figure M-3 shows the Pareto plot of
PHI versus LD at the 22 aircraft case. For this data, the results for cases using the Temporal+Area
Separation (T+A) safety protocols are excluded for the time being.
The first major trend in Figure M-3 is that there is a clear separation between the VBSC and
SBSC cases and the other control architectures. Neither the VBSC nor the SBSC architecture are
routed via the crew zone coverage topology. Instead, they receive waypoint commands from the
Deck Handler via the centrally-networked system, providing these two architectures greater freedom
in taxiing on the flight deck. This routing also included the use of crew “Escorts,” which appears to
have generated significantly higher PHI values. However, improving the performance of these two
architectures along the PHI metric would simply require adjusting the encoded Escort heuristics to
be more aware of the aircraft “halos” when moving in and near aircraft. Alternatively, the Escorts
are also optional and are not required for use on the flight deck; they could be entirely removed from
the simulation model.
Within the remaining control architectures (MC, LT, RT, and GC) in the bottom of Fig. M-3,
the choice of control architecture appears to have no effect on results. The stronger trend appears to be in terms of SPs: the Area Separation cases seem to provide the smallest PHI values, followed
by Dynamic Separation, then with Temporal Separation providing the worst. A Kruskal-Wallis
nonparametric one-way analysis of variance test applied to just this group of control architectures
(MC, LT, RT, and GC) verifies that these differences along SPs settings are significant (χ2 (2) =
Figure M-3: Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions,
excluding Temporal+Area cases.
121.737, p < 0.0001). Area Separation provides the best performance (mean rank = 125.51), then
Dynamic Separation only (362.37) and Temporal (482.39). A Steel-Dwass multiple comparisons test
shows that significant differences exist between each pairwise comparison at the p < 0.0001 level
(full results of these tests are included in Appendix N).
The differences between the Area and Temporal cases, as noted previously, are in terms of the
routing of traffic on the flight deck. Aircraft operating under the Area protocols typically move in
an aft-to-forward direction. Additionally, the layout of the areas limits how many aircraft parked aft
in the Fantail are taxied to the forward catapults. The Temporal cases result in a different pattern,
taxiing manned aircraft from the forward area of the deck to both the forward and aft catapults in order
to clear space in the Street area of the deck. Dynamic Separation results in patterns somewhere in
between the two: some forward to aft transitions do occur, but the general flow of traffic is aft to
forward.
Given these trends, the change in aircraft routing that results from the changes in Deck Handler heuristics appears to be the primary influence in this lower cluster of results. If the source of these variations is the transitions between areas, then the performance of the T+A cases should show
performance roughly average of the Temporal and Area cases; under this protocol, no manned
aircraft transition from forward to aft, but the remaining Unmanned Aerial Vehicles (UAVs) must
do so in order to launch. Figure M-4 shows the same data as Figure M-3 but now includes the
Temporal+Area Separation protocols (black markers on the right side of the figure). Figure M-5
shows this data again broken down for only the Remote Teleoperation control architecture.
Running statistical tests on this data for the non-supervisory control cases, a Steel-Dwass multiple comparisons test (Appendix N provides the full list of significant comparisons) demonstrates that the T+A cases are significantly different from all other cases and contain the largest mean rank values (mean rank = 649.03; Temporal mean rank = 547.47). This data suggests that the key issue with PHI values is not transitions between the forward and aft areas of the deck, but simply the emphasis on using the forward area of the flight deck in general — the T+A case includes no transitions between the forward and aft areas of the flight deck whatsoever.
The same general results appear at the 34 aircraft case, as shown in Figure M-6. A clear separation once again exists between the VBSC and SBSC cases and the other, non-Human Supervisory Control (HSC) architectures. Within the non-HSC architectures (MC, LT, RT, GC), the trends are generally the same, although the differences between the Area and Dynamic Separation results are not as significant as in the 22 aircraft results. This is likely explained by the way in which 34 aircraft missions evolve: because of the increased number of aircraft on the deck and the blocking of one of the forward catapults, the forward area of the flight deck is cleared more slowly than the aft under both the Dynamic and Area Separation cases. This means that
even the Area Separation case results in aircraft parked forward being assigned to the aft catapults.
However, as would be expected, the Temporal and Temporal+Area cases both still provide the worst
performance for PHI values at the 34 aircraft Density level. A Kruskal-Wallis one-way comparison
test for the lower cluster of data (MC, LT, RT, GC) reveals significant differences across SP settings
(χ2 (3) = 611.815, p < 0.0001), with a Steel-Dwass multiple comparisons test demonstrating that
significant differences exist in all possible pairwise comparisons. Area (mean rank = 191.17) provides
the lowest average values, followed by Dynamic Separation (mean rank = 282.855), Temporal+Area
(577.175), and Temporal Separation (722.564).
Figure M-4: Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions.
M.3 Results for Duration of Primary Halo Incursions (DPHI)
Viewing these measures in terms of the total Duration of Primary Halo Incursions (DPHI) demonstrates different trends in performance. Despite the variations seen in the count values, there is very
little variation in terms of durations, as seen for the 34 aircraft data in Figure M-7. For the data
in the figure, it appears that the choice of Safety Protocol makes no difference to performance, while the choice of Control Architecture only matters for the VBSC and GC cases. Similar performance is observed at the 22 aircraft level (the Pareto chart can be found in Appendix N), although there only the VBSC case exhibits poorer performance. These effects show that the trends hypothesized
Figure M-5: Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft Remote
Teleoperation missions, including Temporal+Area cases.
to appear in the Duration measures do not appear, although it is notable that the VBSC and GC
cases (which suffer the greatest delays in operations) provide the worst performance. In general,
however, the differences in results between DPHI and PHI are important for future research: while it is not currently known which is the larger driver of risk in operations, the two do not appear to be
entirely correlated with one another in the MASCS model and do not produce the same patterns of
performance. This, in turn, suggests that reducing the number of halo incursions that occur during
a given operation may not necessarily translate into a reduction of the duration of those incursions.
Understanding which is the more important feature in terms of actual accidents on the flight deck
is an important aspect of future research.
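One way to quantify how far the count and duration metrics diverge is a rank correlation across matched cases; a minimal sketch, assuming paired per-mission PHI counts and DPHI totals:

    from scipy import stats

    def count_vs_duration_correlation(phi_counts, dphi_durations):
        # Spearman rank correlation between per-mission incursion counts and
        # total incursion durations; a weak correlation supports the
        # observation that the two safety metrics are not fully aligned.
        rho, p = stats.spearmanr(phi_counts, dphi_durations)
        return rho, p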
A key factor in the results for PHI and THIA has been differences in how the crew are utilized.
In both cases, differences in the routing topology (either using or not using the zone coverage
Aircraft Director crew) had significant effects on results, with some additional effects based on the
number of high-level constraints applied to operations. In other cases, the choice of Safety Protocol
and its interactions with the zone coverage routing topology produced the most significant effects.
What results is a series of tradeoffs between safety metrics. Utilizing the combined Temporal+Area
protocol reduces the number of THIA occurrences while increasing the number of PHI occurrences,
Figure M-6: Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 34 aircraft missions.
Figure M-7: Pareto frontier plot of Duration of Primary Halo Incursions (DPHI) versus LD for 34
aircraft missions.
due to the manner in which missions are planned and executed on the flight deck. Using Area
Separation provides better results for PHI but slightly worse for THIA; using Temporal or Dynamic
Separation provides worse performance in PHI values with performance on THIA ranging from
moderately good to very poor. Just as with the earlier results on mission productivity, however, the
key features driving behavior concern the role of crew on the flight deck, where they are located, and
what tasks they perform. Differences in how the unmanned vehicle control architectures interact
with the crew have helped identify the crew’s role in influencing these safety measures, but further
improvements may require rethinking the role of the crew in flight deck operations.
An example of this involves the crew “Escorts” defined for use with the VBSC and SBSC models
in this work. Within the PHI measures, the VBSC and SBSC architectures both exhibited much
higher values for crew incursions than the other cases. The model that described Escort behavior
resulted in the crew continually moving and readjusting to aircraft, running in and out of the
different halos over the course of operations. Altering Escort behavior so that they were more aware
of halo distances and less active in moving in and around aircraft should reduce the number of
these incursions. However, these crew were also optional: removing them entirely from operations
should drop these incursion counts to zero for the SBSC architecture and significantly reduce them
for VBSC (which still requires some Aircraft Director crew in operations). Similarly, when using the zone coverage Aircraft Director routing, reducing the number of crew or adjusting their spacing could also be effective in promoting safe operations. These are questions that remain for future
work, however.
Appendix N
Statistical Tests for Pareto Frontier Results
N.1 Supplemental statistical tests for Tertiary Halo Incursions (THI)
This section presents more detailed statistics on Tertiary Halo Incursions (THI) and Tertiary Halo
Incursions due to Aircraft (THIA) measures as previously discussed in Chapter 5.
Figure N-1: SPSS output for Levene test of equal variances for the 22 aircraft THI values against
SPs.
Figure N-2: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for the
22 aircraft THIA values against SPs.
Table N.1: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo Incursions due to Aircraft (THIA) 22 aircraft cases versus Safety Protocols.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
T+A     DS        -304.509                16.57664      -18.3698   <0.0001
T+A     A         -198.283                14.14334      -14.0196   <0.0001
T+A     T         -164.507                14.1445       -11.6304   <0.0001
T       DS        -157.95                 16.57347      -9.5303    <0.0001
T       A         -30.347                 14.13862      -2.1464    0.1386
DS      A         140.043                 16.57096      8.4511     <0.0001
Figure N-3: SPSS output for Levene test of equal variances for the 34 aircraft THIA values against
SPs.
Figure N-4: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for the
34 aircraft THIA values against SPs.
Table N.2: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo Incursions due to Aircraft (THIA) 34 aircraft cases versus Safety Protocols.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
T+A     A         -85.68                  14.1494       -6.05538   <0.0001
T       DS        -90.231                 16.57937      -5.44235   <0.0001
T+A     DS        -131.297                16.57916      -7.91942   <0.0001
T       A         -49.207                 14.14992      -3.47752   0.0028
T+A     T         -39.567                 14.15114      -2.79601   0.0266
DS      A         32.256                  16.57825      1.9457     0.209
Figure N-5: SPSS output for Levene test of equal variances for the 22 aircraft THIA values against
CAs.
Figure N-6: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for the
22 aircraft THIA values against CAs.
Table N.3: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo Incursions due to Aircraft (THIA) 22 aircraft cases versus Control Architectures.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
SBSC    LT        157.2                   13.42179      11.7123    <0.0001
VBSC    LT        150.2                   13.41918      11.1929    <0.0001
SBSC    RT        139.004                 13.42073      10.3574    <0.0001
VBSC    RT        124.993                 13.41869      9.3148     <0.0001
SBSC    MC        79.315                  16.68611      4.7533     <0.0001
VBSC    MC        62.296                  16.68312      3.7341     0.0026
MC      LT        42.981                  16.66496      2.5792     0.1024
RT      LT        42.767                  13.40995      3.1892     0.0179
SBSC    GC        -4.5                    13.42322      -0.3352    0.9994
RT      MC        -13.278                 16.66759      -0.7966    0.9682
VBSC    SBSC      -35.463                 13.42316      -2.6419    0.0874
VBSC    GC        -52.385                 13.42134      -3.9031    0.0013
MC      GC        -97.444                 16.68439      -5.8405    <0.0001
RT      GC        -179.056                13.41981      -13.3426   <0.0001
LT      GC        -199.641                13.42073      -14.8755   <0.0001
Figure N-7: SPSS output for Levene test of equal variances for the 34 aircraft THIA values against
CAs.
Figure N-8: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for the
34 aircraft THIA values against CAs.
Table N.4: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo Incursions due to Aircraft (THIA) 34 aircraft cases versus Control Architectures.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
VBSC    LT        263.081                 13.42601      19.5949    <0.0001
SBSC    LT        240.696                 13.42488      17.9291    <0.0001
VBSC    RT        240.415                 13.42439      17.9088    <0.0001
SBSC    RT        172.874                 13.42254      12.8794    <0.0001
VBSC    MC        147.852                 16.6856       8.861      <0.0001
SBSC    MC        137.611                 16.68821      8.246      <0.0001
RT      LT        136.948                 13.42157      10.2036    <0.0001
VBSC    SBSC      80.67                   13.42345      6.0097     <0.0001
RT      MC        70.556                  16.68154      4.2296     0.0003
MC      LT        26.889                  16.67997      1.612      0.5905
VBSC    GC        -49.107                 13.42016      -3.6592    0.0034
SBSC    GC        -117.944                13.42366      -8.7863    <0.0001
MC      GC        -148.852                16.68544      -8.9211    <0.0001
RT      GC        -256.748                13.42533      -19.1242   <0.0001
LT      GC        -266.919                13.42617      -19.8805   <0.0001
Figure N-9: SPSS output for Levene test of equal variances for the 22 aircraft THIA values against
SP-Composition pairs.
Figure N-10: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft THIA values against SP-Composition pairs.
Figure N-11: SPSS output for Levene test of equal variances for the 22 aircraft DTHIA values
against CAs.
Table N.5: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo
Incursions due to Aircraft (THIA) 22 aircraft cases versus Safety Protocols-Composition pairs.
Level
-Level
Score Mean Difference
Std Err Dif
Z
p-Value
50
50
50
50
50
50
50
50
50
50
50
25
50
25
50
50
25
50
25
50
50
50
50
50
50
25
50
50
25
50
25
25
50
50
25
25
100 DS
25 T
25 T+A
25 DS
25 T+A
25 A
25 T+A
25 T
25 T+A
25 T
25 T
100 DS
25 DS
25 T
100 DS
50 T
100 DS
25 A
25 A
25 DS
50 DS
100 DS
25 A
25 DS
50 DS
25 DS
25 A
100 DS
25 A
50 A
25 DS
25 A
50 A
50 A
100 DS
100 DS
-160.411
-149.993
-149.993
-149.913
-148.673
-147.433
-143.9
-143.82
-97.78
-76.54
-72.993
-35.976
-27.82
14.473
23.729
35.007
51.217
53.647
74.367
96.193
107.06
124.221
125.527
136.167
139.06
141.66
146.9
154.691
148.02
149.473
149.713
149.993
149.993
149.993
159.433
164.798
10.54764
10.01665
10.01665
10.01665
10.01665
10.01665
10.01665
10.01665
10.01665
10.01665
10.01665
10.54764
10.01665
10.01665
10.54764
10.01665
10.54764
10.01665
10.01665
10.01665
10.01665
10.54764
10.01665
10.01665
10.01665
10.01665
10.01665
10.54764
10.01665
10.01665
10.01665
10.01665
10.01665
10.01665
10.54764
10.54764
-15.2082
-14.9744
-14.9744
-14.9664
-14.8426
-14.7188
-14.3661
-14.3581
-9.7617
-7.6413
-7.2872
-3.4108
-2.7774
1.4449
2.2497
3.4948
4.8558
5.3557
7.4243
9.6033
10.6882
11.7771
12.5318
13.594
13.8829
14.1424
14.6656
14.6659
14.7774
14.9225
14.9464
14.9744
14.9744
14.9744
15.1155
15.6242
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
0.0187
0.122
0.8804
0.3733
0.014
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
A
A
A
A
DS
A
T+A
DS
T
T
T+A
A
DS
T+A
DS
T+A
DS
DS
DS
T
T
T
T
T+A
T+A
T
T+A
T+A
T
DS
T+A
T+A
T
T+A
T
T+A
Figure N-12: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft DTHIA values against CAs.
Table N.6: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Duration of Tertiary Halo Incursions due to Aircraft (DTHIA) 22 aircraft cases versus Control Architectures.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
VBSC    LT        253.485                 13.42881      18.8762    <0.0001
VBSC    RT        222.481                 13.42881      16.5675    <0.0001
SBSC    LT        188.511                 13.42881      14.0378    <0.0001
SBSC    RT        163.915                 13.42881      12.2062    <0.0001
VBSC    MC        142.093                 16.69439      8.5114     <0.0001
SBSC    MC        99.407                  16.6944       5.9545     <0.0001
RT      LT        95.667                  13.42878      7.124      <0.0001
RT      MC        46.167                  16.69434      2.7654     0.0631
MC      LT        14.259                  16.69433      0.8541     0.9571
VBSC    SBSC      -16.781                 13.42881      -1.2497    0.8122
SBSC    GC        -51.907                 13.42881      -3.8654    0.0015
VBSC    GC        -81.43                  13.42881      -6.0638    <0.0001
MC      GC        -147.389                16.6944       -8.8286    <0.0001
RT      GC        -249.985                13.42881      -18.6156   <0.0001
LT      GC        -264.737                13.42881      -19.7141   <0.0001
Figure N-13: SPSS output for Levene test of equal variances for the 22 aircraft DTHIA values
against SPs.
Figure N-14: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft DTHIA values against SPs.
Table N.7: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Duration of Tertiary Halo Incursions due to Aircraft (DTHIA) 22 aircraft cases versus Safety Protocols.

Level   - Level   Score Mean Difference   Std Err Dif   Z          p-Value
DS      A         105.88                  16.58242      6.38505    <0.0001
T+A     T         -13.043                 14.1539       -0.92154   0.7933
T       A         -36.967                 14.1539       -2.61177   0.0446
T+A     A         -43.39                  14.1539       -3.06559   0.0117
T       DS        -132.92                 16.58242      -8.01569   <0.0001
T+A     DS        -142.84                 16.58242      -8.61395   <0.0001
Figure N-15: SPSS output for Levene test of equal variances for the 22 aircraft PHI values against
SP-Composition pairs, excluding T+A safety protocols.
Figure N-16: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft PHI values against SP-Composition pairs, excluding T+A safety protocols.
Table N.8: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo Incursion (PHI) 22 aircraft cases versus Safety Protocols-Composition pairs, excluding T+A cases.

Level    - Level   Score Mean Difference   Std Err Dif   Z          p-Value
50 A     25 T      -72.14                  10.01379      -7.20407   <0.0001
50 A     100 DS    -71.115                 10.54384      -6.7447    <0.0001
50 A     25 DS     -50.7267                10.01315      -5.066     <0.0001
50 A     25 A      -48.48                  10.0107       -4.84282   <0.0001
50 DS    25 T      -44.8533                10.01264      -4.47967   0.0002
25 A     100 DS    -40.4617                10.5391       -3.83919   0.0024
50 T     25 T      -21.3533                10.01306      -2.13255   0.3333
25 DS    100 DS    9.1178                  10.5399       0.86507    0.9776
50 DS    25 DS     13.5133                 10.00922      1.35009    0.8281
50 DS    100 DS    20.0078                 10.53917      1.89842    0.4813
50 T     50 DS     19.0733                 10.01156      1.90513    0.4768
50 T     25 DS     28.9067                 10.01181      2.88726    0.0595
50 T     100 DS    30.5861                 10.54213      2.90132    0.0572
25 DS    25 A      36.98                   10.01086      3.69399    0.0041
50 DS    25 A      52.34                   10.01088      5.22831    <0.0001
25 T     100 DS    57.7683                 10.54343      5.47908    <0.0001
25 T     25 DS     55.1667                 10.01291      5.50956    <0.0001
50 T     25 A      58.1067                 10.01187      5.80377    <0.0001
50 DS    50 A      65.62                   10.01342      6.55321    <0.0001
25 T     25 A      68.3467                 10.01373      6.8253     <0.0001
50 T     50 A      73.84                   10.0138       7.37383    <0.0001
Figure N-17: SPSS output for Levene test of equal variances for the 22 aircraft SHI values against
SPs-Composition interactions.
Figure N-18: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft PHI values against SPs-Composition pairs.
Table N.9: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Tertiary Halo
Incursions due to Aircraft (THIA) 22 aircraft cases versus Safety Protocols-Composition pairs.
Level
-Level
Score Mean Difference
Std Err Dif
Z
p-Value
50
50
50
50
50
50
50
50
25
50
50
50
50
25
50
50
50
25
50
50
50
25
50
50
50
25
25
50
50
25
25
50
25
25
50
50
25 T+A
25 T
100 DS
25 T+A
25 DS
25 A
25 T
25 T+A
100 DS
25 T+A
25 T
25 T
50 T
100 DS
25 DS
100 DS
50 DS
25 T
25 DS
100 DS
50 DS
25 A
100 DS
25 DS
25 A
100 DS
25 DS
25 A
50 A
100 DS
25 A
25 A
25 A
25 DS
50 A
50 A
-76.3133
-72.14
-71.115
-56.5667
-50.7267
-48.48
-44.8533
-44.76
-40.4617
-32.8867
-22.6667
-21.3533
5.9333
9.1178
13.5133
20.0078
19.0733
21.5733
28.9067
30.5861
34.68
36.98
50.7283
48.3133
52.34
57.7683
55.1667
58.1067
65.62
69.245
68.3467
69.56
71.76
72.7267
73.84
76.2
10.01159
10.01379
10.54384
10.01164
10.01315
10.0107
10.01264
10.00948
10.5391
10.01064
10.01045
10.01306
10.01156
10.5399
10.00922
10.53917
10.01156
10.00572
10.01181
10.54213
10.00976
10.01086
10.54188
10.01022
10.01088
10.54343
10.01291
10.01187
10.01342
10.54296
10.01373
10.01299
10.01184
10.01199
10.0138
10.01328
-7.6225
-7.20407
-6.7447
-5.65009
-5.066
-4.84282
-4.47967
-4.47176
-3.83919
-3.28517
-2.2643
-2.13255
0.59265
0.86507
1.35009
1.89842
1.90513
2.1561
2.88726
2.90132
3.46462
3.69399
4.81208
4.8264
5.22831
5.47908
5.50956
5.80377
6.55321
6.56789
6.8253
6.94697
7.16751
7.26396
7.37383
7.60989
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
0.0003
0.0003
0.0039
0.0283
0.3641
0.4508
0.9996
0.9947
0.916
0.6149
0.6102
0.4348
0.0917
0.0883
0.0156
0.0068
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
A
A
A
DS
A
A
DS
T+A
A
T
T+A
T
T+A
DS
DS
DS
T
T+A
T
T
T+A
DS
T+A
T+A
DS
T
T
T
DS
T+A
T
T+A
T+A
T+A
T
T+A
Figure N-19: SPSS output for Levene test of equal variances for the 34 aircraft PHI values against
SP-Composition pairs.
Figure N-20: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft PHI values against SP-Composition pairs, excluding T+A safety protocols.
Table N.10: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo
Incursion (PHI) 34 aircraft cases versus Safety Protocols-Composition pairs.
Level
-Level
Score Mean Difference
Std Err Dif
Z
p-Value
50
50
50
50
50
50
50
25
50
50
50
25
25
50
25
50
50
50
50
50
50
25
50
50
25
25
50
25
50
25
25
50
50
50
50
50
25 T
50 T
25 T+A
25 T
25 T
25 T+A
100 DS
25 T
25 DS
25 A
25 T+A
100 DS
100 DS
100 DS
25 A
25 DS
25 A
25 T
50 A
50 DS
100 DS
100 DS
25 DS
25 T+A
25 A
100 DS
25 A
25 DS
100 DS
25 DS
25 A
50 A
50 DS
25 DS
25 A
50 A
-76.8333
-74.5267
-73.4933
-68.74
-50.0933
-44.8067
-38.7689
-35.92
-23.4
-21.46
-17.6
-9.3256
2.585
9.5211
10.8933
18.0067
26.1933
39.12
53.58
54.28
62.5167
64.3317
63.9133
65.1867
69.22
73.7428
70.5533
70.7
77.5439
75.6867
75.92
76
77.1733
77.8
77.9867
77.9933
10.01511
10.0147
10.01486
10.01548
10.01375
10.01508
10.54312
10.01365
10.01318
10.01231
10.01332
10.54328
10.54457
10.54374
10.01347
10.01369
10.01299
10.01371
10.01276
10.01412
10.54502
10.54612
10.01394
10.01377
10.01525
10.54638
10.01449
10.01523
10.54628
10.01553
10.01552
10.01455
10.01542
10.01546
10.01533
10.01492
-7.67174
-7.44173
-7.33843
-6.86338
-5.00245
-4.47392
-3.67718
-3.5871
-2.33692
-2.14336
-1.75766
-0.8845
0.24515
0.90301
1.08787
1.79821
2.61593
3.90665
5.35117
5.42035
5.92855
6.10003
6.38244
6.5097
6.91146
6.99224
7.04513
7.05925
7.35272
7.55693
7.58023
7.58896
7.70545
7.76799
7.78673
7.78772
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
0.0003
0.0073
0.0101
0.3197
0.4434
0.7103
0.9938
1
0.9929
0.9761
0.6835
0.1797
0.003
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
<0.0001
A
T+A
A
DS
T+A
DS
A
T+A
A
A
T+A
A
DS
DS
DS
DS
DS
T
DS
T+A
T+A
T+A
T+A
T
T+A
T
T+A
T+A
T
T
T
T+A
T
T
T
T
Figure N-21: SPSS output for Levene test of equal variances for the 22 aircraft PHI values against
SP.
Figure N-22: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft PHI values against SP.
Table N.11: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo
Incursion (PHI) 22 aircraft cases versus Safety Protocols.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
DS | A | 197.1511 | 13.05863 | 15.09738 | <.0001
T+A | DS | 183.4178 | 13.05781 | 14.0466 | <.0001
T+A | A | 177.9444 | 10.96205 | 16.23278 | <.0001
T | A | 163.5389 | 10.96168 | 14.91914 | <.0001
T | DS | 112.1778 | 13.05428 | 8.59318 | <.0001
T+A | T | 49.8222 | 10.94888 | 4.55044 | <.0001
Figure N-23: SPSS output for Levene test of equal variances for the 34 aircraft PHI values against
SP.
Figure N-24: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft PHI values against SP.
Table N.12: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Primary Halo
Incursion (PHI) 34 aircraft cases versus Safety Protocols.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
T | DS | 237.973 | 13.0727 | 18.2038 | <.0001
T+A | DS | 211.622 | 13.07027 | 16.1911 | <.0001
T | A | 179.894 | 10.96536 | 16.4057 | <.0001
T+A | A | 175.567 | 10.96457 | 16.0122 | <.0001
DS | A | 82.547 | 13.06124 | 6.32 | <.0001
T+A | T | -126.75 | 10.9616 | -11.5631 | <.0001
Figure N-25: SPSS output for Levene test of equal variances for the 22 aircraft TTD values against
SP-Composition pairs.
Figure N-26: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft TTD values against SP-Composition pairs.
Figure N-27: SPSS output for Levene test of equal variances for the 22 aircraft TTD values against
SP.
Table N.13: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Total Taxi Distance
(TTD) 22 aircraft cases versus SP-Composition pairs.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
50 A | 100 DS | -160.411 | 10.54764 | -15.2082 | <0.0001
50 A | 25 T | -149.993 | 10.01665 | -14.9744 | <0.0001
50 A | 25 T+A | -149.993 | 10.01665 | -14.9744 | <0.0001
50 A | 25 DS | -149.913 | 10.01665 | -14.9664 | <0.0001
50 DS | 25 T+A | -148.673 | 10.01665 | -14.8426 | <0.0001
50 A | 25 A | -147.433 | 10.01665 | -14.7188 | <0.0001
50 T+A | 25 T+A | -143.9 | 10.01665 | -14.3661 | <0.0001
50 DS | 25 T | -143.82 | 10.01665 | -14.3581 | <0.0001
50 T | 25 T+A | -97.78 | 10.01665 | -9.7617 | <0.0001
50 T | 25 T | -76.54 | 10.01665 | -7.6413 | <0.0001
50 T+A | 25 T | -72.993 | 10.01665 | -7.2872 | <0.0001
25 A | 100 DS | -35.976 | 10.54764 | -3.4108 | 0.0187
50 DS | 25 DS | -27.82 | 10.01665 | -2.7774 | 0.122
25 T+A | 25 T | 14.473 | 10.01665 | 1.4449 | 0.8804
50 DS | 100 DS | 23.729 | 10.54764 | 2.2497 | 0.3733
50 T+A | 50 T | 35.007 | 10.01665 | 3.4948 | 0.014
25 DS | 100 DS | 51.217 | 10.54764 | 4.8558 | <0.0001
50 DS | 25 A | 53.647 | 10.01665 | 5.3557 | <0.0001
25 DS | 25 A | 74.367 | 10.01665 | 7.4243 | <0.0001
50 T | 25 DS | 96.193 | 10.01665 | 9.6033 | <0.0001
50 T | 50 DS | 107.06 | 10.01665 | 10.6882 | <0.0001
50 T | 100 DS | 124.221 | 10.54764 | 11.7771 | <0.0001
50 T | 25 A | 125.527 | 10.01665 | 12.5318 | <0.0001
50 T+A | 25 DS | 136.167 | 10.01665 | 13.594 | <0.0001
50 T+A | 50 DS | 139.06 | 10.01665 | 13.8829 | <0.0001
25 T | 25 DS | 141.66 | 10.01665 | 14.1424 | <0.0001
50 T+A | 25 A | 146.9 | 10.01665 | 14.6656 | <0.0001
50 T+A | 100 DS | 154.691 | 10.54764 | 14.6659 | <0.0001
25 T | 25 A | 148.02 | 10.01665 | 14.7774 | <0.0001
50 DS | 50 A | 149.473 | 10.01665 | 14.9225 | <0.0001
25 T+A | 25 DS | 149.713 | 10.01665 | 14.9464 | <0.0001
25 T+A | 25 A | 149.993 | 10.01665 | 14.9744 | <0.0001
50 T | 50 A | 149.993 | 10.01665 | 14.9744 | <0.0001
50 T+A | 50 A | 149.993 | 10.01665 | 14.9744 | <0.0001
25 T | 100 DS | 159.433 | 10.54764 | 15.1155 | <0.0001
25 T+A | 100 DS | 164.798 | 10.54764 | 15.6242 | <0.0001
Figure N-28: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 22 aircraft TTD values against SP.
Table N.14: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Total Taxi Distance
(TTD) 22 aircraft cases versus SP.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
T+A | DS | 374.6356 | 16.58243 | 22.59232 | <.0001
T | DS | 324.2715 | 16.58243 | 19.55512 | <.0001
T+A | A | 298.45 | 14.15392 | 21.08604 | <.0001
T | A | 286.7767 | 14.15392 | 20.2613 | <.0001
DS | A | 260.6798 | 16.58243 | 15.72024 | <.0001
T+A | T | 37.1367 | 14.15392 | 2.62377 | 0.0432
Figure N-29: SPSS output for Levene test of equal variances for the 34 aircraft TTD values against
SP.
Figure N-30: SPSS output for the non-parametric Kruskal-Wallis one-way analysis of variance for
the 34 aircraft TTD values against SP.
Table N.15: JMP PRO 10 output for Steel-Dwass Multiple Comparisons tests for Total Taxi Distance
(TTD) 34 aircraft cases versus SP.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
T | DS | 389.932 | 16.58243 | 23.5148 | <.0001
T+A | DS | 372.978 | 16.58243 | 22.4924 | <.0001
T | A | 299.997 | 14.15392 | 21.1953 | <.0001
T+A | A | 297.55 | 14.15392 | 21.0225 | <.0001
DS | A | 218.018 | 16.58243 | 13.1475 | <.0001
T+A | T | -244.097 | 14.15392 | -17.2459 | <.0001
N.2 Full Color Pareto Charts
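Each figure in this section plots the Pareto frontier of one safety or performance metric against LD for the evaluated configurations. As a point of reference, the sketch below (an illustration only, not the plotting code used to produce these figures) shows one way to extract the non-dominated points from paired (metric, LD) samples when smaller values are preferred on both axes:

import numpy as np

def pareto_frontier(points):
    """Return the points not dominated by any other point, assuming
    smaller is better on both axes."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for p in pts:
        # p is dominated if some other point is no worse on both axes
        # and strictly better on at least one.
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(p)
    return np.array(keep)

# Hypothetical (metric, LD) pairs for a handful of configurations.
print(pareto_frontier([(12, 30.0), (8, 34.0), (15, 28.0), (8, 36.0), (20, 27.5)]))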
Figure N-31: Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 22 aircraft missions.
Figure N-32: Pareto frontier plot of Tertiary Halo Incursions (THI) versus LD for 34 aircraft missions.
Figure N-33: Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for
22 aircraft missions.
Figure N-34: Pareto frontier plot of Tertiary Halo Incursions due to Aircraft (THIA) versus LD for
34 aircraft missions.
Figure N-35: Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 22 aircraft missions.
Figure N-36: Pareto frontier plot of Duration of Tertiary Halo Incursions due to Aircraft (DTHIA)
versus LD for 34 aircraft missions.
Figure N-37: Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions,
excluding Temporal+Area cases.
Figure N-38: Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 22 aircraft missions.
Figure N-39: Pareto frontier plot of Primary Halo Incursion (PHI) versus LD for 34 aircraft missions.
Figure N-40: Pareto frontier plot of Duration of Primary Halo Incursions (DPHI) versus LD for 22
aircraft missions.
Figure N-41: Pareto frontier plot of Duration of Primary Halo Incursions (DPHI) versus LD for 34
aircraft missions.
Figure N-42: Pareto frontier plot of Total Taxi Distance (TTD) versus LD for 22 aircraft missions.
Figure N-43: Pareto frontier plot of Total Taxi Distance (TTD) versus LD for 34 aircraft missions.
Figure N-44: Pareto frontier plot of Total Aircraft taXi Time (TAXT) versus LD for 22 aircraft
missions.
Figure N-45: Pareto frontier plot of Total Aircraft taXi Time (TAXT) versus LD for 34 aircraft
missions.
Figure N-46: Pareto frontier plot of Total Aircraft Active Time (TAAT) versus LD for 22 aircraft
missions.
Figure N-47: Pareto frontier plot of Total Aircraft Active Time (TAAT) versus LD for 34 aircraft
missions.
Figure N-48: Pareto frontier plot of Primary Collision warnings (PC) versus LD for 22 aircraft
missions.
Figure N-49: Pareto frontier plot of Primary Collision warnings (PC) versus LD for 34 aircraft
missions.
Figure N-50: Pareto frontier plot of Secondary Collision warnings (SC) versus LD for 22 aircraft
missions.
Figure N-51: Pareto frontier plot of Secondary Collision warnings (SC) versus LD for 34 aircraft
missions.
Figure N-52: Pareto frontier plot of Tertiary Collision warnings (TC) versus LD for 22 aircraft
missions.
Figure N-53: Pareto frontier plot of Tertiary Collision warnings (TC) versus LD for 34 aircraft
missions.
Figure N-54: Pareto frontier plot of Landing Strip Foul Time (LSFT) versus LD for 22 aircraft
missions.
Figure N-55: Pareto frontier plot of Landing Strip Foul Time (LSFT) versus LD for 34 aircraft
missions.
Figure N-56: Pareto frontier plot of Number assigned to Street (NS) versus LD for 22 aircraft
missions.
Figure N-57: Pareto frontier plot of Number assigned to Street (NS) versus LD for 34 aircraft
missions.
Figure N-58: Pareto frontier plot of Number assigned to Fantail (NF) versus LD for 22 aircraft
missions.
Figure N-59: Pareto frontier plot of Number assigned to Fantail (NF) versus LD for 34 aircraft
missions.
Figure N-60: Pareto frontier plot of Duration of Vehicles in Street (SD) versus LD for 22 aircraft
missions.
Figure N-61: Pareto frontier plot of Duration of Vehicles in Street (SD) versus LD for 34 aircraft
missions.
Figure N-62: Pareto frontier plot of Duration of Vehicles in Fantail (FD) versus LD for 22 aircraft
missions.
Figure N-63: Pareto frontier plot of Duration of Vehicles in Fantail (FD) versus LD for 34 aircraft
missions.
Appendix O
Statistical Tests for Design
Exploration
Figure O-1: SPSS output for Levene test of equal variances for LD values for all control architectures
at 22 aircraft under the 50% standard deviation launch preparation time.
Figure O-2: SPSS output for an ANOVA of LD values for all control architectures at 22 aircraft
under the 50% standard deviation launch preparation time.
Figure O-3: SPSS output for Tukey HSD test of LD values for all control architectures at 22 aircraft
under the 50% standard deviation launch preparation time.
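Figures O-1 through O-3 follow the workflow used throughout this appendix: a Levene test for equality of variances, a one-way ANOVA when the variance assumption is acceptable, and Tukey HSD post-hoc comparisons. A minimal sketch of that workflow using scipy and statsmodels, with placeholder samples and group names rather than the thesis data, is:

import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Placeholder LD samples for three control architectures; the sample
# sizes, means, and spreads are assumptions for illustration only.
rng = np.random.default_rng(0)
groups = {
    "GC": rng.normal(50.0, 5.0, 30),
    "MC": rng.normal(55.0, 5.0, 30),
    "VH": rng.normal(60.0, 5.0, 30),
}

# Levene test for equality of variances across the groups.
w, p_levene = stats.levene(*groups.values(), center="median")

# One-way ANOVA on the same samples.
f, p_anova = stats.f_oneway(*groups.values())

# Tukey HSD pairwise comparisons of group means.
values = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])
print(pairwise_tukeyhsd(values, labels, alpha=0.05))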
Figure O-4: SPSS output for Levene test of equal variances for LD values for all control architectures
at 34 aircraft under the 50% standard deviation launch preparation time.
Figure O-5: SPSS output for an ANOVA of LD values for all control architectures at 34 aircraft
under the 50% standard deviation launch preparation time.
Figure O-6: SPSS output for Levene test of equal variances for LD values for all control architectures
at 22 aircraft under the launch preparation time model with 50% reduction in mean and 50%
reduction in standard deviation.
Table O.1: JMP PRO 10 output for a Steel-Dwass non-parametric simultaneous comparisons test
of LD values for all control architectures for 34 aircraft under the launch preparation time model
with 50% reduction in mean and standard deviation.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
RT -50 | LT -50 | 26.2333 | 4.50925 | 5.81767 | <.0001
RT -50 | MC -50 | 24.4333 | 4.50925 | 5.41849 | <.0001
VH -50 | SH -50 | 22.5667 | 4.50925 | 5.00453 | <.0001
VH -50 | LT -50 | 20.7667 | 4.50925 | 4.60535 | <.0001
VH -50 | MC -50 | 19.4333 | 4.50925 | 4.30966 | 0.0002
MC -50 | LT -50 | -0.9 | 4.50925 | -0.19959 | 1
RT -50 | GC -50 | -0.9667 | 4.50925 | -0.21437 | 0.9999
SH -50 | MC -50 | -4.4333 | 4.50925 | -0.98316 | 0.9235
SH -50 | LT -50 | -6.5 | 4.50925 | -1.44148 | 0.7015
VH -50 | GC -50 | -8.7333 | 4.509187 | -1.93679 | 0.3795
VH -50 | RT -50 | -9.9 | 4.50925 | -2.19549 | 0.2397
MC -50 | GC -50 | -23.6333 | 4.50925 | -5.24108 | <.0001
LT -50 | GC -50 | -24.9 | 4.50925 | -5.52198 | <.0001
SH -50 | GC -50 | -25.7 | 4.50925 | -5.6994 | <.0001
SH -50 | RT -50 | -26.3667 | 4.50925 | -5.84724 | <.0001
Figure O-7: SPSS output for a Tukey HSD test of LD values for all control architectures at 34
aircraft under the 50% standard deviation launch preparation time.
Figure O-8: SPSS output for an ANOVA test for LD values for all control architectures at 22 aircraft
under the launch preparation time model with 50% reduction in mean and 50% reduction in standard
deviation.
Figure O-9: SPSS output for Tukey HSD test for LD values for all control architectures at 22
aircraft under the launch preparation time model with 50% reduction in mean and 50% reduction
in standard deviation.
Figure O-10: SPSS output for Levene test of equal variances for LD values for all control architectures
for 34 aircraft under the launch preparation time model with 50% reduction in mean and standard
deviation.
Figure O-11: SPSS output for a Kruskal-Wallis test of LD values for all control architectures for
34 aircraft under the launch preparation time model with 50% reduction in mean and standard
deviation.
Figure O-12: SPSS output for Levene test of equal variances for LD values for all control architectures
at 22 aircraft under the “automated” launch preparation time.
Figure O-13: SPSS output for Kruskal-Wallis nonparametric analysis of variance for LD values for
all control architectures at 22 aircraft under the “automated” launch preparation time.
Figure O-14: SPSS output for Levene test of equal variances for LD values for all control architectures
at 34 aircraft under the “automated” launch preparation time.
Table O.2: JMP Pro 10 output for Steel-Dwass nonparametric simultaneous comparisons test of LD
values for all control architectures at 22 aircraft under the “automated” launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
RT | MC | 29.9667 | 4.50925 | 6.6456 | < .0001
VH | MC | 29.9667 | 4.50925 | 6.6456 | < .0001
VH | LT | 29.7667 | 4.50925 | 6.60125 | < .0001
RT | LT | 29.5667 | 4.50925 | 6.55689 | < .0001
VH | SH | 27.3 | 4.50925 | 6.05422 | < .0001
SH | MC | 27.1667 | 4.50925 | 6.02465 | < .0001
SH | LT | 24.1667 | 4.50925 | 5.35935 | < .0001
VH | RT | 22.7667 | 4.50925 | 5.04888 | < .0001
VH | GC | 4.0333 | 4.509187 | 0.89447 | 0.948
MC | LT | -6.9 | 4.50925 | -1.53019 | 0.6447
SH | RT | -15.5 | 4.50925 | -3.43738 | 0.0077
RT | GC | -22.9 | 4.509187 | -5.07852 | < .0001
SH | GC | -27.5 | 4.509187 | -6.09866 | < .0001
LT | GC | -29.9 | 4.509187 | -6.63091 | < .0001
MC | GC | -29.9667 | 4.509187 | -6.64569 | < .0001
Figure O-15: SPSS output for Kruskal-Wallis nonparametric analysis of variance for LD values for
all control architectures at 34 aircraft under the “automated” launch preparation time.
Table O.3: JMP Pro 10 output for Steel-Dwass nonparametric simultaneous comparisons test of LD
values for all control architectures at 34 aircraft under the “automated” launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
RT | LT | 29.9667 | 4.50925 | 6.6456 | < .0001
RT | MC | 29.9667 | 4.50925 | 6.6456 | < .0001
VH | SH | 29.3 | 4.50925 | 6.49775 | < .0001
RT | GC | 18.8333 | 4.50925 | 4.1766 | 0.0004
VH | MC | 0.9 | 4.50925 | 0.19959 | 1
VH | LT | -2.4333 | 4.50925 | -0.53963 | 0.9945
MC | LT | -3.3 | 4.50925 | -0.73183 | 0.9781
SH | MC | -27.0333 | 4.50925 | -5.99508 | < .0001
SH | LT | -28.9 | 4.50925 | -6.40905 | < .0001
LT | GC | -29.3667 | 4.50925 | -6.51254 | < .0001
MC | GC | -29.3667 | 4.50925 | -6.51254 | < .0001
VH | GC | -29.3667 | 4.50925 | -6.51254 | < .0001
SH | GC | -29.9667 | 4.50925 | -6.6456 | < .0001
SH | RT | -29.9667 | 4.50925 | -6.6456 | < .0001
VH | RT | -29.9667 | 4.50925 | -6.6456 | < .0001
Figure O-16: SPSS output for Levene test of equal variances for LD values for the GC architecture
across all safety protocols at 22 aircraft under the “automated” launch preparation time.
Figure O-17: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the GC architecture across all safety protocols at 22 aircraft under the “automated” launch
preparation time.
Table O.4: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the GC architecture across all safety protocols at 22 aircraft under the “automated”
launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
T+A | DS | 73.45833 | 7.240888 | 10.14493 | < .0001
T | DS | 68.43056 | 7.240888 | 9.45057 | < .0001
T+A | A | 59.98333 | 6.350853 | 9.44493 | < .0001
T | A | 59.71667 | 6.350853 | 9.40294 | < .0001
DS | A | 27.76389 | 7.240888 | 3.83432 | 0.0007
T+A | T | 21.15 | 6.350853 | 3.33026 | 0.0048
Figure O-18: SPSS output for Levene test of equal variances for LD values for the GC architecture
across all safety protocols at 34 aircraft under the “automated” launch preparation time.
Figure O-19: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the GC architecture across all safety protocols at 34 aircraft under the “automated” launch
preparation time.
Table O.5: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the GC architecture across all safety protocols at 34 aircraft under the “automated”
launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
T+A | DS | 74.98611 | 7.240895 | 10.35592 | < .0001
T | DS | 74.70833 | 7.240895 | 10.31756 | < .0001
T+A | A | 59.98333 | 6.350853 | 9.44493 | < .0001
T | A | 59.91667 | 6.350853 | 9.43443 | < .0001
T+A | T | 41.01667 | 6.350853 | 6.45845 | < .0001
DS | A | 18.45833 | 7.240895 | 2.54918 | 0.0527
Figure O-20: SPSS output for Levene test of equal variances for LD values for the SBSC architecture
across all safety protocols at 22 aircraft under the “automated” launch preparation time.
Figure O-21: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the SBSC architecture across all safety protocols at 22 aircraft under the “automated” launch
preparation time.
Figure O-22: SPSS output for Levene test of equal variances for LD values for the SBSC architecture
across all safety protocols at 34 aircraft under the “automated” launch preparation time.
Table O.6: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the SBSC architecture across all safety protocols at 22 aircraft under the “automated”
launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
T | DS | 74.98611 | 7.240895 | 10.35592 | < .0001
T+A | DS | 74.98611 | 7.240895 | 10.35592 | < .0001
T | A | 59.98333 | 6.350853 | 9.44493 | < .0001
T+A | A | 59.98333 | 6.350853 | 9.44493 | < .0001
T+A | T | 49.25 | 6.350853 | 7.75486 | < .0001
DS | A | 26.125 | 7.240895 | 3.60798 | 0.0018
Figure O-23: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the SBSC architecture across all safety protocols at 34 aircraft under the “automated” launch
preparation time.
Table O.7: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons tests for LD
values for the SBSC architecture across all safety protocols at 34 aircraft under the “automated”
launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
T | DS | 74.9861 | 7.240895 | 10.3559 | < .0001
T+A | DS | 74.9861 | 7.240895 | 10.3559 | < .0001
T | A | 59.9833 | 6.350853 | 9.4449 | < .0001
T+A | A | 59.9833 | 6.350853 | 9.4449 | < .0001
T+A | T | 16.45 | 6.350853 | 2.5902 | 0.0473
DS | A | -58.375 | 7.240895 | -8.0618 | < .0001
Figure O-24: SPSS output for Levene test of equal variances for LD values for the GC architecture
across different parameter settings at 22 aircraft under the “automated” launch preparation time.
Figure O-25: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the GC architecture across different parameter settings at 22 aircraft under the “automated” launch
preparation time.
Table O.8: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test
of LD values for the GC architecture across different parameter settings at 22 aircraft under the
“automated” launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
LowLatency | LowFailure | -0.3 | 4.50925 | -0.06653 | 1
Manual Control | LowLat+Fail | -13.0667 | 4.509187 | -2.89779 | 0.0308
LowLat+Fail | LowLatency | -18.6333 | 4.50925 | -4.13225 | 0.0003
LowLatency | GC Base | -19.1667 | 4.509187 | -4.25058 | 0.0002
LowFailure | GC Base | -19.6333 | 4.509187 | -4.35407 | 0.0001
LowLat+Fail | LowFailure | -20.4333 | 4.50925 | -4.53143 | < .0001
Manual Control | LowLatency | -27.3 | 4.50925 | -6.05422 | < .0001
Manual Control | LowFailure | -28.2333 | 4.50925 | -6.2612 | < .0001
LowLat+Fail | GC Base | -28.6333 | 4.509187 | -6.35 | < .0001
Manual Control | GC Base | -29.9667 | 4.509187 | -6.64569 | < .0001
Figure O-26: SPSS output for Levene test of equal variances for LD values for the GC architecture
across different parameter settings at 34 aircraft under the “automated” launch preparation time.
Figure O-27: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the GC architecture across different parameter settings at 34 aircraft under the “automated” launch
preparation time.
Table O.9: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test
of LD values for the GC architecture across different parameter settings at 34 aircraft under the
“automated” launch preparation time.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
LowLatency | LowFailure | 15.4333 | 4.50925 | 3.42259 | 0.0056
Manual Control | LowLat+Fail | 2.0333 | 4.50925 | 0.45092 | 0.9915
LowLatency | GC Base | -12.6333 | 4.50925 | -2.80165 | 0.0407
LowFailure | GC Base | -25.1667 | 4.50925 | -5.58112 | < .0001
Manual Control | LowFailure | -25.7 | 4.50925 | -5.6994 | < .0001
LowLat+Fail | LowFailure | -28.4333 | 4.50925 | -6.30556 | < .0001
Manual Control | LowLatency | -28.5667 | 4.50925 | -6.33513 | < .0001
Manual Control | GC Base | -29.3667 | 4.50925 | -6.51254 | < .0001
LowLat+Fail | LowLatency | -29.7 | 4.50925 | -6.58646 | < .0001
LowLat+Fail | GC Base | -29.7667 | 4.50925 | -6.60125 | < .0001
Figure O-28: SPSS output for Levene test of equal variances for LD values for the GC architecture
across different parameter settings at 22 aircraft under the “automated” launch preparation time,
including the Low Penalty case.
Figure O-29: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values for
the GC architecture across different parameter settings at 22 aircraft under the “automated” launch
preparation time, including the Low Penalty case.
Table O.10: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test
of LD values for the GC architecture across different parameter settings at 22 aircraft under the
“automated” launch preparation time, including the Low Penalty case.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
LowLatency | LowFailure | -0.3 | 4.50925 | -0.06653 | 1
Manual Control | LowPen | -2.5667 | 4.508247 | -0.56933 | 0.993
LowPen | LowLat+Fail | -5.9667 | 4.508811 | -1.32333 | 0.7722
Manual Control | LowLat+Fail | -13 | 4.507746 | -2.88392 | 0.0454
LowLat+Fail | LowLatency | -18.6667 | 4.508999 | -4.13987 | 0.0005
LowLatency | GC Base | -19.2333 | 4.508999 | -4.26554 | 0.0003
LowFailure | GC Base | -19.6667 | 4.509062 | -4.36159 | 0.0002
LowLat+Fail | LowFailure | -20.4667 | 4.509124 | -4.53894 | < .0001
LowPen | LowLatency | -21.5667 | 4.509124 | -4.78289 | < .0001
LowPen | LowFailure | -22.1 | 4.509124 | -4.90117 | < .0001
Manual Control | LowLatency | -27.3667 | 4.50831 | -6.07027 | < .0001
Manual Control | LowFailure | -28.2667 | 4.508435 | -6.26973 | < .0001
LowLat+Fail | GC Base | -28.6333 | 4.509062 | -6.35018 | < .0001
LowPen | GC Base | -29.5 | 4.508999 | -6.54247 | < .0001
Manual Control | GC Base | -29.9667 | 4.508373 | -6.64689 | < .0001
Figure O-30: SPSS output for Levene test of equal variances for LD values for the GC architecture
across different failure rates at 22 aircraft under the “automated” launch preparation time, using
Low Latency and Low Penalty.
Figure O-31: SPSS output for Kruskal-Wallis non-parametric analysis of variance for LD values
for the GC architecture across different failure rates at 22 aircraft under the “automated” launch
preparation time, using Low Latency and Low Penalty.
Table O.11: JMP PRO 10 output for Steel-Dwass non-parametric simultaneous comparisons test for
LD values for the GC architecture across different failure rates at 22 aircraft under the “automated”
launch preparation time, using Low Latency and Low Penalty.
Level | -Level | Score Mean Difference | Std Err Dif | Z | p-Value
LowLat+Pen 3% | LowLat+Pen 2% | 15.8333 | 4.50925 | 3.5113 | 0.0059
LowLat+Pen 3% | LowLat+Pen 1% | 15.4333 | 4.50925 | 3.42259 | 0.0082
LowLat+Pen 5% | LowLat+Pen 2% | 8.6333 | 4.50925 | 1.91458 | 0.393
LowLat+Pen 4% | LowLat+Pen 2% | 8.5667 | 4.50925 | 1.8998 | 0.4022
LowLat+Pen 5% | LowLat+Pen 1% | 8.3 | 4.50925 | 1.84066 | 0.4394
LowLat+Pen 4% | LowLat+Pen 1% | 7.2333 | 4.50925 | 1.60411 | 0.5958
LowLat+Pen 5% | LowLat+Pen 4% | 3.2333 | 4.50925 | 0.71704 | 0.98
LowLat+Pen 2% | LowLat+Pen 1% | 0.5 | 4.50925 | 0.11088 | 1
MC | LowLat+Pen 1% | -2.5667 | 4.508498 | -0.5693 | 0.993
MC | LowLat+Pen 2% | -6.4333 | 4.508498 | -1.42693 | 0.7106
LowLat+Pen 5% | LowLat+Pen 3% | -6.8333 | 4.50925 | -1.5154 | 0.6543
LowLat+Pen 4% | LowLat+Pen 3% | -9.2333 | 4.509124 | -2.0477 | 0.3153
MC | LowLat+Pen 5% | -14.0333 | 4.508498 | -3.11264 | 0.0228
MC | LowLat+Pen 4% | -15.9 | 4.508498 | -3.52667 | 0.0056
MC | LowLat+Pen 3% | -23.7 | 4.508498 | -5.25674 | < .0001