Detection of Precursors to Aviation Safety Incidents due to Human Factors
I. Melnyk, P. Yadav, M. Steinbach, J. Srivastava, V. Kumar and A. Banerjee
Department of Computer Science & Engineering
University of Minnesota, Minneapolis
ICDM Workshop on Domain Driven Data Mining
December 7, 2013
Overview
• Introduction
• Related Work
• Proposed Approach
  • HMM vs HSMM
• Experimental Results
• Conclusion
Introduction
• Fatal accidents and onboard fatalities 2003-2012 [Boeing Report ’12]
Introduction
• Estimated two-fold increase in air traffic by 2025 [Sheridan ’06]
  • Congestion in the air and at airports
  • Higher load on pilots and traffic controllers
  • Greater chance of errors
• Our objective
  • Detect precursors to aviation safety incidents due to human factors
    – Analysis and modeling of pilot actions
    – Data generation and evaluation
Related Work: Aviation Safety
• Hidden Markov Models [Srivastava ’05]
  • Observations modeled using N-dim binary vector (switches in cockpit)
  • Cluster data to get smaller class of observations; build HMM over reduced data
• Clustering approach [Budalakoti et al. ’09]
  • Cluster pilot action sequences using k-medoids based on nLCS
  • Rank order sequences to identify anomalous sequences
• One-class SVM [Das et al. ’10]
  • Detects anomalies in both continuous and discrete sequences
  • Employed Multiple Kernel learning: LCS for discrete, SAX for continuous
• Dynamic Bayesian Networks [Saada et al. ’12]
  • Hidden nodes – pilot actions; Observable nodes – aircraft sensors
  • Detects pilot errors in the past given current instrument data
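The nLCS similarity used by the clustering approach can be illustrated with a small sketch. It assumes the usual definition, the length of the longest common subsequence divided by the geometric mean of the two sequence lengths; the function names and the example action lists are illustrative and do not come from the slides.

```python
from math import sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)  # DP row for the previous element of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def nlcs(a, b):
    """Normalized LCS similarity in [0, 1] (assumed definition, see lead-in)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / sqrt(len(a) * len(b))


# Hypothetical action sequences: the second one forgets to lower the flaps.
normal = ["reduce throttle", "lower flaps", "lower landing gear", "apply brakes"]
forgot = ["reduce throttle", "lower landing gear", "apply brakes"]
print(nlcs(normal, normal), nlcs(normal, forgot))
```

A sequence with a forgotten action stays fairly similar to the normal one under this measure, which is one reason to rank sequences by similarity rather than apply a hard match.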
Problem Formulation
• Given a database of normal pilot action sequences
  • Actions come from a finite alphabet
  • Examples: “raise landing gear”, “lower flaps”, “decrease throttle”, etc.
• Construct a model of a normal sequence from data
• Assign an anomaly score to a test sequence
  • Entire sequence is anomalous (offline anomaly detection)
  • Specific action is anomalous (online anomaly detection)
  • Examples: “unusual order of actions”, “forgotten action”, etc.
Analysis of Pilot Actions
• Flight phases
• Example: landing phase pilot actions
[Figure: action ID plotted against time over the landing stages (descent, touch down, braking on runway), with the stage duration, the time between actions, and null actions marked]
Analysis of Pilot Actions
• Flight phases
• Example: landing phase pilot actions
• Simplification: ignore time duration between actions
[Figure: action ID plotted against time over the landing stages (descent, touch down, braking on runway), with the stage duration marked]
Hidden Markov Model (HMM)
• Hidden states
  • Stages of aircraft operation
  • Examples: initial descent, touch down, braking, etc.
• Observations
  • Pilot actions
  • Example: initial descent – reduce throttle, lower flaps, lower landing gear, etc.
• Model parameters
  • Probability distributions: transition, observation, prior
• Drawback
  • Geometric state-duration distribution – encourages fast state switching
  • Inability to model arbitrary state durations
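The geometric state-duration drawback can be made explicit with a short derivation. The symbol a_ii for the self-transition probability of hidden state i is assumed notation, not taken from the slides. Staying in state i for exactly d steps requires d − 1 self-transitions followed by one exit:

```latex
% a_{ii}: assumed notation for the self-transition probability of hidden state i
P(d_i = d) = a_{ii}^{\,d-1}\,(1 - a_{ii}), \qquad d = 1, 2, \dots
\qquad\Longrightarrow\qquad
\mathbb{E}[d_i] = \frac{1}{1 - a_{ii}}
```

The probability is largest at d = 1 and decays geometrically, so the shortest stay is always the most likely one, whatever value a_ii takes. A stage that typically lasts, say, five to eight actions cannot be represented by this shape.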
Hidden Semi-Markov Model (HSMM)
• Additional hidden variable
  • State duration
  • Forces a hidden state to last a given number of time steps
• Model parameters
  • Probability distributions: duration, transition, observation, initial
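How the duration variable shapes a sequence can be sketched with a small generative sampler for an explicit-duration HSMM. Everything below is a toy illustration: the state names, action names, and probability tables are made up for this example, and the absorbing "braking" state is a simplification, not part of the slides.

```python
import random


def sample_hsmm(initial, transition, duration, emission, length, rng):
    """Sample (states, observations) of the given length from a toy HSMM."""

    def draw(dist):
        # dist maps outcome -> probability
        outcomes = list(dist)
        weights = [dist[o] for o in outcomes]
        return rng.choices(outcomes, weights=weights, k=1)[0]

    states, observations = [], []
    state = draw(initial)
    while len(states) < length:
        d = draw(duration[state])  # how long this state must last
        for _ in range(d):
            if len(states) == length:
                break
            states.append(state)
            observations.append(draw(emission[state]))
        state = draw(transition[state])  # switch only after the duration expires
    return states, observations


# Toy parameters (assumptions for illustration only).
INITIAL = {"descent": 1.0}
TRANSITION = {
    "descent": {"touch down": 1.0},
    "touch down": {"braking": 1.0},
    "braking": {"braking": 1.0},  # absorbing, to keep the example short
}
DURATION = {
    "descent": {3: 0.5, 4: 0.5},
    "touch down": {1: 1.0},
    "braking": {2: 0.5, 3: 0.5},
}
EMISSION = {
    "descent": {"reduce throttle": 0.4, "lower flaps": 0.3, "lower landing gear": 0.3},
    "touch down": {"decrease throttle": 1.0},
    "braking": {"apply brakes": 1.0},
}

states, actions = sample_hsmm(
    INITIAL, TRANSITION, DURATION, EMISSION, 12, random.Random(0)
)
print(list(zip(states, actions)))
```

Unlike an HMM, the descent stage here can never end after one or two actions, because its duration table puts no mass there.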
HSMM: Model Parameters Estimation
• Estimate probability distributions (conditional probability tables)
  • Duration
  • Transition
  • Observation
  • Initial distributions
• Use database of normal pilot action sequences
• Select parameters which maximize likelihood of data
  • Non-convex problem without closed-form solution
  • Use Expectation Maximization (EM) [Dempster et al. ’77]
  • Similar to Baum-Welch algorithm for HMM [Baum et al. ’70]
Anomaly Detection Methodology
• Detect if a test sequence is anomalous
  • Entire sequence is anomalous (offline anomaly detection)
    • Normalized joint log likelihood
  • Specific action is anomalous (online anomaly detection)
    • Conditional probability
• Computational complexity
  • Computation uses Junction Tree algorithm for inference
  • Cost depends on the sequence length, the number of hidden states, and the maximum state duration
  • For comparison, the HMM has no duration variable, so its cost depends only on the sequence length and the number of hidden states
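The two scores can be sketched for the simpler HMM case. This is a minimal illustration using the scaled forward recursion, not the junction-tree inference or the HSMM from the slides; the two-state model, its probability tables, and the integer action IDs are toy assumptions.

```python
import math


def forward_scores(obs, prior, trans, emit):
    """Return (normalized log-likelihood, per-step conditional probabilities).

    per_step[t] is P(x_t | x_1..x_{t-1}) under the HMM. The offline score is
    the mean of the log of these values; the online score is each value itself.
    If some step has probability zero, the offline score is -inf and per_step
    stops at that step.
    """
    n = len(prior)
    alpha = [prior[i] * emit[i][obs[0]] for i in range(n)]
    per_step = []
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        c = sum(alpha)  # probability of the current action given the history
        per_step.append(c)
        if c == 0.0:
            return float("-inf"), per_step
        alpha = [a / c for a in alpha]  # rescale to avoid underflow
    loglik = sum(math.log(c) for c in per_step)
    return loglik / len(obs), per_step


# Toy model (assumption): 2 hidden stages, 3 possible actions with IDs 0..2.
PRIOR = [0.9, 0.1]
TRANS = [[0.7, 0.3], [0.1, 0.9]]
EMIT = [[0.8, 0.15, 0.05], [0.05, 0.15, 0.8]]

NORMAL = [0, 0, 1, 2, 2]  # follows the usual order of actions
ODD = [2, 2, 1, 0, 0]     # same actions in an unusual order

print(forward_scores(NORMAL, PRIOR, TRANS, EMIT)[0])
print(forward_scores(ODD, PRIOR, TRANS, EMIT)[0])
```

A low normalized log-likelihood flags the whole sequence, while a low per-step value flags the specific action at which the sequence departs from the model.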
Results: Synthetic Data
• Compared HMM and HSMM to detect duration anomalies
• Data:

            Normal   Anomalous
  Training  200      0
  Testing   25       25
Flight Simulator
• FlightGear flight simulator
  • Landing flight phase
  • Cessna 172 Skyhawk landing at Half Moon Bay, CA airport
• Simulator setup
  • Aircraft controlled using keyboard
  • Keystrokes interpreted as pilot actions
  • 12 commands to control aircraft
Figure 4: Landing of aircraft in FlightGear simulator.
Pilot Actions for Landing
Figure 5: Sequence of pilot’s actions under normal operating conditions while landing Cessna aircraft in FlightGear simulator.
[Plot: action ID (0 to 12) against timestamp (0 to 200). Annotated actions: throttle increase, throttle decrease, flaps down, stabilization with ailerons/elevator, descend using ailerons/elevator/rudder, applied brakes, stabilization with rudder]
Flight Simulator Data
• Generated data

                    Normal   Anomalous (by type)
                             1    2    3    4    5
  # of sequences    110      10   10   10   10   10

• Types of anomalies
  • 1 – Throttle kept constant, flaps are not lowered; rest is normal
  • 2 – No initial throttle increase; rest is normal
  • 3 – Flaps are not lowered; rest is normal
  • 4 – At the end of flight brakes are not applied; rest is normal
  • 5 – Pilot overshoots runway and lands behind it; rest is normal
AUC Results: HSMM vs HMM
• AUC based on 11-fold cross-validation
  • 110 normal sequences split into 11 parts
  • 10 parts used for training
  • 1 part + 50 anomalous sequences used for testing
  • Initialization was fixed across runs
• Performance metric: area under ROC curve (AUC)
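The evaluation protocol can be sketched in a few lines. The code assumes that a higher score means "more anomalous" (for example, a negated normalized log likelihood); the helper names and the round-robin fold assignment are illustrative choices, not details from the slides.

```python
def auc(anomalous_scores, normal_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random anomalous sequence receives a higher
    anomaly score than a random normal one (ties count as one half).
    """
    wins = 0.0
    for a in anomalous_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous_scores) * len(normal_scores))


def split_folds(items, k):
    """Split items into k parts of near-equal size (round-robin assignment)."""
    return [items[i::k] for i in range(k)]


# 110 normal sequences (represented by their indices) split into 11 parts.
parts = split_folds(list(range(110)), 11)
for held_out in range(len(parts)):
    test_normal = parts[held_out]
    train = [s for i, p in enumerate(parts) if i != held_out for s in p]
    # Train on `train`, then score `test_normal` plus the anomalous sequences
    # and pass the two score lists to auc().

print(auc([0.9, 0.4], [0.5, 0.1]))
```

An AUC of 1.0 means every anomalous sequence outranks every normal one, and 0.5 corresponds to random ranking.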
Effect of Initialization: HSMM vs HMM
• Dependency of AUC on initialization
  • Selected 10 random model initializations
  • Split of dataset was fixed across runs
    • Training: 100 normal sequences
    • Testing: 10 normal + 50 anomalous sequences
• Performance metric: area under ROC curve (AUC)
Offline vs Online Anomaly Detection
• Offline anomaly scoring: normalized joint log likelihood
• Online anomaly scoring: conditional probability
• Detected anomalies in streaming data
  • 1: No initial throttle increase
  • 2: Incorrect usage of rudder
  • 3: Mistakenly used elevator control after touch down
HSMM vs Other Methods
• Baseline methods [Chandola et al. ’08]
  • EFSA: Extended Finite State Automata
  • t-STIDE: Threshold-based sequence time delay embedding
  • WIND: Window-based anomaly detection
• Setup
  • Training: 100 normal sequences; Testing: 60 sequences
  • Sliding window length (history length) was varied from 1 to 20
  • Selected model with best AUC on training set; Evaluated on test set

            Type 1   Type 2   Type 3   Type 4   Type 5
  HSMM      1.00     0.97     0.77     0.88     1.00
  HMM       0.87     0.60     0.71     0.95     0.99
  WIND      0.9      0.60     0.70     0.87     1.00
  EFSA      0.85     0.67     0.67     0.91     0.98
  t-STIDE   0.84     0.67     0.68     0.92     1.00
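The role of the sliding window length in the baselines can be illustrated with a simplified window-frequency detector. This is a sketch in the spirit of window-based methods, not the exact EFSA, t-STIDE, or WIND algorithms of [Chandola et al. ’08]; the function names, the threshold parameter, and the integer action IDs are assumptions.

```python
from collections import Counter


def window_counts(train_sequences, k):
    """Count every length-k window that occurs in the normal training data."""
    counts = Counter()
    for seq in train_sequences:
        for i in range(len(seq) - k + 1):
            counts[tuple(seq[i:i + k])] += 1
    return counts


def window_anomaly_score(seq, counts, k, threshold=1):
    """Fraction of length-k windows in seq seen fewer than `threshold` times."""
    windows = [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    if not windows:
        return 0.0  # sequence shorter than the window: nothing to judge
    rare = sum(1 for w in windows if counts[w] < threshold)
    return rare / len(windows)


# Hypothetical action IDs; training data contains only the normal order.
train = [[1, 2, 3, 4], [1, 2, 3, 4]]
counts = window_counts(train, k=2)
print(window_anomaly_score([1, 2, 3, 4], counts, k=2))  # normal order
print(window_anomaly_score([1, 3, 2, 4], counts, k=2))  # unusual order
print(window_anomaly_score([1, 2, 4], counts, k=2))     # forgotten action
```

Detectors of this kind only see k consecutive actions at a time, so they have no direct notion of how long a flight stage lasts.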
Summary
• Proposed framework to model discrete pilot actions
  • HMM
    • Hidden states – stages of aircraft operation
    • Observations – pilot actions
    • Drawback – inability to model arbitrary state durations
  • HSMM
    • Introduces additional hidden variable to model state durations
• Evaluated model performance
  • Synthetic data
  • Flight simulator data
  • Compared HSMM to HMM and other anomaly detection algorithms
Thank you!