Ch19. Evaluation Criteria for BCI Research

Contents
- Introduction
- Criteria for Evaluating Trial-based BCI Data
- Criteria for Evaluating Self-paced BCI Data
- Other Criteria
Introduction
- Factors that affect BCI performance
  - Operating mode: trial-based (system-paced) vs. asynchronous (self-paced)
  - Type and number of EEG features
    - Spectral parameters
    - Slow cortical potentials
    - Spatiotemporal parameters
    - Nonlinear features
  - Type of classifier
    - Linear and quadratic discriminant analysis
    - Support vector machines
    - Neural networks
    - Simple threshold detection
  - Target application
Introduction (continued)
- The necessity of consistent evaluation criteria
  - Needed to compare different BCI systems and approaches
- Considerations for evaluation criteria
  - What is being evaluated
  - The feedback loop
- The most frequently used evaluation criterion
  - Error rate or accuracy
- Response speed of BCIs
- Evaluation of asynchronous BCI data
Criteria for Evaluating Trial-based BCI Data
- The Confusion Matrix
- Classification Accuracy and Error Rate
- Cohen’s Kappa Coefficient
- Mutual Information of a Discrete Output
The Confusion Matrix

Two-class case (rows: true class, columns: classifier output):

Class    Y            N
Y        Hits (TP)    Misses (FN)
N        FA (FP)      CR (TN)

Four-class example:

Class      1      2      3      4    Total
1         73     17      7      8      105
2         10     87      3      5      105
3          6     13     74     12      105
4          2      4      7     92      105
Total     91    121     91    117      420

Entries on the main diagonal are correctly classified samples; off-diagonal entries are incorrectly classified samples.

TP: True Positive
TN: True Negative
FP: False Positive
FN: False Negative
FA: False Activation
CR: Correct Rejection
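As a small illustration, the four-class table can be held in a NumPy array to reproduce its totals. This is only a sketch: the name `CONFUSION` is made up for the example, and it assumes that rows are the true classes and columns are the classifier outputs, as the two-class layout suggests.

```python
import numpy as np

# Four-class confusion matrix from the table above.
# Assumption: rows are the true classes, columns the classifier outputs.
CONFUSION = np.array([
    [73, 17,  7,  8],
    [10, 87,  3,  5],
    [ 6, 13, 74, 12],
    [ 2,  4,  7, 92],
])

row_totals = CONFUSION.sum(axis=1)    # n_{i:}, the "Total" column
col_totals = CONFUSION.sum(axis=0)    # n_{:i}, the "Total" row
n_total = int(CONFUSION.sum())        # N, all samples
n_correct = int(np.trace(CONFUSION))  # diagonal: correctly classified
n_incorrect = n_total - n_correct     # off-diagonal: incorrectly classified

print(row_totals, col_totals, n_total, n_correct, n_incorrect)
```

The row and column totals match the "Total" entries of the table, which is a quick check that the matrix was typed in correctly.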
Classification Accuracy and Error Rate
- The most widely used evaluation criterion in BCI research
- Denoted ACC for classification accuracy and ERR (= 1 - ACC) for error rate
- Very easy to calculate and interpret
- The accuracy expected by chance is 100%/M (M: number of classes)
- The accuracy can never exceed 100%

$$\mathrm{ACC} = p_0 = \frac{\sum_{i=1}^{M} n_{ii}}{N} = \frac{\text{number of correctly classified samples}}{\text{total number of samples}}$$

- Some limitations
  - The off-diagonal values of the confusion matrix are not considered
  - The classification accuracy of less frequent classes has little weight
Cohen’s Kappa Coefficient
- Addresses several of the critiques of the accuracy measure
- Uses the overall agreement $p_0$ and the chance agreement $p_e$

$$p_0 = \mathrm{ACC}, \qquad p_e = \frac{\sum_{i=1}^{M} n_{:i}\, n_{i:}}{N^2}, \qquad \kappa = \frac{p_0 - p_e}{1 - p_e}$$

where $n_{:i}$ is the sum of the i-th column and $n_{i:}$ is the sum of the i-th row of the confusion matrix.

- $\kappa = 0$: predicted classes show no correlation with the actual classes
- $\kappa = 1$: perfect classification
- $\kappa < 0$: different assignment between the output and the true classes

Standard error of the kappa coefficient:

$$\sigma_e(\kappa) = \frac{\sqrt{p_0 + p_e^2 - \sum_{i=1}^{M} n_{:i}\, n_{i:}\,(n_{:i} + n_{i:}) / N^3}}{(1 - p_e)\sqrt{N}}$$
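The accuracy, kappa, and standard-error formulas can be checked numerically on the four-class confusion matrix shown earlier. The sketch below is illustrative only; the function name `kappa_with_se` is made up for this example.

```python
import numpy as np

def kappa_with_se(confusion):
    """Return (p0, kappa, standard error of kappa) for a square confusion matrix."""
    h = np.asarray(confusion, dtype=float)
    n = h.sum()              # N
    row = h.sum(axis=1)      # n_{i:}
    col = h.sum(axis=0)      # n_{:i}
    p0 = np.trace(h) / n                 # overall agreement (= ACC)
    pe = np.sum(col * row) / n**2        # chance agreement
    kappa = (p0 - pe) / (1.0 - pe)
    se = np.sqrt(p0 + pe**2 - np.sum(col * row * (col + row)) / n**3) \
        / ((1.0 - pe) * np.sqrt(n))
    return p0, kappa, se

# Four-class example matrix from the confusion-matrix section.
example = [[73, 17, 7, 8], [10, 87, 3, 5], [6, 13, 74, 12], [2, 4, 7, 92]]
p0, kappa, se = kappa_with_se(example)
print(f"ACC = {p0:.3f}, kappa = {kappa:.3f}, se = {se:.3f}")
```

For this matrix the chance agreement works out to exactly 0.25, so kappa is about 0.70 against an accuracy of about 0.78.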

Cohen’s Kappa Coefficient (continued)
- Addresses several of the criticisms of the accuracy measure
  - It considers the distribution of the wrong classifications (i.e., the off-diagonal elements of the confusion matrix)
  - The frequency of occurrence is normalized for each class, so classes with fewer samples get the same weight as classes with many samples
- The standard error of the kappa coefficient can easily be used to test whether the results of distinct classification systems differ significantly
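One common way to use the standard error for such a comparison is a z-test on the difference of two kappa values. The slides do not spell out the test, so the sketch below is an assumption: it treats the two estimates as independent and approximately normal, and the function names and input values are hypothetical.

```python
import math

def kappa_z_score(kappa_a, se_a, kappa_b, se_b):
    """z statistic for the difference of two independent kappa estimates."""
    return (kappa_a - kappa_b) / math.sqrt(se_a**2 + se_b**2)

def two_sided_p_value(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical values for two classification systems (not from the source).
z = kappa_z_score(0.70, 0.05, 0.55, 0.05)
print(f"z = {z:.2f}, p = {two_sided_p_value(z):.3f}")
```

If both systems were evaluated on the same trials, the independence assumption does not hold and a paired test would be more appropriate.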
Mutual Information of a Discrete Output
- Assumptions
  - The BCI system can be modeled as a communication channel
  - The communication theory of Shannon and Weaver (1949) can be applied directly to quantify the information transfer
- Farwell and Donchin (1988)
  - The information transfer for M classes can be calculated as

$$I = \log_2 M$$

- Problems
  - The information rate assumes an error-free system
  - This measure is not useful for comparing different BCI systems
Mutual Information of a Discrete Output (continued)
- Based on Pierce (1980) and Wolpaw et al. (2000a)
- The information transfer rate for M classes and ACC = $p_0$ is

$$B\,[\text{bits}] = \log_2 M + p_0 \log_2 p_0 + (1 - p_0) \log_2 \frac{1 - p_0}{M - 1}$$

- The formula is valid only under the following conditions
  1. M selections (classes) are possible
  2. Each class has the same probability
  3. The specific accuracy is the same for each class, where

$$\mathrm{specACC}_i = \frac{2\, n_{ii}}{n_{i:} + n_{:i}}$$

  4. Each undesired selection must have the same probability of selection
- Often these assumptions are not fulfilled. The four-class confusion matrix from "The Confusion Matrix", for example, does not satisfy the 3rd and 4th conditions.
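To make the formula and its conditions concrete, the sketch below evaluates the bits per selection and the per-class specific accuracies for the four-class example matrix. The function names are illustrative, and the zero-probability terms are handled by the usual convention that 0 · log2(0) = 0.

```python
import math
import numpy as np

def wolpaw_bits(n_classes, p0):
    """Bits per selection from the formula above, for M >= 2 and 0 <= p0 <= 1."""
    bits = math.log2(n_classes)
    if p0 > 0.0:
        bits += p0 * math.log2(p0)
    if p0 < 1.0:
        bits += (1.0 - p0) * math.log2((1.0 - p0) / (n_classes - 1))
    return bits

def specific_accuracy(confusion):
    """specACC_i = 2 n_ii / (n_i: + n_:i) for every class i."""
    h = np.asarray(confusion, dtype=float)
    return 2.0 * np.diag(h) / (h.sum(axis=1) + h.sum(axis=0))

example = [[73, 17, 7, 8], [10, 87, 3, 5], [6, 13, 74, 12], [2, 4, 7, 92]]
p0 = np.trace(np.asarray(example)) / np.sum(example)
print(f"B = {wolpaw_bits(4, p0):.3f} bits per selection")
print("specACC:", np.round(specific_accuracy(example), 3))
```

The specific accuracies differ from class to class, which is why the third condition is not met for this matrix.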
Mutual Information of a Discrete Output (continued)
- Random variable X models the user intention
- Random variable Y models the classifier output
- The entropy H(X) of a discrete random variable is defined as

$$H(X) = -\sum_{j=1}^{M} p(x_j) \log_2 p(x_j)$$

- Nykopp (2001) derived the information transfer for a general confusion matrix

$$I(X;Y) = H(Y) - H(Y \mid X)$$

$$H(Y) = -\sum_{j=1}^{M} p(y_j) \log_2 p(y_j)$$

$$p(y_j) = \sum_{i=1}^{M} p(x_i)\, p(y_j \mid x_i)$$

$$H(Y \mid X) = -\sum_{i=1}^{M} \sum_{j=1}^{M} p(x_i)\, p(y_j \mid x_i) \log_2 p(y_j \mid x_i)$$

$$I(X;Y) = \sum_{i=1}^{M} \sum_{j=1}^{M} p(x_i)\, p(y_j \mid x_i) \log_2 p(y_j \mid x_i) - \sum_{j=1}^{M} p(y_j) \log_2 p(y_j)$$

where $p(y_j \mid x_i)$ is the probability of classifying $x_i$ as $y_j$, $p(x_i)$ is the a priori probability of class $x_i$, and $I(X;Y)$ is the mutual information.
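The quantities above can be estimated directly from a confusion matrix. The following sketch assumes that rows are the true classes (X) and columns the classifier outputs (Y), that every true class has at least one sample, and that probabilities are estimated as relative frequencies. The function name is made up for the example.

```python
import numpy as np

def mutual_information(confusion):
    """I(X;Y) in bits, estimated from a confusion matrix (rows: X, columns: Y)."""
    h = np.asarray(confusion, dtype=float)
    p_x = h.sum(axis=1) / h.sum()                    # a priori p(x_i)
    p_y_given_x = h / h.sum(axis=1, keepdims=True)   # p(y_j | x_i)
    p_y = p_x @ p_y_given_x                          # p(y_j)

    def plog2p(p):
        # Elementwise p * log2(p), with the convention 0 * log2(0) = 0.
        out = np.zeros_like(p)
        mask = p > 0
        out[mask] = p[mask] * np.log2(p[mask])
        return out

    h_y = -plog2p(p_y).sum()                                    # H(Y)
    h_y_given_x = -(p_x[:, None] * plog2p(p_y_given_x)).sum()   # H(Y|X)
    return h_y - h_y_given_x

example = [[73, 17, 7, 8], [10, 87, 3, 5], [6, 13, 74, 12], [2, 4, 7, 92]]
print(f"I(X;Y) = {mutual_information(example):.3f} bits")
```

A perfect four-class classifier yields log2(4) = 2 bits, and a classifier whose output is independent of the true class yields 0 bits, so the result for a real matrix lies between these two values.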
Criteria for Evaluating Self-paced BCI Data
Asynchronous mode BCI
- The BCI system is specially designed to produce outputs in response to intentional control
- HF-difference (hit-false difference)
  - H: hit rate, F: false detection rate

$$H = Se = \frac{TP}{TP + FN}, \qquad F = \frac{FP}{TP + FP}$$

$$HF_{\mathrm{diff}} = H - F = \frac{TP}{TP + FN} - \frac{FP}{TP + FP} = Se + Pr - 1$$

where Se is the sensitivity and $Pr = TP / (TP + FP) = 1 - F$ is the precision.
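A minimal sketch of the HF-difference, using hypothetical detection counts, shows that both forms of the expression give the same value. The function name and the counts are illustrative only.

```python
def hf_difference(tp, fn, fp):
    """HF-difference H - F from true positives, false negatives, false positives."""
    hit_rate = tp / (tp + fn)     # H = Se
    false_rate = fp / (tp + fp)   # F
    return hit_rate - false_rate

# Hypothetical counts: 8 hits, 2 misses, 2 false activations.
tp, fn, fp = 8, 2, 2
sensitivity = tp / (tp + fn)
precision = tp / (tp + fp)
print(hf_difference(tp, fn, fp), sensitivity + precision - 1)
```

Both printed values are equal up to floating-point rounding, since Pr = 1 - F.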
Other Criteria
- Receiver operating characteristic (ROC)
- Correlation coefficient
- Evaluation of continuous-input and continuous-output systems
- Response time
  - High accuracy is important, but the response time is also important
  - The maximum steepness of the mutual information was used in BCI Competition III

$$STMI(t) = \frac{I(t)}{t - t_0}$$

where $t_0$ is the time of cue onset and $I(t)$ is the continuous mutual information.
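As a rough sketch, the maximum steepness can be found by evaluating I(t)/(t - t0) at every sample after the cue and taking the largest value. The function name and the sample time course below are hypothetical, and the sketch assumes I(t) is already available on a discrete time grid.

```python
import numpy as np

def max_steepness(t, mi, t0):
    """Return (max of I(t)/(t - t0), time of the maximum) over samples with t > t0."""
    t = np.asarray(t, dtype=float)
    mi = np.asarray(mi, dtype=float)
    after_cue = t > t0
    if not after_cue.any():
        raise ValueError("no samples after the cue onset t0")
    steepness = mi[after_cue] / (t[after_cue] - t0)
    k = int(np.argmax(steepness))
    return float(steepness[k]), float(t[after_cue][k])

# Hypothetical time course: cue at t0 = 1 s, mutual information in bits.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
mi = [0.0, 0.0, 0.5, 0.6, 0.6]
print(max_steepness(t, mi, t0=1.0))
```

Dividing by the time since cue onset rewards systems whose mutual information rises quickly, which is how the criterion captures response time.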