Journal of Clinical Monitoring and Computing (2006) 20: 209–220
DOI: 10.1007/s10877-006-9023-2
IDENTIFYING OFFLINE MUSCLE STRENGTH
PROFILES SUFFICIENT FOR SHORT-DURATION
FES-LCE EXERCISE: A PAC LEARNING MODEL
APPROACH
Randy D. Trumbower, PT, PhD,1 Sanguthevar Rajasekaran,
PhD,2 and Pouran D. Faghri, MD, MS, FACSM2,3
© Springer 2006
Trumbower RD, Rajasekaran S, Faghri PD. Identifying offline muscle
strength profiles sufficient for short-duration FES-LCE exercise: a PAC
learning model approach.
J Clin Monit Comput 2006; 20: 209–220
ABSTRACT. Functional electrical stimulation-induced leg cycle
ergometry (FES-LCE) provides therapeutic exercise for persons
with spinal cord injury (SCI). However, there exists no systematic approach to predict whether an individual has sufficient thigh
muscle strength necessary for FES-LCE exercise. Objective. To
develop and test a Probably Approximately Correct (PAC) learning model as a predictor of thigh muscle strengths sufficient for
short-duration FES-LCE exercise and compare the model’s performance with other well-known statistical methods. Methods.
Six healthy male individuals with SCI, having age (32.0 ± 12.5
years), height (1.8 ± 0.04 m), and weight (79.12 ± 10.76 kg),
participated in static and dynamic experiments. During static experiments, absolute crank torque measurements were used to
estimate thigh muscle strengths in response to maximum FES
intensities of 70 mA, 105 mA, and 140 mA at fixed crank positions on an FES-LCE. During dynamic experiments, changes in
power output measurements were used to classify rider performance as ‘Fatigue’ or ‘No Fatigue’ during short-duration FES-LCE at maximum stimulation intensities of 70 mA, 105 mA, and
140 mA and flywheel resistance levels of 0/8th, 1/8th, and 2/8th
kilopounds. A Probably Approximately Correct (PAC) learning
model was developed to classify static offline muscle strength observations with online rider performances. PAC’s discriminatory
power was compared with logistic regression (LR), Fisher’s linear discriminant analysis (LDA), and an artificial neural network
(ANN) model. Results. PAC and ANN learning models correctly identified 100% of the training examples. PAC’s average
performance on the validation set was 93.1%. The ANN and
LR performed comparably, with 92.8% and 93.1% accuracy, respectively. The LDA method fared well on the validation set at
89.9%. Conclusions. PAC performed well in identifying muscle strengths associated with the online performance criterion.
Although PAC did not perform best during cross-validation, this
model has many advantages over the other methods. PAC can
adapt to changes in classification schemes and is more amenable
to theoretical analyses than the other methods. PAC learning has
an intuitive design and may be a practical choice for classifying
muscle strength profiles with well-defined performance criteria.
KEY WORDS. PAC learning, statistical models, FES-LCE, SCI.
1 Sensory Motor Performance Program, Rehabilitation Institute of Chicago and Northwestern University; 2 School of Engineering, Department of Computer Science and Engineering; 3 School of Allied Health, Department of Health Promotion, University of Connecticut, Storrs, CT, USA
Received 23 January 2006. Accepted for publication 9 April 2006.
Address correspondence to Pouran D. Faghri, University of Connecticut, Koons Hall, U-2101, 358 Mansfield Road, U-2101, Storrs,
CT 06269-2101.
E-mail: pouran.faghri@uconn.edu.
INTRODUCTION
The goal of functional electrical stimulation-induced semireclined leg cycling (FES-LCE) is to provide therapeutic
exercise for persons with spinal cord injury (SCI) [1–3].
However, a major clinical concern is the ill-defined approach used to determine whether a prospective rider has
the necessary muscle strength to successfully participate in
FES-LCE exercise. Most clinicians use trial-and-error to
Journal of Clinical Monitoring and Computing Vol 20 No 3 2006
grade an individual’s response to FES and determine when
to initiate and by how much to progress a patient’s exercise regimen. There is no systematic method to evaluate an
individual’s strength capabilities for FES-LCE exercise. To
effectively prescribe FES-LCE, better defined methods for
identifying rider strength potentials in response to FES are
called for.
For many clinicians and researchers, quadriceps (QUAD)
strength training is considered preliminary to participation
in FES-LCE. Petrofsky et al. [4] prescribed an open-chain
weight-lifting program for prospective FES-LCE subjects
to ensure their success during FES-LCE. The testing included FES-induced short-arc-quad sets where subjects
were required to repeatedly lift a 7 kg weight for 15 min
consisting of a 3-second lift, a 1-second hold, and a 3-second release followed by a 6-second rest period. Other progressive-resistance QUAD strengthening protocols are also well
documented [5–8]. However, these strength training procedures do not consider the contributions of stimulated
hamstrings (HAM) and gluteus maximus (GLUT) muscles
nor the QUAD response to FES at different joint configurations and musculotendon lengths that are more consistent with leg cycling exercise. Even still, some studies
consider only inclusion of persons with SCI with one or
more months of experience in FES-LCE without report on
subjects’ prior thigh strength in response to FES [1, 9–12].
Thigh muscle strength is an important determinant in
FES-LCE therapeutic exercise. Inadequacies in muscle
force production make FES-induced pedaling near impossible. During FES-LCE, a muscle’s force generating capability must be sufficient to accelerate the crank in the
forward direction. The effectiveness of forward pedaling
is ultimately dependent on individual muscle responses to
FES and how these responses translate to pedal power. Direct measure of maximal isometric force generating capabilities of the individual thigh muscles would be ideal
for assessing individual muscle strengths, but not appropriate in fracture-prone persons with SCI [13]. Schutte et al.
[14] proposed an alternative method using a musculoskeletal model to approximate individual thigh muscle strengths
relative to able-bodied individuals. This approach, however, is not practical since (1) it does not provide a method
to distinguish a rider’s ability from other persons with SCI,
(2) it is based on a theoretical framework without considering variability of individual muscle responses to different
stimulation intensities, and (3) it defines strengths relative
to able-bodied individuals that do not rely on FES. More
recently, Trumbower et al. [15] performed a non-invasive
method for estimating isometric muscle strength by recording pedal forces from individual leg muscles in response
to different stimulation amplitudes while positioned on an
FES-LCE system at different crank positions. It has previously been reported that during FES-LCE there exists a
significant isometric component due to the high levels of
FES during pedaling [4], which suggests that recorded isometric strengths of individual muscles may be a practical
way of estimating online muscle force generating capacities
during short-term FES-LCE. Although the non-invasive
method appears promising, it alone cannot fully address
the dynamic contributions of muscles needed to power the
crank.
Currently, no studies have attempted to associate rider-specific isometric muscle forces in response to FES and
online cycling power. The strength characteristics of individual muscles involved in FES-LCE may be useful information in defining the anaerobic power output (PO)
capabilities during short-duration cycling exercise. PO is
a common measure of performance during leg cycling [1,
6, 16–18] and is used to quantify the energy transfer from
rider to bike. A systematic approach to classify muscle force
generating capabilities with PO levels required for short-duration FES-LCE is a necessary step when prescribing this
type of exercise to prospective riders. However, the connection between a rider’s offline muscle strength and PO
is not intuitive and predicting the association must depend
on statistical methods.
Classification models may be one way to associate muscle strength and PO and provide an indirect estimation of possible strength deficits in prospective riders. An
algorithm that classifies offline muscle strength profiles
with online performance criteria is one suitable approach.
Logistic regression (LR) [19], artificial neural networks
(ANN) [20], and linear discriminant analysis (LDA) [21,
22] models have been used for various biomedical classification schemes. Recently, Probably Approximately Correct (PAC) has gained attention as an efficient classification
model for its (1) ability to categorize arbitrary example
sets quickly, (2) capacity for extensive analyses, (3) strong
probabilistic framework [20, 23], and (4) ability to learn
intuitive rules via inductive inferencing [24]. However, it
is not clear if a PAC learning model is feasible for classifying muscle strengths, because it has not been previously
assessed for this type of clinical application.
Therefore, the purpose of this study was to develop, test,
and compare a PAC learning model as a predictor of thigh
muscle strength profiles sufficient for short-duration FES-LCE with more well-known approaches. It is hypothesized
that the PAC model will perform well in correctly classifying prospective riders’ muscle strengths for FES-LCE
exercise. PAC may provide an efficient means to classify an
individual’s muscle strengths for FES-LCE and ultimately
eliminate the guess-work when prescribing this exercise
for potential new users. Moreover, the proposed model
may have implication for other FES applications that classify muscle strength profiles with well-defined performance
criteria.
Trumbower et al.: Classifying Strength Profiles for FES-LCE
METHODS AND MATERIALS
Subjects
Six healthy male individuals with SCI participated in this study with similar age (32.0 ± 12.5 years), height (1.80 ± 0.04 m), and weight (79.1 ± 10.8 kg) (Table 1), and were regular users of FES-LCE (having used the system for at least 3 months). All subjects reviewed and signed a statement of informed consent approved by the University’s Institutional Review Board.
Instrumentation
A piezoelectric force sensor (PCB® Piezoelectronics, Inc., New York, USA) was mounted on the right boot-pedal and measured normal and tangential pedal forces in the sagittal plane. Normal and tangential pedal force measurements were acquired with a LabVIEW® DAQ board (National Instruments Inc., USA) at a sampling rate of 180 samples·s⁻¹. Data were passed through a 5th order, zero lag, Butterworth lowpass digital filter at 10 Hz [26]. Normal and tangential force data were defined relative to the boot-pedal [15].
Experimental procedures
Each subject was fitted on the FES-LCE System (ERGYS®, Therapeutic Alliances, Inc., Fairborn, OH, USA) and seat configurations were adjusted based on anthropometry. Offline static and online dynamic tests were performed on each subject with a minimum of 2 h rest between experiments. Stimulation parameters for all the tests consisted of a sinusoidal, biphasic waveform with a frequency of 50 Hz, pulse duration of 500 μs, and phase duration of 1000 μs.
Static test
Procedures for recording peak crank torque values for the
QUAD, HAM, and GLUT muscles during offline static
testing are described in the work by Trumbower et al. [27].
Absolute crank torque values corresponding to stimulation
intensities of 70 mA, 105 mA, and 140 mA were calculated
at pedal crank positions 0, 90, 135, and 180 degrees. These
quantities were considered independent observations for
each subject’s QUAD, HAM, and GLUT muscle responses
to FES. A one-way analysis of variance (ANOVA) using a
Bonferroni post-hoc comparison test found no difference
(p > 0.05) in the mean absolute crank torque values across
the pedal crank positions.
Dynamic test
FES-LCE flywheel resistances were adjusted via an internal magnetic brake. Maximum stimulation intensities were
preset at 70 mA, 105 mA, and 140 mA prior to cycling
and referred to the highest level of stimulation allowed by
the controller during cycling. The FES-LCE proportional
feedback controller’s target cadence was set at 50 rpm for all
tests. Levels of flywheel resistance and maximum stimulation intensity were randomly assigned prior to testing. Initially, subjects were provided a warm-up of active-assisted
FES-LCE at 50 rpm. Following the warm-up, subjects pedaled for 2 min at a maximum stimulation intensity of 70
mA, 105 mA, and 140 mA and flywheel resistance of 0/8th,
1/8th, and 2/8th kilopounds (KP). During the 2-minute
cycling period, kinematic and kinetic data were collected
for 30 seconds. After recording, subjects were given a 2
min cycling cool-down followed by 5 min of rest. The dynamic test was repeated for each of the 9 combinations of
flywheel resistance and maximum stimulation intensity.
Table 1. Physical characteristics of SCI subjects

Subject    Age (yr)   Height (m)   Weight (kg)   Functional level∗∗   Time since injury (yr)   ASIA score∗
1          38         1.9          101           C5–C6                16                       C
2          54         1.7          77            T5–T6                7                        A
3          21         1.7          66            T5–T6                5                        A
4          20         1.7          79            T8–T9                5                        A
5          30         1.8          75            C4–C5                14                       C
6          22         1.8          76            T5–T6                2                        A
X̄ ± SD     31 ± 13    1.8 ± 0.1    79 ± 12                            8 ± 6

The mean and standard deviations (X̄ ± SD) for age, height, weight, and time since injury are presented.
∗ American Spinal Injury Association standard neurological classification [25].
∗∗ C – Cervical Level; T – Thoracic Level.
Table 2. Summary of the number of online observations associated with performance index of ‘Fatigue’ or ‘No Fatigue’

Classification     Observations      Observations       Observations       Total           %
                   (Max = 70 mA)     (Max = 105 mA)     (Max = 140 mA)     observations
No Fatigue         7                 15                 18                 40              74.1
Fatigue   >20%     2                 2                  0                  4               7.4
          >30%     1                 0                  0                  1               1.8
          >40%     1                 1                  0                  2               3.7
          >50%     3                 0                  0                  3               5.6
          >60%     4                 0                  0                  4               7.4
Total                                                                      54              100.0

Observations were made at maximum stimulation intensities of 70, 105, and 140 mA. An index value less than 20% corresponded to ‘No Fatigue’, while index values greater than 20% were classified as ‘Fatigue’.
Performance index
A performance index (PI) was used to quantify cycling
performance by changes in PO during short-duration FES-LCE [28]. Similar measurements of PO and anaerobic capacity have been used with able-bodied individuals during
short spurts of stationary leg pedaling [29, 30]. During this
study, PO changes were evaluated during short-duration
cycling under fixed maximum stimulation intensities and
flywheel resistances while the FES-LCE controller required
riders to maintain cycling speed at 50 rpm. The instantaneous power (P) was defined as
$$P = \tau \, \dot{\vartheta}_{cr} \qquad (1)$$
where τ is the instantaneous crank torque and ϑ̇cr is the
instantaneous crank velocity (Equation (1)). Crank torque
was defined as the moment of the pedal forces about the
crank center, where the subject applied the pedal forces to
the boot-pedal. Average crank power ( P̄ ) was calculated
for every crank revolution as
$$\bar{P} = \frac{1}{T} \int_{0}^{T} \tau \, \dot{\vartheta}_{cr} \, dt \qquad (2)$$
where T is defined as one crank period (Equation (2)).
The PI quantified the extent of decline in power during
leg cycling as
$$PI = \frac{\bar{P}_{\text{initial}} - \bar{P}_{\text{final}}}{\bar{P}_{\text{initial}}} \times 100 \qquad (3)$$
where P̄initial was the peak of the first 3 average crank power
values, which represented the initial observed crank power,
and P̄final was the peak of the last 3 average crank power
values, which represented the final observed crank power
(Equation (3)). A positive PI corresponded to a reduction
in PO.
During short-duration leg cycling at maximum stimulation intensity, a reduction in PO indicates the onset of
cycling fatigue where recruited muscle fibers (primarily
type II [31]) are unable to sustain the PO required for steady-state cycling at the preset target speed of 50 rpm. A PI
threshold level was set at 20% to separate rider performances into two classes: ‘Fatigue’ or ‘No Fatigue’. This
threshold represented a separation between individuals that
had sufficient muscle response to FES to maintain steady-state cycling and those that did not. The selected cutoff
percentage considered the speed restrictions of the current
FES-LCE controller which does not permit cycling to continue if cadence falls below 35 rpm [1]. Furthermore, the
strong dependence of the PI on cycling speed [28, 32] suggests that 20% is a reasonable threshold level to delineate
reductions in PO that are likely to cause early termination of exercise. Table 2 summarizes the number of online
observations associated with PI classes: ‘Fatigue’ or ‘No
Fatigue’.
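The performance index calculation in Equations (1) to (3) and the 20% threshold can be sketched as follows. This is a minimal illustration in Python and not the authors' Matlab implementation: the function names, the trapezoidal integration, and the treatment of a PI of exactly 20% as ‘No Fatigue’ are assumptions made for the example.

```python
def average_power(torque, velocity, dt):
    """Average crank power over one revolution, Equation (2).

    torque and velocity are equally spaced samples (N*m and rad/s) covering
    one crank period; dt is the sample spacing in seconds.
    """
    if len(torque) != len(velocity) or len(torque) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    # Instantaneous power, Equation (1).
    power = [t * w for t, w in zip(torque, velocity)]
    # Trapezoidal rule (an assumption) approximates the integral over T.
    energy = sum((power[i] + power[i + 1]) * 0.5 * dt for i in range(len(power) - 1))
    period = dt * (len(power) - 1)
    return energy / period


def performance_index(revolution_powers, window=3):
    """Percent decline in power, Equation (3), from per-revolution averages."""
    if len(revolution_powers) < window:
        raise ValueError("not enough crank revolutions")
    p_initial = max(revolution_powers[:window])   # peak of the first 3 values
    p_final = max(revolution_powers[-window:])    # peak of the last 3 values
    return (p_initial - p_final) / p_initial * 100.0


def classify(pi, threshold=20.0):
    """'Fatigue' above the threshold; exactly 20% is treated as 'No Fatigue' (assumption)."""
    return "Fatigue" if pi > threshold else "No Fatigue"
```

For example, per-revolution powers of 10, 12, 11, 9, 8, and 9 W give an initial peak of 12 W and a final peak of 9 W, so the PI is 25% and the trial would be labeled ‘Fatigue’.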
Muscle strength classification
Muscle strength in the general sense is defined as a muscle’s
ability to generate force [33]. For this study, muscle
strengths for the QUAD (SQUAD ), HAM (SHAM ), and
GLUT (SGLUT ) were specifically defined by absolute crank
torque produced by each muscle in response to FES during
offline static testing. Muscle strengths in response to 70
mA, 105 mA, and 140 mA were classified according to
their respective stimulation intensities associated with the
online PI either greater than 20% (‘Fatigue’) or less than
20% (‘No Fatigue’). Figure 1 represents muscle strengths at
70 mA, 105 mA, and 140 mA. A total of 80 offline observations were made on six subjects. Of those observations,
35 were classified as ‘No Fatigue’ and 45 were classified as ‘Fatigue’.
Fig. 1. Scatterplot of muscle strengths defined as the absolute crank torque for QUAD (SQUAD), HAM (SHAM), and GLUT (SGLUT) muscles. The plot illustrates two muscle strength classifications: ‘Fatigue’ (Stars) and ‘No Fatigue’ (Squares).
Probably approximately correct learning model
The goal of the PAC learning model is to learn a concept C. In general, C can be thought of as a function of x, where x is a set of variables. It may be computationally difficult to learn C exactly. Taking into account computational efficiency, PAC admits learning an approximation G to C. The error in learning, E(G), is defined as the probability that C(x) ≠ G(x) for an arbitrary element x. The variables of concern in this study are all possible SQUAD, SHAM, and SGLUT vectors. The concept to be learnt is the decision of whether the subject is under fatigue or not. In other words, the concept C has only two possible values, namely, ‘Fatigue’ and ‘No Fatigue’. Our goal is to learn an approximation G to C. In this case, E(G) is the fraction of the muscle strengths in the hypothesis space for which G yields an incorrect answer. Our PAC learner was built based on the “one-clause-at-a-time” learning algorithm [34, 35]. The PAC learner was trained and tested in Matlab® (The Mathworks, Inc., Natick, MA). The Boolean formula was initialized as G = 0 and clauses were sequentially added based on a fast heuristic method [36] that chooses an optimal clause to add while removing the correctly classified negative examples (Figure 2). The optimal clause was the simplest clause, based on Occam’s Razor,1 that correctly classified all positive examples and as many negative examples as possible. After processing all the examples in this fashion, the algorithm built the Boolean formula.
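A one-clause-at-a-time learner of the kind this section describes can be sketched in Python as below. It is an illustrative stand-in, not the authors' Matlab code or the exact heuristic of [36]: the literal-scoring rule (positives newly accepted divided by one plus negatives accepted), the fallback clause, and all function names are assumptions. Each clause is a disjunction that accepts every positive example, and clauses are added until every negative example is rejected.

```python
def literal_true(literal, example):
    """A literal is (index, is_positive): x_i when is_positive, else NOT x_i."""
    index, is_positive = literal
    return (example[index] == 1) == is_positive


def clause_true(clause, example):
    return any(literal_true(lit, example) for lit in clause)


def cnf_true(clauses, example):
    return all(clause_true(clause, example) for clause in clauses)


def build_clause(positives, negatives, n_vars):
    """Greedily add literals until the clause accepts every positive example."""
    literals = [(i, sign) for i in range(n_vars) for sign in (True, False)]
    clause, uncovered = [], list(positives)
    while uncovered:
        def score(lit):
            pos = sum(literal_true(lit, ex) for ex in uncovered)
            neg = sum(literal_true(lit, ex) for ex in negatives)
            return pos / (1 + neg)  # assumed scoring rule, not taken from [36]
        best = max(literals, key=score)
        clause.append(best)
        uncovered = [ex for ex in uncovered if not literal_true(best, ex)]
    return clause


def learn_cnf(positives, negatives):
    """Return a list of clauses that accepts all positives and rejects all negatives."""
    positives = [tuple(ex) for ex in positives]
    remaining = [tuple(ex) for ex in negatives]
    if set(positives) & set(remaining):
        raise ValueError("an example is labeled both positive and negative")
    n_vars = len(positives[0])
    clauses = []
    while remaining:
        clause = build_clause(positives, remaining, n_vars)
        if all(clause_true(clause, ex) for ex in remaining):
            # The greedy clause rejected nothing, so fall back to the clause
            # that is false only on the first remaining negative example.
            target = remaining[0]
            clause = [(i, target[i] == 0) for i in range(n_vars)]
        clauses.append(clause)
        # Negatives still accepted by the new clause must be handled later.
        remaining = [ex for ex in remaining if clause_true(clause, ex)]
    return clauses
```

On a toy set such as positives (1,1,0), (1,0,1), (0,1,1) and negatives (0,0,0), (1,0,0), the returned formula accepts every positive example and rejects every negative one, which mirrors the 100% training accuracy that a learner of this form reaches by construction.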
DATA ANALYSES
The PAC learning model was compared to three well-known learning models: (1) logistic regression (LR), (2)
linear discriminant analysis (LDA), and (3) artificial neural networks (ANN). Model evaluation and analyses were
performed using Matlab® (The Mathworks,
Inc., Natick, MA) and SPSS® 12.0 (SPSS, Inc., Chicago, IL,
USA).
Model evaluation
To evaluate how well each model performed in an unsupervised state, a 10-fold cross-validation was performed. The 10-fold cross-validation method was performed by randomly dividing the data set into two equally sized subsets corresponding to training and validation sets. In considering the small number of observations, the cross-validation was repeated 10 times, selecting random samples of observations (i.e., positive and negative examples) to train the models, setting aside the remaining observations for validation. Random selection of the observations was made using a set of values randomly generated with a Bernoulli distribution having a probability parameter of 0.70. The average errors for the training and validation sets were computed over the 10 repetitions.
Fig. 2. Pseudocode for the OCAT approach [35] using a fast heuristic method [36] to build a Boolean formula f in Conjunctive Normal Form (CNF). The maximum ratio refers to the ratio between the number of positive examples accepted and the number of negative examples rejected in the built clause if ai is used in the built clause.
1 Occam’s Razor, in reference to machine learning, is the development of a learner that is as simple as possible to achieve good generalization performance.
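The repeated random split described under Model evaluation can be sketched as follows. This is a hedged Python illustration rather than the authors' procedure: it assumes the Bernoulli parameter of 0.70 is the probability that an observation enters the training set, and the names `bernoulli_split`, `repeated_validation`, `fit`, and `predict` are hypothetical.

```python
import random


def bernoulli_split(n_items, p_train, rng):
    """Send each observation index to training with probability p_train."""
    train, valid = [], []
    for index in range(n_items):
        (train if rng.random() < p_train else valid).append(index)
    return train, valid


def error_rate(model, predict, xs, ys):
    wrong = sum(predict(model, x) != y for x, y in zip(xs, ys))
    return wrong / len(xs)


def repeated_validation(xs, ys, fit, predict, repeats=10, p_train=0.70, seed=0):
    """Average training and validation error over repeated random splits."""
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    rng = random.Random(seed)
    train_errors, valid_errors = [], []
    while len(train_errors) < repeats:
        train, valid = bernoulli_split(len(xs), p_train, rng)
        if not train or not valid:
            continue  # redraw a degenerate split (safeguard added for this sketch)
        train_x, train_y = [xs[i] for i in train], [ys[i] for i in train]
        valid_x, valid_y = [xs[i] for i in valid], [ys[i] for i in valid]
        model = fit(train_x, train_y)
        train_errors.append(error_rate(model, predict, train_x, train_y))
        valid_errors.append(error_rate(model, predict, valid_x, valid_y))
    return sum(train_errors) / repeats, sum(valid_errors) / repeats
```

Any of the four classifiers could be passed in through `fit` and `predict`, so the same splits logic serves every model being compared.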
Discrimination criteria (model accuracy, sensitivity, and specificity) were used to assess the quality of the classification models [37]. Model accuracy was defined as the average percentage of correct classifications. Model sensitivity was defined as the ratio of the number of ‘No Fatigue’ values correctly classified to the total number of actual ‘No Fatigue’ values (i.e., true positive ratio). Model specificity was defined as the ratio of the number of ‘Fatigue’ values correctly classified to the total number of actual ‘Fatigue’ values (i.e., true negative ratio). Sensitivity and specificity were calculated as shown in Equations 4 and 5, respectively [38].
$$\text{Sensitivity} = P(\text{correct prediction} \mid \text{‘No Fatigue’ did occur}) \qquad (4)$$
$$\text{Specificity} = P(\text{correct prediction} \mid \text{‘Fatigue’ did occur}) \qquad (5)$$
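The three discrimination criteria can be computed from paired actual and predicted labels as in the sketch below. It follows the definitions in the text, with ‘No Fatigue’ as the positive class; the function name and the dictionary return format are choices made for this example.

```python
def discrimination(actual, predicted, positive="No Fatigue"):
    """Sensitivity, specificity, and accuracy for a two-class problem.

    Assumes both classes occur at least once in `actual`; otherwise a
    ratio below would divide by zero.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),   # correct among actual 'No Fatigue'
        "specificity": tn / (tn + fp),   # correct among actual 'Fatigue'
        "accuracy": (tp + tn) / len(pairs),
    }
```

With four actual ‘No Fatigue’ and four actual ‘Fatigue’ cases, of which three in each group are predicted correctly, all three criteria equal 0.75.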
Logistic regression (LR) model
The LR model used a logarithmic function to constrain
the probability of performance outcome to 0 (‘Fatigue’)
or 1 (‘No Fatigue’). The regression coefficients were estimated using a nonlinear optimization routine (maximum
likelihood method) based on the maximum natural log of
the odds of the performance outcome occurring or not
occurring [19]. Covariates were defined by SQUAD , SHAM ,
and SGLUT values. The LR analysis employed a forward
stepwise inclusion method using a P-value of 0.05 at entry. For each observation, the predicted response was ‘No
Fatigue’ if the case’s probability was greater than the cutoff
value of 0.5.
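The decision rule above, together with the logit model that this section defines in Equations (6) and (7), can be sketched as follows. The intercept and coefficients passed to the function are placeholders supplied by the caller, not fitted values from the study, and the function names are hypothetical.

```python
import math


def logit(p):
    """Natural log of the odds, Equation (6); p must lie strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def predicted_probability(strengths, intercept, coefficients):
    """Invert Equation (7): p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, strengths))
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.5):
    """'No Fatigue' only when the probability exceeds the cutoff."""
    return "No Fatigue" if probability > cutoff else "Fatigue"
```

Applying `logit` to the output of `predicted_probability` recovers the linear predictor, which is a quick check that the two equations are consistent with each other.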
$$\operatorname{logit}(p) = \ln\left(\frac{p}{1 - p}\right) \qquad (6)$$
$$\operatorname{logit}(p) = b_0 + \sum_{i=1}^{N} b_i x_i \qquad (7)$$
The probability p of the dichotomous event occurring is related to a set of predictor variables (i.e., SQUAD, SHAM, and SGLUT) where b0 is the intercept and bi corresponds to the coefficients associated with these variables xi (Equations (6) and (7)).
Linear discriminant analysis (LDA)
The LDA was developed based on a linear combination of the SQUAD, SHAM, and SGLUT predictor values (Equation (8)).
$$D_{ik} = b_{0k} + b_{1k} x_{i1} + b_{2k} x_{i2} + \cdots + b_{qk} x_{iq} \qquad (8)$$
where Dik is the value of the kth discriminant function (‘Fatigue’ or ‘No Fatigue’) for the ith case, q is the number of predictor variables, bjk is the value of the jth coefficient of the kth function, and xij is the value of the ith case of the jth predictor. Function coefficients for SQUAD, SHAM, and SGLUT were calculated for the separate functions for each PI class (i.e., ‘Fatigue’ or ‘No Fatigue’) and average Wilks’ λ scores were also calculated to determine which variables better discriminated between groups.
Artificial neural network (ANN) model
The ANN model was designed as a multilayer perceptron built using the Matlab® Neural Network Toolbox (The Mathworks, Inc., Natick, MA) with interconnected neurons (Figure 3). The network topology consisted of 1 hidden layer and 1 output layer. This architecture has been shown to be effective in modeling nonlinear classifications and relationships between arbitrary input-output pairs [20]. Only 1 neuron was used in the linear output layer. The hidden layer was defined by a logistic function yj(n) where vj(n) is the sum of all the weighted synaptic inputs and bias bi (Equations (9) and (10)).
$$y_j(n) = \frac{1}{1 + e^{-v_j(n)}} \qquad (9)$$
$$v_j(n) = \sum_{i=1}^{N} w_i x_i + b_i \qquad (10)$$
A back-propagation algorithm [39] batch trained the ANN
until the mean-square-error between the predicted and
actual PI value was less than a preset threshold value of
0.01. Training of the ANN required less than 30 iterations. To reduce possible over-training, the process was
repeated to determine the minimum number of hidden
layer neurons needed to meet this criterion. Similar to
the LR method, the predicted response was ‘No Fatigue’ if the output was greater than the cutoff value of
0.5.
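A forward pass through a network of this shape (one logistic hidden layer feeding a single linear output neuron, followed by the 0.5 cutoff) can be sketched in Python as below. The weights and biases are arguments supplied by the caller and are purely illustrative; training by back-propagation is not shown, and the function names are hypothetical. The sketch follows the logistic hidden unit described in the text.

```python
import math


def logistic(v):
    """Logistic activation applied to a hidden neuron's summed input."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Hidden logistic layer followed by one linear output neuron."""
    hidden = [
        logistic(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


def classify(output, cutoff=0.5):
    """'No Fatigue' only when the network output exceeds the cutoff."""
    return "No Fatigue" if output > cutoff else "Fatigue"
```

With all hidden weights and biases set to zero, each hidden neuron outputs 0.5, so two hidden neurons with unit output weights give a network output of 1.0.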
Fig. 3. Artificial neural network (ANN) topology used for mapping offline muscle strength profiles (QUAD, HAM, and GLUT) to performance index (PI).
In this figure, the 2 layer ANN consisted of 6 neurons within a hidden ‘tansig’ layer and a neuron in the ‘linear’ output layer. Weighted synaptic inputs and
biases (bi ) were included in both layers.
Probably approximately correct (PAC) model
The concept to be learnt was a Boolean function f defined
by the offline muscle strength profiles (SQUAD, SHAM, and
SGLUT). Each variable assumed values in the interval [0.00,
200.00]. These data were transformed into their equivalent binary data. That is, each variable was discretized and
converted into 15 binary variables based on a precision adjustment of 10^2. Thus the transformed database was defined
on 45 binary variables. Input to the PAC learner was the
binary set of variables composed of positive and negative
examples. Examples were defined as assignments to the offline observations. A positive example was an assignment
that satisfied the formula of ‘No Fatigue’. A negative example was an assignment under which the formula evaluated
as ‘Fatigue’. The formulated Boolean functions were in
conjunctive normal form (CNF) composed of a conjunction of disjunctions of literals ai (i.e., variables and their
negations)
$$f(a_1, a_2, \ldots, a_n) = \bigwedge_{j=1}^{k} \Big( \bigvee_{i \in \upsilon_j} a_i \Big) \qquad (11)$$
where ai is either xi (binary value) or x̄i (negation), υj is the set of the indices of the atoms in the jth disjunction, and k is the number of clauses (Equation (11)) [36]. The Boolean function (x3 ∨ x̄1) ∧ (x̄2 ∨ x3 ∨ x4) is an example of a derived CNF formula where (x3 ∨ x̄1) and (x̄2 ∨ x3 ∨ x4) are the two formula clauses.
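The binary transformation and the evaluation of a CNF formula can be illustrated with the short Python sketch below. It assumes that the precision adjustment means multiplying each strength by 100 and writing the resulting integer (at most 20000, which fits in 15 bits) in plain binary, most significant bit first; the paper does not state the bit layout, so that choice and the function names are assumptions. Clauses are written as lists of signed 1-based indices, where a negative index denotes a negated variable.

```python
def encode_strength(value_nm, bits=15, scale=100):
    """Discretize a strength in [0.00, 200.00] N*m into `bits` binary variables."""
    if not 0.0 <= value_nm <= 200.0:
        raise ValueError("strength outside [0.00, 200.00]")
    level = round(value_nm * scale)            # e.g. 20.0 -> 2000
    return [(level >> k) & 1 for k in reversed(range(bits))]


def encode_profile(s_quad, s_ham, s_glut):
    """Concatenate the three encoded strengths into 45 binary variables."""
    return encode_strength(s_quad) + encode_strength(s_ham) + encode_strength(s_glut)


def evaluate_cnf(clauses, assignment):
    """Evaluate a CNF formula; `assignment` maps variable index to 0 or 1."""
    def literal_value(literal):
        return assignment[abs(literal)] == (1 if literal > 0 else 0)
    return all(any(literal_value(lit) for lit in clause) for clause in clauses)


# The example formula from the text: (x3 OR NOT x1) AND (NOT x2 OR x3 OR x4).
EXAMPLE_FORMULA = [[3, -1], [-2, 3, 4]]
```

Under this encoding, the example formula is false for x1 = 1, x2 = 1, x3 = 0, x4 = 0, because the first clause fails, and true for x1 = 0, x2 = 1, x3 = 0, x4 = 1.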
RESULTS
A total of 80 offline observations were included in model
training and validation. Table 3 summarizes means and
standard errors for thigh muscle strengths corresponding to
‘Fatigue’ and ‘No Fatigue’ at the 3 maximum stimulation
intensities. The mean SQUAD values (20.0 ± 1.4 Nm) classified as ‘No Fatigue’ were more than twice the recorded
mean SHAM values (8.4 ± 0.8 Nm) and nearly 20 times
SGLUT (1.0 ± 0.2 Nm). The mean SQUAD values classified
as ‘Fatigue’ were markedly lower (3.1 ± 0.4 Nm), as were the
SHAM values (1.1 ± 0.2 Nm). The GLUT muscle registered
the smallest mean strength values of the 3 muscles when
classified as ‘Fatigue’ (0.1 Nm) and ‘No Fatigue’ (1.0 Nm).
Table 4 summarizes the classification performances for
LR, LDA, ANN, and PAC. The PAC and ANN models
correctly identified 100% of the training set for both
‘No Fatigue’ and ‘Fatigue’ examples. Overall accuracy decreased on the validation sets for PAC (93.1%) and ANN
(89.8%). Conversely, the LR model increased in accuracy
from 91.3% on the training set to 93.1% on the validation set. The LDA recorded accuracies of 93.8% during
training and 89.9% during validation. The sensitivity scores
were lowest for PAC at 90.2%. However, PAC recorded the
highest percentage for specificity at 95.0%. An average of
approximately 4 CNF clauses and 20 ± 3 atoms were used
to build the Boolean formulas. During cross-validation,
the lowest accuracy of PAC was recorded with the only
Boolean formula containing 5 clauses.
DISCUSSION
PAC learning has been shown to be effective in classification problems where the relationship pairs are not intuitive
[35, 40]. This study indicates that using a PAC model may
be beneficial for classifying thigh muscle strengths not sufficient for short-duration FES-LCE. The model performed
well in correctly identifying the training examples (100%) and validation sets (93.1%). In particular, this study explored the utility of a PAC learning model as compared with well-known classification methods (i.e., LDA, LR, and ANN). Overall validation accuracy for the analyzed models ranged from 89.9% to 93.1%, which suggests that classifying offline thigh muscle strengths based on the performance criterion is possible.

Table 3. Group statistics for SQUAD, SHAM, and SGLUT values associated with the online performance index ‘Fatigue’ and ‘No Fatigue’ at 70, 105, and 140 mA (mean ± standard error)

Performance index   Stimulation intensity (mA)   SQUAD (Nm)    SHAM (Nm)    SGLUT (Nm)
Fatigue             70                           2.5 ± 0.4     0.8 ± 0.2    0.1 ± 0.0
                    105                          5.8 ± 1.2     2.4 ± 0.9    0.3 ± 0.1
                    140                          3.3 ± 0.0     3.8 ± 0.0    0.1 ± 0.0
                    Total                        3.1 ± 0.4     1.1 ± 0.2    0.1 ± 0.0
No Fatigue          70                           18.8 ± 4.3    6.0 ± 1.6    0.5 ± 0.1
                    105                          18.7 ± 2.7    8.5 ± 1.5    0.9 ± 0.2
                    140                          21.1 ± 1.8    8.8 ± 1.1    1.2 ± 0.3
                    Total                        20.0 ± 1.4    8.4 ± 0.8    1.0 ± 0.2
Total                                            10.6 ± 1.0    4.4 ± 0.5    0.5 ± 0.1

Table 4. Comparison of Probably Approximately Correct (PAC) learning model with logistic regression (LR), linear discriminant analysis (LDA), and artificial neural network (ANN) models for predicting PO performance based on offline muscle strength profiles

Model method   (A) Predicted, training sets (%)   (B) Predicted, validation sets (%)
               Accuracy                           Sensitivity   Specificity   Accuracy
LR             91.3                               93.0          92.9          93.1
LDA            93.8                               97.3          82.6          89.9
ANN            100.0                              91.4          93.2          92.2
PAC            100.0                              90.2          95.0          93.1

(A) summarizes the predicted PO performances from the training set and (B) summarizes the discrimination power of the models on the validation set.
The discrimination power of PAC was defined by its
sensitivity, specificity, and accuracy. These assessment tools
are important in determining the likelihood of true positive or true negative classifications. A true positive classification labeled a prospective rider as sufficient
to ride (‘No Fatigue’) when the rider was in fact sufficient. A true negative classification labeled a prospective rider
as insufficient (‘Fatigue’) and incapable of short-duration
FES-LCE. From our results, the PAC was better at classifying true negative observations (95.0%) as opposed to
true positive observations (90.2%). The PAC model performed best, overall, in terms of discriminating those
strength profiles not sufficient for FES-LCE. This has
major clinical relevance, because clinicians are required
to perform safe identification of prospective riders that
are too weak for FES-LCE so that the potential of injury from over stimulation to highly atrophied muscles is
minimized.
This study in addition to assessing the feasibility of using
a PAC learning model for FES-LCE performance classification provided insight into the contribution of individual muscle strengths to pedal power under defined FES
conditions. It was clear during analyses that contribution
of QUAD and HAM muscles dominated all classification
models. For instance, the LDA Wilks’ λ scores for the
SQUAD, SHAM, and SGLUT variables were 0.42, 0.51, and
0.73, respectively. The low λ score recorded for SQUAD
indicated that it was the best predictor variable for distinguishing between ‘Fatigue’ and ‘No Fatigue’. Moreover, the
‘Fatigue’ coefficients for SQUAD (0.37) and SHAM (0.47)
were smaller than calculated coefficients for ‘No Fatigue’,
indicating the SQUAD and SHAM that generated large absolute crank torque values were less likely to result in
‘Fatigue’ during short-duration FES-LCE. This was not
the case for the GLUT muscle where models showed no
improved model performance with the addition of the
SGLUT variable. This finding is consistent with previous
work that showed little contribution of the GLUT during FES-LCE [15]. Crago et al. [41] found that stimulation intensity determines the number of motor units recruited and the force generated. During this study, a low
maximum stimulation intensity of 70 mA resulted in reductions in PO classified as ‘Fatigue’ in 5 of the 6 tested
subjects; this stimulation intensity may not have excited
a sufficient number of muscle motor units necessary to
maintain PO during short-duration FES-LCE. Conversely,
stimulation intensities of 105 mA and 140 mA contributed
to larger torque generation and PO and, as a result,
fewer ‘Fatigue’ classifications. Because individuals with SCI differ vastly in how their muscles respond
to FES, computing user-specific muscle strength profiles
characteristic of performance was considered in this study
and should be further considered for online identification
methods.
Although each of the compared approaches has a number of
benefits, the analyses were limited
to a small population (n = 6), and drawing conclusions about their relative merits
is premature. Larger sample sizes would presumably provide
more user confidence and thus strengthen or weaken one
approach compared to another. For instance, PAC improves
its predictive power through learning from larger training
sets [20], which is not necessarily the case for the parametric
models (i.e., LDA and LR).
The LDA, LR, and ANN are cited frequently in the literature and are considered good classification methods [37].
In particular, LR and ANN are the most frequently used
models in medical research, followed by LDA, as defined
by the number of indexes found in PUBMED (Logistic
Regression – 59,935 indexes; Neural Networks – 9912
indexes; Linear Discriminant Analysis – 1273 indexes).
Aside from frequent use, there are important distinctions
worth noting between the compared methods. The LDA
and LR are customary parametric methods that draw on
many statistical assumptions. For instance, the LDA [21]
assumes that the predictors are not highly correlated with
each other and the mean and variance of the three predictors are also not correlated [42]. The LR requires a
dichotomous performance outcome; other methods
are better suited to situations that involve more than
two outcomes with inherent ordering (ordinal regression) or without inherent ordering (multinomial logistic
regression), where the performance measure and predictors are scaled (linear regression), or where the dependent variable
is scaled and some or all of the predictors are categorical (generalized linear model univariate regression) [43].
If the studied classification schemes are modified, whether by adding muscles as predictors
or by providing additional performance subsets, the functionality of these parametric methods would inevitably
falter.
A significant advantage that both PAC and ANN have
over LR and LDA is better adaptation to changes in the
classification schemes. This is due to their intrinsic ability
to model any function. PAC and ANN are non-parametric
models that, unlike LR and LDA, do not require strict assumptions about the data distribution. This data tolerance avoids potential
error from incorrect assumptions and requires minimal user knowledge of statistics. However, during this study, ANN
presented limitations not evident from its predictive
performance. ANN’s performance was easily assessed, but
finding the optimal topology was not straightforward. An
appropriate design was found by trial and error, as is typical when using ANN. In many cases,
this type of model selection can lead to over-fitting errors
on training data thus reducing its overall generalizability
[20]. The PAC learning model does not have the inherent problems of parametric models or the design limitations of ANN. It provides a number of favorable intrinsic properties not available in other methods, thus making
it a worthy alternative. The OCAT algorithm formulated
in this study in fact reduces the risk of overfitting by reducing the model complexity to its simplest form. By definition, PAC learns a Boolean function that completely
represents both positive and negative examples while minimizing the size of clauses and total number of literals used
[36].
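To make concrete what it means for a Boolean function to completely represent the examples while using few literals, the Python sketch below checks a formula in conjunctive normal form against labeled examples and counts its literals. It is an illustration only and not the OCAT implementation used in this study: the binary feature names (`quad_high`, `ham_high`, `glut_high`), the thresholding they imply, and all example data are hypothetical.

```python
# A CNF formula is a list of clauses; each clause is a list of literals.
# A literal (name, wanted) is true when binary feature `name` equals `wanted`.

def clause_true(clause, example):
    """A clause (disjunction) holds if any of its literals holds."""
    return any(example[name] == wanted for name, wanted in clause)


def formula_true(cnf, example):
    """A CNF formula (conjunction of clauses) holds if every clause holds."""
    return all(clause_true(clause, example) for clause in cnf)


def is_consistent(cnf, positives, negatives):
    """True if the formula accepts every positive and rejects every negative."""
    return (all(formula_true(cnf, x) for x in positives)
            and not any(formula_true(cnf, x) for x in negatives))


def literal_count(cnf):
    """Total number of literals, a simple measure of formula complexity."""
    return sum(len(clause) for clause in cnf)


# Hypothetical binarized strength profiles (1 = above an assumed threshold).
positives = [  # stands in for 'No Fatigue'
    {"quad_high": 1, "ham_high": 1, "glut_high": 0},
    {"quad_high": 1, "ham_high": 1, "glut_high": 1},
]
negatives = [  # stands in for 'Fatigue'
    {"quad_high": 0, "ham_high": 1, "glut_high": 1},
    {"quad_high": 1, "ham_high": 0, "glut_high": 0},
]

compact = [[("quad_high", 1)], [("ham_high", 1)]]
verbose = [[("quad_high", 1), ("glut_high", 1)],
           [("quad_high", 1)], [("ham_high", 1)]]

for name, cnf in (("compact", compact), ("verbose", verbose)):
    print(name, is_consistent(cnf, positives, negatives), literal_count(cnf))
```

Both formulas are consistent with the hypothetical examples, but the compact one uses fewer literals, which is the kind of formula a complexity-minimizing learner would prefer.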
PAC learning models may be used to evaluate multiple rules on multiple outcome spaces. The capacity of
PAC learning is not limited to three variables (i.e., SQUAD,
SHAM, and SGLUT), nor is it limited to the dichotomous
outcome measure (i.e., ‘Fatigue’ or ‘No Fatigue’). PAC
is promising for learning probability subsets that define
the likelihood of online rider performance for
n-dimensional muscle strengths without significantly altering the learning framework. Prior to this preliminary
work, PAC learning had never been applied to this type of
clinical application. Based on PAC’s overall performance,
it deserves consideration as a decision-support system for
FES-LCE. Another crucial advantage that PAC learning
offers over the other methods is its amenability to theoretical analyses. For instance, high probability convergence
bounds can be proven for PAC learning algorithms. On the
other hand, even for simple ANNs, convergence proofs are
difficult to obtain.
Other classification models such as support vector machines [44], k-nearest neighbors [45], and decision trees
[46] may also be further explored. These particular models
differ from PAC and the other compared methods in that
they do not provide a functional form along with parameters to describe the input-output relationship and behavior
[37]. The contributions of model parameters were interpretable when using LDA, LR, and PAC. In contrast, ANN
lacked interpretability of its weights and biases at the level of
the individual strength predictors and offered little intuitive
understanding of its functionality. The PAC model provided a Boolean formula that was easy to interpret without
the need for statistical understanding. Additional control
parameters ε and δ are also provided for PAC learning. Here
ε is known as the accuracy parameter and δ is the confidence parameter. We say that a PAC algorithm is capable of
learning a concept with parameters ε and δ if the probability
that the error in learning is greater than ε is at most δ [23].
These control parameters are helpful in determining the
sample complexity issues that are part of any learning model
system. The focus of sample complexity is on how large the
sample size should be to acquire sufficient information for the
learner to learn a new concept (i.e., performance index).
Further study is needed to explore the issues of sample
complexity and parameter control of PAC learning as well
as the computational complexity as it relates to this clinical
application.
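As a rough illustration of how ε and δ drive sample size, the Python sketch below evaluates a standard textbook bound for a learner that outputs a hypothesis consistent with its training examples from a finite hypothesis class H: m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice. This bound was not derived or used in this study; the function name and the choice of H (conjunctions over three Boolean variables, so |H| = 3^3) are assumptions made for the example.

```python
import math


def sample_bound(hypothesis_count, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite class.

    Implements m >= (ln|H| + ln(1/delta)) / epsilon, rounded up.
    epsilon: accuracy parameter (tolerated error), 0 < epsilon < 1
    delta:   confidence parameter (tolerated failure probability), 0 < delta < 1
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


# Assumed example: each of 3 Boolean variables is used plainly, negated,
# or left out of a conjunction, giving 3**3 = 27 hypotheses.
m = sample_bound(hypothesis_count=3 ** 3, epsilon=0.1, delta=0.05)
print(m)
```

The bound grows linearly in 1/ε but only logarithmically in 1/δ and in |H|, so demanding higher accuracy is far more expensive in examples than demanding higher confidence or adding predictors.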
Overall, the PAC model learns examples well, but the
strength of any model is only as good as the data it learns
from. Although the PAC model had strong sensitivity and
specificity for this study, its performance was defined by
measurements that may not have truly identified the individual muscle strengths in response to FES. Trumbower
and Faghri [15] reported overflow responses in some individuals during FES-LCE, which induced inappropriate
reflexive responses in leg muscles leading to cocontraction
and spastic reflexive withdrawal of thigh muscles. Spasticity was not quantified or predicted during this study,
but should be further assessed to ensure the data sets well
represent the overall population. Also keep in mind that
this study did not evaluate how close the predictions of
the PAC model were to real underlying probabilities, because data were collected from individuals already capable
of riding the FES-LCE system. Future calibration studies
are suggested to assess whether there are differences between average model observations and average outcomes of
larger sample sizes, including persons without the muscle
strength to participate in FES-LCE exercise. Calibration
may be used to determine whether there are statistically
significant differences between the expected and observed
outcomes [48] and would be vital when developing a PAC
learning system that may help in the identification of a
prospective rider’s potential.
The reader must use caution when drawing conclusions
about the predictability of PAC or the other modeling
methods. The models were developed to classify FES-induced offline strength characteristics as predictors of online performance. The criterion used for classifying online
performance was based on short-duration anaerobic FES-LCE and did not attempt to infer a direct link between
offline muscle responses to FES and a rider’s muscle activation dynamics, muscle contraction dynamics, exercise
duration potential, or aerobic capacity. The classification
model’s predictive power for classifying offline thigh muscle strengths was defined by a dichotomous online performance criterion and is not suggestive of one model
type being superior to the others. Future studies are needed to fully evaluate the performance criterion in a larger population of
persons with SCI with and without FES-LCE experience
before one can ascertain the potential of using a PAC learning model as a decision-support system for this clinical
application.
In summary, FES-LCE is a therapeutic exercise intended to
improve strength and cardiovascular fitness in persons with
SCI. However, many individuals may not be appropriate
for FES-LCE training because no systematic method
exists for identifying whether or not a prospective rider
has the muscle strengths in response to FES needed for
FES-LCE exercise. The results of this study indicate that
the tested models were good estimators. However, PAC
may be a favorable choice for systematic classification of
offline muscle strengths of prospective riders of FES-LCE,
because of its intuitive design. The approach may readily
assist clinicians in identifying muscle response characteristics that are indicative of persons who are likely to
perform FES-LCE and use those characteristics to classify
riders as ‘strong’ or ‘weak’. Future classification schemes
using this approach may consider probability subsets that
identify a percent likelihood of sufficient strength profiles
that further classify a rider’s potential. This supervised
learning approach may remove the clinical uncertainties
in prescribing FES-LCE exercise based on muscle strength
profiles in persons with SCI.
References
1. Faghri PD, Glaser RM, Figoni SF. Functional electrical stimulation leg cycle ergometer exercise: training effects on cardiorespiratory responses of spinal cord injured subjects at rest and during
submaximal exercise. Arch Phys Med Rehabil 1992; 73: 1085–
1093.
2. Glaser RM. Functional neuromuscular stimulation. Exercise
conditioning of spinal cord injured patients. Int J Sports Med
1994; 15: 142–148.
3. Petrofsky J. Bicycle ergometer for paralyzed muscles. Journal of
Clinical Engineering 1984; 9: 13–19.
4. Petrofsky J, Stacy. The effect of training on endurance and the
cardiovascular responses of individuals with paraplegia during
dynamic exercise induced by functional electrical stimulation.
European Journal of Applied Physiology 1992; 64: 487–492.
5. Faghri P, Glaser R. Feasibility of using two FES exercise modes
for spinal cord injured patients. Clinical Kinesiology 1989; 44.
6. Hooker SP, Figoni SF, Glaser RM, Rodgers MM, Ezenwa BN,
Faghri PD. Physiologic responses to prolonged electrically stimulated leg-cycle exercise in the spinal cord injured. Arch Phys
Med Rehabil 1990; 71: 863–869.
7. Leeds EM, Klose KJ, Ganz W, Serafini A, Green BA. Bone
mineral density after bicycle ergometry training. Arch Phys Med
Rehabil 1990; 71: 207–209.
8. Ragnarsson K, Pollack S, O’Daniel W, Edgar R, Petrofsky J,
Nash M. Clinical evaluation of computerized functional electrical stimulation after spinal cord injury: a multicenter pilot
study. Arch Phys Med Rehabil 1988; 69: 672–677.
9. Figoni SF. Perspectives on cardiovascular fitness and SCI. J Am
Paraplegia Soc 1990; 13: 63–71.
10. Gurney AB, Robergs RA, Aisenbrey J, Cordova JC, McClanahan L. Detraining from total body exercise ergometry in individuals with spinal cord injury. Spinal Cord 1998; 36: 782–789.
11. Franco JC, Perell KL, Gregor RJ, Scremin AM. Knee kinetics
during functional electrical stimulation induced cycling in subjects with spinal cord injury: a preliminary study. J Rehabil Res
Dev 1999; 36: 207–216.
12. Gerrits HL, de Haan A, Sargeant AJ, Dallmeijer A, Hopman
MT. Altered contractile properties of the quadriceps muscle in
people with spinal cord injury following functional electrical
stimulated cycle training. Spinal Cord 2000; 38: 214–223.
13. Vestergaard P, Krogh K, Rejnmark L, Mosekilde L. Fracture
rates and risk factors for fractures in patients with spinal cord
injury. Spinal Cord 1998; 36: 790–796.
14. Schutte L, Rodgers M, Zajac F, Glaser R. Improving the efficacy of electrical-stimulation induced leg cycle ergometry, in
Mechanical Engineering, Stanford University, Palo Alto CA 1993.
15. Trumbower RD, Faghri PD. Crank torque profile of leg muscles
at different stimulation intensities and pedal crank positions on
a recumbent leg cycle ergometer. Presented at ACRM Conference 2005.
16. Figoni SF, Rodgers MM, Glaser RM, Hooker SP, Faghri PD, Ezenwa BN, Mathews T, Suryaprasad AG, Gupta SC. Physiologic responses of paraplegics and quadriplegics to passive and active leg cycle ergometry. J Am Paraplegia Soc 1990; 13: 33–39.
17. Figoni S, Rodgers M, Glaser R, Hooker S, Faghri P, Ezenwa B, Mathews T, Suryaprasad A, Gupta S. Physiologic responses of paraplegics and quadriplegics to passive and active leg cycle ergometry. J Am Paraplegia Soc 1990; 13: 33–39.
18. Gfohler M, Lugner P. Cycling by means of functional electrical stimulation. IEEE Trans Rehabil Eng 2000; 8: 233–243.
19. Myers R. Classical and modern regression with applications, 2nd edition, Duxbury Press, Belmont 1990.
20. Haykin S. Neural networks, 2nd Edition 1999.
21. Fisher R. The use of multiple measurements in taxonomic problems. Ann Eugen 1936; 10: 422–429.
22. Fukunaga K. Introduction to statistical pattern recognition. Academic Press, San Diego 1990.
23. Valiant L. A theory of the learnable. Communications of the ACM 1984; 17: 1134–1142.
24. Angluin D, Smith C. Inductive inference: theory and methods. Computing Surveys 1983; 15: 237–269.
25. Center NSCIS. The 2004 annual statistical report for the model spinal cord injury care systems. University of Alabama at Birmingham, Birmingham 2004.
26. Winter D. Biomechanics and Motor Control of Human Movement, 2nd edition, John Wiley & Sons, Inc, New York, 1990.
27. Trumbower R, Faghri P. FES-induced pedal force generation of individual leg muscles during a single crank rotation in persons with SCI, Spinal Cord, Submitted.
28. Faghri PD, Trumbower RD. Short-duration FES-induced leg cycling dynamics at different stimulation intensities and flywheel resistances, presented at IFESS 2005, Montreal, Canada 2005.
29. Thorstensson A, Karlsson J. Fatiguability and fiber composition of human skeletal muscle. Acta Physio Scand 1976; 98: 318–322.
30. Vandewalle H, Peres G, Monod H. Standard anaerobic exercise tests. Sports Med 1987; 4: 268–289.
31. Burnham R, Martin T, Stein R, Bell G, MacLean I, Steadward R. Skeletal muscle fibre type transformation following spinal cord injury. Spinal Cord 1997; 35: 86–91.
32. McCartney N, Obminski G, Heigenhauser GJ. Torque-velocity relationship in isokinetic cycling exercise. J Appl Physiol 1985; 58: 1459–1462.
33. Siff M. Biomechanical foundations of strength and power training, in V. Zatsiorsky (ed.), Biomechanics in sport, Blackwell Scientific Ltd., London 2000, pp. 103–139.
34. Triantaphyllou E, Soyster A, Kumara S. Generating logical expressions from positive and negative examples via a branch-and-bound approach. Computers Ops Res 1994; 21: 185–197.
35. Sanchez S, Triantaphyllou E, Chen J, Liao T. An incremental
learning algorithm for constructing Boolean functions from positive and negative examples. Computers & Operations Research
2002; 29: 1677–1700.
36. Deshpande A, Triantaphyllou E. A greedy randomized adaptive
search procedure (GRASP) for inferring logical clauses from
examples in polynomial time and some extensions. Mathematical
and Computer Modelling 1998; 27: 75–99.
37. Dreiseitl S, Ohno-Machado L. Logistic regression and artificial
neural network classification models: a methodology review. J
Biomed Inform 2002; 35: 352–359.
38. Subasi A, Ercelebi E. Classification of EEG signals using neural
network and logistic regression. Comput. Methods Programs
Biomed 2005; 78: 87–99.
39. Rumelhart D, McClelland J. Parallel distributed processing: explorations in the microstructure of cognition. Vol. 1, MIT Press,
Cambridge 1986.
40. Kearns M, Vazirani U. An introduction to computational learning theory. MIT Press 1994.
41. Crago PE, Peckham PH, Thrope GB. Modulation of muscle
force by recruitment during intramuscular stimulation. IEEE
Trans Biomed. Eng 1980; 27: 679–684.
42. Johnson R, Wichern D. Applied multivariate statistical analysis,
4th edition. Prentice-Hall, Saddle River 1998.
43. Windows SF. SPSS 12.0 for Windows, Release 12.0.0 ed. SPSS,
Inc Chicago 2003.
44. Cristianini N, Shawe-Taylor J. An introduction to support vector
machines and other kernel-based learning methods. University
Press, Cambridge 2000.
45. Dasarathy B. Nearest neighbor pattern classification techniques.
IEEE Computer Society Press, Silver Spring 1991.
46. Breiman L, Friedman J, Olshen R, Stone C. Classification and
regression trees. Chapman & Hall, New York 1984.
47. Trumbower RD, Faghri PD. Relationship between isometric
pedal force generation and stimulation intensity of individual
leg muscles involved in FES-induced leg cycling. Presented at
IFESS 2005, Montreal, Canada 2005.
48. Hosmer D, Lemeshow S. Applied logistic regression, 2nd ed.
Wiley, New York, 2000.