IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, VOL. 43, NO. 1, JANUARY 2013

Emotional State Classification in Patient–Robot Interaction Using Wavelet Analysis and Statistics-Based Feature Selection

Manida Swangnetr and David B. Kaber, Member, IEEE

Abstract—Due to a major shortage of nurses in the U.S., future healthcare service robots are expected to be used in tasks involving direct interaction with patients. Consequently, there is a need to design nursing robots with the capability to detect and respond to patient emotional states and to facilitate positive experiences in healthcare. The objective of this study was to develop a new computational algorithm for accurate patient emotional state classification in interaction with nursing robots during medical service. A simulated medicine delivery experiment was conducted at two nursing homes using a robot with different human-like features. Physiological signals, including heart rate (HR) and galvanic skin response (GSR), as well as subjective ratings of valence (happy–unhappy) and arousal (excited–bored), were collected from elderly residents. A three-stage emotional state classification algorithm was applied to these data, including: 1) physiological feature extraction; 2) statistics-based feature selection; and 3) a machine-learning model of emotional states. A pre-processed HR signal was used. GSR signals were nonstationary and noisy and were further processed using wavelet analysis. A set of wavelet coefficients, representing GSR features, was used as a basis for current emotional state classification. Arousal and valence were significantly explained by statistical features of the HR signal and by GSR wavelet features. Wavelet-based de-noising of GSR signals led to an increase in the percentage of correct classifications of emotional states and clearer relationships between the physiological responses and arousal and valence.
The new algorithm may serve as an effective method for future service robot real-time detection of patient emotional states and behavior adaptation to promote positive healthcare experiences.

Index Terms—Emotions, machine learning, physiological variables, regression analysis, service robots, wavelet analysis.

I. INTRODUCTION

Use of service robots in healthcare operations represents a potential technological solution to relieving overloaded nursing staffs in hospitals [1] for critical tasks and to increasing accuracy and reliability in basic nursing task performance (e.g., medication administration). During the past two decades, the healthcare industry, including hospitals and nursing homes, has identified the capability of mobile transport robots to assist nurses in routine patient services. Hospital delivery robots are now used to automatically transport prescribed medicines to nurse stations, meals and linens to patient rooms, and medical records or specimens to labs. Existing commercial robots, like the Aethon TUG, are capable of autonomous point-to-point navigation in healthcare environments. Although such robots have been implemented in many hospitals, they do not deliver medicines or other healthcare-related materials directly to patients; nurses must always go between robots and patients. As a result, current robot designs do not support direct interaction with patients. The robotics industry is currently seeking research results on how different features of robots support interactive tasks or social interaction between humans and robots as part of healthcare operations. In nursing operations, patient care requires timely and careful task performance as well as support of positive patient emotions. If future service robots are to be used in tasks involving direct interaction with patients, it is important to understand how patients perceive robots and evaluate robot performance as a basis for judging healthcare quality.
Since emotions play an important role for patients in communication and interaction with hospital staff (e.g., describing pain), there is a need to design service robots to be capable of detecting, classifying, and responding to patient emotions to achieve positive healthcare experiences.

Manuscript received June 6, 2011; revised February 10, 2012; accepted June 18, 2012. Date of publication September 12, 2012; date of current version December 21, 2012. This work was supported in part by the Edward P. Fitts Department of Industrial and Systems Engineering at North Carolina State University. This paper was recommended by Associate Editor R. Roberts of the former IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans (2011 Impact Factor: 2.123). M. Swangnetr is with the Back, Neck, and Other Joint Pain Research Group, Department of Production Technology, Khon Kaen University, Khon Kaen 40002, Thailand (e-mail: manida@kku.ac.th). D. B. Kaber is with the Edward P. Fitts Department of Industrial and Systems Engineering, North Carolina State University, Raleigh, NC 27695-7906 USA (e-mail: dbkaber@ncsu.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSMCA.2012.2210408

A. Subjective Measures of Emotional States

Russell's [2] theory of emotion is the most widely accepted in psychology and contends that all emotions can be organized in terms of continuous dimensions. The theory includes a 2-D emotion space, defined by valence (pleasantness/unpleasantness) and arousal (strong engagement/disengagement), as presented in Fig. 1. Watson and Tellegen [3] suggested that the axes of the emotion space should pass through regions where emotion labels used by individuals are most densely clustered.
They conducted a factor analysis with varimax rotation, resulting in a model including positive affect (PA) and negative affect (NA), which was a 45° rotation from Russell's [2] valence-arousal model. However, Russell and Feldman-Barrett [4] showed that when placing more emotion terms into the 2-D space, the increased term density provided no further guidance for factor model rotation.

Fig. 1. Two-dimensional valence-arousal structure of affect (adapted from Watson & Tellegen, 1985).

Reisenzein [5] also offered that the valence-arousal model provides conceptually separate building blocks of core affective feelings. For example, Reisenzein said enthusiasm is a PA that is a blend of pleasantness and strong engagement, and distress is a NA that is a blend of unpleasantness and strong engagement. For this reason, in the present study, Russell's [2] 2-D model of emotion was used.

Subjective measures of emotion include self-reports, interviews on emotional experiences, and questionnaires on which a participant identifies images of expressions or phrases that most closely resemble their current feelings. Previous research has used image-based self-report measures for assessing human emotional states in terms of valence and arousal, such as the Self-Assessment Manikin (SAM) [6]. Image-based questionnaires are designed to overcome the disadvantage of subjects having to label emotions, which can lead to inconsistency in responses. The SAM consists of pictures of manikins representing five states of arousal (ranging from "excited" to "bored") and five states of valence (ranging from "very happy" to "very unhappy").
Subjects can rate their current emotional state by either selecting a manikin or marking in a space between two manikins, resulting in a nine-point scale.

B. Physiological Measures of Emotional States

Physiological responses have also been identified as reliable indicators of human emotional and cognitive states. They are considered automatic outcomes of the autonomic nervous system (ANS), primarily driven by emotions. The ANS is composed of two main subsystems: the sympathetic nervous system, which tends to mobilize the body for emergencies (the "fight-or-flight" response), and the parasympathetic nervous system, which tends to conserve and store bodily resources (the "rest-and-digest" response). Among several physiological measures, heart rate (HR) and galvanic skin response (GSR) are the two most commonly used for revealing states of arousal and valence [7]. These responses are also relatively simple and inexpensive to measure with minimal intrusiveness to subject behavior [8].

1) HR: HR is usually derived by detecting "QRS" wave complexes in an electrocardiographic (ECG) record and determining the intervals, in milliseconds, between adjacent "R" wave peaks. HR, in beats per minute (bpm), can be directly calculated from these "RR" intervals. There are also several statistical measures that can be determined on HR (see Malik et al. [9] for a comprehensive description). Common features used in studies of human emotion and cognitive states include mean HR (in bpm) and the standard deviation of HR (SDHR), also expressed in bpm. HR has been previously used to differentiate between user positive and negative emotions in human–computer interaction (HCI) tasks. Mandryk and Atkins [10] developed fuzzy rules, based on a literature review, defining how physiological signals relate to arousal and valence. They asserted that when HR is high, arousal and valence are also high.
However, other studies have shown that there are no observed HR differences between positive and negative emotions [11]–[13]. Therefore, the relationships between HR and valence and arousal may not be definite. Due to the practicality of HR for emotion state classification, further assessment of these relationships was conducted in this study.

2) GSR: GSR measures electrodermal activity in terms of changes in resistance across two regions of skin. A voltage is applied between two electrodes attached to the skin, and the resulting current is proportional to the skin conductance (SC, in μSiemens), or inversely proportional to the skin resistance. The response is typically large and varies slowly over time; however, it has been found to fluctuate quickly during mental, physical, and emotional arousal. A GSR signal consists of two main components: the SC level (SCL), or tonic level, which refers to the baseline level of response; and the SC response (SCR), or phasic response, which refers to changes from the baseline causing a momentary increase in SC (i.e., a small wave superimposed on the SCL). A SCR normally occurs in the presence of a stimulus; however, a SCR that appears during rest periods, or in the absence of stimuli, is referred to as a "nonspecific" SCR (NS-SCR). Fig. 2 illustrates a typical GSR waveform. Dawson et al. [7] identified GSR measures related to emotional state, including: SCL; change in SCL; frequency of NS-SCRs; SCR amplitude; SCR latency; SCR rise time; SCR half recovery time; SCR habituation (number of stimuli occurring before no response); and slope of SCR habituation. The most commonly used measure is the amplitude of the SCR. Change in SC is widely accepted to reflect cognitive activity and emotional response with a linear correlation to arousal. Specifically, SC has been found to increase as arousal increases [7], [10], [14], [15].
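As an illustration of the tonic/phasic split described above, the following sketch estimates the SCL with a simple moving average and treats the residual as the SCR component. This is a simplification for illustration only: the function names and window length are our own (not from the paper), and dedicated electrodermal analysis tools use more principled deconvolution methods.

```python
import numpy as np

def scl_scr(sc, fs=1024, win_s=4.0):
    """Split a skin-conductance trace into tonic (SCL) and phasic (SCR) parts.

    The slow-moving SCL is estimated with a moving average over win_s
    seconds; the residual fast fluctuations are treated as the SCR.
    """
    sc = np.asarray(sc, dtype=float)
    w = max(1, int(win_s * fs))
    scl = np.convolve(sc, np.ones(w) / w, mode="same")  # tonic baseline
    scr = sc - scl                                      # phasic component
    return scl, scr

def scr_amplitude(scr):
    """The most commonly used GSR measure: peak amplitude of the SCR."""
    return float(np.max(scr))
```

Note that the moving average suffers edge effects where the window is only partially filled, so in practice roughly half a window should be trimmed from each end of the trace before reading off SCR amplitude.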
However, the relationship between SC and valence is not definite. Although Dawson et al. [7] report that in most studies SC does not distinguish positive from negative valence stimuli, other research has demonstrated some relationship between SC and valence. For example, increases in the magnitude of SC have been associated with negative valence [15], [16]. Relatively few studies have examined the psychological significance of SCL and NS-SCRs produced during the performance of on-going tasks. Dawson et al. [7] reported relations between these measures and task engagement as well as emotions. Typically, SCL increases about 1 μSiemen above resting level during anticipation of a task and then increases another 1 to 2 μSiemens during performance. Pecchinenda and Smith [17] measured NS-SCR rate, and maximum amplitude and slope of SCL, during a difficult problem-solving task. Results showed that SC activity increased at the start of trials but decreased by the end under the most difficult condition.

C. Current Modeling Approaches for Classifying Emotional States

Although physiological signals can be noisy and some may lack definitive relationships with emotional states, numerous studies have been conducted on emotional state classification using such objective data (e.g., [10], [15]–[20]). Emotion modeling approaches vary in terms of both physiological variable inputs and classification methods. Analysis of variance (ANOVA) and linear regression are the most common methods for identifying physiological signal features that may be significant in differentiating among emotional states and for selecting sets of significant features for predicting emotions, respectively [18], [19]. However, these approaches assume linear relationships between physiological responses and emotional states. Fuzzy logic models are an alternative classification approach for dealing with nonlinear relationships and uncertainty among system inputs and outputs.
Such models can represent continuous processes that are not easily divided into discrete segments; that is, when a change from one linguistically defined state to another is not clear. Mandryk and Atkins [10] developed a fuzzy logic model to transform HR, GSR, and facial electromyography (EMG) for smiling and frowning into arousal and valence states during video game play. A second model was used to classify arousal and valence into five lower level emotional states related to the gaming situation, including: boredom, challenge, excitement, frustration, and fun. Model results revealed the same trends as self-reported emotions for fun, boredom, and excitement. This approach has the advantage of describing variations among specific emotional states during the course of a complete emotional experience. The major drawback is that the fuzzy rules used in a fuzzy system for classification problems must be constructed manually by experts in the problem domain. However, as described above, previous research has not been able to clearly define the relationships between physiological responses (e.g., HR and GSR) and specific emotional states (valence and arousal).

Machine learning approaches have also been used to deal with nonlinear relationships and uncertainty in emotion classification based on physiological responses. Machine learning algorithms can automatically generate models, including rules and patterns, from data. Supervised learning algorithms are used to identify functions for mapping inputs to desired outputs based on training data and are later validated against test data. Artificial neural networks (ANN) are a common form used for human emotional state classification. Lee et al.
[16] applied a multilayer perceptron (MLP) network to recognize emotions from the standard deviation of ECG RR intervals (in ms), the root mean-square of the difference in successive RR intervals (in ms), mean HR, the low-frequency/high-frequency ratio of HR variability (HRV), and the SC magnitude for GSR. By using ratings from the SAM questionnaire as desired outputs, the network was able to learn sadness, calm pleasure, interesting pleasure, and fear with correct classifications of 80.2%. Lisetti and Nasoz [20] compared three different machine learning algorithms, including k-nearest neighbor (KNN), discriminant function analysis (DFA), and a neural network using a Marquardt backpropagation (MBP) algorithm, for emotion classification in an HCI application. They used minimum, maximum, mean, and variance values of normalized GSR, body temperature, and HR as algorithm inputs, while emotion states/outputs (sadness, anger, fear, surprise, frustration, and amusement) were elicited based on movie clips. Results showed that emotion recognition by the KNN was 72.3% accurate, the DFA was 75.0% accurate, and the MBP NN was 84.1% accurate.

Instead of learning from labeled data, unsupervised learning approaches automatically discover patterns in a data set. Amershi et al. [15] applied an unsupervised clustering technique to affective expressions in educational games. They identified several influential features from SC, HR, and EMG for smile and frown muscles. Results showed that only a few statistical features of the responses (e.g., mean and standard deviation) were relevant for defining clusters. In addition, clustering was able to identify meaningful patterns of reactions within the data.

Some studies have been conducted on recognizing human emotional states when interacting with robots. Itoh et al. [21] developed a bioinstrumentation system to measure human stress states during human–robot interaction (HRI). However, stress responses do not strictly covary with emotional responses.
For example, different levels of stress may occur with an emotional response of fear. Moreover, this study only used HRV as an indicator of stress, based on a defined classification rule in which the ratio of sympathetic/parasympathetic responses increases if a human feels stress. Kulic and Croft [22] used physiological responses to assess subject emotional states of valence and arousal by using the motion of a robot manipulator arm as a stimulus. The emotional state assessment was based on a rule-based classification model constructed from a literature review. Subjective responses were used separately to measure participant discrete emotion responses to the robot motion. A series of studies by Liu, Sarkar, and colleagues [23]–[25] developed a human emotional state assessment model based on physiological responses in an HCI scenario. The model was then used for real-time emotion classification in the same scenario. However, the manner in which a human interacts with a robot is similar but not identical to interactions between a human and a computer [26].

D. Motivation and Objective

Real-world applications for interactive robots in hospital environments are being developed. Physiological responses, such as HR and GSR signals from patients, can be monitored, with minimal intrusiveness, in real-time in hospitals for determining patient status. Therefore, it is possible that physiological measures can be extracted and emotional states classified in real-time during patient interaction with robots. This situation provides a basis for performing real-time robot expression modification according to current human emotional states and for ensuring quality in robot-assisted healthcare operations.
Although there are several on-going research studies on emotional state identification based on physiological data, the literature reveals relatively few classification models for recognizing human emotions when interacting with service robots, particularly nursing robots in medicine delivery tasks. Some methods of emotion classification have been adopted and/or modified for testing in human interaction with humanoid robots (e.g., [21]). However, the manner and circumstances in which patients interact with a robot may be similar, but are not identical, to interactions between a healthy human and a robot in other situations. As nursing robots are expected to become common in hospitals of the future, it is important to develop accurate methods for assessing patient responses to such robots providing medical services. It also appears that the relationships between physiological responses (e.g., HR and GSR) and emotional states (valence and arousal) are not well defined. Therefore, further assessment is needed through sensitive physiological feature identification along with robust emotional state classification modeling. One major problem when analyzing physiological signals is noise interference. Additional signal processing must be applied to attenuate noise without distorting signal characteristics. Since physiological signals are nonstationary [27] and may include random artifacts and other unpredictable phenomena, they are problematic for several signal processing methods, such as the fast Fourier transform (FFT). A wavelet transform, a tool for the analysis of transient, nonstationary, or time-varying phenomena, is often useful for such processing. Prior studies have represented stochastic physiological signals using statistical features (based on expert domain knowledge) to classify emotional states. Unfortunately, information can be lost with such features as simplifying assumptions are made, such as assuming knowledge of the probability density function of the data.
Furthermore, there may be signal features that have not been identified by experts but have the potential to significantly improve emotion classification accuracy. It has been suggested that signal processing features may be useful for this purpose [27]. Considering these research issues, the objectives of the present study were to:

1) Assess the relationships between physiological responses, including HR and GSR, and emotional states in terms of valence and arousal during patient–robot interaction (PRI). Arousal states were expected to be better explained by GSR features, while valence states were expected to be better explained by HR features.
2) Develop a machine learning algorithm for accurate patient emotional state classification with the potential to classify states in real-time during PRI.
3) Examine the utility of advanced signal processing features for representing physiological signals in emotional state identification. Wavelet coefficients can be used as a compressed representation of amplitude, time, and frequency features of physiological signals.
4) Develop a wavelet-based de-noising algorithm by identifying the noise distribution and features of a reference signal to eliminate those noise features overlapping with the informative signal frequency.
5) Identify significant wavelet-based features for emotional state classification. A statistical approach can be used to identify physiological features with utility for classifying emotional states.

II. EXPERIMENT

An experiment was conducted to develop an empirical data set that could be used as a basis for addressing the above objectives. Observations on human emotional states and physiological responses in interacting with an actual service robot in the target context were needed to develop the emotional state classification algorithm and to demonstrate the wavelet-based signal processing methods.

A. Procedure

With the aging U.S.
population, the Health Resources and Services Administration has predicted that elderly persons will represent the primary future users of healthcare facilities for age-related healthcare needs. Therefore, the elderly will likely be the largest user group of nursing robots in the future. For the present study, we recruited 24 residents at senior centers (17 females and 7 males) in Cary, North Carolina. They ranged in age from 63 to 91 years with a mean of 80.5 years and a standard deviation of 8.8 years. At the beginning of the experiment, participants read and signed an informed consent form and completed a background survey. They were then provided with a brief introduction to nursing robots and applications in healthcare tasks, including medicine delivery. This was followed by familiarization with the SAM form for emotion ratings and the physiological measurement equipment. A Polar HR monitor (Polar Electro Inc.) was used, including an S810i wrist receiver and a T31 transmitter attached to an elastic strap around the chest area. The monitor recorded heart activity in RR intervals, and the Polar Precision Performance software was used for analysis. An iWorx GSR-200 amplifier (iWorx Systems, Inc.) was used to apply a voltage between electrodes placed on the surface of a subject's index and ring fingertips. Factory calibration of the amplifier equated 1 volt to a SC of 5 μSiemens. The output voltages of the GSR signal were transmitted to a DT9834 data acquisition system (Data Translation, Inc.) and finally recorded on a computer using quickDAQ software (Data Translation, Inc.) with a sampling rate of 1024 Hz.

Fig. 3. PeopleBot platform.

Once the HR monitor and GSR electrodes were placed on a subject, they were asked to sit and relax on a sofa located in a simulated patient room.
During this period, 1 min of HR and GSR data was recorded, and the mean responses were later used as baselines [28] for normalizing test trial data. Six events were identified during each experiment test trial. At the beginning of a trial, a PeopleBot robot (see Fig. 3) entered the simulated patient room (Event 1) with a container of medicine in a gripper and stopped in front of the subject (Event 2). The robot notified subjects of its arrival (Event 3) and released the bottle of medicine from its gripper (Event 4). The robot then waited for a short period of time before it turned around (Event 5) and left the room (Event 6). The design of the experiment was a randomized complete block design with three independent variables representing robot feature manipulations, including robot facial features (see Fig. 4; abstract or android), speech capability (synthesized or digitized voice), and mode of user interaction (i.e., visual messages or physical confirmation of receipt of medication with a touch screen). Each subject was exposed to all settings of each variable. No interactions of features were studied in the experiment. The order of presentation of the control condition (i.e., a robot without any features) among stimulus trials was randomly assigned. The physiological data (HR and GSR) were collected throughout trials. At the end of each trial, subjects completed the SAM questionnaire, indicating their emotional response to the specific robot configuration. After subjects completed 14 test trials (2 replications of 2 levels of the face, voice, and interactivity conditions, plus 1 control condition), a final interview was conducted in which they provided comments on their impressions of the robot configurations. On average, each subject took ∼50 min to complete the experiment.
B. Data Post-Processing and Overall Results

To address individual differences in internal scaling of emotions and physiological responses, all observations on the response measures were normalized for each participant. The arousal and valence ratings from the SAM questionnaire were converted to z-scores. Normalized ratings were then categorized as "low," "medium," or "high" levels of valence and arousal. Two subjects did not follow experiment instructions in the arousal and valence ratings; their data were considered invalid and were excluded from analysis.

Fig. 4. Some robot feature manipulations.

The physiological response measures, HR (in bpm) and GSR (in μSiemens), were also normalized by subtracting the mean baseline response (Y_baseline) from each test trial reading (Y_i) and dividing by the maximum range for every participant, as shown in the following equation:

Y_normalized = (Y_i − Y_baseline) / max|Y_i − Y_baseline|.   (1)

Results of ANOVAs on the two subjective measures of emotion, arousal and valence, indicated significant differences in emotional state depending on robot feature settings. However, no one feature appeared more powerful than any other for facilitating positive emotional experiences. Regarding the physiological measures, there were also significant differences among robot feature types. The interactivity condition produced greater HR and GSR responses than the face and voice conditions when interacting with the robot. However, when making comparisons among the settings of each feature, only the levels of interactivity appeared to differ significantly. Correlation analyses were also conducted between the physiological and subjective ratings of emotions. Results revealed a strong positive relation between valence and HR, but no significant correlation between arousal and GSR.

TABLE I. ANOVA results on HR for levels of valence and arousal at specific events. (Note: F-statistics include numerator and denominator degrees of freedom. P-values reveal significance levels for each test. Bold values are significant.)

TABLE II. ANOVA results on GSR for levels of valence and arousal at specific events.

Fig. 5. Post-hoc results on GSR for levels of arousal at significant stimulus events.

III. ANALYTICAL METHODOLOGY

A. Event Selection and Analytical Procedure

Analysis of physiological measures as a basis for emotional state classification is generally conducted on an event basis. Ekman [29] observed that human emotional responses typically last between 0.5 and 4 s. Consequently, short recording periods (< 0.5 s) might cause certain emotions to be missed, while long periods (> 4 s) might lead to observations of mixed emotions [30], [31]. In addition, Dawson et al. [7] indicated that any SCR beginning between 1 and 3 s (or 1 and 4 s) after stimulus onset is considered to be elicited by that stimulus. We therefore used a 4-s time window of physiological data recorded after each event in the experiment trials for data analysis purposes. One-way ANOVA models were structured for each event and used to identify the event providing the greatest degree of discrimination of physiological responses (i.e., HR and GSR) based on subject emotional states, which were categorized as three levels of arousal and three levels of valence. Experiment results, shown in Tables I and II, revealed significant differences in HR for the levels of valence during Events 4 (opening gripper), 5 (subject accepting medicine), and 6 (robot leaving room), while there were significant differences in GSR for the levels of arousal during Events 2 (robot moving in front of patient), 4, and 6. Across these analyses, only Events 4 and 6 supported both HR and GSR in discriminating among emotional states. Event 6 occurred after the robot moved from the patient room and was not considered a potential stimulus.
Therefore, Event 4, the robot opening its gripper to release the medicine bottle to a patient, was selected as the stimulus event for further investigation.

Further post-hoc analyses were conducted using Tukey's tests on HR and GSR responses recorded at influential stimulus events to identify significant differences among the levels of emotional state (arousal and valence). Results revealed high arousal to be associated with higher GSR than low and medium arousal (see Fig. 5), whereas medium valence was associated with higher HR than low and high valence for significant stimulus events (Fig. 6).

Fig. 6. Post-hoc results on HR for levels of valence at significant stimulus events.

Fig. 7 presents the overall analytical procedure used in this study. Initially, we sought to extract statistical and wavelet features from the raw physiological signals. We then used a statistical approach to select significant or relevant features for use in the machine learning model for emotional state classification. The following subsections describe each of these steps in detail.

B. Physiological Feature Extraction: HR Analysis

Based on prior research (e.g., [32]), several statistical features of the HR response were identified for investigation in classifying emotional states. These included mean and median HR as measures of centrality, and SDHR and the range of the response as measures of variation. (The 4-s time window for data analysis was not sufficient for determining HRV or other HR features in the frequency domain.) Plots of the normalized HR response distributions for some data collected in the study revealed symmetry with a small number of outliers in a few trials. For such distributions, based on small sample sizes, the sample mean is a more efficient estimator of the population mean (i.e., it has a smaller variance) than the median [33]. Therefore, mean HR was used for analysis purposes.
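A minimal sketch of the HR feature extraction described above, assuming a window of RR intervals in milliseconds as input (the function name and dictionary keys are illustrative, not from the paper):

```python
import numpy as np

def hr_features(rr_ms):
    """Candidate statistical HR features from a window of RR intervals (ms)."""
    hr = 60000.0 / np.asarray(rr_ms, dtype=float)  # instantaneous HR in bpm
    return {
        "mean_hr": float(np.mean(hr)),      # measure of centrality (selected)
        "median_hr": float(np.median(hr)),  # measure of centrality
        "sdhr": float(np.std(hr, ddof=1)),  # standard deviation of HR (SDHR)
        "range_hr": float(np.ptp(hr)),      # max - min, measure of variation
    }
```

For example, RR intervals of 1000 ms and 750 ms correspond to instantaneous rates of 60 and 80 bpm, respectively; the feature dictionary then summarizes the window with the centrality and variation statistics discussed above.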
Based on the data set, either SDHR or the range was considered suitable for estimating the population variance. These variables were selected for subsequent statistical analysis to identify physiological signal features for classifying emotional states.

Fig. 7. Procedure for development of the emotional state classification algorithm.

TABLE III MSES CALCULATED ON THE DIFFERENCES BETWEEN THE RECONSTRUCTED SIGNALS FOR VARIOUS DBN WAVELETS

C. Physiological Feature Extraction: GSR Analysis

1) Wavelet Selection: As suggested above, the recorded GSR signals were nonstationary and noisy and required further signal processing. A small wavelet, referred to as the "mother" wavelet, was translated and scaled to obtain a time-frequency localization of the GSR signal. This leads to most of the energy of the signal being well represented by a few wavelet expansion coefficients. The mother wavelet was chosen primarily based on its similarity to the raw GSR signal [27]. Among the several families of wavelets, the Daubechies (dbN), Symlet (symN), and Coiflet (coifN) families have the orthogonality and compactness properties that support fast computation algorithms. Unlike the nearly symmetrical shapes of the symN and coifN wavelets, the asymmetrically shaped dbN wavelet was found to closely match the GSR waveform. Although previous studies have used dbN wavelets to analyze GSR signals [34], [35], the choice of wavelet was subjective and not primarily based on the shape of a typical GSR signal. Therefore, these approaches may not have captured all informative features of the signal or eliminated noise. Gupta et al. [36] suggested that if the form of the wavelet is matched to the form of the signal, such that the maximum energy of the signal is accounted for in the initial scaling space, then the energy in the next lower wavelet subspace should be very small.
The mother wavelet that produces the minimum mean square error [MSE; see (2)] between the two signals is the best match to the signal:

MSE = ∫ [x(t) − x̂(t)]² dt / (n − 1).   (2)

The typical frequency of the GSR signal is 0.0167–0.25 Hz [7]. Therefore, reconstruction of the signal in the wavelet scale including frequencies below 0.5 Hz will represent the GSR signal, x(t), whereas signal reconstruction on the next lower wavelet scale (including higher frequencies) will capture both the GSR signal and noise, x̂(t). (Note: As the wavelet scale decreases, the represented frequencies increase.) Several prior studies have recommended that the cutoff frequency for high-frequency noise be at least double the highest signal frequency. The MSEs of the differences between these two reconstructed signals for common dbN wavelets are shown in Table III. Results indicated that db3 was the most appropriate choice of mother wavelet to represent the GSR signal.

2) Noise Elimination: In recording the GSR signal, the measurement device also generated noise, including: white noise existing inherently in the amplifier (i.e., power spread over the entire frequency spectrum); noise from poor electrode contacts or variations in skin potential (i.e., low-frequency fluctuation); power line noise (60 Hz in the U.S.); motion artifacts; etc. [37], [38]. For the purpose of frequency analysis, signal noise was separated into mid-band frequency noise, which overlaps the GSR frequency, and high-frequency noise (> 0.5 Hz).

a) High-frequency noise elimination: Based on decomposition of the GSR signal using the db3 wavelet, coefficients representing high-frequency (> 0.5 Hz) details of the signal (noise) were set to zero. Consequently, an entire 4-s GSR signal (1024 × 4 = 4096 data points) was effectively represented by only 24 wavelet coefficients. These coefficients were characterized by a 4-s time localization at four frequency ranges.
(Note: The number of coefficients for each frequency range is not uniform. The higher the frequency, the greater the number of coefficients.) The amplitude of the coefficients corresponded to the amplitude of the GSR signal at specific frequencies and times.

b) Mid-band frequency noise elimination: Wavelet threshold shrinking algorithms have been widely used for removal of noise from signals [39], [40]. The soft-thresholding shrinkage function was used in the present algorithm. The concept is to set all frequency subband coefficients that are less than a particular threshold (λ) to zero, since such details are often associated with noise. Shrinking is then applied to the remaining nonzero coefficients based on the threshold value.

Fig. 8. Power spectrum of GSR occurring during subject rest period.

Fig. 9. Laplace distribution fit to wavelet detail coefficients from the GSR data collected during the subject rest period.

There are several wavelet threshold selection rules used for de-noising, for example: the Donoho and Johnstone universal threshold (DJ threshold), calculated as σ√(2 log N), where σ is the noise standard deviation and N is the signal size; a confidence interval threshold, such as 3σ or 4σ; the SureShrink threshold, based on Stein's unbiased estimate of risk (a quadratic loss function); and the Minimax threshold, developed to yield the minimum of the maximum MSE over a given set of functions. However, in practice, the noise level σ, which is needed in all these thresholding procedures, is rarely known and is therefore commonly estimated as the median absolute deviation of the estimated wavelet coefficients at the finest level (i.e., the highest frequency) divided by 0.6745 [41]. It can be seen that these selection rules are derived from signal characteristics but have no relation to the nature of the noise.
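The shrinkage mechanics and the DJ universal threshold described above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard techniques, not the authors' implementation; the function names are ours.

```python
import numpy as np

def soft_threshold(coeffs, lam):
    """Soft-thresholding shrinkage: coefficients with |c| <= lam are
    zeroed, and the rest are shrunk toward zero by lam."""
    c = np.asarray(coeffs, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - lam, 0.0)

def dj_universal_threshold(finest_details):
    """Donoho-Johnstone universal threshold sigma * sqrt(2 log N), with
    the noise level sigma estimated from the median absolute deviation
    of the finest-level detail coefficients divided by 0.6745."""
    d = np.asarray(finest_details, dtype=float)
    sigma = np.median(np.abs(d)) / 0.6745
    return sigma * np.sqrt(2.0 * np.log(d.size))
```

Applied to the detail coefficients of a decomposed GSR window, `soft_threshold` zeroes small (noise-like) details and shrinks the survivors before reconstruction.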
A frequency analysis was conducted on the 1 min of GSR signal data recorded during the subject rest period. An FFT was used to reveal the power spectrum of the baseline response, which ranged from 0 to 0.5 Hz, with peak values occurring at frequencies below 0.1 Hz (see Fig. 8). (This technique was only applied to the GSR signal during the rest period to identify noise in specific frequency ranges of the reference signal. Wavelet analysis was used for all test signal de-noising and feature identification.) The typical frequency of the GSR signal during the rest period (without any stimuli) was between 0.0167 and 0.05 Hz. This frequency range is represented by the approximation coefficients after decomposition. Therefore, all detail coefficients (frequencies between 0.05 and 0.5 Hz) represent signal noise and can be used to set thresholds for signal de-noising. The confidence interval threshold technique (mentioned above) determines a threshold from the standard deviation of the noise. This is based on the concept that by setting a threshold λ to 3σ, the probability of noise coefficients falling outside the interval [−3σ, 3σ] will be very small (0.27%). However, this technique is based on the assumption that the noise coefficients are normally distributed. In fact, the distribution of wavelet detail coefficients is better represented by a zero-mean Laplace distribution [42], which has heavier tails than a Gaussian distribution.

Fig. 10. Noisy GSR and de-noised GSR comparison.

Wavelet detail coefficients obtained from the baseline data set were tested for fit to the Normal and Laplace distributions using the Kolmogorov-Smirnov test. Results confirmed that the detail coefficient distributions were not significantly different from the Laplace distribution (p > 0.05), but there was a significant difference from the Normal distribution. Fig. 9 illustrates the histogram of detail coefficients from the signal recorded during the rest period fitted by a Laplace distribution.
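A sketch of this Laplace-based reasoning, using synthetic rest-period detail coefficients in place of the study's data (all numbers below are illustrative, not the authors' values):

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for detail coefficients recorded during a rest period.
rng = np.random.default_rng(1)
details = rng.laplace(loc=0.0, scale=0.3, size=512)

# Maximum-likelihood Laplace fit: location = median, scale b = mean |x - median|.
mu = np.median(details)
b = np.mean(np.abs(details - mu))

# Kolmogorov-Smirnov goodness-of-fit test against the fitted Laplace.
ks_stat, p_value = stats.kstest(details, "laplace", args=(mu, b))

# The Laplace variance is 2*b**2, so sigma_L = b*sqrt(2).  Since
# P(|X - mu| > t) = exp(-t/b), choosing t = 4.18*sigma_L (about 5.91*b)
# leaves roughly 0.27% of pure-noise coefficients above the threshold.
sigma_L = np.sqrt(2.0) * b
lam = 4.18 * sigma_L
```

A large KS p-value supports the Laplace model for the noise coefficients, and `lam` is the resulting shrinkage threshold.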
The Laplace distribution has a mean and median of μ and a variance of σ_L² = 2b². To have 99.73% confidence that noise coefficients will be eliminated from the signal, the threshold of wavelet shrinkage is set to 4.18σ_L. Compared with the normal distribution, noise coefficients modeled with the Laplace distribution will have a higher threshold value. The σ_L of the noise in the raw signal was estimated based on the rest period data. We expected no GSR frequencies between 0.05 and 0.5 Hz when the subject was not exposed to a stimulus. This σ_L represents the standard deviation of the Laplace distribution fitted to the wavelet coefficients for the signal during the rest period, and it is also used to set the threshold for de-noising the signal when the subject is exposed to a stimulus. Fig. 10 illustrates an example GSR signal de-noised using the above methodology. The rectified signal with de-noising in all frequencies can be compared with the raw GSR signal and the GSR signal with only high-frequency noise elimination. The wavelet analysis initially eliminated the high-frequency noise, for example, any abrupt signals caused by motion artifacts. The transformation then eliminated noise present in the frequencies overlapping the GSR signal frequency, resulting in a smoother GSR signal. This methodology appears to be highly promising for isolating informative GSR signal features and has not previously been demonstrated.

TABLE IV SIGNIFICANT HIGH-FREQUENCY DE-NOISED GSR AND HR FEATURES FOR CLASSIFYING AROUSAL AND VALENCE BASED ON REGRESSION ANALYSIS

D. Feature Selection

The feature selection step in the new algorithm was intended to reduce the complexity of the data model by selecting the minimum set of most relevant physiological signal features for classifying emotional states.
Principal component analysis (PCA) is the most commonly used data reduction technique in pattern recognition and classification [43] and has been applied to data in the wavelet domain (e.g., [44]). The basic idea is to project the data onto a lower-dimensional space in which most of the information is retained. Unfortunately, the transformed variables are not easy to interpret (i.e., to assign abstract meaning; [45]). Moreover, PCA does not take class information into consideration; consequently, there is no guarantee that the classes in the transformed data are better separated than in the original form [43]. Multiple-hypothesis testing procedures are another form of feature selection that has been used for wavelet coefficient selection (e.g., [46], [47]). All coefficients are simultaneously tested for a significant departure from zero; therefore, the selected set of wavelet coefficients will provide the best contrast between classes [41]. However, the features identified with this approach are selected on the basis of their classification capability rather than specific relationships with the classes. Prior research (e.g., [32]) has also used stepwise regression procedures for selection of physiological features for use in models to predict emotional states. This type of analysis can be used to identify features that have a significant relationship with emotional responses and to avoid overly complex classification models. However, such analysis is typically conducted in the original time domain. On this basis, a stepwise backward-elimination regression approach was applied to identify the HR and GSR features that were statistically significant in classifying emotional states. The SDHR and RangeHR responses were highly correlated and were, therefore, examined in separate models. The analysis also examined both high-frequency de-noised and total de-noised GSR signals (i.e., signals with high and mid-band frequency elimination).
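A minimal backward-elimination loop of the kind described can be written with NumPy and SciPy. This is our sketch of the generic procedure, not the authors' exact stepwise analysis; the function name and significance level are assumptions.

```python
import numpy as np
from scipy import stats

def backward_eliminate(X, y, alpha=0.05):
    """Backward stepwise regression: repeatedly drop the predictor with
    the largest p-value until every remaining predictor is significant
    at level alpha.  Returns the indices of the retained columns of X."""
    keep = list(range(X.shape[1]))
    while keep:
        Xk = np.column_stack([np.ones(len(y)), X[:, keep]])
        beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
        resid = y - Xk @ beta
        dof = len(y) - Xk.shape[1]
        mse = resid @ resid / dof
        se = np.sqrt(np.diag(mse * np.linalg.inv(Xk.T @ Xk)))
        pvals = 2.0 * stats.t.sf(np.abs(beta / se), dof)[1:]  # skip intercept
        worst = int(np.argmax(pvals))
        if pvals[worst] <= alpha:
            break                 # all remaining predictors significant
        keep.pop(worst)
    return keep
```

Run on a candidate matrix of wavelet-coefficient and HR features against arousal or valence ratings, it returns the reduced feature set used downstream.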
Although previous studies demonstrated that some relationships may exist between the statistical features of HR and GSR and emotional states, no prior work has investigated wavelet coefficients for predicting emotional states. Therefore, the entire range of wavelet coefficients determined from the GSR signals was considered potentially viable for classification of concurrent subject emotional states. Since the orthogonality property of the Daubechies family of wavelets ensures there is no correlation between wavelet coefficients for the signal being processed, it is possible to include multiple dbN wavelet coefficients as predictors in a single regression model of emotional states without violating model assumptions.

TABLE V SIGNIFICANT TOTAL DE-NOISED GSR AND HR FEATURES FOR CLASSIFYING AROUSAL AND VALENCE BASED ON REGRESSION ANALYSIS

Results revealed the regression models of arousal producing the highest R-squared values to include: 15 high-frequency de-noised GSR wavelet coefficient features and SDHR; and 14 total de-noised GSR wavelet coefficient features and SDHR (see Tables IV and V). For valence, the best predictive models included: 14 high-frequency de-noised GSR wavelet coefficient features; and 11 total de-noised GSR wavelet coefficient features, MeanHR, and SDHR (see Tables IV and V). (Note that low and high GSR subscripts correspond with low and high signal frequencies from time 0 to 4 s.)

E. Emotional State Classification

Among the large number of neural network structures, the multilayer perceptron (MLP) is the most often used and has proven to be a universal approximator [43]. We implemented neural network models using the NeuroSolutions software (NeuroDimension, Inc.) with an error back-propagation algorithm. A hyperbolic tangent sigmoid function was used as the activation function for all neurons in the hidden layer. A momentum of 0.9 [43] was also used at the hidden nodes to prevent a network from getting stuck at a local minimum during training.
We determined the weights of the links among the nodes in the ANN structure to minimize the classification error. We set MSE = 0.02 [48] as a criterion error goal. For validation (testing) of the ANNs, the data "hold-out" method was used, separating 80% of the samples for training the NN and the remaining 20% for validation. The validation data were randomly selected, and the number of data points was balanced among the classes of valence and arousal.

TABLE VI OVERALL PCCS FOR AROUSAL AND VALENCE CLASSIFICATION NETWORKS USING HIGH-FREQUENCY DE-NOISED WAVELET FEATURES SELECTED BASED ON STEPWISE REGRESSION ANALYSIS

TABLE VII OVERALL PCCS FOR AROUSAL AND VALENCE CLASSIFICATION NETWORKS USING TOTAL DE-NOISED WAVELET FEATURES SELECTED BASED ON STEPWISE REGRESSION ANALYSIS

Fig. 11. Sensitivity analysis of arousal state.

Fig. 12. Sensitivity analysis of valence state.

To create a parsimonious ANN for classifying subject emotional states based on physiological signal features, we used a single hidden layer of processing elements. The minimum number of hidden nodes (h) in the hidden layer can be defined based on the number of inputs (n) and outputs (m) using Masters' [49] equation:

h = Int[(m × n)^(1/2)].   (3)

In this paper, the emotional states of subjects were classified into three levels: low, medium, and high arousal or valence. Based on the sets of physiological data features selected from the stepwise regression procedure for both the high-frequency de-noising data set (16 features for the arousal model and 14 features for the valence model) and the total de-noising data set (15 features for the arousal model and 13 features for the valence model), Masters' equation indicated the minimum number of hidden nodes for the ANNs to be approximately six (for either emotional response).
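The study used the proprietary NeuroSolutions package; an open-source analogue of the same setup (single tanh hidden layer sized by Masters' rule, back-propagation with momentum 0.9, 80/20 stratified hold-out) can be sketched with scikit-learn. The feature data below are synthetic stand-ins, and the learning rate and iteration budget are our assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def masters_hidden_nodes(n_inputs, n_outputs):
    """Masters' rule of thumb: h = Int[(m * n)**(1/2)]."""
    return int(np.sqrt(n_inputs * n_outputs))

h = masters_hidden_nodes(13, 3)   # e.g. 13 selected features, 3 valence levels

# Synthetic stand-in for the selected physiological features (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 13))
y = np.digitize(X[:, 0] + X[:, 1], [-1.0, 1.0])   # three ordinal classes

# 80/20 hold-out split, balanced across classes via stratification.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Single hidden layer, tanh activation, back-propagation with momentum 0.9.
clf = MLPClassifier(hidden_layer_sizes=(h,), activation="tanh",
                    solver="sgd", momentum=0.9, learning_rate_init=0.01,
                    max_iter=3000, random_state=0)
clf.fit(X_tr, y_tr)
pcc = clf.score(X_te, y_te)   # percentage of correct classifications
```

In the paper's terms, `pcc` on the hold-out set corresponds to the reported validation PCC.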
Holding fixed the set of inputs selected from the stepwise regression, and using only a single-hidden-layer network structure, the number of hidden layer nodes was optimized to achieve the highest percentage of correct classifications (PCC) of subject emotional responses. The PCCs for the best arousal and valence classification networks from the high-frequency de-noised data set are shown in Table VI. Results revealed the overall PCC in validation of the ANN for classifying arousal to be 72%. This network was constructed with eight hidden nodes and produced an R-squared value of 0.73. The PCC of the ANN for classifying valence was 67%. The network was constructed with six hidden nodes and produced an R-squared value of 0.6. Results of arousal and valence state classification based on the total de-noised data set, as shown in Table VII, revealed the overall PCC in validation of the ANN for classifying arousal to be 82%. This network was constructed with seven hidden nodes and produced an R-squared value of 0.78. The PCC of the ANN for classifying valence was 73%. The network was constructed with seven hidden nodes and produced an R-squared value of 0.73. It can be seen that after the proposed algorithm was applied to eliminate signal noise overlapping the informative signal frequencies, the classification models produced higher PCCs for both arousal and valence states. Sensitivity analyses were performed using the NeuroSolutions software to provide information on the significance of the various inputs to the arousal and valence classification networks (see Figs. 11 and 12). The HR signal appeared to have the least effect on the arousal model; however, it had a relatively high effect on the valence model. In contrast, most GSR features had significant effects on arousal, while fewer GSR features had a large impact on valence states.
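NeuroSolutions' sensitivity analysis is proprietary; a common open analogue is permutation-style sensitivity, which scores each input by the accuracy lost when that input is shuffled. The sketch below demonstrates the idea on synthetic data with a logistic classifier; this is a substitute technique and our own naming, not the tool the authors used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def input_sensitivity(model, X, y, rng):
    """Permutation-style sensitivity: the drop in accuracy when a single
    input column is shuffled, breaking its relation to the output."""
    base = model.score(X, y)
    drops = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        drops[j] = base - model.score(Xp, y)
    return drops

# Demo on synthetic data in which only the first input carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] > 0).astype(int)
model = LogisticRegression().fit(X, y)
drops = input_sensitivity(model, X, y, rng)
```

Applied to the trained arousal and valence networks, the same scoring would rank the GSR wavelet and HR inputs by their effect on classification accuracy.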
To demonstrate the performance of the statistical feature selection methodology, additional ANNs were constructed including all GSR wavelet coefficients (24 features) and HR features (MeanHR and SDHR) for classifying arousal and valence.

TABLE VIII OVERALL PCCS FOR AROUSAL AND VALENCE CLASSIFICATION NETWORKS USING ALL HIGH-FREQUENCY DE-NOISED WAVELET FEATURES

TABLE IX OVERALL PCCS FOR AROUSAL AND VALENCE CLASSIFICATION NETWORKS USING ALL TOTAL DE-NOISED WAVELET FEATURES

For the high-frequency de-noised data set, results (Table VIII) revealed an overall PCC in validation of the ANN for classifying arousal of 67%. The PCC of the ANN for classifying valence was 63%. For the total de-noised data set, results (Table IX) revealed the PCC of the ANN for classifying arousal to be 78%. The PCC of the ANN for classifying valence was 75%. Using all wavelet coefficient features for classifying current emotional states not only produced lower PCCs but involved more complex networks. Although the valence model (total de-noised) with all features resulted in a slightly higher PCC (75% compared with 73% for the reduced model), the full network was much more complex (26 input and 8 hidden nodes versus 13 input and 7 hidden nodes for the reduced model).

IV. DISCUSSION

This study demonstrated that wavelet analysis can be an effective approach for significant feature extraction from physiological data. The wavelet technology also serves as an efficient means for feature reduction; the number of GSR signal features was reduced by 99.4% for analysis. The stepwise regression also proved to be an effective statistical method for further feature reduction. The number of GSR and HR features for the model of arousal was reduced by 38.5% after applying high-frequency noise elimination and by 42.3% after further applying mid-band frequency noise elimination. For the model of valence, the number of physiological signal features was reduced by 46.2% after applying high-frequency noise elimination and by 50% after further applying mid-band frequency noise elimination. Therefore, these methods can be used to develop emotional state classification models without high complexity but that include significant physiological signal features (i.e., amplitude, time, and frequency). Comparison of the R-squared values from the regression models (ranging from 0.0424 to 0.0534) with those of the neural network models (ranging from 0.6 to 0.78) indicated that the relationships between the physiological responses, including HR and GSR, and emotional states, in terms of valence and arousal, are likely nonlinear. This was in agreement with speculation based on the regression analysis.

Prior research has established that GSR is an indicator of arousal, with a linear correlation between the physiological response and emotional state. However, GSR has not been shown to distinguish positive from negative valence stimuli. In contrast, HR has been shown to have a strong relationship with both arousal and valence states. The analyses on the high-frequency de-noised data set confirmed the relationships between GSR and HR and arousal; however, valence states were not predicted by HR features. We suspected two possible reasons for this. First, there may have been some bias in subject self-reports of emotion during the study. Since the SAM questionnaire is a multi-dimensional subjective rating system, the order of presentation of the subscales was randomized on forms presented to subjects after each trial. This is a typical procedure in human factors research used to promote subject attention to form content in repeated observations. In this paper, the elderly participants may have been unaware of the randomized order of questions and may have misinterpreted the scales after some trials. Another explanation for our findings is that the initial wavelet analysis only filtered out the high-frequency noise from the GSR signal. Frequency analysis of GSR signals recorded during the rest period revealed some noise in mid-band frequencies, which overlapped the typical GSR frequency. This mid-band noise could have accounted for some of the variability in the valence response; otherwise, HR features might have explained valence states. After further applying the new mid-band de-noising algorithm and conducting sensitivity analyses on the NN models, results confirmed that arousal states were better explained by GSR features than by HR, while valence states were better explained by HR (relative to arousal) and by fewer GSR features.

V. CONCLUSION AND FUTURE WORK

The results of this study further support relationships between HR and GSR responses and emotional states, in terms of arousal and valence. Hazlett [18] indicated in his review that such physiological measures mostly reflect arousal and may be limited for indicating emotional valence. We found GSR and HR to be predictive of arousal states and HR to be a stronger indicator of valence. Hazlett noted that some studies have validated facial EMG as a valence indicator. Our future research will examine facial EMG signals as an additional physiological measure for classifying emotional states. The machine learning algorithm we developed has the potential to classify patient emotional states in real time during interaction with a robot. Percent correct classification accuracies for the NN models on arousal and valence ranged from 73 to 82%. Other emotion recognition methods have been developed using facial expressions via image processing (e.g., [50]).
These methods have achieved high PCCs (e.g., 88.2–92.6%); however, they rely on explicit emotion expressions, and interaction with service robots may not induce extreme happiness or unhappiness in patients. The study demonstrated that a set of wavelet coefficients could be determined to effectively represent physiological signal features without additional post-processing steps and can therefore support fast emotional state classification when integrated as inputs to NN models. Wavelet technology also proved highly effective for eliminating noise overlapping informative physiological signal frequencies and for increasing the accuracy of emotional state classification with the neural networks. In general, the wavelet transformation process supports fast coefficient computation for simultaneous noise elimination and physiological signal representation. This is important because we are ultimately interested in real-time classification of patient emotional states for service robot behavior adaptation. The approach requires that GSR and HR data be captured on patients in real time; the signals must then be transformed and reconstructed using wavelet analysis; wavelet expansion coefficients must be computed; and the coefficients must be used as inputs to a trained ANN for emotional state prediction. Classified patient states can then be used as a basis for adapting robot behavior or interface configurations to ensure positive patient emotional experiences (e.g., high arousal and high valence) in healthcare. Such a real-time emotional state classification system may be a valuable tool as part of future service robot design to ensure patient perceptions of quality in healthcare operations.
We plan to implement the new human emotional state classification algorithm on a service robot platform and to attempt to adapt various types of behaviors, including positioning relative to users and speech patterns. In the present study, the robot configuration generating the highest HR and GSR responses was the design requiring subjects to confirm their receipt of medicine. In a future study, we will further explore such a configuration and use maximum physiological responses for emotional state classification.

ACKNOWLEDGMENT

The authors thank Dr. Yuan-Shin Lee for comments on an earlier unpublished technical report of this work, as well as Dr. T. Zhang, Dr. B. Zhu, L. Hodge, and Dr. P. Mosaly for assistance in the experimental data collection and statistical analysis.

REFERENCES

[1] D. I. Auerbach, P. I. Buerhaus, and D. O. Staiger, "Better late than never: Workforce supply implications of later entry into nursing," Health Affairs, vol. 26, no. 1, pp. 178–185, Jan./Feb. 2007.
[2] J. A. Russell, "A circumplex model of affect," J. Person. Social Psychol., vol. 39, no. 6, pp. 1161–1178, 1980.
[3] D. Watson and A. Tellegen, "Toward a consensual structure of mood," Psychol. Bull., vol. 98, no. 2, pp. 219–235, Sep. 1985.
[4] J. A. Russell and L. Feldman-Barrett, "Core affect, prototypical emotional episodes, and other things called emotion: Dissecting the elephant," J. Person. Social Psychol., vol. 76, no. 5, pp. 805–819, 1999.
[5] R. Reisenzein, "Pleasure-activation theory and the intensity of emotions," J. Person. Social Psychol., vol. 67, no. 3, pp. 525–539, 1994.
[6] M. Bradley and P. Lang, "Measuring emotion: The self-assessment Manikin and the semantic differential," J. Behav. Ther. Exp. Psychiatry, vol. 25, no. 1, pp. 49–59, Mar. 1994.
[7] M. E. Dawson, A. M. Schell, and D. L. Filion, "The electrodermal system," in Handbook of Psychophysiology, J. T. Cacioppo, L. G. Tassinary, and G. G. Berntson, Eds., 3rd ed.
New York: Cambridge Univ. Press, 2007, pp. 159–181.
[8] C. D. Wickens, Engineering Psychology and Human Performance, 2nd ed. New York: Harper Collins Pub. Inc., 1992.
[9] M. Malik, J. T. Bigger, A. J. Camm, R. E. Kleiger, A. Malliani, A. J. Moss, and P. J. Schwartz, "Heart rate variability: Standards of measurement, physiological interpretation, and clinical use," Eur. Heart J., vol. 17, no. 3, pp. 354–381, Mar. 1996.
[10] R. L. Mandryk and M. S. Atkins, "A fuzzy physiological approach for continuously modeling emotion during interaction with play technologies," Int. J. Hum.-Comput. Stud., vol. 65, no. 4, pp. 329–347, 2007.
[11] S. A. Neumann and S. R. Waldstein, "Similar patterns of cardiovascular response during emotional activation as a function of affective valence and arousal and gender," J. Psychosom. Res., vol. 50, no. 5, pp. 245–253, May 2001.
[12] T. Ritz and M. Thöns, "Airway response of healthy individuals to affective picture series," Int. J. Psychophysiol., vol. 46, no. 1, pp. 67–75, Oct. 2002.
[13] C. Peter and A. Herbon, "Emotion representation and physiology assignments in digital systems," Interact. Comput., vol. 18, no. 2, pp. 139–170, Mar. 2006.
[14] A. Nakasone, H. Prendinger, and M. Ishizuka, "Emotion recognition from electromyography and skin conductance," in Proc. 5th Int. Workshop BSI, Tokyo, Japan, 2005, pp. 219–222.
[15] S. Amershi, C. Conati, and H. Maclaren, "Using feature selection and unsupervised clustering to identify affective expressions in educational games," in Proc. Workshop Motivational Affect Issues ITS, Jhongli, Taiwan, 2006, pp. 21–28.
[16] C. K. Lee, S. K. Yoo, Y. J. Park, N. H. Kim, K. S. Jeong, and B. C. Lee, "Using neural network to recognize human emotions from heart rate variability and skin resistance," in Proc. 27th Annu. Conf. IEEE Eng. Med. Biol., Shanghai, China, pp. 5523–5525.
[17] A. Pecchinenda and C. A.
Smith, "The affective significance of skin conductance activity during a difficult problem-solving task," Cognit. Emotion, vol. 10, no. 5, pp. 481–503, Sep. 1996.
[18] R. L. Hazlett, "Measuring emotional valence during interactive experiences: Boys at video game play," in Proc. CHI, Novel Methods: Emotions, Gesture, Events, Montreal, QC, Canada, 2006, pp. 1023–1026.
[19] H. Leng, Y. Lin, and L. A. Zanzi, "An experimental study on physiological parameters toward driver emotion recognition," Lecture Notes in Computer Science, vol. 4566, pp. 237–246, 2007.
[20] C. L. Lisetti and F. Nasoz, "Using noninvasive wearable computers to recognize human emotions from physiological signals," EURASIP J. Appl. Signal Process., vol. 2004, pp. 1672–1687, 2004.
[21] K. Itoh, H. Miwa, Y. Nukariya, M. Zecca, H. Takanobu, S. Roccella, M. C. Carrozza, P. Dario, and A. Takanishi, "Development of a bioinstrumentation system in the interaction between a human and a robot," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., Beijing, China, 2006, pp. 2620–2625.
[22] D. Kulic and E. Croft, "Physiological and subjective responses to articulated robot motion," Robotica, vol. 25, no. 1, pp. 13–27, Jan. 2007.
[23] C. Liu, K. Conn, N. Sarkar, and W. Stone, "Online affect detection and robot behavior adaptation for intervention of children with autism," IEEE Trans. Robot., vol. 24, no. 4, pp. 883–896, Aug. 2008.
[24] P. Agrawal, C. Liu, and N. Sarkar, "Interaction between human and robot: An affect-inspired approach," Interact. Stud., vol. 9, no. 2, pp. 230–257, 2008.
[25] P. Rani, C. Liu, N. Sarkar, and E. Vanman, "An empirical study of machine learning techniques for affect recognition in human-robot interaction," Pattern Anal. Appl., vol. 9, no. 1, pp. 58–69, May 2006.
[26] C. L. Breazeal, "Emotion and sociable humanoid robots," Int. J. Hum.-Comput. Stud., vol. 59, no. 1/2, pp. 119–155, Jul. 2003.
[27] K. Najarian and R. Splinter, Biomedical Signal and Image Processing. Boca Raton, FL: CRC Press, 2006.
[28] M. W. Scerbo, F. G. Freeman, P. J.
Mikulka, R. Parasuraman, F. Di Nocera, and L. J. Prinzel, "The efficacy of psychophysiological measures for implementing adaptive technology," NASA Langley Res. Center, Hampton, VA, NASA TP-2001-211018, 2001.
[29] P. Ekman, "Expression and the nature of emotion," in Approaches to Emotion, K. S. Scherer and P. Ekman, Eds. Hillsdale, NJ: Erlbaum, 1984.
[30] P. Ekman, "An argument for basic emotions," Cognit. Emotion, vol. 6, no. 3/4, pp. 169–200, 1992.
[31] P. Ekman, "Basic emotions," in Handbook of Cognition and Emotion, T. Dalgleish and M. Power, Eds. Sussex, U.K.: Wiley, 1999.
[32] G. Zhang, R. Xu, Q. Ji, P. Cowings, and W. Toscano, "Context, observation, and operator state (COS): Dynamic fatigue monitoring," presented at the NASA Aviation Safety Tech. Conf., Denver, CO, 2008.
[33] J. F. Kenney and E. S. Keeping, Mathematics of Statistics, 3rd ed. Princeton, NJ: Van Nostrand, 1954, pt. 1.
[34] M. Slater, C. Guger, G. Edlinger, R. Leeb, G. Pfurtscheller, A. Antley, M. Garau, A. Brogni, and D. Friedman, "Analysis of physiological responses to a social situation in an immersive virtual environment," Presence, Teleoper. Virtual Environ., vol. 15, no. 5, pp. 553–569, Oct. 2006.
[35] J. Laparra-Hernández, J. M. Belda-Lois, E. Medina, N. Campos, and R. Poveda, "EMG and GSR signals for evaluating user's perception of different types of ceramic flooring," Int. J. Ind. Ergonom., vol. 39, no. 2, pp. 326–332, 2009.
[36] A. Gupta, S. D. Joshi, and S. Prasad, "On a new approach for estimating wavelet matched to signal," in Proc. 8th Nat. Conf. Commun., Bombay, India, 2002, pp. 180–184.
[37] C. J. Peek, "A primer of biofeedback instrumentation," in Biofeedback: A Practitioner's Guide, M. S. Schwartz and F. Andrasik, Eds., 3rd ed. New York: Guilford Press, 2003.
[38] R. Moghimi, "Understanding noise optimization in sensor signal-conditioning circuits," EE Times, 2008. [Online].
Available: http://eetimes.com/design/automotive-design/4010307/Understanding-noise-optimization-in-sensor-signal-conditioning-circuits-Part-1a-of-4-parts
[39] J. Li, Y. Hou, P. Wei, and G. Chen, “A novel method for the determination of the wavelet denoising threshold,” in Proc. 1st ICBBE, Wuhan, China, 2007, pp. 713–716.
[40] S. Poornachandra and N. Kumaravel, “A novel method for the elimination of power line frequency in ECG signal using hyper shrinkage function,” Digit. Signal Process., vol. 18, no. 2, pp. 116–126, Mar. 2008.
[41] F. Abramovich, T. C. Bailey, and T. Sapatinas, “Wavelet analysis and its statistical applications,” Statistician, vol. 49, pt. 1, pp. 1–29, 2000.
[42] E. Y. Lam, “Statistical modelling of the wavelet coefficients with different bases and decomposition levels,” Proc. Inst. Elect. Eng.-Vis. Image Signal Process., vol. 151, no. 3, pp. 203–206, Jun. 2004.
[43] R. Polikar, “Pattern recognition,” in Wiley Encyclopedia of Biomedical Engineering, M. Akay, Ed. New York: Wiley, 2006.
[44] R. Yamada, J. Ushiba, Y. Tomita, and Y. Masakado, “Decomposition of electromyographic signal by principal component analysis of wavelet coefficient,” in Proc. IEEE EMBS Asian-Pac. Conf. Biomed. Eng., Keihanna, Japan, 2003, pp. 118–119.
[45] J. L. Semmlow, Biosignal and Biomedical Image Processing: MATLAB-Based Applications. New York: Marcel Dekker, 2004.
[46] F. Abramovich and Y. Benjamini, “Thresholding of wavelet coefficients as multiple hypotheses testing procedure,” Lecture Notes in Statistics, vol. 103, pp. 5–14, 1995.
[47] F. Abramovich and Y. Benjamini, “Adaptive thresholding of wavelet coefficients,” Comput. Stat. Data Anal., vol. 22, no. 4, pp. 351–361, Aug. 1996.
[48] L. J. Prinzel, “Research on hazardous states of awareness and physiological factors in aerospace operations,” NASA, Greenbelt, MD, NASA/TM-2002-211444, L-18149, NAS 1.15:211444, 2002.
[49] T.
Masters, Practical Neural Network Recipes in C++. San Diego, CA: Academic, 1993.
[50] A. Chakraborty, A. Konar, U. K. Chakraborty, and A. Chatterjee, “Emotion recognition from facial expressions and its control using fuzzy logic,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 39, no. 4, pp. 726–743, Jul. 2009.

Manida Swangnetr received the B.S. degree in industrial engineering from Chulalongkorn University, Bangkok, Thailand, in 2001 and the M.S. degree in industrial engineering and the Ph.D. degree in industrial and systems engineering with a focus in human factors and ergonomics from North Carolina State University, Raleigh, in 2006 and 2010, respectively. Currently, she is a Lecturer in the Departments of Production Technology and Industrial Engineering at Khon Kaen University, Thailand. She is also a Research Team Member in the Back, Neck, and Other Joint Pain research group in the Faculty of Associate Medical Sciences. Prior to these appointments, she worked as a Research Assistant in the Department of Industrial and Systems Engineering at North Carolina State University. She has published several other papers on human emotional state classification in interaction with robots through the International Ergonomics Association Triennial Conference, the Annual Meeting of the Human Factors and Ergonomics Society, and the AAAI Symposium on Dialog with Robots. Her current research interests include: cognitive engineering; human functional state modeling in use of automation; ergonomic interventions for occupational work; and ergonomics approaches to training capabilities for disabled persons. Dr. Swangnetr is a Member of the Human Factors and Ergonomics Society and is a registered Associate Ergonomics Professional.

David B. Kaber (M’99) received the B.S. and M.S. degrees in industrial engineering from the University of Central Florida, Orlando, in 1991 and 1993, respectively, and the Ph.D.
degree in industrial engineering from Texas Tech University, Lubbock, in 1996. Currently, he is a Professor of Industrial and Systems Engineering at North Carolina State University, Raleigh, and an Associate Faculty member in biomedical engineering and psychology. He is also the Director of the Occupational Safety and Ergonomics Program, which is supported by the National Institute for Occupational Safety and Health. Prior to this, he worked as an Associate Professor at the same institution and as an Assistant Professor at Mississippi State University, Mississippi State. His current research interests include computational modeling of human cognitive behavior in interacting with advanced automated systems and optimizing the design of automation interfaces based on tradeoffs in information load, task performance, and cognitive workload. Dr. Kaber is a recent Fellow of the Human Factors and Ergonomics Society and is a Certified Human Factors Professional. He is also a Member of Alpha Pi Mu, ASEE, IEHF, IIE, and Sigma Xi.