Studying Relationships between Human Posture and Health Risk Factors during Sedentary Activities
Tejas Srinivasan
Mentors: Vladimir Pavlovic, Saehoon Yi

Problem
• With suboptimal posture, long hours of sedentary activity can create major health risks.
• Can we discover causes of these health risks by analyzing a subject's posture and health over time?

Short-term Goals
• Figure out how to collect and interpret kinematic and biophysical data from Shimmer sensors
• Develop a system to classify posture from kinematic data

Long-term Goals
• Develop a system to classify health risks from biophysical data
• Design and execute experiments that can be used to infer information about postural characteristics and health over time
• Analyze data to discover relationships between postural characteristics and health risk factors in order to develop a model

Shimmer Sensors
• Wireless sensor platform
• Kinematic sensing
  • 3-axis accelerometer
  • 3-axis gyro
• Biophysical sensing
  • Electrocardiography
  • Electromyography
  • Skin conductance / galvanic skin response

Shimmer Sensors
• Interface with a computer via Bluetooth
• Bluetooth has a limited range of 10 meters
• Uncalibrated accelerometer data was received in mV.
• Because our goal was only to classify posture, we deemed calibration of the accelerometer data unnecessary.

Methodology: Placement of Sensors
• GSR unit on lower back
• 9DOF unit on back of neck
• EMG unit on right arm
• ECG unit on chest
• Each sensor was kept in the same orientation each time (the Shimmer logo facing upright).

6/7/12

Methodology: Classes for Posture
• We adopted the posture labels used by Dr. Zheng and Dr. Morrell.
  • Leaning forward
  • Slouching
  • Upright
  • Leaning backward

Data Collection: Experiment 1
• Four Shimmer units were strapped onto a subject and 3-axis accelerometer data was collected.
• The resulting data was an n-by-12 matrix (4 units × 3 axes), where n is the number of frames recorded by each sensor.
• Each of 5 subjects performed the following procedure 25 times:
  1. Lean forward for 3 seconds
  2. Slouch for 3 seconds
  3. Stay upright for 3 seconds
  4. Lean back for 3 seconds

Data Collection: Experiment 1
• We segmented the time series into separate postures for each person.
• This gave 25 time-series samples per posture per person.
• Each posture is static, so the accelerometer signals changed very little within a sample. We therefore took the average of the accelerometer signals over each time-series sample.
• The feature vector for a posture was thus a length-12 vector, each component being the mean of one accelerometer signal over time.

[Example of what data looks like]

Data Collection: Experiment 2
• We collected data on subjects completing more natural, nondeterministic tasks.
• Each subject followed this script:
  1. Lean forward for 5 seconds
  2. Stand up and walk around the room for 10 seconds
  3. Bend over to pick something up for 5 seconds
  4. Stand upright and walk back to your chair within 5 seconds
  5. Sit upright for 5 seconds

Support Vector Machines (SVM)
• Our method of choice for classifying posture
• An SVM model is built from labeled training data.
• A feature vector not present in the training data is passed through the model, and the model predicts the vector's label.

SVM (cont.)
• Example of SVM in R^2

SVM (cont.)
• SVM generalizes this process to n dimensions, where it finds the hyperplane that maximizes the margin (the distance from the separating hyperplane to the nearest training points of each class).
• Classifying into multiple categories can be reduced to several of these binary classification problems:
  • One-versus-rest
  • One-versus-one

SVM (cont.)
• Sometimes data is not best separated linearly and requires a nonlinear transformation to be separated well.
• This is achieved through kernels, which implicitly map the data into a different space. In particular, if φ is the mapping from one space to another, then K(x, y) = φ(x) · φ(y) is the kernel.
• The simplest kernel, the linear kernel, is just the dot product between two vectors; it is what is used to separate data linearly.

SVM (cont.)
• Examples of kernels
  • Radial basis function (RBF)
  • Polynomial

K-Nearest Neighbor
• Very simple classification method
• To classify a test point x given some training data, find the k closest training points and pick the most frequent label among them as the predicted label for x.

Our SVM Model
• For our SVM model, we used a linear kernel.
• For each of the five subjects, we partitioned their 100 labeled posture samples into 60 samples for training and 40 samples for testing (each posture being represented equally in each partition).

Our SVM Model (cont.)

Results: Experiment 1
• The postures were enumerated as follows: 1 for leaning forward, 2 for slouching, 3 for sitting upright, and 4 for leaning back.
• For each SVM model we constructed a matrix M, where M_{x,y} is the probability that the predicted posture is y given that the labeled posture is x.

Results: Experiment 1
• Type A overall accuracy: 98%
• Type B overall accuracy: 61.9%
• Type C overall accuracy: 67.6%
• Type D overall accuracy: 92.4%

Results: Experiment 2
• For the natural-activity time series, we used a sliding window with a width of 50 frames and a shift of 25 frames. For each window, we took the mean signal and passed it to the corresponding subject's SVM model to predict the posture during that window.

[Figure labels: Leaning back; Walking around the room; Slouching; Leaning forward; Bending over; Getting back up and sitting in chair; Sitting upright]

Analysis
• SVM test results for the static postures
• The best results were obtained when we trained an SVM model on a particular subject's training data and tested it on that same subject's test data.
• The worst results were obtained when the set of subjects used for training and the set of subjects used for testing were disjoint. These results represent the model's ability to generalize.
• P(Predicted = 4 | Actual = 1) = 0 and P(Predicted = 1 | Actual = 4) = 0 in all instances; leaning forward and leaning back were never confused with each other.

Analysis (cont.)
• For every misclassified data point x in the Type B test data, we performed the following procedure:
• Let a be the labeled posture of x and p the posture it was misclassified as in the Type B setting. From the model's training data, we computed the mean posture vector for a, μa, and for p, μp.
• We compared D(x, μa) and D(x, μp), where D is the Euclidean distance.
• In most cases (~80%), we found that D(x, μa) > D(x, μp).
• That is, the misclassified data points in Type B were often closer to the training subject's mean vector for the predicted label than for the actual label.
• We classified the same Type B data with kNN, and the resulting accuracy matrix was very similar to the SVM matrix.

[Figures: kNN Accuracy Matrix; SVM Accuracy Matrix]

Analysis (cont.)
• For Type C, the same procedure showed that only ~40% of the misclassified data points were closer to their predicted mean posture vector than to their labeled mean posture vector in the training data.
• This is entirely possible with an SVM, since its decision depends on which side of the separating hyperplane a point falls, not on the point's distance to the class means.
• As expected, after running kNN on the Type C training and test data, the kNN accuracy matrix did not agree closely with the SVM matrix.

[Figures: kNN Accuracy Matrix; SVM Accuracy Matrix]

Analysis (cont.)
• The Type C models involved data from four subjects. It is possible that the linear kernel was too simple a model to generalize across them.
• While the RBF kernel with various parameters did not outperform the linear kernel, the polynomial kernel slightly outperformed it, with an accuracy of 69.4%.

[Figures: Type C: Linear Kernel; Type C: Polynomial Kernel (Degree = 10)]

Analysis

Conclusions
• We used SVM to classify postures accurately when the training and test data were collected from the same set of people.
• The SVM model had problems generalizing to people outside its training data, as shown by the results of the Type B and Type C experiments.
• Accuracies for leaning forward and leaning back were still fairly high.
• Accuracies for slouching and staying upright were very low.

Future Goals
• Gather more data
• Build a classifier for “good” and “bad” posture
• Begin working towards the second major goal: classification of health risks through biophysical sensors
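
Appendix: Sketch of the Processing Steps
The steps described in the slides above can be sketched roughly as follows: averaging a segment into a mean feature vector, sliding a 50-frame window with a 25-frame shift over a time series, classifying with k-nearest neighbor, building the accuracy matrix M, and comparing D(x, μa) with D(x, μp). This is a minimal illustration in plain Python, not the code used in the study. All function names are hypothetical, and kNN stands in for the linear-kernel SVM so that the sketch needs no external library.

```python
import math
from collections import Counter


def mean_feature(frames):
    """Average an n-by-d list of frames into one length-d feature vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def sliding_windows(frames, width=50, shift=25):
    """Yield the mean feature vector of every full window over the series."""
    for start in range(0, len(frames) - width + 1, shift):
        yield mean_feature(frames[start:start + width])


def euclidean(u, v):
    """Euclidean distance D(u, v)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(x, train_X, train_y, k=3):
    """Return the most frequent label among the k training points nearest x."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: euclidean(x, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy_matrix(actual, predicted, labels=(1, 2, 3, 4)):
    """M[x][y] = fraction of samples labeled x that were predicted as y."""
    counts = {a: Counter() for a in labels}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    matrix = {}
    for a in labels:
        total = sum(counts[a].values())
        matrix[a] = {p: (counts[a][p] / total if total else 0.0) for p in labels}
    return matrix


def closer_to_predicted_mean(x, actual, predicted, train_X, train_y):
    """True when D(x, mu_a) > D(x, mu_p), using class means of the training data."""
    mu_a = mean_feature([v for v, y in zip(train_X, train_y) if y == actual])
    mu_p = mean_feature([v for v, y in zip(train_X, train_y) if y == predicted])
    return euclidean(x, mu_a) > euclidean(x, mu_p)
```

In the setting of Experiment 2, each vector produced by `sliding_windows` would be passed to the subject's trained classifier; in this sketch `knn_predict` plays that role.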