Holger Junker, Oliver Amft, Paul Lukowicz, and Gerhard Tröster
Pattern Recognition, vol. 41, no. 6, pp. 2010-2024, 2008
2010. 04. 08
Jongwon Yoon
• Introduction
– Related works
– Contributions
– Terminologies
• Spotting approach
• Case studies
• Spotting implementation
– Preselection stage
– Classification stage
• Experiments
• Results
• Discussion
• Conclusion
• Activity recognition
– Motivated by a variety of mobile and ubiquitous computing applications
• Body-mounted motion sensors for activity recognition
– Advantage : Only influenced by user activity
– Difficult to extract relevant features
• Information is often ambiguous and incomplete
• Sensors do not provide exact trajectory because of gravity and arm speed changes
• Solution
– Spotting of sporadically occurring activities
Introduction
• Wearable instrumentation for gesture recognition
– Kung Fu moves (Chambers et al., 2002)
– “Atomic” gesture recognition (Benbasat, 2000)
– Household activity recognition (Bao, 2003)
– Workshop activity recognition (Lukowicz et al., 2004)
• Spotting task
– HMM-based endpoint detection in continuous data (Deng and Tsui, 2000)
• Used HMM-based accumulation score
• Searches for the start point using the Viterbi algorithm
– HMM-based Threshold model (Lee and Kim, 1999)
• Calculates the likelihood threshold of an input pattern
– Partitioning the incoming data using an intensity analysis (Lukowicz, 2004)
Introduction
• Two-stage gesture spotting method
– Novel method based on body-worn motion sensors
– Specifically designed towards the needs and constraints of activity recognition in wearable and pervasive systems
• Large null class
• Lack of appropriate models for the null class
• Large variability in the way gestures are performed
• Variable gesture length
• Verification of the proposed method on two scenarios
– Comprise nearly a thousand relevant gestures
– Scenario 1) Interaction with different everyday objects
• Part of a wide range of wearable systems applications
– Scenario 2) Nutrition intake
• Highly specialized application motivated by the needs of a large, industry-dominated health monitoring project
Introduction
• Motion segment
– Represents an atomic, non-overlapping unit of human motion
– Characterized by its spatio-temporal trajectory
• Motion event
– Spans a sequence of motion segments
• Activity
– Describes a situation that may consist of various motion events
• Signal segment
– A slice of sensor data that corresponds to a motion segment
• Candidate section
– A slice of sensor data that may contain a gesture
• Naïve approach
– Performs classification on all possible sections in the data stream
– Problem: prohibitive computational effort
• Two-stage gesture spotting method
– Preselection stage
• Localize and preselect sections in the continuous signal stream
– Classification stage
• Classify candidate sections
• Case study 1
– Spotting of diverse object interaction gestures
• Key component in a context recognition system
• May facilitate more natural human-computer interfaces
• Case study 2
– Dietary intake gestures
• Forms one sensing domain of an automated dietary monitoring system
• Framework
• Relevant gestures
Preselection stage
• Preselection stage
– 1) Initial partitioning of the signal stream
– 2) Identification of potential sections
– 3) Candidate selection
• Partition a motion parameter into non-overlapping, meaningful segments
– Used motion parameter : Pitch and Roll of the lower arm
• Used sliding-window and bottom-up algorithm (SWAB)
– Ex) Partitioning of each buffer of length n
• Step 1) Start from an arbitrary segmentation of the signal into n/2 segments
• Step 2) Calculate the cost of merging each pair of adjacent segments
– Cost : The error of approximating the signal with its linear regression
• Step 3) Merge the lowest cost pair
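The merging steps above can be sketched as follows. This is a minimal illustration of the bottom-up part only, without the sliding-window buffer handling of SWAB. The function names `regression_error` and `bottom_up`, and the stopping threshold `max_error`, are assumptions made for this example; the slides do not state a stopping rule.

```python
def regression_error(ys):
    """Sum of squared residuals of a least-squares line fitted to ys."""
    n = len(ys)
    if n < 3:
        return 0.0  # a line fits one or two points exactly
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(ys))


def bottom_up(signal, max_error):
    """Return segments as (start, end) index pairs, end exclusive."""
    # Step 1: initial fine segmentation into about n/2 two-sample segments.
    segments = [(i, min(i + 2, len(signal))) for i in range(0, len(signal), 2)]
    while len(segments) > 1:
        # Step 2: cost of merging each pair of adjacent segments.
        costs = [
            regression_error(signal[segments[i][0]:segments[i + 1][1]])
            for i in range(len(segments) - 1)
        ]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > max_error:
            break  # assumed stopping rule, not taken from the slides
        # Step 3: merge the lowest-cost pair.
        segments[best:best + 2] = [(segments[best][0], segments[best + 1][1])]
    return segments
```

For a triangular signal such as `[0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]` and a small `max_error`, the sketch keeps one segment for the rising part and one for the falling part.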
Preselection stage
• Used sliding-window and bottom-up algorithm (SWAB) (cont.)
• Extension of the segmentation algorithm
– To ensure that the algorithm provided a good approximation
– Merge adjacent segments if their linear regressions had similar slopes
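The slope-based extension could look roughly like the sketch below. The slides do not say how slope similarity is measured, so the absolute difference and the parameter `slope_tolerance` are assumptions, as are the helper names.

```python
def slope(ys):
    """Least-squares slope of ys against the sample index."""
    n = len(ys)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    return sxy / sxx


def merge_similar_slopes(signal, segments, slope_tolerance):
    """Merge adjacent (start, end) segments whose slopes differ by at most
    slope_tolerance (an assumed similarity criterion)."""
    if not segments:
        return []
    merged = [segments[0]]
    for start, end in segments[1:]:
        prev_start, prev_end = merged[-1]
        prev_slope = slope(signal[prev_start:prev_end])
        if abs(prev_slope - slope(signal[start:end])) <= slope_tolerance:
            merged[-1] = (prev_start, end)  # extend the previous segment
        else:
            merged.append((start, end))
    return merged
```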
Preselection stage
• Each motion segment endpoint is considered a potential end of a gesture
– For each endpoint, potential start points were derived from preceding motion segment boundaries
• Confining the search space
– 1) For the actual length T of the section: $T_{\min} \le T \le T_{\max}$
– 2) For the number of motion segments $n_{MS}$ in the actual section: $N_{MS,\min} \le n_{MS} \le N_{MS,\max}$
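A minimal sketch of how the two constraints confine the search: every motion segment boundary is treated as a possible gesture end, and start points are taken from the preceding boundaries. The function name `potential_sections` and the representation of boundaries as a sorted list of times are assumptions for illustration.

```python
def potential_sections(boundaries, t_min, t_max, n_ms_min, n_ms_max):
    """Enumerate (start, end) sections that satisfy both constraints.

    boundaries: sorted motion segment boundary times (or sample indices).
    """
    sections = []
    for end_idx in range(1, len(boundaries)):
        # Constraint 2: look back over the allowed segment counts only.
        for n_ms in range(n_ms_min, n_ms_max + 1):
            start_idx = end_idx - n_ms
            if start_idx < 0:
                break  # not enough preceding segments
            length = boundaries[end_idx] - boundaries[start_idx]
            # Constraint 1: section length must lie within [t_min, t_max].
            if t_min <= length <= t_max:
                sections.append((boundaries[start_idx], boundaries[end_idx]))
    return sections
```

With boundaries `[0, 2, 5, 6, 10]`, lengths limited to 3..6 and one or two motion segments per section, only five of the ten possible boundary pairs remain.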
Preselection stage
• Searching
– Used simple single-value features
• Min / max signal values, sum of signal samples, duration of the gesture …
– If the distance $d(f_{PS}; G_k)$ is smaller than a gesture-specific threshold, the section is considered to contain gesture $G_k$
• Selection of candidate sections
– Two potential sections can collide (overlap)
– Select the section with the smallest distance (i.e., the highest similarity)
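The thresholding and collision handling can be sketched as below. The distance values are taken as given, since the slides do not define the distance function; the greedy resolution order and the name `select_candidates` are assumptions for this example.

```python
def select_candidates(sections, threshold):
    """sections: list of (start, end, distance) tuples for one gesture.

    Keeps sections whose distance is below the gesture-specific threshold.
    Overlaps are resolved greedily: sections are visited in order of
    increasing distance and dropped if they overlap one already kept.
    """
    passed = sorted((s for s in sections if s[2] < threshold), key=lambda s: s[2])
    selected = []
    for start, end, dist in passed:
        overlaps = any(
            start < kept_end and kept_start < end
            for kept_start, kept_end, _ in selected
        )
        if not overlaps:
            selected.append((start, end, dist))
    return sorted(selected)  # back to temporal order
```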
Spotting implementation
• HMM based classification
• Features
– Pitch and roll angles from the lower / upper arm sensors
– Derivative of the acceleration signal from the lower arm
– The cumulative sum of the acceleration from the lower arm
– Derivative of the rate of turn signal from the lower arm
– The cumulative sum of the rate of turn from the lower arm
• Model
– Single Gaussian models
– Consisted of 4-10 states
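As a rough illustration of HMM-based classification with single Gaussian states, the sketch below scores a candidate section against each gesture model with the forward algorithm and picks the best one. It is simplified to a one-dimensional feature and a left-to-right topology with a fixed self-transition probability; these choices, the parameter `stay_prob`, and all function names are assumptions, not details from the slides.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, means, variances, stay_prob=0.5):
    """Forward algorithm for a left-to-right HMM that starts in state 0."""
    n = len(means)
    log_stay, log_move = math.log(stay_prob), math.log(1.0 - stay_prob)
    alpha = [-math.inf] * n
    alpha[0] = log_gauss(obs[0], means[0], variances[0])
    for x in obs[1:]:
        new_alpha = []
        for j in range(n):
            # The last state can only loop, so its self-transition is 1.
            stay = 0.0 if j == n - 1 else log_stay
            paths = [alpha[j] + stay]
            if j > 0:
                paths.append(alpha[j - 1] + log_move)
            new_alpha.append(logsumexp(paths) + log_gauss(x, means[j], variances[j]))
        alpha = new_alpha
    return logsumexp(alpha)


def classify(obs, models):
    """models: dict name -> (means, variances). Returns the best-scoring name."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))
```

In the setting described above, each gesture would have its own model with 4 to 10 states, and the observations would be the multi-dimensional feature vectors listed under Features.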
• Experimental setting
– Five inertial sensors
– Four subjects: one female and three male
• Right-handed
• Aged 25-35 years
• Data sets
– No constraints on the movements of the subjects
• To obtain data sets with a realistic zero-class
– Eight additional similar gestures
• To enrich the diversity of movements
Results
• Recall and Precision
• Other evaluation metrics
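The slides name recall and precision without giving formulas. The sketch below uses the standard definitions for a spotting task: recall is the fraction of relevant gestures that were recognized, and precision is the fraction of retrieved sections that were correct. The function name and argument names are assumptions.

```python
def recall_precision(recognized, relevant, retrieved):
    """recognized: correctly spotted gestures
    relevant:   gestures actually present in the data
    retrieved:  sections returned by the spotter
    """
    recall = recognized / relevant if relevant else 0.0
    precision = recognized / retrieved if retrieved else 0.0
    return recall, precision
```

For example, 80 correctly spotted gestures out of 100 relevant ones, with 160 sections returned in total, gives a recall of 0.8 and a precision of 0.5.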
Results
• Precision-recall curves
• Evaluation results
Results
• Initial testing
– Case 1 : 98.4% / Case 2 : 97.4%
• Classification of candidate sections
Results
• Including a zero-class model
– Case 1 : Extracted from all relevant gesture models
– Case 2 : Constructed on the basis of additional gestures that were carried out by the subjects
• Summary of the total spotting results
• Similarity-based search
– Way to avoid the explicit modeling of a zero-class
• Explicit zero-class model can be added to improve the recognition
– Permits different feature sets for individual gestures
• Future work
– Additional challenges
• Differences in the size and consistency of food pieces
• Additional degrees of freedom
• Temporal aspects
– The presented spotting approach can be applied to other types of motion events