CSCI498B/598B Human-Centered Robotics October 5, 2015

Skeleton-based representations
• Solution 2: Compute additional features
[Figure: skeleton annotated with a distance 𝒅 and an angle 𝜽]
Skeleton-based representation
• Solution 2: Feature extraction - compute additional features
Representation: Histogram of Joint Position Differences
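The slide names the representation without giving its construction. Below is a minimal sketch of one way such a histogram could be built. It assumes each frame is a list of (x, y, z) joint positions and that differences are taken relative to a reference joint (index 0 here, a hypothetical choice). The bin count and value range are illustrative parameters, not values from the lecture.

```python
def histogram(values, bins, lo, hi):
    """Count values into `bins` equal-width bins over [lo, hi]; clamp outliers."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # values outside [lo, hi] go to the edge bins
        counts[idx] += 1
    return counts


def hjpd(frames, ref=0, bins=4, lo=-1.0, hi=1.0):
    """Histogram of joint position differences, one histogram per axis.

    frames: list of frames, each a list of (x, y, z) joint positions.
    ref:    index of the reference joint (an assumption of this sketch).
    """
    diffs = ([], [], [])  # per-axis differences to the reference joint
    for joints in frames:
        origin = joints[ref]
        for j, p in enumerate(joints):
            if j == ref:
                continue
            for axis in range(3):
                diffs[axis].append(p[axis] - origin[axis])
    feature = []
    for axis in range(3):
        feature.extend(histogram(diffs[axis], bins, lo, hi))
    return feature


# Two frames of a toy three-joint skeleton.
frames = [
    [(0.0, 0.0, 0.0), (0.5, -0.5, 0.1), (-0.9, 0.2, 0.0)],
    [(0.1, 0.0, 0.0), (0.7, -0.4, 0.1), (-0.8, 0.3, 0.0)],
]
feature = hjpd(frames)
```

The result has a fixed length (3 axes times `bins`) regardless of how many frames the sequence contains, which is what makes it usable as classifier input.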
Skeleton-based representation
• Solution 2: Feature extraction - compute additional features
Representation: Joint Movement Volume Features
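As a rough illustration, a joint's movement volume can be taken as the volume of the axis-aligned box that encloses its trajectory. The sketch below makes that assumption; the function name and the choice to return the three extents alongside the volume are illustrative, not taken from the slides.

```python
def joint_movement_volume(frames):
    """Per joint: (extent_x, extent_y, extent_z, volume) of its trajectory's bounding box.

    frames: list of frames, each a list of (x, y, z) joint positions.
    """
    n_joints = len(frames[0])
    features = []
    for j in range(n_joints):
        extents = []
        for axis in range(3):
            coords = [frame[j][axis] for frame in frames]
            extents.append(max(coords) - min(coords))
        volume = extents[0] * extents[1] * extents[2]
        features.append((extents[0], extents[1], extents[2], volume))
    return features


# Joint 0 stays still; joint 1 moves through a 2 x 3 x 0.5 box.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.0, 0.0, 0.0), (3.0, 2.0, 1.5)],
    [(0.0, 0.0, 0.0), (2.0, 4.0, 1.0)],
]
jmv = joint_movement_volume(frames)
```

A joint that barely moves during the activity gets a volume near zero, so the feature indicates which body parts are involved.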
Skeleton-based representation
• Solution 2: Feature extraction - compute additional features
Representation: Covariance of 3D Joints
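Temporal pyramid: compute the descriptor over the whole sequence and over shorter sub-windows, then concatenate the results.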
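A minimal sketch of a covariance descriptor with a temporal pyramid, under stated assumptions: each frame is flattened into one vector of joint coordinates, the descriptor is the upper triangle of the sample covariance matrix over frames, and the pyramid splits the sequence into 1, 2, 4, ... non-overlapping windows. The windowing scheme and function names are illustrative choices, not necessarily those used in the lecture.

```python
def covariance_descriptor(frames):
    """Upper triangle of the sample covariance of flattened joint vectors."""
    samples = [[c for joint in frame for c in joint] for frame in frames]
    t, n = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / t for i in range(n)]
    desc = []
    for i in range(n):
        for j in range(i, n):  # the matrix is symmetric, so keep j >= i only
            acc = sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples)
            desc.append(acc / (t - 1))
    return desc


def temporal_pyramid(frames, levels=2):
    """Concatenate descriptors of 1, 2, 4, ... equal sub-windows.

    Each window needs at least two frames; trailing frames that do not
    fill a window are ignored in this sketch.
    """
    feature = []
    for level in range(levels):
        parts = 2 ** level
        size = len(frames) // parts
        for p in range(parts):
            window = frames[p * size:(p + 1) * size]
            feature.extend(covariance_descriptor(window))
    return feature


# Four frames of a single joint.
frames = [
    [(0.0, 0.0, 0.0)],
    [(1.0, 2.0, 0.0)],
    [(2.0, 4.0, 0.0)],
    [(3.0, 6.0, 1.0)],
]
feature = temporal_pyramid(frames, levels=2)
```

The covariance over the whole sequence discards the order of frames. The sub-window descriptors recover some of that temporal information.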
Skeleton-based representation
• Solution 2: Feature extraction - compute additional features
Representation: Histogram of Oriented Displacement
Temporal pyramid also applies.
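A minimal sketch of a histogram of oriented displacements for one joint, under stated assumptions: the 3D trajectory is projected onto the xy, yz and xz planes, each frame-to-frame displacement votes into an angle bin, and the vote is weighted by the displacement length. The bin count and function names are illustrative.

```python
import math


def hod_2d(points, bins=8):
    """Histogram of displacement directions for a 2D trajectory, weighted by length."""
    hist = [0.0] * bins
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # no motion between these frames, so no direction to vote for
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[idx] += length
    return hist


def hod_3d(trajectory, bins=8):
    """Concatenate the 2D histograms of the xy, yz and xz projections."""
    feature = []
    for a, b in ((0, 1), (1, 2), (0, 2)):
        feature.extend(hod_2d([(p[a], p[b]) for p in trajectory], bins))
    return feature


# One joint over three frames.
trajectory = [(0.0, 0.0, 0.0), (4.0, 3.0, 0.0), (1.0, 7.0, 0.0)]
feature = hod_3d(trajectory)
```

For a full skeleton, the per-joint vectors would be concatenated. Applying the function to sub-windows of the trajectory gives the temporal-pyramid variant.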
Skeleton-based representation
• Advantages:
  • Fast to compute
  • Invariant to human body rotations
  • Achieves satisfactory recognition performance
• Disadvantages:
  • Cannot model human-object interaction
  • Cannot incorporate background/context information
  • Performance depends heavily on skeleton data quality
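The rotation-invariance claim can be checked for one simple feature, the distance between pairs of joints. The sketch below assumes the y axis is vertical and uses a toy three-joint skeleton; both are illustrative choices. Whether a particular representation is invariant depends on the features it uses.

```python
import math


def pairwise_distances(joints):
    """Euclidean distance between every pair of joints."""
    return [math.dist(p, q) for i, p in enumerate(joints) for q in joints[i + 1:]]


def rotate_about_vertical(joints, angle):
    """Rotate (x, y, z) points about the y axis (assumed vertical) by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in joints]


skeleton = [(0.0, 1.6, 0.0), (0.2, 1.4, 0.1), (-0.3, 1.0, 0.2)]
rotated = rotate_about_vertical(skeleton, math.radians(40))

before = pairwise_distances(skeleton)
after = pairwise_distances(rotated)
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
```

The raw coordinates change when the person turns, but the distances do not, so a classifier trained on them does not need examples from every viewing direction.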
General Pipeline
1. Data acquisition
2. Representation construction
3. Pattern recognition (classification or clustering)
4. Decision making
5. Taking an action
Bag-of-Words representation
• Question: Can we use the local features (e.g.,
“regions of corners”) to represent an observation
or a human activity?
Bag-of-Words representation
• Bag-of-words representation has its origin in text processing
• Orderless document representation: frequencies of words
from a dictionary
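The text-processing origin can be shown in a few lines. The dictionary and sentences below are made up for illustration.

```python
from collections import Counter


def bag_of_words(document, dictionary):
    """Frequency of each dictionary word in the document; word order is ignored."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in dictionary]


dictionary = ["robot", "human", "sensor", "activity"]
a = bag_of_words("The robot observes the human and the human greets the robot", dictionary)
b = bag_of_words("the human greets the robot and the robot observes the human", dictionary)
```

Both sentences map to the same vector because only word frequencies are kept. Words outside the dictionary are dropped.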
Bag-of-Words representation
• Texture is characterized by the repetition of basic elements
Bag-of-Words representation
• Procedures:
  • Feature detection: e.g., corners
  • Feature description: e.g., encode the local region around each corner
  • Dictionary learning: e.g., cluster the feature descriptors, then assign each descriptor the index of its cluster
  • Bag-of-words representation: count how often each dictionary index occurs
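The last two steps of the procedure can be sketched as follows. The code assumes feature detection and description have already produced numeric descriptors (toy 2D vectors here), and it uses plain k-means as the clustering method. The function names, the deterministic initialization and the data are illustrative assumptions.

```python
import math


def nearest(descriptor, centers):
    """Index of the closest cluster center, i.e. the descriptor's visual word."""
    return min(range(len(centers)), key=lambda k: math.dist(descriptor, centers[k]))


def learn_dictionary(descriptors, k, iterations=10):
    """Plain k-means (Lloyd's algorithm); initial centers are the first k descriptors."""
    centers = [tuple(d) for d in descriptors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        # New center = mean of its group; keep the old center if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def bag_of_words(descriptors, centers):
    """Histogram of visual-word indices for one observation."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1
    return hist


# Descriptors pooled from training data, then one new observation.
training = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.1, 4.9), (0.1, 0.2)]
centers = learn_dictionary(training, k=2)
observation = [(0.1, 0.1), (4.9, 5.0), (5.2, 5.1)]
hist = bag_of_words(observation, centers)
```

The dictionary is learned once from training descriptors. Every observation, however many features it contains, is then reduced to a histogram of length `k`.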
Bag-of-Words representation
• Advantages:
  • Robust to partial occlusion, illumination changes, etc.
  • Independent of global skeleton data
  • Capable of modeling more complicated scenarios, such as human-object interaction
  • Capable of capturing context information contained in the scene
How about modeling human activities?
Bag-of-Words representation
• Limitations:
  • Incapable of modeling time
  • Incapable of dealing with 3D perceptual data
  • Ignores spatial information, since it assumes there is no order among the visual features/words
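The time limitation follows directly from the orderless histogram. In the sketch below, two hypothetical sequences of visual-word indices, one the reverse of the other, produce the same representation. The activity names and indices are made up for illustration.

```python
from collections import Counter


def word_histogram(word_indices, dictionary_size):
    """Count how often each visual word occurs; the order of occurrence is discarded."""
    counts = Counter(word_indices)
    return [counts[k] for k in range(dictionary_size)]


sit_down = [0, 0, 1, 2, 2]            # hypothetical visual words in frame order
stand_up = list(reversed(sit_down))   # the same words in the opposite order

h_sit = word_histogram(sit_down, dictionary_size=3)
h_stand = word_histogram(stand_up, dictionary_size=3)
```

Because `h_sit` and `h_stand` are identical, a classifier that sees only these histograms cannot separate two activities that differ only in the order of their motions.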