Processing Sequential Sensor Data
John Krumm
Microsoft Research
Redmond, Washington USA
jckrumm@microsoft.com
Interpret a Sequential Signal
[Figure: a 1-D signal plotted over time, 0–100 seconds]
Signal is
• Often a function of time (as above)
• Often from a sensor
Pervasive/Ubicomp Examples
Signal sources
• Accelerometer
• Light sensor
• Gyro sensor
• Indoor location
• GPS
• Microphone
•…
Interpretations
• Speed
• Mode of transportation
• Location
• Moving vs. not moving
• Proximity to other people
• Emotion
•…
Goals of this Tutorial
• Confidence to add sequential signal processing to your research
• Ability to assess research with simple sequential signal processing
• Know the terminology
• Know the basic techniques
• How to implement
• Where they’re appropriate
• Assess numerical results in an accepted way
• At least give the appearance that you know what you’re talking about
Not Covering
• Regression – fit a function to data
• Classification – classify things based on measured features
• Statistical tests – determine if data support a hypothesis
Outline
• Introduction (already done!)
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results
Signal Dimensionality
1D: z(t)
[Figure: a 1-D signal plotted against time (seconds)]
2D: z(t) = ( z1(t), z2(t) )ᵀ   (bold means vector)
[Figure: a 2-D signal plotted as z2 (meters) vs. z1 (meters)]
Sampled Signal
Cannot measure or store a continuous signal, so take samples instead:
[ z(0), z(Δ), z(2Δ), … , z((n−1)Δ) ] = [ z1, z2, z3, … , zn ]
Δ = sampling interval, e.g. 1 second, 5 minutes, …
[Figure: the 1-D signal sampled over 0–5 seconds with Δ = 0.1 seconds]
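Sampling can be sketched in a couple of NumPy lines. The signal function below is a made-up stand-in for the curve on the slide; the Δ = 0.1 s and 5-second duration match the figure.

```python
import numpy as np

def sample(signal, duration, delta):
    """Sample a continuous signal at interval delta, returning the
    sample times and the values [z1, z2, ..., zn]."""
    n = int(round(duration / delta))   # number of samples
    t = np.arange(n) * delta           # 0, delta, 2*delta, ...
    return t, signal(t)

# Illustrative signal: the true curve on the slide is not specified.
t, z = sample(lambda t: 60 + 40 * np.sin(t), duration=5.0, delta=0.1)
```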
Signal + Noise
zi = xi + vi
where zi is the measurement from the noisy sensor, xi is the actual (but unknown) value, and vi is a random number representing sensor noise.
Noise is
• Often assumed to be Gaussian
• Often assumed to be zero mean
• Often assumed to be i.i.d. (independent, identically distributed)
• vi ~ N(0, σ) for zero-mean, Gaussian, i.i.d. noise, where σ is the standard deviation
[Figure: the noisy 1-D signal plotted over time (seconds)]
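The noise model above is easy to simulate. A minimal sketch, assuming an arbitrary underlying signal x and the slide's i.i.d. zero-mean Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 3.0                            # noise standard deviation (illustrative)
x = np.linspace(20, 100, 100)          # actual values x_i (unknown in practice)
v = rng.normal(0.0, sigma, x.shape)    # i.i.d. zero-mean Gaussian noise v_i
z = x + v                              # z_i = x_i + v_i: what the sensor reports
```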
Running Example
Track a moving person in (x, y)
• 1000 (x, y) measurements
• Δ = 1 second
z_i = x_i + v_i   (measurement vector = actual location + noise)
x_i = ( x_i, y_i )ᵀ
v_i = ( v_i^(x), v_i^(y) )ᵀ ~ ( N(0, 3), N(0, 3) )ᵀ   (zero mean, standard deviation = 3 meters)
Also 10 randomly inserted outliers with N(0, 15)
[Figure: actual path and measured locations, x and y in meters, with the start point and an outlier marked]
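The running example is straightforward to simulate. The circular path below is an assumption (the slide does not specify the actual path, only the noise model: σ = 3 m plus 10 outliers drawn with σ = 15 m):

```python
import numpy as np

def make_running_example(n=1000, dt=1.0, sigma=3.0, n_outliers=10,
                         outlier_sigma=15.0, seed=0):
    """Simulate a moving person's (x, y) path plus noisy measurements."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) * dt
    # Illustrative smooth path (a circle); the slide's path is unspecified.
    x = np.column_stack([50 + 40 * np.cos(2 * np.pi * t / t[-1]),
                         50 + 40 * np.sin(2 * np.pi * t / t[-1])])
    z = x + rng.normal(0.0, sigma, x.shape)          # z_i = x_i + v_i
    # Replace a few measurements with much noisier outliers.
    idx = rng.choice(n, size=n_outliers, replace=False)
    z[idx] = x[idx] + rng.normal(0.0, outlier_sigma, (n_outliers, 2))
    return t, x, z

t, x, z = make_running_example()
```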
Mean Filter
• Also called "moving average" and "box car filter"
• Apply to x and y measurements separately
[Figure: the filtered version of each point is the mean of the points in a sliding window ending at that point]
• "Causal" filter because it doesn't look into the future
• Causes lag when values change sharply
• Can help fix lag with decaying weights, e.g. weights that decay with the age of the sample
• Sensitive to outliers, i.e. one really bad point can cause the mean to take on any value
• Simple and effective (I will not vote to reject your paper if you use this technique)
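The causal mean filter above can be sketched in a few lines of NumPy; the window length of 10 matches the slide, and the filter is applied to the x and y measurement streams separately:

```python
import numpy as np

def causal_mean_filter(z, window=10):
    """Causal moving average: each output is the mean of the current
    sample and up to window-1 earlier samples (fewer near the start,
    and never any future samples)."""
    out = np.empty(len(z), dtype=float)
    for i in range(len(z)):
        out[i] = z[max(0, i - window + 1):i + 1].mean()
    return out
```

For a 2-D track, call it once on the x coordinates and once on the y coordinates.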
Mean Filter
[Figure: actual path and measured locations (left) vs. mean-filtered path (right), x and y in meters; the outlier is marked; 10 points in each mean]
• Outlier has noticeable impact
• If only there were some convenient way to fix this …
Median Filter
The filtered version of each point is the median of the points in the sliding window, which makes it insensitive to the value of, e.g., any one point in the window.
The median is way less sensitive to outliers than the mean:
median (1, 3, 4, 7, 1 × 10¹⁰) = 4
mean (1, 3, 4, 7, 1 × 10¹⁰) ≈ 2 × 10⁹
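The median filter is the mean filter with one word changed; a minimal sketch, including the slide's outlier example:

```python
import numpy as np

def causal_median_filter(z, window=10):
    """Same sliding window as the mean filter, but take the median,
    which one wild outlier cannot drag arbitrarily far."""
    out = np.empty(len(z), dtype=float)
    for i in range(len(z)):
        out[i] = np.median(z[max(0, i - window + 1):i + 1])
    return out

# The slide's example: one huge value barely moves the median.
vals = np.array([1.0, 3.0, 4.0, 7.0, 1e10])
# np.median(vals) is 4.0; np.mean(vals) is about 2e9
```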
Median Filter
[Figure: actual path and measured locations (left) vs. median-filtered path (right), x and y in meters; the outlier is marked; 10 points in each median]
Outlier has noticeably less impact
Mean and Median Filter
[Figure: mean-filtered vs. median-filtered paths overlaid, x and y in meters]
Editorial: mean vs. median
The median is almost always better to use than the mean.
Kalman Filter
• Mean and median filters assume smoothness
• Kalman filter adds an assumption about the trajectory
[Figure: data points with an assumed parabolic trajectory; weight data against assumptions about the system's dynamics]
[Image: my favorite book on Kalman filtering]
Big difference #1: Kalman filter includes (helpful) assumptions about the behavior of the measured process
Kalman Filter
The Kalman filter separates measured variables from state variables.
Measure: z_i = ( z_i^(x), z_i^(y) )ᵀ
Infer state: x_i = ( x_i, y_i, v_i^(x), v_i^(y) )ᵀ
Running example: measure (x, y) coordinates (noisy); estimate location and velocity (!)
Big difference #2: Kalman filter can include state variables that are not measured directly
Kalman Filter Measurements
The measurement vector is related to the state vector by a matrix multiplication plus noise:
z_i = H_i x_i + v_i,   v_i ~ N(0, R_i)
Running example:
( z_i^(x) )   ( 1 0 0 0 )   ( x_i     )
( z_i^(y) ) = ( 0 1 0 0 ) · ( y_i     ) + N(0, R_i)
                            ( v_i^(x) )
                            ( v_i^(y) )
which is just
z_i^(x) = x_i + N(0, σ_r)
z_i^(y) = y_i + N(0, σ_r)
• In this case, measurements are just noisy copies of the actual location
• Makes sensor noise explicit, e.g. GPS has a σ of around 5 meters
Sleepy eyes threat level: orange
Kalman Filter Dynamics
Insert a bias for how we think the system will change through time:
x_i = Φ_{i−1} x_{i−1} + w_{i−1}
Running example:
( x_i     )   ( 1 0 Δt_i 0    )   ( x_{i−1}     )   ( 0         )
( y_i     ) = ( 0 1 0    Δt_i ) · ( y_{i−1}     ) + ( 0         )
( v_i^(x) )   ( 0 0 1    0    )   ( v_{i−1}^(x) )   ( N(0, σ_s) )
( v_i^(y) )   ( 0 0 0    1    )   ( v_{i−1}^(y) )   ( N(0, σ_s) )
Row by row:
x_i = x_{i−1} + Δt_i v_{i−1}^(x)   (location is standard straight-line motion)
v_i^(x) = v_{i−1}^(x) + N(0, σ_s)   (velocity changes randomly, because we don't have any idea what it actually does)
Kalman Filter Ingredients
H matrix: gives measurements for a given state
H_i = ( 1 0 0 0
        0 1 0 0 )
N(0, R_i): measurement noise (sensor noise)
Φ matrix: gives time dynamics of the state
Φ_i = ( 1 0 Δt_i 0
        0 1 0    Δt_i
        0 0 1    0
        0 0 0    1 )
N(0, Q_i): process noise (uncertainty in the dynamics model)
Kalman Filter Recipe
Predict:
x_i^(−) = Φ_{i−1} x_{i−1}^(+)
P_i^(−) = Φ_{i−1} P_{i−1}^(+) Φ_{i−1}ᵀ + Q_{i−1}
Update:
K_i = P_i^(−) H_iᵀ ( H_i P_i^(−) H_iᵀ + R_i )⁻¹
x_i^(+) = x_i^(−) + K_i ( z_i − H_i x_i^(−) )
P_i^(+) = ( I − K_i H_i ) P_i^(−)
• Just plug in measurements and go
• Recursive filter – current time step uses state and error estimates from the previous time step
Sleepy eyes threat level: red
Big difference #3: Kalman filter gives an uncertainty estimate in the form of a Gaussian covariance matrix
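The recipe above can be sketched directly for the running example's four-element state (x, y, vx, vy). The σ values and the initial covariance here are illustrative, not tuned:

```python
import numpy as np

def kalman_track(zs, dt=1.0, sigma_r=3.0, sigma_s=1.0):
    """Kalman filter for the running example. sigma_r is measurement
    noise; sigma_s is the process ("tuning") noise on velocity."""
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)        # measurement matrix
    Phi = np.array([[1, 0, dt, 0],
                    [0, 1, 0, dt],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1]], dtype=float)      # straight-line dynamics
    R = (sigma_r ** 2) * np.eye(2)                   # measurement noise cov
    Q = np.diag([0, 0, sigma_s ** 2, sigma_s ** 2])  # process noise cov
    x = np.array([zs[0][0], zs[0][1], 0, 0], dtype=float)
    P = 10.0 * np.eye(4)                             # initial uncertainty
    estimates = []
    for z in zs:
        # Predict (the "minus" quantities in the recipe).
        x = Phi @ x
        P = Phi @ P @ Phi.T + Q
        # Update: gain, then the "plus" quantities.
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (np.asarray(z, dtype=float) - H @ x)
        P = (np.eye(4) - K @ H) @ P
        estimates.append(x[:2].copy())
    return np.array(estimates)
```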
Kalman Filter
Velocity model: v_i^(x) = v_{i−1}^(x) + N(0, σ_s)
[Figure: Kalman-filtered path vs. actual, x and y in meters]
• Smooth
• Tends to overshoot corners
• Too much dependence on the straight-line velocity assumption
• Too little dependence on data
Kalman Filter
Velocity model: v_i^(x) = v_{i−1}^(x) + N(0, σ_s)
[Figure: untuned vs. tuned Kalman-filtered paths, x and y in meters]
• Hard to pick the process noise σ_s
• Process noise models our uncertainty in the system dynamics
• Here it accounts for the fact that motion is not a straight line
"Tuning" σ_s (by trying a bunch of values) gives a better result
Kalman Filter
Editorial: Kalman filter
The Kalman filter was fine back in the old days. But I really prefer more modern methods that are not saddled with Kalman's restrictions on continuous state variables and linearity assumptions.
Particle Filter
Dieter Fox et al.
WiFi tracking in a multi-floor building
• Multiple “particles” as hypotheses
• Particles move based on probabilistic motion model
• Particles live or die based on how well they match sensor data
Particle Filter
Dieter Fox et al.
• Allows multi-modal uncertainty (Kalman is unimodal Gaussian)
• Allows continuous and discrete state variables (e.g. 3rd floor)
• Allows rich dynamic model (e.g. must follow floor plan)
• Can be slow, especially if state vector dimension is too large
(e.g. (x, y, identity, activity, next activity, emotional state, …) )
Particle Filter Ingredients
• z = measurement, x = state; not necessarily the same
• p(z_i | x_i): probability distribution of a measurement given the actual state
• Can be anything, not just Gaussian like Kalman
• But we use a Gaussian for the running example, just like Kalman
E.g. measured speed (in z) will be slower if the emotional state (in x) is "tired".
For the running example, the measurement is a noisy version of the actual value.
Particle Filter Ingredients
• p(x_i | x_{i−1}): probabilistic dynamics, how the state changes through time
• Can be anything, e.g.
• Tend to go slower up hills
• Avoid left turns
• Attracted to Scandinavian people
• Closed form not necessary
• Just need a dynamic simulation with a noise component
• But we use Gaussian for the running example, just like Kalman
Home Example
Rich measurement and state dynamics models
Measurements: z = ( (x,y) location in house from WiFi )ᵀ
State (what we want to estimate): x = ( room, activity )
p(z_i | x_i):
• p( (x,y) in kitchen | in bathroom ) = 0
p(x_i | x_{i−1}):
• p( sleeping now | sleeping previously ) = 0.9
• p( cooking now | working previously ) = 0.02
• p( watching TV & sleeping | * ) = 0
• p( bedroom 4 | master bedroom ) = 0
Particle Filter Algorithm
Start with N instances of the state vector x_i^(j), i = 0, j = 1 … N
1. i = i + 1
2. Take new measurement z_i
3. Propagate particles forward in time with p(x_i | x_{i−1}), i.e. generate new, random hypotheses
4. Compute importance weights w_i^(j) = p(z_i | x_i^(j)), i.e. how well does the measurement support the hypothesis?
5. Normalize importance weights so they sum to 1.0
6. Randomly pick new particles based on importance weights
7. Go to 1
Compute the state estimate with a
• Weighted mean (assumes unimodal)
• Median
Sleepy eyes threat level: orange
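The algorithm above can be sketched as a bootstrap particle filter for the running example. The state per particle is (x, y, vx, vy); the dynamics and Gaussian sensor model mirror the Kalman example, and the σ values are illustrative:

```python
import numpy as np

def particle_filter(zs, n_particles=1000, dt=1.0, sigma_r=3.0,
                    sigma_s=1.0, seed=0):
    """Bootstrap particle filter for the (x, y) running example."""
    rng = np.random.default_rng(seed)
    zs = np.asarray(zs, dtype=float)
    # Start: N hypotheses scattered around the first measurement.
    p = np.zeros((n_particles, 4))
    p[:, :2] = zs[0] + rng.normal(0, sigma_r, (n_particles, 2))
    estimates = []
    for z in zs:
        # Step 3: propagate with p(x_i | x_{i-1}): straight-line motion
        # plus a random velocity change.
        p[:, :2] += dt * p[:, 2:]
        p[:, 2:] += rng.normal(0, sigma_s, (n_particles, 2))
        # Step 4: importance weights w = p(z_i | x_i), Gaussian sensor model.
        d2 = ((p[:, :2] - z) ** 2).sum(axis=1)
        w = np.exp(-d2 / (2 * sigma_r ** 2))
        w /= w.sum()                                  # step 5: normalize
        # Weighted-mean state estimate (assumes unimodal).
        estimates.append(w @ p[:, :2])
        # Step 6: resample particles in proportion to their weights.
        p = p[rng.choice(n_particles, n_particles, p=w)]
    return np.array(estimates)
```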
Particle Filter Running Example
p(z_i | x_i): measurement model reflects the true, simulated measurement noise. Same as Kalman in this case.
p(x_i | x_{i−1}): straight-line motion with a random velocity change. Same as Kalman in this case.
x_i = x_{i−1} + Δt_i v_i^(x)   (location is standard straight-line motion)
v_i^(x) = v_{i−1}^(x) + N(0, σ_s)   (velocity changes randomly, because we don't have any idea what it actually does)
[Figure: actual path vs. particle filter estimates with 1,000 and 1,000,000 particles, x and y in meters]
Sometimes increasing the number of particles helps.
Particle Filter Resources
UbiComp 2004
Especially Chapter 1
Particle Filter
Editorial: Particle filter
The particle filter is wonderfully rich and expressive if you can afford the computations. Be careful not to let your state vector get too large.
Hidden Markov Model (HMM)
Big difference from previous: states are discrete, e.g.
• Spoken phoneme
• { walking, driving, biking, riding bus }
• { moving, still }
• { cooking, sleeping, watching TV, playing game, … }
[Images: a portrait of Markov (1856–1922), and a "Hidden Markov" visual joke]
(Unhidden) Markov Model
[Diagram: three states (walk, bus, drive) connected by arrows labeled with transition probabilities]
• Move to a new state (or not)
• at every time click
• when finished with the current state
• Transition probabilities control state transitions
Example inspired by: UbiComp 2003
Hidden Markov Model
[Diagram: the same walk/bus/drive states and transition probabilities, now observed only through an accelerometer]
Can "see" states only via a noisy sensor.
HMM: Two Parts
Two parts to every HMM:
1) Observation probabilities P(X_i^(j) | z_i): probability of state j given the measurement at time i
2) Transition probabilities a_jk: probability of a transition from state j to state k
The sequence unrolls as initial state probabilities P(X_0^(j)), then alternating transition probabilities a_jk and observation probabilities P(X_1^(j) | z_1), P(X_2^(j) | z_2), P(X_3^(j) | z_3), …
• Find the path that maximizes the product of probabilities (observation & transition)
• Use the Viterbi algorithm to find the path efficiently
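The Viterbi search can be sketched in a few lines, working in log space to avoid underflow on long sequences (an implementation detail, not on the slide). The still/moving demo numbers below are made up:

```python
import numpy as np

def viterbi(obs_prob, trans_prob, init_prob):
    """Find the state path maximizing the product of observation and
    transition probabilities.

    obs_prob:   (T, S) observation probabilities per time step
    trans_prob: (S, S) transition probabilities a_jk
    init_prob:  (S,)   initial state probabilities
    """
    T, S = obs_prob.shape
    logp = np.log(np.maximum(1e-300, obs_prob))
    loga = np.log(np.maximum(1e-300, trans_prob))
    score = np.log(np.maximum(1e-300, init_prob)) + logp[0]
    back = np.zeros((T, S), dtype=int)    # best predecessor per state
    for t in range(1, T):
        cand = score[:, None] + loga      # score[j] + log a_jk
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logp[t]
    # Trace the best path backwards from the best final state.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Demo with the still/moving states (0 = still, 1 = moving); the
# "sticky" transitions override one weak-looking "moving" observation.
trans = np.array([[0.99989, 0.00011],
                  [0.00011, 0.99989]])
obs = np.array([[0.9, 0.1],    # looks still
                [0.4, 0.6],    # looks (weakly) moving
                [0.9, 0.1]])   # looks still
path = viterbi(obs, trans, np.array([0.5, 0.5]))
```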
Smooth Results with HMM
[Figure: signal strength vs. time (sec.), segmented into still / moving / still]
States: { still, moving }
Observation probabilities: signal strength has a higher noise variance when moving.
Transition probabilities (made-up numbers; transitions between states are relatively rare):
            still     moving
still       0.99989   0.00011
moving      0.00011   0.99989
Smooth Results with HMM
[Diagram: Viterbi trellis over still/moving states, with observation probabilities on the nodes and transition probabilities on the edges]
The Viterbi algorithm finds the path with the maximum product of observation and transition probabilities.
[Figure: still vs. moving estimate over time (seconds): actual states, raw inferred states, and states inferred and smoothed with the HMM]
Results in fewer false transitions between states, i.e. smoother and slightly more accurate.
Running Example
Hidden Markov Model
• Discrete states are 10,000 1 m × 1 m squares
• Observation probabilities spread in a Gaussian over nearby squares as per the measurement noise model
• Transition probabilities go to the 8-connected neighbors:
0.011762  0.136136  0.011762
0.13964   0.401401  0.13964
0.011762  0.136136  0.011762
[Figure: HMM-estimated path vs. actual, x and y in meters]
HMM Reference
• Good description of Viterbi algorithm
• Also how to learn model from data
Hidden Markov Model
Editorial: Hidden Markov Model
The HMM is great for certain applications when your states are discrete.
Tracking in (x, y, z) with an HMM?
• Huge state space (→ slow)
• Long dwells
• Interactions with other airplanes
Presenting Continuous Performance Results
Euclidean distance between the estimated value and the actual value:
e_i = ‖ x̂_i − x_i ‖
Plot the mean or median of the Euclidean distance error
• Median is less sensitive to error outliers
[Figure: tracking error vs. filter (meters), mean and median error bars for Measured, Mean, Median, Kalman (untuned), Kalman (tuned), Particle, and HMM]
Note: Don't judge these filtering methods based on these plots. I didn't spend much time tuning the methods to improve their performance.
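The error metric above takes only a few lines to compute; the two-point demo values are made up:

```python
import numpy as np

def track_errors(estimated, actual):
    """Euclidean distance error e_i = ||estimated_i - actual_i|| at each
    time step, plus the mean and median summaries from the slide."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(actual, dtype=float)
    e = np.linalg.norm(diff, axis=1)
    return e, float(e.mean()), float(np.median(e))

# Tiny made-up example: the two errors are 0 and 5 meters.
e, mean_err, median_err = track_errors([[0, 0], [3, 4]], [[0, 0], [0, 0]])
```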
Presenting Continuous Performance Results
Cumulative error distribution
• Shows how errors are distributed
• More detailed than just a mean or median error
[Figure: cumulative error distribution, fraction vs. error (meters), for the Median, HMM, Kalman (tuned), Particle, Mean, and Kalman (untuned) filters, with the median and 95th percentile marked]
• 95% of the time, the particle filter gives an error of 6 meters or less (95th percentile error)
• 50% of the time, the particle filter gives an error of 2 meters or less (median error)
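The cumulative error distribution and the percentile summaries are also a one-liner each; the error values below are made up:

```python
import numpy as np

def error_cdf(errors):
    """Sorted errors plus the fraction of samples at or below each one,
    i.e. the cumulative error distribution curve from the slide."""
    e = np.sort(np.asarray(errors, dtype=float))
    frac = np.arange(1, len(e) + 1) / len(e)
    return e, frac

# Made-up errors in meters, just to show the summary numbers.
errors = [1.0, 2.0, 2.5, 3.0, 6.0]
e, frac = error_cdf(errors)
median_err = np.percentile(errors, 50)  # 50th percentile (median) error
p95_err = np.percentile(errors, 95)     # 95th percentile error
```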
Presenting Discrete Performance Results
Techniques like the particle filter and HMM can classify sequential data into discrete classes.
Confusion matrix (rows: inferred activities; columns: actual activities; each row sums to 100%):

                 Sitting  Standing  Walking  Upstairs  Downstairs  Elev. down  Elev. up  Brushing teeth
Sitting            75%      24%       1%       0%        0%          0%          0%        0%
Standing           29%      55%       6%       1%        0%          4%          3%        2%
Walking             4%       7%      79%       3%        4%          1%          1%        1%
Upstairs            0%       1%       4%      95%        0%          0%          1%        0%
Downstairs          0%       1%       7%       0%       89%          2%          0%        0%
Elevator down       0%       2%       1%       0%        8%         87%          1%        0%
Elevator up         0%       2%       2%       6%        0%          3%         87%        0%
Brushing teeth      2%      10%       3%       0%        0%          0%          0%       85%

Pervasive 2006
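A confusion matrix like the one above can be sketched as follows. The row/column orientation and row normalization here are assumptions matching one plausible reading of the table's layout; the tiny two-class example is made up:

```python
import numpy as np

def confusion_matrix(actual, inferred, n_classes):
    """Confusion matrix as percentages: rows are inferred classes,
    columns are actual classes, each row normalized to sum to 100%."""
    counts = np.zeros((n_classes, n_classes))
    for a, p in zip(actual, inferred):
        counts[p, a] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return 100.0 * counts / np.maximum(row_sums, 1)

# Tiny made-up example with classes 0 and 1.
cm = confusion_matrix(actual=[0, 0, 1, 1], inferred=[0, 1, 1, 1], n_classes=2)
```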
End
[Figures: the running example's actual path and measured locations (meters), and the tracking error vs. filter summary (mean and median error)]
• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results
Ubiquitous Computing Fundamentals, CRC Press, © 2010