Statistical Techniques in Robotics (STR, S16)
Lecture#15 (Monday, March 7)
Lecturer: Byron Boots
1 Kalman Filtering

Gauss Markov Model
Consider X1, X2, ..., Xt, Xt+1 to be the state variables and Y1, Y2, ..., Yt, Yt+1 the sequence of corresponding observations. As in Hidden Markov Models, the conditional independencies (see Figure 1) dictate that past and future states are conditionally independent given the current state Xt at time t. For example, if we know what X2 is, then no information about X1 can possibly help us reason about what X3 should be. We are interested in the special case where the Xi and Yi are jointly Gaussian distributed.
[Figure 1: The independence diagram of a Gauss-Markov model. The states X1, X2, ..., Xt, Xt+1 form a Markov chain, and each observation Yi depends only on the corresponding state Xi.]
The update for the state variable Xt+1 is given by:

    Xt+1 = A Xt + ε

where

    X0 ∼ N(µ0, Σ0)
    ε ∼ N(0, Q)
    Xt+1 | Xt ∼ N(A Xt, Q)

The corresponding observation Yt+1 is given by:

    Yt+1 = C Xt+1 + δ

where

    δ ∼ N(0, R)
    Yt+1 | Xt+1 ∼ N(C Xt+1, R)
Each component is defined as follows:
• A: an n × n matrix that describes how the state evolves from time t to t + 1 in the absence of controls or noise.
• C: a k × n matrix that describes how to map the state Xt to an observation Yt, where k is the dimension of the observation.
• ε, δ: random variables representing the process and measurement noise. They are assumed to be independent and normally distributed with covariances Q (n × n) and R (k × k), respectively.
We want to find Xt | Y1,...,t, so we need to calculate its mean µt and covariance Σt. In a real application we would also first try to remove outliers to obtain a more stable result.
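To make the model concrete, here is a minimal simulation sketch (not from the notes; NumPy, with illustrative function and variable names) that samples states and observations from the model above:

```python
import numpy as np

def simulate_gauss_markov(A, C, Q, R, mu0, Sigma0, T, rng=None):
    """Sample X_0..X_T and Y_1..Y_T from X_{t+1} = A X_t + eps, Y_{t+1} = C X_{t+1} + delta."""
    rng = np.random.default_rng() if rng is None else rng
    n, k = A.shape[0], C.shape[0]
    xs = [rng.multivariate_normal(mu0, Sigma0)]   # X_0 ~ N(mu0, Sigma0)
    ys = []
    for _ in range(T):
        x = A @ xs[-1] + rng.multivariate_normal(np.zeros(n), Q)   # process noise eps ~ N(0, Q)
        y = C @ x + rng.multivariate_normal(np.zeros(k), R)        # measurement noise delta ~ N(0, R)
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)
```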
2 Lazy Gauss Markov Filter
Let us take as an example a velocity-controlled mobile robot equipped with a GPS sensor. The state consists of the robot's position and velocity, Xt = [pos_t; vel_t]:

    Xt+1 = [pos; vel]_{t+1} = A Xt + ε,    A = [ I  (∆t)I ]
                                               [ 0    I   ]

    Yt+1 = C Xt+1 + δ,                     C = [ I  0 ]

where

    ε ∼ N(0, Q),    Q = [ γ²I   0  ]
                        [  0   σ²I ]

    δ ∼ N(0, R),    R = ξ²I
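As a concrete illustration (not from the notes), these matrices might be assembled as follows; the dimension d and the values of ∆t, γ, σ, ξ are made-up placeholders:

```python
import numpy as np

d = 2                                          # planar robot: 2-D position and 2-D velocity
dt, gamma, sigma, xi = 0.1, 0.05, 0.1, 0.5     # illustrative values, not from the notes
I, Z = np.eye(d), np.zeros((d, d))

A = np.block([[I, dt * I], [Z, I]])            # constant-velocity dynamics
C = np.hstack([I, Z])                          # GPS observes the position block only
Q = np.block([[gamma**2 * I, Z], [Z, sigma**2 * I]])   # process noise covariance
R = xi**2 * I                                  # GPS measurement noise covariance
```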
The Lazy Gauss Markov filter has two steps: prediction (motion update) and correction (observation update). We denote Gaussian parameters after the prediction step but before the correction step by a superscript minus, ( )⁻.
2.1 Prediction Step (Motion Update)
For Xt | Y1,...,t ∼ N(µt, Σt), we have the following by applying the motion model:

    Xt+1 | Y1,...,t ∼ N(µ⁻t+1, Σ⁻t+1)

where

    µ⁻t+1 = A µt
    Σ⁻t+1 = A Σt Aᵀ + Q
Proof

Mean:

    µ⁻t+1 = E[Xt+1]
          = E[A Xt + ε]
          = E[A Xt] + E[ε]
          = A E[Xt]          (since the mean of ε is 0)
          = A µt

Variance:

    Σ⁻t+1 = E[(Xt+1 − µ⁻t+1)(Xt+1 − µ⁻t+1)ᵀ]
          = E[(A Xt + ε − A µt)(A Xt + ε − A µt)ᵀ]
          = E[(A Xt − A µt)(A Xt − A µt)ᵀ] + Var(ε)      (the cross terms vanish because ε is zero-mean and independent of Xt)
          = A E[(Xt − µt)(Xt − µt)ᵀ] Aᵀ + Q
          = A Σt Aᵀ + Q
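In code, the prediction step is just these two equations. A minimal sketch (assuming NumPy arrays; the function name is a placeholder):

```python
def predict(mu, Sigma, A, Q):
    """Motion update: N(mu_t, Sigma_t) -> N(mu-_{t+1}, Sigma-_{t+1})."""
    mu_minus = A @ mu                     # mu-_{t+1} = A mu_t
    Sigma_minus = A @ Sigma @ A.T + Q     # Sigma-_{t+1} = A Sigma_t A^T + Q
    return mu_minus, Sigma_minus
```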
2.2 Correction Step (Observation Update)
For the observation model the natural parameterization is more suitable, since the update involves a product of terms. So we first convert Xt+1 | Y1,...,t ∼ N(µ⁻t+1, Σ⁻t+1) to the natural parameters J⁻t+1 = (Σ⁻t+1)⁻¹ µ⁻t+1 and P⁻t+1 = (Σ⁻t+1)⁻¹, and then apply the observation update. Note that P(Xt+1 | Y1,...,t+1) ∝ P(Yt+1 | Xt+1) P(Xt+1 | Y1,...,t) by Bayes' rule and the Markov assumption.
    P(Yt+1 | Xt+1) P(Xt+1 | Y1,...,t)
      ∝ exp( (J⁻t+1)ᵀ Xt+1 − ½ Xt+1ᵀ P⁻t+1 Xt+1 ) · exp( −½ (Yt+1 − C Xt+1)ᵀ R⁻¹ (Yt+1 − C Xt+1) )

Expanding the exponent of the measurement term,

    −½ (Yt+1 − C Xt+1)ᵀ R⁻¹ (Yt+1 − C Xt+1)
      = −½ ( −2 Yt+1ᵀ R⁻¹ C Xt+1 + Xt+1ᵀ Cᵀ R⁻¹ C Xt+1 + Yt+1ᵀ R⁻¹ Yt+1 )
      = −½ ( −2 Yt+1ᵀ R⁻¹ C Xt+1 + Xt+1ᵀ Cᵀ R⁻¹ C Xt+1 ) + const.

The dropped term Yt+1ᵀ R⁻¹ Yt+1 is constant with respect to Xt+1, so it can be absorbed into the normalization constant. Collecting the terms that are linear and quadratic in Xt+1 gives the observation update:

    J⁺t+1 = J⁻t+1 + Cᵀ R⁻¹ Yt+1
    P⁺t+1 = P⁻t+1 + Cᵀ R⁻¹ C
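A minimal sketch of this correction step in natural parameters, together with the conversions J = Σ⁻¹µ, P = Σ⁻¹ that it relies on (NumPy; function names are placeholders, not from the notes):

```python
import numpy as np

def to_natural(mu, Sigma):
    """Moment -> natural parameters: P = Sigma^{-1}, J = Sigma^{-1} mu."""
    P = np.linalg.inv(Sigma)
    return P @ mu, P

def to_moment(J, P):
    """Natural -> moment parameters: Sigma = P^{-1}, mu = P^{-1} J."""
    Sigma = np.linalg.inv(P)
    return Sigma @ J, Sigma

def correct_natural(J_minus, P_minus, y, C, R):
    """Observation update in natural parameters."""
    R_inv = np.linalg.inv(R)
    J_plus = J_minus + C.T @ R_inv @ y    # J+ = J- + C^T R^-1 y
    P_plus = P_minus + C.T @ R_inv @ C    # P+ = P- + C^T R^-1 C
    return J_plus, P_plus
```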
2.3 KF and BLR
Note that this parameterization is directly related to Bayesian Linear Regression (BLR) if the following conditions are met:
• X is equivalent to θ in BLR.
• δ is equivalent to the measurement noise on Y in BLR.
• ε is zero, because θ, the state in BLR, is the same for all t.
• Q is therefore 0; it would be nonzero if the parameters were allowed to drift over time.
• A is the identity matrix.
• Y is equivalent to Y in BLR.
• C is equivalent to xt in BLR, which may be different at every timestep.
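Under these conditions, running only the correction step performs sequential Bayesian linear regression, since A = I and Q = 0 mean the prediction step leaves the belief unchanged. A minimal sketch (assumed prior θ ∼ N(0, I) and a made-up noise variance; names are illustrative, not from the notes):

```python
import numpy as np

def blr_step(J, P, x_t, y_t, meas_var):
    """One BLR update written as the information-form correction with C = x_t^T, R = meas_var."""
    J = J + x_t * (y_t / meas_var)            # J += C^T R^-1 y
    P = P + np.outer(x_t, x_t) / meas_var     # P += C^T R^-1 C
    return J, P

# Illustrative usage: prior theta ~ N(0, I)  =>  J = 0, P = I.
# J, P = np.zeros(3), np.eye(3)
# for x_t, y_t in data:
#     J, P = blr_step(J, P, x_t, y_t, meas_var=0.25)
# posterior_mean = np.linalg.solve(P, J)
```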
In short, the KF to BLR correspondence is given by:

    BLR        KF
    Θᵀ    =    X
    σ²    =    R
    0     =    Q
    I     =    A
    Y     =    Y
    Xᵀ    =    Ct
2.4 Performance
The Lazy Gauss Markov filter can be expressed in two forms:
• When expressed in terms of moment parameters, µ and Σ, it acts as the Kalman Filter.
• When expressed in terms of natural parameters, J and P, it acts as the Information Filter.
Kalman filters, as we will see, require only matrix multiplications, approximately O(n^2) time, to do a prediction step, yet require matrix inversions, approximately O(n^2.8) time, to perform the observation update. Information filters are the exact opposite, requiring matrix inversions for the prediction step and matrix multiplications for the observation update. As mentioned above, the conversion between the moment and natural parameterizations requires an inversion of the covariance matrix, so switching between the two can be costly. Depending on the ratio of motion model updates to observation model updates, one filter may be faster than the other.
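To see concretely where these costs come from, here is a sketch (not from the notes; NumPy, illustrative names) of the information-filter prediction step: it needs two matrix inversions, whereas the moment-form prediction above (Σ⁻ = AΣAᵀ + Q) needs none.

```python
import numpy as np

def predict_information(J, P, A, Q):
    """Information-filter prediction: convert to moments, predict, convert back."""
    Sigma = np.linalg.inv(P)                  # first inversion
    Sigma_minus = A @ Sigma @ A.T + Q
    P_minus = np.linalg.inv(Sigma_minus)      # second inversion
    J_minus = P_minus @ (A @ (Sigma @ J))     # mu = Sigma J, mu- = A mu, J- = P- mu-
    return J_minus, P_minus
```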
3 The Kalman Filter

3.1 Introduction
The Kalman filter was invented as a technique for filtering and prediction in linear Gaussian systems.
It implements belief computations (Bayesian filtering) for continuous spaces and is not applicable
to discrete/hybrid state spaces.
Kalman filters represent beliefs in the moment parameterization, where the belief at time t is represented by a mean µt and covariance Σt. Contrast this with the Information Filter, which describes the belief in terms of the natural parameterization. All computed posteriors are Gaussian if the following three properties hold (the Markov assumption must hold as well); together they ensure that the posterior is Gaussian for any time t.
1. The state transition probability p(xt+1 | xt) must be linear in its arguments with added Gaussian noise. This probability therefore assumes linear dynamics and in this form is called linear Gaussian. It is represented by the following equation:

       xt+1 = A xt + ε,   where ε ∼ N(0, Q)                        (1)

2. The measurement probability p(yt+1 | xt+1) must also be linear in its arguments with added Gaussian noise. It is represented by the following equation:

       yt+1 = C xt+1 + δ,   where δ ∼ N(0, R)                      (2)
3. The initial belief must be normally distributed with some mean µ0 and covariance Σ0 .
Observation Update in terms of moment parameters µ and Σ:
Above, we presented the Lazy Gauss Markov Filter, which did the motion step in the moment
parameterization, and the observation step in the natural parameterization. We are now going to
derive both steps in the moment parameterization, resulting in the Kalman Filter. Combining (1) and (2), the state and measurement are jointly Gaussian with mean vector µ and covariance matrix Σ:

    [ Xt ]  ∼  N( [ µX ] , [ ΣXX  ΣXY ] )
    [ Yt ]        [ µY ]   [ ΣYX  ΣYY ]
The observation update is:
    µX|Y = µX + ΣXY ΣYY⁻¹ (Y − µY)
    ΣX|Y = ΣXX − ΣXY ΣYY⁻¹ ΣYX
We know that

    µY = C µX
    ΣYY = R + C ΣXX Cᵀ

Therefore, we only need to find ΣXY to evaluate the conditional mean and covariance above.
By definition,

    ΣXY = E[(X − µX)(Y − µY)ᵀ]
        = E[(X − µX)(Y − C µX)ᵀ]

However, Y = C X + δ, with δ zero-mean and independent of X, so the cross term involving δ vanishes and

    ΣXY = E[(X − µX)(X − µX)ᵀ] Cᵀ
        = ΣXX Cᵀ
Plugging these values into the conditional mean and covariance above results in

    µX|Y = µX + ΣXX Cᵀ (R + C ΣXX Cᵀ)⁻¹ (Y − C µX)
    ΣX|Y = ΣXX − ΣXX Cᵀ (R + C ΣXX Cᵀ)⁻¹ C ΣXX
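As a numerical sanity check (all values illustrative, not from the notes), the moment-form conditioning derived above agrees with the natural-parameter update from Section 2.2:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 3, 2
C = rng.normal(size=(k, n))
Sxx = np.eye(n)                              # prior covariance of X (illustrative)
R = 0.5 * np.eye(k)                          # measurement noise covariance
mu_x, y = rng.normal(size=n), rng.normal(size=k)

# Moment-form conditioning, as derived above.
S = R + C @ Sxx @ C.T                        # Sigma_YY
Sxy = Sxx @ C.T                              # Sigma_XY = Sigma_XX C^T
mu_post = mu_x + Sxy @ np.linalg.solve(S, y - C @ mu_x)
Sigma_post = Sxx - Sxy @ np.linalg.solve(S, Sxy.T)

# Information-form update: P+ = Sxx^-1 + C^T R^-1 C,  J+ = Sxx^-1 mu_x + C^T R^-1 y.
P_post = np.linalg.inv(Sxx) + C.T @ np.linalg.inv(R) @ C
J_post = np.linalg.inv(Sxx) @ mu_x + C.T @ np.linalg.inv(R) @ y
assert np.allclose(Sigma_post, np.linalg.inv(P_post))
assert np.allclose(mu_post, np.linalg.solve(P_post, J_post))
```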
3.2 The Kalman Filter Algorithm

The complete Kalman filter algorithm is outlined below. The input to the filter is the belief at the previous timestep, represented by the mean µt and covariance Σt.
    µ⁻t+1 = A µt
    Σ⁻t+1 = A Σt Aᵀ + Q
    µt+1 = µ⁻t+1 + Σ⁻t+1 Cᵀ (C Σ⁻t+1 Cᵀ + R)⁻¹ (yt+1 − C µ⁻t+1)
    Σt+1 = Σ⁻t+1 − Σ⁻t+1 Cᵀ (C Σ⁻t+1 Cᵀ + R)⁻¹ C Σ⁻t+1
The first two lines of the algorithm calculate the predicted belief, represented by µ⁻t+1 and Σ⁻t+1, at the new time t + 1 without incorporating the measurement yt+1. The mean is propagated using the deterministic part of the state transition function shown in (1). In the final two lines, the predicted belief is updated to the posterior belief by incorporating the measurement yt+1. The quantity Σ⁻t+1 Cᵀ (C Σ⁻t+1 Cᵀ + R)⁻¹ is known as the Kalman Gain. It determines the degree to which the measurement is incorporated into the new state estimate. The third line shifts the predicted mean in proportion to the Kalman Gain applied to the quantity (yt+1 − C µ⁻t+1), known as the innovation.
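The four lines above translate directly into code. A minimal NumPy sketch (function and variable names are placeholders, not from the notes):

```python
import numpy as np

def kalman_step(mu, Sigma, y, A, C, Q, R):
    """One Kalman filter step in the moment parameterization."""
    # Prediction (motion update)
    mu_minus = A @ mu
    Sigma_minus = A @ Sigma @ A.T + Q
    # Correction (observation update)
    S = C @ Sigma_minus @ C.T + R               # innovation covariance
    K = Sigma_minus @ C.T @ np.linalg.inv(S)    # Kalman gain
    mu_new = mu_minus + K @ (y - C @ mu_minus)  # y - C mu- is the innovation
    Sigma_new = Sigma_minus - K @ C @ Sigma_minus
    return mu_new, Sigma_new
```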
3.3 Kalman Filter Illustration
Figure 2: Kalman filtering example for 1D localization
Figure 2 above illustrates the Kalman filter algorithm for simple 1D localization. Supposing the robot moves in the horizontal direction, the prior over the robot's position is shown as the red line in (a). Polling its sensors gives the distribution shown by the green line in (a), and the blue line is the result of the final two lines of the Kalman filter algorithm shown above. In (b) the robot has moved to a new position, illustrated by the magenta line. It polls its sensors, which report back the position given by the cyan line in (c), and the update from the algorithm gives the posterior shown by the yellow line in (d).
The Kalman filter alternates the sensor update step (lines 3-4), which incorporates sensor data into
the present belief, with the motion update step (lines 1-2), where the belief is predicted according
to the action taken by the robot.
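As a usage example in the spirit of Figure 2 (all numbers made up for illustration; kalman_step is the sketch given after Section 3.2):

```python
import numpy as np

dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])   # 1-D position and velocity
C = np.array([[1.0, 0.0]])              # the sensor reports position only
Q = np.diag([0.01, 0.04])               # process noise (illustrative)
R = np.array([[0.25]])                  # sensor noise (illustrative)

mu = np.array([0.0, 1.0])               # prior: near 0, moving at roughly 1 unit per step
Sigma = np.diag([1.0, 0.25])

for y in [np.array([0.1]), np.array([1.2]), np.array([2.0])]:   # noisy position readings
    mu, Sigma = kalman_step(mu, Sigma, y, A, C, Q, R)
    print(mu, Sigma)
```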