# Least Squares Approximation of Scattered Data

```Least Squares Approximation of Scattered Data
Øyvind Hjelle
[email protected], +47 67 82 82 75
Simula Research Laboratory, www.simula.no
November 2, 2009
Motivation
Approximation of huge data sets
Reverse engineering
Geological modelling
Geographic information systems
Medical modelling
...
Surface triangulations
Piecewise linear polynomials over triangulations:
$$S_1^0(\Delta) = \{ f \in C^0(\Omega) : f|_{t_i} \in \Pi_1 \},$$
where $\Pi_1$ is the space of bivariate linear polynomials (cf. data
dependent triangulations).
$$f(x, y) = \sum_{i=1}^{n} c_i N_i(x, y).$$
Coefficients: $(c_1, c_2, \ldots, c_n)$
Basis functions: $(N_1(x, y), N_2(x, y), \ldots, N_n(x, y))$
$n$ = number of vertices in $\Delta$.
[Figure: basis function $N_i(x, y)$ at $v_i$, with support $\Omega_i$]
Some properties
Geometry
$N_i(v_j) = \delta_{ij}$, $j = 1, \ldots, n$, where $v_j = (x_j, y_j)$,
$N_i(x, y) \ge 0$ inside the support $\Omega_i$, and
$N_i(x, y) = 0$ outside $\Omega_i$.
Linear independency
$\{N_i(x, y)\}$ are linearly independent on $\Omega$, that is:
$\sum_{i=1}^{n} \alpha_i N_i(x, y) = 0$ for all $(x, y) \in \Omega$ implies that $\alpha_i = 0$
for $i = 1, \ldots, n$, where $\alpha_i \in \mathbb{R}$.
Only three basis functions can be non-zero strictly inside a triangle
$\Longrightarrow$ $f(x, y) = \sum_{i=1}^{n} c_i N_i(x, y)$ has at most three non-zero terms.
Since $N_i(v_j) = \delta_{ij}$, the coefficient vector equals the “node values”:
$$f(x_j, y_j) = c_j, \quad j = 1, \ldots, n.$$
Partition of unity
$$\sum_{i} N_i(x, y) = 1 \quad \text{for } (x, y) \in \Omega.$$
The gradient of f ∈ S10 (∆)
Let $g_k = f|_{t_k}$ be the restriction of $f$ to triangle $t_k$.
$$\nabla g_k = \left( \frac{\partial g_k}{\partial x}, \frac{\partial g_k}{\partial y} \right).$$
Since $g_k$ is linear, $\nabla g_k$ is constant inside $t_k$.
Normal vector of $t_k$:
$$n = \left( -\frac{\partial g_k}{\partial x}, -\frac{\partial g_k}{\partial y}, 1 \right).$$
[Figure: triangle $t_k$ with vertices $a$, $b$, $c$, vectors $v_1$, $v_2$ and normal $n'$]
$$n' = v_1 \times v_2 = \left( -\eta_a^k c_a - \eta_b^k c_b - \eta_c^k c_c,\; -\mu_a^k c_a - \mu_b^k c_b - \mu_c^k c_c,\; 2A_k \right),$$
where
$$\eta_a^k = y_b - y_c, \quad \eta_b^k = y_c - y_a, \quad \eta_c^k = y_a - y_b,$$
$$\mu_a^k = x_c - x_b, \quad \mu_b^k = x_a - x_c, \quad \mu_c^k = x_b - x_a,$$
and $A_k$ is the area of the projection of $t_k$ in the $xy$-plane,
$$A_k = \frac{1}{2} \left( (x_b - x_a)(y_c - y_a) - (y_b - y_a)(x_c - x_a) \right).$$
(Exercise.)
Thus,
$$\nabla g_k = \left( \eta_a^k c_a + \eta_b^k c_b + \eta_c^k c_c,\; \mu_a^k c_a + \mu_b^k c_b + \mu_c^k c_c \right) / (2A_k),$$
or
Gradient as a linear combination of all coefficients
$$\nabla g_k = \frac{1}{2A_k} \left( \sum_{i=1}^{n} \eta_i^k c_i,\; \sum_{i=1}^{n} \mu_i^k c_i \right),$$
where $\eta_i^k, \mu_i^k \ne 0$ only when $i$ is a vertex index of $t_k$.
Approximation on triangulations of subsets of data
Given $P = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$ and corresponding
data values $z_1, \ldots, z_m$.
Let $\Delta$ be a triangulation of a subset $V$ of $P$ with $n \le m$
vertices from $P$.
Assume that $V$ consists of the first $n$ points of $P$, with the same
numbering:
$$V = \{ v_k = (x_k, y_k) \}_{k=1}^{n}.$$
Problem
Find coefficients $(c_1, c_2, \ldots, c_n)$ such that
$$f(x_k, y_k) \approx z_k, \quad k = 1, \ldots, m.$$
Method of least squares
$$\text{minimize } I(c) = \sum_{k=1}^{m} \left( f(x_k, y_k) - z_k \right)^2$$
Inserting $f(x_k, y_k) = \sum_{j=1}^{n} c_j N_j(x_k, y_k)$:
$$I(c) = \sum_{k=1}^{m} \Bigl( \sum_{j=1}^{n} c_j N_j(x_k, y_k) - z_k \Bigr)^2 = \| Bc - z \|_2^2.$$
$z = (z_1, \ldots, z_m)^T$, $\| \cdot \|_2$ is the Euclidean norm, and $B$ is
the $m \times n$ matrix:
$$B = \begin{pmatrix}
N_1(x_1, y_1) & N_2(x_1, y_1) & \cdots & N_n(x_1, y_1) \\
N_1(x_2, y_2) & N_2(x_2, y_2) & \cdots & N_n(x_2, y_2) \\
\vdots & \vdots & \ddots & \vdots \\
N_1(x_m, y_m) & N_2(x_m, y_m) & \cdots & N_n(x_m, y_m)
\end{pmatrix}.$$

Minimize I(c)
$$I(c) = \sum_{k=1}^{m} \Bigl( \sum_{j=1}^{n} c_j N_j(x_k, y_k) - z_k \Bigr)^2$$
Find $c^*$ where all partial derivatives $\partial I / \partial c_i$, $i = 1, \ldots, n$, are zero:
$$\frac{\partial I}{\partial c_i} = 2 \sum_{k=1}^{m} \Bigl( \sum_{j=1}^{n} c_j N_j(x_k, y_k) - z_k \Bigr) N_i(x_k, y_k) = 0, \quad i = 1, \ldots, n.$$
Solution
$$\sum_{j=1}^{n} \sum_{k=1}^{m} N_i(x_k, y_k) N_j(x_k, y_k) c_j = \sum_{k=1}^{m} N_i(x_k, y_k) z_k, \quad i = 1, \ldots, n.$$
Linear equation system of $n$ equations in $n$ unknowns
$$(B^T B) c = B^T z \quad (Ax = b).$$
Implementation
Element of system matrix
$$(B^T B)_{ij} = \sum_{k=1}^{m} N_i(x_k, y_k) N_j(x_k, y_k).$$
$(B^T B)_{ij} = (B^T B)_{ji}$, thus $B^T B$ is symmetric.
Element of right hand side
$$(B^T z)_i = \sum_{k=1}^{m} N_i(x_k, y_k) z_k.$$
A digression with m = n
Let m = n. That is,
the number of nodes in ∆ = the number of scattered data
(and the position of nodes coincides with the scattered data)
⇓
$(B^T B) c = B^T z$ becomes $Bc = z$
$\min I(c) = \sum_{k=1}^{m} (f(x_k, y_k) - z_k)^2 = 0$ (zero deviation
everywhere)
Trivial solution $c_i^* = z_i$, $i = 1, \ldots, m$
Existence and Uniqueness of $(B^T B) c = B^T z$ (∗)
From basic linear algebra:
Existence: (∗) has at least one solution.
Uniqueness: (∗) has a unique solution if $B^T B$ is positive
definite.
(See G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, third edition, 1996.)
$A \in \mathbb{R}^{n \times n}$ is called positive definite if $x^T A x > 0$ for all
non-zero vectors $x \in \mathbb{R}^n$.
$B^T B$ is positive definite if and only if $B$ has linearly
independent columns $b_i$, $i = 1, \ldots, n$.
$b_1, b_2, \ldots, b_n$ are called linearly independent if
$$t_1 b_1 + \cdots + t_n b_n = 0 \;\Longrightarrow\; t_j = 0 \text{ for } j = 1, \ldots, n \quad (t_j \in \mathbb{R}).$$
Recall:
$$B = \begin{pmatrix}
N_1(x_1, y_1) & N_2(x_1, y_1) & \cdots & N_n(x_1, y_1) \\
N_1(x_2, y_2) & N_2(x_2, y_2) & \cdots & N_n(x_2, y_2) \\
\vdots & \vdots & \ddots & \vdots \\
N_1(x_m, y_m) & N_2(x_m, y_m) & \cdots & N_n(x_m, y_m)
\end{pmatrix}.$$
But $N_i(x_j, y_j) = \delta_{ij}$ for $j = 1, \ldots, n$ (on the subset $V$).
Partition with $B_1 \in \mathbb{R}^{n \times n}$ and $B_2 \in \mathbb{R}^{(m-n) \times n}$ gives:
$$B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} I \\ B_2 \end{pmatrix}
= \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1 \\
N_1(x_{n+1}, y_{n+1}) & N_2(x_{n+1}, y_{n+1}) & \cdots & N_n(x_{n+1}, y_{n+1}) \\
N_1(x_{n+2}, y_{n+2}) & N_2(x_{n+2}, y_{n+2}) & \cdots & N_n(x_{n+2}, y_{n+2}) \\
\vdots & \vdots & \ddots & \vdots \\
N_1(x_m, y_m) & N_2(x_m, y_m) & \cdots & N_n(x_m, y_m)
\end{pmatrix}.$$
Columns $b_1^1, \ldots, b_n^1$ of $B_1 = I$ are linearly independent:
$$t_1 b_1^1 + \cdots + t_n b_n^1 = 0 \;\Longrightarrow\; t_j = 0 \text{ for } j = 1, \ldots, n.$$
⇓
$$t_1 b_1 + \cdots + t_n b_n = 0 \;\Longrightarrow\; t_j = 0 \text{ for } j = 1, \ldots, n.$$
⇓
$B$ has linearly independent columns.
⇓
$B^T B$ is positive definite.
⇓
$(B^T B) c = B^T z$ has a unique solution
(when the nodes of $\Delta$ are the subset $V$ of $P$).
Sparsity
Recall:
$$(B^T B)_{ij} = \sum_{k=1}^{m} N_i(x_k, y_k) N_j(x_k, y_k).$$
This implies:
$$(B^T B)_{ij} \ne 0 \;\Longrightarrow\; \operatorname{Int} \Omega_i \cap \operatorname{Int} \Omega_j \ne \emptyset.$$
[Figure: supports $\Omega_i$ and $\Omega_j$ of the vertices $v_i$ and $v_j$, and their intersection $\Omega_i \cap \Omega_j$]
$\operatorname{Int} \Omega_i \cap \operatorname{Int} \Omega_j \ne \emptyset$ only when
$i = j$ (diagonal element), or
$(B^T B)_{ij}$ corresponds to an edge in the triangulation.
Recall:
$$|E| \le 3|V| - 6$$
This implies
Each edge gives two off-diagonal elements, so the number of non-zero off-diagonal elements is less than $6|V|$
Total number of non-zeros in $B^T B$ is less than
$6|V| + |V| = 7|V|$ (off-diagonal elements and diagonal)
Average number of non-zeros in each row is approx. 7
Less than $7/|V|$ of the system matrix is non-zero
Sparsity and symmetry suggest, e.g., the conjugate gradient
method for solving $(B^T B) c = B^T z$ (∗).
Sorting the vertices lexicographically on $x$ and $y$ brings non-zero
elements closer to the diagonal
Penalized Least Squares
[Figure: random noise added on the right]
Smoothing:
$$J(c) = c^T E c,$$
where $E$ is positive semidefinite, i.e. $x^T E x \ge 0$ for all non-zero
vectors $x \in \mathbb{R}^n$ ($> 0$ when positive definite).
$c^T E c$ is a “standard” way of constructing smoothing terms, but
may involve second derivatives! ⇒
Must make discrete analogues of second derivatives for $S_1^0(\Delta)$.
$$\begin{aligned}
\text{minimize } I(c) &= \sum_{k=1}^{m} (f(x_k, y_k) - z_k)^2 + \lambda J(c) \\
&= \sum_{k=1}^{m} (f(x_k, y_k) - z_k)^2 + \lambda c^T E c \\
&= \| Bc - z \|_2^2 + \lambda c^T E c
\end{aligned}$$
for $\lambda \in \mathbb{R}$, $\lambda \ge 0$.
Solution where $\partial I / \partial c_i = 0$, $i = 1, \ldots, n$:
$$(B^T B + \lambda E) c = B^T z$$ (∗∗)
Uniqueness of penalized least squares
Recall: unique solution when $B^T B + \lambda E$ is positive definite.
$B^T B$ positive definite & $E$ positive semidefinite
⇓
$B^T B + \lambda E$ positive definite (Exercise).
Thus (∗∗) always has a unique solution!
But remember: the nodes of $\Delta$ are still the subset $V$ of $P$.
Implementation issues
How can we choose $\lambda$ in $(B^T B + \lambda E) c = B^T z$?
1. Generalized cross-validation (Golitschek & Schumaker)
2. $\lambda = \| B^T B \|_F / \| E \|_F$ (Floater),
   where $\| \cdot \|_F$ is the Frobenius matrix norm: $\| A \|_F = \bigl( \sum_{ij} a_{ij}^2 \bigr)^{1/2}$.
1: difficult to implement and CPU-demanding?
2: my experience; does not always work (wrong scaling?)
Possible solution: Choose $\lambda$ from experiments?
Smoothing Terms for Penalized Least Squares
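membrane energy:
$$\int \left[ \left( \frac{\partial g}{\partial x} \right)^2 + \left( \frac{\partial g}{\partial y} \right)^2 \right] = \int |\nabla g|^2 = \int \nabla g \cdot \nabla g,$$
thin-plate (spline) energy:
$$\int \left[ \left( \frac{\partial^2 g}{\partial x^2} \right)^2 + 2 \left( \frac{\partial^2 g}{\partial x \partial y} \right)^2 + \left( \frac{\partial^2 g}{\partial y^2} \right)^2 \right].$$
membrane energy prefers surfaces with small area,
thin-plate energy prefers surfaces with small “curvature”.
Note: requires that $g(x, y) = \sum_{i=1}^{n} c_i N_i(x, y)$ is twice
differentiable!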
Membrane energy functional.
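$$J_1(c) = \int |\nabla g|^2 = \int \left[ \left( \frac{\partial g}{\partial x} \right)^2 + \left( \frac{\partial g}{\partial y} \right)^2 \right].$$
Let $g_k = f|_{t_k}$ be the restriction of $f$ to $t_k$.
$$\nabla g_k = \left( \frac{\partial g_k}{\partial x}, \frac{\partial g_k}{\partial y} \right) = \frac{1}{2A_k} \left( \sum_{i=1}^{n} \eta_i^k c_i,\; \sum_{i=1}^{n} \mu_i^k c_i \right).$$
$$\begin{aligned}
J_1(c) &= \sum_{k=1}^{|T|} A_k |\nabla g_k|^2 = \sum_{k=1}^{|T|} A_k \left[ \left( \frac{\partial g_k}{\partial x} \right)^2 + \left( \frac{\partial g_k}{\partial y} \right)^2 \right] \\
&= \sum_{k=1}^{|T|} \frac{1}{4A_k} \left[ \Bigl( \sum_{i=1}^{n} \eta_i^k c_i \Bigr)^2 + \Bigl( \sum_{i=1}^{n} \mu_i^k c_i \Bigr)^2 \right] \\
&= \sum_{k=1}^{|T|} \frac{1}{4A_k} \left[ \Bigl( \sum_{i=1}^{n} \eta_i^k c_i \Bigr) \Bigl( \sum_{j=1}^{n} \eta_j^k c_j \Bigr) + \Bigl( \sum_{i=1}^{n} \mu_i^k c_i \Bigr) \Bigl( \sum_{j=1}^{n} \mu_j^k c_j \Bigr) \right] \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ \sum_{k=1}^{|T|} \bigl( \eta_i^k \eta_j^k + \mu_i^k \mu_j^k \bigr) / (4A_k) \right] c_i c_j = c^T E c,
\end{aligned}$$
where
$$E_{ij} = \sum_{k=1}^{|T|} \bigl( \eta_i^k \eta_j^k + \mu_i^k \mu_j^k \bigr) / (4A_k).$$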
Uniqueness with the membrane-operator:
Recall from above:
Uniqueness relies on positive semidefiniteness of $E$.
But, recall when deriving $J_1$:
$$J_1(c) = \sum_{k=1}^{|T|} A_k \left[ \left( \frac{\partial g_k}{\partial x} \right)^2 + \left( \frac{\partial g_k}{\partial y} \right)^2 \right] = \cdots = c^T E c.$$
So, $c^T E c \ge 0$ and $E$ is (“at least”) positive semidefinite.
Discrete membrane energy functional
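The umbrella-operator
[Figure: vertex $v_k$ with coefficient $c_k$ and its neighbouring vertices with coefficients $c_l$]
Let $\partial\Omega_k$ denote the nodes on the boundary of $\Omega_k$.
Let $n_k$ denote the degree (or valency) of the vertex $v_k$.
Discrete roughness measure around $v_k$:
$$M_k f = \frac{1}{n_k} \sum_{l \in \partial\Omega_k} c_l - c_k.$$
Consider $|M_k f|^2$ as a discrete analogue of $(\partial g/\partial x)^2 + (\partial g/\partial y)^2$.
$M_k f$ as a linear combination of all coefficients in $c$:
$$M_k f = \frac{1}{n_k} \sum_{l \in \partial\Omega_k} c_l - c_k = \sum_{l=1}^{n} \eta_l^k c_l, \quad
\eta_l^k = \begin{cases}
-1, & l = k \\
1/n_k, & (v_k, v_l) \text{ an edge in } \Delta \\
0, & \text{otherwise.}
\end{cases}$$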
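$$\begin{aligned}
J_2(c) &= \sum_{k=1}^{n} (M_k f)^2 = \sum_{k=1}^{n} \Bigl[ \sum_{i=1}^{n} \eta_i^k c_i \Bigr]^2 = \sum_{k=1}^{n} \Bigl[ \sum_{i=1}^{n} \eta_i^k c_i \Bigr] \Bigl[ \sum_{j=1}^{n} \eta_j^k c_j \Bigr] \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \Bigl[ \sum_{k=1}^{n} \eta_i^k \eta_j^k \Bigr] c_i c_j = \sum_{i=1}^{n} \sum_{j=1}^{n} E_{ij} c_i c_j = c^T E c,
\end{aligned}$$
where
$$E_{ij} = \sum_{k=1}^{n} \eta_i^k \eta_j^k.$$
[Figure: vertex $v_i$ surrounded by vertices labelled 1, 2 and 3]
Implementation:
$E$ is symmetric $\Longrightarrow$ $B^T B + \lambda E$ is symmetric
$E$ is sparse: $E_{ij} \ne 0$ when
1. $i = j$,
2. $(i, j)$ corresponds to an edge, and
3. both $(i, k)$ and $(j, k)$ correspond to edges (’1’, ’2’ and ’3’
vertices combined with $v_i$ in the figure).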
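Weighted (scale dependent) umbrella-operator:
$$\widetilde{M}_k f = \frac{1}{W_k} \sum_{l \in \partial\Omega_k} \omega_{kl} c_l - c_k,$$
where $\omega_{kl} = 1 / \| v_l - v_k \|_2$ and $W_k = \sum_{l \in \partial\Omega_k} \omega_{kl}$.
$$\widetilde{\eta}_l^k = \begin{cases}
-1, & l = k \\
\omega_{kl} / W_k, & \text{if } (v_k, v_l) \text{ an edge in } \Delta \\
0, & \text{otherwise.}
\end{cases}$$
Equals the unweighted umbrella-operator when all edge lengths are
equal.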
Uniqueness with the umbrella-operator:
Recall from above:
Uniqueness relies on positive semidefiniteness of E.
But, recall when deriving J2 :
$$J_2(c) = \sum_{k=1}^{n} (M_k f)^2 = \cdots = c^T E c.$$
So, $c^T E c \ge 0$ and $E$ is (“at least”) positive semidefinite.
Discrete thin-plate energy functional
(Recall the continuous case $J(c) = \int \bigl( g_{xx}^2 + 2 g_{xy}^2 + g_{yy}^2 \bigr)$.)
Make a discrete analogue $\sum (T_k f)^2$
(equivalent to the JND-measure, jump in normal derivative, for data
dependent triangulations).
[Figure: stencil for $T_k f$: interior edge $e_k$ between the vertices $s$ and $t$, shared by the triangles $t_1$ and $t_2$, with opposite vertices $l$ and $r$]
Restrictions of $f(x, y)$ to the two incident triangles $t_1$ and $t_2$:
$g_i = f|_{t_i}$, $i = 1, 2$ (see figure).
1. “First order divided differences” of $f$ at $t_1$ and $t_2$:
$$\nabla g_i = (\partial g_i/\partial x, \partial g_i/\partial y), \quad i = 1, 2.$$
2. “Second order divided differences” define $T_k f$:
   1. Normal vectors $n_1$ and $n_2$ of the planes defined by $g_1$
      and $g_2$: $n_i = (-\partial g_i/\partial x, -\partial g_i/\partial y, 1)$, $i = 1, 2$.
   2. $m_{e_k} = n_1 - n_2 = (-\partial g_1/\partial x + \partial g_2/\partial x,\; -\partial g_1/\partial y + \partial g_2/\partial y,\; 0)$
   3. $|T_k f| = \| m_{e_k} \|_2$ $(= |m_{e_k} \cdot n_{e_k}|)$
Let indices $\omega(e_k) = (s, t, l, r)$.
Note: $T_k f$ must be a linear combination of $(c_s, c_t, c_l, c_r)$!
Exercise:
$$T_k f = \sum_{i \in \omega(e_k)} \beta_i^k c_i,$$
where
$$\beta_l^k = -\frac{L_{e_k}}{2A_{[l,s,t]}}, \quad \beta_r^k = -\frac{L_{e_k}}{2A_{[r,t,s]}},$$
$$\beta_s^k = \frac{L_{e_k} A_{[t,l,r]}}{2A_{[l,s,t]} A_{[r,t,s]}}, \quad \beta_t^k = \frac{L_{e_k} A_{[s,r,l]}}{2A_{[l,s,t]} A_{[r,t,s]}}.$$
$L_{e_k}$ is the length of $e_k$.
$A_{[u,v,w]}$ is the area of the triangle with vertices $u$, $v$, $w$ (in the plane).
Tk f as a linear combination of all coefficients in c.
Sum over all interior edges:
$$\begin{aligned}
J_3(c) &= \sum_{k=1}^{|E_I|} (T_k f)^2 = \sum_{k=1}^{|E_I|} \Bigl[ \sum_{i=1}^{n} \beta_i^k c_i \Bigr]^2 \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \Bigl[ \sum_{k=1}^{|E_I|} \beta_i^k \beta_j^k \Bigr] c_i c_j = \sum_{i=1}^{n} \sum_{j=1}^{n} E_{ij} c_i c_j = c^T E c,
\end{aligned}$$
where
$$E_{ij} = \sum_{k=1}^{|E_I|} \beta_i^k \beta_j^k.$$
[Figure: vertex $v_i$ surrounded by vertices labelled 1, 2 and 3]
Implementation:
$E$ is symmetric $\Longrightarrow$ $B^T B + \lambda E$ is symmetric
$E$ is sparse: $E_{ij} \ne 0$ when
1. $i = j$,
2. $i$ and $j$ correspond to vertices of the same
neighborhood $\omega(e_k)$ (’1’ and ’2’ vertices combined with $v_i$ in
the figure).
Uniqueness relies on positive semidefiniteness of $E$.
But, recall when deriving $J_3(c)$:
$$J_3(c) = \sum_{k=1}^{|E_I|} (T_k f)^2 = \cdots = c^T E c.$$
So, $c^T E c \ge 0$ and $E$ is (“at least”) positive semidefinite.
Example:
[Figure] $I(c) = \sum_{k=1}^{m} (f(x_k, y_k) - z_k)^2$
[Figure] $I(c) = \sum_{k=1}^{m} (f(x_k, y_k) - z_k)^2 + \lambda J_3(c)$, with $\lambda = 100 \cdot \| B^T B \|_F / \| E \|_F$
Approximation over General Triangulations
Recall:
Uniqueness of $(B^T B) c = B^T z$ with triangulation $\Delta(V)$ relied on
$V \subset P$.
Now:
$\Delta$ is not a triangulation of a subset of $P$.
[Figure: triangulation with the support $\Omega_j$ of a basis function $N_j$]
$(B^T B) c = B^T z$ does not in general have a unique solution, since
the columns of $B$ are not necessarily linearly independent.
Recall:
$$B = \begin{pmatrix}
N_1(x_1, y_1) & N_2(x_1, y_1) & \cdots & N_n(x_1, y_1) \\
N_1(x_2, y_2) & N_2(x_2, y_2) & \cdots & N_n(x_2, y_2) \\
\vdots & \vdots & \ddots & \vdots \\
N_1(x_m, y_m) & N_2(x_m, y_m) & \cdots & N_n(x_m, y_m)
\end{pmatrix}.$$
Example:
If there are no scattered data inside the domain $\Omega_j$ of $N_j(x, y)$, column
$b_j = 0$.
Then the columns of $B$ are not linearly independent,
$B^T B$ is not positive definite, and the solution is not unique.
Also recall:
$$\text{minimize } I(c) = \sum_{k=1}^{m} (f(x_k, y_k) - z_k)^2 = \sum_{k=1}^{m} \Bigl( \sum_{j=1}^{n} c_j N_j(x_k, y_k) - z_k \Bigr)^2.$$
So, $c_j$ in the figure does not contribute to $I(c)$; thus no uniqueness.
In general $B^T B$ is positive semidefinite for general triangulations:
$$x^T (B^T B) x = (Bx)^T (Bx) = \| Bx \|_2^2 \ge 0.$$
Does smoothing term guarantee uniqueness?
Notation: PD = Positive Definite
PSD = Positive SemiDefinite
$(B^T B + \lambda E) c = B^T z$ has a unique solution if $B^T B + \lambda E$ is
PD (and not only PSD).
We know that $B^T B + \lambda E$ is “at least” PSD, but is it PD?
For $B^T B + \lambda E$ to be PSD but not PD, there must be a $c \ne 0$ such that
$$c^T (B^T B + \lambda E) c = c^T (B^T B) c + \lambda c^T E c = 0,$$ (∗)
that is, $c^T (B^T B) c = 0$ and $c^T E c = 0$ (for $\lambda > 0$, since both terms are non-negative).
Observation: $c^T E c = J(c)$, the general energy term.
Uniqueness with the membrane operator for general
triangulations
$$J_1(c) = \sum_{k=1}^{|T|} A_k |\nabla g_k|^2 = \sum_{k=1}^{|T|} A_k \left[ \left( \frac{\partial g_k}{\partial x} \right)^2 + \left( \frac{\partial g_k}{\partial y} \right)^2 \right].$$
$J_1(c) = 0$ implies
$\partial g_k/\partial x = 0$ and $\partial g_k/\partial y = 0$, $k = 1, \ldots, |T|$.
Then all coefficients are equal:
$$c_1 = c_2 = \cdots = c_n$$ (∗∗)
and the first term of (∗):
$$c^T (B^T B) c = (Bc)^T (Bc) = \| Bc \|_2^2 = 0,$$
⇓
$$(Bc)_j = \sum_{i=1}^{n} c_i N_i(x_j, y_j) = f(x_j, y_j) = 0, \quad j = 1, \ldots, m.$$
But then
(∗∗) $\Longrightarrow$ $f \equiv 0$
and
$c = 0$.
Conclusion:
$c^T (B^T B + \lambda E) c = 0$ implies that $c = 0$
⇓
$B^T B + \lambda E$ is PD.
Uniqueness with the umbrella operator for general
triangulations
$$J_2(c) = \sum_{k=1}^{n} (M_k f)^2 = \sum_{k=1}^{n} \Bigl( \frac{1}{n_k} \sum_{l \in \partial\Omega_k} c_l - c_k \Bigr)^2.$$
$J_2(c) = 0$ implies
$$c_k = \frac{1}{n_k} \sum_{l \in \partial\Omega_k} c_l, \quad k = 1, \ldots, n.$$
In particular this must hold for $c_{\max} = \max \{ c_k \}_{k=1}^{n}$.
But then all neighbours $c_l$ of $c_{\max}$ with $l$ in $\partial\Omega_{\max}$ must be
equal to $c_{\max}$, and so on; repeat this argument.
⇓
$$c^T E c = 0 \iff c_1 = c_2 = \cdots = c_n$$ (∗∗)
and the first term of (∗):
$$c^T (B^T B) c = (Bc)^T (Bc) = \| Bc \|_2^2 = 0,$$
⇓
$$(Bc)_j = \sum_{i=1}^{n} c_i N_i(x_j, y_j) = f(x_j, y_j) = 0, \quad j = 1, \ldots, m.$$
But then
(∗∗) $\Longrightarrow$ $f \equiv 0$
and
$c = 0$.
Conclusion:
$c^T (B^T B + \lambda E) c = 0$ implies that $c = 0$
⇓
$B^T B + \lambda E$ is PD.
Uniqueness with the thin-plate energy functional for
general triangulations
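$$J_3(c) = \sum_{k=1}^{|E_I|} (T_k f)^2 = \sum_{k=1}^{|E_I|} \| m_{e_k} \|_2^2 = c^T E c,$$
where $m_{e_k} = n_1 - n_2 = (-\partial g_1/\partial x + \partial g_2/\partial x,\; -\partial g_1/\partial y + \partial g_2/\partial y,\; 0)$.
$J_3(c) = 0$ implies:
$\| m_{e_k} \|_2 = 0$
over all interior edges.
Then
$$\nabla f|_{t_1} = \nabla f|_{t_2} = \cdots = \nabla f|_{t_{|T|}}$$
and $f$ is a linear polynomial.
Again, as for the umbrella operator,
$$c^T (B^T B) c = (Bc)^T (Bc) = \| Bc \|_2^2 = 0$$
implies that $f \equiv 0$ and $c = 0$ (assuming the data points are not all collinear).
Conclusion:
$c^T (B^T B + \lambda E) c = 0$ implies that $c = 0$
⇓
$B^T B + \lambda E$ is PD.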
```
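
The slides describe how to assemble $B$, the umbrella-operator matrix $E$, and the penalized system $(B^T B + \lambda E) c = B^T z$, but give no code. The following is a minimal NumPy sketch of those steps, not the author's implementation. The function names (`hat_matrix`, `umbrella_matrix`, `solve_penalized`), the five-vertex mesh, the test function and the value $\lambda = 0.1$ are all assumptions made for illustration. It also uses a dense solve for brevity, where the slides suggest a sparse solver such as the conjugate gradient method.

```python
import numpy as np

def hat_matrix(verts, tris, pts):
    """B[k, i] = N_i(x_k, y_k), using barycentric coordinates of each point."""
    B = np.zeros((len(pts), len(verts)))
    for k, p in enumerate(pts):
        for tri in tris:
            a, b, c = verts[tri]
            # Solve p = a + s*(b - a) + t*(c - a) for (s, t).
            s, t = np.linalg.solve(np.column_stack([b - a, c - a]), p - a)
            w = np.array([1.0 - s - t, s, t])
            if np.all(w >= -1e-12):  # p lies in (or on the boundary of) this triangle
                B[k, tri] = w
                break
    return B

def umbrella_matrix(n, tris):
    """E = M^T M, where M[k, l] = eta_l^k of the (unweighted) umbrella operator."""
    nbrs = [set() for _ in range(n)]
    for a, b, c in tris:
        nbrs[a] |= {b, c}
        nbrs[b] |= {a, c}
        nbrs[c] |= {a, b}
    M = np.zeros((n, n))
    for k, nb in enumerate(nbrs):
        M[k, k] = -1.0
        M[k, list(nb)] = 1.0 / len(nb)
    return M.T @ M

def solve_penalized(B, z, E, lam):
    """Solve (B^T B + lam * E) c = B^T z with a dense solver."""
    return np.linalg.solve(B.T @ B + lam * E, B.T @ z)

# Hypothetical mesh: unit square split into four triangles around a centre vertex.
verts = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.5]], dtype=float)
tris = np.array([[0, 1, 4], [1, 2, 4], [2, 3, 4], [3, 0, 4]])
E = umbrella_matrix(len(verts), tris)

rng = np.random.default_rng(0)

def f(p):
    """Illustrative linear test function used to generate data values z."""
    return 1.0 + 2.0 * p[:, 0] + 3.0 * p[:, 1]

# Case 1: the vertices are the first n data points (V is a subset of P).
pts = np.vstack([verts, rng.random((20, 2))])
B = hat_matrix(verts, tris, pts)
c_plain = solve_penalized(B, f(pts), E, lam=0.0)   # ordinary least squares
c_smooth = solve_penalized(B, f(pts), E, lam=0.1)  # with umbrella smoothing

# Case 2: general triangulation; no data in the support of vertex 2 = (1, 1).
pts2 = 0.5 * rng.random((30, 2))
B2 = hat_matrix(verts, tris, pts2)
rank_plain = np.linalg.matrix_rank(B2.T @ B2)
min_eig = np.linalg.eigvalsh(B2.T @ B2 + 0.1 * E).min()

print("least squares coefficients:", c_plain)
print("smoothed coefficients:     ", c_smooth)
print("rank of B2^T B2:", rank_plain, "of", len(verts))
print("smallest eigenvalue of B2^T B2 + 0.1 E:", min_eig)
```

In the first case the data come from a linear function, which lies in $S_1^0(\Delta)$, so the unpenalized solution should reproduce the node values. In the second case one column of $B$ is zero, so $B^T B$ is rank deficient, while adding the umbrella term should make the system matrix positive definite, in line with the uniqueness argument in the slides.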