# What is shape? - University of Cincinnati ```SHAPE ANALYSIS
Issues and Problems
M.B. Rao
University of Cincinnati
A Seminar Presented at
The Division of Biostatistics and Bioinformatics
University of Cincinnati
January 30, 2015
Outline
1. Exordium
2. Consulting Problems
3. What is shape?
4. How to bring shapes into a single platform?
a. Procrustes analysis
b. Bookstein coordinates
c. Helmert transformation a la Kendall
5. Distribution theory
6. Excursus
1. Exordium
Introduction and Examples
Take any object depicted in two or three dimensions. The focus is on the outline
(shape) of the object. What is shape? The outline can be described by a
mathematical function, but this is hard. Another way is to identify some landmarks
on the shape and record the coordinates of the landmarks in a coordinate system.
Such data on the shape are called landmark data. We will be working with landmark
data of several objects. From a statistical point of view, a shape is characterized
by a collection of points listed in some order along with their coordinates. The
points are put together in the form of a matrix, with rows representing points
(landmarks) and columns representing coordinates.
The subject matter comes under the name ‘Statistical Shape Analysis’ and
‘Morphometrics.’
There are hundreds of papers published in this area.
Books
Ian Dryden and Kanti Mardia – Statistical Shape Analysis, Wiley, 1998
Julien Claude – Morphometrics with R, Springer, 2008
Fred Bookstein – Morphometric Tools for Landmark data, Cambridge University
Press, 1991
Examples
The R package ‘shapes’ contains some examples of shape data. Load the data
‘digit3.dat.’
> library(shapes)
> data(digit3.dat)
What are the dimensions of the data? It is an array.
> dim(digit3.dat)
[1] 13  2 30
It is an array consisting of 30 shapes in two dimensions with 13 landmarks. The
first number gives the number of rows, second columns, and third slices.
Plot the first two shapes.
> plotshapes(digit3.dat[, , 1], joinline = c(1:13))
> plotshapes(digit3.dat[, , 2], joinline = c(1:13))
Plot all shapes.
> plotshapes(digit3.dat, joinline = c(1:13))
Data behind Shape 1:
> digit3.dat[ , , 1]
      [,1] [,2]
 [1,]    9  -27
 [2,]   12  -31
 [3,]   17  -36
 [4,]   26  -39
 [5,]   34  -37
 [6,]   36  -33
 [7,]   38  -27
 [8,]   35  -19
 [9,]   30  -15
[10,]   21  -14
[11,]   21   -8
[12,]   16   -6
[13,]    8   -5
[Figure: two panels titled "Example 1" and "Example 2"]
Plot all shapes of 3.
[Figure: all shapes of the digit 3]
Questions
Basic: How does one define shape?
1. How to define distance between two shapes?
2. How to define mean shape?
3. How to define median shape?
4. How to measure variation present in the shapes?
5. What is Shape space? How to introduce a distribution on the shape space?
2. Consulting problems
A. Antarctica was teeming with life 60,000 years ago: flora and fauna. A
geology professor, on a summer excavation expedition in Antarctica, brought back
60 seeds. Each seed was laid out on tracing paper and viewed through a
microscope, and its outline was drawn on the paper. He brought these 60 papers to
my office and asked me to do cluster analysis on the seeds. How?
B. A physics researcher in the medical school here came to me with a modeling
problem.
Treatment regimen for breast cancer: a six-week program with one treatment per week:
a. Identify the tumor in the breast.
b. Locate the center of the tumor.
c. Apply intense radiation at the center for a certain length of time.
The woman comes again the next week:
a. Identify the tumor in the breast (the tumor seems to have shrunk).
b. Locate the center of the tumor (the center has shifted, and the tumor has shifted too!).
c. Apply radiation.
Data:
Week 1: Center (0, 0, 0)
Week 2: Center (x1, y1, z1)
…
Week 6: Center (x5, y5, z5)
+ Outlines of the shapes of the tumors
+ Some covariate information (age, parity, density of the breast, race, etc.)
Data on about 30 women.
Questions: Model how the center is shifting from week to week; model the variation
in the shapes.
C. Donna’s Morphometrics Lab in the Children’s Hospital
D. Tessier facial cleft
3. What is shape?
Suppose Y and W are two shapes in m-dimensional space, each with k landmarks.
This means that Y and W are matrices of order k × m (k rows and m columns).
Say that the shapes Y and W are the same if after shifting the location of one
shape it is identical with the other shape. What does it mean to say location shift?
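Suppose
$$Y = \begin{pmatrix} 1 & 1 \\ 0 & 0 \\ -1 & 1 \end{pmatrix} \quad\text{and}\quad W = \begin{pmatrix} 4 & 3 \\ 3 & 2 \\ 2 & 3 \end{pmatrix},$$
with Landmark 1 joined to Landmark 2 and Landmark 2 joined to Landmark 3. Adding the row (3, 2) to every landmark of Y gives W:
$$\begin{pmatrix} 1 & 1 \\ 0 & 0 \\ -1 & 1 \end{pmatrix} + \begin{pmatrix} 3 & 2 \\ 3 & 2 \\ 3 & 2 \end{pmatrix} = \begin{pmatrix} 4 & 3 \\ 3 & 2 \\ 2 & 3 \end{pmatrix}$$
[Figure: the shapes Y and W plotted in the same frame]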
Say two shapes Y and W are the same if after shrinking or enlarging (scaling) one
of the shapes it is identical to the other shape.
Consider the following examples.
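$$Y = \begin{pmatrix} 2 & 1.5 \\ 1.5 & 1 \\ 1 & 1.5 \end{pmatrix} \quad\text{and}\quad W = \begin{pmatrix} 4 & 3 \\ 3 & 2 \\ 2 & 3 \end{pmatrix}$$
[Figure: the shapes Y and W, each plotted on axes from 0 to 4]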
Shrink the shape W by 50%.
$$0.5\,W = 0.5 \begin{pmatrix} 4 & 3 \\ 3 & 2 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 1.5 \\ 1.5 & 1 \\ 1 & 1.5 \end{pmatrix} = Y$$
Say two shapes Y and W are the same if after rotating one of the shapes by an
angle it coincides with the other shape.
Look at the following examples.
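$$Y = \begin{pmatrix} -3 & 4 \\ -2 & 3 \\ -3 & 2 \end{pmatrix} \quad\text{and}\quad W = \begin{pmatrix} 4 & 3 \\ 3 & 2 \\ 2 & 3 \end{pmatrix}$$
Plot the shapes.
[Figure: the shapes W and Y plotted in the same frame]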
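Rotate the shape W anti-clockwise by an angle of 90°. What does it mean?
$$\begin{pmatrix} 4 & 3 \\ 3 & 2 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} \cos 90^\circ & \sin 90^\circ \\ -\sin 90^\circ & \cos 90^\circ \end{pmatrix} = \begin{pmatrix} 4 & 3 \\ 3 & 2 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -3 & 4 \\ -2 & 3 \\ -3 & 2 \end{pmatrix}$$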
The 2x2 matrix above is an example of a rotation matrix.
One could rotate a shape by any angle.
In general, rotating a point or a shape anti-clockwise by an angle θ is
tantamount to post-multiplying the point or the shape matrix by the matrix
$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$
Definition: Shape is all the geometrical information that remains when location,
scale, and rotational effects are filtered out from an object.
In summary:
Take any shape. Shift it to another location. We are not getting a new shape.
Take any shape. Scale it. We are not getting a new shape.
Take any shape. Rotate it. We are not getting a new shape.
Say two shapes are identical if after shifting the location, scaling, and rotating
of one shape, it matches with the other shape.
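The three operations in the summary can be checked numerically. The following is a minimal base-R sketch that reuses the small matrices from the examples in this section; the helper name rotation() is an assumption made for illustration and is not part of any package.

```r
# Three landmarks in the plane (rows = landmarks, columns = coordinates),
# taken from the examples in this section.
W <- matrix(c(4, 3,
              3, 2,
              2, 3), ncol = 2, byrow = TRUE)

# Location shift: subtract the same row vector (3, 2) from every landmark.
Y_shift <- sweep(W, 2, c(3, 2))          # the default operation of sweep() is "-"

# Scaling: shrink W by 50%.
Y_scale <- 0.5 * W

# Rotation: post-multiply by the matrix for an anti-clockwise angle theta.
# rotation() is a hypothetical helper defined here for illustration.
rotation <- function(theta) {
  matrix(c( cos(theta), sin(theta),
           -sin(theta), cos(theta)), ncol = 2, byrow = TRUE)
}
Y_rot <- W %*% rotation(pi / 2)

print(Y_shift)
print(Y_scale)
print(round(Y_rot, 10))
```

All three results describe the same shape as W; only location, size, or orientation has changed.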
Goals:
1. We have landmark data on several objects. Bring them together onto the
same platform by location shift, scaling, and rotation. Obtain summary
statistics of the shapes after that. How to do that?
2. Develop distribution theory on shapes.
3. Fit a shape distribution to the landmark data.
4. Pursue statistical inference.
5. Non-parametric inference
6. Pattern recognition
7. Etc.
Example:
Consider the landmark data on the first digit 3 from the dataset ‘digit3.dat.’ Plot
it.
> data(digit3.dat)
> digit3.dat[ , , 1]
      [,1] [,2]
 [1,]    9  -27
 [2,]   12  -31
 [3,]   17  -36
 [4,]   26  -39
 [5,]   34  -37
 [6,]   36  -33
 [7,]   38  -27
 [8,]   35  -19
 [9,]   30  -15
[10,]   21  -14
[11,]   21   -8
[12,]   16   -6
[13,]    8   -5
> plotshapes(digit3.dat[ , , 1], joinline = c(1:13))
[Figure: the first digit 3]
Rotate it by 90°. Post-multiply its landmark data by the appropriate rotation
matrix.
[Figure: the digit 3 before and after the 90° rotation]
1. If the shape is located in the first quadrant, the 90° rotation will place it in
the second quadrant.
2. If the shape is located in the second quadrant, the 90° rotation will place it
in the third quadrant.
3. If the shape is located in the third quadrant, the 90° rotation will place it in
the fourth quadrant.
4. If the shape is located in the fourth quadrant, the 90° rotation will place it
in the first quadrant.
5. Our Digit 3 is located in the fourth quadrant.
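The quadrant statement for the digit 3 can be verified directly. This is a small base-R sketch in which the landmark values are typed in from the listing above, so the shapes package is not needed to run it.

```r
# Landmarks of the first digit 3, copied from the listing above.
digit <- matrix(c( 9, -27, 12, -31, 17, -36, 26, -39, 34, -37, 36, -33, 38, -27,
                  35, -19, 30, -15, 21, -14, 21,  -8, 16,  -6,  8,  -5),
                ncol = 2, byrow = TRUE)

# 90-degree anti-clockwise rotation matrix (cos 90 = 0, sin 90 = 1).
R90 <- matrix(c( 0, 1,
                -1, 0), ncol = 2, byrow = TRUE)

rotated <- digit %*% R90

# Before: x > 0 and y < 0 (fourth quadrant).
print(all(digit[, 1] > 0 & digit[, 2] < 0))
# After: x > 0 and y > 0 (first quadrant).
print(all(rotated[, 1] > 0 & rotated[, 2] > 0))
```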
4. How to bring all shapes into a single platform?
a. What is Procrustes Analysis?
There are two planar shapes y and w. Mathematically, y and w each is an ordered
sequence of two-tuples signifying the landmarks of the shape.
Let us look at the ‘digit3.dat’ data. Let us look at the first two shapes. They can be
brought into the same frame. Shape 1 seems to be taller and wider than Shape 2.
The simplest way to achieve this objective is to make the centroid of the shapes
to be (0, 0). For each shape, make the mean of each coordinate zero.
Goal:
1. Rotate Shape 2 by an angle θ.
2. Expand it by an amount β > 1?
3. Shift it to a different position by (a, b).
4. Choose θ, β, and (a, b) so that the resultant modified Shape 2 is closest
to Shape 1 in the Euclidean sense.
This is, in essence, a Procrustes transformation of Shape 2 to Shape 1. The
objective can be formulated mathematically if we view each 2-tuple as a complex
number.
Procrustes is a figure from Greek mythology. He owned only one bed at a roadside
inn for the benefit of travelers who wanted to rest at his inn for the night. He
could offer the bed to only one traveler. The dimensions of the bed were fixed and
immutable. If the traveler was shorter than the length of the bed, Procrustes
stretched his legs to fit him snugly into the bed. If the traveler was taller than
the length of the bed, Procrustes chopped his legs so as to fit him snugly into
the bed.
Procrustes transformation
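Let $y^T = (y_1, y_2, \ldots, y_k)$ and $w^T = (w_1, w_2, \ldots, w_k)$ be two centralized shapes. Each
two-tuple is written as a complex number, so each $y_i$ and $w_i$ is a complex number.
Write the complex linear model:
$$y_i = u + \beta e^{i\theta} w_i + \varepsilon_i, \qquad i = 1, 2, \ldots, k$$
In matrix notation,
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} = \begin{pmatrix} a + ib \\ a + ib \\ \vdots \\ a + ib \end{pmatrix} + \beta e^{i\theta} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_k \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_k \end{pmatrix} = (a + ib)\,1_k + \beta e^{i\theta} w + \varepsilon,$$
where $1_k$ is a column vector of k ones.
We are rotating the shape w by an angle θ (i.e., $e^{i\theta} w$) and then scaling it by β.
Whatever shape we get we are shifting it by a + ib.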
Parameters
1. u = a + ib is the shift.
2. θ is the angle of rotation.
3. β > 0 is the scale.
ε is error.
Estimate the parameters so that y and $(a + ib)\,1_k + \beta e^{i\theta} w$ are closest. Minimize
the sum of squared absolute errors, that is, minimize
$$\sum_{i=1}^{k} \bar{\varepsilon}_i \varepsilon_i$$
with respect to u, θ, and β. (Complex least squares problem!) The overbar denotes
complex conjugation. The solution is explicit.
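$$\hat{\theta} = \arg(w^* y) = -\arg(y^* w)$$
$$\hat{\beta} = \frac{\sqrt{w^* y \, y^* w}}{w^* w}$$
$$\hat{u} = 0$$
Here $w^*$ denotes the conjugate transpose of w.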
The package “shapes” has a command which does procrustes analysis.
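The explicit solution can also be coded in a few lines of base R. The sketch below is an illustration only: the function name procrustes_fit() and the test configuration are assumptions, and it is not the command provided by the package. It centers both shapes first, which is why the estimated shift is zero.

```r
# Full Procrustes fit of w onto y, with landmarks stored as complex numbers.
# procrustes_fit() is a hypothetical helper written for this illustration.
procrustes_fit <- function(y, w) {
  # Centre both configurations so that the estimated shift u is zero.
  y <- y - mean(y)
  w <- w - mean(w)
  wy <- sum(Conj(w) * y)               # w*y, with w* the conjugate transpose
  ww <- sum(Mod(w)^2)                  # w*w
  theta <- Arg(wy)                     # estimated rotation angle
  beta  <- Mod(wy) / ww                # sqrt(w*y y*w) / (w*w)
  list(theta = theta, beta = beta,
       fitted = beta * exp(1i * theta) * w)
}

# Made-up check: build y from w by a known rotation, scale, and shift.
w <- complex(real = c(4, 3, 2), imaginary = c(3, 2, 3))
y <- 2 * exp(1i * pi / 6) * (w - mean(w)) + (1 + 2i)
fit <- procrustes_fit(y, w)
print(fit$theta)   # close to pi / 6
print(fit$beta)    # close to 2
```

Because y was constructed without error, the fitted configuration coincides with the centered y.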
Plot Shapes 1 and 2 in the frame.
[Figure: Shapes 1 and 2]
Numerical example
Both shapes are centered at the origin (0, 0).
[Figure: "Two Shapes of Three", with axes X and Y]
Comments: The shapes seem to be similar. The blue shape (Shape 2) is bigger. If
we can squash it (scale: β < 1) and rotate it, we should be able to get it closer to
the red shape. Use the linear model theory for complex data to bring Shape 2
closest to Shape 1.
[Figure: "Shape 1 & Modified Shape 2", with axes X and Y]
Centering Digit 3 data and Procrustes
We center all Digit 3 shapes and apply Procrustes transformation to Shapes 2 to
30 to bring them all closest to Shape 1. We will then find the mean shape and
variance of shapes.
Let us plot the centered shapes.
[Figure: "Centered Digit 3s"]
Let us apply the Procrustes transformation to Shapes 2 to 30 to bring them all
closest to Shape 1.
[Figure: "Procrustes of Shapes 2:30 into Shape 1"]
Let us find the mean shape.
[Figure: "Procrustes of Shapes 2:30 into Shape 1 + Mean Shape"]
Criticism?
Let us measure variation present in the shapes. From each Procrustes-fitted shape take
away the Mean Shape, square the differences, add them, and then divide by 13.
Standard deviation of the shapes:
[1] 3.743057
Can we build a 95% confidence interval for the population mean shape?
Bookstein’s coordinates and Bookstein’s mean shape
Let (x1, y1), (x2, y2), … , (xk, yk) be the landmarks of a shape in two dimensions.
Translate, rotate, and rescale the shape so that Landmark 1 becomes (-1/2, 0) and
Landmark 2 becomes (1/2, 0). The new landmarks are (-1/2, 0), (1/2, 0), (u3, v3),
(u4, v4), … , (uk, vk).
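The whole operation can be summarized as follows. For any j ≥ 1,
$$\begin{pmatrix} u_j \\ v_j \end{pmatrix} = c \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \left( \begin{pmatrix} x_j \\ y_j \end{pmatrix} - \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \right)$$
c = scaling factor
$A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$ = rotation by an angle θ clockwise
$\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$ = translation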
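There are four unknowns: c, θ, $b_1$, and $b_2$.
We need to find them. Look at our goals on Landmarks 1 and 2. Set
$$\begin{pmatrix} -0.5 \\ 0 \end{pmatrix} = c \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \left( \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} - \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \right)$$
$$\begin{pmatrix} 0.5 \\ 0 \end{pmatrix} = c \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \left( \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} - \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \right)$$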
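We will have four equations in four unknowns. The solution is given by
$$\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} \frac{x_1 + x_2}{2} \\ \frac{y_1 + y_2}{2} \end{pmatrix}$$
$$\frac{1}{c^2} = (x_1 - x_2)^2 + (y_1 - y_2)^2 = D_{12}^2 = \text{squared distance between Landmark 1 and Landmark 2}$$
$$A = c \begin{pmatrix} x_2 - x_1 & y_2 - y_1 \\ -(y_2 - y_1) & x_2 - x_1 \end{pmatrix}$$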
With this solution, the new landmarks are:
$$u_j = \frac{(x_2 - x_1)(x_j - x_1) + (y_2 - y_1)(y_j - y_1)}{D_{12}^2} - 0.5$$
$$v_j = \frac{(x_2 - x_1)(y_j - y_1) - (y_2 - y_1)(x_j - x_1)}{D_{12}^2}$$
j = 3, 4, … , k
k=3
(x1, y1) = (2, 1)
(x2, y2) = (1, 2)
(x3, y3) = (2, 2)
New landmarks:
(-0.5, 0)
(0.5, 0)
(0.0, -0.5)
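The worked example above can be reproduced with a short base-R sketch of the two formulas for uj and vj. The function name bookstein() is an assumption made for illustration; it takes a k x 2 landmark matrix and uses Landmarks 1 and 2 as the baseline.

```r
# Bookstein coordinates of a k x 2 landmark matrix X,
# with Landmarks 1 and 2 as the baseline.
# bookstein() is a hypothetical helper written for this illustration.
bookstein <- function(X) {
  dx <- X[2, 1] - X[1, 1]
  dy <- X[2, 2] - X[1, 2]
  D2 <- dx^2 + dy^2                       # squared baseline length
  u <- (dx * (X[, 1] - X[1, 1]) + dy * (X[, 2] - X[1, 2])) / D2 - 0.5
  v <- (dx * (X[, 2] - X[1, 2]) - dy * (X[, 1] - X[1, 1])) / D2
  cbind(u, v)
}

# The three landmarks of the example: (2, 1), (1, 2), (2, 2).
X <- matrix(c(2, 1,
              1, 2,
              2, 2), ncol = 2, byrow = TRUE)
print(bookstein(X))
```

The same formulas also return (-0.5, 0) and (0.5, 0) for the two baseline landmarks, so no special case is needed for j = 1, 2.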
Aim:
We have m shapes with landmarks. For each shape, get the Bookstein’s
coordinates. Then we can calculate the mean shape by averaging the new
coordinates (Bookstein mean shape).
Example: Let us look at the female gorillas. Only the first two are shown.
[Figure: landmarks of the first two female gorillas]
Do Bookstein.
[Figure: Bookstein coordinates of the female gorillas]
5. Distributions on the pre-shape space
Pre-shape space
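We look at a shape X (configuration matrix) in m dimensions described by k
points. This means that X is a matrix of order k × m. Let us consider the Helmert
matrix $H^F$ of order k × k.
$$H^F = \begin{pmatrix}
\frac{1}{\sqrt{k}} & \frac{1}{\sqrt{k}} & \frac{1}{\sqrt{k}} & \frac{1}{\sqrt{k}} & \cdots & \frac{1}{\sqrt{k}} \\
\frac{-1}{\sqrt{1 \cdot 2}} & \frac{1}{\sqrt{1 \cdot 2}} & 0 & 0 & \cdots & 0 \\
\frac{-1}{\sqrt{2 \cdot 3}} & \frac{-1}{\sqrt{2 \cdot 3}} & \frac{2}{\sqrt{2 \cdot 3}} & 0 & \cdots & 0 \\
\frac{-1}{\sqrt{3 \cdot 4}} & \frac{-1}{\sqrt{3 \cdot 4}} & \frac{-1}{\sqrt{3 \cdot 4}} & \frac{3}{\sqrt{3 \cdot 4}} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{-1}{\sqrt{(k-1)k}} & \frac{-1}{\sqrt{(k-1)k}} & \frac{-1}{\sqrt{(k-1)k}} & \frac{-1}{\sqrt{(k-1)k}} & \cdots & \frac{k-1}{\sqrt{(k-1)k}}
\end{pmatrix}$$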
Properties:
1. It is an orthogonal matrix, i.e., $H^F (H^F)^T = (H^F)^T H^F = I_k$, the identity matrix of
order k × k.
2. With the exception of the first row, every row sum is zero.
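Consider the following sub-matrix H of order (k-1) × k obtained from $H^F$ by deleting
its first row.
$$H = \begin{pmatrix}
\frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 & \cdots & 0 \\
\frac{-1}{\sqrt{3 \cdot 2}} & \frac{-1}{\sqrt{3 \cdot 2}} & \frac{2}{\sqrt{3 \cdot 2}} & 0 & \cdots & 0 \\
\frac{-1}{\sqrt{4 \cdot 3}} & \frac{-1}{\sqrt{4 \cdot 3}} & \frac{-1}{\sqrt{4 \cdot 3}} & \frac{3}{\sqrt{4 \cdot 3}} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{-1}{\sqrt{(k-1)k}} & \frac{-1}{\sqrt{(k-1)k}} & \frac{-1}{\sqrt{(k-1)k}} & \frac{-1}{\sqrt{(k-1)k}} & \cdots & \frac{k-1}{\sqrt{(k-1)k}}
\end{pmatrix}$$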
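Both properties of the Helmert matrix are easy to confirm numerically. The sketch below builds the full matrix in base R following the pattern displayed above; the function name helmert_full() and the choice k = 5 are assumptions made for illustration.

```r
# Full Helmert matrix of order k x k (k >= 2), following the pattern above.
# helmert_full() is a hypothetical helper written for this illustration.
helmert_full <- function(k) {
  HF <- matrix(0, k, k)
  HF[1, ] <- 1 / sqrt(k)
  for (j in 1:(k - 1)) {
    HF[j + 1, 1:j]   <- -1 / sqrt(j * (j + 1))
    HF[j + 1, j + 1] <-  j / sqrt(j * (j + 1))
  }
  HF
}

k  <- 5
HF <- helmert_full(k)
H  <- HF[-1, , drop = FALSE]              # Helmert sub-matrix, (k-1) x k

print(max(abs(HF %*% t(HF) - diag(k))))   # numerically zero: HF is orthogonal
print(max(abs(rowSums(H))))               # numerically zero: rows of H sum to zero
```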
Definition: Let $U = (u_{ij})$ be a matrix of order p × q. Its norm is defined by
$$\|U\| = \sqrt{\sum_{i=1}^{p} \sum_{j=1}^{q} u_{ij}^2}.$$
It can be shown that $\|U\| = \sqrt{\operatorname{trace}(U U^T)} = \sqrt{\operatorname{trace}(U^T U)}$,
where the trace of any square matrix is the sum of all its diagonal elements.
Definition: The pre-shape of a configuration matrix X is defined by
$$Z = \frac{HX}{\|HX\|}.$$
The matrix Z is of order (k-1) × m. Let $Z = (z_{ij})$. Note that
$$\|Z\| = \left\| \frac{HX}{\|HX\|} \right\| = \frac{1}{\|HX\|} \, \|HX\| = \sqrt{\sum_{i=1}^{k-1} \sum_{j=1}^{m} z_{ij}^2} = 1$$
This is not always true. Why?
Definition: A shape X is coincident if all rows of X are identical. This means all k
points in the m-dimensional Euclidean space are the same.
If the shape X is coincident, then HX = 0. Why? Consequently, ‖HX‖ = 0, and the pre-shape of X is not defined.
Definition: A shape X is non-coincident if it is not coincident.
Properties of pre-shape
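1. Shift the shape X to some other spot in the m-dimensional space. The new
shape is of the form
$$X_1 = X + \begin{pmatrix} a_1 & a_2 & \cdots & a_m \\ a_1 & a_2 & \cdots & a_m \\ \vdots & \vdots & & \vdots \\ a_1 & a_2 & \cdots & a_m \end{pmatrix} = X + A$$
Its pre-shape is exactly the same as that of X. Since every row of H sums to zero and the rows of A are identical, HA = 0, so
$$HX_1 = H(X + A) = HX + HA = HX + 0 = HX \quad\text{and}\quad \|HX_1\| = \|HX\|$$
$$Z_1 = \text{pre-shape of } X_1 = \frac{HX_1}{\|HX_1\|} = \frac{HX}{\|HX\|} = Z$$
2. Scale the shape X by a factor β > 0.
The pre-shape of βX is exactly the same as that of X, since H(βX) = βHX and ‖βHX‖ = β‖HX‖.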
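The two invariance properties can be illustrated with a few lines of base R. This is a sketch under stated assumptions: the helper names helmert_sub() and preshape_of(), the shift (10, -7), and the factor 2.5 are all made up for the illustration, and the configuration X is the small three-landmark example used earlier.

```r
# Helmert sub-matrix of order (k-1) x k (hypothetical helper).
helmert_sub <- function(k) {
  H <- matrix(0, k - 1, k)
  for (j in 1:(k - 1)) {
    H[j, 1:j]   <- -1 / sqrt(j * (j + 1))
    H[j, j + 1] <-  j / sqrt(j * (j + 1))
  }
  H
}

# Pre-shape Z = HX / ||HX|| of a non-coincident configuration X
# (hypothetical helper).
preshape_of <- function(X) {
  HX <- helmert_sub(nrow(X)) %*% X
  HX / sqrt(sum(HX^2))
}

X <- matrix(c(4, 3,
              3, 2,
              2, 3), ncol = 2, byrow = TRUE)
Z         <- preshape_of(X)
Z_shifted <- preshape_of(sweep(X, 2, c(10, -7), "+"))   # translate by (10, -7)
Z_scaled  <- preshape_of(2.5 * X)                       # scale by beta = 2.5

print(sum(Z^2))                  # 1, up to rounding
print(max(abs(Z - Z_shifted)))   # numerically zero
print(max(abs(Z - Z_scaled)))    # numerically zero
```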
Definition: The pre-shape space Σ is the collection of all pre-shapes of non-coincident shapes. Mathematically,
$$\Sigma = \{ Z = HX / \|HX\| : X \text{ non-coincident} \}$$
One could have introduced Shape Space as the collection of all shapes X. This
space is wild and chaotic. A shape X and all its translations and distortions prowl
the Shape Space as distinct objects. We cannot introduce distributions on such a
wild entity. On the other hand, the pre-shape space is orderly. A shape and all its
translations and distortions are one and the same in the world of pre-shapes. It is
easy to introduce distributions on such an entity. Further, the term ‘pre-shape’
signifies that we are one step away from shape – rotation still has to be removed.
Specialize to the case m = 2
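We now focus on shapes in two dimensions. The pre-shape Z of a shape X is of
dimensions (k-1) × 2. Let us write it explicitly:
$$Z = \frac{HX}{\|HX\|} = \begin{pmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \\ \vdots & \vdots \\ z_{k-1,1} & z_{k-1,2} \end{pmatrix}$$
with the property
$$\sum_{i=1}^{k-1} \sum_{j=1}^{2} z_{ij}^2 = 1$$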
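Obtain the polar coordinates of each point in Z.
$$z_{11} = r_1 \cos\theta_1 \quad\text{and}\quad z_{12} = r_1 \sin\theta_1$$
$$z_{21} = r_2 \cos\theta_2 \quad\text{and}\quad z_{22} = r_2 \sin\theta_2$$
…
$$z_{k-1,1} = r_{k-1} \cos\theta_{k-1} \quad\text{and}\quad z_{k-1,2} = r_{k-1} \sin\theta_{k-1}$$
Note that $r_1^2 + r_2^2 + \cdots + r_{k-1}^2 = 1$. Why?
Let $s_1 = r_1^2$, $s_2 = r_2^2$, … , $s_{k-2} = r_{k-2}^2$.
Each angle $\theta_i \in [0, 2\pi)$.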
Let us summarize the polar information in the pre-shape Z.
$$P = (s_1, s_2, \ldots, s_{k-2}, \theta_1, \theta_2, \ldots, \theta_{k-1})$$
Properties of P
1. Each $s_i > 0$.
2. $s_1 + s_2 + \cdots + s_{k-2} \le 1$.
3. Each $\theta_i \in [0, 2\pi)$.
Knowing a non-coincident shape X implies knowing its pre-shape Z.
Knowing the pre-shape Z implies knowing its polar vector P.
Given any P with properties listed above there is a pre-shape Z of some shape X.
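Give me
$$P = (s_1, s_2, \ldots, s_{k-2}, \theta_1, \theta_2, \ldots, \theta_{k-1}),$$
with the properties
1. Each $s_i > 0$.
2. $s_1 + s_2 + \cdots + s_{k-2} \le 1$.
3. Each $\theta_i \in [0, 2\pi)$.
Calculate
$$z_{11} = r_1 \cos\theta_1 \quad\text{and}\quad z_{12} = r_1 \sin\theta_1$$
$$z_{21} = r_2 \cos\theta_2 \quad\text{and}\quad z_{22} = r_2 \sin\theta_2$$
…
$$z_{k-1,1} = r_{k-1} \cos\theta_{k-1} \quad\text{and}\quad z_{k-1,2} = r_{k-1} \sin\theta_{k-1}$$
Note that $r_i = \sqrt{s_i}$ for i = 1, … , k-2, and $r_{k-1} = \sqrt{1 - s_1 - \cdots - s_{k-2}}$, so that $r_1^2 + \cdots + r_{k-1}^2 = 1$.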
Define
$$Z = \begin{pmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \\ \vdots & \vdots \\ z_{k-1,1} & z_{k-1,2} \end{pmatrix}.$$
Define
$$X = H^T Z.$$
Note that the order of the matrix X is k × 2.
The pre-shape of X is precisely Z. Note that $H H^T = I_{k-1}$.
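The construction from a polar vector P back to a shape X can be traced numerically. The following base-R sketch uses made-up values for k, s, and θ, and a hypothetical helper helmert_sub(); it then checks that the pre-shape of the reconstructed X is the Z we started from.

```r
# Helmert sub-matrix of order (k-1) x k (hypothetical helper).
helmert_sub <- function(k) {
  H <- matrix(0, k - 1, k)
  for (j in 1:(k - 1)) {
    H[j, 1:j]   <- -1 / sqrt(j * (j + 1))
    H[j, j + 1] <-  j / sqrt(j * (j + 1))
  }
  H
}

# Made-up polar vector for k = 4 landmarks.
k     <- 4
s     <- c(0.2, 0.5)                 # s_1, ..., s_{k-2}: positive, sum <= 1
theta <- c(0.3, 2.0, 4.5)            # theta_1, ..., theta_{k-1} in [0, 2*pi)

# The last radius follows from r_1^2 + ... + r_{k-1}^2 = 1.
r <- sqrt(c(s, 1 - sum(s)))
Z <- cbind(r * cos(theta), r * sin(theta))   # (k-1) x 2 pre-shape

H <- helmert_sub(k)
X <- t(H) %*% Z                      # k x 2 configuration

# Pre-shape of X, computed from the definition.
HX     <- H %*% X
Z_back <- HX / sqrt(sum(HX^2))
print(max(abs(Z_back - Z)))          # numerically zero
```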
Now, the strategy is clear. If one wants to introduce a distribution on the pre-shape space Σ, it suffices to introduce a distribution on the polar space
$$\Omega = \{ P = (s_1, s_2, \ldots, s_{k-2}, \theta_1, \theta_2, \ldots, \theta_{k-1}) : s_i > 0 \text{ for all } i;\ s_1 + s_2 + \cdots + s_{k-2} \le 1;\ \theta_i \in [0, 2\pi) \text{ for all } i \}.$$
Uniform pre-shape distribution on Ω
The point (𝑠1 , 𝑠2 , … , 𝑠𝑘−2 ) is in the k-2 dimensional simplex. It is a solid. Its
volume is 1/(k-2)!. (Geometry result)
Put a uniform distribution on the simplex.
Put a uniform distribution for θ1 on the interval [0, 2π).
Put a uniform distribution for θ2 on the interval [0, 2π).
…
Put a uniform distribution for θk-1 on the interval [0, 2π).
String them together independently. This is the uniform pre-shape distribution.
The uniform distribution on the simplex is a special type of Dirichlet distribution.
Definition
The random vector $(S_1, S_2, \ldots, S_{k-2})$ with each $S_i \ge 0$ and $S_1 + S_2 + \cdots + S_{k-2} \le 1$ is said
to have a Dirichlet distribution with parameters $p_1 > 0, p_2 > 0, \ldots, p_{k-2} > 0, p_{k-1} > 0$ if
the joint probability density function is given by
$$f(s_1, s_2, \ldots, s_{k-2}) = \frac{\Gamma(p_1 + p_2 + \cdots + p_{k-1})}{\Gamma(p_1)\Gamma(p_2) \cdots \Gamma(p_{k-1})} \, s_1^{p_1 - 1} s_2^{p_2 - 1} \cdots s_{k-2}^{p_{k-2} - 1} (1 - s_1 - s_2 - \cdots - s_{k-2})^{p_{k-1} - 1}$$
for $s_1 \ge 0, \ldots, s_{k-2} \ge 0$ and $s_1 + s_2 + \cdots + s_{k-2} \le 1$.
This is a multivariate version of the Beta distribution. If p1 = p2 = … = pk-1 = 1, the
distribution is uniform on the simplex.
This definition is introduced to point out that we can entertain other distributions
on the simplex.
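A draw from the uniform pre-shape distribution can be simulated in base R. The sketch below relies on a standard fact, not stated in the text above: independent standard exponential variables divided by their sum follow a Dirichlet distribution with all parameters equal to 1. The function name r_uniform_polar(), the seed, and the choice k = 8 are assumptions made for illustration.

```r
# One draw of P = (s_1, ..., s_{k-2}, theta_1, ..., theta_{k-1}), k >= 3.
# r_uniform_polar() is a hypothetical helper written for this illustration.
r_uniform_polar <- function(k) {
  g <- rexp(k - 1)                     # independent standard exponentials
  s <- (g / sum(g))[1:(k - 2)]         # Dirichlet(1, ..., 1): uniform on the simplex
  theta <- runif(k - 1, 0, 2 * pi)     # independent uniform angles
  list(s = s, theta = theta)
}

set.seed(1)
P <- r_uniform_polar(8)                # e.g. 8 landmarks
print(P$s)
print(P$theta)
```

Replacing the exponentials by gamma variables with shapes p1, … , pk-1 (for example via rgamma) would give a general Dirichlet draw in the same way.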
General: Put a distribution on the simplex; put a distribution on each angle. String
them together independently. We will have a distribution on the polar space Ω, and
hence on the pre-shape space. The Dirichlet distribution on the simplex is a natural
choice. There is a plethora of distributions on the angle space [0, 2π) (circular
distributions; the von Mises distribution, for example).
30 male gorillas’ landmark data on the shape of skulls
30 female gorillas’ landmark data on the shape of skulls
8 landmarks
Identify differences
How?
Parametric approach
a. Put a Dirichlet distribution on the six-dimensional simplex. Estimate the
parameters of the distribution using the male gorillas’ data. Estimate the
parameters of the distribution using the female gorillas’ data. Interpret the
parameters.
b. Put a von Mises distribution on each angle space. Estimate the parameters of
the distributions separately for males and females. Interpret the parameters.
c. Examine the differences.
6. Excursus
Challenges
a. Introducing distributions on the pre-shape space
b. Fitting distributions
c. Goodness-of-fit
d. Cluster analysis of shapes
e. Shape space in the context of Bookstein coordinates.
f. Spline model approach to Shape analysis
```