Physical Simulation from Geometric First Principles
Course Notes for Physical Simulation
Etienne Vouga
The University of Texas at Austin
Version: 1.1
Last Revised: December 19, 2022
Preface
These notes are intended to accompany my graduate-level Physical Simulation computer science
elective course. They have been designed to stand alone, and to extend and elaborate on some of
the more technical points glossed over in the lectures; on the other hand, many of the more didactic
examples discussed in the lectures are not reproduced here.
Formatting and Conventions The notes are divided into chapters that very roughly correspond
to the class lectures. The first few chapters build on each other and explain fundamental mathematical concepts necessary for understanding the rest of the material; other topics are arranged
more haphazardly. Chapters can contain two special kinds of sections:
These “hold it” sections are a rhetorical aside to the material in the main text. They are
intended to be read inline with the chapter text and are used to
• anticipate common questions about the material;
• resolve apparent inconsistencies or contradictions in the exposition;
• provide additional insights and examples;
• qualify general/broad assertions with caveats and exceptions;
and are important to understanding the key chapter concepts.
Bonus Math On the other hand, “bonus math” sections are precisely that: optional, and often, significantly more technical material not required for understanding key chapter concepts. These contain
mathematical proofs of claims, generalizations and extensions of the main material, asides that might
require advanced knowledge of differential geometry or linear algebra to fully appreciate, etc. Feel free to
read these if you’re interested, or skip if you’re not.
Errors and Omissions I am continually revising these notes as I teach my course, and the
current version surely contains many errors both subtle and egregious. Feedback is welcome.
Contents

Preface . . . iii

0 Introductory Concepts . . . 1
  0.1 Configurations and Configuration Space . . . 3
  0.2 Trajectories and Tangent Space . . . 5
  0.3 Potential Energy . . . 6
  0.4 Kinetic Energy . . . 7

1 Inner Product Spaces . . . 9
  1.1 Vector Spaces, Bases, and Natural Properties . . . 9
  1.2 Covectors . . . 12
  1.3 Inner Products . . . 14
    1.3.1 Musical Isomorphisms . . . 16
  1.4 Other Linear Objects . . . 17
    1.4.1 Symmetry and Eigenvalues . . . 19
  1.5 Inner Product Spaces in Physics . . . 21

2 The Differential . . . 23
  2.1 Directional Derivative . . . 24
  2.2 The Differential . . . 25
  2.3 Worked Examples . . . 28
  2.4 The Hessian . . . 30
  2.5 Checking Your Work . . . 31

3 The Spring Potential . . . 33
  3.1 Deriving Hooke's Law . . . 33
  3.2 Springs in Higher Dimensions . . . 37
  3.3 Mass-Spring Systems . . . 39

4 Time Integration . . . 41
  4.1 Kinetic Energy and Momentum . . . 41
  4.2 Equations of Motion . . . 41
  4.3 Time Integrators . . . 43
  4.4 Newton's Method . . . 46
  4.5 What Makes a Good Time Integrator? . . . 48

5 Hamilton's Principle . . . 51
  5.1 The Lagrangian and Action . . . 52
  5.2 The Calculus of Variations . . . 53
  5.3 The Euler-Lagrange Equations . . . 54
  5.4 Discrete Hamilton's Principle . . . 56
  5.5 Discrete Euler-Lagrange Equations . . . 58

6 Handling Constraints . . . 61
  6.1 Reduced Coordinates . . . 63
  6.2 Penalty Method . . . 64
  6.3 Step and Project . . . 64
  6.4 The Method of Lagrange Multipliers . . . 66
  6.5 Constrained Hamilton's Principle . . . 68

7 Rigid Bodies in Two Dimensions . . . 71
  7.1 Stokes's Theorem . . . 73
  7.2 Area and Center of Mass . . . 74
  7.3 Simulating Rigid Bodies . . . 76
  7.4 A Rigid Body Time Integrator . . . 79

8 Lie Groups and Noether's Theorem . . . 81
  8.1 Non-Euclidean Configuration Spaces . . . 81
    8.1.1 Constructing Manifold Configuration Spaces . . . 82
  8.2 Matrix Lie Groups . . . 84
    8.2.1 Tangent Space at the Identity . . . 85
    8.2.2 Walking in Straight Lines . . . 86
    8.2.3 Walking Between Points . . . 87
  8.3 Noether's Theorem . . . 88

9 Rigid Bodies in Three Dimensions . . . 93
  9.1 Rotations in 3D . . . 93
  9.2 Representing 3D Rigid Bodies . . . 97
  9.3 The Exponential Map, Its Derivative, and Angular Velocity . . . 98
  9.4 Equations of Motion . . . 99
  9.5 Discrete Equations of Motion . . . 103

10 External Forces, Non-Conservative Forces, and Impulses . . . 107
  10.1 Work and the Lagrange-D'Alembert Principle . . . 108
    10.1.1 Non-conservative Forces . . . 108
    10.1.2 Non-configurational Forces . . . 109
  10.2 Impulses . . . 110
  10.3 Impulses in Time Integrators . . . 112

11 Inequality Constraints and Impact . . . 115
  11.1 Collision Detection . . . 115
  11.2 Inequality Constraints . . . 116
  11.3 The Penalty Potential . . . 118
  11.4 Collision Impulses . . . 118
    11.4.1 Relative Velocity . . . 119
    11.4.2 Computing the Impulse . . . 120
  11.5 Handling Multiple Impacts . . . 121
    11.5.1 The Special Case of Inelastic Collisions . . . 122

12 1D Rubber Bands . . . 125
  12.1 Discretization . . . 129
    12.1.1 Representing the Rubber Band with Piecewise Linear Elements . . . 129
  12.2 Kinetic Energy . . . 131
  12.3 Euler-Lagrange Equations . . . 132

13 Elasticity in Higher Dimensions . . . 135
  13.1 2D Sheets . . . 135
  13.2 2D Discretization . . . 140
    13.2.1 Kinetic Energy . . . 143
  13.3 Elasticity in 3D . . . 143

14 Principal Modes . . . 145
  14.1 Geometry of Equilibrium . . . 145
  14.2 Principal Modes . . . 147
  14.3 Dimension Reduction . . . 149

15 Dirichlet Energy and the Laplacian . . . 151
  15.1 Dirichlet Energy . . . 151
  15.2 The Laplacian . . . 153
  15.3 Discretizing the Laplacian . . . 156

A Debugging Hints . . . 161
  A.1 Why Is My Code So Slow? . . . 161
  A.2 I'm Getting NaNs! . . . 163
Chapter 0
Introductory Concepts
Physicists like to think that all you have
to do is say, ‘these are the conditions, now
what happens next?’
Richard Feynman
We begin by establishing what it even means to simulate physics. What kind of algorithms is
physical simulation concerned with? What are the inputs, what are the outputs, and what are the
key steps in writing down the algorithm?
There are really only three ingredients needed to define a physical system whose behavior we
are interested in studying:
1. The degrees of freedom or state variables of the system. If you take a snapshot of the system at
any given time, what quantities do you need to know in order to fully describe the snapshot?
For example, consider a perfectly elastic ball that is bouncing around a closed box without
gravity (like in a game of Pong). What are the degrees of freedom of this system? There are
many possible answers, actually:
• the x, y, and z coordinates of the ball. With only these three numbers, we know exactly
where the ball is at any snapshot of the system. But wait! This is assuming that the
ball is a point particle. What if the ball has a small, finite extent? Then just knowing
the position of the ball is not enough to uniquely define a snapshot, because we don’t
have information about the ball’s orientation;
• the position of the center of the ball, and a rotation that defines its orientation. Now we
can track how the ball spins over time. But, is this enough? What about when the ball
hits a wall? Doesn’t it deform slightly during impact? Maybe it’s an oversimplification
to assume that the ball is perfectly rigid. How can we represent this “squishing” of
the ball with only the center position and orientation of the ball? We are still missing
information;
• the positions of each atom in the ball (≈ 10^26 numbers). Now we can fully represent
how the ball squishes and stretches as it hits the walls and corners of the box. But this
is still only an approximation of
• the positions of the elementary particles making up each atom in the ball, which is a
classical approximation of
• the values of wave functions? quantum fields at every point in space? vibration modes
of strings? something else?
The key point here is that it is meaningless to talk about “true” degrees of freedom of a
system; these are most likely unknowable, and in any case, not particularly useful: it is
extremely unlikely we will see computer hardware capable of atom-level simulations of an
elastic ball within our lifetimes. Instead, we choose the degrees of freedom of each system
we are interested in simulating, based on a case-by-case evaluation of what details about the
motion of the system we truly care about, and what we can approximate away.
How do we know which degrees of freedom are important and need to be represented? We can
draw a lot of knowledge from physicists, who have constructed simplified models of objects
and compared how well these models approximate reality. We can also try adding more
degrees of freedom, or removing some, and checking if this affects the behavior of the system
to a degree that we care about.
What about the velocity of the ball, or its mass? We don’t treat the velocity of the
ball as a degree of freedom because it’s not something that you can measure from just
a snapshot of the ball at one instant in time. Rather, the velocity measures how the
position degree of freedom changes between several snapshots that happen to be located
close together in time. Put differently, knowing the velocity is required to reconstruct
the motion of the ball, but not to reconstruct one snapshot, and so it is not a degree of
freedom.
What about the mass of the ball? We could include it as a degree of freedom, and must include it
if it changes over time (for example, if the ball is made of ice and slowly sublimates over time). If a
quantity never changes (like the mass of an ideal elastic ball) there is no need to treat it as an explicit
degree of freedom; we can just treat it as a constant.
The “single snapshot” of a physical simulation, consisting of some concrete set of values for all
of the system’s degrees of freedom, is called the configuration of the system. We will discuss
configurations further later in this chapter;
2. the forces acting on the system. We leave off formalizing what a “force” is for now and appeal
to the intuitive picture of a force as “pulling” on degrees of freedom over time, causing them
to change value. Common examples of forces include gravity, the elastic force that causes a
spring to resist being compressed or stretched, and the contact force that prevents two objects
from tunneling through each other (by pushing them apart just hard enough to prevent them
from coming any closer, once they are already touching).
Just as we have a choice in what degrees of freedom to use to represent a physical system, we
also have a choice of which forces to include. You may have studied the so-called “fundamental
forces” of nature: gravity, the electromagnetic force, weak and strong nuclear forces, etc.
These are examples of forces, but just as you cannot in practice simulate an elastic ball by
representing each individual atom, you cannot simulate an everyday object by simulating
forces at the strong nuclear level! Therefore we make approximations: for example, you may
remember Hooke’s law F = −kx for a spring that has been stretched by a distance x; this is
not a fundamental force, but rather, it estimates the total effect of countless atoms pulling
on each other electrostatically within the spring’s steel material. For a physical simulation,
Hooke’s law is often good enough;
3. the equations of motion that tell us how the forces cause the degrees of freedom to evolve
over time. The law of motion you are probably familiar with is Newton’s second law F = ma;
this law tells us that forces cause accelerations on the degrees of freedom, which causes the
degrees of freedom to change over time, which causes the forces to change also, etc. To
simulate physics, we will need a way to turn the equations of motion into an algorithm for
updating the degrees of freedom over time. Such an algorithm is called a time integrator (it
integrates, i.e. advances, the physical system forward in the time direction). But before we
can talk about time integrators in a meaningful way, we need to formalize the concepts above
a bit more, which we will do in the next section.
Three ingredients: that’s all it takes to define a physical system, and write down an algorithm that
will simulate it. Of course, the devil is in the details, and the bulk of this course will cover how to
pick the right degrees of freedom and forces for simulating common types of objects, and how to
effectively evolve the objects over time using the right discretization of the equations of motion.
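As a preview of how the three ingredients become an algorithm, here is a deliberately crude sketch: the force-free Pong ball from above, advanced by the simplest conceivable time integrator. The step size, box extents, and reflection rule here are illustrative choices of my own, not the methods developed later in the course:

```python
import numpy as np

def step(q, qdot, h=0.01, lo=0.0, hi=1.0):
    """One explicit time step for a force-free ball in the box [lo, hi]^2.

    With no forces the acceleration is zero, so the velocity stays constant
    and the position drifts linearly; a wall simply reflects the velocity
    component that crossed it (an ideal elastic bounce).
    """
    q = q + h * qdot
    for i in range(2):
        if q[i] < lo or q[i] > hi:
            q[i] = np.clip(q[i], lo, hi)  # snap back onto the wall
            qdot[i] = -qdot[i]            # reflect that velocity component
    return q, qdot

q = np.array([0.5, 0.5])      # the ball's degrees of freedom (its position)
qdot = np.array([1.0, 0.25])  # its velocity: not a DOF, but needed for motion
for _ in range(100):
    q, qdot = step(q, qdot)
```

Even this toy exhibits the three ingredients: a choice of degrees of freedom (the 2D center), a choice of forces (none, plus an idealized contact rule), and an update rule playing the role of the equations of motion.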
0.1 Configurations and Configuration Space
In the introduction to this chapter we defined degrees of freedom as the variables in a system that
change over time. Suppose we have n degrees of freedom in a physical system. We can take each
of these variables x1 , x2 , . . . xn and store them in a single large vector called the configuration of
the system, which we will represent using the variable q (we will use bold-faced letters to denote
vectors, here and throughout the rest of these notes):

$$\mathbf{q} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^n$$
Each different vector q corresponds to a different possible “snapshot” of the physical system: the
space of all possible configurations is called the configuration space Q. In this case, Q = R^n. Let
us look at some examples.
1. Spring in 1D. Consider a spring on the 1D line, with one end anchored at x = 0 and the other
at x = c. One natural degree of freedom here is the second endpoint position c; the
configuration space is thus just the real line R and any configuration q = [c] in configuration
space is just a 1D vector, i.e., one real number.
2. Two balls in 1D. Now suppose we have two balls on the 1D line, located at x = c1 and
x = c2. One possible set of degrees of freedom is just {c1, c2}, so that Q = R^2 is the real
plane, and q = [c1, c2]^T is a point in the plane. We see in this example already a key concept:
a single point in high-dimensional configuration space can represent the positions of several
objects in the “physical space.” We can push this idea to encompass arbitrarily many objects
in arbitrary ambient dimension:
3. Pool balls in 2D. Suppose we have a triangle of 15 pool balls on a pool table (which we can
approximate by the 2D plane, if we ignore the edges of the table and pockets). We will also
ignore the rotation or deformation of the balls, so that we can choose as degrees of freedom
the centers of all of the balls, x1 , y1 , x2 , y2 , . . . , x15 , y15 . Since there are 15 different balls and
each ball has two degrees of freedom, the configuration space is a whopping 30 dimensions,
and a configuration is a concatenation of all of these DOFs,

$$\mathbf{q} = \begin{bmatrix} x_1 \\ y_1 \\ x_2 \\ y_2 \\ \vdots \\ x_{15} \\ y_{15} \end{bmatrix} \in Q = \mathbb{R}^{30}.$$
Why did we interleave the xs and ys in this way? What principle led us to doing so?
Absolutely none. We pick the degrees of freedom, and likewise, we get to pick how to
pack them together into a configuration: there is nothing physically meaningful about
this packing!
Keeping the DOFs of individual objects together is sometimes convenient; in the language of computer science, this preserves the spatial locality of information about the
physical system, and can lead to simpler/nicer formulas once we start doing calculations with the
configuration. But we could just as well have grouped all the xs first, then the ys; or shuffled together
all of the DOFs and stored them in random order. All are valid ways of representing a configuration
of 15 pool balls.
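The freedom of packing can be made concrete with a short sketch (the helper `ball_center` and the dummy coordinates are mine): two different packings of the same 15 ball centers, both equally valid points of Q = R^30. Only the bookkeeping for reading a ball's center back out differs.

```python
import numpy as np

centers = np.arange(30, dtype=float).reshape(15, 2)  # 15 (x, y) ball centers

# Packing 1: interleaved [x1, y1, x2, y2, ...], as in the text.
q_interleaved = centers.reshape(30)

# Packing 2: all the xs first, then all the ys.
q_split = np.concatenate([centers[:, 0], centers[:, 1]])

def ball_center(q, i, packing):
    """Recover ball i's center from a configuration vector, given the
    packing convention that was chosen when q was assembled."""
    if packing == "interleaved":
        return q[2 * i : 2 * i + 2]
    return np.array([q[i], q[15 + i]])
```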
4. Two balls in 1D, take two. Remember, the degrees of freedom are not something intrinsic
to a physical system: we choose them, and there can be multiple choices. Let us pick different
degrees of freedom for the pair of balls on the 1D line: the center point between them,
p = ½(c1 + c2), and the separation between them, s = c2 − c1. The configuration space is
still the plane Q = R^2, but each configuration q has different meaning—represents a different
snapshot state—than when we picked the ball coordinates as the DOFs.
5. Pendulum in 2D. Let’s do one more example, a simple pendulum in the plane. We will
assume one end of the pendulum is fixed (at (0, 0), say) and the other end is free to swing
back and forth. We can use, as our degree of freedom, the angle θ that the pendulum makes
with the y-axis, so that the position of the free end of the pendulum is given by
$$x = L \sin\theta, \qquad y = -L \cos\theta, \qquad (1)$$
for a fixed pendulum length L. In that case Q = R and q = [θ]. We have a one-dimensional
configuration space, even though our object lives in two-dimensional space. There is no
fundamental problem with this arrangement.
There is something fishy with this last example: although it is true that every snapshot
of the pendulum corresponds to some value of θ, this representation is not unique: the
pendulum at θ = 0 is in the same position as the pendulum at θ = 2π, or θ = 2πk for
any integer k.
This objection is well-founded. Perhaps a better choice of configuration space would be
Q = S^1, the one-dimensional circle (which we can represent as “real numbers mod 2π”,
i.e. real numbers in [0, 2π), with the 2π endpoint glued to the zero endpoint): q can then move around
on this circle, but if the pendulum swings around in a complete circle, q will return to its initial value,
rather than being offset by 2π. This example reveals the fact that the natural configuration space for
a physical system is not always a Euclidean space.
The notion of configuration space can get even more messy. For example, consider a piece of spaghetti
moving around in space. For now, it’s a single object, but later, it might break apart into a large,
unknown number of pieces. What is the configuration space? Do we treat the spaghetti as being
“pre-broken,” and use a very large number of degrees of freedom for the spaghetti, even while it is one
piece? Do we swap dimensions of Q at the moment the spaghetti breaks?
In this course we will focus almost exclusively on Euclidean configuration spaces. We do so for
several reasons. First, it is the simplest possible theoretical setup, and while it is not too difficult to
extend the theory to non-Euclidean spaces (see Bonus Math below), these complications muddy the
waters in a way that is unhelpful your first time encountering physical simulation. Second, there is
a very practical reason for focusing on Q = R^n: when you go to implement a physical simulation,
you will be representing q as an array of doubles, which discretize R^n. We might as well stick to
configuration spaces that are easiest to translate into code. Third, although it is true that the global
configuration space of the pendulum is more accurately represented by S^1 than by R, locally we can
split configuration space into pieces that are each Euclidean, as formalized below in the Bonus Math,
so that if we know how to do simulation on a Euclidean configuration space, it’s not too difficult to
extend the code to work on more complicated spaces.
Finally, we could insist that the configuration space really is R—this is not incorrect. In that case
we’re simply including an extra piece of information as a degree of freedom, namely, the total number
of times that the pendulum has wound clockwise around the pivot point since the start of its motion.
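A few lines of code make the pendulum's configuration map, and the θ versus θ + 2π redundancy, concrete. The function name is my own; equation (1) supplies the map, with L = 1:

```python
import math

def pendulum_endpoint(theta, L=1.0):
    """Map the pendulum's single DOF theta to its free endpoint,
    using equation (1): x = L sin(theta), y = -L cos(theta)."""
    return (L * math.sin(theta), -L * math.cos(theta))

# theta = 0 and theta = 2*pi are different points of Q = R,
# but describe the identical snapshot of the pendulum:
a = pendulum_endpoint(0.0)
b = pendulum_endpoint(2.0 * math.pi)

# Choosing Q = S^1 instead amounts to reducing theta mod 2*pi,
# which collapses these representatives to one configuration:
theta_on_circle = (2.0 * math.pi) % (2.0 * math.pi)
```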
The last example of the pendulum raises an interesting topic. What if we want to restrict
the allowed configurations in a configuration space? For example, for the pendulum, suppose we
instead choose as our degrees of freedom the endpoint coordinates (x, y) of the non-fixed end of the
pendulum. The problem now is that only some of the configurations q are valid: the rest correspond
to non-physical states since the pendulum length is wrong. In fact, the set of valid configurations
looks like a circle, inside the larger two-dimensional full configuration space. Another example is
the pair of balls: what if we want to insist that the balls can never tunnel through each other, so
that only configurations with c1 < c2 are valid? This restriction cuts Q in half along a straight
line, with one half being valid configurations, and the other half being illegal, since configurations
there correspond to balls that have tunneled.
The idea of defining a large configuration space, and then restricting the region of allowable
configurations q, is an extremely useful one, and we will cover it in depth in a later chapter. For
now, though, for simplicity’s sake we will assume that we do not have any restrictions: that any
q ∈ Q is physical.
0.2 Trajectories and Tangent Space
Now that we have defined configuration space, we can define a trajectory: it is simply a continuous
curve q(t) : R → Q. This curve might represent the true physical motion of an object, with t
the time and q(t) returning the state of the system at time t, but it doesn’t have to: we can also
write down many hypothetical trajectories that are totally crazy, nonphysical paths that represent
motions that a real-world object would never undergo.
A trajectory q(t) doesn’t have to be smooth; even a physical trajectory might have kinks
(corresponding, for example, to times t at which objects collide against each other, and suddenly
change course). At times when q(t) is differentiable, we can write down a configurational velocity
$$\dot{\mathbf{q}}(t) = \frac{d\mathbf{q}}{dt}(t).$$

Here we use the physicist’s convention that a dot over a variable means a time derivative. Notice
that q̇ is a vector with the same dimensions as q, and just like q is a concatenation of all of the
degrees of freedom of a system, q̇ is a concatenation (of the same variables, in the same order!) of
the velocities (i.e., time derivatives) of each of the degrees of freedom.
At a given configurational point q ∈ Q, there are many possible curves you could draw passing
through q, and each can have a different velocity at the value of t where the curve passes over q.
The space of all possible configurational velocities at q is called the configuration tangent space
TQ_q. If Q = R^n is Euclidean, then for every point q the configuration tangent space is also
Euclidean, and has the same dimension, TQ_q = R^n. Elements of TQ_q are called configuration
tangent vectors at q.
Note that just as configuration q can be viewed as a bunch of positions “packed together” into
a single vector, a configuration tangent vector q̇ is a bunch of different velocities “packed together.”
It is important to understand that a configurational velocity only has meaning when attached to
a particular configuration q; consider again, for instance, the pendulum whose degree of freedom
is the angle θ, with endpoint expressed in terms of θ as in equation (1). A configurational velocity
θ̇ = 1 at θ = 0 corresponds to the pendulum’s endpoint being at (0, −L) and having velocity of L
pointing right; at a different configuration θ = π/2, the pendulum is at (L, 0) and a velocity with
the exact same numbers, θ̇ = 1, corresponds to endpoint velocity of L pointing upwards.
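The pendulum endpoint velocities quoted above follow from pushing θ̇ through equation (1) with the chain rule, giving (L θ̇ cos θ, L θ̇ sin θ). A quick check, with a function name of my own and L = 1:

```python
import math

def endpoint_velocity(theta, theta_dot, L=1.0):
    """Chain rule applied to equation (1): x = L sin(theta), y = -L cos(theta),
    so (dx/dt, dy/dt) = (L cos(theta) * theta_dot, L sin(theta) * theta_dot)."""
    return (L * math.cos(theta) * theta_dot, L * math.sin(theta) * theta_dot)

# The same configurational velocity, theta_dot = 1, at two configurations:
v_bottom = endpoint_velocity(0.0, 1.0)          # at theta = 0: L pointing right
v_side = endpoint_velocity(math.pi / 2.0, 1.0)  # at theta = pi/2: L pointing up
```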
Bonus Math A very useful generalization of the discussion above is to the case where Q is no longer
Euclidean, but instead is an arbitrary Riemannian manifold M . The full definition of a manifold is a bit
technical, but the key idea is that you can decompose M into overlapping regions M = R1 ∪ R2 ∪ · · · ∪ Rk ,
so that each region is “Euclidean-like”: there exist functions φ_i : R_i → R^n that are continuous and have
continuous inverses, so that whenever R_i and R_j overlap, φ_i ∘ φ_j^{-1} : R^n → R^n is smooth.
The φi are called charts and the full collection of charts is called an atlas. The names are evocative:
imagine a sphere. The entire sphere cannot be parameterized by a region of the plane, but you can divide
the sphere into a northern hemisphere and a southern hemisphere, with a small strip of overlap near the
equator. You can now flatten each hemisphere onto a page of a paper atlas: if you are a ship’s navigator,
and want to plot a course, you can do calculations on one page of the atlas, and if you ever “wander near
the edge” by getting too close to the equator, you can flip to the other page, and continue to work there.
This metaphor exactly matches what you would do in a physical simulation in the code.
Potentials, trajectories, configurational velocity vectors, etc. still work in this setting, but the warning
above about configurational velocities being well-defined only with respect to a single anchor point is even
more important: a velocity vector at one point of the sphere might be tangent to the sphere, but if you
blindly copy-paste it onto a different part of the sphere, it may no longer be tangent!
0.3 Potential Energy
The second ingredient of a physical system that we listed above are forces that describe how objects
in the system interact with each other. In this course, we will focus mostly on conservative forces:
those which arise from a potential energy over configuration space.
A potential is simply a scalar function V : Q → R on configuration space. It assigns an energy
cost to each configuration q; intuitively, the force associated with V tries to pull the configuration
from regions of Q with high cost to those with low cost. It is common when reading about general
relativity to encounter the “rubber sheet analogy,” and it is also a very useful tool for visualizing
and understanding how potentials affect the physical trajectory q(t) of a system: you can think of
V as defining a height at every point over Q, and q(t) as the path of a marble that is rolling over
this landscape. The force associated with the energy V constantly tugs on q in the down-slope
direction. We will formalize this intuitive picture in the next few chapters.
For now, consider some example potential functions:
• 1D Spring Energy For the physical system described earlier consisting of a single spring
whose degree of freedom is the second endpoint coordinate x = c (with the first endpoint
anchored at x = 0), we could describe the spring force using the simple quadratic potential
$$V(c) = \frac{1}{2} c^2.$$
Notice this behaves intuitively like we think a spring should: the potential is “bowl-shaped”
with a minimum at c = 0, which makes sense since the spring is at rest when retracted. If we
pull the particle to either the left (c < 0) or the right (c > 0), the force pulls in the “downhill”
direction, to the right or left, respectively, to restore the spring to its retracted state.
• Gravity For a particle system whose degree of freedom is a single particle q = (x, y, z) in
R^3, we can encode gravity (in the y direction) by the potential
V (q) = gy
for some positive constant g. Notice that this potential has no minimum: it grows more and
more positive as y gets larger, and more and more negative as y gets smaller; the force pulls
in the downhill direction of the potential, which is in this case always in the −y direction.
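The “downhill” intuition can be checked numerically. Here is a sketch for the 1D spring potential, taking on faith for now that the force points opposite the slope (this is formalized in coming chapters); the helper names and finite-difference step size are illustrative choices of mine:

```python
def V_spring(c):
    """1D spring potential V(c) = c^2 / 2."""
    return 0.5 * c ** 2

def slope(f, x, h=1e-6):
    """Centered finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# The force tugs opposite the slope, i.e. downhill toward c = 0:
pull_from_left = -slope(V_spring, -2.0)  # positive: pulls right, toward 0
pull_from_right = -slope(V_spring, 2.0)  # negative: pulls left, toward 0
```

The same check on the gravity potential V(q) = gy would report a constant downhill pull in the −y direction at every configuration, matching the discussion above.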
0.4 Kinetic Energy
We have already defined potential energy V , which assigns a scalar energy to every configuration
q ∈ Q. We now need a second kind of energy: kinetic energy, or energy inherent to motion of
degrees of freedom. Formally, kinetic energy is a scalar function of configurational velocity and
position,
T (q, q̇) : Q × T Q → R.
Perhaps you remember from high school physics that the kinetic energy of a single point particle
with mass m and velocity v is

$$T = \frac{1}{2} m \|\mathbf{v}\|^2,$$

and generalizing this formula to k particles in n dimensions, $\mathbf{q} = \begin{bmatrix} \mathbf{q}_1 & \mathbf{q}_2 & \dots & \mathbf{q}_k \end{bmatrix}^T$, with masses
$m_1, m_2, \dots, m_k$, we get

$$T = \frac{1}{2} \sum_{i=1}^{k} m_i \|\dot{\mathbf{q}}_i\|^2,$$
which can be rewritten in a particularly compact and convenient form:
1
T = q̇T M q̇
2

m1 In×n
0
...
0

0
m
I
.
.
.
0
2
n×n

M =
..
..
..
.
..

.
.
.
0
0
0 mk In×n





,
nk×nk
where In×n is the n × n identity matrix, and M is a square matrix (the same size as the total
number of degrees of freedom, nk) called the configurational mass matrix. Notice that M is a
diagonal matrix; almost all entries of M are zero.¹
Why does kinetic energy have this form? Historically it was first hypothesized based on
experiments with the impact energy of dropped weights, and I don’t know of any way to prove
it from first principles. One can show that this formula is needed to guarantee conservation
of energy, given Newton’s second law (discussed in later chapters); however this reasoning is
circular since there is no particular reason to take either conservation of energy or Newton’s
second law on faith, and both will be shown to be consequences of Hamilton’s principle, which
needs a formula for kinetic energy. We will assume that the kinetic energy formula is part of the definition
of the degrees of freedom: something we have to formulate based on our analysis of the system in terms of
simpler parts, or accept as experimental fact from the physicists.
Notice that this formula is only for collections of point particles; we will see later in the course
when we study rigid bodies that kinetic energy can become much more complicated, and can even
depend on position q in addition to velocity q̇. But T = (1/2) q̇ᵀ M q̇ is such a common
special case (since many physical systems can be built up from collections of point particles) that
we will often assume that the kinetic energy can be written in this form (which we will call
the standard form of kinetic energy), as it often greatly simplifies formulas and calculations.
¹ This means that when implementing M in practical codes, especially when nk is large, it is critical to use a
sparse matrix data structure to represent M , rather than a dense array that explicitly stores in memory all of
the zeros.
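In code, the standard form is a one-liner once M is assembled. A sketch using scipy's sparse diagonal storage, as the footnote suggests (the particle count and masses here are invented for illustration):

```python
import numpy as np
import scipy.sparse as sp

# k point particles in n dimensions; masses chosen arbitrarily for illustration.
k, n = 4, 3
masses = np.array([1.0, 2.0, 0.5, 3.0])

# M = diag(m_1 I_{nxn}, ..., m_k I_{nxn}) stored sparsely: only the nk
# diagonal entries are kept in memory, not the (nk)^2 dense array.
M = sp.diags(np.repeat(masses, n))          # nk x nk, diagonal

qdot = np.arange(k * n, dtype=float)        # some configurational velocity

# T = (1/2) qdot^T M qdot
T = 0.5 * qdot @ (M @ qdot)

# Agrees with the per-particle sum (1/2) sum_i m_i ||qdot_i||^2:
T_sum = 0.5 * sum(m * np.dot(v, v) for m, v in zip(masses, qdot.reshape(k, n)))
assert np.isclose(T, T_sum)
```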
Chapter 1
Inner Product Spaces
There aren’t enough small numbers to
meet the many demands made of them.
Richard Guy
Before we get to any actual physical simulation, we need to review some crucial mathematical
background. Ostensibly this material is covered as part of a first course on elementary linear
algebra, but in my experience much about inner product spaces is either not taught at all, or
taught poorly, so we will go over it again here. It is tempting to skip this chapter, if you already
have a working knowledge of linear algebra, but do not do this unless you truly are fully comfortable
with the distinction between vectors and covectors, index raising and lowering, etc.
1.1 Vector Spaces, Bases, and Natural Properties
Suppose we have a vector space V of finite dimension d.¹ We can pick a basis of d linearly independent vectors B = {b1 , b2 , . . . , bd } for V and represent all elements of V as linear combinations
in this basis:

v = Σᵢ₌₁ᵈ αᵢ bᵢ ,    αᵢ ∈ R.

We compactly write these coefficients αi as a (column) vector,

v = (α1 , α2 , . . . , αd )ᵀB .
This convention is by now extremely familiar to you, but notice some subtleties about this representation: the left-hand side, v, is an object in the vector space V . It exists independently of what
basis B we chose, and exists even if we didn’t bother to choose a basis at all! On the other hand,
¹ Here and throughout these notes we assume all vector spaces are over the real numbers, unless otherwise specified.
But much of the information here applies unmodified if you use C or some other favorite base field instead.
the object on the right-hand side is just a bucket of numbers. It has no meaning independent of
the basis B we have chosen.
The same vector can have different representations as columns-of-numbers depending on the
basis we choose for V . The same column-of-numbers can represent different vectors, if we choose
different bases for V . In the above I wrote a subscript B on the column-of-numbers to indicate the
intimate association between the numbers and the choice of B. This point is an absolutely crucial
one.
The word “vector” has become overloaded to refer to both the intrinsic object v, and the
column-of-numbers, which is rather unfortunate and a source of unending confusion. To try to
mitigate this overloading, let us call the column-of-numbers representation on the right the matrix
form of the vector v and write it as [v], with brackets around the vector.²
Example Let us look at a somewhat exotic vector space, the space P2 of all polynomials of degree
2 or less. This set is indeed a well-defined vector space, since the sum of two such polynomials is still
a polynomial of degree 2 or less, and multiplying a polynomial in P2 by a real number yields a new
polynomial still in P2 . A basis for this vector space (called the monomial basis) is B = {1, x, x2 };
these basis vectors are obviously linearly independent, and reveal that P2 has dimension three.
The polynomial x² − 2x + 1 is a vector in P2 , and its matrix form with respect to the chosen
basis is

(1, −2, 1)ᵀB .
We could have picked a different basis instead; for example B̃ = {x2 − 1, x + 1, x − 1}. In this new
basis, the same vector has matrix form

(1, −2, 1)ᵀB ≡ (1, 0, −2)ᵀB̃ .                                   (1.1)
Here I’ve used ≡ to mean “represents the same vector as”; of course, the contents of the two
matrices (the actual numbers in the bucket) have different values.
Given a basis B for a vector space V , we can write a second basis B̃ in terms of the first by
expressing every basis vector b̃j as a linear combination of the bi :
b̃ⱼ = Σᵢ₌₁ᵈ bᵢ tᵢⱼ ,

or, abusing notation a bit,

[B̃] = [B] T,

where the basis vectors of each basis have been assembled into a collection of columns

[B] = (b1 b2 · · · bd )

and each column t·j of T tells us how to construct b̃ⱼ as a linear combination of the bᵢ , in a manner
evocative of matrix multiplication. Of course, [B] is not really a matrix, unless we express the basis
vectors of B and B̃ in terms of some mutual third basis.

² Yes, brackets will also be used throughout these notes for grouping, not just to denote the matrix form of vectors
etc. Hopefully the context makes the usage clear.
In particular, given a vector v in matrix form with respect to the second basis, this relation
allows us to write down the matrix form of the vector with respect to the first basis:

(α1 , α2 , . . . , αd )ᵀB̃ ≡ Σⱼ₌₁ᵈ αⱼ b̃ⱼ = Σⱼ₌₁ᵈ Σᵢ₌₁ᵈ αⱼ bᵢ tᵢⱼ = Σᵢ₌₁ᵈ bᵢ ( Σⱼ₌₁ᵈ tᵢⱼ αⱼ ) = Σᵢ₌₁ᵈ (T α)ᵢ bᵢ ≡ [T α]B ,
where the matrix T with entries tij can now be interpreted in two ways:
• as we already saw above, its columns are the matrix representations of the new basis vectors
b̃, expressed with respect to the old basis vectors;
• it maps, via matrix multiplication, the matrix representation of a vector with respect to B̃ to
the matrix representation of the same vector with respect to B.
These two interpretations are counterintuitive and perhaps confusing, since T seems to map in
opposite directions! If in doubt, we can verify our understanding by testing the action of T on test
cases that are easy to reason about: for example, consider the vector [ei ], whose i-th entry is 1 and
whose other entries are zero. T [ei ] computes the i-th column of T which, per the discussion above,
gives us the coefficients for the linear combination of basis elements in B which produces b̃i . So the
vector b̃i has matrix forms [ei ]B̃ and (T [ei ])B , and T maps from the matrix form with respect
to B̃ to that with respect to B, consistent with the second bullet above.
A consequence of the above facts is that the inverse T −1 maps matrix forms of vectors in the B
basis to the matrix form in the B̃ basis.
Example For P2 and the two bases above, we have that

x² − 1 = −1 · 1 + 0 · x + 1 · x²
x + 1 = 1 · 1 + 1 · x + 0 · x²
x − 1 = −1 · 1 + 1 · x + 0 · x²

so that

(x² − 1  x + 1  x − 1) = (1  x  x²) T,

with

T = ⎡ −1  1  −1 ⎤        T −1 = ⎡   0     0     1  ⎤
    ⎢  0  1   1 ⎥ ,             ⎢  1/2   1/2   1/2 ⎥ .
    ⎣  1  0   0 ⎦               ⎣ −1/2   1/2  −1/2 ⎦

From these matrices we can calculate

[x² − 2x + 1]B̃ = T −1 [x² − 2x + 1]B = T −1 (1, −2, 1)ᵀ = (1, 0, −2)ᵀ ,

in agreement with our previous computation in Equation (1.1).
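This change-of-basis bookkeeping is easy to check numerically. A sketch with numpy, representing each polynomial only by its coefficient column (which is all a matrix form is):

```python
import numpy as np

# Columns of T express the new basis {x^2 - 1, x + 1, x - 1} in the
# monomial basis {1, x, x^2}: column j holds the coefficients of b~_j.
T = np.array([[-1.0, 1.0, -1.0],
              [ 0.0, 1.0,  1.0],
              [ 1.0, 0.0,  0.0]])

v_B = np.array([1.0, -2.0, 1.0])      # x^2 - 2x + 1 in the monomial basis

# T^{-1} maps matrix forms in B to matrix forms in B~:
v_Btilde = np.linalg.solve(T, v_B)
assert np.allclose(v_Btilde, [1.0, 0.0, -2.0])   # matches Equation (1.1)

# And T maps back the other way: [v]_B = T [v]_B~.
assert np.allclose(T @ v_Btilde, v_B)
```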
Natural Quantities Let us call an object (number, function, etc) involving V natural if it does
not depend on the choice of basis for V . The only natural property of V itself is its dimension:
this is because every pair of vector spaces with the same dimension (and the same base field) are
isomorphic: indistinguishable from each other. This fact is the linear algebra equivalent of the
law of small numbers: there are too few vector spaces to meet the many demands made of them.
“Accidental” isomorphisms between vector spaces abound, and objects with the same matrix form
look like they are the same even though they represent totally different things (the matrix form of
a polynomial in P2 could also represent a direction in 3D Euclidean space, for example).
An example of a natural property of a set of vectors is linear dependence: whether or not vectors
are linearly independent doesn't depend on the basis used to represent those vectors in matrix form.
The first coordinate of a vector is not a natural property, since obviously, changing the basis will
change the coordinates of its matrix representation.
1.2 Covectors
Given a vector space V , let V ∗ be the set of linear functions ν : V → R. Such functions ν are
sometimes called covectors. Recall that a linear function must satisfy the property
ν(v + αw) = ν(v) + αν(w)
for v, w ∈ V and α ∈ R. Observe that V ∗ is a vector space: adding two linear functions, or scaling
a linear function, yields another linear function.
Although V ∗ might look at first like a very large space, the linearity requirement is a very
restrictive one. In fact, given a basis B for V , every linear function ν ∈ V ∗ is completely determined
by its action on the basis vectors bᵢ ∈ B: for if v = Σᵢ₌₁ᵈ αᵢ bᵢ , then

ν(v) = Σᵢ₌₁ᵈ αᵢ ν(bᵢ ),

by repeated application of the linearity property. If we know ν(bᵢ ), we know ν(v) for any vector
v.
Taking this idea further, we can write covectors ν in matrix form as row vectors:

[ν]B = (ν(b1 ) ν(b2 ) · · · ν(bd )) .

Now covectors act on vectors by ordinary matrix multiplication (the dimensions make sense, since
the product of a row vector with a column vector is a scalar.) Indeed

ν(v) = Σᵢ₌₁ᵈ αᵢ ν(bᵢ ) = (ν(b1 ) ν(b2 ) · · · ν(bd )) (α1 , α2 , . . . , αd )ᵀ = [ν]B [v]B .
We can also calculate how covectors transform under change of bases. If b̃ⱼ = Σᵢ₌₁ᵈ bᵢ tᵢⱼ , then

ν(b̃ⱼ ) = Σᵢ₌₁ᵈ ν(bᵢ ) tᵢⱼ = ([ν]B T )ⱼ

and [ν]B̃ = [ν]B T. Notice that this rule is different than the rule for vectors, which transform by
multiplication on the left by T −1 .
We could have intuited this rule just by “symbol-pushing” rather than from careful derivation:
we know that the evaluation ν(v) is a natural quantity, so must be the same when ν and v are
expressed in any basis:
ν(v) = [ν]B [v]B = [ν]B̃ [v]B̃ = [ν]B̃ T −1 [v]B
and now it is “obvious” that we need [ν]B̃ = [ν]B T to “cancel out” the T −1 .
Example Let’s return to our vector space P2 of quadratic polynomials. Plugging in x = 2 into a
polynomial turns out to be a linear function of polynomials, since if f, g ∈ P2 , f (2)+g(2) = (f +g)(2)
and (αf )(2) = αf (2). So, perhaps surprisingly, evaluation at x = 2 is a covector ν, and we can
compute its matrix form in the B = {1, x, x2 } basis by testing it on each basis vector:
[ν]B = (ν(1) ν(x) ν(x²)) = (1 2 4) .

Does this actually work? Let's try it on our example polynomial v = x² − 2x + 1:

ν(v) = [ν][v] = (1 2 4)(1, −2, 1)ᵀ = 1 · 1 + 2 · (−2) + 4 · 1 = 1,

which is indeed 2² − 2(2) + 1.
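Covectors-as-rows is directly implementable. A quick numpy sketch of this evaluation-at-x = 2 example, also checking the transformation rule [ν]B̃ = [ν]B T against the basis B̃ from earlier:

```python
import numpy as np

# Matrix form of "evaluate at x = 2" in the monomial basis {1, x, x^2}:
# a row vector obtained by testing the covector on each basis vector.
nu_B = np.array([1.0, 2.0, 4.0])          # (nu(1), nu(x), nu(x^2))
v_B = np.array([1.0, -2.0, 1.0])          # x^2 - 2x + 1

assert np.isclose(nu_B @ v_B, 1.0)        # 2^2 - 2*2 + 1 = 1

# Covectors transform with T (not T^{-1}); the pairing nu(v) is natural,
# so it comes out the same in either basis.
T = np.array([[-1.0, 1.0, -1.0],
              [ 0.0, 1.0,  1.0],
              [ 1.0, 0.0,  0.0]])
nu_Btilde = nu_B @ T                      # [nu]_B~ = [nu]_B T
v_Btilde = np.linalg.solve(T, v_B)        # [v]_B~  = T^{-1} [v]_B
assert np.isclose(nu_Btilde @ v_Btilde, nu_B @ v_B)
```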
Basis for Covectors When we wrote ν in matrix form, we implicitly chose a basis for V ∗ ,
although it went by so fast you may have missed it:
ν = Σᵢ ν(bᵢ ) b∗ᵢ ,

where B ∗ = {b∗ᵢ}ᵈᵢ₌₁ are basis covectors satisfying

b∗ᵢ(bⱼ ) = δᵢⱼ = { 0, i ≠ j;  1, i = j }.
When evaluating ν at v, each basis covector b∗i “plucks out” the bi coefficient of v. In coordinates,
[bi ]B = ei ; [b∗i ]B∗ = eTi . A basis chosen for V thus extends to a basis for V ∗ via this pairing, and
so B ∗ is often called the dual basis of B. However, this relationship between corresponding vectors
and covectors is not natural! Vector/covector pairs in one basis may not be paired in another basis.
This is a key yet very confusing claim. In our running P2 example with basis {1, x, x2 }, for
instance, there is a pairing between the vector v with matrix form

(1, −2, 1)ᵀ = 1 · b1 − 2 · b2 + 1 · b3

and the covector ν with matrix form

(1 −2 1) = 1 · b∗1 − 2 · b∗2 + 1 · b∗3 .
How can this pairing not be natural? Transposing a vector into a covector seems like the most natural
operation in the world.
But it’s not. Algebraically this is clear by noticing what happens to each of these two objects when you
change the basis for V , and the corresponding dual basis for V ∗ . We have that [v]B = [ν]TB∗ , but generally
speaking
[v]B̃ = T −1 [v]B = T −1 [ν]ᵀB∗ ≠ T ᵀ [ν]ᵀB∗ = ([ν]B∗ T )ᵀ = ([ν]B̃∗ )ᵀ .
Concretely, for the new basis B̃ = {x2 − 1, x + 1, x − 1}, we already computed that

[v]B̃ = (1, 0, −2)ᵀ .
We can also compute

[ν]B̃∗ = [ν]B∗ T = (1 −2 1) T = (0 −1 −3),

and in this new basis v and ν are no longer transposes of each other when written in matrix form.
The fact that v and ν were paired as dual vectors was just an accident of the choice of basis B; it wasn’t
any special property of the vector/covector pair.
A quick final note: in physics and engineering, covectors are often called covariant vectors, since
they transform “in the same way” as the basis vectors (using T ) during a change of coordinates.
Ordinary vectors are called contravariant vectors since they transform using T −1 instead of T . We
won’t use this terminology anywhere in these notes, but it’s good to be able to connect terminology
across different fields.
1.3 Inner Products
An inner product space (V, g) is a vector space V along with a special function g : V × V → R,
called the inner product or metric of V , which measures lengths and angles of vectors in V . This
inner product must satisfy several properties:
• linearity: g(v, w + αu) = g(v, w) + αg(v, u), and likewise for the first parameter;
• symmetry: g(v, w) = g(w, v);
• positivity: g(v, v) > 0 except when v = 0.
Sometimes angle brackets are used to represent the inner product: ⟨v, w⟩g = g(v, w). The g
subscript is often omitted from the angle brackets as well, when the inner product that should be
used is obvious from context.
Notice that by linearity, g(0, 0) = 0. The inner product applied to the same vector twice,
g(v, v), should be interpreted as measuring the (always non-negative) squared length of the vector
v.
Just like a vector space is a set with some extra structure (addition of vectors and multiplication
by scalars), an inner product space is a vector space with additional structure (the ability to measure
lengths). Perhaps the most familiar inner product space is Euclidean space Rn , with g the ordinary
Euclidean dot product. It is easy to check the dot product satisfies all of the properties above. We
can write down more exotic inner product spaces, though. For example, let us turn our polynomial
vector space P2 into an inner product space by defining the metric

g(v, w) = ∫₀¹ v w dx.

It is easy to see that this g satisfies all three properties of an inner product for quadratic polynomials
in P2 . (Defining an inner product by integration in this way is a rather common trick, and the
resulting inner product is called the L2 inner product on P2 .)
The length √g(v, v) of a vector is often called its norm, and written ‖v‖g (where again, the g
is sometimes omitted if it is obvious which inner product to use). We can normalize nonzero vectors
in an inner product space by rescaling them into unit vectors with norm one: v̂ = v/√g(v, v). If
applying g to the same vector twice measures (squared) length, applying g to two different unit
vectors can be thought of as measuring the cosine between the vectors, by analogy to the equivalent
formula for vectors in the plane when g is the ordinary Euclidean dot product:

cos θv,w = g(v, w) / √( g(v, v) g(w, w) ) = g(v̂, ŵ).
What it means to have an “angle” between, say, polynomials, I leave to your imagination: however,
intuitively, g(v̂, ŵ) measures how closely the vectors “align” to each other.
Inner Product Matrix Form Symmetry and linearity of the inner product means that we can
encode all possible inner products in terms of just a few coefficients, once we choose a basis B for
V . If v = Σᵢ₌₁ᵈ αᵢ bᵢ and w = Σᵢ₌₁ᵈ βᵢ bᵢ , then by applying linearity repeatedly we have that

g(v, w) = Σᵢ₌₁ᵈ Σⱼ₌₁ᵈ αᵢ βⱼ g(bᵢ , bⱼ )
and any inner product g is determined by the d2 coefficients gij = g(bi , bj ). Notice that the formula
above for g(v, w) is exactly the definition of matrix multiplication, so that
g(v, w) = [v]TB [g]B [w]B
with the matrix form of g being a d×d matrix with entries gij . By now it’s barely worth mentioning
that the numbers contained in the matrix [g] are intimately dependent on the choice of basis B: an
inner product is natural; the matrix form of the inner product is not!
Symmetry of g implies that the matrix [g] is a symmetric matrix, with [g]T = [g]. Moreover,
positivity requires that [v]T [g][v] > 0 whenever [v] ≠ 0; this is precisely the definition of a
positive-definite matrix. So an inner product always has a square symmetric positive-definite
matrix as its matrix form, no matter the choice of basis.
How does the matrix form of the inner product transform under change of basis? We can infer
the formula from the fact that the inner product of two arbitrary vectors should be independent of
the basis:
g(v, w) = [v]TB [g]B [w]B = [v]TB̃ [g]B̃ [w]B̃ = [v]TB T −T [g]B̃ T −1 [w]B ,
so that [g]B̃ = T T [g]B T.
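The L2 inner product on P2 makes a nice concrete test of all this. A sketch that assembles [g] in the monomial basis (the entries ∫₀¹ xⁱ xʲ dx = 1/(i + j + 1) are exact) and checks the transformation rule:

```python
import numpy as np

# Gram matrix of the L2 inner product on P2 in the monomial basis {1, x, x^2}:
# g_ij = integral_0^1 x^i x^j dx = 1/(i + j + 1).
g_B = np.array([[1/(i + j + 1) for j in range(3)] for i in range(3)])

# Symmetric positive-definite, as any inner product's matrix form must be:
assert np.allclose(g_B, g_B.T)
assert np.all(np.linalg.eigvalsh(g_B) > 0)

# g(v, w) = [v]^T [g] [w]; e.g. g(x^2 - 2x + 1, x + 1):
v, w = np.array([1.0, -2.0, 1.0]), np.array([1.0, 1.0, 0.0])
# integral_0^1 (x^2 - 2x + 1)(x + 1) dx
#   = integral_0^1 (x^3 - x^2 - x + 1) dx = 5/12.
assert np.isclose(v @ g_B @ w, 5/12)

# Under the change of basis from earlier, [g]_B~ = T^T [g]_B T, and the
# inner product of any two vectors is unchanged:
T = np.array([[-1.0, 1.0, -1.0],
              [ 0.0, 1.0,  1.0],
              [ 1.0, 0.0,  0.0]])
g_Bt = T.T @ g_B @ T
v_t, w_t = np.linalg.solve(T, v), np.linalg.solve(T, w)
assert np.isclose(v_t @ g_Bt @ w_t, v @ g_B @ w)
```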
1.3.1 Musical Isomorphisms
As mentioned in section 1.2, a choice of basis for a vector space V induces a basis for the space of
covectors V ∗ , but the correspondence between vectors and covectors is not a natural one. Using an
inner product, though, it is possible to transform between vectors and covectors in a natural way.
The key insight is this: any vector v can be turned into a covector v♭ by plugging it in as one
parameter into the inner product:

v♭(w) = g(v, w).

This function v♭ maps from V to R, and is linear, since g is bilinear. So v♭ is a well-defined covector
in V ∗ . We can even write down an explicit formula for the matrix form of v♭ by converting both
sides of the above equation to matrix form:

[v♭] = [v]ᵀ [g].
Notice that if [g] happens to be the identity matrix (as is the case for the Euclidean dot product,
expressed in the standard Euclidean basis) then this “flat” operation is just taking the transpose
of v. But the above construction is now natural, i.e. basis-independent, and for a different choice
of basis the matrix form of g may not be the identity matrix anymore.
A similar transformation exists in the reverse direction: if ν is a covector, we can transform it
into a vector ν♯, where ν♯ is the unique vector satisfying

ν(w) = g(ν♯, w)

for any w ∈ V . In matrix form, [ν♯] = [g]−1 [ν].
Why does such a vector ν♯ have to exist, though? Unlike the ♭ operation, which is explicit
and clearly well-defined, it is perhaps not obvious why there must exist a vector ν♯ with the
property that ν(w) = g(ν♯, w) for every different vector w. This is one case where working
with the matrix form is more convenient, since it is easy to check that the vector with form
[g]−1 [ν] has this property. It is also possible to prove existence of ν♯ without using any choice
of basis or coordinates, though the argument is a bit subtle. Let S be the space of all covectors
of the form g(v, ·) for v ∈ V . Since g is linear, S is closed under addition and scalar multiplication, so S is
a linear subspace of V ∗ . It also has dimension d, since if b1 , b2 , . . . , bd are a set of linearly independent
vectors, then the g(bi , ·) are also linearly independent: suppose for contradiction that this is not true, and
Σᵢ αᵢ g(bᵢ , ·) is the zero function for some coefficients αᵢ , not all zero. Then by the linearity of g, for any
vector w, g(u, w) = 0, where u = Σᵢ₌₁ᵈ αᵢ bᵢ . In particular, g(u, u) = 0, but this is a contradiction since g is
positive and u ≠ 0 since the bᵢ are linearly independent.
Therefore S is a full-dimensional vector subspace of V ∗ , and so must equal V ∗ .
The mappings ♭ : V → V ∗ and ♯ : V ∗ → V are called the musical isomorphisms and satisfy the
identities you might expect from the musical notation:

(v♭)♯ = v,    (ν♯)♭ = ν.

The musical isomorphisms are sometimes called index raising and lowering, especially in physics
and engineering, due to the way these fields write down vectors and covectors using Einstein index
notation.
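In coordinates the musical isomorphisms are just multiplication by [g] and [g]⁻¹. A sketch using the L2 Gram matrix on P2 from the previous section as [g] (the helper names `flat` and `sharp` are mine):

```python
import numpy as np

# Musical isomorphisms in coordinates: flat is [v]^T [g], sharp is [g]^{-1} [nu].
# [g] here is the L2 Gram matrix on P2 in the monomial basis.
g = np.array([[1/(i + j + 1) for j in range(3)] for i in range(3)])

def flat(v):            # vector -> covector (row)
    return v @ g

def sharp(nu):          # covector (row) -> vector
    return np.linalg.solve(g, nu)

v = np.array([1.0, -2.0, 1.0])        # x^2 - 2x + 1
nu = np.array([1.0, 2.0, 4.0])        # evaluation at x = 2

# The identities (v_flat)_sharp = v and (nu_sharp)_flat = nu:
assert np.allclose(sharp(flat(v)), v)
assert np.allclose(flat(sharp(nu)), nu)

# v_flat(w) = g(v, w) for every w, by construction:
w = np.array([0.0, 1.0, 1.0])
assert np.isclose(flat(v) @ w, v @ g @ w)
```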
1.4 Other Linear Objects
The inner product is a special kind of more general bilinear function f : V × V → R satisfying
f (v, w + αu) = f (v, w) + αf (v, u)
f (v + αu, w) = f (v, w) + αf (u, w),
but no symmetry or positivity. The space of bilinear scalar-valued functions is written V ∗ ⊗ V ∗ .
Just like the inner product, a bilinear function f ∈ V ∗ ⊗ V ∗ has a d × d matrix as its matrix form,
and acts by matrix multiplication on its two arguments:
f (v, w) = [v]T [f ][w].
The only difference is that [f ] is not necessarily symmetric positive-definite.
Yet another kind of linear function on V is the space of linear transformations M : V → V .
Linear transformations take as input a single vector v, and evaluate to a vector in V rather
than to a scalar. Being linear means that M acts as you would expect on combinations of vectors:
M (v + αw) = M (v) + αM (w).
The space of linear transformations is written as V ⊗ V ∗ . As with all linear objects, the behavior
of M can be characterized by its action on a basis for V : if v = Σᵢ₌₁ᵈ αᵢ bᵢ , then

M (v) = Σᵢ₌₁ᵈ αᵢ M (bᵢ ).
As in the rest of this chapter, we can use this expansion to write down M in matrix form: the
above expression tells us that applying M to a vector is the same as multiplication by a matrix
whose columns are M applied to the basis vectors:
[M ]B = (M (b1 ) M (b2 ) · · · M (bd )) .
How does the matrix form of M transform under change of coordinates? By now, the reasoning
needed to answer this question should be straightforward: [M ]B̃ [v]B̃ and [M ]B [v]B should represent
the same vector, but in different bases; therefore
T −1 [M ]B [v]B = [M ]B̃ [v]B̃ = [M ]B̃ T −1 [v]B
and [M ]B̃ = T −1 [M ]B T.
Notice that linear transformations and bilinear scalar functions are two completely different
kinds of objects, with different inputs, outputs, and transformation rules under change of basis—
and yet both have the exact same matrix form (a d × d matrix)! This is the "law of small vector
spaces" at work again; it is a source of unending confusion, and the reason it is extremely important
to keep clear in your mind, any time you see a matrix, what kind of object that matrix is supposed
to represent, and how these objects transform into each other. Table 1.1 lists the several kinds of
objects we've discussed in this chapter, along with their representations and transformation rules.
mathematicians call it   physicists call it     what it really is   implement it as   how it functions        how it transforms
vector                   contravariant vector   (0, 1) tensor       d × 1 column      v                       [v]B̃ = T −1 [v]B
covector; dual vector;   covariant vector       (1, 0) tensor       1 × d row         ν(v) = [ν][v]           [ν]B̃ = [ν]B T
  linear functional
linear transformation    mixed-variance         (1, 1) tensor       d × d matrix      A(v) = [A][v]           [A]B̃ = T −1 [A]B T
                           2-tensor
bilinear form;           covariant 2-tensor     (2, 0) tensor       d × d matrix      g(v, w) = [v]T [g][w]   [g]B̃ = T T [g]B T
  inner product
coordinate frame         contravariant          (0, 2) tensor       d × d matrix      F                       [F ]B̃ = T −1 [F ]B
                           2-tensor

Table 1.1: Linear Objects Demystified
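For a concrete (1, 1) tensor, consider differentiation d/dx : P2 → P2 , which is linear. A sketch (the example operator is my choice, not one used elsewhere in these notes) of its matrix form in the monomial basis and of the conjugation rule [A]B̃ = T −1 [A]B T:

```python
import numpy as np

# d/dx on P2 in the monomial basis {1, x, x^2}: columns are the images of
# the basis vectors: d/dx 1 = 0, d/dx x = 1, d/dx x^2 = 2x.
D_B = np.array([[0.0, 1.0, 0.0],
                [0.0, 0.0, 2.0],
                [0.0, 0.0, 0.0]])

v_B = np.array([1.0, -2.0, 1.0])                  # x^2 - 2x + 1
assert np.allclose(D_B @ v_B, [-2.0, 2.0, 0.0])   # derivative is 2x - 2

# A linear transformation conjugates under change of basis:
T = np.array([[-1.0, 1.0, -1.0],
              [ 0.0, 1.0,  1.0],
              [ 1.0, 0.0,  0.0]])
D_Bt = np.linalg.solve(T, D_B @ T)                # [D]_B~ = T^{-1} [D]_B T

# Applying D in either basis represents the same output vector:
v_Bt = np.linalg.solve(T, v_B)
assert np.allclose(T @ (D_Bt @ v_Bt), D_B @ v_B)
```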
Bonus Math All of the objects in table 1.1 are special cases of tensors: generalizations of multilinear
objects. You don’t need to know much about tensors, beyond what we’ve covered already in this chapter,
to understand physical simulation concepts, but I can give you a bit more detail about the general setting
here.
If you have two vector spaces V1 and V2 , how can you combine them into a new vector space? It turns
out there are two natural ways of making such a combination.
The first is called the direct sum V1 ⊕ V2 , and has as its vectors the set of all ordered pairs (v1 , v2 ),
with v1 ∈ V1 and v2 ∈ V2 . The addition rule for the direct sum is simple: just treat each of the vectors
in the pair separately:
(v1 , v2 ) + (w1 , w2 ) = (v1 + w1 , v2 + w2 )
similarly, distribute scalar multiplication into each component in the obvious way:
α(v1 , v2 ) = (αv1 , αv2 ).
It is not hard to show that V1 ⊕ V2 satisfies the properties of a vector space, with zero element (01 , 02 ).
If B1 and B2 are bases for V1 and V2 , with dimension d1 and d2 respectively, then V1 ⊕ V2 has dimension
d1 + d2 , and one easily-constructed basis for the direct sum has basis vectors of the form (01 , b ∈ B2 ) and
(b ∈ B1 , 02 ).
The other natural way to form a vector space out of V1 and V2 is the tensor product. Perhaps the
easiest description of the tensor product is to define it in terms of bases B1 and B2 chosen for V1 and V2 .
Declare V1 ⊗ V2 to be the span of all pairs of vectors of the form (bi1 , bj2 ), with bi1 ∈ B1 and bj2 ∈ B2 . This
vector space has dimension d1 d2 and has an obvious basis (the one we just constructed)! What is perhaps
not obvious about this construction is that it is a natural one, since it relied on a choice of basis for V1
and V2 ; but it turns out that it is indeed basis-independent. (There is an alternate, but more subtle,
construction of the tensor product that makes this clear: start from the Cartesian product V1 × V2 , and
turn it into a vector space by including all linear combinations of the pairs in this Cartesian product. This
gives us an infinite-dimensional vector space; we now reduce this space by declaring vectors to be equal if
they differ by bilinearity of addition:
(v1 + w1 , v2 ) = (v1 , v2 ) + (w1 , v2 )
(αv1 , v2 ) = α(v1 , v2 ).
It can be shown that “quotienting out” by this equivalence yields a vector space, that it has dimension
d1 d2 , and that it is isomorphic to the tensor product defined above in terms of bases B1 and B2 .)
An example might help: P2 ⊗ P2 is the vector space of bivariate polynomials where each variable has
degree 2 or less. For example, x² − 2x + 1, x + y, and 2x²y + 4y² + xy + 1 are all examples of vectors in
P2 ⊗ P2 . For bases B1 = {1, x, x2 } and B2 = {1, y, y 2 }, they can be expanded as

x² − 2x + 1 = 1 · (x², 1) − 2 · (x, 1) + 1 · (1, 1)
x + y = 1 · (x, 1) + 1 · (1, y)
2x²y + 4y² + xy + 1 = 2 · (x², y) + 4 · (1, y²) + 1 · (x, y) + 1 · (1, 1).
Pairs (b1i , b2j ) are usually written as b1i ⊗ b2j . Given two vectors v1 = Σᵢ₌₁ᵈ¹ α1i b1i and
v2 = Σᵢ₌₁ᵈ² α2i b2i , one can define the tensor product of vectors ⊗ : V1 × V2 → V1 ⊗ V2 by

v1 ⊗ v2 = Σᵢ₌₁ᵈ¹ Σⱼ₌₁ᵈ² α1i α2j b1i ⊗ b2j .
This tensor product of vectors works exactly how you would expect in the case of P2 ⊗ P2 :
(1 + 2x) ⊗ (y 2 − y) = (1 ⊗ y 2 ) + 2(x ⊗ y 2 ) − (1 ⊗ y) − 2(x ⊗ y) = y 2 + 2xy 2 − y − 2xy.
Given a vector space V , it is especially common to take tensor products of V with itself and with its
space of covectors V ∗ . Elements of V ⊗ V ∗ are called (1, 1)-tensors, and are linear transformations: linear
functions that take in a vector in V and output a vector in V . Elements of V ∗ ⊗ V ∗ are (2, 0)-tensors, and
are the bilinear scalar functions of V . In general, (n, m)-tensors are elements of the tensor product space
containing n copies of V ∗ , and m copies of V . This space has dimension dⁿ⁺ᵐ ; n + m is sometimes called
the rank of the tensor. Tensors of rank 2 have a matrix form that can be represented as a square matrix
(see table 1.1). Rank 3 tensors have a matrix form that looks like a 3D "cube" of numbers; rank 1 tensors
are the vectors or covectors.
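In coordinates, the tensor product of two vectors is just the outer product of their coefficient columns (and `np.kron` gives the same numbers flattened into one column). A sketch checking the (1 + 2x) ⊗ (y² − y) example from above:

```python
import numpy as np

# Coefficients in the bases B1 = {1, x, x^2} and B2 = {1, y, y^2}.
v1 = np.array([1.0, 2.0, 0.0])       # 1 + 2x
v2 = np.array([0.0, -1.0, 1.0])      # y^2 - y

# v1 (x) v2 has coefficient alpha_{1i} * alpha_{2j} on the basis element
# b1_i (x) b2_j; collected as a d1 x d2 array, that's the outer product.
coeffs = np.outer(v1, v2)

# Entry (i, j) is the coefficient of x^i y^j in (1 + 2x)(y^2 - y):
# y^2 + 2xy^2 - y - 2xy, matching the worked example.
expected = np.array([[0.0, -1.0, 1.0],
                     [0.0, -2.0, 2.0],
                     [0.0,  0.0, 0.0]])
assert np.allclose(coeffs, expected)

# np.kron produces the same numbers as a single length-9 coefficient
# column for the 9-dimensional space P2 (x) P2:
assert np.allclose(np.kron(v1, v2), coeffs.ravel())
```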
1.4.1 Symmetry and Eigenvalues
Bonus Math It is worth delving more deeply into the similarities and differences between bilinear scalar
functions f and linear transformations M . Both have as their matrix form a d × d matrix, but many of
the usual linear algebra concepts related to matrices only make sense for one of the two kinds of objects.
Symmetry Both types of objects have a notion of symmetry. For bilinear functions, it’s the symmetry
we already saw for the inner product: that f (v, w) = f (w, v). In matrix form, symmetry means that
[v]T [f ][w] = [w]T [f ][v] = ([w]T [f ][v])T = [v]T [f ]T [w],
where the second equality comes from the fact that a scalar can always be transposed to equal itself.
Therefore symmetry of f , in matrix form, corresponds to symmetry of [f ] in the usual sense of [f ] = [f ]T .
Now what about linear transformations? It is less obvious, but there is indeed a corresponding notion:
that of self-adjointness of M :

⟨v, M w⟩g = ⟨M v, w⟩g .                                   (1.2)
Unpacking this identity into coordinates, we have that
[v]T [g][M ][w] = [v]T [M ]T [g][w]
or in other words, [M ]T = [g][M ][g]−1 . Notice that this is not just ordinary symmetry of the matrix [M ].
Eigenvalues and Eigenvectors A linear transformation M has v ≠ 0 as an eigenvector if, for some
scalar λ, M (v) = λv. Or in matrix form,

[M ][v] = λ[v].

Notice that the magnitude of v is irrelevant: if v is an eigenvector, so is αv for any α ≠ 0. We might as
well assume that g(v, v) = 1, though we still do not have a canonical direction ±v for the eigenvector.
It is a standard result from linear algebra that if M is self-adjoint, then (i) M has a full set of
d eigenvalue/eigenvector pairs (λi , vi ), where (ii) the eigenvalues are real and (iii) the eigenvectors are
orthogonal to each other: that is, g(vi , vj ) = δij . The full proofs of these facts need some advanced
knowledge about the algebra of polynomials over C, so we will not go through them here: but it is easy
to see the last fact, at least for eigenvectors corresponding to distinct eigenvalues λi ≠ λj :
λi g(vi , vj ) = g(M (vi ), vj ) = g(vi , M (vj )) = λj g(vi , vj )
where the middle equality is due to self-adjointness of M ; now the only way the first and last expression
are equal to each other is if g(vi , vj ) = 0.
The above facts combine to tell us that the matrix form of M can be decomposed as

[M ] = V ΛV −1 ,    Λ = diag(λ1 , λ2 , . . . , λd ),    V = (v1 v2 · · · vd ),
where V T [g]V = I. In other words, in the basis of its own eigenvectors, M acts just by rescaling each
coordinate by the corresponding eigenvalue. This fact is extremely useful both when reasoning about the
effects of linear transformations, and doing numerical computation.
Many numerical linear algebra packages only support, or better-support, matrices that are truly
symmetric, rather than just self-adjoint. This is not a problem: if M is self-adjoint, then
([g][M ])T = [M ]T [g]T = [g][M ][g]−1 [g]T = [g][M ]
so that [g][M ] is symmetric. We can write the eigenvalue equation [M ][v] = λ[v] as
[g]−1 ([g][M ])[v] = λ[v]
or, moving [g] to the other side,
([g][M ])[v] = λ[g][v].
The matrices on both sides of this equation are symmetric, in the ordinary sense, and so finding the
eigenvectors and eigenvalues of M is the same as solving this equation with a symmetric matrix on both
sides, called a generalized eigenvalue problem. Software packages typically have more support for finding
generalized eigenvalues of pairs of symmetric matrices, than they do for finding eigenvalues of arbitrary
non-symmetric matrices.
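scipy exposes exactly this generalized symmetric eigenproblem. A sketch (the matrices here are invented for illustration) checking that `scipy.linalg.eigh(A, B)` solves A v = λ B v and returns [g]-orthonormal eigenvectors:

```python
import numpy as np
from scipy.linalg import eigh

# A self-adjoint M in disguise: [g][M] symmetric, with [g] symmetric
# positive-definite. Both matrices are invented for illustration.
g = np.array([[2.0, 1.0],
              [1.0, 2.0]])
gM = np.array([[3.0, 1.0],
               [1.0, 1.0]])          # this plays the role of [g][M]

# Solve ([g][M]) v = lambda [g] v, a generalized symmetric eigenproblem:
lam, V = eigh(gM, g)

# Each pair satisfies the generalized eigenvalue equation...
for i in range(2):
    assert np.allclose(gM @ V[:, i], lam[i] * (g @ V[:, i]))

# ...and the eigenvectors are orthonormal in the g inner product:
assert np.allclose(V.T @ g @ V, np.eye(2))

# Equivalently, these are the (real) eigenvalues of M = [g]^{-1}([g][M]):
M = np.linalg.solve(g, gM)
assert np.allclose(np.sort(np.linalg.eigvals(M).real), np.sort(lam))
```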
Now the definition of eigenvector and eigenvalue makes no sense for bilinear functions
f : you can write down the same expression using [f ] instead of [M ], and the dimensions match up, but
the expression is meaningless nonsense since you cannot apply f to just a single vector. How do we port
the concept of eigenvector to these objects?
Notice that if we plug in a vector v into one of the arguments of f , we are left with a linear function
V → R; in other words f (v, ·) ∈ V ∗ . Therefore a possible generalization of equation (1.2) is to say that
f (v, ·) = λv[
20
concept
symmetry
diagonalization
linear transformations M
self-adjointness
g(v, M (w)) = g(M (v), w)
[g][M ] = [M ]T [g]
eigenvalues and eigenvectors
M (vi ) = λi vi
[M ][vi ] = λi [vi ], g(vi , vj ) = δij
bilinear functions f
true symmetry
f (v, w) = f (w, v)
[f ] = [f ]T
generalized eigenvalues/vectors
f (vi , ·) = λi vi[
[f ][vi ] = λi [g][vi ], g(vi , vj ) = δij
Table 1.2: Symmetry and Diagonalization for Different Linear Objects
where the [ on the right-hand side turns v into a covector. In matrix form,
[f ][v] = λ[g][v].
Notice that the inner product appears on the right-hand side, when it didn’t for linear transformations. In
other words, the eigenvectors of a bilinear form are the eigenvectors of the matrix [g]−1 [f ], or alternatively,
the generalized eigenvectors of [f ] with respect to [g].
Just as for linear transformations, when f is symmetric, the eigenvectors of f form a basis that allows
us to trivialize evaluation of the function into just a rescaling. The eigenvectors of f form an orthonormal
basis V with V T [g]V = I and
    f (vi , vj ) = { λi ,   i = j,
                   { 0,    i ≠ j.
Notice that this equation makes no sense unless f is symmetric. As expected, in matrix form, this property
reads
V T [f ]V = Λ
or
[g]−1 [f ] = V ΛV −1 .
Table 1.2 summarizes the discussion in this section. The punchline is this: if you have a matrix, you
should ask yourself “does this matrix represent a linear transformation, or a bilinear function”? If the
former, for example, then it only makes sense to talk about self-adjointness of the matrix (and not ordinary
symmetry) and to talk about its eigenvalues and eigenvectors (and not its generalized eigenvectors and
eigenvalues). Applying the wrong concept to the wrong type of object is a sure sign you’ve wandered onto
the wrong track.
1.5  Inner Product Spaces in Physics
Inner product spaces are a cornerstone of geometric understanding of physics.3 Consider a point
q ∈ Q and a configurational velocity q̇ ∈ T Qq . How do we measure the magnitude of this
configurational velocity? We could of course use the ordinary Euclidean norm,
    ‖q̇‖I = √(q̇ · q̇),
but there is a huge problem with this approach: it is not natural, whereas all physical properties
must be invariant with respect to choice of coordinates! To measure lengths of velocities, we need a
³ In fact physicists usually work with Hilbert spaces: inner product spaces with an additional technical restriction that the vector space is complete, i.e. that all Cauchy sequences converge to an element of the space. We will not worry about completeness in this course.
canonical and natural way to assign a non-negative number to every velocity, and in fact, we have
already seen something that fits the bill: the kinetic energy,
    ⟨q̇, q̇⟩T = 2T (q̇, q).
Since kinetic energy doesn’t depend on choice of coordinates, we now have a natural inner product
on T Qq . We can take this idea even further: in standard form, where T (q̇, q) = ½ q̇T M (q)q̇, we
even have the norm in matrix form:

    ‖q̇‖²T = q̇T M q̇,

which we can extend in the obvious way to an inner product,

    ⟨q̇1 , q̇2 ⟩T = q̇T1 M q̇2 .
Notice that this matrix depends on q, but not q̇: it is a fixed matrix at every point q in configuration
space, but might be different at different points of configuration space (for example, if one degree
of freedom represents the amount of fuel a rocket has consumed, the mass matrix would decrease
as this degree of freedom increases.)
The above insight is so important, I will put it in a big box:
Configurational tangent space is an inner product space, and mass is the metric.
We will use this fact over and over again throughout our calculations and derivations.
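As a quick numerical illustration (the two-particle system and the masses below are made up for the example), the mass inner product is just an ordinary weighted matrix product q̇ᵀM q̇, and the squared mass-metric norm of a velocity recovers twice the kinetic energy:

```python
import numpy as np

# A hypothetical two-particle system in 2D: q = (x1, y1, x2, y2), with
# particle masses m1 and m2. The kinetic energy is
#   T = (1/2) m1 (x1'^2 + y1'^2) + (1/2) m2 (x2'^2 + y2'^2),
# so the mass matrix is diagonal and <v, w>_T = v^T M w.
m1, m2 = 2.0, 3.0
M = np.diag([m1, m1, m2, m2])

qdot = np.array([1.0, 0.0, 0.0, 2.0])
T = 0.5 * qdot @ M @ qdot              # kinetic energy: 0.5*2*1 + 0.5*3*4 = 7
norm_T = np.sqrt(qdot @ M @ qdot)      # mass-metric norm of the velocity

print(T)
print(np.isclose(norm_T ** 2, 2 * T))  # ||qdot||_T^2 = 2 T(qdot)
```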
A parting thought: if T Q is an inner product space and velocities are its vectors, what are its
covectors?
Chapter 2
The Differential
Among all of the mathematical disciplines
the theory of differential equations is the
most important. It furnishes the
explanation of all those elementary
manifestations of nature which involve
time.

Sophus Lie
In the first chapter, we discussed the potential function V : Q → R, and mentioned that conservative
forces could be derived from such potentials. In this chapter, we will formalize this idea, and study
how to take a potential V and turn it into a covector field F (q) over Q: a covector at every point
of Q specifying the direction and magnitude of the force arising from V . The formula is simple:
F (q) = −dV (q),
where dV is a covector called the differential of V . As discussed in the previous chapter, the force
(and the differential), like all covectors, can be written in matrix form [F (q)] as a row vector, given
a choice of basis for the configuration tangent space T Q. We will define the differential, and how
to calculate it, in the sections that follow.
In earlier physics classes, you may have learned that only some types of forces are conservative,
and can be represented by a potential. Gravity, elastic spring forces, the electromagnetic force,
contact forces that prevent collisions, etc. are all conservative forces, but there are important
examples of non-conservative forces: friction, air drag, etc.
Non-conservative forces are dissipative: they violate conservation of energy (by decreasing
energy over time). There are no known non-conservative forces in nature. However, these
forces are still useful in practical simulations—for example, consider a block of wood being pushed up an
incline. Due to phenomena that are not fully understood (scraping of microscopic imperfections on the
surface of the wood block against the surface of the incline; transient formation and breaking of chemical
bonds between the wood and incline, etc) some of the energy of the system gets converted into vibration
and heat of the block. If we were to do a full accounting of these microscopic phenomena, we would see
that energy is perfectly conserved; but as discussed in the first chapter, it is unnecessary and impractical
to simulate the details of the vibration of every individual wood molecule. A non-conservative friction force
approximates the apparent loss of energy to heat, without simulating the heat itself.
2.1  Directional Derivative
We will assume in this course that you have a firm grasp of ordinary (one-variable) differential
calculus. How can we use one-variable calculus to understand the behavior of V , which is a function
of potentially many variables? Let q be a point in configuration space; the neighboring landscape
near q might be very complex, depending on how complicated V is. But a key idea is that we can
reduce the problem of understanding the high-dimensional neighborhood of q to a one-dimensional
problem, which we thoroughly understand, by looking in only one direction at a time.
Let w be a configurational tangent vector in T Qq . Then we can ask: how does the function
V change, to first order, as we move in the direction w? To answer this question, we define a
parameterized curve that starts at q and moves in the w direction:
γ(s) = q + sw.
We can evaluate V at every point along this curve. This now gives us a one-dimensional function
depending on a single scalar variable, s; we then take the ordinary one-dimensional derivative,
yielding the directional derivative of V , at q, in the direction w:
    Dw V (q) = (d/ds) V (q + sw) |s=0 .
This directional derivative inherits many useful properties from the usual one-dimensional derivative:
1. it is additive in the function:
Dw (V1 + V2 )(q) = Dw V1 (q) + Dw V2 (q);
2. it is linear in the direction w:
Dw1 +αw2 V (q) = Dw1 V (q) + αDw2 V (q).
Notice that this linearity is only in the direction of the derivative; Dw V (q) is not generally
linear in the point q. This linearity is perhaps not obvious; it follows from the fact that, by
Taylor's theorem, any differentiable function V looks like a plane on a sufficiently small neighborhood of q. We can prove this linearity property, though it unfortunately and unavoidably
requires the use of non-negligible additional machinery in the form of (multivariable) differentiability of V :
    D_{w1 +αw2} V (q) = (d/ds) V (q + s[w1 + αw2 ]) |s=0
                      = Σ_{i=1}^{n} (∂V /∂xi )(q) (w1 + αw2 )i
                      = Σ_{i=1}^{n} (∂V /∂xi )(q) (w1 )i + α Σ_{i=1}^{n} (∂V /∂xi )(q) (w2 )i
                      = (d/ds) V (q + sw1 ) |s=0 + α (d/ds) V (q + sw2 ) |s=0
                      = Dw1 V (q) + αDw2 V (q).
Linearity in the direction w has as a special case the intuitive fact that the directional
derivative in one direction is simply the negative of the derivative in the opposite direction:
Dw V (q) = −D−w V (q).
Moving beyond scalar-valued potentials V , the notion of directional derivative extends easily to
functions that return a vector, rather than a scalar. Concretely, if f : Rn → Rm , we can write
f (q) = [f1 (q), f2 (q), . . . , fm (q)]
and
Dw f (q) = [Dw f1 (q), Dw f2 (q), . . . , Dw fm (q)] .
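The definition above translates directly into a one-dimensional numerical probe. In this sketch (the sample potential V and the evaluation point are arbitrary examples, not from the text), we approximate (d/ds) V (q + sw) at s = 0 by a small centered difference along the curve γ(s) = q + sw, and check linearity in the direction:

```python
import numpy as np

# An arbitrary example potential for illustration: V(q) = sin(q1) + q1 * q2^2.
def V(q):
    return np.sin(q[0]) + q[0] * q[1] ** 2

def directional_derivative(V, q, w, s=1e-6):
    # centered one-dimensional difference along the line gamma(s) = q + s w
    return (V(q + s * w) - V(q - s * w)) / (2 * s)

q = np.array([0.3, -1.2])
w1 = np.array([1.0, 0.0])
w2 = np.array([0.0, 1.0])
alpha = 2.5

# Linearity in the direction: D_{w1 + a w2} V = D_{w1} V + a D_{w2} V
lhs = directional_derivative(V, q, w1 + alpha * w2)
rhs = directional_derivative(V, q, w1) + alpha * directional_derivative(V, q, w2)
print(np.isclose(lhs, rhs, atol=1e-5))
```

For this V, the derivative in the w1 direction should also match the analytic partial derivative cos(q1) + q2².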
2.2  The Differential
Closely related to the directional derivative of a function f : Rn → Rm is the differential {df (q)} (δq).1
The differential takes two arguments: the function f , and a point q ∈ Rn , and returns a new function df (q). This function is no longer a function of Q, but instead, is a function of the tangent space
T Qq at q. It takes in a tangent vector δq (which can be interpreted as an infinitesimal change in
q) and returns the directional derivative
{df (q)} (δq) = Dδq f (q),
which can be interpreted as the infinitesimal change in f due to perturbing q by the infinitesimal
change δq. The differential has tons of different names in different contexts; in differential geometry
where f is a map between two manifolds, it is often called the pushforward.
The differential is more properly called a functional rather than a function, since it takes in one
function, and produces another one. Because it is so closely related to the directional derivative,
the linearity properties immediately carry over:
1. it is additive in the function: {d(f + g)(q)} (δq) = {df (q)} (δq) + {dg(q)} (δq);
2. it is linear in δq:

    {df (q)}(δq1 + αδq2 ) = {df (q)}(δq1 ) + α {df (q)}(δq2 ).
The notation here can get a little cluttered and confusing; the object in {curly brackets} is
the function returned by the differential, and it is being evaluated at a tangent vector written
in (round parentheses). In addition to the linearity properties, the differential also has two
other useful properties:
3. it satisfies the chain rule:
{d(f ◦ g)(q)}(δq) = {df (g(q))}( {dg(q)}(δq) ).
The intuition for the chain rule should be clear: first dg maps the infinitesimal change δq
at q to the infinitesimal change {dg(q)}(δq) of g at g(q), and then df maps that to the
infinitesimal change in f .
¹ The curly braces here are for grouping; they are not set notation. They are unfortunately the least confusing out of the lot of poor delimiter choices here.
4. it satisfies a product rule: if ⋆ is any binary bilinear operator,

    {d(f ⋆ g)(q)}(δq) = {df (q)}(δq) ⋆ g(q) + f (q) ⋆ {dg(q)}(δq).
Examples of binary bilinear operators include: scalar multiplication, matrix multiplication,
vector dot products, vector cross products, etc. The proof follows easily by first applying the
definition of the differential in terms of the directional derivative, and then using the ordinary
one-variable product rule.
Matrix Representation If f is a scalar function Rn → R, then the linearity property 2 says
precisely that df is a covector, which has a matrix form [df ] that can be written as an n-dimensional
row vector. In fact, even if f : Rn → Rm is vector-valued, since df (q) is linear in δq, there must
exist a matrix [df (q)] of size m × n with {df (q)}(δq) = [df (q)][δq]. Therefore we can use the notation
from the last chapter, [df (q)], to denote this matrix representation (which is usually called the
Jacobian of f ). Notice the important difference in the two sides of this equation! The left side is
function evaluation of the function df (q), and the right side is matrix-vector multiplication. To
avoid cluttered notation, often the evaluation at q is omitted and left implied—[df ][δq]—though of
course one should always remember that generally df is not constant in q.
In the special case of f : R → R, [df ] is just a scalar, and equal to the usual derivative df /dx. This
makes sense since the ordinary derivative of a scalar function does measure the infinitesimal change
in f due to an infinitesimal change in x, for the special case δx = 1. For a scalar function such as
a potential V : Rn → R, [dV ] is a row vector.
Finally, note that although it is guaranteed to be possible to write down a matrix representation
of df , there is not always a simple formula for [df ]! Sometimes, particularly when computing
intermediate results in a long calculation, it can be more useful to leave the differential in functional
form.
Functions of Multiple Variables Functions that take multiple variables can be tricky, especially if the parameters of the function all depend on the variable being differentiated. Consider
for example f (v, w) : Rn × Rn → R. How do we compute the differential of
g(q) = f (q, q)?
We could fall back to the definition of the differential in terms of the directional derivative, but we
can also observe the following: define f̃ : R2n → R by

    f̃ ( (v, w) ) = f (v, w),

and let S denote the 2n × n matrix formed by stacking two n × n identity blocks, so that g(q) = f̃ (Sq).
Then by the chain rule,

    {dg(q)}(δq) = {df̃ (Sq)}(S δq),

or, in terms of the Jacobian of f̃ ,

    {dg(q)}(δq) = [df̃ (Sq)] S [δq].

The Jacobian [df̃ ] is of size 1 × 2n; the left 1 × n block maps an infinitesimal change in v to an
infinitesimal change in f̃ , or, equivalently, in f . This block is thus equal to [d1 f ], the Jacobian of f
with respect to the first parameter (treating the second parameter as a constant). We thus have

    {dg(q)}(δq) = [ [d1 f (q, q)]  [d2 f (q, q)] ] S [δq]
                = [d1 f (q, q)] [δq] + [d2 f (q, q)] [δq]
                = {d1 f (q, q)}(δq) + {d2 f (q, q)}(δq).
Not coincidentally, this calculation should remind you of the multivariable chain rule,

    (d/dt) f [x(t), y(t)] = (∂f /∂x)(dx/dt) + (∂f /∂y)(dy/dt).
As a second example, suppose u(q) : Rn → Rm , v(q) : Rn → Rp are two functions, and f :
Rm × Rp → R is a function of two variables. To calculate the differential of
g(q) = f [u(q), v(q)]
we calculate
{dg(q)}(δq) = {d1 f [u(q), v(q)]} ({du(q)}(δq)) + {d2 f [u(q), v(q)]} ({dv(q)}(δq)) ,
or using the matrix representations of the differentials,
[dg(q)] = [d1 f [u(q), v(q)]] [du(q)] + [d2 f [u(q), v(q)]] [dv(q)] .
You should check that the dimensions of the objects on the right-hand side, and their geometric
meaning, make sense.
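This bookkeeping is also easy to check numerically. In the sketch below (the functions u, v, f and the finite-difference `jacobian` helper are all made-up examples, not part of the text), we verify that [d1 f][du] + [d2 f][dv] matches the Jacobian of g computed directly:

```python
import numpy as np

# Arbitrary example functions: u : R^2 -> R^3, v : R^2 -> R^2, f : R^3 x R^2 -> R.
def u(q):
    return np.array([q[0] ** 2, q[0] * q[1], np.sin(q[1])])

def v(q):
    return np.array([q[0] + q[1], q[0] - q[1]])

def f(a, b):
    return a @ np.array([1.0, 2.0, 3.0]) + b[0] * b[1]

def g(q):
    return f(u(q), v(q))

def jacobian(func, x, eps=1e-6):
    # centered finite-difference Jacobian, one column per input coordinate
    cols = []
    for i in range(len(x)):
        e = np.zeros(len(x)); e[i] = eps
        cols.append((np.atleast_1d(func(x + e)) - np.atleast_1d(func(x - e))) / (2 * eps))
    return np.column_stack(cols)

q = np.array([0.7, -0.4])
a, b = u(q), v(q)
d1f = jacobian(lambda a_: f(a_, b), a)   # 1 x 3: Jacobian in the first slot
d2f = jacobian(lambda b_: f(a, b_), b)   # 1 x 2: Jacobian in the second slot
chain = d1f @ jacobian(u, q) + d2f @ jacobian(v, q)
direct = jacobian(g, q)
print(np.allclose(chain, direct, atol=1e-5))
```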
Matrix-valued Functions If f (q) : Rn → Rm×p is matrix-valued, the differential is still well-defined: it maps infinitesimal changes in q to infinitesimal changes in the matrix. It is also still linear
in δq, although it no longer has a “matrix” representation: there is no matrix which multiplies
a vector to yield another matrix. In these cases it is often easiest to leave the differential in
functional form. However, it is possible to express df in matrix form, by replacing f with an
“unrolled” vector-valued version f̄ :

    f (q) = [ f11 (q)  f12 (q)  · · ·  f1p (q) ]
            [    ⋮        ⋮      ⋱       ⋮    ]  ;   f̄ (q) = [ f11 (q)  f12 (q)  · · ·  f1p (q)  f21 (q)  · · ·  fmp (q) ]T .
            [ fm1 (q)  fm2 (q)  · · ·  fmp (q) ]
Now df¯ has an mp × n matrix form, as usual.
The Gradient In calculus or physics classes you may have been exposed to the gradient ∇f of
a function f : Rn → R, which is intimately related to the differential: in fact the gradient of f at
q is defined to be the unique vector ∇f (q) satisfying

    ⟨∇f (q), δq⟩ = [df (q)][δq]

for all δq. In other words, the gradient is the metric dual of the differential:

    ∇f (q) = {df (q)}♯ .

As a special case, in Euclidean space under the Euclidean metric,

    ⟨∇f (q), δq⟩ = [∇f (q)] · [δq] = [∇f (q)]T [δq],
and the gradient is just the transpose of the Jacobian matrix [df (q)]. Notice that the gradient in
its usual form is only defined for a scalar function f : Rn → R, whereas the differential is defined
for functions of all shapes and sizes.
2.3  Worked Examples
We will now look at some examples of computing the differential. Although we are ultimately
interested in taking the differential of scalar potential functions, to get forces, it is very much
worthwhile to understand and practice applying the differential to functions of arbitrary dimension;
it is extremely common to define the potential in terms of a complicated composition of higher-dimensional functions, as we will see shortly.
Whenever possible we will try to write the differential in matrix form, though we will also see
some examples where this is not profitable.
Dot Product Consider two vector fields u(t) and v(t) defined along a 1D line parameterized by
a scalar parameter t, and let f (t) = u(t) · v(t). We have
{df (t)}(δt) = [du]δt · v + u · ([dv]δt)
= v · [du]δt + u · [dv]δt
= (v · [du] + u · [dv])δt
[df ] = v · [du] + u · [dv].
The first step uses the fact that the dot product is a binary bilinear operator. We then use properties
of the dot product (symmetry) to rearrange the expression for the differential into the form M δt
for a matrix M . Again, M does not always have a nice expression, though in this case it does.
If the vectors are themselves functions of a vector, so that f (q) = u(q) · v(q), the same
calculation can be done:
{df (q)}(δq) = [du][δq] · v + u · [dv][δq]
             = ([du][δq])T v + uT [dv][δq]
             = vT [du][δq] + uT [dv][δq]
             = (vT [du] + uT [dv]) [δq]

    [df ] = vT [du] + uT [dv].
Here we have used the fact that if c is an expression that evaluates to a scalar, c = cT . This
“transpose trick” is a common way of rearranging the differential into matrix form.
Identity If f (q) = q, then [df ] = I, the identity matrix. This follows easily from the definition
of the differential, and should also be intuitive: the infinitesimal change in q due to infinitesimally
changing q by δq is δq.
Linear Functions If f is linear, {df (q)}(δq) = f (δq). This also follows easily from the definition,
or also from the product rule: if f is linear there must exist a matrix F with f (q) = F [q]. Then
{df (q)}(δq) = ([dF ][δq]) [q] + F I[δq] = 0 + F [δq] = f (δq).
Transpose If M is a matrix (or vector) and f (M ) = M T , {df (M )}(δM ) = (δM )T . This follows
immediately from the fact that the transpose is a linear function of the matrix.
Norm If f (q) = ‖q‖ = √(q · q), then by the chain rule

    {df }(δq) = ½ (q · q)−1/2 2q · δq

and [df ] = q̂T , where the hat notation means a unit vector, q̂ = q/‖q‖.
Hat Function Now for a more complicated example: consider f (q) = q̂ = q/‖q‖. Then

    {df (q)}(δq) = (1/‖q‖) I[δq] − (q/‖q‖²) q̂T [δq]

    [df ] = I/‖q‖ − qqT /‖q‖³ .
As a sanity check, notice that {df (q)}(q) = 0: the infinitesimal change in q̂ by perturbing q in the
q direction is, as expected, zero!
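A quick numerical check of the hat-function Jacobian (this sketch is illustrative, not part of the derivation; the test point is arbitrary) compares the formula against centered finite differences, and also verifies the sanity check {df (q)}(q) = 0:

```python
import numpy as np

def hat(q):
    return q / np.linalg.norm(q)

def hat_jacobian(q):
    # [df] = I/||q|| - q q^T / ||q||^3, from the derivation above
    n = np.linalg.norm(q)
    return np.eye(len(q)) / n - np.outer(q, q) / n ** 3

q = np.array([1.0, -2.0, 0.5])   # arbitrary test point
eps = 1e-6
J = np.column_stack([
    (hat(q + eps * e) - hat(q - eps * e)) / (2 * eps)   # centered differences
    for e in np.eye(len(q))
])
print(np.allclose(J, hat_jacobian(q), atol=1e-6))
print(np.allclose(hat_jacobian(q) @ q, 0.0))  # perturbing along q leaves q-hat fixed
```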
Matrix Inverse Here’s a neat example. Let f (M ) = M −1 . Directly finding df is not simple,
but notice that M f (M ) = I is constant and so has zero differential. Therefore
0 = {d[M f (M )]}(δM ) = [δM ]f (M ) + M {df }(δM )
and after some algebraic manipulation,
{df }(δM ) = −M −1 [δM ]M −1 .
Notice that in this example there is not a simple way of rewriting the differential in matrix form
(this would require higher-rank tensors, or index notation).
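Even though the differential of the inverse has no simple matrix form, it is still easy to test in functional form: pick a perturbation direction δM and compare the formula against a finite difference (the random matrices below are arbitrary test data):

```python
import numpy as np

# Numerically verify {df}(dM) = -M^{-1} dM M^{-1} for f(M) = M^{-1}.
rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # shifted to be comfortably invertible
dM = rng.standard_normal((3, 3))                  # arbitrary perturbation direction

eps = 1e-6
Minv = np.linalg.inv(M)
fd = (np.linalg.inv(M + eps * dM) - np.linalg.inv(M - eps * dM)) / (2 * eps)
formula = -Minv @ dM @ Minv
print(np.allclose(fd, formula, atol=1e-6))
```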
Cross Product Finally consider f (q) = u(q) × v(q). The cross product is a binary bilinear
operator, so that

    {df }(δq) = ([du][δq]) × v + u × ([dv][δq]).

Notice that since the cross product is antisymmetric, one must be careful about the order of the
terms! Rewriting the differential in matrix form is most easily done using cross-product matrices:
let [u]× denote the matrix with [u]× v = u × v for any vector v. Then

    {df }(δq) = −v × ([du][δq]) + u × ([dv][δq])
              = −[v]× [du][δq] + [u]× [dv][δq]

    [df ] = −[v]× [du] + [u]× [dv].
2.4  The Hessian
For most of this chapter, we have been most concerned with how the differential {df (q)}(δq)
behaves when we vary the tangent vector input δq. Let us choose a value for δq (let’s call it δq1 )
and imagine the differential as a function only of q, with δq = δq1 held fixed: let
g(q) = {df (q)}(δq1 ).
This function g(q) is just another function over Q, so we can take the differential (again): {dg(q)}(δq2 ).
Like any other differential, dg is linear in the direction parameter δq2 (which is not necessarily related to the original direction δq1 we chose when differentiating f ).
We now have a differential-of-the-differential which depends on three parameters: the point
q ∈ Q, and two directions δq1 , δq2 ∈ T Q at q. We can write this second-differential using the
notation
{d2 f (q)}(δq1 , δq2 ) = {d [{df (q)}(δq1 )]} (δq2 ).
Let us take stock of its properties.
• The second-differential is a linear function of both δq1 and δq2 ;
• It is symmetric in δq1 and δq2 : swapping the two directions does not change the value of d2 f .
This is not obvious from the above construction, but if we expand out the two differentials
in terms of directional derivatives,
    {d2 f (q)}(δq1 , δq2 ) = (d/ds)|s=0 (d/dt)|t=0 f (q + tδq1 + sδq2 )
                           = (d2 /ds dt) f (q + tδq1 + sδq2 ) |s,t=0 ,
which is clearly symmetric in δq1 and δq2 . (The second step requires that the limits and
derivatives on the first line commute; this is guaranteed to be true for sufficiently-smooth
functions and amounts to the equivalence of mixed partial derivatives.)
• In the special case that f is a scalar function Q → R, then d2 f is a bilinear scalar function; for
a given choice of basis, the second-differential therefore has a matrix form (see Section 1.4):

    {d2 f (q)}(δq1 , δq2 ) = [δq1 ]T [d2 f (q)] [δq2 ].
The matrix [d2 f (q)] is often written Hf and is called the Hessian of f .2 The matrix Hf
is square and (for sufficiently smooth functions f ) symmetric. (Noticing that a computed
Hessian is not symmetric is a common way to spot a bug in derivative calculations!)
• For scalar functions f , we can express the entries of the Hessian matrix in coordinates. If we
write q = [q1 q2 . . . qn ]T , we have that

    [Hf (q)]ij = {d2 f (q)}(ei , ej ) = (∂ 2 f /∂qi ∂qj )(q),
and we see that Hf is the matrix of mixed second partial derivatives of f .
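The coordinate formula suggests a direct numerical construction of the Hessian from mixed second differences, along with a symmetry check (the test function and evaluation point below are arbitrary examples):

```python
import numpy as np

# Arbitrary example scalar function for illustration.
def f(q):
    return np.sin(q[0]) * q[1] + q[0] * q[2] ** 2

def hessian(f, q, eps=1e-4):
    """Approximate [Hf]_ij = d^2 f / dq_i dq_j by centered mixed differences."""
    n = len(q)
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            H[i, j] = (
                f(q + eps * (I[i] + I[j])) - f(q + eps * (I[i] - I[j]))
                - f(q - eps * (I[i] - I[j])) + f(q - eps * (I[i] + I[j]))
            ) / (4 * eps ** 2)
    return H

q = np.array([0.4, -1.1, 0.8])
H = hessian(f, q)
print(np.allclose(H, H.T, atol=1e-5))             # Hessian should be symmetric
print(np.isclose(H[0, 2], 2 * q[2], atol=1e-4))   # analytic check: d^2 f/dq1 dq3 = 2 q3
```

An asymmetric result from a routine like this is exactly the bug-spotting signal mentioned above.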
Figure 2.1: Log-log plots of the relative error between the exact and finite-difference approximation
of the derivative of f (q) = q̂, using forward and centered finite differences (left and right plots
respectively). Notice that for ε not too small (greater than about 10⁻⁶), there is a clear correlation
between the error and ε, as predicted by Taylor's theorem: the error decreases like O(ε) for forward
differences and like O(ε²) for centered differences. If ε becomes too small, though, the error increases, as catastrophic cancellation reduces the number of bits of information present in the finite
difference. For ε < 10⁻¹⁵, the finite difference is pure noise.
2.5  Checking Your Work
Although computing the differential is in principle a purely mechanical exercise, it is easy to make
mistakes, especially once many chain and product rule terms are involved (such as in the above
computation of the differential of the hat function.) Fortunately, there is an easy way to numerically
check your formula. We can expand any function f to second order using Taylor's theorem:

    f (q + δq) = f (q) + {df (q)}(δq) + ½ {d2 f (q)}(δq, δq) + o(‖δq‖²),

and in particular, provided that {df (q)}(δq) ≠ 0, it is possible to estimate the value of the differential using finite differences:

    f (q + δq) − f (q) ≈ {df (q)}(δq).
² Amusingly, the Hessian is named after its inventor Ludwig Otto Hesse, a student of Carl Jacobi, who invented the Jacobian!
This fact suggests a way of “probing” any given entry of the Jacobian matrix:
• Set q to a random value and evaluate f (q).
• Say you want to test the ith column of the Jacobian of f . Pick a small constant ε (say, ε = 10⁻⁶)
and compute f (q + εei ), where ei is the ith Euclidean basis vector.
• Since [df (q)][ei ] = [df (q)]i , if your formula for the differential is correct, you should have
that

    (f (q + εei ) − f (q)) / ε ≈ [df (q)]i .
How am I supposed to tell if the two sides are “approximately” equal? Unfortunately, this is
not a perfect science. In principle, as you shrink ε smaller and smaller, the two sides should
agree to higher and higher accuracy: again using the Taylor expansion, we can write more
precisely that

    (f (q + εei ) − f (q)) / ε = [df (q)]i + (ε/2)[d2 f (q)]ii + o(ε)

and so there should be a linear relationship between the error

    E = | (f (q + εei ) − f (q)) / ε − [df (q)]i |

and the size of ε. Unfortunately, in practice this will not be true, due to a phenomenon called catastrophic
cancellation. If f (q + εei ) and f (q) are too similar to each other, then they will have nearly identical IEEE
floating-point representations, so that their difference will be very imprecise (or in the worst case, numerically
zero). See Figure 2.1, left, for a plot of E versus ε showing this effect.
How small is too small? A double can store roughly 50 bits worth of mantissa information, and since
2⁵⁰ ≈ 10¹⁵, using ε < 10⁻¹⁵ is never a good idea (you will get pure noise from the finite difference calculation).
However loss of precision becomes noticeable well before ε = 10⁻¹⁵ (see Figure 2.1, left). Therefore you want
ε small, but not too small—the optimal tolerance will depend on factors such as the expected magnitude of
f (q) and [Hf ]ii . The most robust procedure to test derivative code is to make a plot, similar to the one in
Figure 2.1, comparing E to ε for a range of values and checking for linear decrease in E as you shrink ε. In a
pinch, checking that E is “small” for ε = 10⁻⁶ works well for most functions.
Centered Differences Using Taylor's theorem in the backwards δq direction, we have

    f (q − δq) = f (q) − {df (q)}(δq) + ½ {d2 f (q)}(δq, δq) + o(‖δq‖²).

This formula suggests an even more accurate estimate of the derivative using centered differences:

    (f (q + εei ) − f (q − εei )) / (2ε) = [df (q)]i + O(ε²).

Notice that the Hessian term has canceled on the right-hand side, so that the error

    E = | (f (q + εei ) − f (q − εei )) / (2ε) − [df (q)]i |

decreases quadratically in ε, rather than linearly. Using centered differences can thus help more
easily spot a bug in an implementation of the differential. Figure 2.1 shows plots of error versus ε for
both forward and centered differences on a log-log scale; notice that centered differences converge
more quickly, and to lower error, than forward differences. But take note: centered differences are
no less susceptible to loss of precision due to catastrophic cancellation.
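The two error behaviors are easy to see side by side. The sketch below (the test function, evaluation point, and ε are arbitrary choices) compares forward and centered differences against the exact derivative at a moderate ε, where truncation error dominates roundoff:

```python
import numpy as np

# Arbitrary smooth test function with a known derivative.
def f(x):
    return np.sin(3.0 * x)

def dfdx(x):
    return 3.0 * np.cos(3.0 * x)

x, eps = 0.37, 1e-4
forward = (f(x + eps) - f(x)) / eps                 # O(eps) accurate
centered = (f(x + eps) - f(x - eps)) / (2 * eps)    # O(eps^2) accurate

E_forward = abs(forward - dfdx(x))
E_centered = abs(centered - dfdx(x))
print(E_forward, E_centered)
print(E_centered < E_forward)   # centered is much more accurate at this eps
```

Sweeping ε over several decades and plotting both errors on a log-log scale reproduces the behavior of Figure 2.1.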
Chapter 3
The Spring Potential
ceiiinosssttuv
Robert Hooke
We will now derive our first physical potential, and calculate the corresponding force. We will
do so for the simplest possible elastic object: a spring in 1D. We will assume the spring has one
end anchored at x = 0; the other is free to move. Call the position of the second endpoint q; the
spring begins at, and is at rest when, q = ±L.
What is the energy of this spring? Perhaps you remember from high school physics Hooke's law

    Vspring = ½ k(|q| − L)²

for some spring constant k. However there is nothing “spooky” about this law and no reason to
take it on faith: it can be derived from simple geometric principles. Understanding these principles
will be the key to formulating elasticity in more complicated settings.
3.1  Deriving Hooke's Law
We will take the following assumptions about the behavior of Vspring for the spring as our first
principles:
Laws of the Spring
1. The energy Vspring depends only on the stretchedness (strain) ε of the spring.
2. The energy is zero when ε = 0.
3. The energy is non-negative.
Hopefully all of these laws are intuitive: the energy is minimized when the spring isn’t stretched or
compressed, and increases if the spring is stretched in either direction. The absolute energy doesn’t
really matter—recall that forces depend only on the differential of the energy −dVspring , and so
adding a constant to the potential energy does nothing to change the physics of the system—but
we might as well shift the energy so that it is zero when the spring is at rest.
We still need to define the “stretchedness” of the spring, called its strain ε. A natural definition
is the relative change in length of the spring

    ε = (|q| − L) / L.
Notice the absolute value: the spring is stretched the same amount when the second endpoint is at
q = x as when it is at q = −x, so these configurations have the same strain.
The above definition of strain should raise several questions. Who says that stretchedness
should be measured by a difference of lengths? Doesn't a difference of squared lengths, or
of cubed lengths, etc. also define a “stretchedness”? What about the denominator: why use
relative length instead of absolute difference of length? Why is ε relative to the rest length L
instead of the current length |q|?
The short answer is that it doesn't matter. Any formula for ε that is zero when the spring
is at rest, becomes more positive as the spring is stretched, and becomes more negative when the spring is
compressed, will also yield Hooke's law. (See the Bonus Math at the end of this section for some worked
examples.) Using relative length, so that ε is unitless, is conventional, for reasons that will be more clear in
higher dimensions; but again, using absolute length also works for deriving Hooke's law.
The formula above is called the Cauchy strain and is the simplest unitless formula for ε that satisfies
the requirements. Mechanical engineering is riddled with alternative formulas that are more convenient in
various applications and for various types of materials; a second particularly important formula is the Green
strain

    ε = (q² − L²) / (2L²),

which avoids absolute values. We will use the Green strain heavily in higher dimensions.
From the First Law, we know that energy must be a function of ε, i.e. Vspring (q) = Ṽspring (ε(q)) for
some function Ṽspring . You may not think we know much at all about Ṽspring – who's to say it's not
some crazy function like π^|ε| tan² ε? – but in fact the second and third laws tell us quite a bit. We
can Taylor expand Ṽspring to get

    Ṽspring (ε) = c0 + c1 ε + ½ c2 ε² + (1/6) c3 ε³ + O(ε⁴)
where ci are some constant coefficients. We know c0 = 0, because by the second law, Ṽspring (0) = 0.
We also know that c1 = 0. This fact is easy to see algebraically: since Ṽspring has a local minimum
at ε = 0, its derivative there must be zero, which means c1 = 0. Alternatively, c1 is the slope of
the tangent line to Ṽspring at ε = 0: if this line were anything but horizontal, a small change in ε
away from zero, in one direction or the other, would cause the energy to dip negative, violating the
Third Law.
For small displacements where the spring is not stretched too much relative to L, ε is much
smaller than 1 and so ε³ and higher powers are negligible compared to ε². This motivates the
approximation

    Vspring ≈ ½ c2 ε(q)² = ½ c2 ((|q| − L)/L)² = ½ k(|q| − L)²    (3.1)

where the constant expression c2 /L² has been absorbed into a single constant k. Hooke's law!
What about the constant c2 ? How do we derive it? Unfortunately, we can’t using geometric
arguments, unless in addition to L we also had a lot more information like the type of metal used
Figure 3.1: A cartoon of the force-strain relationship for a spring (strain on one axis, force magnitude on the other). For small amounts of force the relationship is linear (Hookean). As force becomes large, the spring's coils become permanently damaged; the spring is deforming plastically. Eventually the coils have been stretched completely straight, so that the spring looks like a wire, and again there is a linear force-strain relationship, until the wire thins and breaks.
to craft the spring, its thickness, its number of coils, etc—it is a material constant that must be
measured or provided to us. This constant (especially in the form of k above) is commonly called
the stiffness of the spring.
A couple of points to conclude our discussion of the simple 1D spring:
• As mentioned in Chapter 2, forces are covectors, and for conservative forces governed
by a potential energy, we can go from the energy to the force form by taking the negative
differential with respect to the degree of freedom q. In order to express this force as a row
vector of numbers, we want the matrix form (Jacobian) of the differential:

[F (q)] = −[dVspring (q)] ≈ −k(|q| − L) q̂ᵀ.

Notice that the force is now linear in ε. This formula is the force form of Hooke’s law, which
you may have also studied in high school.
• Hooke’s law is only an approximation: equation (3.1) has higher-order terms that are negligible when ε is small, but become increasingly important as ε grows large. Consider for example
a non-idealized spring: for small amounts of stretching ε, the spring obeys Hooke’s law, i.e. it
is Hookean. If you stretch the spring too much, though, the coils start to deform. In addition
to no longer obeying Hooke’s law, the spring also suffers permanent damage: it no longer
returns to its original length when released. This permanent damage is called plasticity and
also occurs in e.g. paper when you crease it. We won’t discuss plasticity further here.
If you keep stretching the spring, eventually the coils completely unwind until the spring looks
more like a wire: the spring’s behavior becomes Hookean again, though with a much higher
stiffness. Eventually, the wire itself starts deforming plastically until it thins and snaps due
to ductile fracture. This behavior can be summarized in a plot measuring the external force
applied vs. the strain of the spring; see figure 3.1.
• Two pieces of information are required to specify the behavior of the spring: (i) the strain
formula used for ε, and (ii) the material model consisting of the coefficients c2 , c3 , . . .. It is
important to understand that while the choice of strain formula is arbitrary, both the strain
and material constants are needed to specify the behavior of the spring: for every spring
behavior, and every strain formulation, there is a set of material constants (different for
different strains) that encode that behavior.
It may be surprising that it took three pages to describe the behavior of a simple spring! Almost
all of the key ideas underlying the theory of elasticity, which we will revisit much later in the course
in order to understand deformable volumes and cloth, are already contained in this example.
Bonus Math Here we will look further at how the spring energy equations change when a different
formulation is used for strain. In particular we will re-derive Hooke’s law using the Green strain

ε = (q² − L²)/(2L²).

By an argument identical to that for Cauchy strain, the Laws of the Spring imply that

Vspring = (1/2) d2 ε(q)² + (1/6) d3 ε(q)³ + · · · ≈ (1/2) d2 ((q² − L²)/(2L²))²,    (3.2)
for material constants d2 , d3 , . . . which doesn’t look at all like Hooke’s law in equation (3.1). However, we
can make the equations look more similar to each other with some algebraic manipulation:
(1/2) d2 ((q² − L²)/(2L²))²
    = (1/2) d2 (((|q| − L + L)² − L²)/(2L²))²
    = (1/2) d2 (((|q| − L)² + 2L(|q| − L) + L² − L²)/(2L²))²
    = (1/2) d2 (((|q| − L)² + 2L(|q| − L))/(2L²))²
    = (1/2) d2 ((|q| − L)⁴ + 4L(|q| − L)³ + 4L²(|q| − L)²)/(4L⁴)
    = (d2/(2L²))(|q| − L)² + (d2/(2L³))(|q| − L)³ + (d2/(8L⁴))(|q| − L)⁴,
and notice that the first term matches equation (3.1). In other words, the Green strain Hooke’s law agrees
with the Cauchy strain Hooke’s law (with c2 = d2 ). However the higher-order terms differ: to get the
exact same expression for energy, the Cauchy derivation would need c3 = 3d2 and c4 = 3d2 .
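Incidentally, the expansion above is an exact algebraic identity, not merely an approximation, so it can be sanity-checked numerically. A small Python sketch (the notes don’t prescribe a language; the values of d2, L, and the sample points are arbitrary):

```python
import numpy as np

# Check that the Green-strain energy (1/2) d2 ((q^2 - L^2)/(2 L^2))^2
# expands exactly into the three powers of (|q| - L) derived above.
d2, L = 3.0, 2.0
for q in np.linspace(0.5, 5.0, 20):
    green = 0.5 * d2 * ((q**2 - L**2) / (2 * L**2))**2
    s = abs(q) - L
    expanded = (d2 / (2 * L**2)) * s**2 \
             + (d2 / (2 * L**3)) * s**3 \
             + (d2 / (8 * L**4)) * s**4
    assert np.isclose(green, expanded)
```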
Next, we can check what happens if we use the deformed length of the spring in the strain denominator,
e.g.

ε = (|q| − L)/|q|.

We get

Vspring ≈ (1/2) e2 ((|q| − L)/|q|)².
In this case the manipulation is a little trickier: first,

(1/2) e2 ((|q| − L)/|q|)² = (1/2) e2 ((|q| − L)/(L + (|q| − L)))² = (e2/(2L²))(|q| − L)² (1/(1 + (|q| − L)/L))².

Now recall that

1/(1 + x) = 1 − x + x² − x³ + · · ·

(at least for x sufficiently small, so that convergence is not an issue) and that

1/(1 + x)² = 1 − 2x + 3x² − 4x³ + · · · .
Therefore

Vspring ≈ (e2/(2L²))(|q| − L)² − (e2/L³)(|q| − L)³ + (3e2/(2L⁴))(|q| − L)⁴ + · · · ,
which again agrees with the Cauchy strain Hooke law (with c2 = e2 ), with different higher-order terms.
The punchline from both of these calculations is that the spring’s behavior is determined by the strain
formulation and the material model. The strain formulation can be chosen freely (different formulas are
convenient in different circumstances) but the material model must then be specified with respect to that
strain.
3.2
Springs in Higher Dimensions
Now let us look at a spring connecting two endpoints in Rn (typically n = 2 or 3). The key
difference here is that in 1D, all change in the degree of freedom caused some change in the strain
of the spring; in high dimensions, it is possible to rotate the spring without changing its length.
There are thus many possible states of the spring where the potential is zero, besides the initial
state.
Let us first consider the case where one endpoint of the spring is clamped at the origin, so that
we can use as our degrees of freedom the second endpoint q ∈ Rn . All we really need to do now
is to write down the strain measure of our higher-dimensional spring; if we want to use Cauchy
strain, for instance, we can simply take
ε = (‖q‖ − L)/L
which is the straightforward generalization of the 1D strain formula to nD. Now all of the previous
arguments apply essentially unmodified, and we get that the Hookean spring potential is
Vspring (q) = (1/2) k(‖q‖ − L)².
The corresponding force can be calculated using the chain rule:

[F (q)] = −k(‖q‖ − L) q̂ᵀ,
which should make intuitive sense: the direction of the spring force is from the second endpoint
towards the origin, and the magnitude scales with the strain.
Although the potential is well-defined for all configurations q ∈ Q, there is a potential problem
with the force formula: what happens when q = 0? At this point, the term q̂ is indeterminate.
Should we be worried?
The answer is—perhaps. Physically, this point corresponds to the case where you’ve taken
a spring, and overpowered it so that it is completely compressed. Clearly the spring is highly
strained, and so the spring’s force will want to act to restore the spring’s shape... but in which
direction? In a real-world spring, the spring has thickness and a clear cross-section orientation, which would
tell us in which direction the spring would re-expand, but in the case of the idealized spring we’re studying
here, we’re not storing this information, and uniquely determining a force direction is impossible when q = 0.
Mathematically, the problem occurs because the potential has a tangent discontinuity at q = 0—a spike—where the potential is only C⁰ and
not C¹ (see inset plot of the potential, for the case n = 2). The potential doesn’t have a well-defined tangent plane at q = 0, so cannot have a
well-defined force there.
Since this discontinuity is only at a single point, in practice it doesn’t
pose much of a problem. A trajectory q(t) may not even cross exactly
over the spike, and even if it does, the spike can be ignored: take an
arbitrarily small neighborhood of the spike tip, “sand it down” to get rid
of the discontinuity, and the path that a physical trajectory takes over the
spike will not be affected. The reason for this is that q(t) is the solution
to an ODE that involves the force (which we will finally formalize in the
next chapter) and the integral of a function doesn’t change if you perturb the value of that function at only
one point.
In code, you do need to be aware of the tangent discontinuity, since if you try to calculate F naively at
a point too close to q = 0, you will get a numerical NaN that will propagate and corrupt the simulation. For
this reason you need to check for the failure case and hard-code a force (for instance, 0) when it occurs.
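For instance, a guarded force evaluation might look like the following Python sketch (the notes don’t prescribe an implementation language, and the function name and default constants here are illustrative):

```python
import numpy as np

def spring_force(q, k=1.0, L=1.0, eps=1e-12):
    """Force on the free endpoint of a spring anchored at the origin.

    Implements F(q) = -k (|q| - L) q_hat, returning a zero force when the
    endpoint is numerically at the origin, where the direction q_hat is
    undefined (k, L, and the eps threshold are illustrative choices).
    """
    r = np.linalg.norm(q)
    if r < eps:
        # Tangent discontinuity: no well-defined direction; hard-code 0
        # instead of dividing by r and producing NaNs.
        return np.zeros_like(q)
    return -k * (r - L) * (q / r)
```

For a stretched spring with q = (2, 0), k = L = 1, this returns (−1, 0), pulling the endpoint back toward the origin; at q = 0 it returns (0, 0) rather than NaN.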
We now make several modifications to the spring potential that will help us generalize it even
further. First, we can modify our notation a bit to explicitly specify the values of the two material
parameters, k and L. Let us also allow both endpoints to move, so that we need 2n degrees of
freedom to describe the spring system,

q_{2n×1} = [ q1 ; q2 ] = [ x1 , y1 , . . . , x2 , y2 , . . . ]ᵀ.
Here we have blocked out q into two parts: one block representing the first point’s degrees of
freedom, and the second block, the second’s. This will allow us to write down formulas in a much
cleaner way than if we restricted ourselves to only using q in raw form. We can write down explicit
formulas for q1 and q2 in terms of q, which will prove helpful:
q1 = S1 q = [ I_{n×n}  0_{n×n} ] q
q2 = S2 q = [ 0_{n×n}  I_{n×n} ] q.
Here the Si are n × 2n selection matrices that pluck out sub-parts of the global configuration vector
q. We can now define Cauchy strain as
ε = (‖q1 − q2 ‖ − L)/L

so that the spring potential and force are

V^{k,L}_spring (q1 , q2 ) = (k/2)(‖q1 − q2 ‖ − L)²    (3.3)

V^{k,L}_spring (q) = (k/2)(‖S1 q − S2 q‖ − L)²    (3.4)

F^{k,L}_spring (q1 , q2 ) = k(‖q1 − q2 ‖ − L) ((q1 − q2 )ᵀ/‖q1 − q2 ‖) ([dq2 ] − [dq1 ])    (3.5)

F^{k,L}_spring (q) = k(‖S1 q − S2 q‖ − L) ((S1 q − S2 q)ᵀ/‖S1 q − S2 q‖) (S2 − S1 ).    (3.6)
Equations (3.3) and (3.5) are the expressions for the energy and force in local coordinates: the
formulas involve only the qi and their differentials, without using any information about how the
qi are packed into the global configuration q. By contrast equations (3.4) and (3.6) are the same
formulas using global coordinates; the selection matrices pluck out the right parts of q to evaluate
the energy or force, and in the case of the force, stick the components of the force in the right place
in the entire configurational force vector. This distinction between local and global indices becomes
particularly important once we move to systems that involve more than two points and more than
one spring.
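For concreteness, here is one possible Python sketch of the global-coordinate energy and force, equations (3.4) and (3.6), with explicitly materialized selection matrices (all names are illustrative; as noted later, production code would usually avoid building the Si):

```python
import numpy as np

def selection_matrix(j, m, n):
    """The n x (m*n) matrix S_j with q_j = S_j @ q (point j of m points in R^n)."""
    S = np.zeros((n, m * n))
    S[:, j * n:(j + 1) * n] = np.eye(n)
    return S

def spring_energy_global(q, S1, S2, k, L):
    """Equation (3.4): spring energy in global coordinates."""
    d = S1 @ q - S2 @ q
    return 0.5 * k * (np.linalg.norm(d) - L)**2

def spring_force_global(q, S1, S2, k, L):
    """Equation (3.6): spring force, a row covector on all of q."""
    d = S1 @ q - S2 @ q
    r = np.linalg.norm(d)
    return k * (r - L) * (d / r) @ (S2 - S1)
```

For two points in the plane (m = 2, n = 2) with q = (0, 0, 2, 0), k = 1, L = 1, the force comes out as (1, 0, −1, 0): equal and opposite on the two endpoints, pulling them together since the spring is stretched.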
3.3
Mass-Spring Systems
Let us now suppose we have m different points, and s different springs; the configuration space is
now Q = Rmn . Let us label the springs using indices 1, ..., s, and denote the beginning and ending
point of spring i by bi and ei , and the rest length and stiffness of spring i by Li and ki . Like in
the one-spring case, we can partition q into m vectors of size n × 1, with qj = Sj q for an n × mn
selection matrix Sj .
What is the potential energy of the system? We simply add together all of the potentials of the
individual springs:
Vtot (q) = Σ_{i=1}^{s} V^{ki ,Li}_spring (q_{bi} , q_{ei} ).
Since the differential distributes over sums, the total force is just the sum of force contributions
from the individual springs:
Ftot (q) = Σ_{i=1}^{s} F^{ki ,Li}_spring (q_{bi} , q_{ei} )
         = Σ_{i=1}^{s} ki (‖q_{bi} − q_{ei} ‖ − Li ) ((q_{bi} − q_{ei} )ᵀ/‖q_{bi} − q_{ei} ‖) (S_{ei} − S_{bi} ).

There are a couple of things to notice about this expression:
• There is one term in the sum per spring, and each term is just a copy of the spring force
expression in equation (3.5), with the resulting force slotted into different parts of the final,
global configurational force Ftot by different selection matrices. The local force expression
can thus be thought of as a template or stencil that is applied over and over again to each of
the springs in the system.
• The above formula in global coordinates, with selection matrices converting local expressions
to global ones, is compact and foolproof. However, most developers experienced with physical
simulation would not implement the configurational force in this way, since explicitly materializing the selection matrices is computationally inefficient—they would instead write code
that updates the global (configurational) force vector Ftot directly as if the stencil had been
multiplied by the selection matrices. But if in doubt, you should feel free to implement the
actual selection matrices in your own code, until you are comfortable optimizing them away!
• Each term in the sum will modify exactly two blocks of n coordinates each in the configurational force vector Ftot . Each of these paired contributions is “equal and opposite”, i.e., the
force on the point at one end of a spring is the same magnitude, but opposite direction, as the
force on the point at the other end of the spring. We will see later why this must be true, and
how it is related to conservation of linear momentum, but for now, it is a good sanity-check
that we haven’t made any mistakes in deriving our formula.
• A coordinate of the configurational force vector could be modified by zero, one, or many
different terms in the sum. If a point is connected to several springs, each of those springs
will exert a force on that point independently, with the total on that point the sum of these
contributions. When writing code to compute Ftot , it is very important to use += when
accumulating contributions!
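In code, the “stencil” viewpoint from the bullets above might look like the following Python sketch, which updates the global force vector directly with += rather than materializing any selection matrices (the data layout and names are illustrative):

```python
import numpy as np

def total_spring_force(q, springs, n):
    """Accumulate per-spring stencil forces directly into the global force.

    q is the flat configuration (m*n entries); springs is a list of
    (b, e, k, L) tuples: endpoint indices, stiffness, rest length.
    """
    F = np.zeros_like(q)
    for b, e, k, L in springs:
        qb = q[b * n:(b + 1) * n]
        qe = q[e * n:(e + 1) * n]
        d = qb - qe
        r = np.linalg.norm(d)
        f = k * (r - L) * (d / r)      # stencil: force on endpoint e
        F[e * n:(e + 1) * n] += f      # += : several springs may touch a point
        F[b * n:(b + 1) * n] -= f      # equal and opposite on endpoint b
    return F
```

As a sanity check, for three collinear points joined by two identical stretched springs, the middle point’s contributions cancel and the total force sums to zero, consistent with the “equal and opposite” bullet above.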
Chapter 4
Time Integration
A change in motion is proportional to the
motive force impressed and takes place
along the straight line in which that force
is impressed.
Isaac Newton
We will now finally complete the ingredients we need to define a physical simulation: we will
look at the equations of motion that define an exact (time-continuous) physical trajectory, then
discretize those laws to yield algorithms for advancing time in a simulation.
4.1
Kinetic Energy and Momentum
Momentum
We can use kinetic energy to define momentum
[p] = [d_{q̇} T ],

where the notation [d_{q̇} T ] denotes the Jacobian (matrix form of the differential) of T with respect to
q̇. Or, in the notation of Chapter 1, momentum is the dual of velocity with respect to the kinetic
energy metric: p is a covector, which multiplies velocity to give a scalar (energy). Notice that p is
a function of q̇ and, perhaps, q as well. For the special case of a system whose kinetic energy is in
standard form T = 12 q̇T M (q)q̇, we have p = q̇T M (q), which aligns with the common definition of
momentum as “mass times velocity.”
4.2
Equations of Motion
Recall that a trajectory is a curve q(t) through configuration space. It is also useful to think
of a trajectory through phase space, the 2n-dimensional space Q ⊕ P consisting of configuration
space as the first n dimensions, and the space of configurational momenta P as the second set of n
dimensions. Every point in phase space is a pair of position and momentum (q, p), and trajectories
can be represented as paths (q(t), p(t)) through phase space.
Notice that not all paths through phase space are physical paths. In particular, for systems of
point particles, p = q̇T M for a constant mass matrix M , so knowing q(t) completely determines
Figure 4.1: The equations of motion determine a vector field in phase space; for the case of a 1D
zero-rest-length spring, this vector field looks like a “whirlpool” (left). Given initial conditions,
flowing along the vector field for time t traces out a physical trajectory (center). Different initial
conditions yield different valid, physical trajectories. Time integrators approximate the true flow
using the equations of motion (right). Explicit Euler uses the vector at the start of each step (red),
Implicit Euler, the vector at the end (blue), and Implicit Midpoint, the vector halfway between
the start and end point (magenta). Notice the characteristic energy blowup of Explicit Euler, and
energy damping of Implicit Euler.
p(t); only paths with p(t) = q̇(t)T M are physically allowed. Another view of this restriction is
that at every time t, a trajectory through phase space has a tangent vector (q̇(t), ṗ(t)), and we
know what the first component of this tangent vector must be: M⁻¹pᵀ.
We need one more piece of information, ṗ(t), and we get this from Newton’s second law. For
a single particle you might remember this law as F = ma; this formula generalizes to F (q) = ṗ.
Notice that for the special case of systems of point particles, this formula simplifies to F T = M q̈,
which looks very similar to the one-particle formula. For now, we take this law as given; we will see
in the next chapter how to derive this law from a more fundamental physical principle, Hamilton’s
principle of extreme action. We can combine Newton’s second law with the definition of momentum
to give a pair of fundamental equations called the equations of motion of the system:
q̇ = M⁻¹pᵀ
ṗ = F (q).
The equations of motion uniquely determine physical trajectories, in the following sense. At every
point (q, p) in phase space, one can draw a vector (q̇, ṗ) as determined by the equations of motion.
Given initial conditions (q(0), p(0)), one can then trace out the trajectory of those initial conditions
over time by flowing the initial conditions along the vector field. More specifically, one can define
the trajectory γ(t) as the unique curve satisfying

γ(0) = (q(0), p(0))
γ′(t) = (M⁻¹pᵀ, F (q)).
A closely-related concept is the flow φt (q, p) which computes, for any time t and initial point (q, p),
the point in phase space after flowing the point along the vector field for time t. The flow of the
initial conditions, as a function of t, then sweeps out the trajectory:
γ(t) = φt [q(0), p(0)] .
Example Consider a 1D zero-rest-length spring, with one end anchored at x = 0 and the second
at x = q, so that the system has only a single degree of freedom q. Let us assume that the point
has unit mass, and that the spring has unit stiffness, so that V (q) = (1/2)q² and F (q) = −q. The
equations of motion are then
q̇ = p
ṗ = −q.
Since the phase space is two-dimensional, we can draw it in the plane, with position along the
x-axis and momentum along the y-axis; we can then also draw at every point the vector field
determined by the equations of motion (Figure 4.1, left). It looks like a whirlpool, with rotational
symmetry of the vectors and linear growth in vector magnitude as radius increases. Given initial
conditions (q(0), p(0)), the physical trajectory is given by the flow of the initial conditions along the
vector field (Figure 4.1, middle); for the whirlpool vector field, this flow yields circular trajectories.
Such trajectories should make common sense: an oscillating spring alternates between the spring
being maximally stretched and the endpoint having zero momentum (far left and far right points
of the circular trajectory), and the spring being fully compressed but having maximum left or right
momentum (top and bottom points of the trajectory). Different initial conditions yield different
trajectories, though all trajectories are circles: this corresponds to the fact that the spring always
oscillates in the same way no matter how it is initially excited or stretched, with the only difference
being the amplitude of oscillation (radius of the circular path in phase space).
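For this particular system the flow is available in closed form: φt is just clockwise rotation of the phase plane, q(t) = q(0) cos t + p(0) sin t and p(t) = p(0) cos t − q(0) sin t (differentiate to check that q̇ = p and ṗ = −q). A quick Python sketch confirming that trajectories are circles (the function name is illustrative):

```python
import numpy as np

def flow(q0, p0, t):
    """Exact flow of qdot = p, pdot = -q: clockwise rotation in phase space."""
    return (q0 * np.cos(t) + p0 * np.sin(t),
            p0 * np.cos(t) - q0 * np.sin(t))

# The radius sqrt(q^2 + p^2) (equivalently, the energy) is constant
# along the flow, so every trajectory traces a circle.
q0, p0 = 1.0, 0.0
for t in np.linspace(0.0, 10.0, 50):
    q, p = flow(q0, p0, t)
    assert np.isclose(q * q + p * p, q0 * q0 + p0 * p0)
```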
Does there always exist a trajectory in phase space, for any initial conditions? Or is it possible
for a trajectory to reach a “dead end”? Is the trajectory always unique? In some cases, both
existence and uniqueness of the trajectory, given initial conditions, are guaranteed: the Picard–Lindelöf theorem, for instance, ensures this is the case whenever force is sufficiently smooth
as a function of q (i.e., Lipschitz-continuous). However, this answer is not fully satisfying:
we’ve already encountered a case where force is discontinuous at some points, in the previous
chapter on higher-dimensional springs, and a discontinuous force is certainly not Lipschitz-continuous. In this
this course, we simply won’t worry about such issues: we will take for granted that for any physical systems
we’re interested in, a unique trajectory always exists. In the time-discrete setting, which we describe below,
the question is moot since our algorithms will always yield a unique approximation to φt for any starting
point (though see the discussion below on the failure of implicit methods.)
4.3
Time Integrators
For very simple physical systems, such as the mass-spring networks, it is actually possible to write
down closed-form formulas for φt given any initial conditions and values for t. But for physical
systems of sufficient complexity, this is no longer possible, and the best we can do is to approximate
the flow (or equivalently, the trajectory γ(t)) using a finite amount of computation.
The key idea is to forget about computing the trajectory at every time t, and instead cut up
time into windows of a finite size h. We will then estimate the trajectory only at the boundaries
between windows: at t = 0, h, 2h, 3h, . . .. The window size h is called the discrete time step, and
to avoid overly cluttered notation, we will use superscripts to denote values of q and p at integer
multiples of the time step: q0 = q(0), qi = q(ih), etc.
A discrete time integrator Φ maps from points in phase space to new points in phase space,
and approximates flowing the input point along the vector field given by the equations of motion
for time h. In other words
Φ(q, p) ≈ φh (q, p).
There are many, many possible candidates for Φ, with their pros and cons, and we will examine
several of them here. Before we do so, though, note that this time integrator is the last piece we
need in order to compute a physical trajectory using numerical simulation: we start with initial
conditions q0 , p0 , then compute (q1 , p1 ) = Φ(q0 , p0 ), then compute (q2 , p2 ) = Φ(q1 , p1 ), and so
on and so forth, for as many time steps as we desire. At each stage, computing qi+1 and pi+1 is
just a matter of applying the time integrator map to the previous position and momentum, qi and
pi .
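In code, this outer loop is only a few lines; a Python sketch (the names are illustrative):

```python
def simulate(Phi, q0, p0, steps):
    """Iterate a one-step integrator Phi to trace out a discrete trajectory.

    Returns the list [(q0, p0), (q1, p1), ..., (q^steps, p^steps)],
    where (q^{i+1}, p^{i+1}) = Phi(q^i, p^i).
    """
    traj = [(q0, p0)]
    q, p = q0, p0
    for _ in range(steps):
        q, p = Phi(q, p)
        traj.append((q, p))
    return traj
```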
Explicit Euler Perhaps the most natural time integrator can be motivated as follows. At a given
point (qi , pi ) in phase space, the local linear approximation to γ(t) is given by the equations of
motion:¹

γ((i + 1)h) ≈ γ(ih) + hγ′(ih) = (qⁱ, pⁱ) + h (M⁻¹(pⁱ)ᵀ, F (qⁱ)).

This approximation leads directly to formulas for Φ:

qⁱ⁺¹ = qⁱ + hM⁻¹(pⁱ)ᵀ
pⁱ⁺¹ = pⁱ + hF (qⁱ).
Geometrically, these formulas can be interpreted as follows: evaluate the vector field at your current
point in phase space, then move an amount h in the direction specified by that vector. This time
integrator is called Explicit Euler : explicit, because calculating qi+1 and pi+1 does not require any
particular effort; you simply plug in known values into the right-hand side of the equations, and
directly read off the new values for position and momentum.
More likely than not, any tutorial you will find online about physical simulation will start with
an implementation of Explicit Euler. This is a sad mistake—outside of its pedagogical simplicity,
Explicit Euler is a terrible time integrator and should not be used in practical codes under any
circumstance. Explicit Euler is unconditionally unstable: no matter how small you pick h, a
simulation using Explicit Euler will “blow up,” and quickly, as you start taking a large number of
time steps: the energy of the system will grow exponentially, until all of your degrees of freedom
are infs and nans. Do not use Explicit Euler!
Implicit Euler Notice that there is a peculiar asymmetry in the Explicit Euler step described
above: the vector representing the step taken, [(qi+1 , pi+1 ) − (qi , pi )], matches the vector field
given by the equations of motion at the beginning of the step, but not at the end of the step. What
if we turn things around? What if we instead look for a new position and momentum so that the
¹ Here and throughout the rest of this section, we assume kinetic energy has the standard form, and that the mass
matrix is constant, so that p = q̇ᵀM. The integrators derived here must be modified accordingly if these assumptions
do not hold.
vector field at the end of the step aligns with the step taken? This idea yields the time integrator
qⁱ⁺¹ = qⁱ + hM⁻¹(pⁱ⁺¹)ᵀ
pⁱ⁺¹ = pⁱ + hF (qⁱ⁺¹),
often called Implicit Euler. Notice that it is far from clear how to actually find qⁱ⁺¹ or pⁱ⁺¹:
the first equation requires the new momentum, but to find the new momentum using the second
equation, you first need the new position, and you have entered a chicken-and-egg situation. This
is why the integrator is called implicit: there is no closed-form formula for the new position and
momentum, but rather, they have to be computed by solving an implicit equation coupling them
together. Let us set aside the obstacle of how to solve these equations for now—we will address it
in the next section. Does Implicit Euler work any better than Explicit Euler in practice? Implicit
Euler has the opposite behavior: where a simulation using Explicit Euler experiences unphysical
energy growth over time, Implicit Euler causes unphysical numerical damping: instead of oscillating
at constant amplitude forever, a spring simulated using Implicit Euler will slowly come to rest, as
if the spring were experiencing drag.
Although this damping is nonphysical, Implicit Euler is nevertheless extremely popular in applications that care more about visual plausibility than physical correctness, such as real-time games,
where the damping and robustness of the integrator is considered a feature, not a bug.
Trapezoid Rule If Explicit Euler has too much energy growth, and Implicit Euler has too much
damping, surely we can just average the two results, and get better behavior? Shockingly, this idea
actually works. The Trapezoid Rule

qⁱ⁺¹ = qⁱ + hM⁻¹((pⁱ⁺¹ + pⁱ)/2)ᵀ
pⁱ⁺¹ = pⁱ + h(F (qⁱ⁺¹) + F (qⁱ))/2
is also implicit, but is the first example of an integrator with good energy behavior : for sufficiently
small time steps, the energy neither grows too large, nor decays too much.
Implicit Midpoint Another idea is to require the step to align with the vector in the middle of
the step, rather than at the beginning or end; this idea leads to the Implicit Midpoint integrator
which also has good energy behavior:

qⁱ⁺¹ = qⁱ + hM⁻¹((pⁱ⁺¹ + pⁱ)/2)ᵀ
pⁱ⁺¹ = pⁱ + hF ((qⁱ⁺¹ + qⁱ)/2).
Velocity Verlet This next integrator is more difficult to motivate geometrically: it relies on
using one component of the vector from the start of the step, and the other component from the
end of the step:
qⁱ⁺¹ = qⁱ + hM⁻¹(pⁱ)ᵀ
pⁱ⁺¹ = pⁱ + hF (qⁱ⁺¹).
Notice that despite the appearance of qi+1 on the right-hand side, this method is explicit, since you
can compute qi+1 first using known information, then compute pi+1 without difficulty. Notice also
the extreme similarity between Velocity Verlet and Explicit Euler: there is only one small change
in one superscript. But what a difference this change makes! Whereas Explicit Euler reliably blows
up at any time step, Velocity Verlet has good energy behavior.
Velocity Verlet goes by several other names, such as the Leapfrog method, or Symplectic Euler.
All are essentially the same method, though sometimes they are written with terms reordered or
reindexed.
Runge-Kutta 4 I will not give the full formula for Runge-Kutta 4, which is quite complex, but
I must mention it given its ubiquitous presence in scientific computing applications. As discussed
below, Runge-Kutta 4 has very high accuracy when used in simulations over short time scales, and
is explicit.
We can compare several of these time integrators for one time step on the 1D spring example
(Figure 4.1, right): notice that Explicit Euler spirals out, Implicit Euler spirals in, and Implicit
Midpoint stays close to the circle defining the true trajectory through phase space.
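This comparison is easy to reproduce numerically for the unit-mass, unit-stiffness 1D spring. A Python sketch follows; note that for this linear system the Implicit Euler update can be solved in closed form, so no nonlinear solve is needed here (all names are illustrative):

```python
# One-step maps for the unit 1D spring, F(q) = -q, matching the
# integrators above. True energy E = (q^2 + p^2)/2 is conserved exactly.
def explicit_euler(q, p, h):
    return q + h * p, p - h * q

def implicit_euler(q, p, h):
    # Solving q' = q + h p', p' = p - h q' in closed form.
    return (q + h * p) / (1 + h * h), (p - h * q) / (1 + h * h)

def velocity_verlet(q, p, h):
    qn = q + h * p
    return qn, p - h * qn

def energy_after(stepper, steps=1000, h=0.1):
    q, p = 1.0, 0.0            # initial energy is 0.5
    for _ in range(steps):
        q, p = stepper(q, p, h)
    return 0.5 * (q * q + p * p)
```

After 1000 steps of size h = 0.1 the true energy is still 0.5, but Explicit Euler’s energy has grown by orders of magnitude, Implicit Euler’s has decayed toward zero, and Velocity Verlet’s stays close to 0.5, mirroring the three behaviors described above.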
4.4
Newton’s Method
Before we discuss the properties of time integrators in more depth, we first describe how to actually
go about solving for qⁱ⁺¹ and pⁱ⁺¹ in implicit methods. The key tool for doing this is Newton’s
method, without a doubt one of the most important algorithms ever invented.
Let us step back and look at a general function f (q) : Rn → Rn , and suppose we want to find
a value of q so that f (q) = 0. We will also assume that we start with an initial guess q0 so that
f (q0 ) ≈ 0. Newton’s method gives us a way of turning this initial guess into an even better guess.
The main idea is to Taylor-expand f at q0 :
f (q0 + δq) = f (q0 ) + [df (q0 )]δq + O(‖δq‖²).    (4.1)
We want to find a correction δq so that f (q0 + δq) = 0, but we cannot solve equation (4.1) exactly
for δq, since the right-hand side might contain arbitrarily many high-order terms. What we can do
instead is to assume that the initial guess is a good one, so that higher powers of δq are negligibly
small. This assumption leads to the approximation
0 ≈ f (q0 ) + [df (q0 )]δq,
and now we can find δq by solving a linear system, since [df (q0 )] is an n × n matrix.
We now have a better guess q1 = q0 + δq. If f (q1 ) is sufficiently close to zero, we have our
solution; otherwise, we can continue to improve the guess by repeating the entire process, but this
time linearizing f about q1 . This procedure leads to the following algorithm, which takes in f , a
tolerance ε, and an initial guess q0 , and returns a q with ‖f (q)‖ ≤ ε.
1: function Newton(function f , initial guess q0 , tolerance ε)
2:     while ‖f (q0 )‖ > ε do
3:         δq ← −[df (q0 )]⁻¹ f (q0 )
4:         q0 ← q0 + δq
5:     end while
6:     return q0
7: end function
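A Python sketch of this pseudocode, adding an iteration cap and using a linear solve in place of the explicit inverse, two of the practical points discussed in the bullets below (all names are illustrative):

```python
import numpy as np

def newton(f, df, q0, tol=1e-10, max_iters=50):
    """Newton's method: find q with ||f(q)|| <= tol, starting from q0.

    df(q) must return the n x n Jacobian [df(q)]. The iteration cap
    guards against non-convergence from a bad initial guess.
    """
    q = np.array(q0, dtype=float)
    for _ in range(max_iters):
        r = f(q)
        if np.linalg.norm(r) <= tol:
            return q
        # Solve [df] dq = -f rather than forming the inverse matrix.
        dq = np.linalg.solve(df(q), -r)
        q = q + dq
    raise RuntimeError("Newton's method did not converge")
```

For example, solving x² − 2 = 0 (as a one-dimensional system) from the initial guess x = 1 converges to √2 in a handful of iterations.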
Newton’s method performs extremely well in practice, often finding the zeros of complicated
nonlinear functions f in high dimensions in under a dozen iterations. However, its proper use
requires being aware of several caveats:
• it is not guaranteed to converge for every initial guess q0 . The algorithm might fail to
terminate if q0 is too far from a solution to f (q) = 0 (and especially if q0 is separated from a
zero of f by a critical point of f ), or q might diverge to infinity, or Newton’s method might
only converge after a great many iterations. The full conditions that guarantee convergence
can be found in a numerical analysis textbook and are too complex to cover here, as are
advanced modifications to Newton’s method (trust regions and advanced line searches) that
can help ensure convergence in tough cases. (We will look at how Newton’s method can be
made more robust in some cases, such as when solving for an Implicit Euler step, using a line
search in the next chapter.) The punchline is that Newton’s method is only reliable when the
initial guess is good, but fortunately, in all cases when we need to use Newton’s method in
this class, we will have access to very good initial guesses. As an extra protection against
failure, it is common to specify a maximum number of iterations to allow in the while loop,
and to abort if this maximum number of iterations is exceeded.
• one specific failure case is if the function Jacobian [df ] is singular during some iteration of
Newton’s method, so that there is no solution for δq during that step. One common situation
where this can arise is when f has a double root near the initial guess: in particular Newton’s
method is not suitable for finding the zeros of scalar functions of the form f (q) = g(q)2 ,
where it is guaranteed that any zero is also a local minimum.
• Newton’s method is very robust to errors in df and will converge (for a good initial guess)
even when [df ] differs significantly from its correct value. This is both a blessing and a curse:
on the one hand, it opens the door to quasi-Newton methods which purposefully replace df
with an approximation, in cases where df is difficult or expensive to compute exactly. But the
robustness of Newton’s method also makes debugging df difficult, since major implementation
errors might still result in code that appears to work correctly on many test cases, failing
only for very complicated large physical systems. One rule of thumb: if Newton’s method
requires more than a dozen or so iterations when applied to an implicit time integrator, there
is very likely a bug in the df computation, even if the physical trajectory appears correct at
a glance.
• Although the pseudocode above includes an expression for δq in terms of the inverse [df]⁻¹ of
the Jacobian matrix, in practice you never actually materialize this inverse matrix. Instead
you invoke a linear solver that solves problems of the form Ax = b directly, given the
matrix A and right-hand side b. Linear solvers require substantially less computation than
building A⁻¹ explicitly. Moreover, in many cases (such as when f comes from terms involving
forces in a large physical system) the matrix [df] is sparse. In these cases it is critical for
good performance to use sparse matrix data structures and sparse linear solvers when
computing δq.
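To make these points concrete, here is a minimal sketch of Newton's method in Python using NumPy. The `numpy.linalg.solve` call stands in for the linear solver (a sparse solver would replace it for large systems), and the circle-intersection example at the bottom is a made-up test problem, not anything from the course code:

```python
import numpy as np

def newtons_method(f, df, q0, tol=1e-10, max_iters=100):
    """Solve f(q) = 0 starting from the initial guess q0.

    df(q) must return the Jacobian matrix [df] at q. Rather than
    forming [df]^{-1}, each step solves the linear system
    [df] delta_q = -f(q) with a linear solver.
    """
    q = q0.astype(float)
    for _ in range(max_iters):          # abort if max_iters is exceeded
        r = f(q)
        if np.linalg.norm(r) < tol:
            return q
        # One Newton step: solve [df] delta_q = -f(q).
        delta_q = np.linalg.solve(df(q), -r)
        q += delta_q
    raise RuntimeError("Newton's method failed to converge")

# Hypothetical example: intersect the unit circle with the line y = x.
f = lambda q: np.array([q[0]**2 + q[1]**2 - 1.0, q[0] - q[1]])
df = lambda q: np.array([[2.0 * q[0], 2.0 * q[1]], [1.0, -1.0]])
root = newtons_method(f, df, np.array([1.0, 0.5]))
```

From the good initial guess (1, 0.5), the iteration converges in a handful of steps to (1/√2, 1/√2).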
Application to Implicit Integrators Let us now look at how Newton's method can be used
to compute the action of an implicit time integrator, for example Implicit Euler. Implicit Euler
involves 2n unknowns $q_{i+1}$ and $p_{i+1}$, but it is possible to eliminate half of these by plugging one
of the equations of motion into the other one. For example, let us eliminate momentum to get
\[ q_{i+1} = q_i + hM^{-1}\big(p_i + hF(q_{i+1})\big)^T = q_i + hM^{-1}p_i^T + h^2 M^{-1} F(q_{i+1})^T. \]
To transform this equation into the form f(q) = 0, we just need to move all terms to one side:
\[ f(q_{i+1}) = q_{i+1} - q_i - hM^{-1}p_i^T - h^2 M^{-1} F(q_{i+1})^T = 0, \]
where the unknown variable here is $q_{i+1}$. What do we use as the initial guess? There are several
good options, for example, $q_{i+1}^0 = q_i$, which is a very good guess when the time step h is small.
Another good option is the Explicit Euler step $q_{i+1}^0 = q_i + hM^{-1}p_i^T$.
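As a concrete illustration, here is a hedged sketch (Python with NumPy) of one Implicit Euler step computed by applying Newton's method to this f. For simplicity the sketch stores p as a column vector, dropping the transposes, and the single linear spring with its mass, stiffness, and step size is a made-up example system:

```python
import numpy as np

def implicit_euler_step(q, p, h, Minv, F, dF, iters=20):
    """One Implicit Euler step, found by solving f(q_next) = 0 with Newton:

    f(q_next) = q_next - q - h Minv p - h^2 Minv F(q_next),
    whose Jacobian is [df] = I - h^2 Minv [dF].
    """
    n = q.size
    q_next = q.copy()                      # initial guess: q_{i+1}^0 = q_i
    for _ in range(iters):
        f = q_next - q - h * Minv @ p - h * h * Minv @ F(q_next)
        if np.linalg.norm(f) < 1e-12:
            break
        J = np.eye(n) - h * h * Minv @ dF(q_next)
        q_next += np.linalg.solve(J, -f)
    p_next = p + h * F(q_next)             # recover the eliminated momentum
    return q_next, p_next

# Hypothetical system: a single 1D linear spring, F(q) = -k q.
k, m, h = 10.0, 2.0, 0.01
Minv = np.array([[1.0 / m]])
F = lambda q: -k * q
dF = lambda q: np.array([[-k]])
q1, p1 = implicit_euler_step(np.array([1.0]), np.array([0.0]), h, Minv, F, dF)
```

For this linear force, Newton converges in a single iteration; for nonlinear forces the same loop simply runs a few more times.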
Bonus Math Observe that we can multiply the equation for f by M without changing its zeros, so we
could instead apply Newton's method to
\[ f(q_{i+1}) = Mq_{i+1} - Mq_i - h\,p_i^T - h^2 F(q_{i+1})^T = 0. \]
Notice that the Jacobian of this function is $[df] = M - h^2[dF^T]$. M is (almost always) symmetric positive-definite
and $[dF^T]$ is always symmetric for conservative forces by the equality of mixed partial derivatives,
so that $[df]$ is symmetric and has only real eigenvalues. If λ is the smallest eigenvalue of M and µ the
largest eigenvalue of $[dF^T]$, then as long as $\lambda > h^2\mu$ (not a difficult condition, when h is small) the matrix
$[df]$ is positive-definite. This has two benefits: first, if $[df]$ is positive-definite it cannot be singular; and
second, it allows specialized efficient and robust solvers (such as sparse Cholesky factorization, or the
method of conjugate gradients) to be used when solving for δq in Newton's method.
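This claim is easy to check numerically. In the sketch below (Python/NumPy), the mass matrix and the symmetric force Jacobian are arbitrary made-up numbers chosen so that the eigenvalue condition holds; Cholesky factorization then succeeds (it would fail for an indefinite matrix) and can be used to solve the Newton system:

```python
import numpy as np

M = np.diag([2.0, 3.0, 5.0])                 # SPD mass matrix (made up)
dFT = np.array([[-4.0,  1.0,  0.0],          # symmetric force Jacobian of a
                [ 1.0, -2.0,  0.5],          # hypothetical stable spring network
                [ 0.0,  0.5, -3.0]])
h = 0.1

J = M - h * h * dFT                          # Newton Jacobian [df] = M - h^2 [dF^T]
assert np.all(np.linalg.eigvalsh(J) > 0)     # symmetric positive-definite

# Cholesky factorization J = Lc Lc^T succeeds only for SPD matrices;
# use it to solve [df] dq = -f via two triangular solves.
Lc = np.linalg.cholesky(J)
f = np.array([0.3, -0.1, 0.2])               # hypothetical Newton residual
dq = np.linalg.solve(Lc.T, np.linalg.solve(Lc, -f))
```

For large sparse systems one would use a sparse Cholesky or conjugate gradient solver instead, but the structure of the computation is the same.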
4.5 What Makes a Good Time Integrator?
How do we pick a time integrator? Above we colloquially described some integrators as being stable
or having good energy behavior; let us try to pin down some of these desirable properties, and take
inventory of the simple integrators listed above.
Accuracy There are two kinds of accuracy that we can discuss when examining a time integrator:
(i) how does the approximate, numerical step Φ compare to the exact flow $\varphi_h$, as a function of h?
This is the local error of the method; (ii) for a fixed time step h, how does the computed trajectory
$q_i$ compare to the true trajectory $\varphi_{hi}\,q_0$ as i gets large? This is the global error of the method,
since it cares about the fidelity of the computed trajectory over a large number of steps, rather
than just about the error in one step.
The local error is commonly specified by stating the method's order: the largest integer k so
that
\[ \|\Phi(q_0) - \varphi_h q_0\| \in O(h^k). \]
A method with k = 1 is called first order, a method with k = 2, second order, and so forth. Note
that any time integrator worth mentioning will be at least first order: if decreasing h does not
improve the accuracy of the time integrator, the results of the integrator are complete garbage!
Another interpretation of order is: if you halve the time step, what happens (roughly speaking) to
the error of one step of the method? If the error also halves, the method is first-order; if the error
quarters, the method is second-order, etc. Runge-Kutta 4 is a fourth-order method, which means
that if you halve the time step (and hence double the amount of computation needed to simulate
a trajectory of fixed length), you decrease the error by a factor of 1/16! This characteristic explains
why RK4 is so popular in scientific computing.
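We can verify these order claims numerically. The plain-Python sketch below (the test equation q̇ = −q and the step sizes are chosen arbitrarily) integrates over a fixed interval with step h and then h/2, and compares the global errors: Explicit Euler's error roughly halves, while RK4's drops by roughly a factor of 16:

```python
import math

def integrate(step, q0, h, t_end):
    """Repeatedly apply a one-step method to cover [0, t_end]."""
    q, n = q0, round(t_end / h)
    for _ in range(n):
        q = step(q, h)
    return q

# Test problem: dq/dt = f(q) = -q, with exact solution q(t) = q0 exp(-t).
f = lambda q: -q

def euler(q, h):                 # Explicit Euler: first order
    return q + h * f(q)

def rk4(q, h):                   # classical Runge-Kutta 4: fourth order
    k1 = f(q)
    k2 = f(q + 0.5 * h * k1)
    k3 = f(q + 0.5 * h * k2)
    k4 = f(q + h * k3)
    return q + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

exact = math.exp(-1.0)
def global_error(step, h):
    return abs(integrate(step, 1.0, h, 1.0) - exact)

ratio_euler = global_error(euler, 0.1) / global_error(euler, 0.05)
ratio_rk4 = global_error(rk4, 0.1) / global_error(rk4, 0.05)
# ratio_euler comes out close to 2 (= 2^1); ratio_rk4 close to 16 (= 2^4).
```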
It is important to understand that local and global error are completely separate concepts: a
method can have very small local error but terrible global behavior, or very high local error but
good global error. (Compare a drunk man trying to walk down a straight path, vs. a car that veers
ever so slightly to the right of center. The drunk man will stay on the road no matter how long he
walks, but each step might veer left or right of the center of the road. The car will appear, at any
given time, to be heading in a straight line, but will drift further and further away from the center
of the road over time.)
Stability The formal notion of stability is outside the scope of these notes, but informally, a
method is unstable if errors in the trajectory get amplified over time by Φ, and stable if errors are
damped over time. Since introducing some amount of error every step is unavoidable (since Φ is
only an approximation to the exact solution φh ), the trajectory computed by an unstable integrator
quickly becomes useless due to the amplification of this error.
Some integrators are unconditionally unstable: they are unstable no matter how small of a time
step h you choose. Explicit Euler is one example. The usefulness of such integrators in practice is
suspect. Most integrators are conditionally stable: they are stable if the time step is small enough,
but become unstable if the time step is chosen too large. The cutoff point where the integrator
switches from being stable to unstable is called the stable time step size and depends on both the
integrator, and the physical system: for example, a stiff spring will require a smaller stable time
step size for Implicit Midpoint than a weak spring will require, for the same integrator. Implicit
Euler is renowned for being stable even at very large step sizes, for a wide variety of physical
systems. Notice that stability is not the same thing as accuracy: Implicit Euler damps out all
motion, including errors in the motion, but this may not be a desirable tradeoff if you care about
the long-term behavior of a physical system.
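Both behaviors are easy to observe numerically. In the plain-Python sketch below (the unit-mass, unit-stiffness harmonic oscillator is a made-up test system, and for this linear system the implicit equations can be solved in closed form rather than with Newton's method), Explicit Euler amplifies the energy E = (p² + q²)/2 by a factor of (1 + h²) every step, while Implicit Euler damps it by 1/(1 + h²), no matter how small h is:

```python
def energy(q, p):
    # Harmonic oscillator with unit mass and stiffness: E = p^2/2 + q^2/2.
    return 0.5 * (p * p + q * q)

def explicit_euler(q, p, h):
    return q + h * p, p - h * q

def implicit_euler(q, p, h):
    # q1 = q + h p1 and p1 = p - h q1, solved exactly for this linear system.
    denom = 1.0 + h * h
    return (q + h * p) / denom, (p - h * q) / denom

h, steps = 0.01, 5000
qe = qi = 1.0
pe = pi = 0.0
for _ in range(steps):
    qe, pe = explicit_euler(qe, pe, h)
    qi, pi = implicit_euler(qi, pi, h)

e0 = energy(1.0, 0.0)
# energy(qe, pe) has grown by (1 + h^2)^steps; energy(qi, pi) has decayed
# by the same factor, even though h is small and the true energy is constant.
```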
Conservation of Energy All conservative physical systems obey the law of conservation of
energy,
\[ E(t) = T[\dot q(t), q(t)] + V[q(t)] = \text{const}. \]
One can ask how closely the approximate trajectory given by a time integrator obeys this conservation law; the energy behavior of a time integrator is often closely correlated to its global error
behavior. A few special time integrators have good energy behavior, where the energy of the discrete
trajectory oscillates about the true value, but does not drift too far from the true value:
\[ |E(t) - E(0)| < C \qquad \forall t < k_1 e^{k_2/h} \]
for some constants C, $k_1$, and $k_2$. In the next lecture we will explore a recipe, Hamilton's Principle
of Extreme Action, for constructing integrators that have this property.
Conservation of Momentum Physical trajectories often obey other conservation laws, like
conservation of linear or angular momentum. We can ask whether, given a physical system whose
exact trajectories conserve momentum, a time integrator yields a discrete trajectory that also
conserves momentum. Almost all integrators will conserve linear momentum, but Explicit and
Implicit Euler, for instance, do not conserve angular momentum.
We end this chapter of notes by summarizing in table form the properties of the common
integrators listed earlier.
Method             Order    Global Error   Stability     Energy Behav.   Cons. of Momentum
Explicit Euler     1        Exponential    Unstable      Explodes        Only linear
Implicit Euler     1        Exponential    Stable²       Decays          Only linear
Trapezoid Rule     2        Linear         Conditional   Good            Yes
Implicit Midpoint  2        Linear         Conditional   Good            Yes
Velocity Verlet    1 or 2³  Linear         Conditional   Good            Yes
Runge-Kutta 4      4        Exponential    Unstable      Explodes        Yes

² Strictly speaking, Implicit Euler can be unstable, for some nonlinear physical systems and very large time step
size. But for many common physical systems Implicit Euler is stable no matter how large the time step is chosen to
be.
³ The equations as written earlier in this chapter have order 1. But a trivial reindexing of the equations yields a
formulation with order 2.
Chapter 5
Hamilton’s Principle
When I was in high school, my physics
teacher—whose name was Mr.
Bader—called me down one day after
physics class and said, ‘You look bored; I
want to tell you something interesting.’
Then he told me something which I found
absolutely fascinating, and have, since
then, always found fascinating. Every time
the subject comes up, I work on it. In
fact, when I began to prepare this lecture I
found myself making more analyses on the
thing. Instead of worrying about the
lecture, I got involved in a new problem.
The subject is this—the principle of least
action.
Richard Feynman
The previous few chapters covered the different pieces needed to define a physical simulation—
the degrees of freedom, forces, and equations of motion—and how to discretize these and put them
together into an algorithm for stepping forward in time. This chapter will cover the same material
but from a very different perspective: that of Lagrangian mechanics. The result will be a recipe
for how to start from a simple description of a system’s degrees of freedom and forces, called
the Lagrangian, and generate from it a discrete time integrator. This integrator automatically
comes with multiple desirable guarantees: it has good energy behavior, linear global error growth,
and conserves linear and angular momentum when appropriate. We will see how several of the
integrators from the previous chapter (Velocity Verlet, Implicit Midpoint, and the Trapezoid Rule)
arise from this recipe.
This chapter is in many ways the most important in the entire course. We will later learn
important enhancements and modifications, such as how to deal with constraints, contact and
friction, etc, but the basic procedure described here is an incredibly fruitful one, as we will see
when we study rigid bodies.
5.1 The Lagrangian and Action
In previous chapters we discussed the kinetic and potential energies T (q̇, q) and V (q) of physical
systems. Notice that these energies also implicitly include a choice of degrees of freedom and
configuration space Q. It turns out that these energies are all that you need to define a physical
system: there is a principle called Hamilton’s Principle of Extreme Action that turns those energies
into equations of motion. Before we describe this principle, we need to set the stage with a few
more concepts and definitions.
Suppose you have a physical system, and I tell you the configuration q0 and q1 at two different
times t0 and t1. You can then draw many possible trajectories q(t) that interpolate these two
values: q(t0 ) = q0 and q(t1 ) = q1 . For example, I could specify that a ball begins in my hand at
time t0 and ends up several feet away from me, in midair, at time t1 . Some of these trajectories
are completely unphysical: the ball leaving my hand and hovering around the room like a UFO,
for instance, before reaching the goal state. How can we differentiate between a physical path (a
parabolic arc, in this case) and unphysical paths? In the last chapter, we saw one way: we can look
at the path at each time t ∈ [t0 , t1 ] and verify that the path satisfies the equations of motion at all
times. But there is a different approach.
In the language of differential equations, the problem we just posed is a boundary value problem:
given data at the start and end of the trajectory, what are the physical trajectories that travel
in between? This is in contrast to the view in the previous chapter of time integration as an
initial value problem: given a starting position and momentum, trace out the path through
phase space that the physical trajectory must take.
Unlike the initial value view, where a unique solution exists for any initial conditions given
“sufficiently smooth” expressions for the energies in the system, the boundary value problem can have zero,
one, or many solutions even for very simple physical systems. Consider a 1D zero-rest-length spring, for
instance, with one endpoint at x = 0 and the other at x = q. If I give you the boundary values q(0) = 0 and
q(1) = 0, what is the physical trajectory in between? One trivial solution is for the spring to do nothing:
q(t) = 0. Clearly this is physical. But there are other solutions: the spring could have oscillated for half a
period, if it had just the right initial velocity: stretched to a maximum extent at t = 1/2, and then compressed
back to the zero length state by t = 1. Or it could have oscillated a full period, or three half periods, or any
other half-integer number of periods. All of these are physical trajectories.
The Lagrangian L(q̇, q) : T Q × Q → R looks at a single point in time along a trajectory and
computes the difference, at that time, between kinetic and potential energy:
L(q̇, q) = T (q̇, q) − V (q).
Notice that the minus sign is not a typo: it is very tempting to add kinetic and potential energy,
since we know from prior physics classes that the total energy of a system is an important quantity
that is often conserved. But here we subtract.
The Lagrangian gives us a scalar at each point on a trajectory, so we can integrate up the
Lagrangian along the entire curve to yield the action S of the trajectory:
\[ S(q(t)) = \int_{t_0}^{t_1} L[\dot q(t), q(t)]\, dt. \]
Notice that the action takes in a function—a curve q(t)—and outputs a single scalar. The more
precise term for this kind of function-of-functions is a functional. We have already studied a different
functional in great depth: the differential operator d, which takes in a function of Q and returns a
function of Q × T Q. We can now state a preliminary form of Hamilton’s principle, in terms of the
action S:
Hamilton’s Principle of Extreme Action (informal)
Given boundary conditions q0 and q1 at times t0 and t1 , a trajectory q(t) with q(t0 ) = q0 and
q(t1 ) = q1 is physical if and only if it is a critical point of S, with respect to all other trajectories
satisfying the boundary conditions.
Talking about critical points of S is a bit weird, since S takes in an entire function: infinitely many
pieces of data. We have some intuition for what a critical point of S must look like: if a trajectory
q(t) has a large value for its action, for instance, and wiggling q(t) slightly (while maintaining the
fact that the trajectory obeys the boundary conditions) cannot increase the value of S no matter
how you wiggle it, then q(t) is a local maximum of S and hence a critical point (and hence a
physical trajectory). But there can also be saddle critical points: the intuition here is that wiggling
the trajectory leaves the action unchanged, to first order, no matter how you wiggle. We formalize
these ideas in the next section.
5.2 The Calculus of Variations
We want to do calculus on functionals: the stumbling block, though, is that functionals are “infinite-dimensional” functions and it is far from clear what it means to take the derivative of a functional
with respect to a function. Just as in the case of multidimensional derivatives, the key idea is to
reduce this daunting infinite-dimensional problem into a one-dimensional problem. This procedure
is called the calculus of variations and should bear more than a passing similarity to how we defined
a directional derivative and differential in finite-dimensional calculus.
Suppose we have a trajectory q(t) that interpolates some boundary conditions. We can wiggle,
or perturb, this trajectory in any of countless different ways; these perturbations can be encoded
by a vector field δq(t) along q(t), which tells us how to displace each point of the trajectory. This
gives us a new trajectory
q̃(t) = q(t) + δq(t).
The vector field δq(t) is called a variation of q(t). We are especially interested in variations which
do not move the endpoints of q(t): these are just the vector fields with δq(t0 ) = δq(t1 ) = 0.
The variation δq(t) can be thought of as a direction encoding an infinitesimal change in q(t),
just as a single tangent vector δq ∈ T Q can be interpreted as an infinitesimal change in a single
configuration q ∈ Q. We can define an analogue of the directional derivative, called the variational
derivative, based on this idea: for any scalar ε, the curve
\[ q_\epsilon(t) = q(t) + \epsilon\,\delta q(t) \]
is a perturbation of q(t), with the magnitude of the perturbation at each time getting larger and
larger as ε increases. For each value of ε, we can evaluate S; this turns S into a scalar-valued
function of the single scalar ε. The variational derivative of S, with respect to the variation δq, is
then
\[ D_{\delta q(t)} S = \frac{d}{d\epsilon}\, S[q(t) + \epsilon\,\delta q(t)]\Big|_{\epsilon \to 0}. \]
The variational derivative measures the infinitesimal (first-order) change in S due to the infinitesimal change δq(t) to the trajectory.
We are playing fast and loose with regularity assumptions here. Clearly, δq(t) must be continuous, so that the family of curves $q_\epsilon(t)$ stays connected. We will also assume that q(t) and all
variations δq(t) are infinitely differentiable, to avoid worrying about potential problems like
$\dot q_\epsilon(t)$ being undefined at some values of t, etc. But bear in mind that physical trajectories
often do violate these regularity assumptions: a rigid ball bouncing against a rigid floor, for
example, might have a trajectory whose velocity changes instantaneously and discontinuously
at the moment the ball hits the floor. It is not hard to generalize the calculus of variations and Hamilton’s
principle to, e.g., piecewise-smooth trajectories and variations by only integrating the Lagrangian over the
smooth pieces, but we won’t worry about such details here.
What does it mean, now, for a trajectory to be a critical point of S? For a given variation δq(t),
the variational derivative must be zero, since otherwise it is possible to perturb q(t) in the δq(t)
direction and increase or decrease the action to first order. For a trajectory to be a critical point
of S, the variational derivative must be zero for all possible variations δq(t): just like a function
f : Rn → R has a critical point at q if the directional derivative of f at q is zero for all possible
tangent vectors δq at q.
In the case of functions, there are infinitely many possible tangent directions δq, but because
of the linearity of the directional derivative with respect to the direction vector, it is only
necessary to check a basis for the tangent space at q, which is finite dimensional (n vectors, if
the domain of the function is Rn ). For functionals, the variations lie in an infinite-dimensional
space of functions, and checking all of the infinitely-many basis variations is not practical. But
we will see that thinking in terms of variations will nevertheless pay off indirectly.
We can now formalize Hamilton’s principle:
Hamilton’s Principle of Extreme Action
Given boundary conditions q0 and q1 at times t0 and t1 , a trajectory q(t) with q(t0 ) = q0 and
q(t1 ) = q1 is physical if and only if Dδq(t) S = 0 for all variations δq(t) satisfying
δq(t0 ) = δq(t1 ) = 0.
Historically this principle was sometimes called the principle of least action, but this name is
misleading: the physical trajectory can be a saddle point, or even a maximum, of S. It doesn’t
have to be a local minimum.
Why does Hamilton’s principle hold? Why does nature want to extremize the difference
of kinetic and potential energy? Nobody really knows. Experimentally, the principle has
been shown to work in a very wide variety of settings, ranging from classical mechanics to
quantum mechanics. We will see that in the classical setting, Hamilton’s principle implies
and generalizes Newton’s second law (which you may or may not consider more intuitive
than Hamilton’s principle). In this course we will take Hamilton’s principle as one of our
experimentally-validated first principles.
5.3 The Euler-Lagrange Equations
We will now derive the consequences of Hamilton’s principle using the following strategy. First, we
will select an arbitrary variation δq(t) and apply Hamilton’s principle to yield a condition involving
q(t) and δq(t) that must be true for physical trajectories. We will then use the fact that δq(t) is
arbitrary to eliminate δq(t) from the condition.
We begin by rewriting Hamilton's principle using the definition of the variational derivative:
\[ 0 = \frac{d}{d\epsilon}\, S[q(t) + \epsilon\,\delta q(t)]\Big|_{\epsilon \to 0} = \frac{d}{d\epsilon} \int_{t_0}^{t_1} L[\dot q(t) + \epsilon\,\dot{\delta q}(t),\ q(t) + \epsilon\,\delta q(t)]\, dt\,\Big|_{\epsilon \to 0}. \]
We now swap the order of integration and differentiation and apply the chain rule:
\[ 0 = \int_{t_0}^{t_1} \{d_{\dot q} L[\dot q(t), q(t)]\}\big(\dot{\delta q}(t)\big) + \{d_q L[\dot q(t), q(t)]\}\big(\delta q(t)\big)\, dt, \]
where $d_{\dot q}$ and $d_q$ are the differentials with respect to the first and second argument of L, respectively.
In terms of the Jacobians of L with respect to its two arguments,
\[ 0 = \int_{t_0}^{t_1} \big[d_{\dot q} L[\dot q(t), q(t)]\big]\big[\dot{\delta q}(t)\big] + \big[d_q L[\dot q(t), q(t)]\big]\big[\delta q(t)\big]\, dt. \]
We chose δq(t) to be an arbitrary variation, but $\dot{\delta q}$ and δq are not independent: choosing one
determines the other. To further simplify the above expression we would like to turn $\dot{\delta q}$ into δq,
and we can do exactly this using integration by parts:
\[ 0 = \big[d_{\dot q} L[\dot q(t), q(t)]\big]\big[\delta q(t)\big]\Big|_{t_0}^{t_1} - \int_{t_0}^{t_1} \frac{d}{dt}\big[d_{\dot q} L[\dot q(t), q(t)]\big]\big[\delta q(t)\big]\, dt + \int_{t_0}^{t_1} \big[d_q L[\dot q(t), q(t)]\big]\big[\delta q(t)\big]\, dt. \]
The first term vanishes, since the variations we are allowed to look at in Hamilton’s principle are
only those that leave the endpoints of the trajectory fixed: δq(t0 ) = δq(t1 ) = 0. Combining the
remaining terms yields
\[ 0 = \int_{t_0}^{t_1} \left( \frac{d}{dt}\big[d_{\dot q} L[\dot q(t), q(t)]\big] - \big[d_q L[\dot q(t), q(t)]\big] \right)\big[\delta q(t)\big]\, dt. \]
This integral must be zero for any variation δq satisfying the boundary conditions δq(t0 ) = δq(t1 ) =
0. The only way that the integral can be zero every time, no matter which variation we choose,
is for the term in brackets to be zero for every value of t ∈ (t0, t1). This gives the Euler-Lagrange
equations
\[ 0 = \frac{d}{dt}\big[d_{\dot q} L[\dot q(t), q(t)]\big] - \big[d_q L[\dot q(t), q(t)]\big], \]
or, eliding some notation, $\frac{d}{dt} d_{\dot q} L - d_q L = 0$. We have shown that a trajectory is physical if and
only if it satisfies these Euler-Lagrange equations at every time t.
Bonus Math We can formalize the argument above that if $\int_{t_0}^{t_1} v(t)^T \delta q(t)\, dt = 0$ for arbitrary δq(t) with
δq(t0) = δq(t1) = 0, and v(t) is continuous, then v(t) = 0 for all t ∈ (t0, t1). Suppose for contradiction
that for some time s ∈ (t0, t1), some component i of v is nonzero: $v_i(s) = c \neq 0$. We can assume without
loss of generality that c is positive. Then since v is continuous, $v_i(t)$ is continuous also, and so there exists
a nonempty interval (a, b) ⊂ (t0, t1) containing s for which
\[ v_i(t) > \frac{c}{2} \qquad \forall t \in (a, b). \]
Now we choose δq(t) to be a bump function with the following properties:
1. $\delta q_j(t) = 0$ if j ≠ i;
2. $\delta q_i(t) = 0$ when t ≤ a or t ≥ b;
3. $\delta q_i(t) > 0$ when t ∈ (a, b);
4. $\int_a^b \delta q_i(t)\, dt = 1$.
Clearly this variation satisfies the condition that δq(t0) = δq(t1) = 0. We then have
\[ \int_{t_0}^{t_1} v(t)^T \delta q(t)\, dt = \int_a^b v_i(t)\,\delta q_i(t)\, dt \ge \int_a^b \frac{c}{2}\,\delta q_i(t)\, dt = \frac{c}{2}\int_a^b \delta q_i(t)\, dt = \frac{c}{2} \neq 0, \]
a contradiction.
Example: Particle Systems Let us look at the common special case of a quadratic kinetic energy $T(\dot q, q) = \frac{1}{2}\dot q^T M \dot q$ with M a constant symmetric positive-definite matrix, where the potential
energy V(q) is left arbitrary. The Lagrangian is then
\[ L(\dot q, q) = \frac{1}{2}\dot q^T M \dot q - V(q) \]
and the Euler-Lagrange equations are
\[ 0 = \frac{d}{dt}\big[M\dot q\big]^T + \big[dV(q)\big] = \frac{d}{dt}\,p + \big[dV(q)\big] = \dot p + \big[dV(q)\big], \]
or in other words,
\[ \dot p = -\big[dV(q)\big] = F(q); \]
Newton’s second law! Hamilton’s principle continues to give correct equations of motion in more
complicated settings, where T depends on positions; we will see such examples in the next lecture,
when we discuss constrained physical systems.
5.4 Discrete Hamilton's Principle
We now repeat the whole process above, but for discrete trajectories: trajectories where we don’t
have a smooth path q(t) defined at all times t, but instead only have a discrete sample qi of
configurations at the integer multiples of a time step h. Once again, we start with a boundary
value problem: suppose we know the configuration at time 0, and k time steps later at time
kh. A discrete trajectory interpolating these boundary conditions is a set of k + 1 configurations
$\mathbf{q} = \{q_i\}_{i=0}^{k}$.
In the time-continuous setting, Hamilton’s principle told us whether a trajectory q(t) was
physically correct or not. We’d like something analogous for discrete trajectories q, but it’s not
entirely clear what it even means for a discrete path to be “physically correct.” Matches a smooth
trajectory as closely as possible? But what does that mean?
To answer this question, let us take the time-continuous Hamilton’s principle, and try to port it
over to the discrete setting. We will then take this discrete Hamilton’s principle as a first principle,
and declare a discrete trajectory physical if and only if it satisfies the discrete Hamilton’s principle.
In order to do all this, we need to define a Lagrangian and action for discrete trajectories q. To
start with, we can take the smooth action and break it up into pieces that are local to each time
step:
\[ \int_0^{kh} L[\dot q(t), q(t)]\, dt = \sum_{i=0}^{k-1} \int_{ih}^{(i+1)h} L[\dot q(t), q(t)]\, dt. \]
Now what? The smooth action needs to evaluate the Lagrangian at every time t between time step
i and i + 1, and this is a problem since we do not have any data for these intermediate times. We
are forced to approximate the action: to make a discretization choice.
Discretization Assumption We assume that the Lagrangian is piecewise-constant, so that for
any time t between time step i and i + 1, the Lagrangian has the same value. This lets us write the
Lagrangian as depending only on the configuration at the endpoints, $L[\dot q(t), q(t)] \approx L(q_i, q_{i+1})$ for
t ∈ (ih, (i + 1)h), and the action as
\[ S(\mathbf{q}) = \sum_{i=0}^{k-1} h\,L(q_i, q_{i+1}). \]
Again, this choice is not necessarily “correct” or “best”; I have made it because it is in some
sense the simplest possible way of discretizing S. We could instead have assumed that, e.g., the
Lagrangian varies linearly, or quadratically, between two time steps, etc. The consequences of such
choices are beyond the scope of this chapter, but I do want to point out that they are also perfectly
reasonable (if quite a bit more complex.)
Given a physical system, we are still tasked with defining the discrete Lagrangian L(qi , qi+1 )
of the system. We will work through some examples later in this chapter; for now we leave L
as a general function Q × Q → R. In order to define a discrete Hamilton’s principle, the last
ingredient we need is a notion of a variation of S: what does it mean for q to be a critical point
of S? Intuitively, it is clear what a variation of a discrete trajectory should be: you can’t perturb
q(t) at arbitrary times t, since the perturbed trajectory is no longer discrete; but what you can
perturb is the configuration at each time step, qi .
This intuition motivates the following definition: a discrete variation δq of a discrete trajectory
q is a set of k + 1 vectors $\delta\mathbf{q} = \{\delta q_i\}_{i=0}^{k} \in TQ^{k+1}$. Just as in the time-continuous case, for any
$\epsilon \in \mathbb{R}$ we can perturb q into a new discrete trajectory by moving the configuration at each time
step in the δq direction:
\[ \mathbf{q}_\epsilon = \{q_i + \epsilon\,\delta q_i\}_{i=0}^{k}, \]
and the variational derivative of the discrete action S is just
\[ D_{\delta\mathbf{q}} S[\mathbf{q}] = \frac{d}{d\epsilon}\, S[\mathbf{q}_\epsilon]\Big|_{\epsilon \to 0}. \]
We now have all of the concepts we need to port Hamilton’s principle to the discrete setting:
Discrete Hamilton’s Principle of Extreme Action
Given boundary conditions q0 and qk at time steps 0 and k, a trajectory $\mathbf{q} = \{q_i\}_{i=0}^{k}$ is physical if
and only if $D_{\delta\mathbf{q}} S[\mathbf{q}] = 0$ for all discrete variations δq satisfying δq0 = δqk = 0.
5.5 Discrete Euler-Lagrange Equations
Where deriving the Euler-Lagrange equations in the time-continuous setting required some tricky
integration by parts, in the discrete setting, the derivation is somewhat simpler. We again start by
taking δq to be an arbitrary variation, and substitute the definition of S into Hamilton’s principle:
\[ 0 = \frac{d}{d\epsilon} \sum_{i=0}^{k-1} h\,L[q_i + \epsilon\,\delta q_i,\ q_{i+1} + \epsilon\,\delta q_{i+1}] \Big|_{\epsilon \to 0}. \]
We now differentiate and apply the chain rule (dividing through by the common positive factor h)
to get
\[
\begin{aligned}
0 &= \sum_{i=0}^{k-1} \frac{d}{d\epsilon}\, L[q_i + \epsilon\,\delta q_i,\ q_{i+1} + \epsilon\,\delta q_{i+1}] \Big|_{\epsilon \to 0} \\
  &= \sum_{i=0}^{k-1} \big[d_1 L(q_i + \epsilon\,\delta q_i,\ q_{i+1} + \epsilon\,\delta q_{i+1})\big]\,\delta q_i + \big[d_2 L(q_i + \epsilon\,\delta q_i,\ q_{i+1} + \epsilon\,\delta q_{i+1})\big]\,\delta q_{i+1} \Big|_{\epsilon \to 0} \\
  &= \sum_{i=0}^{k-1} \big[d_1 L(q_i, q_{i+1})\big]\,\delta q_i + \big[d_2 L(q_i, q_{i+1})\big]\,\delta q_{i+1},
\end{aligned}
\]
where $d_1 L$ and $d_2 L$ denote the differential of L with respect to its first and second parameter,
respectively. To simplify this equation further, we need to do some reindexing. We can expand out
the sum, grouping terms δq0, δq1, etc.:
\[
\begin{aligned}
0 = {}& \big[d_1 L(q_0, q_1)\big]\,\delta q_0 + \big[d_2 L(q_0, q_1)\big]\,\delta q_1 \\
&+ \big[d_1 L(q_1, q_2)\big]\,\delta q_1 + \big[d_2 L(q_1, q_2)\big]\,\delta q_2 \\
&+ \big[d_1 L(q_2, q_3)\big]\,\delta q_2 + \big[d_2 L(q_2, q_3)\big]\,\delta q_3 \\
&\ \ \vdots \\
&+ \big[d_1 L(q_{k-1}, q_k)\big]\,\delta q_{k-1} + \big[d_2 L(q_{k-1}, q_k)\big]\,\delta q_k.
\end{aligned}
\]
The first and last terms are zero, since δq0 = δqk = 0, and the other terms can be grouped by
column to get
\[ 0 = \sum_{i=1}^{k-1} \Big( \big[d_2 L(q_{i-1}, q_i)\big] + \big[d_1 L(q_i, q_{i+1})\big] \Big)\,\delta q_i. \]
If q is a physical trajectory, then this equation has to hold for any variation δq. The only way this
can be true is if the term in brackets is zero for every i:
\[ 0 = \big[d_2 L(q_{i-1}, q_i)\big] + \big[d_1 L(q_i, q_{i+1})\big] \qquad \forall i \in \{1, \ldots, k-1\}, \]
the discrete Euler-Lagrange equations of the physical system.
Worked Example Let’s say we have a simple particle system, with particle masses encoded in
a mass matrix M , and potential V acting on the system. Let us use the discrete Euler-Lagrange
equations to derive a time integrator for this system. We first need to write down the discrete
Lagrangian $L(q_i, q_{i+1})$, and to do so we need to make more discretization choices.
First, if we are assuming that the Lagrangian is piecewise-constant, we need a piecewise-constant
estimate of kinetic energy. We can get one by estimating the velocity as a constant between time
step i and i + 1:
\[ T(q_i, q_{i+1}) = \frac{1}{2}\left(\frac{q_{i+1} - q_i}{h}\right)^T M \left(\frac{q_{i+1} - q_i}{h}\right). \]
We also need a piecewise-constant estimate of the potential energy of the system; we can choose,
for example, to sample the potential at the end-of-step position, $V(q_i, q_{i+1}) = V(q_{i+1})$. This gives
us the discrete Lagrangian
\[ L(q_i, q_{i+1}) = \frac{1}{2}\left(\frac{q_{i+1} - q_i}{h}\right)^T M \left(\frac{q_{i+1} - q_i}{h}\right) - V(q_{i+1}) \]
with Jacobians
\[
\begin{aligned}
d_1 L(q_i, q_{i+1}) &= -\frac{1}{h}\left(\frac{q_{i+1} - q_i}{h}\right)^T M \\
d_2 L(q_i, q_{i+1}) &= \frac{1}{h}\left(\frac{q_{i+1} - q_i}{h}\right)^T M - \big[dV(q_{i+1})\big].
\end{aligned}
\]
Putting together the pieces, the Euler-Lagrange equations are
\[ \frac{1}{h}\left(\frac{q_i - q_{i-1}}{h}\right)^T M - \big[dV(q_i)\big] - \frac{1}{h}\left(\frac{q_{i+1} - q_i}{h}\right)^T M = 0. \]
At first blush this equation does not look like a typical time integrator we studied in the previous
chapter. But we can perform some manipulations to transform the equation into a more familiar
form. First, we can multiply both sides by h to get
\[ \left(\frac{q_i - q_{i-1}}{h}\right)^T M - h\big[dV(q_i)\big] - \left(\frac{q_{i+1} - q_i}{h}\right)^T M = 0, \]
and now we can introduce an auxiliary variable $p_i = \left(\frac{q_{i+1} - q_i}{h}\right)^T M$. Rearranging this definition of
$p_i$, and using it to simplify the Euler-Lagrange equation, gives the pair of equations
\[ q_{i+1} = q_i + hM^{-1}p_i^T \qquad p_i = p_{i-1} - h\big[dV(q_i)\big], \]
or, after reindexing the second equation and using the definition of force,
\[ q_{i+1} = q_i + hM^{-1}p_i^T \qquad p_{i+1} = p_i + hF(q_{i+1}). \]
Velocity Verlet! And if we had made different discretization choices when writing down L, we would
have gotten different time integrators. For example, we could have chosen to estimate potential
energy by sampling it at the time step midpoint, instead of at the end-of-step position:
\[ V(q_i, q_{i+1}) = V\!\left(\frac{q_i + q_{i+1}}{2}\right); \]
it may not surprise you that the Euler-Lagrange equations would, in this case, give you the Implicit
Midpoint integrator.
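The integrator derived above is easy to turn into code. The plain-Python sketch below (the single particle on a spring, with made-up mass, stiffness, and step size, is a hypothetical test system) implements the update pair we just obtained and checks the promised good energy behavior: the discrete energy oscillates about its initial value rather than drifting:

```python
def derived_integrator(q0, p0, h, steps, m, force):
    """Integrate using the update derived from the discrete Euler-Lagrange
    equations: q_{i+1} = q_i + h p_i / m, then p_{i+1} = p_i + h F(q_{i+1})."""
    q, p, traj = q0, p0, [(q0, p0)]
    for _ in range(steps):
        q = q + h * p / m
        p = p + h * force(q)
        traj.append((q, p))
    return traj

# Hypothetical system: one particle on a spring, V(q) = k q^2 / 2.
m, k, h = 1.0, 4.0, 0.01
force = lambda q: -k * q
traj = derived_integrator(1.0, 0.0, h, 20000, m, force)

energies = [0.5 * p * p / m + 0.5 * k * q * q for q, p in traj]
drift = max(abs(e - energies[0]) for e in energies)
# Over 20,000 steps the energy stays within a small band around its
# initial value of 2.0; an Explicit Euler trajectory would blow up instead.
```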
Benefits of Hamilton’s Principle We have seen that the discrete Hamilton’s principle allows us
to derive different time integrators: we get Velocity Verlet, Implicit Midpoint, and yet other, more
sophisticated integrators, depending on how we choose to discretize the Lagrangian L(qi , qi+1 ).
What’s the point, though? Why go through all of the trouble of deriving the Euler-Lagrange
equations, instead of just using a pre-canned time integrator?
First, just like Newton's second law only holds for particle systems where the kinetic energy is
of the form $\frac{1}{2}\dot q^T M \dot q$, the time integrators in the last chapter only work for simple physical systems
with quadratic kinetic energy of this form. As soon as we have a more complicated system, such as
physical systems with constraints (the topic of the next lecture), rigid bodies, systems in rotating
coordinate systems, etc, Newton’s second law, and the common time integrators like Implicit Euler
no longer work. They must be modified, and the Euler-Lagrange equations tell you exactly how to
perform this modification.
Second, the discrete Hamilton's principle always produces a good time integrator: notice that we did not get Explicit Euler, or Implicit Euler, out of any set of Euler-Lagrange equations, and indeed it is impossible to get Explicit Euler, no matter how you try to discretize the Lagrangian. This is because the time integrators that come out of the Euler-Lagrange equations (called variational integrators) are guaranteed to have the following properties:
• good energy behavior;
• linear global error growth;
• conservation of linear and angular momentum, when applicable.
We will start to see the payoff immediately in the next lecture, when we talk about simulating
constrained systems.
Chapter 6
Handling Constraints
The more constraints one imposes, the
more one frees one’s self.
Igor Stravinsky
In this chapter we study physical systems with constraints: restrictions on which configurations
q ∈ Q are allowed, and which are forbidden. We will start with a simple model problem: a 2D
pendulum made up of a point mass at the end of a massless rigid rod of length L that is anchored
at the origin. We can model the pendulum using a two-dimensional configuration space, with
q = (x, y) the position of the free endpoint of the rod. Since the rod is rigid, though, not every
value of $q$ is a legal endpoint, since the rod must have length $L$: instead, we want to restrict $q$ to only take on values on the circle $\|q\| = L$.
The set of allowed values of $q$ is called the feasible set of configurations, and the set of unallowed values the infeasible set. We specify the feasible set using a constraint function $g : Q \to \mathbb{R}$. This constraint
function must satisfy three conditions:
• g(q) = 0 whenever q is feasible, and nonzero when q is infeasible;
• g(q) is smooth;
• $g$ has no critical points on the zero level set: $dg(q) \neq 0$ whenever $g(q) = 0$.
The third condition is a bit technical and it may not be obvious why we need it, but its importance
will become clear later in this lecture. Note that one consequence of this condition, though, is that
the zero level set g(q) = 0, encoding the feasible configurations, must split Q into two regions, one
where g is negative, and one where g is positive (if g were positive on both sides, or negative on
both sides, then g would have a local minimum or local maximum on the zero level set, which is
not allowed.)
The choice of constraint is not unique: for the pendulum, for example, the rod length constraint
could be expressed using any of the following constraint functions:
• $g(q) = \|q\| - L$
• $g(q) = 2\|q\| - 2L$
• $g(q) = \|q\|^2 - L^2$
• $g(q) = \frac{\|q\|}{L} - 1$.
Mathematically, all of these constraints are equally good ways of representing the rod constraint.
But in practice, the third function is probably the most useful: this is because unlike the other
proposed functions, it is just a polynomial in the entries of q, rather than involving square roots.
Generally speaking, the simpler the expression of the constraint function, the better, as we will
soon be taking derivatives of g, etc.
A non-example of a valid constraint function for the rigid rod is the function
\[
g(q) = (\|q\| - L)^2;
\]
although this function is zero exactly on the feasible set, and is smooth, it has a minimum on the
zero level set, violating the third condition.
Multiple Constraints What if we want several constraints in a physical system? For example, we can turn our single pendulum into a double pendulum, by attaching a second rigid rod onto the end of the first. The configuration space of the double pendulum is now four-dimensional: $q = (x_1, y_1, x_2, y_2)$ encodes the endpoints of the first and second rod, and we now need to enforce two constraints: $\|(x_1, y_1)\| = L$ and $\|(x_1, y_1) - (x_2, y_2)\| = L$.
We can write down a constraint function for each constraint:
\[
\begin{aligned}
g_1(q) &= \|(x_1, y_1)\|^2 - L^2,\\
g_2(q) &= \|(x_2, y_2) - (x_1, y_1)\|^2 - L^2.
\end{aligned}
\]
Geometrically, the set g1 (q) = 0 is a three-dimensional hypersurface in 4D space encoding the
configurations where the first rod has the correct length. g2 (q) = 0 is a second three-dimensional
surface where the second rod has the correct length. The intersection of these hypersurfaces is a
two-dimensional surface, where both rods have the right length, and it is this intersection surface
that is the feasible set of configurations.
We can therefore represent both constraints in a single function, which is now vector-valued:
\[
g(q) = \begin{pmatrix} g_1(q) \\ g_2(q) \end{pmatrix},
\]
where again we have that g(q) = 0 if and only if q is feasible. In this way we can encode a physical
system with arbitrarily many different constraints: if the system has m constraints, the constraint
function is a map g : Q → Rm , where each component of g is a different constraint function gi (q)
for a different constraint.
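As a concrete sketch, the double pendulum's vector-valued constraint and its Jacobian (one row per constraint) might look like the following; the rod length `L` is an assumed illustrative parameter:

```python
L = 1.0  # rod length (assumed, for illustration)

def g(q):
    """Vector-valued constraint for the double pendulum,
    q = (x1, y1, x2, y2). Both entries are zero iff q is feasible."""
    x1, y1, x2, y2 = q
    return [x1 * x1 + y1 * y1 - L * L,
            (x2 - x1) ** 2 + (y2 - y1) ** 2 - L * L]

def dg(q):
    """Jacobian of g: a 2x4 matrix (list of rows), one row per constraint."""
    x1, y1, x2, y2 = q
    return [[2 * x1, 2 * y1, 0.0, 0.0],
            [-2 * (x2 - x1), -2 * (y2 - y1), 2 * (x2 - x1), 2 * (y2 - y1)]]
```

On a feasible configuration (both rods at length $L$) both components of `g` vanish, and both rows of `dg` are nonzero, consistent with the "no critical points on the zero level set" requirement.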
Simulating Constraints Now that we can represent constraints mathematically using a constraint function, how do we enforce the constraints in a simulation? There are at least four different
techniques; we will cover each in the remainder of this chapter:
1. use reduced coordinates;
2. add a penalty force;
3. enforce constraints using a step-and-project scheme;
4. bake the constraints into Hamilton’s principle.
6.1 Reduced Coordinates
The first technique for simulating constraints is to avoid them entirely: if we’re lucky, we can find
new degrees of freedom for the physical system that don’t need any constraints at all. For example,
for the pendulum, instead of using the endpoint of the rod (x, y) as the degrees of the freedom of
the system, we could use the angle of the rod θ away from vertical as the degree of freedom instead.
Now, we have no constraints, so we can use the familiar unconstrained simulation techniques
to simulate the pendulum. We do need to be a bit careful about how we apply time integrators
to the reduced coordinates, though, since the expressions for the kinetic and potential energies of
the system are usually significantly more involved when written in reduced coordinates instead of
“natural” Euclidean coordinates.
For example, suppose the pendulum's mass is $m$, and that the system has a single potential, gravity, $V(x, y) = 9.8my$. To derive a time integrator in reduced coordinates, we write
\[
\begin{aligned}
x &= L \sin\theta\\
y &= L \cos\theta,
\end{aligned}
\]
and then write down a discrete Lagrangian
\[
L(\theta^i, \theta^{i+1}) = \frac{m}{2}\left\|\frac{(L\sin\theta^{i+1}, L\cos\theta^{i+1}) - (L\sin\theta^i, L\cos\theta^i)}{h}\right\|^2 - 9.8 m L \cos\theta^{i+1}.
\]
Why isn't the kinetic energy just the formula from the last chapter, $\frac{m}{2}\left(\frac{\theta^{i+1} - \theta^i}{h}\right)^2$? This is a key point: kinetic energy in a physical system is a natural quantity associated to how masses are moving in the system, not to how that motion is encoded using degrees of freedom. The velocity of the DOF $\dot\theta$ is not the same as the velocity of the mass $\frac{d}{dt}(L\sin\theta, L\cos\theta)$, and kinetic energy depends only on the latter. This distinction between the velocity of the mass in the system, and the velocity of the DOFs used to encode the system, will arise again when we study rigid bodies, and when we look at fluid dynamics.
Deriving a time integrator now just requires turning the crank of Hamilton’s principle, as
discussed in the last chapter.
Bonus Math The Euler-Lagrange equations corresponding to our Lagrangian are
\[
\begin{aligned}
0 &= m\left(\frac{(L\sin\theta^i, L\cos\theta^i) - (L\sin\theta^{i-1}, L\cos\theta^{i-1})}{h}\right)^T \frac{(L\cos\theta^i, -L\sin\theta^i)}{h} + 9.8mL\sin\theta^i\\
&\qquad - m\left(\frac{(L\sin\theta^{i+1}, L\cos\theta^{i+1}) - (L\sin\theta^i, L\cos\theta^i)}{h}\right)^T \frac{(L\cos\theta^i, -L\sin\theta^i)}{h}\\
&= \frac{mL^2}{h^2}\left(-\sin\theta^{i-1}\cos\theta^i + \cos\theta^{i-1}\sin\theta^i\right) + 9.8mL\sin\theta^i - \frac{mL^2}{h^2}\left(\sin\theta^{i+1}\cos\theta^i - \cos\theta^{i+1}\sin\theta^i\right)
\end{aligned}
\]
or, after multiplying through by $\frac{h}{mL}$ (which does not change the solution set) and applying the angle-difference identity,
\[
0 = \frac{L}{h}\sin\left(\theta^i - \theta^{i-1}\right) + 9.8h\sin\theta^i - \frac{L}{h}\sin\left(\theta^{i+1} - \theta^i\right).
\]
Now we introduce the new variable
\[
\omega^i = \frac{L}{h}\sin\left(\theta^{i+1} - \theta^i\right)
\]
which leads to the time integrator
\[
\begin{aligned}
\theta^{i+1} &= \theta^i + \arcsin\left(\frac{h\omega^i}{L}\right)\\
\omega^{i+1} &= \omega^i + 9.8 h \sin\theta^{i+1}.
\end{aligned}
\]
Notice that this integrator is explicit, like Velocity Verlet, though the updated position θi+1 is now
nonlinear in the velocity.
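This integrator is only a few lines of code. The sketch below follows the update equations above verbatim; it is valid only while $|h\omega/L| \le 1$, so the arcsine is defined (small steps and moderate speeds):

```python
import math

def pendulum_step(theta, omega, h, L=1.0, g=9.8):
    """One step of the explicit reduced-coordinate pendulum integrator:
    theta^{i+1} = theta^i + arcsin(h * omega^i / L),
    omega^{i+1} = omega^i + g * h * sin(theta^{i+1}).
    Only valid while |h * omega / L| <= 1."""
    theta_new = theta + math.asin(h * omega / L)
    omega_new = omega + g * h * math.sin(theta_new)
    return theta_new, omega_new
```

Note that the rod-length constraint is satisfied identically at every step, since the positions $(L\sin\theta, L\cos\theta)$ always lie on the circle by construction.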
Despite the more complicated expression for the time integrator, using reduced coordinates is
usually a good idea whenever possible, since it completely eliminates the problem of handling constraints. But for complex physical systems, identifying usable reduced coordinates is unfortunately
much easier said than done.
6.2 Penalty Method
A second general technique for handling constraints is to enforce them using a penalty force with potential
\[
V_{\text{penalty}}(q) = \frac{k}{2}\|g(q)\|^2
\]
for a large penalty parameter k. Notice the extreme similarity between the penalty potential and a
spring potential: indeed, the intuitive interpretation of the penalty potential is a spring-like force
that pulls q back onto the level set g = 0 whenever q strays away from it.
Implementing the penalty force is in principle extremely straightforward: simply add Vpenalty
to the system and treat it like any other potential energy. There are some practical issues, though.
First of all, using the penalty method does not exactly enforce the constraints: instead, q will
oscillate about the feasible region g = 0, with the amount of error decreasing as k increases.
But there is no free lunch: as the amplitude decreases, the frequency of oscillation increases, and
the stable time step size of a time integrator typically decreases as this frequency increases (the
simulation must sample q densely enough in time to resolve those high-frequency oscillations).
Setting k too large often results in an intractably tiny stable time step size. Specialized integrators
(exponential integrators; IMEX integrators) exist which try to circumvent this limitation, but they
are beyond the scope of these notes.
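Concretely, here is a minimal sketch of the penalty method for the pendulum rod constraint $g(q) = \|q\|^2 - L^2$; gravity is omitted to keep the sketch short, the symplectic Euler stepping and the values of $k$ and $h$ are illustrative assumptions:

```python
def penalty_force(q, k, L=1.0):
    """Penalty force -dV for g(q) = ||q||^2 - L^2 and
    V(q) = (k/2) g(q)^2, so -dV = -2 k g(q) q."""
    gval = q[0] * q[0] + q[1] * q[1] - L * L
    return [-2.0 * k * gval * q[0], -2.0 * k * gval * q[1]]

def step(q, v, h, k, m=1.0):
    """Symplectic Euler step driven only by the penalty force."""
    F = penalty_force(q, k)
    v = [v[0] + h * F[0] / m, v[1] + h * F[1] / m]
    q = [q[0] + h * v[0], q[1] + h * v[1]]
    return q, v
```

Running this with a large $k$ and a correspondingly small $h$ illustrates the trade-off described above: the constraint violation stays tiny, but only because the step resolves the high-frequency penalty oscillations.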
6.3 Step and Project
The third technique is to first take an unconstrained step of time integration, completely ignoring the constraints. This yields a new, tentative position and velocity $\tilde q^{i+1}$ and $\dot{\tilde q}^{i+1}$, which probably violate the constraint. We therefore project the tentative position back onto the feasible set, by finding the closest point on the feasible set to the new position $\tilde q^{i+1}$. Finally, we update the velocity to reflect this perturbation in final position.
The steps of the algorithm are summarized below:
1. Take a tentative unconstrained step using any time integrator to yield provisional new positions and velocities $\tilde q^{i+1}$ and $\dot{\tilde q}^{i+1}$.
Figure 6.1: Constrained optimization. A function $f : \mathbb{R}^2 \to \mathbb{R}$ and a constraint $g : \mathbb{R}^2 \to \mathbb{R}$ are visualized by drawing their level sets (left). The feasible set $g = 0$ is the curve in bold red. Constrained optimization of $f$ can be visualized by drawing the gradients of $f$ and $g$ along this level set (right): the critical points of $f$, subject to $g = 0$, are the points on the red curve where $\nabla f$ and $\nabla g$ are parallel.
2. Project the provisional position back onto the feasible set: this is done by solving the optimization problem
\[
\min_{q^{i+1}} \frac{1}{2}\left(q^{i+1} - \tilde q^{i+1}\right)^T M \left(q^{i+1} - \tilde q^{i+1}\right) \quad \text{s.t.} \quad g(q^{i+1}) = 0.
\]
3. Set velocity based on the provisional velocity, and the correction just made to provisional position:
\[
\dot q^{i+1} = \dot{\tilde q}^{i+1} + \frac{q^{i+1} - \tilde q^{i+1}}{h}.
\]
The first and third steps are straightforward, but the middle step requires the ability to solve constrained optimization problems. We will look at how to solve this problem next.
Why are we using the mass matrix to measure distance between $q^{i+1}$ and $\tilde q^{i+1}$, rather than just using Euclidean distance? Remember, the kinetic energy metric is the natural metric on configurational tangent space. Unlike the Euclidean distance, which is coordinate-dependent, the kinetic energy metric tends to be more physically meaningful. For some intuition backing up the kinetic energy metric as the right choice, consider the case of a particle system containing particles of many different masses; the effect of using $M$ in particle systems is to penalize perturbing particles with large mass, and instead preferentially move the less-massive particles; the massive particles have more inertia and altering their trajectory should carry a steeper cost.
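For the pendulum specifically, the projection in step 2 has a closed form: with $M = mI$, the mass-weighted projection onto the circle $\|q\| = L$ reduces to ordinary radial projection, so the whole scheme can be sketched without a general constrained solver. The gravity direction ($-y$) and the symplectic Euler tentative step are assumptions of this sketch:

```python
import math

def step_and_project(q, v, h, L=1.0, gravity=9.8):
    """Step-and-project for the pendulum rod constraint ||q|| = L,
    with M = m*I so the projection is simple radial projection."""
    # 1. tentative unconstrained step (symplectic Euler, gravity in -y)
    v_t = [v[0], v[1] - h * gravity]
    q_t = [q[0] + h * v_t[0], q[1] + h * v_t[1]]
    # 2. project the tentative position onto the circle ||q|| = L
    r = math.hypot(q_t[0], q_t[1])
    q_new = [L * q_t[0] / r, L * q_t[1] / r]
    # 3. correct the velocity by the projection displacement
    v_new = [v_t[0] + (q_new[0] - q_t[0]) / h,
             v_t[1] + (q_new[1] - q_t[1]) / h]
    return q_new, v_new
```

By construction the position satisfies the constraint exactly after every step, which is the defining feature of step-and-project.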
6.4 The Method of Lagrange Multipliers
To understand how to solve constrained optimization problems, it helps to first review the unconstrained case. If we have a function f : Q → R, finding its critical points simply amounts to “taking
the derivative and setting it equal to zero”: in other words, finding points where df = 0. Indeed,
the definition of a critical point is a point where perturbing away from that point does not alter f
to first order: {df (q)} (δq) = 0 for all directions δq.
Now let us look at the situation where we can only move within the feasible set g(q) = 0 for a
single constraint g. Figure 6.1 shows this scenario, where we have drawn the level sets of f and g,
with the zero level set of g drawn in bold red. We are allowed to slide back and forth along this
red curve: how do we now find the critical points of f , with respect to this sliding back and forth?
For a scalar-valued function, $df$ in matrix form is a row vector; but we can also represent it as a (column) vector by way of the gradient. Recall that for scalar-valued functions, the differential is a covector, and the gradient is its vector dual $df^\sharp$ satisfying
\[
\{df(q)\}(v) = \langle \nabla f(q), v\rangle_M = \nabla f(q)^T M v
\]
for every tangent vector $v$ at $q$; recall also that we can write the gradient in terms of the matrix form of the differential, $\nabla f = M^{-1}[df]^T$. The gradient has an important geometric interpretation: out of all vectors $v$ that have the same length $\sqrt{v^T M v}$ as the gradient, the gradient is the direction of steepest increase of $f$. This is because
\[
\begin{aligned}
[df]\nabla f - [df]v &= \nabla f^T M (\nabla f - v)\\
&= \nabla f^T M \nabla f - \nabla f^T M v\\
&= \frac{1}{2}\left(\nabla f^T M \nabla f - \nabla f^T M v - v^T M \nabla f + v^T M v\right)\\
&= \frac{1}{2}(\nabla f - v)^T M (\nabla f - v)\\
&\geq 0,
\end{aligned}
\]
where the third line uses the symmetry of $M$ and the assumption that $v^T M v = \nabla f^T M \nabla f$ (equal lengths), so that it is impossible to find a $v$ for which the change in $f$, $[df]v$, is greater than for the choice $v = \nabla f$.
It follows that $-\nabla f$ is the direction of steepest descent of $f$. In the case $n = 2$, we can plot $\nabla f$ along the curve $g = 0$. The claim now is that the critical points of $f$ along the curve occur where $\nabla f$ is perpendicular to the curve. Otherwise, if $\nabla f$ points slightly in one direction along the curve, or slightly in the other, we can slide along the curve in the opposite direction to infinitesimally change $f$. The only way we cannot change $f$ to first order by sliding is if $\nabla f$ is perpendicular to the sliding directions. When $n > 2$, the situation is more geometrically complex: the level set $g = 0$ is no longer a curve, but instead, a hypersurface of dimension $n - 1$. But it's still the case that we are at a critical point of $f$ if and only if $\nabla f$ points perpendicular to this hypersurface.
There is another vector we know is perpendicular to the level set curve g = 0: the gradient of
the constraint function ∇g. We can therefore write an algebraic expression for the geometric fact
that $\nabla f$ is perpendicular to the level set:
\[
\nabla f = \lambda \nabla g
\]
for some scalar $\lambda \in \mathbb{R}$, called the Lagrange multiplier of the constraint. Or, alternatively, $df = \lambda\, dg$.
This observation lets us turn constrained optimization problems into unconstrained problems. Suppose we are trying to solve
\[
\min_q f(q) \quad \text{s.t.} \quad g(q) = 0.
\]
Then every solution of this problem is also a solution of the unconstrained problem
\[
\operatorname*{ext}_{q,\lambda}\; f(q) - \lambda g(q). \tag{6.1}
\]
Why? The solutions of this second problem are just those where the differential vanishes, and setting the differential of $f(q) - \lambda g(q)$ with respect to $q$ and $\lambda$ equal to zero gives
\[
\begin{pmatrix} [df(q)] - \lambda [dg(q)] & -g(q) \end{pmatrix} = 0. \tag{6.2}
\]
Setting this differential equal to zero gives us two sets of equations: g(q) = 0, which tells us that
the critical point we have found lies in the feasible region of Q, and df − λdg = 0, which tells us
that we are at a critical point of f restricted to that feasible region.
There are a few important subtleties to keep in mind: first of all, minima of the constrained
optimization problem do not necessarily turn into minima of (6.1); in fact, the critical points
of (6.1) are usually saddle points. Second, every minimizer of the constrained problem is a critical
point of the unconstrained problem, but the converse is not always true: solutions to (6.1) might
be maxima, or saddle points, of the constrained problem, rather than minima.
Finally, unless the constraint function g is particularly simple, it will not be possible to solve
the system of equations (6.2) in closed form. But Newton’s method can be used to find a solution
numerically.
Multiple Constraints What if we have multiple constraints $g_i(q)$ instead of just one? The situation is much the same, but it is more difficult to visualize geometrically. Instead of sliding along a single hypersurface $g = 0$, we have to slide along multiple hypersurfaces simultaneously. We get "stuck" if the direction we want to slide is perpendicular to any of the individual constraints; in other words, we are at a critical point of $f$ if $\nabla f$ is a linear combination of the constraint function gradients:
\[
\nabla f(q) = \sum_{i=1}^m \lambda_i \nabla g_i(q)
\]
where we now have $m$ different Lagrange multipliers $\lambda_i$, one per constraint. Equivalently, we can write this condition as
\[
df(q) = \lambda^T dg(q)
\]
where $\lambda = \begin{pmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_m \end{pmatrix}^T$. It follows that the multiple-constraint generalization of the unconstrained problem (6.1) is
\[
\operatorname*{ext}_{q,\lambda}\; f(q) - \lambda^T g(q);
\]
setting the differential of this unconstrained objective equal to zero gives exactly the equations that characterize a critical point of $f$:
\[
\begin{aligned}
df(q) - \lambda^T dg(q) &= 0\\
g(q) &= 0.
\end{aligned}
\]
Once again, Newton’s method can be used to solve for q and λ.
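As a concrete sketch (not from the notes), here is Newton's method applied to the Lagrange multiplier system for the simplest projection problem, $\min \frac{1}{2}\|q - p\|^2$ s.t. $\|q\|^2 - L^2 = 0$, taking $M = I$; the residual stacks $df - \lambda\, dg$ with the constraint:

```python
import numpy as np

def project_to_circle(p, L=1.0, iters=20):
    """Newton's method on the Lagrange-multiplier system for
    min 1/2 ||q - p||^2  s.t.  g(q) = ||q||^2 - L^2 = 0.
    Unknowns are (q, lam); residual F stacks (df - lam*dg)^T and g."""
    q = np.array(p, dtype=float)
    lam = 0.0
    for _ in range(iters):
        F = np.array([q[0] - p[0] - 2 * lam * q[0],
                      q[1] - p[1] - 2 * lam * q[1],
                      q[0] ** 2 + q[1] ** 2 - L ** 2])
        J = np.array([[1 - 2 * lam, 0.0, -2 * q[0]],
                      [0.0, 1 - 2 * lam, -2 * q[1]],
                      [2 * q[0], 2 * q[1], 0.0]])
        delta = np.linalg.solve(J, F)
        q -= delta[:2]
        lam -= delta[2]
    return q, lam
```

For this simple constraint the answer is known in closed form (radial projection $L\,p/\|p\|$), which makes the Newton solve easy to sanity-check.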
6.5 Constrained Hamilton's Principle
Finally, there is one more approach we can take to enforcing constraints in a simulation: instead
of avoiding constraints altogether, as with the penalty or reduced coordinate approach, or ignoring
the constraints and trying to patch them up later, as in step-and-project, we can try to build the
constraints directly into the time integrator. That is, we can construct a time integrator that is
guaranteed to always output new positions qi+1 that lie in the feasible region of Q.
You may not be surprised to learn that the way to do this is to go back to Hamilton’s principle.
Instead of extremizing the action S over all possible paths q interpolating the boundary conditions
q0 and qk , we restrict ourselves only to paths that satisfy the constraints: that is, that have
g(qi ) = 0 for all i ∈ {0, 1, . . . , k}. We do this by applying the method of Lagrange multipliers
directly to Hamilton's principle: we convert the constrained problem
\[
\operatorname*{ext}_q \sum_{i=0}^{k-1} h L(q^i, q^{i+1}) \quad \text{s.t.} \quad g(q^i) = 0
\]
into the unconstrained problem
\[
\operatorname*{ext}_{q,\lambda}\left(\sum_{i=0}^{k-1} h L(q^i, q^{i+1}) - \sum_{i=1}^{k-1} (\lambda^i)^T g(q^i)\right) \tag{6.3}
\]
where we now have one extra Lagrange multiplier variable per constraint per timestep. The key insight now is to notice that (6.3) has exactly the same form as an ordinary, unconstrained application of Hamilton's principle to a discrete Lagrangian... to a physical system with extra degrees of freedom $\lambda^i$. Yes, we started off with too many degrees of freedom (hence the need for constraints), and we are going to solve the problem by adding in even more degrees of freedom!
Concretely, we define a new configuration space $\tilde Q = Q \times \mathbb{R}^m$, new degrees of freedom $\tilde q^i = \begin{pmatrix} q^i \\ \lambda^i \end{pmatrix}$, and a new discrete Lagrangian
\[
\tilde L(\tilde q^i, \tilde q^{i+1}) = L(q^i, q^{i+1}) - (\lambda^i)^T g(q^i).
\]
The Euler-Lagrange equations for this new discrete Lagrangian then yield a time integrator that simulates the constrained system.
Worked Example Let's look again at our model problem, the pendulum with a rigid rod. We have a single constraint, and so a single extra degree of freedom $\lambda^i$. Our discrete Lagrangian is then
\[
\tilde L(\tilde q^i, \tilde q^{i+1}) = \frac{m}{2}\left\|\frac{q^{i+1} - q^i}{h}\right\|^2 - V(q^{i+1}) - \lambda^i\left(\|q^i\|^2 - L^2\right)
\]
with Euler-Lagrange equations
\[
\begin{aligned}
\frac{m}{h}\left(\frac{q^i - q^{i-1}}{h}\right)^T - dV(q^i) - \frac{m}{h}\left(\frac{q^{i+1} - q^i}{h}\right)^T - 2\lambda^i (q^i)^T &= 0\\
-\left(\|q^i\|^2 - L^2\right) &= 0.
\end{aligned}
\]
Once again we introduce the new variable $\dot q^i = \frac{q^{i+1} - q^i}{h}$ and reindex to get the time integrator
\[
\begin{aligned}
q^{i+1} &= q^i + h \dot q^i\\
\dot q^{i+1} &= \dot q^i - \frac{h}{m}\, dV(q^{i+1}) - \frac{2h}{m} \lambda^{i+1} q^{i+1}\\
\left\|q^{i+1} + h \dot q^{i+1}\right\|^2 &= L^2.
\end{aligned}
\]
Notice that the first equation can be solved explicitly to get qi+1 . But it is no longer possible to
then directly solve for q̇i+1 , since the second formula depends on the unknown Lagrange multiplier
λi+1 . Instead, the second and third equation must be solved implicitly using Newton’s method.
Comparing this time integrator to standard Velocity Verlet reveals an important insight: the
extra λ term in the second equation has the form of an extra, fictitious force that acts to enforce
the constraint. This force acts in the ∇g direction, i.e., perpendicular to the level set g = 0, and
the unknown scalar λ controls the magnitude of the force: the third equation tells us that this λ
is just exactly strong enough to keep q from leaving the level set! The value of λ that is computed
therefore has physical meaning: the larger the value, the more that inertia and/or the other forces
are acting to push q off of the feasible region. This information could be used to, for example,
fracture the rod if the value of λ ever exceeds some threshold during the simulation; we will revisit
this idea in the second programming project.
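A minimal sketch of one step of this integrator for the pendulum: $q^{i+1}$ is explicit, and substituting the velocity update into the length constraint leaves a single scalar equation in $\lambda^{i+1}$, which Newton's method solves quickly. The gravity direction ($dV = (0, mg)$) and parameter values are assumptions of the sketch:

```python
def constrained_pendulum_step(q, qdot, h, m=1.0, L=1.0,
                              gravity=9.8, iters=30):
    """One step of the constrained variational integrator for the rod.
    q^{i+1} is explicit; (qdot^{i+1}, lambda^{i+1}) solve the implicit
    pair, reduced here to scalar Newton iteration on lambda."""
    q1 = [q[0] + h * qdot[0], q[1] + h * qdot[1]]   # explicit position
    a = [qdot[0], qdot[1] - h * gravity]            # qdot^i - (h/m) dV
    lam = 0.0
    for _ in range(iters):
        # velocity update for the current lambda guess
        v = [a[0] - (2 * h / m) * lam * q1[0],
             a[1] - (2 * h / m) * lam * q1[1]]
        cx, cy = q1[0] + h * v[0], q1[1] + h * v[1]
        f = cx * cx + cy * cy - L * L               # constraint residual
        # derivative of f in lambda: dc/dlam = -(2 h^2 / m) q1
        dfdlam = 2 * cx * (-(2 * h * h / m) * q1[0]) \
               + 2 * cy * (-(2 * h * h / m) * q1[1])
        lam -= f / dfdlam
    v = [a[0] - (2 * h / m) * lam * q1[0],
         a[1] - (2 * h / m) * lam * q1[1]]
    return q1, v, lam
```

After the solve, the enforced equation $\|q^{i+1} + h\dot q^{i+1}\|^2 = L^2$ holds to machine precision, which is exactly the guarantee the constrained variational integrator provides.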
Chapter 7
Rigid Bodies in Two Dimensions
“Distress not yourself if you cannot at first
understand the deeper mysteries of
Spaceland. By degrees they will dawn
upon you. Let us begin by casting back a
glance at the region whence you came.”
Flatland, Edwin Abbott
We now extend rigid rods in the plane, which can be described by four degrees of freedom with
one length constraint, to two-dimensional rigid bodies. We will see that these rigid bodies, even in
2D, bring with them quite a few subtleties, which unfortunately will only compound once we move
to three dimensions in later chapters.
First, let us define a rigid body: a solid region Ω of the plane that can translate and rotate
within the plane, but cannot stretch, bend, or undergo any other deformation beside rigid motions.
For simplicity we will restrict ourselves to rigid bodies satisfying the following assumptions:
• the rigid body consists of the area inside a single simple (non-intersecting) but possibly nonconvex polygon Ω in the plane;
• the density ρ of the rigid body is constant within the area of the body.
There is nothing fundamental about these assumptions: one can define rigid bodies that have
curved sides, or consist of multiple disconnected pieces, or have spatially-varying density, etc—but
the assumptions above will significantly simplify exposition without limiting us too much in the
kinds of planar objects that can be represented by the rigid body.
Let v1 , v2 , . . . , vn denote the consecutive vertices of the boundary of the rigid body (so that the
boundary consists of edges connecting vi to vi+1 for i = 1, . . . , n − 1, and vn to v1 .) Let us now
pick reduced coordinates for the motion of the rigid body. Since the rigid body can only translate
and rotate, intuitively, we need just three degrees of freedom to parameterize the body’s motion:
two for the translation in the two planar dimensions, and one for the angle of rotation of the body.
To formalize this idea, let us pick a special “center” point c̄ of the rigid body—this can be any
point in the plane (and doesn’t even have to be inside the rigid body) but we will see later that
some choices of c̄ are more natural than others. Let us also pick a rest or undeformed pose for the
rigid body, which consists of an assignment of a position v̄i to each vertex of the body. Now any
rigid motion of the body can be achieved by:
1. rotating all points in the (rest pose of the) body about the center c̄ by θ radians counterclockwise; and then
2. translating all points in the (rotated rest pose of the) body by the vector t = (tx , ty ).
The reduced degrees of freedom in this scheme are just the translation vector and rotation angle,
q = (tx , ty , θ). Once again, there is nothing sacred about this choice of degrees of freedom: we could
have represented the deformed poses of the rigid body using different DOFs, for example first a
translation, and then a rotation about the origin; or some other scheme.
Important: we have now set up two different coordinate systems, corresponding to the rigid
body in its rest pose, and in its deformed pose. The former is often called the object’s body
coordinates and the latter the world coordinates. In this lecture we will use bars to indicate points
and vectors in body coordinates and no bar for quantities in world coordinates. It is absolutely
critical to keep rigorous track of which coordinate system is being used to represent all points and
vectors in the rigid body simulation; confusing the coordinate systems is a classic and common
source of bugs.
Why even use reduced coordinates in the first place to represent rigid bodies? Why not build
the rigid body out of multiple rigid rods, and simulate it using the techniques from the previous
chapter? This is an important question, as there’s nothing wrong with a constraint-based
simulation of a rigid body. However, there are several reasons to prefer reduced coordinates:
(1) in reduced coordinates, all rigid bodies require only three degrees of freedom, no matter
how large n is or how complicated the geometry of the body. In contrast, a constraint-based
formulation of the rigid body needs 2n degrees of freedom. (2) choosing correct constraints for representing
the rigid body is not entirely trivial. It is not enough to insert a rigid rod between each pair of consecutive
boundary vertices vi and vi+1 , since for n > 3 this polygon is still flexible. Additional diagonal rods are
needed. The complete set of $\frac{n(n-1)}{2}$ rods connecting every vertex to every other vertex will work, but requires a very large number of constraints. The best approach is to compute a triangulation of the bounding polygon,
which yields O(n) edges that need to be turned into rigid rods. Finally (3) with reduced degrees of freedom,
it will be possible to model a rigid body of uniform density throughout the interior. Using rigid rods and
boundary vertices “lumps” the object’s mass at the boundary, which alters the dynamics of the rigid body.
All of that said, there is no free lunch: the price of the above advantages is a significantly more complicated theoretical formulation, which we describe in the remainder of this chapter.
For any point $\bar v$ in body coordinates (which could be one of the vertices of the body in its rest pose, or any other point on the interior or exterior of the rigid body $\Omega$), the above gives us the following formula for the corresponding position $v(q)$ in world coordinates, as a function of the configuration $q$:
\[
v(q) = R_\theta(\bar v - \bar c) + \bar c + t,
\]
where $R_\theta w$ is rotation of the vector $w$ counterclockwise by $\theta$ radians; $R_\theta$ is linear in $w$ and so has a familiar matrix form
\[
R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.
\]
Finally, the choice of rest pose for a rigid body is completely arbitrary. Looking at the formula for $v(q)$, there is an obvious opportunity for simplification: place the rigid body's rest pose in the plane so that $\bar c$ lies at the origin, and $v(q) = R_\theta \bar v + t$.
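The body-to-world map is one line of code; a sketch (assuming the rest pose has already been translated so that $\bar c$ is at the origin, and writing $q = (t_x, t_y, \theta)$):

```python
import math

def body_to_world(vbar, q):
    """Map a body-coordinate point vbar = (x, y) to world coordinates,
    v(q) = R_theta * vbar + t, with q = (tx, ty, theta) and the rest
    pose centered so that c-bar sits at the origin."""
    tx, ty, theta = q
    c, s = math.cos(theta), math.sin(theta)
    return (c * vbar[0] - s * vbar[1] + tx,
            s * vbar[0] + c * vbar[1] + ty)
```

Keeping this map in one place (rather than scattering rotation code through a simulator) also helps with the body-versus-world bookkeeping warned about above.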
What point on the rigid body should we choose for $\bar c$? There is a natural choice, called the center of mass or centroid or "geometric center" of the object,
\[
\bar c_{\text{cm}} = \frac{1}{A(\Omega)} \int_\Omega (x, y)\, dA, \tag{7.1}
\]
where $A(\Omega)$ is the area of the rigid body and $dA = dx\,dy$ is the plane's area element. If you were to cut $\Omega$ out of sheet metal and balance it on a pin at exactly this choice of $\bar c$, the rigid body would perfectly balance. Choosing the center of mass, rather than some other point, for the center $\bar c$ is not essential, but will lead to substantial simplifications in both two and three dimensions. For the remainder of the lecture, we will assume that the rigid body's rest pose is positioned in the plane so that its center of mass lies at the origin (in practice, this can be ensured by placing the rigid body in the plane arbitrarily, computing the center of mass $\bar c_{\text{cm}}$, then translating the rest pose by $-\bar c_{\text{cm}}$.)
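For polygons, the integral (7.1) has a standard closed form (it can be derived with exactly the Stokes's Theorem machinery introduced next); a sketch:

```python
def polygon_centroid(verts):
    """Centroid of a simple polygon of constant density, given boundary
    vertices in order (either orientation). Standard closed-form
    evaluation of equation (7.1); signed-area factors make the result
    independent of orientation."""
    n = len(verts)
    A2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = verts[i]
        x1, y1 = verts[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        A2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return (cx / (3.0 * A2), cy / (3.0 * A2))
```

In practice one computes $\bar c_{\text{cm}}$ this way and subtracts it from every rest-pose vertex, as described above.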
How do we compute the center of mass, though? There is a powerful tool, Stokes’s Theorem,
which allows us to write down the integral in equation 7.1 (as well as many other area integrals
over Ω) in closed form.
7.1 Stokes's Theorem
Let $w(q) = [w_1(q), w_2(q), \ldots, w_k(q)]^T : \mathbb{R}^k \to \mathbb{R}^k$ be a vector field in $k$ dimensions. The divergence of $w$, written as $\nabla \cdot w$ or $\operatorname{div} w$, is the scalar function
\[
(\nabla \cdot w)(q) = \sum_{i=1}^k \frac{\partial w_i}{\partial q_i}(q).
\]
Notice that in two dimensions, $\nabla \cdot w = \frac{\partial w_x}{\partial x} + \frac{\partial w_y}{\partial y}$ is the sum of the $x$ derivative of the $x$ coordinate of $w$, plus the $y$ derivative of the $y$ coordinate of $w$.
Divergence computes, at every point, how much of a source or sink w has at that point. For
example, suppose w encodes the flow of, say, corn across the United States. A source of w, such
as Iowa, where in all directions w points away from the point, has positive divergence; a sink, like
New York City, where vectors point toward the point, has negative divergence.
Suppose we now want to calculate the total rate at which the United States is exporting corn to other countries. There are two ways to measure this rate:
1. query every location in the United States, and ask how much corn that region of the US is producing, and how much it is consuming; the net rate of corn production at a point in the US is the divergence $\nabla \cdot w$ of the corn flow vector field $w$. Any corn that is produced in the US but not consumed must be exported, so that the total rate of corn export is
\[
\int_\Omega \nabla \cdot w\, dA.
\]
2. examine every point on the border of the US, and measure how much corn is flowing in or out of the country at that border point. Let $\partial\Omega$ denote the boundary of the US, and $\hat n$ the unit outward normal vector at each point along the boundary. At every point on the boundary, $w \cdot \hat n$ measures the rate at which corn is flowing out of the country at that point (the component of $w$ perpendicular to $\hat n$ is flowing parallel to the border, and not contributing to corn import or export). Therefore the total rate of corn export is
\[
\int_{\partial\Omega} w \cdot \hat n\, ds
\]
where $ds$ is the boundary's arclength element.
Two formulas measuring the same thing must be equal:
\[
\int_\Omega \nabla \cdot w\, dA = \int_{\partial\Omega} w \cdot \hat n\, ds.
\]
This formula, (a special case of) Stokes’s Theorem, is incredibly useful, a lot more so than it might
first appear. It allows us to turn a 2D area integral into a 1D boundary integral whenever we can
express the integrand as the divergence of some vector field. We will use this technique to easily
compute the area and center of mass of an arbitrary 2D rigid body, no matter how complex.
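As a quick numerical check of this identity (and a preview of the area computation below): for $w = \frac{1}{2}(x, y)$, whose divergence is the constant 1, the boundary flux of a polygon should equal its area. The sketch below assumes clockwise vertex ordering, so the outward normal is the edge direction rotated $90^\circ$ counterclockwise; the midpoint rule is exact here because $w$ is linear along each straight edge:

```python
def boundary_flux(verts, w):
    """Integrate w . n-hat over the boundary of a polygon whose vertices
    are listed in CLOCKWISE order. The unnormalized normal (-dy, dx) has
    magnitude equal to the edge length, so each term is (w . n-hat) * len;
    the midpoint rule is exact for fields linear along each edge."""
    total = 0.0
    n = len(verts)
    for i in range(n):
        x0, y0 = verts[i]
        x1, y1 = verts[(i + 1) % n]
        nx, ny = -(y1 - y0), (x1 - x0)      # outward normal times length
        mx, my = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        wx, wy = w(mx, my)
        total += wx * nx + wy * ny
    return total
```

On a clockwise unit square this returns 1.0, matching the square's area, as Stokes's Theorem predicts.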
Bonus Math It is instructive to examine the special case of $k = 1$. In one dimension, a region $\Omega$ is just an interval $[a, b]$, and the boundary $\partial\Omega$ is the pair of endpoints $\{a, b\}$. The outward-pointing normal $\hat n$ at $a$ is the vector $-1$, and at $b$, the vector $1$. A 1D vector field $w$ is just a scalar function $w_1(x)$ and divergence the ordinary derivative $\nabla \cdot w = w_1'(x)$.
Putting together all of these observations, Stokes's Theorem in one dimension is
\[
\int_a^b w_1'(x)\, dx = -w_1(a) + w_1(b) = w_1(x)\Big|_a^b,
\]
which you perhaps remember as the Fundamental Theorem of Calculus. Just as the FTC lets us antidifferentiate functions in 1D, Stokes's Theorem lets us antidifferentiate in multiple dimensions.
7.2 Area and Center of Mass
Let us apply Stokes’s Theorem to the problem of finding the area of a rigid body. In some cases
calculating the area of a rigid body is easy, for example, a convex rigid body can be easily decomposed into triangular pieces whose areas can be summed. But complicated, concave rigid bodies
are nontrivial to triangulate; nevertheless, computing the area is easy.
We begin by noting that the area of Ω is, by definition, the integral of the constant function 1
over Ω:
Z
1 dA.
A(Ω) =
Ω
How does this help? Let us try to apply Stokes’s Theorem; to do this, we need to find a vector
field w whose divergence just happens to be the constant 1 over the whole region Ω. This requires
a bit of insight and creativity, but it is not too difficult to find a w that works, for example
w = ½(x, y).
Why this particular vector field w? Why not another vector field like (x, 0) or (0, y), or even
something horrible like w = (x + e^{cos(y)}, x^{7/8})? You can check that all of these vector fields
also have divergence equal to the constant 1. Unlike in 1D, where antiderivatives differ only
by a constant of integration, when applying Stokes’s Theorem there are many possible vector
fields that give the same divergence. Obviously, simpler choices of w will simplify subsequent
calculation, but there is not a strong reason to prefer, say, w = ½(x, y) over w = (x, 0); I
personally prefer the former due to the symmetry in x and y.
We can now apply Stokes’s Theorem to the area calculation:
A(Ω) = ∫_Ω ∇ · ½(x, y) dA = ∫_∂Ω ½(x, y) · n̂ ds.
How do we perform the boundary integration? Let us denote the edge connecting v̄i to v̄i+1 by
ei+1/2, and the outward-pointing normal (which is constant along the entire edge) by n̂i+1/2.¹ We
can write n̂i+1/2 in terms of the edge vertices, assuming that the vertices of ∂Ω are oriented in a
clockwise direction around the boundary:

n̂i+1/2 = Rπ/2(v̄i+1 − v̄i) / ‖v̄i+1 − v̄i‖.
We can now decompose the integral over the boundary into a sum of integrals over each edge,
A(Ω) = Σ_{i=1}^k ∫_{ei+1/2} ½(x, y) · n̂i+1/2 ds.
To finish computing the boundary integral, we parameterize the boundary edge ei+1/2 using an
arclength-parameterized curve
γ(t) = v̄i + t (v̄i+1 − v̄i)/‖v̄i+1 − v̄i‖;
notice that this curve starts at the first endpoint of the edge v̄i when t = 0, ends at the second
endpoint when t = kv̄i+1 − v̄i k, and moves at unit speed between the two endpoints. We then have
A(Ω) = Σ_{i=1}^k ∫_0^{‖v̄i+1 − v̄i‖} ½ ( v̄i + t (v̄i+1 − v̄i)/‖v̄i+1 − v̄i‖ ) · Rπ/2(v̄i+1 − v̄i)/‖v̄i+1 − v̄i‖ dt
and notice that the normal is perpendicular to the linear term in t, so that all terms in the integrand
that depend on t cancel:
A(Ω) = Σ_{i=1}^k ∫_0^{‖v̄i+1 − v̄i‖} ½ v̄i · Rπ/2(v̄i+1 − v̄i)/‖v̄i+1 − v̄i‖ dt
     = ½ Σ_{i=1}^k v̄i · Rπ/2(v̄i+1 − v̄i)
     = ½ Σ_{i=1}^k v̄i · Rπ/2 v̄i+1.
This formula is the popular “shoelace formula” for calculating polygon area, and the method used
to derive it, using Stokes’s Theorem, is a powerful one. Let us use it again to calculate center of
mass.
¹To simplify calculations and avoid having to treat the last edge of the polygon as a special case, we identify
v̄k+1 with v̄1; in practical code, of course, one must carefully handle wrapping around to the beginning of the list of
vertices.
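In code, the shoelace formula is only a few lines. The following sketch is mine, not from the text (the function name and the NumPy vertex layout are my own choices); it assumes the vertices are listed in the clockwise order used in the derivation above, so that the computed area comes out positive:

```python
import numpy as np

def polygon_area(verts):
    """A = 1/2 * sum_i v_i . R_{pi/2} v_{i+1}, with v_{k+1} identified with v_1."""
    v = np.asarray(verts, dtype=float)
    vnext = np.roll(v, -1, axis=0)                        # v_{i+1}, wrapping around
    rot = np.stack([-vnext[:, 1], vnext[:, 0]], axis=1)   # R_{pi/2}(x, y) = (-y, x)
    return 0.5 * np.sum(v * rot)

# unit square, vertices listed clockwise (so the edge normals point outward)
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
print(polygon_area(square))   # 1.0
```

With counterclockwise input the same code returns the negated area, which is one easy way to detect a polygon’s orientation.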
Center of Mass We know how to calculate A(Ω), so we need only calculate the integral

∫_Ω (x, y) dA.

This integral differs from the integral defining rigid body area in that we are integrating a vector-valued function over Ω, but notice that we can reduce this problem to integrating a scalar by simply
considering each coordinate of the center of mass independently:

( ∫_Ω x dA, ∫_Ω y dA ).
Let us look at the first coordinate. Once again we need to find a vector field w whose divergence
is x, for example, w = ½(x², 0), which yields by Stokes’s Theorem

∫_Ω ∇ · ½(x², 0) dA = ∫_∂Ω ½(x², 0) · n̂ ds.
Splitting up the boundary integral, as in the calculation of area, gives

Σ_{i=1}^k ∫_0^{‖v̄i+1 − v̄i‖} ½ ( (x̄i + t (x̄i+1 − x̄i)/‖v̄i+1 − v̄i‖)², 0 ) · Rπ/2(v̄i+1 − v̄i)/‖v̄i+1 − v̄i‖ dt
= Σ_{i=1}^k ∫_0^{‖v̄i+1 − v̄i‖} ½ ( (x̄i + t (x̄i+1 − x̄i)/‖v̄i+1 − v̄i‖)², 0 ) · (ȳi − ȳi+1, x̄i+1 − x̄i)/‖v̄i+1 − v̄i‖ dt
= (1/6) Σ_{i=1}^k (x̄i² + x̄i x̄i+1 + x̄i+1²)(ȳi − ȳi+1)
The last line was derived by integrating, in closed form, the quadratic polynomial in t (with rather
convoluted coefficients) on the second line. The formula can be further simplified by reindexing,
but the form above is already sufficient for implementation. The second coordinate of the center
of mass can be derived similarly, yielding
(1/6) Σ_{i=1}^k ( (x̄i² + x̄i x̄i+1 + x̄i+1²)(ȳi − ȳi+1),  (ȳi² + ȳi ȳi+1 + ȳi+1²)(x̄i+1 − x̄i) ).
7.3 Simulating Rigid Bodies
We can derive a time integrator for rigid bodies by following the usual Hamilton’s Principle recipe.
Let V (q) be some given potential energy that we want to include in a simulation; the other decision
we still need to make is how to write down the kinetic energy of the system. As in the discussion in
the previous chapter, it makes no sense to simply square the velocities of each of the degrees
of freedom—instead we want to look at the moving masses in the simulation and assign each mass
its appropriate energy contribution. In the case of rigid bodies, there are infinitely many masses,
each with infinitesimal mass ρ dA: the kinetic energy thus becomes the integral
T(q, q̇) = ½ ∫_{v̄∈Ω} ρ ‖v̇(q)‖² dA.
Notice that we are integrating over the rigid body’s rest pose Ω, but that the integrand is the
velocity in world coordinates: each point v̄ in the rigid body has a different velocity in world
coordinates (if the object is spinning in place, for instance, with θ̇ ≠ 0 and ṫ = 0, then points
farther from c̄ spin faster than points near this center of rotation) and the integral takes these
differences in velocity into account.
It should come as no surprise that we will be able to turn this area integral into a boundary
integral. In fact, even greater simplifications are possible, if we make the standard assumptions
that c̄ is at the origin and is equal to the center of mass of the rest pose. We then have an explicit
formula for v(q) for any v̄:
T(q, q̇) = ½ ∫_{v̄∈Ω} ρ ‖ (d/dt)[Rθ v̄] + ṫ ‖² dA.
It is possible to compute the explicit time derivative of the Rθ matrix, and this approach will work
in 2D, but does not extend to 3D, so it is worth taking the time to consider this derivative in more
depth. First, consider the case where θ = 0. What is the magnitude and direction of the derivative
of Rθ v̄ when the body is still in its rest pose? Clearly it must be perpendicular to v̄, and, by the
chain rule, linear in θ̇, so
(d/dt)[Rθ v̄] |_{θ=0} ∝ θ̇ẑ × v̄.
The above expression is a slight abuse of notation: θ̇ẑ is a vector in R3 (with the first two components
zero), and you cannot take the cross product of a vector in space with a vector in R2 ; however a
vector in the plane can be converted into a vector in space by padding the vector with a zero ẑ
component, and then taking the cross product of θ̇ẑ with the resulting padded vector always gives
a result that is in the plane (since the result must be perpendicular to θ̇ẑ).
The proportionality constant cannot depend on θ (since θ = 0), cannot depend on v̄ (since Rθ v̄
is linear, the derivative must be as well), and so must be a numerical constant. We can probe the
value of this constant by plugging in a concrete example: say, kv̄k = 1 and θ̇ = 1. In this case Rθ v̄
sweeps out a unit-radius circle (with circumference 2π) in 2π seconds, so that the velocity of v has
length 1. Therefore proportionality in the equation above is actually equality,
(d/dt)[Rθ v̄] |_{θ=0} = θ̇ẑ × v̄.
Now what if θ ≠ 0? Rigidly rotating the entire rigid body does not affect how the body responds
to additional rotation, other than to rotate the resulting velocity vectors. Hence

(d/dt)[Rθ v̄] = Rθ (θ̇ẑ × v̄) = Rθ [θ̇ẑ]× v̄,

where in the third term we’ve written the expression using slightly more compact cross-product-matrix notation.
We can now plug this expression into our formulation of kinetic energy:
T(q, q̇) = ½ρ ∫_Ω ( v̄ᵀ[θ̇ẑ]ᵀ× Rθᵀ Rθ [θ̇ẑ]× v̄ + v̄ᵀ[θ̇ẑ]ᵀ× Rθᵀ ṫ + ṫᵀ Rθ [θ̇ẑ]× v̄ + ṫᵀṫ ) dA.
Obviously, RθᵀRθ = I. Because of the anti-symmetry of the cross product, cross-product matrices are
skew-symmetric: [w]ᵀ× = −[w]×. Moreover, multiplying a vector in the plane by [ẑ]× twice rotates
it by ninety degrees, then rotates it by ninety degrees again, so that [θ̇ẑ]ᵀ×[θ̇ẑ]× = −[θ̇ẑ]²× = θ̇²I. Therefore
T(q, q̇) = ½ρ ∫_Ω ( θ̇² v̄ᵀv̄ + v̄ᵀ[θ̇ẑ]ᵀ× Rθᵀ ṫ + ṫᵀ Rθ [θ̇ẑ]× v̄ + ṫᵀṫ ) dA.
Let us now split up this integral:

T(q, q̇) = ½ρθ̇² ∫_Ω v̄ᵀv̄ dA + ρṫᵀRθ[θ̇ẑ]× ∫_Ω v̄ dA + ½ρṫᵀṫ ∫_Ω 1 dA.
The second integral is the center of mass, which we’ve set to zero by judicious choice of how to
place the rest pose in the plane. The last integral is the area of the body. We can thus write
T(q, q̇) = ½ θ̇ᵀ MI θ̇ + ½ ṫᵀ Mc ṫ
where Mc = ρA(Ω)I is the mass matrix encoding the total mass of the rigid body, and MI is called
the inertia tensor of the body and describes how much kinetic energy is stored in rotation of the
body. The formula for the inertia tensor
MI = ρ ∫_Ω v̄ᵀv̄ dA = ρ ∫_Ω (x̄² + ȳ²) dA
can be computed using Stokes’s Theorem, and e.g. w = ⅓(x̄³, ȳ³). A straightforward but unpleasant
calculation gives a closed-form formula for the ensuing boundary integral,
MI = (ρ/12) Σ_{i=1}^k [ (x̄i + x̄i+1)(x̄i² + x̄i+1²)(ȳi − ȳi+1) + (ȳi + ȳi+1)(ȳi² + ȳi+1²)(x̄i+1 − x̄i) ].
It is important to note that both Mc and MI depend on the object’s rest pose geometry, but are
constant with respect to q or q̇.
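The boundary formula for MI is just as mechanical to implement as the area and center of mass. A sketch under the same conventions as before (clockwise vertices, center of mass at the origin; the function name and the default ρ are my own choices):

```python
import numpy as np

def inertia_tensor_2d(verts, rho=1.0):
    """M_I = rho * integral of (x^2 + y^2) dA via the boundary formula above.
    Assumes clockwise vertices and rest-pose center of mass at the origin."""
    v = np.asarray(verts, dtype=float)
    x, y = v[:, 0], v[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)   # v_{k+1} wraps to v_1
    return (rho / 12.0) * np.sum(
        (x + xn) * (x**2 + xn**2) * (y - yn)
        + (y + yn) * (y**2 + yn**2) * (xn - x))

# unit square centered at the origin: M_I = rho * 2 * (1/12) = 1/6
square = [(-0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.5, -0.5)]
print(inertia_tensor_2d(square))   # ≈ 1/6
```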
Lagrangian We can now put everything together into a Lagrangian describing the rigid body
system:

L(q, q̇) = ½ θ̇ᵀ MI θ̇ + ½ ṫᵀ Mc ṫ − V(q).
This Lagrangian can be discretized in the usual way by assuming that θ and t change at a constant
rate between time step i and i + 1; this gives the discrete Lagrangian
L(qi, qi+1) = ½ ((θi+1 − θi)/h)ᵀ MI ((θi+1 − θi)/h) + ½ ((ti+1 − ti)/h)ᵀ Mc ((ti+1 − ti)/h) − V(qi+1)    (7.2)
and the time integrator follows as usual from applying Hamilton’s principle and extracting the
Euler-Lagrange equations. It may seem a bit silly to treat the scalar MI as if it is a matrix in the
expression above, but doing so will allow us to compare the discrete Lagrangian for 2D bodies to
that of 3D bodies, which we cover in the next chapter.
7.4 A Rigid Body Time Integrator
Let us derive the discrete Euler-Lagrange equations for the discrete Lagrangian we derived in
equation 7.2. We have that
d1L(qi, qi+1) = ( −(1/h)((ti+1 − ti)/h)ᵀ Mc,  −(1/h)((θi+1 − θi)/h)ᵀ MI )
d2L(qi−1, qi) = −dV(qi) + ( (1/h)((ti − ti−1)/h)ᵀ Mc,  (1/h)((θi − θi−1)/h)ᵀ MI );
introduce as usual new variables
ṫi = (ti+1 − ti)/h,    θ̇i = (θi+1 − θi)/h
the equations of motion become, after some minor rearrangement and simplification,
ti+1 = ti + hṫi
θi+1 = θi + hθ̇i
ṫi+1 = ṫi − hMc⁻¹ dtV(qi+1)ᵀ
θ̇i+1 = θ̇i − hMI⁻¹ dθV(qi+1)ᵀ.
These equations bear an uncanny resemblance to the Velocity Verlet integrator for mass-spring
systems, and like Velocity Verlet, this time integrator is explicit (since the first two equations allow
us to explicitly solve for the updated position and orientation of the body, which can then be used
to compute the updated velocities ṫi+1 and θ̇i+1).
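One step of this integrator might look as follows. This is a sketch, not code from the course: the callables dV_t and dV_theta, which return the gradients of the potential with respect to the translation and the angle, are hypothetical names of my own, and the gravity example below is my own illustration.

```python
import numpy as np

def rigid_step(t, theta, tdot, thetadot, h, Mc, MI, dV_t, dV_theta):
    """One explicit step: update position and orientation first, then use the
    end-of-step configuration to update the velocities."""
    t_new = t + h * tdot
    theta_new = theta + h * thetadot
    tdot_new = tdot - h * np.linalg.solve(Mc, dV_t(t_new, theta_new))
    thetadot_new = thetadot - h * dV_theta(t_new, theta_new) / MI
    return t_new, theta_new, tdot_new, thetadot_new

# example: uniform gravity V(q) = m g t_y, no torque on the body
m, g, h = 2.0, 9.8, 0.01
Mc, MI = m * np.eye(2), 1.0
dV_t = lambda t, th: np.array([0.0, m * g])
dV_theta = lambda t, th: 0.0
t, theta = np.zeros(2), 0.0
tdot, thetadot = np.zeros(2), 3.0   # spinning, but initially at rest in translation
t, theta, tdot, thetadot = rigid_step(t, theta, tdot, thetadot, h,
                                      Mc, MI, dV_t, dV_theta)
# after one step: theta = 0.03, tdot = (0, -0.098), thetadot unchanged
```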
The derivation above calls the variables ṫ and θ̇ “velocities,” and uses notation suggestive of
time differentiation in the smooth setting, but how did we know to introduce these variables,
and why do they approximate velocities? Indeed, we fabricated these variables out of whole
cloth: nothing in Hamilton’s principle tells us that these are the natural velocity variables,
other than that choosing these quantities as new variables helps to simplify the time integration
formulas. Moreover, whereas ṫ does indeed have units of velocity (distance over time), the θ̇
variable has units of frequency (hertz) rather than velocity.
The name “velocity” is somewhat justified by the fact that these variables are finite-difference approximations of the time derivative of the configurational variables,

ṫi = (ti+1 − ti)/h ≈ (d/dt) t(t),
and similarly for θ̇i , but it is good to remember that these additional variables are artificial and that the
equations of motion could have easily been written without them, for example, by expressing the configuration
at time step i + 1 in terms of that at time steps i and i − 1:
ti+1 = 2ti − ti−1 − h²Mc⁻¹ dtV(qi)ᵀ
θi+1 = 2θi − θi−1 − h²MI⁻¹ dθV(qi)ᵀ.
Finally, note that we made multiple arbitrary choices in how to discretize the Lagrangian: our
kinetic energy assumes the rigid body moves in piecewise screw motions (piecewise constant translations combined with rotating at piecewise constant speeds), we approximated potential energy
based on end-of-timestep evaluation, etc. Different choices would have led to different integrators,
some explicit, and some implicit.
Chapter 8
Lie Groups and Noether’s Theorem
The simplicity, power, and depth of Noether’s theorem only slowly became apparent. Today, it is an indispensable part of the groundwork of modern physics.
Robert Crease
Before we attack the problem of simulating 3D rigid bodies, we will take a detour to study a
new piece of mathematical machinery: matrix Lie groups, which we will need in the next chapter
to cleanly formulate the configuration space of 3D rigid bodies. Along the way, we will discuss
an especially important and powerful relationship between conservation laws obeyed by a physical
system, and smooth symmetries of the system’s Lagrangian, called Noether’s theorem.
8.1 Non-Euclidean Configuration Spaces
As mentioned in Chapter 0, the configuration space Q of a physical system does not need to be
Euclidean. In fact we have already studied an important class of non-Euclidean configuration
spaces: those derived from the zero level set of a constraint function g : Rn → Rm. The set
of configurations obeying these constraints usually forms an (n − m)-dimensional hypersurface within
n-dimensional space, and as we saw in Chapter 6, restricting trajectories to this hypersurface
introduced considerable complications to time integration.
Why do I say “usually”? Consider first a single constraint function g : Rn → R. Given enough
regularity assumptions on g, the level sets of g look like well-behaved hypersurfaces of dimension
n − 1 that cover or foliate the ambient space Rn , like layers of a pastry or onion. All level sets, that
is, except those containing critical points of g: these level sets can pinch off, fork, look like isolated
points, etc, with no guarantees of having “nice” geometry. Fortunately, it is possible to show that
almost all level sets do not contain critical points (Sard’s lemma), and moreover, we specifically
required, in our definition of a constraint function, that dg(q) ≠ 0 whenever g(q) = 0, so the
zero level set of a constraint function is guaranteed to have the geometry of an (n − 1)-dimensional
hypersurface, with (n − 1)-dimensional tangent hyperplanes at every point.
Geometry of Constraint Manifolds More generally, if c ∈ R is a regular value of g—that is,
if g has no critical points on the level set g = c— then the level set g −1 (c) has manifold structure.
The definition of a manifold is somewhat technical, but the key property we’re interested in is
that g −1 (c) looks, locally, like a piece of m-dimensional space, and has an m-dimensional tangent
hyperplane at every point q ∈ g −1 (c), where in this case m = n − 1.
What about the case of multiple constraints g : Rn → Rk? Requiring 0 to be a regular value
of each constraint is not quite enough: it is true that each level set gi−1(0) is an (n − 1)-dimensional
manifold, and so we expect the intersection of all k of these manifolds to be an (n − k)-dimensional
manifold (just like two surfaces intersect along a common curve in R3). But this is not guaranteed:
two surfaces that are tangent where they intersect can meet at only a point, rather
than along a curve. Therefore we not only want dgi(q) ≠ 0 on the level set g−1(0), but we also don’t
want two differentials to be parallel, i.e. to differ by a constant scale factor.
Even this condition is not enough: the general rule is the following. Suppose g : Rn → Rk is a
smooth function and v ∈ Rk is a regular value of g, that is, the v level set of g contains no critical
points. Then the level set S = g−1(v) is an (n − k)-dimensional manifold. The intuition here again
is that almost every level set of a smooth function looks like a well-behaved hypersurface; the only
places where the level set can have singularities are at places where g has a critical point, where
now we are looking at high-dimensional critical points of a vector-valued function.
What does it mean for g to have a critical point? What does a critical point look like on a
high-dimensional function, and how do we check that a vector, say, 0, is a regular value of g?
Consider first the case of a function g : Rn → R. A critical point of g is a point q where
perturbing q in any direction δq yields no first-order change in g: {dg(q)} (δq) = 0 for all δq.
A regular point is one where for some δq, it is possible to increase g to first order, and for some
other δq, it is possible to decrease g. In other words, dg(q) is surjective when q is a regular
point, or alternatively, the Jacobian [dg(q)] has full row rank.
Now for vector-valued functions g : Rn → Rk , a regular point is similarly any point where, for any
desired infinitesimal change δg in g, there exists some δq for which {dg(q)} (δq) = δg. In other words,
again, q is a regular point if dg is surjective, or [dg] has full row rank.
The general strategy, then, to show that 0 is a regular value is to prove that whenever g(q) = 0, dg(q)
is surjective.
8.1.1 Constructing Manifold Configuration Spaces
Suppose we want to restrict the configuration space to some m-dimensional hypersurface Q ⊂ Rn .
Per the above discussion, one way of doing this is to cook up n − m constraint functions g whose
zero level set is Q, and then use any of the methods from Chapter 6 to do time integration subject
to enforcing the constraints g(q) = 0. If we are careful to choose constraints for which the above
requirement on surjectivity (full row rank) of the Jacobian [dg] holds, we know we’ve found a
minimal set of independent constraints whose zero level sets intersect to give Q.
Examples The set of orthogonal matrices O(n) is a manifold for any dimension n. To show this,
we need to cook up a g whose zero level set is O(n). It is tempting to try

g(M) = MᵀM − I

but although this set of constraints has the right zero level set, the Jacobian doesn’t have full row
rank:

{dg(M)}(δM) = MᵀδM + δMᵀM
is not surjective since the right-hand side is always a symmetric matrix. But we can observe
that MᵀM − I is also symmetric, so instead of requiring MᵀM − I = 0, we only need to look
at the upper-triangular part of MᵀM − I to constrain M to be orthogonal. Let L(M) be the
n(n + 1)/2-dimensional vector consisting of the upper-triangular entries of M. Then

g(M) = L(MᵀM − I)

has surjective differential, and so O(n) is a manifold of dimension n² − n(n + 1)/2 = n(n − 1)/2. It
is possible to show that O(n) consists of two disconnected pieces; one consisting of all orthogonal
matrices with det M = −1, and the other, all orthogonal matrices with det M = 1, often named
the special orthogonal matrices SO(n). Notice that SO(n) is the set of all (orientation-preserving)
rotations in Rn : it follows that this set of orientation-preserving rotations SO(n) is also a manifold
with dimension n(n − 1)/2. Notice that in particular, SO(2) is a curve in R4 , and SO(3) is a
three-dimensional surface in nine-dimensional space.
As a second example, consider the special linear group SL(n) consisting of all n × n matrices
with determinant 1. We again need to find a suitable g. An obvious choice is
g(M) = det M − 1,    {dg(M)}(δM) = det M tr(M⁻¹δM).
Is this differential surjective whenever g(M) = 0? Yes, and it’s easy to show: let α ∈ R be any real
number. Then set δM = (α/n)M. We have

{dg(M)}((α/n)M) = det M tr((α/n)I) = α,

where we have used that det M = 1 and tr(In×n) = n. Hence SL(n) is a hypersurface of dimension
n² − 1 in n²-dimensional space.
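We can sanity-check this computation numerically with a finite difference of det. This is a small sketch of mine (the particular matrix M and the values of n and α are arbitrary choices, not from the text):

```python
import numpy as np

# g(M) = det(M) - 1; the claim is dg(M)[(alpha/n) M] = alpha whenever det M = 1
n, alpha = 3, 0.7
M = np.diag([2.0, 0.5, 1.0])      # det M = 1, so M is in SL(3)
dM = (alpha / n) * M
eps = 1e-6
# central finite difference of g in the direction dM
fd = (np.linalg.det(M + eps * dM) - np.linalg.det(M - eps * dM)) / (2 * eps)
print(fd)   # ≈ alpha = 0.7
```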
Challenges Suppose we wanted to represent a physical system using as the degrees of freedom
elements of a manifold Q ⊂ Rn . What complications do we encounter if we try to formulate time
integration on Q?
• The tangent space T Q of velocities at q ∈ Q can be tricky to characterize. For instance,
suppose we take Q = S 2 , the unit sphere in R3 (perhaps this Q represents the possible
configurations of a 3D pendulum). At every point q0 the sphere has a well-defined tangent
plane, and all configurational velocities at q0 (or equivalently, the set of all possible tangent
vectors to curves q(t) passing through q0 at t = 0) lie in this plane, but the plane is different
for different q0 .
• Given q ∈ Q and v ∈ T Q, it is no longer possible to “walk in a straight line” in the v direction
as we could in Euclidean space: q + tv will not generally lie on Q for any non-zero t, since Q
can curve away from the tangent plane at q.
• Given two points q0 , q1 on Q, the straight line connecting the points is not necessarily contained in Q. The shortest path between q0 and q1 on Q may not be a line, may not exist,
may not be unique (consider the north and south poles on a sphere), and may cross itself
or have other complicated geometric features. These complications cause us difficulties when
we try to discretize the Lagrangian: either we must allow the discrete trajectory to travel
outside of Q, or we must use complicated curves to represent the discrete trajectory, rather
than piecewise straight lines.
With some difficulty, it is possible to formulate continuous and discrete time integration on arbitrary
smooth manifolds, or on even more general spaces Q. But the less structure we have on Q, the
more complicated the procedure becomes. In this chapter, we will study configuration spaces Q
that are non-Euclidean, but still retain a lot of algebraic and geometric structure, allowing us to
bypass all of the above difficulties. These spaces are the matrix Lie groups.
8.2 Matrix Lie Groups
A matrix Lie group is a subset G of n × n matrices satisfying two properties: first, they must have
manifold structure, that is, the set must have the geometry of a hypersurface in Rn×n . Second, they
must form a group: this means that
• I ∈ G;
• for any pair of matrices M1 , M2 ∈ G, the product M1 M2 is also an element of G;
• for any matrix M ∈ G, the inverse M −1 is also in G.
Notice that as a consequence of the last property, matrix Lie groups never contain singular matrices.
Examples Let us list several important Lie groups.
• Rotations of the plane SO(2), the set of two-dimensional rotation matrices, is a Lie group.
The product of two rotations is another rotation, rotations obviously have an inverse (just
rotate in the other direction), and the identity is a rotation, so SO(2) is a group. We argued
above that SO(2) is a manifold, and so SO(2) is a Lie group. We can also represent the group
explicitly: elements of SO(2) are of the form

M = ( cos θ  −sin θ
      sin θ   cos θ )

for θ ∈ R, which, in agreement with our calculation above, has the geometry of a one-dimensional curve in four-dimensional space. At every point M, G has a tangent line spanned
by the vector

( −sin θ  −cos θ
   cos θ  −sin θ ).
• Rotations in general Rotations in any dimension form a group, and we already argued
above that SO(n) is a manifold, so SO(n) is a Lie group for any n.
• Orthogonal matrices Again, orthogonal matrices form a group, and we already argued that
O(n) is a manifold.
• Translations Translations are affine transformations and so do not have a matrix representation; however, we can represent translations by (n + 1) × (n + 1) matrices acting on
n-dimensional vectors represented in (n + 1)-dimensional homogeneous coordinates. Translation matrices Tt then look like

Tt = ( In×n  tn×1
       01×n    1 )

and it is easy to see that T is a group, since T0 = I and T−t = Tt⁻¹. It is also obvious that
T has manifold structure, since it’s just an n-dimensional hyperplane in (n + 1)²-dimensional
space.
8.2.1 Tangent Space at the Identity
An m-dimensional Lie group has an m-dimensional tangent space at every point, including at the
identity I (which is guaranteed to be a point in the Lie group.) The set of all tangent vectors at
the identity is called the Lie algebra of G (often written using fraktur notation, g) and is especially
useful since we will soon see that we can express tangent vectors at any point on G in terms of
elements of the Lie algebra.
Computing the Lie Algebra Computing the Lie algebra for Lie groups defined as level sets of
functions g is especially easy. Let γ(t) : R → G be a curve on G, passing through the identity at
t = 0: γ(0) = I. Then γ′(0) ∈ g, and the set of all possible tangents to curves through I span the
tangent plane of G at I. Curves that lie in G satisfy g(γ(t)) = 0 for all t; differentiating yields

{dg(γ(t))}(γ′(t)) = 0

or, plugging in t = 0,

{dg(I)}(γ′(0)) = 0.

The Lie algebra g is thus the nullspace of the Jacobian of g at the identity.
Examples Let us compute the Lie algebra of the group of rotations in 3D space, SO(3). We
have that g(M) = L(MᵀM − I), and

{dg(M)}(δM) = L(δMᵀM + MᵀδM),

so {dg(I)}(δM) = L(δMᵀ + δM). The nullspace of the differential is thus the set of all matrices
δM with δMᵀ = −δM, or in other words, the set of skew-symmetric 3 × 3 matrices.
How about for the translations, T? The set of translations consists of all matrices of the form

Tt = ( 1 0 0 tx
       0 1 0 ty
       0 0 1 tz
       0 0 0 1 ).

We could easily write down a g whose level set is T (namely, for each zero or one entry in Tt, we
add a constraint gi enforcing that value on the corresponding matrix entry) but since T has the
geometry of a three-dimensional hyperplane in sixteen-dimensional space, it’s obvious that the set
of tangent vectors v at I have the form

v = ( 0 0 0 tx
      0 0 0 ty
      0 0 0 tz
      0 0 0 0 ).
Tangent Space at Other Points Now we will begin to see the power of combining manifold
and group structure. Let’s say we want to compute the set of all tangent vectors at some point
g ∈ G, not necessarily the identity. How do we do this? Well, again, for every tangent vector
v ∈ TgG we can find a curve γ(t) : R → G with γ(0) = g and γ′(0) = v. Because G is a group, we
can multiply every point on γ by g⁻¹. This gives us a new curve,

ψ(t) = g⁻¹γ(t)

with ψ(0) = I and

ψ′(0) = (d/dt) ψ(t)|t=0 = (d/dt) [g⁻¹γ(t)]|t=0 = g⁻¹γ′(0) = g⁻¹v.
In other words, every tangent vector at g has a corresponding tangent vector at I, and vice versa,
and the point itself, g, gives the correspondence. This is such an important fact that I will put
it in a box:
The tangent space of G at g is gg.
In addition to allowing us to easily express the tangent hyperplane at any point (assuming we can
compute the Lie algebra), there is a second, more subtle use for this correspondence: it gives us a
way to “translate” vectors from one point on G to another. In Euclidean space, we are used to the
fact that vectors translate freely: we can take a vector at a point p, and redraw it at a different base
point q. This approach doesn’t work on a manifold: if you take a tangent vector v at p ∈ G and
translate it, in the ambient Euclidean space, to q ∈ G, then almost always the translated vector is
no longer tangent to G. But in a Lie group, the vector qp−1 v is tangent to G at q, and gives us a
canonical way to map tangent vectors from one point to another.
8.2.2 Walking in Straight Lines
Now let us turn to the problem of tracing a “straight line” starting at a point p and in the direction
v tangent to G at p. To make things easier, let us first study the case p = I; in this case v ∈ g.
What does it mean to walk in a straight line?
Obviously, we must walk along a continuous curve γ(t) : R → G, with γ(0) = I and γ′(0) = v.
But after walking a short distance ε along γ, what is the new direction γ should travel along? It
must be a tangent vector to G at γ(ε)—one natural choice is to transport the initial vector v to
this point and walk in that direction, γ(ε)v. This leads to the following ODE characterizing the
curve:

γ′(t) = γ(t)v.
If γ were a real-valued function, the solution would be γ(t) = e^{tv}. It turns out that this solution
also works when v is a matrix, if we define matrix exponentiation by

e^M = Σ_{i=0}^∞ (1/i!) M^i.
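In practice one would call a library routine such as scipy.linalg.expm, but for matrices of modest norm the truncated series itself converges quickly. A quick numerical check of the claim for SO(2) (the sketch and names are mine): exponentiating the skew-symmetric generator scaled by θ should land exactly on the rotation by θ.

```python
import numpy as np

def expm_series(M, terms=30):
    """Truncated power series for e^M (adequate when ||M|| is small)."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for i in range(1, terms):
        term = term @ M / i     # M^i / i!
        out = out + term
    return out

# walk from I in the direction of the skew-symmetric generator of SO(2)
theta = 0.3
v = np.array([[0.0, -theta], [theta, 0.0]])
R = expm_series(v)
print(np.allclose(R, [[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]]))   # True
```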
Is this line really “as straight as possible,” though? We’d like to travel like a steel ball bearing
rolling on a curved magnetic surface, bending just enough to stay on the surface, but not
turning to the side: or in other words, if we start at the south pole and walk north, we should
follow a meridian up to the north pole, without swerves, loops, or detours. Is this behavior
true for the curve γ(t) = etv ? How do we check?
First, we need to formalize what it means to walk straight. We need a natural metric on
Rn×n—the kinetic energy metric M, say—and then we can define an arclength of γ (in the ambient space)
for a piece of the curve t ∈ [0, T] by

L = ∫_0^T ‖γ′(t)‖_M dt.
If γ is a true straight path (called a geodesic) then γ should be the shortest path, in terms of the above
arclength, between I and γ(T), for any T. It turns out this will only be true if the metric is compatible with
the group structure of G, that is, satisfies

(gv)ᵀ M(g) (gw) = vᵀ M(I) w
for any g ∈ G and v, w ∈ g (a property called left-invariance of M ). This property is not always true, but
will turn out to hold for a very important special case, where G = SO(n) represents the orientation of a
rigid body, and M its rotational kinetic energy.
The mapping exp : g → G is called the exponential map and should be interpreted as follows:
the direction of a tangent vector v tells you which way to start walking away from I, and the
magnitude tells you how far to walk. The exponential map is not necessarily one-to-one (injective):
there might be multiple directions you can travel to reach the same point (for example, if you walk
a distance of π in either direction on the unit circle, you will get to the same antipodal point). It
might not be surjective: there can be points unreachable by walking in a straight line from the
identity (for example O(2) has the geometry of two loops in four-dimensional space; only one of
these loops is connected to the identity, so you cannot possibly reach any points on the other loop
by walking.) Still, despite these shortcomings, the exponential map is extremely powerful because
it gives us an automatic Euclidean parameterization of all points near the identity on G: in code,
we can represent vectors in g as arrays of doubles, and calculate using these vectors instead of
points on the much more complicated surface G.
Finally, what if we want to start walking from a different point p? We simply use the group
structure to reduce this problem to walking away from the identity:
γ(t) = p e^{t p⁻¹v}.
We map v back to a direction at I (by multiplying by p⁻¹), compute a path in that direction at
the identity, and then map that path back to the point p.
8.2.3 Walking Between Points
The previous section discussed how to walk in a straight line, given an initial point and direction.
What if we have two points, and want to find a curve connecting them? Let’s again simplify the
problem by first considering the case where one point is at I. We want to find a curve γ : [0, 1] → G
with γ(0) = I, and γ(1) = p for some prescribed point p ∈ G.
The key idea is to again use the exponential map, or rather, its inverse. If we knew that walking
in direction v ∈ g would eventually hit the point p, we could use the exponential map to get our
curve γ. In other words, we need a vector v with ev = p. We can solve this problem by taking the
matrix logarithm of p, where the logarithm is again defined by extending the power series for log
to the matrix setting:
log M = Σ_{i=1}^∞ ((−1)^{i+1}/i) (M − I)^i.
Unlike the exponential, the logarithm of a matrix does not always exist (which makes sense, in
light of the observations above that the exponential map is neither always injective nor surjective),
which can be seen also in the fact that if ‖M − I‖ is too large in the formula above, the terms in
the series will not converge. However, when M is close to the identity, the logarithm exists, and
we can compute our path
γ(t) = e^{t log p}.
If the starting point is not the identity, but some second point q,
γ(t) = e^{t log(p q⁻¹)} q,
where the logarithm of p q⁻¹ will exist so long as the two points are "close" to each other, i.e., p q⁻¹ ≈ I.
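This interpolation is straightforward to sketch numerically. The NumPy helpers below (the series-based expm and logm, valid only near the identity, are assumptions of this sketch) connect two nearby rotations:

```python
import numpy as np

def expm(A, terms=60):
    """Matrix exponential via its power series."""
    result, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def logm(M, terms=200):
    """log M = sum_i (-1)^{i+1}/i (M - I)^i; converges when M is near I."""
    X = M - np.eye(M.shape[0])
    result, term = np.zeros_like(M), np.eye(M.shape[0])
    for i in range(1, terms):
        term = term @ X
        result = result + ((-1.0) ** (i + 1) / i) * term
    return result

def cross_matrix(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

# Two nearby points on SO(3); for rotations, q^{-1} = q^T.
q = expm(cross_matrix([0.10, -0.05, 0.20]))
p = expm(cross_matrix([0.15, 0.05, 0.12]))

def gamma(t):
    """gamma(t) = exp(t log(p q^{-1})) q, with gamma(0) = q and gamma(1) = p."""
    return expm(t * logm(p @ q.T)) @ q
```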
8.3 Noether's Theorem
Before we move on to the use of Lie groups in representing physical systems (and in particular,
rigid bodies), let’s take a brief detour and describe one of the foundational theorems in classic
mechanics: Noether’s theorem, which, informally, states that conserved quantities like momentum
and energy come from symmetries in the system’s Lagrangian.
Smooth Symmetries Given configuration space Q, and a function f (q) over Q, a symmetry of
f is a bijection φ : Q → Q that leaves the value of f unchanged: in other words, f (φ[q]) = f (q)
for all points q ∈ Q. For example, if Q = R, and f (x) = x2 , then the antipodal map φ(x) = −x is
a symmetry of f since x2 = (−x)2 for all real numbers x.
A smooth symmetry is a smooth one-parameter family of symmetries: instead of taking in just
q, a smooth symmetry φs (q) is also a function of a parameter s ∈ R, with the properties that
• φs is a symmetry for each s;
• φ0 is the identity, i.e. φ0 (q) = q for all q ∈ Q;
• φs (q) is smooth.
For example, if Q = R² and f(q) = ‖q‖, then the rotations about the origin,
φs(q) = [cos s, −sin s; sin s, cos s] q,
are a smooth one-parameter family of symmetries of f. Denoting by φ′s the s-derivative of φ, it follows from differentiating both sides of the defining equation f(φs[q]) = f(q) and evaluating at s = 0 that
[df(q)] φ′0(q) = 0    (8.1)
for all q ∈ Q; in other words, if we imagine "animating" Q by moving all points q ∈ Q along a path q(t) = φt(q), this animation would keep all points q on their original level sets of f; equivalently, equation (8.1) states that all velocities lie tangent to a level set.
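Equation (8.1) can be checked directly. In this NumPy sketch (the particular point q is an arbitrary choice), the symmetry is the rotation family from the example above, whose generator at s = 0 is φ′0(q) = Jq with J the 90-degree skew matrix:

```python
import numpy as np

# f(q) = |q| on Q = R^2 is invariant under rotations phi_s(q) = R(s) q.
# The derivative of R(s) at s = 0 is the skew matrix J, so phi'_0(q) = J q.
J = np.array([[0.0, -1.0],
              [1.0, 0.0]])

def f(q):
    return np.linalg.norm(q)

def df(q):
    return q / np.linalg.norm(q)  # the differential of f, as a row vector

q = np.array([1.3, -0.7])
pairing = df(q) @ (J @ q)  # [df(q)] phi'_0(q), equation (8.1)
```

The pairing vanishes identically: the symmetry velocity Jq is tangent to the circle ‖q‖ = const through q.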
Noether’s Theorem Now let us look specifically at the case where the function f in question is
the Lagrangian L(q̇, q). A smooth symmetry of L affects not just a single point q, but the entire
trajectory q(t), and in particular, maps the configurational velocity q̇ to the time derivative of the
transformed trajectory,
(d/dt) φs(q[t]) = {dφs(q)}(q̇).    (8.2)
The fact that φs is a smooth symmetry of the Lagrangian thus implies that
L [{dφs (q)} (q̇) , φs (q)] = L(q̇, q).
This equation holds for any value of s, so as above we can differentiate with respect to s and
evaluate at s = 0:
d1 L[q̇, q] {dφ′0(q)}(q̇) + d2 L[q̇, q] φ′0(q) = 0.
Using the Euler-Lagrange equations, and equation (8.2), we can simplify this equation to
d1 L[q̇, q] ((d/dt) φ′0(q)) + ((d/dt) d1 L[q̇, q]) φ′0(q) = 0,
or, after combining the two terms,
(d/dt) (d1 L(q̇, q) φ′0(q)) = 0.
In other words, the quantity d1 L(q̇, q) φ′0(q) is conserved: it is invariant over time.
Let us now connect this general result back to Lie groups. A Lie group G, since it consists of
a subset of n × n matrices, maps Rn → Rn and so each element g ∈ G encodes a special kind of
symmetry φ: those that are linear in q. If a function f is invariant with respect to every element
of G, then f is invariant with respect to the smooth symmetries φs(q) = e^{sv} q, for any element v ∈ g. Smooth symmetries of that form satisfy φ′0(q) = vq, and so we can write Noether's theorem
for the special case of Lie groups: suppose that a system has a Lagrangian that is invariant with
respect to the action of the Lie group G, i.e., that L(g q̇, gq) = L(q̇, q) for every g ∈ G. Then the
quantity d1 L(q̇, q) v q is conserved, for all v ∈ g. If the Lagrangian is of the standard form, with kinetic energy ½ q̇ᵀ M(q) q̇, and potential energy which does not depend on q̇, then the conserved quantity can be simplified to
[M(q)q̇]ᵀ v q.
Examples Let us look at a few special cases of Noether's theorem.
• Rotation invariance in R3 : suppose Q = R3 and the Lagrangian is isotropic—invariant with
respect to rotations G = SO(3). We know that so(3) is the set of skew-symmetric matrices,
which are also the set of cross-product matrices [w]× for some axis w ∈ R³. Then we have that the conserved quantity is q̇ᵀ M(q) [w]× q, or in other words, the angular momentum
p · (w × q) = w · (q × p)
with respect to axis w is conserved, where p = M(q)q̇ is the momentum.
• Rotation invariance for particle systems: suppose now instead that Q = R3n , encoding the
positions of n particles, q = [x1 , y1 , z1 , x2 , . . .]T , and that the Lagrangian is isotropic, which
in this setting means that it is invariant with respect to rotating all particles by the same
rotation R. This situation corresponds to symmetry of L with respect to the block-diagonal
matrices G = diag(R, R, . . .) for R ∈ SO(3). It is not hard to show that this G is a Lie group,
whose Lie algebra is simply g = diag(v, v, . . .) with v ∈ so(3). Again writing v = [w]× for
an axis vector w, the conserved quantity is then
pᵀ v q = Σ_{i=1}^{n} p_i · (w × q_i) = w · Σ_{i=1}^{n} (q_i × p_i),
the usual law of conservation of angular momentum.
• Translation invariance in R3 : let us return to the single particle case, and consider Lagrangians
that are translation-invariant. As discussed above, translation is not a linear operation on
R³, but if we write the position of the particle as
q = (x, y, z, 1)ᵀ ∈ Q = R⁴,
the set of translations by vectors (tx, ty, tz) is a Lie group, whose elements and Lie algebra are the matrices

    ⎡ 1 0 0 tx ⎤         ⎡ 0 0 0 tx ⎤
T = ⎢ 0 1 0 ty ⎥ ;   t = ⎢ 0 0 0 ty ⎥ ,   (tx, ty, tz) ∈ R³.
    ⎢ 0 0 1 tz ⎥         ⎢ 0 0 0 tz ⎥
    ⎣ 0 0 0 1  ⎦         ⎣ 0 0 0 0  ⎦

Notice that if v ∈ t, then vq is constant, and equal to the translation vector t, no matter what q is. Therefore the conserved quantity, by Noether's theorem, is
p · t,
exactly conservation of linear momentum. As with the case of rotations, this conservation
law extends directly to particle systems of n individual particles.
Discrete Noether’s Theorem One of the powerful consequences of Hamilton’s principle as
a recipe for constructing numerical integrators is that the resulting integrators satisfy a discrete
Noether’s theorem: if the discrete Lagrangian is invariant with respect to a smooth symmetry, the
time integration method will conserve a corresponding physical quantity exactly, regardless of the
time step! To see this, let us suppose the discrete Lagrangian is invariant to smooth symmetry φs :
L(φs q^i, φs q^{i+1}) = L(q^i, q^{i+1}).
Repeating the above calculation, differentiating with respect to s and evaluating at s = 0 gives
d1 L(q^i, q^{i+1}) φ′0(q^i) + d2 L(q^i, q^{i+1}) φ′0(q^{i+1}) = 0.
Plugging in the discrete Euler-Lagrange equations, we get
d1 L(q^i, q^{i+1}) φ′0(q^i) − d1 L(q^{i+1}, q^{i+2}) φ′0(q^{i+1}) = 0,
or in other words, d1 L(q^i, q^{i+1}) φ′0(q^i) is conserved as you increment the time step i → (i + 1). If the symmetry comes from a Lie group,
d1 L(q^i, q^{i+1}) v q^i
is conserved, for all elements v of the Lie algebra.
As a concrete example, let us look at the case of Velocity Verlet for a single particle system,
which arises from the discrete Lagrangian
L(q^i, q^{i+1}) = ((q^{i+1} − q^i)/h)ᵀ M ((q^{i+1} − q^i)/h) − V(q^{i+1}).
If the potential V is rotation-invariant, so is L, and hence for v a cross-product matrix [w]× ,
((q^{i+1} − q^i)/h)ᵀ M (w × q^i)
is conserved, which is the discrete analogue of angular momentum, as is apparent if we use the triple product cyclic permutation identity to rewrite the quantity as
w · (q^i × M (q^{i+1} − q^i)/h).
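We can check this exact conservation numerically. The NumPy sketch below (the spring-like central potential and all constants are made-up choices for the example) steps the discrete Euler-Lagrange equations of this Lagrangian, i.e. Verlet in position form, for a single particle in a rotation-invariant potential, and monitors the discrete angular momentum:

```python
import numpy as np

m, h, k, r0 = 2.0, 0.01, 5.0, 1.0

def grad_V(q):
    """Gradient of the central (rotation-invariant) potential V(q) = (k/2)(|q| - r0)^2."""
    r = np.linalg.norm(q)
    return k * (r - r0) * q / r

def discrete_angular_momentum(q0, q1):
    """q^i x M (q^{i+1} - q^i)/h; each component is the quantity for one axis w."""
    return np.cross(q0, m * (q1 - q0) / h)

# Verlet in position form: the discrete Euler-Lagrange equations.
q_prev = np.array([1.2, 0.0, 0.1])
q_curr = q_prev + h * np.array([0.0, 0.9, 0.2])

J0 = discrete_angular_momentum(q_prev, q_curr)
for _ in range(200):
    q_next = 2.0 * q_curr - q_prev - (h * h / m) * grad_V(q_curr)
    q_prev, q_curr = q_curr, q_next
J_final = discrete_angular_momentum(q_prev, q_curr)
```

The deviation between J0 and J_final stays at the level of floating-point roundoff regardless of the step size h, whereas an energy-style quantity would show O(h²) error.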
Chapter 9
Rigid Bodies in Three Dimensions
How about sending me a fourth gimbal for Christmas?
Mike Collins
We now study how to simulate rigid bodies in three dimensions. The framework we will use is
the same as in two dimensions: we start from a template of the rigid body, positioned in space so
that its center of mass lies at the origin (0, 0, 0). We can then represent any deformed pose of the
rigid body as a rigid motion (rotation and translation) of the template. The extra wrinkle in 3D is
that rotations are much more complex than they were in 2D (where rotations could be represented
by simply a scalar angle); this wrinkle will turn out to be far from minor. Before we work through
the details of the equations of motion for 3D rigid bodies, we will review rotations in R3 and how
to represent them.
9.1 Rotations in 3D
A rotation is a linear transformation R(v) of vectors in R³ that preserves lengths: ‖R(v)‖ = ‖v‖ for all vectors v ∈ R³. It is clear that such a transformation must fix the zero vector, and that (like all linear transformations) it can be expressed as multiplication of v by a 3 × 3 matrix, R(v) = Rv. It is perhaps less clear that any length-preserving transformation must also preserve angles. This is because of the polarization identity
‖v + w‖² = ‖v‖² + 2(v · w) + ‖w‖²,
which allows all dot products (and hence angles) to be expressed in terms of vector lengths; since rotations preserve lengths, they must also preserve dot products.
Algebraically, preserving dot products implies that
eTi (RT R)ej = Rei · Rej = ei · ej = δij ,
where ei is the Euclidean basis vector with a one in the ith entry and zeros everywhere else; therefore
RT R = I. Conversely, any matrix with RT R = I represents a linear transformation that preserves
lengths and angles, and so the set of all rotations in 3D can be represented by the set of orthogonal
matrices O(3) satisfying this identity.
Figure 9.1: Left: A mechanical gimbal: concentric metal rings attached to each other along separate
axes allow a gyroscope (metallic object in the center) to achieve every possible orientation in 3D.
Right: A rotation represented as an axis of rotation and an angle. The diagram shows the right-hand rule convention governing the direction of rotation.
Since the determinant of a product of matrices is the product of the determinants, and the determinant of a matrix is the same as that of its transpose, (det R)² = 1 and det R = ±1. It is sometimes important to differentiate between orientation-preserving rotations, with det R = 1, and orientation-reversing rotations with det R = −1. Notice in particular that all reflections are just special cases of orientation-reversing rotations (they are linear transformations and they preserve vector lengths). But reflections are not what one typically means in everyday language when one speaks of a rotation, and in this chapter too we will restrict our attention to orientation-preserving rotations, sometimes called special orthogonal matrices SO(3), which we studied in the previous chapter.
Myriad different schemes have been invented for how to represent a rotation. We review a few
of them here:
Rotation Matrix We can simply represent a rotation using its matrix form R. This representation requires nine degrees of freedom (one per entry of the 3×3 rotation matrix) but these degrees of
freedom cannot be chosen arbitrarily, since the matrix must satisfy the defining equation RT R = I.
This matrix equation is really 9 separate scalar constraints on the entries of R, but three of these
constraints are redundant. In summary, representing the rotation as a matrix requires storing 9
DOFs satisfying six quadratic constraints. Rotation matrices have many advantages, including ease
of use, intuitiveness, and algebraic simplicity, but they are not compact and the constraints are
awkward to enforce in applications like physical simulation.
Euler Angles The rotation can be expressed by successive rotations about the Euclidean axes ("first rotate by θ degrees about the z axis, then rotate by ψ about the y axis, then rotate..."). Three angles are enough to represent every possible rotation in R³, but there are many different conventions about which axes to use, and in which order to rotate about them; there is not a single standard formulation of Euler angles.
Historically, Euler angles came about as they are a natural way of describing the configuration
of a gimbal, a mechanical device consisting of three concentric rings attached to each other along
different axes (see figure 9.1). Euler angles are also commonly used in aviation, where the angles
are called the yaw, pitch, and roll of the airplane. Unlike the rotation matrix representation, Euler
angles are compact: you only need three degrees of freedom to represent any rotation. However,
except for simulating an actual mechanical gimbal, Euler angles are highly unsuitable for use
in physical simulation due to a phenomenon called gimbal lock, where in certain degenerate
configurations the gimbal loses degrees of freedom and can no longer rotate in all possible directions.
In simulations gimbal lock manifests as singularities that appear when trying to step forward in time.
Also, notice that although every rotation can be represented by a set of Euler angles, these angles
are not unique: rotating about the x axis by θ radians, for example, represents the same rotation
as rotating about the axis by 2π + θ radians. We won’t go into any more depth about the details
and subtleties of using Euler angles, given their unsuitability for our purpose.
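Gimbal lock is easy to exhibit numerically, though. In the NumPy sketch below (the z-y-x convention is just one arbitrary choice of Euler angle ordering), once the middle "pitch" angle reaches 90 degrees, the outer and inner rotations act about the same world axis, so only their difference matters and a degree of freedom is lost:

```python
import numpy as np

def Rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def euler_zyx(yaw, pitch, roll):
    return Rz(yaw) @ Ry(pitch) @ Rx(roll)

# Two distinct (yaw, roll) pairs with the same difference yaw - roll:
# at pitch = 90 degrees they produce the exact same rotation matrix.
A = euler_zyx(0.7, np.pi / 2, 0.2)
B = euler_zyx(0.9, np.pi / 2, 0.4)
```

Away from the degenerate pitch, the same two angle triples give genuinely different rotations.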
Axis-Angle Every rotation in 3D fixes an axis: in other words, for every rotation R there exists a vector a for which Ra = a. (The trivial rotation R = I fixes every axis, and is the only rotation that fixes more than one axis.) We can therefore represent every rotation by (i) the axis a that it fixes, and (ii) the angle θ, in radians, to rotate about that axis. Since the axis is just a direction (and needs no magnitude information), the axis and angle information can be stored compactly as a single three-dimensional axis-angle vector θ = θa. Some subtleties of this representation:
• A convention is needed for which way to rotate about a by θ radians. The usual convention
is to rotate counterclockwise, in the following sense: orient your view so that a points directly
towards you. Then the rotation represented by axis-angle θa is counterclockwise rotation,
as viewed from your perspective. There is a mnemonic for this convention, similar to the right-hand rule mnemonic for cross products: make a fist with your thumb extended, as if you were a hitchhiker hailing a ride. Point your thumb so that it points in the direction a. Then the direction of rotation is the direction that your non-thumb fingers curl around your fist. See figure 9.1 for a diagram.
• Spinning about a by θ radians is the same net rotation as spinning about −a by −θ radians. Therefore there is no information lost when multiplying a and θ together to form the axis-angle vector θ: given an axis-angle vector, we can take the axis of rotation to be θ̂ and the amount of rotation to be the positive magnitude ‖θ‖.
• The trivial rotation (identity transformation) does not fix a unique axis. However we can still
represent the identity rotation in axis-angle form as the zero vector (0, 0, 0).
• An axis-angle vector can be converted into a rotation matrix by analyzing how the rotation acts on components of vectors parallel to θ and perpendicular to θ. A rather involved geometric calculation yields Rodrigues's Rotation Formula for the matrix corresponding to any axis-angle vector,
rot(θ) = cos‖θ‖ I + sin‖θ‖ [θ̂]× + (1 − cos‖θ‖) θ̂ θ̂ᵀ.
As a sanity check, notice that [rot θ] θ = θ.
• Like with Euler angles, the axis-angle representation of a rotation is not unique. For example,
θa and (2π + θ)a both represent the same rotation. This non-uniqueness does not hinder use
of axis-angle vectors in practice, but is an important caveat to keep in mind. To compute an axis-angle vector representing the rotation with matrix R, first the axis a can be computed: since it is fixed, Ra = a, so a is an eigenvector of R with eigenvalue 1. Finally θ can be calculated by measuring the angle between v and Rv, where v is any test vector perpendicular to the axis a.
• Rotations generally do not commute. In particular, it is not true that the composition of two
rotations is just the sum of their axis-angle representations:
rot(θ1 + θ2) ≠ [rot θ1][rot θ2]
except for certain special cases, such as when θ1 and θ2 are parallel. Resist the temptation!
Lie Algebra Recall from the previous chapter that SO(3) is a Lie group, and that we can thus represent elements of the Lie group by elements of the Lie algebra so(3). Moreover, this Lie algebra consists of all skew-symmetric matrices, which are also all cross-product matrices [θ]× for vectors θ ∈ R³. This suggestive choice of notation is not an accident: the Lie algebra representation of rotations turns out to be the same as the axis-angle representation, where the matrix exponential of the cross-product matrix gives the same rotation as applying Rodrigues's Rotation Formula to θ:
e^{[θ]×} = rot θ.
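A small NumPy sketch of these formulas (the truncated-series expm helper is an assumption of the sketch, adequate for small matrices) checks both Rodrigues's formula and its agreement with the matrix exponential:

```python
import numpy as np

def cross_matrix(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rot(theta):
    """Rodrigues's Rotation Formula for the axis-angle vector theta."""
    angle = np.linalg.norm(theta)
    if angle == 0.0:
        return np.eye(3)
    axis = theta / angle
    return (np.cos(angle) * np.eye(3)
            + np.sin(angle) * cross_matrix(axis)
            + (1.0 - np.cos(angle)) * np.outer(axis, axis))

def expm(A, terms=60):
    """Matrix exponential via its power series."""
    result, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

theta = np.array([0.4, -0.2, 0.7])
R = rot(theta)
```

As sanity checks: R fixes its own axis, R is orthogonal, and exp([θ]×) reproduces rot θ.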
Quaternion Finally, there is a very elegant representation of rotations in terms of unit quaternions. This representation is very similar in spirit to the axis-angle representation, and is popular in practice since many common operations, such as composing rotations together, are more efficient when working with quaternions than with axis-angle vectors.
In this lecture, we will work with the axis-angle/Lie algebra representation, as it is a good
compromise between geometric elegance, intuitiveness, and efficiency, making it a suitable representation for simulation. In practical codes one would almost always use quaternions instead of
axis-angle vectors, but we will not do so here, since working with quaternions adds an extra level of
complexity and opacity to the already subtle and challenging topic of rigid body dynamics. Everything we do in this chapter can be converted to use quaternions instead, once you are comfortable
with the axis-angle approach.
Bonus Math Why does every rotation fix an axis? This somewhat surprising result can be proved with
a little bit of linear algebra. Let R be a rotation matrix in SO(3). Since RᵀR = RRᵀ, the matrix R is normal and therefore has a full set of (not necessarily real) eigenvalues; let λ be one eigenvalue, with eigenvector v. Since R preserves lengths of vectors,
‖v‖² = ‖Rv‖² = (Rv)ᴴ(Rv) = (λv)ᴴ(λv) = |λ|²‖v‖²,
so λ must have modulus 1.
Now let λ1 , λ2 , λ3 be the three eigenvalues of R. Since det R = 1,
λ1 λ2 λ3 = 1.
There are now two cases: if all three eigenvalues are real, they are all equal to ±1, and so either all three
are 1 (and the matrix is the identity matrix and fixes all axes) or one eigenvalue is 1 and the other two
are −1. If some of the eigenvalues are complex, then exactly two must be complex, and the third one
real, with the two complex eigenvalues conjugates of each other. Call the real eigenvalue λ3 . Then since
λ1 and λ2 are complex conjugates,
λ1 λ2 = |λ1|² = 1,
and λ3 = 1.
In both cases the matrix R must have 1 as an eigenvalue. The eigenvector a corresponding to this eigenvalue must be real, and must satisfy Ra = a, i.e., it must be fixed by R.
Notice that the above argument relies critically on the fact that R is a 3 × 3 matrix. The argument
can be extended to any odd dimension (complex eigenvalues must still pair up into conjugate pairs) but
fails in even dimension: in fact, in 4D, there exist rotations that do not fix any axis (consider for instance
a rotation that independently spins the xy plane, and the zw plane).
9.2 Representing 3D Rigid Bodies
Once again we represent rigid bodies moving through space as transformations of a template body
Ω, which we position so that its center of mass lies at the origin (0, 0, 0). (In this chapter we again
assume, for simplicity, that the density ρ of the object is constant).
Any pose of this body can be described by starting with Ω, then (i) first rotating it (about the
origin, i.e. its center of mass) by axis-angle θ; and then (ii) translating it by the vector t. In other
words, we can represent the rigid body's motion using a six-dimensional configuration space, with the configuration q = (θ, t) containing three degrees of freedom (θ) encoding the orientation of the body, and three (t) encoding the translation away from the template.
Therefore if v̄ ∈ Ω is a point on the template, the position of the corresponding point on the
deformed body v(q) is given by
v(q) = rot(θ) v̄ + t.    (9.1)
All we need now in order to derive the equations of motion of 3D rigid bodies is to formulate
a Lagrangian L(q, q̇) and apply Hamilton's principle, and for this, we need the kinetic and potential energy of the system as a function of the configuration and configurational velocity q and q̇. We leave the potential energy generic, and assume it depends only on the body's position
and orientation, and not velocity: V (q). To calculate the kinetic energy, we follow the reasoning
from the 2D case, and integrate up the contribution to kinetic energy from each material point v̄
in the template:
T(q, q̇) = ∫_Ω (ρ/2) ‖(d/dt) v(q)‖² dV.
What is the velocity v̇ of the material point v̄? We simply differentiate equation 9.1 with respect
to time:
v̇ = (d/dt) [rot(θ) v̄] + ṫ = {d rot(θ)}(θ̇) v̄ + ṫ.
That first term needs to be carefully unpacked: d rot(θ) is the differential of the function rot, which
takes in an axis-angle vector and produces a rotation matrix, at the point θ: it is a function which
maps an infinitesimal change δθ in θ to an infinitesimal change in the rotation matrix rot θ. Due
to the chain rule, that differential is being evaluated for δθ = θ̇, and then the result (a 3 × 3
matrix representing the infinitesimal change in the rotation matrix rot θ over time) is applied to the constant vector v̄.
To get a tractable expression for kinetic energy, and hence the rigid body Lagrangian, we need
a better understanding of the differential d rot(θ). Before returning to the problem of writing down
the rigid body’s kinetic energy, we take a detour to study this function in more detail.
9.3 The Exponential Map, Its Derivative, and Angular Velocity
Recall that rot θ can equivalently be expressed in terms of the matrix exponential of a cross-product matrix, exp([θ]×). It is very tempting to differentiate this function as one would the scalar exponential, namely,
{d exp(M)}(δM) ?= e^M δM,
but this is wrong! To see the error, let us expand e^M using the power series definition, and take the derivative very carefully:
d[I + M + M²/2 + M³/3! + ...](M)(δM)
= δM + (δM M + M δM)/2 + (δM M² + M δM M + M² δM)/3! + ...,
where we have applied the product rule to every term in the power series. We would like to simplify, e.g., δM M + M δM to 2M δM, but we cannot do so because generally speaking matrices do not commute, and M and δM are arbitrary: there is no reason to expect M and δM to commute.
There is therefore, unfortunately, no simple closed-form formula for the derivative of the matrix
exponential. For the special case of skew-symmetric matrices, though, we can fall back to differentiating Rodrigues’s Rotation Formula. This derivative is not at all pleasant and does not have an
especially compact or simple form; it can be shown using extensive geometric arguments1 that for
any vector v,
[{d rot(θ)}(δθ)] v = −rot(θ) [v]× T(θ) δθ,
where T(θ) is the matrix
T(θ) = I when ‖θ‖ = 0, and T(θ) = (θθᵀ + (rot(−θ) − I)[θ]×) / ‖θ‖² when ‖θ‖ > 0.
Notice that this formula does not directly give us the Jacobian of rot (which would require a rank-three tensor), but rather the Jacobian applied to a given fixed vector v. Notice a few useful properties of this T(θ) matrix:
• it leaves vectors parallel to θ unchanged: T (θ)θ = θ;
• like with rot itself, negating the input angle yields the transpose of the matrix: T(−θ) = T(θ)ᵀ. Unlike with rotation matrices, though, T(θ) is not generally orthogonal.
¹Gallego and Yezzi, "A Compact Formula for the Derivative of a 3-D Rotation in Exponential Coordinates." Journal of Mathematical Imaging and Vision, Volume 51, Issue 3, March 2015, 378–384.
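The derivative formula can be sanity-checked against finite differences. In the NumPy sketch below, the rot, cross_matrix, and T helpers are simply direct implementations of the formulas in the text, and the particular test vectors are arbitrary choices:

```python
import numpy as np

def cross_matrix(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rot(theta):
    """Rodrigues's Rotation Formula."""
    angle = np.linalg.norm(theta)
    if angle == 0.0:
        return np.eye(3)
    axis = theta / angle
    return (np.cos(angle) * np.eye(3)
            + np.sin(angle) * cross_matrix(axis)
            + (1.0 - np.cos(angle)) * np.outer(axis, axis))

def T(theta):
    """The matrix T(theta) appearing in the derivative of rot."""
    n2 = float(theta @ theta)
    if n2 == 0.0:
        return np.eye(3)
    return (np.outer(theta, theta)
            + (rot(-theta) - np.eye(3)) @ cross_matrix(theta)) / n2

theta = np.array([0.3, -0.5, 0.2])
dtheta = np.array([0.1, 0.04, -0.07])
v = np.array([1.0, 2.0, -0.5])

# [(d rot(theta)) dtheta] v, from the formula and from central differences:
formula = -rot(theta) @ cross_matrix(v) @ (T(theta) @ dtheta)
eps = 1e-5
numeric = (rot(theta + eps * dtheta) @ v - rot(theta - eps * dtheta) @ v) / (2.0 * eps)
```

The two agree to finite-difference accuracy, and T(θ)θ = θ as claimed.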
Angular Velocity Where does that leave us in terms of calculating the velocity of a particle inside a rigid body? We could compute (d/dt) rot(θ[t]) using the chain rule, and the formulas above, but the calculation quickly becomes unwieldy. Instead, we make the following observation: this derivative is a tangent rotation at rot θ, and so, since rotations form a Lie group, must be the product of rot θ by some skew-symmetric matrix [ω]×:
(d/dt) rot(θ[t]) = [rot θ(t)] [ω(t)]×.    (9.2)
This infinitesimal rotation, which we can express using the axis-angle ω, is called the angular
velocity. Note that angular velocity implicitly depends on both θ and θ̇, but we will find it is much
easier to work with ω rather than these parent quantities. This is because the angular velocity
is something that can be measured directly from the trajectory of the rigid body: the angular
velocity encodes about what axis (in the body’s template coordinates), and how fast, the rigid
body is currently spinning. In contrast, the relationship between the spin of the rigid body and θ̇
is a much more complicated one, depending also on how the current orientation of the rigid body
differs from the rest orientation. The former is a physical quantity; the latter, an artificial one that
depends on how we’ve chosen to set up our rigid body template.
Example Consider the earth’s orbit around the sun. We can set up a coordinate system where
the sun is at (0, 0, 0), and the earth orbits in the xy plane. We can set up a template for the earth,
oriented so that the earth’s center is at the origin, the equator lies in the xy plane, and the north
and south pole lie on the z axis.
• The earth’s axis is tilted with respect to the earth’s orbital plane. Therefore the earth’s
orientation θ(t) represents a rotation that tilts the north pole slightly away from the z axis.
This rotation fixes some axis, but this axis is not obvious (there can be a “twisting” motion
of the earth about its axis, in addition to the tilting of the poles). θ(t) is periodic over time,
and has the same value at the same time each day.
• The earth spins about its axis at a speed of 2π radians per day. Therefore the angular velocity
ω(t) of the earth, in units of radians per second, is
ω(t) = (2π / (24 · 60 · 60)) ẑ.
Notice that the angular velocity is not parallel to θ. Instead, it is parallel to the earth’s polar
axis, in the earth’s template coordinates.
• The axis of the earth’s rotation, in world coordinates, varies over time and can be computed
by mapping ω from template to world coordinates: [rot θ(t)]ω(t).
9.4 Equations of Motion
We now return to the question of deriving the kinetic energy for a rigid body. We can write down
the velocity of a material point on Ω in terms of orientation and angular velocity using equation 9.2:
v̇(t) = [rot θ(t)][ω(t)]× v̄ + ṫ(t).
Intuitively, the cross product with ω computes the velocity of the point due to its spinning around the ω axis in template coordinates; this velocity is then rotated into world coordinates using the current body orientation θ.
The total kinetic energy of the rigid body is then
T(q, q̇) = (ρ/2) ∫_Ω ‖[rot θ(t)][ω(t)]× v̄ + ṫ‖² dV
= (ρ/2) ( ∫_Ω ‖[rot θ(t)][ω(t)]× v̄‖² dV + 2 ∫_Ω ṫᵀ [rot θ(t)][ω(t)]× v̄ dV + ∫_Ω ‖ṫ‖² dV )
= (ρ/2) ( ∫_Ω ‖[rot θ(t)][ω(t)]× v̄‖² dV + 2 ṫᵀ [rot θ(t)][ω(t)]× ∫_Ω v̄ dV + vol(Ω)‖ṫ‖² ).
Since the template is placed with its center of mass at the origin, ∫_Ω v̄ dV = 0 and the middle term drops out. We can further simplify the first term, first by expanding out the norm-squared term and canceling the rotations, then by using the fact that the cross product is anti-commutative, so that [ω]× v̄ = −[v̄]× ω:
T(q, q̇) = (ρ/2) ∫_Ω ‖[rot θ(t)][ω(t)]× v̄‖² dV + (ρ/2) vol(Ω)‖ṫ‖²
= (ρ/2) ∫_Ω v̄ᵀ [ω]×ᵀ [rot θ(t)]ᵀ [rot θ(t)] [ω]× v̄ dV + (ρ/2) vol(Ω)‖ṫ‖²
= (ρ/2) ∫_Ω v̄ᵀ [ω]×ᵀ [ω]× v̄ dV + (ρ/2) vol(Ω)‖ṫ‖²
= (ρ/2) ∫_Ω ωᵀ [v̄]×ᵀ [v̄]× ω dV + (ρ/2) vol(Ω)‖ṫ‖²
= ½ ωᵀ MI ω + ½ ṫᵀ Mc ṫ,
where Mc = ρ vol(Ω) I is the mass matrix of the rigid body, and
MI = ∫_Ω ρ [v̄]×ᵀ [v̄]× dV
is the inertia matrix or inertia tensor. The mass term should not be surprising: it is identical to
the kinetic energy of a point particle containing the same mass as the rigid body, but concentrated
at the body’s center of mass. The inertia tensor is a bit more involved: it measures the energy
stored in the spinning of the rigid body.
It makes sense that rotational kinetic energy depends only on ω, and not directly on θ: the
energy stored in spinning a body depends on how the axis of rotation relates to the body’s geometry,
but not on how the body happens to be currently oriented in space. Notice that MI is symmetric,
and positive-semidefinite (vT MI v ≥ 0 for all vectors v). The inertia tensor measures, for each
axis v in template coordinates, how much energy is stored in the rigid body spinning about that
axis. Since MI depends only on the template geometry of the body, and not on its current position
or orientation, the inertia tensor can be precomputed from the template geometry (using Stokes’s
theorem).
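For a body sampled as a cloud of point masses (an assumed discretization; the random points here simply stand in for the template Ω), the integral becomes a sum, and we can confirm that ½ωᵀMIω really is the total kinetic energy of the spinning points:

```python
import numpy as np

def cross_matrix(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

rng = np.random.default_rng(0)
points = rng.standard_normal((50, 3))
points = points - points.mean(axis=0)   # put the center of mass at the origin
mass = 0.1                              # mass of each sample point

# Discrete analogue of M_I = integral of rho [v]x^T [v]x over Omega:
MI = sum(mass * cross_matrix(p).T @ cross_matrix(p) for p in points)

omega = np.array([0.3, -1.1, 0.4])
energy_tensor = 0.5 * omega @ MI @ omega
energy_direct = sum(0.5 * mass * np.cross(omega, p) @ np.cross(omega, p)
                    for p in points)
eigenvalues = np.linalg.eigvalsh(MI)    # the principal moments, all nonnegative
```

The eigendecomposition computed at the end is exactly the principal-axis decomposition discussed next.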
Principal Axes Since the inertia tensor is symmetric positive-semidefinite, it has three nonnegative real eigenvalues 0 ≤ λ1 ≤ λ2 ≤ λ3 with corresponding eigenvectors v1 , v2 , v3 . The eigenvectors
with least and greatest eigenvalue, v1 and v3 , intuitively represent the axes about which it is
“easiest” and “hardest” to start the object spinning. For a long, thin cylindrical rigid body, for instance, v1 points in the direction parallel to the cylinder's axis, and v2 and v3 both have the same eigenvalue and point in directions perpendicular to the cylinder's axis (parallel to the cross-section plane): it requires less energy to spin a cylinder about its axis than to twirl it around Darth-Maul-lightsaber-style.
Bonus Math Let ā be any (unit-length) axis in the body's template coordinates; then the kinetic energy stored in the spinning of the body about axis ā satisfies the inequality
λ1/2 ≤ ½ āᵀ MI ā ≤ λ3/2,
so that the two extreme principal axes are indeed the directions of spin that contain the least and most kinetic energy. To see that this inequality is true, expand ā in the eigenvector basis:
ā = α1 v1 + α2 v2 + α3 v3
for some scalar coefficients αi. Since ‖ā‖ = 1, α1² + α2² + α3² = 1. Then
½ āᵀ MI ā = ½ āᵀ (α1 λ1 v1 + α2 λ2 v2 + α3 λ3 v3)
= ½ (α1² λ1 + α2² λ2 + α3² λ3)
≤ ½ (α1² λ3 + α2² λ3 + α3² λ3)
= λ3/2,
and the other inequality follows similarly.
Euler-Lagrange Equations We can now write down the full Lagrangian for the rigid body:
L(q, q̇) = ½ ω(θ, θ̇)ᵀ MI ω(θ, θ̇) + ½ ṫᵀ Mc ṫ − V(q),
where we make clear that the angular velocity ω is not an independent variable, but rather is a function of the orientation and its time derivative, with the precise relationship specified in equation 9.2. The Euler-Lagrange equations for this Lagrangian come in two parts; the translational part is utterly unsurprising:
(d/dt) dṫ L − dt L = 0
(d/dt) (ṫᵀ Mc) + [dt V(q)] = 0
ẗᵀ Mc = −[dt V(q)],
where the right-hand side is the translational component of the force acting on the rigid body, and
the left-hand side is the time derivative of the body's linear momentum. This equation is exactly Newton's second law,
with the rigid body treated as a point particle positioned at its center of mass. Next, the rotational
part of the Euler-Lagrange equations is much more complex, since the rotational kinetic energy
depends not just on θ̇, but also on the orientation itself. First, we have that
[ω]× = [rot θ]ᵀ {d rot(θ)}(θ̇),
so for any test vector v ∈ R³,
ω × v = [rot θ]ᵀ [−rot θ] [v]× T(θ)θ̇ = −[v]× T(θ)θ̇ = (T(θ)θ̇) × v,
so since v is arbitrary, we have an explicit formula for angular velocity:
ω(θ, θ̇) = T(θ)θ̇.    (9.3)
It follows that
d2 ω(θ, θ̇) = T(θ),
so that
(d/dt) dθ̇ L = (d/dt) [ω(θ, θ̇)ᵀ MI d2 ω(θ, θ̇)]
= ((d/dt) ω(θ, θ̇)ᵀ) MI T(θ) + ω(θ, θ̇)ᵀ MI ((d/dt) T(θ)).
To find the second term in the Euler-Lagrange equations, we compute:
dθ L(δθ) = ω(θ, θ̇)ᵀ MI {dT(θ)}(δθ) θ̇ − {dθ V(q)}(δθ).
The dT term is a rank-three tensor, with {dT(θ)}(δθ) a matrix, which then multiplies θ̇.
To simplify this expression, we need to do a quick side calculation: let v be an arbitrary test vector, and consider the second derivative
{d [(d/dt)([rot θ]v)] (θ)}(δθ) = {d [−(rot θ)[v]× T(θ)θ̇] (θ)}(δθ)
= (rot θ) [[v]× T(θ)θ̇]× T(θ)δθ − (rot θ)[v]× {d [T(θ)θ̇] (θ)}(δθ).
Since mixed derivatives commute, the above expression must be equal to
(d/dt) ({d [(rot θ)v] (θ)}(δθ)) = (d/dt) (−(rot θ)[v]× T(θ)δθ)
= (rot θ) [[v]× T(θ)δθ]× T(θ)θ̇ − (rot θ)[v]× ((d/dt) T(θ)) δθ.
Subtracting these expressions from each other and eliminating rot θ yields
[v]× (((d/dt) T(θ)) δθ − {d [T(θ)θ̇] (θ)}(δθ)) = [[v]× T(θ)δθ]× T(θ)θ̇ − [[v]× T(θ)θ̇]× T(θ)δθ
= [T(θ)δθ]× [v]× T(θ)θ̇ − [T(θ)θ̇]× [v]× T(θ)δθ
= −[v]× [T(θ)θ̇]× T(θ)δθ,
where the last equality is using the Jacobi identity of the cross product,
u × (v × w) + v × (w × u) = −w × (u × v).
Returning now to the full Euler-Lagrange equations for the rotational part of the DOFs, we get
0 = ω̇ᵀ MI T(θ) − ωᵀ MI [T(θ)θ̇]× T(θ) − [dθ V(q)]
or, taking the transpose of both sides and simplifying,
MI ω̇ + ω × MI ω = T(θ)⁻ᵀ [dθ V(q)]ᵀ ,
called the Euler equations. The right-hand side is the torque exerted on the rigid body by the force
coming from the potential V ; the left-hand side is the analogue of the “M a” part of Newton’s
second law, but notice that the inertia tensor plays the part of the mass matrix.
Precession Notice that there is a second term on the left-hand side of the Euler equations, which
does not depend on angular acceleration ω̇. This term is a fictitious force encoding the precession of
the orientation of the rigid body over time. This precession or “wobbling” of the rigid body occurs
even if there are no torques acting on the body (think of a frisbee during an imperfect throw).
Notice one case when the precession force is zero: when ω is an eigenvector of the inertia tensor,
MI ω = λω, in which case the middle term drops out (since ω × λω = 0). In other words, a rigid
body spinning about one of its principal axes will continue spinning about that axis; trying to spin
the body about any other axis will cause the body to reorient itself (precess) over time.²
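To see this behavior concretely, here is a minimal numerical sketch (not from the notes; the inertia values and step counts are hypothetical) that integrates the torque-free Euler equations MI ω̇ + ω × MI ω = 0 with explicit Euler steps: spinning about a principal axis stays steady, while spinning about a generic axis precesses.

```python
import numpy as np

# Inertia tensor of a box-like rigid body (hypothetical values), diagonal
# in the body's principal axes.
MI = np.diag([1.0, 2.0, 3.0])
MI_inv = np.linalg.inv(MI)

def euler_equations_step(omega, h):
    """One explicit Euler step of MI omega_dot + omega x (MI omega) = 0."""
    omega_dot = MI_inv @ (-np.cross(omega, MI @ omega))
    return omega + h * omega_dot

def integrate(omega0, h=1e-3, steps=300):
    omega = np.array(omega0, dtype=float)
    for _ in range(steps):
        omega = euler_equations_step(omega, h)
    return omega

# Spinning about a principal axis (an eigenvector of MI): the precession
# term omega x (MI omega) vanishes, so omega stays fixed.
steady = integrate([0.0, 0.0, 1.0])

# Spinning about a generic axis: omega reorients (precesses) over time.
wobbly = integrate([1.0, 1.0, 1.0])
```

Note that explicit Euler is used here purely for brevity; the structure-preserving discretization of the next section is the one actually developed in these notes.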
9.5 Discrete Equations of Motion
The usual recipe for discretizing the above Lagrangian would be to estimate velocities on each time
interval between time steps under the assumption that t and θ vary linearly between consecutive
time steps. While this assumption makes sense for the translation degrees of freedom t, linear
interpolation of the axis-angle θ does not correspond to any kind of simple motion of the rigid
body. An alternative is to assume that the angular velocity is constant between time steps: in
particular, define the angular velocity ω(θ i , θ i+1 ) to be the axis and rotational speed, in template
coordinates, which takes the rigid body from orientation θ i to θ i+1 over the course of one time
step:
rot(θ i ) rot(hω(θ i , θ i+1 )) = rot(θ i+1 ).        (9.4)
Although this equation is not an explicit formula for ω, notice that it is easy to solve for ω given θ i
and θ i+1 , or to solve for θ i+1 given θ i and ω, if one can convert between matrix and axis-angle
representations of rotations.
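As a sketch of such a conversion (all helper names here are hypothetical, and the matrix-to-axis-angle map assumes a generic rotation angle away from 0 and π), equation (9.4) can be solved either for ω or for θ i+1 using Rodrigues' formula:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot(theta):
    """Axis-angle vector -> rotation matrix, by Rodrigues' formula."""
    a = np.linalg.norm(theta)
    if a < 1e-12:
        return np.eye(3)
    K = skew(theta / a)
    return np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)

def axis_angle(R):
    """Rotation matrix -> axis-angle vector (generic case: angle in (0, pi))."""
    a = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return a * axis / (2.0 * np.sin(a))

def angular_velocity(theta_i, theta_ip1, h):
    """Solve rot(theta_i) rot(h*omega) = rot(theta_ip1) for omega."""
    return axis_angle(rot(theta_i).T @ rot(theta_ip1)) / h

def next_orientation(theta_i, omega, h):
    """Solve the same equation for theta_ip1, given theta_i and omega."""
    return axis_angle(rot(theta_i) @ rot(h * omega))
```

The two routines are inverses of each other: stepping an orientation forward by ω and then recovering the angular velocity between the two orientations returns ω.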
One discrete Lagrangian in terms of angular velocity is then
L(q i , q i+1 ) = (1/2) ω(θ i , θ i+1 )ᵀ MI ω(θ i , θ i+1 ) + (1/2) ((t i+1 − t i )/h)ᵀ Mc ((t i+1 − t i )/h) − V(q i+1 ),
² It turns out that only the maximum and minimum-eigenvalue eigenvectors yield stable motion: an object spinning
about its middle principal axis is in unstable equilibrium and will begin to wobble at the slightest disturbance to its
geometry or motion.
and a time integrator can be derived from this Lagrangian by the usual recipe, i.e. by formulating
the discrete Euler-Lagrange equations
d1 L(qi , qi+1 ) + d2 L(qi−1 , qi ) = 0
and massaging them into a time step map (t i , θ i ) ↦ (t i+1 , θ i+1 ). From the discrete Euler-Lagrange
equations we have immediately that
−(1/h) ((t i+1 − t i )/h)ᵀ Mc + (1/h) ((t i − t i−1 )/h)ᵀ Mc − dt V(q i ) = 0
ω(θ i , θ i+1 )ᵀ MI d1 ω(θ i , θ i+1 ) + ω(θ i−1 , θ i )ᵀ MI d2 ω(θ i−1 , θ i ) − dθ V(q i ) = 0,
and so we need the derivative of angular velocity with respect to the starting and ending orientations. We can compute these from equation 9.4: multiplying an arbitrary vector v by both sides
of that equation, and then differentiating both sides with respect to θ i+1 , yields
−rot(θ i ) rot(hω(θ i , θ i+1 )) [v]× T(hω(θ i , θ i+1 )) h [d2 ω(θ i , θ i+1 )] = −rot(θ i+1 ) [v]× T(θ i+1 ),
or in other words,
d2 ω(θ i , θ i+1 ) = (1/h) T(hω(θ i , θ i+1 ))⁻¹ T(θ i+1 ),
and by a similar calculation,
d1 ω(θ i , θ i+1 ) = −(1/h) T(−hω(θ i , θ i+1 ))⁻¹ T(θ i ).
Writing ω i = ω(θ i , θ i+1 ) and ṫ i = (t i+1 − t i )/h, the Euler-Lagrange equations are then

t i+1 = t i + h ṫ i
rot θ i+1 = rot θ i rot(hω i )
ṫ i+1 = ṫ i − h Mc⁻¹ [dt V(q i+1 )]ᵀ
(ω i+1 )ᵀ MI T(−hω i+1 )⁻¹ = (ω i )ᵀ MI T(hω i )⁻¹ − h [dθ V(q i+1 )] T(θ i+1 )⁻¹ ,
which can be solved using a mix of implicit and explicit updates.
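In code, one step of this update might be organized as follows. This is a sketch under stated assumptions: `rot` and `axis_angle` are the standard Rodrigues-formula conversions, all names are hypothetical, and the implicit equation for ω i+1 is delegated to a caller-supplied `solve_omega` routine (it requires the matrix T(θ) from the continuous theory and, e.g., a Newton solve, neither of which is spelled out here).

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot(theta):
    """Axis-angle vector -> rotation matrix (Rodrigues' formula)."""
    a = np.linalg.norm(theta)
    if a < 1e-12:
        return np.eye(3)
    K = skew(theta / a)
    return np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)

def axis_angle(R):
    """Rotation matrix -> axis-angle vector (generic case: angle in (0, pi))."""
    a = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return a * axis / (2.0 * np.sin(a))

def rigid_body_step(t, theta, tdot, omega, h, Mc_inv, grad_t_V, solve_omega):
    """One step of the discrete Euler-Lagrange update derived above."""
    t_next = t + h * tdot                                 # t^{i+1} = t^i + h tdot^i
    theta_next = axis_angle(rot(theta) @ rot(h * omega))  # rot th^{i+1} = rot th^i rot(h w^i)
    tdot_next = tdot - h * Mc_inv @ grad_t_V(t_next, theta_next)
    omega_next = solve_omega(theta_next, omega)           # implicit update for w^{i+1}
    return t_next, theta_next, tdot_next, omega_next
```

As a usage example, passing `solve_omega = lambda th, om: om` (a placeholder valid only in special cases such as torque-free principal-axis spin) lets one exercise the explicit parts of the step.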
What happened to the precession term? It’s still here, albeit obscured a bit in the definition
of the T (θ) matrices. When h is small, a Taylor expansion gives
T(hθ) ≈ I − (h/2) [θ]× ,
and therefore
MI ω i+1 − T(−hω i+1 )ᵀ T(hω i )⁻ᵀ MI ω i = MI ω i+1 − T(hω i+1 ) T(−hω i )⁻¹ MI ω i
≈ MI ω i+1 − ( I − (h/2)[ω i+1 ]× ) ( I − (h/2)[ω i ]× ) MI ω i
≈ MI (ω i+1 − ω i ) + h ((ω i+1 + ω i )/2) × MI ω i ,
and a discretization of the precession term in the Euler equations appears in the discrete equations of motion
as well. This term will cause a rotating object, even in the absence of any external forces, to “wobble” about
its axis of rotation over time.
Chapter 10
External Forces, Non-Conservative Forces, and Impulses
Weight, force and causal impulse, together
with resistance, are the four external
powers in which all the visible actions of
mortals have their being and their end
Leonardo da Vinci
So far we have assumed that all forces acting on a physical system arise from potential functions
V (q) over configuration space. We now relax this requirement to also include forces that do not arise
from a potential, including external forces imposed from outside of the physical system, as well as
impulses, infinitely strong forces acting over infinitesimal periods of time, resulting in instantaneous
changes in momentum. We will need both of these concepts to properly handle collisions between
objects, the subject of the next chapter.
As mentioned in the first lecture, as far as we know, all forces in the universe are conservative.
But this fact doesn’t diminish the value of being able to simulate non-conservative forces and
impulses:
• External forces can be used to model the influence of gravity, wind, motors, or human beings
on physical systems, where these phenomena are driving the motion of the system, but are
themselves unaffected by the system (and modeling them as part of the system would be a
waste of effort);
• Many practical phenomena, such as friction between two objects sliding against each other, or
drag slowing an object flying through the air, are extremely complex and not fully understood;
non-conservative forces can capture the phenomenological behavior observed in experiments,
without needing to understand and model the underlying mechanism causing friction or drag;
• Events such as collisions appear, at everyday time scales, to occur instantaneously, even
though if you look at a collision in slow motion, you can see that the two objects smoothly
deform once they come into contact with each other, before restoring their shape and pushing
off against each other. Impulses allow simulations to capture the right coarse-scale behavior
during a collision, without needing to resolve the detailed motion at small time scales.
10.1 Work and the Lagrange-D’Alembert Principle
Let us go back to the derivation of the equations of motion from Hamilton’s Principle. We derived
the Euler-Lagrange equations by asserting that the physically-correct trajectories q(t) were those
that extremized the action, or equivalently, those for which, for any arbitrary variation of the
trajectory δq(t) that fixed its endpoints,
∫_{t0}^{t1} ( d/dt [dq̇ L] − dq L ) δq dt = 0.
Substituting the definition of the Lagrangian as the difference of kinetic and potential energy,
∫_{t0}^{t1} ( d/dt [dq̇ T] − [dq T] ) δq dt + ∫_{t0}^{t1} [dV(q)] δq dt = 0.        (10.1)
The expression −[dV (q)](δq) = Fδq is called the virtual work that the force would do on the
system at time t if q were to be perturbed in the δq direction: a perturbation that moves q in
the direction of F causes the force to do positive work on the system, decreasing the potential
energy; similarly virtual work is negative if q moves against the force, and zero if the motion is
perpendicular to the force.
The above equation then states that a trajectory q(t) is physical if for every perturbation δq(t),
the variation in kinetic energy due to this perturbation is equal to the virtual work of the system’s
forces. This characterization of motion is called the Lagrange-D’Alembert principle, or sometimes
the Principle of Virtual Work. For conservative forces, this principle is equivalent to Hamilton’s
principle, but it’s also more general, as we will see in the rest of this chapter. We will first apply
the Lagrange-D’Alembert principle to non-conservative forces acting on the DOFs of the system,
and then extend the formulation to forces that act on points of an object not directly represented
by degrees of freedom (non-configurational forces).
10.1.1 Non-conservative Forces
Non-conservative forces are those that do not come from a potential energy depending only on
configuration q. These forces might arise from potentials that depend on both q and q̇ (for example
viscous damping, a force which resists strain rate instead of strain, and causes the amplitude of a
spring’s oscillation to decay over time), or may not be explainable by a potential at all. If we have
a formula F(q, q̇), we can simply add its virtual work to the Lagrange-D’Alembert principle (10.1): a trajectory is
physically correct if, for all variations δq(t) fixing the endpoints q(t0 ) and q(t1 ), we have that
∫_{t0}^{t1} ( d/dt [dq̇ T] − [dq T] ) δq dt + ∫_{t0}^{t1} [dV(q)] δq dt − ∫_{t0}^{t1} F(q, q̇) δq dt = 0.
It follows from the usual argument that since δq is arbitrary,
d/dt [dq̇ T] − [dq T] = −[dV(q)] + F(q, q̇),
or if kinetic energy is in standard form and the mass matrix does not depend on the configuration,
q̈ᵀ M = −[dV(q)] + F(q, q̇).        (10.2)
There is nothing surprising about this result: it is exactly what one would expect, if one were
to just add on the non-conservative forces to the conservative forces in the Euler-Lagrange equations.
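For example, a viscous damping force F(q, q̇) = −cq̇ can be appended to a spring force inside a standard symplectic Euler step. The sketch below (hypothetical parameter values) shows the damping draining mechanical energy over time, exactly the behavior equation (10.2) predicts:

```python
# Mass-spring system with viscous damping: m qdd = -k q - c qd, which is
# equation (10.2) with V = (k/2) q^2 and non-conservative force F = -c qd.
m, k, c, h = 1.0, 100.0, 0.5, 1e-3

def symplectic_euler_step(q, qd):
    force = -k * q - c * qd       # conservative + non-conservative parts
    qd_new = qd + h * force / m   # velocity update first ...
    q_new = q + h * qd_new        # ... then position (symplectic Euler)
    return q_new, qd_new

def energy(q, qd):
    """Kinetic plus potential energy (the damping force is not conservative,
    so this quantity decays rather than being conserved)."""
    return 0.5 * m * qd**2 + 0.5 * k * q**2

q, qd = 1.0, 0.0
E0 = energy(q, qd)
for _ in range(20000):   # integrate 20 seconds of motion
    q, qd = symplectic_euler_step(q, qd)
E_final = energy(q, qd)
```

With c = 0 the same integrator exhibits the good long-term energy behavior discussed in the aside below; with c > 0 that guarantee is deliberately given up.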
Why can we just add additional virtual work terms to equation (10.1), when Hamilton’s
principle includes only terms that come from the system’s potential energy? The Lagrange-D’Alembert principle indeed cannot be derived from Hamilton’s principle, and is more general
than Hamilton’s principle since it accounts for both conservative and non-conservative forces.
If we want to work with non-conservative forces, we will have to take this principle as an axiom,
just as we did Hamilton’s principle for conservative systems.
Although the Lagrange-D’Alembert principle is more general than Hamilton’s principle, remember that
there are advantages to using Hamilton’s principle. Noether’s theorem, and the discrete Noether’s theorem,
no longer necessarily hold once external forces are introduced to the simulation. The good energy behavior
and linear global error growth that comes with time integrators derived from Hamilton’s principle are also
no longer guaranteed.
10.1.2 Non-configurational Forces
Let’s look at a case where it’s less straightforward how to extend the equations of motion. Let
Fext (t) be an external force that acts on some point v(q) ∈ R3 —notice that v lies in the ambient
space, not the configuration space Q. You should imagine this external force as poking a specific
point on a rigid body, for instance. We know how to simulate a configurational force that acts on
the position and orientation of the rigid body—how do we simulate a force that acts on just one
point?
The virtual work of this force due to a variation δv in the point is just Fext δv. Now every
variation δq of the configuration induces a variation in v, and the relationship is given by the
differential of v:
δv(t) = {dv(q[t])} (δq[t]) .
Putting together the pieces, the Lagrange-D’Alembert principle tells us that the physically correct
trajectory is the one that satisfies, for every variation δq,
∫_{t0}^{t1} ( d/dt [dq̇ T] − [dq T] ) δq dt + ∫_{t0}^{t1} ([dV(q)] − Fext [dv(q)]) δq dt = 0,
from which we can extract the modified Euler-Lagrange equations
d/dt [dq̇ L] − [dq L] − Fext [dv(q)] = 0.
What’s the punchline? If we want to add an external force Fext acting on v(q) to a physical system,
we need only add the corresponding configurational force Fext [dv] to the other forces in the system.
The same principle applies identically in the discrete case.
Example Consider a rigid body with degrees of freedom t, θ, and suppose we apply an external
force F to the point v(q) = rot(θ)v̄ + t. We have that
[dv] = [ I   −rot(θ)[v̄]× T(θ) ] .
Therefore applying the force F to any point of the rigid body, no matter where, is equivalent to
applying the same force to the center of mass of the body; and in addition applying the torque
−F rot(θ)[v̄]× T (θ) to accelerate the rotation of the body. There are a few things to notice about
this formula for the torque:
• It does depend on the point of application, since v̄ appears in the formula;
• The first rotation term rot(θ) maps the applied external force to the body’s template coordinates. This makes sense: the torque applied depends on the force direction relative to the
object’s geometry, and does not depend on how the object happens to be oriented in space
at the time the torque is applied;
• The torque is zero if the applied force F is parallel to the vector between v and the center
of mass. In this case the applied external force accelerates the center of mass only, without
causing the body to spin;
• The T (θ) term exactly cancels the corresponding term in the Euler equations: if the body is
unaffected by any other configurational forces, the equations of motion governing the body’s
orientation become simply
MI ω̇ + ω × MI ω = [v̄]× rot(−θ) Fᵀ .
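As a numerical sanity check of the punchline, one can build the configurational force Fext [dv] directly from a finite-difference Jacobian of v(q) = rot(θ)v̄ + t, without ever writing down the closed-form [dv]. The sketch below (hypothetical helper names) recovers two of the observations above: the translational component is always Fext itself, and the torque vanishes when the force points along the line through the point and the center of mass.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot(theta):
    """Axis-angle vector -> rotation matrix (Rodrigues' formula)."""
    a = np.linalg.norm(theta)
    if a < 1e-12:
        return np.eye(3)
    K = skew(theta / a)
    return np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)

def world_point(t, theta, vbar):
    """v(q) = rot(theta) vbar + t: a material point vbar carried by the body."""
    return rot(theta) @ vbar + t

def generalized_force(t, theta, vbar, F_ext, eps=1e-6):
    """F_ext [dv] via a central finite-difference Jacobian of v with respect
    to q = (t, theta); returns a 6-vector (force block, torque block)."""
    q = np.concatenate([t, theta])
    J = np.zeros((3, 6))
    for j in range(6):
        dq = np.zeros(6)
        dq[j] = eps
        qp, qm = q + dq, q - dq
        J[:, j] = (world_point(qp[:3], qp[3:], vbar)
                   - world_point(qm[:3], qm[3:], vbar)) / (2.0 * eps)
    return F_ext @ J

t = np.array([0.5, 0.0, 0.0])
theta = np.array([0.2, -0.1, 0.3])
vbar = np.array([1.0, 0.5, -0.2])

# A force along the line from the point toward the center of mass:
# pure push on the center of mass, no induced spin.
F_radial = rot(theta) @ vbar
gf = generalized_force(t, theta, vbar, F_radial)
```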
10.2 Impulses
As hinted earlier, there are situations where it doesn’t make sense to simulate interaction between
objects using forces alone. The effect of forces is smeared over time, since forces change the acceleration of bodies, which then only indirectly affects velocities (and positions). Simulating collisions
between objects can be done using forces (by adding a force to the system that repels objects that
come too close to each other), and we will explore this idea more fully in the next chapter; but the
downside of this approach is that objects do not bounce off of each other instantaneously.
Directly modifying positions (“teleporting objects”) is only occasionally useful in a simulation;
but there are several situations (such as the aforementioned collisions) where it makes sense to
simulate an impulse: an instantaneous change in velocity. To incorporate impulses correctly into a
simulation, we first need to understand how impulses relate to forces, and then we will see that we
can derive how to apply impulses using our existing Lagrangian machinery.
Impulses as Instantaneous Forces Let us look at a simple physical system, consisting of a
single particle in 1D. We will impose no forces on the system, other than an external force F (t)
that varies over time. Per the above discussion of external forces, an equation of motion for this
system is simply
mq̈ = F (t)
where m is the mass of the particle. If the particle is motionless at time 0, the momentum of the
particle at time τ can be computed by integrating the equations of motion:
∫_0^τ m q̈ dt = ∫_0^τ F(t) dt
m q̇(τ) = ∫_0^τ F(t) dt.
Notice that the change in momentum ∆p = p(τ ) − p(0) depends just on the integral of the force,
and not on the specific values of F at any particular time in the interval (0, τ ).
Given a desired change in momentum ∆p and a duration τ , how can we construct an external
force that causes the desired change in momentum after the specified time has elapsed? There are
many possible solutions: all we need to do is to write down an F (t) whose integral is the change
in momentum. Let us additionally require that F (t) is continuous, and zero outside the interval
(0, τ ). It is still easy to cook up such a force: for example, we can use as the force the bump
F(t) = { (6Δp/τ³) t(τ − t),   0 ≤ t ≤ τ
       { 0,                   otherwise

which one can check integrates to Δp over the interval (0, τ).
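That normalization is easy to verify numerically; a quick sketch (hypothetical names, trapezoid-rule quadrature):

```python
import numpy as np

def bump_force(t, dp, tau):
    """The continuous 'bump' force: zero outside (0, tau), integrates to dp."""
    if 0.0 <= t <= tau:
        return 6.0 * dp / tau**3 * t * (tau - t)
    return 0.0

dp, tau = 2.5, 0.1
ts = np.linspace(0.0, tau, 10001)
fs = np.array([bump_force(t, dp, tau) for t in ts])

# Composite trapezoid rule: the net impulse delivered over (0, tau).
impulse = float(np.sum((fs[:-1] + fs[1:]) / 2.0) * (ts[1] - ts[0]))
```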
As τ → 0, the magnitude of F (t) at its highest point increases to infinity, while the time during
which F (t) exerts a nonzero force shrinks to zero, while keeping the net effect on momentum the
same (see figure 10.2).
Note that this function F (t) is far from the only one that can do the job. If we don’t care
about the force being continuous (slowly ramping up from zero), we can define F (t) as being
piecewise-constant:
F(t) = { Δp/τ,   0 ≤ t ≤ τ
       { 0,      otherwise.
In the limit as τ → 0, the force F (t) acts instantaneously and is infinitely strong; this limit can
be expressed in terms of a Dirac delta distribution,
lim_{τ→0} F(t) = Δp δ(t).
This instantaneous force is an impulse. We can generalize this idea from one dimension to arbitrary
configuration spaces: applying an impulse J ∈ Rn at time tJ to a physical system with configuration
space Q = Rn means taking the limit as τ → 0 of the trajectory subject to the external force
F(t) = { J/τ,   tJ ≤ t ≤ tJ + τ        (10.3)
       { 0,     otherwise.
Notice that J has units of momentum, and not of force (one interpretation is that J is the “net
effect” of the infinitely strong force after it has been integrated up over the infinitesimal time during
which it exerts itself).
Of course, in practice we want to apply impulses by instantaneously changing the momentum
p of the system, without going through the process of computing any limits. We will now derive
how to turn an impulse J into such a change in momentum. There is a bit of subtlety to this: for
example, exerting an impulse on a rigid body will typically cause the body’s center of mass to start
moving, and in addition will cause the rigid body to start spinning. Applying the impulse closer
to the center of mass results in a larger change in center of mass velocity, and smaller change in
angular velocity. We can calculate the formulas which tell us the precise relationship.
10.3 Impulses in Time Integrators
Equation (10.3) gives us a blueprint for how to apply impulses in a physical simulation: we can
add a constant external force of magnitude J/h that acts only during the interval between time
steps iJ and iJ+1 . The problem with this approach, though, is that the impulse does not truly act
instantaneously: the shortest period of time during which a force can act, in a simulation, is the
length of a single time step.
How do we take the limit in equation (10.3), then? The idea is simple: we shrink the time step
size to zero. But only for the step iJ that involves the impulse.
Example: Particle Systems Let us look at a concrete example: a particle system with kinetic
energy (1/2) q̇ᵀ M q̇ and potential energy V(q). We want to apply an impulse J at time hiJ , i.e., at time
step iJ . We modify our usual procedure for writing down a discrete Lagrangian by introducing a
variable time step: let the interval of time between step i and i + 1 be hi , with
h i = { h,   i ≠ iJ
      { τ,   i = iJ ,
where h is the “standard” time step of the simulation, and τ is the interval over which we apply
the impulse (with τ much smaller than h; later we will take the limit as τ → 0.)
The discrete Lagrangian (without the impulse) in this setting is
L(q i , q i+1 ) = (1/2) ((q i+1 − q i )/h i )ᵀ M ((q i+1 − q i )/h i ) − V(q i+1 )
with action
S = Σ_i h i L(q i , q i+1 ).
Notice that the action is now a weighted sum of the discrete Lagrangian at different time steps—
recall that the action is integrating the (piecewise constant) Lagrangian over the entire trajectory,
so time steps with a larger time interval contribute more to the action than time steps with a small
time interval. The Euler-Lagrange equations of this action are
hi d1 L(qi , qi+1 ) + hi−1 d2 L(qi−1 , qi ) = 0,
and adding the impulse to these Euler-Lagrange equations per equation (10.2) yields
−((q i+1 − q i )/h i )ᵀ M + ((q i − q i−1 )/h i−1 )ᵀ M − h i−1 dV(q i ) + h i−1 F i = 0
where
F i = { J/τ,   i = iJ
      { 0,     i ≠ iJ .
Making the usual substitution q̇ i = (q i+1 − q i )/h i and simplifying yields

0 = −(q̇ i )ᵀ M + (q̇ i−1 )ᵀ M − h i−1 dV(q i ) + h i−1 F i
q̇ i+1 = q̇ i − h i+1 M⁻¹ [dV(q i+1 )]ᵀ + h i+1 M⁻¹ [F i+1 ]ᵀ .
Most of the time, hi = hi+1 = h and the equations above reduce to the usual Velocity Verlet
integrator. There are two exceptions: when i = iJ ,
qiJ +1 = qiJ + τ q̇iJ
and when i = iJ − 1,
q̇ iJ = q̇ iJ−1 − τ M⁻¹ [dV(q iJ )]ᵀ + M⁻¹ Jᵀ .
Taking τ → 0 we get that q iJ+1 = q iJ and q̇ iJ = q̇ iJ−1 + M⁻¹ Jᵀ . All other updates are the same
as Velocity Verlet. These equations have the following interpretation: (i) the velocity during the
infinitesimal τ -length step between time steps iJ and iJ+1 is different from the previous velocity by
M⁻¹ Jᵀ . This term is the change in momentum induced by the impulse. No other forces modify
the velocity during the short time step (which makes sense, since a finite force integrated over an
infinitesimal duration yields a zero change in momentum.) (ii) The position update during the
short time step does nothing (which again makes sense, since a configuration traveling with a given
velocity for an infinitesimal time cannot move a finite distance).
In other words, the effect of the impulse can be simulated by simply applying the momentum update ∆p = J between time steps iJ−1 and iJ , and then continuing with Velocity Verlet
integration.
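In code, this conclusion is a one-line modification to the integrator loop. The minimal sketch below (a free particle with V = 0, so all force terms drop out; values hypothetical) applies Δp = J at step i_J and otherwise advances positions as usual:

```python
# Velocity Verlet for a free 1D particle of mass m (V = 0, no forces),
# with an impulse J applied as a direct momentum update at step i_J.
m, h = 2.0, 0.01

def simulate(steps, i_J, J):
    q, qd = 0.0, 1.0
    for i in range(steps):
        if i == i_J:
            qd += J / m   # instantaneous momentum update: Delta p = J
        q += h * qd       # position update (force terms vanish for V = 0)
    return q, qd

q_end, qd_end = simulate(100, 50, J=3.0)
```

With nonzero forces, the surrounding velocity half-updates of Velocity Verlet would simply be restored around the impulse line.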
What if we want to apply an impulse at some time tJ that is not a perfect integer multiple
of the time step size? We can compute the last time step that occurs before the collision,
ilast = ⌊tJ /h⌋, and split the time window [h ilast , h (ilast + 1)] into three new time steps:
• a pre-impulse substep [hilast , tJ ];
• an impulse infinitesimal step [tJ , tJ + τ ];
• a post-impulse substep [tJ + τ, h (ilast + 1)];
and derive the discrete equations of motion that implement time integration over each of these substeps.
The most tempting use of this technique is to handle collisions: two objects might collide midway through
a time step, and the above splitting scheme corresponds to advancing time up until the collision occurs,
resolving the collision by applying an impulse, and then advancing time again to finish the interrupted time
step. In practice it is uncommon to handle collisions in this way, due to Zeno-type paradoxes that can arise
and cause infinitely many collisions to occur within a finite time window, making the approach impractical.
It also bears mentioning that the Velocity Verlet case is particularly simple; for implicit integrators,
advancing time by fractional steps can be significantly more computationally involved.
Chapter 11
Inequality Constraints and Impact
However, even now most of the standard
finite element software is not fully capable
of solving contact problems, including
friction, with robust algorithms.
Peter Wriggers
A comprehensive treatment of simulating collisions could fill an entire semester, and even then,
robust and accurate simulation of frictional contact remains to a large extent an open problem.
Handling collisions in a simulation has several facets:
• collision detection, which is concerned with simply determining whether a collision has
occurred between objects in the simulation, or is about to occur, and if so, locating the
point(s) of contact;
• collision response, which is concerned with reacting to detected collisions to prevent objects
from tunnelling into each other. Collision response can be loosely subdivided into algorithms
for handling impact, where objects bounce off of each other with only transient contact, and
resting contact, where objects remain touching for a prolonged period of time, such as when
objects are stacked or slide over each other. Friction plays a key role in both impact and
resting contact of everyday objects.
In this lecture, we will cover only the simplest case of collision handling: dealing with frictionless
impact. This will require expanding our mathematical repertoire to include not only equality
constraints, but also inequality constraints that block out regions of configuration space as being
“forbidden.” But before we discuss impact response, we will briefly survey collision detection
methods.
11.1 Collision Detection
Detecting whether a collision has occurred between bodies in a simulation is already a surprisingly
challenging problem. The crux of the difficulty is that collision detection is an inherently O(n²)
task, for a simulation consisting of n objects, and so collision detection quickly dominates the
computational costs of the time integrator and other simulation steps, once the size of the physical
system becomes large enough. It is not uncommon for a physical simulation to spend 90% of its
time budget on collision detection. Therefore substantial effort has been spent on designing collision
detection algorithms that are as efficient as possible.
Collision detection can be taxonomized into two types: discrete and continuous-time collision
detection. Discrete collision detection looks at only a single configuration q and asks, are any
objects colliding in this configuration? Answering this question boils down to detecting whether
any pair of objects have overlapping volumes; the details of how to do this depend on whether the
objects are rigid bodies, triangles, spheres, etc. To reduce computational expense, discrete collision
detection is usually broken down into two sequential phases:
• a broad phase looks at all objects in the physical system, and quickly decides which pairs
of objects cannot possibly be colliding, and which might be colliding. The goal of this step
is to reduce the number of possible collisions from O(n²) to O(n log n) (in the typical case)
by culling obvious cases where objects are too far apart to be colliding. Hierarchical spatial
data structures are typically used to perform the broad phase, such as bounding volume
hierarchies, kD trees, etc.
• a narrow phase takes in the candidate object pairs collected during the broad phase, and
conclusively determines whether any of them are actually colliding. This determination may
involve expensive geometric queries, which is why it is important that the broad phase cull
as many candidates as possible.
Instead of looking at only one configuration, continuous-time collision detection examines the
entire trajectory of the system between two consecutive time steps, and determines whether, and if
so when, objects collide during this motion. Continuous-time collision detection requires a description of how objects evolve between time steps (which was decided when the discrete kinetic energy
of the system was formulated) and is usually an extremely expensive operation: one approach is to
treat the moving objects as 4D volumes in spacetime, and perform broad-phase collision detection
on these spacetime volumes; any pair of objects whose swept 4D volumes overlap are then checked
in the narrow phase.
Continuous-time collision detection is the only way to detect and respond to tunnelling events,
where objects pass completely through each other during the course of a single time step, but this
robustness comes at a steep computational cost. The choice of discrete vs continuous-time collision
detection thus depends on the level of accuracy needed in the simulation, and the types of objects
that are being simulated: thick rigid bodies are unlikely to tunnel, so that it may be enough to
only check for collisions at every time step; when simulating thin objects like cloth, on the other
hand, missing a collision due to tunnelling is noticeable and catastrophic, so that continuous-time
collision detection must be used.
For the rest of this chapter, we will assume that collisions can be accurately detected. But bear
in mind that robust and efficient collision detection is a nontrivial computational challenge, with
no one-size-fits-all solution suitable to all types of simulations.
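To give a flavor of the broad phase, here is a minimal sweep-and-prune sketch over axis-aligned bounding boxes (an illustrative toy, not a production algorithm): boxes are sorted along the x axis, and any pair whose x extents overlap is reported as a candidate, which the narrow phase must still confirm or reject.

```python
def broad_phase(aabbs):
    """Sweep-and-prune along the x axis. Each AABB is a pair
    ((xmin, ymin, zmin), (xmax, ymax, zmax)); returns candidate index pairs
    whose x extents overlap (a superset of the truly colliding pairs)."""
    order = sorted(range(len(aabbs)), key=lambda i: aabbs[i][0][0])
    candidates = []
    active = []   # boxes whose x interval might still overlap later boxes
    for i in order:
        xmin = aabbs[i][0][0]
        # Drop boxes that end before this one begins along x.
        active = [j for j in active if aabbs[j][1][0] >= xmin]
        candidates.extend((min(i, j), max(i, j)) for j in active)
        active.append(i)
    return candidates

# Two overlapping boxes and one far away along x:
boxes = [((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
         ((0.5, 0.0, 0.0), (1.5, 1.0, 1.0)),
         ((3.0, 0.0, 0.0), (4.0, 1.0, 1.0))]
candidates = broad_phase(boxes)
```

Real broad phases sort along all axes, reuse the sorted order between time steps (since objects move little per step), or use hierarchical structures like the bounding volume hierarchies mentioned above.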
11.2 Inequality Constraints
Before we can talk about how to respond to collisions, we need to formalize mathematically what
it means for two objects to be colliding. Previously we looked at equality constraints: restrictions
on where the configuration q is allowed to travel within the full configuration space Q, encoded by
a constraint function g : Q → R.
We now extend these constraint functions to inequality constraints, which can more flexibly
restrict which configurations are allowed or disallowed. An inequality constraint is a function
g : Q → R satisfying the following properties:
• g(q) ≥ 0 whenever q represents a physically valid configuration; such configurations are called
feasible;
• g(q) < 0 whenever q represents a physically invalid (infeasible) configuration;
• dg(q) ≠ 0 when g(q) = 0.
The last condition is the same as that imposed on equality constraints, and requires that the zero
level set g(q) = 0 separates space into feasible and infeasible regions.
Example For a ball of radius r and center (x, y), represented using the configuration q = (x, y), a
simple floor constraint at y = 0 can be represented using the constraint function g(q) = q·(0, 1)−r.
For a system consisting of two balls of radius r, with configuration q = (x1 , y1 , x2 , y2 ), a constraint
that these two balls do not overlap is
g(q) = ‖(x1 , y1 ) − (x2 , y2 )‖ − 2r.
Notice that in both cases, the valid configurations are precisely those satisfying g(q) ≥ 0.
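These two examples translate directly into code. A sketch (hypothetical names and values), where the two-ball separation threshold is written as 2r, the center distance at which two radius-r balls touch:

```python
import numpy as np

r = 0.5  # ball radius (hypothetical value)

def g_floor(q):
    """Ball with center q = (x, y): feasible when resting on or above y = 0."""
    return q @ np.array([0.0, 1.0]) - r

def g_no_overlap(q):
    """Two balls, q = (x1, y1, x2, y2): feasible when centers are >= 2r apart."""
    return np.linalg.norm(q[:2] - q[2:]) - 2.0 * r

def feasible(q, constraints):
    """q is valid iff every inequality constraint is satisfied: g_i(q) >= 0."""
    return all(g(q) >= 0.0 for g in constraints)

# Full two-ball system: each ball above the floor, plus no overlap.
constraints = [lambda q: q[1] - r,   # ball 1 above the floor
               lambda q: q[3] - r,   # ball 2 above the floor
               g_no_overlap]

ok = feasible(np.array([0.0, 1.0, 2.0, 1.0]), constraints)   # separated
bad = feasible(np.array([0.0, 1.0, 0.3, 1.0]), constraints)  # overlapping
```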
Why these particular constraint functions? As with equality constraints, there are multiple
possible ways of formulating any given physical constraint as an inequality constraint function.
For instance, the two-ball constraint could also be written as
g(q) = ‖(x1 , y1 ) − (x2 , y2 )‖² − (2r)² .
The two formulations are mathematically equivalent, though in practice this new constraint function might
be preferable, since it involves no square roots.
Like with equality constraints, the magnitude of the inequality constraint function is irrelevant: the
constraint g(q) is equivalent to the scaled constraint αg(q) for any α > 0. But unlike equality constraints,
the sign of the inequality constraint is critical: g(q) and −g(q) encode completely opposite valid regions of
configuration space.
Multiple Constraints Several inequality constraints can be combined by simply writing down
more constraint functions. For constraints g1 , g2 , . . ., a configuration q is valid if and only if gi (q) ≥
0 for every constraint gi . We can write down the full set of constraints as a vector-valued function

g(q) = ( g1 (q), g2 (q), . . . )ᵀ .
A configuration q is thus feasible if g(q) ≥ 0 (where this inequality of vectors means that every
entry of the left vector is greater than or equal to the corresponding entry of the right vector.)
Now that we have the mathematical machinery for representing constraints, we can formulate
methods for enforcing them.
11.3 The Penalty Potential
Perhaps the most straightforward way of responding to collisions is by adding an internal force to
the system that prevents them. More concretely, if g(q) encodes an inequality constraint that
should be enforced, we can simply add
Vpenalty (q) = { 0,              g(q) ≥ 0
              { (k/2) g(q)² ,   g(q) < 0
for a scalar stiffness parameter k, as a potential energy to the system. Notice how this potential
will influence the system: when g(q) > 0, dV = 0 and the force generated from this potential does
nothing. If g(q) < 0, on the other hand, the force resists increases in the magnitude of g, and
pushes q out of the infeasible region and back into the feasible region. The magnitude of the force
increases the larger the constraint violation −g(q). Intuitively, this potential encodes a one-sided
spring, which does nothing when the constraint is not violated, and pulls the configuration back
towards the feasible region when the constraint becomes violated.
For multiple constraints, an additional term can be added to the potential energy for each
constraint.
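A sketch of the penalty energy and its force (hypothetical names and a hypothetical stiffness), using the floor constraint g(q) = y − r from the previous section; the force −dVpenalty = −k g(q) dg(q) switches on only when g(q) < 0:

```python
import numpy as np

k = 1e4  # penalty stiffness (hypothetical value; see the tradeoff below)

def penalty_energy(g):
    """V = (k/2) g^2 when the constraint is violated, zero otherwise."""
    return 0.5 * k * g**2 if g < 0.0 else 0.0

def penalty_force(g, dg):
    """Force = -dV/dq = -k g(q) dg(q) when g(q) < 0; zero otherwise."""
    return -k * g * dg if g < 0.0 else np.zeros_like(dg)

# Floor constraint g(q) = y - r for a ball of radius r = 0.5, q = (x, y):
r = 0.5
dg = np.array([0.0, 1.0])                # gradient of g with respect to q

f_inside = penalty_force(0.3 - r, dg)    # ball sunk below the floor: push up
f_outside = penalty_force(1.0 - r, dg)   # ball above the floor: no force
```

Since g < 0 inside the floor and dg points up, the one-sided spring pushes the ball back toward the feasible region and does nothing otherwise.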
For a physical system involving n objects, preventing collisions requires writing down a constraint function for each pair of objects in the system, resulting in O(n²) potential energy
terms. Handling this many different forces becomes impractical when n is large. But although
there are O(n2 ) potential energy terms in principle, the corresponding forces do nothing unless
gi (q) < 0; therefore in practice time integrators can ignore all penalty energies except those
for which the constraint is active (negative). This set of active constraints is detected every
iteration using discrete collision detection.
The penalty method is easy to implement—it reuses the standard time integration machinery
without needing any modifications—but suffers from several significant drawbacks. First, the collision response from penalty forces is soft: two objects that collide will not instantly repel from each
other, but instead, will sink into each other a small distance until the penalty forces push them
back out. This softness can be beneficial when responding to resting contact, but is undesirable for
impact.
There is also a need to choose the stiffness k. There is no free lunch: if k is large, then the penalty
forces are strong and will quickly repel objects away from each other during a collision; however a
large k then requires a smaller time step, decreasing the efficiency of the overall simulation. If k is
small, then the penalty forces are weak, and objects tunnel further into each other—and perhaps
pass entirely through each other—before the forces succeed in arresting the colliding objects.
Avoiding these challenges requires an entirely new approach: instead of adding potentials to the
system for stopping constraint violation, we can apply an impulse whenever we detect a collision
has occurred, which modifies the velocity of the objects so that they separate. The problem, then, is
(i) determining when to apply an impulse, and (ii) computing the impulse direction and magnitude
needed to resolve the collision. We will examine these questions next.
11.4 Collision Impulses
The basic idea is simple: we will assume that impact occurs instantaneously, and model the reactions
of the objects to the impact event using impulses. These impulses alter the velocities of the objects,
so that they go from approaching each other to separating.
There are devils in the details, however. First, when do we apply an impulse? There are at
least three reasonable approaches:
1. at each time step when objects overlap. This approach requires only discrete collision detection; however, since impulses are applied only at time steps, and only after collisions have
already occurred, there will be slight interpenetration of objects before the impulse is applied;
2. at the exact time of collisions. This option is tempting, since it is the most physically correct:
the impulse will be applied at the exact moment when two objects first touch. However, there
are significant downsides to this choice as well. First, detecting the time of collision requires
relatively expensive continuous-time collision detection. It also requires modifying the time
integrator to allow for adaptive time stepping, so that time can be advanced to the moment
of impact, no matter when it occurs within a time step. Lastly, stopping time at every impact
event invites Zeno-type paradoxes where infinitely many collisions occur during a finite period
of time, rendering it impossible to finish a simulation no matter how much computational
resources are available. Consider, for example, a ball bouncing on a table. Each bounce
dissipates some energy, so that the time between bounces decreases with each bounce. A
straightforward simulation of this system will never finish since the ball will bounce infinitely
often in a finite amount of time;
3. at each time step, during the time step before objects overlap. In other words, after each
time integration step, there is a collision detection query to detect whether objects will collide
during the next time step (or during the motion interpolating the two steps, if continuous-time
collision detection is available.) If a collision is detected, impulses are applied to prevent the
collision from ever happening. In this approach, objects never interpenetrate; instead, it is
possible for objects to bounce off of each other before they ever touch. Often, and especially for
thin objects, it is better to be overly conservative when choosing when to respond to collision,
and so this option is quite popular in practice, where it is sometimes called a velocity filter
(since the impact response “filters” the raw velocity computed by the time integrator and
modifies it to prevent future collisions).
In the rest of this lecture, we will focus on the first type of impact response. The basic theory
extends in a straightforward way to using impulses as a velocity filter, etc.
11.4.1 Relative Velocity
Now that we have decided when to apply an impulse, there is still the matter of whether to apply one
at all. Obviously, if the constraint is nowhere close to being violated, g(q) > 0, there is no need
to apply any impulse. What if g(q) = 0? In this case, the configuration is right at the boundary
of the feasible and infeasible region of Q: when g encodes non-penetration of a pair of objects, this
situation corresponds to the two objects exactly touching. If the objects are moving towards each
other, so that q is about to enter into the infeasible region, then we must prevent the situation by
applying an impulse. If q is leaving the infeasible region and entering the feasible region, we need
do nothing. If the two objects are sliding past each other (so that q̇ is parallel to the constraint
boundary g = 0), we also don’t need to apply any impulses.
We can formalize the above reasoning by measuring the derivative of g as a function of time:
    (d/dt) g(q(t)) = [dg(q)] q̇,
often called the relative velocity with respect to the constraint g. Notice that this relative velocity,
unlike the full velocity q̇, is just a scalar, and is negative when the constraint violation is getting
worse over time (objects are approaching) and positive when the constraint violation is decreasing
(objects are separating). When g(q) = 0, we only need to apply an impulse if the relative velocity
is negative.
What about the case when objects are already penetrating, g(q) < 0? Clearly we must apply
an impulse, right? Not really. Once again, we check the relative velocity: if the relative velocity is
nonnegative, the objects are already separating (due to internal or external forces unrelated to the
impact handling, for example) and we don’t need to interfere with q̇ to resolve the impact.
In summary, the algorithm for deciding whether to apply an impulse is
• if g(q) > 0, do nothing;
• else calculate the relative velocity [dg(q)] q̇. Apply an impulse if and only if objects are
approaching: [dg(q)] q̇ < 0.
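This decision rule is only a few lines of code. A sketch (hypothetical helper names; the example uses a pair of particles on a line with g(q) = x2 − x1):

```python
import numpy as np

def needs_impulse(g_val, dg_val, qdot):
    """Return True when a constraint requires an impulse: g <= 0 and approaching."""
    if g_val > 0.0:
        return False              # strictly feasible: do nothing
    rel_vel = dg_val @ qdot       # relative velocity [dg(q)] q-dot
    return rel_vel < 0.0          # act only if the violation is worsening

# Two particles on a line; particle 2 must stay to the right: g(q) = x2 - x1.
dg = np.array([-1.0, 1.0])
approaching = needs_impulse(0.0, dg, np.array([1.0, -1.0]))  # touching, heading together
separating  = needs_impulse(0.0, dg, np.array([-1.0, 1.0]))  # touching, moving apart
# An impulse is needed in the first case but not the second.
```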
11.4.2 Computing the Impulse
Once we decide to apply an impulse, we need to determine the direction and magnitude of the
impulse. The direction is simple: we apply an impulse in the direction that most quickly moves us
away from the infeasible region: the derivative of the constraint function, dg. Therefore we will
apply an impulse
J = α [dg(q)]T
where α is an unknown impulse magnitude. Choosing the magnitude is a bit subtle: if α is too
small, the impulse does not fully kill the negative relative velocity, and the impulse does not succeed
in stopping the violation of the constraint g. If α is too large, we inject energy into the system, with
objects flying apart post-impact at much greater speeds than that with which they were originally
approaching.
There are two common approaches to choosing α. The first is to assume a perfectly elastic
collision: objects bounce off of each other without dissipating any energy, as if they were made out
of idealized rubber. Let ∆q̇(J) denote the change in the configurational velocity due to the impulse
J; then the post-impulse velocity q̇+ is related to the pre-impact velocity q̇− by the simple relation
q̇+ = q̇− + ∆q̇(J).
To get perfectly elastic impact, we want the relative velocity after the impulse to have the same
magnitude, but opposite sign, of the original (negative) relative velocity:
    [dg(q)] q̇+ = [dg(q)] (q̇− + ∆q̇(J)) = − [dg(q)] q̇− .
We can solve this equation for α to find the impulse which accomplishes this goal.
The second common approach is to assume a perfect inelastic impact: objects collide, and a
maximum amount of energy is dissipated, so that the objects don’t recoil off of each other at all
(think of a beanbag falling onto the floor). This inelastic impact violates conservation of energy,
but models complex dissipative phenomena that occur in the real world such as internal vibrations
and friction of the objects, noise generation during the impact, etc. An inelastic impact results in
zero post-impact relative velocity:
    [dg(q)] (q̇− + ∆q̇(J)) = 0,
which we can again solve for α.
Most real-world objects are neither perfectly elastic, nor perfectly inelastic. For example, a ball,
when dropped on the floor, will bounce back up to some fraction c of its original height, where c
depends on the material that the ball is made out of, and the material of the floor. This fraction
c is called the coefficient of restitution and ranges from 0 (for inelastic impact) to 1 (for elastic
impact). We can easily extend the previous two special cases to handle arbitrary coefficients of
restitution:
    [dg(q)] (q̇− + ∆q̇(J)) = −c [dg(q)] q̇− .
Example: For particle systems, we saw in the previous chapter that the change in velocity due
to an impulse is given by
    ∆q̇(J) = M^{-1} J.
We can use this relation to derive an explicit formula for α, for arbitrary coefficients of restitution.
We have that
    [dg(q)] (q̇− + ∆q̇(J)) = −c [dg(q)] q̇−
    [dg(q)] (q̇− + α M^{-1} [dg(q)]^T) = −c [dg(q)] q̇−
    α [dg(q)] M^{-1} [dg(q)]^T = −(1 + c) [dg(q)] q̇−
    α = −(1 + c) [dg(q)] q̇− / ([dg(q)] M^{-1} [dg(q)]^T).

11.5 Handling Multiple Impacts
Unlike the penalty method, which trivially extends to multiple constraints, there are some additional
complications that arise when trying to apply impulses to prevent violation of multiple inequality
constraints. First, one must decide whether to apply the impulses sequentially, or simultaneously:
Gauss-Seidel-style impulses are applied one at a time, in some (artificial, user-defined) order.
Notice that earlier impulses can change relative velocities, perhaps removing the need to apply
impulses for constraints later in the order, or perhaps causing new violations (negative relative
velocities) for constraints that previously did not need an impulse. Moreover, there is no guarantee
that a single pass through the set of constraints gi will correct all relative velocities: therefore to
be fully robust, one must iterate through all of the constraints repeatedly until no new constraint
violations are detected.
There is no guarantee that this algorithm terminates after finitely many iterations. Even if
it does, the output will depend on the order used to resolve the constraints; it is therefore a bit
unclear to what extent the final velocities are physically correct.
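A minimal sketch of Gauss-Seidel-style impulse iteration in Python (the helper names are illustrative; the particle-system rule ∆q̇ = M⁻¹J and the formula for α derived above are assumed):

```python
import numpy as np

def gauss_seidel_impulses(q, qdot, constraints, Minv, c=1.0, max_passes=50):
    """Apply restitution impulses one constraint at a time until none is approaching.

    constraints : list of (g, dg) callable pairs, one per inequality constraint
    Minv        : inverse mass matrix (assumes the particle-system rule dv = Minv @ J)
    c           : coefficient of restitution
    """
    qdot = qdot.copy()
    for _ in range(max_passes):
        applied = False
        for g, dg in constraints:          # the (artificial) user-defined order
            if g(q) > 0.0:
                continue                   # constraint inactive
            grad = dg(q)
            rel = grad @ qdot              # relative velocity [dg(q)] q-dot
            if rel >= 0.0:
                continue                   # already separating
            alpha = -(1.0 + c) * rel / (grad @ Minv @ grad)
            qdot += alpha * (Minv @ grad)  # J = alpha * dg^T, dv = Minv @ J
            applied = True
        if not applied:
            break                          # one full clean pass: all resolved
    return qdot
```

For two equal-mass particles colliding head-on with c = 1, a single pass reverses both velocities, and the next pass finds nothing left to do and terminates; the loop bound max_passes guards against the non-termination issue noted above.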
Jacobi-style impulses are applied all at once: a first pass through all constraints gathers the
impulses each would apply separately, and then they are applied simultaneously to the system’s
velocity. Unlike for Gauss-Seidel-style impulses, there is no need to choose a particular ordering of
the constraints.
However, there is no guarantee that a single Jacobi impulse pass doesn’t cause new collisions,
either. Moreover Jacobi impulses can violate conservation of energy, even when the coefficient of
restitution used for each individual constraint is 1.0: consider for instance three balls on the real
line, with q = [x1 , x2 , x3 ]T , and suppose the left ball has velocity 1.0, the right ball has velocity
−1.0, the middle ball is stationary, and all three balls collide simultaneously. For balls of equal
mass, the post-impulse velocity has all three balls stationary, which clearly decreased the total
energy of the system. It is also possible to construct examples where Jacobi iterations add energy
to the system.
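The three-ball example is easy to verify numerically. A sketch (hypothetical helper; equal unit masses, so M⁻¹ = I):

```python
import numpy as np

def jacobi_impulses(qdot, grads, Minv, c=1.0):
    """Gather the impulse each active constraint would apply alone, then apply all at once."""
    dv = np.zeros_like(qdot)
    for grad in grads:
        rel = grad @ qdot
        if rel >= 0.0:
            continue                                  # this constraint is separating
        alpha = -(1.0 + c) * rel / (grad @ Minv @ grad)
        dv += alpha * (Minv @ grad)                   # accumulate; do not apply yet
    return qdot + dv

# Three equal-mass balls on a line colliding simultaneously: q = [x1, x2, x3].
qdot  = np.array([1.0, 0.0, -1.0])
grads = [np.array([-1.0, 1.0, 0.0]),   # g1 = x2 - x1
         np.array([0.0, -1.0, 1.0])]   # g2 = x3 - x2
out = jacobi_impulses(qdot, grads, np.eye(3), c=1.0)
# out = [0, 0, 0]: all kinetic energy is lost even though c = 1.
```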
11.5.1 The Special Case of Inelastic Collisions
For inequality constraints on physical systems in standard form corresponding to inelastic impact
(zero coefficient of restitution), it is possible to handle the constraints with impulses in a more
systematic way than either of the above two strategies. The key insight is that we can write down
the problem of applying impulses in terms of an optimization: find the configurational velocity
closest to the current velocity, but that satisfies all constraints:
    min_v  (1/2) (v − q̇)^T M (v − q̇)
    s.t.   [dgi(q)] v ≥ 0, if gi(q) ≤ 0.        (11.1)
Notice the use of the kinetic energy metric in the objective function; in addition to being coordinate-independent, this use of the mass matrix prioritizes deflecting the velocity of lighter objects, rather
than heavier ones. To simplify the following exposition, let us assume that all constraints are
violated, gi (q) ≤ 0 (we can simply ignore non-violated constraints in the calculation, if any exist.)
It is probably not obvious why equation (11.1) corresponds to applying inelastic impulses; to see
this, let us explore how to generalize the method of Lagrange multipliers to inequality constraints.
We begin with a generic inequality-constrained problem,
min f (q)
s.t.
q
g(q) ≥ 0.
What do solutions of this problem look like? There are two possibilities: either the solution is in
the interior of the feasible region, g(q) > 0, in which case the solution is also a solution of the
unconstrained problem
min f (q).
q
We know how to find these solutions: we solve df (q) = 0, and look for any solutions q that lie
within the feasible region g(q) ≥ 0. Any such points are also solutions of the original inequality-constrained problem. (Of course, in practice it may not be so easy to solve df (q) = 0; we can find
one solution using Newton’s method, if we have a good idea of where the solution might be.)
The other case is that the solution lies on the boundary, where g(q) = 0. In this case the
solution q does not necessarily have to be a solution of the unconstrained function f , as long as,
just like in the equality-constrained case, sliding left or right on the boundary g(q) = 0, or moving
into the interior g(q) > 0, cannot decrease our function. This condition is true whenever the
gradient [df (q)]T points directly into the feasible region:
    [df (q)] = λ [dg(q)] ,    λ ≥ 0.
Notice that unlike in the equality-constrained case, we now have a restriction on λ: this is because,
if df and dg are anti-parallel, then we are not at a minimum of our constrained objective, since we
can move into the interior of the feasible region and decrease f .
We can summarize the two cases by the following rule: q is a minimum if

    [df (q)] = λ [dg(q)]

and either: (i) g(q) > 0 and λ = 0 (the interior case), or (ii) g(q) = 0 and λ ≥ 0 (the boundary
case). These conditions can be refactored into

    g(q) ≥ 0
    λ ≥ 0
    g(q) ⊥ λ,

where the notation on the third line, called a complementarity condition, indicates that either the
left term or the right term (or both) must be zero.
We can now write down the conditions on the solutions to the original problem (11.1):
    M (v − q̇) = Σ_i λi [dgi(q)]^T
    λ ≥ 0
    [dgi(q)] v ≥ 0 ∀i
    [dgi(q)] v ⊥ λi ∀i,
which can be further compactified by introducing the matrix G, whose columns are the [dgi (q)]:
    M (v − q̇) − Gλ = 0,   λ ≥ 0,   G^T v ≥ 0,   G^T v ⊥ λ,        (11.2)
where the inequalities and complementarity of vectors are interpreted as acting entrywise.
Now consider what these conditions mean physically. The first equation,
    v = q̇ + M^{-1} Gλ,
tells us that the output velocity of the optimization problem is equal to the pre-impact velocity,
plus some impulses in the [dgi ]^T directions. If λi = 0, then the relative velocity [dgi ] v can be
anything positive; this corresponds to the case where no impulse was applied at all to fix constraint
i. On the other hand, if λi > 0, then [dgi ] v = 0: this means that whenever we applied an impulse
for a constraint, the relative velocity for that constraint ended up being zero. So all impulses we
applied were inelastic.
It may not seem like we gained much by formulating the problem of finding a post-impact
velocity in terms of the equations (11.2). But it turns out that these equations have a special
form: the equations are linear in the unknowns (v and λ), and the constraints are linear inequality constraints, except for the complementarity constraint, which is technically quadratic in
the unknowns. Such a problem is called a linear complementarity problem and special solvers are
available for finding solutions to them. These can be used to find, relatively efficiently, a valid
post-impact velocity.
Unfortunately, it is not clear how to modify equation 11.1 to encode elastic impulses, rather
than inelastic ones. If such a formulation exists, it remains to be discovered.
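One standard way to approximately solve contact LCPs of the form (11.2) is projected Gauss-Seidel: run Gauss-Seidel on the multipliers λi, clamping each to be nonnegative after every update. A minimal sketch (illustrative only, and by no means the only or best LCP solver):

```python
import numpy as np

def pgs_contact_lcp(M, G, qdot, iters=200):
    """Projected Gauss-Seidel for the inelastic contact conditions (11.2).

    M    : mass matrix
    G    : matrix whose columns are the active constraint gradients [dg_i(q)]^T
    qdot : pre-impact velocity
    Returns v = qdot + Minv @ G @ lam with lam >= 0, G^T v >= 0 (approximately).
    """
    Minv = np.linalg.inv(M)
    lam = np.zeros(G.shape[1])
    v = qdot.astype(float)
    for _ in range(iters):
        for i in range(G.shape[1]):
            gi = G[:, i]
            rel = gi @ v                          # relative velocity for constraint i
            dlam = -rel / (gi @ Minv @ gi)        # plain Gauss-Seidel update
            new_lam = max(0.0, lam[i] + dlam)     # project onto lambda_i >= 0
            v += (new_lam - lam[i]) * (Minv @ gi)
            lam[i] = new_lam
    return v

# The three simultaneously colliding balls again, now handled inelastically:
v = pgs_contact_lcp(np.eye(3),
                    np.array([[-1.0, 0.0], [1.0, -1.0], [0.0, 1.0]]),
                    np.array([1.0, 0.0, -1.0]))
# v converges to (approximately) [0, 0, 0].
```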
Chapter 12
1D Rubber Bands
Reality can be elastic, and I want to see
how elastic it can be.
Yoko Ono
Earlier in these notes, we derived Hooke’s law for elastic springs. We will now deepen our
understanding of elasticity, and how to simulate elastic objects. This line will culminate in a
formulation of the mechanics of 2D and 3D elastic shells and volumes like cloth, coke bottles, and
rubber balls; but we will begin with a much humbler setup: a 1D rubber band. The difference
between a rubber band and a spring is that the rubber band can be stretched different amounts
in different places (imagine that there are thumbtacks holding the rubber band in place, so some
parts are stretched and others compressed.)
Because the rubber band is stretched by different amounts in different places, it is no longer
possible to write down the energy of the rubber band as only a function of its deformed length:
what happens on the inside matters!
As in the case of the spring, we will start with an undeformed rubber band Ω̄; let us take
Ω̄ = [0, L̄], an interval of length L̄. The degrees of freedom of the rubber band are now all possible
deformations of Ω̄, which we can represent using an embedding function φ : Ω̄ → R. The image
φ(Ω̄) = Ω is then the deformed rubber band. (As usual we will use the convention that barred
variables relate to the undeformed rubber band, and unbarred variables, to the deformed rubber
band.) The use of this φ represents a key leap beyond the spring case in the previous section: now
we have infinitely many degrees of freedom! Obviously this will present a challenge for simulation;
but we will get to that problem later.
Some example φ:
• The identity embedding φ(x̄) = x̄, which just gives Ω = Ω̄.
• Rigid translations of the rubber band: φ(x̄) = x̄ + t for some fixed t.
• Uniform stretching of the rubber band to deformed length L (while keeping the left endpoint
fixed): φ(x̄) = x̄L/L̄.
The goal now is to write the potential energy of the rubber band, this time as a function V (φ) of
the function φ (a function-of-a-function is called a functional). The Laws of the Spring generalize
naturally to this setting:
Figure 12.1: A 1D rubber band (visualized with thickness, and vertical tick marks, to clarify the
deformation). The deformation map φ maps from points in the undeformed rubber band Ω̄ to the
deformed rubber band Ω. This rubber band has been compressed towards the left, and stretched
towards the right. The amount of stretching at a point can be measured by looking at how φ changes
the distance between two nearby points (blue and red). Tangent vectors on the undeformed rubber
band are mapped to the deformed rubber band by the deformation Jacobian Jφ; an alternative
way to measure stretchedness is to measure the change in the length of tangent vectors drawn at a
point.
Laws of the Rubber Band
1. The energy functional is local: it can be written as

    ∫_Ω̄ W(x̄) dx̄

where W is an energy density depending only on the behavior of φ near a material point x̄.

2. The energy density W at a point x̄ ∈ Ω̄ depends only on the stretchedness (strain) ε(x̄) at
that point on the rubber band.

3. The energy density is zero when ε = 0.

4. The energy density is non-negative.
The main change is that the energy is now expressed as an integral of a local energy density.
Imagine a rubber band whose left end is very stretched, but whose right end isn’t stretched at all.
The left end will contribute a large amount of energy to the total energy E, and the right end will
contribute nothing; but the amount contributed by the left end depends only on what φ looks like
near the left end, and does not depend on φ at other places along the rubber band (this is what is
meant by the energy being a local quantity.) The other laws are unchanged from the spring case;
but note that the strain is now also a function over Ω, and can differ at different points along the
rubber band.
Finally, remember that once we have the energy, if we wanted to compute the force we could
do so using F = −dφ V .
The embedding φ is a function—what does it even mean to take the derivative of the functional
V with respect to a function φ?!
For now, don’t worry about it. The meaning of the differential is unchanged: dφ V maps
from an infinitesimal change in the embedding φ to an infinitesimal change in energy. Like
the force in finite dimensions, F can be interpreted as measuring the “direction of steepest
descent”: the way φ needs to change to most quickly decrease the total energy of the system.
The question of what F actually is and how to compute it for continuum solids is in fact a very subtle and
important one, which we will return to later. For now, we’ll focus only on computing V and trust that we
can then compute forces from it.
Our task, then, is to derive an expression for ε(x̄), like we did for the spring. How do we measure
“stretchedness” at a point x̄? Intuitively, we could scatter some points near x̄, and then see how
their distance to x̄ differs on Ω̄ and Ω: if the points are further from x̄ on the deformed rubber
band, the rubber band is stretched at x̄, and if nearer, compressed.
We can take this idea further and instead place test points ȳ infinitesimally close to x̄. Equivalently we could pick tangent vectors v̄ at x̄ and test them: if we draw an infinitesimal tangent vector
v̄, and then stretch the rubber band according to φ, how much does v̄ stretch (see figure 12.1)?
Notice that measuring the stretching of v̄ is the same as measuring the change in distance between
x̄ and a test point ȳ = x̄ + δv̄ for an infinitesimal distance parameter δ.
What vector v in Ω does v̄ get mapped to by φ? The answer is v = [dφ(x̄)]v̄. There are two
ways of seeing this:
1. Geometrically, the derivative of φ maps infinitesimal changes in x̄ to infinitesimal changes
in φ (this is the entire purpose of the differential). An infinitesimal change in x̄ is a tangent
vector v̄ on Ω̄; an infinitesimal change in φ is the corresponding tangent vector v on Ω.
2. We can work from the fact that the image of v̄ is the vector between two infinitesimally close
points φ(x̄) and φ(x̄ + δv̄):

    v = lim_{δ→0} (φ(x̄ + δv̄) − φ(x̄)) / δ,

which is the definition of the directional derivative of φ in the direction v̄, i.e. [dφ(x̄)]v̄.
We can now write down the stretchedness of the rubber band at a point x̄ and a direction v̄,
by comparing the lengths of v̄ and v:

    ε(x̄, v̄) = (‖v‖² − ‖v̄‖²) / (2‖v̄‖²) = (‖[Jφ(x̄)]v̄‖² − ‖v̄‖²) / (2‖v̄‖²).

Once again, there are many possible choices for what formula to use for the strain: here we’ve
chosen the Green strain (which compares squared lengths of the vectors), since it avoids needing
to deal with absolute values. Although this strain is technically a function of both a point and a
direction, the choice of v̄ doesn’t really matter: all tangent vectors at x̄ get stretched by the same
relative amount by φ. Indeed the above formula simplifies to

    ε(x̄) = ([Jφ(x̄)]² − 1) / 2

and we have a single number for the strain at every x̄, that does not depend on v̄.
Finally, by an identical argument to the spring case, the energy density must be of the form
    W(x̄) = (1/2) c2 ε(x̄)² + (1/6) c3 ε(x̄)³ + · · ·
where the precise constants c2 , c3 , etc are determined by the material of the rubber band (and the
choice of strain formula).
We can perform some sanity checks on the above calculations. First, suppose φ merely translates
the rubber band: φ(x̄) = x̄ + t. Clearly this translation should not have any elastic energy; let’s check
if this is in fact true. Jφ(x̄) = 1, so ε = 0, W = 0, and V = 0.
Finally, what if the rubber band stretches uniformly, like the spring? If the deformed length of
the rubber band is L, and we pin the left endpoint of the rubber band, this scenario corresponds
to

    φ(x̄) = x̄L/L̄,

which linearly stretches out the rubber band, and this deformation has derivative L/L̄. The strain
is therefore

    ε(x̄) = (L²/L̄² − 1) / 2 = (L² − L̄²) / (2L̄²).
For a simple Hookean material with c3 = c4 = · · · = 0, the energy is therefore

    ∫₀^L̄ (c2/2) ((L² − L̄²)/(2L̄²))² dx̄ = c2 (L² − L̄²)² / (8L̄³),

which is nearly identical to the formula for a Green strain spring in equation (3.2).
Why don’t the formulas match exactly? Isn’t a uniformly-stretched rubber band just the same
thing as a spring? Why is there an L̄³ in the denominator here and L̄⁴ for the spring?
The answer is that the constants c2 for a spring and for a rubber band are not perfectly
interchangeable. Both measure the “stiffness” of the material, but in the spring case c2 has
units of energy (since the strain is unitless) and in the rubber band case, c2 has units of energy
density, or energy/length. The two formulas can be reconciled by writing the energy of the rubber
band as

    (c2 L̄)(L² − L̄²)² / (8L̄⁴),

where c2 L̄ now plays the role of the stiffness constant in equation (3.2).
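The uniform-stretch energy is easy to sanity-check numerically. A short sketch (the specific values of c2, L, and L̄ are arbitrary test numbers):

```python
# Check the uniform-stretch energy c2 (L^2 - Lbar^2)^2 / (8 Lbar^3) against a direct
# computation from the Green strain. All values here are arbitrary test numbers.
Lbar, L, c2 = 2.0, 3.0, 5.0

J = L / Lbar                         # Jacobian of phi(x) = x L / Lbar
eps = (J**2 - 1.0) / 2.0             # Green strain, constant along the band
W = 0.5 * c2 * eps**2                # Hookean energy density
V_direct = W * Lbar                  # integrate a constant density over [0, Lbar]

V_formula = c2 * (L**2 - Lbar**2)**2 / (8.0 * Lbar**3)
# V_direct and V_formula agree.
```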
This section introduced two extremely important concepts:
• the idea of an object having continuous degrees of freedom rather than just its total length,
and how to represent the deformation mathematically;
• measuring strain at a point and in a direction, by looking at how tangent vectors get stretched.
We got very lucky for the 1D rubber band, since the direction v turned out not to matter: it
canceled from everywhere. In 2D we won’t be so fortunate.
12.1 Discretization
We now turn to the problem of discretizing V . We cannot represent φ directly in a simulation—
doing so would require infinite amounts of memory. We must instead reduce the space of possible
deformations of the rubber band to some finite-dimensional configuration space Q. This is done
by saying that φ can no longer be an arbitrary (smooth) function, but must live in some finite-dimensional function space

    φ(x̄) = Σ_{i=1}^{N} αi bi(x̄)
spanned by the basis functions bi (x), which are chosen ahead of time and do not change. The
coefficients αi then become the degrees of freedom of the discretized system.
How do we pick this space? There is no one right answer: we must trade off between a variety
of concerns, including efficiency and accuracy. Notice that our choice of basis {bi } affects the
physics of the simulated system since the only motions that the elastic band can undergo are
those in the span of these basis functions. For example, if bi included only rigid translations,
then the rubber band could not stretch or compress at all, no matter what forces are exerted
where on the rubber band: the physical system doesn’t have the “vocabulary” to express this
deformation!
Choosing the right basis thus becomes a delicate decision. We want to make sure that “common,” coarse
motion of the rubber band is expressible in the basis, while we may choose to give up resolving fine details
of the motion for the sake of efficiency (keeping N small). We will see in a future chapter one approach
for constructing a good basis by computing the natural vibrational modes of the rubber band, which has
different advantages and disadvantages compared to the piecewise-linear basis described below.
Several different recipes have been invented for how to systematically build the basis {bi }; one
desirable property, for instance, is that the behavior of the discrete rubber band converges to the
smooth band’s motion as N → ∞ (and if we have a bound on the error as a function of N ,
even better, since we can examine this bound and choose N to fit our accuracy requirements and
performance budgets). The massive field of finite element analysis studies in detail the different
choices of basis and their pros and cons. Here we describe the simplest practical basis.
12.1.1 Representing the Rubber Band with Piecewise Linear Elements
Choose N different points on the rubber band’s rest state x̄1 , x̄2 , . . . , x̄N . These points can be
anywhere on the rubber band, but let us ensure that the first and last point are at the rubber band
endpoints, that no two points coincide, and that they are in sorted order (x̄i < x̄i+1 ).
If we know where each of these N points end up in the deformed state, we can construct a φ
which maps arbitrary points x̄, by assuming that points in between the x̄i move linearly according
to their nearest chosen points:
    φ(x̄) = ((x̄ − x̄i)/(x̄i+1 − x̄i)) xi+1 + ((x̄i+1 − x̄)/(x̄i+1 − x̄i)) xi ,    x̄i ≤ x̄ < x̄i+1 .        (12.1)

Notice that the mapping φ
• depends on the values of the xi , and satisfies φ(x̄i ) = xi ;
• linearly stretches the material between two consecutive points x̄i and x̄i+1 ;
• is not differentiable everywhere (the derivative is not defined at the x̄i ) but does have a well-defined differential on the intervals of material (x̄i , x̄i+1 ). On each of these intervals, φ is linear
in x̄, and so dφ is constant.
This last observation lets us write down an energy for the rubber band, in terms of the values of
xi :
    V(q) = Σ_{i=1}^{N−1} ∫_{x̄i}^{x̄i+1} W_q(x̄) dx̄,
where W is the energy density depending on strain, as discussed above, and the degrees of freedom
of the system are the current positions of the x̄i :
    q = (x1 , x2 , . . . , xN ).
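In code, evaluating the piecewise-linear map (12.1) is a short routine. A sketch (the helper name is hypothetical):

```python
import numpy as np

def phi(xbar, rest, deformed):
    """Evaluate the piecewise-linear deformation (12.1) at a material point xbar.

    rest     : sorted rest positions of the chosen points
    deformed : their current (deformed) positions, the degrees of freedom q
    """
    i = int(np.searchsorted(rest, xbar, side="right")) - 1
    i = min(max(i, 0), len(rest) - 2)               # clamp to the last segment
    t = (xbar - rest[i]) / (rest[i + 1] - rest[i])  # barycentric coordinate on segment
    return (1.0 - t) * deformed[i] + t * deformed[i + 1]

# A band of rest length 2 whose right half has been stretched by a factor of 2:
rest     = np.array([0.0, 1.0, 2.0])
deformed = np.array([0.0, 1.0, 3.0])
# phi(0.5, ...) = 0.5 and phi(1.5, ...) = 2.0.
```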
What basis {bi (x̄)} does this discretization correspond to? Notice that if we define bi to be the
piecewise-linear function which is 1 at x̄i , ramps down to zero at x̄i−1 and x̄i+1 , and is zero beyond
these ramps, then we have that
    φ(x̄) = Σ_{i=1}^{N} xi bi(x̄).
So the discretization above is the result of choosing the piecewise-linear “hat basis” for {bi } (so-called since each basis element looks like a triangular hat, centered at a different point on the elastic
rod.) This basis is very common, due to its simplicity—but by no means the only or best choice!
More precisely, the basis functions have formulas

    bi(x̄) = { 0,                          x̄ < x̄i−1
            { (x̄ − x̄i−1)/(x̄i − x̄i−1),    x̄i−1 ≤ x̄ < x̄i
            { (x̄i+1 − x̄)/(x̄i+1 − x̄i),    x̄i ≤ x̄ ≤ x̄i+1
            { 0,                          x̄ ≥ x̄i+1 ,
ignoring cases that are undefined at the boundary (the first two cases involving x̄i−1 when i = 1,
for instance).
Once we have picked a discrete basis, writing down the full formula for V (q) is a matter of
straightforward computation: the Jacobian of the piecewise-linear φ is the piecewise-constant
    Jφ(x̄) = Σ_{i=1}^{N} xi Jbi(x̄),

where

    Jbi(x̄) = { 0,                  x̄ < x̄i−1
             { 1/(x̄i − x̄i−1),     x̄i−1 ≤ x̄ < x̄i
             { −1/(x̄i+1 − x̄i),    x̄i ≤ x̄ ≤ x̄i+1
             { 0,                  x̄ ≥ x̄i+1 .
Notice that this formula is in direct agreement with simply differentiating equation (12.1). Furthermore, although Jφ(x) in principle depends on N different terms, for any given value of x, only
two terms of the sum will be nonzero; in particular for x̄ ∈ [x̄i , x̄i+1 ], the Green strain is constant
in x̄,

    ε(x̄) = ((xi+1 − xi )²/(x̄i+1 − x̄i )² − 1) / 2,
which can be easily integrated over each segment to give V (q), given the material parameters ci .
12.2 Kinetic Energy
We now turn to discussing the other term that we need in order to define the physical system: the
kinetic energy of the rod. Again, we define kinetic energy by identifying the masses in the system,
and integrating up their contribution to the kinetic energy, just as in the case of rigid bodies.
    T(q̇) = ∫₀^L̄ (1/2) ρ φ̇(x̄)² dx̄        (12.2)
where ρ is the density of the rubber band (here assumed constant). For discrete rods, we can
approximate kinetic energy by applying the usual assumption that masses move along affine trajectories between time steps. For the hat basis, we have that the velocity φ̇(x̄) of any point on Ω̄
between time steps i and i + 1 is
    φ̇(x̄) = Σ_{j=1}^N ((x_j^{i+1} − x_j^i)/h) b_j(x̄)
          = ((x_j^{i+1} − x_j^i)/h) · (x̄_{j+1} − x̄)/(x̄_{j+1} − x̄_j) + ((x_{j+1}^{i+1} − x_{j+1}^i)/h) · (x̄ − x̄_j)/(x̄_{j+1} − x̄_j),    x̄_j ≤ x̄ < x̄_{j+1},
where h is the time step size. Notice that only the coefficients xi have changed, as only these are
degrees of freedom that change over time; the hat basis itself depends only on position x̄ on Ω̄ and
is time-invariant (assuming the rubber band does not snap, and that there are no other changes
to the number or locations of sample points x̄i on the rubber band, etc.) Therefore the kinetic
energy (12.2) can be discretized in standard form in terms of a constant, precomputed mass matrix
M:
    T(q^i, q^{i+1}) = (1/2) ((q^{i+1} − q^i)/h)^T M ((q^{i+1} − q^i)/h),    M_ij = ρ ∫_0^L b_i(x̄) b_j(x̄) dx̄.
Notice that this matrix is not diagonal, but it is sparse (in fact tridiagonal), since the hats bi and
bj only overlap when |i − j| ≤ 1. Computing each term involves integrating a quadratic polynomial
in x̄, with formulas
    M_ii = { ρ (x̄_2 − x̄_1)/3,          i = 1
           { ρ (x̄_{i+1} − x̄_{i−1})/3,  1 < i < N
           { ρ (x̄_N − x̄_{N−1})/3,      i = N

    M_{i(i+1)} = ρ (x̄_{i+1} − x̄_i)/6.
As a sanity check, if (q^{i+1} − q^i)/h is constant (with every entry equal to, say, c) then

    T(q^i, q^{i+1}) = (1/2) c² Σ_{i,j} M_ij = (1/2) ρ (x̄_N − x̄_1) c²,

as expected for a rigid body Ω̄ translating at speed c.
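Assembling M and running this sanity check is straightforward in code. The sketch below is illustrative (the function name and segment-wise assembly loop are my own); it builds the tridiagonal matrix from the per-segment formulas above.

```python
import numpy as np

def galerkin_mass_matrix(xbar, rho):
    """Tridiagonal finite element mass matrix for the 1D hat basis:
    M_ij = rho * integral of b_i(x) b_j(x) over the rod."""
    N = len(xbar)
    M = np.zeros((N, N))
    for i in range(N - 1):
        h = xbar[i + 1] - xbar[i]         # length of segment i
        M[i, i] += rho * h / 3.0          # each segment contributes a 2x2
        M[i + 1, i + 1] += rho * h / 3.0  # block to the global matrix
        M[i, i + 1] += rho * h / 6.0
        M[i + 1, i] += rho * h / 6.0
    return M

# Sanity check from the text: a uniform velocity c gives the kinetic
# energy (1/2) rho (x_N - x_1) c^2 of a rigidly translating rod.
xbar = np.array([0.0, 0.5, 1.5, 2.0])
M = galerkin_mass_matrix(xbar, rho=2.0)
c = 3.0
v = np.full(len(xbar), c)
assert np.isclose(0.5 * v @ M @ v, 0.5 * 2.0 * (xbar[-1] - xbar[0]) * c**2)
```

Note that assembling segment by segment automatically produces the boundary and interior cases of M_ii without special-casing them.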
Bonus Math The above mass matrix M is called the Galerkin or “finite element” mass matrix, and
while it yields a relatively simple discrete kinetic energy in standard form, the fact that M is tridiagonal
is sometimes undesirable, as M −1 cannot be explicitly computed efficiently. An alternative is to discretize
velocity as piecewise-constant:
    T(q^i, q^{i+1}) = (ρ/2) Σ_{j=1}^N A_j ((x_j^{i+1} − x_j^i)/h)²,

where A_j is the area of the rubber band belonging to point x̄_j:

    A_j = { (x̄_2 − x̄_1)/2,          j = 1
          { (x̄_{j+1} − x̄_{j−1})/2,  1 < j < N
          { (x̄_N − x̄_{N−1})/2,      j = N.
Notice that this kinetic energy can be expressed trivially in standard form,

    T(q^i, q^{i+1}) = (1/2) ((q^{i+1} − q^i)/h)^T M^B ((q^{i+1} − q^i)/h),

for the diagonal barycentric or lumped mass matrix M^B_ii = ρ A_i. It is important to remark, though, that this
formulation requires interpreting the discrete deformation q differently when computing potential and
kinetic energy: for potential energy, we assumed that q encodes piecewise-linear motion of the rod, and
for kinetic energy, we assumed that chunks of the rod around each x̄i move rigidly (and discontinuously!).
In practice, though, the lumped mass matrix is extremely commonly used, as there are few negative
consequences of this discretization crime.
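The lumped matrix is even simpler to assemble than the Galerkin one. A minimal sketch (function name my own), in which each segment hands half of its mass to each endpoint:

```python
import numpy as np

def lumped_mass_matrix(xbar, rho):
    """Diagonal barycentric ('lumped') mass matrix for a 1D rod:
    each segment gives half of its mass to each of its two endpoints."""
    m = np.zeros(len(xbar))
    seg_mass = rho * np.diff(xbar)   # mass of each segment
    m[:-1] += seg_mass / 2.0         # left endpoint of each segment
    m[1:] += seg_mass / 2.0          # right endpoint of each segment
    return np.diag(m)

# Lumping preserves the total mass rho * L of the rod.
xbar = np.array([0.0, 0.5, 1.5, 2.0])
MB = lumped_mass_matrix(xbar, rho=2.0)
assert np.isclose(np.trace(MB), 2.0 * (xbar[-1] - xbar[0]))
```

Because M^B is diagonal, its inverse is just the elementwise reciprocal of the diagonal, which is what makes lumping attractive for explicit time integration.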
12.3 Euler-Lagrange Equations
For discrete rubber bands, nothing special need be done: we simply plug V and T into the discrete Euler-Lagrange equations and reap a time integrator. The straightforwardness of the discrete case stands in contrast to the continuous case, where our degree of freedom is the “infinite-dimensional” time-varying function φ(t). We will see this pattern repeat as we look at more complex physical systems governed by more elaborate PDEs: mathematical complexities that dominate the continuous description of the system degrade to simple linear algebra and finite-dimensional calculus after discretization.
Bonus Math Although we do not need to derive the continuous equations of motion in order to simulate rubber bands, it is nevertheless instructive to see how the calculation can be done in the continuous setting. The key, as always, is Hamilton’s principle: we want to extremize the action

    S(φ[t]) = ∫_{t_0}^{t_1} L(φ(t), φ̇(t)) dt

over all trajectories φ[t] (now paths through infinite-dimensional space rather than through R^n) that fix the starting and ending deformations φ(t_0) and φ(t_1). The calculus of variations still applies: for any
perturbation δφ(t) to the trajectory φ(t) with

    [δφ(t_0)](x̄) = [δφ(t_1)](x̄) = 0,

if φ(t) extremizes S, it must be the case that the directional derivative of S in the δφ “direction” vanishes,

    lim_{ε→0} (d/dε) S(φ + ε δφ) = 0.
(If the notation [δφ(t0 )] (x̄) is confusing, remember the type of objects involved: a single deformation φ is
a function from Ω̄ to R. A trajectory is a path φ(t) which yields a different deformation for every moment
of time t. Fixing an endpoint of the path means that δφ, at time t0 , is the deformation function which
maps all points on the rubber band to the origin.) Plugging in the Lagrangian for the rubber band, we
have

    0 = lim_{ε→0} (d/dε) ∫_{t_0}^{t_1} ∫_0^L [ (ρ/2) ((φ̇ + ε δφ̇)(x̄))² − W_{φ+εδφ}(x̄) ] dx̄ dt.
The first term is straightforward to differentiate, but the second term is not; let us substitute in more of the energy density,

    0 = ∫_{t_0}^{t_1} ∫_0^L [ ρ φ̇(x̄) δφ̇(x̄) − lim_{ε→0} (d/dε) (c_2/2) ( (J[φ(t) + ε δφ(t)]^T J[φ(t) + ε δφ(t)] − 1)/2 )² ] dx̄ dt,
where again, in the second term we are taking the Jacobian (with respect to the spatial variable x̄) of the function that we get by evaluating the trajectory φ + ε δφ at the time t. Next, we integrate the first term by parts with respect to time (why? because as in the finite-dimensional case, we want to get an expression inside the integral that multiplies δφ; since the perturbation is arbitrary, we will then argue that the expression must be zero), and simplify the derivative of the second term, to get
    0 = ∫_{t_0}^{t_1} ∫_0^L [ −ρ φ̈(x̄) δφ − (c_2/4) ε(x̄) ( J[δφ(t)]^T J[φ(t)] + J[φ(t)]^T J[δφ(t)] ) ] dx̄ dt,
or, taking advantage of the fact that Jφ is just the one-dimensional spatial derivative (d/dx̄)φ, and using primes to denote differentiation by x̄,

    0 = ∫_{t_0}^{t_1} ∫_0^L [ −ρ φ̈(x̄) δφ − (c_2/2) ε(x̄) φ(t)′ δφ(t)′ ] dx̄ dt.
How do we deal with the δφ(t)0 term? By integrating by parts in the spatial dimension, of course!
    0 = ∫_{t_0}^{t_1} [ −(c_2/2) ε(x̄) φ(t)′(x̄) δφ(t)(x̄) |_{x̄=0}^{x̄=L} + ∫_0^L ( −ρ φ̈(x̄) + (c_2/2) (d/dx̄)[ε(x̄) φ(t)′(x̄)] ) δφ dx̄ ] dt.
All derivatives have been removed from δφ and we are ready to interpret the above as Euler-Lagrange
equations; but we need to exercise some care, since we now have two integrals, and an extra spatial
boundary term wrapped inside a time integral. By choosing perturbations that are zero except for a
neighborhood of t and x̄ away from the rubber band boundary, we have the equations of motion,
    ρ φ̈(x̄) = (c_2/2) (d/dx̄)[ε(x̄) φ(t)′(x̄)],    0 < x̄ < L.
But we can also choose a perturbation that is zero except near the boundary, at x̄ = 0 or x̄ = L. By
making δφ(x̄) decay quickly enough as you move away from the boundary, we can make the second,
double-integral term negligible compared to the first term, and we get the boundary conditions
    ε(x̄) φ(t)′(x̄) = 0,    x̄ ∈ {0, L}.
Since we can typically rule out φ′(t) = 0 (this would correspond to extreme compression of the rubber band, so that it is on the cusp of inverting through itself), the boundary conditions imply that at the tips of the rubber band, the strain is always zero. As a consequence, suppose you take a rubber band, stretch it uniformly, and then let go: by the equations of motion, the acceleration φ̈ is zero at all interior points of the rubber band; on the other hand, the endpoints of the rubber band instantly infinitesimally unstretch to have zero strain. A moment later, points in the rubber band near the endpoints start to feel a nonzero force, since ε′ is now nonzero near the boundaries; an elastic wave of nonzero force will start propagating inward from the boundaries, until the whole rubber band is moving.
A very similar argument explains why a slinky, when held at one end and suspended under gravity, appears to “hover” for a while when you let go of the top end.
Chapter 13
Elasticity in Higher Dimensions
The key to growth is the introduction of
higher dimensions of consciousness into
our awareness.
Lao Tzu
We now look at two- and three-dimensional elastic bodies. The jump from one to two dimensions
will be significant, as several complications arise thanks to the fact that the material might stretch
or compress differently, at a single point, in different directions. By contrast, once you understand
the physics of a 2D sheet, 3D involves virtually no new concepts.
13.1 2D Sheets
Much of the derivation will track that of the 1D rubber band: in particular, we start with an undeformed region Ω̄ in the plane. A function φ : Ω̄ → R² represents the deformation of the sheet, with the image Ω of φ the deformed configuration of the sheet. The total energy of the deformation φ is then, as it was for the rubber band, expressed as a (now double) integral over Ω̄ of a strain density W.
We can measure the stretchedness of Ω̄ at any point p = (x, y) of Ω̄ by looking at tangent vectors v at p, and seeing how much they get stretched or compressed by φ. As before, the deformed tangent vector is given by [dφ(p)]v. (Recall the reasoning for this: the derivative of φ maps infinitesimal changes in p (i.e. tangent vectors at p) to infinitesimal changes in φ.)
A few concrete examples:
• First consider a simple translation of the sheet, φ(p) = p + t for a translation vector t = (t_x, t_y). We have that Jφ = [1 0; 0 1], and so translating Ω leaves tangent vectors completely unchanged.
• Next consider a rotation: φ(p) = Rp where R = [r11 r12; r21 r22] is a rotation matrix. We can expand out φ as (r11 x + r12 y, r21 x + r22 y) so that

    Jφ = [r11 r12; r21 r22] = R
Figure 13.1: A two-dimensional deformation function φ maps points in an undeformed rectangular sheet Ω̄ to a deformed sheet Ω which has been squeezed in the horizontal direction. As in the 1D case, the deformation differential dφ maps tangent vectors on Ω̄ to tangent vectors on Ω; in particular, it maps an entire circle of tangent vectors to an ellipse of deformed tangent vectors (although the circle has been drawn with finite radius, more properly it should be thought of as an infinitesimal circle). The strain induced by φ can then be measured by looking at how much the vectors that make up the ellipse have been stretched or compressed.
and φ rotates tangent vectors by the same amount as it rotates the sheet.
• Finally consider a deformation that squeezes the sheet in the x direction and stretches it in y: φ(x, y) = (x/2, 2y). Then Jφ = [1/2 0; 0 2] compresses the x component of tangent vectors and stretches the y component, as one might expect.
Since we know how φ maps undeformed to deformed tangent vectors, the next step, if we’re following the roadmap from the previous section, is to define a strain. As before, this strain is a function of both the position on the material and a tangent direction at that point:

    ε(p, v) = (‖Jφ(p)v‖² − ‖v‖²) / (2‖v‖²),
where once again we have chosen the Green strain, this time to avoid square roots. As before, it’s still true that ε is invariant to rescaling v: the scale factor cancels out. But unlike in the 1D case, in 2D the strain is not independent of the direction v! We already saw this in the example above of φ(x, y) = (x/2, 2y): at the same point, some directions get stretched and some get compressed. It’s no longer possible to write down a single scalar at every point encoding the stretchedness of the material at that point.
We will need to modify the laws of elasticity to deal with this situation; though the intuition is still straightforward: the energy density at a point will need to consider the stretching in all directions. Before we list the new laws, we can make some simplifications to ε: as mentioned above, strain is scale-invariant, so we may as well assume that v is a unit tangent vector. We then get

    ε(p, v̂) = (‖Jφ(p)v̂‖² − 1) / 2.
One way of interpreting ε is that, at every point p, dφ maps the entire circle of unit tangent vectors to a deformed circle on Ω: since for a fixed p, Jφ is just a constant 2 × 2 matrix, the image of this circle must be a linear transformation of a circle: an ellipse. ε measures how much the ellipse has been distorted at any point along the circle (see figure 13.1).
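The direction dependence of the 2D Green strain is easy to see numerically. The sketch below (an illustration of my own, not code from the text) evaluates ε for the squeeze-and-stretch map φ(x, y) = (x/2, 2y) discussed above:

```python
import numpy as np

def green_strain(F, v):
    """Green strain of deformation gradient F in direction v:
    eps = (|Fv|^2 - |v|^2) / (2 |v|^2)."""
    v = np.asarray(v, dtype=float)
    Fv = F @ v
    return (Fv @ Fv - v @ v) / (2.0 * (v @ v))

# Jacobian of phi(x, y) = (x/2, 2y) from the text:
F = np.array([[0.5, 0.0],
              [0.0, 2.0]])
assert np.isclose(green_strain(F, [1.0, 0.0]), -0.375)  # compressed in x
assert np.isclose(green_strain(F, [0.0, 1.0]), 1.5)     # stretched in y
# The strain is invariant to rescaling v, but not to rotating it.
assert np.isclose(green_strain(F, [2.0, 0.0]), green_strain(F, [1.0, 0.0]))
```

The two asserts make the key point concrete: at the same point, one direction has negative strain and another positive, so a single scalar cannot encode the stretchedness.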
We can now state the laws of elasticity in 2D:
Laws of Elasticity
1. The energy functional is local: it can be written as

    ∫_Ω̄ W(p) dA,

where W is an energy density.
2. The energy density W at a point p ∈ Ω̄ depends only on the stretchedness (strain) ε(p, ·) at that point on the sheet.
3. The energy density is zero at p when ε(p, v) = 0 for any v.
4. The energy density is nonnegative.
These laws are almost entirely unchanged: the energy integral is now over an area instead of along a line, but the major change is that the energy density no longer depends on just a scalar strain ε(p): now it depends on the function ε(p, v) at one fixed p. What does this mean, exactly? We can write ε as

    ε(p, v̂) = v̂^T [ (Jφ(p)^T Jφ(p) − I) / 2 ] v̂,    (13.1)

where the 2 × 2 matrix

    S = (Jφ(p)^T Jφ(p) − I) / 2

now only depends on the point p and not on the direction. This formula is derived from the fact that for any vector w, ‖w‖² = w · w = w^T w. A couple of notes about S:
• The matrix S(p) is called the strain tensor since, from S and a direction v̂, you can compute the strain at that point in any direction.
• Note that the formula for S, as well as the fact that ε can be written in terms of S at all, is dependent on the fact that we decided to use the Green strain. Using a different strain formulation would lead to a different strain tensor formula than equation (13.1).
• The strain tensor is symmetric: S^T = S. In other words, S consists of three different scalars,

    S = [s11 s12; s12 s22].
Since the energy density is a functional of ε, the only possible quantities the energy density can depend on are the entries of S:

    W(p) = W̃[s11(p), s12(p), s22(p)].
This is the key point of this section: even though the strain can now vary depending on the
tangent direction v you look in at the point p, the stretchedness in every direction can still be
encoded in only three numbers!
There is one more idea worth discussing related to 2D elasticity. Some materials, like woven
cloth, behave differently depending on the direction they are stretched: these are called anisotropic
materials. In contrast, materials like rubber that behave the same way when stretched in any direction are called isotropic, and this extra symmetry can be used to write down a simpler expression
for their elastic energy. In the remainder of this tutorial we will assume isotropy: to the Laws of
Elasticity we will add:
5. The energy density is isotropic: W(ε(p, ·)) = W(ε(p, R·)) for any rotation R.
It is important to understand exactly what this law is saying: it is not true that ε(p, v) = ε(p, w) for any v, w: for instance, the sheet in figure 13.1 is only being stretched in one direction, and has zero strain in the vertical direction, no matter what material the sheet is made out of: the material doesn’t affect the definition of strain at all! Rather, isotropy says that the energy density associated to the function ε(p, v), for one p and all tangent vectors at that point, is the same as the energy density of the function you get by rotating all tangent vectors. In other words, if you think of dφ as mapping an infinitesimal circle of tangent vectors to an infinitesimal ellipse, and W as measuring the distortion of that ellipse, then in an isotropic material the energy density depends only on the shape of that ellipse, and not on its orientation.
If we call the strain in rotated coordinates ε_R (with ε_R(p, v) = ε(p, Rv)), we have from equation (13.1) that

    ε_R(p, v̂) = v̂^T R^T S R v̂.

The second law tells us that W can only depend on R^T S R, yet the fifth insists that the energy density is the same no matter what R is: what quantities of R^T S R can be measured that are the same for any R? The answer is: the eigenvalues of S.
In case your linear algebra is rusty, we will argue why this is true. S is symmetric, so it has real eigenvalues and it has a full set of orthonormal eigenvectors; by the spectral theorem it is therefore possible to decompose S = U^T D U where U is a matrix of eigenvectors of S and D a diagonal matrix of eigenvalues. Since the eigenvectors are orthonormal, U is a rotation matrix, and so ε_U(p, v̂) = v̂^T D v̂. Since W is only a functional of ε_U, this means that W must depend only on the entries of D; these are all zero except the diagonal, which holds the eigenvalues.
There are a couple of subtleties in the above argument that are worth bringing to light. First,
just because W can’t depend on anything except the eigenvalues of S, how do we know that it
can depend on the eigenvalues? Are we sure that the eigenvalues of S are the same as those of
RT SR for any R (otherwise, an energy density that depends on the eigenvalues would violate
law five)?
This is in fact easy to check: if v is an eigenvector of S with eigenvalue λ, then
RT SR(RT v) = RT Sv = RT λv = λ(RT v)
and RT v is an eigenvector of RT SR with the same eigenvalue λ.
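This invariance is also easy to verify numerically. The following quick check is illustrative only (the random symmetric matrix stands in for an arbitrary strain tensor):

```python
import numpy as np

# Check that S and R^T S R share the same eigenvalues, so an isotropic
# energy density written in terms of the eigenvalues is well defined.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
S = (A + A.T) / 2.0                     # an arbitrary symmetric "strain tensor"
for theta in (0.3, 0.7, 1.2):
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    assert np.allclose(np.sort(np.linalg.eigvalsh(S)),
                       np.sort(np.linalg.eigvalsh(R.T @ S @ R)))
```

Note that `eigvalsh`, the symmetric-matrix eigenvalue routine, is the right tool here: it guarantees real eigenvalues, matching the spectral theorem argument above.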
The second subtlety is that the energy can depend on the set of eigenvalues, but not on their order: this is because, if D is a diagonal matrix, it is possible to choose the rotation matrix R = [0 1; 1 0] so that R^T D R has permuted eigenvalues:

    [0 1; 1 0]^T [λ1 0; 0 λ2] [0 1; 1 0] = [0 λ2; λ1 0] [0 1; 1 0] = [λ2 0; 0 λ1].
The upshot of the above is that W, for isotropic materials, must depend on the unordered set of eigenvalues {λ1, λ2} of S: W(p) = W̃(λ1(p), λ2(p)). Now we can continue the argument we’ve seen twice before in previous chapters: at any given p, we Taylor expand W̃ to get

    W̃ = c0 + c1 λ1 + c2 λ2 + (c3/2) λ1² + (c4/2) λ2² + (c5/2) λ1 λ2 + (c6/6) λ1³ + · · ·
and recognize that we know a lot about the constants just from the laws of elasticity: c0 = c1 = c2 = 0 since the energy density is zero and has a local minimum when ε is zero for all v, which is the case when λ1 = λ2 = 0. We also know that c3 = c4 since W̃ can depend only on the unordered set of eigenvalues. Therefore

    W̃ ≈ (c3/2)(λ1² + λ2²) + (c5/2) λ1 λ2,

the 2D analogue of Hooke’s law! The constants are often reshuffled so that the energy density has the form
    W̃ ≈ α(λ1² + λ2²) + (β/2)(λ1 + λ2)²

because each of the two eigenvalue terms now has a very simple expression in terms of the matrix trace: λ1 + λ2 = tr S and λ1² + λ2² = tr S², leading to

    W̃ ≈ α tr(S²) + (β/2)(tr S)².    (13.2)

Some comments about this energy density:
• The constants c0 , c1 , . . . depend on the material the sheet is made out of, just as in the case
of the 1D spring or rubber band. Real materials will have non-zero higher-order coefficients.
Also, just as in the case of the spring, this material model only makes sense if also accompanied
by the formula used to define strain (in the case of the calculations above, the Green strain).
• The simplest material model, where only the quadratic coefficients are nonzero, is called the
Saint Venant-Kirchhoff model. Like Hooke’s law in 1D, it is most valid in the case of very
small deformations, where the eigenvalues λi are close to zero.
• Instead of a single stiffness parameter, the Saint Venant-Kirchhoff model has two material
parameters, called the Lamé parameters. These have intuitive geometric meaning: α measures
the resistance of the material to total compression/stretching in all directions, very much like
a “stiffness” in 1D. On the other hand β measures stretching that isn’t compensated by
compression in the other direction: for instance, a material with low α and high β would
strongly resist being uniformly compressed, but would not much resist being compressed
in the x direction and stretched in the y direction.
• Instead of α and β, the material behavior could be specified using any other pair of constants
that are functions of α and β. Engineering has tons of such constants, such as the Young’s
modulus, bulk modulus, shear modulus, etc., each of which measures a slightly different aspect
of the material’s behavior, but all of which can be converted into each other and into the two
Lamé parameters.
• The material parameters can vary over the surface, if, for example, the sheet is made out of
rubber that is stiffer towards one end than towards the other. Objects with the same material
parameters everywhere are called uniform.
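The trace form of equation (13.2) makes the Saint Venant-Kirchhoff density a one-liner to evaluate. The sketch below is illustrative (the function name is my own); it also verifies the equivalence between the trace form and the eigenvalue form derived above:

```python
import numpy as np

def stvk_density(F, alpha, beta):
    """Saint Venant-Kirchhoff energy density for a 2x2 deformation
    gradient F: W = alpha * tr(S^2) + (beta/2) * (tr S)^2, with
    Green strain tensor S = (F^T F - I) / 2."""
    S = (F.T @ F - np.eye(2)) / 2.0
    return alpha * np.trace(S @ S) + 0.5 * beta * np.trace(S)**2

# A rigid rotation induces zero strain, hence zero energy density.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
assert np.isclose(stvk_density(R, 1.0, 1.0), 0.0)

# The trace form agrees with the eigenvalue form of the density.
F = np.diag([2.0, 1.0])
lam = np.linalg.eigvalsh((F.T @ F - np.eye(2)) / 2.0)
assert np.isclose(stvk_density(F, 1.0, 1.0),
                  1.0 * (lam**2).sum() + 0.5 * 1.0 * lam.sum()**2)
```

The zero-energy rotation check is exactly law 3 at work: rotations leave the Green strain tensor zero, so any energy density built from S vanishes on them.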
Whew! Going from 1D objects to 2D objects is quite a leap. Key ideas in this section were:
1. the strain is no longer just a scalar at every point of the material. It now depends on direction
in a fundamental way, that doesn’t cancel out like it does in 1D. The energy density then has
to be formulated as a function of this strain function;
2. the strain at a point in any direction can be encoded by a matrix; and for isotropic materials, the only part of this matrix that matters for computing the elastic energy is its set of eigenvalues.
If you understand these concepts, the move from 2D to 3D will be almost disappointingly straightforward.
Bonus Math What if we don’t assume that the material is isotropic? Then it is no longer true that the energy density depends only on the eigenvalues of S: we have to look at all entries of S instead:

    W(p) = W̃[s11(p), s12(p), s22(p)].

From here, we can proceed as usual: we Taylor expand W̃ and eliminate the constant and linear terms to get

    W̃ ≈ (c1/2) s11² + (c2/2) s12² + (c3/2) s22² + (c4/2) s11 s12 + (c5/2) s11 s22 + (c6/2) s12 s22
and unfortunately there are no symmetries we can use to consolidate any of these constants (intuitively
it makes sense that a lot more parameters are needed, since information about the direction the material
is stiffest, and the direction it is weakest, and the disparity in stiffness, needs to be encoded.)
Instead of the entries of S, the energy density can be expressed in terms of the matrices in the spectral
decomposition U T DU , but this time the U matrices cannot be ignored. U is 2 × 2 and is made up of two
orthogonal unit vectors; a single angle specifies the first vector, and then the second vector is uniquely
determined (up to an irrelevant sign) by the fact that it is perpendicular to the first. Therefore all of U
is encoded by a single θ, and W can be written as a function of the eigenvalues of S and this single angle,
which intuitively represents the direction the material is strongest (or weakest):
W (p) = W̃ [λ1 (p), λ2 (p), θ(p)].
Notice that the Taylor expansion of W̃ will still have six terms, since θ will contribute a squared term
and two cross terms, and the λ21 and λ22 terms cannot be combined in the way it can when the material
is isotropic.
13.2 2D Discretization
We can discretize a 2D elastic sheet in the same way as we did the 1D rubber band: by choosing
a few key points q̄i on the sheet, tracking how they move over time, and writing φ in terms of the
deformed positions qi .
More specifically, consider a discretization of Ω̄ by a triangle mesh T with N vertices q̄_i and triangles T_ijk. Let q_i denote the deformed positions of the vertices; these are also our degrees of freedom, so that the discrete configuration is the vector q = (q_1, q_2, . . . , q_N). We can now
construct a function φq (p̄) that maps any point p̄ on Ω̄ to a deformed point in R2 . We do so as
follows:
• first, we locate the triangle Tijk that contains p̄;
• then we write p̄ in terms of the triangle corners. We do so using the barycentric coordinates
(u, v) of p̄: the scalars u and v satisfying
p̄ = q̄i + u(q̄j − q̄i ) + v(q̄k − q̄i ).
Notice that the barycentric coordinates can be computed by writing the above as a linear system

    [q̄_j − q̄_i   q̄_k − q̄_i] (u; v) = p̄ − q̄_i,

yielding

    (u; v) = [q̄_j − q̄_i   q̄_k − q̄_i]⁻¹ (p̄ − q̄_i).
• finally, the same barycentric coordinates can be used to find the corresponding point on the
deformed triangle,
p = qi + u(qj − qi ) + v(qk − qi ).
Putting the pieces together, we get that within each triangle T_ijk,

    φ_q(p̄) = q_i + [q_j − q_i   q_k − q_i] [q̄_j − q̄_i   q̄_k − q̄_i]⁻¹ (p̄ − q̄_i).
Notice that this function is therefore piecewise-linear over Ω̄. As in the case for the rubber band, it is possible to write φ in terms of a finite-dimensional basis of functions,

    φ_q(p̄) = Σ_{i=1}^N q_i b_i(p̄),

where the basis functions b_i : Ω̄ → R are piecewise-linear conical “hats” with value one at vertex q̄_i which linearly decreases to zero at the vertices neighboring i.
The strain, energy density, and overall elastic energy V (q) can then be computed as in the smooth case, using this discrete φ. On triangle T_ijk, the 2 × 2 Jacobian of φ is simply the constant matrix

    Jφ_ijk = [q_j − q_i   q_k − q_i] [q̄_j − q̄_i   q̄_k − q̄_i]⁻¹;
notice also that this Jacobian is linear in the vertex position degrees of freedom q_i. The Green strain tensor is then a constant matrix (quadratic in the vertex positions) over each triangle,

    S_ijk = (Jφ_ijk^T Jφ_ijk − I) / 2,

and since the strain is constant on each triangle, so is the energy density, and so the total elastic energy of the sheet (assuming a St. Venant-Kirchhoff material) can be computed by summing the energy contributed by each triangle:

    V(q) = Σ_{T_ijk} A_ijk [ α tr(S_ijk²) + (β/2)(tr S_ijk)² ],

where A_ijk is the area of triangle T_ijk on Ω̄. This energy is quartic in q, and is sometimes called the constant-strain triangle formulation of elasticity.
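A single term of this sum can be sketched as follows. This is an illustration of my own (the function name and array layout are assumptions), computing the constant Jacobian, the Green strain tensor, and the per-triangle St. Venant-Kirchhoff energy:

```python
import numpy as np

def triangle_energy(rest, deformed, alpha, beta):
    """Elastic energy of one constant-strain triangle (StVK material).
    rest, deformed: the three corner positions, as rows of 3x2 arrays."""
    Dbar = np.column_stack([rest[1] - rest[0], rest[2] - rest[0]])
    D = np.column_stack([deformed[1] - deformed[0], deformed[2] - deformed[0]])
    J = D @ np.linalg.inv(Dbar)            # constant 2x2 Jacobian of phi
    S = (J.T @ J - np.eye(2)) / 2.0        # Green strain tensor
    A = 0.5 * abs(np.linalg.det(Dbar))     # rest area of the triangle
    return A * (alpha * np.trace(S @ S) + 0.5 * beta * np.trace(S)**2)

# A rigid translation of the triangle costs no energy.
rest = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
assert np.isclose(triangle_energy(rest, rest + np.array([3.0, 4.0]), 1.0, 1.0), 0.0)
```

Summing this quantity over all triangles of the mesh gives V(q); since each term is quartic in the corner positions, so is the total energy, as noted above.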
When is it OK to use a constant-strain-triangle discretization? Is it enough to assume a piecewise-linear deformation of Ω̄?
Indeed, other discretizations are possible: for instance, instead of assuming a piecewise-linear deformation, we could instead allow piecewise-quadratic deformation. To do so we would need to add more degrees of freedom per triangle (since knowing a triangle’s three deformed corner positions q_i is not enough to uniquely pin down a quadratic deformation), and derive new expressions for strain and energy density (which will no longer be constant over each triangle). Many
other schemes are possible. When analyzing the usefulness of a discretization, there are several criteria
we can examine (much as there are several criteria that can be used to pick a “good” time integrator, as
discussed in chapter 4). A few of these are listed here:
• consistency: Suppose you have a sequence of finer and finer meshes T_i (with N_1 < N_2 < · · · triangles) and corresponding discrete deformations φ_i, where the finer the mesh, the better φ_i approximates a continuous deformation φ of Ω̄. A discretization scheme is consistent if the discrete elastic energy also approaches that of φ as you refine the mesh: lim_{i→∞} V(φ_i) = V(φ).
You might wonder, when does it ever make sense to use an inconsistent method? Usually the central principle of discretization is to approximate smooth physics, and if a larger memory and computation budget does not buy you increased accuracy, a method’s usefulness is dubious. But analyzing consistency of a method is typically very difficult, with many stipulations and caveats. For example, the constant-strain triangle discretization is consistent if:
– the triangles of T_i stay “nicely-shaped” (close to equilateral). A super-fine triangle mesh does not allow you to accurately estimate V if many triangles are long, skinny slivers;
– the mesh resolution is uniform: refining only part of the mesh increases the total number of triangles, but obviously adding more triangles to an already-fine region of T will not improve the quality of φ_i in coarse regions of the mesh;
– the limit deformation φ is sufficiently smooth (has bounded higher derivatives).
To my knowledge, there does not exist a bulletproof method that gives good results no matter how
bad the quality of the triangle mesh. (As a very important practical consequence, if you want good
results from a physical simulation, be sure to use a reasonable mesh!)
• order of convergence: consistency is a bare-minimum requirement. Order of convergence gives a finer-grained measurement of how quickly the error in energy converges to zero as you increase the resolution N_i of the mesh. If the energy error scales like O(1/N_i^s), then the scheme is said to be order s. The benefit of higher order of convergence is that the higher s is, the fewer triangles are needed to estimate V with the same error. The hat basis’s constant-strain formulation has linear (order one) convergence.
• computation cost: a high-order method may produce very accurate energy estimates using very few triangles, but there is no free lunch, as the formula for V (q) might be complex and expensive
to compute. The main benefit of the hat basis is that strain is constant on each triangle, so that the
energy density can be integrated up in closed form. Most advanced discretization methods do not
have a closed form for V : instead, the integral itself must be approximated, using algorithms called
numerical quadrature methods. The details of such methods are beyond the scope of this chapter; the
key point is that the need for quadrature adds complexity and expense to the simulation.
13.2.1 Kinetic Energy
As in the case of rubber bands, the kinetic energy can be discretized by assuming that the triangle
vertices (and thus all mass points on Ω̄) move along an affine trajectory between time steps:
    T(q^i, q^{i+1}) = ∫_Ω̄ (ρ/2) ‖ Σ_{j=1}^N ((q_j^{i+1} − q_j^i)/h) b_j(p̄) ‖² dp̄,
for time step size h and assuming the sheet has constant density ρ. Also as in the rubber band
case, this energy can be written in standard form, with mass matrix M that depends only on the
rest triangle vertex positions q̄i . M can thus be precomputed once; it is sparse (Mij = 0 if vertices
i and j are not edge-adjacent) but not diagonal.
Also as with rubber bands, a different “lumped” or barycentric mass matrix is popular in practice:

    T(q^i, q^{i+1}) = (1/2) ((q^{i+1} − q^i)/h)^T M^B ((q^{i+1} − q^i)/h),    M^B_jj = ρ Σ_{T∼j} A_T / 3,

where T ∼ j is the set of triangles that contain vertex j, and A_T is the area of triangle T: in other words, each triangle “gives” a third of its area to each of its three vertices. The advantage of M^B is that it is diagonal, and hence trivial to invert.
13.3 Elasticity in 3D
Almost nothing changes in three dimensions. Given an undeformed volume Ω̄, deformations of the volume are encoded by a smooth function φ : Ω̄ → R³. The Green strain of φ is exactly as it was in 2D:

    ε(p̄, v̂) = v̂^T [ (Jφ(p̄)^T Jφ(p̄) − I) / 2 ] v̂.

For a uniform, isotropic material, the energy density is a function of the three eigenvalues of the strain tensor, where again the linear and constant terms must be zero:

    W̃ ≈ c0 (λ1² + λ2² + λ3²) + c1 (λ1 λ2 + λ1 λ3 + λ2 λ3),
and again, exactly as in 2D, this energy can be written in compact form using matrix trace; see
equation (13.2).
Three-dimensional elastic volumes can be discretized using a tetrahedral mesh. The exact same
strategy can be used to construct a φ from the undeformed and deformed vertex positions as was
used in 2D: each point p̄ can be written in the barycentric coordinates of its enclosing tetrahedron,
so that φq is piecewise-linear and the strain is piecewise-constant.
Chapter 14
Principal Modes
As the eyes, said I, seem formed for
studying astronomy, so do the ears seem
formed for harmonious motions: and these
seem to be twin sciences to one another, as
also the Pythagoreans say.
Plato
Discretizations in the previous chapters have consisted of choosing a set of privileged points (vertices), and describing the deformation of a continuum of material in terms of the motion of those points (in particular, we have described how to write down deformation that is piecewise-linear in the displacements of the vertices).
Is there a better choice for how to discretize deformation? In this chapter we will look at how to
pick a linear subspace of deformations of an object that are “most likely,” in a mathematical sense.
In addition to giving us an alternative (and sometimes, more efficient) discretization of volumetric
objects, these principal modes will also allow us to simulate objects in certain situations using an
arbitrarily large time step, with no instability or error.
14.1 Geometry of Equilibrium
Suppose we have a particle system with potential V (q), so that the equations of motion are given
by
M q̈ = −[dV (q)]T .
The system is in static equilibrium if the configuration will not change over time, until perturbed by external forces. This will occur when the acceleration on the configuration is zero, i.e. when dV(q) = 0. Let q_0 be such an equilibrium configuration. Notice that, by definition, q_0 is a critical point of the potential energy V.
We can further classify the type of equilibrium at q0 : the equilibrium is stable if q0 is a local
minimum of V . In this case, small perturbations of the configuration away from q0 will yield
forces that pull q back towards q0 . The equilibrium is unstable otherwise, i.e. when q0 is a local
maximum or saddle point of V . At an unstable equilibrium, a nudge in a descent direction will
cause the configuration to slide down the energy landscape away from q0 and not return.
145
How can we tell if q0 is stable or unstable algebraically? We can Taylor expand the potential V about q0:

V(q0 + δq) = V(q0) + JV(q0)δq + ½ δqᵀ HV(q0) δq + O(‖δq‖³)

where HV is the Hessian matrix J[JV]ᵀ of second derivatives of V. Since q0 is a critical point of
V , the linear term vanishes. If we further assume that the deviation δq away from q0 is small, so
that the higher-order terms in the expansion are negligible, we get

V(q0 + δq) ≈ V(q0) + ½ δqᵀ HV δq.
Equality of mixed partial derivatives tells us that HV is symmetric, and so has a full set of real
eigenvalues λi with corresponding eigenvectors vi . As always we may assume that the eigenvectors
are normalized, so that kvi k = 1.
Let us now look at what happens to the energy if we take a small step ε away from q0 in the vi direction:

V(q0 + εvi) = V(q0) + (ε²/2) viᵀ HV vi = V(q0) + λi ε²/2.
Therefore our step decreased V if λi < 0, and increased V if λi > 0. If all of the eigenvalues are
positive, we know that if we move away from q0 in any direction, we will increase V : therefore we
must be at a local minimum at q0 .
This observation motivates the so-called “second derivative test” for multivariable functions: a
critical point q0 of V is
• a local minimum, if HV is positive-definite;
• a local maximum, if HV is negative-definite;
• a saddle point, if HV is indefinite (has at least one positive and at least one negative eigenvalue);
• of unknown character, if HV is singular but not indefinite.
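The second derivative test is easy to carry out numerically: compute the eigenvalues of the Hessian and compare them against zero. The sketch below assumes NumPy is available; the potential V(x, y) = x² − y² and its Hessian are illustrative inventions, not examples from the text.

```python
import numpy as np

def classify_critical_point(H, tol=1e-10):
    """Classify a critical point of V from the eigenvalues of its Hessian H."""
    evals = np.linalg.eigvalsh(H)  # H is symmetric, so eigenvalues are real
    if np.min(evals) > tol:
        return "local minimum"
    if np.max(evals) < -tol:
        return "local maximum"
    if np.min(evals) < -tol and np.max(evals) > tol:
        return "saddle point"
    return "unknown (singular Hessian)"

# Hessian of V(x, y) = x^2 - y^2 at its critical point (0, 0):
H = np.array([[2.0, 0.0], [0.0, -2.0]])
print(classify_critical_point(H))  # saddle point
```

In practice the tolerance must be chosen with the symmetry caveats below in mind: zero eigenvalues from translation modes should be recognized and ignored rather than reported as "unknown."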
Why are there cases where the second derivative test fails? Recall that in the Taylor expansion of V about q0, we assumed that the third and higher-order terms are negligible compared to the second-order term, and ignored them in our analysis of how moving in the vi directions changes V. When HV is nonsingular, this reasoning is sound: for sufficiently small ε, the higher-order terms decay more quickly than the ε² term, under mild assumptions on the smoothness of V. But when HV is singular, it is the third-order term that dominates V's behavior in a direction that lies in the kernel of HV, and that term cannot be neglected.
We see this corner case arise even in one-dimensional configuration space; V (q) = q 3 has a saddle point
at q0 = 0, and V (q) = q 4 has a minimum there, and it is impossible to distinguish the two cases just by
looking at the first and second derivatives of V .
In practice, it does arise quite often that HV is singular, due to symmetries of the physical system. For example, suppose V represents the elastic energy of a volumetric body. Since the energy is unchanged by rigid motions of the body, we know that the kernel of HV contains, at minimum, the configurational tangent vectors corresponding to translation along each axis. (Rigid rotations also leave V unchanged, but since a rotation cannot be represented by moving in a constant direction on Q, these motions are not guaranteed to contribute to the kernel of HV.) There are two general approaches to dealing with this singularity:
• be aware of their presence, and explicitly ignore them. I.e. if HV has all positive eigenvalues except for three zero eigenvalues, whose eigenvectors you know correspond to translations, then you know that you are at a local minimum of V, despite technically failing the second derivative test.
• pick new coordinates for your physical system that remove the symmetries leading to singularity. For example, by pinning one point of the body to a fixed location in space, the body is no longer free to translate.
14.2 Principal Modes
Let us now assume that q0 is a stable equilibrium of the system. Suppose we want to simulate
the system, starting from a configuration near q0 . Instead of using q as our degrees of freedom,
let us introduce new degrees of freedom d, displacements away from q0 , related to the original
coordinates by
q = q0 + d.
The equations of motion with respect to these new degrees of freedom are
M d̈ = −[dV(q0 + d)]ᵀ,
and let us now Taylor-expand the right-hand side of these equations to get
M d̈ = −[dV(q0)]ᵀ − HV(q0)d + O(‖d‖²).
If we once again use the fact that q0 is an equilibrium, and that d is small, we get
M d̈ ≈ −HV(q0)d.    (14.1)
Notice that the matrix HV (q0 ) is constant and depends only on the equilibrium point q0 , not on
the motion of the system encoded in the degrees of freedom d.
It turns out that this differential equation is simple enough that it can be solved in closed form.
Perhaps you’ve already seen differential equations of the same form, in one dimension: for a constant
k > 0, ÿ = −ky is√a classic simple
√ (linear, second-order) ODE called the wave equation which is
solved by y = sin( kt), y = cos( kt), or any linear combination of these two fundamental solutions.
Equation (14.1) adds the extra wrinkle that the differential equation involves several variables, but
the form of the solution is much the same. Let λi and vi be the generalized eigenvalues and
eigenvectors of HV (q0 ) with respect to M , i.e., solutions to
HV (q0 )vi = λi M vi .
(For more details on the relationship between eigenvectors, eigenvalues, and generalized eigenvalues, see chapter 1). If M is symmetric positive-definite, it turns out¹ that the generalized eigenvalues are positive if and only if HV(q0) has positive eigenvalues. And we know this is true since q0 was assumed to be a stable equilibrium of the system.
Then one can check that d(t) = A vi e^{±i√λi t} is a solution to the equations of motion, for any constant A. Indeed,

M d̈ = −λi A M vi e^{±i√λi t} = −HV(q0) A vi e^{±i√λi t}.
¹The generalized eigenvalues are the same as the ordinary eigenvalues of the (non-symmetric) matrix M⁻¹HV. Since the spectrum of a product of matrices is invariant under cyclic permutation, the generalized eigenvalues also match the ordinary eigenvalues of M^{-1/2} HV M^{-1/2}, which is congruent to HV and hence is positive-definite exactly when HV is.
Since the equations of motion (14.1) are linear differential equations satisfying the superposition
property, with the sum of two solutions still a solution, the general solution d(t) is of the form
d(t) = Σi ( αi vi e^{i√λi t} + βi vi e^{−i√λi t} )
where the coefficients αi and βi depend on the initial conditions d(0) and ḋ(0). Since we know
these initial conditions are real, we can perform the usual rewriting of the exponential as sines and
cosines to get
d(t) = Σi ( ηi vi sin(√λi t) + µi vi cos(√λi t) ).    (14.2)
Plugging in t = 0 we can solve for the values of µi. In matrix form,
U µ = d(0)
where U is the matrix with columns vi. Similarly,

U Λ^{1/2} η = ḋ(0)

where Λ^{1/2} denotes the diagonal matrix with entries √λi. If we are given the initial conditions d(0) and ḋ(0) of the system, we can solve these linear equations for µ and η. The displacement d(t) from q0 can then be evaluated in closed form, at any time t in the future, using equation (14.2).
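The whole recipe (solve the generalized eigenproblem, fit µ and η to the initial conditions, then evaluate d(t) in closed form) can be sketched in a few lines, assuming SciPy is available; the two-DOF mass and Hessian matrices below are made-up stand-ins, not examples from the text.

```python
import numpy as np
from scipy.linalg import eigh

M = np.diag([1.0, 2.0])                      # mass matrix (SPD)
H = np.array([[3.0, -1.0], [-1.0, 2.0]])     # Hessian of V at equilibrium (SPD)

# Generalized eigenproblem H v = lambda M v; eigh normalizes so U^T M U = I.
lam, U = eigh(H, M)

d0  = np.array([0.1, 0.0])                   # initial displacement d(0)
v0  = np.array([0.0, 0.2])                   # initial velocity d'(0)
mu  = np.linalg.solve(U, d0)                 # solve U mu = d(0)
eta = np.linalg.solve(U * np.sqrt(lam), v0)  # solve U Lambda^{1/2} eta = d'(0)

def d(t):
    """Evaluate the displacement at any time t, in closed form."""
    w = np.sqrt(lam)
    return U @ (eta * np.sin(w * t) + mu * np.cos(w * t))

# At t = 0 we recover the initial displacement exactly.
print(np.allclose(d(0.0), d0))  # True
```

Note that `d(t)` costs the same no matter how large t is; there is no time stepping and hence no accumulation of error or instability.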
Summary By assuming that the trajectory of the system stays close to q0 , and by changing
coordinates, we made several remarkable observations:
• the trajectory of the system becomes easily computable, in closed form, given the coefficients
µ and η, which can be easily calculated from the initial conditions by solving linear systems.
• the trajectory is a superposition of principal modes, each of which looks like a zero-rest-length spring: the eigenvector vi is the "orientation" of the spring in configuration space, and √λi the frequency of oscillation of the spring. This observation is quite deep: any physical system, sufficiently close to an equilibrium state, behaves like a mass-spring system!
• there is a natural ordering of the principal modes, given by the sorting of the eigenvalues
λ0 ≤ λ1 ≤ . . . . Each mode is oscillating independently, and so there is a different amount of
energy stored in each mode, depending on its amplitude. If we assume that the generalized eigenvectors vi are normalized, so that viᵀM vi = 1, the energy stored in the pair of modes with eigenvalue λi is

(1/2)(ηi² + µi²) λi
and so the modes with smallest eigenvalue are the lowest-energy modes—the deformations
that are coarsest and “easiest” to induce in an object. The higher λi , the more energy is
required to excite that mode in the object.
Of course, the formula for d(t) is only valid if we assume that the higher-order terms in the
Taylor expansion of the force −[dV ]T can be neglected. This is often the case for volumetric elastic
bodies, which jiggle or squish slightly when perturbed but do not undergo large changes in shape.
The assumption is not fruitful for objects like cloth, where large changes in shape (e.g. from a flat sheet to a folded sheet) can involve only small changes in potential energy.
14.3 Dimension Reduction
The stationary solution d(t) derived in the previous section allows a simulation to take an arbitrarily large time step in constant time—but it assumes that no external forces act on the system, and so
is not useful for simulations involving contact or other interactions between objects. But we can
use the observation that the principal modes with smallest eigenvalue are the most energetically
favorable, and hence the most “important” or “likely,” as a starting point for deriving a new
discretization of a deformable object: instead of the coordinates q or d, which might involve many
thousands or millions of degrees of freedom, depending on the complexity of the object, we can
allow only deformations of the form
q(t) = q0 + Σ_{i=0}^{N} αi(t) vi.
In other words, we restrict our simulation to a well-chosen subspace of configuration space, consisting
of only the N “most important” deformations of the object. The degrees of freedom are now the
time-varying coefficients α that tell us where we are in that subspace. Choosing N is not an exact
science: too small, and the possible deformations of the objects will be too constrained, and the
object will act stiffer and more rigid than it should; too high, and we lose the performance benefit
of reducing the number of degrees of freedom in our discretization.
Writing the above equation in matrix form, q(t) = q0 + U α, the kinetic energy of the system
is now

T(q̇) = ½ q̇ᵀ M q̇ = ½ α̇ᵀ UᵀMU α̇.
Since the generalized eigenvectors vi are orthogonal with respect to the inner product M, and normalized, UᵀMU = I and the identity matrix now takes the role of the system mass matrix:

T(q̇) = ½ α̇ᵀα̇.
Similarly

V(q) ≈ ½ αᵀΓα,

where Γ = UᵀHV(q0)U is the diagonal matrix with entries λi, so the equations of motion are

α̈ = −Γα,
which agrees with the solutions derived in the previous section, where αi is a linear combination of sin(√λi t) and cos(√λi t). External forces can now be added using the principle of virtual work: if
Fext acts on a point p(α), the equations of motion incorporating this external force become
α̈ = −Γα + [dp]ᵀ Fext.
The simplicity and efficiency of this discretization makes it an attractive option for simulating volumetric objects that do not deform much, or in too complex a manner, in performance-critical applications like games or interactive tools.
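A minimal sketch of this reduced discretization, assuming SciPy is available: we keep the N lowest modes of a made-up stand-in Hessian and integrate α̈ = −Γα with symplectic Euler (one of many possible integrators; when no external forces act, the closed-form solution of the previous section could be used instead).

```python
import numpy as np
from scipy.linalg import eigh

n, N = 6, 2                                   # full DOFs, retained modes
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)                   # SPD stand-in Hessian
M = np.eye(n)                                 # unit mass matrix for simplicity

lam, V = eigh(H, M)                           # generalized eigenproblem
U, Gamma = V[:, :N], np.diag(lam[:N])         # keep the N lowest-energy modes

alpha = np.zeros(N)                           # reduced degrees of freedom
alpha_dot = np.array([1.0, 0.0])              # kick the lowest mode
h = 1e-3
for _ in range(1000):                         # symplectic Euler on alpha
    alpha_dot -= h * (Gamma @ alpha)
    alpha += h * alpha_dot

q = U @ alpha                                 # map back to full coordinates
print(q.shape)  # (6,)
```

The integration loop touches only N coordinates per step, regardless of how many vertices the full object has; only the final map back to q involves the full dimension.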
Chapter 15
Dirichlet Energy and the Laplacian
One of the most remarkable properties of
the operator ∇ is that when repeated it
becomes ∇² = −(∂²/∂x² + ∂²/∂y² + ∂²/∂z²),
an operator occurring in all parts of Physics,
which we may refer to as Laplace's
Operator.
James Clerk Maxwell
We will next look at one of the most important concepts in physical simulation, which arises
in systems involving diffusion, heat flow, wave propagation, surface tension, elastic bending, soap
bubbles, sounds and vibrations, and many others: the Laplacian ∆. But we will approach this
topic gently, starting with the comfortable setting of 1D functions.
15.1 Dirichlet Energy
Let H be the set of smooth real-valued functions on the unit interval [0, 1] with derivative zero at the boundaries: f′(0) = f′(1) = 0. Notice the following two facts about H:
• the sum f + g of any two functions f, g ∈ H is also in H, since the sum of smooth functions is smooth, and adding together two functions with f′(0) = 0 and f′(1) = 0 keeps the derivative zero at the boundaries;
• for any scalar α ∈ R and f ∈ H, also αf ∈ H.
These two facts imply that H is a vector space (over the reals). Unlike more common vector spaces,
like Rn , this vector space is infinite-dimensional (and it may not be entirely obvious how to find a
basis for H; hold that thought.)
We can introduce the L2 inner product on H defined by

⟨f, g⟩ = ∫₀¹ f(x)g(x) dx.
One can check that this inner product satisfies the usual requirements from chapter 1:
• it is symmetric, ⟨f, g⟩ = ⟨g, f⟩;
• it is bilinear in its inputs:
⟨f, αg + h⟩ = α⟨f, g⟩ + ⟨f, h⟩
and likewise for the first input, by symmetry;
• it is positive: ⟨f, f⟩ ≥ 0, with equality only when f is the constant function f(x) = 0.
Bonus Math This last claim is by far the most subtle. Suppose f(a) = b ≠ 0 for some value of a ∈ [0, 1]. We can assume, without loss of generality, that b > 0. Then consider the inverse image f⁻¹[(b/2, ∞)]. Since (b/2, ∞) is open and f is continuous, so is the inverse image. Therefore there exists some nonempty interval (x1, x2) containing a with f(x) > b/2 whenever x ∈ (x1, x2). We then have that

⟨f, f⟩ = ∫₀¹ f² dx ≥ ∫_{x1}^{x2} f² dx ≥ ∫_{x1}^{x2} (b²/4) dx = (x2 − x1)b²/4 > 0.

Notice that this argument required the continuity of f; if f is allowed discontinuous points, it is quite possible for its norm to be zero, as for instance for the function

f(x) = 0 for x ≠ ½,   f(x) = 1 for x = ½.

Less explicitly, we have used the fact that the interval [0, 1] is compact, since otherwise there is no guarantee that the integral ∫₀¹ f² dx is even well-defined.
Now consider the Dirichlet energy

V(f) = ½⟨f′, f′⟩ = ½ ∫₀¹ [f′(x)]² dx.
We have encountered these kinds of functionals—functions that take in functions—before, in the
chapters on elasticity. Notice a few obvious properties of V : it is always nonnegative, and zero
if and only if f is a constant function. Notice also what V measures intuitively: it is larger the
more oscillatory f is, and smaller the smoother f is. Clearly the smoothest possible function is a
constant function.
Let us now consider the gradient of Dirichlet energy. This notion is a bit tricky to pin down,
since V is a function of functions, but we have seen this type of gradient before, when looking
at elasticity. We first define a directional derivative of V : if f and δf are functions in H, the
directional derivative of V at f in the direction δf is

(d/dε) V(f + εδf) |_{ε→0},

i.e., the infinitesimal change in V due to an infinitesimal perturbation of f in the δf direction.
We then define the gradient ∇V to be the function with the property that for any perturbation δf ∈ H, the inner product of ∇V and δf tells us the directional derivative of V in the δf direction:

⟨∇V, δf⟩ = (d/dε) V(f + εδf) |_{ε→0}.
It is not obvious that such a function ∇V even exists!¹ But let us try to calculate it, by simplifying the right-hand side. Plugging in definitions we have

(d/dε) ½ ∫₀¹ [f′(x) + εδf′(x)]² dx |_{ε→0} = ∫₀¹ (d/dε) ½ [f′(x) + εδf′(x)]² dx |_{ε→0}
  = ∫₀¹ (f′(x) + εδf′(x)) δf′(x) dx |_{ε→0}
  = ∫₀¹ f′(x) δf′(x) dx.
Here we were allowed to swap the order of differentiation and integration since the variables x and ε have nothing to do with each other. Now we apply integration by parts:
∫₀¹ f′(x)δf′(x) dx = [f′(x)δf(x)]_{x=0}^{x=1} − ∫₀¹ f″(x)δf(x) dx = −∫₀¹ f″(x)δf(x) dx.

Here the boundary term vanished since f ∈ H, so that f′(0) = f′(1) = 0. Notice that this last
integral looks just like an inner product,
−∫₀¹ f″(x)δf(x) dx = ⟨−f″, δf⟩,

so that we have our gradient: ∇V = −f″. Just as in the finite-dimensional case, this gradient encodes the "direction of quickest increase" in Dirichlet energy of f: it tries to make mountains into even taller mountains, and valleys into deeper valleys.
15.2 The Laplacian
The gradient of Dirichlet energy is so important that it has a special name: the Laplacian of f ∈ H, written ∆f, is the negative gradient of the Dirichlet energy at f:

∆f = f″.
Observe how ∆f relates to the original function f . In places where f has a valley—a local
minimum—∆f is positive, and flowing f in the direction of ∆f “fills in” the valley. In places where
f has a mountain—a local maximum—∆f is negative and flowing f in its direction “flattens” the
mountain. In places where f is a straight line, ∆f = 0. All of this is in agreement with the fact
that ∆f is the negative gradient (direction of steepest descent) of the Dirichlet energy.
Warning The historical definition of the Laplacian is as the positive gradient of Dirichlet energy, and this definition is still occasionally used, especially in mathematics. Be aware when reading textbooks and papers that the authors may be using a different sign convention for ∆ than that chosen here.
We can write down a differential equation that flows f in the direction of its Laplacian, which
will smooth out f over time: this is the heat equation or diffusion equation. To fully specify this
equation, we write f as a function of both position x ∈ [0, 1] and time t ≥ 0. We also need to pick
¹Also notice that we do not require that ∇V ∈ H; in general it will not satisfy the boundary conditions. But there is no trouble extending the inner product to allow the expression on the left-hand side.
• the initial temperature f0 (x) = f (x, 0) along the interval [0, 1];
• boundary conditions for what happens at the interval endpoints x = 0 and x = 1. If we insist on f(x, t) ∈ H, we are requiring that no temperature flows in or out of the interval at the boundary; these boundary conditions are known as Neumann boundary conditions.
• a rate constant α controlling how quickly heat diffuses over the interval.
With this information, the heat equation is

∂f/∂t (x, t) = α∆f(x, t),   x ∈ (0, 1)
∂f/∂x (0, t) = 0,
∂f/∂x (1, t) = 0,
f(x, 0) = f0(x).
As t → ∞, the temperature reaches a steady state where ∂f/∂t = 0 even on the interior; this steady state thus satisfies α∆f = 0, so that f is a linear polynomial; the only linear polynomial with zero derivative at the interval boundaries is the constant function. So over time, no matter the initial conditions f0(x), the temperature on the interval evens out to a constant value, exactly as we might expect from intuition.
Spectrum of the Laplacian Notice that ∆ is a linear operator on functions: for any two functions f, g ∈ H and α ∈ R,

∆(f + αg) = ∆f + α∆g,

so ∆ can be thought of as an infinite-dimensional matrix acting on "infinite-dimensional" vectors in H.
In analogy to eigenvectors of matrices, we can look for functions f ∈ H with
∆f = λf
for some scalar λ. In analogy to a symmetric matrix, the Laplacian is a “symmetric” operator:
⟨f, ∆g⟩ = ∫₀¹ f(x)g″(x) dx
  = [f(x)g′(x)]_{x=0}^{x=1} − ∫₀¹ f′(x)g′(x) dx
  = 0 − [f′(x)g(x)]_{x=0}^{x=1} + ∫₀¹ f″(x)g(x) dx
  = ⟨∆f, g⟩.
In finite dimensions, being symmetric guarantees that a matrix has a full set of real eigenvalues.
We might thus expect that ∆ has real eigenvalues; this turns out to indeed be true. The solutions to this eigenvalue problem on H are

fn(x) = cos(nπx),   λn = −n²π²,   n ∈ Z.
We can ignore the eigenfunctions with n < 0, since these are the same as the eigenfunctions with positive n (since cosine is an even function). Notice that all eigenvalues are nonpositive: this is not surprising, since for any function f ∈ H with f′(0) = f′(1) = 0, we can reverse the calculations in the derivation of the Laplacian to get

⟨f, ∆f⟩ = ∫₀¹ f(x)f″(x) dx = −∫₀¹ f′(x)f′(x) dx = −2V(f)

and the Dirichlet energy is always nonnegative. One eigenfunction is special: the constant eigenfunction f(x) = 1, with eigenvalue zero.
The eigenfunctions of the Laplacian are orthogonal, in the sense that their inner products are zero:

⟨fi, fj⟩ = 0,   i ≠ j.

This is easy to see since two distinct eigenfunctions of ∆ have unequal eigenvalues (use the symmetry of ∆ described above).
Now let us look again at the heat equation. We can take the initial temperature f0(x) and split it into a linear combination of eigenfunctions of the Laplacian, so that

f0(x) = Σ_{n=0}^{∞} αn⁰ cos(nπx).

Similarly we can decompose f(x, t) into these same pieces, where the coefficients of the eigenfunctions now depend on time (notice that the Neumann boundary conditions are crucial to our ability to do this rewriting):

f(x, t) = Σ_{n=0}^{∞} αn(t) cos(nπx).    (15.1)
The heat equation then becomes

Σ_{n=0}^{∞} α̇n(t) cos(nπx) = Σ_{n=0}^{∞} −α n²π² αn(t) cos(nπx),
αn(0) = αn⁰.

Since the Laplacian eigenfunctions are orthogonal, the first equation splits into a different equation for each value of n; hence α̇n(t) = −α n²π² αn(t), which has the solution

αn(t) = αn⁰ e^{−α n²π² t}.
There are several important punchlines here:
1. Over time, each of the coefficients αn(t) decays to zero exponentially, except α0(t), the coefficient of the constant eigenfunction. Therefore as t → ∞, the only part of f(x, t) that remains nonzero is the "constant part" of f, as expected.
2. The higher-frequency components of the initial temperature f0 (x) decay more quickly than
the lower-frequency components, since they have larger n.
3. If we know the decomposition of f0 (x) into the form in equation (15.1), then we can very easily
compute the temperature f (x, t) at any position and time in closed form, without needing to
perform any discretization. This is precisely the main idea behind the use of the (discrete)
Fourier transform in signal processing.
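Punchline 3 can be tried directly, assuming NumPy is available: project an initial temperature onto finitely many cosine eigenfunctions by quadrature, then evaluate f(x, t) in closed form at any time. The step-function initial temperature and all numerical parameters below are illustrative choices.

```python
import numpy as np

modes, samples, alpha = 32, 2000, 1.0
x = (np.arange(samples) + 0.5) / samples      # midpoint quadrature nodes
f0 = np.where(x < 0.5, 1.0, 0.0)              # initial temperature

# Coefficients of f0 in the cos(n pi x) basis; the n = 0 mode is special
# since its L2 norm on [0, 1] is 1 rather than 1/2.
coeffs = np.array([np.mean(f0) if n == 0
                   else 2.0 * np.mean(f0 * np.cos(n * np.pi * x))
                   for n in range(modes)])

def f(t):
    """Temperature profile at time t, evaluated in closed form."""
    decay = np.exp(-alpha * (np.arange(modes) * np.pi) ** 2 * t)
    basis = np.cos(np.outer(np.arange(modes), np.pi * x))
    return (coeffs * decay) @ basis

# As t grows, the temperature evens out to the mean of f0 (here 0.5).
print(np.round(f(1.0).mean(), 3))  # 0.5
```

No time stepping occurs here; evaluating the temperature at t = 1000 costs exactly the same as at t = 0.01.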
Relationship to Model Reduction You should be noticing many connections between the
ideas discussed in this section, and those of the previous chapter on principal modes of physical
systems with potential energy V (q). These are not at all accidental: in fact, the Laplacian is exactly
the infinite-dimensional analogue of the Hessian of the elastic energy of a rubber band, after the
degrees of freedom of the rubber band are rewritten to measure deviation away from the rubber
band’s equilibrium state φ(x̄) = x̄. The constant eigenfunction is the zero-energy translation mode,
and the other eigenfunctions encode higher and higher frequency (and energy) deformations of the
rubber band.
Bonus Math We already showed in chapter 12 that the first derivative of the 1D elastic energy is

lim_{ε→0} (d/dε) V(φ + εδφ) ∝ ∫₀ᴸ ϵ(x̄)φ′(x̄) δφ′(x̄) dx̄,

where ϵ(x̄) denotes the strain. The second variation of V, in the directions δφ1 and δφ2, is then

lim_{ε,µ→0} (d²/dµdε) V(φ + εδφ1 + µδφ2) ∝ ∫₀ᴸ [φ′(x̄)² δφ2′(x̄) + ϵ(x̄)δφ2′(x̄)] δφ1′(x̄) dx̄,

which when evaluated at the equilibrium configuration where φ(x̄) = x̄ simplifies to

⟨δφ1′, δφ2′⟩ = −⟨δφ2, ∆δφ1⟩,

where to get the second expression we integrated by parts.
15.3 Discretizing the Laplacian
Let us now repeat in the discrete setting the derivation we just completed in the smooth setting,
which involves the following steps:
• formulate a discretization of the function space H;
• write down a Dirichlet energy on functions belonging to this space;
• define the Laplacian as the negative gradient of the Dirichlet energy.
We need a way of representing discrete functions on the unit interval, and of taking their inner products. One natural discretization of H is to specify the function f just at a few vertices xi, for example at N + 1 evenly spaced points x0 = 0, x1 = 1/N, . . . , xN = 1. This means that discrete functions are elements of R^{N+1}, and consist of vectors F with Fi representing the value of the function at xi.
How do we discretize the condition that f′(0) = f′(1) = 0? We could for example insist that F0 = F1, so that F is flat on the first interval (x0, x1)... but there's really no need to add this condition. We can instead imagine that F is constant if we were to extend F slightly to the left of x0, or to the right of xN. This decision amounts to the same thing (up to reindexing and shortening the interval) as setting F0 = F1 and FN−1 = FN.
Choosing an inner product is less straightforward. One approach is to extend F to a function
f over the entire interval by linear interpolation; this is the idea used in chapter 12. Here we will
do something even simpler, and instead use the barycentric (lumped) inner product: let us chop
up the unit interval into pieces centered at the xi , and take the inner product as if the functions
were constant over their piece:

⟨F, G⟩ = (1/2N) F0 G0 + Σ_{i=1}^{N−1} (1/N) Fi Gi + (1/2N) FN GN.
Here the first and last term have different weights since the pieces associated to x0 and xN are half as long as the others. We can write this inner product as ⟨F, G⟩ = FᵀMG, where M is a diagonal matrix of weights.
Next we write down a Dirichlet energy of F. A natural choice is

V(F) = ½ Σ_{i=0}^{N−1} ( (F_{i+1} − F_i) / (1/N) )² (1/N);

the finite difference approximates the derivative of F, and the scaling by 1/N is due to the fact that the squared derivative is being integrated on the segment of the unit interval between xi and xi+1.
Finally we derive the gradient of V. By definition ⟨∇V(F), G⟩ = [dV(F)]G, and we have chosen discretizations for both sides of this expression:

∇V(F)ᵀ M G = (d/dε) V(F + εG) |_{ε→0}.
G is arbitrary, so let's consider the function

Gi = { 0, i ≠ k;  1, i = k }
for some index 0 < k < N . In other words, let’s ignore the boundaries for now, and look just at a
G that perturbs the value of F at a single interior vertex. Plugging in this G gives

[∇V(F)]k = (−F_{k+1} + 2F_k − F_{k−1}) / ((1/N)(1/N)).
What about for k on the boundaries? For k = 0, for instance, we will get

[∇V]0 = (F0 − F1) / ((1/2N)(1/N))
and a similar expression holds true for k = N. Putting it all together, we have that

[∇V(F)]k = N²(2F0 − 2F1),                  k = 0
[∇V(F)]k = N²(−F_{k+1} + 2F_k − F_{k−1}),  0 < k < N
[∇V(F)]k = N²(2F_N − 2F_{N−1}),            k = N.
Now we simply declare that the discrete Laplacian is the vector ∆F = −∇V (F ).
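As a sanity check on this formula, the analytic gradient can be compared against centered finite differences of V, remembering the M⁻¹ factor hidden in the definition of ∇V. A sketch assuming NumPy, with N and the test function F arbitrary choices:

```python
import numpy as np

N = 8
w = np.full(N + 1, 1.0 / N)                   # lumped mass weights
w[0] = w[-1] = 1.0 / (2 * N)                  # half-length boundary pieces

def V(F):
    """Discrete Dirichlet energy: (1/2) sum ((F_{i+1}-F_i) N)^2 (1/N)."""
    return 0.5 * np.sum((np.diff(F) * N) ** 2 / N)

def grad_V(F):
    dV = np.zeros(N + 1)                      # Euclidean derivative dV^T
    eps = 1e-6
    for k in range(N + 1):
        e = np.zeros(N + 1); e[k] = eps
        dV[k] = (V(F + e) - V(F - e)) / (2 * eps)
    return dV / w                             # grad V = M^{-1} dV^T

F = np.sin(np.linspace(0.0, 1.0, N + 1))
g = grad_V(F)
k = 3
analytic = (-F[k + 1] + 2 * F[k] - F[k - 1]) * N * N
print(np.isclose(g[k], analytic, atol=1e-3))  # True
```

Since V is quadratic in F, the centered difference is exact up to floating-point roundoff, so agreement here is essentially to machine precision.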
Properties of the Laplacian We can make several important observations about the discrete
Laplacian. First, the Laplacian of a discrete function F is linear in the entries of F , and so ∆ can
be represented by multiplication of a matrix with F . In fact this matrix is sparse (tri-diagonal)
and we can easily read off the entries from the formula for the Laplacian:

∆ = N² ·
⎡ −2   2                 ⎤
⎢  1  −2   1             ⎥
⎢      1  −2   1         ⎥
⎢            ⋱           ⎥
⎣              2  −2     ⎦
This matrix is not exactly symmetric (due to the top-left and bottom-right corners) but it is self-adjoint in the sense that ⟨F, ∆G⟩ = ⟨∆F, G⟩; in other words M∆ is symmetric. It is therefore common to write

∆ = M⁻¹L
where L is the symmetric matrix

L = N ·
⎡ −1   1                 ⎤
⎢  1  −2   1             ⎥
⎢      1  −2   1         ⎥
⎢            ⋱           ⎥
⎣              1  −1     ⎦
Spectrum of the Laplacian As in the smooth case, we can look for eigenvectors of ∆:

M⁻¹Lvi = λi vi.

In practice it is often easier numerically to search for the generalized eigenvalues and eigenvectors of L with respect to M,

Lvi = λi M vi,

but notice that solutions of one equation are also solutions of the other.
Now one eigenvector of ∆ is obvious by inspection: the constant vector 1 is an eigenvector with
eigenvalue zero, since the sum of all rows of L is zero. Also as in the smooth case, it turns out that
the other generalized eigenvalues of ∆ are negative: the matrix L is negative-semidefinite, and the
generalized eigenvalues of L are nonpositive.
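These spectral claims are easy to verify numerically. The sketch below, assuming SciPy is available, builds L and M for an arbitrary N and solves the generalized eigenproblem; the constant vector should appear with eigenvalue zero and every other eigenvalue should be negative.

```python
import numpy as np
from scipy.linalg import eigh

N = 16
L = np.diag(np.full(N, 1.0), 1) + np.diag(np.full(N, 1.0), -1) \
    - 2.0 * np.diag(np.ones(N + 1))           # interior rows (1, -2, 1)
L[0, 0] = L[-1, -1] = -1.0                    # boundary rows (-1, 1), (1, -1)
L *= N
w = np.full(N + 1, 1.0 / N)                   # lumped mass weights
w[0] = w[-1] = 1.0 / (2 * N)
M = np.diag(w)

lam, V = eigh(L, M)                           # solve L v = lambda M v
print(abs(np.round(lam.max(), 6)))            # 0.0: the constant eigenvector
print(bool(np.all(lam[:-1] < 0)))             # True: all other eigenvalues < 0
```

As a bonus, the second-largest eigenvalue comes out close to −π², matching the smooth eigenvalue λ₁ = −π² of the first cosine eigenfunction.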
Bonus Math To see that L is negative-semidefinite, notice that −L is diagonally dominant:

−Lii ≥ Σ_{j≠i} |Lij|
for every row i. This is a sufficient condition for −L being positive-semidefinite. But we still need to argue that the generalized eigenvalues of L with respect to M are also nonpositive. Recall that the eigenvalues of a product of invertible square matricesᵃ AB are invariant under swapping of the matrices; this is because

0 = det(AB − λI) = det(A[B − λA⁻¹]) = det A det(B − λA⁻¹) = det(BA − λI).
Let M^{-1/2} be the diagonal matrix whose entries are the reciprocals of the positive square roots of those of M; clearly M^{-1/2} is positive-definite. Then applying the above fact to the case A = M^{-1/2}, B = M^{-1/2}L we have that the generalized eigenvalues of L (or alternatively, the eigenvalues of M⁻¹L) are the same as the ordinary eigenvalues of M^{-1/2}LM^{-1/2}. Since L is negative-semidefinite and M^{-1/2} positive-definite, this product is negative-semidefinite.
ᵃThis fact is true even if the matrices aren't invertible or aren't square (but have compatible dimensions so that both AB and BA are well-defined), but these generalizations are harder to prove.
Discrete Heat Equation Given initial conditions F⁰ and Neumann boundary conditions, we can discretize the heat equation:

(F^{i+1} − F^i)/h = αM⁻¹LF^i,   i ≥ 0,    (15.2)
where h is a chosen time step size. The left-hand side is the standard discretization of a time derivative by a finite difference (average rate of change over the interval). What about the right-hand side, though? Here we have the Laplacian, which is a constant matrix, but what about F? Why use F at time step i, instead of i + 1? Or maybe we should have used the average of F at the two time steps?
These questions should sound very familiar—we had similar options when discretizing Newton's second law! And we know that in the case of Newton's law, the choice to use time step i led to an unstable explicit integrator, and time step i + 1 led to a more complex, but more stable, implicit integrator. The same drama will play out here in the discretization of heat flow.
Let us rearrange equation (15.2):

F^{i+1} = (I + αhM⁻¹L)F^i,

and let us look at the case where the initial temperature F⁰ is a non-constant eigenvector of ∆ with eigenvalue λ < 0. For such a choice of F⁰, we can easily solve the above equation in closed form for any i:

F^{i+1} = (1 + αhλ)^{i+1} F⁰.
We know that the physically correct behavior is for F^i → 0 as i → ∞. And this will indeed happen, provided that

|1 + αhλ| < 1.

Since α is a positive constant and λ < 0, for a sufficiently small time step h < 2/(α|λ|), the simulation will thus correctly model the behavior of the initial temperature F⁰. If h > 2/(α|λ|), we instead have that the temperature oscillations increase over time: the simulation "blows up."
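The stability bound can be demonstrated on a single mode with nothing but plain Python; the eigenvalue λ = −100 and the two step sizes below are arbitrary stand-ins straddling the bound 2/(α|λ|) = 0.02.

```python
alpha, lam = 1.0, -100.0
h_stable, h_unstable = 0.015, 0.025           # bound here is 2/100 = 0.02

def amplitude_after(h, steps=200):
    """Evolve one eigen-mode coefficient with the explicit update."""
    F = 1.0
    for _ in range(steps):
        F = (1.0 + alpha * h * lam) * F       # F^{i+1} = (1 + alpha h lam) F^i
    return abs(F)

print(amplitude_after(h_stable) < 1.0)    # True: the mode decays
print(amplitude_after(h_unstable) > 1.0)  # True: the mode blows up
```

Note how abruptly the behavior changes: at h = 0.015 the amplification factor is −0.5 and the mode dies out, while at h = 0.025 the factor is −1.5 and the mode grows (while flipping sign) every step.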
A general F⁰ will not be an eigenvector of ∆, but can be decomposed in the eigenvector basis,

F⁰ = Σ_{i=0}^{N} ci vi.
Since F^{i+1} is linear in F^i, we can analyze how each term in this sum evolves over time independently of the others; in particular we want each of these components (except for the component associated to the constant eigenvector) to decay over time. Therefore we must pick a time step h small enough that the above inequality is satisfied even for the largest-magnitude (most negative) eigenvalue of ∆. For large N this stable time step size is excruciatingly small.
Maybe F 0 doesn’t include any components of the eigenvectors with very negative eigenvalue,
so that we can use a less conservative time step size? Unfortunately this idea is not wise in
practice: even though F 0 may not contain any part of an eigenvector, due to numerical errors
in the process of carrying out time integrator of heat diffusion using equation (15.2), very
small components of missing eigenvectors are very likely to creep into F i . And these errors
will amplify exponentially as i increases.
In contrast, if we use the temperature at time step i + 1, we get implicit heat flow; we can rewrite the equation as

(I − αhM⁻¹L)F^{i+1} = F^i,

and advancing each time step requires solving a linear system. But notice what happens this time when we pick F⁰ to be a non-constant eigenvector of ∆: we get in closed form

F^{i+1} = F⁰ / (1 − hαλ)^{i+1}.
Again, λ < 0, so the denominator is always greater than one, for any time step size! Therefore F^i always decays as i → ∞, and we need never worry about instability (we might still want a small time step, so that the simulated heat flow is more physically accurate, of course. But at least the simulation will not blow up if we choose an h too large.)
Finally, although solving a linear system every iteration of time integration may seem inefficient, notice that the matrix (I − αhM^{-1}L) is constant over time, and so a decomposition of the matrix can be precomputed to make all subsequent linear solves very inexpensive. Even better is to rewrite the heat equation as

(M − αhL) F^{i+1} = M F^i;

now the matrix M − αhL is symmetric positive-definite, allowing the use of the very efficient (sparse) Cholesky factorization.
Appendix A
Debugging Hints
If debugging is the process of removing
software bugs, then programming must be
the process of putting them in.
Edsger Dijkstra
This appendix gives some advice about how to deal with common problems that arise when implementing physical simulation code. Physics code can be especially frustrating to debug: it may not be clear whether the bug is in the code at all (versus in the mathematics or problem formulation); the bug may not become obvious until many minutes of simulation have elapsed; and bugs can be elusive to pinpoint, since most do not immediately crash the program. There are, however, some best practices and tips, learned through experience, that you may find useful.
A.1
Why Is My Code So Slow?
Performance problems can be debugged using a profiler. Here, though, are some quick problems to
check for:
Compile the code in Release mode, unless you are trying to debug a problem, of course.
Simulation code is often compute-bound and compiler optimizations make a huge difference, sometimes up to an order of magnitude improvement in running time. How to turn on compiler optimizations varies by vendor and build system, but some common cases are in the following table:
CMake: -DCMAKE_BUILD_TYPE=Release
gcc: -O2
llvm: -O2
Visual Studio: switch configuration to “Release”
If you are using compute-heavy external libraries, these will need to be compiled with optimizations
turned on as well.
Use sparse matrices whenever it makes sense (whenever an n × n matrix has fewer than O(n^2) nonzeros, and has more than a dozen or so rows and columns). The vast majority of
matrices that arise in physical simulation are sparse; your mass matrices and Hessians should be
Eigen::SparseMatrix<double>s, not Eigen::MatrixXds. A few other tips when it comes to
sparse matrices:
• Do not materialize sparse matrices by repeatedly calling coeffRef(); each such call forces
Eigen to rearrange its internal sparse matrix representation data structures. Instead, assemble
the matrix nonzeros into a list of triplets, using a pattern like
std::vector<Eigen::Triplet<double> > coeffs;
for(...)
{
    coeffs.push_back({row, column, val});
}
Eigen::SparseMatrix<double> M(nrows, ncols);
M.setFromTriplets(coeffs.begin(), coeffs.end());
Two things to keep in mind: (1) you must set the size of the sparse matrix correctly before
calling setFromTriplets(); (2) Eigen will automatically add together triplets with the same
row and column, so you do not need to do this yourself.
• Matrix-matrix multiplication is quite expensive compared to matrix-vector multiplication.
Refactor your mathematical expressions to use matrix-vector multiplication whenever possible. Eigen performs smart compile-time lazy evaluation which evaluates complex expressions
in the optimal way most of the time, but of course it’s wise to write your code in a fool-proof
way, keeping in mind that C++ uses left-to-right associativity for its mathematical operators:
// Don't do this!!
Eigen::SparseMatrix<double> hugeMat = A*B;
Eigen::VectorXd result = hugeMat*v;

// Probably OK (relies on Eigen reordering)
Eigen::VectorXd result = A*B*v;

// Always OK
Eigen::VectorXd result = A*(B*v);
// -or-
Eigen::VectorXd intermed = B*v;
Eigen::VectorXd result = A*intermed;
• Eigen supports sparse vectors, but these are not worth the trouble in almost all cases.
Do not pass huge data structures by value; pass them by (const) reference instead:

// Don't do this!!
Eigen::Vector3d localForce(Eigen::VectorXd configuration);

// OK
Eigen::Vector3d localForce(const Eigen::VectorXd &configuration);
When passing by value, the compiler will create an entire deep copy of your huge vector/matrix before calling your function. This is almost never what you want for functions that will be called many times per time step.
Use the right linear solvers for the job at hand:
• Do not use dense matrix solvers on sparse matrices!!
• For positive-definite matrices, use either sparse Cholesky decomposition (Eigen::SimplicialLDLT)
or the method of conjugate gradients (Eigen::ConjugateGradient).
• For positive-semidefinite matrices, adding a small multiple of the identity matrix to your
diagonal and then using the above solvers usually works well.
• It is often possible to refactor equations so that matrices become positive-definite. This
refactoring is almost always a significant performance boost.
• Sparse QR is the most robust linear solver, but also the slowest. Use it for nasty, rank-deficient
matrices. Use SuiteSparse’s implementation (Eigen::SPQR), not Eigen’s built-in version, as
the latter is buggy.
• Despite what the Eigen folks claim, SuiteSparse’s other solvers (such as CHOLMOD) are
significantly faster than Eigen’s counterparts as well.
Eigen unfortunately does not include a built-in sparse matrix eigensolver. Use an external library
like Arpack, or write your own implementation of (inverse) power iteration. Don’t try to use Eigen’s
dense matrix eigensolvers on your huge sparse matrices.
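Power iteration really is only a few lines. The sketch below runs plain power iteration on a tiny 3 × 3 dense symmetric matrix just to illustrate the idea (the function name is illustrative; in real simulation code the matrix-vector product inside the loop would act on an Eigen::SparseMatrix, and inverse iteration would replace it with a prefactored linear solve):

```cpp
#include <array>
#include <cmath>

// Power iteration: repeatedly apply A and renormalize; the iterate converges
// to the eigenvector of the largest-magnitude eigenvalue, whose value we
// estimate with the Rayleigh quotient v . (A v).
double powerIteration(const std::array<std::array<double, 3>, 3>& A, int iters) {
    std::array<double, 3> v = {1.0, 1.0, 1.0}; // arbitrary nonzero initial guess
    double n0 = std::sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
    for (double& x : v) x /= n0;               // normalize the guess
    double lambda = 0.0;
    for (int k = 0; k < iters; ++k) {
        std::array<double, 3> w = {0.0, 0.0, 0.0};
        for (int i = 0; i < 3; ++i)
            for (int j = 0; j < 3; ++j)
                w[i] += A[i][j] * v[j];        // w = A v
        lambda = v[0]*w[0] + v[1]*w[1] + v[2]*w[2]; // Rayleigh quotient (v is unit)
        double norm = std::sqrt(w[0]*w[0] + w[1]*w[1] + w[2]*w[2]);
        for (int i = 0; i < 3; ++i)
            v[i] = w[i] / norm;                // renormalize for the next step
    }
    return lambda;
}
```

The convergence rate depends on the gap between the two largest-magnitude eigenvalues, which is why libraries like Arpack use more sophisticated (Lanczos-type) iterations in practice.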
A.2
I’m Getting NaNs!
In floating-point calculations, infs and nans are special sentinel values marking that your calculations numerically overflowed, or yielded an undefined result, respectively. Once one of these values enters into your simulation, it will propagate and soon destroy the entire simulation state. There are two broad root causes of nans in simulation codes:
Local expressions that yield an undefined result if numerical values are slightly perturbed due
to numerical error. You need to guard against such expressions becoming undefined. For instance:
• Vector normalization v ↦ v̂ is undefined when v = 0, and ill-defined numerically if the magnitude of v is too small. You may need to add corner-case code that handles this case specially.
• Square roots of negative numbers are undefined. Even innocent-looking expressions like

√(x² − 2xy + y²)

can be dangerous: mathematically, the expression inside the square root is equivalent to (x − y)² and so is always non-negative; but numerically, due to floating-point rounding it might evaluate to a number just slightly below zero, which then yields a nan when you try to sqrt it. Protect yourself by either refactoring the expression to remove the square root, convincing yourself that the floating-point value can never be negative, or including guards against an undefined result:
// Don't do this!
double result = std::sqrt(x*x - 2*x*y + y*y);

// OK
double tmp = std::max(x*x - 2*x*y + y*y, 0.0);
double result = std::sqrt(tmp);

// Even Better
double result = std::fabs(x-y);
• Inverse trig functions are another source of potential nans. Do not use arccosine to compute the angle between two vectors:

θ = acos( (u · v) / (‖u‖ ‖v‖) ).

When the two vectors are colinear, the argument to arccosine is exactly ±1; but numerically, you might get an intermediate result slightly greater than 1 or less than −1, whose arccosine is undefined. Moreover, arccosine is numerically unstable near 1: small changes in the input value yield large changes in the output angle, since the derivative of arccosine blows up near 1.
Instead use

θ = atan2(‖u × v‖, u · v)

if you want an unsigned angle θ ∈ [0, π], or

θ = 2 atan2([u × v] · n̂, ‖u‖ ‖v‖ + u · v),

where n̂ is the oriented normal to the plane containing both vectors, if you want a signed angle θ ∈ (−π, π]. If you must use arccosine, clamp its input explicitly to [−1, 1].
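The unsigned atan2 formula is easy to make self-contained; a minimal sketch, with plain arrays standing in for whatever vector type your code uses (the Vec3 alias and function name are illustrative):

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Unsigned angle in [0, pi] between u and v, computed with atan2 rather than
// acos: no clamping is needed, and the result stays well-conditioned even for
// nearly parallel vectors.
double angleBetween(const Vec3& u, const Vec3& v) {
    Vec3 c = { u[1]*v[2] - u[2]*v[1],
               u[2]*v[0] - u[0]*v[2],
               u[0]*v[1] - u[1]*v[0] };                    // cross product u x v
    double crossNorm = std::sqrt(c[0]*c[0] + c[1]*c[1] + c[2]*c[2]);
    double dot = u[0]*v[0] + u[1]*v[1] + u[2]*v[2];
    return std::atan2(crossNorm, dot); // first argument >= 0, so result is in [0, pi]
}
```

Note that the norms of u and v never need to be computed explicitly: atan2 only cares about the ratio of its arguments, and ‖u × v‖ / (u · v) = tan θ regardless of the vectors' lengths.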
Failure of iterative algorithms to converge, causing the results to “blow up,” is the second
major source of nans in simulation codes. Examples include trying to use Newton’s method with
an initial guess too far from the root, running time integration using too large of a time step, trying
to solve a linear system when the matrix is too close to singular, etc. Often the best you can do
to guard against these failures is to detect them and then adjust your simulation parameters (time
step size, e.g.) to increase stability; sometimes you can swap faster, more fragile algorithms (such
as Eigen::SimplicialLLT, which solves linear systems only for strictly positive-definite matrices)
for slower, more robust ones (such as Eigen::SPQR, which works even for singular matrices).