Physics-0 Lecture Notes
Instructors: Chi-Ming Chang, Qing-Rui Wang, Fan Yang
Qiuzhen College, Tsinghua University
2023 Spring
Contents
1 Mechanics
1.1 Kinematics: Velocity, Acceleration, Inertial Frame
1.1.1 Point particle and reference frame
1.1.2 Velocity and acceleration
1.1.3 Inertial frame and Galilean transformation
1.1.4 Linear Motions, Circular Motion, Parabolic Motion
1.2 Newton’s Laws
1.2.1 Newtonian mechanics
1.2.2 Newton’s first law
1.2.3 Newton’s second law
1.2.4 Newton’s third law
1.2.5 Applying Newton’s laws
1.3 Forces in Nature, Statics
1.3.1 The four fundamental forces in nature
1.3.2 Some particular forces
1.3.3 Statics I: forces
1.3.4 Statics II: torques
1.4 Energy and Momentum Conservation
1.4.1 Energy conservation
1.4.2 Momentum conservation
1.4.3 Elastic scattering
1.4.4 Inelastic scattering
1.5 Harmonic Oscillators, Pendulum
1.5.1 Simple harmonic motion
1.5.2 Damped harmonic motion
1.5.3 Forced oscillations and resonance
1.5.4 Simple Pendulum
1.5.5 Double pendulum and chaos
1.6 The Theory of Gravitation
1.6.1 Newton’s law of Gravitation
1.6.2 Gravitational potential energy
1.6.3 Kepler’s laws
1.6.4 Gravitation on Earth
1.6.5 Gauss’s law for gravity
2 Electricity and Magnetism
2.1 Coulomb’s Law, Electric Fields
2.1.1 Electric charges
2.1.2 Coulomb’s law
2.1.3 Electric fields
2.1.4 Electric fields due to charged objects
2.1.5 Charged particles in electric fields
2.2 Electric flux, Gauss’s law, and integral theorems
2.2.1 Vector calculus
2.2.2 Electric flux, divergence theorem, and Gauss’s law
2.2.3 Stokes’ theorem, Poincaré lemma, and electric potential
2.3 Applying Gauss’s Law, Electric potential
2.3.1 A charged isolated conductor
2.3.2 Combining Gauss’s law with symmetry
2.3.3 Electric potential
2.4 Capacitance, Current and Resistance
2.4.1 Capacitance
2.4.2 Capacitors in Circuits
2.4.3 Energy Stored in an Electric Field
2.4.4 Electric Current
2.4.5 Resistance and Resistivity
2.4.6 Electric Circuits
2.4.7 RC Circuits
2.5 Magnetic Fields, Magnetic Fields Due to Currents
2.5.1 Magnetic fields
2.5.2 Magnetic forces
2.5.3 Magnetic fields from currents
2.5.4 Ampere’s law
2.6 Faraday’s Law, Induction and Inductance
2.6.1 Faraday’s law of induction
2.6.2 Induced electric fields
2.6.3 Inductors and inductance, self-induction
2.6.4 RL circuits
2.6.5 Energy of a magnetic field
2.6.6 LC harmonic oscillations
2.6.7 RLC damped oscillations
2.7 Maxwell’s equations, Electromagnetic Waves
2.7.1 Overview of Maxwell’s equations
2.7.2 Maxwell’s Addition
2.7.3 Relativistic formulations of Maxwell equations
2.7.4 Electromagnetic wave
2.7.5 Symmetry of Maxwell’s equations
2.7.6 Time Dilation
2.7.7 Length Contraction
Chapter 1
Mechanics
1.1 Kinematics: Velocity, Acceleration, Inertial Frame
1.1.1 Point particle and reference frame
Physics is the subject that describes nature, and many phenomena in nature ultimately boil
down to the motion of some objects. Therefore, a very important question in physics is “How do
we describe the motion of something?” To answer this question, we first need to ask what the
“something” is whose motion we would like to describe. If the something is a dog, then its motion could
be very complicated. It might run straight along some direction or it might bite its own tail and
spin around. To simplify our task, let us consider the motion of an idealized object called a point
particle. Its defining feature is that it lacks spatial extension; hence, it has zero volume. The
motion of a point particle is much simpler than the motion of a dog because it has no
internal motion like spin; hence, we can describe the motion of a point particle by the change of
its position with time.
Definition 1 (Point particle). A point particle is an idealization of particles in physics. It has no
spatial extension, and its motion is completely specified by the time evolution of its position.
On the other hand, if all parts of an object move in the same way, then it could be effectively treated
as a point particle. For example, a small ball falling from a high altitude can be approximated by
a point particle.
We are not yet ready to answer the question. To describe or measure the motion of a point
particle, we first need to decide relative to what the motion is measured. For simplicity,
we could choose to describe the motion of a point particle A referenced to another point particle
O. Next, we introduce a Cartesian coordinate system where the origin is chosen to be the position
of the particle O, which is called a reference frame of the particle O. For example, consider the
point particle A moving on a two-dimensional plane as shown in Figure 1.1. At time t = 0, the
particle A starts at the coordinate (x, y) = (−1/2, 3/2), moves along the red curve, and ends at the
coordinate (x, y) = (9/2, −1/2) at time t = t∗ .
Figure 1.1: The motion of a point particle A in the reference frame of the point particle O.
The motion of the particle A can be described by the coordinates of the reference frame as
functions of the time t, i.e. (x(t), y(t)). From time t1 to t2 , the point particle A travels from
(x1 , y1 ) to (x2 , y2 ). The displacement of the particle A between t1 and t2 is
(∆x, ∆y) = (x2 − x1 , y2 − y1 ) ,
(1.1.1)
and the distance is given by
$$ d = \sqrt{\Delta x^2 + \Delta y^2} . $$
(1.1.2)
If the displacement of the point particle A from t1 to t2 is (∆x, ∆y) and from t2 to t3 is (∆x′ , ∆y ′ ),
then the displacement from t1 to t3 is
(∆x + ∆x′ , ∆y + ∆y ′ ) .
(1.1.3)
The coordinate (x, y) itself can also be understood as the displacement between the particle O and
the particle A.
In the example we just studied, the motion of the point particle A is confined to a two-dimensional
plane, and we used the coordinates (x, y) to describe its motion. More generally, a particle moving
in d dimensions can be described by the Cartesian coordinates (x1 , x2 , · · · , xd ). It is convenient to
introduce the concept of a vector. A vector quantity is a physical quantity with both magnitude
and direction.
Definition 2 (Vector). A d-dimensional vector ⃗v is a collection of d numbers as ⃗v = (v1 , v2 , · · · , vd )
equipped with the operations: vector addition and scalar multiplication.
• Vector addition: Given two d-dimensional vectors ⃗v = (v1 , · · · , vd ) and ⃗v ′ = (v1′ , · · · , vd′ ), the
sum ⃗v + ⃗v ′ is a d-dimensional vector defined by
⃗v + ⃗v ′ = (v1 + v1′ , · · · , vd + vd′ ) .
(1.1.4)
• Scalar multiplication: Given a number a and a d-dimensional vector ⃗v = (v1 , · · · , vd ), the
product a⃗v is defined by
a⃗v = (av1 , · · · , avd ) .
(1.1.5)
Let us look at some examples in two dimensions. The position of a particle is a vector ⃗r = (x, y).
Using vector addition and scalar multiplication, we can write the displacement also as a vector as
∆⃗r = ⃗r2 + (−1)⃗r1 = (x2 − x1 , y2 − y1 ) .
(1.1.6)
The discussion around (1.1.3) says that the total displacement from t1 to t3 is the vector addition
of the displacement from t1 to t2 and the displacement from t2 to t3 .
Definition 3 (Inner product and norm). The inner product ⃗u · ⃗v of two d-dimensional vectors
⃗u = (u1 , · · · , ud ) and ⃗v = (v1 , · · · , vd ) is defined by
⃗u · ⃗v = u1 v1 + u2 v2 + · · · + ud vd .
(1.1.7)
The norm |⃗v | of a vector ⃗v is defined by the square root of the inner product of ⃗v with itself, i.e.
$$ |\vec v| = \sqrt{\vec v \cdot \vec v} . $$
(1.1.8)
The norm of a displacement gives the distance,
d = |∆⃗r| ,
(1.1.9)
and (1.1.2) is an example in two dimensions.
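As a quick numerical illustration of Definitions 2 and 3, the following Python sketch implements vector addition, scalar multiplication, the inner product, and the norm for vectors stored as tuples. The function names (`add`, `scale`, `inner`, `norm`) are chosen here for illustration only, and the endpoints are taken to be (−1/2, 3/2) and (9/2, −1/2), as in the example of Figure 1.1.

```python
import math

def add(u, v):
    """Vector addition, component by component, as in (1.1.4)."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(a, v):
    """Scalar multiplication, as in (1.1.5)."""
    return tuple(a * vi for vi in v)

def inner(u, v):
    """Inner product, as in (1.1.7)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    """Norm: square root of the inner product of v with itself, as in (1.1.8)."""
    return math.sqrt(inner(v, v))

# Assumed start and end points of particle A (example of Figure 1.1).
r1 = (-0.5, 1.5)
r2 = (4.5, -0.5)

# Displacement as in (1.1.6), and the distance as its norm, as in (1.1.9).
delta_r = add(r2, scale(-1, r1))
distance = norm(delta_r)

print(delta_r)   # (5.0, -2.0)
print(distance)  # sqrt(29), about 5.385
```

The same functions work in any dimension d, since they only loop over the components.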
So far, in our description of the motion of a point particle, we have ignored the units. We
measure each physical quantity in its own units, by comparison with a standard, which corresponds
to exactly 1 unit of the quantity. The unit of time is the second, with the symbol s, and the unit of length is the
meter, with the symbol m. The corresponding standards are given by the following two definitions.¹
Definition 4 (Second). One second is the time taken by 9 192 631 770 oscillations of the light (of
a specified wavelength) emitted by a cesium-133 atom.
¹ The following two definitions are taken verbatim from [Robert Resnick; David Halliday; Jearl Walker - Fundamentals of Physics].
Definition 5 (Meter). The meter is the length of the path traveled by light in a vacuum during a
time interval of 1/299 792 458 of a second.
For example, when we say that a time period is 135 s, we mean that during that time period, the
light emitted by a cesium-133 atom has oscillated 135 × 9 192 631 770 times. When we say that two
points are 4.3 m apart, we mean that it takes light 4.3 × 1/299 792 458 s to travel from one
point to the other.
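The two statements above are plain arithmetic, which can be checked with a short Python sketch. The constant names are chosen here for illustration; the numerical values are the ones in Definitions 4 and 5.

```python
# Defining constants of the second and the meter (Definitions 4 and 5).
CESIUM_OSCILLATIONS_PER_SECOND = 9_192_631_770
SPEED_OF_LIGHT_M_PER_S = 299_792_458

# A time period of 135 s, counted in cesium-133 oscillations.
oscillations = 135 * CESIUM_OSCILLATIONS_PER_SECOND

# Time for light to cross 4.3 m in vacuum, in seconds.
travel_time = 4.3 / SPEED_OF_LIGHT_M_PER_S

print(oscillations)  # 1241005288950
print(travel_time)   # about 1.43e-08 s
```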
1.1.2 Velocity and acceleration
When people want to have some idea of the motion of something, the most common question they
ask is “How fast does it move?” Velocity is one of the most important characteristics of
the motion of a point particle. Consider a time interval from t1 to t2 = t1 + ∆t. Suppose the particle A
moves from ⃗r1 to ⃗r2 . The average velocity of the particle A in the time interval ∆t = t2 − t1 is
$$ \vec v_{\rm avg} = \frac{\Delta \vec r}{\Delta t} = \frac{\vec r_2 - \vec r_1}{\Delta t} . $$
(1.1.10)
The unit of the average velocity is m/s. Given the average velocity, we can compute the displacement as
∆⃗r = ⃗vavg ∆t ,
(1.1.11)
and the distance as
d = |⃗vavg |∆t .
(1.1.12)
We can see that the average velocity ⃗vavg is associated with the time period ∆t, and only
depends on the initial and final positions of the particle. However, one usually would like to know
the velocity of an object at a particular instant in time. We define the instantaneous velocity as
given by the limit
$$ \vec v(t) = \lim_{\Delta t \to 0} \frac{\vec r(t + \Delta t) - \vec r(t)}{\Delta t} . $$
(1.1.13)
In terms of the components ⃗v = (vx , vy ), we have
$$ v_x(t) = \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t} , \qquad v_y(t) = \lim_{\Delta t \to 0} \frac{y(t + \Delta t) - y(t)}{\Delta t} . $$
(1.1.14)
The operation we performed on the coordinate x(t) in (1.1.14) to get the instantaneous velocity
vx (t) is an example of a derivative. The derivative of a function f (x) with respect to the variable
x, denoted by df (x)/dx, is defined by the limit
$$ \frac{df(x)}{dx} = \lim_{\epsilon \to 0} \frac{f(x + \epsilon) - f(x)}{\epsilon} . $$
(1.1.15)
We have
$$ \vec v(t) = \frac{d\vec r(t)}{dt} . $$
(1.1.16)
We will not give the mathematical definition of the limit “lim” in (1.1.13). Instead, let
us try to understand the meaning of the instantaneous velocity by looking at the following example.
Consider a point particle moving along the x-direction, and we want to measure its instantaneous
velocity at t = 2.7 s. To this end, we measure the position of this point particle at a sequence of
time instants. The result of the measurements is listed in Table 1.1.
t (s)   2.7        2.7001     2.701      2.71       2.8
x (m)   0.968583   0.967734   0.959588   0.830246   −0.844328
Table 1.1: The data from measuring the positions of a point particle.
Now, we could approximate the instantaneous velocity by the average velocity of the particle
in the time interval [2.7 s, 2.8 s], and obtain²
$$ v(2.7\,{\rm s}) \approx v_{\rm avg} = \frac{-0.844328 - 0.968583}{2.8 - 2.7}\ {\rm m/s} = -18.129\ {\rm m/s} . $$
(1.1.17)
We can get better approximations of the instantaneous velocity by using smaller time intervals
[2.7 s, 2.71 s], [2.7 s, 2.701 s], and [2.7 s, 2.7001 s]. The average velocities associated with these time
intervals are

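$$ v(2.7\,{\rm s}) \approx v_{\rm avg} = \begin{cases}
\frac{0.830246 - 0.968583}{2.71 - 2.7}\ {\rm m/s} = -13.834\ {\rm m/s} & \text{for interval } [2.7\,{\rm s}, 2.71\,{\rm s}] , \\
\frac{0.959588 - 0.968583}{2.701 - 2.7}\ {\rm m/s} = -8.996\ {\rm m/s} & \text{for interval } [2.7\,{\rm s}, 2.701\,{\rm s}] , \\
\frac{0.967734 - 0.968583}{2.7001 - 2.7}\ {\rm m/s} = -8.494\ {\rm m/s} & \text{for interval } [2.7\,{\rm s}, 2.7001\,{\rm s}] .
\end{cases} $$
(1.1.18)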
When we use smaller and smaller intervals, we obtain better and better approximations of the
instantaneous velocity. The equation (1.1.13) means that the instantaneous velocity equals the
average velocity associated with an interval of an infinitesimal length ϵ. Of course, the “interval
of an infinitesimal length” could only be achieved ideally, and in practice, one could only obtain
approximations of the instantaneous velocity, but when the interval becomes smaller and smaller,
the approximation becomes better and better. In fact, the data in Table 1.1 come from the particle
trajectory x(t) = sin(2πt²), and we have the instantaneous velocity v(2.7 s) = −8.43785 m/s.
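The convergence of the average velocity to the instantaneous velocity can be reproduced numerically. The sketch below evaluates the trajectory x(t) = sin(2πt²) directly instead of using the rounded table entries, so the last digits may differ slightly from a hand computation with Table 1.1. The function names are illustrative, and the exact velocity is obtained from the chain rule.

```python
import math

def x(t):
    """Trajectory behind Table 1.1: x(t) = sin(2*pi*t^2), in meters."""
    return math.sin(2 * math.pi * t**2)

def average_velocity(t, dt):
    """Average velocity over the interval [t, t + dt], in m/s."""
    return (x(t + dt) - x(t)) / dt

def exact_velocity(t):
    """Derivative of x(t) by the chain rule: 4*pi*t*cos(2*pi*t^2)."""
    return 4 * math.pi * t * math.cos(2 * math.pi * t**2)

t0 = 2.7
for dt in (0.1, 0.01, 0.001, 0.0001):
    # Shrinking the interval moves the average velocity toward the exact value.
    print(f"dt = {dt}: v_avg = {average_velocity(t0, dt):.3f} m/s")
print(f"exact: v = {exact_velocity(t0):.5f} m/s")
```

Each tenfold reduction of the interval brings the average velocity closer to the exact value, which is the content of the limit in (1.1.13).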
We can also understand the process of getting a better and better approximation of the instantaneous velocity pictorially. In Figure 1.2, we draw the motion of a point particle in the t-x plane.
The instantaneous velocity at t = t0 can be computed by the limit
$$ v(t_0) = \lim_{n \to \infty} \frac{x_n - x_0}{t_n - t_0} , $$
(1.1.19)
where tn approaches t0 for n going to infinity. From the figure, we can also see that the instantaneous
velocity is the slope of the trajectory on the t-x plane.
² We drop the subscript x when considering particles moving in one dimension.
Figure 1.2: The trajectory of a point particle on the t-x plane.
Given the instantaneous velocity v(t), one could integrate it to get the displacement of the
particle,
$$ \Delta \vec r(t_2, t_1) = \vec r(t_2) - \vec r(t_1) = \int_{t_1}^{t_2} \vec v(t)\, dt . $$
(1.1.20)
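As a rough numerical check of (1.1.20), one can integrate the velocity of the example trajectory x(t) = sin(2πt²) and compare with the difference of positions. The sketch below uses the trapezoidal rule, which is a choice made here for simplicity; the helper name `integrate` and the number of steps are assumptions of this example.

```python
import math

def x(t):
    """Example trajectory x(t) = sin(2*pi*t^2), in meters."""
    return math.sin(2 * math.pi * t**2)

def v(t):
    """Its instantaneous velocity, 4*pi*t*cos(2*pi*t^2), in m/s."""
    return 4 * math.pi * t * math.cos(2 * math.pi * t**2)

def integrate(f, t1, t2, n=10_000):
    """Trapezoidal approximation of the integral of f from t1 to t2."""
    h = (t2 - t1) / n
    total = 0.5 * (f(t1) + f(t2))
    for k in range(1, n):
        total += f(t1 + k * h)
    return total * h

displacement = integrate(v, 2.7, 2.8)
print(displacement)     # close to x(2.8) - x(2.7)
print(x(2.8) - x(2.7))  # about -1.8129 m
```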
In a very similar fashion, we define the instantaneous acceleration as
$$ \vec a(t) = \lim_{\Delta t \to 0} \frac{\vec v(t + \Delta t) - \vec v(t)}{\Delta t} = \frac{d\vec v(t)}{dt} = \frac{d^2 \vec r(t)}{dt^2} . $$
(1.1.21)
1.1.3 Inertial frame and Galilean transformation
In Section 1.1.1, we learned that to describe the motion of a point particle A we need first to pick
a reference frame, and we chose the reference frame associated with the point particle O. (O is
always at the origin.) Our description of the motion of A heavily depends on the reference frame,
namely the motion of O. If we instead choose a reference frame associated with a different point
particle B, then our description of the motion of A would be completely different. This is of course
not satisfactory. Hence, we would like to have a more universal way to describe the motion of A.
This leads to the following two definitions.
Definition 6 (Free particle). A free particle is a point particle that receives no influence from any
other objects.
Definition 7 (Inertial frame). An inertial frame is a frame that is referenced to a free particle.
Let us try to understand the above definitions using some analogies. Say you have a case in
court. In a normal situation, you would prefer the jury of your case to be just, i.e. not influenced
by any other people outside the jury. The point particle of our reference frame is like the jury, and
a free particle is like a just jury that we are seeking to have. Therefore, we would like always to
use inertial frames to describe the motion of objects.
Inertial frames are not unique. Every free particle defines an inertial frame. We would like
to have a way to compare our descriptions of the same motion but using different inertial frames.
Consider two free particles O, O′ , and a point particle A. Both reference frames of O and O′ are
inertial frames. As we will see in the next section, Newton’s first law implies that a free particle in
an inertial frame has a constant velocity. Let us denote the constant velocity of O′ in the inertial
frame of O by ⃗v .
The motion of the point particle A in the inertial frame of O is given by the position vector ⃗r,
and in the inertial frame of O′ is given by the position vector ⃗r′ . The position vectors ⃗r and ⃗r′ are
related by
⃗r′ = ⃗r − ⃗v t .
(1.1.22)
The above relation between the coordinates of the two inertial frames is called the Galilean transformation.
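A minimal Python sketch of the Galilean transformation (1.1.22) is given below. The function name `boost` and the numerical values are assumptions made for illustration: O′ moves at 2 m/s along x in the frame of O, and the particle A moves at (3, 1) m/s starting from the origin.

```python
def boost(r, v, t):
    """Position in the frame of O', given the position r in the frame of O: r' = r - v t."""
    return tuple(ri - vi * t for ri, vi in zip(r, v))

# Assumed example: constant velocity of O' in the frame of O, in m/s.
v_frame = (2.0, 0.0)

def r_A(t):
    """Assumed trajectory of A in the frame of O: constant velocity (3, 1) m/s."""
    return (3.0 * t, 1.0 * t)

for t in (0.0, 1.0, 2.0):
    # In the frame of O', A moves with velocity (3, 1) - (2, 0) = (1, 1) m/s.
    print(t, r_A(t), boost(r_A(t), v_frame, t))
```

In this example the velocity of A seen from O′ is the velocity seen from O minus the velocity of O′, which follows from differentiating (1.1.22) with respect to t.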
There are other more basic coordinate transformations. First, when the two free particles O
and O′ are separated by a displacement vector d⃗ but with no relative velocity, the position vectors
⃗r and ⃗r′ are related by
⃗r′ = ⃗r + d⃗ .
(1.1.23)
When the clocks of the two inertial frames differ by time s, the time coordinates t and t′ are related
by
t′ = t + s .
(1.1.24)
The transformation from (t, ⃗r) to (t′ , ⃗r′ ) is called a translation. Next, we can consider the transformation that keeps the inertial frame of O but rotates the coordinates, i.e.
⃗r′ = R · ⃗r ,
(1.1.25)
where R is an orthogonal matrix (R R^T = I). This transformation is called a rotation. The Galilean
group is the group that contains the compositions of Galilean transformation, translations, and
rotations.
Exercise (Galilean group).
1. Work out the transformation rule of a generic element in the Galilean group.
2. Derive the composition rule of two generic elements.
3. Show that the Galilean group is a group, i.e. it obeys all the axioms of a group.
1.1.4 Linear Motions, Circular Motion, Parabolic Motion
We will introduce three simple motions of a point particle. We will use Newton’s notation for
differentiation,
$$ \dot x(t) = \frac{dx(t)}{dt} , \qquad \ddot x(t) = \frac{d^2 x(t)}{dt^2} . $$
(1.1.26)
Linear motions: A linear motion is a motion of a point particle along a straight line. Let x be
the coordinate along the straight line. A linear motion can be described by the function x(t). Let
us give two examples of linear motions.
1. Constant velocity: The linear motion of a particle with a constant velocity v is given by
x(t) = x0 + vt .
(1.1.27)
We compute the instantaneous velocity and acceleration by taking the first and second derivatives
ẋ(t) = v , ẍ(t) = 0 .
(1.1.28)
We have verified that the particle is moving at a constant velocity v without acceleration.
2. Constant acceleration: All objects near Earth’s surface when neglecting the contact or
non-contact effects from the air or other objects (except Earth) move downwards with a
constant acceleration, called the free-fall acceleration, denoted by g. We will learn in later
sections that such acceleration is caused by the gravitational attraction force between the
object and Earth. For now, we just give its value
g = 9.8 m/s² .
(1.1.29)
The motion of a small ball with a free-fall acceleration is given by³
$$ y(t) = y_0 + v t - \frac{1}{2} g t^2 . $$
(1.1.30)
The first and second derivatives are
ẏ(t) = v − gt ,
ÿ(t) = −g .
(1.1.31)
We have verified that the particle is moving at a constant acceleration −g, and the minus
sign is because the acceleration is pointing downward.
Let us look at the position and velocity of the particle at three important time
instants. First, at the time t = 0, the position and velocity of the ball are y = y0 and ẏ = v.
We assume that the velocity v is positive. At t = v/g, the ball reaches the maximum height
y = y0 + v²/(2g) and has zero velocity. At t = 2v/g, the ball is back at the starting height y = y0
with the opposite velocity ẏ = −v. The trajectory of the ball on the t-y plane is drawn in
Figure 1.3. It is a parabola.
³ We changed our coordinate from x to y for the vertical direction.
Figure 1.3: Trajectory of a free-falling object near Earth’s surface on the t-y plane.
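The three special times of the free-fall motion can be checked with a short Python sketch based on (1.1.30) and (1.1.31). The initial height and velocity below are assumed example values, not taken from the notes.

```python
g = 9.8  # free-fall acceleration in m/s^2, from (1.1.29)

def y(t, y0, v):
    """Height at time t, as in (1.1.30)."""
    return y0 + v * t - 0.5 * g * t**2

def y_dot(t, v):
    """Vertical velocity at time t, as in (1.1.31)."""
    return v - g * t

# Assumed example: a ball thrown upward from y0 = 1 m with v = 4.9 m/s.
y0, v = 1.0, 4.9

t_top = v / g       # time at which the velocity vanishes
t_back = 2 * v / g  # time at which the ball returns to y0

print(t_top, y(t_top, y0, v), y_dot(t_top, v))     # maximum height y0 + v^2/(2g)
print(t_back, y(t_back, y0, v), y_dot(t_back, v))  # back at y0 with velocity -v
```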
Circular motion: The motion of a point particle along a circle in two dimensions is called a circular
motion, as shown in Figure 1.4. To describe a circular motion, it is convenient to change from the
Cartesian coordinates (x, y) to the polar coordinates (r, θ) with the coordinate transformation:
x = r cos θ ,    y = r sin θ .
(1.1.32)
The inverse transformation is
$$ r = \sqrt{x^2 + y^2} = |\vec r| , \qquad \theta = \arctan\frac{y}{x} . $$
(1.1.33)
Figure 1.4: The particle is confined on a circle for the circular motion.
For circular motions, the radius r is a constant, and the angle θ is a function of the time t, i.e.
the function θ(t) describes a circular motion. The first derivative of θ is called the angular velocity,
$$ \frac{d\theta(t)}{dt} = \omega(t) . $$
(1.1.34)
Let us consider a circular motion with a constant angular velocity ω, described by
θ(t) = θ0 + ωt .
(1.1.35)
Using the coordinate transformation (1.1.32) and choosing θ0 = 0 for simplicity, we find
⃗r = (r cos(ωt), r sin(ωt)) .
(1.1.36)
The velocity and acceleration are
$$ \vec v = \frac{d\vec r}{dt} = (-r\omega \sin(\omega t), r\omega \cos(\omega t)) , \qquad \vec a = \frac{d^2 \vec r}{dt^2} = (-r\omega^2 \cos(\omega t), -r\omega^2 \sin(\omega t)) . $$
(1.1.37)
We see that the acceleration vector ⃗a is proportional to the coordinate ⃗r, and pointing towards the
origin, i.e.
⃗a = −ω²⃗r .
(1.1.38)
This acceleration is called centripetal acceleration, which is a very important characteristic of circular motions. Taking the norms of the velocity and the acceleration, we find
$$ |\vec v| = r\omega , \qquad |\vec a| = r\omega^2 , $$
(1.1.39)
which gives the relation
$$ |\vec a| = \frac{|\vec v|^2}{r} . $$
(1.1.40)
The period of the circular motion is
$$ T = \frac{2\pi}{\omega} = \frac{2\pi r}{|\vec v|} . $$
(1.1.41)
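The relations for uniform circular motion can be verified numerically with the following Python sketch. The radius, angular velocity, and time are assumed example values, and θ0 is taken to be zero.

```python
import math

r, omega = 2.0, 3.0  # assumed example radius (m) and angular velocity (rad/s)

def position(t):
    """Position on the circle, with theta0 = 0."""
    return (r * math.cos(omega * t), r * math.sin(omega * t))

def velocity(t):
    """First time derivative of the position."""
    return (-r * omega * math.sin(omega * t), r * omega * math.cos(omega * t))

def acceleration(t):
    """Second time derivative of the position."""
    return (-r * omega**2 * math.cos(omega * t), -r * omega**2 * math.sin(omega * t))

t = 0.4
speed = math.hypot(*velocity(t))    # |v|
acc = math.hypot(*acceleration(t))  # |a|
period = 2 * math.pi / omega

print(speed, r * omega)                 # both 6.0, up to rounding
print(acc, speed**2 / r)                # both 18.0, up to rounding
print(period, 2 * math.pi * r / speed)  # two expressions for the period
```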
Parabolic Motion: Consider a point particle near Earth’s surface. It has free-fall acceleration
in the vertical direction and no acceleration in the horizontal direction. However, the motion along
the horizontal direction is not completely trivial, since there could be a non-zero constant horizontal
velocity. The most general such motion is described by
$$ x(t) = x_0 + v_x t , \qquad y(t) = y_0 + v_y t - \frac{1}{2} g t^2 . $$
(1.1.42)
The first and second derivatives are
ẋ(t) = vx ,
ẏ(t) = vy − gt ,
ẍ(t) = 0 ,
ÿ(t) = −g .
(1.1.43)
We have verified that there is a constant horizontal velocity and a constant vertical acceleration.
We can eliminate the time t from the equations (1.1.42) and find
$$ y - y_0 = \frac{v_y}{v_x}(x - x_0) - \frac{1}{2} g \left(\frac{x - x_0}{v_x}\right)^2 , $$
(1.1.44)
which is a parabola on the x-y plane.
Let us again look at some special time instants. First, at t = 0, the initial position and
velocity are (x, y) = (x0 , y0 ) and (ẋ, ẏ) = (vx , vy ). At t = vy /g, the ball reaches the maximum
height at position (x, y) = (x0 + vx vy /g, y0 + vy²/(2g)), and the velocity is pointing in the x-direction as
(ẋ, ẏ) = (vx , 0). At t = 2vy /g, the ball is back to the initial height at (x, y) = (x0 + 2vx vy /g, y0 ) with
the velocity (vx , −vy ).
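The special points of the parabolic motion, and the elimination of t leading to (1.1.44), can be checked with a small Python sketch. The initial position and velocity are assumed example values.

```python
g = 9.8  # free-fall acceleration in m/s^2

# Assumed example initial conditions.
x0, y0 = 0.0, 0.0
vx, vy = 3.0, 4.9

def x(t):
    """Horizontal position, as in (1.1.42)."""
    return x0 + vx * t

def y(t):
    """Vertical position, as in (1.1.42)."""
    return y0 + vy * t - 0.5 * g * t**2

def y_on_parabola(x_value):
    """Height as a function of horizontal position, as in (1.1.44)."""
    s = (x_value - x0) / vx  # this ratio equals the elapsed time t
    return y0 + vy * s - 0.5 * g * s**2

t_top = vy / g
t_back = 2 * vy / g
apex = (x(t_top), y(t_top))       # expected (x0 + vx*vy/g, y0 + vy^2/(2g))
landing = (x(t_back), y(t_back))  # expected (x0 + 2*vx*vy/g, y0)

print(apex)
print(landing)
print(y_on_parabola(x(0.3)), y(0.3))  # the two descriptions agree
```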
1.2 Newton’s Laws
1.2.1 Newtonian mechanics
Classical mechanics is a physical theory describing the motion, including accelerations, of macroscopic objects, and what can cause an object to accelerate. That cause is called a force, which is,
loosely speaking, a push or pull on the object. The force is said to act on the object to change its
velocity. In the next section, we will see some examples of forces in nature.
In classical mechanics, the relation between a force and the acceleration it causes is fully
captured by Newton’s celebrated three laws of motion, which are the subject of this section.
The study of that relation, as Newton presented it, is called Newtonian mechanics. However,
we remark that Newtonian mechanics does not apply to all situations. On the one hand, it is only
a low-speed approximation of Einstein’s special theory of relativity. If the speeds of the objects
become an appreciable fraction of the speed of light, Newtonian mechanics fails. On the other hand,
if the interacting bodies are on the scale of atomic structure, then Newtonian mechanics should be
replaced with more sophisticated quantum mechanics. Although physicists now view Newtonian
mechanics as a special case of these two deeper theories, it is a very useful and important special
case because it applies to the motion of objects ranging in size from the very small (almost on the
scale of atomic structure) to astronomical (galaxies and clusters of galaxies).
It is also worth mentioning that even though Newton’s laws underlying classical mechanics are very simple, there are still many phenomena in classical mechanics that remain
mysterious to physicists and mathematicians. Studies of these phenomena have led to many profound mathematical theories. One famous example is the Millennium Prize Problem on the Navier–Stokes
equations, which lies in the regime of classical mechanics and has been one of the central problems
in the study of PDEs. Another example is chaos in classical mechanics (you may know
the three-body problem), which has had a far-reaching impact on the development of dynamical
systems theory.
1.2.2 Newton’s first law
Before Newton, Aristotle proposed that some force is needed to keep a body moving at constant
velocity. A body was thought to be in its “natural state” when it was at rest, and for a body to
move with constant velocity, it seemed that one had to push or pull it in some way. Otherwise, the
body would “naturally” stop moving. However:
Example (Galileo’s thought experiment). Galileo considered a sliding body on inclined planes in
the absence of friction. The speed acquired by a body moving down a plane from a height, say h,
was sufficient to enable it to reach the same height when climbing up another plane at a different
inclination, say θ. As θ decreases, the body should travel a greater and greater distance. Galileo
proposed that the body could travel indefinitely far as θ → 0, contrary to the Aristotelian notion of
the natural tendency of an object to remain at rest unless acted upon by an external force.
Due to his contribution, Galileo is credited with introducing the concept of inertia. Newton
later exploited it as his first law of motion.
Physics law 1 (Newton’s first law/the law of inertia). In an inertial frame, if no force acts on a
body, the body’s velocity cannot change; that is, the body cannot accelerate.
In other words, if a free body is at rest, it stays at rest. If it is moving, it continues to move
with the same velocity (same magnitude and same direction).
Note that Newton’s first law does not hold in non-inertial frames. For example, in the frame of a
bus accelerating from rest, a ball will accelerate backward even if no force acts on it. Hence, we
can give another definition of the inertial frame: an inertial frame is a frame where Newton’s first
law holds.
1.2.3 Newton’s second law
Newton’s second law is perhaps the greatest law in physics and has deeply affected physics and
even human history ever since Newton. Before introducing Newton’s second law, we first need to
make some preparations.
Force is a vector quantity, i.e., it has not only magnitude but also direction. So, if two or more
forces act on a body, we find the net force by adding them as vectors. This leads to the principle
of superposition (or decomposition) for forces, i.e., a single force that has the same magnitude and
direction as the calculated net force would then have the same effect as all the individual forces.
In these notes, we will often use F⃗ to represent a force.
There may be multiple forces acting on a body, but if their net force is zero, the body cannot
accelerate. So, a more precise statement of Newton’s first law is:
Physics law 2 (Newton’s first law). In an inertial frame, if no net force acts on a body, the body’s
velocity cannot change; that is, the body cannot accelerate.
Mass is a quantitative measure of inertia. From everyday experience, we know that under the same force, the object
with the larger mass is accelerated less. With careful experiments, people find that the acceleration
is actually inversely proportional to the mass (rather than, say, the square of the mass). In other words,
applying the same force F to two bodies of masses m1 and m2 , their accelerations a1 and a2 satisfy
$$ \frac{a_1}{a_2} = \frac{m_2}{m_1} . $$
This suggests that
m1 a1 = m2 a2 = CF
for a universal constant C that does not depend on any physical quantities of the bodies. By
choosing the proper physical units, we can let C = 1, which leads to Newton’s second law.
Physics law 3 (Newton’s second law). In an inertial frame, the net force on a body is equal to the
product of the body’s mass and its acceleration. In other words, suppose a net force F⃗ acts on
a body of mass m, then the resulting acceleration satisfies
F⃗ = m⃗a.
Note that Newton’s first law is a special case of the second law with F⃗ = 0 and ⃗a = 0. We now check
that Newton’s second law is “invariant” under Galilean transformations:
$$ \vec x \ \text{in inertial frame } O \ \to\ \vec x' = \vec x - \vec v t \ \text{in inertial frame } O' , $$
where ⃗v is a constant velocity vector. Since ⃗v is constant, the second time derivative of ⃗v t vanishes. Then, in O′ , we have that
$$ \vec a' = \frac{d^2 \vec x'}{dt^2} = \frac{d^2 \vec x}{dt^2} = \vec a . $$
Hence, in the new frame O′ , we still have F⃗ = m⃗a′ .
The standard units of mass and acceleration are kg and m/s². Then, the unit of force is
kg · m/s², which is called the “newton”, denoted by N. Hence, a 1 N force acting on a body of 1 kg
mass produces an acceleration of 1 m/s².
Remark. Mass is an intrinsic characteristic of a body. However, what, exactly, is mass? This turns
out to be a much deeper question than it may look. In everyday language, it is often confused with
weight, but this is wrong. In classical mechanics, by Newton’s second law, we can only say that
the mass of a body is a characteristic that relates a force on the body to the resulting acceleration.
There is no more familiar definition—we can have a physical sensation of mass only when we try
to accelerate a body. The weight of a body is actually the gravitational force acting on the body,
from which we can measure the mass by observing how the body accelerates under gravitation. In
more advanced physics (such as relativity and quantum field theory), people have a much deeper
understanding of “mass”.
1.2.4
Newton’s third law
Two bodies are said to interact when they push or pull on each other, that is when a force acts on
each body due to the other body. Newton’s third law states that the action force is equal to the
reaction force.
Physics law 4 (Newton’s third law/Action-reaction law). In an inertial frame, when two bodies
interact, the forces on the bodies from each other are always equal in magnitude and opposite in
direction. In other words, let A and B be two interacting bodies. F⃗AB is the force acting on B from
A, and F⃗BA is the force acting on A from B. Then, we have
F⃗AB = −F⃗BA .
If we say Newton’s first law is about setting up inertial frames and the second law gives how an
object changes its motion if some force is acting on it, Newton’s third law is purely a description
of the nature of forces. In Feynman’s lectures, he said:
In our discussion of Newton’s laws, it was explained that these laws are a kind of program that
says “Pay attention to the forces,” and that Newton told us only two things about the nature of
forces. In the case of gravitation, he gave us the complete law of the force. In the case of the very
complicated forces between atoms, he was not aware of the right laws for the forces; however, he
discovered one rule, one general property of forces, which is expressed in his Third Law, and that
is the total knowledge that Newton had about the nature of forces—the law of gravitation and this
principle, but no other details.
Nowadays, we know there are four fundamental interactions: the weak, strong, electromagnetic, and
gravitational interactions. We will discuss Newton’s law of gravitation in Section 1.6, and the electromagnetic force in Chapter 2. In applications, we will also often consider phenomenological forces
(i.e., effective macroscopic forces that arise from the fundamental interactions), such as friction, pressure, buoyancy, elastic force,
damping force, and so on.
1.2.5
Applying Newton’s laws
In principle, with Newton’s second law, we can predict the motion of a given object at any future
time. Denote the position of the object by ⃗x. In physics, most forces depend only on the position
and velocity (but not the acceleration) of the object. Then, Newton’s second law can be written
as a second-order differential equation of ⃗x:
\[ m\ddot{\vec{x}} = \vec{F}(\vec{x}, \dot{\vec{x}}), \]
where we used Newton’s notation of differentiation: ⃗x˙ = d⃗x/dt. To get a solution, we need to
solve the equation and fix the two integration constants using the initial conditions for ⃗x(0) and
⃗v (0) = ⃗x˙ (0). In general, for a system of multiple objects, we can write down a differential equation
for each object. Thus, to predict the evolution of the system, we need to solve the system of
differential equations with the given initial condition. In this note, we will look at some examples
such that solving the given differential equation is possible and simple. In reality, however, such
simple scenarios are rare and very few problems can be solved exactly by pure analysis.
Every body on Earth is pulled toward the ground by a gravitational force that is proportional to
its mass m:
F⃗g = m⃗g ,
where ⃗g is the gravitational acceleration. Near the ground, ⃗g can be treated as a constant vector
with a magnitude approximately 9.8 m/s2 and pointing downward.
Free fall is any motion of a body where gravity is the only force acting upon it. If we release a
body from rest, then its free fall is a linear motion described by (1.1.30) with a = −g (where the
positive direction is upward) and v = 0. If the body has a horizontal velocity, then its free fall is
described by the parabolic motion (1.1.42).
Example. A box starts from rest at height h0 and falls along a frictionless plane inclined at angle
θ. How long does it take to reach the ground? Suppose the mass of the box is m. What is the
normal force on the box?
Solution: We can set up the axes such that the x-axis is pointing downward along the inclined
plane, the y-axis is pointing upward along the normal direction of the inclined plane, and the origin
is at the position of the box at t = 0.
Figure 1.5: Forces on the box on the inclined plane.
Then, the gravitational force on the box can be written as
F⃗g = (mg sin θ, −mg cos θ).
On the other hand, the normal force is F⃗N = (0, FN ). Since the box will slide along the inclined
plane, its acceleration can be written as ⃗a = (a, 0). By Newton’s second law,
F⃗g + F⃗N = m⃗a.
Along the normal direction, it gives the normal force FN = mg cos θ. Along the x-axis, the above
equation gives a = g sin θ. Then, by (1.1.30), we have that the position of the box x(t) is given by
\[ x(t) = \frac{1}{2}\, g \sin\theta \cdot t^2. \]
When the box reaches the ground, we have x(t) = h0 / sin θ. Solving for t, we get
\[ t = \sqrt{\frac{2h_0}{g \sin^2\theta}}. \]
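The result of this example can be evaluated numerically. The following is a minimal sketch, assuming SI units and g = 9.8 m/s²; the function names `slide_time` and `normal_force` and the sample values are illustrative choices, not part of the notes.

```python
import math

def slide_time(h0, theta, g=9.8):
    """Time for a box released from rest at height h0 (m) to reach the ground
    on a frictionless incline of angle theta (radians)."""
    a = g * math.sin(theta)            # acceleration along the incline
    distance = h0 / math.sin(theta)    # length of incline to be covered
    return math.sqrt(2.0 * distance / a)

def normal_force(m, theta, g=9.8):
    """Magnitude of the normal force on a box of mass m (kg)."""
    return m * g * math.cos(theta)

# Illustrative values (assumed): h0 = 1 m, theta = 30 degrees, m = 2 kg.
t30 = slide_time(1.0, math.radians(30))
n30 = normal_force(2.0, math.radians(30))
```

For a vertical plane (θ = π/2) the function reduces to the free-fall time √(2h0/g), which is a quick sanity check on the formula.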
Recall that for a point particle in uniform circular motion, the acceleration points toward the rotation
center and has magnitude $v^2/r = \omega^2 r$, where r is the radius of the circular motion, v is the speed
of the particle, and ω = v/r is the angular speed. We now look at the following example:
Example. In a double-star system, two stars of masses m1 and m2 rotate around a common center
with angular speed ω. The gravitational force between them has magnitude $G m_1 m_2 / r^2$, where G is the
gravitational constant and r is the distance between the two stars. Given ω, determine r and the
position of the rotation center.
Solution: Suppose the distances from the two stars to the rotation center are r1 and r2 = r − r1 ,
respectively. The accelerations of the two stars point toward the rotation center and have magnitudes
\[ a_1 = \omega^2 r_1, \qquad a_2 = \omega^2 r_2. \]
Then, by Newton’s second law, we have
\[ \frac{G m_1 m_2}{r^2} = m_1 \omega^2 r_1 = m_2 \omega^2 r_2. \tag{1.2.1} \]
From this equation, we get m1 r1 = m2 r2 = m2 (r − r1 ), which gives
\[ r_1 = \frac{m_2}{m_1 + m_2}\, r, \qquad r_2 = \frac{m_1}{m_1 + m_2}\, r. \tag{1.2.2} \]
Plugging this into Equation (1.2.1), we find
\[ r = \left( \frac{G(m_1 + m_2)}{\omega^2} \right)^{1/3}. \tag{1.2.3} \]
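Equations (1.2.2) and (1.2.3) can be evaluated and cross-checked against the force balance (1.2.1). The sketch below assumes SI units and the standard value G ≈ 6.674 × 10⁻¹¹ m³/(kg·s²); the function name `binary_orbit` and the sample masses and angular speed are hypothetical illustration values.

```python
G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (assumed standard value)

def binary_orbit(m1, m2, omega, G=G):
    """Return (r, r1, r2): the separation of the two stars and their
    distances from the rotation center, following (1.2.2) and (1.2.3)."""
    r = (G * (m1 + m2) / omega**2) ** (1.0 / 3.0)
    r1 = m2 / (m1 + m2) * r
    r2 = m1 / (m1 + m2) * r
    return r, r1, r2

# Illustrative values (assumed): masses in kg, angular speed in rad/s.
m1, m2, omega = 2.0e30, 1.0e30, 2.0e-7
r, r1, r2 = binary_orbit(m1, m2, omega)

# Consistency check: gravity supplies the centripetal force on star 1.
gravity = G * m1 * m2 / r**2
centripetal_1 = m1 * omega**2 * r1
```

The heavier star sits closer to the rotation center, since m1 r1 = m2 r2.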
Remark. Note that the rotation center is in fact the system’s center of mass. In general, for a
system of n bodies of masses $m_1, \dots, m_n$ at positions $\vec{x}_1, \dots, \vec{x}_n$, the center of mass of this
system is a vector ⃗x such that
(m1 + · · · + mn )⃗x = m1 ⃗x1 + · · · + mn ⃗xn .
Next, we see some examples where solving second-order differential equations is necessary.
Example. Consider a ball falling in some liquid. The liquid exerts a damping force of magnitude Fd = bv that
is proportional to the speed v of the ball and directed opposite to the velocity, where b is the
damping constant. Suppose the ball is released from rest and has mass m. Determine its position
as a function of time t and the velocity of the ball as t → ∞.
Solution: Let the positive x-axis be the downward direction and let the origin be the starting
position of the ball. By Newton’s second law, we need to solve the following equation
mẍ = mg − bẋ,
with initial conditions x(0) = 0 and ẋ(0) = 0. Denoting y = ẋ, the above equation is written as
mẏ = mg − by,
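which can be solved as
\[ \dot{x}(t) = y(t) = C e^{-\frac{b}{m} t} + \frac{mg}{b}. \]
With the initial condition, we can get that C = −mg/b. Integrating $\dot{x}$, we get
\[ x(t) = x(0) + \int_0^t \dot{x}(\tau)\, d\tau = \frac{mg}{b}\, t + \frac{m^2 g}{b^2} \left( e^{-\frac{b}{m} t} - 1 \right). \]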
Note that as t → ∞, $\dot{x}(t) \to mg/b$, in which case Fd = mg. This means that the gravitational
force and the damping force eventually balance, and the ball approaches a constant terminal velocity.
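The closed-form solution of the damped fall can be compared with a direct numerical integration of the equation of motion. This is a sketch under stated assumptions: SI units, g = 9.8 m/s², a simple forward-Euler scheme, and illustrative values m = 0.5 kg and b = 2.0 kg/s. All function names are hypothetical.

```python
import math

def velocity(t, m, b, g=9.8):
    """Analytic velocity of the ball released from rest."""
    return (m * g / b) * (1.0 - math.exp(-b * t / m))

def position(t, m, b, g=9.8):
    """Analytic position of the ball, with x(0) = 0 and x measured downward."""
    return (m * g / b) * t + (m**2 * g / b**2) * (math.exp(-b * t / m) - 1.0)

def integrate_euler(t_end, m, b, g=9.8, dt=1e-4):
    """Forward-Euler integration of m x'' = m g - b x' from rest."""
    x, v = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        a = g - (b / m) * v   # acceleration from gravity and damping
        x += v * dt
        v += a * dt
    return x, v

m, b = 0.5, 2.0                       # assumed illustrative values
x_num, v_num = integrate_euler(1.0, m, b)
v_terminal = m * 9.8 / b              # limit of the velocity as t -> infinity
```

Forward Euler is chosen only for brevity; its error shrinks linearly with the step `dt`, so the agreement with the analytic formulas improves as `dt` is reduced.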
Example. Consider a box of mass m lying on a frictionless plane. It is connected to a spring.
The spring force satisfies Hooke’s law F⃗ = −k⃗x, where ⃗x is the displacement of the spring’s free
end from its position when the spring is in its relaxed state (which is chosen as the origin), and k
is the spring constant. At the time t = 0, the box is released from rest at position ⃗x0 . Determine
the position ⃗x(t) of the box at any time t.
Solution: Without loss of generality, suppose ⃗x0 is along the x-axis. By Newton’s second law, we
need to solve the following equation
mẍ = −kx,
with initial conditions x(0) = x0 and ẋ(0) = 0. We can rewrite the above equation as
\[ \ddot{x} = -\omega^2 x, \qquad \omega := \sqrt{\frac{k}{m}}. \]
We know the above equation has a general solution
x(t) = C1 cos(ωt) + C2 sin(ωt).
(1.2.4)
From the initial conditions, we can determine that C1 = x0 and C2 = 0. Hence, the solution is
⃗x(t) = x0 cos(ωt)⃗i, where ⃗i is the basis unit vector along the x-axis.
Remark. The above system is called a harmonic oscillator. It is one of the most important examples
in almost every field of physics. We will discuss this system in the context of classical mechanics
in more detail in Section 1.5. You will see it again and again in more advanced physics courses (in
particular, it will be one of the first few examples when you study quantum mechanics).
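The harmonic oscillator solution x(t) = x0 cos(ωt) can be checked against a numerical integration of mẍ = −kx. The sketch below uses the velocity Verlet scheme, which is a choice made here for illustration and not a method from these notes; the parameter values are assumed.

```python
import math

def oscillator_exact(t, x0, k, m):
    """Analytic solution for a box released from rest at x0."""
    omega = math.sqrt(k / m)
    return x0 * math.cos(omega * t)

def oscillator_verlet(t_end, x0, k, m, dt=1e-3):
    """Velocity Verlet integration of m x'' = -k x, starting from rest."""
    x, v = x0, 0.0
    a = -(k / m) * x
    for _ in range(int(round(t_end / dt))):
        x += v * dt + 0.5 * a * dt * dt   # position update
        a_new = -(k / m) * x              # acceleration at the new position
        v += 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x

# Illustrative values (assumed): x0 = 1 m, k = 4 N/m, m = 1 kg, so omega = 2 rad/s.
x_numeric = oscillator_verlet(3.0, 1.0, 4.0, 1.0)
x_analytic = oscillator_exact(3.0, 1.0, 4.0, 1.0)
```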
Finally, we look at a two-body problem. We will show that it can be reduced to solving two
one-body problems.
Example. Consider two boxes of masses m1 and m2 . They are connected by a spring, which
is of length ℓ when it is in a relaxed state. Moreover, the spring constant is k. At the time t = 0,
the two boxes have the same velocity ⃗v along the direction of the spring and the spring is compressed
to length ℓ/2. Determine the positions ⃗x1 (t) and ⃗x2 (t) of the two boxes at any time t.
Solution: Without loss of generality, we choose the x-axis such that the spring is along the x-direction, and
\[ x_1(0) = -\frac{m_2}{m_1 + m_2}\,\frac{\ell}{2}, \qquad x_2(0) = \frac{m_1}{m_1 + m_2}\,\frac{\ell}{2}. \]
(They are chosen such that the center of mass is at the origin.) By Newton’s second law, we can
write down the following system of equations:
m1 ẍ1 = k(x2 − x1 − ℓ),
m2 ẍ2 = −k(x2 − x1 − ℓ),
(1.2.5)
with initial conditions
\[ x_1(0) = -\frac{m_2}{m_1 + m_2}\,\frac{\ell}{2}, \quad \dot{x}_1(0) = v, \quad x_2(0) = \frac{m_1}{m_1 + m_2}\,\frac{\ell}{2}, \quad \dot{x}_2(0) = v. \]
To solve (1.2.5), we introduce
\[ y_1 := \frac{m_1 x_1 + m_2 x_2}{m_1 + m_2}, \qquad y_2 := x_2 - x_1. \]
Note that y1 is the center of mass of the two boxes, while y2 describes their relative position. Then,
from the two equations in (1.2.5), we get
\[ \ddot{y}_1 = 0, \qquad \ddot{y}_2 = -\omega^2 (y_2 - \ell), \qquad \text{with } \omega = \sqrt{\frac{k}{m_1} + \frac{k}{m_2}}, \tag{1.2.6} \]
with initial conditions
y1 (0) = 0,
ẏ1 (0) = v,
y2 (0) = ℓ/2,
ẏ2 (0) = 0.
(1.2.7)
It is easy to see that the two differential equations in (1.2.6) have general solutions
y1 (t) = C1 t + C2 ,
y2 (t) = ℓ + C3 cos(ωt) + C4 sin(ωt).
From the initial conditions (1.2.7), we can determine that
C1 = v,
C2 = 0,
C3 = −ℓ/2,
C4 = 0.
Hence, we get the solution
\[ y_1(t) = vt, \qquad y_2(t) = \ell - \frac{\ell}{2} \cos(\omega t), \]
from which we can solve for $x_1(t)$ and $x_2(t)$.
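Inverting the definitions of y1 and y2 gives x1 = y1 − m2 y2/(m1 + m2) and x2 = y1 + m1 y2/(m1 + m2). The sketch below evaluates the positions this way; the function name `two_box_positions` and the sample values are illustrative assumptions.

```python
import math

def two_box_positions(t, m1, m2, k, ell, v):
    """Positions x1(t), x2(t) of two boxes joined by a spring of rest length ell,
    starting with common velocity v and the spring compressed to ell / 2."""
    total = m1 + m2
    omega = math.sqrt(k / m1 + k / m2)
    y1 = v * t                                    # center-of-mass position
    y2 = ell - 0.5 * ell * math.cos(omega * t)    # relative position x2 - x1
    x1 = y1 - m2 / total * y2
    x2 = y1 + m1 / total * y2
    return x1, x2

# Illustrative values (assumed): m1 = 1 kg, m2 = 3 kg, k = 2 N/m, ell = 1 m, v = 0.5 m/s.
x1_start, x2_start = two_box_positions(0.0, 1.0, 3.0, 2.0, 1.0, 0.5)
x1_later, x2_later = two_box_positions(2.0, 1.0, 3.0, 2.0, 1.0, 0.5)
```

The center of mass moves uniformly with velocity v, while the separation x2 − x1 oscillates between ℓ/2 and 3ℓ/2.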
Remark. In physics, the interactions between two bodies only depend on their relative positions
(which is due to the translational symmetry of physics laws). Hence, the two-body problem in
classical mechanics can always be reduced to two one-body problems, which are much easier to
solve. Due to this reason, people say that two-body problems are exactly solvable. However, this
is not true anymore for problems with three or more bodies. In particular, it is well known that
for some initial conditions, the three-body motion can exhibit chaotic behaviors.
In general, the problem becomes harder as the number of bodies grows. However, when
the number of particles is very large, the whole system exhibits collective behaviors
that are predictable. This will be the focus of statistical mechanics.
1.3
Forces in Nature, Statics
1.3.1
The four fundamental forces in nature
Our universe is governed by the four fundamental forces that play a crucial role in how everything interacts with one another. These four forces are
• gravitational force
• electromagnetic force
• weak nuclear force
• strong nuclear force.
Each of these forces plays a unique role in the workings of the universe, from the behavior of objects
on a planetary scale to the interactions of subatomic particles.
In quantum theory, the concept of forces as we know it in classical physics becomes somewhat
ambiguous. Instead, what we call forces are usually interpreted as the exchange of particles between
objects, which is more accurately described as an interaction. Therefore, it is more appropriate to
refer to the “four fundamental interactions” rather than the “four fundamental forces”.
Here is a brief summary of these four fundamental forces:
1. Gravitational Force. Gravity is the force that governs the behavior of objects on a large
or macroscopic scale. It is responsible for keeping planets in orbit around stars, and causing
apples and other objects to fall towards the ground. This force is described by Newton’s law
of universal gravitation, which states that every particle in the universe is attracted to every
other particle with a force that is proportional to the product of their masses and inversely
proportional to the square of the distance between them:
\[ \vec{F} = -\frac{G m_1 m_2}{r^2}\, \hat{r}. \tag{1.3.1} \]
Compared with the other forces, gravity is very weak: the Newtonian constant of gravitation is G ≈ 6.67 × 10⁻¹¹ m³/(kg·s²).
We will discuss Newton’s theory of gravity in detail in Section 1.6 of this
lecture.
The best theory of gravity to date is Einstein’s theory of general relativity. It describes
gravity as the curvature of the spacetime manifold caused by the presence of massive objects,
and masses move according to the curvature of the manifold. The basic equation of general
relativity is Einstein’s equation
\[ R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}, \tag{1.3.2} \]
where the left-hand side describes the geometry of spacetime and the right-hand side describes
the distribution of matter and energy. J. A. Wheeler’s famous statement summarizes the core
idea of general relativity as “Spacetime tells matter how to move; matter tells spacetime how
to curve.”
2. Electromagnetic Force. Electromagnetic force unifies the electric force, which describes
how electric charges attract or repel each other and interact with electric fields, and the
magnetic force, which describes the interaction of moving charged objects and magnetic fields.
For example, Coulomb’s law states that the force between two stationary, electrically
charged particles is
\[ \vec{F} = \frac{k q_1 q_2}{r^2}\, \hat{r}, \tag{1.3.3} \]
which is very similar to Newton’s law of universal gravitation, Eq. (1.3.1). Because all
everyday matter is made of electrically charged particles, the electromagnetic force plays a significant role
in many everyday phenomena, including the operation of electronic devices, the behavior of
a compass, the colors we see, and the heat and light we feel from the sun. Our lecture will
delve deeper into the topic of electromagnetism in the second half, offering a more detailed
and comprehensive discussion on the subject.
3. Weak Nuclear Force. The weak interaction plays a critical role in governing the decay
of unstable subatomic particles, initiating nuclear fusion reactions in stars like the Sun, and
underlying some forms of radioactivity. It involves the exchange of force-carrier particles
called W and Z particles, which are relatively heavy with masses around 100 times that of
a proton. Interestingly, the weak interaction is the only fundamental interaction known to
break parity symmetry. The idea of parity violation was proposed in the mid-1950s by Chen-Ning Yang and Tsung-Dao Lee and later confirmed experimentally by Chien-Shiung Wu.
Parity violation means that the weak force treats left-handed and right-handed particles differently. As a result, left-handedness and right-handedness
are physically distinguishable in the weak interaction: nature itself has a handedness.
4. Strong Nuclear Force. The strong interaction is the force that binds protons and neutrons
together in the nucleus of an atom. The strong interaction is responsible for the stability
of atomic nuclei, and without it, the nucleus would quickly fall apart. The strong force is
mediated by particles known as gluons, which are exchanged between quarks to hold them
together. The strong force is very strong at very short distances, but it quickly decreases in
strength at longer distances.
These four fundamental forces behave very differently. For example, their interaction ranges
differ. The strong nuclear force has a very short range, acting only over distances of
the order of a few femtometers (10⁻¹⁵ meters). The weak nuclear force has a range of about 10⁻¹⁸
meters, while the electromagnetic force and gravity have infinite range. These differences originate
from the masses of the mediating particles. As a result, in our everyday life, we can only observe
and experience the electromagnetic force and the gravitational force easily.
Besides the interaction ranges, the strengths of these four fundamental forces are also different:
strong nuclear force > electromagnetic force > weak nuclear force > gravitational force.
A notable example that illustrates the relative strength of electromagnetic force compared to gravitational force is that we are able to stand on the ground, rather than being pulled directly to
the center of the Earth by gravity. This is because the electromagnetic force between the atoms
in our feet and the atoms in the ground (only atoms near the feet!) is much stronger than the
gravitational attraction between us and the Earth (the whole Earth!).
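The gap between the electromagnetic and gravitational strengths can be made concrete by comparing Coulomb's law (1.3.3) with Newton's law (1.3.1) for two protons. Both forces scale as 1/r², so the ratio does not depend on the distance. The constants below are standard approximate SI values supplied here as assumptions; they are not quoted in these notes, and the function name is hypothetical.

```python
K_COULOMB = 8.988e9     # Coulomb constant in N m^2 / C^2 (assumed standard value)
G_NEWTON = 6.674e-11    # gravitational constant in m^3 / (kg s^2) (assumed standard value)
E_CHARGE = 1.602e-19    # elementary charge in C (assumed standard value)
M_PROTON = 1.673e-27    # proton mass in kg (assumed standard value)

def coulomb_to_gravity_ratio(q1, q2, m1, m2):
    """Ratio of Coulomb to gravitational force magnitudes between two
    point particles; the common factor 1/r^2 cancels."""
    return (K_COULOMB * abs(q1 * q2)) / (G_NEWTON * m1 * m2)

ratio = coulomb_to_gravity_ratio(E_CHARGE, E_CHARGE, M_PROTON, M_PROTON)
```

For two protons the ratio comes out at roughly 10³⁶, which is why the repulsion between a thin layer of atoms can balance the pull of the entire Earth.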
The history of physics is a history of unification, where theories are developed to explain multiple
phenomena with a single framework. For example, Newton’s theory of gravity unified the falling of
an apple and the motion of the moon around the Earth. Maxwell’s theory unified electricity and
magnetism. While three of the four fundamental forces have been successfully unified by gauge
theory, gravity remains incompatible with quantum theory. Despite decades of effort, it is still
unknown how to reconcile Einstein’s theory of gravity with quantum mechanics.
1.3.2
Some particular forces
There are several types of forces that we encounter in our daily life quite frequently.
Gravitational force and weight
The gravitational force on Earth is a fundamental force exerted by the Earth on all
objects with mass, including human beings, animals, and inanimate objects. The weight of an object near
the surface of the Earth follows from Newton’s law of universal gravitation (1.3.1):
\[ W = mg = \frac{GMm}{R^2}, \tag{1.3.4} \]
where M is the mass of the Earth, m is the mass of the object, R is the radius of the Earth, and
g is the acceleration due to gravity. The direction of the gravitational force is always towards the
center of the Earth. From the expression of the free-fall acceleration
\[ g = \frac{GM}{R^2}, \tag{1.3.5} \]
we find notably that this acceleration is independent of the mass of the object, meaning that all
objects will experience the same acceleration under the influence of gravity near the Earth’s surface.
Numerically, the free-fall acceleration is approximately g ≈ 9.8 m/s2 .
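Equation (1.3.5) can be evaluated directly. The following sketch assumes standard approximate values for G, the mass of the Earth, and its mean radius; these numbers are supplied here for illustration and are not given in the notes.

```python
G = 6.674e-11        # gravitational constant in m^3 / (kg s^2) (assumed standard value)
M_EARTH = 5.972e24   # mass of the Earth in kg (assumed standard value)
R_EARTH = 6.371e6    # mean radius of the Earth in m (assumed standard value)

def surface_gravity(mass, radius, G=G):
    """Free-fall acceleration g = G M / R^2 at the surface of a spherical body."""
    return G * mass / radius**2

def weight(m, g):
    """Weight W = m g of an object of mass m."""
    return m * g

g_earth = surface_gravity(M_EARTH, R_EARTH)
w_70kg = weight(70.0, g_earth)   # weight of a 70 kg object, illustrative
```

The mass m of the falling object never enters `surface_gravity`, which mirrors the statement that all objects share the same free-fall acceleration.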
Normal force
When an object is placed on a table, it experiences a normal force with direction perpendicular
to the table. Essentially, this normal force comes from the electromagnetic repulsion between the
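closely spaced molecules of the object and the molecules of the table. The magnitude of the normal force
is not fixed in advance: it takes whatever value is needed to keep the object from sinking into the surface, and is determined from the other forces through Newton’s second law.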
Friction
Friction is a force that opposes motion between two surfaces in contact. When two objects are
in contact, the roughness of their surfaces causes them to “stick” together slightly. This makes it
harder to move one object relative to the other. Friction can be thought of as a microscopic “drag”
force that acts opposite to the direction of motion or the direction of an applied force.
There are two types of friction: static friction and kinetic (or sliding) friction. Static friction
acts when the two surfaces are at rest relative to each other while an applied force tries to make them slide. The force of static friction prevents
the object from moving until a sufficiently large external force is applied. Once the object starts to
move, kinetic friction takes over, which is generally smaller than the maximum static friction.
The magnitude of static friction has a maximum value, which is proportional to the normal
force. So we have
fs ≤ µs FN ,
(1.3.6)
where µs is a dimensionless coefficient of static friction and FN is the magnitude of the normal
force. On the other hand, if the body begins to slide along the surface, the magnitude of the kinetic
frictional force is
fk = µk FN ,
(1.3.7)
where µk is the coefficient of kinetic friction. Usually, the maximum static friction is bigger than
the kinetic friction: µs > µk .
Figure 1.6: Friction.
Tension
Tension is a pulling force that is transmitted through a flexible material, such as a rope, cable,
or string, when it is pulled tight by opposing forces. It acts along the length
of the material. For an ideal rope (massless and inextensible) in equilibrium, the tension has the same magnitude everywhere along the rope and equals the pulling force applied at its ends. Tension is
used in a wide range of applications, such as in bridges, cables, pulleys, and other structures that
rely on the strength of flexible materials.
Figure 1.7: Tensions of ropes. (a) Pulling an object on a table. (b) The rope runs around a pulley.
Spring force
Spring force is a force exerted by a compressed or stretched spring, which tends to restore
the spring to its equilibrium length. The magnitude of the spring force is proportional to the
displacement of the spring from its equilibrium position. It can be expressed mathematically as
F⃗ = −k⃗x,
(1.3.8)
where F is the spring force, k is the spring constant, and x is the displacement from the equilibrium
position.
Figure 1.8: Spring force.
1.3.3
Statics I: forces
When an object or system remains unchanged over time in a particular reference frame, it is said
to be in a state of equilibrium or static equilibrium. The branch of classical mechanics that studies
systems in this state is known as statics.
Definition 8 (Body). A body refers to a physical object, which can consist of one or multiple parts,
that can be treated as a single entity for the purpose of analysis.
The concept of body is useful for simplifying the description and calculation of the motion,
forces, and other properties of objects, especially when the individual components are too numerous
or complicated to handle separately. By treating a complex system as a single body, we can apply
the principles of mechanics, such as Newton’s laws, to describe its behavior.
In the field of statics, we analyze an object by considering the forces acting on it as a whole
and by examining each of its individual parts, which can be thought of as distinct bodies. The
selection of a particular body to analyze is an engineering decision that is made based on the
specific goals of the analysis. For instance, when designing the foundation of a high-rise building,
we may consider the entire building as a single body. However, when evaluating the strength of the
building’s individual components, such as columns and beams, we would examine them separately
to ensure that they can effectively perform their intended functions.
Newton’s second law states that the net force acting on a body is equal to its mass times its
acceleration. Therefore, if a body is not accelerating, the net force acting on it must be zero. In
other words, if a body is in static equilibrium, the total force acting on it should be zero:
F⃗tot = m⃗a = ⃗0.
(1.3.9)
So we have the following result:
The first condition of equilibrium. In order for a body to be in static equilibrium, the net
force acting on it must be zero.
According to the principle of superposition of forces (or addition of vectors in a vector space),
the total force acting on an object is the sum of all the individual forces acting on it from the
environment. Mathematically, we can express this as:
\[ \vec{F}_{\rm tot} = \sum_a \vec{F}_a, \tag{1.3.10} \]
where F⃗a represents an individual force acting on the body from the environment. If we connect
the tails of the arrows representing the force vectors to their heads, then the condition that the
total force is zero is equivalent to saying that the resulting path formed by all the arrows is closed
(see Figure 1.9).
Figure 1.9: The first condition of equilibrium: Total force of a static body should be zero.
When a body is regarded as consisting of several parts, the parts exert internal forces on one another;
let $\vec{F}_{ij}$ denote the force on part j from part i. The total force acting on the body includes all external forces $\vec{F}_{aj}$
(the external force a acting on part j) and all internal forces $\vec{F}_{ij}$:
\[ \vec{F}_{\rm tot} = \sum_j \left( \sum_a \vec{F}_{aj} + \sum_i \vec{F}_{ij} \right). \tag{1.3.11} \]
Because of Newton’s third law (the action-reaction law), $\vec{F}_{ij} = -\vec{F}_{ji}$, the internal forces $\sum_j \sum_i \vec{F}_{ij}$
sum to zero. Therefore, we recover Eq. (1.3.10) by identifying each external force as the
sum $\vec{F}_a = \sum_j \vec{F}_{aj}$.
Example. A box is at rest on an inclined plane with an angle of inclination θ. The weight of the
box is mg. The static friction coefficient for the plane and the box is µ. Determine the normal force
and friction exerted by the inclined plane on the box. What is the maximum angle θmax at which
the box remains in equilibrium on the plane?
Figure 1.10: Box on an inclined plane.
Solution: We can set up a coordinate system such that the x-axis is parallel to the surface of the
inclined plane and pointing down the slope, the y-axis is perpendicular to the surface and pointing
upwards, and the origin is at a convenient location such as the position of the box.
The box on the slope experiences three forces: the gravitational force $m\vec{g}$, the normal force $\vec{F}_N$,
and the friction $\vec{f}$. Requiring the total force $\vec{F}_{\rm tot} = m\vec{g} + \vec{F}_N + \vec{f}$ to vanish gives
two equations, one along each axis:
f = mg sin θ,
(1.3.12)
FN = mg cos θ.
(1.3.13)
These two equations completely determine the normal force and the frictional force.
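Since the static friction must satisfy f ≤ µFN , Eqs. (1.3.12) and (1.3.13) give mg sin θ ≤ µ mg cos θ, i.e., tan θ ≤ µ. The maximum angle therefore satisfies tan θmax = µ, so
\[ \theta_{\max} = \arctan \mu. \tag{1.3.14} \]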
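The equilibrium conditions (1.3.12) to (1.3.14) translate into a few lines of code. This is a minimal sketch assuming SI units and g = 9.8 m/s²; the function names and sample values are illustrative.

```python
import math

def incline_forces(m, theta, g=9.8):
    """Return (normal force, friction force) needed to hold a box of mass m
    at rest on an incline of angle theta (radians)."""
    return m * g * math.cos(theta), m * g * math.sin(theta)

def max_angle(mu):
    """Largest inclination (radians) at which static friction can hold the box."""
    return math.atan(mu)

def is_static(theta, mu):
    """True if the required friction does not exceed mu times the normal force."""
    return math.tan(theta) <= mu

# Illustrative values (assumed): m = 2 kg, mu = 0.5.
normal, friction = incline_forces(2.0, math.radians(20))
theta_max_deg = math.degrees(max_angle(0.5))
```

Note that the mass cancels in the condition tan θ ≤ µ, so `max_angle` depends on the friction coefficient alone.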
Example. Calculate the forces in the pulley system shown in Figure 1.11.
Figure 1.11: A box hanging by ropes.
Example. If the crate has mass m in Figure 1.12, determine the forces in the boom and in the
topping lift.
Figure 1.12: Cargo boom.
Solution: The triangle formed by the three forces acting on the static point B is similar to the
triangle ABC. Therefore, we can establish the following equation:
\[ \frac{T_1}{\sin\theta_A} = \frac{T_2}{\sin\theta_C} = \frac{mg}{\sin\theta_B}, \tag{1.3.15} \]
which allows us to determine all three forces.
Example. A small ball with mass m is hanging still by ropes (see Figure 1.13). The angles between
the first two ropes and the vertical direction are θ1 and θ2 , respectively. Find the tension in each
of the three ropes.
Figure 1.13: A small ball hanging by ropes.
Solution: The total force on the ball should be zero, so we have
\[ T_3 = mg. \tag{1.3.16} \]
The total force acting on the connecting point of the three ropes should also be zero. The two
equations in the horizontal and vertical directions are
\[ T_1 \sin\theta_1 = T_2 \sin\theta_2, \tag{1.3.17} \]
\[ T_1 \cos\theta_1 + T_2 \cos\theta_2 = T_3. \tag{1.3.18} \]
The solutions for the three equations are
\[ T_1 = mg\, \frac{\sin\theta_2}{\sin(\theta_1 + \theta_2)}, \tag{1.3.19} \]
\[ T_2 = mg\, \frac{\sin\theta_1}{\sin(\theta_1 + \theta_2)}, \tag{1.3.20} \]
\[ T_3 = mg. \tag{1.3.21} \]
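The tensions (1.3.19) to (1.3.21) can be evaluated and substituted back into the balance equations (1.3.17) and (1.3.18) as a check. The sketch assumes SI units and g = 9.8 m/s²; the function name and the sample angles are illustrative.

```python
import math

def rope_tensions(m, theta1, theta2, g=9.8):
    """Tensions (T1, T2, T3) for a ball of mass m hanging from three ropes;
    theta1 and theta2 are the angles (radians) of ropes 1 and 2 from the vertical."""
    w = m * g
    s = math.sin(theta1 + theta2)
    t1 = w * math.sin(theta2) / s
    t2 = w * math.sin(theta1) / s
    return t1, t2, w

# Illustrative values (assumed): m = 2 kg, theta1 = 30 degrees, theta2 = 60 degrees.
th1, th2 = math.radians(30), math.radians(60)
t1, t2, t3 = rope_tensions(2.0, th1, th2)

# Residuals of the horizontal and vertical force balance; both should vanish.
horizontal = t1 * math.sin(th1) - t2 * math.sin(th2)
vertical = t1 * math.cos(th1) + t2 * math.cos(th2) - t3
```

The rope closer to the vertical carries the larger tension, as the sine factors in (1.3.19) and (1.3.20) show.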
Example. Consider the cable and pulley arrangement shown in Figure 1.14. The lower block has
mass M , and the upper block has mass m. The coefficient of friction between the two blocks is µ,
and the coefficient of friction between the lower block and the floor is also µ. What is the maximum
horizontal force F that can be exerted on the lower block before it moves? And what is the tension
T in the cable?
Figure 1.14
Solution: The maximum external force F that can be exerted on the lower block before it moves
occurs when the friction forces between the two blocks and between the lower block and the floor
are at their maximum values,
f1 = µN1 ,
(1.3.22)
f2 = µN2 .
(1.3.23)
Applying the first condition of equilibrium on the upper block, we can set the total forces along
the x and y directions to zero, giving us:
f1 − T = 0,
(1.3.24)
N1 − mg = 0.
(1.3.25)
For the lower block, we have
F − T − f1 − f2 = 0,
(1.3.26)
N2 − N1 − M g = 0.
(1.3.27)
By solving these equations, we find that the maximum external force F and the tension in the
cable T are given by:
F = µ(3m + M )g,
(1.3.28)
T = µmg.
(1.3.29)
1.3.4
Statics II: torques
When analyzing the motion of a body made up of several parts, it can be useful to separate
their motions into the motion of the center of mass and the relative motions of the different parts.
However, in order to simplify our analysis and ignore the relative motion, we introduce the following
concept.
Definition 9 (Rigid body). A rigid body is a physical object that maintains its shape and size under
external forces, meaning that it does not undergo deformation, bending, stretching, or twisting.
In reality, no object is truly rigid, and all objects can deform under the application of forces.
However, for many practical purposes, it is sufficient to treat an object as a rigid body. For a rigid
body, we can assume that all points of it move together, so we only need to consider the motion of
the object as a whole rather than the motion of each individual point on the object. This simplifies
the equations of motion and makes it easier to analyze the object’s behavior.
We can ask the question: what is the condition for a rigid body to be at rest or in static
equilibrium? The condition that the total force acting on a rigid body is zero is not sufficient to
ensure that the body is static. There is another condition related to the moment acting on the
body.
Definition 10 (Cross product). Given two vectors $\vec{A}$ and $\vec{B}$ in three-dimensional Euclidean space,
the cross product of them is another vector $\vec{C}$ that is perpendicular to both $\vec{A}$ and $\vec{B}$ and whose
magnitude is equal to the product of the magnitudes of $\vec{A}$ and $\vec{B}$ multiplied by the sine of the angle
between them. The direction of the resulting vector $\vec{C}$ is given by the right-hand rule.
Figure 1.15: Cross product.
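The cross product of $\vec{A} = (A_x, A_y, A_z)$ and $\vec{B} = (B_x, B_y, B_z)$ in a coordinate system is given
by:
\[ \vec{A} \times \vec{B} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix} = (A_y B_z - A_z B_y)\hat{x} + (A_z B_x - A_x B_z)\hat{y} + (A_x B_y - A_y B_x)\hat{z} \]
\[ = (A_y B_z - A_z B_y,\ A_z B_x - A_x B_z,\ A_x B_y - A_y B_x), \tag{1.3.30} \]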
where $\hat{x}$, $\hat{y}$, and $\hat{z}$ are the unit vectors in the x, y, and z directions, respectively. One can show
directly that
\[ |\vec{A} \times \vec{B}| = |\vec{A}| \cdot |\vec{B}| \cdot \sin\theta, \tag{1.3.31} \]
where θ is the angle between $\vec{A}$ and $\vec{B}$. If both $\vec{A}$ and $\vec{B}$ are coplanar and lie in the xy plane, their
cross product is perpendicular to the plane:
\[ \vec{A} \times \vec{B} = (A_x, A_y, 0) \times (B_x, B_y, 0) = (0, 0, A_x B_y - A_y B_x). \tag{1.3.32} \]
Therefore, the cross product of $\vec{A}$ and $\vec{B}$ in the xy plane is a vector that only has a z-component,
given by $A_x B_y - A_y B_x$.
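The cross product is anticommutative (that is, $\vec{A} \times \vec{B} = -\vec{B} \times \vec{A}$) and is distributive over addition
(that is, $\vec{A} \times (\vec{B} + \vec{C}) = \vec{A} \times \vec{B} + \vec{A} \times \vec{C}$). It is not associative, but satisfies the Jacobi identity
\[ \vec{A} \times (\vec{B} \times \vec{C}) + \vec{B} \times (\vec{C} \times \vec{A}) + \vec{C} \times (\vec{A} \times \vec{B}) = 0. \]
It also satisfies
\[ \vec{A} \cdot (\vec{B} \times \vec{C}) = \vec{B} \cdot (\vec{C} \times \vec{A}) = \vec{C} \cdot (\vec{A} \times \vec{B}) \]
and
\[ \vec{A} \times (\vec{B} \times \vec{C}) = (\vec{A} \cdot \vec{C})\vec{B} - (\vec{A} \cdot \vec{B})\vec{C}. \]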
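The component formula (1.3.30) and the identities above can be verified on concrete vectors. The sketch below uses plain tuples with integer components so that the checks are exact; the helper names `cross`, `dot`, `add`, and `scale` are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-component vectors, following (1.3.30)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def add(a, b):
    """Component-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))

def scale(s, a):
    """Vector a multiplied by the scalar s."""
    return tuple(s * x for x in a)

# Arbitrary integer test vectors (assumed, chosen only for illustration).
A, B, C = (1, 2, 3), (-4, 5, 6), (7, -8, 9)

# Left-hand side of the Jacobi identity; it should be the zero vector.
jacobi = add(add(cross(A, cross(B, C)), cross(B, cross(C, A))),
             cross(C, cross(A, B)))

# Both sides of A x (B x C) = (A . C) B - (A . B) C.
triple_lhs = cross(A, cross(B, C))
triple_rhs = add(scale(dot(A, C), B), scale(-dot(A, B), C))
```

The result of `cross(A, B)` is perpendicular to both inputs, so its dot product with either of them vanishes.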
The cross product allows us to define the physical concept of the moment of a force, also known
as torque. The moment of a force measures the rotational effect of a force about a specific point
or axis. Unlike a linear force, which causes translational motion of a body, a force that creates a
moment must be applied in a way that causes the body to begin to rotate or twist. This happens
when the force does not act through the centroid or center of mass of the body.
Definition 11 (Moment of a force/torque). The moment of force, or torque, is a vector defined as
⃗τ = ⃗r × F⃗ ,
(1.3.33)
where ⃗r is the position vector from the rotation center to the point where the force F⃗ is applied.
By definition, the magnitude of ⃗τ is
|⃗τ | = |⃗r| · |F⃗ | · sin θ,
(1.3.34)
where θ is the angle between the two vectors ⃗r and F⃗ . The direction of τ is given by the right-hand
rule.
One can define the moment of a force with respect to a rotational axis using either ⃗r × F⃗⊥ (see
Figure 1.16b) or (⃗r × F⃗ )∥ (see Figure 1.16c). These two definitions coincide with each other and
can be used interchangeably (show that).
Figure 1.16: Moment of a force/torque with respect to (a) a point or (b), (c) a rotational axis.
The second condition of equilibrium. In order for a rigid body to be in static equilibrium,
the net torque acting on it with respect to any chosen axis or point must be zero.
If we have several forces acting on different positions of a static rigid body, the total torque can
be calculated as:
$$
\vec{\tau}_{\rm tot} = \sum_i \vec{r}_i \times \vec{F}_i. \tag{1.3.35}
$$
The second condition of equilibrium states that the total torque ⃗τtot acting on the rigid body as a
vector is zero, which ensures that the body does not rotate.
Example. Design a steelyard balance to find mass with torque.
Figure 1.17: A steelyard balance.
Example. A uniform pole of length L and weight mg is pivoted at one end to a wall. It is held at
an angle of θ above the horizontal by a horizontal guy wire attached l units from the end attached
to the wall. A load of M g hangs from the upper end of the pole. Calculate the tension in the guy
wire and determine the force exerted on the pole by the wall.
Figure 1.18: A pole at static equilibrium.
Solution: The first condition of equilibrium gives us two equations relating the forces on the pole:
Nx − T = 0,
(1.3.36)
Ny − mg − M g = 0.
(1.3.37)
However, we have three unknown forces: Nx , Ny , and T . Therefore, we need an additional equation
to solve for all three forces. This equation can be obtained from the second condition of equilibrium.
Before calculating torques, we need to choose a reference point around which we will calculate
the torques. Choosing the lower left end of the pole is a good choice as it will result in two forces,
Nx and Ny , having zero torque. This simplifies the equations. Using this reference point, the total
torque can be written as:
$$
\tau_{\rm tot} = mg\,\frac{L}{2}\cos\theta + MgL\cos\theta - Tl\sin\theta = 0. \tag{1.3.38}
$$
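Using all three equations from the first and second conditions of equilibrium, we can solve for
the unknowns as:
$$
N_y = mg + Mg, \tag{1.3.39}
$$
$$
N_x = T = \frac{L}{2l}\,mg\cot\theta + \frac{L}{l}\,Mg\cot\theta. \tag{1.3.40}
$$
Here the factor $\cot\theta = \cos\theta/\sin\theta$ arises from dividing the torque balance (1.3.38) by $l\sin\theta$.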
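As a sanity check on the pole example, one can solve the torque balance (1.3.38) for T numerically and verify that the net torque about the pivot vanishes. This is a minimal sketch; the function names and the numerical values (masses, lengths, angle, g = 9.8 m/s^2) are assumptions chosen only for illustration.

```python
import math

def pole_equilibrium(m, M, L, l, theta, g=9.8):
    """Return (T, Nx, Ny) for the pivoted pole; theta is in radians."""
    # Torque balance about the pivot, Eq. (1.3.38), solved for the tension T.
    T = (0.5 * m + M) * g * L * math.cos(theta) / (l * math.sin(theta))
    Nx = T              # horizontal force balance, Eq. (1.3.36)
    Ny = (m + M) * g    # vertical force balance, Eq. (1.3.37)
    return T, Nx, Ny

def net_torque(m, M, L, l, theta, T, g=9.8):
    """Left-hand side of Eq. (1.3.38): net torque about the pivot."""
    return (0.5 * m * g * L * math.cos(theta)
            + M * g * L * math.cos(theta)
            - T * l * math.sin(theta))

# Hypothetical numbers: 10 kg pole of length 4 m, 20 kg load, wire attached at 3 m.
m, M, L, l, theta = 10.0, 20.0, 4.0, 3.0, math.radians(30.0)
T, Nx, Ny = pole_equilibrium(m, M, L, l, theta)
residual = net_torque(m, M, L, l, theta, T)
print(T, Nx, Ny, residual)
```

Note that the tension grows without bound as theta approaches zero, since a horizontal wire then has no lever arm about the pivot.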
1.4
Energy and Momentum Conservation
Nowadays, we know that Newton’s laws are only approximations of the laws of nature. New
theories are necessary when objects move at very high speeds (special relativity), are very massive
(general relativity), or are very small (quantum mechanics). However, some consequences of
Newton’s laws actually hold more universally. In this section, we introduce two such important
consequences: energy and momentum conservation.
1.4.1
Energy conservation
Let us consider a point particle moving in one dimension under a force which can be written as
$$
F(x) = -\frac{dU(x)}{dx}, \tag{1.4.1}
$$
where the function U(x) is called the potential energy. Newton’s second law tells us that
$$
\frac{dU(x)}{dx} = -m\frac{dv}{dt}. \tag{1.4.2}
$$
Multiplying both sides of the equation by the velocity v, we obtain
$$
-\frac{dU(x)}{dt} = -\frac{dx}{dt}\frac{dU(x)}{dx} = mv\frac{dv}{dt} = \frac{1}{2}m\frac{dv^2}{dt}, \tag{1.4.3}
$$
which can be simplified to
$$
\frac{d}{dt}\left(\frac{1}{2}mv^2 + U(x)\right) = 0. \tag{1.4.4}
$$
We see that Newton’s second law implies that a particular combination of the square of the velocity
and the potential energy is conserved (invariant under time evolution). Let us define the kinetic
energy K of the point particle as
$$
K = \frac{1}{2}mv^2, \tag{1.4.5}
$$
and the total energy E of the point particle as
$$
E = \frac{1}{2}mv^2 + U(x). \tag{1.4.6}
$$
Newton’s second law implies that the total energy E is conserved.
Example. The force of a spring
$$
F = -kx, \tag{1.4.7}
$$
is conservative. The potential energy of the spring is
$$
U(x) = \frac{1}{2}kx^2. \tag{1.4.8}
$$
Show that the energy of a simple harmonic oscillator (1.2.4) is conserved.
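Solution: The position and velocity of the box are
$$
\begin{aligned}
x(t) &= x_0\cos(\omega t) + \frac{v_0}{\omega}\sin(\omega t), \\
\dot{x}(t) &= -x_0\omega\sin(\omega t) + v_0\cos(\omega t),
\end{aligned}
\tag{1.4.9}
$$
where $x_0$ and $v_0$ are the initial position and velocity. The kinetic energy K is
$$
K = \frac{1}{2}m\dot{x}^2 = \frac{1}{2}m\left(-x_0\omega\sin(\omega t) + v_0\cos(\omega t)\right)^2. \tag{1.4.10}
$$
The potential energy U is
$$
U = \frac{1}{2}kx^2 = \frac{1}{2}k\left(x_0\cos(\omega t) + \frac{v_0}{\omega}\sin(\omega t)\right)^2. \tag{1.4.11}
$$
Separately, they are not conserved. Using $k = m\omega^2$, the total energy is
$$
\begin{aligned}
E &= K + U \\
&= \frac{1}{2}\Big(m\omega^2 x_0^2\sin^2(\omega t) + mv_0^2\cos^2(\omega t) - 2m\omega x_0 v_0\sin(\omega t)\cos(\omega t) \\
&\qquad + kx_0^2\cos^2(\omega t) + k\frac{v_0^2}{\omega^2}\sin^2(\omega t) + 2k\frac{x_0 v_0}{\omega}\sin(\omega t)\cos(\omega t)\Big) \\
&= \frac{1}{2}mv_0^2 + \frac{1}{2}kx_0^2.
\end{aligned}
\tag{1.4.12}
$$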
We see that the total energy is indeed conserved.
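The cancellation in (1.4.12) can also be checked numerically by sampling the solution (1.4.9) at several times. This is a minimal sketch; the function name `sho_energy` and the parameter values are assumptions for illustration.

```python
import math

def sho_energy(t, x0, v0, m, k):
    """Return (K, U, E) at time t for the solution (1.4.9) of the spring system."""
    omega = math.sqrt(k / m)
    x = x0 * math.cos(omega * t) + v0 / omega * math.sin(omega * t)
    v = -x0 * omega * math.sin(omega * t) + v0 * math.cos(omega * t)
    K = 0.5 * m * v ** 2   # kinetic energy, Eq. (1.4.10)
    U = 0.5 * k * x ** 2   # potential energy, Eq. (1.4.11)
    return K, U, K + U

# Hypothetical parameters (SI units assumed).
m, k, x0, v0 = 2.0, 8.0, 0.3, -1.0
energies = [sho_energy(0.1 * n, x0, v0, m, k) for n in range(50)]

# Expected constant value from the last line of Eq. (1.4.12).
E0 = 0.5 * m * v0 ** 2 + 0.5 * k * x0 ** 2

for K, U, E in energies[:5]:
    print(f"K = {K:.4f}  U = {U:.4f}  E = {E:.4f}")
```

In the sampled values K and U vary with time while their sum stays at E0 up to rounding error.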
In the example of the simple harmonic oscillator, if initially the spring is relaxed and the box
has no velocity, i.e. $x_0 = 0$ and $v_0 = 0$, then the system would be at rest with zero energy forever.
We could apply an external force to push the spring, which increases the potential energy and starts
the motion. We now give a more general discussion of this process. Consider another force
$F'$ that balances the force F such that the net force is zero, i.e.
$$
F' = -F = \frac{dU}{dx}. \tag{1.4.13}
$$
Suppose the point particle travels from $x = x_i$ to $x = x_f$ while the net force remains zero, as in (1.4.13).
We define the work that is done by the force $F'$ on the point particle as
$$
W = \int_{x_i}^{x_f} F'\,dx = \int_{x_i}^{x_f}\frac{dU}{dx}\,dx = U(x_f) - U(x_i). \tag{1.4.14}
$$
We see that the work equals the change in the potential energy.
Let us generalize the above discussion to higher dimensions.
Definition 12 (Conservative force). A force $\vec{F}$ is conservative if it only depends on the position
$\vec{r}$ (but not the velocity $\vec{v}$) and can be written as minus the gradient of the potential energy $U(\vec{r})$, more
precisely,
$$
\vec{F} = -\vec{\nabla}U = -\left(\frac{\partial U}{\partial x_1}, \cdots, \frac{\partial U}{\partial x_d}\right). \tag{1.4.15}
$$
The notation $\frac{\partial}{\partial x_i}$ in (1.4.15) is the partial derivative, which means that we only take the derivative
with respect to $x_i$ while fixing the other $x_j$ for $j \neq i$. The kinetic energy and the total energy of a point particle
moving in higher dimensions are
$$
K = \frac{1}{2}m|\vec{v}|^2, \qquad E = \frac{1}{2}m|\vec{v}|^2 + U(\vec{r}). \tag{1.4.16}
$$
Let us check that the total energy is conserved,
$$
\frac{dE}{dt} = \frac{d}{dt}\left(\frac{1}{2}m|\vec{v}|^2 + U(\vec{r})\right) = m\vec{v}\cdot\frac{d\vec{v}}{dt} + \frac{d\vec{r}}{dt}\cdot\vec{\nabla}U(\vec{r}) = \vec{v}\cdot\left(m\vec{a} - \vec{F}\right) = 0. \tag{1.4.17}
$$
Example. Consider a free-falling ball near Earth’s surface. It experiences the gravitational force
$$
\vec{F} = (0, -mg), \tag{1.4.18}
$$
which is conservative, and can be written as
$$
\vec{F} = -\left(\frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}\right), \qquad U = mgy, \tag{1.4.19}
$$
where U is called the gravitational potential energy. The motion of the ball is given by (1.1.42).
The velocity of the ball is given by (1.1.43). Check that the total energy is conserved.
Solution: Let us compute the total energy
$$
\begin{aligned}
E &= \frac{1}{2}m(\dot{x}^2 + \dot{y}^2) + mgy \\
&= \frac{1}{2}mv_x^2 + \frac{1}{2}m(v_y - gt)^2 + mg\left(y_0 + v_y t - \frac{1}{2}gt^2\right) \\
&= \frac{1}{2}mv_x^2 + \frac{1}{2}mv_y^2 + mgy_0,
\end{aligned}
\tag{1.4.20}
$$
which is independent of the time t.
Consider an external force F⃗ ′ that balances the gravitational force and pulls the ball from ⃗ri to
⃗rf following the curve C⃗ri ,⃗rf as shown in figure 1.19. What is the work done by the force F⃗ ′ ?
Figure 1.19: An external force F⃗ ′ pulls the ball upward from ⃗ri to ⃗rf .
Let us consider the case with a general potential energy $U(\vec{r})$. The force $\vec{F}'$ balances the
conservative force $\vec{F}$ as
$$
\vec{F}' = -\vec{F} = \vec{\nabla}U. \tag{1.4.21}
$$
The work done by the force $\vec{F}'$ along a curve $C_{\vec{r}_i,\vec{r}_f}$ starting at $\vec{r}_i$ and ending at $\vec{r}_f$ is
$$
W = \int_{C_{\vec{r}_i,\vec{r}_f}} \vec{F}'\cdot d\vec{r} = \int_{C_{\vec{r}_i,\vec{r}_f}} \vec{\nabla}U\cdot d\vec{r} = U(\vec{r}_f) - U(\vec{r}_i) \equiv \Delta U. \tag{1.4.22}
$$
We see, interestingly, that the work only depends on the starting and ending positions $\vec{r}_i$, $\vec{r}_f$ and
not on the shape of the curve $C_{\vec{r}_i,\vec{r}_f}$. In particular, if the curve is a closed loop C, then the work done by the force $\vec{F}'$
is zero,
$$
W = \oint_C \vec{F}'\cdot d\vec{r} = \oint_C \vec{\nabla}U\cdot d\vec{r} = 0. \tag{1.4.23}
$$
Note that this property is only true if $\vec{F}'$ acts against a conservative force.
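The statement that the line integral in (1.4.22) equals the difference of U at the endpoints can be illustrated numerically by integrating the gradient of a potential along two different curves with the same endpoints. The sketch below uses a made-up potential U(x, y) = x^2 y + y and a simple midpoint rule; the potential, the two paths, and the function names are all assumptions for illustration.

```python
import math

def U(x, y):
    """A hypothetical two-dimensional potential energy."""
    return x * x * y + y

def grad_U(x, y):
    """Analytic gradient of U, playing the role of the balancing force F'."""
    return (2.0 * x * y, x * x + 1.0)

def work(path, n=20000):
    """Midpoint-rule approximation of the line integral of grad U along path(s), s in [0, 1]."""
    total = 0.0
    for i in range(n):
        x0, y0 = path(i / n)
        x1, y1 = path((i + 1) / n)
        gx, gy = grad_U(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += gx * (x1 - x0) + gy * (y1 - y0)
    return total

def straight(s):
    """Straight segment from (0, 0) to (1, 2)."""
    return (s, 2.0 * s)

def curved(s):
    """A different curve with the same endpoints (0, 0) and (1, 2)."""
    return (s + 0.5 * math.sin(math.pi * s), 2.0 * s * s)

ri, rf = (0.0, 0.0), (1.0, 2.0)
W_straight = work(straight)
W_curved = work(curved)
delta_U = U(*rf) - U(*ri)
print(W_straight, W_curved, delta_U)
```

Going out along one curve and back along the other forms a closed loop, so the difference of the two results is a numerical version of (1.4.23).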
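Now, let us consider the situation in which the external force $\vec{F}'$ does not balance the conservative
force $\vec{F} = -\vec{\nabla}U$. The net force of the system is
$$
\vec{F}' + \vec{F} = \vec{F}' - \vec{\nabla}U. \tag{1.4.24}
$$
Newton’s second law implies
$$
\vec{F}' - \vec{\nabla}U = m\vec{a}. \tag{1.4.25}
$$
Let us integrate this equation along the contour $C_{\vec{r}_i,\vec{r}_f}$. We find
$$
\begin{aligned}
\int_{C_{\vec{r}_i,\vec{r}_f}} \vec{F}'\cdot d\vec{r}
&= \int_{C_{\vec{r}_i,\vec{r}_f}} \vec{\nabla}U\cdot d\vec{r} + \int_{C_{\vec{r}_i,\vec{r}_f}} m\frac{d\vec{v}}{dt}\cdot d\vec{r} \\
&= \int_{C_{\vec{r}_i,\vec{r}_f}} \vec{\nabla}U\cdot d\vec{r} + \int_{t_i}^{t_f} m\frac{d\vec{v}}{dt}\cdot\vec{v}\,dt \\
&= \int_{C_{\vec{r}_i,\vec{r}_f}} \vec{\nabla}U\cdot d\vec{r} + \int_{t_i}^{t_f} \frac{d}{dt}\left(\frac{1}{2}m|\vec{v}|^2\right)dt \\
&= [U(\vec{r}_f) + K(t_f)] - [U(\vec{r}_i) + K(t_i)] \\
&\equiv \Delta E.
\end{aligned}
\tag{1.4.26}
$$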
We have learned that when the external force $\vec{F}'$ does not balance the conservative force, the integral
of the external force over a contour $C_{\vec{r}_i,\vec{r}_f}$ equals the energy difference between the final and
initial configurations. It is important to note that now the integral of the external force depends on
the entire contour $C_{\vec{r}_i,\vec{r}_f}$, not just on the initial and final positions.
Energy conservation can be further generalized to multi-particle systems. The kinetic energy
of a system of n particles is the sum of the kinetic energies of the individual particles,
$$
K = \sum_{i=1}^{n} \frac{1}{2}m_i|\vec{v}_i|^2. \tag{1.4.27}
$$
The potential energy is a function of the position vectors $\vec{r}_i$ of the particles, $U(\vec{r}_1, \cdots, \vec{r}_n)$. The
force acting on the i-th particle is given by minus the gradient of the potential energy with respect to $\vec{r}_i$,
$$
\vec{F}_i = -\vec{\nabla}_i U(\vec{r}_1, \cdots, \vec{r}_n), \tag{1.4.28}
$$
where $\vec{\nabla}_i$ is the gradient operator acting on $\vec{r}_i$. It is straightforward to check that the total
energy is conserved.
Example. The gravitational potential energy of two stars 1 and 2 is
$$
U(\vec{r}_1, \vec{r}_2) = -\frac{Gm_1m_2}{|\vec{r}_{12}|}. \tag{1.4.29}
$$
The force on star 1 is4
$$
\vec{F}_1 = -\vec{\nabla}_1\left(-\frac{Gm_1m_2}{|\vec{r}_{12}|}\right) = -\frac{Gm_1m_2}{|\vec{r}_{12}|^2}\hat{r}_{12}, \tag{1.4.31}
$$
and the force on star 2 is
$$
\vec{F}_2 = -\vec{\nabla}_2\left(-\frac{Gm_1m_2}{|\vec{r}_{12}|}\right) = \frac{Gm_1m_2}{|\vec{r}_{12}|^2}\hat{r}_{12}. \tag{1.4.32}
$$
4 More explicitly, we have used
$$
\begin{aligned}
\vec{\nabla}\frac{1}{|\vec{r}|}
&= \left(\frac{\partial}{\partial x}\frac{1}{\sqrt{x^2+y^2+z^2}}, \frac{\partial}{\partial y}\frac{1}{\sqrt{x^2+y^2+z^2}}, \frac{\partial}{\partial z}\frac{1}{\sqrt{x^2+y^2+z^2}}\right) \\
&= -\frac{1}{(x^2+y^2+z^2)^{3/2}}(x, y, z) \\
&= -\frac{1}{|\vec{r}|^2}\hat{r}.
\end{aligned}
\tag{1.4.30}
$$
Consider the circular motion of the two-star system studied in section 1.2.5. The total energy is
$$
E = \frac{1}{2}m_1(r_1\omega)^2 + \frac{1}{2}m_2(r_2\omega)^2 - \frac{Gm_1m_2}{r}, \tag{1.4.33}
$$
where $r_1$, $r_2$ and r are given in (1.2.2) and (1.2.3). However, this example is not very interesting
since the kinetic energy and the potential energy are separately conserved. In the next section, we
will see the elliptic motion of the two-star system, where the kinetic energy and the potential energy
are not conserved separately while the total energy is conserved.
1.4.2
Momentum conservation
Let us consider a system of two point particles, whose position vectors are ⃗r1 and ⃗r2 . We assume
that no external force (or no net external force) is acting on the system, but the two particles in
the system can interact with each other. More explicitly, particle 1 gives a force F⃗12 to particle 2,
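and particle 2 gives a force $\vec{F}_{21}$ to particle 1. Newton’s second law gives
$$
\vec{F}_{21} = m_1\frac{d^2\vec{r}_1}{dt^2}, \qquad \vec{F}_{12} = m_2\frac{d^2\vec{r}_2}{dt^2}, \tag{1.4.34}
$$
where $m_1$ and $m_2$ are the masses of the two particles. Newton’s third law implies
$$
\vec{F}_{12} = -\vec{F}_{21}. \tag{1.4.35}
$$
Now, let us consider
$$
0 = \vec{F}_{12} + \vec{F}_{21} = m_1\frac{d^2\vec{r}_1}{dt^2} + m_2\frac{d^2\vec{r}_2}{dt^2} = \frac{d}{dt}\left(m_1\vec{v}_1 + m_2\vec{v}_2\right). \tag{1.4.36}
$$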
We see that the quantity m1⃗v1 + m2⃗v2 is conserved.
The above discussion can be generalized to more complicated systems. Before we do that, let
us introduce some new concepts. First, the momentum of a point particle is defined as its mass
times its velocity,
$$
\vec{p} \equiv m\vec{v}. \tag{1.4.37}
$$
The total momentum $\vec{P}$ of a system is defined as the sum of the momenta of all its particles. For example, in
our previous two-particle system, the total momentum is
$$
\vec{P} = \vec{p}_1 + \vec{p}_2 = m_1\vec{v}_1 + m_2\vec{v}_2, \tag{1.4.38}
$$
which is conserved if there is no net external force by (1.4.36). Now, let us generalize the above
discussion to a system of n particles without net external force. The i-th particle acts on the j-th
particle by a force $\vec{F}_{ij}$. Newton’s third law, $\vec{F}_{ij} = -\vec{F}_{ji}$, implies
$$
0 = \sum_{\substack{i,j=1 \\ i\neq j}}^{n} \vec{F}_{ij} = \sum_{i=1}^{n} m_i\frac{d^2\vec{r}_i}{dt^2} = \frac{d}{dt}\left(\sum_{i=1}^{n} m_i\vec{v}_i\right) \equiv \frac{d}{dt}\left(\sum_{i=1}^{n} \vec{p}_i\right) = \frac{d\vec{P}}{dt}. \tag{1.4.39}
$$
Therefore, the total momentum of the system is conserved if there is no net external force.
When there is a net external force $\vec{F}_{\rm ext}$, Newton’s second law tells us that the total momentum
$\vec{P}$ is not conserved, and its time derivative is
$$
\vec{F}_{\rm ext} = \frac{d\vec{P}}{dt}. \tag{1.4.40}
$$
The above equation looks like Newton’s second law for a point particle. Hence, it is convenient
to interpret the total momentum as the momentum of an effective point particle whose mass equals
the total mass $M = m_1 + \cdots + m_n$, i.e.
$$
\vec{P} = M\vec{v}_{\rm com}, \tag{1.4.41}
$$
where $\vec{v}_{\rm com}$ is the velocity of the effective point particle, which is a mass-weighted average of the velocities
of the individual particles
$$
\vec{v}_{\rm com} = \frac{1}{M}\sum_{i=1}^{n} m_i\vec{v}_i. \tag{1.4.42}
$$
From this formula, we can further infer that the effective point particle is located at the position
$\vec{r}_{\rm com}$ given by
$$
\vec{r}_{\rm com} = \frac{1}{M}\sum_{i=1}^{n} m_i\vec{r}_i. \tag{1.4.43}
$$
The position specified by the position vector ⃗rcom is called the center of mass.
The center of mass provides a way to describe the “overall motion” of a system of particles.
Namely, a system of particles with a net external force F⃗ext can be viewed as a point particle with
mass M sitting at the center of mass ⃗rcom . In other words, the center of mass motion of a system of
particles only depends on the net external force F⃗ext via Newton’s second law and is independent of
the interactions among the particles in the system. Using translation and Galilean transformation,
we could go to an inertial frame in which the center of mass is at the origin and has zero velocity, i.e.
$\vec{r}_{\rm com} = 0 = \vec{v}_{\rm com}$. Such an inertial frame is called the center of mass frame.
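The definitions (1.4.42) and (1.4.43) translate directly into a short computation. The sketch below computes the center of mass position and velocity for a small made-up system and then shifts the velocities to the center of mass frame, where the total momentum should vanish. The function name `center_of_mass` and all numerical values are assumptions for illustration.

```python
def center_of_mass(masses, vectors):
    """Mass-weighted average of a list of vectors, as in Eqs. (1.4.42) and (1.4.43)."""
    M = sum(masses)
    dim = len(vectors[0])
    return tuple(sum(m * v[d] for m, v in zip(masses, vectors)) / M
                 for d in range(dim))

# Hypothetical three-particle system in two dimensions.
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
velocities = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]

r_com = center_of_mass(masses, positions)
v_com = center_of_mass(masses, velocities)

# Galilean transformation to the center of mass frame: subtract v_com from every velocity.
rel_velocities = [tuple(v[d] - v_com[d] for d in range(2)) for v in velocities]

# Total momentum in the center of mass frame (should be zero up to rounding).
P_com_frame = tuple(sum(m * v[d] for m, v in zip(masses, rel_velocities))
                    for d in range(2))
print(r_com, v_com, P_com_frame)
```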
1.4.3
Elastic scattering
An m → n scattering is a physical process where m objects come in from infinity, interact via contact
or non-contact forces at finite distance, and n objects go out to infinity. The initial momenta
$\vec{p}_1, \cdots, \vec{p}_m$ of the objects change to the final momenta $\vec{p}_1\,', \cdots, \vec{p}_n\,'$ during the scattering process.
The interactions (forces) between the objects can be very complicated and we usually do not know
their form. We will only assume that the interactions are “local”, i.e. they decay sufficiently fast
with the distance between the objects so that the objects are free when they come in or go out
to infinity. During the scattering process, the total momentum and energy should be conserved.
While they are usually not sufficient to fully determine the scattering process, they give important
constraints on the system.
Figure 1.20: An m → n scattering process.
In a scattering process, while the total energy is always conserved, the form of the energy could
change from one to another. For example, the objects coming in from infinity carry kinetic energy,
which could transfer to potential energy, heat (thermal energy), or energy of sound and light. In
this subsection, we will first consider an idealized situation, where the total kinetic energy is
conserved in the scattering process. Such a scattering is called an elastic scattering. Scattering
processes that do not conserve kinetic energy are called inelastic scatterings. On the other hand,
momentum should always be conserved if the system is closed.
We will focus on the 2 → 2 scattering process, where we have two incoming particles or objects
and two outgoing ones.
Example. Consider a one-dimensional 2 → 2 elastic scattering process. Two objects A and B of
masses mA , mB , and velocities vA , vB come in from the left infinity. They collide and go out to
the right infinity. Compute the velocities of the objects A and B after the scattering.
Figure 1.21: One-dimensional elastic scattering process.
Solution: The momentum conservation of the scattering process gives
$$
m_A v_A + m_B v_B = m_A v_A' + m_B v_B'. \tag{1.4.44}
$$
The conservation of the kinetic energy gives
$$
\frac{1}{2}m_A v_A^2 + \frac{1}{2}m_B v_B^2 = \frac{1}{2}m_A v_A'^2 + \frac{1}{2}m_B v_B'^2. \tag{1.4.45}
$$
Solving the above two equations, we find two solutions
$$
v_A' = v_A, \qquad v_B' = v_B, \tag{1.4.46}
$$
and
$$
v_A' = \frac{m_A - m_B}{m_A + m_B}v_A + \frac{2m_B}{m_A + m_B}v_B, \qquad
v_B' = \frac{2m_A}{m_A + m_B}v_A + \frac{m_B - m_A}{m_A + m_B}v_B. \tag{1.4.47}
$$
The initial and final velocities in the first solution are the same. This means that the scattering
does not actually happen, which could be realized when $v_A \le v_B$. The velocities in the second
solution change under the scattering.
It is interesting to go to the center of mass frame. Let us first compute the velocity of the center
of mass
$$
v_{\rm com} = \frac{m_A v_A + m_B v_B}{m_A + m_B}. \tag{1.4.48}
$$
In the center of mass frame, the objects A and B have initial velocities
$$
v_A - v_{\rm com} = \frac{m_B(v_A - v_B)}{m_A + m_B}, \qquad
v_B - v_{\rm com} = \frac{m_A(v_B - v_A)}{m_A + m_B}, \tag{1.4.49}
$$
and final velocities
$$
v_A' - v_{\rm com} = -\frac{m_B(v_A - v_B)}{m_A + m_B}, \qquad
v_B' - v_{\rm com} = -\frac{m_A(v_B - v_A)}{m_A + m_B}. \tag{1.4.50}
$$
We see that in the center of mass frame, the final velocities are just given by flipping the signs of
the initial velocities. This can be explained by the time-reversal symmetry of the system.
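The nontrivial solution (1.4.47) and the sign flip in the center of mass frame can be verified with a few lines of code. This is a minimal sketch; the function name `elastic_1d` and the sample masses and velocities are assumptions for illustration.

```python
def elastic_1d(mA, mB, vA, vB):
    """Final velocities of a one-dimensional elastic collision, Eq. (1.4.47)."""
    M = mA + mB
    vA_out = (mA - mB) / M * vA + 2.0 * mB / M * vB
    vB_out = 2.0 * mA / M * vA + (mB - mA) / M * vB
    return vA_out, vB_out

# Hypothetical input: A is heavier and faster than B, so it catches up and collides.
mA, mB, vA, vB = 2.0, 1.0, 3.0, 1.0
vA_out, vB_out = elastic_1d(mA, mB, vA, vB)

# Conserved quantities before and after the collision.
p_in, p_out = mA * vA + mB * vB, mA * vA_out + mB * vB_out
K_in = 0.5 * mA * vA ** 2 + 0.5 * mB * vB ** 2
K_out = 0.5 * mA * vA_out ** 2 + 0.5 * mB * vB_out ** 2

# Center of mass velocity, Eq. (1.4.48).
v_com = (mA * vA + mB * vB) / (mA + mB)
print(vA_out, vB_out, p_in, p_out, K_in, K_out, v_com)
```

For equal masses the formula reduces to an exchange of velocities, which is a convenient special case to test by hand.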
Now, let us generalize our study to the scatterings in higher dimensions.
Example. Consider a 2 → 2 elastic scattering in d dimensions. Let the velocities of the incoming
particles be ⃗vA and ⃗vB and of the outgoing particles be ⃗vA ′ and ⃗vB ′ .
1. Show that the momentum conservation and the energy conservation are invariant under the
Galilean transformation.
2. Using Galilean transformation, translation, and rotation, show that we can always go to an
inertial frame such that the total momentum is zero, and the scattering is two-dimensional,
i.e. all the incoming and outgoing objects or particles are moving in the x-y plane.
3. Show that any Galilean invariant quantities (quantities that are invariant under transformations in the Galilean group) can be written as functions of two Galilean invariant variables.
Explicitly construct such two variables.
Figure 1.22: A d-dimensional 2 → 2 scattering can always be put on a 2-plane.
Solution:
1. Let us start with the momentum conservation
$$
m_A\vec{v}_A + m_B\vec{v}_B = \vec{p}_A + \vec{p}_B = \vec{p}_A\,' + \vec{p}_B\,' = m_A\vec{v}_A\,' + m_B\vec{v}_B\,'. \tag{1.4.51}
$$
Under the Galilean transformation
$$
\vec{v}_A \to \vec{v}_A + \vec{v}, \qquad \vec{v}_B \to \vec{v}_B + \vec{v}, \qquad
\vec{v}_A\,' \to \vec{v}_A\,' + \vec{v}, \qquad \vec{v}_B\,' \to \vec{v}_B\,' + \vec{v}, \tag{1.4.52}
$$
the momentum conservation becomes
$$
m_A(\vec{v}_A + \vec{v}) + m_B(\vec{v}_B + \vec{v}) = m_A(\vec{v}_A\,' + \vec{v}) + m_B(\vec{v}_B\,' + \vec{v}), \tag{1.4.53}
$$
which is equivalent to the momentum conservation before the Galilean transformation.
Next, let us consider the energy conservation,
$$
\frac{1}{2}m_A|\vec{v}_A|^2 + \frac{1}{2}m_B|\vec{v}_B|^2 = \frac{1}{2}m_A|\vec{v}_A\,'|^2 + \frac{1}{2}m_B|\vec{v}_B\,'|^2. \tag{1.4.54}
$$
Under the Galilean transformation, the energy conservation becomes
$$
\frac{1}{2}m_A|\vec{v}_A + \vec{v}|^2 + \frac{1}{2}m_B|\vec{v}_B + \vec{v}|^2 = \frac{1}{2}m_A|\vec{v}_A\,' + \vec{v}|^2 + \frac{1}{2}m_B|\vec{v}_B\,' + \vec{v}|^2. \tag{1.4.55}
$$
Expanding both sides of the equation, we find
$$
\begin{aligned}
&\frac{1}{2}m_A|\vec{v}_A|^2 + \frac{1}{2}m_B|\vec{v}_B|^2 + \vec{v}\cdot(m_A\vec{v}_A + m_B\vec{v}_B) + \frac{1}{2}(m_A + m_B)|\vec{v}|^2 \\
&= \frac{1}{2}m_A|\vec{v}_A\,'|^2 + \frac{1}{2}m_B|\vec{v}_B\,'|^2 + \vec{v}\cdot(m_A\vec{v}_A\,' + m_B\vec{v}_B\,') + \frac{1}{2}(m_A + m_B)|\vec{v}|^2.
\end{aligned}
\tag{1.4.56}
$$
Using the momentum conservation, we see that it is equivalent to the energy conservation
before the Galilean transformation.
2. Let us first use translation and Galilean transformation to set the center of mass to be at the
origin with zero velocity. Since the total momentum is zero, the momenta of the two objects,
labeled by A and B, have the same magnitude and opposite directions, i.e. the incoming
momenta satisfy
$$
\vec{p}_A = -\vec{p}_B, \tag{1.4.57}
$$
and the outgoing momenta satisfy
$$
\vec{p}_A\,' = -\vec{p}_B\,'. \tag{1.4.58}
$$
Next, we can rotate our frame so that $\vec{p}_A$ and $\vec{p}_A\,'$ lie in the x-y plane.
In this frame, the scattering process is characterized by only two quantities: the center of
mass energy $E_{\rm com}$ and the scattering angle $\theta$. They are given by
$$
E_{\rm com} = \frac{|\vec{p}_A|^2}{2m_A} + \frac{|\vec{p}_B|^2}{2m_B} = \frac{m_A + m_B}{2m_A m_B}|\vec{p}_A|^2, \tag{1.4.59}
$$
and
$$
\vec{p}_A\cdot\vec{p}_A\,' = |\vec{p}_A||\vec{p}_A\,'|\cos\theta. \tag{1.4.60}
$$
3. Consider a generic inertial frame. The momenta $\vec{p}_A$, $\vec{p}_B$, $\vec{p}_A\,'$, and $\vec{p}_B\,'$, subject to the
momentum conservation
$$
\vec{p}_A + \vec{p}_B = \vec{p}_A\,' + \vec{p}_B\,', \tag{1.4.61}
$$
span a three-dimensional vector space. There is a two-dimensional subspace that is invariant
under the Galilean transformation and spanned by the vectors
$$
\vec{v}_{AB} \equiv \vec{v}_A - \vec{v}_B = \frac{\vec{p}_A}{m_A} - \frac{\vec{p}_B}{m_B}, \qquad
\vec{v}_{B'B} \equiv \vec{v}_B\,' - \vec{v}_B = \frac{1}{m_B}\left(\vec{p}_B\,' - \vec{p}_B\right). \tag{1.4.62}
$$
Using these two vectors, we can construct three rotationally invariant quadratic combinations
$$
s = |\vec{v}_{AB}|^2, \qquad t = |\vec{v}_{B'B}|^2, \qquad u = |\vec{v}_{AB} - \vec{v}_{B'B}|^2. \tag{1.4.63}
$$
By the energy and momentum conservation, one can find one linear relation between s, t, and
u. To find such a relation, let us use Galilean transformation to go to the frame with $\vec{v}_B = 0$.
In this frame, $\vec{v}_{AB} = \vec{v}_A$, $\vec{v}_{B'B} = \vec{v}_B\,'$, and the energy and momentum conservation become
$$
m_A\vec{v}_A = m_A\vec{v}_A\,' + m_B\vec{v}_B\,', \qquad
\frac{1}{2}m_A|\vec{v}_A|^2 = \frac{1}{2}m_A|\vec{v}_A\,'|^2 + \frac{1}{2}m_B|\vec{v}_B\,'|^2. \tag{1.4.64}
$$
We can solve for $\vec{v}_A\,'$ using the momentum conservation as
$$
\vec{v}_A\,' = \vec{v}_A - \frac{m_B}{m_A}\vec{v}_B\,'. \tag{1.4.65}
$$
The energy conservation now can be written as
$$
\begin{aligned}
\frac{1}{2}m_A s &= \frac{1}{2}m_A|\vec{v}_A\,'|^2 + \frac{1}{2}m_B t \\
&= \frac{1}{2}m_A\left|\vec{v}_A - \frac{m_B}{m_A}\vec{v}_B\,'\right|^2 + \frac{1}{2}m_B t \\
&= \frac{1}{2}m_A s - m_B\,\vec{v}_A\cdot\vec{v}_B\,' + \frac{m_B^2}{2m_A}t + \frac{1}{2}m_B t \\
&= \frac{1}{2}m_A s - \frac{1}{2}m_B(s + t - u) + \frac{m_B(m_A + m_B)}{2m_A}t,
\end{aligned}
\tag{1.4.66}
$$
where the last step uses $u = |\vec{v}_A - \vec{v}_B\,'|^2 = s + t - 2\vec{v}_A\cdot\vec{v}_B\,'$. This simplifies to
$$
m_A s - m_B t - m_A u = 0. \tag{1.4.67}
$$
Therefore, any Galilean invariant quantity can be written as a function of the Galilean invariant variables s and t.5
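Now, let us go to the center of mass frame, where we have the relations (1.4.57) and (1.4.58).
We find that s and t are related to the center of mass energy $E_{\rm com}$ and the scattering angle
$\theta$ by
$$
\begin{aligned}
s &= \left|\frac{\vec{p}_A}{m_A} - \frac{\vec{p}_B}{m_B}\right|^2 = \left(\frac{1}{m_A} + \frac{1}{m_B}\right)^2|\vec{p}_A|^2 = \frac{2(m_A + m_B)}{m_A m_B}E_{\rm com}, \\
t &= \left|\frac{\vec{p}_B\,'}{m_B} - \frac{\vec{p}_B}{m_B}\right|^2 = \left|\frac{\vec{p}_A\,'}{m_B} - \frac{\vec{p}_A}{m_B}\right|^2 = \frac{1}{m_B^2}\left(|\vec{p}_A|^2 - 2\vec{p}_A\,'\cdot\vec{p}_A + |\vec{p}_A\,'|^2\right) \\
&= \frac{2}{m_B^2}|\vec{p}_A|^2(1 - \cos\theta) = \frac{4m_A}{m_B(m_A + m_B)}E_{\rm com}(1 - \cos\theta),
\end{aligned}
\tag{1.4.68}
$$
where we used $|\vec{p}_A\,'| = |\vec{p}_A|$, which follows from the conservation of kinetic energy in the center of mass frame.
5 The variables s, t and u are the analogs of the Mandelstam variables of 2 → 2 scatterings in quantum field theory.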
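The linear relation (1.4.67) between s, t, and u can be tested on a numerically generated elastic collision. The sketch below builds a two-dimensional 2 → 2 elastic scattering by rotating the velocities about the center of mass velocity by an angle theta, which preserves both total momentum and total kinetic energy, and then evaluates the invariants from (1.4.63). The construction, the function names, and the sample values are assumptions for illustration.

```python
import math

def sub(a, b):
    """Difference of two 2-vectors."""
    return (a[0] - b[0], a[1] - b[1])

def norm2(a):
    """Squared length of a 2-vector."""
    return a[0] ** 2 + a[1] ** 2

def elastic_2d(mA, mB, vA, vB, theta):
    """Outgoing velocities for an elastic collision with center-of-mass scattering angle theta."""
    M = mA + mB
    v_com = ((mA * vA[0] + mB * vB[0]) / M, (mA * vA[1] + mB * vB[1]) / M)
    c, sn = math.cos(theta), math.sin(theta)

    def rotate_about_com(v):
        # Rotate the velocity relative to the center of mass, then boost back.
        wx, wy = sub(v, v_com)
        return (v_com[0] + c * wx - sn * wy, v_com[1] + sn * wx + c * wy)

    return rotate_about_com(vA), rotate_about_com(vB)

# Hypothetical masses, incoming velocities (generic frame), and scattering angle.
mA, mB = 2.0, 3.0
vA, vB = (1.5, -0.5), (-1.0, 2.0)
theta = 0.7
vA_out, vB_out = elastic_2d(mA, mB, vA, vB, theta)

# Galilean invariants of Eq. (1.4.63).
s = norm2(sub(vA, vB))
t = norm2(sub(vB_out, vB))
u = norm2(sub(sub(vA, vB), sub(vB_out, vB)))

# Left-hand side of Eq. (1.4.67); should vanish up to rounding.
relation = mA * s - mB * t - mA * u

# Center of mass energy from the relative velocity, and t expressed through the angle.
E_com = 0.5 * mA * mB / (mA + mB) * s
t_from_angle = 4.0 * mA / (mB * (mA + mB)) * E_com * (1.0 - math.cos(theta))
print(s, t, u, relation, t_from_angle)
```

Because the input velocities are given in a generic frame rather than the center of mass frame, the check also illustrates that s, t, and u do not depend on the choice of inertial frame.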
1.4.4
Inelastic scattering
As we discussed in the previous section, an inelastic scattering process does not conserve the kinetic
energy, while the momentum is still conserved if the system is closed. Let us consider the following
examples of inelastic scatterings.
Example. Show that 2 → 1 scatterings cannot be elastic.
Solution: Let the momenta of the incoming particles be $\vec{p}_A$ and $\vec{p}_B$. By the momentum conservation, the momentum of the outgoing particle is $\vec{p}_A + \vec{p}_B$. The total kinetic energies of the incoming
particles and the outgoing particle are
$$
\frac{|\vec{p}_A|^2}{2m_A} + \frac{|\vec{p}_B|^2}{2m_B}, \qquad \frac{|\vec{p}_A + \vec{p}_B|^2}{2(m_A + m_B)}. \tag{1.4.69}
$$
Now, let us go to the center of mass frame. The outgoing particle has zero momentum and
zero kinetic energy. However, the total kinetic energy of the incoming particles is positive.
Hence the kinetic energy is not conserved, and the scattering cannot be elastic.
Example. Consider a one-dimensional scattering process between a box A and a system of two
boxes B and C connected by a spring. Initially, the box A has velocity ⃗vA , the boxes B and C are
at rest, and the spring is in the relaxed state with the spring constant k. The scattering between the
boxes A and B is elastic, but the scattering between the box A and the system of the boxes B and
C is inelastic as some of the kinetic energy changes to the “internal energy”, the potential energy
of the spring plus the kinetic energy of the relative velocity between boxes B and C. Find the total
internal energy.
Figure 1.23: An inelastic scattering between a box and two boxes connected by a spring.
Solution: Since the scattering between the boxes A and B is elastic, we can use the formulae in
(1.4.47) with $v_B = 0$ and find
$$
v_A' = \frac{m_A - m_B}{m_A + m_B}v_A, \qquad v_B' = \frac{2m_A}{m_A + m_B}v_A. \tag{1.4.70}
$$
The box B then compresses the spring, and the spring pushes the box C to move.
The system of the boxes B and C has a center of mass velocity $v_{BC,\rm com}$, which can be computed
by the momentum conservation (of the BC system)
$$
m_B v_B' = (m_B + m_C)v_{BC,\rm com}. \tag{1.4.71}
$$
The internal energy is given by the kinetic energy of the box B right after the collision minus the
kinetic energy of the center of mass motion, i.e.
$$
\begin{aligned}
&\frac{1}{2}m_B v_B'^2 - \frac{1}{2}(m_B + m_C)v_{BC,\rm com}^2 \\
&= \frac{1}{2}m_B\left(\frac{2m_A}{m_A + m_B}v_A\right)^2 - \frac{1}{2}(m_B + m_C)\left(\frac{m_B}{m_B + m_C}\frac{2m_A}{m_A + m_B}v_A\right)^2 \\
&= \frac{2m_A^2 m_B m_C}{(m_A + m_B)^2(m_B + m_C)}v_A^2.
\end{aligned}
\tag{1.4.72}
$$
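The chain of steps (1.4.70), (1.4.71), and the energy difference in (1.4.72) can be evaluated numerically and compared against the simplified closed form. This is a minimal sketch; the function name `internal_energy` and the sample masses and velocity are assumptions for illustration.

```python
def internal_energy(mA, mB, mC, vA):
    """Internal energy of the B-C system after the elastic A-B collision."""
    vB_out = 2.0 * mA / (mA + mB) * vA     # velocity of B after the collision, Eq. (1.4.70)
    v_bc_com = mB * vB_out / (mB + mC)     # center of mass velocity of B and C, Eq. (1.4.71)
    # Kinetic energy of B minus the kinetic energy of the B-C center of mass motion.
    return 0.5 * mB * vB_out ** 2 - 0.5 * (mB + mC) * v_bc_com ** 2

# Hypothetical values (SI units assumed).
mA, mB, mC, vA = 1.0, 2.0, 3.0, 4.0
E_int = internal_energy(mA, mB, mC, vA)

# Closed form obtained by simplifying the difference above.
closed_form = 2.0 * mA ** 2 * mB * mC * vA ** 2 / ((mA + mB) ** 2 * (mB + mC))

# Equivalent reduced-mass form: (1/2) * mu_BC * vB'^2 with mu_BC = mB*mC/(mB + mC).
vB_out = 2.0 * mA / (mA + mB) * vA
reduced_mass_form = 0.5 * mB * mC / (mB + mC) * vB_out ** 2
print(E_int, closed_form, reduced_mass_form)
```

The reduced-mass form makes the physical content explicit: the internal energy is the kinetic energy of the relative motion of B and C immediately after the collision, when C is still at rest.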
1.5
Harmonic Oscillators, Pendulum
Harmonic oscillators are important in physics and engineering because they are the simplest type of
oscillating system, and many real-world systems can be modeled as harmonic oscillators. Examples
of harmonic oscillators include mass-spring systems, pendulums, and electric circuits. Understanding the behavior of harmonic oscillators can help us understand more complex systems, as well
as develop technologies such as clocks, musical instruments, and sensors. Additionally, quantum
harmonic oscillators are an important concept in quantum mechanics, and the study of harmonic
oscillators plays a crucial role in the development of quantum field theory. Overall, studying harmonic oscillators provides fundamental insights into the behavior of physical systems and their
applications in various fields.
1.5.1
Simple harmonic motion
The classic harmonic oscillator is a fundamental concept in physics that describes a particle with
mass m coupled to a spring. The force acting on the particle is given by Hooke’s law
F⃗ = −k⃗x,
(1.5.1)
where ⃗x is the position vector of the particle with respect to the equilibrium point, and k is the
spring constant.
According to Newton’s second law $\vec{F} = m\vec{a}$, the equation of motion for the particle is given
by $-k\vec{x} = m\ddot{\vec{x}}$, where $\vec{a} = \ddot{\vec{x}}$ is the acceleration of the particle. This is a second-order differential
equation that governs the motion of the oscillator. To simplify the equation of motion, we define
the angular frequency of the oscillator as
$$
\omega = \sqrt{\frac{k}{m}}. \tag{1.5.2}
$$
The equation of motion for the particle now becomes
$$
\ddot{\vec{x}}(t) = -\omega^2\vec{x}(t), \tag{1.5.3}
$$
which is a well-known differential equation that describes simple harmonic motion.
For simplicity, let us consider a particle that is constrained to move along a one-dimensional
line. The generic solution of the differential equation for this particle is
x(t) = xm cos(ωt + ϕ),
(1.5.4)
where xm is the amplitude, which is the maximum value of x(t), and ϕ is the initial phase at time
zero. The argument of the cosine function is called the phase. These terminologies are summarized
in Figure 1.24. One can also obtain these parameters from the x-t plot. For instance, a negative
shift in the initial phase will cause the cosine curve to shift towards the right (see Figure 1.25).
Figure 1.24
The motion of the harmonic oscillator x(t) is periodic, which means that if we change time t to
$t + \frac{2\pi}{\omega}$, x(t) remains the same. Thus, $T = \frac{2\pi}{\omega}$ is the period of the motion. The period T is related
to the frequency f and angular frequency $\omega$ as:
$$
\omega = \frac{2\pi}{T} = 2\pi f. \tag{1.5.5}
$$
Here, f is the frequency in Hertz (Hz=s−1 ) and ω is the angular frequency in radians per second.
The period and frequency are reciprocals of each other, and they are both related to the oscillation
rate of the oscillator.
Figure 1.25
The velocity of a simple harmonic oscillator is given by the derivative of its displacement with respect to
time, which is expressed as:
$$
v(t) = \frac{dx(t)}{dt} = -\omega x_m\sin(\omega t + \phi) = \omega x_m\cos\left(\omega t + \phi + \frac{\pi}{2}\right). \tag{1.5.6}
$$
The velocity amplitude is given by $v_m = \omega x_m$, and the phase is shifted by $\pi/2$ compared
to x(t). The acceleration of the oscillator can be calculated as the derivative of its velocity with respect to
time, which is expressed as:
$$
a(t) = \frac{dv(t)}{dt} = -\omega^2 x_m\cos(\omega t + \phi) = \omega^2 x_m\cos(\omega t + \phi + \pi). \tag{1.5.7}
$$
The acceleration amplitude is given by $a_m = \omega^2 x_m$ and the phase is shifted by $\pi$ compared to x(t).
From these expressions, one can check easily that
$$
a(t) = -\omega^2 x(t), \tag{1.5.8}
$$
which demonstrates that the acceleration of the simple harmonic oscillator is directly proportional
to its displacement, and is in the opposite direction to it. This result is required exactly by Newton’s
second law and Hooke’s law.
Let us now consider the energies of the harmonic oscillator. From the definition $-dU(x)/dx = F(x)$,
the potential energy of the system is given by:
$$
U(t) = \frac{1}{2}kx(t)^2 = \frac{1}{2}kx_m^2\cos^2(\omega t + \phi). \tag{1.5.9}
$$
Similarly, the kinetic energy of the system is given by:
$$
K(t) = \frac{1}{2}mv(t)^2 = \frac{1}{2}m\omega^2 x_m^2\sin^2(\omega t + \phi). \tag{1.5.10}
$$
Both of these energies oscillate with time, as they depend on the position or velocity of the particle.
The period of this oscillation is $\pi/\omega = T/2$. However, since $m\omega^2 = k$, the total energy, which is the sum of the
potential and kinetic energies, is conserved and remains constant:
$$
E(t) = U(t) + K(t) = \frac{1}{2}kx_m^2 = \text{const}. \tag{1.5.11}
$$
This total energy is equal to the potential energy when the displacement of the particle reaches its
maximum, since the velocity, and hence the kinetic energy, at that point is zero.
1.5.2
Damped harmonic motion
Damped harmonic motion is a type of oscillation that occurs in real physical systems. It involves
an object oscillating on a spring, but unlike the idealized simple harmonic motion, the system
experiences damping due to various factors such as internal friction and air resistance. As a result,
the amplitude of oscillation decreases with time, and the system eventually comes to rest. This
decay in amplitude is a consequence of the energy of the system being dissipated into thermal
energy.
To account for the energy loss due to friction or other dissipative effects, we introduce a damping
force F⃗d proportional to the velocity ⃗v of the system. In the case of linear viscous damping, the
force is given by
F⃗d = −b⃗v ,
(1.5.12)
where b is a positive constant known as the damping coefficient. The damping force acts in the
opposite direction of the motion and causes the system to slow down. A larger value of b corresponds
to a stronger damping force, resulting in a faster decay of the oscillation amplitude. Note that this
linear assumption is valid only for small velocities. The generic damping force is very complicated.
It depends on the velocity, the properties of the medium, the geometry of the system, and many
other factors.
With the damping force, we can write Newton’s second law for the system as:
−bv − kx = ma.
(1.5.13)
By substituting dx/dt for v and $d^2x/dt^2$ for a, we get the following second-order linear differential
equation:
$$
m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = 0. \tag{1.5.14}
$$
The standard procedure for solving the above equation is to assume $x(t) = e^{\lambda t}$, which gives the
algebraic equation $m\lambda^2 + b\lambda + k = 0$. Depending on the sign of $b^2 - 4mk$, we have three types of
damped harmonic motion solutions with quite different physical behaviors. Figure 1.26 illustrates
the position versus time plots for three types of damped harmonic motions.
Figure 1.26: Three types of damped harmonic motions. (a) Underdamped; (b) Critically damped;
(c) Overdamped.
1. Underdamped (b <
where
√
b
4mk or b2 − 4mk < 0): λ has two complex roots λ1/2 = − 2m
± iω ′ ,
r
ω′ =
b2
k
−
<ω=
m 4m2
r
k
m
(1.5.15)
is the angular frequency of the damped oscillator. The general solution can be written as
b
b
x(t) = e− 2m t (c1 cos ω ′ t + c2 sin ω ′ t) = xm e− 2m t cos(ω ′ t + ϕ).
(1.5.16)
Here, xm is the amplitude of the motion. The system oscillates (at reduced frequency ω ′
compared to the undamped case ω) with the amplitude gradually decreasing to zero due to
b
the exponential factor e− 2m t .
56
b
Since the amplitude of the underdamped oscillator can be understood as xm e− 2m t , the total
energy of the oscillator can be approximated by
b
1
E(t) ≈ kx2m e− m t ,
2
(1.5.17)
which decreases exponentially with time. The energy loss is attributed to internal energy in
the medium that is responsible for damping.
p
If the damping is absent (b = 0), then ω ′ = ω = k/m, which is the angular frequency of an
undamped oscillator. And the solution to the differential equation reduces to the solution of
the undamped oscillator.
2. Overdamped ($b > \sqrt{4mk}$, or $b^2 - 4mk > 0$): $\lambda$ has two real and distinct roots $\lambda_{1/2} = -\frac{b}{2m} \pm \omega'$
with
$$ \omega' = \sqrt{\frac{b^2}{4m^2} - \frac{k}{m}}. \tag{1.5.18} $$
The general solution can be written as the sum of two exponentially decaying ($\lambda_{1/2} < 0$)
functions
$$ x(t) = c_1 e^{-|\lambda_1|t} + c_2 e^{-|\lambda_2|t}, \tag{1.5.19} $$
where $c_1$ and $c_2$ are constants determined by the initial conditions. Therefore, the overdamped
system returns to equilibrium (decaying exponentially) without oscillating. The decay
rate at long times is governed by the root with the smaller absolute value, $|\lambda_1| = -\lambda_1 = \frac{b}{2m} - \omega'$:
$$ x(t) \sim c_1 e^{-|\lambda_1|t}, \quad \text{as } t \to +\infty. \tag{1.5.20} $$
3. Critically damped ($b = \sqrt{4mk}$, or $b^2 - 4mk = 0$): $\lambda$ has two identical roots $\lambda_{1/2} = \lambda = -\frac{b}{2m}$.
The angular frequency is $\omega' = 0$. The general solution is
$$ x(t) = (c_1 + c_2 t)\,e^{-|\lambda|t}, \tag{1.5.21} $$
where c1 and c2 are determined by initial conditions. The system returns to equilibrium as
quickly as possible without oscillating (faster than the overdamped case).
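The three cases above can be told apart numerically from the sign of $b^2 - 4mk$. The following is a minimal Python sketch and is not part of the original notes; the function name `classify_damping` and the sample parameter values are illustrative assumptions. It returns the regime together with the two roots of $m\lambda^2 + b\lambda + k = 0$.

```python
import cmath

def classify_damping(m, b, k):
    """Return (regime, lambda_1, lambda_2) for m*lam**2 + b*lam + k = 0.

    Hypothetical helper for illustration; the roots are returned as complex
    numbers so that all three regimes share one code path.
    """
    disc = b * b - 4.0 * m * k
    root = cmath.sqrt(disc)
    lam1 = (-b + root) / (2.0 * m)
    lam2 = (-b - root) / (2.0 * m)
    if disc < 0:
        regime = "underdamped"        # complex pair, imaginary part = +/- omega'
    elif disc > 0:
        regime = "overdamped"         # two distinct negative real roots
    else:
        regime = "critically damped"  # double root -b/(2m)
    return regime, lam1, lam2

if __name__ == "__main__":
    # Sample values chosen only for illustration (m = k = 1).
    for b in (0.7, 2.0, 3.0):
        print(b, classify_damping(1.0, b, 1.0))
```

In floating-point arithmetic the exact equality $b^2 = 4mk$ is rarely met for measured parameters, so in practice the critically damped case is better understood as the boundary between the other two.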
1.5.3
Forced oscillations and resonance
In contrast to the simple harmonic oscillator and the damped oscillator, we may encounter a system
in which a time-dependent external force $F(t)$ acts on the particle. One example is a swing being
pushed by a person with a periodic force. From Newton’s second law, we obtain the equation of
motion $F(t) - bv(t) - kx(t) = ma(t)$. This is the second-order inhomogeneous linear differential
equation
$$ m\frac{d^2x(t)}{dt^2} + b\frac{dx(t)}{dt} + kx(t) = F(t), \tag{1.5.22} $$
where the time-dependent external force $F(t)$ is the inhomogeneous term.
Using the Fourier transform, a generic time-dependent force can be decomposed into a superposition of
sine and cosine terms with different frequencies. For simplicity, let us consider a single-component
periodic external force of the form
$$ F(t) = F_m\cos(\omega_F t). \tag{1.5.23} $$
The general solution of the equation of motion, Eq. (1.5.22), is the sum of a particular
solution and the general solution of the homogeneous equation, Eq. (1.5.14), discussed in the previous subsection. We
can assume the particular solution to be of the form
$$ x(t) = A\cos(\omega_F t + \phi), \tag{1.5.24} $$
where the amplitude $A$ and phase $\phi$ are unknown coefficients. This solution corresponds to a steady
state that is periodic in time. It is also the solution in the limit
$t \to \infty$, since the homogeneous part decays to zero in this limit because of the damping. By substituting the trial solution
Eq. (1.5.24) into Eq. (1.5.22), we have
$$ A(k - m\omega_F^2)\cos(\omega_F t + \phi) - bA\omega_F\sin(\omega_F t + \phi) = F_m\cos(\omega_F t). \tag{1.5.25} $$
This equation can be rewritten as two equations for $A$ and $\phi$ by expanding $\cos(\omega_F t) = \cos(\omega_F t + \phi)\cos\phi + \sin(\omega_F t + \phi)\sin\phi$ on the right-hand side:
$$ A(k - m\omega_F^2) = F_m\cos\phi, \tag{1.5.26} $$
$$ -bA\omega_F = F_m\sin\phi. \tag{1.5.27} $$
So the amplitude and phase of the trial solution can be determined as
$$ A = \frac{F_m}{\sqrt{m^2(\omega_F^2 - \omega^2)^2 + b^2\omega_F^2}}, \tag{1.5.28} $$
$$ \phi = \arctan\frac{b\omega_F}{m(\omega_F^2 - \omega^2)}, \tag{1.5.29} $$
where $\omega = \sqrt{k/m}$ is the natural/intrinsic frequency of the oscillator (without damping or external
force).
The steady-state solution of the forced oscillation problem, given by Eq. (1.5.24) in the limit
$t \to \infty$, describes a harmonic oscillation with the angular frequency $\omega_F$ inherited from the external
force. This frequency is NOT related to the natural/intrinsic frequency $\omega = \sqrt{k/m}$ of the oscillator.
In fact, the steady-state solution oscillates at the same frequency as the driving force, regardless of
the intrinsic parameters ($m$ and $k$) of the oscillator and its initial conditions (see Figure 1.27).
Figure 1.27: Forced oscillations with different initial conditions for k = 1, m = 1, Fm = 1, b = 0.7
and ωF = 1.1 (the natural frequency is ω = 1). All the curves converge to the steady oscillation Eq. (1.5.24) in the large t limit.
The frequency of the motion also converges to the external frequency ωF of the applied force.
One interesting phenomenon in the forced harmonic oscillator is resonance, where the amplitude
of the steady-state motion is maximized. The amplitude $A$ of the steady state in Eq. (1.5.28) depends
on both the frequency $\omega_F$ of the external force and the natural frequency $\omega = \sqrt{k/m}$ of the oscillator
(see Figure 1.28). From Eq. (1.5.28), for weak damping the amplitude reaches its maximum value when the extrinsic and
intrinsic frequencies coincide, i.e.,
$$ \omega_F = \omega = \sqrt{\frac{k}{m}}. \tag{1.5.30} $$
This condition is called resonance, and it is independent of the initial conditions. Strictly speaking, the maximum
of Eq. (1.5.28) lies at $\omega_F^2 = \omega^2 - b^2/(2m^2)$, slightly below $\omega$; the shift is negligible when $b \ll m\omega$.
Figure 1.28: The amplitude A of the steady state as a function of frequency ratio ωF /ω. The system
exhibits resonance when the driving frequency is equal to the natural frequency of the oscillator,
ωF = ω.
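As a numerical illustration of Eqs. (1.5.26)–(1.5.28), the sketch below evaluates the steady-state amplitude and phase and scans the driving frequency for the largest amplitude. It is a hedged example rather than part of the notes: the function names, the scan range, and the use of `atan2` (chosen here so that the phase satisfies both Eq. (1.5.26) and Eq. (1.5.27), which a bare arctangent does not guarantee) are assumptions of this sketch. The parameter values are those quoted in the caption of Figure 1.27.

```python
import math

def steady_state(m, b, k, F_m, omega_F):
    """Amplitude A and phase phi of the steady state x(t) = A*cos(omega_F*t + phi)."""
    c = k - m * omega_F**2   # equals F_m*cos(phi)/A by Eq. (1.5.26)
    s = -b * omega_F         # equals F_m*sin(phi)/A by Eq. (1.5.27)
    A = F_m / math.hypot(c, s)
    phi = math.atan2(s, c)   # branch consistent with both equations
    return A, phi

def peak_frequency(m, b, k, F_m, lo=0.5, hi=1.5, steps=1000):
    """Driving frequency on a uniform grid at which A is largest (brute-force scan)."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: steady_state(m, b, k, F_m, w)[0])

if __name__ == "__main__":
    m, b, k, F_m = 1.0, 0.7, 1.0, 1.0
    print("A, phi at omega_F = 1.1:", steady_state(m, b, k, F_m, 1.1))
    print("grid frequency of largest amplitude:", peak_frequency(m, b, k, F_m))
```

For lightly damped systems the scanned peak sits very close to the natural frequency; with damping as strong as $b = 0.7$ it is visibly lower, because the denominator of Eq. (1.5.28) is minimized at $\omega_F^2 = \omega^2 - b^2/(2m^2)$.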
1.5.4
Simple Pendulum
A pendulum is a physical system consisting of a mass (called the bob) m suspended from a fixed
point by a string or rod with length L that is free to swing back and forth under the influence of
gravity. Pendulums are studied in physics because they exhibit a simple harmonic motion in some
limit.
Figure 1.29: A simple pendulum.
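By Newton’s second law, the motion of a pendulum is governed by the equations
$$ T - mg\cos\theta = m\dot\theta^2 L, \tag{1.5.31} $$
$$ mg\sin\theta = -mL\frac{d^2\theta}{dt^2}, \tag{1.5.32} $$
where $\theta$ is the angular displacement of the pendulum from its equilibrium position, $T$ is the tension in the string, and $g$ is the
acceleration due to gravity. The first equation is the radial (centripetal) component and the second is the tangential component. The second equation is a nonlinear differential equation that is difficult
to solve exactly.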
In the case of small angles, where $\sin\theta \approx \theta$, the differential equation governing the motion of a
simple pendulum can be simplified to
$$ \frac{d^2\theta}{dt^2} + \frac{g}{L}\theta = 0, \tag{1.5.33} $$
which is exactly the linear differential equation Eq. (1.5.3) for a harmonic oscillator with a natural
frequency of
$$ \omega = \sqrt{\frac{g}{L}}. \tag{1.5.34} $$
For a pendulum released from rest, the solution to this differential equation is
$$ \theta(t) = \theta_0\cos(\omega t), \tag{1.5.35} $$
where $\theta_0$ is the initial displacement of the pendulum from its equilibrium position. We can observe
that the period of the pendulum, given by
$$ T = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{L}{g}}, \tag{1.5.36} $$
is independent of the mass $m$ of the pendulum.
Similar to the harmonic oscillator of mass-spring system, the energy of a simple pendulum is
transformed between kinetic energy and gravitational potential energy as the pendulum swings
back and forth.
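To see how good the small-angle approximation is, one can integrate the nonlinear equation (1.5.32) numerically and compare the resulting period with Eq. (1.5.36). The sketch below is an illustration under stated assumptions: it uses a fourth-order Runge-Kutta step, a pendulum released from rest at a positive angle `theta0`, and the sample values $L = 1$ m and $g = 9.8$ m/s². The function name `pendulum_period` and the step size are choices made for this example.

```python
import math

def pendulum_period(theta0, L=1.0, g=9.8, dt=1e-4):
    """Period of a pendulum released from rest at angle theta0 > 0 (radians).

    Integrates theta'' = -(g/L) sin(theta) with RK4 until theta first reaches
    zero (a quarter period), then multiplies the elapsed time by four.
    """
    def deriv(theta, omega):
        return omega, -(g / L) * math.sin(theta)

    theta, omega, t = theta0, 0.0, 0.0
    while True:
        k1 = deriv(theta, omega)
        k2 = deriv(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
        k3 = deriv(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
        k4 = deriv(theta + dt * k3[0], omega + dt * k3[1])
        new_theta = theta + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        new_omega = omega + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        if new_theta <= 0.0:
            # Linear interpolation for the moment theta crosses zero.
            frac = theta / (theta - new_theta)
            return 4.0 * (t + frac * dt)
        theta, omega, t = new_theta, new_omega, t + dt

if __name__ == "__main__":
    T0 = 2.0 * math.pi * math.sqrt(1.0 / 9.8)   # Eq. (1.5.36)
    for theta0 in (0.01, 0.5, 1.5):
        print(theta0, pendulum_period(theta0) / T0)
```

The printed ratio stays close to 1 for small release angles and grows with amplitude, which shows that the amplitude independence of the period is a property of the small-angle limit only.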
1.5.5
Double pendulum and chaos
The double pendulum is a classic example of a complex physical system that exhibits chaotic
behavior. It consists of two pendulums attached to each other, with the second pendulum attached
to the end of the first pendulum.
Figure 1.30: Double pendulum.
Consider a double pendulum consisting of two point masses $m_1$ and $m_2$, suspended by rigid,
massless rods of lengths $L_1$ and $L_2$ respectively. The configuration is described by two
angles $\theta_1$ and $\theta_2$, which the rods make with the vertical. The position of each mass can be described by two coordinates: $x_1 = L_1\sin\theta_1$ and
$y_1 = -L_1\cos\theta_1$ for mass $m_1$, and $x_2 = L_1\sin\theta_1 + L_2\sin\theta_2$ and $y_2 = -L_1\cos\theta_1 - L_2\cos\theta_2$ for mass
$m_2$. The velocities and accelerations of the masses are the time derivatives of these coordinates:
$\dot x_i$, $\dot y_i$, $\ddot x_i$, and $\ddot y_i$, which can be expressed as functions of $\theta_i$, $\dot\theta_i$, and $\ddot\theta_i$.
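Using Newton’s second law for each mass, we can write down the equations of motion:
$$ m_1\ddot x_1 = -T_1\sin\theta_1 + T_2\sin\theta_2, \tag{1.5.37} $$
$$ m_1\ddot y_1 = T_1\cos\theta_1 - T_2\cos\theta_2 - m_1 g, \tag{1.5.38} $$
$$ m_2\ddot x_2 = -T_2\sin\theta_2, \tag{1.5.39} $$
$$ m_2\ddot y_2 = T_2\cos\theta_2 - m_2 g, \tag{1.5.40} $$
where $T_1$ and $T_2$ are the tensions in the two rods. By eliminating the tensions, we obtain a set of
coupled, nonlinear differential equations for the two angles $\theta_1$ and $\theta_2$:
$$ \ddot\theta_1 = \frac{-g(2m_1+m_2)\sin\theta_1 - m_2 g\sin(\theta_1 - 2\theta_2) - 2\sin(\theta_1-\theta_2)\,m_2\left(\dot\theta_2^2 L_2 + \dot\theta_1^2 L_1\cos(\theta_1-\theta_2)\right)}{L_1\left(2m_1 + m_2 - m_2\cos(2\theta_1 - 2\theta_2)\right)}, \tag{1.5.41} $$
$$ \ddot\theta_2 = \frac{2\sin(\theta_1-\theta_2)\left(\dot\theta_1^2 L_1(m_1+m_2) + g(m_1+m_2)\cos\theta_1 + \dot\theta_2^2 L_2 m_2\cos(\theta_1-\theta_2)\right)}{L_2\left(2m_1 + m_2 - m_2\cos(2\theta_1 - 2\theta_2)\right)}. \tag{1.5.42} $$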
These equations for the double pendulum are quite complex and difficult to solve analytically.
However, the motion of the double pendulum can be simulated using numerical methods, such
as the Runge-Kutta algorithm, which can accurately predict the motion of the double pendulum
for a given set of initial conditions. You can find interactive simulations of double pendulum on
this website.
One of the most fascinating features of the double pendulum is its chaotic behavior. Chaos is a
phenomenon where a small change in the initial conditions of a system can lead to vastly different
outcomes in its motion. In the case of the double pendulum, even the slightest differences in the
initial positions or velocities of the pendulums can result in completely different trajectories.
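The numerical approach and the sensitivity to initial conditions described above can be tried out with a short script. The following is a minimal sketch, not the simulation referred to in the notes: it implements Eqs. (1.5.41) and (1.5.42) with a classical fourth-order Runge-Kutta step and runs two copies of the pendulum whose initial angles differ by $10^{-9}$ rad. All function names, the parameter values ($m_1 = m_2 = 1$, $L_1 = L_2 = 1$, $g = 9.8$), the initial angles, and the step size are assumptions made for illustration. The total energy is monitored as a basic check of the integrator.

```python
import math

def accelerations(th1, th2, w1, w2, m1, m2, L1, L2, g):
    """Angular accelerations from Eqs. (1.5.41) and (1.5.42)."""
    d = th1 - th2
    den = 2 * m1 + m2 - m2 * math.cos(2 * d)
    a1 = (-g * (2 * m1 + m2) * math.sin(th1) - m2 * g * math.sin(th1 - 2 * th2)
          - 2 * math.sin(d) * m2 * (w2 * w2 * L2 + w1 * w1 * L1 * math.cos(d))) / (L1 * den)
    a2 = (2 * math.sin(d) * (w1 * w1 * L1 * (m1 + m2) + g * (m1 + m2) * math.cos(th1)
          + w2 * w2 * L2 * m2 * math.cos(d))) / (L2 * den)
    return a1, a2

def rk4_step(state, dt, params):
    """One RK4 step for state = (theta1, theta2, omega1, omega2)."""
    def f(s):
        a1, a2 = accelerations(s[0], s[1], s[2], s[3], *params)
        return (s[2], s[3], a1, a2)
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = f(state)
    k2 = f(shift(state, k1, dt / 2))
    k3 = f(shift(state, k2, dt / 2))
    k4 = f(shift(state, k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, params):
    """Kinetic plus gravitational potential energy of the two masses."""
    th1, th2, w1, w2 = state
    m1, m2, L1, L2, g = params
    kinetic = 0.5 * m1 * (L1 * w1) ** 2 + 0.5 * m2 * (
        (L1 * w1) ** 2 + (L2 * w2) ** 2 + 2 * L1 * L2 * w1 * w2 * math.cos(th1 - th2))
    potential = -(m1 + m2) * g * L1 * math.cos(th1) - m2 * g * L2 * math.cos(th2)
    return kinetic + potential

def simulate(state, t_end, dt, params):
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, params)
    return state

if __name__ == "__main__":
    params = (1.0, 1.0, 1.0, 1.0, 9.8)   # m1, m2, L1, L2, g (illustrative)
    a = (2.0, 2.0, 0.0, 0.0)
    b = (2.0 + 1e-9, 2.0, 0.0, 0.0)      # tiny change in the first angle
    for t_end in (5.0, 10.0, 20.0):
        sa = simulate(a, t_end, 1e-3, params)
        sb = simulate(b, t_end, 1e-3, params)
        print(t_end, abs(sa[0] - sb[0]), energy(sa, params) - energy(a, params))
```

Comparing the printed angle differences at successive times gives a rough picture of how quickly the two trajectories separate; the energy column indicates how far the numerical solution itself can be trusted over the same interval.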
Figure 1.31: Long exposure of double pendulum exhibiting chaotic motion (tracked with an LED).
This characteristic of the double pendulum has significant implications. It means that it is
practically impossible to predict the long-term motion of the system, since any small measurement
error or uncertainty in the initial conditions can lead to dramatically different results. This is
known as the “butterfly effect”: in a chaotic system, even the flapping of a butterfly’s wings could in principle lead to a
hurricane on the other side of the world.
1.6
The Theory of Gravitation
One of the earliest goals of physics is to understand the gravitational force that holds us to Earth,
holds the Moon in orbit around Earth, and holds Earth in orbit around the Sun. It also reaches
out through the whole Milky Way galaxy, holding together the billions and billions of stars in the
Galaxy and the countless molecules and dust particles between stars. The gravitational force also
reaches across intergalactic space, holding together the Local Group of galaxies, which includes,
in addition to the Milky Way, the Andromeda Galaxy at a distance of $2.3 \times 10^6$ light-years away
from Earth, plus several closer dwarf galaxies. The Local Group is part of the Local Supercluster
of galaxies that is being drawn by the gravitational force toward an exceptionally massive region
of space called the Great Attractor. This region appears to be about $3.0 \times 10^8$ light-years from
Earth, on the opposite side of the Milky Way. And the gravitational force is even more far-reaching
because it attempts to hold together the entire universe, which is still expanding.
Gravitation was the first fundamental force whose law became known, thanks to Newton’s great
contribution, Newton’s law of universal gravitation, which is the focus of the current section.
However, the gravitational force remains one of the most mysterious forces in nature and is
still not fully understood by physicists.
1.6.1
Newton’s law of Gravitation
If the legend is true, in 1665 a falling apple inspired the 23-year-old Isaac Newton to formulate his law
of universal gravitation. Newton recognized that the force pulling the apple to the ground is also
responsible for holding the Moon in its orbit. He further made the far-reaching generalization that
every body in the universe attracts every other body. This tendency of bodies to move toward one
another is called gravitation, and it originates from the mass of each body.
Physics law 5 (Newton’s Law of Gravitation). Any point particle in the universe attracts any
other point particle with a gravitational force whose magnitude is
$$ F = G\frac{m_1 m_2}{r^2}, $$
where $m_1$ and $m_2$ are the masses of the two particles, $r$ is the distance between them, and
$$ G = 6.67 \times 10^{-11}\ \mathrm{N\cdot m^2/kg^2} $$
is the gravitational constant.
Here, “particle 1 at $\vec r_1$ attracts particle 2 at $\vec r_2$” means that the gravitational force on particle 2 points along
the direction from $\vec r_2$ to $\vec r_1$, i.e., the gravitational force on particle 2 in vector form is
$$ \vec F_{12} = G\frac{m_1 m_2}{|\vec r_1 - \vec r_2|^2}\,\frac{\vec r_1 - \vec r_2}{|\vec r_1 - \vec r_2|}. \tag{1.6.1} $$
In reality, two objects can be treated as point particles only when the distance between them is
sufficiently large. When the two objects are close to each other compared to their own scales, the
formula (1.6.1) cannot be applied directly. In particular, ⃗r1 and ⃗r2 are not well-defined for the two
objects anymore. In general, let D ⊂ R3 denote a domain occupied by an object. We decompose
it into a union of many small parts (called differential elements), each of which can be treated as
a point particle. A small differential element at ⃗r = (x, y, z) can be thought of as a box with side
lengths ∆x, ∆y, and ∆z. Suppose the density of the object at ⃗r = (x, y, z) is given by ρ(⃗r). Then,
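the gravitational force of this differential element on a point particle of mass $m$ at position $\vec r'$ is given by
$$ Gm\rho(\vec r)\,\frac{\vec r - \vec r'}{|\vec r - \vec r'|^3}\,\Delta x\,\Delta y\,\Delta z. $$
Then, by the principle of superposition for forces, the total gravitational force is given by
$$ \sum_{\text{differential elements in } D} Gm\rho(\vec r)\,\frac{\vec r - \vec r'}{|\vec r - \vec r'|^3}\,\Delta x\,\Delta y\,\Delta z. $$
Taking $\Delta x$, $\Delta y$ and $\Delta z$ to zero, the summation above becomes a triple integral
$$ \iiint_D Gm\rho(\vec r)\,\frac{\vec r - \vec r'}{|\vec r - \vec r'|^3}\,dx\,dy\,dz, \quad \vec r = (x, y, z). \tag{1.6.2} $$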
You will learn how to evaluate such multi-variable integrals in your calculus courses; it is not the
focus of this course.
In applications, the integral (1.6.2) can be very hard to calculate analytically. We hope to
find a simpler way to calculate the gravitational force between two objects. Intuitively, it may be natural
to conjecture that we can still apply (1.6.1) directly with $\vec r_1$ and $\vec r_2$ being the centers of mass of
the two objects. Unfortunately, this is not true in general. However, there are some special cases
where it is true, which simplifies the calculations greatly.
Proposition 1.6.1. Consider a ball with uniform density (mass per unit volume). Suppose the
center of the ball is at ⃗r1 and the ball has mass m1 . The magnitude of the gravitational force from
this ball on a particle of mass m2 , located outside the ball at ⃗r2 , is then given by (1.6.1).
In other words, when calculating the gravitation, a uniform ball can be treated as a point
particle of the same mass located at its center. From this proposition, we can easily derive that
when calculating the gravitational force between two balls, they can be treated as two point particles
located at their respective centers (think about why). This means that when we consider gravitation
between planets, modeled as uniform balls, it is safe to treat them as point particles regardless of the distance
between them.
Let us assume that Earth is a uniform sphere of mass $M$. The magnitude of the gravitational
force from Earth on a particle of mass $m$, located outside Earth a distance $r$ from Earth’s center,
is then given by $GMm/r^2$. If the particle is released, it will fall toward the center of Earth with
the gravitational acceleration $g = GM/r^2$. In particular, at the surface of Earth, the gravitational
acceleration is about $9.8\ \mathrm{m/s^2}$. Moreover, the radius of Earth is about $r = 6371$ km. Then, we can
calculate the mass of Earth as
$$ M = \frac{gr^2}{G} \approx 5.96 \times 10^{24}\ \mathrm{kg}. $$
This is a simple example of a way to measure the masses of planets: we first observe the acceleration
caused by the gravitation of a planet and measure various distances with astronomical observations,
and then calculate the mass using Newton’s law of gravitation.
Example. A geostationary satellite is a geosynchronous satellite, which has a geostationary orbit—
a circular orbit directly above the Earth’s equator and with an orbital period the same as the Earth’s
rotation period. Determine the height of a geostationary satellite.
Solution: Let $R$ be the distance from the geostationary satellite to the center of Earth, $r = 6371$ km
be the radius of Earth, and $g = 9.8\ \mathrm{m/s^2}$ be the gravitational acceleration at the ground. We have
seen that $GM = gr^2$. The angular speed of the satellite is
$$ \omega = \frac{2\pi}{24 \times 3600}\ \mathrm{rad/s}. $$
Since the size of the satellite is very small compared with $r$, it can be treated as a point particle.
Then, by Newton’s second law, we have
$$ m\omega^2 R = \frac{GMm}{R^2} = \frac{mgr^2}{R^2} \quad\Rightarrow\quad R = \left(\frac{gr^2}{\omega^2}\right)^{1/3}. $$
The height is then given by $R - r$, which is about $3.6 \times 10^4$ km.
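The two calculations above (the mass of Earth and the height of a geostationary satellite) are easy to reproduce numerically. The short Python sketch below uses only the values quoted in the text; the function names are illustrative, and the 24-hour period is the one assumed in the solution.

```python
import math

G = 6.67e-11        # gravitational constant, N*m^2/kg^2 (value from the text)
g = 9.8             # gravitational acceleration at the surface, m/s^2
r_earth = 6371e3    # radius of Earth, m

def earth_mass():
    """M = g r^2 / G."""
    return g * r_earth**2 / G

def geostationary_height(period=24 * 3600.0):
    """Height R - r of a circular orbit with the given period (seconds)."""
    omega = 2.0 * math.pi / period
    R = (g * r_earth**2 / omega**2) ** (1.0 / 3.0)
    return R - r_earth

if __name__ == "__main__":
    print("mass of Earth [kg]:", earth_mass())
    print("geostationary height [km]:", geostationary_height() / 1e3)
```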
We have an immediate generalization of Proposition 1.6.1 to the radially symmetrical case.
Proposition 1.6.2. Proposition 1.6.1 also holds for a ball with radially symmetrical density, i.e.,
the density depends only on the distance to the center.
In particular, this proposition holds for the gravitational forces outside a uniform shell of
matter—the region between two spheres with the same center. On the other hand, we also have
the following simple fact inside the shell:
Proposition 1.6.3. A shell of matter with radially symmetric density exerts no net gravitational
force on a particle located inside it.
By Proposition 1.6.2 and Proposition 1.6.3, to calculate the gravitational force at a point $\vec x$
inside a ball with radially symmetric density, we can ignore the shell of matter lying farther from the center than $\vec x$ and treat the
matter closer to the center than $\vec x$ as a point particle located at the center of the ball.
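Propositions 1.6.2 and 1.6.3 can be checked numerically for a thin uniform shell without any multi-variable calculus. The sketch below is an illustration under assumptions of its own: the shell is cut into thin rings perpendicular to the line joining the center and the particle, the force component along that line is summed with the midpoint rule, and units with $G = M = m = a = 1$ are used. The function name `shell_force` is hypothetical.

```python
import math

def shell_force(d, a=1.0, M=1.0, m=1.0, G=1.0, n=20000):
    """Attraction toward the shell center felt by a particle at distance d
    from the center of a thin uniform shell of radius a and mass M."""
    total = 0.0
    dtheta = math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta                    # polar angle of the ring
        dM = 0.5 * M * math.sin(theta) * dtheta       # mass of the ring
        s2 = a * a + d * d - 2 * a * d * math.cos(theta)   # squared ring-particle distance
        # Component of the ring's pull along the line toward the center.
        total += G * m * dM * (d - a * math.cos(theta)) / s2 ** 1.5
    return total

if __name__ == "__main__":
    print("outside, d = 2a:", shell_force(2.0), "vs G M m / d^2 =", 1.0 / 4.0)
    print("inside, d = a/2:", shell_force(0.5))
```

Outside the shell the sum reproduces the point-particle value $GMm/d^2$, and inside the shell it comes out numerically zero, in line with the two propositions. A solid ball with radially symmetric density is a stack of such shells.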
Example. Assume that Earth is a uniform ball. Suppose there is a tunnel connecting the north
pole and the south pole. Drop a ball in the tunnel from the north pole. Determine the time the ball
needs to reach the south pole.
Solution: Let $R$ be the radius of Earth, $M$ the mass of Earth, and $\rho$ the density of Earth. We set up
an axis along the tunnel pointing from the north pole to the south pole. Let the origin be the center
of Earth, with the coordinates of the north and south poles being $-R$ and $R$ respectively. Denote
by $r(t)$ the coordinate of the ball, with $r(0) = -R$. By Proposition 1.6.2 and Proposition 1.6.3, the
gravitational acceleration of the ball at $r(t)$ is equal to
$$ a(t) = -\operatorname{sgn}(r(t))\,\frac{G\rho\,\frac{4}{3}\pi|r(t)|^3}{|r(t)|^2} = -\frac{4}{3}\pi G\rho\,r(t) = -\frac{GM}{R^3}r(t) = -\frac{g}{R}r(t), \tag{1.6.3} $$
where the sign function
$$ \operatorname{sgn}(r(t)) = \begin{cases} 1, & r(t) > 0 \\ 0, & r(t) = 0 \\ -1, & r(t) < 0 \end{cases} $$
is due to the fact that the gravitational force points to the center of Earth, and we used $M = \frac{4}{3}\pi\rho R^3$ and $GM = gR^2$ in the
last two steps. Thus, we get the differential equation
$$ \ddot r(t) = -\frac{g}{R}r(t), $$
with initial conditions $r(0) = -R$ and $\dot r(0) = 0$. This again gives a harmonic oscillator, which has
the solution
$$ r(t) = -R\cos\left(\sqrt{\frac{g}{R}}\,t\right). $$
To reach the south pole with $r(t) = R$, we need half of the period of the oscillation, which is
$$ t = \pi\sqrt{\frac{R}{g}}. $$
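Putting numbers into the result gives a feeling for the time scale. The sketch below uses $g = 9.8\ \mathrm{m/s^2}$ and $R = 6371$ km as in the earlier examples; the function names are illustrative and the uniform-Earth model is the assumption made in the example itself.

```python
import math

g = 9.8        # gravitational acceleration at the surface, m/s^2
R = 6371e3     # radius of Earth, m

def position(t):
    """Coordinate of the ball in the tunnel, r(t) = -R cos(sqrt(g/R) t)."""
    return -R * math.cos(math.sqrt(g / R) * t)

def travel_time():
    """Time from the north pole to the south pole, t = pi sqrt(R/g)."""
    return math.pi * math.sqrt(R / g)

if __name__ == "__main__":
    t = travel_time()
    print("travel time [s]:", t, " [min]:", t / 60.0)
    print("position at that time [m]:", position(t))
```

With these values the trip takes roughly 42 minutes, independent of the mass of the ball.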
It is possible to prove Propositions 1.6.1–1.6.3 by evaluating the integral (1.6.2) directly. But
there is a much simpler way to prove them based on a deep theorem in multi-variable calculus,
called Gauss’s law; see Section 1.6.5 below.
1.6.2
Gravitational potential energy
In Section 1.4, we have mentioned that the gravitational force is a conservative force and is associated with some potential energy. There we were careful to keep the particle near Earth’s surface
so that we could regard the gravitational force as constant. By Newton’s Law of Gravitation, we
know that the gravitational force is generally not constant. Is it still conservative, and, if yes, what
is the associated potential energy?
Suppose we already know that the gravitational force is conservative. We now calculate the
potential energy. Consider a point particle of mass $M$ located at the origin and a particle of mass $m$ located
at $\vec r = (x, y, z)$. Suppose the potential energy of the system is 0 when $r \to \infty$. Now, suppose we move the particle of mass $m$
from $\infty$ to $\vec r$ along the radial direction, i.e., from $\infty\,\vec e_r$ to $\vec r$ with $\vec e_r = \vec r/|\vec r|$. Then, it is easy to
calculate the work done by the gravitational force during this process:
$$ W = \int_r^\infty \frac{GMm}{x^2}\,dx = \frac{GMm}{r}. $$
Thus, the potential energy, if it exists, must be equal to $-W = -GMm/r$.
Proposition 1.6.4. Gravitational force is conservative. Furthermore, the potential energy of a
system of two particles with masses m1 and m2 is
$$ V = -\frac{Gm_1m_2}{r} + C, \tag{1.6.4} $$
where r is the distance between the two particles. In particular, we can choose C = 0 if the reference
potential energy at ∞ is taken to be 0.
Proof. To show that $V$ is a potential energy, it suffices to show that its negative gradient is equal to the
gravitational force. Suppose particle 1 is at the origin, and particle 2 is at $\vec r = (x, y, z)$. Then,
$$ V(\vec r) = -\frac{Gm_1m_2}{(x^2 + y^2 + z^2)^{1/2}} + C. $$
We can calculate its negative gradient as
$$ -\nabla V(\vec r) = -(\partial_x V, \partial_y V, \partial_z V) = -\frac{Gm_1m_2}{(x^2 + y^2 + z^2)^{3/2}}(x, y, z) = -\frac{Gm_1m_2}{r^2}\,\frac{\vec r}{r}, $$
which is exactly the gravitational force on particle 2. Hence, $V$ is indeed the gravitational potential
energy between the two particles, which implies that the gravitational force is conservative.
By this proposition, the gravitational force satisfies the path independence property: if we move
the pair of particles from one configuration to another, the net work done by the gravitational force
during this process is equal to the negative of the change of the potential energy and does not
depend on the path taken by the particles.
We remark that the potential energy given by Equation (1.6.4) is a property of the system
of two particles rather than of either particle alone. There is no way to divide this energy and
say that so much belongs to one particle and so much to the other. However, if m1 ≫ m2 , as is
true for the problems concerning Earth (mass m1 ) and an object (mass m2 ) near its surface, we
often speak of “the potential energy of particle 2”, because, when the object moves in the vicinity
of Earth, changes in the potential energy of the system appear almost entirely as changes in the
kinetic energy of the object, while changes in the kinetic energy of Earth are too small to be
measured. When we speak of the potential energy of bodies of comparable mass, however, we have
to be careful to treat them as a system.
In general, if our system contains more than two particles, we consider each pair of particles
in turn, calculate the gravitational potential energy of that pair as if the other particles were not
there, and then algebraically sum the results. For example, for a system of three particles with
masses $m_1$, $m_2$ and $m_3$ and distances $r_{12}$, $r_{23}$ and $r_{13}$ between each pair of them, its potential
energy is
$$ V = -\frac{Gm_1m_2}{r_{12}} - \frac{Gm_2m_3}{r_{23}} - \frac{Gm_1m_3}{r_{13}}. $$
More generally, as in the discussion above (1.6.2), the potential energy between an object occupying $D \subset \mathbb{R}^3$
and a point particle of mass $m$ at position $\vec r'$ is calculated as an integral:
$$ V = -\iiint_D \frac{Gm\rho(\vec r)}{|\vec r - \vec r'|}\,dx\,dy\,dz, \quad \vec r = (x, y, z). \tag{1.6.5} $$
Again, the potential energy between a radially symmetric ball and a particle can be evaluated in a
much simpler way by using Propositions 1.6.2 and 1.6.3.
Example. Assume that Earth is a uniform ball. Determine the gravitational potential energy of a
point particle of mass m outside and inside Earth.
Solution: Let $R$ be the radius of Earth, $M$ the mass of Earth, and $\rho$ the density of Earth. First,
suppose the location $\vec r$ of the point particle is outside Earth. Then, with Proposition 1.6.2, we
can obtain the work done by the gravitational force from $\infty$ to $\vec r$ along the radial direction as
$W = GMm/r$, thus giving a potential energy
$$ V(r) = -\frac{GMm}{r}, \quad \text{for } r \ge R. $$
Now, suppose $\vec r$ is inside Earth, i.e., $r < R$. In Equation (1.6.3), we have seen that the gravitational
force on the particle at $x\vec e_r$ for some $r \le x \le R$ has a magnitude
$$ \frac{GMm}{R^3}x. $$
Thus, the work done by the gravitational force from $R\vec e_r$ to $\vec r$ along the radial direction is equal to
$$ \int_r^R \frac{GMm}{R^3}x\,dx = \frac{1}{2}\frac{GMm}{R^3}(R^2 - r^2), $$
which gives a change in the potential energy as
$$ V(r) - V(R) = -\frac{1}{2}\frac{GMm}{R^3}(R^2 - r^2), \quad \text{for } r < R. $$
To summarize, we have
$$ V(r) = \begin{cases} -\dfrac{GMm}{r} = -\dfrac{mgR^2}{r}, & r \ge R, \\[2ex] -\dfrac{GMm}{R} - \dfrac{1}{2}\dfrac{GMm}{R^3}(R^2 - r^2) = -mgR - \dfrac{mg}{2R}(R^2 - r^2), & 0 \le r < R, \end{cases} $$
where we also rewrote the results using $GM = gR^2$.
With a similar argument, we obtain that the gravitational potential energy between two radially
symmetric balls with masses m1 , m2 and distance r between them is given by (1.6.4).
Example. A satellite with mass m moves in a circular orbit around Earth with radius r. Determine
its mechanical energy E.
Solution: The potential energy of the satellite is
$$ V = -\frac{GMm}{r}. $$
To find the kinetic energy, we write Newton’s second law as
$$ \frac{GMm}{r^2} = m\frac{v^2}{r}, $$
where $v^2/r$ is the centripetal acceleration of the satellite. From this equation, we can get the kinetic
energy
$$ K = \frac{1}{2}mv^2 = \frac{GMm}{2r}. $$
Therefore, the total mechanical energy is
$$ E = K + V = -\frac{GMm}{2r}. \tag{1.6.6} $$
Following Section 1.4, we have the celebrated energy conservation for an isolated system interacting only through gravitational forces: the mechanical energy of the system, i.e., the kinetic
energy plus the gravitational potential energy, does not change. We now use the mechanical energy
conservation to study an important concept called escape speed.
If you fire a projectile upward, it will slow, stop momentarily, and return to Earth if the initial
speed is too low. There is a certain minimum initial speed that will cause it to move upward
forever, theoretically coming to rest only at infinity. This minimum initial speed is called the
(Earth) escape speed. We now determine the escape speed of Earth using energy
conservation. Suppose the escape speed is $v_2$. Then, a projectile of mass $m$ launched from the surface with this speed has mechanical energy
$$ E = \frac{1}{2}mv_2^2 - \frac{GMm}{R} = \frac{1}{2}mv_2^2 - mgR, $$
in which $M$ is the mass of Earth and $R$ is its radius. If the projectile can reach infinity, it has zero
potential energy and non-negative kinetic energy there. At the minimum launch speed, the kinetic energy at infinity is
exactly zero, so its mechanical energy is zero and
$$ \frac{1}{2}mv_2^2 - mgR = 0 \quad\Rightarrow\quad v_2 = \sqrt{2gR}. $$
Plugging in $g = 9.8\ \mathrm{m/s^2}$ and $R = 6371$ km, we get the escape speed for Earth as $v_2 = 11.2$ km/s,
which is also called the second cosmic velocity for Earth. The “third cosmic velocity” is the speed
that a spacecraft needs to attain in order to be able to leave our solar system, i.e., to escape the
gravitation of both Earth and the Sun. With a similar method, one can obtain the third cosmic velocity as
16.7 km/s. Finally, let’s determine the “first cosmic velocity”.
Example. Calculate the first cosmic velocity v1 of Earth, defined as the minimum initial speed for
a projectile to move in a circular orbit around Earth.
Solution: By (1.6.6), the mechanical energy of a projectile moving in a circular orbit around Earth
is
$$ E = -\frac{GMm}{2r}, $$
where $m$ is the mass of the projectile and $r$ is the radius of the orbit. Note that this energy is
minimum when $r$ is equal to the radius $R$ of Earth. Moreover, the potential energy at the surface
of Earth is $-GMm/R$. Thus, by energy conservation,
$$ \frac{1}{2}mv_1^2 - \frac{GMm}{R} = -\frac{GMm}{2R} \quad\Rightarrow\quad v_1 = \sqrt{\frac{GM}{R}} = \sqrt{gR} \approx 7.9\ \mathrm{km/s}. $$
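The first and second cosmic velocities differ only by a factor of $\sqrt{2}$, which a few lines of Python make explicit. This is a small arithmetic sketch using the values of $g$ and $R$ quoted in the text; the function names are illustrative.

```python
import math

g = 9.8        # gravitational acceleration at the surface, m/s^2
R = 6371e3     # radius of Earth, m

def first_cosmic_velocity():
    """Speed of a circular orbit at the surface, v1 = sqrt(g R)."""
    return math.sqrt(g * R)

def second_cosmic_velocity():
    """Escape speed from the surface, v2 = sqrt(2 g R)."""
    return math.sqrt(2.0 * g * R)

if __name__ == "__main__":
    v1, v2 = first_cosmic_velocity(), second_cosmic_velocity()
    print("v1 [km/s]:", v1 / 1e3)
    print("v2 [km/s]:", v2 / 1e3)
    print("v2 / v1:", v2 / v1)
```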
We have seen that the escape speed of a planet of mass $M$ and radius $R$ is
$$ \sqrt{\frac{2GM}{R}}, $$
which becomes larger as $M$ becomes larger and $R$ becomes smaller. In particular, when $M$ is so large and $R$
so small that the escape speed exceeds the speed of light, neither particles nor light
can escape from its surface, thus giving some of the most mysterious structures in the universe:
“black holes”. A black hole may form when a star considerably larger than our Sun burns out and
the gravitational force between all its particles causes the star to collapse in on itself. Any
star coming too near a black hole can be ripped apart by the strong gravitational force (i.e., the
tidal force) and pulled into the hole. Enough captures like this yield a supermassive black hole.
Such mysterious monsters appear to be common in the universe, although observing them is very
difficult.
1.6.3
Kepler’s laws
The motions of the planets, as they seemingly wander against the background of the stars, have
been a puzzle since the dawn of history. Johannes Kepler (1571–1630), after a lifetime of study of
the extensive data of the planetary motions in the solar system, worked out the empirical laws that
govern these motions that now bear Kepler’s name.
Physics law 6 (Kepler’s first law: the law of orbits). All planets move in elliptical orbits, with the
Sun at one focus.
Note that a circular orbit is a special case of this law in which the two foci merge into a single
central point. Earth is indeed on an elliptical orbit around the Sun, although the eccentricity
$e$ of the orbit is not large: $e \approx 0.0167$. Recall that the eccentricity is defined as
$$ e = \frac{c}{a} = \sqrt{1 - \frac{b^2}{a^2}}, $$
where $a$ is the semi-major axis, $b$ is the semi-minor axis, and $c$ is the distance from a focus to the
center.
Physics law 7 (Kepler’s second law: the law of areas). A line that connects a planet to the Sun
sweeps out equal areas in the plane of the planet’s orbit in equal time intervals; that is, the rate
dA/dt at which it sweeps out area A is constant.
Qualitatively, this second law tells us that the planet will move most slowly when it is farthest
from the Sun and most rapidly when it is nearest to the Sun. Kepler’s second law is actually
equivalent to the law of conservation of angular momentum, and we will prove it below using
Newton’s second law and law of gravitation.
Physics law 8 (Kepler’s third law: the law of periods). The square of the period of any planet is
proportional to the cube of the semi-major axis of its orbit.
We now illustrate the third law with a circular orbit. Applying Newton’s second law to the
orbiting planet with mass $m$ yields
$$ \frac{GMm}{r^2} = m\omega^2 r, $$
where $M$ is the mass of the Sun and $r$ is the radius of the orbit. On the other hand, given the period
$T$, the angular speed is equal to $2\pi/T$. Thus, the above equation gives
$$ \frac{GM}{r^2} = \frac{4\pi^2}{T^2}r \quad\Rightarrow\quad \frac{T^2}{r^3} = \frac{4\pi^2}{GM}. $$
The quantity on the right-hand side is a constant that depends only on the mass $M$ of the Sun. The
above equation also holds for elliptical orbits, provided we replace $r$ with $a$, the semi-major axis of
the ellipse.
Although Kepler’s laws are about planets orbiting the Sun, they hold equally well for satellites,
either natural or artificial, orbiting Earth or any other massive central body.
Kepler’s laws are phenomenological laws and can indeed be derived from Newton’s second law
and law of gravitation (although historically, Newton’s law of gravitation was inspired by Kepler’s
three laws). We now prove Kepler’s second law; the proofs of Kepler’s first and third laws are
more advanced and are not required in this course.
Choose coordinates such that the origin is at the center of the Sun and the orbit lies in the $xy$-plane. Suppose at time $t$,
the planet is at $\vec r = (x, y)$ and its velocity is $\vec v = (v_x, v_y)$. We can choose the orientation of the axes such
that $xv_y - yv_x \ge 0$ (otherwise, we can reverse the direction of one coordinate axis). During an
infinitesimal time $\Delta t$, the planet travels $\vec v\Delta t$. Then, using the cross product formula for the area of a
triangle, the area swept by the planet during $\Delta t$ is
$$ \Delta A = \frac{1}{2}|\vec r \times (\vec v\Delta t)| = \frac{1}{2}(xv_y - yv_x)\Delta t. $$
Hence, the instantaneous rate at which the area is being swept out is equal to
$$ \frac{dA}{dt} = \frac{1}{2}(xv_y - yv_x). $$
Then, Kepler’s second law is equivalent to the statement that this rate is constant, i.e.,
$$ \frac{d^2A}{dt^2} = 0. \tag{1.6.7} $$
In fact, for this system, we can define the angular momentum of the planet as
$$ \vec L = \vec r \times \vec p = m\vec r \times \vec v = m(xv_y - yv_x)\vec k, $$
where $\vec k$ is the basis unit vector along the positive $z$ direction. Then, equation (1.6.7) is equivalent
to the conservation of $\vec L$:
$$ \frac{d\vec L}{dt} = 0. \tag{1.6.8} $$
To check (1.6.8), we take the derivative of $\vec L$ and use that $\dot{\vec r} = \vec v$ and $\dot{\vec p} = \vec F_g$ by Newton’s second
law, where $\vec F_g$ is the gravitational force on the planet. In this way, we get
$$ \frac{d\vec L}{dt} = \vec v \times \vec p + \vec r \times \vec F_g = 0, $$
where, in the last step, we use that $\vec v$ is parallel to $\vec p = m\vec v$ and that $\vec F_g$ points along $-\vec r$ by Newton’s law
of gravitation, so both cross products vanish. This shows (1.6.8), hence concluding the proof of Kepler’s second law.
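The constancy of $dA/dt$ can also be observed in a simulated orbit. The sketch below is an illustration under assumptions of its own: units with $GM = 1$, a planar orbit started at $(1, 0)$ with velocity $(0, 0.8)$, and the velocity Verlet integration scheme, which is a choice made here and not a method from the notes. That scheme only changes the velocity along $\vec r$ and the position along $\vec v$, so it mirrors the algebra of the proof and preserves $xv_y - yv_x$ up to rounding errors.

```python
def areal_rates(x, y, vx, vy, GM=1.0, dt=1e-3, steps=20000):
    """Integrate a planar orbit around a fixed central mass and return the
    list of values dA/dt = (x*vy - y*vx)/2 recorded after every step."""
    def accel(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -GM * px / r3, -GM * py / r3

    rates = [0.5 * (x * vy - y * vx)]
    ax, ay = accel(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax      # half kick with the old acceleration
        vy += 0.5 * dt * ay
        x += dt * vx             # drift
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax      # half kick with the new acceleration
        vy += 0.5 * dt * ay
        rates.append(0.5 * (x * vy - y * vx))
    return rates

if __name__ == "__main__":
    rates = areal_rates(1.0, 0.0, 0.0, 0.8)
    print("initial dA/dt:", rates[0])
    print("spread over the run:", max(rates) - min(rates))
```

The orbit in this example is an ellipse, so the speed of the planet changes considerably along the way, while the recorded areal rate stays fixed.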
1.6.4
Gravitation on Earth
So far, we have assumed that Earth is a uniform ball and, neglecting its rotation, that its surface is an inertial frame.
This simplification has allowed us to assume that the free-fall acceleration $g$ of a particle is the
same as the particle’s gravitational acceleration. Furthermore, we assumed that $g$ has the constant
value $g = GM/r^2$ at any place on Earth’s surface. However, any $g$ value measured at a given location
will differ from the value $a_g = GM/r^2$ for that location for the following three reasons:
• Earth’s mass is not uniformly distributed. The density of Earth varies radially, and the
density of the crust (outer section) varies from region to region over Earth’s surface. Thus, g
varies from region to region over the surface.
• Earth is not a sphere. Earth is approximately an ellipsoid, flattened at the poles and bulging
at the equator. Its equatorial radius (from its center point out to the equator) is greater than
its polar radius (from its center point out to either north or south pole) by 21 km. Thus, the
free-fall acceleration g increases if you were to measure it while moving at sea level from the
equator toward the north or south pole. As you move, you are actually getting closer to the
center of Earth, and thus, by Newton’s law of gravitation, g increases.
• Earth is rotating, so its ground is not an inertial frame. The rotation axis runs through the
north and south poles of Earth. An object located on Earth’s surface anywhere except at
those poles must rotate in a circle about the rotation axis and thus must have a centripetal
acceleration directed toward the center of the circle. This centripetal acceleration requires a
centripetal net force that is also directed toward that center.
To see how Earth’s rotation changes the measured free-fall acceleration, we consider a
particle of mass m at rest on Earth’s surface at a point with latitude $\pi/2 - \theta \in [0, \pi/2]$, so that θ is the angle
measured from the rotation axis. Denote the normal force on the particle by $\vec{F}_N$ and the tangential force by $\vec{T}$.
The gravitational force from Earth has magnitude $GMm/r^2 = m a_g$ and points toward the center of Earth.
Since the particle moves in a circle of radius $r\sin\theta$ around the rotation axis with angular speed ω,
it has a centripetal acceleration $\omega^2 r\sin\theta$ directed toward the
rotation axis between the north and south poles.
Figure 1.32: The rotation frame on Earth.
By Newton’s second law, we have
$$(m a_g - F_N)\sin\theta + T\cos\theta = m\omega^2 r\sin\theta,$$
$$(m a_g - F_N)\cos\theta = T\sin\theta,$$
solving which gives
$$F_N = m\,(a_g - \omega^2 r\sin^2\theta).$$
The magnitude of the normal force is equal to the weight mg read on the scale, meaning that the
free-fall acceleration g measured in the frame of the rotating Earth is
$$g = \frac{F_N}{m} = a_g - \omega^2 r\sin^2\theta.$$
The difference $\omega^2 r\sin^2\theta$ is largest at the equator (i.e., $\theta = \pi/2$), where one can calculate that $\omega^2 r$
is approximately 0.034 m/s², much smaller than 9.8 m/s². Therefore, neglecting this difference is
often justified.
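As a quick numerical check of the size of this correction, the sketch below evaluates $\omega^2 r\sin^2\theta$ at the equator and at a pole. The sidereal day and the mean Earth radius are assumed reference values that are not given in these notes, and the function name is illustrative.

```python
import math

# Assumed reference values (not taken from the notes).
SIDEREAL_DAY_S = 86164.0    # one rotation of Earth, in seconds
EARTH_RADIUS_M = 6.371e6    # mean radius of Earth, in metres

def rotation_correction(theta, omega, r):
    """Return omega^2 * r * sin^2(theta), with theta measured from the rotation axis."""
    return omega**2 * r * math.sin(theta)**2

omega = 2.0 * math.pi / SIDEREAL_DAY_S
at_equator = rotation_correction(math.pi / 2, omega, EARTH_RADIUS_M)
at_pole = rotation_correction(0.0, omega, EARTH_RADIUS_M)
print(f"equator: {at_equator:.4f} m/s^2, pole: {at_pole:.4f} m/s^2")
```

With these inputs the equatorial value comes out close to the 0.034 m/s² quoted above, and the correction vanishes at the poles.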
1.6.5 Gauss’s law for gravity
Gauss’s law for gravity, also known as Gauss’s flux theorem for gravity, states that: the flux (surface
integral) of the gravitational field (i.e., the field of gravitational acceleration) over any closed surface
is equal to the mass enclosed times −4πG.
To explain the statement, we consider an object of mass M enclosed by a closed surface S.
This object generates a gravitational acceleration $\vec{a}_g$ at any point of space, which is called the
gravitational field. Its flux over the surface is defined as follows. We divide the surface into a union
of small area elements $d\vec{A}$, each of which can be regarded as a flat area element with direction.
More precisely, $d\vec{A}$ is a vector that is perpendicular to the element, points outside the surface,
and has a magnitude equal to the area dA of the element. Then, the flux of the gravitational field
$\vec{a}_g$ through this area element is given by $\vec{a}_g \cdot d\vec{A}$, with $\vec{a}_g$ taking the value at the location of the
element. Adding together the flux over all area elements yields the total flux through the whole
surface, which can be expressed as a surface integral
$$\oiint_S \vec{a}_g \cdot d\vec{A}.$$
Gauss’s law tells us that this integral is equal to $-4\pi GM$.
Gauss’s law for gravity is equivalent to Newton’s law of gravitation, and it is often more convenient to work from than Newton’s law. Going from Gauss’s law to Newton’s law of gravitation is
simple (you can think about it by yourself). The other direction needs to use Gauss’s theorem in
vector calculus. We will return to Gauss’s law when we discuss electric fields of electrically charged
bodies. We now use it to prove Propositions 1.6.2 and 1.6.3.
We consider a radially symmetric ball, whose center is at the origin. We want to calculate the
gravitational field it generates at a location ⃗r outside the ball. By radial symmetry, the field points
toward the origin, i.e., it has direction −⃗er = −⃗r/r. We only need to determine the magnitude of
the field, denoted as ag (r). For this purpose, we consider a sphere of radius r that encloses the
ball. Again, by radial symmetry, the field has the same magnitude at every point of the sphere, is
perpendicular to the sphere, and points into the sphere. Hence, the flux (surface integral) of the
gravitational field over the sphere is
$$-a_g(r) \cdot 4\pi r^2 = -4\pi GM,$$
where M is the mass of the ball. Solving this equation gives
$$a_g(r) = \frac{GM}{r^2}.$$
This concludes Proposition 1.6.2. Proposition 1.6.3 follows from the same argument with M = 0,
because there is no mass enclosed by the sphere containing ⃗r inside the shell.
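The two shell results can also be checked by direct summation of Newton’s law, without Gauss’s law. The sketch below is an illustration only: it assumes a thin uniform shell of radius R, slices it into rings labelled by the polar angle α, and adds the axial field of each ring with a midpoint rule. Units with GM = 1 and all names are choices made here, not part of the notes.

```python
import math

def shell_field(r, R=1.0, GM=1.0, n=20_000):
    """Inward field magnitude at distance r from the center of a thin uniform shell.

    The shell is cut into rings at polar angle alpha; a ring carries the mass
    fraction sin(alpha) d(alpha) / 2 and lies at distance s from the field point.
    """
    d_alpha = math.pi / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * d_alpha
        mass_fraction = 0.5 * math.sin(alpha) * d_alpha
        s_squared = r * r + R * R - 2.0 * r * R * math.cos(alpha)
        # Component along the axis, counted positive toward the center.
        total += GM * mass_fraction * (r - R * math.cos(alpha)) / s_squared**1.5
    return total

outside = shell_field(2.0)   # compare with GM / r^2 = 0.25
inside = shell_field(0.5)    # compare with 0
print(outside, inside)
```

Outside the shell the sum reproduces GM/r², and inside it cancels to zero up to the discretization error.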
Chapter 2 Electricity and Magnetism
2.1 Coulomb’s Law, Electric Fields
Starting from this section, we dive into another important branch of physics—electromagnetism
(EM)—which, roughly speaking, studies interactions between particles with electric charges via
electromagnetic fields. It is the second fundamental interaction studied in this course and is the
dominant force in the interactions of electrons, atoms, and molecules. The modern science of
electromagnetism was developed by scientists in many countries. One of the most important was
Michael Faraday, a gifted experimenter with a talent for physical intuition and visualization. In
the mid-nineteenth century, James Clerk Maxwell put Faraday’s ideas into mathematical form,
introduced many new ideas of his own, and put electromagnetism on a sound theoretical basis. We
will present Maxwell’s unified theory of electromagnetism in Section 2.7, in the form of the famous
Maxwell’s equations.
In this section, we study the simplest form of electromagnetism, called electrostatics, which is a
branch of EM theory that studies electric charges at rest, with a main focus on electric fields due
to static electrically charged objects. (However, we will also consider moving charged particles in
electric fields—it is just the electric field that is “static”.)
2.1.1 Electric charges
There are two types of electric charge, named by the American scientist Benjamin Franklin as
positive charge and negative charge. He could have called them anything, but using algebraic signs
as names comes in handy when we add up charges to find the net charge. In most everyday objects,
there are about equal numbers of negatively charged particles and positively charged particles, and
so the net charge is zero, in which case the object is said to be electrically neutral. When
the positive and negative charges are unbalanced, the nonzero net charge of the object is called the excess
charge.
Generally, we denote the charge of a point particle or a charged object by q. q is positive
(resp. negative) if the excess charge of the object is positive (resp. negative). The SI unit of charge
is C (coulomb). For practical reasons having to do with the accuracy of measurements, the coulomb
unit is derived from the SI unit A (ampere) for electric current i. We shall discuss current in detail
in Section 2.4. For now, we only note that current i is the rate at which charge moves past a point:
$$i = \frac{dq}{dt}.$$
This gives A = C/s, or C = A · s.
Charge is quantized. Charges of objects are all due to charged particles in atoms, which consist
of positively charged protons, negatively charged electrons, and electrically neutral neutrons. The
charge of a single electron and that of a single proton have the same magnitude but are opposite in
sign. Hence, an electrically neutral atom contains equal numbers of electrons and protons. Electrons
are held near the nucleus because they have the electrical sign opposite that of the protons in the
nucleus and thus are attracted to the nucleus. The measured value of the charge of a proton is
approximately
$$e = 1.602 \times 10^{-19}\ \mathrm{C},$$
and the charge of an electron is given by −e. Thus, any positive or negative charge q of an object
can be written as
q = ne, n ∈ Z,
where n is equal to the number of protons minus the number of electrons in the object. This gives
a very important feature of charges that is very different from mass:
Charges are not continuous; they are quantized and appear as integer multiples of e.
For this reason, e is called the elementary charge. The 2019 redefinition of the SI base
units fixed the numerical value of e as $1.602176634 \times 10^{-19}$ when expressed in coulombs; i.e., the
ampere (A) is defined as the electric current equivalent to $10^{19}$ elementary charges moving every
1.602176634 seconds. But for historical reasons, people still use A instead of C as a base unit.
Quarks, the constituent particles of protons and neutrons, have charges of ±e/3 or ±2e/3, but
they cannot be detected individually. For this and for historical reasons, people do not take their
charges to be the elementary charge. The elementary charge e is one of the important constants of
nature, like the speed of light c. In modern physics, it is believed that the quantization of electric
charges is related to some topological properties of the gauge structure of the EM theory.
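To make the quantization rule q = ne and the definition of the ampere concrete, here is a small sketch. It uses the exact SI value of e quoted above; the function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs (exact SI value)

def net_charge(n_protons, n_electrons):
    """Net charge q = n * e, with n = (number of protons) - (number of electrons)."""
    return (n_protons - n_electrons) * E_CHARGE

def charges_per_second(current_amperes):
    """Number of elementary charges passing a point per second at the given current."""
    return current_amperes / E_CHARGE

ion_charge = net_charge(n_protons=3, n_electrons=5)   # an object with two excess electrons
per_second = charges_per_second(1.0)                  # elementary charges per second at 1 A
print(ion_charge, per_second)
```

Multiplying the count for 1 A by 1.602176634 s gives $10^{19}$ charges, in agreement with the definition stated above.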
Charge is conserved. An important feature of electric charge is that it is always conserved,
i.e., the net charge of a closed system of bodies is unchanged. In other words, in all processes, a
positive or negative charge is not created but only transferred from one body to another, changing
the net charge of each body. This hypothesis of conservation of charge, first put forward by
Benjamin Franklin, has stood up under close examination, both for large-scale charged bodies and
for atoms, nuclei, and elementary particles. No exceptions have ever been found. A simple example
of charge conservation occurs when an electron e− and its antiparticle, the positron e+ , undergo
an annihilation process, transforming into two gamma rays:
e− + e+ → γ + γ.
Here, e− has charge −e and e+ has charge e, while γ is neutral. The converse of annihilation also
occurs: in pair production, a gamma ray transforms into an electron and a positron:
γ → e− + e+ .
So far, we have seen four conservation laws: energy, momentum, angular momentum, and electric charge. Noether’s theorem, a central result in theoretical physics, asserts that each conservation
law is associated with a symmetry of the underlying physics. The conservation of energy is associated with time translation symmetry. The conservation of momentum and the conservation of
angular momentum are associated with (space) translation symmetry and rotation symmetry, respectively. The symmetry that is associated with charge conservation is the global gauge invariance
of the electromagnetic field.
2.1.2 Coulomb’s law
With experiments, scientists discovered that: particles with the same sign of electrical charge repel
each other, and particles with opposite signs attract each other. Quantitatively, the strength of
the attraction or repulsion is described by the famous Coulomb’s law of electrostatic force (or
electric force) between charged particles.
Physics law 9 (Coulomb’s law). Consider two charged particles, where particle 1 has charge $q_1$
and is located at $\vec{r}_1$, and particle 2 has charge $q_2$ and is located at $\vec{r}_2$. Then, the electric force acting on
particle 2 due to particle 1 is given by
$$\vec{F}_{12} = \frac{kq_1q_2}{r^2}\,\hat{r}, \tag{2.1.1}$$
where $r = |\vec{r}_2 - \vec{r}_1|$ is the separation between the particles, $\hat{r} = (\vec{r}_2 - \vec{r}_1)/r$ is the unit vector along
the direction pointing from particle 1 to particle 2, and k is a positive constant called the Coulomb
constant, with value
$$k = 8.99 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}.$$
Remark. The Coulomb constant is often written as
$$k = \frac{1}{4\pi\varepsilon_0},$$
where $\varepsilon_0$ is called the permittivity constant and has value
$$\varepsilon_0 = 8.85 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)}.$$
In some sense, it is a more fundamental quantity than k, and is one of the two physical constants
that will appear in Maxwell’s equations (the other is the vacuum permeability $\mu_0$).
Let’s first check the direction of the force on particle 2 as given by (2.1.1). If q1 and q2 have the
same sign (i.e., they are both positive or both negative), q1 q2 is positive and (2.1.1) tells us that the
force on particle 2 is in the direction of r̂, pointing from particle 1 to particle 2. That is, particle
2 is being repelled from particle 1. Conversely, if q1 and q2 have opposite signs, then (2.1.1) tells
us that the force on particle 2 is in the direction −r̂. That is, particle 2 is being attracted toward
particle 1.
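The sign discussion above can be reproduced with a direct implementation of (2.1.1). The following is a minimal sketch; the function name, the charges, and the positions are made up for illustration.

```python
import math

K = 8.99e9  # Coulomb constant in N m^2 / C^2

def coulomb_force(q1, r1, q2, r2, k=K):
    """Force on particle 2 due to particle 1, following (2.1.1)."""
    separation = [b - a for a, b in zip(r1, r2)]       # r2 - r1
    r = math.sqrt(sum(c * c for c in separation))
    scale = k * q1 * q2 / r**3                          # (k q1 q2 / r^2) * (1 / r)
    return [scale * c for c in separation]

origin, on_x_axis = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
f_like = coulomb_force(1e-6, origin, 1e-6, on_x_axis)      # same signs: repulsion
f_unlike = coulomb_force(1e-6, origin, -1e-6, on_x_axis)   # opposite signs: attraction
f_back = coulomb_force(1e-6, on_x_axis, 1e-6, origin)      # force on particle 1 instead
print(f_like, f_unlike, f_back)
```

The x-component is positive for like charges (particle 2 is pushed away from particle 1) and negative for unlike charges. Swapping the roles of the two particles flips the force, as Newton’s third law requires.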
You may notice something curious. Although the two types of forces are very
different, the form of Coulomb’s law is the same as that of Newton’s law of gravitation (1.6.1):
both are inverse-square laws (i.e., they have a $1/r^2$ dependence) that involve a product of a property
of the interacting particles: the charge in one case and the mass in the other. The main difference
between these two laws is that gravitational forces are always attractive but electrostatic forces
may be either attractive or repulsive, depending on the signs of the charges. This difference arises
from the fact that there is only one type of mass but two types of charge.
Due to the similarity with Newton’s law, some results developed in Section 1.6 also apply to
Coulomb’s law. Let’s list them here for your convenience.
1. Superposition of forces. As with all forces in this book, the electrostatic force obeys the
principle of superposition. In general, consider an object with charge density at $\vec{r} = (x, y, z)$ given
by $\rho(\vec{r})$. Then, similar to (1.6.2), the total electric force on a particle of charge q at position $\vec{r}'$ is
given by a triple integral
$$\iiint_D kq\rho(\vec{r})\,\frac{\vec{r}' - \vec{r}}{|\vec{r}' - \vec{r}|^3}\,dx\,dy\,dz, \qquad \vec{r} = (x, y, z), \tag{2.1.2}$$
where $D \subset \mathbb{R}^3$ denotes the domain occupied by the object.
2. Shell theorems. Analogous to the shell theorems (Proposition 1.6.2 and Proposition 1.6.3) for
the gravitational force, we have two shell theorems for the electrostatic force.
Proposition 2.1.1. (1) A charged particle outside a shell with a radially symmetrical charge density
is attracted or repelled as if the shell’s charge were concentrated as a particle at its center.
(2) A charged particle inside a shell with a radially symmetrical charge density has no net force
acting on it due to the shell.
3. Electric potential energy. Similar to Proposition 1.6.4, the electrostatic force is a conservative
force and is associated with some potential energy.
Proposition 2.1.2 (Electric potential energy). Coulomb (electrostatic) force is conservative. Furthermore, the potential energy of a system of two particles with charges q1 and q2 is
$$V = \frac{kq_1q_2}{r} + C, \tag{2.1.3}$$
where r is the distance between the two particles. In particular, we can choose C = 0 if the reference
potential energy at ∞ is taken to be 0.
If our system contains more than two particles, we consider each pair of particles in turn,
calculate the electric potential energy of that pair as if the other particles were not there, and
then algebraically sum the results. In general, the potential energy between an object occupying
$D \subset \mathbb{R}^3$ and a point particle of charge q at position $\vec{r}'$ is calculated as an integral:
$$V = \iiint_D \frac{kq\rho(\vec{r})}{|\vec{r} - \vec{r}'|}\,dx\,dy\,dz, \qquad \vec{r} = (x, y, z). \tag{2.1.4}$$
Finally, following Section 1.4, we have the celebrated energy conservation for an isolated system interacting only through Coulomb forces: the mechanical energy of the system, i.e., the kinetic energy
plus the electric potential energy, does not change. The electric potential energy is associated with
a so-called electric potential of the charged object, which will be discussed in detail in Section 2.2.
4. Gauss’s law. Similar to gravitation, the electrostatic force also satisfies Gauss’s law as discussed
in Section 1.6.5. We will discuss Gauss’s law and its consequence in detail in Section 2.2.
2.1.3 Electric fields
Consider two positively charged particles. We know that an electrostatic force acts on particle 2 due
to the presence of particle 1. We can also calculate the force direction and the force magnitude using
Coulomb’s law. There is another convenient (and actually deeper) way of looking at the interaction
between the two particles: particle 1 sets up an electric field at all points in the surrounding space,
even if the space is a vacuum. If we place particle 2 at any point in that space, it is affected by the
electric field particle 1 has already set up at that point.
In modern physics, fields are more fundamental concepts than forces. In particular, for EM
theory, the electric and magnetic fields are the two basic subjects of study. An electric field is a
vector field that consists of a distribution of electric field vectors $\vec{E}$, one for each point in the space
around a charged object. In other words, an electric field is a vector-valued function of the points
of space, whose value $\vec{E}(\vec{r})$ at a point $\vec{r}$ is defined as follows. At point $\vec{r}$, we place a particle with a
small positive charge $q_0$, called a test charge. (We take the charge to be small so that it
does not disturb the object’s charge distribution.) We then measure the electrostatic force $\vec{F}$ that
acts on the test charge. The electric field at that point is then defined as
$$\vec{E}(\vec{r}) = \frac{\vec{F}}{q_0}. \tag{2.1.5}$$
From (2.1.5), we see that the SI unit for the electric field is N/C. Since the test charge is positive,
the two vectors in (2.1.5) are in the same direction. We can shift the test charge around to various
other points, to measure the electric fields there, so that we can figure out the distribution of the
electric field set up by the charged object. Note that the electric field exists independent of the test
charge. It is something that a charged object sets up in the surrounding space (even the vacuum),
independent of whether we happen to come along to measure it. (Be sure to distinguish between
force and field: force is a push or pull between two charged objects, while the electric field is an
abstract property set up by one given charged object.)
By Coulomb’s law (2.1.1), the electric field at $\vec{r}'$ due to a point charge q located at $\vec{r}$ is given by
$$\vec{E}(\vec{r}') = \frac{kq}{|\vec{r}' - \vec{r}|^2}\,\hat{r}, \qquad \hat{r} = \frac{\vec{r}' - \vec{r}}{|\vec{r}' - \vec{r}|}. \tag{2.1.6}$$
Similar to forces, the electric fields also obey the principle of superposition. That is, if several
electric fields are set up at a given point by several charged particles, we can find the net field by
adding them as vectors. Hence, to calculate the net electric field at a given point due to several
particles, find the electric field due to each particle and then sum the fields as vectors. The electric
field set up by a general charged object can be calculated as in (2.1.2). As a special example, the
field due to a shell with a radially symmetrical charge density can be calculated easily with the
help of Proposition 2.1.1. We will see more examples in Section 2.1.4 below.
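The superposition rule for fields can be sketched in a few lines: evaluate (2.1.6) for each point charge and add the results as vectors. The helper names and the two-charge configuration below are illustrative choices.

```python
import math

K = 8.99e9  # Coulomb constant in N m^2 / C^2

def field_of_point_charge(q, source, point, k=K):
    """Field at `point` due to a charge q at `source`, following (2.1.6)."""
    d = [p - s for s, p in zip(source, point)]
    dist = math.sqrt(sum(c * c for c in d))
    return [k * q * c / dist**3 for c in d]

def net_field(charges, point, k=K):
    """Vector sum of the fields of all (q, position) pairs in `charges`."""
    total = [0.0, 0.0, 0.0]
    for q, source in charges:
        contribution = field_of_point_charge(q, source, point, k)
        total = [t + c for t, c in zip(total, contribution)]
    return total

# Two equal positive charges on the x-axis at x = -1 and x = +1.
pair = [(1e-9, (-1.0, 0.0, 0.0)), (1e-9, (1.0, 0.0, 0.0))]
at_midpoint = net_field(pair, (0.0, 0.0, 0.0))
above_midpoint = net_field(pair, (0.0, 1.0, 0.0))
print(at_midpoint, above_midpoint)
```

At the midpoint the two contributions cancel. At a point on the y-axis the x-components cancel and the y-components add, so the net field points away from both charges.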
The idea of electric fields was introduced by Michael Faraday, who also introduced a useful
way to visualize an electric field in space. He envisioned lines, called electric field lines, in the
space around any given charged particle or object. The electric field lines are drawn according to
the following rules: (1) at any point, the electric field vector must be tangent to the electric field
line through that point and in the same direction; (2) in a plane perpendicular to the field lines,
the relative density of the lines represents the relative magnitude of the field there, with greater
density for greater magnitude. Mathematically, it is not hard to show that such a family of electric
field lines exists and is unique.
Figure 2.1 gives an example of electric field lines near a sphere uniformly covered with negative
charges. At every point around the sphere, an electric field vector points radially inward toward
the sphere, and we can represent this electric field with electric field lines as in Figure 2.1.
Figure 2.1: The electric field lines near a uniform sphere of negative charges.
If the sphere in Figure 2.1 were uniformly covered with positive charges, the electric field vectors
at all points around it would be radially outward and thus so would the electric field lines. So, we
observe the following rule:
Electric field lines extend away from positive charge (where they originate) and toward negative
charge (where they terminate).
In Figure 2.1, the field lines originate on distant positive charges that are not shown.
Another feature of electric field lines is that they never intersect each other (think about why).
For example, Figure 2.2 shows the field lines for two particles with equal positive charges.
Figure 2.2: The electric field lines for two particles with equal positive charge.
We remark that a lot of different fields are used in science and engineering. The gravitational
field mentioned in Section 1.6.5 is another important vector field. However, not all fields are vector
fields. For example, a temperature field is a scalar field, which gives the distribution of temperatures
at each point. In EM theory, the electric and magnetic fields are combined into an electromagnetic
tensor field, and in Einstein’s theory of General Relativity, the gravitational field is also a tensor
field.
2.1.4 Electric fields due to charged objects
In this subsection, we examine some important examples of electric fields generated by several
specific types of charged objects.
Example (Electric field due to an electric dipole). Consider a system of two particles that have
the same charge magnitude q but opposite signs, a very common and important arrangement known
as an electric dipole. Suppose the positive charge and negative charge are located at $\vec{d}/2$ and $-\vec{d}/2$,
respectively. Calculate the electric field set up by this electric dipole.
Solution: We evaluate the electric field at the point $\vec{r}$. Using (2.1.6), we have
$$\vec{E}(\vec{r}) = \frac{kq}{|\vec{r} - \vec{d}/2|^3}\,(\vec{r} - \vec{d}/2) - \frac{kq}{|\vec{r} + \vec{d}/2|^3}\,(\vec{r} + \vec{d}/2).$$
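In applications, we are often interested in the electric field of a dipole at distances that are large
compared with the dimensions of the dipole, that is, at distances such that $r \gg d$. At such large
distances, we can expand the above equation as
$$\begin{aligned}
\vec{E}(\vec{r}) &= kq\vec{r}\left(\frac{1}{|\vec{r} - \vec{d}/2|^3} - \frac{1}{|\vec{r} + \vec{d}/2|^3}\right) - \frac{kq\vec{d}}{2}\left(\frac{1}{|\vec{r} - \vec{d}/2|^3} + \frac{1}{|\vec{r} + \vec{d}/2|^3}\right) \\
&= kq\vec{r}\,\frac{|\vec{r} + \vec{d}/2|^3 - |\vec{r} - \vec{d}/2|^3}{|\vec{r} - \vec{d}/2|^3\,|\vec{r} + \vec{d}/2|^3} - \frac{kq\vec{d}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right) \\
&= \frac{kqr^3\vec{r}}{|\vec{r} - \vec{d}/2|^3\,|\vec{r} + \vec{d}/2|^3}\left[\left(1 + \frac{\vec{r}\cdot\vec{d}}{r^2} + \frac{d^2}{4r^2}\right)^{3/2} - \left(1 - \frac{\vec{r}\cdot\vec{d}}{r^2} + \frac{d^2}{4r^2}\right)^{3/2}\right] - \frac{kq\vec{d}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right) \\
&= \frac{kq\vec{r}}{r^3}\,\frac{3\vec{r}\cdot\vec{d}}{r^2} - \frac{kq\vec{d}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right) = k\,\frac{3(\vec{p}\cdot\hat{r})\hat{r} - \vec{p}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right),
\end{aligned}$$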
where $O(\cdot)$ denotes a vector with length of order at most $kqd^2/r^4$ and $\hat{r} = \vec{r}/r$ is the unit vector along
the $\vec{r}$ direction. The vector $\vec{p} = q\vec{d}$, which involves the two intrinsic properties q and $\vec{d}$ of the dipole, is
an important vector quantity known as the electric dipole moment of the dipole. The unit of the
dipole moment is C · m. The magnitude and direction of $\vec{p}$ indicate the strength and orientation of
a dipole, respectively. In an idealized situation, we take the dimension of the dipole $d \to 0$, while
keeping the dipole moment $\vec{p}$ unchanged (by increasing the charge $q = p/d \to \infty$). This limiting
process results in a “point dipole”, whose electric field is given by
$$\vec{E}(\vec{r}) = k\,\frac{3(\vec{p}\cdot\hat{r})\hat{r} - \vec{p}}{r^3}. \tag{2.1.7}$$
An important feature of this field is that it decays with respect to r as $1/r^3$ in contrast to the $1/r^2$
decay in Coulomb’s law, i.e., the electric field of a dipole decays faster than that of a point charge.
In Figure 2.3, we show the pattern of electric field lines for an electric dipole.
Figure 2.3: The pattern of electric field lines around an electric dipole.
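To see how good the point-dipole formula (2.1.7) is when $r \gg d$, the sketch below compares it with the exact two-charge field. Units with k = 1, the sample positions, and the helper names are illustrative assumptions.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def exact_dipole_field(r, d, q, k=1.0):
    """Field at r of charges +q at d/2 and -q at -d/2 (sum of two Coulomb fields)."""
    plus = [ri - di / 2 for ri, di in zip(r, d)]
    minus = [ri + di / 2 for ri, di in zip(r, d)]
    n_plus, n_minus = norm(plus), norm(minus)
    return [k * q * (a / n_plus**3 - b / n_minus**3) for a, b in zip(plus, minus)]

def point_dipole_field(r, p, k=1.0):
    """Point-dipole field (2.1.7) for dipole moment p."""
    rn = norm(r)
    r_hat = [c / rn for c in r]
    p_dot_r = sum(a * b for a, b in zip(p, r_hat))
    return [k * (3 * p_dot_r * h - pc) / rn**3 for h, pc in zip(r_hat, p)]

q = 1.0
d = (0.0, 0.0, 0.01)                 # separation much smaller than |r| = 5
p = [q * c for c in d]               # dipole moment p = q d
r = (3.0, 0.0, 4.0)

exact = exact_dipole_field(r, d, q)
approx = point_dipole_field(r, p)
rel_err = norm([a - b for a, b in zip(exact, approx)]) / norm(approx)

far = exact_dipole_field([2 * c for c in r], d, q)
decay_ratio = norm(exact) / norm(far)   # doubling r should reduce |E| by about 2^3
print(rel_err, decay_ratio)
```

The relative difference is tiny for d/r = 0.002, and doubling the distance reduces the field strength by a factor close to 8, consistent with the $1/r^3$ decay.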
Example (Electric field due to a line of charge). Consider an infinite line of charge with uniform
linear charge density λ. Calculate the electric field around this line.
Solution: Suppose the line is along the z direction and we calculate the electric field at $\vec{r}$. We set up
the coordinates such that the line is the z-axis and $\vec{r}$ is on the x-axis, i.e., $\vec{r} = (x, 0, 0)$ for some
x > 0. By the symmetry of the problem, the electric field is along the x direction.
We now calculate its magnitude.
Consider two symmetric differential elements dz at ±z. The magnitude of the electric field generated
by each of them at $\vec{r}$ is equal to
$$E(z)\,dz = \frac{k\lambda}{z^2 + x^2}\,dz.$$
The superposition of the two electric fields is along the x direction and has a magnitude
$$2E(z)\cos\theta\,dz = \frac{2k\lambda x}{(z^2 + x^2)^{3/2}}\,dz,$$
where θ is the angle between each field and the x-axis, so that $\cos\theta = x/\sqrt{z^2 + x^2}$.
Summing over all these differential elements, we get that the total electric field is along the
x direction and has a magnitude
$$E(\vec{r}) = 2\int_0^{\infty} \frac{k\lambda x}{(z^2 + x^2)^{3/2}}\,dz.$$
With the change of variable $z = x\tan\theta$, we can evaluate the integral as
$$E(\vec{r}) = 2\int_0^{\pi/2} k\lambda x\,\frac{\cos^3\theta}{x^3}\,\frac{x}{\cos^2\theta}\,d\theta = \frac{2k\lambda}{x}\int_0^{\pi/2}\cos\theta\,d\theta = \frac{2k\lambda}{x}.$$
In general, the electric field at a point $\vec{r} = (x, y, z)$ is
$$\vec{E}(\vec{r}) = \frac{2k\lambda}{\sqrt{x^2 + y^2}}\cdot\frac{(x, y, 0)}{\sqrt{x^2 + y^2}}.$$
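The closed form 2kλ/x can be cross-checked numerically by integrating the contributions of a long but finite line. The sketch below uses a plain midpoint rule; the half-length, the step count, and units with k = λ = 1 are illustrative assumptions.

```python
import math

def finite_line_field(x, half_length, lam, k=1.0, n=200_000):
    """Midpoint-rule estimate of 2 * integral_0^L k*lam*x / (z^2 + x^2)^(3/2) dz."""
    h = half_length / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        total += k * lam * x / (z * z + x * x) ** 1.5
    return 2.0 * total * h

x, lam = 1.0, 1.0
numeric = finite_line_field(x, half_length=1000.0, lam=lam)
infinite_line = 2.0 * lam / x      # closed form 2 k lam / x with k = 1
print(numeric, infinite_line)
```

For a line whose half-length is 1000 times the distance x, the numerical value agrees with 2kλ/x to better than one part in $10^5$, which also indicates how quickly a finite line approaches the infinite-line result.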
Example (Electric field due to a circle). Consider a circle of radius R and with uniform linear
charge density λ. Calculate the electric field on the central axis, i.e., the axis through the ring’s
center and perpendicular to the plane of the ring.
Solution: Suppose the central axis is along z direction and the circle is on the xy-plane. At an
arbitrary point ⃗r = (0, 0, z) on the axis, by the symmetry of the problem, we know that the electric
field is along z-direction. We now calculate its magnitude.
Consider a differential element ds on the circle. The magnitude of the electric field generated by this
element is equal to
$$\frac{k\lambda}{z^2 + R^2}\,ds.$$
Its component along the z direction is equal to
$$\frac{k\lambda}{z^2 + R^2}\cos\theta\,ds = \frac{k\lambda z}{(z^2 + R^2)^{3/2}}\,ds,$$
where θ is the angle between the field and the z-axis.
Summing over all these differential elements, we get that the total electric field is along the
z direction and has a magnitude
$$E(\vec{r}) = 2\pi R\,\frac{k\lambda z}{(z^2 + R^2)^{3/2}} = \frac{kqz}{(z^2 + R^2)^{3/2}}, \tag{2.1.8}$$
where $q = 2\pi R\lambda$ is the total charge of the circle.
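The symmetry argument can be tested by replacing the circle with many point charges and summing their Coulomb fields as vectors. This is a sketch under the assumption of n equally spaced charges, with k = λ = R = 1 chosen for convenience.

```python
import math

def ring_field_on_axis(z, R, lam, k=1.0, n=1000):
    """Vector field at (0, 0, z) from n equal point charges spread over the circle."""
    dq = lam * 2.0 * math.pi * R / n          # charge carried by each arc element
    ex = ey = ez = 0.0
    for i in range(n):
        phi = 2.0 * math.pi * i / n
        dx, dy, dz = -R * math.cos(phi), -R * math.sin(phi), z   # element -> field point
        dist3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        ex += k * dq * dx / dist3
        ey += k * dq * dy / dist3
        ez += k * dq * dz / dist3
    return ex, ey, ez

z, R, lam = 1.0, 1.0, 1.0
ex, ey, ez = ring_field_on_axis(z, R, lam)
q_total = 2.0 * math.pi * R * lam
closed_form = q_total * z / (z * z + R * R) ** 1.5   # (2.1.8) with k = 1
print(ex, ey, ez, closed_form)
```

The transverse components cancel up to rounding error, and the axial component matches (2.1.8).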
Example (Electric field due to a disk). Consider a circular disk of radius R and with uniform
surface charge density σ. Calculate the electric field on the central axis, i.e., the axis through the
disk’s center and perpendicular to the plane of the disk.
Solution: Suppose the central axis is along z direction and the disk is on the xy-plane. At an
arbitrary point ⃗r = (0, 0, z) on the axis, by the symmetry of the problem, we know that the electric
field is along z-direction. We now calculate its magnitude.
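By the previous example, we know that the thin ring with radius r and radial width dr sets up a differential
electric field
$$dE = \frac{kz\,(2\pi r\sigma\,dr)}{(z^2 + r^2)^{3/2}} = \frac{zr\sigma}{2\varepsilon_0 (z^2 + r^2)^{3/2}}\,dr,$$
where $2\pi r\sigma\,dr$ is the total charge of the ring. Summing over all such rings, we get the total electric
field (for z > 0) as
$$E(\vec{r}) = \int_0^R \frac{zr\sigma}{2\varepsilon_0 (z^2 + r^2)^{3/2}}\,dr = -\frac{\sigma}{2\varepsilon_0}\,\frac{z}{\sqrt{z^2 + r^2}}\Bigg|_{r=0}^{r=R} = \frac{\sigma}{2\varepsilon_0}\left(1 - \frac{z}{\sqrt{z^2 + R^2}}\right). \tag{2.1.9}$$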
If we let R → ∞ while keeping z finite, the second term in the parentheses on the RHS
approaches zero, and the above equation reduces to
$$E = \frac{\sigma}{2\varepsilon_0}. \tag{2.1.10}$$
For an infinite plane, any axis perpendicular to it can be regarded as a central axis. Hence, it sets
up a uniform electric field around it, and the electric field lines are as follows:
If we insert another infinite charged plane that is parallel to the previous plane and has uniform
surface charge density −σ, then by superposition of electric fields, the electric field between the
two planes is uniform and equal to
$$E = \frac{\sigma}{\varepsilon_0}, \tag{2.1.11}$$
while the electric field vanishes elsewhere. This is also how people generate uniform electric fields
in the real world: a uniform electric field is produced by placing a potential (or voltage) difference
across two parallel metal plates. As long as the separation between the two plates is small compared
to the scale of the plates, then the electric field between them is approximately uniform at the
points far away from the edges of the plates. (We will discuss the concept of potential and voltage
in Section 2.3.)
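The disk result (2.1.9) and its infinite-plane limit (2.1.10) can be checked numerically by summing the ring contributions. The sketch below uses a midpoint rule; the surface charge density and the sample geometry are made-up values.

```python
import math

EPS0 = 8.85e-12  # permittivity constant in C^2 / (N m^2)

def disk_field_numeric(z, R, sigma, n=20_000):
    """Midpoint-rule sum of the ring contributions z r sigma / (2 eps0 (z^2 + r^2)^(3/2)) dr."""
    h = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += z * r * sigma / (2.0 * EPS0 * (z * z + r * r) ** 1.5)
    return total * h

def disk_field_closed(z, R, sigma):
    """Closed form (2.1.9), valid for z > 0."""
    return sigma / (2.0 * EPS0) * (1.0 - z / math.sqrt(z * z + R * R))

sigma = 1e-6                                   # illustrative value, in C / m^2
numeric = disk_field_numeric(0.5, 1.0, sigma)
closed = disk_field_closed(0.5, 1.0, sigma)
plane = sigma / (2.0 * EPS0)                   # infinite-plane value (2.1.10)
wide = disk_field_closed(1.0, 1e6, sigma)      # disk radius much larger than z
print(numeric, closed, wide / plane)
```

The numerical sum reproduces (2.1.9), and for R much larger than z the ratio to σ/(2ε₀) approaches 1, which is the regime in which the parallel-plate field is approximately uniform.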
2.1.5 Charged particles in electric fields
Given an electric field, it is simple to determine the electrostatic force acting on a particle of charge
q:
$$\vec{F} = q\vec{E}, \tag{2.1.12}$$
where $\vec{E}$ is the electric field that other charges have produced at the location of the particle. The
field is not the field set up by the particle itself; to distinguish the two fields, the field acting on
the particle is often called the external field. A charged particle or object is not affected by its own
electric field. Equation (2.1.12) played a key role in the measurement of the elementary charge e in
the famous Millikan oil-drop experiment. The Wikipedia article on this experiment tells some interesting
stories about it.
Example. Consider an electric dipole with electric dipole moment $\vec{p}$ in a uniform electric field $\vec{E}$.
Calculate the net force and torque acting on the dipole, and find its potential energy.
Solution: Suppose the electric dipole consists of two particles with charges q and −q located at
$\vec{r}_1$ and $\vec{r}_2$, respectively. Its electric dipole moment is equal to
$$\vec{p} = q(\vec{r}_1 - \vec{r}_2).$$
The total force on the dipole is
$$\vec{F} = q\vec{E} - q\vec{E} = 0.$$
The total torque on the dipole is
$$\vec{\tau} = \vec{r}_1 \times q\vec{E} - \vec{r}_2 \times q\vec{E} = q(\vec{r}_1 - \vec{r}_2) \times \vec{E} = \vec{p} \times \vec{E}.$$
We claim that the potential energy of a charged particle in a uniform electric field $\vec{E}$ is equal to
$$U(\vec{r}) = -q\vec{E}\cdot\vec{r} + C,$$
where q is the charge of the particle and C is a constant depending on our choice of the zero
reference potential energy. (We can take C = 0 if we let U(0) = 0.) To see this, we find that
$$-\nabla_{\vec{r}}\,U(\vec{r}) = q\vec{E},$$
which is indeed the electric force on the charge. Thus, the potential energy of the dipole is
$$U = -q\vec{E}\cdot\vec{r}_1 + q\vec{E}\cdot\vec{r}_2 = -q(\vec{r}_1 - \vec{r}_2)\cdot\vec{E} = -\vec{p}\cdot\vec{E}.$$
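The three results of this example can be verified by treating the dipole as two separate point charges. The numbers below are arbitrary illustrative values in SI units, and the helper functions are not part of the notes.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

q = 2e-9                                   # charge magnitude
r1 = (0.01, 0.02, 0.0)                     # position of +q
r2 = (-0.01, 0.0, 0.03)                    # position of -q
E = (100.0, -50.0, 20.0)                   # uniform external field

p = [q * (a - b) for a, b in zip(r1, r2)]  # dipole moment p = q (r1 - r2)

force_plus = [q * e for e in E]
force_minus = [-q * e for e in E]
net_force = [a + b for a, b in zip(force_plus, force_minus)]
net_torque = [a + b for a, b in zip(cross(r1, force_plus), cross(r2, force_minus))]
energy = -q * dot(E, r1) + q * dot(E, r2)  # sum of U = -qE.r for both charges, C = 0

print(net_force, net_torque, cross(p, E), energy, -dot(p, E))
```

The net force vanishes, the torque computed charge by charge agrees with $\vec{p} \times \vec{E}$, and the summed potential energy agrees with $-\vec{p}\cdot\vec{E}$.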
2.2 Electric flux, Gauss’s law, and integral theorems
A main reference of this section is David Tong’s lecture notes on vector calculus
http://www.damtp.cam.ac.uk/user/tong/vc.html.
2.2.1 Vector calculus
As we have seen in the previous section, electric fields are vector fields, which assign a vector to
every point in space $\mathbb{R}^3$. More precisely, a vector field $\vec{F}$ in d dimensions is a map
$$\vec{F} : \mathbb{R}^d \to \mathbb{R}^d. \tag{2.2.1}$$
Fields are fundamental concepts in physics. Besides vector fields, later on, we will encounter another
kind of field called a scalar field, which assigns a number to every point in space. More precisely,
a scalar field $\phi$ in d dimensions is a map
$$\phi : \mathbb{R}^d \to \mathbb{R}. \tag{2.2.2}$$
Vector fields and scalar fields are related by three important operations: gradient, divergence, and
curl.
Let us introduce the first operation, the gradient denoted by Grad. It is an operation that
takes a scalar field to a vector field.
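Definition 13 (Gradient). Given Cartesian coordinates $x_i$ with $i = 1, \cdots, d$ on $\mathbb{R}^d$, the gradient
is defined by
$$\mathrm{Grad} : \phi(\vec{r}) \mapsto \vec{\nabla}\phi(\vec{r}) \equiv \left(\frac{\partial\phi(\vec{r})}{\partial x_1}, \cdots, \frac{\partial\phi(\vec{r})}{\partial x_d}\right), \tag{2.2.3}$$
where $\phi(\vec{r})$ is a scalar field.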
The above definition relies on the choice of Cartesian coordinates $x_i$. A coordinate-free definition
is given by considering the difference between the scalar field $\phi$ evaluated at two nearby points $\vec{r}$
and $\vec{r} + \vec{\epsilon}$ with $\epsilon = |\vec{\epsilon}| \ll 1$,
$$\phi(\vec{r} + \vec{\epsilon}) - \phi(\vec{r}) = \vec{\epsilon}\cdot\vec{\nabla}\phi(\vec{r}) + O(\epsilon^2), \tag{2.2.4}$$
where $O(\epsilon^2)$ denotes the terms that are of order at least $\epsilon^2$. (2.2.4) could be regarded as an
alternative definition of the gradient. When picking a choice of Cartesian coordinates with $\vec{r} = (x_1, \cdots, x_d)$
and $\vec{\epsilon} = (\epsilon_1, \cdots, \epsilon_d)$, we recover the definition (2.2.3).
We have already seen many examples of the gradient when we discussed the conservative force
in Section 1.4.1. In particular, the gradient of the gravitational potential energy between two
point particles is computed explicitly in equations (1.4.30) and (1.4.31).
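The coordinate-free relation (2.2.4) is easy to probe numerically: the remainder after subtracting $\vec{\epsilon}\cdot\vec{\nabla}\phi$ should shrink by about a factor of 4 when ε is halved. The sketch below assumes the sample scalar field φ = −1/r (a potential-energy-like field with all constants set to 1), whose gradient is $\vec{r}/r^3$; the sample point and direction are arbitrary.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def phi(r):
    """Sample scalar field phi = -1 / |r| (assumed for illustration)."""
    return -1.0 / norm(r)

def grad_phi(r):
    """Analytic gradient of phi: r / |r|^3."""
    n = norm(r)
    return [c / n**3 for c in r]

def remainder(r, eps):
    """phi(r + eps) - phi(r) - eps . grad(phi)(r), the O(eps^2) term in (2.2.4)."""
    shifted = [a + b for a, b in zip(r, eps)]
    linear = sum(a * b for a, b in zip(eps, grad_phi(r)))
    return phi(shifted) - phi(r) - linear

r = (1.0, 2.0, 2.0)
direction = (0.6, 0.0, 0.8)                       # unit vector
rem_big = remainder(r, [1e-2 * c for c in direction])
rem_small = remainder(r, [5e-3 * c for c in direction])
ratio = abs(rem_big / rem_small)
print(rem_big, rem_small, ratio)
```

A ratio near 4 is the signature of a remainder that scales as $\epsilon^2$.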
We can view $\vec{\nabla}$ as an object in its own right, and call it the gradient operator.
Definition 14 (Gradient operator). Given Cartesian coordinates $x_i$ with $i = 1, \cdots, d$ on $\mathbb{R}^d$, the
gradient operator is defined by
$$\vec{\nabla} \equiv \left(\frac{\partial}{\partial x_1}, \cdots, \frac{\partial}{\partial x_d}\right). \tag{2.2.5}$$
It is a vector whose entries are partial derivatives.
The gradient operator is an example of a differential operator. A differential operator broadly
means a collection of derivatives that can act on some functions. For example, the one-variable
derivative $\frac{d}{dt}$ is a differential operator that can act on a function f(t) and gives $\frac{df(t)}{dt}$.
Besides acting on scalar fields, the gradient operator $\vec{\nabla}$ can act on other fields in different ways.
The divergence, denoted by Div, is a way for the gradient operator to act on a vector field and
produce a scalar field.
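Definition 15 (Divergence). Given Cartesian coordinates $x_i$ with $i = 1, \cdots, d$ on $\mathbb{R}^d$, the divergence is defined by
$$\mathrm{Div} : \vec{F}(\vec{r}) \mapsto \vec{\nabla}\cdot\vec{F}(\vec{r}) \equiv \sum_{i=1}^{d} \frac{\partial F_i(\vec{r})}{\partial x_i}, \tag{2.2.6}$$
where $\vec{F}(\vec{r})$ is a vector field.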
As an example, let us compute the divergence of the electric field of a charged particle.
Example. By Coulomb’s law (2.1.1), the electric field at $\vec{r}$ due to a point charge q located at the
origin $\vec{r} = 0$ is given by
$$\vec{E}(\vec{r}) = kq\,\frac{\vec{r}}{r^3}. \tag{2.2.7}$$
Compute $\vec{\nabla}\cdot\vec{E}(\vec{r})$.
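Solution: Let us first compute $\frac{\partial E_x}{\partial x}$. Using $r^2 = x^2 + y^2 + z^2$, we find
\[
\begin{aligned}
\frac{\partial E_x}{\partial x} &= \frac{\partial}{\partial x} \frac{kqx}{r^3} \\
&= \frac{kq}{r^3} - \frac{3}{2} \times 2x \times \frac{kqx}{r^5} \\
&= \frac{kq}{r^3} - \frac{3kqx^2}{r^5} .
\end{aligned} \tag{2.2.8}
\]
The derivatives $\frac{\partial E_y}{\partial y}$ and $\frac{\partial E_z}{\partial z}$ can be computed in a similar way. Now, we sum up these three terms and find
\[
\begin{aligned}
\vec{\nabla} \cdot \vec{E}(\vec{r}) &= \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} \\
&= \frac{3kq}{r^3} - \frac{3kq\,(x^2 + y^2 + z^2)}{r^5} \\
&= 0 .
\end{aligned} \tag{2.2.9}
\]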
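The cancellation in (2.2.9) can also be checked numerically. The following is a minimal sketch, assuming units in which kq = 1 and an arbitrarily chosen sample point away from the origin; the names `coulomb_field` and `divergence` and the step size `h` are illustrative choices, not part of these notes.

```python
import math

def coulomb_field(x, y, z, kq=1.0):
    """Field kq * r_vec / r^3 of a point charge at the origin, as in (2.2.7)."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (kq * x / r3, kq * y / r3, kq * z / r3)

def divergence(field, x, y, z, h=1e-4):
    """Central-difference estimate of dFx/dx + dFy/dy + dFz/dz."""
    dfx_dx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    dfy_dy = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    dfz_dz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return dfx_dx + dfy_dy + dfz_dz

# Sample point chosen arbitrarily, away from the origin where the field diverges.
div_value = divergence(coulomb_field, 0.3, -0.7, 1.1)
print(f"div E at (0.3, -0.7, 1.1) = {div_value:.2e}")
```

The printed value should be zero up to finite-difference and rounding error. The sketch says nothing about the origin itself, where the finite-difference estimate is not meaningful.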
Naively, we may conclude that $\vec{\nabla} \cdot \vec{E}(\vec{r})$ is a scalar field that is identically zero. However, we need
to be careful about the point at the origin $\vec{r} = 0$, where the electric field diverges. We will see later
that $\vec{\nabla} \cdot \vec{E}(\vec{r})$ cannot be zero at the origin $\vec{r} = 0$, but instead we actually have
\[
\vec{\nabla} \cdot \vec{E}(\vec{r}) = 4\pi k q \, \delta^3(\vec{r}) , \tag{2.2.10}
\]
where $\delta^3(\vec{r})$ is the three-dimensional Dirac delta function
\[
\delta^3(\vec{r}) \equiv \delta(x)\delta(y)\delta(z) . \tag{2.2.11}
\]
The Dirac delta function δ(x) can be loosely thought of as a function on the real line, which is zero
everywhere except at the origin, where it is infinite,
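\[
\delta(x) \approx \begin{cases} +\infty & \text{for } x = 0 , \\ 0 & \text{for } x \neq 0 , \end{cases} \tag{2.2.12}
\]
and is also constrained to satisfy the identity
\[
\int_{-\infty}^{\infty} \delta(x) \, dx = 1 . \tag{2.2.13}
\]
A formula for the Dirac delta function as a limit is
\[
\delta(x) = \lim_{b \to 0^+} \delta_b(x) , \qquad \delta_b(x) = \frac{1}{b\sqrt{\pi}} \, e^{-\left(\frac{x}{b}\right)^2} . \tag{2.2.14}
\]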
The following plot shows how the function $\delta_b(x)$ approaches $\delta(x)$ as $b \to 0^+$.
Figure 2.4: The plot of the function $\delta_b(x)$ for $b = 1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}$.
We will give a derivation of the statement (2.2.10) when we discuss the divergence theorem in
the next subsection.
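The two defining features of $\delta_b(x)$ in (2.2.14), unit area and a peak that grows as $b \to 0^+$, can be illustrated numerically. The sketch below assumes a simple midpoint rule on the interval [-10, 10], which is wide enough for the Gaussian tails to be negligible; the helper `integrate` and the grid size are illustrative choices.

```python
import math

def delta_b(x, b):
    """Gaussian approximation (2.2.14) to the Dirac delta function."""
    return math.exp(-(x / b) ** 2) / (b * math.sqrt(math.pi))

def integrate(f, lo, hi, n=20_000):
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

widths = [1, 1 / 2, 1 / 4, 1 / 8, 1 / 16]   # the values of b used in Figure 2.4
areas = [integrate(lambda x, b=b: delta_b(x, b), -10.0, 10.0) for b in widths]
peaks = [delta_b(0.0, b) for b in widths]

for b, area, peak in zip(widths, areas, peaks):
    print(f"b = {b:<7} area = {area:.6f}   peak = {peak:.4f}")
```

The area stays at 1 for every width, while the peak height $1/(b\sqrt{\pi})$ doubles each time $b$ is halved.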
Physically, the formula (2.2.10) tells us that the divergence is an operation that measures the
source (the charged particle) of the electric field. $\vec{\nabla} \cdot \vec{E}$ is non-zero at the position of the charged
particle, and is zero everywhere else. We will study the following two examples that will further
confirm our physical intuition.
Let us consider the divergence of the electric field generated by n charged particles.
Example. Consider n charged particles with charges q1 , q2 , · · · , qn at the positions ⃗r1 , ⃗r2 , · · · , ⃗rn .
The electric field generated by them is
\[
\vec{E}(\vec{r}) = \sum_{i=1}^{n} k q_i \frac{\vec{r} - \vec{r}_i}{|\vec{r} - \vec{r}_i|^3} . \tag{2.2.15}
\]
Compute $\vec{\nabla} \cdot \vec{E}(\vec{r})$.
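Solution: We note that the divergence is a linear operation. That is, for a linear combination of
two vector fields
\[
a_1 \vec{F}_1(\vec{r}) + a_2 \vec{F}_2(\vec{r}) , \tag{2.2.16}
\]
with $a_1$ and $a_2$ two constants independent of the position vector $\vec{r}$, we have
\[
\vec{\nabla} \cdot \left[ a_1 \vec{F}_1(\vec{r}) + a_2 \vec{F}_2(\vec{r}) \right] = a_1 \vec{\nabla} \cdot \vec{F}_1(\vec{r}) + a_2 \vec{\nabla} \cdot \vec{F}_2(\vec{r}) . \tag{2.2.17}
\]
Hence, we have
\[
\begin{aligned}
\vec{\nabla} \cdot \vec{E}(\vec{r}) &= \sum_{i=1}^{n} k q_i \, \vec{\nabla} \cdot \frac{\vec{r} - \vec{r}_i}{|\vec{r} - \vec{r}_i|^3} \\
&= 4\pi k \sum_{i=1}^{n} q_i \, \delta^3(\vec{r} - \vec{r}_i) .
\end{aligned} \tag{2.2.18}
\]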
We see again that the divergence of the electric field is only non-zero at the place where the charged
particles reside.
We can go one step further to compute the divergence of the electric field generated by a charged
object.
Example. The electric field generated by a charged object with a charge density ρ(⃗r) is
\[
\vec{E}(\vec{r}) = \int_D k \rho(\vec{r}\,') \frac{\vec{r} - \vec{r}\,'}{|\vec{r} - \vec{r}\,'|^3} \, dx' dy' dz' , \tag{2.2.19}
\]
where $D$ is the domain of the charged object. Compute $\vec{\nabla} \cdot \vec{E}(\vec{r})$.
Solution: By the linearity of the divergence (2.2.16), we have
\[
\begin{aligned}
\vec{\nabla} \cdot \vec{E}(\vec{r}) &= \int_D k \rho(\vec{r}\,') \, \vec{\nabla} \cdot \frac{\vec{r} - \vec{r}\,'}{|\vec{r} - \vec{r}\,'|^3} \, dx' dy' dz' \\
&= \int k \rho(\vec{r}\,') \, \vec{\nabla} \cdot \frac{\vec{r} - \vec{r}\,'}{|\vec{r} - \vec{r}\,'|^3} \, dx' dy' dz' \\
&= \int 4\pi k \rho(\vec{r}\,') \, \delta^3(\vec{r} - \vec{r}\,') \, dx' dy' dz' \\
&= 4\pi k \rho(\vec{r}) ,
\end{aligned} \tag{2.2.20}
\]
where we have used the fact that $\rho(\vec{r}\,')$ is zero at $\vec{r}\,' \notin D$ at the second equality, and (2.2.13) at the
fourth equality. We see that $\vec{\nabla} \cdot \vec{E}(\vec{r})$ gives the charge density of the object. It again confirms our
intuition that the divergence measures the source of the electric field.
In three dimensions, there is another way for the gradient operator $\vec{\nabla}$ to act on a vector field,
that is, by taking the cross product.
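Definition 16 (Curl). Given Cartesian coordinates (x, y, z), the curl, denoted as Curl, is defined
by
\[
\mathrm{Curl} : \vec{F}(\vec{r}) \mapsto \vec{\nabla} \times \vec{F}(\vec{r}) \equiv \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} , \ \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x} , \ \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) , \tag{2.2.21}
\]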
where F⃗ (⃗r) is a vector field.
We see that Curl is an operation that takes a vector field to another vector field. Alternatively,
the curl can be defined by
\[
(\vec{\nabla} \times \vec{F})_i = \sum_{j,k=1}^{3} \epsilon_{ijk} \frac{\partial F_k}{\partial x_j} , \tag{2.2.22}
\]
where ϵijk is a totally antisymmetric tensor with ϵ123 = 1.
The meaning of the curl is that it measures the rotation of a vector field. Let us try to understand
this statement by looking at the following examples.
Example. Compute the curl of the vector field
F⃗ (⃗r) = (y, −x, 0) ,
(2.2.23)
whose field lines are plotted below
[Figure: field lines of $\vec{F}$ in the x-y plane, with both axes running from −1.0 to 1.0.]
Solution: By a direct computation, we find
\[
\vec{\nabla} \times \vec{F} = \left( 0 , \ 0 , \ \frac{\partial(-x)}{\partial x} - \frac{\partial y}{\partial y} \right) = (0, 0, -2) . \tag{2.2.24}
\]
From the above figure, we see that $\vec{F}$ is a vector field that rotates clockwise on the x-y plane.
Indeed, our computation shows that $\vec{\nabla} \times \vec{F}$ is a vector pointing in the negative z-direction, which
is perpendicular to the x-y plane.
Next, we consider the curl of electric fields. From the plots of the electric field lines in
Figures 2.1, 2.2, and 2.3, we see that static electric fields generally do not appear to rotate. Let us
verify our expectations.
Example. Compute the curl of the electric field
\[
\vec{E}(\vec{r}) = \frac{kq\,\vec{r}}{r^3} . \tag{2.2.25}
\]
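Solution: Let us compute the z-component
\[
\begin{aligned}
(\vec{\nabla} \times \vec{E})_z &= kq \left[ \frac{\partial}{\partial x} \frac{y}{r^3} - \frac{\partial}{\partial y} \frac{x}{r^3} \right] \\
&= kq \left[ y \, \frac{-3x}{r^5} - x \, \frac{-3y}{r^5} \right] \\
&= 0 .
\end{aligned} \tag{2.2.26}
\]
By similar computations, we find that the other two components of $\vec{\nabla} \times \vec{E}$ are also zero, and we
conclude
\[
\vec{\nabla} \times \vec{E}(\vec{r}) = 0 . \tag{2.2.27}
\]
We still need to worry about the point at $\vec{r} = 0$ where the electric field diverges. We will see in
Section 2.2.3 that $\vec{\nabla} \times \vec{E}$ is also zero at the origin.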
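Both curl computations above, (2.2.24) and (2.2.27), can be reproduced with finite differences. This is a minimal sketch, assuming kq = 1 and an arbitrary sample point away from the origin; the helpers `partial` and `curl` are illustrative names that implement definition (2.2.21) directly.

```python
def partial(field, component, axis, point, h=1e-5):
    """Central-difference estimate of d F_component / d x_axis at the point."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (field(*forward)[component] - field(*backward)[component]) / (2 * h)

def curl(field, point):
    """Curl following (2.2.21), with components indexed 0, 1, 2 for x, y, z."""
    return (
        partial(field, 2, 1, point) - partial(field, 1, 2, point),
        partial(field, 0, 2, point) - partial(field, 2, 0, point),
        partial(field, 1, 0, point) - partial(field, 0, 1, point),
    )

def rotating_field(x, y, z):
    """The field (y, -x, 0) of (2.2.23)."""
    return (y, -x, 0.0)

def coulomb_field(x, y, z, kq=1.0):
    """The point-charge field kq * r_vec / r^3 of (2.2.25)."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (kq * x / r3, kq * y / r3, kq * z / r3)

sample = (0.4, -0.2, 0.9)   # arbitrary point away from the origin
curl_rotating = curl(rotating_field, sample)
curl_coulomb = curl(coulomb_field, sample)
print("curl of (y, -x, 0):", curl_rotating)
print("curl of Coulomb field:", curl_coulomb)
```

Up to rounding, the first result is (0, 0, -2) at every point, while the second vanishes wherever the sample point avoids the origin.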
Like divergence, curl is also a linear operation, i.e.
\[
\vec{\nabla} \times (a_1 \vec{F}_1 + a_2 \vec{F}_2) = a_1 \vec{\nabla} \times \vec{F}_1 + a_2 \vec{\nabla} \times \vec{F}_2 . \tag{2.2.28}
\]
Then, by a similar argument as before for the divergence, we find that the electric fields of n charged
particles or a charged object are nonrotational.
In Section 2.6, we will see that the electric fields can rotate when there are time-dependent
magnetic fields.
2.2.2 Electric flux, divergence theorem, and Gauss’s law
As we have learned from the previous subsection, the divergence provides a way to measure the
sources of the electric fields. In this section, we will introduce Gauss’s law, which is a very different
way to measure the source of an electric field. The equivalence of these two ways leads to the
divergence theorem.
The idea of Gauss’s law is that given a closed surface S, we would like to know if there is any
net charge inside S by analyzing the electric fields on S. To gain more intuition, let us consider
the situation in which a charged particle is inside the closed surface S. When the charge of the
particle is positive, the electric field lines of the particle always point outward through the surface.
Figure 2.5 shows the electric field lines piercing a piece of the surface.
Figure 2.5: Electric field lines pierce a surface.
On the other hand, when the charge of the particle is negative, all the electric field lines are pointing
inward to the surface. Hence, there should be a relation between the net number of electric field
lines piercing a closed surface (the number of outward electric field lines minus the number of inward
electric field lines) and the net electric charge inside the surface.
Let us try to make this relation more precise. First, we need to have a more precise definition
of the “number of electric field lines piercing a surface”. This leads to the following definition. We
would like to make our discussion a bit more general, by working in d-dimensional space $\mathbb{R}^d$, and
the “surfaces” in the following refer specifically to (d − 1)-dimensional surfaces in $\mathbb{R}^d$.
But, you could always fix d = 3 if you like.
Definition 17 (Flux). The flux Φ of a vector field F⃗ through an oriented surface S is defined as
the integral
\[
\Phi = \int_S \vec{F} \cdot d\vec{A} . \tag{2.2.29}
\]
Let us try to decode this definition. Consider a very small piece of the surface with area ∆A, which
is small enough such that we could approximate it by a plane as shown in the following picture.
We define the area vector $\Delta\vec{A}$ by the following two conditions:
1. $|\Delta\vec{A}| = \Delta A$ ,
2. $\Delta\vec{A}$ is orthogonal to all the vectors along the plane, i.e. $\Delta\vec{A} \cdot \vec{v} = 0$ if $\vec{v}$ is along the plane.
The vector $d\vec{A}$ is defined as the limit when the area becomes infinitesimal. These two conditions
fix the area vector $\Delta\vec{A}$ up to a sign. That is, if an area vector $d\vec{A}$ satisfies conditions 1 and 2, then
the vector $-d\vec{A}$ also satisfies the same conditions. To capture the information of this sign, let us
look at the unit normal vector of the surface S
\[
\hat{n} = \frac{d\vec{A}}{|d\vec{A}|} , \tag{2.2.30}
\]
which is called the orientation of the surface S. To make the flux well-behaved, we would like to
require that the orientation n̂ should vary continuously locally on the surface S. More precisely,
we demand the condition:
3. In any open neighborhood on the surface S, the unit normal vector n̂ is continuous.
If the orientation n̂ can be extended to be continuous on the entire surface S, then we have a way to
consistently choose the area vector $d\vec{A}$ on S. In fact, not every surface admits a continuous orientation.
The surfaces that admit a continuous orientation are called orientable surfaces; the others are
called non-orientable surfaces. We can only define the flux for orientable surfaces. The orientable
surfaces with a chosen orientation are called oriented surfaces. We regard the orientation as
part of the definition of an oriented surface. That is, two oriented surfaces that coincide in space
are regarded as different oriented surfaces if they have different orientations. For a closed oriented
surface, we choose the convention that its orientation always points outward from the surface.
In (2.2.29), we take the inner product between $d\vec{A}$ and the vector field $\vec{F}$ and integrate over
the surface S. Let d = 3 and the vector field $\vec{F}$ be the electric field. The formula (2.2.29)
defines the electric flux, which is our precise definition of the “number of electric field lines piercing
a surface”. We can see that the definition (2.2.29) of the electric flux agrees with our expectation.
Namely, when we have a positively (negatively) charged object inside the closed surface S, we find
a positive (negative) flux. We will see later a precise formula (Gauss’s law) on the relation between
the flux through a closed surface and the net charge inside that surface.
Now, from our discussions in the previous subsection and this subsection, we have seen two
ways to find the sources (charged objects) of the electric fields, by the divergence and by the flux.
These two ways are beautifully related by the divergence theorem, also known as Gauss’s theorem.
Theorem 2.2.1 (Divergence theorem). For a vector field $\vec{F}$ over $\mathbb{R}^d$,
\[
\int_B \vec{\nabla} \cdot \vec{F} \, d^d x = \int_S \vec{F} \cdot d\vec{A} , \tag{2.2.31}
\]
where B is a bounded region whose boundary ∂B = S is a piecewise smooth closed (d − 1)-dimensional surface.
Let us leave the proof of the divergence theorem to your analysis class. Instead, we will try to
understand the physical meaning of the divergence theorem. We will again focus on d = 3 and
$\vec{F} = \vec{E}$.
Example. Consider a charged particle of charge q at the origin $\vec{r} = 0$, which generates the electric
field
\[
\vec{E}(\vec{r}) = \frac{kq\,\vec{r}}{|\vec{r}|^3} . \tag{2.2.32}
\]
Compute the flux of the electric field through a round two-sphere $S^2$ of radius R centered at the
origin.
Solution: By the spherical symmetry, the vector $d\vec{A}$ is along the radial direction, and we have
\[
d\vec{A} = \hat{r} \, |d\vec{A}| = \hat{r} \, dA . \tag{2.2.33}
\]
It is convenient to work in the spherical coordinates (r, θ, ϕ), which are related to the Cartesian
coordinates (x, y, z) by
\[
x = r \sin\theta \cos\phi , \qquad y = r \sin\theta \sin\phi , \qquad z = r \cos\theta , \tag{2.2.34}
\]
where r, θ, ϕ are in the range r ∈ [0, ∞), θ ∈ [0, π], ϕ ∈ [0, 2π). We would like to know how to
perform the volume integral and the surface integral in the spherical coordinates.
Let us consider a more general problem, the volume integral in a general coordinate system
(u, v, w). Consider a small cube in the (u, v, w)-coordinate, whose six faces are on the constant u,
v, or w planes. The sides of the cube have lengths ∆u, ∆v, and ∆w. The volume of the cube is not
simply given by ∆u∆v∆w, because the sides are not necessarily at right angles. When the cube is
small enough, we have
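\[
\begin{aligned}
\Delta x &= \frac{\partial x}{\partial u} \Delta u + \frac{\partial x}{\partial v} \Delta v + \frac{\partial x}{\partial w} \Delta w + \cdots , \\
\Delta y &= \frac{\partial y}{\partial u} \Delta u + \frac{\partial y}{\partial v} \Delta v + \frac{\partial y}{\partial w} \Delta w + \cdots , \\
\Delta z &= \frac{\partial z}{\partial u} \Delta u + \frac{\partial z}{\partial v} \Delta v + \frac{\partial z}{\partial w} \Delta w + \cdots ,
\end{aligned} \tag{2.2.35}
\]
where the $\cdots$ are the second and higher order terms $O(\Delta u^2, \Delta v^2, \Delta w^2, \Delta u \Delta v, \Delta v \Delta w, \Delta u \Delta w)$. In
the matrix form, we have
\[
\begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix} = J \begin{pmatrix} \Delta u \\ \Delta v \\ \Delta w \end{pmatrix} , \qquad
J = \begin{pmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\
\frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w}
\end{pmatrix} , \tag{2.2.36}
\]
where J is called the Jacobian matrix. Geometrically, this means that the cube in the (u, v, w)-coordinate is a parallelepiped in the (x, y, z)-coordinate with sides given by the vectors
\[
\begin{aligned}
\overrightarrow{\Delta u} &= \left( \frac{\partial x}{\partial u} , \frac{\partial y}{\partial u} , \frac{\partial z}{\partial u} \right) \Delta u , \\
\overrightarrow{\Delta v} &= \left( \frac{\partial x}{\partial v} , \frac{\partial y}{\partial v} , \frac{\partial z}{\partial v} \right) \Delta v , \\
\overrightarrow{\Delta w} &= \left( \frac{\partial x}{\partial w} , \frac{\partial y}{\partial w} , \frac{\partial z}{\partial w} \right) \Delta w .
\end{aligned} \tag{2.2.37}
\]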
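The volume of the parallelepiped is given by
\[
\left| (\overrightarrow{\Delta u} \times \overrightarrow{\Delta v}) \cdot \overrightarrow{\Delta w} \right| = |\det J| \, \Delta u \Delta v \Delta w . \tag{2.2.38}
\]
Let us compute the Jacobian for the spherical coordinate
\[
\det \begin{pmatrix}
\frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} & \frac{\partial x}{\partial \phi} \\
\frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} & \frac{\partial y}{\partial \phi} \\
\frac{\partial z}{\partial r} & \frac{\partial z}{\partial \theta} & \frac{\partial z}{\partial \phi}
\end{pmatrix}
= \det \begin{pmatrix}
\sin\theta \cos\phi & r \cos\theta \cos\phi & -r \sin\theta \sin\phi \\
\sin\theta \sin\phi & r \cos\theta \sin\phi & r \sin\theta \cos\phi \\
\cos\theta & -r \sin\theta & 0
\end{pmatrix}
= r^2 \sin\theta . \tag{2.2.39}
\]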
The volume of a cube in the spherical coordinate is
\[
r^2 \sin\theta \, \Delta r \Delta\theta \Delta\phi . \tag{2.2.40}
\]
The integration measure in the spherical coordinate should be
\[
r^2 \sin\theta \, dr \, d\theta \, d\phi . \tag{2.2.41}
\]
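The determinant in (2.2.39) can be cross-checked without any algebra by differentiating the coordinate map (2.2.34) numerically. The sketch below assumes an arbitrary sample point (r, θ, ϕ) and a central-difference step `h`; the helper names `to_cartesian`, `jacobian`, and `det3` are illustrative.

```python
import math

def to_cartesian(r, theta, phi):
    """Spherical to Cartesian coordinates, following (2.2.34)."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )

def jacobian(r, theta, phi, h=1e-6):
    """Matrix J[i][j] = d x_i / d u_j with u = (r, theta, phi), by central differences."""
    point = (r, theta, phi)
    columns = []
    for axis in range(3):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        xf = to_cartesian(*forward)
        xb = to_cartesian(*backward)
        columns.append([(a - b) / (2 * h) for a, b in zip(xf, xb)])
    # columns[j][i] holds d x_i / d u_j, so transpose to get J[i][j].
    return [[columns[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

r, theta, phi = 2.0, 0.7, 1.9            # arbitrary sample point
numeric_det = det3(jacobian(r, theta, phi))
exact_det = r ** 2 * math.sin(theta)     # right-hand side of (2.2.39)
print(numeric_det, exact_det)
```

The two printed numbers should agree to many digits, and the same helpers can be pointed at any other coordinate map by replacing `to_cartesian`.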
Now, since a round two-sphere centered at the origin has a constant radial coordinate r = R, the
area vector $d\vec{A}$ should be
\[
d\vec{A} = \hat{r} \, dA = \hat{r} \, (r^2 \sin\theta \, d\theta \, d\phi) . \tag{2.2.42}
\]
The flux is then computed by the integral
\[
\Phi = \int_0^{\pi} d\theta \int_0^{2\pi} d\phi \ kq \frac{\vec{r}}{r^3} \cdot \hat{r} \, (r^2 \sin\theta) = 4\pi k q . \tag{2.2.43}
\]
Now, we can complete our argument that a three-dimensional Dirac delta function should sit
on the right-hand side of the equation (2.2.10). From our previous computations (2.2.8) and (2.2.9),
we know that the divergence of the electric field $\vec{\nabla} \cdot \vec{E}$ is zero except at the origin where the point
charge sits. $\vec{\nabla} \cdot \vec{E}$ cannot be zero at the origin, because by the divergence theorem (2.2.31), we
know that
\[
\int_B \vec{\nabla} \cdot \vec{E} \, d^3 x = 4\pi k q , \tag{2.2.44}
\]
where B is the ball bounded by the two-sphere $S^2$.
In fact, the volume integral on the left-hand side only receives a contribution from the point at the
origin $\vec{r} = 0$. Hence, $\vec{\nabla} \cdot \vec{E}$ must be proportional to a Dirac delta function in order to give a nonzero
volume integral. The proportionality constant can be fixed by (2.2.44) to be 4πkq.
The divergence theorem has a very profound consequence on the electric flux.
Corollary 2.2.2. For a divergence-free vector field $\vec{F}$, i.e. $\vec{\nabla} \cdot \vec{F} = 0$, its flux through a surface S
is invariant under local continuous deformation of S.
To see this corollary, let us consider the following example.
Example. A charged object of a charge density ρ(⃗r) is inside a closed surface S as shown in
Figure 2.6.
Figure 2.6: The electric flux is invariant under deformations on S as long as the deformation does
not cross any charges.
Now let us consider continuously deforming the surface S to S′, and let ∆B be the region bounded
by the two surfaces S and S′, i.e. $\partial(\Delta B) = S \cup \overline{S'}$, where $\overline{S'}$ denotes the orientation reversal of S′.
Assume that there is no charged object inside ∆B. We can compute the difference of the electric
flux through S and through S′,
\[
\Delta\Phi = \int_S \vec{E} \cdot d\vec{A} - \int_{S'} \vec{E} \cdot d\vec{A} = \int_{S \cup \overline{S'}} \vec{E} \cdot d\vec{A} = \int_{\Delta B} \vec{\nabla} \cdot \vec{E} \, d^3 x = 0 , \tag{2.2.45}
\]
where at the last equality we used the fact that there is no charged object inside ∆B, so $\vec{\nabla} \cdot \vec{E} = 0$
inside ∆B.
We can compute the electric flux directly using the divergence theorem and the formula (2.2.20).
Let B be the region bounded by S. We have
\[
\Phi = \int_S \vec{E} \cdot d\vec{A} = \int_B \vec{\nabla} \cdot \vec{E} \, d^3 x = 4\pi k \int_B \rho(\vec{r}) \, d^3 x = 4\pi k Q , \tag{2.2.46}
\]
where Q is the total (net) charge inside the surface S. We have found that the electric flux through
a surface S equals 4πk times the total net charge inside S. This statement is called Gauss’s law of
the electric field. The surface S is also called a Gaussian surface.
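Gauss’s law can be tested numerically by integrating $\vec{E} \cdot d\vec{A}$ over a sphere with the measure (2.2.42), for a point charge placed at different positions. This is a minimal sketch, assuming kq = 1, a unit sphere centered at the origin, and a midpoint rule on a (θ, ϕ) grid; the function name `flux_through_sphere` and the chosen charge positions are illustrative.

```python
import math

def flux_through_sphere(charge_pos, radius=1.0, kq=1.0, n_theta=400, n_phi=200):
    """Midpoint-rule estimate of the flux of kq (r - r0) / |r - r0|^3
    through a sphere of the given radius centered at the origin."""
    x0, y0, z0 = charge_pos
    dtheta = math.pi / n_theta
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * dtheta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            # Outward unit normal r_hat at this point of the sphere.
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            # Vector from the charge to the surface point.
            dx, dy, dz = radius * nx - x0, radius * ny - y0, radius * nz - z0
            dist3 = (dx * dx + dy * dy + dz * dz) ** 1.5
            e_dot_n = kq * (dx * nx + dy * ny + dz * nz) / dist3
            total += e_dot_n * radius ** 2 * sin_t * dtheta * dphi
    return total

flux_centered = flux_through_sphere((0.0, 0.0, 0.0))   # charge at the center
flux_offset = flux_through_sphere((0.3, 0.0, 0.2))     # charge inside, off center
flux_outside = flux_through_sphere((2.0, 0.0, 0.0))    # charge outside the sphere
print(flux_centered, flux_offset, flux_outside, 4 * math.pi)
```

The first two values should both be close to 4π (that is, 4πkq with kq = 1), and the third close to zero, since the flux depends only on the net charge enclosed and not on where it sits inside the surface.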
2.2.3 Stokes’ theorem, Poincaré lemma, and electric potential
In Section 2.2.1, we have seen that curl measures the rotation of a vector field. Let us introduce
a different way to measure the rotation of a vector field. The rotation of a vector field $\vec{F}$ along a
closed curve C can be measured by the following loop integral:
\[
\int_C \vec{F} \cdot d\vec{r} . \tag{2.2.47}
\]
Example. Compute the loop integral (2.2.47) of the vector field
F⃗ (⃗r) = (y, −x, 0) ,
(2.2.48)
along a curve C which is a counter-clockwise circle of radius r centered at the origin as shown in
the following figure
[Figure: field lines of $\vec{F}$ in the x-y plane, with both axes running from −1.0 to 1.0.]
Solution: We compute
\[
\int_C \vec{F} \cdot d\vec{r} = - \int_C |\vec{F}| \, |d\vec{r}| = - \int_0^{2\pi} r^2 \, d\theta = -2\pi r^2 , \tag{2.2.49}
\]
where at the first equality we use $\vec{F} \cdot d\vec{r} = -|\vec{F}| \, |d\vec{r}|$ because the vector field $\vec{F}$ always points in the
direction opposite to $d\vec{r}$. We see that the loop integral is indeed nonzero for a rotating vector field.
Let us also consider an example of the loop integral (2.2.47) for a nonrotating vector field.
Example. Compute the loop integral (2.2.47) of the electric field of a point charge q at the origin,
\[
\vec{E}(\vec{r}) = \frac{kq\,\vec{r}}{r^3} . \tag{2.2.50}
\]
Solution: Let us take the curve to be a circle of radius r on the x-y plane, centered at the origin, with counter-clockwise
orientation, so we have
\[
d\vec{r} = (-\sin\phi, \cos\phi, 0) \, r \, d\phi . \tag{2.2.51}
\]
Since $\vec{E}$ is radial while $d\vec{r}$ is tangent to the circle, we can easily see that we have
\[
\vec{E} \cdot d\vec{r} = 0 . \tag{2.2.52}
\]
So the loop integral indeed vanishes for the nonrotating field (2.2.50).
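The two loop integrals above can be evaluated numerically by stepping around the circle with the parametrization (2.2.51). The sketch below assumes kq = 1 and an arbitrarily chosen radius; `loop_integral` is an illustrative helper using the midpoint rule in the angle ϕ.

```python
import math

def loop_integral(field, radius, n=10_000):
    """Midpoint-rule estimate of the line integral of the field around a
    counter-clockwise circle of the given radius in the x-y plane."""
    dphi = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        x, y = radius * math.cos(phi), radius * math.sin(phi)
        # d r = (-sin phi, cos phi, 0) * radius * dphi, as in (2.2.51).
        dx = -math.sin(phi) * radius * dphi
        dy = math.cos(phi) * radius * dphi
        fx, fy, _ = field(x, y, 0.0)
        total += fx * dx + fy * dy
    return total

def rotating_field(x, y, z):
    """The field (y, -x, 0) of (2.2.48)."""
    return (y, -x, 0.0)

def coulomb_field(x, y, z, kq=1.0):
    """The point-charge field kq * r_vec / r^3 of (2.2.50)."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (kq * x / r3, kq * y / r3, kq * z / r3)

radius = 1.5
loop_rotating = loop_integral(rotating_field, radius)
loop_coulomb = loop_integral(coulomb_field, radius)
print(loop_rotating, -2 * math.pi * radius ** 2)
print(loop_coulomb)
```

The first line should show two matching values, in agreement with (2.2.49), and the second a value that is zero up to rounding, in agreement with (2.2.52).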
The equivalence of the two ways of measuring the rotation of the vector field, by taking curl
and by the loop integral, leads to Stokes’ theorem.
Theorem 2.2.3 (Stokes’ theorem). Let S be a smooth surface in $\mathbb{R}^3$ with boundary C = ∂S a
piecewise smooth curve. For any smooth vector field $\vec{F}(\vec{r})$, we have
\[
\int_S (\vec{\nabla} \times \vec{F}) \cdot d\vec{A} = \int_C \vec{F} \cdot d\vec{r} . \tag{2.2.53}
\]
We will again leave the proof of Stokes’ theorem to your analysis class.
Instead, let us verify Stokes’ theorem for the examples (2.2.48) and (2.2.50). First, we have
computed the curl of (2.2.48) previously in (2.2.24). We have
\[
\int_S (\vec{\nabla} \times \vec{F}) \cdot d\vec{A} = \int_S (0, 0, -2) \cdot d\vec{A} = \int_0^{r} dr' \int_0^{2\pi} d\theta \, (-2r') = -2\pi r^2 . \tag{2.2.54}
\]
We got the same answer as the loop integral (2.2.49).
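Next, we look at the example (2.2.50). We found that the curl of (2.2.50) was zero previously
in (2.2.26) (except at the origin). Now, we can give an argument that $\vec{\nabla} \times \vec{E}$ is also zero at the
origin. We can take the surface S to be a small disk $D_\epsilon^2$ on the x-y plane of radius ϵ centered at
the origin. The area vector on $D_\epsilon^2$ points in the z-direction,
\[
d\vec{A} = \hat{z} \, dA . \tag{2.2.55}
\]
Now, let us consider
\[
\begin{aligned}
(\vec{\nabla} \times \vec{E})_z(0) &= \lim_{\epsilon \to 0} \frac{1}{\pi \epsilon^2} \int_{D_\epsilon^2} (\vec{\nabla} \times \vec{E})_z \, dA \\
&= \lim_{\epsilon \to 0} \frac{1}{\pi \epsilon^2} \int_{D_\epsilon^2} (\vec{\nabla} \times \vec{E}) \cdot d\vec{A} \\
&= \lim_{\epsilon \to 0} \frac{1}{\pi \epsilon^2} \int_{\partial D_\epsilon^2} \vec{E} \cdot d\vec{r} \\
&= \lim_{\epsilon \to 0} \frac{0}{\pi \epsilon^2} \\
&= 0 .
\end{aligned} \tag{2.2.56}
\]
On the first equality, we have rewritten $(\vec{\nabla} \times \vec{E})_z(0)$ as the average of $(\vec{\nabla} \times \vec{E})_z$ over the
infinitesimal disk $D_\epsilon^2$. We have used (2.2.55) on the second equality, Stokes’ theorem on the third
equality, and (2.2.52) on the fourth equality. In similar ways, we can show that $(\vec{\nabla} \times \vec{E})_x = 0 = (\vec{\nabla} \times \vec{E})_y$. Hence, $\vec{\nabla} \times \vec{E} = 0$ everywhere.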
As discussed in Section 2.2.1, static electric fields are always nonrotational. There is a beautiful
theorem saying that nonrotational is equivalent to conservative.
Theorem 2.2.4 (Poincaré lemma). For vector fields defined everywhere on $\mathbb{R}^3$, conservative is
equivalent to nonrotational, i.e.
\[
\vec{F} = \vec{\nabla}\phi \iff \vec{\nabla} \times \vec{F} = 0 . \tag{2.2.57}
\]
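Proof: First, let us prove the ⇒ direction. We have
\[
\vec{F} = \vec{\nabla}\phi , \tag{2.2.58}
\]
or in component form
\[
F_i = \frac{\partial \phi}{\partial x_i} . \tag{2.2.59}
\]
Now, we compute the components of $\vec{\nabla} \times \vec{F}$ using (2.2.22),
\[
(\vec{\nabla} \times \vec{F})_i = \sum_{j,k=1}^{3} \epsilon_{ijk} \frac{\partial F_k}{\partial x_j} = \sum_{j,k=1}^{3} \epsilon_{ijk} \frac{\partial^2 \phi}{\partial x_j \partial x_k} = 0 , \tag{2.2.60}
\]
where at the last equality we have used the fact that partial derivatives commute with each other, while $\epsilon_{ijk}$ is antisymmetric in j and k.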
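Next, let us prove the ⇐ direction. We assume that $\vec{F}$ is a nonrotational field,
\[
\vec{\nabla} \times \vec{F} = 0 . \tag{2.2.61}
\]
Let us define a scalar field $\phi(\vec{r})$ by the line integral
\[
\phi(\vec{r}) = \int_{C(\vec{r}_0, \vec{r})} \vec{F}(\vec{r}\,') \cdot d\vec{r}\,' , \tag{2.2.62}
\]
where $C(\vec{r}_0, \vec{r})$ is a curve from $\vec{r}_0$ to $\vec{r}$, and $\vec{r}_0$ is any fixed reference point. This line integral defines
an unambiguous scalar field because it only depends on the endpoints of the curve but not
the curve itself. To see this, let us consider the difference of the line integral along the curves
$C_1(\vec{r}_0, \vec{r})$ and $C_2(\vec{r}_0, \vec{r})$,
\[
\int_{C_1(\vec{r}_0, \vec{r})} \vec{F}(\vec{r}\,') \cdot d\vec{r}\,' - \int_{C_2(\vec{r}_0, \vec{r})} \vec{F}(\vec{r}\,') \cdot d\vec{r}\,' = \int_{C_1(\vec{r}_0, \vec{r}) \cup \overline{C_2(\vec{r}_0, \vec{r})}} \vec{F}(\vec{r}\,') \cdot d\vec{r}\,' = \int_S (\vec{\nabla} \times \vec{F}) \cdot d\vec{A} = 0 ,
\]
where $\overline{C_2}$ denotes the curve $C_2$ traversed in the reverse direction, and at the second equality we have used Stokes’ theorem for the surface S with boundary
$\partial S = C_1(\vec{r}_0, \vec{r}) \cup \overline{C_2(\vec{r}_0, \vec{r})}$.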
Figure 2.7: Changing the curve in the definition of the scalar field.
Now, it is easy to check that the scalar field $\phi(\vec{r})$ defined in (2.2.62) satisfies
\[
\vec{F} = \vec{\nabla}\phi . \tag{2.2.63}
\]
Hence, $\vec{F}$ is conservative. Q.E.D.
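The path independence used in the proof can be illustrated for the point-charge field, whose curl vanishes away from the origin. The sketch below assumes kq = 1 and integrates along two polygonal paths with the same endpoints, both chosen by hand to stay away from the origin; `line_integral` and the waypoints are illustrative choices.

```python
import math

def coulomb_field(x, y, z, kq=1.0):
    """The point-charge field kq * r_vec / r^3."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (kq * x / r3, kq * y / r3, kq * z / r3)

def line_integral(field, waypoints, n=5_000):
    """Integrate field . dr along straight segments joining the waypoints,
    using the midpoint rule with n steps per segment."""
    total = 0.0
    for start, end in zip(waypoints, waypoints[1:]):
        step = [(e - s) / n for s, e in zip(start, end)]
        for i in range(n):
            mid = [s + (i + 0.5) * d for s, d in zip(start, step)]
            f = field(*mid)
            total += sum(fc * dc for fc, dc in zip(f, step))
    return total

start = (1.0, 0.0, 0.0)
end = (0.0, 2.0, 1.0)
direct = line_integral(coulomb_field, [start, end])
detour = line_integral(coulomb_field, [start, (1.0, 3.0, -1.0), (-1.0, 2.0, 2.0), end])

# For this field the integral equals kq (1/|start| - 1/|end|).
expected = 1.0 / math.dist(start, (0, 0, 0)) - 1.0 / math.dist(end, (0, 0, 0))
print(direct, detour, expected)
```

Both paths give the same value, which depends only on the distances of the two endpoints from the charge. With the convention (2.2.62) this corresponds to the scalar field $\phi = -kq/r$ up to a constant.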
Since static electric fields are always nonrotational, they are always conservative. In other
words, for an electric field $\vec{E}(\vec{r})$, there always exists a scalar field $V(\vec{r})$ such that
\[
\vec{E} = -\vec{\nabla} V . \tag{2.2.64}
\]
This scalar field $V(\vec{r})$ is called the electric potential.
2.3 Applying Gauss’s Law, Electric potential
In this section, we discuss various applications of Gauss’s Law and electric potential.
2.3.1 A charged isolated conductor
Application. Gauss’s law implies that if an isolated conductor carries an excess charge, the charge
would be entirely on the surface of the conductor.
Conductors are materials in which charged particles (electrons) are free to move; examples
include metals (such as copper in common lamp wire), the human body, and tap water. The
charged particles in nonconductors (insulators) are not free to move; examples include rubber
(such as the insulation on common lamp wire), plastic, glass, and chemically pure water. The
net electric field inside a conductor must be zero, because, in a generic situation, the electric field
would not always point in the normal direction to the surface of the conductor, and would exert
forces on the charged particles to make them move and redistribute. Eventually, an equilibrium
configuration would be achieved, such that there is no net force on any charged particles; hence,
no net electric field inside the conductor.
Figure 2.8: Cross-sections of conductors.
The left picture in Figure 2.8 shows the cross-section of a conductor. We can consider a Gaussian
surface that is very close to the surface of the conductor, shown as the red curve in the picture.
There is no electric flux through this Gaussian surface, because, as we just argued, there is no net
electric field inside the conductor. By Gauss’s law, the net charge inside the Gaussian surface must
be zero. By shrinking the Gaussian surface to a smaller size inside the conductor, we can further
argue that not just the net charge is zero, but the charge density at every interior point is zero.
The right picture in Figure 2.8 shows a more complicated situation, a conductor with a cavity that
contains a positive charge. In this case, there are non-zero electric fields in the cavity. We choose
a Gaussian surface to be the union of two surfaces, shown as the red curves, that are very close to
the inner and outer surfaces of the conductor. Again, there is no electric flux through either of these
surfaces; hence, the excess charge in the conductor should be on the inner or outer surface
of the conductor.
2.3.2 Combining Gauss’s law with symmetry
Determining the electric field configuration for a given charge distribution is generally a complex
task. However, if the charge distribution exhibits a certain symmetry, Gauss’s law can be applied to
deduce the electric field configuration. This simplifies the calculation significantly, as the symmetry
reduces the complexity of the problem.
Spherical symmetry
Consider a point charge q located at the origin. This system possesses spatial O(3) symmetry,
meaning that it is invariant under the action of the group elements which are 3-by-3 real matrices
R that satisfy RT R = I. The symmetry action is ⃗r′ = R⃗r. Among these elements, there are several
special symmetry operations that transform the system in a specific way.
(1) SO(3) rotational symmetry which satisfies det R = 1. We can decompose any three-dimensional rotation into a sequence of rotations around different axes. This allows us to represent
any rotation by three angles, known as Euler angles. The rotational symmetry along a given axis
forms a group SO(2).
(2) Mirror reflection symmetry involves the reflection of the entire system about a given plane.
For example, consider a reflection through the xy-plane. The position vector of an arbitrary point
P (x, y, z) changes to P ′ (x, y, −z) under reflection.
(3) Parity symmetry maps a point P (x, y, z) to its opposite point P ′ (−x, −y, −z) with respect
to the origin, i.e., ⃗r′ = −⃗r.
Now we ask what the transformation law of the electric field of the charged particle is under
the O(3) transformations. According to Coulomb’s law, the electric field of the point charge is
given by $\vec{E}(\vec{r}) = \frac{kq\,\vec{r}}{|\vec{r}|^3}$. Now suppose we apply an R ∈ O(3) transformation to the coordinate system,
such that $\vec{r}$ is transformed to $\vec{r}\,' = R\vec{r}$. We can simply substitute $\vec{r}\,' = R\vec{r}$ in the expression for $\vec{E}(\vec{r})$
to obtain
\[
\vec{E}(\vec{r}) = \frac{kq\,\vec{r}}{|\vec{r}|^3} \ \to \ \vec{E}(R\vec{r}) = \vec{E}(\vec{r}\,') = \frac{kq\,\vec{r}\,'}{|\vec{r}\,'|^3} = \frac{kq\,R\vec{r}}{|R\vec{r}|^3} = \frac{kq\,R\vec{r}}{|\vec{r}|^3} = R\vec{E}(\vec{r}) , \tag{2.3.1}
\]
where we used the condition $R^T R = I$ in the denominator. The result $\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$ implies that
the electric field transforms in the same way as the position vector $\vec{r}$ under the O(3) symmetry.
We can use the condition $\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$ and Gauss’s law to determine the electric field of
a point charge. Let’s assume that $\vec{r}$ is along the z axis. In this case, a rotation $R_z(\theta)$ about the
z axis leaves $\vec{r}$ fixed. So, we have $\vec{E}(\vec{r}) = \vec{E}(R_z(\theta)\vec{r}) = R_z(\theta)\vec{E}(\vec{r})$. This implies that $\vec{E}(\vec{r})$ must
also be along the z direction. Therefore, we have determined that the direction of the electric
field of a point charge is always radial to the point charge. We can further determine that the
magnitude of the electric field is spherically symmetric by taking the norm of the vector equation
$\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$. This gives $|\vec{E}(R\vec{r})| = |R\vec{E}(\vec{r})| = |\vec{E}(\vec{r})|$. Hence, both the direction and magnitude
of the electric field of a point charge are spherically symmetric.
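The transformation law $\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$ of (2.3.1) can be spot-checked for the three kinds of O(3) elements listed above. This is a minimal sketch, assuming kq = 1, an arbitrary sample position, and an arbitrary rotation angle; the helpers `apply` and `rotation_about_z` are illustrative names.

```python
import math

def coulomb_field(r, kq=1.0):
    """Field kq * r / |r|^3 of a point charge at the origin."""
    norm_cubed = sum(c * c for c in r) ** 1.5
    return [kq * c / norm_cubed for c in r]

def apply(matrix, vector):
    """Matrix-vector product for a 3x3 matrix stored as nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rotation_about_z(angle):
    """Rotation by the given angle about the z axis (an SO(3) element)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

transformations = {
    "rotation about z": rotation_about_z(0.8),
    "reflection through xy-plane": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]],
    "parity": [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]],
}

r = [0.5, -1.2, 0.7]   # arbitrary sample position
mismatches = {}
for name, R in transformations.items():
    lhs = coulomb_field(apply(R, r))     # E(R r)
    rhs = apply(R, coulomb_field(r))     # R E(r)
    mismatches[name] = max(abs(a - b) for a, b in zip(lhs, rhs))
    print(f"{name}: max |E(Rr) - R E(r)| = {mismatches[name]:.1e}")
```

Each mismatch should be at the level of floating-point rounding, for the reflection and parity elements (det R = −1) as well as for the rotation.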
Now consider a charge distribution described by a density function ρ(⃗x). The electric field at
point ⃗r is given by Coulomb’s law as
\[
\vec{E}(\vec{r}) = \int d^3\vec{x} \ \frac{k \rho(\vec{x}) (\vec{r} - \vec{x})}{|\vec{r} - \vec{x}|^3} , \tag{2.3.2}
\]
where the integral is taken over the volume of the charge distribution. If the charge distribution is
spherically symmetric, meaning that its density function satisfies
\[
\rho(R\vec{x}) = \rho(\vec{x}) , \qquad \forall R \in O(3) , \tag{2.3.3}
\]
then the electric field will transform under rotation R to
\[
\vec{E}(\vec{r}\,') = \vec{E}(R\vec{r}) = \int d^3\vec{x} \ \frac{k \rho(\vec{x}) (R\vec{r} - \vec{x})}{|R\vec{r} - \vec{x}|^3} = R \int d^3\vec{x} \ \frac{k \rho(\vec{x}) (\vec{r} - R^{-1}\vec{x})}{|\vec{r} - R^{-1}\vec{x}|^3} = R\vec{E}(\vec{r}) . \tag{2.3.4}
\]
In the last step, we used the condition $\rho(R\vec{x}) = \rho(\vec{x})$ and $d^3(R^{-1}\vec{x}) = d^3\vec{x}$. The transformation
law tells us that the electric field has the same symmetry as that of the point charge at the origin.
This means that the magnitude of the electric field depends only on the distance $|\vec{r}|$ from the origin and its
direction is along the radial direction r̂. Hence, we can express the electric field at any point as
$\vec{E}(\vec{r}) = E(|\vec{r}|)\hat{r}$, where $E(|\vec{r}|)$ is a scalar function that only depends on the distance $|\vec{r}|$.
Figure 2.9: Gaussian surface for a spherically symmetric charge distribution.
By choosing a Gaussian surface to be the sphere that encloses the charge distribution, we can
apply Gauss’s law to determine the scalar function E(|⃗r|). And finally we have the following result:
If a charge distribution has spherical symmetry, then a charged particle outside the distribution
will be attracted or repelled by the distribution as if all the charge were concentrated at the center
of the distribution.
Cylindrical symmetry
Now, let’s consider an infinite cylinder with cylindrical symmetry along the z direction and a charge
distribution that also has cylindrical symmetry. We can utilize the symmetry of the system and
Gauss’s law to determine the electric field generated by this charge distribution.
This system exhibits lower rotational symmetry compared to the point charge at the origin.
Specifically, the rotational symmetry is only preserved about the z axis, reducing from SO(3) to
SO(2). However, the reflection symmetry is preserved if the mirror plane is perpendicular to the z
axis or contains the z axis. Additionally, the system also exhibits the parity symmetry that maps
$\vec{r}$ to $-\vec{r}$. Finally, the system exhibits a new translational symmetry along the z direction compared
to the point charge case.
To determine the transformation rules for the electric field under a symmetry transformation in
a cylindrical system, we can use Coulomb’s law and apply the same argument used for spherically
symmetric systems. Let g be a symmetry transformation of the cylindrical system. Since the
charge density ρ is symmetric under g, we have $\rho(g\vec{r}) = \rho(\vec{r})$. Moreover, since the differential
volume element $d^3\vec{r}$ is invariant under g, we have $d^3(g\vec{r}) = d^3\vec{r}$. Therefore, using Coulomb’s law,
we find that the electric field transforms as
\[
\vec{E}(g\vec{r}) = g\vec{E}(\vec{r}) . \tag{2.3.5}
\]
This means that the electric field from the system also exhibits cylindrical symmetry, and we can
use symmetry operations to determine its direction at any given point. For example, by choosing
g to be a translational symmetry along the z-axis, we find that the electric field is invariant under
translation. By applying some simple symmetry operations, we can prove that the electric field at
a position (x, y, z) should be along the direction of (x, y, 0). Similarly, we can use the symmetry
argument to show that the magnitude of the electric field is only a function of $r = \sqrt{x^2 + y^2}$.
Figure 2.10: Gaussian surface for a cylindrically symmetric charge distribution.
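We can select a cylinder of radius r and height h, coaxial with the z axis, as our Gaussian surface. Applying Gauss’s law to this surface, with $4\pi k = 1/\epsilon_0$, we
obtain:
\[
\Phi = \oint \vec{E} \cdot d\vec{S} = E(r) \cdot 2\pi r h = \frac{Q_{\rm enc}}{\epsilon_0} = \frac{\lambda h}{\epsilon_0} , \tag{2.3.6}
\]
where $Q_{\rm enc}$ is the charge enclosed by the surface, and λ is the linear charge density along the
z-direction. Solving for E(r), we obtain:
\[
E(r) = \frac{\lambda}{2\pi \epsilon_0 r} . \tag{2.3.7}
\]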
We see that a charged particle will be attracted or repelled by the cylindrically symmetric charge
distribution as if all the charge were concentrated at the z axis.
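The result (2.3.7) can be compared with a direct Coulomb-law sum over a long but finite straight line charge. The sketch below assumes SI units with $k = 1/(4\pi\epsilon_0)$, an illustrative linear density and distance, and a line that extends 1000 times the distance r in each direction; all numerical values and the name `line_charge_field` are assumptions made for the example.

```python
import math

EPSILON_0 = 8.8541878128e-12            # vacuum permittivity in F/m
K = 1.0 / (4.0 * math.pi * EPSILON_0)   # Coulomb constant k

def line_charge_field(lam, r, half_length, n=200_000):
    """Radial field at distance r from the midpoint of a straight line charge
    of linear density lam and total length 2 * half_length, summing the radial
    Coulomb contributions k * lam * dz * r / (r^2 + z^2)^(3/2)."""
    dz = 2.0 * half_length / n
    total = 0.0
    for i in range(n):
        z = -half_length + (i + 0.5) * dz
        total += r / (r * r + z * z) ** 1.5
    return K * lam * total * dz

lam = 1e-9    # linear charge density in C/m (illustrative value)
r = 0.05      # distance from the line in m (illustrative value)
numeric = line_charge_field(lam, r, half_length=1000 * r)
gauss = lam / (2 * math.pi * EPSILON_0 * r)   # equation (2.3.7)
relative_error = abs(numeric - gauss) / gauss
print(numeric, gauss, relative_error)
```

The small remaining discrepancy comes mostly from truncating the line to a finite length, and it shrinks as `half_length` grows relative to r.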
Planar symmetry
Consider a thin, infinite, nonconducting sheet or plane with a uniform (positive) surface charge
density σ. This system has translational symmetry along any direction in the xy-plane, and is
also invariant under any mirror reflection symmetry for the reflection plane perpendicular to the
charged plane. Using these symmetries and the transformation rule
\[
\vec{E}(g\vec{r}) = g\vec{E}(\vec{r}) , \tag{2.3.8}
\]
we can show that the electric field is always perpendicular to the charged sheet at any point in
space. It has the form $\vec{E}(\vec{r}) = E(z)\hat{z}$, where E(z) is a scalar function that depends only on the z
coordinate of the position $\vec{r}$ and satisfies E(−z) = −E(z).
Figure 2.11: Gaussian surface for a charged plane.
A useful Gaussian surface is a closed cylinder with end caps of area A (see Fig. 2.11). Since $\vec{E}$
is along the ẑ direction, the electric flux can be calculated easily as
\[
\Phi = \oint \vec{E} \cdot d\vec{S} = E(z) A - E(-z) A = 2E(z) A = \frac{\sigma A}{\epsilon_0} .
\]
The solution for the electric field is
\[
\vec{E}(\vec{r}) = \frac{\sigma}{2\epsilon_0} \, \mathrm{sgn}(z) \, \hat{z} . \tag{2.3.9}
\]
This result agrees with the previous calculation Eq. (2.1.10) using directly Coulomb’s law. However,
the derivation here using symmetry and Gauss’s law is much simpler.
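One way to see how the infinite-plane result (2.3.9) emerges is to compute the on-axis field of a uniformly charged disk from Coulomb’s law and let the disk radius grow. The sketch below works in units of $\sigma/(2\epsilon_0)$ and sums thin rings of radius ρ; the function name `disk_field_ratio`, the height z = 1, and the list of radii are illustrative assumptions.

```python
import math

def disk_field_ratio(radius, z, n=100_000):
    """On-axis field of a uniformly charged disk at height z, in units of
    sigma / (2 eps0). A ring of radius rho and width d rho contributes
    rho * z * d rho / (rho^2 + z^2)^(3/2)."""
    drho = radius / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * drho
        total += rho * z / (rho * rho + z * z) ** 1.5
    return total * drho

z = 1.0
radii = [1.0, 10.0, 100.0, 1000.0]
ratios = [disk_field_ratio(R, z) for R in radii]
for R, ratio in zip(radii, ratios):
    print(f"R = {R:7.1f}   E / (sigma / 2 eps0) = {ratio:.6f}")
```

As the radius becomes large compared with z, the ratio approaches 1, which is the magnitude in (2.3.9). For a finite disk the field still depends on z, so the z-independence is a feature of the infinite plane only.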
Figure 2.12: (a) An infinite conducting charged plate. (b) Two infinite conducting plates with
opposite charges.
Using the electric field of a charged plane, we can also derive the charge and electric field
distribution for an infinite conducting charged plate (see Fig. 2.12a). As we have previously discussed,
the charge on the conducting plate must be located on its surface and must be uniformly distributed.
If it were not uniformly distributed, the electric field parallel to the surface would cause the charge
to move. Therefore, the surface charge density must be constant. If the charge density on each
surface of the conducting plate is $\sigma_i$, the electric field generated will have a magnitude of $\frac{\sigma_i}{2\epsilon_0}$. To
ensure that the total electric field inside the plate is zero, we must have $\sigma_1 = \sigma_2 = \sigma$, meaning that
the surface charge density must be uniform across the entire surface of the plate. Therefore, the
total electric field outside the plate is $2 \times \frac{\sigma}{2\epsilon_0} = \frac{\sigma}{\epsilon_0}$.
Let’s consider another example involving two infinite conducting charged plates with opposite
charges q and −q as shown in Fig. 2.12b. In this case, the charges should be uniformly distributed
on the four surfaces of the plates. The question is, how much charge is on each surface? To solve
this, we start by examining the electric field inside the left plate due to the right plate. It has
magnitude $\sigma/(2\epsilon_0)$ and points to the right, where σ is the magnitude of the total surface charge density of the right plate. For the total
electric field inside the left plate to be zero, all of the charge on the left plate should be distributed
on its right surface. This ensures that the electric field inside the left plate cancels out. Similarly,
all the charge of the right plate should be on its left surface. This completes the charge distribution
for the two plates with opposite charges.
2.3.3 Electric potential
Definition of electric potential
Given a force, we can determine the potential energy that is able to produce that force. In general,
the relationship between force and potential energy is given by:
$$\vec F = -\nabla U. \tag{2.3.10}$$
Since the Coulomb force is proportional to the test charge, we usually define the electric potential
V in terms of electric potential energy U as:
$$U = qV. \tag{2.3.11}$$
The electric potential is a scalar quantity measured in volts, denoted by the symbol V. One volt
is defined as one joule of energy per coulomb of charge. It is important to note that the electric
potential depends only on the charge distribution and not on the test charge itself.
From the Coulomb force $\vec F = q\vec E$, we can derive the relation between the electric potential
and the electric field. Consider a test charge $q_0$ in an electric field $\vec E$. When the test charge is
moved from point A to point B along a path, the work done by the electric force is given by
$$W = \int_A^B \vec F\cdot d\vec l = \int_A^B q_0\vec E\cdot d\vec l = q_0\int_A^B \vec E\cdot d\vec l. \tag{2.3.12}$$
The work done is also equal to the decrease in electric potential energy of the test charge, that is,
$W = -\Delta U$. Here we have
$$\Delta U = q_0\Delta V = q_0(V_B - V_A). \tag{2.3.13}$$
This shows that
$$V_B - V_A = -\int_A^B \vec E\cdot d\vec l. \tag{2.3.14}$$
If we further choose VA = 0 for some reference point A, then the electric potential can be calculated
as the line integral of electric field as
$$V(\vec r) = -\int_A^{\vec r} \vec E\cdot d\vec l. \tag{2.3.15}$$
Conversely, the electric field can be obtained from the electric potential using the relation
$$\vec E = -\nabla V, \tag{2.3.16}$$
where ∇ is the gradient operator.
The definition of electric potential above does not depend on the path chosen between two
points. This is known as the path-independence of electric potential, which can be proven using
Stokes’ theorem. The line integral of the electric field along a closed path is zero, as shown by
$$\oint_{C=\partial S} \vec E\cdot d\vec l = \int_S (\nabla\times\vec E)\cdot d\vec S = 0, \tag{2.3.17}$$
where $C = L' - L$ is the closed path formed by two paths $L$ and $L'$ joining the same pair of points,
and $S$ is some surface satisfying $C = \partial S$. The above equation is
valid only if the field is conservative, which is true for electrostatic fields as $\nabla\times\vec E = 0$. Therefore,
the electric potential is a well-defined scalar field.
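Path independence can be checked numerically. The following is a minimal sketch, assuming a point charge at the origin and two hypothetical paths (a straight segment and an elliptical arc) joining A = (1, 0) and B = (0, 2); the function names and the charge value are illustrative choices, not part of the notes.

```python
import math

K = 8.9875517923e9  # Coulomb constant k = 1/(4*pi*eps0), in N*m^2/C^2


def point_charge_field(q, x, y):
    """Field (Ex, Ey) at (x, y) of a point charge q placed at the origin."""
    r = math.hypot(x, y)
    return K * q * x / r**3, K * q * y / r**3


def line_integral(q, path, n=20000):
    """Approximate the line integral of E . dl along path(s), s in [0, 1].

    Each small chord contributes E(midpoint) . (chord vector).
    """
    total = 0.0
    for j in range(n):
        x0, y0 = path(j / n)
        x1, y1 = path((j + 1) / n)
        ex, ey = point_charge_field(q, 0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += ex * (x1 - x0) + ey * (y1 - y0)
    return total


def straight(s):
    """Straight segment from A = (1, 0) to B = (0, 2)."""
    return 1.0 - s, 2.0 * s


def arc(s):
    """Quarter of an ellipse from A = (1, 0) to B = (0, 2)."""
    a = 0.5 * math.pi * s
    return math.cos(a), 2.0 * math.sin(a)


q = 1e-9  # assumed test value of the source charge, in coulombs
w_straight = line_integral(q, straight)
w_arc = line_integral(q, arc)
# Eq. (2.3.14) predicts the integral equals V_A - V_B = kq (1/r_A - 1/r_B).
expected = K * q * (1.0 / 1.0 - 1.0 / 2.0)
```

Both `w_straight` and `w_arc` should agree with `expected` up to discretization error, which illustrates that only the endpoints matter.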
Calculating electric potential
The electric potential of a point charge q can be calculated using the definition of electric potential
as
$$V(\vec r) = -\int_{+\infty}^{\vec r} \vec E\cdot d\vec l = -\int_{+\infty}^{|\vec r|} \frac{kq}{r'^2}\,dr' = \frac{kq}{|\vec r|} = \frac{1}{4\pi\epsilon_0}\frac{q}{|\vec r|}. \tag{2.3.18}$$
This expression shows that the electric potential of a point charge varies inversely with the distance
from the charge, similar to the potential of Newton’s gravitation.
For a system of charged particles, the electric potential is the sum of the individual electric
potentials generated by each individual charge:
$$V(\vec r) = \sum_i V_i(\vec r) = \sum_i \frac{1}{4\pi\epsilon_0}\frac{q_i}{|\vec r - \vec r_i|}. \tag{2.3.19}$$
Here, qi represents the charge of the ith particle, and ⃗ri represents its position.
For a continuous charge distribution with density $\rho(\vec r')$, the electric potential at a point $\vec r$
can be calculated using a scalar integral. This integral is given by:
$$V(\vec r) = \frac{1}{4\pi\epsilon_0}\int \frac{dq'}{|\vec r - \vec r'|} = \frac{1}{4\pi\epsilon_0}\int \frac{\rho(\vec r')\,dV'}{|\vec r - \vec r'|}, \tag{2.3.20}$$
where dq ′ = ρ(⃗r′ )dV ′ is the infinitesimal charge at a point ⃗r′ . The integral takes into account the
contribution to the potential from each infinitesimal charge in the distribution.
Electric potential and field of an electric dipole
Now, let us calculate the electric potential of an electric dipole. The total electric potential at
any point due to a dipole is given by the summation of the electric potentials due to its individual
charges. If we have a dipole with charges +q and −q separated by a distance d, then the electric
potential at a point P at a distance r from the center of the dipole is given by:
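$$V(\vec r) = \frac{1}{4\pi\epsilon_0}\left(\frac{q}{r_{(+)}} - \frac{q}{r_{(-)}}\right) = \frac{q}{4\pi\epsilon_0}\,\frac{r_{(-)} - r_{(+)}}{r_{(+)}r_{(-)}}, \tag{2.3.21}$$
where $r_{(+)}$ and $r_{(-)}$ are the distances from P to the charges +q and −q, respectively.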
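When r is much larger than d, we can approximate $r_{(-)} - r_{(+)}$ by $d\cos\theta$, where θ is the angle
between $\vec r$ and $\vec d$. In the denominator, $r_{(+)}r_{(-)}$ can be approximated by $r^2$. This gives us the
simplified expression for the electric potential
$$V \approx \frac{1}{4\pi\epsilon_0}\frac{qd\cos\theta}{r^2} = \frac{1}{4\pi\epsilon_0}\frac{q\,\hat r\cdot\vec d}{r^2} = \frac{1}{4\pi\epsilon_0}\frac{\vec p\cdot\hat r}{r^2}, \tag{2.3.22}$$
where $\vec p = q\vec d$ is the electric dipole moment.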
It is important to note that the electric potential of an electric dipole varies inversely with the
square of the distance from the center of the dipole. From this electric potential, we can derive the
electric field of the dipole as
$$\vec E(\vec r) = -\nabla V(\vec r) = k\,\frac{3(\vec p\cdot\hat r)\hat r - \vec p}{r^3}. \tag{2.3.23}$$
This agrees with the previous result Eq. (2.1.7) derived directly from Coulomb’s law.
Figure 2.13: Calculating the electric potential of an electric dipole.
In general, calculating the electric potential of a charge distribution is much simpler than
calculating the electric field of the same charge distribution. The reason for this is that the former
involves a scalar integral, while the latter involves a vector integral.
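To get a feel for how good the far-field approximation (2.3.22) is, one can compare it with the exact two-charge potential (2.3.21). The sketch below assumes the dipole lies on the z axis with +q at z = +d/2 and −q at z = −d/2; the function names are hypothetical helpers introduced only for this illustration.

```python
import math

K = 8.9875517923e9  # Coulomb constant k = 1/(4*pi*eps0), in N*m^2/C^2


def dipole_potential_exact(q, d, r, theta):
    """Exact potential, Eq. (2.3.21), at distance r from the dipole center
    and polar angle theta measured from the dipole axis."""
    x, z = r * math.sin(theta), r * math.cos(theta)
    r_plus = math.hypot(x, z - d / 2)   # distance to +q
    r_minus = math.hypot(x, z + d / 2)  # distance to -q
    return K * q * (1.0 / r_plus - 1.0 / r_minus)


def dipole_potential_far(q, d, r, theta):
    """Far-field approximation, Eq. (2.3.22): V ~ k q d cos(theta) / r^2."""
    return K * q * d * math.cos(theta) / r**2


def relative_error(q, d, r, theta):
    """Relative deviation of the approximation from the exact potential."""
    exact = dipole_potential_exact(q, d, r, theta)
    return abs(dipole_potential_far(q, d, r, theta) - exact) / abs(exact)
```

For example, `relative_error(1e-9, 1e-3, 1e-2, 0.3)` (r = 10 d) is noticeably larger than `relative_error(1e-9, 1e-3, 1e-1, 0.3)` (r = 100 d), consistent with the approximation improving as r/d grows.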
Electric potentials and fields of charged ring and disk
Let us consider another example: the electric potentials and fields of a charged ring and a charged
disk.
For a charged ring with equation $x^2 + y^2 = R^2$ and line charge density λ, the electric potential
at the point $\vec r = (0, 0, z)$ is given by
$$V(\vec r) = \int_0^{2\pi} \frac{k}{\sqrt{z^2 + R^2}}\,\lambda R\,d\theta = \frac{kq}{\sqrt{z^2 + R^2}}, \tag{2.3.24}$$
where $q = 2\pi R\lambda$ is the total charge of the ring.
The electric field is then
$$\vec E(\vec r) = -\nabla V(\vec r) = \left(0, 0, \frac{kqz}{(z^2 + R^2)^{3/2}}\right), \tag{2.3.25}$$
which agrees with the previous result Eq. (2.1.8).
For a charged disk with radius R and area charge density σ, the electric potential at the point
$\vec r = (0, 0, z)$ is
$$V(\vec r) = \int \frac{k\,dq}{\sqrt{z^2 + r^2}} = \int_0^R \frac{k\sigma(2\pi r)\,dr}{\sqrt{z^2 + r^2}} = 2\pi k\sigma\left(\sqrt{z^2 + R^2} - |z|\right). \tag{2.3.26}$$
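The electric field is
$$\vec E(\vec r) = -\nabla V(\vec r) = \left(0, 0, 2\pi k\sigma\left[\mathrm{sgn}(z) - \frac{z}{\sqrt{z^2 + R^2}}\right]\right) = \left(0, 0, \frac{\sigma}{2\epsilon_0}\left[\mathrm{sgn}(z) - \frac{z}{\sqrt{z^2 + R^2}}\right]\right), \tag{2.3.27}$$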
which also agrees with the previous result Eq. (2.1.9). In the infinite plane limit R → ∞, the
electric field becomes
$$\vec E(\vec r) = \left(0, 0, \frac{\sigma}{2\epsilon_0}\,\mathrm{sgn}(z)\right), \tag{2.3.28}$$
which is the standard result for the electric field of an infinite uniformly charged plane.
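The disk results can be cross-checked numerically. The sketch below, under the assumption of illustrative parameter values chosen by the caller, integrates the ring contributions of Eq. (2.3.26) with a midpoint rule, differentiates the closed-form potential by finite differences to recover Eq. (2.3.27), and evaluates the plane limit (2.3.28). All function names are hypothetical.

```python
import math

K = 8.9875517923e9  # Coulomb constant k = 1/(4*pi*eps0), in N*m^2/C^2


def disk_potential(sigma, R, z):
    """Closed-form on-axis potential of a uniformly charged disk, Eq. (2.3.26)."""
    return 2 * math.pi * K * sigma * (math.sqrt(z * z + R * R) - abs(z))


def disk_potential_numeric(sigma, R, z, n=20000):
    """Same potential obtained by summing rings of radius r and width dr."""
    dr = R / n
    total = 0.0
    for j in range(n):
        r = (j + 0.5) * dr  # midpoint of the j-th ring
        total += K * sigma * 2 * math.pi * r * dr / math.sqrt(z * z + r * r)
    return total


def disk_field_z(sigma, R, z):
    """Closed-form on-axis field component E_z, Eq. (2.3.27), for z != 0."""
    sgn = 1.0 if z > 0 else -1.0
    return 2 * math.pi * K * sigma * (sgn - z / math.sqrt(z * z + R * R))


def field_from_potential(sigma, R, z, h=1e-6):
    """E_z = -dV/dz approximated by a central finite difference."""
    dv = disk_potential(sigma, R, z + h) - disk_potential(sigma, R, z - h)
    return -dv / (2 * h)


def plane_field(sigma):
    """Magnitude sigma / (2 eps0) = 2 pi k sigma of the infinite-plane field."""
    return 2 * math.pi * K * sigma
```

With, say, `sigma = 1e-6`, `R = 0.5`, `z = 0.2` (SI units), the numerical and closed-form values should agree closely, and `disk_field_z(sigma, 1e6, z)` should approach `plane_field(sigma)`.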
2.4 Capacitance, Current and Resistance
2.4.1 Capacitance
Two conductors, isolated electrically from each other and from their surroundings, form a capacitor.
When the capacitor is charged, the charges on the two conductors have the same magnitude
but opposite signs. No matter what their geometry, flat or not, these conductors are called plates.
The following figure shows a conventional arrangement, called a parallel-plate capacitor, consisting
of two parallel conducting plates of area A separated by a distance d.
In principle, capacitors can be of all kinds of geometries, and we will study some special setups
below. In this class, we always assume that no material medium is present in the region between
the plates; in general, the space between the plates of a capacitor is filled with a dielectric, i.e., an
insulating material such as glass or plastic.
When a capacitor is charged, its plates have charges of equal magnitudes but opposite signs ±q.
Then, we refer to the charge of a capacitor as q, the absolute value of these charges on the plates.
Because the plates are conductors, they are equipotential surfaces; all points on a plate are at the
same electric potential. Moreover, there is a potential difference between the two plates, whose
absolute value is denoted by V . For capacitors, q and V are proportional to each other:
q = CV,
where the proportionality constant C is called the capacitance of the capacitor. Its value depends
only on the geometry of the plates and not on q or V . The capacitance is a measure of how much
charge can be stored on the plates to produce a certain potential difference between them: the
greater the capacitance, the more charge is stored at a given potential difference. The SI unit of capacitance
is the farad (F), i.e., coulomb per volt (C/V).
We now look at some examples of capacitors with special geometries and calculate their capacitance. In general, the calculation of capacitance consists of the following steps:
1. Assume a virtual charge q on the plates.
2. Calculate the electric field between the two plates in terms of q (by using Coulomb’s law or
Gauss’ law).
3. Calculate the potential difference V between the plates by taking a line integral of the electric
field from the positive plate to the negative one.
4. Calculate C = q/V .
Example (Parallel-Plate Capacitor). Consider a parallel-plate capacitor, where each plate is of
area A and the two plates are separated by d. Suppose the two plates are large enough compared to
d. Find its capacitance.
Solution: The electric field between the two plates is given by (2.1.11) pointing from the positive
plate to the negative plate, where the charge density is σ = q/A. Then, the potential difference is
$$V = \frac{\sigma}{\epsilon_0}\,d = \frac{qd}{\epsilon_0 A}.$$
Hence, the capacitance is
$$C = \frac{q}{V} = \frac{\epsilon_0 A}{d}. \tag{2.4.1}$$
Example (Cylindrical Capacitor). Consider a cylindrical capacitor of length L formed by two
coaxial cylinders of inner radius a and outer radius b. Suppose L ≫ b so that we can neglect the
fringing of the electric field that occurs at the ends of the cylinders. Find its capacitance.
Solution: Using Gauss’ law, we find that the electric field between the two plates is given by
(2.3.7) pointing from the positive plate to the negative plate, where the charge density is λ = q/L.
Then, the potential difference is
$$V = \int_a^b \frac{q}{2\pi\epsilon_0 L r}\,dr = \frac{q}{2\pi\epsilon_0 L}\log\frac{b}{a}.$$
Hence, the capacitance is
$$C = \frac{q}{V} = \frac{2\pi\epsilon_0 L}{\log(b/a)}.$$
Figure 2.14: A cross-section of a long cylindrical capacitor, or a capacitor consisting of two concentric spherical shells.
Example (Spherical Capacitor). Consider a capacitor that consists of two concentric spherical
shells, of inner radius a and outer radius b. Find its capacitance.
Solution: Using Gauss’ law or Proposition 2.1.1, the electric field between the two plates is
$$E = \frac{q}{4\pi\epsilon_0 r^2},$$
pointing from the positive plate to the negative plate. Then, the potential difference is
$$V = \int_a^b \frac{q}{4\pi\epsilon_0 r^2}\,dr = \frac{q}{4\pi\epsilon_0}\left(\frac{1}{a} - \frac{1}{b}\right).$$
Hence, the capacitance is
$$C = \frac{q}{V} = \frac{4\pi\epsilon_0 ab}{b - a}.$$
We can assign a capacitance to a single isolated spherical conductor of radius a by assuming
that the other plate is a conducting sphere of infinite radius. Letting b → ∞ in the above equation, we get
the capacitance of a single isolated spherical conductor as
C = 4πϵ0 a.
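The four-step recipe for computing capacitance can be sketched numerically: assume a charge q, integrate the field between the plates to get V, and return q/V. The code below is an illustration under assumed geometry values (plate area, separation, radii); the helper names are hypothetical, and the field expressions are the ones used in the three examples above.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, in F/m


def potential_difference(field, a, b, n=20000):
    """Midpoint-rule line integral of the field magnitude from a to b."""
    h = (b - a) / n
    return sum(field(a + (j + 0.5) * h) for j in range(n)) * h


def capacitance(field_for_charge, a, b, q=1e-9):
    """Steps 1-4: assume charge q, integrate E to get V, return C = q / V."""
    V = potential_difference(lambda r: field_for_charge(q, r), a, b)
    return q / V


# Assumed example geometries (SI units).
A, d = 1e-2, 1e-3          # parallel plates: area and separation
L, a, b = 1.0, 1e-3, 3e-3  # cylinder length; inner and outer radii

c_plate = capacitance(lambda q, x: q / (EPS0 * A), 0.0, d)
c_cyl = capacitance(lambda q, r: q / (2 * math.pi * EPS0 * L * r), a, b)
c_sph = capacitance(lambda q, r: q / (4 * math.pi * EPS0 * r**2), a, b)

# Closed-form results from the three examples, for comparison.
c_plate_exact = EPS0 * A / d
c_cyl_exact = 2 * math.pi * EPS0 * L / math.log(b / a)
c_sph_exact = 4 * math.pi * EPS0 * a * b / (b - a)
```

The numerical values should match the closed forms to within the integration error, and the result does not depend on the assumed value of q, as expected from q = CV.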
2.4.2 Capacitors in Circuits
To charge a capacitor, we place it in an electric circuit with a battery. In the following figure, a
battery B, a switch S, an uncharged capacitor C, and interconnecting wires form a circuit. The
battery maintains a potential difference V between its terminals.
When the switch is closed, electrically connecting those wires, the circuit is complete and electrons
are driven through the wires by an electric field that the battery sets up in the wires. The field
drives electrons from the capacitor plate h to the positive terminal of the battery, so plate h will
become positively charged. The field drives just as many electrons from the negative terminal of
the battery to the capacitor plate l, so plate l will become negatively charged just as much as plate
h becomes positively charged. Initially, when the plates are uncharged, the potential difference
between them is zero. As the plates become oppositely charged, that potential difference increases
until it equals the potential difference V between the terminals of the battery. Then, plate h and
the positive terminal of the battery are at the same potential, and there is no longer an electric field
in the wire between them. Similarly, plate l and the negative terminal reach the same potential.
Thus, with the field zero, there is no further drive of electrons. The capacitor is then said to be
fully charged, with a potential difference V and charge q = CV .
When there is a combination of capacitors in a circuit, we can sometimes replace that combination with an equivalent capacitor—that is, a single capacitor that has the same capacitance as
the actual combination of capacitors.
The above figure shows an electric circuit in which three capacitors are connected in parallel to
battery B. Here, each capacitor has the same potential difference V , and the total charge q stored
on the capacitors is the sum of the charges stored on all the capacitors. Thus, capacitors connected
in parallel can be replaced with an equivalent capacitor that has the same total charge q and the
same potential difference V as the actual capacitors. In general, consider n capacitors connected
in parallel, each with capacitance $C_i$, $i = 1, \ldots, n$. Then, the total charge is $q = \sum_{i=1}^n C_i V$, which
gives the equivalent capacitance
$$C_{\rm eq} = \frac{q}{V} = \sum_{i=1}^n C_i. \tag{2.4.2}$$
The following figure shows three capacitors connected in series to battery B. Here, the potential
differences that exist across the capacitors in the series produce identical charges q on them. Thus,
capacitors that are connected in series can be replaced with an equivalent capacitor that has the
same charge q and the same total potential difference V as the actual series capacitors.
In general, consider n capacitors connected in series, each with capacitance $C_i$, $i = 1, \ldots, n$. Then,
the total potential difference is $V = \sum_{i=1}^n q/C_i$, and hence the equivalent capacitance satisfies
$$\frac{1}{C_{\rm eq}} = \frac{V}{q} = \sum_{i=1}^n \frac{1}{C_i}. \tag{2.4.3}$$
2.4.3 Energy Stored in an Electric Field
Work must be done by an external battery to charge a capacitor, at the expense of its stored
chemical energy. We visualize the work as being stored as electric potential energy in the electric
field between the plates. We now evaluate this energy. Suppose that, at a given instant, a charge q ′
has been transferred from one plate of a capacitor to the other. The potential difference V ′ between
the plates at that instant is equal to q ′ /C. Then, if an extra increment of charge dq ′ is transferred,
the increment of work required will be
$$dW = V'\,dq' = \frac{q'}{C}\,dq'.$$
The work required to bring the total capacitor charge up to a final value q is
$$W = \int_0^q \frac{q'}{C}\,dq' = \frac{q^2}{2C}.$$
Hence, the potential energy U stored in the capacitor is given by
$$U = \frac{q^2}{2C} = \frac{1}{2}CV^2. \tag{2.4.4}$$
Consider a parallel-plate capacitor, where each plate is of area A and the two plates are separated
by d. Neglecting fringing, the electric field has the same value at all points between the plates.
Thus, the energy density u, that is, the potential energy per unit volume between the plates, should
also be uniform. Using (2.4.4) and (2.4.1), we obtain that
$$u = \frac{CV^2}{2Ad} = \frac{1}{2}\epsilon_0\left(\frac{V}{d}\right)^2.$$
Furthermore, the electric field between the plates is E = V /d. Hence, we get the following formula
for the electric energy density
$$u = \frac{1}{2}\epsilon_0 E^2.$$
Although we derived this result for the special case of an electric field of a parallel plate capacitor,
it holds for any electric field. If an electric field $\vec E$ exists at any point in space, then at each point
$\vec r$, there is an electric potential energy with a density (amount per unit volume) given by
$$u(\vec r) = \frac{1}{2}\epsilon_0 |\vec E(\vec r)|^2.$$
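As a consistency check on the energy formulas, the sketch below adds up the increments dW = (q'/C) dq' numerically and compares the total with (2.4.4), and then recovers the same energy from the energy density times the volume between the plates. The plate area, separation, and voltage are assumed example values, and `charging_work` is a hypothetical helper.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, in F/m


def charging_work(C, q_final, n=10000):
    """Sum dW = (q'/C) dq' over n equal charge increments (midpoint rule)."""
    dq = q_final / n
    return sum(((j + 0.5) * dq / C) * dq for j in range(n))


# Assumed parallel-plate example (SI units).
A, d, V = 1e-2, 1e-3, 12.0
C = EPS0 * A / d               # Eq. (2.4.1)
q = C * V                      # final charge
W = charging_work(C, q)        # work done while charging
U = 0.5 * C * V**2             # Eq. (2.4.4)

E_field = V / d                # uniform field between the plates
u = 0.5 * EPS0 * E_field**2    # energy density
U_from_density = u * A * d     # density times volume between the plates
```

All three quantities `W`, `U`, and `U_from_density` should coincide up to rounding.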
2.4.4 Electric Current
Simply speaking, an electric current is a stream of moving charges. In this section, we focus on
steady currents of conduction electrons moving through metallic conductors such as copper wires.
Recall that an isolated conducting loop, regardless of whether it has an excess charge, is all at the
same potential. Hence, no net electric force acts on the conduction electrons and there is no current.
If we insert a battery in the loop, the conducting loop is no longer at a single potential. Electric
fields act inside the material making up the loop, exerting forces on the conduction electrons,
causing them to move and thus establishing a current. After a very short time, the electron flow
reaches a constant value and the current is in its steady state.
Take a section of a conductor, part of a conducting loop in which current has been established.
If charge dq passes through a cross-section of the conductor in time dt, then the current i through
that cross-section is defined as
$$i = \frac{dq}{dt}.$$
As discussed before, the SI unit for current is the ampere (A), an SI base unit; one ampere is one
coulomb per second (C/s). Note that current is a scalar because both charge and time are scalars. Yet, we often
represent a current with an arrow to indicate that charge is moving. More precisely, we will draw
the current arrows in the direction in which positively charged particles would be forced to move
through the loop by the electric field. Such positive charge carriers would move away from the
positive battery terminal and toward the negative terminal. However, in applications, the charge
carriers are often electrons and thus are negatively charged. The electric field forces them to move
in the direction opposite the current arrows, from the negative terminal to the positive terminal.
Then, the assumed motion of positive charge carriers in one direction has the same effect as the
actual motion of negative charge carriers in the opposite direction.
We remark that the arrows of currents do not mean vectors, so they do not satisfy vector
addition. Current arrows show only a direction (or sense) of flow along a conductor, not a direction
in space. Due to the conservation of charges, there is also a “conservation of currents”. For example,
the following picture shows a conductor with current i0 splitting at a junction into two branches:
Then, the magnitudes of the currents in the branches must add to yield the magnitude of the
current in the original conductor, i.e., i0 = i1 + i2 . In more general circuits, we have the following
Kirchhoff’s junction rule (or Kirchhoff’s current law).
Physics law 10 (Kirchhoff’s junction rule). The sum of the currents entering any junction must
be equal to the sum of the currents leaving that junction.
Sometimes, we want to take a localized view and study the flow of charge through a cross-section
of the conductor at a particular point. To describe this flow, we can use the current density $\vec J$,
which has the same direction as the velocity of the moving charges if they are positive and the
opposite direction if they are negative. The magnitude J is equal to the current per unit area
through a surface element at that point. Then, the total current through the surface is
$$i = \int \vec J\cdot d\vec A,$$
where $d\vec A$ is the area vector of the element. If the current is uniform across the surface and parallel
to $d\vec A$, then $\vec J$ is also uniform and parallel to $d\vec A$, and we have
$$i = JA \;\Leftrightarrow\; J = \frac{i}{A},$$
where A is the total area of the surface. The SI unit for current density is A/m².
2.4.5 Resistance and Resistivity
If we apply the same potential difference between the ends of different conductors, different currents result, depending on their resistances. We determine the resistance between any two points of a conductor by
applying a potential difference V between those points and measuring the current i that results.
The resistance R is then defined as
$$R = \frac{V}{i}.$$
Its value depends only on the material and geometry of the conductor and not on V or i. The SI
unit for resistance is ohm (Ω) or volt per ampere (V/A). A conductor whose function in a circuit
is to provide a specified resistance is called a resistor. Given a voltage difference V between the
two ends of the resistor, it carries a current i = V /R. On the other hand, a current i through the
resistor corresponds to a voltage difference V = iR.
We sometimes wish to take a localized view and focus on the electric field $\vec E$ at a point in a
resistive material. Then, instead of dealing with the current i through the resistor, we deal with
the current density $\vec J$ at the point in question. Instead of the resistance R of an object, we deal
with the resistivity ρ of the material:
$$\rho = \frac{E}{J}.$$
The SI unit of ρ is (V/m)/(A/m²) = (V/A) · m = Ω · m. We can rewrite the above equation in
a more general vector form
$$\vec J = \frac{\vec E}{\rho}. \tag{2.4.5}$$
People often speak of the conductivity σ of a material rather than its resistivity; it is
simply the reciprocal of the resistivity,
$$\sigma = \rho^{-1}.$$
The definition of σ allows us to write (2.4.5) in a slightly simpler form:
$$\vec J = \sigma\vec E. \tag{2.4.6}$$
Note that resistance is a property of an object, while resistivity is a property of a material. We
now derive the relation between the resistivity of a material such as copper and the resistance of
a length of wire made of that material. Let A be the cross-sectional area of the wire and L be its
length. Apply a potential difference V between its ends. Suppose the electric field and the current
density are constant for all points within the wire. Then, they have values E = V /L and J = i/A,
which give
$$\rho = \frac{E}{J} = \frac{V}{i}\,\frac{A}{L} = R\,\frac{A}{L} \;\Leftrightarrow\; R = \frac{\rho L}{A}.$$
This relation can be applied only to a homogeneous isotropic conductor of uniform cross-section.
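A short numerical example may help fix the orders of magnitude. The sketch below assumes a copper wire of length 1 m and diameter 1 mm, and uses a typical room-temperature resistivity of copper as an assumed input (it is not a value quoted in these notes). It also checks that the local relation J = E/ρ reproduces the same current as i = V/R.

```python
import math

# Assumed input: typical room-temperature resistivity of copper, in ohm*m.
RHO_COPPER = 1.68e-8


def wire_resistance(rho, length, area):
    """R = rho * L / A for a homogeneous wire of uniform cross-section."""
    return rho * length / area


length = 1.0                       # wire length in m (assumed)
area = math.pi * (0.5e-3) ** 2     # cross-section of a 1 mm diameter wire
R = wire_resistance(RHO_COPPER, length, area)

V = 0.1                            # applied potential difference in volts (assumed)
E = V / length                     # uniform field inside the wire
J = E / RHO_COPPER                 # local form, Eq. (2.4.5)
i_local = J * area                 # current from the current density
i_global = V / R                   # current from the definition of R
```

Here `R` comes out at roughly 0.02 Ω, and `i_local` equals `i_global`, as the derivation above requires.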
As discussed above, a resistor is a conductor with a specified resistance R. In particular, R is
unchanged no matter what the magnitude and direction (polarity) of the applied potential difference
are. In this case, we say that the resistor satisfies Ohm’s law, which asserts that
the current through a device is always proportional to the potential difference applied to it.
Although, for historical reasons, the term “law” is used here, this assertion is correct only in certain
situations. For example, the left device of Figure 2.15 obeys Ohm’s law, while the right device of
Figure 2.15—a semiconducting pn junction diode—does not. For the pn junction diode, current
can exist only when the polarity of V is positive and the applied potential difference is more than
about 1.5 V. When current does exist, the relation between i and V is also not linear.
Figure 2.15: Left plot: current i versus applied potential difference V when the device is a resistor.
Right plot: i versus V when the device is a semiconducting pn junction diode.
Remark. In middle school physics, it is often contended that Ohm’s law states that V = iR.
However, this is not true—this equation is just the defining equation for resistance, and it applies
to all conducting devices, whether they obey Ohm’s law or not. The essence of Ohm’s law is that
i is linear with respect to V , that is, R is independent of V .
When there is a steady current i across a resistor R, the amount of charge dq that moves
between its ends in the time interval dt is equal to dq = idt. This charge dq moves through a
potential decrease of magnitude V , and thus its electric potential energy decreases in magnitude
by
dU = V dq = iV dt.
The principle of conservation of energy tells that the decrease in electric potential energy must
be accompanied by a transfer of energy to some other form. The power P associated with that
transfer is
$$P = \frac{dU}{dt} = iV. \tag{2.4.7}$$
Note that the SI unit of P is V · A = (J/C) · (C/s) = J/s = W, as it should be. Using V = iR, we
can rewrite (2.4.7) as
$$P = i^2 R = \frac{V^2}{R},$$
which gives the rate of electrical energy dissipation due to resistance. Given a fixed current,
lowering the dissipation requires as small a resistance as possible. In
particular, when the phenomenon of superconductivity occurs, the resistivity of the material
drops to zero and there is no electrical energy dissipation. The study of the mechanisms for
superconductivity remains one of the central topics in modern theoretical physics, and the search
for high-temperature (or even room-temperature) superconductors is arguably one of the most
important challenges for physicists.
2.4.6 Electric Circuits
In this subsection, we study the physics of electric circuits which are closed loops consisting of
resistors, batteries, capacitors, and conducting wires between them (the resistances of wires are
negligible). We restrict our attention to circuits through which charge flows in one direction, called
direct current (DC) circuits.
Given an electric circuit, we want to know the voltage (or potential) at each point and the
current in each segment of conducting wire. In principle, no matter how complicated an electric
circuit is, it can always be solved by using Kirchhoff’s junction rule (or Kirchhoff’s current law) in
Physics law 10 and the following Kirchhoff’s loop rule (or Kirchhoff’s voltage law).
Physics law 11 (Kirchhoff’s loop rule). The algebraic sum of the changes in potential encountered
in a complete traversal of any loop of a circuit must be zero.
When we apply Kirchhoff’s loop rule, we start at a point, say o, and mentally walk clockwise
or counterclockwise around the circuit until we are back at o, keeping track of potential changes as
we move. During the walk, we deal with resistors, batteries, and capacitors in the following ways.
• For a move through resistance R in the direction of the current, the change in potential is
−iR, while in the opposite direction, it is +iR.
• A battery maintains its positive terminal at a higher electric potential than the negative
terminal. The potential increase from the − terminal to the + terminal is usually called
the emf (i.e., electromotive force) $\mathcal{E}$, since the battery supplies the energy for the motion of
charges via the work it does. For a move through a battery in the direction of the emf arrow
(i.e., from the − terminal to the + terminal), the change in potential is $+\mathcal{E}$; in the opposite direction,
it is $-\mathcal{E}$.
• The gap between the plates of a capacitor is an insulator, i.e., it has infinite resistance. Hence, no charge flows across the
gap. For a move across the capacitor with a potential difference V, the change in
potential is +V from the negative plate to the positive one; in the opposite direction, it is
−V .
In applications of Kirchhoff’s loop and junction rules, we need to solve a system of linear
equations involving voltages at various points and currents along some wire segments. In some
cases, the circuit structure can be greatly simplified if we use equivalent resistances for resistors in
series or in parallel.
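To illustrate how the two Kirchhoff rules lead to a linear system, consider a hypothetical circuit that is not taken from the notes: an ideal battery of emf in series with a resistor R1, followed by two resistors R2 and R3 in parallel. The unknowns are the currents i1, i2, i3 through the three resistors. The sketch below writes down one junction equation and two loop equations and solves them with a small Gaussian elimination routine (the helper names are illustrative).

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def branch_currents(emf, R1, R2, R3):
    """Currents (i1, i2, i3) for R1 in series with the pair R2 || R3."""
    A = [
        [1.0, -1.0, -1.0],  # junction rule: i1 - i2 - i3 = 0
        [R1, R2, 0.0],      # loop battery, R1, R2: emf - i1 R1 - i2 R2 = 0
        [0.0, R2, -R3],     # loop R2 then back through R3: -i2 R2 + i3 R3 = 0
    ]
    b = [0.0, emf, 0.0]
    return solve_linear(A, b)


i1, i2, i3 = branch_currents(12.0, 2.0, 3.0, 6.0)
```

For these assumed values the parallel pair has equivalent resistance 2 Ω, so the total resistance is 4 Ω and one expects i1 = 3 A, split as i2 = 2 A and i3 = 1 A.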
Resistors in series. The left figure shows an electric circuit in which three resistances are connected in series to an ideal battery with emf E .
All three resistances have identical currents i, while the sum of the potential differences across the
resistances is equal to E . Thus, resistances connected in series can be replaced with an equivalent
resistance Req (the right figure) that has the same current i and the same total potential difference E
as the actual resistances. In general, consider n resistors connected in series, each with resistance
$R_k$, $k = 1, \ldots, n$. Then, by the loop rule, we have
$$\mathcal{E} - \sum_{k=1}^n iR_k = 0 \;\Rightarrow\; R_{\rm eq} = \frac{\mathcal{E}}{i} = \sum_{k=1}^n R_k. \tag{2.4.8}$$
Resistors in parallel. The left figure shows three resistances connected in parallel to an ideal
battery with emf E .
All three resistances have the same potential difference across them, producing a current through
each. Thus, resistances connected in parallel can be replaced with an equivalent resistance Req
(the right figure) that has the same potential difference and the same total current as the actual
resistances. In general, consider n resistors connected in parallel, each with resistance $R_k$, $k = 1, \ldots, n$.
Then, applying the loop rule to each of the n loops consisting of the battery and a single
resistor, we get
$$\mathcal{E} = i_k R_k, \quad k = 1, \ldots, n.$$
By the junction rule, the total current through the equivalent resistance is
$$i = \sum_{k=1}^n i_k.$$
From the above two equations, we can derive the equivalent resistance as
$$R_{\rm eq} = \frac{\mathcal{E}}{i} \;\Rightarrow\; \frac{1}{R_{\rm eq}} = \sum_{k=1}^n \frac{1}{R_k}. \tag{2.4.9}$$
You can compare (2.4.8) and (2.4.9) for resistance with (2.4.2) and (2.4.3) for capacitance.
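The comparison can be made concrete in a few lines of code: the same two operations (a direct sum and a reciprocal sum) appear in all four formulas, with the roles of series and parallel swapped between resistors and capacitors. The function names below are hypothetical, chosen for this sketch.

```python
def sum_direct(values):
    """x_eq = sum of x_k."""
    return sum(values)


def sum_reciprocal(values):
    """1 / x_eq = sum of 1 / x_k."""
    return 1.0 / sum(1.0 / v for v in values)


def series_resistance(resistances):
    return sum_direct(resistances)        # Eq. (2.4.8)


def parallel_resistance(resistances):
    return sum_reciprocal(resistances)    # Eq. (2.4.9)


def parallel_capacitance(capacitances):
    return sum_direct(capacitances)       # Eq. (2.4.2)


def series_capacitance(capacitances):
    return sum_reciprocal(capacitances)   # Eq. (2.4.3)
```

For instance, resistors of 2 Ω, 3 Ω and 6 Ω give 11 Ω in series and 1 Ω in parallel, while capacitors of 2 μF, 3 μF and 6 μF give 11 μF in parallel and 1 μF in series.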
2.4.7 RC Circuits
In this subsection, we consider the following RC series circuit consisting of a capacitor C, an ideal
battery of emf E , and a resistance R:
When switch S is closed on a, the capacitor is charged through the resistor. When the switch is
afterward closed on b, the capacitor discharges through the resistor.
We know that as soon as the circuit is complete, charge begins to flow (current exists) between
a capacitor plate and a battery terminal on each side of the capacitor until the potential difference
between the two capacitor plates equals E across the battery. Then, we say that the system reaches
equilibrium, and the equilibrium charge on the fully charged capacitor is equal to CE . Here, we
are interested in the charging process, which is a dynamic process. In particular, we want to know
how the charge q(t) on the capacitor plates, the potential difference V (t) across the capacitor, and
the current i(t) in the circuit vary with time during the charging process.
Using the loop rule, we find that
$$\mathcal{E} - i(t)R - \frac{q(t)}{C} = 0.$$
Since i = dq/dt, we can rewrite the above equation as
$$R\frac{dq}{dt} + \frac{q}{C} = \mathcal{E},$$
with initial condition q(0) = 0. It is easy to solve this equation and get
$$q(t) = C\mathcal{E}\left(1 - e^{-t/RC}\right).$$
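One way to obtain this solution is separation of variables. The steps below are a sketch that uses only the differential equation and the initial condition q(0) = 0 stated above; the primed symbols are dummy integration variables.

```latex
\begin{aligned}
R\frac{dq}{dt} &= \mathcal{E} - \frac{q}{C}
\;\Longrightarrow\; \frac{dq}{C\mathcal{E} - q} = \frac{dt}{RC}, \\
\int_0^{q(t)} \frac{dq'}{C\mathcal{E} - q'} &= \int_0^t \frac{dt'}{RC}
\;\Longrightarrow\; -\ln\frac{C\mathcal{E} - q(t)}{C\mathcal{E}} = \frac{t}{RC}, \\
q(t) &= C\mathcal{E}\left(1 - e^{-t/RC}\right).
\end{aligned}
```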
Here, the product RC is called the capacitive time constant of the circuit. From q(t), we immediately
obtain that
$$i(t) = \frac{dq(t)}{dt} = \frac{\mathcal{E}}{R}e^{-t/RC}, \qquad V(t) = \frac{q(t)}{C} = \mathcal{E}\left(1 - e^{-t/RC}\right).$$
As expected, as t → ∞, $V(t) \to \mathcal{E}$, $i(t) \to 0$, and $q(t) \to C\mathcal{E}$.
Assume now that the capacitor has been fully charged. At a new time t = 0, switch S is thrown
from a to b so that the capacitor can discharge through resistance R. How does the discharging
process behave? As above, we obtain the following differential equation for q(t):
$$R\frac{dq}{dt} + \frac{q}{C} = 0,$$
with initial condition $q(0) = q_0 = C\mathcal{E}$. The solution to this equation is
$$q(t) = q_0 e^{-t/RC}.$$
Hence, q decreases exponentially to zero. The current is then given by
$$i(t) = \frac{dq(t)}{dt} = -\frac{q_0}{RC}e^{-t/RC},$$
i.e., the current also decreases exponentially to zero.
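The charging and discharging laws can be checked by integrating the circuit equation directly. The sketch below uses a simple forward Euler scheme (an illustrative numerical method, not one discussed in the notes) with assumed component values, and compares the result with the closed-form solutions above.

```python
import math


def charge_closed_form(t, R, C, emf):
    """Charging solution q(t) = C * emf * (1 - exp(-t / RC))."""
    return C * emf * (1.0 - math.exp(-t / (R * C)))


def discharge_closed_form(t, R, C, q0):
    """Discharging solution q(t) = q0 * exp(-t / RC)."""
    return q0 * math.exp(-t / (R * C))


def integrate_rc(t_end, R, C, emf, q_start, steps=20000):
    """Forward Euler integration of R dq/dt + q/C = emf from q(0) = q_start.

    Use emf = 0 for the discharging circuit.
    """
    dt = t_end / steps
    q = q_start
    for _ in range(steps):
        q += dt * (emf - q / C) / R
    return q


# Assumed example values: R = 1 kOhm, C = 1 uF, emf = 5 V, so RC = 1 ms.
R, C, emf = 1e3, 1e-6, 5.0
tau = R * C
q_charge_num = integrate_rc(2 * tau, R, C, emf, 0.0)
q_charge_exact = charge_closed_form(2 * tau, R, C, emf)
q_discharge_num = integrate_rc(2 * tau, R, C, 0.0, C * emf)
q_discharge_exact = discharge_closed_form(2 * tau, R, C, C * emf)
```

After one time constant the charging capacitor holds a fraction 1 − 1/e (about 63%) of its final charge C·emf, which `charge_closed_form(tau, R, C, emf)` reproduces.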
2.5 Magnetic Fields, Magnetic Fields Due to Currents
Magnetism is a fundamental force of nature that is observed in certain materials, notably iron,
cobalt, and nickel, as well as some alloys and compounds. Magnets produce a magnetic field, which
is a region of space in which magnetic forces are exerted on other magnets, magnetic materials, or
charged particles.
Magnetism plays a critical role in both our natural world and our daily lives. It is responsible
for the Earth’s magnetic field, which protects our planet from the harmful effects of solar wind
and cosmic rays. In addition, magnetism has a profound impact on our technology, from the small
magnets used in hard drives and credit card strips to the giant magnets used in medical imaging
machines and particle accelerators. Understanding magnetism is therefore essential for both advancing our understanding of the natural world and developing new technologies that improve our
lives.
2.5.1 Magnetic fields
An important concept in magnetism is the source of magnetic field. While electric charges are
the source of electric fields, the existence of magnetic charges, or magnetic monopoles, has not
been experimentally observed. Instead, a magnet with both north and south poles can produce a
magnetic field. When two magnets are brought close to each other, the north pole of one magnet
attracts the south pole of the other, while the two north poles (or the two south poles) repel
each other. However, it is not possible to isolate a single north pole or south pole of a magnet:
cutting a magnet in two simply produces two smaller magnets, each with both poles.
It is crucial to understand that a magnetic field can also be generated by the motion of electric
charges, as observed in electric currents. This principle gave rise to the development of electromagnets, which will be covered in detail in subsequent sections. When an electric current flows through
a wire coil, a magnetic field is produced in the surrounding space. In this sense, moving electric
charges play the role for magnetic fields that static charges play for electric fields: they act as
the source of the field.
Definition of magnetic field
Electric fields are defined as the Coulomb force acting on a test electric charge divided by the value
of its electric charge, $\vec{E} = \vec{F}_E/q$. The concept of electric fields can be extended to magnetic fields
by defining them using the magnetic force. Magnetic fields do not affect static electric charges, but
they do exert a force on moving charges known as the Lorentz force. This force is perpendicular to
the direction of the velocity of the charged particle and the magnetic field it is moving through. The
magnetic force can be described using the cross product of the charge’s velocity and the magnetic
field, and can be written as
\[ \vec{F}_B = q\vec{v} \times \vec{B}, \tag{2.5.1} \]
where q is the electric charge of the charged particle and ⃗v is its velocity. This allows us to define
the magnetic field as a vector that satisfies this equation. Therefore, the magnetic field can be
thought of as a quantity that measures the force exerted on a moving charge in a magnetic field.
By definition of the cross product, the magnitude of the Lorentz force is given by
\[ |\vec{F}_B| = |q|\,|\vec{v}|\,|\vec{B}| \sin\theta, \tag{2.5.2} \]
where θ is the angle between the velocity and the magnetic field. As the magnetic force is always
perpendicular to the velocity of the particle, it does not do any work on the particle. It only changes
the direction of the velocity and does not alter the magnitude of the velocity. This means that a
magnetic field can only change the trajectory of a charged particle and not its speed.
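The statement that the magnetic force does no work can be checked numerically. The sketch below evaluates Eq. (2.5.1) with a hand-written cross product and confirms that the force is perpendicular to both the velocity and the field. The helper names (`cross`, `dot`, `magnetic_force`) and the numerical values are assumptions made for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def magnetic_force(q, v, B):
    """Magnetic force F_B = q v x B, Eq. (2.5.1)."""
    return tuple(q * c for c in cross(v, B))

# Assumed illustrative values: an electron moving in the xy-plane, field along z.
q = -1.602176634e-19          # charge in C
v = (2.0e5, 1.0e5, 0.0)       # velocity in m/s
B = (0.0, 0.0, 0.5)           # magnetic field in T

F = magnetic_force(q, v, B)
power = dot(F, v)             # rate of work done by the magnetic force

print("F =", F, "N")
print("F . v =", power, "W (zero up to rounding)")
```

Because F · v vanishes, the kinetic energy of the particle stays constant and only the direction of its velocity changes.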
The unit of the magnetic field is the Tesla (T), which is defined as the magnetic field that exerts
a force of one Newton on a charge of one Coulomb moving perpendicularly to the magnetic field
with a velocity of one meter per second: $1\,\mathrm{T} = 1\,\mathrm{N \cdot s/(C \cdot m)}$. One Tesla is equal to $10^4$ Gauss (G), which is
commonly used as another unit of magnetic field strength. For example, the Earth’s magnetic field
has a strength of about 0.5 Gauss. The choice of unit depends on the context and magnitude of
the magnetic field being measured. In everyday applications, milliTesla (mT) is often used, while
in scientific research and high-tech industries, the Tesla is the standard unit.
Magnetic field line
Magnetic field lines are a useful tool for visualizing the direction and strength of a magnetic field.
The field lines are drawn in such a way that the tangent to the line at any point gives the direction
of the magnetic field at that point. The strength of the field is proportional to the density of the
field lines. Magnetic field lines always form closed loops: outside a magnet they run from the
north pole to the south pole, and inside the magnet they continue from the south pole back to
the north pole. In addition, magnetic field lines never intersect, because the field has a unique
direction at every point. The closer the field lines are, the stronger
the magnetic field is at that point.
Figure 2.16
Michael Faraday, a renowned English physicist and chemist, revolutionized our understanding
of magnetic fields in the mid-19th century by introducing the concept of magnetic field lines.
Through a series of groundbreaking experiments on electromagnetism, he discovered that magnetic
fields possess direction and shape, leading him to propose the idea of visualizing magnetic fields with
lines. Faraday’s seminal work formed the basis of modern electromagnetic theory and transformed
the study of magnetic fields from a vague notion to a concrete and measurable phenomenon. Today,
fields are a fundamental concept in quantum field theory and are considered as real physical degrees
of freedom rather than merely conceptual lines.
The absence of magnetic monopoles is a fundamental property of magnetic fields, which is
visually represented through closed magnetic field line loops that do not have any endpoints or
sources/sinks. Unlike electric fields, where positive and negative charges act as sources and sinks of
electric field lines, the magnetic field is divergenceless, meaning that its divergence is always zero.
Using vector calculus, this is expressed as $\nabla \cdot \vec{B} = 0$, which is different from the electric field with
$\nabla \cdot \vec{E} = \rho/\epsilon_0$.
2.5.2 Magnetic forces
When a charged particle is in motion and exposed to both electric and magnetic fields, the net
force acting on it is given by the equation
\[ \vec{F} = q\vec{E} + q\vec{v} \times \vec{B}. \tag{2.5.3} \]
This is the fundamental formula for forces of electric and magnetic fields acting on a charged
particle.
The Hall effect
We can consider the field configuration in which the electric and magnetic fields are perpendicular
to each other, and the charged particle is initially moving with a velocity perpendicular to both
fields. In this scenario, there is a possibility that the total force acting on the particle is zero, and
the particle will continue moving with a constant velocity (see Figure 2.17). The condition for this
equilibrium is given by
\[ q\vec{E} + q\vec{v} \times \vec{B} = 0, \tag{2.5.4} \]
from which we find the speed of the particle should be
\[ v = \frac{E}{B}, \tag{2.5.5} \]
where E and B are the magnitudes of the electric and magnetic fields, respectively.
The Hall effect is a phenomenon that occurs when two conductor plates are exposed to a
magnetic field, as shown in Figure 2.17. Electrons move between the plates with an initial velocity,
$v_d$, in the absence of an electric field. As time goes on, one plate accumulates more electrons than
the other, as illustrated in Figure 2.17(a), resulting in a potential difference between the two plates
and an induced electric field. By using the equilibrium condition derived earlier, we can determine
the electric potential difference between the two plates as
\[ V = Ed = v_d B d, \tag{2.5.6} \]
where E is the electric field induced by the accumulated electrons, $v_d$ is the drift speed of the
electrons, and d is the distance between the two plates. Meanwhile, the current is given by
$i = JA = n e v_d A$, where n is the electron density and A is the cross-sectional area of the electron
current. Therefore, the magnetic field can be expressed as
\[ B = \frac{V}{v_d d} = \frac{neVA}{id}. \tag{2.5.7} \]
If all the physical quantities on the right-hand side are known, we can use this formula of the Hall
effect to measure the magnetic field.
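To see how Eq. (2.5.7) might be used in practice, here is a minimal Python sketch that turns a measured Hall voltage into a field strength and cross-checks the result against Eq. (2.5.6). All numerical values are assumptions chosen for illustration (the carrier density is roughly that of copper), and the function names are hypothetical.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C

def drift_speed(i, n, A):
    """Drift speed v_d obtained from i = n e v_d A."""
    return i / (n * E_CHARGE * A)

def hall_field(V, n, A, i, d):
    """Magnetic field from a measured Hall voltage, Eq. (2.5.7)."""
    return n * E_CHARGE * V * A / (i * d)

# Assumed illustrative values, not taken from the notes:
n = 8.5e28      # conduction electrons per m^3 (roughly copper)
d = 1.0e-2      # plate separation in m
A = 1.0e-6      # cross-sectional area of the current in m^2
i = 1.0         # current in A
V = 1.0e-6      # measured Hall voltage in V

B = hall_field(V, n, A, i, d)
v_d = drift_speed(i, n, A)

print(f"v_d = {v_d:.3e} m/s")
print(f"B   = {B:.3f} T")
print(f"check V / (v_d d) = {V / (v_d * d):.3f} T")
```

The drift speed comes out far below a millimeter per second, which is why Hall voltages in good conductors are small and the effect is easier to measure in materials with a lower carrier density.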
Figure 2.17: (a) A moving electron in magnetic field. (b) A moving electron in electric and magnetic
fields which are perpendicular to each other.
Circulating charged particle in uniform magnetic field
When a charged particle moves in a uniform magnetic field, it experiences a force perpendicular to
both the direction of motion and the direction of the magnetic field. This force causes the particle
to move in a circular or helical path.
First, let's consider a charged particle moving in a uniform magnetic field $\vec{B}$ with an initial
velocity $\vec{v}$ perpendicular to the magnetic field. In this case, the magnetic force $\vec{F}_B = q\vec{v} \times \vec{B}$ acting
on the particle is perpendicular to both $\vec{v}$ and $\vec{B}$, so the particle moves in a circular path with a
radius r governed by the centripetal force:
\[ F_B = |q|vB = \frac{mv^2}{r}, \tag{2.5.8} \]
where m is the mass of the particle. The radius of the circle is given by:
\[ r = \frac{mv}{|q|B}. \tag{2.5.9} \]
This equation tells us that the radius of the circle grows with the momentum mv of the particle and
shrinks with its charge and the strength of the magnetic field. The period of the motion, which is the time it takes for the particle
to complete one full circle, is given by:
\[ T = \frac{2\pi r}{v} = \frac{2\pi m}{|q|B}. \tag{2.5.10} \]
This equation tells us that the period of the motion is independent of the velocity of the particle
and depends only on its mass, charge, and the strength of the magnetic field.
Figure 2.18
In the more general case, when the initial velocity of a charged particle is not perpendicular to a
uniform magnetic field, the particle moves in a helical path around the direction of the field vector.
To understand this motion, we can resolve the velocity vector $\vec{v}$ into two components, one parallel
and one perpendicular to the magnetic field $\vec{B}$, where ϕ denotes the angle between $\vec{v}$ and $\vec{B}$. The parallel component $v_\parallel = v\cos\phi$ determines
the pitch p of the helix, which is the distance between adjacent turns, while the perpendicular
component $v_\perp = v\sin\phi$ determines the radius of the helix. The radius of the circular path can be
calculated using the perpendicular velocity component and Eq. (2.5.9):
\[ r = \frac{mv_\perp}{|q|B}. \tag{2.5.11} \]
The pitch of the helix can be calculated using the parallel velocity component and the period T of
the motion:
\[ p = v_\parallel T = \frac{2\pi m v_\parallel}{|q|B}. \tag{2.5.12} \]
It is worth noting that the pitch of the helix is independent of the perpendicular velocity v⊥ , since
the period T only depends on the mass, charge, and strength of the magnetic field.
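The formulas for the radius, period, and pitch can be collected in a few lines of Python. The sketch below assumes a proton in a 0.1 T field moving at 60° to the field; these values and the function name `helix` are illustrative assumptions rather than part of the notes.

```python
import math

def helix(m, q, v, phi, B):
    """Radius, period and pitch for speed v at angle phi (rad) to a uniform field B."""
    v_perp = v * math.sin(phi)              # component perpendicular to B
    v_par = v * math.cos(phi)               # component parallel to B
    r = m * v_perp / (abs(q) * B)           # Eq. (2.5.11)
    T = 2 * math.pi * m / (abs(q) * B)      # Eq. (2.5.10)
    p = v_par * T                           # Eq. (2.5.12)
    return r, T, p

# Assumed illustrative values: a proton in a 0.1 T field.
M_PROTON = 1.67262192e-27   # kg
Q_PROTON = 1.602176634e-19  # C

for v in (1.0e5, 2.0e5):
    r, T, p = helix(M_PROTON, Q_PROTON, v, math.radians(60), 0.1)
    print(f"v = {v:.1e} m/s: r = {r:.3e} m, T = {T:.3e} s, p = {p:.3e} m")
```

Doubling the speed doubles both the radius and the pitch, while the period is unchanged, in line with Eq. (2.5.10).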
In general, a charged particle moving in a magnetic field experiences a force that causes it to
move in a spiral path along the direction of the magnetic field lines. This phenomenon is crucial
for understanding the protective nature of the Earth’s magnetic field. The Earth’s magnetic field
acts as an invisible shield that deflects charged particles, especially those present in the solar wind.
The solar wind primarily consists of electrons and protons that approach the Earth. When these
charged particles encounter the Earth’s magnetic field lines, they undergo a magnetic force that
alters their trajectory. Instead of moving straight towards the Earth’s surface, they spiral along
the field lines. This spiraling motion redirects the charged particles towards the Earth’s north or
south poles.
Magnetic force acting on a wire
Let us now consider a straight wire carrying a current i placed in a uniform magnetic field $\vec{B}$. We
want to understand the magnetic force acting on this wire with length L.
Figure 2.19
Suppose the wire carries an electric current i due to charge carriers moving with drift velocity
$\vec{v}_d$. A carrier takes a time $t = L/v_d$ to traverse the wire, so the total moving charge in the wire of
length L is $q = it = iL/v_d$. The magnetic force acting on this wire is given by
\[ \vec{F} = q\vec{v}_d \times \vec{B} = \frac{iL}{v_d}\vec{v}_d \times \vec{B} = i\vec{L} \times \vec{B}. \tag{2.5.13} \]
Here, $\vec{L}$ is the vector of length L pointing along the direction of the current (the drift velocity of
positive charge carriers). From this expression, we see that the magnetic force acting on the current-carrying wire is
perpendicular to both the direction of the current and the direction of the magnetic field.
Torque on a current loop
Figure 2.20
The figure above depicts a simple motor that consists of a single current-carrying loop placed in a
uniform magnetic field $\vec{B}$. The loop experiences two magnetic forces, denoted by $\vec{F}$ and $-\vec{F}$, which
act in opposite directions along the loop. As a result, the loop experiences a torque that tends to
rotate it about its central axis.
Since two of the edges are always perpendicular to the direction of the magnetic field, the force
acting on them has a magnitude
\[ F = ibB, \tag{2.5.14} \]
where b is the length of the edge. The torque of these forces with respect to the central axis is then
\[ \tau = Fa\sin\theta = iBab\sin\theta, \tag{2.5.15} \]
where a is the length of the other edge of the current loop, and θ is the angle between the normal
direction n̂ of the loop plane and the direction of the magnetic field.
Suppose we replace the single loop of current with a coil of N tightly wound loops. The torque
on the coil due to the magnetic field is given by
\[ \tau = N\tau_{\text{single-loop}} = NiBA\sin\theta, \tag{2.5.16} \]
where A = ab is the area enclosed by each loop. We see that the torque on the coil is proportional to
the number of turns N , the current i, and the enclosed area A. This demonstrates that increasing
any of these parameters will result in an increase in the torque of the simple motor.
Magnetic dipole moment
The magnetic dipole moment is a fundamental concept in magnetism and plays a crucial role in
understanding the behavior of current-carrying coils in magnetic fields. Similar to an electric dipole
consisting of positive and negative charges separated by a distance, a magnetic dipole arises from
a current loop or coil. The magnitude of the magnetic dipole moment of a coil is given by the
expression
\[ |\vec{\mu}| = NiA. \tag{2.5.17} \]
Here, N represents the number of turns in the coil, i is the current flowing through the coil, and A
is the area enclosed by each turn. The magnetic dipole moment is a measure of the strength of the
dipole and is typically expressed in units of ampere-square meters ($\mathrm{A \cdot m^2}$). The
magnetic dipole moment vector, denoted as $\vec{\mu}$, points along the normal vector n̂ to the plane
of the coil. The right-hand rule selects which of the two normals to use: curl the fingers
of your right hand along the current flow in the coil, and the thumb points in the direction of $\vec{\mu}$.
With the magnetic dipole moment defined, we can now apply it to understand the torque
experienced by a current-carrying coil in a magnetic field. The torque on the coil is given by the
cross product of the magnetic dipole moment vector and the magnetic field vector, whose magnitude
reproduces Eq. (2.5.16):
\[ \vec{\tau} = \vec{\mu} \times \vec{B}. \tag{2.5.18} \]
This expression reveals that the coil experiences a torque perpendicular to both the magnetic
dipole moment vector and the magnetic field vector. This is similar to the torque experienced by
an electric dipole in an electric field, given by:
\[ \vec{\tau} = \vec{p} \times \vec{E}, \tag{2.5.19} \]
where $\vec{p} = q(\vec{r}_+ - \vec{r}_-)$ is the electric dipole moment.
A magnetic dipole placed in an external magnetic field possesses energy that is influenced by
the orientation of the dipole with respect to the field. Drawing an analogy to electric dipoles, where
the energy is given by
\[ U = -\vec{p} \cdot \vec{E}, \tag{2.5.20} \]
we can write a similar expression for magnetic dipoles as:
\[ U = -\vec{\mu} \cdot \vec{B}. \tag{2.5.21} \]
This relationship emphasizes the similarity in the energy calculations between electric and magnetic
dipoles, where the energy depends on the alignment of the dipole moment with the corresponding
field.
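The torque and energy formulas for a magnetic dipole can be illustrated with a short Python sketch. It builds the dipole moment of a rectangular coil, evaluates Eqs. (2.5.18) and (2.5.21), and compares the torque magnitude with Eq. (2.5.16). The coil dimensions, current, and field are assumptions chosen for illustration, as are the helper names.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def dipole_moment(N, i, A, n_hat):
    """mu = N i A n_hat for a flat coil with unit normal n_hat, cf. Eq. (2.5.17)."""
    return tuple(N * i * A * c for c in n_hat)

def torque(mu, B):
    """Torque on the dipole, Eq. (2.5.18)."""
    return cross(mu, B)

def energy(mu, B):
    """Potential energy of the dipole, Eq. (2.5.21)."""
    return -dot(mu, B)

# Assumed illustrative coil: 100 turns, 2 A, 3 cm x 4 cm,
# normal tilted by 30 degrees from a 0.5 T field along z.
N, i, a, b = 100, 2.0, 0.03, 0.04
theta = math.radians(30)
n_hat = (math.sin(theta), 0.0, math.cos(theta))
B = (0.0, 0.0, 0.5)

mu = dipole_moment(N, i, a * b, n_hat)
tau = torque(mu, B)
tau_mag = math.sqrt(dot(tau, tau))

print(f"|tau| from mu x B       = {tau_mag:.4f} N m")
print(f"N i B A sin(theta)      = {N * i * 0.5 * a * b * math.sin(theta):.4f} N m")
print(f"U = -mu . B             = {energy(mu, B):.4f} J")
```

The torque vanishes and the energy is lowest when the dipole moment is aligned with the field, which is the orientation the coil tends to rotate towards.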
Magnetic dipole moments are not unique to coils: fundamental particles such as the electron
also carry intrinsic magnetic dipole moments. This phenomenon arises from the intrinsic
spin of the electron, which is often pictured, loosely, as a minute charged particle spinning on its axis.
As a result of this spin, the electron generates a magnetic moment akin to that of a miniature
bar magnet, complete with north and south poles. Because the electron's charge is negative, its magnetic dipole
moment points opposite to its spin angular momentum vector. When subjected to an external magnetic
field, the electron’s magnetic dipole moment interacts with the field, leading to a torque and energy
associated with the interaction. This interaction can give rise to various magnetic effects, including
precession and alignment of the electron’s spin with the applied field. Accurate determination of
the electron’s magnetic dipole moment has been the focus of extensive experimental and theoretical
investigations, playing a significant role in advancing our comprehension of quantum field theories
and the properties of matter at the microscopic level.
2.5.3 Magnetic fields from currents
A magnetic field is not only capable of exerting a force on moving charged particles but also of
being produced by the motion of charged particles or currents. This reciprocal relationship between
magnetic fields and moving charges is a fundamental aspect of electromagnetism.
Biot-Savart law
The Biot-Savart law summarizes the experimental findings regarding the magnetic field produced
at a point P, located at a displacement $\vec{r}$ from a current-carrying length element $d\vec{s}$. This law states
that the magnitude of the magnetic field $d\vec{B}$ is equal to $\frac{\mu_0}{4\pi}$ multiplied by the product of the current i, the length ds, and $\sin\theta$, and is
inversely proportional to the square of the distance $|\vec{r}|$. Here, θ represents the angle between the
directions of $d\vec{s}$ and the unit vector r̂, which points from $d\vec{s}$ towards P. The constant $\mu_0$, known
as the permeability constant, has a value of $\mu_0 \approx 4\pi \times 10^{-7}\ \mathrm{T \cdot m/A} \approx 1.26 \times 10^{-6}\ \mathrm{T \cdot m/A}$.
Figure 2.21
The direction of the magnetic field $d\vec{B}$, as illustrated in the above figure, is perpendicular to
both $d\vec{s}$ and r̂ and can be determined using the cross product $d\vec{s} \times \hat{r}$. Consequently, the vector
form of the Biot-Savart law is expressed as
\[ d\vec{B} = \frac{\mu_0}{4\pi} \frac{i\, d\vec{s} \times \hat{r}}{|\vec{r}|^2}. \tag{2.5.22} \]
The magnitude of a magnetic field is proportional to the current and inversely proportional to the
square of the distance from the source. This is similar to the electric field generated by a charged
particle, which is proportional to the electric charge and inversely proportional to the square of the
distance.
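We can now use the law of Biot and Savart to derive the magnetic field at a perpendicular
distance R from a long (infinite) straight wire carrying a current i. To find the total magnetic field
due to the entire infinite wire, we integrate the contributions from all infinitesimal current elements
along the wire:
\[
\begin{aligned}
\vec{B} = \int d\vec{B} &= \int_{-\infty}^{\infty} \frac{\mu_0}{4\pi} \frac{i\sin\theta\, ds}{r^2} (\hat{i} \times \hat{r}_0) = \int_{-\infty}^{\infty} \frac{\mu_0}{4\pi} \frac{i\,(R/\sqrt{s^2+R^2})\, ds}{s^2+R^2} (\hat{i} \times \hat{r}_0) \\
&= \frac{\mu_0 i R}{4\pi} \left. \frac{s}{R^2\sqrt{s^2+R^2}} \right|_{s=-\infty}^{\infty} (\hat{i} \times \hat{r}_0) = \frac{\mu_0 i}{2\pi R} (\hat{i} \times \hat{r}_0).
\end{aligned}
\tag{2.5.23}
\]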
Here, î is the unit vector along the direction of the current, r̂0 is the unit vector pointing perpendicularly from the
wire to the point, s is the position of the current element along the wire, and θ is the angle between î (or $d\vec{s}$) and $\vec{r}$, so that $\sin\theta = R/\sqrt{s^2+R^2}$. The magnetic field lines produced by the
wire form circles around the wire, and the field magnitude is inversely proportional
to the distance R. This can be observed in the figure below, which shows iron filings aligned with
the magnetic field lines.
Figure 2.22
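The integral in Eq. (2.5.23) can also be approximated numerically, which shows how quickly a finite wire approaches the infinite-wire result. The sketch below sums the scalar contributions $dB = \frac{\mu_0 i}{4\pi} \frac{R\, ds}{(s^2+R^2)^{3/2}}$ with a midpoint rule. The wire lengths, current, step count, and function names are assumptions chosen for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # T m/A, the approximate value quoted in the text

def wire_field_numeric(i, R, length, steps=20000):
    """Midpoint-rule sum of dB over a straight wire of the given length,
    evaluated at perpendicular distance R from the wire's midpoint."""
    h = length / steps
    total = 0.0
    for k in range(steps):
        s = -length / 2 + (k + 0.5) * h          # midpoint of the k-th segment
        total += R * h / (s * s + R * R) ** 1.5  # sin(theta) ds / r^2
    return MU0 * i / (4 * math.pi) * total

def wire_field_infinite(i, R):
    """Closed-form result for an infinite wire, Eq. (2.5.23)."""
    return MU0 * i / (2 * math.pi * R)

# Assumed illustrative values: 5 A, field point 1 cm from the wire.
i, R = 5.0, 0.01
B_inf = wire_field_infinite(i, R)
for length in (0.02, 0.2, 2.0):
    B_num = wire_field_numeric(i, R, length)
    print(f"L = {length:5.2f} m: B = {B_num:.4e} T, ratio to infinite wire = {B_num / B_inf:.4f}")
```

Once the wire is much longer than the distance R, the ratio is close to one, which is why the infinite-wire formula is a good approximation near the middle of a long wire.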
As an additional example, let’s examine the magnetic field generated by a current in a circular
arc of wire. Consider an arc-shaped wire with a central angle ϕ, radius R, and center C, carrying
a current i. The magnetic field at the center C due to any differential segment is directed along
the same direction. Therefore, we can simplify the problem by considering the magnitude of the
magnetic field using the scalar integral form of the Biot-Savart law:
\[ B = \int dB = \int_0^{\phi} \frac{\mu_0}{4\pi} \frac{iR\, d\phi'}{R^2} = \frac{\mu_0 i\phi}{4\pi R}. \tag{2.5.24} \]
If the arc becomes a complete circle, the magnetic field at its center can be simplified further. In
this case, the central angle ϕ is replaced by 2π. The expression for the magnetic field at the center
of a circular wire becomes
\[ B = \frac{\mu_0 i}{2R}. \tag{2.5.25} \]
Its direction is perpendicular to the plane of the circle and is given by the right-hand rule.
Force between two parallel currents
Moving charged particles generate a magnetic field, and magnetic fields in turn
exert forces on moving charged particles. This leads us to the intriguing question of what happens
when two currents are present. Consider the simplest scenario where two currents flow through
infinitely long wires that are parallel to each other. We want to determine the force between these
parallel currents.
To calculate the force between two parallel currents, we can apply Biot-Savart law to determine
the magnetic field created by one current and then use the Lorentz force law to find the force
experienced by the other current. For two parallel long wires carrying currents i1 and i2 , separated
by a distance d, the magnetic field created by the first wire at the location of the second wire is
given by
\[ B = \frac{\mu_0 i_1}{2\pi d}. \tag{2.5.26} \]
The force experienced by the second wire can be obtained by multiplying the magnetic field B with
the length of the wire $L_2$ and the current $i_2$:
\[ F = i_2 L_2 B = \frac{\mu_0 i_1 i_2 L_2}{2\pi d}. \tag{2.5.27} \]
A simple analysis of the direction of the force reveals that parallel currents attract each other, while
antiparallel currents repel each other.
The force between two parallel currents is not only a fundamental concept in electromagnetism,
but it also played a crucial role in defining the unit of electric current, the ampere. Prior to 2019,
the ampere was defined as the constant current that, when maintained in two parallel conductors
of infinite length, placed one meter apart in a vacuum, would exert a force of $2 \times 10^{-7}$ newtons per
meter of length between them. This definition fixed the value of $\mu_0$ to be exactly $4\pi \times 10^{-7}\ \mathrm{T \cdot m/A}$.
However, with the redefinition of the SI base units in 2019, the ampere is now defined in terms of
the elementary charge e. The elementary charge has a fixed numerical value of $1.602176634 \times 10^{-19}$
coulombs (C), where 1 ampere (A) is equal to 1 coulomb per second (C/s). The second, in turn, is
defined in terms of the unperturbed ground state hyperfine transition frequency of the caesium-133
atom.
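The pre-2019 definition of the ampere quoted above can be reproduced directly from Eq. (2.5.27). The following Python sketch evaluates the force per unit length for two wires carrying 1 A each, placed 1 m apart, using $\mu_0 = 4\pi \times 10^{-7}$ T·m/A. The function name `force_per_length` is a hypothetical helper introduced for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # T m/A, the value fixed by the pre-2019 ampere definition

def force_per_length(i1, i2, d):
    """Magnitude of the force per unit length between two long parallel wires,
    obtained by dividing Eq. (2.5.27) by the length L2."""
    return MU0 * i1 * i2 / (2 * math.pi * d)

f = force_per_length(1.0, 1.0, 1.0)
print(f"F/L = {f:.3e} N/m")   # prints F/L = 2.000e-07 N/m
```

Since the force scales as $i_1 i_2 / d$, doubling both currents quadruples the force, while doubling the separation halves it.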
2.5.4 Ampere's law
When determining the net electric field resulting from a charge distribution, we consider the sum of
the differential electric fields contributed by each element. In cases where the distribution exhibits
planar, cylindrical, or spherical symmetry, Gauss’ law provides a more efficient method to calculate
the total electric field.
Similarly, when calculating the net magnetic field caused by a distribution of currents, we sum
the contributions of the differential magnetic fields using the Biot-Savart law. If the distribution
possesses symmetry, Ampere’s law can be utilized for a simpler analysis. Ampere’s law, derived
from the Biot-Savart law, is commonly associated with Ampere, but it was actually derived and
generalized by James Clerk Maxwell. Ampere's law can be expressed as:
\[ \oint \vec{B} \cdot d\vec{s} = \mu_0 i_{\text{enc}}. \tag{2.5.28} \]
In this equation, the integral is taken around a closed loop, known as an Amperian loop, and the
dot product represents the magnetic field along the loop integrated over the differential path length.
The quantity $i_{\text{enc}}$ denotes the net current enclosed by the loop.
Symmetry transformation of magnetic field
To explore the symmetry transformation of the magnetic field, let’s first examine the cross product
under O(3) transformations. It can be easily shown that
\[ (R\vec{A}) \times (R\vec{B}) = \det(R) \cdot R(\vec{A} \times \vec{B}) \tag{2.5.29} \]
for R ∈ O(3). In fact, the cross product is unique in that it is orthogonal to both factors, has
a length equal to the area of the parallelogram they form, and forms a right-handed triple with
them. These properties remain invariant under rotations in SO(3). Therefore, the cross product
is invariant under SO(3) transformation. However, if R ∈ O(3) has the property det R = −1,
the right-hand rule will transform into a left-hand rule, resulting in a minus sign in the previous
equation.
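Equation (2.5.29) is easy to test numerically. The Python sketch below checks it for one proper rotation (a rotation about the z axis, with det R = +1) and one improper transformation (a mirror reflection z → −z, with det R = −1). The particular matrices, test vectors, and helper names are assumptions chosen for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def matvec(M, v):
    """Apply a 3x3 matrix (list of rows) to a 3-vector."""
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def rot_z(angle):
    """Rotation by the given angle about the z axis, an element of SO(3)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

MIRROR_Z = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # det = -1

def both_sides(R, A, B):
    """Return the left- and right-hand sides of Eq. (2.5.29)."""
    lhs = cross(matvec(R, A), matvec(R, B))
    rhs = tuple(det3(R) * c for c in matvec(R, cross(A, B)))
    return lhs, rhs

A, B = (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)
for name, R in (("rotation", rot_z(0.7)), ("reflection", MIRROR_Z)):
    lhs, rhs = both_sides(R, A, B)
    print(name, "det =", round(det3(R), 12), "lhs =", lhs, "rhs =", rhs)
```

For the reflection, dropping the factor det(R) would give the wrong sign, which is exactly the behavior that distinguishes a pseudovector from a vector.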
Now, let’s delve into understanding the transformation rule for the magnetic field due to a
current density that is invariant under an R ∈ O(3) transformation. Starting from the Biot-Savart
law, we have:
\[ \vec{B}(\vec{r}) = \int d\vec{B} = \int \frac{\mu_0}{4\pi} \frac{i(\vec{s})\, d\vec{s} \times (\vec{r} - \vec{s})}{|\vec{r} - \vec{s}|^3}. \tag{2.5.30} \]
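The magnetic field at position $\vec{r}\,' = R\vec{r}$ is then given by:
\[
\begin{aligned}
\vec{B}(\vec{r}\,') = \vec{B}(R\vec{r}) &= \int \frac{\mu_0}{4\pi} \frac{i(\vec{s})\, d\vec{s} \times (R\vec{r} - \vec{s})}{|R\vec{r} - \vec{s}|^3} = \int \frac{\mu_0}{4\pi} \frac{i(R^{-1}\vec{s})\, d(R^{-1}\vec{s}) \times R(\vec{r} - R^{-1}\vec{s})}{|\vec{r} - R^{-1}\vec{s}|^3} \\
&= \int \frac{\mu_0}{4\pi} \frac{i(\vec{s})\, d\vec{s} \times R(\vec{r} - \vec{s})}{|\vec{r} - \vec{s}|^3} = \int \frac{\mu_0}{4\pi} \frac{R[i(\vec{s})\, d\vec{s}] \times R(\vec{r} - \vec{s})}{|\vec{r} - \vec{s}|^3} \\
&= \det(R) \cdot R \int \frac{\mu_0}{4\pi} \frac{i(\vec{s})\, d\vec{s} \times (\vec{r} - \vec{s})}{|\vec{r} - \vec{s}|^3} = \det(R) \cdot R\vec{B}(\vec{r}).
\end{aligned}
\tag{2.5.31}
\]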
Here, we used the symmetry property of the current, R(id⃗s) = id⃗s. From this transformation rule,
we observe that the magnetic field is transformed in the same way under rotation if R ∈ SO(3) and
has an additional minus sign for R ∈ O(3) with det(R) = −1.
A physical quantity is classified as a vector if it follows the transformation rule
\[ \vec{v}\,' = R\vec{v}, \tag{2.5.32} \]
under a transformation R ∈ O(3). On the other hand, a quantity is considered a pseudovector (or axial
vector) if its transformation rule is
\[ \vec{v}\,' = \det(R) \cdot R\vec{v}. \tag{2.5.33} \]
Performing usual operations on vectors or pseudovectors does not change their nature. For example,
the position vector $\vec{r}$ is a vector, and its derivative $\frac{d\vec{r}}{dt}$ (velocity) is also a vector. Similarly, the
acceleration is a vector. As demonstrated earlier, the electric field is a vector. However, the
cross product of two vectors results in a pseudovector. This explains why the magnetic field is a
pseudovector, as it is fundamentally related to $d\vec{s} \times \vec{r}$, where $d\vec{s}$ can be viewed as the direction
of the drift velocity of charge carriers in a wire, which is a vector. Similarly, angular momentum
$\vec{L} = \vec{r} \times \vec{p}$ and torque $\vec{\tau} = \vec{r} \times \vec{F}$ are pseudovectors.
Combining Ampere’s law with symmetry
When there are symmetries in the current density or distribution, Ampere’s law and symmetry can
be utilized to simplify the derivation of the magnetic field generated by the current.
For example, let's consider a long, straight wire carrying a steady current i. Applying Ampere's
law to a circular loop of radius r centered on the wire, we utilize the rotational and translational
symmetries. The magnetic field $\vec{B}$ has a constant magnitude along the loop, with direction determined
by the right-hand rule. Ampere's law yields
\[ \oint \vec{B} \cdot d\vec{s} = B \cdot 2\pi r = \mu_0 i, \tag{2.5.34} \]
allowing us to solve for the magnetic field of an infinite wire
\[ B = \frac{\mu_0 i}{2\pi r}. \tag{2.5.35} \]
This demonstrates the inverse relationship between the magnetic field and distance r, while emphasizing the utility of Ampere’s law in analyzing infinitely long wires.
Figure 2.23
Similarly, for a long ideal solenoid carrying a current i, we can choose the Amperian loop to be
the loop in Fig. 2.23. Then Ampere's law gives us
\[ \oint \vec{B} \cdot d\vec{s} = Bh = \mu_0 i n h, \tag{2.5.36} \]
where n is the number of turns per unit length of the solenoid and h is the length of the side of the
loop that runs parallel to the axis inside the solenoid. The magnetic field inside the
solenoid is uniform and has magnitude
\[ B = \mu_0 i n. \tag{2.5.37} \]
2.6 Faraday's Law, Induction and Inductance
2.6.1 Faraday's law of induction
In the last section, we showed that a current produces a magnetic field. A more surprising physical
discovery of Faraday is the reverse effect: a changing magnetic field can induce an electric field that
drives a current. This link between a magnetic field and the electric field it induces is now called
Faraday’s law of induction.
Faraday originally stated his law of induction as “an emf is induced in a loop when the number
of magnetic field lines that pass through the loop is changing”. The number of magnetic field lines
passing through the loop is quantitatively described by the magnetic flux. Suppose a loop enclosing
an area A is placed in a magnetic field $\vec{B}$. Then, the magnetic flux through the loop is defined as
\[ \Phi_B := \iint_A \vec{B} \cdot d\vec{A}. \]
Since the magnetic field $\vec{B}$ is divergence-free, i.e., $\nabla \cdot \vec{B} = 0$, by Corollary 2.2.2, its flux through
a surface S is invariant under local continuous deformation of S. As a consequence, the flux $\Phi_B$
is well-defined and does not depend on the particular surface enclosed by the loop that we have
chosen. The SI unit for magnetic flux is called the weber (Wb): $1\,\mathrm{Wb} = 1\,\mathrm{T \cdot m^2}$.
Physics law 12 (Faraday's law of induction). The magnitude of the emf $\mathcal{E}$ induced in a conducting loop is equal to the rate at which the magnetic flux $\Phi_B$ through that loop changes with time.
Moreover, the induced emf $\mathcal{E}$ tends to oppose the flux change. As a formula,
\[ \mathcal{E} = -\frac{d\Phi_B}{dt}. \tag{2.6.1} \]
If we change the magnetic flux through a coil of n turns, an induced emf appears in every turn
and the total emf induced in the coil is the sum of these individual induced emfs:
\[ \mathcal{E} = -n\frac{d\Phi_B}{dt}. \]
Here are some common means by which we can change the magnetic flux through a coil:
• Change the magnitude of the magnetic field within the coil.
• Change the area of the coil or the portion of that area that lies within the magnetic field.
• Change the angle between the direction of the magnetic field and the plane of the coil.
The rule of thumb to determine the directions of the induced currents is as follows:
An induced current has a direction such that the magnetic field due to the current opposes the
change in the magnetic flux that induces the current.
This is called Lenz’s law. We illustrate it with the following figure.
Figure 2.24: The direction of the current i induced in a loop is such that the current's magnetic
field $\vec{B}_{\text{ind}}$ opposes the change in the magnetic field $\vec{B}$ inducing i. The field $\vec{B}_{\text{ind}}$ is always directed
opposite an increasing field $\vec{B}$ (figures a, c) and in the same direction as a decreasing field $\vec{B}$ (figures
b, d). The curled–straight right-hand rule gives the direction of the induced current based on the
direction of the induced field.
Example. Suppose we pull a closed conducting loop out of a uniform magnetic field $\vec{B}$ (pointing
into the page) at constant velocity v as in Figure 2.25. Suppose the resistance of the loop is R.
Find the induced current, the force we must exert on the loop, the rate at which we do work on the loop, and the
rate of energy dissipation in the loop.
Solution: We choose the coordinate axes in Figure 2.25 such that $\vec{v} = v\vec{e}_x$ is along the x direction,
and $\vec{B}$ is along the −z direction, i.e., $\vec{B} = -B\vec{e}_z$.
As we move the loop to the right, the portion of its area within the magnetic field decreases, so
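the flux through the loop also decreases. The magnetic flux is $\Phi_B = BLx$, where x is the length of
the loop still inside the field and L is the length of the edge perpendicular to the motion. The flux changes at the
rate
\[ \frac{d\Phi_B}{dt} = BL\frac{dx}{dt} = -BLv, \]
where the − sign indicates the fact that x is decreasing. Then, by Faraday's law, an emf is induced
in the loop with magnitude $\mathcal{E} = BLv$, which leads to an induced current
\[ i = \frac{BLv}{R}. \]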
Moreover, by Lenz’s law, the induced current is in the clockwise direction.
Figure 2.25
With the induced current, we can calculate the magnetic forces on the loop. For the left edge,
we have
\[ \vec{F}_1 = i(L\vec{e}_y) \times \vec{B} = -iLB\vec{e}_x = -\frac{B^2L^2v}{R}\vec{e}_x. \]
For the upper and lower edges, we have
\[ \vec{F}_2 = i(x\vec{e}_x) \times \vec{B} = ixB\vec{e}_y, \qquad \vec{F}_3 = -i(x\vec{e}_x) \times \vec{B} = -ixB\vec{e}_y. \]
Hence, to ensure that the loop moves at a constant velocity, the force we should exert on the loop is
\[ \vec{F} = -(\vec{F}_1 + \vec{F}_2 + \vec{F}_3) = \frac{B^2L^2v}{R}\vec{e}_x. \]
Then, the rate of work done on the loop is
\[ \vec{F} \cdot \vec{v} = \frac{B^2L^2v^2}{R}. \]
Finally, the rate of energy dissipation is
\[ P = i^2R = \frac{B^2L^2v^2}{R}. \]
R
Note that this is exactly equal to the rate at which we are doing work on the loop. In other words,
this shows that the work that we have done in pulling the loop through the magnetic field finally
transfers to the thermal energy in the loop.
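The energy balance in this example can be verified with a few lines of Python. The sketch below computes the induced current, the applied force, the mechanical power, and the dissipated power for one set of values. The numbers (B = 0.5 T, L = 0.1 m, v = 2 m/s, R = 0.2 Ω) and the function name `pulled_loop` are assumptions chosen for illustration.

```python
def pulled_loop(B, L, v, R):
    """Induced current, applied force, mechanical power and Joule heating
    for a loop pulled out of a uniform field at constant speed v."""
    emf = B * L * v          # |dPhi_B/dt| with Phi_B = B L x and dx/dt = -v
    i = emf / R              # induced current
    force = i * L * B        # applied force that balances the force on the left edge
    p_work = force * v       # rate of work done by the applied force
    p_heat = i ** 2 * R      # rate of energy dissipation in the resistance
    return i, force, p_work, p_heat

# Assumed illustrative values
i, force, p_work, p_heat = pulled_loop(B=0.5, L=0.1, v=2.0, R=0.2)
print(f"i = {i:.3f} A, F = {force:.4f} N")
print(f"mechanical power = {p_work:.4f} W, dissipated power = {p_heat:.4f} W")
```

Both powers scale as $v^2$, so pulling the loop out twice as fast requires four times the power.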
2.6.2 Induced electric fields
Faraday’s law of induction tells us that a changing magnetic field induces an emf (and hence a
current) in a copper ring. Then, an electric field must be present along the ring because an electric
field is needed to move the conduction electrons. This electric field $\vec{E}$ induced by a changing
magnetic field is just as real as an electric field produced by static charges; either field will exert
a force $q_0\vec{E}$ on a particle of charge $q_0$. Now, Faraday's law can be equivalently stated as follows:
A changing magnetic field induces an electric field.
Note that this statement is more general in the sense that the electric field is induced even if there
is no such copper ring, i.e., the electric field would appear even if the changing magnetic field were
in a vacuum.
To derive the formula for the induced electric field, we look at the following figures. We assume
that the magnetic field $\vec{B}$ is increasing in magnitude at a rate $dB/dt$.
In (a), an emf E is induced along the copper ring as in (2.6.1). In (b), we replace the copper ring
with a hypothetical circular path of radius r. Consider a particle of charge q0 moving around this
circular path, the work W done on it in one revolution by the induced electric field is
W = q0 E = −q0
dΦB
= −q0
dt
¨
S
⃗
dB
⃗
· dA,
dt
where S is the disk enclosed by the circular path. On the other hand, the work done on a particle
of charge q0 along the circular path C can also be evaluated from the electric field as
˛
¨
⃗ · d⃗s = q0
⃗ × E)
⃗ · dA.
⃗
W = q0 E
(∇
C
S
where we used Theorem 2.2.3 in the second step. Comparing the above equations, we obtain
$$-\iint_S \frac{d\vec B}{dt}\cdot d\vec A = \iint_S (\vec\nabla\times\vec E)\cdot d\vec A.$$
Since this equation must hold for any surface $S$, the integrands must be equal:
$$\vec\nabla\times\vec E = -\frac{d\vec B}{dt}. \tag{2.6.2}$$
This is Faraday’s law in Maxwell’s equations.
Now, we have two types of electric fields: static electric fields produced by static charges, and
induced electric fields produced by changing magnetic flux. Although electric fields produced in
either way exert forces on charged particles, there is an important difference between them. The
static electric field is irrotational (curl-free), so there is an electric potential associated with it by the Poincaré lemma, Theorem 2.2.4. But this is not the case for induced electric fields, which have non-vanishing curl: electric potential has no meaning for electric fields that are produced by induction. To see this, assume that there were an electric potential and derive a contradiction. Consider a particle of charge $q_0$ that makes a loop around the circular path in Figure (b) above. It starts at a certain point and, on its return to that same point, has experienced an emf $\mathcal{E}$; that is, work $q_0\mathcal{E}$ has been done on the particle by the electric field. However, that is impossible, because the particle is back at the same point, which can have only one value of the potential.
The difference between the two types of electric fields can also be told from their field lines.
Field lines of induced electric fields form closed loops. Field lines produced by static charges never do so: they must start on positive charges and end on negative charges. Thus, a field line from a charge can never loop around and back onto itself.
2.6.3 Inductors and inductance, self-induction
In the previous lecture, we discussed how capacitors can generate desired electric fields. Now we
will shift our focus to inductors, which play a crucial role in generating desired magnetic fields.
Specifically, we will examine the behavior of a long solenoid, with our attention drawn to a short
length near the middle to mitigate any potential fringing effects.
When a current i flows through the windings of the solenoid, it generates a magnetic flux ΦB in
the central region of the inductor. The magnitude of the magnetic flux increases with the current.
Similarly, increasing the number of turns N in the solenoid increases the magnetic flux. To quantify
the ability of the solenoid to produce magnetic flux, we define the inductance $L$ as
$$L = \frac{N\Phi_B}{i}. \tag{2.6.3}$$
The inductance is the ratio of the flux linkage $N\Phi_B$ to the current and provides a measure of the solenoid’s capability as an inductor. The SI unit of magnetic flux is the tesla-square meter (T·m²), so the SI unit of inductance is the tesla-square meter per ampere (T·m²/A). This unit is called the henry (H) in honor of Joseph Henry, an American physicist who discovered the law of induction independently of Faraday. Therefore, 1 H = 1 T·m²/A.
To calculate the inductance of a solenoid, consider a section of length $l$ near its middle. The flux linkage of this section is
$$N\Phi_B = (nl)(BA), \tag{2.6.4}$$
where $n$ is the number of turns per unit length of the solenoid, $A$ is its cross-sectional area, and $B$ is the magnitude of the magnetic field inside the solenoid. As we calculated before, the magnetic field within the solenoid is $B = \mu_0 i n$, where $\mu_0$ is the permeability of free space and $i$ is the current. Hence, the inductance of the solenoid per unit length is
$$\frac{L}{l} = \frac{N\Phi_B}{il} = \mu_0 n^2 A. \tag{2.6.5}$$
The previous equation for the inductance of a solenoid provides a good approximation when the
solenoid is much longer than its radius. This approximation neglects the spreading of magnetic
field lines near the ends of the solenoid. Similarly, the parallel-plate capacitor formula (C =
ϵ0 A/d) neglects the fringing of electric field lines near the edges of the capacitor plates. These
approximations hold under specific conditions but may not fully account for the behavior near the
boundaries.
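A minimal sketch of Eq. (2.6.5) in code, under the same long-solenoid approximation. The function name, the sample dimensions, and the use of the classical value $\mu_0 = 4\pi\times10^{-7}$ H/m are illustrative assumptions, not values taken from the notes.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space in H/m (classical value, assumed)


def solenoid_inductance(turns_per_meter, area, length):
    """Inductance L = mu_0 * n^2 * A * l of a long solenoid, fringing neglected."""
    return MU_0 * turns_per_meter ** 2 * area * length


# Illustrative numbers: 1000 turns per meter, 1 cm^2 cross-section, 10 cm length.
L_example = solenoid_inductance(turns_per_meter=1000.0, area=1e-4, length=0.1)
print(L_example)  # inductance in henry
```

Because $L \propto n^2$, doubling the winding density quadruples the inductance for the same geometry.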
Figure 2.26
If the current in an inductor, such as a coil, is changing, the magnetic flux through it changes as well, which leads to a self-induced emf according to Faraday’s law. This phenomenon is known as self-induction. Using the definition of inductance, $N\Phi_B = Li$, and applying Faraday’s law, the self-induced emf is
$$\mathcal{E}_L = -\frac{d(N\Phi_B)}{dt} = -L\frac{di}{dt}. \tag{2.6.6}$$
Therefore, in any inductor, whether it be a coil, solenoid, or toroid, a self-induced emf arises whenever the current changes with time. It depends on the rate of change of the current, not on the current’s magnitude. The direction of the self-induced emf is determined by Lenz’s law: it opposes the change in current. This is the origin of the negative sign in the above equation.
2.6.4 RL circuits
In section 2.4.7, we discussed RC circuits that involve both a resistor and a capacitor. Now, let’s
consider an RL circuit where we have a resistor and an inductor connected together in a circuit.
We aim to derive and solve the differential equation that governs the behavior of this circuit.
Figure 2.27
Let’s first consider the stationary case when the switch is at a. Initially, due to the presence of
the inductor, the increasing current gives rise to an induced electric potential that is opposite in
direction to the battery potential, following Lenz’s law. As time progresses, the current reaches its
final maximum value, which is i = E /R. This implies that after a long time, the inductor behaves
like an ordinary connecting wire. And the voltage across the two ends of the inductor is zero. So,
in the stationary state, the circuit can be simplified to a battery connected to a resistance.
In the stationary state, when the switch is on position b, the circuit consists of the inductor and
the resistor without a battery. Without a battery to drive the current, the circuit eventually reaches
an equilibrium state where no current flows and no voltage is present across the components.
Now let us analyze the time-dependent situations quantitatively. When the switch S in the circuit is closed on position a, the current in the circuit begins to increase. A self-induced emf $\mathcal{E}_L = -L\frac{di}{dt}$ appears across the inductor and, according to Lenz’s law, opposes the rise of the current, and hence opposes the battery emf $\mathcal{E}$. Therefore, the voltage across the resistor is $\mathcal{E} + \mathcal{E}_L = \mathcal{E} - L\frac{di}{dt}$. By Ohm’s law, the voltage across the resistor is also $iR$. Thus, the differential equation for the circuit is
$$\mathcal{E} - L\frac{di}{dt} - iR = 0. \tag{2.6.7}$$
With the initial condition $i(t=0) = 0$ and the long-time limit $i(t\to+\infty) = \mathcal{E}/R$, the solution to the above differential equation is
$$i(t) = \frac{\mathcal{E}}{R}\left(1 - e^{-t/\tau_L}\right), \tag{2.6.8}$$
where the inductive time constant $\tau_L$ is given by
$$\tau_L = \frac{L}{R}. \tag{2.6.9}$$
It is straightforward to verify that $\tau_L$ defined above indeed has the dimension of time. The potential difference across the resistor, $V_R(t) = i(t)R$, behaves in the same way as the current. The potential difference across the inductor is $|V_L(t)| = L\frac{di(t)}{dt} = \mathcal{E}e^{-t/\tau_L}$, which decays exponentially to zero as expected.
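The solution (2.6.8) can be cross-checked numerically. The sketch below integrates Eq. (2.6.7) with a simple forward-Euler step and compares the result with the closed form. The function names, the step count, and the component values are illustrative assumptions.

```python
import math


def rl_current_analytic(t, emf, R, L):
    """Closed-form current i(t) = (emf/R) * (1 - exp(-t/tau_L)) with tau_L = L/R."""
    tau = L / R
    return emf / R * (1.0 - math.exp(-t / tau))


def rl_current_euler(t_end, emf, R, L, steps=100_000):
    """Forward-Euler integration of L di/dt = emf - i R, starting from i(0) = 0."""
    dt = t_end / steps
    i = 0.0
    for _ in range(steps):
        i += dt * (emf - i * R) / L
    return i


# Illustrative values: 12 V battery, 4 ohm resistor, 2 H inductor, so tau_L = 0.5 s.
print(rl_current_analytic(1.0, 12.0, 4.0, 2.0))
print(rl_current_euler(1.0, 12.0, 4.0, 2.0))
```

After one time constant the analytic current has reached a fraction $1 - 1/e \approx 63\%$ of its final value $\mathcal{E}/R$.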
On the other hand, if the switch is initially on position a for a long time and then switched to
position b at t = 0, the initial current through the circuit is i(0) = E /R. Initially, the inductor
behaves like a wire. As time progresses without a battery in the circuit, the current will decrease. This time-varying current induces an emf $\mathcal{E}_L = -L\frac{di}{dt}$ across the inductor, which must equal the potential difference across the resistor, $V_R = iR$. Thus, we have the differential equation
$$L\frac{di}{dt} + iR = 0. \tag{2.6.10}$$
Using the initial condition $i(0) = \mathcal{E}/R$, the solution to the above differential equation is
$$i(t) = \frac{\mathcal{E}}{R}e^{-t/\tau_L}, \tag{2.6.11}$$
where the inductive time constant is again $\tau_L = L/R$. We observe that the current decays exponentially from its initial value $\mathcal{E}/R$ to zero.
2.6.5 Energy of a magnetic field
Let us consider the RL circuit again. Multiplying both sides of the RL differential equation (2.6.7) by $i$, we obtain
$$\mathcal{E}i = Li\frac{di}{dt} + i^2R. \tag{2.6.12}$$
The left-hand side, $\mathcal{E}i = \mathcal{E}\frac{dq}{dt}$, can be interpreted as the rate at which the battery delivers energy to the rest of the circuit. On the right-hand side, the second term $i^2R$ is the rate at which energy appears as thermal energy in the resistor. Therefore, the first term $Li\frac{di}{dt}$ should be understood as the rate at which energy is delivered to the inductor. Since an inductor stores energy in the form of a magnetic field, it is the rate at which energy is stored in the magnetic field:
$$\frac{dU_B}{dt} = Li\frac{di}{dt}. \tag{2.6.13}$$
By integrating this expression, we obtain the magnetic energy
$$U_B = \int \frac{dU_B}{dt}\,dt = \int_0^i Li'\,di' = \frac{1}{2}Li^2. \tag{2.6.14}$$
This expression is similar to the expression for the electric energy stored in a capacitor, $U_E = \frac{q^2}{2C}$.
If the inductor is a long solenoid of cross-sectional area $A$ and length $l$, its magnetic energy density is approximately
$$u_B = \frac{U_B}{Al} = \frac{L}{l}\,\frac{i^2}{2A} = \mu_0 n^2 A\,\frac{i^2}{2A} = \frac{B^2}{2\mu_0}. \tag{2.6.15}$$
In this equation, we used the relationship $L/l = \mu_0 n^2 A$ from Eq. (2.6.5) and $B = \mu_0 i n$ for a long solenoid. This final expression represents the density of stored magnetic energy at any point where the magnitude of the magnetic field is $B$. Although we derived it specifically for the case of a solenoid, it holds true for all magnetic fields, regardless of how they are generated. Notably, the magnetic energy density bears a resemblance to the expression for the electric energy density, $u_E = \frac{1}{2}\epsilon_0 E^2$.
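The two routes to the energy density in Eq. (2.6.15) can be compared numerically. This is a small sketch with assumed sample values; the function name and the classical value of $\mu_0$ are illustrative.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space in H/m (classical value, assumed)


def solenoid_energy_densities(n, current, area, length):
    """Return (U_B / (A l), B^2 / (2 mu_0)) for a long solenoid."""
    inductance = MU_0 * n ** 2 * area * length      # Eq. (2.6.5) times l
    stored_energy = 0.5 * inductance * current ** 2  # Eq. (2.6.14)
    from_energy = stored_energy / (area * length)
    field = MU_0 * n * current                       # B = mu_0 i n
    from_field = field ** 2 / (2.0 * MU_0)
    return from_energy, from_field


u1, u2 = solenoid_energy_densities(n=1000.0, current=2.0, area=1e-4, length=0.1)
print(u1, u2)  # both in J/m^3
```

The area and length drop out of the first expression, which is why the result depends only on the local field $B$.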
2.6.6 LC harmonic oscillations
We have discussed RC and RL circuits in previous lectures. In this section, we will consider an
LC circuit consisting of an inductor with inductance L and a capacitor with capacitance C. Let’s
denote the charge on the capacitor as $q$, and the current in the circuit as $i$, which is equal to $\frac{dq}{dt}$. The electric potential differences across the inductor and the capacitor are $-L\frac{di}{dt}$ and $\frac{q}{C}$, respectively. Therefore, we have the differential equation
$$L\frac{d^2q}{dt^2} + \frac{q}{C} = 0, \tag{2.6.16}$$
or equivalently,
$$\frac{d^2q}{dt^2} = -\omega^2 q, \tag{2.6.17}$$
where the angular frequency is given by
$$\omega = \frac{1}{\sqrt{LC}}. \tag{2.6.18}$$
Equation (2.6.17) describes a harmonic oscillator. The solution to this differential equation is
$$q = Q\cos(\omega t + \phi), \tag{2.6.19}$$
where $Q$ is the amplitude of the oscillation and $\phi$ is the phase angle. The corresponding current is given by
$$i = \frac{dq}{dt} = -\omega Q\sin(\omega t + \phi). \tag{2.6.20}$$
Figure 2.28
In a harmonic oscillator, such as a spring system, we know that the total energy is conserved.
The energy is transformed between kinetic energy and the potential energy of the spring. Similarly,
in the LC circuit, we expect total energy conservation. The total energy consists of the electric
energy of the capacitor and the magnetic energy of the inductor. Using the energy formulas we
derived earlier, the total energy is given by
$$U = U_E + U_B = \frac{1}{2}Li^2 + \frac{q^2}{2C}. \tag{2.6.21}$$
The conservation of this total energy implies that
$$\frac{dU}{dt} = Li\frac{di}{dt} + \frac{q}{C}\frac{dq}{dt} = Li\frac{d^2q}{dt^2} + \frac{q}{C}i = 0. \tag{2.6.22}$$
Dividing both sides by $i$ gives precisely the harmonic oscillator differential equation (2.6.16) that we derived earlier. Conversely, the equation of motion guarantees that the total energy of the LC circuit is conserved, just like in other harmonic oscillator systems.
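Energy conservation can also be checked directly on the solution (2.6.19) and (2.6.20). The sketch below evaluates the total energy (2.6.21) at several times; the function names and the component values are illustrative assumptions.

```python
import math


def lc_state(t, L, C, Q, phi=0.0):
    """Charge and current of an ideal LC circuit at time t, Eqs. (2.6.19)-(2.6.20)."""
    omega = 1.0 / math.sqrt(L * C)
    q = Q * math.cos(omega * t + phi)
    i = -omega * Q * math.sin(omega * t + phi)
    return q, i


def lc_energy(q, i, L, C):
    """Total energy U = L i^2 / 2 + q^2 / (2 C), Eq. (2.6.21)."""
    return 0.5 * L * i ** 2 + q ** 2 / (2.0 * C)


# Illustrative values: L = 10 mH, C = 1 uF, initial charge amplitude Q = 2 uC.
L_val, C_val, Q_val = 10e-3, 1e-6, 2e-6
energies = [lc_energy(*lc_state(t, L_val, C_val, Q_val), L_val, C_val)
            for t in (0.0, 1e-5, 3e-4, 2e-3)]
print(energies)
```

Every entry equals $Q^2/(2C)$ up to rounding: the energy only moves back and forth between the capacitor and the inductor.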
2.6.7 RLC damped oscillations
In this subsection, we will delve into the behavior of RLC circuits. RLC circuits are fundamental
electrical circuits that incorporate resistors (R), inductors (L), and capacitors (C).
Figure 2.29
The differential equation for a series RLC circuit can be derived by applying Kirchhoff’s loop rule to the circuit. The voltage drop across the resistor is given by Ohm’s law as $V_R(t) = i(t)R$, where $i(t)$ is the current flowing through the circuit. According to Faraday’s law, the voltage across the inductor is $V_L(t) = -L\,di(t)/dt$. The voltage across the capacitor is $V_C(t) = \frac{1}{C}\int i(t)\,dt$. Applying Kirchhoff’s loop rule to the series RLC circuit, we have
$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = 0, \tag{2.6.23}$$
where we used the relation $i(t) = dq(t)/dt$. This is the differential equation that describes the behavior of a series RLC circuit. Interestingly, it has the same form as the equation of motion for a damped harmonic oscillator, as given by Eq. (1.5.14):
$$m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = 0. \tag{2.6.24}$$
By making the following correspondences between physical quantities and parameters,
$$\begin{cases} q \leftrightarrow x \\ i = \frac{dq}{dt} \leftrightarrow v = \frac{dx}{dt} \end{cases}, \qquad \begin{cases} L \leftrightarrow m \\ R \leftrightarrow b \\ C \leftrightarrow \frac{1}{k} \end{cases}, \tag{2.6.25}$$
we can establish a direct connection between the series RLC circuit and the damped harmonic
oscillator.
Depending on the values of the parameters, the series RLC circuit can exhibit three different types of behavior:
• Underdamped: When $R < \sqrt{4L/C}$, the circuit is underdamped. In this case, the system exhibits oscillatory behavior, with the current and charge undergoing decaying oscillations.
• Overdamped: When $R > \sqrt{4L/C}$, the circuit is overdamped. In this scenario, the system does not exhibit oscillations. The current and charge approach their equilibrium values gradually without oscillatory behavior.
• Critically damped: When $R = \sqrt{4L/C}$, the circuit is critically damped. Here, the system reaches equilibrium in the shortest time possible without oscillating.
Understanding the behavior of the RLC circuit in terms of the damped harmonic oscillator helps
us analyze and predict the response of the circuit under different parameter values.
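The three regimes listed above can be sketched as a small classifier that compares $R$ with the critical value $\sqrt{4L/C}$. The function name, the tolerance used to detect exact critical damping, and the sample component values are illustrative assumptions.

```python
import math


def classify_rlc(R, L, C, rel_tol=1e-9):
    """Classify a series RLC circuit by comparing R with sqrt(4 L / C)."""
    r_critical = math.sqrt(4.0 * L / C)
    if math.isclose(R, r_critical, rel_tol=rel_tol):
        return "critically damped"
    return "underdamped" if R < r_critical else "overdamped"


# Illustrative values: L = 1 mH, C = 1 uF gives a critical resistance of about 63 ohm.
for resistance in (10.0, 100.0):
    print(resistance, classify_rlc(resistance, 1e-3, 1e-6))
```

In practice exact critical damping is a single point in parameter space, so the tolerance only matters for idealized inputs.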
Additionally, we can introduce an alternating-current (AC) voltage source into the RLC circuit, which adds a time-varying term $\mathcal{E}(t)$ to the equation. In this scenario, the differential equation has the same form as that of the forced damped harmonic oscillator, with the source acting as an external driving force. The results obtained for the forced damped harmonic oscillator therefore carry over to the driven RLC circuit. We can study, for example, the phenomenon of resonance, where the circuit exhibits a maximum response at a specific frequency of the AC source. This behavior has applications in various fields, including signal processing and communication systems. One of the key applications is signal amplification: a weak signal applied as the driving term produces a large response when its frequency is close to the resonant frequency of the circuit.
2.7 Maxwell’s equations, Electromagnetic Waves
2.7.1 Overview of Maxwell’s equations
Maxwell’s equations are a set of four differential equations that describe how electric and magnetic
fields interact. Named after the Scottish physicist James Clerk Maxwell, these equations form the
foundation of classical electrodynamics, optics, and electric circuits. They can also be used to
derive the wave equation for light, illustrating that light is a form of electromagnetic radiation.
The equations are typically written in the following form:
Gauss’s Law for Electricity:
$$\vec\nabla\cdot\vec E = \frac{\rho}{\varepsilon_0} \tag{2.7.1}$$
Gauss’s Law for Magnetism:
$$\vec\nabla\cdot\vec B = 0 \tag{2.7.2}$$
Faraday’s Law of Induction:
$$\vec\nabla\times\vec E = -\frac{\partial\vec B}{\partial t}. \tag{2.7.3}$$
Ampere’s Circuital Law with Maxwell’s Addition:
$$\vec\nabla\times\vec B = \mu_0\vec J + \mu_0\varepsilon_0\frac{\partial\vec E}{\partial t}. \tag{2.7.4}$$
Here, $\vec E$ and $\vec B$ are the electric and magnetic field vectors, $\rho$ is the electric charge density, $\vec J$ is the current density, $\varepsilon_0$ is the permittivity of free space, and $\mu_0$ is the permeability of free space.
Gauss’s Law for Electricity. Gauss’s law for electricity states that the electric flux through any closed surface is proportional to the total charge enclosed within the surface. Mathematically, this is represented by the equation:
$$\vec\nabla\cdot\vec E = \frac{\rho}{\varepsilon_0} \tag{2.7.5}$$
The left-hand side of this equation, $\vec\nabla\cdot\vec E$, represents the divergence of the electric field $\vec E$. The divergence is a measure of the ‘outgoingness’ of a vector field at a given point. In the context of an electric field, it measures how much electric field is emanating from a particular point in space. The right-hand side, $\frac{\rho}{\varepsilon_0}$, represents the charge density $\rho$ divided by the permittivity of free space $\varepsilon_0$. The permittivity of free space is a constant that characterizes the amount of electric field that can exist in a vacuum for a given electric charge.
Gauss’s Law for Magnetism. Gauss’s law for magnetism states that the net magnetic flux out of any closed surface is zero. This implies that there are no magnetic monopoles: magnetic field lines have no starting points or end points. Mathematically, this is represented as:
$$\vec\nabla\cdot\vec B = 0 \tag{2.7.6}$$
Here, $\vec B$ is the magnetic field, and $\vec\nabla\cdot\vec B$ is the divergence of $\vec B$. Since the divergence of the magnetic field is zero, magnetic field lines never terminate; they form continuous loops or extend to infinity.
Faraday’s Law of Induction. Faraday’s law of induction states that a changing magnetic field induces an electromotive force (EMF) in a closed loop of wire. This induced EMF is associated with an electric field, which satisfies the equation:
$$\vec\nabla\times\vec E = -\frac{\partial\vec B}{\partial t}. \tag{2.7.7}$$
The left-hand side, $\vec\nabla\times\vec E$, represents the curl of the electric field, which measures the ‘circulation’ or ‘rotationality’ of the field. The right-hand side, $-\frac{\partial\vec B}{\partial t}$, represents the rate of change of the magnetic field over time. The negative sign indicates that the induced EMF and, therefore, the induced electric field, oppose the change in the magnetic field, as stated by Lenz’s law.
Ampere’s Circuital Law with Maxwell’s Addition. Ampere’s circuital law with Maxwell’s addition, also known as the Ampère–Maxwell law, states that a magnetic field is produced by both the current density and the rate of change of the electric field. The law is represented as:
$$\vec\nabla\times\vec B = \mu_0\vec J + \mu_0\varepsilon_0\frac{\partial\vec E}{\partial t}. \tag{2.7.8}$$
The left-hand side, $\vec\nabla\times\vec B$, represents the curl of the magnetic field. This measures the ‘circulation’ or ‘rotationality’ of the field. On the right-hand side, the term $\mu_0\vec J$ represents the current density $\vec J$ multiplied by the permeability of free space $\mu_0$. The term $\mu_0\varepsilon_0\frac{\partial\vec E}{\partial t}$ represents the rate of change of the electric field over time, scaled by the constants $\mu_0$ and $\varepsilon_0$. The second term on the right-hand side is Maxwell’s addition, which accounts for the creation of a magnetic field due to a changing electric field. This term is what allows electromagnetic waves to propagate through space, as each changing field induces a change in the other, creating a self-sustaining wave.
2.7.2 Maxwell’s Addition
The original form of Ampère’s law, which relates the circulation of the magnetic field around a
closed loop to the electric current passing through the loop, worked well for static fields. However,
it failed to account for situations where the electric field changes with time. It was James Clerk
Maxwell who recognized this inconsistency and added a crucial term to Ampère’s law, now referred
to as “Maxwell’s Addition”.
Maxwell realized that a changing electric field produces a magnetic field, just as a changing
magnetic field produces an electric field (as described by Faraday’s law of induction). This was a
significant insight because it established symmetry between electric and magnetic fields.
Maxwell’s addition is stated mathematically as follows:
$$\vec\nabla\times\vec B = \mu_0\left(\vec J + \varepsilon_0\frac{\partial\vec E}{\partial t}\right). \tag{2.7.9}$$
The term $\varepsilon_0\frac{\partial\vec E}{\partial t}$ is Maxwell’s addition, where $\varepsilon_0$ is the permittivity of free space and $\frac{\partial\vec E}{\partial t}$ is the rate of change of the electric field. This term is known as the displacement current. It isn’t a current in the traditional sense of charges moving in a conductor; rather, it’s a “current” of changing electric field. It’s essential for the consistency of Ampère’s law with charge conservation.
The Necessity of Maxwell’s Addition: The Displacement Current
Consider a charging capacitor. There is a current $i$ inside the wires connected to the plates, but in the region outside the wire and between the plates, there is no conduction current. According to the original Ampère’s law, the line integral of the magnetic field $\vec B$ around the red loop $C$ in Figure 2.30 equals $\mu_0$ times the integral of the current density $\vec J$ over a surface $S$ whose boundary is $C$, i.e. $\partial S = C$,
$$\oint_C \vec B\cdot d\vec r = \mu_0\int_S \vec J\cdot d\vec A. \tag{2.7.10}$$
Now there is a paradox: we could choose the surface to be either the orange surface $S_{\rm orange}$ or the blue surface $S_{\rm blue}$, and we obtain different values of the surface integral,
$$\mu_0\int_{S_{\rm blue}} \vec J\cdot d\vec A = \mu_0 i \neq 0 = \mu_0\int_{S_{\rm orange}} \vec J\cdot d\vec A. \tag{2.7.11}$$
Figure 2.30: Charging a capacitor.
Maxwell explained this paradox by introducing the concept of displacement current. He argued that a changing electric field between the plates of the capacitor induces a “current”, not of moving charges but of the changing electric field itself. This displacement current, $\varepsilon_0\frac{\partial\vec E}{\partial t}$, is present in the region between the capacitor plates.
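Recall that the electric field between two conducting plates of area $A$ carrying charges $\pm q$, i.e. with surface charge density $\sigma = q/A$, is
$$E = \frac{\sigma}{\varepsilon_0} = \frac{q}{\varepsilon_0 A}. \tag{2.7.12}$$
Taking a time derivative and using $i = dq/dt$ gives
$$\mu_0 i = \mu_0\varepsilon_0 A\frac{dE}{dt} = \mu_0\varepsilon_0\frac{d}{dt}\int \vec E\cdot d\vec A, \tag{2.7.13}$$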
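where we have used the fact that the electric field is approximately uniform when the plates are sufficiently large. Now, we find
$$\int_{S_{\rm blue}} \left(\vec J + \varepsilon_0\frac{\partial\vec E}{\partial t}\right)\cdot d\vec A = \int_{S_{\rm orange}} \left(\vec J + \varepsilon_0\frac{\partial\vec E}{\partial t}\right)\cdot d\vec A. \tag{2.7.14}$$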
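The equality of the two sides can be illustrated numerically for an idealized parallel-plate capacitor charged by a constant current. The sketch below estimates $\varepsilon_0 A\,dE/dt$ by a finite difference, using $E = q/(\varepsilon_0 A)$ from (2.7.12), and compares it with the conduction current in the wire. The function name, the time step, and the numerical values are illustrative assumptions.

```python
import math

EPS_0 = 8.8541878128e-12  # permittivity of free space in F/m (CODATA 2018 value)


def displacement_current(conduction_current, plate_area, dt=1e-6):
    """Finite-difference estimate of eps_0 * A * dE/dt between the plates.

    Assumes a uniform field E = q / (eps_0 * A) and a constant charging current.
    """
    q_before = 0.0
    q_after = q_before + conduction_current * dt   # charge added during dt
    e_before = q_before / (EPS_0 * plate_area)
    e_after = q_after / (EPS_0 * plate_area)
    return EPS_0 * plate_area * (e_after - e_before) / dt


i_wire = 2.0  # ampere, illustrative
i_displacement = displacement_current(i_wire, plate_area=0.01)
print(i_wire, i_displacement)
```

The plate area cancels, so the displacement current through the surface between the plates matches the conduction current through the surface cut by the wire.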
The integral form of Ampère’s Law with Maxwell’s Addition is
$$\oint_C \vec B\cdot d\vec r = \mu_0\int_S \left(\vec J + \varepsilon_0\frac{\partial\vec E}{\partial t}\right)\cdot d\vec A, \tag{2.7.15}$$
which is independent of the choice of the surface $S$ as long as the boundary of $S$ is the curve $C$, i.e. $\partial S = C$. Using Stokes’ theorem, we have
$$\oint_C \vec B\cdot d\vec r = \int_S (\vec\nabla\times\vec B)\cdot d\vec A. \tag{2.7.16}$$
Taking $S$ to be infinitesimal, we arrive at the differential form of Ampère’s Law with Maxwell’s Addition,
$$\vec\nabla\times\vec B = \mu_0\vec J + \mu_0\varepsilon_0\frac{\partial\vec E}{\partial t}. \tag{2.7.17}$$
Consistency with charge conservation
Maxwell’s addition to Ampère’s law is also crucial for the conservation of electric charge. Consider an electric charge distribution of density $\rho(\vec r)$ inside a region $\mathcal{B}$, with charges flowing out of $\mathcal{B}$ with current density $\vec J(\vec r)$ through the boundary $S = \partial\mathcal{B}$. The conservation of electric charge gives the equation
$$\int_S \vec J(\vec r)\cdot d\vec A = -\frac{d}{dt}\int_{\mathcal{B}} \rho(\vec r)\,d^3x. \tag{2.7.18}$$
By the divergence theorem, we have
$$\int_S \vec J(\vec r)\cdot d\vec A = \int_{\mathcal{B}} \vec\nabla\cdot\vec J(\vec r)\,d^3x. \tag{2.7.19}$$
Taking $\mathcal{B}$ to be infinitesimal, we arrive at the charge conservation equation
$$\vec\nabla\cdot\vec J = -\frac{\partial\rho}{\partial t}. \tag{2.7.20}$$
On the other hand, in the homework problem, you have found the identity
$$\vec\nabla\cdot(\vec\nabla\times\vec B) = 0. \tag{2.7.21}$$
Using Ampère’s Law with Maxwell’s Addition and Gauss’s law, we recover the charge conservation equation
$$0 = \vec\nabla\cdot(\vec\nabla\times\vec B) = \mu_0\vec\nabla\cdot\vec J + \mu_0\varepsilon_0\vec\nabla\cdot\frac{\partial\vec E}{\partial t} = \mu_0\vec\nabla\cdot\vec J + \mu_0\frac{\partial\rho}{\partial t}. \tag{2.7.22}$$
We note that Maxwell’s Addition is crucial for the above derivation. Without Maxwell’s Addition, Ampère’s Law would be inconsistent with charge conservation.
2.7.3 Relativistic formulations of Maxwell equations
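Maxwell’s equations can be written in the index form as
$$\begin{aligned}
\partial_i E_i &= \frac{1}{\varepsilon_0}\rho\,, \\
\partial_i B_i &= 0\,, \\
\epsilon_{ijk}\partial_j E_k &= -\frac{\partial}{\partial t}B_i\,, \\
\epsilon_{ijk}\partial_j B_k &= \mu_0 J_i + \mu_0\varepsilon_0\frac{\partial}{\partial t}E_i\,,
\end{aligned} \tag{2.7.23}$$
where $i, j, k = 1, 2, 3$ and $\partial_i \equiv \frac{\partial}{\partial x^i}$. We have used the Einstein summation convention, where repeated indices are summed over. We do not distinguish the upper and the lower $i, j$ indices, i.e. $V^i = V_i$. $\epsilon_{ijk}$ is the rank-3 Levi-Civita symbol, a totally anti-symmetric tensor with $\epsilon_{123} = 1$. The other components of $\epsilon_{ijk}$ are determined by the total antisymmetry.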
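Now, we define
$$F^{ij} \equiv \frac{1}{\mu_0}\epsilon_{ijk}B_k\,,\quad F^{0i} \equiv c\varepsilon_0 E_i\,,\quad F^{i0} \equiv -c\varepsilon_0 E_i\,,\quad F^{00} \equiv 0\,,\quad J^0 \equiv c\rho\,, \tag{2.7.24}$$
where $c$ is a constant that will be determined later.
Exercise. Show that $B_i = \frac{\mu_0}{2}\epsilon_{ijk}F^{jk}$.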
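The electric and magnetic fields can be assembled into a rank-2 anti-symmetric tensor
$$F^{\mu\nu} = -F^{\nu\mu}\,, \tag{2.7.25}$$
where we have combined $0$ and $i$ to form a new index $\mu = 0, 1, 2, 3$. Written in matrix form, we have
$$F^{\mu\nu} = \begin{pmatrix}
0 & c\varepsilon_0 E_1 & c\varepsilon_0 E_2 & c\varepsilon_0 E_3 \\
-c\varepsilon_0 E_1 & 0 & \frac{1}{\mu_0}B_3 & -\frac{1}{\mu_0}B_2 \\
-c\varepsilon_0 E_2 & -\frac{1}{\mu_0}B_3 & 0 & \frac{1}{\mu_0}B_1 \\
-c\varepsilon_0 E_3 & \frac{1}{\mu_0}B_2 & -\frac{1}{\mu_0}B_1 & 0
\end{pmatrix}. \tag{2.7.26}$$
$F^{\mu\nu}$ is called the field strength.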
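The definitions (2.7.24) can be turned into a short construction of the matrix (2.7.26). The sketch below uses plain Python lists, builds $F^{\mu\nu}$ from given $\vec E$ and $\vec B$, and recovers $\vec B$ through $B_i = \frac{\mu_0}{2}\epsilon_{ijk}F^{jk}$ as in the exercise. The function names, the numerical constants, and the sample field values are illustrative assumptions; $c$ is set to $1/\sqrt{\mu_0\varepsilon_0}$, the value the notes fix later in (2.7.33).

```python
import math

MU_0 = 4e-7 * math.pi        # H/m, classical value (assumed)
EPS_0 = 8.8541878128e-12     # F/m, CODATA 2018 value
C = 1.0 / math.sqrt(MU_0 * EPS_0)


def levi_civita(i, j, k):
    """Rank-3 Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def field_tensor(E, B):
    """Build F^{mu nu} as a 4x4 nested list from 3-component E and B."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = C * EPS_0 * E[i]
        F[i + 1][0] = -C * EPS_0 * E[i]
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k] for k in range(3)) / MU_0
    return F


def magnetic_field_from_tensor(F):
    """Recover B_i = (mu_0 / 2) * eps_ijk * F^{jk}."""
    return [0.5 * MU_0 * sum(levi_civita(i, j, k) * F[j + 1][k + 1]
                             for j in range(3) for k in range(3))
            for i in range(3)]


F = field_tensor(E=[1.0, 2.0, 3.0], B=[4e-3, 5e-3, 6e-3])
print(magnetic_field_from_tensor(F))
```

The spatial block reproduces the pattern of (2.7.26), for example `F[1][2]` equals $B_3/\mu_0$.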
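Now, let us rewrite Maxwell’s equations in terms of $F^{\mu\nu}$. First, let us define
$$x^0 \equiv ct\,,\qquad \partial_0 \equiv \frac{\partial}{\partial x^0} = \frac{1}{c}\frac{\partial}{\partial t}\,. \tag{2.7.27}$$
We need to stress here that the upper and lower 0-indices are different, i.e. $V^0 \neq V_0$, and we will determine how they are related later. Maxwell’s equations now become
$$\begin{aligned}
\partial_j F^{0j} &= J^0\,, \\
\partial_j F^{ij} &= J^i + \partial_0 F^{0i}\,, \\
\epsilon_{ijk}\partial^i F^{jk} &= 0\,, \\
\epsilon_{ijk}\partial^j F^{0k} &= -\frac{c^2\mu_0\varepsilon_0}{2}\epsilon_{ijk}\partial_0 F^{jk}\,.
\end{aligned} \tag{2.7.28}$$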
The first two equations can be nicely written as
$$\partial_\nu F^{\mu\nu} = J^\mu\,. \tag{2.7.29}$$
To simplify the last two equations, let us examine the equation
$$\epsilon_{\sigma\mu\nu\rho}\partial^\mu F^{\nu\rho} = 0\,, \tag{2.7.30}$$
where $\epsilon_{\sigma\mu\nu\rho}$ is the rank-4 Levi-Civita symbol, which is a totally anti-symmetric tensor with $\epsilon_{0123} = 1$, and in particular, $\epsilon_{0ijk} = \epsilon_{ijk}$. When $\sigma = 0$, the equation (2.7.30) becomes
$$\epsilon_{ijk}\partial^i F^{jk} = 0\,, \tag{2.7.31}$$
which is the same as the third equation in (2.7.28). When $\sigma = i$, the equation (2.7.30) becomes
$$\epsilon_{ijk}\partial^j F^{0k} = \frac{1}{2}\epsilon_{ijk}\partial^0 F^{jk}\,. \tag{2.7.32}$$
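To match with the fourth equation in (2.7.28), we first set
$$c = \frac{1}{\sqrt{\mu_0\varepsilon_0}}\,. \tag{2.7.33}$$
Next, we define
$$x_0 \equiv -x^0\,,\qquad \partial^0 \equiv \frac{\partial}{\partial x_0} = -\frac{\partial}{\partial x^0} = -\partial_0\,. \tag{2.7.34}$$
With these definitions, we see that (2.7.32) and the fourth equation of (2.7.28) become the same. In summary, Maxwell’s equations become (2.7.29) and (2.7.30).
The equation (2.7.30) can be solved by
$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu\,, \tag{2.7.35}$$
for an arbitrary vector field $A^\mu$. We check that
$$\epsilon_{\sigma\mu\nu\rho}\partial^\mu(\partial^\nu A^\rho - \partial^\rho A^\nu) = \epsilon_{\sigma\mu\nu\rho}\partial^\mu\partial^\nu A^\rho - \epsilon_{\sigma\mu\nu\rho}\partial^\mu\partial^\rho A^\nu = 0\,, \tag{2.7.36}$$
where we have used the fact that partial derivatives commute, i.e. $\partial^\mu\partial^\nu A^\rho = \partial^\nu\partial^\mu A^\rho$, while $\epsilon_{\sigma\mu\nu\rho}$ is anti-symmetric. The vector field $A^\mu$ is called a gauge potential.
We note that the gauge potential is not entirely physical. Consider two gauge potentials that differ by a total derivative as
$$A'^\mu = A^\mu + \partial^\mu\lambda\,. \tag{2.7.37}$$
We have
$$F'^{\mu\nu} = \partial^\mu A'^\nu - \partial^\nu A'^\mu = \partial^\mu A^\nu + \partial^\mu\partial^\nu\lambda - \partial^\nu A^\mu - \partial^\nu\partial^\mu\lambda = \partial^\mu A^\nu - \partial^\nu A^\mu = F^{\mu\nu}\,. \tag{2.7.38}$$
(2.7.37) is called a gauge transformation or a gauge ambiguity. The gauge ambiguity can be reduced by imposing gauge conditions. A commonly used gauge condition is the Lorentz gauge
$$\partial_\mu A^\mu = 0\,, \tag{2.7.39}$$
which actually eliminates only part of the gauge ambiguity (2.7.37).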
2.7.4 Electromagnetic wave
We have seen that Maxwell’s equations (2.7.23) simplify to (2.7.29) and (2.7.30) after treating time $t$ as the fourth coordinate $x^0 = ct$. An important question remains: what is the constant $c$?
Exercise. Show that the unit of $c$ is m/s from the units of $\mu_0$ and $\varepsilon_0$. Hence, $x^0$ has the unit m, so it makes sense to combine $x^0$ with $x^i$ into $x^\mu$.
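Therefore, $c$ should be the velocity of something. We show that the “something” is the electromagnetic wave. Let us plug (2.7.35) into (2.7.29),
$$\partial_\nu\partial^\mu A^\nu - \partial_\nu\partial^\nu A^\mu = J^\mu\,. \tag{2.7.40}$$
Using the Lorentz gauge (2.7.39), we find
$$-\partial_\nu\partial^\nu A^\mu = J^\mu\,. \tag{2.7.41}$$
For simplicity, we consider the case without any charge and current. We have
$$\left[\left(\frac{\partial}{\partial x^0}\right)^2 - \left(\frac{\partial}{\partial x^1}\right)^2 - \left(\frac{\partial}{\partial x^2}\right)^2 - \left(\frac{\partial}{\partial x^3}\right)^2\right]A^\mu = 0\,. \tag{2.7.42}$$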
This equation is a wave equation that describes electromagnetic waves. The equation can be solved by the ansatz
$$A^\mu = \varepsilon^\mu\cos(k_\mu x^\mu + \phi)\,, \tag{2.7.43}$$
where $k_\mu$ is called the momentum and $\varepsilon^\mu$ is called the polarization of the electromagnetic wave. The wave equation (2.7.42) and the Lorentz gauge (2.7.39) imply
$$k_\mu k^\mu = k_\mu\varepsilon^\mu = 0\,. \tag{2.7.44}$$
For simplicity, we choose
$$k^\mu = (k, 0, 0, k)\,,\qquad \varepsilon^\mu = (0, 1, 0, 0)\,. \tag{2.7.45}$$
The nonzero component of the gauge potential is
$$A^1 = \cos(k(x^0 - x^3) + \phi)\,. \tag{2.7.46}$$
The nonzero components of the field strength are
$$\begin{aligned}
F^{01} &= -\partial_0 A^1 - \partial_1 A^0 = k\sin(k(x^0 - x^3) + \phi)\,, \\
F^{13} &= \partial_1 A^3 - \partial_3 A^1 = -k\sin(k(x^0 - x^3) + \phi)\,.
\end{aligned} \tag{2.7.47}$$
We find the nonzero components of the electric and magnetic fields
$$E_1 = \sqrt{\frac{\mu_0}{\varepsilon_0}}\,k\sin(k(x^0 - x^3) + \phi)\,,\qquad B_2 = \mu_0 k\sin(k(x^0 - x^3) + \phi)\,. \tag{2.7.48}$$
Now, using $x^0 = ct$, we find
$$E_1 = \sqrt{\frac{\mu_0}{\varepsilon_0}}\,k\sin(k(ct - x^3) + \phi)\,,\qquad B_2 = \mu_0 k\sin(k(ct - x^3) + \phi)\,. \tag{2.7.49}$$
We see that the wavefronts of both the electric and magnetic fields have speed $c$.
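To put a number on $c$, the sketch below evaluates $1/\sqrt{\mu_0\varepsilon_0}$ from Eq. (2.7.33) and also the amplitude ratio $E_1/B_2$ read off from (2.7.49). The numerical constants (classical $\mu_0 = 4\pi\times10^{-7}$ H/m and the CODATA 2018 value of $\varepsilon_0$) are assumptions brought in for illustration.

```python
import math

MU_0 = 4e-7 * math.pi        # H/m, classical value (assumed)
EPS_0 = 8.8541878128e-12     # F/m, CODATA 2018 value

# Speed of the wavefronts, Eq. (2.7.33).
wave_speed = 1.0 / math.sqrt(MU_0 * EPS_0)

# Ratio of the field amplitudes in Eq. (2.7.49): sqrt(mu_0/eps_0) * k / (mu_0 * k).
amplitude_ratio = math.sqrt(MU_0 / EPS_0) / MU_0

print(wave_speed)        # close to 2.998e8 m/s, the speed of light
print(amplitude_ratio)   # the same number: E_1 / B_2 = c
```

The second line shows that for this plane wave the electric and magnetic amplitudes are locked together by $E_1 = cB_2$.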
2.7.5 Symmetry of Maxwell’s equations
We have seen before in many examples that our laws of physics are invariant under Galilean
transformations. Are Maxwell’s equations also invariant under Galilean transformations?
First, Maxwell’s equations are invariant under rotations $R^i{}_j \in O(3)$,
$$x'^i = R^i{}_j x^j\,,\qquad \partial'_j = \frac{\partial}{\partial x'^j} = \frac{\partial x^i}{\partial x'^j}\frac{\partial}{\partial x^i} = (R^{-1})^i{}_j\,\partial_i = (R^T)^i{}_j\,\partial_i\,. \tag{2.7.50}$$
Exercise. Show that $\epsilon_{ijk}R^i{}_l R^j{}_m R^k{}_n = \epsilon_{lmn}\det(R)$.
The field strengths and the current transform as
$$F'^{0i} = R^i{}_j F^{0j}\,,\qquad F'^{ij} = R^i{}_k R^j{}_l F^{kl}\,,\qquad J'^i = R^i{}_j J^j\,,\qquad J'^0 = J^0\,. \tag{2.7.51}$$
It is now easy to see that Maxwell’s equations (2.7.28) are invariant under the O(3) transformations.
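However, Maxwell’s equations are not invariant under the Galilean transformations,
$$x'^i = x^i + \frac{v^i}{c}x^0\,,\qquad x'^0 = x^0\,. \tag{2.7.52}$$
The derivatives transform as
$$\begin{aligned}
\frac{\partial}{\partial x'^i} &= \frac{\partial x^j}{\partial x'^i}\frac{\partial}{\partial x^j} + \frac{\partial x^0}{\partial x'^i}\frac{\partial}{\partial x^0} = \frac{\partial}{\partial x^i}\,, \\
\frac{\partial}{\partial x'^0} &= \frac{\partial x^j}{\partial x'^0}\frac{\partial}{\partial x^j} + \frac{\partial x^0}{\partial x'^0}\frac{\partial}{\partial x^0} = -\frac{v^i}{c}\frac{\partial}{\partial x^i} + \frac{\partial}{\partial x^0}\,.
\end{aligned} \tag{2.7.53}$$
There is no good way for the field strengths $F^{ij}$ and $F^{0i}$ to transform and make Maxwell’s equations (2.7.28) invariant.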
Instead of Galilean transformations, Maxwell’s equations are invariant under Lorentz transformations. Let us see how Lorentz transformations come about. In the previous sections, we used the speed $c$ of the electromagnetic wave to turn time $t$ into $x^0 = ct$, and combined it with $x^i$ to form $x^\mu$. It is natural to treat $x^\mu$ as a 4-vector. We have upper $\mu$ and lower $\mu$ indices. They are related by
$$x^\mu = \eta^{\mu\nu}x_\nu\,,\qquad x_\mu = \eta_{\mu\nu}x^\nu\,, \tag{2.7.54}$$
where $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$ are $4\times4$ matrices,
$$\eta^{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \eta_{\mu\nu}\,. \tag{2.7.55}$$
The norm of the 4-vector $x^\mu$ is defined by
$$|x|^2 \equiv x_\mu x^\mu = \eta_{\mu\nu}x^\mu x^\nu\,. \tag{2.7.56}$$
Recall that the O(3) transformations of the 3-vector $x^i$ leave the norm of $x^i$ invariant. Analogously, we consider transformations $x'^\mu = \Lambda^\mu{}_\rho x^\rho$ that leave the norm (2.7.56) of the 4-vector invariant as
$$\eta_{\mu\nu}x'^\mu x'^\nu = \eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma x^\rho x^\sigma = \eta_{\mu\nu}x^\mu x^\nu\,. \tag{2.7.57}$$
The transformation matrix $\Lambda^\mu{}_\rho$ must satisfy
$$\eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = \eta_{\rho\sigma}\,. \tag{2.7.58}$$
In matrix notation, we have
$$\Lambda^T\eta\Lambda = \eta\,. \tag{2.7.59}$$
Such matrices form a group called O(1, 3). The group O(3) is a subgroup of O(1, 3). These transformations are called Lorentz transformations.
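For simplicity, let us consider the transformations that leave $x^2$ and $x^3$ invariant. They take the form
$$\Lambda = \begin{pmatrix} \cosh\zeta & \sinh\zeta & 0 & 0 \\ \sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{2.7.60}$$
We find that $x^0$ and $x^1$ transform as
$$x'^0 = x^0\cosh\zeta + x^1\sinh\zeta\,,\qquad x'^1 = x^0\sinh\zeta + x^1\cosh\zeta\,. \tag{2.7.61}$$
It can be written using $x = x^1$ and $t = x^0/c$ as
$$t' = \gamma\left(t + \frac{v}{c^2}x\right),\qquad x' = \gamma(x + vt)\,, \tag{2.7.62}$$
where $\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$, and we have changed the variable $\zeta$ to $v$ by
$$\cosh\zeta = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\,, \tag{2.7.63}$$
so that $\sinh\zeta = \gamma v/c$.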
Now, we can consider the limit $c \gg v$, and the Lorentz transformation reduces to the Galilean transformation
$$t' = t\,,\qquad x' = x + vt\,. \tag{2.7.64}$$
We see that the parameter v in the Lorentz transformation becomes the velocity in the Galilean
transformation. The Galilean transformation can be regarded as the low-velocity limit of the
Lorentz transformation.
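Both statements, that the boost (2.7.60) satisfies $\Lambda^T\eta\Lambda = \eta$ and that (2.7.62) reduces to the Galilean form for $v \ll c$, can be checked with a few lines of code. The sketch below uses plain nested lists; all function names and sample numbers are illustrative assumptions.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost_matrix(v, c):
    """Boost (2.7.60) with cosh(zeta) = gamma and sinh(zeta) = gamma * v / c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    ch, sh = gamma, gamma * v / c
    return [[ch, sh, 0.0, 0.0],
            [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def preserves_metric(lam, tol=1e-9):
    """Check the defining condition Lambda^T eta Lambda = eta, Eq. (2.7.59)."""
    m = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(m[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))


def lorentz_transform(t, x, v, c):
    """Eq. (2.7.62): returns (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t + v * x / c ** 2), gamma * (x + v * t)


print(preserves_metric(boost_matrix(0.6, 1.0)))
# Everyday speed (30 m/s): the result is indistinguishable from t' = t, x' = x + v t.
print(lorentz_transform(1.0, 100.0, 30.0, 3.0e8))
```

At $v = 0.6c$ the deviation from the Galilean result is large, while at 30 m/s it sits far below everyday measurement precision.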
The Lorentz transformation has profound physical implications. Although Lorentz first found the Lorentz transformation, it was Einstein who first explained its meaning to the rest of the world. This subject is now called Einstein’s special relativity.
In the following two sections, we will introduce two of the most important implications of the
Lorentz transformations: 1. Time Dilation, 2. Length Contraction.
2.7.6 Time Dilation
Time dilation, as described by special relativity, posits that an observer in one inertial frame will
perceive time to be passing more slowly in another frame that is moving relative to the first. This
is a counterintuitive phenomenon that runs counter to our everyday experiences but is a natural
consequence of the Lorentz transformations and the constant speed of light.
Let’s consider two observers, Alice who is stationary and Bob who is moving at a velocity v
relative to Alice. Suppose Bob carries a light clock with him, which measures time by bouncing a
beam of light between two mirrors.
Now, from Bob’s perspective, the light in the clock travels a distance of 2d (where d is the
distance between the mirrors) in a time interval ∆t′ (one tick of the clock). Using the invariant
speed of light, c, we have
\[
2d = c\,\Delta t' .
\tag{2.7.65}
\]
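However, from the perspective of Alice, who sees Bob moving, the light beam in Bob’s clock follows
a diagonal path. Here the mirrors are separated perpendicular to the direction of motion, so the
separation d is the same for both observers. Unfolding the round trip, the path of the light is the
hypotenuse of a right triangle with vertical side 2d and horizontal side v∆t. Since light also
travels at speed c in Alice’s frame, the hypotenuse has length c∆t, and Pythagoras’ theorem gives:
\[
c^2 \Delta t^2 = (2d)^2 + (v\Delta t)^2 .
\tag{2.7.66}
\]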
Here, ∆t is the time interval as measured by Alice for one tick of Bob’s clock. Substituting 2d = c∆t′
into the above equation, we get:
\[
c^2 \Delta t^2 = c^2 \Delta t'^2 + v^2 \Delta t^2 .
\tag{2.7.67}
\]
Solving for ∆t, we find:
\[
\Delta t = \gamma\,\Delta t' ,
\tag{2.7.68}
\]
where $\gamma = 1/\sqrt{1 - v^2/c^2}$ is the Lorentz factor. This is the time dilation formula,
indicating that Alice observes Bob’s clock to tick more slowly by a factor of γ.
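To make the light-clock argument concrete, here is a small sketch that computes one tick in both frames and checks the Pythagorean relation numerically. The mirror separation of 1 m and the speed v = 0.8c are assumed, illustrative values, and `lorentz_factor` and `light_clock_tick` are hypothetical helper names.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v, c=C):
    """gamma = 1 / sqrt(1 - v^2 / c^2), defined for |v| < c."""
    if abs(v) >= c:
        raise ValueError("speed must be below c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def light_clock_tick(d, v, c=C):
    """Return (tick in Bob's frame, tick in Alice's frame) for mirror gap d."""
    dt_bob = 2.0 * d / c                      # 2d = c * dt'
    dt_alice = lorentz_factor(v, c) * dt_bob  # dt = gamma * dt'
    return dt_bob, dt_alice


d = 1.0          # assumed mirror separation in metres
v = 0.8 * C      # assumed relative speed
dt_bob, dt_alice = light_clock_tick(d, v)

# Check c^2 dt^2 = (2d)^2 + (v dt)^2 in Alice's frame.
lhs = (C * dt_alice) ** 2
rhs = (2.0 * d) ** 2 + (v * dt_alice) ** 2

print("tick in Bob's frame   :", dt_bob, "s")
print("tick in Alice's frame :", dt_alice, "s")
print("ratio (gamma)         :", dt_alice / dt_bob)
print("Pythagoras holds      :", math.isclose(lhs, rhs, rel_tol=1e-9))
```

For v = 0.8c the ratio of the two ticks is γ = 1/0.6, so Alice assigns a noticeably longer duration to each tick of Bob's clock.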
It’s important to note that this time dilation effect is not due to any mechanical or optical flaws
in the clock, but is a genuine effect of relative motion. We can also directly derive the time dilation
(2.7.68) from the Lorentz transformation. Let’s begin with the Lorentz transformation:
\[
\begin{aligned}
t' &= \gamma\left(t + \frac{vx}{c^2}\right) , \\
x' &= \gamma\,(x + vt) .
\end{aligned}
\tag{2.7.69}
\]
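The time intervals ∆t′, ∆t and space intervals ∆x′, ∆x then satisfy
\[
\begin{aligned}
\Delta t' &= \gamma\left(\Delta t + \frac{v}{c^2}\,\Delta x\right) , \\
\Delta x' &= \gamma\,(\Delta x + v\Delta t) .
\end{aligned}
\tag{2.7.70}
\]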
Here:
• ∆t′ and ∆x′ are the time and space intervals measured in the moving frame (Bob’s frame),
respectively.
• ∆t and ∆x are the time and space intervals measured in the stationary frame (Alice’s frame),
respectively.
• v is the relative velocity between the two frames.
Now, let’s imagine two successive events (such as two ticks of a clock) that both happen at
the origin of the moving frame (Bob’s frame). In this case, ∆x′ = 0 for this pair of events, so the
second equation of the Lorentz transformation gives ∆x = −v∆t. Substituting this into the
Lorentz transformation for time gives ∆t′ = γ(1 − v²/c²)∆t = ∆t/γ, that is:
\[
\Delta t = \gamma\,\Delta t' .
\tag{2.7.71}
\]
This equation states that the time interval between the two events, as measured in the stationary
frame (Alice’s frame), is γ times the time interval as measured in the moving frame (Bob’s frame),
in which the clock is at rest. This means Alice measures the time interval to be longer than Bob
does by a factor of γ, which is the essence of time dilation.
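The same result can be checked by transforming two explicit events. The sketch below works in units with c = 1 and uses an assumed speed v = 0.6; the clock sits at Bob's origin x′ = 0, which with the sign convention of the transformation above corresponds to x = −vt in Alice's frame. The function name `lorentz` is a hypothetical helper.

```python
import math


def lorentz(t, x, v, c=1.0):
    """Return (t', x') with t' = gamma (t + v x / c^2), x' = gamma (x + v t)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t + v * x / c**2), gamma * (x + v * t)


v = 0.6                       # assumed speed in units of c
gamma = 1.0 / math.sqrt(1.0 - v**2)

# Two ticks of a clock at Bob's origin, described in Alice's coordinates.
t1, t2 = 0.0, 5.0
x1, x2 = -v * t1, -v * t2     # x' = 0 corresponds to x = -v t

t1p, x1p = lorentz(t1, x1, v)
t2p, x2p = lorentz(t2, x2, v)

dt = t2 - t1                  # interval measured by Alice
dtp = t2p - t1p               # interval measured by Bob

print("x' of both ticks:", x1p, x2p)   # both at Bob's origin
print("dt, dt', gamma  :", dt, dtp, gamma)
print("dt == gamma*dt' :", math.isclose(dt, gamma * dtp))
```

Both events land at x′ = 0, and the interval in Alice's frame comes out γ times the interval in Bob's frame.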
Time dilation has been experimentally confirmed in numerous tests, such as time-dilated decay
of muons in cosmic rays and precision measurements using atomic clocks on board GPS satellites.
Time dilation, as such, highlights the flexible nature of time under special relativity - a stark
contrast to our everyday perception of time as a constant, unchanging entity.
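The muon example can be turned into a rough estimate. The sketch below assumes muons produced at an altitude of 15 km travelling straight down at 0.998c; both numbers are illustrative textbook-style values, not taken from these notes. It uses the muon mean lifetime of about 2.2 μs, ignores energy loss in the atmosphere, and the function name `survival_fraction` is hypothetical.

```python
import math

C = 299_792_458.0        # speed of light in m/s
TAU_MUON = 2.197e-6      # muon mean lifetime at rest, in seconds


def survival_fraction(distance, v, tau=TAU_MUON, dilated=True):
    """Fraction of muons surviving a flight of given length at speed v.

    With dilated=True the lifetime seen from the ground is gamma * tau;
    with dilated=False time dilation is (incorrectly) ignored.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2) if dilated else 1.0
    flight_time = distance / v          # measured in the ground frame
    return math.exp(-flight_time / (gamma * tau))


altitude = 15_000.0      # assumed production altitude in metres
v = 0.998 * C            # assumed muon speed

with_dilation = survival_fraction(altitude, v, dilated=True)
without_dilation = survival_fraction(altitude, v, dilated=False)

print("surviving fraction with time dilation   :", with_dilation)
print("surviving fraction without time dilation:", without_dilation)
```

Under these assumptions a sizeable fraction of muons reaches the ground once time dilation is included, whereas almost none would without it.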
2.7.7 Length Contraction
Length contraction, much like time dilation, is a crucial and counterintuitive prediction of special
relativity. According to this phenomenon, the length of an object in its direction of motion is
observed to be shorter when viewed from a frame that is in motion relative to the object. This can
be derived from the Lorentz transformations.
Consider an object at rest in the frame of observer Bob. The length L′ of the object as
measured in Bob’s frame is given by the difference between the coordinates of its endpoints, x′1 and
x′2 : L′ = x′2 − x′1 = ∆x′ .
From the perspective of another observer, Alice, who is moving at a speed v relative to Bob,
the coordinates of the endpoints of the object are transformed by the Lorentz transformations as
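follows:
\[
\begin{aligned}
\Delta t' &= \gamma\left(\Delta t + \frac{v}{c^2}\,\Delta x\right) , \\
\Delta x' &= \gamma\,(\Delta x + v\Delta t) ,
\end{aligned}
\tag{2.7.72}
\]
where x denotes coordinates in Alice’s frame, $\gamma = 1/\sqrt{1 - v^2/c^2}$ is the Lorentz factor,
and t′ is the time as observed by Bob.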
However, Alice measures the length of the object at a particular time, so for her, the times at
the two ends of the object are the same, i.e., ∆t = 0. Substituting this into the equations, we have:
\[
\begin{aligned}
\Delta t' &= \gamma\,\frac{v}{c^2}\,\Delta x , \\
\Delta x' &= \gamma\,\Delta x .
\end{aligned}
\tag{2.7.73}
\]
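The length L = ∆x of the object as observed by Alice is then given by $L = \Delta x = \gamma^{-1}\Delta x' = \gamma^{-1}L'$. Since γ ≥ 1, we have L ≤ L′: Alice measures the moving object to be shorter than its length in Bob’s frame, where it is at rest.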
This result is counterintuitive because it suggests that an object’s length can change depending
on the observer’s state of motion, which contradicts our everyday experiences. However, it’s important to note that length contraction, like time dilation, is a real effect that has been confirmed
in numerous experiments, and it is essential for maintaining the consistency of the laws of physics
in all inertial frames as required by the principle of relativity.
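A short numerical illustration of L = L′/γ may help here. The rest length of 10 m and the list of speeds are assumed values chosen for illustration, and `lorentz_factor` and `contracted_length` are hypothetical helper names.

```python
import math


def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2), with beta = v / c and |beta| < 1."""
    if abs(beta) >= 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta**2)


def contracted_length(rest_length, beta):
    """Length measured in a frame moving at beta relative to the object."""
    return rest_length / lorentz_factor(beta)


rest_length = 10.0  # assumed length L' in Bob's frame, in metres

for beta in (0.1, 0.6, 0.9, 0.99):
    gamma = lorentz_factor(beta)
    length = contracted_length(rest_length, beta)
    print(f"v = {beta:4.2f} c   gamma = {gamma:7.4f}   L = {length:7.4f} m")

# Consistency with the transformation at equal times in Alice's frame
# (dt = 0): dx' = gamma * dx must reproduce the rest length.
beta = 0.6
dx = contracted_length(rest_length, beta)
dx_prime = lorentz_factor(beta) * dx
print("recovered rest length:", dx_prime)
```

The contraction is tiny at v = 0.1c and becomes large only as v approaches c, which is why it is not noticed in everyday experience.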