PHYS 1150 - Problem Solving in Physics
Yip Man Kit
Rm 415B, CYM Physics Building
mankit@hku.hk
January 2023
Contents
1 Vectors
    1.1 Properties of Vectors
    1.2 Examples of Vectors and Scalars
    1.3 Adding Vectors
    1.4 The Components of Vectors
    1.5 Subtracting Vectors
    1.6 Position Vector and its Time Derivatives
    1.7 Reference Frames
    1.8 Scalar Products of Vectors
    1.9 Applications of Scalar Products
    1.10 Cross Product of Vectors
    1.11 Triple Products
    1.12 Applications of Cross Product
        1.12.1 Torque
        1.12.2 Magnetic Force
2 Differentiation
    2.1 Basic Ideas and the Extremum
    2.2 Derivatives of Physical Quantities
    2.3 Centripetal Acceleration
    2.4 Seeking the Extremum
    2.5 Case Study on Projectile Motion
    2.6 A Revisit to Newton's Second Law
    2.7 Electric Potential and Electric Field
    2.8 L'Hôpital's Rule
    2.9 Taylor's Series
    2.10 Newton's Method
    2.11 Useful Differentiation Formulae
    2.12 Appendix: Method of Bisection
3 Integration
    3.1 Indefinite Integration
        3.1.1 Integration by Substitution
        3.1.2 Integration using Partial Fraction
        3.1.3 Integration by Parts
    3.2 Definite Integration
        3.2.1 Fundamental Theorem of Calculus
        3.2.2 Integration using Reduction Formula
    3.3 Impulse
    3.4 Center of Mass
    3.5 Work Done by a Force
    3.6 Energy Stored in a Spring
    3.7 Electric Field due to a Charged Wire
    3.8 The Length of a Curve
    3.9 Area under a Curve
    3.10 Moment of Inertia
    3.11 The Dog-And-Rabbit Chase Problem
    3.12 Numerical Integration
        3.12.1 Trapezoidal Rule
        3.12.2 Simpson's Rule
    3.13 Useful Integration Formulae
4 Ordinary Differential Equations
    4.1 Separation of Variables
    4.2 Simple Harmonic Motion
    4.3 Free Fall with Air Resistance
    4.4 Radioactive Decay
    4.5 Charging a Capacitor
    4.6 Parabolic Mirror
    4.7 Torricelli's Law of Draining
    4.8 First Order Linear Differential Equation
    4.9 Second Order Homogeneous Differential Equations
5 Trigonometry and Complex Numbers
    5.1 Compound Angle Formulae
    5.2 Complex Numbers
    5.3 Complex Plane
    5.4 De Moivre's Theorem
    5.5 Euler's Formula
    5.6 A Revisit to Simple Harmonic Motion
    5.7 Particle in a Box
6 Partial Differentiation
    6.1 Partial Derivative
    6.2 Geometrical Meaning of Partial Derivatives
    6.3 Polar Coordinates
    6.4 Polar Coordinates and the Length of a Curve
    6.5 Cartesian Coordinates
    6.6 Cylindrical Coordinates
    6.7 Spherical Coordinates
    6.8 A Revisit to Electric Field and Electric Potential
7 Matrix and Transformation
    7.1 Matrix
    7.2 Properties of Matrices
    7.3 Determinant
    7.4 Properties of Determinant
    7.5 Inverse
    7.6 Properties of an Inverse
    7.7 Systems of Linear Equations
    7.8 Cramer's Rule
    7.9 Eigenvalues and Eigenvectors
    7.10 Diagonalization
    7.11 Rotation of Axes
    7.12 Special Matrices
    7.13 Vector Spaces
    7.14 Linear Transformation
        7.14.1 Basis Vectors
        7.14.2 Linear Operator
    7.15 Matrix Representation of a Linear Operator
Index
Chapter 1
Vectors
1.1 Properties of Vectors
A vector is a quantity that has both magnitude (length) and direction. The vector $\vec{v}$ has magnitude denoted by $|\vec{v}|$ or simply $v$. The position of a vector in space is immaterial, as shown in figure 1.1: vectors with the same direction and the same length are identical, even if they originate from different points in space. The three vectors $\vec{A}$, $\vec{B}$ and $\vec{C}$ shown in the figure are therefore equivalent.
Figure 1.1: Three vectors in space
The properties of vectors are listed as follows.
• Scalar Multiplication:
For any real number $\lambda$, the product of $\lambda$ and $\vec{v}$ is $\lambda\vec{v}$, a vector whose length is $|\lambda|$ times that of $\vec{v}$. It points along $\vec{v}$ if $\lambda > 0$ and opposite to $\vec{v}$ if $\lambda < 0$. We call $\lambda$ a scalar, in contrast to the vector $\vec{v}$.
1. If $\lambda = 1$, the product is $1\vec{v} = \vec{v}$.
2. If $\lambda = 0$, we have $0\vec{v} = \vec{0}$, the null vector or zero vector. Its magnitude is zero.
3. If $\lambda = -1$, the product is $(-1)\vec{v} = -\vec{v}$, whose direction is opposite to that of $\vec{v}$ but whose length is the same; see figure 1.2. We read $-\vec{v}$ as "minus $\vec{v}$".
4. If $\lambda = 1/|\vec{v}|$ (with $\vec{v} \neq \vec{0}$), we obtain the unit vector $\hat{v} = \vec{v}/|\vec{v}|$, which points in the direction of $\vec{v}$ and has magnitude 1.
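Figure 1.2: Negative of $\vec{v}$
• Addition and Subtraction:
For any scalars $\alpha$ and $\beta$, and any vectors $\vec{A}$, $\vec{B}$ and $\vec{C}$, we have
1. $\vec{A} + \vec{B} = \vec{B} + \vec{A}$ (commutative law)
2. $\vec{A} + (-\vec{B}) = \vec{A} - \vec{B}$
3. $\vec{A} + (\vec{B} + \vec{C}) = (\vec{A} + \vec{B}) + \vec{C}$ (associative law)
4. $\alpha(\beta\vec{A}) = (\alpha\beta)\vec{A}$ (associative law)
5. $\alpha(\vec{A} + \vec{B}) = \alpha\vec{A} + \alpha\vec{B}$ (distributive law)
6. $(\alpha + \beta)\vec{A} = \alpha\vec{A} + \beta\vec{A}$ (distributive law)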
1.2 Examples of Vectors and Scalars
Examples of vectors are displacement $\vec{x}$, velocity $\vec{v}$, acceleration $\vec{a}$, force $\vec{F}$, momentum $\vec{p}$, angular velocity $\vec{\omega}$, angular acceleration $\vec{\alpha}$, torque $\vec{\tau}$, electric field $\vec{E}$ and magnetic field $\vec{B}$, etc.
Examples of scalars are distance $x$, speed $v$, mass $m$, work done $W$, power $P$, temperature $T$, gravitational potential $V$, electric flux $\Phi_E$, and magnetic flux $\Phi_B$, etc. Surprisingly, a finite angular displacement is not a vector, even though it has a magnitude and an axis of rotation, because successive rotations do not obey the commutative law of addition. Figure 1.3 shows that the order of the rotations makes a big difference to the result.
Figure 1.3: Angular displacement is not a vector
1.3 Adding Vectors
An object acted on by two forces $\vec{F}_1$ and $\vec{F}_2$ behaves as if it were acted on by a single force $\vec{F}_3$, where $\vec{F}_3 = \vec{F}_1 + \vec{F}_2$. The sum of these vectors gives the resultant of the applied forces. The direction and magnitude of $\vec{F}_3$ can be obtained by constructing a parallelogram as shown in figure 1.4; its diagonal gives the resultant vector. The proof is given in section 1.4.
Figure 1.4: Resultant vector
Notice that the vectors can be translated as shown in the left diagram of figure 1.5. We then obtain two vector diagrams in the form of a triangle, e.g. $\vec{F}_1 + \vec{F}_2 = \vec{F}_3$ in the middle diagram and $\vec{F}_2 + \vec{F}_1 = \vec{F}_3$ in the right diagram. Both diagrams give the same result for $\vec{F}_3$. The result can be extended to a system of many forces, in which case the vector diagram becomes a polygon.
Figure 1.5: Sum of two vectors
If we label $\vec{F}_1$, $\vec{F}_2$, and $-\vec{F}_3$ by $\vec{A}$, $\vec{B}$, and $\vec{C}$ respectively, we have $\vec{A} + \vec{B} + \vec{C} = \vec{0}$. In other words, when the sum of three vectors equals zero, the vector diagram is a closed triangle with the vectors arranged head to tail in cyclic order (clockwise or counterclockwise). The vector diagrams in figure 1.6 are equivalent. If an object is in equilibrium under three forces, the force diagram is a triangle with the force vectors arranged head to tail in cyclic order.
Figure 1.6: Equilibrium of three forces
Example 1.1. An object of mass m is suspended by three light strings as
shown in figure 1.7. The tensions in the strings are T1 , T2 and T3 respectively.
The strings connecting to the ceiling make angles α and β to the horizontal.
Find the tension in each string.
Figure 1.7: Suspending an object by strings
Solution: Since the object is in equilibrium, we have $T_3 = mg$. The free-body diagram of the object is shown in the left diagram of figure 1.8. As the knot is also in equilibrium, we can construct a vector diagram from the tensions in the strings meeting at the knot; see the right diagram of figure 1.8. It is a triangle whose sides are given by the magnitudes of the tensions, with the vectors arranged head to tail in cyclic order. As a reminder, the sides of the triangle are not the lengths of the strings. Using the sine law, we readily obtain the tensions in the strings.
Figure 1.8: Vector diagrams
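$$\frac{T_1}{\sin(90^\circ - \beta)} = \frac{T_3}{\sin(\alpha + \beta)}$$
$$\frac{T_1}{\cos\beta} = \frac{mg}{\sin(\alpha + \beta)}$$
Hence, we have
$$T_1 = \frac{mg\cos\beta}{\sin(\alpha + \beta)}$$
Similarly, we obtain
$$T_2 = \frac{mg\cos\alpha}{\sin(\alpha + \beta)}$$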
Of course, one can use the routine method, i.e. resolve the tensions into horizontal and vertical components and then consider the equilibrium of forces in each direction, but that route is clumsier. ∎
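The sine-law result can be cross-checked against the component method mentioned above. The following is a minimal sketch, assuming that the string with tension $T_1$ is the one inclined at $\alpha$ and the string with tension $T_2$ is the one inclined at $\beta$ (consistent with the formulas above); the function name `tensions` and the numerical values are illustrative and not part of the example.

```python
import math

def tensions(m, alpha_deg, beta_deg, g=9.81):
    """Return (T1, T2, T3) from the sine-law result of Example 1.1."""
    a, b = math.radians(alpha_deg), math.radians(beta_deg)
    T3 = m * g
    T1 = T3 * math.cos(b) / math.sin(a + b)
    T2 = T3 * math.cos(a) / math.sin(a + b)
    return T1, T2, T3

# Illustrative values (assumed): m = 2 kg, alpha = 30 deg, beta = 60 deg
alpha_deg, beta_deg = 30.0, 60.0
T1, T2, T3 = tensions(2.0, alpha_deg, beta_deg)

# Component check at the knot: both sums should vanish in equilibrium.
a, b = math.radians(alpha_deg), math.radians(beta_deg)
horizontal = T2 * math.cos(b) - T1 * math.cos(a)
vertical = T1 * math.sin(a) + T2 * math.sin(b) - T3
print(T1, T2, T3, horizontal, vertical)
```

Both residuals come out at rounding-error level, which shows that the triangle method and the component method agree.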
Example 1.2. A particle is projected from the floor with speed $v_i$ at an angle of $30^\circ$ with the horizontal, as shown in figure 1.9. Find the time of flight of the particle.
Figure 1.9: Projectile motion of a particle
Solution: The trajectory is symmetric about the vertical line through the point of maximum height. Thus, the initial and final velocities have the same magnitude, $v_i = v_f$, and $\vec{v}_i$ and $\vec{v}_f$ make the same acute angle of $30^\circ$ with the floor. Since $\vec{v}_f = \vec{v}_i + \vec{g}t$, we can construct the vector diagram, figure 1.10, from the vectors in this equation. The angle between $\vec{v}_i$ and $\vec{v}_f$ is $60^\circ$ and these two sides are equal, so the diagram is an equilateral triangle with sides $v_i = v_f = gt$. Therefore, we have $t = v_i/g$. Notice that $\vec{g}t = \Delta\vec{v} = \vec{v}_f - \vec{v}_i$ is the change of velocity after time $t$.
Figure 1.10: Vector diagram of the change of velocity
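As a quick numerical cross-check, the triangle result $t = v_i/g$ can be compared with the familiar component formula $t = 2v_i\sin 30^\circ/g$. This is only a sketch; the launch speed and the helper name `time_of_flight_components` are assumed for illustration.

```python
import math

def time_of_flight_components(v_i, angle_deg, g=9.81):
    """Time of flight from the vertical component: t = 2 v_i sin(angle) / g."""
    return 2.0 * v_i * math.sin(math.radians(angle_deg)) / g

v_i, g = 12.0, 9.81            # illustrative launch speed in m/s (assumed)
t_triangle = v_i / g           # read off the equilateral triangle
t_components = time_of_flight_components(v_i, 30.0, g)
print(t_triangle, t_components)
```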
Example 1.3. An experiment is performed with a gun and a particle, as shown in figure 1.11. The particle is released at the instant the gun fires. Show that the bullet always hits the particle if its initial velocity $\vec{u}$ points at the particle.
Figure 1.11: The gun and particle experiment
Solution: Let $t$ be the time needed by the bullet to travel the horizontal distance between the gun and the particle. Denote the position vector of the bullet at time $t$ by $\vec{S}$. From the kinematic equation, we have $\vec{S} = \vec{u}t + \vec{g}t^2/2$, where $\vec{u}$ is along the firing direction of the gun and $\vec{g}$ points vertically downward.
Figure 1.12: Position vector of the bullet at time t
We obtain the vector diagram shown in figure 1.12. It is a triangle with sides $S$, $ut$ and $gt^2/2$. Because $\vec{u}$ points at the particle, the vector $\vec{u}t$ ends at the initial position of the particle. Thus, at time $t$ the bullet is a vertical distance $gt^2/2$ below the initial position of the particle. On the other hand, the particle also has a vertical displacement $\vec{g}t^2/2$. Therefore, the bullet always hits the particle.
∎
Example 1.4. A boat sails across a swift, straight river of width $d$. The speed of the boat in still water is $u$ and that of the water is $v$, where $v > u$. If the boat heads directly toward the opposite bank, find the downstream distance it has traveled when it reaches the opposite bank. Find also the minimum downstream distance and the corresponding direction of sailing.
Figure 1.13: Resultant velocity of the boat
Solution: Referring to figure 1.13, the resultant velocity of the boat is given by $\vec{v} + \vec{u} = \overrightarrow{OQ} + \overrightarrow{QC} = \overrightarrow{OC}$. It means that the boat arrives at B on the opposite bank, where the downstream distance is $AB = d/\tan\theta = d/(u/v) = dv/u$. Note that the time taken by the boat is $d/u$.
Figure 1.14: The minimum downstream distance of the boat
To find the minimum downstream distance, we construct a circle of radius $u$, as shown in figure 1.14. The tangent to the circle from O gives the path of the boat with the minimum downstream distance. As the tangent line meets the opposite bank at $B'$, the minimum downstream distance is $AB'$. The triangle $\triangle OQC'$ is right-angled, with $\sin\alpha = u/v$. The boat should head in a direction making an angle $\alpha$ with the normal to the river bank, i.e. OA. The resultant velocity of the boat is $\overrightarrow{OC'}$, where $\overrightarrow{OC'} = \overrightarrow{OQ} + \overrightarrow{QC'} = \vec{v} + \vec{u}$. The minimum downstream distance is
$$AB' = d\cot\alpha = \frac{d\sqrt{v^2 - u^2}}{u}$$
Remark: The time taken is
$$t = \frac{d}{u\cos\alpha} = \frac{d}{u\left(\dfrac{\sqrt{v^2 - u^2}}{v}\right)} = \frac{dv}{u\sqrt{v^2 - u^2}}$$
∎
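The geometric argument can be checked by brute force. The sketch below assumes the boat heads at an angle `phi` upstream of the normal OA, so that its cross-river speed is $u\cos\phi$ and its net downstream speed is $v - u\sin\phi$; the function `drift` and the numerical values are illustrative assumptions, not part of the example.

```python
import math

def drift(phi, d, u, v):
    """Downstream distance when heading at angle phi (rad) upstream of the normal."""
    return d * (v - u * math.sin(phi)) / (u * math.cos(phi))

d, u, v = 100.0, 3.0, 5.0      # illustrative values with v > u (assumed)

# Brute-force scan of headings from 0 up to just under 90 degrees.
n = 200000
best_phi = min((k * (math.pi / 2) / n for k in range(n)),
               key=lambda p: drift(p, d, u, v))

phi_theory = math.asin(u / v)                    # sin(alpha) = u / v
drift_theory = d * math.sqrt(v**2 - u**2) / u    # d * cot(alpha)
drift_straight = drift(0.0, d, u, v)             # heading straight across
print(math.degrees(best_phi), drift(best_phi, d, u, v), drift_straight)
```

The scan lands on the heading $\sin\alpha = u/v$ and reproduces both $d\cot\alpha$ and the straight-across value $dv/u$.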
1.4 The Components of Vectors
Given that $\vec{A} = A_x\,\hat{i} + A_y\,\hat{j}$, its horizontal and vertical components are $A_x = A\cos\alpha$ and $A_y = A\sin\alpha$, where $\alpha$ is the angle that $\vec{A}$ makes with the $\hat{i}$ direction and $A = |\vec{A}| = \sqrt{A_x^2 + A_y^2}$. If $\vec{B} = B_x\,\hat{i} + B_y\,\hat{j}$, we have $\vec{C} = \vec{A} + \vec{B} = (A_x + B_x)\,\hat{i} + (A_y + B_y)\,\hat{j}$, where $B_x = B\cos\beta$, $B_y = B\sin\beta$ and $B = |\vec{B}| = \sqrt{B_x^2 + B_y^2}$. Figure 1.15 shows the components of $\vec{A}$ and $\vec{B}$ along the $\hat{i}$ and $\hat{j}$ directions. The resultant vector $\vec{C} = \vec{A} + \vec{B}$ is shown in figure 1.16. It is the diagonal of the parallelogram formed by $\vec{A}$ and $\vec{B}$.
Figure 1.15: Components of vectors
Figure 1.16: Resultant of two vectors
Example 1.5. Three charges, each equal to $+2.90\ \mu\mathrm{C}$, are placed at three corners of a square 0.500 m on a side, as shown in figure 1.17. Find the magnitude and direction of the net force on charge number 3. Coulomb's constant $k_e = 1/(4\pi\epsilon_0)$ is $8.99 \times 10^9\ \mathrm{N\cdot m^2/C^2}$, where $\epsilon_0$ is the permittivity of free space.
Figure 1.17: Electric forces exerted on a charge
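Solution: The magnitude of the electric force exerted on charge $i$ by charge $j$ is given by
$$F_{ij} = k_e\frac{q_i q_j}{r_{ij}^2},$$
where $i, j = 1, 2, 3$, $i \neq j$, and $r_{ij}$ is the distance between the two charges. Let $r = 0.500$ m be the side of the square and $q$ the common charge.
The magnitude of the electric force exerted on charge 3 by charge 1:
$$F_{31} = k_e\frac{q^2}{(\sqrt{2}\,r)^2} = (8.99 \times 10^9\ \mathrm{N\cdot m^2/C^2})\frac{(2.90 \times 10^{-6}\ \mathrm{C})^2}{[(\sqrt{2})(0.500\ \mathrm{m})]^2} = 0.151\ \mathrm{N}$$
The magnitude of the electric force exerted on charge 3 by charge 2:
$$F_{32} = k_e\frac{q^2}{r^2} = (8.99 \times 10^9\ \mathrm{N\cdot m^2/C^2})\frac{(2.90 \times 10^{-6}\ \mathrm{C})^2}{(0.500\ \mathrm{m})^2} = 0.302\ \mathrm{N}$$
The x- and y-components of $\vec{F}_{31}$ and $\vec{F}_{32}$:
$$F_{31,x} = F_{31}\cos 45^\circ = (0.151\ \mathrm{N})(0.707) = 0.107\ \mathrm{N}$$
$$F_{31,y} = F_{31}\sin 45^\circ = (0.151\ \mathrm{N})(0.707) = 0.107\ \mathrm{N}$$
$$F_{32,x} = F_{32}\cos 0^\circ = (0.302\ \mathrm{N})(1) = 0.302\ \mathrm{N}$$
$$F_{32,y} = F_{32}\sin 0^\circ = (0.302\ \mathrm{N})(0) = 0\ \mathrm{N}$$
The resultant force on charge 3 has components:
$$F_{3,x} = F_{31,x} + F_{32,x} = 0.107\ \mathrm{N} + 0.302\ \mathrm{N} = 0.409\ \mathrm{N}$$
$$F_{3,y} = F_{31,y} + F_{32,y} = 0.107\ \mathrm{N} + 0\ \mathrm{N} = 0.107\ \mathrm{N}$$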
The resultant force acting on charge 3:
$$F = \sqrt{F_{3,x}^2 + F_{3,y}^2} = 0.423\ \mathrm{N}$$
The direction of the resultant force on charge 3:
$$\theta = \tan^{-1}\left(\frac{F_{3,y}}{F_{3,x}}\right) = 14.7^\circ$$
∎
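The arithmetic of this example is easy to reproduce in a few lines. The sketch below takes the force directions from the worked solution ($\vec{F}_{31}$ at $45^\circ$ and $\vec{F}_{32}$ along the x-axis); the variable names are illustrative.

```python
import math

K_E = 8.99e9      # Coulomb's constant, N m^2 / C^2 (value quoted in the example)
q = 2.90e-6       # each charge, C
side = 0.500      # side of the square, m

F31 = K_E * q**2 / (math.sqrt(2) * side) ** 2   # charge 1 is one diagonal away
F32 = K_E * q**2 / side**2                      # charge 2 is one side away

# Directions as used in the worked solution: F31 at 45 degrees, F32 along +x.
Fx = F31 * math.cos(math.radians(45)) + F32
Fy = F31 * math.sin(math.radians(45))

F = math.hypot(Fx, Fy)
theta = math.degrees(math.atan2(Fy, Fx))
print(f"F31 = {F31:.3f} N, F32 = {F32:.3f} N, F = {F:.3f} N, theta = {theta:.2f} deg")
```

Carrying full precision through the calculation gives an angle of about $14.6^\circ$; the $14.7^\circ$ quoted above results from rounding the intermediate components to three significant figures.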
Example 1.6. A right triangular wedge of mass M and inclination angle θ,
has a small block of mass m placed on its inclined surface, as shown in figure
1.18. Assuming all surfaces are frictionless, what horizontal acceleration a
must M have relative to the table to keep m stationary relative to the wedge?
What horizontal force F must be applied to the wedge to achieve this result?
Figure 1.18: Pushing the wedge
Solution: Suppose that the block has no motion with respect to the wedge
when they move together with a common horizontal acceleration a. The
free-body diagrams of the wedge and the block are sketched in figure 1.19.
Figure 1.19: Free-body diagrams of the wedge and the block
The equations of motion of the block along the horizontal and vertical directions are as follows (refer to the right diagram of figure 1.19):
$$\begin{cases} N_2\sin\theta = ma \\ mg - N_2\cos\theta = 0 \end{cases}$$
The second equation gives
$$N_2 = \frac{mg}{\cos\theta}$$
Eliminating $N_2$, we obtain $a = g\tan\theta$.
From the left diagram of figure 1.19, we can write the equation of motion of the wedge along the horizontal:
$$F - N_2\sin\theta = Ma$$
Therefore, we have $F = (M + m)g\tan\theta$. It is not surprising that the answer is simply the product of $M + m$ (the total mass of the objects) and $g\tan\theta$ (the common horizontal acceleration of the objects), because both objects move together horizontally and the only external horizontal force exerted on the system is $F$. Here, the system consists of the block and the wedge. The normal forces $N_2$ are internal forces of the system. The normal force $N_1$ exerted on the wedge by the table is an external vertical force with respect to the system. However, $N_1$ is irrelevant to the discussion of the horizontal acceleration of the system.
∎
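The three equations of motion can be solved step by step and compared with the closed-form answers. This is a minimal sketch; the masses, the angle and the function name `wedge_solution` are assumed for illustration.

```python
import math

def wedge_solution(M, m, theta_deg, g=9.81):
    """Return (a, F, N2) for the block to stay at rest on the frictionless wedge."""
    th = math.radians(theta_deg)
    N2 = m * g / math.cos(th)          # vertical balance of the block
    a = N2 * math.sin(th) / m          # horizontal equation of the block
    F = M * a + N2 * math.sin(th)      # horizontal equation of the wedge
    return a, F, N2

M, m, theta = 4.0, 1.0, 30.0           # illustrative values (assumed)
a, F, N2 = wedge_solution(M, m, theta)
print(a, F, N2)
```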
1.5 Subtracting Vectors
Two objects A and B are located at different places in a 2D plane. Their position vectors are $\vec{r}_A$ and $\vec{r}_B$ respectively. The position of object A relative to object B is given by $\vec{r}_{AB} = \vec{r}_A - \vec{r}_B$, shown in the left diagram of figure 1.20. The direction and magnitude of $\vec{r}_{AB}$ give the position of A relative to B. The meaning of the relative position vector becomes straightforward when we consider the following cases. Suppose an observer located at B tries to state the position of A: he will say $\vec{r}_{AB}$. If instead the observer is located at A and tries to state the position of B, he will say $\vec{r}_{BA}$, where $\vec{r}_{BA} = \vec{r}_B - \vec{r}_A$. The right diagram in figure 1.20 shows the direction of $\vec{r}_{BA}$. The concept extends to relative velocity: the velocity of A relative to B is $\vec{v}_{AB} = \vec{v}_A - \vec{v}_B$, where $\vec{v}_A$ and $\vec{v}_B$ are the velocity vectors of A and B respectively.
Figure 1.20: Relative position between two objects
Example 1.7. A ship A is steaming due north at 16 km/hr and a ship B is
steaming due west at 12 km/hr. Find the velocity of A with respect to B.
Solution: The velocity of A with respect to B is $\vec{v}_{AB}$, which equals $\vec{v}_A - \vec{v}_B$. Referring to the vector diagram shown on the right of figure 1.21, the relative velocity has magnitude $|\vec{v}_{AB}| = \sqrt{12^2 + 16^2} = 20$ km/hr. The direction of $\vec{v}_{AB}$ is N $\tan^{-1}(12/16)$ E, i.e. N $36^\circ 52'$ E.
Figure 1.21: The velocity of ship A with respect to ship B
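The subtraction can be carried out with components. The sketch below assumes a coordinate convention in which East is $+x$ and North is $+y$; the variable names are illustrative.

```python
import math

# East is +x, north is +y (assumed convention); speeds in km/hr as given.
v_A = (0.0, 16.0)      # ship A steaming due north
v_B = (-12.0, 0.0)     # ship B steaming due west

v_AB = (v_A[0] - v_B[0], v_A[1] - v_B[1])               # velocity of A relative to B
speed = math.hypot(*v_AB)
bearing = math.degrees(math.atan2(v_AB[0], v_AB[1]))    # degrees east of north
degrees_part = int(bearing)
minutes_part = round((bearing - degrees_part) * 60)
print(speed, degrees_part, minutes_part)
```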
Example 1.8. A man traveling East at $8\ \mathrm{km\,h^{-1}}$ finds that the wind seems to blow directly from the North. On doubling his speed he finds that it appears to come from the NE. Find the velocity of the wind.
Solution: Let the velocity of the wind be $\vec{w} = x\,\hat{i} + y\,\hat{j}$, with $\hat{i}$ pointing East and $\hat{j}$ pointing North. Then the velocity of the wind relative to the man is
$$\vec{w} - 8\,\hat{i} = (x - 8)\,\hat{i} + y\,\hat{j}$$
The man is the observer who feels the wind, so the direction he perceives (blowing directly from the North) is that of the wind relative to him. The vector subtraction represents this relative velocity. Therefore $\vec{w} - 8\,\hat{i} = (x - 8)\,\hat{i} + y\,\hat{j}$ is parallel to $-\hat{j}$. Hence, we obtain $x - 8 = 0$, i.e. $x = 8$. Figure 1.22 shows the velocity of the wind relative to the man traveling East at $8\ \mathrm{km\,h^{-1}}$. The vector $-\hat{j}$ indicates the North wind relative to the man.
Figure 1.22: The wind with respect to a man traveling East at $8\ \mathrm{km\,h^{-1}}$
When the man doubles his speed, the velocity of the wind relative to him
is given by
w⃗ − 16 î = (x − 16) î + y ĵ
Figure 1.23: The wind with respect to the man traveling East at $16\ \mathrm{km\,h^{-1}}$
But the wind now seems to blow from the NE, so the relative velocity is parallel to $-(\hat{i} + \hat{j})$. Hence, we can write $y = x - 16 = 8 - 16 = -8$. The velocity of the wind is $8\,\hat{i} - 8\,\hat{j}$, which is equivalent to $8\sqrt{2}\ \mathrm{km\,h^{-1}}$ from the NW. Figure 1.23 shows the velocity of the wind relative to the man traveling East at $16\ \mathrm{km\,h^{-1}}$. The vector $-(\hat{i} + \hat{j})$ indicates the NE wind relative to the man.
∎
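One way to confirm the answer is to subtract the man's velocity from the wind velocity for both walking speeds and inspect the apparent directions. The sketch below assumes East is $+x$ and North is $+y$; the helper `apparent_wind` is a hypothetical name.

```python
import math

def apparent_wind(wind, man):
    """Velocity of the wind relative to the man (component-wise subtraction)."""
    return (wind[0] - man[0], wind[1] - man[1])

wind = (8.0, -8.0)                             # result of the example, in km/h

rel_slow = apparent_wind(wind, (8.0, 0.0))     # man traveling East at 8 km/h
rel_fast = apparent_wind(wind, (16.0, 0.0))    # man traveling East at 16 km/h

speed = math.hypot(*wind)
# Compass bearing the true wind blows towards, clockwise from north.
towards = math.degrees(math.atan2(wind[0], wind[1])) % 360
came_from = (towards + 180) % 360
print(rel_slow, rel_fast, speed, came_from)
```

The first relative velocity points due South (a wind from the North), the second points towards the SW (a wind from the NE), and the true wind comes from a bearing of 315 degrees, i.e. from the NW.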
Example 1.9. John is running at a constant speed v0 = 0.7 m/s along a
straight path. His father Peter, at a normal distance 10 m from the path and
a distance 20 m from John, observes John’s approaching. Suppose that Peter
starts to run at constant speed, along a straight course, immediately when
he observes John, what is the minimum speed of Peter such that they can
meet? Where do they meet and what is the time?
Figure 1.24: The initial positions of John and Peter
Solution: Subtract John's velocity vector from the velocities of both persons. John then becomes stationary and Peter runs with $\vec{v}_{PJ}$, where $\vec{v}_{PJ} = \vec{v}_P - \vec{v}_J$ is the velocity of Peter relative to John. In general, $\vec{v}_{PJ}$ has an arbitrary direction because it depends on Peter's choice of $\vec{v}_P$.
Figure 1.25: The vector diagram of Peter’s path
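Nevertheless, for Peter to meet John eventually, $\vec{v}_{PJ}$ must point towards John, as in figure 1.25. With this direction fixed, $|\vec{v}_P|$ is smallest when $\vec{v}_P$ is perpendicular to $\vec{v}_{PJ}$. Denote the minimum velocity of Peter by $\vec{v}_{min}$; its magnitude is $v_{min} = (0.7\ \mathrm{m/s})\sin 30^\circ = 0.35\ \mathrm{m/s}$, where $\sin 30^\circ = (10\ \mathrm{m})/(20\ \mathrm{m})$. As a reminder, $\vec{v}_{min}$ shows the actual direction in which Peter runs in order to meet John. In fact, he meets John at $M$, and $PM$ is the actual path taken by Peter. From $\triangle JMP$, we have
$$JM = \frac{20\ \mathrm{m}}{\cos 30^\circ} = \frac{40}{\sqrt{3}}\ \mathrm{m}$$
The time needed is $t$, where
$$t = \frac{JM}{0.7\ \mathrm{m/s}} = \frac{\dfrac{40}{\sqrt{3}}\ \mathrm{m}}{\dfrac{7}{10}\ \mathrm{m/s}} = \frac{400}{7\sqrt{3}}\ \text{seconds}.$$
∎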
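The minimum can also be found numerically, without the relative-velocity argument, by scanning all possible meeting points on John's path. The sketch below assumes John starts at the origin and runs along $+x$, which places Peter at $(20\cos 30^\circ,\ 10)$ m; the function `required_speed` is a hypothetical helper.

```python
import math

v0 = 0.7                                          # John's speed, m/s
P = (20.0 * math.cos(math.radians(30)), 10.0)     # Peter's position, m (assumed axes)

def required_speed(s):
    """Speed Peter needs to reach the point (s, 0) at the same time as John."""
    t = s / v0
    return math.hypot(P[0] - s, P[1]) / t

# Scan candidate meeting points from 1 mm to 100 m in 1 mm steps.
candidates = [k * 0.001 for k in range(1, 100001)]
s_best = min(candidates, key=required_speed)
v_min = required_speed(s_best)
t_meet = s_best / v0
print(v_min, s_best, t_meet)
```

The scan returns a minimum speed of about 0.35 m/s, a meeting point roughly 23.1 m from John's starting position, and a meeting time close to 33 s, in agreement with $JM = 40/\sqrt{3}$ m and $t = 400/(7\sqrt{3})$ s.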
Example 1.10. An experiment is performed with a gun and a particle, as shown in figure 1.26. The particle is released at the instant the gun fires. Show that the bullet always hits the particle if its initial velocity $\vec{u}$ points at the particle. This example repeats the question stated in example 1.3, but here we solve it by applying the idea of relative velocity.
Figure 1.26: The gun and particle experiment
Solution: When $t \ge 0$, we have
$$\begin{cases} \text{Velocity of the bullet: } \vec{v}_b = \vec{u} + \vec{g}t \\ \text{Velocity of the particle: } \vec{v}_p = \vec{g}t \end{cases}$$
So, the velocity of the bullet relative to the particle is $\vec{v}_{bp} = \vec{v}_b - \vec{v}_p = \vec{u}$. It means that the bullet travels with constant velocity $\vec{u}$ with respect to an observer riding on the particle. Equivalently, the effect of gravity cancels out in the picture of relative motion, and the bullet travels with $\vec{u}$ straight towards the particle. The bullet must hit the particle.
Alternatively, one may imagine the following. If the velocity vector $\vec{g}t$ is subtracted from both the particle and the bullet, then the particle is at rest and the bullet travels with $\vec{u}$ relative to the particle. This is the idea behind relative motion. We conclude that the bullet definitely hits the particle.
∎
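The cancellation of gravity can be seen in a short calculation of both trajectories. The sketch below uses assumed values for the target position and the muzzle speed, and ignores air resistance and the ground; `position` is a hypothetical helper implementing $\vec{r} = \vec{r}_0 + \vec{v}_0 t + \vec{g}t^2/2$.

```python
import math

g = (0.0, -9.81)
gun = (0.0, 0.0)
target0 = (30.0, 40.0)       # initial position of the particle, m (assumed)
muzzle_speed = 100.0         # m/s (assumed)

dist0 = math.hypot(target0[0] - gun[0], target0[1] - gun[1])
# Initial velocity u aimed straight at the particle.
u = (muzzle_speed * (target0[0] - gun[0]) / dist0,
     muzzle_speed * (target0[1] - gun[1]) / dist0)

def position(start, velocity, t):
    """Constant-acceleration kinematics: r = r0 + v0 t + g t^2 / 2."""
    return tuple(start[i] + velocity[i] * t + 0.5 * g[i] * t * t for i in range(2))

t_hit = dist0 / muzzle_speed     # relative displacement u*t closes the initial gap
bullet = position(gun, u, t_hit)
particle = position(target0, (0.0, 0.0), t_hit)
gap = math.hypot(bullet[0] - particle[0], bullet[1] - particle[1])
print(t_hit, bullet, particle, gap)
```

Because both objects share the same term $\vec{g}t^2/2$, the gap closes at the time $t = (\text{initial separation})/u$, whatever the value of $g$.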
1.6 Position Vector and its Time Derivatives
The position vector indicates the position of an object with respect to the origin of a coordinate system, e.g. the Cartesian coordinate system. Its time derivative gives the velocity of the object. The direction of the velocity vector is tangential to the path of the object. We can differentiate the velocity vector again with respect to time to obtain the acceleration vector, which is the second-order derivative of the position vector with respect to time. This reveals that the equation of motion of an object, i.e. $\vec{F} = m\vec{a}$, is generally a second-order differential equation, where $\vec{F}$ is the force exerted on an object of mass $m$ moving with acceleration $\vec{a}$.
• Position vector
The position vector of a point is given by $\vec{r} = x\,\hat{i} + y\,\hat{j} + z\,\hat{k}$. It points from the origin to the point and has length $r = |\vec{r}| = \sqrt{x^2 + y^2 + z^2}$. Figure 1.27 shows the trajectory of a particle at various times. The position of the particle at time $t_1$ is given by $\vec{r}_1$ and at a later time $t_2$ it becomes $\vec{r}_2$. The change of position vector, or simply the displacement, is $\Delta\vec{r} = \vec{r}_2 - \vec{r}_1$. It is important to notice that when $\Delta t = t_2 - t_1$ is very small, $\Delta\vec{r}$ lies almost along the trajectory and shows approximately the direction of motion of the particle.
Figure 1.27: The position vector of a particle
• Velocity vector
The average velocity during time $\Delta t$ is defined as
$$\vec{v}_{avg} = \frac{\Delta\vec{r}}{\Delta t} = \frac{\vec{r}(t_1 + \Delta t) - \vec{r}(t_1)}{\Delta t}$$
The instantaneous velocity at time $t$ is defined as the time derivative of the position vector. It is tangential to the path at that instant.
$$\vec{v} = \frac{d\vec{r}}{dt} = \lim_{\Delta t \to 0}\frac{\Delta\vec{r}}{\Delta t} = \lim_{\Delta t \to 0}\frac{\vec{r}(t + \Delta t) - \vec{r}(t)}{\Delta t}$$
The speed of the particle is $v = |\vec{v}|$, a non-negative scalar quantity, in contrast to the velocity $\vec{v}$, which is a vector. In some textbooks, $\vec{v}$ is written as $\dot{\vec{r}}$.
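Figure 1.28: The velocity vector of a particle
• Acceleration vector
The average acceleration during time $\Delta t$ is defined as
$$\vec{a}_{avg} = \frac{\Delta\vec{v}}{\Delta t} = \frac{\vec{v}(t_1 + \Delta t) - \vec{v}(t_1)}{\Delta t}$$
The instantaneous acceleration at time $t$ is defined as the time derivative of the velocity vector.
$$\vec{a} = \frac{d\vec{v}}{dt} = \lim_{\Delta t \to 0}\frac{\Delta\vec{v}}{\Delta t} = \lim_{\Delta t \to 0}\frac{\vec{v}(t + \Delta t) - \vec{v}(t)}{\Delta t}$$
Notice that the acceleration vector is also the second-order time derivative of the position vector.
$$\vec{a} = \dot{\vec{v}} = \frac{d\vec{v}}{dt} = \frac{d}{dt}\left(\frac{d\vec{r}}{dt}\right) = \frac{d^2\vec{r}}{dt^2} = \ddot{\vec{r}}$$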
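The limit definitions above translate directly into finite differences. The sketch below uses a central difference, $[\vec{r}(t + \Delta t) - \vec{r}(t - \Delta t)]/(2\Delta t)$, in place of the one-sided quotient in the definitions because it is more accurate for a small but finite $\Delta t$. The trajectory `r` is an assumed example (a projectile launched from the origin), and all function names are illustrative.

```python
import math

def r(t):
    """Illustrative trajectory (assumed): a projectile launched from the origin."""
    return (3.0 * t, 4.0 * t - 0.5 * 9.81 * t * t, 0.0)

def derivative(f, t, dt=1e-4):
    """Central-difference estimate of df/dt for a vector-valued function f."""
    ahead, behind = f(t + dt), f(t - dt)
    return tuple((a - b) / (2.0 * dt) for a, b in zip(ahead, behind))

def velocity(t):
    return derivative(r, t)

def acceleration(t):
    return derivative(velocity, t)

v = velocity(1.0)         # analytically (3, 4 - 9.81, 0)
a = acceleration(1.0)     # analytically (0, -9.81, 0)
speed = math.sqrt(sum(c * c for c in v))
print(v, a, speed)
```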
Example 1.11. A particle moving in space has position vector $\vec{r}$ at time $t$. Its speed and the magnitude of its acceleration at that instant are $v$ and $a$ respectively. Comment on the following pairs of quantities.
(a) $\dfrac{dr}{dt}$ and $v$, (b) $\dfrac{dv}{dt}$ and $a$.
Solution:
(a) The quantity $v$ is the speed of the particle; it is never negative. However, $\dfrac{dr}{dt}$ is the rate of change of the distance between the particle and the origin. It can be positive or negative. For example, it is negative when the particle approaches the origin. Notice also that $\dfrac{dr}{dt} = \dfrac{d|\vec{r}|}{dt}$ and $v = |\vec{v}| = \left|\dfrac{d\vec{r}}{dt}\right|$. Interestingly, $\dfrac{d|\vec{r}|}{dt}$ is not necessarily equal to $\left|\dfrac{d\vec{r}}{dt}\right|$. Let's consider a particle moving in a circular path centered at the origin, as shown in figure 1.29.
Figure 1.29: The circular motion of a particle
As the distance between the particle and the center of rotation is fixed, i.e. $r = |\vec{r}|$ is a constant, we have $\dfrac{dr}{dt} = \dfrac{d|\vec{r}|}{dt} = 0$. On the contrary, the speed of the particle $v = |\vec{v}| = \left|\dfrac{d\vec{r}}{dt}\right|$ is nonzero while the particle is revolving. Therefore, we assert that $\dfrac{dr}{dt}$ is not necessarily equal to $v$, except when the particle moves directly away from the origin along a straight path or the particle is at rest.
(b) The quantity $a$ is the magnitude of the acceleration of the particle; it is never negative. On the other hand, $\dfrac{dv}{dt}$ is the rate of change of the speed of the particle along the path. It can be positive or negative. For example, it is negative if the particle slows down. Let's consider a particle performing uniform circular motion. The speed of the particle $v$ is a constant and $\dfrac{dv}{dt} = 0$. However, the particle experiences a centripetal acceleration, i.e. $\vec{a} \neq \vec{0}$ and thus $a \neq 0$. Therefore, $\dfrac{dv}{dt}$ is not necessarily equal to $a$.
Strictly speaking, $\dfrac{dv}{dt}$ is the acceleration of the particle along the path, i.e. the tangential acceleration $a_t$. Refer to figure 1.30. It is positive if the particle speeds up along the path, negative if the particle slows down along the path, and zero if the particle maintains a constant speed along the path. It is important to notice that $\dfrac{dv}{dt}$ alone does not provide any information about the acceleration normal to the path. The latter is called the radial acceleration $a_n$. Some books adopt the symbol $a_r$ instead. The expression $a_n = v^2/r$ is true for uniform or non-uniform circular motion. It is also true for motion along arbitrary curves, with $r$ taken as the local radius of curvature.
Figure 1.30: The circular motion of a particle
∎
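The distinctions made in this example can be seen numerically for uniform circular motion. The following sketch assumes a circle of radius `R` centered at the origin and a constant angular speed `omega`; all names and values are illustrative, and the derivatives are central-difference estimates.

```python
import math

R, omega = 2.0, 1.5          # radius (m) and angular speed (rad/s), assumed values

def r_vec(t):
    """Position on a circle centred at the origin."""
    return (R * math.cos(omega * t), R * math.sin(omega * t))

def central(f, t, dt=1e-5):
    """Central-difference derivative of a scalar function f at time t."""
    return (f(t + dt) - f(t - dt)) / (2.0 * dt)

def distance(t):
    """r = |r_vec|, the distance from the origin."""
    return math.hypot(*r_vec(t))

def speed(t):
    """v = |d r_vec / dt|, built from the derivatives of the components."""
    vx = central(lambda s: r_vec(s)[0], t)
    vy = central(lambda s: r_vec(s)[1], t)
    return math.hypot(vx, vy)

t = 0.8
dr_dt = central(distance, t)          # d|r|/dt, expected to vanish
v = speed(t)                          # expected to equal R * omega
dv_dt = central(speed, t, dt=1e-3)    # tangential acceleration, expected to vanish
a_n = v * v / R                       # normal acceleration, expected R * omega^2
print(dr_dt, v, dv_dt, a_n)
```

Here $dr/dt$ and $dv/dt$ are both zero to within numerical error, while $v$ and $a_n$ are not, which is exactly the contrast drawn in parts (a) and (b).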
Example 1.12. A particle is thrown horizontally with initial velocity $u$ from a cliff. The trajectory is shown in the left diagram of figure 1.31. Find the acceleration of the particle along and normal to the path at time $t$. Hence, find the radius of curvature of the path at time $t$.
Solution:
Figure 1.31: The trajectory of a particle
The horizontal and vertical velocities of the particle at time $t$ are given by
$$v_x = u \quad\text{and}\quad v_y = -gt$$
respectively. The resultant speed at time $t$ is $v = \sqrt{v_x^2 + v_y^2} = \sqrt{u^2 + g^2t^2}$ and the acceleration of the particle along the path is
$$a_t = \frac{dv}{dt} = \frac{g^2 t}{\sqrt{u^2 + g^2t^2}}.$$
An alternative approach is given in example 1.15. Since the particle is driven by the gravitational attraction only, the net acceleration of the particle is $g$. We can write $g^2 = a_t^2 + a_n^2$, where $a_n$ is the acceleration of the particle normal to the path; refer to the right diagram of figure 1.31. We obtain
$$a_n = \sqrt{g^2 - a_t^2} = \frac{gu}{\sqrt{u^2 + g^2t^2}}$$
This is also the centripetal acceleration of the particle at time $t$, and it is related to the speed of the particle by $a_n = v^2/r$, where $r$ is the radius of curvature. We then readily get the result
$$r = \frac{v^2}{a_n} = \frac{(u^2 + g^2t^2)^{3/2}}{gu}$$
As a final remark, one should notice that the radius of curvature applies to all curves, not only to circular paths.
∎
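The result for $r$ can be checked against an independent route. The sketch below compares it with the standard calculus formula for the radius of curvature of a plane curve $y(x)$, namely $(1 + y'^2)^{3/2}/|y''|$, which is not used in the example itself. It assumes the trajectory is written as $y = -gx^2/(2u^2)$ with $x = ut$; the function names and numerical values are illustrative.

```python
import math

def accelerations(u, t, g=9.81):
    """Tangential and normal accelerations and radius of curvature at time t."""
    v = math.sqrt(u * u + g * g * t * t)
    a_t = g * g * t / v
    a_n = g * u / v
    return a_t, a_n, v * v / a_n

def radius_from_curve(u, t, g=9.81):
    """Radius from the curvature formula applied to y = -g x^2 / (2 u^2)."""
    slope = -g * t / u            # dy/dx evaluated at x = u t
    second = -g / (u * u)         # d2y/dx2
    return (1.0 + slope * slope) ** 1.5 / abs(second)

u, t = 5.0, 0.7                   # illustrative values (assumed)
a_t, a_n, r = accelerations(u, t)
print(a_t, a_n, r, radius_from_curve(u, t))
```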
1.7 Reference Frames
Each observer, such as you standing on the ground, defines a reference frame. A reference frame requires a coordinate system and a set of clocks, which enable an observer to measure positions, velocities, and accelerations in his or her particular frame. As shown in figure 1.32, we have two different frames from which to watch an object $P$. We obtain the relation between the position vectors measured in the different frames:
$$\vec{r}_{PA} = \vec{r}_{PB} + \vec{r}_{BA}$$
In words: "The position of $P$ as measured by frame A is equal to the position of $P$ as measured by frame B plus the position of B as measured by A."
Figure 1.32: Reference Frames
Taking the time derivative of both sides, we obtain the relation between the velocities: $\vec{v}_{PA} = \vec{v}_{PB} + \vec{v}_{BA}$. Differentiating once more gives, in general, $\vec{a}_{PA} = \vec{a}_{PB} + \vec{a}_{BA}$. If the two frames move at constant velocity with respect to each other, i.e. $\vec{v}_{BA} = $ constant, then $\vec{a}_{BA} = \vec{0}$ and we obtain $\vec{a}_{PA} = \vec{a}_{PB}$. That means two observers moving with constant velocity relative to each other write down the same equation of motion for the object.
Below is a summary of the quantities measured in different reference frames.
$$\begin{cases} \vec{r}_{PA} = \vec{r}_{PB} + \vec{r}_{BA} \\ \vec{v}_{PA} = \vec{v}_{PB} + \vec{v}_{BA} \\ \vec{a}_{PA} = \vec{a}_{PB} + \vec{a}_{BA} \end{cases}$$
1.8 Scalar Products of Vectors
The scalar product or dot product of two vectors $\vec{A}$ and $\vec{B}$ is defined as follows.
$$\vec{A}\cdot\vec{B} = |\vec{A}||\vec{B}|\cos\theta = AB\cos\theta,$$
where $\theta$ is the angle between the vectors. It is a scalar quantity and the operation is commutative, i.e. $\vec{A}\cdot\vec{B} = \vec{B}\cdot\vec{A}$. It can take positive, negative or zero values. If the vectors make an acute angle with each other, the product is positive. If the vectors make an obtuse angle, it is negative. When they are perpendicular to each other, the product is zero.
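Figure 1.33: Scalar product of vectors
If $\vec{B}$ is a unit vector, i.e. $B = 1$, then $\vec{A} \cdot \vec{B} = A(1) \cos\theta = A \cos\theta$, which is
the projection of $\vec{A}$ on $\vec{B}$. Suppose $\vec{A}$ and $\vec{B}$ are described in Cartesian
coordinates, $\vec{A} = A_x \hat{i} + A_y \hat{j} + A_z \hat{k}$ and $\vec{B} = B_x \hat{i} + B_y \hat{j} + B_z \hat{k}$, then we have
$$\vec{A} \cdot \vec{B} = (A_x \hat{i} + A_y \hat{j} + A_z \hat{k}) \cdot (B_x \hat{i} + B_y \hat{j} + B_z \hat{k}) = A_x B_x + A_y B_y + A_z B_z$$
The simplification is done by using the relations $\hat{i} \cdot \hat{i} = \hat{j} \cdot \hat{j} = \hat{k} \cdot \hat{k} = 1$ and
$\hat{i} \cdot \hat{j} = \hat{j} \cdot \hat{k} = \hat{k} \cdot \hat{i} = 0$. Obviously, $\vec{A} \cdot \vec{A} = |\vec{A}||\vec{A}| \cos 0^\circ = A^2 = A_x^2 + A_y^2 + A_z^2$,
$\vec{B} \cdot \vec{B} = |\vec{B}||\vec{B}| \cos 0^\circ = B^2 = B_x^2 + B_y^2 + B_z^2$, and
$$\cos\theta = \frac{\vec{A} \cdot \vec{B}}{AB} = \frac{A_x B_x + A_y B_y + A_z B_z}{\sqrt{A_x^2 + A_y^2 + A_z^2}\,\sqrt{B_x^2 + B_y^2 + B_z^2}} .$$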
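The component formula for cos θ translates directly into a few lines of code. The following is a minimal sketch in plain Python; the helper names `dot` and `angle_between` and the sample vectors are assumptions made for illustration.

```python
import math

def dot(a, b):
    """Scalar product A.B = AxBx + AyBy + AzBz."""
    return sum(ai * bi for ai, bi in zip(a, b))

def angle_between(a, b):
    """Angle (in radians) between two non-zero vectors, from cos(theta) = A.B / (AB)."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    # Guard against rounding slightly outside [-1, 1] before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

A = (1.0, 0.0, 0.0)
B = (1.0, 1.0, 0.0)
theta = angle_between(A, B)
print(math.degrees(theta))  # approximately 45
```

The sign of `dot(a, b)` alone already tells whether the angle is acute, right, or obtuse, as described above.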
Example 1.13. Use the scalar product of two vectors to prove the cosine
rule of a triangle.
Figure 1.34: A proof of the cosine rule
Solution: Construct the △ABC and the vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$ such that the
vectors are along the sides of the triangle, as shown in figure 1.34. We notice
that $\vec{b} = \vec{c} - \vec{a}$, and the dot product of $\vec{b}$ with itself is $\vec{b} \cdot \vec{b} = (\vec{c} - \vec{a}) \cdot (\vec{c} - \vec{a})$. Thus, we
have
$$b^2 = c^2 + a^2 - 2\, \vec{c} \cdot \vec{a}$$
Then, we obtain the cosine rule $b^2 = c^2 + a^2 - 2ca \cos B$.
∎
Example 1.14. Use the scalar product of two vectors to prove the Cauchy-Schwarz inequality for real numbers $a_i$ and $b_i$, $i = 1, 2, 3$, and $n = 3$.
$$\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \sum_{i=1}^{n} a_i^2 \sum_{i=1}^{n} b_i^2$$
Solution: Let $\vec{A} = a_1 \hat{i} + a_2 \hat{j} + a_3 \hat{k}$ and $\vec{B} = b_1 \hat{i} + b_2 \hat{j} + b_3 \hat{k}$. Denote the angle
between $\vec{A}$ and $\vec{B}$ as $\theta$. We have
$$(\vec{A} \cdot \vec{B})^2 = A^2 B^2 \cos^2\theta \le A^2 B^2$$
Therefore, we obtain $\left(\sum_{i=1}^{3} a_i b_i\right)^2 \le \sum_{i=1}^{3} a_i^2 \sum_{i=1}^{3} b_i^2$.
∎
1.9 Applications of Scalar Products
Scalar products are commonly used to define physical quantities such as work
done, electric flux, and magnetic flux. Let’s take work done as an example.
The work done by a force $\vec{F}$ is defined as
$$W = \vec{F} \cdot \vec{d} = F d \cos\theta ,$$
where $\vec{d}$ is the displacement of the point of application. The frictional force
does negative work when a mass slides on a rough table because the force
vector and the displacement vector are anti-parallel to each other. In fact,
$W = (F \cos\theta)\, d = (d \cos\theta)\, F$. It implies that the following approaches are
equivalent when we compute the work done: (1) multiplying the projection
of $\vec{F}$ on $\vec{d}$ by the magnitude of the displacement $d$, or (2) multiplying the projection of $\vec{d}$ on $\vec{F}$
by the magnitude of the force $F$. In fact, the scalar product of two vectors gives a hint for finding
the projection of either vector along the other.
Figure 1.35: Scalar product and the projection of vector
Example 1.15. A particle is thrown horizontally with initial velocity u from
a cliff. The projectile is shown in the left diagram of figure 1.31. Find the
acceleration of the particle along and normal to the path at time t.
Solution:
Figure 1.36: The acceleration of a particle along and normal to the trajectory
The horizontal and vertical velocities of the particle at time $t$ are given by
$v_x = u$ and $v_y = -gt$
respectively. The velocity of the particle at time $t$ has the form $\vec{v} = u\, \hat{i} - gt\, \hat{j}$
and the acceleration is $\vec{a} = -g\, \hat{j}$. Then we can compute the acceleration of the
particle along the path by using the scalar product of $\vec{a}$ and $\hat{v}$, where $\hat{v} = \vec{v}/v$.
The tangential acceleration of the particle at time $t$ is
$$a_t = \vec{a} \cdot \hat{v} = (-g\, \hat{j}) \cdot \left(\frac{\vec{v}}{v}\right) = (-g\, \hat{j}) \cdot \left(\frac{u\, \hat{i} - gt\, \hat{j}}{\sqrt{u^2 + g^2 t^2}}\right),$$
which leads to $a_t = \dfrac{g^2 t}{\sqrt{u^2 + g^2 t^2}}$. An alternative approach is given in example
1.12. The cross product of $\vec{a}$ and $\hat{v}$ gives the acceleration of the particle
normal to the path. Read section 1.10.
$$a_n = |\vec{a} \times \hat{v}| = \left|(-g\, \hat{j}) \times \left(\frac{\vec{v}}{v}\right)\right| = \left|(-g\, \hat{j}) \times \left(\frac{u\, \hat{i} - gt\, \hat{j}}{\sqrt{u^2 + g^2 t^2}}\right)\right| = \left|\frac{ug\, \hat{k}}{\sqrt{u^2 + g^2 t^2}}\right|$$
Therefore, $a_n = \dfrac{ug}{\sqrt{u^2 + g^2 t^2}}$.
∎
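The result of Example 1.15 can be checked numerically. The sketch below computes the tangential and normal components from the dot and cross products and compares them with the closed forms; the sample values of u, g and t are arbitrary assumptions.

```python
import math

def accel_components(u, g, t):
    """Tangential and normal acceleration of a horizontally thrown particle."""
    vx, vy = u, -g * t            # velocity v = u i - g t j
    ax, ay = 0.0, -g              # acceleration a = -g j
    speed = math.hypot(vx, vy)
    a_t = (ax * vx + ay * vy) / speed       # a . v_hat
    a_n = abs(ax * vy - ay * vx) / speed    # |a x v_hat| (only a z-component in 2D)
    return a_t, a_n

u, g, t = 10.0, 9.8, 2.0          # illustrative values (SI units)
a_t, a_n = accel_components(u, g, t)

root = math.sqrt(u**2 + g**2 * t**2)
print(a_t, g**2 * t / root)       # the two values agree
print(a_n, u * g / root)          # the two values agree
print(a_t**2 + a_n**2, g**2)      # components recombine to g^2
```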
Example 1.16. A block is being pulled by a constant force F through a
light string which makes a constant angle of 60° with the horizontal. The
pulley is light and frictionless. If F = 10 N and the block moves by 1 m, what
is the work done by the force F?
Figure 1.37: The work done by a pull
Solution:
Method I: Let's consider the movement of the point of action. Initially, the
point of action on the string is at point A, as shown in figure 1.38. When the
block moves by 1 m, the point of action moves to point B along the straight
path AB. We notice that AC = CB = 1 m and the displacement of the point
of action is given by $\overrightarrow{AB}$, where $AB = 2\,(AC \cos 30^\circ) = \sqrt{3}$ m.
Let W be the work done by the applied force $\vec{F}$. We have
$$W = \vec{F} \cdot \overrightarrow{AB} = |\vec{F}|\,|\overrightarrow{AB}| \cos 30^\circ = (10\ \text{N})(\sqrt{3}\ \text{m})(\sqrt{3}/2)$$
Therefore, we obtain W = 15 J.
Figure 1.38: The work done by a pull
Method II: Consider the total force acting on the block. It is the sum of two
tension forces as shown in the left diagram of figure 1.39. Each force moves
the block by 1 m. The right diagram in figure 1.39 illustrates the effect of
each individual force. The work done by the two forces is
$$W = (F \cos 60^\circ)(1\ \text{m}) + (F)(1\ \text{m}) = (10\ \text{N})(0.5)(1\ \text{m}) + (10\ \text{N})(1\ \text{m})$$
Therefore, we obtain W = 15 J.
Figure 1.39: The work done by a pull
Method III: Consider the resultant force F ′ acting on the block. It is the
sum of two tension forces as shown in the right diagram of figure 1.40.
Figure 1.40: The work done by a pull
We have $F' = 2F \cos 30^\circ = 2(10\ \text{N})(\sqrt{3}/2) = 10\sqrt{3}$ N. This resultant force
drives the block by 1 m and the work done by it is
$$W = (F' \cos 30^\circ)(1\ \text{m}) = (10\sqrt{3}\ \text{N})(\sqrt{3}/2)(1\ \text{m})$$
Therefore, we obtain W = 15 J.
∎
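The three methods of Example 1.16 can be evaluated side by side. This is only an arithmetic sketch of the formulas above; the variable names are chosen for illustration.

```python
import math

F = 10.0   # applied force in N
s = 1.0    # distance moved by the block in m (also AC = CB)
cos30 = math.cos(math.radians(30))
cos60 = math.cos(math.radians(60))

# Method I: displacement of the point of action, AB = 2 AC cos 30 deg
AB = 2 * s * cos30
W1 = F * AB * cos30

# Method II: work done by the two tension forces separately
W2 = F * cos60 * s + F * s

# Method III: resultant force F' = 2 F cos 30 deg, at 30 deg to the motion
F_res = 2 * F * cos30
W3 = F_res * cos30 * s

print(W1, W2, W3)  # all three are 15 J up to rounding
```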
1.10 Cross Product of Vectors
The cross product of two vectors $\vec{a}$ and $\vec{b}$ is defined as
$$\vec{a} \times \vec{b} = |\vec{a}||\vec{b}| \sin\theta\, \hat{n} = ab \sin\theta\, \hat{n} ,$$
where $\theta$ is the angle between $\vec{a}$ and $\vec{b}$, and $\hat{n}$ is a unit vector generated by
the right-hand rule. The direction of $\hat{n}$ is perpendicular to the plane formed
by $\vec{a}$ and $\vec{b}$. If $\vec{a}$ and $\vec{b}$ are parallel or anti-parallel vectors, the cross product
of them is zero.
Figure 1.41: The cross product and the right-hand rule
The cross product is anti-commutative: $\vec{a} \times \vec{b} = -\vec{b} \times \vec{a}$. Notice also that the
cross product is not associative, i.e. in general $\vec{a} \times (\vec{b} \times \vec{c}) \neq (\vec{a} \times \vec{b}) \times \vec{c}$, because the former
lies on the bc-plane and the latter lies on the ab-plane. The cross product is
easy to compute because
$$\begin{cases}
\hat{i} \times \hat{i} = \hat{j} \times \hat{j} = \hat{k} \times \hat{k} = 0 \quad \text{and} \\
\hat{i} \times \hat{j} = \hat{k},\ \ \hat{j} \times \hat{k} = \hat{i},\ \ \hat{k} \times \hat{i} = \hat{j}
\end{cases}$$
The cross product of vectors $\vec{a} = a_x \hat{i} + a_y \hat{j} + a_z \hat{k}$ and $\vec{b} = b_x \hat{i} + b_y \hat{j} + b_z \hat{k}$ can
be represented by a determinant, e.g.
$$\vec{a} \times \vec{b} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}$$
where $\vec{a} \times \vec{b} = (a_y b_z - b_y a_z)\, \hat{i} - (a_x b_z - b_x a_z)\, \hat{j} + (a_x b_y - b_x a_y)\, \hat{k}$.
Example 1.17. Find the area of the triangle formed by vectors $\vec{a}$ and $\vec{b}$ as
shown in the left diagram of figure 1.42.
Figure 1.42: The cross product and the area of a triangle
Solution: The area of the parallelogram formed by $\vec{a}$ and $\vec{b}$ is given by
$ah = ab \sin\theta = |\vec{a} \times \vec{b}|$. Thus, the area of the triangle formed by $\vec{a}$ and $\vec{b}$ is
$ah/2 = |\vec{a} \times \vec{b}|/2$.
∎
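The determinant expansion of the cross product and the triangle-area formula of Example 1.17 can be sketched in a few lines of Python. The function names and the sample vectors are assumptions for illustration only.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors, expanded from the determinant form."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - by * az,
            -(ax * bz - bx * az),
            ax * by - bx * ay)

def norm(v):
    """Magnitude of a vector."""
    return math.sqrt(sum(c * c for c in v))

i_hat, j_hat = (1, 0, 0), (0, 1, 0)
print(cross(i_hat, j_hat))   # (0, 0, 1), i.e. k-hat
print(cross(j_hat, i_hat))   # (0, 0, -1): the product is anti-commutative

# Area of the triangle spanned by two sample vectors a and b
a = (3.0, 0.0, 0.0)
b = (0.0, 4.0, 0.0)
area = norm(cross(a, b)) / 2
print(area)                  # 6.0
```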
Example 1.18. Use the cross products of two vectors to prove the sine rule
of a triangle.
Solution:
Figure 1.43: A proof of the sine rule
The area of △ABC is $|\vec{a} \times \vec{c}|/2$. The adjacent triangle ADB has area $|\vec{c} \times \vec{b}|/2$.
However, the two triangles have the same area, so $|\vec{a} \times \vec{c}| = |\vec{c} \times \vec{b}|$. We obtain
$ca \sin B = bc \sin A$ and thus the result follows: $a/\sin A = b/\sin B$.
∎
Example 1.19. A particle is tracing a circular path about the origin with
radius $r$ and angular speed $\omega$. Express its tangential velocity and centripetal
acceleration in terms of $\vec{r}$ and $\vec{\omega}$. The direction of $\vec{\omega}$ is defined by the right-hand
rule, which has your fingers curling along the direction of rotation and the
thumb pointing in the direction of $\vec{\omega}$.
Solution:
Figure 1.44: The vectorial form of some physical quantities
Notice that the angular velocity $\vec{\omega}$ points out of the page. The tangential
velocity is $\vec{v}_t = \vec{\omega} \times \vec{r}$ and its magnitude is $\omega r$. The centripetal acceleration is
$\vec{a}_n = \vec{\omega} \times (\vec{\omega} \times \vec{r})$ and its magnitude is $\omega^2 r$.
∎
1.11 Triple Products
The scalar triple product of vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$ is defined as $\vec{a} \cdot (\vec{b} \times \vec{c})$. If
the vectors are not coplanar, its magnitude is the volume of the parallelepiped
formed by the three vectors. Therefore, $|\vec{a} \cdot (\vec{b} \times \vec{c})| = |\vec{b} \cdot (\vec{c} \times \vec{a})| = |\vec{c} \cdot (\vec{a} \times \vec{b})|$.
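Figure 1.45: The scalar triple product of vectors
If $\vec{a} = a_x \hat{i} + a_y \hat{j} + a_z \hat{k}$, $\vec{b} = b_x \hat{i} + b_y \hat{j} + b_z \hat{k}$, and $\vec{c} = c_x \hat{i} + c_y \hat{j} + c_z \hat{k}$, we obtain
$$\begin{aligned}
\vec{a} \cdot (\vec{b} \times \vec{c}) &= (a_x \hat{i} + a_y \hat{j} + a_z \hat{k}) \cdot [(b_y c_z - c_y b_z)\, \hat{i} - (b_x c_z - c_x b_z)\, \hat{j} + (b_x c_y - c_x b_y)\, \hat{k}] \\
&= a_x (b_y c_z - c_y b_z) - a_y (b_x c_z - c_x b_z) + a_z (b_x c_y - c_x b_y)
\end{aligned}$$
Equivalently, we can present the scalar triple product by a determinant, e.g.
$$\vec{a} \cdot (\vec{b} \times \vec{c}) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}$$
The circular shift: $\vec{a} \cdot (\vec{b} \times \vec{c}) = \vec{b} \cdot (\vec{c} \times \vec{a}) = \vec{c} \cdot (\vec{a} \times \vec{b})$ (Refer to D.II in section 7.4).
The vector triple product of vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$ is defined as $\vec{a} \times (\vec{b} \times \vec{c})$.
It lies on the bc-plane and is given by
$$\vec{a} \times (\vec{b} \times \vec{c}) = \vec{b}\, (\vec{a} \cdot \vec{c}) - \vec{c}\, (\vec{a} \cdot \vec{b})$$
The proof of the above relation is straightforward. Without loss of generality,
we let $\vec{a} = (a_x, a_y, a_z)$, $\vec{b} = (b_x, 0, 0)$, and $\vec{c} = (c_x, c_y, 0)$.
Plugging the vectors into $\vec{a} \times (\vec{b} \times \vec{c})$, we have
$$\begin{aligned}
\vec{a} \times (\vec{b} \times \vec{c}) &= (a_x \hat{i} + a_y \hat{j} + a_z \hat{k}) \times b_x c_y\, \hat{k} \\
&= a_y b_x c_y\, \hat{i} - a_x b_x c_y\, \hat{j} \\
&= a_y b_x c_y\, \hat{i} + a_x b_x c_x\, \hat{i} - a_x b_x c_x\, \hat{i} - a_x b_x c_y\, \hat{j} \\
&= (a_y c_y + a_x c_x)\, b_x\, \hat{i} - a_x b_x\, (c_x \hat{i} + c_y \hat{j}) \\
&= \vec{b}\, (\vec{a} \cdot \vec{c}) - \vec{c}\, (\vec{a} \cdot \vec{b})
\end{aligned}$$
It is in the form $\lambda \vec{b} + \mu \vec{c}$, where $\lambda = \vec{a} \cdot \vec{c}$ and $\mu = -\vec{a} \cdot \vec{b}$. The result reminds us
again that the vector triple product lies on the bc-plane. Similarly, we have
$\vec{b} \times (\vec{c} \times \vec{a}) = \vec{c}\, (\vec{b} \cdot \vec{a}) - \vec{a}\, (\vec{b} \cdot \vec{c})$ and $\vec{c} \times (\vec{a} \times \vec{b}) = \vec{a}\, (\vec{c} \cdot \vec{b}) - \vec{b}\, (\vec{c} \cdot \vec{a})$.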
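Both triple-product identities are easy to spot-check with concrete numbers. The sketch below uses three arbitrary integer vectors (an assumption for illustration) so that the comparison is exact.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - by * az, -(ax * bz - bx * az), ax * by - bx * ay)

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)   # arbitrary sample vectors

# Vector triple product: a x (b x c) = b (a.c) - c (a.b)
lhs = cross(a, cross(b, c))
rhs = tuple(bi * dot(a, c) - ci * dot(a, b) for bi, ci in zip(b, c))
print(lhs, rhs)   # (-12, 9, -2) (-12, 9, -2)

# Scalar triple product is unchanged under a circular shift
print(dot(a, cross(b, c)), dot(b, cross(c, a)), dot(c, cross(a, b)))  # -3 -3 -3
```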
1.12 Applications of Cross Product
1.12.1 Torque
The rotation of an object is produced by a torque $\vec{\tau}$ (the moment of force)
exerted on the object about the axis of rotation. A torque is defined
by the cross product of two vectors,
$$\vec{\tau} = \vec{r} \times \vec{F} = rF \sin\theta\, \hat{n} ,$$
where $\vec{F}$ is the applied force, $\vec{r}$ is the position vector directed from the axis of
rotation to the point of action and $\hat{n}$ is the unit vector. It is a vector quantity
whose direction is given by the right-hand rule. The definition of
torque makes sense because the radial component of $\vec{F}$ has no
effect on rotation.
Figure 1.46: A torque about the axis of rotation
Let's consider an example. A particle is connected to a massless rod
which has its other end pivoted at O, as shown in figure 1.46. The length
of the rod is $r$. The torque becomes zero if the force $\vec{F}$ is along the rod,
i.e. $\theta = 0°$ or $180°$. The torque is a maximum if the force $\vec{F}$ is normal to
the rod, i.e. $\theta = 90°$. The direction of a torque (given by $\hat{n}$) indicates the
direction of the rotation, either counterclockwise or clockwise. Notice that
$\vec{\tau} = \vec{r} \times \vec{F} = r\, (F \sin\theta)\, \hat{n} = F\, (r \sin\theta)\, \hat{n}$. In other words, we can consider the
normal distance of $\vec{F}$ from O, the moment arm for rotation, i.e. $r \sin\theta$. If
there are many forces acting on the particle, the net torque is
$$\vec{\tau}_{net} = \sum_i^n \vec{\tau}_i = \sum_i^n (\vec{r} \times \vec{F}_i) = \vec{r} \times \left(\sum_i^n \vec{F}_i\right) = \vec{r} \times \vec{F}_{net} .$$
1.12.2 Magnetic Force
Another physical application which relies on the cross
product is the magnetic force exerted on a moving charge. When a charge
of $q$ coulombs travels with velocity $\vec{v}$ in a region of magnetic field
$\vec{B}$, the magnetic force acting on the charge is
$$\vec{F} = q\, (\vec{v} \times \vec{B})$$
Figure 1.47: The magnetic force exerted on a moving charge
It means that the charge experiences no magnetic force if it is stationary or
if it is moving along the magnetic field. The force is a maximum if the charge
travels perpendicular to the magnetic field, figure 1.47. The direction of the
force points into the page if $q$ is positive. A neutral particle experiences
no magnetic force in the magnetic region. As the magnetic force is always
normal to the motion of the charge, there is no work done on the charge by
the magnetic force. The kinetic energy of the charge remains unchanged when
the charge performs uniform circular motion of radius $r$. The magnetic
force provides the centripetal force, figure 1.48. If the velocity of the charge
$v$ is perpendicular to the magnetic field $B$, then we have
$$F = qvB = \frac{mv^2}{r} , \qquad r = \frac{mv}{qB}$$
Figure 1.48: The circular motion of a charge in the magnetic region
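To get a feel for the size of the orbit radius $r = mv/(qB)$, the sketch below evaluates it for an electron. The speed and field strength are hypothetical sample values, and the electron mass and charge are rounded constants.

```python
import math

M_ELECTRON = 9.109e-31   # electron mass in kg (rounded)
Q_ELECTRON = 1.602e-19   # magnitude of the electron charge in C (rounded)

def orbit_radius(m, v, q, B):
    """Radius of the circular orbit when v is perpendicular to B: r = m v / (q B)."""
    return m * v / (q * B)

v = 1.0e6    # speed in m/s (hypothetical)
B = 1.0e-3   # magnetic field in T (hypothetical)

r = orbit_radius(M_ELECTRON, v, Q_ELECTRON, B)
print(r)     # about 5.7e-3 m

# Consistency check: the magnetic force equals the required centripetal force
print(math.isclose(Q_ELECTRON * v * B, M_ELECTRON * v**2 / r))  # True
```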
Chapter 2
Differentiation
2.1 Basic Ideas and the Extremum
The derivative of a function $y = f(x)$ is denoted by $y'$, $f'(x)$ or $dy/dx$, which
represents the slope (gradient) of the function at $x$.
$$f'(x) = \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
A function $f(x)$ is said to be differentiable if $f'(x)$ exists. The slope of the
curve at $P(x_0, y_0)$ is $f'(x_0)$ as shown in figure 2.1.
Figure 2.1: The derivative of a function and its tangent line
A differentiable function is also a continuous function. The geometrical properties of $f(x)$ can be shown by the derivatives of $f$.
• A function $f(x)$ is increasing in the interval $a < x < b$ if
$f'(x) > 0$ for any $a < x < b$.
• A function $f(x)$ is decreasing in the interval $a < x < b$ if
$f'(x) < 0$ for any $a < x < b$.
• If $f(x)$ is twice differentiable at $x_0$ with $f'(x_0) = 0$ and $f''(x_0) < 0$, then $f(x)$ has a local maximum at $x_0$.
• If $f(x)$ is twice differentiable at $x_0$ with $f'(x_0) = 0$ and $f''(x_0) > 0$, then $f(x)$ has a local minimum at $x_0$.
When computing derivatives, a useful relation to
facilitate the calculations is the chain rule:
If $f$ is a function of $u$, and $u$ is a function of $x$, then we have
$$\frac{df(u)}{dx} = \frac{df(u)}{du} \frac{du(x)}{dx} .$$
We can also write it as
$$\frac{d}{dx} f(u(x)) = [f'(u(x))]\,[u'(x)] .$$
If $y = f(x)$, we have the reciprocal rule, i.e. $\dfrac{dy}{dx} = 1 \Big/ \dfrac{dx}{dy}$.
The total differential of a differentiable function $y = f(x)$ is
$$dy = f'(x)\, dx ,$$
where the derivative $f'(x)$ links the change in $y$ to the change in $x$.
Example 2.1. Find the percentage error in the volume of a cube if the
measured length of its side has a percentage error of 0.1 %.
Solution: The volume of a cube of side $L$ is $V = L^3$, which gives $V'(L) =
dV/dL = 3L^2$. Then, the total differential of $V$ is
$$dV = V'(L)\, dL = 3L^2\, dL$$
Therefore, we obtain
$$\frac{dV}{V} = \frac{3L^2\, dL}{V} = 3 \left(\frac{dL}{L}\right) = 3 \left(\frac{0.1}{100}\right) = 0.003$$
The percentage error of the volume is $\dfrac{dV}{V} \times 100\ \% = 0.3\ \%$.
∎
2.2 Derivatives of Physical Quantities
Derivatives are widely used to define physical quantities. Typical examples
are the velocity and acceleration of an object. A particle moving in a coordinate system with position vector $\vec{r}$ at time $t$ has its velocity $\vec{v}$ and acceleration
$\vec{a}$ given by
$$\vec{v} = \lim_{\Delta t \to 0} \frac{\Delta \vec{r}}{\Delta t} = \frac{d\vec{r}}{dt} \quad \text{and} \quad \vec{a} = \lim_{\Delta t \to 0} \frac{\Delta \vec{v}}{\Delta t} = \frac{d\vec{v}}{dt} = \frac{d^2 \vec{r}}{dt^2}$$
If the motion is along a straight line, we simply have
$$v = \frac{dx}{dt} \quad \text{and} \quad a = \frac{dv}{dt} = \frac{d^2 x}{dt^2}$$
More examples: Power is the work done per unit time, $P = dW/dt$. In an
electric circuit, the current $I$ in a wire is $dq/dt$, the rate
at which charge flows through the wire. Other than time derivatives, some
quantities are defined as derivatives with respect to other variables. For instance, the linear density of a non-uniform
chain is represented by $\lambda = dm/dx$. In thermodynamics, the molar heat
capacity of a gas at constant volume is the amount of heat $Q$ needed by one
mole of gas to increase its temperature by one unit, $c_V = dQ/dT$.
Example 2.2. An object moving along a straight line has an acceleration a,
where a is the time derivative of velocity, i.e. dv/dt. Express a again without
time explicitly in your answer.
Solution: From the definition of $a$, we have
$$a = \frac{dv}{dt} = \frac{dv}{dx} \frac{dx}{dt} = \left(\frac{dv}{dx}\right) v ,$$
where $v$ and $x$ are the velocity and displacement of the object at time $t$.
There is another convention adopted by many textbooks: $\dot{x} = v = \dfrac{dx}{dt}$, and
$\ddot{x} = a = \dfrac{d^2 x}{dt^2}$. The above result can be expressed as $\ddot{x} = \left(\dfrac{d\dot{x}}{dx}\right) \dot{x}$. It is a useful
technique if one wants to relate $\dot{x}$ and $x$ in a differential equation.
∎
Example 2.3. A rod AC has a mark B on it such that AB = l1 and BC = l2 .
Suppose points A and B are mounted on the y-axis and x-axis respectively
such that the rod is movable on the axis frame, as shown in figure 2.2. Show
that C moves on an elliptical locus. Find also the velocities of B and C when
A moves toward O with a uniform speed v0 .
Figure 2.2: The movement of a rod under constraints
Solution: Referring to figure 2.2, we label the coordinates of C as $(x_C, y_C)$,
where
$$\begin{cases}
x_C = l_1 \cos\theta + l_2 \cos\theta = (l_1 + l_2) \cos\theta \\
y_C = -l_2 \sin\theta
\end{cases}$$
Eliminating $\theta$ from the above equations, we obtain the locus of C. It is an
ellipse.
$$\left(\frac{x_C}{l_1 + l_2}\right)^2 + \left(\frac{y_C}{l_2}\right)^2 = 1$$
Now, the coordinates of A are
$$\begin{cases}
x_A = 0 \\
y_A = l_1 \sin\theta
\end{cases}$$
which gives $\dot{y}_A = -v_0 = l_1 \dot{\theta} \cos\theta$ and thus
$$\dot{\theta} = \frac{-v_0}{l_1 \cos\theta} \tag{2.1}$$
The coordinates of B are
$$\begin{cases}
x_B = l_1 \cos\theta \\
y_B = 0
\end{cases}$$
which gives $\dot{x}_B = v_B = -l_1 \dot{\theta} \sin\theta$. Using equation 2.1 we obtain $v_B = v_0 \tan\theta$.
Differentiating the coordinates of C and using equation 2.1 again, we have the
velocities of C along the x- and y-axes respectively.
$$\begin{cases}
\dot{x}_C = -(l_1 + l_2)\, \dot{\theta} \sin\theta = \dfrac{v_0 (l_1 + l_2) \tan\theta}{l_1} \\[2ex]
\dot{y}_C = -l_2\, \dot{\theta} \cos\theta = \dfrac{l_2 v_0}{l_1}
\end{cases}$$
The physical quantities such as displacement, velocity, and acceleration are
defined in the coordinate system as functions of time. This example applies
simple ideas in coordinate geometry to solve the problems.
∎
2.3 Centripetal Acceleration
An object performing uniform circular motion of radius $r$ with speed $v$
has acceleration $a$, where $a$ is referred to as the centripetal acceleration. The
magnitude of $a$ is $v^2/r$ and its direction always points toward the center of
rotation. The proof is given as follows.
Figure 2.3: The uniform circular motion
Consider the object traveling from $P_1$ to $P_2$ along the circular path during
a time interval $\Delta t$. Without loss of generality, we let $P_1$ and $P_2$ be two
points symmetric about the y-axis. At $P_1$ the velocity is $\vec{v}_1 =
v \cos\theta\, \hat{i} + v \sin\theta\, \hat{j}$, and at $P_2$ the velocity is $\vec{v}_2 = v \cos\theta\, \hat{i} - v \sin\theta\, \hat{j}$. The time
interval is $\Delta t = 2r\theta/v$. The x-component of the average acceleration is
$$a_{ave,x} = \frac{v_{2x} - v_{1x}}{\Delta t} = 0$$
because the x-components of $\vec{v}_1$ and $\vec{v}_2$ are the same: $v_{1x} = v_{2x} = v \cos\theta$.
Hence, the x-component of the instantaneous acceleration is
$$a_x = \lim_{\Delta t \to 0} a_{ave,x} = 0$$
The y-component of the average acceleration is
$$a_{ave,y} = \frac{v_{2y} - v_{1y}}{\Delta t} = \frac{-v \sin\theta - v \sin\theta}{2r\theta/v} = \frac{-2v \sin\theta}{2r\theta/v} = -\frac{v^2}{r} \left(\frac{\sin\theta}{\theta}\right)$$
Hence, the y-component of the instantaneous acceleration is
$$a_y = \lim_{\Delta t \to 0} a_{ave,y} = -\frac{v^2}{r} \left(\lim_{\theta \to 0} \frac{\sin\theta}{\theta}\right) = -\frac{v^2}{r} ,$$
where we have used the fact that $\lim_{\theta \to 0} \dfrac{\sin\theta}{\theta} = 1$, see section 2.8.
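The limit in the proof above can also be watched numerically: as θ shrinks, the average y-acceleration between the two symmetric points approaches $-v^2/r$. The following is a minimal sketch with arbitrary sample values of $v$ and $r$.

```python
import math

def avg_accel_y(v, r, theta):
    """Average y-acceleration between P1 and P2, which subtend the angle 2*theta."""
    dt = 2 * r * theta / v                      # travel time along the arc
    v1y, v2y = v * math.sin(theta), -v * math.sin(theta)
    return (v2y - v1y) / dt

v, r = 3.0, 2.0                                 # illustrative values (SI units)
for theta in (1.0, 0.1, 0.01, 0.001):
    print(theta, avg_accel_y(v, r, theta))      # tends to -v**2 / r = -4.5
```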
2.4 Seeking the Extremum
Differential calculus is an efficient tool for finding the local extrema of a function. A local extremum of a differentiable function is a turning point (stationary point), where
the first order derivative of the function is zero. The second order derivative
of the function helps to determine whether the turning point is
a maximum point or a minimum point. Sometimes, we do not spend time
working out the second order derivative for verification because the physical
picture of the system naturally tells us which case applies.
Example 2.4. Consider light passing from one medium with index of refraction n1 into another medium with index of refraction n2 . Use Fermat’s
principle to derive the law of refraction: n1 sin θ1 = n2 sin θ2 .
Fermat’s Principle: Light travels by the path that takes the least amount of
time.
Figure 2.4: The refraction of light
Solution: Consider a beam which passes through point A in medium 1 and
point B in medium 2, where A and B have vertical distances h1 and h2 from
the interface of the two media. Let A1 and B1 be the points of projection
from A and B on the interface and C be the intersecting point when the
beam from A meets the interface. Denote A1 B1 as a and A1 C as x, where
0 ≤ x ≤ a.
The total travelling time for paths AC and CB is
$$T(x) = \frac{AC}{v_1} + \frac{CB}{v_2} = \frac{\sqrt{h_1^2 + x^2}}{v_1} + \frac{\sqrt{h_2^2 + (a - x)^2}}{v_2}$$
The first and second derivatives of $T$ with respect to $x$ are
$$T'(x) = \frac{1}{v_1} \cdot \frac{x}{\sqrt{h_1^2 + x^2}} - \frac{1}{v_2} \cdot \frac{a - x}{\sqrt{h_2^2 + (a - x)^2}}$$
$$T''(x) = \frac{1}{v_1} \cdot \frac{h_1^2}{(h_1^2 + x^2)^{3/2}} + \frac{1}{v_2} \cdot \frac{h_2^2}{[h_2^2 + (a - x)^2]^{3/2}} > 0$$
The turning point of $T(x)$ can be obtained by solving the equation
$T'(x) = 0$. Since $T''(x)$ is positive, the turning point of $T(x)$ is a minimum,
which is the minimum time stated in Fermat's principle. Setting $T'(x) = 0$ gives
$$\frac{1}{v_1} \cdot \frac{x}{\sqrt{h_1^2 + x^2}} = \frac{1}{v_2} \cdot \frac{a - x}{\sqrt{h_2^2 + (a - x)^2}}$$
From the geometry of figure 2.4, $\sin\theta_1 = x/\sqrt{h_1^2 + x^2}$ and $\sin\theta_2 = (a - x)/\sqrt{h_2^2 + (a - x)^2}$, so
$$\frac{1}{v_1} \sin\theta_1 = \frac{1}{v_2} \sin\theta_2$$
Multiplying both sides by the speed of light in free space, $c$, we obtain
$$\frac{c}{v_1} \sin\theta_1 = \frac{c}{v_2} \sin\theta_2$$
$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$
The last equation is Snell's law for the refraction of light, where $n_i = c/v_i$
and $i = 1, 2$.
∎
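Fermat's principle can be tested numerically by locating the root of $T'(x)$ and then checking Snell's law at that point. The sketch below finds the root by bisection, which is valid here because $T''(x) > 0$ makes $T'(x)$ increasing on $0 \le x \le a$. The geometry and refractive indices are hypothetical sample values, and units with $c = 1$ are assumed.

```python
import math

def t_prime(x, h1, h2, a, v1, v2):
    """Derivative T'(x) of the total travel time."""
    return x / (v1 * math.hypot(h1, x)) - (a - x) / (v2 * math.hypot(h2, a - x))

def fermat_point(h1, h2, a, v1, v2, iterations=100):
    """Solve T'(x) = 0 on [0, a] by bisection (T' is negative at 0, positive at a)."""
    lo, hi = 0.0, a
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if t_prime(mid, h1, h2, a, v1, v2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = 1.0                      # speed of light in units where c = 1 (assumption)
n1, n2 = 1.0, 1.5            # sample refractive indices
h1, h2, a = 1.0, 1.0, 2.0    # sample geometry

x = fermat_point(h1, h2, a, c / n1, c / n2)
sin1 = x / math.hypot(h1, x)
sin2 = (a - x) / math.hypot(h2, a - x)
print(n1 * sin1, n2 * sin2)  # the two sides of Snell's law agree
```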
Example 2.5. A uniform rod of length 2a is placed with its lower end inside
a smooth bowl. The bowl is a hemispherical hollow of radius a and it is fixed
on a horizontal plane. Find the equilibrium position of the rod.
Figure 2.5: The rod in a bowl
Solution: Denote G as the center of mass of the rod. The vertical distance
of G from the x-axis is $y$, where
$$y = AG \sin\theta = (AB - GB) \sin\theta = (2a \cos\theta - a) \sin\theta$$
When the rod is at equilibrium in the bowl, it has the lowest gravitational potential energy. In other words, the vertical distance $y$ attains its
maximum. The derivative of $y$ with respect to $\theta$ is
$$y' = 2a\, (\cos^2\theta - \sin^2\theta) - a \cos\theta = 4a \cos^2\theta - a \cos\theta - 2a$$
The turning point of $y$ satisfies $y' = 0$, which gives $4a \cos^2\theta - a \cos\theta - 2a = 0$.
The rod reaches its equilibrium when the angle of inclination is $\theta_0 = \cos^{-1}\left[(1 + \sqrt{33})/8\right] = 32.5°$. One may check that $y''|_{\theta_0} < 0$, which indicates the maximum
value of $y$ at $\theta = \theta_0$.
∎
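The equilibrium angle of Example 2.5 follows from the quadratic $4\cos^2\theta - \cos\theta - 2 = 0$. The short sketch below evaluates its positive root and confirms the sign of the second derivative; taking $a = 1$ is an assumption that does not affect the angle.

```python
import math

# Positive root of 4 c^2 - c - 2 = 0, where c = cos(theta0)
cos_theta0 = (1 + math.sqrt(33)) / 8
theta0 = math.acos(cos_theta0)

def y_prime(theta, a=1.0):
    """First derivative of y = (2a cos(theta) - a) sin(theta)."""
    return 4 * a * math.cos(theta) ** 2 - a * math.cos(theta) - 2 * a

def y_double_prime(theta, a=1.0):
    """Second derivative of y, obtained by differentiating y_prime."""
    return a * math.sin(theta) * (1 - 8 * math.cos(theta))

print(math.degrees(theta0))       # about 32.5 degrees
print(y_prime(theta0))            # approximately 0
print(y_double_prime(theta0))     # negative, so y is a maximum
```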
Example 2.6. A light cord of length $l$ has one of its ends connected to a
particle of mass $m$, while the other end is fixed at a point O on the ceiling. Initially, the cord is kept horizontal and the particle is at a distance $l$
from O such that the cord is taut; then the particle is released to fall under
gravity. Find the angle that the cord makes with the vertical when the
particle attains its maximum vertical speed.
Figure 2.6: The swinging particle
Solution: The particle has zero vertical speed when it is located at the
initial position and at the lowest position. It means that the vertical speed of
the particle has a turning point while it is descending. Here, the turning
point is the maximum vertical speed, reached when the cord makes an angle $\theta'$
with the vertical. The turning point occurs when the net vertical
force exerted on the particle is zero, i.e. $F_y = 0$. Thus,
$$T \cos\theta' = mg \tag{2.2}$$
The conservation of mechanical energy gives
$$mgl \cos\theta' = \frac{1}{2} mv^2 \tag{2.3}$$
The particle performs circular motion with radius $l$ because the net force
along the cord provides the centripetal force
$$T - mg \cos\theta' = \frac{mv^2}{l} \tag{2.4}$$
Eliminating $v$ from equations 2.3 and 2.4, we obtain $T = 3mg \cos\theta'$. Using
this equation and equation 2.2 to eliminate $T$, we obtain $\cos\theta' = 1/\sqrt{3}$, which
gives $\theta' = 54.7°$.
∎
2.5 Case Study on Projectile Motion
Example 2.7. A ball is projected, with speed u and angle of elevation α
from the floor. Find the condition of α such that the ball is always moving
further away from the point of projection.
Figure 2.7: The projectile of a ball
Solution: Let the point of projection be the origin of the Cartesian coordinate system. The x and y coordinates of the ball at time $t$ are
$$\begin{cases}
x = (u \cos\alpha)\, t \\
y = (u \sin\alpha)\, t - \dfrac{1}{2} g t^2
\end{cases}$$
Hence, we have $r^2 = x^2 + y^2 = (u^2 \cos^2\alpha)\, t^2 + (u^2 \sin^2\alpha)\, t^2 + \dfrac{1}{4} g^2 t^4 - (u \sin\alpha)\, g t^3$,
which implies
$$r^2 = u^2 t^2 + \frac{1}{4} g^2 t^4 - (u \sin\alpha)\, g t^3$$
Differentiating both sides of the above equation with respect to $t$, we have
$$\frac{dr^2}{dt} = 2u^2 t + g^2 t^3 - 3t^2 ug \sin\alpha$$
If the distance between the ball and the point of projection increases with
time, $r(t)$ is an increasing function (or simply $r^2(t)$ is an increasing
function). Then we can write $dr^2/dt > 0$. Dividing by $t > 0$, a quadratic inequality in $t$ follows:
$g^2 t^2 - 3ugt \sin\alpha + 2u^2 > 0$. This inequality holds for all $t$ if the discriminant of the
quadratic expression is less than zero, i.e. $\Delta < 0$. Then we have $9u^2 g^2 \sin^2\alpha -
8u^2 g^2 < 0$. After solving, we obtain $\sin\alpha < 2\sqrt{2}/3$. In other words, the
distance between the ball and the point of projection is always increasing if
the angle of projection is less than 70.5°.
∎
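The 70.5° threshold of Example 2.7 can be checked by sampling $dr^2/dt$ over the flight. This is a rough numerical sketch rather than a proof; the launch speed, the number of samples and the two test angles are arbitrary assumptions.

```python
import math

def always_receding(u, alpha_deg, g=9.8, samples=2000):
    """Return True if d(r^2)/dt > 0 at every sampled time during the flight."""
    s = math.sin(math.radians(alpha_deg))
    t_flight = 2 * u * s / g                     # time to return to the floor
    for k in range(1, samples):
        t = t_flight * k / samples
        dr2_dt = 2 * u**2 * t + g**2 * t**3 - 3 * t**2 * u * g * s
        if dr2_dt <= 0:
            return False
    return True

critical = math.degrees(math.asin(2 * math.sqrt(2) / 3))
print(critical)                     # about 70.5 degrees
print(always_receding(20.0, 60.0))  # True: below the critical angle
print(always_receding(20.0, 80.0))  # False: the ball approaches the origin for a while
```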
2.6 A Revisit to Newton's Second Law
In high school physics, Newton's second law is presented in the simplest form
$F = ma$, where $m$ is regarded as a point mass of constant value. In fact, the
more general description of Newton's second law is $F = dp/dt$, where $p$ is the
momentum of the mass system. Notice that $F = dp/dt$ reduces to $F = ma$ if
$m$ is a point mass of constant value.
$$F = \frac{dp}{dt} = \frac{d(mv)}{dt} = m \frac{dv}{dt} = ma$$
The following example deals with the motion of a long chain which has
two portions, the moving part and the stationary part. The formula $F = ma$
is still applicable to the problem if one can locate the center of mass
of the chain before further calculations. But that approach is a bit clumsy
and time consuming. An easier approach is to adopt
$F = dp/dt$.
Example 2.8.
A uniform open-link chain of mass $\rho$ per unit length and total length $L$ has
one of its ends fixed at the ceiling. The free end is released from rest at
$x = 0$ and falls under gravity as shown in figure 2.8. Find the force $R$ that
supports the fixed end. Express your answer in terms of $x$.
Figure 2.8: The falling chain
Solution: Newton's second law states that $F_{net} = \dfrac{dp}{dt}$, where $F_{net}$ is the net
force acting on an object and $p$ is the momentum of the object. Recall that
$p = mv$, where $m$ and $v$ are the mass and the velocity of the object respectively. In this problem $v = \dot{x}$. The right portion falls after it is released. The
left portion has length $(L + x)/2$ and the right portion has length $(L - x)/2$
when the end point of the right portion is at a distance $x$ below the ceiling. Figure 2.9 shows the mass distribution of the chain on its two portions.
Notice that the momentum of the right portion is $\dfrac{L - x}{2} \rho \dot{x}$ and that of
the left portion is zero. Since the right portion is in free fall, its velocity
and acceleration are governed by $\dot{x}^2 = 2gx$ and $\ddot{x} = g$ respectively. Here
we have taken downward as positive because $x$ is measured downward
from the ceiling.
Figure 2.9: The net force on the chain
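Notice also that the net force on the entire chain is $\rho L g - R$, where $R$ is
the force exerted on the chain by the ceiling and $\rho g L$ is the weight of the
chain. Hence, we have
$$\begin{aligned}
-R + \rho L g &= \frac{d}{dt} \left\{ \left(\frac{L - x}{2}\right) \rho \dot{x} \right\} \\
&= \frac{\rho}{2} \left\{ \ddot{x}\, (L - x) - \dot{x}^2 \right\} \\
&= \frac{\rho}{2} \left\{ g\, (L - x) - 2gx \right\}
\end{aligned}$$
Therefore, we obtain $R = \dfrac{\rho g}{2} (L + 3x)$.
When the entire chain is just unfolded, $x = L$, the force that supports the
chain is $R = 2\rho g L$.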
∎
2.7 Electric Potential and Electric Field
If a charged system is symmetric about the x-axis, the electric field $\vec{E}$ at a
point P on the x-axis points along the x-axis. The electric potential $V$ at P
is related to the electric field $\vec{E}$ by
$$\vec{E} = -\frac{dV}{dx}\, \hat{i}$$
Notice that V is the work done by an external agent to move a unit positive
charge from infinity to point P under the influence of the charged system.
The above expression provides an easier way to find the vector field through
a scalar function.
Example 2.9. A uniformly charged ring of radius $r$ and total charge $Q$
produces an electric field around it. If the electric potential at a point P on the
x-axis is given by
$$V = \frac{Q}{4\pi\varepsilon_0} \frac{1}{\sqrt{r^2 + x^2}} ,$$
find the electric field at point P.
Figure 2.10: A uniformly charged ring
Solution: Due to symmetry, the electric field at point P is directed along
the x-axis and it is given by $E_x = -dV/dx$.
$$E_x = -\frac{Q}{4\pi\varepsilon_0} \frac{d}{dx} \left(\frac{1}{\sqrt{r^2 + x^2}}\right) = \frac{Q}{4\pi\varepsilon_0} \frac{x}{(r^2 + x^2)^{3/2}} .$$
Thus, the electric field at P is $\vec{E} = \dfrac{Q}{4\pi\varepsilon_0} \dfrac{x}{(r^2 + x^2)^{3/2}}\, \hat{i}$. One can check that
the field strength converges to $\dfrac{Q}{4\pi\varepsilon_0 x^2}$ if $x \gg r$. This result makes sense
because the ring can be regarded as a point charge with charge $Q$ when the
measurement is performed very far away from the ring.
∎
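The relation $E_x = -dV/dx$ in Example 2.9 can be verified with a central finite difference. The sketch below compares the numerical derivative of the ring potential with the analytic field; the charge, ring radius, step size and rounded value of $\varepsilon_0$ are illustrative assumptions.

```python
import math

EPS0 = 8.854e-12   # vacuum permittivity in F/m (rounded)

def potential(x, Q, r):
    """Potential on the axis of a uniformly charged ring."""
    return Q / (4 * math.pi * EPS0 * math.sqrt(r * r + x * x))

def field_analytic(x, Q, r):
    """Axial field from the closed-form derivative of the potential."""
    return Q * x / (4 * math.pi * EPS0 * (r * r + x * x) ** 1.5)

def field_numeric(x, Q, r, h=1e-6):
    """E_x = -dV/dx estimated by a central difference with step h."""
    return -(potential(x + h, Q, r) - potential(x - h, Q, r)) / (2 * h)

Q, r = 1e-9, 0.1   # 1 nC on a ring of radius 10 cm (hypothetical)
x = 0.05
print(field_numeric(x, Q, r), field_analytic(x, Q, r))   # nearly equal

# Far from the ring the field approaches that of a point charge
x_far = 100 * r
point_charge = Q / (4 * math.pi * EPS0 * x_far ** 2)
print(field_analytic(x_far, Q, r) / point_charge)        # close to 1
```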
2.8 L’ Hôpital’s Rule
Suppose both the functions $f(x)$ and $g(x)$ are differentiable near $x = a$,
$f(a) = g(a) = 0$ and $g'(a) \neq 0$. Then,
$$\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)} .$$
Proof. We just need to recognize that
$$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$$
and similarly for $g'(a)$. Then,
$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f(x) - f(a)}{g(x) - g(a)} = \lim_{x \to a} \frac{\dfrac{f(x) - f(a)}{x - a}}{\dfrac{g(x) - g(a)}{x - a}} = \frac{f'(a)}{g'(a)} .$$
The indeterminate forms: $\dfrac{0}{0}$ and $\dfrac{\infty}{\infty}$
Rule 1: $\dfrac{0}{0}$ form
Suppose the functions $f(x)$ and $g(x)$ are differentiable near $x = a$ and $f(a) =
g(a) = 0$. Then,
$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$$
Rule 2: $\dfrac{\infty}{\infty}$ form
Suppose the functions $f(x)$ and $g(x)$ are differentiable near $x = a$ and
$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = \infty$. Then,
$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$$
Example 2.10. Find $\lim_{x \to 0} \dfrac{\sin x}{x}$ and $\lim_{x \to 0} \dfrac{e^x - 1}{x}$.
Solution: The expression $\lim_{x \to 0} \dfrac{\sin x}{x}$ has the indeterminate form $\left[\dfrac{0}{0}\right]$. Applying
L’ Hôpital’s rule, we have
$$\lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{\cos x}{1} = 1 .$$
Notice also that $\lim_{x \to 0} \dfrac{e^x - 1}{x}$ has the indeterminate form $\left[\dfrac{0}{0}\right]$. L’ Hôpital’s rule
gives
$$\lim_{x \to 0} \frac{e^x - 1}{x} = \lim_{x \to 0} \frac{e^x}{1} = 1 .$$
∎
Example 2.11. Find $\lim_{x \to 1} \dfrac{1 + \cos \pi x}{x^2 - 2x + 1}$.
Solution: The expression $\lim_{x \to 1} \dfrac{1 + \cos \pi x}{x^2 - 2x + 1}$ has the indeterminate
form $\left[\dfrac{0}{0}\right]$. L’ Hôpital’s rule, applied twice, gives
$$\lim_{x \to 1} \frac{1 + \cos \pi x}{x^2 - 2x + 1} = \lim_{x \to 1} \frac{-\pi \sin \pi x}{2x - 2} = \lim_{x \to 1} \frac{-\pi^2 \cos \pi x}{2} = \frac{\pi^2}{2} .$$
∎
Example 2.12. Find $\lim_{x \to 0} \dfrac{1 - \cos x^2}{\sin^2 x}$.
Solution: The expression $\lim_{x \to 0} \dfrac{1 - \cos x^2}{\sin^2 x}$ has the indeterminate form $\left[\dfrac{0}{0}\right]$.
Applying L’ Hôpital’s rule, we have
$$\lim_{x \to 0} \frac{1 - \cos x^2}{\sin^2 x} = \lim_{x \to 0} \frac{2x \sin x^2}{2 \sin x \cos x} = \left(\lim_{x \to 0} \frac{x}{\sin x}\right) \left(\lim_{x \to 0} \frac{\sin x^2}{\cos x}\right) = 1 \cdot 0 = 0 .$$
In the calculation, we do not apply L’ Hôpital’s rule a second time, although it works
and gives the correct answer, because that approach is a bit clumsy. Instead,
we evaluate the limits of the individual factors, provided they exist, as shown
above.
∎
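Example 2.13. Find $\lim_{x \to +\infty} \left(\dfrac{x + c}{x - c}\right)^x$, where $c = \ln 2$.
Solution: Notice that
$$\left(\frac{x + c}{x - c}\right)^x = e^{\ln \left(\frac{x + c}{x - c}\right)^x} = e^{x \ln \left(\frac{x + c}{x - c}\right)} = \exp \left\{ \frac{\ln \left(\frac{x + c}{x - c}\right)}{\frac{1}{x}} \right\}$$
The expression in the curly bracket has the indeterminate form $\left[\dfrac{0}{0}\right]$ when
$x \to +\infty$. Using the fact that $\ln \left(\dfrac{x + c}{x - c}\right) = \ln(x + c) - \ln(x - c)$ and applying
L’ Hôpital’s rule, we have
$$\lim_{x \to +\infty} \frac{\ln \left(\dfrac{x + c}{x - c}\right)}{\dfrac{1}{x}} = \lim_{x \to +\infty} \frac{\dfrac{1}{x + c} - \dfrac{1}{x - c}}{-\dfrac{1}{x^2}} = \lim_{x \to +\infty} \frac{2c\, (x^2)}{x^2 - c^2} = 2c = 2 \ln 2 = \ln 4$$
Therefore, $\lim_{x \to +\infty} \left(\dfrac{x + c}{x - c}\right)^x = 4$, if $c = \ln 2$.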
∎
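A limit obtained by L’ Hôpital’s rule can be sanity-checked by evaluating the expression at increasingly large arguments. The sketch below does this for Example 2.13; the sample values of $x$ are arbitrary, and floating-point rounding limits how far the check can be pushed.

```python
import math

c = math.log(2)

def g(x):
    """The expression ((x + c) / (x - c))**x from Example 2.13."""
    return ((x + c) / (x - c)) ** x

for x in (10.0, 1e2, 1e4, 1e6):
    print(x, g(x))   # values approach 4 as x grows
```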
Example 2.14. A particle of mass $m$ is thrown vertically upward with velocity $v_0$ in a resistive medium. It is found that the time for it to reach the
maximum height is
$$t = \frac{m}{k} \ln \left(1 + \frac{k v_0}{mg}\right),$$
where $k$ is a constant related to the strength of the drag force, and the resistance
increases with $k$. Show that the above expression converges to the time required
in the ideal (drag-free) case as $k \to 0$.
Solution: For an ideal medium, the time required is $v_0/g$. Let's compute
$\lim_{k \to 0} \dfrac{m}{k} \ln \left(1 + \dfrac{k v_0}{mg}\right)$ by using L’ Hôpital’s rule. One can check that the expression
has the indeterminate form $\left[\dfrac{0}{0}\right]$.
$$\lim_{k \to 0} \frac{m}{k} \ln \left(1 + \frac{k v_0}{mg}\right) = \lim_{k \to 0} \frac{m \ln \left(1 + \dfrac{k v_0}{mg}\right)}{k} = \lim_{k \to 0} \frac{\dfrac{m \left(\dfrac{v_0}{mg}\right)}{1 + \dfrac{k v_0}{mg}}}{1} = \lim_{k \to 0} \frac{m v_0}{mg + k v_0} = \frac{v_0}{g}$$
The proof is completed.
∎
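The same limit can be observed numerically by evaluating the time-to-top formula for smaller and smaller $k$. In the sketch below, the mass, launch speed and drag constants are hypothetical values; `math.log1p` is used so that $\ln(1 + \epsilon)$ stays accurate for tiny $\epsilon$.

```python
import math

def time_to_top(k, m, v0, g=9.8):
    """Time to reach maximum height with drag constant k: (m/k) ln(1 + k v0 / (m g))."""
    return (m / k) * math.log1p(k * v0 / (m * g))

m, v0, g = 1.0, 10.0, 9.8     # illustrative values (SI units)
ideal = v0 / g                # drag-free result

for k in (1.0, 0.1, 1e-3, 1e-6):
    print(k, time_to_top(k, m, v0, g))   # approaches v0 / g as k -> 0
print(ideal)
```

Stronger drag (larger k) shortens the time to the top, consistent with the resistance increasing with k.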
2.9 Taylor's Series
Assume $f(x)$ is infinitely differentiable at $a$, then
$$f(x) = f(a) + f'(a)\, (x - a) + \frac{1}{2!} f''(a)\, (x - a)^2 + \frac{1}{3!} f^{(3)}(a)\, (x - a)^3 + \cdots + \frac{1}{n!} f^{(n)}(a)\, (x - a)^n + \cdots , \tag{2.5}$$
where $f^{(n)}$ is the n-th derivative of $f(x)$. The above expression is called the
Taylor's series of $f(x)$ about a reference point at $x = a$.
Proof. Assuming that $f(x)$ can be expanded in a power series in $x - a$, let
\[
f(x) = A_0 + A_1 (x-a) + A_2 (x-a)^2 + A_3 (x-a)^3 + \cdots + A_n (x-a)^n + \cdots \tag{2.6}
\]
Differentiating both sides with respect to $x$, successively $n$ times,
\begin{align*}
f'(x) &= A_1 + 2A_2 (x-a) + 3A_3 (x-a)^2 + \cdots + nA_n (x-a)^{n-1} + \cdots \\
f''(x) &= 1 \cdot 2 A_2 + 2 \cdot 3 A_3 (x-a) + \cdots + n(n-1) A_n (x-a)^{n-2} + \cdots \\
f'''(x) &= 1 \cdot 2 \cdot 3 A_3 + \cdots + n(n-1)(n-2) A_n (x-a)^{n-3} + \cdots \\
&\ \,\vdots \\
f^{(n)}(x) &= n!\,A_n + \text{terms in } (x-a), \text{ etc.}
\end{align*}
Using $x = a$ in these $n + 1$ equations, we have
\[
f(a) = A_0 , \quad f'(a) = A_1 , \quad f''(a) = 2!\,A_2 , \quad f'''(a) = 3!\,A_3 , \ \cdots , \quad f^{(n)}(a) = n!\,A_n , \ \cdots
\]
Therefore,
\[
A_0 = f(a), \quad A_1 = f'(a), \quad A_2 = \frac{f''(a)}{2!} , \quad A_3 = \frac{f'''(a)}{3!} , \ \cdots , \quad A_n = \frac{f^{(n)}(a)}{n!} , \ \cdots
\]
Substituting these results into equation 2.6, we have
\[
f(x) = f(a) + f'(a)\,(x-a) + \frac{1}{2!} f''(a)\,(x-a)^2 + \frac{1}{3!} f^{(3)}(a)\,(x-a)^3 + \cdots + \frac{1}{n!} f^{(n)}(a)\,(x-a)^n + \cdots
\]
There are many useful results produced by the Taylor’s series and they
are widely used in physics. To approximate the answer, we can cut the long
tails of the series and keep the first few terms. Taking the origin as the
reference point, we have
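\begin{align*}
e^x &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \\
a^x &= e^{x \ln a} = 1 + x \ln a + \frac{(x \ln a)^2}{2!} + \frac{(x \ln a)^3}{3!} + \cdots \\
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \\
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots \\
\ln(1 + x) &= x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots
\end{align*}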
Binomial expansion (as an infinite series)
\begin{align*}
(a + x)^p &= a^p + p\,a^{p-1} x + \frac{p\,(p-1)}{2!}\, a^{p-2} x^2 + \frac{p\,(p-1)\,(p-2)}{3!}\, a^{p-3} x^3 + \cdots \\
&= a^p + \binom{p}{1} a^{p-1} x + \binom{p}{2} a^{p-2} x^2 + \binom{p}{3} a^{p-3} x^3 + \cdots ,
\end{align*}
where $p$ is real (not a positive integer) and $\binom{p}{r} = \dfrac{p\,(p-1)\,(p-2) \cdots (p-r+1)}{r!}$.
In particular, when $a = 1$, we have
\begin{align*}
(1 + x)^p &= 1 + p\,x + \frac{p\,(p-1)}{2!}\, x^2 + \frac{p\,(p-1)\,(p-2)}{3!}\, x^3 + \cdots \\
&= 1 + \binom{p}{1} x + \binom{p}{2} x^2 + \binom{p}{3} x^3 + \cdots .
\end{align*}
This result is important for approximations when $x$ is small, e.g. $(1 + x)^{1/2} \approx 1 + x/2$.
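To see how quickly the truncated binomial series approaches the exact value, the following sketch sums the first few terms of $(1+x)^p$ for $p = 1/2$. The helper name `binomial_series` and the sample value $x = 0.1$ are assumptions made for illustration only.

```python
import math

def binomial_series(p, x, terms):
    """Partial sum of (1 + x)^p = 1 + p x + p(p-1)/2! x^2 + ... with `terms` terms."""
    total = 0.0
    coeff = 1.0                        # generalized binomial coefficient C(p, r), starting at r = 0
    for r in range(terms):
        total += coeff * x**r
        coeff *= (p - r) / (r + 1)     # C(p, r+1) = C(p, r) (p - r) / (r + 1)
    return total

x = 0.1
exact = math.sqrt(1 + x)
for terms in (2, 3, 4):
    approx = binomial_series(0.5, x, terms)
    print(f"{terms} terms: {approx:.8f}   error = {abs(approx - exact):.2e}")
```

With two terms the sum is exactly the approximation $1 + x/2$ quoted above; each extra term reduces the error by roughly a factor of $x$.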
Remark: If $p$ and $r$ are non-negative integers with $0 \le r \le p$, then the combinations formula is $\binom{p}{r} = \dfrac{p!}{r!\,(p-r)!}$. An alternative notation of $\binom{p}{r}$ is $C^p_r$. Obviously, $\binom{p}{r} = \binom{p}{p-r}$. We should note that $0! = 1$.
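Example 2.15. Find the Taylor’s series of $\ln(1 - x)$ about $x = 0$.
Solution: Considering the derivatives of the function, we have
\[
\left. \frac{d}{dx} \ln(1-x) \right|_{x=0} = \left. -\frac{1}{1-x} \right|_{x=0} = -1
\]
and
\[
\left. \frac{d^2}{dx^2} \ln(1-x) \right|_{x=0} = \left. -\frac{1}{(1-x)^2} \right|_{x=0} = -1 .
\]
Repeating the differentiation, we obtain
\[
\left. \frac{d^n}{dx^n} \ln(1-x) \right|_{x=0} = -(n-1)! .
\]
Hence, the Taylor’s series is
\[
\ln(1-x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \cdots - \frac{x^n}{n} - \cdots .
\]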
∎
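Example 2.16. Find the Taylor’s series of $e^{\sin x}$ about $x = 0$.
Solution: The Taylor’s series of $e^x$ and $\sin x$ about $x = 0$ are
\begin{align*}
e^x &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \\
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots
\end{align*}
For convenience $e^x$ is sometimes written as $\exp(x)$. The Taylor’s series of $e^{\sin x}$ about $x = 0$ is
\begin{align*}
\exp \left( x - \frac{x^3}{3!} + \cdots \right)
&= 1 + \left( x - \frac{x^3}{3!} + \cdots \right) + \frac{1}{2!} \left( x - \frac{x^3}{3!} + \cdots \right)^2 + \frac{1}{3!} \left( x - \frac{x^3}{3!} + \cdots \right)^3 + \cdots \\
&= 1 + \left( x - \frac{x^3}{3!} + \cdots \right) + \frac{1}{2!} \left( x^2 - \frac{2x^4}{3!} + \cdots \right) + \frac{1}{3!} \left( x^3 - \frac{3x^5}{3!} + \cdots \right) + \cdots \\
&= 1 + x + \frac{x^2}{2} + \cdots
\end{align*}
where we ignore $x^4$ and higher order terms. (Incidentally, the $x^3$ term also vanishes.) Be careful how high an order you have to keep: if a Taylor’s series of order $n$ is required, all terms up to and including order $n$ must be kept in the intermediate calculation.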
∎
Example 2.17. Not every function has a Taylor’s series that converges to the function itself. Give an example to illustrate this.
Figure 2.11: A smooth function
Solution: Define a function
\[
f(x) = \begin{cases} \exp \left( -\dfrac{1}{x} \right) & \text{if } x > 0 \\[1ex] 0 & \text{if } x \le 0 \end{cases} .
\]
The graph of the function is shown in Fig. 2.11. For $x > 0$,
\[
f'(x) = \frac{\exp(-1/x)}{x^2} ,
\]
and $f^{(n)}(x) = \exp(-1/x)\, P_n(1/x)$ where $P_n(1/x)$ is a polynomial in $1/x$. This can be proved by a simple induction. Then, by the fact that
\[
\lim_{x \to 0^+} f^{(n)}(x) = 0 , \tag{2.7}
\]
the function is infinitely differentiable at $x = 0$. Its Taylor’s series at $x = 0$ is identically zero, not equal to the function itself. The proof of equation 2.7 is given as follows. We notice that
\[
\lim_{y \to +\infty} e^{-y} y^m = 0
\]
for every positive integer $m$. We say that the exponential is faster than any power. The proof can be completed by using L’Hôpital’s rule repeatedly,
\[
\lim_{y \to +\infty} e^{-y} y^m = \lim_{y \to +\infty} \frac{y^m}{e^y} = \lim_{y \to +\infty} \frac{m\, y^{m-1}}{e^y} = \lim_{y \to +\infty} \frac{m\,(m-1)\, y^{m-2}}{e^y} = \cdots = \lim_{y \to +\infty} \frac{m!}{e^y} = 0 .
\]
∎
Example 2.18. Two massless springs, each with force constant $k$ and unstretched length $l_0$, are connected in a straight line as shown in the figure. Find an expression for the work done by a force which moves the point of attachment, i.e. the knot, between the two springs a perpendicular distance $x$ from the equilibrium point. Hence, show that the work done for such a movement is given by $\dfrac{k x^4}{4\, l_0^2}$ when $x \ll l_0$.
Figure 2.12: A two-spring system
Solution: When the vertical displacement of the knot is $x$, the extension of each spring is $e = \sqrt{l_0^2 + x^2} - l_0$. The tension in each spring is $T = ke$. The force exerted by an external agent to displace the knot by $x$ from its equilibrium is
\[
F = 2T \sin\theta = 2ke \sin\theta = 2k \left( \sqrt{l_0^2 + x^2} - l_0 \right) \frac{x}{\sqrt{l_0^2 + x^2}} = 2k\,x - \frac{2 k l_0\, x}{\sqrt{l_0^2 + x^2}} .
\]
Figure 2.13: A two-spring system
Hence, the work done by the force for such a displacement is
\begin{align*}
W &= \int_0^x F\,dx \\
&= \int_0^x \left( 2k\,x - \frac{2 k l_0\, x}{\sqrt{l_0^2 + x^2}} \right) dx \\
&= k x^2 - 2 k l_0 \sqrt{l_0^2 + x^2} + 2 k l_0^2
\end{align*}
The binomial expansion of $\sqrt{l_0^2 + x^2} = l_0 \left( 1 + \left( \dfrac{x}{l_0} \right)^2 \right)^{1/2}$ gives
\[
\sqrt{l_0^2 + x^2} = l_0 \left( 1 + \frac{1}{2} \frac{x^2}{l_0^2} - \frac{1}{8} \frac{x^4}{l_0^4} + \cdots \right) .
\]
When $x \ll l_0$ we neglect the higher order terms after $x^4$. Therefore, the work done by the external force is
\begin{align*}
W &= k x^2 - 2 k l_0 \left\{ l_0 \left( 1 + \frac{1}{2} \frac{x^2}{l_0^2} - \frac{1}{8} \frac{x^4}{l_0^4} \right) \right\} + 2 k l_0^2 \\
&= \frac{k x^4}{4\, l_0^2}
\end{align*}
∎
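A quick numerical comparison shows how good the small-displacement result of Example 2.18 is. This is a sketch under assumed values: the function names and the numbers chosen for `k` and `l0` are illustrative and do not come from the notes.

```python
import math

def work_exact(k, l0, x):
    """Closed form from Example 2.18: W = k x^2 - 2 k l0 sqrt(l0^2 + x^2) + 2 k l0^2."""
    return k * x**2 - 2 * k * l0 * math.sqrt(l0**2 + x**2) + 2 * k * l0**2

def work_small_x(k, l0, x):
    """Leading-order approximation W = k x^4 / (4 l0^2), intended for x << l0."""
    return k * x**4 / (4 * l0**2)

k, l0 = 50.0, 0.3                     # illustrative spring constant and length (assumed)
for ratio in (0.5, 0.1, 0.02):
    x = ratio * l0
    exact, approx = work_exact(k, l0, x), work_small_x(k, l0, x)
    print(f"x/l0 = {ratio:4.2f}   exact = {exact:.6e}   approx = {approx:.6e}   exact/approx = {exact / approx:.5f}")
```

The ratio of the two expressions tends to 1 as $x/l_0$ decreases, while for $x/l_0 = 0.5$ the approximation visibly overestimates the work.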
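Example 2.19. An electric dipole consists of two equal and opposite charges ($\pm q$) separated by a distance $s$. Show that the approximate potential at a point $P$ far away is given by
\[
\frac{1}{4\pi\epsilon_0} \frac{q s \cos\theta}{r^2} ,
\]
where $r$ is the distance measured from $P$ to the mid-point of the dipole and $\theta$ is the angle between the dipole and the line joining $P$ and the mid-point of the dipole.
Solution: The potential due to the dipole is $V(P) = \dfrac{1}{4\pi\epsilon_0} \left( \dfrac{q}{r_+} - \dfrac{q}{r_-} \right)$, where
\[
\begin{cases}
r_+^2 = r^2 + \left( \dfrac{s}{2} \right)^2 - r s \cos\theta = r^2 \left( 1 - \dfrac{s}{r} \cos\theta + \dfrac{s^2}{4 r^2} \right) \\[2ex]
r_-^2 = r^2 + \left( \dfrac{s}{2} \right)^2 + r s \cos\theta = r^2 \left( 1 + \dfrac{s}{r} \cos\theta + \dfrac{s^2}{4 r^2} \right)
\end{cases}
\]
Figure 2.14: The electric dipole
When $P$ is far away from the dipole, we have $r \gg s$. The higher order terms in the above expressions are negligible. Thus
\[
\begin{cases}
r_+^2 \approx r^2 \left( 1 - \dfrac{s}{r} \cos\theta \right) \\[2ex]
r_-^2 \approx r^2 \left( 1 + \dfrac{s}{r} \cos\theta \right)
\end{cases}
\]
The binomial expansions of them are
\[
\begin{cases}
\dfrac{1}{r_+} \approx \dfrac{1}{r} \left( 1 - \dfrac{s}{r} \cos\theta \right)^{-1/2} \approx \dfrac{1}{r} \left( 1 + \dfrac{s}{2r} \cos\theta \right) \\[2ex]
\dfrac{1}{r_-} \approx \dfrac{1}{r} \left( 1 + \dfrac{s}{r} \cos\theta \right)^{-1/2} \approx \dfrac{1}{r} \left( 1 - \dfrac{s}{2r} \cos\theta \right)
\end{cases}
\]
Therefore, we have
\[
\frac{1}{r_+} - \frac{1}{r_-} \approx \frac{s}{r^2} \cos\theta
\]
and hence $V(P) \approx \dfrac{1}{4\pi\epsilon_0} \dfrac{q s \cos\theta}{r^2}$.
∎
2.10 Newton’s Method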
If we can calculate the derivative of a function, we can often locate its roots numerically. Newton’s method provides an effective way to do so. The procedure is as follows.
Figure 2.15: The idea of Newton’s method
We take an initial guess $x = x_0$ such that it is close to one of the roots of $f(x) = 0$. In figure 2.15, the root of $f(x) = 0$ is located at $x = a$. Then the equation of the tangent line at $(x_0 , f(x_0))$ is
\[
y - f(x_0) = f'(x_0)\,(x - x_0)
\]
If we put $y = 0$, we obtain the $x$-intercept of the tangent line,
\[
x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}
\]
We can repeat doing this and obtain the recurrence relation
\[
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{2.8}
\]
where $n = 0, 1, 2, \ldots$. We expect the sequence $x_n$ to converge to the exact root. If the initial guess is far from the root, the method may fail to give the answer. In particular, we should avoid a guess at which $f'(x_0)$ is zero or approximately zero, because the tangent line then meets the $x$-axis far away or not at all. Figures 2.16 and 2.17 indicate the undesirable outcomes due to a poor initial guess. Lastly, Newton’s method works poorly for a multiple root of $f(x) = 0$, because $f'$ also vanishes there and the convergence becomes slow.
Figure 2.16: The initial guess is far away from the root and has f ′ (x0 ) ≈ 0
Figure 2.17: The initial guess is far away from the root
Example 2.20. Solve $x^5 + x^3 - 1 = 0$ by Newton’s method with an initial guess $x = 0.8$.
Solution: We consider the function $f(x) = x^5 + x^3 - 1$, then we have $f'(x) = 5x^4 + 3x^2$. Hence, we define
\[
x_{n+1} = x_n - \frac{x_n^5 + x_n^3 - 1}{5x_n^4 + 3x_n^2}
\]
Figure 2.18: A plot of y = f (x)
Taking $x_0 = 0.8$, we have
\[
\begin{array}{clll}
n & x_n & f(x_n) & f'(x_n) \\ \hline
0 & 0.8 & -0.16032 & 3.968 \\
1 & 0.840403226 & 0.012774638 & 4.61297286 \\
2 & 0.837633941 & 0.000064684 & 4.566319479 \\
3 & 0.837619775 & *** & ***
\end{array}
\]
One of the roots of $f(x) = 0$ is 0.8376.
∎
Example 2.21. Solve $(x - 3)\, e^x + 3 = 0$ by Newton’s method with an initial guess $x = 2.8$.
Solution: Obviously, $x = 0$ is a root of the equation. Now, we proceed to find the next root. Consider the function $f(x) = (x - 3)\,e^x + 3$ and then we have
\[
f'(x) = e^x + (x - 3)\, e^x = (x - 2)\,e^x
\]
We define
\[
x_{n+1} = x_n - \frac{(x_n - 3)\, e^{x_n} + 3}{(x_n - 2)\, e^{x_n}}
\]
Figure 2.19: A plot of y = f (x)
Taking $x_0 = 2.8$, we have
\[
\begin{array}{clll}
n & x_n & f(x_n) & f'(x_n) \\ \hline
0 & 2.8 & -0.288929354 & 13.155717417 \\
1 & 2.821962265 & 0.007220640 & 13.817024262 \\
2 & 2.821439675 & 0.000004181 & 13.801025461 \\
3 & 2.821439372 & *** & ***
\end{array}
\]
One of the roots of $f(x) = 0$ is 2.8214.
∎
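The recurrence relation (2.8) is short enough to try directly. The following is a minimal sketch, not a production root finder: the function name `newton`, the stopping tolerance and the iteration cap are assumptions. It is applied to the functions of Examples 2.20 and 2.21 with the same initial guesses.

```python
import math

def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0.0:
            # The tangent line is horizontal, so it has no x-intercept.
            raise ZeroDivisionError("f'(x) vanished; choose another initial guess")
        step = f(x) / slope
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence within max_iter iterations")

# Example 2.20: x^5 + x^3 - 1 = 0, starting from 0.8
root_a = newton(lambda x: x**5 + x**3 - 1, lambda x: 5 * x**4 + 3 * x**2, 0.8)
# Example 2.21: (x - 3) e^x + 3 = 0, starting from 2.8
root_b = newton(lambda x: (x - 3) * math.exp(x) + 3, lambda x: (x - 2) * math.exp(x), 2.8)
print(root_a, root_b)
```

The two printed values should agree with the last rows of the tables in the examples to the digits shown there. Starting Example 2.21 from a guess near $x = 2$, where $f'(x) = 0$, is a simple way to observe the failure mode described in the text.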
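2.11 Useful Differentiation Formulae
\[
\begin{array}{ll}
f(x) & f'(x) \\ \hline
c & 0 \\
cx & c \\
cx^n & cn x^{n-1} \\
\sin x & \cos x \\
\cos x & -\sin x \\
\tan x & \sec^2 x \\
\sec x & \sec x \tan x \\
\csc x & -\csc x \cot x \\
\cot x & -\csc^2 x \\
e^x & e^x \\
\ln x & \dfrac{1}{x} \\
a^x & a^x \ln a \\
u \pm v & u' \pm v' \\
uv & uv' + vu' \\
\dfrac{u}{v} & \dfrac{vu' - uv'}{v^2}
\end{array}
\]
Chain Rule: $\dfrac{dy}{dx} = \dfrac{dy}{du} \dfrac{du}{dx}$
Reciprocal Rule: $\dfrac{dy}{dx} = \dfrac{1}{\;\dfrac{dx}{dy}\;}$
2.12 Appendix: Method of Bisection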
The method of bisection provides the simplest way to find a root of an equation. Let $f(x)$ be a continuous function of $x$. Without loss of generality, let $a_0 < b_0$, $f(a_0) < 0$, and $f(b_0) > 0$; then a real root lies in the interval $(a_0 , b_0)$. Now, check the sign of the function at the mid-point of $a_0$ and $b_0$. If $f\left( \dfrac{a_0 + b_0}{2} \right) = 0$, we obtain the root. If $f\left( \dfrac{a_0 + b_0}{2} \right) < 0$, we label the mid-point as $a_1$ and set $b_1 = b_0$. If $f\left( \dfrac{a_0 + b_0}{2} \right) > 0$, we label the mid-point as $b_1$ and set $a_1 = a_0$. Iterate the process $n$ times; the root is then confined to an interval of width $(b_0 - a_0)/2^n$, whose mid-point serves as an approximation of the root.
Example 2.22. Let $f(x) = x^3 - 2x - 5$. One can check that a root of $f(x) = 0$ lies in the interval $(2, 3)$. Locate the root by using the method of bisection.
Solution: Consider two points on the $x$-axis, $x = 2$ and $x = 3$. We note that $f(2) = -1 < 0$ and $f(3) = 16 > 0$, so a real root lies in the interval $(2, 3)$. The following table shows the signs of $f(x)$ on the two sides of the root.
\[
\begin{array}{ll}
\text{Sign of } f & \text{The interval that contains a real root of } f(x) = 0 \\ \hline
f(2) < 0, \ f(3) > 0 & (a_0 , b_0) = (2, 3) \\
f(2.5) > 0 & (a_1 , b_1) = (2, 2.5) \\
f(2.25) > 0 & (a_2 , b_2) = (2, 2.25) \\
f(2.125) > 0 & (a_3 , b_3) = (2, 2.125) \\
f(2.0625) < 0 & (a_4 , b_4) = (2.0625, 2.125) \\
f(2.09375) < 0 & (a_5 , b_5) = (2.09375, 2.125) \\
f(2.109375) > 0 & (a_6 , b_6) = (2.09375, 2.109375)
\end{array}
\]
If we terminate the process after six bisections, we take the mid-point of the last interval as the approximate value of the root. It is 2.1015625.
Figure 2.20: A plot of y = f (x)
Figure 2.21: A plot of y = f (x) near the root
∎
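The bisection steps of Example 2.22 can be reproduced with a few lines of code. This is a sketch only: the function name `bisect` and its return convention (the final bracketing interval) are choices made here, not part of the notes.

```python
def bisect(f, a, b, steps):
    """Halve the bracketing interval (a, b) `steps` times; f(a) and f(b) must differ in sign."""
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(steps):
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid, mid              # the mid-point is exactly a root
        if (fm < 0) == (fa < 0):
            a, fa = mid, fm              # same sign as f(a): the root lies in (mid, b)
        else:
            b = mid                      # opposite sign: the root lies in (a, mid)
    return a, b

f = lambda x: x**3 - 2 * x - 5
a6, b6 = bisect(f, 2.0, 3.0, 6)
print(a6, b6, (a6 + b6) / 2)
```

After six bisections the interval and its mid-point match the last row of the table and the quoted approximation 2.1015625. Increasing `steps` narrows the interval further, by one binary digit per step.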
Chapter 3
Integration
3.1 Indefinite Integration
As a first understanding, integration could be treated as the inverse of differentiation. We denote it by
\[
F(x) = \int f(x)\,dx .
\]
Because the derivative of a constant is zero, the integral of a function is not unique. It is determined only up to a constant of integration, hence the adjective “indefinite”.
Example 3.1. Find $\int x\,dx$, $\int x^n\,dx$ and $\int \dfrac{1}{x}\,dx$.
Solution:
\[
\int x\,dx = \frac{1}{2} x^2 + \text{const.}
\]
We will usually denote the constant of integration by $C$. Note that the constants could be different in different equations.
\[
\int x^n\,dx = \frac{1}{n+1}\, x^{n+1} + C
\]
if $n \ne -1$, and
\[
\int \frac{1}{x}\,dx = \ln x + C .
\]
Example 3.2. A particle with initial velocity $u$ moves along a straight line. If the acceleration of the particle is a constant $a$, find the velocity and the displacement of the particle after time $t$.
Solution: The acceleration $a$ is a constant and notice that in calculus notation $a = \dfrac{dv}{dt}$, where $v$ is the velocity. We perform integration on both sides with respect to $t$ and obtain
\begin{align*}
\int a\,dt &= \int \left( \frac{dv}{dt} \right) dt \\
a \int dt &= \int dv \\
at &= v + C ,
\end{align*}
where $C$ is the arbitrary constant to be determined by the initial condition. Putting $t = 0$ and $v = u$, we obtain $C = -u$. Hence, $v = u + at$.
Let $s$ be the displacement and notice again that in calculus notation $v = \dfrac{ds}{dt}$. Then, $\dfrac{ds}{dt} = u + at$. Integrating both sides with respect to $t$, we have
\begin{align*}
\int \left( \frac{ds}{dt} \right) dt &= \int (u + at)\,dt \\
\int ds &= \int (u + at)\,dt \\
s &= ut + \frac{1}{2} a t^2 + C' ,
\end{align*}
where $C'$ is the arbitrary constant to be determined by the initial condition. Putting $t = 0$ and $s = 0$, we obtain $C' = 0$. Hence, $s = ut + \dfrac{1}{2} a t^2$.
∎
3.1.1 Integration by Substitution
There are many tricks for finding the integral of a function, but they do not always work. Very often, we can prove that the integral exists even though it has no simple analytic expression. One very useful trick is substitution, which simplifies our work to a great extent.
Example 3.3. Find $\int (2x + 8)^3\,dx$.
Solution: Put $y = 2x + 8$ and we obtain $dy = 2\,dx$.
\begin{align*}
\int (2x + 8)^3\,dx &= \frac{1}{2} \int y^3\,dy \\
&= \frac{1}{8} y^4 + C \\
&= \frac{1}{8} (2x + 8)^4 + C \\
&= 2\,(x + 4)^4 + C
\end{align*}
One can differentiate the result to verify that the integration is correct. For this simple substitution, we seldom define and write down the function $y$ explicitly, but instead we write
\begin{align*}
\int (2x + 8)^3\,dx &= \frac{1}{2} \int (2x + 8)^3\,d(2x + 8) \\
&= \frac{1}{2} \frac{(2x + 8)^4}{4} + C \\
&= 2\,(x + 4)^4 + C
\end{align*}
∎
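Example 3.4. Find $\int \dfrac{1}{\sqrt{1 - x^2}}\,dx$.
Solution: We substitute $x = \sin\theta$ and obtain $dx = \cos\theta\,d\theta$.
\begin{align*}
\int \frac{1}{\sqrt{1 - x^2}}\,dx &= \int \frac{1}{\sqrt{1 - \sin^2\theta}} \cos\theta\,d\theta \\
&= \int \frac{1}{\cos\theta} \cos\theta\,d\theta \\
&= \theta + C \\
&= \sin^{-1} x + C
\end{align*}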
∎
3.1.2 Integration using Partial Fraction
For rational functions, partial fractions should be the first thing to try. We start by dividing out terms so that the degree of the polynomial in the numerator is less than that of the denominator. Then, we factorize the denominator, if possible.
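Example 3.5. Find $\int \dfrac{x^2 + 8x - 3}{x^2 + 5x + 4}\,dx$.
Solution:
\[
\int \frac{x^2 + 8x - 3}{x^2 + 5x + 4}\,dx = \int \left( 1 + \frac{3x - 7}{x^2 + 5x + 4} \right) dx
\]
The latter term can be treated by partial fractions. Since $x^2 + 5x + 4 = (x + 1)(x + 4)$, we try
\begin{align*}
\frac{3x - 7}{x^2 + 5x + 4} &= \frac{A}{x + 1} + \frac{B}{x + 4} \\
3x - 7 &= A(x + 4) + B(x + 1) \\
3x - 7 &= (A + B)\,x + 4A + B
\end{align*}
Comparing coefficients gives $A + B = 3$ and $4A + B = -7$, so we take $A = -10/3$ and $B = 19/3$. Then
\begin{align*}
\int \left( 1 + \frac{3x - 7}{x^2 + 5x + 4} \right) dx &= x + \int \left( \frac{-10/3}{x + 1} + \frac{19/3}{x + 4} \right) dx \\
&= x - \frac{10}{3} \ln(x + 1) + \frac{19}{3} \ln(x + 4) + \text{const} .
\end{align*}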
∎
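As suggested in Example 3.3, an integration result can be verified by differentiating it. The sketch below does this numerically for the answer of Example 3.5, using a central difference. The helper names and the sample points are assumptions for illustration; the check is only meaningful for $x > -1$, where both logarithms are defined.

```python
import math

def integrand(x):
    return (x**2 + 8 * x - 3) / (x**2 + 5 * x + 4)

def antiderivative(x):
    # Result of Example 3.5 with the constant of integration omitted (valid for x > -1).
    return x - (10 / 3) * math.log(x + 1) + (19 / 3) * math.log(x + 4)

def central_difference(F, x, h=1e-5):
    """Numerical derivative (F(x + h) - F(x - h)) / (2 h)."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.5, 1.0, 3.0):
    print(x, integrand(x), central_difference(antiderivative, x))
```

The two printed columns should agree to several decimal places at every sample point.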
3.1.3 Integration by Parts
Consider the product rule
\[
\frac{d(uv)}{dx} = \frac{du}{dx}\, v + u\, \frac{dv}{dx}
\]
If we integrate both sides of the equation with respect to $x$, we have
\[
uv = \int u' v\,dx + \int u v'\,dx , \quad \text{or} \quad uv = \int v\,du + \int u\,dv
\]
Rearranging the expression, we obtain
\[
\int u\,dv = uv - \int v\,du
\]
This is called integration by parts. In an actual calculation, one has to recognize which part is $u$ and which part is $v$, which is usually not straightforward.
Example 3.6. Find $\int x \sin x\,dx$.
Solution: If we take $v = x^2/2$, we have $dv = x\,dx$ and
\begin{align*}
\int x \sin x\,dx &= \int \sin x\,d\left( \frac{x^2}{2} \right) \\
&= \frac{x^2}{2} \sin x - \int \frac{x^2}{2} \cos x\,dx
\end{align*}
This does not seem to go anywhere. Instead, if we take $v = -\cos x$, we have $dv = \sin x\,dx$ and
\begin{align*}
\int x \sin x\,dx &= -\int x\,d\cos x \\
&= -x \cos x + \int \cos x\,dx \\
&= -x \cos x + \sin x + C
\end{align*}
∎
3.2 Definite Integration
How do we calculate the area under a curve between $x = a$ and $x = b$ (Fig. 3.1)? One way to get an approximate answer is to divide the interval into $N$ small intervals. Define
\[
x_n = a + \frac{b - a}{N}\, n ,
\]
where $n = 0, 1, 2, \ldots , N$. The area of the rectangle between $x_n$ and $x_{n+1}$ is approximately $f(x_n^*)(x_{n+1} - x_n)$ where we choose a point $x_n \le x_n^* \le x_{n+1}$. The area under the curve is about
\[
S_N = \sum_{n=0}^{N-1} f(x_n^*)(x_{n+1} - x_n)
\]
The limit $\lim_{N \to \infty} S_N$, if it exists, is the area under the curve.
In the formal definition, we allow small intervals of any length and any point $x_n^*$ inside the small interval.
Figure 3.1: The area under a curve
Definition 3.7. Let $f(x)$ be a function defined in the interval $a \le x \le b$. A partition of the interval is $a = x_0 < x_1 < \cdots < x_N = b$. Denote by
\[
\delta = \max_i (x_{i+1} - x_i)
\]
the maximum of the size of the small intervals. Choose a point $x_n \le x_n^* \le x_{n+1}$ in each small interval. Define the Riemann sum as
\[
S_N = \sum_{n=0}^{N-1} f(x_n^*)(x_{n+1} - x_n)
\]
If the limit $\lim_{\delta \to 0} S_N$ exists, independent of how we choose the partition (as long as the maximum size goes to zero) and how we choose the point inside the small intervals, then we say that the function is Riemann integrable on the interval from $a$ to $b$, and call the limit the definite integral of the function,
\[
\int_a^b f(x)\,dx \tag{3.1}
\]
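The Riemann sum of Definition 3.7 can be evaluated directly for an equal partition. The sketch below is illustrative only: the function name `riemann_sum` and its `where` argument (which selects the point $x_n^*$ as the left end, mid-point or right end of each interval) are assumptions made here.

```python
import math

def riemann_sum(f, a, b, N, where="mid"):
    """Riemann sum of f on [a, b] with N equal intervals; `where` chooses x_n^*."""
    width = (b - a) / N
    offset = {"left": 0.0, "mid": 0.5, "right": 1.0}[where]
    return sum(f(a + (n + offset) * width) for n in range(N)) * width

for N in (10, 100, 1000):
    sums = [riemann_sum(math.sin, 0.0, math.pi, N, w) for w in ("left", "mid", "right")]
    print(N, sums)
```

For $f(x) = \sin x$ on $[0, \pi]$ all three choices of $x_n^*$ approach the same value as $N$ grows, which is what integrability requires. For a constant function the sum equals $k\,(b - a)$ for any $N$.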
Example 3.8. If the function is a constant $f(x) = k$, then the Riemann sum is
\begin{align*}
S_N &= \sum_{n=0}^{N-1} f(x_n^*)\,(x_{n+1} - x_n) \\
&= k \sum_{n=0}^{N-1} (x_{n+1} - x_n) \\
&= k\,(b - a)
\end{align*}
Hence,
\[
\int_a^b k\,dx = k\,(b - a)
\]
We also note that $\int_a^b k\,dx$ is a function of $b$ and
\[
\frac{d}{db} \left( \int_a^b k\,dx \right) = \frac{d}{db} \bigl( k\,(b - a) \bigr) = k
\]
∎
A remark on the dummy variable: In Eq. (3.1), there is the notation $x$, but its significance is just to indicate the variable of the function. The definite integral of a function is a number, so
\[
\int_a^b f(x)\,dx = \int_a^b f(y)\,dy = \int_a^b f(t)\,dt = \int_a^b f(\alpha)\,d\alpha
\]
The variable inside a definite integral is called a dummy variable.
3.2.1 Fundamental Theorem of Calculus
Part 1 of the Theorem: It shows that the integral is an antiderivative of a function.
If $f$ is a continuous function, define the function
\[
F(x) = \int_a^x f(t)\,dt ,
\]
and note that its independent variable is the upper limit of a definite integral. Then, $F(x)$ is differentiable and
\[
\frac{dF(x)}{dx} = f(x) ,
\]
which is the value of $f$ at the upper limit $x$.
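Proof.
\begin{align*}
\frac{dF(x)}{dx} &= \lim_{h \to 0} \frac{1}{h} \left( \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \right) \\
&= \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt
\end{align*}
By the definition of the Riemann integral, for $h > 0$ we have (the case $h < 0$ is similar)
\[
h \min_{x \le x^* \le x+h} f(x^*) \ \le \ \int_x^{x+h} f(t)\,dt \ \le \ h \max_{x \le x^* \le x+h} f(x^*)
\]
As $h \to 0$, both $\min_{x \le x^* \le x+h} f(x^*)$ and $\max_{x \le x^* \le x+h} f(x^*)$ tend to $f(x)$ because $f$ is continuous. As a result,
\[
f(x) \le \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt \le f(x)
\]
and the theorem is proved.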
Part 2 of the theorem: It gives a very useful result for definite integrals.
A direct consequence of the above result is that
\[
\int_a^b f(t)\,dt = G(b) - G(a) \tag{3.2}
\]
for any function $G(x)$ for which $\dfrac{dG(x)}{dx} = f(x)$.
Proof. Let $F(x) = \int_a^x f(t)\,dt$, then we have $F'(x) = f(x) = G'(x)$. That means $F(x) = G(x) + k$. Obviously, $k = -G(a)$ because $F(a) = 0$. Thus, we obtain
\[
\int_a^b f(t)\,dt = F(b) = G(b) - G(a)
\]
Every differentiable function is integrable, and many more functions are integrable than differentiable.
Example 3.9. Find the area under the curve of $y = \sin x$ from $0$ to $\pi$.
Solution:
\begin{align*}
\int_0^\pi \sin x\,dx &= -\cos x\, \Big|_0^\pi \\
&= -(\cos\pi - \cos 0) \\
&= 2
\end{align*}
Example 3.10. A particle with initial velocity $u$ moves along a straight line. If the acceleration of the particle is a constant $a$, find the velocity and the displacement of the particle after time $t$. This example is the same as example 3.2, but we try to work out the answers with definite integrals.
Solution: The acceleration $a$ is a constant and notice that in calculus notation $a = \dfrac{dv}{dt}$, where $v$ is the velocity. We perform integration on both sides with respect to $t$ and obtain
\begin{align*}
\int_0^t a\,dt &= \int_0^t \left( \frac{dv}{dt} \right) dt \\
a \int_0^t dt &= \int_u^v dv \\
at &= v - u \\
v &= u + at
\end{align*}
Let $s$ be the displacement and notice again that in calculus notation $v = \dfrac{ds}{dt}$. Then, $\dfrac{ds}{dt} = u + at$. Integrating both sides with respect to $t$, we have
\begin{align*}
\int_0^t \left( \frac{ds}{dt} \right) dt &= \int_0^t (u + at)\,dt \\
\int_0^s ds &= \int_0^t (u + at)\,dt \\
s &= ut + \frac{1}{2} a t^2
\end{align*}
One should notice that the area under the $vt$-graph is the displacement of the particle because $s = \int_0^t v\,dt$.
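∎
Example 3.11. A particle moves on the $xy$-plane with velocity
\[
\vec{v} = -a\omega \sin\omega t\ \hat{i} + b\omega \cos\omega t\ \hat{j} ,
\]
where $a$, $b$ and $\omega$ are constants and $t$ is the time. The initial position of the particle is at $a\,\hat{i}$. Find the position vector of the particle at time $t$, and deduce that the locus of the particle is an ellipse. Show also that the acceleration of the particle is directed towards the origin.
Solution: Since $\vec{v} = \dfrac{d\vec{r}}{dt}$, we have $\int_0^t \vec{v}\,dt = \int_{a\hat{i}}^{\vec{r}} d\vec{r}$. In some books, they adopt the dummy variable $\vec{r}\,'$ and write $\int_0^t \vec{v}\,dt = \int_{a\hat{i}}^{\vec{r}} d\vec{r}\,'$ such that the upper limit $\vec{r}$ in the right integral does not have the same name as the independent variable in the integral. Then
\begin{align*}
\int_0^t (-a\omega \sin\omega t\ \hat{i} + b\omega \cos\omega t\ \hat{j})\,dt &= \vec{r}\,'\, \Big|_{a\hat{i}}^{\vec{r}} \\
(a \cos\omega t\ \hat{i} + b \sin\omega t\ \hat{j})\, \Big|_0^t &= \vec{r} - a\,\hat{i} \\
a \cos\omega t\ \hat{i} - a\,\hat{i} + b \sin\omega t\ \hat{j} &= \vec{r} - a\,\hat{i}
\end{align*}
Therefore, we obtain $\vec{r} = a \cos\omega t\ \hat{i} + b \sin\omega t\ \hat{j}$. Notice that $\vec{r} = x\,\hat{i} + y\,\hat{j}$. Hence, we get $x = a \cos\omega t$, and $y = b \sin\omega t$. Eliminating $t$, we have an ellipse, i.e.
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1
\]
The acceleration of the particle is $\dot{\vec{v}}$, where
\begin{align*}
\dot{\vec{v}} = \frac{d\vec{v}}{dt} &= \frac{d}{dt} (-a\omega \sin\omega t\ \hat{i} + b\omega \cos\omega t\ \hat{j}) \\
&= -a\omega^2 \cos\omega t\ \hat{i} - b\omega^2 \sin\omega t\ \hat{j} \\
&= -\omega^2 (a \cos\omega t\ \hat{i} + b \sin\omega t\ \hat{j}) \\
&= -\omega^2\, \vec{r}
\end{align*}
Hence, the acceleration of the particle always points towards the origin.
∎
3.2.2 Integration using Reduction Formula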
Integration by parts is a useful skill for obtaining an integral. However, for some functions you have to use this method several times before the final answer is obtained. If the index of a function (e.g. a power index) drops each time you repeat the method, a reduction formula may follow. The final answer of the integral appears after using this formula recursively.
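Example 3.12. Denote $I_n = \int_0^{\pi/2} \sin^n x\,dx$, where $n$ is a positive integer. Obtain the reduction formula for $I_n$. Hence, find $I_5$.
Solution:
\begin{align*}
I_n &= \int_0^{\pi/2} \sin^n x\,dx \\
&= -\int_{x=0}^{\pi/2} \sin^{n-1} x\ d\cos x \\
&= -\sin^{n-1} x \cos x\, \Big|_0^{\pi/2} + \int_0^{\pi/2} \cos x\ d\sin^{n-1} x \\
&= (n - 1) \int_0^{\pi/2} \sin^{n-2} x \cos^2 x\,dx \\
&= (n - 1) \int_0^{\pi/2} \sin^{n-2} x\,(1 - \sin^2 x)\,dx \\
&= (n - 1)\,(I_{n-2} - I_n)
\end{align*}
Hence, we obtain $I_n = \dfrac{n-1}{n}\, I_{n-2}$. Finally, since $I_1 = \int_0^{\pi/2} \sin x\,dx = 1$, we have $I_5 = \dfrac{4}{5} I_3 = \dfrac{4}{5} \cdot \dfrac{2}{3}\, I_1 = \dfrac{8}{15}$. ∎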
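The reduction formula of Example 3.12 translates naturally into a recursive function. The sketch below is illustrative: the function names are chosen here, the base case $I_0 = \pi/2$ for even $n$ is added for completeness, and the mid-point sum is only an independent numerical cross-check.

```python
import math

def sin_power_integral(n):
    """I_n = integral of sin^n x over [0, pi/2], using I_n = (n - 1)/n * I_{n-2}."""
    if n == 0:
        return math.pi / 2          # base case for even n
    if n == 1:
        return 1.0                  # base case for odd n
    return (n - 1) / n * sin_power_integral(n - 2)

def midpoint_check(n, N=20000):
    """Direct mid-point Riemann sum of sin^n x on [0, pi/2], for comparison."""
    h = (math.pi / 2) / N
    return sum(math.sin((i + 0.5) * h) ** n for i in range(N)) * h

print(sin_power_integral(5), 8 / 15, midpoint_check(5))
```

All three printed numbers should agree closely, the first two up to rounding in the last digit.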
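Example 3.13. Denote $I_n = \int \sec^n\theta\,d\theta$, where $n$ is a positive integer. Obtain the reduction formula for $I_n$.
Solution:
\begin{align*}
I_n &= \int \sec^n\theta\,d\theta \\
&= \int \sec^{n-2}\theta\ d\tan\theta \\
&= \sec^{n-2}\theta \tan\theta - \int \tan\theta\ d\sec^{n-2}\theta \\
&= \sec^{n-2}\theta \tan\theta - (n - 2) \int \tan^2\theta \sec^{n-2}\theta\,d\theta \\
&= \sec^{n-2}\theta \tan\theta - (n - 2) \int (\sec^2\theta - 1) \sec^{n-2}\theta\,d\theta \\
&= \sec^{n-2}\theta \tan\theta - (n - 2)\, I_n + (n - 2)\, I_{n-2}
\end{align*}
Therefore, we have
\begin{align*}
(n - 1)\, I_n &= \sec^{n-2}\theta \tan\theta + (n - 2)\, I_{n-2} \\
I_n &= \frac{1}{n-1} \sec^{n-2}\theta \tan\theta + \frac{n-2}{n-1}\, I_{n-2}
\end{align*}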
∎
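Example 3.14. The beta function $B(p, q)$ is defined by the integral
\[
B(p, q) = \int_0^1 x^{p-1} (1 - x)^{q-1}\,dx
\]
for $p \ge 1$, $q \ge 1$. Show that if $p \ge 1$ and $q \ge 2$, then $B(p, q) = \dfrac{q-1}{p+q-1}\, B(p, q - 1)$. Hence, find $B(10, 5)$.
Solution:
\begin{align*}
B(p, q) &= \int_0^1 x^{p-1} (1 - x)^{q-1}\,dx \\
&= \frac{1}{p} \int_{x=0}^{1} (1 - x)^{q-1}\,dx^p \\
&= \frac{1}{p} \left[ (1 - x)^{q-1} x^p \right] \Big|_0^1 + \frac{q-1}{p} \int_0^1 x^p (1 - x)^{q-2}\,dx \\
&= -\frac{q-1}{p} \int_0^1 x^{p-1} \left[ (1 - x) - 1 \right] (1 - x)^{q-2}\,dx \\
&= -\frac{q-1}{p} \int_0^1 x^{p-1} (1 - x)^{q-1}\,dx + \frac{q-1}{p} \int_0^1 x^{p-1} (1 - x)^{q-2}\,dx \\
&= -\frac{q-1}{p}\, B(p, q) + \frac{q-1}{p}\, B(p, q - 1)
\end{align*}
Hence, we obtain $B(p, q) = \dfrac{q-1}{p+q-1}\, B(p, q - 1)$. This reduction formula gives
\[
B(10, 5) = \frac{4}{14}\, B(10, 4) = \frac{4}{14} \cdot \frac{3}{13} \cdot \frac{2}{12} \cdot \frac{1}{11} \cdot B(10, 1) = \frac{4}{14} \cdot \frac{3}{13} \cdot \frac{2}{12} \cdot \frac{1}{11} \cdot \frac{1}{10} = \frac{1}{10010}
\]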
∎
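For positive integer arguments the reduction formula of Example 3.14 can be evaluated exactly with rational arithmetic. This is a small sketch: the function name `beta` and the restriction to integers are choices made here, and the base case $B(p, 1) = 1/p$ follows from $\int_0^1 x^{p-1}\,dx$.

```python
from fractions import Fraction

def beta(p, q):
    """B(p, q) for positive integers p, q via B(p, q) = (q - 1)/(p + q - 1) * B(p, q - 1)."""
    if q == 1:
        return Fraction(1, p)       # B(p, 1) = integral of x^(p-1) over [0, 1]
    return Fraction(q - 1, p + q - 1) * beta(p, q - 1)

print(beta(10, 5))   # prints 1/10010
```

Using `Fraction` keeps every intermediate factor exact, so the result can be compared directly with the hand calculation.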
3.3 Impulse
The impulse delivered to an object of mass $m$ is written as $I$. It is defined as the change of momentum of the object. The mathematical expression is
\[
I = \Delta p = m\,(v - u) ,
\]
where $p$ is the linear momentum. The initial and final velocities are $u$ and $v$ respectively. Recall that the force exerted on the object is given by Newton’s second law $F = dp/dt$. Integrating both sides with respect to $t$, we have
\[
I = \int dp = \Delta p = \int F\,dt = F_{\text{ave}}\, \Delta t ,
\]
where $F_{\text{ave}}$ is the average force exerted on the object during the time $\Delta t$. Figure 3.2 shows the variation of a force exerted on an object when a collision occurs. The area under the curve equals the area of the rectangle given by $F_{\text{ave}}\, \Delta t$.
Figure 3.2: The impulse of a force is the area under the curve
Example 3.15. A particle of mass $m$ is thrown vertically upward with speed $v_0$ and it returns to the initial point with speed $v_1$. Suppose that the retarding force $F$ due to the air resistance is linearly proportional to the instantaneous velocity of the particle, i.e. $F = -kv$, where $k > 0$ and $v$ is the velocity of the particle. By considering the total impulse acting on the particle during its motion, show that the time elapsed is given by
\[
t = \frac{1}{g} (v_0 + v_1) .
\]
Solution: Taking downward as positive, the total force exerted on the particle is $F = mg - kv$. The expression is true for both upward and downward motions. The velocity $v$ of the particle is negative when the particle travels upward and becomes positive when the particle travels downward. Newton’s second law gives
\begin{align*}
F &= \frac{dp}{dt} \\
I = \int F\,dt &= \int dp \\
\int F\,dt &= \Delta p
\end{align*}
The integral in the above equation is the impulse exerted on the particle and it is equivalent to the change of momentum of the particle. Thus,
\begin{align*}
\int_0^t (mg - kv)\,dt &= m\,[v_1 - (-v_0)] \\
mgt - k \int_0^t v\,dt &= m\,(v_1 + v_0)
\end{align*}
Because the total displacement of the particle is zero, we have $\int_0^t v\,dt = 0$. Thus,
\[
t = \frac{v_0 + v_1}{g}
\]
∎
3.4 Center of Mass
The center of mass of a system of particles is the point at which the total
mass of the system may be considered concentrated. It describes the average
position of the system. Consider a system of n particles on the xy-plane,
as shown in figure 3.3. The mass of the ith particle is mi and its location
is given by r⃗i or (xi , yi ), where i = 1, 2, 3, . . . , n. The center of mass of this
system is (xcm , ycm ), where
\[
x_{cm} = \frac{\sum_{i=1}^{n} x_i m_i}{\sum_{i=1}^{n} m_i} \quad \text{and} \quad y_{cm} = \frac{\sum_{i=1}^{n} y_i m_i}{\sum_{i=1}^{n} m_i}
\]
Let’s rewrite the definition again and try to realize its physical picture.
\[
x_{cm} = \sum_{i=1}^{n} x_i \left( \frac{m_i}{\sum_{j=1}^{n} m_j} \right) \quad \text{and} \quad y_{cm} = \sum_{i=1}^{n} y_i \left( \frac{m_i}{\sum_{j=1}^{n} m_j} \right)
\]
Figure 3.3: A system of particles on the xy-plane
One can see readily that the quantities in the brackets are the weighting functions because $\sum_{i=1}^{n} \left( \dfrac{m_i}{\sum_{j=1}^{n} m_j} \right) = 1$. Thus, the center of mass of a system of masses represents the average position of the system.
Now, we extend our understanding from the discretized model to a continuous model. For example, we consider the center of mass of a thin rod which lies on the $x$-axis. We cut the rod into an infinite number of segments, each of infinitesimally small size and mass $dm$. Then, the center of mass of the rod can be determined by replacing the summation sign by the integral, with the position of the mass element marked by $x$, i.e. the coordinate of $dm$.
\[
x_{cm} = \frac{\sum_{i=1}^{n} x_i m_i}{\sum_{i=1}^{n} m_i} \ \longrightarrow \ x_{cm} = \frac{\int x\,dm}{\int dm}
\]
Similarly, for a 2-D object lying on the $xy$-plane, the center of mass of it is located at $(x_{cm} , y_{cm})$, where the coordinates are given by the integrals as follows.
\[
x_{cm} = \frac{\int x\,dm}{\int dm} \quad \text{and} \quad y_{cm} = \frac{\int y\,dm}{\int dm}
\]
Example 3.16. A rod of length $L$ has uniform density. (a) Locate the center of mass of the rod. (b) Locate the center of mass of the rod again if the rod has non-uniform density $\lambda' = \lambda_0 (1 + x/L)$, where $x$ is the distance from the light end, $0 \le x \le L$ and $\lambda_0$ is a constant.
Solution:
Figure 3.4: The thin rod
(a) For convenience, we place the rod on the $xy$-plane such that one end of the rod is at the origin and the rod lies on the positive $x$-axis. Then, we cut the rod into numerous segments, each having a length $dx$. Denoting the density of the rod as $\lambda$, the mass of the small segment is given by $dm = \lambda\,dx$. By the definition of the center of mass, we have
\[
x_{cm} = \frac{\int x\,dm}{\int dm} = \frac{\lambda \int_0^L x\,dx}{\lambda \int_0^L dx} = \frac{\int_0^L x\,dx}{\int_0^L dx} = \frac{\dfrac{x^2}{2} \Big|_0^L}{x\, \Big|_0^L} = \frac{L}{2}
\]
(b) If the density of the rod is non-uniform, the small segment has mass $dm = \lambda'\,dx = \lambda_0 \left( 1 + \dfrac{x}{L} \right) dx$. Then the center of mass of the rod is
\[
x_{cm} = \frac{\int x\,dm}{\int dm} = \frac{\lambda_0 \int_0^L x \left( 1 + \dfrac{x}{L} \right) dx}{\lambda_0 \int_0^L \left( 1 + \dfrac{x}{L} \right) dx} = \frac{\int_0^L \left( x + \dfrac{x^2}{L} \right) dx}{\int_0^L \left( 1 + \dfrac{x}{L} \right) dx}
\]
Therefore,
\[
x_{cm} = \frac{\left( \dfrac{x^2}{2} + \dfrac{x^3}{3L} \right) \Big|_0^L}{\left( x + \dfrac{x^2}{2L} \right) \Big|_0^L} = \frac{\;\dfrac{5L^2}{6}\;}{\dfrac{3L}{2}} = \frac{5L}{9}
\]
∎
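The passage from the discrete sum to the integral can be illustrated by cutting the rod of Example 3.16 into many small segments and applying the discrete center-of-mass formula. This is a sketch under assumed values: the function name `center_of_mass`, the number of segments, and the numbers used for `L` and `lam0` are illustrative only.

```python
def center_of_mass(density, length, segments=100_000):
    """Approximate x_cm of a thin rod on [0, length] by summing over small segments."""
    dx = length / segments
    total_mass = 0.0
    first_moment = 0.0
    for i in range(segments):
        x = (i + 0.5) * dx            # mid-point of the i-th segment
        dm = density(x) * dx          # mass element dm = lambda(x) dx
        total_mass += dm
        first_moment += x * dm
    return first_moment / total_mass

L, lam0 = 2.0, 3.0                    # illustrative length and density constant (assumed)
print(center_of_mass(lambda x: lam0, L), L / 2)                      # part (a)
print(center_of_mass(lambda x: lam0 * (1 + x / L), L), 5 * L / 9)    # part (b)
```

In both cases the segment sum should agree with the analytic result, and the constant $\lambda_0$ cancels, just as it does in the worked solution.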
Example 3.17. A uniform wire is bent into a semi-circle of radius $R$. Locate the center of mass of the wire.
Solution:
Figure 3.5: A uniform and semi-circular wire
Due to symmetry, the center of mass lies on the $y$-axis (i.e. $x_{cm} = 0$). Consider an infinitesimal element of length $dl$ on the wire, where $dl = R\,d\varphi$. The mass of the element is $dm = \lambda\,dl = \lambda R\,d\varphi$. By the definition of the center of mass,
\begin{align*}
y_{cm} = \frac{\int_{\text{wire}} y\,dm}{\int_{\text{wire}} dm} &= \frac{\int_0^\pi (R \sin\varphi)\,(\lambda R\,d\varphi)}{\int_0^\pi \lambda R\,d\varphi} = \frac{R^2 \lambda \int_0^\pi \sin\varphi\,d\varphi}{R \lambda \int_0^\pi d\varphi} \\
&= -\frac{R}{\pi} \cos\varphi\, \Big|_{\varphi=0}^{\pi} = -\frac{R}{\pi} (-1 - 1) = \frac{2R}{\pi}
\end{align*}
∎
3.5 Work Done by a Force
A particle is driven by a force $\vec{F}$ such that it moves along a path $C$. For a small displacement $d\vec{r}$ along the path, the corresponding work done is $dW = \vec{F} \cdot d\vec{r}$. The scalar product is adopted because the component of the force along the displacement contributes to the motion of the particle, but the component of the force normal to the displacement does no work on the particle. The total work done by the force $\vec{F}$ along the path $C$ is given by an integral which sums up all the small work done.
\[
W = \int_C dW = \int_C \vec{F} \cdot d\vec{r}
\]
Example 3.18. Find the work done by gravity when a particle of mass $m$ falls freely through a distance $h$.
Solution: Suppose that the point of release of the particle is coincident with the origin of the coordinate system. The motion of the particle is along the $y$-axis. The gravitational force acting on the particle is $-mg\,\hat{j}$ and the small displacement is $d\vec{r} = dy\,\hat{j}$. Notice that the latter represents the general expression of the displacement. The actual direction of it is stated by the lower and upper limits in the integral. The work done by gravity is
\[
W = \int_C (-mg\,\hat{j}) \cdot (dy\,\hat{j}) = \int_0^{-h} (-mg)\,dy = -mg \int_0^{-h} dy
\]
Therefore, $W = -mg\,y\, \Big|_0^{-h} = mgh$. The value is positive because the gravitational force points in the same direction as the displacement. The particle gains kinetic energy by the same amount too.
Example 3.19. A particle of mass $m$ slides down along the inner surface of a smooth hemispherical hollow of radius $h$. If the initial position of the particle is at the rim of the hollow, find the work done by gravity when the particle reaches the lowest point of the hollow.
Figure 3.6: A sliding particle in the bowl
Solution:
The particle slides down the inner surface of the hollow along the path $C$ and the infinitesimal displacement of it is denoted by $d\vec{r} = dx\,\hat{i} + dy\,\hat{j}$. The force exerted on the particle due to gravity is $\vec{F} = -mg\,\hat{j}$. By the definition of work done, we can write
\[
W = \int_C \vec{F} \cdot d\vec{r} = \int_C (-mg\,\hat{j}) \cdot (dx\,\hat{i} + dy\,\hat{j}) = \int_0^{-h} (-mg)\,dy = -mg \int_0^{-h} dy
\]
Therefore, $W = -mg\,y\, \Big|_0^{-h} = mgh$. It has the same value as the answer in example 3.18. In fact, the work done by a gravitational force is independent of the path that the particle travels. The amount of work done only depends on the initial and final positions of the particle. A force with this property is called a conservative force. One should notice that the normal force exerted on the particle by the hollow does no work on the particle because the normal force is always perpendicular to the displacement of the particle.
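The path independence seen in Examples 3.18 and 3.19 can be illustrated by summing $\vec{F} \cdot d\vec{r}$ over many short straight steps along each path. This is a sketch under stated assumptions: the function name `work_by_gravity`, the path parametrizations (origin at the center of the rim, particle starting at $(h, 0)$ on the bowl), and the values of `m` and `h` are all choices made for the demonstration.

```python
import math

def work_by_gravity(path, m, g=9.81, steps=1000):
    """Sum F . dr for F = -m g j along path(s), 0 <= s <= 1, using short straight steps."""
    work = 0.0
    x_prev, y_prev = path(0.0)
    for i in range(1, steps + 1):
        x, y = path(i / steps)
        dx, dy = x - x_prev, y - y_prev
        work += 0.0 * dx + (-m * g) * dy      # F_x dx + F_y dy with F = (0, -m g)
        x_prev, y_prev = x, y
    return work

m, h = 0.2, 0.5                                # illustrative mass and radius (assumed)
free_fall = lambda s: (0.0, -h * s)            # straight down through a distance h
bowl = lambda s: (h * math.cos(s * math.pi / 2), -h * math.sin(s * math.pi / 2))  # rim to lowest point
print(work_by_gravity(free_fall, m), work_by_gravity(bowl, m), m * 9.81 * h)
```

Both sums reproduce $mgh$ because only the vertical displacement contributes to the scalar product, regardless of the horizontal motion along the path.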
3.6 Energy Stored in a Spring
A force $F_{\text{app}}$ is applied to an unstretched spring and the spring extends by $x$. If the extension $x$ is linearly proportional to the applied force $F_{\text{app}}$, we say that the spring obeys Hooke’s law $F_{\text{app}} = kx$, where $k$ is a positive constant called the spring constant. The work done by the force $F_{\text{app}}$ over the displacement $x$ is just the total energy $E$ stored in the spring.
\[
E = \int_0^x F_{\text{app}}\,dx = \int_0^x kx\,dx = \frac{1}{2} k x^2
\]
Figure 3.7: The spring stores energy when it is stretched
If the spring experiences a force that compresses it, the same amount of energy is stored when the spring is compressed by $x$.
3.7 Electric Field due to a Charged Wire
A very long straight wire has positive charges distributed along it with line density $\lambda$. The electric field at a point $P$, located at a perpendicular distance $D$ from one end of the wire, can be obtained by adding up the electric fields produced by all the charges on the wire. Figure 3.8 shows the electric field at $P$ due to a small segment of the wire. Now, we divide the wire into numerous segments. Coulomb’s law applies to each segment because it is so short that the charge in it can be treated as a point charge. The Coulomb constant is $k_e$.
Figure 3.8: The electric field due to a charged segment in the wire
The amount of charge carried by the small segment is dq = λ dx, where
x = D tan θ and dx = D sec²θ dθ. Thus, dq = λD sec²θ dθ. The distance
between the segment and point P is r = D/cos θ. Hence, the magnitude of
the electric field at P due to the charged segment is
$$dE = k_e\frac{dq}{r^2} = k_e\frac{\lambda D\sec^2\theta\,d\theta}{\left(\dfrac{D}{\cos\theta}\right)^2} = \frac{k_e\lambda}{D}\,d\theta$$
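Let the total electric field at P be $\vec{E} = E_x\,\hat{i} + E_y\,\hat{j}$, where
$$E_x = -\int \sin\theta\,dE \qquad\text{and}\qquad E_y = -\int \cos\theta\,dE$$
Then
$$E_x = -\frac{k_e\lambda}{D}\int_0^{\pi/2}\sin\theta\,d\theta = -\frac{k_e\lambda}{D}$$
and
$$E_y = -\frac{k_e\lambda}{D}\int_0^{\pi/2}\cos\theta\,d\theta = -\frac{k_e\lambda}{D}$$
Hence, the total electric field at P is $\vec{E} = -\dfrac{k_e\lambda}{D}\,(\hat{i} + \hat{j})$.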
The above discussion is about a semi-infinite wire of charge density λ. If
the wire has the same charge density but is infinitely long, extending from
negative infinity on the left to positive infinity on the right, the x-components
contributed by the two halves cancel, so the electric field at P points along the
negative y direction and its magnitude is 2ke λ/D.
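The result for the semi-infinite wire can be checked by summing the Coulomb fields of many short segments directly. The following is a minimal sketch, assuming the geometry of the derivation above (wire along the positive x-axis, P at perpendicular distance D from the end at x = 0); the function name, the truncation length and the unit values ke = λ = D = 1 are illustrative choices, not part of the notes.

```python
import math

def field_semi_infinite_wire(ke, lam, D, x_max_factor=1000.0, n=200_000):
    """Sum the Coulomb fields of short wire segments on 0 <= x <= x_max_factor * D.

    Returns (Ex, Ey) at the point P, using the sign convention of the text.
    """
    dx = x_max_factor * D / n
    Ex = Ey = 0.0
    for i in range(n):
        x = (i + 0.5) * dx        # midpoint of the i-th segment
        r2 = x * x + D * D        # squared distance from the segment to P
        dE = ke * lam * dx / r2   # point-charge field of the charge lam*dx
        r = math.sqrt(r2)
        Ex -= dE * x / r          # sin(theta) = x / r
        Ey -= dE * D / r          # cos(theta) = D / r
    return Ex, Ey

Ex, Ey = field_semi_infinite_wire(ke=1.0, lam=1.0, D=1.0)
print(Ex, Ey)  # both should be close to -ke*lam/D = -1
```

Because the wire is cut off at a finite length, Ex approaches its limit only slowly (the missing tail contributes roughly ke λ divided by the cut-off length), while Ey converges much faster.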
3.8 The Length of a Curve
If y = f (x) is a continuous function and f ′ (x) exists, then the length of the
curve in the range a ≤ x ≤ b is given by
$$S = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx$$
Figure 3.9: A small segment on the curve
Proof. Divide the curve into many segments such that each segment has finite
and small length ∆S, where (∆S)2 = (∆x)2 + (∆y)2 . Then
$$\Delta S = \sqrt{1 + \left[\frac{\Delta y}{\Delta x}\right]^2}\,\Delta x$$
If there are numerous segments, the length of each segment becomes infinitesimally small and thus the total length of the curve in the interval a ≤ x ≤ b
is given by
$$S = \int dS = \int_a^b \sqrt{1 + \left[\frac{dy}{dx}\right]^2}\,dx \tag{3.3}$$
Hence, we obtain $S = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx$, where f′(x) = df(x)/dx.
If the curve has a parametric form (x(t), y(t)), then we have
$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)} \qquad\text{and}\qquad dx = \left(\frac{dx}{dt}\right)dt = x'(t)\,dt ,$$
where t is the independent variable of the coordinates (x, y). Equation 3.3
becomes
$$S = \int_{t_1}^{t_2} \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt , \tag{3.4}$$
where x(t1) = a and x(t2) = b.
Example 3.20. Find the mass of a metal wire of density λ = 2 kg m⁻¹ if the
wire has a parabolic shape given by y = x² and −1/2 ≤ x ≤ 1/2. Here x is
measured in meters.
Figure 3.10: A parabolic metal wire
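Solution: y = f(x) = x² gives f′(x) = 2x. The total mass of the wire is
$$M = \lambda S = \lambda\int dS = \lambda\int_{-1/2}^{1/2}\sqrt{1 + 4x^2}\,dx$$
Using the substitution 2x = tan θ and 2 dx = sec²θ dθ, we have
$$\begin{aligned}
M &= \frac{\lambda}{2}\int_{-\pi/4}^{\pi/4}\sec^3\theta\,d\theta \\
&= \frac{\lambda}{2}\left[\frac{1}{2}\sec\theta\tan\theta\,\Big|_{-\pi/4}^{\pi/4} + \frac{1}{2}\int_{-\pi/4}^{\pi/4}\sec\theta\,d\theta\right] \qquad\text{(by using example 3.13)} \\
&= \frac{\lambda}{2}\left[\frac{1}{2}\left(\sqrt{2}+\sqrt{2}\right) + \frac{1}{2}\ln(\sec\theta+\tan\theta)\,\Big|_{-\pi/4}^{\pi/4}\right] \\
&= \frac{\lambda}{2}\left[\sqrt{2} + \frac{1}{2}\ln\left(\frac{\sqrt{2}+1}{\sqrt{2}-1}\right)\right]
\end{aligned}$$
Plugging in λ = 2 kg m⁻¹, the mass of the wire is $M = \left(\sqrt{2} + \ln\sqrt{3 + 2\sqrt{2}}\right)$ kg ∎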
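As a numerical cross-check of example 3.20, the arc-length integral can be evaluated directly and compared with the closed form. This is only a sketch: the helper `midpoint_integral` and its step count are assumptions made for illustration, not a method used in the notes.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

lam = 2.0  # linear density in kg/m, from the example

# Arc-length integrand sqrt(1 + f'(x)^2) with f'(x) = 2x
mass_numeric = lam * midpoint_integral(lambda x: math.sqrt(1 + 4 * x * x), -0.5, 0.5)
mass_closed = math.sqrt(2) + math.log(math.sqrt(3 + 2 * math.sqrt(2)))
print(mass_numeric, mass_closed)  # the two values should agree closely (about 2.2956 kg)
```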
Example 3.21. A disk of radius a rolls without slipping on a horizontal
plane. The point P on the rim that is initially in contact with the plane traces a
path as the disk rolls, and the path is called the cycloid. The coordinates of P are
$$x = a\,(\theta - \sin\theta), \qquad y = a\,(1 - \cos\theta),$$
where θ is the angular displacement of the disk. Find the length of the cycloid with 0 ≤ θ ≤ 2π.
Figure 3.11: The cycloid of a disk
Solution: The derivatives of x and y with respect to θ are x′(θ) = a(1 − cos θ)
and y′(θ) = a sin θ. The length of the cycloid is
$$\begin{aligned}
S &= \int_0^{2\pi}\sqrt{[x'(\theta)]^2 + [y'(\theta)]^2}\,d\theta \\
&= \int_0^{2\pi}\sqrt{a^2(1-\cos\theta)^2 + a^2\sin^2\theta}\,d\theta \\
&= 2a\int_0^{2\pi}\sin\left(\frac{\theta}{2}\right)d\theta = 8a
\end{aligned}$$
We have applied the trigonometric identity cos θ = 1 − 2 sin²(θ/2) in the above
treatment.
∎
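The value S = 8a can be confirmed by evaluating the parametric arc-length formula (equation 3.4) numerically. The sketch below uses a simple midpoint sum; the function name and the number of steps are illustrative assumptions.

```python
import math

def cycloid_length(a, n=100_000):
    """Midpoint-rule estimate of the cycloid arc length for 0 <= theta <= 2*pi."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        dx = a * (1 - math.cos(theta))  # x'(theta)
        dy = a * math.sin(theta)        # y'(theta)
        total += math.hypot(dx, dy) * h
    return total

print(cycloid_length(1.0))  # should be close to 8 for a = 1
```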
3.9 Area under a Curve
If y = f(x) is a continuous function, then the area under the curve in the
range a ≤ x ≤ b is given by $\int_a^b f(x)\,dx$. In physical science, there are many
examples in which the area under a curve gives a physical quantity. Here
are some examples.
• Velocity-time curve (vt-curve) gives the displacement,
• Acceleration-time curve (at-curve) gives the change in velocity,
• Force-time curve (Ft-curve) gives the impulse of a force,
• Force-displacement curve (Fs-curve) gives the work done by a force,
• Pressure-volume curve (PV-curve) gives the work done by a gas,
• Charge-voltage curve (QV-curve) gives the energy stored in a capacitor.
Example 3.22. Find the work done by an ideal gas if n moles of gas undergo an isothermal expansion in a container at temperature T . The piston
is pushed outward and the volume of the gas increases from its initial value
Vi to a final value Vf .
Figure 3.12: The gas expansion and the P V -graph
Solution: In an isothermal process, the temperature is kept constant.
When the gas has a volume change by ∆V the work done by the gas system
is ∆Ws = F ∆x = P A ∆x = P ∆V , where F is the force exerted on the piston,
P is the pressure in the gas, x is the displacement of the piston and A is the
cross sectional area of the piston in the container. During the expansion, the
total work done by the gas system is
$$W_s = \int P\,dV = nRT\int_{V_i}^{V_f}\frac{dV}{V}$$
where R is the universal gas constant. In the right hand side of the above
equation, we have applied the ideal gas law P V = nRT such that the integral
contains only one variable, i.e. the volume V . Then
$$W_s = nRT\,\ln V\,\Big|_{V_i}^{V_f} = nRT\ln\left(\frac{V_f}{V_i}\right)$$
In an expansion, the work done by the gas Ws is positive and its value
is given by the area under the P V -curve in figure 3.12. If the process is
a compression, Ws becomes negative and it is the negative area under the
P V -curve.
∎
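To illustrate that the work is the area under the PV-curve, the integral of P dV can be summed numerically and compared with nRT ln(Vf/Vi). This is a sketch only; the amount of gas, the temperature and the volumes are hypothetical values chosen for illustration.

```python
import math

R = 8.314  # universal gas constant in J/(mol K)

def isothermal_work(n_mol, T, Vi, Vf, steps=100_000):
    """Sum P*dV along an isotherm using the ideal gas law P = nRT/V."""
    dV = (Vf - Vi) / steps
    work = 0.0
    for i in range(steps):
        V = Vi + (i + 0.5) * dV   # volume at the middle of the step
        P = n_mol * R * T / V
        work += P * dV
    return work

# Hypothetical process: 1 mol at 300 K doubling its volume from 1 L to 2 L
W = isothermal_work(1.0, 300.0, 1.0e-3, 2.0e-3)
print(W, 1.0 * R * 300.0 * math.log(2.0))
```

Swapping Vi and Vf makes dV negative, so the same function returns a negative value for a compression, in line with the sign convention above.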
3.10 Moment of Inertia
Mass is a measure of the inertia of an object in translational
motion. The greater the mass, the greater the resistance to a change in motion. That
is to say, we need a greater force to move an object from rest or to change its
state of motion. This idea is stated in Newton's second law F = ma, where
F is the net force exerted on an object of mass m and a is the acceleration
of the object.
Figure 3.13: A system of particles rotating about the z-axis
In a rotational motion, the quantity to measure the amount of inertia is
the moment of inertia. For a system of n particles rotating about an axis,
the distribution of mass about the axis affects the resistance of an object to
rotate. The moment of inertia about the axis is defined as
$$I = \sum_{i=1}^{n} m_i r_i^2 ,$$
where mi is the mass of the i-th particle and ri is the radius of rotation of
it about the axis. Newton’s second law for rotation is τ = Iα, where τ is the
torque (i.e. the moment of force) exerted on the system about the axis and
α is the angular acceleration of the system. One can always compare F = ma
with τ = Iα, where F , m, and a are the quantities adopted in translational
motions, while τ , I and α are the quantities adopted in rotational motions.
The correspondences are F → τ, m → I, and a → α, thus
$$F = ma \;\longrightarrow\; \tau = I\alpha$$
A detailed discussion of torque was given in section 1.12.1. On the other
hand, one can see that if the object in figure 3.13 rotates with angular speed
ω, then the i-th particle has speed vi = ri ω. Thus, the total kinetic
energy (KE) of the object is
$$K = \sum_{i=1}^{n}\left(\frac{1}{2}m_i v_i^2\right) = \frac{1}{2}\left(\sum_{i=1}^{n} m_i (r_i\omega)^2\right) = \frac{1}{2}\left(\sum_{i=1}^{n} m_i r_i^2\right)\omega^2 = \frac{1}{2}I\omega^2$$
This is an important result when we compare the KE expression of translational motion with that of rotational motion. The correspondences between the quantities
are M → I and V → ω, thus
$$\frac{1}{2}MV^2 \;\longrightarrow\; \frac{1}{2}I\omega^2$$
where M is the total mass of the object and V is the velocity of the center
of mass of the object when the object has translational motion.
In a rigid body, the mass is distributed continuously instead of having
a discretized model. Hence, we have to replace the summation sign by an
integral and the position of the mass element is marked by r, i.e. the radius
of rotation of the mass element dm.
$$I = \sum_{i=1}^{n} m_i r_i^2 \;\longrightarrow\; I = \int r^2\,dm$$
For example, the moment of inertia of a ring about an axis passing through
its center and normal to the plane of the ring is mr², where m and r
are the mass and the radius of the ring respectively, because every mass
element of the ring lies at the same distance r from the axis.
Example 3.23. Find the moment of inertia of a uniform thin rod about a
normal axis which passes through the center of the rod. The rod has mass
M and length L.
Figure 3.14: A rod rotating about a normal axis through its mid-point
Solution: For convenience, we fit the rod in a coordinate system such that
the rod lies on the x-axis and the mid-point of the rod coincides with the origin.
Divide the thin rod into numerous segments so that each segment has infinitesimally small length dx and the mass of the segment is dm = λ dx, where
λ = M/L is the linear density of the rod. From the definition of moment of inertia,
we have
$$I = \int x^2\,dm = \lambda\int_{-L/2}^{L/2} x^2\,dx = \frac{\lambda x^3}{3}\,\Big|_{-L/2}^{L/2} = \frac{\lambda L^3}{12} = \frac{ML^2}{12}$$
∎
Example 3.24. Find the moment of inertia of a uniform thin rod about a
normal axis which passes through the end of the rod. The rod has mass M
and length L.
Figure 3.15: A rod rotating about a normal axis through its end
Solution: This time we fit the rod in a coordinate system such that the
rod lies on the x-axis and the left end of the rod coincides with the origin. Divide
the thin rod into numerous segments so that each segment has infinitesimally
small length dx and the mass of the segment is dm = λ dx, where λ = M/L
is the linear density of the rod. From the definition of moment of inertia, we have
$$I = \int x^2\,dm = \lambda\int_0^L x^2\,dx = \frac{\lambda x^3}{3}\,\Big|_0^L = \frac{\lambda L^3}{3} = \frac{ML^2}{3}$$
We observe that the answer in example 3.23 is less than that in this example,
because the rod in the latter case has more of its mass distributed far away from
the axis of rotation.
∎
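The step from the sum over particles to the integral over dm can be made concrete by chopping the rod into many small pieces and adding up the contributions (distance)² × dm. The following sketch reproduces both examples; the function name and the numerical values of M and L are illustrative assumptions.

```python
def rod_moment_of_inertia(M, L, axis_x, pieces=100_000):
    """Sum r^2 dm for a uniform rod lying on 0 <= x <= L, rotating about x = axis_x."""
    lam = M / L                   # linear density
    dx = L / pieces
    inertia = 0.0
    for i in range(pieces):
        x = (i + 0.5) * dx        # position of the mass element
        inertia += (x - axis_x) ** 2 * lam * dx
    return inertia

M, L = 3.0, 2.0                   # illustrative values
print(rod_moment_of_inertia(M, L, L / 2), M * L**2 / 12)  # axis through the center
print(rod_moment_of_inertia(M, L, 0.0), M * L**2 / 3)     # axis through one end
```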
3.11 The Dog-And-Rabbit Chase Problem
The classic dog-and-rabbit chase problem is an interesting topic in calculus.
A dog is at a distance L due south of a rabbit when, at time t = 0, it observes the
rabbit running in a vast field. The positions of the two animals at time t = 0 are shown in
the figure. When the dog sees the rabbit, it starts to pursue the rabbit and
its velocity always points towards the rabbit. The rabbit keeps running
due east with a constant speed v and the dog's speed is a constant u, where
v < u. Find the time elapsed when the dog catches the rabbit.
Figure 3.16: The dog-and-rabbit chase problem
Let x be the horizontal displacement of the rabbit relative to the dog, θ be
the angle between the dog's velocity and the eastward direction, and
τ be the time elapsed when the dog catches the rabbit. Then, at arbitrary
time t,
$$\frac{dx}{dt} = v - u\cos\theta \tag{3.5}$$
Integrating both sides of equation 3.5 from t = 0 to t = τ, we have
$$\int_{x=0}^{0} dx = \int_0^{\tau}(v - u\cos\theta)\,dt$$
$$0 = v\tau - u\int_0^{\tau}\cos\theta\,dt$$
That is to say,
$$\int_0^{\tau}\cos\theta\,dt = \frac{v\tau}{u} \tag{3.6}$$
Let r be the distance between the rabbit and the dog. At any instant
of time,
$$\frac{dr}{dt} = v\cos\theta - u \tag{3.7}$$
Integrating both sides of equation 3.7 from t = 0 to t = τ, we have
$$\int_{r=L}^{0} dr = \int_0^{\tau}(v\cos\theta - u)\,dt$$
$$-L = v\int_0^{\tau}\cos\theta\,dt - u\tau$$
Hence, we obtain
$$L = u\tau - v\int_0^{\tau}\cos\theta\,dt \tag{3.8}$$
Substituting equation 3.6 into equation 3.8, we have
$$L = u\tau - v\left(\frac{v\tau}{u}\right)$$
Therefore, the required time τ is
$$\tau = \frac{Lu}{u^2 - v^2} .$$
3.12 Numerical Integration
We will introduce two basic methods which compute the integral $\int_a^b f(x)\,dx$
numerically. The idea of the trapezoidal rule is simple: the integrand f(x) is
approximated by secant lines. However, the convergence of the trapezoidal rule
is slower than that of Simpson's rule. The latter approximates the integrand
by parabolas.
3.12.1 Trapezoidal Rule
Given a function f(x) ≥ 0, the integral $\int_a^b f(x)\,dx$ can be approximated
by the area of a trapezium as shown in figure 3.17. The accuracy can be
improved if we subdivide the interval [a, b] into n subintervals of equal width,
as shown in figure 3.18. The precision increases with n. When n = 1, we have
$$\int_a^b f(x)\,dx \approx \frac{1}{2}(b - a)\,[f(a) + f(b)]$$
Figure 3.17: Trapezoidal rule when n = 1
Let h be the width of each subinterval when there are n subintervals.
Then we have h = (b − a)/n and the trapezoidal rule
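$$\begin{aligned}
\int_a^b f(x)\,dx &= \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} f(x)\,dx \\
&\approx \frac{1}{2}\sum_{i=1}^{n} h\,[f(x_{i-1}) + f(x_i)] \\
&= \frac{h}{2}\,[f_0 + f_1 + f_1 + f_2 + f_2 + \cdots + f_{n-1} + f_{n-1} + f_n] \\
&= \frac{h}{2}\left[f_0 + 2\sum_{i=1}^{n-1} f_i + f_n\right]
\end{aligned}$$
Here fi denotes f(xi), the value of the integrand at the i-th grid point.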
Figure 3.18: Trapezoidal rule when n > 1
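The composite trapezoidal formula above translates almost line by line into code. The sketch below is one possible implementation; the function name and the test integrand sin x on [0, π] (whose exact integral is 2) are illustrative choices.

```python
import math

def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule with n subintervals of equal width h."""
    h = (b - a) / n
    total = f(a) + f(b)           # f0 + fn
    for i in range(1, n):
        total += 2 * f(a + i * h) # interior points are counted twice
    return total * h / 2

for n in (1, 10, 100):
    print(n, trapezoidal(math.sin, 0.0, math.pi, n))  # approaches 2 as n grows
```

For a straight-line integrand the rule is exact even with n = 1, since the secant line coincides with the function.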
3.12.2 Simpson's Rule
Instead of using secant lines to approximate the integrand f(x), we can
replace the integrand by many parabolas. To simplify the discussion, let's
consider an integral over the range [−h, h]. We divide the
interval into two subintervals of equal width and replace the integrand by a
parabola p(x) = ax² + bx + c such that it passes through A0(−h, f0), A1(0, f1),
and A2(h, f2). The parabola is shown by the dashed curve in figure 3.19.
$$\begin{aligned}
f_0 &= ah^2 - bh + c \\
f_1 &= c \\
f_2 &= ah^2 + bh + c
\end{aligned}$$
Figure 3.19: Simpson’s rule when n = 2
The constants a, b, and c can be determined. However, it is not necessary
to compute them because
$$\int_{-h}^{h}(ax^2 + bx + c)\,dx = \left(\frac{ax^3}{3} + \frac{bx^2}{2} + cx\right)\Big|_{-h}^{h} = \frac{h}{3}\,(2ah^2 + 6c)$$
and we can simply write
$$\int_{-h}^{h}(ax^2 + bx + c)\,dx = \frac{h}{3}\,(f_0 + 4f_1 + f_2) \tag{3.9}$$
One can check that 2ah² + 6c = f0 + 4f1 + f2. In practice, we subdivide the
interval [a, b] into n subintervals, where n is even. Then we replace the integrand
by n/2 parabolas. The integral over the region [x_{i−1}, x_{i+1}] has an
expression similar to that in equation 3.9. Hence, we have
$$\int_{x_{i-1}}^{x_{i+1}} f(x)\,dx \approx \frac{h}{3}\,(f_{i-1} + 4f_i + f_{i+1})$$
Let x0 = a, xn = b, xi = a + ih, and h = (b − a)/n. The overall result is known as
Simpson's rule
$$\begin{aligned}
\int_a^b f(x)\,dx &= \sum_{i=1}^{n/2}\int_{x_{2i-2}}^{x_{2i}} f(x)\,dx \\
&\approx \frac{h}{3}\sum_{i=1}^{n/2}\,(f_{2i-2} + 4f_{2i-1} + f_{2i}) \\
&= \frac{h}{3}\,(f_0 + 4f_1 + f_2 + f_2 + 4f_3 + f_4 + f_4 + 4f_5 + f_6 + \ldots + f_{n-2} + 4f_{n-1} + f_n) \\
&= \frac{h}{3}\,[f_0 + 4\,(f_1 + f_3 + f_5 + \ldots + f_{n-1}) + 2\,(f_2 + f_4 + f_6 + \ldots + f_{n-2}) + f_n]
\end{aligned}$$
Example 3.25. If $I = \int_0^{\pi} e^x\cos x\,dx$, compute I by the following methods.
(a) Integration by parts
(b) Trapezoidal rule with n = 2, 4, 8, 16, 32 and 64
(c) Simpson’s rule with n = 2, 4, 8, 16, 32 and 64
Solution:
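(a) Using integration by parts, we have
$$\begin{aligned}
I &= \int_0^{\pi} e^x\cos x\,dx \\
&= \int_0^{\pi} \cos x\,d(e^x) \\
&= e^x\cos x\,\Big|_0^{\pi} + \int_0^{\pi} e^x\sin x\,dx \\
&= -e^{\pi} - 1 + \int_0^{\pi} \sin x\,d(e^x) \\
&= -e^{\pi} - 1 + \left(e^x\sin x\,\Big|_0^{\pi} - \int_0^{\pi} e^x\cos x\,dx\right) \\
&= -e^{\pi} - 1 - \int_0^{\pi} e^x\cos x\,dx \\
&= -e^{\pi} - 1 - I
\end{aligned}$$
Thus 2I = −1 − e^π, then we have $I = -\frac{1}{2}(1 + e^{\pi}) = -12.07034632$.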
(b) and (c)

n     In by Trapezoidal rule    In by Simpson's rule
2     -17.389259                -11.5928395534
4     -13.336023                -11.9849440198
8     -12.382162                -12.0642089572
16    -12.148004                -12.0699513233
32    -12.089742                -12.0703214561
64    -12.075194                -12.0703447599

We observe that the answer is correct to 4 decimal places by using Simpson's rule with n = 32, while the trapezoidal rule converges to the answer much
more slowly.
∎
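The two columns of the table can be regenerated with a few lines of code. The sketch below implements both rules as written in sections 3.12.1 and 3.12.2 and applies them to the integrand of this example; the function names are illustrative.

```python
import math

def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    return h / 2 * (f(a) + 2 * sum(f(a + i * h) for i in range(1, n)) + f(b))

def simpson(f, a, b, n):
    """Composite Simpson's rule; n must be even."""
    if n % 2:
        raise ValueError("Simpson's rule needs an even n")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))   # f1, f3, ..., f(n-1)
    even = sum(f(a + i * h) for i in range(2, n, 2))  # f2, f4, ..., f(n-2)
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

def integrand(x):
    return math.exp(x) * math.cos(x)

exact = -0.5 * (1 + math.exp(math.pi))  # result of part (a)
for n in (2, 4, 8, 16, 32, 64):
    print(n, trapezoidal(integrand, 0.0, math.pi, n), simpson(integrand, 0.0, math.pi, n))
print("exact", exact)
```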
3.13 Useful Integration Formulae
All constants of integration are omitted but implied in this table.

f(x)              ∫ f(x) dx
c                 cx
x                 x²/2
cxⁿ (n ≠ −1)      cxⁿ⁺¹/(n + 1)
sin x             −cos x
cos x             sin x
sec² x            tan x
csc² x            −cot x
sec x tan x       sec x
csc x cot x       −csc x
sec x             ln(sec x + tan x) or ln tan(π/4 + x/2)
csc x             ln(csc x − cot x) or ln tan(x/2)
tan x             ln sec x
cot x             ln sin x
eˣ                eˣ
1/x               ln x
ln x              x ln x − x

Integration by Parts: ∫ u dv = uv − ∫ v du
Chapter 4
Ordinary Differential Equations
Ordinary differential equations (ODEs) are widely used in physics and
engineering. The word "ordinary" means that the equation involves a function
of a single variable, such as y(x) or y(t). In this chapter, we will focus on
first order ordinary differential equations. "First order" means that the
equation has the term dy/dx but no higher order derivatives. If the equation
has a second order term, we can try to reduce it to a first order equation and
proceed with the calculation.
4.1 Separation of Variables
If the right-hand side of the first order differential equation
$$\frac{dy}{dx} = f(x, y)$$
can be expressed as a function that depends only on x times a function
that depends only on y, then the differential equation is called separable. Such
an equation has the form
$$\frac{dy}{dx} = g(x)\,p(y)$$
To solve the equation, we multiply both sides by 1/p(y) and dx. Hence, we
obtain
$$\frac{dy}{p(y)} = g(x)\,dx$$
Then integrate both sides of the equation
$$\int\frac{dy}{p(y)} = \int g(x)\,dx$$
This technique is known as the separation of variables.
Example 4.1. Solve $\dfrac{dy}{dx} = \dfrac{x - 5}{y^2}$.
Solution: We separate the variables on the two sides of the equation as y² dy =
(x − 5) dx. Integrating, we have
$$\int y^2\,dy = \int (x - 5)\,dx$$
$$\frac{y^3}{3} = \frac{x^2}{2} - 5x + C$$
$$y = \left(\frac{3x^2}{2} - 15x + 3C\right)^{1/3}$$
Replacing 3C by K, we then have
$$y = \left(\frac{3x^2}{2} - 15x + K\right)^{1/3}$$
∎
Example 4.2. A 1-kg particle is driven by a force f (x) = − sin x along the
x-axis. Its initial position is at the origin and its initial velocity is 2 m/s
towards the positive x-direction. (a) Express the velocity of the particle as
a function of position. (b) Hence, find the limiting position of the particle.
Solution:
(a) Newton's second law gives the equation of motion of the particle
$$-\sin x = (1)\,\ddot{x} ,$$
where ẍ = d²x/dt² is the acceleration of the particle. Using the chain rule,
we have
$$\ddot{x} = \frac{d\dot{x}}{dt} = \frac{d\dot{x}}{dx}\frac{dx}{dt} = \frac{d\dot{x}}{dx}\,\dot{x}$$
Then, we obtain
$$-\sin x = \frac{d\dot{x}}{dx}\,\dot{x}$$
Separating the variables and integrating both sides, we have
$$-\int_0^x \sin x\,dx = \int_2^v \dot{x}\,d\dot{x}$$
$$\cos x\,\Big|_0^x = \frac{\dot{x}^2}{2}\,\Big|_2^v$$
$$2\,(1 + \cos x) = v^2 \tag{4.1}$$
where v is the velocity of the particle when it has a displacement x from the
origin. Using the identity cos x = 2 cos²(x/2) − 1 and the fact that v = 2 when
x = 0, we have v = 2 cos(x/2).
(b) Rewriting the velocity v as dx/dt, we have dx/dt = 2 cos(x/2). Then
$$\int_0^x \sec\left(\frac{x}{2}\right)dx = 2\int_0^t dt$$
$$2\ln\tan\left(\frac{\pi}{4} + \frac{x}{4}\right)\Big|_0^x = 2t$$
$$\ln\tan\left(\frac{\pi}{4} + \frac{x}{4}\right) = t$$
When t → ∞, we have π/4 + x/4 → π/2. Thus, the limiting position of the particle
is at x = π. In the integration, we have applied the formula
$$\int \sec\theta\,d\theta = \ln(\sec\theta + \tan\theta) = \ln\tan\left(\frac{\pi}{4} + \frac{\theta}{2}\right)$$
∎
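The limiting position can also be seen by integrating dx/dt = 2 cos(x/2) numerically. The sketch below uses a classical fourth-order Runge-Kutta step, which is an assumed choice of integrator and not a method from the notes, and compares the result with x(t) = 4 arctan(eᵗ) − π, obtained by inverting ln tan(π/4 + x/4) = t.

```python
import math

def rk4_autonomous(f, x0, t_end, steps):
    """Integrate dx/dt = f(x) from t = 0 to t_end with the classical RK4 scheme."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def velocity(x):
    return 2 * math.cos(x / 2)      # result of part (a)

def x_closed(t):
    return 4 * math.atan(math.exp(t)) - math.pi  # inverse of ln tan(pi/4 + x/4) = t

for t in (1.0, 5.0, 20.0):
    print(t, rk4_autonomous(velocity, 0.0, t, int(1000 * t)), x_closed(t))
```

For large t both columns settle at π, the position where the force and the velocity vanish together.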
4.2 Simple Harmonic Motion
A particle of mass m is connected to a light spring that obeys Hooke's
law F = −kx, where F is the restoring force in the spring when the spring is
stretched or compressed from its natural length by x. The spring constant
is denoted by k, which is a positive constant. If the spring is stretched, x is
positive and F becomes negative. On the other hand, if the spring is compressed, x is negative and F becomes positive. It means that the restoring
force always points opposite to the displacement of the particle and it tries
to restore the natural length of the spring.
Consider the initial condition of a spring-mass system as follows. At time
t = 0, the particle is displaced to the right by a distance A from the equilibrium position and it is released from rest. Newton's second law gives the equation of
motion, which is independent of the initial conditions of the system.
$$m\ddot{x} = -kx \qquad\text{or}\qquad \ddot{x} = -\omega^2 x ,$$
where ω² = k/m and ẍ is the acceleration of the particle, sometimes
denoted by a. It implies that the direction of the acceleration of the particle
is always opposite to the displacement of the particle. It is a second order
ordinary differential equation. It can be solved by reducing
its order to a first order equation and then separating the variables. Using
the chain rule ẍ = ẋ dẋ/dx, as in example 4.2, we have
$$\dot{x}\,\frac{d\dot{x}}{dx} = -\omega^2 x$$
$$\int_0^v \dot{x}\,d\dot{x} = -\omega^2\int_A^x x\,dx$$
$$v^2 = \omega^2\,(A^2 - x^2)$$
Figure 4.1: The spring-mass system
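Since v = dx/dt and the particle moves towards the origin just after release, v is negative and we can write
$$\frac{dx}{dt} = -\omega\sqrt{A^2 - x^2}$$
Separating the variables and integrating both sides, we get
$$\int_A^x \frac{dx}{\sqrt{A^2 - x^2}} = -\omega\int_0^t dt$$
Substituting x = A sin θ, we have dx = A cos θ dθ and $\int_{\pi/2}^{\sin^{-1}(x/A)} d\theta = -\omega t$, which
becomes sin⁻¹(x/A) − π/2 = −ωt. Thus,
$$\frac{x}{A} = \sin\left(\frac{\pi}{2} - \omega t\right) = \cos\omega t$$
$$x = A\cos\omega t$$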
It is a periodic function showing that the particle oscillates about the origin
with a constant period T = 2π/ω. The variations of velocity v and acceleration a are also periodic, where v = ẋ = −Aω sin ωt and a = ẍ = −Aω 2 cos ωt =
−ω 2 x.
In summary, in a spring-mass system, if the particle is released at x = A
it will oscillate with amplitude A about the equilibrium position, i.e. x = 0.
The equation of motion is ẍ = −ω²x and
$$\begin{cases}
x = A\cos\omega t \\
v = -A\omega\sin\omega t \\
a = -A\omega^2\cos\omega t
\end{cases}$$
where ω = √(k/m) is the angular frequency. The maximum magnitude of velocity is vmax = ωA when x = 0, and the maximum magnitude of acceleration is amax = ω²A when x = A or x = −A. The period of oscillation is
T = 2π/ω = 2π√(m/k). It is worth noticing that the velocity is related to the displacement by v² = ω²(A² − x²). This expression is also a direct consequence of
conservation of energy.
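The summary can be checked by integrating ẍ = −ω²x numerically from the initial condition x = A, v = 0. The sketch below again uses a Runge-Kutta (RK4) step, which is an assumed choice of integrator, with illustrative values of k, m and A.

```python
import math

def simulate_shm(k, m, A, t_end, steps=10_000):
    """RK4 integration of dx/dt = v, dv/dt = -(k/m) x, released from rest at x = A."""
    w2 = k / m
    h = t_end / steps
    x, v = A, 0.0
    for _ in range(steps):
        k1x, k1v = v, -w2 * x
        k2x, k2v = v + 0.5 * h * k1v, -w2 * (x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, -w2 * (x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, -w2 * (x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x, v

k, m, A, t = 4.0, 1.0, 0.5, 3.0          # illustrative values, so omega = 2
omega = math.sqrt(k / m)
x_num, v_num = simulate_shm(k, m, A, t)
print(x_num, A * math.cos(omega * t))            # position
print(v_num, -A * omega * math.sin(omega * t))   # velocity
print(v_num**2, omega**2 * (A**2 - x_num**2))    # energy relation
```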
Example 4.3. A particle of mass m is placed on a smooth horizontal plane.
It is connected to a spring of spring constant k and is projected to the right
with speed v0 from its equilibrium position. Find the velocity of the particle
as functions of displacement and time respectively.
Figure 4.2: The spring-mass system
Solution: The velocity as a function of displacement can be obtained readily
if we consider the conservation of energy. Instead, we try
to obtain it by using the equation of motion of the particle, ẍ = −ω²x, where
ω = √(k/m). Using the chain rule ẍ = ẋ dẋ/dx, as in example 4.2, we have
$$\dot{x}\,\frac{d\dot{x}}{dx} = -\omega^2 x$$
$$\int_{v_0}^{v} \dot{x}\,d\dot{x} = -\omega^2\int_0^x x\,dx$$
$$v^2 - v_0^2 = -\omega^2 x^2$$
$$v^2 = v_0^2 - \omega^2 x^2$$
This is the velocity of the particle as a function of displacement. Recall that v = dx/dt, and
thus we have
$$\frac{dx}{dt} = \sqrt{v_0^2 - \omega^2 x^2}$$
Hence,
$$\int_0^x \frac{dx}{\sqrt{v_0^2 - \omega^2 x^2}} = \int_0^t dt$$
Using the substitution x = (v0/ω) sin θ, we have dx = (v0/ω) cos θ dθ and
$$\frac{1}{\omega}\int_0^{\sin^{-1}(\omega x/v_0)} d\theta = t$$
$$\sin^{-1}\left(\frac{\omega x}{v_0}\right) = \omega t$$
$$x = A\sin\omega t ,$$
where A = v0/ω is the amplitude of oscillation. Thus, we obtain the velocity
of the particle as a function of time, v = Aω cos ωt = v0 cos ωt.
∎
4.3 Free Fall with Air Resistance
A particle is released from rest at a height h above the floor. Suppose that the air
resistance is not negligible and that the drag force (and hence the deceleration it causes) is
linearly proportional to the velocity of the particle. We are interested in
the following problems: (a) determine the velocity of the particle at
time t, (b) determine the height of the particle at time t, and (c) find the
relation between the height and the velocity of the particle.
One should note that the acceleration of the particle varies with time. The
formulae for constant acceleration learned in high school, such as
v = u + at, s = ut + ½at², and v² = u² + 2as, are not applicable. We should
rely on calculus to solve these problems.
(a) Taking upward as positive, the acceleration of the particle is
a = −g − kv, where a = dv/dt. That is to say, dv/dt = −g − kv. Rearranging the
equation, we have
$$\begin{aligned}
\frac{dv}{-g - kv} &= dt \\
-\int_0^v \frac{dv}{g + kv} &= \int_0^t dt \\
-\frac{1}{k}\ln(g + kv)\,\Big|_0^v &= t \\
-\frac{1}{k}\ln\left(\frac{g + kv}{g}\right) &= t \\
\frac{g + kv}{g} &= e^{-kt} \\
v &= -\frac{g}{k}\left(1 - e^{-kt}\right)
\end{aligned}$$
The terminal speed of the particle is given by g/k. It is reached when the
drag force (upward force) cancels out the weight of the particle (downward
force).
Figure 4.3: The speed of a falling particle in air
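(b) The velocity of the particle is $v = \dfrac{dy}{dt} = -\dfrac{g}{k}\left(1 - e^{-kt}\right)$. Integrating both sides
with respect to t, we have
$$\begin{aligned}
\int_h^y dy &= -\frac{g}{k}\int_0^t \left(1 - e^{-kt}\right)dt \\
y - h &= -\frac{g}{k}\left(t + \frac{1}{k}e^{-kt}\right)\Big|_0^t \\
y - h &= -\frac{g}{k}\left(t + \frac{1}{k}e^{-kt} - \frac{1}{k}\right) \\
y &= h - \frac{g}{k}\,t + \frac{g}{k^2}\left(1 - e^{-kt}\right)
\end{aligned}$$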
(c) Using the chain rule to rewrite the acceleration, we obtain
$a = \dfrac{dv}{dt} = \dfrac{dv}{dy}\dfrac{dy}{dt} = \left(\dfrac{dv}{dy}\right)v$. The time t does not show explicitly in the expression.
Hence, we have
$$-g - kv = v\,\frac{dv}{dy}$$
$$dy = -\frac{v\,dv}{g + kv}$$
Integrating on both sides, we get
$$\begin{aligned}
\int_h^y dy &= -\int_0^v \frac{v\,dv}{g + kv} \\
y - h &= -\frac{1}{k}\int_0^v \left(\frac{kv + g - g}{g + kv}\right)dv \\
y - h &= -\frac{1}{k}\left[\int_0^v dv - g\int_0^v \frac{dv}{g + kv}\right] \\
y - h &= -\frac{1}{k}\left[v - \frac{g}{k}\ln\left(\frac{g + kv}{g}\right)\right]
\end{aligned}$$
Thus, the height relates to the velocity of the particle by
$$y = h - \frac{v}{k} + \frac{g}{k^2}\ln\left(\frac{g + kv}{g}\right)$$
∎
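The three results (a), (b) and (c) can be cross-checked against a direct numerical integration of dv/dt = −g − kv and dy/dt = v. The sketch below uses an RK4 step as an assumed integrator; the values of g, k, h and t are illustrative and do not come from the notes.

```python
import math

G = 9.8  # gravitational acceleration in m/s^2 (assumed value)

def v_exact(t, k):
    return -(G / k) * (1 - math.exp(-k * t))                      # result (a)

def y_exact(t, k, h):
    return h - (G / k) * t + (G / k**2) * (1 - math.exp(-k * t))  # result (b)

def y_from_v(v, k, h):
    return h - v / k + (G / k**2) * math.log((G + k * v) / G)     # result (c)

def integrate_fall(k, h, t_end, steps=20_000):
    """RK4 integration of dv/dt = -g - k v, dy/dt = v, starting from rest at height h."""
    dt = t_end / steps
    v, y = 0.0, h
    for _ in range(steps):
        a1 = -G - k * v
        a2 = -G - k * (v + 0.5 * dt * a1)
        a3 = -G - k * (v + 0.5 * dt * a2)
        a4 = -G - k * (v + dt * a3)
        # stage velocities used for the position update
        v2 = v + 0.5 * dt * a1
        v3 = v + 0.5 * dt * a2
        v4 = v + dt * a3
        y += dt * (v + 2 * v2 + 2 * v3 + v4) / 6
        v += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6
    return v, y

k, h, t = 0.5, 100.0, 2.0   # illustrative values
v_num, y_num = integrate_fall(k, h, t)
print(v_num, v_exact(t, k))
print(y_num, y_exact(t, k, h), y_from_v(v_exact(t, k), k, h))
```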
Example 4.4. A particle of mass m is projected with speed v0 at an angle
φ to the horizontal. Find the time when the particle reaches its maximum
height if air resistance is not negligible. The air resistance (i.e.
the drag force) is linearly proportional to the velocity of the particle.
Solution:
Let the air resistance have a force constant k, where k is positive. The equation
of motion of the particle along the vertical is
$$-mg - kv_y = ma_y ,$$
where ay is the acceleration of the particle along the vertical. But ay = dvy/dt,
which gives
$$\frac{dv_y}{dt} = -\left(\frac{mg + kv_y}{m}\right)$$
Rearranging the equation to separate the variables,
$$\frac{dv_y}{mg + kv_y} = -\frac{dt}{m}$$
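Integrating both sides of the equation, we obtain
$$\begin{aligned}
\int_{v_0\sin\varphi}^{v_y} \frac{dv_y}{mg + kv_y} &= -\frac{1}{m}\int_0^t dt \\
\frac{1}{k}\ln(mg + kv_y)\,\Big|_{v_y = v_0\sin\varphi}^{v_y} &= -\frac{t}{m} \\
\frac{1}{k}\ln\left(\frac{mg + kv_y}{mg + kv_0\sin\varphi}\right) &= -\frac{t}{m} \\
\frac{mg + kv_y}{mg + kv_0\sin\varphi} &= e^{-kt/m}
\end{aligned}$$
Therefore, the vertical velocity of the particle is given by
$$v_y = \frac{1}{k}\left[(mg + kv_0\sin\varphi)\,e^{-kt/m} - mg\right]$$
At the maximum point, vy = 0, thus
$$\begin{aligned}
(mg + kv_0\sin\varphi)\,e^{-kt/m} &= mg \\
e^{kt/m} &= 1 + \frac{kv_0}{mg}\sin\varphi \\
t &= \frac{m}{k}\ln\left(1 + \frac{kv_0}{mg}\sin\varphi\right)
\end{aligned}$$
∎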
4.4 Radioactive Decay
In a sample of N radioactive nuclei, the rate at which the nuclei decay
is proportional to N,
$$\frac{dN}{dt} = -\lambda N , \tag{4.2}$$
where λ is positive and is called the disintegration constant (or decay constant). In fact, λ has a characteristic value for every radionuclide. Rearranging equation 4.2, we have
$$\frac{dN}{N} = -\lambda\,dt$$
Integrating both sides, we obtain
$$\int_{N_0}^{N} \frac{dN}{N} = -\lambda\int_0^t dt$$
Here N0 is the number of radioactive nuclei in the sample at time t = 0. Then
$$\ln\left(\frac{N}{N_0}\right) = -\lambda t$$
$$\frac{N}{N_0} = e^{-\lambda t}$$
Therefore, we obtain the formula of radioactive decay:
$$N = N_0\,e^{-\lambda t} \tag{4.3}$$
A common measure of how long the radionuclides can last is the half-life T1/2. It is the time at which N decreases to one-half of its initial value.
From equation 4.3, we get
$$\begin{aligned}
\frac{N_0}{2} &= N_0\,e^{-\lambda T_{1/2}} \\
\frac{1}{2} &= e^{-\lambda T_{1/2}} \\
\ln\left(\frac{1}{2}\right) &= -\lambda\,T_{1/2} \\
T_{1/2} &= \frac{\ln 2}{\lambda}
\end{aligned}$$
It is also the time needed for $\left|\dfrac{dN}{dt}\right|$ to reduce to one-half of its initial value.
Example 4.5. A certain radioactive material is known to decay at a rate
proportional to the amount present. If initially there is 50 mg of the material present and after two hours it is observed that the material has lost 10
percent of its original mass, find the mass of the material after four hours.
Solution: Let N be the mass of the material and k be the decay constant,
we have
$$\frac{dN}{dt} = -kN$$
Following the steps as stated in section 4.4, we obtain
$$N = N_0\,e^{-kt} , \tag{4.4}$$
where N0 is the mass of the radioactive material at time t = 0. Thus, we have
$$N = 50\,e^{-kt}$$
As 10 % of the original mass has been lost after two hours (t = 2), the mass of the
material present is 50 − 5 = 45 mg. So
$$45 = 50\,e^{-2k} ,$$
which gives $k = \dfrac{1}{2}\ln\left(\dfrac{50}{45}\right) = 0.053$ per hour. After four hours (t = 4), there are
$N = 50\,e^{-0.053\,(4)} = 40.5$ mg of the material left.
∎
4.5
96
Charging a Capacitor
A capacitor is charged by a battery of voltage E when the switch S in the
circuit is closed. Figure 4.4 shows the RC circuit, which has a resistor of
resistance R. The capacitance of a capacitor is defined by
$$C = \frac{Q}{V} ,$$
where Q is the charge stored in the capacitor and V is the voltage across
the capacitor. It is a constant for a given capacitor. Going around the loop
clockwise and applying Kirchhoff's voltage rule, we have
$$E - \frac{Q}{C} - IR = 0 ,$$
Figure 4.4: The RC circuit
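where E, Q/C, and IR are the potential differences across the battery, the capacitor, and the resistor respectively, and I = dQ/dt. Then
$$\begin{aligned}
E - \frac{Q}{C} - \left(\frac{dQ}{dt}\right)R &= 0 \\
\frac{CE - Q}{RC} &= \frac{dQ}{dt} \\
\frac{dt}{RC} &= \frac{dQ}{CE - Q}
\end{aligned}$$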
Integrating both sides, we have
$$\begin{aligned}
\int_0^t \frac{dt}{RC} &= \int_0^Q \frac{dQ}{CE - Q} \\
\frac{t}{RC} &= -\int_{Q=0}^{Q} \frac{d(CE - Q)}{CE - Q} \\
\frac{t}{RC} &= -\ln(CE - Q)\,\Big|_0^Q \\
\frac{t}{RC} &= \ln\left(\frac{CE}{CE - Q}\right) \\
e^{t/RC} &= \frac{CE}{CE - Q} \\
Q &= CE\left(1 - e^{-t/RC}\right)
\end{aligned}$$
Therefore, we obtain the variation of the stored charge with time,
$$Q = Q_0\left(1 - e^{-t/\tau}\right) .$$
A plot of the quantities is shown in figure 4.5.
Figure 4.5: The charging curve of a capacitor
Here Q0 = CE is the maximum charge stored in the capacitor and
τ = RC is the time constant. If the circuit has a greater value of τ, the time
needed for charging the capacitor is longer. When t = τ, the charge stored
in the capacitor is about 63 % of the maximum value. When the capacitor
is fully charged, the voltage across it equals that of the battery and
no current flows in the circuit. The variation of current with
time is
$$I = \frac{dQ}{dt} = \frac{d}{dt}\left\{Q_0\left(1 - e^{-t/\tau}\right)\right\} = \frac{Q_0}{\tau}\,e^{-t/\tau}$$
$$I = \frac{E}{R}\,e^{-t/\tau}$$
From the above equation, we realize that the initial current is E/R. At the
instant the switch is closed, the uncharged capacitor has no voltage across it,
so the circuit behaves as if the capacitor were replaced by a wire.
Then, the current drops exponentially and eventually vanishes. At t = τ,
the current is about 37 % of the maximum value.
Figure 4.6: The variation of current in an RC circuit
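The 63 % and 37 % figures quoted above follow from evaluating the two exponentials at t = τ. The sketch below does this for a hypothetical circuit; the component values (E = 9 V, R = 1 kΩ, C = 1 µF) are assumptions made purely for illustration.

```python
import math

def charge(t, E, R, C):
    """Stored charge Q(t) = C E (1 - exp(-t / RC))."""
    return C * E * (1 - math.exp(-t / (R * C)))

def current(t, E, R, C):
    """Charging current I(t) = (E / R) exp(-t / RC)."""
    return (E / R) * math.exp(-t / (R * C))

E, R, C = 9.0, 1.0e3, 1.0e-6   # hypothetical battery voltage, resistance, capacitance
tau = R * C                    # time constant in seconds

print(charge(tau, E, R, C) / (C * E))    # fraction of the maximum charge, about 0.63
print(current(tau, E, R, C) / (E / R))   # fraction of the initial current, about 0.37
```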
4.6 Parabolic Mirror
Parallel light beams travel along the x direction as shown in the figure. Here
we investigate the shape of a mirror such that it reflects all such beams to the
origin. This is a classical problem that can be reduced to a separable differential
equation.
Consider a ray of light KM which meets the mirror at M (x, y). Let T T ′
be a tangent to the mirror at M , then we have ∠T ′ M K = ∠T M O = ∠M T O.
Figure 4.7: A ray of light is reflected to pass the origin
Hence, OT = OM = √(x² + y²). The equation of the tangent at M is Y − y = y′(X − x),
where y′ = dy/dx. It gives the x-intercept X = x − y/y′ when Y = 0. Then
|OT| = |X| = −X = −x + y/y′. Thus, |OT| = |OM| implies √(x² + y²) = −x + y/y′ or
(x + √(x² + y²)) dy = y dx. Therefore,
$$\frac{dy}{dx} = \frac{y}{x + \sqrt{x^2 + y^2}} \tag{4.5}$$
This is a homogeneous differential equation of degree zero because the RHS of
the above equation is a homogeneous function with f(λx, λy) = f(x, y). Generally,
a homogeneous function of degree n gives f(λx, λy) = λⁿ f(x, y). Now,
we let x = ty and differentiate both sides with respect to y, then we have
$$\frac{dx}{dy} = t + y\left(\frac{dt}{dy}\right) \tag{4.6}$$
dy
Knowing the fact that
dx
dy −1
= ( ) and substituting equation 4.6 into equady
dx
tion 4.5, we obtain
dy
dt
=√
y
1 + t2
Integrating both sides, we get
$$\int \frac{dy}{y} = \int \frac{dt}{\sqrt{1 + t^2}}$$
$$\ln y = \ln\left(t + \sqrt{1 + t^2}\right) + \ln C$$
Hence, we have $y = C\left(t + \sqrt{1 + t^2}\right)$. Substituting t = x/y, we obtain
$$x + \sqrt{x^2 + y^2} = \frac{y^2}{C}$$
After simplification, we finally get
$$y^2 = 2C\left(x + \frac{C}{2}\right)$$
4.7 Torricelli's Law of Draining
Suppose that a cylindrical water tank has water leaving it through a small
hole of area a at the bottom of the tank. Figure 4.8 shows the water tank.
Denote the depth of water in the tank as y(t) and the volume of water as V
at time t. The velocity of the stream of water leaving the hole is v = √(2gy),
which is the velocity acquired by a drop of water falling freely from the water surface to the hole.
Figure 4.8: A draining cylindrical tank
It is worth mentioning that the movement of the water surface is very slow
compared with that of the leaving water. During a short time interval dt, the
change in volume is
$$dV = -av\,dt = -a\sqrt{2gy}\,dt$$
It is equal to the change in volume near the water surface, dV = A dy, where
A is the cross sectional area of the cylinder. Therefore,
$$A\,dy = -a\sqrt{2gy}\,dt$$
Hence, we have
$$\frac{dy}{dt} = -\frac{a}{A}\sqrt{2gy}$$
Putting $k = a\sqrt{2g}/A$, we have
$$\frac{dy}{dt} = -k\sqrt{y}$$
This is a separable differential equation which has the following form.
$$\frac{dy}{\sqrt{y}} = -k\,dt$$
Integrating both sides and setting the lower and upper limits by the initial
height y1 and the final height y2 respectively, we have
$$\int_{y_1}^{y_2} \frac{dy}{\sqrt{y}} = -k\int_0^t dt$$
$$2\sqrt{y}\,\Big|_{y_1}^{y_2} = -kt$$
Note that y1 > y2. Hence, we have
$$2\left(\sqrt{y_2} - \sqrt{y_1}\right) = -kt \tag{4.7}$$
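Equation 4.7 can be used in two directions: two depth readings fix the constant k, and k then gives the time needed to drain the tank completely (set y2 = 0). The sketch below shows both steps; the function names and the depth readings (4 m falling to 1 m in one hour) are hypothetical values chosen so that the arithmetic is easy to follow by hand.

```python
import math

def drain_constant(y1, y2, t):
    """Solve 2*(sqrt(y2) - sqrt(y1)) = -k*t for k, given two depths a time t apart."""
    return 2 * (math.sqrt(y1) - math.sqrt(y2)) / t

def time_to_empty(y1, k):
    """Time for the depth to fall from y1 to zero, from equation 4.7 with y2 = 0."""
    return 2 * math.sqrt(y1) / k

k = drain_constant(4.0, 1.0, 1.0)   # depth in m, time in hr
print(k)                            # 2*(2 - 1)/1 = 2 m^(1/2) per hr
print(time_to_empty(4.0, k))        # 2*2/2 = 2 hr after the first reading
```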
Example 4.6. A cylindrical tank has a small hole at the bottom and water
is draining from the hole. The water is 5.0 m deep at noon and it is 2.5 m
deep at 1:00 p.m. When will the tank be empty?
Solution: We apply equation 4.7 and substitute t = 1 hr, y1 = 5.0 m and
y2 = 2.5 m, then
$$2\left(\sqrt{2.5} - \sqrt{5.0}\right) = -k\,(1)$$
We obtain k = 1.31 m^(1/2) hr⁻¹. When the tank is empty, equation 4.7 becomes
$$2\left(\sqrt{0} - \sqrt{5.0}\right) = -(1.31)\,t$$
$$t = 3.41\ \text{hr}$$
The tank will be empty at about 3:25 p.m.
∎
4.8 First Order Linear Differential Equation
The first order linear differential equation has the standard form
$$\frac{dy}{dx} + P(x)\,y = Q(x) \tag{4.8}$$
One can solve it by multiplying both sides by an integrating factor $e^{\int P(x)\,dx}$.
Equation 4.8 becomes
$$e^{\int P(x)\,dx}\,\frac{dy}{dx} + e^{\int P(x)\,dx}\,P(x)\,y = e^{\int P(x)\,dx}\,Q(x)$$
$$\frac{d}{dx}\left(y\,e^{\int P(x)\,dx}\right) = e^{\int P(x)\,dx}\,Q(x)$$
Then integrating both sides with respect to x, we obtain
$$y\,e^{\int P(x)\,dx} = \int \left[e^{\int P(x)\,dx}\,Q(x)\right]dx$$
Thus the solution of equation 4.8 is
$$y = \left(e^{-\int P(x)\,dx}\right)\int \left[e^{\int P(x)\,dx}\,Q(x)\right]dx$$
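Example 4.7. Solve $x\,\dfrac{dy}{dx} - ky = x^2$.
Solution: Rearrange the equation into the standard form
$$\frac{dy}{dx} - \left(\frac{k}{x}\right) y = x \tag{4.9}$$
where $P(x) = -k/x$ and $Q(x) = x$. The integrating factor is
$$e^{\int P(x)\,dx} = e^{-\int \frac{k}{x}\,dx} = e^{-k\ln x} = e^{\ln x^{-k}} = x^{-k}.$$
Multiplying both sides of equation 4.9 by $x^{-k}$, we have
$$x^{-k}\,\frac{dy}{dx} - x^{-k}\left(\frac{k}{x}\right) y = x^{1-k}$$
$$\frac{d}{dx}\left(y\,x^{-k}\right) = x^{1-k}$$
$$y\,x^{-k} = \int x^{1-k}\,dx$$
Thus, assuming $k \neq 2$,
$$y = x^{k}\int x^{1-k}\,dx$$
$$y = x^{k}\left(\frac{x^{2-k}}{2-k} + C\right)$$
$$y = \frac{x^{2}}{2-k} + C x^{k}$$
∎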
Example 4.8. Solve $(x^2+1)\,\dfrac{dy}{dx} + 4xy = x$, where $y(2) = 1$.
Solution: Rearrange the equation into the standard form
$$\frac{dy}{dx} + \left(\frac{4x}{x^2+1}\right) y = \frac{x}{x^2+1} \tag{4.10}$$
where $P(x) = \dfrac{4x}{x^2+1}$ and $Q(x) = \dfrac{x}{x^2+1}$. The integrating factor is
$$e^{\int P(x)\,dx} = e^{\int \frac{4x}{x^2+1}\,dx} = e^{2\int \frac{d(x^2+1)}{x^2+1}} = e^{2\ln(x^2+1)} = e^{\ln(x^2+1)^2} = (x^2+1)^2$$
Multiplying both sides of equation 4.10 by $(x^2+1)^2$, we have
$$(x^2+1)^2\,\frac{dy}{dx} + 4x\,(x^2+1)\,y = x\,(x^2+1)$$
$$\frac{d}{dx}\left\{y\,(x^2+1)^2\right\} = x\,(x^2+1)$$
$$y\,(x^2+1)^2 = \int x\,(x^2+1)\,dx$$
Therefore
$$y\,(x^2+1)^2 = \frac{x^4}{4} + \frac{x^2}{2} + C$$
Applying the given condition $y(2) = 1$, we obtain $25 = 4 + 2 + C$, so $C = 19$. We finally obtain
$$y\,(x^2+1)^2 = \frac{x^4}{4} + \frac{x^2}{2} + 19$$
∎
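A quick numerical check of this result is sketched below in Python. It substitutes the closed-form $y(x)$ back into the original equation, approximating $dy/dx$ with a central difference; the helper names `y` and `residual` and the step size `h` are assumptions made for illustration.

```python
def y(x):
    """Closed-form solution of Example 4.8, solved for y."""
    return (x**4 / 4 + x**2 / 2 + 19) / (x**2 + 1) ** 2

def residual(x, h=1e-5):
    """(x^2+1) y' + 4xy - x, with y' from a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return (x**2 + 1) * dydx + 4 * x * y(x) - x

print("y(2) =", y(2.0))
print("largest residual:", max(abs(residual(x)) for x in (0.0, 0.5, 1.0, 2.0, 3.0)))
```

A residual close to zero at several sample points, together with $y(2) = 1$, is consistent with the solution obtained by the integrating factor.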
Example 4.9. A large tank initially contains 50 m³ of brine in which there
is dissolved 10 kg of salt. Brine containing 2 kg of dissolved salt per m³
flows into the tank at the rate of 5 m³/min. The mixture is kept uniform by
stirring, and the stirred mixture simultaneously flows out at the slower rate
of 3 m³/min. How much salt is in the tank at any time $t > 0$?
Figure 4.9: A tank of brine
Solution: The volume of brine in the tank at time $t$ (in minutes) is $\{50 + (5-3)\,t\}$ m³ $= \{50 + 2t\}$ m³. Let the amount of salt in the tank at that instant be $x$ (in kg). The concentration of salt in the tank is $\dfrac{x}{50+2t}$.
The rate of change of salt in the tank is
$$\frac{dx}{dt} = 5(2) - 3\left(\frac{x}{50+2t}\right)$$
Hence, we have
$$\frac{dx}{dt} + \frac{3x}{50+2t} = 10 \tag{4.11}$$
It is a first order differential equation in the standard form given by equation 4.8, where $P(t) = \dfrac{3}{50+2t}$ and $Q(t) = 10$. The integrating factor is
$$e^{\int P(t)\,dt} = e^{\int \frac{3}{50+2t}\,dt} = e^{\frac{3}{2}\int \frac{d(50+2t)}{50+2t}} = e^{\frac{3}{2}\ln(50+2t)} = e^{\ln(50+2t)^{3/2}} = (50+2t)^{3/2}$$
Multiplying both sides of equation 4.11 by this factor, we have
$$\frac{d}{dt}\left\{x\,(50+2t)^{3/2}\right\} = 10\,(50+2t)^{3/2}$$
Integrating both sides with respect to $t$, we obtain
$$\int_{t=0,\,x=10}^{t,\,x} d\left\{x\,(50+2t)^{3/2}\right\} = 10\int_0^t (50+2t)^{3/2}\,dt$$
$$x\,(50+2t)^{3/2}\,\Big|_{t=0,\,x=10}^{t,\,x} = 5\int_0^t (50+2t)^{3/2}\,d(50+2t)$$
$$x\,(50+2t)^{3/2} - 10\,(50)^{3/2} = 2\,(50+2t)^{5/2}\,\Big|_0^t$$
$$x\,(50+2t)^{3/2} - 10\,(50)^{3/2} = 2\,(50+2t)^{5/2} - 2\,(50^{5/2})$$
Then
$$x = 2\,(50+2t) - \frac{2\,(50^{5/2}) - 10\,(50^{3/2})}{(50+2t)^{3/2}}$$
After simplification, using $2\,(50^{5/2}) - 10\,(50^{3/2}) = 90\,(50^{3/2}) = 22{,}500\sqrt{2}$, we obtain the amount of salt in the tank at time $t > 0$
$$x = 100 + 4t - \frac{22{,}500\sqrt{2}}{(50+2t)^{3/2}}$$
∎
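As a cross-check, the brine equation $dx/dt = 10 - 3x/(50+2t)$ with $x(0) = 10$ can be integrated numerically and compared with the closed form. The sketch below uses a hand-written fourth-order Runge-Kutta stepper; the function names, the step count and the end time of 30 minutes are illustrative assumptions, not part of the example.

```python
import math

def salt_exact(t):
    """Closed-form amount of salt (kg) at time t (min) from Example 4.9."""
    return 100 + 4 * t - 22500 * math.sqrt(2) / (50 + 2 * t) ** 1.5

def salt_rate(t, x):
    """Right-hand side of dx/dt = 10 - 3x/(50 + 2t)."""
    return 10 - 3 * x / (50 + 2 * t)

def rk4(f, t0, x0, t_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration of dx/dt = f(t, x)."""
    h = (t_end - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

x_num = rk4(salt_rate, 0.0, 10.0, 30.0)
print("numerical:", x_num, "closed form:", salt_exact(30.0))
```

The closed form also reproduces the initial condition, since `salt_exact(0)` evaluates to 10 kg.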
4.9 Second Order Homogeneous Differential Equations
In this section, we focus on the second order homogeneous differential equation
$$\ddot{x} + 2b\,\dot{x} + \omega^2 x = 0 .$$
The term ’homogeneous’ means that the right-hand side of the equation is zero.
For simplicity, we ignore the first order term in this discussion, i.e. $b = 0$. A
typical example is the simple harmonic motion of an object, an ideal
system without friction in which the total energy of the object is constant.
The equation of motion is
$$\ddot{x} + \omega^2 x = 0 , \tag{4.12}$$
where $x$ is the displacement of the object measured from the equilibrium
position and $\omega > 0$. Recall that in section 4.2 we solved this equation
by reduction of order. Now, we solve it directly.
Let $x = e^{\lambda t}$ be a trial solution of equation 4.12, then $\dot{x} = \lambda e^{\lambda t}$ and $\ddot{x} = \lambda^2 e^{\lambda t}$.
Substituting into the equation, we have
$$(\lambda^2 + \omega^2)\,e^{\lambda t} = 0$$
which implies $\lambda^2 + \omega^2 = 0$ because $e^{\lambda t} \neq 0$. This is the auxiliary equation of the differential
equation and it has roots $\lambda = \pm i\omega$. Thus $e^{i\omega t}$ and $e^{-i\omega t}$ satisfy equation 4.12.
The complementary solution of the equation is
$$x = c_1 e^{i\omega t} + c_2 e^{-i\omega t} ,$$
where $c_1$ and $c_2$ are arbitrary constants that can be determined if the initial
conditions are given. Using the fact that $e^{i\omega t} = \cos\omega t + i\sin\omega t$ and $e^{-i\omega t} = \cos\omega t - i\sin\omega t$, the solution becomes
$$x = (c_1 + c_2)\cos\omega t + i\,(c_1 - c_2)\sin\omega t$$
where $c_1$ and $c_2$ could be complex numbers, which would amount to four real
arbitrary constants. However, the general solution of a second order differential
equation contains only two arbitrary constants, so $c_1$ and $c_2$ must be
related such that only two real quantities are independent. Since the displacement $x$ is real, we
take $c_1$ and $c_2$ to be complex conjugates of each other, so that $A = c_1 + c_2$ and $B = i\,(c_1 - c_2)$ are both real.
The solution becomes
$$x = A\cos\omega t + B\sin\omega t \tag{4.13}$$
Differentiating both sides with respect to $t$,
$$\dot{x} = \omega\,(-A\sin\omega t + B\cos\omega t) \tag{4.14}$$
Let’s consider a typical example first. If an object is released from rest at $x = D$ when
$t = 0$, equations 4.13 and 4.14 give the arbitrary constants $A = D$ and $B = 0$. Hence, we have
$$x = D\cos\omega t$$
Next, we study a second example. If the object is projected to the right
with velocity $v_0$ at $x = 0$ when $t = 0$, we obtain $A = 0$ and $B = v_0/\omega$. Hence,
we have
$$x = \frac{v_0}{\omega}\sin\omega t ,$$
where $v_0/\omega$ is the amplitude of the oscillation.
Sometimes, equation 4.13 is written as
$$x = \mu\sin(\omega t + \varphi) , \tag{4.15}$$
where $\mu$ and $\varphi$ are the amplitude and the phase angle of the oscillation
respectively. The proof of equation 4.15 is straightforward: put
$A = \mu\sin\varphi$ and $B = \mu\cos\varphi$ in equation 4.13, where $\mu = \sqrt{A^2 + B^2}$. Making
use of the trigonometric identity $\sin(x + y) = \sin x\cos y + \cos x\sin y$, we have
$$x = A\cos\omega t + B\sin\omega t = \mu\,(\sin\varphi\cos\omega t + \cos\varphi\sin\omega t) = \mu\sin(\omega t + \varphi)$$
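The conversion between the two forms of the solution can be sketched in Python as follows. The helper `amplitude_phase` and the sample values of `A`, `B` and `omega` are illustrative assumptions; `math.atan2` is used so that the phase lands in the correct quadrant when $A = \mu\sin\varphi$ and $B = \mu\cos\varphi$.

```python
import math

def amplitude_phase(A, B):
    """Return (mu, phi) such that A = mu*sin(phi) and B = mu*cos(phi)."""
    mu = math.hypot(A, B)
    phi = math.atan2(A, B)   # quadrant-aware inverse of tan(phi) = A/B
    return mu, phi

omega = 2.0
A, B = 3.0, 3.0 * math.sqrt(3.0)   # illustrative constants
mu, phi = amplitude_phase(A, B)

# Compare A cos(wt) + B sin(wt) with mu sin(wt + phi) at a few times.
max_diff = max(
    abs(A * math.cos(omega * t) + B * math.sin(omega * t)
        - mu * math.sin(omega * t + phi))
    for t in (0.0, 0.3, 1.7)
)
print(mu, phi, max_diff)
```

With these sample constants the amplitude is 6 and the phase is $\pi/6$.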
Example 4.10. The equation of motion of a harmonic oscillator is given by
$\dfrac{d^2x}{dt^2} + 4x = 0$, where $x$ is the displacement of the oscillator measured from
its equilibrium position at time $t$. If $x = 3$ m and $\dot{x} = 6\sqrt{3}$ m/s when $t = 0$,
determine $x$ as a function of time.
Solution: The roots of the auxiliary equation $\lambda^2 + 4 = 0$ are $\pm 2i$. Then the
complementary solution is $x = A\,e^{2it} + B\,e^{-2it}$ or $x = \mu\sin(2t + \varphi)$. In practice,
we use the latter expression to describe the oscillation. Differentiating it gives
$$\dot{x} = 2\mu\cos(2t + \varphi)$$
From the initial conditions, we have
$$\begin{cases} 3 = \mu\sin\varphi \\ 6\sqrt{3} = 2\mu\cos\varphi \end{cases}$$
Then, we know $\mu = 6$ and $\varphi = \dfrac{\pi}{6}$. Therefore, we obtain $x = 6\sin\left(2t + \dfrac{\pi}{6}\right)$. ∎
Chapter 5
Trigonometry and Complex Numbers
5.1 Compound Angle Formulae
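Before we discuss how to represent complex numbers on a plane, we
need some angle formulae. Consider the triangle $ABC$ in figure 5.1, where $D$ is the foot of the perpendicular from $C$ to $AB$. The
length of the side $AB$ is denoted by $c$ and is equal to the sum of $AD$ and
$DB$,
$$c = b\cos\alpha + a\cos\beta \tag{5.1}$$
Figure 5.1: Addition angle formula (triangle $ABC$ with sides $a$, $b$, $c$, angle $\alpha$ at $A$, angle $\beta$ at $B$, angle $\pi - \alpha - \beta$ at $C$, and $D$ on $AB$)
There are several ways to write down the area of the triangle $ABC$:
$$\text{Area} = \frac{1}{2}\,bc\sin\alpha = \frac{1}{2}\left(b^2\sin\alpha\cos\alpha + ab\sin\alpha\cos\beta\right) \tag{5.2}$$
$$\phantom{\text{Area}} = \frac{1}{2}\,ac\sin\beta = \frac{1}{2}\left(ab\cos\alpha\sin\beta + a^2\sin\beta\cos\beta\right) \tag{5.3}$$
$$\phantom{\text{Area}} = \frac{1}{2}\,ab\sin(\pi - \alpha - \beta) = \frac{1}{2}\,ab\sin(\alpha + \beta) \tag{5.4}$$
If we write it as the sum of the areas of triangles $ACD$ and $BCD$, we have
$$\text{Area} = \frac{1}{2}\,b^2\sin\alpha\cos\alpha + \frac{1}{2}\,a^2\sin\beta\cos\beta \tag{5.5}$$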
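Since equations 5.2 to 5.5 all express the same area, the sum of equations 5.4 and 5.5 equals the sum of equations 5.2 and 5.3. Multiplying by 2,
$$ab\sin(\alpha + \beta) + b^2\sin\alpha\cos\alpha + a^2\sin\beta\cos\beta = b^2\sin\alpha\cos\alpha + ab\sin\alpha\cos\beta + ab\cos\alpha\sin\beta + a^2\sin\beta\cos\beta$$
Canceling the common terms and the common factor $ab$, we have the first of a group of
very important identities,
$$\sin(\alpha + \beta) = \sin\alpha\cos\beta + \cos\alpha\sin\beta \tag{5.6}$$
Replace $\beta$ by $-\beta$,
$$\sin(\alpha - \beta) = \sin\alpha\cos\beta - \cos\alpha\sin\beta \tag{5.7}$$
Replace $\alpha$ by $\pi/2 - \alpha$ in the above equation,
$$\cos(\alpha + \beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta \tag{5.8}$$
Divide equation 5.6 by the above,
$$\tan(\alpha + \beta) = \frac{\sin\alpha\cos\beta + \cos\alpha\sin\beta}{\cos\alpha\cos\beta - \sin\alpha\sin\beta} = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} \tag{5.9}$$
Setting $\beta = \alpha$ in equations 5.6, 5.8 and 5.9, we obtain the useful formulae expressed
in equations 5.10 to 5.14.
$$\sin 2\alpha = 2\sin\alpha\cos\alpha \tag{5.10}$$
$$\cos 2\alpha = \cos^2\alpha - \sin^2\alpha \tag{5.11}$$
$$\phantom{\cos 2\alpha} = 1 - 2\sin^2\alpha \tag{5.12}$$
$$\phantom{\cos 2\alpha} = 2\cos^2\alpha - 1 \tag{5.13}$$
$$\tan 2\alpha = \frac{2\tan\alpha}{1 - \tan^2\alpha} \tag{5.14}$$
Adding equation 5.6 and equation 5.7,
$$\sin(\alpha + \beta) + \sin(\alpha - \beta) = 2\sin\alpha\cos\beta \tag{5.15}$$
Substituting $A = \alpha + \beta$ and $B = \alpha - \beta$, we have
$$\sin A + \sin B = 2\sin\frac{A+B}{2}\cos\frac{A-B}{2} \tag{5.16}$$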
We summarize
$$\sin(A + B) = \sin A\cos B + \cos A\sin B \tag{5.17}$$
$$\sin(A - B) = \sin A\cos B - \cos A\sin B \tag{5.18}$$
$$\cos(A + B) = \cos A\cos B - \sin A\sin B \tag{5.19}$$
$$\cos(A - B) = \cos A\cos B + \sin A\sin B \tag{5.20}$$
$$\tan(A + B) = \frac{\tan A + \tan B}{1 - \tan A\tan B} \tag{5.21}$$
$$\tan(A - B) = \frac{\tan A - \tan B}{1 + \tan A\tan B} \tag{5.22}$$
$$\sin A + \sin B = 2\sin\frac{A+B}{2}\cos\frac{A-B}{2} \tag{5.23}$$
$$\sin A - \sin B = 2\cos\frac{A+B}{2}\sin\frac{A-B}{2} \tag{5.24}$$
$$\cos A + \cos B = 2\cos\frac{A+B}{2}\cos\frac{A-B}{2} \tag{5.25}$$
$$\cos A - \cos B = -2\sin\frac{A+B}{2}\sin\frac{A-B}{2} \tag{5.26}$$
$$\sin A\cos B = \frac{1}{2}\left[\sin(A + B) + \sin(A - B)\right] \tag{5.27}$$
$$\cos A\sin B = \frac{1}{2}\left[\sin(A + B) - \sin(A - B)\right] \tag{5.28}$$
$$\cos A\cos B = \frac{1}{2}\left[\cos(A + B) + \cos(A - B)\right] \tag{5.29}$$
$$\sin A\sin B = -\frac{1}{2}\left[\cos(A + B) - \cos(A - B)\right] \tag{5.30}$$
If we put $A = B$ in equation 5.30, we have
$$\sin^2 A = \frac{1}{2}\,(1 - \cos 2A) \tag{5.31}$$
Similarly, equation 5.29 gives
$$\cos^2 A = \frac{1}{2}\,(1 + \cos 2A) \tag{5.32}$$
Readers should be able to derive all those formulae that we have not proved
and memorize everything.
A final remark: Sometimes, it is useful to use the notation
$$\csc\theta \equiv \frac{1}{\sin\theta} \tag{5.33}$$
$$\sec\theta \equiv \frac{1}{\cos\theta} \tag{5.34}$$
$$\cot\theta \equiv \frac{1}{\tan\theta} \tag{5.35}$$
They are called cosecant, secant and cotangent respectively. Using the
identity $\sin^2\theta + \cos^2\theta \equiv 1$, we obtain
$$1 + \tan^2\theta \equiv \sec^2\theta \tag{5.36}$$
and a similar one
$$1 + \cot^2\theta \equiv \csc^2\theta \tag{5.37}$$
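Example 5.1. Evaluate $\sin 15^\circ$ without using a calculator.
Solution: We note that $\sin 15^\circ = \sin(45^\circ - 30^\circ) = \sin 45^\circ\cos 30^\circ - \cos 45^\circ\sin 30^\circ$.
Thus,
$$\sin 15^\circ = \frac{1}{\sqrt{2}}\left(\frac{\sqrt{3}}{2}\right) - \frac{1}{\sqrt{2}}\left(\frac{1}{2}\right) = \frac{1}{4}\left(\sqrt{6} - \sqrt{2}\right).$$
∎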
Example 5.2. Express $\sin 3x$ in terms of $\sin x$. Hence, evaluate $\cos 36^\circ$ without using a calculator.
Solution: Note that $\sin 3x = \sin(2x + x) = \sin 2x\cos x + \cos 2x\sin x$. Hence,
we can write
$$\begin{aligned} \sin 3x &= (2\sin x\cos x)\cos x + (1 - 2\sin^2 x)\sin x \\ &= 2\sin x\cos^2 x + \sin x - 2\sin^3 x \\ &= 2\sin x\,(1 - \sin^2 x) + \sin x - 2\sin^3 x \\ &= 3\sin x - 4\sin^3 x \end{aligned}$$
Let $x = 36^\circ$, we have $5x = 180^\circ$ or $3x = 180^\circ - 2x$. Therefore, $\sin 3x = \sin(180^\circ - 2x) = \sin 2x$. Since $\sin x \neq 0$, we can write
$$\begin{aligned} 3\sin x - 4\sin^3 x &= 2\sin x\cos x \\ 3 - 4\sin^2 x &= 2\cos x \\ 3 - 4\,(1 - \cos^2 x) &= 2\cos x \\ 4\cos^2 x - 2\cos x - 1 &= 0 \end{aligned}$$
Solving the quadratic equation and ignoring the negative root (because $\cos 36^\circ > 0$), we obtain
$$\cos 36^\circ = \frac{1 + \sqrt{5}}{4} .$$
∎
Example 5.3. Find $\int \sin^2 x\,dx$ and $\int \cos^2 x\,dx$.
Solution: Recalling the trigonometric identities stated in equations 5.12 and
5.13, $\cos 2x = 1 - 2\sin^2 x$ and $\cos 2x = 2\cos^2 x - 1$, we have
$$\int \sin^2 x\,dx = \frac{1}{2}\int (1 - \cos 2x)\,dx = \frac{1}{2}\,x - \frac{1}{4}\sin 2x + C$$
$$\int \cos^2 x\,dx = \frac{1}{2}\int (1 + \cos 2x)\,dx = \frac{1}{2}\,x + \frac{1}{4}\sin 2x + C'$$
∎
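Example 5.4. If $t = \tan\dfrac{\theta}{2}$, express $\sin\theta$, $\cos\theta$ and $\tan\theta$ in terms of $t$. These
are the well-known half-angle formulae. Use the first result to evaluate
$\int \csc\theta\,d\theta$.
Solution: Recalling equations 5.10, 5.11 and 5.14, we have
$$\sin\theta = 2\sin\frac{\theta}{2}\cos\frac{\theta}{2} = \frac{2\tan\frac{\theta}{2}}{1 + \tan^2\frac{\theta}{2}} = \frac{2t}{1 + t^2} ,$$
$$\cos\theta = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} = \frac{1 - \tan^2\frac{\theta}{2}}{1 + \tan^2\frac{\theta}{2}} = \frac{1 - t^2}{1 + t^2} , \text{ and}$$
$$\tan\theta = \frac{2\tan\frac{\theta}{2}}{1 - \tan^2\frac{\theta}{2}} = \frac{2t}{1 - t^2} .$$
The substitution $t = \tan\dfrac{\theta}{2}$ implies $dt = \dfrac{1}{2}\sec^2\dfrac{\theta}{2}\,d\theta$, which gives $d\theta = \dfrac{2\,dt}{1 + t^2}$.
Therefore,
$$\int \csc\theta\,d\theta = \int \frac{d\theta}{\sin\theta} = \int \frac{1 + t^2}{2t}\cdot\frac{2}{1 + t^2}\,dt = \int \frac{dt}{t} = \ln|t| + C = \ln\left|\tan\frac{\theta}{2}\right| + C$$
∎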
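The antiderivative found above can be checked numerically. The Python sketch below compares a composite Simpson's rule estimate of $\int_a^b \csc\theta\,d\theta$ with the difference of $\ln\tan(\theta/2)$ at the end points; the interval $[0.5, 2.5]$, chosen inside $(0, \pi)$ where $\csc\theta$ is finite and $\tan(\theta/2) > 0$, and the helper names are assumptions for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def antiderivative(theta):
    """ln tan(theta/2), valid for 0 < theta < pi."""
    return math.log(math.tan(theta / 2))

a, b = 0.5, 2.5
numeric = simpson(lambda th: 1 / math.sin(th), a, b)
exact = antiderivative(b) - antiderivative(a)
print(numeric, exact)
```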
Example 5.5. An object of mass $m$ is at rest on a rough table which has
coefficient of static friction $\mu$. Find the minimum force to move the object.
Solution:
Figure 5.2: An applied force on a block
Let $F$ be the applied force, which makes an angle of elevation $\theta$ with the horizontal. The normal reaction
on the mass is $N$. When the mass is just about to move, the equations of motion
along and normal to the horizontal are
$$\begin{cases} F\cos\theta = \mu N \\ F\sin\theta + N = mg \end{cases}$$
Eliminating $N$ in the above equations, we have $F\sin\theta + \dfrac{F\cos\theta}{\mu} = mg$. Hence,
we can write
$$F = \frac{mg\mu}{\mu\sin\theta + \cos\theta} \tag{5.38}$$
Figure 5.3: The sides of the right-angled triangle
Now, we construct a right-angled triangle to proceed with the calculation. Figure
5.3 shows the triangle, which has an acute angle $\alpha$. Its opposite side and
adjacent side are defined by the coefficients of $\cos\theta$ and $\sin\theta$ in equation
5.38. Then, we express the length of the opposite side as $1 = \sqrt{\mu^2 + 1}\,\sin\alpha$ and
that of the adjacent side as $\mu = \sqrt{\mu^2 + 1}\,\cos\alpha$. Hence,
$$F = \frac{mg\mu}{\sqrt{\mu^2 + 1}\,\left[\sin\theta\cos\alpha + \cos\theta\sin\alpha\right]} = \frac{mg\mu}{\sqrt{\mu^2 + 1}\,\sin(\theta + \alpha)}$$
The least applied force $F_{\min}$ is obtained if we put $\sin(\theta + \alpha) = 1$, i.e. $\theta = \pi/2 - \alpha$, which corresponds to $\tan\theta = \mu$. Therefore,
$$F_{\min} = \frac{mg\mu}{\sqrt{\mu^2 + 1}} .$$
∎
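The minimum can also be located by brute force. The Python sketch below scans the angle $\theta$ over $[0, \pi/2]$ using equation 5.38 and compares the smallest force found with $mg\mu/\sqrt{\mu^2+1}$. The values $\mu = 0.5$, $m = 1$ kg and $g = 9.8$ m/s² are illustrative assumptions, as are the function and variable names.

```python
import math

def force(theta, mu, m=1.0, g=9.8):
    """Applied force needed at elevation angle theta (equation 5.38)."""
    return m * g * mu / (mu * math.sin(theta) + math.cos(theta))

mu = 0.5   # illustrative coefficient of static friction
thetas = [i * (math.pi / 2) / 100000 for i in range(100001)]
theta_best = min(thetas, key=lambda th: force(th, mu))

f_min_scan = force(theta_best, mu)
f_min_formula = 1.0 * 9.8 * mu / math.sqrt(mu**2 + 1)
print(theta_best, math.atan(mu), f_min_scan, f_min_formula)
```

The scanned angle should agree with $\tan^{-1}\mu$ to within the grid spacing, since $\sin(\theta + \alpha) = 1$ with $\tan\alpha = 1/\mu$ gives $\tan\theta = \mu$.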
Example 5.6. A particle is thrown uphill with speed $v_0$ on an inclined plane.
The angle of projection measured from the inclined plane is $\alpha$. The elevation
angle of the inclined plane measured from the horizontal is $\beta$. Find the range
of the particle on the inclined plane. Show further that if $\alpha_1$ and $\alpha_2$ are the
possible angles for the same range, then $\alpha_1 + \alpha_2 + \beta = \dfrac{\pi}{2}$.
Solution: Construct the coordinate system where the origin is located at
the point of projection and the $x$-axis and $y$-axis are along and normal to
the inclined plane respectively. The positive directions of $x$ and $y$ are uphill
and above the inclined plane respectively. At time $t$, the displacements of
the particle along the $x$ and $y$ axes are
$$\begin{cases} x = (v_0\cos\alpha)\,t - \dfrac{1}{2}\,(g\sin\beta)\,t^2 \\[2mm] y = (v_0\sin\alpha)\,t - \dfrac{1}{2}\,(g\cos\beta)\,t^2 \end{cases}$$
Figure 5.4: A projectile on the hill
When the particle hits the plane, the particle has coordinates $(R, 0)$. The
second equation gives $v_0\sin\alpha = \dfrac{1}{2}\,gt\cos\beta$ and thus the required time is $t = \dfrac{2v_0\sin\alpha}{g\cos\beta}$. Substituting this time into the first equation, we have
the range
$$\begin{aligned} R &= (v_0\cos\alpha)\left(\frac{2v_0\sin\alpha}{g\cos\beta}\right) - \frac{1}{2}\,(g\sin\beta)\left(\frac{4v_0^2\sin^2\alpha}{g^2\cos^2\beta}\right) \\ &= \frac{2v_0^2\sin\alpha}{g\cos^2\beta}\left[\cos\alpha\cos\beta - \sin\alpha\sin\beta\right] \\ &= \frac{2v_0^2\sin\alpha}{g\cos^2\beta}\left[\cos(\alpha + \beta)\right] \\ &= \frac{v_0^2}{g\cos^2\beta}\left[\sin(2\alpha + \beta) - \sin\beta\right] \end{aligned}$$
Let $\alpha_1$ and $\alpha_2$ be the possible angles which achieve the same uphill range, so that $\sin(2\alpha_1 + \beta) = \sin(2\alpha_2 + \beta)$.
We have $2\alpha_2 + \beta = \pi - (2\alpha_1 + \beta)$, which gives $\alpha_1 + \alpha_2 + \beta = \pi/2$.
Remark: The range $R$ on the inclined plane is a maximum when $2\alpha + \beta = \pi/2$,
which implies $\alpha = \dfrac{\pi}{4} - \dfrac{\beta}{2}$. Hence, the maximum range is
$$\frac{v_0^2}{g\,(1 + \sin\beta)}$$
∎
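The two results of this example can be spot-checked numerically. In the Python sketch below, `incline_range` evaluates the final expression for $R$; the launch speed of 10 m/s, $g = 9.8$ m/s², the incline angle of 20° and the first launch angle of 25° are all illustrative assumptions.

```python
import math

def incline_range(alpha, beta, v0=10.0, g=9.8):
    """Range along the incline for launch angle alpha above the plane."""
    return v0**2 * (math.sin(2 * alpha + beta) - math.sin(beta)) / (g * math.cos(beta) ** 2)

beta = math.radians(20)
alpha1 = math.radians(25)
alpha2 = math.pi / 2 - beta - alpha1      # partner angle: alpha1 + alpha2 + beta = pi/2

r1 = incline_range(alpha1, beta)
r2 = incline_range(alpha2, beta)

alpha_best = math.pi / 4 - beta / 2       # angle of maximum range
r_max = incline_range(alpha_best, beta)
r_max_formula = 10.0**2 / (9.8 * (1 + math.sin(beta)))
print(r1, r2, r_max, r_max_formula)
```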
5.2 Complex Numbers
Not all polynomial equations have real solutions; for example, no real number
$x$ satisfies $x^2 + 1 = 0$. We need complex numbers.
We denote the square root of $-1$ as $i \equiv \sqrt{-1}$. Obviously, $i^2 = -1$, $i^3 = -i$,
and $i^4 = 1$. A complex number $z$ is defined to be the sum $a + b\,i$ where the
real part of $z$, $\mathrm{Re}(z) = a$, and the imaginary part of $z$, $\mathrm{Im}(z) = b$, are real
numbers. The addition (or subtraction) of two complex numbers is
$$(a + b\,i) \pm (c + d\,i) = (a \pm c) + (b \pm d)\,i \tag{5.39}$$
Using the distributive law, multiplication is
$$(a + b\,i) \times (c + d\,i) = ac + ad\,i + bc\,i + bd\,i^2 = (ac - bd) + (ad + bc)\,i \tag{5.40}$$
Note that $(a + b\,i)(c + d\,i) = (c + d\,i)(a + b\,i)$.
Division is a bit tricky.
$$\frac{1}{a + b\,i} = \frac{1}{a + b\,i}\cdot\frac{a - b\,i}{a - b\,i} = \frac{a - b\,i}{a^2 + b^2} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\,i , \tag{5.41}$$
and we have
$$\frac{c + d\,i}{a + b\,i} = (c + d\,i)\left(\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\,i\right) = \frac{ac + bd}{a^2 + b^2} + \frac{ad - bc}{a^2 + b^2}\,i \tag{5.42}$$
Example 5.7. Compute the following expressions.
Solution:
$$(2 + 4\,i) + (3 - 6\,i) = 5 - 2\,i$$
$$(1 - 3\,i)\left(\frac{1}{2} + 5\,i\right) = \frac{31}{2} + \frac{7}{2}\,i$$
$$\frac{2 + 4\,i}{3 - 6\,i} = \frac{(2 + 4\,i)(3 + 6\,i)}{9 + 36} = \frac{-6 + 8\,i}{15}$$
∎
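These three results can be reproduced with Python's built-in complex type, where `j` plays the role of $i$. The variable names are illustrative.

```python
# Python writes the imaginary unit as j, so 2 + 4i becomes 2 + 4j.
z_sum = (2 + 4j) + (3 - 6j)
z_prod = (1 - 3j) * (0.5 + 5j)
z_quot = (2 + 4j) / (3 - 6j)

print(z_sum)    # sum of the two numbers
print(z_prod)   # product, expected 31/2 + (7/2) i
print(z_quot)   # quotient, expected (-6 + 8i)/15
```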
5.3 Complex Plane
We usually represent a complex number $z = a + b\,i$ as a point on the complex
plane. Figure 5.5 shows the Argand diagram of a complex number. The
horizontal axis is for the real part of the number and the vertical axis is
for the imaginary part. The addition of two complex numbers corresponds
to forming a parallelogram, as in figure 5.6.
Figure 5.5: Complex plane (the point $a + b\,i$ at distance $r$ from the origin and angle $\theta$ from the real axis)
Figure 5.6: Addition of two complex numbers ($a + b\,i$, $c + d\,i$ and their sum $(a+c) + (b+d)\,i$)
We define the absolute value or
modulus $r = |z| = |a + b\,i|$ of a complex number as
$$|z| \equiv \sqrt{a^2 + b^2} , \tag{5.43}$$
which is just the “length” of the arrow of the number in the complex plane.
If $z$ is, in fact, a real number ($b = 0$), we get back the absolute value of a real
number.
The argument is the angle $\arg z \equiv \theta$ in figure 5.5, with $\tan\theta = b/a$, so that $\theta = \tan^{-1}(b/a)$ when $z$ lies in the first or fourth quadrant. The
conjugate of $z$ is $\bar{z} = a - b\,i$. We have the following simple identities. The
first one is the polar form of a complex number.
$$z = r\,(\cos\theta + i\sin\theta) \tag{5.44}$$
$$\bar{z} = r\,(\cos\theta - i\sin\theta) \tag{5.45}$$
$$z + \bar{z} = 2a = 2\,\mathrm{Re}(z) \tag{5.46}$$
$$z - \bar{z} = 2b\,i = 2\,i\,\mathrm{Im}(z) \tag{5.47}$$
$$\overline{(\bar{z})} = z \tag{5.48}$$
$$z\,\bar{z} = |z|^2 = r^2 \tag{5.49}$$
$$|z| = |\bar{z}| \tag{5.50}$$
$$\arg\bar{z} = -\arg z \tag{5.51}$$
Equation 5.49 implies $a^2 + b^2 = (a + b\,i)(a - b\,i)$. For two complex numbers
$z_1 = a_1 + b_1\,i = r_1(\cos\theta_1 + i\sin\theta_1)$ and $z_2 = a_2 + b_2\,i = r_2(\cos\theta_2 + i\sin\theta_2)$, their
product is
$$\begin{aligned} z_1 z_2 &= r_1 r_2\,(\cos\theta_1 + i\sin\theta_1)(\cos\theta_2 + i\sin\theta_2) \\ &= r_1 r_2\left[(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i\,(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\right] \\ &= r_1 r_2\left[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\right] \end{aligned} \tag{5.52}$$
by equations 5.17 and 5.19. We see that the absolute value of a product is
the product of the absolute values and the argument of a product is the sum
of the arguments, as illustrated in figure 5.7.
$$|z_1 z_2| = |z_1|\,|z_2| \tag{5.53}$$
$$\arg(z_1 z_2) = \arg z_1 + \arg z_2 \tag{5.54}$$
Figure 5.7: Product of two complex numbers ($z_1$ at angle $\theta_1$, $z_2$ at angle $\theta_2$, and $z_1 z_2$ at angle $\theta_1 + \theta_2$)
Moreover,
$$|z_1/z_2| = |z_1|\,/\,|z_2| \tag{5.55}$$
$$\arg(z_1/z_2) = \arg z_1 - \arg z_2 \tag{5.56}$$
$$\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2 \tag{5.57}$$
$$\overline{z_1 - z_2} = \bar{z}_1 - \bar{z}_2 \tag{5.58}$$
$$\overline{z_1 z_2} = \bar{z}_1\,\bar{z}_2 \tag{5.59}$$
$$\overline{(z_1/z_2)} = \bar{z}_1/\bar{z}_2 \tag{5.60}$$
A complex number of unit modulus can be written as $\cos\theta + i\sin\theta$ (or $\cos\theta - i\sin\theta$),
because its magnitude is 1. If $|z| = 1$, we have $|z|^2 = z\,\bar{z} = 1$, then $\bar{z} = 1/z$.
Furthermore,
$$\cos\theta = \frac{1}{2}\,(z + \bar{z}) = \frac{1}{2}\left(z + \frac{1}{z}\right) \tag{5.61}$$
$$\sin\theta = \frac{1}{2i}\,(z - \bar{z}) = \frac{1}{2i}\left(z - \frac{1}{z}\right) \tag{5.62}$$
Example 5.8. Let $z_1 = 1 + i$ and $z_2 = \sqrt{3} + i$, find $\arg(z_1 z_2)$.
Solution: Note that $r_1 = \sqrt{2}$ and $r_2 = 2$. The principal arguments of $z_1$ and
$z_2$ are $\arg z_1 = \pi/4$ and $\arg z_2 = \pi/6$ respectively. We have
$$z_1 z_2 = (1 + i)(\sqrt{3} + i) = (\sqrt{3} - 1) + (\sqrt{3} + 1)\,i$$
The absolute value of $z_1 z_2$ is
$$\sqrt{(\sqrt{3} - 1)^2 + (\sqrt{3} + 1)^2} = \sqrt{3 - 2\sqrt{3} + 1 + 3 + 2\sqrt{3} + 1} = \sqrt{8}$$
which equals $r_1 r_2 = 2\sqrt{2}$, and the argument is
$$\arg(z_1 z_2) = \tan^{-1}\left(\frac{\sqrt{3} + 1}{\sqrt{3} - 1}\right)$$
which can be verified to be $5\pi/12$ (i.e. $\pi/4 + \pi/6$).
∎
5.4 De Moivre’s Theorem
For any positive integer $n$, we have
$$(\cos x + i\sin x)^n = \cos nx + i\sin nx \tag{5.63}$$
The proof can be completed by mathematical induction. This theorem can
be generalized to include negative powers on the LHS of equation 5.63. Let
$m = -n$, we can write
$$\begin{aligned} (\cos x + i\sin x)^m &= (\cos x + i\sin x)^{-n} \\ &= \frac{1}{(\cos x + i\sin x)^n} \\ &= \frac{1}{\cos nx + i\sin nx} \\ &= \left(\frac{1}{\cos nx + i\sin nx}\right)\left(\frac{\cos nx - i\sin nx}{\cos nx - i\sin nx}\right) \\ &= \cos nx - i\sin nx \\ &= \cos(-nx) + i\sin(-nx) \\ &= \cos mx + i\sin mx \end{aligned}$$
Furthermore, the theorem is also true if $n$ is replaced by a rational number,
e.g. $p/q$, where $p$ and $q$ are integers. The proof is straightforward.
$$\left(\cos\left(\frac{p}{q}\,x\right) + i\sin\left(\frac{p}{q}\,x\right)\right)^q = \cos px + i\sin px = (\cos x + i\sin x)^p$$
Hence, $\cos\left(\dfrac{p}{q}\,x\right) + i\sin\left(\dfrac{p}{q}\,x\right)$ is a value of the fractional power,
$$\cos\left(\frac{p}{q}\right) x + i\sin\left(\frac{p}{q}\right) x = (\cos x + i\sin x)^{p/q}$$
Example 5.9. Show that $(\cos x - i\sin x)^n = \cos nx - i\sin nx$, where $n$ is an
integer.
Solution: We can write
$$\begin{aligned} (\cos x - i\sin x)^n &= (\cos(-x) + i\sin(-x))^n \\ &= \cos(-nx) + i\sin(-nx) \\ &= \cos nx - i\sin nx \end{aligned}$$
This is an alternative form of De Moivre’s theorem.
∎
Example 5.10. Find expressions for $\cos 3x$ and $\sin 3x$.
Solution: By using De Moivre’s theorem, we have
$$\begin{aligned} \cos 3x + i\sin 3x &= (\cos x + i\sin x)^3 \\ &= \cos^3 x + 3i\cos^2 x\sin x - 3\cos x\sin^2 x - i\sin^3 x \\ &= (\cos^3 x - 3\cos x\sin^2 x) + i\,(3\cos^2 x\sin x - \sin^3 x) \\ &= (4\cos^3 x - 3\cos x) + i\,(3\sin x - 4\sin^3 x) \end{aligned}$$
Therefore, $\cos 3x = 4\cos^3 x - 3\cos x$ and $\sin 3x = 3\sin x - 4\sin^3 x$.
∎
Example 5.11. Solve $z^3 + 1 = 0$.
Solution: Rearrange the equation and rewrite $-1$ in polar form.
$$z^3 = -1 = \cos(2k\pi + \pi) + i\sin(2k\pi + \pi) , \quad \text{where } k = 0, \pm 1, \pm 2, \cdots .$$
A cubic equation has three roots, so
$$\begin{aligned} z &= \cos\left(\frac{(2k+1)\,\pi}{3}\right) + i\sin\left(\frac{(2k+1)\,\pi}{3}\right) , \quad \text{where } k = 0, 1, 2 \\ &= \cos\left(\frac{\pi}{3}\right) + i\sin\left(\frac{\pi}{3}\right) ,\ -1 ,\ \text{and } \cos\left(\frac{5\pi}{3}\right) + i\sin\left(\frac{5\pi}{3}\right) \\ &= \frac{1}{2} + i\,\frac{\sqrt{3}}{2} ,\ -1 ,\ \text{and } \frac{1}{2} - i\,\frac{\sqrt{3}}{2} \end{aligned}$$
Figure 5.8 shows the three roots of the equation, where $z_0$, $z_1$, and $z_2$
correspond to $k = 0$, 1, and 2 respectively. $z_1$ is real, while $z_0$ and $z_2$ are conjugates
of each other. In fact, the non-real roots of a polynomial equation with real coefficients always come in conjugate pairs.
∎
Figure 5.8: The Argand diagram of the roots of $z^3 + 1 = 0$
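The three roots can be generated and checked with a few lines of Python. The function name `cube_roots_of_minus_one` is an illustrative choice; each root is built from the polar form with argument $(2k+1)\pi/3$.

```python
import math

def cube_roots_of_minus_one():
    """Roots of z**3 = -1 from the polar form, for k = 0, 1, 2."""
    roots = []
    for k in range(3):
        angle = (2 * k + 1) * math.pi / 3
        roots.append(complex(math.cos(angle), math.sin(angle)))
    return roots

roots = cube_roots_of_minus_one()
for z in roots:
    # Each residual |z^3 + 1| should vanish up to rounding error.
    print(z, abs(z**3 + 1))
```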
5.5 Euler’s Formula
If $|z| = 1$, we write $z = \cos x + i\sin x$. Taylor’s series was discussed in
section 2.9; let’s recall the series of the following functions here.
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
Euler’s formula states that $z$ can be represented by an exponential function.
$$e^{ix} = \cos x + i\sin x \tag{5.64}$$
Starting from the formula, we get
$$\cos x = \frac{1}{2}\left(e^{ix} + e^{-ix}\right) \tag{5.65}$$
$$\sin x = \frac{1}{2i}\left(e^{ix} - e^{-ix}\right) \tag{5.66}$$
Remark: If $z = r\,(\cos x + i\sin x)$, then
$$\ln z = \ln r + \ln(\cos x + i\sin x) = \ln|z| + \ln e^{ix} = \ln|z| + i\,x$$
Therefore
$$\ln z = \ln|z| + i\arg z \tag{5.67}$$
Example 5.12. Prove that
$$\frac{1 + \cos 6x + i\sin 6x}{1 + \cos 6x - i\sin 6x} = \cos 6x + i\sin 6x .$$
Solution:
$$\begin{aligned} \frac{1 + \cos 6x + i\sin 6x}{1 + \cos 6x - i\sin 6x} &= \frac{1 + e^{6ix}}{1 + e^{-6ix}} \\ &= \frac{e^{3ix}}{e^{-3ix}}\left(\frac{e^{-3ix} + e^{3ix}}{e^{3ix} + e^{-3ix}}\right) \\ &= e^{6ix} \\ &= \cos 6x + i\sin 6x \end{aligned}$$
∎
Example 5.13. Evaluate (a) $\ln(-3)$ and (b) $\ln 3i$.
Solution:
(a) Using equation 5.67, we have
$$\ln(-3) = \ln|-3| + i\arg(-3) = \ln 3 + i\,(2k\pi + \pi) = \ln 3 + i\,(2k + 1)\,\pi ,$$
where $k = 0, \pm 1, \pm 2, \cdots$.
(b) Using equation 5.67 again, we have
$$\ln 3i = \ln|3i| + i\arg(3i) = \ln 3 + i\left(2k\pi + \frac{\pi}{2}\right) ,$$
where $k = 0, \pm 1, \pm 2, \cdots$.
∎
Example 5.14. Find the value of $i^i$. Is it a real number or an imaginary
number?
Solution: Rewriting $i^i$ as a power of the exponential constant and using
equation 5.67, we obtain
$$i^i = \left(e^{\ln i}\right)^i = e^{i\ln i} = e^{i\,[\ln|i| + i\arg(i)]} = e^{i\,[\ln 1 + i\,(2k\pi + \frac{\pi}{2})]} = e^{-(2k\pi + \frac{\pi}{2})} ,$$
where $k = 0, \pm 1, \pm 2, \cdots$. Equivalently, we have
$$i^i = e^{(2n\pi - \frac{\pi}{2})} ,$$
where $n = 0, \pm 1, \pm 2, \cdots$. So, we conclude that $i^i$ is a real number.
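Python's complex arithmetic gives a quick way to see this. In the sketch below, `1j ** 1j` evaluates the principal branch, and the hypothetical helper `i_to_the_i(k)` evaluates the branch labelled by $k$ using $\ln i = i\,(2k\pi + \pi/2)$ from equation 5.67.

```python
import cmath
import math

principal = 1j ** 1j   # principal value of i**i as computed by Python

def i_to_the_i(k):
    """Branch k of i**i = exp(i * ln i), with ln i = i*(2*k*pi + pi/2)."""
    log_i = 1j * (2 * k * math.pi + math.pi / 2)
    return cmath.exp(1j * log_i)

print(principal, math.exp(-math.pi / 2))
print([i_to_the_i(k) for k in (-1, 0, 1)])
```

Every branch has zero imaginary part, and the principal value is $e^{-\pi/2}$, roughly 0.208.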
∎
Example 5.15. Find the sum of the infinite series
$$\frac{\sin x}{e} + \frac{\sin 2x}{e^2} + \frac{\sin 3x}{e^3} + \cdots$$
[Hint: Construct a similar series with cosine functions.]
Solution: Define the following infinite series
$$C = 1 + \frac{\cos x}{e} + \frac{\cos 2x}{e^2} + \frac{\cos 3x}{e^3} + \cdots$$
$$S = \frac{\sin x}{e} + \frac{\sin 2x}{e^2} + \frac{\sin 3x}{e^3} + \cdots$$
So we have
$$\begin{aligned} C + iS &= 1 + \frac{\cos x + i\sin x}{e} + \frac{\cos 2x + i\sin 2x}{e^2} + \frac{\cos 3x + i\sin 3x}{e^3} + \cdots \\ &= 1 + \frac{\cos x + i\sin x}{e} + \frac{(\cos x + i\sin x)^2}{e^2} + \frac{(\cos x + i\sin x)^3}{e^3} + \cdots \\ &= 1 + \frac{e^{ix}}{e} + \frac{e^{i2x}}{e^2} + \frac{e^{i3x}}{e^3} + \cdots \end{aligned}$$
The RHS is a geometric series with common ratio $\dfrac{e^{ix}}{e}$, where $\left|\dfrac{e^{ix}}{e}\right| = \dfrac{1}{e} < 1$.
Thus,
$$\begin{aligned} C + iS &= \frac{1}{1 - \frac{e^{ix}}{e}} \\ &= \frac{e}{e - e^{ix}} \\ &= \frac{e}{e - \cos x - i\sin x} \\ &= \frac{e}{e - \cos x - i\sin x}\cdot\frac{e - \cos x + i\sin x}{e - \cos x + i\sin x} \\ &= \frac{e\,(e - \cos x + i\sin x)}{(e - \cos x)^2 + \sin^2 x} \\ &= \frac{e\,(e - \cos x)}{e^2 - 2e\cos x + 1} + \frac{i\,e\sin x}{e^2 - 2e\cos x + 1} \end{aligned}$$
Extracting the imaginary part, we have
$$\frac{\sin x}{e} + \frac{\sin 2x}{e^2} + \frac{\sin 3x}{e^3} + \cdots = \frac{e\sin x}{e^2 - 2e\cos x + 1}$$
∎
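The closed form can be compared with a truncated version of the series. The Python sketch below sums the first 60 terms, an arbitrary cut-off chosen because the terms decay like $e^{-n}$; the function names and the sample values of $x$ are illustrative.

```python
import math

def partial_sum(x, terms=60):
    """Sum of sin(n x)/e**n for n = 1 .. terms."""
    return sum(math.sin(n * x) / math.e**n for n in range(1, terms + 1))

def closed_form(x):
    """e sin x / (e^2 - 2 e cos x + 1), the sum found in Example 5.15."""
    return math.e * math.sin(x) / (math.e**2 - 2 * math.e * math.cos(x) + 1)

for x in (0.3, 1.0, 2.5):
    print(x, partial_sum(x), closed_form(x))
```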
5.6 A Revisit to Simple Harmonic Motion
In section 4.2 we described the motion of an object which performs simple
harmonic motion. The oscillation obeys Hooke’s law such that the restoring
force is always linearly proportional to the displacement of the object from
its equilibrium position.
Figure 5.9: The spring-mass system
Recall that the equation of motion of the object is a second order differential equation and is given by
$$\ddot{x} + \omega^2 x = 0 \tag{5.68}$$
where $\omega$ is the angular frequency of the oscillation. The method of solution
in section 4.2 was a bit clumsy because we had to
integrate twice in order to obtain $x$ in terms of $t$. Now,
we solve the equation again through the analysis of the complex roots of the
characteristic equation. Note that the characteristic equation (auxiliary
equation) of the differential equation is given by $\lambda^2 + \omega^2 = 0$. Its roots are $\pm i\omega$.
One can verify that $e^{i\omega t}$ and $e^{-i\omega t}$ satisfy the differential equation. Thus, the
general solution of equation 5.68 is
$$x = c_1 e^{i\omega t} + c_2 e^{-i\omega t} , \tag{5.69}$$
where $c_1$ and $c_2$ are arbitrary constants and they could be complex. Using
Euler’s formula, we have
$$x = (c_1 + c_2)\cos\omega t + i\,(c_1 - c_2)\sin\omega t \tag{5.70}$$
We know that the general solution of a second order differential equation contains two arbitrary constants, whereas two complex numbers $c_1$ and
$c_2$ carry four real constants. Because the displacement $x$ is real, $c_1$
and $c_2$ must be conjugates of each other, so that they produce two real constants,
$A$ and $B$, by
$$\begin{cases} c_1 + c_2 = A \\ i\,(c_1 - c_2) = B \end{cases}$$
Then
$$x = A\cos\omega t + B\sin\omega t \tag{5.71}$$
The solution $x$ is real and it measures the displacement of the object from
its equilibrium position. Finally, we use the technique of example 5.5 to rewrite
the solution as
$$x = \mu\sin(\omega t + \delta) , \tag{5.72}$$
where $\mu = \sqrt{A^2 + B^2}$ and $\delta = \tan^{-1}(A/B)$. If $x = D$ (the amplitude of oscillation) and $\dot{x} = 0$ when $t = 0$, the values of $\mu$ and $\delta$ are determined ($\mu = D$ and
$\delta = \pi/2$). Then, we have $x = D\sin(\omega t + \pi/2) = D\cos\omega t$.
Alternatively, we can rewrite equation 5.71 as
$$x = \mu\cos(\omega t - \delta) , \tag{5.73}$$
if we set $\mu = \sqrt{A^2 + B^2}$ and $\delta = \tan^{-1}(B/A)$. This is an alternative form
to equation 5.72. For the case where $x = D$ and $\dot{x} = 0$ when $t = 0$,
the values of $\mu$ and $\delta$ are determined again ($\mu = D$ and $\delta = 0$). Then, we
have $x = D\cos\omega t$. To conclude, both expressions in equations 5.72 and 5.73
give the same $x$ for the given initial (boundary) conditions of the
system.
5.7 Particle in a Box
In the simplest quantum mechanical system, a particle is trapped in a 1-D
box with infinitely hard walls as shown in figure 5.10. The potential along $x$
is given by
$$V(x) = \begin{cases} \infty & x < 0 \\ 0 & 0 \le x \le L \\ \infty & x > L \end{cases}$$
Figure 5.10: The infinite potential well
The wavefunction of the particle $\psi$ is given by Schrödinger’s equation,
$$-\frac{\hbar^2}{2m}\,\frac{d^2\psi}{dx^2} + V(x)\,\psi = E\,\psi , \tag{5.74}$$
where $m$ and $E$ are the mass and the energy of the particle respectively. The
reduced Planck’s constant $\hbar$ is related to Planck’s constant $h$ by $\hbar = h/(2\pi)$,
and $E > 0$. Inside the box the potential $V$ is zero. The equation becomes
$$-\frac{\hbar^2}{2m}\,\frac{d^2\psi}{dx^2} = E\,\psi , \tag{5.75}$$
Setting $k^2 = \dfrac{2mE}{\hbar^2}$ and rearranging the equation, we obtain
$$\frac{d^2\psi}{dx^2} + k^2\psi = 0 \tag{5.76}$$
The characteristic equation of the above equation is $\lambda^2 + k^2 = 0$. Its roots
are $ik$ and $-ik$. Using the result in equation 5.71, the general solution of
equation 5.76 is simply
$$\psi = A\cos kx + B\sin kx \tag{5.77}$$
The boundary condition of $\psi$ tells us that $\psi = 0$ at $x = 0$, so $A = 0$. Similarly,
the fact that $\psi = 0$ at $x = L$ gives $B\sin kL = 0$. For a non-trivial solution ($B \neq 0$), $kL = n\pi$, where
$n = 1, 2, 3, \ldots$. The wave number $k$ is now quantized,
$$k = \frac{n\pi}{L} \qquad n = 1, 2, 3, \ldots \tag{5.78}$$
and the energy is quantized (discrete energy levels $E_n$) as shown in figure 5.11.
Figure 5.11: The energy levels and wavefunctions of the particle
$$E_n = \frac{n^2\pi^2\hbar^2}{2mL^2} \qquad n = 1, 2, 3, \ldots \tag{5.79}$$
The wavefunction of the particle is given by
$$\psi_n = B\sin\frac{n\pi x}{L} \qquad n = 1, 2, 3, \ldots \tag{5.80}$$
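To get a feel for the size of these energy levels, the Python sketch below evaluates $E_n = n^2\pi^2\hbar^2/(2mL^2)$ for an electron in a box of width 1 nm. The choice of particle and box width is an illustrative assumption, not taken from the notes; the constants are the standard SI values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg (illustrative choice of particle)
EV = 1.602176634e-19     # joules per electron-volt

def energy_level(n, L, m=M_E):
    """E_n = n^2 pi^2 hbar^2 / (2 m L^2), in joules."""
    return (n * math.pi * HBAR) ** 2 / (2 * m * L**2)

L = 1e-9   # illustrative box width: 1 nm
levels_eV = [energy_level(n, L) / EV for n in (1, 2, 3)]
print(levels_eV)
```

The ground state comes out at roughly 0.38 eV, and the higher levels scale as $n^2$, so the spacing between adjacent levels grows with $n$.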
Chapter 6
Partial Differentiation
6.1 Partial Derivative
Consider a function $f$ of three variables $x$, $y$ and $z$,
$$f(x, y, z)$$
If $y$ and $z$ are held constant and only $x$ is allowed to vary, the partial derivative
with respect to $x$ is denoted by $\dfrac{\partial f}{\partial x}$ or $f_x$ and is defined as the limit
$$\frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y, z) - f(x, y, z)}{\Delta x}$$
The total differential of $f$ is given by
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz , \tag{6.1}$$
where $df$ represents the change in $f$ due to the infinitesimal changes in $x$, $y$
and $z$. The proof of equation 6.1 is shown below.
Consider the difference of the functional values at two adjacent points
$P(x, y, z)$ and $Q(x + \Delta x, y + \Delta y, z + \Delta z)$,
$$\begin{aligned} \Delta f &= f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x, y, z) \\ &= \left[f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x, y + \Delta y, z + \Delta z)\right] \\ &\quad + \left[f(x, y + \Delta y, z + \Delta z) - f(x, y, z + \Delta z)\right] \\ &\quad + \left[f(x, y, z + \Delta z) - f(x, y, z)\right] \end{aligned}$$
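Then, we can write
$$\begin{aligned} \Delta f &= \left[\left(\frac{f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x, y + \Delta y, z + \Delta z)}{\Delta x}\right)\Delta x\right] \\ &\quad + \left[\left(\frac{f(x, y + \Delta y, z + \Delta z) - f(x, y, z + \Delta z)}{\Delta y}\right)\Delta y\right] \\ &\quad + \left[\left(\frac{f(x, y, z + \Delta z) - f(x, y, z)}{\Delta z}\right)\Delta z\right] \end{aligned}$$
In the limiting case as $\Delta x \to 0$, $\Delta y \to 0$, and $\Delta z \to 0$, we have $\Delta x \cong dx$,
$\Delta y \cong dy$, $\Delta z \cong dz$, and $\Delta f \cong df$. The above equation reduces
to
$$\begin{aligned} df &= \left[\lim_{\Delta x \to 0}\left(\frac{f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x, y + \Delta y, z + \Delta z)}{\Delta x}\right)\Delta x\right] \\ &\quad + \left[\lim_{\Delta y \to 0}\left(\frac{f(x, y + \Delta y, z + \Delta z) - f(x, y, z + \Delta z)}{\Delta y}\right)\Delta y\right] \\ &\quad + \left[\lim_{\Delta z \to 0}\left(\frac{f(x, y, z + \Delta z) - f(x, y, z)}{\Delta z}\right)\Delta z\right] \end{aligned}$$
Therefore, we have $df = \dfrac{\partial f}{\partial x}\,dx + \dfrac{\partial f}{\partial y}\,dy + \dfrac{\partial f}{\partial z}\,dz$.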
Remarks on chain rules:
1. If $f$ is a three-variable function $f(x, y, z)$, where $x$, $y$ and $z$ are functions
of $t$, i.e. $x(t)$, $y(t)$ and $z(t)$, then the derivative of $f$ with respect to $t$ is
the total derivative of $f$ and is given by the chain rule
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}$$
2. Suppose that $y$ and $z$ are functions of $x$, then $f$ becomes a function of $x$.
So, we have
$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} + \frac{\partial f}{\partial z}\frac{dz}{dx}$$
Example 6.2 gives an illustration of this.
3. Suppose that $x$ and $y$ are independent but that $z$ is a function of $x$ and
$y$, then $\dfrac{dy}{dx} = 0$ and $f$ becomes a function of $x$ and $y$. So, we have
$$\left(\frac{\partial f}{\partial x}\right)_y = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}$$
Note that $f$ on the LHS is purely a function of $x$ and $y$ because $z$ has been
substituted into it. The subscript indicates that $y$ is held constant. On the
RHS, $f$ is a function of $x$, $y$ and $z$. Similarly, we have
$$\left(\frac{\partial f}{\partial y}\right)_x = \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y}$$
if $y$ is allowed to vary but $x$ is fixed. As a reminder, the subscript labels
the quantity being fixed. Read examples 6.3 and 6.4.
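Example 6.1. If $f = e^x\sin y$, find $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, $\dfrac{\partial^2 f}{\partial x^2}$, $\dfrac{\partial^2 f}{\partial y^2}$, and $\dfrac{\partial^2 f}{\partial x\,\partial y}$.
Solution: Consider the expression $f = e^x\sin y$, we obtain
$$\frac{\partial f}{\partial x} = e^x\sin y \quad\text{and}\quad \frac{\partial^2 f}{\partial x^2} = e^x\sin y$$
$$\frac{\partial f}{\partial y} = e^x\cos y \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2} = -e^x\sin y$$
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}\left(e^x\cos y\right) = e^x\cos y$$
Remark:
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}\left(e^x\sin y\right) = e^x\cos y$$
∎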
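The equality of the two mixed partial derivatives can be checked numerically with nested central differences. In the Python sketch below, the helpers `d_dx` and `d_dy`, the step size `h` and the sample point $(0.7, 1.2)$ are assumptions made for illustration.

```python
import math

def f(x, y):
    return math.exp(x) * math.sin(y)

def d_dx(g, x, y, h=1e-4):
    """Central-difference estimate of the partial derivative of g in x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-4):
    """Central-difference estimate of the partial derivative of g in y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.7, 1.2
fxy = d_dx(lambda x, y: d_dy(f, x, y), x0, y0)   # differentiate in y, then in x
fyx = d_dy(lambda x, y: d_dx(f, x, y), x0, y0)   # differentiate in x, then in y
exact = math.exp(x0) * math.cos(y0)
print(fxy, fyx, exact)
```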
Example 6.2. Given that $f = e^x\sin y$, find $\dfrac{\partial f}{\partial x}$. If $y = e^x$, find $\dfrac{df}{dx}$.
Solution: Consider $f = e^x\sin y$, we have $\dfrac{\partial f}{\partial x} = e^x\sin y$.
Now, we put $y = e^x$. The function $f$ can be rewritten as a single variable
function, $f = e^x\sin e^x$, and
$$\frac{df}{dx} = (\sin e^x)\,e^x + e^x\,(\cos e^x)\,e^x = e^x\sin e^x + e^{2x}\cos e^x$$
Alternatively, we apply the chain rule in remark 2 of section 6.1 and ignore
$z$, then
$$\begin{aligned} \frac{df}{dx} &= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} \\ &= e^x\sin y + (e^x\cos y)\,e^x \\ &= e^x\sin y + e^{2x}\cos y \\ &= e^x\sin e^x + e^{2x}\cos e^x \end{aligned}$$
∎
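Example 6.3. Let $f(x, y, z) = x^2 + xy^2 + z$, where $z(x, y) = 3x - y$. Obtain
$\dfrac{\partial f}{\partial x}$ and $\left(\dfrac{\partial f}{\partial x}\right)_y$.
Solution: Obviously, $\dfrac{\partial f}{\partial x} = 2x + y^2$.
In addition, we have $\dfrac{\partial f}{\partial z} = 1$ and $\dfrac{\partial z}{\partial x} = 3$. The chain rule in remark 3 of
section 6.1 gives
$$\begin{aligned} \left(\frac{\partial f}{\partial x}\right)_y &= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x} \\ &= 2x + y^2 + (1)(3) \\ &= 2x + y^2 + 3 \end{aligned}$$
Let’s illustrate the meaning of $\left(\dfrac{\partial f}{\partial x}\right)_y$.
The function $f(x, y, z)$ can be rewritten as $f(x, y, 3x - y) = x^2 + xy^2 + 3x - y$.
It is a function of $x$ and $y$ only. Hence, $\left(\dfrac{\partial f}{\partial x}\right)_y = 2x + y^2 + 3$, where $y$ is fixed
in the derivative.
∎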
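Example 6.4. Let $f(x, y, z) = xyz$, where $z = \ln(3x + 2y + z)$. Obtain $\dfrac{\partial f}{\partial x}$,
$\dfrac{\partial f}{\partial z}$ and $\left(\dfrac{\partial f}{\partial x}\right)_y$.
Solution: It is easy to write down $\dfrac{\partial f}{\partial x} = yz$ and $\dfrac{\partial f}{\partial z} = xy$.
We observe from $z = \ln(3x + 2y + z)$ that $z$ is coupled with $x$ and $y$ and
cannot be extracted explicitly from the expression. So, directly substituting
an expression for $z$ into $f(x, y, z)$ and then differentiating to obtain $\left(\dfrac{\partial f}{\partial x}\right)_y$
is not possible. However, the chain rule in remark 3 of section 6.1 gives
$$\left(\frac{\partial f}{\partial x}\right)_y = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}$$
Let’s work out $\dfrac{\partial z}{\partial x}$. One should note that $\dfrac{\partial z}{\partial x} = \left(\dfrac{\partial z}{\partial x}\right)_y$. Differentiating $z = \ln(3x + 2y + z)$ with respect to $x$, we have
$$\frac{\partial z}{\partial x} = \frac{3 + \frac{\partial z}{\partial x}}{3x + 2y + z} , \quad\text{which gives}\quad \frac{\partial z}{\partial x} = \frac{3}{3x + 2y + z - 1} .$$
Hence, we have
$$\begin{aligned} \left(\frac{\partial f}{\partial x}\right)_y &= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x} \\ &= yz + \frac{3xy}{3x + 2y + z - 1} = \frac{3xyz + 2y^2z + yz^2 - yz + 3xy}{3x + 2y + z - 1} \end{aligned}$$
∎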
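Although $z$ cannot be written explicitly, it can be found numerically, which allows a check of the result. The Python sketch below solves $z = \ln(3x + 2y + z)$ by fixed-point iteration, which is an assumption of this sketch rather than a method used in the notes; it converges to the branch with $3x + 2y + z > 1$. It then compares a central-difference estimate of $\left(\partial f/\partial x\right)_y$ with the formula. The sample point $(1, 1)$ and the helper names are illustrative.

```python
import math

def solve_z(x, y, iterations=200):
    """Fixed-point iteration for z = ln(3x + 2y + z), starting from z = 1."""
    z = 1.0
    for _ in range(iterations):
        z = math.log(3 * x + 2 * y + z)
    return z

def f_reduced(x, y):
    """f = x*y*z with z eliminated numerically, so f depends on x and y only."""
    return x * y * solve_z(x, y)

def formula(x, y):
    """(df/dx)_y = y z + 3 x y / (3x + 2y + z - 1) from Example 6.4."""
    z = solve_z(x, y)
    return y * z + 3 * x * y / (3 * x + 2 * y + z - 1)

x0, y0, h = 1.0, 1.0, 1e-5
numeric = (f_reduced(x0 + h, y0) - f_reduced(x0 - h, y0)) / (2 * h)
print(numeric, formula(x0, y0))
```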
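Example 6.5. The Cartesian coordinates are related to the polar coordinates by
$x = r\cos\theta$ and $y = r\sin\theta$. Find
(a) $\dfrac{\partial x}{\partial r}$, $\dfrac{\partial x}{\partial\theta}$ in terms of $r$ and $\theta$, hence find $\dfrac{\partial^2 x}{\partial r^2}$, $\dfrac{\partial^2 x}{\partial\theta^2}$.
(b) $\dfrac{\partial r}{\partial x}$ and $\dfrac{\partial\theta}{\partial x}$ in terms of $r$ and $\theta$, hence find $\dfrac{\partial^2 r}{\partial x^2}$, $\dfrac{\partial^2\theta}{\partial x^2}$.
Solution:
(a) $\dfrac{\partial x}{\partial r} = \cos\theta$ and $\dfrac{\partial x}{\partial\theta} = -r\sin\theta$.
Hence, $\dfrac{\partial^2 x}{\partial r^2} = 0$ and $\dfrac{\partial^2 x}{\partial\theta^2} = -r\cos\theta$.
(b) Consider the expression $x = r\cos\theta$ and differentiate both sides with
respect to $x$, we obtain
$$1 = r\,\frac{\partial}{\partial x}(\cos\theta) + \cos\theta\,\frac{\partial r}{\partial x}$$
which implies
$$1 = -r\sin\theta\,\frac{\partial\theta}{\partial x} + \cos\theta\,\frac{\partial r}{\partial x} \tag{6.2}$$
Consider the expression $y = r\sin\theta$ and differentiate both sides with respect to $x$, we obtain
$$0 = r\,\frac{\partial}{\partial x}(\sin\theta) + \sin\theta\,\frac{\partial r}{\partial x}$$
which implies
$$0 = r\cos\theta\,\frac{\partial\theta}{\partial x} + \sin\theta\,\frac{\partial r}{\partial x} \tag{6.3}$$
Solving equations 6.2 and 6.3, we obtain $\dfrac{\partial r}{\partial x} = \cos\theta$ and $\dfrac{\partial\theta}{\partial x} = -\dfrac{\sin\theta}{r}$.
Alternatively, one may consider the expression $r^2 = x^2 + y^2$. Differentiating both sides with respect to $x$, we obtain
$$2r\,\frac{\partial r}{\partial x} = 2x$$
$$\frac{\partial r}{\partial x} = \frac{x}{r} = \frac{r\cos\theta}{r} = \cos\theta$$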
CHAPTER 6. PARTIAL DIFFERENTIATION
129
y
Next, we know that tan θ = . If we differentiate both sides with respect
x
∂y
x ( ) − y ( ∂x )
∂x
∂x
∂θ
x (0) − y (1)
y
to x, we obtain sec2 θ ( ) =
=
=− 2.
2
2
∂x
x
x
x
y
sin
θ
∂θ
= − cos2 θ ( 2 ) = −
.
So, we obtain
∂x
x
r
Hence, we have
∂ 2r
∂ ∂r
∂
∂θ
sin θ
sin2 θ
=
(
)
=
(cos
θ)
=
−
sin
θ
=
−
sin
θ
(−
)
=
∂x2 ∂x ∂x
∂x
∂x
r
r
Repeat similar process, we have
∂ 2θ
∂ ∂θ
∂
sin θ
=
( )=
(−
) , then
2
∂x
∂x ∂x
∂x
r
∂
∂r
(sin θ) − sin θ }
∂x
∂x
r2
sin θ
− {r cos θ (−
) − sin θ cos θ}
r
=
r2
2 sin θ cos θ
=
r2
sin 2θ
=
r2
∂ 2θ
=
∂x2
− {r
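Remark: The reciprocal rule resulting from the differentiation of a single-variable function may not apply to partial differentiation. In this example, we observe that $\dfrac{\partial x}{\partial r}\dfrac{\partial r}{\partial x} \neq 1$. In fact, $\dfrac{\partial x}{\partial r} = \left(\dfrac{\partial x}{\partial r}\right)_\theta$ and $\dfrac{\partial r}{\partial x} = \left(\dfrac{\partial r}{\partial x}\right)_y$, where the parameters being fixed are not the same in each derivative. Similarly, $\dfrac{\partial x}{\partial \theta}\dfrac{\partial \theta}{\partial x} \neq 1$, where $\dfrac{\partial x}{\partial \theta} = \left(\dfrac{\partial x}{\partial \theta}\right)_r$ and $\dfrac{\partial \theta}{\partial x} = \left(\dfrac{\partial \theta}{\partial x}\right)_y$. The reciprocal rule is valid for multivariable functions if the parameter(s) being fixed is (are) the same.
Further discussion can be found in the remark of example 6.8.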
∎
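The failure of the naive reciprocal rule in example 6.5 is easy to see numerically. The sketch below is illustrative only: the sample point (r0, theta0) and the step size are assumptions. It estimates (∂x/∂r) with θ fixed and (∂r/∂x) with y fixed, whose product is cos²θ rather than 1.

```python
import math

def x_of(r, theta):
    """Cartesian x from polar coordinates."""
    return r * math.cos(theta)

def r_of(x, y):
    """Radial distance from Cartesian coordinates."""
    return math.hypot(x, y)

r0, theta0, h = 2.0, 0.7, 1e-6  # assumed sample point and step size
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)

# (dx/dr) with theta held fixed
dx_dr = (x_of(r0 + h, theta0) - x_of(r0 - h, theta0)) / (2 * h)
# (dr/dx) with y held fixed
dr_dx = (r_of(x0 + h, y0) - r_of(x0 - h, y0)) / (2 * h)

product = dx_dr * dr_dx
print(product, math.cos(theta0) ** 2)
```

Both derivatives equal cos θ here, so the product differs from 1 unless θ is a multiple of π.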
Example 6.6. Evaluate $\left\{(3.8)^2 + 2(2.1)^3\right\}^{1/5}$ to the first-order approximation without using a calculator.
Solution: Let $z = (x^2 + 2y^3)^{1/5}$. The total differential of $z$ is
\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy , \]
where
\[ \frac{\partial z}{\partial x} = \frac{1}{5}(x^2 + 2y^3)^{-4/5}(2x) \quad\text{and}\quad \frac{\partial z}{\partial y} = \frac{1}{5}(x^2 + 2y^3)^{-4/5}(6y^2) . \]
Hence, $dz = \dfrac{1}{5}(x^2 + 2y^3)^{-4/5}(2x\,dx + 6y^2\,dy)$.
Let $x = 4$, $y = 2$ and set $dx = -0.2$ and $dy = 0.1$. We have $z = \left[4^2 + 2(2)^3\right]^{1/5} = 2$ and
\[
\begin{aligned}
dz &= \frac{1}{5}\left(4^2 + 2(2)^3\right)^{-4/5}\left[2(4)(-0.2) + 6(2^2)(0.1)\right] \\
&= \frac{1}{5}\cdot\frac{1}{16}\,(-1.6 + 2.4) = 0.01 .
\end{aligned}
\]
Therefore $\left\{(3.8)^2 + 2(2.1)^3\right\}^{1/5} \approx 2 + 0.01 = 2.01$. The higher order approximation can be obtained by using the Taylor series in two variables.
∎
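To see how good the first-order estimate of example 6.6 is, it can be compared with a direct floating-point evaluation. This is a small sketch with hypothetical helper names; it simply re-implements the total differential used above.

```python
def z(x, y):
    """z = (x^2 + 2 y^3)^(1/5)."""
    return (x ** 2 + 2 * y ** 3) ** 0.2

def linear_estimate(x0, y0, dx, dy):
    """First-order estimate z(x0, y0) + dz using the total differential."""
    base = x0 ** 2 + 2 * y0 ** 3
    dz = 0.2 * base ** (-0.8) * (2 * x0 * dx + 6 * y0 ** 2 * dy)
    return z(x0, y0) + dz

estimate = linear_estimate(4.0, 2.0, -0.2, 0.1)
exact = z(3.8, 2.1)
print(estimate, exact, exact - estimate)
```

The small remaining difference between the two values is the contribution of the second- and higher-order terms that the total differential neglects.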
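Example 6.7. Functions $x$ and $y$ are described by
\[ \begin{cases} x = e^u \cos v \\ y = e^u \sin v \end{cases} \]
(a) Write down $dx$ and $dy$. Hence, show that
\[ \begin{cases} du = e^{-u}\cos v\,dx + e^{-u}\sin v\,dy \\ dv = -e^{-u}\sin v\,dx + e^{-u}\cos v\,dy \end{cases} \]
(b) If $z = uv$, find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ by using the results in (a).
Solution:
(a) Differentiating $x$ and $y$ with respect to $u$ and $v$, we have
\[ \begin{cases} \dfrac{\partial x}{\partial u} = e^u \cos v \\[2mm] \dfrac{\partial x}{\partial v} = -e^u \sin v \end{cases} \quad\text{and}\quad \begin{cases} \dfrac{\partial y}{\partial u} = e^u \sin v \\[2mm] \dfrac{\partial y}{\partial v} = e^u \cos v \end{cases} \]
The total differential of $x$ is $dx = \dfrac{\partial x}{\partial u}\,du + \dfrac{\partial x}{\partial v}\,dv$. We can write
\[ dx = e^u \cos v\,du - e^u \sin v\,dv \tag{6.4} \]
The total differential of $y$ is $dy = \dfrac{\partial y}{\partial u}\,du + \dfrac{\partial y}{\partial v}\,dv$. We can write
\[ dy = e^u \sin v\,du + e^u \cos v\,dv \tag{6.5} \]
Solving equations 6.4 and 6.5, we obtain
\[ \begin{aligned} du &= e^{-u}\cos v\,dx + e^{-u}\sin v\,dy \\ dv &= -e^{-u}\sin v\,dx + e^{-u}\cos v\,dy \end{aligned} \]
(b) Since $z = uv$, we obtain $dz = u\,dv + v\,du$. Hence,
\[ dz = u\left(-e^{-u}\sin v\,dx + e^{-u}\cos v\,dy\right) + v\left(e^{-u}\cos v\,dx + e^{-u}\sin v\,dy\right) . \]
Rearranging the equation, we have
\[ dz = \left(-u e^{-u}\sin v + v e^{-u}\cos v\right) dx + \left(u e^{-u}\cos v + v e^{-u}\sin v\right) dy . \]
But $dz = \dfrac{\partial z}{\partial x}\,dx + \dfrac{\partial z}{\partial y}\,dy$. Hence, we can write
\[ \begin{cases} \dfrac{\partial z}{\partial x} = -u e^{-u}\sin v + v e^{-u}\cos v \\[2mm] \dfrac{\partial z}{\partial y} = u e^{-u}\cos v + v e^{-u}\sin v \end{cases} \]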
∎
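Example 6.8. If $f(x, y, z) = 0$, show that
\[ \frac{\partial z}{\partial y}\,\frac{\partial y}{\partial x}\,\frac{\partial x}{\partial z} = -1 . \]
Solution: If $f(x, y, z) = 0$, we can write $z$ as a function of $x$ and $y$, i.e. $z(x, y)$. Then, we have
\[ \left(\frac{\partial f}{\partial y}\right)_x = \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y} = 0 . \]
Please refer to remark 3 of section 6.1. Hence, we obtain
\[ \frac{\partial z}{\partial y} = -\frac{\partial f/\partial y}{\partial f/\partial z} , \quad\text{where}\quad \frac{\partial z}{\partial y} = \left(\frac{\partial z}{\partial y}\right)_x . \]
Similarly, we can write $y(z, x)$ and
\[ \left(\frac{\partial f}{\partial x}\right)_z = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial x} = 0 . \]
Hence, we obtain
\[ \frac{\partial y}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial y} , \quad\text{where}\quad \frac{\partial y}{\partial x} = \left(\frac{\partial y}{\partial x}\right)_z . \]
Similarly, we can write $x(y, z)$ and
\[ \left(\frac{\partial f}{\partial z}\right)_y = \frac{\partial f}{\partial x}\frac{\partial x}{\partial z} + \frac{\partial f}{\partial z} = 0 . \]
Hence, we obtain
\[ \frac{\partial x}{\partial z} = -\frac{\partial f/\partial z}{\partial f/\partial x} , \quad\text{where}\quad \frac{\partial x}{\partial z} = \left(\frac{\partial x}{\partial z}\right)_y . \]
Therefore, we obtain the product of the partial derivatives
\[ \frac{\partial z}{\partial y}\,\frac{\partial y}{\partial x}\,\frac{\partial x}{\partial z} = -1 . \]
This formula is widely used in thermodynamics. According to the ideal gas law, we know that $PV = nRT$, where $n$ is the number of moles of gas in the system at equilibrium. Now, we define a function $f(P, V, T) = 0$, where $P$, $V$, and $T$ are the pressure, volume, and temperature of the gas system. Thus, we have
\[ \left(\frac{\partial T}{\partial V}\right)_P \left(\frac{\partial V}{\partial P}\right)_T \left(\frac{\partial P}{\partial T}\right)_V = -1 . \]
Remark: Knowing that $f(x, y, z) = 0$, we can write $z(x, y)$, $y(z, x)$, and $x(y, z)$. If we repeat the above work in a similar way, we get
\[ \frac{\partial x}{\partial y} = -\frac{\partial f/\partial y}{\partial f/\partial x} , \qquad \frac{\partial z}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial z} , \qquad\text{and}\qquad \frac{\partial y}{\partial z} = -\frac{\partial f/\partial z}{\partial f/\partial y} , \]
where $\dfrac{\partial z}{\partial x} = \left(\dfrac{\partial z}{\partial x}\right)_y$, $\dfrac{\partial x}{\partial y} = \left(\dfrac{\partial x}{\partial y}\right)_z$, and $\dfrac{\partial y}{\partial z} = \left(\dfrac{\partial y}{\partial z}\right)_x$. Multiplying the above derivatives, we obtain another product relation
\[ \frac{\partial z}{\partial x}\,\frac{\partial x}{\partial y}\,\frac{\partial y}{\partial z} = -1 . \]
Substituting $P$, $V$ and $T$ for $x$, $y$, and $z$ respectively, we obtain
\[ \left(\frac{\partial T}{\partial P}\right)_V \left(\frac{\partial P}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_P = -1 . \]
In addition, we obtain the following product relations by direct multiplication of the partial derivatives:
\[ \frac{\partial z}{\partial y}\,\frac{\partial y}{\partial z} = 1 , \qquad \frac{\partial y}{\partial x}\,\frac{\partial x}{\partial y} = 1 , \qquad\text{and}\qquad \frac{\partial x}{\partial z}\,\frac{\partial z}{\partial x} = 1 . \]
The results are straightforward in meaning. For example, the first product considers $x$ as a fixed number; then $f(x, y, z) = 0$ implies that $y$ is a function of $z$ (or $z$ is a function of $y$), and the reciprocal rule for single-variable functions applies. Similarly, we obtain the remaining two. Hence, we have
\[ \left(\frac{\partial T}{\partial V}\right)_P \left(\frac{\partial V}{\partial T}\right)_P = 1 , \qquad \left(\frac{\partial V}{\partial P}\right)_T \left(\frac{\partial P}{\partial V}\right)_T = 1 , \qquad\text{and}\qquad \left(\frac{\partial P}{\partial T}\right)_V \left(\frac{\partial T}{\partial P}\right)_V = 1 . \]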
∎
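The cyclic relation of example 6.8 can be illustrated numerically for the ideal gas. In the sketch below, the state point (roughly one mole at atmospheric pressure), the step sizes and all function names are assumptions made for illustration; each partial derivative is estimated by a central difference with the appropriate variable held fixed.

```python
R = 8.314  # gas constant in J/(mol K)
n = 1.0    # assumed amount of gas in moles

def T_of(P, V):
    return P * V / (n * R)

def V_of(P, T):
    return n * R * T / P

def P_of(T, V):
    return n * R * T / V

def central(g, a, h):
    """Central-difference estimate of g'(a) with step h."""
    return (g(a + h) - g(a - h)) / (2 * h)

# Assumed state point: about 1 atm and 24.5 litres
P0, V0 = 101325.0, 0.0245
T0 = T_of(P0, V0)

dT_dV_at_P = central(lambda V: T_of(P0, V), V0, 1e-6)
dV_dP_at_T = central(lambda P: V_of(P, T0), P0, 1.0)
dP_dT_at_V = central(lambda T: P_of(T, V0), T0, 1e-3)

triple_product = dT_dV_at_P * dV_dP_at_T * dP_dT_at_V
print(triple_product)
```

The printed value should be very close to -1, even though each individual factor depends on the chosen state.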
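Example 6.9. This problem repeats example 6.8. If $f(x, y, z) = 0$, show by considering the total differentials of $x$, $y$, and $z$ that $\dfrac{\partial z}{\partial y}\,\dfrac{\partial y}{\partial x}\,\dfrac{\partial x}{\partial z} = -1$.
Solution: Since $f(x, y, z) = 0$, we have $x(y, z)$, $y(z, x)$, and $z(x, y)$ respectively. Then, we write down the total differentials as
\[ dx = \left(\frac{\partial x}{\partial y}\right)_z dy + \left(\frac{\partial x}{\partial z}\right)_y dz \tag{6.6} \]
\[ dy = \left(\frac{\partial y}{\partial z}\right)_x dz + \left(\frac{\partial y}{\partial x}\right)_z dx \tag{6.7} \]
\[ dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \tag{6.8} \]
Eliminating $dy$ from equations 6.6 and 6.7, we obtain
\[ \left[1 - \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z\right] dx - \left[\left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x + \left(\frac{\partial x}{\partial z}\right)_y\right] dz = 0 \tag{6.9} \]
As $dx$ and $dz$ can be chosen independently, we have
\[ 1 - \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z = 0 \tag{6.10} \]
\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x + \left(\frac{\partial x}{\partial z}\right)_y = 0 \tag{6.11} \]
Similarly, we eliminate $dz$ from equations 6.7 and 6.8, thus
\[ \left[1 - \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial y}\right)_x\right] dy - \left[\left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y + \left(\frac{\partial y}{\partial x}\right)_z\right] dx = 0 \tag{6.12} \]
As $dx$ and $dy$ can be chosen independently, we have
\[ 1 - \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial y}\right)_x = 0 \tag{6.13} \]
\[ \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y + \left(\frac{\partial y}{\partial x}\right)_z = 0 \tag{6.14} \]
In the same manner, we eliminate $dx$ from equations 6.6 and 6.8, thus
\[ \left[1 - \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial z}\right)_y\right] dz - \left[\left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial z}{\partial y}\right)_x\right] dy = 0 \tag{6.15} \]
As $dy$ and $dz$ can be chosen independently, we have
\[ 1 - \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial z}\right)_y = 0 \tag{6.16} \]
\[ \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial z}{\partial y}\right)_x = 0 \tag{6.17} \]
Equations 6.10, 6.13, and 6.16 give the following reciprocal rules
\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z = 1 \tag{6.18} \]
\[ \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial y}\right)_x = 1 \tag{6.19} \]
\[ \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial z}\right)_y = 1 \tag{6.20} \]
Equations 6.11, 6.14, and 6.17 and the above rules give
\[ \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial z}{\partial x}\right)_y = -1 \tag{6.21} \]
\[ \left(\frac{\partial z}{\partial y}\right)_x \left(\frac{\partial y}{\partial x}\right)_z \left(\frac{\partial x}{\partial z}\right)_y = -1 \tag{6.22} \]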
∎
6.2 Geometrical Meaning of Partial Derivatives
Figure 6.1 shows the intersection between the curved surface $z = f(x, y)$ and the vertical plane $x = x_0$. It is a curve given by $z(x_0, y)$. The slope of this curve at $(x_0, y_0)$ is given by the partial derivative of $z$ with respect to $y$, i.e. $\left.\dfrac{\partial z}{\partial y}\right|_{(x_0, y_0)}$. While doing the differentiation, only the value of $y$ is allowed to vary; the value of $x$ is always fixed at $x_0$. The derivative gives the rate of change of $z$ along the positive $y$ direction. Similarly, if we cut the curved surface by another vertical plane $y = y_0$, the intersecting curve is $z(x, y_0)$ and its slope at $(x_0, y_0)$ is $\left.\dfrac{\partial z}{\partial x}\right|_{(x_0, y_0)}$. This quantity gives the rate of change of $z$ along the positive $x$ direction.
Figure 6.1: The meaning of partial derivative
6.3 Polar Coordinates
A point $P$ on the Cartesian plane is represented by $(x, y)$ with respect to an origin $O$, where $x$ and $y$ are the horizontal and vertical coordinates and $(x, y)$ are called the Cartesian coordinates. The location of point $P$ can also be represented by the polar coordinates $(r, \theta)$ with respect to the origin $O$, where $r = OP \geq 0$ is the radial distance between point $P$ and the origin. The polar angle $\theta$ is the angle between the positive $x$ axis and $OP$. It is positive if the angle is measured counterclockwise from the positive $x$ axis, and negative if it is measured clockwise from the positive $x$ axis. Figure 6.2 shows the polar coordinates of point $P$ on the Cartesian plane.
Figure 6.2: The polar coordinate system
Some textbooks use $(\rho, \varphi)$ instead of $(r, \theta)$ in order to avoid confusion with the spherical coordinate system (section 6.7). Recall that the Cartesian coordinate system has a pair of unit vectors associated with the coordinates, i.e. $\hat{\imath}$ for $x$ and $\hat{\jmath}$ for $y$. Similarly, the polar coordinate system has unit vectors $\hat{e}_r$ and $\hat{e}_\theta$ associated with coordinates $r$ and $\theta$ respectively. The directions of $\hat{e}_r$ and $\hat{e}_\theta$ are orthogonal to each other (normal to each other) as shown in figure 6.2. The two sets of unit vectors are related by
\[ \begin{cases} \hat{e}_r = \cos\theta\,\hat{\imath} + \sin\theta\,\hat{\jmath} \\ \hat{e}_\theta = -\sin\theta\,\hat{\imath} + \cos\theta\,\hat{\jmath} \end{cases} \tag{6.23} \]
The second line in equation set 6.23 is obtained from the first by replacing $\theta$ by $\pi/2 + \theta$, because $\hat{e}_\theta$ is $\hat{e}_r$ turned counterclockwise by $\pi/2$. The transformations between $(x, y)$ and $(r, \theta)$ are
\[ \begin{cases} x = r\cos\theta \\ y = r\sin\theta \end{cases} \]
For better analysis, let's rename the position vector of $P$ as $\vec{l}$, where $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath}$. If we use polar coordinates, then $\vec{l} = r\,\hat{e}_r$. Another point $Q$ on the $xy$ plane has an infinitesimal displacement from $P$; the vector $\overrightarrow{PQ}$ is
\[ d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath} \tag{6.24} \]
This change can be represented by using the polar coordinates as
\[ d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta \tag{6.25} \]
The derivation of equation 6.25 is as follows. We first express $dx$ and $dy$ in equation 6.24 in terms of the polar coordinates and then we write down $\hat{\imath}$ and $\hat{\jmath}$ in terms of $\hat{e}_r$ and $\hat{e}_\theta$. From the transformation $x = r\cos\theta$, we have the total differential of $x$
\[ dx = \cos\theta\,dr - r\sin\theta\,d\theta \tag{6.26} \]
Similarly, the transformation $y = r\sin\theta$ gives the total differential of $y$
\[ dy = \sin\theta\,dr + r\cos\theta\,d\theta \tag{6.27} \]
From equation 6.23, we have
\[ \begin{cases} \hat{\imath} = \cos\theta\,\hat{e}_r - \sin\theta\,\hat{e}_\theta \\ \hat{\jmath} = \sin\theta\,\hat{e}_r + \cos\theta\,\hat{e}_\theta \end{cases} \tag{6.28} \]
Substituting equations 6.26, 6.27 and 6.28 into equation 6.24, we have
\[
\begin{aligned}
d\vec{l} &= (\cos\theta\,dr - r\sin\theta\,d\theta)(\cos\theta\,\hat{e}_r - \sin\theta\,\hat{e}_\theta) + (\sin\theta\,dr + r\cos\theta\,d\theta)(\sin\theta\,\hat{e}_r + \cos\theta\,\hat{e}_\theta) \\
&= (\cos^2\theta + \sin^2\theta)\,dr\,\hat{e}_r + r(\cos^2\theta + \sin^2\theta)\,d\theta\,\hat{e}_\theta
\end{aligned}
\]
Finally, we obtain the vectorial representation of a small change using the polar coordinates
\[ d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta \tag{6.29} \]
Remark: There are two ways to obtain equation 6.28 from equation set 6.23. The first way is to regard $\hat{\imath}$ and $\hat{\jmath}$ as two unknowns in equation set 6.23 and solve for them simultaneously. However, this approach is very clumsy. The second way is to consider the dot products of equation set 6.23 with $\hat{\imath}$ and $\hat{\jmath}$ respectively. The results give the projections of $\hat{\imath}$ and $\hat{\jmath}$ on $\hat{e}_r$ and $\hat{e}_\theta$. For example, we form the dot product of $\hat{\imath}$ with equation set 6.23 and obtain the projections of $\hat{\imath}$ on $\hat{e}_r$ and $\hat{e}_\theta$
\[ \begin{cases} \hat{\imath}\cdot\hat{e}_r = \cos\theta \\ \hat{\imath}\cdot\hat{e}_\theta = -\sin\theta \end{cases} \tag{6.30} \]
Next, we form the dot product of $\hat{\jmath}$ with equation set 6.23; then we know the projections of $\hat{\jmath}$ on $\hat{e}_r$ and $\hat{e}_\theta$.
\[ \begin{cases} \hat{\jmath}\cdot\hat{e}_r = \sin\theta \\ \hat{\jmath}\cdot\hat{e}_\theta = \cos\theta \end{cases} \tag{6.31} \]
Equation sets 6.30 and 6.31 give rise to equation set 6.28.
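Example 6.10. In the polar coordinate system, the position of a point is represented by $(r, \theta)$. If there is an infinitesimally small change in the position, say $d\vec{l}$, show by considering $d\vec{l} = \dfrac{\partial \vec{l}}{\partial r}\,dr + \dfrac{\partial \vec{l}}{\partial \theta}\,d\theta$ that $d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta$.
Solution: The position of a point, say $P$, is given by $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath}$; then we have $\vec{l} = r\cos\theta\,\hat{\imath} + r\sin\theta\,\hat{\jmath}$. Differentiating both sides, we have
\[ \begin{cases} \dfrac{\partial \vec{l}}{\partial r} = \cos\theta\,\hat{\imath} + \sin\theta\,\hat{\jmath} \\[2mm] \dfrac{\partial \vec{l}}{\partial \theta} = -r\sin\theta\,\hat{\imath} + r\cos\theta\,\hat{\jmath} \end{cases} \]
Then
\[ \begin{cases} \left|\dfrac{\partial \vec{l}}{\partial r}\right| = \sqrt{\cos^2\theta + \sin^2\theta} = 1 \\[2mm] \left|\dfrac{\partial \vec{l}}{\partial \theta}\right| = \sqrt{r^2\sin^2\theta + r^2\cos^2\theta} = r \end{cases} \]
We know that $\dfrac{\partial \vec{l}}{\partial r}$ is parallel to $\hat{e}_r$ and $\dfrac{\partial \vec{l}}{\partial \theta}$ is parallel to $\hat{e}_\theta$. Hence, we can write
\[
\begin{aligned}
d\vec{l} &= \frac{\partial \vec{l}}{\partial r}\,dr + \frac{\partial \vec{l}}{\partial \theta}\,d\theta \\
&= \left|\frac{\partial \vec{l}}{\partial r}\right| dr\,\hat{e}_r + \left|\frac{\partial \vec{l}}{\partial \theta}\right| d\theta\,\hat{e}_\theta \\
&= dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta
\end{aligned}
\]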
∎
6.4 Polar Coordinates and the Length of a Curve
In section 3.8, we have a discussion about the length of a curve in Cartesian coordinates. Recall that on a Cartesian plane, the length of a small segment is given by
\[ dl = \sqrt{(dx)^2 + (dy)^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx . \]
The small length corresponds to the magnitude of a small vector $d\vec{l}$, where $d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath}$, with $\hat{\imath}$ and $\hat{\jmath}$ being orthogonal unit vectors (normal to each other). Likewise in polar coordinates, a small vector $d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta$ has a length
\[ dl = \sqrt{(dr)^2 + r^2 (d\theta)^2} = \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta , \]
where $\hat{e}_r$ and $\hat{e}_\theta$ are orthogonal unit vectors. Thus, the total length of the curve is given by
\[ \int_{\theta_1}^{\theta_2} \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta \]
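The polar arc-length integral lends itself to a quick numerical evaluation. The following sketch uses a simple midpoint rule (the number of steps and the function names are assumptions, not part of the notes) and applies it to the cardioid r = 2(1 + cos θ) treated in example 6.11, whose total length is 16.

```python
import math

def polar_arc_length(r, dr_dtheta, theta1, theta2, steps=10_000):
    """Midpoint-rule estimate of the integral of sqrt(r^2 + (dr/dtheta)^2)."""
    h = (theta2 - theta1) / steps
    total = 0.0
    for k in range(steps):
        theta = theta1 + (k + 0.5) * h
        total += math.sqrt(r(theta) ** 2 + dr_dtheta(theta) ** 2) * h
    return total

def cardioid(theta):
    return 2 * (1 + math.cos(theta))

def cardioid_slope(theta):
    return -2 * math.sin(theta)

# The curve is symmetric about the x axis, so integrate over [0, pi] and double.
length = 2 * polar_arc_length(cardioid, cardioid_slope, 0.0, math.pi)
print(length)
```

Integrating over half the curve and doubling follows the symmetry argument used in the worked solution.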
Example 6.11. Find the total length of the cardioid r = 2 (1 + cos θ) as
shown in the figure.
Figure 6.3: The cardioid
Solution: Firstly, we obtain $\dfrac{dr}{d\theta} = -2\sin\theta$; then the total length of the cardioid is given by
\[
\begin{aligned}
S &= 2\int_0^{\pi} \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta \\
&= 2\int_0^{\pi} \sqrt{4(1 + \cos\theta)^2 + 4\sin^2\theta}\,d\theta \\
&= 4\sqrt{2}\int_0^{\pi} \sqrt{1 + \cos\theta}\,d\theta \\
&= 8\int_0^{\pi} \cos\frac{\theta}{2}\,d\theta \\
&= 16
\end{aligned}
\]
We have applied the trigonometric identity $\cos\theta = 2\cos^2\dfrac{\theta}{2} - 1$.
∎
6.5 Cartesian Coordinates
A Cartesian coordinate system has coordinates (x, y, z) as shown in figure
6.4. The position vector of a point P in the space is represented by
⃗l = x î + y ĵ + z k̂ ,
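where $\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$ are "constant vectors". The directions and magnitudes of $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$ never change. The length of each of them is 1. Then, the infinitesimal change of the position vector is
\[ d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath} + dz\,\hat{k} \]
The answer is straightforward, but the idea behind it relates to partial differentiation. Let's reveal it! Recall that in equation 6.1, the total differential of a scalar function $f$ is given by
\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz \]
Figure 6.4: The Cartesian coordinate system
It is also true for a vector function, so
\[ d\vec{l} = \frac{\partial \vec{l}}{\partial x}\,dx + \frac{\partial \vec{l}}{\partial y}\,dy + \frac{\partial \vec{l}}{\partial z}\,dz \tag{6.32} \]
On the other hand, the position vector $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}$ gives
\[ \begin{cases} \dfrac{\partial \vec{l}}{\partial x} = \hat{\imath} & \text{along } \hat{\imath} \\[2mm] \dfrac{\partial \vec{l}}{\partial y} = \hat{\jmath} & \text{along } \hat{\jmath} \\[2mm] \dfrac{\partial \vec{l}}{\partial z} = \hat{k} & \text{along } \hat{k} \end{cases} \]
Obviously, the magnitudes of them are $\left|\dfrac{\partial \vec{l}}{\partial x}\right| = 1$, $\left|\dfrac{\partial \vec{l}}{\partial y}\right| = 1$, and $\left|\dfrac{\partial \vec{l}}{\partial z}\right| = 1$, respectively. Equation 6.32 is equivalent to
\[ d\vec{l} = \left|\frac{\partial \vec{l}}{\partial x}\right| dx\,\hat{\imath} + \left|\frac{\partial \vec{l}}{\partial y}\right| dy\,\hat{\jmath} + \left|\frac{\partial \vec{l}}{\partial z}\right| dz\,\hat{k} \]
Therefore, the vectorial representation of an infinitesimal change using the Cartesian coordinates is
\[ d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath} + dz\,\hat{k} \]
The answer looks trivial, but the approach to obtain it is applicable to other coordinate systems.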
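6.6 Cylindrical Coordinates
A cylindrical coordinate system has coordinates $(\rho, \varphi, z)$. It is an extension of the plane polar coordinates to include the $z$ axis as shown in figure 6.5. Unlike section 6.3, the plane polar coordinates are written as $(\rho, \varphi)$ instead of $(r, \theta)$. This is to avoid confusion with the spherical coordinates to be discussed in section 6.7. The polar coordinates $\rho$ and $\varphi$ are obtained by the transformations $x = \rho\cos\varphi$ and $y = \rho\sin\varphi$. Using Cartesian coordinates, the position vector of $P$ is expressed as $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}$; then we can write
\[ \vec{l} = \rho\cos\varphi\,\hat{\imath} + \rho\sin\varphi\,\hat{\jmath} + z\,\hat{k} \]
One should also be aware of $\vec{l} = \rho\,\hat{\rho} + z\,\hat{k}$.
Figure 6.5: The cylindrical coordinate system
Note that
\[ \begin{cases} \dfrac{\partial \vec{l}}{\partial \rho} = \cos\varphi\,\hat{\imath} + \sin\varphi\,\hat{\jmath} & \text{along } \hat{\rho} \\[2mm] \dfrac{\partial \vec{l}}{\partial \varphi} = -\rho\sin\varphi\,\hat{\imath} + \rho\cos\varphi\,\hat{\jmath} & \text{along } \hat{\varphi} \\[2mm] \dfrac{\partial \vec{l}}{\partial z} = \hat{k} & \text{along } \hat{k} \end{cases} \]
The magnitudes of them are
\[ \begin{cases} \left|\dfrac{\partial \vec{l}}{\partial \rho}\right| = \sqrt{\cos^2\varphi + \sin^2\varphi} = 1 \\[2mm] \left|\dfrac{\partial \vec{l}}{\partial \varphi}\right| = \sqrt{\rho^2\sin^2\varphi + \rho^2\cos^2\varphi} = \rho \\[2mm] \left|\dfrac{\partial \vec{l}}{\partial z}\right| = 1 \end{cases} \]
Hence, we obtain the unit vectors
\[ \begin{cases} \hat{\rho} = \dfrac{\partial \vec{l}}{\partial \rho} \Big/ \left|\dfrac{\partial \vec{l}}{\partial \rho}\right| = \cos\varphi\,\hat{\imath} + \sin\varphi\,\hat{\jmath} \\[2mm] \hat{\varphi} = \dfrac{\partial \vec{l}}{\partial \varphi} \Big/ \left|\dfrac{\partial \vec{l}}{\partial \varphi}\right| = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \\[2mm] \hat{k} = \dfrac{\partial \vec{l}}{\partial z} \Big/ \left|\dfrac{\partial \vec{l}}{\partial z}\right| = \hat{k} \end{cases} \]
One can check that $\hat{\rho}$, $\hat{\varphi}$ and $\hat{k}$ are orthogonal to each other. $\hat{\rho}$ and $\hat{\varphi}$ are unit vectors, but they are not constant vectors because both of them are functions of $\varphi$; their directions vary with $\varphi$. The infinitesimal change in the position vector is
\[
\begin{aligned}
d\vec{l} &= \frac{\partial \vec{l}}{\partial \rho}\,d\rho + \frac{\partial \vec{l}}{\partial \varphi}\,d\varphi + \frac{\partial \vec{l}}{\partial z}\,dz \\
&= \left|\frac{\partial \vec{l}}{\partial \rho}\right| d\rho\,\hat{\rho} + \left|\frac{\partial \vec{l}}{\partial \varphi}\right| d\varphi\,\hat{\varphi} + \left|\frac{\partial \vec{l}}{\partial z}\right| dz\,\hat{k}
\end{aligned}
\]
Therefore, the vectorial representation of a small change using the cylindrical coordinates is
\[ d\vec{l} = d\rho\,\hat{\rho} + \rho\,d\varphi\,\hat{\varphi} + dz\,\hat{k} \]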
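Example 6.12. Let $\vec{C} = c_1\,\hat{\imath} + c_2\,\hat{\jmath} + c_3\,\hat{k}$ be a constant vector. Rewrite $\vec{C}$ using the cylindrical coordinates.
Solution: Recall that
\[ \begin{cases} \hat{\rho} = \cos\varphi\,\hat{\imath} + \sin\varphi\,\hat{\jmath} \\ \hat{\varphi} = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \end{cases} \]
which gives
\[ \begin{cases} \hat{\imath}\cdot\hat{\rho} = \cos\varphi \\ \hat{\imath}\cdot\hat{\varphi} = -\sin\varphi \end{cases} \quad\text{and}\quad \begin{cases} \hat{\jmath}\cdot\hat{\rho} = \sin\varphi \\ \hat{\jmath}\cdot\hat{\varphi} = \cos\varphi \end{cases} \]
One can see readily that
\[ \begin{cases} \hat{\imath} = \cos\varphi\,\hat{\rho} - \sin\varphi\,\hat{\varphi} \\ \hat{\jmath} = \sin\varphi\,\hat{\rho} + \cos\varphi\,\hat{\varphi} \end{cases} \]
thus
\[
\begin{aligned}
\vec{C} &= c_1\,\hat{\imath} + c_2\,\hat{\jmath} + c_3\,\hat{k} \\
&= c_1(\cos\varphi\,\hat{\rho} - \sin\varphi\,\hat{\varphi}) + c_2(\sin\varphi\,\hat{\rho} + \cos\varphi\,\hat{\varphi}) + c_3\,\hat{k} \\
&= (c_1\cos\varphi + c_2\sin\varphi)\,\hat{\rho} - (c_1\sin\varphi - c_2\cos\varphi)\,\hat{\varphi} + c_3\,\hat{k}
\end{aligned}
\]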
∎
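The component formulas of example 6.12 amount to a rotation by the angle φ in the xy plane. The sketch below (function names and sample values are illustrative assumptions) converts the Cartesian components of a constant vector to cylindrical components at a given φ and back again, which should recover the original components.

```python
import math

def to_cylindrical_components(c1, c2, c3, phi):
    """Components of C along rho-hat, phi-hat and k-hat at azimuthal angle phi."""
    c_rho = c1 * math.cos(phi) + c2 * math.sin(phi)
    c_phi = -c1 * math.sin(phi) + c2 * math.cos(phi)
    return c_rho, c_phi, c3

def to_cartesian_components(c_rho, c_phi, c_z, phi):
    """Inverse map, using i-hat and j-hat expressed through rho-hat and phi-hat."""
    cx = c_rho * math.cos(phi) - c_phi * math.sin(phi)
    cy = c_rho * math.sin(phi) + c_phi * math.cos(phi)
    return cx, cy, c_z

c = (1.5, -2.0, 0.7)   # assumed sample vector
phi = 1.1              # assumed azimuthal angle in radians
cyl = to_cylindrical_components(*c, phi)
back = to_cartesian_components(*cyl, phi)
print(cyl, back)
```

Although the vector is constant, its cylindrical components change with φ because the unit vectors themselves rotate.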
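6.7 Spherical Coordinates
A spherical coordinate system has coordinates $(r, \theta, \varphi)$, where $0 \leq \theta \leq \pi$ and $0 \leq \varphi \leq 2\pi$. One should not confuse these symbols with $r$ and $\theta$ adopted for the polar coordinates in section 6.3, because they represent different quantities in the two systems. Figure 6.6 shows the spherical coordinates as well as their associated unit vectors.
The transformations are $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, and $z = r\cos\theta$. As the position vector of $P$ in the Cartesian coordinates is $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}$, we can write
\[ \vec{l} = r\sin\theta\cos\varphi\,\hat{\imath} + r\sin\theta\sin\varphi\,\hat{\jmath} + r\cos\theta\,\hat{k} \]
One should also be aware of $\vec{l} = r\,\hat{r}$ (some books write $\vec{r} = r\,\hat{r} = \vec{l}$). Note that
\[ \begin{cases} \dfrac{\partial \vec{l}}{\partial r} = \sin\theta\cos\varphi\,\hat{\imath} + \sin\theta\sin\varphi\,\hat{\jmath} + \cos\theta\,\hat{k} & \text{along } \hat{r} \\[2mm] \dfrac{\partial \vec{l}}{\partial \theta} = r\cos\theta\cos\varphi\,\hat{\imath} + r\cos\theta\sin\varphi\,\hat{\jmath} - r\sin\theta\,\hat{k} & \text{along } \hat{\theta} \\[2mm] \dfrac{\partial \vec{l}}{\partial \varphi} = -r\sin\theta\sin\varphi\,\hat{\imath} + r\sin\theta\cos\varphi\,\hat{\jmath} & \text{along } \hat{\varphi} \end{cases} \]
Figure 6.6: The spherical coordinate system
The magnitudes of them are
\[ \begin{cases} \left|\dfrac{\partial \vec{l}}{\partial r}\right| = \sqrt{\sin^2\theta\cos^2\varphi + \sin^2\theta\sin^2\varphi + \cos^2\theta} = 1 \\[2mm] \left|\dfrac{\partial \vec{l}}{\partial \theta}\right| = \sqrt{r^2\cos^2\theta\cos^2\varphi + r^2\cos^2\theta\sin^2\varphi + r^2\sin^2\theta} = r \\[2mm] \left|\dfrac{\partial \vec{l}}{\partial \varphi}\right| = \sqrt{r^2\sin^2\theta\sin^2\varphi + r^2\sin^2\theta\cos^2\varphi} = r\sin\theta \end{cases} \]
Hence, we obtain the unit vectors
\[ \begin{cases} \hat{r} = \dfrac{\partial \vec{l}}{\partial r} \Big/ \left|\dfrac{\partial \vec{l}}{\partial r}\right| = \sin\theta\cos\varphi\,\hat{\imath} + \sin\theta\sin\varphi\,\hat{\jmath} + \cos\theta\,\hat{k} \\[2mm] \hat{\theta} = \dfrac{\partial \vec{l}}{\partial \theta} \Big/ \left|\dfrac{\partial \vec{l}}{\partial \theta}\right| = \cos\theta\cos\varphi\,\hat{\imath} + \cos\theta\sin\varphi\,\hat{\jmath} - \sin\theta\,\hat{k} \\[2mm] \hat{\varphi} = \dfrac{\partial \vec{l}}{\partial \varphi} \Big/ \left|\dfrac{\partial \vec{l}}{\partial \varphi}\right| = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \end{cases} \]
One can check that $\hat{r}$, $\hat{\theta}$ and $\hat{\varphi}$ are orthogonal to each other. $\hat{r}$, $\hat{\theta}$, and $\hat{\varphi}$ are unit vectors but they are not constant vectors, because $\hat{r}$ and $\hat{\theta}$ are functions of $\theta$ and $\varphi$, and $\hat{\varphi}$ is a function of $\varphi$. Their directions vary with the parameters.
The infinitesimal change in the position vector is
\[
\begin{aligned}
d\vec{l} &= \frac{\partial \vec{l}}{\partial r}\,dr + \frac{\partial \vec{l}}{\partial \theta}\,d\theta + \frac{\partial \vec{l}}{\partial \varphi}\,d\varphi \\
&= \left|\frac{\partial \vec{l}}{\partial r}\right| dr\,\hat{r} + \left|\frac{\partial \vec{l}}{\partial \theta}\right| d\theta\,\hat{\theta} + \left|\frac{\partial \vec{l}}{\partial \varphi}\right| d\varphi\,\hat{\varphi}
\end{aligned}
\]
Therefore, the vectorial representation of a small change using the spherical coordinates is
\[ d\vec{l} = dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi} \]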
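Example 6.13. A force $\vec{F} = 2\,\hat{\imath}$ acts on a particle such that it moves along a horizontal circular path from $(x, y, z) = (5, 0, 0)$ to $(x, y, z) = (0, 5, 0)$. The movement is in the positive quadrant. Use the spherical coordinate system to find the work done by $\vec{F}$. The unit of force is newtons and that of spatial measurement is meters.
Solution: Denote the circular path as $C$. The points of it can be described by spherical coordinates $(r, \theta, \varphi)$, where $r = 5$ and $0 \leq \varphi \leq \dfrac{\pi}{2}$. Obviously, $\theta = \dfrac{\pi}{2}$ because $C$ lies on the $xy$-plane. The required work done is
\[ W = \int_C \vec{F}\cdot d\vec{l} . \]
Figure 6.7: The circular path C of the particle
Next, we rewrite $\vec{F}$ using the spherical coordinates. Recall that
\[ \begin{cases} \hat{r} = \sin\theta\cos\varphi\,\hat{\imath} + \sin\theta\sin\varphi\,\hat{\jmath} + \cos\theta\,\hat{k} \\ \hat{\theta} = \cos\theta\cos\varphi\,\hat{\imath} + \cos\theta\sin\varphi\,\hat{\jmath} - \sin\theta\,\hat{k} \\ \hat{\varphi} = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \end{cases} \]
The components of $\hat{\imath}$ along $\hat{r}$, $\hat{\theta}$, and $\hat{\varphi}$ can be obtained by using the scalar products.
\[ \begin{cases} \hat{\imath}\cdot\hat{r} = \sin\theta\cos\varphi \\ \hat{\imath}\cdot\hat{\theta} = \cos\theta\cos\varphi \\ \hat{\imath}\cdot\hat{\varphi} = -\sin\varphi \end{cases} \]
Hence, we have
\[ \hat{\imath} = \sin\theta\cos\varphi\,\hat{r} + \cos\theta\cos\varphi\,\hat{\theta} - \sin\varphi\,\hat{\varphi} \]
Since the force lies on the $xy$-plane, we have $\theta = \dfrac{\pi}{2}$ and
\[ \hat{\imath} = \cos\varphi\,\hat{r} - \sin\varphi\,\hat{\varphi} \]
so
\[ \vec{F} = 2\,\hat{\imath} = 2\cos\varphi\,\hat{r} - 2\sin\varphi\,\hat{\varphi} \]
In spherical coordinates, the infinitesimal change of a position vector is given by $d\vec{l} = dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi}$. Substituting the values $r = 5$, $dr = 0$, $\theta = \dfrac{\pi}{2}$ and $d\theta = 0$ into the expression, we have $d\vec{l} = 5\,d\varphi\,\hat{\varphi}$. Hence
\[
\begin{aligned}
W &= \int_C \vec{F}\cdot d\vec{l} \\
&= \int_C (2\cos\varphi\,\hat{r} - 2\sin\varphi\,\hat{\varphi})\cdot(5\,d\varphi\,\hat{\varphi}) \\
&= -10\int_C \sin\varphi\,d\varphi \\
&= -10\int_0^{\pi/2} \sin\varphi\,d\varphi \\
&= 10\cos\varphi\,\Big|_0^{\pi/2} \\
&= -10\ \text{J}
\end{aligned}
\]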
∎
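The line integral of example 6.13 can be confirmed by summing F · dl over many small arc elements. The sketch below is an illustration only; the function name and the number of steps are assumptions. It works with Cartesian components of the displacement dl = r dφ φ̂ evaluated at the midpoint of each step.

```python
import math

def work_along_quarter_circle(radius=5.0, steps=100_000):
    """Sum F . dl for F = 2 i-hat along the arc from (5, 0, 0) to (0, 5, 0)."""
    force = (2.0, 0.0)
    dphi = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * dphi
        # dl = r dphi phi-hat, with phi-hat = (-sin phi, cos phi) in the xy plane
        dl = (-radius * math.sin(phi) * dphi, radius * math.cos(phi) * dphi)
        total += force[0] * dl[0] + force[1] * dl[1]
    return total

work = work_along_quarter_circle()
print(work)  # close to -10 (joules)
```

Because the force is constant, the same value also follows from F · Δl with the net displacement (-5, 5, 0), which gives a quick independent check.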
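6.8 A Revisit to Electric Field and Electric Potential
Electric field is defined as the electric force acting on a unit positive test charge. It is a vector field associated with each point in space. The work done by an electric field $\vec{E}$ to move this test charge by a displacement $d\vec{l}$ is $\vec{E}\cdot d\vec{l}$. The amount of work done relates to the change of electric potential of the test charge by
\[
\begin{aligned}
dV &= -\vec{E}\cdot d\vec{l} \\
&= -(E_x\,\hat{\imath} + E_y\,\hat{\jmath} + E_z\,\hat{k})\cdot(dx\,\hat{\imath} + dy\,\hat{\jmath} + dz\,\hat{k}) \\
&= -(E_x\,dx + E_y\,dy + E_z\,dz)
\end{aligned}
\]
However, the total differential of $V$ gives
\[ dV = \frac{\partial V}{\partial x}\,dx + \frac{\partial V}{\partial y}\,dy + \frac{\partial V}{\partial z}\,dz \tag{6.33} \]
Thus, we have
\[ \begin{cases} E_x = -\dfrac{\partial V}{\partial x} \\[2mm] E_y = -\dfrac{\partial V}{\partial y} \\[2mm] E_z = -\dfrac{\partial V}{\partial z} \end{cases} \]
Therefore, we can express the electric field in terms of the partial derivatives of the electric potential, i.e.
\[ \vec{E} = E_x\,\hat{\imath} + E_y\,\hat{\jmath} + E_z\,\hat{k} = -\left(\hat{\imath}\,\frac{\partial V}{\partial x} + \hat{\jmath}\,\frac{\partial V}{\partial y} + \hat{k}\,\frac{\partial V}{\partial z}\right) \]
Obviously,
\[ \vec{E} = -\left(\hat{\imath}\,\frac{\partial}{\partial x} + \hat{\jmath}\,\frac{\partial}{\partial y} + \hat{k}\,\frac{\partial}{\partial z}\right) V . \]
Hence, we have $\vec{E} = -\nabla V$, where $\nabla \equiv \hat{\imath}\,\dfrac{\partial}{\partial x} + \hat{\jmath}\,\dfrac{\partial}{\partial y} + \hat{k}\,\dfrac{\partial}{\partial z}$ is called the "del" operator. Thus equation 6.33 can be rewritten as $dV = \nabla V\cdot d\vec{l}$, where $\nabla V$ links up $dV$ and $d\vec{l}$. One can always compare with the case of a single-variable function. Let $y = f(x)$; then the total differential of $y$ is $dy = f'(x)\,dx$, where $f'(x)$ links up $dy$ and $dx$.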
Example 6.14. A point charge $q_0$ is fixed at the origin. Show that the work done by the electric force in moving a positive charge $q$ from point $A(r_1, \theta_1, \varphi_1)$ to point $B(r_2, \theta_2, \varphi_2)$ is independent of the path chosen.
Solution: The electric force exerted on charge $q$ by charge $q_0$ is $\vec{F} = k_e\,\dfrac{q_0 q}{r^2}\,\hat{r}$, where $\hat{r}$ is the unit vector in the radial direction and $k_e$ is the Coulomb constant. Recall that the representation of a small displacement in the spherical coordinate system is $d\vec{l} = dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi}$. Hence, the required work done is
\[
\begin{aligned}
W &= \int_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} \vec{F}\cdot d\vec{l} \\
&= \int_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} k_e\,\frac{q_0 q}{r^2}\,\hat{r}\cdot(dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi}) \\
&= \int_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} k_e\,\frac{q_0 q}{r^2}\,dr \\
&= -k_e\,\frac{q_0 q}{r}\,\Big|_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} \\
&= -k_e\,q_0 q\left(\frac{1}{r_2} - \frac{1}{r_1}\right)
\end{aligned}
\]
The answer depends only on $r_1$ and $r_2$, which indicates that the work done by the electric force in moving a charge $q$ from $A$ to $B$ is independent of the path.
∎
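Path independence can also be illustrated numerically. In the sketch below, the product k_e q0 q is set to 1, and the end points, the intermediate way-points and the number of steps are all arbitrary assumptions chosen for illustration. The work is summed along a straight path and along a three-segment detour, and both are compared with the closed-form result above.

```python
import math

K_Q0_Q = 1.0  # assumed value of k_e * q0 * q, in arbitrary units

def coulomb_force(p):
    """Radial inverse-square force k q0 q r-hat / r^2 in Cartesian components."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    s = K_Q0_Q / r ** 3
    return (s * x, s * y, s * z)

def work_along_segment(a, b, steps=20_000):
    """Midpoint-rule sum of F . dl along the straight segment from a to b."""
    step = tuple((b[i] - a[i]) / steps for i in range(3))
    total = 0.0
    for k in range(steps):
        mid = tuple(a[i] + (k + 0.5) * step[i] for i in range(3))
        f = coulomb_force(mid)
        total += sum(f[i] * step[i] for i in range(3))
    return total

def work_along_path(points):
    """Total work along a piecewise-straight path through the given points."""
    return sum(work_along_segment(p, q) for p, q in zip(points, points[1:]))

A = (1.0, 0.0, 0.0)   # r1 = 1
B = (0.0, 2.0, 2.0)   # r2 = sqrt(8)
direct = work_along_path([A, B])
detour = work_along_path([A, (3.0, 1.0, -1.0), (-1.0, 3.0, 0.5), B])
expected = -K_Q0_Q * (1 / math.sqrt(8.0) - 1 / 1.0)
print(direct, detour, expected)
```

Neither path passes through the origin, where the force is singular; any other choice of way-points that avoids the origin should give the same value.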
Chapter 7 Matrix and Transformation
7.1 Matrix
A collection of $m \times n$ numbers in rectangular form is called a matrix. Let $A$ be the matrix, where
\[ A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \tag{7.1} \]
We say that this matrix has $m$ rows and $n$ columns, or that it is an $m \times n$ matrix, and the $a_{ij}$ are the entries of the matrix. If $m = 1$, we call it a row vector; sometimes we write it as $(a_{11}, a_{12}, \cdots, a_{1n})$. If $n = 1$, we call it a column vector. If $m = n$, we call it a square matrix.
If $k$ is another number, then the multiplication of $k$ and the matrix $A$ is defined by
\[ kA = k\begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix} = \begin{pmatrix} ka_{11} & \dots & ka_{1n} \\ \vdots & & \vdots \\ ka_{m1} & \dots & ka_{mn} \end{pmatrix} , \tag{7.2} \]
that is, each entry is multiplied by $k$.
Addition (and subtraction) of two matrices $A$ and $B$ is defined only when they have the same numbers of rows and columns,
\[
\begin{aligned}
A \pm B &= \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix} \pm \begin{pmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \dots & b_{mn} \end{pmatrix} \\
&= \begin{pmatrix} a_{11} \pm b_{11} & \dots & a_{1n} \pm b_{1n} \\ \vdots & & \vdots \\ a_{m1} \pm b_{m1} & \dots & a_{mn} \pm b_{mn} \end{pmatrix}
\end{aligned}
\tag{7.3}
\]
Matrix multiplication is more complicated and it is the key to the theory of matrices. We can multiply two matrices only if the number of columns of the first equals the number of rows of the second, i.e., we can only multiply $m \times n$ and $n \times l$ matrices. Consider the multiplication of matrices $A$ and $B$; then $AB = C$ and $C$ is an $m \times l$ matrix. The $rs$-th entry of $C$ is $c_{rs} = \sum_{i=1}^{n} a_{ri} b_{is}$, where $1 \leq r \leq m$ and $1 \leq s \leq l$.
\[
\begin{aligned}
AB &= \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix} \begin{pmatrix} b_{11} & \dots & b_{1l} \\ \vdots & & \vdots \\ b_{n1} & \dots & b_{nl} \end{pmatrix} \\
&= \begin{pmatrix} \sum_{i=1}^{n} a_{1i} b_{i1} & \dots & \sum_{i=1}^{n} a_{1i} b_{il} \\ \vdots & & \vdots \\ \sum_{i=1}^{n} a_{mi} b_{i1} & \dots & \sum_{i=1}^{n} a_{mi} b_{il} \end{pmatrix} \\
&= \begin{pmatrix} c_{11} & \dots & c_{1l} \\ \vdots & & \vdots \\ c_{m1} & \dots & c_{ml} \end{pmatrix} = C
\end{aligned}
\tag{7.4}
\]
Matrix multiplication is not commutative: in general, $AB \neq BA$. If $AB = BA$, the two matrices are said to commute. Matrix multiplication is associative.
If $X$, $Y$ and $Z$ are matrices of size $m \times n$, $n \times l$ and $l \times r$, then we have $(XY)Z = X(YZ)$; the resulting matrix has size $m \times r$ and its $ij$-th entry is
\[ \sum_{s=1}^{n} \sum_{t=1}^{l} x_{is}\, y_{st}\, z_{tj} \tag{7.5} \]
If $A$ is a square matrix, then $A^2 = AA$, $A^i A^j = A^{i+j}$ and $(A^i)^j = A^{ij}$ for non-negative integers $i$ and $j$. The zeroth power of a matrix is an identity matrix, i.e. $A^0 = I$. The details of the identity matrix will be given in section 7.2. We will only be concerned with matrices of size $3 \times 3$ or smaller. The following examples demonstrate the multiplication of matrices which have different dimensions.
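(a) $\begin{pmatrix} a & b \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = a\alpha + b\beta$
(b) $\begin{pmatrix} \alpha \\ \beta \end{pmatrix} \begin{pmatrix} a & b \end{pmatrix} = \begin{pmatrix} \alpha a & \alpha b \\ \beta a & \beta b \end{pmatrix}$
(c) $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} a\alpha + b\beta \\ c\alpha + d\beta \end{pmatrix}$
(d) $\begin{pmatrix} a & b \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a\alpha + b\gamma & a\beta + b\delta \end{pmatrix}$
(e) $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a\alpha + b\gamma & a\beta + b\delta \\ c\alpha + d\gamma & c\beta + d\delta \end{pmatrix}$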
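Example 7.1. Given the following matrices,
\[ A = \begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix} , \quad B = \begin{pmatrix} 3 & -4 & 7 \\ 2 & 11 & -4 \end{pmatrix} \quad\text{and}\quad C = \begin{pmatrix} 5 \\ 3 \\ 9 \end{pmatrix} , \]
compute $3A$, $A + B$ and $AC$.
Solution:
\[ 3A = 3\begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix} = \begin{pmatrix} 6 & -3 & 15 \\ 0 & 66 & -24 \end{pmatrix} \]
\[ A + B = \begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix} + \begin{pmatrix} 3 & -4 & 7 \\ 2 & 11 & -4 \end{pmatrix} = \begin{pmatrix} 5 & -5 & 12 \\ 2 & 33 & -12 \end{pmatrix} \]
\[ AC = \begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix} \begin{pmatrix} 5 \\ 3 \\ 9 \end{pmatrix} = \begin{pmatrix} 10 - 3 + 45 \\ 0 + 66 - 72 \end{pmatrix} = \begin{pmatrix} 52 \\ -6 \end{pmatrix} \]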
∎
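Example 7.2. Illustrate the non-commutativity of matrix multiplication using matrices $A$ and $B$, where
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \]
Solution: We have
\[ AB = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \]
while
\[ BA = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \neq AB \]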
∎
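Examples 7.1 and 7.2 can be reproduced with a few lines of code. The sketch below stores matrices as nested lists and implements scalar multiplication, addition and the product rule c_rs = Σ a_ri b_is directly; the function names are illustrative choices, not part of the notes.

```python
def scale(k, a):
    """Multiply every entry of matrix a by the number k."""
    return [[k * entry for entry in row] for row in a]

def add(a, b):
    """Entry-wise sum; defined only for matrices of the same size."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same size")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def multiply(a, b):
    """Matrix product; columns of a must equal rows of b."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of the first must equal rows of the second")
    return [[sum(a[r][i] * b[i][s] for i in range(len(b)))
             for s in range(len(b[0]))]
            for r in range(len(a))]

# Matrices of example 7.1
A = [[2, -1, 5], [0, 22, -8]]
B = [[3, -4, 7], [2, 11, -4]]
C = [[5], [3], [9]]
three_A = scale(3, A)
A_plus_B = add(A, B)
AC = multiply(A, C)

# Matrices of example 7.2
P = [[1, 1], [0, 1]]
Q = [[1, 0], [1, 1]]
PQ = multiply(P, Q)
QP = multiply(Q, P)
print(three_A, A_plus_B, AC, PQ, QP)
```

Attempting `multiply(A, B)` with the matrices of example 7.1 raises an error, since a 2 × 3 matrix cannot be multiplied by another 2 × 3 matrix.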
The $n \times n$ identity matrix is the square matrix
\[ I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix} \tag{7.6} \]
which has the number "1" on the diagonal and "0" elsewhere. We can easily see that if $A$ has $n$ rows, then $I_n A = A$, and if $A$ has $n$ columns, then $A I_n = A$. Hence, $I_n$ is the "1" in matrix multiplication.
Do we have division of matrices? In general, no. We say that a square matrix $A$ has an inverse if there is another square matrix $B$ of the same size such that
\[ AB = BA = I_n \tag{7.7} \]
The inverse is usually denoted by $A^{-1}$. The zero matrix (the matrix with all entries 0), of course, does not have an inverse.
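Example 7.3. Check that the following matrix does not have an inverse,
\[ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \tag{7.8} \]
Solution: If it has an inverse, let it be
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{7.9} \]
We require that
\[
\begin{aligned}
\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \\
\begin{pmatrix} a + c & b + d \\ a + c & b + d \end{pmatrix} &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{aligned}
\tag{7.10}
\]
This is inconsistent, as $a + c$ cannot be both 0 and 1.
∎
7.2 Properties of Matrices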
(M.I) If all the entries of a matrix are zero, then the matrix is called a zero matrix (denoted by 0). If $A$ is a matrix, then
• $A + 0 = 0 + A = A$
• $A0 = 0$ and $0A = 0$
• $kA = 0$ when the scalar $k = 0$.
(M.II) An $n$-square matrix $X$ is called an identity matrix if its entries $x_{ij} = 0$ for all $i \neq j$ and $x_{ij} = 1$ for all $i = j$. An identity matrix is denoted by $I$. If $A$ is an $n$-square matrix, then
• $AI = IA = A$
• $A^0 = I$
• If $B$ is also an $n$-square matrix and $AB = BA = I$, then $B = A^{-1}$ is called the inverse of $A$, and $A = B^{-1}$ is the inverse of $B$. $A$ and $B$ are non-singular matrices.
(M.III) An $n$-square matrix $A$ is called a diagonal matrix if its entries $a_{ij} = 0$ for all $i \neq j$. It is denoted by $D$.
(M.IV) If $A$ is an $m \times n$ matrix, then $B$ is the transpose of $A$, where $b_{ij} = a_{ji}$. $B$ is an $n \times m$ matrix denoted by $A^T$.
• $(A^T)^T = A$
• The transpose of a row vector is a column vector and vice versa.
• If $A$ and $B$ have the same dimensions, then $(A + B)^T = A^T + B^T$.
• If $A$ is a $p \times q$ matrix and $B$ is a $q \times r$ matrix, then $(AB)^T = B^T A^T$. If $C$ is an $r \times s$ matrix, then $(ABC)^T = C^T B^T A^T$.
(M.V) If $A$ is an $n$-square matrix, then $\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}$ is called the trace of $A$.
• $\operatorname{tr}(kA) = k \operatorname{tr}(A)$
• $\operatorname{tr}(A) = \operatorname{tr}(A^T)$
• If $B$ is another $n$-square matrix, then $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$.
• If $A$ is an $m \times n$ matrix and $B$ is an $n \times m$ matrix, then $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.
(M.VI) If $A$ is a square matrix and $A^T = A^{-1}$, then $A$ is called an orthogonal matrix.
• $A^T A = A A^T = I$
(M.VII) $A$ is similar to $B$ if there exists an invertible matrix $P$ such that $P^{-1} A P = B$. We denote the relation as $A \sim B$.
(M.VIII) If a square matrix $X$ is symmetric, then $x_{ij} = x_{ji}$.
• $X = X^T$
• A symmetric matrix $S$ can be constructed by
\[ S = \frac{1}{2}\left[A + A^T\right] , \]
where $A$ is any square matrix.
(M.IX) If $X$ is a skew-symmetric matrix, then $x_{ij} = -x_{ji}$.
• $X = -X^T$
• A skew-symmetric matrix $S^*$ can be constructed by
\[ S^* = \frac{1}{2}\left[A - A^T\right] , \]
where $A$ is any square matrix. In fact, any square matrix can be written as $A = S + S^*$.
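Several of the properties listed above can be spot-checked on a concrete matrix. The sketch below is only an illustration: the sample matrix and the helper names are assumptions. It builds the symmetric and skew-symmetric parts of a square matrix and recombines them.

```python
def transpose(a):
    """Swap rows and columns: b_ij = a_ji."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def symmetric_part(a):
    """S = (A + A^T) / 2."""
    t = transpose(a)
    n = len(a)
    return [[(a[i][j] + t[i][j]) / 2 for j in range(n)] for i in range(n)]

def skew_part(a):
    """S* = (A - A^T) / 2."""
    t = transpose(a)
    n = len(a)
    return [[(a[i][j] - t[i][j]) / 2 for j in range(n)] for i in range(n)]

A = [[1, 2, 0], [4, 3, -1], [6, 5, 2]]  # arbitrary sample matrix
S = symmetric_part(A)
K = skew_part(A)
recombined = [[S[i][j] + K[i][j] for j in range(3)] for i in range(3)]
print(S, K, recombined)
```

The diagonal of the skew-symmetric part is always zero, so the trace of A is carried entirely by its symmetric part.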
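7.3 Determinant
The determinant of a square matrix $A$ is denoted as $|A|$ or $\det(A)$. Our discussion will be limited to matrices of order 2 and order 3 only.
Let's consider a $2 \times 2$ matrix first. If
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
then
\[ \det A = |A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]
Example 7.4. Find the determinant of $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$.
Solution:
\[ \det A = |A| = \det\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = (1)(4) - (2)(3) = -2 \]
∎
What if the matrix is of order $3 \times 3$? Let's consider a matrix $A$ having the form
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \]
The determinant $|A|$ is complicated in structure. We define two new quantities before the discussion. Consider the minor $M_{ij}$ of matrix $A$ first. It is the determinant of the matrix formed by eliminating the $i$th row and the $j$th column of $A$. Then we define the cofactor $A_{ij}$ of $A$, where
\[ A_{ij} = (-1)^{i+j} M_{ij} \]
Here is the answer for the determinant of $A$:
\[ |A| = a_{11} A_{11} + a_{12} A_{12} + a_{13} A_{13} \]
where $A_{11}$, $A_{12}$, and $A_{13}$ are the cofactors of the first row in $A$ as shown below.
\[ A_{11} = (-1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} , \quad A_{12} = (-1)^{1+2} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} , \quad A_{13} = (-1)^{1+3} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
\[ |A| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
The determinant can also be expressed using the cofactors of any row or the cofactors of any column, e.g.
\[ |A| = a_{i1} A_{i1} + a_{i2} A_{i2} + a_{i3} A_{i3} \quad\text{(row expansion) or} \]
\[ |A| = a_{1j} A_{1j} + a_{2j} A_{2j} + a_{3j} A_{3j} \quad\text{(column expansion)} \]
Example 7.5. Evaluate the determinant of $A$, where $A = \begin{pmatrix} 2 & -3 & 1 \\ 2 & 0 & -1 \\ 1 & 4 & 5 \end{pmatrix}$.
Solution:
Using the cofactors of the first row in $A$, we have $|A| = a_{11} A_{11} + a_{12} A_{12} + a_{13} A_{13}$, then
\[
\begin{aligned}
|A| &= (2)\cdot\begin{vmatrix} 0 & -1 \\ 4 & 5 \end{vmatrix} - (-3)\cdot\begin{vmatrix} 2 & -1 \\ 1 & 5 \end{vmatrix} + (1)\cdot\begin{vmatrix} 2 & 0 \\ 1 & 4 \end{vmatrix} \\
&= 2\,[0 - (-4)] + 3\,[10 - (-1)] + 1\,[8 - 0] \\
&= 49
\end{aligned}
\]
Using the cofactors of the second row in $A$, we have $|A| = a_{21} A_{21} + a_{22} A_{22} + a_{23} A_{23}$, then
\[
\begin{aligned}
|A| &= -(2)\cdot\begin{vmatrix} -3 & 1 \\ 4 & 5 \end{vmatrix} + (0)\cdot\begin{vmatrix} 2 & 1 \\ 1 & 5 \end{vmatrix} - (-1)\cdot\begin{vmatrix} 2 & -3 \\ 1 & 4 \end{vmatrix} \\
&= -2\,[-15 - (4)] + 0 + 1\,[8 - (-3)] \\
&= 49
\end{aligned}
\]
Using the cofactors of the first column in $A$, we have $|A| = a_{11} A_{11} + a_{21} A_{21} + a_{31} A_{31}$, then
\[
\begin{aligned}
|A| &= (2)\cdot\begin{vmatrix} 0 & -1 \\ 4 & 5 \end{vmatrix} - (2)\cdot\begin{vmatrix} -3 & 1 \\ 4 & 5 \end{vmatrix} + (1)\cdot\begin{vmatrix} -3 & 1 \\ 0 & -1 \end{vmatrix} \\
&= 2\,[0 - (-4)] - 2\,[-15 - 4] + 1\,[3 - 0] \\
&= 49
\end{aligned}
\]
Using the cofactors of the second column in $A$, we have $|A| = a_{12} A_{12} + a_{22} A_{22} + a_{32} A_{32}$, then
\[
\begin{aligned}
|A| &= -(-3)\cdot\begin{vmatrix} 2 & -1 \\ 1 & 5 \end{vmatrix} + (0)\cdot\begin{vmatrix} 2 & 1 \\ 1 & 5 \end{vmatrix} - (4)\cdot\begin{vmatrix} 2 & 1 \\ 2 & -1 \end{vmatrix} \\
&= 3\,[10 - (-1)] + 0 - 4\,[-2 - 2] \\
&= 49
\end{aligned}
\]
Readers can check the answers by using the cofactors of row 3 and column 3 respectively.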
∎
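The row and column expansions above can be checked with a short script. This is only a sketch: the helper names `minor`, `det`, `expand_row` and `expand_col` are illustrative choices, not notation from these notes, and indices are 0-based as is usual in Python.

```python
def minor(m, i, j):
    """Return m with row i and column j removed (0-based indices)."""
    return [[m[r][c] for c in range(len(m)) if c != j]
            for r in range(len(m)) if r != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def expand_row(m, i):
    """Sum of a_ij * A_ij over the entries of row i."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j))
               for j in range(len(m)))

def expand_col(m, j):
    """Sum of a_ij * A_ij over the entries of column j."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j))
               for i in range(len(m)))

A = [[2, -3, 1], [2, 0, -1], [1, 4, 5]]   # matrix of example 7.5
print([expand_row(A, i) for i in range(3)])  # expansion along each row
print([expand_col(A, j) for j in range(3)])  # expansion along each column
```

Every entry of both printed lists should equal the value 49 found in example 7.5, including the row 3 and column 3 expansions left to the reader.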
In order to find out which square matrix has an inverse and which does
not, the determinant of a matrix is defined. A square matrix $A$ has an
inverse such that $AA^{-1} = A^{-1}A = I$ if and only if
\[ \det A \neq 0 \tag{7.11} \]
The details of an inverse will be discussed in section 7.5. One should note
that $\det A$ is a number, not a matrix.
Theorem 7.6. For two square matrices of the same size,
\[ \det(AB) = \det A \, \det B \tag{7.12} \]
On the left hand side, inside the parentheses, it is the matrix multiplication
of $A$ and $B$. On the right hand side, it is the multiplication of two numbers.
Proof. For the $2 \times 2$ case, by example (e) after equation 7.5, we have
\[
\begin{aligned}
\det(AB) &= (a\alpha + b\gamma)(c\beta + d\delta) - (a\beta + b\delta)(c\alpha + d\gamma) \\
&= ac\alpha\beta + ad\alpha\delta + bc\beta\gamma + bd\gamma\delta - (ac\alpha\beta + ad\beta\gamma + bc\alpha\delta + bd\gamma\delta) \\
&= ad\alpha\delta + bc\beta\gamma - ad\beta\gamma - bc\alpha\delta \\
&= (ad - bc)(\alpha\delta - \beta\gamma) \\
&= \det A \, \det B
\end{aligned}
\tag{7.13}
\]
The theorem is true for square matrices of all sizes, but we are not going to
prove that.
7.4
Properties of Determinant
(D.I) If all entries in a row (or a column) are zero, then the value of the
determinant is zero.
(D.II) If any two rows (or two columns) are interchanged, the value of the
determinant changes sign.
(D.III) If any two rows (or two columns) are identical, then the value of the
determinant is zero.
(D.IV) If each entry in a row (or a column) is multiplied by the same constant k, then the value of the determinant is multiplied by k.
(D.V) Given two square matrices A and B, we have det AB = det A det B.
(D.VI) Given a square matrix $A$, we have $\det A = \det A^T$, where $A^T$ is the
transpose of $A$.
(D.VII) Given a square matrix $A$, we have $\det A^{-1} = \dfrac{1}{\det A}$, where $A^{-1}$ is the
inverse of $A$.
(D.VIII) If each of the entries in a row or column can be expressed as the
sum of two numbers, then the determinant can be expressed as the
sum of two determinants. So
\[
\begin{vmatrix} a + \alpha & x & p \\ b + \beta & y & q \\ c + \gamma & z & r \end{vmatrix}
= \begin{vmatrix} a & x & p \\ b & y & q \\ c & z & r \end{vmatrix}
+ \begin{vmatrix} \alpha & x & p \\ \beta & y & q \\ \gamma & z & r \end{vmatrix}
\]
A common application is that adding a multiple of one column (or row) to another column (or row) leaves the determinant unchanged, e.g.
\[
\begin{vmatrix} a & x & p \\ b & y & q \\ c & z & r \end{vmatrix}
= \begin{vmatrix} a \pm kx & x & p \\ b \pm ky & y & q \\ c \pm kz & z & r \end{vmatrix}
\]
Example 7.7. Evaluate $\begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -203 & 300 & 105 \end{vmatrix}$.
Solution:
\[
\begin{aligned}
\begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -203 & 300 & 105 \end{vmatrix}
&= \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -200 - 3 & 300 + 0 & 100 + 5 \end{vmatrix} \\
&= \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -200 & 300 & 100 \end{vmatrix}
+ \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix}
&& \text{(Property D.VIII)} \\
&= 100 \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -2 & 3 & 1 \end{vmatrix}
+ \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix}
&& \text{(Property D.IV)} \\
&= 0 + \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix}
&& \text{(Property D.III)} \\
&= \begin{vmatrix} -1 & 0 & 3 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix}
&& (R_1 : R_1 + R_2) \\
&= (3)\,[(-1)(5) - (3)(-3)]
&& \text{(Column expansion by using } C_2\text{)} \\
&= 12
\end{aligned}
\]
∎
Example 7.8. Factorize $\begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ y + z & z + x & x + y \end{vmatrix}$.
Solution:
\[
\begin{aligned}
\begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ y + z & z + x & x + y \end{vmatrix}
&= \begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ x + y + z & x + y + z & x + y + z \end{vmatrix}
&& (R_3 : R_3 + R_1) \\
&= (x + y + z) \begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ 1 & 1 & 1 \end{vmatrix} \\
&= (x + y + z) \begin{vmatrix} x - y & y - z & z \\ x^2 - y^2 & y^2 - z^2 & z^2 \\ 0 & 0 & 1 \end{vmatrix}
&& (C_1 : C_1 - C_2 \text{ and } C_2 : C_2 - C_3) \\
&= (x + y + z) \begin{vmatrix} x - y & y - z \\ x^2 - y^2 & y^2 - z^2 \end{vmatrix} \\
&= (x + y + z)(x - y)(y - z) \begin{vmatrix} 1 & 1 \\ x + y & y + z \end{vmatrix} \\
&= (x + y + z)(x - y)(y - z)(z - x)
\end{aligned}
\]
∎
Example 7.9. Evaluate $\begin{vmatrix} 1 & 1 & 1 \\ \ln x & \ln 2x & \ln 3x \\ \ln y & \ln 2y & \ln 3y \end{vmatrix}$.
Solution:
\[
\begin{aligned}
\begin{vmatrix} 1 & 1 & 1 \\ \ln x & \ln 2x & \ln 3x \\ \ln y & \ln 2y & \ln 3y \end{vmatrix}
&= \begin{vmatrix} 1 & 1 & 1 \\ \ln x & \ln x + \ln 2 & \ln x + \ln 3 \\ \ln y & \ln y + \ln 2 & \ln y + \ln 3 \end{vmatrix} \\
&= \begin{vmatrix} 1 & 1 & 1 \\ \ln x & \ln x & \ln x + \ln 3 \\ \ln y & \ln y & \ln y + \ln 3 \end{vmatrix}
+ \begin{vmatrix} 1 & 0 & 1 \\ \ln x & \ln 2 & \ln x + \ln 3 \\ \ln y & \ln 2 & \ln y + \ln 3 \end{vmatrix}
&& \text{(Property D.VIII)} \\
&= 0 + \begin{vmatrix} 1 & 0 & 1 \\ \ln x & \ln 2 & \ln x \\ \ln y & \ln 2 & \ln y \end{vmatrix}
+ \begin{vmatrix} 1 & 0 & 0 \\ \ln x & \ln 2 & \ln 3 \\ \ln y & \ln 2 & \ln 3 \end{vmatrix}
&& \text{(Properties D.III and D.VIII)} \\
&= 0 + 0 + \begin{vmatrix} \ln 2 & \ln 3 \\ \ln 2 & \ln 3 \end{vmatrix} \\
&= 0
\end{aligned}
\]
One may obtain the answer in an alternative way.
\[
\begin{aligned}
\begin{vmatrix} 1 & 1 & 1 \\ \ln x & \ln 2x & \ln 3x \\ \ln y & \ln 2y & \ln 3y \end{vmatrix}
&= \begin{vmatrix} 1 & 0 & 0 \\ \ln x & \ln 2x - \ln x & \ln 3x - \ln x \\ \ln y & \ln 2y - \ln y & \ln 3y - \ln y \end{vmatrix}
&& (C_2 : C_2 - C_1 \text{ and } C_3 : C_3 - C_1) \\
&= \begin{vmatrix} 1 & 0 & 0 \\ \ln x & \ln\left(\frac{2x}{x}\right) & \ln\left(\frac{3x}{x}\right) \\ \ln y & \ln\left(\frac{2y}{y}\right) & \ln\left(\frac{3y}{y}\right) \end{vmatrix} \\
&= \begin{vmatrix} 1 & 0 & 0 \\ \ln x & \ln 2 & \ln 3 \\ \ln y & \ln 2 & \ln 3 \end{vmatrix} \\
&= \begin{vmatrix} \ln 2 & \ln 3 \\ \ln 2 & \ln 3 \end{vmatrix} \\
&= 0
\end{aligned}
\]
∎
Example 7.10. Show that $\vec{u} = (2, -1, 1)$, $\vec{v} = (3, -4, -2)$, $\vec{w} = (5, -10, -8)$ are
linearly dependent vectors. In other words, any one of the three vectors
is a linear combination of the remaining two.
Solution: Let's first check whether the three vectors lie on the same plane.
If so, any one of the three vectors is a linear combination of the remaining
two. Equivalently, the parallelepiped formed by the three vectors has zero
volume if the three vectors lie on the same plane. According to section 1.11,
the volume of the parallelepiped is
\[ \vec{u} \cdot (\vec{v} \times \vec{w}) = \begin{vmatrix} 2 & -1 & 1 \\ 3 & -4 & -2 \\ 5 & -10 & -8 \end{vmatrix} = 0 \]
Therefore, the three vectors are linearly dependent. In fact, one can check
that $\vec{w} = -2\,\vec{u} + 3\,\vec{v}$.
∎
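As a quick numerical check of example 7.10, the scalar triple product and the stated linear combination can be evaluated directly. The helper names `cross` and `dot` are illustrative; the sketch uses plain tuples of integers.

```python
def cross(a, b):
    """Cross product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

u, v, w = (2, -1, 1), (3, -4, -2), (5, -10, -8)

volume = dot(u, cross(v, w))                          # u . (v x w)
combo = tuple(-2 * ui + 3 * vi for ui, vi in zip(u, v))  # -2u + 3v

print(volume)      # zero volume means the vectors are coplanar
print(combo == w)  # True if w = -2u + 3v
```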
7.5
Inverse
A square matrix $A$ is invertible if it possesses an inverse $A^{-1}$ such that
$AA^{-1} = A^{-1}A = I$. The matrix $A$ is non-singular if its inverse exists. A
singular matrix is not invertible, i.e. it does not have an inverse.
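Theorem 7.11. A square matrix has an inverse if and only if its determinant
is non-zero. Explicitly, the inverse of a $2 \times 2$ matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is
\[ A^{-1} = \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \tag{7.14} \]
Proof. If the determinant is non-zero, consider
\[
\begin{aligned}
A^{-1} A &= \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \\
&= \frac{1}{\det A} \begin{pmatrix} ad - bc & bd - bd \\ -ac + ac & -bc + ad \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{aligned}
\tag{7.15}
\]
Note that $AA^{-1} = I$ too, where $\det A$ and $\det A^{-1}$ cannot be zero.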
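Example 7.12. If $A = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}$, find $A^{-1}$.
Solution: We obtain $\det A = 2$.
\[ A^{-1} = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad \det A^{-1} = \frac{1}{2} \]
Observe that $\det A \, \det A^{-1} = (2)\left(\dfrac{1}{2}\right) = 1$ and
$\det A \, \det A^{-1} = \det(AA^{-1}) = \det I = 1$.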
∎
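Equation 7.14 translates directly into code. The following is a minimal sketch using exact fractions; the function name `inverse_2x2` is an illustrative choice, and the input is assumed to have integer (or `Fraction`) entries.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via equation 7.14; raises if det A = 0."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular: det A = 0")
    # (1 / det A) * [[d, -b], [-c, a]]
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [0, 1]]          # matrix of example 7.12
A_inv = inverse_2x2(A)
print(A_inv)

# det A^{-1} should equal 1 / det A (property D.VII)
det_inv = A_inv[0][0] * A_inv[1][1] - A_inv[0][1] * A_inv[1][0]
print(det_inv)
```

Raising an error for a zero determinant mirrors Theorem 7.11: a singular matrix has no inverse, so there is nothing sensible to return.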
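If $A$ is a $3 \times 3$ matrix having the form
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} , \]
then its inverse is given by
\[ A^{-1} = \frac{\operatorname{adj} A}{|A|} , \]
where $\operatorname{adj} A$ is called the adjoint of $A$. Readers should remember that $\operatorname{adj} A$
is formed by the transpose of the matrix of the cofactors of $A$, e.g.
\[ \operatorname{adj} A = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}^{T} \tag{7.16} \]
Recall that the transpose of a matrix $A$ is denoted as $A^T$, where
\[ A^T = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} . \]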
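Example 7.13. Let $A = \begin{pmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{pmatrix}$. Find the inverse of $A$.
Solution: The cofactors of $A$ are:
\[
\begin{aligned}
A_{11} &= + \begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18 , &
A_{12} &= - \begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2 , &
A_{13} &= + \begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4 \\
A_{21} &= - \begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11 , &
A_{22} &= + \begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14 , &
A_{23} &= - \begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5 \\
A_{31} &= + \begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10 , &
A_{32} &= - \begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4 , &
A_{33} &= + \begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8
\end{aligned}
\]
So, the determinant of $A$ is given by
\[ |A| = a_{11} A_{11} + a_{12} A_{12} + a_{13} A_{13} = (2)(-18) + (3)(2) + (-4)(4) = -46 , \]
and the adjoint of $A$ is given by
\[ \operatorname{adj} A = \begin{pmatrix} -18 & 2 & 4 \\ -11 & 14 & 5 \\ -10 & -4 & -8 \end{pmatrix}^{T}
= \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix} . \]
Hence,
\[ A^{-1} = \frac{\operatorname{adj} A}{|A|} = -\frac{1}{46} \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix} . \]
∎
7.6
Properties of an Inverse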
(IN.I) $(A^{-1})^{-1} = A$
(IN.II) $(kA)^{-1} = k^{-1} A^{-1}$, where $k$ is a non-zero scalar.
(IN.III) $(AB)^{-1} = B^{-1} A^{-1}$. Furthermore, $(ABC)^{-1} = C^{-1} B^{-1} A^{-1}$.
Proof. Since $AB\,(B^{-1} A^{-1}) = (AI)\,A^{-1} = A A^{-1} = I$ and
$(B^{-1} A^{-1})\,AB = B^{-1} (IB) = B^{-1} B = I$, so $(AB)^{-1} = B^{-1} A^{-1}$.
Next, $(ABC)^{-1} = C^{-1} (AB)^{-1} = C^{-1} B^{-1} A^{-1}$.
(IN.IV) If $A$ is invertible then $(A^{-1})^T = (A^T)^{-1}$. It means that the order in which
the transpose and the inverse are taken is not important.
Proof. $A^T (A^{-1})^T = (A^{-1} A)^T = I$, so $(A^{-1})^T = (A^T)^{-1}$.
(IN.V) If $A$ is symmetric, then $A^{-1}$ is also symmetric, i.e. $(A^{-1})^T = A^{-1}$.
Proof. $A\,(A^{-1})^T = A^T (A^{-1})^T = (A^{-1} A)^T = I$, so $(A^{-1})^T = A^{-1}$.
(IN.VI) If $A^T = A^{-1}$, then $A$ is called an orthogonal matrix and $\det A = \pm 1$.
Proof. $A A^T = I$ implies $\det(A A^T) = \det A \, \det A^T = (\det A)^2 = 1$.
Hence, $\det A = \pm 1$.
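The adjoint formula of section 7.5 and property IN.III can be checked numerically. The sketch below uses exact fractions; the helper names (`cofactor`, `inverse_3x3`, `matmul`) and the second matrix `B` are illustrative assumptions rather than part of the notes.

```python
from fractions import Fraction

def cofactor(m, i, j):
    """Cofactor A_ij of a 3x3 matrix (0-based indices)."""
    rows = [r for r in range(3) if r != i]
    cols = [c for c in range(3) if c != j]
    a, b, c, d = (m[r][k] for r in rows for k in cols)
    return (-1) ** (i + j) * (a * d - b * c)

def inverse_3x3(m):
    """Inverse via adj A / |A|, where adj A is the transposed cofactor matrix."""
    det = sum(m[0][j] * cofactor(m, 0, j) for j in range(3))
    if det == 0:
        raise ValueError("matrix is singular")
    # entry (i, j) of adj A is the cofactor A_ji (note the transpose)
    return [[Fraction(cofactor(m, j, i), det) for j in range(3)]
            for i in range(3)]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]   # matrix of example 7.13
B = [[1, 0, 2], [0, 1, 0], [0, 0, 1]]      # hypothetical second matrix

A_inv = inverse_3x3(A)
print(matmul(A, A_inv) == I3)              # A A^{-1} = I
# (AB)^{-1} = B^{-1} A^{-1}  (property IN.III)
print(inverse_3x3(matmul(A, B)) == matmul(inverse_3x3(B), A_inv))
```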
7.7
Systems of Linear Equations
A system of linear equations is
\[
\begin{cases}
a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n = b_1 \\
a_{21} x_1 + a_{22} x_2 + \ldots + a_{2n} x_n = b_2 \\
\qquad \vdots \\
a_{m1} x_1 + a_{m2} x_2 + \ldots + a_{mn} x_n = b_m
\end{cases}
\tag{7.17}
\]
where all the $a_{ij}$ and $b_i$ are known and we would like to find the $x_j$. In
matrix notation, this is
\[
\begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}
\tag{7.18}
\]
or in compact notation,
\[ A \vec{X} = \vec{B} , \tag{7.19} \]
where $\vec{X}$ and $\vec{B}$ are column vectors. From this form alone, we cannot tell
whether a solution exists or, if one exists, whether it is unique.
7.8
Cramer’s Rule
If $m = n$, the system stated in equation 7.17 has $n$ equations and $n$ unknowns.
There are some points to note.
(I) The system is inconsistent if the solution set is empty.
(II) The system is consistent if there is a non-empty solution set.
(i) The system has a unique solution.
(ii) The system has a non-unique solution (infinitely many solutions)
We define an $n$-square matrix $A_k$ by replacing the $k$-th column of $A$ by $\vec{B}$.
According to Cramer's rule, we have
\[ x_k = \frac{\det A_k}{\det A} \]
• If $\det A = 0$ and $\det A_k \neq 0$ for at least one $k$, the system has no solution.
• If $\det A \neq 0$, the system has a unique solution set.
• If $\det A = 0$ and $\det A_k = 0$ for all $k$, the system has no unique solution. For the $2 \times 2$ systems with non-zero $A$ considered below, it has infinitely many
solutions; for larger systems, it may instead be inconsistent.
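Example 7.14. Consider
\[
\begin{cases}
x + y = 0 \\
x + y = 1
\end{cases}
\tag{7.20}
\]
These two equations are inconsistent and there is no solution. The equation
set can be written in matrix form, e.g. $A \vec{X} = \vec{B}$, where
\[ A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} , \quad \vec{X} = \begin{pmatrix} x \\ y \end{pmatrix} , \quad \vec{B} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]
Using Cramer's rule, we have
\[ x = \frac{\det A_1}{\det A} = \frac{\begin{vmatrix} 0 & 1 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix}} , \quad
y = \frac{\det A_2}{\det A} = \frac{\begin{vmatrix} 1 & 0 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix}} , \]
where $\det A = 0$, $\det A_1 \neq 0$, and $\det A_2 \neq 0$.
Now, consider
\[
\begin{cases}
2x + 2y = 2 \\
x + y = 1
\end{cases}
\tag{7.21}
\]
There are solutions, and, in fact, infinitely many solutions. Let's analyze by
using Cramer's rule. The set of equations can be expressed in matrix form,
$A \vec{X} = \vec{B}$, where
\[ A = \begin{pmatrix} 2 & 2 \\ 1 & 1 \end{pmatrix} , \quad \vec{X} = \begin{pmatrix} x \\ y \end{pmatrix} , \quad \vec{B} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \]
\[ x = \frac{\det A_1}{\det A} = \frac{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}} , \quad
y = \frac{\det A_2}{\det A} = \frac{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}} , \]
where $\det A = 0$, $\det A_1 = 0$, and $\det A_2 = 0$.
∎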
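Example 7.15. Consider
\[
\begin{cases}
2x + y = 5 \\
y = 3
\end{cases}
\tag{7.22}
\]
This can be solved simply by substitution, giving $x = 1$ and $y = 3$. A more sophisticated way to solve this is by using Cramer's rule. Let
\[ A = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} , \quad \vec{X} = \begin{pmatrix} x \\ y \end{pmatrix} \quad \text{and} \quad \vec{B} = \begin{pmatrix} 5 \\ 3 \end{pmatrix} \]
We have $A \vec{X} = \vec{B}$, where
\[ x = \frac{\det A_1}{\det A} = \frac{\begin{vmatrix} 5 & 1 \\ 3 & 1 \end{vmatrix}}{\begin{vmatrix} 2 & 1 \\ 0 & 1 \end{vmatrix}} = 1 , \quad
y = \frac{\det A_2}{\det A} = \frac{\begin{vmatrix} 2 & 5 \\ 0 & 3 \end{vmatrix}}{\begin{vmatrix} 2 & 1 \\ 0 & 1 \end{vmatrix}} = 3 \]
Alternatively, $\vec{X} = A^{-1} \vec{B}$, where $A^{-1} = \dfrac{\operatorname{adj} A}{\det A}$ and $\det A = 2$.
\[
\begin{aligned}
\begin{pmatrix} x \\ y \end{pmatrix}
&= \frac{\operatorname{adj} A}{\det A} \begin{pmatrix} 5 \\ 3 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} 1 & 0 \\ -1 & 2 \end{pmatrix}^{T} \begin{pmatrix} 5 \\ 3 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 5 \\ 3 \end{pmatrix} \\
&= \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 5 \\ 3 \end{pmatrix}
= \begin{pmatrix} 1 \\ 3 \end{pmatrix}
\end{aligned}
\]
We get the same answer.
∎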
Example 7.16. A circuit is given as shown in figure 7.1. Find the current
flowing through each resistor.
Figure 7.1: The electric circuit
Solution: First, let's review two rules for electric circuits.
• Kirchhoff's rule 1: Junction rule
The algebraic sum of the currents at any junction must equal zero, i.e.
$\sum_{\text{Junction}} I = 0$. This follows from the conservation of charge.
(The currents directed into the junction are regarded as $+I$ and those
leaving as $-I$ in the equation.)
• Kirchhoff's rule 2: Loop rule (voltage rule)
The sum of the potential differences across all elements around any
closed circuit loop must be zero, i.e. $\sum_{\text{Closed loop}} \Delta V = 0$. This follows
from the conservation of energy.
Denote the currents flowing through the resistors as I1 , I2 , and I3 , see the
circuit diagram in figure 7.2. The directions of the currents are assigned
arbitrarily.
Figure 7.2: The directions of the currents are assigned arbitrarily
At junction $c$: $I_1 + I_2 - I_3 = 0$
Loop $abcda$: $10.0 - 6.0\,I_1 - 2.0\,I_3 = 0$
Loop $befcb$: $-4.0\,I_2 - 14.0 + 6.0\,I_1 - 10.0 = 0$
Rearranging the equations, we have
\[
\begin{cases}
I_1 + I_2 - I_3 = 0 \\
-6.0\,I_1 - 2.0\,I_3 = -10.0 \\
6.0\,I_1 - 4.0\,I_2 = 24.0
\end{cases}
\]
The matrix form of the system is
\[
\begin{pmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{pmatrix}
\begin{pmatrix} I_1 \\ I_2 \\ I_3 \end{pmatrix}
= \begin{pmatrix} 0 \\ -10.0 \\ 24.0 \end{pmatrix}
\]
Solving the system with Cramer's rule, we get
\[
I_1 = \frac{\begin{vmatrix} 0 & 1 & -1 \\ -10.0 & 0 & -2.0 \\ 24.0 & -4.0 & 0 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{vmatrix}} = 2.0\ \text{A} , \quad
I_2 = \frac{\begin{vmatrix} 1 & 0 & -1 \\ -6.0 & -10.0 & -2.0 \\ 6.0 & 24.0 & 0 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{vmatrix}} = -3.0\ \text{A} ,
\]
\[
I_3 = \frac{\begin{vmatrix} 1 & 1 & 0 \\ -6.0 & 0 & -10.0 \\ 6.0 & -4.0 & 24.0 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{vmatrix}} = -1.0\ \text{A}
\]
The initial guess of the directions of $I_2$ and $I_3$ is incorrect because their values
are negative. These currents actually flow in the directions opposite to those assigned.
∎
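Cramer's rule for a $3 \times 3$ system is easy to mechanise. The sketch below reproduces the circuit currents of example 7.16 with exact fractions; the function names `det3` and `cramer3` are illustrative, and the code simply refuses to proceed when $\det A = 0$ rather than classifying the system.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(A, B):
    """Solve A X = B by Cramer's rule: x_k = det A_k / det A."""
    d = det3(A)
    if d == 0:
        raise ValueError("det A = 0: the system has no unique solution")
    solution = []
    for k in range(3):
        # A_k is A with its k-th column replaced by B
        Ak = [[B[r] if c == k else A[r][c] for c in range(3)] for r in range(3)]
        solution.append(Fraction(det3(Ak), d))
    return solution

A = [[1, 1, -1], [-6, 0, -2], [6, -4, 0]]   # coefficients from example 7.16
B = [0, -10, 24]
I1, I2, I3 = cramer3(A, B)
print(I1, I2, I3)
```

For larger systems, elimination methods are generally preferred in practice because cofactor-based determinants become expensive; Cramer's rule is used here only because it is the method of this section.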
7.9
Eigenvalues and Eigenvectors
Let $A$ be an $n$-square matrix. Suppose there exists a non-zero vector $\vec{x}$ (a column vector)
such that
\[ A \vec{x} = \lambda \vec{x} , \tag{7.23} \]
where $\lambda$ is a scalar. Equation 7.23 is the eigenvalue equation. We call $\lambda$ the
eigenvalue of $A$ and $\vec{x}$ the corresponding eigenvector. The following paragraph outlines the procedure by which the eigenvalues and eigenvectors of
$A$ are obtained.
We arrange equation 7.23 as
\[ (\lambda I - A)\,\vec{x} = \vec{0} \tag{7.24} \]
For a non-trivial solution of $\vec{x}$, we have
\[ \det(\lambda I - A) = 0 \tag{7.25} \]
The LHS of equation 7.25 is called the characteristic polynomial. Equation 7.25 is the characteristic equation of degree $n$ of $A$. Solving this
equation, we obtain the eigenvalues of $A$. Substituting these values into equation 7.24, we obtain the corresponding eigenvectors.
Remark: A set of vectors $\vec{x}_1, \vec{x}_2, \cdots, \vec{x}_n$ is linearly independent if and
only if $\lambda_1 \vec{x}_1 + \lambda_2 \vec{x}_2 + \cdots + \lambda_n \vec{x}_n = \vec{0}$ implies $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$. Alternatively,
a set of vectors is said to be linearly dependent if one of the vectors in the
set can be expressed as a linear combination of the other vectors in the same set.
Any set containing the zero vector is linearly dependent.
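Example 7.17. If $A = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$, find the eigenvalues and eigenvectors of $A$.
Solution: Let $\vec{x}$ be an eigenvector of $A$.
\[
\begin{aligned}
A \vec{x} &= \lambda \vec{x} \\
(\lambda I - A)\,\vec{x} &= \vec{0} \\
\begin{pmatrix} \lambda - 1 & -2 \\ 0 & \lambda - 4 \end{pmatrix} \vec{x} &= \vec{0}
\end{aligned}
\]
For a non-trivial solution of $\vec{x}$, we have
\[
\begin{aligned}
\det \begin{pmatrix} \lambda - 1 & -2 \\ 0 & \lambda - 4 \end{pmatrix} &= 0 \\
(\lambda - 1)(\lambda - 4) &= 0 \\
\lambda &= 1 \text{ or } 4
\end{aligned}
\]
Let $\lambda_1 = 1$, $\lambda_2 = 4$ and $\vec{x} = \begin{pmatrix} u \\ v \end{pmatrix}$. Recall that
\[ (\lambda I - A)\,\vec{x} = \vec{0} \]
We have
\[ \begin{pmatrix} \lambda - 1 & -2 \\ 0 & \lambda - 4 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{7.26} \]
For $\lambda_1 = 1$, equation 7.26 gives
\[ \begin{pmatrix} 0 & -2 \\ 0 & -3 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
We get $v = 0$. In order to obtain the eigenvector corresponding to $\lambda_1$, we
set $u = 1$. Then $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.
For $\lambda_2 = 4$, equation 7.26 gives
\[ \begin{pmatrix} 3 & -2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
We get $3u - 2v = 0$. In order to obtain the eigenvector which corresponds
to $\lambda_2$, we set $u = 2$ and $v = 3$. Then $\vec{x}_2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.
The eigenvectors of $A$ are independent and they are $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and
$\vec{x}_2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.
Remark: One can check that $A \vec{x}_1 = \lambda_1 \vec{x}_1$ and $A \vec{x}_2 = \lambda_2 \vec{x}_2$.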
∎
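For a $2 \times 2$ matrix, the characteristic equation 7.25 expands to $\lambda^2 - (a + d)\lambda + (ad - bc) = 0$, which can be solved with the quadratic formula. The sketch below does this for example 7.17 and then checks $A\vec{x} = \lambda\vec{x}$ for the eigenvectors found above; the function names are illustrative, and the code assumes the eigenvalues are real.

```python
import math

def eigenvalues_2x2(m):
    """Real roots of lambda^2 - (trace) lambda + det = 0 for a 2x2 matrix."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    disc = trace ** 2 - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)

def apply(m, x):
    """Matrix-vector product for a 2x2 matrix."""
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])

A = [[1, 2], [0, 4]]
lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)

x1, x2 = (1, 0), (2, 3)       # eigenvectors from example 7.17
print(apply(A, x1) == (lam1 * x1[0], lam1 * x1[1]))
print(apply(A, x2) == (lam2 * x2[0], lam2 * x2[1]))
```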
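Example 7.18. If $A = \begin{pmatrix} -1 & 0 & 1 \\ 3 & 0 & -3 \\ 1 & 0 & -1 \end{pmatrix}$, find the eigenvalues and eigenvectors
of $A$.
Solution: Let $\vec{x}$ be an eigenvector of $A$.
\[
\begin{aligned}
A \vec{x} &= \lambda \vec{x} \\
(\lambda I - A)\,\vec{x} &= \vec{0} \\
\begin{pmatrix} \lambda + 1 & 0 & -1 \\ -3 & \lambda & 3 \\ -1 & 0 & \lambda + 1 \end{pmatrix} \vec{x} &= \vec{0}
\end{aligned}
\tag{7.27}
\]
For a non-trivial solution of $\vec{x}$, we have
\[
\begin{aligned}
\det \begin{pmatrix} \lambda + 1 & 0 & -1 \\ -3 & \lambda & 3 \\ -1 & 0 & \lambda + 1 \end{pmatrix} &= 0 \\
\lambda^2 (\lambda + 2) &= 0 \\
\lambda &= 0 \text{ or } -2
\end{aligned}
\]
Let $\lambda_1 = \lambda_2 = 0$ (equal roots), $\lambda_3 = -2$ and $\vec{x} = \begin{pmatrix} u \\ v \\ w \end{pmatrix}$. Equation 7.27 becomes
\[ \begin{pmatrix} \lambda + 1 & 0 & -1 \\ -3 & \lambda & 3 \\ -1 & 0 & \lambda + 1 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
For $\lambda_1 = 0$, we have
\[ \begin{pmatrix} 1 & 0 & -1 \\ -3 & 0 & 3 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
so
\[
\begin{cases}
u - w = 0 \\
-3u + 3w = 0 \\
-u + w = 0
\end{cases}
\]
which gives
\[ u - w = 0 \]
We set $u = 1$, $v = 0$, and $w = 1$, then $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$.
There is another independent eigenvector which corresponds to this eigenvalue
($\lambda_1 = \lambda_2 = 0$). Let's set $u = 0$, $v = 1$, and $w = 0$, then $\vec{x}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$.
For $\lambda_3 = -2$, we have
\[ \begin{pmatrix} -1 & 0 & -1 \\ -3 & -2 & 3 \\ -1 & 0 & -1 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
so
\[
\begin{cases}
-u - w = 0 \\
-3u - 2v + 3w = 0 \\
-u - w = 0
\end{cases}
\]
which gives
\[
\begin{cases}
u + w = 0 \\
v - 3w = 0
\end{cases}
\]
We set $u = 1$, $v = -3$, and $w = -1$, then $\vec{x}_3 = \begin{pmatrix} 1 \\ -3 \\ -1 \end{pmatrix}$.
To conclude, the three independent eigenvectors are $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$, $\vec{x}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$,
and $\vec{x}_3 = \begin{pmatrix} 1 \\ -3 \\ -1 \end{pmatrix}$.
∎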
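Example 7.19. If $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 2 & -5 & 4 \end{pmatrix}$, find the eigenvalues and eigenvectors
of $A$.
Solution: Let $\vec{x}$ be an eigenvector of $A$.
\[
\begin{aligned}
A \vec{x} &= \lambda \vec{x} \\
(\lambda I - A)\,\vec{x} &= \vec{0} \\
\begin{pmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -2 & 5 & \lambda - 4 \end{pmatrix} \vec{x} &= \vec{0}
\end{aligned}
\tag{7.28}
\]
For a non-trivial solution of $\vec{x}$, we have
\[
\begin{aligned}
\det \begin{pmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -2 & 5 & \lambda - 4 \end{pmatrix} &= 0 \\
\lambda^3 - 4\lambda^2 + 5\lambda - 2 &= 0 \\
(\lambda - 1)^2 (\lambda - 2) &= 0 \\
\lambda &= 1 \text{ or } 2
\end{aligned}
\]
Let $\lambda_1 = \lambda_2 = 1$ (equal roots), $\lambda_3 = 2$ and $\vec{x} = \begin{pmatrix} u \\ v \\ w \end{pmatrix}$. Equation 7.28 becomes
\[ \begin{pmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -2 & 5 & \lambda - 4 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
For $\lambda_1 = 1$, we have
\[ \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ -2 & 5 & -3 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
so
\[
\begin{cases}
u - v = 0 \\
v - w = 0 \\
-2u + 5v - 3w = 0
\end{cases}
\]
which gives
\[
\begin{cases}
u - v = 0 \\
v - w = 0
\end{cases}
\]
We set $u = 1$, $v = 1$, and $w = 1$, then $\vec{x}_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.
Since $u = v = w$, only one parameter is free, so every other solution is a multiple of $\vec{x}_1$.
Next, we set $u = 0$, then $v = 0$, and $w = 0$, so $\vec{x}_2 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$. We should note
that $\vec{x}_2$ is a trivial solution of the eigenvalue equation and it is not regarded
as an eigenvector of $A$. Moreover, $\vec{x}_1$ and $\vec{x}_2$ are dependent vectors because
$\vec{x}_2 = 0\,\vec{x}_1$.
For $\lambda_3 = 2$, we have
\[ \begin{pmatrix} 2 & -1 & 0 \\ 0 & 2 & -1 \\ -2 & 5 & -2 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} , \]
so
\[
\begin{cases}
2u - v = 0 \\
2v - w = 0 \\
-2u + 5v - 2w = 0
\end{cases}
\]
which gives
\[
\begin{cases}
2u - v = 0 \\
2v - w = 0
\end{cases}
\]
We set $u = 1$, $v = 2$, and $w = 4$, then $\vec{x}_3 = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}$.
Hence, $A$ has two independent eigenvectors, $\vec{x}_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ and $\vec{x}_3 = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}$.
∎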
Example 7.20. If matrix $A$ is similar to matrix $B$, show that they have the
same set of eigenvalues.
Solution: If $A$ is similar to $B$, then there exists an invertible matrix $P$
such that $B = P^{-1} A P$. Let $\lambda$ and $\vec{x}$ be an eigenvalue and the corresponding eigenvector of $A$
respectively. We have $B P^{-1} = P^{-1} A$ and $A \vec{x} = \lambda \vec{x}$. So
\[
\begin{aligned}
(B P^{-1})\,\vec{x} &= (P^{-1} A)\,\vec{x} \\
B\,(P^{-1} \vec{x}) &= P^{-1} (\lambda \vec{x}) \\
B\,(P^{-1} \vec{x}) &= \lambda\,(P^{-1} \vec{x})
\end{aligned}
\]
Therefore, $\lambda$ is an eigenvalue of $B$ and $P^{-1} \vec{x}$ is the corresponding eigenvector.
$A$ and $B$ share the same set of eigenvalues.
7.10
Diagonalization
Given an $n$-square matrix $A$, diagonalization of $A$ is very useful when one
wants to compute $A^m$, where $m$ is a non-negative integer. The procedure to
diagonalize a matrix is shown below.
(I) Find the eigenvalues $\lambda_i$ and eigenvectors $\vec{x}_i$ of $A$, where $i = 1, 2, \cdots, n$.
The matrix $A$ can be diagonalized if $A$ has $n$ independent eigenvectors.
(II) Construct the $n$-square matrix $P$ using the eigenvectors $\vec{x}_1, \vec{x}_2, \cdots, \vec{x}_n$
as its columns.
(III) Find the inverse of $P$.
(IV) A diagonal matrix $D$ is formed, where $D = P^{-1} A P$ and the diagonal
entries are $\lambda_1, \lambda_2, \cdots, \lambda_n$.
Knowing that $D^m = P^{-1} A^m P$, we have $A^m = P D^m P^{-1}$.
Example 7.21. Given a $3 \times 3$ real matrix $A$, show that $D = P^{-1} A P$ is a diagonal matrix if the columns of $P$ are formed by the eigenvectors of $A$.
Solution: Let $\lambda_1$, $\lambda_2$ and $\lambda_3$ be the eigenvalues of $A$. Let the corresponding
independent eigenvectors be $\vec{x}_1$, $\vec{x}_2$, and $\vec{x}_3$. As a reminder, they are column vectors.
\[
\begin{aligned}
A \vec{x}_1 &= \lambda_1 \vec{x}_1 \\
A \vec{x}_2 &= \lambda_2 \vec{x}_2 \\
A \vec{x}_3 &= \lambda_3 \vec{x}_3
\end{aligned}
\]
Construct $P$ by using the eigenvectors, i.e. $P = (\vec{x}_1, \vec{x}_2, \vec{x}_3)$, then
\[
\begin{aligned}
A P &= A\,(\vec{x}_1, \vec{x}_2, \vec{x}_3) \\
&= (A \vec{x}_1, A \vec{x}_2, A \vec{x}_3) \\
&= (\lambda_1 \vec{x}_1, \lambda_2 \vec{x}_2, \lambda_3 \vec{x}_3) \\
&= (\vec{x}_1, \vec{x}_2, \vec{x}_3) \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \\
&= P D
\end{aligned}
\]
Hence, $D = P^{-1} A P$.
Remark: $P = (\vec{x}_1, \vec{x}_2, \vec{x}_3)$ is written as a row whose entries are the three column
vectors $\vec{x}_1$, $\vec{x}_2$, and $\vec{x}_3$. So, $P$ is a $3 \times 3$ matrix.
∎
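Example 7.22. If $A = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$, diagonalize $A$ then compute $A^5$.
Solution: Referring to the answers in example 7.17, we get the eigenvalues and
eigenvectors of $A$. Eigenvalues: $\lambda = 1$ and $4$, and eigenvectors: $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
and $\vec{x}_2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.
Now, we construct the matrix $P$ using the eigenvectors
\[ P = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} . \]
Then we compute the inverse of $P$
\[ P^{-1} = \frac{\operatorname{adj} P}{\det P} = \frac{1}{3} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} , \]
where $\det P = \begin{vmatrix} 1 & 2 \\ 0 & 3 \end{vmatrix} = 3$.
A diagonal matrix is obtained as
\[ D = P^{-1} A P = \frac{1}{3} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix} \]
In fact, we can write down $D$ without computing the product, because
\[ D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix} \]
So,
\[ D^5 = \begin{pmatrix} 1 & 0 \\ 0 & 1024 \end{pmatrix} \]
Now, we compute the power of $A$. As stated above, $D^5 = P^{-1} A^5 P$,
so we have $A^5 = P D^5 P^{-1}$, which gives
\[ A^5 = \frac{1}{3} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1024 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 682 \\ 0 & 1024 \end{pmatrix} \]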
∎
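The identity $A^m = P D^m P^{-1}$ can be compared against repeated multiplication. This is a small sketch with exact fractions for example 7.22; the variable names `A_m` and `direct` and the helper `matmul` are illustrative.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [0, 4]]
P = [[1, 2], [0, 3]]                                 # eigenvectors as columns
P_inv = [[Fraction(1), Fraction(-2, 3)],
         [Fraction(0), Fraction(1, 3)]]              # (1/3) [[3, -2], [0, 1]]

m = 5
D_m = [[1 ** m, 0], [0, 4 ** m]]                     # powers of the eigenvalues
A_m = matmul(matmul(P, D_m), P_inv)                  # P D^m P^{-1}

# Reference value: multiply A by itself m times
direct = [[1, 0], [0, 1]]
for _ in range(m):
    direct = matmul(direct, A)

print(A_m)
print(A_m == direct)
```

The benefit of the diagonal route is that $D^m$ only needs the $m$-th powers of the eigenvalues, whereas the direct route needs $m$ matrix products.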
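Example 7.23. If $A = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$, find $e^A$.
Solution: Making use of the diagonal matrix $D$ obtained in example 7.22 and
the fact that $A^0 = I$ and $e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$, we have
\[
\begin{aligned}
P^{-1} e^A P &= P^{-1} \left( I + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots \right) P \\
&= I + \frac{P^{-1} A P}{1!} + \frac{P^{-1} A^2 P}{2!} + \frac{P^{-1} A^3 P}{3!} + \cdots \\
&= I + \frac{(P^{-1} A P)}{1!} + \frac{(P^{-1} A P)^2}{2!} + \frac{(P^{-1} A P)^3}{3!} + \cdots \\
&= e^{P^{-1} A P} \\
&= e^D
\end{aligned}
\]
We have applied the fact that
\[
\begin{aligned}
P^{-1} A^2 P &= (P^{-1} A P)(P^{-1} A P) = (P^{-1} A P)^2 \\
P^{-1} A^3 P &= (P^{-1} A P)(P^{-1} A P)(P^{-1} A P) = (P^{-1} A P)^3 \\
&\;\;\vdots
\end{aligned}
\]
Hence,
\[
\begin{aligned}
e^A &= P e^D P^{-1} \\
&= P \left( I + \frac{D}{1!} + \frac{D^2}{2!} + \frac{D^3}{3!} + \cdots \right) P^{-1} \\
&= P \begin{pmatrix} e^1 & 0 \\ 0 & e^4 \end{pmatrix} P^{-1} \\
&= \frac{1}{3} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} e^1 & 0 \\ 0 & e^4 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} \\
&= \frac{1}{3} \begin{pmatrix} 3e & -2e + 2e^4 \\ 0 & 3e^4 \end{pmatrix} \\
&= \begin{pmatrix} e & -\frac{2e}{3} + \frac{2e^4}{3} \\ 0 & e^4 \end{pmatrix}
\end{aligned}
\]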
∎
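The closed form for $e^A$ can be compared with a truncated power series. The sketch below sums the first 40 terms of $I + A/1! + A^2/2! + \cdots$ in floating point; the cut-off of 40 terms and the function name `expm_series` are assumptions made for illustration, not a recommended general-purpose algorithm.

```python
import math

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(A, terms=40):
    """Truncated series I + A/1! + A^2/2! + ... (fine for small matrices)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]                                # A^0 / 0!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]     # A^k / k!
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

A = [[1, 2], [0, 4]]
e, e4 = math.e, math.exp(4)
closed = [[e, (2 * e4 - 2 * e) / 3], [0.0, e4]]   # result of example 7.23

series = expm_series(A)
print(series)
print(closed)
```

The two printed matrices should agree to within floating-point rounding.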
7.11
Rotation of Axes
There are two coordinate systems $XY$ and $X'Y'$. If the two coordinate
systems are related by a rotation, figure 7.3, what will be the relation between
the coordinates?
Figure 7.3: Two coordinate systems related by a rotation
Let the coordinates of the same point be $(x, y)$ and $(x', y')$ in the two
systems. It is easy to see that if the polar coordinates of the point are $(r, \theta)$
in the original system, then they are $(r, \theta - \alpha)$ in the new system. Hence, we
have
\[
\begin{aligned}
x' &= r \cos(\theta - \alpha) \\
&= r (\cos\theta \cos\alpha + \sin\theta \sin\alpha) \\
&= x \cos\alpha + y \sin\alpha ,
\end{aligned}
\tag{7.29}
\]
\[
\begin{aligned}
y' &= r \sin(\theta - \alpha) \\
&= r (\sin\theta \cos\alpha - \cos\theta \sin\alpha) \\
&= y \cos\alpha - x \sin\alpha \\
&= -x \sin\alpha + y \cos\alpha ,
\end{aligned}
\tag{7.30}
\]
where we have used equations 5.20 and 5.18.
In matrix notation, these are
\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{7.31} \]
The inverse transformation is
\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} \tag{7.32} \]
This could be obtained by reversing the sign of $\alpha$ or taking the inverse of the
matrix. To write down the relation explicitly, we have
\[ x = x' \cos\alpha - y' \sin\alpha , \tag{7.33} \]
\[ y = x' \sin\alpha + y' \cos\alpha \tag{7.34} \]
Example 7.24. The equation of a circle with center at the origin is $x^2 + y^2 = R^2$. In the rotated system, we have
\[
\begin{aligned}
(x' \cos\alpha - y' \sin\alpha)^2 + (x' \sin\alpha + y' \cos\alpha)^2 &= R^2 \\
x'^2 \cos^2\alpha + y'^2 \sin^2\alpha + x'^2 \sin^2\alpha + y'^2 \cos^2\alpha &= R^2 \\
x'^2 + y'^2 &= R^2 ,
\end{aligned}
\]
as expected.
∎
Example 7.25. Find the equation of the line pair $4x^2 - 11xy + 6y^2 = 0$
when the axes are rotated counterclockwise through the acute angle whose
tangent is $\frac{4}{3}$.
Solution: If $\alpha$ is the angle of rotation, $\tan\alpha = \frac{4}{3}$ gives $\sin\alpha = \frac{4}{5}$ and $\cos\alpha = \frac{3}{5}$.
The new equation is
\[
\begin{aligned}
4 \left( \tfrac{3}{5} x' - \tfrac{4}{5} y' \right)^2 - 11 \left( \tfrac{3}{5} x' - \tfrac{4}{5} y' \right) \left( \tfrac{4}{5} x' + \tfrac{3}{5} y' \right) + 6 \left( \tfrac{4}{5} x' + \tfrac{3}{5} y' \right)^2 &= 0 \\
4 (3x' - 4y')^2 - 11 (3x' - 4y')(4x' + 3y') + 6 (4x' + 3y')^2 &= 0 \\
125 x' y' + 250 y'^2 &= 0 \\
x' y' + 2 y'^2 &= 0
\end{aligned}
\]
∎
Example 7.26. The equation of a parabola is $x^2 = 4ay$. Consider a simple
counterclockwise rotation of axes by $\dfrac{\pi}{2}$. By equations 7.33 and 7.34, we
simply have $x = -y'$ and $y = x'$. In the new coordinates, the equation of the
parabola becomes
\[ y'^2 = 4a x' \tag{7.35} \]
If the rotation angle is $\dfrac{\pi}{4}$, we have $x = \dfrac{x' - y'}{\sqrt{2}}$ and $y = \dfrac{x' + y'}{\sqrt{2}}$, and the new
equation becomes
\[
\begin{aligned}
\left( \frac{x' - y'}{\sqrt{2}} \right)^2 &= 4a \left( \frac{x' + y'}{\sqrt{2}} \right) \\
x'^2 - 2x'y' + y'^2 &= 4\sqrt{2}\,a\,x' + 4\sqrt{2}\,a\,y' \\
x'^2 - 2x'y' + y'^2 - 4\sqrt{2}\,a\,x' - 4\sqrt{2}\,a\,y' &= 0
\end{aligned}
\]
∎
Example 7.27. The ellipse in figure 7.4 is $97x^2 + 192xy + 153y^2 = 225$ when
using the axes $x$ and $y$ as reference. There is another set of axes, $x'$ and
$y'$, obtained by rotating the axes $x$ and $y$ counterclockwise by an acute angle $\alpha$,
with $\sin\alpha = 4/5$ (so $\cos\alpha = 3/5$). Show that the equation of the ellipse has the standard form
when using the $x'$ and $y'$ coordinates.
Figure 7.4: An ellipse and the rotation of axes
Solution: Using equations 7.33 and 7.34, we have
\[
\begin{cases}
x = \dfrac{3x'}{5} - \dfrac{4y'}{5} \\[2ex]
y = \dfrac{4x'}{5} + \dfrac{3y'}{5}
\end{cases}
\]
Then, we rewrite the equation of the ellipse $97x^2 + 192xy + 153y^2 = 225$ as
\[
\begin{aligned}
97 \left( \frac{3x'}{5} - \frac{4y'}{5} \right)^2 + 192 \left( \frac{3x'}{5} - \frac{4y'}{5} \right) \left( \frac{4x'}{5} + \frac{3y'}{5} \right) + 153 \left( \frac{4x'}{5} + \frac{3y'}{5} \right)^2 &= 225 \\
97 (3x' - 4y')^2 + 192 (3x' - 4y')(4x' + 3y') + 153 (4x' + 3y')^2 &= 5625 \\
5625 x'^2 + 625 y'^2 &= 5625 \\
x'^2 + \frac{y'^2}{9} &= 1
\end{aligned}
\]
The above representation is the standard form of an ellipse.
∎
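Substituting equations 7.33 and 7.34 into a quadratic form $Ax^2 + Bxy + Cy^2$ and collecting terms gives new coefficients $A' = A\cos^2\alpha + B\cos\alpha\sin\alpha + C\sin^2\alpha$, $B' = 2(C - A)\cos\alpha\sin\alpha + B(\cos^2\alpha - \sin^2\alpha)$ and $C' = A\sin^2\alpha - B\cos\alpha\sin\alpha + C\cos^2\alpha$. The sketch below applies these formulas to examples 7.25 and 7.27 with exact fractions; the function name `rotate_quadratic` is an illustrative choice.

```python
from fractions import Fraction

def rotate_quadratic(A, B, C, cos_a, sin_a):
    """Coefficients of A x^2 + B xy + C y^2 after substituting
    x = x' cos(a) - y' sin(a), y = x' sin(a) + y' cos(a)."""
    c, s = cos_a, sin_a
    A2 = A * c * c + B * c * s + C * s * s
    B2 = 2 * (C - A) * c * s + B * (c * c - s * s)
    C2 = A * s * s - B * c * s + C * c * c
    return A2, B2, C2

c, s = Fraction(3, 5), Fraction(4, 5)       # cos and sin of the rotation angle

ellipse = rotate_quadratic(97, 192, 153, c, s)    # example 7.27
line_pair = rotate_quadratic(4, -11, 6, c, s)     # example 7.25
print(ellipse)     # coefficients of x'^2, x'y', y'^2
print(line_pair)
```

A vanishing $x'y'$ coefficient for the ellipse confirms that the new axes are aligned with its principal axes; dividing the remaining coefficients by the right-hand side 225 gives the standard form.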
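The above discussion focuses on the rotation of axes while the point of interest
$P$ is fixed. Now, we consider the transformation of the coordinates if
point $P$ rotates but the axes are fixed. Figure 7.5 shows a point $P$ which
rotates counterclockwise by an angle $\alpha$.
Figure 7.5: The coordinates of a rotating point
Let the polar coordinates of $P$ be $(r, \beta)$, so that $x = r\cos\beta$ and $y = r\sin\beta$.
The new position of the point is $P'$ and its new coordinates are
\[
\begin{aligned}
x' &= r \cos(\alpha + \beta) \\
&= r (\cos\alpha \cos\beta - \sin\alpha \sin\beta) \\
&= x \cos\alpha - y \sin\alpha ,
\end{aligned}
\tag{7.36}
\]
\[
\begin{aligned}
y' &= r \sin(\alpha + \beta) \\
&= r (\sin\alpha \cos\beta + \cos\alpha \sin\beta) \\
&= x \sin\alpha + y \cos\alpha
\end{aligned}
\tag{7.37}
\]
The matrix form of equations 7.36 and 7.37 is
\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{7.38} \]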
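The discussion can be extended to 3-dimensional space and it is clear that
equation 7.38 represents the rotation of point $P$ about the $z$-axis, with $P$
always lying on the $xy$-plane ($z = 0$). Thus we can write
\[ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \tag{7.39} \]
If $P$ rotates about the $x$-axis, we have
\[ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \tag{7.40} \]
The result appears after using the following conversions in equation 7.38.
\[ \begin{cases} x' \to y' \\ y' \to z' \end{cases} \quad \text{and} \quad \begin{cases} x \to y \\ y \to z \end{cases} \]
Recall that the convention is governed by the right hand rule: $\hat{k} = \hat{\imath} \times \hat{\jmath}$ and
$\hat{\imath} = \hat{\jmath} \times \hat{k}$. The correspondence of these equations is $\hat{k} \to \hat{\imath}$, $\hat{\imath} \to \hat{\jmath}$ and $\hat{\jmath} \to \hat{k}$.
See figure 7.6.
Figure 7.6: The axes of rotation
If $P$ rotates about the $y$-axis, we have
\[ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\alpha & 0 & \sin\alpha \\ 0 & 1 & 0 \\ -\sin\alpha & 0 & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \tag{7.41} \]
The result appears after using the following conversions in equation 7.38.
\[ \begin{cases} x' \to z' \\ y' \to x' \end{cases} \quad \text{and} \quad \begin{cases} x \to z \\ y \to x \end{cases} \]
Similar arguments follow from the right hand rule: $\hat{k} = \hat{\imath} \times \hat{\jmath}$ and $\hat{\jmath} = \hat{k} \times \hat{\imath}$. The
correspondence of these equations is $\hat{k} \to \hat{\jmath}$, $\hat{\imath} \to \hat{k}$ and $\hat{\jmath} \to \hat{\imath}$. See figure 7.6.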
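7.12
Special Matrices
Special matrices are widely used in physics. Some of them are shown below.
(i) Orthogonal matrix: $A^T = A^{-1}$
We note that $A A^T = A^T A = I$ and $\det A = \pm 1$. A rotation matrix is an
example of an orthogonal matrix.
(ii) Hermitian matrix: $A^\dagger = A$
It is widely used in quantum mechanics, where $A^\dagger$ stands for the transpose of the complex conjugate of $A$, i.e. $A^\dagger = (A^*)^T$.
(iii) Unitary matrix: $U U^\dagger = U^\dagger U = I$
The difference between a unitary matrix and an orthogonal matrix is that
the unitary matrix involves the complex conjugate, but the orthogonal
matrix does not.
Example 7.28. Show that the eigenvalue of a Hermitian matrix is real.
Solution: If $A$ is a Hermitian matrix, then $A = A^\dagger$. Let $\lambda$ and $\vec{x}$ be the
eigenvalue and eigenvector respectively such that $A \vec{x} = \lambda \vec{x}$. Taking the
transpose of the complex conjugate of both sides, we have
\[
\begin{aligned}
(A \vec{x})^\dagger &= \lambda^* \vec{x}^\dagger && (\lambda^* \text{ represents the complex conjugate of } \lambda) \\
\vec{x}^\dagger A^\dagger &= \lambda^* \vec{x}^\dagger \\
\vec{x}^\dagger A &= \lambda^* \vec{x}^\dagger && (A = A^\dagger)
\end{aligned}
\]
Multiplying both sides from the right by $\vec{x}$, we get
\[
\begin{aligned}
\vec{x}^\dagger A \vec{x} &= \lambda^* \vec{x}^\dagger \vec{x} \\
\vec{x}^\dagger (\lambda \vec{x}) &= \lambda^* \vec{x}^\dagger \vec{x} \\
\lambda\, \vec{x}^\dagger \vec{x} &= \lambda^* \vec{x}^\dagger \vec{x} \\
\lambda &= \lambda^*
\end{aligned}
\]
where the last step uses $\vec{x}^\dagger \vec{x} \neq 0$ for a non-zero eigenvector.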
∎
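The result of example 7.28 can be illustrated with a small Hermitian matrix. The matrix `H` below is a hypothetical example chosen for this sketch, and the eigenvalues are obtained from the $2 \times 2$ characteristic equation $\lambda^2 - (a + d)\lambda + (ad - bc) = 0$ using complex arithmetic.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of lambda^2 - (trace) lambda + det = 0, allowing complex entries."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

# Hypothetical Hermitian matrix: equal to its own conjugate transpose
H = [[2, 1 - 1j], [1 + 1j, 3]]
is_hermitian = all(H[i][j] == H[j][i].conjugate()
                   for i in range(2) for j in range(2))

lams = eigenvalues_2x2(H)
print(is_hermitian)
print(lams)   # the imaginary parts should vanish
```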
7.13
Vector Spaces
A vector space V over a field K consists of a set on which two operations
(called addition and scalar multiplication, respectively) are defined so that
for each pair of elements x, y in V there is a unique element x + y (a sum) in
V , and for each element a in K and each element x in V there is a unique
element ax (a product) in V , such that the following conditions hold.
(VS.I) For all x, y in V , x + y = y + x (commutative law).
(VS.II) For all x, y, z in V , (x + y) + z = x + (y + z) (associative law).
(VS.III) There exists an element in V denoted by 0 such that x + 0 = x for
each x in V . The vector 0 is called the zero vector of V .
(VS.IV) For each element x in V there exists an element y in V such that
x + y = 0. The vector y is called the additive inverse of x and is
denoted by −x.
(VS.V) For each element x in V and the unit scalar 1 in K, 1x = x.
(VS.VI) For each pair of elements a, b in K and each element x in V ,
(ab) x = a (bx).
(VS.VII) For each element a in K and each pair of elements x, y in V ,
a (x + y) = a x + a y (distributive law).
(VS.VIII) For each pair of elements a, b in K and each element x in V ,
(a + b) x = a x + b x (distributive law).
Example 7.29. Let V be the set of all m × n matrices with entries from an
arbitrary field K. Illustrate that V is a vector space over K with respect to
the operations of matrix addition and scalar multiplication.
Solution: One can illustrate easily that V satisfies properties VS.I to VS.VIII.
For example, V contains the zero matrix 0, and for each element in V , its
additive inverse is also in V .
∎
Example 7.30. Show that for any scalar k in the scalar field K and any
vectors u and v in the vector space V ,
k (u − v) = ku − kv .
Solution: Clearly, $k (u - v) = k (u + (-v)) = k u + k (-v) = k u - k v$.
∎
7.14
Linear Transformation
7.14.1
Basis Vectors
Let $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$ and $e_3 = (0, 0, 1)$ be the basis vectors of 3-D
space, then all 3-D vectors are linear combinations of the elements of the basis
$\{e_i\}$ of the space. For example, an arbitrary vector $\vec{v}$ in 3-D space can be
expressed as $(x, y, z) = x e_1 + y e_2 + z e_3$, where $e_1$, $e_2$, and $e_3$ are commonly
labelled as $\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$ respectively. Alternatively, we write down $\vec{v}$ as a
column vector, i.e.
\[ \vec{v} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k} = x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + z \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]
Basis vectors are linearly independent of each other. None of them can
be generated from the remaining elements in $\{e_i\}$. So, you cannot express $e_1$
as a linear combination of $e_2$ and $e_3$. Other than the usual basis vectors
$e_1$, $e_2$ and $e_3$, one can choose any three non-coplanar vectors as basis vectors,
say $e_1' = (1, 1, 1)$, $e_2' = (1, 1, 0)$, and $e_3' = (1, 0, 0)$.
Example 7.31. A vector $\vec{v} = (2, 3)$ is defined using the usual basis vectors:
$e_1 = (1, 0)$ and $e_2 = (0, 1)$ in the 2-D vector space. If the basis vectors are
changed to $e_1' = (1, 1)$ and $e_2' = (-1, 1)$, express $\vec{v}$ using the new basis.
Solution: Recall that $\vec{v} = (2, 3) = 2 e_1 + 3 e_2$. Let $\vec{v} = (2, 3) = a e_1' + b e_2'$, then
\[ \begin{pmatrix} 2 \\ 3 \end{pmatrix} = a \begin{pmatrix} 1 \\ 1 \end{pmatrix} + b \begin{pmatrix} -1 \\ 1 \end{pmatrix} \]
we have $a - b = 2$ and $a + b = 3$, which give $a = \frac{5}{2}$, $b = \frac{1}{2}$.
Therefore $\vec{v} = \frac{5}{2} e_1' + \frac{1}{2} e_2'$. Using the new basis, $\vec{v}$ is given by $\left( \frac{5}{2}, \frac{1}{2} \right)$.
∎
7.14.2
Linear Operator
Let V and W be vector spaces (over K). We call a function T ∶ V → W a
linear transformation from V to W if, for all x, y ∈ V and c ∈ K, we have
(a) T (x + y) = T (x) + T (y) and
(b) T (cx) = c T (x).
We often simply call T linear. If V = W , we call T a linear operator on V .
Properties of a Linear Operator
• If T is linear, then T (0) = 0.
• T is linear if and only if T (cx + y) = c T (x) + T (y) for all x, y ∈ V and c ∈ K.
• If T is linear, then T (x − y) = T (x) − T (y) for all x, y ∈ V .
• T is linear if and only if, for x1 , x2 , ⋯ , xn ∈ V , and a1 , a2 , ⋯ , an ∈ K, we have
\[
T\left( \sum_{i=1}^{n} a_i x_i \right) = \sum_{i=1}^{n} a_i \, T(x_i) .
\]
Generally, the second property is used to prove that a transformation is linear.
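As a brief check, the first and third properties follow directly from conditions (a) and (b) of the definition. The lines below are a sketch added for illustration, with x and y any vectors in V:

\[
T(0) = T(0 \cdot x) = 0 \cdot T(x) = 0 ,
\qquad
T(x - y) = T\bigl(x + (-1)\,y\bigr) = T(x) + (-1)\,T(y) = T(x) - T(y) .
\]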
Example 7.32. Define T ∶ R2 → R2 by T (a1 , a2 ) = (3a1 + a2 , 2a1 ). Show
that T is linear.
Solution: Let c ∈ R and x, y ∈ R2 , where x = (b1 , b2 ) and y = (d1 , d2 ). Since
cx + y = (cb1 + d1 , cb2 + d2 ), we have
T (cx + y) = (3 (cb1 + d1 ) + cb2 + d2 , 2 (cb1 + d1 ))
Also
cT (x) + T (y) = c (3b1 + b2 , 2b1 ) + (3d1 + d2 , 2d1 )
= (3cb1 + cb2 + 3d1 + d2 , 2cb1 + 2d1 )
= (3 (cb1 + d1 ) + cb2 + d2 , 2 (cb1 + d1 ))
So T (cx + y) = c T (x) + T (y), and hence T is linear. ∎
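The criterion T (cx + y) = c T (x) + T (y) can also be spot-checked numerically. The Python sketch below is only a sanity check on random integer inputs, not a proof; the function names `satisfies_linearity` and `shifted` are assumptions introduced for this illustration.

```python
import random


def T(v):
    """The map of Example 7.32: T(a1, a2) = (3*a1 + a2, 2*a1)."""
    a1, a2 = v
    return (3 * a1 + a2, 2 * a1)


def satisfies_linearity(f, trials=100, seed=0):
    """Spot-check f(c*x + y) == c*f(x) + f(y) on random integer inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        c = rng.randint(-9, 9)
        x = (rng.randint(-9, 9), rng.randint(-9, 9))
        y = (rng.randint(-9, 9), rng.randint(-9, 9))
        lhs = f((c * x[0] + y[0], c * x[1] + y[1]))
        fx, fy = f(x), f(y)
        rhs = (c * fx[0] + fy[0], c * fx[1] + fy[1])
        if lhs != rhs:
            return False
    return True


def shifted(v):
    """A hypothetical non-linear map for contrast: adds a constant offset."""
    return (v[0] + 1, v[1])


t_passes = satisfies_linearity(T)          # expected to pass every trial
shift_passes = satisfies_linearity(shifted)  # fails as soon as c != 0
```

Passing such a check does not establish linearity; only the algebraic argument in the solution does. A single failing trial, however, is enough to show that a map is not linear.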
7.15 Matrix Representation of a Linear Operator
Let T be a linear operator such that
\[
\begin{cases}
T(e_1) = a_{11}\, e_1 + a_{12}\, e_2 + a_{13}\, e_3 \\
T(e_2) = a_{21}\, e_1 + a_{22}\, e_2 + a_{23}\, e_3 \\
T(e_3) = a_{31}\, e_1 + a_{32}\, e_2 + a_{33}\, e_3
\end{cases}
\]
then the matrix representation of T is
\[
T = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
\]
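One can check that equation set 7.42 has a matrix representation.
\[
T e_1 = T \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{11} \\ a_{12} \\ a_{13} \end{pmatrix}
\tag{7.42}
\]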
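Similarly, we have
\[
T e_2 = T \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{21} \\ a_{22} \\ a_{23} \end{pmatrix}
\]
\[
T e_3 = T \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} a_{31} \\ a_{32} \\ a_{33} \end{pmatrix}
\]
Moreover, if T operates on (x, y, z), it gives
\[
T(x\, e_1 + y\, e_2 + z\, e_3) = T \begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} x\, a_{11} + y\, a_{21} + z\, a_{31} \\ x\, a_{12} + y\, a_{22} + z\, a_{32} \\ x\, a_{13} + y\, a_{23} + z\, a_{33} \end{pmatrix}
\]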
which is equivalent to
\[
\begin{aligned}
T(x, y, z) &= T(x\, e_1 + y\, e_2 + z\, e_3) \\
&= T(x\, e_1) + T(y\, e_2) + T(z\, e_3) \\
&= x\, T(e_1) + y\, T(e_2) + z\, T(e_3) \\
&= x\,(a_{11} e_1 + a_{12} e_2 + a_{13} e_3) + y\,(a_{21} e_1 + a_{22} e_2 + a_{23} e_3) + z\,(a_{31} e_1 + a_{32} e_2 + a_{33} e_3) \\
&= (x\, a_{11} + y\, a_{21} + z\, a_{31})\, e_1 + (x\, a_{12} + y\, a_{22} + z\, a_{32})\, e_2 + (x\, a_{13} + y\, a_{23} + z\, a_{33})\, e_3
\end{aligned}
\]
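Example 7.33. If S and T are linear operators such that S(x, y) = (y, x)
and T (x, y) = (3 x − y, −2 x + 5 y), find the matrix representation of S and T
using the usual basis vectors (1, 0) and (0, 1).
Solution:
\[
S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\quad \text{and} \quad
T = \begin{pmatrix} 3 & -1 \\ -2 & 5 \end{pmatrix},
\]
where
\[
S \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ x \end{pmatrix}
\quad \text{and} \quad
T \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x - y \\ -2x + 5y \end{pmatrix}.
\]
∎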
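The construction used in this section, in which column j of the matrix holds the components of the image of the j-th basis vector, can be sketched in Python. The helper names `matrix_of` and `apply` are assumptions made for this illustration, and the sketch only covers the usual basis.

```python
def matrix_of(operator, dim):
    """Matrix of a linear operator in the usual basis; column j is operator(e_j)."""
    columns = []
    for j in range(dim):
        # j-th usual basis vector, e.g. (1, 0) or (0, 1) when dim == 2.
        e_j = tuple(1 if i == j else 0 for i in range(dim))
        columns.append(operator(e_j))
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(dim)] for i in range(dim)]


def apply(matrix, v):
    """Matrix-vector product, returned as a tuple."""
    return tuple(sum(row[j] * v[j] for j in range(len(v))) for row in matrix)


def S(v):
    """S(x, y) = (y, x) from Example 7.33."""
    x, y = v
    return (y, x)


def T(v):
    """T(x, y) = (3x - y, -2x + 5y) from Example 7.33."""
    x, y = v
    return (3 * x - y, -2 * x + 5 * y)


S_matrix = matrix_of(S, 2)
T_matrix = matrix_of(T, 2)
```

Here `S_matrix` and `T_matrix` come out as the two matrices quoted in the solution, and `apply(T_matrix, v)` agrees with `T(v)` for any test vector `v`, as the derivation above requires.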
Example 7.34. A linear operator
\[
T = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\]
is defined using the usual basis, {e1 = (1, 0), e2 = (0, 1)}.
(a) Find T (1, 3) and T (2, 7).
(b) Find the matrix of T using the basis {(1, 3), (2, 7)}.
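Solution:
(a)
\[
T \begin{pmatrix} 1 \\ 3 \end{pmatrix}
= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \end{pmatrix}
= \begin{pmatrix} 7 \\ 15 \end{pmatrix},
\qquad
T \begin{pmatrix} 2 \\ 7 \end{pmatrix}
= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 2 \\ 7 \end{pmatrix}
= \begin{pmatrix} 16 \\ 34 \end{pmatrix}
\]
(b) Denote
\[
e_1' = \begin{pmatrix} 1 \\ 3 \end{pmatrix}
\quad \text{and} \quad
e_2' = \begin{pmatrix} 2 \\ 7 \end{pmatrix}
\]
and let
\[
\begin{pmatrix} 7 \\ 15 \end{pmatrix} = a\, e_1' + b\, e_2'
\quad \text{and} \quad
\begin{pmatrix} 16 \\ 34 \end{pmatrix} = c\, e_1' + d\, e_2'
\]
We have
\[
\begin{pmatrix} 7 \\ 15 \end{pmatrix} = a \begin{pmatrix} 1 \\ 3 \end{pmatrix} + b \begin{pmatrix} 2 \\ 7 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 16 \\ 34 \end{pmatrix} = c \begin{pmatrix} 1 \\ 3 \end{pmatrix} + d \begin{pmatrix} 2 \\ 7 \end{pmatrix}
\]
Hence, we get
\[
\begin{cases} a = 19 \\ b = -6 \end{cases}
\quad \text{and} \quad
\begin{cases} c = 44 \\ d = -14 \end{cases}
\]
Let Te′ be the matrix of T using the basis {(1, 3), (2, 7)}. Now, we can write
Te′ e1 ′ = 19 e1 ′ − 6 e2 ′ and Te′ e2 ′ = 44 e1 ′ − 14 e2 ′ . Thus,
\[
T_{e'} = \begin{pmatrix} 19 & 44 \\ -6 & -14 \end{pmatrix}
\]
Remark: Using the linear operator Te′ , we have
\[
T_{e'} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 19 \\ -6 \end{pmatrix}
\quad \text{and} \quad
T_{e'} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 44 \\ -14 \end{pmatrix}
\]
The vectors
\[
\begin{pmatrix} 1 \\ 0 \end{pmatrix} = (1)\, e_1' + (0)\, e_2' = e_1'
\quad \text{and} \quad
\begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0)\, e_1' + (1)\, e_2' = e_2'
\]
are written in the new basis. They are not the usual basis vectors, e1 and e2 . In the same manner,
\[
\begin{pmatrix} 19 \\ -6 \end{pmatrix} = (19)\, e_1' + (-6)\, e_2'
\quad \text{and} \quad
\begin{pmatrix} 44 \\ -14 \end{pmatrix} = (44)\, e_1' + (-14)\, e_2'
\]
∎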
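The two steps of part (b), expressing each image in the new basis and then collecting the coefficients as columns, can be sketched in Python. The helper names `coords_in_basis` and `matrix_in_basis` are assumptions of this sketch, which handles only the 2-D case and solves the 2 × 2 system by Cramer's rule rather than by elimination.

```python
from fractions import Fraction


def coords_in_basis(v, b1, b2):
    """Solve v = a*b1 + b*b2 for (a, b) by Cramer's rule (2-D only)."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent")
    a = Fraction(v[0] * b2[1] - b2[0] * v[1], det)
    b = Fraction(b1[0] * v[1] - v[0] * b1[1], det)
    return (a, b)


def matrix_in_basis(matrix, b1, b2):
    """Matrix of the same operator relative to the basis {b1, b2}."""
    def apply(v):
        return (
            matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1],
        )

    # Each column holds the new-basis coordinates of the image of a basis vector.
    col1 = coords_in_basis(apply(b1), b1, b2)
    col2 = coords_in_basis(apply(b2), b1, b2)
    return [[col1[0], col2[0]], [col1[1], col2[1]]]


# Example 7.34: T in the usual basis, re-expressed in {(1, 3), (2, 7)}.
T = [[1, 2], [3, 4]]
T_new = matrix_in_basis(T, (1, 3), (2, 7))

# Example 7.31: coordinates of (2, 3) in the basis {(1, 1), (-1, 1)}.
v_new = coords_in_basis((2, 3), (1, 1), (-1, 1))
```

Exact fractions are used so that results such as 5/2 and 1/2 in Example 7.31 are not rounded. The same helper reproduces the coefficients a, b, c and d found in part (b).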
Index
absolute value, 114
acceleration vector, 16
adding vectors, 3
additive inverse, 181
adjoint, 161
Argand Diagram, 113
argument, 114
associative, 150
auxiliary equation, 121
basis, 181
basis vectors, 181
beta function, 65
binomial expansion, 45
bisection, method of, 54
capacitor, 96
cardioid, 138
Cartesian coordinates, 139
center of mass, 68
centripetal acceleration, 17, 33
chain rule, 125
characteristic equation, 121, 167
characteristic polynomial, 167
charging a capacitor, 96
cofactor, 154
column, 149
column vector, 149, 153
commutative, 150
complex numbers, 113
complex plane, 113
components, 8
compound angle formulae, 106
conjugate, 114
consistent system, 163
constant of integration, 56
coplanar vectors, 27
cosecant, 109
cotangent, 109
Cramer’s rule, 163
cycloid, 75
cylindrical coordinates, 141
De Moivre’s theorem, 116
definite integral, 61
definite integration, 56, 60
del operator, 147
derivative, 30
determinant, 154, 156
diagonal matrix, 153
diagonalization, 172
differentiable, 30
differentiation, 30
dummy variable, 61
eigenvalues, 167
eigenvectors, 167
electric dipole, 49
electric field, 40, 146
electric force, 9
electric potential, 40, 146
entry, 149
equilibrium, 3, 36
Euler’s formula, 118
Fermat’s principle, 35
first order differential equation, 100
fundamental theorem of calculus, 61
half-angle formulae, 110
half-life, 95
Hermitian matrix, 180
homogeneous function, 98
Hooke’s law, 88
identity matrix, 151, 152
impulse, 66
inconsistent system, 163
integrating factor, 101
integration, 56
integration by parts, 59
inverse, 152, 160
invertible, 160
isothermal process, 77
Kirchhoff’s junction rule, 165
Kirchhoff’s voltage rule, 96, 166
L’ Hôpital’s rule, 41
length of a curve, 74
linear operator, 182
linear transformation, 181
linearly dependent, 159, 168
linearly independent, 168
magnetic force, 29
matrix, 149
minor, 154
modulus, 114
moment of inertia, 78
Newton’s method, 50
non-singular, 160
ordinary differential equations, 86
orthogonal, 136
orthogonal matrix, 153, 162, 180
parabolic mirror, 98
partial derivative, 124
partial differentiation, 124
partial fraction, 58
particle in a box, 122
polar coordinates, 135
polar form, 114
position vector, 15
projectile, 5, 18, 22
quantized, 123
radial acceleration, 18
radioactive decay, 94
radius of curvature, 18
RC circuit, 96
reduction formula, 64
reference frames, 19
refraction, 35
relative velocity, 11
resultant, 3, 8, 18
Riemann integrable, 61
rotation of axes, 175
row, 149
row vector, 149, 153
scalar product, 20
scalar triple product, 27
Schrödinger’s equation, 123
secant, 109
second order differential equation, 104
separation of variables, 86
similar matrix, 153
simple harmonic motion, 88, 120
Simpson’s rule, 83
singular, 160
skew-symmetric matrix, 154
spherical coordinates, 143
square matrix, 149
standard form, 100
substitution, 57
subtracting vectors, 11
symmetric matrix, 153
tangential acceleration, 18
Taylor’s series, 44
terminal speed, 92
torque, 28
Torricelli’s Law, 99
total differential, 31
trace, 153
transpose, 153
trapezoidal rule, 82
trigonometry, 106
triple product, 27
unitary matrix, 180
vector, 1
vector space, 180
vector triple product, 27
velocity vector, 15
wavefunction, 123
work, 71
zero matrix, 152
zero vector, 181