PHYS 1150 - Problem Solving in Physics

Yip Man Kit
Rm 415B, CYM Physics Building
mankit@hku.hk
January 2023

Contents

1 Vectors
  1.1 Properties of Vectors
  1.2 Examples of Vectors and Scalars
  1.3 Adding Vectors
  1.4 The Components of Vectors
  1.5 Subtracting Vectors
  1.6 Position Vector and its Time Derivatives
  1.7 Reference Frames
  1.8 Scalar Products of Vectors
  1.9 Applications of Scalar Products
  1.10 Cross Product of Vectors
  1.11 Triple Products
  1.12 Applications of Cross Product
    1.12.1 Torque
    1.12.2 Magnetic Force

2 Differentiation
  2.1 Basic Ideas and the Extremum
  2.2 Derivatives of Physical Quantities
  2.3 Centripetal Acceleration
  2.4 Seeking the Extremum
  2.5 Case Study on Projectile Motion
  2.6 A Revisit to Newton’s Second Law
  2.7 Electric Potential and Electric Field
  2.8 L’Hôpital’s Rule
  2.9 Taylor’s Series
  2.10 Newton’s Method
  2.11 Useful Differentiation Formulae
  2.12 Appendix: Method of Bisection

3 Integration
  3.1 Indefinite Integration
    3.1.1 Integration by Substitution
    3.1.2 Integration using Partial Fraction
    3.1.3 Integration by Parts
  3.2 Definite Integration
    3.2.1 Fundamental Theorem of Calculus
    3.2.2 Integration using Reduction Formula
  3.3 Impulse
  3.4 Center of Mass
  3.5 Work Done by a Force
  3.6 Energy Stored in a Spring
  3.7 Electric Field due to a Charged Wire
  3.8 The Length of a Curve
  3.9 Area under a Curve
  3.10 Moment of Inertia
  3.11 The Dog-And-Rabbit Chase Problem
  3.12 Numerical Integration
    3.12.1 Trapezoidal Rule
    3.12.2 Simpson’s Rule
  3.13 Useful Integration Formulae

4 Ordinary Differential Equations
  4.1 Separation of Variables
  4.2 Simple Harmonic Motion
  4.3 Free Fall with Air Resistance
  4.4 Radioactive Decay
  4.5 Charging a Capacitor
  4.6 Parabolic Mirror
  4.7 Torricelli’s Law of Draining
  4.8 First Order Linear Differential Equation
  4.9 Second Order Homogeneous Differential Equations

5 Trigonometry and Complex Numbers
  5.1 Compound Angle Formulae
  5.2 Complex Numbers
  5.3 Complex Plane
  5.4 De Moivre’s Theorem
  5.5 Euler’s Formula
  5.6 A Revisit to Simple Harmonic Motion
  5.7 Particle in a Box

6 Partial Differentiation
  6.1 Partial Derivative
  6.2 Geometrical Meaning of Partial Derivatives
  6.3 Polar Coordinates
  6.4 Polar Coordinates and the Length of a Curve
  6.5 Cartesian Coordinates
  6.6 Cylindrical Coordinates
  6.7 Spherical Coordinates
  6.8 A Revisit to Electric Field and Electric Potential

7 Matrix and Transformation
  7.1 Matrix
  7.2 Properties of Matrices
  7.3 Determinant
  7.4 Properties of Determinant
  7.5 Inverse
  7.6 Properties of an Inverse
  7.7 Systems of Linear Equations
  7.8 Cramer’s Rule
  7.9 Eigenvalues and Eigenvectors
  7.10 Diagonalization
  7.11 Rotation of Axes
  7.12 Special Matrices
  7.13 Vector Spaces
  7.14 Linear Transformation
    7.14.1 Basis Vectors
    7.14.2 Linear Operator
  7.15 Matrix Representation of a Linear Operator

Index

Chapter 1
Vectors

1.1 Properties of Vectors

A vector is a quantity which has both magnitude (length) and direction. The vector $\vec{v}$ has magnitude denoted as $|\vec{v}|$ or simply $v$. The position of the vector in space is immaterial, as shown in figure 1.1. Vectors that have the same direction and the same length are identical, even if they originate from different points in space. The three vectors $\vec{A}$, $\vec{B}$ and $\vec{C}$ shown in the figure are equivalent.

Figure 1.1: Three vectors in space

The properties of vectors are listed as follows.

Scalar Multiplication:
For any real number $\lambda$, the product of $\lambda$ and $\vec{v}$ is $\lambda\vec{v}$, a vector whose length is $|\lambda|$ times that of $\vec{v}$. We call $\lambda$ a scalar, in contrast to the vector $\vec{v}$.

1. If $\lambda = 1$, the product is $1\vec{v} = \vec{v}$.
2. If $\lambda = 0$, we have $0\vec{v} = \vec{0}$, the null vector or zero vector. Its magnitude is zero.
3. If $\lambda = -1$, the product is $-1\vec{v} = -\vec{v}$. Its direction is opposite to that of $\vec{v}$ but it has the same length as $\vec{v}$, see figure 1.2. We read $-\vec{v}$ as minus $\vec{v}$.
4.
If $\lambda = 1/|\vec{v}|$, we obtain the unit vector $\hat{v} = \vec{v}/|\vec{v}|$, which points in the direction of $\vec{v}$ and has magnitude 1.

Figure 1.2: Negative of $\vec{v}$

Addition and Subtraction:
For any scalars $\alpha$ and $\beta$, and any vectors $\vec{A}$, $\vec{B}$ and $\vec{C}$, we have

1. $\vec{A} + \vec{B} = \vec{B} + \vec{A}$ (commutative law)
2. $\vec{A} + (-\vec{B}) = \vec{A} - \vec{B}$
3. $\vec{A} + (\vec{B} + \vec{C}) = (\vec{A} + \vec{B}) + \vec{C}$ (associative law)
4. $\alpha(\beta\vec{A}) = (\alpha\beta)\vec{A}$ (associative law)
5. $\alpha(\vec{A} + \vec{B}) = \alpha\vec{A} + \alpha\vec{B}$ (distributive law)
6. $(\alpha + \beta)\vec{A} = \alpha\vec{A} + \beta\vec{A}$ (distributive law)

1.2 Examples of Vectors and Scalars

Examples of vectors are displacement $\vec{x}$, velocity $\vec{v}$, acceleration $\vec{a}$, force $\vec{F}$, momentum $\vec{p}$, angular velocity $\vec{\omega}$, angular acceleration $\vec{\alpha}$, torque $\vec{\tau}$, electric field $\vec{E}$ and magnetic field $\vec{B}$, etc.

Examples of scalars are distance $x$, speed $v$, mass $m$, work done $W$, power $P$, temperature $T$, gravitational potential $V$, electric flux $\Phi_E$, and magnetic flux $\Phi_B$, etc.

Surprisingly, a finite angular displacement is not a vector, because successive rotations do not obey the commutative law of addition. Figure 1.3 shows that the order of the rotations makes a big difference in the result.

Figure 1.3: Angular displacement is not a vector

1.3 Adding Vectors

An object acted on by two forces $\vec{F}_1$ and $\vec{F}_2$ behaves as if it were acted on by the single force $\vec{F}_3$, where $\vec{F}_3 = \vec{F}_1 + \vec{F}_2$. The sum of these vectors gives the resultant of the applied forces. The direction and magnitude of $\vec{F}_3$ can be obtained by constructing a parallelogram as shown in figure 1.4. Its diagonal gives the resultant vector. The proof is given in section 1.4.

Figure 1.4: Resultant vector

Notice that the vectors can be translated as shown in the left diagram of figure 1.5. Then we obtain two vector diagrams in the form of a triangle, e.g. $\vec{F}_1 + \vec{F}_2 = \vec{F}_3$ in the middle diagram and $\vec{F}_2 + \vec{F}_1 = \vec{F}_3$ in the right diagram. Both diagrams give the same result for $\vec{F}_3$. The results can be extended to a many-force system, and the vector diagram becomes a polygon.
Figure 1.5: Sum of two vectors

If we label $\vec{F}_1$, $\vec{F}_2$ and $-\vec{F}_3$ by $\vec{A}$, $\vec{B}$ and $\vec{C}$ respectively, we have $\vec{A} + \vec{B} + \vec{C} = \vec{0}$. In other words, when the sum of three vectors equals zero, the vector diagram is a triangle with the vectors pointing head to tail in cyclic order (clockwise or counterclockwise). The vector diagrams in figure 1.6 are equivalent. If an object is in equilibrium under three forces, the force diagram is a triangle with the force vectors pointing in cyclic order.

Figure 1.6: Equilibrium of three forces

Example 1.1. An object of mass $m$ is suspended by three light strings as shown in figure 1.7. The tensions in the strings are $T_1$, $T_2$ and $T_3$ respectively. The strings connecting to the ceiling make angles $\alpha$ and $\beta$ to the horizontal. Find the tension in each string.

Figure 1.7: Suspending an object by strings

Solution: Obviously, we have $T_3 = mg$ because the object is in equilibrium. The free-body diagram of the object is shown in the left diagram of figure 1.8. As the knot is also in equilibrium, we can construct a vector diagram using the tensions in the strings through the knot, see the right diagram of figure 1.8. It is a triangle whose sides are given by the magnitudes of the tensions in the strings. The vectors point in cyclic order in the diagram. As a reminder, the sides of the triangle are not the lengths of the strings. Using the sine law, we obtain the tensions in the strings readily.

Figure 1.8: Vector diagrams

$$\frac{T_1}{\sin(90^\circ - \beta)} = \frac{T_3}{\sin(\alpha + \beta)}$$

$$\frac{T_1}{\cos\beta} = \frac{mg}{\sin(\alpha + \beta)}$$

Hence, we have

$$T_1 = \frac{mg\cos\beta}{\sin(\alpha + \beta)}$$

Similarly, we obtain

$$T_2 = \frac{mg\cos\alpha}{\sin(\alpha + \beta)}$$

Of course, one can use the routine method, i.e. resolve the tensions into horizontal and vertical components and then consider the equilibrium of forces in each direction, but that is clumsier. ∎

Example 1.2.
A particle is projected from the floor with speed $v_i$ at an angle $30^\circ$ with the horizontal as shown in figure 1.9. Find the time of flight of the particle.

Figure 1.9: Projectile motion of a particle

Solution: The trajectory is symmetric about the vertical line through the maximum height of flight. Thus, the initial velocity and the final velocity have the same magnitude, $v_i = v_f$, and $\vec{v}_i$ and $\vec{v}_f$ make the same acute angle with the floor. Since $\vec{v}_f = \vec{v}_i + \vec{g}t$, we can construct the vector diagram, figure 1.10, using the vectors in this equation. The angle between $\vec{v}_i$ and $\vec{v}_f$ is $60^\circ$, so the diagram is an equilateral triangle with sides $v_i = v_f = gt$. Therefore, we have $t = v_i/g$. Notice that $\vec{g}t = \Delta\vec{v} = \vec{v}_f - \vec{v}_i$, the change of velocity after time $t$.

Figure 1.10: Vector diagram of the change of velocity

Example 1.3. An experiment is performed with a gun and a particle as shown in figure 1.11. The particle is released when the gun fires. Show that the bullet can always hit the particle if its initial velocity $\vec{u}$ points to the particle.

Figure 1.11: The gun and particle experiment

Solution: Let $t$ be the time needed by the bullet to travel the horizontal distance between the gun and the particle. Denote the position vector of the bullet at time $t$ as $\vec{S}$. From the kinematic equation, we have $\vec{S} = \vec{u}t + \vec{g}t^2/2$, where $\vec{u}$ is along the firing direction of the gun and $\vec{g}$ is vertically downward.

Figure 1.12: Position vector of the bullet at time t

We obtain the vector diagram as shown in figure 1.12. It is a triangle having sides $S$, $ut$ and $gt^2/2$. Thus, we can assert that the bullet, at time $t$, is at a vertical distance $gt^2/2$ below the initial position of the particle. On the other hand, we notice that the particle also has a vertical displacement $\vec{g}t^2/2$. Therefore, the bullet can always hit the particle. ∎

Example 1.4. A boat sails across a swift, straight river of width $d$. The speed of the boat in still water is $u$ and that of the water is $v$, where $v > u$.
If the boat sails directly toward the opposite bank, find the downstream distance it has traveled when it reaches the opposite bank. Find also the minimum downstream distance and the corresponding direction of sailing of the boat.

Figure 1.13: Resultant velocity of the boat

Solution: Referring to figure 1.13, the resultant velocity of the boat is given by $\vec{v} + \vec{u} = \overrightarrow{OQ} + \overrightarrow{QC} = \overrightarrow{OC}$. It means that the boat will arrive at $B$ on the opposite bank, where the downstream distance is $AB = d/\tan\theta = d/(u/v) = dv/u$. One should notice that the time taken by the boat is $d/u$.

Figure 1.14: The minimum downstream distance of the boat

To find the minimum downstream distance, we construct a circle of radius $u$, as shown in figure 1.14. The tangent line to the circle from $O$ shows the path of the boat that has the minimum downstream distance on the opposite bank. As the tangent line meets the opposite bank at $B'$, the minimum downstream distance is $AB'$. The triangle $\triangle OQC'$ is right-angled, with $\sin\alpha = u/v$. The boat should sail in a direction making an angle $\alpha$ with the normal of the river, i.e. $OA$. The resultant velocity of the boat is $\overrightarrow{OC'}$, where $\overrightarrow{OC'} = \overrightarrow{OQ} + \overrightarrow{QC'} = \vec{v} + \vec{u}$. The minimum downstream distance is

$$AB' = d\cot\alpha = \frac{d\sqrt{v^2 - u^2}}{u}$$

Remark: The time taken is

$$t = \frac{d}{u\cos\alpha} = \frac{d}{u\left(\dfrac{\sqrt{v^2 - u^2}}{v}\right)} = \frac{dv}{u\sqrt{v^2 - u^2}}$$

∎

1.4 The Components of Vectors

Given that $\vec{A} = A_x\,\hat{\imath} + A_y\,\hat{\jmath}$, its horizontal and vertical components are $A_x = A\cos\alpha$ and $A_y = A\sin\alpha$, where $\alpha$ is the angle between $\vec{A}$ and the $\hat{\imath}$ direction and $A = |\vec{A}| = \sqrt{A_x^2 + A_y^2}$. If $\vec{B} = B_x\,\hat{\imath} + B_y\,\hat{\jmath}$, we have $\vec{C} = \vec{A} + \vec{B} = (A_x + B_x)\,\hat{\imath} + (A_y + B_y)\,\hat{\jmath}$, where $B_x = B\cos\beta$, $B_y = B\sin\beta$ and $B = |\vec{B}| = \sqrt{B_x^2 + B_y^2}$. Figure 1.15 shows the components of $\vec{A}$ and $\vec{B}$ along the $\hat{\imath}$ and $\hat{\jmath}$ directions. The resultant vector $\vec{C} = \vec{A} + \vec{B}$ is shown in figure 1.16. It is the diagonal of the parallelogram formed by $\vec{A}$ and $\vec{B}$.

Figure 1.15: Components of vectors

Figure 1.16: Resultant of two vectors
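The component form of vector addition in section 1.4 reduces the parallelogram construction to ordinary arithmetic. The following is a minimal Python sketch of that idea; the helper names (`from_polar`, `add`, `magnitude`, `direction_deg`) and the sample magnitudes and angles are illustrative assumptions, not part of the notes.

```python
import math

def from_polar(length, angle_deg):
    """Components (x, y) of a vector from its magnitude and its angle to the i-direction."""
    angle = math.radians(angle_deg)
    return (length * math.cos(angle), length * math.sin(angle))

def add(a, b):
    """Add two vectors component by component."""
    return (a[0] + b[0], a[1] + b[1])

def magnitude(v):
    """Length of a vector: sqrt(x^2 + y^2)."""
    return math.hypot(v[0], v[1])

def direction_deg(v):
    """Angle of a vector measured from the i-direction, in degrees."""
    return math.degrees(math.atan2(v[1], v[0]))

# Hypothetical sample vectors: A of length 3 along i, B of length 4 along j.
A = from_polar(3.0, 0.0)
B = from_polar(4.0, 90.0)
C = add(A, B)

print(f"C = ({C[0]:.3f}, {C[1]:.3f}), |C| = {magnitude(C):.3f}, "
      f"angle = {direction_deg(C):.2f} degrees")
```

For these sample values the resultant is the hypotenuse of a 3-4-5 triangle, so the printed magnitude should be 5 up to rounding.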
Example 1.5. Three charges, each equal to $+2.90\ \mu\mathrm{C}$, are placed at three corners of a square 0.500 m on a side, as shown in figure 1.17. Find the magnitude and direction of the net force on charge number 3. The Coulomb’s constant $k_e = 1/(4\pi\epsilon_0)$ is $8.99 \times 10^{9}\ \mathrm{N\cdot m^2/C^2}$, where $\epsilon_0$ is the permittivity of free space.

Figure 1.17: Electric forces exerted on a charge

Solution: The magnitude of electric force exerted on charge $i$ by charge $j$ is given by

$$F_{ij} = k_e\,\frac{q_i q_j}{r^2},$$

where $i, j = 1, 2, 3$ and $i \neq j$.

The magnitude of electric force exerted on charge 3 by charge 1:

$$F_{31} = k_e\,\frac{q^2}{(\sqrt{2}\,r)^2} = (8.99 \times 10^{9}\ \mathrm{N\cdot m^2/C^2})\,\frac{(2.90 \times 10^{-6}\ \mathrm{C})^2}{[\sqrt{2}\,(0.500\ \mathrm{m})]^2} = 0.151\ \mathrm{N}$$

The magnitude of electric force exerted on charge 3 by charge 2:

$$F_{32} = k_e\,\frac{q^2}{r^2} = (8.99 \times 10^{9}\ \mathrm{N\cdot m^2/C^2})\,\frac{(2.90 \times 10^{-6}\ \mathrm{C})^2}{(0.500\ \mathrm{m})^2} = 0.302\ \mathrm{N}$$

The x- and y-components of $\vec{F}_{31}$ and $\vec{F}_{32}$:

$$F_{31,x} = F_{31}\cos 45^\circ = (0.151\ \mathrm{N})(0.707) = 0.107\ \mathrm{N}$$
$$F_{31,y} = F_{31}\sin 45^\circ = (0.151\ \mathrm{N})(0.707) = 0.107\ \mathrm{N}$$
$$F_{32,x} = F_{32}\cos 0^\circ = (0.302\ \mathrm{N})(1) = 0.302\ \mathrm{N}$$
$$F_{32,y} = F_{32}\sin 0^\circ = (0.302\ \mathrm{N})(0) = 0\ \mathrm{N}$$

The resultant force on charge 3 has components:

$$F_{3,x} = F_{31,x} + F_{32,x} = 0.107\ \mathrm{N} + 0.302\ \mathrm{N} = 0.409\ \mathrm{N}$$
$$F_{3,y} = F_{31,y} + F_{32,y} = 0.107\ \mathrm{N} + 0\ \mathrm{N} = 0.107\ \mathrm{N}$$

The resultant force acting on charge 3:

$$F = \sqrt{F_{3,x}^2 + F_{3,y}^2} = 0.423\ \mathrm{N}$$

The direction of the resultant force on charge 3:

$$\theta = \tan^{-1}\left(\frac{F_{3,y}}{F_{3,x}}\right) = 14.7^\circ$$

∎

Example 1.6. A right triangular wedge of mass $M$ and inclination angle $\theta$, has a small block of mass $m$ placed on its inclined surface, as shown in figure 1.18. Assuming all surfaces are frictionless, what horizontal acceleration $a$ must $M$ have relative to the table to keep $m$ stationary relative to the wedge? What horizontal force $F$ must be applied to the wedge to achieve this result?

Figure 1.18: Pushing the wedge

Solution: Suppose that the block has no motion with respect to the wedge when they move together with a common horizontal acceleration $a$.
The free-body diagrams of the wedge and the block are sketched in figure 1.19.

Figure 1.19: Free-body diagrams of the wedge and the block

The equations of motion of the block along the horizontal and vertical directions are stated as follows. Refer to the right diagram of figure 1.19.

$$N_2\sin\theta = ma$$
$$mg - N_2\cos\theta = 0$$

The second equation gives $N_2 = \dfrac{mg}{\cos\theta}$. Eliminating $N_2$, we obtain $a = g\tan\theta$. From the left diagram of figure 1.19, we can write the equation of motion of the wedge along the horizontal:

$$F - N_2\sin\theta = Ma$$

Therefore, we have $F = (M + m)\,g\tan\theta$. It is not surprising that the answer is simply the product of $M + m$ (the total mass of the objects) and $g\tan\theta$ (the common horizontal acceleration of the objects), because both objects move together horizontally and the only external horizontal force exerted on the system is $F$. Here, the system is formed by the block and the wedge. The normal forces $N_2$ are internal forces of the system. The normal force $N_1$ exerted on the wedge by the table is an external vertical force with respect to the system. However, $N_1$ is irrelevant to the discussion of the horizontal acceleration of the system. ∎

1.5 Subtracting Vectors

Two objects $A$ and $B$ are located at different places on the 2D plane. Their position vectors are $\vec{r}_A$ and $\vec{r}_B$ respectively. The position of object $A$ relative to object $B$ is given by $\vec{r}_{AB} = \vec{r}_A - \vec{r}_B$, as shown in the left diagram of figure 1.20. The direction and magnitude of $\vec{r}_{AB}$ provide information about the position of $A$ relative to $B$. The meaning of the relative position vector is straightforward if we consider the following cases. Suppose that an observer located at $B$ tries to state the position of $A$. He will say $\vec{r}_{AB}$. If the observer is located at $A$ and tries to state the position of $B$, he will say $\vec{r}_{BA}$, where $\vec{r}_{BA} = \vec{r}_B - \vec{r}_A$. The right diagram in figure 1.20 shows the direction of $\vec{r}_{BA}$. The concept can be extended to relative velocity.
The velocity of $A$ relative to $B$ is $\vec{v}_{AB} = \vec{v}_A - \vec{v}_B$, where $\vec{v}_A$ and $\vec{v}_B$ are the velocity vectors of $A$ and $B$ respectively.

Figure 1.20: Relative position between two objects

Example 1.7. A ship $A$ is steaming due north at 16 km/h and a ship $B$ is steaming due west at 12 km/h. Find the velocity of $A$ with respect to $B$.

Solution: The velocity of $A$ with respect to $B$ is $\vec{v}_{AB}$, which equals $\vec{v}_A - \vec{v}_B$. Referring to the vector diagram shown on the right of figure 1.21, the relative velocity has magnitude $|\vec{v}_{AB}| = \sqrt{12^2 + 16^2} = 20$ km/h. The direction of $\vec{v}_{AB}$ is N $\tan^{-1}(12/16)$ E, i.e. N $36^\circ 52'$ E.

Figure 1.21: The velocity of ship A with respect to ship B

Example 1.8. A man traveling East at 8 km/h finds that the wind seems to blow directly from the North. On doubling his speed he finds that it appears to come from the NE. Find the velocity of the wind.

Solution: Let the velocity of the wind be $\vec{w} = x\,\hat{\imath} + y\,\hat{\jmath}$, with $\hat{\imath}$ pointing East and $\hat{\jmath}$ pointing North. Then the velocity of the wind relative to the man is

$$\vec{w} - 8\,\hat{\imath} = (x - 8)\,\hat{\imath} + y\,\hat{\jmath}$$

Notice that the man is the observer and he feels the wind. What he feels (i.e. wind blowing directly from the North) is relative to him. The vector subtraction represents this relative velocity. Therefore $\vec{w} - 8\,\hat{\imath} = (x - 8)\,\hat{\imath} + y\,\hat{\jmath}$ is parallel to $-\hat{\jmath}$. Hence, we obtain $x - 8 = 0$ (i.e. $x = 8$). Figure 1.22 shows the velocity of the wind relative to the man traveling East at 8 km/h. The vector $-\hat{\jmath}$ indicates the North wind relative to the man.

Figure 1.22: The wind with respect to a man traveling East at 8 km/h

When the man doubles his speed, the velocity of the wind relative to him is given by

$$\vec{w} - 16\,\hat{\imath} = (x - 16)\,\hat{\imath} + y\,\hat{\jmath}$$

Figure 1.23: The wind with respect to the man traveling East at 16 km/h

But the wind seems to blow from the NE and is therefore parallel to $-(\hat{\imath} + \hat{\jmath})$. Hence, we can write $y = x - 16 = 8 - 16 = -8$. The velocity of the wind is $8\,\hat{\imath} - 8\,\hat{\jmath}$, which is equivalent to $8\sqrt{2}$ km/h from the NW.
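The two parallelism conditions used in Example 1.8 can be checked numerically. The sketch below is only an illustration: the function name `wind_velocity` and the convention that East is $+x$ and North is $+y$ are assumptions made here for the code.

```python
import math

def wind_velocity(speed1, speed2):
    """Wind velocity (x, y) in km/h, with East = +x and North = +y (assumed axes).

    speed1: eastward speed at which the wind appears to come from the North.
    speed2: eastward speed at which the wind appears to come from the NE.
    """
    # Apparent wind from the North: (x - speed1, y) is parallel to -j, so x = speed1.
    x = speed1
    # Apparent wind from the NE: (x - speed2, y) is parallel to -(i + j),
    # so its two components are equal, y = x - speed2.
    y = x - speed2
    return x, y

x, y = wind_velocity(8.0, 16.0)
speed = math.hypot(x, y)

# Relative velocities seen by the man at the two speeds.
rel_slow = (x - 8.0, y)    # should point due South
rel_fast = (x - 16.0, y)   # should point toward the SW, i.e. come from the NE

print(f"wind = ({x}, {y}) km/h, speed = {speed:.3f} km/h")
print(f"relative to man at 8 km/h: {rel_slow}, at 16 km/h: {rel_fast}")
```

The wind components come out as $(8, -8)$, a wind blowing toward the SE, which is the same as a wind from the NW.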
Figure 1.23 shows the velocity of the wind relative to the man traveling East at 16 km/h. The vector $-(\hat{\imath} + \hat{\jmath})$ indicates the NE wind relative to the man. ∎

Example 1.9. John is running at a constant speed $v_0 = 0.7$ m/s along a straight path. His father Peter, at a normal distance 10 m from the path and a distance 20 m from John, observes John approaching. Suppose that Peter starts to run at constant speed, along a straight course, as soon as he observes John. What is the minimum speed of Peter such that they can meet? Where do they meet and how long does it take?

Figure 1.24: The initial positions of John and Peter

Solution: We subtract John's velocity vector from the velocities of both persons. Then John becomes stationary and Peter runs with $\vec{v}_{PJ}$, where $\vec{v}_{PJ}$ represents the velocity of Peter relative to John and $\vec{v}_{PJ} = \vec{v}_P - \vec{v}_J$. Generally, $\vec{v}_{PJ}$ has an arbitrary direction because it depends on the choice of Peter (i.e. $\vec{v}_P$).

Figure 1.25: The vector diagram of Peter’s path

Nevertheless, the direction of $\vec{v}_{PJ}$ in figure 1.25 points to John so that Peter can meet John eventually. Since Peter is 10 m from the path and 20 m from John, the line joining them makes an angle of $30^\circ$ with the path. Denote the minimum velocity of Peter as $\vec{v}_{min}$, whose magnitude is $v_{min} = (0.7\ \mathrm{m/s})\sin 30^\circ = 0.35$ m/s. As a reminder, $\vec{v}_{min}$ shows the actual direction in which Peter runs so that he meets John eventually. In fact, he meets John at $M$, and $PM$ shows the actual path taken by Peter. From $\triangle JMP$, we have

$$JM = \frac{20\ \mathrm{m}}{\cos 30^\circ} = \frac{40}{\sqrt{3}}\ \mathrm{m}$$

The time needed is $t$, where

$$t = \frac{JM}{0.7\ \mathrm{m/s}} = \frac{\dfrac{40}{\sqrt{3}}\ \mathrm{m}}{\dfrac{7}{10}\ \mathrm{m/s}} = \frac{400}{7\sqrt{3}}\ \text{seconds} \approx 33\ \text{seconds}.$$

∎

Example 1.10. An experiment is performed with a gun and a particle as shown in figure 1.26. The particle is released when the gun fires. Show that the bullet can always hit the particle if its initial velocity $\vec{u}$ points to the particle. This example repeats the question stated in example 1.3, but we solve it by applying the idea of relative velocity.
Figure 1.26: The gun and particle experiment

Solution: When $t \geq 0$, we have

$$\text{Velocity of the bullet:}\quad \vec{v}_b = \vec{u} + \vec{g}\,t$$
$$\text{Velocity of the particle:}\quad \vec{v}_p = \vec{g}\,t$$

So, the velocity of the bullet relative to the particle is $\vec{v}_{bp} = \vec{v}_b - \vec{v}_p = \vec{u}$. It means that the bullet travels with $\vec{u}$ with respect to an observer riding on the particle. Equivalently, the gravitational effect cancels out in the picture of relative motion and the bullet travels with $\vec{u}$ towards the particle. The bullet must hit the particle.

Alternatively, one may imagine the following. If the velocity vector $\vec{g}t$ is subtracted from both the particle and the bullet, then the particle is at rest and the bullet travels with $\vec{u}$ relative to the particle. This is the idea behind relative motion. Finally, we conclude that the bullet definitely hits the particle. ∎

1.6 Position Vector and its Time Derivatives

The position vector indicates the position of an object with respect to the origin of a coordinate system, e.g. the Cartesian coordinate system. Its time derivative gives the velocity of the object. The direction of the velocity vector is tangential to the path of the object. We can differentiate the velocity vector again with respect to time and obtain the acceleration vector. It is the second order derivative of the position vector with respect to time. It reveals that the equation of motion of an object, i.e. $\vec{F} = m\vec{a}$, is generally a second order differential equation, where $\vec{F}$ is the force exerted on an object of mass $m$ moving with acceleration $\vec{a}$.

Position vector

The position vector of a point is given by $\vec{r} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}$. It points from the origin and has length $r = |\vec{r}| = \sqrt{x^2 + y^2 + z^2}$. Figure 1.27 shows the trajectory of a particle at various times. The position of the particle at time $t_1$ is given by $\vec{r}_1$ and at a later time $t_2$ it becomes $\vec{r}_2$. The change of position vector, or simply the displacement, is $\Delta\vec{r} = \vec{r}_2 - \vec{r}_1$.
It is important to notice that when $\Delta t = t_2 - t_1$ is very small, $\Delta\vec{r}$ nearly lies on the trajectory and shows roughly the direction of motion of the particle.

Figure 1.27: The position vector of a particle

Velocity vector

The average velocity during time $\Delta t$ is defined as

$$\vec{v}_{avg} = \frac{\Delta\vec{r}}{\Delta t} = \frac{\vec{r}(t_1 + \Delta t) - \vec{r}(t_1)}{\Delta t}$$

The instantaneous velocity at time $t$ is defined as the time derivative of the position vector. It is tangential to the path at that instant.

$$\vec{v} = \frac{d\vec{r}}{dt} = \lim_{\Delta t \to 0}\frac{\Delta\vec{r}}{\Delta t} = \lim_{\Delta t \to 0}\frac{\vec{r}(t + \Delta t) - \vec{r}(t)}{\Delta t}$$

The speed of the particle is $v = |\vec{v}|$, which is a non-negative scalar quantity. It is in contrast to the velocity $\vec{v}$, which is a vector. In some textbooks, $\vec{v}$ is written as $\dot{\vec{r}}$.

Figure 1.28: The velocity vector of a particle

Acceleration vector

The average acceleration during time $\Delta t$ is defined as

$$\vec{a}_{avg} = \frac{\Delta\vec{v}}{\Delta t} = \frac{\vec{v}(t_1 + \Delta t) - \vec{v}(t_1)}{\Delta t}$$

The instantaneous acceleration at time $t$ is defined as the time derivative of the velocity vector.

$$\vec{a} = \frac{d\vec{v}}{dt} = \lim_{\Delta t \to 0}\frac{\Delta\vec{v}}{\Delta t} = \lim_{\Delta t \to 0}\frac{\vec{v}(t + \Delta t) - \vec{v}(t)}{\Delta t}$$

Notice that the acceleration vector is also the second order time derivative of the position vector.

$$\vec{a} = \dot{\vec{v}} = \frac{d\vec{v}}{dt} = \frac{d}{dt}\left(\frac{d\vec{r}}{dt}\right) = \frac{d^2\vec{r}}{dt^2} = \ddot{\vec{r}}$$

Example 1.11. A particle moving in space has position vector $\vec{r}$ at time $t$. Its speed and the magnitude of its acceleration at that instant are $v$ and $a$ respectively. Comment on the following pairs of quantities.

(a) $\dfrac{dr}{dt}$ and $v$, (b) $\dfrac{dv}{dt}$ and $a$.

Solution:
(a) The quantity $v$ is the speed of the particle, so it is never negative. However, $\dfrac{dr}{dt}$ is the rate of change of the distance between the particle and the origin. It can take a positive or a negative value. For example, it is negative when the particle approaches the origin. Notice also that $\dfrac{dr}{dt} = \dfrac{d|\vec{r}|}{dt}$ and $v = |\vec{v}| = \left|\dfrac{d\vec{r}}{dt}\right|$. Interestingly, $\dfrac{d|\vec{r}|}{dt}$ is not necessarily equal to $\left|\dfrac{d\vec{r}}{dt}\right|$. Let’s consider a particle moving in a circular path, as shown in figure 1.29.
Figure 1.29: The circular motion of a particle

As the distance between the particle and the center of rotation (taken as the origin) is fixed, i.e. $r = |\vec{r}|$ is a constant, we have $\dfrac{dr}{dt} = \dfrac{d|\vec{r}|}{dt} = 0$. On the contrary, the speed of the particle $v = |\vec{v}| = \left|\dfrac{d\vec{r}}{dt}\right|$ is nonzero when the particle is revolving. Therefore, we assert that $\dfrac{dr}{dt}$ is not necessarily equal to $v$, except when the particle moves directly away from the origin along a straight path or the particle is at rest.

(b) The quantity $a$ is the magnitude of the acceleration of the particle, so it is never negative. On the other hand, $\dfrac{dv}{dt}$ is the rate of change of the speed of the particle along the path. It can take a positive or a negative value. For example, it is negative if the particle reduces its speed. Let’s consider a particle performing uniform circular motion. The speed of the particle $v$ is a constant and $\dfrac{dv}{dt} = 0$. However, the particle experiences a centripetal acceleration, i.e. $\vec{a} \neq \vec{0}$ and thus $a \neq 0$. Therefore, $\dfrac{dv}{dt}$ is not necessarily equal to $a$.

Strictly speaking, $\dfrac{dv}{dt}$ is the acceleration of the particle along the path, i.e. the tangential acceleration $a_t$. Refer to figure 1.30.

Figure 1.30: The circular motion of a particle

It is positive if the particle accelerates along the path, negative if the particle decelerates along the path, and zero if the particle maintains a constant speed along the path. It is important to notice that $\dfrac{dv}{dt}$ alone does not provide any information about the acceleration normal to the path. The latter is named the radial acceleration $a_n$. Some books adopt the symbol $a_r$ instead. The expression $a_n = v^2/r$ is true for uniform or non-uniform circular motion. It is also true for motion along an arbitrary curve, with $r$ taken as the radius of curvature. ∎

Example 1.12. A particle is thrown horizontally with initial velocity $u$ from a cliff. The trajectory is shown in the left diagram of figure 1.31. Find the acceleration of the particle along and normal to the path at time $t$.
Hence, find the radius of curvature of the path at time $t$.

Solution:

Figure 1.31: The trajectory of a particle

The horizontal and vertical velocities of the particle at time $t$ are given by $v_x = u$ and $v_y = -gt$ respectively. The resultant speed at time $t$ is $v = \sqrt{v_x^2 + v_y^2} = \sqrt{u^2 + g^2 t^2}$, and the acceleration of the particle along the path is

$$a_t = \frac{dv}{dt} = \frac{g^2 t}{\sqrt{u^2 + g^2 t^2}}.$$

An alternative approach is given in example 1.15. Since the particle is driven by the gravitational attraction, the net acceleration of the particle is $g$. We can write $g^2 = a_t^2 + a_n^2$, where $a_n$ is the acceleration of the particle normal to the path; refer to the right diagram of figure 1.31. We obtain

$$a_n = \sqrt{g^2 - a_t^2} = \frac{gu}{\sqrt{u^2 + g^2 t^2}}.$$

It is also the centripetal acceleration of the particle at time $t$, and it is related to the speed of the particle by $a_n = v^2/r$, where $r$ is the radius of curvature. The result follows readily:

$$r = \frac{v^2}{a_n} = \frac{(u^2 + g^2 t^2)^{3/2}}{gu}.$$

As a final remark, one should notice that the radius of curvature applies to all curves, not only to circular paths. ∎

1.7 Reference Frames

Each observer, such as you standing on the ground, defines a reference frame. A reference frame requires a coordinate system and a set of clocks, which enable an observer to measure positions, velocities, and accelerations in his or her particular frame. As shown in figure 1.32, we have two different frames from which to watch an object $P$. The position vectors measured from the two frames are related by

$$\vec{r}_{PA} = \vec{r}_{PB} + \vec{r}_{BA}.$$

In words: "The position of $P$ as measured by frame $A$ is equal to the position of $P$ as measured by frame $B$ plus the position of $B$ as measured by $A$."

Figure 1.32: Reference Frames

Taking the time derivative on both sides, we obtain the relation of velocities: $\vec{v}_{PA} = \vec{v}_{PB} + \vec{v}_{BA}$. If the two frames move at constant velocity with respect to each other, i.e. $\vec{v}_{BA} = \text{constant}$, we obtain $\vec{a}_{PA} = \vec{a}_{PB}$. That means two observers moving with constant velocity relative to each other should write down the same equation of motion for the object. Generally, we have $\vec{a}_{PA} = \vec{a}_{PB} + \vec{a}_{BA}$. Below is a summary of the quantities measured in different reference frames.

$$\begin{cases} \vec{r}_{PA} = \vec{r}_{PB} + \vec{r}_{BA} \\ \vec{v}_{PA} = \vec{v}_{PB} + \vec{v}_{BA} \\ \vec{a}_{PA} = \vec{a}_{PB} + \vec{a}_{BA} \end{cases}$$

1.8 Scalar Products of Vectors

The scalar product or dot product of two vectors $\vec{A}$ and $\vec{B}$ is defined as follows.

$$\vec{A} \cdot \vec{B} = |\vec{A}||\vec{B}| \cos\theta = AB\cos\theta,$$

where $\theta$ is the angle between the vectors. It is a scalar quantity and the operation is commutative, i.e. $\vec{A} \cdot \vec{B} = \vec{B} \cdot \vec{A}$. It can take positive, negative or zero values. If the vectors make an acute angle with each other, the product is positive. If the vectors make an obtuse angle, it becomes negative. When they are perpendicular to each other, the product is zero.

Figure 1.33: Scalar product of vectors

If $\vec{B}$ is a unit vector, i.e. $B = 1$, then $\vec{A} \cdot \vec{B} = A(1)\cos\theta = A\cos\theta$, which is the projection of $\vec{A}$ on $\vec{B}$. Suppose $\vec{A}$ and $\vec{B}$ are described in Cartesian coordinates, $\vec{A} = A_x \hat{\imath} + A_y \hat{\jmath} + A_z \hat{k}$ and $\vec{B} = B_x \hat{\imath} + B_y \hat{\jmath} + B_z \hat{k}$. Then we have

$$\vec{A} \cdot \vec{B} = (A_x \hat{\imath} + A_y \hat{\jmath} + A_z \hat{k}) \cdot (B_x \hat{\imath} + B_y \hat{\jmath} + B_z \hat{k}) = A_x B_x + A_y B_y + A_z B_z.$$

The simplification is done by using the relations $\hat{\imath} \cdot \hat{\imath} = \hat{\jmath} \cdot \hat{\jmath} = \hat{k} \cdot \hat{k} = 1$ and $\hat{\imath} \cdot \hat{\jmath} = \hat{\jmath} \cdot \hat{k} = \hat{k} \cdot \hat{\imath} = 0$. It follows that $\vec{A} \cdot \vec{A} = |\vec{A}||\vec{A}| \cos 0^\circ = A^2 = A_x^2 + A_y^2 + A_z^2$, $\vec{B} \cdot \vec{B} = |\vec{B}||\vec{B}| \cos 0^\circ = B^2 = B_x^2 + B_y^2 + B_z^2$, and

$$\cos\theta = \frac{\vec{A} \cdot \vec{B}}{AB} = \frac{A_x B_x + A_y B_y + A_z B_z}{\sqrt{A_x^2 + A_y^2 + A_z^2}\,\sqrt{B_x^2 + B_y^2 + B_z^2}}.$$

Example 1.13. Use the scalar product of two vectors to prove the cosine rule of a triangle.

Figure 1.34: A proof of the cosine rule

Solution: Construct the triangle $\triangle ABC$ and the vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$ such that the vectors are along the sides of the triangle, as shown in figure 1.34. We notice that $\vec{b} = \vec{c} - \vec{a}$, and the dot product of $\vec{b}$ with itself is $\vec{b} \cdot \vec{b} = (\vec{c} - \vec{a}) \cdot (\vec{c} - \vec{a})$. Thus, we have

$$b^2 = c^2 + a^2 - 2\,\vec{c} \cdot \vec{a}.$$

Then, we obtain the cosine rule $b^2 = c^2 + a^2 - 2ca\cos B$. ∎

Example 1.14. Use the scalar product of two vectors to prove the Cauchy-Schwarz inequality for real numbers $a_i$ and $b_i$, $i = 1, 2, 3$, and $n = 3$.

$$\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \sum_{i=1}^{n} a_i^2 \sum_{i=1}^{n} b_i^2$$

Solution: Let $\vec{A} = a_1 \hat{\imath} + a_2 \hat{\jmath} + a_3 \hat{k}$ and $\vec{B} = b_1 \hat{\imath} + b_2 \hat{\jmath} + b_3 \hat{k}$. Denote the angle between $\vec{A}$ and $\vec{B}$ as $\theta$. We have

$$(\vec{A} \cdot \vec{B})^2 = A^2 B^2 \cos^2\theta \le A^2 B^2.$$

Therefore, we obtain $\left(\sum_{i=1}^{3} a_i b_i\right)^2 \le \sum_{i=1}^{3} a_i^2 \sum_{i=1}^{3} b_i^2$. ∎

1.9 Applications of Scalar Products

Scalar products are commonly used to define physical quantities such as work done, electric flux, and magnetic flux. Let's take work done as an example. The work done by a force $\vec{F}$ is defined as

$$W = \vec{F} \cdot \vec{d} = Fd\cos\theta,$$

where $\vec{d}$ is the displacement of the point of application. The frictional force does negative work when a mass slides on a rough table because the force vector and the displacement vector are anti-parallel to each other. In fact, $W = (F\cos\theta)\,d = (d\cos\theta)\,F$. It implies that the following approaches are equivalent when we compute the work done: (1) multiplying the projection of $\vec{F}$ on $\vec{d}$ by the magnitude of the displacement $\vec{d}$, or (2) multiplying the projection of $\vec{d}$ on $\vec{F}$ by the magnitude of the force $\vec{F}$. More generally, the scalar product of two vectors provides a way to find the projection of either vector along the other.

Figure 1.35: Scalar product and the projection of vector

Example 1.15. A particle is thrown horizontally with initial velocity $u$ from a cliff. The trajectory is shown in the left diagram of figure 1.31. Find the acceleration of the particle along and normal to the path at time $t$.

Solution:

Figure 1.36: The acceleration of a particle along and normal to the trajectory

The horizontal and vertical velocities of the particle at time $t$ are given by $v_x = u$ and $v_y = -gt$ respectively. The velocity of the particle at time $t$ has the form $\vec{v} = u\,\hat{\imath} - gt\,\hat{\jmath}$ and the acceleration is $\vec{a} = -g\,\hat{\jmath}$. Then we can compute the acceleration of the particle along the path by using the scalar product of $\vec{a}$ and $\hat{v}$, where $\hat{v} = \vec{v}/v$. The tangential acceleration of the particle at time $t$ is

$$a_t = \vec{a} \cdot \hat{v} = (-g\,\hat{\jmath}) \cdot \left(\frac{\vec{v}}{v}\right) = (-g\,\hat{\jmath}) \cdot \left(\frac{u\,\hat{\imath} - gt\,\hat{\jmath}}{\sqrt{u^2 + g^2 t^2}}\right),$$

which leads to $a_t = \dfrac{g^2 t}{\sqrt{u^2 + g^2 t^2}}$. An alternative approach is given in example 1.12. The cross product of $\vec{a}$ and $\hat{v}$ gives the acceleration of the particle normal to the path (read section 1.10):

$$a_n = |\vec{a} \times \hat{v}| = \left|(-g\,\hat{\jmath}) \times \left(\frac{\vec{v}}{v}\right)\right| = \left|(-g\,\hat{\jmath}) \times \left(\frac{u\,\hat{\imath} - gt\,\hat{\jmath}}{\sqrt{u^2 + g^2 t^2}}\right)\right| = \left|\frac{ug\,\hat{k}}{\sqrt{u^2 + g^2 t^2}}\right|.$$

Therefore, $a_n = \dfrac{ug}{\sqrt{u^2 + g^2 t^2}}$. ∎

Example 1.16. A block is being pulled by a constant force $F$ through a light string which makes a constant angle of $60^\circ$ with the horizontal. The pulley is light and frictionless. If $F = 10$ N and the block moves by 1 m, what is the work done by the force $F$?

Figure 1.37: The work done by a pull

Solution:

Method I: Let's consider the movement of the point of action. Initially, the point of action on the string is at point $A$, as shown in figure 1.38. When the block moves by 1 m, the point of action moves to point $B$ along the straight path $AB$. We notice that $AC = CB = 1$ m and the displacement of the point of action is given by $\overrightarrow{AB}$, where $AB = 2\,(AC\cos 30^\circ) = \sqrt{3}$ m. Let $W$ be the work done by the applied force $\vec{F}$. We have

$$W = \vec{F} \cdot \overrightarrow{AB} = |\vec{F}|\,|\overrightarrow{AB}| \cos 30^\circ = (10\ \text{N})(\sqrt{3}\ \text{m})(\sqrt{3}/2).$$

Therefore, we obtain $W = 15$ J.

Figure 1.38: The work done by a pull

Method II: Consider the total force acting on the block. It is the sum of two tension forces as shown in the left diagram of figure 1.39. Each force drives the block by 1 m. The right diagram in figure 1.39 illustrates the effect of each individual force. The work done by the two forces is

$$W = (F\cos 60^\circ)(1\ \text{m}) + (F)(1\ \text{m}) = (10\ \text{N})(0.5)(1\ \text{m}) + (10\ \text{N})(1\ \text{m}).$$

Therefore, we obtain $W = 15$ J.
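The agreement between Methods I and II can be checked numerically. The following is a minimal Python sketch, not part of the original notes. It assumes one reading of figure 1.38: the displacement of the point of action is the block's 1 m horizontal shift (AC) plus 1 m of string drawn out along the direction of the pull (CB). The variable names are illustrative only.

```python
import math

F = 10.0                      # magnitude of the pull, in newtons
theta = math.radians(60.0)    # angle of the string above the horizontal
d = 1.0                       # distance moved by the block, in metres

# Method I: work as the dot product of the force and the displacement
# of the point of action.
# Assumption (our reading of figure 1.38): AB = AC + CB, where AC is the
# block's horizontal shift and CB is an equal length of string drawn out
# along the direction of the pull.
F_vec = (F * math.cos(theta), F * math.sin(theta))
AB = (d + d * math.cos(theta), d * math.sin(theta))
work_method_1 = F_vec[0] * AB[0] + F_vec[1] * AB[1]

# Method II: two tension forces act on the block, each moving it by d.
work_method_2 = F * math.cos(theta) * d + F * d

# Length of AB, to compare with the value sqrt(3) m quoted in Method I.
length_AB = math.hypot(AB[0], AB[1])

print(work_method_1, work_method_2, length_AB)
```

Under this assumed geometry both methods give the same work, and the length of AB matches the value used in Method I.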
Figure 1.39: The work done by a pull

Method III: Consider the resultant force $F'$ acting on the block. It is the sum of the two tension forces, as shown in the right diagram of figure 1.40.

Figure 1.40: The work done by a pull

We have $F' = 2F\cos 30^\circ = 2(10\ \text{N})(\sqrt{3}/2) = 10\sqrt{3}$ N. This resultant force drives the block by 1 m and the work done by it is

$$W = (F'\cos 30^\circ)(1\ \text{m}) = (10\sqrt{3}\ \text{N})(\sqrt{3}/2)(1\ \text{m}).$$

Therefore, we obtain $W = 15$ J. ∎

1.10 Cross Product of Vectors

The cross product of two vectors $\vec{a}$ and $\vec{b}$ is defined as

$$\vec{a} \times \vec{b} = |\vec{a}||\vec{b}| \sin\theta\,\hat{n} = ab\sin\theta\,\hat{n},$$

where $\theta$ is the angle between $\vec{a}$ and $\vec{b}$, and $\hat{n}$ is a unit vector generated by the right-hand rule. The direction of $\hat{n}$ is perpendicular to the plane formed by $\vec{a}$ and $\vec{b}$. If $\vec{a}$ and $\vec{b}$ are parallel or anti-parallel vectors, their cross product is zero.

Figure 1.41: The cross product and the right-hand rule

The cross product is anti-commutative, i.e. $\vec{a} \times \vec{b} = -\vec{b} \times \vec{a}$. Notice also that the cross product is not associative, i.e. $\vec{a} \times (\vec{b} \times \vec{c}) \ne (\vec{a} \times \vec{b}) \times \vec{c}$, because the former lies on the bc-plane and the latter lies on the ab-plane. The cross product is easy to compute because

$$\begin{cases} \hat{\imath} \times \hat{\imath} = \hat{\jmath} \times \hat{\jmath} = \hat{k} \times \hat{k} = 0, \\ \hat{\imath} \times \hat{\jmath} = \hat{k}, \quad \hat{\jmath} \times \hat{k} = \hat{\imath}, \quad \hat{k} \times \hat{\imath} = \hat{\jmath}. \end{cases}$$

The cross product of vectors $\vec{a} = a_x \hat{\imath} + a_y \hat{\jmath} + a_z \hat{k}$ and $\vec{b} = b_x \hat{\imath} + b_y \hat{\jmath} + b_z \hat{k}$ can be represented by a determinant,

$$\vec{a} \times \vec{b} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix},$$

where $\vec{a} \times \vec{b} = (a_y b_z - b_y a_z)\,\hat{\imath} - (a_x b_z - b_x a_z)\,\hat{\jmath} + (a_x b_y - b_x a_y)\,\hat{k}$.

Example 1.17. Find the area of the triangle formed by vectors $\vec{a}$ and $\vec{b}$ as shown in the left diagram of figure 1.42.

Figure 1.42: The cross product and the area of a triangle

Solution: The area of the parallelogram formed by $\vec{a}$ and $\vec{b}$ is given by $ah = ab\sin\theta = |\vec{a} \times \vec{b}|$. Thus, the area of the triangle formed by $\vec{a}$ and $\vec{b}$ is $ah/2 = |\vec{a} \times \vec{b}|/2$. ∎

Example 1.18.
Use the cross product of two vectors to prove the sine rule of a triangle.

Solution:

Figure 1.43: A proof of the sine rule

The area of $\triangle ABC$ is $|\vec{a} \times \vec{c}|/2$. The adjacent triangle $ADB$ has area $|\vec{c} \times \vec{b}|/2$. Since the two triangles have the same area, $|\vec{a} \times \vec{c}| = |\vec{c} \times \vec{b}|$. We obtain $ca\sin B = bc\sin A$, and thus the result follows: $a/\sin A = b/\sin B$. ∎

Example 1.19. A particle is tracing a circular path about the origin with radius $r$ and angular speed $\omega$. Express its tangential velocity and centripetal acceleration in terms of $\vec{r}$ and $\vec{\omega}$. The direction of $\vec{\omega}$ is defined by the right-hand rule, which has your fingers curling along the direction of rotation and the thumb pointing in the direction of $\vec{\omega}$.

Solution:

Figure 1.44: The vectorial form of some physical quantities

Notice that the angular velocity $\vec{\omega}$ points out of the page. The tangential velocity is $\vec{v}_t = \vec{\omega} \times \vec{r}$ and its magnitude is $\omega r$. The centripetal acceleration is $\vec{a}_n = \vec{\omega} \times (\vec{\omega} \times \vec{r})$ and its magnitude is $\omega^2 r$. ∎

1.11 Triple Products

The scalar triple product of vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$ is defined as $\vec{a} \cdot (\vec{b} \times \vec{c})$. If the vectors are not coplanar, its magnitude is the volume of the parallelepiped formed by the three vectors. Therefore, $|\vec{a} \cdot (\vec{b} \times \vec{c})| = |\vec{b} \cdot (\vec{c} \times \vec{a})| = |\vec{c} \cdot (\vec{a} \times \vec{b})|$.

Figure 1.45: The scalar triple product of vectors

If $\vec{a} = a_x \hat{\imath} + a_y \hat{\jmath} + a_z \hat{k}$, $\vec{b} = b_x \hat{\imath} + b_y \hat{\jmath} + b_z \hat{k}$, and $\vec{c} = c_x \hat{\imath} + c_y \hat{\jmath} + c_z \hat{k}$, we obtain

$$\begin{aligned} \vec{a} \cdot (\vec{b} \times \vec{c}) &= (a_x \hat{\imath} + a_y \hat{\jmath} + a_z \hat{k}) \cdot [(b_y c_z - c_y b_z)\,\hat{\imath} - (b_x c_z - c_x b_z)\,\hat{\jmath} + (b_x c_y - c_x b_y)\,\hat{k}] \\ &= a_x (b_y c_z - c_y b_z) - a_y (b_x c_z - c_x b_z) + a_z (b_x c_y - c_x b_y). \end{aligned}$$

Equivalently, we can express the scalar triple product as a determinant,

$$\vec{a} \cdot (\vec{b} \times \vec{c}) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}.$$

The circular shift: $\vec{a} \cdot (\vec{b} \times \vec{c}) = \vec{b} \cdot (\vec{c} \times \vec{a}) = \vec{c} \cdot (\vec{a} \times \vec{b})$ (refer to D.II in section 7.4).

The vector triple product of vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$ is defined as $\vec{a} \times (\vec{b} \times \vec{c})$. It lies on the bc-plane and is given by

$$\vec{a} \times (\vec{b} \times \vec{c}) = \vec{b}\,(\vec{a} \cdot \vec{c}) - \vec{c}\,(\vec{a} \cdot \vec{b}).$$

The proof of the above relation is straightforward. Without loss of generality, we let $\vec{a} = (a_x, a_y, a_z)$, $\vec{b} = (b_x, 0, 0)$, and $\vec{c} = (c_x, c_y, 0)$. Plugging the vectors into $\vec{a} \times (\vec{b} \times \vec{c})$, we have

$$\begin{aligned} \vec{a} \times (\vec{b} \times \vec{c}) &= (a_x \hat{\imath} + a_y \hat{\jmath} + a_z \hat{k}) \times b_x c_y\,\hat{k} \\ &= a_y b_x c_y\,\hat{\imath} - a_x b_x c_y\,\hat{\jmath} \\ &= a_y b_x c_y\,\hat{\imath} + a_x b_x c_x\,\hat{\imath} - a_x b_x c_x\,\hat{\imath} - a_x b_x c_y\,\hat{\jmath} \\ &= (a_y c_y + a_x c_x)\,b_x\,\hat{\imath} - a_x b_x\,(c_x \hat{\imath} + c_y \hat{\jmath}) \\ &= \vec{b}\,(\vec{a} \cdot \vec{c}) - \vec{c}\,(\vec{a} \cdot \vec{b}). \end{aligned}$$

It is in the form $\lambda\vec{b} + \mu\vec{c}$, where $\lambda = \vec{a} \cdot \vec{c}$ and $\mu = -\vec{a} \cdot \vec{b}$. The result reminds us again that the vector triple product lies on the bc-plane. Similarly, we have $\vec{b} \times (\vec{c} \times \vec{a}) = \vec{c}\,(\vec{b} \cdot \vec{a}) - \vec{a}\,(\vec{b} \cdot \vec{c})$ and $\vec{c} \times (\vec{a} \times \vec{b}) = \vec{a}\,(\vec{c} \cdot \vec{b}) - \vec{b}\,(\vec{c} \cdot \vec{a})$.

1.12 Applications of Cross Product

1.12.1 Torque

The rotation of an object is produced by a torque $\vec{\tau}$ (the moment of force) which is exerted on the object about the axis of rotation. A torque is defined by the cross product of two vectors,

$$\vec{\tau} = \vec{r} \times \vec{F} = rF\sin\theta\,\hat{n},$$

where $\vec{F}$ is the applied force, $\vec{r}$ is the position vector directed from the axis of rotation to the point of action, and $\hat{n}$ is the unit vector. It is a vector quantity whose direction is generated by using the right-hand rule. The definition of torque makes sense because the radial component of $\vec{F}$ produces no effect on rotation.

Figure 1.46: A torque about the axis of rotation

Let's consider an example. A particle is connected to a massless rod which has its other end pivoted at $O$, as shown in figure 1.46. The length of the rod is $r$. The torque becomes zero if the force $\vec{F}$ is along the rod, i.e. $\theta = 0^\circ$ or $180^\circ$. The torque is a maximum if the force $\vec{F}$ is normal to the rod, i.e. $\theta = 90^\circ$. The direction of a torque (given by $\hat{n}$) indicates the direction of the rotation, either counterclockwise or clockwise. Notice that $\vec{\tau} = \vec{r} \times \vec{F} = r\,(F\sin\theta)\,\hat{n} = F\,(r\sin\theta)\,\hat{n}$.
In other words, we can consider the normal distance of $\vec{F}$ from $O$, i.e. the moment arm for rotation, $r\sin\theta$. If there are many forces acting on the particle, the net torque is

$$\vec{\tau}_{net} = \sum_i^n \vec{\tau}_i = \sum_i^n (\vec{r} \times \vec{F}_i) = \vec{r} \times \left(\sum_i^n \vec{F}_i\right) = \vec{r} \times \vec{F}_{net}.$$

1.12.2 Magnetic Force

There is another physical application which relies on the cross product. It is the magnetic force exerted on a moving charge. When a charge of $q$ coulombs travels with velocity $\vec{v}$ in a region of magnetic field $\vec{B}$, the magnetic force acting on the charge is

$$\vec{F} = q\,(\vec{v} \times \vec{B}).$$

Figure 1.47: The magnetic force exerted on a moving charge

It means that the charge experiences no magnetic force if it is stationary or if it is moving along the magnetic field. The force is a maximum if the charge travels perpendicular to the magnetic field (figure 1.47). The direction of the force points into the page if $q$ is positive. A neutral particle experiences no magnetic force in the magnetic region. As the magnetic force is always normal to the motion of the charge, there is no work done on the charge by the magnetic force. The kinetic energy of the charge remains unchanged when the charge performs uniform circular motion of radius $r$. The magnetic force provides the centripetal force (figure 1.48). If the velocity $v$ of the charge is perpendicular to the magnetic field $B$, then we have

$$F = qvB = \frac{mv^2}{r}, \qquad r = \frac{mv}{qB}.$$

Figure 1.48: The circular motion of a charge in the magnetic region

Chapter 2 Differentiation

2.1 Basic Ideas and the Extremum

The derivative of a function $y = f(x)$ is denoted by $y'$, $f'(x)$ or $dy/dx$, which represents the slope (gradient) of the function at $x$:

$$f'(x) = \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}.$$

A function $f(x)$ is said to be differentiable if $f'(x)$ exists. The slope of the curve at $P(x_0, y_0)$ is $f'(x_0)$, as shown in figure 2.1.

Figure 2.1: The derivative of a function and its tangent line

A differentiable function is also a continuous function. The geometrical properties of $f(x)$ can be shown by the derivatives of $f$.

A function $f(x)$ is said to be increasing in the interval $a < x < b$ if $f'(x) > 0$ for any $a < x < b$.

A function $f(x)$ is said to be decreasing in the interval $a < x < b$ if $f'(x) < 0$ for any $a < x < b$.

If a function $f(x)$ is twice differentiable at $x_0$ with $f'(x_0) = 0$ and $f''(x_0) < 0$, then it has a local maximum at $x_0$.

If a function $f(x)$ is twice differentiable at $x_0$ with $f'(x_0) = 0$ and $f''(x_0) > 0$, then it has a local minimum at $x_0$.

When computing derivatives, a useful relation that facilitates the calculations is the chain rule: if $f$ is a function of $u$, and $u$ is a function of $x$, then we have

$$\frac{df(u)}{dx} = \frac{df(u)}{du}\,\frac{du(x)}{dx}.$$

We can also write it as

$$\frac{d}{dx} f(u(x)) = [f'(u(x))]\,[u'(x)].$$

If $y = f(x)$, we have the reciprocal rule, i.e. $\dfrac{dy}{dx} = 1 \Big/ \dfrac{dx}{dy}$.

The total differential of a differentiable function $y = f(x)$ is $dy = f'(x)\,dx$, where the derivative $f'(x)$ links the change in $y$ to the change in $x$.

Example 2.1. Find the percentage error in the volume of a cube if there is a percentage error of 0.1 % in the measurement of its side length.

Solution: The volume of a cube of side $L$ is $V = L^3$, which gives $V'(L) = dV/dL = 3L^2$. Then, the total differential of $V$ is

$$dV = V'(L)\,dL = 3L^2\,dL.$$

Therefore, we obtain

$$\frac{dV}{V} = \frac{3L^2\,dL}{V} = 3\left(\frac{dL}{L}\right) = 3\left(\frac{0.1}{100}\right) = 0.003.$$

The percentage error of the volume is $\dfrac{dV}{V} \times 100\ \% = 0.3\ \%$. ∎

2.2 Derivatives of Physical Quantities

Derivatives are widely used to define physical quantities. Typical examples are the velocity and acceleration of an object.
A particle moving in a coordinate system with position vector $\vec{r}$ at time $t$ has its velocity $\vec{v}$ and acceleration $\vec{a}$ given by

$$\vec{v} = \lim_{\Delta t \to 0} \frac{\Delta\vec{r}}{\Delta t} = \frac{d\vec{r}}{dt} \qquad \text{and} \qquad \vec{a} = \lim_{\Delta t \to 0} \frac{\Delta\vec{v}}{\Delta t} = \frac{d\vec{v}}{dt} = \frac{d^2\vec{r}}{dt^2}.$$

If the motion is along a straight line, we simply have

$$v = \frac{dx}{dt} \qquad \text{and} \qquad a = \frac{dv}{dt} = \frac{d^2 x}{dt^2}.$$

More examples: The power is the work done per unit time, $P = dW/dt$. In an electric circuit, the current $I$ in the wire stands for $dq/dt$, i.e. $I$ is the rate at which charge flows in the wire. Besides time derivatives, some quantities are defined by derivatives with respect to other variables. For instance, the linear density of a non-uniform chain is represented by $\lambda = dm/dx$. In thermodynamics, the molar heat capacity of a gas at constant volume is the amount of heat $Q$ needed by one mole of gas to increase its temperature by one unit, $c_V = dQ/dT$.

Example 2.2. An object moving along a straight line has an acceleration $a$, where $a$ is the time derivative of velocity, i.e. $dv/dt$. Express $a$ without time appearing explicitly in your answer.

Solution: From the definition of $a$, we have

$$a = \frac{dv}{dt} = \frac{dv}{dx}\,\frac{dx}{dt} = \left(\frac{dv}{dx}\right) v,$$

where $v$ and $x$ are the velocity and displacement of the object at time $t$. There is another convention adopted by many textbooks: $\dot{x} = v = \dfrac{dx}{dt}$ and $\ddot{x} = a = \dfrac{d^2 x}{dt^2}$. The above result can be expressed as $\ddot{x} = \left(\dfrac{d\dot{x}}{dx}\right)\dot{x}$. It is a useful technique if one wants to relate $\dot{x}$ and $x$ in a differential equation. ∎

Example 2.3. A rod $AC$ has a mark $B$ on it such that $AB = l_1$ and $BC = l_2$. Suppose points $A$ and $B$ are mounted on the y-axis and x-axis respectively so that the rod can slide along the axes, as shown in figure 2.2. Show that $C$ moves on an elliptical locus. Find also the velocities of $B$ and $C$ when $A$ moves toward $O$ with a uniform speed $v_0$.

Figure 2.2: The movement of a rod under constraints

Solution: Referring to figure 2.2, we label the coordinates of $C$ as $(x_C, y_C)$, where

$$\begin{cases} x_C = l_1\cos\theta + l_2\cos\theta = (l_1 + l_2)\cos\theta \\ y_C = -l_2\sin\theta \end{cases}$$

Eliminating $\theta$ from the above equations, we obtain the locus of $C$. It is an ellipse:

$$\left(\frac{x_C}{l_1 + l_2}\right)^2 + \left(\frac{y_C}{l_2}\right)^2 = 1.$$

Now, the coordinates of $A$ are

$$\begin{cases} x_A = 0 \\ y_A = l_1\sin\theta \end{cases}$$

which gives $\dot{y}_A = -v_0 = l_1\dot{\theta}\cos\theta$ and thus

$$\dot{\theta} = \frac{-v_0}{l_1\cos\theta}. \qquad (2.1)$$

The coordinates of $B$ are

$$\begin{cases} x_B = l_1\cos\theta \\ y_B = 0 \end{cases}$$

which gives $\dot{x}_B = v_B = -l_1\dot{\theta}\sin\theta$. Using equation 2.1, we obtain $v_B = v_0\tan\theta$. Differentiating the coordinates of $C$ and using equation 2.1 again, we have the velocities of $C$ along the x- and y-axes respectively:

$$\begin{cases} \dot{x}_C = -(l_1 + l_2)\,\dot{\theta}\sin\theta = \dfrac{v_0\,(l_1 + l_2)\tan\theta}{l_1} \\[2mm] \dot{y}_C = -l_2\,\dot{\theta}\cos\theta = \dfrac{l_2 v_0}{l_1} \end{cases}$$

Physical quantities such as displacement, velocity, and acceleration are defined in the coordinate system as functions of time. This example applies simple ideas from coordinate geometry to solve the problem. ∎

2.3 Centripetal Acceleration

An object performing uniform circular motion of radius $r$ with speed $v$ has acceleration $a$, where $a$ is referred to as the centripetal acceleration. The magnitude of $a$ is $v^2/r$ and its direction always points toward the center of rotation. The proof is given as follows.

Figure 2.3: The uniform circular motion

Consider the object traveling from $P_1$ to $P_2$ along the circular path during a time interval $\Delta t$. Without loss of generality, we set $P_1$ and $P_2$ to be two points which are symmetric about the y-axis. At $P_1$ the velocity is $\vec{v}_1 = v\cos\theta\,\hat{\imath} + v\sin\theta\,\hat{\jmath}$, and at $P_2$ the velocity is $\vec{v}_2 = v\cos\theta\,\hat{\imath} - v\sin\theta\,\hat{\jmath}$. The time interval is $\Delta t = 2r\theta/v$.

The x-component of the average acceleration is

$$a_{ave,x} = \frac{v_{2x} - v_{1x}}{\Delta t} = 0$$

because the x-components of $\vec{v}_1$ and $\vec{v}_2$ are the same: $v_{1x} = v_{2x} = v\cos\theta$.
Hence, the x-component of the instantaneous acceleration is

$$a_x = \lim_{\Delta t \to 0} a_{ave,x} = 0.$$

The y-component of the average acceleration is

$$a_{ave,y} = \frac{v_{2y} - v_{1y}}{\Delta t} = \frac{-v\sin\theta - v\sin\theta}{2r\theta/v} = \frac{-2v\sin\theta}{2r\theta/v} = -\frac{v^2}{r}\left(\frac{\sin\theta}{\theta}\right).$$

Hence, the y-component of the instantaneous acceleration is

$$a_y = \lim_{\Delta t \to 0} a_{ave,y} = -\frac{v^2}{r}\left(\lim_{\theta \to 0} \frac{\sin\theta}{\theta}\right) = -\frac{v^2}{r},$$

where we have used the fact that $\lim_{\theta \to 0} \dfrac{\sin\theta}{\theta} = 1$; see section 2.8.

2.4 Seeking the Extremum

Differential calculus is an efficient tool for finding the local extrema of a function. A local extremum is also a turning point (stationary point), where the first order derivative of the function is zero. The second order derivative of the function helps to determine the nature of the turning point, such as a maximum point or a minimum point. Sometimes, we do not spend time working out the second order derivative for verification because the physical picture of the system naturally tells us the situation.

Example 2.4. Consider light passing from one medium with index of refraction $n_1$ into another medium with index of refraction $n_2$. Use Fermat's principle to derive the law of refraction: $n_1\sin\theta_1 = n_2\sin\theta_2$.

Fermat's Principle: Light travels by the path that takes the least amount of time.

Figure 2.4: The refraction of light

Solution: Consider a beam which passes through point $A$ in medium 1 and point $B$ in medium 2, where $A$ and $B$ have vertical distances $h_1$ and $h_2$ from the interface of the two media. Let $A_1$ and $B_1$ be the projections of $A$ and $B$ on the interface, and let $C$ be the point where the beam from $A$ meets the interface. Denote $A_1 B_1$ as $a$ and $A_1 C$ as $x$, where $0 \le x \le a$.
The total travelling time for paths $AC$ and $CB$ is

$$T(x) = \frac{AC}{v_1} + \frac{CB}{v_2} = \frac{\sqrt{h_1^2 + x^2}}{v_1} + \frac{\sqrt{h_2^2 + (a-x)^2}}{v_2}.$$

The first and second derivatives of $T$ with respect to $x$ are

$$T'(x) = \frac{1}{v_1} \cdot \frac{x}{\sqrt{h_1^2 + x^2}} - \frac{1}{v_2} \cdot \frac{a-x}{\sqrt{h_2^2 + (a-x)^2}},$$

$$T''(x) = \frac{1}{v_1} \cdot \frac{h_1^2}{(h_1^2 + x^2)^{3/2}} + \frac{1}{v_2} \cdot \frac{h_2^2}{[h_2^2 + (a-x)^2]^{3/2}} > 0.$$

The turning point of $T(x)$ is obtained by solving the equation $T'(x) = 0$. Since $T''(x)$ is positive, the turning point of $T(x)$ is a minimum, which is the minimum time stated in Fermat's principle. Let's work it out and set $T'(x) = 0$. Thus,

$$\frac{1}{v_1} \cdot \frac{x}{\sqrt{h_1^2 + x^2}} = \frac{1}{v_2} \cdot \frac{a-x}{\sqrt{h_2^2 + (a-x)^2}},$$

$$\frac{1}{v_1}\sin\theta_1 = \frac{1}{v_2}\sin\theta_2.$$

Multiplying both sides by the speed of light in free space, $c$, we obtain

$$\frac{c}{v_1}\sin\theta_1 = \frac{c}{v_2}\sin\theta_2,$$

$$n_1\sin\theta_1 = n_2\sin\theta_2.$$

The last equation is Snell's law for the refraction of light, where $n_i = c/v_i$ and $i = 1, 2$. ∎

Example 2.5. A uniform rod of length $2a$ is placed with its lower end inside a smooth bowl. The bowl is a hemispherical hollow of radius $a$ and it is fixed on a horizontal plane. Find the equilibrium position of the rod.

Figure 2.5: The rod in a bowl

Solution: Denote $G$ as the center of mass of the rod. The vertical distance of $G$ from the x-axis is $y$, where

$$y = AG\sin\theta = (AB - GB)\sin\theta = (2a\cos\theta - a)\sin\theta.$$

When the rod is at equilibrium in the bowl, it has the lowest gravitational potential energy. In other words, the vertical distance $y$ attains its maximum. The derivative of $y$ with respect to $\theta$ is

$$y' = 2a\,(\cos^2\theta - \sin^2\theta) - a\cos\theta = 4a\cos^2\theta - a\cos\theta - 2a.$$

The turning point of $y$ satisfies $y' = 0$, which gives $4a\cos^2\theta - a\cos\theta - 2a = 0$. The rod reaches its equilibrium when the angle of inclination is $\theta_0 = \cos^{-1}\left[(1 + \sqrt{33})/8\right] = 32.5^\circ$. One may check that $y''|_{\theta_0} < 0$, which indicates the maximum value of $y$ at $\theta = \theta_0$. ∎

Example 2.6.
A light cord of length $l$ has one of its ends connected to a particle of mass $m$, while the other end is fixed at a point $O$ on the ceiling. Initially, the cord is kept horizontal and the particle is at a distance $l$ from $O$ so that the cord is taut. The particle is then released to fall under gravity. Find the angle that the cord makes with the vertical when the particle attains its maximum vertical speed.

Figure 2.6: The swinging particle

Solution: The particle has zero vertical speed when it is located at the initial position and at the lowest position. It means that the vertical speed of the particle has a turning point while it is descending. Here, the turning point is the maximum vertical speed, reached when the cord makes an angle $\theta'$ with the vertical. The turning point occurs when the net vertical force exerted on the particle is zero, i.e. $F_y = 0$. Thus,

$$T\cos\theta' = mg. \qquad (2.2)$$

The conservation of mechanical energy gives

$$mgl\cos\theta' = \frac{1}{2}mv^2. \qquad (2.3)$$

The particle performs circular motion with radius $l$ because the net force along the cord provides the centripetal force:

$$T - mg\cos\theta' = \frac{mv^2}{l}. \qquad (2.4)$$

Eliminating $v$ from equations 2.3 and 2.4, we obtain $T = 3mg\cos\theta'$. Using this equation and equation 2.2 to eliminate $T$, we obtain $\cos\theta' = 1/\sqrt{3}$, which gives $\theta' = 54.7^\circ$. ∎

2.5 Case Study on Projectile Motion

Example 2.7. A ball is projected with speed $u$ and angle of elevation $\alpha$ from the floor. Find the condition on $\alpha$ such that the ball is always moving further away from the point of projection.

Figure 2.7: The projectile of a ball

Solution: Let the point of projection be the origin of the Cartesian coordinate system.
The x and y coordinates of the ball at time $t$ are

$$\begin{cases} x = (u\cos\alpha)\,t \\ y = (u\sin\alpha)\,t - \dfrac{1}{2}gt^2 \end{cases}$$

Hence, we have $r^2 = x^2 + y^2 = (u^2\cos^2\alpha)\,t^2 + (u^2\sin^2\alpha)\,t^2 + \dfrac{1}{4}g^2 t^4 - (u\sin\alpha)\,gt^3$, which implies

$$r^2 = u^2 t^2 + \frac{1}{4}g^2 t^4 - (u\sin\alpha)\,gt^3.$$

Differentiating both sides of the above equation with respect to $t$, we have

$$\frac{dr^2}{dt} = 2u^2 t + g^2 t^3 - 3t^2 ug\sin\alpha.$$

If the distance between the ball and the point of projection increases with time, $r(t)$ is an increasing function (or simply $r^2(t)$ is an increasing function). Then we can write $dr^2/dt > 0$. Dividing by $t > 0$, a quadratic inequality in $t$ follows: $g^2 t^2 - 3ugt\sin\alpha + 2u^2 > 0$. This inequality holds for all $t$ if the discriminant of the quadratic expression is less than zero, i.e. $\Delta < 0$. Then we have $9u^2 g^2\sin^2\alpha - 8u^2 g^2 < 0$. After solving, we obtain $\sin\alpha < 2\sqrt{2}/3$. In other words, the distance between the ball and the point of projection is always increasing if the angle of projection is less than $70.5^\circ$. ∎

2.6 A Revisit to Newton's Second Law

In high school physics, Newton's second law is presented in the simplest form $F = ma$, where $m$ is regarded as a point mass of constant value. In fact, the more general description of Newton's second law is $F = dp/dt$, where $p$ is the momentum of the mass system. Notice that $F = dp/dt$ reduces to $F = ma$ if $m$ is a point mass of constant value:

$$F = \frac{dp}{dt} = \frac{d(mv)}{dt} = m\left(\frac{dv}{dt}\right) = ma.$$

The following example deals with the motion of a long chain which has two portions, the moving part and the stationary part. The formula $F = ma$ is still applicable to the problem if one locates the center of mass of the chain before further calculations, but that approach is somewhat clumsy and time consuming. An easier approach is to adopt $F = dp/dt$.

Example 2.8. A uniform open-link chain of mass $\rho$ per unit length and total length $L$ has one of its ends fixed at the ceiling.
The free end of it is released from rest at $x = 0$ and it falls under gravity as shown in figure 2.8. Find the force $R$ that supports the fixed end. Express your answer in terms of $x$.

Figure 2.8: The falling chain

Solution: Newton's second law states that $F_{net} = \dfrac{dp}{dt}$, where $F_{net}$ is the net force acting on an object and $p$ is the momentum of the object. Recall that $p = mv$, where $m$ and $v$ are the mass and the velocity of the object respectively. In this problem $v = \dot{x}$. The right portion falls after it is released. The left portion has length $(L + x)/2$ and the right portion has length $(L - x)/2$ when the end point of the right portion is at a distance $x$ below the ceiling. Figure 2.9 shows the mass distribution of the chain on its two portions.

Notice that the momentum of the right portion is $\rho\dot{x}\,\dfrac{L - x}{2}$ and that of the left portion is zero. Since the right portion is under free fall, its velocity and acceleration are governed by $\dot{x}^2 = 2gx$ and $\ddot{x} = g$ respectively. Here we have taken downward as positive, as $x$ is measured downward from the ceiling.

Figure 2.9: The net force on the chain

Notice also that the net force on the entire chain is $\rho Lg - R$, where $R$ is the force exerted on the chain by the ceiling and $\rho gL$ is the weight of the chain. Hence, we have

$$\begin{aligned} -R + \rho Lg &= \frac{d}{dt}\left\{\left(\frac{L - x}{2}\right)\rho\dot{x}\right\} \\ &= \frac{\rho}{2}\left\{\ddot{x}\,(L - x) - \dot{x}^2\right\} \\ &= \frac{\rho}{2}\left\{g\,(L - x) - 2gx\right\}. \end{aligned}$$

Therefore, we obtain $R = \dfrac{\rho g}{2}(L + 3x)$.

When the entire chain is just unfolded, $x = L$, the force that supports the chain is $R = 2\rho gL$. ∎

2.7 Electric Potential and Electric Field

If a charged system is symmetric about the x-axis, the electric field $\vec{E}$ at a point $P$ on the x-axis points along the x-axis. The electric potential $V$ at $P$ is related to the electric field $\vec{E}$ by

$$\vec{E} = -\frac{dV}{dx}\,\hat{\imath}.$$

Notice that $V$ is the work done by an external agent to move a unit positive charge from infinity to point $P$ under the influence of the charged system.
The above expression provides an easier way to find the vector field through a scalar function. CHAPTER 2. DIFFERENTIATION 41 Example 2.9. A uniformly charged ring of radius r and total charge Q exerts an electric field around it. If the electric potential at a point P on the x-axis is given by V = Q 1 √ 2 4π0 r + x2 Find the electric field at point P . Figure 2.10: A uniformly charged ring Solution: Due to symmetry, the electric field at point P is directed along the x-axis and it is given by Ex = −dV /dx. Ex = − Q d 1 Q x (√ )= . 2 2 2 4π0 dx 4π0 (r + x2 )3/2 r +x Q x Thus, the electric field at P is E⃗ = î. One can check that 2 4π0 (r + x2 )3/2 Q the field strength converges to if x >> r. This result makes sense 4π0 x2 because the ring can be regarded as a point charge with charge Q when the measurement is performed very far away from the ring. ∎ 2.8 L’ Hôpital’s Rule Suppose both the functions f (x) and g(x) are differentiable near x = a and f (a) = g(a) = 0. Then, f (x) f ′ (a) = ′ . x→a g(x) g (a) lim Proof. We just need to recognize that f (x) − f (a) x→a x−a f ′ (a) = lim CHAPTER 2. DIFFERENTIATION 42 and similar for g ′ (a). Then, f (x) − f (a) f (x) = lim = lim lim x→a g(x) − g(a) x→a x→a g(x) The indeterminate forms: f (x)−f (a) x−a g(x)−g(a) x−a = f ′ (a) . g ′ (a) ∞ 0 and 0 ∞ 0 form 0 Suppose the functions f (x) and g(x) are differentiable near x = a and f (a) = g(a) = 0. Then, Rule 1: f (x) f ′ (x) = lim ′ x→a g(x) x→a g (x) lim ∞ form ∞ Suppose the functions f (x) and g(x) are differentiable near x = a and lim f (x) = lim g(x) = ∞. Then, Rule 2: x→a x→a f (x) f ′ (x) = lim ′ x→a g(x) x→a g (x) lim ex − 1 sin x and lim . x→0 x→0 x x sin x 0 Solution: The expression lim has the indeterminate form [ ]. Apply x→0 x 0 the L’ Hôpital’s rule, we have cos x sin x = lim = 1. lim x→0 x→0 x 1 ex − 1 0 Notice also that lim has the indeterminate form [ ]. L’ Hôpital’s rule x→0 x 0 gives ex − 1 ex lim = lim = 1 . x→0 x→0 1 x ∎ 1 + cos πx Example 2.11. Find lim 2 . 
x→1 x − 2x + 1 1 + cos πx Solution: Obviously, the expression lim 2 has the indeterminate x→1 x − 2x + 1 0 form [ ]. L’ Hôpital’s rule gives 0 1 + cos πx −π sin πx −π 2 cos πx π 2 lim 2 = lim = lim = . x→1 x − 2x + 1 x→1 2x − 2 x→1 2 2 ∎ Example 2.10. Find lim CHAPTER 2. DIFFERENTIATION 43 1 − cos x2 . x→0 sin2 x 1 − cos x2 0 Solution: The expression lim has the indeterminate form [ ]. 2 x→0 0 sin x Apply L’ Hôpital’s rule, we have Example 2.12. Find lim 1 − cos x2 2 x sin x2 x sin x2 = lim = (lim ) (lim ) = 1 ⋅ 0 = 0. x→0 x→0 2 sin x cos x x→0 sin x x→0 cos x sin2 x lim In the calculations, we do not apply L’ Hôpital’s rule twice though it works and gives correct answer. The approach is a bit clumsy. Instead of doing this way, we operate the limits of individual expressions if they exist, as shown above. ∎ Example 2.13. Find lim ( x→+∞ x+c x ) , where c = ln 2. x−c Solution: Notice that { x+c x x+c x+c x ( ) = eln( x−c ) = ex ln( x−c ) = e x−c x+c ) ln( x−c } 1 x 0 The expression in the curly bracket has the indeterminate form [ ] when 0 x+c x → +∞. Using the fact that ln ( ) = ln(x + c) − ln(x − c) and applying x−c L’ Hôpital’s rule, we have ln ( lim x→+∞ x+c 1 1 ) − 2c (x2 ) x−c = lim x + c x − c = lim 2 2 = 2c = 2 ln 2 = ln 4 x→+∞ x→+∞ x − c 1 1 − 2 x x x+c x Therefore, lim ( ) = 4, if c = ln 2. x→+∞ x − c ∎ Example 2.14. A particle of mass m is thrown vertically upward with velocity v0 in a resistive medium. It is found that the time for it to reach the maximum height is t= m kv0 ln (1 + ), k mg where k is a constant related to the strength of drag force and the resistance increases with k. Show that the above expression converges to the required time in the perfect case if k → 0. CHAPTER 2. DIFFERENTIATION 44 Solution: For an ideal medium, the time required is v0 /g. Let’s compute m kv0 lim ln (1 + ) by using L’ Hôpital’s rule. One can check that the exk→0 k mg 0 pression has the indeterminate form [ ]. 
0 m ln (1 + kv0 m ) = lim lim ln (1 + k→0 k→0 k mg k kv0 mg ) = lim v0 ( mg ) m kv 1+ mg0 k→0 1 mv0 v0 = k→0 mg + kv0 g = lim ∎ The proof is completed. 2.9 Taylor’s Series Assume f (x) is infinitely differentiable at a, then f (x) = f (a) + f ′ (a) (x − a) + 1 ′′ 1 f (a) (x − a)2 + f (3) (a) (x − a)3 2! 3! 1 (n) f (a) (x − a)n + ⋯ , (2.5) n! where f (n) is the n-th derivative of f (x). The above expression is called the Taylor’s series of f (x) about a reference point at x = a. +⋯ + Proof. Assuming that f (x) can be expanded in a power series in x − a, let f (x) = A0 + A1 (x − a) + A2 (x − a)2 + A3 (x − a)3 + ⋯ + An (x − a)n + ⋯ (2.6) Differentiating both sides with respect to x, successively n times, f ′ (x) f ′′ (x) f ′′′ (x) ⋮ f (n) = = = = = A1 + 2A2 (x − a) + 3A3 (x − a)2 + ⋯ + nAn (x − a)n−1 + ⋯ 1 ⋅ 2A2 + 2 ⋅ 3A3 (x − a) + ⋯ + n(n − 1)An (x − a)n−2 + ⋯ 1 ⋅ 2 ⋅ 3A3 + ⋯ + n(n − 1)(n − 2)An (x − a)n−3 + ⋯ ⋮ ⋮ n!An + terms in (x − a), etc. ⋯ Using x = a in these n + 1 equations, we have f (a) = A0 , f ′ (a) = A1 , f ′′ (a) = 2!A2 , f ′′′ (a) = 3!A3 , ⋯ , f (n) (a) = n!An , ⋯ Therefore, A0 = f (a), A1 = f ′ (a), A2 = f ′′′ (a) f (n) (a) f ′′ (a) , A3 = , ⋯ , An = , ⋯ 2! 3! n! CHAPTER 2. DIFFERENTIATION 45 Substituting these results into equation 2.6, we have f (x) = f (a) + f ′ (a) (x − a) + + ⋯+ 1 1 ′′ f (a) (x − a)2 + f (3) (a) (x − a)3 2! 3! 1 (n) f (a) (x − a)n + ⋯ n! There are many useful results produced by the Taylor’s series and they are widely used in physics. To approximate the answer, we can cut the long tails of the series and keep the first few terms. Taking the origin as the reference point, we have ex = 1 + x + x2 x3 + +⋯ 2! 3! ax = ex ln a = 1 + x ln a + (x ln a)2 (x ln a)3 + +⋯ 2! 3! x3 x5 x7 + − +⋯ 3! 5! 7! x 2 x4 x6 cos x = 1 − + − +⋯ 2! 4! 6! x2 x3 x4 + − +⋯ ln(1 + x) = x − 2 3 4 sin x = x − Binomial expansion (as an infinite series) p (p − 1) p−2 2 p (p − 1) (p − 2) p−3 3 a x + a x +⋯ 2! 3! 
\[
(a+x)^p = a^p + \binom{p}{1}a^{p-1}x + \binom{p}{2}a^{p-2}x^2 + \binom{p}{3}a^{p-3}x^3 + \cdots,
\]
where $p$ is real (not a positive integer) and $\displaystyle\binom{p}{r} = \frac{p(p-1)(p-2)\cdots(p-r+1)}{r!}$. The infinite series converges for $|x| < |a|$. In particular, when $a = 1$, we have
\begin{align*}
(1+x)^p &= 1 + p\,x + \frac{p(p-1)}{2!}x^2 + \frac{p(p-1)(p-2)}{3!}x^3 + \cdots\\
&= 1 + \binom{p}{1}x + \binom{p}{2}x^2 + \binom{p}{3}x^3 + \cdots.
\end{align*}
This is an important result for making approximations when $x$ is small, e.g. $(1+x)^{1/2} \approx 1 + x/2$.

Remark: If $p$ and $r$ are non-negative integers with $0 \le r \le p$, then the combinations formula is $\displaystyle\binom{p}{r} = \frac{p!}{r!\,(p-r)!}$. An alternative notation for $\binom{p}{r}$ is ${}^{p}C_{r}$. Obviously, $\binom{p}{r} = \binom{p}{p-r}$. We should note that $0! = 1$.

Example 2.15. Find the Taylor's series of $\ln(1-x)$ about $x = 0$.

Solution: The function vanishes at $x = 0$. Its derivatives there are
\[
\left.\frac{d}{dx}\ln(1-x)\right|_{x=0} = \left.-\frac{1}{1-x}\right|_{x=0} = -1
\]
and
\[
\left.\frac{d^2}{dx^2}\ln(1-x)\right|_{x=0} = \left.-\frac{1}{(1-x)^2}\right|_{x=0} = -1.
\]
Repeating the differentiation, we obtain
\[
\left.\frac{d^n}{dx^n}\ln(1-x)\right|_{x=0} = -(n-1)!\,.
\]
Hence, the Taylor's series is
\[
\ln(1-x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \cdots - \frac{x^n}{n} - \cdots.
\]
∎

Example 2.16. Find the Taylor's series of $e^{\sin x}$ about $x = 0$.

Solution: The Taylor's series of $e^x$ and $\sin x$ about $x = 0$ are
\begin{align*}
e^x &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\\
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots
\end{align*}
For convenience, $e^x$ is sometimes written as $\exp(x)$. The Taylor's series of $e^{\sin x}$ about $x = 0$ is
\begin{align*}
\exp\left(x - \frac{x^3}{3!} + \cdots\right)
&= 1 + \left(x - \frac{x^3}{3!} + \cdots\right) + \frac{1}{2!}\left(x - \frac{x^3}{3!} + \cdots\right)^2 + \frac{1}{3!}\left(x - \frac{x^3}{3!} + \cdots\right)^3 + \cdots\\
&= 1 + \left(x - \frac{x^3}{3!} + \cdots\right) + \frac{1}{2!}\left(x^2 - \frac{2x^4}{3!} + \cdots\right) + \frac{1}{3!}\left(x^3 - \frac{3x^5}{3!} + \cdots\right) + \cdots\\
&= 1 + x + \frac{x^2}{2} + \cdots
\end{align*}
where we ignore $x^4$ and higher order terms. (Incidentally, the $x^3$ term also vanishes.) Be careful about how high an order you have to keep: if a Taylor's series of order $n$ is required, all terms up to and including order $n$ must be kept in the intermediate calculation. ∎

Example 2.17. Not every function has a Taylor's series that converges to the function itself. Give an example to illustrate this.
Figure 2.11: A smooth function

Solution: Define a function
\[
f(x) = \begin{cases} \exp\left(-\dfrac{1}{x}\right) & \text{if } x > 0,\\[2mm] 0 & \text{if } x \le 0. \end{cases}
\]
The graph of the function is shown in Fig. 2.11. For $x > 0$,
\[
f'(x) = \frac{\exp(-1/x)}{x^2},
\]
and $f^{(n)}(x) = \exp(-1/x)\,P_n(1/x)$, where $P_n(1/x)$ is a polynomial in $1/x$. This can be proved by a simple induction. Then, by the fact that
\[
\lim_{x\to 0^+} f^{(n)}(x) = 0, \tag{2.7}
\]
the function is infinitely differentiable at $x = 0$. Its Taylor's series at $x = 0$ is identically zero, which is not equal to the function itself.

The proof of equation 2.7 is given as follows. We notice that
\[
\lim_{y\to+\infty} e^{-y}y^m = 0
\]
for every positive integer $m$. We say that the exponential grows faster than any power. The proof can be completed by applying L'Hôpital's rule repeatedly,
\[
\lim_{y\to+\infty} e^{-y}y^m = \lim_{y\to+\infty}\frac{y^m}{e^y} = \lim_{y\to+\infty}\frac{m\,y^{m-1}}{e^y} = \lim_{y\to+\infty}\frac{m(m-1)\,y^{m-2}}{e^y} = \cdots = \lim_{y\to+\infty}\frac{m!}{e^y} = 0.
\]
∎

Example 2.18. Two massless springs, each with force constant $k$ and unstretched length $l_0$, are connected in a straight line as shown in the figure. Find an expression for the work done by a force which moves the point of attachment, i.e. the knot, between the two springs a perpendicular distance $x$ from the equilibrium point. Hence, show that the work done for such a movement is given by $\dfrac{kx^4}{4\,l_0^2}$ when $x \ll l_0$.

Figure 2.12: A two-spring system

Solution: When the vertical displacement of the knot is $x$, the extension of each spring is $e = \sqrt{l_0^2+x^2} - l_0$. The tension in each spring is $T = ke$. With $\sin\theta = x/\sqrt{l_0^2+x^2}$, the force exerted by an external agent to displace the knot by $x$ from its equilibrium is
\[
F = 2T\sin\theta = 2ke\sin\theta = 2k\left(\sqrt{l_0^2+x^2} - l_0\right)\frac{x}{\sqrt{l_0^2+x^2}} = 2kx - \frac{2kl_0x}{\sqrt{l_0^2+x^2}}.
\]

Figure 2.13: A two-spring system

Hence, the work done by the force for such a displacement is
\begin{align*}
W &= \int_0^x F\,dx\\
&= \int_0^x \left(2kx - \frac{2kl_0x}{\sqrt{l_0^2+x^2}}\right)dx\\
&= kx^2 - 2kl_0\sqrt{l_0^2+x^2} + 2kl_0^2.
\end{align*}
The binomial expansion of $\sqrt{l_0^2+x^2} = l_0\left(1 + \left(\dfrac{x}{l_0}\right)^2\right)^{1/2}$ gives
\[
\sqrt{l_0^2+x^2} = l_0\left(1 + \frac{1}{2}\frac{x^2}{l_0^2} - \frac{1}{8}\frac{x^4}{l_0^4} + \cdots\right).
\]
When $x \ll l_0$ we neglect the terms of order higher than $x^4$. Therefore, the work done by the external force is
\begin{align*}
W &= kx^2 - 2kl_0\left\{l_0\left(1 + \frac{1}{2}\frac{x^2}{l_0^2} - \frac{1}{8}\frac{x^4}{l_0^4}\right)\right\} + 2kl_0^2\\
&= \frac{kx^4}{4\,l_0^2}.
\end{align*}
∎

Example 2.19. An electric dipole consists of two equal and opposite charges ($\pm q$) separated by a distance $s$. Show that the approximate potential at a point $P$ far away is given by
\[
\frac{1}{4\pi\epsilon_0}\frac{qs\cos\theta}{r^2},
\]
where $r$ is the distance measured from $P$ to the mid-point of the dipole and $\theta$ is the angle between the dipole and the line joining $P$ and the mid-point of the dipole.

Solution: The potential due to the dipole is $V(P) = \dfrac{1}{4\pi\epsilon_0}\left(\dfrac{q}{r_+} - \dfrac{q}{r_-}\right)$, where
\[
\begin{cases}
r_+^2 = r^2 + \left(\dfrac{s}{2}\right)^2 - rs\cos\theta = r^2\left(1 - \dfrac{s}{r}\cos\theta + \dfrac{s^2}{4r^2}\right)\\[3mm]
r_-^2 = r^2 + \left(\dfrac{s}{2}\right)^2 + rs\cos\theta = r^2\left(1 + \dfrac{s}{r}\cos\theta + \dfrac{s^2}{4r^2}\right)
\end{cases}
\]

Figure 2.14: The electric dipole

When $P$ is far away from the dipole, we have $r \gg s$. The higher order terms in the above expressions are negligible. Thus
\[
\begin{cases}
r_+^2 \approx r^2\left(1 - \dfrac{s}{r}\cos\theta\right)\\[3mm]
r_-^2 \approx r^2\left(1 + \dfrac{s}{r}\cos\theta\right)
\end{cases}
\]
Their binomial expansions are
\[
\begin{cases}
\dfrac{1}{r_+} \approx \dfrac{1}{r}\left(1 - \dfrac{s}{r}\cos\theta\right)^{-1/2} \approx \dfrac{1}{r}\left(1 + \dfrac{s}{2r}\cos\theta\right)\\[3mm]
\dfrac{1}{r_-} \approx \dfrac{1}{r}\left(1 + \dfrac{s}{r}\cos\theta\right)^{-1/2} \approx \dfrac{1}{r}\left(1 - \dfrac{s}{2r}\cos\theta\right)
\end{cases}
\]
Therefore, we have
\[
\frac{1}{r_+} - \frac{1}{r_-} \approx \frac{s}{r^2}\cos\theta
\]
and hence
\[
V(P) \approx \frac{1}{4\pi\epsilon_0}\frac{qs\cos\theta}{r^2}.
\]
∎

2.10 Newton's Method

If we can calculate the derivative of a function, we may be able to find the roots of the function. Newton's method provides an effective approach for doing so. The procedure is as follows.

Figure 2.15: The idea of Newton's method

We take an initial guess $x = x_0$ that is close to one of the roots of $f(x) = 0$. In figure 2.15, the root of $f(x) = 0$ is located at $x = a$. The equation of the tangent line at $(x_0, f(x_0))$ is
\[
y - f(x_0) = f'(x_0)\,(x - x_0).
\]
Putting $y = 0$, we obtain the x-intercept of the tangent line:
\[
x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.
\]
We can repeat this step and obtain the recurrence relation
\[
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \tag{2.8}
\]
where $n = 0, 1, 2, \ldots$. We expect the sequence $x_n$ to converge to the exact root. If the initial guess is far from the root, we should avoid the possibility that $f'(x_0) = 0$ or is approximately zero, otherwise the method will fail to give the answer. Figures 2.16 and 2.17 indicate the undesirable outcomes due to a poor initial guess. Lastly, Newton's method becomes unreliable near a multiple (repeated) root of $f(x) = 0$, because $f'(x)$ also vanishes there and the convergence slows down markedly.

Figure 2.16: The initial guess is far away from the root and has $f'(x_0) \approx 0$

Figure 2.17: The initial guess is far away from the root

Example 2.20. Solve $x^5 + x^3 - 1 = 0$ by Newton's method with an initial guess $x = 0.8$.

Solution: We consider the function $f(x) = x^5 + x^3 - 1$, for which $f'(x) = 5x^4 + 3x^2$. Hence, we define
\[
x_{n+1} = x_n - \frac{x_n^5 + x_n^3 - 1}{5x_n^4 + 3x_n^2}.
\]

Figure 2.18: A plot of $y = f(x)$

Taking $x_0 = 0.8$, we have

  n    x_n            f(x_n)         f'(x_n)
  0    0.8            -0.16032       3.968
  1    0.840403226    0.012774638    4.61297286
  2    0.837633941    0.000064684    4.566319479
  3    0.837619775    ***            ***

One of the roots of $f(x) = 0$ is 0.8376. ∎

Example 2.21. Solve $(x-3)\,e^x + 3 = 0$ by Newton's method with an initial guess $x = 2.8$.

Solution: Obviously, $x = 0$ is a root of the equation. Now, we proceed to find the next root. Consider the function $f(x) = (x-3)\,e^x + 3$, for which
\[
f'(x) = e^x + (x-3)\,e^x = (x-2)\,e^x.
\]
We define
\[
x_{n+1} = x_n - \frac{(x_n-3)\,e^{x_n} + 3}{(x_n-2)\,e^{x_n}}.
\]

Figure 2.19: A plot of $y = f(x)$

Taking $x_0 = 2.8$, we have

  n    x_n            f(x_n)          f'(x_n)
  0    2.8            -0.288929354    13.155717417
  1    2.821962265    0.007220640     13.817024262
  2    2.821439675    0.000004181     13.801025461
  3    2.821439372    ***             ***

One of the roots of $f(x) = 0$ is 2.8214. ∎
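The recurrence relation (2.8) translates directly into a short loop. The following is a minimal Python sketch, not part of the original notes: the function name `newton`, the stopping tolerance `tol` and the iteration cap `max_iter` are illustrative assumptions. It reruns Examples 2.20 and 2.21 from the same initial guesses.

```python
import math

def newton(f, fprime, x0, tol=1e-9, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n), equation (2.8), starting from x0.

    tol and max_iter are illustrative choices, not values from the notes.
    """
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0.0:
            # The tangent line is horizontal and has no x-intercept.
            raise ZeroDivisionError("f'(x) vanished; choose another initial guess")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Example 2.20: x^5 + x^3 - 1 = 0 with x0 = 0.8
root_a = newton(lambda x: x**5 + x**3 - 1,
                lambda x: 5 * x**4 + 3 * x**2, 0.8)

# Example 2.21: (x - 3) e^x + 3 = 0 with x0 = 2.8
root_b = newton(lambda x: (x - 3) * math.exp(x) + 3,
                lambda x: (x - 2) * math.exp(x), 2.8)

print(round(root_a, 4), round(root_b, 4))
```

Rounded to four decimal places, the two roots should agree with the tabulated values 0.8376 and 2.8214. The explicit check for a vanishing slope mirrors the warning about initial guesses with $f'(x_0) \approx 0$.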
2.11 Useful Differentiation Formulae

  f(x)        f'(x)
  c           0
  cx          c
  cx^n        cnx^{n-1}
  sin x       cos x
  cos x       -sin x
  tan x       sec^2 x
  sec x       sec x tan x
  csc x       -csc x cot x
  cot x       -csc^2 x
  e^x         e^x
  ln x        1/x
  a^x         a^x ln a
  u ± v       u' ± v'
  uv          uv' + vu'
  u/v         (vu' - uv')/v^2

Chain Rule: $\dfrac{dy}{dx} = \dfrac{dy}{du}\,\dfrac{du}{dx}$

Reciprocal Rule: $\dfrac{dy}{dx} = \dfrac{1}{\;\dfrac{dx}{dy}\;}$

2.12 Appendix: Method of Bisection

The method of bisection provides the simplest way to find a root of an equation. Let $f(x)$ be a continuous function of $x$. Without loss of generality, let $a_0 < b_0$, $f(a_0) < 0$, and $f(b_0) > 0$; then a real root lies in the interval $(a_0, b_0)$. Now, check the sign of the function at the mid-point of $a_0$ and $b_0$. If $f\left(\dfrac{a_0+b_0}{2}\right) = 0$, we have found the root. If $f\left(\dfrac{a_0+b_0}{2}\right) < 0$, we label the mid-point as $a_1$ and set $b_1 = b_0$. If $f\left(\dfrac{a_0+b_0}{2}\right) > 0$, we label the mid-point as $b_1$ and set $a_1 = a_0$. Iterating the process $n$ times gives an approximation of the root.

Example 2.22. Let $f(x) = x^3 - 2x - 5$. One can check that a root of $f(x) = 0$ lies in the interval $(2, 3)$. Locate the root by using the method of bisection.

Solution: Consider two points on the x-axis, $x = 2$ and $x = 3$. We note that $f(2) = -1 < 0$ and $f(3) = 16 > 0$, so a real root lies in the interval $(2, 3)$. The following table shows the signs of $f(x)$ on the two sides of the root.

  Sign of f(x)              The interval that contains a real root of f(x) = 0
  f(2) < 0, f(3) > 0        (a0, b0) = (2, 3)
  f(2.5) > 0                (a1, b1) = (2, 2.5)
  f(2.25) > 0               (a2, b2) = (2, 2.25)
  f(2.125) > 0              (a3, b3) = (2, 2.125)
  f(2.0625) < 0             (a4, b4) = (2.0625, 2.125)
  f(2.09375) < 0            (a5, b5) = (2.09375, 2.125)
  f(2.109375) > 0           (a6, b6) = (2.09375, 2.109375)

If we terminate the process after six bisections, we obtain the approximate value of the root as the mid-point of the last interval. It is 2.1015625.
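The bisection procedure can be sketched in a few lines of Python. This is an illustrative sketch only: the helper name `bisect_interval` is hypothetical, and it assumes, as in the text, that $f(a_0) < 0 < f(b_0)$. It repeats the six bisections of Example 2.22.

```python
def bisect_interval(f, a, b, n_steps):
    """Halve the bracketing interval (a, b) n_steps times.

    Assumes f(a) < 0 < f(b), as in the text. Returns the final interval.
    """
    for _ in range(n_steps):
        mid = (a + b) / 2
        value = f(mid)
        if value == 0:
            return mid, mid      # the mid-point is exactly the root
        if value < 0:
            a = mid              # root lies in the right half
        else:
            b = mid              # root lies in the left half
    return a, b

# Example 2.22: f(x) = x^3 - 2x - 5 on (2, 3), six bisections
a6, b6 = bisect_interval(lambda x: x**3 - 2 * x - 5, 2.0, 3.0, 6)
estimate = (a6 + b6) / 2
print(a6, b6, estimate)
```

The final interval should match $(a_6, b_6)$ in the table, and its mid-point is the quoted estimate. Each bisection halves the interval, so the uncertainty after $n$ steps is $(b_0 - a_0)/2^n$, which is much slower than Newton's method but cannot fail for a continuous function with a sign change.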
Figure 2.20: A plot of $y = f(x)$

Figure 2.21: A plot of $y = f(x)$ near the root

∎

Chapter 3 Integration

3.1 Indefinite Integration

As a first understanding, integration can be treated as the inverse of differentiation. We denote it by
\[
F(x) = \int f(x)\,dx.
\]
Because the derivative of a constant is zero, the integral of a function is not unique. It is determined only up to a constant of integration, hence the adjective "indefinite".

Example 3.1. Find $\int x\,dx$, $\int x^n\,dx$ and $\int \dfrac{1}{x}\,dx$.

Solution:
\[
\int x\,dx = \frac{1}{2}x^2 + \text{const.}
\]
We will usually denote the constant of integration by $C$. Note that the constants could be different in different equations.
\[
\int x^n\,dx = \frac{1}{n+1}x^{n+1} + C \quad\text{if } n \ne -1, \text{ and}\qquad \int \frac{1}{x}\,dx = \ln x + C \quad (x > 0).
\]

Example 3.2. A particle with initial velocity $u$ moves along a straight line. If the acceleration of the particle is a constant $a$, find the velocity and the displacement of the particle after time $t$.

Solution: The acceleration $a$ is a constant, and in calculus notation $a = \dfrac{dv}{dt}$, where $v$ is the velocity. We integrate both sides with respect to $t$ and obtain
\begin{align*}
\int a\,dt &= \int\left(\frac{dv}{dt}\right)dt\\
a\int dt &= \int dv\\
at &= v + C,
\end{align*}
where $C$ is an arbitrary constant to be determined by the initial condition. Putting $t = 0$ and $v = u$, we obtain $C = -u$. Hence, $v = u + at$.

Let $s$ be the displacement and notice again that in calculus notation $v = \dfrac{ds}{dt}$. Then, $\dfrac{ds}{dt} = u + at$. Integrating both sides with respect to $t$, we have
\begin{align*}
\int\left(\frac{ds}{dt}\right)dt &= \int (u + at)\,dt\\
\int ds &= \int (u + at)\,dt\\
s &= ut + \frac{1}{2}at^2 + C',
\end{align*}
where $C'$ is an arbitrary constant to be determined by the initial condition. Putting $t = 0$ and $s = 0$, we obtain $C' = 0$. Hence, $s = ut + \dfrac{1}{2}at^2$. ∎

3.1.1 Integration by Substitution

There are many tricks for finding the integral of a function, but they do not always work. Very often, we can prove that the integral exists but we do not have a simple analytic expression for it. One very useful trick is substitution, which simplifies our work to a great extent.

Example 3.3. Find $\int (2x+8)^3\,dx$.
Solution: Put $y = 2x + 8$ and we obtain $dy = 2\,dx$.
\begin{align*}
\int (2x+8)^3\,dx &= \frac{1}{2}\int y^3\,dy\\
&= \frac{1}{8}y^4 + C\\
&= \frac{1}{8}(2x+8)^4 + C\\
&= 2\,(x+4)^4 + C
\end{align*}
One can differentiate the result to verify that the integration is correct. For such a simple substitution, we seldom define and write down the function $y$ explicitly; instead we write
\begin{align*}
\int (2x+8)^3\,dx &= \frac{1}{2}\int (2x+8)^3\,d(2x+8)\\
&= \frac{1}{2}\,\frac{(2x+8)^4}{4} + C\\
&= 2\,(x+4)^4 + C
\end{align*}
∎

Example 3.4. Find $\int \dfrac{1}{\sqrt{1-x^2}}\,dx$.

Solution: We substitute $x = \sin\theta$ and obtain $dx = \cos\theta\,d\theta$.
\begin{align*}
\int \frac{1}{\sqrt{1-x^2}}\,dx &= \int \frac{1}{\sqrt{1-\sin^2\theta}}\cos\theta\,d\theta\\
&= \int \frac{1}{\cos\theta}\cos\theta\,d\theta\\
&= \theta + C\\
&= \sin^{-1}x + C
\end{align*}
∎

3.1.2 Integration using Partial Fraction

For rational functions, partial fractions should be the first thing to try. We start by dividing out terms so that the degree of the polynomial in the numerator is less than that of the denominator. Then, we factorize the denominator, if possible.

Example 3.5. Find $\int \dfrac{x^2+8x-3}{x^2+5x+4}\,dx$.

Solution:
\[
\int \frac{x^2+8x-3}{x^2+5x+4}\,dx = \int \left(1 + \frac{3x-7}{x^2+5x+4}\right)dx
\]
The latter term can be treated by partial fractions. Since $x^2+5x+4 = (x+1)(x+4)$, we try
\begin{align*}
\frac{3x-7}{x^2+5x+4} &= \frac{A}{x+1} + \frac{B}{x+4}\\
3x-7 &= A(x+4) + B(x+1)\\
3x-7 &= (A+B)x + 4A + B
\end{align*}
Comparing coefficients, $A + B = 3$ and $4A + B = -7$, so $A = -10/3$ and $B = 19/3$. Then
\begin{align*}
\int \left(1 + \frac{3x-7}{x^2+5x+4}\right)dx &= x + \int \left(\frac{-10/3}{x+1} + \frac{19/3}{x+4}\right)dx\\
&= x - \frac{10}{3}\ln(x+1) + \frac{19}{3}\ln(x+4) + \text{const.}
\end{align*}
∎

3.1.3 Integration by Parts

Consider the product rule
\[
\frac{d(uv)}{dx} = \frac{du}{dx}\,v + u\,\frac{dv}{dx}.
\]
If we integrate both sides of the equation with respect to $x$, we have
\[
uv = \int u'v\,dx + \int uv'\,dx, \quad\text{or}\quad uv = \int v\,du + \int u\,dv.
\]
Rearranging the expression, we obtain
\[
\int u\,dv = uv - \int v\,du.
\]
This is called integration by parts. In an actual calculation, one has to recognize which part is $u$ and which part is $v$, which is usually not straightforward.

Example 3.6. Find $\int x\sin x\,dx$.

Solution: If we take $v = x^2/2$, we have $dv = x\,dx$ and
\begin{align*}
\int x\sin x\,dx &= \int \sin x\;d\left(\frac{x^2}{2}\right)\\
&= \frac{x^2}{2}\sin x - \int \frac{x^2}{2}\cos x\,dx
\end{align*}
This does not seem to go anywhere.
Instead, if we take $v = -\cos x$, we have $dv = \sin x\,dx$ and
\begin{align*}
\int x\sin x\,dx &= -\int x\;d\cos x\\
&= -x\cos x + \int \cos x\,dx\\
&= -x\cos x + \sin x + C
\end{align*}
∎

3.2 Definite Integration

How do we calculate the area under a curve between $x = a$ and $x = b$ (Fig. 3.1)? One way to get an approximate answer is to divide the interval into $N$ small intervals. Define
\[
x_n = a + \frac{b-a}{N}\,n,
\]
where $n = 0, 1, 2, \ldots, N$. The area of the rectangle between $x_n$ and $x_{n+1}$ is approximately $f(x_n^*)(x_{n+1} - x_n)$, where we choose a point $x_n \le x_n^* \le x_{n+1}$. The area under the curve is about
\[
S_N = \sum_{n=0}^{N-1} f(x_n^*)(x_{n+1} - x_n).
\]
The limit $\lim_{N\to\infty} S_N$, if it exists, is the area under the curve. In the formal definition, we allow small intervals of any length and any point $x_n^*$ inside each small interval.

Figure 3.1: The area under a curve

Definition 3.7. Let $f(x)$ be a function defined in the interval $a \le x \le b$. A partition of the interval is $a = x_0 < x_1 < \cdots < x_N = b$. Denote by
\[
\delta = \max_i\,(x_{i+1} - x_i)
\]
the maximum size of the small intervals. Choose a point $x_n \le x_n^* \le x_{n+1}$ in each small interval. Define the Riemann sum as
\[
S_N = \sum_{n=0}^{N-1} f(x_n^*)(x_{n+1} - x_n).
\]
If the limit $\lim_{\delta\to 0} S_N$ exists, independent of how we choose the partition (as long as the maximum size goes to zero) and how we choose the points inside the small intervals, then we say that the function is Riemann integrable on the interval from $a$ to $b$, and call the limit the definite integral of the function,
\[
\int_a^b f(x)\,dx. \tag{3.1}
\]

Example 3.8. If the function is a constant $f(x) = k$, then the Riemann sum is
\begin{align*}
S_N &= \sum_{n=0}^{N-1} f(x_n^*)\,(x_{n+1} - x_n)\\
&= k\sum_{n=0}^{N-1} (x_{n+1} - x_n)\\
&= k\,(b-a)
\end{align*}
Hence,
\[
\int_a^b k\,dx = k\,(b-a).
\]
We also note that $\int_a^b k\,dx$ is a function of $b$ and $a$, with
\[
\frac{d}{db}\left(\int_a^b k\,dx\right) = \frac{d}{db}\bigl(k\,(b-a)\bigr) = k.
\]
∎

A remark on the dummy variable: In Eq. (3.1), there is the notation $x$, but its significance is just to indicate the variable of the function.
The definite integral of a function is a number, so
\[
\int_a^b f(x)\,dx = \int_a^b f(y)\,dy = \int_a^b f(t)\,dt = \int_a^b f(\alpha)\,d\alpha.
\]
The variable inside a definite integral is called a dummy variable.

3.2.1 Fundamental Theorem of Calculus

Part 1 of the Theorem: It shows that the integral is an antiderivative of a function. If $f$ is a continuous function, define the function
\[
F(x) = \int_a^x f(t)\,dt,
\]
and note that its independent variable is the upper limit of a definite integral. Then, $F(x)$ is differentiable and
\[
\frac{dF(x)}{dx} = f(x),
\]
which is the value of $f$ at the upper limit $x$.

Proof.
\begin{align*}
\frac{dF(x)}{dx} &= \lim_{h\to 0}\frac{1}{h}\left(\int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt\right)\\
&= \lim_{h\to 0}\frac{1}{h}\int_x^{x+h} f(t)\,dt
\end{align*}
By the definition of the Riemann integral, for $h > 0$ we have
\[
h\min_{x\le x^*\le x+h} f(x^*) \;\le\; \int_x^{x+h} f(t)\,dt \;\le\; h\max_{x\le x^*\le x+h} f(x^*).
\]
When $h\to 0$, both $\min_{x\le x^*\le x+h} f(x^*)$ and $\max_{x\le x^*\le x+h} f(x^*)$ tend to $f(x)$ because $f$ is continuous. As a result,
\[
f(x) \le \lim_{h\to 0}\frac{1}{h}\int_x^{x+h} f(t)\,dt \le f(x)
\]
and the theorem is proved.

Part 2 of the Theorem: It gives a very useful result for evaluating definite integrals. A direct consequence of the above result is that
\[
\int_a^b f(t)\,dt = G(b) - G(a) \tag{3.2}
\]
for any function $G$ with $\dfrac{dG(x)}{dx} = f(x)$.

Proof. Let $F(x) = \int_a^x f(t)\,dt$; then we have $F'(x) = f(x) = G'(x)$. That means $F(x) = G(x) + k$ for some constant $k$. Obviously, $k = -G(a)$ because $F(a) = 0$. Thus, we obtain
\[
\int_a^b f(t)\,dt = F(b) = G(b) - G(a).
\]
Every differentiable function is integrable, and many more functions are integrable than differentiable.

Example 3.9. Find the area under the curve of $y = \sin x$ from 0 to $\pi$.

Solution:
\begin{align*}
\int_0^\pi \sin x\,dx &= -\cos x\,\Big|_0^\pi\\
&= -(\cos\pi - \cos 0)\\
&= 2
\end{align*}

Example 3.10. A particle with initial velocity $u$ moves along a straight line. If the acceleration of the particle is a constant $a$, find the velocity and the displacement of the particle after time $t$. This example is the same as example 3.2, but we work out the answers with definite integrals.
Solution: The acceleration $a$ is a constant, and in calculus notation $a = \dfrac{dv}{dt}$, where $v$ is the velocity. We integrate both sides with respect to $t$ and obtain
\begin{align*}
\int_0^t a\,dt &= \int_0^t \left(\frac{dv}{dt}\right)dt\\
a\int_0^t dt &= \int_u^v dv\\
at &= v - u\\
v &= u + at
\end{align*}
Let $s$ be the displacement and notice again that in calculus notation $v = \dfrac{ds}{dt}$. Then, $\dfrac{ds}{dt} = u + at$. Integrating both sides with respect to $t$, we have
\begin{align*}
\int_0^t \left(\frac{ds}{dt}\right)dt &= \int_0^t (u + at)\,dt\\
\int_0^s ds &= \int_0^t (u + at)\,dt\\
s &= ut + \frac{1}{2}at^2
\end{align*}
One should notice that the area under the $vt$-graph is the displacement of the particle because $s = \int_0^t v\,dt$. ∎

Example 3.11. A particle moves on the xy-plane with velocity
\[
\vec{v} = -a\omega\sin\omega t\;\hat{\imath} + b\omega\cos\omega t\;\hat{\jmath},
\]
where $a$, $b$ and $\omega$ are constants and $t$ is the time. The initial position of the particle is at $a\,\hat{\imath}$. Find the position vector of the particle at time $t$ and deduce that the locus of the particle is an ellipse. Show also that the acceleration of the particle is directed towards the origin.

Solution: Since $\vec{v} = \dfrac{d\vec{r}}{dt}$, we have $\int_0^t \vec{v}\,dt = \int_{a\hat{\imath}}^{\vec{r}} d\vec{r}$. Some books adopt the dummy variable $\vec{r}\,'$ and write $\int_0^t \vec{v}\,dt = \int_{a\hat{\imath}}^{\vec{r}} d\vec{r}\,'$ so that the upper limit $\vec{r}$ in the right integral does not have the same name as the variable of integration. Then
\begin{align*}
\int_0^t \left(-a\omega\sin\omega t\;\hat{\imath} + b\omega\cos\omega t\;\hat{\jmath}\right)dt &= \vec{r}\,'\,\Big|_{a\hat{\imath}}^{\vec{r}}\\
\left(a\cos\omega t\;\hat{\imath} + b\sin\omega t\;\hat{\jmath}\right)\Big|_0^t &= \vec{r} - a\,\hat{\imath}\\
a\cos\omega t\;\hat{\imath} - a\,\hat{\imath} + b\sin\omega t\;\hat{\jmath} &= \vec{r} - a\,\hat{\imath}
\end{align*}
Therefore, we obtain $\vec{r} = a\cos\omega t\;\hat{\imath} + b\sin\omega t\;\hat{\jmath}$. Notice that $\vec{r} = x\,\hat{\imath} + y\,\hat{\jmath}$. Hence, we get $x = a\cos\omega t$ and $y = b\sin\omega t$. Eliminating $t$, we have an ellipse, i.e.
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.
\]
The acceleration of the particle is $\dot{\vec{v}}$, where
\begin{align*}
\dot{\vec{v}} = \frac{d\vec{v}}{dt} &= \frac{d}{dt}\left(-a\omega\sin\omega t\;\hat{\imath} + b\omega\cos\omega t\;\hat{\jmath}\right)\\
&= -a\omega^2\cos\omega t\;\hat{\imath} - b\omega^2\sin\omega t\;\hat{\jmath}\\
&= -\omega^2\left(a\cos\omega t\;\hat{\imath} + b\sin\omega t\;\hat{\jmath}\right)\\
&= -\omega^2\,\vec{r}
\end{align*}
Hence, the acceleration of the particle always points towards the origin. ∎

3.2.2 Integration using Reduction Formula

Integration by parts is a useful technique for evaluating integrals.
However, for some functions the method has to be used several times before the final answer is obtained. If the index of a function (e.g. a power index) drops each time the method is repeated, a reduction formula may follow. The final answer of the integral is obtained by using this formula recursively.

Example 3.12. Denote $I_n = \int_0^{\pi/2}\sin^n x\,dx$, where $n$ is a positive integer. Obtain the reduction formula for $I_n$. Hence, find $I_5$.

Solution: For $n \ge 2$,
\begin{align*}
I_n &= \int_0^{\pi/2}\sin^n x\,dx\\
&= -\int_{x=0}^{\pi/2}\sin^{n-1}x\;d\cos x\\
&= -\sin^{n-1}x\cos x\,\Big|_0^{\pi/2} + \int_0^{\pi/2}\cos x\;d\sin^{n-1}x\\
&= (n-1)\int_0^{\pi/2}\sin^{n-2}x\cos^2 x\,dx\\
&= (n-1)\int_0^{\pi/2}\sin^{n-2}x\,(1-\sin^2 x)\,dx\\
&= (n-1)\,(I_{n-2} - I_n)
\end{align*}
Hence, we obtain $I_n = \dfrac{n-1}{n}\,I_{n-2}$. Finally, since $I_1 = 1$, we have $I_5 = \dfrac{4}{5}I_3 = \dfrac{4}{5}\cdot\dfrac{2}{3}\,I_1 = \dfrac{8}{15}$. ∎

Example 3.13. Denote $I_n = \int \sec^n\theta\,d\theta$, where $n$ is a positive integer. Obtain the reduction formula for $I_n$.

Solution:
\begin{align*}
I_n &= \int \sec^n\theta\,d\theta\\
&= \int \sec^{n-2}\theta\;d\tan\theta\\
&= \sec^{n-2}\theta\tan\theta - \int \tan\theta\;d\sec^{n-2}\theta\\
&= \sec^{n-2}\theta\tan\theta - (n-2)\int \tan^2\theta\sec^{n-2}\theta\,d\theta\\
&= \sec^{n-2}\theta\tan\theta - (n-2)\int (\sec^2\theta - 1)\sec^{n-2}\theta\,d\theta\\
&= \sec^{n-2}\theta\tan\theta - (n-2)\,I_n + (n-2)\,I_{n-2}
\end{align*}
Therefore, we have
\begin{align*}
(n-1)\,I_n &= \sec^{n-2}\theta\tan\theta + (n-2)\,I_{n-2}\\
I_n &= \frac{1}{n-1}\sec^{n-2}\theta\tan\theta + \frac{n-2}{n-1}\,I_{n-2}
\end{align*}
∎

Example 3.14. The beta function $B(p,q)$ is defined by the integral
\[
B(p,q) = \int_0^1 x^{p-1}(1-x)^{q-1}\,dx \quad\text{for } p \ge 1,\ q \ge 1.
\]
Show that if $p \ge 1$ and $q \ge 2$, then $B(p,q) = \dfrac{q-1}{p+q-1}\,B(p,q-1)$. Hence, find $B(10,5)$.

Solution:
\begin{align*}
B(p,q) &= \int_0^1 x^{p-1}(1-x)^{q-1}\,dx\\
&= \frac{1}{p}\int_{x=0}^{1}(1-x)^{q-1}\,dx^p\\
&= \frac{1}{p}\left[(1-x)^{q-1}x^p\right]\Big|_0^1 + \frac{q-1}{p}\int_0^1 x^p(1-x)^{q-2}\,dx\\
&= -\frac{q-1}{p}\int_0^1 x^{p-1}\left[(1-x)-1\right](1-x)^{q-2}\,dx\\
&= -\frac{q-1}{p}\int_0^1 x^{p-1}(1-x)^{q-1}\,dx + \frac{q-1}{p}\int_0^1 x^{p-1}(1-x)^{q-2}\,dx\\
&= -\frac{q-1}{p}\,B(p,q) + \frac{q-1}{p}\,B(p,q-1)
\end{align*}
Hence, we obtain $B(p,q) = \dfrac{q-1}{p+q-1}\,B(p,q-1)$. This reduction formula gives
\[
B(10,5) = \frac{4}{14}\,B(10,4) = \frac{4}{14}\cdot\frac{3}{13}\cdot\frac{2}{12}\cdot\frac{1}{11}\cdot B(10,1) = \frac{4}{14}\cdot\frac{3}{13}\cdot\frac{2}{12}\cdot\frac{1}{11}\cdot\frac{1}{10} = \frac{1}{10010}.
\]
∎

3.3 Impulse

The impulse on an object of mass $m$ is written as $I$ and is defined as the change of momentum of the object. The mathematical expression is
\[
I = \Delta p = m\,(v - u),
\]
where $p$ is the linear momentum. The initial and final velocities are $u$ and $v$ respectively. Recall that the force exerted on the object is given by Newton's second law $F = dp/dt$. Integrating both sides with respect to $t$, we have
\[
I = \int dp = \Delta p = \int F\,dt = F_{\text{ave}}\,\Delta t,
\]
where $F_{\text{ave}}$ is the average force exerted on the object during the time $\Delta t$. Figure 3.2 shows the variation of a force exerted on an object when a collision occurs. The area under the curve equals the area of the rectangle given by $F_{\text{ave}}\,\Delta t$.

Figure 3.2: The impulse of a force is the area under the curve

Example 3.15. A particle of mass $m$ is thrown vertically upward with speed $v_0$ and it returns to the initial point with speed $v_1$. Suppose that the retarding force $F$ due to the air resistance is linearly proportional to the instantaneous velocity of the particle, i.e. $F = -kv$, where $k > 0$ and $v$ is the velocity of the particle. By considering the total impulse acting on the particle during its motion, show that the time elapsed is given by
\[
t = \frac{1}{g}\,(v_0 + v_1).
\]

Solution: Taking downward as positive, the total force exerted on the particle is $F = mg - kv$. The expression is true for both upward and downward motions. The velocity $v$ of the particle is negative when the particle travels upward and becomes positive when the particle travels downward. Newton's second law gives
\begin{align*}
F &= \frac{dp}{dt}\\
I = \int F\,dt &= \int dp\\
\int F\,dt &= \Delta p
\end{align*}
The integral in the above equation is the impulse exerted on the particle and it is equal to the change of momentum of the particle.
Thus,
\begin{align*}
\int_0^t (mg - kv)\,dt &= m\,[v_1 - (-v_0)]\\
mgt - k\int_0^t v\,dt &= m\,(v_1 + v_0)
\end{align*}
Because the total displacement of the particle is zero, we have $\int_0^t v\,dt = 0$. Thus,
\[
t = \frac{v_0 + v_1}{g}.
\]
∎

3.4 Center of Mass

The center of mass of a system of particles is the point at which the total mass of the system may be considered concentrated. It describes the average position of the system. Consider a system of $n$ particles on the xy-plane, as shown in figure 3.3. The mass of the $i$th particle is $m_i$ and its location is given by $\vec{r}_i$ or $(x_i, y_i)$, where $i = 1, 2, 3, \ldots, n$. The center of mass of this system is $(x_{cm}, y_{cm})$, where
\[
x_{cm} = \frac{\sum_{i=1}^{n} x_i m_i}{\sum_{i=1}^{n} m_i} \quad\text{and}\quad y_{cm} = \frac{\sum_{i=1}^{n} y_i m_i}{\sum_{i=1}^{n} m_i}.
\]
Let's rewrite the definition to bring out its physical picture:
\[
x_{cm} = \sum_{i=1}^{n} x_i\left(\frac{m_i}{\sum_{j=1}^{n} m_j}\right) \quad\text{and}\quad y_{cm} = \sum_{i=1}^{n} y_i\left(\frac{m_i}{\sum_{j=1}^{n} m_j}\right).
\]

Figure 3.3: A system of particles on the xy-plane

One can see readily that the quantities in the brackets are weights because $\displaystyle\sum_{i=1}^{n}\left(\frac{m_i}{\sum_{j=1}^{n} m_j}\right) = 1$. Thus, the center of mass of a system of masses represents the mass-weighted average position of the system.

Now, we extend our understanding from the discrete model to a continuous model. For example, we consider the center of mass of a thin rod which lies on the x-axis. We cut the rod into an infinite number of segments, each of infinitesimally small size and of mass $dm$. Then, the center of mass of the rod can be determined by replacing the summation sign by an integral, where the position of the mass element is marked by $x$, i.e. the coordinate of $dm$:
\[
x_{cm} = \frac{\sum_{i=1}^{n} x_i m_i}{\sum_{i=1}^{n} m_i} \;\longrightarrow\; x_{cm} = \frac{\int x\,dm}{\int dm}.
\]
Similarly, for a 2-D object lying on the xy-plane, the center of mass is located at $(x_{cm}, y_{cm})$, where the coordinates are given by the integrals
\[
x_{cm} = \frac{\int x\,dm}{\int dm} \quad\text{and}\quad y_{cm} = \frac{\int y\,dm}{\int dm}.
\]

Example 3.16. A thin rod of length $L$ has uniform density.
(a) Locate the center of mass of the rod. (b) Locate the center of mass of the rod again if the rod has the non-uniform density $\lambda' = \lambda_0\,(1 + x/L)$, where $x$ is the distance from the light end, $0 \le x \le L$ and $\lambda_0$ is a constant.

Solution:

Figure 3.4: The thin rod

(a) For convenience, we place the rod on the xy-plane such that one end of the rod is at the origin and the rod lies on the positive x-axis. Then, we cut the rod into numerous segments, each having a length $dx$. Denoting the density of the rod by $\lambda$, the mass of a small segment is $dm = \lambda\,dx$. By the definition of the center of mass, we have
\[
x_{cm} = \frac{\int x\,dm}{\int dm} = \frac{\lambda\int_0^L x\,dx}{\lambda\int_0^L dx} = \frac{\int_0^L x\,dx}{\int_0^L dx} = \frac{\dfrac{x^2}{2}\Big|_0^L}{x\,\Big|_0^L} = \frac{L}{2}.
\]
(b) If the density of the rod is non-uniform, a small segment has mass $dm = \lambda'\,dx = \lambda_0\left(1 + \dfrac{x}{L}\right)dx$. Then the center of mass of the rod is
\[
x_{cm} = \frac{\int x\,dm}{\int dm} = \frac{\lambda_0\int_0^L x\left(1 + \dfrac{x}{L}\right)dx}{\lambda_0\int_0^L \left(1 + \dfrac{x}{L}\right)dx} = \frac{\int_0^L \left(x + \dfrac{x^2}{L}\right)dx}{\int_0^L \left(1 + \dfrac{x}{L}\right)dx}.
\]
Therefore,
\[
x_{cm} = \frac{\left(\dfrac{x^2}{2} + \dfrac{x^3}{3L}\right)\Big|_0^L}{\left(x + \dfrac{x^2}{2L}\right)\Big|_0^L} = \frac{\;\dfrac{5L^2}{6}\;}{\dfrac{3L}{2}} = \frac{5L}{9}.
\]
∎

Example 3.17. A uniform wire is bent into a semi-circle of radius $R$. Locate the center of mass of the wire.

Solution:

Figure 3.5: A uniform and semi-circular wire

Due to symmetry, the center of mass lies on the y-axis (i.e. $x_{cm} = 0$). Consider an infinitesimal element of length $dl$ on the wire, where $dl = R\,d\varphi$. The mass of the element is $dm = \lambda\,dl = \lambda R\,d\varphi$. By the definition of the center of mass,
\[
y_{cm} = \frac{\int_{\text{wire}} y\,dm}{\int_{\text{wire}} dm} = \frac{\int_0^\pi (R\sin\varphi)(\lambda R\,d\varphi)}{\int_0^\pi \lambda R\,d\varphi} = \frac{R^2\lambda\int_0^\pi \sin\varphi\,d\varphi}{R\lambda\int_0^\pi d\varphi} = -\frac{R}{\pi}\cos\varphi\,\Big|_{\varphi=0}^{\pi} = -\frac{R}{\pi}(-1-1) = \frac{2R}{\pi}.
\]
∎

3.5 Work Done by a Force

A particle is driven by a force $\vec{F}$ such that it moves along a path $C$. For a small displacement $d\vec{r}$ along the path, the corresponding work done is $dW = \vec{F}\cdot d\vec{r}$.
The scalar product is adopted because only the component of the force along the displacement contributes to the motion of the particle, while the component of the force normal to the displacement does no work on the particle. The total work done by the force $\vec{F}$ along the path $C$ is given by an integral which sums up all the small amounts of work done:
\[
W = \int_C dW = \int_C \vec{F}\cdot d\vec{r}.
\]

Example 3.18. Find the work done by gravity when a particle of mass $m$ falls freely through a distance $h$.

Solution: Suppose that the point of release of the particle coincides with the origin of the coordinate system. The motion of the particle is along the y-axis. The gravitational force acting on the particle is $-mg\,\hat{\jmath}$ and the small displacement is $d\vec{r} = dy\,\hat{\jmath}$. Notice that the latter is the general expression of the displacement; its actual direction is set by the lower and upper limits of the integral. The work done by gravity is
\[
W = \int_C (-mg\,\hat{\jmath})\cdot(dy\,\hat{\jmath}) = \int_0^{-h}(-mg)\,dy = -mg\int_0^{-h} dy.
\]
Therefore, $W = -mgy\,\Big|_0^{-h} = mgh$. The value is positive because the gravitational force points in the same direction as the displacement. The particle gains kinetic energy by the same amount.

Example 3.19. A particle of mass $m$ slides down along the inner surface of a smooth hemispherical hollow of radius $h$. If the initial position of the particle is at the rim of the hollow, find the work done by gravity when the particle reaches the lowest point of the hollow.

Figure 3.6: A sliding particle in the bowl

Solution: The particle slides down the inner surface of the hollow along the path $C$ and its infinitesimal displacement is denoted by $d\vec{r} = dx\,\hat{\imath} + dy\,\hat{\jmath}$. The gravitational force exerted on the particle is $\vec{F} = -mg\,\hat{\jmath}$. By the definition of work done, we can write
\[
W = \int_C \vec{F}\cdot d\vec{r} = \int_C (-mg\,\hat{\jmath})\cdot(dx\,\hat{\imath} + dy\,\hat{\jmath}) = \int_0^{-h}(-mg)\,dy = -mg\int_0^{-h} dy.
\]
Therefore, $W = -mgy\,\Big|_0^{-h} = mgh$. It has the same value as the answer in example 3.18.
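The agreement between Examples 3.18 and 3.19 can also be checked numerically by approximating $\int_C \vec{F}\cdot d\vec{r}$ with a sum over short chords of the path. The Python sketch below is illustrative only: the helper `work_along_path`, the parametrisation of the two paths by $s \in [0, 1]$, and the numerical values of $m$, $g$ and $h$ are assumptions made for the demonstration.

```python
import math

def work_along_path(force, path, n=10000):
    """Approximate W = integral of F . dr by summing F . (delta r) over n chords.

    path(s) returns the point (x, y) for s in [0, 1];
    force(x, y) returns the components (Fx, Fy) at that point.
    """
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, n + 1):
        x1, y1 = path(i / n)
        # Evaluate the force at the mid-point of the chord.
        fx, fy = force((x0 + x1) / 2, (y0 + y1) / 2)
        total += fx * (x1 - x0) + fy * (y1 - y0)
        x0, y0 = x1, y1
    return total

m, g, h = 0.5, 9.8, 2.0            # illustrative values (kg, m/s^2, m)

def gravity(x, y):
    return (0.0, -m * g)           # F = -mg j everywhere

def straight_fall(s):
    return (0.0, -h * s)           # Example 3.18: from (0, 0) to (0, -h)

def quarter_circle(s):
    phi = s * math.pi / 2          # Example 3.19: from the rim (h, 0) to (0, -h)
    return (h * math.cos(phi), -h * math.sin(phi))

w_fall = work_along_path(gravity, straight_fall)
w_slide = work_along_path(gravity, quarter_circle)
print(w_fall, w_slide, m * g * h)
```

Both sums should agree with $mgh$ up to rounding error, since only the vertical component of each chord contributes to $\vec{F}\cdot\Delta\vec{r}$.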
In fact, the work done by the gravitational force is independent of the path along which the particle travels. The amount of work done depends only on the initial and final positions of the particle. A force with this property is called a conservative force. One should notice that the normal force exerted on the particle by the hollow does no work on the particle because the normal force is always perpendicular to the displacement of the particle.

3.6 Energy Stored in a Spring

A force $F_{\text{app}}$ is applied to an unstretched spring and the spring extends by $x$. If the extension $x$ is linearly proportional to the applied force $F_{\text{app}}$, we say that the spring obeys Hooke's law $F_{\text{app}} = kx$, where $k$ is a positive constant called the spring constant. The work done by the force $F_{\text{app}}$ over the displacement $x$ is just the total energy $E$ stored in the spring:
\[
E = \int_0^x F_{\text{app}}\,dx = \int_0^x kx\,dx = \frac{1}{2}kx^2.
\]

Figure 3.7: The spring stores energy when it is stretched

If the spring is instead compressed by $x$, the same amount of energy is stored.

3.7 Electric Field due to a Charged Wire

A very long straight wire has positive charge distributed along it with line density $\lambda$. The electric field at a point $P$, at a perpendicular distance $D$ from one end of the wire, can be obtained by adding up the electric fields produced by all the charges on the wire. Figure 3.8 shows the electric field at $P$ due to a small segment of the wire. We divide the wire into numerous segments. Coulomb's law applies to the calculation because each segment is so short that the charge on it can be considered a point charge. The Coulomb constant is $k_e$.

Figure 3.8: The electric field due to a charged segment in the wire

The amount of charge on the small segment is $dq = \lambda\,dx$, where $x = D\tan\theta$ and $dx = D\sec^2\theta\,d\theta$. Thus, $dq = \lambda D\sec^2\theta\,d\theta$. The distance between the segment and point $P$ is $r = D/\cos\theta$.
Hence, the magnitude of the electric field at $P$ due to the charged segment is
\[
dE = k_e\frac{dq}{r^2} = k_e\frac{\lambda D\sec^2\theta\,d\theta}{\left(\dfrac{D}{\cos\theta}\right)^2} = \frac{k_e\lambda}{D}\,d\theta.
\]
Let the total electric field at $P$ be $\vec{E} = E_x\,\hat{\imath} + E_y\,\hat{\jmath}$, where
\[
E_x = -\int \sin\theta\,dE \quad\text{and}\quad E_y = -\int \cos\theta\,dE.
\]
Then
\[
E_x = -\frac{k_e\lambda}{D}\int_0^{\pi/2}\sin\theta\,d\theta = -\frac{k_e\lambda}{D} \quad\text{and}\quad E_y = -\frac{k_e\lambda}{D}\int_0^{\pi/2}\cos\theta\,d\theta = -\frac{k_e\lambda}{D}.
\]
Hence, the total electric field at $P$ is $\vec{E} = -\dfrac{k_e\lambda}{D}\,(\hat{\imath} + \hat{\jmath})$.

The above discussion is about a semi-infinite wire of charge density $\lambda$. If the wire is instead infinitely long, extending from negative infinity on the left to positive infinity on the right with the same charge density, the electric field at $P$ points along the negative y direction and its magnitude is $2k_e\lambda/D$.

3.8 The Length of a Curve

If $y = f(x)$ is a continuous function and $f'(x)$ exists, then the length of the curve in the range $a \le x \le b$ is given by
\[
S = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx.
\]

Figure 3.9: A small segment on the curve

Proof. Divide the curve into many segments such that each segment has a finite and small length $\Delta S$, where $(\Delta S)^2 = (\Delta x)^2 + (\Delta y)^2$. Then
\[
\Delta S = \sqrt{1 + \left[\frac{\Delta y}{\Delta x}\right]^2}\,\Delta x.
\]
As the number of segments increases, the length of each segment becomes infinitesimally small and thus the total length of the curve in the interval $a \le x \le b$ is given by
\[
S = \int dS = \int_a^b \sqrt{1 + \left[\frac{dy}{dx}\right]^2}\,dx. \tag{3.3}
\]
Hence, we obtain $S = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx$, where $f'(x) = df(x)/dx$.

If the curve has a parametric form $(x(t), y(t))$, then we have
\[
\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)} \quad\text{and}\quad dx = \left(\frac{dx}{dt}\right)dt = x'(t)\,dt,
\]
where $t$ is the independent variable of the coordinates $(x, y)$. Equation 3.3 becomes
\[
S = \int_{t_1}^{t_2}\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt, \tag{3.4}
\]
where $x(t_1) = a$ and $x(t_2) = b$.

Example 3.20. Find the mass of a metal wire of density $\lambda = 2\ \text{kg m}^{-1}$ if the wire has a parabolic shape given by $y = x^2$ and $-\dfrac{1}{2} \le x \le \dfrac{1}{2}$. The measurement of $x$ is in meters.

Figure 3.10: A parabolic metal wire

Solution: $y = f(x) = x^2$ gives $f'(x) = 2x$.
The total mass of the wire is

$$M = \lambda S = \lambda\int dS = \lambda\int_{-1/2}^{1/2}\sqrt{1 + 4x^2}\,dx$$

Using the substitution $2x = \tan\theta$ and $2\,dx = \sec^2\theta\,d\theta$, we have

$$\begin{aligned}
M &= \frac{\lambda}{2}\int_{-\pi/4}^{\pi/4}\sec^3\theta\,d\theta \\
&= \frac{\lambda}{2}\left[\frac{1}{2}\sec\theta\tan\theta\,\Big|_{-\pi/4}^{\pi/4} + \frac{1}{2}\int_{-\pi/4}^{\pi/4}\sec\theta\,d\theta\right] \quad\text{(by using example 3.13)} \\
&= \frac{\lambda}{2}\left[\frac{1}{2}\left(\sqrt{2} + \sqrt{2}\right) + \frac{1}{2}\ln(\sec\theta + \tan\theta)\,\Big|_{-\pi/4}^{\pi/4}\right] \\
&= \frac{\lambda}{2}\left[\sqrt{2} + \frac{1}{2}\ln\left(\frac{\sqrt{2}+1}{\sqrt{2}-1}\right)\right]
\end{aligned}$$

Plugging in $\lambda = 2\ \text{kg m}^{-1}$, the mass of the wire is $M = \left(\sqrt{2} + \ln\sqrt{3 + 2\sqrt{2}}\right)$ kg. ∎

Example 3.21. A disk of radius $a$ rolls without sliding on a horizontal plane. The point P on the disk that is initially in contact with the plane traces a path as the disk rolls, and this path is called the cycloid. The coordinates of P are

$$\begin{cases} x = a\,(\theta - \sin\theta) , \\ y = a\,(1 - \cos\theta) , \end{cases}$$

where $\theta$ is the angular displacement of the disk. Find the length of the cycloid with $0 \le \theta \le 2\pi$.

Figure 3.11: The cycloid of a disk

Solution: The derivatives of $x$ and $y$ with respect to $\theta$ are $x'(\theta) = a\,(1 - \cos\theta)$ and $y'(\theta) = a\sin\theta$. The length of the cycloid is

$$\begin{aligned}
S &= \int_0^{2\pi}\sqrt{[x'(\theta)]^2 + [y'(\theta)]^2}\,d\theta \\
&= \int_0^{2\pi}\sqrt{a^2(1-\cos\theta)^2 + a^2\sin^2\theta}\,d\theta \\
&= 2a\int_0^{2\pi}\sin\left(\frac{\theta}{2}\right)d\theta = 8a
\end{aligned}$$

We have applied the trigonometric identity $\cos\theta = 1 - 2\sin^2\frac{\theta}{2}$ in the above treatment. ∎

3.9 Area under a Curve

If $y = f(x)$ is a continuous function, then the area under the curve in the range $a \le x \le b$ is given by $\int_a^b f(x)\,dx$. In physical science, there are many examples for which the area under a curve gives a physical quantity. Here are some examples.

Velocity-time curve ($vt$-curve) gives the displacement,
Acceleration-time curve ($at$-curve) gives the change in velocity,
Force-time curve ($Ft$-curve) gives the impulse of a force,
Force-displacement curve ($Fs$-curve) gives the work done by a force,
Pressure-volume curve ($PV$-curve) gives the work done by a gas,
Charge-voltage curve ($QV$-curve) gives the energy stored in a capacitor.

Example 3.22.
Find the work done by an ideal gas if $n$ moles of gas undergo an isothermal expansion in a container at temperature $T$. The piston is pushed outward and the volume of the gas increases from its initial value $V_i$ to a final value $V_f$.

Figure 3.12: The gas expansion and the $PV$-graph

Solution: In an isothermal process, the temperature is kept constant. When the volume of the gas changes by $\Delta V$, the work done by the gas system is $\Delta W_s = F\,\Delta x = PA\,\Delta x = P\,\Delta V$, where $F$ is the force exerted on the piston, $P$ is the pressure in the gas, $x$ is the displacement of the piston and $A$ is the cross sectional area of the piston in the container. During the expansion, the total work done by the gas system is

$$W_s = \int P\,dV = nRT\int_{V_i}^{V_f}\frac{dV}{V}$$

where $R$ is the universal gas constant. On the right hand side of the above equation, we have applied the ideal gas law $PV = nRT$ so that the integral contains only one variable, i.e. the volume $V$. Then

$$W_s = nRT\,\ln V\,\Big|_{V_i}^{V_f} = nRT\ln\left(\frac{V_f}{V_i}\right)$$

In an expansion, the work done by the gas $W_s$ is positive and its value is given by the area under the $PV$-curve in figure 3.12. If the process is a compression, $W_s$ becomes negative and it is the negative of the area under the $PV$-curve. ∎

3.10 Moment of Inertia

Mass is a measure of the amount of inertia of an object in translational motion. The greater the mass, the greater the resistance to a change in motion. That is to say, we need a greater force to move a more massive object from rest or to change its state of motion. This idea is stated in Newton's second law $F = ma$, where $F$ is the net force exerted on an object of mass $m$ and $a$ is the acceleration of the object.

Figure 3.13: A system of particles rotating about the z-axis

In rotational motion, the quantity that measures the amount of inertia is the moment of inertia. For a system of $n$ particles rotating about an axis, the distribution of mass about the axis affects the resistance of an object to rotate.
The moment of inertia about the axis is defined as

$$I = \sum_{i=1}^{n} m_i r_i^2 ,$$

where $m_i$ is the mass of the $i$-th particle and $r_i$ is its radius of rotation about the axis. Newton's second law for rotation is $\tau = I\alpha$, where $\tau$ is the torque (i.e. the moment of force) exerted on the system about the axis and $\alpha$ is the angular acceleration of the system. One can always compare $F = ma$ with $\tau = I\alpha$, where $F$, $m$, and $a$ are the quantities adopted in translational motion, while $\tau$, $I$ and $\alpha$ are the quantities adopted in rotational motion. The correspondences are $F \to \tau$, $m \to I$, and $a \to \alpha$, thus

$$F = ma \;\to\; \tau = I\alpha$$

A detailed discussion of torque was given in section 1.12.1. On the other hand, one can see that if the object in figure 3.13 rotates with angular speed $\omega$, then the $i$-th particle has speed $v_i = r_i\omega$. Thus, the total kinetic energy (KE) of the object is

$$K = \sum_{i=1}^{n}\left(\frac{1}{2}m_i v_i^2\right) = \frac{1}{2}\left(\sum_{i=1}^{n} m_i (r_i\omega)^2\right) = \frac{1}{2}\left(\sum_{i=1}^{n} m_i r_i^2\right)\omega^2 = \frac{1}{2}I\omega^2$$

This is an important result when we compare the KE expression of translational motion with that of rotational motion. The correspondences of the quantities are $M \to I$ and $V \to \omega$, thus

$$\frac{1}{2}MV^2 \;\to\; \frac{1}{2}I\omega^2$$

where $M$ is the total mass of the object and $V$ is the velocity of the center of mass of the object when the object has translational motion.

In a rigid body, the mass is distributed continuously rather than in discrete particles. Hence, we have to replace the summation sign by an integral, and the position of the mass element $dm$ is marked by $r$, i.e. the radius of rotation of the mass element.

$$I = \sum_{i=1}^{n} m_i r_i^2 \;\longrightarrow\; I = \int r^2\,dm$$

Obviously, the moment of inertia of a ring about an axis passing through its center and normal to the plane of the ring is $mr^2$, where $m$ and $r$ are the mass and the radius of the ring respectively.

Example 3.23. Find the moment of inertia of a uniform thin rod about a normal axis which passes through the center of the rod. The rod has mass $M$ and length $L$.
Figure 3.14: A rod rotating about a normal axis through its mid-point

Solution: For convenience, we fit the rod in a coordinate system such that the rod lies on the $x$-axis and the mid-point of the rod is at the origin. Divide the thin rod into numerous segments so that each segment has infinitesimally small length $dx$ and the mass of the segment is $dm = \lambda\,dx$, where $\lambda = M/L$ is the linear density of the rod. From the definition of moment of inertia, we have

$$I = \int x^2\,dm = \lambda\int_{-L/2}^{L/2} x^2\,dx = \frac{\lambda x^3}{3}\,\Big|_{-L/2}^{L/2} = \frac{\lambda L^3}{12} = \frac{ML^2}{12}$$

∎

Example 3.24. Find the moment of inertia of a uniform thin rod about a normal axis which passes through the end of the rod. The rod has mass $M$ and length $L$.

Figure 3.15: A rod rotating about a normal axis through its end

Solution: This time we fit the rod in a coordinate system such that the rod lies on the $x$-axis and the left end of the rod is at the origin. Divide the thin rod into numerous segments so that each segment has infinitesimally small length $dx$ and the mass of the segment is $dm = \lambda\,dx$, where $\lambda = M/L$ is the linear density of the rod. From the definition of moment of inertia, we have

$$I = \int x^2\,dm = \lambda\int_0^L x^2\,dx = \frac{\lambda x^3}{3}\,\Big|_0^L = \frac{\lambda L^3}{3} = \frac{ML^2}{3}$$

We observe that the answer in example 3.23 is less than that in this example, because the rod in the latter case has more of its mass distributed far away from the axis of rotation. ∎

3.11 The Dog-And-Rabbit Chase Problem

The classic dog-and-rabbit chase problem is an interesting topic in calculus. A dog is at a distance $L$ due south of a rabbit when, at time $t = 0$, it observes the rabbit running in a vast field. Their positions at time $t = 0$ are shown in the figure. When the dog sees the rabbit, it starts to pursue the rabbit and its velocity always points towards the rabbit. The rabbit keeps running due east with a constant speed $v$ and the dog's speed is a constant $u$, where $v < u$. Find the time elapsed when the dog catches the rabbit.
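Before the analytic treatment, the chase as stated can be explored numerically. The following is a minimal sketch in Python, assuming a simple fixed-step update; the function name `chase_time`, the step size `dt`, the catch criterion, and the sample values of $L$, $u$, and $v$ are all illustrative choices rather than part of the problem.

```python
import math

def chase_time(L, u, v, dt=1e-4):
    """Step the pursuit forward in time until the dog reaches the rabbit.

    L : initial separation (rabbit due north of the dog)
    u : constant speed of the dog, with u > v
    v : constant speed of the rabbit, running due east
    dt: time step (an assumed numerical parameter)
    """
    dog_x, dog_y = 0.0, 0.0      # dog starts at the origin
    rab_x, rab_y = 0.0, L        # rabbit starts a distance L due north
    t = 0.0
    while True:
        dx, dy = rab_x - dog_x, rab_y - dog_y
        dist = math.hypot(dx, dy)
        if dist <= u * dt:       # dog can close the gap within one step
            return t
        # the dog moves at speed u straight towards the rabbit
        dog_x += u * dt * dx / dist
        dog_y += u * dt * dy / dist
        # the rabbit moves due east at speed v
        rab_x += v * dt
        t += dt

# Hypothetical sample values: L = 1, u = 2, v = 1
t_catch = chase_time(1.0, 2.0, 1.0)
```

The value of `t_catch` can be compared with the closed-form time obtained from the integration argument; reducing `dt` should bring the two closer together.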
Figure 3.16: The dog-and-rabbit chase problem

Let $x$ be the horizontal displacement of the rabbit relative to the dog, $\theta$ be the angle between the dog's velocity and the eastward direction, and $\tau$ be the time elapsed when the dog catches the rabbit. Then, at an arbitrary time $t$,

$$\frac{dx}{dt} = v - u\cos\theta \tag{3.5}$$

Integrating both sides of equation 3.5 from $t = 0$ to $t = \tau$, we have

$$\int_{x=0}^{0} dx = \int_0^{\tau}(v - u\cos\theta)\,dt$$

$$0 = v\tau - u\int_0^{\tau}\cos\theta\,dt$$

That is to say,

$$\int_0^{\tau}\cos\theta\,dt = \frac{v\tau}{u} \tag{3.6}$$

Let $r$ be the distance between the rabbit and the dog. At any instant of time,

$$\frac{dr}{dt} = v\cos\theta - u \tag{3.7}$$

Integrating both sides of equation 3.7 from $t = 0$ to $t = \tau$, we have

$$\int_{r=L}^{0} dr = \int_0^{\tau}(v\cos\theta - u)\,dt$$

$$-L = v\int_0^{\tau}\cos\theta\,dt - u\tau$$

Hence, we obtain

$$L = u\tau - v\int_0^{\tau}\cos\theta\,dt \tag{3.8}$$

Substituting equation 3.6 into equation 3.8, we have

$$L = u\tau - v\left(\frac{v\tau}{u}\right)$$

Therefore, the required time is $\tau = \dfrac{Lu}{u^2 - v^2}$.

3.12 Numerical Integration

We will introduce two basic methods which compute the integral $\int_a^b f(x)\,dx$ numerically. The idea of the trapezoidal rule is simple: the integrand $f(x)$ is approximated by secant lines. However, the trapezoidal rule converges more slowly than Simpson's rule, which approximates the integrand by parabolas.

3.12.1 Trapezoidal Rule

Given a function $f(x) \ge 0$, the integral $\int_a^b f(x)\,dx$ can be approximated by the area of a trapezium as shown in figure 3.17. The accuracy can be improved if we subdivide the interval $[a, b]$ into $n$ subintervals of equal width, as shown in figure 3.18. The precision increases with $n$. When $n = 1$, we have

$$\int_a^b f(x)\,dx \approx \frac{1}{2}(b - a)\,[f(a) + f(b)]$$

Figure 3.17: Trapezoidal rule when n = 1

Let $h$ be the width of each subinterval when there are $n$ subintervals. Then we have $h = (b - a)/n$ and the trapezoidal rule

$$\begin{aligned}
\int_a^b f(x)\,dx &= \sum_{i=1}^{n}\int_{x_{i-1}}^{x_i} f(x)\,dx \\
&\approx \frac{1}{2}\sum_{i=1}^{n} h\,[f(x_{i-1}) + f(x_i)] \\
&= \frac{h}{2}\,[f_0 + f_1 + f_1 + f_2 + f_2 + \cdots + f_{n-1} + f_{n-1} + f_n] \\
&= \frac{h}{2}\left[f_0 + 2\sum_{i=1}^{n-1} f_i + f_n\right]
\end{aligned}$$

Figure 3.18: Trapezoidal rule when n > 1
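The composite trapezoidal formula translates directly into a short routine. The following is a minimal sketch in Python; the function name `trapezoid` and the test integrand $e^x\cos x$ on $[0, \pi]$ are illustrative choices, not part of the derivation.

```python
import math

def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal subintervals."""
    h = (b - a) / n                                    # width of each subinterval
    interior = sum(f(a + i * h) for i in range(1, n))  # f_1 + ... + f_{n-1}
    return 0.5 * h * (f(a) + 2.0 * interior + f(b))

def integrand(x):
    # illustrative test function; any continuous f would do
    return math.exp(x) * math.cos(x)

# estimates for an increasing number of subintervals
estimates = {n: trapezoid(integrand, 0.0, math.pi, n) for n in (2, 4, 8, 16, 32, 64)}
```

Each doubling of $n$ reduces the error by roughly a factor of four, consistent with an error that scales as $h^2$.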
3.12.2 Simpson's Rule

Instead of using secant lines to approximate the integrand $f(x)$, we can replace the integrand by many parabolas. To simplify the discussion, let's consider an integral over the range $[-h, h]$. We divide the interval into two subintervals of equal width and replace the integrand by a parabola $p(x) = ax^2 + bx + c$ such that it passes through $A_0(-h, f_0)$, $A_1(0, f_1)$, and $A_2(h, f_2)$. The parabola is shown by the dashed curve in figure 3.19.

$$\begin{aligned}
f_0 &= ah^2 - bh + c \\
f_1 &= c \\
f_2 &= ah^2 + bh + c
\end{aligned}$$

Figure 3.19: Simpson's rule when n = 2

The constants $a$, $b$, and $c$ can be determined. However, we do not need to compute them because

$$\int_{-h}^{h}(ax^2 + bx + c)\,dx = \left(\frac{ax^3}{3} + \frac{bx^2}{2} + cx\right)\Big|_{-h}^{h} = \frac{h}{3}\,(2ah^2 + 6c)$$

and we can simply write

$$\int_{-h}^{h}(ax^2 + bx + c)\,dx = \frac{h}{3}\,(f_0 + 4f_1 + f_2) \tag{3.9}$$

One can check that $2ah^2 + 6c = f_0 + 4f_1 + f_2$. In practice, we subdivide the interval $[a, b]$ into $n$ subintervals, where $n$ is even. Then we replace the integrand by $n/2$ parabolas. The integral over the region $[x_{i-1}, x_{i+1}]$ has an expression similar to that in equation 3.9. Hence, we have

$$\int_{x_{i-1}}^{x_{i+1}} f(x)\,dx \approx \frac{h}{3}\,(f_{i-1} + 4f_i + f_{i+1})$$

Let $x_0 = a$, $x_n = b$, $x_i = a + ih$, and $h = \dfrac{b - a}{n}$. The overall result is known as Simpson's rule

$$\begin{aligned}
\int_a^b f(x)\,dx &= \sum_{i=1}^{n/2}\int_{x_{2i-2}}^{x_{2i}} f(x)\,dx \\
&\approx \frac{h}{3}\sum_{i=1}^{n/2}(f_{2i-2} + 4f_{2i-1} + f_{2i}) \\
&= \frac{h}{3}\,(f_0 + 4f_1 + f_2 + f_2 + 4f_3 + f_4 + f_4 + 4f_5 + f_6 + \dots + f_{n-2} + 4f_{n-1} + f_n) \\
&= \frac{h}{3}\,[f_0 + 4\,(f_1 + f_3 + f_5 + \dots + f_{n-1}) + 2\,(f_2 + f_4 + f_6 + \dots + f_{n-2}) + f_n]
\end{aligned}$$

Example 3.25. If $I = \int_0^{\pi} e^x\cos x\,dx$, compute $I$ by the following methods.
(a) Integration by parts
(b) Trapezoidal rule with $n$ = 2, 4, 8, 16, 32 and 64
(c) Simpson's rule with $n$ = 2, 4, 8, 16, 32 and 64

Solution: (a) Using integration by parts, we have

$$\begin{aligned}
I &= \int_0^{\pi} e^x\cos x\,dx \\
&= \int_0^{\pi}\cos x\,d(e^x) \\
&= e^x\cos x\,\Big|_0^{\pi} + \int_0^{\pi} e^x\sin x\,dx \\
&= -e^{\pi} - 1 + \int_0^{\pi}\sin x\,d(e^x) \\
&= -e^{\pi} - 1 + \left(e^x\sin x\,\Big|_0^{\pi} - \int_0^{\pi} e^x\cos x\,dx\right) \\
&= -e^{\pi} - 1 - \int_0^{\pi} e^x\cos x\,dx \\
&= -e^{\pi} - 1 - I
\end{aligned}$$

Thus $2I = -1 - e^{\pi}$, and we have $I = -\frac{1}{2}(1 + e^{\pi}) = -12.07034632$.

(b) and (c)

| $n$ | $I_n$ by Trapezoidal rule | $I_n$ by Simpson's rule |
|---|---|---|
| 2 | -17.389259 | -11.5928395534 |
| 4 | -13.336023 | -11.9849440198 |
| 8 | -12.382162 | -12.0642089572 |
| 16 | -12.148004 | -12.0699513233 |
| 32 | -12.089742 | -12.0703214561 |
| 64 | -12.075194 | -12.0703447599 |

We observe that the answer is correct to 4 decimal places by using Simpson's rule with $n = 32$, while the trapezoidal rule converges to the answer much more slowly. ∎

3.13 Useful Integration Formulae

All constants of integration are omitted but implied in this table.

| $f(x)$ | $\int f(x)\,dx$ |
|---|---|
| $c$ | $cx$ |
| $x$ | $\dfrac{x^2}{2}$ |
| $cx^n$ $(n \ne -1)$ | $\dfrac{cx^{n+1}}{n+1}$ |
| $\sin x$ | $-\cos x$ |
| $\cos x$ | $\sin x$ |
| $\sec^2 x$ | $\tan x$ |
| $\csc^2 x$ | $-\cot x$ |
| $\sec x\tan x$ | $\sec x$ |
| $\csc x\cot x$ | $-\csc x$ |
| $\sec x$ | $\ln(\sec x + \tan x)$ or $\ln\tan\left(\dfrac{\pi}{4} + \dfrac{x}{2}\right)$ |
| $\csc x$ | $\ln(\csc x - \cot x)$ or $\ln\tan\dfrac{x}{2}$ |
| $\tan x$ | $\ln\sec x$ |
| $\cot x$ | $\ln\sin x$ |
| $e^x$ | $e^x$ |
| $\dfrac{1}{x}$ | $\ln x$ |
| $\ln x$ | $x\ln x - x$ |

Integration by Parts: $\int u\,dv = uv - \int v\,du$

Chapter 4 Ordinary Differential Equations

Ordinary differential equations (ODE) are widely used in physics and engineering. The word "ordinary" means that the equation involves a function of a single variable, such as $y(x)$ or $y(t)$. In this chapter, we will focus on first order ordinary differential equations. "First order" means that the equation has the term $dy/dx$ but no higher order derivatives. If the equation has a second order term, we can try to reduce it to a first order equation and then proceed with the calculation.
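The reduction of order mentioned above can be sketched in general terms. As an illustration (the function $f(x)$ is a generic placeholder, not a specific case from the text), a second order equation $\ddot{x} = f(x)$ with no explicit time dependence becomes first order in the velocity $v = \dot{x}$ once the chain rule is applied:

```latex
\begin{align*}
\ddot{x} &= \frac{dv}{dt} = \frac{dv}{dx}\,\frac{dx}{dt} = v\,\frac{dv}{dx}, \\
v\,\frac{dv}{dx} &= f(x)
\quad\Longrightarrow\quad
\int v\,dv = \int f(x)\,dx .
\end{align*}
```

The result is a first order relation between $v$ and $x$, which can be integrated once more by writing $v = dx/dt$.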
4.1 Separation of Variables

If the right-hand side of the first order differential equation

$$\frac{dy}{dx} = f(x, y)$$

can be expressed as a function that depends only on $x$ times a function that depends only on $y$, then the differential equation is called separable. Such an equation has the form

$$\frac{dy}{dx} = g(x)\,p(y)$$

To solve the equation, we multiply both sides by $1/p(y)$ and $dx$. Hence, we obtain

$$\frac{dy}{p(y)} = g(x)\,dx$$

Then integrate both sides of the equation

$$\int\frac{dy}{p(y)} = \int g(x)\,dx$$

This technique is known as the separation of variables.

Example 4.1. Solve $\dfrac{dy}{dx} = \dfrac{x - 5}{y^2}$.

Solution: We separate the variables on the two sides of the equation as $y^2\,dy = (x - 5)\,dx$. Integrating, we have

$$\begin{aligned}
\int y^2\,dy &= \int (x - 5)\,dx \\
\frac{y^3}{3} &= \frac{x^2}{2} - 5x + C \\
y &= \left(\frac{3x^2}{2} - 15x + 3C\right)^{1/3}
\end{aligned}$$

Replacing $3C$ by $K$, we then have

$$y = \left(\frac{3x^2}{2} - 15x + K\right)^{1/3}$$

∎

Example 4.2. A 1-kg particle is driven by a force $f(x) = -\sin x$ along the $x$-axis. Its initial position is at the origin and its initial velocity is 2 m/s towards the positive $x$-direction. (a) Express the velocity of the particle as a function of position. (b) Hence, find the limiting position of the particle.

Solution: (a) Newton's second law gives the equation of motion of the particle

$$-\sin x = (1)\,\ddot{x} ,$$

where $\ddot{x} = d^2x/dt^2$ is the acceleration of the particle. Using the chain rule, we have

$$\ddot{x} = \frac{d\dot{x}}{dt} = \frac{d\dot{x}}{dx}\,\frac{dx}{dt} = \dot{x}\,\frac{d\dot{x}}{dx}$$

Then, we obtain

$$-\sin x = \dot{x}\,\frac{d\dot{x}}{dx}$$

Separating the variables and integrating both sides, we have

$$\begin{aligned}
-\int_0^x \sin x\,dx &= \int_2^v \dot{x}\,d\dot{x} \\
\cos x\,\Big|_0^x &= \frac{\dot{x}^2}{2}\,\Big|_2^v \\
2\,(1 + \cos x) &= v^2
\end{aligned} \tag{4.1}$$

where $v$ is the velocity of the particle when it has a displacement $x$ from the origin. Using the identity $\cos x = 2\cos^2\frac{x}{2} - 1$ and the fact that $v = 2$ when $x = 0$, we have $v = 2\cos\frac{x}{2}$.

(b) Rewriting the velocity $v$ as $\dfrac{dx}{dt}$, we have $\dfrac{dx}{dt} = 2\cos\dfrac{x}{2}$. Then

$$\begin{aligned}
\int_0^x \sec\frac{x}{2}\,dx &= 2\int_0^t dt \\
2\ln\tan\left(\frac{\pi}{4} + \frac{x}{4}\right)\Big|_0^x &= 2t \\
\ln\tan\left(\frac{\pi}{4} + \frac{x}{4}\right) &= t
\end{aligned}$$

When $t \to \infty$, we have $\dfrac{\pi}{4} + \dfrac{x}{4} \to \dfrac{\pi}{2}$.
Thus, the limiting position of the particle is at $x = \pi$. In the integration, we have applied the formula

$$\int\sec\theta\,d\theta = \ln(\sec\theta + \tan\theta) = \ln\tan\left(\frac{\pi}{4} + \frac{\theta}{2}\right)$$

∎

4.2 Simple Harmonic Motion

A particle of mass $m$ is connected to a light spring that obeys Hooke's law $F = -kx$, where $F$ is the restoring force in the spring when the spring is stretched or compressed from its natural length by $x$. The spring constant is denoted by $k$, which is a positive constant. If the spring is stretched, $x$ is positive and $F$ becomes negative. On the other hand, if the spring is compressed, $x$ is negative and $F$ becomes positive. It means that the restoring force always points opposite to the displacement of the particle from equilibrium and it tries to restore the natural length of the spring.

Consider the initial condition of a spring-mass system as follows. At time $t = 0$, the particle is displaced to the right by a distance $A$ from the equilibrium position and it is released. Newton's second law gives the equation of motion, which is independent of the initial conditions of the system.

$$m\ddot{x} = -kx \quad\text{or}\quad \ddot{x} = -\omega^2 x ,$$

where $\omega^2 = k/m$ and $\ddot{x}$ is the acceleration of the particle, which is sometimes denoted by $a$. It implies that the direction of the acceleration of the particle is always opposite to the displacement of the particle. It is a second order ordinary differential equation.

Figure 4.1: The spring-mass system

It can be solved by reducing its order to a first order equation and then separating the variables. Using the chain-rule step $\ddot{x} = \dot{x}\,d\dot{x}/dx$ that led to equation 4.1, we have

$$\begin{aligned}
\dot{x}\,\frac{d\dot{x}}{dx} &= -\omega^2 x \\
\int_0^v \dot{x}\,d\dot{x} &= -\omega^2\int_A^x x\,dx \\
v^2 &= \omega^2\,(A^2 - x^2)
\end{aligned}$$

Since $v = dx/dt$, we can write

$$\frac{dx}{dt} = \omega\sqrt{A^2 - x^2}$$

Separating the variables and integrating both sides, we get

$$\int_A^x \frac{dx}{\sqrt{A^2 - x^2}} = \omega\int_0^t dt$$

Substituting $x = A\sin\theta$, we have $dx = A\cos\theta\,d\theta$ and $\displaystyle\int_{\pi/2}^{\sin^{-1}(x/A)} d\theta = \omega t$; this becomes $\sin^{-1}\left(\dfrac{x}{A}\right) - \dfrac{\pi}{2} = \omega t$.
Thus,

$$\begin{aligned}
\frac{x}{A} &= \sin\left(\frac{\pi}{2} + \omega t\right) \\
\frac{x}{A} &= \cos\omega t \\
x &= A\cos\omega t
\end{aligned}$$

It is a periodic function showing that the particle oscillates about the origin with a constant period $T = 2\pi/\omega$. The variations of velocity $v$ and acceleration $a$ are also periodic, where $v = \dot{x} = -A\omega\sin\omega t$ and $a = \ddot{x} = -A\omega^2\cos\omega t = -\omega^2 x$.

In summary, in a spring-mass system, if the particle is released at $x = A$ it will oscillate with amplitude $A$ about the equilibrium position, i.e. $x = 0$. The equation of motion is $\ddot{x} = -\omega^2 x$ and

$$\begin{cases}
x = A\cos\omega t \\
v = -A\omega\sin\omega t \\
a = -A\omega^2\cos\omega t
\end{cases}$$

where $\omega = \sqrt{k/m}$ is the angular frequency. The maximum magnitude of velocity is $v_{\max} = \omega A$ when $x = 0$, and the maximum magnitude of acceleration is $a_{\max} = \omega^2 A$ when $x = A$ or $x = -A$. The period of oscillation is $T = \dfrac{2\pi}{\omega} = 2\pi\sqrt{\dfrac{m}{k}}$. It is worth noticing that the velocity is related to the displacement by $v^2 = \omega^2\,(A^2 - x^2)$. This expression is also a direct consequence of conservation of energy.

Example 4.3. A particle of mass $m$ is placed on a smooth horizontal plane. It is connected to a spring of spring constant $k$ and is projected to the right with speed $v_0$ from its equilibrium position. Find the velocity of the particle as functions of displacement and time respectively.

Figure 4.2: The spring-mass system

Solution: The velocity as a function of displacement can be obtained readily if we consider the conservation of energy. Instead of doing it this way, we try to obtain it by using the equation of motion of the particle, $\ddot{x} = -\omega^2 x$, where $\omega = \sqrt{k/m}$. Using the chain-rule step $\ddot{x} = \dot{x}\,d\dot{x}/dx$ that led to equation 4.1, we have

$$\begin{aligned}
\dot{x}\,\frac{d\dot{x}}{dx} &= -\omega^2 x \\
\int_{v_0}^{v}\dot{x}\,d\dot{x} &= -\omega^2\int_0^x x\,dx \\
v^2 - v_0^2 &= -\omega^2 x^2 \\
v^2 &= v_0^2 - \omega^2 x^2
\end{aligned}$$

This is the velocity-displacement function of the particle.
Recall that $v = dx/dt$, and thus we have

$$\frac{dx}{dt} = \sqrt{v_0^2 - \omega^2 x^2}$$

Hence,

$$\int_0^x \frac{dx}{\sqrt{v_0^2 - \omega^2 x^2}} = \int_0^t dt$$

Using the substitution $x = \dfrac{v_0}{\omega}\sin\theta$, we have $dx = \dfrac{v_0}{\omega}\cos\theta\,d\theta$ and

$$\begin{aligned}
\frac{1}{\omega}\int_0^{\sin^{-1}(\omega x/v_0)} d\theta &= t \\
\sin^{-1}\left(\frac{\omega x}{v_0}\right) &= \omega t \\
x &= A\sin\omega t ,
\end{aligned}$$

where $A = v_0/\omega$ is the amplitude of oscillation. Thus, we obtain the velocity of the particle as a function of time, $v = A\omega\cos\omega t = v_0\cos\omega t$. ∎

4.3 Free Fall with Air Resistance

A particle is released from rest at a height $h$ above the floor. Suppose that the air resistance is not negligible and the drag force (and hence the deceleration it produces) is linearly proportional to the velocity of the particle. We are interested in the following problems. (a) Determine the velocity of the particle at time $t$, (b) determine the height of the particle at time $t$, and (c) find the relation between the height and the velocity of the particle.

One should note that the acceleration of the particle varies with time. All the formulae for constant acceleration that we learned in high school, such as $v = u + at$, $s = ut + \frac{1}{2}at^2$, and $v^2 = u^2 + 2as$, are not applicable. We should rely on calculus to solve these problems.

(a) Taking upward motion as positive, the acceleration of the particle is $a = -g - kv$, where $a = dv/dt$. That is to say, $\dfrac{dv}{dt} = -g - kv$. Rearranging the equation, we have

$$\begin{aligned}
\frac{dv}{-g - kv} &= dt \\
-\int_0^v \frac{dv}{g + kv} &= \int_0^t dt \\
-\frac{1}{k}\ln(g + kv)\,\Big|_0^v &= t \\
-\frac{1}{k}\ln\left(\frac{g + kv}{g}\right) &= t \\
\frac{g + kv}{g} &= e^{-kt} \\
v &= -\frac{g}{k}\left(1 - e^{-kt}\right)
\end{aligned}$$

The terminal speed of the particle is $g/k$. It is reached when the drag force (upward force) cancels out the weight of the particle (downward force).

Figure 4.3: The speed of a falling particle in air

(b) The velocity of the particle is $v = \dfrac{dy}{dt} = -\dfrac{g}{k}\left(1 - e^{-kt}\right)$.
Integrating both sides with respect to $t$, we have

$$\begin{aligned}
\int_h^y dy &= -\frac{g}{k}\int_0^t \left(1 - e^{-kt}\right)dt \\
y - h &= -\frac{g}{k}\left(t + \frac{1}{k}e^{-kt}\right)\Big|_0^t \\
y - h &= -\frac{g}{k}\left(t + \frac{1}{k}e^{-kt} - \frac{1}{k}\right) \\
y &= h - \frac{g}{k}\,t + \frac{g}{k^2}\left(1 - e^{-kt}\right)
\end{aligned}$$

(c) Using the chain rule to rewrite the acceleration, we obtain $a = \dfrac{dv}{dt} = \dfrac{dv}{dy}\,\dfrac{dy}{dt} = \left(\dfrac{dv}{dy}\right)v$. The time $t$ does not appear explicitly in the expression. Hence, we have

$$\begin{aligned}
-g - kv &= v\,\frac{dv}{dy} \\
dy &= -\frac{v\,dv}{g + kv}
\end{aligned}$$

Integrating both sides, we get

$$\begin{aligned}
\int_h^y dy &= -\int_0^v \frac{v\,dv}{g + kv} \\
y - h &= -\frac{1}{k}\int_0^v \left(\frac{kv + g - g}{g + kv}\right)dv \\
y - h &= -\frac{1}{k}\left[\int_0^v dv - g\int_0^v \frac{dv}{g + kv}\right] \\
y - h &= -\frac{1}{k}\left[v - \frac{g}{k}\ln\left(\frac{g + kv}{g}\right)\right]
\end{aligned}$$

Thus, the height is related to the velocity of the particle by

$$y = h - \frac{v}{k} + \frac{g}{k^2}\ln\left(\frac{g + kv}{g}\right)$$

∎

Example 4.4. A particle of mass $m$ is projected with speed $v_0$ at an angle $\varphi$ to the horizontal. Find the time when the particle reaches its maximum height if air resistance is not negligible. It is given that the air resistance (i.e. the drag force) is linearly proportional to the velocity of the particle.

Solution: Let the drag force on the particle be $-k\vec{v}$, where the constant $k$ is positive. The equation of motion of the particle along the vertical is $-mg - kv_y = ma_y$, where $a_y$ is the acceleration of the particle along the vertical. But $a_y = \dfrac{dv_y}{dt}$, which gives

$$\frac{dv_y}{dt} = -\left(\frac{mg + kv_y}{m}\right)$$

Rearrange the equation by separating the variables,

$$\frac{dv_y}{mg + kv_y} = -\frac{dt}{m}$$
Integrating both sides of the equation, we obtain

$$\begin{aligned}
\int_{v_0\sin\varphi}^{v_y}\frac{dv_y}{mg + kv_y} &= -\frac{1}{m}\int_0^t dt \\
\frac{1}{k}\ln(mg + kv_y)\,\Big|_{v_y = v_0\sin\varphi}^{v_y} &= -\frac{t}{m} \\
\frac{1}{k}\ln\left(\frac{mg + kv_y}{mg + kv_0\sin\varphi}\right) &= -\frac{t}{m} \\
\frac{mg + kv_y}{mg + kv_0\sin\varphi} &= e^{-\frac{kt}{m}}
\end{aligned}$$

Therefore, the vertical velocity of the particle is given by

$$v_y = \frac{1}{k}\left[(mg + kv_0\sin\varphi)\,e^{-\frac{kt}{m}} - mg\right]$$

At the maximum point, $v_y = 0$, thus

$$\begin{aligned}
(mg + kv_0\sin\varphi)\,e^{-\frac{kt}{m}} &= mg \\
e^{\frac{kt}{m}} &= 1 + \frac{kv_0\sin\varphi}{mg} \\
t &= \frac{m}{k}\ln\left(1 + \frac{kv_0}{mg}\sin\varphi\right)
\end{aligned}$$

∎

4.4 Radioactive Decay

In a sample of $N$ radioactive nuclei, the rate at which the nuclei decay is proportional to $N$,

$$\frac{dN}{dt} = -\lambda N , \tag{4.2}$$

where $\lambda$ is positive and is called the disintegration constant (or decay constant). In fact, $\lambda$ has a characteristic value for every radionuclide. Rearranging equation 4.2, we have

$$\frac{dN}{N} = -\lambda\,dt$$

Integrating both sides, we obtain

$$\int_{N_0}^{N}\frac{dN}{N} = -\lambda\int_0^t dt$$

Here $N_0$ is the number of radioactive nuclei in the sample at time $t = 0$. Then

$$\begin{aligned}
\ln\left(\frac{N}{N_0}\right) &= -\lambda t \\
\frac{N}{N_0} &= e^{-\lambda t}
\end{aligned}$$

Therefore, we obtain the formula of radioactive decay:

$$N = N_0\,e^{-\lambda t} \tag{4.3}$$

A common measure of how long the radionuclides last is the half-life $T_{1/2}$. It is the time at which $N$ decreases to one-half of its initial value. From equation 4.3, we get

$$\begin{aligned}
\frac{N_0}{2} &= N_0\,e^{-\lambda T_{1/2}} \\
\frac{1}{2} &= e^{-\lambda T_{1/2}} \\
\ln\left(\frac{1}{2}\right) &= -\lambda\,T_{1/2} \\
T_{1/2} &= \frac{\ln 2}{\lambda}
\end{aligned}$$

It is also the time needed for $\left|\dfrac{dN}{dt}\right|$ to reduce to one-half of its initial value.

Example 4.5. A certain radioactive material is known to decay at a rate proportional to the amount present. If initially there is 50 mg of the material present and after two hours it is observed that the material has lost 10 percent of its original mass, find the mass of the material after four hours.

Solution: Let $N$ be the mass of the material and $k$ be the decay constant; we have

$$\frac{dN}{dt} = -kN$$

Following the steps stated in section 4.4, we obtain

$$N = N_0\,e^{-kt} , \tag{4.4}$$

where $N_0$ is the mass of the radioactive material at time $t = 0$.
Thus, we have

$$N = 50\,e^{-kt}$$

As 10 % of the original mass has been lost after two hours ($t = 2$), the mass of the material present is $50 - 5 = 45$ mg. So

$$45 = 50\,e^{-2k} ,$$

which gives $k = \dfrac{1}{2}\ln\left(\dfrac{50}{45}\right) = 0.053\ \text{hr}^{-1}$. After four hours ($t = 4$), there are $N = 50\,e^{-0.053\,(4)} = 40.5$ mg of the material left. ∎

4.5 Charging a Capacitor

A capacitor is charged by a battery of voltage $\mathcal{E}$ when the switch S in the circuit is closed. Figure 4.4 shows the RC circuit, which has a resistor of resistance $R$. The capacitance of a capacitor is defined by

$$C = \frac{Q}{V} ,$$

where $Q$ is the charge stored in the capacitor and $V$ is the voltage across the capacitor. It is a constant for a given capacitor. Going around the loop clockwise and using Kirchhoff's voltage rule, we have

$$\mathcal{E} - \frac{Q}{C} - IR = 0 ,$$

Figure 4.4: The RC circuit

where $\mathcal{E}$, $Q/C$, and $IR$ are the potential differences across the battery, the capacitor, and the resistor respectively, and $I = dQ/dt$. Then

$$\begin{aligned}
\mathcal{E} - \frac{Q}{C} - \left(\frac{dQ}{dt}\right)R &= 0 \\
C\mathcal{E} - Q &= RC\,\frac{dQ}{dt} \\
\frac{dt}{RC} &= \frac{dQ}{C\mathcal{E} - Q}
\end{aligned}$$

Integrating both sides, we have

$$\begin{aligned}
\int_0^t \frac{dt}{RC} &= \int_0^Q \frac{dQ}{C\mathcal{E} - Q} \\
\frac{t}{RC} &= -\int_{Q=0}^{Q}\frac{d(C\mathcal{E} - Q)}{C\mathcal{E} - Q} \\
\frac{t}{RC} &= -\ln(C\mathcal{E} - Q)\,\Big|_0^Q \\
\frac{t}{RC} &= \ln\left(\frac{C\mathcal{E}}{C\mathcal{E} - Q}\right) \\
e^{\frac{t}{RC}} &= \frac{C\mathcal{E}}{C\mathcal{E} - Q} \\
Q &= C\mathcal{E}\left(1 - e^{-\frac{t}{RC}}\right)
\end{aligned}$$

Therefore, we obtain the variation of the stored charge with time,

$$Q = Q_0\left(1 - e^{-\frac{t}{\tau}}\right) ,$$

which is plotted in figure 4.5.

Figure 4.5: The charging curve of a capacitor

Here $Q_0 = C\mathcal{E}$ is the maximum charge stored in the capacitor and $\tau = RC$ is the time constant. If the circuit has a greater value of $\tau$, the time needed for charging the capacitor is longer. When $t = \tau$, the charge stored in the capacitor is about 63 % of the maximum value. When the capacitor is fully charged, the voltage across it equals that of the battery and no charge flows (there is no current) in the circuit.
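The charging law can be checked numerically. The following is a minimal sketch in Python that evaluates the closed-form charge and, for comparison, integrates $dQ/dt = (C\mathcal{E} - Q)/(RC)$ with a simple forward Euler scheme. The function names, the step count, and the component values ($R = 1\ \text{k}\Omega$, $C = 1\ \mu\text{F}$, $\mathcal{E} = 5$ V) are hypothetical and chosen only for illustration.

```python
import math

def capacitor_charge(t, R, C, emf):
    """Closed-form charge Q(t) = C*emf*(1 - exp(-t/(R*C)))."""
    return C * emf * (1.0 - math.exp(-t / (R * C)))

def capacitor_charge_euler(t_end, R, C, emf, steps=100_000):
    """Integrate dQ/dt = (C*emf - Q)/(R*C) by forward Euler, starting from Q(0) = 0."""
    dt = t_end / steps
    q = 0.0
    for _ in range(steps):
        q += dt * (C * emf - q) / (R * C)
    return q

# Hypothetical component values (not taken from the text)
R, C, emf = 1.0e3, 1.0e-6, 5.0
tau = R * C          # time constant
Q0 = C * emf         # maximum charge

# fraction of the maximum charge reached after one time constant
fraction_at_tau = capacitor_charge(tau, R, C, emf) / Q0
# the same fraction obtained from the step-by-step integration
fraction_euler = capacitor_charge_euler(tau, R, C, emf) / Q0
```

Both fractions should be close to $1 - e^{-1}$, independent of the particular values chosen for $R$, $C$, and $\mathcal{E}$.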
The variation of current with time is

$$\begin{aligned}
I &= \frac{dQ}{dt} = \frac{d}{dt}\left\{Q_0\left(1 - e^{-\frac{t}{\tau}}\right)\right\} = \frac{Q_0}{\tau}\,e^{-\frac{t}{\tau}} \\
I &= \frac{\mathcal{E}}{R}\,e^{-\frac{t}{\tau}}
\end{aligned}$$

From the above equation, we see that the initial current is $\mathcal{E}/R$. At the instant the switch is closed, the uncharged capacitor has no voltage across it, so the circuit behaves as if the capacitor were replaced by a plain wire. Then, the current drops exponentially and eventually vanishes. At $t = \tau$, the current is about 37 % of the maximum value.

Figure 4.6: The variation of current in an RC circuit

4.6 Parabolic Mirror

Parallel light beams travel along the $x$ direction as shown in the figure. Here we investigate the shape of a mirror that reflects all such beams to the origin. This is a classical problem that can be solved by a separable differential equation. Consider a ray of light $KM$ which meets the mirror at $M(x, y)$. Let $TT'$ be the tangent to the mirror at $M$; then we have $\angle T'MK = \angle TMO = \angle MTO$, so triangle $OTM$ is isosceles.

Figure 4.7: A ray of light is reflected to pass the origin

Hence, $OT = OM = \sqrt{x^2 + y^2}$. The equation of the tangent at $M$ is $Y - y = y'(X - x)$, where $y' = \dfrac{dy}{dx}$. It gives the $x$-intercept $X = x - \dfrac{y}{y'}$ when $Y = 0$. Then $|OT| = |X| = -X = -x + \dfrac{y}{y'}$. Thus, $|OT| = |OM|$ implies $\sqrt{x^2 + y^2} = -x + \dfrac{y}{y'}$, or $\left(x + \sqrt{x^2 + y^2}\right)dy = y\,dx$. Therefore,

$$\frac{dy}{dx} = \frac{y}{x + \sqrt{x^2 + y^2}} \tag{4.5}$$

This is a homogeneous differential equation of degree zero because the RHS of the above equation is a homogeneous function satisfying $f(\lambda x, \lambda y) = f(x, y)$. Generally, a homogeneous function of degree $n$ satisfies $f(\lambda x, \lambda y) = \lambda^n f(x, y)$. Now, we let $x = ty$ and differentiate both sides with respect to $y$; then we have

$$\frac{dx}{dy} = t + y\left(\frac{dt}{dy}\right) \tag{4.6}$$

Using the fact that $\dfrac{dx}{dy} = \left(\dfrac{dy}{dx}\right)^{-1}$ and substituting equation 4.6 into equation 4.5, we obtain

$$\frac{dy}{y} = \frac{dt}{\sqrt{1 + t^2}}$$

Integrating both sides, we get

$$\begin{aligned}
\int\frac{dy}{y} &= \int\frac{dt}{\sqrt{1 + t^2}} \\
\ln y &= \ln\left(t + \sqrt{1 + t^2}\right) + \ln C
\end{aligned}$$

Hence, we have $y = C\left(t + \sqrt{1 + t^2}\right)$.
Substituting $t = x/y$, we obtain

$$x + \sqrt{x^2 + y^2} = \frac{y^2}{C}$$

After simplification, we finally get

$$y^2 = 2C\left(x + \frac{C}{2}\right)$$

4.7 Torricelli's Law of Draining

Suppose that a cylindrical water tank has water leaving it through a small hole of area $a$ at the bottom of the tank. Figure 4.8 shows the water tank. Denote the depth of water in the tank by $y(t)$ and the volume of water by $V$ at time $t$. The velocity of the stream of water leaving the hole is $v = \sqrt{2gy}$, which is the speed a drop of water would gain in falling freely from the water surface to the hole.

Figure 4.8: A draining cylindrical tank

It is worth mentioning that the movement of the water surface is very slow compared with the water leaving the hole. During a short time interval $dt$, the change in volume is

$$dV = -av\,dt = -a\sqrt{2gy}\,dt$$

It is equal to the change in volume near the water surface, $dV = A\,dy$, where $A$ is the cross sectional area of the cylinder. Therefore,

$$A\,dy = -a\sqrt{2gy}\,dt$$

Hence, we have

$$\frac{dy}{dt} = -\frac{a}{A}\sqrt{2gy}$$

Putting $k = a\sqrt{2g}/A$, we have

$$\frac{dy}{dt} = -k\sqrt{y}$$

This is a separable differential equation which has the following form.

$$\frac{dy}{\sqrt{y}} = -k\,dt$$

Integrating both sides and setting the lower and upper limits by the initial height $y_1$ and the final height $y_2$ respectively, we have

$$\begin{aligned}
\int_{y_1}^{y_2}\frac{dy}{\sqrt{y}} &= -k\int_0^t dt \\
2\sqrt{y}\,\Big|_{y_1}^{y_2} &= -kt
\end{aligned}$$

Note that $y_1 > y_2$. Hence, we have

$$2\left(\sqrt{y_2} - \sqrt{y_1}\right) = -kt \tag{4.7}$$

Example 4.6. A cylindrical tank has a small hole at the bottom and water is draining from the hole. The water is 5.0 m deep at noon and it is 2.5 m deep at 1:00 p.m. When will the tank be empty?

Solution: We apply equation 4.7 and substitute $t = 1$ hr, $y_1 = 5.0$ m and $y_2 = 2.5$ m, then

$$2\left(\sqrt{2.5} - \sqrt{5.0}\right) = -k\,(1)$$

We obtain $k = 1.31\ \text{m}^{1/2}\,\text{hr}^{-1}$. When the tank is empty, equation 4.7 becomes

$$\begin{aligned}
2\left(\sqrt{0} - \sqrt{5.0}\right) &= -(1.31)\,t \\
t &= 3.41\ \text{hr}
\end{aligned}$$

The tank will be empty at 3:25 p.m. ∎

4.8 First Order Linear Differential Equation

The first order linear differential equation has a standard form

$$\frac{dy}{dx} + P(x)\,y = Q(x) \tag{4.8}$$
One can solve it by multiplying both sides by an integrating factor $e^{\int P(x)\,dx}$. Equation 4.8 becomes

$$\begin{aligned}
e^{\int P(x)\,dx}\,\frac{dy}{dx} + e^{\int P(x)\,dx}\,P(x)\,y &= e^{\int P(x)\,dx}\,Q(x) \\
\frac{d}{dx}\left(y\,e^{\int P(x)\,dx}\right) &= e^{\int P(x)\,dx}\,Q(x)
\end{aligned}$$

Then integrating both sides with respect to $x$, we obtain

$$y\,e^{\int P(x)\,dx} = \int\left[e^{\int P(x)\,dx}\,Q(x)\right]dx$$

Thus the solution of equation 4.8 is

$$y = \left(e^{-\int P(x)\,dx}\right)\int\left[e^{\int P(x)\,dx}\,Q(x)\right]dx$$

Example 4.7. Solve $x\,\dfrac{dy}{dx} - ky = x^2$.

Solution: Rearrange the equation into the standard form

$$\frac{dy}{dx} - \left(\frac{k}{x}\right)y = x \tag{4.9}$$

where $P(x) = -k/x$ and $Q(x) = x$. The integrating factor is $e^{\int P(x)\,dx} = e^{-\int\frac{k}{x}\,dx} = e^{-k\ln x} = e^{\ln x^{-k}} = x^{-k}$. Multiplying both sides of equation 4.9 by $x^{-k}$, we have

$$\begin{aligned}
x^{-k}\,\frac{dy}{dx} - x^{-k}\left(\frac{k}{x}\right)y &= x^{1-k} \\
\frac{d}{dx}\left(y\,x^{-k}\right) &= x^{1-k} \\
y\,x^{-k} &= \int x^{1-k}\,dx
\end{aligned}$$

Thus, for $k \ne 2$,

$$\begin{aligned}
y &= x^k\int x^{1-k}\,dx \\
y &= x^k\left(\frac{x^{2-k}}{2 - k} + C\right) \\
y &= \frac{x^2}{2 - k} + Cx^k
\end{aligned}$$

∎

Example 4.8. Solve $(x^2 + 1)\,\dfrac{dy}{dx} + 4xy = x$ where $y(2) = 1$.

Solution: Rearrange the equation into the standard form

$$\frac{dy}{dx} + \left(\frac{4x}{x^2 + 1}\right)y = \frac{x}{x^2 + 1} \tag{4.10}$$

where $P(x) = \dfrac{4x}{x^2 + 1}$ and $Q(x) = \dfrac{x}{x^2 + 1}$. The integrating factor is

$$e^{\int P(x)\,dx} = e^{\int\frac{4x}{x^2+1}\,dx} = e^{2\int\frac{d(x^2+1)}{x^2+1}} = e^{2\ln(x^2+1)} = e^{\ln(x^2+1)^2} = (x^2 + 1)^2$$

Multiplying both sides of equation 4.10 by $(x^2 + 1)^2$, we have

$$\begin{aligned}
(x^2 + 1)^2\,\frac{dy}{dx} + 4x\,(x^2 + 1)\,y &= x\,(x^2 + 1) \\
\frac{d}{dx}\left\{y\,(x^2 + 1)^2\right\} &= x\,(x^2 + 1) \\
y\,(x^2 + 1)^2 &= \int x\,(x^2 + 1)\,dx
\end{aligned}$$

Therefore

$$y\,(x^2 + 1)^2 = \frac{x^4}{4} + \frac{x^2}{2} + C$$

Applying the given condition $y(2) = 1$, we obtain $C = 19$. We finally obtain

$$y\,(x^2 + 1)^2 = \frac{x^4}{4} + \frac{x^2}{2} + 19$$

∎

Example 4.9. A large tank initially contains 50 m$^3$ of brine in which 10 kg of salt is dissolved. Brine containing 2 kg of dissolved salt per m$^3$ flows into the tank at the rate of 5 m$^3$/min. The mixture is kept uniform by stirring, and the stirred mixture simultaneously flows out at the slower rate of 3 m$^3$/min. How much salt is in the tank at any time $t > 0$?

Figure 4.9: A tank of brine
Solution: The volume of brine in the tank at time $t$ is $\{50 + (5 - 3)\,t\}$ m$^3$ $= \{50 + 2t\}$ m$^3$. Let the amount of salt in the tank at the instant be $x$. The concentration of salt in the tank is $\dfrac{x}{50 + 2t}$.

The rate of change of salt in the tank is

$$\frac{dx}{dt} = 5\,(2) - 3\left(\frac{x}{50 + 2t}\right)$$

Hence, we have

$$\frac{dx}{dt} + \frac{3x}{50 + 2t} = 10 \tag{4.11}$$

It is a first order differential equation in the standard form given by equation 4.8, where $P(t) = \dfrac{3}{50 + 2t}$ and $Q(t) = 10$. The integrating factor is

$$e^{\int P(t)\,dt} = e^{\int\frac{3}{50+2t}\,dt} = e^{\frac{3}{2}\int\frac{d(50+2t)}{50+2t}} = e^{\frac{3}{2}\ln(50+2t)} = e^{\ln(50+2t)^{3/2}} = (50 + 2t)^{3/2}$$

Multiplying this factor to both sides of equation 4.11, we have

$$\frac{d}{dt}\left\{x\,(50 + 2t)^{3/2}\right\} = 10\,(50 + 2t)^{3/2}$$

Integrating both sides with respect to $t$, we obtain

$$\begin{aligned}
\int_{t=0,\;x=10}^{t,\;x} d\left\{x\,(50 + 2t)^{3/2}\right\} &= 10\int_0^t (50 + 2t)^{3/2}\,dt \\
x\,(50 + 2t)^{3/2}\,\Big|_{t=0,\;x=10}^{t,\;x} &= 5\int_0^t (50 + 2t)^{3/2}\,d(50 + 2t) \\
x\,(50 + 2t)^{3/2} - 10\,(50)^{3/2} &= 2\,(50 + 2t)^{5/2}\,\Big|_0^t \\
x\,(50 + 2t)^{3/2} - 10\,(50)^{3/2} &= 2\,(50 + 2t)^{5/2} - 2\,(50^{5/2})
\end{aligned}$$

Then

$$x = 2\,(50 + 2t) - \frac{2\,(50^{5/2}) - 10\,(50^{3/2})}{(50 + 2t)^{3/2}}$$

After simplification, we obtain the amount of salt in the tank at time $t > 0$

$$x = 100 + 4t - \frac{22{,}500\sqrt{2}}{(50 + 2t)^{3/2}}$$

∎

4.9 Second Order Homogeneous Differential Equations

In this section, we will put our focus on a second order homogeneous differential equation.

$$\ddot{x} + 2b\,\dot{x} + \omega^2 x = 0 ,$$

The term 'homogeneous' refers to a zero in the right hand side of the equation. For simplicity, we ignore the first order term in the discussion, i.e. $b = 0$. A typical example is the simple harmonic motion of an object which is an ideal system without any friction and the total energy of the object is a constant. The equation of motion is

$$\ddot{x} + \omega^2 x = 0 , \tag{4.12}$$

where $x$ is the displacement of the object measured from the equilibrium position and $\omega > 0$. Recall that in section 4.2, we have solved the equation by using the reduction of order. Now, we try to solve it directly.
Let $x = e^{\lambda t}$ be a solution of equation 4.12; then $\dot{x} = \lambda e^{\lambda t}$ and $\ddot{x} = \lambda^2 e^{\lambda t}$. Substituting into the equation, we have

$$(\lambda^2 + \omega^2)\,e^{\lambda t} = 0$$

which implies $\lambda^2 + \omega^2 = 0$. This is the auxiliary equation of the differential equation, and it has roots $\lambda = \pm i\omega$. Thus $e^{i\omega t}$ and $e^{-i\omega t}$ satisfy equation 4.12. The complementary solution of the equation becomes

$$x = c_1 e^{i\omega t} + c_2 e^{-i\omega t} ,$$

where $c_1$ and $c_2$ are arbitrary constants that can be determined if the initial conditions are given. Using the fact that $e^{i\omega t} = \cos\omega t + i\sin\omega t$ and $e^{-i\omega t} = \cos\omega t - i\sin\omega t$, the solution becomes

$$x = (c_1 + c_2)\cos\omega t + i\,(c_1 - c_2)\sin\omega t$$

where $c_1$ and $c_2$ could be complex numbers. Two complex numbers carry four real parameters, that is, four arbitrary constants. However, a second order differential equation admits only two arbitrary constants, so $c_1$ and $c_2$ must be related in such a way that only two independent parameters remain. We therefore take $c_1$ and $c_2$ to be complex conjugates, so that $A = c_1 + c_2$ and $B = i\,(c_1 - c_2)$ are both real. The solution becomes

$$x = A\cos\omega t + B\sin\omega t \tag{4.13}$$

Differentiating both sides with respect to $t$,

$$\dot{x} = \omega\,(-A\sin\omega t + B\cos\omega t) \tag{4.14}$$

Let us consider a typical example first. If an object is released from rest at $x = D$ when $t = 0$, we obtain the arbitrary constants $A = D$ and $B = 0$. Hence, we have

$$x = D\cos\omega t$$

Next, we study a second example. If the object is projected to the right with velocity $v_0$ at $x = 0$ when $t = 0$, we obtain $A = 0$ and $B = v_0/\omega$. Hence, we have

$$x = \frac{v_0}{\omega}\sin\omega t ,$$

where $v_0/\omega$ is the amplitude of the oscillation.

Sometimes, equation 4.13 is written as

$$x = \mu\sin(\omega t + \varphi) , \tag{4.15}$$

where $\mu$ and $\varphi$ are the amplitude and the phase angle of the oscillation respectively. The proof of equation 4.15 is straightforward: set $A = \mu\sin\varphi$ and $B = \mu\cos\varphi$ in equation 4.13, where $\mu = \sqrt{A^2 + B^2}$. Making use of the trigonometric identity $\sin(x+y) = \sin x\cos y + \cos x\sin y$, we have

$$x = A\cos\omega t + B\sin\omega t = \mu\,(\sin\varphi\cos\omega t + \cos\varphi\sin\omega t) = \mu\sin(\omega t + \varphi)$$

Example 4.10.
The equation of motion of a harmonic oscillator is given by $\dfrac{d^2x}{dt^2} + 4x = 0$, where $x$ is the displacement of the oscillator measured from its equilibrium position at time $t$. If $x = 3$ m and $\dot{x} = 6\sqrt{3}$ m/s when $t = 0$, determine $x$ as a function of time.

Solution: The roots of the auxiliary equation $\lambda^2 + 4 = 0$ are $\pm 2i$. Then the complementary solution is $x = A e^{2it} + B e^{-2it}$ or $x = \mu\sin(2t + \varphi)$. In practice, we use the latter expression to describe the oscillation rather than the former. Differentiating,

$$\dot{x} = 2\mu\cos(2t + \varphi)$$

From the initial conditions, we have

$$\begin{cases} 3 = \mu\sin\varphi \\ 6\sqrt{3} = 2\mu\cos\varphi \end{cases}$$

Then, we find $\mu = 6$ and $\varphi = \dfrac{\pi}{6}$. Therefore, we obtain $x = 6\sin\left(2t + \dfrac{\pi}{6}\right)$. ∎

Chapter 5 Trigonometry and Complex Numbers

5.1 Compound Angle Formulae

Before we can discuss how to represent complex numbers on a plane, we need some angle formulae. Consider the triangle $ABC$ in figure 5.1, where $D$ is the foot of the perpendicular from $C$ to $AB$. The length of the side $AB$ is denoted by $c$ and is equal to the sum of $AD$ and $DB$:

$$c = b\cos\alpha + a\cos\beta \tag{5.1}$$

Figure 5.1: Addition angle formula

There are several ways to write down the area of the triangle $ABC$. Substituting $c$ from equation 5.1,

$$\text{Area} = \frac{1}{2}\,bc\sin\alpha = \frac{1}{2}\left(b^2\sin\alpha\cos\alpha + ab\sin\alpha\cos\beta\right) \tag{5.2}$$

$$\phantom{\text{Area}} = \frac{1}{2}\,ac\sin\beta = \frac{1}{2}\left(ab\cos\alpha\sin\beta + a^2\sin\beta\cos\beta\right) \tag{5.3}$$

$$\phantom{\text{Area}} = \frac{1}{2}\,ab\sin(\pi - \alpha - \beta) = \frac{1}{2}\,ab\sin(\alpha + \beta) \tag{5.4}$$

If we write it as the sum of the areas of $ACD$ and $BCD$, we have

$$\text{Area} = \frac{1}{2}\,b^2\sin\alpha\cos\alpha + \frac{1}{2}\,a^2\sin\beta\cos\beta \tag{5.5}$$
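Each of equations 5.2 to 5.5 gives the same area, so the sum of equations 5.4 and 5.5 equals the sum of equations 5.2 and 5.3. After multiplying by 2,

$$ab\sin(\alpha+\beta) + b^2\sin\alpha\cos\alpha + a^2\sin\beta\cos\beta = b^2\sin\alpha\cos\alpha + ab\sin\alpha\cos\beta + ab\cos\alpha\sin\beta + a^2\sin\beta\cos\beta$$

Canceling some terms and common factors, we have the first of a group of very important identities,

$$\sin(\alpha+\beta) = \sin\alpha\cos\beta + \cos\alpha\sin\beta \tag{5.6}$$

Replace $\beta$ by $-\beta$,

$$\sin(\alpha-\beta) = \sin\alpha\cos\beta - \cos\alpha\sin\beta \tag{5.7}$$

Replace $\alpha$ by $\pi/2 - \alpha$ in the above equation,

$$\cos(\alpha+\beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta \tag{5.8}$$

Divide equation 5.6 by the above,

$$\tan(\alpha+\beta) = \frac{\sin\alpha\cos\beta + \cos\alpha\sin\beta}{\cos\alpha\cos\beta - \sin\alpha\sin\beta} = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} \tag{5.9}$$

Setting $\beta = \alpha$ in equations 5.6, 5.8 and 5.9, we obtain the useful formulae expressed in equations 5.10 to 5.14.

$$\sin 2\alpha = 2\sin\alpha\cos\alpha \tag{5.10}$$

$$\cos 2\alpha = \cos^2\alpha - \sin^2\alpha \tag{5.11}$$

$$\phantom{\cos 2\alpha} = 1 - 2\sin^2\alpha \tag{5.12}$$

$$\phantom{\cos 2\alpha} = 2\cos^2\alpha - 1 \tag{5.13}$$

$$\tan 2\alpha = \frac{2\tan\alpha}{1 - \tan^2\alpha} \tag{5.14}$$

Adding equation 5.6 and equation 5.7,

$$\sin(\alpha+\beta) + \sin(\alpha-\beta) = 2\sin\alpha\cos\beta \tag{5.15}$$

Substituting $A = \alpha + \beta$ and $B = \alpha - \beta$, we have

$$\sin A + \sin B = 2\sin\frac{A+B}{2}\cos\frac{A-B}{2} \tag{5.16}$$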
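As a quick sanity check, the identities 5.6, 5.8 and 5.16 derived above can be tested numerically at arbitrary angles. The following is only a minimal sketch in Python; the function name `check_identities` and the tolerance are illustrative choices and not part of these notes.

```python
import math

def check_identities(alpha, beta, tol=1e-12):
    """Return True if identities 5.6, 5.8 and 5.16 hold numerically (hypothetical helper)."""
    # Equation 5.6: sin(a + b) = sin a cos b + cos a sin b
    sin_sum_ok = math.isclose(
        math.sin(alpha + beta),
        math.sin(alpha) * math.cos(beta) + math.cos(alpha) * math.sin(beta),
        abs_tol=tol,
    )
    # Equation 5.8: cos(a + b) = cos a cos b - sin a sin b
    cos_sum_ok = math.isclose(
        math.cos(alpha + beta),
        math.cos(alpha) * math.cos(beta) - math.sin(alpha) * math.sin(beta),
        abs_tol=tol,
    )
    # Equation 5.16 with A = a + b and B = a - b
    A, B = alpha + beta, alpha - beta
    sum_to_product_ok = math.isclose(
        math.sin(A) + math.sin(B),
        2 * math.sin((A + B) / 2) * math.cos((A - B) / 2),
        abs_tol=tol,
    )
    return sin_sum_ok and cos_sum_ok and sum_to_product_ok

print(check_identities(0.3, 1.1))
print(check_identities(-2.0, 0.7))
```

A numerical check of this kind does not replace the geometric proof, but it is a convenient way to catch a sign error when recalling the formulae.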
We summarize

\begin{align}
\sin(A+B) &= \sin A\cos B + \cos A\sin B \tag{5.17}\\
\sin(A-B) &= \sin A\cos B - \cos A\sin B \tag{5.18}\\
\cos(A+B) &= \cos A\cos B - \sin A\sin B \tag{5.19}\\
\cos(A-B) &= \cos A\cos B + \sin A\sin B \tag{5.20}\\
\tan(A+B) &= \frac{\tan A + \tan B}{1 - \tan A\tan B} \tag{5.21}\\
\tan(A-B) &= \frac{\tan A - \tan B}{1 + \tan A\tan B} \tag{5.22}\\
\sin A + \sin B &= 2\sin\frac{A+B}{2}\cos\frac{A-B}{2} \tag{5.23}\\
\sin A - \sin B &= 2\cos\frac{A+B}{2}\sin\frac{A-B}{2} \tag{5.24}\\
\cos A + \cos B &= 2\cos\frac{A+B}{2}\cos\frac{A-B}{2} \tag{5.25}\\
\cos A - \cos B &= -2\sin\frac{A+B}{2}\sin\frac{A-B}{2} \tag{5.26}\\
\sin A\cos B &= \frac{1}{2}\left[\sin(A+B) + \sin(A-B)\right] \tag{5.27}\\
\cos A\sin B &= \frac{1}{2}\left[\sin(A+B) - \sin(A-B)\right] \tag{5.28}\\
\cos A\cos B &= \frac{1}{2}\left[\cos(A+B) + \cos(A-B)\right] \tag{5.29}\\
\sin A\sin B &= -\frac{1}{2}\left[\cos(A+B) - \cos(A-B)\right] \tag{5.30}
\end{align}

If we put $A = B$ in equation 5.30, we have

$$\sin^2 A = \frac{1}{2}\,(1 - \cos 2A) \tag{5.31}$$

Similarly, equation 5.29 gives

$$\cos^2 A = \frac{1}{2}\,(1 + \cos 2A) \tag{5.32}$$

Readers should be able to derive the formulae that we have not proved, and should memorize them all.

A final remark: Sometimes, it is useful to use the notation

\begin{align}
\csc\theta &\equiv \frac{1}{\sin\theta} \tag{5.33}\\
\sec\theta &\equiv \frac{1}{\cos\theta} \tag{5.34}\\
\cot\theta &\equiv \frac{1}{\tan\theta} \tag{5.35}
\end{align}

They are called cosecant, secant and cotangent respectively. Using the identity $\sin^2\theta + \cos^2\theta \equiv 1$, we obtain

$$1 + \tan^2\theta \equiv \sec^2\theta \tag{5.36}$$

and a similar one

$$1 + \cot^2\theta \equiv \csc^2\theta \tag{5.37}$$

Example 5.1. Evaluate $\sin 15^\circ$ without using a calculator.

Solution: We note that $\sin 15^\circ = \sin(45^\circ - 30^\circ) = \sin 45^\circ\cos 30^\circ - \cos 45^\circ\sin 30^\circ$. Thus,

$$\sin 15^\circ = \frac{1}{\sqrt{2}}\left(\frac{\sqrt{3}}{2}\right) - \frac{1}{\sqrt{2}}\left(\frac{1}{2}\right) = \frac{1}{4}\left(\sqrt{6} - \sqrt{2}\right).$$

∎

Example 5.2. Express $\sin 3x$ in terms of $\sin x$. Hence, evaluate $\cos 36^\circ$ without using a calculator.

Solution: Note that $\sin 3x = \sin(2x + x) = \sin 2x\cos x + \cos 2x\sin x$. Hence, we can write

\begin{align}
\sin 3x &= (2\sin x\cos x)\cos x + (1 - 2\sin^2 x)\sin x\\
&= 2\sin x\cos^2 x + \sin x - 2\sin^3 x\\
&= 2\sin x\,(1 - \sin^2 x) + \sin x - 2\sin^3 x\\
&= 3\sin x - 4\sin^3 x
\end{align}

Let $x = 36^\circ$; then $5x = 180^\circ$, or $3x = 180^\circ - 2x$. It follows that $\sin 3x = \sin(180^\circ - 2x) = \sin 2x$.
Hence, we can write

\begin{align}
3\sin x - 4\sin^3 x &= 2\sin x\cos x\\
3 - 4\sin^2 x &= 2\cos x\\
3 - 4\,(1 - \cos^2 x) &= 2\cos x\\
4\cos^2 x - 2\cos x - 1 &= 0
\end{align}

Solving the quadratic equation and ignoring the negative root, we obtain

$$\cos 36^\circ = \frac{1 + \sqrt{5}}{4}.$$

∎

Example 5.3. Find $\int \sin^2 x\,dx$ and $\int \cos^2 x\,dx$.

Solution: Recalling the trigonometric identities stated in equations 5.12 and 5.13, $\cos 2x = 1 - 2\sin^2 x$ and $\cos 2x = 2\cos^2 x - 1$, we have

$$\int \sin^2 x\,dx = \frac{1}{2}\int (1 - \cos 2x)\,dx = \frac{1}{2}\,x - \frac{1}{4}\sin 2x + C$$

$$\int \cos^2 x\,dx = \frac{1}{2}\int (1 + \cos 2x)\,dx = \frac{1}{2}\,x + \frac{1}{4}\sin 2x + C'$$

∎

Example 5.4. If $t = \tan\dfrac{\theta}{2}$, express $\sin\theta$, $\cos\theta$ and $\tan\theta$ in terms of $t$. These are the well-known half-angle formulae. Use the first result to evaluate $\int \csc\theta\,d\theta$.

Solution: Recalling equations 5.10, 5.11 and 5.14, we have

$$\sin\theta = 2\sin\frac{\theta}{2}\cos\frac{\theta}{2} = \frac{2\tan\frac{\theta}{2}}{1 + \tan^2\frac{\theta}{2}} = \frac{2t}{1+t^2},$$

$$\cos\theta = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} = \frac{1 - \tan^2\frac{\theta}{2}}{1 + \tan^2\frac{\theta}{2}} = \frac{1-t^2}{1+t^2}, \text{ and}$$

$$\tan\theta = \frac{2\tan\frac{\theta}{2}}{1 - \tan^2\frac{\theta}{2}} = \frac{2t}{1-t^2}.$$

The substitution $t = \tan\dfrac{\theta}{2}$ implies $dt = \dfrac{1}{2}\sec^2\dfrac{\theta}{2}\,d\theta$, which gives $d\theta = \dfrac{2\,dt}{1+t^2}$. Therefore,

$$\int \csc\theta\,d\theta = \int \frac{d\theta}{\sin\theta} = \int \frac{1+t^2}{2t}\cdot\frac{2\,dt}{1+t^2} = \int \frac{dt}{t} = \ln|t| + C = \ln\left|\tan\left(\frac{\theta}{2}\right)\right| + C$$

∎

Example 5.5. An object of mass $m$ is at rest on a rough table which has coefficient of static friction $\mu$. Find the minimum force needed to move the object.

Solution:

Figure 5.2: An applied force on a block

Let $F$ be the applied force, directed at an angle of elevation $\theta$. The normal reaction on the mass is $N$. When the mass is just about to move, the equations of motion along and normal to the horizontal are

$$\begin{cases} F\cos\theta = \mu N \\ F\sin\theta + N = mg \end{cases}$$

Eliminating $N$ from the above equations, we have $F\sin\theta + \dfrac{F\cos\theta}{\mu} = mg$. Hence, we can write

$$F = \frac{mg\mu}{\mu\sin\theta + \cos\theta} \tag{5.38}$$

Figure 5.3: The sides of the right-angled triangle

Now, we construct a right-angled triangle to proceed with the calculation. Figure 5.3 shows the triangle, which has an acute angle $\alpha$.
Its opposite side and adjacent side are defined by the coefficients of $\cos\theta$ and $\sin\theta$ in equation 5.38. Then, we express the length of the opposite side as $1 = \sqrt{\mu^2+1}\,\sin\alpha$ and that of the adjacent side as $\mu = \sqrt{\mu^2+1}\,\cos\alpha$. Hence,

$$F = \frac{mg\mu}{\sqrt{\mu^2+1}\,\left[\sin\theta\cos\alpha + \cos\theta\sin\alpha\right]} = \frac{mg\mu}{\sqrt{\mu^2+1}\,\sin(\theta+\alpha)}$$

The least applied force $F_{\min}$ is obtained if we put $\sin(\theta+\alpha) = 1$. Therefore,

$$F_{\min} = \frac{mg\mu}{\sqrt{\mu^2+1}}.$$

∎

Example 5.6. A particle is thrown uphill with speed $v_0$ on an inclined plane. The angle of projection measured from the inclined plane is $\alpha$. The elevation angle of the inclined plane measured from the horizontal is $\beta$. Find the range of the particle on the inclined plane. Show further that if $\alpha_1$ and $\alpha_2$ are the possible angles for the same range, then $\alpha_1 + \alpha_2 + \beta = \dfrac{\pi}{2}$.

Solution: Construct the coordinate system where the origin is located at the point of projection and the $x$-axis and $y$-axis are along and normal to the inclined plane respectively. The positive directions of $x$ and $y$ are uphill and above the inclined plane respectively. At time $t$, the displacements of the particle along the $x$ and $y$ axes are

$$\begin{cases} x = (v_0\cos\alpha)\,t - \dfrac{1}{2}\,(g\sin\beta)\,t^2 \\[2mm] y = (v_0\sin\alpha)\,t - \dfrac{1}{2}\,(g\cos\beta)\,t^2 \end{cases}$$

Figure 5.4: A projectile on the hill

When the particle hits the plane, the particle has coordinates $(R, 0)$. The second equation gives $v_0\sin\alpha = \dfrac{1}{2}\,gt\cos\beta$ and thus the required time $t = \dfrac{2v_0\sin\alpha}{g\cos\beta}$. Substituting this time into the first equation, we have the range

\begin{align}
R &= (v_0\cos\alpha)\left(\frac{2v_0\sin\alpha}{g\cos\beta}\right) - \frac{1}{2}\,(g\sin\beta)\left(\frac{4v_0^2\sin^2\alpha}{g^2\cos^2\beta}\right)\\
&= \frac{2v_0^2\sin\alpha}{g\cos^2\beta}\left[\cos\alpha\cos\beta - \sin\alpha\sin\beta\right]\\
&= \frac{2v_0^2\sin\alpha}{g\cos^2\beta}\left[\cos(\alpha+\beta)\right]\\
&= \frac{v_0^2}{g\cos^2\beta}\left[\sin(2\alpha+\beta) - \sin\beta\right]
\end{align}

Let $\alpha_1$ and $\alpha_2$ be the possible angles which achieve the same uphill range. Then $\sin(2\alpha_1+\beta) = \sin(2\alpha_2+\beta)$, so $2\alpha_2 + \beta = \pi - (2\alpha_1 + \beta)$, which gives $\alpha_1 + \alpha_2 + \beta = \pi/2$.

Remark: The range $R$ on the inclined plane is a maximum when $2\alpha + \beta = \pi/2$, which implies $\alpha = \dfrac{\pi}{4} - \dfrac{\beta}{2}$.
Hence, the maximum range is

$$\frac{v_0^2}{g\,(1 + \sin\beta)}$$

∎

5.2 Complex Numbers

Not all polynomial equations have real solutions; for example, no real number $x$ satisfies $x^2 + 1 = 0$. We need complex numbers. We denote the square root of $-1$ as $i \equiv \sqrt{-1}$. It follows that $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$. A complex number $z$ is defined to be the sum $a + b\,i$, where the real part of $z$, $\mathrm{Re}(z) = a$, and the imaginary part of $z$, $\mathrm{Im}(z) = b$, are real numbers. The addition (or subtraction) of two complex numbers is

$$(a + b\,i) \pm (c + d\,i) = (a \pm c) + (b \pm d)\,i \tag{5.39}$$

Using the distributive law, multiplication is

$$(a + b\,i) \times (c + d\,i) = ac + ad\,i + bc\,i + bd\,i^2 = (ac - bd) + (ad + bc)\,i \tag{5.40}$$

Note that $(a + b\,i)(c + d\,i) = (c + d\,i)(a + b\,i)$. Division is a bit tricky.

$$\frac{1}{a + b\,i} = \frac{1}{a + b\,i}\cdot\frac{a - b\,i}{a - b\,i} = \frac{a - b\,i}{a^2 + b^2} = \frac{a}{a^2+b^2} - \frac{b}{a^2+b^2}\,i , \tag{5.41}$$

and we have

$$\frac{c + d\,i}{a + b\,i} = (c + d\,i)\left(\frac{a}{a^2+b^2} - \frac{b}{a^2+b^2}\,i\right) = \frac{ac + bd}{a^2+b^2} + \frac{ad - bc}{a^2+b^2}\,i \tag{5.42}$$

Example 5.7. Compute the following expressions.

Solution:

$$(2 + 4\,i) + (3 - 6\,i) = 5 - 2\,i$$

$$(1 - 3\,i)\left(\frac{1}{2} + 5\,i\right) = \frac{31}{2} + \frac{7}{2}\,i$$

$$\frac{2 + 4\,i}{3 - 6\,i} = \frac{(2 + 4\,i)(3 + 6\,i)}{9 + 36} = \frac{-6 + 8\,i}{15}$$

∎

5.3 Complex Plane

We usually represent a complex number $z = a + b\,i$ as a point on the complex plane. Figure 5.5 shows the Argand diagram of a complex number. The horizontal axis is for the real part of the number and the vertical axis is for the imaginary part. The addition of two complex numbers corresponds to forming a parallelogram, as in figure 5.6.

Figure 5.5: Complex plane

Figure 5.6: Addition of two complex numbers

We define the absolute value or modulus $r = |z| = |a + b\,i|$ of a complex number as

$$|z| \equiv \sqrt{a^2 + b^2} , \tag{5.43}$$

which is just the "length" of the arrow of the number in the complex plane. If $z$ is, in fact, a real number ($b = 0$), we get back the absolute value of a real number.
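The arithmetic rules 5.39 to 5.42 and the modulus 5.43 can be checked against the worked results of Example 5.7 with Python's built-in complex type. This is only an illustrative sketch; Python writes the imaginary unit as `j`, and the helper name `modulus` is an assumption introduced here, not something defined in these notes.

```python
# Python's built-in complex numbers use j for the imaginary unit i.
z_sum = (2 + 4j) + (3 - 6j)      # Example 5.7, first expression
z_prod = (1 - 3j) * (0.5 + 5j)   # Example 5.7, second expression
z_quot = (2 + 4j) / (3 - 6j)     # Example 5.7, third expression

def modulus(z):
    """Modulus computed directly from equation 5.43 (hypothetical helper)."""
    return (z.real ** 2 + z.imag ** 2) ** 0.5

print(z_sum)                      # compare with 5 - 2i
print(z_prod)                     # compare with 31/2 + (7/2) i
print(z_quot, (-6 + 8j) / 15)     # compare with (-6 + 8i)/15
print(modulus(3 + 4j), abs(3 + 4j))
```

The built-in `abs` applied to a complex number returns the same modulus as the hand-written helper, which is why both are printed side by side.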
The argument is the angle $\arg z \equiv \theta$ in figure 5.5, $\theta = \tan^{-1}(b/a)$ (for $a > 0$; otherwise the quadrant of the point $(a, b)$ must be taken into account). The conjugate of $z$ is $\bar{z} = a - b\,i$. We have the following simple identities. The first one is the polar form of a complex number.

\begin{align}
z &= r\,(\cos\theta + i\sin\theta) \tag{5.44}\\
\bar{z} &= r\,(\cos\theta - i\sin\theta) \tag{5.45}\\
z + \bar{z} &= 2a = 2\,\mathrm{Re}(z) \tag{5.46}\\
z - \bar{z} &= 2b\,i = 2\,i\,\mathrm{Im}(z) \tag{5.47}\\
\overline{(\bar{z})} &= z \tag{5.48}\\
z\,\bar{z} &= |z|^2 = r^2 \tag{5.49}\\
|z| &= |\bar{z}| \tag{5.50}\\
\arg\bar{z} &= -\arg z \tag{5.51}
\end{align}

Equation 5.49 implies $a^2 + b^2 = (a + b\,i)(a - b\,i)$.

For two complex numbers $z_1 = a_1 + b_1\,i = r_1(\cos\theta_1 + i\sin\theta_1)$ and $z_2 = a_2 + b_2\,i = r_2(\cos\theta_2 + i\sin\theta_2)$, their product is

\begin{align}
z_1 z_2 &= r_1 r_2\,(\cos\theta_1 + i\sin\theta_1)(\cos\theta_2 + i\sin\theta_2)\\
&= r_1 r_2\left[(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i\,(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\right]\\
&= r_1 r_2\left[\cos(\theta_1+\theta_2) + i\sin(\theta_1+\theta_2)\right] \tag{5.52}
\end{align}

by equations 5.17 and 5.19. We see that the absolute value of a product is the product of the absolute values, and the argument of a product is the sum of the arguments:

\begin{align}
|z_1 z_2| &= |z_1|\,|z_2| \tag{5.53}\\
\arg(z_1 z_2) &= \arg z_1 + \arg z_2 \tag{5.54}
\end{align}

This is illustrated in figure 5.7.

Figure 5.7: Product of two complex numbers

Moreover,

\begin{align}
|z_1/z_2| &= |z_1| \,/\, |z_2| \tag{5.55}\\
\arg(z_1/z_2) &= \arg z_1 - \arg z_2 \tag{5.56}\\
\overline{z_1 + z_2} &= \bar{z}_1 + \bar{z}_2 \tag{5.57}\\
\overline{z_1 - z_2} &= \bar{z}_1 - \bar{z}_2 \tag{5.58}\\
\overline{z_1 z_2} &= \bar{z}_1\,\bar{z}_2 \tag{5.59}\\
\overline{(z_1/z_2)} &= \bar{z}_1/\bar{z}_2 \tag{5.60}
\end{align}

A complex number of unit modulus is represented by $\cos\theta + i\sin\theta$ (or $\cos\theta - i\sin\theta$), because its magnitude is 1. If $|z| = 1$, we have $|z|^2 = z\,\bar{z} = 1$, and then $\bar{z} = 1/z$. Furthermore,

\begin{align}
\cos\theta &= \frac{1}{2}\,(z + \bar{z}) = \frac{1}{2}\left(z + \frac{1}{z}\right) \tag{5.61}\\
\sin\theta &= \frac{1}{2i}\,(z - \bar{z}) = \frac{1}{2i}\left(z - \frac{1}{z}\right) \tag{5.62}
\end{align}

Example 5.8. Let $z_1 = 1 + i$ and $z_2 = \sqrt{3} + i$; find $\arg(z_1 z_2)$.

Solution: Note that $r_1 = \sqrt{2}$ and $r_2 = 2$. The principal arguments of $z_1$ and $z_2$ are $\arg z_1 = \pi/4$ and $\arg z_2 = \pi/6$ respectively.
We have

$$z_1 z_2 = (1 + i)(\sqrt{3} + i) = (\sqrt{3} - 1) + (\sqrt{3} + 1)\,i$$

The absolute value of $z_1 z_2$ is

$$\sqrt{(\sqrt{3}-1)^2 + (\sqrt{3}+1)^2} = \sqrt{3 - 2\sqrt{3} + 1 + 3 + 2\sqrt{3} + 1} = \sqrt{8}$$

and the argument is

$$\arg(z_1 z_2) = \tan^{-1}\left(\frac{\sqrt{3}+1}{\sqrt{3}-1}\right)$$

which can be verified to be $5\pi/12$ (i.e. $\pi/4 + \pi/6$). ∎

5.4 De Moivre's Theorem

For any positive integer $n$, we have

$$(\cos x + i\sin x)^n = \cos nx + i\sin nx \tag{5.63}$$

The proof can be completed by mathematical induction. This theorem can be generalized to include negative powers on the LHS of equation 5.63. Let $m = -n$; we can write

\begin{align}
(\cos x + i\sin x)^m &= (\cos x + i\sin x)^{-n}\\
&= \frac{1}{(\cos x + i\sin x)^n}\\
&= \frac{1}{\cos nx + i\sin nx}\\
&= \left(\frac{1}{\cos nx + i\sin nx}\right)\left(\frac{\cos nx - i\sin nx}{\cos nx - i\sin nx}\right)\\
&= \cos nx - i\sin nx\\
&= \cos(-nx) + i\sin(-nx)\\
&= \cos mx + i\sin mx
\end{align}

Furthermore, the theorem is also true if $n$ is replaced by a rational number, e.g. $p/q$, where $p$ and $q$ are integers. The proof is straightforward.

$$\left(\cos\left(\frac{p}{q}\,x\right) + i\sin\left(\frac{p}{q}\,x\right)\right)^{q} = \cos px + i\sin px = (\cos x + i\sin x)^p$$

Hence,

$$\cos\left(\frac{p}{q}\right)x + i\sin\left(\frac{p}{q}\right)x = (\cos x + i\sin x)^{p/q}$$

Example 5.9. Show that $(\cos x - i\sin x)^n = \cos nx - i\sin nx$, where $n$ is an integer.

Solution: We can write

\begin{align}
(\cos x - i\sin x)^n &= (\cos(-x) + i\sin(-x))^n\\
&= \cos(-nx) + i\sin(-nx)\\
&= \cos nx - i\sin nx
\end{align}

This is an alternative form of De Moivre's theorem. ∎

Example 5.10. Find expressions for $\cos 3x$ and $\sin 3x$.

Solution: By using De Moivre's theorem, we have

\begin{align}
\cos 3x + i\sin 3x &= (\cos x + i\sin x)^3\\
&= \cos^3 x + 3i\cos^2 x\sin x - 3\cos x\sin^2 x - i\sin^3 x\\
&= (\cos^3 x - 3\cos x\sin^2 x) + i\,(3\cos^2 x\sin x - \sin^3 x)\\
&= (4\cos^3 x - 3\cos x) + i\,(3\sin x - 4\sin^3 x)
\end{align}

Therefore, $\cos 3x = 4\cos^3 x - 3\cos x$ and $\sin 3x = 3\sin x - 4\sin^3 x$. ∎

Example 5.11. Solve $z^3 + 1 = 0$.

Solution: Rearrange the equation and rewrite $-1$ in polar form.

$$z^3 = -1 = \cos(2k\pi + \pi) + i\sin(2k\pi + \pi) ,$$

where $k = 0, \pm 1, \pm 2, \cdots$. A cubic equation has three roots, so

$$z = \cos\left(\frac{(2k+1)\,\pi}{3}\right) + i\sin\left(\frac{(2k+1)\,\pi}{3}\right), \quad \text{where } k = 0, 1, 2.$$
Explicitly, the three roots are

\begin{align}
z &= \cos\left(\frac{\pi}{3}\right) + i\sin\left(\frac{\pi}{3}\right), \quad -1, \quad \text{and} \quad \cos\left(\frac{5\pi}{3}\right) + i\sin\left(\frac{5\pi}{3}\right)\\
&= \frac{1}{2} + i\,\frac{\sqrt{3}}{2}, \quad -1, \quad \text{and} \quad \frac{1}{2} - i\,\frac{\sqrt{3}}{2}
\end{align}

Figure 5.8 shows the three roots of the equation, where $z_0$, $z_1$, and $z_2$ correspond to $k = 0$, 1, and 2 respectively. $z_1$ is real, while $z_0$ and $z_2$ are conjugates of each other. In fact, the non-real roots of a polynomial equation with real coefficients always come in conjugate pairs. ∎

Figure 5.8: The Argand diagram of the roots of $z^3 + 1 = 0$

5.5 Euler's Formula

If $|z| = 1$, we write $z = \cos x + i\sin x$. Taylor's series was discussed in section 2.9; let us recall the series of the following functions here.

\begin{align}
e^x &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\\
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots\\
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots
\end{align}

Euler's formula states that $z$ can be represented by an exponential function.

$$e^{ix} = \cos x + i\sin x \tag{5.64}$$

Starting from the formula, we get

\begin{align}
\cos x &= \frac{1}{2}\left(e^{ix} + e^{-ix}\right) \tag{5.65}\\
\sin x &= \frac{1}{2i}\left(e^{ix} - e^{-ix}\right) \tag{5.66}
\end{align}

Remark: If $z = r\,(\cos x + i\sin x)$, then

$$\ln z = \ln r + \ln(\cos x + i\sin x) = \ln|z| + \ln e^{ix} = \ln|z| + i\,x$$

Therefore

$$\ln z = \ln|z| + i\arg z \tag{5.67}$$

Example 5.12. Prove that $\dfrac{1 + \cos 6x + i\sin 6x}{1 + \cos 6x - i\sin 6x} = \cos 6x + i\sin 6x$.

Solution:

\begin{align}
\frac{1 + \cos 6x + i\sin 6x}{1 + \cos 6x - i\sin 6x} &= \frac{1 + e^{6ix}}{1 + e^{-6ix}}\\
&= \frac{e^{3ix}}{e^{-3ix}}\left(\frac{e^{-3ix} + e^{3ix}}{e^{3ix} + e^{-3ix}}\right)\\
&= e^{6ix}\\
&= \cos 6x + i\sin 6x
\end{align}

∎

Example 5.13. Evaluate (a) $\ln(-3)$ and (b) $\ln 3i$.

Solution:

(a) Using equation 5.67, we have

$$\ln(-3) = \ln|-3| + i\arg(-3) = \ln 3 + i\,(2k\pi + \pi) = \ln 3 + i\,(2k+1)\,\pi ,$$

where $k = 0, \pm 1, \pm 2, \cdots$.

(b) Using equation 5.67 again, we have

$$\ln 3i = \ln|3i| + i\arg(3i) = \ln 3 + i\left(2k\pi + \frac{\pi}{2}\right),$$

where $k = 0, \pm 1, \pm 2, \cdots$. ∎

Example 5.14. Find the value of $i^i$. Is it a real number or an imaginary number?
Solution: Rewriting $i^i$ as a power of the exponential constant and using equation 5.67, we obtain

$$i^i = \left(e^{\ln i}\right)^i = e^{i\ln i} = e^{i\,[\ln|i| + i\arg(i)]} = e^{i\,[\ln 1 + i\,(2k\pi + \frac{\pi}{2})]} = e^{-(2k\pi + \frac{\pi}{2})} ,$$

where $k = 0, \pm 1, \pm 2, \cdots$. Equivalently, we have

$$i^i = e^{(2n\pi - \frac{\pi}{2})} ,$$

where $n = 0, \pm 1, \pm 2, \cdots$. So, we conclude that $i^i$ is a real number. ∎

Example 5.15. Find the sum of the infinite series

$$\frac{\sin x}{e} + \frac{\sin 2x}{e^2} + \frac{\sin 3x}{e^3} + \cdots$$

[Hint: Construct a similar series with cosine functions.]

Solution: Define the following infinite series

\begin{align}
C &= 1 + \frac{\cos x}{e} + \frac{\cos 2x}{e^2} + \frac{\cos 3x}{e^3} + \cdots\\
S &= \frac{\sin x}{e} + \frac{\sin 2x}{e^2} + \frac{\sin 3x}{e^3} + \cdots
\end{align}

So we have

\begin{align}
C + iS &= 1 + \frac{\cos x + i\sin x}{e} + \frac{\cos 2x + i\sin 2x}{e^2} + \frac{\cos 3x + i\sin 3x}{e^3} + \cdots\\
&= 1 + \frac{\cos x + i\sin x}{e} + \frac{(\cos x + i\sin x)^2}{e^2} + \frac{(\cos x + i\sin x)^3}{e^3} + \cdots\\
&= 1 + \frac{e^{ix}}{e} + \frac{e^{i2x}}{e^2} + \frac{e^{i3x}}{e^3} + \cdots
\end{align}

The RHS is a geometric series with common ratio $\dfrac{e^{ix}}{e}$, where $\left|\dfrac{e^{ix}}{e}\right| = \dfrac{1}{e} < 1$. Thus,

\begin{align}
C + iS &= \frac{1}{1 - \dfrac{e^{ix}}{e}}\\
&= \frac{e}{e - e^{ix}}\\
&= \frac{e}{e - \cos x - i\sin x}\\
&= \frac{e}{e - \cos x - i\sin x}\cdot\frac{e - \cos x + i\sin x}{e - \cos x + i\sin x}\\
&= \frac{e\,(e - \cos x + i\sin x)}{(e - \cos x)^2 + \sin^2 x}\\
&= \frac{e\,(e - \cos x)}{e^2 - 2e\cos x + 1} + \frac{i\,e\sin x}{e^2 - 2e\cos x + 1}
\end{align}

Extracting the imaginary part, we have

$$\frac{\sin x}{e} + \frac{\sin 2x}{e^2} + \frac{\sin 3x}{e^3} + \cdots = \frac{e\sin x}{e^2 - 2e\cos x + 1}$$

∎

5.6 A Revisit to Simple Harmonic Motion

In section 4.2 we described the motion of an object which performs simple harmonic motion. The oscillation obeys Hooke's law, such that the restoring force is always linearly proportional to the displacement of the object from its equilibrium position.

Figure 5.9: The spring-mass system

Recall that the equation of motion of the object is a second order differential equation and is given by

$$\ddot{x} + \omega^2 x = 0 \tag{5.68}$$

where $\omega$ is the angular frequency of the oscillation.
Admittedly, the solution of this equation in section 4.2 was a bit clumsy, because we had to integrate twice in order to obtain the expression of $x$ in terms of $t$. Now, we solve the equation again through the analysis of complex roots of the characteristic equation.

Note that the characteristic equation (auxiliary equation) of the differential equation is $\lambda^2 + \omega^2 = 0$. Its roots are $\pm i\omega$. One can verify that $e^{i\omega t}$ and $e^{-i\omega t}$ satisfy the differential equation. Thus, the general solution of equation 5.68 is

$$x = c_1 e^{i\omega t} + c_2 e^{-i\omega t} , \tag{5.69}$$

where $c_1$ and $c_2$ are arbitrary constants which could be complex. Using Euler's formula, we have

$$x = (c_1 + c_2)\cos\omega t + i\,(c_1 - c_2)\sin\omega t \tag{5.70}$$

We know that the general solution of a second order differential equation contains two arbitrary constants. However, two complex numbers $c_1$ and $c_2$ carry four independent real constants (i.e. four arbitrary constants). Thus, $c_1$ and $c_2$ are taken to be conjugates of each other, such that they produce two real constants $A$ and $B$ by

$$\begin{cases} c_1 + c_2 = A \\ i\,(c_1 - c_2) = B \end{cases}$$

Then

$$x = A\cos\omega t + B\sin\omega t \tag{5.71}$$

The solution $x$ is real, and it measures the displacement of the object from its equilibrium position. Finally, we use the technique of example 5.5 to rewrite the solution as

$$x = \mu\sin(\omega t + \delta) , \tag{5.72}$$

where $\mu = \sqrt{A^2 + B^2}$ and $\delta = \tan^{-1}(A/B)$. If $x = D$ (the amplitude of oscillation) and $\dot{x} = 0$ when $t = 0$, the values of $\mu$ and $\delta$ are determined ($\mu = D$ and $\delta = \pi/2$). Then, we have $x = D\sin(\omega t + \pi/2) = D\cos\omega t$.

Alternatively, we can rewrite equation 5.71 as

$$x = \mu\cos(\omega t - \delta) , \tag{5.73}$$

if we set $\mu = \sqrt{A^2 + B^2}$ and $\delta = \tan^{-1}(B/A)$. This is an alternative form to equation 5.72. For the case where $x = D$ and $\dot{x} = 0$ when $t = 0$, the values of $\mu$ and $\delta$ are determined again ($\mu = D$ and $\delta = 0$). Then, we have $x = D\cos\omega t$. To conclude, the expressions in equations 5.72 and 5.73 give the same $x$ for the same initial (boundary) conditions of the system.
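The equivalence of the three forms 5.71, 5.72 and 5.73 can also be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and `math.atan2` is used in place of $\tan^{-1}$ as an assumption of this sketch, so that the case $B = 0$ (where $A/B$ is undefined) still yields $\delta = \pi/2$.

```python
import math

def sine_form(A, B):
    """Amplitude and phase for x = mu sin(wt + delta), with tan(delta) = A/B."""
    return math.hypot(A, B), math.atan2(A, B)

def cosine_form(A, B):
    """Amplitude and phase for x = mu cos(wt - delta), with tan(delta) = B/A."""
    return math.hypot(A, B), math.atan2(B, A)

def max_mismatch(A, B, omega, times):
    """Largest difference between equation 5.71 and the forms 5.72 and 5.73."""
    mu_s, delta_s = sine_form(A, B)
    mu_c, delta_c = cosine_form(A, B)
    worst = 0.0
    for t in times:
        x = A * math.cos(omega * t) + B * math.sin(omega * t)
        x_sin = mu_s * math.sin(omega * t + delta_s)
        x_cos = mu_c * math.cos(omega * t - delta_c)
        worst = max(worst, abs(x - x_sin), abs(x - x_cos))
    return worst

# Object released from rest at x = D: A = D, B = 0 (D and omega are sample values).
D, omega = 2.0, 3.0
print(sine_form(D, 0.0))    # amplitude D and phase pi/2
print(cosine_form(D, 0.0))  # amplitude D and phase 0
print(max_mismatch(D, 0.0, omega, [0.0, 0.1, 0.5, 1.3]))
```

For the release-from-rest case the two phases come out as $\pi/2$ and $0$, matching the values quoted for equations 5.72 and 5.73.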
5.7 Particle in a Box

In the simplest quantum mechanical system, a particle is trapped in a 1-D box with infinitely hard walls, as shown in figure 5.10. The potential along $x$ is given by

$$V(x) = \begin{cases} \infty & x < 0 \\ 0 & 0 \le x \le L \\ \infty & x > L \end{cases}$$

Figure 5.10: The infinite potential well

The wavefunction $\psi$ of the particle is given by Schrödinger's equation,

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\,\psi = E\,\psi , \tag{5.74}$$

where $m$ and $E$ are the mass and the energy of the particle respectively. The reduced Planck constant $\hbar$ is related to the Planck constant $h$ by $\hbar = h/(2\pi)$, and $E > 0$. Inside the box the potential $V$ is zero. The equation becomes

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\,\psi \tag{5.75}$$

Setting $k^2 = \dfrac{2mE}{\hbar^2}$ and rearranging the equation, we obtain

$$\frac{d^2\psi}{dx^2} + k^2\psi = 0 \tag{5.76}$$

The characteristic equation of the above equation is $\lambda^2 + k^2 = 0$. Its roots are $ik$ and $-ik$. Using the result in equation 5.71, the general solution of equation 5.76 is simply

$$\psi = A\cos kx + B\sin kx \tag{5.77}$$

The boundary condition on $\psi$ tells us that $\psi = 0$ at $x = 0$, so $A = 0$. Similarly, the fact that $\psi = 0$ at $x = L$ gives $B\sin kL = 0$, and thus, for a non-trivial solution, $kL = n\pi$, where $n = 1, 2, 3, \ldots$. The wave number $k$ is now quantized,

$$k = \frac{n\pi}{L} \qquad n = 1, 2, 3, \ldots \tag{5.78}$$

and the energy is quantized (discrete energy levels $E_n$), as shown in figure 5.11:

$$E_n = \frac{n^2\pi^2\hbar^2}{2mL^2} \qquad n = 1, 2, 3, \ldots \tag{5.79}$$

Figure 5.11: The energy levels and wavefunctions of the particle

The wavefunction of the particle is given by
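$$\psi_n = B\sin\frac{n\pi x}{L} \qquad n = 1, 2, 3, \ldots \tag{5.80}$$

Chapter 6 Partial Differentiation

6.1 Partial Derivative

Consider a function $f$ of three variables $x$, $y$ and $z$,

$$f(x, y, z)$$

If $y$ and $z$ are held constant and only $x$ is allowed to vary, the partial derivative with respect to $x$ is denoted by $\dfrac{\partial f}{\partial x}$ or $f_x$ and is defined as the limit

$$\frac{\partial f}{\partial x} = \lim_{\Delta x\to 0}\frac{f(x+\Delta x, y, z) - f(x, y, z)}{\Delta x}$$

The total differential of $f$ is given by

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz , \tag{6.1}$$

where $df$ represents the change in $f$ due to the infinitesimal changes in $x$, $y$ and $z$. The proof of equation 6.1 is shown below.

Consider the difference of the functional values at two adjacent points $P(x, y, z)$ and $Q(x+\Delta x, y+\Delta y, z+\Delta z)$,

\begin{align}
\Delta f &= f(x+\Delta x, y+\Delta y, z+\Delta z) - f(x, y, z)\\
&= \left[f(x+\Delta x, y+\Delta y, z+\Delta z) - f(x, y+\Delta y, z+\Delta z)\right]\\
&\quad + \left[f(x, y+\Delta y, z+\Delta z) - f(x, y, z+\Delta z)\right]\\
&\quad + \left[f(x, y, z+\Delta z) - f(x, y, z)\right]
\end{align}

Then, we can write

\begin{align}
\Delta f &= \left[\left(\frac{f(x+\Delta x, y+\Delta y, z+\Delta z) - f(x, y+\Delta y, z+\Delta z)}{\Delta x}\right)\Delta x\right]\\
&\quad + \left[\left(\frac{f(x, y+\Delta y, z+\Delta z) - f(x, y, z+\Delta z)}{\Delta y}\right)\Delta y\right]\\
&\quad + \left[\left(\frac{f(x, y, z+\Delta z) - f(x, y, z)}{\Delta z}\right)\Delta z\right]
\end{align}

In the limiting case as $\Delta x \to 0$, $\Delta y \to 0$, and $\Delta z \to 0$, we have $\Delta x \cong dx$, $\Delta y \cong dy$, $\Delta z \cong dz$, and $\Delta f \cong df$. The above equation reduces to

\begin{align}
df &= \left[\lim_{\Delta x\to 0}\left(\frac{f(x+\Delta x, y+\Delta y, z+\Delta z) - f(x, y+\Delta y, z+\Delta z)}{\Delta x}\right)\Delta x\right]\\
&\quad + \left[\lim_{\Delta y\to 0}\left(\frac{f(x, y+\Delta y, z+\Delta z) - f(x, y, z+\Delta z)}{\Delta y}\right)\Delta y\right]\\
&\quad + \left[\lim_{\Delta z\to 0}\left(\frac{f(x, y, z+\Delta z) - f(x, y, z)}{\Delta z}\right)\Delta z\right]
\end{align}

Therefore, we have $df = \dfrac{\partial f}{\partial x}\,dx + \dfrac{\partial f}{\partial y}\,dy + \dfrac{\partial f}{\partial z}\,dz$.

Remarks on chain rules:

1. If $f$ is a three-variable function $f(x, y, z)$, where $x$, $y$ and $z$ are functions of $t$, i.e. $x(t)$, $y(t)$ and $z(t)$, then the derivative of $f$ with respect to $t$ is the total derivative of $f$ and is given by the chain rule

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}$$

2. Suppose that $y$ and $z$ are functions of $x$; then $f$ becomes a function of $x$. So, we have

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} + \frac{\partial f}{\partial z}\frac{dz}{dx}$$

Example 6.2 gives an illustration of this.

3.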
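Suppose that $x$ and $y$ are independent but that $z$ is a function of $x$ and $y$; then $\dfrac{dy}{dx} = 0$ and $f$ becomes a function of $x$ and $y$. So, we have

$$\left(\frac{\partial f}{\partial x}\right)_y = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}$$

Note that $f$ on the LHS is purely a function of $x$ and $y$, because $z$ has been substituted into it. The subscript indicates that $y$ is held constant. On the RHS, $f$ corresponds to a function of $x$, $y$ and $z$. Similarly, we have

$$\left(\frac{\partial f}{\partial y}\right)_x = \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y}$$

if $y$ is allowed to vary but $x$ is fixed. As a reminder, the subscript labels the quantity being fixed. Read examples 6.3 and 6.4.

Example 6.1. If $f = e^x\sin y$, find $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, $\dfrac{\partial^2 f}{\partial x^2}$, $\dfrac{\partial^2 f}{\partial y^2}$, and $\dfrac{\partial^2 f}{\partial x\,\partial y}$.

Solution: Consider the expression $f = e^x\sin y$; we obtain

$$\frac{\partial f}{\partial x} = e^x\sin y \quad\text{and}\quad \frac{\partial^2 f}{\partial x^2} = e^x\sin y$$

$$\frac{\partial f}{\partial y} = e^x\cos y \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2} = -e^x\sin y$$

$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}\left(e^x\cos y\right) = e^x\cos y$$

Remark:

$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}\left(e^x\sin y\right) = e^x\cos y$$

∎

Example 6.2. Given that $f = e^x\sin y$, find $\dfrac{\partial f}{\partial x}$. If $y = e^x$, find $\dfrac{df}{dx}$.

Solution: Consider $f = e^x\sin y$; we have $\dfrac{\partial f}{\partial x} = e^x\sin y$.

Now, we put $y = e^x$. The function $f$ can be rewritten as a single-variable function, $f = e^x\sin e^x$, and

$$\frac{df}{dx} = (\sin e^x)\,e^x + e^x\,(\cos e^x)\,e^x = e^x\sin e^x + e^{2x}\cos e^x$$

Alternatively, we apply the chain rule in remark 2 of section 6.1 and ignore $z$; then

\begin{align}
\frac{df}{dx} &= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}\\
&= e^x\sin y + (e^x\cos y)\,e^x\\
&= e^x\sin y + e^{2x}\cos y\\
&= e^x\sin e^x + e^{2x}\cos e^x
\end{align}

∎

Example 6.3. Let $f(x, y, z) = x^2 + xy^2 + z$, where $z(x, y) = 3x - y$. Obtain $\dfrac{\partial f}{\partial x}$ and $\left(\dfrac{\partial f}{\partial x}\right)_y$.

Solution: Differentiating with $y$ and $z$ held fixed, $\dfrac{\partial f}{\partial x} = 2x + y^2$.

In addition, we have $\dfrac{\partial f}{\partial z} = 1$ and $\dfrac{\partial z}{\partial x} = 3$. The chain rule in remark 3 of section 6.1 gives

\begin{align}
\left(\frac{\partial f}{\partial x}\right)_y &= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}\\
&= 2x + y^2 + (1)(3)\\
&= 2x + y^2 + 3
\end{align}

Let us illustrate the meaning of $\left(\dfrac{\partial f}{\partial x}\right)_y$. The function $f(x, y, z)$ can be rewritten as $f(x, y, 3x - y) = x^2 + xy^2 + 3x - y$. It is a function of $x$ and $y$ only.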
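Hence, $\left(\dfrac{\partial f}{\partial x}\right)_y = 2x + y^2 + 3$, where $y$ is fixed in the derivative. ∎

Example 6.4. Let $f(x, y, z) = xyz$, where $z = \ln(3x + 2y + z)$. Obtain $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial z}$ and $\left(\dfrac{\partial f}{\partial x}\right)_y$.

Solution: It is easy to write down $\dfrac{\partial f}{\partial x} = yz$ and $\dfrac{\partial f}{\partial z} = xy$.

We observe from $z = \ln(3x + 2y + z)$ that $z$ is coupled with $x$ and $y$ and cannot be extracted explicitly from the expression. So, directly substituting an expression for $z$ into $f(x, y, z)$ and then differentiating to obtain $\left(\dfrac{\partial f}{\partial x}\right)_y$ is not possible. However, the chain rule in remark 3 of section 6.1 gives

$$\left(\frac{\partial f}{\partial x}\right)_y = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}$$

Let us work out $\dfrac{\partial z}{\partial x}$. One should note that $\dfrac{\partial z}{\partial x} = \left(\dfrac{\partial z}{\partial x}\right)_y$. Differentiating $z = \ln(3x + 2y + z)$ with $y$ fixed, we have $\dfrac{\partial z}{\partial x} = \dfrac{3 + \dfrac{\partial z}{\partial x}}{3x + 2y + z}$, which gives $\dfrac{\partial z}{\partial x} = \dfrac{3}{3x + 2y + z - 1}$. Hence, we have

\begin{align}
\left(\frac{\partial f}{\partial x}\right)_y &= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}\\
&= yz + \frac{3xy}{3x + 2y + z - 1} = \frac{3xyz + 2y^2 z + yz^2 - yz + 3xy}{3x + 2y + z - 1}
\end{align}

∎

Example 6.5. The Cartesian coordinates are related to the polar coordinates by $x = r\cos\theta$ and $y = r\sin\theta$. Find

(a) $\dfrac{\partial x}{\partial r}$ and $\dfrac{\partial x}{\partial\theta}$ in terms of $r$ and $\theta$; hence find $\dfrac{\partial^2 x}{\partial r^2}$ and $\dfrac{\partial^2 x}{\partial\theta^2}$.

(b) $\dfrac{\partial r}{\partial x}$ and $\dfrac{\partial\theta}{\partial x}$ in terms of $r$ and $\theta$; hence find $\dfrac{\partial^2 r}{\partial x^2}$ and $\dfrac{\partial^2\theta}{\partial x^2}$.

Solution:

(a) $\dfrac{\partial x}{\partial r} = \cos\theta$ and $\dfrac{\partial x}{\partial\theta} = -r\sin\theta$. Hence, $\dfrac{\partial^2 x}{\partial r^2} = 0$ and $\dfrac{\partial^2 x}{\partial\theta^2} = -r\cos\theta$.

(b) Consider the expression $x = r\cos\theta$ and differentiate both sides with respect to $x$; we obtain

$$1 = r\,\frac{\partial}{\partial x}(\cos\theta) + \cos\theta\,\frac{\partial r}{\partial x}$$

which implies

$$1 = -r\sin\theta\,\frac{\partial\theta}{\partial x} + \cos\theta\,\frac{\partial r}{\partial x} \tag{6.2}$$

Consider the expression $y = r\sin\theta$ and differentiate both sides with respect to $x$; we obtain

$$0 = r\,\frac{\partial}{\partial x}(\sin\theta) + \sin\theta\,\frac{\partial r}{\partial x}$$

which implies

$$0 = r\cos\theta\,\frac{\partial\theta}{\partial x} + \sin\theta\,\frac{\partial r}{\partial x} \tag{6.3}$$

Solving equations 6.2 and 6.3, we obtain $\dfrac{\partial r}{\partial x} = \cos\theta$ and $\dfrac{\partial\theta}{\partial x} = -\dfrac{\sin\theta}{r}$.

Alternatively, one may consider the expression $r^2 = x^2 + y^2$. Differentiating both sides with respect to $x$, we obtain

$$2r\,\frac{\partial r}{\partial x} = 2x$$

$$\frac{\partial r}{\partial x} = \frac{x}{r} = \frac{r\cos\theta}{r} = \cos\theta$$

Next, we know that $\tan\theta = \dfrac{y}{x}$.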
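If we differentiate both sides with respect to $x$, we obtain

$$\sec^2\theta\left(\frac{\partial\theta}{\partial x}\right) = \frac{x\left(\dfrac{\partial y}{\partial x}\right) - y\left(\dfrac{\partial x}{\partial x}\right)}{x^2} = \frac{x\,(0) - y\,(1)}{x^2} = -\frac{y}{x^2}.$$

So, we obtain $\dfrac{\partial\theta}{\partial x} = -\cos^2\theta\left(\dfrac{y}{x^2}\right) = -\dfrac{\sin\theta}{r}$.

Hence, we have

$$\frac{\partial^2 r}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial r}{\partial x}\right) = \frac{\partial}{\partial x}(\cos\theta) = -\sin\theta\,\frac{\partial\theta}{\partial x} = -\sin\theta\left(-\frac{\sin\theta}{r}\right) = \frac{\sin^2\theta}{r}$$

Repeating a similar process, we have $\dfrac{\partial^2\theta}{\partial x^2} = \dfrac{\partial}{\partial x}\left(\dfrac{\partial\theta}{\partial x}\right) = \dfrac{\partial}{\partial x}\left(-\dfrac{\sin\theta}{r}\right)$; then

\begin{align}
\frac{\partial^2\theta}{\partial x^2} &= \frac{-\left\{r\,\dfrac{\partial}{\partial x}(\sin\theta) - \sin\theta\,\dfrac{\partial r}{\partial x}\right\}}{r^2}\\
&= \frac{-\left\{r\cos\theta\left(-\dfrac{\sin\theta}{r}\right) - \sin\theta\cos\theta\right\}}{r^2}\\
&= \frac{2\sin\theta\cos\theta}{r^2}\\
&= \frac{\sin 2\theta}{r^2}
\end{align}

Remark: The reciprocal rule for derivatives of single-variable functions may not apply to partial derivatives. In this example, we observe that $\dfrac{\partial x}{\partial r}\dfrac{\partial r}{\partial x} \neq 1$. In fact, $\dfrac{\partial x}{\partial r} = \left(\dfrac{\partial x}{\partial r}\right)_\theta$ and $\dfrac{\partial r}{\partial x} = \left(\dfrac{\partial r}{\partial x}\right)_y$, where the parameters being fixed are not the same in each derivative. Similarly, $\dfrac{\partial x}{\partial\theta}\dfrac{\partial\theta}{\partial x} \neq 1$, where $\dfrac{\partial x}{\partial\theta} = \left(\dfrac{\partial x}{\partial\theta}\right)_r$ and $\dfrac{\partial\theta}{\partial x} = \left(\dfrac{\partial\theta}{\partial x}\right)_y$. The reciprocal rule is valid for multivariable functions if the parameter(s) being fixed is (are) the same. Further discussion can be found in the remark of example 6.8. ∎

Example 6.6. Evaluate $\left\{(3.8)^2 + 2\,(2.1)^3\right\}^{1/5}$ to the first order approximation without using a calculator.

Solution: Let $z = (x^2 + 2y^3)^{1/5}$. The total differential of $z$ is

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy ,$$

where $\dfrac{\partial z}{\partial x} = \dfrac{1}{5}\,(x^2 + 2y^3)^{-4/5}\,(2x)$ and $\dfrac{\partial z}{\partial y} = \dfrac{1}{5}\,(x^2 + 2y^3)^{-4/5}\,(6y^2)$.

Hence, $dz = \dfrac{1}{5}\,(x^2 + 2y^3)^{-4/5}\,(2x\,dx + 6y^2\,dy)$.

Let $x = 4$, $y = 2$ and set $dx = -0.2$ and $dy = 0.1$. We have $z = \left[4^2 + 2\,(2)^3\right]^{1/5} = 2$ and

\begin{align}
dz &= \frac{1}{5}\left(4^2 + 2\,(2)^3\right)^{-4/5}\left[2\,(4)\,(-0.2) + 6\,(2^2)\,(0.1)\right]\\
&= \frac{1}{5}\cdot\frac{1}{16}\,(-1.6 + 2.4) = 0.01
\end{align}

Therefore $\left\{(3.8)^2 + 2\,(2.1)^3\right\}^{1/5} \approx 2 + 0.01 = 2.01$. A higher order approximation can be obtained by using the Taylor series in two variables. ∎

Example 6.7. Functions $x$ and $y$ are described by

$$\begin{cases} x = e^u\cos v \\ y = e^u\sin v \end{cases}$$

(a) Write down $dx$ and $dy$.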
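Hence, show that

$$\begin{cases} du = e^{-u}\cos v\,dx + e^{-u}\sin v\,dy \\ dv = -e^{-u}\sin v\,dx + e^{-u}\cos v\,dy \end{cases}$$

(b) If $z = uv$, find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ by using the results in (a).

Solution:

(a) Differentiating $x$ and $y$ with respect to $u$ and $v$, we have

$$\begin{cases} \dfrac{\partial x}{\partial u} = e^u\cos v \\[2mm] \dfrac{\partial x}{\partial v} = -e^u\sin v \end{cases} \quad\text{and}\quad \begin{cases} \dfrac{\partial y}{\partial u} = e^u\sin v \\[2mm] \dfrac{\partial y}{\partial v} = e^u\cos v \end{cases}$$

The total differential of $x$ is $dx = \dfrac{\partial x}{\partial u}\,du + \dfrac{\partial x}{\partial v}\,dv$. We can write

$$dx = e^u\cos v\,du - e^u\sin v\,dv \tag{6.4}$$

The total differential of $y$ is $dy = \dfrac{\partial y}{\partial u}\,du + \dfrac{\partial y}{\partial v}\,dv$. We can write

$$dy = e^u\sin v\,du + e^u\cos v\,dv \tag{6.5}$$

Solving equations 6.4 and 6.5 for $du$ and $dv$, we obtain

$$du = e^{-u}\cos v\,dx + e^{-u}\sin v\,dy$$

$$dv = -e^{-u}\sin v\,dx + e^{-u}\cos v\,dy$$

(b) Since $z = uv$, we obtain $dz = u\,dv + v\,du$. Hence,

$$dz = u\left(-e^{-u}\sin v\,dx + e^{-u}\cos v\,dy\right) + v\left(e^{-u}\cos v\,dx + e^{-u}\sin v\,dy\right)$$

Rearranging the equation, we have

$$dz = \left(-u e^{-u}\sin v + v e^{-u}\cos v\right)dx + \left(u e^{-u}\cos v + v e^{-u}\sin v\right)dy$$

But we can also write $dz = \dfrac{\partial z}{\partial x}\,dx + \dfrac{\partial z}{\partial y}\,dy$. Hence,

$$\begin{cases} \dfrac{\partial z}{\partial x} = -u e^{-u}\sin v + v e^{-u}\cos v \\[2mm] \dfrac{\partial z}{\partial y} = u e^{-u}\cos v + v e^{-u}\sin v \end{cases}$$

∎

Example 6.8. If $f(x, y, z) = 0$, show that $\dfrac{\partial z}{\partial y}\dfrac{\partial y}{\partial x}\dfrac{\partial x}{\partial z} = -1$.

Solution: If $f(x, y, z) = 0$, we can write $z$ as a function of $x$ and $y$, i.e. $z(x, y)$. Then, we have $\left(\dfrac{\partial f}{\partial y}\right)_x = \dfrac{\partial f}{\partial y} + \dfrac{\partial f}{\partial z}\dfrac{\partial z}{\partial y} = 0$. Please refer to remark 3 of section 6.1. Hence, we obtain

$$\frac{\partial z}{\partial y} = -\frac{\;\dfrac{\partial f}{\partial y}\;}{\dfrac{\partial f}{\partial z}} \quad\text{where}\quad \frac{\partial z}{\partial y} = \left(\frac{\partial z}{\partial y}\right)_x$$

Similarly, we can write $y(z, x)$ and $\left(\dfrac{\partial f}{\partial x}\right)_z = \dfrac{\partial f}{\partial x} + \dfrac{\partial f}{\partial y}\dfrac{\partial y}{\partial x} = 0$. Hence, we obtain

$$\frac{\partial y}{\partial x} = -\frac{\;\dfrac{\partial f}{\partial x}\;}{\dfrac{\partial f}{\partial y}} \quad\text{where}\quad \frac{\partial y}{\partial x} = \left(\frac{\partial y}{\partial x}\right)_z$$

Similarly, we can write $x(y, z)$ and $\left(\dfrac{\partial f}{\partial z}\right)_y = \dfrac{\partial f}{\partial x}\dfrac{\partial x}{\partial z} + \dfrac{\partial f}{\partial z} = 0$. Hence, we obtain

$$\frac{\partial x}{\partial z} = -\frac{\;\dfrac{\partial f}{\partial z}\;}{\dfrac{\partial f}{\partial x}} \quad\text{where}\quad \frac{\partial x}{\partial z} = \left(\frac{\partial x}{\partial z}\right)_y$$

Therefore, we obtain the product of the partial derivatives

$$\frac{\partial z}{\partial y}\frac{\partial y}{\partial x}\frac{\partial x}{\partial z} = -1.$$

This formula is widely used in thermodynamics. According to the ideal gas law, we know that $PV = nRT$, where $n$ is the number of moles of gas in the system at equilibrium.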
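Now, we define a function $f(P, V, T) = 0$, where $P$, $V$, and $T$ are the pressure, volume, and temperature of the gas system. Thus, we have
\[ \left(\frac{\partial T}{\partial V}\right)_P \left(\frac{\partial V}{\partial P}\right)_T \left(\frac{\partial P}{\partial T}\right)_V = -1 \]

Remark: Knowing that $f(x, y, z) = 0$, we can write $z(x, y)$, $y(z, x)$, and $x(y, z)$. If we repeat the above work in a similar way, we get
\[ \frac{\partial x}{\partial y} = -\frac{\partial f/\partial y}{\partial f/\partial x}, \qquad \frac{\partial z}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial z}, \qquad\text{and}\qquad \frac{\partial y}{\partial z} = -\frac{\partial f/\partial z}{\partial f/\partial y}, \]
where $\frac{\partial x}{\partial y} = \left(\frac{\partial x}{\partial y}\right)_z$, $\frac{\partial z}{\partial x} = \left(\frac{\partial z}{\partial x}\right)_y$, and $\frac{\partial y}{\partial z} = \left(\frac{\partial y}{\partial z}\right)_x$. Multiplying the above derivatives, we obtain another product relation
\[ \frac{\partial z}{\partial x}\frac{\partial x}{\partial y}\frac{\partial y}{\partial z} = -1 \]
Substituting $P$, $V$ and $T$ for $x$, $y$, and $z$ respectively, we obtain
\[ \left(\frac{\partial T}{\partial P}\right)_V \left(\frac{\partial P}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_P = -1 \]
In addition, we obtain the following product relations by direct multiplication of the partial derivatives.
\[ \frac{\partial z}{\partial y}\frac{\partial y}{\partial z} = 1, \qquad \frac{\partial y}{\partial x}\frac{\partial x}{\partial y} = 1, \qquad\text{and}\qquad \frac{\partial x}{\partial z}\frac{\partial z}{\partial x} = 1 \]
The meaning of these results is straightforward. For example, the first product treats $x$ as a fixed number, so $f(x, y, z) = 0$ implies that $y$ is a function of $z$ (or $z$ is a function of $y$), and the reciprocal rule for a single-variable function applies. Similarly, we obtain the remaining two. Hence, we have
\[ \left(\frac{\partial T}{\partial V}\right)_P \left(\frac{\partial V}{\partial T}\right)_P = 1, \qquad \left(\frac{\partial V}{\partial P}\right)_T \left(\frac{\partial P}{\partial V}\right)_T = 1, \qquad\text{and}\qquad \left(\frac{\partial P}{\partial T}\right)_V \left(\frac{\partial T}{\partial P}\right)_V = 1 \]
∎

Example 6.9. This problem repeats example 6.8. If $f(x, y, z) = 0$, show by considering the total differentials of $x$, $y$, and $z$ that $\frac{\partial z}{\partial y}\frac{\partial y}{\partial x}\frac{\partial x}{\partial z} = -1$.

Solution: Since $f(x, y, z) = 0$, we have $x(y, z)$, $y(z, x)$, and $z(x, y)$ respectively.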
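Then, we write down the total differentials as
\[ dx = \left(\frac{\partial x}{\partial y}\right)_z dy + \left(\frac{\partial x}{\partial z}\right)_y dz \tag{6.6} \]
\[ dy = \left(\frac{\partial y}{\partial z}\right)_x dz + \left(\frac{\partial y}{\partial x}\right)_z dx \tag{6.7} \]
\[ dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \tag{6.8} \]
Eliminating $dy$ from equations 6.6 and 6.7, we obtain
\[ \left[1 - \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z\right] dx - \left[\left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x + \left(\frac{\partial x}{\partial z}\right)_y\right] dz = 0 \tag{6.9} \]
As $dx$ and $dz$ can be chosen independently, we have
\[ 1 - \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z = 0 \tag{6.10} \]
\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x + \left(\frac{\partial x}{\partial z}\right)_y = 0 \tag{6.11} \]
Similarly, we eliminate $dz$ from equations 6.7 and 6.8, thus
\[ \left[1 - \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial y}\right)_x\right] dy - \left[\left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y + \left(\frac{\partial y}{\partial x}\right)_z\right] dx = 0 \tag{6.12} \]
As $dx$ and $dy$ can be chosen independently, we have
\[ 1 - \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial y}\right)_x = 0 \tag{6.13} \]
\[ \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y + \left(\frac{\partial y}{\partial x}\right)_z = 0 \tag{6.14} \]
In the same manner, we eliminate $dx$ from equations 6.6 and 6.8, thus
\[ \left[1 - \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial z}\right)_y\right] dz - \left[\left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial z}{\partial y}\right)_x\right] dy = 0 \tag{6.15} \]
As $dy$ and $dz$ can be chosen independently, we have
\[ 1 - \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial z}\right)_y = 0 \tag{6.16} \]
\[ \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial z}{\partial y}\right)_x = 0 \tag{6.17} \]
Equations 6.10, 6.13, and 6.16 give the following reciprocal rules
\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z = 1 \tag{6.18} \]
\[ \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial y}\right)_x = 1 \tag{6.19} \]
\[ \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial z}\right)_y = 1 \tag{6.20} \]
Equations 6.11, 6.14, and 6.17 and the above rules give
\[ \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial z}{\partial x}\right)_y = -1 \tag{6.21} \]
\[ \left(\frac{\partial z}{\partial y}\right)_x \left(\frac{\partial y}{\partial x}\right)_z \left(\frac{\partial x}{\partial z}\right)_y = -1 \tag{6.22} \]
∎

6.2 Geometrical Meaning of Partial Derivatives

Figure 6.1 shows the intersection between the curved surface $z = f(x, y)$ and the vertical plane $x = x_0$. It is a curve given by $z(x_0, y)$. The slope of this curve at $(x_0, y_0)$ is given by the partial derivative of $z$ with respect to $y$, i.e. $\left.\frac{\partial z}{\partial y}\right|_{(x_0, y_0)}$. While doing the differentiation, only the value of $y$ is allowed to vary; the value of $x$ is always fixed at $x_0$. The derivative gives the rate of change of $z$ along the positive $y$ direction.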
Similarly, if we cut the curved surface by another vertical plane $y = y_0$, the intersecting curve is $z(x, y_0)$ and its slope at $(x_0, y_0)$ is $\left.\frac{\partial z}{\partial x}\right|_{(x_0, y_0)}$. This quantity gives the rate of change of $z$ along the positive $x$ direction.

Figure 6.1: The meaning of partial derivative

6.3 Polar Coordinates

A point $P$ on the Cartesian plane is represented by $(x, y)$ with respect to an origin $O$, where $x$ and $y$ are the horizontal and vertical coordinates and $(x, y)$ are called the Cartesian coordinates. The location of point $P$ can also be represented by the polar coordinates $(r, \theta)$ with respect to the origin $O$, where $r = OP \geq 0$ is the radial distance between point $P$ and the origin. The polar angle $\theta$ is the angle between the positive $x$-axis and $OP$. It is positive if the angle is measured counterclockwise from the positive $x$-axis, and negative if it is measured clockwise from the positive $x$-axis. Figure 6.2 shows the polar coordinates of point $P$ on the Cartesian plane.

Figure 6.2: The polar coordinate system

Some textbooks use $(\rho, \varphi)$ instead of $(r, \theta)$ in order to avoid confusion with the spherical coordinate system (section 6.7). Recall that the Cartesian coordinate system has a pair of unit vectors associated with the coordinates, i.e. $\hat{\imath}$ for $x$ and $\hat{\jmath}$ for $y$. Similarly, the polar coordinate system has unit vectors $\hat{e}_r$ and $\hat{e}_\theta$ associated with the coordinates $r$ and $\theta$ respectively. The directions of $\hat{e}_r$ and $\hat{e}_\theta$ are orthogonal (normal) to each other, as shown in figure 6.2. The two sets of unit vectors are related by
\[ \begin{cases} \hat{e}_r = \cos\theta\,\hat{\imath} + \sin\theta\,\hat{\jmath} \\ \hat{e}_\theta = -\sin\theta\,\hat{\imath} + \cos\theta\,\hat{\jmath} \end{cases} \tag{6.23} \]
The second line in equation set 6.23 is obtained from the first by replacing $\theta$ by $\pi/2 + \theta$, because $\hat{e}_\theta$ is $\hat{e}_r$ turned counterclockwise by $\pi/2$. The transformation between $(x, y)$ and $(r, \theta)$ is
\[ \begin{cases} x = r\cos\theta \\ y = r\sin\theta \end{cases} \]
For better analysis, let's rename the position vector of $P$ as $\vec{l}$, where $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath}$.
If we use polar coordinates, then $\vec{l} = r\,\hat{e}_r$. Let another point $Q$ on the $xy$-plane have an infinitesimal displacement from $P$. The vector $\overrightarrow{PQ}$ is
\[ d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath} \tag{6.24} \]
This change can be represented by using the polar coordinates as
\[ d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta \tag{6.25} \]
The derivation of equation 6.25 is as follows. We first express $dx$ and $dy$ in equation 6.24 in terms of the polar coordinates, and then we write down $\hat{\imath}$ and $\hat{\jmath}$ in terms of $\hat{e}_r$ and $\hat{e}_\theta$. From the transformation $x = r\cos\theta$, we have the total differential of $x$
\[ dx = \cos\theta\,dr - r\sin\theta\,d\theta \tag{6.26} \]
Similarly, the transformation $y = r\sin\theta$ gives the total differential of $y$
\[ dy = \sin\theta\,dr + r\cos\theta\,d\theta \tag{6.27} \]
From equation 6.23, we have
\[ \begin{cases} \hat{\imath} = \cos\theta\,\hat{e}_r - \sin\theta\,\hat{e}_\theta \\ \hat{\jmath} = \sin\theta\,\hat{e}_r + \cos\theta\,\hat{e}_\theta \end{cases} \tag{6.28} \]
Substituting equations 6.26, 6.27 and 6.28 into equation 6.24, we have
\[ \begin{aligned} d\vec{l} &= (\cos\theta\,dr - r\sin\theta\,d\theta)(\cos\theta\,\hat{e}_r - \sin\theta\,\hat{e}_\theta) + (\sin\theta\,dr + r\cos\theta\,d\theta)(\sin\theta\,\hat{e}_r + \cos\theta\,\hat{e}_\theta) \\ &= (\cos^2\theta + \sin^2\theta)\,dr\,\hat{e}_r + r(\cos^2\theta + \sin^2\theta)\,d\theta\,\hat{e}_\theta \end{aligned} \]
Finally, we obtain the vectorial representation of a small change using the polar coordinates
\[ d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta \tag{6.29} \]

Remark: There are two ways to obtain equation 6.28 from equation set 6.23. The first way is to regard $\hat{\imath}$ and $\hat{\jmath}$ as two unknowns in equation set 6.23 and solve for them simultaneously. However, this approach is very clumsy. The second way is to consider the dot products of equation set 6.23 with $\hat{\imath}$ and $\hat{\jmath}$ respectively. The results give the projections of $\hat{\imath}$ and $\hat{\jmath}$ on $\hat{e}_r$ and $\hat{e}_\theta$. For example, we form the dot product of $\hat{\imath}$ with equation set 6.23 and obtain the projections of $\hat{\imath}$ on $\hat{e}_r$ and $\hat{e}_\theta$
\[ \begin{cases} \hat{\imath}\cdot\hat{e}_r = \cos\theta \\ \hat{\imath}\cdot\hat{e}_\theta = -\sin\theta \end{cases} \tag{6.30} \]
Next, we form the dot product of $\hat{\jmath}$ with equation set 6.23, which gives the projections of $\hat{\jmath}$ on $\hat{e}_r$ and $\hat{e}_\theta$.
\[ \begin{cases} \hat{\jmath}\cdot\hat{e}_r = \sin\theta \\ \hat{\jmath}\cdot\hat{e}_\theta = \cos\theta \end{cases} \tag{6.31} \]
Equation sets 6.30 and 6.31 give rise to equation set 6.28.

Example 6.10. In the polar coordinate system, the position of a point is represented by $(r, \theta)$.
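If there is an infinitesimally small change in the position, say, $d\vec{l}$, show by considering $d\vec{l} = \frac{\partial\vec{l}}{\partial r}\,dr + \frac{\partial\vec{l}}{\partial\theta}\,d\theta$ that $d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta$.

Solution: The position of a point, say $P$, is given by $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath}$, so we have $\vec{l} = r\cos\theta\,\hat{\imath} + r\sin\theta\,\hat{\jmath}$. Differentiating both sides, we have
\[ \begin{cases} \dfrac{\partial\vec{l}}{\partial r} = \cos\theta\,\hat{\imath} + \sin\theta\,\hat{\jmath} \\[2mm] \dfrac{\partial\vec{l}}{\partial\theta} = -r\sin\theta\,\hat{\imath} + r\cos\theta\,\hat{\jmath} \end{cases} \]
Then
\[ \begin{cases} \left|\dfrac{\partial\vec{l}}{\partial r}\right| = \sqrt{\cos^2\theta + \sin^2\theta} = 1 \\[2mm] \left|\dfrac{\partial\vec{l}}{\partial\theta}\right| = \sqrt{r^2\sin^2\theta + r^2\cos^2\theta} = r \end{cases} \]
We know that $\frac{\partial\vec{l}}{\partial r}$ is parallel to $\hat{e}_r$ and $\frac{\partial\vec{l}}{\partial\theta}$ is parallel to $\hat{e}_\theta$. Hence, we can write
\[ \begin{aligned} d\vec{l} &= \frac{\partial\vec{l}}{\partial r}\,dr + \frac{\partial\vec{l}}{\partial\theta}\,d\theta \\ &= \left|\frac{\partial\vec{l}}{\partial r}\right| dr\,\hat{e}_r + \left|\frac{\partial\vec{l}}{\partial\theta}\right| d\theta\,\hat{e}_\theta \\ &= dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta \end{aligned} \]
∎

6.4 Polar Coordinates and the Length of a Curve

In section 3.8, we discussed the length of a curve in Cartesian coordinates. Recall that on a Cartesian plane, the length of a small segment is given by
\[ dl = \sqrt{(dx)^2 + (dy)^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx. \]
The small length corresponds to the magnitude of a small vector $d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath}$, where $\hat{\imath}$ and $\hat{\jmath}$ are orthogonal unit vectors (normal to each other). Likewise in polar coordinates, a small vector $d\vec{l} = dr\,\hat{e}_r + r\,d\theta\,\hat{e}_\theta$ has a length
\[ dl = \sqrt{(dr)^2 + r^2(d\theta)^2} = \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta, \]
where $\hat{e}_r$ and $\hat{e}_\theta$ are orthogonal unit vectors. Thus, the total length of the curve is given by
\[ \int_{\theta_1}^{\theta_2} \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta \]

Example 6.11. Find the total length of the cardioid $r = 2(1 + \cos\theta)$ as shown in the figure.

Figure 6.3: The cardioid

Solution: Firstly, we obtain $\frac{dr}{d\theta} = -2\sin\theta$. By the symmetry of the cardioid about the $x$-axis, the total length is
\[ \begin{aligned} S &= 2\int_0^\pi \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta \\ &= 2\int_0^\pi \sqrt{4(1 + \cos\theta)^2 + 4\sin^2\theta}\,d\theta \\ &= 4\sqrt{2}\int_0^\pi \sqrt{1 + \cos\theta}\,d\theta \\ &= 8\int_0^\pi \cos\frac{\theta}{2}\,d\theta \\ &= 16 \end{aligned} \]
We have applied the trigonometric identity $\cos\theta = 2\cos^2\frac{\theta}{2} - 1$. ∎

6.5 Cartesian Coordinates

A Cartesian coordinate system has coordinates $(x, y, z)$ as shown in figure 6.4.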
The position vector of a point $P$ in space is represented by
\[ \vec{l} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}, \]
where $\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$ are "constant vectors". The directions and magnitudes of $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$ never change, and the length of each of them is 1. Then, the infinitesimal change of the position vector is
\[ d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath} + dz\,\hat{k} \]
The answer is straightforward, but the idea behind it relates to partial differentiation. Let's reveal it! Recall that in equation 6.1, the total differential of a scalar function $f$ is given by
\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz \]

Figure 6.4: The Cartesian coordinate system

It is also true for a vector function, so
\[ d\vec{l} = \frac{\partial\vec{l}}{\partial x}\,dx + \frac{\partial\vec{l}}{\partial y}\,dy + \frac{\partial\vec{l}}{\partial z}\,dz \tag{6.32} \]
On the other hand, the position vector $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}$ gives
\[ \begin{cases} \dfrac{\partial\vec{l}}{\partial x} = \hat{\imath} & \text{along } \hat{\imath} \\[2mm] \dfrac{\partial\vec{l}}{\partial y} = \hat{\jmath} & \text{along } \hat{\jmath} \\[2mm] \dfrac{\partial\vec{l}}{\partial z} = \hat{k} & \text{along } \hat{k} \end{cases} \]
Obviously, their magnitudes are $\left|\frac{\partial\vec{l}}{\partial x}\right| = 1$, $\left|\frac{\partial\vec{l}}{\partial y}\right| = 1$, and $\left|\frac{\partial\vec{l}}{\partial z}\right| = 1$, respectively. Equation 6.32 is equivalent to
\[ d\vec{l} = \left|\frac{\partial\vec{l}}{\partial x}\right| dx\,\hat{\imath} + \left|\frac{\partial\vec{l}}{\partial y}\right| dy\,\hat{\jmath} + \left|\frac{\partial\vec{l}}{\partial z}\right| dz\,\hat{k} \]
Therefore, the vectorial representation of an infinitesimal change using the Cartesian coordinates is
\[ d\vec{l} = dx\,\hat{\imath} + dy\,\hat{\jmath} + dz\,\hat{k} \]
The answer looks trivial, but the approach used to obtain it is applicable to other coordinate systems.

6.6 Cylindrical Coordinates

A cylindrical coordinate system has coordinates $(\rho, \varphi, z)$. It is an extension of the plane polar coordinates to include the $z$-axis, as shown in figure 6.5. Unlike in section 6.3, the plane polar coordinates are written as $(\rho, \varphi)$ instead of $(r, \theta)$. This is to avoid confusion with the spherical coordinates to be discussed in section 6.7. The polar coordinates $\rho$ and $\varphi$ are obtained by the transformations $x = \rho\cos\varphi$ and $y = \rho\sin\varphi$.
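Using Cartesian coordinates, the position vector of $P$ is expressed as $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}$, so we can write
\[ \vec{l} = \rho\cos\varphi\,\hat{\imath} + \rho\sin\varphi\,\hat{\jmath} + z\,\hat{k} \]
One should also be aware that $\vec{l} = \rho\,\hat{\rho} + z\,\hat{k}$.

Figure 6.5: The cylindrical coordinate system

Note that
\[ \begin{cases} \dfrac{\partial\vec{l}}{\partial\rho} = \cos\varphi\,\hat{\imath} + \sin\varphi\,\hat{\jmath} & \text{along } \hat{\rho} \\[2mm] \dfrac{\partial\vec{l}}{\partial\varphi} = -\rho\sin\varphi\,\hat{\imath} + \rho\cos\varphi\,\hat{\jmath} & \text{along } \hat{\varphi} \\[2mm] \dfrac{\partial\vec{l}}{\partial z} = \hat{k} & \text{along } \hat{k} \end{cases} \]
Their magnitudes are
\[ \begin{cases} \left|\dfrac{\partial\vec{l}}{\partial\rho}\right| = \sqrt{\cos^2\varphi + \sin^2\varphi} = 1 \\[2mm] \left|\dfrac{\partial\vec{l}}{\partial\varphi}\right| = \sqrt{\rho^2\sin^2\varphi + \rho^2\cos^2\varphi} = \rho \\[2mm] \left|\dfrac{\partial\vec{l}}{\partial z}\right| = 1 \end{cases} \]
Hence, we obtain the unit vectors
\[ \begin{cases} \hat{\rho} = \dfrac{\partial\vec{l}}{\partial\rho} \Big/ \left|\dfrac{\partial\vec{l}}{\partial\rho}\right| = \cos\varphi\,\hat{\imath} + \sin\varphi\,\hat{\jmath} \\[2mm] \hat{\varphi} = \dfrac{\partial\vec{l}}{\partial\varphi} \Big/ \left|\dfrac{\partial\vec{l}}{\partial\varphi}\right| = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \\[2mm] \hat{k} = \dfrac{\partial\vec{l}}{\partial z} \Big/ \left|\dfrac{\partial\vec{l}}{\partial z}\right| = \hat{k} \end{cases} \]
One can check that $\hat{\rho}$, $\hat{\varphi}$ and $\hat{k}$ are orthogonal to each other. $\hat{\rho}$ and $\hat{\varphi}$ are unit vectors, but they are not constant vectors because both of them are functions of $\varphi$; their directions vary with $\varphi$. The infinitesimal change in the position vector is
\[ \begin{aligned} d\vec{l} &= \frac{\partial\vec{l}}{\partial\rho}\,d\rho + \frac{\partial\vec{l}}{\partial\varphi}\,d\varphi + \frac{\partial\vec{l}}{\partial z}\,dz \\ &= \left|\frac{\partial\vec{l}}{\partial\rho}\right| d\rho\,\hat{\rho} + \left|\frac{\partial\vec{l}}{\partial\varphi}\right| d\varphi\,\hat{\varphi} + \left|\frac{\partial\vec{l}}{\partial z}\right| dz\,\hat{k} \end{aligned} \]
Therefore, the vectorial representation of a small change using the cylindrical coordinates is
\[ d\vec{l} = d\rho\,\hat{\rho} + \rho\,d\varphi\,\hat{\varphi} + dz\,\hat{k} \]

Example 6.12. Let $\vec{C} = c_1\,\hat{\imath} + c_2\,\hat{\jmath} + c_3\,\hat{k}$ be a constant vector. Rewrite $\vec{C}$ using the cylindrical coordinates.

Solution: Recall that
\[ \begin{cases} \hat{\rho} = \cos\varphi\,\hat{\imath} + \sin\varphi\,\hat{\jmath} \\ \hat{\varphi} = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \end{cases} \]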
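which gives
\[ \begin{cases} \hat{\imath}\cdot\hat{\rho} = \cos\varphi \\ \hat{\imath}\cdot\hat{\varphi} = -\sin\varphi \end{cases} \quad\text{and}\quad \begin{cases} \hat{\jmath}\cdot\hat{\rho} = \sin\varphi \\ \hat{\jmath}\cdot\hat{\varphi} = \cos\varphi \end{cases} \]
One can readily see that
\[ \begin{cases} \hat{\imath} = \cos\varphi\,\hat{\rho} - \sin\varphi\,\hat{\varphi} \\ \hat{\jmath} = \sin\varphi\,\hat{\rho} + \cos\varphi\,\hat{\varphi} \end{cases} \]
thus
\[ \begin{aligned} \vec{C} &= c_1\,\hat{\imath} + c_2\,\hat{\jmath} + c_3\,\hat{k} \\ &= c_1(\cos\varphi\,\hat{\rho} - \sin\varphi\,\hat{\varphi}) + c_2(\sin\varphi\,\hat{\rho} + \cos\varphi\,\hat{\varphi}) + c_3\,\hat{k} \\ &= (c_1\cos\varphi + c_2\sin\varphi)\,\hat{\rho} - (c_1\sin\varphi - c_2\cos\varphi)\,\hat{\varphi} + c_3\,\hat{k} \end{aligned} \]
∎

6.7 Spherical Coordinates

A spherical coordinate system has coordinates $(r, \theta, \varphi)$, where $0 \leq \theta \leq \pi$ and $0 \leq \varphi \leq 2\pi$. One should not confuse these symbols with the $r$ and $\theta$ adopted for polar coordinates in section 6.3, because they represent different quantities in the two systems. Figure 6.6 shows the spherical coordinates as well as their associated unit vectors. The transformations are $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, and $z = r\cos\theta$. As the position vector of $P$ in the Cartesian coordinates is $\vec{l} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}$, we can write
\[ \vec{l} = r\sin\theta\cos\varphi\,\hat{\imath} + r\sin\theta\sin\varphi\,\hat{\jmath} + r\cos\theta\,\hat{k} \]
One should also be aware that $\vec{l} = r\,\hat{r}$ (some books write $\vec{r} = r\,\hat{r} = \vec{l}$). Note that
\[ \begin{cases} \dfrac{\partial\vec{l}}{\partial r} = \sin\theta\cos\varphi\,\hat{\imath} + \sin\theta\sin\varphi\,\hat{\jmath} + \cos\theta\,\hat{k} & \text{along } \hat{r} \\[2mm] \dfrac{\partial\vec{l}}{\partial\theta} = r\cos\theta\cos\varphi\,\hat{\imath} + r\cos\theta\sin\varphi\,\hat{\jmath} - r\sin\theta\,\hat{k} & \text{along } \hat{\theta} \\[2mm] \dfrac{\partial\vec{l}}{\partial\varphi} = -r\sin\theta\sin\varphi\,\hat{\imath} + r\sin\theta\cos\varphi\,\hat{\jmath} & \text{along } \hat{\varphi} \end{cases} \]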
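Figure 6.6: The spherical coordinate system

Their magnitudes are
\[ \begin{cases} \left|\dfrac{\partial\vec{l}}{\partial r}\right| = \sqrt{\sin^2\theta\cos^2\varphi + \sin^2\theta\sin^2\varphi + \cos^2\theta} = 1 \\[2mm] \left|\dfrac{\partial\vec{l}}{\partial\theta}\right| = \sqrt{r^2\cos^2\theta\cos^2\varphi + r^2\cos^2\theta\sin^2\varphi + r^2\sin^2\theta} = r \\[2mm] \left|\dfrac{\partial\vec{l}}{\partial\varphi}\right| = \sqrt{r^2\sin^2\theta\sin^2\varphi + r^2\sin^2\theta\cos^2\varphi} = r\sin\theta \end{cases} \]
Hence, we obtain the unit vectors
\[ \begin{cases} \hat{r} = \dfrac{\partial\vec{l}}{\partial r} \Big/ \left|\dfrac{\partial\vec{l}}{\partial r}\right| = \sin\theta\cos\varphi\,\hat{\imath} + \sin\theta\sin\varphi\,\hat{\jmath} + \cos\theta\,\hat{k} \\[2mm] \hat{\theta} = \dfrac{\partial\vec{l}}{\partial\theta} \Big/ \left|\dfrac{\partial\vec{l}}{\partial\theta}\right| = \cos\theta\cos\varphi\,\hat{\imath} + \cos\theta\sin\varphi\,\hat{\jmath} - \sin\theta\,\hat{k} \\[2mm] \hat{\varphi} = \dfrac{\partial\vec{l}}{\partial\varphi} \Big/ \left|\dfrac{\partial\vec{l}}{\partial\varphi}\right| = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \end{cases} \]
One can check that $\hat{r}$, $\hat{\theta}$ and $\hat{\varphi}$ are orthogonal to each other. $\hat{r}$, $\hat{\theta}$, and $\hat{\varphi}$ are unit vectors, but they are not constant vectors: $\hat{r}$ and $\hat{\theta}$ are functions of $\theta$ and $\varphi$, and $\hat{\varphi}$ is a function of $\varphi$. Their directions vary with these parameters.

The infinitesimal change in the position vector is
\[ \begin{aligned} d\vec{l} &= \frac{\partial\vec{l}}{\partial r}\,dr + \frac{\partial\vec{l}}{\partial\theta}\,d\theta + \frac{\partial\vec{l}}{\partial\varphi}\,d\varphi \\ &= \left|\frac{\partial\vec{l}}{\partial r}\right| dr\,\hat{r} + \left|\frac{\partial\vec{l}}{\partial\theta}\right| d\theta\,\hat{\theta} + \left|\frac{\partial\vec{l}}{\partial\varphi}\right| d\varphi\,\hat{\varphi} \end{aligned} \]
Therefore, the vectorial representation of a small change using the spherical coordinates is
\[ d\vec{l} = dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi} \]

Example 6.13. A force $\vec{F} = 2\,\hat{\imath}$ acts on a particle as it moves along a horizontal circular path from $(x, y, z) = (5, 0, 0)$ to $(x, y, z) = (0, 5, 0)$. The movement stays in the first (positive) quadrant. Use the spherical coordinate system to find the work done by $\vec{F}$. Force is measured in newtons and distance in meters.

Solution: Denote the circular path as $C$. Its points can be described by spherical coordinates $(r, \theta, \varphi)$, where $r = 5$ and $0 \leq \varphi \leq \frac{\pi}{2}$. Obviously, $\theta = \frac{\pi}{2}$ because $C$ lies on the $xy$-plane. The required work done is
\[ W = \int_C \vec{F}\cdot d\vec{l}. \]

Figure 6.7: The circular path C of the particle

Next, we rewrite $\vec{F}$ using the spherical coordinates.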
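Recall that
\[ \begin{cases} \hat{r} = \sin\theta\cos\varphi\,\hat{\imath} + \sin\theta\sin\varphi\,\hat{\jmath} + \cos\theta\,\hat{k} \\ \hat{\theta} = \cos\theta\cos\varphi\,\hat{\imath} + \cos\theta\sin\varphi\,\hat{\jmath} - \sin\theta\,\hat{k} \\ \hat{\varphi} = -\sin\varphi\,\hat{\imath} + \cos\varphi\,\hat{\jmath} \end{cases} \]
The components of $\hat{\imath}$ along $\hat{r}$, $\hat{\theta}$, and $\hat{\varphi}$ can be obtained by using the scalar products.
\[ \begin{cases} \hat{\imath}\cdot\hat{r} = \sin\theta\cos\varphi \\ \hat{\imath}\cdot\hat{\theta} = \cos\theta\cos\varphi \\ \hat{\imath}\cdot\hat{\varphi} = -\sin\varphi \end{cases} \]
Hence, we have
\[ \hat{\imath} = \sin\theta\cos\varphi\,\hat{r} + \cos\theta\cos\varphi\,\hat{\theta} - \sin\varphi\,\hat{\varphi} \]
Since the path lies on the $xy$-plane, we have $\theta = \frac{\pi}{2}$ and
\[ \hat{\imath} = \cos\varphi\,\hat{r} - \sin\varphi\,\hat{\varphi} \]
so
\[ \vec{F} = 2\,\hat{\imath} = 2\cos\varphi\,\hat{r} - 2\sin\varphi\,\hat{\varphi} \]
In spherical coordinates, the infinitesimal change of a position vector is given by $d\vec{l} = dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi}$. Substituting $r = 5$, $dr = 0$, $\theta = \frac{\pi}{2}$ and $d\theta = 0$ into this expression, we have $d\vec{l} = 5\,d\varphi\,\hat{\varphi}$. Hence
\[ \begin{aligned} W &= \int_C \vec{F}\cdot d\vec{l} \\ &= \int_C (2\cos\varphi\,\hat{r} - 2\sin\varphi\,\hat{\varphi})\cdot(5\,d\varphi\,\hat{\varphi}) \\ &= -10\int_C \sin\varphi\,d\varphi \\ &= -10\int_0^{\pi/2} \sin\varphi\,d\varphi \\ &= 10\cos\varphi\,\Big|_0^{\pi/2} \\ &= -10\ \text{J} \end{aligned} \]
∎

6.8 A Revisit to Electric Field and Electric Potential

The electric field is defined as the electric force acting on a unit positive test charge. It is a vector field associated with each point in space. The work done by an electric field $\vec{E}$ in moving this test charge through a displacement $d\vec{l}$ is $\vec{E}\cdot d\vec{l}$. The amount of work done relates to the change of electric potential of the test charge by
\[ \begin{aligned} dV &= -\vec{E}\cdot d\vec{l} \\ &= -(E_x\,\hat{\imath} + E_y\,\hat{\jmath} + E_z\,\hat{k})\cdot(dx\,\hat{\imath} + dy\,\hat{\jmath} + dz\,\hat{k}) \\ &= -(E_x\,dx + E_y\,dy + E_z\,dz) \end{aligned} \]
However, the total differential of $V$ gives
\[ dV = \frac{\partial V}{\partial x}\,dx + \frac{\partial V}{\partial y}\,dy + \frac{\partial V}{\partial z}\,dz \tag{6.33} \]
Thus, we have
\[ \begin{cases} E_x = -\dfrac{\partial V}{\partial x} \\[2mm] E_y = -\dfrac{\partial V}{\partial y} \\[2mm] E_z = -\dfrac{\partial V}{\partial z} \end{cases} \]
Therefore, we can express the electric field in terms of the partial derivatives of the electric potential, i.e.
\[ \vec{E} = E_x\,\hat{\imath} + E_y\,\hat{\jmath} + E_z\,\hat{k} = -\left(\hat{\imath}\,\frac{\partial V}{\partial x} + \hat{\jmath}\,\frac{\partial V}{\partial y} + \hat{k}\,\frac{\partial V}{\partial z}\right) \]
Obviously,
\[ \vec{E} = -\left(\hat{\imath}\,\frac{\partial}{\partial x} + \hat{\jmath}\,\frac{\partial}{\partial y} + \hat{k}\,\frac{\partial}{\partial z}\right)V. \]
Hence, we have $\vec{E} = -\nabla V$, where $\nabla \equiv \hat{\imath}\,\frac{\partial}{\partial x} + \hat{\jmath}\,\frac{\partial}{\partial y} + \hat{k}\,\frac{\partial}{\partial z}$ is called the "del" operator.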
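The relation $\vec{E} = -\nabla V$ can be checked numerically. The following is only a sketch under stated assumptions: it takes the point-charge potential $V = k_e q_0 / r$ as a sample potential, lumps $k_e q_0$ into a hypothetical constant `kq`, and uses function names and a step size `h` that are chosen here for illustration and do not come from the text. Each partial derivative of $V$ is approximated by a central difference and compared with the radial Coulomb field.

```python
import math

def potential(x, y, z, kq=1.0):
    """Sample potential V = kq / r of a point charge at the origin.
    kq stands for k_e * q0 (a hypothetical value, chosen for illustration)."""
    return kq / math.sqrt(x * x + y * y + z * z)

def field_from_potential(V, x, y, z, h=1e-5):
    """Approximate E = -grad V by central differences with step h."""
    ex = -(V(x + h, y, z) - V(x - h, y, z)) / (2 * h)
    ey = -(V(x, y + h, z) - V(x, y - h, z)) / (2 * h)
    ez = -(V(x, y, z + h) - V(x, y, z - h)) / (2 * h)
    return ex, ey, ez

def coulomb_field(x, y, z, kq=1.0):
    """Radial field kq / r^2 in the direction of r-hat, in Cartesian components."""
    r = math.sqrt(x * x + y * y + z * z)
    return kq * x / r**3, kq * y / r**3, kq * z / r**3

# Compare the two at the point (1, 2, 2), where r = 3.
numeric = field_from_potential(potential, 1.0, 2.0, 2.0)
exact = coulomb_field(1.0, 2.0, 2.0)
print(numeric)
print(exact)
```

The two printed triples should agree to many decimal places, since the central difference error shrinks like $h^2$.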
Thus equation 6.33 can be rewritten as $dV = \nabla V\cdot d\vec{l}$, where $\nabla V$ links up $dV$ and $d\vec{l}$. One can compare this with the single-variable case. Let $y = f(x)$, then the total differential of $y$ is $dy = f'(x)\,dx$, where $f'(x)$ links up $dy$ and $dx$.

Example 6.14. A point charge $q_0$ is fixed at the origin. Show that the work done by the electric force in moving a positive charge $q$ from point $A(r_1, \theta_1, \varphi_1)$ to point $B(r_2, \theta_2, \varphi_2)$ is independent of the path chosen.

Solution: The electric force exerted on charge $q$ by charge $q_0$ is $\vec{F} = k_e\,\frac{q_0 q}{r^2}\,\hat{r}$, where $\hat{r}$ is the unit vector in the radial direction and $k_e$ is Coulomb's constant. Recall that the representation of a small displacement in the spherical coordinate system is $d\vec{l} = dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi}$. Hence, the required work done is
\[ \begin{aligned} W &= \int_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} \vec{F}\cdot d\vec{l} \\ &= \int_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} k_e\,\frac{q_0 q}{r^2}\,\hat{r}\cdot(dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\varphi\,\hat{\varphi}) \\ &= \int_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} k_e\,\frac{q_0 q}{r^2}\,dr \\ &= -k_e\,\frac{q_0 q}{r}\,\Big|_{(r_1, \theta_1, \varphi_1)}^{(r_2, \theta_2, \varphi_2)} \\ &= -k_e\,q_0 q\left(\frac{1}{r_2} - \frac{1}{r_1}\right) \end{aligned} \]
The answer depends only on $r_1$ and $r_2$, so the work done by the electric force in moving a charge $q$ from $A$ to $B$ is independent of the path. ∎

Chapter 7 Matrix and Transformation

7.1 Matrix

A collection of $m \times n$ numbers in rectangular form is called a matrix. Let $A$ be the matrix, where
\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \tag{7.1} \]
We say that this matrix has $m$ rows and $n$ columns, or that it is an $m \times n$ matrix, and the $a_{ij}$ are the entries of the matrix. If $m = 1$, we call it a row vector; sometimes we write it as $(a_{11}, a_{12}, \cdots, a_{1n})$. If $n = 1$, we call it a column vector. If $m = n$, we call it a square matrix. If $k$ is another number, then the multiplication of $k$ and the matrix $A$ is defined by
\[ kA = k\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} = \begin{pmatrix} ka_{11} & \cdots & ka_{1n} \\ \vdots & & \vdots \\ ka_{m1} & \cdots & ka_{mn} \end{pmatrix}, \tag{7.2} \]
that is, each entry is multiplied by $k$.
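As a small illustration of the definitions above, the sketch below stores a matrix as a list of rows and forms the scalar multiple $kA$ entry by entry. The representation, the function names `shape` and `scalar_multiple`, and the sample matrix are assumptions made here for illustration only.

```python
def shape(matrix):
    """Return (m, n): the numbers of rows and columns of a matrix stored as a list of rows."""
    return len(matrix), len(matrix[0])

def scalar_multiple(k, matrix):
    """Return kA, obtained by multiplying every entry a_ij by the number k."""
    return [[k * entry for entry in row] for row in matrix]

# An arbitrary 2 x 3 matrix used only as an example.
A = [[1, -2, 0],
     [4, 3, 5]]

print(shape(A))               # (2, 3)
print(scalar_multiple(2, A))  # [[2, -4, 0], [8, 6, 10]]
```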
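Addition (and subtraction) of two matrices $A$ and $B$ is defined only when they have the same numbers of rows and columns,
\[ A \pm B = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \pm \begin{pmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mn} \end{pmatrix} = \begin{pmatrix} a_{11} \pm b_{11} & \cdots & a_{1n} \pm b_{1n} \\ \vdots & & \vdots \\ a_{m1} \pm b_{m1} & \cdots & a_{mn} \pm b_{mn} \end{pmatrix} \tag{7.3} \]
Matrix multiplication is more complicated, and it is the key to the theory of matrices. We can multiply two matrices only if the number of columns of the first equals the number of rows of the second, i.e., we can only multiply $m \times n$ and $n \times l$ matrices. Consider the multiplication of such matrices $A$ and $B$; then $AB = C$ and $C$ is an $m \times l$ matrix. The $rs$-th entry of $C$ is $c_{rs} = \sum_{i=1}^{n} a_{ri} b_{is}$, where $1 \leq r \leq m$ and $1 \leq s \leq l$.
\[ AB = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} b_{11} & \cdots & b_{1l} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nl} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{n} a_{1i} b_{i1} & \cdots & \sum_{i=1}^{n} a_{1i} b_{il} \\ \vdots & & \vdots \\ \sum_{i=1}^{n} a_{mi} b_{i1} & \cdots & \sum_{i=1}^{n} a_{mi} b_{il} \end{pmatrix} = \begin{pmatrix} c_{11} & \cdots & c_{1l} \\ \vdots & & \vdots \\ c_{m1} & \cdots & c_{ml} \end{pmatrix} = C \tag{7.4} \]
Matrix multiplication is not commutative: in general, $AB \neq BA$. If $AB = BA$, the two matrices are said to commute. Matrix multiplication is associative. If $X$, $Y$ and $Z$ are matrices of size $m \times n$, $n \times l$ and $l \times r$, then we have $(XY)Z = X(YZ)$; the resulting matrix has size $m \times r$ and its $ij$-th entry is
\[ \sum_{s=1}^{n} \sum_{t=1}^{l} x_{is}\, y_{st}\, z_{tj} \tag{7.5} \]
If $A$ is a square matrix, then $A^2 = AA$, $A^i A^j = A^{i+j}$ and $(A^i)^j = A^{ij}$ for nonnegative integers $i$ and $j$. The zeroth power of a matrix is an identity matrix, i.e. $A^0 = I$. The details of the identity matrix will be given in section 7.2. We will only be concerned with matrices of size $3 \times 3$ or smaller. The following examples demonstrate the multiplication of matrices which have different dimensions.

(a) $\begin{pmatrix} a & b \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = a\alpha + b\beta$

(b) $\begin{pmatrix} \alpha \\ \beta \end{pmatrix} \begin{pmatrix} a & b \end{pmatrix} = \begin{pmatrix} \alpha a & \alpha b \\ \beta a & \beta b \end{pmatrix}$

(c) $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} a\alpha + b\beta \\ c\alpha + d\beta \end{pmatrix}$

(d) $\begin{pmatrix} a & b \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a\alpha + b\gamma & a\beta + b\delta \end{pmatrix}$

(e) $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a\alpha + b\gamma & a\beta + b\delta \\ c\alpha + d\gamma & c\beta + d\delta \end{pmatrix}$

Example 7.1.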
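Given the following matrices,
\[ A = \begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix}, \qquad B = \begin{pmatrix} 3 & -4 & 7 \\ 2 & 11 & -4 \end{pmatrix} \qquad\text{and}\qquad C = \begin{pmatrix} 5 \\ 3 \\ 9 \end{pmatrix}, \]
compute $3A$, $A + B$ and $AC$.

Solution:
\[ 3A = 3\begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix} = \begin{pmatrix} 6 & -3 & 15 \\ 0 & 66 & -24 \end{pmatrix} \]
\[ A + B = \begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix} + \begin{pmatrix} 3 & -4 & 7 \\ 2 & 11 & -4 \end{pmatrix} = \begin{pmatrix} 5 & -5 & 12 \\ 2 & 33 & -12 \end{pmatrix} \]
\[ AC = \begin{pmatrix} 2 & -1 & 5 \\ 0 & 22 & -8 \end{pmatrix} \begin{pmatrix} 5 \\ 3 \\ 9 \end{pmatrix} = \begin{pmatrix} 10 - 3 + 45 \\ 0 + 66 - 72 \end{pmatrix} = \begin{pmatrix} 52 \\ -6 \end{pmatrix} \]
∎

Example 7.2. Illustrate the non-commutativity of matrix multiplication using matrices $A$ and $B$, where
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \qquad\text{and}\qquad B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \]
Solution: We have
\[ AB = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \]
while
\[ BA = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \neq AB \]
∎

The $n \times n$ identity matrix is the square matrix
\[ I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \tag{7.6} \]
which has the number "1" on the diagonal and "0" elsewhere. We can easily see that if $A$ has $n$ rows, then $I_n A = A$, and if $A$ has $n$ columns, then $A I_n = A$. Hence, $I_n$ is the "1" in matrix multiplication.

Do we have division of matrices? In general, no. We say that a square matrix $A$ has an inverse if there is another square matrix $B$ of the same size such that
\[ AB = BA = I_n \tag{7.7} \]
The inverse is usually denoted by $A^{-1}$. The zero matrix (the matrix with all entries 0), of course, does not have an inverse.

Example 7.3. Check that the following matrix does not have an inverse,
\[ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \tag{7.8} \]
Solution: If it has an inverse, let it be
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{7.9} \]
We require that
\[ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad\text{i.e.}\qquad \begin{pmatrix} a + c & b + d \\ a + c & b + d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{7.10} \]
This is inconsistent, as $a + c$ cannot be both 0 and 1. ∎

7.2 Properties of Matrices

(M.I) If all the entries of a matrix are zero, then the matrix is called a zero matrix (denoted by $0$). If $A$ is a matrix, then
\[ A + 0 = 0 + A = A, \qquad A0 = 0 \ \text{and}\ 0A = 0, \qquad kA = 0 \ \text{when the scalar } k = 0. \]

(M.II) An $n$-square matrix $X$ is called an identity matrix if its entries $x_{ij} = 0$ for all $i \neq j$ and $x_{ij} = 1$ for all $i = j$. An identity matrix is denoted by $I$. If $A$ is an $n$-square matrix then
\[ AI = IA = A, \qquad A^0 = I \]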
If $B$ is also an $n$-square matrix and $AB = BA = I$, then $B = A^{-1}$ is called the inverse of $A$, and $A = B^{-1}$ is the inverse of $B$. $A$ and $B$ are non-singular matrices.

(M.III) An $n$-square matrix $A$ is called a diagonal matrix if its entries $a_{ij} = 0$ for all $i \neq j$. It is denoted by $D$.

(M.IV) If $A$ is an $m \times n$ matrix, then $B$ is the transpose of $A$, where $b_{ij} = a_{ji}$. $B$ is an $n \times m$ matrix denoted by $A^T$.
\[ (A^T)^T = A \]
The transpose of a row vector is a column vector and vice versa. If $A$ and $B$ have the same dimension, then
\[ (A + B)^T = A^T + B^T \]
If $A$ is a $p \times q$ matrix and $B$ is a $q \times r$ matrix, then $(AB)^T = B^T A^T$. If $C$ is an $r \times s$ matrix, then $(ABC)^T = C^T B^T A^T$.

(M.V) If $A$ is an $n$-square matrix, then $\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}$ is called the trace of $A$.
\[ \operatorname{tr}(kA) = k \operatorname{tr}(A), \qquad \operatorname{tr}(A) = \operatorname{tr}(A^T) \]
If $B$ is another $n$-square matrix, then $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$. If $A$ is an $m \times n$ matrix and $B$ is an $n \times m$ matrix, then
\[ \operatorname{tr}(AB) = \operatorname{tr}(BA) \]

(M.VI) If $A$ is a square matrix and $A^T = A^{-1}$, then $A$ is called an orthogonal matrix.
\[ A^T A = A A^T = I \]

(M.VII) $A$ is similar to $B$ if there exists an invertible matrix $P$ such that $P^{-1} A P = B$. We denote the relation as $A \sim B$.

(M.VIII) If a square matrix $X$ is symmetric, then $x_{ij} = x_{ji}$.
\[ X = X^T \]
A symmetric matrix $S$ can be constructed by
\[ S = \frac{1}{2}\left[A + A^T\right], \]
where $A$ is any square matrix.

(M.IX) If $X$ is a skew-symmetric matrix, then $x_{ij} = -x_{ji}$.
\[ X = -X^T \]
A skew-symmetric matrix $S^*$ can be constructed by
\[ S^* = \frac{1}{2}\left[A - A^T\right], \]
where $A$ is any square matrix. In fact, any square matrix can be written as $A = S + S^*$.

7.3 Determinant

The determinant of a square matrix $A$ is denoted as $|A|$ or $\det(A)$. Our discussion will be limited to matrices of order 2 and order 3 only. Let's consider a $2 \times 2$ matrix first. If
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
then
\[ \det A = |A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]

Example 7.4. Find the determinant of $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$.

Solution:
\[ \det A = |A| = \det\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = (1)(4) - (2)(3) = -2 \]
∎

What if the matrix is of order $3 \times 3$? Let's consider a matrix $A$ having the form
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \]
The determinant $|A|$ has a more complicated structure. We define two new quantities before the discussion. Consider the minor $M_{ij}$ of matrix $A$ first. It is the determinant of the matrix formed by eliminating the $i$th row and the $j$th column of $A$. Then we define the cofactor $A_{ij}$ of $A$, where
\[ A_{ij} = (-1)^{i+j} M_{ij} \]
The determinant of $A$ is then
\[ |A| = a_{11} A_{11} + a_{12} A_{12} + a_{13} A_{13} \]
where $A_{11}$, $A_{12}$, and $A_{13}$ are the cofactors of the first row in $A$ as shown below.
\[ A_{11} = (-1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \qquad A_{12} = (-1)^{1+2} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}, \qquad A_{13} = (-1)^{1+3} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
\[ |A| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
The determinant can also be expressed using the cofactors of any row or any column, e.g.
\[ |A| = a_{i1} A_{i1} + a_{i2} A_{i2} + a_{i3} A_{i3} \quad\text{(row expansion)} \]
or
\[ |A| = a_{1j} A_{1j} + a_{2j} A_{2j} + a_{3j} A_{3j} \quad\text{(column expansion)} \]

Example 7.5. Evaluate the determinant of $A$, where $A = \begin{pmatrix} 2 & -3 & 1 \\ 2 & 0 & -1 \\ 1 & 4 & 5 \end{pmatrix}$.

Solution: Using the cofactors of the first row in $A$, we have $|A| = a_{11} A_{11} + a_{12} A_{12} + a_{13} A_{13}$, then
\[ \begin{aligned} |A| &= (2)\begin{vmatrix} 0 & -1 \\ 4 & 5 \end{vmatrix} - (-3)\begin{vmatrix} 2 & -1 \\ 1 & 5 \end{vmatrix} + (1)\begin{vmatrix} 2 & 0 \\ 1 & 4 \end{vmatrix} \\ &= 2[0 - (-4)] + 3[10 - (-1)] + 1[8 - 0] \\ &= 49 \end{aligned} \]
Using the cofactors of the second row in $A$, we have $|A| = a_{21} A_{21} + a_{22} A_{22} + a_{23} A_{23}$, then
\[ \begin{aligned} |A| &= -(2)\begin{vmatrix} -3 & 1 \\ 4 & 5 \end{vmatrix} + (0)\begin{vmatrix} 2 & 1 \\ 1 & 5 \end{vmatrix} - (-1)\begin{vmatrix} 2 & -3 \\ 1 & 4 \end{vmatrix} \\ &= -2[-15 - (4)] + 0 + 1[8 - (-3)] \\ &= 49 \end{aligned} \]
Using the cofactors of the first column in $A$, we have $|A| = a_{11} A_{11} + a_{21} A_{21} + a_{31} A_{31}$, then
\[ \begin{aligned} |A| &= (2)\begin{vmatrix} 0 & -1 \\ 4 & 5 \end{vmatrix} - (2)\begin{vmatrix} -3 & 1 \\ 4 & 5 \end{vmatrix} + (1)\begin{vmatrix} -3 & 1 \\ 0 & -1 \end{vmatrix} \\ &= 2[0 - (-4)] - 2[-15 - 4] + 1[3 - 0] \\ &= 49 \end{aligned} \]
Using the cofactors of the second column in $A$, we have $|A| = a_{12} A_{12} + a_{22} A_{22} + a_{32} A_{32}$, then
\[ \begin{aligned} |A| &= -(-3)\begin{vmatrix} 2 & -1 \\ 1 & 5 \end{vmatrix} + (0)\begin{vmatrix} 2 & 1 \\ 1 & 5 \end{vmatrix} - (4)\begin{vmatrix} 2 & 1 \\ 2 & -1 \end{vmatrix} \\ &= 3[10 - (-1)] + 0 - 4[-2 - 2] \\ &= 49 \end{aligned} \]
Readers can check the answer by using the cofactors of row 3 and of column 3 respectively. ∎

The determinant of a matrix is defined in order to find out which square matrices have an inverse and which do not. A square matrix $A$ has an inverse such that $AA^{-1} = A^{-1}A = I$ if and only if
\[ \det A \neq 0 \tag{7.11} \]
The details of the inverse will be discussed in section 7.5. One should note that $\det A$ is a number, not a matrix.

Theorem 7.6. For two square matrices of the same size,
\[ \det(AB) = \det A \det B \tag{7.12} \]
On the left hand side, inside the parentheses, is the matrix multiplication of $A$ and $B$. On the right hand side is the multiplication of two numbers.

Proof. For the $2 \times 2$ case, by example (e) after equation 7.5, we have
\[ \begin{aligned} \det(AB) &= (a\alpha + b\gamma)(c\beta + d\delta) - (a\beta + b\delta)(c\alpha + d\gamma) \\ &= ac\alpha\beta + ad\alpha\delta + bc\beta\gamma + bd\gamma\delta - (ac\alpha\beta + ad\beta\gamma + bc\alpha\delta + bd\gamma\delta) \\ &= ad\alpha\delta + bc\beta\gamma - ad\beta\gamma - bc\alpha\delta \\ &= (ad - bc)(\alpha\delta - \beta\gamma) \\ &= \det A \det B \end{aligned} \tag{7.13} \]
The theorem is true for square matrices of all sizes, but we are not going to prove that.

7.4 Properties of Determinant

(D.I) If all entries in a row (or a column) are zero, then the value of the determinant is zero.

(D.II) If any two rows (or two columns) are interchanged, the value of the determinant changes sign.

(D.III) If any two rows (or two columns) are identical, then the value of the determinant is zero.

(D.IV) If each entry in a row (or a column) is multiplied by the same constant $k$, then the value of the determinant is multiplied by $k$.

(D.V) Given two square matrices $A$ and $B$, we have $\det AB = \det A \det B$.

(D.VI) Given a square matrix $A$, we have $\det A = \det A^T$, where $A^T$ is the transpose of $A$.

(D.VII) Given an invertible square matrix $A$, we have $\det A^{-1} = \dfrac{1}{\det A}$, where $A^{-1}$ is the inverse of $A$.
(D.VIII) If each of the entries in a row or column can be expressed as the sum of two numbers, then the determinant can be expressed as the sum of two determinants. So
\[ \begin{vmatrix} a + \alpha & x & p \\ b + \beta & y & q \\ c + \gamma & z & r \end{vmatrix} = \begin{vmatrix} a & x & p \\ b & y & q \\ c & z & r \end{vmatrix} + \begin{vmatrix} \alpha & x & p \\ \beta & y & q \\ \gamma & z & r \end{vmatrix} \]
There is a common application, e.g.
\[ \begin{vmatrix} a & x & p \\ b & y & q \\ c & z & r \end{vmatrix} = \begin{vmatrix} a \pm kx & x & p \\ b \pm ky & y & q \\ c \pm kz & z & r \end{vmatrix} \]

Example 7.7. Evaluate $\begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -203 & 300 & 105 \end{vmatrix}$.

Solution:
\[ \begin{aligned} \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -203 & 300 & 105 \end{vmatrix} &= \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -200 - 3 & 300 + 0 & 100 + 5 \end{vmatrix} \\ &= \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -200 & 300 & 100 \end{vmatrix} + \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix} && \text{(Property D.VIII)} \\ &= 100\begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -2 & 3 & 1 \end{vmatrix} + \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix} && \text{(Property D.IV)} \\ &= 0 + \begin{vmatrix} 1 & -3 & 2 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix} && \text{(Property D.III)} \\ &= \begin{vmatrix} -1 & 0 & 3 \\ -2 & 3 & 1 \\ -3 & 0 & 5 \end{vmatrix} && (R_1 : R_1 + R_2) \\ &= (3)\left[(-1)(5) - (3)(-3)\right] && \text{(Column expansion by using } C_2) \\ &= 12 \end{aligned} \]
∎

Example 7.8. Factorize $\begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ y + z & z + x & x + y \end{vmatrix}$.

Solution:
\[ \begin{aligned} \begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ y + z & z + x & x + y \end{vmatrix} &= \begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ x + y + z & x + y + z & x + y + z \end{vmatrix} && (R_3 : R_3 + R_1) \\ &= (x + y + z)\begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ 1 & 1 & 1 \end{vmatrix} \\ &= (x + y + z)\begin{vmatrix} x - y & y - z & z \\ x^2 - y^2 & y^2 - z^2 & z^2 \\ 0 & 0 & 1 \end{vmatrix} && (C_1 : C_1 - C_2 \text{ and } C_2 : C_2 - C_3) \\ &= (x + y + z)\begin{vmatrix} x - y & y - z \\ x^2 - y^2 & y^2 - z^2 \end{vmatrix} \\ &= (x + y + z)(x - y)(y - z)\begin{vmatrix} 1 & 1 \\ x + y & y + z \end{vmatrix} \\ &= (x + y + z)(x - y)(y - z)(z - x) \end{aligned} \]
∎

Example 7.9.
Evaluate RRR ln x ln 2x ln 3x RRRR RRR R RR ln y ln 2y ln 3y RRRR CHAPTER 7. MATRIX AND TRANSFORMATION 159 Solution: RRR RRR RRR RRR RR RRR RR = RRRR RRR RR RRR RR = RRRR RRR RR 1 1 1 RRRR R ln x ln 2x ln 3x RRRR R ln y ln 2y ln 3y RRRR RRR 1 1 1 RR ln x ln x + ln 2 ln x + ln 3 RRRR R ln y ln y + ln 2 ln y + ln 3 RRRR RRR RRR 1 1 1 1 0 1 RR RR ln x ln x ln x + ln 3 RRRR + RRRR ln x ln 2 ln x + ln 3 R R ln y ln y ln y + ln 3 RRRR RRRR ln y ln 2 ln y + ln 3 RRR 1 0 1 RR = 0 + RRRR ln x ln 2 ln x RRR RR ln y ln 2 ln y ln 2 ln 3 = 0+0+∣ ∣ ln 2 ln 3 = 0 RRR RRR 1 0 0 RRRR RRR RRR RRR + RRR ln x ln 2 ln 3 RRRRR RRR RRR R RR RR ln y ln 2 ln 3 RRRR RRR RRR RRR RRR RR (Property D.VIII) (Properties D.III and D.VIII) One may obtain the answer using an alternative way. RRR RRR RRR RRR RR RRR RR = RRRR RRR RR RRR RR = RRRRR RRR RR RRR RR = RRRR RRR RR 1 1 1 RRRR R ln x ln 2x ln 3x RRRR R ln y ln 2y ln 3y RRRR 1 0 0 ln x ln 2x − ln x ln 3x − ln x ln y ln 2y − ln y ln 3y − ln y RRR 1 0 0 RR ) ln ( 3x ) RRRR ln x ln ( 2x x x RRR 3y ln y ln ( 2y y ) ln ( y ) RRR 1 0 0 RRRR R ln x ln 2 ln 3 RRRR R ln y ln 2 ln 3 RRRR ln 2 ln 3 = ∣ ∣ ln 2 ln 3 = 0 RRR RRR RRR RRR RR (C2 ∶ C2 − C1 and C3 ∶ C3 − C1 ) ∎ Example 7.10. Show that u⃗ = (2, −1, 1), v⃗ = (3, −4, −2), w⃗ = (5, −10, −8) are linearly dependent vectors . In other words, any one of the three vectors CHAPTER 7. MATRIX AND TRANSFORMATION 160 is the linear combination of the remaining two. Solution: Let’s check whether the three vectors lie on the same plane first. If so, any one of the three vectors is the linear combination of the remaining two. Equivalently, the parallelepiped formed by the three vectors has zero volume if the three vectors lie on the same plane. According to section 1.11, the volume of the parallelepiped is RRR 2 −1 1 RRR RR RR ⃗ = RRRR 3 −4 −2 RRRR = 0 u⃗ ⋅ (⃗ v × w) RRR R RR 5 −10 −8 RRRR Therefore, the three vectors are linearly dependent. In fact, one can check that w⃗ = −2 u⃗ + 3 v⃗. 
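∎

7.5 Inverse

A square matrix $A$ is invertible if it possesses an inverse $A^{-1}$ such that $AA^{-1} = A^{-1}A = I$. The matrix $A$ is non-singular if its inverse exists. A singular matrix is not invertible, i.e. it does not have an inverse.

Theorem 7.11. A square matrix has an inverse if and only if its determinant is non-zero. Explicitly, the inverse of a 2 × 2 matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is

\[
A^{-1} = \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
\tag{7.14}
\]

Proof. If the determinant is non-zero, consider

\[
\begin{aligned}
A^{-1}A &= \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \\
&= \frac{1}{\det A} \begin{pmatrix} ad - bc & bd - bd \\ -ac + ac & -bc + ad \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{aligned}
\tag{7.15}
\]

Note that $AA^{-1} = I$ too, where $\det A$ and $\det A^{-1}$ cannot be zero.

Example 7.12. If $A = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}$, find $A^{-1}$.

Solution: We obtain $\det A = 2$.

\[
A^{-1} = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad \det A^{-1} = \frac{1}{2}
\]

Observe that $\det A \, \det A^{-1} = (2)\left(\frac{1}{2}\right) = 1$ and

\[
\det A \, \det A^{-1} = \det(AA^{-1}) = \det I = 1 .
\]
∎

If $A$ is a 3 × 3 matrix having the form

\[
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} ,
\]

then its inverse is given by

\[
A^{-1} = \frac{\operatorname{adj} A}{|A|} ,
\]

where $\operatorname{adj} A$ is called the adjoint of $A$. Readers should remember that $\operatorname{adj} A$ is the transpose of the matrix of the cofactors of $A$, i.e.

\[
\operatorname{adj} A = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}^{T}
\tag{7.16}
\]

Recall that the transpose of a matrix $A$ is denoted as $A^T$, where

\[
A^T = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} .
\]

Example 7.13. Let $A = \begin{pmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{pmatrix}$. Find the inverse of $A$.

Solution: The cofactors of $A$ are:

\[
A_{11} = +\begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18 , \quad
A_{12} = -\begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2 , \quad
A_{13} = +\begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4
\]

\[
A_{21} = -\begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11 , \quad
A_{22} = +\begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14 , \quad
A_{23} = -\begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5
\]

\[
A_{31} = +\begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10 , \quad
A_{32} = -\begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4 , \quad
A_{33} = +\begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8
\]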
So, the determinant of $A$ is given by

\[
|A| = a_{11}A_{11} + a_{12}A_{12} + a_{13}A_{13} = (2)(-18) + (3)(2) + (-4)(4) = -46 ,
\]

and the adjoint of $A$ is given by

\[
\operatorname{adj} A = \begin{pmatrix} -18 & 2 & 4 \\ -11 & 14 & 5 \\ -10 & -4 & -8 \end{pmatrix}^{T}
= \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix} .
\]

Hence,

\[
A^{-1} = \frac{\operatorname{adj} A}{|A|} = -\frac{1}{46} \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix} .
\]
∎

7.6 Properties of an Inverse

(IN.I) $(A^{-1})^{-1} = A$

(IN.II) $(kA)^{-1} = k^{-1}A^{-1}$, where $k$ is a non-zero scalar.

(IN.III) $(AB)^{-1} = B^{-1}A^{-1}$. Furthermore, $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$.
Proof. Since $AB\,(B^{-1}A^{-1}) = (AI)\,A^{-1} = AA^{-1} = I$ and $(B^{-1}A^{-1})\,AB = B^{-1}(IB) = B^{-1}B = I$, we have $(AB)^{-1} = B^{-1}A^{-1}$. Next, $(ABC)^{-1} = C^{-1}(AB)^{-1} = C^{-1}B^{-1}A^{-1}$.

(IN.IV) If $A$ is invertible then $(A^{-1})^T = (A^T)^{-1}$. This means that the order in which the transpose and the inverse are taken is not important.
Proof. $A^T (A^{-1})^T = (A^{-1}A)^T = I$, so $(A^{-1})^T = (A^T)^{-1}$.

(IN.V) If $A$ is symmetric, then $A^{-1}$ is also symmetric, i.e. $(A^{-1})^T = A^{-1}$.
Proof. $A\,(A^{-1})^T = A^T (A^{-1})^T = (A^{-1}A)^T = I$, so $(A^{-1})^T = A^{-1}$.

(IN.VI) If $A^T = A^{-1}$, then $A$ is called an orthogonal matrix and $\det A = \pm 1$.
Proof. $AA^T = I$ implies $\det(AA^T) = \det A \, \det A^T = (\det A)^2 = 1$. Hence, $\det A = \pm 1$.

7.7 Systems of Linear Equations

A system of linear equations is

\[
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n = b_1 \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n = b_2 \\
\qquad \vdots \\
a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n = b_m
\end{cases}
\tag{7.17}
\]

where all the $a_{ij}$ and $b_i$ are known and we would like to find the $x_j$. In matrix notation, this is

\[
\begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}
\tag{7.18}
\]

or in compact notation,

\[
A\vec{X} = \vec{B} ,
\tag{7.19}
\]

where $\vec{X}$ and $\vec{B}$ are column vectors. From this form alone, we cannot say whether a solution exists or, if it does, whether it is unique.

7.8 Cramer's Rule

If $m = n$, the system stated in equation 7.17 has $n$ equations and $n$ unknowns. There are some points to note.
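(I) The system is inconsistent if the solution set is empty.

(II) The system is consistent if there is a non-empty solution set.
(i) The system has a unique solution.
(ii) The system has a non-unique solution (infinitely many solutions).

We define an n-square matrix $A_k$ by replacing the $k$-th column of $A$ by $\vec{B}$. According to Cramer's rule, we have

\[
x_k = \frac{\det A_k}{\det A}
\]

If $\det A \neq 0$, the system has a unique solution. If $\det A = 0$ and $\det A_k \neq 0$ for at least one $k$, the system has no solution. If $\det A = 0$ and $\det A_k = 0$ for all $k$, Cramer's rule gives no unique answer: the system has infinitely many solutions in the cases considered below, although for larger systems it can also be inconsistent.

Example 7.14. Consider

\[
\begin{cases} x + y = 0 \\ x + y = 1 \end{cases}
\tag{7.20}
\]

These two equations are inconsistent and there is no solution. The equation set can be written in matrix form, i.e. $A\vec{X} = \vec{B}$, where

\[
A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} , \quad
\vec{X} = \begin{pmatrix} x \\ y \end{pmatrix} , \quad
\vec{B} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]

Using Cramer's rule, we have

\[
x = \frac{\det A_1}{\det A} = \frac{\begin{vmatrix} 0 & 1 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix}} , \quad
y = \frac{\det A_2}{\det A} = \frac{\begin{vmatrix} 1 & 0 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix}} ,
\]

where $\det A = 0$, $\det A_1 \neq 0$, and $\det A_2 \neq 0$.

Now, consider

\[
\begin{cases} 2x + 2y = 2 \\ x + y = 1 \end{cases}
\tag{7.21}
\]

There are solutions and, in fact, infinitely many of them. Let's analyze the system by using Cramer's rule. The set of equations can be expressed in matrix form, $A\vec{X} = \vec{B}$, where

\[
A = \begin{pmatrix} 2 & 2 \\ 1 & 1 \end{pmatrix} , \quad
\vec{X} = \begin{pmatrix} x \\ y \end{pmatrix} , \quad
\vec{B} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}
\]

\[
x = \frac{\det A_1}{\det A} = \frac{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}} , \quad
y = \frac{\det A_2}{\det A} = \frac{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix}} ,
\]

where $\det A = 0$, $\det A_1 = 0$, and $\det A_2 = 0$. ∎

Example 7.15. Consider

\[
\begin{cases} 2x + y = 5 \\ y = 3 \end{cases}
\tag{7.22}
\]

This can be solved simply by substitution, giving $x = 1$ and $y = 3$. A more systematic way to solve it is by using Cramer's rule. Let

\[
A = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} , \quad
\vec{X} = \begin{pmatrix} x \\ y \end{pmatrix} \quad \text{and} \quad
\vec{B} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}
\]

We have $A\vec{X} = \vec{B}$, where

\[
x = \frac{\det A_1}{\det A} = \frac{\begin{vmatrix} 5 & 1 \\ 3 & 1 \end{vmatrix}}{\begin{vmatrix} 2 & 1 \\ 0 & 1 \end{vmatrix}} = 1 , \quad
y = \frac{\det A_2}{\det A} = \frac{\begin{vmatrix} 2 & 5 \\ 0 & 3 \end{vmatrix}}{\begin{vmatrix} 2 & 1 \\ 0 & 1 \end{vmatrix}} = 3
\]

Alternatively, $\vec{X} = A^{-1}\vec{B}$, where $A^{-1} = \dfrac{\operatorname{adj} A}{\det A}$ and $\det A = 2$.

\[
\begin{aligned}
\begin{pmatrix} x \\ y \end{pmatrix}
&= \left( \frac{\operatorname{adj} A}{\det A} \right) \begin{pmatrix} 5 \\ 3 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} 1 & 0 \\ -1 & 2 \end{pmatrix}^{T} \begin{pmatrix} 5 \\ 3 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 5 \\ 3 \end{pmatrix} \\
&= \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 5 \\ 3 \end{pmatrix}
= \begin{pmatrix} 1 \\ 3 \end{pmatrix}
\end{aligned}
\]

We get the same answer. ∎

Example 7.16.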
A circuit is given as shown in figure 7.1. Find the current flowing through each resistor.

Figure 7.1: The electric circuit

Solution: First, let's review two rules for electric circuits.

Kirchhoff's rule 1: Junction rule
The algebraic sum of the currents at any junction must equal zero, i.e.

\[
\sum_{\text{Junction}} I = 0 .
\]

This results from the conservation of charge. (The currents directed into the junction are regarded as $+I$ and those leaving as $-I$ in the equation.)

Kirchhoff's rule 2: Loop rule (voltage rule)
The sum of the potential differences across all elements around any closed circuit loop must be zero, i.e.

\[
\sum_{\text{Closed loop}} \Delta V = 0 .
\]

This results from the conservation of energy.

Denote the currents flowing through the resistors as $I_1$, $I_2$, and $I_3$; see the circuit diagram in figure 7.2. The directions of the currents are assigned arbitrarily.

Figure 7.2: The directions of the currents are assigned arbitrarily

At junction c: $I_1 + I_2 - I_3 = 0$
Loop abcda: $10.0 - 6.0\,I_1 - 2.0\,I_3 = 0$
Loop befcb: $-4.0\,I_2 - 14.0 + 6.0\,I_1 - 10.0 = 0$

Rearranging the equations, we have

\[
\begin{cases}
I_1 + I_2 - I_3 = 0 \\
-6.0\,I_1 - 2.0\,I_3 = -10.0 \\
6.0\,I_1 - 4.0\,I_2 = 24.0
\end{cases}
\]

The matrix form of the system is

\[
\begin{pmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{pmatrix}
\begin{pmatrix} I_1 \\ I_2 \\ I_3 \end{pmatrix}
= \begin{pmatrix} 0 \\ -10.0 \\ 24.0 \end{pmatrix}
\]
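As a cross-check on the hand calculation, Cramer's rule for this 3 × 3 system can be sketched in a few lines of plain Python. The function names `det3` and `cramer3` are illustrative choices rather than anything defined in these notes, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(a, b):
    """Solve a 3x3 system A X = B with Cramer's rule, x_k = det(A_k) / det(A)."""
    d = det3(a)
    if d == 0:
        raise ValueError("det A = 0: Cramer's rule gives no unique solution")
    solution = []
    for k in range(3):
        # A_k is A with its k-th column replaced by B.
        a_k = [[b[r] if c == k else a[r][c] for c in range(3)] for r in range(3)]
        solution.append(Fraction(det3(a_k), d))
    return solution


# Coefficient matrix and right-hand side of the circuit equations.
A = [[1, 1, -1],
     [-6, 0, -2],
     [6, -4, 0]]
B = [0, -10, 24]

currents = cramer3(A, B)
print(currents)  # [Fraction(2, 1), Fraction(-3, 1), Fraction(-1, 1)]
```

The three entries are $I_1$, $I_2$ and $I_3$ in amperes; a negative value means the actual current flows opposite to the direction assumed in figure 7.2.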
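Solving the system with Cramer's rule, we get

\[
I_1 = \frac{\begin{vmatrix} 0 & 1 & -1 \\ -10.0 & 0 & -2.0 \\ 24.0 & -4.0 & 0 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{vmatrix}} = 2.0\ \text{A} , \quad
I_2 = \frac{\begin{vmatrix} 1 & 0 & -1 \\ -6.0 & -10.0 & -2.0 \\ 6.0 & 24.0 & 0 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{vmatrix}} = -3.0\ \text{A} ,
\]

\[
I_3 = \frac{\begin{vmatrix} 1 & 1 & 0 \\ -6.0 & 0 & -10.0 \\ 6.0 & -4.0 & 24.0 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & -1 \\ -6.0 & 0 & -2.0 \\ 6.0 & -4.0 & 0 \end{vmatrix}} = -1.0\ \text{A}
\]

The initial guess of the directions of $I_2$ and $I_3$ is incorrect because their values are negative. These currents flow in the opposite directions. ∎

7.9 Eigenvalues and Eigenvectors

Let $A$ be an n-square matrix. If there exists a non-zero vector $\vec{x}$ (a column vector) such that

\[
A\vec{x} = \lambda\vec{x} ,
\tag{7.23}
\]

where $\lambda$ is a scalar, then we call $\lambda$ an eigenvalue of $A$ and $\vec{x}$ the corresponding eigenvector. Equation 7.23 is the eigenvalue equation. The following paragraph outlines the procedure by which the eigenvalues and eigenvectors of $A$ are obtained.

We arrange equation 7.23 as

\[
(\lambda I - A)\,\vec{x} = \vec{0}
\tag{7.24}
\]

For a non-trivial solution of $\vec{x}$, we have

\[
\det(\lambda I - A) = 0
\tag{7.25}
\]

The LHS of equation 7.25 is called the characteristic polynomial. Equation 7.25 is the characteristic equation of $A$, of degree $n$. Solving this equation, we obtain the eigenvalues of $A$. Substituting these values into equation 7.24, we obtain the corresponding eigenvectors.

Remark: A set of vectors $\vec{x}_1, \vec{x}_2, \cdots, \vec{x}_n$ is linearly independent if and only if $\lambda_1\vec{x}_1 + \lambda_2\vec{x}_2 + \cdots + \lambda_n\vec{x}_n = \vec{0}$ implies $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$. Alternatively, a set of vectors is said to be linearly dependent if one of the vectors in the set can be written as a linear combination of the other vectors in the same set. Any set that contains the zero vector is linearly dependent.

Example 7.17. If $A = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$, find the eigenvalues and eigenvectors of $A$.

Solution: Let $\vec{x}$ be an eigenvector of $A$.

\[
\begin{aligned}
A\vec{x} &= \lambda\vec{x} \\
(\lambda I - A)\,\vec{x} &= \vec{0} \\
\begin{pmatrix} \lambda - 1 & -2 \\ 0 & \lambda - 4 \end{pmatrix} \vec{x} &= \vec{0}
\end{aligned}
\]

For a non-trivial solution of $\vec{x}$, we have

\[
\begin{aligned}
\det \begin{pmatrix} \lambda - 1 & -2 \\ 0 & \lambda - 4 \end{pmatrix} &= 0 \\
(\lambda - 1)(\lambda - 4) &= 0 \\
\lambda &= 1 \text{ or } 4
\end{aligned}
\]

Let $\lambda_1 = 1$, $\lambda_2 = 4$ and $\vec{x} = \begin{pmatrix} u \\ v \end{pmatrix}$. Recall that $(\lambda I - A)\,\vec{x} = \vec{0}$. We have

\[
\begin{pmatrix} \lambda - 1 & -2 \\ 0 & \lambda - 4 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\tag{7.26}
\]

For $\lambda_1 = 1$, equation 7.26 gives

\[
\begin{pmatrix} 0 & -2 \\ 0 & -3 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]

We get $v = 0$. In order to obtain the eigenvector corresponding to $\lambda_1$, we set $u = 1$. Then $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.

For $\lambda_2 = 4$, equation 7.26 gives

\[
\begin{pmatrix} 3 & -2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]

We get $3u - 2v = 0$. In order to obtain the eigenvector which corresponds to $\lambda_2$, we set $u = 2$ and $v = 3$. Then $\vec{x}_2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.

The eigenvectors of $A$ are independent and they are $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\vec{x}_2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.

Remark: One can check that $A\vec{x}_1 = \lambda_1\vec{x}_1$ and $A\vec{x}_2 = \lambda_2\vec{x}_2$. ∎

Example 7.18. If $A = \begin{pmatrix} -1 & 0 & 1 \\ 3 & 0 & -3 \\ 1 & 0 & -1 \end{pmatrix}$, find the eigenvalues and eigenvectors of $A$.

Solution: Let $\vec{x}$ be an eigenvector of $A$.

\[
\begin{aligned}
A\vec{x} &= \lambda\vec{x} \\
(\lambda I - A)\,\vec{x} &= \vec{0} \\
\begin{pmatrix} \lambda + 1 & 0 & -1 \\ -3 & \lambda & 3 \\ -1 & 0 & \lambda + 1 \end{pmatrix} \vec{x} &= \vec{0}
\end{aligned}
\tag{7.27}
\]

For a non-trivial solution of $\vec{x}$, we have

\[
\begin{aligned}
\det \begin{pmatrix} \lambda + 1 & 0 & -1 \\ -3 & \lambda & 3 \\ -1 & 0 & \lambda + 1 \end{pmatrix} &= 0 \\
\lambda^2 (\lambda + 2) &= 0 \\
\lambda &= 0 \text{ or } -2
\end{aligned}
\]

Let $\lambda_1 = \lambda_2 = 0$ (equal roots), $\lambda_3 = -2$ and $\vec{x} = \begin{pmatrix} u \\ v \\ w \end{pmatrix}$. Equation 7.27 becomes

\[
\begin{pmatrix} \lambda + 1 & 0 & -1 \\ -3 & \lambda & 3 \\ -1 & 0 & \lambda + 1 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]

For $\lambda_1 = 0$, we have

\[
\begin{pmatrix} 1 & 0 & -1 \\ -3 & 0 & 3 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]

so

\[
\begin{cases} u - w = 0 \\ -3u + 3w = 0 \\ -u + w = 0 \end{cases}
\]

which gives $u - w = 0$. We set $u = 1$, $v = 0$, and $w = 1$, then $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$.

There is another independent eigenvector which corresponds to this eigenvalue ($\lambda_1 = \lambda_2 = 0$), because $v$ is unconstrained. Let's set $u = 0$, $v = 1$, and $w = 0$, then $\vec{x}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$.

For $\lambda_3 = -2$, we have

\[
\begin{pmatrix} -1 & 0 & -1 \\ -3 & -2 & 3 \\ -1 & 0 & -1 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]

so

\[
\begin{cases} -u - w = 0 \\ -3u - 2v + 3w = 0 \\ -u - w = 0 \end{cases}
\]

which gives

\[
\begin{cases} u + w = 0 \\ v - 3w = 0 \end{cases}
\]

We set $u = 1$, $v = -3$, and $w = -1$, then $\vec{x}_3 = \begin{pmatrix} 1 \\ -3 \\ -1 \end{pmatrix}$.

To conclude, the three independent eigenvectors are $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$, $\vec{x}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, and $\vec{x}_3 = \begin{pmatrix} 1 \\ -3 \\ -1 \end{pmatrix}$. ∎

Example 7.19. If $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 2 & -5 & 4 \end{pmatrix}$, find the eigenvalues and eigenvectors of $A$.

Solution: Let $\vec{x}$ be an eigenvector of $A$.

\[
\begin{aligned}
A\vec{x} &= \lambda\vec{x} \\
(\lambda I - A)\,\vec{x} &= \vec{0} \\
\begin{pmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -2 & 5 & \lambda - 4 \end{pmatrix} \vec{x} &= \vec{0}
\end{aligned}
\tag{7.28}
\]

For a non-trivial solution of $\vec{x}$, we have

\[
\begin{aligned}
\det \begin{pmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -2 & 5 & \lambda - 4 \end{pmatrix} &= 0 \\
\lambda^3 - 4\lambda^2 + 5\lambda - 2 &= 0 \\
(\lambda - 1)^2 (\lambda - 2) &= 0 \\
\lambda &= 1 \text{ or } 2
\end{aligned}
\]

Let $\lambda_1 = \lambda_2 = 1$ (equal roots), $\lambda_3 = 2$ and $\vec{x} = \begin{pmatrix} u \\ v \\ w \end{pmatrix}$. Equation 7.28 becomes

\[
\begin{pmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -2 & 5 & \lambda - 4 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]

For $\lambda_1 = 1$, we have

\[
\begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ -2 & 5 & -3 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]

so

\[
\begin{cases} u - v = 0 \\ v - w = 0 \\ -2u + 5v - 3w = 0 \end{cases}
\]

which gives

\[
\begin{cases} u - v = 0 \\ v - w = 0 \end{cases}
\]

We set $u = 1$, $v = 1$, and $w = 1$, then $\vec{x}_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.

Next, we set $u = 0$, then $v = 0$ and $w = 0$, so $\vec{x}_2 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$. We should note that $\vec{x}_2$ is a trivial solution of the eigenvalue equation and it is not regarded as an eigenvector of $A$. Moreover, $\vec{x}_1$ and $\vec{x}_2$ are dependent vectors because $\vec{x}_2 = 0\,\vec{x}_1$.

For $\lambda_3 = 2$, we have

\[
\begin{pmatrix} 2 & -1 & 0 \\ 0 & 2 & -1 \\ -2 & 5 & -2 \end{pmatrix} \begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} ,
\]

so

\[
\begin{cases} 2u - v = 0 \\ 2v - w = 0 \\ -2u + 5v - 2w = 0 \end{cases}
\]

which gives

\[
\begin{cases} 2u - v = 0 \\ 2v - w = 0 \end{cases}
\]

We set $u = 1$, $v = 2$, and $w = 4$, then $\vec{x}_3 = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}$.

Hence, $A$ has only two independent eigenvectors, $\vec{x}_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ and $\vec{x}_3 = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}$. ∎

Example 7.20. If matrix $A$ is similar to matrix $B$, show that they have the same set of eigenvalues.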
Solution: If $A$ is similar to $B$, then there exists an invertible matrix $P$ such that $B = P^{-1}AP$. Let $\lambda$ and $\vec{x}$ be an eigenvalue and the corresponding eigenvector of $A$. We have $BP^{-1} = P^{-1}A$ and $A\vec{x} = \lambda\vec{x}$. So

\[
\begin{aligned}
(BP^{-1})\,\vec{x} &= (P^{-1}A)\,\vec{x} \\
B\,(P^{-1}\vec{x}) &= P^{-1}(\lambda\vec{x}) \\
B\,(P^{-1}\vec{x}) &= \lambda\,(P^{-1}\vec{x})
\end{aligned}
\]

Therefore, $\lambda$ is an eigenvalue of $B$ and $P^{-1}\vec{x}$ is the corresponding eigenvector. $A$ and $B$ share the same set of eigenvalues. ∎

7.10 Diagonalization

Given an n-square matrix $A$, diagonalization of $A$ is very useful when one wants to compute $A^m$, where $m$ is a non-negative integer. The procedure to diagonalize a matrix is shown below.

(I) Find the eigenvalues $\lambda_i$ and eigenvectors $\vec{x}_i$ of $A$, where $i = 1, 2, \cdots, n$. The matrix $A$ can be diagonalized if $A$ has $n$ independent eigenvectors.

(II) Construct the n-square matrix $P$ using the eigenvectors $\vec{x}_1, \vec{x}_2, \cdots, \vec{x}_n$ as its columns.

(III) Find the inverse of $P$.

(IV) A diagonal matrix $D$ is formed, where $D = P^{-1}AP$ and the diagonal entries are $\lambda_1, \lambda_2, \cdots, \lambda_n$.

Knowing that $D^m = P^{-1}A^mP$, we have $A^m = PD^mP^{-1}$.

Example 7.21. Given a 3 × 3 real matrix $A$, show that $D = P^{-1}AP$ is a diagonal matrix if the columns of $P$ are formed by the eigenvectors of $A$.

Solution: Let $\lambda_1$, $\lambda_2$ and $\lambda_3$ be the eigenvalues of $A$. Let the corresponding independent eigenvectors be $\vec{x}_1$, $\vec{x}_2$, and $\vec{x}_3$. As a reminder, they are column vectors.

\[
A\vec{x}_1 = \lambda_1\vec{x}_1 , \quad A\vec{x}_2 = \lambda_2\vec{x}_2 , \quad A\vec{x}_3 = \lambda_3\vec{x}_3
\]

Construct $P$ by using the eigenvectors, i.e. $P = (\vec{x}_1, \vec{x}_2, \vec{x}_3)$, then

\[
\begin{aligned}
AP &= A\,(\vec{x}_1, \vec{x}_2, \vec{x}_3) \\
&= (A\vec{x}_1, A\vec{x}_2, A\vec{x}_3) \\
&= (\lambda_1\vec{x}_1, \lambda_2\vec{x}_2, \lambda_3\vec{x}_3) \\
&= (\vec{x}_1, \vec{x}_2, \vec{x}_3) \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \\
&= PD
\end{aligned}
\]

Hence, $D = P^{-1}AP$.

Remark: $P = (\vec{x}_1, \vec{x}_2, \vec{x}_3)$ is written as a row whose entries are the three column vectors $\vec{x}_1$, $\vec{x}_2$, and $\vec{x}_3$. So, $P$ is a 3 × 3 matrix. ∎

Example 7.22. If $A = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$, diagonalize $A$ and then compute $A^5$.

Solution: Referring to the answers in example 7.17, we have the eigenvalues and eigenvectors of $A$. Eigenvalues: $\lambda = 1$ and $4$, and eigenvectors: $\vec{x}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\vec{x}_2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.

Now, we construct the matrix $P$ using the eigenvectors,

\[
P = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} .
\]

Then we compute the inverse of $P$,

\[
P^{-1} = \frac{\operatorname{adj} P}{\det P} = \frac{1}{3} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} ,
\]

where $\det P = \begin{vmatrix} 1 & 2 \\ 0 & 3 \end{vmatrix} = 3$.

A diagonal matrix is obtained as

\[
D = P^{-1}AP = \frac{1}{3} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}
\]

In fact, we can write down $D$ without computing the product, because

\[
D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}
\]

So,

\[
D^5 = \begin{pmatrix} 1 & 0 \\ 0 & 1024 \end{pmatrix}
\]

Now, we compute the power of $A$. As stated above, $D^5 = P^{-1}A^5P$, so we have $A^5 = PD^5P^{-1}$, which gives

\[
A^5 = \frac{1}{3} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1024 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 682 \\ 0 & 1024 \end{pmatrix}
\]
∎

Example 7.23. If $A = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$, find $e^A$.

Solution: Making use of the diagonal matrix $D$ obtained in example 7.22 and the facts that $A^0 = I$ and $e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$, we have

\[
\begin{aligned}
P^{-1} e^A P &= P^{-1} \left( I + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots \right) P \\
&= I + \frac{P^{-1}AP}{1!} + \frac{P^{-1}A^2P}{2!} + \frac{P^{-1}A^3P}{3!} + \cdots \\
&= I + \frac{(P^{-1}AP)}{1!} + \frac{(P^{-1}AP)^2}{2!} + \frac{(P^{-1}AP)^3}{3!} + \cdots \\
&= e^{P^{-1}AP} \\
&= e^D
\end{aligned}
\]

We have applied the fact that

\[
\begin{aligned}
P^{-1}A^2P &= (P^{-1}AP)(P^{-1}AP) = (P^{-1}AP)^2 \\
P^{-1}A^3P &= (P^{-1}AP)(P^{-1}AP)(P^{-1}AP) = (P^{-1}AP)^3 \\
&\ \ \vdots
\end{aligned}
\]

Hence,

\[
\begin{aligned}
e^A &= P e^D P^{-1} \\
&= P \left( I + \frac{D}{1!} + \frac{D^2}{2!} + \frac{D^3}{3!} + \cdots \right) P^{-1} \\
&= P \begin{pmatrix} e^1 & 0 \\ 0 & e^4 \end{pmatrix} P^{-1} \\
&= \frac{1}{3} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} e^1 & 0 \\ 0 & e^4 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 0 & 1 \end{pmatrix} \\
&= \frac{1}{3} \begin{pmatrix} 3e & -2e + 2e^4 \\ 0 & 3e^4 \end{pmatrix} \\
&= \begin{pmatrix} e & -\frac{2}{3}e + \frac{2}{3}e^4 \\ 0 & e^4 \end{pmatrix}
\end{aligned}
\]
∎

7.11 Rotation of Axes

There are two coordinate systems $XY$ and $X'Y'$. If the two coordinate systems are related by a rotation, as in figure 7.3, what is the relation between the coordinates?

Figure 7.3: Two coordinate systems related by a rotation

Let the coordinates of the same point be $(x, y)$ and $(x', y')$ in the two systems. It is easy to see that if the polar coordinates of the point are $(r, \theta)$ in the original system, then they are $(r, \theta - \alpha)$ in the new system.
Hence, we have

\[
\begin{aligned}
x' &= r\cos(\theta - \alpha) \\
&= r(\cos\theta\cos\alpha + \sin\theta\sin\alpha) \\
&= x\cos\alpha + y\sin\alpha ,
\end{aligned}
\tag{7.29}
\]

\[
\begin{aligned}
y' &= r\sin(\theta - \alpha) \\
&= r(\sin\theta\cos\alpha - \cos\theta\sin\alpha) \\
&= y\cos\alpha - x\sin\alpha \\
&= -x\sin\alpha + y\cos\alpha ,
\end{aligned}
\tag{7.30}
\]

where we have used equations 5.20 and 5.18. In matrix notation, these are

\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
\tag{7.31}
\]

The inverse transformation is

\[
\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix}
\tag{7.32}
\]

This can be obtained by reversing the sign of $\alpha$ or by taking the inverse of the matrix. To write down the relation explicitly, we have

\[
x = x'\cos\alpha - y'\sin\alpha ,
\tag{7.33}
\]

\[
y = x'\sin\alpha + y'\cos\alpha
\tag{7.34}
\]

Example 7.24. The equation of a circle with center at the origin is $x^2 + y^2 = R^2$. In the rotated system, we have

\[
\begin{aligned}
(x'\cos\alpha - y'\sin\alpha)^2 + (x'\sin\alpha + y'\cos\alpha)^2 &= R^2 \\
x'^2\cos^2\alpha + y'^2\sin^2\alpha + x'^2\sin^2\alpha + y'^2\cos^2\alpha &= R^2 \\
x'^2 + y'^2 &= R^2 ,
\end{aligned}
\]

as expected. ∎

Example 7.25. Find the equation of the line pair $4x^2 - 11xy + 6y^2 = 0$ when the axes are rotated counterclockwise through the acute angle whose tangent is $\frac{4}{3}$.

Solution: If $\alpha$ is the angle of rotation, $\tan\alpha = \frac{4}{3}$ gives $\sin\alpha = \frac{4}{5}$ and $\cos\alpha = \frac{3}{5}$. The new equation is

\[
\begin{aligned}
4\left(\tfrac{3}{5}x' - \tfrac{4}{5}y'\right)^2 - 11\left(\tfrac{3}{5}x' - \tfrac{4}{5}y'\right)\left(\tfrac{4}{5}x' + \tfrac{3}{5}y'\right) + 6\left(\tfrac{4}{5}x' + \tfrac{3}{5}y'\right)^2 &= 0 \\
4\,(3x' - 4y')^2 - 11\,(3x' - 4y')(4x' + 3y') + 6\,(4x' + 3y')^2 &= 0 \\
125x'y' + 250y'^2 &= 0 \\
x'y' + 2y'^2 &= 0
\end{aligned}
\]
∎

Example 7.26. The equation of a parabola is $x^2 = 4ay$. Consider a simple counterclockwise rotation of axes by $\frac{\pi}{2}$. By equations 7.33 and 7.34, we simply have $x = -y'$ and $y = x'$. In the new coordinates, the equation of the parabola becomes

\[
y'^2 = 4ax'
\tag{7.35}
\]

If the rotation angle is $\frac{\pi}{4}$, we have $x = \dfrac{x' - y'}{\sqrt{2}}$ and $y = \dfrac{x' + y'}{\sqrt{2}}$, and the new equation becomes

\[
\begin{aligned}
\left(\frac{x' - y'}{\sqrt{2}}\right)^2 &= 4a\left(\frac{x' + y'}{\sqrt{2}}\right) \\
x'^2 - 2x'y' + y'^2 &= 4\sqrt{2}\,a\,x' + 4\sqrt{2}\,a\,y' \\
x'^2 - 2x'y' + y'^2 - 4\sqrt{2}\,a\,x' - 4\sqrt{2}\,a\,y' &= 0
\end{aligned}
\]
∎

Example 7.27.
The ellipse in figure 7.4 is $97x^2 + 192xy + 153y^2 = 225$ when the axes $x$ and $y$ are used as reference. There is another set of axes, $x'$ and $y'$, obtained by rotating the axes $x$ and $y$ counterclockwise by an acute angle $\alpha$, with $\sin\alpha = 4/5$. Show that the equation of the ellipse has the standard form when the $x'$ and $y'$ coordinates are used.

Figure 7.4: An ellipse and the rotation of axes

Solution: Using equations 7.33 and 7.34, we have

\[
\begin{cases}
x = \dfrac{3x'}{5} - \dfrac{4y'}{5} \\[2mm]
y = \dfrac{4x'}{5} + \dfrac{3y'}{5}
\end{cases}
\]

Then, we rewrite the equation of the ellipse $97x^2 + 192xy + 153y^2 = 225$ as

\[
\begin{aligned}
97\left(\frac{3x'}{5} - \frac{4y'}{5}\right)^2 + 192\left(\frac{3x'}{5} - \frac{4y'}{5}\right)\left(\frac{4x'}{5} + \frac{3y'}{5}\right) + 153\left(\frac{4x'}{5} + \frac{3y'}{5}\right)^2 &= 225 \\
97\,(3x' - 4y')^2 + 192\,(3x' - 4y')(4x' + 3y') + 153\,(4x' + 3y')^2 &= 5625 \\
5625x'^2 + 625y'^2 &= 5625 \\
x'^2 + \frac{y'^2}{9} &= 1
\end{aligned}
\]

The last line is the standard form of an ellipse. ∎

The above discussion focuses on the rotation of axes while the point of interest $P$ is fixed. Now, we consider the transformation of the coordinates if point $P$ rotates but the axes are fixed. Figure 7.5 shows a point $P$ which rotates counterclockwise by an angle $\alpha$.

Figure 7.5: The coordinates of a rotating point

With $x = r\cos\beta$ and $y = r\sin\beta$ for the original position, the new position is $P'$ and its new coordinates are

\[
\begin{aligned}
x' &= r\cos(\alpha + \beta) \\
&= r(\cos\alpha\cos\beta - \sin\alpha\sin\beta) \\
&= x\cos\alpha - y\sin\alpha ,
\end{aligned}
\tag{7.36}
\]

\[
\begin{aligned}
y' &= r\sin(\alpha + \beta) \\
&= r(\sin\alpha\cos\beta + \cos\alpha\sin\beta) \\
&= x\sin\alpha + y\cos\alpha
\end{aligned}
\tag{7.37}
\]

The matrix form of equations 7.36 and 7.37 is

\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
\tag{7.38}
\]

The discussion can be extended to 3-dimensional space, and it is clear that equation 7.38 represents the rotation of point $P$ about the z-axis, with $P$ always lying on the xy-plane ($z = 0$).
Thus we can write

\[
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}
\tag{7.39}
\]

If $P$ rotates about the x-axis, we have

\[
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}
\tag{7.40}
\]

The result follows from the following substitutions in equation 7.38:

\[
\begin{cases} x' \to y' \\ y' \to z' \end{cases} \quad \text{and} \quad \begin{cases} x \to y \\ y \to z \end{cases}
\]

Recall that the convention is governed by the right hand rule: $\hat{k} = \hat{i} \times \hat{j}$ and $\hat{i} = \hat{j} \times \hat{k}$. The correspondence between these equations is $\hat{k} \to \hat{i}$, $\hat{i} \to \hat{j}$ and $\hat{j} \to \hat{k}$. See figure 7.6.

Figure 7.6: The axes of rotation

If $P$ rotates about the y-axis, we have

\[
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\alpha & 0 & \sin\alpha \\ 0 & 1 & 0 \\ -\sin\alpha & 0 & \cos\alpha \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}
\tag{7.41}
\]

The result follows from the following substitutions in equation 7.38:

\[
\begin{cases} x' \to z' \\ y' \to x' \end{cases} \quad \text{and} \quad \begin{cases} x \to z \\ y \to x \end{cases}
\]

A similar argument from the right hand rule applies: $\hat{k} = \hat{i} \times \hat{j}$ and $\hat{j} = \hat{k} \times \hat{i}$. The correspondence between these equations is $\hat{k} \to \hat{j}$, $\hat{i} \to \hat{k}$ and $\hat{j} \to \hat{i}$. See figure 7.6.

7.12 Special Matrices

Special matrices are widely used in physics. Some of them are shown below.

(i) Orthogonal matrix: $A^T = A^{-1}$
We note that $AA^T = A^TA = I$ and $\det A = \pm 1$. A rotation matrix is an example of an orthogonal matrix.

(ii) Hermitian matrix: $A = A^\dagger$
It is widely used in quantum mechanics, where $A^\dagger$ stands for the transpose of the complex conjugate of $A$, i.e. $A^\dagger = (A^*)^T$.

(iii) Unitary matrix: $UU^\dagger = U^\dagger U = I$
The difference between a unitary matrix and an orthogonal matrix is that the unitary condition involves the complex conjugate, while the orthogonal condition does not.

Example 7.28. Show that the eigenvalues of a Hermitian matrix are real.

Solution: If $A$ is a Hermitian matrix, then $A = A^\dagger$. Let $\lambda$ and $\vec{x}$ be an eigenvalue and the corresponding eigenvector such that $A\vec{x} = \lambda\vec{x}$.
Taking the transpose of the complex conjugate of both sides, we have

\[
\begin{aligned}
(A\vec{x})^\dagger &= \lambda^* \vec{x}^\dagger && (\lambda^* \text{ represents the complex conjugate of } \lambda) \\
\vec{x}^\dagger A^\dagger &= \lambda^* \vec{x}^\dagger \\
\vec{x}^\dagger A &= \lambda^* \vec{x}^\dagger && (A = A^\dagger)
\end{aligned}
\]

Multiplying both sides by $\vec{x}$ from the right, we get

\[
\begin{aligned}
\vec{x}^\dagger A \vec{x} &= \lambda^* \vec{x}^\dagger \vec{x} \\
\vec{x}^\dagger (\lambda \vec{x}) &= \lambda^* \vec{x}^\dagger \vec{x} \\
\lambda\, \vec{x}^\dagger \vec{x} &= \lambda^* \vec{x}^\dagger \vec{x} \\
\lambda &= \lambda^*
\end{aligned}
\]

The last step uses $\vec{x}^\dagger \vec{x} \neq 0$, which holds because an eigenvector is non-zero. ∎

7.13 Vector Spaces

A vector space $V$ over a field $K$ consists of a set on which two operations (called addition and scalar multiplication, respectively) are defined so that for each pair of elements $x$, $y$ in $V$ there is a unique element $x + y$ (a sum) in $V$, and for each element $a$ in $K$ and each element $x$ in $V$ there is a unique element $ax$ (a product) in $V$, such that the following conditions hold.

(VS.I) For all $x$, $y$ in $V$, $x + y = y + x$ (commutative law).

(VS.II) For all $x$, $y$, $z$ in $V$, $(x + y) + z = x + (y + z)$ (associative law).

(VS.III) There exists an element in $V$ denoted by $0$ such that $x + 0 = x$ for each $x$ in $V$. The vector $0$ is called the zero vector of $V$.

(VS.IV) For each element $x$ in $V$ there exists an element $y$ in $V$ such that $x + y = 0$. The vector $y$ is called the additive inverse of $x$ and is denoted by $-x$.

(VS.V) For each element $x$ in $V$ and the unit scalar $1$ in $K$, $1x = x$.

(VS.VI) For each pair of elements $a$, $b$ in $K$ and each element $x$ in $V$, $(ab)\,x = a\,(bx)$.

(VS.VII) For each element $a$ in $K$ and each pair of elements $x$, $y$ in $V$, $a\,(x + y) = ax + ay$ (distributive law).

(VS.VIII) For each pair of elements $a$, $b$ in $K$ and each element $x$ in $V$, $(a + b)\,x = ax + bx$ (distributive law).

Example 7.29. Let $V$ be the set of all $m \times n$ matrices with entries from an arbitrary field $K$. Illustrate that $V$ is a vector space over $K$ with respect to the operations of matrix addition and scalar multiplication.

Solution: One can easily verify that $V$ satisfies properties VS.I to VS.VIII. For example, $V$ contains the zero matrix $0$, and for each element in $V$, its additive inverse is also in $V$. ∎

Example 7.30.
Show that for any scalar $k$ in the scalar field $K$ and any vectors $u$ and $v$ in the vector space $V$, $k\,(u - v) = ku - kv$.

Solution: Clearly,

\[
k\,(u - v) = k\,(u + (-v)) = ku + k\,(-v) = ku - kv
\]
∎

7.14 Linear Transformation

7.14.1 Basis Vectors

Let $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$ and $e_3 = (0, 0, 1)$ be the basis vectors of 3-D space, then all 3-D vectors are linear combinations of the elements of the basis $\{e_i\}$ of the space. For example, an arbitrary vector $\vec{v}$ in 3-D space can be expressed as $(x, y, z) = x\,e_1 + y\,e_2 + z\,e_3$, where $e_1$, $e_2$, and $e_3$ are commonly labelled as $\hat{i}$, $\hat{j}$, and $\hat{k}$ respectively. Alternatively, we write down $\vec{v}$ as a column vector, i.e.

\[
\vec{v} = x\,\hat{i} + y\,\hat{j} + z\,\hat{k}
= x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + z \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]

Basis vectors are linearly independent of each other. None of them can be generated from the remaining elements in $\{e_i\}$. So, you cannot express $e_1$ as a linear combination of $e_2$ and $e_3$. Other than the usual basis vectors $e_1$, $e_2$ and $e_3$, one can choose any three non-coplanar vectors as basis vectors, say $e_1' = (1, 1, 1)$, $e_2' = (1, 1, 0)$, and $e_3' = (1, 0, 0)$.

Example 7.31. A vector $\vec{v} = (2, 3)$ is defined using the usual basis vectors $e_1 = (1, 0)$ and $e_2 = (0, 1)$ in the 2-D vector space. If the basis vectors are changed to $e_1' = (1, 1)$ and $e_2' = (-1, 1)$, express $\vec{v}$ using the new basis.

Solution: Recall that $\vec{v} = (2, 3) = 2\,e_1 + 3\,e_2$. Let $\vec{v} = (2, 3) = a\,e_1' + b\,e_2'$, then

\[
\begin{pmatrix} 2 \\ 3 \end{pmatrix} = a \begin{pmatrix} 1 \\ 1 \end{pmatrix} + b \begin{pmatrix} -1 \\ 1 \end{pmatrix}
\]

We have $a - b = 2$ and $a + b = 3$, which give $a = \frac{5}{2}$, $b = \frac{1}{2}$. Therefore $\vec{v} = \frac{5}{2}\,e_1' + \frac{1}{2}\,e_2'$. Using the new basis, $\vec{v}$ is given by $\left(\frac{5}{2}, \frac{1}{2}\right)$. ∎

7.14.2 Linear Operator

Let $V$ and $W$ be vector spaces (over $K$). We call a function $T : V \to W$ a linear transformation from $V$ to $W$ if, for all $x, y \in V$ and $c \in K$, we have

(a) $T(x + y) = T(x) + T(y)$ and

(b) $T(cx) = c\,T(x)$.

We often simply call $T$ linear. If $V = W$, we call $T$ a linear operator on $V$.
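Conditions (a) and (b) can be spot-checked numerically for a candidate map. The sketch below is only an illustration in plain Python: it tests the map $T(a_1, a_2) = (3a_1 + a_2, 2a_1)$, which is the one treated in example 7.32, against a deliberately non-linear map `S` that is a hypothetical counterexample introduced here. The helper names (`add`, `scale`, `looks_linear`) and the sample values are assumptions of this sketch. Passing the test on finitely many samples does not prove linearity; failing it disproves linearity.

```python
from fractions import Fraction
from itertools import product


def T(x):
    """The map T(a1, a2) = (3 a1 + a2, 2 a1) from example 7.32."""
    a1, a2 = x
    return (3 * a1 + a2, 2 * a1)


def S(x):
    """A hypothetical non-linear map, used only for contrast."""
    a1, a2 = x
    return (a1 * a1, a2)


def add(x, y):
    """Component-wise vector addition."""
    return tuple(p + q for p, q in zip(x, y))


def scale(c, x):
    """Scalar multiplication of a vector."""
    return tuple(c * p for p in x)


def looks_linear(f, samples, scalars):
    """Return False as soon as condition (a) or (b) fails on the given samples."""
    for x, y in product(samples, repeat=2):
        if f(add(x, y)) != add(f(x), f(y)):      # condition (a)
            return False
    for c, x in product(scalars, samples):
        if f(scale(c, x)) != scale(c, f(x)):     # condition (b)
            return False
    return True


samples = [(1, 0), (0, 1), (2, 3), (-1, 4)]
scalars = [Fraction(1, 2), -3, 0, 5]

print(looks_linear(T, samples, scalars))  # True
print(looks_linear(S, samples, scalars))  # False
```

For `S`, the check already fails on $x = y = (1, 0)$, since $S(2, 0) = (4, 0)$ while $S(1, 0) + S(1, 0) = (2, 0)$.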
Properties of a Linear Operator

1. If $T$ is linear, then $T(0) = 0$.

2. $T$ is linear if and only if $T(cx + y) = c\,T(x) + T(y)$ for all $x, y \in V$ and $c \in K$.

3. If $T$ is linear, then $T(x - y) = T(x) - T(y)$ for all $x, y \in V$.

4. $T$ is linear if and only if, for $x_1, x_2, \cdots, x_n \in V$ and $a_1, a_2, \cdots, a_n \in K$, we have

\[
T\left( \sum_{i=1}^{n} a_i x_i \right) = \sum_{i=1}^{n} a_i\,T(x_i) .
\]

Generally, the second property is used to prove that a transformation is linear.

Example 7.32. Define $T : \mathbb{R}^2 \to \mathbb{R}^2$ by $T(a_1, a_2) = (3a_1 + a_2, 2a_1)$. Show that $T$ is linear.

Solution: Let $c \in \mathbb{R}$ and $x, y \in \mathbb{R}^2$, where $x = (b_1, b_2)$ and $y = (d_1, d_2)$. Since $cx + y = (cb_1 + d_1, cb_2 + d_2)$, we have

\[
T(cx + y) = (3\,(cb_1 + d_1) + cb_2 + d_2 ,\ 2\,(cb_1 + d_1))
\]

Also

\[
\begin{aligned}
c\,T(x) + T(y) &= c\,(3b_1 + b_2, 2b_1) + (3d_1 + d_2, 2d_1) \\
&= (3cb_1 + cb_2 + 3d_1 + d_2 ,\ 2cb_1 + 2d_1) \\
&= (3\,(cb_1 + d_1) + cb_2 + d_2 ,\ 2\,(cb_1 + d_1))
\end{aligned}
\]

So $T$ is linear. ∎

7.15 Matrix Representation of a Linear Operator

Let $T$ be a linear operator such that

\[
\begin{cases}
T(e_1) = a_{11}\,e_1 + a_{12}\,e_2 + a_{13}\,e_3 \\
T(e_2) = a_{21}\,e_1 + a_{22}\,e_2 + a_{23}\,e_3 \\
T(e_3) = a_{31}\,e_1 + a_{32}\,e_2 + a_{33}\,e_3
\end{cases}
\tag{7.42}
\]

then the matrix representation of $T$ is

\[
T = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
\]

One can check that this matrix reproduces equation set 7.42.

\[
T e_1 = T \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{11} \\ a_{12} \\ a_{13} \end{pmatrix}
\]
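Similarly, we have

\[
T e_2 = T \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} a_{21} \\ a_{22} \\ a_{23} \end{pmatrix}
\]

\[
T e_3 = T \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} a_{31} \\ a_{32} \\ a_{33} \end{pmatrix}
\]

Moreover, if $T$ operates on $(x, y, z)$, it gives

\[
T(x\,e_1 + y\,e_2 + z\,e_3) = T \begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} x\,a_{11} + y\,a_{21} + z\,a_{31} \\ x\,a_{12} + y\,a_{22} + z\,a_{32} \\ x\,a_{13} + y\,a_{23} + z\,a_{33} \end{pmatrix}
\]

which is equivalent to

\[
\begin{aligned}
T(x, y, z) &= T(x\,e_1 + y\,e_2 + z\,e_3) \\
&= T(x\,e_1) + T(y\,e_2) + T(z\,e_3) \\
&= x\,T(e_1) + y\,T(e_2) + z\,T(e_3) \\
&= x\,(a_{11}e_1 + a_{12}e_2 + a_{13}e_3) + y\,(a_{21}e_1 + a_{22}e_2 + a_{23}e_3) + z\,(a_{31}e_1 + a_{32}e_2 + a_{33}e_3) \\
&= (x\,a_{11} + y\,a_{21} + z\,a_{31})\,e_1 + (x\,a_{12} + y\,a_{22} + z\,a_{32})\,e_2 + (x\,a_{13} + y\,a_{23} + z\,a_{33})\,e_3
\end{aligned}
\]

Example 7.33. If $S$ and $T$ are linear operators such that $S(x, y) = (y, x)$ and $T(x, y) = (3x - y, -2x + 5y)$, find the matrix representations of $S$ and $T$ using the usual basis vectors $(1, 0)$ and $(0, 1)$.

Solution:

\[
S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad T = \begin{pmatrix} 3 & -1 \\ -2 & 5 \end{pmatrix} ,
\]

where

\[
S \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ x \end{pmatrix} \quad \text{and} \quad T \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x - y \\ -2x + 5y \end{pmatrix} .
\]
∎

Example 7.34. A linear operator $T = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ is defined using the usual basis, $\{e_1 = (1, 0), e_2 = (0, 1)\}$.

(a) Find $T(1, 3)$ and $T(2, 7)$.

(b) Find the matrix of $T$ using the basis $\{(1, 3), (2, 7)\}$.

Solution:

(a)
\[
T \begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 7 \\ 15 \end{pmatrix} , \quad
T \begin{pmatrix} 2 \\ 7 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 2 \\ 7 \end{pmatrix} = \begin{pmatrix} 16 \\ 34 \end{pmatrix}
\]

(b) Denote $e_1' = \begin{pmatrix} 1 \\ 3 \end{pmatrix}$ and $e_2' = \begin{pmatrix} 2 \\ 7 \end{pmatrix}$ and let

\[
\begin{pmatrix} 7 \\ 15 \end{pmatrix} = a\,e_1' + b\,e_2' \quad \text{and} \quad \begin{pmatrix} 16 \\ 34 \end{pmatrix} = c\,e_1' + d\,e_2'
\]

We have

\[
\begin{pmatrix} 7 \\ 15 \end{pmatrix} = a \begin{pmatrix} 1 \\ 3 \end{pmatrix} + b \begin{pmatrix} 2 \\ 7 \end{pmatrix} \quad \text{and} \quad
\begin{pmatrix} 16 \\ 34 \end{pmatrix} = c \begin{pmatrix} 1 \\ 3 \end{pmatrix} + d \begin{pmatrix} 2 \\ 7 \end{pmatrix}
\]

Hence, we get

\[
\begin{cases} a = 19 \\ b = -6 \end{cases} \quad \text{and} \quad \begin{cases} c = 44 \\ d = -14 \end{cases} .
\]

Let $T_{e'}$ be the matrix of $T$ using the basis $\{(1, 3), (2, 7)\}$. Now, we can write $T_{e'}\,e_1' = 19\,e_1' - 6\,e_2'$ and $T_{e'}\,e_2' = 44\,e_1' - 14\,e_2'$.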
Thus,

\[
T_{e'} = \begin{pmatrix} 19 & 44 \\ -6 & -14 \end{pmatrix}
\]

Remark: Using the linear operator $T_{e'}$, we have

\[
T_{e'} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 19 \\ -6 \end{pmatrix} \quad \text{and} \quad
T_{e'} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 44 \\ -14 \end{pmatrix}
\]

The vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix} = (1)\,e_1' + (0)\,e_2' = e_1'$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix} = (0)\,e_1' + (1)\,e_2' = e_2'$. They are not the usual basis vectors, $e_1$ and $e_2$. In the same manner,

\[
\begin{pmatrix} 19 \\ -6 \end{pmatrix} = (19)\,e_1' + (-6)\,e_2' \quad \text{and} \quad
\begin{pmatrix} 44 \\ -14 \end{pmatrix} = (44)\,e_1' + (-14)\,e_2'
\]
∎
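As a numerical cross-check of example 7.34, the matrix of $T$ in the basis $\{(1, 3), (2, 7)\}$ can also be obtained as $P^{-1}TP$, where the columns of $P$ are the new basis vectors. This route is an assumption of the sketch below (the example itself solves for the coefficients directly), though it is the same similarity transformation that appears in example 7.20. The helper names `matmul2` and `inverse2` are illustrative.

```python
from fractions import Fraction


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse2(m):
    """Inverse of a 2x2 matrix using equation 7.14, with exact fractions."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


T = [[1, 2], [3, 4]]   # operator in the usual basis
P = [[1, 2], [3, 7]]   # columns are e1' = (1, 3) and e2' = (2, 7)

# Matrix of the same operator expressed in the primed basis.
T_new = matmul2(inverse2(P), matmul2(T, P))

print(T_new == [[19, 44], [-6, -14]])  # True
```

The columns of `T_new` are the coefficient pairs $(a, b) = (19, -6)$ and $(c, d) = (44, -14)$ found in part (b).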