Physics-0 Lecture Notes

Instructors: Chi-Ming Chang, Qing-Rui Wang, Fan Yang
Qiuzhen College, Tsinghua University
2023 Spring

Contents

1 Mechanics
  1.1 Kinematics: Velocity, Acceleration, Inertial Frame
    1.1.1 Point particle and reference frame
    1.1.2 Velocity and acceleration
    1.1.3 Inertial frame and Galilean transformation
    1.1.4 Linear Motions, Circular Motion, Parabolic Motion
  1.2 Newton’s Laws
    1.2.1 Newtonian mechanics
    1.2.2 Newton’s first law
    1.2.3 Newton’s second law
    1.2.4 Newton’s third law
    1.2.5 Applying Newton’s laws
  1.3 Forces in Nature, Statics
    1.3.1 The four fundamental forces in nature
    1.3.2 Some particular forces
    1.3.3 Statics I: forces
    1.3.4 Statics II: torques
  1.4 Energy and Momentum Conservation
    1.4.1 Energy conservation
    1.4.2 Momentum conservation
    1.4.3 Elastic scattering
    1.4.4 Inelastic scattering
  1.5 Harmonic Oscillators, Pendulum
    1.5.1 Simple harmonic motion
    1.5.2 Damped harmonic motion
    1.5.3 Forced oscillations and resonance
    1.5.4 Simple Pendulum
    1.5.5 Double pendulum and chaos
  1.6 The Theory of Gravitation
    1.6.1 Newton’s law of Gravitation
    1.6.2 Gravitational potential energy
    1.6.3 Kepler’s laws
    1.6.4 Gravitation on Earth
    1.6.5 Gauss’s law for gravity

2 Electricity and Magnetism
  2.1 Coulomb’s Law, Electric Fields
    2.1.1 Electric charges
    2.1.2 Coulomb’s law
    2.1.3 Electric fields
    2.1.4 Electric fields due to charged objects
    2.1.5 Charged particles in electric fields
  2.2 Electric flux, Gauss’s law, and integral theorems
    2.2.1 Vector calculus
    2.2.2 Electric flux, divergence theorem, and Gauss’s law
    2.2.3 Stokes’ theorem, Poincaré lemma, and electric potential
  2.3 Applying Gauss’s Law, Electric potential
    2.3.1 A charged isolated conductor
    2.3.2 Combining Gauss’s law with symmetry
    2.3.3 Electric potential
  2.4 Capacitance, Current and Resistance
    2.4.1 Capacitance
    2.4.2 Capacitors in Circuits
    2.4.3 Energy Stored in an Electric Field
    2.4.4 Electric Current
    2.4.5 Resistance and Resistivity
    2.4.6 Electric Circuits
    2.4.7 RC Circuits
  2.5 Magnetic Fields, Magnetic Fields Due to Currents
    2.5.1 Magnetic fields
    2.5.2 Magnetic forces
    2.5.3 Magnetic fields from currents
    2.5.4 Ampere’s law
  2.6 Faraday’s Law, Induction and Inductance
    2.6.1 Faraday’s law of induction
    2.6.2 Induced electric fields
    2.6.3 Inductors and inductance, self-induction
    2.6.4 RL circuits
    2.6.5 Energy of a magnetic field
    2.6.6 LC harmonic oscillations
    2.6.7 RLC damped oscillations
  2.7 Maxwell’s equations, Electromagnetic Waves
    2.7.1 Overview of Maxwell’s equations
    2.7.2 Maxwell’s Addition
    2.7.3 Relativistic formulations of Maxwell equations
    2.7.4 Electromagnetic wave
    2.7.5 Symmetry of Maxwell’s equations
    2.7.6 Time Dilation
    2.7.7 Length Contraction

Chapter 1
Mechanics

1.1 Kinematics: Velocity, Acceleration, Inertial Frame

1.1.1 Point particle and reference frame

Physics is the subject that describes nature, and many natural phenomena ultimately boil down to the motion of objects. Therefore, a very important question in physics is “How do we describe the motion of something?” To answer it, we first need to ask what the “something” is whose motion we would like to describe. If the something is a dog, then its motion could be very complicated: it might run straight in some direction, or it might bite its own tail and spin around.
To simplify our task, let us consider the motion of an idealized object called a point particle. Its defining feature is that it lacks spatial extension; hence, it has zero volume. The motion of a point particle is much simpler than the motion of a dog because it has no internal motion such as spin; hence, we can describe the motion of a point particle by the change of its position with time.

Definition 1 (Point particle). A point particle is an idealization of particles in physics. It has no spatial extension, and its motion is completely specified by the time evolution of its position.

On the other hand, if all parts of an object move in the same way, then the object can effectively be treated as a point particle. For example, a small ball falling from a high altitude can be approximated by a point particle.

We have still not answered the question. To describe or measure the motion of a point particle, we first need to decide relative to what the motion is measured. For simplicity, we choose to describe the motion of a point particle $A$ relative to another point particle $O$. Next, we introduce a Cartesian coordinate system whose origin is the position of the particle $O$; this is called a reference frame of the particle $O$. For example, consider the point particle $A$ moving on a two-dimensional plane as shown in Figure 1.1. At time $t = 0$, the particle $A$ starts at the coordinate $(x, y) = (-\tfrac{1}{2}, \tfrac{3}{2})$, moves along the red curve, and ends at the coordinate $(x, y) = (\tfrac{9}{2}, -\tfrac{1}{2})$ at time $t = t_*$.

Figure 1.1: The motion of a point particle A in the reference frame of the point particle O.

The motion of the particle $A$ can be described by the coordinates of the reference frame as functions of the time $t$, i.e. $(x(t), y(t))$. From time $t_1$ to $t_2$, the point particle $A$ travels from $(x_1, y_1)$ to $(x_2, y_2)$.
The displacement of the particle $A$ between $t_1$ and $t_2$ is

$$(\Delta x, \Delta y) = (x_2 - x_1,\; y_2 - y_1), \tag{1.1.1}$$

and the distance is given by

$$d = \sqrt{\Delta x^2 + \Delta y^2}. \tag{1.1.2}$$

If the displacement of the point particle $A$ from $t_1$ to $t_2$ is $(\Delta x, \Delta y)$ and from $t_2$ to $t_3$ is $(\Delta x', \Delta y')$, then the displacement from $t_1$ to $t_3$ is

$$(\Delta x + \Delta x',\; \Delta y + \Delta y'). \tag{1.1.3}$$

The coordinate $(x, y)$ itself can also be understood as the displacement between the particle $O$ and the particle $A$.

In the example we just studied, the motion of the point particle $A$ is confined to a two-dimensional plane, and we used the coordinates $(x, y)$ to describe its motion. More generally, a particle moving in $d$ dimensions can be described by the Cartesian coordinates $(x_1, x_2, \cdots, x_d)$. It is convenient to introduce the concept of a vector. A vector quantity is a physical quantity with both magnitude and direction.

Definition 2 (Vector). A $d$-dimensional vector $\vec{v}$ is a collection of $d$ numbers, $\vec{v} = (v_1, v_2, \cdots, v_d)$, equipped with two operations: vector addition and scalar multiplication.

• Vector addition: Given two $d$-dimensional vectors $\vec{v} = (v_1, \cdots, v_d)$ and $\vec{v}' = (v_1', \cdots, v_d')$, the sum $\vec{v} + \vec{v}'$ is the $d$-dimensional vector defined by

$$\vec{v} + \vec{v}' = (v_1 + v_1', \cdots, v_d + v_d'). \tag{1.1.4}$$

• Scalar multiplication: Given a number $a$ and a $d$-dimensional vector $\vec{v} = (v_1, \cdots, v_d)$, the product $a\vec{v}$ is defined by

$$a\vec{v} = (a v_1, \cdots, a v_d). \tag{1.1.5}$$

Let us look at some examples in two dimensions. The position of a particle is a vector $\vec{r} = (x, y)$. Using vector addition and scalar multiplication, we can also write the displacement as a vector,

$$\Delta\vec{r} = \vec{r}_2 + (-1)\vec{r}_1 = (x_2 - x_1,\; y_2 - y_1). \tag{1.1.6}$$

The discussion around (1.1.3) says that the total displacement from $t_1$ to $t_3$ is the vector sum of the displacement from $t_1$ to $t_2$ and the displacement from $t_2$ to $t_3$.

Definition 3 (Inner product and norm).
The inner product $\vec{u} \cdot \vec{v}$ of two $n$-dimensional vectors $\vec{u} = (u_1, \cdots, u_n)$ and $\vec{v} = (v_1, \cdots, v_n)$ is defined by

$$\vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n. \tag{1.1.7}$$

The norm $|\vec{v}|$ of a vector $\vec{v}$ is defined as the square root of the inner product of $\vec{v}$ with itself, i.e.

$$|\vec{v}| = \sqrt{\vec{v} \cdot \vec{v}}. \tag{1.1.8}$$

The norm of a displacement gives the distance,

$$d = |\Delta\vec{r}|, \tag{1.1.9}$$

and (1.1.2) is an example in two dimensions.

So far, in our description of the motion of a point particle, we have ignored the units. We measure each physical quantity in its own units, by comparison with a standard, which corresponds to exactly 1 unit of the quantity. The unit of time is the second, with the symbol s, and the unit of length is the meter, with the symbol m. The corresponding standards are given by the following two definitions.^1

Definition 4 (Second). One second is the time taken by 9 192 631 770 oscillations of the light (of a specified wavelength) emitted by a cesium-133 atom.

Definition 5 (Meter). The meter is the length of the path traveled by light in a vacuum during a time interval of 1/299 792 458 of a second.

^1 The following two definitions are taken literally from [Robert Resnick; David Halliday; Jearl Walker - Fundamentals of Physics].

For example, when we say that a time period is 135 s, we mean that during that time period the light emitted by a cesium-133 atom has oscillated 135 × 9 192 631 770 times. When we say that two points are 4.3 m apart, we mean that it takes light 4.3 × 1/299 792 458 s to travel from one point to the other.

1.1.2 Velocity and acceleration

When people want to get an idea of the motion of something, the most common question they ask is “how fast does it move?”. Velocity is one of the most important characteristics of the motion of a point particle. Consider a time interval from $t_1$ to $t_2 = t_1 + \Delta t$. Suppose the particle $A$ moves from $\vec{r}_1$ to $\vec{r}_2$ during this interval. The average velocity of the particle $A$ in the time interval $\Delta t = t_2 - t_1$ is

$$\vec{v}_{\rm avg} = \frac{\Delta\vec{r}}{\Delta t} = \frac{\vec{r}_2 - \vec{r}_1}{\Delta t}. \tag{1.1.10}$$

The unit of the average velocity is m/s. Given the average velocity, we can compute the displacement as

$$\Delta\vec{r} = \vec{v}_{\rm avg}\,\Delta t, \tag{1.1.11}$$

and the distance as

$$d = |\vec{v}_{\rm avg}|\,\Delta t. \tag{1.1.12}$$

We can see that the average velocity $\vec{v}_{\rm avg}$ is associated with the time period $\Delta t$, and only depends on the initial and final positions of the particle. However, one usually would like to know the velocity of an object at a particular instant in time. We define the instantaneous velocity by the limit

$$\vec{v}(t) = \lim_{\Delta t \to 0} \frac{\vec{r}(t + \Delta t) - \vec{r}(t)}{\Delta t}. \tag{1.1.13}$$

In terms of the components $\vec{v} = (v_x, v_y)$, we have

$$v_x(t) = \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t}, \qquad v_y(t) = \lim_{\Delta t \to 0} \frac{y(t + \Delta t) - y(t)}{\Delta t}. \tag{1.1.14}$$

The operation we performed on the coordinate $x(t)$ in (1.1.14) to get the instantaneous velocity $v_x(t)$ is an example of a derivative. The derivative of a function $f(x)$ with respect to the variable $x$, denoted by $\frac{df(x)}{dx}$, is defined by the limit

$$\frac{df(x)}{dx} = \lim_{\epsilon \to 0} \frac{f(x + \epsilon) - f(x)}{\epsilon}. \tag{1.1.15}$$

We have

$$\vec{v}(t) = \frac{d\vec{r}(t)}{dt}. \tag{1.1.16}$$

We will not give the mathematical definition of the limit “lim” in (1.1.13). Instead, let us try to understand the meaning of the instantaneous velocity by looking at the following example. Consider a point particle moving along the $x$-direction, and suppose we want to measure its instantaneous velocity at $t = 2.7$ s. To this end, we measure the position of this point particle at a sequence of time instants. The results of the measurements are listed in Table 1.1.

  t (s)      x (m)
  2.7         0.968583
  2.7001      0.967734
  2.701       0.959588
  2.71        0.830246
  2.8        −0.844328

Table 1.1: The data from measuring the positions of a point particle.

Now, we can approximate the instantaneous velocity by the average velocity of the particle in the time interval [2.7 s, 2.8 s], and obtain^2

$$v(2.7\,{\rm s}) \approx v_{\rm avg} = \frac{-0.844328 - 0.968583}{2.8 - 2.7}\ {\rm m/s} = -18.129\ {\rm m/s}. \tag{1.1.17}$$

^2 We drop the subscript $x$ when considering particles moving in one dimension.

We can get better approximations of the instantaneous velocity by using the smaller time intervals [2.7 s, 2.71 s], [2.7 s, 2.701 s], and [2.7 s, 2.7001 s]. The average velocities associated with these time intervals are

$$v(2.7\,{\rm s}) \approx v_{\rm avg} = \begin{cases} \dfrac{0.830246 - 0.968583}{2.71 - 2.7}\ {\rm m/s} = -13.834\ {\rm m/s} & \text{for interval } [2.7\,{\rm s},\, 2.71\,{\rm s}], \\[2mm] \dfrac{0.959588 - 0.968583}{2.701 - 2.7}\ {\rm m/s} = -8.996\ {\rm m/s} & \text{for interval } [2.7\,{\rm s},\, 2.701\,{\rm s}], \\[2mm] \dfrac{0.967734 - 0.968583}{2.7001 - 2.7}\ {\rm m/s} = -8.494\ {\rm m/s} & \text{for interval } [2.7\,{\rm s},\, 2.7001\,{\rm s}]. \end{cases} \tag{1.1.18}$$

When we use smaller and smaller intervals, we obtain better and better approximations of the instantaneous velocity. The equation (1.1.13) means that the instantaneous velocity equals the average velocity associated with an interval of infinitesimal length $\epsilon$. Of course, an “interval of infinitesimal length” can only be achieved as an idealization; in practice, one can only obtain approximations of the instantaneous velocity, but the approximation improves as the interval shrinks. In fact, the data in Table 1.1 come from the particle trajectory $x(t) = \sin(2\pi t^2)$ (with $t$ in seconds and $x$ in meters), for which the instantaneous velocity is $v(2.7\,{\rm s}) = -8.43785$ m/s.

We can also understand the process of getting better and better approximations of the instantaneous velocity pictorially. In Figure 1.2, we draw the motion of a point particle in the $t$-$x$ plane. The instantaneous velocity at $t = t_0$ can be computed by the limit

$$v(t_0) = \lim_{n \to \infty} \frac{x_n - x_0}{t_n - t_0}, \tag{1.1.19}$$

where $x_n = x(t_n)$ and $t_n$ approaches $t_0$ as $n$ goes to infinity. From the figure, we can also see that the instantaneous velocity is the slope of the trajectory in the $t$-$x$ plane.

Figure 1.2: The trajectory of a point particle on the t-x plane.

Given the instantaneous velocity $\vec{v}(t)$, one can integrate it to get the displacement of the particle,

$$\Delta\vec{r}(t_2, t_1) = \vec{r}(t_2) - \vec{r}(t_1) = \int_{t_1}^{t_2} \vec{v}(t)\,dt. \tag{1.1.20}$$

In a very similar fashion, we define the instantaneous acceleration as

$$\vec{a}(t) = \lim_{\Delta t \to 0} \frac{\vec{v}(t + \Delta t) - \vec{v}(t)}{\Delta t} = \frac{d\vec{v}(t)}{dt} = \frac{d^2\vec{r}(t)}{dt^2}. \tag{1.1.21}$$

1.1.3 Inertial frame and Galilean transformation

In Section 1.1.1, we learned that to describe the motion of a point particle $A$ we first need to pick a reference frame, and we chose the reference frame associated with the point particle $O$. ($O$ is always at the origin.) Our description of the motion of $A$ depends heavily on the reference frame, namely on the motion of $O$. If we instead choose a reference frame associated with a different point particle $B$, then our description of the motion of $A$ could be completely different. This is of course not satisfactory. Hence, we would like to have a more universal way to describe the motion of $A$. This leads to the following two definitions.

Definition 6 (Free particle). A free particle is a point particle that receives no influence from any other objects.

Definition 7 (Inertial frame). An inertial frame is a frame that is referenced to a free particle.

Let us try to understand the above definitions using an analogy. Say you have a case in court. Normally, you would prefer the jury of your case to be just, i.e. not influenced by any people outside the jury. The point particle of our reference frame is like the jury, and a free particle is like the just jury that we are seeking. Therefore, we would always like to use inertial frames to describe the motion of objects.

Inertial frames are not unique. Every free particle defines an inertial frame. We would like to have a way to compare descriptions of the same motion in different inertial frames. Consider two free particles $O$, $O'$, and a point particle $A$. The reference frames of $O$ and $O'$ are both inertial frames. As we will see in the next section, Newton’s first law implies that a free particle in an inertial frame has a constant velocity.
Let us denote the constant velocity of $O'$ in the inertial frame of $O$ by $\vec{v}$. The motion of the point particle $A$ in the inertial frame of $O$ is given by the position vector $\vec{r}$, and in the inertial frame of $O'$ by the position vector $\vec{r}'$. The position vectors $\vec{r}$ and $\vec{r}'$ are related by

$$\vec{r}' = \vec{r} - \vec{v} t. \tag{1.1.22}$$

The above relation between the coordinates of the two inertial frames is called the Galilean transformation.

There are other, more basic coordinate transformations. First, when the two free particles $O$ and $O'$ are separated by a displacement vector $\vec{d}$ but have no relative velocity, the position vectors $\vec{r}$ and $\vec{r}'$ are related by

$$\vec{r}' = \vec{r} + \vec{d}. \tag{1.1.23}$$

When the clocks of the two inertial frames differ by a time $s$, the time coordinates $t$ and $t'$ are related by

$$t' = t + s. \tag{1.1.24}$$

The transformation from $(t, \vec{r})$ to $(t', \vec{r}')$ is called a translation. Next, we can consider the transformation that keeps the inertial frame of $O$ but rotates the coordinates, i.e.

$$\vec{r}' = R \cdot \vec{r}, \tag{1.1.25}$$

where $R$ is an orthogonal matrix ($R R^T = I$). This transformation is called a rotation. The Galilean group is the group that contains all compositions of Galilean transformations, translations, and rotations.

Exercise (Galilean group).
1. Work out the transformation rule of a generic element in the Galilean group.
2. Derive the composition rule of two generic elements.
3. Show that the Galilean group is a group, i.e. it obeys all the axioms of a group.

1.1.4 Linear Motions, Circular Motion, Parabolic Motion

We will introduce three simple motions of a point particle. We will use Newton’s notation for differentiation,

$$\dot{x}(t) = \frac{dx(t)}{dt}, \qquad \ddot{x}(t) = \frac{d^2 x(t)}{dt^2}. \tag{1.1.26}$$

Linear motions: A linear motion is a motion of a point particle along a straight line. Let $x$ be the coordinate along the straight line. A linear motion can be described by the function $x(t)$. Let us give two examples of linear motions.

1. Constant velocity: The linear motion of a particle with a constant velocity $v$ is given by

$$x(t) = x_0 + v t. \tag{1.1.27}$$

We compute the instantaneous velocity and acceleration by taking the first and second derivatives,

$$\dot{x}(t) = v, \qquad \ddot{x}(t) = 0. \tag{1.1.28}$$

We have verified that the particle is moving at a constant velocity $v$ without acceleration.

2. Constant acceleration: Neglecting the influence (contact or non-contact) of the air and of all other objects except Earth, every object near Earth’s surface has the same constant downward acceleration, called the free-fall acceleration and denoted by $g$. We will learn in later sections that this acceleration is caused by the gravitational attraction between the object and Earth. For now, we just give its value,

$$g = 9.8\ {\rm m/s^2}. \tag{1.1.29}$$

The motion of a small ball with the free-fall acceleration is given by^3

$$y(t) = y_0 + v t - \frac{1}{2} g t^2. \tag{1.1.30}$$

^3 We changed our coordinate from $x$ to $y$ for the vertical direction.

The first and second derivatives are

$$\dot{y}(t) = v - g t, \qquad \ddot{y}(t) = -g. \tag{1.1.31}$$

We have verified that the particle is moving with a constant acceleration $-g$; the minus sign appears because the acceleration points downward. Let us look at the position and velocity of the particle at three important time instants. First, at the time $t = 0$, the position and velocity of the ball are $y = y_0$ and $\dot{y} = v$. We assume that the velocity $v$ is positive. At $t = \frac{v}{g}$, the ball reaches the maximum height $y = y_0 + \frac{v^2}{2g}$ and has zero velocity. At $t = \frac{2v}{g}$, the ball is back at the starting height $y = y_0$ with the opposite velocity $\dot{y} = -v$. The trajectory of the ball in the $t$-$y$ plane is drawn in Figure 1.3. It is a parabola.

Figure 1.3: Trajectory of a free-falling object near Earth’s surface on the t-y plane.

Circular motion: The motion of a point particle along a circle in two dimensions is called a circular motion, as shown in Figure 1.4.
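Before working this out analytically, a small numerical sketch can show what to expect in the simplest case, a constant angular velocity. It samples the position $(r\cos\omega t,\, r\sin\omega t)$ on the circle and estimates the velocity and acceleration by finite differences, mimicking the limits (1.1.13) and (1.1.21). The function names, the step size `h`, and the sample values of `radius` and `omega` are illustrative assumptions, not part of the notes.

```python
import math

def position(t, radius=2.0, omega=3.0):
    """Position (x, y) on a circle of the given radius, at constant angular velocity omega."""
    return (radius * math.cos(omega * t), radius * math.sin(omega * t))

def first_difference(f, t, h):
    """Estimate df/dt componentwise with a central difference of half-width h."""
    ahead, behind = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

def second_difference(f, t, h):
    """Estimate d^2 f/dt^2 componentwise with a central second difference."""
    ahead, here, behind = f(t + h), f(t), f(t - h)
    return tuple((a - 2 * c + b) / h**2 for a, c, b in zip(ahead, here, behind))

radius, omega = 2.0, 3.0   # sample values (assumed), matching the defaults above
t, h = 0.4, 1e-3           # sample instant and finite-difference step (assumed)

r_vec = position(t)
v_vec = first_difference(position, t, h)
a_vec = second_difference(position, t, h)

speed = math.hypot(*v_vec)   # norm of the velocity estimate
accel = math.hypot(*a_vec)   # norm of the acceleration estimate

print("speed estimate:", speed, " r*omega:", radius * omega)
print("acceleration estimate:", accel, " r*omega^2:", radius * omega**2)
print("a + omega^2 r (should be close to zero):",
      tuple(a + omega**2 * r for a, r in zip(a_vec, r_vec)))
```

For these sample values the estimated speed agrees with $r\omega$ and the estimated acceleration with $r\omega^2$ up to the finite-difference error, and the acceleration vector points opposite to the position vector, that is, towards the centre of the circle.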
To describe a circular motion, it is convenient to change from the Cartesian coordinates $(x, y)$ to the polar coordinates $(r, \theta)$ with the coordinate transformation

$$x = r\cos\theta, \qquad y = r\sin\theta. \tag{1.1.32}$$

The inverse transformation is

$$r = \sqrt{x^2 + y^2} = |\vec{r}|, \qquad \theta = \arctan\frac{y}{x}. \tag{1.1.33}$$

Figure 1.4: The particle is confined on a circle for the circular motion.

For circular motions, the radius $r$ is a constant, and the angle $\theta$ is a function of the time $t$, i.e. the function $\theta(t)$ describes a circular motion. The first derivative of $\theta$ is called the angular velocity,

$$\frac{d\theta(t)}{dt} = \omega(t). \tag{1.1.34}$$

Let us consider a circular motion with a constant angular velocity $\omega$, described by

$$\theta(t) = \theta_0 + \omega t. \tag{1.1.35}$$

Using the coordinate transformation (1.1.32) and taking $\theta_0 = 0$ for simplicity, we find

$$\vec{r} = (r\cos(\omega t),\; r\sin(\omega t)). \tag{1.1.36}$$

The velocity and acceleration are

$$\vec{v} = \frac{d\vec{r}}{dt} = (-r\omega\sin(\omega t),\; r\omega\cos(\omega t)), \qquad \vec{a} = \frac{d^2\vec{r}}{dt^2} = (-r\omega^2\cos(\omega t),\; -r\omega^2\sin(\omega t)). \tag{1.1.37}$$

We see that the acceleration vector $\vec{a}$ is proportional to the position vector $\vec{r}$ and points towards the origin, i.e.

$$\vec{a} = -\omega^2\vec{r}. \tag{1.1.38}$$

This acceleration is called the centripetal acceleration, and it is a very important characteristic of circular motions. Taking the norms of the velocity and the acceleration, we find

$$|\vec{v}| = r\omega, \qquad |\vec{a}| = r\omega^2, \tag{1.1.39}$$

which gives the relation

$$|\vec{a}| = \frac{|\vec{v}|^2}{r}. \tag{1.1.40}$$

The period of the circular motion is

$$T = \frac{2\pi}{\omega} = \frac{2\pi r}{|\vec{v}|}. \tag{1.1.41}$$

Parabolic Motion: Consider a point particle near Earth’s surface. It has the free-fall acceleration in the vertical direction and no acceleration in the horizontal direction. However, the motion along the horizontal direction is not completely trivial, since there can be a non-zero constant horizontal velocity. The most general such motion is described by

$$x(t) = x_0 + v_x t, \qquad y(t) = y_0 + v_y t - \frac{1}{2} g t^2. \tag{1.1.42}$$

The first and second derivatives are

$$\dot{x}(t) = v_x, \qquad \dot{y}(t) = v_y - g t, \qquad \ddot{x}(t) = 0, \qquad \ddot{y}(t) = -g. \tag{1.1.43}$$

We have verified that there is a constant horizontal velocity and a constant vertical acceleration. We can eliminate the time $t$ from the equations (1.1.42) by substituting $t = (x - x_0)/v_x$ (assuming $v_x \neq 0$), and find

$$y - y_0 = \frac{v_y}{v_x}(x - x_0) - \frac{1}{2} g \left(\frac{x - x_0}{v_x}\right)^2, \tag{1.1.44}$$

which is a parabola in the $x$-$y$ plane.

Let us again look at some special time instants. First, at $t = 0$, the initial position and velocity are $(x, y) = (x_0, y_0)$ and $(\dot{x}, \dot{y}) = (v_x, v_y)$. At $t = \frac{v_y}{g}$, the ball reaches the maximum height at the position $(x, y) = \left(x_0 + \frac{v_x v_y}{g},\; y_0 + \frac{v_y^2}{2g}\right)$, and the velocity points in the $x$-direction, $(\dot{x}, \dot{y}) = (v_x, 0)$. At $t = \frac{2 v_y}{g}$, the ball is back at the initial height at $(x, y) = \left(x_0 + \frac{2 v_x v_y}{g},\; y_0\right)$ with the velocity $(v_x, -v_y)$.

1.2 Newton’s Laws

1.2.1 Newtonian mechanics

Classical mechanics is a physical theory describing the motion, including the accelerations, of macroscopic objects, and what can cause an object to accelerate. That cause is called a force, which is, loosely speaking, a push or pull on the object. The force is said to act on the object to change its velocity. In the next section, we will see some examples of forces in nature. In classical mechanics, the relation between a force and the acceleration it causes is fully captured by Newton’s celebrated three laws of motion, which are the subject of this section. The study of that relation, as Newton presented it, is called Newtonian mechanics.

However, we remark that Newtonian mechanics does not apply to all situations. On the one hand, it is only a low-speed approximation of Einstein’s special theory of relativity. If the speeds of the objects become an appreciable fraction of the speed of light, Newtonian mechanics fails. On the other hand, if the interacting bodies are on the scale of atomic structure, then Newtonian mechanics should be replaced by the more sophisticated quantum mechanics.
Although physicists now view Newtonian mechanics as a special case of these two deeper theories, it is a very useful and important special case because it applies to the motion of objects ranging in size from the very small (almost on the scale of atomic structure) to the astronomical (galaxies and clusters of galaxies).

It is also worth mentioning that even though Newton’s laws underlying classical mechanics are very simple, many phenomena in classical mechanics still remain mysterious to physicists and mathematicians. Studies of these phenomena have led to many profound mathematical theories. One famous example is the Millennium Prize Problem on the Navier–Stokes equations, which lie within the regime of classical mechanics and have been one of the central problems in the study of PDEs. Another example is chaos in classical mechanics (you may know the three-body problem), which has had a far-reaching impact on the development of the theory of dynamical systems.

1.2.2 Newton’s first law

Before Newton, Aristotle proposed that some force is needed to keep a body moving at constant velocity. A body was thought to be in its “natural state” when it was at rest, and for a body to move with constant velocity, it seemed that one had to push or pull it in some way. Otherwise, the body would “naturally” stop moving. However:

Example (Galileo’s thought experiment). Galileo considered a body sliding on inclined planes in the absence of friction. The speed acquired by a body moving down a plane from a height, say $h$, is sufficient to enable it to reach the same height when climbing up another plane at a different inclination, say $\theta$. As $\theta$ decreases, the body must travel a greater and greater distance to regain that height. Galileo proposed that the body could travel indefinitely far as $\theta \to 0$, contrary to the Aristotelian notion of the natural tendency of an object to remain at rest unless acted upon by an external force.
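The limit $\theta \to 0$ in Galileo’s argument can be made quantitative with a short sketch. It relies on two standard results that are assumptions at this point of the notes: a body on a frictionless plane inclined at angle $\theta$ has an acceleration of magnitude $g\sin\theta$ along the plane, and constant-acceleration motion that starts from or ends at rest satisfies $v^2 = 2ad$ (this follows from (1.1.30) and (1.1.31)). Under these assumptions the distance travelled up the second plane is $d = h/\sin\theta$. The function name `climb_distance` and the sample angles are hypothetical choices for illustration.

```python
import math

G = 9.8  # free-fall acceleration in m/s^2, the value quoted in (1.1.29)

def climb_distance(height, theta):
    """Distance (m) travelled up a frictionless plane inclined at angle theta (radians)
    by a body that first slid down, from rest, through a vertical drop `height` (m)."""
    # Sliding down: acceleration g*sin(alpha) over a length height/sin(alpha)
    # gives v^2 = 2*g*height, independent of the first plane's inclination alpha.
    speed_squared = 2 * G * height
    # Climbing up: constant deceleration g*sin(theta) until the body stops.
    deceleration = G * math.sin(theta)
    return speed_squared / (2 * deceleration)

height = 1.0  # sample starting height in meters (assumed)
for degrees in (60, 30, 10, 1, 0.1):
    theta = math.radians(degrees)
    distance = climb_distance(height, theta)
    regained = distance * math.sin(theta)  # vertical height reached on the second plane
    print(f"theta = {degrees:5.1f} deg   distance = {distance:9.2f} m   "
          f"height regained = {regained:.3f} m")
```

The height regained stays equal to the starting height for every inclination, while the distance travelled grows without bound as the inclination decreases, which is the content of Galileo’s argument.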
For this contribution, Galileo is credited with introducing the concept of inertia. Newton later adopted it as his first law of motion.

Physics law 1 (Newton’s first law/the law of inertia). In an inertial frame, if no force acts on a body, the body’s velocity cannot change; that is, the body cannot accelerate.

In other words, if a free body is at rest, it stays at rest. If it is moving, it continues to move with the same velocity (same magnitude and same direction). Note that Newton’s first law does not hold in non-inertial frames. For example, in the frame of a bus accelerating from rest, a ball will accelerate backward even if no force acts on it. Hence, we can give another definition of the inertial frame: an inertial frame is a frame in which Newton’s first law holds.

1.2.3 Newton’s second law

Newton’s second law is perhaps the greatest law in physics, and it has deeply affected physics and even human history ever since Newton. Before introducing Newton’s second law, we first need to make some preparations.

Force is a vector quantity, i.e., it has not only a magnitude but also a direction. So, if two or more forces act on a body, we find the net force by adding them as vectors. This leads to the principle of superposition (or decomposition) for forces: a single force that has the same magnitude and direction as the calculated net force has the same effect as all the individual forces together. In these notes, we will often use $\vec{F}$ to represent a force. There may be multiple forces acting on a body, but if their net force is zero, the body cannot accelerate. So, a more precise statement of Newton’s first law is:

Physics law 2 (Newton’s first law). In an inertial frame, if no net force acts on a body, the body’s velocity cannot change; that is, the body cannot accelerate.

Mass is a quantitative measure of inertia. From everyday experience, we know that, under the same force, the object with the larger mass is accelerated less.
With careful experiments, people find that the acceleration is in fact inversely proportional to the mass (rather than, say, to the square of the mass). In other words, when the same force $F$ is applied to two bodies of masses $m_1$ and $m_2$, their accelerations $a_1$ and $a_2$ satisfy

$$\frac{a_1}{a_2} = \frac{m_2}{m_1}.$$

This suggests that $m_1 a_1 = m_2 a_2 = C F$ for a universal constant $C$ that does not depend on any physical quantities of the bodies. By choosing proper physical units, we can set $C = 1$, which leads to Newton’s second law.

Physics law 3 (Newton’s second law). In an inertial frame, the net force on a body is equal to the product of the body’s mass and its acceleration.

In other words, if a net force $\vec{F}$ acts on a body of mass $m$, then the resulting acceleration satisfies

$$\vec{F} = m\vec{a}.$$

Note that Newton’s first law is the special case of the second law with $\vec{F} = 0$ and $\vec{a} = 0$.

We now check that Newton’s second law is “invariant” under Galilean transformations:

$$\vec{x} \text{ in inertial frame } O \;\to\; \vec{x}' = \vec{x} - \vec{v} t \text{ in inertial frame } O',$$

where $\vec{v}$ is a constant velocity vector. Since $\vec{v} t$ is linear in $t$, its second time derivative vanishes, so in $O'$ we have

$$\vec{a}' = \frac{d^2\vec{x}'}{dt^2} = \frac{d^2\vec{x}}{dt^2} = \vec{a}.$$

Hence, in the new frame $O'$, we still have $\vec{F} = m\vec{a}'$.

The standard units of mass and acceleration are kg and m/s$^2$. Then, the unit of force is kg · m/s$^2$, which is called the “newton”, denoted by N. Hence, a 1 N force acting on a body of 1 kg mass leads to an acceleration of 1 m/s$^2$.

Remark. Mass is an intrinsic characteristic of a body. However, what, exactly, is mass? This turns out to be a much deeper question than it may look. In everyday language, mass is often confused with weight, but this is wrong. In classical mechanics, by Newton’s second law, we can only say that the mass of a body is the characteristic that relates a force on the body to the resulting acceleration. There is no more familiar definition: we can have a physical sensation of mass only when we try to accelerate a body.
The weight of a body is actually the gravitational force acting on the body, from which we can measure the mass by observing how the body accelerates under gravitation. In more advanced physics (such as relativity and quantum field theory), people have a much deeper understanding of “mass”.

1.2.4 Newton’s third law

Two bodies are said to interact when they push or pull on each other, that is, when a force acts on each body due to the other body. Newton’s third law states that the action force and the reaction force are equal in magnitude and opposite in direction.

Physics law 4 (Newton’s third law/Action-reaction law). In an inertial frame, when two bodies interact, the forces on the bodies from each other are always equal in magnitude and opposite in direction. In other words, let $A$ and $B$ be two interacting bodies, let $\vec F_{AB}$ be the force acting on $B$ from $A$, and let $\vec F_{BA}$ be the force acting on $A$ from $B$. Then, we have
\[ \vec F_{AB} = -\vec F_{BA}. \]

If we say that Newton’s first law is about setting up inertial frames and the second law tells us how an object changes its motion when a force acts on it, then Newton’s third law is purely a description of the nature of forces. In Feynman’s lectures, he said:

In our discussion of Newton’s laws, it was explained that these laws are a kind of program that says “Pay attention to the forces,” and that Newton told us only two things about the nature of forces. In the case of gravitation, he gave us the complete law of the force. In the case of the very complicated forces between atoms, he was not aware of the right laws for the forces; however, he discovered one rule, one general property of forces, which is expressed in his Third Law, and that is the total knowledge that Newton had about the nature of forces---the law of gravitation and this principle, but no other details.

Nowadays, we know there are four fundamental interactions: the weak, strong, electromagnetic, and gravitational forces. We will discuss Newton’s law of gravitation in Section 1.6, and the electromagnetic force in Chapter 2.
In applications, we will also often consider phenomenological forces (i.e., effective forces that arise from the fundamental interactions), such as friction, pressure, buoyancy, elastic force, damping force, and so on.

1.2.5 Applying Newton’s laws

In principle, with Newton’s second law, we can predict the motion of a given object at any future time. Denote the position of the object by $\vec x$. In physics, most forces depend only on the position and velocity (but not the acceleration) of the object. Then, Newton’s second law can be written as a second-order differential equation for $\vec x$:
\[ m\ddot{\vec x} = \vec F(\vec x, \dot{\vec x}), \]
where we used Newton’s notation for differentiation: $\dot{\vec x} = d\vec x/dt$. To get a solution, we need to solve the equation and fix the two integration constants using the initial conditions for $\vec x(0)$ and $\vec v(0) = \dot{\vec x}(0)$.

In general, for a system of multiple objects, we can write down a differential equation for each object. Thus, to predict the evolution of the system, we need to solve the system of differential equations with the given initial conditions. In this note, we will look at some examples where solving the given differential equation is possible and simple. In reality, however, such simple scenarios are rare and very few problems can be solved exactly by pure analysis.

Every body on Earth is pulled toward the ground by a gravitational force that is proportional to its mass $m$:
\[ \vec F_g = m\vec g, \]
where $\vec g$ is the gravitational acceleration. Near the ground, $\vec g$ can be treated as a constant vector with a magnitude of approximately 9.8 m/s$^2$, pointing downward. Free fall is any motion of a body where gravity is the only force acting upon it. If we release a body from rest, then its free fall is a linear motion described by (1.1.30) with $a = -g$ (where the positive direction is upward) and initial velocity $v = 0$. If the body has a horizontal velocity, then its free fall is described by the parabolic motion (1.1.42).

Example.
A box starts from rest at height $h_0$ and slides down a frictionless plane inclined at angle $\theta$. How long does it take to reach the ground? Suppose the mass of the box is $m$. What is the normal force on the box?

Solution: We can set up the axes such that the x-axis points downward along the inclined plane, the y-axis points upward along the normal direction of the inclined plane, and the origin is at the position of the box at $t = 0$.

Figure 1.5: Forces on the box on the inclined plane.

Then, the gravitational force on the box can be written as $\vec F_g = (mg\sin\theta, -mg\cos\theta)$. On the other hand, the normal force is $\vec F_N = (0, F_N)$. Since the box slides along the inclined plane, its acceleration can be written as $\vec a = (a, 0)$. By Newton’s second law,
\[ \vec F_g + \vec F_N = m\vec a. \]
Along the normal direction, it gives the normal force $F_N = mg\cos\theta$. Along the x-axis, the above equation gives $a = g\sin\theta$. Then, by (1.1.30), the position of the box $x(t)$ is given by
\[ x(t) = \frac{1}{2} g\sin\theta \cdot t^2. \]
When the box reaches the ground, we have $x(t) = h_0/\sin\theta$. Solving for $t$, we get
\[ t = \sqrt{\frac{2h_0}{g\sin^2\theta}}. \]

Recall that for a point particle in circular motion, its acceleration points toward the rotation center and has magnitude $v^2/r = \omega^2 r$, where $r$ is the radius of the circular motion, $v$ is the speed of the particle, and $\omega = v/r$ is the angular speed. We now look at the following example:

Example. In a double-star system, two stars of masses $m_1$ and $m_2$ are rotating around a center with angular speed $\omega$. The gravitational force between them is given by $Gm_1m_2/r^2$, where $G$ is the gravitational constant and $r$ is the distance between the two stars. Given $\omega$, determine $r$ and the position of the rotation center.

Solution: Suppose the distances from the two stars to the rotation center are $r_1$ and $r_2 = r - r_1$, respectively. The accelerations of the two stars point toward the rotation center and are equal to
\[ a_1 = \omega^2 r_1, \qquad a_2 = \omega^2 r_2, \]
in magnitude.
Then, by Newton’s second law, we have
\[ \frac{Gm_1m_2}{r^2} = m_1\omega^2 r_1 = m_2\omega^2 r_2. \tag{1.2.1} \]
From this equation, we get $m_1 r_1 = m_2 r_2 = m_2(r - r_1)$, which gives
\[ r_1 = \frac{m_2}{m_1 + m_2}\, r, \qquad r_2 = \frac{m_1}{m_1 + m_2}\, r. \tag{1.2.2} \]
Plugging it into Equation (1.2.1), we can solve for $r$:
\[ r = \left( \frac{G(m_1 + m_2)}{\omega^2} \right)^{1/3}. \tag{1.2.3} \]

Remark. Note that the rotation center is in fact the system’s center of mass. In general, for a system of $n$ bodies of masses $m_1, \ldots, m_n$ at positions $\vec x_1, \ldots, \vec x_n$, the center of mass of this system is a vector $\vec x$ such that
\[ (m_1 + \cdots + m_n)\,\vec x = m_1\vec x_1 + \cdots + m_n\vec x_n. \]

Next, we look at some examples where solving second-order differential equations is necessary.

Example. Consider a ball falling in some liquid. The liquid exerts a damping force $F_d = bv$ that is proportional to the speed $v$ of the ball and points opposite to the velocity, where $b$ is the damping constant. Suppose the ball is released from rest and has mass $m$. Determine its motion as a function of $t$ and the velocity of the ball when $t \to \infty$.

Solution: Let the positive x-axis be the downward direction and let the origin be the starting position of the ball. By Newton’s second law, we need to solve the following equation
\[ m\ddot x = mg - b\dot x, \]
with initial conditions $x(0) = 0$ and $\dot x(0) = 0$. Denoting $y = \dot x$, the above equation is written as
\[ m\dot y = mg - by, \]
which can be solved as
\[ \dot x(t) = y(t) = Ce^{-\frac{b}{m}t} + \frac{mg}{b}. \]
With the initial condition, we get $C = -mg/b$. Integrating $\dot x$, we get
\[ x(t) = x(0) + \int_0^t \dot x(\tau)\,d\tau = \frac{mg}{b}\,t + \frac{m^2 g}{b^2}\left( e^{-\frac{b}{m}t} - 1 \right). \]
Note that when $t \to \infty$, $\dot x(t) \to mg/b$, in which case $F_d = mg$. This means the gravitational force and the damping force eventually balance, and the velocity of the ball no longer changes.

Example. Consider a box of mass $m$ lying on a frictionless plane. It is connected to a spring.
The spring force satisfies Hooke’s law $\vec F = -k\vec x$, where $\vec x$ is the displacement of the spring’s free end from its position when the spring is in its relaxed state (which is chosen as the origin), and $k$ is the spring constant. At time $t = 0$, the box is released from rest at position $\vec x_0$. Determine the position $\vec x(t)$ of the box at any time $t$.

Solution: Without loss of generality, suppose $\vec x_0$ is along the x-axis. By Newton’s second law, we need to solve the following equation
\[ m\ddot x = -kx, \]
with initial conditions $x(0) = x_0$ and $\dot x(0) = 0$. We can rewrite the above equation as
\[ \ddot x = -\omega^2 x, \qquad \omega := \sqrt{\frac{k}{m}}. \]
We know the above equation has the general solution
\[ x(t) = C_1\cos(\omega t) + C_2\sin(\omega t). \tag{1.2.4} \]
From the initial conditions, we can determine that $C_1 = x_0$ and $C_2 = 0$. Hence, the solution is $\vec x(t) = x_0\cos(\omega t)\,\vec i$, where $\vec i$ is the basis unit vector along the x-axis.

Remark. The above system is called a harmonic oscillator. It is one of the most important examples in almost every field of physics. We will discuss this system in the context of classical mechanics in more detail in Section 1.5. You will see it again and again in more advanced physics courses (in particular, it will be one of the first few examples when you study quantum mechanics).

Finally, we look at a two-body problem. We will show that it can be reduced to solving two one-body problems.

Example. Consider two boxes of masses $m_1$ and $m_2$. They are connected by a spring, which has length $\ell$ when it is in its relaxed state and spring constant $k$. At time $t = 0$, the two boxes have the same velocity $\vec v$ along the direction of the spring, and the spring is compressed to length $\ell/2$. Determine the positions $\vec x_1(t)$ and $\vec x_2(t)$ of the two boxes at any time $t$.

Solution: Without loss of generality, we choose the x-axis such that the spring is along the x-direction, and
\[ x_1(0) = -\frac{m_2}{m_1 + m_2}\,\frac{\ell}{2}, \qquad x_2(0) = \frac{m_1}{m_1 + m_2}\,\frac{\ell}{2}. \]
(They are chosen such that the center of mass is at the origin.)
By Newton’s second law, we can write down the following system of equations:
\[ m_1\ddot x_1 = k(x_2 - x_1 - \ell), \qquad m_2\ddot x_2 = -k(x_2 - x_1 - \ell), \tag{1.2.5} \]
with initial conditions
\[ x_1(0) = -\frac{m_2}{m_1 + m_2}\,\frac{\ell}{2}, \quad \dot x_1(0) = v, \quad x_2(0) = \frac{m_1}{m_1 + m_2}\,\frac{\ell}{2}, \quad \dot x_2(0) = v. \]
To solve (1.2.5), we introduce
\[ y_1 := \frac{m_1 x_1 + m_2 x_2}{m_1 + m_2}, \qquad y_2 := x_2 - x_1. \]
Note that $y_1$ is the center of mass of the two boxes, while $y_2$ describes their relative position. Then, from the two equations in (1.2.5), we get
\[ \ddot y_1 = 0, \qquad \ddot y_2 = -\omega^2 (y_2 - \ell), \qquad \text{with } \omega = \sqrt{\frac{k}{m_1} + \frac{k}{m_2}}, \tag{1.2.6} \]
with initial conditions
\[ y_1(0) = 0, \quad \dot y_1(0) = v, \quad y_2(0) = \ell/2, \quad \dot y_2(0) = 0. \tag{1.2.7} \]
It is easy to see that the two differential equations in (1.2.6) have the general solutions
\[ y_1(t) = C_1 t + C_2, \qquad y_2(t) = \ell + C_3\cos(\omega t) + C_4\sin(\omega t). \]
From the initial conditions (1.2.7), we can determine that $C_1 = v$, $C_2 = 0$, $C_3 = -\ell/2$, $C_4 = 0$. Hence, we get the solution
\[ y_1(t) = vt, \qquad y_2(t) = \ell - \frac{\ell}{2}\cos(\omega t), \]
from which we can solve for $x_1(t)$ and $x_2(t)$.

Remark. In physics, the interactions between two bodies only depend on their relative positions (which is due to the translational symmetry of physical laws). Hence, the two-body problem in classical mechanics can always be reduced to two one-body problems, which are much easier to solve. For this reason, people say that two-body problems are exactly solvable. However, this is no longer true for problems with three or more bodies. In particular, it is well known that for some initial conditions, the three-body motion can exhibit chaotic behavior. In general, with more and more bodies, the problem becomes harder and harder. However, when there is a very large number of particles, the whole system exhibits collective behaviors that are predictable. This will be the focus of statistical mechanics.

1.3 Forces in Nature, Statics

1.3.1 The four fundamental forces in nature

Our universe is governed by four fundamental forces that play a crucial role in how all objects interact with one another.
These four forces are
• gravitational force
• electromagnetic force
• weak nuclear force
• strong nuclear force.

Each of these forces plays a unique role in the workings of the universe, from the behavior of objects on a planetary scale to the interactions of subatomic particles.

In quantum theory, the concept of forces as we know it in classical physics becomes somewhat ambiguous. Instead, what we call forces are usually interpreted as the exchange of particles between objects, which is more accurately described as an interaction. Therefore, it is more appropriate to refer to the “four fundamental interactions” rather than the “four fundamental forces”.

Here is a brief summary of these four fundamental forces:

1. Gravitational Force. Gravity is the force that governs the behavior of objects on a large or macroscopic scale. It is responsible for keeping planets in orbit around stars, and for causing apples and other objects to fall toward the ground. This force is described by Newton’s law of universal gravitation, which states that every particle in the universe is attracted to every other particle with a force that is proportional to the product of their masses and inversely proportional to the square of the distance between them:
\[ \vec F = -\frac{Gm_1m_2}{r^2}\,\hat r. \tag{1.3.1} \]
Compared with the other forces, gravity is very weak: the Newtonian constant of gravitation is $G \approx 6.67 \times 10^{-11}$ m$^3$/(kg·s$^2$). We will discuss Newton’s theory of gravity in detail in Section 1.6 of this lecture.

The best theory of gravity to date is Einstein’s theory of General Relativity. It describes gravity as the curvature of the spacetime manifold caused by the presence of massive objects, and masses move according to the curvature of the manifold. The basic equation of general relativity is Einstein’s equation
\[ R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}, \tag{1.3.2} \]
where the left-hand side describes the geometry of spacetime and the right-hand side describes the distribution of matter and energy. J. A.
Wheeler’s famous statement summarizes the core idea of general relativity as “Spacetime tells matter how to move; matter tells spacetime how to curve.”

2. Electromagnetic Force. The electromagnetic force unifies the electric force, which describes how electric charges attract or repel each other and interact with electric fields, and the magnetic force, which describes the interaction of moving charged objects and magnetic fields. For example, Coulomb’s law states that the force between two stationary, electrically charged particles is
\[ \vec F = \frac{kq_1q_2}{r^2}\,\hat r, \tag{1.3.3} \]
which is very similar to Newton’s law of universal gravitation, Eq. (1.3.1). Because all everyday objects are made of electrically charged particles, the electromagnetic force plays a significant role in many everyday phenomena, including the operation of electronic devices, the behavior of a compass, the colors we see, and the heat and light we feel from the sun. The second half of this lecture gives a more detailed and comprehensive discussion of electromagnetism.

3. Weak Nuclear Force. The weak interaction plays a critical role in governing the decay of unstable subatomic particles, initiating nuclear fusion reactions in stars like the Sun, and underlying some forms of radioactivity. It involves the exchange of force-carrier particles called W and Z particles, which are relatively heavy, with masses around 100 times that of a proton. Interestingly, the weak interaction is the only fundamental interaction known to break parity symmetry. The idea of parity violation was proposed in the mid-1950s by Chen-Ning Yang and Tsung-Dao Lee and later confirmed experimentally by Chien-Shiung Wu. The weak force violates parity symmetry, which means that it treats left-handed and right-handed particles differently.
As a result, the concepts of left-handedness and right-handedness have physical significance in the weak interaction, making them fundamentally different: nature is not left-right symmetric.

4. Strong Nuclear Force. The strong interaction is the force that binds protons and neutrons together in the nucleus of an atom. The strong interaction is responsible for the stability of atomic nuclei, and without it, the nucleus would quickly fall apart. The strong force is mediated by particles known as gluons, which are exchanged between quarks to hold them together. The strong force is very strong at very short distances, but it quickly decreases in strength at longer distances.

These four fundamental forces behave very differently. For example, their interaction ranges are different. The strong nuclear force has a very short range, only acting over distances of the order of a few femtometers ($10^{-15}$ meters). The weak nuclear force has a range of about $10^{-18}$ meters, while the electromagnetic force and gravity have infinite range. These differences originate from the masses of the mediating particles. As a result, in our everyday life, we can only observe and experience the electromagnetic force and the gravitational force easily.

Besides the interaction ranges, the strengths of these four fundamental forces are also different:

strong nuclear force > electromagnetic force > weak nuclear force > gravitational force.

A notable example that illustrates the relative strength of the electromagnetic force compared to the gravitational force is that we are able to stand on the ground, rather than being pulled directly to the center of the Earth by gravity. This is because the electromagnetic force between the atoms in our feet and the atoms in the ground (only atoms near the feet!) is strong enough to balance the gravitational attraction between us and the Earth (the whole Earth!).
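To put a number on the gap between the electromagnetic and gravitational strengths, one can compare Coulomb’s law (1.3.3) with Newton’s law (1.3.1) for a proton and an electron. Both forces scale as $1/r^2$, so the ratio does not depend on the distance. The following is only a rough sketch: the rounded values of the constants and the names `K_COULOMB`, `G_NEWTON`, `E_CHARGE`, `M_PROTON`, `M_ELECTRON`, and `force_ratio` are assumptions introduced here for illustration and do not come from the notes.

```python
# Rough comparison of the electric and gravitational attraction between
# a proton and an electron. The constants are rounded reference values
# (an assumption of this sketch, not taken from the lecture notes).
K_COULOMB = 8.988e9     # Coulomb constant k, in N m^2 / C^2
G_NEWTON = 6.674e-11    # gravitational constant G, in m^3 / (kg s^2)
E_CHARGE = 1.602e-19    # elementary charge, in C
M_PROTON = 1.673e-27    # proton mass, in kg
M_ELECTRON = 9.109e-31  # electron mass, in kg


def force_ratio():
    """Return |F_electric| / |F_gravity|; the 1/r^2 factors cancel."""
    electric = K_COULOMB * E_CHARGE**2
    gravity = G_NEWTON * M_PROTON * M_ELECTRON
    return electric / gravity


ratio = force_ratio()
print(f"F_electric / F_gravity ~ {ratio:.1e}")
```

With these inputs the ratio comes out at the order of $10^{39}$, which is why gravity only becomes noticeable when a very large, electrically neutral mass such as the whole Earth is involved.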
The history of physics is a history of unification, where theories are developed to explain multiple phenomena with a single framework. For example, Newton’s theory of gravity unified the falling of an apple and the motion of the moon around the Earth. Maxwell’s theory unified electricity and magnetism. While three of the four fundamental forces have been successfully unified by gauge theory, gravity remains incompatible with quantum theory. Despite decades of effort, it is still unknown how to reconcile Einstein’s theory of gravity with quantum mechanics.

1.3.2 Some particular forces

There are several types of forces that we encounter quite frequently in our daily life.

Gravitational force and weight

The gravitational force on Earth is exerted by the Earth on all objects with mass, including human beings, animals, and inanimate objects. The weight of an object near the surface of the Earth is given by Newton’s law of universal gravitation (1.3.1) as:
\[ W = mg = \frac{GMm}{R^2}, \tag{1.3.4} \]
where $M$ is the mass of the Earth, $m$ is the mass of the object, $R$ is the radius of the Earth, and $g$ is the acceleration due to gravity. The direction of the gravitational force is always toward the center of the Earth. From the expression for the free-fall acceleration
\[ g = \frac{GM}{R^2}, \tag{1.3.5} \]
we find notably that this acceleration is independent of the mass of the object, meaning that all objects experience the same acceleration under the influence of gravity near the Earth’s surface. Numerically, the free-fall acceleration is approximately $g \approx 9.8$ m/s$^2$.

Normal force

When an object is placed on a table, it experiences a normal force with direction perpendicular to the table. Essentially, this normal force comes from the electromagnetic repulsion between the closely spaced molecules of the object and the molecules of the table. The magnitude of the force is not fixed.

Friction

Friction is a force that opposes motion between two surfaces in contact.
When two objects are in contact, the roughness of their surfaces causes them to “stick” together slightly. This makes it harder to move one object relative to the other. Friction can be thought of as a microscopic “drag” force that acts opposite to the direction of motion or the direction of an applied force.

There are two types of friction: static friction and kinetic (or sliding) friction. Static friction occurs when an object is stationary and is about to be moved. The force of static friction prevents the object from being moved until a sufficient external force is applied. Once the object starts to move, kinetic friction takes over, which is generally less than static friction.

The magnitude of static friction has a maximum value, which is proportional to the normal force. So we have
\[ f_s \le \mu_s F_N, \tag{1.3.6} \]
where $\mu_s$ is the dimensionless coefficient of static friction and $F_N$ is the magnitude of the normal force. On the other hand, if the body begins to slide along the surface, the magnitude of the kinetic frictional force is
\[ f_k = \mu_k F_N, \tag{1.3.7} \]
where $\mu_k$ is the coefficient of kinetic friction. Usually, the maximum static friction is bigger than the kinetic friction: $\mu_s > \mu_k$.

Figure 1.6: Friction.

Tension

Tension is a pulling force that is transmitted through a flexible material, such as a rope, cable, or string, when it is pulled tight by opposing forces. It acts along the length of the material, and its magnitude is set by the forces applied at the ends of the material. Tension is used in a wide range of applications, such as in bridges, cables, pulleys, and other structures that rely on the strength of flexible materials.

Figure 1.7: Tensions of ropes. (a) Pulling an object on a table. (b) The rope runs around a pulley.

Spring force

Spring force is a force exerted by a compressed or stretched spring, which tends to restore the spring to its equilibrium length.
The magnitude of the spring force is proportional to the displacement of the spring from its equilibrium position. It can be expressed mathematically as
\[ \vec F = -k\vec x, \tag{1.3.8} \]
where $\vec F$ is the spring force, $k$ is the spring constant, and $\vec x$ is the displacement from the equilibrium position.

Figure 1.8: Spring force.

1.3.3 Statics I: forces

When an object or system remains unchanged over time in a particular reference frame, it is said to be in a state of equilibrium or static equilibrium. The branch of classical mechanics that studies systems in this state is known as statics.

Definition 8 (Body). A body refers to a physical object, which can consist of one or multiple parts, that can be treated as a single entity for the purpose of analysis.

The concept of a body is useful for simplifying the description and calculation of the motion, forces, and other properties of objects, especially when the individual components are too numerous or complicated to handle separately. By treating a complex system as a single body, we can apply the principles of mechanics, such as Newton’s laws, to describe its behavior.

In the field of statics, we analyze an object by considering the forces acting on it as a whole and by examining each of its individual parts, which can be thought of as distinct bodies. The selection of a particular body to analyze is an engineering decision that is made based on the specific goals of the analysis. For instance, when designing the foundation of a high-rise building, we may consider the entire building as a single body. However, when evaluating the strength of the building’s individual components, such as columns and beams, we would examine them separately to ensure that they can effectively perform their intended functions.

Newton’s second law states that the net force acting on a body is equal to its mass times its acceleration. Therefore, if a body is not accelerating, the net force acting on it must be zero.
In other words, if a body is in static equilibrium, the total force acting on it should be zero:
\[ \vec F_{\rm tot} = m\vec a = \vec 0. \tag{1.3.9} \]
So we have the following result:

The first condition of equilibrium. In order for a body to be in static equilibrium, the net force acting on it must be zero.

According to the principle of superposition of forces (or addition of vectors in a vector space), the total force acting on an object is the sum of all the individual forces acting on it from the environment. Mathematically, we can express this as:
\[ \vec F_{\rm tot} = \sum_a \vec F_a, \tag{1.3.10} \]
where $\vec F_a$ represents an individual force acting on the body from the environment. If we connect the tails of the arrows representing the force vectors to their heads, then the condition that the total force is zero is equivalent to saying that the resulting path formed by all the arrows is closed (see Figure 1.9).

Figure 1.9: The first condition of equilibrium: Total force of a static body should be zero.

When considering a body as consisting of several parts, the parts exert internal forces $\vec F_{ij}$ on each other. The total force acting on the body should include all external forces $\vec F_{aj}$ and internal forces $\vec F_{ij}$:
\[ \vec F_{\rm tot} = \sum_j \left( \sum_a \vec F_{aj} + \sum_i \vec F_{ij} \right). \tag{1.3.11} \]
Because of Newton’s third law (the action-reaction law) $\vec F_{ij} = -\vec F_{ji}$, the internal forces $\sum_j \sum_i \vec F_{ij}$ sum up to zero. Therefore, we again obtain Eq. (1.3.10) by identifying the external force as the sum $\vec F_a = \sum_j \vec F_{aj}$.

Example. A box is at rest on an inclined plane with an angle of inclination $\theta$. The weight of the box is $mg$. The static friction coefficient for the plane and the box is $\mu$. Determine the normal force and friction exerted by the inclined plane on the box. What is the maximum angle $\theta_{\max}$ at which the box remains in equilibrium on the plane?

Figure 1.10: Box on an inclined plane.
Solution: We can set up a coordinate system such that the x-axis is parallel to the surface of the inclined plane and pointing down the slope, the y-axis is perpendicular to the surface and pointing upwards, and the origin is at a convenient location such as the position of the box.

The box on a slope experiences three forces: the gravitational force $m\vec g$, the normal force $\vec F_N$, and the friction $\vec f$. To satisfy the condition that the total force, $\vec F_{\rm tot} = m\vec g + \vec F_N + \vec f$, is zero, we need to consider two equations along the two axes:
\[ f = mg\sin\theta, \tag{1.3.12} \]
\[ F_N = mg\cos\theta. \tag{1.3.13} \]
These two equations completely determine the normal force and the frictional force. Since the static friction should satisfy $f \le \mu F_N$, the maximum angle satisfies $\tan\theta_{\max} = \mu$, so
\[ \theta_{\max} = \arctan\mu. \tag{1.3.14} \]

Example. Calculate the forces in the pulley system shown in Figure 1.11.

Figure 1.11: A box hanging by ropes.

Example. If the crate has mass $m$ in Figure 1.12, determine the forces in the boom and in the topping lift.

Figure 1.12: Cargo boom.

Solution: The triangle formed by the three forces acting on the static point $B$ is similar to the triangle $ABC$. Therefore, we can establish the following equation:
\[ \frac{T_1}{\sin\theta_A} = \frac{T_2}{\sin\theta_C} = \frac{mg}{\sin\theta_B}, \tag{1.3.15} \]
which allows us to determine all three forces.

Example. A small ball with mass $m$ is hanging still by ropes (see Figure 1.13). The angles between the first two ropes and the vertical direction are $\theta_1$ and $\theta_2$, respectively. Find the tension in each of the three ropes.

Figure 1.13: A small ball hanging by ropes.

Solution: The total force on the ball should be zero, so we have
\[ T_3 = mg. \tag{1.3.16} \]
The total force acting on the connecting point of the three ropes should also be zero. The two equations in the horizontal and vertical directions are
\[ T_1\sin\theta_1 = T_2\sin\theta_2, \tag{1.3.17} \]
\[ T_1\cos\theta_1 + T_2\cos\theta_2 = T_3. \tag{1.3.18} \]
The solutions of the three equations are
\[ T_1 = mg\,\frac{\sin\theta_2}{\sin(\theta_1 + \theta_2)}, \tag{1.3.19} \]
\[ T_2 = mg\,\frac{\sin\theta_1}{\sin(\theta_1 + \theta_2)}, \tag{1.3.20} \]
\[ T_3 = mg. \tag{1.3.21} \]

Example. Consider the cable and pulley arrangement shown in Figure 1.14. The lower block has mass $M$, and the upper block has mass $m$. The coefficient of friction between the two blocks is $\mu$, and the coefficient of friction between the lower block and the floor is also $\mu$. What is the maximum horizontal force $F$ that can be exerted on the lower block before it moves? And what is the tension $T$ in the cable?

Figure 1.14

Solution: The maximum external force $F$ that can be exerted on the lower block before it moves occurs when the friction forces between the two blocks and between the lower block and the floor are at their maximum values,
\[ f_1 = \mu N_1, \tag{1.3.22} \]
\[ f_2 = \mu N_2. \tag{1.3.23} \]
Applying the first condition of equilibrium to the upper block, we can set the total forces along the x and y directions to zero, giving us:
\[ f_1 - T = 0, \tag{1.3.24} \]
\[ N_1 - mg = 0. \tag{1.3.25} \]
For the lower block, we have
\[ F - T - f_1 - f_2 = 0, \tag{1.3.26} \]
\[ N_2 - N_1 - Mg = 0. \tag{1.3.27} \]
By solving these equations, we find that the maximum external force $F$ and the tension in the cable $T$ are given by:
\[ F = \mu(3m + M)g, \tag{1.3.28} \]
\[ T = \mu mg. \tag{1.3.29} \]

1.3.4 Statics II: torques

When analyzing the motion of a body made up of several parts, it can be useful to separate their motions into the motion of the center of mass and the relative motions of the different parts. However, in order to simplify our analysis and ignore the relative motion, we introduce the following concept.

Definition 9 (Rigid body). A rigid body is a physical object that maintains its shape and size under external forces, meaning that it does not undergo deformation, bending, stretching, or twisting.

In reality, no object is truly rigid, and all objects can deform under the application of forces. However, for many practical purposes, it is sufficient to treat an object as a rigid body.
For a rigid body, we can assume that all points of it move together, so we only need to consider the motion of the object as a whole rather than the motion of each individual point on the object. This simplifies the equations of motion and makes it easier to analyze the object’s behavior.

We can ask the question: what is the condition for a rigid body to be at rest or in static equilibrium? The condition that the total force acting on a rigid body is zero is not sufficient to ensure that the body is static. There is another condition related to the moment acting on the body.

Definition 10 (Cross product). Given two vectors $\vec A$ and $\vec B$ in three-dimensional Euclidean space, the cross product of them is another vector $\vec C$ that is perpendicular to both $\vec A$ and $\vec B$ and whose magnitude is equal to the product of the magnitudes of $\vec A$ and $\vec B$ multiplied by the sine of the angle between them. The direction of the resulting vector $\vec C$ is given by the right-hand rule.

Figure 1.15: Cross product.

The cross product of $\vec A = (A_x, A_y, A_z)$ and $\vec B = (B_x, B_y, B_z)$ in a coordinate system is given by:
\[
\begin{aligned}
\vec A \times \vec B &= \begin{vmatrix} \hat x & \hat y & \hat z \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix}
= (A_y B_z - A_z B_y)\hat x + (A_z B_x - A_x B_z)\hat y + (A_x B_y - A_y B_x)\hat z \\
&= (A_y B_z - A_z B_y,\ A_z B_x - A_x B_z,\ A_x B_y - A_y B_x),
\end{aligned}
\tag{1.3.30}
\]
where $\hat x$, $\hat y$, and $\hat z$ are the unit vectors in the x, y, and z directions, respectively. One can show directly that
\[ |\vec A \times \vec B| = |\vec A| \cdot |\vec B| \cdot \sin\theta, \tag{1.3.31} \]
where $\theta$ is the angle between $\vec A$ and $\vec B$. If both $\vec A$ and $\vec B$ lie in the xy plane, their cross product is perpendicular to the plane:
\[ \vec A \times \vec B = (A_x, A_y, 0) \times (B_x, B_y, 0) = (0, 0, A_x B_y - A_y B_x). \tag{1.3.32} \]
Therefore, the cross product of $\vec A$ and $\vec B$ in the xy plane is a vector that only has a z-component, given by $A_x B_y - A_y B_x$.

The cross product is anticommutative (that is, $\vec A \times \vec B = -\vec B \times \vec A$) and is distributive over addition (that is, $\vec A \times (\vec B + \vec C) = \vec A \times \vec B + \vec A \times \vec C$). It is not associative, but satisfies the Jacobi identity (that is, $\vec A \times (\vec B \times \vec C) + \vec B \times (\vec C \times \vec A) + \vec C \times (\vec A \times \vec B) = 0$). It also satisfies $\vec A \cdot (\vec B \times \vec C) = \vec B \cdot (\vec C \times \vec A) = \vec C \cdot (\vec A \times \vec B)$ and $\vec A \times (\vec B \times \vec C) = (\vec A \cdot \vec C)\vec B - (\vec A \cdot \vec B)\vec C$.

The cross product allows us to define the physical concept of the moment of a force, also known as torque. The moment of a force measures the rotational effect of a force about a specific point or axis. Unlike a linear force, which causes translational motion of a body, a force that creates a moment must be applied in a way that causes the body to begin to rotate or twist. This happens when the force does not act through the centroid or center of mass of the body.

Definition 11 (Moment of a force/torque). The moment of force, or torque, is a vector defined as
\[ \vec\tau = \vec r \times \vec F, \tag{1.3.33} \]
where $\vec r$ is the position vector from the rotation center to the point where the force $\vec F$ is applied.

By definition, the magnitude of $\vec\tau$ is
\[ |\vec\tau| = |\vec r| \cdot |\vec F| \cdot \sin\theta, \tag{1.3.34} \]
where $\theta$ is the angle between the two vectors $\vec r$ and $\vec F$. The direction of $\vec\tau$ is given by the right-hand rule.

One can define the moment of a force with respect to a rotational axis using either $\vec r \times \vec F_\perp$ (see Figure 1.16b) or $(\vec r \times \vec F)_\parallel$ (see Figure 1.16c). These two definitions coincide with each other and can be used interchangeably (showing this is left as an exercise).

Figure 1.16: Moment of a force/torque with respect to (a) a point or (b), (c) a rotational axis.

The second condition of equilibrium. In order for a rigid body to be in static equilibrium, the net torque acting on it with respect to any chosen axis or point must be zero.

If we have several forces acting on different positions of a static rigid body, the total torque can be calculated as:
\[ \vec\tau_{\rm tot} = \sum_i \vec r_i \times \vec F_i. \tag{1.3.35} \]
The second condition of equilibrium states that the total torque $\vec\tau_{\rm tot}$ acting on the rigid body as a vector is zero, which ensures that the body does not rotate.

Example.
Design a steelyard balance to find mass with torque.

Figure 1.17: A steelyard balance.

Example. A uniform pole of length $L$ and weight $mg$ is pivoted at one end to a wall. It is held at an angle $\theta$ above the horizontal by a horizontal guy wire attached to the pole at a distance $l$ from the pivoted end. A load of weight $Mg$ hangs from the upper end of the pole. Calculate the tension in the guy wire and determine the force exerted on the pole by the wall.

Figure 1.18: A pole at static equilibrium.

Solution: The first condition of equilibrium gives us two equations relating the forces on the pole:

$$
N_x - T = 0,
\tag{1.3.36}
$$

$$
N_y - mg - Mg = 0.
\tag{1.3.37}
$$

However, we have three unknown forces: $N_x$, $N_y$, and $T$. Therefore, we need an additional equation to solve for all three forces. This equation can be obtained from the second condition of equilibrium.

Before calculating torques, we need to choose a reference point about which the torques are calculated. The lower left end of the pole (the pivot) is a good choice, because the two forces $N_x$ and $N_y$ then have zero torque. This simplifies the equations. Using this reference point, the total torque can be written as:

$$
\tau_{\rm tot} = mg\,\frac{L}{2}\cos\theta + Mg\,L\cos\theta - T\,l\sin\theta = 0.
\tag{1.3.38}
$$

Using all three equations from the first and second conditions of equilibrium, we can solve for the unknowns:

$$
N_y = mg + Mg,
\tag{1.3.39}
$$

$$
N_x = T = \frac{L}{2l}\,mg\cot\theta + \frac{L}{l}\,Mg\cot\theta .
\tag{1.3.40}
$$

Here the factor $\cot\theta$ arises from dividing Eq. (1.3.38) by $l\sin\theta$.

1.4 Energy and Momentum Conservation

Nowadays, we know that Newton's laws are only approximations of the laws of nature. New theories are necessary when objects move at very high speeds (special relativity), are very massive (general relativity), or are very small (quantum mechanics). However, some consequences that follow from Newton's laws actually hold more universally. In this section, we introduce two such important consequences: energy and momentum conservation.
1.4.1 Energy conservation

Let us consider a point particle moving in one dimension under a force which can be written as

$$
F(x) = -\frac{dU(x)}{dx},
\tag{1.4.1}
$$

where the function $U(x)$ is called the potential energy. Newton's second law tells us that

$$
\frac{dU(x)}{dx} = -m\frac{dv}{dt} .
\tag{1.4.2}
$$

Multiplying both sides of the equation by the velocity $v = dx/dt$, we obtain

$$
-\frac{dU(x)}{dt} = -\frac{dU(x)}{dx}\frac{dx}{dt} = m v\frac{dv}{dt} = \frac{1}{2} m \frac{dv^2}{dt},
\tag{1.4.3}
$$

which can be simplified to

$$
\frac{d}{dt}\left(\frac{1}{2} m v^2 + U(x)\right) = 0 .
\tag{1.4.4}
$$

We see that Newton's second law implies that a particular combination of the square of the velocity and the potential energy is conserved (invariant under time evolution). Let us define the kinetic energy $K$ of the point particle as

$$
K = \frac{1}{2} m v^2,
\tag{1.4.5}
$$

and the total energy $E$ of the point particle as

$$
E = \frac{1}{2} m v^2 + U(x) .
\tag{1.4.6}
$$

Newton's second law implies that the total energy $E$ is conserved.

Example. The force of a spring

$$
F = -kx,
\tag{1.4.7}
$$

is conservative. The potential energy of the spring is

$$
U(x) = \frac{1}{2} k x^2 .
\tag{1.4.8}
$$

Show that the energy of a simple harmonic oscillator (1.2.4) is conserved.

Solution: The position and velocity of the box are

$$
x(t) = x_0\cos(\omega t) + \frac{v_0}{\omega}\sin(\omega t), \qquad
\dot x(t) = -x_0\,\omega\sin(\omega t) + v_0\cos(\omega t),
\tag{1.4.9}
$$

where $x_0$ and $v_0$ are the initial position and velocity, and $\omega = \sqrt{k/m}$. The kinetic energy $K$ is

$$
K = \frac{1}{2} m \dot x^2 = \frac{1}{2} m \left(-x_0\,\omega\sin(\omega t) + v_0\cos(\omega t)\right)^2 .
\tag{1.4.10}
$$

The potential energy $U$ is

$$
U = \frac{1}{2} k x^2 = \frac{1}{2} k \left(x_0\cos(\omega t) + \frac{v_0}{\omega}\sin(\omega t)\right)^2 .
\tag{1.4.11}
$$

Separately, they are not conserved. The total energy is

$$
\begin{aligned}
E &= K + U \\
&= \frac{1}{2}\Big[ m\omega^2 x_0^2 \sin^2(\omega t) + m v_0^2 \cos^2(\omega t) - 2 m \omega x_0 v_0 \sin(\omega t)\cos(\omega t) \\
&\qquad\; + k x_0^2 \cos^2(\omega t) + k\frac{v_0^2}{\omega^2}\sin^2(\omega t) + 2k\frac{x_0 v_0}{\omega}\sin(\omega t)\cos(\omega t) \Big] \\
&= \frac{1}{2} m v_0^2 + \frac{1}{2} k x_0^2 ,
\end{aligned}
\tag{1.4.12}
$$

where we used $k = m\omega^2$. We see that the total energy is indeed conserved.

In the example of the simple harmonic oscillator, if initially the spring is relaxed and the box has no velocity, i.e. $x_0 = 0$ and $v_0 = 0$, then the system would be at rest with zero energy forever.
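As a numerical aside, the cancellation in (1.4.12) can be checked directly. The following is a minimal sketch, not part of the original notes: the function names and the parameter values are illustrative assumptions, and the potential is taken to be $U = \frac{1}{2}kx^2$ as in (1.4.11).

```python
import math


def sho_state(t, x0, v0, omega):
    """Position and velocity of the oscillator, following Eq. (1.4.9)."""
    x = x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)
    v = -x0 * omega * math.sin(omega * t) + v0 * math.cos(omega * t)
    return x, v


def sho_energy(t, m, k, x0, v0):
    """Return (K, U, E) at time t, with U = k x^2 / 2 as in Eq. (1.4.11)."""
    omega = math.sqrt(k / m)
    x, v = sho_state(t, x0, v0, omega)
    kinetic = 0.5 * m * v**2
    potential = 0.5 * k * x**2
    return kinetic, potential, kinetic + potential


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    m, k, x0, v0 = 2.0, 8.0, 0.3, -1.2
    for t in (0.0, 0.4, 1.0, 2.5):
        K, U, E = sho_energy(t, m, k, x0, v0)
        print(f"t={t:4.1f}  K={K:.6f}  U={U:.6f}  E={E:.6f}")
```

With these sample values, $K$ and $U$ change from one time to the next while their sum should stay at $\frac{1}{2}mv_0^2 + \frac{1}{2}kx_0^2$ up to floating-point rounding.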
We could apply an external force to push the spring, increasing the potential energy and starting the motion. We give a more general discussion of this process in the following.

Consider another force $F'$ that balances the force $F$ such that the net force is zero, i.e.

$$
F' = -F = \frac{dU}{dx} .
\tag{1.4.13}
$$

Suppose the point particle travels from $x = x_i$ to $x = x_f$ while the net force remains zero as in (1.4.13). We define the work done by the force $F'$ on the point particle as

$$
W = \int_{x_i}^{x_f} F'\,dx = \int_{x_i}^{x_f} \frac{dU}{dx}\,dx = U(x_f) - U(x_i) .
\tag{1.4.14}
$$

We see that the work equals the change of the potential energy.

Let us generalize the above discussion to higher dimensions.

Definition 12 (Conservative force). A force $\vec F$ is conservative if it only depends on the position $\vec r$ (but not the velocity $\vec v$) and can be written as minus the gradient of the potential energy $U(\vec r)$, more precisely,

$$
\vec F = -\vec\nabla U = -\left(\frac{\partial U}{\partial x_1}, \cdots, \frac{\partial U}{\partial x_d}\right) .
\tag{1.4.15}
$$

The notation $\frac{\partial}{\partial x_i}$ in (1.4.15) is the partial derivative, which means that we only take the derivative with respect to $x_i$ while fixing the other $x_j$ for $j \neq i$.

The kinetic energy and the total energy of a point particle moving in higher dimensions are

$$
K = \frac{1}{2} m |\vec v|^2, \qquad E = \frac{1}{2} m |\vec v|^2 + U(\vec r) .
\tag{1.4.16}
$$

Let us check that the total energy is conserved,

$$
\frac{dE}{dt} = \frac{d}{dt}\left(\frac{1}{2} m |\vec v|^2 + U(\vec r)\right)
= m\vec v \cdot \frac{d\vec v}{dt} + \frac{d\vec r}{dt} \cdot \vec\nabla U(\vec r)
= \vec v \cdot \left(m\vec a - \vec F\right) = 0 .
\tag{1.4.17}
$$

Example. Consider a free-falling ball near Earth's surface. It experiences the gravitational force

$$
\vec F = (0, -mg),
\tag{1.4.18}
$$

which is conservative, and can be written as

$$
\vec F = -\left(\frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}\right), \qquad U = mgy,
\tag{1.4.19}
$$

where $U$ is called the gravitational potential energy. The motion of the ball is given by (1.1.42). The velocity of the ball is given by (1.1.43). Check that the total energy is conserved.

Solution: Let us compute the total energy

$$
\begin{aligned}
E &= \frac{1}{2} m (\dot x^2 + \dot y^2) + mgy \\
&= \frac{1}{2} m v_x^2 + \frac{1}{2} m (v_y - gt)^2 + mg\left(y_0 + v_y t - \frac{1}{2} g t^2\right) \\
&= \frac{1}{2} m v_x^2 + \frac{1}{2} m v_y^2 + mg y_0 ,
\end{aligned}
\tag{1.4.20}
$$

which is independent of the time $t$.
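The same check can be done numerically. The sketch below assumes the trajectory used in (1.4.20), namely $y(t) = y_0 + v_y t - \frac{1}{2}gt^2$ with velocity $(v_x, v_y - gt)$; the function name and the sample numbers are hypothetical and only serve as an illustration.

```python
def free_fall_energy(t, m, g, y0, vx, vy):
    """Total energy E = K + U of a free-falling ball at time t.

    Assumes the parabolic trajectory used in Eq. (1.4.20):
    y(t) = y0 + vy*t - g*t^2/2, velocity (vx, vy - g*t), and U = m*g*y.
    """
    y = y0 + vy * t - 0.5 * g * t**2
    vy_t = vy - g * t
    kinetic = 0.5 * m * (vx**2 + vy_t**2)
    potential = m * g * y
    return kinetic + potential


if __name__ == "__main__":
    # Hypothetical sample values in SI units.
    m, g, y0, vx, vy = 0.5, 9.8, 10.0, 3.0, 4.0
    for t in (0.0, 0.5, 1.0, 1.5):
        print(f"t={t:.1f} s  E={free_fall_energy(t, m, g, y0, vx, vy):.6f} J")
```

Each printed value should agree with $\frac{1}{2}m v_x^2 + \frac{1}{2}m v_y^2 + mgy_0$ up to rounding, since the terms linear and quadratic in $t$ cancel between the kinetic and potential parts.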
Consider an external force $\vec F'$ that balances the gravitational force and pulls the ball from $\vec r_i$ to $\vec r_f$ following the curve $C_{\vec r_i,\vec r_f}$ as shown in Figure 1.19. What is the work done by the force $\vec F'$?

Figure 1.19: An external force $\vec F'$ pulls the ball upward from $\vec r_i$ to $\vec r_f$.

Let us consider the case with a general potential energy $U(\vec r)$. The force $\vec F'$ balances the conservative force $\vec F$ as

$$
\vec F' = -\vec F = \vec\nabla U .
\tag{1.4.21}
$$

The work done by the force $\vec F'$ along a curve $C_{\vec r_i,\vec r_f}$ starting at $\vec r_i$ and ending at $\vec r_f$ is

$$
W = \int_{C_{\vec r_i,\vec r_f}} \vec F' \cdot d\vec r = \int_{C_{\vec r_i,\vec r_f}} \vec\nabla U \cdot d\vec r = U(\vec r_f) - U(\vec r_i) \equiv \Delta U .
\tag{1.4.22}
$$

Interestingly, the work only depends on the starting and ending positions $\vec r_i$, $\vec r_f$, and not on the curve $C_{\vec r_i,\vec r_f}$ connecting them. In particular, if the curve is a closed loop $C$, then the work done by the force $\vec F'$ is zero,

$$
W = \oint_C \vec F' \cdot d\vec r = \oint_C \vec\nabla U \cdot d\vec r = 0 .
\tag{1.4.23}
$$

Note that this property is only true if $\vec F'$ balances a conservative force.

Now, let us consider the situation where the external force $\vec F'$ does not balance the conservative force $\vec F = -\vec\nabla U$. The net force on the system is

$$
\vec F' + \vec F = \vec F' - \vec\nabla U .
\tag{1.4.24}
$$

Newton's second law implies

$$
\vec F' - \vec\nabla U = m\vec a .
\tag{1.4.25}
$$

Let us integrate this equation along the contour $C_{\vec r_i,\vec r_f}$. We find

$$
\begin{aligned}
\int_{C_{\vec r_i,\vec r_f}} \vec F' \cdot d\vec r
&= \int_{C_{\vec r_i,\vec r_f}} \vec\nabla U \cdot d\vec r + \int_{C_{\vec r_i,\vec r_f}} m\frac{d\vec v}{dt} \cdot d\vec r \\
&= \int_{C_{\vec r_i,\vec r_f}} \vec\nabla U \cdot d\vec r + \int_{t_i}^{t_f} m\frac{d\vec v}{dt} \cdot \vec v\,dt \\
&= \int_{C_{\vec r_i,\vec r_f}} \vec\nabla U \cdot d\vec r + \int_{t_i}^{t_f} \frac{d}{dt}\left(\frac{1}{2} m |\vec v|^2\right) dt \\
&= [U(\vec r_f) + K(t_f)] - [U(\vec r_i) + K(t_i)] \equiv \Delta E .
\end{aligned}
\tag{1.4.26}
$$

We have learned that when the external force $\vec F'$ does not balance the conservative force, the integral of the external force over a contour $C_{\vec r_i,\vec r_f}$ equals the energy difference between the initial and final configurations. It is important to note that now the integral of the external force depends on the entire contour $C_{\vec r_i,\vec r_f}$, not just on the initial and final positions.

Energy conservation can be further generalized to multi-particle systems.
The kinetic energy of a system of $n$ particles is the sum of the kinetic energies of the individual particles,

$$
K = \sum_{i=1}^{n} \frac{1}{2} m_i |\vec v_i|^2 .
\tag{1.4.27}
$$

The potential energy is a function of the position vectors $\vec r_i$ of the particles, $U(\vec r_1, \cdots, \vec r_n)$. The force acting on the $i$-th particle is given by minus the gradient of the potential energy with respect to $\vec r_i$,

$$
\vec F_i = -\vec\nabla_i U(\vec r_1, \cdots, \vec r_n),
\tag{1.4.28}
$$

where $\vec\nabla_i$ is the differential operator acting on $\vec r_i$. It is straightforward to check that the total energy is conserved.

Example. The gravitational potential energy of two stars 1 and 2 is

$$
U(\vec r_1, \vec r_2) = -\frac{G m_1 m_2}{|\vec r_{12}|} .
\tag{1.4.29}
$$

The force on star 1 is (see footnote 4)

$$
\vec F_1 = -\vec\nabla_1\left(-\frac{G m_1 m_2}{|\vec r_{12}|}\right) = -\frac{G m_1 m_2}{|\vec r_{12}|^2}\,\hat r_{12},
\tag{1.4.31}
$$

and the force on star 2 is

$$
\vec F_2 = -\vec\nabla_2\left(-\frac{G m_1 m_2}{|\vec r_{12}|}\right) = \frac{G m_1 m_2}{|\vec r_{12}|^2}\,\hat r_{12} .
\tag{1.4.32}
$$

Footnote 4: More explicitly, we have used

$$
\begin{aligned}
\vec\nabla\frac{1}{|\vec r|}
&= \left(\frac{\partial}{\partial x}\frac{1}{\sqrt{x^2+y^2+z^2}},\; \frac{\partial}{\partial y}\frac{1}{\sqrt{x^2+y^2+z^2}},\; \frac{\partial}{\partial z}\frac{1}{\sqrt{x^2+y^2+z^2}}\right) \\
&= -\frac{1}{(x^2+y^2+z^2)^{3/2}}\,(x, y, z)
= -\frac{1}{|\vec r|^2}\,\hat r .
\end{aligned}
\tag{1.4.30}
$$

Consider the circular motion of the two-star system studied in section 1.2.5. The total energy is

$$
E = \frac{1}{2} m_1 (r_1\omega)^2 + \frac{1}{2} m_2 (r_2\omega)^2 - \frac{G m_1 m_2}{r},
\tag{1.4.33}
$$

where $r_1$, $r_2$ and $r$ are given in (1.2.2) and (1.2.3), and the sign of the last term follows from (1.4.29). However, this example is not very interesting since the kinetic energy and the potential energy are separately conserved. In the next section, we will see the elliptic motion of the two-star system, where the kinetic energy and the potential energy are not conserved separately while the total energy is conserved.

1.4.2 Momentum conservation

Let us consider a system of two point particles, whose position vectors are $\vec r_1$ and $\vec r_2$. We assume that no external force (or no net external force) is acting on the system, but the two particles in the system can interact with each other. More explicitly, particle 1 exerts a force $\vec F_{12}$ on particle 2, and particle 2 exerts a force $\vec F_{21}$ on particle 1.
Newton's second law gives

$$
\vec F_{21} = m_1\frac{d^2\vec r_1}{dt^2}, \qquad \vec F_{12} = m_2\frac{d^2\vec r_2}{dt^2},
\tag{1.4.34}
$$

where $m_1$ and $m_2$ are the masses of the two particles. Newton's third law implies

$$
\vec F_{12} = -\vec F_{21} .
\tag{1.4.35}
$$

Now, let us consider

$$
0 = \vec F_{12} + \vec F_{21} = m_1\frac{d^2\vec r_1}{dt^2} + m_2\frac{d^2\vec r_2}{dt^2} = \frac{d}{dt}\left(m_1\vec v_1 + m_2\vec v_2\right) .
\tag{1.4.36}
$$

We see that the quantity $m_1\vec v_1 + m_2\vec v_2$ is conserved.

The above discussion can be generalized to more complicated systems. Before we do that, let us introduce some new concepts. First, the momentum of a point particle is defined as its mass times its velocity,

$$
\vec p \equiv m\vec v .
\tag{1.4.37}
$$

The total momentum $\vec P$ of a system is defined as the sum over all the momenta. For example, in our previous two-particle system, the total momentum is

$$
\vec P = \vec p_1 + \vec p_2 = m_1\vec v_1 + m_2\vec v_2,
\tag{1.4.38}
$$

which is conserved if there is no net external force, by (1.4.36).

Now, let us generalize the above discussion to a system of $n$ particles without net external force. The $i$-th particle acts on the $j$-th particle with a force $\vec F_{ij}$. Newton's second and third laws imply

$$
0 = \sum_{\substack{i,j=1 \\ i\neq j}}^{n} \vec F_{ij} = \sum_{i=1}^{n} m_i\frac{d^2\vec r_i}{dt^2} = \frac{d}{dt}\left(\sum_{i=1}^{n} m_i\vec v_i\right) \equiv \frac{d}{dt}\left(\sum_{i=1}^{n} \vec p_i\right) = \frac{d\vec P}{dt} .
\tag{1.4.39}
$$

Therefore, the total momentum of the system is conserved if there is no net external force.

When there is a net external force $\vec F_{\rm ext}$, Newton's second law tells us that the total momentum $\vec P$ is not conserved, and its time derivative is

$$
\vec F_{\rm ext} = \frac{d\vec P}{dt} .
\tag{1.4.40}
$$

The above equation looks like Newton's second law for a point particle. Hence, it is convenient to interpret the total momentum as the momentum of an effective point particle whose mass equals the total mass $M = m_1 + \cdots + m_n$, i.e.

$$
\vec P = M\vec v_{\rm com},
\tag{1.4.41}
$$

where $\vec v_{\rm com}$ is the velocity of the effective point particle, which is a mass-weighted average of the velocities of the individual particles,

$$
\vec v_{\rm com} = \frac{1}{M}\sum_{i=1}^{n} m_i\vec v_i .
\tag{1.4.42}
$$

From this formula, we can further infer that the effective point particle is located at the position $\vec r_{\rm com}$ given by

$$
\vec r_{\rm com} = \frac{1}{M}\sum_{i=1}^{n} m_i\vec r_i .
\tag{1.4.43}
$$

The position specified by the position vector $\vec r_{\rm com}$ is called the center of mass.

The center of mass provides a way to describe the "overall motion" of a system of particles. Namely, a system of particles with a net external force $\vec F_{\rm ext}$ can be viewed as a point particle with mass $M$ sitting at the center of mass $\vec r_{\rm com}$. In other words, the center of mass motion of a system of particles only depends on the net external force $\vec F_{\rm ext}$ via Newton's second law and is independent of the interactions among the particles in the system.

Using translation and Galilean transformation, we can go to an inertial frame in which the center of mass is at the origin and has zero velocity, i.e. $\vec r_{\rm com} = 0 = \vec v_{\rm com}$. Such an inertial frame is called the center of mass frame.

1.4.3 Elastic scattering

An $m \to n$ scattering is a physical process where $m$ objects come in from infinity, interact via contact or non-contact forces at finite distance, and $n$ objects go out to infinity. The initial momenta $\vec p_1, \cdots, \vec p_m$ of the objects change to the final momenta $\vec p_1{}', \cdots, \vec p_n{}'$ during the scattering process. The interactions (forces) between the objects can be very complicated and we usually do not know their form. We will only assume that the interactions are "local", i.e. they decay sufficiently fast with the distance between the objects so that the objects are free when they come in from or go out to infinity. During the scattering process, the total momentum and energy should be conserved. While these conservation laws are usually not sufficient to fully determine the scattering process, they give important constraints on the system.

Figure 1.20: An $m \to n$ scattering process.

In a scattering process, while the total energy is always conserved, the energy can change from one form to another. For example, the objects coming in from infinity carry kinetic energy, which can be converted into potential energy, heat (thermal energy), or the energy of sound and light.
In this subsection, we will first consider an idealized situation, where the kinetic energy is conserved in the scattering process. Such a scattering is called an elastic scattering. Scattering processes that do not conserve kinetic energy are called inelastic scatterings. On the other hand, momentum should always be conserved if the system is closed. We will focus on the $2 \to 2$ scattering process, where we have two incoming particles or objects and two outgoing ones.

Example. Consider a one-dimensional $2 \to 2$ elastic scattering process. Two objects $A$ and $B$ of masses $m_A$, $m_B$, and velocities $v_A$, $v_B$ come in from the left infinity. They collide and go out to the right infinity. Compute the velocities of the objects $A$ and $B$ after the scattering.

Figure 1.21: One-dimensional elastic scattering process.

Solution: The momentum conservation of the scattering process gives

$$
m_A v_A + m_B v_B = m_A v_A' + m_B v_B' .
\tag{1.4.44}
$$

The conservation of the kinetic energy gives

$$
\frac{1}{2} m_A v_A^2 + \frac{1}{2} m_B v_B^2 = \frac{1}{2} m_A v_A'^2 + \frac{1}{2} m_B v_B'^2 .
\tag{1.4.45}
$$

Solving the above two equations, we find two solutions

$$
v_A' = v_A, \qquad v_B' = v_B,
\tag{1.4.46}
$$

and

$$
v_A' = \frac{m_A - m_B}{m_A + m_B}\,v_A + \frac{2 m_B}{m_A + m_B}\,v_B, \qquad
v_B' = \frac{2 m_A}{m_A + m_B}\,v_A + \frac{m_B - m_A}{m_A + m_B}\,v_B .
\tag{1.4.47}
$$

The initial and final velocities in the first solution are the same. This means that the scattering actually does not happen, which could be realized when $v_A \le v_B$. The velocities in the second solution change under the scattering.

It is interesting to go to the center of mass frame. Let us first compute the velocity of the center of mass

$$
v_{\rm com} = \frac{m_A v_A + m_B v_B}{m_A + m_B} .
\tag{1.4.48}
$$

In the center of mass frame, the objects $A$ and $B$ have initial velocities

$$
v_A - v_{\rm com} = \frac{m_B (v_A - v_B)}{m_A + m_B}, \qquad
v_B - v_{\rm com} = \frac{m_A (v_B - v_A)}{m_A + m_B},
\tag{1.4.49}
$$

and final velocities

$$
v_A' - v_{\rm com} = -\frac{m_B (v_A - v_B)}{m_A + m_B}, \qquad
v_B' - v_{\rm com} = -\frac{m_A (v_B - v_A)}{m_A + m_B} .
\tag{1.4.50}
$$

We see that in the center of mass frame, the final velocities are just given by flipping the signs of the initial velocities. This can be explained by the time-reversal symmetry of the system.

Now, let us generalize our study to scatterings in higher dimensions.

Example. Consider a $2 \to 2$ elastic scattering in $d$ dimensions. Let the velocities of the incoming particles be $\vec v_A$ and $\vec v_B$ and those of the outgoing particles be $\vec v_A{}'$ and $\vec v_B{}'$.

1. Show that the momentum conservation and the energy conservation are invariant under the Galilean transformation.

2. Using Galilean transformation, translation, and rotation, show that we can always go to an inertial frame such that the total momentum is zero, and the scattering is two-dimensional, i.e. all the incoming and outgoing objects or particles are moving in the $x$-$y$ plane.

3. Show that any Galilean invariant quantity (a quantity that is invariant under transformations in the Galilean group) can be written as a function of two Galilean invariant variables. Explicitly construct two such variables.

Figure 1.22: A $d$-dimensional $2 \to 2$ scattering can always be put on a 2-plane.

Solution:

1. Let us start with the momentum conservation

$$
m_A\vec v_A + m_B\vec v_B = \vec p_A + \vec p_B = \vec p_A{}' + \vec p_B{}' = m_A\vec v_A{}' + m_B\vec v_B{}' .
\tag{1.4.51}
$$

Under the Galilean transformation

$$
\vec v_A \to \vec v_A + \vec v, \qquad \vec v_B \to \vec v_B + \vec v, \qquad \vec v_A{}' \to \vec v_A{}' + \vec v, \qquad \vec v_B{}' \to \vec v_B{}' + \vec v,
\tag{1.4.52}
$$

the momentum conservation becomes

$$
m_A(\vec v_A + \vec v) + m_B(\vec v_B + \vec v) = m_A(\vec v_A{}' + \vec v) + m_B(\vec v_B{}' + \vec v),
\tag{1.4.53}
$$

which is equivalent to the momentum conservation before the Galilean transformation.

Next, let us consider the energy conservation,

$$
\frac{1}{2} m_A |\vec v_A|^2 + \frac{1}{2} m_B |\vec v_B|^2 = \frac{1}{2} m_A |\vec v_A{}'|^2 + \frac{1}{2} m_B |\vec v_B{}'|^2 .
\tag{1.4.54}
$$

Under the Galilean transformation, the energy conservation becomes

$$
\frac{1}{2} m_A |\vec v_A + \vec v|^2 + \frac{1}{2} m_B |\vec v_B + \vec v|^2 = \frac{1}{2} m_A |\vec v_A{}' + \vec v|^2 + \frac{1}{2} m_B |\vec v_B{}' + \vec v|^2 .
\tag{1.4.55}
$$

Expanding both sides of the equation, we find

$$
\begin{aligned}
&\frac{1}{2} m_A |\vec v_A|^2 + \frac{1}{2} m_B |\vec v_B|^2 + \vec v \cdot (m_A\vec v_A + m_B\vec v_B) + \frac{1}{2}(m_A + m_B)|\vec v|^2 \\
&\quad = \frac{1}{2} m_A |\vec v_A{}'|^2 + \frac{1}{2} m_B |\vec v_B{}'|^2 + \vec v \cdot (m_A\vec v_A{}' + m_B\vec v_B{}') + \frac{1}{2}(m_A + m_B)|\vec v|^2 .
\end{aligned}
\tag{1.4.56}
$$

Using the momentum conservation, we see that it is equivalent to the energy conservation before the Galilean transformation.

2. Let us first use translation and Galilean transformation to set the center of mass to be at the origin with zero velocity. Since the total momentum is zero, the momenta of the two objects, labeled by $A$ and $B$, have the same magnitude and opposite directions, i.e. the incoming momenta satisfy

$$
\vec p_A = -\vec p_B,
\tag{1.4.57}
$$

and the outgoing momenta satisfy

$$
\vec p_A{}' = -\vec p_B{}' .
\tag{1.4.58}
$$

Next, we can rotate our frame to make $\vec p_A$ and $\vec p_A{}'$ lie in the $x$-$y$ plane. In this frame, the scattering process is characterized by only two quantities: the center of mass energy $E_{\rm com}$ and the scattering angle $\theta$. They are given by

$$
E_{\rm com} = \frac{|\vec p_A|^2}{2 m_A} + \frac{|\vec p_B|^2}{2 m_B} = \frac{m_A + m_B}{2 m_A m_B}\,|\vec p_A|^2,
\tag{1.4.59}
$$

and

$$
\vec p_A \cdot \vec p_A{}' = |\vec p_A||\vec p_A{}'|\cos\theta .
\tag{1.4.60}
$$

3. Consider a generic inertial frame. The momenta $\vec p_A$, $\vec p_B$, $\vec p_A{}'$, and $\vec p_B{}'$, subject to the momentum conservation

$$
\vec p_A + \vec p_B = \vec p_A{}' + \vec p_B{}',
\tag{1.4.61}
$$

span a three-dimensional vector space. There is a two-dimensional subspace that is invariant under the Galilean transformation and spanned by the vectors

$$
\vec v_{AB} \equiv \vec v_A - \vec v_B = \frac{\vec p_A}{m_A} - \frac{\vec p_B}{m_B}, \qquad
\vec v_{B'B} \equiv \vec v_B{}' - \vec v_B = \frac{1}{m_B}\left(\vec p_B{}' - \vec p_B\right) .
\tag{1.4.62}
$$

Using these two vectors, we can construct three rotationally invariant quadratic combinations

$$
s = |\vec v_{AB}|^2, \qquad t = |\vec v_{B'B}|^2, \qquad u = |\vec v_{AB} - \vec v_{B'B}|^2 .
\tag{1.4.63}
$$

By the energy and momentum conservation, one can find one linear relation between $s$, $t$, and $u$. To find such a relation, let us use Galilean transformation to go to the frame with $\vec v_B = 0$.
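In this frame, $\vec v_{AB} = \vec v_A$, $\vec v_{B'B} = \vec v_B{}'$, and the energy and momentum conservation become

$$
m_A\vec v_A = m_A\vec v_A{}' + m_B\vec v_B{}', \qquad
\frac{1}{2} m_A |\vec v_A|^2 = \frac{1}{2} m_A |\vec v_A{}'|^2 + \frac{1}{2} m_B |\vec v_B{}'|^2 .
\tag{1.4.64}
$$

We can solve for $\vec v_A{}'$ using the momentum conservation as

$$
\vec v_A{}' = \vec v_A - \frac{m_B}{m_A}\,\vec v_B{}' .
\tag{1.4.65}
$$

The energy conservation now can be written as

$$
\begin{aligned}
\frac{1}{2} m_A s &= \frac{1}{2} m_A |\vec v_A{}'|^2 + \frac{1}{2} m_B t \\
&= \frac{1}{2} m_A \left|\vec v_A - \frac{m_B}{m_A}\vec v_B{}'\right|^2 + \frac{1}{2} m_B t \\
&= \frac{1}{2} m_A s - m_B\,\vec v_A \cdot \vec v_B{}' + \frac{m_B^2}{2 m_A}\,t + \frac{1}{2} m_B t \\
&= \frac{1}{2} m_A s - \frac{1}{2} m_B (s + t - u) + \frac{m_B (m_A + m_B)}{2 m_A}\,t,
\end{aligned}
\tag{1.4.66}
$$

where the last step uses $u = s + t - 2\,\vec v_A \cdot \vec v_B{}'$. This simplifies to

$$
m_A s - m_B t - m_A u = 0 .
\tag{1.4.67}
$$

Therefore, any Galilean invariant quantity can be written as a function of the Galilean invariant variables $s$ and $t$ (see footnote 5).

Now, let us go to the center of mass frame, where we have the relations (1.4.57) and (1.4.58). We find that $s$ and $t$ are related to the center of mass energy $E_{\rm com}$ and the scattering angle $\theta$ by

$$
\begin{aligned}
s &= \left|\frac{\vec p_A}{m_A} - \frac{\vec p_B}{m_B}\right|^2 = \left(\frac{1}{m_A} + \frac{1}{m_B}\right)^2 |\vec p_A|^2 = \frac{2(m_A + m_B)}{m_A m_B}\,E_{\rm com}, \\
t &= \left|\frac{\vec p_B{}'}{m_B} - \frac{\vec p_B}{m_B}\right|^2 = \left|\frac{\vec p_A{}'}{m_B} - \frac{\vec p_A}{m_B}\right|^2 = \frac{1}{m_B^2}\left(|\vec p_A|^2 - 2\,\vec p_A{}' \cdot \vec p_A + |\vec p_A{}'|^2\right) \\
&= \frac{2}{m_B^2}\,|\vec p_A|^2 (1 - \cos\theta) = \frac{4 m_A}{m_B (m_A + m_B)}\,E_{\rm com}(1 - \cos\theta),
\end{aligned}
\tag{1.4.68}
$$

where we used $|\vec p_A{}'| = |\vec p_A|$, which follows from the conservation of kinetic energy in the center of mass frame, and substituted $|\vec p_A|^2 = \frac{2 m_A m_B}{m_A + m_B} E_{\rm com}$ from (1.4.59).

Footnote 5: The variables $s$, $t$ and $u$ are the analogs of the Mandelstam variables of $2 \to 2$ scatterings in quantum field theory.

1.4.4 Inelastic scattering

As we discussed in the previous section, an inelastic scattering process does not conserve the kinetic energy, while the momentum is still conserved if the system is closed. Let us consider the following examples of inelastic scatterings.

Example. Show that $2 \to 1$ scatterings cannot be elastic.

Solution: Let the momenta of the incoming particles be $\vec p_A$ and $\vec p_B$. By the momentum conservation, the momentum of the outgoing particle, whose mass is $m_A + m_B$, is $\vec p_A + \vec p_B$. The total kinetic energy of the incoming particles and the kinetic energy of the outgoing particle are, respectively,

$$
\frac{|\vec p_A|^2}{2 m_A} + \frac{|\vec p_B|^2}{2 m_B}, \qquad \frac{|\vec p_A + \vec p_B|^2}{2(m_A + m_B)} .
\tag{1.4.69}
$$

Now, let us go to the center of mass frame. The outgoing particle has zero momentum and zero kinetic energy.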
However, the total kinetic energy of the incoming particles is positive, so the kinetic energy cannot be conserved.

Example. Consider a one-dimensional scattering process between a box $A$ and a system of two boxes $B$ and $C$ connected by a spring with spring constant $k$. Initially, the box $A$ has velocity $v_A$, the boxes $B$ and $C$ are at rest, and the spring is in the relaxed state. The scattering between the boxes $A$ and $B$ is elastic, but the scattering between the box $A$ and the system of the boxes $B$ and $C$ is inelastic, as some of the kinetic energy changes to the "internal energy", i.e. the potential energy of the spring plus the kinetic energy of the relative motion between the boxes $B$ and $C$. Find the total internal energy.

Figure 1.23: An inelastic scattering between a box and two boxes connected by a spring.

Solution: Since the scattering between the boxes $A$ and $B$ is elastic, we can use the formulae in (1.4.47) with $v_B = 0$ and find

$$
v_A' = \frac{m_A - m_B}{m_A + m_B}\,v_A, \qquad v_B' = \frac{2 m_A}{m_A + m_B}\,v_A .
\tag{1.4.70}
$$

The box $B$ would then compress the spring, and the spring would push the box $C$ to move. The system of the boxes $B$ and $C$ has a center of mass velocity $v_{BC,{\rm com}}$, which can be computed by the momentum conservation (of the $BC$ system)

$$
m_B v_B' = (m_B + m_C)\,v_{BC,{\rm com}} .
\tag{1.4.71}
$$

The internal energy is given by the kinetic energy of the box $B$ right after the collision minus the kinetic energy of the center of mass motion, i.e.

$$
\begin{aligned}
\frac{1}{2} m_B v_B'^2 - \frac{1}{2}(m_B + m_C)\,v_{BC,{\rm com}}^2
&= \frac{1}{2} m_B \left(\frac{2 m_A}{m_A + m_B}\,v_A\right)^2 - \frac{1}{2}(m_B + m_C)\left(\frac{m_B}{m_B + m_C}\,\frac{2 m_A}{m_A + m_B}\,v_A\right)^2 \\
&= \frac{2 m_A^2 m_B m_C}{(m_A + m_B)^2 (m_B + m_C)}\,v_A^2 .
\end{aligned}
\tag{1.4.72}
$$

1.5 Harmonic Oscillators, Pendulum

Harmonic oscillators are important in physics and engineering because they are the simplest type of oscillating system, and many real-world systems can be modeled as harmonic oscillators. Examples of harmonic oscillators include mass-spring systems, pendulums, and electric circuits.
Understanding the behavior of harmonic oscillators can help us understand more complex systems, as well as develop technologies such as clocks, musical instruments, and sensors. Additionally, quantum harmonic oscillators are an important concept in quantum mechanics, and the study of harmonic oscillators plays a crucial role in the development of quantum field theory. Overall, studying harmonic oscillators provides fundamental insights into the behavior of physical systems and their applications in various fields.

1.5.1 Simple harmonic motion

The classic harmonic oscillator is a fundamental concept in physics that describes a particle with mass $m$ coupled to a spring. The force acting on the particle is given by Hooke's law

$$
\vec F = -k\vec x,
\tag{1.5.1}
$$

where $\vec x$ is the position vector of the particle with respect to the equilibrium point, and $k$ is the spring constant.

According to Newton's second law $\vec F = m\vec a$, the equation of motion for the particle is given by $-k\vec x = m\ddot{\vec x}$, where $\vec a = \ddot{\vec x}$ is the acceleration of the particle. This is a second-order differential equation that governs the motion of the oscillator. To simplify the equation of motion, we define the angular frequency of the oscillator as

$$
\omega = \sqrt{\frac{k}{m}} .
\tag{1.5.2}
$$

The equation of motion for the particle now becomes

$$
\ddot{\vec x}(t) = -\omega^2\,\vec x(t),
\tag{1.5.3}
$$

which is a well-known differential equation that describes simple harmonic motion.

For simplicity, let us consider a particle that is constrained to move along a one-dimensional line. The generic solution of the differential equation for this particle is

$$
x(t) = x_m\cos(\omega t + \phi),
\tag{1.5.4}
$$

where $x_m$ is the amplitude, which is the maximum value of $x(t)$, and $\phi$ is the initial phase at time zero. The argument of the cosine function is called the phase. These terminologies are summarized in Figure 1.24. One can also read off these parameters from the $x$-$t$ plot. For instance, a negative shift in the initial phase will cause the cosine curve to shift towards the right (see Figure 1.25).
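To make the role of the two parameters in (1.5.4) concrete, the following sketch converts an initial position and velocity into an amplitude and an initial phase. The relations $x_m = \sqrt{x_0^2 + (v_0/\omega)^2}$ and $\phi = \operatorname{atan2}(-v_0/\omega, x_0)$ are not stated in the notes; they are obtained here by matching $x(0) = x_m\cos\phi$ and $\dot x(0) = -\omega x_m\sin\phi$, and the function names and sample values are illustrative assumptions.

```python
import math


def amplitude_and_phase(x0, v0, omega):
    """Amplitude x_m and initial phase phi of x(t) = x_m cos(omega t + phi).

    Derived from x(0) = x_m cos(phi) = x0 and
    x'(0) = -omega x_m sin(phi) = v0 (an assumption of this sketch).
    """
    xm = math.hypot(x0, v0 / omega)
    phi = math.atan2(-v0 / omega, x0)
    return xm, phi


def position(t, xm, omega, phi):
    """Displacement from Eq. (1.5.4)."""
    return xm * math.cos(omega * t + phi)


if __name__ == "__main__":
    # Hypothetical sample values.
    x0, v0, omega = 0.3, -1.2, 2.0
    xm, phi = amplitude_and_phase(x0, v0, omega)
    print(f"x_m = {xm:.6f}, phi = {phi:.6f} rad")
    print(f"x(0) = {position(0.0, xm, omega, phi):.6f}")
```

Using `atan2` rather than a plain arctangent keeps the phase in the correct quadrant when $x_0$ is negative or zero.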
Figure 1.24

The motion of the harmonic oscillator $x(t)$ is periodic, which means that if we change the time $t$ to $t + \frac{2\pi}{\omega}$, $x(t)$ remains the same. Thus, $T = \frac{2\pi}{\omega}$ is the period of the motion. The period $T$ is related to the frequency $f$ and the angular frequency $\omega$ as:

$$
\omega = \frac{2\pi}{T} = 2\pi f .
\tag{1.5.5}
$$

Here, $f$ is the frequency in Hertz (Hz = s$^{-1}$) and $\omega$ is the angular frequency in radians per second. The period and frequency are reciprocals of each other, and they are both related to the oscillation rate of the oscillator.

Figure 1.25

The velocity of a simple harmonic oscillator is given by the time derivative of its displacement:

$$
v(t) = \frac{dx(t)}{dt} = -\omega x_m\sin(\omega t + \phi) = \omega x_m\cos\left(\omega t + \phi + \frac{\pi}{2}\right) .
\tag{1.5.6}
$$

The velocity amplitude is given by $v_m = \omega x_m$, and the phase is shifted by $\pi/2$ compared to $x(t)$.

The acceleration of the oscillator is the time derivative of its velocity:

$$
a(t) = \frac{dv(t)}{dt} = -\omega^2 x_m\cos(\omega t + \phi) = \omega^2 x_m\cos(\omega t + \phi + \pi) .
\tag{1.5.7}
$$

The acceleration amplitude is given by $a_m = \omega^2 x_m$, and the phase is shifted by $\pi$ compared to $x(t)$. From these expressions, one can easily check that

$$
a(t) = -\omega^2 x(t),
\tag{1.5.8}
$$

which demonstrates that the acceleration of the simple harmonic oscillator is directly proportional to its displacement and points in the opposite direction. This is exactly what Newton's second law and Hooke's law require.

Let us now consider the energies of the harmonic oscillator. From the definition $-dU(x)/dx = F(x)$, the potential energy of the system is given by:

$$
U(t) = \frac{1}{2} k x(t)^2 = \frac{1}{2} k x_m^2\cos^2(\omega t + \phi) .
\tag{1.5.9}
$$

Similarly, the kinetic energy of the system is given by:

$$
K(t) = \frac{1}{2} m v(t)^2 = \frac{1}{2} m\omega^2 x_m^2\sin^2(\omega t + \phi) .
\tag{1.5.10}
$$

Both of these energies oscillate with time, as they depend on the position or velocity of the particle. The period of this oscillation is $\pi/\omega = T/2$.
However, the total energy, which is the sum of the potential and kinetic energies, is conserved and remains constant:

$$
E(t) = U(t) + K(t) = \frac{1}{2} k x_m^2 = \text{const},
\tag{1.5.11}
$$

where we used $m\omega^2 = k$. This total energy is equal to the potential energy when the displacement of the particle reaches its maximum, since the velocity, and hence the kinetic energy, is zero at that point.

1.5.2 Damped harmonic motion

Damped harmonic motion is a type of oscillation that occurs in real physical systems. It involves an object oscillating on a spring, but unlike the idealized simple harmonic motion, the system experiences damping due to various factors such as internal friction and air resistance. As a result, the amplitude of oscillation decreases with time, and the system eventually comes to rest. This decay in amplitude is a consequence of the energy of the system being dissipated into thermal energy.

To account for the energy loss due to friction or other dissipative effects, we introduce a damping force $\vec F_d$ proportional to the velocity $\vec v$ of the system. In the case of linear viscous damping, the force is given by

$$
\vec F_d = -b\vec v,
\tag{1.5.12}
$$

where $b$ is a positive constant known as the damping coefficient. The damping force acts in the direction opposite to the motion and causes the system to slow down. A larger value of $b$ corresponds to a stronger damping force, resulting in a faster decay of the oscillation amplitude. Note that this linear assumption is valid only for small velocities. The generic damping force is very complicated. It depends on the velocity, the properties of the medium, the geometry of the system, and many other factors.

With the damping force, we can write Newton's second law for the system as:

$$
-bv - kx = ma .
\tag{1.5.13}
$$

By substituting $dx/dt$ for $v$ and $d^2x/dt^2$ for $a$, we get the following second-order linear differential equation:

$$
m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = 0 .
\tag{1.5.14}
$$

The standard procedure for solving this equation is to substitute the trial solution $x(t) = e^{\lambda t}$, which gives the algebraic equation $m\lambda^2 + b\lambda + k = 0$. Depending on the sign of $b^2 - 4mk$, we have three types of damped harmonic motion with quite different physical behaviors. Figure 1.26 illustrates the position versus time plots for the three types of damped harmonic motion.

Figure 1.26: Three types of damped harmonic motions. (a) Underdamped; (b) Critically damped; (c) Overdamped.

1. Underdamped ($b < \sqrt{4mk}$, or $b^2 - 4mk < 0$): $\lambda$ has two complex roots $\lambda_{1/2} = -\frac{b}{2m} \pm i\omega'$, where

$$
\omega' = \sqrt{\frac{k}{m} - \frac{b^2}{4m^2}} < \omega = \sqrt{\frac{k}{m}}
\tag{1.5.15}
$$

is the angular frequency of the damped oscillator. The general solution can be written as

$$
x(t) = e^{-\frac{b}{2m}t}\left(c_1\cos\omega' t + c_2\sin\omega' t\right) = x_m e^{-\frac{b}{2m}t}\cos(\omega' t + \phi) .
\tag{1.5.16}
$$

Here, $x_m$ is the initial amplitude of the motion. The system oscillates (at the reduced frequency $\omega'$ compared to the undamped frequency $\omega$) with the amplitude gradually decreasing to zero due to the exponential factor $e^{-\frac{b}{2m}t}$.

Since the amplitude of the underdamped oscillator can be understood as $x_m e^{-\frac{b}{2m}t}$, the total energy of the oscillator can be approximated by

$$
E(t) \approx \frac{1}{2} k x_m^2 e^{-\frac{b}{m}t},
\tag{1.5.17}
$$

which decreases exponentially with time. The lost energy is converted into internal energy of the medium that is responsible for the damping.

If the damping is absent ($b = 0$), then $\omega' = \omega = \sqrt{k/m}$, which is the angular frequency of an undamped oscillator, and the solution of the differential equation reduces to the solution of the undamped oscillator.

2. Overdamped ($b > \sqrt{4mk}$, or $b^2 - 4mk > 0$): $\lambda$ has two real and distinct roots $\lambda_{1/2} = -\frac{b}{2m} \pm \omega'$ with

$$
\omega' = \sqrt{\frac{b^2}{4m^2} - \frac{k}{m}} .
\tag{1.5.18}
$$

The general solution can be written as the sum of two exponentially decaying ($\lambda_{1/2} < 0$) functions

$$
x(t) = c_1 e^{-|\lambda_1|t} + c_2 e^{-|\lambda_2|t},
\tag{1.5.19}
$$

where $c_1$ and $c_2$ are constants determined by the initial conditions. Therefore, the overdamped system returns (decays exponentially) to equilibrium without oscillating. The decay rate at long times is governed by the root with the smaller absolute value, $|\lambda_1| = -\lambda_1 = \frac{b}{2m} - \omega'$:

$$
x(t) \sim c_1 e^{-|\lambda_1|t}, \quad \text{as } t \to +\infty .
\tag{1.5.20}
$$

3. Critically damped ($b = \sqrt{4mk}$, or $b^2 - 4mk = 0$): $\lambda$ has two identical roots $\lambda_{1/2} = \lambda = -\frac{b}{2m}$. The angular frequency is $\omega' = 0$. The general solution is

$$
x(t) = (c_1 + c_2 t)\,e^{-|\lambda|t},
\tag{1.5.21}
$$

where $c_1$ and $c_2$ are determined by the initial conditions. The system returns to equilibrium as quickly as possible without oscillating (faster than in the overdamped case).

1.5.3 Forced oscillations and resonance

In contrast to the simple harmonic oscillator and the damped oscillator, we may encounter a system where a time-dependent external force $F(t)$ acts on the particle. One example is a swing being pushed by a person with a periodic force. From Newton's second law, we obtain the equation of motion $F(t) - bv(t) - kx(t) = ma(t)$. This is the second-order inhomogeneous linear differential equation

$$
m\frac{d^2x(t)}{dt^2} + b\frac{dx(t)}{dt} + kx(t) = F(t),
\tag{1.5.22}
$$

where the time-dependent external force $F(t)$ is the inhomogeneous term.

By using the Fourier transformation, a generic time-dependent force can be decomposed into a superposition of sine and cosine terms with different frequencies. For simplicity, let us consider a single-component periodic external force of the form:

$$
F(t) = F_m\cos(\omega_F t) .
\tag{1.5.23}
$$

The generic solution of the equation of motion, Eq. (1.5.22), is a superposition of a particular solution and the homogeneous solution of Eq. (1.5.14), as discussed in the previous subsection. We can assume the particular solution to be of the form

$$
x(t) = A\cos(\omega_F t + \phi),
\tag{1.5.24}
$$

where the amplitude $A$ and the phase $\phi$ are unknown coefficients. This solution corresponds to a steady state that is periodic in time. We can also consider it to be the solution in the limit $t \to \infty$, since the homogeneous (damped) part of the solution decays to zero in this limit.

By substituting the trial solution Eq. (1.5.24) into Eq.
(1.5.22), we have
$$A(k - m\omega_F^2)\cos(\omega_F t + \phi) - bA\omega_F\sin(\omega_F t + \phi) = F_m\cos(\omega_F t). \tag{1.5.25}$$
This equation can be rewritten as two equations for $A$ and $\phi$ by expanding $\cos(\omega_F t) = \cos(\omega_F t + \phi)\cos\phi + \sin(\omega_F t + \phi)\sin\phi$ on the right-hand side:
$$A(k - m\omega_F^2) = F_m\cos\phi, \tag{1.5.26}$$
$$-bA\omega_F = F_m\sin\phi. \tag{1.5.27}$$
So the amplitude and phase of the trial solution are determined as
$$A = \frac{F_m}{\sqrt{m^2(\omega_F^2 - \omega^2)^2 + b^2\omega_F^2}}, \tag{1.5.28}$$
$$\phi = \arctan\frac{b\,\omega_F}{m(\omega_F^2 - \omega^2)}, \tag{1.5.29}$$
where $\omega = \sqrt{k/m}$ is the natural/intrinsic frequency of the oscillator (without damping or external force).

The steady-state solution of the forced oscillation problem, given by Eq. (1.5.24) in the limit $t \to \infty$, describes a harmonic oscillation with the angular frequency $\omega_F$ inherited from the external force. This frequency is NOT related to the natural/intrinsic frequency $\omega = \sqrt{k/m}$ of the oscillator. In fact, the steady-state solution oscillates at the same frequency as the driving force, regardless of the intrinsic parameters ($m$ and $k$) of the oscillator and its initial conditions (see Figure 1.27).

Figure 1.27: Forced oscillations with different initial conditions for $k = 1$, $m = 1$, $F_m = 1$, $b = 0.7$ and $\omega_F = \omega = 1.1$. All the curves converge to the steady oscillation Eq. (1.5.24) in the large-$t$ limit. The frequency of the motion also converges to the external frequency $\omega_F$ of the applied force.

One interesting phenomenon in the forced harmonic oscillator is resonance, where the amplitude of the steady-state motion is maximized. The amplitude $A$ of the steady state in Eq. (1.5.28) depends on both the frequency $\omega_F$ of the external force and the natural frequency $\omega = \sqrt{k/m}$ of the oscillator (see Figure 1.28). From Eq. (1.5.28), for weak damping the amplitude reaches its maximum when the extrinsic and intrinsic frequencies coincide, i.e.,
$$\omega_F = \omega = \sqrt{\frac{k}{m}}, \tag{1.5.30}$$
which is independent of the initial conditions. (Strictly, the maximum of Eq. (1.5.28) lies at $\omega_F^2 = \omega^2 - b^2/(2m^2)$, which approaches $\omega^2$ as $b \to 0$.) The condition (1.5.30) is called resonance.
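The steady-state formulas can be checked numerically. The sketch below evaluates the amplitude of Eq. (1.5.28), picks the phase branch consistent with Eqs. (1.5.26) and (1.5.27), and scans for the driving frequency that maximizes the amplitude. The function names and the default parameters (taken from the caption of Figure 1.27) are illustrative assumptions, not part of the notes.

```python
import math


def steady_state_amplitude(omega_f, m=1.0, k=1.0, b=0.7, f_m=1.0):
    """Amplitude A of Eq. (1.5.28) for driving angular frequency omega_f."""
    omega_sq = k / m  # square of the natural frequency
    return f_m / math.sqrt(m**2 * (omega_f**2 - omega_sq) ** 2 + (b * omega_f) ** 2)


def steady_state_phase(omega_f, m=1.0, k=1.0, b=0.7):
    """Phase phi with cos(phi), sin(phi) signed as in Eqs. (1.5.26)-(1.5.27)."""
    return math.atan2(-b * omega_f, k - m * omega_f**2)


def ode_residual(t, omega_f, m=1.0, k=1.0, b=0.7, f_m=1.0):
    """m x'' + b x' + k x - F(t) for the trial solution; should vanish."""
    a = steady_state_amplitude(omega_f, m, k, b, f_m)
    arg = omega_f * t + steady_state_phase(omega_f, m, k, b)
    x = a * math.cos(arg)
    v = -a * omega_f * math.sin(arg)
    acc = -a * omega_f**2 * math.cos(arg)
    return m * acc + b * v + k * x - f_m * math.cos(omega_f * t)


def scan_peak(m=1.0, k=1.0, b=0.7, f_m=1.0, omega_max=3.0, n=20001):
    """Brute-force search for the driving frequency with the largest amplitude."""
    grid = (i * omega_max / (n - 1) for i in range(1, n))
    return max(grid, key=lambda w: steady_state_amplitude(w, m, k, b, f_m))


if __name__ == "__main__":
    print("A at omega_F = omega:", steady_state_amplitude(1.0))
    print("frequency of maximal A:", scan_peak())
```

For the fairly strong damping $b = 0.7$ used here, the scan finds the peak slightly below $\omega$; repeating it with a smaller $b$ moves the peak toward $\omega_F = \omega$.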
Figure 1.28: The amplitude $A$ of the steady state as a function of the frequency ratio $\omega_F/\omega$. The system exhibits resonance when the driving frequency is equal to the natural frequency of the oscillator, $\omega_F = \omega$.

1.5.4 Simple Pendulum

A pendulum is a physical system consisting of a mass $m$ (called the bob) suspended from a fixed point by a string or rod of length $L$ and free to swing back and forth under the influence of gravity. Pendulums are studied in physics because they exhibit simple harmonic motion in the small-angle limit.

Figure 1.29: A simple pendulum.

By Newton’s second law, the motion of a pendulum is governed by the equations
$$T - mg\cos\theta = mL\dot\theta^2, \tag{1.5.31}$$
$$mg\sin\theta = -mL\frac{d^2\theta}{dt^2}, \tag{1.5.32}$$
where $T$ is the tension in the string, $\theta$ is the angular displacement of the pendulum from its equilibrium position, and $g$ is the acceleration due to gravity. The second equation is a nonlinear differential equation that is difficult to solve exactly. In the case of small angles, where $\sin\theta \approx \theta$, it simplifies to
$$\frac{d^2\theta}{dt^2} + \frac{g}{L}\theta = 0, \tag{1.5.33}$$
which is exactly the linear differential equation Eq. (1.5.3) for a harmonic oscillator with natural frequency
$$\omega = \sqrt{\frac{g}{L}}. \tag{1.5.34}$$
For a pendulum released from rest, the solution of this differential equation is
$$\theta(t) = \theta_0\cos(\omega t), \tag{1.5.35}$$
where $\theta_0$ is the initial displacement of the pendulum from its equilibrium position. We observe that the period of the pendulum (denoted $T$ here, not to be confused with the tension above),
$$T = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{L}{g}}, \tag{1.5.36}$$
is independent of the mass $m$ of the pendulum.

As in the mass-spring harmonic oscillator, the energy of a simple pendulum is converted back and forth between kinetic energy and gravitational potential energy as the pendulum swings.

1.5.5 Double pendulum and chaos

The double pendulum is a classic example of a complex physical system that exhibits chaotic behavior. It consists of two pendulums attached to each other, with the second pendulum attached to the end of the first pendulum.
Figure 1.30: Double pendulum.

Consider a double pendulum consisting of two point masses $m_1$ and $m_2$, suspended by rigid, massless rods of lengths $L_1$ and $L_2$ respectively. The orientations of the two rods are described by the angles $\theta_1$ and $\theta_2$. The position of each mass can be described by two coordinates: $x_1 = L_1\sin\theta_1$ and $y_1 = -L_1\cos\theta_1$ for mass $m_1$, and $x_2 = L_1\sin\theta_1 + L_2\sin\theta_2$ and $y_2 = -L_1\cos\theta_1 - L_2\cos\theta_2$ for mass $m_2$. The velocities and accelerations of the masses are the time derivatives of these coordinates, $\dot x_i$, $\dot y_i$, $\ddot x_i$, and $\ddot y_i$, which can be expressed as functions of $\theta_i$, $\dot\theta_i$, and $\ddot\theta_i$.

Using Newton’s second law for each mass, we can write down the equations of motion:
$$m_1\ddot x_1 = -T_1\sin\theta_1 + T_2\sin\theta_2, \tag{1.5.37}$$
$$m_1\ddot y_1 = T_1\cos\theta_1 - T_2\cos\theta_2 - m_1 g, \tag{1.5.38}$$
$$m_2\ddot x_2 = -T_2\sin\theta_2, \tag{1.5.39}$$
$$m_2\ddot y_2 = T_2\cos\theta_2 - m_2 g, \tag{1.5.40}$$
where $T_1$ and $T_2$ are the tensions in the two rods. By eliminating the tensions, we obtain a set of coupled, nonlinear differential equations for the two angles $\theta_1$ and $\theta_2$:
$$\ddot\theta_1 = \frac{-g(2m_1+m_2)\sin\theta_1 - m_2 g\sin(\theta_1-2\theta_2) - 2\sin(\theta_1-\theta_2)\,m_2\left(\dot\theta_2^2 L_2 + \dot\theta_1^2 L_1\cos(\theta_1-\theta_2)\right)}{L_1\left(2m_1+m_2-m_2\cos(2\theta_1-2\theta_2)\right)}, \tag{1.5.41}$$
$$\ddot\theta_2 = \frac{2\sin(\theta_1-\theta_2)\left(\dot\theta_1^2 L_1(m_1+m_2) + g(m_1+m_2)\cos\theta_1 + \dot\theta_2^2 L_2 m_2\cos(\theta_1-\theta_2)\right)}{L_2\left(2m_1+m_2-m_2\cos(2\theta_1-2\theta_2)\right)}. \tag{1.5.42}$$
These equations are quite complex and difficult to solve analytically. However, the motion of the double pendulum can be simulated using numerical methods, such as the Runge-Kutta algorithm, which can accurately predict the motion for a given set of initial conditions. You can find interactive simulations of double pendulum on this website.

One of the most fascinating features of the double pendulum is its chaotic behavior. Chaos is a phenomenon where a small change in the initial conditions of a system can lead to vastly different outcomes in its motion.
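The numerical approach mentioned above can be sketched in a few lines. The following is a minimal illustration, assuming the classical fourth-order Runge-Kutta scheme with a fixed time step; the function names, the parameter ordering `(m1, m2, l1, l2)`, and the unit masses and lengths are hypothetical choices made here, not taken from the notes. The energy function is included only as a consistency check on the integration.

```python
import math

G_ACC = 9.8  # gravitational acceleration in m/s^2


def accelerations(th1, th2, w1, w2, m1, m2, l1, l2, g=G_ACC):
    """Angular accelerations from Eqs. (1.5.41) and (1.5.42)."""
    d = th1 - th2
    den = 2 * m1 + m2 - m2 * math.cos(2 * d)
    a1 = (-g * (2 * m1 + m2) * math.sin(th1)
          - m2 * g * math.sin(th1 - 2 * th2)
          - 2 * math.sin(d) * m2 * (w2**2 * l2 + w1**2 * l1 * math.cos(d))) / (l1 * den)
    a2 = (2 * math.sin(d) * (w1**2 * l1 * (m1 + m2)
                             + g * (m1 + m2) * math.cos(th1)
                             + w2**2 * l2 * m2 * math.cos(d))) / (l2 * den)
    return a1, a2


def rk4_step(state, dt, params):
    """One classical Runge-Kutta step for state = (th1, th2, w1, w2)."""
    def deriv(s):
        th1, th2, w1, w2 = s
        a1, a2 = accelerations(th1, th2, w1, w2, *params)
        return (w1, w2, a1, a2)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (p + 2 * q + 2 * r + u)
                 for s, p, q, r, u in zip(state, k1, k2, k3, k4))


def simulate(state, t_end, dt=1e-3, params=(1.0, 1.0, 1.0, 1.0)):
    """Integrate from t = 0 to t_end and return the final state."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, params)
    return state


def energy(state, params=(1.0, 1.0, 1.0, 1.0), g=G_ACC):
    """Kinetic plus gravitational potential energy of the two masses."""
    th1, th2, w1, w2 = state
    m1, m2, l1, l2 = params
    kinetic = (0.5 * m1 * (l1 * w1) ** 2
               + 0.5 * m2 * ((l1 * w1) ** 2 + (l2 * w2) ** 2
                             + 2 * l1 * l2 * w1 * w2 * math.cos(th1 - th2)))
    potential = -(m1 + m2) * g * l1 * math.cos(th1) - m2 * g * l2 * math.cos(th2)
    return kinetic + potential


if __name__ == "__main__":
    a = simulate((2.0, 2.0, 0.0, 0.0), 10.0)
    b = simulate((2.0 + 1e-8, 2.0, 0.0, 0.0), 10.0)
    print("difference in theta_1 after 10 s:", abs(a[0] - b[0]))
```

Running the two simulations in the `__main__` block, which differ only by $10^{-8}$ rad in the initial $\theta_1$, is one way to observe how quickly nearby trajectories separate at large angles.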
In the case of the double pendulum, even the slightest differences in the initial positions or velocities of the pendulums can result in completely different trajectories.

Figure 1.31: Long exposure of double pendulum exhibiting chaotic motion (tracked with an LED).

This characteristic of the double pendulum has significant implications. It means that it is practically impossible to predict the long-term motion of the system, since any small measurement error or uncertainty in the initial conditions can lead to dramatically different results. This is known as the “butterfly effect”: even the flapping of a butterfly’s wings can potentially cause a hurricane on the other side of the world.

1.6 The Theory of Gravitation

One of the earliest goals of physics is to understand the gravitational force that holds us to Earth, holds the Moon in orbit around Earth, and holds Earth in orbit around the Sun. It also reaches out through the whole Milky Way galaxy, holding together the billions and billions of stars in the Galaxy and the countless molecules and dust particles between stars. The gravitational force also reaches across intergalactic space, holding together the Local Group of galaxies, which includes, in addition to the Milky Way, the Andromeda Galaxy at a distance of $2.3 \times 10^6$ light-years from Earth, plus several closer dwarf galaxies. The Local Group is part of the Local Supercluster of galaxies, which is being drawn by the gravitational force toward an exceptionally massive region of space called the Great Attractor. This region appears to be about $3.0 \times 10^8$ light-years from Earth, on the opposite side of the Milky Way. And the gravitational force is even more far-reaching because it attempts to hold together the entire universe, which is still expanding.

Gravitation was the first fundamental force whose law became known, thanks to Newton’s great contribution, i.e., Newton’s law of universal gravitation, which is the focus of the current section.
However, the gravitational force remains one of the most mysterious forces in nature and is still not fully understood by physicists.

1.6.1 Newton’s law of Gravitation

If the legend is true, in 1665 a falling apple inspired the 23-year-old Isaac Newton to formulate his law of universal gravitation. Newton recognized that the force pulling the apple to the ground is also responsible for holding the Moon in its orbit. He further made the far-reaching generalization that every body in the universe attracts every other body. This tendency of bodies to move toward one another is called gravitation, and it originates from the mass of each body.

Physics law 5 (Newton’s Law of Gravitation). Any point particle in the universe attracts any other point particle with a gravitational force whose magnitude is
$$F = G\frac{m_1 m_2}{r^2},$$
where $m_1$ and $m_2$ are the masses of the two particles, $r$ is the distance between them, and $G = 6.67 \times 10^{-11}\ \mathrm{N\cdot m^2/kg^2}$ is the gravitational constant. Here, “particle 1 at $\vec r_1$ attracts particle 2 at $\vec r_2$” means that the gravitational force on particle 2 is along the direction from $\vec r_2$ to $\vec r_1$, i.e., the gravitational force on particle 2 in vector form is
$$\vec F_{12} = G\frac{m_1 m_2}{|\vec r_1 - \vec r_2|^2}\,\frac{\vec r_1 - \vec r_2}{|\vec r_1 - \vec r_2|}. \tag{1.6.1}$$

In reality, two objects can be treated as point particles only when the distance between them is sufficiently large. When the two objects are close to each other compared to their own sizes, the formula (1.6.1) cannot be applied directly. In particular, $\vec r_1$ and $\vec r_2$ are no longer well-defined for the two objects.

In general, let $D \subset \mathbb{R}^3$ denote the domain occupied by an object. We decompose it into a union of many small parts (called differential elements), each of which can be treated as a point particle. A small differential element at $\vec r = (x, y, z)$ can be thought of as a box with side lengths $\Delta x$, $\Delta y$, and $\Delta z$. Suppose the density of the object at $\vec r = (x, y, z)$ is given by $\rho(\vec r)$.
Then, the gravitational force of this differential element on a point particle of mass $m$ at position $\vec r'$ is given by
$$Gm\rho(\vec r)\,\frac{\vec r - \vec r'}{|\vec r - \vec r'|^3}\,\Delta x\,\Delta y\,\Delta z.$$
By the principle of superposition for forces, the total gravitational force is given by
$$\sum_{\text{differential elements in } D} Gm\rho(\vec r)\,\frac{\vec r - \vec r'}{|\vec r - \vec r'|^3}\,\Delta x\,\Delta y\,\Delta z.$$
Taking $\Delta x$, $\Delta y$ and $\Delta z$ to zero, the summation above becomes a triple integral
$$\iiint_D Gm\rho(\vec r)\,\frac{\vec r - \vec r'}{|\vec r - \vec r'|^3}\,dx\,dy\,dz, \qquad \vec r = (x, y, z). \tag{1.6.2}$$
You will learn how to evaluate such multi-variable integrals in your calculus courses; it is not the focus of this course.

In applications, the integral (1.6.2) can be very hard to calculate analytically. We hope to find a simpler way to calculate the gravitation between two objects. Intuitively, it may be natural to conjecture that we can still apply (1.6.1) directly with $\vec r_1$ and $\vec r_2$ being the centers of mass of the two objects. Unfortunately, this is not true in general. However, there are some special cases where it is true, which simplifies the calculations greatly.

Proposition 1.6.1. Consider a ball with uniform density (mass per unit volume). Suppose the center of the ball is at $\vec r_1$ and the ball has mass $m_1$. The magnitude of the gravitational force from this ball on a particle of mass $m_2$, located outside the ball at $\vec r_2$, is then given by (1.6.1). In other words, when calculating the gravitation, a uniform ball can be treated as a point particle of the same mass located at its center.

From this proposition, we can easily derive that when calculating the gravitational force between two balls, they can be treated as two point particles located at their respective centers (think about why). This means that when we consider gravitation between planets, idealized as uniform balls, we can treat them as point particles regardless of the distance between them.

Let us assume that Earth is a uniform sphere of mass $M$.
The magnitude of the gravitational force from Earth on a particle of mass $m$, located outside Earth at a distance $r$ from Earth’s center, is then given by $GMm/r^2$. If the particle is released, it will fall toward the center of Earth with the gravitational acceleration $g = GM/r^2$. In particular, at the surface of Earth, the gravitational acceleration is about $9.8\ \mathrm{m/s^2}$. Moreover, the radius of Earth is about $r = 6371$ km. Then, we can calculate the mass of Earth as
$$M = \frac{gr^2}{G} \approx 5.96 \times 10^{24}\ \mathrm{kg}.$$
This is a simple example of a way to measure the masses of planets: we first observe the acceleration caused by the gravitation of a planet and measure various distances with astronomical observations, and then calculate the mass using Newton’s law of gravitation.

Example. A geostationary satellite is a geosynchronous satellite in a geostationary orbit: a circular orbit directly above Earth’s equator with an orbital period equal to Earth’s rotation period. Determine the height of a geostationary satellite.

Solution: Let $R$ be the distance from the geostationary satellite to the center of Earth, $r = 6371$ km be the radius of Earth, and $g = 9.8\ \mathrm{m/s^2}$ be the gravitational acceleration at the ground. We have seen that $GM = gr^2$. The angular speed of the satellite is
$$\omega = \frac{2\pi}{24 \times 3600}\ \mathrm{rad/s}.$$
Since the size of the satellite is very small compared with $r$, it can be treated as a point particle. Then, by Newton’s second law, we have
$$m\omega^2 R = \frac{GMm}{R^2} = \frac{mgr^2}{R^2} \quad\Rightarrow\quad R = \left(\frac{gr^2}{\omega^2}\right)^{1/3}.$$
The height is then given by $R - r$.

We have an immediate generalization of Proposition 1.6.1 to the radially symmetric case.

Proposition 1.6.2. Proposition 1.6.1 also holds for a ball with radially symmetric density, i.e., a density that depends only on the distance to the center.

In particular, this proposition holds for the gravitational force outside a uniform shell of matter, i.e., the region between two spheres with the same center.
On the other hand, we also have the following simple fact inside the shell:

Proposition 1.6.3. A shell of matter with radially symmetric density exerts no net gravitational force on a particle located inside it.

By Proposition 1.6.2 and Proposition 1.6.3, to calculate the gravitational force at a point $\vec x$ inside a ball with radially symmetric density, we can ignore the shell outside $\vec x$ and treat the matter inside the concentric sphere passing through $\vec x$ as a point particle located at the center of the ball.

Example. Assume that Earth is a uniform ball. Suppose there is a tunnel connecting the north pole and the south pole. Drop a ball into the tunnel from the north pole. Determine the time the ball needs to reach the south pole.

Solution: Let $R$ be the radius of Earth, $M$ the mass of Earth, and $\rho$ the density of Earth. We set up an axis along the tunnel pointing from the north pole to the south pole. Let the origin be the center of Earth, so that the coordinates of the north and south poles are $-R$ and $R$ respectively. Denote by $r(t)$ the coordinate of the ball, with $r(0) = -R$. By Proposition 1.6.2 and Proposition 1.6.3, the gravitational acceleration of the ball at $r(t)$ is equal to
$$a(t) = -\operatorname{sgn}(r(t))\,\frac{G\rho\,\frac{4}{3}\pi|r(t)|^3}{|r(t)|^2} = -\frac{4}{3}\pi G\rho\,r(t) = -\frac{GM}{R^3}\,r(t) = -\frac{g}{R}\,r(t), \tag{1.6.3}$$
where the sign function
$$\operatorname{sgn}(r(t)) = \begin{cases} 1, & r(t) > 0 \\ 0, & r(t) = 0 \\ -1, & r(t) < 0 \end{cases}$$
accounts for the fact that the gravitational force points toward the center of Earth, and we used $GM = gR^2$ in the last step. Thus, we get the differential equation
$$\ddot r(t) = -\frac{g}{R}\,r(t),$$
with initial conditions $r(0) = -R$ and $\dot r(0) = 0$. This again describes a harmonic oscillator, which has the solution
$$r(t) = -R\cos\left(\sqrt{\frac{g}{R}}\,t\right).$$
To reach the south pole at $r(t) = R$, the ball needs half of the period of the oscillation, which is
$$t = \pi\sqrt{\frac{R}{g}}.$$

It is possible to prove Propositions 1.6.1–1.6.3 by evaluating the integral (1.6.2) directly. But there is a much simpler way to prove them based on a deep theorem in multi-variable calculus, called Gauss’s law; see Section 1.6.5 below.
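The three closed-form results of this subsection (the mass of Earth, the height of a geostationary orbit, and the pole-to-pole transit time) are easy to evaluate numerically. The sketch below simply plugs in the values of $G$, $g$ and Earth’s radius quoted in the text; the function and constant names are illustrative assumptions, and a 24-hour rotation period is used as in the example above.

```python
import math

G_NEWTON = 6.67e-11   # gravitational constant, N m^2 / kg^2
G_SURFACE = 9.8       # gravitational acceleration at the ground, m/s^2
R_EARTH = 6.371e6     # radius of Earth, m


def earth_mass(g=G_SURFACE, r=R_EARTH, big_g=G_NEWTON):
    """M = g r^2 / G."""
    return g * r**2 / big_g


def geostationary_height(g=G_SURFACE, r=R_EARTH, period=24 * 3600):
    """Height R - r with R = (g r^2 / omega^2)^(1/3)."""
    omega = 2 * math.pi / period
    orbit_radius = (g * r**2 / omega**2) ** (1 / 3)
    return orbit_radius - r


def tunnel_transit_time(g=G_SURFACE, r=R_EARTH):
    """Half the oscillation period, t = pi * sqrt(R / g)."""
    return math.pi * math.sqrt(r / g)


if __name__ == "__main__":
    print(f"Earth mass: {earth_mass():.3e} kg")
    print(f"Geostationary height: {geostationary_height() / 1e3:.0f} km")
    print(f"Pole-to-pole time: {tunnel_transit_time() / 60:.1f} min")
```

With these inputs the geostationary height comes out at roughly 36,000 km and the transit time at roughly 42 minutes.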
1.6.2 Gravitational potential energy

In Section 1.4, we mentioned that the gravitational force is a conservative force and is associated with a potential energy. There we were careful to keep the particle near Earth’s surface so that we could regard the gravitational force as constant. By Newton’s Law of Gravitation, we know that the gravitational force is generally not constant. Is it still conservative, and, if yes, what is the associated potential energy?

Suppose we already know that the gravitational force is conservative. We now calculate the potential energy. Consider a point particle of mass $M$ located at the origin and a particle of mass $m$ located at $\vec r = (x, y, z)$. Suppose the system has potential energy 0 when $r \to \infty$. Now, suppose we move the particle from $\infty$ to $\vec r$ along the radial direction, i.e., from $\infty\,\vec e_r$ to $\vec r$ with $\vec e_r = \vec r/|\vec r|$. Then, it is easy to calculate the work done by the gravitational force during this process:
$$W = \int_r^\infty \frac{GMm}{x^2}\,dx = \frac{GMm}{r}.$$
Thus, the potential energy, if it exists, must be equal to $-W = -GMm/r$.

Proposition 1.6.4. Gravitational force is conservative. Furthermore, the potential energy of a system of two particles with masses $m_1$ and $m_2$ is
$$V = -\frac{Gm_1 m_2}{r} + C, \tag{1.6.4}$$
where $r$ is the distance between the two particles. In particular, we can choose $C = 0$ if the reference potential energy at $\infty$ is taken to be 0.

Proof. To show that $V$ is a potential energy, it suffices to show that its negative gradient is equal to the gravitational force. Suppose particle 1 is at the origin, and particle 2 is at $\vec r = (x, y, z)$. Then,
$$V(\vec r) = -\frac{Gm_1 m_2}{(x^2 + y^2 + z^2)^{1/2}} + C.$$
We can calculate its gradient as
$$-\nabla V(\vec r) = -(\partial_x V, \partial_y V, \partial_z V) = -\frac{Gm_1 m_2}{(x^2 + y^2 + z^2)^{3/2}}\,(x, y, z) = -\frac{Gm_1 m_2}{r^2}\,\frac{\vec r}{r},$$
which is exactly the gravitational force on particle 2. Hence, $V$ is indeed the gravitational potential energy between the two particles, which implies that the gravitational force is conservative.
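As a numerical sanity check of the proof above, one can compare the analytic force with a finite-difference approximation of $-\nabla V$. The sketch below uses central differences with a step of 1 m, which is tiny compared with planetary distances; the function names, the step size, and the sample point are illustrative assumptions.

```python
import math

G_NEWTON = 6.67e-11  # gravitational constant, N m^2 / kg^2


def potential(pos, m1, m2, big_g=G_NEWTON):
    """V = -G m1 m2 / r with particle 1 at the origin and C = 0."""
    x, y, z = pos
    return -big_g * m1 * m2 / math.sqrt(x * x + y * y + z * z)


def analytic_force(pos, m1, m2, big_g=G_NEWTON):
    """Force on particle 2: -G m1 m2 (x, y, z) / r^3."""
    x, y, z = pos
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return tuple(-big_g * m1 * m2 * c / r_cubed for c in pos)


def numeric_force(pos, m1, m2, h=1.0):
    """-grad V estimated with central differences of step h (in metres)."""
    force = []
    for i in range(3):
        plus, minus = list(pos), list(pos)
        plus[i] += h
        minus[i] -= h
        dv = potential(plus, m1, m2) - potential(minus, m1, m2)
        force.append(-dv / (2 * h))
    return tuple(force)


if __name__ == "__main__":
    point = (3.0e6, 4.0e6, 5.0e6)   # a point about 7000 km from the origin
    print(analytic_force(point, 5.96e24, 1.0))
    print(numeric_force(point, 5.96e24, 1.0))
```

The two printed vectors should agree to many digits, and both point from the particle back toward the origin.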
By this proposition, the gravitational force satisfies the path independence property: if we move the pair of particles from one configuration to another, the net work done by the gravitational force during this process is equal to the negative of the change in the potential energy and does not depend on the path taken by the particles.

We remark that the potential energy given by Equation (1.6.4) is a property of the system of two particles rather than of either particle alone. There is no way to divide this energy and say that so much belongs to one particle and so much to the other. However, if $m_1 \gg m_2$, as is true for problems concerning Earth (mass $m_1$) and an object (mass $m_2$) near its surface, we often speak of “the potential energy of particle 2”, because, when the object moves in the vicinity of Earth, changes in the potential energy of the system appear almost entirely as changes in the kinetic energy of the object, while changes in the kinetic energy of Earth are too small to be measured. When we speak of the potential energy of bodies of comparable mass, however, we have to be careful to treat them as a system.

In general, if our system contains more than two particles, we consider each pair of particles in turn, calculate the gravitational potential energy of that pair as if the other particles were not there, and then algebraically sum the results. For example, for a system of three particles with masses $m_1$, $m_2$ and $m_3$ and distances $r_{12}$, $r_{23}$ and $r_{13}$ between each pair of them, the potential energy is
$$V = -\frac{Gm_1 m_2}{r_{12}} - \frac{Gm_2 m_3}{r_{23}} - \frac{Gm_1 m_3}{r_{13}}.$$
More generally, as in the discussion leading to (1.6.2), the potential energy between an object occupying $D \subset \mathbb{R}^3$ and a point particle of mass $m$ at position $\vec r'$ is calculated as an integral:
$$V = -\iiint_D \frac{Gm\rho(\vec r)}{|\vec r - \vec r'|}\,dx\,dy\,dz, \qquad \vec r = (x, y, z). \tag{1.6.5}$$
Again, the potential energy between a radially symmetric ball and a particle can be evaluated in a much simpler way by using Propositions 1.6.2 and 1.6.3.

Example.
Assume that Earth is a uniform ball. Determine the gravitational potential energy of a point particle of mass $m$ outside and inside Earth.

Solution: Let $R$ be the radius of Earth, $M$ the mass of Earth, and $\rho$ the density of Earth. First, suppose the location $\vec r$ of the point particle is outside Earth. Then, with Proposition 1.6.2, we obtain the work done by the gravitational force from $\infty$ to $\vec r$ along the radial direction as $W = GMm/r$, thus giving the potential energy
$$V(r) = -\frac{GMm}{r}, \qquad \text{for } r \ge R.$$
Now, suppose $\vec r$ is inside Earth, i.e., $r < R$. In Equation (1.6.3), we have seen that the gravitational force on the particle at $x\,\vec e_r$ for some $r \le x \le R$ has magnitude
$$\frac{GMm}{R^3}\,x.$$
Thus, the work done by the gravitational force from $R\,\vec e_r$ to $\vec r$ along the radial direction is equal to
$$\int_r^R \frac{GMm}{R^3}\,x\,dx = \frac{1}{2}\frac{GMm}{R^3}\,(R^2 - r^2),$$
which gives a change in the potential energy of
$$V(r) - V(R) = -\frac{1}{2}\frac{GMm}{R^3}\,(R^2 - r^2), \qquad \text{for } r < R.$$
To summarize, we have
$$V(r) = \begin{cases} -\dfrac{GMm}{r} = -\dfrac{mgR^2}{r}, & r \ge R, \\[2ex] -\dfrac{GMm}{R} - \dfrac{1}{2}\dfrac{GMm}{R^3}\,(R^2 - r^2) = -mgR - \dfrac{mg}{2R}\,(R^2 - r^2), & 0 \le r < R, \end{cases}$$
where we also rewrote the results using $GM = gR^2$.

With a similar argument, we obtain that the gravitational potential energy between two radially symmetric balls with masses $m_1$, $m_2$ and distance $r$ between their centers is given by (1.6.4).

Example. A satellite with mass $m$ moves in a circular orbit of radius $r$ around Earth. Determine its mechanical energy $E$.

Solution: The potential energy of the satellite is
$$V = -\frac{GMm}{r}.$$
To find the kinetic energy, we write Newton’s second law as
$$\frac{GMm}{r^2} = m\frac{v^2}{r},$$
where $v^2/r$ is the centripetal acceleration of the satellite. From this equation, we get the kinetic energy
$$K = \frac{1}{2}mv^2 = \frac{GMm}{2r}.$$
Therefore, the total mechanical energy is
$$E = K + V = -\frac{GMm}{2r}. \tag{1.6.6}$$

Following Section 1.4, we have the celebrated energy conservation for an isolated system interacting only through gravitational forces: the mechanical energy of the system, i.e., the kinetic energy plus the gravitational potential energy, does not change.

We now use mechanical energy conservation to study an important concept called escape speed. If you fire a projectile upward, it will slow, stop momentarily, and return to Earth if the initial speed is too low. There is a certain minimum initial speed that will cause it to move upward forever, theoretically coming to rest only at infinity. This minimum initial speed is called the (Earth) escape speed.

We now determine the escape speed of Earth using energy conservation. Suppose the escape speed is $v_2$. Then, a projectile of mass $m$ launched from Earth’s surface at this speed has mechanical energy
$$E = \frac{1}{2}mv_2^2 - \frac{GMm}{R} = \frac{1}{2}mv_2^2 - mgR,$$
in which $M$ is the mass of Earth and $R$ is its radius. If the projectile can reach infinity, it has zero potential energy and at least zero kinetic energy there. For the minimum launch speed, the kinetic energy at infinity is exactly zero, so the mechanical energy is zero and
$$\frac{1}{2}mv_2^2 - mgR = 0 \quad\Rightarrow\quad v_2 = \sqrt{2gR}.$$
Plugging in $g = 9.8\ \mathrm{m/s^2}$ and $R = 6371$ km, we get the escape speed for Earth as $v_2 = 11.2$ km/s, which is also called the second cosmic velocity for Earth. The “third cosmic velocity” is the speed that a spacecraft needs to attain in order to be able to leave our solar system, i.e., to escape the gravitation of Earth and the Sun. With a similar method, one can obtain the third cosmic velocity as 16.7 km/s. Finally, let’s determine the “first cosmic velocity”.

Example. Calculate the first cosmic velocity $v_1$ of Earth, defined as the minimum initial speed for a projectile to move in a circular orbit around Earth.

Solution: By (1.6.6), the mechanical energy of a projectile moving in a circular orbit around Earth is
$$E = -\frac{GMm}{2r},$$
where $m$ is the mass of the projectile and $r$ is the radius of the orbit.
Note that this energy is minimum when $r$ is equal to the radius $R$ of Earth. Moreover, the potential energy at the surface of Earth is $-GMm/R$. Thus, by energy conservation,
$$\frac{1}{2}mv_1^2 - \frac{GMm}{R} = -\frac{GMm}{2R} \quad\Rightarrow\quad v_1 = \sqrt{\frac{GM}{R}} = \sqrt{gR} \approx 7.9\ \mathrm{km/s}.$$

We have seen that the escape speed of a planet of mass $M$ and radius $R$ is
$$\sqrt{\frac{2GM}{R}},$$
which becomes larger as $M$ becomes larger and $R$ becomes smaller. In particular, when $M$ is so large and $R$ so small that the escape speed exceeds the speed of light, neither particles nor light can escape from the surface, giving some of the most mysterious structures in the universe: “black holes”. A black hole may form when a star considerably larger than our Sun burns out: the gravitational force between all its particles can then cause the star to collapse in on itself. Any star coming too near a black hole can be ripped apart by the strong gravitational force (i.e., the tidal force) and pulled into the hole. Enough captures like this yield a supermassive black hole. Such mysterious monsters appear to be common in the universe, although observing them is very difficult.

1.6.3 Kepler’s laws

The motions of the planets, as they seemingly wander against the background of the stars, have been a puzzle since the dawn of history. Johannes Kepler (1571–1630), after a lifetime of study of the extensive data on planetary motions in the solar system, worked out the empirical laws governing these motions that now bear his name.

Physics law 6 (Kepler’s first law: the law of orbits). All planets move in elliptical orbits, with the Sun at one focus.

Note that a circular orbit is a special case of this law in which the two foci merge into a single central point. Our Earth is indeed on an elliptical orbit around the Sun, although the eccentricity $e$ of the orbit is not large: $e \approx 0.0167$. Recall that eccentricity is defined as
$$e = \frac{c}{a} = \sqrt{1 - \frac{b^2}{a^2}},$$
where $a$ is the semi-major axis, $b$ is the semi-minor axis, and $c$ is the distance from a focus to the center.
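To get a feeling for how close to circular Earth’s orbit is, the eccentricity formula can be inverted for the axis ratio $b/a$. The helper names below are hypothetical; the only input taken from the text is $e \approx 0.0167$.

```python
import math


def eccentricity(a, b):
    """e = sqrt(1 - b^2 / a^2) for semi-major axis a >= semi-minor axis b > 0."""
    if not (a >= b > 0):
        raise ValueError("require a >= b > 0")
    return math.sqrt(1 - (b / a) ** 2)


def axis_ratio(e):
    """b / a = sqrt(1 - e^2) for an ellipse with 0 <= e < 1."""
    if not (0 <= e < 1):
        raise ValueError("require 0 <= e < 1")
    return math.sqrt(1 - e * e)


def focal_distance(a, e):
    """c = e * a, the distance from a focus to the center."""
    return e * a


if __name__ == "__main__":
    e_earth = 0.0167
    print("b/a for Earth's orbit:", axis_ratio(e_earth))
    print("c/a for Earth's orbit:", focal_distance(1.0, e_earth))
```

For $e \approx 0.0167$ the two semi-axes differ by only about 0.014 percent, while the Sun sits about 1.7 percent of the semi-major axis away from the center of the ellipse.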
Physics law 7 (Kepler’s second law: the law of areas). A line that connects a planet to the Sun sweeps out equal areas in the plane of the planet’s orbit in equal time intervals; that is, the rate $dA/dt$ at which it sweeps out area $A$ is constant.

Qualitatively, this second law tells us that the planet moves most slowly when it is farthest from the Sun and most rapidly when it is nearest to the Sun. Kepler’s second law is actually equivalent to the law of conservation of angular momentum, and we will prove it below using Newton’s second law and the law of gravitation.

Physics law 8 (Kepler’s third law: the law of periods). The square of the period of any planet is proportional to the cube of the semi-major axis of its orbit.

We now illustrate the third law with a circular orbit. Applying Newton’s second law to the orbiting planet with mass $m$ yields
$$\frac{GMm}{r^2} = m\omega^2 r,$$
where $M$ is the mass of the Sun and $r$ is the radius of the orbit. On the other hand, given the period $T$, the angular speed is equal to $2\pi/T$. Thus, the above equation gives
$$\frac{GM}{r^2} = \frac{4\pi^2}{T^2}\,r \quad\Rightarrow\quad \frac{T^2}{r^3} = \frac{4\pi^2}{GM}.$$
The quantity on the right-hand side is a constant that depends only on the mass $M$ of the Sun. The above equation also holds for elliptical orbits, provided we replace $r$ with $a$, the semi-major axis of the ellipse.

Although Kepler’s laws are about planets orbiting the Sun, they hold equally well for satellites, either natural or artificial, orbiting Earth or any other massive central body. Kepler’s laws are phenomenological laws and can indeed be derived from Newton’s second law and the law of gravitation (although historically, Newton’s law of gravitation was inspired by Kepler’s three laws). We now prove Kepler’s second law; the proofs of Kepler’s first and third laws are more advanced and are not required in this course.

Choose coordinates such that the origin is at the center of the Sun. Suppose at time $t$, the planet is at $\vec r = (x, y)$ and its velocity is $\vec v = (v_x, v_y)$.
We can arrange that $xv_y - yv_x \ge 0$ (otherwise, we can reverse the direction of one of the coordinate axes). During an infinitesimal time $\Delta t$, the planet travels $\vec v\,\Delta t$. Then, using the cross product formula for the area of a triangle, the area swept by the planet during $\Delta t$ is
$$\Delta A = \frac{1}{2}\,|\vec r \times (\vec v\,\Delta t)| = \frac{1}{2}(xv_y - yv_x)\,\Delta t.$$
Hence, the instantaneous rate at which the area is being swept out is equal to
$$\frac{dA}{dt} = \frac{1}{2}(xv_y - yv_x).$$
Then, Kepler’s second law is equivalent to the statement that this rate is constant, i.e.,
$$\frac{d^2A}{dt^2} = 0. \tag{1.6.7}$$
In fact, for this system, we can define the angular momentum of the planet as
$$\vec L = \vec r \times \vec p = m\,\vec r \times \vec v = m(xv_y - yv_x)\,\vec k,$$
where $\vec k$ is the unit basis vector along the positive $z$ direction. Then, equation (1.6.7) is equivalent to the conservation of $\vec L$:
$$\frac{d\vec L}{dt} = 0. \tag{1.6.8}$$
To check (1.6.8), we take the derivative of $\vec L$ and use that $\dot{\vec r} = \vec v$ and $\dot{\vec p} = \vec F_g$ by Newton’s second law, where $\vec F_g$ is the gravitational force on the planet. In this way, we get
$$\frac{d\vec L}{dt} = \vec v \times \vec p + \vec r \times \vec F_g = 0,$$
where, in the last step, we use that $\vec v$ is parallel to $\vec p = m\vec v$ and $\vec r$ is parallel to $\vec F_g$ by Newton’s law of gravitation. This shows (1.6.8) and hence proves Kepler’s second law.

1.6.4 Gravitation on Earth

So far, we have assumed that Earth is a uniform ball and an inertial frame by neglecting its rotation. This simplification has allowed us to assume that the free-fall acceleration $g$ of a particle is the same as the particle’s gravitational acceleration. Furthermore, we assumed that $g$ has the constant value $g = GM/r^2$ at any place on Earth’s surface. However, any $g$ value measured at a given location will differ from the value $a_g = GM/r^2$ for that location for the following three reasons:

• Earth’s mass is not uniformly distributed. The density of Earth varies radially, and the density of the crust (outer section) varies from region to region over Earth’s surface. Thus, $g$ varies from region to region over the surface.

• Earth is not a sphere.
Earth is approximately an ellipsoid, flattened at the poles and bulging at the equator. Its equatorial radius (from its center point out to the equator) is greater than its polar radius (from its center point out to either the north or south pole) by 21 km. Thus, the free-fall acceleration $g$ increases if you measure it while moving at sea level from the equator toward the north or south pole. As you move, you are actually getting closer to the center of Earth, and thus, by Newton’s law of gravitation, $g$ increases.

• Earth is rotating, so its ground is not an inertial frame. The rotation axis runs through the north and south poles of Earth. An object located on Earth’s surface anywhere except at those poles must rotate in a circle about the rotation axis and thus must have a centripetal acceleration directed toward the center of the circle. This centripetal acceleration requires a centripetal net force that is also directed toward that center.

To see how Earth’s rotation causes a difference between the free-fall acceleration and the gravitational acceleration, we consider a particle of mass $m$ at a point with latitude $\pi/2 - \theta \in [0, \pi/2]$. Denote the normal force on the particle by $\vec F_N$, and the tangential force by $\vec T$. The gravitational force from Earth is $GMm/r^2 = ma_g$, pointing to the center of Earth. Since the particle moves in a circle of radius $r\sin\theta$ around the rotation axis with angular speed $\omega$, it has a centripetal acceleration $\omega^2 r\sin\theta$ directed toward the rotation axis between the north and south poles.

Figure 1.32: The rotation frame on Earth.

By Newton’s second law, we have
$$(ma_g - F_N)\sin\theta + T\cos\theta = m\omega^2 r\sin\theta,$$
$$(ma_g - F_N)\cos\theta = T\sin\theta,$$
solving which gives
$$F_N = m\left(a_g - \omega^2 r\sin^2\theta\right).$$
The magnitude of the normal force is equal to the weight $mg$ read on a scale, meaning that the free-fall acceleration $g$ measured in the frame of the rotating Earth is
$$g = \frac{F_N}{m} = a_g - \omega^2 r\sin^2\theta.$$
The difference $\omega^2 r\sin^2\theta$ is largest at the equator (i.e., $\theta = \pi/2$), where one can calculate that $\omega^2 r$ is approximately $0.034\ \mathrm{m/s^2}$, much smaller than $9.8\ \mathrm{m/s^2}$. Therefore, neglecting this difference is often justified.

1.6.5 Gauss’s law for gravity

Gauss’s law for gravity, also known as Gauss’s flux theorem for gravity, states that the flux (surface integral) of the gravitational field (i.e., the field of gravitational acceleration) over any closed surface is equal to the mass enclosed times $-4\pi G$.

To explain the statement, we consider an object of mass $M$ enclosed by a surface $S$. This object generates a gravitational acceleration $\vec a_g$ at any point of space, which is called the gravitational field. Its flux over the surface is defined as follows. We divide the surface into a union of small area elements $d\vec A$, each of which can be regarded as a flat area element with direction. More precisely, $d\vec A$ is a vector that is perpendicular to the element, points outside the surface, and has a magnitude equal to the area $dA$ of the element. Then, the flux of the gravitational field $\vec a_g$ through this area element is given by $\vec a_g \cdot d\vec A$, with $\vec a_g$ taking the value at the location of the element. Adding together the flux over all area elements yields the total flux through the whole surface, which can be expressed as a surface integral
$$\oint_S \vec a_g \cdot d\vec A.$$
Gauss’s law tells us that this integral is equal to $-4\pi GM$.

Gauss’s law for gravity is equivalent to Newton’s law of gravitation, and it is often more convenient to work from than Newton’s law. Going from Gauss’s law to Newton’s law of gravitation is simple (you can think about it by yourself). The other direction requires Gauss’s theorem in vector calculus. We will return to Gauss’s law when we discuss electric fields of electrically charged bodies.

We now use it to prove Propositions 1.6.2 and 1.6.3. We consider a radially symmetric ball, whose center is at the origin.
We want to calculate the gravitational field it generates at a location $\vec{r}$ outside the ball. By radial symmetry, the field points toward the origin, i.e., it has direction $-\vec{e}_r = -\vec{r}/r$. We only need to determine the magnitude of the field, denoted as $a_g(r)$. For this purpose, we consider a sphere of radius $r$ that encloses the ball. Again by radial symmetry, the field has the same magnitude at every point of the sphere, is perpendicular to the sphere, and points inward. Hence, the flux (surface integral) of the gravitational field over the sphere is

$$ -a_g(r) \cdot 4\pi r^2 = -4\pi G M, $$

where $M$ is the mass of the ball. Solving this equation gives

$$ a_g(r) = \frac{GM}{r^2}. $$

This proves Proposition 1.6.2. Proposition 1.6.3 follows from the same argument with $M = 0$, because a sphere through a point $\vec{r}$ inside the shell encloses no mass.

Chapter 2 Electricity and Magnetism

2.1 Coulomb’s Law, Electric Fields

Starting from this section, we dive into another important branch of physics, electromagnetism (EM), which, roughly speaking, studies interactions between particles with electric charges via electromagnetic fields. It is the second fundamental interaction studied in this course and is the dominant force in the interactions of electrons, atoms, and molecules.

The modern science of electromagnetism was developed by scientists in many countries. One of the most important was Michael Faraday, a gifted experimenter with a talent for physical intuition and visualization. In the mid-nineteenth century, James Clerk Maxwell put Faraday’s ideas into mathematical form, introduced many new ideas of his own, and put electromagnetism on a sound theoretical basis. We will present Maxwell’s unified theory of electromagnetism in Section 2.7, in the form of the famous Maxwell’s equations.
In this section, we study the simplest form of electromagnetism, called electrostatics, which is the branch of EM theory that studies electric charges at rest, with a main focus on electric fields due to static electrically charged objects. (However, we will also consider charged particles moving in electric fields; it is only the electric field that is “static”.)

2.1.1 Electric charges

There are two types of electric charge, named by the American scientist Benjamin Franklin as positive charge and negative charge. He could have called them anything, but using algebraic signs as names comes in handy when we add up charges to find the net charge. In most everyday objects, there are about equal numbers of negatively charged particles and positively charged particles, so the net charge is zero, in which case the object is said to be electrically neutral. Sometimes, however, the positive and negative charges are unbalanced; the nonzero net charge of the object is then called its excess charge.

Generally, we denote the charge of a point particle or a charged object by $q$; $q$ is positive (resp. negative) if the excess charge of the object is positive (resp. negative). The SI unit of charge is C (coulomb). For practical reasons having to do with the accuracy of measurements, the coulomb is derived from the SI unit A (ampere) for electric current $i$. We shall discuss current in detail in Section 2.4. For now, we only note that current $i$ is the rate at which charge moves past a point:

$$ i = \frac{dq}{dt}. $$

This gives A = C/s, or C = A · s.

Charge is quantized. Charges of objects are all due to charged particles in atoms, which consist of positively charged protons, negatively charged electrons, and electrically neutral neutrons. The charge of a single electron and that of a single proton have the same magnitude but are opposite in sign. Hence, an electrically neutral atom contains equal numbers of electrons and protons.
Electrons are held near the nucleus because their charge has the sign opposite to that of the protons in the nucleus, so they are attracted to the nucleus. The measured value of the charge of a proton is approximately

$$ e = 1.602 \times 10^{-19}\ \mathrm{C}, $$

and the charge of an electron is $-e$. Thus, any positive or negative charge $q$ of an object can be written as

$$ q = ne, \qquad n \in \mathbb{Z}, $$

where $n$ is equal to the number of protons minus the number of electrons in the object. This gives a very important feature of charge that is very different from mass: charges are not continuous; they are quantized and appear as integer multiples of $e$. For this reason, $e$ is called the elementary charge.

The 2019 redefinition of the SI base units fixed the numerical value of $e$ as $1.602176634 \times 10^{-19}$ when expressed in coulombs; i.e., the ampere (A) is defined as the electric current equivalent to $10^{19}$ elementary charges moving past a point every $1.602176634$ seconds. For historical reasons, however, A rather than C is still used as a base unit.

Quarks, the constituent particles of protons and neutrons, have charges of $\pm e/3$ or $\pm 2e/3$, but they cannot be detected individually. For this reason, and for historical reasons, their charges are not taken to be the elementary charge.

The elementary charge $e$ is one of the important constants of nature, like the speed of light $c$. In modern physics, it is believed that the quantization of electric charge is related to some topological properties of the gauge structure of the EM theory.

Charge is conserved. An important feature of electric charge is that it is always conserved, i.e., the net charge of a closed system of bodies is unchanged. In other words, in all processes, a positive or negative charge is not created but only transferred from one body to another, changing the net charge of each body.
This hypothesis of conservation of charge, first put forward by Benjamin Franklin, has stood up under close examination, both for large-scale charged bodies and for atoms, nuclei, and elementary particles. No exceptions have ever been found. A simple example of charge conservation occurs when an electron $e^-$ and its antiparticle, the positron $e^+$, undergo an annihilation process, transforming into two gamma rays:

$$ e^- + e^+ \to \gamma + \gamma. $$

Here, $e^-$ has charge $-e$ and $e^+$ has charge $e$, while $\gamma$ is neutral. The converse of annihilation also occurs: in pair production, a gamma ray transforms into an electron and a positron:

$$ \gamma \to e^- + e^+. $$

So far, we have seen four conservation laws: energy, momentum, angular momentum, and electric charge. Noether’s theorem, a central result in theoretical physics, asserts that each of these conservation laws is associated with a symmetry of the underlying physics. The conservation of energy is associated with time translation symmetry. The conservation of momentum and the conservation of angular momentum are associated with (space) translation symmetry and rotation symmetry, respectively. The symmetry associated with charge conservation is the global gauge invariance of the electromagnetic field.

2.1.2 Coulomb’s law

Experiments show that particles with the same sign of electric charge repel each other, and particles with opposite signs attract each other. Quantitatively, the strength of the attraction or repulsion is described by the famous Coulomb’s law of electrostatic force (or electric force) between charged particles.

Physics law 9 (Coulomb’s law). Consider two charged particles, where particle 1 has charge $q_1$ and is located at $\vec{r}_1$, and particle 2 has charge $q_2$ and is located at $\vec{r}_2$.
Then, the electric force acting on particle 2 due to particle 1 is given by

$$ \vec{F}_{12} = \frac{k q_1 q_2}{r^2}\,\hat{r}, \tag{2.1.1} $$

where $r = |\vec{r}_2 - \vec{r}_1|$ is the separation between the particles, $\hat{r} = (\vec{r}_2 - \vec{r}_1)/r$ is the unit vector pointing from particle 1 to particle 2, and $k$ is a positive constant called the Coulomb constant, with value

$$ k = 8.99 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}. $$

Remark. The Coulomb constant is often written as

$$ k = \frac{1}{4\pi\varepsilon_0}, $$

where $\varepsilon_0$ is called the permittivity constant and has value

$$ \varepsilon_0 = 8.85 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)}. $$

In some sense, it is a more fundamental quantity than $k$, and it is one of the two physical constants that will appear in Maxwell’s equations (the other one is the vacuum permeability $\mu_0$).

Let’s first check the direction of the force on particle 2 as given by (2.1.1). If $q_1$ and $q_2$ have the same sign (i.e., they are both positive or both negative), then $q_1 q_2$ is positive and (2.1.1) tells us that the force on particle 2 is in the direction of $\hat{r}$, pointing from particle 1 to particle 2. That is, particle 2 is repelled from particle 1. Conversely, if $q_1$ and $q_2$ have opposite signs, then (2.1.1) tells us that the force on particle 2 is in the direction $-\hat{r}$. That is, particle 2 is attracted toward particle 1.

You may notice something very curious. Although the two types of forces are wildly different, the form of Coulomb’s law is the same as that of Newton’s law of gravitation (1.6.1): both are inverse square laws (i.e., they have the $1/r^2$ dependence) and both involve a product of a property of the interacting particles: the charge in one case and the mass in the other. The main difference between these two laws is that gravitational forces are always attractive, but electrostatic forces may be either attractive or repulsive, depending on the signs of the charges. This difference arises from the fact that there is only one type of mass but two types of charge.
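The sign discussion above can be checked numerically. The following is a minimal Python sketch of (2.1.1), not part of the original notes; the function name `coulomb_force` and the sample charges and positions are illustrative assumptions, and the value of $k$ is the rounded one quoted in the text.

```python
import math

K = 8.99e9  # Coulomb constant in N·m²/C² (rounded value quoted in the text)

def coulomb_force(q1, r1, q2, r2):
    """Force on particle 2 due to particle 1, following (2.1.1).

    q1, q2 are charges in coulombs; r1, r2 are 3-tuples of positions in meters.
    (Hypothetical helper written for illustration.)
    """
    sep = [b - a for a, b in zip(r1, r2)]       # vector r2 - r1
    r = math.sqrt(sum(c * c for c in sep))      # separation |r2 - r1|
    if r == 0:
        raise ValueError("particles must be at distinct positions")
    scale = K * q1 * q2 / r**3                  # k q1 q2 / r^2, times 1/r for the unit vector
    return tuple(scale * c for c in sep)

origin = (0.0, 0.0, 0.0)
point = (0.1, 0.0, 0.0)

# Same sign: the force on particle 2 points along +x, away from particle 1 (repulsion).
print(coulomb_force(1e-6, origin, 1e-6, point))
# Opposite signs: the force on particle 2 points along -x, toward particle 1 (attraction).
print(coulomb_force(1e-6, origin, -1e-6, point))
# Swapping the roles of the particles flips the force, as Newton's third law requires.
print(coulomb_force(1e-6, point, 1e-6, origin))
```

For two charges of $1\ \mu\mathrm{C}$ separated by $0.1$ m, the magnitude is $k q_1 q_2 / r^2 \approx 0.9$ N, which gives a feeling for how strong the electrostatic force is at everyday scales.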
Due to the similarity with Newton’s law, some results developed in Section 1.6 also apply to Coulomb’s law. Let’s list them here for your convenience.

1. Superposition of forces. As with all forces in this book, the electrostatic force obeys the principle of superposition. In general, consider an object with charge density $\rho(\vec{r})$ at $\vec{r} = (x, y, z)$. Then, similar to (1.6.2), the total electric force on a particle of charge $q$ at position $\vec{r}'$ is given by a triple integral

$$ \iiint_D k q \rho(\vec{r})\,\frac{\vec{r}' - \vec{r}}{|\vec{r}' - \vec{r}|^3}\,dx\,dy\,dz, \qquad \vec{r} = (x, y, z), \tag{2.1.2} $$

where $D \subset \mathbb{R}^3$ denotes the domain occupied by the object.

2. Shell theorems. Analogous to the shell theorems (Proposition 1.6.2 and Proposition 1.6.3) for the gravitational force, we have two shell theorems for the electrostatic force.

Proposition 2.1.1. (1) A charged particle outside a shell with a radially symmetric charge density is attracted or repelled as if the shell’s charge were concentrated as a particle at its center. (2) A charged particle inside a shell with a radially symmetric charge density has no net force acting on it due to the shell.

3. Electric potential energy. Similar to Proposition 1.6.4, the electrostatic force is a conservative force and is associated with a potential energy.

Proposition 2.1.2 (Electric potential energy). The Coulomb (electrostatic) force is conservative. Furthermore, the potential energy of a system of two particles with charges $q_1$ and $q_2$ is

$$ V = \frac{k q_1 q_2}{r} + C, \tag{2.1.3} $$

where $r$ is the distance between the two particles. In particular, we can choose $C = 0$ if the reference potential energy at $\infty$ is taken to be 0.

If our system contains more than two particles, we consider each pair of particles in turn, calculate the electric potential energy of that pair as if the other particles were not there, and then algebraically sum the results.
In general, the potential energy between an object occupying $D \subset \mathbb{R}^3$ and a point particle of charge $q$ at position $\vec{r}'$ is calculated as an integral:

$$ V = \iiint_D \frac{k q \rho(\vec{r})}{|\vec{r} - \vec{r}'|}\,dx\,dy\,dz, \qquad \vec{r} = (x, y, z). \tag{2.1.4} $$

Finally, following Section 1.4, we have the celebrated energy conservation for an isolated system interacting only through Coulomb forces: the mechanical energy of the system, i.e., the kinetic energy plus the electric potential energy, does not change. The electric potential energy is associated with a so-called electric potential of the charged object, which will be discussed in detail in Section 2.2.

4. Gauss’s law. Similar to gravitation, the electrostatic force also satisfies Gauss’s law as discussed in Section 1.6.5. We will discuss Gauss’s law and its consequences in detail in Section 2.2.

2.1.3 Electric fields

Consider two positively charged particles. We know that an electrostatic force acts on particle 2 due to the presence of particle 1. We can also calculate the direction and magnitude of the force using Coulomb’s law. There is another convenient (and actually deeper) way of looking at the interaction between the two particles: particle 1 sets up an electric field at all points in the surrounding space, even if the space is a vacuum. If we place particle 2 at any point in that space, it is affected by the electric field that particle 1 has already set up at that point.

In modern physics, fields are more fundamental concepts than forces. In particular, for EM theory, the electric and magnetic fields are the two basic subjects of study. An electric field is a vector field that consists of a distribution of electric field vectors $\vec{E}$, one for each point in the space around a charged object. In other words, an electric field is a vector-valued function of the points in space, whose value $\vec{E}(\vec{r})$ at a point $\vec{r}$ is defined as follows. At point $\vec{r}$, we place a particle with a small positive charge $q_0$, called a test charge.
(We take the charge to be small so that it does not disturb the object’s charge distribution.) We then measure the electrostatic force $\vec{F}$ that acts on the test charge. The electric field at that point is then defined as

$$ \vec{E}(\vec{r}) = \frac{\vec{F}}{q_0}. \tag{2.1.5} $$

From (2.1.5), we see that the SI unit for the electric field is N/C. Since the test charge is positive, the two vectors in (2.1.5) are in the same direction. We can shift the test charge around to various other points and measure the electric fields there, so that we can figure out the distribution of the electric field set up by the charged object.

Note that the electric field exists independently of the test charge. It is something that a charged object sets up in the surrounding space (even in a vacuum), independent of whether we happen to come along to measure it. (Be sure to distinguish between force and field: force is a push or pull between two charged objects, while the electric field is an abstract property set up by one given charged object.)

By Coulomb’s law (2.1.1), the electric field at $\vec{r}'$ due to a point charge $q$ located at $\vec{r}$ is given by

$$ \vec{E}(\vec{r}') = \frac{kq}{|\vec{r}' - \vec{r}|^2}\,\hat{r}, \qquad \hat{r} = \frac{\vec{r}' - \vec{r}}{|\vec{r}' - \vec{r}|}. \tag{2.1.6} $$

Like forces, electric fields obey the principle of superposition. That is, if several electric fields are set up at a given point by several charged particles, we can find the net field by adding them as vectors. Hence, to calculate the net electric field at a given point due to several particles, find the electric field due to each particle and then sum the fields as vectors. The electric field set up by a general charged object can be calculated as in (2.1.2). As a special example, the field due to a shell with a radially symmetric charge density can be calculated easily with the help of Proposition 2.1.1. We will see more examples in Section 2.1.4 below.
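The superposition rule for fields can be illustrated with a minimal Python sketch, which is not part of the original notes. The helper names `field_of_point_charge` and `net_field` and the sample configuration (two equal positive charges on the $x$-axis) are illustrative assumptions; the code only applies (2.1.6) to each charge and adds the results as vectors.

```python
import math

K = 8.99e9  # Coulomb constant in N·m²/C² (rounded value quoted in the text)

def field_of_point_charge(q, source, point):
    """Electric field at `point` due to a charge q at `source`, following (2.1.6)."""
    d = [p - s for s, p in zip(source, point)]      # vector from the charge to the field point
    dist = math.sqrt(sum(c * c for c in d))
    if dist == 0:
        raise ValueError("the field of a point charge is not defined at its own position")
    return tuple(K * q * c / dist**3 for c in d)    # (k q / dist^2) times the unit vector

def net_field(charges, point):
    """Net field at `point` from a list of (charge, position) pairs, by superposition."""
    total = [0.0, 0.0, 0.0]
    for q, source in charges:
        e = field_of_point_charge(q, source, point)
        total = [t + c for t, c in zip(total, e)]
    return tuple(total)

# Two equal positive charges of 1 nC at x = -1 m and x = +1 m (illustrative values).
pair = [(1e-9, (-1.0, 0.0, 0.0)), (1e-9, (1.0, 0.0, 0.0))]

# At the midpoint the two fields cancel exactly.
print(net_field(pair, (0.0, 0.0, 0.0)))
# On the y-axis the x-components cancel and the y-components add.
print(net_field(pair, (0.0, 1.0, 0.0)))
```

At the point $(0, 1, 0)$ each charge contributes a field of magnitude $kq/2$ at $45^\circ$ to the $y$-axis, so the net field is $kq/\sqrt{2}$ along $+y$, in agreement with adding the two vectors by hand.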
The idea of electric fields was introduced by Michael Faraday, who also introduced a useful way to visualize an electric field in space. He envisioned lines, called electric field lines, in the space around any given charged particle or object. The electric field lines are drawn according to the following rules: (1) at any point, the electric field vector must be tangent to the electric field line through that point and in the same direction; (2) in a plane perpendicular to the field lines, the relative density of the lines represents the relative magnitude of the field there, with greater density for greater magnitude. Mathematically, it is not hard to show that such a family of electric field lines exists and is unique.

Figure 2.1 gives an example of electric field lines near a sphere uniformly covered with negative charges. At every point around the sphere, the electric field vector points radially inward toward the sphere, and we can represent this electric field with electric field lines as in Figure 2.1.

Figure 2.1: The electric field lines near a uniform sphere of negative charges.

If the sphere in Figure 2.1 were uniformly covered with positive charges, the electric field vectors at all points around it would point radially outward, and so would the electric field lines. So, we observe the following rule: electric field lines extend away from positive charge (where they originate) and toward negative charge (where they terminate). In Figure 2.1, the field lines originate on distant positive charges that are not shown.

Another feature of electric field lines is that they never intersect each other (think about why). For example, Figure 2.2 shows the field lines for two particles with equal positive charges.

Figure 2.2: The electric field lines for two particles with equal positive charge.

We remark that many different fields are used in science and engineering. The gravitational field mentioned in Section 1.6.5 is another important vector field.
However, not all fields are vector fields. For example, a temperature field is a scalar field, which gives the temperature at each point. In EM theory, the electric and magnetic fields are combined into an electromagnetic tensor field, and in Einstein’s theory of General Relativity, the gravitational field is also a tensor field.

2.1.4 Electric fields due to charged objects

In this subsection, we examine some important examples of electric fields generated by several specific types of charged objects.

Example (Electric field due to an electric dipole). Consider a system of two particles that have the same charge magnitude $q$ but opposite signs, a very common and important arrangement known as an electric dipole. Suppose the positive charge and the negative charge are located at $\vec{d}/2$ and $-\vec{d}/2$, respectively. Calculate the electric field set up by this electric dipole.

Solution: We evaluate the electric field at the point $\vec{r}$. Using (2.1.6), we have

$$ \vec{E}(\vec{r}) = \frac{kq}{|\vec{r} - \vec{d}/2|^3}\,(\vec{r} - \vec{d}/2) - \frac{kq}{|\vec{r} + \vec{d}/2|^3}\,(\vec{r} + \vec{d}/2). $$

In applications, we are often interested in the electric field of a dipole at distances that are large compared with the dimensions of the dipole, that is, at distances such that $r \gg d$. At such large distances, we can expand the above equation as

$$
\begin{aligned}
\vec{E}(\vec{r}) &= kq\,\vec{r}\left(\frac{1}{|\vec{r} - \vec{d}/2|^3} - \frac{1}{|\vec{r} + \vec{d}/2|^3}\right) - \frac{kq\,\vec{d}}{2}\left(\frac{1}{|\vec{r} - \vec{d}/2|^3} + \frac{1}{|\vec{r} + \vec{d}/2|^3}\right) \\
&= kq\,\vec{r}\,\frac{|\vec{r} + \vec{d}/2|^3 - |\vec{r} - \vec{d}/2|^3}{|\vec{r} - \vec{d}/2|^3\,|\vec{r} + \vec{d}/2|^3} - \frac{kq\,\vec{d}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right) \\
&= \frac{kq\,r^3\,\vec{r}}{|\vec{r} - \vec{d}/2|^3\,|\vec{r} + \vec{d}/2|^3}\left[\left(1 + \frac{\vec{r}\cdot\vec{d}}{r^2} + \frac{d^2}{4r^2}\right)^{3/2} - \left(1 - \frac{\vec{r}\cdot\vec{d}}{r^2} + \frac{d^2}{4r^2}\right)^{3/2}\right] - \frac{kq\,\vec{d}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right) \\
&= \frac{kq\,\vec{r}}{r^3}\,\frac{3\,\vec{r}\cdot\vec{d}}{r^2} - \frac{kq\,\vec{d}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right) = k\,\frac{3(\vec{p}\cdot\hat{r})\hat{r} - \vec{p}}{r^3} + O\!\left(\frac{kqd^2}{r^4}\right),
\end{aligned}
$$

where we used $|\vec{r} \pm \vec{d}/2|^2 = r^2 \pm \vec{r}\cdot\vec{d} + d^2/4$, $O(\cdot)$ denotes a vector with length of order at most $kqd^2/r^4$, and $\hat{r} = \vec{r}/r$ is the unit vector along the $\vec{r}$ direction.
The vector $\vec{p} = q\vec{d}$, which involves the two intrinsic properties $q$ and $\vec{d}$ of the dipole, is an important vector quantity known as the electric dipole moment of the dipole. The unit of the dipole moment is C · m. The magnitude and direction of $\vec{p}$ indicate the strength and orientation of a dipole, respectively. In an idealized situation, we take the dimension of the dipole $d \to 0$ while keeping the dipole moment $\vec{p}$ unchanged (by increasing the charge $q = p/d \to \infty$). This limiting process results in a “point dipole”, whose electric field is given by

$$ \vec{E}(\vec{r}) = k\,\frac{3(\vec{p}\cdot\hat{r})\hat{r} - \vec{p}}{r^3}. \tag{2.1.7} $$

An important feature of this field is that it decays with respect to $r$ as $1/r^3$, in contrast to the $1/r^2$ decay in Coulomb’s law; i.e., the electric field of a dipole decays faster than that of a point charge. In Figure 2.3, we show the pattern of electric field lines for an electric dipole.

Figure 2.3: The pattern of electric field lines around an electric dipole.

Example (Electric field due to a line of charge). Consider an infinite line of charge with uniform linear charge density $\lambda$. Calculate the electric field around this line.

Solution: Suppose the line is along the $z$ direction and we calculate the electric field at $\vec{r}$. We set up the coordinates such that the line is the $z$-axis and $\vec{r}$ is on the $x$-axis, i.e., $\vec{r} = (x, 0, 0)$ for some $x > 0$. By the symmetry of the problem, we know that the electric field is along the $x$-direction. We now calculate its magnitude. Consider two symmetric differential elements $dz$ at $\pm z$. The magnitude of the electric field generated by each of them at $\vec{r}$ is equal to

$$ E(z)\,dz = \frac{k\lambda}{z^2 + x^2}\,dz. $$

The superposition of the two electric fields is along the $x$ direction and has magnitude

$$ 2E(z)\cos\theta\,dz = \frac{2k\lambda x}{(z^2 + x^2)^{3/2}}\,dz, $$

where $\theta$ is the angle between the $x$-axis and the line from the element to $\vec{r}$, so that $\cos\theta = x/\sqrt{z^2 + x^2}$. Summing over all these differential elements, we get that the total electric field is along the $x$ direction and has magnitude

$$ E(\vec{r}) = 2\int_0^\infty \frac{k\lambda x}{(z^2 + x^2)^{3/2}}\,dz. $$

With the change of variable $z = x\tan\theta$, for which $dz = x\,d\theta/\cos^2\theta$ and $(z^2 + x^2)^{3/2} = x^3/\cos^3\theta$, we can evaluate the integral as

$$ E(\vec{r}) = 2\int_0^{\pi/2} \frac{k\lambda x\cos^3\theta}{x^3}\,\frac{x}{\cos^2\theta}\,d\theta = \frac{2k\lambda}{x}\int_0^{\pi/2}\cos\theta\,d\theta = \frac{2k\lambda}{x}. $$

In general, the electric field at a point $\vec{r} = (x, y, z)$ is

$$ \vec{E}(\vec{r}) = \frac{2k\lambda}{\sqrt{x^2 + y^2}}\cdot\frac{(x, y, 0)}{\sqrt{x^2 + y^2}}. $$

Example (Electric field due to a circle). Consider a circle (ring) of radius $R$ with uniform linear charge density $\lambda$. Calculate the electric field on the central axis, i.e., the axis through the ring’s center and perpendicular to the plane of the ring.

Solution: Suppose the central axis is along the $z$ direction and the circle is in the $xy$-plane. At an arbitrary point $\vec{r} = (0, 0, z)$ on the axis, by the symmetry of the problem, we know that the electric field is along the $z$-direction. We now calculate its magnitude. Consider a differential element $ds$ on the circle. The magnitude of the electric field generated by this element is equal to

$$ \frac{k\lambda}{z^2 + R^2}\,ds. $$

Its component along the $z$ direction is equal to

$$ \frac{k\lambda}{z^2 + R^2}\cos\theta\,ds = \frac{k\lambda z}{(z^2 + R^2)^{3/2}}\,ds, $$

where $\cos\theta = z/\sqrt{z^2 + R^2}$. Summing over all these differential elements $ds$, we get that the total electric field is along the $z$ direction and has magnitude

$$ E(\vec{r}) = 2\pi R\,\frac{k\lambda z}{(z^2 + R^2)^{3/2}} = \frac{kqz}{(z^2 + R^2)^{3/2}}, \tag{2.1.8} $$

where $q = 2\pi R\lambda$ is the total charge of the circle.

Example (Electric field due to a disk). Consider a circular disk of radius $R$ with uniform surface charge density $\sigma$. Calculate the electric field on the central axis, i.e., the axis through the disk’s center and perpendicular to the plane of the disk.

Solution: Suppose the central axis is along the $z$ direction and the disk is in the $xy$-plane. At an arbitrary point $\vec{r} = (0, 0, z)$ on the axis with $z > 0$, by the symmetry of the problem, we know that the electric field is along the $z$-direction. We now calculate its magnitude. By the previous example (2.1.8), the thin ring of radius $r$ and radial width $dr$ sets up a differential electric field

$$ dE = \frac{kz\,(2\pi r\sigma\,dr)}{(z^2 + r^2)^{3/2}} = \frac{z r \sigma}{2\varepsilon_0 (z^2 + r^2)^{3/2}}\,dr, $$

where $2\pi r\sigma\,dr$ is the total charge of the ring and we used $k = 1/(4\pi\varepsilon_0)$.
Summing over all such rings, we get the total electric field as

$$ E(\vec{r}) = \int_0^R \frac{z r \sigma}{2\varepsilon_0 (z^2 + r^2)^{3/2}}\,dr = -\frac{\sigma}{2\varepsilon_0}\left.\frac{z}{\sqrt{z^2 + r^2}}\right|_0^R = \frac{\sigma}{2\varepsilon_0}\left(1 - \frac{z}{\sqrt{z^2 + R^2}}\right). \tag{2.1.9} $$

If we let $R \to \infty$ while keeping $z$ finite, the second term in the parentheses on the RHS approaches zero, and the above equation reduces to

$$ E = \frac{\sigma}{2\varepsilon_0}. \tag{2.1.10} $$

For an infinite plane, any axis perpendicular to it can be regarded as a central axis. Hence, the plane sets up a uniform electric field on each side of it, with field lines perpendicular to the plane.

[Figure: electric field lines of an infinite uniformly charged plane.]

If we insert another infinite charged plane that is parallel to the previous plane and has uniform surface charge density $-\sigma$, then by superposition of electric fields, the electric field between the two planes is uniform and equal to

$$ E = \frac{\sigma}{\varepsilon_0}, \tag{2.1.11} $$

while the electric field vanishes elsewhere. This is also how uniform electric fields are generated in the real world: a uniform electric field is produced by placing a potential (or voltage) difference across two parallel metal plates. As long as the separation between the two plates is small compared to the size of the plates, the electric field between them is approximately uniform at points far away from the edges of the plates. (We will discuss the concepts of potential and voltage in Section 2.3.)

2.1.5 Charged particles in electric fields

Given an electric field, it is simple to determine the electrostatic force acting on a particle of charge $q$:

$$ \vec{F} = q\vec{E}, \tag{2.1.12} $$

where $\vec{E}$ is the electric field that other charges have produced at the location of the particle. This field is not the field set up by the particle itself; to distinguish the two fields, the field acting on the particle is often called the external field. A charged particle or object is not affected by its own electric field.

Equation (2.1.12) played a key role in the measurement of the elementary charge $e$ in the famous Millikan oil-drop experiment. The Wikipedia page on this experiment tells some interesting stories about it.
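To get a feeling for the sizes involved in (2.1.11) and (2.1.12), here is a minimal Python sketch that is not part of the original notes. It assumes two parallel plates with surface charge densities $\pm\sigma$, with the field between them pointing along $+z$ (from the positive plate toward the negative plate), and an electron placed between them. The function names, the sample value of $\sigma$, and the electron mass are assumptions introduced for illustration; $\varepsilon_0$ and $e$ are the rounded values quoted in the text.

```python
EPS0 = 8.85e-12          # permittivity constant in C²/(N·m²), value quoted in the text
E_CHARGE = 1.602e-19     # elementary charge in C, value quoted in the text
M_ELECTRON = 9.109e-31   # electron mass in kg (assumed value, not given in the text)

def plate_field(sigma):
    """Magnitude of the uniform field between two plates with densities +sigma and -sigma, (2.1.11)."""
    return sigma / EPS0

def force_on_charge(q, field):
    """Force on a charge q in an external field given as a 3-tuple, following (2.1.12)."""
    return tuple(q * c for c in field)

sigma = 1e-6                              # illustrative surface charge density in C/m²
E = (0.0, 0.0, plate_field(sigma))        # field along +z, assumed plate orientation
F = force_on_charge(-E_CHARGE, E)         # the electron has charge -e
a = tuple(c / M_ELECTRON for c in F)      # Newton's second law

print("field (N/C):", E)
print("force on electron (N):", F)
print("acceleration (m/s^2):", a)
```

Because the electron is negatively charged, the force and the acceleration point opposite to the field, i.e., toward the positively charged plate. The acceleration is many orders of magnitude larger than $g$, which is why gravity can usually be ignored for electrons in laboratory fields.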
Example. Consider an electric dipole with electric dipole moment $\vec{p}$ in a uniform electric field $\vec{E}$. Calculate the net force and torque acting on the dipole, and find its potential energy.

Solution: Suppose the electric dipole consists of two particles with charges $q$ and $-q$ located at $\vec{r}_1$ and $\vec{r}_2$, respectively. Its electric dipole moment is equal to $\vec{p} = q(\vec{r}_1 - \vec{r}_2)$. The total force on the dipole is

$$ \vec{F} = q\vec{E} - q\vec{E} = 0. $$

The total torque on the dipole is

$$ \vec{\tau} = \vec{r}_1 \times q\vec{E} - \vec{r}_2 \times q\vec{E} = q(\vec{r}_1 - \vec{r}_2)\times\vec{E} = \vec{p}\times\vec{E}. $$

We claim that the potential energy of a charged particle in a uniform electric field $\vec{E}$ is equal to

$$ U(\vec{r}) = -q\vec{E}\cdot\vec{r} + C, $$

where $q$ is the charge of the particle and $C$ is a constant depending on our choice of the zero reference potential energy. (We can take $C = 0$ if we let $U(0) = 0$.) To see this, we find that

$$ -\nabla_{\vec{r}}\,U(\vec{r}) = q\vec{E}, $$

which is indeed the electric force on the charge. Thus, the potential energy of the dipole is

$$ U = -q\vec{E}\cdot\vec{r}_1 + q\vec{E}\cdot\vec{r}_2 = -q(\vec{r}_1 - \vec{r}_2)\cdot\vec{E} = -\vec{p}\cdot\vec{E}. $$

2.2 Electric flux, Gauss’s law, and integral theorems

A main reference for this section is David Tong’s lecture notes on vector calculus, http://www.damtp.cam.ac.uk/user/tong/vc.html.

2.2.1 Vector calculus

As we have seen in the previous section, electric fields are vector fields, which assign a vector to every point in space $\mathbb{R}^3$. More precisely, a vector field $\vec{F}$ in $d$ dimensions is a map

$$ \vec{F} : \mathbb{R}^d \to \mathbb{R}^d. \tag{2.2.1} $$

Fields are fundamental concepts in physics. Besides vector fields, we will later encounter another kind of field called a scalar field, which assigns a number to every point in space. More precisely, a scalar field $\phi$ in $d$ dimensions is a map

$$ \phi : \mathbb{R}^d \to \mathbb{R}. \tag{2.2.2} $$

Vector fields and scalar fields are related by three important operations: gradient, divergence, and curl. Let us introduce the first operation, the gradient, denoted by Grad. It is an operation that takes a scalar field to a vector field.

Definition 13 (Gradient).
Given Cartesian coordinates $x_i$ with $i = 1, \cdots, d$ on $\mathbb{R}^d$, the gradient is defined by

$$ \mathrm{Grad} : \phi(\vec{r}) \mapsto \vec{\nabla}\phi(\vec{r}) \equiv \left(\frac{\partial\phi(\vec{r})}{\partial x_1}, \cdots, \frac{\partial\phi(\vec{r})}{\partial x_d}\right), \tag{2.2.3} $$

where $\phi(\vec{r})$ is a scalar field.

The above definition relies on the choice of Cartesian coordinates $x_i$. A coordinate-free definition is given by considering the difference between the scalar field $\phi$ evaluated at two nearby points $\vec{r}$ and $\vec{r} + \vec{\epsilon}$ with $\epsilon = |\vec{\epsilon}| \ll 1$,

$$ \phi(\vec{r} + \vec{\epsilon}) - \phi(\vec{r}) = \vec{\epsilon}\cdot\vec{\nabla}\phi(\vec{r}) + O(\epsilon^2), \tag{2.2.4} $$

where $O(\epsilon^2)$ denotes the terms that are of order at least $\epsilon^2$. Equation (2.2.4) can be regarded as an alternative definition of the gradient. When picking Cartesian coordinates with $\vec{r} = (x_1, \cdots, x_d)$ and $\vec{\epsilon} = (\epsilon_1, \cdots, \epsilon_d)$, we recover the definition (2.2.3).

We have already seen many examples of the gradient when we discussed conservative forces in Section 1.4.1. In particular, the gradient of the gravitational potential energy between two point particles is computed explicitly in equations (1.4.30) and (1.4.31).

We can view $\vec{\nabla}$ as an object in its own right, and call it the gradient operator.

Definition 14 (Gradient operator). Given Cartesian coordinates $x_i$ with $i = 1, \cdots, d$ on $\mathbb{R}^d$, the gradient operator is defined by

$$ \vec{\nabla} \equiv \left(\frac{\partial}{\partial x_1}, \cdots, \frac{\partial}{\partial x_d}\right). \tag{2.2.5} $$

It is a vector whose entries are partial derivatives.

The gradient operator is an example of a differential operator. A differential operator broadly means a collection of derivatives that can act on functions. For example, the one-variable derivative $\frac{d}{dt}$ is a differential operator that can act on a function $f(t)$ and gives $\frac{df(t)}{dt}$.

Besides acting on scalar fields, the gradient operator $\vec{\nabla}$ can act on other fields in different ways. The divergence, denoted by Div, is a way for the gradient operator to act on a vector field and produce a scalar field.

Definition 15 (Divergence).
Given Cartesian coordinates $x_i$ with $i = 1, \cdots, d$ on $\mathbb{R}^d$, the divergence is defined by

$$ \mathrm{Div} : \vec{F}(\vec{r}) \mapsto \vec{\nabla}\cdot\vec{F}(\vec{r}) \equiv \sum_{i=1}^{d}\frac{\partial F_i(\vec{r})}{\partial x_i}, \tag{2.2.6} $$

where $\vec{F}(\vec{r})$ is a vector field.

As an example, let us compute the divergence of the electric field of a charged particle.

Example. By Coulomb’s law (2.1.1), the electric field at $\vec{r}$ due to a point charge $q$ located at the origin is given by

$$ \vec{E}(\vec{r}) = kq\,\frac{\vec{r}}{r^3}. \tag{2.2.7} $$

Compute $\vec{\nabla}\cdot\vec{E}(\vec{r})$.

Solution: Let us first compute $\frac{\partial E_x}{\partial x}$. Using $r = (x^2 + y^2 + z^2)^{1/2}$, we find

$$
\begin{aligned}
\frac{\partial E_x}{\partial x} &= \frac{\partial}{\partial x}\left(\frac{kqx}{r^3}\right) \\
&= \frac{kq}{r^3} - \frac{3}{2}\times 2x \times\frac{kqx}{r^5} \\
&= \frac{kq}{r^3} - \frac{3kqx^2}{r^5}.
\end{aligned} \tag{2.2.8}
$$

The derivatives $\frac{\partial E_y}{\partial y}$ and $\frac{\partial E_z}{\partial z}$ can be computed in a similar way. Now, we sum up these three terms and find

$$
\begin{aligned}
\vec{\nabla}\cdot\vec{E}(\vec{r}) &= \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} \\
&= \frac{3kq}{r^3} - \frac{3kq\,(x^2 + y^2 + z^2)}{r^5} \\
&= 0.
\end{aligned} \tag{2.2.9}
$$

Naively, we may conclude that $\vec{\nabla}\cdot\vec{E}(\vec{r})$ is a scalar field that is identically zero. However, we need to be careful about the origin $\vec{r} = 0$, where the electric field diverges. We will see later that $\vec{\nabla}\cdot\vec{E}(\vec{r})$ cannot be zero at the origin; instead, we actually have

$$ \vec{\nabla}\cdot\vec{E}(\vec{r}) = 4\pi k q\,\delta^3(\vec{r}), \tag{2.2.10} $$

where $\delta^3(\vec{r})$ is the three-dimensional Dirac delta function

$$ \delta^3(\vec{r}) \equiv \delta(x)\delta(y)\delta(z). \tag{2.2.11} $$

The Dirac delta function $\delta(x)$ can be loosely thought of as a function on the real line that is zero everywhere except at the origin, where it is infinite,

$$ \delta(x) \approx \begin{cases} +\infty & \text{for } x = 0, \\ 0 & \text{for } x \neq 0, \end{cases} \tag{2.2.12} $$

and is also constrained to satisfy the identity

$$ \int_{-\infty}^{\infty}\delta(x)\,dx = 1. \tag{2.2.13} $$

A formula for the Dirac delta function as a limit is

$$ \delta(x) = \lim_{b\to 0^+}\delta_b(x), \qquad \delta_b(x) = \frac{1}{b\sqrt{\pi}}\,e^{-(x/b)^2}. \tag{2.2.14} $$

The following plot shows how the function $\delta_b(x)$ approaches $\delta(x)$ as $b \to 0^+$.

Figure 2.4: The plot of the function $\delta_b(x)$ for $b = 1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}$.

We will give a derivation of the statement (2.2.10) when we discuss the divergence theorem in the next subsection.
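The limiting behavior of $\delta_b(x)$ in (2.2.14) can be explored numerically. The following minimal Python sketch is not part of the original notes; the helper names `delta_b`, `integrate`, and `smeared`, the midpoint rule, and the cutoff at $|x| = 10b$ are assumptions made for illustration. It checks that $\delta_b$ has unit area as in (2.2.13) for every $b$, that its peak grows as $b$ shrinks, and that integrating $\delta_b(x)\cos x$ approaches $\cos 0 = 1$.

```python
import math

def delta_b(x, b):
    """Gaussian approximation (2.2.14) to the Dirac delta function."""
    return math.exp(-(x / b) ** 2) / (b * math.sqrt(math.pi))

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def smeared(f, b):
    """Integral of delta_b(x) * f(x); the Gaussian tails beyond |x| > 10 b are negligible."""
    return integrate(lambda x: delta_b(x, b) * f(x), -10 * b, 10 * b)

for b in (1, 1 / 2, 1 / 4, 1 / 8, 1 / 16):
    peak = delta_b(0.0, b)                  # height at the origin, 1 / (b sqrt(pi))
    area = smeared(lambda x: 1.0, b)        # should stay equal to 1, as in (2.2.13)
    average = smeared(math.cos, b)          # should approach cos(0) = 1 as b -> 0+
    print(f"b = {b:<7} peak = {peak:7.3f}  area = {area:.6f}  average of cos = {average:.6f}")
```

For this particular test function the integral can also be done in closed form, $\int \delta_b(x)\cos x\,dx = e^{-b^2/4}$, which indeed tends to 1 as $b \to 0^+$. This "picking out the value at the origin" is the property of the delta function used in the examples that follow.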
Physically, the formula (2.2.10) tells us that the divergence is an operation that measures the source (the charged particle) of the electric field: $\vec{\nabla}\cdot\vec{E}$ is non-zero at the position of the charged particle, and is zero everywhere else. We will study the following two examples that will further confirm our physical intuition.

Let us consider the divergence of the electric field generated by $n$ charged particles.

Example. Consider $n$ charged particles with charges $q_1, q_2, \cdots, q_n$ at the positions $\vec{r}_1, \vec{r}_2, \cdots, \vec{r}_n$. The electric field generated by them is

$$ \vec{E}(\vec{r}) = \sum_{i=1}^{n} k q_i\,\frac{\vec{r} - \vec{r}_i}{|\vec{r} - \vec{r}_i|^3}. \tag{2.2.15} $$

Compute $\vec{\nabla}\cdot\vec{E}(\vec{r})$.

Solution: We note that the divergence is a linear operation. That is, for a linear combination of two vector fields

$$ a_1\vec{F}_1(\vec{r}) + a_2\vec{F}_2(\vec{r}), \tag{2.2.16} $$

with $a_1$ and $a_2$ two constants independent of the position vector $\vec{r}$, we have

$$ \vec{\nabla}\cdot\left[a_1\vec{F}_1(\vec{r}) + a_2\vec{F}_2(\vec{r})\right] = a_1\vec{\nabla}\cdot\vec{F}_1(\vec{r}) + a_2\vec{\nabla}\cdot\vec{F}_2(\vec{r}). \tag{2.2.17} $$

Hence, applying (2.2.10) with the origin shifted to each $\vec{r}_i$, we have

$$
\begin{aligned}
\vec{\nabla}\cdot\vec{E}(\vec{r}) &= \sum_{i=1}^{n} k q_i\,\vec{\nabla}\cdot\frac{\vec{r} - \vec{r}_i}{|\vec{r} - \vec{r}_i|^3} \\
&= 4\pi k\sum_{i=1}^{n} q_i\,\delta^3(\vec{r} - \vec{r}_i).
\end{aligned} \tag{2.2.18}
$$

We see again that the divergence of the electric field is non-zero only at the places where the charged particles reside.

We can go one step further and compute the divergence of the electric field generated by a charged object.

Example. The electric field generated by a charged object with a charge density $\rho(\vec{r})$ is

$$ \vec{E}(\vec{r}) = \int_D k\rho(\vec{r}')\,\frac{\vec{r} - \vec{r}'}{|\vec{r} - \vec{r}'|^3}\,dx'\,dy'\,dz', \tag{2.2.19} $$

where $D$ is the domain of the charged object. Compute $\vec{\nabla}\cdot\vec{E}(\vec{r})$.

Solution: By the linearity of the divergence (2.2.17), we have

$$
\begin{aligned}
\vec{\nabla}\cdot\vec{E}(\vec{r}) &= \int_D k\rho(\vec{r}')\,\vec{\nabla}\cdot\frac{\vec{r} - \vec{r}'}{|\vec{r} - \vec{r}'|^3}\,dx'\,dy'\,dz' \\
&= \int k\rho(\vec{r}')\,\vec{\nabla}\cdot\frac{\vec{r} - \vec{r}'}{|\vec{r} - \vec{r}'|^3}\,dx'\,dy'\,dz' \\
&= \int 4\pi k\rho(\vec{r}')\,\delta^3(\vec{r} - \vec{r}')\,dx'\,dy'\,dz' \\
&= 4\pi k\rho(\vec{r}),
\end{aligned} \tag{2.2.20}
$$

where the integrals without a domain are over all of space. We have used the fact that $\rho(\vec{r}')$ is zero for $\vec{r}' \notin D$ in the second equality, and the defining properties (2.2.12) and (2.2.13) of the delta function in the fourth equality. We see that $\vec{\nabla}\cdot\vec{E}(\vec{r})$ gives the charge density of the object, up to the constant factor $4\pi k$.
It again confirms our intuition that the divergence measures the source of the electric field.

In three dimensions, there is another way for the gradient operator $\vec{\nabla}$ to act on a vector field, that is, by taking the cross product.

Definition 16 (Curl). Given Cartesian coordinates $(x, y, z)$, the curl, denoted as Curl, is defined by

$$\mathrm{Curl}: \vec{F}(\vec{r}) \mapsto \vec{\nabla} \times \vec{F}(\vec{r}) \equiv \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} , \; \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x} , \; \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) , \tag{2.2.21}$$

where $\vec{F}(\vec{r})$ is a vector field.

We see that Curl is an operation that takes a vector field to another vector field. Alternatively, the curl can be defined by

$$(\vec{\nabla} \times \vec{F})_i = \sum_{j,k=1}^{3} \epsilon_{ijk} \frac{\partial F_k}{\partial x_j} , \tag{2.2.22}$$

where $\epsilon_{ijk}$ is a totally antisymmetric tensor with $\epsilon_{123} = 1$.

The meaning of the curl is that it measures the rotation of a vector field. Let us try to understand this statement by looking at the following examples.

Example. Compute the curl of the vector field

$$\vec{F}(\vec{r}) = (y, -x, 0) , \tag{2.2.23}$$

whose field lines are plotted below.

[Figure: field lines of $\vec{F}(\vec{r}) = (y, -x, 0)$ in the x-y plane.]

Solution: By a direct computation, we find

$$\vec{\nabla} \times \vec{F} = \left( 0, \; 0, \; \frac{\partial (-x)}{\partial x} - \frac{\partial y}{\partial y} \right) = (0, 0, -2) . \tag{2.2.24}$$

From the above figure, we see that $\vec{F}$ is a vector field that rotates clockwise on the x-y plane. Indeed, our computation shows that $\vec{\nabla} \times \vec{F}$ is a vector pointing in the negative z-direction, which is perpendicular to the x-y plane.

Next, we consider the curl of electric fields. From the plots of the electric field lines in Figures 2.1, 2.2, and 2.3, we see that static electric fields in general do not look like they are rotating. Let us verify our expectations.

Example. Compute the curl of the electric field

$$\vec{E}(\vec{r}) = kq \frac{\vec{r}}{r^3} . \tag{2.2.25}$$

Solution: Let us compute the z-component

$$\begin{aligned}
(\vec{\nabla} \times \vec{E})_z &= kq \left( \frac{\partial}{\partial x} \frac{y}{r^3} - \frac{\partial}{\partial y} \frac{x}{r^3} \right) \\
&= kq \left( y \, \frac{-3x}{r^5} - x \, \frac{-3y}{r^5} \right) \\
&= 0 .
\end{aligned} \tag{2.2.26}$$

By similar computations, we find that the other two components of $\vec{\nabla} \times \vec{E}$ are also zero, and we conclude

$$\vec{\nabla} \times \vec{E}(\vec{r}) = 0 . \tag{2.2.27}$$

We still need to worry about the point at $\vec{r} = 0$ where the electric field diverges.
We will see in Section 2.2.3 that $\vec{\nabla} \times \vec{E}$ is also zero at the origin.

Like divergence, curl is also a linear operation, i.e.

$$\vec{\nabla} \times (a_1 \vec{F}_1 + a_2 \vec{F}_2) = a_1 \vec{\nabla} \times \vec{F}_1 + a_2 \vec{\nabla} \times \vec{F}_2 . \tag{2.2.28}$$

Then, by a similar argument as before for the divergence, we find that the electric fields of $n$ charged particles or a charged object are nonrotational. In Section 2.6, we will see that the electric fields can rotate when there are time-dependent magnetic fields.

2.2.2 Electric flux, divergence theorem, and Gauss's law

As we have learned from the previous subsection, the divergence provides a way to measure the sources of the electric fields. In this section, we will introduce Gauss's law, which is a very different way to measure the source of an electric field. The equivalence of these two ways leads to the divergence theorem.

The idea of Gauss's law is that given a closed surface $S$, we would like to know if there is any net charge inside $S$ by analyzing the electric fields on $S$. To obtain more intuitions, let us consider the situation in which a charged particle is inside the closed surface $S$. When the charge of the particle is positive, the electric field lines of the particle are always pointing outward to the surface. Figure 2.5 shows the electric lines piercing a piece of the surface.

Figure 2.5: Electric field lines pierce a surface.

On the other hand, when the charge of the particle is negative, all the electric field lines are pointing inward to the surface. Hence, there should be a relation between the net number of electric field lines piercing a closed surface (the number of outward electric field lines minus the number of inward electric field lines) and the net electric charge inside the surface. Let us try to make this relation more precise.

First, we need to have a more precise definition of the "number of electric field lines piercing a surface". This leads to the following definition.
We would like to make our discussion a bit more general by working in $d$-dimensional space $\mathbb{R}^d$, and the "surfaces" in the following refer specifically to $(d-1)$-dimensional surfaces in $\mathbb{R}^d$. But you could always fix $d = 3$ if you like.

Definition 17 (Flux). The flux $\Phi$ of a vector field $\vec{F}$ through an oriented surface $S$ is defined as the integral

$$\Phi = \int_S \vec{F} \cdot d\vec{A} . \tag{2.2.29}$$

Let us try to decode this definition. Consider a very small piece of the surface with area $\Delta A$, which is small enough such that we could approximate it by a plane.

We define the area vector $\Delta\vec{A}$ by the following two conditions:

1. $|\Delta\vec{A}| = \Delta A$,

2. $\Delta\vec{A}$ is orthogonal to all the vectors along the plane, i.e. $\Delta\vec{A} \cdot \vec{v} = 0$ if $\vec{v}$ is along the plane.

The vector $d\vec{A}$ is defined as the limit when the area becomes infinitesimal. These two conditions fix the area vector $\Delta\vec{A}$ up to a sign. That is, if an area vector $d\vec{A}$ satisfies conditions 1 and 2, then the vector $-d\vec{A}$ also satisfies the same conditions. To capture the information of this sign, let us look at the unit normal vector of the surface $S$,

$$\hat{n} = \frac{d\vec{A}}{|d\vec{A}|} , \tag{2.2.30}$$

which is called the orientation of the surface $S$. To make the flux well-behaved, we would like to require that the orientation $\hat{n}$ should vary continuously locally on the surface $S$. More precisely, we demand the condition:

3. In a sufficiently small open neighborhood of every point on the surface $S$, the unit normal vector $\hat{n}$ is continuous.

If the orientation $\hat{n}$ can be extended to be continuous on the entire surface $S$, then we have a way to consistently choose the area vector $d\vec{A}$ on $S$. In fact, not every surface admits a continuous orientation. The surfaces that admit a continuous orientation are called orientable surfaces; the others are called non-orientable surfaces. We can only define the flux for orientable surfaces. The orientable surfaces with a chosen orientation are called oriented surfaces.
We regard the orientation as part of the definition of an oriented surface. That is, two oriented surfaces that coincide in space are regarded as different oriented surfaces if they have different orientations. For a closed oriented surface, we choose the convention that its orientation always points outward from the surface.

In (2.2.29), we take the inner product between $d\vec{A}$ and the vector field $\vec{F}$ and integrate over the surface $S$. Let $d = 3$ and the vector field $\vec{F}$ be the electric field. The formula (2.2.29) defines the electric flux, which is our precise definition of the "number of electric field lines piercing a surface". We can see that the definition (2.2.29) of the electric flux agrees with our expectation. Namely, when we have a positively (negatively) charged object inside the closed surface $S$, we find a positive (negative) flux. We will see later a precise formula (Gauss's law) on the relation between the flux through a closed surface and the net charge inside that surface.

Now, from our discussions in the previous subsection and this subsection, we have seen two ways to find the sources (charged objects) of the electric fields, by the divergence and by the flux. These two ways are beautifully related by the divergence theorem, also known as Gauss's theorem.

Theorem 2.2.1 (Divergence theorem). For a vector field $\vec{F}$ over $\mathbb{R}^d$,

$$\int_B \vec{\nabla} \cdot \vec{F} \, d^d x = \int_S \vec{F} \cdot d\vec{A} , \tag{2.2.31}$$

where $B$ is a bounded region whose boundary $\partial B = S$ is a piecewise smooth closed $(d-1)$-dimensional surface.

Let us leave the proof of the divergence theorem to your analysis class. Instead, we will try to understand the physical meaning of the divergence theorem. We will again focus on $d = 3$ and $\vec{F} = \vec{E}$.

Example. Consider a charged particle of charge $q$ at the origin $\vec{r} = 0$, which generates the electric field

$$\vec{E}(\vec{r}) = kq \frac{\vec{r}}{|\vec{r}|^3} . \tag{2.2.32}$$

Compute the flux of the electric field through a round two-sphere $S^2$ of radius $R$ centered at the origin.
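Solution: By the spherical symmetry, the vector $d\vec{A}$ is along the radial direction, and we have

$$d\vec{A} = \hat{r} |d\vec{A}| = \hat{r} \, dA . \tag{2.2.33}$$

It is convenient to work in the spherical coordinates $(r, \theta, \phi)$, which are related to the Cartesian coordinates $(x, y, z)$ by

$$\begin{aligned}
x &= r \sin\theta \cos\phi , \\
y &= r \sin\theta \sin\phi , \\
z &= r \cos\theta ,
\end{aligned} \tag{2.2.34}$$

where $r, \theta, \phi$ are in the range $r \in [0, \infty)$, $\theta \in [0, \pi]$, $\phi \in [0, 2\pi)$. We would like to know how to perform the volume integral and the surface integral in the spherical coordinates.

Let us consider a more general problem, the volume integral in a general coordinate system $(u, v, w)$. Consider a small cube in the $(u, v, w)$-coordinate, whose six faces are on the constant $u$, $v$, or $w$ planes. The sides of the cube have lengths $\Delta u$, $\Delta v$, and $\Delta w$. The volume of the corresponding region in the $(x, y, z)$-coordinate is not simply given by $\Delta u \Delta v \Delta w$, because its sides are rescaled and are not necessarily at right angles. When the cube is small enough, we have

$$\begin{aligned}
\Delta x &= \frac{\partial x}{\partial u} \Delta u + \frac{\partial x}{\partial v} \Delta v + \frac{\partial x}{\partial w} \Delta w + \cdots , \\
\Delta y &= \frac{\partial y}{\partial u} \Delta u + \frac{\partial y}{\partial v} \Delta v + \frac{\partial y}{\partial w} \Delta w + \cdots , \\
\Delta z &= \frac{\partial z}{\partial u} \Delta u + \frac{\partial z}{\partial v} \Delta v + \frac{\partial z}{\partial w} \Delta w + \cdots ,
\end{aligned} \tag{2.2.35}$$

where the $\cdots$ are the second and higher order terms $O(\Delta u^2, \Delta v^2, \Delta w^2, \Delta u \Delta v, \Delta v \Delta w, \Delta u \Delta w)$. In the matrix form, we have

$$\begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix} = J \begin{pmatrix} \Delta u \\ \Delta v \\ \Delta w \end{pmatrix} , \qquad J = \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w} \end{pmatrix} , \tag{2.2.36}$$

where $J$ is called the Jacobian matrix. Geometrically, this means that the cube in the $(u, v, w)$-coordinate is a parallelepiped in the $(x, y, z)$-coordinate with sides given by the vectors

$$\begin{aligned}
\overrightarrow{\Delta u} &= \left( \frac{\partial x}{\partial u} , \frac{\partial y}{\partial u} , \frac{\partial z}{\partial u} \right) \Delta u , \\
\overrightarrow{\Delta v} &= \left( \frac{\partial x}{\partial v} , \frac{\partial y}{\partial v} , \frac{\partial z}{\partial v} \right) \Delta v , \\
\overrightarrow{\Delta w} &= \left( \frac{\partial x}{\partial w} , \frac{\partial y}{\partial w} , \frac{\partial z}{\partial w} \right) \Delta w .
\end{aligned} \tag{2.2.37}$$

The volume of the parallelepiped is given by

$$\left| (\overrightarrow{\Delta u} \times \overrightarrow{\Delta v}) \cdot \overrightarrow{\Delta w} \right| = |\det J| \, \Delta u \Delta v \Delta w . \tag{2.2.38}$$

Let us compute the Jacobian for the spherical coordinates,

$$\det \begin{pmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} & \frac{\partial x}{\partial \phi} \\ \frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} & \frac{\partial y}{\partial \phi} \\ \frac{\partial z}{\partial r} & \frac{\partial z}{\partial \theta} & \frac{\partial z}{\partial \phi} \end{pmatrix} = \det \begin{pmatrix} \sin\theta \cos\phi & r \cos\theta \cos\phi & -r \sin\theta \sin\phi \\ \sin\theta \sin\phi & r \cos\theta \sin\phi & r \sin\theta \cos\phi \\ \cos\theta & -r \sin\theta & 0 \end{pmatrix} = r^2 \sin\theta . \tag{2.2.39}$$

The volume of a cube in the spherical coordinates is

$$r^2 \sin\theta \, \Delta r \Delta\theta \Delta\phi . \tag{2.2.40}$$

The integration measure in the spherical coordinates should be

$$r^2 \sin\theta \, dr \, d\theta \, d\phi . \tag{2.2.41}$$

Now, since a round two-sphere centered at the origin has a constant radial coordinate $r = R$, the area vector $d\vec{A}$ should be

$$d\vec{A} = \hat{r} \, dA = \hat{r} \, (r^2 \sin\theta \, d\theta \, d\phi) , \tag{2.2.42}$$

evaluated at $r = R$. The flux is then computed by the integral

$$\Phi = \int_0^{\pi} d\theta \int_0^{2\pi} d\phi \; kq \frac{\vec{r}}{r^3} \cdot \hat{r} \, (r^2 \sin\theta) = 4\pi k q . \tag{2.2.43}$$

Note that the result does not depend on the radius $R$.

Now, we can complete our argument that a three-dimensional Dirac delta function should sit at the right-hand side of the equation (2.2.10). From our previous computations (2.2.8) and (2.2.9), we know that the divergence of the electric field $\vec{\nabla} \cdot \vec{E}$ is zero except at the origin where the point charge sits. $\vec{\nabla} \cdot \vec{E}$ cannot be zero at the origin, because by the divergence theorem (2.2.31), we know that

$$\int_B \vec{\nabla} \cdot \vec{E} \, d^3 x = 4\pi k q , \tag{2.2.44}$$

where $B$ is the ball bounded by the two-sphere. In fact, the volume integral on the left-hand side only receives a contribution from the point at the origin $\vec{r} = 0$. Hence, $\vec{\nabla} \cdot \vec{E}$ must be proportional to a Dirac delta function in order to give a nonzero volume integral. The proportionality constant can be fixed by (2.2.44) to be $4\pi k q$.

The divergence theorem has a very profound consequence on the electric flux.

Corollary 2.2.2. For a divergence-free vector field $\vec{F}$, i.e. $\vec{\nabla} \cdot \vec{F} = 0$, its flux through a surface $S$ is invariant under local continuous deformation of $S$.

To see this corollary, let us consider the following example.

Example. A charged object of a charge density $\rho(\vec{r})$ is inside a closed surface $S$ as shown in Figure 2.6.

Figure 2.6: The electric flux is invariant under deformations on $S$ as long as the deformation does not cross any charges.

Now let us consider continuously deforming the surface $S$ to $S'$, and let $\Delta B$ be the region bounded by the two surfaces $S$ and $S'$, i.e. $\partial(\Delta B) = S \cup \overline{S'}$, where $\overline{S'}$ denotes the orientation reversal of $S'$. Assume that there is no charged object inside $\Delta B$.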
We can compute the difference of the electric flux through $S$ and through $S'$,

$$\Delta\Phi = \int_S \vec{E} \cdot d\vec{A} - \int_{S'} \vec{E} \cdot d\vec{A} = \int_{S \cup \overline{S'}} \vec{E} \cdot d\vec{A} = \int_{\Delta B} \vec{\nabla} \cdot \vec{E} \, d^3 x = 0 , \tag{2.2.45}$$

where $\overline{S'}$ denotes $S'$ with its orientation reversed, and at the last equality we used the fact that there is no charged object inside $\Delta B$, so $\vec{\nabla} \cdot \vec{E} = 0$ inside $\Delta B$.

We can compute the electric flux directly using the divergence theorem and the formula (2.2.20). Let $B$ be the region bounded by $S$. We have

$$\Phi = \int_S \vec{E} \cdot d\vec{A} = \int_B \vec{\nabla} \cdot \vec{E} \, d^3 x = 4\pi k \int_B \rho(\vec{r}) \, d^3 x = 4\pi k Q , \tag{2.2.46}$$

where $Q$ is the total (net) charge inside the surface $S$. We have found that the electric flux through a closed surface $S$ equals $4\pi k$ times the total net charge inside $S$. This statement is called Gauss's law of the electric field. The surface $S$ is also called a Gaussian surface.

2.2.3 Stokes' theorem, Poincaré lemma, and electric potential

In Section 2.2.1, we have seen that curl measures the rotation of a vector field. Let us introduce a different way to measure the rotation of a vector field. The rotation of a vector field $\vec{F}$ along a closed curve $C$ can be measured by the following loop integral:

$$\oint_C \vec{F} \cdot d\vec{r} . \tag{2.2.47}$$

Example. Compute the loop integral (2.2.47) of the vector field

$$\vec{F}(\vec{r}) = (y, -x, 0) , \tag{2.2.48}$$

along a curve $C$ which is a counter-clockwise circle of radius $r$ centered at the origin.

[Figure: the circle $C$ drawn on top of the field lines of $\vec{F}(\vec{r}) = (y, -x, 0)$ in the x-y plane.]

Solution: We compute

$$\oint_C \vec{F} \cdot d\vec{r} = - \oint_C |\vec{F}| |d\vec{r}| = - \int_0^{2\pi} r^2 \, d\theta = -2\pi r^2 , \tag{2.2.49}$$

where at the first equality we use $\vec{F} \cdot d\vec{r} = -|\vec{F}| |d\vec{r}|$ because the vector field $\vec{F}$ always points in the direction opposite to $d\vec{r}$, and at the second equality we use $|\vec{F}| = r$ and $|d\vec{r}| = r \, d\theta$ on the circle. We see that the loop integral is indeed nonzero for a rotating vector field.

Let us also consider an example of the loop integral (2.2.47) for a nonrotating vector field.

Example. Compute the loop integral (2.2.47) of the electric field of a point charge $q$ at the origin,

$$\vec{E}(\vec{r}) = kq \frac{\vec{r}}{r^3} . \tag{2.2.50}$$

Solution: Let us take the curve to be a circle of radius $r$ on the x-y plane centered at the origin with counter-clockwise orientation, so we have

$$d\vec{r} = (-\sin\phi, \cos\phi, 0) \, r \, d\phi . \tag{2.2.51}$$

Since $\vec{E}$ is radial and $d\vec{r}$ is tangent to the circle, we can easily see that

$$\vec{E} \cdot d\vec{r} = 0 . \tag{2.2.52}$$

So the loop integral indeed vanishes for the nonrotating field (2.2.50).

The equivalence of the two ways of measuring the rotation of the vector field, by taking curl and by the loop integral, leads to Stokes' theorem.

Theorem 2.2.3 (Stokes' theorem). Let $S$ be a smooth surface in $\mathbb{R}^3$ with boundary $C = \partial S$ a piecewise smooth curve. For any smooth vector field $\vec{F}(\vec{r})$, we have

$$\int_S (\vec{\nabla} \times \vec{F}) \cdot d\vec{A} = \oint_C \vec{F} \cdot d\vec{r} . \tag{2.2.53}$$

We will again leave the proof of Stokes' theorem to your analysis class. Instead, let us verify Stokes' theorem for the examples (2.2.48) and (2.2.50).

First, we have computed the curl of (2.2.48) previously in (2.2.24). Taking $S$ to be the disk of radius $r$ bounded by $C$, with $d\vec{A} = \hat{z} \, r' dr' d\theta$, we have

$$\int_S (\vec{\nabla} \times \vec{F}) \cdot d\vec{A} = \int_S (0, 0, -2) \cdot d\vec{A} = \int_0^r dr' \int_0^{2\pi} d\theta \, (-2r') = -2\pi r^2 . \tag{2.2.54}$$

We got the same answer as the loop integral (2.2.49).

Next, we look at the example (2.2.50). We found that the curl of (2.2.50) was zero previously in (2.2.26) (except at the origin). Now, we can give an argument that $\vec{\nabla} \times \vec{E}$ is also zero at the origin. We can take the surface $S$ to be a small disk $D^2_\epsilon$ on the x-y plane of radius $\epsilon$ centered at the origin. The area vector on $D^2_\epsilon$ points in the z-direction,

$$d\vec{A} = \hat{z} \, dA . \tag{2.2.55}$$

Now, let us consider

$$\begin{aligned}
(\vec{\nabla} \times \vec{E})_z(0) &= \lim_{\epsilon \to 0} \frac{1}{\pi \epsilon^2} \int_{D^2_\epsilon} (\vec{\nabla} \times \vec{E})_z \, dA \\
&= \lim_{\epsilon \to 0} \frac{1}{\pi \epsilon^2} \int_{D^2_\epsilon} (\vec{\nabla} \times \vec{E}) \cdot d\vec{A} \\
&= \lim_{\epsilon \to 0} \frac{1}{\pi \epsilon^2} \oint_{\partial D^2_\epsilon} \vec{E} \cdot d\vec{r} \\
&= \lim_{\epsilon \to 0} \frac{0}{\pi \epsilon^2} \\
&= 0 .
\end{aligned} \tag{2.2.56}$$

On the first equality, we have rewritten $(\vec{\nabla} \times \vec{E})_z(0)$ as the average of $(\vec{\nabla} \times \vec{E})_z$ over the infinitesimal disk $D^2_\epsilon$. We have used (2.2.55) on the second equality, Stokes' theorem on the third equality, and (2.2.52) on the fourth equality. In similar ways, we can show that $(\vec{\nabla} \times \vec{E})_x = 0 = (\vec{\nabla} \times \vec{E})_y$. Hence, $\vec{\nabla} \times \vec{E} = 0$ everywhere.
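The two loop integrals (2.2.49) and (2.2.52) can also be evaluated numerically by discretizing the circle. The sketch below is only an illustration under stated assumptions: the radius, the number of segments, the choice $kq = 1$, and the helper names (`loop_integral`, `rotating_field`, `coulomb_field`) are hypothetical and not taken from the notes.

```python
import math


def loop_integral(field, radius, n=100000):
    """Midpoint-rule approximation of the loop integral (2.2.47) around a
    counter-clockwise circle of the given radius in the x-y plane."""
    dphi = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        x, y = radius * math.cos(phi), radius * math.sin(phi)
        # Line element d r = (-sin(phi), cos(phi), 0) r dphi, as in (2.2.51).
        dx = -radius * math.sin(phi) * dphi
        dy = radius * math.cos(phi) * dphi
        fx, fy, _ = field(x, y, 0.0)
        total += fx * dx + fy * dy
    return total


def rotating_field(x, y, z):
    """The rotating vector field (2.2.48)."""
    return (y, -x, 0.0)


def coulomb_field(x, y, z, kq=1.0):
    """The point-charge field (2.2.50); kq = 1 is an assumed unit choice."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (kq * x / r3, kq * y / r3, kq * z / r3)


radius = 0.5
rot = loop_integral(rotating_field, radius)
coul = loop_integral(coulomb_field, radius)

print("rotating field:", rot, "compared with -2*pi*r^2 =", -2 * math.pi * radius**2)
print("Coulomb field: ", coul)
```

The first value should reproduce $-2\pi r^2$ and the second should vanish up to rounding, which is the same comparison that Stokes' theorem makes with the surface integrals (2.2.54) and (2.2.56).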
As discussed in Section 2.2.1, static electric fields are always nonrotational. There is a beautiful theorem saying that nonrotational is equivalent to conservative.

Theorem 2.2.4 (Poincaré lemma). For vector fields defined everywhere on $\mathbb{R}^3$, conservative is equivalent to nonrotational, i.e.

$$\vec{F} = \vec{\nabla}\phi \iff \vec{\nabla} \times \vec{F} = 0 . \tag{2.2.57}$$

Proof: First, let us prove the $\Rightarrow$ direction. We have

$$\vec{F} = \vec{\nabla}\phi , \tag{2.2.58}$$

or in component form

$$F_i = \frac{\partial \phi}{\partial x_i} . \tag{2.2.59}$$

Now, we compute the components of $\vec{\nabla} \times \vec{F}$ using (2.2.22),

$$(\vec{\nabla} \times \vec{F})_i = \sum_{j,k=1}^{3} \epsilon_{ijk} \frac{\partial F_k}{\partial x_j} = \sum_{j,k=1}^{3} \epsilon_{ijk} \frac{\partial^2 \phi}{\partial x_j \partial x_k} = 0 , \tag{2.2.60}$$

where at the last equality we have used the fact that partial derivatives commute with each other, so the second derivative is symmetric in $j$ and $k$ while $\epsilon_{ijk}$ is antisymmetric.

Next, let us prove the $\Leftarrow$ direction. We assume that $\vec{F}$ is a nonrotational field,

$$\vec{\nabla} \times \vec{F} = 0 . \tag{2.2.61}$$

Let us define a scalar field $\phi(\vec{r})$ by the line integral

$$\phi(\vec{r}) = \int_{C(\vec{r}_0, \vec{r})} \vec{F}(\vec{r}') \cdot d\vec{r}' , \tag{2.2.62}$$

where $C(\vec{r}_0, \vec{r})$ is a curve from $\vec{r}_0$ to $\vec{r}$, and $\vec{r}_0$ is any fixed reference point. This line integral defines an unambiguous scalar field because it only depends on the boundary points of the curve but not the curve itself. To see this, let us consider the difference of the line integral along the curves $C_1(\vec{r}_0, \vec{r})$ and $C_2(\vec{r}_0, \vec{r})$,

$$\int_{C_1(\vec{r}_0, \vec{r})} \vec{F}(\vec{r}') \cdot d\vec{r}' - \int_{C_2(\vec{r}_0, \vec{r})} \vec{F}(\vec{r}') \cdot d\vec{r}' = \oint_{C_1(\vec{r}_0, \vec{r}) \cup \overline{C_2(\vec{r}_0, \vec{r})}} \vec{F}(\vec{r}') \cdot d\vec{r}' = \int_S (\vec{\nabla} \times \vec{F}) \cdot d\vec{A} = 0 ,$$

where $\overline{C_2}$ denotes $C_2$ traversed in the opposite direction, and at the second equality we have used Stokes' theorem for the surface $S$ with boundary $\partial S = C_1(\vec{r}_0, \vec{r}) \cup \overline{C_2(\vec{r}_0, \vec{r})}$.

Figure 2.7: Changing the curve in the definition of the scalar field.

Now, it is easy to check that the scalar field $\phi(\vec{r})$ defined in (2.2.62) satisfies

$$\vec{F} = \vec{\nabla}\phi . \tag{2.2.63}$$

Hence, $\vec{F}$ is conservative. Q.E.D.

Since static electric fields are always nonrotational, they are always conservative. In other words, for a static electric field $\vec{E}(\vec{r})$, there always exists a scalar field $V(\vec{r})$ such that

$$\vec{E} = -\vec{\nabla} V . \tag{2.2.64}$$

This scalar field $V(\vec{r})$ is called an electric potential.

2.3 Applying Gauss's Law, Electric potential

In this section, we discuss various applications of Gauss's law and electric potential.

2.3.1 A charged isolated conductor

Application. Gauss's law implies that if an isolated conductor carries an excess charge, the charge would be entirely on the surface of the conductor.

Conductors are materials in which charged particles (electrons) are free to move; examples include metals (such as copper in common lamp wire), the human body, and tap water. The charged particles in nonconductors (insulators) are not free to move; examples include rubber (such as the insulation on common lamp wire), plastic, glass, and chemically pure water.

The net electric field inside a conductor must be zero, because, in a generic situation, a nonzero electric field would not always point in the normal direction to the surface of the conductor, and it would exert forces on the charged particles to make them move and redistribute. Eventually, an equilibrium configuration would be achieved, such that there is no net force on any charged particle; hence, there is no net electric field inside the conductor.

Figure 2.8: Cross-sections of conductors.

The left picture in Figure 2.8 shows the cross-section of a conductor. We can consider a Gaussian surface that is very close to the surface of the conductor, shown as the red curve in the picture. There is no electric flux through this Gaussian surface, because, as we just argued, there is no net electric field inside the conductor. By Gauss's law, the net charge inside the Gaussian surface must be zero. By shrinking the Gaussian surface to a smaller size inside the conductor, we can further argue that not just the net charge is zero, but the charge density at every interior point is zero.

The right picture in Figure 2.8 shows a more complicated situation, a conductor with a cavity that contains a positive charge. In this case, there are non-zero electric fields in the cavity.
We choose a Gaussian surface to be the union of two surfaces, shown as the red curves, that are very close to the inner and outer surfaces of the conductor. Again, there is no electric flux through either of the two surfaces; hence, the excess charge in the conductor should be on the inner or outer surface of the conductor.

2.3.2 Combining Gauss's law with symmetry

Determining the electric field configuration for a given charge distribution is generally a complex task. However, if the charge distribution exhibits a certain symmetry, Gauss's law can be applied to deduce the electric field configuration. This simplifies the calculation significantly, as the symmetry reduces the complexity of the problem.

Spherical symmetry

Consider a point charge $q$ located at the origin. This system possesses spatial $O(3)$ symmetry, meaning that it is invariant under the action of the group elements, which are 3-by-3 real matrices $R$ that satisfy $R^T R = I$. The symmetry action is $\vec{r}' = R\vec{r}$. Among these elements, there are several special symmetry operations that transform the system in a specific way.

(1) $SO(3)$ rotational symmetry, which satisfies $\det R = 1$. We can decompose any three-dimensional rotation into a sequence of rotations around different axes. This allows us to represent any rotation by three angles, known as Euler angles. The rotations about a given axis form a group $SO(2)$.

(2) Mirror reflection symmetry involves the reflection of the entire system about a given plane. For example, consider a reflection through the xy-plane. The position vector of an arbitrary point $P(x, y, z)$ changes to $P'(x, y, -z)$ under reflection.

(3) Parity symmetry maps a point $P(x, y, z)$ to its opposite point $P'(-x, -y, -z)$ with respect to the origin, i.e., $\vec{r}' = -\vec{r}$.

Now we ask what is the transformation law of the electric field of the charged particle under the $O(3)$ transformations. According to Coulomb's law, the electric field of the point charge is given by $\vec{E}(\vec{r}) = \frac{kq\vec{r}}{|\vec{r}|^3}$. Now suppose we apply a transformation $R \in O(3)$ to the coordinate system, such that $\vec{r}$ is transformed to $\vec{r}' = R\vec{r}$. We can simply substitute $\vec{r}' = R\vec{r}$ in the expression for $\vec{E}(\vec{r})$ to obtain

$$\vec{E}(\vec{r}) = \frac{kq\vec{r}}{|\vec{r}|^3} \;\to\; \vec{E}(R\vec{r}) = \vec{E}(\vec{r}') = \frac{kq\vec{r}'}{|\vec{r}'|^3} = \frac{kq R\vec{r}}{|R\vec{r}|^3} = \frac{kq R\vec{r}}{|\vec{r}|^3} = R\vec{E}(\vec{r}) , \tag{2.3.1}$$

where we used the condition $R^T R = I$ in the denominator, which gives $|R\vec{r}| = |\vec{r}|$. The result $\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$ implies that the electric field transforms in the same way as the position vector $\vec{r}$ under the $O(3)$ symmetry.

We can use the condition $\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$ and Gauss's law to determine the electric field of a point charge. Let's assume that $\vec{r}$ is along the z axis. In this case, a rotation $R_z(\theta)$ about the z axis leaves $\vec{r}$ fixed. So, we have $\vec{E}(\vec{r}) = \vec{E}(R_z(\theta)\vec{r}) = R_z(\theta)\vec{E}(\vec{r})$. This implies that $\vec{E}(\vec{r})$ must also be along the z direction. Therefore, we have determined that the direction of the electric field of a point charge is always radial to the point charge. We can further determine that the magnitude of the electric field is spherically symmetric by taking the norm of the vector equation $\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$. This gives $|\vec{E}(R\vec{r})| = |R\vec{E}(\vec{r})| = |\vec{E}(\vec{r})|$. Hence, both the direction and magnitude of the electric field of a point charge are spherically symmetric.

Now consider a charge distribution described by a density function $\rho(\vec{x})$. The electric field at point $\vec{r}$ is given by Coulomb's law as

$$\vec{E}(\vec{r}) = \int d^3\vec{x} \, \frac{k\rho(\vec{x})(\vec{r} - \vec{x})}{|\vec{r} - \vec{x}|^3} , \tag{2.3.2}$$

where the integral is taken over the volume of the charge distribution. If the charge distribution is spherically symmetric, meaning that its density function satisfies

$$\rho(R\vec{x}) = \rho(\vec{x}), \quad \forall R \in O(3), \tag{2.3.3}$$

then the electric field will transform under $R$ to

$$\vec{E}(\vec{r}') = \vec{E}(R\vec{r}) = \int d^3\vec{x} \, \frac{k\rho(\vec{x})(R\vec{r} - \vec{x})}{|R\vec{r} - \vec{x}|^3} = R \int d^3\vec{x} \, \frac{k\rho(\vec{x})(\vec{r} - R^{-1}\vec{x})}{|\vec{r} - R^{-1}\vec{x}|^3} = R\vec{E}(\vec{r}). \tag{2.3.4}$$

In the second-to-last step, we wrote $R\vec{r} - \vec{x} = R(\vec{r} - R^{-1}\vec{x})$ and used $|R\vec{v}| = |\vec{v}|$. In the last step, we changed the integration variable to $R^{-1}\vec{x}$ and used the conditions $\rho(R\vec{x}) = \rho(\vec{x})$ and $d^3(R^{-1}\vec{x}) = d^3\vec{x}$.
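As a quick sanity check of the transformation law $\vec{E}(R\vec{r}) = R\vec{E}(\vec{r})$ in (2.3.1), one can evaluate both sides numerically for a rotation and for a reflection. This is a minimal sketch only: the choice $kq = 1$, the sample point, the rotation angle, and the helper names (`apply_matrix`, `rotation_z`, `covariance_defect`) are illustrative assumptions.

```python
import math


def coulomb_field(r, kq=1.0):
    """Point-charge field kq r / |r|^3; kq = 1 is an assumed unit choice."""
    norm3 = math.sqrt(sum(c * c for c in r)) ** 3
    return [kq * c / norm3 for c in r]


def apply_matrix(matrix, v):
    """Multiply a 3x3 matrix (list of rows) with a 3-vector."""
    return [sum(matrix[i][j] * v[j] for j in range(3)) for i in range(3)]


def rotation_z(theta):
    """Rotation about the z axis, an element of SO(3) with det R = 1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Reflection through the xy-plane, an element of O(3) with det R = -1.
MIRROR_XY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]


def covariance_defect(matrix, r):
    """Largest component of E(R r) - R E(r); zero if (2.3.1) holds."""
    lhs = coulomb_field(apply_matrix(matrix, r))
    rhs = apply_matrix(matrix, coulomb_field(r))
    return max(abs(a - b) for a, b in zip(lhs, rhs))


point = [0.3, -1.2, 0.8]
for name, matrix in (("rotation", rotation_z(0.7)), ("mirror", MIRROR_XY)):
    print(name, covariance_defect(matrix, point))
```

Both defects should vanish up to floating-point rounding. The same check applied to a matrix that is not orthogonal would generally fail, because the step $|R\vec{r}| = |\vec{r}|$ relies on $R^T R = I$.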
The transformation law tells us that the electric field has the same symmetry as that of the point charge at the origin. This means that the magnitude of the electric field depends only on the distance $|\vec{r}|$ from the origin and its direction is along the radial direction $\hat{r}$. Hence, we can express the electric field at any point as $\vec{E}(\vec{r}) = E(|\vec{r}|)\hat{r}$, where $E(|\vec{r}|)$ is a scalar function that only depends on the distance $|\vec{r}|$.

Figure 2.9: Gaussian surface for a spherically symmetric charge distribution.

By choosing a Gaussian surface to be a sphere that encloses the charge distribution, we can apply Gauss's law to determine the scalar function $E(|\vec{r}|)$. Explicitly, for a sphere of radius $r$ enclosing the total charge $Q$, Gauss's law (2.2.46) gives $E(r) \cdot 4\pi r^2 = 4\pi k Q$, so $E(r) = kQ/r^2$. We therefore have the following result: If a charge distribution has spherical symmetry, then a charged particle outside the distribution will be attracted or repelled by the distribution as if all the charge were concentrated at the center of the distribution.

Cylindrical symmetry

Now, let's consider an infinite cylinder with cylindrical symmetry along the z direction and a charge distribution that also has cylindrical symmetry. We can utilize the symmetry of the system and Gauss's law to determine the electric field generated by this charge distribution.

This system exhibits lower rotational symmetry compared to the point charge at the origin. Specifically, the rotational symmetry is only preserved about the z axis, reducing from $SO(3)$ to $SO(2)$. However, the reflection symmetry is preserved if the mirror plane is perpendicular to the z axis or contains the z axis. Additionally, the system also exhibits the parity symmetry that maps $\vec{r}$ to $-\vec{r}$. Finally, compared to the point charge case, the system exhibits a new translational symmetry along the z direction.

To determine the transformation rules for the electric field under a symmetry transformation in a cylindrical system, we can use Coulomb's law and apply the same argument used for spherically symmetric systems. Let $g$ be a symmetry transformation of the cylindrical system.
Since the charge density $\rho$ is symmetric under $g$, we have $\rho(g\vec{r}) = \rho(\vec{r})$. Moreover, since the differential volume element $d^3\vec{r}$ is invariant under $g$, we have $d^3(g\vec{r}) = d^3\vec{r}$. Therefore, using Coulomb's law, we find that the electric field transforms as

$$\vec{E}(g\vec{r}) = g\vec{E}(\vec{r}) , \tag{2.3.5}$$

where on the right-hand side $g$ acts on the vector $\vec{E}$ through its rotation or reflection part only; a pure translation acts trivially on $\vec{E}$.

This means that the electric field from the system also exhibits cylindrical symmetry, and we can use symmetry operations to determine its direction at any given point. For example, by choosing $g$ to be a translational symmetry along the z-axis, we find that the electric field is invariant under translation. By applying some simple symmetry operations, we can prove that the electric field at a position $(x, y, z)$ should be along the direction of $(x, y, 0)$. Similarly, we can use the symmetry argument to show that the magnitude of the electric field is only a function of $r = \sqrt{x^2 + y^2}$.

Figure 2.10: Gaussian surface for a cylindrically symmetric charge distribution.

We can select a cylinder of radius $r$ and height $h$, coaxial with the z axis, as our Gaussian surface. The flux through the two end caps vanishes because the field is parallel to them. Applying Gauss's law to this surface, with $4\pi k = 1/\epsilon_0$, we obtain:

$$\Phi = \oint \vec{E} \cdot d\vec{S} = E(r) \cdot 2\pi r h = \frac{Q_{\rm enc}}{\epsilon_0} = \frac{\lambda h}{\epsilon_0} , \tag{2.3.6}$$

where $Q_{\rm enc}$ is the charge enclosed by the surface, and $\lambda$ is the linear charge density along the z-direction. Solving for $E(r)$, we obtain:

$$E(r) = \frac{\lambda}{2\pi\epsilon_0 r} . \tag{2.3.7}$$

We see that a charged particle outside the distribution will be attracted or repelled by the cylindrically symmetric charge distribution as if all the charge were concentrated on the z axis.

Planar symmetry

Consider a thin, infinite, nonconducting sheet or plane with a uniform (positive) surface charge density $\sigma$. This system has translational symmetry along any direction in the xy-plane, and is also invariant under any mirror reflection whose reflection plane is perpendicular to the charged plane.
Using these symmetries and the transformation rule

$$\vec{E}(g\vec{r}) = g\vec{E}(\vec{r}), \tag{2.3.8}$$

we can show that the electric field is always perpendicular to the charged sheet at any point in space. It has the form $\vec{E}(\vec{r}) = E(z)\hat{z}$, where $E(z)$ is a scalar function that depends only on the z coordinate of the position $\vec{r}$ and satisfies $E(-z) = -E(z)$. The latter property follows from the reflection $z \to -z$ through the sheet itself, which is also a symmetry of the system.

Figure 2.11: Gaussian surface for a charged plane.

A useful Gaussian surface is a closed cylinder with end caps of area $A$ located at heights $\pm z$ with $z > 0$ (see Fig. 2.11). Since $\vec{E}$ is along the $\hat{z}$ direction, there is no flux through the side of the cylinder, and the electric flux can be calculated easily as

$$\Phi = \oint \vec{E} \cdot d\vec{S} = E(z)A - E(-z)A = 2E(z)A = \frac{\sigma A}{\epsilon_0} .$$

The solution for the electric field is

$$\vec{E}(\vec{r}) = \frac{\sigma}{2\epsilon_0} \, {\rm sgn}(z) \, \hat{z}. \tag{2.3.9}$$

This result agrees with the previous calculation Eq. (2.1.10) using Coulomb's law directly. However, the derivation here using symmetry and Gauss's law is much simpler.

Figure 2.12: (a) An infinite conducting charged plate. (b) Two infinite conducting plates with opposite charges.

Using the electric field of a charged plane, we can also derive the charge and electric field distribution for an infinite conducting charged plate (see Fig. 2.12a). As we have previously discussed, the charge on the conducting plate must be located on its surface and must be uniformly distributed. If it were not uniformly distributed, the electric field parallel to the surface would cause the charge to move. Therefore, the surface charge density must be constant on each surface. If the charge density on each surface of the conducting plate is $\sigma_i$, the electric field generated by that surface will have a magnitude of $\frac{\sigma_i}{2\epsilon_0}$. To ensure that the total electric field inside the plate is zero, we must have $\sigma_1 = \sigma_2 = \sigma$, meaning that the two surfaces of the plate carry the same surface charge density. Therefore, the total electric field outside the plate is $2 \times \frac{\sigma}{2\epsilon_0} = \frac{\sigma}{\epsilon_0}$.

Let's consider another example involving two infinite conducting charged plates with opposite charges $q$ and $-q$ as shown in Fig. 2.12b.
In this case, the charges should be uniformly distributed on the four surfaces of the plates. The question is, how much charge is on each surface? To solve this, we start by examining the electric field inside the left plate due to the right plate. It is given by $\frac{\sigma}{2\epsilon_0}$ pointing to the right, where $\sigma$ is the magnitude of the total surface charge density of the right plate. For the total electric field inside the left plate to be zero, all of the charge on the left plate should be distributed on its right surface. This ensures that the electric field inside the left plate cancels out. Similarly, all the charge of the right plate should be on its left surface. This completes the charge distribution for the two plates with opposite charges.

2.3.3 Electric potential

Definition of electric potential

Given a conservative force, we can determine the potential energy that produces that force. In general, the relationship between force and potential energy is given by:

$$\vec{F} = -\nabla U. \tag{2.3.10}$$

Since the Coulomb force is proportional to the test charge, we usually define the electric potential $V$ in terms of the electric potential energy $U$ as:

$$U = qV. \tag{2.3.11}$$

The electric potential is a scalar quantity measured in volts, denoted by the symbol V. One volt is defined as one joule of energy per coulomb of charge. It is important to note that the electric potential depends only on the charge distribution and not on the test charge itself.

From the Coulomb force $\vec{F} = q\vec{E}$, we can derive the relation between the electric potential and the electric field. Consider a test charge $q_0$ in an electric field $\vec{E}$. When the test charge is moved from point $A$ to point $B$ along a path, the work done by the electric force is given by

$$W = \int_A^B \vec{F} \cdot d\vec{l} = \int_A^B q_0 \vec{E} \cdot d\vec{l} = q_0 \int_A^B \vec{E} \cdot d\vec{l} . \tag{2.3.12}$$

The work done is also equal to minus the change in electric potential energy of the test charge, $W = -\Delta U$, where

$$\Delta U = q_0 \Delta V = q_0 (V_B - V_A). \tag{2.3.13}$$

This shows that

$$V_B - V_A = -\int_A^B \vec{E} \cdot d\vec{l}. \tag{2.3.14}$$

If we further choose $V_A = 0$ for some reference point $A$, then the electric potential can be calculated as the line integral of the electric field as

$$V(\vec{r}) = -\int_A^{\vec{r}} \vec{E} \cdot d\vec{l}. \tag{2.3.15}$$

Conversely, the electric field can be obtained from the electric potential using the relation

$$\vec{E} = -\nabla V, \tag{2.3.16}$$

where $\nabla$ is the gradient operator.

The definition of electric potential above does not depend on the path chosen between two points. This is known as the path-independence of electric potential, which can be proven using Stokes' theorem. The line integral of the electric field along a closed path is zero, as shown by

$$\oint_{C = \partial S} \vec{E} \cdot d\vec{l} = \int_S (\nabla \times \vec{E}) \cdot d\vec{S} = 0, \tag{2.3.17}$$

where $C = L' - L$ is the closed path formed by going along a path $L'$ and returning along another path $L$ with the same endpoints, and $S$ is some surface satisfying $C = \partial S$. The last equality holds because electrostatic fields are conservative, $\nabla \times \vec{E} = 0$. Therefore, the electric potential is a well-defined scalar field.

Calculating electric potential

The electric potential of a point charge $q$, with the reference point taken at infinity, can be calculated using the definition of electric potential as

$$V(\vec{r}) = -\int_{+\infty}^{\vec{r}} \vec{E} \cdot d\vec{l} = -\int_{+\infty}^{|\vec{r}|} \frac{kq}{r'^2} \, dr' = \frac{kq}{|\vec{r}|} = \frac{1}{4\pi\epsilon_0} \frac{q}{|\vec{r}|} . \tag{2.3.18}$$

This expression shows that the electric potential of a point charge varies inversely with the distance from the charge, similar to the potential of Newton's gravitation.

For a system of charged particles, the electric potential is the sum of the individual electric potentials generated by each individual charge:

$$V(\vec{r}) = \sum_i V_i(\vec{r}) = \sum_i \frac{1}{4\pi\epsilon_0} \frac{q_i}{|\vec{r} - \vec{r}_i|} . \tag{2.3.19}$$

Here, $q_i$ represents the charge of the $i$th particle, and $\vec{r}_i$ represents its position.

For a charged distribution or continuum with density $\rho(\vec{r}')$, the electric potential at a point $\vec{r}$ can be calculated using a scalar integral. This integral is given by:

$$V(\vec{r}) = \frac{1}{4\pi\epsilon_0} \int \frac{dq'}{|\vec{r} - \vec{r}'|} = \frac{1}{4\pi\epsilon_0} \int \frac{\rho(\vec{r}') \, dV'}{|\vec{r} - \vec{r}'|} , \tag{2.3.20}$$

where $dV'$ is the volume element and $dq' = \rho(\vec{r}') dV'$ is the infinitesimal charge at a point $\vec{r}'$.
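The relation $\vec{E} = -\nabla V$ in (2.3.16) can be checked numerically for the point-charge potential (2.3.19) by comparing a finite-difference gradient of $V$ with the Coulomb field (2.2.15). The following is a minimal sketch under stated assumptions: units with $k = 1/(4\pi\epsilon_0) = 1$, an arbitrary pair of opposite charges, and hypothetical helper names (`potential`, `field`, `minus_gradient`).

```python
import math

K = 1.0  # assumption: units in which k = 1 / (4 pi epsilon_0) = 1

# Illustrative configuration (not from the notes): two opposite charges,
# each given as (q_i, position r_i).
CHARGES = [(+1.0, (0.0, 0.0, 0.1)), (-1.0, (0.0, 0.0, -0.1))]


def potential(r):
    """Electric potential (2.3.19) of the point charges at position r."""
    return sum(K * q / math.dist(r, ri) for q, ri in CHARGES)


def field(r):
    """Electric field (2.2.15) of the same point charges."""
    e = [0.0, 0.0, 0.0]
    for q, ri in CHARGES:
        d3 = math.dist(r, ri) ** 3
        for a in range(3):
            e[a] += K * q * (r[a] - ri[a]) / d3
    return e


def minus_gradient(scalar, r, h=1e-5):
    """Central-difference estimate of -grad(scalar) at position r."""
    result = []
    for a in range(3):
        forward, backward = list(r), list(r)
        forward[a] += h
        backward[a] -= h
        result.append(-(scalar(forward) - scalar(backward)) / (2 * h))
    return result


point = (0.5, 0.2, 0.4)
print("E from Coulomb's law:", field(point))
print("E from -grad V:      ", minus_gradient(potential, point))
```

The two vectors should agree up to the finite-difference error. The scalar sum in `potential` is also noticeably simpler than the vector sum in `field`, which illustrates why working with the potential is often more convenient.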
The integral (2.3.20) takes into account the contribution to the potential from each infinitesimal charge in the distribution.

Electric potential and field of an electric dipole

Now let us calculate the electric potential of an electric dipole. The total electric potential at any point is the sum of the potentials due to the individual charges. For a dipole with charges $+q$ and $-q$ separated by a distance $d$, the electric potential at a point $P$ at a distance $r$ from the center of the dipole is

\[
V(\vec r) = \frac{1}{4\pi\epsilon_0} \left( \frac{q}{r_{(+)}} - \frac{q}{r_{(-)}} \right) = \frac{q}{4\pi\epsilon_0} \frac{r_{(-)} - r_{(+)}}{r_{(+)} r_{(-)}}, \tag{2.3.21}
\]

where $r_{(+)}$ and $r_{(-)}$ are the distances from $P$ to the positive and negative charges. When $r$ is much larger than $d$, we can approximate $r_{(-)} - r_{(+)}$ by $d\cos\theta$, where $\theta$ is the angle between $\vec r$ and $\vec d$. In the denominator, $r_{(+)} r_{(-)}$ can be approximated by $r^2$. This gives the simplified expression

\[
V \approx \frac{1}{4\pi\epsilon_0} \frac{qd\cos\theta}{r^2} = \frac{1}{4\pi\epsilon_0} \frac{q\, \hat r \cdot \vec d}{r^2} = \frac{1}{4\pi\epsilon_0} \frac{\vec p \cdot \hat r}{r^2}, \tag{2.3.22}
\]

where $\vec p = q\vec d$ is the dipole moment. Note that the electric potential of an electric dipole varies inversely with the square of the distance from the center of the dipole. From this electric potential, we can derive the electric field of the dipole as

\[
\vec E(\vec r) = -\nabla V(\vec r) = k\, \frac{3(\vec p \cdot \hat r)\hat r - \vec p}{r^3}. \tag{2.3.23}
\]

This agrees with the previous result Eq. (2.1.7) derived directly from Coulomb's law.

Figure 2.13: Calculating the electric potential of an electric dipole.

In general, calculating the electric potential of a charge distribution is much simpler than calculating the electric field of the same distribution, because the former involves a scalar integral while the latter involves a vector integral.

Electric potentials and fields of charged ring and disk

As another example, let us calculate the electric potentials and fields of a charged ring and a charged disk. For a charged ring with equation $x^2 + y^2 = R^2$ and line charge density $\lambda$, the electric potential at the point $\vec r = (0, 0, z)$ is

\[
V(\vec r) = \int_0^{2\pi} \frac{k\lambda R\, d\theta}{\sqrt{z^2 + R^2}} = \frac{kq}{\sqrt{z^2 + R^2}}, \tag{2.3.24}
\]

where $q = 2\pi R\lambda$ is the total charge of the ring. The electric field is then

\[
\vec E(\vec r) = -\nabla V(\vec r) = \left( 0,\, 0,\, \frac{kqz}{(z^2 + R^2)^{3/2}} \right), \tag{2.3.25}
\]

which agrees with the previous result Eq. (2.1.8).

For a charged disk with radius $R$ and area charge density $\sigma$, the electric potential at the point $\vec r = (0, 0, z)$ is

\[
V(\vec r) = \int \frac{k\, dq}{\sqrt{z^2 + r^2}} = \int_0^R \frac{k\sigma (2\pi r)\, dr}{\sqrt{z^2 + r^2}} = 2\pi k\sigma \left( \sqrt{z^2 + R^2} - |z| \right). \tag{2.3.26}
\]

The electric field is

\[
\vec E(\vec r) = -\nabla V(\vec r) = \left( 0,\, 0,\, 2\pi k\sigma \left[ \operatorname{sgn}(z) - \frac{z}{\sqrt{z^2 + R^2}} \right] \right) = \left( 0,\, 0,\, \frac{\sigma}{2\epsilon_0} \left[ \operatorname{sgn}(z) - \frac{z}{\sqrt{z^2 + R^2}} \right] \right), \tag{2.3.27}
\]

which also agrees with the previous result Eq. (2.1.9). In the infinite plane limit $R \to \infty$, the electric field becomes

\[
\vec E(\vec r) = \left( 0,\, 0,\, \frac{\sigma}{2\epsilon_0} \operatorname{sgn}(z) \right), \tag{2.3.28}
\]

which is the standard result for the electric field of an infinite uniformly charged plane.

2.4 Capacitance, Current and Resistance

2.4.1 Capacitance

Two conductors, isolated electrically from each other and from their surroundings, form a capacitor. When the capacitor is charged, the conductors carry charges of equal magnitude but opposite signs. Whatever their geometry, flat or not, these conductors are called plates. The following figure shows a conventional arrangement, called a parallel-plate capacitor, consisting of two parallel conducting plates of area $A$ separated by a distance $d$.

In principle, capacitors can have all kinds of geometries, and we will study some special setups below. In this class, we always assume that no material medium is present in the region between the plates, although in practice the space between the plates of a capacitor is often filled with a dielectric, i.e., an insulating material such as glass or plastic.

When a capacitor is charged, its plates have charges of equal magnitude but opposite signs, $\pm q$. We refer to the charge of the capacitor as $q$, the absolute value of the charge on either plate. Because the plates are conductors, they are equipotential surfaces; all points on a plate are at the same electric potential.
Moreover, there is a potential difference between the two plates, whose absolute value is denoted by $V$. For capacitors, $q$ and $V$ are proportional to each other:

\[
q = CV,
\]

where the proportionality constant $C$ is called the capacitance of the capacitor. Its value depends only on the geometry of the plates and not on $q$ or $V$. The capacitance is a measure of how much charge must be stored on the plates to produce a certain potential difference between them: the greater the capacitance, the more charge is stored for a given potential difference. The SI unit of capacitance is the farad (F), i.e., coulomb per volt (C/V).

We now look at some examples of capacitors with special geometries and calculate their capacitance. In general, the calculation of capacitance consists of the following steps:

1. Assume a virtual charge $q$ on the plates.
2. Calculate the electric field between the two plates in terms of $q$ (by using Coulomb's law or Gauss' law).
3. Calculate the potential difference $V$ between the plates by taking the line integral of the electric field from the positive plate to the negative one.
4. Calculate $C = q/V$.

Example (Parallel-Plate Capacitor). Consider a parallel-plate capacitor, where each plate is of area $A$ and the two plates are separated by $d$. Suppose the linear size of the plates is large compared to $d$. Find its capacitance.

Solution: The electric field between the two plates is given by (2.1.11), pointing from the positive plate to the negative plate, where the charge density is $\sigma = q/A$. Then, the potential difference is

\[
V = \frac{\sigma}{\epsilon_0}\, d = \frac{qd}{\epsilon_0 A}.
\]

Hence, the capacitance is

\[
C = \frac{q}{V} = \frac{\epsilon_0 A}{d}. \tag{2.4.1}
\]

Example (Cylindrical Capacitor). Consider a cylindrical capacitor of length $L$ formed by two coaxial cylinders of inner radius $a$ and outer radius $b$. Suppose $L \gg b$ so that we can neglect the fringing of the electric field that occurs at the ends of the cylinders. Find its capacitance.

Solution: Using Gauss' law, we find that the electric field between the two plates is given by (2.3.7), pointing from the positive plate to the negative plate, where the line charge density is $\lambda = q/L$. Then, the potential difference is

\[
V = \int_a^b \frac{q}{2\pi\epsilon_0 L r}\, dr = \frac{q}{2\pi\epsilon_0 L} \log\frac{b}{a}.
\]

Hence, the capacitance is

\[
C = \frac{q}{V} = \frac{2\pi\epsilon_0 L}{\log(b/a)}.
\]

Figure 2.14: A cross-section of a long cylindrical capacitor, or a capacitor consisting of two concentric spherical shells.

Example (Spherical Capacitor). Consider a capacitor that consists of two concentric spherical shells, of inner radius $a$ and outer radius $b$. Find its capacitance.

Solution: Using Gauss' law or Proposition 2.1.1, the electric field between the two plates is

\[
E = \frac{q}{4\pi\epsilon_0 r^2},
\]

pointing from the positive plate to the negative plate. Then, the potential difference is

\[
V = \int_a^b \frac{q}{4\pi\epsilon_0 r^2}\, dr = \frac{q}{4\pi\epsilon_0} \left( \frac{1}{a} - \frac{1}{b} \right).
\]

Hence, the capacitance is

\[
C = \frac{q}{V} = \frac{4\pi\epsilon_0\, ab}{b - a}.
\]

We can assign a capacitance to a single isolated spherical conductor of radius $a$ by assuming that the other plate is a conducting sphere of infinite radius. Letting $b \to \infty$ in the above equation, we get the capacitance of a single isolated spherical conductor as

\[
C = 4\pi\epsilon_0 a.
\]

2.4.2 Capacitors in Circuits

To charge a capacitor, we place it in an electric circuit with a battery. In the following figure, a battery B, a switch S, an uncharged capacitor C, and interconnecting wires form a circuit. The battery maintains a potential difference $V$ between its terminals. When the switch is closed, electrically connecting those wires, the circuit is complete and electrons are driven through the wires by an electric field that the battery sets up in the wires. The field drives electrons from the capacitor plate h to the positive terminal of the battery, so plate h becomes positively charged. The field drives just as many electrons from the negative terminal of the battery to the capacitor plate l, so plate l becomes negatively charged just as much as plate h becomes positively charged.
Initially, when the plates are uncharged, the potential difference between them is zero. As the plates become oppositely charged, that potential difference increases until it equals the potential difference $V$ between the terminals of the battery. Then, plate h and the positive terminal of the battery are at the same potential, and there is no longer an electric field in the wire between them. Similarly, plate l and the negative terminal reach the same potential. Thus, with the field zero, there is no further drive of electrons. The capacitor is then said to be fully charged, with a potential difference $V$ and charge $q = CV$.

When there is a combination of capacitors in a circuit, we can sometimes replace that combination with an equivalent capacitor, that is, a single capacitor that has the same capacitance as the actual combination of capacitors.

The above figure shows an electric circuit in which three capacitors are connected in parallel to battery B. Here, each capacitor has the same potential difference $V$, and the total charge $q$ stored on the capacitors is the sum of the charges stored on all the capacitors. Thus, capacitors connected in parallel can be replaced with an equivalent capacitor that has the same total charge $q$ and the same potential difference $V$ as the actual capacitors. In general, consider $n$ capacitors connected in parallel, each with capacitance $C_i$, $i = 1, \ldots, n$. Then, the total charge is $q = \sum_{i=1}^n C_i V$, which gives the equivalent capacitance

\[
C_{eq} = \frac{q}{V} = \sum_{i=1}^n C_i. \tag{2.4.2}
\]

The following figure shows three capacitors connected in series to battery B. Here, the potential differences that exist across the capacitors in the series produce identical charges $q$ on them. Thus, capacitors that are connected in series can be replaced with an equivalent capacitor that has the same charge $q$ and the same total potential difference $V$ as the actual series capacitors.
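The two replacement rules just described can be checked with a few lines of code. The sketch below is only an illustration: the function names and the sample capacitances are made up for this example. The parallel case sums the capacitances as in (2.4.2); the series case applies the statement above literally, placing the same charge $q$ on every capacitor, adding up the potential differences $q/C_i$, and dividing $q$ by the total.

```python
import math


def parallel_equivalent(capacitances):
    """Same V on every capacitor, charges add: C_eq = sum of C_i, Eq. (2.4.2)."""
    return sum(capacitances)


def series_equivalent(capacitances, q=1.0):
    """Same charge q on every capacitor, potential differences add: C_eq = q / V_total."""
    v_total = sum(q / c for c in capacitances)
    return q / v_total


# Hypothetical sample values in farads
caps = [1e-6, 2e-6, 3e-6]
print(parallel_equivalent(caps))
print(series_equivalent(caps))
```

The trial charge `q` drops out of the series result, which is why its default value is arbitrary; the series combination always comes out smaller than the smallest individual capacitance.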
In general, consider $n$ capacitors connected in series, each with capacitance $C_i$, $i = 1, \ldots, n$. Then, the total potential difference is $V = \sum_{i=1}^n q/C_i$, and hence the equivalent capacitance satisfies

\[
\frac{1}{C_{eq}} = \frac{V}{q} = \sum_{i=1}^n \frac{1}{C_i}. \tag{2.4.3}
\]

2.4.3 Energy Stored in an Electric Field

Work must be done by an external battery to charge a capacitor, at the expense of its stored chemical energy. We visualize the work as being stored as electric potential energy in the electric field between the plates. We now evaluate this energy.

Suppose that, at a given instant, a charge $q'$ has been transferred from one plate of a capacitor to the other. The potential difference $V'$ between the plates at that instant is equal to $q'/C$. Then, if an extra increment of charge $dq'$ is transferred, the increment of work required will be

\[
dW = V'\, dq' = \frac{q'}{C}\, dq'.
\]

The work required to bring the total capacitor charge up to a final value $q$ is

\[
W = \int_0^q \frac{q'}{C}\, dq' = \frac{q^2}{2C}.
\]

Hence, the potential energy $U$ stored in the capacitor is given by

\[
U = \frac{q^2}{2C} = \frac{1}{2} C V^2. \tag{2.4.4}
\]

Consider a parallel-plate capacitor, where each plate is of area $A$ and the two plates are separated by $d$. Neglecting fringing, the electric field has the same value at all points between the plates. Thus, the energy density $u$, that is, the potential energy per unit volume between the plates, should also be uniform. Using (2.4.4) and (2.4.1), we obtain

\[
u = \frac{CV^2}{2Ad} = \frac{1}{2} \epsilon_0 \left( \frac{V}{d} \right)^2.
\]

Furthermore, the electric field between the plates is $E = V/d$. Hence, we get the following formula for the electric energy density:

\[
u = \frac{1}{2} \epsilon_0 E^2.
\]

Although we derived this result for the special case of the electric field of a parallel-plate capacitor, it holds for any electric field. If an electric field $\vec E$ exists at any point in space, then at each point $\vec r$ there is an electric potential energy with a density (amount per unit volume) given by

\[
u(\vec r) = \frac{1}{2} \epsilon_0 |\vec E(\vec r)|^2.
\]

2.4.4 Electric Current

Simply speaking, an electric current is a stream of moving charges. In this section, we focus on steady currents of conduction electrons moving through metallic conductors such as copper wires.

Recall that an isolated conducting loop, regardless of whether it has an excess charge, is all at the same potential. Hence, no net electric force acts on the conduction electrons and there is no current. If we insert a battery in the loop, the conducting loop is no longer at a single potential. Electric fields act inside the material making up the loop, exerting forces on the conduction electrons, causing them to move and thus establishing a current. After a very short time, the electron flow reaches a constant value and the current is in its steady state.

Take a section of a conductor, part of a conducting loop in which current has been established. If charge $dq$ passes through a cross-section of the conductor in time $dt$, then the current $i$ through that cross-section is defined as

\[
i = \frac{dq}{dt}.
\]

As discussed before, the SI unit for current is the ampere (A), an SI base unit equal to one coulomb per second (C/s).

Note that current is a scalar because both charge and time are scalars. Yet, we often represent a current with an arrow to indicate that charge is moving. More precisely, we draw the current arrows in the direction in which positively charged particles would be forced to move through the loop by the electric field. Such positive charge carriers would move away from the positive battery terminal and toward the negative terminal. However, in applications, the charge carriers are often electrons and thus are negatively charged. The electric field forces them to move in the direction opposite the current arrows, from the negative terminal to the positive terminal. The assumed motion of positive charge carriers in one direction has the same effect as the actual motion of negative charge carriers in the opposite direction.
We remark that current arrows are not vectors, so they do not obey vector addition. Current arrows show only a direction (or sense) of flow along a conductor, not a direction in space.

Due to the conservation of charge, there is also a "conservation of currents". For example, the following picture shows a conductor with current $i_0$ splitting at a junction into two branches:

Then, the magnitudes of the currents in the branches must add to yield the magnitude of the current in the original conductor, i.e., $i_0 = i_1 + i_2$. In more general circuits, we have the following Kirchhoff's junction rule (or Kirchhoff's current law).

Physics law 10 (Kirchhoff's junction rule). The sum of the currents entering any junction must be equal to the sum of the currents leaving that junction.

Sometimes, we want to take a localized view and study the flow of charge through a cross-section of the conductor at a particular point. To describe this flow, we can use the current density $\vec J$, which has the same direction as the velocity of the moving charges if they are positive and the opposite direction if they are negative. The magnitude $J$ is equal to the current per unit area through an element of the cross-section. Then, the total current through the surface is

\[
i = \int \vec J \cdot d\vec A,
\]

where $d\vec A$ is the area vector of the element. If the current is uniform across the surface and parallel to $d\vec A$, then $\vec J$ is also uniform and parallel to $d\vec A$, and we have

\[
i = JA \quad \Leftrightarrow \quad J = \frac{i}{A},
\]

where $A$ is the total area of the surface. The SI unit for current density is A/m$^2$.

2.4.5 Resistance and Resistivity

If we apply the same potential difference between the ends of different conductors, different currents result depending on their resistances. We determine the resistance between any two points of a conductor by applying a potential difference $V$ between those points and measuring the current $i$ that results. The resistance $R$ is then defined as

\[
R = \frac{V}{i}.
\]

For a resistor, its value depends only on the material and geometry of the conductor and not on $V$ or $i$. The SI unit for resistance is the ohm ($\Omega$), i.e., volt per ampere (V/A). A conductor whose function in a circuit is to provide a specified resistance is called a resistor. Given a voltage difference $V$ between the two ends of the resistor, it carries a current $i = V/R$. Conversely, a current $i$ through the resistor corresponds to a voltage difference $V = iR$.

We sometimes wish to take a localized view and focus on the electric field $\vec E$ at a point in a resistive material. Then, instead of dealing with the current $i$ through the resistor, we deal with the current density $\vec J$ at the point in question. Instead of the resistance $R$ of an object, we deal with the resistivity $\rho$ of the material:

\[
\rho = \frac{E}{J}.
\]

The SI unit of $\rho$ is (V/m)/(A/m$^2$) = (V/A) · m = $\Omega$ · m. We can rewrite the above equation in a more general vector form:

\[
\vec J = \frac{\vec E}{\rho}. \tag{2.4.5}
\]

Instead of the resistivity, people often speak of the conductivity $\sigma$ of a material, which is simply the reciprocal of its resistivity, $\sigma = \rho^{-1}$. (This $\sigma$ should not be confused with the surface charge density used earlier.) The definition of $\sigma$ allows us to write (2.4.5) in a slightly simpler form:

\[
\vec J = \sigma \vec E. \tag{2.4.6}
\]

Note that resistance is a property of an object, while resistivity is a property of a material. We now derive the relation between the resistivity of a material such as copper and the resistance of a length of wire made of that material. Let $A$ be the cross-sectional area of the wire and $L$ be its length. Apply a potential difference $V$ between its ends. Suppose the electric field and the current density are constant for all points within the wire. Then, they have values $E = V/L$ and $J = i/A$, which give

\[
\rho = \frac{E}{J} = \frac{V}{i} \frac{A}{L} = R\, \frac{A}{L} \quad \Leftrightarrow \quad R = \frac{\rho L}{A}.
\]

This relation can be applied only to a homogeneous isotropic conductor of uniform cross-section.

As discussed above, a resistor is a conductor with a specified resistance $R$.
In particular, $R$ is unchanged no matter what the magnitude and direction (polarity) of the applied potential difference are. In this case, we say that the resistor satisfies Ohm's law, which asserts that the current through a device is always proportional to the potential difference applied to it. Although, for historical reasons, the term "law" is used here, this assertion is correct only in certain situations. For example, the left device of Figure 2.15 obeys Ohm's law, while the right device of Figure 2.15, a semiconducting pn junction diode, does not. For the pn junction diode, current can exist only when the polarity of $V$ is positive and the applied potential difference is more than about 1.5 V. When current does exist, the relation between $i$ and $V$ is also not linear.

Figure 2.15: Left plot: current $i$ versus applied potential difference $V$ when the device is a resistor. Right plot: $i$ versus $V$ when the device is a semiconducting pn junction diode.

Remark. In middle school physics, it is often contended that Ohm's law states that $V = iR$. However, this is not true: this equation is just the defining equation for resistance, and it applies to all conducting devices, whether they obey Ohm's law or not. The essence of Ohm's law is that $i$ is linear with respect to $V$, that is, $R$ is independent of $V$.

When there is a steady current $i$ through a resistor $R$, the amount of charge $dq$ that moves between its ends in the time interval $dt$ is equal to $dq = i\, dt$. This charge $dq$ moves through a potential decrease of magnitude $V$, and thus its electric potential energy decreases in magnitude by

\[
dU = V\, dq = iV\, dt.
\]

The principle of conservation of energy tells us that the decrease in electric potential energy must be accompanied by a transfer of energy to some other form. The power $P$ associated with that transfer is

\[
P = \frac{dU}{dt} = iV. \tag{2.4.7}
\]

Note that the SI unit of $P$ is V · A = (J/C) · (C/s) = J/s = W, as it should be.
Using $V = iR$, we can rewrite (2.4.7) as

\[
P = i^2 R = \frac{V^2}{R},
\]

which gives the rate of electrical energy dissipation due to resistance. For a fixed current, lowering the dissipation requires the resistance to be as small as possible. In particular, when the phenomenon of superconductivity occurs, the resistivity of the material drops to zero and there is no electrical energy dissipation. The study of the mechanisms for superconductivity remains one of the central topics in modern theoretical physics, and the search for high-temperature (or even room-temperature) superconductors is arguably one of the most important challenges for physicists.

2.4.6 Electric Circuits

In this subsection, we study the physics of electric circuits, which are closed loops consisting of resistors, batteries, capacitors, and conducting wires between them (the resistances of the wires are negligible). We restrict our attention to circuits through which charge flows in one direction, called direct current (DC) circuits. Given an electric circuit, we want to know the voltage (or potential) at each point and the current in each segment of conducting wire. In principle, no matter how complicated an electric circuit is, it can always be solved by using Kirchhoff's junction rule (or Kirchhoff's current law) in Physics law 10 and the following Kirchhoff's loop rule (or Kirchhoff's voltage law).

Physics law 11 (Kirchhoff's loop rule). The algebraic sum of the changes in potential encountered in a complete traversal of any loop of a circuit must be zero.

When we apply Kirchhoff's loop rule, we start at a point, say $o$, and mentally walk clockwise or counterclockwise around the circuit until we are back at $o$, keeping track of potential changes as we move. During the walk, we deal with resistors, batteries, and capacitors in the following ways.
• For a move through a resistance $R$ in the direction of the current, the change in potential is $-iR$, while in the opposite direction, it is $+iR$.

• A battery maintains its positive terminal at a higher electric potential than the negative terminal. The potential increase from the $-$ terminal to the $+$ terminal is called the emf (i.e., electromotive force) $\mathcal{E}$ of the battery; the battery supplies the energy for the motion of the charge carriers via the work it does on them. For a move through a battery in the direction of the emf arrow (i.e., from the $-$ terminal to the $+$ terminal), the change in potential is $+\mathcal{E}$; in the opposite direction, it is $-\mathcal{E}$.

• The gap between the plates of a capacitor is an insulator, i.e., it has infinite resistance. Hence, no charge flows across the capacitor. For a move across the capacitor with a potential difference $V$, the change in potential is $+V$ from the negative plate to the positive one; in the opposite direction, it is $-V$.

In applications of Kirchhoff's loop and junction rules, we need to solve a system of linear equations involving voltages at various points and currents along some wire segments. In some cases, the circuit structure can be greatly simplified if we use equivalent resistances for resistors in series or in parallel.

Resistors in series.

The left figure shows an electric circuit in which three resistances are connected in series to an ideal battery with emf $\mathcal{E}$. All three resistances have identical currents $i$, while the sum of the potential differences across the resistances is equal to $\mathcal{E}$. Thus, resistances connected in series can be replaced with an equivalent resistance $R_{eq}$ (the right figure) that has the same current $i$ and the same total potential difference $\mathcal{E}$ as the actual resistances. In general, consider $n$ resistors connected in series, with resistances $R_k$, $k = 1, \ldots, n$. Then, by the loop rule, we have

\[
\mathcal{E} - \sum_{k=1}^n i R_k = 0 \quad \Rightarrow \quad R_{eq} = \frac{\mathcal{E}}{i} = \sum_{k=1}^n R_k. \tag{2.4.8}
\]

Resistors in parallel.
The left figure shows three resistances connected in parallel to an ideal battery with emf $\mathcal{E}$.

All three resistances have the same potential difference across them, producing a current through each. Thus, resistances connected in parallel can be replaced with an equivalent resistance $R_{eq}$ (the right figure) that has the same potential difference and the same total current as the actual resistances. In general, consider $n$ resistors connected in parallel, with resistances $R_k$, $k = 1, \ldots, n$. Then, applying the loop rule to the $n$ loops, each of which consists of the battery and one resistor, we get

\[
\mathcal{E} = i_k R_k, \quad k = 1, \ldots, n.
\]

By the junction rule, the total current through the equivalent resistance is

\[
i = \sum_{k=1}^n i_k.
\]

From the above two equations, we can derive the equivalent resistance as

\[
R_{eq} = \frac{\mathcal{E}}{i} \quad \Rightarrow \quad \frac{1}{R_{eq}} = \sum_{k=1}^n \frac{1}{R_k}. \tag{2.4.9}
\]

You can compare (2.4.8) and (2.4.9) for resistance with (2.4.2) and (2.4.3) for capacitance: the roles of series and parallel connections are interchanged.

2.4.7 RC Circuits

In this subsection, we consider the following RC series circuit consisting of a capacitor $C$, an ideal battery of emf $\mathcal{E}$, and a resistance $R$:

When switch S is closed on $a$, the capacitor is charged through the resistor. When the switch is afterward closed on $b$, the capacitor discharges through the resistor.

We know that as soon as the circuit is complete, charge begins to flow (current exists) between a capacitor plate and a battery terminal on each side of the capacitor until the potential difference between the two capacitor plates equals the emf $\mathcal{E}$ of the battery. Then, we say that the system reaches equilibrium, and the equilibrium charge on the fully charged capacitor is equal to $C\mathcal{E}$. Here, we are interested in the charging process, which is a dynamic process. In particular, we want to know how the charge $q(t)$ on the capacitor plates, the potential difference $V(t)$ across the capacitor, and the current $i(t)$ in the circuit vary with time during the charging process. Using the loop rule, we find that

\[
\mathcal{E} - i(t) R - \frac{q(t)}{C} = 0.
\]

Since $i = dq/dt$, we can rewrite the above equation as

\[
R\, \frac{dq}{dt} + \frac{q}{C} = \mathcal{E},
\]

with initial condition $q(0) = 0$. This is a first-order linear differential equation, whose solution is

\[
q(t) = C\mathcal{E} \left( 1 - e^{-t/RC} \right).
\]

Here, the product $RC$ is called the capacitive time constant of the circuit. From $q(t)$, we immediately obtain

\[
i(t) = \frac{dq(t)}{dt} = \frac{\mathcal{E}}{R}\, e^{-t/RC}, \qquad V(t) = \frac{q(t)}{C} = \mathcal{E} \left( 1 - e^{-t/RC} \right).
\]

As expected, as $t \to \infty$, $V(t) \to \mathcal{E}$, $i(t) \to 0$, and $q(t) \to C\mathcal{E}$.

Assume now that the capacitor has been fully charged. At a new time $t = 0$, switch S is thrown from $a$ to $b$ so that the capacitor can discharge through the resistance $R$. How does the discharging process behave? As above, we obtain the following differential equation for $q(t)$:

\[
R\, \frac{dq}{dt} + \frac{q}{C} = 0,
\]

with initial condition $q(0) = q_0 = C\mathcal{E}$. The solution to this equation is

\[
q(t) = q_0\, e^{-t/RC}.
\]

Hence, $q$ decreases exponentially to zero. The current is then given by

\[
i(t) = \frac{dq(t)}{dt} = -\frac{q_0}{RC}\, e^{-t/RC},
\]

i.e., the magnitude of the current also decreases exponentially to zero. The negative sign indicates that the discharge current flows opposite to the charging current.

2.5 Magnetic Fields, Magnetic Fields Due to Currents

Magnetism is a manifestation of the electromagnetic force, one of the fundamental forces of nature, and is observed in certain materials, notably iron, cobalt, and nickel, as well as some alloys and compounds. Magnets produce a magnetic field, which is a region of space in which magnetic forces are exerted on other magnets, magnetic materials, or moving charged particles.

Magnetism plays a critical role in both our natural world and our daily lives. It is responsible for the Earth's magnetic field, which protects our planet from the harmful effects of solar wind and cosmic rays. In addition, magnetism has a profound impact on our technology, from the small magnets used in hard drives and credit card strips to the giant magnets used in medical imaging machines and particle accelerators. Understanding magnetism is therefore essential for both advancing our understanding of the natural world and developing new technologies that improve our lives.
2.5.1 Magnetic fields

An important concept in magnetism is the source of the magnetic field. While electric charges are the source of electric fields, magnetic charges, or magnetic monopoles, have not been experimentally observed. Instead, a magnet with both north and south poles can produce a magnetic field. When two magnets are brought close to each other, the north pole of one magnet attracts the south pole of the other, while the two north poles repel each other. However, it is not possible to isolate a single north pole or south pole of a magnet: cutting a magnet in two produces two smaller magnets, each with both poles.

It is crucial to understand that a magnetic field can also be generated by the motion of electric charges, as observed in electric currents. This principle gave rise to the development of electromagnets, which will be covered in detail in subsequent sections. When an electric current flows through a wire coil, a magnetic field is produced in the surrounding space. Moving electric charges are thus a source of magnetic fields, in addition to permanent magnets.

Definition of magnetic field

The electric field is defined as the Coulomb force acting on a test charge divided by the value of its charge, $\vec E = \vec F_E / q$. The concept can be extended to magnetic fields by defining them through the magnetic force. Magnetic fields do not affect static electric charges, but they do exert a force on moving charges, known as the Lorentz force. This force is perpendicular both to the velocity of the charged particle and to the magnetic field it is moving through. The magnetic force can be described using the cross product of the charge's velocity and the magnetic field:

\[
\vec F_B = q\, \vec v \times \vec B, \tag{2.5.1}
\]

where $q$ is the electric charge of the charged particle and $\vec v$ is its velocity.
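To make the cross product in (2.5.1) concrete, here is a minimal sketch that evaluates the magnetic force componentwise. The helper names and the sample values (a proton-like charge moving along $+x$ through a field along $+z$) are assumptions chosen only for illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def magnetic_force(q, v, B):
    """Magnetic force F_B = q v x B, Eq. (2.5.1); SI units assumed."""
    return tuple(q * c for c in cross(v, B))


# Hypothetical sample values: elementary charge, v = 2e5 m/s along +x, B = 0.5 T along +z
q = 1.602176634e-19
v = (2.0e5, 0.0, 0.0)
B = (0.0, 0.0, 0.5)
F = magnetic_force(q, v, B)
print(F)             # force points along -y for a positive charge
print(dot(F, v))     # zero: the force is perpendicular to the velocity
print(dot(F, B))     # zero: the force is perpendicular to the field
```

For this configuration the force lies along $-y$ with magnitude $|q| v B$, and both dot products vanish, in line with the statement that the force is perpendicular to the velocity and to the field.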
Equation (2.5.1) allows us to define the magnetic field as the vector that satisfies it for every velocity of the test charge. The magnetic field can therefore be thought of as a quantity that measures the force exerted on a moving charge.

By definition of the cross product, the magnitude of the Lorentz force is given by

\[
|\vec F_B| = |q|\, |\vec v|\, |\vec B| \sin\theta, \tag{2.5.2}
\]

where $\theta$ is the angle between the velocity and the magnetic field. As the magnetic force is always perpendicular to the velocity of the particle, it does not do any work on the particle. It only changes the direction of the velocity and does not alter its magnitude. This means that a magnetic field can only change the trajectory of a charged particle and not its speed.

The unit of the magnetic field is the tesla (T), defined as the magnetic field that exerts a force of one newton on a charge of one coulomb moving perpendicularly to the magnetic field with a velocity of one meter per second:

\[
1\ \mathrm{T} = 1\ \frac{\mathrm{N \cdot s}}{\mathrm{C \cdot m}}.
\]

One tesla is equal to $10^4$ gauss (G), which is commonly used as another unit of magnetic field strength. For example, the Earth's magnetic field has a strength of about 0.5 gauss. The choice of unit depends on the context and magnitude of the magnetic field being measured. In everyday applications, the millitesla (mT) is often used, while in scientific research and high-tech industries, the tesla is the standard unit.

Magnetic field line

Magnetic field lines are a useful tool for visualizing the direction and strength of a magnetic field. The field lines are drawn in such a way that the tangent to the line at any point gives the direction of the magnetic field at that point. The strength of the field is proportional to the density of the field lines. Magnetic field lines always form closed loops: outside a magnet they run from the north pole to the south pole, and inside the magnet they continue from the south pole back to the north pole. In addition, magnetic field lines never intersect, since the field has a unique direction at every point.
The closer the field lines are, the stronger the magnetic field is at that point.

Figure 2.16

Michael Faraday, a renowned English physicist and chemist, revolutionized our understanding of magnetic fields in the mid-19th century by introducing the concept of magnetic field lines. Through a series of groundbreaking experiments on electromagnetism, he discovered that magnetic fields possess direction and shape, leading him to propose the idea of visualizing magnetic fields with lines. Faraday's seminal work formed the basis of modern electromagnetic theory and transformed the study of magnetic fields from a vague notion to a concrete and measurable phenomenon. Today, fields are a fundamental concept in quantum field theory and are considered real physical degrees of freedom rather than merely conceptual lines.

The absence of magnetic monopoles is a fundamental property of magnetic fields, which is visually represented by closed magnetic field lines that have no endpoints, sources, or sinks. Unlike electric fields, where positive and negative charges act as sources and sinks of electric field lines, the magnetic field is divergenceless, meaning that its divergence is always zero. Using vector calculus, this is expressed as $\nabla \cdot \vec B = 0$, in contrast to the electric field, for which $\nabla \cdot \vec E = \rho/\epsilon_0$.

2.5.2 Magnetic forces

When a charged particle is in motion and exposed to both electric and magnetic fields, the net force acting on it is given by

\[
\vec F = q\vec E + q\, \vec v \times \vec B. \tag{2.5.3}
\]

This is the fundamental formula for the force exerted by electric and magnetic fields on a charged particle.

The Hall effect

Consider the field configuration in which the electric and magnetic fields are perpendicular to each other, and the charged particle is initially moving with a velocity perpendicular to both fields.
In this scenario, the total force acting on the particle can vanish, in which case the particle continues moving with constant velocity (see Figure 2.17). The condition for this equilibrium is

$$q\vec E + q\vec v\times\vec B = 0, \tag{2.5.4}$$

from which we find that the speed of the particle must be

$$v = \frac{E}{B}, \tag{2.5.5}$$

where $E$ and $B$ are the magnitudes of the electric and magnetic fields, respectively.

The Hall effect is a phenomenon that occurs when two conductor plates are exposed to a magnetic field, as shown in Figure 2.17. Electrons move between the plates with a drift velocity $v_d$, initially in the absence of an electric field. The magnetic force deflects them toward one plate, so as time goes on, one plate accumulates more electrons than the other, as illustrated in Figure 2.17(a). This results in a potential difference between the two plates and an induced electric field. Using the equilibrium condition derived above, we can determine the electric potential difference between the two plates as

$$V = Ed = v_d B d, \tag{2.5.6}$$

where $E$ is the electric field induced by the accumulated electrons, $v_d$ is the drift speed of the electrons, and $d$ is the distance between the two plates. Meanwhile, the current is given by $i = JA = n e v_d A$, where $n$ is the electron density and $A$ is the cross-sectional area of the electron current. Therefore, the magnetic field can be expressed as

$$B = \frac{V}{v_d d} = \frac{neVA}{id}. \tag{2.5.7}$$

If all the physical quantities on the right-hand side are known, we can use this Hall-effect formula to measure the magnetic field.

Figure 2.17: (a) A moving electron in a magnetic field. (b) A moving electron in electric and magnetic fields that are perpendicular to each other.

Circulating charged particle in uniform magnetic field

When a charged particle moves in a uniform magnetic field, it experiences a force perpendicular to both the direction of motion and the direction of the magnetic field. This force causes the particle to move in a circular or helical path.
First, let’s consider a charged particle moving in a uniform magnetic field $\vec B$ with an initial velocity $\vec v$ perpendicular to the magnetic field. In this case, the magnetic force $\vec F_B = q\vec v\times\vec B$ acting on the particle is perpendicular to both $\vec v$ and $\vec B$, so the particle moves in a circular path whose radius $r$ is fixed by requiring the magnetic force to supply the centripetal force:

$$F_B = |q| v B = \frac{m v^2}{r}, \tag{2.5.8}$$

where $m$ is the mass of the particle. The radius of the circle is given by

$$r = \frac{mv}{|q|B}. \tag{2.5.9}$$

This equation tells us that the radius of the circle is proportional to the momentum $mv$ of the particle and inversely proportional to its charge and to the strength of the magnetic field. The period of the motion, which is the time it takes for the particle to complete one full circle, is given by

$$T = \frac{2\pi r}{v} = \frac{2\pi m}{|q|B}. \tag{2.5.10}$$

This equation tells us that the period of the motion is independent of the velocity of the particle and depends only on its mass, charge, and the strength of the magnetic field.

Figure 2.18

In the more general case, when the initial velocity of a charged particle is not perpendicular to a uniform magnetic field, the particle moves in a helical path around the direction of the field vector. To understand this motion, we can resolve the velocity vector $\vec v$ into two components, one parallel and one perpendicular to the magnetic field $\vec B$. Let $\phi$ denote the angle between $\vec v$ and $\vec B$. The parallel component $v_\parallel = v\cos\phi$ determines the pitch $p$ of the helix, which is the distance between adjacent turns, while the perpendicular component $v_\perp = v\sin\phi$ determines the radius of the helix. The radius of the circular path can be calculated using the perpendicular velocity component and Eq. (2.5.9):

$$r = \frac{m v_\perp}{|q|B}. \tag{2.5.11}$$

The pitch of the helix can be calculated using the parallel velocity component and the period $T$ of the motion:

$$p = v_\parallel T = \frac{2\pi m v_\parallel}{|q|B}. \tag{2.5.12}$$

It is worth noting that the pitch of the helix is independent of the perpendicular velocity $v_\perp$, since the period $T$ depends only on the mass, charge, and strength of the magnetic field.
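As a quick numerical illustration of Eqs. (2.5.10)-(2.5.12), the following Python sketch computes the radius, period, and pitch of a helical orbit. The function name `helix_parameters` and the sample values (a proton with speed $10^6$ m/s at 60° to a 0.5 T field) are assumptions chosen for illustration; they do not come from the notes.

```python
import math

M_PROTON = 1.67262192e-27   # proton mass in kg (illustrative particle)
E_CHARGE = 1.602176634e-19  # elementary charge in C


def helix_parameters(m, q, v, B, phi):
    """Return (radius, period, pitch) for a charge in a uniform field.

    m: mass (kg), q: charge (C), v: speed (m/s),
    B: field magnitude (T), phi: angle between velocity and field (rad).
    """
    v_perp = v * math.sin(phi)            # sets the radius, Eq. (2.5.11)
    v_par = v * math.cos(phi)             # sets the pitch, Eq. (2.5.12)
    r = m * v_perp / (abs(q) * B)
    T = 2 * math.pi * m / (abs(q) * B)    # Eq. (2.5.10), no dependence on v
    p = v_par * T
    return r, T, p


# Assumed sample values, not taken from the notes.
r, T, p = helix_parameters(M_PROTON, E_CHARGE, 1.0e6, 0.5, math.radians(60))
print(f"r = {r:.3e} m, T = {T:.3e} s, p = {p:.3e} m")

# Same particle and field at twice the speed.
r2, T2, p2 = helix_parameters(M_PROTON, E_CHARGE, 2.0e6, 0.5, math.radians(60))
print(f"r2/r = {r2 / r:.3f}, T2/T = {T2 / T:.3f}, p2/p = {p2 / p:.3f}")
```

Doubling the speed doubles both the radius and the pitch while leaving the period unchanged, in line with the remark that $T$ does not depend on the velocity.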
In general, a charged particle moving in a magnetic field experiences a force that causes it to spiral along the direction of the magnetic field lines. This phenomenon is crucial for understanding the protective nature of the Earth’s magnetic field, which acts as an invisible shield that deflects charged particles, especially those present in the solar wind. The solar wind consists primarily of electrons and protons that approach the Earth. When these charged particles encounter the Earth’s magnetic field, they experience a magnetic force that alters their trajectory. Instead of moving straight towards the Earth’s surface, they spiral along the field lines, which redirects them towards the Earth’s north or south magnetic poles.

Magnetic force acting on a wire

Let us now consider a straight wire of length $L$ carrying a current $i$, placed in a uniform magnetic field $\vec B$. We want to understand the magnetic force acting on this wire.

Figure 2.19

Suppose the current $i$ is carried by charged particles moving with drift velocity $\vec v_d$. The carriers need a time $t = L/v_d$ to traverse the wire, so the total moving charge in the wire of length $L$ is $q = it = iL/v_d$. The magnetic force acting on this wire is given by

$$\vec F = q\vec v_d\times\vec B = \frac{iL}{v_d}\,\vec v_d\times\vec B = i\vec L\times\vec B. \tag{2.5.13}$$

Here, $\vec L$ is the vector of length $L$ pointing along the direction of the current (that is, along the drift velocity of positive carriers). From this expression, we see that the magnetic force acting on the current-carrying wire is perpendicular to both the direction of the current and the direction of the magnetic field.

Torque on a current loop

Figure 2.20

Figure 2.20 depicts a simple motor that consists of a single rectangular current-carrying loop placed in a uniform magnetic field $\vec B$. Two opposite edges of the loop experience magnetic forces $\vec F$ and $-\vec F$, which are equal in magnitude and opposite in direction. As a result, the loop experiences a torque that tends to rotate it about its central axis.
Since two of the edges are always perpendicular to the direction of the magnetic field, the force acting on each of them has magnitude

$$F = ibB, \tag{2.5.14}$$

where $b$ is the length of the edge. The torque of these forces with respect to the central axis is then

$$\tau = F a\sin\theta = iBab\sin\theta, \tag{2.5.15}$$

where $a$ is the length of the other edge of the current loop, and $\theta$ is the angle between the normal direction $\hat n$ of the loop plane and the direction of the magnetic field. (Each of the two forces acts with lever arm $(a/2)\sin\theta$, which gives the factor $a\sin\theta$ in total.)

Suppose we replace the single loop of current with a coil of $N$ tightly wound loops. The torque on the coil due to the magnetic field is given by

$$\tau = N\tau_{\text{single-loop}} = NiBA\sin\theta, \tag{2.5.16}$$

where $A = ab$ is the area enclosed by each loop. The torque on the coil is proportional to the number of turns $N$, the current $i$, and the enclosed area $A$, so increasing any of these parameters increases the torque of the simple motor.

Magnetic dipole moment

The magnetic dipole moment is a fundamental concept in magnetism and plays a crucial role in understanding the behavior of current-carrying coils in magnetic fields. Similar to an electric dipole consisting of positive and negative charges separated by a distance, a magnetic dipole arises from a current loop or coil. The magnitude of the magnetic dipole moment of a coil is given by

$$|\vec\mu| = NiA. \tag{2.5.17}$$

Here, $N$ is the number of turns in the coil, $i$ is the current flowing through the coil, and $A$ is the area enclosed by each turn. The magnetic dipole moment is a measure of the strength of the dipole and is typically expressed in units of ampere-square meters ($\mathrm{A\cdot m^2}$). The direction of the magnetic dipole moment vector $\vec\mu$ is along the normal vector $\hat n$ to the plane of the coil, with the sense given by the right-hand rule: curl the fingers of your right hand in the direction of the current flow in the coil, and the extended thumb points in the direction of $\vec\mu$.
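To make Eqs. (2.5.14)-(2.5.17) concrete, here is a small Python sketch that evaluates the dipole moment and torque of a rectangular coil and cross-checks the torque against the force-based expression $\tau = N F a \sin\theta$. The helper names and the coil parameters (100 turns, 0.5 A, a 4 cm by 5 cm loop, 0.2 T) are assumed for illustration only.

```python
import math


def dipole_moment(N, i, A):
    """Magnitude of the magnetic dipole moment, Eq. (2.5.17)."""
    return N * i * A


def coil_torque(N, i, A, B, theta):
    """Torque on an N-turn coil, Eq. (2.5.16); theta is the angle between n-hat and B."""
    return dipole_moment(N, i, A) * B * math.sin(theta)


# Assumed coil parameters (SI units), not taken from the notes.
N, i, a, b, B = 100, 0.5, 0.04, 0.05, 0.2
theta = math.radians(30)

mu = dipole_moment(N, i, a * b)
tau = coil_torque(N, i, a * b, B, theta)

# Cross-check with the force picture: F = i b B on each edge of length b,
# total torque per turn F * a * sin(theta), Eq. (2.5.15).
F_edge = i * b * B
tau_from_forces = N * F_edge * a * math.sin(theta)

tau_max = coil_torque(N, i, a * b, B, math.pi / 2)   # normal perpendicular to B
tau_aligned = coil_torque(N, i, a * b, B, 0.0)       # normal parallel to B

print(f"mu = {mu:.4f} A m^2, tau(30 deg) = {tau:.5f} N m")
print(f"tau_max = {tau_max:.5f} N m, tau_aligned = {tau_aligned:.5f} N m")
```

The torque is largest when the loop normal is perpendicular to the field and vanishes when the normal is aligned with it.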
With the magnetic dipole moment defined, we can now apply it to understand the torque experienced by a current-carrying coil in a magnetic field. The torque on the coil is given by the cross product of the magnetic dipole moment vector and the magnetic field vector, whose magnitude reproduces Eq. (2.5.16):

$$\vec\tau = \vec\mu\times\vec B. \tag{2.5.18}$$

This expression shows that the torque on the coil is perpendicular to both the magnetic dipole moment vector and the magnetic field vector. It is similar to the torque experienced by an electric dipole in an electric field,

$$\vec\tau = \vec p\times\vec E, \tag{2.5.19}$$

where $\vec p = q(\vec r_+ - \vec r_-)$ is the electric dipole moment.

A magnetic dipole placed in an external magnetic field possesses an energy that depends on the orientation of the dipole with respect to the field. In analogy with electric dipoles, for which the energy is given by

$$U = -\vec p\cdot\vec E, \tag{2.5.20}$$

we can write a similar expression for magnetic dipoles:

$$U = -\vec\mu\cdot\vec B. \tag{2.5.21}$$

In both cases the energy depends on the alignment of the dipole moment with the corresponding field, and it is lowest when the two are parallel.

Coils are not the only objects with magnetic dipole moments: fundamental particles such as the electron also have intrinsic magnetic dipole moments. This moment is associated with the intrinsic spin of the electron, which can be loosely pictured as a minute charged particle spinning on its axis. As a result of this spin, the electron has a magnetic moment akin to that of a miniature bar magnet, complete with north and south poles. Because the electron’s charge is negative, its magnetic dipole moment points opposite to its spin angular momentum vector. When subjected to an external magnetic field, the electron’s magnetic dipole moment interacts with the field, leading to a torque and energy associated with the interaction.
This interaction can give rise to various magnetic effects, including precession and alignment of the electron’s spin with the applied field. Accurate determination of the electron’s magnetic dipole moment has been the focus of extensive experimental and theoretical investigations, playing a significant role in advancing our comprehension of quantum field theories and the properties of matter at the microscopic level.

2.5.3 Magnetic fields from currents

A magnetic field not only exerts a force on moving charged particles but is also produced by moving charged particles or currents. This reciprocal relationship between magnetic fields and moving charges is a fundamental aspect of electromagnetism.

Biot-Savart law

The Biot-Savart law summarizes the experimental findings regarding the magnetic field produced at a point $P$, located at a displacement $\vec r$ from a current-carrying length element $d\vec s$. This law states that the magnitude of the magnetic field $d\vec B$ is equal to $\frac{\mu_0}{4\pi}$ multiplied by the product of the current $i$, the length $ds$, and $\sin\theta$, and is inversely proportional to the square of the distance $|\vec r|$. Here, $\theta$ is the angle between the directions of $d\vec s$ and the unit vector $\hat r$, which points from $d\vec s$ towards $P$. The constant $\mu_0$, known as the permeability constant, has the value $\mu_0 \approx 4\pi\times 10^{-7}\,\mathrm{T\cdot m/A} \approx 1.26\times 10^{-6}\,\mathrm{T\cdot m/A}$.

Figure 2.21

The direction of the magnetic field $d\vec B$, as illustrated in Figure 2.21, is perpendicular to both $d\vec s$ and $\hat r$ and can be determined using the cross product $d\vec s\times\hat r$. Consequently, the vector form of the Biot-Savart law is

$$d\vec B = \frac{\mu_0}{4\pi}\,\frac{i\,d\vec s\times\hat r}{|\vec r|^2}. \tag{2.5.22}$$

The magnitude of the field produced by a current element is proportional to the current and inversely proportional to the square of the distance from the element. This is similar to the electric field generated by a charged particle, which is proportional to the electric charge and inversely proportional to the square of the distance.
We can now use the Biot-Savart law to derive the magnetic field at a perpendicular distance $R$ from a long (infinite) straight wire carrying a current $i$. To find the total magnetic field due to the entire wire, we integrate the contributions from all infinitesimal current elements along the wire:

$$\begin{aligned}
\vec B = \int d\vec B &= (\hat\imath\times\hat r_0)\int_{-\infty}^{\infty}\frac{\mu_0}{4\pi}\,\frac{i\sin\theta\,ds}{r^2} = (\hat\imath\times\hat r_0)\int_{-\infty}^{\infty}\frac{\mu_0}{4\pi}\,\frac{i\,(R/\sqrt{s^2+R^2})\,ds}{s^2+R^2}\\
&= \frac{\mu_0 iR}{4\pi}\,(\hat\imath\times\hat r_0)\left[\frac{s}{R^2\sqrt{s^2+R^2}}\right]_{s=-\infty}^{\infty} = \frac{\mu_0 i}{2\pi R}\,(\hat\imath\times\hat r_0).
\end{aligned}\tag{2.5.23}$$

Here, $\hat\imath$ is the unit vector along the direction of the current, $\hat r_0$ is the unit vector pointing from the wire to the point, $s$ is the position of the current element along the wire, $r = \sqrt{s^2+R^2}$ is its distance from the point, and $\theta$ is the angle between $\hat\imath$ (or $d\vec s$) and $\vec r$. The magnetic field lines produced by the wire form circles around it, and the field magnitude is inversely proportional to the distance $R$. This can be observed in Figure 2.22, which shows iron filings aligned with the magnetic field lines.

Figure 2.22

As an additional example, let’s examine the magnetic field generated by a current in a circular arc of wire. Consider an arc-shaped wire with a central angle $\phi$, radius $R$, and center $C$, carrying a current $i$. The magnetic fields at the center $C$ due to all differential segments point in the same direction, and each segment is perpendicular to the line joining it to $C$, so $\sin\theta = 1$. Therefore, we can work with the magnitude of the magnetic field and use the scalar integral form of the Biot-Savart law with $ds = R\,d\phi'$:

$$B = \int dB = \int_0^{\phi}\frac{\mu_0}{4\pi}\,\frac{iR\,d\phi'}{R^2} = \frac{\mu_0 i\phi}{4\pi R}. \tag{2.5.24}$$

If the arc becomes a complete circle, the central angle $\phi$ is replaced by $2\pi$, and the magnetic field at the center of a circular wire becomes

$$B = \frac{\mu_0 i}{2R}. \tag{2.5.25}$$

Its direction is perpendicular to the plane of the circle and is given by the right-hand rule.

Force between two parallel currents

Moving charged particles generate a magnetic field, and magnetic fields in turn exert forces on moving charged particles.
This leads us to the intriguing question of what happens when two currents are present. Consider the simplest scenario, in which two currents flow through infinitely long wires that are parallel to each other. We want to determine the force between these parallel currents.

To calculate the force between two parallel currents, we can apply the Biot-Savart law to determine the magnetic field created by one current and then use the Lorentz force law to find the force experienced by the other current. For two parallel long wires carrying currents $i_1$ and $i_2$, separated by a distance $d$, the magnetic field created by the first wire at the location of the second wire is given by

$$B = \frac{\mu_0 i_1}{2\pi d}. \tag{2.5.26}$$

The force experienced by a length $L_2$ of the second wire is obtained by multiplying the magnetic field $B$ by the length $L_2$ and the current $i_2$:

$$F = i_2 L_2 B = \frac{\mu_0 i_1 i_2 L_2}{2\pi d}. \tag{2.5.27}$$

A simple analysis of the direction of the force reveals that parallel currents attract each other, while antiparallel currents repel each other.

The force between two parallel currents is not only a fundamental concept in electromagnetism; it also played a crucial role in defining the unit of electric current, the ampere. Prior to 2019, the ampere was defined as the constant current that, when maintained in two parallel conductors of infinite length placed one meter apart in a vacuum, would produce a force of $2\times 10^{-7}$ newtons per meter of length between them. This effectively fixed the value of $\mu_0$ to be $4\pi\times 10^{-7}\,\mathrm{T\cdot m/A}$. With the redefinition of the SI base units in 2019, the ampere is now defined in terms of the elementary charge $e$. The elementary charge has a fixed numerical value of $1.602176634\times 10^{-19}$ coulombs (C), where 1 ampere (A) is equal to 1 coulomb per second (C/s). The second, in turn, is defined in terms of the unperturbed ground-state hyperfine transition frequency of the caesium-133 atom.
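The results of this subsection can be spot-checked numerically. The Python sketch below sums the Biot-Savart contributions $dB = \frac{\mu_0}{4\pi}\, i \sin\theta\, ds / r^2$ along a long but finite straight wire, compares the sum with the closed form $\mu_0 i/(2\pi R)$, and then evaluates the force per unit length between two parallel 1 A currents one meter apart. The function names, the finite half-length of 1000 m standing in for an infinite wire, and the step count are assumptions made for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # T m/A, the value fixed by the pre-2019 ampere definition


def wire_field_numeric(i, R, half_length, n_steps=200_000):
    """Midpoint-rule sum of the Biot-Savart integrand along a straight wire."""
    ds = 2 * half_length / n_steps
    total = 0.0
    for k in range(n_steps):
        s = -half_length + (k + 0.5) * ds   # position of the current element
        r2 = s * s + R * R                  # squared distance to the field point
        sin_theta = R / math.sqrt(r2)
        total += MU0 * i * sin_theta * ds / (4 * math.pi * r2)
    return total


def wire_field_exact(i, R):
    """Field of an infinite straight wire at perpendicular distance R."""
    return MU0 * i / (2 * math.pi * R)


def force_per_length(i1, i2, d):
    """Force per unit length between two long parallel wires, from Eq. (2.5.27)."""
    return MU0 * i1 * i2 / (2 * math.pi * d)


# Assumed sample values: 1 A current, field point 1 m from the wire.
B_numeric = wire_field_numeric(1.0, 1.0, half_length=1000.0)
B_exact = wire_field_exact(1.0, 1.0)
f_ampere = force_per_length(1.0, 1.0, 1.0)

print(f"numeric B = {B_numeric:.6e} T, closed form B = {B_exact:.6e} T")
print(f"force per meter for 1 A, 1 A, 1 m: {f_ampere:.3e} N/m")
```

The finite wire slightly underestimates the infinite-wire field, and the discrepancy shrinks as the half-length grows relative to $R$.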
2.5.4 Ampere’s law

When determining the net electric field resulting from a charge distribution, we sum the differential electric fields contributed by each element. In cases where the distribution exhibits planar, cylindrical, or spherical symmetry, Gauss’ law provides a more efficient method to calculate the total electric field. Similarly, when calculating the net magnetic field caused by a distribution of currents, we sum the contributions of the differential magnetic fields using the Biot-Savart law. If the distribution possesses symmetry, Ampere’s law can be used for a simpler analysis. Ampere’s law, which can be derived from the Biot-Savart law, is commonly associated with Ampere, but it was actually derived and generalized by James Clerk Maxwell. Ampere’s law can be expressed as

$$\oint\vec B\cdot d\vec s = \mu_0 i_{\text{enc}}. \tag{2.5.28}$$

In this equation, the integral is taken around a closed loop, known as an Amperian loop, and the dot product picks out the component of the magnetic field along the loop, which is integrated over the differential path length. The quantity $i_{\text{enc}}$ denotes the net current enclosed by the loop.

Symmetry transformation of magnetic field

To explore the symmetry transformation of the magnetic field, let’s first examine the cross product under O(3) transformations. It can be shown that

$$(R\vec A)\times(R\vec B) = \det(R)\cdot R(\vec A\times\vec B) \tag{2.5.29}$$

for $R\in O(3)$. Indeed, the cross product $\vec A\times\vec B$ is uniquely characterized by three properties: it is orthogonal to both factors, its length equals the area of the parallelogram they span, and it forms a right-handed triple with them. These properties are preserved by rotations in SO(3), so the cross product of two rotated vectors is the rotated cross product. However, if $R\in O(3)$ has $\det R = -1$, the right-hand rule turns into a left-hand rule, which produces the minus sign in the equation above.

Now, let’s work out the transformation rule for the magnetic field due to a current density that is invariant under a transformation $R\in O(3)$.
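Identity (2.5.29), on which the following derivation relies, can be spot-checked numerically. The Python sketch below evaluates both sides for one proper rotation (determinant $+1$) and one reflection (determinant $-1$). The helper names, the test vectors, and the particular matrices are assumptions for illustration; a numerical check of two cases is not a proof.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def matvec(M, v):
    """Apply a 3x3 matrix (list of rows) to a 3-vector."""
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def identity_residual(R, A, B):
    """Largest component of (RA)x(RB) - det(R) R(AxB); zero if Eq. (2.5.29) holds."""
    lhs = cross(matvec(R, A), matvec(R, B))
    rhs = [det3(R) * x for x in matvec(R, cross(A, B))]
    return max(abs(l - r) for l, r in zip(lhs, rhs))


t = 0.7  # assumed rotation angle in radians
rotation = [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]          # rotation about z, det = +1
mirror = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, -1.0]]           # reflection z -> -z, det = -1

A = [1.0, 2.0, 3.0]
B = [-0.5, 0.4, 2.0]

res_rotation = identity_residual(rotation, A, B)
res_mirror = identity_residual(mirror, A, B)
print(f"residual (rotation) = {res_rotation:.2e}, residual (mirror) = {res_mirror:.2e}")
```

For the reflection, dropping the factor $\det(R)$ would give a nonzero residual, which is the numerical signature of the pseudovector behaviour of the cross product.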
Starting from the Biot-Savart law, we have

$$\vec B(\vec r) = \int d\vec B = \int\frac{\mu_0}{4\pi}\,\frac{i(\vec s)\,d\vec s\times(\vec r-\vec s)}{|\vec r-\vec s|^3}. \tag{2.5.30}$$

The magnetic field at position $\vec r\,' = R\vec r$ is then given by

$$\begin{aligned}
\vec B(\vec r\,') = \vec B(R\vec r) &= \int\frac{\mu_0}{4\pi}\,\frac{i(\vec s)\,d\vec s\times(R\vec r-\vec s)}{|R\vec r-\vec s|^3} = \int\frac{\mu_0}{4\pi}\,\frac{i(R^{-1}\vec s)\,d(R^{-1}\vec s)\times R(\vec r-R^{-1}\vec s)}{|\vec r-R^{-1}\vec s|^3}\\
&= \int\frac{\mu_0}{4\pi}\,\frac{i(\vec s)\,d\vec s\times R(\vec r-\vec s)}{|\vec r-\vec s|^3} = \int\frac{\mu_0}{4\pi}\,\frac{R[i(\vec s)\,d\vec s]\times R(\vec r-\vec s)}{|\vec r-\vec s|^3}\\
&= \det(R)\cdot R\int\frac{\mu_0}{4\pi}\,\frac{i(\vec s)\,d\vec s\times(\vec r-\vec s)}{|\vec r-\vec s|^3} = \det(R)\cdot R\vec B(\vec r).
\end{aligned}\tag{2.5.31}$$

Here, we used the symmetry property of the current, $R(i\,d\vec s) = i\,d\vec s$, together with Eq. (2.5.29) in the last line. From this transformation rule, we observe that the magnetic field transforms like an ordinary vector if $R\in SO(3)$ and acquires an additional minus sign for $R\in O(3)$ with $\det(R) = -1$.

A physical quantity is classified as a vector if it follows the transformation rule

$$\vec v\,' = R\vec v \tag{2.5.32}$$

under a transformation $R\in O(3)$. On the other hand, a quantity is considered a pseudovector (or axial vector) if its transformation rule is

$$\vec v\,' = \det(R)\cdot R\vec v. \tag{2.5.33}$$

Performing the usual linear operations on vectors or pseudovectors does not change their nature. For example, the position vector $\vec r$ is a vector, and its derivative $\frac{d\vec r}{dt}$ (the velocity) is also a vector. Similarly, the acceleration is a vector. As demonstrated earlier, the electric field is a vector. However, the cross product of two vectors results in a pseudovector. This explains why the magnetic field is a pseudovector: it is fundamentally related to $d\vec s\times\vec r$, where $d\vec s$ can be viewed as the direction of the drift velocity of charge carriers in a wire, which is a vector. Similarly, the angular momentum $\vec L = \vec r\times\vec p$ and the torque $\vec\tau = \vec r\times\vec F$ are pseudovectors.

Combining Ampere’s law with symmetry

When there are symmetries in the current density or distribution, Ampere’s law and symmetry can be combined to simplify the derivation of the magnetic field generated by the current. For example, let’s consider a long, straight wire carrying a steady current $i$.
Applying Ampere’s law to a circular loop of radius $r$ centered on the wire, we make use of the rotational and translational symmetries. The magnetic field $\vec B$ has a constant magnitude along the loop and is tangent to it, with direction determined by the right-hand rule. Ampere’s law yields

$$\oint\vec B\cdot d\vec s = B\cdot 2\pi r = \mu_0 i, \tag{2.5.34}$$

allowing us to solve for the magnetic field of an infinite wire:

$$B = \frac{\mu_0 i}{2\pi r}. \tag{2.5.35}$$

This reproduces the inverse dependence of the magnetic field on the distance $r$ found in Eq. (2.5.23) and shows how much simpler Ampere’s law makes the analysis of a highly symmetric configuration such as an infinitely long wire.

Figure 2.23

Similarly, for a long ideal solenoid carrying a current $i$, we can choose the Amperian loop to be the loop in Fig. 2.23. Only the segment of length $h$ inside the solenoid contributes to the line integral, and the loop encloses $nh$ turns, so Ampere’s law gives

$$\oint\vec B\cdot d\vec s = Bh = \mu_0 i n h, \tag{2.5.36}$$

where $n$ is the number of turns per unit length of the solenoid. The magnetic field inside the solenoid is uniform and has magnitude

$$B = \mu_0 i n. \tag{2.5.37}$$

2.6 Faraday’s Law, Induction and Inductance

2.6.1 Faraday’s law of induction

In the last section, we showed that a current produces a magnetic field. A more surprising discovery of Faraday’s is the reverse effect: a changing magnetic field can induce an electric field that drives a current. This link between a magnetic field and the electric field it induces is now called Faraday’s law of induction.

Faraday originally stated his law of induction as “an emf is induced in a loop when the number of magnetic field lines that pass through the loop is changing”. The number of magnetic field lines passing through the loop is quantitatively described by the magnetic flux. Suppose a loop enclosing an area $A$ is placed in a magnetic field $\vec B$. Then, the magnetic flux through the loop is defined as

$$\Phi_B := \iint_A\vec B\cdot d\vec A.$$

Since the magnetic field $\vec B$ is divergence-free, i.e., $\nabla\cdot\vec B = 0$, by Corollary 2.2.2, its flux through a surface $S$ is invariant under local continuous deformation of $S$.
As a consequence, the flux $\Phi_B$ is well-defined and does not depend on the particular surface bounded by the loop that we have chosen. The SI unit for magnetic flux is called the weber (Wb): $1\,\mathrm{Wb} = 1\,\mathrm{T\cdot m^2}$.

Physics law 12 (Faraday’s law of induction). The magnitude of the emf $\mathcal{E}$ induced in a conducting loop is equal to the rate at which the magnetic flux $\Phi_B$ through that loop changes with time. Moreover, the induced emf $\mathcal{E}$ tends to oppose the flux change. In formula form,

$$\mathcal{E} = -\frac{d\Phi_B}{dt}. \tag{2.6.1}$$

If we change the magnetic flux through a coil of $N$ turns, an induced emf appears in every turn, and the total emf induced in the coil is the sum of these individual induced emfs:

$$\mathcal{E} = -N\frac{d\Phi_B}{dt}.$$

Here are some common means by which we can change the magnetic flux through a coil:
• Change the magnitude of the magnetic field within the coil.
• Change the area of the coil or the portion of that area that lies within the magnetic field.
• Change the angle between the direction of the magnetic field and the plane of the coil.

The rule of thumb to determine the direction of the induced current is as follows:

An induced current has a direction such that the magnetic field due to the current opposes the change in the magnetic flux that induces the current.

This is called Lenz’s law. We illustrate it in Figure 2.24.

Figure 2.24: The direction of the current $i$ induced in a loop is such that the current’s magnetic field $\vec B_{\text{ind}}$ opposes the change in the magnetic field $\vec B$ inducing $i$. The field $\vec B_{\text{ind}}$ is always directed opposite an increasing field $\vec B$ (figures a, c) and in the same direction as a decreasing field $\vec B$ (figures b, d). The curled-straight right-hand rule gives the direction of the induced current based on the direction of the induced field.

Example. Suppose we pull a closed conducting loop out of a uniform magnetic field $\vec B$ (pointing into the paper) at constant velocity $v$ as in Figure 2.25. Suppose the resistance of the loop is $R$.
Find the induced current, the force we must exert on the loop, the rate at which we do work on the loop, and the rate of energy dissipation in the loop.

Solution: We choose the coordinate axes in Figure 2.25 such that $\vec v = v\vec e_x$ is along the $x$ direction and $\vec B$ is along the $-z$ direction, i.e., $\vec B = -B\vec e_z$. As we move the loop to the right, the portion of its area within the magnetic field decreases, so the flux through the loop also decreases. Let $L$ be the width of the loop and $x$ the length of the loop that is still inside the field. The magnitude of the magnetic flux is $\Phi_B = BLx$, which changes at the rate

$$\frac{d\Phi_B}{dt} = BL\frac{dx}{dt} = -BLv,$$

where the minus sign reflects the fact that $x$ is decreasing. Then, by Faraday’s law, an emf is induced in the loop with magnitude

$$\mathcal{E} = BLv,$$

which leads to an induced current

$$i = \frac{BLv}{R}.$$

Moreover, by Lenz’s law, the induced current is in the clockwise direction.

Figure 2.25

With the induced current, we can calculate the magnetic forces on the loop. For the left edge, we have

$$\vec F_1 = i(L\vec e_y)\times\vec B = -iLB\,\vec e_x = -\frac{B^2L^2v}{R}\,\vec e_x.$$

For the upper and lower edges, we have

$$\vec F_2 = i(x\vec e_x)\times\vec B = ixB\,\vec e_y, \qquad \vec F_3 = -i(x\vec e_x)\times\vec B = -ixB\,\vec e_y.$$

Hence, to keep the loop moving at a constant velocity, the force we must exert on the loop is

$$\vec F = -(\vec F_1+\vec F_2+\vec F_3) = \frac{B^2L^2v}{R}\,\vec e_x.$$

Then, the rate at which work is done on the loop is

$$\vec F\cdot\vec v = \frac{B^2L^2v^2}{R}.$$

Finally, the rate of energy dissipation is

$$P = i^2R = \frac{B^2L^2v^2}{R}.$$

Note that this is exactly equal to the rate at which we are doing work on the loop. In other words, the work that we do in pulling the loop through the magnetic field is ultimately converted into thermal energy in the loop.

2.6.2 Induced electric fields

Faraday’s law of induction tells us that a changing magnetic field induces an emf (and hence a current) in a copper ring. Then, an electric field must be present along the ring because an electric field is needed to move the conduction electrons.
This induced electric field $\vec E$ produced by a changing magnetic field is just as real as an electric field produced by static charges; either field will exert a force $q_0\vec E$ on a particle of charge $q_0$. Faraday’s law can now be equivalently stated as follows:

A changing magnetic field induces an electric field.

Note that this statement is more general, in the sense that the electric field is induced even if there is no copper ring: the electric field would appear even if the changing magnetic field were in a vacuum.

To derive the formula for the induced electric field, we look at the following figures. We assume that the magnetic field $\vec B$ is increasing in magnitude at a rate $dB/dt$. In (a), an emf $\mathcal{E}$ is induced along the copper ring as in (2.6.1). In (b), we replace the copper ring with a hypothetical circular path of radius $r$. Consider a particle of charge $q_0$ moving around this circular path. The work $W$ done on it in one revolution by the induced electric field is

$$W = q_0\mathcal{E} = -q_0\frac{d\Phi_B}{dt} = -q_0\iint_S\frac{\partial\vec B}{\partial t}\cdot d\vec A,$$

where $S$ is the disk enclosed by the circular path. On the other hand, the work done on a particle of charge $q_0$ along the circular path $C$ can also be evaluated from the electric field as

$$W = q_0\oint_C\vec E\cdot d\vec s = q_0\iint_S(\nabla\times\vec E)\cdot d\vec A,$$

where we used Theorem 2.2.3 in the second step. Comparing the above equations, we obtain

$$-\iint_S\frac{\partial\vec B}{\partial t}\cdot d\vec A = \iint_S(\nabla\times\vec E)\cdot d\vec A.$$

Since this equation must hold for any surface $S$, the integrands must agree:

$$\nabla\times\vec E = -\frac{\partial\vec B}{\partial t}. \tag{2.6.2}$$

This is Faraday’s law in Maxwell’s equations.

Now, we have two types of electric fields: static electric fields produced by static charges, and induced electric fields produced by changing magnetic flux. Although electric fields produced in either way exert forces on charged particles, there is an important difference between them. The static electric field is irrotational (curl-free), so there is an electric potential associated with it by the Poincaré lemma, Theorem 2.2.4.
But this is not the case for induced electric fields, which have non-vanishing curl: electric potential has no meaning for electric fields that are produced by induction. We can derive a contradiction by assuming that there is indeed an electric potential. Consider a charge $q_0$ that makes a loop around the circular path in Figure (b) above. It starts at a certain point and, on its return to that same point, has experienced an emf $\mathcal{E}$; that is, work $q_0\mathcal{E}$ has been done on the particle by the electric field. However, that is impossible if a potential exists, because the particle is back at the same point, which can have only one value of the potential.

The difference between the two types of electric fields can also be seen from their field lines. Field lines of induced electric fields form closed loops. Field lines produced by static charges never do so; they must start on positive charges and end on negative charges. Thus, a field line from a charge can never loop around and back onto itself.

2.6.3 Inductors and inductance, self-induction

In the previous lecture, we discussed how capacitors can generate desired electric fields. Now we shift our focus to inductors, which play a crucial role in generating desired magnetic fields. Specifically, we examine a long solenoid, restricting our attention to a short length near the middle to avoid fringing effects.

When a current $i$ flows through the windings of the solenoid, it generates a magnetic flux $\Phi_B$ in the central region of the inductor. The magnitude of the magnetic flux increases with the current. Similarly, increasing the number of turns $N$ in the solenoid increases the total flux linked by the windings. To quantify the ability of the solenoid to produce magnetic flux, we define the inductance $L$ as

$$L = \frac{N\Phi_B}{i}. \tag{2.6.3}$$

The inductance is the ratio of the flux linkage $N\Phi_B$ to the current and provides a measure of the solenoid’s capability as an inductor.
The SI unit of magnetic flux is the tesla-square meter ($\mathrm{T\cdot m^2}$), and as a result, the SI unit of inductance is the tesla-square meter per ampere ($\mathrm{T\cdot m^2/A}$). This unit is called the henry (H) in honor of Joseph Henry, an American physicist who discovered electromagnetic induction independently of Faraday. Therefore, $1\,\mathrm{H} = 1\,\mathrm{T\cdot m^2/A}$.

To calculate the inductance of a solenoid, let’s consider a section of length $l$ near the middle of the solenoid. The flux linkage in this region can be expressed as

$$N\Phi_B = (nl)(BA), \tag{2.6.4}$$

where $n$ is the number of turns per unit length of the solenoid, $A$ is its cross-sectional area, and $B$ is the magnitude of the magnetic field inside the solenoid. As we calculated before, the magnetic field within the solenoid is given by $B = \mu_0 i n$, where $\mu_0$ is the permeability of free space and $i$ is the current. Hence, the inductance of the solenoid per unit length is

$$\frac{L}{l} = \frac{N\Phi_B}{il} = \mu_0 n^2 A. \tag{2.6.5}$$

This expression for the inductance of a solenoid is a good approximation when the solenoid is much longer than its radius. It neglects the spreading of magnetic field lines near the ends of the solenoid, in the same way that the parallel-plate capacitor formula $C = \epsilon_0 A/d$ neglects the fringing of electric field lines near the edges of the capacitor plates. Both approximations are accurate away from the boundaries but do not capture the behavior of the fields near them.

Figure 2.26

If the current in an inductor, such as a coil, is changing, the magnetic flux through it changes as well, which leads to a self-induced emf according to Faraday’s law. This phenomenon is known as self-induction. Using the definition of inductance, $N\Phi_B = Li$, and applying Faraday’s law, the self-induced emf is given by

$$\mathcal{E}_L = -\frac{d(N\Phi_B)}{dt} = -L\frac{di}{dt}. \tag{2.6.6}$$

Therefore, in any inductor, whether it is a coil, a solenoid, or a toroid, a self-induced emf arises whenever the current changes with time, regardless of the current’s magnitude.
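For a sense of scale, the following Python sketch evaluates Eq. (2.6.5) for a sample solenoid, cross-checks it against the definition $L = N\Phi_B/i$ with $B = \mu_0 i n$, and computes the self-induced emf of Eq. (2.6.6) for a given rate of change of the current. The function names and all numerical values (1000 turns per meter, radius 1 cm, length 0.5 m, current 2 A, $di/dt$ = 100 A/s) are assumptions chosen for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # T m/A


def solenoid_inductance(n, A, l):
    """Inductance of a length l of a long solenoid, from Eq. (2.6.5)."""
    return MU0 * n ** 2 * A * l


def self_induced_emf(L, di_dt):
    """Self-induced emf, Eq. (2.6.6); the sign encodes Lenz's law."""
    return -L * di_dt


# Assumed sample solenoid, not taken from the notes.
n = 1000.0        # turns per meter
radius = 0.01     # m
length = 0.5      # m
A = math.pi * radius ** 2

L = solenoid_inductance(n, A, length)

# Cross-check via the definition L = N * Phi_B / i, Eqs. (2.6.3)-(2.6.4).
i = 2.0                                   # A, any nonzero value works
B = MU0 * i * n                           # field inside the solenoid
L_from_flux = (n * length) * (B * A) / i  # N = n * length turns

emf = self_induced_emf(L, 100.0)          # current rising at 100 A/s

print(f"L = {L:.3e} H, from flux linkage = {L_from_flux:.3e} H")
print(f"self-induced emf for di/dt = 100 A/s: {emf:.3e} V")
```

The two ways of computing $L$ agree because the current cancels in $N\Phi_B/i$, which is why inductance is a property of the geometry alone.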
The direction of a self-induced emf is determined by Lenz’s law: the self-induced emf acts in the direction that opposes the change in current. This is the origin of the negative sign in the above equation.

2.6.4 RL circuits

In section 2.4.7, we discussed RC circuits that involve both a resistor and a capacitor. Now, let’s consider an RL circuit, in which a resistor and an inductor are connected together. We aim to derive and solve the differential equation that governs the behavior of this circuit.

Figure 2.27

Let’s first consider qualitatively what happens when the switch is closed on position a. Initially, due to the presence of the inductor, the increasing current gives rise to an induced emf that is opposite in direction to the battery emf, following Lenz’s law. As time progresses, the current approaches its final maximum value, $i = \mathcal{E}/R$. This implies that after a long time, the inductor behaves like an ordinary connecting wire, and the voltage across the two ends of the inductor is zero. So, in the stationary state, the circuit reduces to a battery connected to a resistance.

When the switch is on position b, the circuit consists of the inductor and the resistor without a battery. Without a battery to drive the current, the circuit eventually reaches an equilibrium state in which no current flows and no voltage is present across the components.

Now let us analyze the time-dependent situations quantitatively. When the switch S in the circuit is closed on position a, the current in the resistor begins to increase. A self-induced emf $\mathcal{E}_L = -L\frac{di}{dt}$ appears across the inductor, opposing the rise of the current according to Lenz’s law. This self-induced emf opposes the battery emf $\mathcal{E}$. Therefore, the voltage across the resistor is given by $\mathcal{E} + \mathcal{E}_L = \mathcal{E} - L\frac{di}{dt}$. By Ohm’s law, the voltage across the resistor should also be $iR$.
Thus, the differential equation for the circuit is

\[ \mathcal{E} - L\frac{di}{dt} - iR = 0. \tag{2.6.7} \]

With the initial condition $i(t=0) = 0$ and the long-time limit $i(t=+\infty) = \mathcal{E}/R$, the solution to the above differential equation is

\[ i(t) = \frac{\mathcal{E}}{R}\left(1 - e^{-t/\tau_L}\right), \tag{2.6.8} \]

where the inductive time constant $\tau_L$ is given by

\[ \tau_L = \frac{L}{R}. \tag{2.6.9} \]

It is a useful exercise to verify that $\tau_L$ defined above does indeed have the dimension of time. The potential difference across the resistor, $V_R(t) = i(t)R$, behaves in the same way as the current. The potential difference across the inductor is $|V_L(t)| = L\,\frac{di(t)}{dt} = \mathcal{E}\,e^{-t/\tau_L}$, which decays exponentially to zero as expected.

On the other hand, if the switch has been at position a for a long time and is then switched to position b at $t = 0$, the initial current through the circuit is $i(0) = \mathcal{E}/R$, since the inductor initially behaves like a wire. As time progresses without a battery in the circuit, the current decreases. This time-varying current induces a potential $\mathcal{E}_L = -L\,\frac{di}{dt}$ across the inductor, which must equal the potential difference across the resistor, $V_R = iR$. Thus, we have the differential equation

\[ L\frac{di}{dt} + iR = 0. \tag{2.6.10} \]

Using the initial condition $i(0) = \mathcal{E}/R$, the solution to the above differential equation is

\[ i(t) = \frac{\mathcal{E}}{R}\,e^{-t/\tau_L}, \tag{2.6.11} \]

where the inductive time constant is again $\tau_L = L/R$. We observe that the current decays exponentially from its initial value $\mathcal{E}/R$ to zero.

2.6.5 Energy of a magnetic field

Let us consider the RL circuit again. Multiplying both sides of the RL differential equation (2.6.7) by $i$, we obtain

\[ \mathcal{E}i = Li\frac{di}{dt} + i^2R. \tag{2.6.12} \]

The left-hand side, $\mathcal{E}i = \mathcal{E}\,\frac{dq}{dt}$, is the rate at which the battery delivers energy to the rest of the circuit. On the right-hand side, the second term $i^2R$ is the rate at which energy appears as thermal energy in the resistor. Therefore, the first term $Li\,\frac{di}{dt}$ should be understood as the rate at which energy is taken up by the inductor.
Since an inductor stores energy in the form of a magnetic field, we can say that this term represents the rate at which energy is stored in the magnetic field:

\[ \frac{dU_B}{dt} = Li\frac{di}{dt}. \tag{2.6.13} \]

By integrating this expression, we obtain the magnetic energy

\[ U_B = \int \frac{dU_B}{dt}\,dt = \int_0^i Li'\,di' = \frac{1}{2}Li^2. \tag{2.6.14} \]

This expression is similar to the electric energy stored in a capacitor, $U_E = \frac{q^2}{2C}$.

If the inductor is a long solenoid of cross-sectional area $A$ and length $l$, its magnetic energy density is approximately

\[ u_B = \frac{U_B}{Al} = \frac{L}{l}\,\frac{i^2}{2A} = \mu_0 n^2 A\,\frac{i^2}{2A} = \frac{B^2}{2\mu_0}. \tag{2.6.15} \]

In this equation, we used the relation $\frac{L}{l} = \mu_0 n^2 A$ from Eq. (2.6.5) and $B = \mu_0 i n$ for a long solenoid. The final expression gives the density of stored magnetic energy at any point where the magnitude of the magnetic field is $B$. Although we derived it for a solenoid, it holds for all magnetic fields, regardless of how they are generated. Notably, it resembles the expression for the electric energy density, $u_E = \frac{1}{2}\epsilon_0 E^2$.

2.6.6 LC harmonic oscillations

We have discussed RC and RL circuits in previous lectures. In this section, we consider an LC circuit consisting of an inductor with inductance $L$ and a capacitor with capacitance $C$. Let us denote the charge on the capacitor by $q$ and the current in the circuit by $i = \frac{dq}{dt}$. The electric potential differences across the inductor and the capacitor are $-L\,\frac{di}{dt}$ and $\frac{q}{C}$, respectively. Therefore, we have the differential equation

\[ L\frac{d^2q}{dt^2} + \frac{q}{C} = 0, \tag{2.6.16} \]

or equivalently,

\[ \frac{d^2q}{dt^2} = -\omega^2 q. \tag{2.6.17} \]

The angular frequency is given by

\[ \omega = \frac{1}{\sqrt{LC}}. \tag{2.6.18} \]

This equation represents a harmonic oscillator. The solution to this differential equation is

\[ q = Q\cos(\omega t + \phi), \tag{2.6.19} \]

where $Q$ is the amplitude of the oscillation and $\phi$ is the phase angle.
The corresponding current is given by

\[ i = \frac{dq}{dt} = -\omega Q\sin(\omega t + \phi). \tag{2.6.20} \]

Figure 2.28

In a harmonic oscillator, such as a mass on a spring, the total energy is conserved: it is transformed back and forth between kinetic energy and the potential energy of the spring. Similarly, in the LC circuit we expect the total energy to be conserved. The total energy consists of the electric energy of the capacitor and the magnetic energy of the inductor. Using the energy formulas derived earlier, the total energy is

\[ U = U_E + U_B = \frac{1}{2}Li^2 + \frac{q^2}{2C}. \tag{2.6.21} \]

The conservation of this total energy implies that

\[ \frac{dU}{dt} = Li\frac{di}{dt} + \frac{q}{C}\frac{dq}{dt} = Li\frac{d^2q}{dt^2} + i\frac{q}{C} = 0. \tag{2.6.22} \]

Dividing by $i$, we recover precisely the harmonic oscillator equation (2.6.16). This confirms that the total energy of the LC circuit is conserved, just as in other harmonic oscillator systems.

2.6.7 RLC damped oscillations

In this subsection, we study RLC circuits, fundamental electrical circuits that contain a resistor (R), an inductor (L), and a capacitor (C).

Figure 2.29

The differential equation for a series RLC circuit can be derived by applying Kirchhoff's law to the circuit. The voltage drop across the resistor is given by Ohm's law as $V_R(t) = i(t)R$, where $i(t)$ is the current flowing through the circuit. According to Faraday's law, the voltage across the inductor is $V_L(t) = -L\,\frac{di(t)}{dt}$. The voltage across the capacitor is $V_C(t) = \frac{1}{C}\int i(t)\,dt$. Applying Kirchhoff's law to the series RLC circuit, we have

\[ L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = 0, \tag{2.6.23} \]

where we used the relation $i(t) = \frac{dq(t)}{dt}$. This is the differential equation that describes a series RLC circuit. Interestingly, it has the same form as the equation of motion for a damped harmonic oscillator, Eq. (1.5.14):

\[ m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = 0. \tag{2.6.24} \]

With the correspondences

\[ L \leftrightarrow m, \qquad q \leftrightarrow x, \qquad R \leftrightarrow b, \qquad i = \frac{dq}{dt} \leftrightarrow v = \frac{dx}{dt}, \qquad C \leftrightarrow \frac{1}{k}, \tag{2.6.25} \]

we establish a direct connection between the series RLC circuit and the damped harmonic oscillator. Depending on the values of the parameters, the series RLC circuit exhibits three different types of behavior:

• Underdamped: When $R < \sqrt{4L/C}$, the circuit is underdamped. The current and charge undergo decaying oscillations.

• Overdamped: When $R > \sqrt{4L/C}$, the circuit is overdamped. The current and charge approach their equilibrium values gradually, without oscillating.

• Critically damped: When $R = \sqrt{4L/C}$, the circuit is critically damped. The system reaches equilibrium in the shortest possible time without oscillating.

Understanding the RLC circuit in terms of the damped harmonic oscillator helps us analyze and predict the response of the circuit for different parameter values.

Additionally, we can add an alternating-current (AC) source to the RLC circuit, which introduces a time-varying term $\mathcal{E}(t)$ into the equation. The differential equation then has the same form as that of the forced damped harmonic oscillator, with the source acting as the external driving force. Using the results obtained for the forced damped harmonic oscillator, we can study, for example, resonance, where the circuit responds most strongly at a specific frequency of the AC source.

The forced damped harmonic oscillator has applications in various fields, including signal processing and communication systems. One of the key applications is signal amplification.
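The three damping regimes listed above can be checked numerically with a short sketch such as the following. The component values and function names are hypothetical, and the damped angular frequency $\sqrt{1/(LC) - (R/2L)^2}$ is the standard damped-oscillator result, quoted here as an assumption rather than derived in this subsection. It is real only for $R < \sqrt{4L/C}$, which is where the underdamped condition comes from.

```python
import math


def critical_resistance(L, C):
    """Resistance sqrt(4L/C) separating the under- and overdamped regimes."""
    return math.sqrt(4 * L / C)


def rlc_regime(R, L, C):
    """Classify a series RLC circuit by comparing R with sqrt(4L/C)."""
    R_crit = critical_resistance(L, C)
    if math.isclose(R, R_crit, rel_tol=1e-9):
        return "critically damped"
    return "underdamped" if R < R_crit else "overdamped"


def damped_angular_frequency(R, L, C):
    """Angular frequency sqrt(1/(LC) - (R/(2L))^2) of the underdamped circuit.

    Standard damped-oscillator result (assumed, not derived here); it is
    real only when R < sqrt(4L/C).
    """
    omega_sq = 1 / (L * C) - (R / (2 * L)) ** 2
    if omega_sq <= 0:
        raise ValueError("no oscillation: circuit is not underdamped")
    return math.sqrt(omega_sq)


# Hypothetical component values: L = 1 mH, C = 1 uF.
L, C = 1e-3, 1e-6
for R in (10.0, critical_resistance(L, C), 100.0):
    print(f"R = {R:7.2f} ohm -> {rlc_regime(R, L, C)}")

print(f"undamped omega = {1 / math.sqrt(L * C):.1f} rad/s")
print(f"damped omega at R = 10 ohm: {damped_angular_frequency(10.0, L, C):.1f} rad/s")
```

For small $R$ the damped frequency stays close to the LC value $1/\sqrt{LC}$ of Eq. (2.6.18), as expected from the mechanical analogy.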
The forced damped harmonic oscillator can be used to enhance or amplify weak signals by applying an AC source as an external driving force to the system.

2.7 Maxwell's equations, Electromagnetic Waves

2.7.1 Overview of Maxwell's equations

Maxwell's equations are a set of four differential equations that describe how electric and magnetic fields interact. Named after the Scottish physicist James Clerk Maxwell, these equations form the foundation of classical electrodynamics, optics, and electric circuits. They can also be used to derive the wave equation for light, showing that light is a form of electromagnetic radiation. The equations are typically written in the following form:

Gauss's Law for Electricity:

\[ \vec{\nabla}\cdot\vec{E} = \frac{\rho}{\varepsilon_0} \tag{2.7.1} \]

Gauss's Law for Magnetism:

\[ \vec{\nabla}\cdot\vec{B} = 0 \tag{2.7.2} \]

Faraday's Law of Induction:

\[ \vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t} \tag{2.7.3} \]

Ampère's Circuital Law with Maxwell's Addition:

\[ \vec{\nabla}\times\vec{B} = \mu_0\vec{J} + \mu_0\varepsilon_0\frac{\partial\vec{E}}{\partial t} \tag{2.7.4} \]

Here, $\vec{E}$ and $\vec{B}$ are the electric and magnetic field vectors, $\rho$ is the electric charge density, $\vec{J}$ is the current density, $\varepsilon_0$ is the permittivity of free space, and $\mu_0$ is the permeability of free space.

Gauss's Law for Electricity

Gauss's law for electricity states that the electric flux through any closed surface is proportional to the total charge enclosed within the surface. In differential form, this is expressed by the equation

\[ \vec{\nabla}\cdot\vec{E} = \frac{\rho}{\varepsilon_0}. \tag{2.7.5} \]

The left-hand side of this equation, $\vec{\nabla}\cdot\vec{E}$, is the divergence of the electric field $\vec{E}$. The divergence is a measure of the 'outgoingness' of a vector field at a given point. For an electric field, it measures how much electric field emanates from a particular point in space. The right-hand side, $\frac{\rho}{\varepsilon_0}$, is the charge density $\rho$ divided by the permittivity of free space $\varepsilon_0$. The permittivity of free space is a constant that characterizes the amount of electric field that can exist in a vacuum for a given electric charge.
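To make the meaning of the divergence in Eq. (2.7.5) concrete, the sketch below estimates $\vec{\nabla}\cdot\vec{E}$ by finite differences and compares it with $\rho/\varepsilon_0$. The test field is assumed to be the standard field inside a uniformly charged ball, $\vec{E} = \rho\vec{r}/(3\varepsilon_0)$; this example field, the charge density, the sample point, and the function names are illustrative choices, not part of the lecture.

```python
EPSILON_0 = 8.8541878128e-12  # permittivity of free space in F/m


def field_inside_uniform_ball(point, rho):
    """E = rho * r / (3 * epsilon_0) inside a uniformly charged ball (assumed example)."""
    return tuple(rho * coord / (3 * EPSILON_0) for coord in point)


def divergence(field, point, rho, h=1e-4):
    """Central-difference estimate of div E at `point`."""
    total = 0.0
    for axis in range(3):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        total += (field(forward, rho)[axis] - field(backward, rho)[axis]) / (2 * h)
    return total


rho = 1e-6  # hypothetical charge density in C/m^3
point = (0.01, -0.02, 0.005)  # a point inside the ball, in metres

div_E = divergence(field_inside_uniform_ball, point, rho)
print(f"div E         = {div_E:.6e} V/m^2")
print(f"rho/epsilon_0 = {rho / EPSILON_0:.6e} V/m^2")
```

The two printed numbers should agree to within rounding error, and the result does not depend on which interior point is chosen, reflecting the uniform charge density.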
Gauss's Law for Magnetism

Gauss's law for magnetism states that the net magnetic flux out of any closed surface is zero. This implies that there are no magnetic monopoles, so magnetic field lines neither begin nor end at any point. In differential form, this is expressed as

\[ \vec{\nabla}\cdot\vec{B} = 0. \tag{2.7.6} \]

Here, $\vec{B}$ is the magnetic field, and $\vec{\nabla}\cdot\vec{B}$ is the divergence of $\vec{B}$. Since the divergence of the magnetic field is zero, the magnetic field lines are continuous loops.

Faraday's Law of Induction

Faraday's law of induction states that a changing magnetic field induces an electromotive force (EMF) in a closed loop of wire. This EMF is associated with an induced electric field, described by the equation

\[ \vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t}. \tag{2.7.7} \]

The left-hand side, $\vec{\nabla}\times\vec{E}$, is the curl of the electric field, which measures the 'circulation' or 'rotationality' of the field. The right-hand side, $-\frac{\partial\vec{B}}{\partial t}$, is the rate of change of the magnetic field with time. The negative sign indicates that the induced EMF, and therefore the induced electric field, opposes the change in the magnetic field, as stated by Lenz's law.

Ampère's Circuital Law with Maxwell's Addition

Ampère's circuital law with Maxwell's addition, also known as the Ampère-Maxwell law, states that a magnetic field is produced both by the current density and by the rate of change of the electric field. The law reads

\[ \vec{\nabla}\times\vec{B} = \mu_0\vec{J} + \mu_0\varepsilon_0\frac{\partial\vec{E}}{\partial t}. \tag{2.7.8} \]

The left-hand side, $\vec{\nabla}\times\vec{B}$, is the curl of the magnetic field, which measures the 'circulation' or 'rotationality' of the field. On the right-hand side, the term $\mu_0\vec{J}$ is the current density $\vec{J}$ multiplied by the permeability of free space $\mu_0$. The term $\mu_0\varepsilon_0\frac{\partial\vec{E}}{\partial t}$ is the rate of change of the electric field with time, scaled by the constants $\mu_0$ and $\varepsilon_0$.
The second term on the right-hand side is Maxwell's addition, which accounts for the creation of a magnetic field by a changing electric field. This term is what allows electromagnetic waves to propagate through space: each changing field induces a change in the other, creating a self-sustaining wave.

2.7.2 Maxwell's Addition

The original form of Ampère's law, which relates the circulation of the magnetic field around a closed loop to the electric current passing through the loop, worked well for static fields. However, it failed to account for situations where the electric field changes with time. It was James Clerk Maxwell who recognized this inconsistency and added a crucial term to Ampère's law, now referred to as "Maxwell's Addition".

Maxwell realized that a changing electric field produces a magnetic field, just as a changing magnetic field produces an electric field (as described by Faraday's law of induction). This was a significant insight because it established a symmetry between electric and magnetic fields. Maxwell's addition is stated mathematically as follows:

\[ \vec{\nabla}\times\vec{B} = \mu_0\left(\vec{J} + \varepsilon_0\frac{\partial\vec{E}}{\partial t}\right). \tag{2.7.9} \]

The term $\varepsilon_0\frac{\partial\vec{E}}{\partial t}$ is Maxwell's addition, where $\varepsilon_0$ is the permittivity of free space and $\frac{\partial\vec{E}}{\partial t}$ is the rate of change of the electric field. This term is known as the displacement current. It is not a current in the traditional sense of charges moving in a conductor; rather, it is a "current" of changing electric field. It is essential for the consistency of Ampère's law with charge conservation.

The Necessity of Maxwell's Addition: The Displacement Current

Consider a charging capacitor. There is a current $i$ inside the wires connected to the plates, but in the region outside the wires and between the plates there is no conduction current. According to the original Ampère's law, the integral of the magnetic field $\vec{B}$ around the red loop $C$ in Figure 2.30 equals $\mu_0$ times the integral of the current density $\vec{J}$ over a surface $S$ whose boundary is $C$, i.e. $\partial S = C$,

\[ \int_C \vec{B}\cdot d\vec{r} = \mu_0\int_S \vec{J}\cdot d\vec{A}. \tag{2.7.10} \]

Now there is a paradox: we could choose the surface to be either the orange surface $S_{\text{orange}}$ or the blue surface $S_{\text{blue}}$, and the two choices give different values of the surface integral,

\[ \mu_0\int_{S_{\text{blue}}} \vec{J}\cdot d\vec{A} = \mu_0 i \neq 0 = \mu_0\int_{S_{\text{orange}}} \vec{J}\cdot d\vec{A}. \tag{2.7.11} \]

Figure 2.30: Charging a capacitor.

Maxwell resolved this paradox by introducing the concept of displacement current. He argued that a changing electric field between the plates of the capacitor acts as a "current", not of moving charges, but of the changing electric field itself. This displacement current, $\varepsilon_0\frac{\partial\vec{E}}{\partial t}$, is present in the region between the capacitor plates.

Recall that the electric field between two conducting plates of surface charge density $\sigma$ is

\[ E = \frac{\sigma}{\varepsilon_0} = \frac{q}{\varepsilon_0 A}. \tag{2.7.12} \]

Taking a time derivative and using $i = \frac{dq}{dt}$ gives

\[ \mu_0 i = \mu_0\varepsilon_0 A\frac{dE}{dt} = \mu_0\varepsilon_0\frac{d}{dt}\int \vec{E}\cdot d\vec{A}, \tag{2.7.13} \]

where we have used the fact that the electric field is approximately uniform when the plates are sufficiently large. Now, we find

\[ \int_{S_{\text{blue}}}\left(\vec{J} + \varepsilon_0\frac{\partial\vec{E}}{\partial t}\right)\cdot d\vec{A} = \int_{S_{\text{orange}}}\left(\vec{J} + \varepsilon_0\frac{\partial\vec{E}}{\partial t}\right)\cdot d\vec{A}. \tag{2.7.14} \]

The integral form of Ampère's Law with Maxwell's Addition is

\[ \int_C \vec{B}\cdot d\vec{r} = \mu_0\int_S\left(\vec{J} + \varepsilon_0\frac{\partial\vec{E}}{\partial t}\right)\cdot d\vec{A}, \tag{2.7.15} \]

which is independent of the choice of the surface $S$ as long as the boundary of $S$ is the curve $C$, i.e. $\partial S = C$. Using Stokes' theorem, we have

\[ \int_C \vec{B}\cdot d\vec{r} = \int_S (\vec{\nabla}\times\vec{B})\cdot d\vec{A}. \tag{2.7.16} \]

Taking $S$ to be infinitesimal, we arrive at the differential form of Ampère's Law with Maxwell's Addition,

\[ \vec{\nabla}\times\vec{B} = \mu_0\vec{J} + \mu_0\varepsilon_0\frac{\partial\vec{E}}{\partial t}. \tag{2.7.17} \]

Consistency with charge conservation

Maxwell's addition to Ampère's law is also crucial for the conservation of electric charge. Consider an electric charge distribution of density $\rho(\vec{r})$ inside a region $\mathcal{B}$, with charges flowing out of $\mathcal{B}$ through its boundary $S = \partial\mathcal{B}$ with current density $\vec{J}(\vec{r})$. The conservation of electric charge gives the equation

\[ \int_S \vec{J}(\vec{r})\cdot d\vec{A} = -\frac{d}{dt}\int_{\mathcal{B}} \rho(\vec{r})\,d^3x. \tag{2.7.18} \]

By the divergence theorem, we have

\[ \int_S \vec{J}(\vec{r})\cdot d\vec{A} = \int_{\mathcal{B}} \vec{\nabla}\cdot\vec{J}(\vec{r})\,d^3x. \tag{2.7.19} \]

Taking $\mathcal{B}$ to be infinitesimal, we arrive at the charge conservation equation

\[ \vec{\nabla}\cdot\vec{J} = -\frac{\partial\rho}{\partial t}. \tag{2.7.20} \]

On the other hand, in the homework problem, you have found the identity

\[ \vec{\nabla}\cdot(\vec{\nabla}\times\vec{B}) = 0. \tag{2.7.21} \]

Using Ampère's Law with Maxwell's Addition and Gauss's law, we recover the charge conservation equation:

\[ 0 = \vec{\nabla}\cdot(\vec{\nabla}\times\vec{B}) = \mu_0\vec{\nabla}\cdot\vec{J} + \mu_0\varepsilon_0\vec{\nabla}\cdot\frac{\partial\vec{E}}{\partial t} = \mu_0\vec{\nabla}\cdot\vec{J} + \mu_0\frac{\partial\rho}{\partial t}. \tag{2.7.22} \]

We note that Maxwell's Addition is crucial for the above derivation. Without it, Ampère's Law would be inconsistent with charge conservation.

2.7.3 Relativistic formulations of Maxwell equations

Maxwell's equations can be written in index form as

\[ \begin{aligned} \partial_i E_i &= \frac{1}{\varepsilon_0}\rho, \\ \partial_i B_i &= 0, \\ \epsilon_{ijk}\partial_j E_k &= -\frac{\partial}{\partial t}B_i, \\ \epsilon_{ijk}\partial_j B_k &= \mu_0 J_i + \mu_0\varepsilon_0\frac{\partial}{\partial t}E_i, \end{aligned} \tag{2.7.23} \]

where $i, j = 1, 2, 3$ and $\partial_i \equiv \frac{\partial}{\partial x^i}$. We have used the Einstein summation convention, in which repeated indices are summed over. We do not distinguish upper and lower $i, j$ indices, i.e. $V^i = V_i$. The symbol $\epsilon_{ijk}$ is the rank-3 Levi-Civita symbol, a totally anti-symmetric tensor with $\epsilon_{123} = 1$. The other components of $\epsilon_{ijk}$ are determined by total anti-symmetry.

Now, we define

\[ F^{ij} \equiv \frac{1}{\mu_0}\epsilon_{ijk}B_k, \qquad F^{0i} \equiv c\varepsilon_0 E_i, \qquad F^{i0} \equiv -c\varepsilon_0 E_i, \qquad F^{00} \equiv 0, \qquad J^0 \equiv c\rho, \tag{2.7.24} \]

where $c$ is a constant that will be determined later.

Exercise. Show that $B_i = \frac{\mu_0}{2}\epsilon_{ijk}F^{jk}$.

The electric and magnetic fields can be assembled into a rank-2 anti-symmetric tensor

\[ F^{\mu\nu} = -F^{\nu\mu}, \tag{2.7.25} \]

where we have combined $0$ and $i$ into a new index $\mu = 0, 1, 2, 3$. In matrix form, we have

\[ F^{\mu\nu} = \begin{pmatrix} 0 & c\varepsilon_0 E_1 & c\varepsilon_0 E_2 & c\varepsilon_0 E_3 \\ -c\varepsilon_0 E_1 & 0 & \frac{1}{\mu_0}B_3 & -\frac{1}{\mu_0}B_2 \\ -c\varepsilon_0 E_2 & -\frac{1}{\mu_0}B_3 & 0 & \frac{1}{\mu_0}B_1 \\ -c\varepsilon_0 E_3 & \frac{1}{\mu_0}B_2 & -\frac{1}{\mu_0}B_1 & 0 \end{pmatrix}. \tag{2.7.26} \]

$F^{\mu\nu}$ is called the field strength. Now, let us rewrite Maxwell's equations in terms of $F^{\mu\nu}$. First, let us define

\[ x^0 \equiv ct, \qquad \partial_0 \equiv \frac{\partial}{\partial x^0} = \frac{1}{c}\frac{\partial}{\partial t}. \tag{2.7.27} \]

We stress that the upper and lower 0-indices are different, i.e. $V^0 \neq V_0$; we will determine how they are related later. Maxwell's equations now become

\[ \begin{aligned} \partial_j F^{0j} &= J^0, \\ \partial_j F^{ij} &= J^i + \partial_0 F^{0i}, \\ \epsilon_{ijk}\partial^i F^{jk} &= 0, \\ \epsilon_{ijk}\partial^j F^{0k} &= -\frac{c^2\mu_0\varepsilon_0}{2}\epsilon_{ijk}\partial_0 F^{jk}. \end{aligned} \tag{2.7.28} \]

The first two equations can be nicely written as

\[ \partial_\nu F^{\mu\nu} = J^\mu. \tag{2.7.29} \]

To simplify the last two equations, let us examine the equation

\[ \epsilon_{\sigma\mu\nu\rho}\partial^\mu F^{\nu\rho} = 0, \tag{2.7.30} \]

where $\epsilon_{\sigma\mu\nu\rho}$ is the rank-4 Levi-Civita symbol, a totally anti-symmetric tensor with $\epsilon_{0123} = 1$; in particular, $\epsilon_{0ijk} = \epsilon_{ijk}$. When $\sigma = 0$, equation (2.7.30) becomes

\[ \epsilon_{ijk}\partial^i F^{jk} = 0, \tag{2.7.31} \]

which is the same as the third equation in (2.7.28). When $\sigma = i$, equation (2.7.30) becomes

\[ \epsilon_{ijk}\partial^j F^{0k} = \frac{1}{2}\epsilon_{ijk}\partial^0 F^{jk}. \tag{2.7.32} \]

To match the fourth equation in (2.7.28), we first set

\[ c = \frac{1}{\sqrt{\mu_0\varepsilon_0}}. \tag{2.7.33} \]

Next, we define

\[ x_0 \equiv -x^0, \qquad \partial^0 \equiv \frac{\partial}{\partial x_0} = -\frac{\partial}{\partial x^0} = -\partial_0. \tag{2.7.34} \]

With these definitions, (2.7.32) and the fourth equation of (2.7.28) become the same.

In summary, Maxwell's equations become (2.7.29) and (2.7.30). Equation (2.7.30) can be solved by

\[ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu, \tag{2.7.35} \]

for an arbitrary vector field $A^\mu$. We check that

\[ \epsilon_{\sigma\mu\nu\rho}\partial^\mu(\partial^\nu A^\rho - \partial^\rho A^\nu) = \epsilon_{\sigma\mu\nu\rho}\partial^\mu\partial^\nu A^\rho - \epsilon_{\sigma\mu\nu\rho}\partial^\mu\partial^\rho A^\nu = 0, \tag{2.7.36} \]

where we have used the fact that partial derivatives commute, i.e. $\partial^\mu\partial^\nu A^\rho = \partial^\nu\partial^\mu A^\rho$, while $\epsilon_{\sigma\mu\nu\rho}$ is anti-symmetric. The vector field $A^\mu$ is called a gauge potential. We note that the gauge potential is not entirely physical. Consider two gauge potentials that differ by a total derivative,

\[ A'^\mu = A^\mu + \partial^\mu\lambda. \tag{2.7.37} \]

We have

\[ F'^{\mu\nu} = \partial^\mu A'^\nu - \partial^\nu A'^\mu = \partial^\mu A^\nu + \partial^\mu\partial^\nu\lambda - \partial^\nu A^\mu - \partial^\nu\partial^\mu\lambda = \partial^\mu A^\nu - \partial^\nu A^\mu = F^{\mu\nu}. \tag{2.7.38} \]

The transformation (2.7.37) is called a gauge transformation or a gauge ambiguity. The gauge ambiguity can be eliminated by imposing gauge conditions.
A commonly used gauge condition is the Lorentz gauge

\[ \partial_\mu A^\mu = 0, \tag{2.7.39} \]

which actually eliminates only part of the gauge ambiguity (2.7.37).

2.7.4 Electromagnetic wave

We have seen that Maxwell's equations (2.7.23) simplify to (2.7.29) and (2.7.30) after treating time $t$ as the fourth coordinate $x^0 = ct$. An important question remains: what is the constant $c$?

Exercise. Show that the unit of $c$ is m/s from the units of $\mu_0$ and $\varepsilon_0$.

Hence, $x^0$ has the unit m, so it makes sense to combine $x^0$ with $x^i$ into $x^\mu$. Therefore, $c$ should be the velocity of something. We now show that the "something" is the electromagnetic wave.

Let us plug (2.7.35) into (2.7.29),

\[ \partial_\nu\partial^\mu A^\nu - \partial_\nu\partial^\nu A^\mu = J^\mu. \tag{2.7.40} \]

Using the Lorentz gauge (2.7.39), we find

\[ -\partial_\nu\partial^\nu A^\mu = J^\mu. \tag{2.7.41} \]

For simplicity, we consider the case without any charge or current. We have

\[ \left[\left(\frac{\partial}{\partial x^0}\right)^2 - \left(\frac{\partial}{\partial x^1}\right)^2 - \left(\frac{\partial}{\partial x^2}\right)^2 - \left(\frac{\partial}{\partial x^3}\right)^2\right]A^\mu = 0. \tag{2.7.42} \]

This is a wave equation that describes electromagnetic waves. It can be solved by the ansatz

\[ A^\mu = \varepsilon^\mu\cos(k_\mu x^\mu + \phi), \tag{2.7.43} \]

where $k_\mu$ is called the momentum (wave vector) and $\varepsilon^\mu$ is called the polarization of the electromagnetic wave. The wave equation (2.7.42) and the Lorentz gauge (2.7.39) imply

\[ k_\mu k^\mu = k_\mu\varepsilon^\mu = 0. \tag{2.7.44} \]

For simplicity, we choose

\[ k^\mu = (k, 0, 0, k), \qquad \varepsilon^\mu = (0, 1, 0, 0). \tag{2.7.45} \]

The nonzero component of the gauge potential is then, after absorbing a sign into the arbitrary phase $\phi$,

\[ A^1 = \cos(k(x^0 - x^3) + \phi). \tag{2.7.46} \]

The nonzero components of the field strength are

\[ \begin{aligned} F^{01} &= -\partial_0 A^1 - \partial_1 A^0 = k\sin(k(x^0 - x^3) + \phi), \\ F^{13} &= \partial_1 A^3 - \partial_3 A^1 = -k\sin(k(x^0 - x^3) + \phi). \end{aligned} \tag{2.7.47} \]

We find the nonzero components of the electric and magnetic fields,

\[ E_1 = \sqrt{\frac{\mu_0}{\varepsilon_0}}\,k\sin(k(x^0 - x^3) + \phi), \qquad B_2 = \mu_0 k\sin(k(x^0 - x^3) + \phi). \tag{2.7.48} \]

Now, using $x^0 = ct$, we find

\[ E_1 = \sqrt{\frac{\mu_0}{\varepsilon_0}}\,k\sin(k(ct - x^3) + \phi), \qquad B_2 = \mu_0 k\sin(k(ct - x^3) + \phi). \tag{2.7.49} \]

We see that the wavefronts of both the electric and magnetic fields move in the $x^3$ direction with speed $c$.
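A quick numerical check of this subsection can be sketched as follows: it evaluates $c = 1/\sqrt{\mu_0\varepsilon_0}$ from Eq. (2.7.33), verifies the conditions (2.7.44) for the choice (2.7.45), and confirms that the amplitude ratio $E_1/B_2$ in Eq. (2.7.48) equals $c$. The sketch assumes the metric signature $(-,+,+,+)$ implied by Eq. (2.7.34); the numerical constants are rounded reference values, and the wave number and helper names are hypothetical.

```python
import math

MU_0 = 4 * math.pi * 1e-7      # permeability of free space in T·m/A (classical value)
EPSILON_0 = 8.8541878128e-12   # permittivity of free space in F/m

c = 1 / math.sqrt(MU_0 * EPSILON_0)  # Eq. (2.7.33)


def minkowski_dot(a, b):
    """a_mu b^mu with the metric diag(-1, 1, 1, 1)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


k = 2.0  # hypothetical wave number in 1/m
k_vec = (k, 0.0, 0.0, k)        # Eq. (2.7.45)
eps_vec = (0.0, 1.0, 0.0, 0.0)

E1_amplitude = math.sqrt(MU_0 / EPSILON_0) * k  # from Eq. (2.7.48)
B2_amplitude = MU_0 * k

print(f"c     = {c:.6e} m/s")
print("k.k   =", minkowski_dot(k_vec, k_vec))
print("k.eps =", minkowski_dot(k_vec, eps_vec))
print(f"E1/B2 = {E1_amplitude / B2_amplitude:.6e} m/s")
```

The computed $c$ is close to $3\times10^8$ m/s, the measured speed of light, which is the numerical content of the statement that light is an electromagnetic wave.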
2.7.5 Symmetry of Maxwell's equations

We have seen before in many examples that our laws of physics are invariant under Galilean transformations. Are Maxwell's equations also invariant under Galilean transformations?

First, Maxwell's equations are invariant under rotations $R^i{}_j \in O(3)$,

\[ x'^i = R^i{}_j x^j, \qquad \partial'_j = \frac{\partial}{\partial x'^j} = \frac{\partial x^i}{\partial x'^j}\frac{\partial}{\partial x^i} = (R^{-1})^i{}_j\,\partial_i = (R^T)^i{}_j\,\partial_i. \tag{2.7.50} \]

Exercise. Show that $\epsilon_{ijk}R^i{}_l R^j{}_m R^k{}_n = \epsilon_{lmn}\det(R)$.

The field strengths and the current transform as

\[ F'^{0i} = R^i{}_j F^{0j}, \qquad F'^{ij} = R^i{}_k R^j{}_l F^{kl}, \qquad J'^i = R^i{}_j J^j, \qquad J'^0 = J^0. \tag{2.7.51} \]

It is now easy to see that Maxwell's equations (2.7.28) are invariant under the $O(3)$ transformations.

However, Maxwell's equations are not invariant under the Galilean transformations,

\[ x'^i = x^i + \frac{v^i}{c}x^0, \qquad x'^0 = x^0. \tag{2.7.52} \]

The derivatives transform as

\[ \begin{aligned} \frac{\partial}{\partial x'^i} &= \frac{\partial x^j}{\partial x'^i}\frac{\partial}{\partial x^j} + \frac{\partial x^0}{\partial x'^i}\frac{\partial}{\partial x^0} = \frac{\partial}{\partial x^i}, \\ \frac{\partial}{\partial x'^0} &= \frac{\partial x^j}{\partial x'^0}\frac{\partial}{\partial x^j} + \frac{\partial x^0}{\partial x'^0}\frac{\partial}{\partial x^0} = -\frac{v^i}{c}\frac{\partial}{\partial x^i} + \frac{\partial}{\partial x^0}. \end{aligned} \tag{2.7.53} \]

There is no good way for the field strengths $F^{ij}$ and $F^{0i}$ to transform that makes Maxwell's equations (2.7.28) invariant.

Instead of Galilean transformations, Maxwell's equations are invariant under Lorentz transformations. Let us see how Lorentz transformations come about. In the previous sections, we used the speed $c$ of the electromagnetic wave to turn time $t$ into $x^0 = ct$ and combined it with $x^i$ to form $x^\mu$. It is natural to treat $x^\mu$ as a 4-vector. We have upper and lower $\mu$ indices, related by

\[ x^\mu = \eta^{\mu\nu}x_\nu, \qquad x_\mu = \eta_{\mu\nu}x^\nu, \tag{2.7.54} \]

where $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$ are $4\times4$ matrices,

\[ \eta^{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \eta_{\mu\nu}. \tag{2.7.55} \]

The norm of the 4-vector $x^\mu$ is defined by

\[ |x|^2 \equiv x^\mu x_\mu = \eta_{\mu\nu}x^\mu x^\nu. \tag{2.7.56} \]

Recall that the $O(3)$ transformations of the 3-vector $x^i$ leave the norm of $x^i$ invariant. Analogously, we consider transformations that leave the norm (2.7.56) of the 4-vector invariant:

\[ \eta_{\mu\nu}x'^\mu x'^\nu = \eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma x^\rho x^\sigma = \eta_{\mu\nu}x^\mu x^\nu. \tag{2.7.57} \]

The transformation matrix $\Lambda^\mu{}_\rho$ must satisfy

\[ \eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = \eta_{\rho\sigma}. \tag{2.7.58} \]

In matrix notation, we have

\[ \Lambda^T\eta\Lambda = \eta. \tag{2.7.59} \]

Such matrices form a group called $O(1,3)$. The group $O(3)$ is a subgroup of $O(1,3)$. These transformations are called Lorentz transformations. For simplicity, let us consider the transformations that leave $x^2$ and $x^3$ invariant. They take the form

\[ \Lambda = \begin{pmatrix} \cosh\zeta & \sinh\zeta & 0 & 0 \\ \sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{2.7.60} \]

We find that $x^0$ and $x^1$ transform as

\[ x'^0 = x^0\cosh\zeta + x^1\sinh\zeta, \qquad x'^1 = x^0\sinh\zeta + x^1\cosh\zeta. \tag{2.7.61} \]

Using $x = x^1$ and $t = x^0/c$, this can be written as

\[ t' = \gamma\left(t + \frac{v}{c^2}x\right), \qquad x' = \gamma(x + vt), \tag{2.7.62} \]

where $\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$, and we have changed the variable $\zeta$ to $v$ by

\[ \cosh\zeta = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}, \tag{2.7.63} \]

together with $\sinh\zeta = \gamma v/c$.

Now, we can consider the limit $c \gg v$, in which the Lorentz transformation reduces to the Galilean transformation

\[ t' = t, \qquad x' = x + vt. \tag{2.7.64} \]

We see that the parameter $v$ in the Lorentz transformation becomes the velocity in the Galilean transformation. The Galilean transformation can be regarded as the low-velocity limit of the Lorentz transformation.

The Lorentz transformation has very profound physical implications. Although Lorentz first found the Lorentz transformation, it was Einstein who first explained its meaning to the rest of the world. This subject is now called Einstein's special relativity. In the following two sections, we introduce two of the most important implications of the Lorentz transformations:

1. Time Dilation,
2. Length Contraction.

2.7.6 Time Dilation

Time dilation, as described by special relativity, means that an observer in one inertial frame will perceive time to be passing more slowly in another frame that is moving relative to the first. This phenomenon runs counter to our everyday experience, but it is a natural consequence of the Lorentz transformations and the constancy of the speed of light.

Let us consider two observers: Alice, who is stationary, and Bob, who is moving at a velocity $v$ relative to Alice.
Suppose Bob carries a light clock with him, which measures time by bouncing a beam of light between two mirrors.

From Bob's perspective, the light in the clock travels a distance of $2d$ (where $d$ is the distance between the mirrors) in a time interval $\Delta t'$ (one tick of the clock). Using the invariant speed of light $c$, we have

\[ 2d = c\Delta t'. \tag{2.7.65} \]

However, from the perspective of Alice, who sees Bob moving, the light beam in Bob's clock follows a diagonal path. Unfolding the round trip, the path is the hypotenuse of a right triangle with vertical side $2d$ and horizontal side $v\Delta t$. According to Pythagoras' theorem, the length $c\Delta t$ of this path satisfies

\[ c^2\Delta t^2 = (2d)^2 + (v\Delta t)^2. \tag{2.7.66} \]

Here, $\Delta t$ is the time interval measured by Alice for one tick of Bob's clock. Substituting $2d = c\Delta t'$ into the above equation, we get

\[ c^2\Delta t^2 = c^2\Delta t'^2 + v^2\Delta t^2. \tag{2.7.67} \]

Solving for $\Delta t$, we find

\[ \Delta t = \gamma\Delta t', \tag{2.7.68} \]

where $\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$ is the Lorentz factor. This is the time dilation formula: Alice observes Bob's clock to tick more slowly by a factor of $\gamma$. It is important to note that this effect is not due to any mechanical or optical flaw in the clock, but is a genuine effect of relative motion.

We can also derive the time dilation formula (2.7.68) directly from the Lorentz transformation. Let us begin with the Lorentz transformation:

\[ t' = \gamma\left(t + \frac{vx}{c^2}\right), \qquad x' = \gamma(x + vt). \tag{2.7.69} \]

The time intervals $\Delta t'$, $\Delta t$ and space intervals $\Delta x'$, $\Delta x$ then satisfy

\[ \Delta t' = \gamma\left(\Delta t + \frac{v}{c^2}\Delta x\right), \qquad \Delta x' = \gamma(\Delta x + v\Delta t). \tag{2.7.70} \]

Here:

• $\Delta t'$ and $\Delta x'$ are the time and space intervals measured in the moving frame (Bob's frame), respectively.

• $\Delta t$ and $\Delta x$ are the time and space intervals measured in the stationary frame (Alice's frame), respectively.

• $v$ is the relative velocity between the two frames.

Now, let us imagine that successive ticks of a clock happen at the origin of the moving frame (Bob's frame). In this case, $\Delta x' = 0$ between the ticks.
Substituting $\Delta x' = 0$ into the Lorentz transformation (2.7.70) gives $\Delta x = -v\Delta t$ from the space part, and then $\Delta t' = \gamma\left(1 - \frac{v^2}{c^2}\right)\Delta t = \Delta t/\gamma$ from the time part, so we have

\[ \Delta t = \gamma\Delta t'. \tag{2.7.71} \]

This equation states that the time interval between ticks of the clock, as measured in the stationary frame (Alice's frame), is $\gamma$ times the interval measured in the moving frame (Bob's frame), in which the clock is at rest. Alice therefore measures a longer time interval than Bob by a factor of $\gamma$, which is the essence of time dilation.

Time dilation has been confirmed in numerous experiments, such as the time-dilated decay of muons in cosmic rays and precision measurements using atomic clocks on board GPS satellites. Time dilation highlights the flexible nature of time in special relativity, in stark contrast to our everyday perception of time as a constant, unchanging entity.

2.7.7 Length Contraction

Length contraction, much like time dilation, is a crucial and counterintuitive prediction of special relativity. According to this phenomenon, the length of an object along its direction of motion is observed to be shorter when viewed from a frame that is in motion relative to the object. This can be derived from the Lorentz transformations.

Consider an object at rest in the frame of observer Bob. The length $L'$ of the object as measured in Bob's frame is the difference between the coordinates of its endpoints, $x'_1$ and $x'_2$:

\[ L' = x'_2 - x'_1 = \Delta x'. \]

For another observer, Alice, who is moving at a speed $v$ relative to Bob, the coordinate intervals between the endpoints of the object are related by the Lorentz transformation as follows:

\[ \Delta t' = \gamma\left(\Delta t + \frac{v}{c^2}\Delta x\right), \qquad \Delta x' = \gamma(\Delta x + v\Delta t), \tag{2.7.72} \]

where $x$ denotes coordinates in Alice's frame, $\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$ is the Lorentz factor, and $t'$ is the time as observed by Bob. However, Alice measures the length of the object at a particular time, so for her the times at the two ends of the object are the same, i.e., $\Delta t = 0$.
Substituting this into the equations, we have

\[ \Delta t' = \gamma\frac{v}{c^2}\Delta x, \qquad \Delta x' = \gamma\Delta x. \tag{2.7.73} \]

The length $L = \Delta x$ of the object as observed by Alice is then given by

\[ L = \Delta x = \gamma^{-1}\Delta x' = \gamma^{-1}L'. \]

This result is counterintuitive because it says that the measured length of an object depends on the observer's state of motion, which contradicts our everyday experience. However, length contraction, like time dilation, is a real effect, and it is essential for maintaining the consistency of the laws of physics in all inertial frames, as required by the principle of relativity.
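The two results above can be checked numerically with a short sketch that implements the boost of Eq. (2.7.62) and verifies the invariance of the norm (2.7.56), the time dilation formula (2.7.71), and the length contraction formula. The speed $v = 0.6c$, the sample event, and the function names are hypothetical choices for this illustration; the mapping from Bob's frame back to Alice's frame is assumed to be the same boost with $v \to -v$.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2 / c^2)."""
    return 1 / math.sqrt(1 - (v / C) ** 2)


def boost(t, x, v):
    """Boost along x in the convention of Eq. (2.7.62)."""
    g = lorentz_factor(v)
    return g * (t + v * x / C**2), g * (x + v * t)


def interval(t, x):
    """Invariant norm -(ct)^2 + x^2 of Eq. (2.7.56), ignoring y and z."""
    return -(C * t) ** 2 + x**2


v = 0.6 * C
g = lorentz_factor(v)

# Invariance of the norm under the boost.
t, x = 1e-6, 100.0
t_p, x_p = boost(t, x, v)
print("norm before:", interval(t, x), " after:", interval(t_p, x_p))

# Time dilation: ticks with dx' = 0 in Bob's frame, mapped back with -v.
dt_bob = 1.0
dt_alice, _ = boost(dt_bob, 0.0, -v)
print("dt_alice / dt_bob =", dt_alice / dt_bob, " gamma =", g)

# Length contraction: Alice measures both ends at dt = 0.
L_alice = 1.0
_, L_bob = boost(0.0, L_alice, v)
print("L_alice / L_bob =", L_alice / L_bob, " 1/gamma =", 1 / g)
```

For $v = 0.6c$ the Lorentz factor is $\gamma = 1.25$, so Alice sees Bob's clock run slow by 25% and measures his rod to be 80% of its rest length, while the norm of the event is the same in both frames up to rounding error.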