European Journal of Physics
Eur. J. Phys. 41 (2020) 045202 (14pp)
https://doi.org/10.1088/1361-6404/ab78a6

PAPER

Defining the electromagnetic potentials

Art Davis
San Jose State University, United States of America
E-mail: artice.davis@sjsu.edu

Received 27 October 2019, revised 19 January 2020
Accepted for publication 21 February 2020
Published 8 June 2020

To cite this article: Art Davis 2020 Eur. J. Phys. 41 045202

Abstract
This paper offers a critical examination of the classical method of introducing the electromagnetic potentials in a nonrelativistic context. It applies the nonretarded Helmholtz theorem to show that the usual method is ambiguous, with that ambiguity being reflected in the gauge transformation equations. This ambiguity can be removed by carefully invoking the Helmholtz decomposition theorem, but the process exposes the inadequacy of the conventional procedure. It then shows how the retarded Helmholtz theorem can be used to motivate rigorous definitions of the vector and scalar potentials and the field variables based upon the current density as the single electrical source variable. The method depends upon the assumption of causality, hence is restricted to classical field theory. It uses the powerful theory of operators as the main analysis tool.

Keywords: electromagnetic potentials, Helmholtz theorem, operators, Maxwell’s equations

1. Introduction

The fundamental idea of a potential is not really clear. For instance, one speaks of a chemical potential, a gravitational potential, a diffusion potential, a vector potential, or a retarded potential. But just what is a potential?
We loosely think of a scalar potential of a vector function $\mathbf{f}$ as being a scalar function $\phi$ such that $\mathbf{f} = -\nabla\phi$, and of a vector potential as being a vector function $\mathbf{A}$ such that $\mathbf{f} = \nabla\times\mathbf{A}$. In the back of our minds, we carry the qualification that each vector function $\mathbf{f}$ must be expressed in one of these forms or the other, but not both (footnote 1). In the final analysis, electromagnetic theory texts simply do not clarify this issue by rigorously defining the term ‘potential.’ This paper is intended to do so. It will proceed as indicated in the following outline.

Footnote 1: In fact we will show that it is generally expressible as a linear combination of the two types of potential.

© 2020 European Physical Society. Printed in the UK.

2. Review of the conventional method

We have already mentioned that the electromagnetic potentials are ill-defined in the usual treatment, but it might be surprising that the same is true for the fundamental field variables $\mathbf{B}$ and $\mathbf{E}$, quantities which most would probably consider to be directly measurable. We will support this contention by referring to a specific text: Classical Electrodynamics (3rd ed.), by Jackson [1]. In so doing, we have no intention of impugning this venerable text. We have chosen it simply because it is widely used and offers a concrete and clear example which makes our point effectively.

Consider the electric field intensity $\mathbf{E}$. The typical text discusses Coulomb’s law and defines $\mathbf{E}$ as the force exerted by stationary source charges upon a stationary test charge. (Jackson does this in chapter 1. Though he never explicitly states that all the charges must be stationary, it is clear from context that he does mean to impose this restriction.) The next few chapters (1 through 4 in Jackson) explore electrostatics thoroughly. The text then discusses current electricity and the magnetic field $\mathbf{B}$ (footnote 2) (chapter 5 in Jackson).
At this juncture, the text begins to discuss the more general situation leading up to the Maxwell equations. Jackson takes the first step in this direction in chapter 5, section 5.15, when he introduces Faraday’s law. After previously ‘defining’ the magnetic field vector $\mathbf{B}$ and its associated quantity magnetic flux, he says (his equation number) ‘The changing flux induces an electric field around the circuit, the line integral of which is called the electromotive force $\mathcal{E}$.’ After several intervening sentences, he writes ‘The electromotive force is given by

$$\mathcal{E} = \oint_C \mathbf{E}'\cdot d\mathbf{l}, \tag{5.134}$$

where $\mathbf{E}'$ is the electric field at the element $d\mathbf{l}$ of the circuit $C$.’ Unfortunately, he has not at this point defined either $\mathbf{E}'$ or $\mathcal{E}$, so he is simply parachuting in the concepts of electromotive force and general electric field together. He never offers precise definitions, and this is commonly the case with electromagnetic theory textbooks in general.

2.1. The definition of ‘definition’

There is thus a clear need for a standard by which we can judge the rigour of any definition. In other words, we must have conditions that a proposed definition must satisfy in order to be a true definition. Consider this. A mathematical definition consists of an equation such as

$$w = f(x, y, z), \tag{1}$$

where $x$, $y$, and $z$ are variables which have already been defined and $f$ is to be taken as a function in a general sense; that is, it can be an operator which operates on the functions $x$, $y$, and $z$. If $w$, $x$, $y$, and $z$ are the mathematical representations of physical quantities, $w$ is called a derived quantity or variable ($w$ being derived from the quantities $x$, $y$, and $z$). An example is the defining equation for the current in a capacitor, $i = C\,dv/dt$. It is possible, however, for a defining equation to take the form

$$w = f(\mathbf{r}, t), \tag{2}$$

where $f$ is some known function of position and/or time. Then the corresponding variable is called a source variable.
For example, $i(t) = 0.01\cos(2\pi\,10^{6}\,t)$ A is an example of a source variable specifying the current in an element as an independent function of time. Such an equation defines a current source.

Footnote 2: The definition of $\mathbf{B}$ is even more fraught with problems than is the definition of $\mathbf{E}$.

For electromagnetic field theory, we assume that all quantities are represented as either derived variables or source variables. The requirement that a derived variable refer only to previously defined variables implies that there can be no loops in a flow diagram of all electromagnetic variable definitions. A bit of reflection will convince one that the flow diagram must have a tree structure and that each and every variable must be ultimately expressible in terms of source variables alone. This is the argument so ably put forward by Jefimenko [2] in his paper showing that causality requires this condition. We will use it to critique the usual approach to defining the electromagnetic potentials.

It is clear now that the loose ‘definitions’ of potential in the opening paragraph of this paper were not proper definitions at all! Proper definitions of $\phi$ and $\mathbf{A}$ as derived variables would present them as equations expressing them in terms of $\mathbf{f}$ rather than the reverse. This fact has unfortunately not been recognised in what has become a standard textbook presentation of the electromagnetic potentials, which dates from at least the early part of the last century [3] and continues through the present day [1]. Texts adopting this approach begin with a list of Maxwell’s equations, namely

$$\nabla\cdot\mathbf{B} = 0, \tag{3a}$$

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}, \tag{3b}$$

$$\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \frac{1}{c^{2}}\partial_t\mathbf{E}, \tag{3c}$$

$$\nabla\times\mathbf{E} = -\partial_t\mathbf{B}, \tag{3d}$$

which are assumed to hold for arbitrary time varying sources (footnote 3). They are either assumed ad hoc or developed from experimental laws. Since $\nabla\cdot\mathbf{B} = 0$, the reasoning goes (footnote 4), we can write

$$\mathbf{B} = \nabla\times\mathbf{A} \tag{4}$$

for some unspecified vector function $\mathbf{A}$. Using this expression in equation (3d), we have

$$\nabla\times[\mathbf{E} + \partial_t\mathbf{A}] = 0.$$
(5)

Now, the logic goes, the zero curl implies that we can express the argument of that curl as the gradient of some undetermined scalar function. Doing that and rearranging slightly, we have

$$\mathbf{E} = -\nabla\phi - \partial_t\mathbf{A}. \tag{6}$$

The negative sign on $\nabla\phi$ has been selected merely to agree with conventional formulas. Assuming that $\mathbf{E}$ and $\mathbf{B}$ have been previously defined, these equations are clearly intended to be definitions of the potentials as derived variables, but they are ambiguous; they do not define the potentials in terms of previously defined variables.

Footnote 3: The general definitions of the field variables are, as we have already noted, often not stated precisely, though we will assume in the first part of this paper that they have already received precise definitions based upon the source variables current density and/or charge.

Footnote 4: At this juncture Jackson (in section 2 of his chapter 6) says, ‘Since $\nabla\cdot\mathbf{B} = 0$ still holds, we can define $\mathbf{B}$ in terms of a vector potential: $\mathbf{B} = \nabla\times\mathbf{A}$. (6.7)’ Apparently he unconsciously realises that such an equation defines the variable on the left in terms of the one on the right, although the context certainly implies that he means the converse. There are many such contretemps in the general literature where the basic definitions are concerned.

The typical treatment recognises this ambiguity by introducing the gauge transformations

$$\mathbf{A}' = \mathbf{A} + \nabla\lambda, \tag{7a}$$

$$\phi' = \phi - \partial_t\lambda, \tag{7b}$$

where $\lambda$ is an arbitrary scalar field to be determined. The typical presentation then substitutes these expressions into equations (4) and (6), thus showing that neither $\mathbf{B}$ nor $\mathbf{E}$ is affected.

Another feature of the conventional development is its assertion that one can use the ‘freedom of the gauge’ embodied in the gauge transformation, equations (7), to arbitrarily specify the divergence of $\mathbf{A}$. We will show that this assumption is, in fact, incorrect. In addition to this fallacy, there is yet another shortcoming.
The vector potential and the scalar potential are ‘potentials’ of different functions, $\mathbf{B}$ and $\mathbf{E} + \partial_t\mathbf{A}$, respectively. This seems awkward, particularly as regards the second function.

Finally, and most importantly, the authors of texts adopting the aforementioned approach do not cite authorities for these equations. For instance, just why does zero divergence of a vector function imply that it is the curl of another? This is undoubtedly an assumption about time varying fields that is carried over from previous work on static and quasistatic fields. Our thesis is that the fundamental reference for such statements about zero divergence and zero curl is the classical Helmholtz decomposition theorem, but that it has been incorrectly applied, because this theorem shows that the vector and scalar potentials for a given vector function are not independent; rather, they are uniquely defined by the function being decomposed.

3. Operators and their application to electromagnetic field theory

3.1. Basic definitions and properties

We will use the powerful operator approach, so will supply a brief review of basic operator concepts (footnote 5). An operator is the generalisation of the idea of a real valued function: a function which maps functions into functions. For example, the time differentiation operator $p = d/dt$ maps any differentiable function of time into its derivative. In this case, we can write $p\sin = \cos$ to signify that it maps the sine function into the cosine function. We often write this action, though, by giving the typical values of these functions, writing $p\sin(t) = \cos(t)$, which really means that $[p\sin](t) = \cos(t)$. In the general case, for functions of both space and time, we write

$$Lf = g \tag{8}$$

and

$$[Lf](\mathbf{r}, t) = g(\mathbf{r}, t).$$
(9)

By the very act of writing these equations, we are assuming that each function $f$ produces a unique function $g$, that is, there is a single valued operator $L$ whose ‘value’ at $f$ is $g$, but we will now explore the existence of an inverse operator $L^{-1}$ which gives a unique function $f$ for each choice of the function $g$, in which case we will write $f = L^{-1}g$.

A function is a special type of relation, so let $S$ and $R$ be two arbitrary sets and recall that a relation $L$ on $S\times R$ is a subset of $S\times R$; $L \subset \{(f, g): f \in S,\ g \in R\}$. Now suppose $(f_1, g_1)$ and $(f_2, g_2)$ are both arbitrary elements of $L$. If $f_1 = f_2$ always implies that $g_1 = g_2$, then $L$ is called a function (in this case, an operator from $S$ to $R$). This, of course, is just a precise way of saying that to each choice of $f$ there corresponds a unique $g$. Now let $L$ be an operator and let $L^{-1} = \{(g, f): (f, g) \in L\}$, which is called the inverse relation for $L$. If $L^{-1}$ is itself a function (another operator), then it is called the inverse of $L$ and we write $L^{-1}g = f$.

Footnote 5: See [4]. This is a well written argument for teaching operators to undergraduates with the derivation of the classical Helmholtz theorem as an important example. Unfortunately, the author uses the term ‘symbolic methods’ in a more or less disparaging way, considering them to be somewhat nonrigorous. But see [5], which shows that the assumption of causality makes these methods quite rigorous. Also see [6] for a more leisurely discussion.

We will now discuss an important property which assures us that a given operator $L$ has an inverse. The most common definition used in the literature has to do with the causality of an operator [7], but we will use a slightly more general definition which refers to the variables themselves [8]. We will now let $L$ be a general relation (not necessarily a function), with $L^{-1}$ its inverse relation.
Choose arbitrary elements $(g_1, f_1) \in L^{-1}$ and $(g_2, f_2) \in L^{-1}$, let $t_0$ be an arbitrary instant of time, and let $\mathbf{r}'$ be an arbitrary point in space. If $g_1(\mathbf{r}', t) = g_2(\mathbf{r}', t)$ for all $t < t_0$ and each $\mathbf{r}'$ implies that $f_1(\mathbf{r}, t) = f_2(\mathbf{r}, t)$ for all $t < t_0$ and each choice of $\mathbf{r}$, then we say that $f$ is a causal consequent of $g$ and that $g$ is a causal antecedent of $f$.

Here is the importance of this condition: it ensures that $L^{-1}$ is an operator (that is, a function) whether or not $L$ is. Why? Simply because $g_1 = g_2$ means that $g_1(\mathbf{r}', t) = g_2(\mathbf{r}', t)$ for all $t < t_0$, whatever the choice of $t_0$, and for any $\mathbf{r}'$. Our condition, then, implies that $f_1(\mathbf{r}, t) = f_2(\mathbf{r}, t)$ for any choice of $t$ and for any $\mathbf{r}$. But this is merely another way of expressing the implication that $g_1 = g_2$ implies $f_1 = f_2$.

In passing, we mention that all of the operator equations above are equally valid if both $f$ and $g$ are vector functions denoted by bold letters such as $\mathbf{f}$, a convention which we have already used for the spatial position vector $\mathbf{r} = x_i\mathbf{e}_i$ (the vector $\mathbf{e}_i$ being the $i$th unit vector in a Cartesian coordinate system).

3.2. The Poisson equation

Poisson’s equation is

$$\nabla^{2} f = -g, \tag{10}$$

where we have chosen the negative sign to be consistent with existing literature. The operator $\nabla^{2}$ is called the Laplacian operator. Under the assumption that $f$ is a causal consequence of $g$, a unique inverse exists. We will write this inverse operator as $1/\nabla^{2}$ and its action (in operator form) as

$$f = \frac{1}{\nabla^{2}}[-g]. \tag{11}$$

But, specifically, just what is this inverse? We begin with the well-known ‘sampling property’ of the unit delta function

$$g(\mathbf{r}, t) = \int g(\mathbf{r}', t)\,\delta(\mathbf{r} - \mathbf{r}')\,d\tau', \tag{12}$$

where $d\tau'$ is the differential volume element, the delta function is a three-dimensional one, and the integral is a volume integral over all space with the integration variable being $\mathbf{r}'$. Next, we apply the well known identity

$$\delta(\mathbf{r} - \mathbf{r}') = -\nabla^{2}\left[\frac{1}{4\pi R}\right] = -\nabla^{2}\psi(\mathbf{r}, \mathbf{r}'), \tag{13}$$

where $\psi$ has the obvious definition and $R = |\mathbf{r} - \mathbf{r}'|$.
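The identity (13) is quoted without proof. As a brief check, sketched here under the assumptions that $\psi = 1/(4\pi R)$ with $\mathbf{R} = \mathbf{r} - \mathbf{r}'$, that derivatives are taken with respect to $\mathbf{r}$, and that the divergence theorem may be applied to a ball of radius $a$ centred on $\mathbf{r}'$ (the radius $a$ is introduced only for this illustration), the two properties expected of a delta function can be verified directly:

```latex
% Away from the singular point (R \neq 0) the Laplacian of \psi vanishes:
\nabla\psi = -\frac{\mathbf{R}}{4\pi R^{3}},
\qquad
\nabla^{2}\psi
  = -\frac{1}{4\pi}\left[\frac{3}{R^{3}} - \frac{3\,\mathbf{R}\cdot\mathbf{R}}{R^{5}}\right]
  = 0 .

% Integrating over a ball of radius a centred on r' and using the divergence theorem:
\int_{R\le a}\nabla^{2}\psi\,d\tau
  = \oint_{R=a}\nabla\psi\cdot d\mathbf{S}
  = -\frac{1}{4\pi a^{2}}\,\bigl(4\pi a^{2}\bigr)
  = -1 .
```

Thus $-\nabla^{2}\psi$ vanishes everywhere except at $\mathbf{r} = \mathbf{r}'$ and integrates to unity over any region containing that point, which is the behaviour required of $\delta(\mathbf{r} - \mathbf{r}')$.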
Inserting (13) into (12) gives

$$g(\mathbf{r}, t) = -\int g(\mathbf{r}', t)\,\nabla^{2}\psi(\mathbf{r}, \mathbf{r}')\,d\tau' = -\nabla^{2}\int g(\mathbf{r}', t)\,\psi(\mathbf{r}, \mathbf{r}')\,d\tau', \tag{14}$$

where we have noted that the Laplacian operator is with respect to $\mathbf{r}$, whereas the integration variable is $\mathbf{r}'$. We have, of course, also assumed that the function $g$ is ‘nice’ enough for the integration and differentiations to be interchanged. Equation (14) says that the function to the right of the Laplacian operator is a solution to the Poisson equation, but if we assume that $f$ is a causal consequence of $g$, we see that it is the unique solution to that equation. Thus, we now see that

$$f(\mathbf{r}, t) = \left[\frac{1}{\nabla^{2}}\{-g\}\right](\mathbf{r}, t) = \int \frac{g(\mathbf{r}', t)}{4\pi R}\,d\tau' \tag{15}$$

is the unique solution to the Poisson equation (10) for each function $g$. Note carefully that the negative sign on $g$ in (10) has resulted in a positive sign in its solution, which was the objective of including it in Poisson’s equation in the first place.

In summary, we now know that, under the assumption of causality, the solution to the Poisson equation in (10) has the unique operator solution (11) given by (15). Before leaving our study of Poisson’s equation, we observe that all the above work continues to hold unchanged if we replace the scalar functions $f$ and $g$ by the vector functions $\mathbf{f}$ and $\mathbf{g}$. Thus, for future reference, we write the vector form of the Poisson equation as

$$\nabla^{2}\mathbf{f} = -\mathbf{g} \tag{16}$$

and its operator solution as

$$\mathbf{f} = \frac{1}{\nabla^{2}}\{-\mathbf{g}\}, \tag{17}$$

which has the explicit meaning

$$\mathbf{f}(\mathbf{r}, t) = \left[\frac{1}{\nabla^{2}}\{-\mathbf{g}\}\right](\mathbf{r}, t) = \int \frac{\mathbf{g}(\mathbf{r}', t)}{4\pi R}\,d\tau'. \tag{18}$$

3.3. The wave equation

The wave equation is

$$\left[\nabla^{2} - \frac{1}{c^{2}}\partial_t^{2}\right] f = -g, \tag{19}$$

where $c$ is an arbitrary positive real number and $\partial_t^{2} = \partial^{2}/\partial t^{2}$. We will use the symbol

$$\Box_c^{2} = \nabla^{2} - \frac{1}{c^{2}}\partial_t^{2} \tag{20}$$

and call it the d’Alembertian operator (or box operator). Using this convenient symbol, the wave equation assumes the form

$$\Box_c^{2} f = -g.$$
(21)

If we assume that $f$ is a causal consequent of $g$, a unique inverse of the d’Alembertian operator exists. Hence, we will write the solution to the wave equation in operator form as

$$f = \frac{1}{\Box_c^{2}}[-g]. \tag{22}$$

But just what is this inverse? The derivation is somewhat more complicated than that for the Poisson equation. We proceed almost as we did in that former derivation by considering the delta function identity

$$g(\mathbf{r}, t) = \int g(\mathbf{r}', t)\,\delta(\mathbf{r} - \mathbf{r}')\,d\tau' = \int g(\mathbf{r}', t - R/c)\,\delta(\mathbf{r} - \mathbf{r}')\,d\tau', \tag{23}$$

where $R/c = |\mathbf{r} - \mathbf{r}'|/c$ can be considered to be the propagation delay of a signal making the transit from $\mathbf{r}'$ to $\mathbf{r}$. The second equality holds because the delta function selects only the point $R = 0$, where the delay vanishes. For compactness of notation we will write $g' = g(\mathbf{r}', t - R/c)$ and use the definition of $\psi$ given in equation (13) to rewrite (23) in the form

$$g(\mathbf{r}, t) = -\int g'\,\nabla^{2}\psi\,d\tau'. \tag{24}$$

For simplicity, let’s work at first with the integrand, calling it $I$, and reserve the integration for the final step. Thus, we write

$$I = g'\,\nabla^{2}\psi = g'\,\nabla\cdot\nabla\psi = \nabla\cdot[g'\nabla\psi] - \nabla g'\cdot\nabla\psi = \nabla\cdot[\nabla\{g'\psi\} - \psi\nabla g'] - \nabla g'\cdot\nabla\psi = \nabla^{2}[g'\psi] - \psi\nabla^{2}g' - 2\nabla g'\cdot\nabla\psi. \tag{25}$$

Computation of the individual terms is straightforward but a bit tedious, the final result being

$$I = \left[\nabla^{2} - \frac{1}{c^{2}}\partial_t^{2}\right][g'\psi] = \Box_c^{2}[g'\psi]. \tag{26}$$

Inserting this result into equation (24) gives

$$\int \Box_c^{2}\left[g(\mathbf{r}', t - R/c)\,\psi(\mathbf{r}, \mathbf{r}')\right]d\tau' = -g(\mathbf{r}, t). \tag{27}$$

The derivatives involved in the d’Alembertian operator are either relative to the nonprimed coordinates or are with respect to time; thus, assuming that interchange of these differentiations with integration with respect to the primed coordinates is valid, we have

$$\Box_c^{2}\int \frac{g(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau' = -g(\mathbf{r}, t). \tag{28}$$

Thus, assuming $f$ is a causal consequence of $g$, we have shown that the wave equation has a unique solution given by

$$f(\mathbf{r}, t) = \left[\frac{1}{\Box_c^{2}}\{-g\}\right](\mathbf{r}, t) = \int \frac{g(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau', \tag{29}$$

an equation which defines the inverse d’Alembertian operator.
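The step from (25) to (26) is described as straightforward but tedious. Away from $R = 0$ it amounts to the statement that the retarded kernel $g(t - R/c)/(4\pi R)$ satisfies the homogeneous wave equation. The following is a minimal numerical sketch of that statement, not part of the original derivation: the waveform `g`, the value `C = 1`, the sample point, and the finite-difference step are all arbitrary choices made for illustration, and the source point is placed at the origin.

```python
import math

C = 1.0  # propagation speed in arbitrary units (illustrative assumption)


def g(s):
    """Hypothetical smooth source waveform; any twice-differentiable g would do."""
    return math.sin(3.0 * s)


def retarded_kernel(x, y, z, t):
    """g(t - R/c) / (4 pi R) for a source point at the origin."""
    R = math.sqrt(x * x + y * y + z * z)
    return g(t - R / C) / (4.0 * math.pi * R)


def wave_residual(u, x, y, z, t, h=1e-3):
    """Central-difference estimate of (laplacian(u) - u_tt / c^2, laplacian(u))."""
    centre = u(x, y, z, t)
    laplacian = (
        u(x + h, y, z, t) + u(x - h, y, z, t)
        + u(x, y + h, z, t) + u(x, y - h, z, t)
        + u(x, y, z + h, t) + u(x, y, z - h, t)
        - 6.0 * centre
    ) / h ** 2
    u_tt = (u(x, y, z, t + h) - 2.0 * centre + u(x, y, z, t - h)) / h ** 2
    return laplacian - u_tt / C ** 2, laplacian


# Evaluate at a field point well away from the singular point R = 0.
residual, laplacian = wave_residual(retarded_kernel, 0.7, 0.4, 0.3, 0.5)
print("laplacian term:", laplacian)
print("d'Alembertian residual:", residual)
```

The Laplacian term is of order one while the residual is smaller by several orders of magnitude, limited only by the finite-difference error. This is consistent with (26): the d’Alembertian of $g'\psi$ has no contribution away from $R = 0$, so the whole of (27) comes from the delta function at the source point.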
In summary, we now know that, under the assumption of causality, the solution to the wave equation in (21) has the operator solution given in (22), defined by the explicit equation (29). Before leaving the wave equation, we observe once more that all the above work continues to hold if $f$ and $g$ are replaced by the vector functions $\mathbf{f}$ and $\mathbf{g}$, respectively. Thus, for future reference, we write the vector wave equation as

$$\Box_c^{2}\mathbf{f} = -\mathbf{g} \tag{30}$$

and its operator solution as

$$\mathbf{f} = \frac{1}{\Box_c^{2}}[-\mathbf{g}], \tag{31}$$

which has the explicit meaning that

$$\mathbf{f}(\mathbf{r}, t) = \int \frac{\mathbf{g}(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau'. \tag{32}$$

Again, we note that the negative sign on the wave equation forcing function implies a positive sign in the solution.

Finally, we observe that if we allow the arbitrary parameter $c$ to become infinite, then the d’Alembertian operator reduces to the Laplacian operator and the preceding three equations (30)–(32) become equations (16)–(18), respectively. Therefore, we can say that the Poisson equation is a special case of the wave equation for which $c \to \infty$ and there is instantaneous propagation of effects from one point to another.

3.4. Commutativity

We will now derive some important commutativity relations for the wave equation. The same properties will then hold for the Poisson equation because it is a special case of the wave equation. First, let’s introduce some new and useful notation. We define

$$\partial_\alpha = \begin{cases} \partial_i = \dfrac{\partial}{\partial x_i}; & \alpha = i = 1, 2, 3 \\[2mm] \partial_t = \dfrac{\partial}{\partial t}; & \alpha = t. \end{cases} \tag{33}$$

Differentiating both sides of the operator form of our wave equation in (30) using (33) and then swapping the order of the partial differentiations, we write

$$\partial_\alpha\,\Box_c^{2}\,\mathbf{f} = \Box_c^{2}\,\partial_\alpha\mathbf{f} = -\partial_\alpha\mathbf{g}. \tag{34}$$

Multiplying both sides of the last equality in (34) by the inverse of the d’Alembertian operator and then using equation (31) to eliminate $\mathbf{f}$ in terms of $\mathbf{g}$ results in

$$\partial_\alpha\frac{1}{\Box_c^{2}}[-\mathbf{g}] = \frac{1}{\Box_c^{2}}[-\partial_\alpha\mathbf{g}]. \tag{35}$$

But $\mathbf{g}$ can be arbitrarily chosen, so the preceding equation implies the pure operator equation

$$\partial_\alpha\frac{1}{\Box_c^{2}} = \frac{1}{\Box_c^{2}}\,\partial_\alpha.$$
(36)

In other words, any partial derivative commutes with the inverse d’Alembertian operator. Because the processes of integrating and forming the various algebraic vector operations such as gradient, curl, and divergence are linear, we can write the following ‘omnibus’ commutativity property:

$$\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}\frac{1}{\Box_c^{2}} = \frac{1}{\Box_c^{2}}\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}. \tag{37}$$

Letting $c$ approach infinity, the inverse d’Alembertian operator becomes the inverse Laplacian, so we can immediately write

$$\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}\frac{1}{\nabla^{2}} = \frac{1}{\nabla^{2}}\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}. \tag{38}$$

These commutativity properties are extremely useful, as we will now see.

4. Derivation of the conventional Helmholtz theorem using operators

We have just seen in our derivation of the commutativity relations how powerful operator methods can be. As Limos [4] has shown, they can be used effectively in deriving the Helmholtz decomposition, which goes like this. We begin with the standard vector identity

$$\nabla\times[\nabla\times\mathbf{f}] = \nabla[\nabla\cdot\mathbf{f}] - \nabla^{2}\mathbf{f} \tag{39}$$

and rearrange it slightly to write

$$\nabla^{2}\mathbf{f} = \nabla[\nabla\cdot\mathbf{f}] - \nabla\times[\nabla\times\mathbf{f}]. \tag{40}$$

Multiplying both sides from the left by the inverse Laplacian operator, using the commutativity properties we have just derived, and manipulating the signs slightly, we have

$$\mathbf{f} = -\nabla\left[\frac{1}{\nabla^{2}}\{-\nabla\cdot\mathbf{f}\}\right] + \nabla\times\left[\frac{1}{\nabla^{2}}\{-\nabla\times\mathbf{f}\}\right]. \tag{41}$$

Equation (41) is the classical (nonretarded) Helmholtz decomposition. It is unique except for the order of the operators to the left of $\mathbf{f}$ on the right side of the equality. The ordering we have chosen just happens to be handy for interpretation.

4.1. Application: a second look at the conventional approach

Now let us use this result to more critically analyse the conventional approach to introducing the potentials, assuming that the basic field variables $\mathbf{B}$ and $\mathbf{E}$ have already received rigorous operational definitions and that Maxwell’s equations (3) are known to be valid. Let $\mathbf{f} = \mathbf{B}$ in equation (41).
Then equation (3a), $\nabla\cdot\mathbf{B} = 0$, implies that the first term on the right side of the equality in equation (41) vanishes and that we then have

$$\mathbf{B} = \nabla\times\left[\frac{1}{\nabla^{2}}\{-\nabla\times\mathbf{B}\}\right]. \tag{42}$$

Thus, it is certainly true that $\mathbf{B}$ is the curl of some function, but not just any function! In fact, that function is given by

$$\mathbf{A} = \frac{1}{\nabla^{2}}[-\nabla\times\mathbf{B}]. \tag{43}$$

$\mathbf{A}$ is defined to be the vector potential, and is given explicitly by equation (43) (footnote 6). Using our commutativity relations (38) we see that

$$\nabla\cdot\mathbf{A} = -\frac{1}{\nabla^{2}}[\nabla\cdot\{\nabla\times\mathbf{B}\}] = 0 \tag{44}$$

because the divergence of any curl is zero. Thus, we see that the classical approach leads to what is usually called the Coulomb gauge condition (footnote 7). The divergence of the vector potential is not subject to choice. It is always zero.

Footnote 6: This is not a trivial comment because it implies that the so-called ‘gauge freedom’ in fact reflects a faulty (ambiguous) definition of $\mathbf{A}$.

Let us now follow the classical argument through its next step, which is to note that Maxwell’s equation (3d), namely $\nabla\times\mathbf{E} = -\partial_t\mathbf{B}$, leads to $\nabla\times[\mathbf{E} + \partial_t\mathbf{A}] = 0$. Rather than simply asserting that the argument of the curl can then be expressed as the gradient of some scalar, we let $\mathbf{f} = \mathbf{E} + \partial_t\mathbf{A}$ in the Helmholtz decomposition (41) and note that the second term is zero. We require only the divergence to compute the first term. We have

$$\nabla\cdot\mathbf{f} = \nabla\cdot[\mathbf{E} + \partial_t\mathbf{A}] = \frac{\rho}{\epsilon_0}, \tag{45}$$

where we have applied Maxwell’s equation (3b), namely $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$, used the commutativity of $\nabla$ and $\partial_t$, and noted that $\nabla\cdot\mathbf{A} = 0$ as already shown in equation (44). Thus, the decomposition in equation (41) implies that

$$\mathbf{E} + \partial_t\mathbf{A} = -\nabla\left[\frac{1}{\nabla^{2}}\left\{-\frac{\rho}{\epsilon_0}\right\}\right] = -\nabla\phi, \tag{46}$$

where

$$\phi = \frac{1}{\nabla^{2}}\left\{-\frac{\rho}{\epsilon_0}\right\}, \tag{47}$$

an operator equation whose pointwise meaning is

$$\phi(\mathbf{r}, t) = \int \frac{\rho(\mathbf{r}', t)}{4\pi\epsilon_0 R}\,d\tau'. \tag{48}$$

This is the classical expression for the Coulomb potential for a distributed charge, extended to the case in which the source charges may vary with time. There is no arbitrariness in its definition.
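To make the pointwise meaning of (48) concrete, the following sketch approximates the volume integral by a midpoint-rule sum over a cubic grid. It is an illustration only: the source `uniform_ball` (a ball of unit radius whose uniform density oscillates in time), the grid resolution, and the field point are hypothetical choices, not taken from the paper. The essential feature is that the density is sampled at the same instant `t` at which the potential is evaluated.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def uniform_ball(point, t, radius=1.0, rho0=1.0e-9):
    """Hypothetical source: a ball whose uniform charge density oscillates in time."""
    if math.dist(point, (0.0, 0.0, 0.0)) > radius:
        return 0.0
    return rho0 * (1.0 + 0.5 * math.sin(t))


def coulomb_potential(field_point, t, rho, half_width=1.0, n=12):
    """Midpoint-rule approximation of equation (48) over the cube [-half_width, half_width]^3.

    The field point is assumed to lie outside the source region, so R never vanishes.
    """
    h = 2.0 * half_width / n
    d_tau = h ** 3
    phi = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                source = (
                    -half_width + (i + 0.5) * h,
                    -half_width + (j + 0.5) * h,
                    -half_width + (k + 0.5) * h,
                )
                density = rho(source, t)  # sampled at the same instant t: no delay
                if density != 0.0:
                    R = math.dist(field_point, source)
                    phi += density * d_tau / (4.0 * math.pi * EPS0 * R)
    return phi


phi_0 = coulomb_potential((5.0, 0.0, 0.0), 0.0, uniform_ball)
phi_1 = coulomb_potential((5.0, 0.0, 0.0), 1.0, uniform_ball)
print(phi_1 / phi_0, 1.0 + 0.5 * math.sin(1.0))
```

The ratio of the two potentials reproduces the ratio of the source densities at the same two instants, however far away the field point is placed. The distance affects only the overall magnitude, never the timing.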
Notice that there is zero propagation delay between an element of charge and the potential at a distant point due to that element of charge. Rewriting equation (46), we have

$$\mathbf{E} = -\nabla\phi - \partial_t\mathbf{A}. \tag{49}$$

We can now compute the vector potential in terms of the source variables. Using the Maxwell equation (3c), $\nabla\times\mathbf{B} = \mu_0\mathbf{J} + c^{-2}\partial_t\mathbf{E}$, in equation (43) and applying (49), we have

$$\mathbf{A} = \frac{1}{\nabla^{2}}\left[-\mu_0\mathbf{J} - \frac{1}{c^{2}}\partial_t\mathbf{E}\right] = \frac{1}{\nabla^{2}}\left[-\mu_0\mathbf{J} + \frac{1}{c^{2}}\partial_t\nabla\phi + \frac{1}{c^{2}}\partial_t^{2}\mathbf{A}\right]. \tag{50}$$

Multiply both sides by $\nabla^{2}$ from the left, then bring both terms involving $\mathbf{A}$ to the left side of the equality to get

$$\left[\nabla^{2} - \frac{1}{c^{2}}\partial_t^{2}\right]\mathbf{A} = \Box_c^{2}\mathbf{A} = -\mu_0\mathbf{J} + \frac{1}{c^{2}}\partial_t\nabla\phi. \tag{51}$$

Thus,

$$\mathbf{A} = \frac{1}{\Box_c^{2}}\left[-\mu_0\mathbf{J} + \frac{1}{c^{2}}\partial_t\nabla\phi\right]. \tag{52}$$

Footnote 7: This is not a result of ‘choosing the Coulomb gauge,’ but is a result of our (more rigorously motivated) definition of $\mathbf{A}$ in equation (43).

In explicit terms

$$\mathbf{A}(\mathbf{r}, t) = \int \frac{\mu_0\mathbf{J}(\mathbf{r}', t - R/c) - c^{-2}\,\partial_t\nabla'\phi(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau', \tag{53}$$

where $\phi$ is the explicit function of the charge given in equation (48). This expression is often referred to as the vector potential in the Coulomb gauge, and has been derived by others in different ways. See [9, 10].

Now that we have examined the conventional argument in the light of the classical Helmholtz decomposition, we can evaluate it on a more rigorous basis. By explicitly using that theorem we have made the two major steps rigorous. However, our analysis also exposes some deficiencies. The vector potential is defined in terms of $\mathbf{B}$ and the scalar potential in terms of the more complicated hybrid function $\mathbf{E} + \partial_t\mathbf{A}$, whereas one would hope that the two potentials relate to the same function. The procedure leads to the scalar potential defined by equation (48), which has no propagation delay, and an awkward expression (53) for the vector potential, which does exhibit propagation delay.
The conventional approach advances the argument that the supposed ambiguity in the definition of the potentials leads to the ability to choose the divergence of the vector potential arbitrarily (the so-called ‘gauge freedom’), whereas our analysis has shown that it is determined uniquely by the magnetic field $\mathbf{B}$ and that this divergence is zero (the ‘Coulomb gauge condition’).

5. Derivation of the retarded Helmholtz theorem using operators

As we have noted, most texts do not define the fundamental field quantities in an operationally rigorous manner. We will now remedy this by showing that rigorous operational definitions of the potentials can be based upon the current density alone as the single electromagnetic source quantity and that the field variable definitions are rigorous logical results of these potentials. Other investigations of this problem (not using operator methods) are detailed in [11–13]. To see this, we alter the modified standard vector identity of equation (40) by subtracting $c^{-2}\partial_t^{2}\mathbf{f}$ from both sides to obtain

$$\left[\nabla^{2} - \frac{1}{c^{2}}\partial_t^{2}\right]\mathbf{f} = \Box_c^{2}\mathbf{f} = \nabla[\nabla\cdot\mathbf{f}] - \nabla\times[\nabla\times\mathbf{f}] - \frac{1}{c^{2}}\partial_t^{2}\mathbf{f}. \tag{54}$$

If we multiply both sides from the left by $1/\Box_c^{2}$ and use commutativity, we have

$$\mathbf{f} = -\nabla\left[\nabla\cdot\left\{\frac{1}{\Box_c^{2}}(-\mathbf{f})\right\}\right] + \nabla\times\left[\nabla\times\left\{\frac{1}{\Box_c^{2}}(-\mathbf{f})\right\}\right] + \frac{1}{c^{2}}\partial_t^{2}\left[\frac{1}{\Box_c^{2}}(-\mathbf{f})\right]. \tag{55}$$

Since $\mathbf{f}$ is always a causal consequence of itself, we see that the operator on the right is uniquely defined to within commutation of the various operators involved. (We have chosen the order of these operators to facilitate our ensuing development.) Equation (55) is called the retarded Helmholtz decomposition.

5.1. Application: rigorous definition of the electromagnetic potentials and field variables

We will apply the retarded Helmholtz decomposition to electromagnetic theory by making the identification

$$\mathbf{f} = \mu_0\mathbf{J}, \tag{56}$$

where $\mathbf{J}$ is the current density and $\mu_0$ is an arbitrary positive constant used to adjust units.
We will consider $\mathbf{J}$ to be a source variable, that is, one defined by specification of a measuring instrument and a procedure for measuring it. (Think, for instance, of the current distribution in a transmitting antenna.) Then we define charge density as a derived variable in the following way. Conservation of charge implies that

$$\nabla\cdot\mathbf{J}(\mathbf{r}, t) = -\partial_t\rho(\mathbf{r}, t). \tag{57}$$

Integrating both sides with respect to time, we get

$$\rho(\mathbf{r}, t) = -\int_{-\infty}^{t}\nabla\cdot\mathbf{J}(\mathbf{r}, \alpha)\,d\alpha, \tag{58}$$

assuming that $\rho(\mathbf{r}, -\infty) = 0$ for each value of $\mathbf{r}$. Thus we have defined the charge density uniquely in terms of the current density. On the other hand, knowing the charge density fixes only the divergence of $\mathbf{J}$, which does not uniquely specify $\mathbf{J}$; hence, we cannot treat $\mathbf{J}$ as a derived variable defined in terms of $\rho$. Thus, we will take $\mathbf{J}$ as the single electromagnetic source variable, with all others being defined in terms of it as derived variables.

Now let us use the identification in (56) in the retarded Helmholtz decomposition (55). The result is

$$\mu_0\mathbf{J} = -\nabla\left[\nabla\cdot\left\{\frac{1}{\Box_c^{2}}(-\mu_0\mathbf{J})\right\}\right] + \nabla\times\left[\nabla\times\left\{\frac{1}{\Box_c^{2}}(-\mu_0\mathbf{J})\right\}\right] + \frac{1}{c^{2}}\partial_t^{2}\left[\frac{1}{\Box_c^{2}}(-\mu_0\mathbf{J})\right]. \tag{59}$$

This result motivates us to define the vector potential by

$$\mathbf{A} = \frac{1}{\Box_c^{2}}[-\mu_0\mathbf{J}], \tag{60}$$

an operator identity which means

$$\mathbf{A}(\mathbf{r}, t) = \int \frac{\mu_0\mathbf{J}(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau'. \tag{61}$$

The vector $\mathbf{A}$ is called the retarded vector potential of $\mathbf{J}$. The divergence of $\mathbf{A}$ is given by

$$\nabla\cdot\mathbf{A} = \nabla\cdot\left[\frac{1}{\Box_c^{2}}(-\mu_0\mathbf{J})\right] = \frac{1}{\Box_c^{2}}[-\mu_0\nabla\cdot\mathbf{J}] = -\frac{1}{\Box_c^{2}}(-\mu_0\partial_t\rho) = -\frac{1}{c^{2}}\partial_t\left[\frac{1}{\Box_c^{2}}\left(-\frac{\rho}{\epsilon_0}\right)\right] = -\frac{1}{c^{2}}\partial_t\phi, \tag{62}$$

where we have defined $\epsilon_0 = 1/(\mu_0 c^{2})$ and

$$\phi = \frac{1}{\Box_c^{2}}\left[-\frac{\rho}{\epsilon_0}\right]. \tag{63}$$

The pointwise meaning of this operator equation is

$$\phi(\mathbf{r}, t) = \int \frac{\rho(\mathbf{r}', t - R/c)}{4\pi\epsilon_0 R}\,d\tau'. \tag{64}$$

Thus, $\phi$ is well defined by (64) and is referred to as the (retarded) scalar potential of $\mathbf{J}$. Observe that it is derived from the (retarded) vector potential and is therefore intimately linked to it.
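The definition of charge density from current density in equation (58) can be illustrated numerically. The sketch below treats a single fixed point in space and integrates a given time history of $\nabla\cdot\mathbf{J}$ with the trapezoidal rule, using $\rho = -\int\nabla\cdot\mathbf{J}\,d\alpha$ with the sign that follows from (57). The function `div_J` is a hypothetical test history chosen so that the exact answer, $\rho(t) = \rho_0(1 + \tanh t)/2$, is known; the finite lower limit `t_start` stands in for $-\infty$ on the assumption that the source is negligible before that time.

```python
import math


def div_J(alpha, rho0=1.0):
    """Hypothetical divergence of J at one fixed point, as a function of time.

    Chosen to equal -d(rho)/dt for rho(t) = rho0 * (1 + tanh t) / 2.
    """
    return -0.5 * rho0 / math.cosh(alpha) ** 2


def charge_density(t, div_j, t_start=-20.0, steps=20000):
    """Trapezoidal approximation of rho(t) = -integral of div J from -infinity to t.

    The lower limit is truncated at t_start, where rho is assumed to be negligible.
    """
    h = (t - t_start) / steps
    total = 0.5 * (div_j(t_start) + div_j(t))
    for n in range(1, steps):
        total += div_j(t_start + n * h)
    return -h * total


for t in (-1.0, 0.0, 2.0):
    exact = 0.5 * (1.0 + math.tanh(t))
    print(t, charge_density(t, div_J), exact)
```

The numerical and exact values agree closely at each sample time. Note that the computation runs in one direction only: the history of $\nabla\cdot\mathbf{J}$ determines $\rho$, whereas $\rho$ alone would not determine $\mathbf{J}$, which is the reason given above for taking $\mathbf{J}$ as the source variable.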
Equation (62) is called the Lorenz condition, which we see always holds. Therefore, the divergence of $\mathbf{A}$ is not subject to independent choice. With these definitions, a slight rearrangement of (59) gives

$$\mu_0\mathbf{J} = \frac{1}{c^{2}}\partial_t[\nabla\phi + \partial_t\mathbf{A}] + \nabla\times[\nabla\times\mathbf{A}]. \tag{65}$$

The form of the first term motivates us to define

$$\mathbf{E} = -\nabla\phi - \partial_t\mathbf{A}, \tag{66}$$

the negative sign having been chosen for consistency with established practice, and the form of the last term impels us to define

$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{67}$$

With the preceding definitions of the field vectors, a slight rearrangement of (65) produces the so-called Ampere–Maxwell equation (3c):

$$\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \frac{1}{c^{2}}\partial_t\mathbf{E}. \tag{68}$$

If we take the divergence of equation (67) we get

$$\nabla\cdot\mathbf{B} = 0, \tag{69}$$

which is Maxwell’s equation (3a), because the divergence of any curl is zero. Computing the curl of both sides of equation (66) and using commutativity, we have

$$\nabla\times\mathbf{E} = -\partial_t\mathbf{B} \tag{70}$$

because the curl of any gradient is zero, thus eliminating the first term. This result is commonly called Faraday’s law and is Maxwell’s equation (3d). Finally, we take the divergence of equation (66), then use commutativity and equation (62) to get

$$\nabla\cdot\mathbf{E} = -\nabla^{2}\phi - \partial_t\left[-\frac{1}{c^{2}}\partial_t\phi\right] = -\Box_c^{2}\phi. \tag{71}$$

Multiplying both sides of (63) by $\Box_c^{2}$ and using the result in (71), we have

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}, \tag{72}$$

which is Maxwell’s equation (3b).

Thus, we have defined the electromagnetic potentials in a logically coherent way by applying the retarded Helmholtz theorem to the single source quantity $\mathbf{J}$. We have defined charge in terms of $\mathbf{J}$ assuming charge conservation. We have defined the field variables $\mathbf{E}$ and $\mathbf{B}$ in terms of these potentials, and we have shown that all four of Maxwell’s equations follow from the definitions.
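The two ‘slight rearrangements’ used above, from (59) to (65) and from (65) to (68), can be written out in full. The following is a sketch of the intermediate steps using only the definitions (60), (66), and (67) and the Lorenz condition (62); it introduces no new assumptions.

```latex
% Step 1: substitute the definition (60), A = (1/\Box_c^2)[-\mu_0 J], into (59).
\mu_0\mathbf{J}
  = -\nabla[\nabla\cdot\mathbf{A}]
    + \nabla\times[\nabla\times\mathbf{A}]
    + \frac{1}{c^{2}}\partial_t^{2}\mathbf{A} .

% Step 2: use the Lorenz condition (62), \nabla\cdot A = -(1/c^2)\partial_t\phi.
-\nabla[\nabla\cdot\mathbf{A}] = \frac{1}{c^{2}}\partial_t\nabla\phi
\quad\Longrightarrow\quad
\mu_0\mathbf{J}
  = \frac{1}{c^{2}}\partial_t\bigl[\nabla\phi + \partial_t\mathbf{A}\bigr]
    + \nabla\times[\nabla\times\mathbf{A}] .
\qquad\text{(this is (65))}

% Step 3: insert the definitions (66) and (67), E = -\nabla\phi - \partial_t A and B = \nabla\times A.
\mu_0\mathbf{J}
  = -\frac{1}{c^{2}}\partial_t\mathbf{E} + \nabla\times\mathbf{B}
\quad\Longrightarrow\quad
\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \frac{1}{c^{2}}\partial_t\mathbf{E} .
\qquad\text{(this is (68))}
```

Each of the three terms of the retarded Helmholtz decomposition thus has a definite destination: the gradient term and the second time derivative combine into the displacement-current term, and the double curl becomes $\nabla\times\mathbf{B}$.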
Finally, it is worthwhile mentioning that even though substitution of the gauge transformation equations (7) does leave the equations for the fields $\mathbf{E}$ and $\mathbf{B}$ in equations (66) and (67) invariant, there is no implication of gauge invariance, because the transformed values of $\mathbf{A}$ and $\phi$ no longer satisfy our definitions of the potentials, that is, equations (61) and (64).

6. Conclusions

After covering the vagueness of the conventional approach to introducing the electromagnetic field variables $\mathbf{E}$ and $\mathbf{B}$ and potentials $\mathbf{A}$ and $\phi$, we developed the conventional (nonretarded) Helmholtz decomposition using operator theory and used it to make the conventional approach rigorous. Unfortunately, the 'rigorization' of that process reveals a number of shortcomings. It assumes that $\mathbf{B}$ and $\mathbf{E}$ have been rigorously defined, which is not the case. The rigorous development of this procedure also shows that the vector potential $\mathbf{A}$ does not have the arbitrariness customarily attributed to it: it is completely and uniquely determined by the magnetic field $\mathbf{B}$; hence, it does not in fact possess the indeterminacy required for the gauge transformation to be applied. In fact, our procedure shows that this method always requires $\nabla \cdot \mathbf{A} = 0$, which is usually referred to as saying that the 'Coulomb gauge has been selected.' The resulting scalar potential has no propagation delay, and the vector potential is a complicated function of the source variable $\mathbf{J}$ that is awkward to manipulate.

We again used operator theory to derive the more general retarded Helmholtz decomposition by only slightly modifying a standard vector identity, and we used that decomposition to motivate rigorous definitions of both potentials. We then used these potentials to rigorously define the field variables $\mathbf{B}$ and $\mathbf{E}$. Thus, we have presented a logical foundation for electromagnetic field theory, the only lacking element being the concept of force$^{8}$.
Acknowledgments

The author would like to acknowledge useful discussions with Vladimir Onoochin. He also acknowledges the considerable contributions of the reviewers of this paper. They have materially added to its quality and readability.

ORCID iDs

Art Davis https://orcid.org/0000-0001-7877-1372

References

[1] Jackson J D 1999 Classical Electrodynamics 3rd edn (New York: Wiley)
[2] Jefimenko O 2004 Presenting electromagnetic theory in accordance with the principle of causality Eur. J. Phys. 25 287–96
[3] Lorentz H A 1929 The Theory of Electrons (Leipzig: Teubner) ch 1
[4] Limos N A 1987 Symbolic proof of the Helmholtz theorem Am. J. Phys. 55 57–9
[5] Davis A M 1994 A unified theory of lumped circuits and differential systems based on Heaviside operators and causality IEEE Trans. Circuits Syst. 41 712–27
[6] Davis A M 1998 Linear Circuit Analysis (New York: PWS)
[7] Zadeh L and Polak E 1969 System Theory (New York: McGraw-Hill) p 332
[8] Kuh E and Rohrer R 1967 Theory of Linear Active Networks (San Francisco: Holden-Day) p 34
[9] Jackson J D 2002 From Lorenz to Coulomb and other explicit gauge transformations Am. J. Phys. 70 917–28
[10] Stewart A M 2003 Vector potential of the Coulomb gauge Eur. J. Phys. 24 519–24
[11] Mcquistan R 1965 Scalar and Vector Fields (New York: Wiley) section 12.3
[12] Davis A M 2006 A generalized Helmholtz theorem for time-varying vector fields Am. J. Phys. 74 72–6
[13] Heras J 2006 Comment on a generalized Helmholtz theorem for time-varying vector fields Am. J. Phys. 74 742–43

8 The concept of force is usually introduced by postulating the Lorentz force law as an independent assumption.