European Journal of Physics
PAPER
Eur. J. Phys. 41 (2020) 045202 (14pp)
https://doi.org/10.1088/1361-6404/ab78a6
Defining the electromagnetic potentials
Art Davis
San Jose State University, United States of America
E-mail: artice.davis@sjsu.edu
Received 27 October 2019, revised 19 January 2020
Accepted for publication 21 February 2020
Published 8 June 2020
Abstract
This paper offers a critical examination of the classical method of introducing the electromagnetic potentials in a nonrelativistic context. It applies the nonretarded Helmholtz theorem to show that the usual method is ambiguous, with that ambiguity being reflected in the gauge transformation equations. This ambiguity can be removed by carefully invoking the Helmholtz decomposition theorem, but the process exposes the inadequacy of the conventional procedure. It then shows how the retarded Helmholtz theorem can be used to motivate rigorous definitions of the vector and scalar potentials and the field variables based upon the current density as the single electrical source variable. The method depends upon the assumption of causality, hence is restricted to classical field theory. It uses the powerful theory of operators as the main analysis tool.
Keywords: electromagnetic potentials, Helmholtz theorem, operators, Maxwell's equations
1. Introduction
The fundamental idea of a potential is not really clear. For instance, one speaks of a chemical potential, a gravitational potential, a diffusion potential, a vector potential, or a retarded potential. But just what is a potential? We loosely think of a scalar potential of a vector function $\mathbf{f}$ as being a scalar function $\phi$ such that $\mathbf{f} = -\nabla\phi$, and of a vector potential as being a vector function $\mathbf{A}$ such that $\mathbf{f} = \nabla\times\mathbf{A}$. In the back of our minds, we carry the qualification that each vector function $\mathbf{f}$ must be expressed in one of these forms or the other, but not both¹. In the final analysis, electromagnetic theory texts simply do not clarify this issue by rigorously defining the term 'potential.' This paper is intended to do so. It will proceed as indicated in the following outline.
¹ In fact we will show that it is generally expressible as a linear combination of the two types of potential.
0143-0807/20/045202+14$33.00 © 2020 European Physical Society Printed in the UK
2. Review of the conventional method
We have already mentioned that the electromagnetic potentials are ill-defined in the usual
treatment, but it might be surprising that the same is true for the fundamental field variables B
and E —quantities which most would probably consider to be directly measurable. We will
support this contention by referring to a specific text: Classical Electrodynamics (3rd ed.), by
Jackson [1]. In so doing, we have no intention of impugning this venerable text. We have
chosen it simply because it is widely used and offers a concrete and clear example which
makes our point effectively.
Consider the electric field intensity E . The typical text discusses Coulomb’s law and
defines E as the force exerted by stationary source charges upon a stationary test charge.
(Jackson does this in chapter 1. Though he never explicitly states that all the charges must be
stationary, it is clear from context that he does mean to impose this restriction.) The next few
chapters (1 through 4 in Jackson) explore electrostatics thoroughly. The text then discusses current electricity and the magnetic field B² (chapter 5 in Jackson).
At this juncture, the text begins to discuss the more general situation leading up to the
Maxwell equations. Jackson takes the first step in this direction in chapter 5, section 5.15
when he introduces Faraday’s law. After previously ‘defining’ the magnetic field vector B
and its associated quantity magnetic flux, he says (his equation number)
'The changing flux induces an electric field around the circuit, the line integral of which is called the electromotive force $\mathcal{E}$.' After several intervening sentences, he writes 'The electromotive force is given by

$$\mathcal{E} = \oint_C \mathbf{E}' \cdot d\mathbf{l}, \qquad (5.134)$$

where $\mathbf{E}'$ is the electric field at the element $d\mathbf{l}$ of the circuit C.'
Unfortunately, he has not at this point defined either $\mathbf{E}'$ or $\mathcal{E}$, so he is simply parachuting in the concepts of electromotive force and general electric field together. He never offers precise definitions, and this is commonly the case with electromagnetic theory textbooks in general.
2.1. The definition of ‘definition’
There is thus a clear need for a standard by which we can judge the rigour of any definition. In
other words we must have conditions that a proposed definition must satisfy in order to be a
true definition. Consider this. A mathematical definition consists of an equation such as

$$w = f(x, y, z), \qquad (1)$$

where x, y, and z are variables which have already been defined and f is to be taken as a function in a general sense; that is, it can be an operator which operates on the functions x, y, and z. If w, x, y, and z are the mathematical representations of physical quantities, w is called a derived quantity or variable (w being derived from the quantities x, y, and z). An example is the defining equation for the current in a capacitor, $i = C\,dv/dt$.
It is possible, however, for a defining equation to take the form

$$w = f(\mathbf{r}, t), \qquad (2)$$

where f is some known function of position and/or time. Then the corresponding variable is called a source variable. For example, $i(t) = 0.01\cos(2\pi\times 10^{6}\,t)\,\mathrm{A}$ is an example of a source variable specifying the current in an element as an independent function of time. Such an equation defines a current source.
² The definition of B is even more fraught with problems than is the definition of E.
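The distinction can be illustrated with a short numerical sketch (the waveform and capacitance below are hypothetical, not taken from the paper): the source variable v(t) is simply specified, while the derived variable i = C dv/dt is computed from it.

```python
import numpy as np

# Hypothetical example: v(t) is a source variable (specified outright),
# while i = C dv/dt is a derived variable (computed from v).
C = 1e-6                                   # illustrative capacitance, farads
t = np.linspace(0.0, 1e-3, 10_001)         # one millisecond of time
v = np.cos(2 * np.pi * 1e3 * t)            # v(t): the source variable

i = C * np.gradient(v, t)                  # i(t): the derived variable

# Compare against the analytic derivative of the chosen waveform.
i_exact = -C * 2 * np.pi * 1e3 * np.sin(2 * np.pi * 1e3 * t)
print(np.max(np.abs(i - i_exact)))         # small finite-difference error
```

The flow of definitions runs one way only: i refers to the previously defined v, never the reverse, which is exactly the tree structure demanded below.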
For electromagnetic field theory, we assume that all quantities are represented as either
derived variables or source variables. The requirement that a derived variable refer only to
previously defined variables implies that there can be no loops in a flow diagram of all
electromagnetic variable definitions. A bit of reflection will convince one that the flow
diagram must have a tree structure and that each and every variable must be ultimately
expressible in terms of source variables alone. This is the argument so ably put forward by
Jefimenko [2] in his paper showing that causality requires this condition. We will use it to
critique the usual approach to defining the electromagnetic potentials.
It is clear now that the loose ‘definitions’ of potential in the opening paragraph of this
paper were not proper definitions at all! Proper definitions of $\phi$ and $\mathbf{A}$ as derived variables would present them as equations expressing them in terms of $\mathbf{f}$ rather than the reverse. This
fact has unfortunately not been recognised in what has become a standard textbook presentation of the electromagnetic potentials which dates from at least the early part of the last
century [3] and continues through the present day [1].
Texts adopting this approach begin with a list of Maxwell's equations, namely

$$\nabla\cdot\mathbf{B} = 0, \qquad (3a)$$

$$\nabla\cdot\mathbf{E} = \rho/\epsilon_0, \qquad (3b)$$

$$\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \frac{1}{c^2}\partial_t\mathbf{E}, \qquad (3c)$$

$$\nabla\times\mathbf{E} = -\partial_t\mathbf{B}, \qquad (3d)$$

which are assumed to hold for arbitrary time-varying sources³. They are either assumed ad hoc or developed from experimental laws. Since $\nabla\cdot\mathbf{B} = 0$, the reasoning goes⁴, we can write

$$\mathbf{B} = \nabla\times\mathbf{A} \qquad (4)$$

for some unspecified vector function A. Using this expression in equation (3d), we have

$$\nabla\times[\mathbf{E} + \partial_t\mathbf{A}] = 0. \qquad (5)$$
Now, the logic goes, the zero curl implies that we can express the argument of that curl as the gradient of some undetermined scalar function. Doing that and rearranging slightly, we have

$$\mathbf{E} = -\nabla\phi - \partial_t\mathbf{A}. \qquad (6)$$

The negative sign on $\nabla\phi$ has been selected merely to agree with conventional formulas.
Assuming that E and B have been previously defined, these equations are clearly intended to be definitions of the potentials as derived variables, but they are ambiguous; they do not define the potentials in terms of previously defined variables. The typical
³ The general definitions of the field variables are, as we have already noted, often not stated precisely, though we will assume in the first part of this paper that they have already received precise definitions based upon the source variables current density and/or charge.
⁴ At this juncture Jackson (in section 2 of his chapter 6) says, 'Since $\nabla\cdot\mathbf{B} = 0$ still holds, we can define B in terms of a vector potential:

$$\mathbf{B} = \nabla\times\mathbf{A}. \qquad (6.7)$$'

Apparently he unconsciously realises that such an equation defines the variable on the left in terms of the one on the right, although the context certainly implies that he means the converse. There are many such contretemps in the general literature where the basic definitions are concerned.
treatment recognises this solecism by introducing the gauge transformations

$$\mathbf{A}' = \mathbf{A} + \nabla\lambda, \qquad (7a)$$

$$\phi' = \phi - \partial_t\lambda, \qquad (7b)$$

where λ is an arbitrary scalar field to be determined. The typical presentation then substitutes these expressions into equations (4) and (6), thus showing that neither B nor E is affected.
Another feature of the conventional development is its assertion that one can use the
‘freedom of the gauge’ embodied in the gauge transformation, equations (7), to arbitrarily
specify the divergence of A. We will show that this assumption is, in fact, incorrect.
In addition to this fallacy, there is yet another shortcoming. The vector potential and the scalar potential are 'potentials' of different functions, $\mathbf{B}$ and $\mathbf{E} + \partial_t\mathbf{A}$, respectively. This seems awkward, particularly as regards the second function.
Finally, and most importantly, the authors of texts adopting the aforementioned approach
do not cite authorities for these equations. For instance, just why does zero divergence of a
vector function imply that it is the curl of another? This is undoubtedly an assumption about
time varying fields that is carried over from previous work on static and quasistatic fields. Our
thesis is that the fundamental reference for such statements about zero divergence and zero
curl is the classical Helmholtz decomposition theorem, but that it has been incorrectly applied
because this theorem shows that the vector and scalar potentials for a given vector function
are not independent; rather, they are uniquely defined by the function being decomposed.
3. Operators and their application to electromagnetic field theory
3.1. Basic definitions and properties
We will use the powerful operator approach, so will supply a brief review of basic operator
concepts⁵. An operator is the generalisation of the idea of a real-valued function: it is a function which maps functions into functions. For example, the time differentiation operator $p = d/dt$
maps any differentiable function of time into its derivative. In this case, we can write
p sin = cos to signify that it maps the sine function into the cosine function. We often write
this action, though, by giving the typical values of these functions—writing p sin (t ) = cos (t ),
which really means that [ p sin](t ) = cos (t ). In the general case, for functions of both space
and time, we write

$$Lf = g \qquad (8)$$

and

$$[Lf](\mathbf{r}, t) = g(\mathbf{r}, t). \qquad (9)$$

By the very act of writing these equations, we are assuming that each function f produces a unique function g; that is, there is a single-valued operator L whose 'value' at f is g. But we will now explore the existence of an inverse operator $L^{-1}$ which gives a unique function f for each choice of the function g, in which case we will write $f = L^{-1}g$.
A function is a special type of relation, so let S and R be two arbitrary sets and recall that a relation L on $S\times R$ is a subset of $S\times R$: $L \subset \{(f, g)\colon f\in S,\ g\in R\}$. Now suppose $(f_1, g_1)$
⁵ See [4]. This is a well-written argument for teaching operators to undergraduates, with the derivation of the classical Helmholtz theorem as an important example. Unfortunately, the author uses the term 'symbolic methods' in a more or less disparaging way, considering them to be somewhat nonrigorous. But see [5], which shows that the assumption of causality makes these methods quite rigorous. Also see [6] for a more leisurely discussion.
and $(f_2, g_2)$ are both arbitrary elements of L. If $f_1 = f_2$ always implies that $g_1 = g_2$, then L is called a function, in this case an operator from S to R. This, of course, is just a precise way of saying that to each choice of f there corresponds a unique g. Now let L be an operator and let $L^{-1} = \{(g, f)\colon (f, g)\in L\}$, which is called the inverse relation for L. If $L^{-1}$ is itself a function (another operator), then it is called the inverse of L and we write $L^{-1}g = f$.
We will now discuss an important property which assures us that a given operator L has an inverse. The most common definition used in the literature has to do with the causality of an operator [7], but we will use a slightly more general definition which refers to the variables themselves [8]. We will now let L be a general relation (not necessarily a function), with $L^{-1}$ its inverse relation. Choose arbitrary elements $(g_1, f_1)\in L^{-1}$ and $(g_2, f_2)\in L^{-1}$, let $t_0$ be an arbitrary instant of time, and let $\mathbf{r}'$ be an arbitrary point in space. If $g_1(\mathbf{r}', t) = g_2(\mathbf{r}', t)$ for all $t < t_0$ and each $\mathbf{r}'$ implies that $f_1(\mathbf{r}, t) = f_2(\mathbf{r}, t)$ for all $t < t_0$ and each choice of $\mathbf{r}$, then we say that f is a causal consequent of g and that g is a causal antecedent of f. Here is the importance of this condition: it ensures that $L^{-1}$ is an operator (that is, a function) whether or not L is. Why? Simply because $g_1 = g_2$ means that $g_1(\mathbf{r}', t) = g_2(\mathbf{r}', t)$ for any choice of $t_0$ and any $\mathbf{r}'$. Our condition then implies that $f_1(\mathbf{r}, t) = f_2(\mathbf{r}, t)$ for any choice of t and for any $\mathbf{r}$. But this is merely another way of expressing the implication that $g_1 = g_2$ implies $f_1 = f_2$.
In passing, we mention that all of the operator equations above are equally valid if both f and g are vector functions denoted by bold letters, a convention we have already used for the spatial position vector $\mathbf{r} = x_i\mathbf{e}_i$ (the vector $\mathbf{e}_i$ being the ith unit vector in a Cartesian coordinate system).
3.2. The Poisson equation
Poisson's equation is

$$\nabla^2 f = -g, \qquad (10)$$

where we have chosen the negative sign to be consistent with existing literature. The operator $\nabla^2$ is called the Laplacian operator. Under the assumption that f is a causal consequent of g, a unique inverse exists. We will write this inverse operator as $1/\nabla^2$ and its action (in operator form) as

$$f = \frac{1}{\nabla^2}[-g]. \qquad (11)$$

But, specifically, just what is this inverse? We begin with the well-known 'sampling property' of the unit delta function,

$$g(\mathbf{r}, t) = \int g(\mathbf{r}', t)\,\delta(\mathbf{r} - \mathbf{r}')\,d\tau', \qquad (12)$$
where $d\tau'$ is the differential volume element, the delta function is a three-dimensional one, and the integral is a volume integral over all space with the integration variable being $\mathbf{r}'$. Next,
we apply the well-known identity

$$\delta(\mathbf{r} - \mathbf{r}') = -\nabla^2\left[\frac{1}{4\pi R}\right] = -\nabla^2\psi(\mathbf{r}, \mathbf{r}'), \qquad (13)$$

where ψ has the obvious definition and $R = |\mathbf{r} - \mathbf{r}'|$. Inserting (13) into (12) gives

$$g(\mathbf{r}, t) = -\int g(\mathbf{r}', t)\,\nabla^2\psi(\mathbf{r}, \mathbf{r}')\,d\tau' = -\nabla^2\int g(\mathbf{r}', t)\,\psi(\mathbf{r}, \mathbf{r}')\,d\tau', \qquad (14)$$
where we have noted that the Laplacian operator is with respect to $\mathbf{r}$, whereas the integration variable is $\mathbf{r}'$. We have, of course, also assumed that the function g is 'nice' enough for the integration and differentiations to be interchanged.
Equation (14) says that the function to the right of the Laplacian operator is a solution to the Poisson equation; if we assume further that f is a causal consequent of g, it is the unique solution to that equation. Thus, we now see that

$$f(\mathbf{r}, t) = \left[\frac{1}{\nabla^2}\{-g\}\right](\mathbf{r}, t) = \int \frac{g(\mathbf{r}', t)}{4\pi R}\,d\tau' \qquad (15)$$

is the unique solution to the Poisson equation (10) for each function g. Note carefully that the negative sign on g in (10) has resulted in a positive sign in its solution, which was the objective of including it in Poisson's equation in the first place.
In summary, we now know that, under the assumption of causality, the solution to the Poisson equation in (10) has the unique operator form (11) given explicitly by (15).
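As a numerical sanity check (our own illustration, not part of the paper), one can verify (15) on a spherically symmetric example: for a unit Gaussian source the integral on the right of (15) has a well-known closed form, and finite differences confirm that it satisfies (10).

```python
import math
import numpy as np

# Known pair (a sketch): for the unit Gaussian source
#   g(r) = exp(-r^2/2) / (2 pi)^{3/2},
# the integral in (15) evaluates to f(r) = erf(r/sqrt(2)) / (4 pi r).
# We confirm the radial Poisson equation (1/r) d^2(r f)/dr^2 = -g.
r = np.linspace(0.05, 6.0, 4001)
f = np.array([math.erf(v / math.sqrt(2)) for v in r]) / (4 * math.pi * r)
g = np.exp(-r**2 / 2) / (2 * math.pi) ** 1.5

d2 = np.gradient(np.gradient(r * f, r), r)   # second derivative of r*f
lap_f = d2 / r                               # radial Laplacian of f

err = np.max(np.abs((lap_f + g)[5:-5]))      # trim the one-sided edge stencils
print(err)                                   # small: f solves (10) for this g
```

The residual is limited only by the finite-difference step, in line with the uniqueness claim above.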
Before leaving our study of Poisson's equation, we observe that all the above work continues to hold unchanged if we replace the scalar functions f and g by the vector functions $\mathbf{f}$ and $\mathbf{g}$. Thus, for future reference, we write the vector form of the Poisson equation as

$$\nabla^2\mathbf{f} = -\mathbf{g} \qquad (16)$$

and its operator solution as

$$\mathbf{f} = \frac{1}{\nabla^2}\{-\mathbf{g}\}, \qquad (17)$$

which has the explicit meaning

$$\mathbf{f}(\mathbf{r}, t) = \left[\frac{1}{\nabla^2}\{-\mathbf{g}\}\right](\mathbf{r}, t) = \int \frac{\mathbf{g}(\mathbf{r}', t)}{4\pi R}\,d\tau'. \qquad (18)$$
3.3. The wave equation
The wave equation is

$$\left[\nabla^2 - \frac{1}{c^2}\partial_t^2\right]f = -g, \qquad (19)$$

where c is an arbitrary positive real number and $\partial_t^2 = \partial^2/\partial t^2$. We will use the symbol

$$\Box_c = \nabla^2 - \frac{1}{c^2}\partial_t^2 \qquad (20)$$

and call it the d'Alembertian operator (or box operator). Using this convenient symbol, the wave equation assumes the form

$$\Box_c f = -g. \qquad (21)$$

If we assume that f is a causal consequent of g, a unique inverse of the d'Alembertian operator exists. Hence, we will write the solution to the wave equation in operator form as

$$f = \frac{1}{\Box_c}[-g]. \qquad (22)$$
But just what is this inverse? The derivation is somewhat more complicated than that for
the Poisson equation. We proceed almost as we did in that former derivation by considering
the delta function identity

$$g(\mathbf{r}, t) = \int g(\mathbf{r}', t)\,\delta(\mathbf{r} - \mathbf{r}')\,d\tau' = \int g(\mathbf{r}', t - R/c)\,\delta(\mathbf{r} - \mathbf{r}')\,d\tau', \qquad (23)$$

where $R/c = |\mathbf{r} - \mathbf{r}'|/c$ can be considered to be the propagation delay of a signal making the transit from $\mathbf{r}'$ to $\mathbf{r}$. For compactness of notation we will write $g' = g(\mathbf{r}', t - R/c)$ and use the definition of ψ given in equation (13) to rewrite (23) in the form

$$g(\mathbf{r}, t) = -\int g'\,\nabla^2\psi\,d\tau'. \qquad (24)$$
For simplicity, let's work at first with the integrand, calling it I, and reserve the integration for the final step. Thus, we write

$$I = g'\nabla^2\psi = g'\nabla\cdot\nabla\psi = \nabla\cdot[g'\nabla\psi] - \nabla g'\cdot\nabla\psi = \nabla\cdot[\nabla\{g'\psi\} - \psi\nabla g'] - \nabla g'\cdot\nabla\psi = \nabla^2[g'\psi] - \psi\nabla^2 g' - 2\nabla g'\cdot\nabla\psi. \qquad (25)$$

Computation of the individual terms is straightforward but a bit tedious, the final result being

$$I = \left[\nabla^2 - \frac{1}{c^2}\partial_t^2\right][g'\psi] = \Box_c[g'\psi]. \qquad (26)$$
Inserting this result into equation (24) gives

$$\int \Box_c[g(\mathbf{r}', t - R/c)\,\psi(\mathbf{r}, \mathbf{r}')]\,d\tau' = -g(\mathbf{r}, t). \qquad (27)$$

The derivatives involved in the d'Alembertian operator are either relative to the unprimed coordinates or are with respect to time; thus, assuming that interchange of these differentiations with integration with respect to the primed coordinates is valid, we have

$$\Box_c\int \frac{g(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau' = -g(\mathbf{r}, t). \qquad (28)$$

Thus, assuming f is a causal consequent of g, we have shown that the wave equation has a unique solution given by

$$f(\mathbf{r}, t) = \left[\frac{1}{\Box_c}\{-g\}\right](\mathbf{r}, t) = \int \frac{g(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau', \qquad (29)$$

an equation which defines the inverse d'Alembertian operator.
In summary, we now know that—under the assumption of causality—the solution to the
wave equation in (21) has the operator solution given in (22) defined by the explicit
equation (29).
Before leaving the wave equation, we observe once more that all the above work continues to hold if f and g are replaced by the vector functions $\mathbf{f}$ and $\mathbf{g}$ respectively. Thus for future reference we write the vector wave equation as

$$\Box_c\mathbf{f} = -\mathbf{g} \qquad (30)$$

and its operator solution as

$$\mathbf{f} = \frac{1}{\Box_c}[-\mathbf{g}], \qquad (31)$$
which has the explicit meaning that

$$\mathbf{f}(\mathbf{r}, t) = \int \frac{\mathbf{g}(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau'. \qquad (32)$$
Again, we note that the negative sign on the wave equation forcing function implies a positive sign in the solution. Finally, we observe that if we allow the arbitrary parameter c to become infinite then the d'Alembertian operator reduces to the Laplacian operator and the preceding three equations (30)-(32) become equations (16)-(18), respectively. Therefore, we can say that the Poisson equation is a special case of the wave equation for which $c \to \infty$ and there is instantaneous propagation of effects from one point to another.
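A quick symbolic check (a sketch we add here, using sympy) of the kernel behind (29): the spherically symmetric retarded waveform h(t − r/c)/(4πr) is annihilated by the d'Alembertian for r > 0, for any waveform h.

```python
import sympy as sp

# Symbolic sketch: the outgoing spherical wave h(t - r/c)/(4 pi r), with h an
# arbitrary waveform, satisfies the source-free wave equation for r > 0.
r, t, c = sp.symbols('r t c', positive=True)
h = sp.Function('h')

f = h(t - r / c) / (4 * sp.pi * r)           # retarded spherical waveform

# Spherically symmetric Laplacian: (1/r) d^2(r f)/dr^2.
lap = sp.diff(r * f, r, 2) / r
box = lap - sp.diff(f, t, 2) / c**2          # d'Alembertian applied to f

print(sp.simplify(box))                      # 0
```

Only the source point r = 0 contributes, which is exactly the delta-function identity exploited in (23)-(28).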
3.4. Commutativity
We will now derive some important commutativity relations for the wave equation. The same
properties will then hold for the Poisson equation because it is a special case of the wave
equation. First, let's introduce some new and useful notation. We define

$$\partial_\alpha = \begin{cases} \partial_i = \dfrac{\partial}{\partial x_i}, & \alpha = i = 1, 2, 3, \\[4pt] \partial_t = \dfrac{\partial}{\partial t}, & \alpha = t. \end{cases} \qquad (33)$$
Differentiating both sides of the operator form of our wave equation in (30) using (33) and then swapping the order of the partial differentiations, we write

$$\partial_\alpha\Box_c\mathbf{f} = \Box_c\partial_\alpha\mathbf{f} = -\partial_\alpha\mathbf{g}. \qquad (34)$$

Multiplying both sides of the last equality in (34) by the inverse of the d'Alembertian operator and then using equation (31) to eliminate $\mathbf{f}$ in terms of $\mathbf{g}$ results in

$$\partial_\alpha\frac{1}{\Box_c}[-\mathbf{g}] = \frac{1}{\Box_c}[-\partial_\alpha\mathbf{g}]. \qquad (35)$$

But $\mathbf{g}$ can be arbitrarily chosen, so the preceding equation implies the pure operator equation

$$\partial_\alpha\frac{1}{\Box_c} = \frac{1}{\Box_c}\partial_\alpha. \qquad (36)$$
In other words, any partial derivative commutes with the inverse d'Alembertian operator. Because the processes of integrating and forming the various algebraic vector operations such as gradient, curl, and divergence are linear, we can write the following 'omnibus' commutativity property:

$$\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}\frac{1}{\Box_c} = \frac{1}{\Box_c}\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}. \qquad (37)$$
Letting c approach infinity, the inverse d'Alembertian operator becomes the inverse Laplacian, so we can immediately write

$$\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}\frac{1}{\nabla^2} = \frac{1}{\nabla^2}\begin{Bmatrix} \nabla \\ \nabla\cdot \\ \nabla\times \\ \partial_t \end{Bmatrix}. \qquad (38)$$
These commutativity properties are extremely useful—as we will now see.
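Equation (38) can be checked numerically in one periodic dimension, where the inverse Laplacian is diagonal in Fourier space; the grid and test function below are our own illustrative choices.

```python
import numpy as np

# One-dimensional periodic sketch of equation (38): in Fourier space the
# inverse Laplacian multiplies by -1/k^2 (zero mode dropped) and d/dx
# multiplies by ik, so the two operations commute.
n = 256
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
g = np.exp(np.sin(x))                      # arbitrary smooth periodic function
k = 2 * np.pi * np.fft.fftfreq(n, d=2 * np.pi / n)   # integer wavenumbers

def inv_laplacian(u):
    U = np.fft.fft(u)
    k2 = k**2
    k2[0] = 1.0                            # guard the k = 0 division
    V = -U / k2
    V[0] = 0.0                             # drop the constant (zero) mode
    return np.real(np.fft.ifft(V))

def ddx(u):
    return np.real(np.fft.ifft(1j * k * np.fft.fft(u)))

a = ddx(inv_laplacian(g))                  # d/dx applied after 1/lap
b = inv_laplacian(ddx(g))                  # 1/lap applied after d/dx
print(np.max(np.abs(a - b)))               # agreement to round-off
```

The two orderings differ only at machine precision, mirroring the operator identity (36).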
4. Derivation of the conventional Helmholtz theorem using operators
We have just seen in our derivation of the commutativity relations how powerful operator
methods can be. As Limos [4] has shown, they can be used effectively in deriving the
Helmholtz decomposition, which goes like this. We begin with the standard vector identity

$$\nabla\times[\nabla\times\mathbf{f}] = \nabla[\nabla\cdot\mathbf{f}] - \nabla^2\mathbf{f} \qquad (39)$$

and rearrange it slightly to write

$$\nabla^2\mathbf{f} = \nabla[\nabla\cdot\mathbf{f}] - \nabla\times[\nabla\times\mathbf{f}]. \qquad (40)$$

Multiplying both sides from the left by the inverse Laplacian operator, using the commutativity properties we have just derived, and manipulating the signs slightly, we have

$$\mathbf{f} = -\nabla\left[\frac{1}{\nabla^2}\{-\nabla\cdot\mathbf{f}\}\right] + \nabla\times\left[\frac{1}{\nabla^2}\{-\nabla\times\mathbf{f}\}\right]. \qquad (41)$$

Equation (41) is the classical (nonretarded) Helmholtz decomposition. It is unique except for the order of the operators to the left of $\mathbf{f}$ on the right side of the equality. The ordering we have chosen just happens to be handy for interpretation.
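The decomposition (41) can also be verified numerically. The sketch below (our own illustration, on a periodic grid with a mean-free test field, where 1/∇² becomes division by −k² in Fourier space) reconstructs a vector field from its divergence and curl alone.

```python
import numpy as np

# Numerical sketch of equation (41): a smooth, mean-free vector field on a
# periodic grid is recovered from -grad[(1/lap){-div f}] + curl[(1/lap){-curl f}].
n = 32
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y, Z = np.meshgrid(x, x, x, indexing='ij')
f = np.stack([np.sin(X) * np.cos(Y),
              np.cos(Y) * np.sin(Z),
              np.sin(Z + X)])              # arbitrary mean-free test field

k1 = 2 * np.pi * np.fft.fftfreq(n, d=2 * np.pi / n)  # integer wavenumbers
K = np.stack(np.meshgrid(k1, k1, k1, indexing='ij'))
K2 = np.sum(K**2, axis=0)
K2[0, 0, 0] = 1.0                          # guard the k = 0 mode

F = np.fft.fftn(f, axes=(1, 2, 3))
div_hat = np.sum(1j * K * F, axis=0)       # Fourier transform of div f
curl_hat = 1j * np.cross(K, F, axis=0)     # Fourier transform of curl f

# Inverse Laplacian = division by -k^2 in Fourier space (zero mode dropped).
phi_hat = (-div_hat) / (-K2)               # (1/lap){-div f}
A_hat = (-curl_hat) / (-K2)                # (1/lap){-curl f}

recon_hat = -1j * K * phi_hat + 1j * np.cross(K, A_hat, axis=0)
recon = np.real(np.fft.ifftn(recon_hat, axes=(1, 2, 3)))
print(np.max(np.abs(recon - f)))           # agreement to round-off
```

Note that the scalar and vector potentials here are computed, not chosen: each is fixed uniquely by the field being decomposed, exactly the point made above.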
4.1. Application: a second look at the conventional approach
Now let us use this result to more critically analyse the conventional approach to introducing
the potentials, assuming that the basic field variables B and E have already received rigorous operational definitions and that Maxwell's equations (3) are known to be valid. Let $\mathbf{f} = \mathbf{B}$ in equation (41). Then equation (3a), $\nabla\cdot\mathbf{B} = 0$, implies that the first term on the right side of the equality in equation (41) vanishes and that we then have

$$\mathbf{B} = \nabla\times\left[\frac{1}{\nabla^2}\{-\nabla\times\mathbf{B}\}\right]. \qquad (42)$$
Thus, it is certainly true that B is the curl of some function, but not just any function! In fact, that function is given by

$$\mathbf{A} = \frac{1}{\nabla^2}[-\nabla\times\mathbf{B}]. \qquad (43)$$

A is defined to be the vector potential and is given explicitly by equation (43)⁶. Using our commutativity relations (38) we see that

$$\nabla\cdot\mathbf{A} = -\frac{1}{\nabla^2}[\nabla\cdot\{\nabla\times\mathbf{B}\}] = 0 \qquad (44)$$

because the divergence of any curl is zero. Thus, we see that the classical approach leads to what is usually called the Coulomb gauge condition⁷. The divergence of the vector potential is not subject to choice. It is always zero.
⁶ This is not a trivial comment because it implies that the so-called 'gauge freedom' in fact reflects a faulty (ambiguous) definition of A.
Let us now follow the classical argument through its next step, which is to note that Maxwell's equation (3d), namely $\nabla\times\mathbf{E} = -\partial_t\mathbf{B}$, leads to $\nabla\times[\mathbf{E} + \partial_t\mathbf{A}] = 0$. Rather than simply asserting that the argument of the curl can then be expressed as the gradient of some scalar, we let $\mathbf{f} = \mathbf{E} + \partial_t\mathbf{A}$ in the Helmholtz decomposition (41) and note that the second term is zero. We require only the divergence to compute the first term. We have

$$\nabla\cdot\mathbf{f} = \nabla\cdot[\mathbf{E} + \partial_t\mathbf{A}] = \rho/\epsilon_0, \qquad (45)$$

where we have applied Maxwell's equation (3b), namely $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$, used the commutativity of ∇ and $\partial_t$, and noted that $\nabla\cdot\mathbf{A} = 0$ as already shown in equation (44). Thus, the decomposition in equation (41) implies that

$$\mathbf{E} + \partial_t\mathbf{A} = -\nabla\left[\frac{1}{\nabla^2}\{-\rho/\epsilon_0\}\right] = -\nabla\phi, \qquad (46)$$
where

$$\phi = \frac{1}{\nabla^2}\{-\rho/\epsilon_0\}, \qquad (47)$$

an operator equation whose pointwise meaning is

$$\phi(\mathbf{r}, t) = \int \frac{\rho(\mathbf{r}', t)}{4\pi\epsilon_0 R}\,d\tau'. \qquad (48)$$
This is the classical expression for the Coulomb potential for a distributed charge, extended to the case in which the source charges may vary with time. There is no arbitrariness in its definition. Notice that there is zero propagation delay between an element of charge and the potential at a distant point due to that element of charge. Rewriting equation (46), we have

$$\mathbf{E} = -\nabla\phi - \partial_t\mathbf{A}. \qquad (49)$$
We can now compute the vector potential in terms of the source variables. Using the Maxwell equation (3c), $\nabla\times\mathbf{B} = \mu_0\mathbf{J} + c^{-2}\partial_t\mathbf{E}$, in equation (43) and applying (49), we have

$$\mathbf{A} = \frac{1}{\nabla^2}\left[-\mu_0\mathbf{J} - \frac{1}{c^2}\partial_t\mathbf{E}\right] = \frac{1}{\nabla^2}\left[-\mu_0\mathbf{J} + \frac{1}{c^2}\partial_t\nabla\phi + \frac{1}{c^2}\partial_t^2\mathbf{A}\right]. \qquad (50)$$

Multiply both sides by $\nabla^2$ from the left, then bring both terms involving A to the left side of the equality to get

$$\left[\nabla^2 - \frac{1}{c^2}\partial_t^2\right]\mathbf{A} = \Box_c\mathbf{A} = -\mu_0\mathbf{J} + \frac{1}{c^2}\partial_t\nabla\phi. \qquad (51)$$

Thus,

$$\mathbf{A} = \frac{1}{\Box_c}\left[-\mu_0\mathbf{J} + \frac{1}{c^2}\partial_t\nabla\phi\right]. \qquad (52)$$

⁷ This is not a result of 'choosing the Coulomb gauge,' but is a result of our (more rigorously motivated) definition of A in equation (43).
In explicit terms,

$$\mathbf{A}(\mathbf{r}, t) = \int \frac{\mu_0\mathbf{J}(\mathbf{r}', t - R/c) - \frac{1}{c^2}[\partial_t\nabla\phi](\mathbf{r}', t - R/c)}{4\pi R}\,d\tau', \qquad (53)$$

where φ is the explicit function of the charge given in equation (48). This expression is often referred to as the vector potential in the Coulomb gauge, and has been derived by others in different ways. See [9, 10].
Now that we have examined the conventional argument in the light of the classical Helmholtz decomposition, we can evaluate it on a more rigorous basis. By explicitly using that theorem we have made the two major steps rigorous. However, our analysis also exposes some deficiencies. The vector potential is defined in terms of B and the scalar potential in terms of the more complicated hybrid function $\mathbf{E} + \partial_t\mathbf{A}$, whereas one would hope that the two potentials relate to the same function. The procedure leads to the scalar potential defined by equation (48), which has no propagation delay, and an awkward expression (53) for the vector potential, which does exhibit propagation delay. The conventional approach advances the argument that the supposed ambiguity in the definition of the potentials leads to the ability to choose the divergence of the vector potential arbitrarily (the so-called 'gauge freedom'), whereas our analysis has shown that it is determined uniquely by the magnetic field B and that this divergence is zero (the 'Coulomb gauge condition').
5. Derivation of the retarded Helmholtz theorem using operators
As we have noted, most texts do not define the fundamental field quantities in an operationally rigorous manner. We will now remedy this by showing that rigorous operational
definitions of the potentials can be based upon the current density alone as the single
electromagnetic source quantity and that the field variable definitions are rigorous logical
results of these potentials. Other investigations of this problem (not using operator methods)
are detailed in [11–13].
To see this, we alter the standard vector identity of equation (40) by subtracting $c^{-2}\partial_t^2\mathbf{f}$ from both sides to obtain

$$\left[\nabla^2 - \frac{1}{c^2}\partial_t^2\right]\mathbf{f} = \Box_c\mathbf{f} = \nabla[\nabla\cdot\mathbf{f}] - \nabla\times[\nabla\times\mathbf{f}] - \frac{1}{c^2}\partial_t^2\mathbf{f}. \qquad (54)$$
If we multiply both sides from the left by $1/\Box_c$ and use commutativity, we have

$$\mathbf{f} = -\nabla\left[\nabla\cdot\left\{\frac{1}{\Box_c}(-\mathbf{f})\right\}\right] + \nabla\times\left[\nabla\times\left\{\frac{1}{\Box_c}(-\mathbf{f})\right\}\right] + \frac{1}{c^2}\partial_t\left[\partial_t\left\{\frac{1}{\Box_c}(-\mathbf{f})\right\}\right]. \qquad (55)$$

Since f is always a causal consequent of itself, we see that the operator on the right is uniquely defined to within commutation of the various operators involved. (We have chosen the order of these operators to facilitate our ensuing development.) Equation (55) is called the retarded Helmholtz decomposition.
5.1. Application: rigorous definition of the electromagnetic potentials and field variables
We will apply the retarded Helmholtz decomposition to electromagnetic theory by making the identification

$$\mathbf{f} = \mu_0\mathbf{J}, \qquad (56)$$
where $\mathbf{J}$ is the current density and $\mu_0$ is an arbitrary positive constant used to adjust units. We will consider $\mathbf{J}$ to be a source variable, that is, one defined by specification of a measuring instrument and a procedure for measuring it. (Think, for instance, of the current distribution in a transmitting antenna.) Then we define charge density as a derived variable in the following way. Conservation of charge implies that

$$\nabla\cdot\mathbf{J}(\mathbf{r}, t) = -\partial_t\rho(\mathbf{r}, t). \qquad (57)$$

Integrating both sides with respect to time, we get

$$\rho(\mathbf{r}, t) = -\int_{-\infty}^{t}\nabla\cdot\mathbf{J}(\mathbf{r}, \alpha)\,d\alpha, \qquad (58)$$

assuming that $\rho(\mathbf{r}, -\infty) = 0$ for each value of $\mathbf{r}$. Thus we have defined the charge density uniquely in terms of the current density. On the other hand, knowing the divergence of $\mathbf{J}$ does not uniquely specify $\mathbf{J}$ itself; hence, we cannot treat $\mathbf{J}$ as a derived variable defined in terms of ρ. Thus, we will take $\mathbf{J}$ as the single electromagnetic source variable, with all others being defined in terms of it as derived variables.
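Equations (57) and (58) can be checked symbolically; the current density used below is a hypothetical one, chosen only because it vanishes as t → −∞.

```python
import sympy as sp

# Symbolic check of (57)-(58) for a hypothetical current density that
# vanishes as t -> -infinity (our own choice, purely illustrative).
x, t, a = sp.symbols('x t a', real=True)

Jx = x * sp.exp(t)                  # J has only an x component here
div_J = sp.diff(Jx, x)              # div J = exp(t)

# Equation (58): rho = -int_{-oo}^{t} div J dt'
rho = sp.integrate(-div_J.subs(t, a), (a, -sp.oo, t))

# Equation (57): div J + d rho / dt = 0
print(sp.simplify(div_J + sp.diff(rho, t)))  # 0
```

The charge density comes out uniquely once J is given, while the converse reconstruction of J from ρ would be underdetermined, as stated above.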
Now let us use the identification in (56) in the retarded Helmholtz decomposition (55). The result is

$$\mu_0\mathbf{J} = -\nabla\left[\nabla\cdot\left\{\frac{1}{\Box_c}(-\mu_0\mathbf{J})\right\}\right] + \nabla\times\left[\nabla\times\left\{\frac{1}{\Box_c}(-\mu_0\mathbf{J})\right\}\right] + \frac{1}{c^2}\partial_t\left[\partial_t\left\{\frac{1}{\Box_c}(-\mu_0\mathbf{J})\right\}\right]. \qquad (59)$$

This result motivates us to define the vector potential by

$$\mathbf{A} = \frac{1}{\Box_c}[-\mu_0\mathbf{J}], \qquad (60)$$

an operator identity which means

$$\mathbf{A}(\mathbf{r}, t) = \int \frac{\mu_0\mathbf{J}(\mathbf{r}', t - R/c)}{4\pi R}\,d\tau'. \qquad (61)$$

The vector A is called the retarded vector potential of J.
The divergence of A is given by
$$\nabla\cdot\mathbf{A} = \nabla\cdot\left[\frac{1}{\Box_c}(-\mu_0\mathbf{J})\right] = \frac{1}{\Box_c}[-\mu_0\nabla\cdot\mathbf{J}] = \frac{1}{\Box_c}(\mu_0\partial_t\rho) = -\frac{1}{c^2}\partial_t\left[\frac{1}{\Box_c}(-\rho/\epsilon_0)\right] = -\frac{1}{c^2}\partial_t\phi, \qquad (62)$$
where we have defined $\epsilon_0 = 1/(\mu_0 c^2)$ and

$$\phi = \frac{1}{\Box_c}[-\rho/\epsilon_0]. \qquad (63)$$

The pointwise meaning of this operator equation is

$$\phi(\mathbf{r}, t) = \int \frac{\rho(\mathbf{r}', t - R/c)}{4\pi\epsilon_0 R}\,d\tau'. \qquad (64)$$
Thus, φ is well defined by (64) and is referred to as the (retarded) scalar potential of J. Observe that it is derived from the (retarded) vector potential and is therefore intimately linked to it. Equation (62) is called the Lorenz condition, which we see always holds. Therefore, the divergence of A is not subject to independent choice.
With these definitions, a slight rearrangement of (59) gives

$$\mu_0\mathbf{J} = \frac{1}{c^2}\partial_t[\nabla\phi + \partial_t\mathbf{A}] + \nabla\times[\nabla\times\mathbf{A}]. \qquad (65)$$

The form of the first term motivates us to define

$$\mathbf{E} = -\nabla\phi - \partial_t\mathbf{A}, \qquad (66)$$

the negative sign having been chosen for consistency with established practice, and the form of the last term impels us to define

$$\mathbf{B} = \nabla\times\mathbf{A}. \qquad (67)$$
With the preceding definitions of the field vectors, a slight rearrangement of (65) produces the so-called Ampere–Maxwell equation (3c):
1
¶t E.
c2
If we take the divergence of equation (67) we get
 ´ B = m0 J +
(68)
 · B = 0,
(69)
which is Maxwell's equation (3a), because the divergence of any curl is zero. Computing the curl of both sides of equation (66) and using commutativity, we have
$$\nabla\times\mathbf{E} = -\partial_t\mathbf{B} \tag{70}$$
because the curl of any gradient is zero, thus eliminating the first term. This result is commonly called Faraday's law and is Maxwell's equation (3d). Finally, we take the divergence of equation (66), then use commutativity and equation (62) to get
$$\nabla\cdot\mathbf{E} = -\nabla^2\phi - \partial_t\left[-\frac{1}{c^2}\,\partial_t\phi\right] = -\Box_c^2\,\phi. \tag{71}$$
Multiplying both sides of (63) by $\Box_c^2$ and using the result in (71), we have
$$\nabla\cdot\mathbf{E} = \rho/\epsilon_0, \tag{72}$$
which is Maxwell’s equation (3b). Thus, we have defined the electromagnetic potentials in a
logically coherent way by applying the retarded Helmholtz theorem to the single source
quantity J . We have defined charge in terms of J assuming charge conservation. We have
defined the field variables E and B in terms of these potentials, and we have shown that all
four of Maxwell’s equations follow from the definitions.
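The vector-calculus facts the derivation leans on, namely the vanishing divergence of a curl (behind (69)), the vanishing curl of a gradient (behind (70)), and the identity $\nabla(\nabla\cdot\mathbf{W}) - \nabla\times(\nabla\times\mathbf{W}) = \nabla^2\mathbf{W}$ (the nonretarded skeleton of (59)), can be confirmed symbolically for arbitrary smooth fields. A sketch using sympy's vector module (the component names are illustrative):

```python
import sympy as sp
from sympy.vector import CoordSys3D, divergence, curl, gradient

N = CoordSys3D('N')
x, y, z = N.x, N.y, N.z

# Arbitrary smooth scalar and vector fields (names are illustrative)
phi = sp.Function('phi')(x, y, z)
Wx, Wy, Wz = [sp.Function(n)(x, y, z) for n in ('Wx', 'Wy', 'Wz')]
W = Wx*N.i + Wy*N.j + Wz*N.k

# div(curl W) = 0, used for (69); curl(grad phi) = 0, used for (70)
print(divergence(curl(W)))   # -> 0
print(curl(gradient(phi)))   # -> 0

# grad(div W) - curl(curl W) equals the componentwise Laplacian of W,
# the nonretarded form of the identity behind (59)
lhs = gradient(divergence(W)) - curl(curl(W))
lap = ((Wx.diff(x, 2) + Wx.diff(y, 2) + Wx.diff(z, 2))*N.i
       + (Wy.diff(x, 2) + Wy.diff(y, 2) + Wy.diff(z, 2))*N.j
       + (Wz.diff(x, 2) + Wz.diff(y, 2) + Wz.diff(z, 2))*N.k)
print(all(sp.simplify((lhs - lap).dot(e)) == 0 for e in (N.i, N.j, N.k)))  # -> True
```

All three checks reduce to the equality of mixed partial derivatives, which holds for the smooth fields assumed throughout the paper.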
Finally, it is worth mentioning that even though substitution of the gauge transformation equations (7) leaves the fields E and B in equations (66) and (67) invariant, there is no implication of gauge invariance, because the transformed values of A and φ no longer satisfy our definitions of the potentials, that is, equations (61) and (64).
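The field-level invariance itself is easy to confirm symbolically: substituting $\mathbf{A} \to \mathbf{A} + \nabla\chi$ and $\phi \to \phi - \partial_t\chi$ leaves E and B of (66) and (67) unchanged for any smooth $\chi$. A sketch with sympy (all function names are illustrative):

```python
import sympy as sp
from sympy.vector import CoordSys3D, curl, gradient

N = CoordSys3D('N')
x, y, z = N.x, N.y, N.z
t = sp.symbols('t')

# Potentials and an arbitrary gauge function (all names illustrative)
phi = sp.Function('phi')(x, y, z, t)
Ax, Ay, Az = [sp.Function(n)(x, y, z, t) for n in ('Ax', 'Ay', 'Az')]
A = Ax*N.i + Ay*N.j + Az*N.k
chi = sp.Function('chi')(x, y, z, t)

# Gauge-transformed potentials: A' = A + grad(chi), phi' = phi - d(chi)/dt
A2 = A + gradient(chi)
phi2 = phi - chi.diff(t)

# Fields from (66) and (67), before and after the transformation
E1 = -gradient(phi) - A.diff(t)
E2 = -gradient(phi2) - A2.diff(t)
B1, B2 = curl(A), curl(A2)

same_E = all(sp.simplify((E2 - E1).dot(e)) == 0 for e in (N.i, N.j, N.k))
same_B = all(sp.simplify((B2 - B1).dot(e)) == 0 for e in (N.i, N.j, N.k))
print(same_E, same_B)  # -> True True
```

The invariance again rests only on the commutation of partial derivatives; it says nothing about whether the transformed potentials still satisfy (61) and (64), which is the point made above.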
6. Conclusions
After reviewing the vagueness of the conventional approach to introducing the electromagnetic
field variables E and B and potentials A and f we developed the conventional (nonretarded)
Helmholtz decomposition using operator theory and used it to make the conventional
approach rigorous. Unfortunately, the ‘rigorization’ of that process reveals a number of
shortcomings. It assumes that B and E have been rigorously defined, which is not the case.
The rigorous development of this procedure shows also that the vector potential A has no
arbitrariness as customarily believed: it is completely and uniquely determined by the
magnetic field B; hence, it does not in fact possess the indeterminacy required for the gauge
transformation to be applied. In fact, our procedure shows that this method always requires $\nabla\cdot\mathbf{A} = 0$, which is usually described by saying that the 'Coulomb gauge has been selected.' The resulting scalar potential has no propagation delay, and the vector potential is a complicated function of the source variable J that is awkward to manipulate.
We again used operator theory to derive the more general retarded Helmholtz decomposition by only slightly modifying a standard vector identity and using it to motivate
rigorous definitions of both potentials. We then used these potentials to rigorously define the
field variables B and E . Thus, we have presented a logical foundation for electromagnetic
field theory, the only lacking element being the concept of force⁸.
Acknowledgments
The author would like to acknowledge useful discussions with Vladimir Onoochin. He also
acknowledges the considerable contributions of the reviewers of this paper. They have
materially added to its quality and readability.
ORCID iDs
Art Davis https://orcid.org/0000-0001-7877-1372
⁸ The concept of force is usually introduced by postulating the Lorentz force law as an independent assumption.