GENERAL RELATIVITY

The Origin of the Special and General Theories

Newtonian Mechanics and Inertial Frames of Reference

Newtonian particle mechanics is based on Newton's laws of motion. Newton's first law may be written: "Reference frames exist in which all free particles have zero acceleration". Here a free particle is defined to be one on which no net force acts. It is assumed that the question of whether or not a particle is free is absolute and does not depend on the choice of frame in which the motion is expressed. Originally Newton presumed the existence of a unique reference frame called "absolute space" with respect to which free particles would have zero acceleration. However the additional assumption of "absolute time" meant that any frame moving with uniform velocity with respect to absolute space would be dynamically equivalent to the latter. Accordingly in specifying Newton's laws we normally refer to a set of reference frames, each of which has uniform velocity with respect to any other. These are the so-called inertial frames - in practice, those moving with uniform velocity with respect to the "fixed" stars. (Why the latter should be involved is the subject of the so-called "Mach principle" - that distant matter in the universe determines by some means the inertia effects which we observe.)

Newton's second law may be written: "The acceleration of a particle with respect to any inertial frame is proportional to the force acting on it", or

$$\mathbf{F} = m\ddot{\mathbf{r}} \tag{1}$$

where the constant of proportionality m is called the inertial mass of the particle. We assume here that the force F is independent of the reference frame in which the motion of the particle is expressed. Experimentally equation (1) and its immediate consequences are found to be correct for particles moving with everyday speeds, i.e. small compared with the velocity of light.

Newton's third law is that the forces exerted on each other by two interacting particles are equal and opposite.
This proposition is comparatively easy to verify for simple mechanical systems, but it is untrue in electrodynamics, e.g. for the forces between two charged particles in relative motion.

The Origins of the Special Theory of Relativity

As mentioned above, inertial frames all have a constant relative velocity with respect to one another. Mathematically this is in accordance with the so-called Galilean transformation equations, which state in pre-relativistic physics the presumed relation between the spacetime coordinates of an event (x, y, z, t) in one inertial frame and the corresponding coordinates (x', y', z', t') in another. For this purpose we usually imagine the two frames of reference to be in "standard configuration", with the cartesian axes coinciding at t = t' = 0 and the relative motion being in the common x, x' direction, as shown in the figure.

[Figure 1. Inertial frames S and S' in "standard configuration": the "stationary" frame S (axes x, y, z) and the "moving" frame S' (axes x', y', z'), with S' moving at velocity v along the common x, x' direction.]

Since the distance between the origins is just vt, it is apparently "obvious" that the coordinate transformation equations are of the form

$$x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t \tag{2}$$

the fourth of these expressing the Newtonian belief that the time of an event is the same in all frames of reference, i.e. "time" is absolute. These are the Galilean transformation equations, and it is an elementary exercise now to verify that the components of acceleration of any object with respect to the two coordinate systems of reference are the same. So the acceleration in equation (1) will have the same value in any inertial frame. Since Newton's laws of motion are the basis for all of particle mechanics, later extended to continuum mechanics (rigid bodies etc.) it follows that the whole of classical mechanics "works" in all inertial frames of reference.
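The elementary exercise on the acceleration components can be sketched as follows, assuming the object's trajectory is written as (x(t), y(t), z(t)) in S and that v is constant. Since t' = t, derivatives with respect to t' and t coincide, and differentiating (2) once and then a second time gives:

```latex
\begin{align*}
\frac{dx'}{dt'} &= \frac{d}{dt}(x - vt) = \frac{dx}{dt} - v, &
\frac{dy'}{dt'} &= \frac{dy}{dt}, &
\frac{dz'}{dt'} &= \frac{dz}{dt}, \\
\frac{d^2x'}{dt'^2} &= \frac{d^2x}{dt^2}, &
\frac{d^2y'}{dt'^2} &= \frac{d^2y}{dt^2}, &
\frac{d^2z'}{dt'^2} &= \frac{d^2z}{dt^2}.
\end{align*}
```

The velocities along x differ by the constant v, which drops out on the second differentiation, so the two frames agree on all components of the acceleration.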
For example, if the momentum of a physical system is conserved in one inertial frame of reference (which will be the case if there are no external forces acting on it) then it follows that the same result will hold in any other inertial frame.

It is of interest to enquire if this principle also holds in other branches of physics, e.g. does electromagnetism apply equally in all inertial frames of reference? Electromagnetism is based on Maxwell's equations, which are sophisticated ways of expressing, in the form of differential equations, more familiar equations such as Coulomb's law, the Biot-Savart law, etc. It can be shown that, in the absence of matter, Maxwell's equations combine to give a wave equation, describing electromagnetic waves, the velocity of such waves being given by the formula $c = (\varepsilon_0 \mu_0)^{-1/2}$. When numerical values are inserted, the result is $c = 2.998 \times 10^{8}$ m s$^{-1}$, the same as the experimental value of the velocity of light. This was a great triumph for electromagnetic theory when it was first discovered, since it identified light as a form of electromagnetic wave, with wavelengths in a particular (visible) range.

However one drawback, which proved very troublesome for late 19th century physics, was this: any wave travelling at a speed of exactly c in one inertial frame must surely have a different speed in other inertial frames; we have after all from (2) that the velocity components dx/dt and dx'/dt' in the frames S and S' respectively must differ by v. So electromagnetism, unlike mechanics, appears to single out one inertial frame in particular - that in which the velocity of light is exactly c for light travelling in any direction. For understandable reasons, pre-relativity physicists assumed that this preferred frame was that in which the "medium" for light propagation was at rest, by analogy with the propagation of sound.
This all-pervading medium was referred to as the "ether" and assumed to be endowed with specific physical properties, even though in empty space it seemed to consist of nothing at all. The experimental search for the ether took place in the late 19th century, but ultimately proved to be a blind alley. Repeated attempts (see textbook accounts of the Michelson-Morley and other experiments) to measure the velocity of the earth with respect to the ether frame ended in failure, as did various ingenious attempts to account in an ad hoc way for those failures.

Einstein's Special Theory of Relativity

Einstein started from the belief that physics demonstrates an essential unity - there are no rigid boundaries between its various disciplines. So electromagnetism should be on the same footing as mechanics in the sense of being valid in all inertial frames of reference. He realised also that the Galilean transformation equations (2), the origin of the contradiction just discussed, are not self-evidently true, but are fallible assertions about the results of hypothetical physical experiments. It is possible therefore that they could be wrong, in spite of the fact that they seem so "obvious" and work so well in everyday situations.

In 1905, Einstein put forward the following two postulates:

1. The laws of physics should have the same form in all inertial frames of reference.
2. Electromagnetic signals in vacuo travel at speed c with respect to all inertial reference frames.

Note that the first postulate confirms the special role of inertial frames of reference in physics. It implies that any proposed law of physics should satisfy the theoretical test of "covariance"; i.e. when we have expressed it in mathematical form in one inertial frame, transforming the coordinates to a different inertial frame should leave the form of the equation unaltered.
The second postulate reveals the velocity of light as one of the few fundamental constants, on the same footing as Planck's constant and the electronic charge. But it also compels us to question the validity of the Galilean transformation equations, since they are clearly incompatible with the postulate that the velocity of light never changes. Recognising however that the Galilean equations are experimentally correct for small velocities, we search for a more general set of transformation equations which will (we guess) approximate to the Galilean equations in some appropriate limit.

The Lorentz transformation equations

These equations describe how the coordinates (x, y, z, t) of an event in one inertial frame S are connected to the corresponding coordinates (x', y', z', t') in another frame S'. In the simplest case we set up the two frames of reference as shown in figure 1, the "standard configuration" of S and S', with relative velocity v between the frames in the common x, x' direction. (The actual assignment of coordinates to events in a given inertial frame is straightforward, assuming standard measuring rods and clocks distributed throughout the frame. In particular, each clock at a distance r from the origin is synchronised with all others by setting it to read t = r/c when a light signal, sent out from the origin at t = 0, arrives at the clock. Then the time of any event is the time currently shown on the clock beside which the event occurs.)

The arguments which lead to the new "Lorentz transformation equations" include, as well as Einstein's postulates, some basic assumptions about the nature of space and time. We assume for example that there are no preferred directions in space, and that the choice of the origins of the coordinate systems is of no fundamental importance.
Taken together with the first postulate - that all inertial reference frames should be on the same footing - it can be shown that the transformation equations must be linear, and are restricted to the mathematical form

$$x' = \frac{x - vt}{\sqrt{1 - v^2/k^2}}, \qquad y' = y, \qquad z' = z, \qquad t' = \frac{t - vx/k^2}{\sqrt{1 - v^2/k^2}} \tag{3}$$

where k is a constant with the dimensions of velocity. At this point we note that the Galilean transformation equations (2) are obtained by making the erroneous assumption that k = ∞. It is Einstein's second postulate which determines the correct value of k. We imagine a light pulse emitted at t = 0 from the origin of the inertial frame S, and which at any subsequent time will be of spherical shape described by the equation

$$x^2 + y^2 + z^2 - c^2 t^2 = 0 \tag{4}$$

The second postulate requires that the pulse should be described in frame S' by an equation of exactly the same form, i.e.

$$x'^2 + y'^2 + z'^2 - c^2 t'^2 = 0. \tag{5}$$

It is necessary therefore that either equation should imply the other; and this is achieved by taking k = c, for then we have (as is easy to verify)

$$x'^2 + y'^2 + z'^2 - c^2 t'^2 = x^2 + y^2 + z^2 - c^2 t^2. \tag{6}$$

Hence the final result is

$$x' = \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \qquad y' = y, \qquad z' = z, \qquad t' = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}} \tag{7}$$

which are the Lorentz transformation equations. The Galilean transformation equations are an approximation to these, valid in the "non-relativistic limit" $v/c \ll 1$.

Some consequences of the Lorentz transformation equations

1. Reversal of the roles of the two frames. We took S to be the frame at rest and S' to be moving in the positive x direction with respect to S with speed v. But this initial choice was arbitrary, and we could just as well have taken S' to be at rest, with S moving in the negative x' direction with respect to S' with speed v. The equivalence of these two descriptions is shown mathematically by solving equations (7) for the coordinates x, y, z, t in terms of the primed coordinates.
We find

$$x = \frac{x' + vt'}{\sqrt{1 - v^2/c^2}}, \qquad y = y', \qquad z = z', \qquad t = \frac{t' + vx'/c^2}{\sqrt{1 - v^2/c^2}} \tag{8}$$

which are of the same mathematical form but with the primed and non-primed coordinates interchanged and v replaced by -v.

2. Time dilation. Consider two events which occur at the same place in one of the two frames - say the frame S'. Denoting the coordinates of the events $(x'_1, y'_1, z'_1, t'_1)$ and $(x'_2, y'_2, z'_2, t'_2)$ in frame S' (so that $x'_1 = x'_2$, $y'_1 = y'_2$, $z'_1 = z'_2$) and $(x_1, y_1, z_1, t_1)$ and $(x_2, y_2, z_2, t_2)$ in frame S, application of the fourth of the transformation equations (8) to both events followed by subtraction yields the result

$$t_2 - t_1 = \frac{(t'_2 - t'_1) + (v/c^2)(x'_2 - x'_1)}{\sqrt{1 - v^2/c^2}} = \frac{t'_2 - t'_1}{\sqrt{1 - v^2/c^2}} \tag{9}$$

This equation shows that the time interval between the two events does not take the same value in all frames of reference; the interval in S is extended by a factor $(1 - v^2/c^2)^{-1/2}$ compared with the interval in S'. We call this time dilation. The quantity $t'_2 - t'_1$ is called the proper time interval between the events, and is the minimum time interval which would be ascribed to the two events in any inertial reference frame.

This result is often expressed by the statement: "moving clocks go slow". This refers to a hypothetical attempt by observers in one frame of reference - the frame S in this case - to ascertain the rate of a standard clock which is at rest in the moving frame S'. To do this the observers in S note two particular ticks of the moving clock and use these as the two events whose space and time coordinates have just been described. They are bound to find, if the theory is correct, that the moving clock has registered a smaller time interval than is recorded by clocks in their own frame, so their conclusion is that, compared with their own clocks (which naturally are assumed to be "correct"), the moving clock is going slow.
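As a numerical illustration of equations (7) to (9), the following is a minimal Python sketch rather than anything taken from the original notes. The function names (gamma, lorentz, inverse_lorentz, interval), the sample event and the relative speed v = 0.6c are assumptions chosen here purely for illustration; only the x and t coordinates are handled, since y and z are unchanged.

```python
import math

C = 299_792_458.0  # speed of light in m/s (defined SI value)


def gamma(v, c=C):
    """Factor (1 - v^2/c^2)^(-1/2); only defined for |v| < c."""
    if abs(v) >= c:
        raise ValueError("relative speed must be smaller than c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz(x, t, v, c=C):
    """Equations (7): coordinates (x', t') in S' of an event (x, t) in S."""
    g = gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c**2)


def inverse_lorentz(xp, tp, v, c=C):
    """Equations (8): the same form as (7) with v replaced by -v."""
    return lorentz(xp, tp, -v, c)


def interval(x, t, c=C):
    """The combination x^2 - c^2 t^2 appearing in (6), with y and z omitted."""
    return x**2 - (c * t) ** 2


# Hypothetical example: S' moves at 0.6c relative to S.
v = 0.6 * C

# Two ticks of a clock at rest at the origin of S', one second apart in S'.
x1, t1 = inverse_lorentz(0.0, 0.0, v)
x2, t2 = inverse_lorentz(0.0, 1.0, v)
dilated = t2 - t1  # interval between the ticks as assigned in S, equation (9)

# An arbitrary event in S, transformed to S' and back again.
xp, tp = lorentz(1.0e9, 2.0, v)
x_back, t_back = inverse_lorentz(xp, tp, v)

print(f"gamma = {gamma(v):.6f}")
print(f"interval between ticks in S = {dilated:.6f} s")
print(f"x^2 - c^2 t^2 in S  = {interval(1.0e9, 2.0):.6e}")
print(f"x^2 - c^2 t^2 in S' = {interval(xp, tp):.6e}")
```

Implementing the inverse as the forward transformation with v replaced by -v mirrors the remark made after equation (8), and comparing the two printed values of x^2 - c^2 t^2 is a quick numerical check of (6).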
It is only apparently anomalous that the same conclusion, but in reverse, would be reached by observers in the frame S' who make observations on a clock at rest in S. Their conclusion, in other words, is that

$$t'_2 - t'_1 = \frac{t_2 - t_1}{\sqrt{1 - v^2/c^2}}, \tag{10}$$

the same equation as (9) but with the primed and unprimed coordinates interchanged. There is no contradiction between these equations, since the two physical situations are different. In the first case, the two events which are being used for purposes of comparing time intervals - the two ticks of the clock in S' - occur at the same place in frame S', and in the second, they are at the same place in frame S.

3. The velocity of light as a limiting velocity. The factor $(1 - v^2/c^2)^{-1/2}$ becomes imaginary for values of v exceeding c, so we see that no frame of reference can have a value of v in this range. Furthermore, since frames of reference may be constructed from material objects, it follows that no particle may have a velocity exceeding c either.

4. The Doppler effect. Like the Doppler effect in acoustics, this refers to the shift in frequency when a wave impinges on two observers who are in relative motion with respect to one another. In its simplest version, we will consider an electromagnetic wave approaching from x = +∞ two observers located at the origins of the two reference frames S and S'. Given that the two observers record frequencies $\nu$ and $\nu'$ respectively, it can be shown that these are related by the equation

$$\nu' = \nu \sqrt{\frac{1 + v/c}{1 - v/c}}. \tag{11}$$

If the source of the light is stationary in frame S, then the fact that the square root factor is greater than 1 indicates that the observer in S', for whom the light source is approaching, would note a frequency shift towards the blue end of the spectrum. Similarly when the source of light is receding from the observer S' (i.e.
light is approaching from -∞) a similar Doppler effect is produced, but the factor on the RHS is the inverse of that in (11), indicating a shift of spectral lines towards the red end of the spectrum.

Back to Newtonian Mechanics - Inertial Forces

In a frame moving with acceleration a with respect to an inertial frame, a free particle will have an acceleration -a which, if the first frame had been inertial, would have been caused by a force -ma. Although there is no real force acting on the particle, we can, if we like, attribute the acceleration -a to the action of a hypothetical "force" of just this value. If we are prepared to introduce in this way forces which have no physical origin, we may extend the validity of (1) to all frames of reference, on the understanding that F now includes fictitious or inertial forces as well as real forces if the frame is non-inertial. The characteristic feature of any inertial force is that the acceleration it "produces" is independent of the mass of the particle on which it acts.

Newtonian Gravitational Theory

Suppose two particles, of gravitational mass M and M', are situated at r and r' respectively. According to Newtonian gravitational theory, the force on M is

$$\mathbf{F} = -G M M' \, \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3} \tag{12}$$

where G is the gravitational constant ($G = 6.67 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$). Alternatively, the force may be expressed in terms of the gravitational potential function evaluated at r, i.e.

$$\mathbf{F} = -M \nabla V(\mathbf{r}) \tag{13}$$

where

$$V(\mathbf{r}) = -\frac{G M'}{|\mathbf{r} - \mathbf{r}'|} \tag{14}$$

If, instead of a point source M', we have a distribution of matter represented by a density $\rho$, then the generalisation of (14) is

$$V(\mathbf{r}) = -G \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d^3 r'. \tag{15}$$

By analogy with electrostatics where an inverse square law also applies, or otherwise, it is easily shown that

$$\nabla^2 V = 4 \pi G \rho. \tag{16}$$

In terms of V, the motion of the particle is governed by Newton's second law:

$$m \ddot{\mathbf{r}} = \mathbf{F} = -M \nabla V \tag{17}$$

Now there is no apparent reason why any connection should exist between the gravitational mass and inertial mass for a particle; the first quantity measures its capacity for gravitational interaction with other particles (just as its charge measures its capacity for electromagnetic interaction), while the second measures its resistance to change of velocity when acted on by a force. However experimentally (e.g. Eötvös' experiment) one finds that M = km where k is numerically the same, to very high accuracy, for all particles. The statement that the proportionality between M and m is exact, and not just correct to a very good approximation, is one form of the Principle of Equivalence. We may of course adjust the unit of gravitational mass so that k = 1; gravitational and inertial masses are then numerically, though not conceptually, the same. It then follows from (17) that

$$\ddot{\mathbf{r}} = -\nabla V \tag{18}$$

i.e. the acceleration of a particle in a gravitational field is independent of its mass.

The general success of Newton's laws of motion together with the gravitational equation (12) in explaining the motion of the heavenly bodies is well known, and their successful application to other branches of physics led to the belief that they must be universally applicable. However, the weaknesses of the Newtonian approach to gravity were eventually realised to be as follows:

1. The Newtonian field equations are time independent, implying action at a distance with information being propagated at infinite speed, contrary to the conclusions of special relativity.
2. The apparently fortuitous proportionality between inertial and gravitational mass already referred to. Newtonian theory provides no explanation for this.
3. There was discovered an unpredicted residual advance in the perihelion angle of the orbit of the planet Mercury, after allowing for the perturbations to its elliptic orbit due to the influence of the other planets.

These weaknesses provided the motivation for the development of the general theory of relativity, introduced by A Einstein in 1916. Founded on only a few postulates, it links the motion of free particles to the presence of large gravitating masses via the intrinsic geometry of spacetime, and serves as a base for the construction of theories of the universe.

The General Theory of Relativity

Einstein started from the supposition that the observed proportionality between inertial mass and gravitational mass is no accident. This proportionality implies that the acceleration of a particle in a gravitational field is independent of its mass, which is the same result as when the "force" supplying the acceleration is inertial. If this is to be no coincidence, it must be because the gravitational force is itself of the inertial variety, i.e. it manifests itself as "causing" the acceleration of test particles simply because of the peculiar choice of reference frame for observation of the motion. In a frame falling freely under gravity however, the acceleration of test particles is zero and the gravitational "force" disappears.

This simple reasoning causes us to adopt the following change of outlook. We retain Newton's laws and the inertial frames of reference, but with this change: that real forces do not now include that previously described as the force of gravitation. This means in practice that inertial frames are those which are freely falling but not rotating with respect to the fixed stars. Note that since the acceleration due to gravity varies from place to place, a given reference frame, defined by rigid axes, can be inertial only locally, strictly at a point but approximately so over a small region.
As pointed out above, Special Relativity (SR) ascribes particular importance to the inertial frames of reference. Since what we mean by "inertial frames" is not altered by the change in outlook described above (rather it is the particular reference frames which qualify for that description), it is natural to suppose that this special relationship should continue, i.e. that SR holds only in freely falling frames, those in which all gravitational effects disappear.

This supposition is in fact supported by experimental evidence. According to SR, light travels in straight lines with respect to any inertial frame of reference. In a non-inertial frame, e.g. one accelerating with respect to the first, it is easy to see (especially if the light is assumed to consist of a stream of photons) that light rays must be curved. Hence if a gravitational field is in some sense equivalent to an "inertial field", as Einstein's theory says it is*, light rays should be bent by the gravitational field of a large body. It is this which experiments confirm.

The fact that no reference frame is inertial everywhere means that there is now no reason why one set of reference frames should be favoured over all others, as was the assumption in SR, for the expression of the laws of physics. (Locally, there is of course a preferred set - those in which no gravitational effects appear and in which SR holds.) This is the motivation for the Principle of Covariance: "The laws of physics should be expressible in the same form (i.e. generally covariant) in all reference frames". The reshaping of the laws of physics into a generally covariant form therefore becomes one of the principal tasks of the general theory of relativity. Since tensor equations are true in all coordinate systems, it is natural that the mathematical expression of the theory should be in tensor form.

*The qualification "in some sense" is vital to the truth of this statement, and its frequent omission is a cause of great confusion.
Specifically, effects due to a gravitational field are equal to those caused by a mere acceleration only to first order, i.e. when second and higher derivatives of the metric tensor are ignored.

N C McGill