Derivation of Lorentz transformation from principles of statistical information theory

LORENTZ TRANSFORMATION FROM INFORMATION

Revised 2/28/2016

Thomas E. Butler
1 Woodruff Way, Columbia, New Jersey 07832
e-mail address: astraclara@embarqmail.com

The Lorentz transformation is derived from invariance of an information quantity related to statistical hypothesis testing on single particle system identification parameters. Invariance results from recognition of an equivalent observer as one who reaches the same conclusions as another when the same statistical methods are used. System identity is maintained by parameter values which minimize discrimination information, given by a Kullback-Leibler divergence, under a constraint of known shift in observation time. Deviation of discrimination information from the minimum value gives the difference in information between an observed system under a constraint shift and the expected system that maintains identity under the same constraint. System observation states are represented by parametric probability distributions of particle system measurement values.

Keywords: Lorentz transformation, special relativity, information, Kullback-Leibler discrimination information

PACS: 03.30.+p; 89.70.Cf; 01.55.+b

I. INTRODUCTION

The observation state of a free single particle system at a constant velocity is represented by a probability distribution of observation measurement values. A change in the observation time changes the observed system state to a state that retains the identity of the initial system. The system identity [1] [2] [3] is preserved by a final observation state that minimizes the number of bits of discrimination information, given by the Kullback-Leibler discrimination information [4], under the constraint of an observation time shift.
Excess discrimination information above the minimum value shows how close an observed system state is to a state which preserves system identity, and is an input to identity hypothesis testing methods. Different observers are expected to obtain the same system identity hypothesis conclusions when using the same statistical methods. The state associated with a different observer must then keep the excess discrimination invariant. Invariance of the excess discrimination information for a free single particle system gives the Lorentz transformation, not of random variables, which do not transform, but of the observed system state as represented by parameters of the probability distribution.

Derivations of the Lorentz transformation from within other areas of physics, for example quantum information and communication theory [5], make the transformation dependent on theories within those branches of physics. The derivation based on a discrimination invariant shares a similar dependence, but primarily on concepts of statistical information theory, independent of other branches of physics beyond representation of simple Galilean motion. Although the statistical discrimination derivation depends on specific probability models, the resultant transformation is the same for all similar probability models for small shifts in space and time parameters. The information based derivation also supports generalization of the Lorentz transformation to a concept of equivalence transformations of distributions of arbitrary quantities.

Another characteristic of the statistical information derivation of the Lorentz transformation is that there is no assumption of an invariant speed [6] [7]. Domain and range set properties of equivalence transformations imply existence of an upper bound on the magnitude of the velocity. Transformations that preserve excess discrimination invariance then require all boundary velocities to map to the boundary.

II. PARTICLE OBSERVATION STATE

Observations of the motion of a particle in an experiment produce a set of times of observation and associated particle positions. In a repeatable experiment, the information contained in an ensemble of sets of position-time value pairs can be concisely represented by a probability distribution which normalizes over both observation time and position random variables, as shown in the random sample of Figure 1. Each experiment is independent, so the probability distribution represents the expected distribution of a single experiment. No special interpretations are required beyond the probability distribution as a simple representation of measurement data, given that a measurement exists. The probability distribution is the observation state of the particle system.

[Figure 1: 100-point random sample from a normal density f(x,t); velocity = 8.325, standard deviation (x,t) = (16.667, 2). Horizontal axis: position observation x; vertical axis: observation time t.]

An observation time and position density that represents measurement data is applicable to many variants of a position versus time experiment. A single experiment can consist of dynamic observations in which the particle time and position are recorded as the particle moves. An alternative experiment might consist of a single time-position measurement, with apparent motion arising from an ensemble of repeated experiments with different observation times. The same density represents measurements for both interpretations.

A. Example probability density

The observed particle state is a probability density in position and time random variables which provides a realistic summary of observation data from a controlled experiment. This is made evident by a plot of data generated by a simulated random sample from a normal time and space probability density, shown in Figure 1.
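A sample of the kind plotted in Figure 1 can be imitated with a few lines of NumPy. The sketch below is illustrative only. It assumes zero means for both variables, takes the position standard deviation as 50/3 (the value that rounds to the quoted 16.667), and uses an arbitrary random seed; none of these choices is stated in the paper.

```python
import numpy as np

# Values quoted for Figure 1. sigma_x = 50/3 is an assumed exact value behind
# the rounded 16.667; the zero means and the seed are also assumptions.
velocity = 8.325
sigma_t = 2.0
sigma_x = 50.0 / 3.0
mean_x, mean_t = 0.0, 0.0

# Position variance splits into a conditional part and a motion part:
# sigma_x^2 = cond_var + sigma_t^2 * velocity^2
cond_var = sigma_x**2 - (sigma_t * velocity) ** 2
cond_std = np.sqrt(cond_var)

# Covariance matrix of the pair (x, t) for uniform motion at the given velocity.
cov = np.array([
    [sigma_x**2, sigma_t**2 * velocity],
    [sigma_t**2 * velocity, sigma_t**2],
])

rng = np.random.default_rng(seed=0)
sample = rng.multivariate_normal([mean_x, mean_t], cov, size=100)
x, t = sample[:, 0], sample[:, 1]

# A least-squares line of x against t estimates the velocity from the sample.
slope = np.polyfit(t, x, 1)[0]

print(round(cond_std, 3))  # 0.745, the standard deviation of position given time
print(slope)               # close to 8.325, up to sampling noise
```

A scatter plot of t against x from this sample has the elongated shape of Figure 1: the overall spread in x reflects the range of observed positions, while the spread about the fitted line reflects the much smaller conditional standard deviation.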
The probability density of Figure 1, which is normalized over both observation time and position random variables, generates a random sample that appears to be an ordinary collection of observation data from a repeated constant velocity experiment of finite duration, of the kind that might have come from a physics class lab. The same result could also be produced by a single experiment with a large number of observations. In either case the measurement position standard deviation of 16.667 is not an error, but primarily the width of the range of the main concentration of position observations. Position error is given by the standard deviation of position conditional on time, which is 0.745. Similarly, the measurement time standard deviation of 2 is not an error but primarily the extent of the range of observed times. Peak frequency of observations increases without limit for a dynamic interpretation of the model, as the number of observations rises, without any effect on the probability density. In this respect the normal density exhibits a classical physics behavior of independence from arbitrarily small time intervals.

B. A model of uniform motion

The example probability density of Figure 1 is a two dimensional normal probability density in observation position x and time t random variables, used as a model of experimental data from non-quantum mechanical uniform particle motion. Extension to a normal density in four dimensions models a three dimensional velocity. The four dimensional normal density f is

\[ f(R : \bar{R}, \Sigma) = \frac{1}{\sqrt{\det(2\pi\Sigma)}}\; e^{-\frac{1}{2}(R-\bar{R})^T \Sigma^{-1} (R-\bar{R})} \tag{1} \]

with random variable vector $R = (x\ y\ z\ t)^T$ and with parameters given by the mean vector $\bar{R} = (\bar{x}\ \bar{y}\ \bar{z}\ \bar{t})^T$ and the covariance matrix $\Sigma$. Define the vector $V$ by $V = \sigma_t^{-2}\,(\Sigma_{xt}\ \Sigma_{yt}\ \Sigma_{zt})^T$.
Then the position-time block matrix form of $\Sigma$ is

\[ \Sigma = \begin{pmatrix} C + \sigma_t^2 V V^T & \sigma_t^2 V \\ \sigma_t^2 V^T & \sigma_t^2 \end{pmatrix} \tag{2} \]

with $C$ the 3x3 covariance matrix of position given time and $\sigma_t^2$ the observation time variance. Define the vector of random variable observation position components by $X = (x\ y\ z)^T$ with mean $\bar{X} = (\bar{x}\ \bar{y}\ \bar{z})^T$. The density f is the product of the marginal probability density m in the observation time random variable and the conditional probability density h in observed position given observed time, which for (1) are defined by

\[ h(\varepsilon : C) = \frac{1}{\sqrt{\det(2\pi C)}}\; e^{-\frac{1}{2}\varepsilon^T C^{-1} \varepsilon}, \qquad m(\delta t : \sigma_t) = \frac{1}{\sqrt{2\pi}\,\sigma_t}\; e^{-\frac{\delta t^2}{2\sigma_t^2}} \tag{3} \]

with $\varepsilon = \delta X - V\,\delta t$, $\delta X = X - \bar{X}$, $\delta t = t - \bar{t}$, and where

\[ f(R : \bar{R}, \Sigma) = m(\delta t : \sigma_t)\, h(\varepsilon : C). \tag{4} \]

Although the position standard deviation of (4) is the width of the range plus position error, and is not itself a position error, the diagonal components of the position covariance matrix $C$ are an indication of position error. The four dimensional probability density with parameters $\bar{X}$, $\bar{t}$, $V$, $C$ and $\sigma_t$ is selected as a model of single free particle motion.

III. PARAMETERS

Parameters of the probability distribution are a representation of the particle observation state. A theory provides the general form of the distribution as a collection of possible allowed distributions, with parameter values to select a particular distribution. All inputs allowed by the applicable theory which select a particular $f(R : \bar{R}, \Sigma)$ from the set of all possible densities supported by the theory are parameters. Parameters are constructed such that there is a one to one relationship between a parameter value and a probability distribution. Initial, boundary and environmental conditions of an experiment are parameters, as are numerical parameters. Statistical methods are used to estimate parameter values from the output data of an experiment.

Parameters of probability distributions are a necessary and common component of physics. Pressure, temperature, entropy and density are thermodynamic parameters associated with probability distributions of the underlying statistical mechanics of composite systems with very many degrees of freedom. Some parameters are designed to be controlled by an experiment, and others are set free to vary in response to the controlled parameters. Thermodynamics provides the similar example that, for the same ideal gas equation of state, an isentropic gas produces a different relation between pressure and density than an isothermal gas. Parameters can be classified as either constraints, controlled by an experiment, or as unconstrained, left to be determined as parameters of a distribution of measurements after an experiment is performed.

A. Parameter classification

All parameter values are extracted from experimental data. Prior to execution of an experiment the design of the experiment can select planned values of some parameters, based on prior theoretical considerations independent of the outputs of the experiment. Mean and extent of the range of the observation time are two parameters which might be set by the design of an experiment, and confirmed by experimental data, since they can be regarded as under the control of the observer and independent of the object particle of the experiment. Parameters are classified as constraint or responsive in a given experiment:

1. Constraint Parameters

Constraint parameters are parameters with planned values input to an experiment which exhibit high levels of repeatability and are independent of the subject of an experiment. The value of a constraint associated with a constraint parameter value is given by a subset of the set of allowed parameters to which the constraint confines distribution selection.

2. Responsive Parameters

Responsive parameters have values which are not planned inputs but are outputs from the execution of an experiment, and may depend on the subject of the experiment.

Let $P = (\bar{X}\ V\ \bar{t}\ \sigma_t\ C)$ represent the parameters of the probability density so that (4) is condensed to

\[ f(R : P) = m(\delta t : \sigma_t)\, h(\varepsilon : C). \tag{5} \]

Some parameters or transformations of parameters might be selected to be either constraints or responsive, for different experiment designs. Other parameters can have intrinsic properties that effectively classify the parameters as constraint or responsive parameters. Time parameters have such intrinsic properties.

4. Constraint Sets

Let $\mathcal{Q} = \{\forall P\}$ be the set of all probability distribution parameters allowed by a theory. An experiment design defines constraints which restrict parameter values to a subset $\mathcal{C} \subset \mathcal{Q}$. The value of a constraint is given by the subset $\mathcal{C}$. Denote the set of all valid constraint sets as $\Lambda(\mathcal{Q})$, so $\mathcal{C} \in \Lambda(\mathcal{Q})$. Define the set of constraint sets $\Upsilon(\mathcal{C})$ relative to a set $\mathcal{C}$ to be the set of all subsets of $\mathcal{C}$ which are also valid constraint sets, so that $\Upsilon(\mathcal{C}) = \Lambda(\mathcal{Q}) \cap \mathrm{P}(\mathcal{C})$, where $\mathrm{P}(\mathcal{C})$ is the set of all subsets of $\mathcal{C}$.

All constraints associated with a particular type of parameter constraint are contained in an adjustable constraint relative to a constraint $\mathcal{D}$, which is defined to be a set $A(\mathcal{D})$,

\[ A(\mathcal{D}) = \{\mathcal{C}_1, \mathcal{C}_2, \mathcal{C}_3, \ldots\}, \tag{6} \]

with elements $\mathcal{C}_i \subset \mathcal{D}$ that satisfy

\[ \mathcal{C}_i \cap \mathcal{C}_j = \emptyset \tag{7} \]

for distinct elements $\mathcal{C}_i \neq \mathcal{C}_j$, and where the union of all elements satisfies

\[ \mathcal{C}_1 \cup \mathcal{C}_2 \cup \mathcal{C}_3 \cup \ldots = \mathcal{D}. \tag{8} \]

Properties (7) and (8) of an adjustable constraint imply that every element of $\mathcal{D}$ is contained within one and only one element of $A(\mathcal{D})$. Then there exists a function $M(P : A(\mathcal{D}))$ for each $P \in \mathcal{D}$ which returns the constraint set element of $A(\mathcal{D})$ that contains $P$. For example, if one of the co-ordinates of $P$ is the mean observation time $\bar{t}$ parameter constraint, then the index is $i = \bar{t}$ and $M(P : A(\mathcal{D})) = \mathcal{C}_{\bar{t}}$. If a type of constraint is defined by known values of parameter components denoted by $\zeta$, then $\zeta$ can be used as the index and $\mathcal{C}(\zeta : A(\mathcal{D}))$ is the corresponding constraint set element of the adjustable constraint set.

Let $\mathcal{C}_i \in A_c(\mathcal{D})$ be an element of an adjustable constraint and $\mathcal{B}_i \in A_b(\mathcal{D})$ be an element of a different adjustable constraint. Then the elements of the set $A_{(b,c)}(\mathcal{D}) = \{\mathcal{B}_i \cap \mathcal{C}_j\ \forall\, \mathcal{B}_i \neq \mathcal{C}_j\}$ satisfy (7) and (8), so that $A_{(b,c)}(\mathcal{D})$ is an adjustable constraint relative to $\mathcal{D}$.

B. Confirmation Sets

Verification of a theory is rarely accomplished by a single type of experiment represented by the constraint choices in one adjustable constraint set. More typical is a variety of types of experiments to provide a stronger verification of a theory. Each type of experiment corresponds to a different adjustable constraint set. Define a confirmation set of adjustable constraints to be the set of all adjustable constraints used in experiments to verify a theory. Designate $\tilde{A}$ given by

\[ \tilde{A} = \{A_1(\mathcal{D}), \ldots, A_n(\mathcal{D}), \ldots\} \tag{9} \]

as the confirmation set for the single particle theory.

C. The Time Postulate

In an experiment set up to measure the particle position about a specified time, the mean observed time and the extent of the range of observed times give all parameters (mean and standard deviation) of the normal marginal time density. Since all of the time measurements are under control of the observer, all of the marginal observation time density parameters are constraint parameters. Set up a different experiment in which a clock is triggered to measure the time the particle passes a detector at a known position. This experiment can be used to measure mean velocity, and provides an example where the mean position is a constraint parameter and the mean time is not a constraint.
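The two experiment types just described correspond to two different adjustable constraints, and the partition structure required by (7) and (8) can be made concrete on a small finite example. The sketch below is only an illustration: the finite grid standing in for the parameter set, the helper names `adjustable_constraint`, `is_adjustable_constraint` and `M`, and the reading of $A_{(b,c)}$ as the set of non-empty intersections are all assumptions introduced here, not definitions taken from the paper.

```python
from itertools import product

# Hypothetical finite stand-in for the parameter set D: each parameter is a
# (mean_position, mean_time) pair taken from a small grid.
D = frozenset(product([0.0, 1.0, 2.0], [10.0, 20.0]))

def adjustable_constraint(params, key):
    """Group parameters into constraint sets that share the same value of key(P)."""
    blocks = {}
    for p in params:
        blocks.setdefault(key(p), set()).add(p)
    return {index: frozenset(block) for index, block in blocks.items()}

def is_adjustable_constraint(blocks, params):
    """Check (7) and (8): blocks are pairwise disjoint and their union is D."""
    members = list(blocks.values())
    disjoint = all(a.isdisjoint(b)
                   for i, a in enumerate(members) for b in members[i + 1:])
    covers = frozenset().union(*members) == params
    return disjoint and covers

def M(p, blocks):
    """Return the unique constraint set of the adjustable constraint containing p."""
    for block in blocks.values():
        if p in block:
            return block
    raise ValueError("parameter is not an element of D")

# Experiment type 1: mean time is the constraint. Type 2: mean position is.
A_time = adjustable_constraint(D, key=lambda p: p[1])
A_pos = adjustable_constraint(D, key=lambda p: p[0])

# Combined adjustable constraint built from the non-empty intersections.
A_both = {(i, j): b & c
          for (i, b), (j, c) in product(A_pos.items(), A_time.items())
          if b & c}

print(is_adjustable_constraint(A_time, D))
print(is_adjustable_constraint(A_pos, D))
print(is_adjustable_constraint(A_both, D))
print(sorted(M((1.0, 20.0), A_time)))
```

In this toy setting each of the three collections passes the partition check, and `M` returns the block of all parameters sharing the mean time of the queried parameter, which mirrors the role of $M(P : A(\mathcal{D}))$ in the text.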
A verifiable theory with time and spatial components includes adjustable constraints corresponding to both types of experiments, which are elements of the confirmation set of the theory. A prominent characteristic of observation time distributions is the tendency, whenever possible, to regard the time distribution parameters as experimental constraints. This tendency is prevalent because time values are generally considered to be under the control of the observer to the maximum extent possible. These considerations suggest the time postulate, which is

1. Time Postulate

Every parameter that selects a marginal distribution of an observation time independent of any other random observation variables is a control parameter, and thus an experimental constraint, in at least one Adjustable Constraint set element of a Confirmation Set of a theory with time dependence, and it is this quality of the necessity of control that distinguishes time from all other quantities.

To the extent that every mechanism is repeatable and controlled, all mechanisms are clocks. Under the time postulate, parameters which describe a clock's data must all be under the control of an observer in at least one verification experiment, and are therefore constraint parameters for that experiment. Each confirmation set of a time dependent theory must contain at least one adjustable constraint in which all parameters of the marginal observation time density are constraint parameters. The design of the experiment must select one of these time parameter adjustable constraints as a primary constraint, to serve as the focus of statistical decisions. Only the primary constraint is used to determine the excess discrimination invariant. This is because only a constrained time parameter marginal density provides potential operator selection of all possible time measurement scenarios that are under control of the observer with certainty.
Parameters in other adjustable constraints in a confirmation set are incomplete in that they do not provide certainty of access to all possible measurement times, for if they did the time postulate would require that they also be time, potentially as a parameter of a clock mechanism.

IV. CHANGE OF OBSERVATION STATE

A. Parameter Shifts

A single particle probability density in observation time and position random variables models data from an experiment of finite duration. Motion can continue outside the range of data from an experiment, which presents the opportunity for observation data collected by additional experiments. The density for each experiment is represented by different parameter values. Particle observation state is given by the probability density which represents data from an experiment. Different densities represent different experiments, and different particle observation states. Motion of a particle through different experiments occurs with a change of particle observation states. Transitions between observation states of the particle are represented as changes in parameters of the probability density.

B. Discrimination Information

A sequence of experiments which measure position and observation time of a single particle produces a sequence of observation data and a corresponding sequence of observation data probability density parameters. Parameter values which are far from values expected for the uniform motion of a free particle are not representative of the uniform motion of the originally observed free particle. When trajectory information is the only available particle identity information, only parameters along an initial trajectory maintain the identity of the particle system. Let $P_0$ be the initial particle system observation state parameter, and $P_1$ the observation state of a subsequent experiment with new constraint parameter values. The discrimination information available in the density in favor of selecting $P_1$ over $P_0$ from analysis of the data is

\[ I(P_1, P_0) = \int_{\forall R} dx\,dy\,dz\,dt\; f(R : P_1) \ln\frac{f(R : P_1)}{f(R : P_0)} = \int_{-\infty}^{\infty} dt\; m(\delta t_1 : \sigma_{t1}) \ln\frac{m(\delta t_1 : \sigma_{t1})}{m(\delta t_0 : \sigma_{t0})} + \int_{\forall R} dx\,dy\,dz\,dt\; m(\delta t_1 : \sigma_{t1})\, h(\varepsilon_1 : C_1) \ln\frac{h(\varepsilon_1 : C_1)}{h(\varepsilon_0 : C_0)} \tag{10} \]

which evaluates to

\[ I(P_1, P_0) = \frac{1}{2}\frac{(\Delta t)^2}{\sigma_{t0}^2} - \ln\frac{\sigma_{t1}}{\sigma_{t0}} - \frac{1}{2}\sigma_{t1}^2\,\Delta\!\left(\sigma_t^{-2}\right) - \frac{1}{2}\ln\frac{\det C_1}{\det C_0} - \frac{1}{2}\operatorname{Tr}\!\left(C_1\,\Delta\!\left(C^{-1}\right)\right) + \frac{1}{2}\sigma_{t1}^2\,\Delta V^T C_0^{-1} \Delta V + \frac{1}{2}\,\delta\Pi^T C_0^{-1}\,\delta\Pi \tag{11} \]

where $\Delta$ indicates a component of $P_1 - P_0$, so that $\Delta t = \bar{t}_1 - \bar{t}_0$ and $\Delta X = \bar{X}_1 - \bar{X}_0$, and $\delta\Pi = \Delta X - V_0\,\Delta t$.

C. Preservation of identity

An initial observation state parameter $P_0$ shifts to a parameter $P_1$ for a single particle system when controlled constraint parameters, such as the mean observation time component of $P_1$, shift to planned values. Adjustable constraint sets provide structures for constraints which can be used by all observers. A particular adjustable constraint set defines possible constraint values for a particular type of constraint associated with an experiment.

Parameters $R_1$ which preserve identity of a system with observation state parameter $P_0$ under a shift in constraint value must minimize discrimination information and so must satisfy

\[ I(R_1, P_0) = \min_{p \,\in\, \mathcal{C}(\zeta_1 : A_\zeta(\mathcal{Q}))} I(p, P_0) \tag{12} \]

where $\mathcal{C}(\zeta_1 : A_\zeta(\mathcal{Q})) = M(R_1 : A_\zeta(\mathcal{Q}))$. The time postulate requires that $\bar{t}_1$ and $\sigma_{t1}$ be constraint parameters. Position error $C_1$ in pre-quantum theory is under the control of the observer, so $C_1$ is also a constraint parameter. With $\zeta_1 = (\bar{t}_1\ \sigma_{t1}\ C_1)$ as constraint parameter, only the parameters $\bar{X}_1$ and $V_1$ remain to be adjusted in (11) to satisfy (12). The identity preservation parameters which solve (12) are

\[ \bar{X}_1 = \bar{X}_0 + V_0\,\Delta t, \qquad V_1 = V_0 \tag{13} \]

with $\bar{t}_1$, $\sigma_{t1}$ and $C_1$ as constraint parameters with known values. Thus the parameters that preserve identity continue motion along a trajectory with the same initial velocity on the same line.

V. EQUIVALENT OBSERVERS

A. Experiment Verification

Acceptance of the outcome of an experiment demands an independent method of verification.
The demand can be met in traditional physics, where there are implicit assumptions of independent, identifiable characteristics of the physical system outside the scope of a theory. One example is the Kepler-Newton theory of orbital motion of the planet Mars about the sun, where deviations from the orbit can be observed since the planet is identifiable by characteristics, such as surface features and diameter, outside the scope of the orbital theory. When the theory contains all of the identity information of the physical system, verification cannot be based on system characteristics independent of the theory. An example of such a theory is the bosonic theory of photons, which are indistinguishable particles fully identified by symmetric wave functions, where no single photon experiment can be repeated using exactly the same photon [8].

Exclusion of verification based on independent system characteristics elevates the significance of verification restricted to statistical analysis of the repeatability which underlies the probability densities of the theory. Repeatability can be verified by observers outside the scope of the theory. Thus by keeping the theory incomplete, in the sense that it does not bring all observers within the scope of the theory's physical descriptions, the requirement of verification with separate system identification moves to a requirement of independent observer verification, especially when the entire physical system description is contained within the theory. Verification of a bosonic photon experimental result never involves preparation of the same photon by a different observer, but instead repetition of the same experimental conditions by an independent observer. Any theory that fully contains the identity of a system cannot support validation based on independent characteristics of the system.

In the theory of special relativity independent observers are represented by Lorentz transformations of space-time co-ordinates.
In one approach Lorentz transformations arise from quantum considerations [5]. In quantum theory the wave function becomes the object of the transformation through invariant wave equations, associated with corresponding transformations of observables. An information theoretic approach to defining the reference frame of equivalent observers has the opportunity to generalize the transformation to another observer into a transformation of the probability distribution for any type of random variable and theory, without limitation to space and time quantities.

Define an equivalence transformation to be a generalized transformation of the distribution, and therefore of the distribution parameters, to the frame of an equivalent observer. An identity transformation of the probability distribution of a theory defines the local observer and is also an equivalence transformation. A verification capability requires that at least one other observer exist that is not the local observer. A complete and independently verifiable scientific theory has the property of:

1. Independent Observer Scrutiny

Every independently verifiable theory must support at least one independent observer other than the local observer, and must necessarily have at least one equivalence transformation that is not the identity transformation.

B. Equivalence Transformations

An equivalent observer must reach the same conclusions from data available to the observer as the local observer of an experiment reaches. Each equivalent observer then has the same information to accept or reject the conclusion of identity preservation under a shift in constraints. The design of the experiment makes the transformation of all initial parameters available to all observers, and execution of the experiment results in estimates of final parameter values for all observers.
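The quantities that the observers compare can be checked numerically for the normal model. The sketch below is a hedged illustration, not the author's procedure. It evaluates (11) term by term, reading the term in $C_1 \Delta C^{-1}$ as a trace (an assumption about the notation), compares the result with the textbook Kullback-Leibler divergence between the two four dimensional normals built from the block covariance (2), and then compares the excess of $I(P_1, P_0)$ over its value at the identity preserving parameters (13) with the quadratic form that the paper states in (16). All numerical parameter values and function names are arbitrary choices made for this example.

```python
import numpy as np

def block_covariance(C, V, var_t):
    """4x4 covariance of (x, y, z, t) in the block form of equation (2)."""
    top = np.hstack([C + var_t * np.outer(V, V), (var_t * V)[:, None]])
    bottom = np.append(var_t * V, var_t)
    return np.vstack([top, bottom])

def kl_normal(mean1, cov1, mean0, cov0):
    """Textbook KL divergence KL(N1 || N0) between multivariate normals, in nats."""
    d = mean1 - mean0
    inv0 = np.linalg.inv(cov0)
    logdet1 = np.linalg.slogdet(cov1)[1]
    logdet0 = np.linalg.slogdet(cov0)[1]
    return 0.5 * (np.trace(inv0 @ cov1) + d @ inv0 @ d - len(d) + logdet0 - logdet1)

def discrimination(P1, P0):
    """I(P1, P0) from equation (11); P = (mean X, V, mean t, sigma_t, C)."""
    X1, V1, t1, s1, C1 = P1
    X0, V0, t0, s0, C0 = P0
    dt, dV = t1 - t0, V1 - V0
    dPi = (X1 - X0) - V0 * dt
    inv0 = np.linalg.inv(C0)
    time_terms = (0.5 * dt**2 / s0**2 - np.log(s1 / s0)
                  - 0.5 * s1**2 * (s1**-2 - s0**-2))
    cov_terms = (-0.5 * np.log(np.linalg.det(C1) / np.linalg.det(C0))
                 - 0.5 * np.trace(C1 @ (np.linalg.inv(C1) - inv0)))
    motion_terms = 0.5 * s1**2 * (dV @ inv0 @ dV) + 0.5 * (dPi @ inv0 @ dPi)
    return time_terms + cov_terms + motion_terms

def joint_kl(P1, P0):
    """Same quantity computed directly from the two 4-dimensional normals."""
    X1, V1, t1, s1, C1 = P1
    X0, V0, t0, s0, C0 = P0
    return kl_normal(np.append(X1, t1), block_covariance(C1, V1, s1**2),
                     np.append(X0, t0), block_covariance(C0, V0, s0**2))

# Arbitrary illustrative parameter values (not taken from the paper).
P0 = (np.zeros(3), np.array([1.0, 0.5, -0.2]), 0.0, 2.0, np.diag([0.5, 0.6, 0.7]))
P1 = (np.array([3.1, 1.4, -0.7]), np.array([1.1, 0.4, -0.2]), 3.0, 2.5,
      np.diag([0.4, 0.6, 0.9]))
X0, V0, t0, s0, C0 = P0
X1, V1, t1, s1, C1 = P1

# Identity preserving parameters (13): same constraints as P1, free motion from P0.
R1 = (X0 + V0 * (t1 - t0), V0, t1, s1, C1)

excess = discrimination(P1, P0) - discrimination(R1, P0)
dPi = (X1 - X0) - V0 * (t1 - t0)
dV = V1 - V0
inv0 = np.linalg.inv(C0)
quadratic_form = 0.5 * (dPi @ inv0 @ dPi) + 0.5 * s1**2 * (dV @ inv0 @ dV)

print(np.isclose(discrimination(P1, P0), joint_kl(P1, P0)))
print(np.isclose(excess, quadratic_form))
```

Because the time and covariance terms of (11) depend only on the constraint parameters, they cancel in the difference, which is why the excess reduces to the two quadratic terms in $\delta\Pi$ and $\Delta V$.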
An equivalence transformation $E$ transforms parameters $P$ in use by an observer into parameters $P' = EP$ used by an equivalent observer. If $E$ is the equivalent observer transformation, the final discrimination information determined by the equivalent observer is $I(EP_1, EP_0)$, but the same observer would determine that discrimination as $I(ER_1, EP_0)$ if identity preservation parameters were observed as expected by the theory, where $R_1$ is given by (13). The larger the magnitude of the difference between the two discrimination values, the more likely is a rejection of the conclusion that identity is preserved by the tested theory, while a very small magnitude difference supports the conclusion. The difference is proportional to the number of bits input into statistical decision methods and for the local observer is equal to

\[ K(P_1, P_0 : A_\zeta) = I(P_1, P_0) - I(R_1, P_0) = I(P_1, P_0) - \min_{p \,\in\, M(P_1 : A_\zeta(\mathcal{Q}))} I(p, P_0) \tag{14} \]

where the $R_1$ constraint parameter values $\zeta_1 = (\bar{t}_1\ \sigma_{t1}\ C_1)$ are also component values in $P_1$. Since $K$ is input to statistical methods which determine rejection or acceptance of identity preservation, $K$ is invariant:

\[ K(P_1, P_0 : A_\zeta) = K(EP_1, EP_0 : A_\zeta). \tag{15} \]

An equivalent observer transformation operates on all parameters $P \in \mathcal{D}$ in a subset $\mathcal{D} \subset \mathcal{Q}$ of the set of parameters $\mathcal{Q}$. As every equivalent observer is also an equivalent observer to any other equivalent observer, the inverse $E^{-1}P$ exists for every $P$ in the range of $E$ and is an equivalence transformation. An equivalence transformation defines observers in the context of a theory. Equivalence transformations depend on properties of the Primary Constraint of a Confirmation Set. In the Primary Constraint all parameters of the marginal time distribution are constraint parameters, and the equivalence transformations for that adjustable constraint define the transformations which must be associated with all remaining adjustable constraints in the Confirmation Set. Equivalence transformations are defined by properties, such as invariance, of each transformation. Properties of collections of transformations, such as the verifiable theory property of independent observer scrutiny, also contribute to the definition of equivalence transformations.

C. Equivalent Observer Collections

Define an equivalent observer collection $\mathfrak{C}$ as a set containing a set $\mathcal{E} = \{E_1, \ldots, E_n, \ldots\}$ of equivalence transformations using primary adjustable constraint $A_\zeta$, which is also a member of $\mathfrak{C}$, and a parameter domain $\mathcal{D} \subseteq \mathcal{Q}$ and range $\mathcal{R} \subseteq \mathcal{Q}$ common to all transformations within $\mathcal{E}$,

\[ \mathfrak{C} = \{\mathcal{E}, \mathcal{D}, \mathcal{R}, A_\zeta\}. \]

The properties of an equivalent observer collection $\mathfrak{C}$ required of an equivalence transformation are:

1. Containment Structure

Every equivalence transformation $E$ is an element of a set $\mathcal{E}$ contained in some equivalent observer collection $\mathfrak{C}$, $E \in \mathcal{E} \in \mathfrak{C}$. A theory may produce more than one equivalent observer collection.

2. Time Postulate Primary Constraint

All equivalent observers of an experiment on a probability model of a physical system, with each observer represented by an element $E \in \mathcal{E} \in \mathfrak{C}$, choose constraints from an adjustable constraint $A_\zeta$ which must be the primary constraint of a confirmation set of adjustable constraints. The time postulate implies that all parameters of the model marginal probability distribution of observation time within $A_\zeta$ are constraint parameters. In the invariant equation (15) the same adjustable constraint $A_\zeta$ is used by all equivalent observers of the experiment. Equivalence transformations are dependent on the primary adjustable constraint.

3. Invariant Excess Identity Discrimination

Every $E \in \mathcal{E}$ preserves the identity discrepancy $K(P_1, P_0 : A_\zeta)$. Let $\mathcal{S} = \{B_1, \ldots, B_n, \ldots\}$ be the set of all transformations which preserve $K$. Then $\forall P_i \in \mathcal{D}$, $K(P_1, P_0 : A_\zeta) = K(B_i P_1, B_i P_0 : A_\zeta)$, and $\mathcal{E} \subseteq \mathcal{S}$.

4. Independent Observer Scrutiny

Every $\mathcal{E}$ contains at least one $E$ that is not the identity.

5. Transitivity

If $\mathfrak{C}' = \{\mathcal{E}', \mathcal{D}, \mathcal{R}, A_\zeta\}$ and $\mathfrak{C} = \{\mathcal{E}, \mathcal{R}, \mathcal{H}, A_\zeta\}$ are equivalent observer collections, then $\mathfrak{C}'' = \{\mathcal{E}'', \mathcal{D}, \mathcal{H}, A_\zeta\}$ must also be an equivalent observer collection, with every equivalence transformation $E'' \in \mathcal{E}''$ equal to the composition $E'' = E E'$ (first $E'$, then $E$) for some $E \in \mathcal{E}$ and $E' \in \mathcal{E}'$, where $\mathcal{H}$ can be any domain or range in an equivalent observer collection. This property implies there must always exist a collection for which the transformation domain equals the range.

6. Inverse Existence Unique

$E^{-1}$ exists for all $E \in \mathcal{E} \in \mathfrak{C} = \{\mathcal{E}, \mathcal{D}, \mathcal{R}, A_\zeta\}$, and each $E^{-1}$ is a valid equivalence transformation and an element of an equivalent observer collection $\mathfrak{C}^{-1} = \{\mathcal{E}^{-1}, \mathcal{R}, \mathcal{D}, A_\zeta\}$, where $\mathcal{E}^{-1} = \{E_1^{-1}, \ldots, E_n^{-1}, \ldots\}$. Every $E$ is a one to one map.

7. Maximally Inclusive

a. Maximal Transformation Set

Every $\mathcal{E} \in \mathfrak{C} = \{\mathcal{E}, \mathcal{D}, \mathcal{R}, A_\zeta\}$ contains all possible $E$ that have properties 1 through 6 and produce only valid probability models.

b. Maximal Parameter Set

Every domain $\mathcal{D} \in \mathfrak{C} = \{\mathcal{E}, \mathcal{D}, \mathcal{R}, A_\zeta\}$ and range $\mathcal{R} \in \mathfrak{C}$ contain all possible parameter values $P$ that give valid probability models for all transformations which satisfy the Maximal Transformation Set property.

These properties define an equivalent observer collection, and the equivalence transformations contained within the collection. There can be more than one observer collection for each probability model.

D. K Invariance

The invariant $K$ is defined by (14), and for the normal density space and time observation model is equal to

\[ K = \frac{1}{2}(\Delta X - V_0\,\Delta t)^T C_0^{-1} (\Delta X - V_0\,\Delta t) + \frac{1}{2}\sigma_{t1}^2\,\Delta V^T C_0^{-1} \Delta V. \tag{16} \]

The presence of $\sigma_{t1}^2 = \sigma_{t0}^2 + \Delta\sigma_t^2$ in the second term makes (16) a third order invariant in parameter shifts. The equivalent observer collection property of Invariant Excess Identity Discrimination requires invariance of $K$ and implies that if $K = 0$ then $K' = K(EP_1, EP_0 : A_\zeta) = 0$, so that identity preservation is invariant. With parameter $P = (\bar{X}, V, \bar{t}, \sigma_t, C)$ set $EP = P' = (\bar{X}', V', \bar{t}', \sigma_t', C')$. Identity preservation equation (13) transforms to

\[ \bar{X}_1' = \bar{X}_0' + V_0'\,\Delta t', \qquad V_1' = V_0'. \tag{17} \]

Each component of $P'$ potentially depends on components of $P$. Represent components of parameter $P$ by the invertible transform

\[ \mu = \bar{X} - V\bar{t}, \qquad \eta = \sigma_t^2, \qquad \tau = \bar{t}, \qquad \varepsilon = C^{-1}, \qquad \upsilon = V \tag{18} \]

so that the identity preservation equations (13) and (17) become

\[ \mu_1^* = \mu_0, \qquad \mu_1'^* = \mu_0', \qquad \upsilon_1^* = \upsilon_0, \qquad \upsilon_1'^* = \upsilon_0' \tag{19} \]

with $\tau_1, \eta_1, \varepsilon_1$ as constraint parameters and $*$ to indicate identity preservation parameters.

In the following analysis assume the transformations are continuous functions of parameters and continuous for all first through third derivatives with respect to components of $P_0$. Differentiate $\mu_1'^*$ and $\upsilon_1'^*$ in (17) with respect to $\tau_1$, $\eta_1$ and $\varepsilon_1$, and use (19) to get

\[ \begin{aligned}
\frac{\partial}{\partial\tau_1}\upsilon'(\mu_0, \upsilon_0, \tau_1, \eta_1, \varepsilon_1) &= \frac{\partial}{\partial\tau_1}\upsilon'(\mu_0, \upsilon_0, \tau_0, \eta_0, \varepsilon_0) = 0 \\
\frac{\partial}{\partial\eta_1}\upsilon'(\mu_0, \upsilon_0, \tau_1, \eta_1, \varepsilon_1) &= \frac{\partial}{\partial\eta_1}\upsilon'(\mu_0, \upsilon_0, \tau_0, \eta_0, \varepsilon_0) = 0 \\
\frac{\partial}{\partial\varepsilon_1}\upsilon'(\mu_0, \upsilon_0, \tau_1, \eta_1, \varepsilon_1) &= \frac{\partial}{\partial\varepsilon_1}\upsilon'(\mu_0, \upsilon_0, \tau_0, \eta_0, \varepsilon_0) = 0 \\
\frac{\partial}{\partial\tau_1}\mu'(\mu_0, \upsilon_0, \tau_1, \eta_1, \varepsilon_1) &= \frac{\partial}{\partial\tau_1}\mu'(\mu_0, \upsilon_0, \tau_0, \eta_0, \varepsilon_0) = 0 \\
\frac{\partial}{\partial\eta_1}\mu'(\mu_0, \upsilon_0, \tau_1, \eta_1, \varepsilon_1) &= \frac{\partial}{\partial\eta_1}\mu'(\mu_0, \upsilon_0, \tau_0, \eta_0, \varepsilon_0) = 0 \\
\frac{\partial}{\partial\varepsilon_1}\mu'(\mu_0, \upsilon_0, \tau_1, \eta_1, \varepsilon_1) &= \frac{\partial}{\partial\varepsilon_1}\mu'(\mu_0, \upsilon_0, \tau_0, \eta_0, \varepsilon_0) = 0
\end{aligned} \]

which imply that $\mu'$ and $\upsilon'$ can be written as

\[ \upsilon' = \upsilon'(\mu, \upsilon), \qquad \mu' = \mu'(\mu, \upsilon). \tag{20} \]

In the $P = (\mu, \upsilon, \tau, \eta, \varepsilon)$ representation of parameters the invariance equation $K = K'$ becomes

\[ \frac{\eta_1}{2}\Delta\upsilon^T \varepsilon_0 \Delta\upsilon + \frac{1}{2}(\Delta\mu + \tau_1\Delta\upsilon)^T \varepsilon_0 (\Delta\mu + \tau_1\Delta\upsilon) = \frac{\eta_1'}{2}\Delta\upsilon'^T \varepsilon_0' \Delta\upsilon' + \frac{1}{2}(\Delta\mu' + \tau_1'\Delta\upsilon')^T \varepsilon_0' (\Delta\mu' + \tau_1'\Delta\upsilon'). \tag{21} \]

Differentiate (21) with respect to $\eta_0$ to get

\[ 0 = \frac{\eta_1'}{2}\Delta\upsilon'^T \frac{\partial\varepsilon_0'}{\partial\eta_0} \Delta\upsilon' + \frac{1}{2}(\Delta\mu' + \tau_1'\Delta\upsilon')^T \frac{\partial\varepsilon_0'}{\partial\eta_0} (\Delta\mu' + \tau_1'\Delta\upsilon'). \tag{22} \]

Extraction of valid equivalence transformations $E$ begins with determination of transformations $B$ which keep $K$ invariant, with the initial assumption that the domain $\mathcal{D}$ and range $\mathcal{R}$ are the set of all parameters. Then $\mathcal{D}$ and $\mathcal{R}$ are adjusted to satisfy the properties of equivalent observer collections.

Components of $P_1'$ appear in (22) but not components of $P_1$, so the initial assumption that range and domain equal the set of all parameters implies that $P_1'$ can be any valid parameter. Then $\Delta\mu'$, $\Delta\upsilon'$ and $\tau_1'$ can take on an infinity of values. To establish a procedure that reveals the impact of the infinity of $P_1'$ values that satisfy (22), define unit vectors $\hat{j}_1^T = (1\ 0\ 0)$, $\hat{j}_2^T = (0\ 1\ 0)$, $\hat{j}_3^T = (0\ 0\ 1)$, and the vectors $\hat{k}_{ij} = \frac{1}{\sqrt{2}}(\hat{j}_i + \hat{j}_j)$ and $\hat{l}_{ij} = \frac{1}{\sqrt{2}}(\hat{j}_i - \hat{j}_j)$. If $S$ is any 3x3 symmetric matrix,

\[ \frac{1}{2}\left(\hat{k}_{ij}^T S\,\hat{k}_{ij} - \hat{l}_{ij}^T S\,\hat{l}_{ij}\right) = S_{ij}. \tag{23} \]

If $i = j$, (23) becomes $\hat{j}_i^T S\,\hat{j}_i = S_{ii}$. In (22), set $\Delta\upsilon'$ first to $a\hat{k}_{ij}$ for some scalar $a$ and then to $a\hat{l}_{ij}$, and subtract the equations to replace a $\Delta\upsilon'^T S\,\Delta\upsilon'$ term with $2a^2 S_{ij}$. Select $\mu_1$ such that $\Delta\mu' = 0$ in (22), and select any small scalar $a$. Adjust $\upsilon_1'$ to apply the procedure to use (23) for each index pair $i, j$ with indices $1 \ldots 3$. The result is $a^2\left(\eta_1' + \tau_1'^2\right)\frac{\partial\varepsilon_0'}{\partial\eta_0} = 0$, so that $\frac{\partial\varepsilon'}{\partial\eta} = 0$.

Next differentiate (21) with respect to $\tau_0$:

\[ 0 = \frac{\eta_1'}{2}\Delta\upsilon'^T \frac{\partial\varepsilon_0'}{\partial\tau_0} \Delta\upsilon' + \frac{1}{2}(\Delta\mu' + \tau_1'\Delta\upsilon')^T \frac{\partial\varepsilon_0'}{\partial\tau_0} (\Delta\mu' + \tau_1'\Delta\upsilon'). \tag{24} \]

Analysis similar to that for (22) results in $\frac{\partial\varepsilon'}{\partial\tau} = 0$.

Take the second derivative of (21) with respect to $\eta_1$:

\[ 0 = \frac{1}{2}\frac{\partial^2\left(\eta' + \tau_1'^2\right)}{\partial\eta_1^2}\,\Delta\upsilon'^T \varepsilon_0' \Delta\upsilon' + \frac{\partial^2\tau_1'}{\partial\eta_1^2}\,\Delta\upsilon'^T \varepsilon_0' \Delta\mu'. \tag{25} \]

Components of $P_0'$ appear in (25) but not components of $P_0$, so $P_0'$ can be any valid parameter. Select $\mu_0' = \mu_1'$ to remove the second term. Since $\varepsilon_0'$ is an inverse covariance matrix, $\Delta\upsilon'^T \varepsilon_0' \Delta\upsilon'$ is almost positive definite, and positive for some $\Delta\upsilon'$, so that $\frac{\partial^2}{\partial\eta^2}\left(\eta' + \tau'^2\right) = 0$. Values of $\Delta\mu'$ exist for which $\Delta\upsilon'^T \varepsilon_0' \Delta\mu'$ is not zero, with the result $\frac{\partial^2\eta'}{\partial\eta^2} = \frac{\partial^2\tau'}{\partial\eta^2} = 0$.

The third derivative of (21) with respect to $\tau_1$ is:

\[ 0 = \frac{1}{2}\frac{\partial^3\left(\eta' + \tau_1'^2\right)}{\partial\tau_1^3}\,\Delta\upsilon'^T \varepsilon_0' \Delta\upsilon' + \frac{\partial^3\tau_1'}{\partial\tau_1^3}\,\Delta\upsilon'^T \varepsilon_0' \Delta\mu' \]

so that by following a similar analysis to that used for derivatives with respect to $\eta$, $\frac{\partial^3}{\partial\tau^3}\left(\eta' + \tau'^2\right) = \frac{\partial^3\tau'}{\partial\tau^3} = \frac{\partial^3\eta'}{\partial\tau^3} = 0$. Presence of non-zero quadratic terms in the expression integrals violates the Inverse Existence Unique property, so $\frac{\partial^2\tau'}{\partial\tau^2} = \frac{\partial^2\eta'}{\partial\tau^2} = 0$. The second derivative of (21) with respect to $\tau_1$ is then:

\[ \Delta\upsilon^T \varepsilon_0 \Delta\upsilon = \left(\frac{\partial\tau_1'}{\partial\tau_1}\right)^2 \Delta\upsilon'^T \varepsilon_0' \Delta\upsilon'. \tag{26} \]

Integrate the $\eta'$ equations to get $\eta' = a\tau + g\eta + b\tau\eta + h$, where $a, b, g, h$ do not depend on $\tau$ or $\eta$. Since $\tau$ is a mean time with potentially unbounded values that can be negative or positive, non-zero $a$ and $b$ allow a negative variance $\eta'$ and an invalid probability model. Thus $a = b = 0$ and $\eta' = g\eta + h$. Integrate the $\tau'$ derivatives to obtain $\tau' = q\tau + p\tau\eta + s\eta + r$, where $q, p, s, r$ do not depend on $\tau$ or $\eta$. Substitute this expression for $\tau'$ into (21) and isolate the $\eta_1^2$ and $\eta_1^2\tau_1^2$ terms, which appear only on the right, transformed side of the equation. The resultant equation requires that $p = s = 0$ and $\tau' = q\tau + r$, with $q, r$ independent of $\tau$ or $\eta$. A first derivative of (21) by $\tau_1$ results in

\[ \Delta\upsilon^T \varepsilon_0 (\Delta\mu + \tau_1\Delta\upsilon) = \frac{\partial\tau_1'}{\partial\tau_1}\,\Delta\upsilon'^T \varepsilon_0' (\Delta\mu' + \tau_1'\Delta\upsilon') \]

and implies $\frac{\partial\tau'}{\partial\tau} = q \neq 0$.

… that follows (23) to get … and let $P_1$ approach $P_0$, $P_1 \to P_0 = P$, to get $\frac{\partial\eta'}{\partial\varepsilon_{ij}} = 0$. Then select $\Delta\mu' = 0$ to get $\frac{\partial\tau'}{\partial\varepsilon_{ij}} = 0$. Next differentiate (21) by $\eta_1$ to get

\[ \Delta\upsilon^T \varepsilon_0 \Delta\upsilon = \frac{\partial\eta_1'}{\partial\eta_1}\,\Delta\upsilon'^T \varepsilon_0' \Delta\upsilon'. \tag{29} \]

The derivative of (29) by $\mu_0$ is $0 = \frac{\partial\eta_1'}{\partial\eta_1}\,\Delta\upsilon'^T \frac{\partial\varepsilon_0'}{\partial\mu_0} \Delta\upsilon'$ and implies $\frac{\partial\varepsilon'}{\partial\mu} = 0$. Differentiation by $\mu_1$ results in $0 = \frac{\partial^2\eta_1'}{\partial\eta_1\,\partial\mu_1}\,\Delta\upsilon'^T \varepsilon_0' \Delta\upsilon'$. The $\frac{\partial^2\eta'}{\partial\eta\,\partial\mu} = 0$ result, combined with the prior $\frac{\partial^2\eta'}{\partial\eta^2} = 0$ result, integrates to $\eta' = g(\upsilon)\,\eta + h(\upsilon)$ for some positive functions $g, h$. Differentiate (21) first by $\varepsilon_{0ab}$ and then by $\varepsilon_{0kl}$ to get
∂τ Using results obtained so far, apply Adjust µ0′ so that ∆µ ′ = −τ1′∆υ ′ and use the procedure ∂2 ∂µ12 0 = η1′∆υ ′T to (21) and then ∂ε0′ ∂ε0ab ∂ε0kl (∆µ ′ + τ1′∆υ ′) T to get ∆υ ′ + ∂ε0′ ∂ε0ab ∂ε0kl (∆µ ′ + τ1′∆υ ′) (30) T ε = η′ ∂υ ′ ∂υ ′ ε′ ∂µ ∂µ T  ∂µ ′  ∂µ ′ ∂υ ′  ∂υ ′  ′  ε ′   +  + τ′ + τ  ∂µ  ∂µ ∂µ  ∂µ  Differentiate by τ twice to obtain (27) ∂ε0ab ∂ε0kl , differentiation of (30) by η1′ as an τ1′ = 0 so that ∆µ ′T Q0 ∆µ ′ = 0 . Then (30) becomes ∆µ ′T Q0 ∆υ ′ = 0 . Since µ1′ and υ1′ , and thus ∆µ ′ and ∂υ ′ = 0 and, with (20) that υ ′ is a ∂µ function of only υ , υ ′ = υ ′ (v ) . ∆υ ′ can be arbitrarily and independently selected, only Q = 0 can satisfy the equation for all possible P ′ 0 1 component values. Thus ∂εij′ ∂εab ∂εkl = 0 , each εij′ depends linearly on the εab components, and the inverse co-variance The derivative of (21) with respect to εij 1 is 1 ∂η1′ ∆υ ′T ε0′ ∆υ ′ 0= 2 ∂εij 1 ∂τ1′ + ∆υ ′T ε0′ (∆µ ′ + τ1′∆υ ′) ∂εij 1 ∂ε0′ independent variable yields ∆υ ′T Q0 ∆υ ′ = 0 . Next select  ∂τ ′ 2 ∂υ ′T ∂υ ′  0 =  ε′  ∂τ  ∂µ ∂µ which shows that With Q0 = matrix has the form (28) εij′ = Tijkl (υ ) εkl + Dij (υ ) (31) LORENTZ TRANSFORMATION FROM INFORMATION for some T , D . Substitute (31) into (21) and let ε = φ (υ ) 2 components of matrix ε0 approach zero, ε0 → 0 , or equivalently isolate terms with no εkl factors, to get η1′ ∆υ ′T D (υ0 ) ∆υ ′ 2 (32) T 1 + (∆µ ′ + τ1′∆υ ′) D (υ0 )(∆µ ′ + τ1′∆υ ′) 2 0= Variation of η1′ gives ∆υ ′T D (υ0 ) ∆υ ′ = 0 , selection of τ1′ = 0 gives ∆µ ′T D (υ0 ) ∆µ ′ = 0 . arbitrarily and εij′ = Tijkl (υ ) εkl . 
∆µ ′ and independently, Any linear P1 Let ∆υ ′ can be selected so D (υ0 ) = 0 and transformation ε ′ = U (υ ) εU (υ ) T of (33) into (21) , remove the η1 factor terms, select µ0′ so that ∆µ = −τ1∆υ to get ∆υ ′T ε0′ ∆υ ′ T 1 + (∆µ ′ + τ1′∆υ ′) ε0′ (∆µ ′ + τ1′∆υ ′) = 0 2 (34) T Negative h (υ) allows improper probability model so that negative η ′ in an h (υ ) = 0 and η ′ = g (υ ) η = φ (υ ) η . 2 Multiply (29) by η1 to get 1 1 ∂η1′ η1∆υT ε0 ∆υ = ∆υ ′T ε0′ ∆υ ′ 2 2 ∂η1 2 1 (35) = η1φ (υ ) ∆υ ′T ε0′ ∆υ ′ 2 1 = η1′ ∆υ ′T ε0′ ∆υ ′ 2 Take the second derivatives of (35) by components of υ1 and let P1 → P0 = P to get ε= T T ∂µ ′ ∂µ ′ ∂µ ′ ∂µ ′ = ε′ U (υ) εU (υ ) ∂µ ∂µ ∂µ ∂µ (38) Then T T ∂ ∂2 µ ′ ∂µ ′ ε=2 U (υ) εU (υ) =0 2 ∂µ ∂µ ∂µ so ∂2 µ ′ that ∂µ2 =0, and µ′ is linear in µ, µ ′ (µ, υ ) = S (υ ) µ + l (υ ) for some matrix S (υ ) and vector l (υ ) . Apply ∂2 ∂µ12 ε0 = to (37) to get ∂2 τ1′ ∂µ12 ∆υ ′T ε0′ (∆µ ′ + τ1′∆υ ′)  ∂µ ′ ∂τ ′ T  ∂µ ′ ∂τ ′   1 1 +  + ∆υ ′ ε0′  1 + 1 ∆υ ′    ∂µ1 ∂µ1   ∂µ1 ∂µ1  The almost positive definite property of ε0′ in (34) implies h (υ ) ≤ 0 . P1 → P0 = P in the second approach P0 , derivative of (37) with respect to components of µ to get a Substitute the derived form for η ′ , η ′ = g (υ ) η + h (υ ) , 2 (37) As µ1′ does not covariance matrix can take the form U T εU for some orthogonal matrix, so there exists matrix U (υ) such that h (υ ) (36) Subtract (35) from (21) to get T 1 ∆µ + τ1∆υ) ε0 (∆µ + τ1∆υ) = ( 2 T 1 ∆µ ′ + τ1′∆υ ′) ε0′ (∆µ ′ + τ1′∆υ ′) ( 2 depend on τ1′ , then the only term that remains is ∆µ ′T D (υ0 ) ∆υ ′ = 0 . T ∂υ ′ ∂υ ′ ε′ ∂υ ∂υ Variation δµ1′ gives ∂2 τ ′ ∂µ 2 (39) = 0 so that τ ′ depends linearly on µ . 
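The extraction procedure built on identity (23) is used repeatedly in the derivation above, so a small numerical illustration may help. The sketch below is not part of the original derivation: the helper names (`unit`, `k_vec`, `l_vec`, `quad`, `extract`) and the sample matrix are hypothetical, and indices run over 0..2 rather than 1..3. It shows that evaluating a quadratic form on $\hat{k}_{ij}$ and $\hat{l}_{ij}$ and subtracting recovers each element of a symmetric matrix.

```python
# Illustrative check of identity (23): k_ij^T S k_ij - l_ij^T S l_ij = S_ij.
# All function names and the sample matrix are assumptions made for this sketch.

def unit(i):
    """Unit vector j_i in three dimensions (i = 0, 1, 2)."""
    return [1.0 if n == i else 0.0 for n in range(3)]

def k_vec(i, j):
    """k_ij = (j_i + j_j) / 2."""
    return [(a + b) / 2 for a, b in zip(unit(i), unit(j))]

def l_vec(i, j):
    """l_ij = (j_i - j_j) / 2."""
    return [(a - b) / 2 for a, b in zip(unit(i), unit(j))]

def quad(v, S):
    """Quadratic form v^T S v."""
    return sum(v[a] * S[a][b] * v[b] for a in range(3) for b in range(3))

def extract(S, i, j):
    """Recover S_ij from two quadratic-form evaluations, as in (23)."""
    return quad(k_vec(i, j), S) - quad(l_vec(i, j), S)

# Any symmetric 3 x 3 matrix will do; this one is arbitrary.
S = [[2.0, 0.5, -1.0],
     [0.5, 3.0, 0.25],
     [-1.0, 0.25, 4.0]]

recovered = [[extract(S, i, j) for j in range(3)] for i in range(3)]
```

Because a quadratic form that vanishes on every $\hat{k}_{ij}$ and $\hat{l}_{ij}$ has all $S_{ij} = 0$, this is the step that turns statements such as $\Delta\upsilon'^T Q_0 \Delta\upsilon' = 0$ for arbitrary $\Delta\upsilon'$ into $Q_0 = 0$.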
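Since every inverse co-variance matrix is almost positive definite and symmetric, there exists matrix $Z$ such that $\varepsilon = Z^T Z$, and (38) can be written

$$1 = \left(Z\,U(\upsilon)\frac{\partial\mu'}{\partial\mu}Z^{-1}\right)^T\left(Z\,U(\upsilon)\frac{\partial\mu'}{\partial\mu}Z^{-1}\right) \tag{40}$$

Then $Z(\varepsilon)\,U(\upsilon)\frac{\partial\mu'}{\partial\mu}Z(\varepsilon)^{-1} = O$ for some orthogonal matrix $O$.

Analysis performed to this point implies the equivalence transformation components show the following dependencies:

$$\begin{aligned}
\eta' &= \eta'(\eta, \upsilon) = \phi(\upsilon)^2\eta \\
\mu' &= \mu'(\mu, \upsilon) = S(\upsilon)\mu + d(\upsilon) \\
\varepsilon' &= \varepsilon'(\varepsilon, \upsilon) = U(\upsilon)^T\varepsilon\,U(\upsilon) \\
\tau' &= \tau'(\tau, \upsilon, \mu) = \phi(\upsilon)\tau + l(\upsilon)^T\mu + b(\upsilon) \\
\upsilon' &= \upsilon'(\upsilon)
\end{aligned} \tag{41}$$

with the sign of $\phi$ chosen to make $+\phi$ be the $\tau$ coefficient of $\tau'$. Using (41), equations (38) and (36) become, for some orthogonal matrices $R_a$, $R_b$,

$$\begin{aligned}
Z(\varepsilon)\,U(\upsilon)\,S(\upsilon)\,Z(\varepsilon)^{-1} &= R_a, & R_a^T R_a &= 1 \\
\phi(\upsilon)\,Z(\varepsilon)\,U(\upsilon)\frac{\partial\upsilon'}{\partial\upsilon}Z(\varepsilon)^{-1} &= R_b, & R_b^T R_b &= 1
\end{aligned} \tag{42}$$

Using index notation with double index summation convention, the first of these equations can be written as

$$Z_{il}(\varepsilon)\,U_{lm}(\upsilon)\,S_{mj}(\upsilon) = R_{a\,il}\,Z_{lj}(\varepsilon) \tag{43}$$

Any non-singular matrix $Z$ results in a non-singular almost positive definite matrix $\varepsilon$. As there are no limitations on possible values of $\varepsilon$ in the model beyond those required of an inverse covariance matrix, $Z$ can be any non-singular $3 \times 3$ matrix. Variation of $Z_{ij}$ in (43) shows that $U(\upsilon)S(\upsilon) = R_a = r_a 1$ is the product of a scalar and the identity matrix. Since $R_a^T R_a = 1$, $r_a = \pm 1$. Similar analysis of the second equation of (42) shows that

$$\phi(\upsilon)\,U(\upsilon)\frac{\partial\upsilon'}{\partial\upsilon} = \pm 1, \qquad \phi(\upsilon)\frac{\partial\upsilon'}{\partial\upsilon} = \pm S(\upsilon) \tag{44}$$

Differentiate (37) by $\mu_1$ and then by $\mu_0$ to get

$$\varepsilon_0 = \left(\frac{\partial\mu_1'}{\partial\mu_1} + \frac{\partial\tau_1'}{\partial\mu_1}\Delta\upsilon'\right)^T\varepsilon_0'\,\frac{\partial\mu_0'}{\partial\mu_0}$$

so that

$$\frac{\partial\mu_1'}{\partial\mu_1} + \frac{\partial\tau_1'}{\partial\mu_1}\Delta\upsilon' = S(\upsilon_1) + \upsilon_1' l(\upsilon_1)^T - \upsilon_0' l(\upsilon_1)^T$$

must not depend on any component of $P_1$. This requires that $l$ be constant and that for some constant matrix $L$,

$$\begin{aligned}
S(\upsilon) &= L - \upsilon' l^T \\
\mu' &= \left(L - \upsilon' l^T\right)\mu + d(\upsilon) \\
\tau' &= \phi(\upsilon)\tau + l^T\mu + b(\upsilon)
\end{aligned} \tag{45}$$

Differentiate (37) by $\mu_1$, then by $\tau_1$ and finally by $\upsilon_1$, and re-arrange terms to get

$$\varepsilon_0'^{-1}\left(L^T - l\,\upsilon_0'^T\right)^{-1}\varepsilon_0 = \phi_1\frac{\partial\upsilon_1'}{\partial\upsilon_1} + \upsilon_1'\frac{\partial\phi_1}{\partial\upsilon_1} - \upsilon_0'\frac{\partial\phi_1}{\partial\upsilon_1}$$

The right side of this equation must not depend on components of $P_1$, which requires that there exist constant vector $\kappa$ and constant matrix $Q$ such that

$$\frac{\partial\phi}{\partial\upsilon} = -w\kappa^T, \qquad \phi\frac{\partial\upsilon'}{\partial\upsilon} - w\,\upsilon'\kappa^T = Q$$

with solutions

$$\phi = w\left(1 - \kappa^T\upsilon\right), \qquad \upsilon' = \frac{Q\upsilon + j}{w\left(1 - \kappa^T\upsilon\right)} = \frac{Q(\upsilon - s)}{w\left(1 - \kappa^T\upsilon\right)}$$

with constant scalar $w$ and constant vector $j = -Qs$. And since

$$\phi(\upsilon)\frac{\partial\upsilon'}{\partial\upsilon} = Q + w\,\upsilon'\kappa^T = \pm S(\upsilon) = \pm\left(L - \upsilon' l^T\right)$$

it follows that $Q = \pm L$ and $l = \mp w\kappa$.

In (37) set $\mu_1 = \mu_0 = 0$ and $\tau_1 = 0$ so that the equation becomes

$$0 = \frac{1}{2}(\Delta d + b_1\Delta\upsilon')^T\varepsilon_0'(\Delta d + b_1\Delta\upsilon')$$

which requires $\Delta d + b_1\Delta\upsilon' = 0$ and thus

$$d(\upsilon_1) + b(\upsilon_1)\,\upsilon'(\upsilon_1) = d(\upsilon_0) + b(\upsilon_1)\,\upsilon'(\upsilon_0)$$

Since the right side cannot depend on $P_0$, $b(\upsilon) = r$ for some constant scalar $r$ and $d(\upsilon) = u - r\,\upsilon'(\upsilon)$ for some constant vector $u$.

With these results the $\tau_1$ term of (37) is

$$\tau_1\Delta\upsilon^T\varepsilon_0\Delta\mu = \phi(\upsilon_1)\,\tau_1\,\Delta\upsilon'^T\varepsilon_0'\left(\Delta\left(S(\upsilon)\mu\right) + l^T\mu_1\,\Delta\upsilon'\right)$$

The $\mu_1$ derivative of the $\tau_1$ factors is

$$\Delta\upsilon^T\varepsilon_0 = \phi(\upsilon_1)\,\Delta\upsilon'^T\varepsilon_0'\left(S(\upsilon_1) + \Delta\upsilon' l^T\right) = \phi(\upsilon_1)\,\Delta\upsilon'^T\varepsilon_0'\,S(\upsilon_0)$$

and a subsequent $\upsilon_1$ derivative is

$$\varepsilon_0 = \left(\frac{\partial\phi(\upsilon_1)}{\partial\upsilon_1}^T\Delta\upsilon'^T + \phi(\upsilon_1)\frac{\partial\upsilon_1'}{\partial\upsilon_1}^T\right)\varepsilon_0'\,S(\upsilon_0) = \left(\frac{\partial\phi(\upsilon_1)}{\partial\upsilon_1}^T\Delta\upsilon'^T \pm S(\upsilon_1)^T\right)\varepsilon_0'\,S(\upsilon_0)$$

Let $P_1$ approach $P_0$ to get $\varepsilon = \pm S(\upsilon)^T\varepsilon' S(\upsilon)$. Since $\varepsilon$ and $\varepsilon'$ are positive definite, only the positive sign is valid and $\varepsilon = S(\upsilon)^T\varepsilon' S(\upsilon)$.

These results allow (41) to be written as

$$\begin{aligned}
\eta' &= w^2\left(1 - \kappa^T\upsilon\right)^2\eta \\
\upsilon' &= \frac{Q(\upsilon - s)}{w\left(1 - \kappa^T\upsilon\right)} \\
\mu' &= Q\left(1 + \frac{(\upsilon - s)\kappa^T}{1 - \kappa^T\upsilon}\right)\mu + u - r\upsilon' \\
\varepsilon' &= \left(\left(Q + w\,\upsilon'\kappa^T\right)^T\right)^{-1}\varepsilon\left(Q + w\,\upsilon'\kappa^T\right)^{-1} \\
\tau' &= w\left(1 - \kappa^T\upsilon\right)\tau - w\kappa^T\mu + r
\end{aligned} \tag{46}$$

Restoration of the original parameter representation makes the solution of the invariance equation be

$$\begin{aligned}
\sigma_t'^2 &= w^2\,\theta(V)^2\,\sigma_t^2 \\
V' &= \frac{P(V - s)}{\theta(V)} \\
X' &= wP(X - st) + u \\
C' &= w^2\left(P + V'\kappa^T\right)C\left(P + V'\kappa^T\right)^T \\
t' &= w\left(t - \kappa^T X\right) + r
\end{aligned} \tag{47}$$

with $P$ defined by $P = Q/w$ and $\theta(V) = 1 - \kappa^T V$. The transformations given by selection of values of $w, \kappa, s, P, r$ and $u$ in (47) are transformations that keep $K$ invariant and satisfy the Invariant Excess Identity Discrimination property of equivalence transformations under stated assumptions of continuity of derivatives. If $w = \gamma$, $\kappa = s\,c^{-2}$, $P = 1$, $r = 0$ and $u = 0$ are selected, with $\gamma = \left(1 - s^2c^{-2}\right)^{-1/2}$, then the $X'$ and $t'$ transformations of (47) correspond to a simple instance of the Lorentz transformation with moving frame velocity $s$.

The set of transformations defined by (47) contains but is larger than the Poincare group. Invariance of $K$ produces not only space and time parameter transformations, but also transformations of velocity, time variance and position given time co-variance, from an invariant that is conceptually simpler than the Minkowski space-time separation. To see this consider that for the case $C = \sigma_{c0}^2 1$, the invariant is

$$K = \frac{1}{2}\sigma_{c0}^{-2}(\Delta X - V_0\Delta t)^2 + \frac{1}{2}\sigma_{t1}^2\sigma_{c0}^{-2}(\Delta V)^2$$

Analysis leading up to transformation (47) shows that $K$ is the sum of two invariant terms, namely

$$K_X = \frac{1}{2}\sigma_{c0}^{-2}(\Delta X - V_0\Delta t)^2, \qquad K_V = \frac{1}{2}\sigma_{t1}^2\sigma_{c0}^{-2}(\Delta V)^2$$

A Lorentz transformation extended to include transformation of $\sigma_t$, $\sigma_c$, and $V$ keeps these terms invariant, and so maintains an elliptic, rather than a hyperbolic invariant, although in a parameter space of up to 21 dimensions.

E. Transformation Selectors

a. Transitivity, Inverse and Composition

$S = (w, \kappa, P, s, u, r)$ is a transformation selector, which picks a single transformation from the set of all transformations defined by (47), and acts as an index of transformations that preserve identity discrepancy. Remaining equivalence transformation properties determine the allowed selectors contained within equivalent observer collections, along with allowed domain and range parameter sets.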
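The claim that every selector choice in (47) keeps $K$ of (16) unchanged can be spot-checked numerically. The following is a minimal sketch under simplifying assumptions that are not in the original: one spatial dimension (so $X$, $V$, $C$, $P$, $\kappa$, $s$, $u$ are scalars), arbitrary sample numbers, and hypothetical function names `transform` and `K`. The time variance is stored directly as `var_t` $= \sigma_t^2$.

```python
# One-dimensional sketch: transformation (47) applied to both parameter sets
# should leave the excess discrimination K of (16) unchanged.
# Parameter tuple layout (an assumption of this sketch): (X, V, t, var_t, C).

def transform(p, w, kappa, P, s, u, r):
    """Apply (47) in one spatial dimension."""
    X, V, t, var_t, C = p
    theta = 1.0 - kappa * V                   # theta(V) = 1 - kappa^T V
    V_new = P * (V - s) / theta
    return (
        w * P * (X - s * t) + u,              # X'
        V_new,                                # V'
        w * (t - kappa * X) + r,              # t'
        w**2 * theta**2 * var_t,              # sigma_t'^2
        w**2 * (P + V_new * kappa)**2 * C,    # C'
    )

def K(p1, p0):
    """Excess discrimination (16) for scalar position and velocity."""
    X1, V1, t1, var_t1, _ = p1
    X0, V0, t0, _, C0 = p0
    dX, dV, dt = X1 - X0, V1 - V0, t1 - t0
    return 0.5 * (dX - V0 * dt)**2 / C0 + 0.5 * var_t1 * dV**2 / C0

# Arbitrary selector and parameter values, chosen so that theta(V) != 0.
selector = dict(w=1.3, kappa=0.2, P=0.9, s=0.4, u=-2.0, r=0.7)
p0 = (1.0, 0.3, 2.0, 0.5, 1.5)
p1 = (1.8, 0.1, 3.1, 0.8, 1.1)

k_before = K(p1, p0)
k_after = K(transform(p1, **selector), transform(p0, **selector))
```

The two values agree to rounding error for any selector with $w \neq 0$ and $\theta(V) \neq 0$, including selectors that are not Lorentz transformations, which illustrates why (47) is described as larger than the Poincare group.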
There is a one to one correspondence between selectors and invariant preserving transformations, so an equivalent observer collection set of transformations $\mathcal{E}$ can be represented by a set of selectors $\mathcal{S}$. This also implies that $\mathcal{C} = \{\mathcal{E}, \mathcal{D}, \mathcal{R}, \mathrm{A}_\zeta\}$ can be represented by $\mathcal{C} = \{\mathcal{S}, \mathcal{D}, \mathcal{R}, \mathrm{A}_\zeta\}$.

Exclusion of invalid probability models is implicitly required by the Maximally Inclusive properties. In (47), valid probability models require non-zero time variance and non-zero covariance matrix determinant, so that

$$w \neq 0, \qquad \theta(V) = 1 - \kappa^T V \neq 0, \qquad \kappa^T V \neq 1 \tag{48}$$

and

$$\begin{aligned}
\det\left(P + V'\kappa^T\right) &= \det(P)\,\det\left(1 + \frac{(V - s)\kappa^T}{\theta(V)}\right) \neq 0 \\
\det(P) &\neq 0 \\
\det\left(1 + \frac{(V - s)\kappa^T}{\theta(V)}\right) &= 1 + \frac{\kappa^T(V - s)}{1 - \kappa^T V} = \frac{1 - \kappa^T s}{1 - \kappa^T V} \neq 0 \\
\kappa^T s &\neq 1
\end{aligned} \tag{49}$$

In the selector representation, the Transitivity property requires that if $\mathcal{C} = \{\mathcal{S}, \mathcal{D}, \mathcal{R}, \mathrm{A}_\zeta\}$ and $\mathcal{C}' = \{\mathcal{S}', \mathcal{R}, \mathcal{H}, \mathrm{A}_\zeta\}$ are equivalent observer collections, then $\mathcal{C}'' = \{\mathcal{S}'', \mathcal{D}, \mathcal{H}, \mathrm{A}_\zeta\}$ is an equivalent observer collection with selectors formed from all compositions of $S'$ and $S$. This implies that given selector $S' = (w', \kappa', P', s', u', r')$ with associated invariant transformation $E'$, and selector $S = (w, \kappa, P, s, u, r)$ with $E$, the transformation $E'EP = E''P$ for $P \in \mathcal{D}$ must be of the form of (47) with selector $S'' = (w'', \kappa'', P'', s'', u'', r'')$. Define $\varepsilon = P + s'\kappa^T$ and $\lambda = 1 + \kappa'^T P s$. Work through the algebra to get

$$\begin{aligned}
w'' &= \lambda w' w \\
\kappa'' &= \lambda^{-1}\left(\kappa + P^T\kappa'\right) \\
P'' &= \lambda^{-1}P'\varepsilon \\
s'' &= \varepsilon^{-1}(s' + Ps) \\
u'' &= w'P'(u - r s') + u' \\
r'' &= r' + w'\left(r - \kappa'^T u\right)
\end{aligned} \tag{50}$$

as the selector formed by the composition of equivalence transformations.

The Inverse Existence Property of equivalent observer collections requires the inverse of an equivalence transformation to be an equivalence transformation within an associated inverse collection. Then there must be a transformation selector for the inverse transformation. Let $\tilde{S} = (\tilde{w}, \tilde{\kappa}, \tilde{P}, \tilde{s}, \tilde{u}, \tilde{r})$ represent an equivalence transformation $E$ from domain $\mathcal{D}$ to range $\mathcal{R}$, and let $S' = (w', \kappa', P', s', u', r')$ represent $E^{-1}$. Using the form (47) for both transformations, and $E^{-1}E = 1$, work through the algebra to get

$$\begin{aligned}
\tilde{\varepsilon} &= \tilde{P}\left(1 - \tilde{s}\tilde{\kappa}^T\right) \\
\tilde{\lambda} &= 1 - \tilde{\kappa}^T\tilde{s} \\
w' &= \tilde{\lambda}^{-1}\tilde{w}^{-1} \\
\kappa' &= -\left(\tilde{P}^T\right)^{-1}\tilde{\kappa} \\
P' &= \tilde{\lambda}\tilde{\varepsilon}^{-1} \\
s' &= -\tilde{P}\tilde{s} \\
u' &= -\tilde{w}^{-1}\tilde{\varepsilon}^{-1}\left(\tilde{u} + \tilde{r}\tilde{P}\tilde{s}\right) \\
r' &= -\tilde{\lambda}^{-1}\tilde{w}^{-1}\left(\tilde{r} + \tilde{\kappa}^T\tilde{P}^{-1}\tilde{u}\right)
\end{aligned} \tag{51}$$

as the inverse transformation selector. The existence of an inverse (51) requires existence of $\tilde{P}^{-1}$, so $\det(\tilde{P}) \neq 0$.

Next form the composition of a transformation $E$ from $\mathcal{D} \to \mathcal{R}$, represented by $S = (w, \kappa, P, s, u, r)$, with the inverse of any transformation $\tilde{E}$ from $\mathcal{D} \to \mathcal{R}$, $\tilde{E}^{-1}$ from $\mathcal{R} \to \mathcal{D}$. The composition $E'' = \tilde{E}^{-1}E$ is an equivalence transformation from $\mathcal{D} \to \mathcal{D}$. With $\tilde{S} = (\tilde{w}, \tilde{\kappa}, \tilde{P}, \tilde{s}, \tilde{u}, \tilde{r})$ to represent $\tilde{E}$ and $S'' = (w'', \kappa'', P'', s'', u'', r'')$ to represent $E''$, use (51) to form the inverse selector and (50) to form the composition selector, with result

$$\begin{aligned}
\tilde{\varepsilon} &= \tilde{P}\left(1 - \tilde{s}\tilde{\kappa}^T\right), & \hat{\varepsilon} &= P\left(1 - \tilde{s}\kappa^T\right) \\
\tilde{\lambda} &= 1 - \tilde{\kappa}^T\tilde{s}, & \hat{\lambda} &= 1 - \kappa^T\tilde{s} \\
w'' &= \hat{\lambda}\tilde{\lambda}^{-1}\tilde{w}^{-1}w \\
\kappa'' &= \hat{\lambda}^{-1}(\kappa - \tilde{\kappa}) \\
P'' &= \hat{\lambda}^{-1}\tilde{\lambda}\,\tilde{\varepsilon}^{-1}\hat{\varepsilon} \\
s'' &= \hat{\varepsilon}^{-1}P(s - \tilde{s}) \\
u'' &= \tilde{w}^{-1}\tilde{\varepsilon}^{-1}\left(u - \tilde{u} + (r - \tilde{r})\tilde{P}\tilde{s}\right) \\
r'' &= \tilde{\lambda}^{-1}\tilde{w}^{-1}\left(r - \tilde{r} + \tilde{\kappa}^T\tilde{P}^{-1}(u - \tilde{u})\right)
\end{aligned} \tag{52}$$

b. Range equals domain case

Consider the case $\mathcal{R} = \mathcal{D}$.
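The composition rule (50) and the inverse rule (51) lend themselves to a direct numerical check. The sketch below is illustrative only and rests on assumptions not in the original: one spatial dimension, so every selector entry is a scalar and matrix ordering plays no role; hypothetical function names `apply`, `compose` and `inverse`; and arbitrary sample selectors. It checks only the $X'$ and $t'$ equations of (47).

```python
# One-dimensional sketch of selector composition (50) and inversion (51).
# A selector is the tuple (w, kappa, P, s, u, r).

def apply(sel, X, t):
    """X' and t' of (47) for scalar X."""
    w, kappa, P, s, u, r = sel
    return w * P * (X - s * t) + u, w * (t - kappa * X) + r

def compose(sel2, sel1):
    """Selector of E2 E1 (E1 applied first), following (50)."""
    w, kappa, P, s, u, r = sel1
    w2, kappa2, P2, s2, u2, r2 = sel2
    eps = P + s2 * kappa              # epsilon = P + s' kappa^T
    lam = 1.0 + kappa2 * P * s        # lambda = 1 + kappa'^T P s
    return (
        lam * w2 * w,
        (kappa + P * kappa2) / lam,
        P2 * eps / lam,
        (s2 + P * s) / eps,
        w2 * P2 * (u - r * s2) + u2,
        r2 + w2 * (r - kappa2 * u),
    )

def inverse(sel):
    """Selector of the inverse transformation, following (51)."""
    w, kappa, P, s, u, r = sel
    eps = P * (1.0 - s * kappa)
    lam = 1.0 - kappa * s
    return (
        1.0 / (lam * w),
        -kappa / P,
        lam / eps,
        -P * s,
        -(u + r * P * s) / (w * eps),
        -(r + kappa * u / P) / (lam * w),
    )

# Arbitrary selectors and an arbitrary (X, t) pair.
S1 = (1.2, 0.15, 0.8, 0.3, 0.5, -0.4)
S2 = (0.9, -0.1, 1.1, -0.2, 1.0, 0.25)
X, t = 2.0, 3.0

step = apply(S2, *apply(S1, X, t))                  # apply E1, then E2
direct = apply(compose(S2, S1), X, t)               # apply the composed selector
round_trip = apply(inverse(S1), *apply(S1, X, t))   # E1 followed by its inverse
```

`step` and `direct` coincide, and `round_trip` returns the starting `(X, t)`, provided the divisors `eps`, `lam`, `w` and `P` are non-zero, which is the scalar form of conditions (48) and (49).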
Given the same adjustable constraint, so that all inputs into formation of the observer collection are identical, the Maximal Transformation Set property then implies $\tilde{S}$ and $S$ represent transformations in the same set. Since $\tilde{S} = S$ is then possible, making (52) the identity transformation, the identity transformation is always an element of every equivalent observer collection for which range and domain are the same. In (52) valid probability models require

$$\det(P'') = \det\left(\hat{\lambda}^{-1}\tilde{\lambda}\,\tilde{\varepsilon}^{-1}\hat{\varepsilon}\right) = \frac{\tilde{\lambda}^3\det(\hat{\varepsilon})}{\hat{\lambda}^3\det(\tilde{\varepsilon})} = \frac{\tilde{\lambda}^3\det\left(P\left(1 - \tilde{s}\kappa^T\right)\right)}{\hat{\lambda}^3\det\left(\tilde{P}\left(1 - \tilde{s}\tilde{\kappa}^T\right)\right)} = \frac{\tilde{\lambda}^3\left(1 - \kappa^T\tilde{s}\right)\det(P)}{\hat{\lambda}^3\left(1 - \tilde{\kappa}^T\tilde{s}\right)\det(\tilde{P})} = \frac{\tilde{\lambda}^2\det(P)}{\hat{\lambda}^2\det(\tilde{P})} \neq 0, \pm\infty \tag{53}$$

Since the inverse of (52) must exist, (53) can be neither zero nor infinite. Thus $\tilde{\lambda} \neq 0$ and $\hat{\lambda} \neq 0$, or

$$\tilde{\kappa}^T\tilde{s} \neq 1, \qquad \kappa^T\tilde{s} \neq 1 \tag{54}$$

So if $\mathcal{D} = \mathcal{R}$ then $\kappa$ and $\tilde{s}$ can be independently chosen from the same set of transformations. This implies that the second inequality in (54) is true for any $\kappa$ and any $s$ in the transformation set, even if from different transformations and an interdependence exists. The Maximal Transformation Set property implies that all possible $\kappa$ and $s$ be included in the selectors for the observer collection transformation set. Since there is no imposition of a favored direction, there exist $\kappa$ and $s$ vectors in all spatial directions in the set. Assume that the magnitude $|\kappa|$ is unbounded in the set. Given any $\kappa$ with magnitude at least $|s|^{-1}$, since $s$ can be in any direction, $s$ can always be chosen close enough to perpendicular to $\kappa$ that

$$\kappa^T s = |\kappa||s|\cos\theta = \left(|s|^{-1} + \delta\kappa\right)|s|\cos\theta = \left(1 + |s|\,\delta\kappa\right)\cos\left(\frac{\pi}{2} + \delta\theta\right) = 1 \tag{55}$$

which violates (54). Thus the magnitude of $\kappa$ must be bounded in the transformation set. Switch $\kappa$ and $s$ in the argument to show that $s$ must also be bounded in a transformation set. Lack of any favored direction implies that the boundaries are spheres. Designate $c$ as the upper bound of $|s|$ and $a$ as the upper bound of $|\kappa|$ in a transformation set. Define $\beta_s = s/c$ and $\beta_\kappa = \kappa/a$, with magnitudes less than 1. Then (54) is written

$$\kappa^T s = ac\,\beta_\kappa^T\beta_s \neq 1 \tag{56}$$

with upper bound of $ac = 1$. $\beta_\kappa^T\beta_s$ can take any values with magnitude less than one, including values very near one, and the largest value of $ac$ must therefore equal 1:

$$a = \frac{1}{c}, \qquad \kappa^T s = \beta_\kappa^T\beta_s \tag{57}$$

Reasoning similar to that used in (54) through (57) can be applied to (48), $\kappa^T V \neq 1$. If the magnitude of $V$ is $|V| > c$, then for $\kappa$ of any magnitude $|\kappa| < 1/c$ the angle formed by $\kappa$ and $V$ can be chosen close enough to $\pi/2$ radians to make the product equal 1. Thus the magnitude of $V$ is bounded by $c$:

$$|V| < c \tag{58}$$

Different values of $c$ result in different parameter domain sets and therefore different equivalent observer collections. Transformations to a range not the same as the domain are to observers which use different values of $c$. The value of $c$ acts as an index selector for equivalent observer collections, and is a meta selector, providing data about collections of transformations that primarily results from equivalence transformation properties other than invariance.

c. Transformation of velocity bound

Assume $\mathcal{R} = \mathcal{D}$, so both $V$ and $V'$ are bounded by $c$. Define $\beta = V/c$ and $\beta' = V'/c$ so $|\beta| < 1$ and $|\beta'| < 1$. The transformation of velocity in (47) is, in terms of $\beta$, $\beta_s$ and $\beta_\kappa$,

$$\beta' = \frac{P(\beta - \beta_s)}{1 - \beta_\kappa^T\beta} \tag{59}$$

Let $r$ be the magnitude of $\beta - \beta_s$, and $\hat{q}$ be the unit vector in the direction of $\beta - \beta_s$, so $\beta - \beta_s = r\hat{q}$ with $r \geq 0$. Define the square magnitude of the transformed normalized velocity $\beta'$ as a function $\Pi(\beta) = \beta'^T\beta'$ so that $\Pi(\beta) = r^2\theta^{-2}R$ with $R = \hat{q}^T P^T P\hat{q}$ and $\theta = 1 - \beta_\kappa^T\beta = 1 - \beta_\kappa^T\beta_s - r\beta_\kappa^T\hat{q}$. The minimum value of $\Pi$ is $\Pi = 0$, which occurs at $\beta = \beta_s$, a stationary point of $\Pi$. Fix $\hat{q}$ so that the derivative is

$$\begin{aligned}
\frac{d}{dr}\Pi(\beta) = \frac{d}{dr}\left(r^2\theta^{-2}R\right) &= 2\left(r\theta^{-1}\right)\left(\theta^{-1} - r\theta^{-2}\frac{d\theta}{dr}\right)R \\
&= 2\left(r\theta^{-1}\right)\left(\theta^{-1} + r\theta^{-2}\beta_\kappa^T\hat{q}\right)R \\
&= 2\left(r\theta^{-3}\right)\left(\theta + r\beta_\kappa^T\hat{q}\right)R \\
&= 2\left(r\theta^{-3}\right)\left(1 - \beta_\kappa^T\beta_s\right)R
\end{aligned}$$

Since $R > 0$, $r > 0$, $\beta_\kappa^T\beta_s < 1$, and $\beta_\kappa^T\beta < 1$, the derivative $\frac{d}{dr}\Pi(\beta) > 0$ for all possible directions $\hat{q}$. Thus there can be no other stationary points besides $\beta = \beta_s$. Then the maximum value of $\Pi$, which must equal 1 if $\mathcal{R} = \mathcal{D}$, must occur for $\beta$ with velocity at the boundary of $\mathcal{D}$, where also $\beta^T\beta = 1$ is a maximum. Therefore the boundary of $\mathcal{D}$ maps into the boundary of $\mathcal{R}$, which by the assumption $\mathcal{R} = \mathcal{D}$ is also the boundary of $\mathcal{D}$, and (59) must map all magnitude one $\beta$ to magnitude one $\beta'$.

Every range $\mathcal{R}$ can also be a domain, so if $\mathcal{R} \neq \mathcal{D}$, $\mathcal{R}$ and $\mathcal{D}$ are characterized by different values of the velocity bound (58). Designate the bound for $\mathcal{D}$ by $c$ and for $\mathcal{R}$ by $c'$. Define $\zeta = c'/c$. Then (59) becomes

$$\beta' = \frac{V'}{c'} = \frac{V'}{\zeta c} = \frac{P}{\zeta}\,\frac{\beta - \beta_s}{1 - \beta_\kappa^T\beta} \tag{60}$$

The maximum value of the magnitude of the right side of (60) is the maximum value of $\Pi$, which equals 1, so the maximum magnitude of $V'/c$ is $\zeta$. Since the maximum magnitude of $\Pi$ occurs at the boundary of $\mathcal{D}$, the boundary of $\mathcal{D}$ maps into the boundary of $\mathcal{R}$.

F. Lorentz Transformation

The analysis of the domain structure shows that an equivalence transformation maps any velocity on the boundary of $\mathcal{D}$, $V = c\Upsilon$ with unit vector $\Upsilon$, to a velocity of magnitude $c' = \zeta c$ on the boundary of $\mathcal{R}$:

$$V' = \frac{P(V - s)}{1 - \kappa^T V} = \frac{P(c\Upsilon - s)}{1 - c\,\kappa^T\Upsilon} = \zeta c\,\Upsilon' \tag{61}$$

for some unit vector $\Upsilon'$. From this equation form the squared magnitude of the transformed velocity as

$$\frac{(c\Upsilon - s)^T P^T P\,(c\Upsilon - s)}{\left(1 - c\,\kappa^T\Upsilon\right)^2} = \zeta^2 c^2$$

which can be written as

$$\left(1 - \beta_\kappa^T\Upsilon\right)^2 = (\Upsilon - \beta_s)^T A\,(\Upsilon - \beta_s) \tag{62}$$

with $A = S^T S$ and $S = \zeta^{-1}P$. Since (61) restricts only the velocity magnitude and not direction, equation (62) must be true for all directions of unit vector $\Upsilon$. Reverse the direction of the unit vector to obtain

$$\left(1 + \beta_\kappa^T\Upsilon\right)^2 = (\Upsilon + \beta_s)^T A\,(\Upsilon + \beta_s)$$

These two equations imply that

$$\Upsilon^T\left(A - \beta_\kappa\beta_\kappa^T - 1\left(1 - \beta_s^T A\beta_s\right)\right)\Upsilon = 0, \qquad \left(\beta_\kappa^T - \beta_s^T A\right)\Upsilon = 0$$

Since these equations must be true for any unit vector $\Upsilon$,

$$\begin{aligned}
A &= \beta_\kappa\beta_\kappa^T + 1\left(1 - \beta_s^T A\beta_s\right) \\
\beta_\kappa &= A\beta_s = \beta_\kappa\left(\beta_\kappa^T\beta_s\right) + \left(1 - \beta_s^T A\beta_s\right)\beta_s
\end{aligned} \tag{63}$$

Re-arrange the second equation to get

$$\beta_\kappa\left(1 - \beta_\kappa^T\beta_s\right) = \left(1 - \beta_s^T A\beta_s\right)\beta_s = \left(1 - \beta_s^T\beta_\kappa\right)\beta_s \tag{64}$$

If the expressions in parenthesis in (63) were to vanish, then $A = \beta_\kappa\beta_\kappa^T$, for which $\det(A) = 0$ and therefore $\det(P) = 0$, in violation of the Inverse Existence Property. Thus the terms in parenthesis do not vanish. Consequently (64) requires $\beta_\kappa = \beta_s$, or $\kappa = s/c^2$, and $\beta_s$ is an eigenvector of $A$. The definition of $A$ makes $A$ a positive definite matrix, so it is possible to write $A = M^T D M$ for some diagonal matrix $D$ and some orthogonal matrix $M$, $M^T M = M M^T = 1$. With $A$ expressed in this manner the first line of (63) becomes

$$D = M\left(\beta_s\beta_s^T + 1\left(1 - \beta_s^T A\beta_s\right)\right)M^T = (M\beta_s)(M\beta_s)^T + 1\left(1 - \beta_s^T\beta_s\right) \tag{65}$$

Define $\gamma = \left(1 - \beta_s^T\beta_s\right)^{-1/2}$. Then the component equations of (65) are

$$D_{ii}\,\delta_{ij} = (M\beta_s)_i\,(M\beta_s)_j + \gamma^{-2}\delta_{ij}$$

which imply that only one of the three $(M\beta_s)_i$ can be non-zero. Select the index of the non-zero component of $M\beta_s$ as $i = 1$. In the following, use of either of the other two axes for the non-zero component yields the same $A$. Since the matrix $M$ is orthogonal, $|M\beta_s| = |\beta_s| = |\beta_\kappa|$, $(M\beta_s)_1^2 = \beta_s^T\beta_s$, and

$$D_{11} = \beta_s^T\beta_s + \gamma^{-2} = 1, \qquad D_{22} = D_{33} = \gamma^{-2}$$

Selection of any other index as the non-zero component results in a corresponding permutation of the $D$ indices. Matrix $M$ rotates $\beta_s$ to align with a coordinate axis. The definition of $A$ implies $A = S^T S = M^T D M = (\Lambda M)^T(\Lambda M)$ with $\Lambda_{ij} = D_{ij}^{1/2}$. Consequently, $S = \tilde{O}\Lambda M$ for some orthogonal matrix $\tilde{O}$, $\tilde{O}^T\tilde{O} = 1$. Define a position unit scale factor $d$ by reference to an axis perpendicular to $\beta_s$ as

$$d = w\zeta\left(\tilde{O}^T S M^T\right)_{33} = w\zeta\Lambda_{33} = w\zeta\gamma^{-1}$$

Selector $w$ is then given by $w = \frac{d}{\zeta}\gamma$. With $d$ as position scale factor and $\zeta$ as velocity scale factor, the equivalence transformation (47) becomes

$$\begin{aligned}
X' &= d\gamma S(X - st) + u \\
t' &= \frac{d}{\zeta}\gamma\left(t - \frac{s^T X}{c^2}\right) + r \\
V' &= \zeta S\,\frac{V - s}{1 - \dfrac{s^T V}{c^2}} \\
\sigma_t'^2 &= \left(\frac{d}{\zeta}\right)^2\gamma^2\left(1 - \frac{s^T V}{c^2}\right)^2\sigma_t^2 \\
C' &= d^2\gamma^2 S\left(1 + \frac{(V - s)s^T}{c^2 - s^T V}\right)C\left(1 + \frac{s(V - s)^T}{c^2 - s^T V}\right)S^T
\end{aligned} \tag{66}$$

with

$$S = \tilde{O}\begin{pmatrix}1 & 0 & 0 \\ 0 & \gamma^{-1} & 0 \\ 0 & 0 & \gamma^{-1}\end{pmatrix}M, \qquad \gamma = \left(1 - \frac{s^T s}{c^2}\right)^{-1/2}$$

for some orthogonal matrices $\tilde{O}$ and $M$, such that matrix $M$ rotates $s$ into a vector which lies on the $x$ axis so that $Ms = +|s|\hat{x}$. Define $O = \tilde{O}M$ so that $\tilde{O} = OM^T$. Then

$$S = OM^T\begin{pmatrix}1 & 0 & 0 \\ 0 & \gamma^{-1} & 0 \\ 0 & 0 & \gamma^{-1}\end{pmatrix}M = OW, \qquad W = M^T\begin{pmatrix}1 & 0 & 0 \\ 0 & \gamma^{-1} & 0 \\ 0 & 0 & \gamma^{-1}\end{pmatrix}M$$

$W$ first rotates $s$ to lie along the $x$ axis, next multiplies the components of $V$ perpendicular to $s$ by $\gamma^{-1}$, and then returns $s$ to the original orientation.

A matrix $M$ which rotates $s$ to the $x$ axis, $Ms = |s|\hat{x}$, is, with $|s| = \sqrt{s_x^2 + s_y^2 + s_z^2}$ and $s_\perp = \sqrt{s_y^2 + s_z^2}$,

$$\Gamma^{-1} = \begin{pmatrix}1 & 0 & 0 \\ 0 & \gamma^{-1} & 0 \\ 0 & 0 & \gamma^{-1}\end{pmatrix}, \qquad M = \begin{pmatrix}\dfrac{s_x}{|s|} & \dfrac{s_y}{|s|} & \dfrac{s_z}{|s|} \\[2ex] -\dfrac{s_\perp}{|s|} & \dfrac{s_x s_y}{|s|\,s_\perp} & \dfrac{s_x s_z}{|s|\,s_\perp} \\[2ex] 0 & -\dfrac{s_z}{s_\perp} & \dfrac{s_y}{s_\perp}\end{pmatrix}$$

$$W = M^T\Gamma^{-1}M = \gamma^{-1}1 + \left(1 - \gamma^{-1}\right)\frac{ss^T}{s^T s}$$

Application of $W$ to the velocity transformation term results in the traditional relativistic velocity transformation. $W$ is the same no matter which axis the frame velocity is first rotated into. With these results the equivalence transformation (66) is

$$\begin{aligned}
X' &= dO\left(1 + (\gamma - 1)\frac{ss^T}{s^T s}\right)(X - st) + u \\
t' &= \frac{d}{\zeta}\gamma\left(t - \frac{s^T X}{c^2}\right) + r \\
V' &= \zeta\gamma^{-1}O\left(1 + (\gamma - 1)\frac{ss^T}{s^T s}\right)\frac{V - s}{1 - \dfrac{s^T V}{c^2}} \\
\sigma_t'^2 &= \left(\frac{d}{\zeta}\right)^2\gamma^2\left(1 - \frac{s^T V}{c^2}\right)^2\sigma_t^2 \\
C' &= d^2 O\left(1 + (\gamma - 1)\frac{ss^T}{s^T s}\right)\left(1 + \frac{(V - s)s^T}{c^2 - s^T V}\right)C\left(1 + \frac{s(V - s)^T}{c^2 - s^T V}\right)\left(1 + (\gamma - 1)\frac{ss^T}{s^T s}\right)O^T
\end{aligned} \tag{67}$$

The first three equations of (67) are a combination of time and velocity units scaling transformations and Lorentz transformations in time and three spatial dimensions [9]. These are relativistic transformations which account for the possibility of different measurement scales. In addition to conventional relativistic Lorentz transformations, the equivalence transformation defines precisely the transformation of the observation time variance and the spatial covariance given time.

G. Small Parameter Shift

Consider a general continuous probability model with continuous derivatives, vector $R$ of real scalar responsive parameters, and vector $W$ of real scalar constraint parameters. Designate $P$ as the vector of parameters with components from $R$ and $W$. Under sufficiently small $\Delta P_i = P_{1i} - P_{0i}$ the discrimination information [10] is, to second order in $\Delta P$,

$$I(P_1, P_0) \approx \frac{1}{2}\Delta P^T F\,\Delta P \tag{68}$$

where $F$ is the Fisher information matrix defined by

$$F_{ij} = \int_{\forall R} dv_R\, f(R : P_0)\,\frac{\partial\ln f(R : P_0)}{\partial P_{0i}}\,\frac{\partial\ln f(R : P_0)}{\partial P_{0j}}$$

$F$ in block matrix form is

$$F = \begin{pmatrix}F_{RR} & F_{RW} \\ F_{WR} & F_{WW}\end{pmatrix}$$

which makes (68) become

$$I(P_1, P_0) \approx \frac{1}{2}\Delta R^T F_{RR}\Delta R + \Delta R^T F_{RW}\Delta W + \frac{1}{2}\Delta W^T F_{WW}\Delta W \tag{69}$$

The variational equation

$$\frac{\partial}{\partial R_1}I(P_1, P_0) = 0 \tag{70}$$

defines the stationary point, and minimum, with solution $R_1^*$: $\Delta R^* = R_1^* - R_0 = -F_{RR}^{-1}F_{RW}\Delta W$. Substitution into (69) gives the result

$$I(P_1^*, P_0) \approx \frac{1}{2}\Delta W^T F_{RW}^T F_{RR}^{-1}F_{RW}\Delta W - \Delta W^T F_{RW}^T F_{RR}^{-1}F_{RW}\Delta W + \frac{1}{2}\Delta W^T F_{WW}\Delta W \tag{71}$$

Now the excess discrimination invariant is

$$K(P_1, P_0) = I(P_1, P_0) - I(P_1^*, P_0) = \frac{1}{2}\left(\Delta R + F_{RR}^{-1}F_{RW}\Delta W\right)^T F_{RR}\left(\Delta R + F_{RR}^{-1}F_{RW}\Delta W\right) \tag{72}$$

In the simple constant velocity probability model of (10) the response, or un-constrained, component vector $R$ is composed of components of $X$ and $V$, while $W$ is composed of components of $t$, $\sigma_t$, $C$. The second order excess discrimination (72) for the constant velocity model is

$$K(P_1, P_0) = \frac{1}{2}(\Delta X - V_0\Delta t)^T C_0^{-1}(\Delta X - V_0\Delta t) + \frac{1}{2}\sigma_{t0}^2\,\Delta V^T C_0^{-1}\Delta V \tag{73}$$

The correct value of $K$ given by (16) differs from (73) by replacement of $\sigma_{t0}$ with $\sigma_{t1}$. Since $\sigma_{t1}/\sigma_{t0}$ is not invariant under the equivalence transformation, the transformation does not exactly keep (73) invariant, but only to a second order approximation. The discrepancy also implies that the transformation that preserves invariance of (73) is only an approximation to the correct transformation. Extension of (68) to third order terms recovers (16). Despite initial appearances (16) is actually a third order invariant since $\sigma_{t1}^2 = \sigma_{t0}^2 + \Delta\sigma_t^2$.

H. Equivalence Transformations Recap

Equivalence transformations are defined on parameters of a probability model by properties which include invariance of information in a deviation from parameter values expected after a controlled parameter component shift. The invariant information can be used to determine if parameter estimates conform to expected system behavior, so equivalence transformations keep related statistical decision processes invariant. Remaining properties of equivalence transformations define relations between the transformations and structures of transformation domain and range needed to generate valid probability models for all elements of the domain. Some of the significant concepts, properties and mathematical tools developed to support them are:

1. The property of independent observer scrutiny requires that collections of equivalence transformations contain transformations other than identity. Independent observers are needed to confirm a theory that does not support identification of the observed system independent of the applied model.

2. Adjustable parameter constraints allow parameter shifts to be defined in an indexed set theory context. A parameter value acts as index to an adjustable constraint set element, which is a constraint set and the value of a constraint.

3. The time postulate requires that observation time parameters be constraints in an equivalent observer collection. Under the postulate, time as control is the essential characteristic which distinguishes time from other quantities.

Statistical samples on a high correlation coefficient probability model show that a multi-variate normal density in observed position and time random variables is a reasonable representation of uniform motion.
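As a concrete check of the velocity equation in (67) and of the boundary-to-boundary mapping derived in section F, the sketch below evaluates $V'$ for several velocities with $|V| = c$. It is illustrative only: it assumes $\zeta = 1$ and $O = 1$ (no unit rescaling or extra rotation), uses arbitrary numbers for $c$ and $s$, and the function names `dot` and `boost_velocity` are hypothetical.

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def boost_velocity(V, s, c):
    """V' of (67) with zeta = 1 and O = identity (choices made for this sketch)."""
    gamma = 1.0 / math.sqrt(1.0 - dot(s, s) / c**2)
    d = [v - si for v, si in zip(V, s)]               # V - s
    # Apply (1 + (gamma - 1) s s^T / s^T s) to V - s.
    proj = (gamma - 1.0) * dot(s, d) / dot(s, s)
    boosted = [di + proj * si for di, si in zip(d, s)]
    theta = 1.0 - dot(s, V) / c**2                    # 1 - s^T V / c^2
    return [b / (gamma * theta) for b in boosted]

c = 2.0                       # velocity bound, arbitrary units
s = [0.6, -0.8, 0.5]          # frame velocity with |s| < c
directions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
              [0.6, 0.0, 0.8], [-0.48, 0.64, 0.6]]    # unit vectors

boundary_speeds = []
for n in directions:
    V = [c * x for x in n]    # a velocity on the boundary |V| = c
    Vp = boost_velocity(V, s, c)
    boundary_speeds.append(math.sqrt(dot(Vp, Vp)))

# A velocity strictly inside the bound should stay inside it.
inner = boost_velocity([0.3, 0.2, -0.1], s, c)
inner_speed = math.sqrt(dot(inner, inner))
```

Every entry of `boundary_speeds` equals `c` to rounding error, and `inner_speed` stays below `c`, in line with the statement that the boundary of the domain maps into the boundary of the range.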
Application of the properties of equivalence transformations to this probability model results in equivalence transformations which are Lorentz transformations of mean observed position and time, Lorentz transformation of velocity, and specific transformations of the co-variance matrix and standard deviation of observation times. The elliptic information invariant is conceptually simpler than the Minkowski space-time interval, though in higher dimension than four since there are more parameter components. Structural properties on parameter domain and range require that velocity magnitude be bounded, without introduction of any external light speed concept.

Works Cited

[1] J. Hauth, "Grey-Box Modelling for Nonlinear Systems," Dissertation im Fachbereich Mathematik der Technischen, December 2008.
[2] K. V. Chernyshov, "An Information Theoretic Approach to System Identification via Input/Output Signal Processing," in SPECOM'2006, St. Petersburg, Russia, 2006.
[3] K. Petr and J. Vlcek, "On Improving Probability Models in System Identification," Mathematical Modeling, vol. 7, no. 1, pp. 182-186, 2012.
[4] S. Kullback, Information Theory and Statistics, Mineola, NY: Dover Publications, 1997, p. 399.
[5] P. A. Hoehn and M. P. Mueller, "An operational approach to spacetime symmetries: Lorentz transformations from quantum communication," 1 February 2016. [Online]. Available: http://arxiv.org/abs/1412.8462v4.
[6] O. Certik, "Simple Derivation of the special relativity without the speed of light axiom," 7 10 2007. [Online]. Available: http://arxiv.org/pdf/0710.3398.pdf.
[7] A. Pelissetto and T. Massimo, "Getting the Lorentz transformations without requiring an invariant speed," 2015.
[8] B. Braverman, S. Ravets, S. Kocsis, M. J. Stevens, R. P. Mirin, L. K. Shalm and A. M. Steinberg, "Observing the Average Trajectories of Single Photons in a Two-Slit Interferometer," Science, vol. 332, no. 6034, pp. 1170-1173, 3 June 2011.
[9] "Lorentz transformations in three dimensions," 22 6 2011. [Online]. Available: http://www.physicspages.com/2011/06/22/lorentztransformations-in-three-dimensions/.
[10] S. Kullback, "Fisher's Information," in Information Theory and Statistics, Mineola, NY: Dover, 1997, pp. 26-28.
