1 What is covariance?

1.1 Formal explanation

1.1.1 Probability distribution and measurements

Let X and Y be random variables described by a probability density ρ(X, Y), which should be normalized to unity (∬ ρ(X, Y) dX dY = 1). The probability density can be thought of as the underlying law of the physical behavior, while X and Y are just the observable symptoms. By measuring X and Y we throw the dice, and the probability of observing the result (X, Y) is p(X, Y) = ρ(X, Y) dX dY. For every variable F = F(X, Y) dependent on (X, Y) we can calculate its expectation value:

  ⟨F⟩ = ∬ F(X, Y) ρ(X, Y) dX dY .

In practice, the probability distribution ρ(X, Y) is usually unknown, and in the experiment we observe (= measure) random realizations of the random variables (X, Y). If we do N measurements, we will get a set of results

  {(X_i, Y_i)}_{i=1..N} = {(X_1, Y_1), (X_2, Y_2), ..., (X_N, Y_N)} .

The probability distribution will be "encoded" inside the measurements, therefore we can calculate any variable's F expectation value as simply as

  ⟨F⟩ ≈ (1/N) Σ_{i=1..N} F(X_i, Y_i) .

There is a lot of useful stuff that can be calculated, but we will focus only on a few chosen quantities.

1.1.2 Moments

Moments describe how one of the variables (X or Y) is distributed; using them we can characterize the otherwise unknown "projections" of ρ(X, Y) onto X or Y:

  ρ_X(X) = ∫ ρ(X, Y) dY ,
  ρ_Y(Y) = ∫ ρ(X, Y) dX .

The distributions ρ_X(X) and ρ_Y(Y) are called marginal densities. There are quite a few different types of moments, but we will remember only the two main types.

• The raw moments are calculated as

    ν_k(X) = ⟨X^k⟩ = ∬ X^k ρ(X, Y) dX dY = ∫ X^k ρ_X(X) dX .

  The most important raw moment is the variable's expectation value:

    ν_1(X) = ⟨X⟩ = ∫ X ρ_X(X) dX ≈ (1/N) Σ_{i=1..N} X_i .

  It characterizes the "center of mass" of the distribution, which, however, is not necessarily the most probable outcome of the measurement. Usually ν_1(X) is denoted as µ_X, and due to the lack of letters in the Latin+Greek alphabets we will be very confused by the next item in the list.

• The central moments are the second important parameters:

    µ_k(X) = ⟨(X − ⟨X⟩)^k⟩ = ∫ (X − ⟨X⟩)^k ρ_X(X) dX .

  The µ_1(X) = ⟨X − ⟨X⟩⟩ = 0 is really boring, however

    µ_2(X) = ⟨(X − ⟨X⟩)²⟩ = ⟨X² − 2X⟨X⟩ + ⟨X⟩²⟩ = ⟨X²⟩ − 2⟨X⟩⟨X⟩ + ⟨X⟩² = ⟨X²⟩ − ⟨X⟩² = var(X)

  is really cool and is known as the variance of X, hence the notation var(X). (Here we used ⟨1⟩ = ∫ ρ_X(X) dX = ∬ ρ(X, Y) dX dY = 1, which holds by definition.) The variance is closely related to the standard deviation, which is defined as

    σ_X = √var(X) = √(⟨X²⟩ − ⟨X⟩²) .

Both the variance and the standard deviation show how the variable is "spread" around the expectation value. The best illustration of these concepts is the Gaussian distribution (see Fig. 1)

  ρ_X(X) = 1/√(2π σ_X²) · exp(−(X − µ_X)² / (2σ_X²)) .   (1)

Figure 1: Gaussian distribution described by Eq. (1); the mean µ_X and the FWHM ≈ 2.355 σ_X are indicated.

The expectation value of X is ⟨X⟩ = µ_X and the variance is var(X) = σ_X² (i.e. σ_X is the standard deviation). But please always remember the following: the FWHM (full width at half maximum) of the Gaussian distribution is NOT the standard deviation σ_X. Since exp(−(X − µ_X)² / (2σ_X²)) ∈ (0; 1], the condition for the FWHM is

  1/2 = exp(−(X − µ_X)² / (2σ_X²)) .

The solutions of this equation are

  X̃_± = (X − µ_X)_± = ±√(2 ln 2) σ_X ,

thus, taking the difference between the edges of the Gaussian distribution at half-way-to-maximum, we get the FWHM:

  FWHM = X̃_+ − X̃_− = 2√(2 ln 2) σ_X ≈ 2.354820045 · σ_X .
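As a quick numerical check of these definitions, here is a minimal sketch in Python/NumPy (the values of µ_X, σ_X and the sample size N are arbitrary illustration choices, not taken from the text): it estimates the expectation value, the variance and the FWHM from simulated measurements of a Gaussian variable.

    import numpy as np

    rng = np.random.default_rng(42)
    mu_X, sigma_X = 3.0, 1.5        # "true" Gaussian parameters (arbitrary choice)
    N = 100_000                     # number of simulated measurements

    X = rng.normal(mu_X, sigma_X, N)        # the measured realizations X_i

    # Moments estimated as 1/N sums over the measurements
    mean_X = X.mean()                       # nu_1 = <X>, should be close to mu_X
    var_X = (X**2).mean() - mean_X**2       # mu_2 = <X^2> - <X>^2, close to sigma_X^2
    std_X = np.sqrt(var_X)

    fwhm = 2 * np.sqrt(2 * np.log(2)) * std_X   # should be close to 2.355 * sigma_X
    print(mean_X, std_X, fwhm)

With N = 100 000 samples the estimates typically agree with µ_X, σ_X and 2.355 σ_X to within a fraction of a percent, illustrating how the 1/N sums approximate the true moments.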
1.1.3 Covariance

If we want to know something about how the variables X and Y are distributed jointly, we can use a quantity called covariance. It is calculated as

  cov(X, Y) = ⟨(X − ⟨X⟩) · (Y − ⟨Y⟩)⟩ = ⟨XY⟩ − ⟨X⟩⟨Y⟩ .

It is fairly simple to compute from the measurements:

  cov(X, Y) ≈ (1/N) Σ_i X_i Y_i − [(1/N) Σ_i X_i] · [(1/N) Σ_i Y_i] ,

where the first term approximates ⟨XY⟩ and the two bracketed sums approximate ⟨X⟩ and ⟨Y⟩. Such a calculation is usually done by special functions in high-level languages (in MATLAB it is cov, and in Python one can use numpy.cov).

To understand the meaning of this relation we need the concept of independent events. The events A and B are independent if their joint probability is p(A ∩ B) = p(A) · p(B). In the case of a continuous distribution ρ(X, Y) this means factorizability of the probability density function:

  ρ(X, Y) = ρ_X(X) · ρ_Y(Y) .

This means that

  ⟨X^n Y^m⟩ = ∬ X^n Y^m ρ_X(X) ρ_Y(Y) dX dY = ∫ X^n ρ_X(X) dX · ∫ Y^m ρ_Y(Y) dY = ⟨X^n⟩ ⟨Y^m⟩ ,

therefore ⟨XY⟩ = ⟨X⟩⟨Y⟩, and thus for independent X and Y we get cov(X, Y) = 0.

Figure 2: Prototypical scheme to measure the entanglement of two electrons in the singlet state (|↑_A ↓_B⟩ − |↓_A ↑_B⟩)/√2: the source sends one electron to detector "A" and the other to detector "B".

The covariance normalized by the standard deviations is called the Pearson correlation coefficient:

  ρ_{X,Y} = cov(X, Y)/(σ_X · σ_Y) = cov(X, Y)/√(var(X) · var(Y)) = (⟨XY⟩ − ⟨X⟩⟨Y⟩)/√((⟨X²⟩ − ⟨X⟩²)(⟨Y²⟩ − ⟨Y⟩²)) .

Statisticians love this one more than the pure covariance; in physics, however, it is completely the opposite.

1.2 Simple example of how covariance works

Let us consider a prototypical quantum cryptography scheme (1) (see Fig. 2):

• a source that produces two electrons flying in opposite directions,
• two detectors ("A" and "B") that catch the electrons and measure their spin along the vertical axis.

In this system there are four possible measurement outcomes:

1. ↑_A and ↑_B with spins s_A = +1/2 and s_B = +1/2,
2. ↑_A and ↓_B with s_A = +1/2 and s_B = −1/2,
3. ↓_A and ↑_B with s_A = −1/2 and s_B = +1/2,
4. ↓_A and ↓_B with s_A = −1/2 and s_B = −1/2,

where the lower index denotes the result of the observation on detector A or B. There are three basic situations that can happen.

1.2.1 Anticorrelated electrons

The source can produce an entangled pair of electrons created in the singlet state

  |ψ⟩ = (|↑_A ↓_B⟩ − |↓_A ↑_B⟩)/√2 .

In this case the probabilities of the outcomes will be p(↑_A ↑_B) = p(↓_A ↓_B) = 0 and p(↑_A ↓_B) = p(↓_A ↑_B) = 1/2. The average spin orientations ⟨s_A⟩ and ⟨s_B⟩ measured at the detectors will be

  ⟨s_A⟩ = (+1/2) p(↑_A ↑_B) + (+1/2) p(↑_A ↓_B) + (−1/2) p(↓_A ↑_B) + (−1/2) p(↓_A ↓_B) = 0 ,
  ⟨s_B⟩ = (+1/2) p(↑_A ↑_B) + (−1/2) p(↑_A ↓_B) + (+1/2) p(↓_A ↑_B) + (−1/2) p(↓_A ↓_B) = 0 .

However, the cross-term for the entangled state will be

  ⟨s_A s_B⟩ = (+1/2)(+1/2) p(↑_A ↑_B) + (+1/2)(−1/2) p(↑_A ↓_B) + (−1/2)(+1/2) p(↓_A ↑_B) + (−1/2)(−1/2) p(↓_A ↓_B) = −1/4 ,

and thus

  cov(s_A, s_B) = ⟨s_A s_B⟩ − ⟨s_A⟩⟨s_B⟩ = −1/4 .

Negative covariance means that if the value of s_A is positive, then the value of s_B is probably negative.
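A minimal Monte Carlo sketch of this situation (Python/NumPy; the number of simulated measurements is an arbitrary choice) draws outcomes with the singlet probabilities given above and recovers cov(s_A, s_B) ≈ −1/4 from the samples:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 100_000   # number of simulated measurements (arbitrary choice)

    # The four outcomes (s_A, s_B) and their singlet-state probabilities:
    # only (up_A, down_B) and (down_A, up_B) can occur.
    outcomes = np.array([(+0.5, +0.5), (+0.5, -0.5), (-0.5, +0.5), (-0.5, -0.5)])
    probs = np.array([0.0, 0.5, 0.5, 0.0])

    draws = rng.choice(len(outcomes), size=N, p=probs)
    sA, sB = outcomes[draws, 0], outcomes[draws, 1]

    # cov(sA, sB) = <sA sB> - <sA><sB>, estimated from the sample
    cov = np.mean(sA * sB) - np.mean(sA) * np.mean(sB)
    print(cov)   # close to -0.25

Replacing probs with [0.25, 0.25, 0.25, 0.25] or [0.5, 0.0, 0.0, 0.5] reproduces the non-correlated and positively correlated cases discussed in the next two subsections.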
1.2.2 Non-correlated electrons

If we break the entanglement we will get independent electrons, which will give equal probabilities for all four outcomes: p(↑_A ↑_B) = p(↓_A ↓_B) = p(↑_A ↓_B) = p(↓_A ↑_B) = 1/4.

This will not change the expectation values of the single-detector measurements:

  ⟨s_A⟩ = (+1/2)(1/4) + (+1/2)(1/4) + (−1/2)(1/4) + (−1/2)(1/4) = 0 ,
  ⟨s_B⟩ = (+1/2)(1/4) + (−1/2)(1/4) + (+1/2)(1/4) + (−1/2)(1/4) = 0 ,

but the cross-term will now change:

  ⟨s_A s_B⟩ = (+1/2)(+1/2)(1/4) + (+1/2)(−1/2)(1/4) + (−1/2)(+1/2)(1/4) + (−1/2)(−1/2)(1/4) = 0 ,

and we will see the independence of the electrons as

  cov(s_A, s_B) = ⟨s_A s_B⟩ − ⟨s_A⟩⟨s_B⟩ = 0 .

1.2.3 Positively correlated electrons

However, we can set the source to produce entangled electrons from a triplet state, which can be one of three possibilities:

  |ψ⟩ = |↑_A ↑_B⟩ ,  (|↑_A ↓_B⟩ + |↓_A ↑_B⟩)/√2 ,  or  |↓_A ↓_B⟩ .

This will give the set of probabilities p(↑_A ↑_B) = p(↓_A ↓_B) = 1/2 and p(↑_A ↓_B) = p(↓_A ↑_B) = 0. The ⟨s_A⟩ and ⟨s_B⟩ will still be zero, but the cross-term

  ⟨s_A s_B⟩ = (+1/2)(+1/2)(1/2) + (+1/2)(−1/2)·0 + (−1/2)(+1/2)·0 + (−1/2)(−1/2)(1/2) = +1/4

will be positive, and thus the covariance

  cov(s_A, s_B) = ⟨s_A s_B⟩ − ⟨s_A⟩⟨s_B⟩ = +1/4 .

This positive correlation means that when s_A is positive, s_B is also positive.

1.2.4 All together now!

To summarize:

• negative covariance = anti-relation between the variables (when one is increased, the other is decreased, and vice versa),
• zero covariance = independent variables,
• positive covariance = when one value is increased, the second is increased as well (and vice versa).

2 TOF-TOF covariance maps

2.1 What is a TOF covariance map?

First, we need to bring back the memory of what time-of-flight (TOF) mass spectrometry (MS) is. The simplest setup (see Fig. 3) consists of two plates, charged positively and negatively. This capacitor-like construction forms a uniform electric field. When the ions are produced at a distance L from the detector (which sits close to the negative plate), they are accelerated by the Lorentz force F = qE towards the negative plate. Depending on the mass-to-charge ratio, the time from ion production to registration will be different: the heavier ions will come later than the lighter ones. This simple scheme provides a very fine mass-to-charge resolution. In the end, the MS signal is I(TOF), the ion current as a function of TOF.

Figure 3: Simplest time-of-flight (TOF) spectrometer scheme and the trajectories of the two fragments "A" (m_A, q_A) and "B" (m_B, q_B) in the Coulomb explosion reaction. L is the distance to the detector, E is the electric field strength along x (perpendicular to the detector), v̄ is the particle velocity and v its projection on the electric field direction, m and q are the mass and the charge of the ion.

When it is digitized, the continuous signal turns into a discrete vector-like quantity:

  I = (I_1, I_2, ..., I_M) ,  I_α = I(TOF_α) ,

where TOF_α = TOF_1 + dt·(α − 1), dt is the discretization time step and M is the number of MS points taken. We can perform N measurements of the TOF MS, which will give us a set of independent results {I_i = (I_{1i}, I_{2i}, ..., I_{Mi}) = {I_{αi}}_{α=1..M}}_{i=1..N}. Based on this ensemble of points we can calculate an M × M covariance map (4; 5) consisting of the elements

  C_{αβ} = cov(I_α, I_β) = (1/N) Σ_i I_{αi} I_{βi} − [(1/N) Σ_i I_{αi}] · [(1/N) Σ_i I_{βi}] .   (2)
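A minimal sketch of how Eq. (2) can be evaluated in practice (Python/NumPy; the array name spectra and the synthetic test data are illustrative assumptions, not part of the original text):

    import numpy as np

    def covariance_map(spectra: np.ndarray) -> np.ndarray:
        """TOF-TOF covariance map, Eq. (2).

        spectra : (N, M) array -- N single-shot spectra with M TOF bins each.
        Returns the (M, M) matrix C[a, b] = cov(I_a, I_b).
        """
        N = spectra.shape[0]
        mean_I = spectra.mean(axis=0)            # <I_a>, length-M vector
        second_moment = spectra.T @ spectra / N  # <I_a I_b> as an (M, M) matrix
        return second_moment - np.outer(mean_I, mean_I)

    # Synthetic example: 1000 shots, 256 TOF bins of Poissonian noise
    rng = np.random.default_rng(1)
    spectra = rng.poisson(lam=2.0, size=(1000, 256)).astype(float)
    C = covariance_map(spectra)
    print(C.shape)   # (256, 256)

Note that numpy.cov uses the unbiased 1/(N−1) normalization by default, whereas Eq. (2) uses 1/N; for the large N typical of covariance mapping the difference is negligible.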
From the meaning of the covariance we can expect to see signatures of the underlying reactions.

• If fragments A and B are formed independently, the corresponding part of the map will show cov(I(A), I(B)) = 0.
• If fragment A is formed from the parent ion B, an increase of A will deplete the amount (= intensity) of the parent and, vice versa, less fragmentation of B will yield smaller amounts of observed A. Therefore we expect cov(I(A), I(B)) < 0 in this part of the map.
• If A and B are produced in the same reaction sequence from the same parent species, an increase/decrease of one ion will be noticed in the connected fragment, therefore cov(I(A), I(B)) > 0.

However, life is a little bit more interesting than that.

2.2 Coulomb explosions

The Coulomb explosion is a reaction of the kind

  AB^(q+) → A^(q_A+) + B^(q_B+) .

We then have two ions flying away from each other (Fig. 3). According to the momentum conservation law, the final velocities of the fragments (v̄_A for A^(q_A+) and v̄_B for B^(q_B+)) should be related to the initial velocity v̄_AB of AB^(q+) as

  (m_A + m_B) v̄_AB = m_AB v̄_AB = m_A v̄_A + m_B v̄_B ,

where the m's are the masses of the parent and the fragments. Let us consider only the projection of the velocities on the electric field direction (v, see Fig. 3):

  m_AB v_AB = m_A v_A + m_B v_B .   (3)

Each of the fragments will move towards the detector according to Newton's second law m ẍ = F = qE, where q is the charge of the ion and E is the electric field strength (we consider only the electric field direction). We can solve this taking the initial position of the ion x = 0 and its initial velocity v:

  x(t) = v·t + a·t²/2 ,  a = qE/m ,

where a is the ion acceleration in the constant electric field. We need to know the time when the ion reaches the detector located at a distance L from the reaction place. It can be found from x(t) = L, which is a quadratic equation for the TOF t:

  (a/2)·t² + v·t − L = 0 .

The physical solution is

  t = (√(2aL + v²) − v)/a .

Since the applied field is strong enough (2aL ≫ v²), we can write √(2aL + v²) ≈ √(2aL), thus

  t = √(2L/a) − v/a = t_0 + δt ,

where

  t_0 = √(2L/a) = √((2L/E)·(m/q))

is the time when a velocity-less ion reaches the detector (strictly speaking, it is the average arrival time for this sort of ions), and

  δt = −v/a = −m·v/(q·E)   (4)

is the TOF drift. Obviously, the ions that are already flying towards the detector (v > 0) will reach the final destination earlier (t < t_0), and those that initially fly away from the detector (v < 0) will come later (t > t_0).

Figure 4: Formation of the slope in a Coulomb explosion on the covariance maps: the fast and slow arrival times of the two fragments, centered at t_0A and t_0B on the two time-of-flight axes, line up along a line whose slope is set by the charge ratio of the fragments.

We want to know how the arrival times of A^(q_A+) and B^(q_B+) are connected (= correlated). From Eq. (3) we can write

  v_B = m_AB·v_AB/m_B − (m_A/m_B)·v_A .

The time drift (Eq. 4) for A^(q_A+) is δt_A = −m_A·v_A/(q_A·E), while for B^(q_B+) it is

  δt_B = −m_B·v_B/(q_B·E) = −m_AB·v_AB/(q_B·E) + (q_A/q_B)·m_A·v_A/(q_A·E) = δt_AB − (q_A/q_B)·δt_A ,

where δt_AB = −m_AB·v_AB/(q_B·E). This gives the TOFs for the ions A^(q_A+) and B^(q_B+):

  t_A = t_0A + δt_A ,
  t_B = t_0B + δt_AB − (q_A/q_B)·δt_A .

The times t_A and t_B are connected through the variable δt_A, which reflects the initial velocity of fragment A^(q_A+). Thus the positive covariance in the coordinates (t_A, t_B) (i.e. in the coordinates of the covariance map, (TOF_α, TOF_β)) for the ions A^(q_A+) and B^(q_B+) will look like a straight line with slope −q_A/q_B (see Fig. 4). These straight lines with negative slopes are the clear signatures of Coulomb explosions. If there are some intermediate or later reactions, the shape of the covariances will change in very weird ways (see (2) and (3) for more information).
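A small numerical sketch of this result (Python/NumPy; the masses, charges, field strength and velocity spread are made-up illustration values, not data from the text) confirms that the arrival times of two fragments from a parent at rest fall on a line of slope −q_A/q_B:

    import numpy as np

    # Illustrative (non-physical) parameters
    mA, qA = 20.0, 1.0      # fragment A: mass and charge
    mB, qB = 80.0, 2.0      # fragment B: mass and charge
    E, L = 1000.0, 1.0      # extraction field and flight distance
    N = 5000                # number of simulated explosions

    rng = np.random.default_rng(2)
    # Parent at rest (v_AB = 0); momentum conservation: mA*vA + mB*vB = 0
    vA = rng.normal(0.0, 0.5, N)
    vB = -mA * vA / mB

    def tof(m, q, v):
        """Exact arrival time from (a/2) t^2 + v t - L = 0 with a = qE/m."""
        a = q * E / m
        return (np.sqrt(2 * a * L + v**2) - v) / a

    tA, tB = tof(mA, qA, vA), tof(mB, qB, vB)

    slope = np.polyfit(tA, tB, 1)[0]
    print(slope, -qA / qB)   # fitted slope vs. the expected -qA/qB

With the chosen (strong-field) parameters the fitted slope reproduces −q_A/q_B to good accuracy; weakening the field makes the 2aL ≫ v² approximation break down and the line acquires a slight curvature.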
2.3 Partial covariance maps

Free-electron lasers (FELs) are known to have large fluctuations of the pulse energies. Let us consider the ion yields of two ions, A and B. Assuming that the formation of all the ions happens as a result of a one-photon process, we can assume that the ion yields will be proportional to the power of the FEL (W):

  A = a·W ,  B = b·W ,

where a and b are the true random ion yields that depend only on the intramolecular random variables, i.e. a and b are independent of W. This means that ⟨a^n W^m⟩ = ⟨a^n⟩⟨W^m⟩. Let us calculate the covariance of A and B that we will get from the covariance maps:

  cov(A, B) = ⟨AB⟩ − ⟨A⟩⟨B⟩ = ⟨ab⟩⟨W²⟩ − ⟨a⟩⟨W⟩·⟨b⟩⟨W⟩ = ⟨ab⟩⟨W²⟩ − ⟨a⟩⟨b⟩⟨W⟩² .

But in reality we want to have

  cov(a, b) = ⟨ab⟩ − ⟨a⟩⟨b⟩ ,

so let's clean it up in cov(A, B). To do that we need to add and subtract the missing ⟨a⟩⟨b⟩⟨W²⟩, and by doing that we get

  cov(A, B) = ⟨ab⟩⟨W²⟩ − ⟨a⟩⟨b⟩⟨W²⟩ + ⟨a⟩⟨b⟩(⟨W²⟩ − ⟨W⟩²) = ⟨W²⟩·cov(a, b) + ⟨a⟩⟨b⟩·var(W) .

The first term is what we want: it represents the true intramolecular relation between the ions. The second term is called false covariance: it gives a nonzero covariance even if the ions in reality are unrelated (cov(a, b) = 0). To extract the first term we need to express the second addend in terms of the observables A, B, W.

Let us calculate the covariances of A and B with W:

  cov(A, W) = ⟨AW⟩ − ⟨A⟩⟨W⟩ = ⟨a⟩⟨W²⟩ − ⟨a⟩⟨W⟩·⟨W⟩ = ⟨a⟩·var(W) ,
  cov(B, W) = ⟨BW⟩ − ⟨B⟩⟨W⟩ = ⟨b⟩⟨W²⟩ − ⟨b⟩⟨W⟩·⟨W⟩ = ⟨b⟩·var(W) ,

therefore

  ⟨a⟩⟨b⟩·var(W) = cov(A, W)·cov(W, B) / var(W) .

Now we can calculate ⟨W²⟩·cov(a, b), which is called the partial covariance (pcov), as

  pcov(A, B) = cov(A, B) − cov(A, W)·cov(W, B) / var(W) .

In principle, this value represents the "true" covariance of the ionic signals.

If we have the same measured data {I_i = (I_{1i}, I_{2i}, ..., I_{Mi}) = {I_{αi}}_{α=1..M}}_{i=1..N} as in section 2.1, together with the FEL power W_i recorded for each shot, we would have to calculate the matrix elements

  P_{αβ} = C_{αβ} − [(1/N) Σ_i I_{αi} W_i − ((1/N) Σ_i I_{αi})·((1/N) Σ_i W_i)] · [(1/N) Σ_i I_{βi} W_i − ((1/N) Σ_i I_{βi})·((1/N) Σ_i W_i)] / [(1/N) Σ_i W_i² − ((1/N) Σ_i W_i)²] ,

where C_{αβ} is the covariance map defined by Eq. (2). If we denote the M × M TOF-TOF covariance matrix as cov(I, I) and define a vector of length M for the covariance of the ion intensities with the FEL power W,

  cov(I, W) = (cov(I_1, W), cov(I_2, W), ..., cov(I_M, W)) ,

then we can write the partial covariance matrix pcov(I, I) as

  pcov(I, I) = cov(I, I) − cov(I, W) ⊗ cov(W, I) / var(W) ,

where ⊗ is the outer product. When using high-level languages (such as MATLAB and Python), please use this or an equivalent expression, otherwise the calculation of the partial covariance will take forever.
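A minimal vectorized sketch of this formula (Python/NumPy; the array names spectra and W, the (N, M) layout reused from the covariance-map sketch above, and the synthetic test data are all illustrative assumptions):

    import numpy as np

    def partial_covariance_map(spectra: np.ndarray, W: np.ndarray) -> np.ndarray:
        """pcov(I, I) = cov(I, I) - outer(cov(I, W), cov(W, I)) / var(W).

        spectra : (N, M) array of single-shot spectra; W : (N,) array of FEL powers.
        """
        N = spectra.shape[0]
        mean_I = spectra.mean(axis=0)
        mean_W = W.mean()

        cov_II = spectra.T @ spectra / N - np.outer(mean_I, mean_I)   # Eq. (2)
        cov_IW = spectra.T @ W / N - mean_I * mean_W                  # length-M vector
        var_W = (W**2).mean() - mean_W**2

        return cov_II - np.outer(cov_IW, cov_IW) / var_W

    # Synthetic check: two unrelated "ion yields" driven by the same fluctuating power
    rng = np.random.default_rng(3)
    W = rng.gamma(5.0, 1.0, size=20_000)
    spectra = np.column_stack([W * rng.normal(1.0, 0.1, 20_000),
                               W * rng.normal(2.0, 0.1, 20_000)])
    print(np.cov(spectra.T, bias=True)[0, 1])          # large false covariance
    print(partial_covariance_map(spectra, W)[0, 1])    # close to zero

In this synthetic test the off-diagonal element of the plain covariance is dominated by the false term ⟨a⟩⟨b⟩·var(W), while the partial covariance removes it, exactly as in the derivation above.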
In reality, the FEL power may not be the only random variable that creates false covariances. The molecular beam can fluctuate, the other lasers can also have power variations, the detection devices can do some stuff of their own, so a lot of possible things can happen. If there are multiple random variables {ξ_k}_{k=1..K} = ξ that are independent from each other (cov(ξ_k, ξ_l) = 0 for k ≠ l), then the generalized partial covariance can be calculated:

  pcov(I, I | ξ) = cov(I, I) − Σ_{k=1..K} cov(I, ξ_k) ⊗ cov(ξ_k, I) / var(ξ_k) .

However, the variables other than the FEL power are usually unknown, therefore people make the assumption that their correction terms look like the FEL power correction factor, i.e.

  cov(I, ξ_k) ⊗ cov(ξ_k, I) / var(ξ_k) ∝ cov(I, W) ⊗ cov(W, I) / var(W) ,

so to get rid of all the false covariances we can just take a partial covariance modified by a scale factor s > 1:

  pcov(I, I) = cov(I, I) − s · cov(I, W) ⊗ cov(W, I) / var(W) .

2.4 Real life example

During a FLASH beamtime we did XUV-IR destruction of different PAHs, in particular of fluorene (C13H10). It can go through the following reaction pathway:

  C13H10 + XUV → C13H10^(2+) + 2e− ,
  C13H10^(2+) → C_nH_x^+ + C_(13−n)H_(10−x)^+ ,

and the second step, a Coulomb explosion, can be registered in the TOF-TOF covariance maps. Fig. 5 shows the Coulomb explosions for n = 2, 3, 4.

• We can calculate the plain covariance map (top picture in Fig. 5), but the Coulomb explosion lines are just slightly above the false covariances.
• If we compute the partial covariance correcting for the FEL (XUV) intensity (middle picture in Fig. 5), the background noise decreases and the lines appear a little more clearly.
• But if we scale the XUV correction matrix to the optimum (bottom picture in Fig. 5), we get rid of most of the noise and the lines appear most vividly.

Figure 5: Fluorene (C13H10) TOF-TOF covariance, partial covariance and partial covariance with optimized scale coefficient (panels labelled cov, pcov and scaled pcov, from top to bottom).

References

[1] Ekert, A. K. Quantum cryptography based on Bell's theorem. Phys. Rev. Lett. 67 (Aug 1991), 661–663.

[2] Eland, J. The dynamics of three-body dissociations of dications studied by the triple coincidence technique PEPIPICO. Molecular Physics 61, 3 (1987), 725–745.

[3] Eland, J. H. D. Dynamics of fragmentation reactions from peak shapes in multiparticle coincidence experiments. Laser Chemistry 11 (1991), 259–263.

[4] Frasinski, L. J. Covariance mapping techniques. Journal of Physics B: Atomic, Molecular and Optical Physics 49, 15 (Jul 2016), 152004.

[5] Zhaunerchyk, V., Frasinski, L. J., Eland, J. H. D., and Feifel, R. Theory and simulations of covariance mapping in multiple dimensions for data analysis in high-event-rate experiments. Phys. Rev. A 89 (May 2014), 053418.