Propagation of Errors

Suppose we measure the branching fraction $BR(\mathrm{Higgs}\to\mu^+\mu^-)$ using the number of produced Higgs bosons ($N_{produced}$), the number of Higgs$\to\mu^+\mu^-$ decays found ($N_{found}$), and the efficiency for finding a Higgs$\to\mu^+\mu^-$ decay ($\varepsilon$):

$$BR(\mathrm{Higgs}\to\mu^+\mu^-) = \frac{N_{found}}{\varepsilon\, N_{produced}}$$

If we know the uncertainties ($\sigma$'s) of $N_{produced}$, $N_{found}$, and $\varepsilon$, what is the uncertainty on $BR(\mathrm{Higgs}\to\mu^+\mu^-)$?

More formally we could ask: given a functional relationship between several measured variables $(x, y, z)$, i.e. $Q = f(x, y, z)$, what is the uncertainty in $Q$ if the uncertainties in $x$, $y$, and $z$ are known?

Usually when we talk about the uncertainty in a measured variable such as $x$ we assume that the value of $x$ represents the mean of a Gaussian distribution and the uncertainty in $x$ is the standard deviation ($\sigma$) of that Gaussian distribution. A word of caution here: not all measurements can be represented by Gaussian distributions, but more on that later!

To answer this question we use a technique called Propagation of Errors.

880.P20 Winter 2006 Richard Kass

To calculate the variance in $Q$ as a function of the variances in $x$ and $y$ we use the following:

$$\sigma_Q^2 = \sigma_x^2\left(\frac{\partial Q}{\partial x}\right)^2 + \sigma_y^2\left(\frac{\partial Q}{\partial y}\right)^2 + 2\sigma_{xy}\left(\frac{\partial Q}{\partial x}\right)\left(\frac{\partial Q}{\partial y}\right)$$

Note: if $x$ and $y$ are uncorrelated ($\sigma_{xy} = 0$) then the last term in the above equation is 0.

Let's derive the above formula. Assume we have several measurements of the quantities $x$ (e.g. $x_1, x_2, \ldots, x_N$) and $y$ (e.g. $y_1, y_2, \ldots, y_N$). We can calculate the averages of $x$ and $y$ using:

$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i \quad\text{and}\quad \mu_y = \frac{1}{N}\sum_{i=1}^{N} y_i$$

Let's define:

$$Q_i \equiv f(x_i, y_i), \qquad Q \equiv f(\mu_x, \mu_y) \ \text{(evaluated at the average values)}$$

Now expand $Q_i$ about the average values:

$$Q_i = Q(\mu_x, \mu_y) + (x_i - \mu_x)\left.\frac{\partial Q}{\partial x}\right|_{\mu_x,\mu_y} + (y_i - \mu_y)\left.\frac{\partial Q}{\partial y}\right|_{\mu_x,\mu_y} + \text{higher order terms}$$

Assume we can neglect the higher order terms (i.e. the measured values are close to the average values). We can rewrite the above as:

$$Q_i - Q = (x_i - \mu_x)\left.\frac{\partial Q}{\partial x}\right|_{\mu_x,\mu_y} + (y_i - \mu_y)\left.\frac{\partial Q}{\partial y}\right|_{\mu_x,\mu_y}$$

We would like to find the variance of $Q$.
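By definition the variance of $Q$ is just:

$$\sigma_Q^2 = \frac{1}{N}\sum_{i=1}^{N}(Q_i - Q)^2$$

Note: to first order the average of a function is the function evaluated at its average value(s): $\langle Q\rangle = Q(\mu_x, \mu_y)$.

If we expand the summation using the expression for $Q_i - Q$ we get:

$$\sigma_Q^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_x)^2\left(\frac{\partial Q}{\partial x}\right)^2_{\mu_x} + \frac{1}{N}\sum_{i=1}^{N}(y_i-\mu_y)^2\left(\frac{\partial Q}{\partial y}\right)^2_{\mu_y} + \frac{2}{N}\sum_{i=1}^{N}(x_i-\mu_x)(y_i-\mu_y)\left(\frac{\partial Q}{\partial x}\right)_{\mu_x}\left(\frac{\partial Q}{\partial y}\right)_{\mu_y}$$

Since the derivatives are all evaluated at the average values $(\mu_x, \mu_y)$ we can pull them outside of the summations. Finally, remembering the definition of the variance, we can write:

$$\sigma_Q^2 = \sigma_x^2\left(\frac{\partial Q}{\partial x}\right)^2_{\mu_x} + \sigma_y^2\left(\frac{\partial Q}{\partial y}\right)^2_{\mu_y} + 2\left(\frac{\partial Q}{\partial x}\right)_{\mu_x}\left(\frac{\partial Q}{\partial y}\right)_{\mu_y}\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_x)(y_i-\mu_y)$$

If the measurements are uncorrelated then the summation in the above equation will be very close to zero (if the variables are truly uncorrelated then the sum is 0) and can be neglected. Thus for uncorrelated variables we have:

$$\sigma_Q^2 = \sigma_x^2\left(\frac{\partial Q}{\partial x}\right)^2_{\mu_x} + \sigma_y^2\left(\frac{\partial Q}{\partial y}\right)^2_{\mu_y} \qquad \text{(uncorrelated errors)}$$

If however $x$ and $y$ are correlated, then we define the COVARIANCE $\sigma_{xy}$ using:

$$\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_x)(y_i-\mu_y)$$

The variance in $Q$ including correlations is given by:

$$\sigma_Q^2 = \sigma_x^2\left(\frac{\partial Q}{\partial x}\right)^2_{\mu_x} + \sigma_y^2\left(\frac{\partial Q}{\partial y}\right)^2_{\mu_y} + 2\sigma_{xy}\left(\frac{\partial Q}{\partial x}\right)_{\mu_x}\left(\frac{\partial Q}{\partial y}\right)_{\mu_y} \qquad \text{(correlated errors)}$$

Example: Error in $BR(\mathrm{Higgs}\to\mu^+\mu^-)$. Assume: $N_{produced} = 100 \pm 10$, $N_{found} = 10 \pm 3$, $\varepsilon = 0.2 \pm 0.02$.

$$\sigma_{BR}^2 = \sigma_{N_{pr}}^2\left(\frac{\partial BR}{\partial N_{pr}}\right)^2 + \sigma_{N_{fd}}^2\left(\frac{\partial BR}{\partial N_{fd}}\right)^2 + \sigma_\varepsilon^2\left(\frac{\partial BR}{\partial \varepsilon}\right)^2 = \sigma_{N_{pr}}^2\left(\frac{N_{fd}}{\varepsilon N_{pr}^2}\right)^2 + \sigma_{N_{fd}}^2\left(\frac{1}{\varepsilon N_{pr}}\right)^2 + \sigma_\varepsilon^2\left(\frac{N_{fd}}{\varepsilon^2 N_{pr}}\right)^2$$

$$\sigma_{BR} = \sqrt{10^2\left(\frac{10}{0.2\times 10^4}\right)^2 + 9\left(\frac{1}{0.2\times 10^2}\right)^2 + (4\times 10^{-4})\left(\frac{10}{0.2^2\times 10^2}\right)^2} = 0.17$$

$$BR(\mathrm{Higgs}\to\mu^+\mu^-) = 0.5 \pm 0.2$$

Example: The error in the average.

The average of several measurements, each with the same uncertainty ($\sigma$), is given by:

$$\mu = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

$$\sigma_\mu^2 = \sigma_{x_1}^2\left(\frac{\partial\mu}{\partial x_1}\right)^2 + \sigma_{x_2}^2\left(\frac{\partial\mu}{\partial x_2}\right)^2 + \cdots + \sigma_{x_n}^2\left(\frac{\partial\mu}{\partial x_n}\right)^2 = \sigma^2\left(\frac{1}{n}\right)^2 + \sigma^2\left(\frac{1}{n}\right)^2 + \cdots + \sigma^2\left(\frac{1}{n}\right)^2 = n\sigma^2\left(\frac{1}{n}\right)^2$$

$$\sigma_\mu = \frac{\sigma}{\sqrt{n}} \qquad \text{("error in the mean")}$$

This is a very important result! It says that we can determine the mean better by combining measurements. Unfortunately, the precision only increases as the square root of the number of measurements.

Do not confuse $\sigma_\mu$ with $\sigma$!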
$\sigma$ is related to the width of the pdf (e.g. Gaussian) that the measurements come from. $\sigma$ does not get smaller as we combine measurements.

A slightly more complicated problem is the case of the weighted average, or unequal $\sigma$'s:

$$\mu = \frac{x_1/\sigma_1^2 + x_2/\sigma_2^2 + \cdots + x_n/\sigma_n^2}{1/\sigma_1^2 + 1/\sigma_2^2 + \cdots + 1/\sigma_n^2}$$

Using the same procedure as above we obtain:

$$\sigma_\mu^2 = \frac{1}{1/\sigma_1^2 + 1/\sigma_2^2 + \cdots + 1/\sigma_n^2} \qquad \text{("error in the weighted mean")}$$

Problems with Propagation of Errors

In calculating the variance using propagation of errors we usually assume that we are dealing with Gaussian errors for the measured variable (e.g. $x$). Unfortunately, just because $x$ is described by a Gaussian distribution does not mean that $f(x)$ will be described by a Gaussian distribution.

Example: when the new distribution is Gaussian.

Let $y = Ax$, with $A$ a constant and $x$ a Gaussian variable. Let the pdf for $x$ be Gaussian:

$$p(x, \mu_x, \sigma_x)\,dx = \frac{1}{\sigma_x\sqrt{2\pi}}\, e^{-\frac{(x-\mu_x)^2}{2\sigma_x^2}}\,dx$$

Then $y = Ax$, $\mu_y = A\mu_x$ and $\sigma_y = A\sigma_x$. Putting this into the above equation we have:

$$p(x, \mu_x, \sigma_x)\,dx = \frac{1}{(\sigma_y/A)\sqrt{2\pi}}\, e^{-\frac{(y/A-\mu_y/A)^2}{2(\sigma_y/A)^2}}\,\frac{dy}{A} = \frac{1}{\sigma_y\sqrt{2\pi}}\, e^{-\frac{(y-\mu_y)^2}{2\sigma_y^2}}\,dy = p(y, \mu_y, \sigma_y)\,dy$$

Thus the new pdf for $y$, $p(y, \mu_y, \sigma_y)$, is also given by a Gaussian probability distribution function.

[Figure: dN/dy versus y for y = 2x with x = 10 ± 2, so that $\sigma_y = 2\sigma_x = 4$.]

Start with a Gaussian with $\mu = 10$, $\sigma = 2$. Get another Gaussian with $\mu = 20$, $\sigma = 4$.

Example: when the new distribution is non-Gaussian.

Let $y = 2/x$. The transformed probability distribution function for $y$ does not have the form of a Gaussian pdf.

[Figure: dN/dy versus y for y = 2/x with x = 10 ± 2, where propagation of errors gives $\sigma_y = 2\sigma_x/x^2$.]

Start with a Gaussian with $\mu = 10$, $\sigma = 2$. DO NOT get another Gaussian! Get a pdf with $\mu = 0.2$, $\sigma = 0.04$. This new pdf has longer tails than a Gaussian pdf.
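The longer tail of y = 2/x can be checked without any simulation, because y > c is equivalent to 0 < x < 2/c for positive c. The following is a minimal Python sketch under that assumption; the helper name `gauss_cdf` and the variable names are illustrative and not part of the lecture. It compares the exact probability of exceeding $\mu_y + 5\sigma_y$ with what a true Gaussian in y would give.

```python
import math

def gauss_cdf(z):
    """Cumulative probability of a unit Gaussian below z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Input Gaussian for x, as in the example: mean 10, sigma 2.
mu_x, sigma_x = 10.0, 2.0

# Propagation-of-errors estimates for y = 2/x.
mu_y = 2.0 / mu_x                      # 0.2
sigma_y = 2.0 * sigma_x / mu_x**2      # 0.04

# Threshold five "standard deviations" above the central value of y.
threshold = mu_y + 5.0 * sigma_y

# y = 2/x > threshold  <=>  0 < x < 2/threshold, so integrate the x Gaussian.
x_cut = 2.0 / threshold
p_transformed = gauss_cdf((x_cut - mu_x) / sigma_x) - gauss_cdf((0.0 - mu_x) / sigma_x)

# What a genuinely Gaussian y would give for a one-sided 5 sigma excursion.
p_gaussian = 0.5 * math.erfc(5.0 / math.sqrt(2.0))

print(f"P(y > mu_y + 5 sigma_y), y = 2/x : {p_transformed:.2e}")
print(f"P(y > mu_y + 5 sigma_y), Gaussian: {p_gaussian:.2e}")
```

Under these assumptions the transformed pdf gives a tail probability of order $10^{-3}$, several orders of magnitude above the Gaussian expectation of about $3\times10^{-7}$.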
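For this pdf, Prob($y > \mu_y + 5\sigma_y$) $= 5\times10^{-3}$; for a Gaussian it would be $3\times10^{-7}$.

Unphysical situations can arise if we use the propagation of errors results blindly!

Example: Suppose we measure the volume of a cylinder: $V = \pi R^2 L$. Let $R = 1$ cm exactly, and $L = 1.0 \pm 0.5$ cm. Using propagation of errors we have:

$$\sigma_V = \pi R^2 \sigma_L = \pi/2\ \text{cm}^3 \quad\text{and}\quad V = \pi \pm \pi/2\ \text{cm}^3$$

However, if the error on $V$ ($\sigma_V$) is to be interpreted in the Gaussian sense, then the above result says that there is a finite probability (≈ 2.3%) that the volume is < 0, since $V$ is only two standard deviations away from 0! Clearly this is unphysical! Care must be taken in interpreting the meaning of $\sigma_V$.

Generalization of Propagation of Errors

We can generalize the propagation of errors formula:

$$\sigma_Q^2 = \sigma_x^2\left(\frac{\partial Q}{\partial x}\right)^2 + \sigma_y^2\left(\frac{\partial Q}{\partial y}\right)^2 + 2\sigma_{xy}\left(\frac{\partial Q}{\partial x}\right)\left(\frac{\partial Q}{\partial y}\right)$$

In matrix notation:

$$\sigma_Q^2 = \begin{pmatrix}\dfrac{\partial Q}{\partial x} & \dfrac{\partial Q}{\partial y}\end{pmatrix}\begin{pmatrix}\sigma_x^2 & \sigma_{xy}\\ \sigma_{xy} & \sigma_y^2\end{pmatrix}\begin{pmatrix}\dfrac{\partial Q}{\partial x}\\[2mm] \dfrac{\partial Q}{\partial y}\end{pmatrix}$$

We can generalize to any number of variables: $\sigma^2 = d^T V d$, with $d$ an N-dimensional vector of derivatives and $V$ an N×N matrix of variances and covariances. $V$ is often called the "error matrix" or "covariance matrix". $V$ is a symmetric matrix (N×N and $V = V^T$).

Example: Error in $BR(\mathrm{Higgs}\to\mu^+\mu^-)$. Assume: $BR = N_{fd}/(\varepsilon N_{pr})$.

$$\sigma_{BR}^2 = \begin{pmatrix}-\dfrac{N_{fd}}{\varepsilon N_{pr}^2} & \dfrac{1}{\varepsilon N_{pr}} & -\dfrac{N_{fd}}{\varepsilon^2 N_{pr}}\end{pmatrix}\begin{pmatrix}\sigma_{N_{pr}}^2 & 0 & 0\\ 0 & \sigma_{N_{fd}}^2 & 0\\ 0 & 0 & \sigma_\varepsilon^2\end{pmatrix}\begin{pmatrix}-\dfrac{N_{fd}}{\varepsilon N_{pr}^2}\\[2mm] \dfrac{1}{\varepsilon N_{pr}}\\[2mm] -\dfrac{N_{fd}}{\varepsilon^2 N_{pr}}\end{pmatrix}$$

$$\sigma_{BR}^2 = \sigma_{N_{pr}}^2\left(\frac{N_{fd}}{\varepsilon N_{pr}^2}\right)^2 + \sigma_{N_{fd}}^2\left(\frac{1}{\varepsilon N_{pr}}\right)^2 + \sigma_\varepsilon^2\left(\frac{N_{fd}}{\varepsilon^2 N_{pr}}\right)^2$$

A Real Life Example

We want to measure the branching fraction for $B^-\to D^0K^{*-}$. We can measure it using three different decay modes of the $D^0$: $D^0\to K\pi$, $D^0\to K\pi\pi^0$ and $D^0\to K3\pi$. For mode $k$, with $D^0\to X_k$:

$$B(B^-\to D^0_k K^{*-}) = \frac{N(D^0\to X_k)}{\varepsilon_k\, N_{B\bar{B}}\, f_{K^*}\, B(K^{*-})\, B(D^0\to X_k)}$$

                          D0→Kπ         D0→Kππ0       D0→K3π
  ε, efficiency (%)       13.30         4.60          8.82
  B(D0→X) (%)             3.80          12.84         7.46
  N, yield (events)       144.4±13.2    185.4±18.6    195.0±18.2
  B(B-→D0K*-) ×10^-4      5.15±0.45     5.65±0.54     5.34±0.48

The uncertainties in the table are statistical only. We also have to take into account systematic errors.

How should we combine the 3 measurements?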
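As a first and deliberately naive answer, one could ignore the systematic errors and apply the inverse-variance weighted mean for unequal σ's to the three statistical-only results in the table. A minimal Python sketch follows; the function name `weighted_mean` is illustrative and not from the lecture, and the inputs are the table values in units of $10^{-4}$.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty.

    mean  = sum(x_i / s_i^2) / sum(1 / s_i^2)
    sigma = 1 / sqrt(sum(1 / s_i^2))
    Assumes the measurements are uncorrelated.
    """
    weights = [1.0 / s**2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)

# Branching fractions (x 10^-4) and statistical errors for the three D0 modes.
values = [5.15, 5.65, 5.34]
sigmas = [0.45, 0.54, 0.48]

mean, err = weighted_mean(values, sigmas)
print(f"B = ({mean:.2f} +/- {err:.2f}) x 10^-4 (statistical only)")
```

This estimate is only a cross-check: it treats the three results as independent and leaves out the systematic errors and their correlations, which the procedure described next takes into account.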
Summary of Systematic Errors

[Table: summary of systematic errors. Methods listed for the individual sources: standard recipes; study data & MC, vary cuts; data ($B^-\to D^0\pi^-$) vs MC; lumi script; finite MC samples; PDG BF uncertainties; study data & MC, vary cuts. Values quoted under the "uncorrelated" and "correlated" headings: 6.6%, 6.2%, 3.1%, 5.2%, 4.7%, 7.3%.]

Some sources of systematic errors are correlated, some are not. Correlated errors wind up in the off-diagonal elements of the error matrix.

Since the 3 measurements have different precision we will do a weighted average. This is a bit tricky because we have statistical and systematic errors, and some of the systematic errors are correlated.

$$B(B^-\to D^0K^{*-}) = w_1 B(B^-\to D^0_{K\pi}K^{*-}) + w_2 B(B^-\to D^0_{K\pi\pi^0}K^{*-}) + w_3 B(B^-\to D^0_{K3\pi}K^{*-})$$

Follow the procedure outlined in Lyons et al., NIMA 270, 110 (1988). We want to find the weights ($w_i$) that minimize

$$\sigma^2 = w^T V w \quad\text{subject to}\quad \sum_i w_i = 1$$

(Here the derivative vector is just the weights.)

We can solve this problem using the Lagrange multiplier technique:

$$\sigma^2 = w^T V w + \lambda(w^T I - 1)$$

where $I$ is a vector of 1's. Setting the derivative with respect to $w$ to zero:

$$\frac{d\sigma^2}{dw} = 2Vw + \lambda I = 0 \quad\Rightarrow\quad w = -(\lambda/2)\,V^{-1} I$$

Need to find the multiplier, $\lambda$.
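From the constraint equation we get:

$$\frac{d\sigma^2}{d\lambda} = w^T I - 1 = 0 \quad\Rightarrow\quad w^T I = 1$$

We can now solve the problem ($V$ is a 3×3 symmetric matrix):

$$Vw = -(\lambda/2)\, I$$

$$w^T V w = -(\lambda/2)\, w^T I = -\lambda/2 \qquad \text{(using the constraint)}$$

$$w^T V w = w^T V V^{-1} V w = (Vw)^T V^{-1} (Vw) = \left(-\tfrac{\lambda}{2} I\right)^T V^{-1} \left(-\tfrac{\lambda}{2} I\right) = \frac{\lambda^2}{4}\, I^T V^{-1} I$$

$$\frac{\lambda^2}{4}\, I^T V^{-1} I = -\frac{\lambda}{2} \quad\Rightarrow\quad \lambda = \frac{-2}{I^T V^{-1} I}$$

$$w = \frac{V^{-1} I}{I^T V^{-1} I}$$

For the uncorrelated case the weights are the same as using:

$$\mu = \frac{x_1/\sigma_1^2 + x_2/\sigma_2^2 + \cdots + x_n/\sigma_n^2}{1/\sigma_1^2 + 1/\sigma_2^2 + \cdots + 1/\sigma_n^2}$$

Branching Fraction Averaging Procedure

The weights ($w_1$, $w_2$, $w_3$) are calculated using the error matrix, $V$:

$$V = V_{statistics} + V_{systematics}$$

$$V = \begin{pmatrix}\sigma^2_{st,K\pi} & 0 & 0\\ 0 & \sigma^2_{st,K\pi\pi^0} & 0\\ 0 & 0 & \sigma^2_{st,K3\pi}\end{pmatrix} + \begin{pmatrix}\sigma^2_{syTOT,K\pi} & \sigma_{syC,K\pi}\,\sigma_{syC,K\pi\pi^0} & \sigma_{syC,K\pi}\,\sigma_{syC,K3\pi}\\ \sigma_{syC,K\pi}\,\sigma_{syC,K\pi\pi^0} & \sigma^2_{syTOT,K\pi\pi^0} & \sigma_{syC,K\pi\pi^0}\,\sigma_{syC,K3\pi}\\ \sigma_{syC,K\pi}\,\sigma_{syC,K3\pi} & \sigma_{syC,K\pi\pi^0}\,\sigma_{syC,K3\pi} & \sigma^2_{syTOT,K3\pi}\end{pmatrix}$$

Here st = statistics, syTOT = total systematics, syC = correlated systematics.

Note: an off-diagonal element is a "dot" product, summed over the correlated sources, of the errors of two modes, e.g.:

$$\sigma_{syC,K\pi}\,\sigma_{syC,K\pi\pi^0} = \sigma_{syC1,K\pi}\,\sigma_{syC1,K\pi\pi^0} + \sigma_{syC2,K\pi}\,\sigma_{syC2,K\pi\pi^0} + \cdots$$

where 1 = tracking efficiency, 2 = particle ID, etc.

Calculate the weights (with $u$ the vector of 1's, called $I$ above):

$$w = \frac{V^{-1} u}{u^T V^{-1} u} = (0.506,\ 0.216,\ 0.278)$$

Calculate the variances:

$$\sigma_{st}^2 = w^T V_{statistics}\, w = 0.09, \qquad \sigma_{sy}^2 = w^T V_{systematics}\, w = 0.119$$

$$B(B^-\to D^0K^{*-}) = (5.29 \pm 0.30 \pm 0.34)\times10^{-4}$$

Published in PRD 73, 111104(R) (2006).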
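The weight formula $w = V^{-1}u/(u^T V^{-1} u)$ can be sketched in a few lines of plain Python. This is an illustration only: the statistical errors are taken from the results table, but the systematic errors `syst_total` and `syst_corr` are HYPOTHETICAL placeholders (the per-mode values are not recoverable from the slides), so the output is not expected to reproduce the published weights. The helper names `solve`, `blue_weights` and `quad_form` are likewise assumptions made for this sketch.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def blue_weights(V):
    """Weights w = V^-1 u / (u^T V^-1 u), with u a vector of ones."""
    u = [1.0] * len(V)
    y = solve(V, u)            # y = V^-1 u
    norm = sum(y)              # u^T V^-1 u
    return [yi / norm for yi in y]

def quad_form(w, V):
    """Quadratic form w^T V w."""
    n = len(w)
    return sum(w[i] * V[i][j] * w[j] for i in range(n) for j in range(n))

# Statistical errors from the results table (units of 10^-4).
stat = [0.45, 0.54, 0.48]
# HYPOTHETICAL systematic errors, for illustration only (units of 10^-4).
syst_total = [0.40, 0.50, 0.45]   # total systematic error per mode
syst_corr = [0.25, 0.30, 0.28]    # correlated part per mode

n = 3
V_stat = [[stat[i]**2 if i == j else 0.0 for j in range(n)] for i in range(n)]
V_syst = [[syst_total[i]**2 if i == j else syst_corr[i] * syst_corr[j]
           for j in range(n)] for i in range(n)]
V = [[V_stat[i][j] + V_syst[i][j] for j in range(n)] for i in range(n)]

w = blue_weights(V)
sigma_stat = quad_form(w, V_stat) ** 0.5
sigma_syst = quad_form(w, V_syst) ** 0.5
print("weights:", [round(wi, 3) for wi in w])
print("stat error:", round(sigma_stat, 3), "syst error:", round(sigma_syst, 3))
```

With a purely diagonal matrix such as `V_stat`, `blue_weights` reduces to the familiar $1/\sigma_i^2$ weighting, which is a convenient check on any implementation.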