Lab 3b: Distribution of the mean
Outline
• Distribution of the mean: Normal (Gaussian)
– Central Limit Theorem
• “Justification” of the mean as the best
estimate
• “Justification” of error propagation formula
– Propagation of errors
– Standard deviation of the mean
Related courses:
• 640:104 - Elementary Combinatorics and Probability
• 960:211-212 - Statistics I, II
Finite number of repeated measurements
The average value is the best estimate of the true mean
value, while the square of the standard deviation is the best
estimate of the true variance.
$$X \approx \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

$$\sigma_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$

The result is quoted as $x = \bar{x} \pm \sigma_x$.
The same set of data also provides the best estimation
of the variance of the mean.
$$\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{N}, \qquad \sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$$
The Central Limit Theorem
The mean is the sum of $N$ independent random
variables, divided by $N$:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
The PDF of the mean is the PDF of the sum of N
independent random variables.
In probability theory, the central limit theorem (CLT)
states that, given certain conditions, the mean of a sufficiently
large number of independent random variables, each with
finite mean and variance, will be approximately normally
distributed.
$$P(\bar{x}) \xrightarrow{\;N \to \infty\;} P_G(\bar{x})$$
*In practice, N just needs to be large enough.
Probability Distribution Function of the mean
An example of CLT
[Figure: PDFs of $x_1$, $x_1+x_2$, $x_1+x_2+x_3$, and $x_1+x_2+x_3+x_4$ for a non-Gaussian $P(x_1)$; the distribution of the sum approaches a Gaussian as more terms are added.]
For this example N>4 is “enough”.
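The convergence illustrated above is easy to reproduce numerically. The sketch below (not part of the lab handout; it assumes NumPy is available) averages N uniform random variables and compares the spread of the resulting means with the CLT prediction $\sigma/\sqrt{N}$, where $\sigma^2 = 1/12$ for a uniform variable on $[0, 1]$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_means(N, trials=100_000):
    """Draw `trials` samples of the mean of N uniform [0,1) variables."""
    return rng.uniform(0, 1, size=(trials, N)).mean(axis=1)

for N in (1, 2, 4):
    m = sample_means(N)
    # CLT prediction: mean 0.5, standard deviation sqrt((1/12)/N)
    print(N, round(m.mean(), 3), round(m.std(), 3),
          round(np.sqrt(1 / 12 / N), 3))
```

Plotting histograms of `sample_means(N)` for increasing N reproduces the panels sketched in the figure: already at N = 4 the shape is visually close to a Gaussian.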
More examples of CLT
http://www.statisticalengineering.com/central_limit_theorem.html
The significant consequence of CLT
Almost any repeated measurements of independent
random variables can be effectively described by
Normal (Gaussian) distribution.
E.g. $N$ measurements $x_i$ ($i = 1, 2, \ldots, N$) of a random
variable $x$ can be divided into $m$ subsets of measurements
$\{n_j\}$, $j = 1, 2, \ldots, m$.
$$N = n_1 + n_2 + \cdots + n_m$$
Define the means of each subset as new measurements
of new random variable y. If nj are large enough, the
PDF of yj is approximately a Normal (Gaussian) one.
$$y_j \equiv \bar{x}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} x_i$$
“Justification” of the mean as the best estimate
Assume all the measured values follow a normal distribution
$G_{X,\sigma}(x)$ with unknown parameters $X$ and $\sigma$:

$$G_{X,\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-X)^2/2\sigma^2}$$
The probability of obtaining value x1 in a small interval dx1 is:
$$\mathrm{Prob}\left(x \text{ between } x_1 \text{ and } x_1 + dx_1\right) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_1-X)^2/2\sigma^2}\, dx_1 \;\propto\; \frac{1}{\sigma}\, e^{-(x_1-X)^2/2\sigma^2}$$

(the factor $\sqrt{2\pi}$ and the interval $dx_1$ are constants and can be dropped)
Similarly, the probability of obtaining value $x_i$ ($i = 1, \ldots, N$) in a
small interval $dx_i$ is:

$$\mathrm{Prob}\left(x \text{ between } x_i \text{ and } x_i + dx_i\right) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_i-X)^2/2\sigma^2}\, dx_i \;\propto\; \frac{1}{\sigma}\, e^{-(x_i-X)^2/2\sigma^2}$$
Justification of the mean as best estimate (cont.)
The probability of $N$ measurements with values $x_1, x_2, \ldots, x_N$ is:

$$\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N) = \mathrm{Prob}(x_1)\cdot\mathrm{Prob}(x_2)\cdots\mathrm{Prob}(x_N) \;\propto\; \frac{1}{\sigma^N}\, e^{-\sum_{i=1}^{N}(x_i-X)^2/2\sigma^2}$$
Here $x_1, x_2, \ldots, x_N$ are known results because they are measured
values. On the other hand, $X$ and $\sigma$ are unknown parameters. We
want to find the best estimate of them from $\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N)$.
A reasonable procedure is the principle of maximum likelihood:
maximizing $\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N)$ with respect to $X$ is equivalent
to minimizing

$$\sum_{i=1}^{N} (x_i - X)^2 / 2\sigma^2$$

Setting the derivative with respect to $X$ to zero:

$$\sum_{i=1}^{N}(x_i - X) = 0 \;\Rightarrow\; \left(\text{best estimate of } X\right) = \frac{1}{N}\sum_{i=1}^{N} x_i = \bar{x}$$
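The maximum-likelihood result can be checked numerically. The sketch below (assuming NumPy, with simulated measurements rather than lab data) scans trial values of $X$ and confirms that the negative log-likelihood is minimized at the sample mean.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10.0, 2.0, size=200)  # simulated measurements, true X = 10

def neg_log_like(X, sigma=2.0):
    """Negative log-likelihood, up to additive constants."""
    return np.sum((x - X) ** 2) / (2 * sigma ** 2)

# Scan candidate values of X; the minimum should land at the sample mean.
grid = np.linspace(9.0, 11.0, 2001)
best = grid[np.argmin([neg_log_like(X) for X in grid])]
print(best, x.mean())
```

The grid minimum agrees with `x.mean()` to within the grid spacing, as the derivative argument above predicts.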
Best estimate of variance
Similarly, we can maximize $\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N)$ with respect to $\sigma$
to get the best estimate of $\sigma$:
$$\frac{\partial}{\partial\sigma}\left[\frac{1}{\sigma^N}\, e^{-\sum_{i=1}^{N}(x_i-X)^2/2\sigma^2}\right] = \left[-\frac{N}{\sigma^{N+1}} + \frac{1}{\sigma^N}\sum_{i=1}^{N}\frac{(x_i-X)^2}{\sigma^3}\right] e^{-\sum_{i=1}^{N}(x_i-X)^2/2\sigma^2} = 0$$

$$\Rightarrow\; -N + \frac{1}{\sigma^2}\sum_{i=1}^{N}(x_i-X)^2 = 0$$

$$\Rightarrow\; \left(\text{best estimate of } \sigma\right) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - X)^2}$$
However, we don’t really know X. In practice, we have to replace
X with the mean value, which reduces the accuracy. To account
for this factor, we have to replace N with N-1.
$$\Rightarrow\; \left(\text{best estimate of } \sigma\right) = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$$
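The distinction between dividing by $N$ and by $N-1$ maps directly onto NumPy's `ddof` parameter; the sketch below (illustrative values, not lab data) computes both estimates by hand and checks them against `np.var`.

```python
import numpy as np

x = np.array([10.1, 9.8, 10.3, 10.0, 9.9])
N = len(x)

# Biased estimate: divide by N (valid only if the true mean X were known;
# here x.mean() is used in its place, which underestimates the spread).
var_N = np.sum((x - x.mean()) ** 2) / N
# Unbiased estimate: divide by N - 1 to compensate for using x-bar.
var_N1 = np.sum((x - x.mean()) ** 2) / (N - 1)

assert np.isclose(var_N, np.var(x))           # ddof=0 is NumPy's default
assert np.isclose(var_N1, np.var(x, ddof=1))  # ddof=1 gives the N-1 form
print(var_N, var_N1)
```

The $N-1$ estimate is always slightly larger, reflecting the lost degree of freedom spent on estimating the mean itself.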
Justification of error propagation formula (I)
Measured quantity plus a constant: $q = x + A$

$$\Rightarrow\; \bar{q} = X + A, \qquad \sigma_q = \sigma_x$$

Measured quantity times a fixed number: $q = Bx$

$$\Rightarrow\; \bar{q} = BX, \qquad \sigma_q = |B|\,\sigma_x$$

Sum of two measured quantities: $q = x + y$, with $\bar{q} = \bar{x} + \bar{y}$:

$$P(q) = \iint_{x+y=q} P(x)\,P(y)\,dx\,dy = \int P(x)\,P(q-x)\,dx$$

Assuming $X = Y = 0$:

$$P(x) = \frac{1}{\sigma_x\sqrt{2\pi}}\, e^{-x^2/2\sigma_x^2}, \qquad P(y) = \frac{1}{\sigma_y\sqrt{2\pi}}\, e^{-y^2/2\sigma_y^2}$$
Justification of error propagation formula (II)
Sum of two measured quantities: $q = x + y$

$$P(q) = \int P(x)\,P(q-x)\,dx = \frac{1}{2\pi\,\sigma_x\sigma_y}\int e^{-x^2/2\sigma_x^2}\, e^{-(q-x)^2/2\sigma_y^2}\, dx$$

$$= \frac{1}{\sqrt{2\pi\left(\sigma_x^2+\sigma_y^2\right)}}\, e^{-q^2/2\left(\sigma_x^2+\sigma_y^2\right)}$$

$$\Rightarrow\; \sigma_q = \sqrt{\sigma_x^2 + \sigma_y^2}$$
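The quadrature-sum result can be verified by a quick Monte Carlo experiment (a sketch assuming NumPy; the chosen $\sigma_x = 0.3$, $\sigma_y = 0.4$ are arbitrary illustrative values).

```python
import numpy as np

rng = np.random.default_rng(2)
sigma_x, sigma_y = 0.3, 0.4

# Two independent Gaussian "measurements" with X = Y = 0.
x = rng.normal(0.0, sigma_x, size=200_000)
y = rng.normal(0.0, sigma_y, size=200_000)

q = x + y
predicted = np.hypot(sigma_x, sigma_y)  # sqrt(0.3^2 + 0.4^2) = 0.5
print(q.std(), predicted)
```

The empirical standard deviation of `q` matches the quadrature prediction 0.5, not the naive linear sum 0.7.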
Justification of error propagation formula (III)
The general case: $q = q(x)$.
Assuming $\sigma_x$ is small enough that we are concerned
only with values of $x$ close to $X$.
Using Taylor expansion:
$$q(x) \approx q(X) + \left(\frac{\partial q}{\partial x}\right)(x - X) = \underbrace{q(X) - \left(\frac{\partial q}{\partial x}\right)X}_{\text{constant}} + \left(\frac{\partial q}{\partial x}\right)x$$

$$\Rightarrow\; \sigma_q = \left|\frac{\partial q}{\partial x}\right|\,\sigma_x$$
Justification of error propagation formula (IV)
The more general case: $q = q(x, y)$.
Assuming $\sigma_x$ and $\sigma_y$ are small enough that we are concerned
only with values of $x$ close to $X$ and $y$ close to $Y$.
Using Taylor expansion:
$$q(x, y) \approx q(X, Y) + \left(\frac{\partial q}{\partial x}\right)(x - X) + \left(\frac{\partial q}{\partial y}\right)(y - Y)$$

$$= \underbrace{q(X, Y) - \left(\frac{\partial q}{\partial x}\right)X - \left(\frac{\partial q}{\partial y}\right)Y}_{\text{constant}} + \left(\frac{\partial q}{\partial x}\right)x + \left(\frac{\partial q}{\partial y}\right)y$$

$$\Rightarrow\; \sigma_q = \sqrt{\left(\frac{\partial q}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial q}{\partial y}\right)^2 \sigma_y^2}$$
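The general propagation formula can also be checked by Monte Carlo. The sketch below (assuming NumPy; $q = xy$ and the numerical values are arbitrary choices for illustration) compares the sampled spread of $q$ with $\sqrt{(\partial q/\partial x)^2\sigma_x^2 + (\partial q/\partial y)^2\sigma_y^2}$ evaluated at $(X, Y)$.

```python
import numpy as np

rng = np.random.default_rng(3)
X, Y = 5.0, 3.0
sigma_x, sigma_y = 0.05, 0.04   # small spreads, as the derivation assumes

x = rng.normal(X, sigma_x, size=500_000)
y = rng.normal(Y, sigma_y, size=500_000)
q = x * y                        # example function q(x, y) = x * y

# dq/dx = y, dq/dy = x; evaluated at (X, Y):
predicted = np.sqrt((Y * sigma_x) ** 2 + (X * sigma_y) ** 2)  # = 0.25
print(q.std(), predicted)
```

The agreement holds because the spreads are small; for large $\sigma_x$, $\sigma_y$ the first-order Taylor expansion (and hence the formula) breaks down.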
Standard deviation of the mean
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$

Assuming the $x_i$ are normally distributed and all have the same $X$ and $\sigma_x$:

$$\sigma_{\bar{x}} = \sqrt{\left(\frac{\partial\bar{x}}{\partial x_1}\right)^2 \sigma_x^2 + \left(\frac{\partial\bar{x}}{\partial x_2}\right)^2 \sigma_x^2 + \cdots + \left(\frac{\partial\bar{x}}{\partial x_N}\right)^2 \sigma_x^2}$$

$$= \sqrt{N\left(\frac{1}{N}\,\sigma_x\right)^2} = \frac{\sigma_x}{\sqrt{N}}$$
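A direct simulation confirms the $\sigma_x/\sqrt{N}$ result (a sketch assuming NumPy; the true mean 10.0, $\sigma_x = 2$, and $N = 25$ are arbitrary illustrative values).

```python
import numpy as np

rng = np.random.default_rng(4)
sigma_x, N = 2.0, 25

# 50000 "experiments", each averaging N measurements of the same quantity.
means = rng.normal(10.0, sigma_x, size=(50_000, N)).mean(axis=1)

# Spread of the means vs. the prediction sigma_x / sqrt(N) = 0.4
print(means.std(), sigma_x / np.sqrt(N))
```

Averaging 25 measurements shrinks the spread by a factor of 5, exactly as $\sigma_x/\sqrt{N}$ predicts.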
Lab 3b: test the CLT
1. One run of decay-rate ($R = N/\Delta t$) measurement
   a. Sampling rate: 0.5 second/sample ($\Delta t$)
   b. Run time: 30 seconds (# of measurements: n = 60)
   c. Calculate $\bar{N}$, $\sigma_N$, $\sigma_{\bar{N}}$; plot a histogram.
2. Repeat the 30-sec run 20 times (a set of runs)
   a. Remember to save the data
   b. Plot histograms of the 21 means with proper bin widths
   c. Make sure you answer all the questions
3. Sample size ($n = T/\Delta t$) dependence
   – Keep the same configuration; vary the run time
   – Record $\bar{N}$ and $\sigma_{\bar{N}}$; plot $\bar{N}$ vs. $\log n$ with error bar $= \sigma_{\bar{N}}$.
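For step 1c, the quantities $\bar{N}$, $\sigma_N$, and $\sigma_{\bar{N}}$ follow directly from the formulas above. A minimal sketch (assuming NumPy; the `counts` array here is simulated Poisson data standing in for your measured decay counts):

```python
import numpy as np

# Hypothetical stand-in for one 30 s run: n = 60 count samples, 0.5 s each.
# Replace `counts` with the data saved from your run.
rng = np.random.default_rng(5)
counts = rng.poisson(12.0, size=60)

N_bar = counts.mean()                         # best estimate of the mean count
sigma_N = counts.std(ddof=1)                  # sample standard deviation (N - 1)
sigma_N_bar = sigma_N / np.sqrt(len(counts))  # standard deviation of the mean

print(N_bar, sigma_N, sigma_N_bar)
```

The same three lines, applied to each of the 20 repeated runs in step 2, produce the set of means to histogram.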
* “Origin” (OriginLab®) is more convenient than Matlab.
http://www.physics.rutgers.edu/rcem/~ahogan/326.html