Interpreting results, descriptive statistics

Descriptive statistics
Experiment → Data → Sample Statistics
• Sample mean
• Sample variance
• Normalize sample variance by N-1
• Standard deviation goes as square-root of N
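A minimal sketch of the sample statistics listed above, in plain Python. The function name `sample_stats` and the data values are illustrative assumptions, not part of the original slides.

```python
import math

def sample_stats(data):
    """Return (mean, variance, standard deviation) of a sample."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    # Normalize by N-1 (not N) so the variance estimate is unbiased.
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    return mean, variance, math.sqrt(variance)

# Hypothetical data, chosen only for illustration.
mean, var, std = sample_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, var, std)
```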
Inferential Statistics
Model
• Estimates of parameters
• Inferences
• Predictions
Importance of the Gaussian
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
Why is the Gaussian important?
• Sum of independent observations converges
to Gaussian (Central Limit Theorem)
• Linear combination is also Gaussian
• Has maximum entropy for given σ
• Least-squares becomes max likelihood
• Derived variables have known densities
• Sample means and variances of
independent samples are independent
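The Central Limit Theorem bullet can be checked numerically. The sketch below sums 12 uniform random numbers (an arbitrary choice), standardizes the sum, and compares it with what N(0,1) would give. The helper name `standardized_uniform_sum`, the seed, and the sample size are assumptions made for illustration.

```python
import random
import statistics

def standardized_uniform_sum(rng, k=12):
    # Each uniform(0,1) draw has mean 1/2 and variance 1/12,
    # so the sum of k draws has mean k/2 and variance k/12.
    total = sum(rng.random() for _ in range(k))
    return (total - k / 2) / (k / 12) ** 0.5

rng = random.Random(0)
samples = [standardized_uniform_sum(rng) for _ in range(20000)]

clt_mean = statistics.fmean(samples)      # should be near 0
clt_var = statistics.variance(samples)    # should be near 1
# For a Gaussian, about 68.3% of the mass lies within one standard deviation.
within_one_sigma = sum(abs(z) < 1 for z in samples) / len(samples)
print(clt_mean, clt_var, within_one_sigma)
```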
Derived distributions
• Sample mean is Gaussian
• Sample variance is χ²-distributed
• Sample mean with unknown variance is
Student-t distributed
• This allows us to get confidence intervals
for mean and variance
The logic of confidence intervals
• The mean with unknown variance is distributed as
Student-t; that is, if the samples $x_i$ are normally
distributed, then
$(\bar{x} - \mu) / (s / \sqrt{n})$
where $\bar{x}$ is the sample mean and $s$ is the
sample standard deviation (the square root of the
sample variance), is distributed as Student-t with
n-1 degrees of freedom
• Pick q1 and q2 from "tables" so that
$\mathrm{prob}\{\, q_1 < (\bar{x} - \mu) / (s / \sqrt{n}) < q_2 \,\} = 0.99$
Then
$\bar{x} - q_2 \, s / \sqrt{n} \;<\; \mu \;<\; \bar{x} - q_1 \, s / \sqrt{n}$
which gives us confidence intervals
on where the actual mean can be
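A sketch of the interval computation, assuming a symmetric choice q1 = -q, q2 = +q. The critical value is passed in by hand, as if read from a table. The measurements and the name `mean_confidence_interval` are hypothetical. The value 3.250 is the tabulated two-sided 99% Student-t point for 9 degrees of freedom.

```python
import math

def mean_confidence_interval(data, q):
    """Interval for the true mean, given a Student-t critical value q
    (symmetric choice q1 = -q, q2 = +q)."""
    n = len(data)
    xbar = sum(data) / n
    # Sample standard deviation, with the N-1 normalization.
    s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
    half_width = q * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical measurements: n = 10, so 9 degrees of freedom.
data = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.0, 9.6]
T_99_DF9 = 3.250  # table value: two-sided 99%, 9 degrees of freedom
low, high = mean_confidence_interval(data, T_99_DF9)
print(low, high)
```

Passing the critical value as an argument keeps the sketch free of external libraries. A statistics package could supply the quantile instead.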
Simulating random arrivals
• Method 1: take a small Δt, flip a coin with
event probability λΔt
• Method 2: generate an exponentially
distributed random variable to determine the next
arrival time (use a transformation of a uniform variable)
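Both methods can be sketched in a few lines. The arrival rate (here `rate`), the time step, the duration, and the function names are assumptions chosen for illustration. Method 2 uses the inverse-CDF transform of a uniform variable, which is one common way to obtain an exponential variate.

```python
import math
import random

def arrivals_coin_flip(rate, duration, dt, rng):
    """Method 1: in each small step dt, an arrival occurs with probability rate*dt."""
    times = []
    for i in range(int(duration / dt)):
        if rng.random() < rate * dt:
            times.append(i * dt)
    return times

def arrivals_exponential(rate, duration, rng):
    """Method 2: draw exponential gaps between arrivals from a uniform variable."""
    times = []
    t = 0.0
    while True:
        u = rng.random()                  # uniform on [0, 1)
        t += -math.log(1.0 - u) / rate    # inverse-CDF transform; 1-u avoids log(0)
        if t >= duration:
            return times
        times.append(t)

rng = random.Random(1)
rate, duration = 2.0, 500.0
n1 = len(arrivals_coin_flip(rate, duration, dt=0.001, rng=rng))
n2 = len(arrivals_exponential(rate, duration, rng))
print(n1, n2)  # both should be close to rate * duration
```

Method 1 is only approximate, since it allows at most one arrival per step. Method 2 costs one random draw per arrival rather than one per time step.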
Binomial distribution (Bernoulli trials)
• Suppose we flip a fair coin n times. The mean # of
heads is n/2, and the standard deviation is $\sqrt{n}/2$.
For large n (about n > 30), the distribution, called
binomial, approaches normal. Specifically, if x is
the number of heads, the normalized variable
$z = (x - n/2) / (\sqrt{n}/2)$ is distributed as N(0,1),
the normal distribution with mean 0 and variance 1.
• This enables us to estimate the probability of
events using Bernoulli trials very easily.
• Example: We flip a coin 100 times and
observe 60 heads. What is the probability of
that event?
normalized $z = (60 - n/2) / (\sqrt{n}/2) = 2$
"table lookup" (probability of 60 or more heads):
$\int_{z}^{\infty} N(0,1)\,dx = 0.0228$
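The table lookup can be reproduced with the complementary error function from Python's standard library. This is a sketch under the normal approximation described above. The function names `fair_coin_z` and `upper_tail` are illustrative.

```python
import math

def fair_coin_z(heads, n):
    """Normalized variable for the number of heads in n fair-coin flips."""
    return (heads - n / 2) / (math.sqrt(n) / 2)

def upper_tail(z):
    """P(Z > z) for Z ~ N(0,1), via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = fair_coin_z(60, 100)
p = upper_tail(z)
print(z, round(p, 4))  # z = 2.0, p rounds to 0.0228
```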
Martin Gardner: How Not to Test a
Psychic (Prometheus, 1989)
p. 31: report of a claim that a psychic subject
made 781 hits out of 1000. That
corresponds to z = 17.8 [ z = 9.5 already gives a tail probability of 1.049E-21 ]
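The two numbers quoted for this example can be checked in the same way, assuming each trial is a fair 50/50 guess (the Bernoulli hypothesis used for the coin example). The variable names are illustrative.

```python
import math

def upper_tail(z):
    """P(Z > z) for Z ~ N(0,1), via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

hits, n = 781, 1000
# Same normalization as for coin flips: mean n/2, standard deviation sqrt(n)/2.
z_psychic = (hits - n / 2) / (math.sqrt(n) / 2)
p_at_9_5 = upper_tail(9.5)
print(round(z_psychic, 1), p_at_9_5)
```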
• Notice that what we get here is
prob{event|hypothesis}, where the
hypothesis is that the trials are Bernoulli.
• What we don't get is
prob{hypothesis|event}.