Descriptive statistics

Experiment → Data → Sample statistics
• Sample mean
• Sample variance
• Normalize sample variance by N-1
• Standard deviation of a sum of N independent observations grows as $\sqrt{N}$; that of the sample mean falls as $1/\sqrt{N}$

Inferential statistics

Model
• Estimates of parameters
• Inferences
• Predictions

Importance of the Gaussian

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

Why is the Gaussian important?
• Sums of independent observations converge to a Gaussian (Central Limit Theorem)
• A linear combination of independent Gaussian variables is also Gaussian
• Has maximum entropy for a given $\sigma$
• Least-squares becomes maximum likelihood
• Derived variables have known densities
• Sample means and variances of independent samples are independent

Derived distributions
• Sample mean is Gaussian
• Sample variance is $\chi^2$ distributed
• Sample mean with unknown variance is Student-t distributed
• This allows us to get confidence intervals for mean and variance

The logic of confidence intervals
• The mean with unknown variance is distributed as Student-t; that is, if the samples $x_i$ are normally distributed with mean $\mu$, then $(\bar{x} - \mu)/(s/\sqrt{n})$, where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation (the square root of the sample variance), is distributed as Student-t
• Pick $q_1$ and $q_2$ from "tables" so that prob{ $q_1 < (\bar{x} - \mu)/(s/\sqrt{n}) < q_2$ } = 0.99. Then $\bar{x} - q_2 s/\sqrt{n} < \mu < \bar{x} - q_1 s/\sqrt{n}$, which gives us confidence intervals on where the actual mean can be

Simulating random arrivals
• Method 1: take a small $\Delta t$, flip a coin with event probability $\lambda \Delta t$, where $\lambda$ is the arrival rate
• Method 2: generate an exponentially distributed random variable to determine the next arrival time (use a transformation of a uniform random variable)

Binomial distribution (Bernoulli trials)
• Suppose we flip a fair coin n times. The mean number of heads is $n/2$, and the standard deviation is $\sqrt{n}/2$. For large n (above about 30), the distribution, called binomial, approaches normal. Specifically, if x is the number of heads, the normalized variable $z = (x - n/2)/(\sqrt{n}/2)$ is approximately distributed as N(0,1), the normal distribution with mean 0 and variance 1.
• This enables us to estimate the probability of events using Bernoulli trials very easily.
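The normal approximation to the binomial can be checked by simulation. The following is a minimal sketch, not part of the original slides: the function names, the choice of n = 100, the number of repeated experiments, and the seed are all illustrative assumptions. It repeats the n-flip experiment many times and reports the sample mean and standard deviation of the normalized variable z, which should be close to 0 and 1 if z is approximately N(0,1).

```python
import random
import statistics


def normalized_heads(n, rng):
    """Flip a fair coin n times and return z = (x - n/2) / (sqrt(n)/2)."""
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return (heads - n / 2) / (n ** 0.5 / 2)


def simulate(n=100, trials=20000, seed=0):
    """Return the sample mean and standard deviation of z over many experiments.

    n, trials and seed are illustrative defaults, not values from the slides.
    """
    rng = random.Random(seed)
    zs = [normalized_heads(n, rng) for _ in range(trials)]
    return statistics.mean(zs), statistics.stdev(zs)


if __name__ == "__main__":
    mean_z, std_z = simulate()
    print(f"mean of z = {mean_z:.3f}, std of z = {std_z:.3f}")
```

Both printed values only approximate 0 and 1, because they are themselves sample statistics of a finite number of experiments.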
• Example: We flip a coin 100 times and observe 60 heads. What is the probability of that event? With $n = 100$, the normalized variable is $z = (60 - n/2)/(\sqrt{n}/2) = 2$. "Table lookup": $\int_z^\infty N(0,1)\,dx = 0.0228$, the probability of a result at least this far above the mean.
• Martin Gardner, How Not to Test a Psychic (Prometheus, 1989), p. 31: report of a claim that a psychic subject made 781 hits out of 1000. That corresponds to z = 17.8. [For comparison, z = 9.5 already gives a tail probability of 1.049E-21.]
• Notice that what we get here is prob{event|hypothesis}, where the hypothesis is that the trials are Bernoulli.
• What we don't get is prob{hypothesis|event}.
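The "table lookup" in the examples above can also be done numerically. The sketch below is an illustration rather than the original author's method: the helper names `z_score` and `upper_tail` are hypothetical, and the tail probability is computed with the complementary error function from Python's standard library. As in the slides, the result is prob{event|hypothesis} under the fair-coin hypothesis, not prob{hypothesis|event}.

```python
import math


def z_score(heads, n):
    """Normalized variable z = (x - n/2) / (sqrt(n)/2) for x heads in n fair flips."""
    return (heads - n / 2) / (math.sqrt(n) / 2)


def upper_tail(z):
    """P(Z > z) for Z ~ N(0,1), computed from the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    z_coin = z_score(60, 100)       # 60 heads in 100 flips
    z_psychic = z_score(781, 1000)  # 781 hits in 1000 trials
    print(f"coin:    z = {z_coin:.2f}, tail probability = {upper_tail(z_coin):.4f}")
    print(f"psychic: z = {z_psychic:.1f}, tail probability = {upper_tail(z_psychic):.3e}")
    print(f"z = 9.5: tail probability = {upper_tail(9.5):.3e}")
```

Using `erfc` rather than `1 - cdf` keeps the result meaningful far out in the tail, where subtracting from 1 would round to zero.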