Data analysis
Ben Graham
MA930, University of Warwick
October 26, 2015

p-value examples

- p-value: P such that P(P ≤ t) ≤ t for t ∈ [0, 1].
- Random sample X1, . . . , Xn iidrv Exponential(θ), mean µ := 1/θ unknown
- Under H0: µ = 1, the sample mean has the Gamma distribution with mean 1 and shape n
- Statistic Q = F_{H0}(X̄) ∈ [0, 1] is uniform under H0
- Let H1: µ < 1. Reject small values of Q: P = Q
- Let H1: µ > 1. Reject large values of Q: P = 1 − Q
- Let H1: µ ≠ 1. Reject small and large values of Q: P = 1 − |2Q − 1| (all three p-values are computed in the sketch below)

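A minimal numeric sketch of the recipe above (not from the slides; the seed, sample size and true mean are invented for illustration). Under H0 the mean of n iid Exponential(1) variables is Gamma with shape n and scale 1/n, so Q is that distribution's CDF evaluated at X̄.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 50
    x = rng.exponential(scale=1.3, size=n)  # true mean 1.3, so H0: mu = 1 is false
    xbar = x.mean()

    # Under H0, X-bar ~ Gamma(shape=n, scale=1/n), which has mean 1
    Q = stats.gamma.cdf(xbar, a=n, scale=1.0 / n)
    p_less = Q                        # H1: mu < 1
    p_greater = 1 - Q                 # H1: mu > 1
    p_two_sided = 1 - abs(2 * Q - 1)  # H1: mu != 1
    print(Q, p_less, p_greater, p_two_sided)
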
Hypothesis testing

- Example: Lady tasting tea
- Experiments by Ronald Fisher and Muriel Bristol
- 8 cups of tea: 4 tea into cup first, 4 milk first

  #Successes | Count
  0          | 1
  1          | 16 = (4 choose 1)(4 choose 3)
  2          | 36 = (4 choose 2)(4 choose 2)
  3          | 16 = (4 choose 3)(4 choose 1)
  4          | 1
  Total count = 70 = (8 choose 4)

- Null hypothesis: no ability to taste the difference; construct a p-value
- Under H0, P(4 successes | H0) = 1/70 (checked numerically below)
- Muriel got 4/4: Reject H0

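The table and the p-value can be checked against the hypergeometric distribution; a minimal sketch (scipy, not part of the original slides):

    from scipy import stats

    # M = 8 cups in total, n = 4 "milk first" cups, N = 4 cups selected
    h = stats.hypergeom(M=8, n=4, N=4)
    for k in range(5):
        print(k, round(h.pmf(k) * 70))  # counts 1, 16, 36, 16, 1 out of 70

    print(h.pmf(4))  # P(4 successes | H0) = 1/70, about 0.014
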
Power calculations

- Before doing an experiment, check it can do what you want
- Example: Testing a coin for bias:
- X ∼ Binomial(n, θ)
- H0: θ = 1/2 vs H1: θ ≠ 1/2
- Ask to keep P(reject H0 | H0) ≤ 5%
- Ask for P(reject H0 | |θ − 1/2| ≥ 0.1) ≥ 92%
- Assume X̄ ∼ N(θ, 1/(4n)), i.e. s.d. 1/(2√n), in the range θ ∈ [0.4, 0.6]
- Need n such that

  [F^{−1}_{N(0,1)}(1 − 5%/2) + F^{−1}_{N(0,1)}(1 − 8%)] × s.d. ≤ 0.1

  (solved for n in the sketch below)

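A minimal sketch of solving the display above for n, assuming s.d. = 1/(2√n) as on the slide (the code is illustrative, not from the slides):

    import math
    from scipy import stats

    z_size = stats.norm.ppf(1 - 0.05 / 2)  # 1.96: two-sided 5% test
    z_power = stats.norm.ppf(1 - 0.08)     # 1.41: for 92% power
    # Require (z_size + z_power) / (2 sqrt(n)) <= 0.1 and solve for n
    n = math.ceil(((z_size + z_power) / (2 * 0.1)) ** 2)
    print(n)  # roughly 284 coin flips needed
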
Confidence Intervals

- Parameter θ ∈ R
- 95% CI: Statistics L, R such that ∀θ, P_θ[L ≤ θ ≤ R] ≥ 95%
- Not unique: one-sided or two-sided, etc.
- Complement of the critical regions for testing H0: θ = θ̂, i.e. that the MLE is the right parameter.
- N.B. Here θ is fixed and the statistics L, R are random.

Normal confidence intervals

- X1, . . . , Xn ∼ N(θ, 1) iidrv
- MLE X̄ = θ̂ ∼ N(θ, 1/n)
- P[θ − 1.96/√n ≤ θ̂ ≤ θ + 1.96/√n] = P[θ ∈ (θ̂ − 1.96/√n, θ̂ + 1.96/√n)] = 95%
- P[θ̂ ≥ θ − 1.64/√n] = P[θ ∈ (−∞, θ̂ + 1.64/√n)] = 95%
- P[θ̂ ≤ θ + 1.64/√n] = P[θ ∈ (θ̂ − 1.64/√n, ∞)] = 95% (all three intervals are computed below)

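A sketch of the two-sided and one-sided 95% intervals above on simulated data with known variance 1 (the sample and seed are invented):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n, theta = 100, 2.0
    x = rng.normal(theta, 1.0, size=n)  # known variance 1
    theta_hat = x.mean()

    z2 = stats.norm.ppf(0.975)  # 1.96 for the two-sided interval
    z1 = stats.norm.ppf(0.95)   # 1.64 for the one-sided intervals
    print(theta_hat - z2 / np.sqrt(n), theta_hat + z2 / np.sqrt(n))  # two-sided
    print(-np.inf, theta_hat + z1 / np.sqrt(n))                      # upper bound
    print(theta_hat - z1 / np.sqrt(n), np.inf)                       # lower bound
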
t confidence intervals

- X1, . . . , Xn ∼ N(µ, σ²) iidrv
- Sample mean X̄ = µ̂ ∼ N(µ, σ²/n), sample variance S²
- (X̄ − µ)/(S/√n) ∼ t_{n−1}
- Choose q such that F_{t_{n−1}}(q) − F_{t_{n−1}}(−q) = P(−q ≤ A ≤ q | A ∼ t_{n−1}) = 95%
- Hypothesis test: Under H0: µ = µ0, F_{t_{n−1}}((X̄ − µ0)/(S/√n)) ∼ Uniform(0, 1)
- Confidence interval: ∀µ, P(µ ∈ (X̄ − qS/√n, X̄ + qS/√n)) = 95% (sketched below)

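A sketch of the two-sided 95% t interval (invented data; scipy's t quantile plays the role of q):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.normal(5.0, 3.0, size=20)  # unknown mean and variance
    n, xbar, s = len(x), x.mean(), x.std(ddof=1)  # ddof=1: sample std dev

    q = stats.t.ppf(0.975, df=n - 1)  # F_{t_{n-1}}(q) - F_{t_{n-1}}(-q) = 95%
    print(xbar - q * s / np.sqrt(n), xbar + q * s / np.sqrt(n))
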
Hypothesis test for contingency table

- m × n contingency table
- H0: properties are independent, p_{i,j} = a_i × b_j; Σ_i a_i = 1, Σ_j b_j = 1, so m + n − 2 degrees of freedom
- H1: properties are not independent; m × n − 1 degrees of freedom
- Number of observations N = Σ_{i,j} O_{i,j}
- Expected number of observations under H0 is E_{i,j} := N × (Σ_k O_{i,k}/N) × (Σ_k O_{k,j}/N)
- Asymptotically under H0, by Wilks' theorem:

  −2 log [ Π_{i,j} (E_{i,j}/N)^{O_{i,j}} / Π_{i,j} (O_{i,j}/N)^{O_{i,j}} ] ≈ χ²_{(m−1)(n−1)}

- Pearson's χ² test statistic Σ_{i,j} (O_{i,j} − E_{i,j})²/E_{i,j} approximates the above if min_{i,j} E_{i,j} ≥ 5 (see the sketch below)
- Large values: reject independence. Small values: faked data?

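A sketch with an invented 2 × 3 table; scipy computes the E_{i,j} and Pearson's statistic in one call:

    import numpy as np
    from scipy import stats

    O = np.array([[30, 14, 6],
                  [20, 26, 4]])
    chi2, p, dof, E = stats.chi2_contingency(O, correction=False)
    print(chi2, p, dof)  # dof = (m - 1)(n - 1) = 2
    print(E)             # expected counts E_{i,j} under independence
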
Variance stabilizing transforms

- X ∼ f(· | θ) with EX = θ and Var_θ(X) =: V(θ)
- Y := X/√V(θ) has variance 1
- Taylor's theorem: Var(g(X)) ≈ Var(g(θ) + g′(θ)Y√V(θ)) ≈ g′(θ)² V(θ)
- Want to find g such that Var(g(X)) ≈ g′(θ)² V(θ) ≈ constant
- g′(θ) ≈ constant × V(θ)^{−1/2}, so

  g(θ) = ∫^θ V(u)^{−1/2} du

- Poisson: V(θ) = θ → g(X) = √X
- Exponential mean θ: V(θ) = θ² → g(X) = log X (both transforms are checked empirically below)

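An empirical check (a sketch with made-up parameter values): after the transform, the variance no longer depends on θ.

    import numpy as np

    rng = np.random.default_rng(3)
    for theta in [2.0, 10.0, 50.0]:
        x = rng.poisson(theta, size=100_000)      # V(theta) = theta
        y = rng.exponential(theta, size=100_000)  # V(theta) = theta^2
        # Var(sqrt(X)) ~ 1/4 and Var(log Y) ~ pi^2/6, whatever theta is
        print(theta, np.sqrt(x).var(), np.log(y).var())
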
Bayesian statistics

- Parameter θ with prior belief f(θ)
- Data X ∼ f(X | θ)
- Joint distribution f(X, θ) = f(θ)f(X | θ), ∫_θ ∫_x f(x, θ) dx dθ = 1
- Bayes theorem, Bayes' theorem, Bayes's theorem:

  f(θ | x) = f(x, θ) / ∫_t f(x, t) dt = f(θ)f(x | θ) / Z(x)

- i.e. Posterior is proportional to prior times likelihood
- Can generally ignore the normalizing constant Z(x) (illustrated on a grid below)

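A sketch of "posterior ∝ prior × likelihood" on a grid (the Beta(2, 2) prior and the 7-out-of-10 data are invented):

    import numpy as np
    from scipy import stats

    theta = np.linspace(0.001, 0.999, 999)  # grid over a Bernoulli parameter
    prior = stats.beta.pdf(theta, 2, 2)     # prior belief f(theta)
    likelihood = theta**7 * (1 - theta)**3  # observed 7 ones and 3 zeros
    posterior = prior * likelihood          # unnormalized: ignore Z(x) ...
    posterior /= posterior.sum() * (theta[1] - theta[0])  # ... until the end
    print(theta[np.argmax(posterior)])      # mode of Beta(9, 5) = 8/12
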
Bayesian statistics

- Instead of the MLE θ̂, we can look at properties of the posterior distribution
- δ = posterior mean minimizes the expected square error
- δ = posterior median minimizes the expected absolute error (both checked by Monte Carlo below)
- The prior distribution does not need to be a real probability distribution. If ∫ f(θ) dθ = ∞, call it an improper prior.
- For a random sample of size n, as n → ∞, the prior becomes less important. Asymptotically f(θ | X1, . . . , Xn) ∼ N(θ, I(θ)^{−1}/n) (just like the MLE).
- The exception to this rule is if the prior is way off, e.g. taking θ ∼ N(0, 1) or θ ∼ Uniform(0, 1) when θ is really 100.

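A Monte Carlo check of the two loss-minimization claims (a sketch; the Beta(9, 5) posterior is an arbitrary choice):

    import numpy as np
    from scipy import stats

    post = stats.beta(9, 5)
    draws = post.rvs(size=50_000, random_state=0)
    grid = np.linspace(0.3, 0.9, 301)  # candidate point estimates delta
    sq_loss = [np.mean((draws - d) ** 2) for d in grid]
    ab_loss = [np.mean(np.abs(draws - d)) for d in grid]
    print(grid[np.argmin(sq_loss)], post.mean())    # both about 9/14
    print(grid[np.argmin(ab_loss)], post.median())  # both the posterior median
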
Credible intervals

- For Bayesians, credible intervals replace confidence intervals.
- A 95% credible interval is an interval covering 95% of the posterior:

  ∫_{L(x)}^{R(x)} f(θ | x) dθ = 95% ↔ P_posterior(θ ∈ (L(x), R(x))) = 95%

- Unlike the frequentist case, given the data, "θ ∈ (L(x), R(x))?" is still officially random.

Where do priors come from?

- Non-informative priors: make up something so broad that it is guaranteed to cover all but the most unrealistic values of θ.
- OR: ask an expert
- Conjugate priors: some pairs
  - normal prior and normal likelihood
  - beta prior and binomial likelihood
  - beta prior and geometric likelihood
  - gamma prior and Poisson likelihood
  - gamma prior and normal likelihood
  - gamma prior and gamma likelihood
  - etc.
  work out nicely analytically, so are often used (a sketch of the beta–binomial pair follows this list).
- Jeffreys prior f(θ) ∝ √I(θ) is invariant under reparametrization
- If the prior looks a lot like the posterior, your experiment is rather questionable.

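A sketch of the second conjugate pair on the list, a beta prior with a binomial likelihood (the numbers are invented):

    from scipy import stats

    a, b = 2.0, 2.0  # Beta(2, 2) prior on the success probability
    k, n = 7, 10     # observe 7 successes in 10 trials
    posterior = stats.beta(a + k, b + n - k)  # conjugacy: still a Beta
    print(posterior.mean(), posterior.interval(0.95))
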
Jeffreys prior example

- Likelihood Bernoulli(θ): f(x | θ) = θ^x (1 − θ)^{1−x}
- Could call the Uniform(0, 1) distribution an uninformative prior
- Jeffreys prior: f(θ) ∝ I(θ)^{1/2}, where

  I(θ) = Var(∂/∂θ log f(X | θ)) = Var(X/θ − (1 − X)/(1 − θ)) = Var((X − θ)/(θ(1 − θ))) = 1/(θ(1 − θ))

  → f(θ) = Beta(1/2, 1/2)

- Observe n samples: k 1s and n − k 0s. Posterior = Beta(1/2 + k, 1/2 + n − k)
- Credible interval: 2.5% and 97.5% quantiles of the posterior distribution [qbeta] (sketched below with scipy)

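A sketch of the credible interval, using scipy's beta quantile function in place of R's qbeta (the counts are invented):

    from scipy import stats

    k, n = 13, 40  # 13 ones in 40 Bernoulli samples
    posterior = stats.beta(0.5 + k, 0.5 + n - k)
    lo, hi = posterior.ppf([0.025, 0.975])
    print(lo, hi)  # 2.5% and 97.5% quantiles of the posterior
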