Scientific Methods 1
‘Scientific evaluation, experimental design & statistical methods’
COMP80131
Lecture 8: Statistical Methods - Significance tests & confidence limits
Barry & Goran
www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131
10 Dec 2012 COMP80131-SEEDSM8

Introduction
• Statistical significance testing has so far been applied on the assumption of either:
  1) a discrete population with a binomial distribution, or
  2) a continuous population with a known normal pdf & stdev.
• Before proceeding further, take a quick look at a few more probability distributions & pdfs.
• Significance testing can be adapted to any of these.

Exponential pdf
• Lifetimes, e.g. of light bulbs, follow an exponential distribution:
  $pdf(x) = \begin{cases} (1/\mu)\, e^{-x/\mu} & : x \ge 0 \\ 0 & : x < 0 \end{cases}$
• Mean = $\mu$; stdev = $\mu$ also.
    mu = 2; x = 0:0.1:10; y = exppdf(x,mu); plot(x,y);
[Figure: exponential pdf with mean 2, pdf against x for x from 0 to 10]

Poisson Distribution
• For applications that involve counting the number of times a random event occurs in a given amount of time, e.g. the number of people walking into a store in an hour:
  $prob(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$ where x is an integer
• $\lambda$ is both the mean & the variance of the distribution.
• Poisson & exponential distributions are related.
• If the number of counts follows a Poisson distribution, then the interval between individual counts follows an exponential distribution.
• As $\lambda$ gets larger, the Poisson pdf approaches a normal pdf with $\mu = \lambda$, $\sigma^2 = \lambda$.

Poisson distributions in MATLAB
    x = 0:16; y = poisspdf(x,5); stem(x,y);
    x = 0:60; y = poisspdf(x,20); stem(x,y);
[Figure: two stem plots of prob(x) against x, one for λ = 5 (x from 0 to 16) and one for λ = 20 (x from 0 to 60)]

Chi-squared distribution
• Given a population of normally distributed random variables with mean = 0 & stdev = 1.
• Randomly choose a sample of V observations of them.
• Let x be the sum of their squares.
• Then the pdf of x has the $\chi^2$ distribution:
  $\chi^2_V(x) = \begin{cases} \dfrac{1}{2^{V/2}\,\Gamma(V/2)}\; x^{V/2-1}\, e^{-x/2} & : x \ge 0 \\ 0 & : x < 0 \end{cases}$
  (The ‘Gamma function’ $\Gamma(x)$ is a generalisation of the factorial to non-integers, with $\Gamma(n) = (n-1)!$ for integer n.)
• If s = stdev of the V observations, $pdf(s^2) \sim (1/V)\,\chi^2_V(s^2)$
• If pop mean = $\mu$ & stdev = $\sigma$, $pdf(s^2) \sim (1/V)\,\chi^2_V(s^2/\sigma^2 + \mu^2)$

Plot chi2 pdf with V = 4
    x = 0:0.2:15; y = chi2pdf(x,4); plot(x,y)
[Figure: chi-squared pdf with V = 4, pdf against x for x from 0 to 15]

Student’s t-distribution pdf
• Depends on a single parameter V (degrees of freedom).
• As $V \to \infty$, the t-pdf approaches the standard normal distribution.
  $pdf(t) = \dfrac{\Gamma((V+1)/2)}{\sqrt{V\pi}\;\Gamma(V/2)} \left(1 + t^2/V\right)^{-(V+1)/2} \quad : -\infty < t < \infty$
• If x is a random sample of size n from a normal distribution with mean $\mu$, then the t-statistic
  $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ (with $\bar{x}$ sample mean & s sample stdev)
  has Student's t-pdf with V = n - 1 degrees of freedom.

Compare t-pdf (V=5) with normal
    x = -5:0.1:5; y = tpdf(x,5); z = normpdf(x,0,1); plot(x,y,'b',x,z,'r');
[Figure: T-pdf (blue) and Norm-pdf (red) for x from -5 to 5]

MATLAB functions for t-dist
• pdf for t-distribution with V degrees of freedom:
    y = tpdf(t,V);
  (With samples of n values, V = n-1)
• Cumulative df with V degrees of freedom:
    p = tcdf(t,V)
  Prob of rand var being ≤ t
• Complementary df (area under ‘tail’ from t to ∞):
    p = 1 - tcdf(t,V)
  Prob of rand var being > t

Inverse-cdf in MATLAB
• Inverse of cumulative distribution function
• If p = tcdf(t,V) then t = tinv(p,V)
  Value of t such that prob of rand var being ≤ t is p
• If p = normcdf(z,m,σ) then z = norminv(p,m,σ)
  Value of z such that prob of rand var being ≤ z is p
• Complementary version: t = tinv(1-p,V)
  Value of t such that prob of rand var being > t is p.
• Similarly for the complementary version of norminv
[Figure: t-pdf with area p and point t marked on the x axis]

Significance testing: z-test
• Assume Normal population with known stdev = $\sigma$.
• Null-hypothesis: pop-mean = $\mu_0$
• Alternative hyp: pop-mean > $\mu_0$
• Take one sample of n values & calculate the z-statistic:
  $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (with $\bar{x}$ sample mean & $\sigma$ pop stdev)
If pop-mean = $\mu_0$, the distribution of z will be standard Normal (mean = 0, std = 1).
If the mean of z is 0, how likely is a value ≥ z as just calculated?
p-value = prob(x ≥ z), where x is a standard Normal rand var
        = 1-normcdf(z,0,1)
If p-value < significance level alpha (α), reject null-hyp.
[Figure: Std Normal pdf against z]

Alternative formulation
  $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (with $\bar{x}$ sample mean & $\sigma$ pop stdev)
Assuming we need 95% confidence, α = 0.05.
Let z(α) = norminv(1-α, 0, 1) ≈ 1.65 (more precisely 1.645).
Prob of getting a rand var ≥ 1.65 is less than 0.05.
If z ≥ 1.65, it is outside our 95% ‘confidence limit’ that the null-hyp may be true. So reject null-hyp.
Confidence limit for z is $-\infty$ to 1.65.
Neglect the possibility that z may be negative (1-tailed test).
Confidence limit for sample-mean is $-\infty$ to $\mu_0 + 1.65\,\sigma/\sqrt{n}$.

2-tailed test
  $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (with $\bar{x}$ sample mean & $\sigma$ pop stdev)
Assuming we need 95% confidence, α = 0.05.
Allowing the possibility that z < 0, the extreme portions of the tails are for z > z(α/2) and for z < -z(α/2).
prob(z ≥ z(α/2)) + prob(z ≤ -z(α/2)) = 2 prob(z ≥ z(α/2)) = α
Now, z(α/2) = norminv(1-α/2,0,1) = 1.96
Prob of getting a rand var ≥ 1.96 or ≤ -1.96 is 0.05.
If z > 1.96 or z < -1.96, it is outside our 95% ‘confidence limit’ that the null-hyp may be true. So reject null-hyp.
Confidence limits for z are -1.96 to 1.96.
Confidence limits for sample-mean are: $\mu_0 - 1.96\,\sigma/\sqrt{n}$ to $\mu_0 + 1.96\,\sigma/\sqrt{n}$

Significance testing: t-test
• Assume Normal population with unknown stdev.
• Null-hypothesis: pop-mean = $\mu_0$
• Alternative hyp: pop-mean > $\mu_0$
• Take one sample of n values & calculate the t-statistic:
  $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (with $\bar{x}$ sample mean & s sample stdev)
If pop-mean = $\mu_0$, the distribution of t will be the standard t-pdf (blue) with V = n-1.
[Figure: T-pdf (blue) and Norm-pdf (red) for x from -5 to 5]
How likely is the calculated value of t?
‘1-tailed’ p-value = prob(x ≥ t)
                   = 1 - tcdf(t, n-1)
If p-value < significance level alpha (α), reject null-hyp.

Alternative formulation (2-tailed)
• Null-hyp is that pop-mean is $\mu_0$:
  $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (with $\bar{x}$ sample mean & s sample stdev)
• Assuming we need 95% confidence, α = 0.05
• Confidence limits for $\mu_0$ are:
  $\bar{x} - \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n}$ to $\bar{x} + \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n}$
If the value of $\mu_0$ is outside these limits, reject the null-hyp that the population mean is $\mu_0$.
If $\mu_0$ is within these confidence limits, we cannot reject the null-hyp.

Difference between z-test & t-test (2-tailed)
• With the z-test, pop-std ($\sigma$) is known; with the t-test, $\sigma$ is unknown.
  $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (with $\bar{x}$ sample mean & $\sigma$ pop stdev)
  $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (with $\bar{x}$ sample mean & s sample stdev)
For z-test, 1-tailed p-value = prob(x ≥ z) = 1 - normcdf(z,0,1)
For t-test, 1-tailed p-value = prob(x ≥ t) = 1 - tcdf(t,n-1)
Same null-hyp: pop-mean = $\mu_0$; reject if $\mu_0$ is outside the conf limits.
Confidence limits for z-test:
  $\bar{x} - \mathrm{norminv}(1-\alpha/2,\, 0,\, 1)\, \sigma/\sqrt{n}$ to $\bar{x} + \mathrm{norminv}(1-\alpha/2,\, 0,\, 1)\, \sigma/\sqrt{n}$
Confidence limits for t-test:
  $\bar{x} - \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n}$ to $\bar{x} + \mathrm{tinv}(1-\alpha/2,\, n-1)\, s/\sqrt{n}$

Non-Gaussian populations
• If samples of size n are ‘randomly’ chosen from a pop with mean $\mu$ & std $\sigma$, the pdf of their sample-means approaches a Normal (Gaussian) pdf with mean $\mu$ & stdev $\sigma/\sqrt{n}$ as n → ∞.
• Regardless of whether the population is Gaussian or not!
• This is the Central Limit Theorem.
• Tests can be made to work for non-Gaussian populations provided n is ‘large enough’.

Meaning of confidence limits
If α = 0.05, there is 95% probability that the confidence limits for a given sample will contain the true population statistic, $\mu$ say.
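The ‘95% of samples’ reading of confidence limits can be checked by simulation. The following is a minimal MATLAB sketch, not part of the lecture: it assumes the Statistics Toolbox function tinv used elsewhere in these slides, and the values of mu, sigma, n and nTrials are illustrative choices only.

```matlab
% Sketch: how often do 95% t-based confidence limits contain the true mean?
% Assumed, illustrative values (not from the lecture): mu, sigma, n, nTrials.
rng(1);                                  % fix the seed so the run is repeatable
mu = 10; sigma = 2;                      % true population mean & stdev
n = 20; nTrials = 10000; alpha = 0.05;

tcrit = tinv(1 - alpha/2, n - 1);        % 2-tailed critical value, V = n-1
x = mu + sigma*randn(n, nTrials);        % each column is one sample of n values
xbar = mean(x);                          % sample mean of each column
s = std(x);                              % sample stdev of each column
halfWidth = tcrit * s / sqrt(n);         % half-width of the confidence limits

% covered(k) is 1 if the limits from sample k contain mu, else 0
covered = (xbar - halfWidth <= mu) & (mu <= xbar + halfWidth);
coverage = mean(covered);                % fraction of samples whose limits contain mu
fprintf('Fraction of samples whose limits contain mu: %.3f\n', coverage);
```

The printed fraction should come out close to 0.95. For any single sample, covered is simply 0 or 1: the 95% describes the procedure over many possible samples, not one particular pair of limits.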
A really subtle point
• Does this mean that there is a 95% probability that $\mu$ lies within the 95% confidence limits for the given sample?
• No! A common mistake!
• We have just one sample: we have no idea whether it is one whose confidence limits contain $\mu$ or not.
• Only 95% of possible samples will have conf limits which contain $\mu$.

P-values & confidence limits in MATLAB
• Come for free with most measurements. For example:
    x = [1;2;3;4;5;6]; y = [1.1;3;2;4;6;4];
    [R, p_value, Rlo, Rup] = corrcoef(x,y)
  (Each output is a 2×2 matrix; the values quoted here are the off-diagonal elements.)
• Returns Pearson corr coeff R = 0.79,
• p_value = 0.061,
• Also 95% confidence limits: Rlo = -0.06, Rup = 0.98
• 95% confidence that the true corr lies between -0.06 & 0.98
• “Returns p-values for testing the hypothesis of no correlation. Each p-value is probability of getting a correlation as large as the observed value by random chance, when the true correlation is zero. If p_value is small, say < 0.05, then the correlation is significant”.

Credibility limits
• Bayesian equivalent of ‘confidence limits’
• If limits are C1 to C2, & α = 0.05
• Now there is 95% probability of the statistic, $\mu$ say, lying between C1 & C2.
• ‘Confidence limits’ are ‘frequentist’
• Jonas explained why many people distrust the frequentist approach and consider the Bayesian approach to be much more reliable.

Reminder: Binomial distribution
• If p = prob(Heads), the prob of getting Heads exactly r times in n independent coin-tosses is:
  $^nC_r\; p^r (1-p)^{n-r}$
• For a fair coin,
p = 0.5, this becomes $^nC_r / 2^n$
[Figure: binomial distribution; true probability of getting that no of heads against no of heads obtainable with n coin-tosses, for 0 to 20 heads]

Binomial dist with n=6
[Figure: true probability of getting that no of heads against no of heads obtainable with n coin-tosses, for 0 to 6 heads; the value 0.0156 is marked]

MATLAB Script
    p = 0.5;   % for coin tossing
    n = 6;
    for r = 0:n
        nCr = prod(n:-1:(n-r+1))/prod(1:r);      % number of ways of choosing r from n
        Prob(1+r) = nCr * (p^r) * (1-p)^(n-r);   % prob of exactly r heads
    end;
    Prob
    figure(2); stem(0:n,Prob);

Geometric distribution (p = prob of success)
• $p(x) = (1-p)\,p^{x-1}$
• x is the number of trials (coin tosses) up to & including that in which the first failure occurs
    p = 0.5; x = 1:10; prob = (1-p)*p.^(x-1); stem(x,prob);
[Figure: prob of first failure at x against x: number of trials, for x from 1 to 10]

Geometric distribution (again)
[Figure: prob of first failure at x against x: number of trials, with prob(5) = 0.0313 and prob(6) = 0.0156 marked]

Barry’s Assignment
• Deadline 20 Dec 2012
• Email to barry@man.ac.uk with ‘SEEDSM12’ in title
• or hand in paper copy to SSO
• Exam statistics are in examdata.dat and examdata.xls
• in www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131
• (or navigate from www.cs.man.ac.uk/~barry)