Scientific Methods 1
‘Scientific evaluation, experimental design & statistical methods’
COMP80131
Lecture 8: Statistical Methods - Significance tests & confidence limits
Barry & Goran
www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131
1 Dec 2011 COMP80131-SEEDSM8

Introduction

• Statistical significance testing has so far been applied on the assumption of either
  (1) a discrete population with a binomial distribution, or
  (2) a continuous population with a known normal pdf & known std.
• Before proceeding further, take a quick look at a few more probability distributions & pdfs.
• Significance testing can be adapted to any of these.

Exponential pdf

• Lifetimes, e.g. of light bulbs, follow an exponential distribution:

$$\mathrm{pdf}(x) = \begin{cases} 0 & x < 0 \\ (1/\mu)\, e^{-x/\mu} & x \ge 0 \end{cases}$$

• Mean = $\mu$; std = also $\mu$.

    mu = 2;              % mean of the distribution (named mu so that MATLAB's mean() is not shadowed)
    x = 0:0.1:10;
    y = exppdf(x, mu);
    plot(x, y);

[Figure: exponential pdf with mean 2, plotted against x from 0 to 10; vertical axis: pdf]

Poisson Distribution

• For applications that involve counting the number of times a random event occurs in a given amount of time, e.g. the number of people walking into a store in an hour:

$$\mathrm{prob}(x) = \frac{\lambda^{x} e^{-\lambda}}{x!} \quad \text{where } x \text{ is a non-negative integer}$$

• $\lambda$ is both the mean & the variance of the distribution.
• Poisson & exponential distributions are related.
• If the number of counts follows a Poisson distribution, then the interval between individual counts follows an exponential distribution.
• As $\lambda$ gets larger, the Poisson pdf approaches a normal pdf with $\mu = \lambda$, $\sigma^2 = \lambda$.
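The last bullet can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes the Statistics Toolbox functions `poisspdf` and `normpdf` are available, and `lambda = 20` is an illustrative value chosen here.

```matlab
% Sketch: compare a Poisson distribution with its normal approximation.
% lambda = 20 is an assumed illustrative value, not taken from the slide above.
lambda = 20;
x = 0:60;

p_poisson = poisspdf(x, lambda);                % Poisson probabilities prob(x)
p_normal  = normpdf(x, lambda, sqrt(lambda));   % normal pdf with mu = lambda, sigma^2 = lambda

% Largest pointwise gap between the two curves
max_diff = max(abs(p_poisson - p_normal));
fprintf('Largest difference for lambda = %g: %.4f\n', lambda, max_diff);

% Overlay the two for a visual comparison
stem(x, p_poisson);
hold on;
plot(x, p_normal, 'r');
hold off;
```

Repeating this with a small value such as `lambda = 2` should show a visibly worse fit, because the Poisson distribution is then strongly skewed.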
Poisson distributions in MATLAB

    x = 0:60;
    y = poisspdf(x, 20);
    stem(x, y);

    x = 0:16;
    y = poisspdf(x, 5);
    stem(x, y);

[Figure: two stem plots of prob(x) against x, one per code fragment above: x from 0 to 60 and x from 0 to 16]

Chi-squared distribution

Given V independent normally distributed random variables $X_1, X_2, \dots, X_V$, all with mean = 0 & std = 1, let

$$\chi^2(V) = X_1^2 + X_2^2 + \dots + X_V^2$$

Then the pdf of samples x of $\chi^2$ is:

$$\mathrm{pdf}(x) = \begin{cases} 0 & x < 0 \\ \dfrac{1}{2^{V/2}\,\Gamma(V/2)}\; x^{V/2-1}\, e^{-x/2} & x \ge 0 \end{cases}$$

The ‘Gamma function’ $\Gamma(x)$ is a generalisation of the factorial to non-integers: $\Gamma(n) = (n-1)!$ for positive integers n.

This pdf tells us about the variance of a population. If s is the sample std of n observations from a normally distributed pop with std $\sigma$, then with V = n - 1:

$$V s^2 / \sigma^2 \sim \chi^2(V)$$

Plot chi2 pdf with V = 4

    x = 0:0.2:15;
    y = chi2pdf(x, 4);
    plot(x, y)

[Figure: chi-squared pdf with V = 4, plotted against x from 0 to 15; vertical axis: pdf]

Student’s t-distribution pdf

Depends on a single parameter V (degrees of freedom). As $V \to \infty$, the t-pdf approaches the standard normal distribution.

$$\mathrm{pdf}(t) = \frac{\Gamma\big((V+1)/2\big)}{\sqrt{V\pi}\;\Gamma(V/2)} \left(1 + t^2/V\right)^{-(V+1)/2}, \quad -\infty < t < \infty$$

If x is a random sample of size n from a normal distribution with mean $\mu$, then the t-statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ s = \text{sample-stdev})$$

has Student's t-pdf with V = n - 1 degrees of freedom.

Compare t-pdf (V = 5) with normal

    x = -5:0.1:5;
    y = tpdf(x, 5);
    z = normpdf(x, 0, 1);
    plot(x, y, 'b', x, z, 'r');

[Figure: t-pdf (blue) and normal pdf (red), plotted against x from -5 to 5]

MATLAB functions for t-dist

• pdf for the t-distribution with V degrees of freedom: y = tpdf(t, V) (for a sample of n values, V = n - 1).
• Cumulative df with V degrees of freedom: p = tcdf(t, V)
  Prob of rand var being $\le$ t
• Complementary df (area under the ‘tail’ from t to $\infty$): p = 1 - tcdf(t, V)
  Prob of rand var being > t

Inverse-cdf in MATLAB

• Inverse of the cumulative distribution function:
• If p = tcdf(t, V) then t = tinv(p, V)
  Value of t such that prob of rand var being $\le$ t is p
• If p = normcdf(z, m, sigma) then z = norminv(p, m, sigma)
  Value of z such that prob of rand var being $\le$ z is p
• Complementary version: t = tinv(1-p, V)
  Value of t such that prob of rand var being > t is p.
• Similarly for the complementary version of norminv.

Significance testing: z-test

• Assume a Normal population with known stddev = $\sigma$.
• Null hypothesis: pop-mean = $\mu_0$
• Alternative hyp: pop-mean > $\mu_0$ (the upper tail, which is the tail the p-value below measures)
• Take one sample of n values & calculate the z-statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ \sigma = \text{pop-stdev})$$

If pop-mean = $\mu_0$, the dist of z will be standard Normal (mean = 0, std = 1).
If the mean of z is 0, how likely is a value $\ge$ z as just calculated?

p-value = prob(x $\ge$ z) = 1 - normcdf(z, 0, 1)

If p-value < significance level alpha ($\alpha$), reject null hyp.

[Figure: standard Normal pdf plotted against z, with the upper tail beyond the calculated z marked]

Alternative formulation

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ \sigma = \text{pop-stdev})$$

Assuming we need 95% confidence, $\alpha$ = 0.05.
Let z($\alpha$) = norminv(1-$\alpha$, 0, 1) $\approx$ 1.65
Prob of getting a rand var $\ge$ 1.65 is less than 0.05.
If z $\ge$ 1.65, it is outside our 95% ‘confidence limit’ that the null hyp may be true. So reject null hyp.
Confidence limit for z is $-\infty$ to 1.65.
Neglect the possibility that z may be negative (1-tailed test).
Confidence limit for sample-mean is $-\infty$ to $\mu_0 + 1.65\,\sigma/\sqrt{n}$.

2-tailed test

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ \sigma = \text{pop-stdev})$$

Assuming we need 95% confidence, $\alpha$ = 0.05.
Allowing the possibility that z < 0, the extreme portions of the tails are for z > z($\alpha$/2) and for z < -z($\alpha$/2).
prob(z $\ge$ z($\alpha$/2)) + prob(z $\le$ -z($\alpha$/2)) = 2 prob(z $\ge$ z($\alpha$/2)) = $\alpha$
Now, z($\alpha$/2) = norminv(1-$\alpha$/2, 0, 1) = 1.96
Prob of getting a rand var $\ge$ 1.96 or $\le$ -1.96 is 0.05.
If z > 1.96 or z < -1.96, it is outside our 95% ‘confidence limit’ that the null hyp may be true. So reject null hyp.
Confidence limits for z are -1.96 to 1.96.
Confidence limits for sample-mean are $\mu_0 - 1.96\,\sigma/\sqrt{n}$ to $\mu_0 + 1.96\,\sigma/\sqrt{n}$.

Significance testing: t-test

• Assume a Normal population with unknown stddev.
• Null hypothesis: pop-mean = $\mu_0$
• Alternative hyp: pop-mean > $\mu_0$ (the upper tail, which is the tail the 1-tailed p-value below measures)
• Take one sample of n values & calculate the t-statistic:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ s = \text{sample-stdev})$$

If pop-mean = $\mu_0$, the dist of t will be the standard t-pdf (blue in the figure) with V = n - 1.
How likely is the calculated value of t?

‘1-tailed’ p-value = prob(x $\ge$ t) = 1 - tcdf(t, n-1)

If p-value < significance level alpha ($\alpha$), reject null hyp.

[Figure: t-pdf (blue) and normal pdf (red), plotted from -5 to 5, with the calculated t marked]

Alternative formulation (2-tailed)

• Null hyp is that pop-mean is $\mu_0$:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ s = \text{sample-stdev})$$

• Assuming we need 95% confidence, $\alpha$ = 0.05.
• Confidence limits for $\mu_0$ are:

$$\bar{x} - \mathrm{tinv}(1-\alpha/2,\ n-1)\, s/\sqrt{n} \quad \text{to} \quad \bar{x} + \mathrm{tinv}(1-\alpha/2,\ n-1)\, s/\sqrt{n}$$

• If the value of $\mu_0$ is outside these limits, reject the null hyp that the population mean is $\mu_0$. Can say with 95% confidence that pop-mean > $\mu_0$ or < $\mu_0$.
• If $\mu_0$ is within these confidence limits, cannot reject null-hyp.

Difference betw z-test & t-test (2-tailed)

• With z-test, pop-std ($\sigma$) is known; with t-test, $\sigma$ is unknown.
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ \sigma = \text{pop-stdev})$$

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \quad (\text{with } \bar{x} = \text{sample-mean} \ \&\ s = \text{sample-stdev})$$

For z-test, p-value = prob(x $\ge$ z) = 1 - normcdf(z, 0, 1)
For t-test, p-value = prob(x $\ge$ t) = 1 - tcdf(t, n-1)
(These are the 1-tailed p-values; for a 2-tailed test use |z| or |t| and double the result.)

Same null-hyp: pop-mean = $\mu_0$: reject if $\mu_0$ is outside the conf limits.

Confidence limits for z-test:

$$\bar{x} - \mathrm{norminv}(1-\alpha/2,\ 0,\ 1)\, \sigma/\sqrt{n} \quad \text{to} \quad \bar{x} + \mathrm{norminv}(1-\alpha/2,\ 0,\ 1)\, \sigma/\sqrt{n}$$

Confidence limits for t-test:

$$\bar{x} - \mathrm{tinv}(1-\alpha/2,\ n-1)\, s/\sqrt{n} \quad \text{to} \quad \bar{x} + \mathrm{tinv}(1-\alpha/2,\ n-1)\, s/\sqrt{n}$$

Non-Gaussian populations

• If samples of size n are ‘randomly’ chosen from a pop with mean $\mu$ & std $\sigma$, the pdf of their mean, m1 say, approaches a Normal (Gaussian) pdf with mean $\mu$ & std $\sigma/\sqrt{n}$ as n is made larger & larger.
• Regardless of whether the population is Gaussian or not!
• This is the Central Limit Theorem.
• Tests can be made to work for non-Gaussian populations provided n is ‘large enough’.

Barry’s Assignment

• Deadline 20 Dec 2011
• Email to barry@man.ac.uk with ‘SEEDSM’ in title or
• Hand in paper copy to SSO
• Exam statistics are in examdata.dat and examdata.xls in www.cs.man.ac.uk/~barry/mydocs/MyCOMP80131 (or navigate from www.cs.man.ac.uk/~barry)

Question 1

• What are the essential differences between Bayesian and ‘frequentist’ statistics?

Question 2: fair coin test

Suppose we obtain heads 15 times out of 20 flips of a coin. By establishing confidence limits, state whether it is likely to be a fair coin.

Question 3: Exam statistics

• Analyse the fictitious exam results & comment on features.
• Compute means, stds & vars for each subject & histograms for the distributions.
• Make observations about performance in each subject & overall.
• Do marks support the hypothesis that people good at Music are also good at Maths?
• Do they support the hypothesis that people good at English are also good at French?
• Do they support the hypothesis that people good at Art are also good at Maths?
• If you have access to only 50 rows of this data, investigate the same hypotheses.
• What conclusions could you draw, and with what degree of certainty?

Question 4: Bayes Theorem

(a) A patient goes to a doctor with a bad cough & a fever. The doctor needs to decide whether he has ‘swine flu’. Let statement S = ‘has bad cough and fever’ & statement F = ‘has swine flu’. The doctor consults his medical books and finds that about 40% of patients with swine-flu have these same symptoms. Assuming that, currently, about 1% of the population is suffering from swine-flu and that currently about 5% have bad cough and fever (due to many possible causes including swine-flu), we can apply Bayes theorem to estimate the probability of this particular patient having swine-flu.

(b) A doctor in another country knows from his text-books that for 40% of patients with swine-flu, the statement S, ‘has bad cough and fever’, is true. He sees many patients and comes to believe that the probability that a patient with ‘bad cough and fever’ actually has swine-flu is about 0.1 or 10%. If there were reason to believe that, currently, about 1% of the population have a bad cough and fever, what percentage of the population is likely to be suffering from swine-flu?
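For reference, Bayes theorem can be stated in the notation of this question. The statements S and F are as defined in part (a); the conditional-probability notation $P(F \mid S)$, meaning the probability of F given that S is true, is an assumption made here and does not appear on the slide.

```latex
P(F \mid S) = \frac{P(S \mid F)\, P(F)}{P(S)}
```

The relation links four quantities, so any one of them can be found once the other three are known.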