• Don’t spam class email lists!
• Farshad (farshad.nemati@uleth.ca) has prepared a suggested format for your final project. It will be on the web site.

The Z statistic

$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$  where  $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$

The t Statistic(s)
• Using an estimated $\sigma^2$, which we’ll call $\hat{\sigma}^2$, we can create an estimate of $\sigma_{\bar{X}}$, which we’ll call $\hat{\sigma}_{\bar{X}}$
• Estimate: $\hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n-1} = \frac{nS^2}{n-1}$  and:  $\hat{\sigma}_{\bar{X}} = \frac{\hat{\sigma}}{\sqrt{n}}$
• Caution: some textbooks and some calculators use the symbol $S^2$ to represent this estimated population variance
• Using $\hat{\sigma}_{\bar{X}}$ instead of $\sigma_{\bar{X}}$, we get a statistic that isn’t from a normal (Z) distribution; it is from a family of distributions called t:

$t_{n-1} = \frac{\bar{X} - \mu}{\hat{\sigma}_{\bar{X}}}$

• What’s the difference between t and Z?
• Nothing if n is really large (approaching infinity), because n−1 and n are almost the same number!
• With small values of n, the shape of the t distribution depends on the degrees of freedom (n−1); specifically, it is flatter but still symmetric
• Since the shape of the t distribution depends on the d.f., the fraction of t scores falling within any given range also depends on the d.f.
• The Z table isn’t useful (unless n is huge); instead we use a t table, which gives $t_{crit}$ for different degrees of freedom (and both one- and two-tailed tests)
• There is a t table on page 142 of your book
• Look it over; notice how $t_{crit}$ changes with the d.f. and the alpha level
• The logic of using this table to test the alternative hypothesis against the null hypothesis is precisely as with Z scores; in fact, the values in the bottom row are given by the Z table, and the familiar ±1.96 appears for α = .05 (two-tailed)

An Example
• You have a theory that drivers in Alberta are illegally speedy
– Prediction: the mean speed on highway 2 between Ft. Mac and Calgary is greater than 110
• Here’s another way to say that: a sample of n drivers on the highway is not a sample from a population of drivers with a mean speed of 110
• Set up the problem:
– Null hypothesis: your sample of drivers on highway 2 is representative of a population with an average speed of 110 km/hr, so $\mu_{\bar{X}} = 110$ and $t = \frac{\bar{X} - 110}{\hat{\sigma}_{\bar{X}}} < t_{crit}$ in 95% of such samples
– Alternative hypothesis: the sample of drivers is from a population with a mean speed greater than 110, thus $\mu_{\bar{X}} > 110$ and $t = \frac{\bar{X} - 110}{\hat{\sigma}_{\bar{X}}} \ge t_{crit}$
• Here are some (fake) data:

Car #:  1    2    3    4    5
Speed:  105  118  112  121  116

$\bar{X} = 114.4$,  $S^2 = 30.64$
$\hat{\sigma}^2 = \frac{nS^2}{n-1} = \frac{5 \times 30.64}{4} = 38.3$
$\hat{\sigma}_{\bar{X}} = \sqrt{\frac{\hat{\sigma}^2}{n}} = \sqrt{\frac{38.3}{5}} = 2.768$
$t_{4\,d.f.} = \frac{\bar{X} - \mu}{\hat{\sigma}_{\bar{X}}} = \frac{114.4 - 110}{2.768} = 1.59$

– $t_{crit}$ for a one-tailed test with 5 − 1 = 4 d.f. is 2.1318
– Our computed t = 1.59 does not exceed $t_{crit}$, thus we cannot reject the null hypothesis
– We conclude there is no evidence to support our hypothesis that drivers are speeding on highway 2
– Does this mean that drivers are not speeding on highway 2?
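The one-sample calculation above can be checked numerically. Here is a minimal sketch in Python (not part of the slides) that reproduces the highway 2 example, using the slides' $\hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$ estimator:

```python
import math

# Highway 2 speeds from the slides (n = 5)
speeds = [105, 118, 112, 121, 116]
mu = 110                                    # hypothesized population mean under H0
n = len(speeds)

x_bar = sum(speeds) / n                     # sample mean
ss = sum((x - x_bar) ** 2 for x in speeds)  # sum of squared deviations
var_hat = ss / (n - 1)                      # estimated population variance
se_hat = math.sqrt(var_hat / n)             # estimated standard error of the mean
t = (x_bar - mu) / se_hat                   # t with n - 1 = 4 d.f.

print(round(x_bar, 1), round(var_hat, 1), round(t, 2))  # 114.4 38.3 1.59
```

Since 1.59 < 2.1318 ($t_{crit}$ for 4 d.f., one-tailed, α = .05), the null hypothesis is not rejected, matching the conclusion on the slide.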
T-test for one sample mean
• We’ve discussed how to create and use a t statistic when we want to compare a sample mean to a hypothesized mean:

$t_{n-1} = \frac{\bar{X} - \mu}{\hat{\sigma}_{\bar{X}}}$

t Tests for Two Sample Means
• We’re often interested in a more sophisticated and powerful experimental design…
• Usually we perform some experimental manipulation and look for a change on some score or variable, e.g. before and after taking a drug
• We manipulate a variable (e.g. drug dose) and we want to know whether some other variable (e.g. fever) depends on our manipulation
• Let’s introduce some formal terms:
– Independent variable: the variable that you control
– Dependent variable: the variable that depends on the experimental manipulation (the one you measure)
• Example: let’s ask whether or not Tylenol reduces fever. There are two ways you could do this:
1. Get a bunch of people with fevers, give half of them Tylenol and half of them a placebo, and then measure their temperatures
2. Get a bunch of people with fevers, measure their temperatures, then give them Tylenol and measure them again
• Repeated measures: an experiment in which the same subject (or object) is measured in two (or more!) conditions
• The two samples are actually pairs of scores, and those pairs are correlated or dependent
• This type of t test is called a test for two dependent sample means (sometimes called a paired t-test)

t Tests for Two Dependent Sample Means
• When comparing two paired samples we’re often not interested in the absolute scores, but we are interested in the differences between scores:

Sample 1   Sample 2   Difference
X11        X21        X11 − X21
X12        X22        X12 − X22
...        ...        ...
X1n        X2n        X1n − X2n

• This is a sample of differences taken from a population of differences; it has a mean and a standard deviation
• If we’re wondering whether an independent variable has some effect on the dependent variable, then our null hypothesis is that there is no difference between the two paired measurements in our sample
• Some differences would be positive, some would be negative; on average the difference would be zero
• We can use a t-test to test whether the sample of differences has a mean that is significantly different from zero:

$t_{n-1} = \frac{\bar{D} - \mu_D}{\hat{\sigma}_{\bar{D}}}$

• This is done by simply treating your column of differences as a one-sample t-test with a null hypothesis that $\mu_D = 0$
• Some curiosities that make your life easier with regard to paired t-tests:
– Note that $\bar{D} = \bar{X}_1 - \bar{X}_2$
– And that n1 always equals n2
– As with the z-test, the t distribution is symmetric, so you can treat negative differences as if they were positive for comparing to $t_{crit}$
– Also as with the z-test, one- or two-tailed tests are possible; simply use the appropriate column from the t table

t Test for Two Independent Sample Means
• Often we have a situation in which repeated measures is inappropriate or impossible (e.g. any time measuring the dependent variable once alters subsequent measurements)
• In this situation we must use a between-subjects design
• The data are laid out like the repeated measures case, except they aren’t pairs of scores; the two columns are measurements of different subjects (objects, etc.)
• We thus usually only refer to a single measurement with respect to the mean of that sample:

$\bar{D} = \bar{X}_1 - \bar{X}_2$

• The null hypothesis states that these two independent samples are random samples from the same population
– so you would expect the difference to be zero on average
– therefore the numerator of the t statistic in this situation works just like the dependent samples case: $\bar{D} - \mu_D$, where $\mu_D = 0$
• The denominator is different because…
• How many degrees of freedom are there?
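Before working out that denominator, the paired (dependent-samples) test described above can be sketched as a one-sample t-test on the column of differences. The before/after fever readings below are made up purely for illustration; they are not from the slides:

```python
import math

# Hypothetical fevers (deg C) for 5 patients, before and after Tylenol.
# Made-up numbers, only to show the mechanics of the paired t-test.
before = [39.1, 38.6, 39.4, 38.9, 39.0]
after  = [38.2, 38.5, 38.4, 38.1, 38.6]

diffs = [b - a for b, a in zip(before, after)]  # column of differences
n = len(diffs)
d_bar = sum(diffs) / n                          # mean difference D-bar
ss = sum((d - d_bar) ** 2 for d in diffs)
se_hat = math.sqrt(ss / (n - 1) / n)            # estimated standard error of D-bar
t = (d_bar - 0) / se_hat                        # null: mu_D = 0, with n - 1 = 4 d.f.

print(round(t, 2))  # 3.78
```

With 4 d.f. this exceeds the one-tailed $t_{crit}$ of 2.1318, so for these invented data the null hypothesis of no temperature change would be rejected.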
t Test for Two Independent Sample Means
• The denominator is different because…
• How many degrees of freedom are there?
– The mean difference is based on two different samples, each with its own degrees of freedom
– So there are (n1 − 1) + (n2 − 1) = n1 + n2 − 2 d.f.
– The best estimate of the population standard deviation will incorporate both samples so that it has more degrees of freedom
• We can pool the sums of squares (which weights the variances according to the number in each sample):

$SS_{pooled} = \sum_{i=1}^{n_1}(X_{1i} - \bar{X}_1)^2 + \sum_{i=1}^{n_2}(X_{2i} - \bar{X}_2)^2 = n_1 S_1^2 + n_2 S_2^2$

• Then divide by the pooled degrees of freedom to estimate $\sigma^2$:

$\hat{\sigma}^2_{pooled} = \frac{SS_{pooled}}{n_1 + n_2 - 2}$

• Estimate $\sigma_{\bar{D}}$: both samples contribute to the standard error of the mean difference, so

$\hat{\sigma}^2_{\bar{D}} = \frac{\hat{\sigma}^2_{pooled}}{n_1} + \frac{\hat{\sigma}^2_{pooled}}{n_2}$  and…  $\hat{\sigma}_{\bar{D}} = \sqrt{\frac{\hat{\sigma}^2_{pooled}}{n_1} + \frac{\hat{\sigma}^2_{pooled}}{n_2}}$

• Now we can construct a t statistic:

$t_{n_1+n_2-2} = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_D}{\hat{\sigma}_{\bar{X}_1 - \bar{X}_2}}$

• Notice that this t statistic has more degrees of freedom than its dependent-samples counterpart
• Why does a repeated measures design still tend to have more power?
• Consider an example:
– Are northbound drivers slower than southbound drivers on highway 2?
– Null hypothesis: samples of n speeds taken from northbound and southbound traffic are from the same population
– Alternative hypothesis: samples of southbound drivers are from a population with a mean greater than that of northbound drivers

Car #   Northbound speed   Southbound speed
1       105                121
2       118                119
3       112                127
4       121                124
5       116                115

$\bar{X}_1 = 114.4$, $S_1^2 = 30.64$;  $\bar{X}_2 = 121.2$, $S_2^2 = 16.96$

$\hat{\sigma}^2_{pooled} = \frac{n_1 S_1^2 + n_2 S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{5 \times 30.64 + 5 \times 16.96}{(5-1) + (5-1)} = 29.75$

$\hat{\sigma}_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\hat{\sigma}^2_{pooled}}{n_1} + \frac{\hat{\sigma}^2_{pooled}}{n_2}} = \sqrt{\frac{29.75}{5} + \frac{29.75}{5}} = 3.45$

$t_{8\,d.f.} = \frac{\bar{X}_1 - \bar{X}_2}{\hat{\sigma}_{\bar{X}_1 - \bar{X}_2}} = \frac{114.4 - 121.2}{3.45} = -1.97$

For a one-tailed test at α = .05 with 8 d.f., $t_{crit}$ = 1.86. Since |−1.97| exceeds 1.86, we can reject the null hypothesis and conclude that southbound drivers are faster.

• Some caveats and disclaimers about independent-sample t-tests:
– There is an assumption of equal variance in the two underlying populations
• If this assumption is violated, your Type I error rate is greater than the indicated alpha!
• However, for samples of equal n, the t-test is quite robust to violations of this assumption (so you usually don’t have to worry about it)
– Note that n need not be equal! (but it’s better if it is)

Next Time:
• Too many t tests spoils the statistics
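As a check on the arithmetic, the northbound/southbound example can be reproduced with a short Python sketch (not part of the slides), pooling the sums of squares exactly as above:

```python
import math

# Speeds from the slides: northbound vs. southbound on highway 2
north = [105, 118, 112, 121, 116]
south = [121, 119, 127, 124, 115]

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(north), len(south)
var_pooled = (ss(north) + ss(south)) / (n1 + n2 - 2)  # pooled variance estimate
se = math.sqrt(var_pooled / n1 + var_pooled / n2)     # SE of the mean difference
t = (mean(north) - mean(south)) / se                  # d.f. = n1 + n2 - 2 = 8

print(round(var_pooled, 2), round(t, 2))  # 29.75 -1.97
```

Since |−1.97| > 1.86 ($t_{crit}$ for 8 d.f., one-tailed, α = .05), the null hypothesis is rejected, matching the slide's conclusion.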