Lecture 7
Analysis and Interpretation of Inferential Data
Using t Distribution
Reading
Best and Kahn
Chapter 11
Pages 406 to 423
Outline
•Independent versus Dependent
Samples
•Assumptions about the
independent-samples t-test
•Calculate the independent
sample t-test.
•Degrees of freedom for the
independent-samples t test.
•Using EXCEL to calculate t test
•Interpretation of t-test from
SPSS
•Presenting the results in APA
format.
Review Hypothesis Testing.








Identify hypothesis to be tested and put it in symbolic form.
Identify the null hypothesis
Identify the alternative hypothesis.
Select the significance level α based on the seriousness of the
Type 1 error.
Identify the statistic that is relevant to the test and identify
the sampling distribution.
Compute the test statistic, then find either the p value or the critical value.
Draw the graph.
Reject H0: the test statistic is in the critical region or the p value ≤ α.
Fail to reject H0: the test statistic is not in the critical region or the p value > α.
Finding P-Values
Type of test:
• Left-tailed: P-value = area to the left of the test statistic.
• Right-tailed: P-value = area to the right of the test statistic.
• Two-tailed, test statistic to the left of center: P-value = twice the area to the left of the test statistic.
• Two-tailed, test statistic to the right of center: P-value = twice the area to the right of the test statistic.
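The rules above can be sketched in Python using only the standard library. The function names (`t_pdf`, `t_cdf`, `p_value`) are illustrative, and the area under the t curve is approximated with Simpson's rule rather than read from a table, so treat the results as approximations.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """Area to the left of x, using Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def p_value(t_stat, df, tail):
    """tail is 'left', 'right', or 'two'."""
    left = t_cdf(t_stat, df)
    if tail == "left":
        return left
    if tail == "right":
        return 1 - left
    # Two-tailed: twice the area on whichever side the statistic falls.
    return 2 * min(left, 1 - left)

# Values taken from the diet soda example later in this lecture.
print(p_value(-1.531, 28, "two"))
```

For a two-tailed test, taking twice the smaller tail area is equivalent to the "left of center / right of center" branching in the list above.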
Independent versus Dependent Samples
Definition
Two samples drawn from two populations are
independent if the selection of one sample from one
population does not affect the selection of the second
sample from the second population. Otherwise, the
samples are dependent.
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All right reserved
Example 1
Suppose we want to estimate the difference between the
mean salaries of all male and all female executives. To do
so, we draw two samples, one from the population of male
executives and another from the population of female
executives. These two samples are independent because
they are drawn from two different populations, and the
samples have no effect on each other.
Example 2
Suppose we want to estimate the difference between the
mean weights of all participants before and after a weight
loss program. To accomplish this, suppose we take a
sample of 40 participants and measure their weights before
and after the completion of this program. Note that these
two samples include the same 40 participants. This is an
example of two dependent samples. Such samples are
also called paired or matched samples.
T-test: Definition
The t-test compares the means for two groups of individuals.
It is a versatile statistical test. It can be used to test
whether two group means are different.
For example:
• Is Nightmare on Elm Street 2 scarier than Nightmare on Elm Street 1? Watch both movies and measure heart rate.
• Does listening to music while you work improve your attention? Get some people to write an essay while listening to music and then write a different essay when working in silence. Then compare their essay grades.
Why use T test?
 Case 1
 The population standard deviation is not known.
 The sample size is small ( n < 30).
 The population from which the sample is selected is
normally distributed.
 Case 2
 The population standard deviation is not known
 The sample size is large ( n ≥ 30).
Hypothesis about two groups
Suppose I ask you about your anxiety in taking the basic statistics course, and I ask you to rate your anxiety level on a scale of 1 to 10, where 1 means very little anxiety and 10 indicates high anxiety.
I then pose the following questions:
1. Does male students' anxiety differ from female students' anxiety?
2. Do students who have previous mathematics experience suffer less anxiety?
3. Do part-time students experience as much anxiety as full-time students?
4. Are undergraduate students more anxious than postgraduate students?
Each one of these questions can be examined using an independent-samples t-test.
Is there a difference between the
two group means?
 If you calculate two sample means and they are
different, there are 2 possible reasons for the
difference.
1. Each group comes from a different population and
the sample means represent two different population
means. When this happens you reject the null
Hypothesis.
2. The groups come from the same population and the
means vary by chance. You just happen to pick two
groups with means that are far apart. You fail to
reject the null hypothesis.
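The second reason can be illustrated with a small simulation. This is only a sketch: the population mean and standard deviation (5.5 and 2.6) are hypothetical, both groups are deliberately drawn from the same population, and the statistic is the pooled-variance t defined later in this lecture. The critical value 2.074 is the two-tailed value for df = 22 at α = .05.

```python
import math
import random
import statistics

random.seed(11)  # fixed seed so the run is repeatable

def pooled_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * statistics.variance(a)
           + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

REPS, N, T_CRIT = 4000, 12, 2.074

false_alarms = 0
for _ in range(REPS):
    # Both groups come from the SAME hypothetical population.
    g1 = [random.gauss(5.5, 2.6) for _ in range(N)]
    g2 = [random.gauss(5.5, 2.6) for _ in range(N)]
    if abs(pooled_t(g1, g2)) > T_CRIT:
        false_alarms += 1

rate = false_alarms / REPS
print(f"share of same-population pairs that look 'different': {rate:.3f}")
```

The share should land near α = .05: even when the null hypothesis is true, roughly one pair in twenty has means far enough apart to fall in the critical region.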
The independent t-test
• Used in situations in which there are two experimental conditions and different participants are used in each condition.
• The assumptions of the independent-samples t-test:
  • The variable being measured is normally distributed.
  • The variances of the groups being assessed are equivalent (homogeneous).
  • Sample 1 is randomly sampled from population 1 and sample 2 is randomly sampled from population 2.
Independent sample t test equation
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\text{estimate of the standard error}}$$
Example
Estimate of Standard error
• Recall: the standard error tells us how variable the differences between sample means are by chance alone.
• If the standard deviation of the sampling distribution (the standard error) is high, then large differences between sample means can occur by chance.
• If it is small, then only small differences between the sample means are expected.
• The standard error of the sampling distribution is used to assess whether the difference between two sample means is statistically significant or simply a chance result.
Variance Sum Law
• The variance sum law is used to calculate the standard deviation of the sampling distribution of differences between sample means. It states:
  • The variance of the difference between independent variables is equal to the sum of their variances (Howell, 2006).
In essence this tells you that the variance of the sampling distribution of differences is equal to the sum of the variances of the sampling distributions of the two populations from which the samples were taken.
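A quick simulation makes the law concrete. The two normal populations below are hypothetical (means 50 and 45, standard deviations 3 and 4), chosen so that the sum of the variances is 9 + 16 = 25. This is a sketch of the idea, not a proof.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

# Hypothetical independent variables: X ~ N(50, 3^2), Y ~ N(45, 4^2).
xs = [random.gauss(50, 3) for _ in range(50_000)]
ys = [random.gauss(45, 4) for _ in range(50_000)]
diffs = [x - y for x, y in zip(xs, ys)]

var_x = statistics.pvariance(xs)
var_y = statistics.pvariance(ys)
var_diff = statistics.pvariance(diffs)

print(f"var(X) + var(Y) = {var_x + var_y:.2f}")
print(f"var(X - Y)      = {var_diff:.2f}")
```

Both printed values should be close to 25. Note that the variances add even though the variables are subtracted; this holds only because X and Y are independent.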
Calculate the Standard error of each population.
• Using the sample standard deviation, we calculate the standard error of each population’s sampling distribution.
  • SE of the sampling distribution of population 1 = $\frac{s_1}{\sqrt{N_1}}$
  • SE of the sampling distribution of population 2 = $\frac{s_2}{\sqrt{N_2}}$
• Recall: variance is equal to standard deviation squared. Calculate the variance of each sampling distribution.
  • Variance of the sampling distribution of population 1 = $\left(\frac{s_1}{\sqrt{N_1}}\right)^2 = \frac{s_1^2}{N_1}$
  • Variance of the sampling distribution of population 2 = $\left(\frac{s_2}{\sqrt{N_2}}\right)^2 = \frac{s_2^2}{N_2}$
• The variance sum law: to find the variance of the sampling distribution of differences, we add the variances of the two sampling distributions.
  • Variance of the sampling distribution of differences = $\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}$
• To find the standard error of the sampling distribution of differences, we take the square root of the variance.
  • SE of the sampling distribution of differences = $\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$
• Therefore, substitute this SE into the previous equation for t (see page 409, Best and Kahn).
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}$$
• This equation works only when the sample sizes are equal.
• Sometimes we want to compare two groups that contain different numbers of participants; then the above equation is not appropriate.
• Instead, the pooled variance estimate t-test is used.
The pooled variance estimate for two samples.
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Pooled Standard Deviation for Two Samples
The pooled standard deviation for two samples is computed as
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
where $n_1$ and $n_2$ are the sizes of the two samples and $s_1^2$ and $s_2^2$ are the variances of the two samples, respectively. Here $s_p$ is an estimator of $\sigma$.
Estimator of the Standard Deviation of $\bar{x}_1 - \bar{x}_2$
The estimator of the standard deviation of $\bar{x}_1 - \bar{x}_2$ is
$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
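The pooled formulas translate directly into code. The sketch below uses illustrative function names and hypothetical summary statistics. It also checks the earlier remark about sample sizes: when n1 = n2, the pooled standard error equals the unpooled form, and when the sizes differ the two disagree.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation s_p of two samples."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def se_pooled(s1, n1, s2, n2):
    """Estimated standard deviation of (mean1 - mean2), based on s_p."""
    return pooled_sd(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)

def se_unpooled(s1, n1, s2, n2):
    """Square root of s1^2/n1 + s2^2/n2 (the variance-sum-law form)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical (s1, n1, s2, n2) values, chosen only for illustration.
equal_n = (2.0, 10, 3.0, 10)
unequal_n = (2.0, 10, 3.0, 25)

print(se_pooled(*equal_n), se_unpooled(*equal_n))      # agree when n1 == n2
print(se_pooled(*unequal_n), se_unpooled(*unequal_n))  # differ when n1 != n2
```

Pooling weights each sample variance by its degrees of freedom, so the larger sample has more influence on the estimate.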
Degrees of freedom (df) for
independent samples t-test
The degrees of freedom for the independent-samples t-test are the sum of the two sample sizes, each reduced by one.
df = n1 + n2 – 2
or
df = ( n1 -1) + (n2 – 1)
or
df = df1 + df2
Example
A sample of 14 cans of Brand I diet soda gave the mean
number of calories of 23 per can with a standard
deviation of 3 calories. Another sample of 16 cans of
Brand II diet soda gave the mean number of calories of
25 per can with a standard deviation of 4 calories.
Question
At the 1% significance level, can you conclude that
the mean number of calories per can are different for
these two brands of diet soda? Assume that the
calories per can of diet soda are normally distributed
for each of the two brands and that the standard
deviations for the two populations are equal.
Solution
 Step 1:
 H0: μ1 – μ2 = 0 (The mean numbers of calories are not
different.)
 H1: μ1 – μ2 ≠ 0 (The mean numbers of calories are
different.)
Solution
 Step 2:
 The two samples are independent
 σ1 and σ2 are unknown but equal
 The sample sizes are small but both populations are
normally distributed
 We will use the t distribution
Solution
• Step 3:
  • The ≠ sign in the alternative hypothesis indicates that the test is two-tailed.
  • α = .01
  • Area in each tail = α / 2 = .01 / 2 = .005
  • df = n1 + n2 – 2 = 14 + 16 – 2 = 28
  • Critical values of t are -2.763 and 2.763 (page 483, Best and Kahn).
Draw figure
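Solution
Step 4:
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(14 - 1)(3)^2 + (16 - 1)(4)^2}{14 + 16 - 2}} = 3.57071421$$
$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (3.57071421)\sqrt{\frac{1}{14} + \frac{1}{16}} = 1.30674760$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(23 - 25) - 0}{1.30674760} = -1.531$$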
Solution
• Step 5:
  • The value of the test statistic t = -1.531
  • It falls in the nonrejection region (between -2.763 and 2.763)
  • Therefore, we fail to reject the null hypothesis
  • Consequently, we cannot conclude that the mean numbers of calories per can differ for the two brands of diet soda.
Example 2
A sample of 40 children from New York State showed
that the mean time they spend watching television is
28.50 hours per week with a standard deviation of 4
hours. Another sample of 35 children from California
showed that the mean time spent by them watching
television is 23.25 hours per week with a standard
deviation of 5 hours.
Question
Using a 2.5% significance level, can you conclude that
the mean time spent watching television by children in
New York State is greater than that for children in
California? Assume that the standard deviations for the
two populations are equal.
Example Solution
 Step 1:
 H0: μ1 – μ2 = 0
 H1: μ1 – μ2 > 0
 Step 2:
 The two samples are independent
 Standard deviations of the two populations are
unknown but assumed to be equal
 Both samples are large
 We use the t distribution to make the test
 Step 3:
 α = .025
 Area in the right tail of the t distribution = α = .025
 df = n1 + n2 – 2 = 40 + 35 – 2 = 73
 Critical value of t is 1.993
Figure 10.4
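Step 4:
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(40 - 1)(4)^2 + (35 - 1)(5)^2}{40 + 35 - 2}} = 4.49352655$$
$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (4.49352655)\sqrt{\frac{1}{40} + \frac{1}{35}} = 1.04004930$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(28.50 - 23.25) - 0}{1.04004930} = 5.048$$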
 Step 5:
 The value of the test statistic t = 5.048
 It falls in the rejection region
 Therefore, we reject the null hypothesis H0
 Hence, we conclude that children in New York State
spend more time, on average, watching TV than
children in California.
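Both worked examples follow the same steps, so they can be checked with one short script. This is a sketch with illustrative function names; it takes the summary statistics and the critical values from the two examples above and applies the critical-value decision rule.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df) for the equal-variance t test of H0: mu1 - mu2 = 0."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se, df

def reject_h0(t, critical, tail):
    """Critical-value rule; `critical` is the positive table value."""
    if tail == "two":
        return abs(t) > critical
    if tail == "right":
        return t > critical
    return t < -critical  # left-tailed

# Diet soda example: two-tailed, alpha = .01, critical t = 2.763
t_soda, df_soda = pooled_t(23, 3, 14, 25, 4, 16)
# Television example: right-tailed, alpha = .025, critical t = 1.993
t_tv, df_tv = pooled_t(28.50, 4, 40, 23.25, 5, 35)

print(round(t_soda, 3), df_soda, reject_h0(t_soda, 2.763, "two"))   # -1.531 28 False
print(round(t_tv, 3), df_tv, reject_h0(t_tv, 1.993, "right"))       # 5.048 73 True
```

The only difference between the two decisions is the tail: the two-tailed test compares |t| with the critical value, while the right-tailed test compares t itself.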
Using EXCEL to calculate t

http://www.ehow.com/video_4983079_use-excels-ttest-function.html?sms_ss=gmail&at_xt=4cc6188f5805f4b4,0
Output from the independent
sample t-test
• SPSS gives the output for the independent t-test in 2 tables. Let us assume that we are assessing your statistics anxiety: from the population I have 2 groups, each with a sample size of 12. Group one is exposed to an innovative teaching strategy and group two to the traditional statistics lecture. The group exposed to the innovative strategy had a mean anxiety of 5.50 with a standard deviation of 2.78; the SE of the group mean is .802. The group exposed to the traditional lecture had a mean anxiety level of 5.58, with a standard deviation of 2.503 and an SE of .723.
SPSS Output (Table 1)
Anxiety
Teaching strategy    N     Mean    Std. Deviation    Std. Error Mean
Innovative           12    5.50    2.77980           .80246
Lecture              12    5.58    2.50303           .72256
Table 2
Anxiety: Levene’s test for equality of variances
                               F       Sig.
Equal variances assumed        .201    .659

Anxiety: t-test for equality of means
                               t       df        Sig. (2-tailed)   Mean difference   Std. error difference   95% CI lower   95% CI upper
Equal variances assumed        -.077   22        .939              -.083             1.079                   -2.322         2.156
Equal variances not assumed    -.077   21.762    .939              -.083             1.079                   -2.324         2.158
Explanation
• Notice there is more information in the second table.
• The top row labels the statistics computed; below each label are the values calculated by SPSS.
• The first column is divided into two rows.
  • Row 1: equal variances assumed
  • Row 2: equal variances not assumed.
• Remember: one of the assumptions of the t-test is equal variances. When this is violated, we have the option of using a more conservative estimate.
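The "equal variances not assumed" row is commonly computed with Welch's version of the t test, where the degrees of freedom come from the Welch-Satterthwaite approximation. The sketch below assumes that is what SPSS reports here and checks it against the summary values in the output above; the function name is illustrative, and the mean difference -.0833 is the value SPSS prints.

```python
import math

def welch_from_summary(mean_diff, s1, n1, s2, n2):
    """Return (t, df, se) without assuming equal population variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # variance of each sample mean
    se = math.sqrt(v1 + v2)                  # variance sum law
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return mean_diff / se, df, se

t_w, df_w, se_w = welch_from_summary(-0.0833, 2.77980, 12, 2.50303, 12)
print(f"t = {t_w:.3f}, df = {df_w:.3f}, SE = {se_w:.4f}")
```

The approximate df (about 21.76) is never larger than n1 + n2 – 2 = 22, which is why this row is described as the more conservative option.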
Levene’s Test for Equality of Variances
• F
  • Levene’s test of homogeneity of variances computes a statistic called F. For our data F = .201
• Sig:
  • In this column, SPSS reports the significance of Levene’s F. If the significance level is ≤ .05, then we conclude that the variances of the two groups differ significantly. The significance (p value) associated with Levene’s F is .659. Since .659 is greater than .05, the difference between the variances is not significant. Thus we do not have evidence that we have violated the assumption of equal variances.
T-test for Equality of means
• Because Levene’s test was not significant, we use the top row of the output labeled “equal variances assumed”.
• t
  • The t value is -.077
• df
  • There are 22 degrees of freedom
• Sig (2-tailed)
  • The p value associated with t is .939. Since we use α = .05 and .939 is greater than .05, the difference in statistics anxiety between students exposed to the innovative teaching strategy and those exposed to the traditional lecture is not significant.
• Mean Difference
  • The difference in mean anxiety between the samples is -.0833
• Std. Error Difference:
  • The standard error of the mean difference: $s_{\bar{x}_1 - \bar{x}_2} = 1.0798$
95% Confidence Interval of the difference
• Another way of determining whether there is a significant difference between the two means is to compute a confidence band around the observed mean difference. If 0 falls within the band, we do not have a significant difference; if the band does not include 0, the difference is significant.
• Lower
  • The lower point of the band is -2.322
• Upper:
  • The upper point of the band is 2.156
• Since 0 falls within the 95% confidence band, the difference between the two samples’ statistics anxiety levels is not significant.
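The band can be reproduced from three numbers in the lecture: the mean difference, its standard error, and the two-tailed critical t for df = 22 at α = .05 (±2.074). A minimal sketch, assuming the usual form "difference ± critical t × standard error":

```python
mean_diff = -0.0833   # Mean Difference from the SPSS output
se_diff = 1.0798      # Std. Error Difference from the SPSS output
t_crit = 2.074        # two-tailed critical t for df = 22, alpha = .05

lower = mean_diff - t_crit * se_diff
upper = mean_diff + t_crit * se_diff
contains_zero = lower <= 0 <= upper

print(f"95% CI: ({lower:.3f}, {upper:.3f}); contains 0: {contains_zero}")
```

Because the inputs are rounded, the limits can differ from the SPSS values in the third decimal place, but the conclusion is the same: 0 lies inside the band.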
What do the results say?
• Based on our two samples, t = -.077. Our calculated t does not exceed the critical value of t = ±2.074. Thus we fail to reject the null hypothesis. There is no significant difference in statistics anxiety between students exposed to traditional lecture methods and those exposed to innovative strategies.
• If the results indicated a significant difference, we would compute an effect size using Cohen’s d.
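For reference, one common definition of Cohen's d divides the mean difference by the pooled standard deviation. The lecture does not say which variant is intended, so treat the formula choice and the function name below as assumptions. The inputs are the anxiety example's summary statistics, used only to show the arithmetic.

```python
import math

def cohens_d(mean_diff, s1, n1, s2, n2):
    """Cohen's d as mean difference divided by the pooled standard deviation."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return mean_diff / sp

d = cohens_d(-0.0833, 2.77980, 12, 2.50303, 12)
print(f"d = {d:.4f}")
```

Here d is very close to zero, which is consistent with the nonsignificant t: the two group means differ by only a small fraction of a standard deviation.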
Presenting results using APA format
You then present your results:
Students who were exposed to the innovative strategy had a mean statistics anxiety level of 5.50 (s = 2.78), while that of students exposed to traditional lectures was 5.58 (s = 2.50). Statistics anxiety levels did not differ significantly between students exposed to traditional teaching methods and those exposed to innovative methods within this study sample, t(22) = -.077, p > .05.
Or: No significant difference was found between students exposed to traditional teaching methods and those exposed to an innovative strategy, t(22) = -.077, p = .939.
In-class Exercise
Best and Kahn
Page 444 nos. 8
and 9