One-tailed tests: based on a unidirectional hypothesis
Example: Effect of training on problems using PowerPoint (PP)
Population figures for usability of PP are known
Hypothesis: Training will decrease the number of problems with PP
Two-tailed tests: based on a bidirectional hypothesis
Hypothesis: Training will change the number of problems with PP
[Figure: Sampling Distribution, population for usability of PowerPoint. x-axis: Mean Usability Index. Mean = 5.65, Std. Dev = .45, N = 10000]
Identify region
Unidirectional hypothesis: .05 level
Bidirectional hypothesis: .05 level
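The rejection regions for the two kinds of hypothesis can be sketched numerically. This is a minimal illustration, assuming the figure's values (mean 5.65, standard deviation .45) describe a normal sampling distribution and that the unidirectional hypothesis points to the upper tail; both are assumptions made for the example, not statements from the slides.

```python
from statistics import NormalDist

# Values read off the figure; treating .45 as the spread of a normal
# sampling distribution is an assumption made for this illustration.
MEAN = 5.65
SD = 0.45
ALPHA = 0.05

dist = NormalDist(mu=MEAN, sigma=SD)

# Unidirectional: all of alpha sits in one tail (upper tail assumed here).
one_tailed_cutoff = dist.inv_cdf(1 - ALPHA)

# Bidirectional: alpha is split evenly between the two tails.
lower_cutoff = dist.inv_cdf(ALPHA / 2)
upper_cutoff = dist.inv_cdf(1 - ALPHA / 2)

print(f"one-tailed: reject if mean > {one_tailed_cutoff:.2f}")
print(f"two-tailed: reject if mean < {lower_cutoff:.2f} or mean > {upper_cutoff:.2f}")
```

The two-tailed upper cutoff lies further out than the one-tailed cutoff because only half of the .05 is available in each tail.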
• What does it mean if our significance level is .05?
For a uni-directional hypothesis
For a bi-directional hypothesis
PowerPoint example:
• Unidirectional
If we set the significance level at .05,
• 5% of the time we will find a higher mean by chance, even when training has no effect
• The other 95% of the time, chance alone will not produce a mean that high
• Bidirectional
If we set the significance level at .05,
• 2.5% of the time we will find a higher mean by chance
• 2.5% of the time we will find a lower mean by chance
• The other 95% of the time, chance alone will not produce a difference that large
• What happens if we relax our significance level from .01 to .05
Probability of finding differences that don't exist goes up (criterion becomes more lenient)
• What happens if we tighten our significance level from .01 to .001
Probability of not finding differences that exist goes up (criterion becomes more conservative)
• PowerPoint example:
If we set the significance level at .05,
• 5% of the time we will find a difference by chance, even when none exists
• The other 95% of the time, chance alone will not produce a difference that large
If we set the significance level at .01,
• 1% of the time we will find a difference by chance, even when none exists
• The other 99% of the time, chance alone will not produce a difference that large
• For usability, if you set out to find problems, lenient criteria might work better (you will identify more problems)
• Effect of relaxing the significance level from .01 to .05
Probability of finding differences that don't exist goes up (criterion becomes more lenient)
Finding a difference that doesn't exist is called a Type I error (its probability is alpha)
• Effect of tightening the significance level from .01 to .001
Probability of not finding differences that exist goes up (criterion becomes more conservative)
Missing a difference that exists is called a Type II error (its probability is beta)
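A small simulation can make the link between the significance level and Type I errors concrete. This is a sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 30 and standard deviation of 4, the test is a two-tailed z test with known standard deviation, and the function name `false_positive_rate` is made up for the example.

```python
import random
from statistics import NormalDist

def false_positive_rate(alpha, n_experiments=10000, n=9, mu=30.0, sigma=4.0, seed=1):
    """Share of experiments that reject H0 (two-tailed z test) although H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical z for this alpha
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(n_experiments):
        # Every sample comes from the null population, so any rejection is a Type I error.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_experiments

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha={alpha}: observed Type I error rate = {false_positive_rate(alpha):.4f}")
```

The observed rate should sit close to the chosen alpha, which is why a more lenient level produces more differences that don't exist.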
• Degrees of freedom: the number of independent pieces of information remaining after estimating one or more parameters
• Example: List = 1, 2, 3, 4; Average = 2.5
• For the average to remain the same, three of the numbers can be anything you want; the fourth is fixed
• New List = 1, 5, 2.5, __; Average = 2.5
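The fixed fourth value in the example above can be worked out directly. A minimal sketch; the helper name `last_value` is made up for the illustration.

```python
def last_value(free_values, target_mean):
    """Return the one value that is forced once the mean and the other values are set."""
    n = len(free_values) + 1              # the free values plus the one fixed value
    return target_mean * n - sum(free_values)

# Three values chosen freely; the mean must stay at 2.5.
print(last_value([1, 5, 2.5], 2.5))       # 1.5
```

With four numbers and one estimated parameter (the mean), only three values are free to vary, so df = 4 - 1 = 3.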
• t tests: are differences significant?
• One-sample t test: comparing one sample mean to a known population mean
• Within-subjects test: comparing the mean in condition 1 to the mean in condition 2 for the same participants
• Between-subjects test: comparing the mean in condition 1 to the mean in condition 2 for different groups of participants
• Does training lead to fewer problems with PP?
• 9 subjects were trained on the use of PP.
• They then designed a presentation with PP.
The number of problems they had was the dependent variable (DV)
Number of problems per subject: 26, 32, 27, 21, 21, 24, 21, 25, 18
• Mean = 23.89
• SD = 4.20
• Results
Mean number of problems = 23.89
• Assume we know that without training the mean would be 30, but not the standard deviation
Population mean = 30
• Is 23.89 enough smaller than 30 to conclude that training affected results?
• We need to know what kinds of sample means to expect if training has no effect.
i.e., what kinds of means to expect if μ = 30
This is the sampling distribution of the mean.
• The sampling distribution of the mean depends on
Mean of sampled population
St. dev. of sampled population
Size of sample
[Figure: Sampling Distribution, number of problems with PowerPoint use. x-axis: Mean Number of problems. Mean = 5.65, Std. Dev = .45, N = 10000]
• Shape of the sampled population
Approaches normal
Rate of approach depends on sample size
Also depends on the shape of the population distribution
• Given a population with mean = μ and standard deviation = σ, the sampling distribution of the mean (the distribution of sample means) has a mean = μ and a standard deviation = σ/√n.
• The distribution approaches normal as the sample size, n, increases.
• Let population be very skewed
• Draw samples of 3 and calculate means
• Draw samples of 10 and calculate means
• Plot means
• Note changes in means, standard deviations, and shapes
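The demonstration described in the steps above can be sketched as a short simulation. The skewed population here is generated from a gamma distribution with shape 1.5 and scale 2 (mean 3, standard deviation about 2.45); that choice is an assumption for illustration, since the slides do not say how their population was produced.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)

# Skewed population (assumed gamma, shape 1.5, scale 2: mean 3, SD about 2.45).
population = [rng.gammavariate(1.5, 2.0) for _ in range(10000)]
pop_sd = stdev(population)

def sample_means(n, draws=10000):
    """Draw `draws` random samples of size n (with replacement) and return their means."""
    return [sum(rng.choices(population, k=n)) / n for _ in range(draws)]

for n in (3, 10):
    means = sample_means(n)
    print(f"n={n}: mean of means={mean(means):.2f}, "
          f"SD of means={stdev(means):.2f}, "
          f"predicted SD={pop_sd / n ** 0.5:.2f}")
```

The mean of the sample means stays near the population mean, while their standard deviation shrinks roughly in line with the population SD divided by the square root of n.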
[Figure: Skewed Population. x-axis: X (0 to 20). Mean = 3.0, Std. Dev = 2.43, N = 10000]
[Figure: Sampling Distribution, sample size n = 3. x-axis: Sample Mean. Mean = 2.99, Std. Dev = 1.40, N = 10000]
[Figure: Sampling Distribution, sample size n = 10. x-axis: Sample Mean. Mean = 2.99, Std. Dev = .77, N = 10000]
• Means have stayed at 3.00 throughout, except for minor sampling error
• Standard deviations have decreased appropriately
• Shapes have become more normal (see superimposed normal distribution for reference)
• Assume the mean of the population is known, but the standard deviation (SD) is not
• Substitute the sample SD for the population SD (in the standard error)
• Gives you the t statistic
• Compare the value of t to tabled values, which show the critical t
• Get the mean difference between the sample mean and the population mean
• Use the sample SD as the variance metric = 4.20
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{23.89 - 30}{4.20 / \sqrt{9}} = \frac{-6.11}{1.40} \approx -4.4$$
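The one-sample t statistic can be recomputed from the nine scores listed earlier. A minimal sketch using only the standard library; the variable names are chosen for the illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Number of problems for the 9 trained subjects (from the slides).
problems = [26, 32, 27, 21, 21, 24, 21, 25, 18]
MU_0 = 30                                  # population mean without training

n = len(problems)
sample_mean = mean(problems)
sample_sd = stdev(problems)                # sample SD, n - 1 in the denominator
standard_error = sample_sd / sqrt(n)       # estimated SD of the sampling distribution
t = (sample_mean - MU_0) / standard_error
df = n - 1

print(f"mean={sample_mean:.2f}, SD={sample_sd:.2f}, SE={standard_error:.2f}")
print(f"t({df}) = {t:.2f}")
```

The sign of t is negative because the trained group had fewer problems than the population mean; the size of t is what gets compared with the critical value.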
• Skewness of sampling distribution of variance decreases as n increases
• t will differ from z less as sample size increases
• Therefore need to adjust t accordingly
• df = n - 1
• t based on df
Looking up critical t (Table E.6)
Two-Tailed Significance Level
df    .10     .05     .02     .01
4     2.132   2.776   3.747   4.604
5     2.015   2.571   3.365   4.032
6     1.943   2.447   3.143   3.707
7     1.895   2.365   2.998   3.499
8     1.860   2.306   2.896   3.355
9     1.833   2.262   2.821   3.250
• Critical t: n = 9, so df = 8; t.05 = 2.306 (two-tailed significance)
• If |t| > 2.306, reject H0
• Conclude that training leads to fewer problems
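The lookup-and-compare step can be sketched as a small function. The dictionary holds standard two-tailed .05 critical values of t for df 4 to 9; the function name `reject_null` and the t values passed in are illustrative assumptions.

```python
# Two-tailed critical t at the .05 level for small df (standard t-table values).
CRITICAL_T_05 = {4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}

def reject_null(t, n):
    """Return True if |t| exceeds the two-tailed .05 critical value for df = n - 1."""
    df = n - 1
    return abs(t) > CRITICAL_T_05[df]

# Illustrative inputs: a large negative t and a small positive t, both with n = 9.
print(reject_null(-4.37, 9))   # True
print(reject_null(1.50, 9))    # False
```

Taking the absolute value of t is what makes the comparison two-tailed: a large difference in either direction leads to rejection.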
Factors affecting the value of t and the decision:
• Difference between sample and population means
• Magnitude of sample variance
• Sample size
• Significance level α
• One-tailed versus two-tailed test