Instructor: Mr. Chu Duc Nghia
Group members:
Duong Thi Chi
Pham Thi Hoa
Pham Thi Mai
Nguyen Thi Van
Problem: Susan predicts that students will learn most effectively with a constant background sound, as opposed to an unpredictable sound or no sound at all. She randomly divides twenty-four students into three groups of eight. All students study a passage of text for 30 minutes. Those in group 1 study with background sound at a constant volume. Those in group 2 study with noise that changes volume periodically. Those in group 3 study with no sound at all. After studying, all students take a 10-point multiple-choice test on the material.
Significance level: α = 0.05
ANOVA table:
Source    SS       df    MS       F
Among     30.08    2     15.04    3.59
Within    87.88    21    4.18

Scores and sample means:
Group                 Scores                    Sample mean
Constant sound (1)    7, 4, 6, 8, 6, 6, 2, 9    6
Random sound (2)      5, 5, 3, 4, 4, 7, 2, 2    4
No sound (3)          2, 4, 7, 1, 2, 1, 5, 5    3.375
Decision rule: reject $H_0$ if $F > F_{\alpha,\,k-1,\,n-k} = F_{0.05,\,2,\,21} = 3.4668$
Decision: reject $H_0$, as $F = 3.59 > F_{0.05,\,2,\,21} = 3.4668$
Conclusion: a difference exists among the average scores of the 3 groups
=> background sound affects study results
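The ANOVA figures above can be checked with a short script. This is a minimal sketch using only the Python standard library; the scores and the critical value 3.4668 are taken from the slides, while the function name `one_way_anova` is our own choice for illustration.

```python
from statistics import mean

# Scores as listed on the slides
groups = {
    "constant": [7, 4, 6, 8, 6, 6, 2, 9],
    "random":   [5, 5, 3, 4, 4, 7, 2, 2],
    "none":     [2, 4, 7, 1, 2, 1, 5, 5],
}

def one_way_anova(samples):
    """Return (SS_among, SS_within, df_among, df_within, F)."""
    all_scores = [x for s in samples for x in s]
    grand = mean(all_scores)
    k, n = len(samples), len(all_scores)
    ss_among = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_among, df_within = k - 1, n - k
    f = (ss_among / df_among) / (ss_within / df_within)
    return ss_among, ss_within, df_among, df_within, f

ss_a, ss_w, df_a, df_w, f_stat = one_way_anova(list(groups.values()))
F_CRIT = 3.4668  # F(0.05; 2, 21), taken from the slides
print(f"SS among = {ss_a:.2f}, SS within = {ss_w:.2f}, F = {f_stat:.2f}")
print("reject H0" if f_stat > F_CRIT else "do not reject H0")
```

The critical value is hard-coded rather than computed, so the sketch stays free of third-party libraries.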
Why not run a t-test on every pair of means? Because the more means there are:
• The more t-tests we have to run
• The greater the Type I error (the probability of rejecting the null hypothesis when it is true)
=> Use the TUKEY TEST
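To see how quickly the Type I error grows, the sketch below counts the pairwise t-tests for k means and estimates the chance of at least one false rejection. It assumes the tests are independent, which is only an approximation because pairwise comparisons share groups; the function name `familywise_error` is hypothetical.

```python
from math import comb

def familywise_error(k, alpha=0.05):
    """Chance of at least one false rejection across all pairwise t-tests.

    Assumes the tests are independent, which is only an approximation
    for pairwise comparisons that share groups.
    """
    m = comb(k, 2)                 # number of pairwise t-tests
    return m, 1 - (1 - alpha) ** m

for k in (2, 3, 5, 10):
    m, fwer = familywise_error(k)
    print(f"k={k:2d} means -> {m:2d} t-tests, familywise error ~ {fwer:.3f}")
```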
Tukey test: a multiple comparison procedure and statistical test developed by John Tukey.
Characteristics:
• Compares all possible pairs of means to find which means are significantly different from one another
• Generally used in conjunction with an ANOVA
• Based on the studentized range distribution q
Identify the technique
Problem objective: detect differences between population means
Data type: quantitative
Experimental design: independent samples
Assumptions
• The observations being tested are independent
• The means are from normally distributed populations
• There is equal variance across the groups (homoscedasticity)
The studentized range distribution is built upon the formula:
$$q = \frac{\bar{x}_{max} - \bar{x}_{min}}{\sqrt{MSE / n}}$$
It is similar to the Student t-distribution, but the q-distribution takes into account the number of means under consideration: the more means under consideration, the larger the critical value of q.
How is it built?
We take random samples from the independent populations of interest.
We then identify the largest and the smallest of the sample means, calculate the difference between these two means, and compute q using the formula. Repeating the procedure many times gives many values of q. These values form the q-distribution.
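The construction just described can be imitated by simulation. The sketch below is illustrative only: it assumes every population is N(0, 1), so that all population means are equal, and it uses equal sample sizes. The function name `simulate_q` and the number of repetitions are our own choices.

```python
import random
from statistics import mean

def simulate_q(k, n, reps=2000, seed=0):
    """Simulate the studentized range q for k samples of size n.

    Assumption: every population is N(0, 1), so the null hypothesis
    of equal means holds.
    """
    rng = random.Random(seed)
    qs = []
    for _ in range(reps):
        samples = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
        means = [mean(s) for s in samples]
        # MSE = pooled within-sample variance, with n*k - k degrees of freedom
        sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
        mse = sse / (k * n - k)
        qs.append((max(means) - min(means)) / (mse / n) ** 0.5)
    return sorted(qs)

for k in (2, 3, 6):
    qs = simulate_q(k, n=8)
    print(f"k={k}: simulated 95th percentile of q ~ {qs[int(0.95 * len(qs))]:.2f}")
```

The simulated 95th percentile rises as k rises, which is the behaviour the slide describes: more means under consideration give a larger critical q.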
Step 1: arrange the means from the smallest to the largest and calculate the difference between each pair of means.
Step 2: calculate the critical value ω:
$$\omega = q_{\alpha,k,v}\sqrt{\frac{MSE}{n_g}}$$
k: number of samples
v: degrees of freedom associated with MSE (v = n - k)
α: significance level
$q_{\alpha,k,v}$: critical value of the studentized range (see the table on the next slide)
$n_g$: number of observations in each sample
* equal sample sizes: $n_g = n_1 = n_2 = \dots = n_k$
* unequal sample sizes: $n_g = \frac{2 n_1 n_2}{n_1 + n_2}$ (the harmonic mean of $n_1$ and $n_2$)
Step 3: compare each difference with ω. If a difference is larger than ω, that pair of means is significantly different.
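Step 2 can be sketched as two small helpers. The inputs below are hypothetical and chosen only for illustration; in practice q comes from a studentized range table. The unequal-size rule is assumed here to be the harmonic mean of the two sample sizes, and the names `harmonic_n` and `tukey_omega` are our own.

```python
from math import sqrt

def harmonic_n(n1, n2):
    """n_g for two samples of unequal size (harmonic mean, an assumption)."""
    return 2 * n1 * n2 / (n1 + n2)

def tukey_omega(q_crit, mse, n_g):
    """Critical difference: omega = q * sqrt(MSE / n_g)."""
    return q_crit * sqrt(mse / n_g)

# Hypothetical inputs, for illustration only
q_crit = 3.5   # would come from a studentized range table
mse = 4.0
print(tukey_omega(q_crit, mse, n_g=8))                   # equal sample sizes
print(tukey_omega(q_crit, mse, n_g=harmonic_n(6, 10)))   # unequal sample sizes
```

With unequal sizes the harmonic mean is pulled toward the smaller sample, so ω is larger than it would be with the arithmetic mean of the two sizes.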
The Tukey confidence limits
$$(\bar{x}_{larger} - \bar{x}_{smaller}) \pm q_{\alpha,k,v}\sqrt{\frac{MSE}{n_g}}$$
How to use the confidence intervals?
- Calculate the confidence interval for each pair of means.
- If the interval contains 0, conclude that the difference for that pair is not significantly different from 0.
- If the interval lies entirely on the negative or the positive side, a difference exists between that pair of means.
Problem objective: detect differences between population means
Data type: quantitative
Experimental design: independent samples
=> use the Tukey test, with the assumptions:
• The means (average scores of the students in each group) are from normally distributed populations
• There is equal variance across the groups (homoscedasticity)
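Step 1: means arranged from the smallest to the largest:
No sound (3)    Random S (2)    Const. S (1)
3.375           4               6

Differences between each pair of means:
                         No sound (3)    Random S (2)    Const. S (1)
                         3.375           4               6
No sound (3)    3.375    -               -               -
Random S (2)    4        0.625           -               -
Const. S (1)    6        2.625           2               -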
Step 2:
$$\omega = q_{0.05,\,3,\,24-3}\sqrt{\frac{MSE}{n_g}} = q_{0.05,\,3,\,21}\sqrt{\frac{4.18}{8}} = 2.5776$$
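Step 3: the difference between the constant sound group and the no sound group is significant because 2.625 > 2.5776. The other two differences (2 and 0.625) are smaller than ω, so those pairs are not significantly different.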
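The three steps for this example can be sketched in a few lines. The sample means and ω = 2.5776 are the values from the slides; the function name `tukey_compare` is hypothetical.

```python
from itertools import combinations

# Values taken from the slides
means = {"No sound (3)": 3.375, "Random sound (2)": 4.0, "Constant sound (1)": 6.0}
OMEGA = 2.5776

def tukey_compare(means, omega):
    """Return (smaller group, larger group, difference, significant?) per pair."""
    ordered = sorted(means.items(), key=lambda kv: kv[1])       # Step 1
    results = []
    for (name_lo, m_lo), (name_hi, m_hi) in combinations(ordered, 2):
        diff = m_hi - m_lo
        results.append((name_lo, name_hi, diff, diff > omega))  # Step 3
    return results

for lo, hi, diff, sig in tukey_compare(means, OMEGA):
    verdict = "significant" if sig else "not significant"
    print(f"{hi} vs {lo}: difference = {diff:.3f} -> {verdict}")
```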
Another solution to the example: using the Tukey confidence interval.
The 95% confidence intervals for the 3 pairs of means are:
$$0.0474 \le \mu_1 - \mu_3 \le 5.2026$$
$$-0.5776 \le \mu_1 - \mu_2 \le 4.5776$$
$$-1.9526 \le \mu_2 - \mu_3 \le 3.2026$$
The intervals for $\mu_1 - \mu_2$ and $\mu_2 - \mu_3$ contain 0, so these differences are not significantly different from zero. The interval for $\mu_1 - \mu_3$ lies entirely above 0, so the difference between the constant sound group and the no sound group is significant. This conclusion is consistent with the Tukey test.
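The intervals can be reproduced from the sample means and ω = 2.5776 given in the slides. This is a minimal sketch; the helper name `tukey_interval` is our own.

```python
# Values taken from the slides
means = {1: 6.0, 2: 4.0, 3: 3.375}   # 1 = constant, 2 = random, 3 = no sound
OMEGA = 2.5776

def tukey_interval(mean_a, mean_b, omega):
    """Confidence limits (larger - smaller) +/- omega for one pair of means."""
    diff = abs(mean_a - mean_b)
    return diff - omega, diff + omega

for a, b in [(1, 3), (1, 2), (2, 3)]:
    low, high = tukey_interval(means[a], means[b], OMEGA)
    contains_zero = low <= 0 <= high
    print(f"groups {a} and {b}: [{low:.4f}, {high:.4f}]",
          "not significant" if contains_zero else "significant")
```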
According to Cohen (1988), Power is “the probability of rejecting a null hypothesis when it is false — and therefore should be rejected.”
                 H0 is true                  H0 is false
Reject H0        Type I error = α            Correct decision = 1 - β = power
Not reject H0    Correct decision = 1 - α    Type II error = β
Example: Ho: beautiful girls are intelligent.
Ha: beautiful girls are not intelligent.
If beautiful girls are actually intelligent but we conclude that they are not, we make a Type I error.
If they are actually not intelligent but we conclude that they are, we commit a Type II error.
If they are actually not intelligent and we conclude that they are not, we have correctly rejected Ho; the probability of this outcome is the test's power.
[Figure: non-rejection region]
Role of power analysis: find the optimal sample size and compute the test's power, i.e. the probability that the test does not make a Type II error (important!)
A Priori Power Analysis
• Done before the research
• Aim: find the optimal sample size to ensure the test is powerful (power = 1 - β ≥ 0.8).
• Too large a sample size => waste of time, money, effort, etc.
• Too small a sample size => low power.
A Posteriori Power Analysis
• Done after the research
• Aim: compute the test's power.
Factors that affect the power of a test:
• Significance level (conventionally 0.05)
• Effect size
• Sample size
• Type of test (ANOVA, t-test, ...)
Sample size: a larger sample size => more information collected => the test is more powerful. But too large a sample size wastes time, money and other resources.
Statistical significance level (conventionally 0.05): the greater α is, the smaller β is, and the more powerful the test.
Effect size: the bigger the effect size, the more power the test has.
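These three relationships can be checked numerically. The sketch below uses a one-sided, one-sample z-test with known standard deviation, a deliberately simplified stand-in chosen because its power has a closed form; it is not the ANOVA used elsewhere in these slides. The function name `z_test_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test (known standard deviation).

    This simplified test is an assumption chosen for illustration;
    it is not the ANOVA used elsewhere in these slides.
    """
    z = NormalDist()                       # standard normal
    z_crit = z.inv_cdf(1 - alpha)          # rejection threshold under H0
    return 1 - z.cdf(z_crit - effect_size * sqrt(n))

print("n:     ", [round(z_test_power(0.5, n), 3) for n in (10, 20, 40)])
print("alpha: ", [round(z_test_power(0.5, 20, a), 3) for a in (0.01, 0.05, 0.10)])
print("effect:", [round(z_test_power(es, 20), 3) for es in (0.2, 0.5, 0.8)])
```

Each printed list should increase from left to right, matching the three statements above.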
EFFECT SIZE: measures how large a difference is, that is, its practical significance.
Generally, effect size is calculated by taking the difference between the two groups and dividing it by the standard deviation.
To interpret the resulting number, most social scientists use this general guide developed by Cohen:
▪ < 0.1 = trivial effect
▪ 0.1 - 0.3 = small effect
▪ 0.3 - 0.5 = moderate effect
▪ > 0.5 = large effect
Because effect size can only be calculated after the data are collected, you will have to use an estimate for the power analysis. How to estimate it?
• Literature review: use the effect size detected in similar past studies in the same field.
• Your own experience, reasoning and judgment.
• Neutral: use a value of 0.5, as it indicates a moderate to large difference.
EFFECT SIZE:
Effect size can be used for many types of tests; each test has a specific formula to calculate it.
For 2 means:
$$ES = \frac{\bar{x}_1 - \bar{x}_2}{s}$$
For ANOVA, with k the number of groups:
$$ES = \sqrt{\frac{\sum_i (\bar{x}_i - \bar{x})^2}{k \cdot MSE}}$$
Example:
Testing the effectiveness of two different teaching methods, A and B. Two random samples of students with the same prior study results were taken from two classes to participate in the test. After 1 month, group A had better scores than group B, measured by the mean scores of the two groups. Group A's mean is 10 points higher than group B's, with s = 30.
$$ES = \frac{\bar{x}_A - \bar{x}_B}{s} = \frac{10}{30} = 0.33$$
=> moderate effect.
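Both effect-size formulas can be sketched as small functions. Two assumptions are made here: the ANOVA formula is read as carrying a square root (Cohen's f) with equal group sizes, and the boundary values 0.3 and 0.5 of the guide are assigned to "moderate", since the guide does not say which side they belong to. All function names are hypothetical. The second call reuses the sample means and MSE from the background sound example purely as an illustration.

```python
from math import sqrt

def effect_size_two_means(mean_1, mean_2, s):
    """ES for two means: difference divided by the standard deviation."""
    return (mean_1 - mean_2) / s

def effect_size_anova(group_means, mse):
    """ES for ANOVA with equal group sizes: sqrt(sum((xi - x)^2) / (k * MSE))."""
    k = len(group_means)
    grand = sum(group_means) / k     # equals the grand mean when sizes are equal
    return sqrt(sum((m - grand) ** 2 for m in group_means) / (k * mse))

def cohen_label(es):
    """Label from the guide quoted in the slides (boundary handling assumed)."""
    es = abs(es)
    if es < 0.1:
        return "trivial"
    if es < 0.3:
        return "small"
    if es <= 0.5:
        return "moderate"
    return "large"

es_teaching = effect_size_two_means(10, 0, 30)   # group A is 10 points above B
es_sound = effect_size_anova([6, 4, 3.375], mse=4.18)
print(f"teaching methods: ES = {es_teaching:.2f} ({cohen_label(es_teaching)})")
print(f"background sound: ES = {es_sound:.2f} ({cohen_label(es_sound)})")
```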
Using the example from the Tukey test:
α = 0.05, medium ES, power = 0.8, ANOVA with 3 groups.
From the table on the next slide, the required sample size for each group is 52.
Tukey test:
• Helps detect which pairs of means differ while simultaneously controlling the Type I error α (rejecting Ho when it is true, a serious case)
• But it is conservative: power is lost when all pairwise differences are compared with a single critical value.
Power analysis
• Helps estimate the best sample size when conducting different kinds of tests
• Makes the test more meaningful, as it points out the effect size of each test
• Avoids the case in which researchers cannot reject Ho and arbitrarily conclude that Ho is true
http://137.148.49.106/offices/assessment/Assessment%20Reports%202006/CoS/Psychology%203%20of%203.pdf
http://pcbfaculty.ou.edu/classfiles/MGT%206973%20Seminar%20in%20Research%20Methods/MGT%206973%20Res%20Methods%20Spr%202006/Week-5%20Research%20Design%20and%20Primary%20Data%20Collection/Cohen%201992%20PB%20A%20power%20primer.pdf
http://www.cvgs.k12.va.us/DIGSTATS/main/Guides/g_tukey.html
http://www.epa.gov/bioiweb1/statprimer/power.html
http://www.faculty.sfasu.edu/cobledean/Biostatistics/Lecture6/MultipleComparisonTests.PDF
http://web.mst.edu/~psyworld/tukeyssteps.html
http://faculty.vassar.edu/lowry/ch14pt2.html
http://people.richland.edu/james/lecture/m170/ch13-1wy.html
http://faculty.vassar.edu/lowry/vsanova.html
http://www.statsoft.com/textbook/power-analysis/
http://math.yorku.ca/SCS/Online/power/