F, t, and p
Basic Statistics for
Computer Scientists
(aka knowing enough to be critical of user studies)
April 4, 2002
Benjamin Lok
User Studies
• Trying to identify phenomena or trends
– Hypothesis
• Blood pressure increases with age and weight
• Smoking increases risk of cancer
• Real objects in VEs improve performance
– How might we investigate this?
Variables and Conditions
• Hypothesis: Real objects in VEs improve
performance
• Independent Variable – the variable that is being
manipulated by the experimenter (VE type)
• Dependent Variable – the variable that is measured and is
expected to change in response to the independent variable
(performance)
• Experimental conditions – the levels of the independent
variable under which participants are tested (e.g., real
objects vs. virtual objects)
Descriptive Statistics
• Hypothesis: Real objects in VEs improve
performance
• null hypothesis - assume real objects in VEs are
the SAME as virtual objects in VEs
• Innocent until proven guilty
• Your job: Prove otherwise!
• alternate hypothesis – interacting with real
objects is better than interacting with virtual
objects
Raw Data
• What does the mean tell us? Is that
enough?
Variances
• standard deviation – measure of dispersion (square root of the
sum of squared deviations from the mean, divided by N)
                           Small Pattern (seconds)          Large Pattern (seconds)
Condition                  Mean    S.D.    Min     Max      Mean     S.D.    Min     Max
Real Space (n=41)          16.81   6.34    8.77    47.37    37.24    8.99    23.90   57.20
Purely Virtual (n=13)      47.24   10.43   33.85   73.55    116.99   32.25   70.20   192.20
Hybrid (n=13)              31.68   5.65    20.20   39.25    86.83    26.80   56.65   153.85
Vis Faith Hybrid (n=14)    28.88   7.64    20.20   46.00    72.31    16.41   51.60   104.50
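As a rough illustration of the standard deviation definition above, the sketch below computes a mean plus both the population (divide by N) and sample (divide by N - 1) standard deviations. The completion times are made-up values chosen for illustration; they are not data from the study.

```python
import math
import statistics

# Hypothetical completion times in seconds (NOT from the study).
times = [12.0, 15.0, 18.0, 21.0, 24.0]

mean = sum(times) / len(times)

# Sum of squared deviations from the mean.
sum_sq = sum((t - mean) ** 2 for t in times)

# Dividing by N gives the population S.D. (the definition on the slide).
population_sd = math.sqrt(sum_sq / len(times))

# Dividing by N - 1 gives the sample S.D., the usual choice for study data.
sample_sd = math.sqrt(sum_sq / (len(times) - 1))

# The standard library agrees with the hand computation.
assert math.isclose(population_sd, statistics.pstdev(times))
assert math.isclose(sample_sd, statistics.stdev(times))

print(f"mean={mean:.2f} population_sd={population_sd:.2f} sample_sd={sample_sd:.2f}")
```

Two conditions can share a mean and still have very different spreads, which is why the mean alone is not enough.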
Hypothesis
• We assumed the means are “equal”
• But are they? Or is the difference due to
chance?
T - test
• T – test – statistical test used to determine
whether two observed means are
statistically different
T – test
• (rule of thumb) Good values: |t| > 1.96, which corresponds to
p < 0.05 (two-tailed) for large samples
• Look at what contributes to t
• http://trochim.human.cornell.edu/kb/stat_t.htm
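To see what contributes to t, here is a minimal sketch of a two-sample t-test with unequal variances, computed from summary statistics alone. The function names `welch_t` and `two_sided_p_normal` are made up for this example. The comparison shown (Purely Virtual vs. Real Space, Small Pattern) uses means, S.D.s, and sample sizes from the Raw Data table, but it is picked only for illustration and is not one of the comparisons the slides report. The p value helper assumes a large-sample normal approximation, so it is only a rough guide for small groups.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    # Squared standard error contributed by each group.
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    # t = difference of means / standard error of that difference.
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def two_sided_p_normal(t):
    """Two-sided p value under a large-sample normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2))

# Small Pattern: Purely Virtual (n=13) vs. Real Space (n=41).
t, df = welch_t(47.24, 10.43, 13, 16.81, 6.34, 41)
print(f"t={t:.2f} df={df:.1f}")

# The rule of thumb: |t| = 1.96 corresponds to p of about 0.05.
print(f"p at t=1.96: {two_sided_p_normal(1.96):.3f}")
```

t grows when the means are farther apart, when the standard deviations are smaller, and when the groups are larger.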
F statistic, p values
• F statistic – assesses the extent to which the means
of the experimental conditions differ more than
would be expected by chance
• t is related to the F statistic (with two conditions and equal
variances, F = t²)
• Look up a table, get the p value. Compare to α
• α value – probability of making a Type I error
(rejecting the null hypothesis when it is really true)
• p value – probability of observing a pattern of data at least
as extreme as the one seen, assuming the null hypothesis is
true; calculated from the sampling distribution of the
statistic. (Loosely: how likely a difference this large would
be from chance alone)
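The F statistic can be sketched from summary statistics as the ratio of between-condition to within-condition variability. The code below is an illustration under the classical equal-variance ANOVA assumption, which differs from the unequal-variance t-tests used elsewhere in these slides. The function names `anova_f` and `pooled_t` are made up for this example, and the slides do not report an F value, so treat the output as a demonstration of the formula, not a study result.

```python
import math

def anova_f(groups):
    """One-way ANOVA F from (mean, sd, n) summaries of each condition."""
    k = len(groups)
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / total_n
    # Between-groups mean square: spread of condition means around the grand mean.
    ms_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups) / (k - 1)
    # Within-groups mean square: pooled variance inside the conditions.
    ms_within = sum((n - 1) * sd ** 2 for _, sd, n in groups) / (total_n - k)
    return ms_between / ms_within

def pooled_t(g1, g2):
    """Equal-variance (pooled) two-sample t from (mean, sd, n) summaries."""
    (m1, s1, n1), (m2, s2, n2) = g1, g2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Small Pattern summaries (mean, S.D., n) from the Raw Data table.
real_space = (16.81, 6.34, 41)
purely_virtual = (47.24, 10.43, 13)
hybrid = (31.68, 5.65, 13)
vis_faith_hybrid = (28.88, 7.64, 14)

f_small = anova_f([real_space, purely_virtual, hybrid, vis_faith_hybrid])
print(f"F across the four conditions: {f_small:.1f}")

# With only two conditions, F equals the square of the pooled t.
f_two = anova_f([real_space, purely_virtual])
t_two = pooled_t(real_space, purely_virtual)
assert math.isclose(f_two, t_two ** 2)
```

An F near 1 means the condition means differ about as much as chance alone would produce; larger values point to a real difference between conditions.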
Let’s look at data
                            Small Pattern                        Large Pattern
                            t-test                               t-test
Comparison                  (unequal variance)   p-value         (unequal variance)   p-value
PVE – RSE vs. VFHE – RSE    3.32                 0.0026**        4.39                 0.00016***
PVE – RSE vs. HE – RSE      2.81                 0.0094**        2.45                 0.021*
VFHE – RSE vs. HE – RSE     1.02                 0.32            2.01                 0.055+
Total Sense of Presence Score
Scale from 0..6

Condition                      Mean    S.D.    Min    Max
Purely VE                      3.21    2.19    0      6
Hybrid VE                      1.86    2.17    0      6
Visually Faithful Hybrid VE    2.36    1.94    0      6

Between Groups: Total Sense of Presence
Comparison     t-test (unequal variance)    p-value
PVE – VFHE     1.10                         0.28
PVE – HE       1.64                         0.11
VFHE – HE      0.64                         0.53
Significance
• What does it mean to be significant?
• You have some confidence it was not due to
chance.
• But statistical significance is not the same as
meaningful (practical) significance
• Always know:
– samples (n)
– p value
– variance/standard deviation
– means
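One way to see the gap between statistical and meaningful significance is to hold a small difference fixed and grow the sample. The sketch below uses entirely hypothetical numbers (a 0.5 second difference in means, an S.D. of 10 seconds, equal group sizes) and a made-up helper name, `t_equal_groups`.

```python
import math

def t_equal_groups(diff, sd, n):
    """t for two groups of size n that share the same S.D."""
    return diff / (sd * math.sqrt(2 / n))

# Hypothetical: a 0.5 s difference in means with an S.D. of 10 s.
for n in (10, 100, 10_000):
    t = t_equal_groups(0.5, 10.0, n)
    verdict = "significant" if abs(t) > 1.96 else "not significant"
    print(f"n={n:>6} t={t:.2f} {verdict} by the |t| > 1.96 rule of thumb")
```

The difference never changes, yet it crosses the threshold once n is large enough, which is why n, the means, and the variance should be read alongside the p value.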