Review: The Logic Underlying ANOVA
• The possible pair-wise comparisons:

Sample 1   Sample 2   Sample 3
X11        X21        X31
X12        X22        X32
 .          .          .
 .          .          .
X1n        X2n        X3n
means:
X̄1         X̄2         X̄3
Review: The Logic Underlying ANOVA
• There are k samples with which to estimate population variance:

    \hat{\sigma}_1^2 = \frac{\sum_i (X_{i1} - \bar{X}_1)^2}{n - 1}

[Data table repeated from the previous slide.]
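As a quick illustration (toy data invented for this review, not from the slides), the sketch below computes one sample's variance estimate exactly as in the formula above and checks it against numpy's built-in estimator:

```python
import numpy as np

# Hypothetical equal-n samples, just to exercise the formula.
sample_1 = np.array([4.1, 5.2, 3.8, 4.9, 5.0])
sample_2 = np.array([5.5, 6.1, 5.8, 6.4, 5.9])
sample_3 = np.array([4.8, 5.0, 5.3, 4.7, 5.1])

# One sample's estimate of the population variance: squared deviations
# from that sample's own mean, divided by n - 1.
var_1 = np.sum((sample_1 - sample_1.mean()) ** 2) / (len(sample_1) - 1)

# np.var with ddof=1 computes the same quantity.
assert np.isclose(var_1, np.var(sample_1, ddof=1))
print(var_1, np.var(sample_2, ddof=1), np.var(sample_3, ddof=1))
```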
Review: The Logic Underlying ANOVA
• There are k samples with which to estimate population variance:

    \hat{\sigma}_2^2 = \frac{\sum_i (X_{i2} - \bar{X}_2)^2}{n - 1}

[Data table repeated from the first slide.]
Review: The Logic Underlying ANOVA
• There are k samples with which to estimate population variance:

    \hat{\sigma}_3^2 = \frac{\sum_i (X_{i3} - \bar{X}_3)^2}{n - 1}

[Data table repeated from the first slide.]
Review: The Logic Underlying ANOVA
• The average of these variance estimates is called the “Mean Square Error” or “Mean Square Within”:

    MS_{error} = \frac{\sum_{j=1}^{k} \hat{\sigma}_j^2}{k}
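Continuing with the same invented samples, a minimal sketch of MS_error: with equal sample sizes it is just the average of the k within-sample variance estimates:

```python
import numpy as np

# Same hypothetical equal-n samples as in the earlier sketch.
samples = [np.array([4.1, 5.2, 3.8, 4.9, 5.0]),
           np.array([5.5, 6.1, 5.8, 6.4, 5.9]),
           np.array([4.8, 5.0, 5.3, 4.7, 5.1])]

# MS_error: the mean of the k within-sample variance estimates.
ms_error = np.mean([np.var(s, ddof=1) for s in samples])
print("MS_error =", ms_error)
```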
Review: The Logic Underlying ANOVA
• There are k means with which to estimate the population variance:

    \hat{\sigma}^2 = n \hat{\sigma}_{\bar{X}}^2 = \frac{n \sum_j (\bar{X}_j - \bar{X}_{overall})^2}{k - 1}

[Data table repeated from the first slide.]
Review: The Logic Underlying ANOVA
• This estimate of population variance based on sample means is called Mean Square Effect or Mean Square Between:

    \hat{\sigma}^2 = n \hat{\sigma}_{\bar{X}}^2 = \frac{n \sum_j (\bar{X}_j - \bar{X}_{overall})^2}{k - 1}
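A matching sketch for the between-samples estimate (same invented data, equal sample sizes assumed): the variance of the k sample means, scaled up by n:

```python
import numpy as np

# Same hypothetical equal-n samples as in the earlier sketches.
samples = [np.array([4.1, 5.2, 3.8, 4.9, 5.0]),
           np.array([5.5, 6.1, 5.8, 6.4, 5.9]),
           np.array([4.8, 5.0, 5.3, 4.7, 5.1])]
n = len(samples[0])        # observations per sample
k = len(samples)           # number of samples

sample_means = np.array([s.mean() for s in samples])
overall_mean = sample_means.mean()   # equals the grand mean when n is equal

# MS_effect: variance of the sample means, scaled up by n.
ms_effect = n * np.sum((sample_means - overall_mean) ** 2) / (k - 1)
print("MS_effect =", ms_effect)
```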
The F Statistic
• MSerror is based on deviation scores
within each sample but…
• MSeffect is based on deviations between
samples
• MSeffect would overestimate the
population variance when there is some
effect of the treatment pushing the
means of the different samples apart
The F Statistic
• We compare MSeffect against MSerror by
constructing a statistic called F
The F Statistic
• F is the ratio of MSeffect to MSerror
Fk1,k(n1) 

MSeffect
MS error
The F Statistic
• If the null hypothesis: \mu_1 = \mu_2 = \mu_3 = \dots
  is true, then we would expect: \bar{X}_1 = \bar{X}_2 = \bar{X}_3 = \dots
  except for random sampling variation
The F Statistic
• F is the ratio of MSeffect to MSerror
Fk1,k(n1) 
MSeffect
MS error
• If the null hypothesis is true then F
 should equal 1.0
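Putting the pieces together, a sketch (same invented samples) that forms F as MS_effect / MS_error and cross-checks it against scipy.stats.f_oneway, which performs the same one-way ANOVA:

```python
import numpy as np
from scipy import stats

# Same hypothetical equal-n samples as in the earlier sketches.
samples = [np.array([4.1, 5.2, 3.8, 4.9, 5.0]),
           np.array([5.5, 6.1, 5.8, 6.4, 5.9]),
           np.array([4.8, 5.0, 5.3, 4.7, 5.1])]
n, k = len(samples[0]), len(samples)

ms_error = np.mean([np.var(s, ddof=1) for s in samples])
sample_means = np.array([s.mean() for s in samples])
ms_effect = n * np.sum((sample_means - sample_means.mean()) ** 2) / (k - 1)

f_manual = ms_effect / ms_error            # df = (k - 1, k(n - 1))
f_scipy, p_value = stats.f_oneway(*samples)
assert np.isclose(f_manual, f_scipy)       # the library computes the same ratio
print(f_manual, p_value)
```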
ANOVA is scalable
• You can create a single F for any number of samples (see the sketch below)
• It is also possible to examine more than one independent variable using a multi-way ANOVA
  – Factors are the independent variables
  – Levels are the categories within each factor
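As a sketch of the "single F for any number of samples" point: scipy.stats.f_oneway accepts however many groups you pass it and still returns one F and one p value (the five simulated samples below are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Five hypothetical samples of 10 observations each; one call, one F.
samples = [rng.normal(loc=50, scale=5, size=10) for _ in range(5)]
samples[2] = samples[2] + 6     # give the third sample a treatment effect

f_value, p_value = stats.f_oneway(*samples)
print(f"F = {f_value:.2f}, p = {p_value:.4f}")
```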
ANOVA is scalable
A two-way ANOVA:
[Design grid: 4 levels of Factor 1 crossed with 3 levels of Factor 2; each of the 12 cells contains observations X1 … Xn.]
Main Effects and Interactions
• There are two types of findings with multi-way
ANOVA: Main Effects and Interactions
– For example, a main effect of Factor 1 indicates that the means under the various levels of Factor 1 were different (at least one was different)
Main Effects and Interactions
[The two-way design grid repeated across four slides, each highlighting in turn the marginal mean for one level of Factor 1: X̄1, X̄2, X̄3, X̄4.]
Main Effects and Interactions
[Figure: "A main effect of Factor 1": dependent variable (means of each sample) plotted against the 4 levels of Factor 1, with separate lines for levels 1-3 of Factor 2.]
Main Effects and Interactions
• There are two types of findings with multi-way ANOVA: Main
Effects and Interactions
– For example, a main effect of Factor 1 indicates that the means under the various levels of Factor 1 were different (at least one was different)
– A main effect of Factor 2 indicates that the means under the various levels of Factor 2 were different
Main Effects and Interactions
[The two-way design grid repeated across three slides, each highlighting in turn the marginal mean for one level of Factor 2: X̄1, X̄2, X̄3.]
Main Effects and Interactions
[Figure: "A main effect of Factor 2": dependent variable plotted against the 4 levels of Factor 1, with separate lines for levels 1-3 of Factor 2.]
Main Effects and Interactions
• There are two types of findings with multi-way ANOVA: Main
Effects and Interactions
– For example, a main effect of Factor 1 means that the means under the various levels of Factor 1 were different (at least one was different)
– A main effect of Factor 2 means that the means under the various levels of Factor 2 were different
– An interaction means that there was an effect of one factor, but the effect differs across the levels of the other factor (a code sketch follows the figure below)
Main Effects and Interactions
[Figure: "An Interaction": dependent variable plotted against the 4 levels of Factor 1, with separate lines for levels 1-3 of Factor 2.]
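To make the two-way case concrete, here is a minimal sketch using pandas and statsmodels (the 4 x 3 design, cell means, and sample sizes are all invented for illustration); anova_lm reports an F for each main effect and for the interaction:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical 4 x 3 design: 4 levels of factor 1, 3 levels of factor 2,
# 10 observations per cell, stored in long format (one row per observation).
rows = []
for f1 in range(1, 5):
    for f2 in range(1, 4):
        # Build in a main effect of factor 1 plus an interaction in one cell.
        cell_mean = 50 + 2 * f1 + (3 if (f1 == 4 and f2 == 3) else 0)
        for y in rng.normal(loc=cell_mean, scale=4, size=10):
            rows.append({"factor1": f1, "factor2": f2, "y": y})
data = pd.DataFrame(rows)

# C() treats each factor as categorical; '*' asks for both main effects
# and their interaction.
model = smf.ols("y ~ C(factor1) * C(factor2)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))
```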
Correlation
• We often measure two or more different
parameters of a single object
Correlation
• This creates two or more sets of
measurements
Correlation
• These sets of measurements can be
related to each other
– Large values in one set correspond to
large values in the other set
– Small values in one set correspond to
small values in the other set
Correlation
• examples:
– height and weight
– smoking and lung cancer
– socioeconomic status (SES) and longevity
Correlation
• We call the relationship between two
sets of numbers the correlation
Correlation
• Measure heights and weights of 6 people

Person   Height   Weight
a        5'4      120
b        5'10     140
c        5'2      100
d        5'1      110
e        5'6      140
f        5'8      150
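As a quick check, numpy's corrcoef gives the Pearson correlation for these six measurements (heights converted to inches for the computation):

```python
import numpy as np

# The six measurements from the slide, heights converted to inches.
height_in = np.array([64, 70, 62, 61, 66, 68])    # persons a-f
weight    = np.array([120, 140, 100, 110, 140, 150])

# Pearson correlation between the two sets of measurements.
r = np.corrcoef(height_in, weight)[0, 1]
print(f"r = {r:.2f}")
```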
Correlation
• Height vs. Weight
[Scatter plot built up point by point: persons a-f plotted with Height (5' to 5'10) on one axis and Weight (100 to 150) on the other.]
Correlation
• Notice that small values on one scale
pair up with small values on the other
[Scatter plot repeated from the previous slide.]
Correlation
• Scatter Plot shows the relationship on a single graph
• Like two number lines perpendicular to each other
[Scatter plot: Weight on the vertical axis ("think of this as the y-axis") against Height on the horizontal axis ("think of this as the x-axis"), points a-f.]
Correlation
• Scatter Plot shows the relationship on a single graph
[Scatter plot of the six height-weight points.]
Correlation
• The relationship here is like a straight line
• We call this linear correlation
[Scatter plot with the points falling roughly along a straight line.]
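For readers following along in code, a minimal matplotlib sketch (assuming matplotlib is installed) that reproduces a scatter plot like the one on these slides, using the same six measurements:

```python
import matplotlib.pyplot as plt
import numpy as np

# Same six height-weight measurements, heights in inches.
height_in = np.array([64, 70, 62, 61, 66, 68])
weight    = np.array([120, 140, 100, 110, 140, 150])

fig, ax = plt.subplots()
ax.scatter(height_in, weight)
for label, x, y in zip("abcdef", height_in, weight):
    ax.annotate(label, (x, y))          # label each point with its person letter
ax.set_xlabel("Height (inches)")        # the x-axis "number line"
ax.set_ylabel("Weight")                 # the y-axis "number line"
plt.show()
```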
Various Kinds of Linear Correlation
• Strong Positive
Various Kinds of Linear Correlation
• Weak Positive
Various Kinds of Linear Correlation
• Strong Negative
Various Kinds of Linear Correlation
• No (or very weak) Correlation
• y values are random with respect to x values
Various Kinds of Linear Correlation
• No Linear Correlation
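A small simulation (toy data, not from the slides) makes these categories concrete: each y below has a different relationship to x, and the printed Pearson r values illustrate strong positive, weak positive, strong negative, no correlation, and a nonlinear relationship that r near zero fails to capture:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)

# Simulated y values illustrating the kinds of linear correlation above.
examples = {
    "strong positive": x + rng.normal(scale=0.3, size=200),
    "weak positive":   x + rng.normal(scale=3.0, size=200),
    "strong negative": -x + rng.normal(scale=0.3, size=200),
    "no correlation":  rng.normal(size=200),           # y random w.r.t. x
    "nonlinear":       x ** 2 + rng.normal(scale=0.3, size=200),
}
for name, y in examples.items():
    print(f"{name:>16}: r = {np.corrcoef(x, y)[0, 1]: .2f}")
```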
Correlation Enables Prediction
• Strong correlations mean that we can predict a y value given an x value; this is called regression
• The accuracy of our prediction depends on the strength of the correlation
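A minimal sketch of regression in this sense, using scipy.stats.linregress on the six height-weight measurements from the earlier slide (the 67-inch prediction point is just an illustration):

```python
import numpy as np
from scipy import stats

# Predict weight from height for the six measurements on the earlier slide.
height_in = np.array([64, 70, 62, 61, 66, 68])
weight    = np.array([120, 140, 100, 110, 140, 150])

result = stats.linregress(height_in, weight)
print(f"r = {result.rvalue:.2f}")

# Use the fitted line to predict the weight of a hypothetical 67-inch person.
predicted = result.intercept + result.slope * 67
print(f"predicted weight at 67 inches: {predicted:.0f}")
```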
Spurious Correlation
• Sometimes two measures (called variables) both
correlate with some other unknown variable
(sometimes called a lurking variable) and
consequently correlate with each other
• This does not mean that they are causally related!
• e.g., use of cigarette lighters is positively correlated with the incidence of lung cancer
Next Time: measuring correlations