Review: The Logic Underlying ANOVA
• The possible pair-wise comparisons:

          Sample 1   Sample 2   Sample 3
          X11        X21        X31
          X12        X22        X32
          ...        ...        ...
          X1n        X2n        X3n
means:    X̄1         X̄2         X̄3

Review: The Logic Underlying ANOVA
• There are k samples with which to estimate population variance

  $\hat{\sigma}_1^2 = \frac{\sum_i (X_{1i} - \bar{X}_1)^2}{n - 1}$

  $\hat{\sigma}_2^2 = \frac{\sum_i (X_{2i} - \bar{X}_2)^2}{n - 1}$

  $\hat{\sigma}_3^2 = \frac{\sum_i (X_{3i} - \bar{X}_3)^2}{n - 1}$

Review: The Logic Underlying ANOVA
• The average of these variance estimates is called the “Mean Square Error” or “Mean Square Within”

  $MS_{error} = \frac{\sum_{j=1}^{k} \hat{\sigma}_j^2}{k}$

Review: The Logic Underlying ANOVA
• There are k means with which to estimate the population variance

  $\hat{\sigma}^2 = n\,\hat{\sigma}_{\bar{X}}^2 = \frac{n \sum_{j=1}^{k} (\bar{X}_j - \bar{X}_{overall})^2}{k - 1}$
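As a rough illustration of the two variance estimates described above, the following Python sketch computes the per-sample variance estimates, their average (MSerror), and n times the variance of the sample means. The scores and the function names `ms_error` and `variance_from_means` are made up for illustration and do not come from the slides; equal sample sizes n are assumed, as in the formulas.

```python
from statistics import mean, variance

def ms_error(samples):
    """Average of the k within-sample variance estimates (each divides by n - 1)."""
    return mean(variance(s) for s in samples)

def variance_from_means(samples):
    """n times the variance of the k sample means (divides by k - 1)."""
    n = len(samples[0])  # assumption: every sample has the same size n
    return n * variance([mean(s) for s in samples])

# Hypothetical scores: k = 3 samples with n = 3 scores each
samples = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

print("within-sample estimate:", ms_error(samples))
print("estimate from the means:", variance_from_means(samples))
```

With these made-up scores the estimate from the means comes out larger than the within-sample estimate, because the three sample means differ from one another.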
Review: The Logic Underlying ANOVA
• The estimate of population variance based on sample means is called Mean Square Effect or Mean Square Between

  $MS_{effect} = \hat{\sigma}^2 = n\,\hat{\sigma}_{\bar{X}}^2 = \frac{n \sum_{j=1}^{k} (\bar{X}_j - \bar{X}_{overall})^2}{k - 1}$

The F Statistic
• MSerror is based on deviation scores within each sample, but…
• MSeffect is based on deviations between samples
• MSeffect would overestimate the population variance when some effect of the treatment pushes the means of the different samples apart

The F Statistic
• We compare MSeffect against MSerror by constructing a statistic called F
• F is the ratio of MSeffect to MSerror

  $F_{k-1,\,k(n-1)} = \frac{MS_{effect}}{MS_{error}}$

The F Statistic
• If the null hypothesis $\mu_1 = \mu_2 = \mu_3$ is true, then we would expect $\bar{X}_1 = \bar{X}_2 = \bar{X}_3$ except for random sampling variation
• If the null hypothesis is true, then F should be close to 1.0, because both mean squares estimate the same population variance

ANOVA is scalable
• You can create a single F for any number of samples
• It is also possible to examine more than one independent variable using a multi-way ANOVA
  – Factors are the independent variables
  – Levels are the values (conditions) within each factor

ANOVA is scalable
A two-way ANOVA: 4 levels of Factor 1 (columns) by 3 levels of Factor 2 (rows)

                      levels of Factor 1
                  1             2             3             4
levels of    1    X1 X2 … Xn    X1 X2 … Xn    X1 X2 … Xn    X1 X2 … Xn
Factor 2     2    X1 X2 … Xn    X1 X2 … Xn    X1 X2 … Xn    X1 X2 … Xn
             3    X1 X2 … Xn    X1 X2 … Xn    X1 X2 … Xn    X1 X2 … Xn

Main Effects and Interactions
• There are two types of findings with multi-way ANOVA: Main Effects and Interactions
  – For example, a main effect of Factor 1 indicates that the means under the various levels of Factor 1 were different (at least one was different)

[Slide: the two-way table above, highlighting the mean X̄1 of all scores under level 1 of Factor 1]
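The F ratio from the F Statistic slides above (MSeffect over MSerror, with k − 1 and k(n − 1) degrees of freedom) can be sketched for the one-way, equal-n case as follows. This is only a minimal sketch: the function name `f_ratio` and both data sets are hypothetical, and the code does not look up a p-value.

```python
from statistics import mean, variance

def f_ratio(samples):
    """Return (F, df_effect, df_error) for k samples of equal size n."""
    k = len(samples)
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes equal sample sizes")
    ms_error = mean(variance(s) for s in samples)          # within samples
    ms_effect = n * variance([mean(s) for s in samples])   # between samples
    return ms_effect / ms_error, k - 1, k * (n - 1)

# Hypothetical scores: same spread within each sample in both data sets
mild_shift = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]        # means 2, 3, 4
strong_shift = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]   # means 5, 8, 11

for name, data in [("mild", mild_shift), ("strong", strong_shift)]:
    f, df_effect, df_error = f_ratio(data)
    print(f"{name}: F({df_effect}, {df_error}) = {f:.1f}")
```

The within-sample variance is the same in both made-up data sets, so the larger F for the second one comes entirely from the sample means being pushed further apart.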
[Slides: the same two-way table, highlighting in turn the means X̄2, X̄3, and X̄4 of all scores under levels 2, 3, and 4 of Factor 1]

Main Effects and Interactions
[Figure: A main effect of Factor 1. y-axis: dependent variable (means of each sample); x-axis: Factor 1, levels 1 to 4; legend: levels 1 to 3 of Factor 2]

Main Effects and Interactions
  – A main effect of Factor 2 indicates that the means under the various levels of Factor 2 were different

[Slides: the same two-way table, highlighting in turn the means X̄1, X̄2, and X̄3 of all scores under levels 1, 2, and 3 of Factor 2]

Main Effects and Interactions
[Figure: A main effect of Factor 2. y-axis: dependent variable; x-axis: Factor 1, levels 1 to 4; legend: levels 1 to 3 of Factor 2]

Main Effects and Interactions
  – An interaction means that the effect of one factor is different for different levels of the other factor

Main Effects and Interactions
[Figure: An Interaction. y-axis: dependent variable; x-axis: Factor 1, levels 1 to 4; legend: levels 1 to 3 of Factor 2]

Correlation
• We often measure two or more different parameters of a single object
• This creates two or more sets of measurements
• These sets of measurements can be related to each other
  – Large values in one set correspond to large values in the other set
  – Small values in one set correspond to small values in the other set

Correlation
• Examples:
  – height and weight
  – smoking and lung cancer
  – socioeconomic status (SES) and longevity

Correlation
• We call the relationship between two sets of numbers the correlation

Correlation
• Measure heights and weights of 6 people

Person    a      b      c      d      e      f
Height    5’4    5’10   5’2    5’1    5’6    5’8
Weight    120    140    100    110    140    150

Correlation
• Height vs. Weight
[Figure: two parallel number lines, Weight (100 to 150) and Height (5’ to 5’10), with the six people marked on each line one at a time]
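To make the height and weight example above concrete, the sketch below puts the six measurements from the table into Python and computes one common summary of how strongly they go together. The slides do not name a specific measure at this point, so the use of Pearson's r here is an assumption, as are the helper names `to_inches` and `pearson_r`. Heights are converted to inches; the unit of weight is not stated in the slides.

```python
from math import sqrt

# Measurements from the slide's table: person -> ((feet, inches), weight)
people = {
    "a": ((5, 4), 120),
    "b": ((5, 10), 140),
    "c": ((5, 2), 100),
    "d": ((5, 1), 110),
    "e": ((5, 6), 140),
    "f": ((5, 8), 150),
}

def to_inches(feet, inches):
    """Convert a height given as feet and inches to inches."""
    return 12 * feet + inches

def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed measure, not from the slides)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights = [to_inches(*h) for h, _ in people.values()]
weights = [w for _, w in people.values()]

r = pearson_r(heights, weights)
print(round(r, 2))
```

For these six people r comes out at roughly 0.88, which matches the impression from the table that taller people tend to be heavier.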
[Figure: the Weight and Height number lines with all six people marked. Order along Weight (100 to 150): c, d, a, b and e (tied), f. Order along Height (5’ to 5’10): d, c, a, e, f, b]

Correlation
• Notice that small values on one scale pair up with small values on the other

Correlation
• A scatter plot shows the relationship on a single graph
• Like two number lines perpendicular to each other: think of the Weight line as the y-axis and the Height line as the x-axis

[Figure: scatter plot of the six people, Weight (100 to 150) on the y-axis against Height (5’ to 5’10) on the x-axis]

Correlation
• The relationship here is like a straight line
• We call this linear correlation

Various Kinds of Linear Correlation
• Strong Positive
• Weak Positive
• Strong Negative
• No (or very weak) Correlation
  – y values are random with respect to x values
• No Linear Correlation

Correlation Enables Prediction
• A strong correlation means that we can predict a y value given an x value… this is called regression
• The accuracy of our prediction depends on the strength of the correlation

Spurious Correlation
• Sometimes two measures (called variables) both correlate with some other unknown variable (sometimes called a lurking variable) and consequently correlate with each other
• This does not mean that they are causally related!
• e.g. use of cigarette lighters is positively correlated with incidence of lung cancer

Next Time: measuring correlations
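The idea on the Correlation Enables Prediction slide, predicting a y value from an x value, can be sketched with a straight line fitted to the six height and weight pairs from the earlier table. The slides only say "regression"; fitting the line by ordinary least squares is an assumption here, and `fit_line` is a hypothetical helper name. Heights are in inches (5’4 becomes 64, and so on).

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

# Persons a to f from the slides' table, heights converted to inches
heights_in = [64, 70, 62, 61, 66, 68]
weights = [120, 140, 100, 110, 140, 150]

slope, intercept = fit_line(heights_in, weights)

# Predict the weight of a hypothetical person who is 5'7 (67 inches) tall
predicted = slope * 67 + intercept
print(f"slope = {slope:.2f} per inch, predicted weight at 67 in = {predicted:.0f}")
```

The prediction is only as trustworthy as the correlation is strong: with a weak correlation the points scatter widely around the fitted line, and individual predictions can be far off.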