Stat 104 – Lecture 5

Association
• Variables
  –Response – an outcome variable whose values exhibit variability.
  –Explanatory – a variable that we use to try to explain the variability in the response.
• There is an association between two variables if values of one variable are more likely to occur with certain values of a second variable.

Picturing Association
• Two Categorical (Qualitative).
  –Cross-tabs table, mosaic plot.
• Two Numerical (Quantitative).
  –Scatter diagram.

Categorical Data
• Who?
  –Students in a statistics class at Penn State University.
• What?
  –“With whom is it easiest to make friends?” Opposite sex, same sex, no difference.
  –Gender. Male, female.

Cross-tabs Table
With whom is it easiest to make friends?

          Same Sex   Opposite Sex   No Diff   Total
Female       16           58           63      137
Male         13           15           40       68
Total        29           73          103      205

Bar Graph
With whom is it easiest to make friends?
[Bar graph: count of each answer (No Diff, Opposite, Same), with bars labeled 50.2, 35.6, and 14.1 percent.]

Level      Count   Prob
No Diff     103    0.50244
Opposite     73    0.35610
Same         29    0.14146
Total       205    1.00000
N Missing 0, 3 Levels

Percentages
With whom is it easiest to make friends?
Each cell shows Count (Row %).

          Same Sex     Opposite Sex   No Diff       Total
Female    16 (11.7%)   58 (42.3%)     63 (46.0%)    137 (100%)
Male      13 (19.1%)   15 (22.1%)     40 (58.8%)     68 (100%)
Total     29           73             103           205

Mosaic Plot
[Mosaic plot: Answer (No Diff, Opposite, Same) by Gender (Female, Male); vertical axis from 0.00 to 1.00.]

Interpretation
• More than 50% of males say no difference, while less than 50% of females say no difference (58.8% vs. 46.0%).
• Females are about twice as likely as males to say opposite sex (42.3% vs. 22.1%).
• Males are about 1.6 times as likely as females to say same sex (19.1% vs. 11.7%).

Scatter Plot
• Statistics is about … variation.
• Recognize, quantify and try to explain variation.
• Variation in two quantitative variables is displayed in a scatter plot.
• Numerical variable on the vertical axis, y, is the response variable.
• Numerical variable on the horizontal axis, x, is the explanatory variable.

Scatter Plot
• Example: Body mass (kg) and Bite force (N) for Canidae.
  –y, Response: Bite force (N)
  –x, Explanatory: Body mass (kg)
  –Cases: 28 species of Canidae.

[Scatter plot: Bivariate Fit of BFca (N) By Body Mass (kg). Vertical axis: BFca (N), 0 to 500. Horizontal axis: Body Mass (kg), 0 to 40.]

Positive Association
• Positive Association
  –Above average values of Bite force are associated with above average values of Body mass.
  –Below average values of Bite force are associated with below average values of Body mass.

Scatter Plot
• Example: Outside temperature and amount of natural gas used.
  –Response: Natural gas used (1000 ft³).
  –Explanatory: Outside temperature (°C).
  –Cases: 26 days.

[Scatter plot: Gas by Temp. Vertical axis: Gas, 0 to 10. Horizontal axis: Temp, -5.0 to 15.0.]

Negative Association
• Negative Association
  –Above average values of gas are associated with below average temperatures.
  –Below average values of gas are associated with above average temperatures.

Association
• Positive
  –As x goes up, y tends to go up.
• Negative
  –As x goes up, y tends to go down.
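
Worked example: row percentages for the friendship data
For two categorical variables, association shows up as different row percentages across groups. The sketch below recomputes the row percentages for the "With whom is it easiest to make friends?" example from the cross-tab counts. It is a minimal illustration in Python, not the software that produced the output in the slides; the names counts, row_percentages, and row_pct are hypothetical and chosen only for this example.

```python
# Counts from the cross-tabs table: gender by answer.
counts = {
    "Female": {"Same Sex": 16, "Opposite Sex": 58, "No Diff": 63},
    "Male": {"Same Sex": 13, "Opposite Sex": 15, "No Diff": 40},
}


def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())  # row total, e.g. 137 for Female
        result[group] = {answer: 100 * n / total for answer, n in row.items()}
    return result


row_pct = row_percentages(counts)

# Print one line per gender so the two distributions can be compared.
for group, row in row_pct.items():
    cells = ", ".join(f"{answer}: {pct:.1f}%" for answer, pct in row.items())
    print(f"{group}: {cells}")
```

If the two printed rows were nearly identical, the answer would not depend on gender. Here they differ (for example, 42.3% of females but 22.1% of males say opposite sex), which is what the interpretation of the mosaic plot describes.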