Stat 104 – Lecture 5
Association
• Variables
–Response – an outcome variable
whose values exhibit variability.
–Explanatory – a variable that we
use to try to explain the
variability in the response.
Association
• There is an association
between two variables if
values of one variable are
more likely to occur with
certain values of a second
variable.
Picturing Association
• Two Categorical (Qualitative).
–Cross-tabs table, mosaic plot.
• Two Numerical (Quantitative).
–Scatter diagram.
Categorical Data
• Who?
–Students in a statistics class at
Penn State University.
• What?
–“With whom is it easiest to make
friends?” Opposite sex, same sex,
no difference.
–Gender. Male, female.
Cross-tabs Table
With whom is it easiest to make friends?
          Same Sex   Opposite Sex   No Diff   Total
Female       16           58           63      137
Male         13           15           40       68
Total        29           73          103      205
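The totals in the cross-tabs table can be rebuilt from the six cell counts. The following is a minimal sketch in Python; the dictionary layout and variable names are illustrative choices, not part of the lecture.

```python
# Cell counts copied from the cross-tabs table above.
counts = {
    "Female": {"Same Sex": 16, "Opposite Sex": 58, "No Diff": 63},
    "Male": {"Same Sex": 13, "Opposite Sex": 15, "No Diff": 40},
}
answers = ["Same Sex", "Opposite Sex", "No Diff"]

# Row totals: number of students of each gender.
row_totals = {gender: sum(row.values()) for gender, row in counts.items()}

# Column totals: number of students giving each answer.
col_totals = {a: sum(counts[g][a] for g in counts) for a in answers}

# Grand total: all students in the table.
grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```

The printed totals should match the Total row and Total column of the table.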
Bar Graph
With whom is it easiest to make friends?
[Bar graph of Answer, count axis 0 to 100: No Diff 50.2%, Opposite 35.6%, Same 14.1%]
Frequencies
Level      Count    Prob
No Diff     103    0.50244
Opposite     73    0.35610
Same         29    0.14146
Total       205    1.00000
N Missing 0, 3 Levels
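The Prob column is each count divided by the total number of students. A short sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Counts for each answer, copied from the Frequencies output above.
answer_counts = {"No Diff": 103, "Opposite": 73, "Same": 29}

n = sum(answer_counts.values())  # total number of students

# Proportion of all students giving each answer.
proportions = {level: count / n for level, count in answer_counts.items()}

for level, p in proportions.items():
    print(f"{level:<10}{answer_counts[level]:>5}{p:>10.5f}")
```

Multiplying each proportion by 100 gives the percentages shown on the bars.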
Percentages
With whom is it easiest to make friends?
Count
Row %     Same Sex   Opposite Sex   No Diff   Total
Female       16           58           63      137
           11.7%        42.3%        46.0%     100%
Male         13           15           40       68
           19.1%        22.1%        58.8%     100%
Total        29           73          103      205
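Row percentages divide each cell count by its row total, so each gender's percentages add to 100%. A minimal sketch follows; the function name `row_percentages` is a hypothetical helper, not something from the lecture or from any statistics package.

```python
def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for row_label, row in table.items():
        total = sum(row.values())
        result[row_label] = {col: 100 * count / total for col, count in row.items()}
    return result


# Cell counts copied from the table above.
counts = {
    "Female": {"Same Sex": 16, "Opposite Sex": 58, "No Diff": 63},
    "Male": {"Same Sex": 13, "Opposite Sex": 15, "No Diff": 40},
}

pct = row_percentages(counts)
for gender, row in pct.items():
    print(gender, {answer: round(value, 1) for answer, value in row.items()})
```

Comparing the Female and Male rows of percentages, rather than the raw counts, is what allows groups of different sizes (137 and 68) to be compared fairly.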
Mosaic Plot
[Mosaic plot: Answer (No Diff, Opposite, Same, stacked from bottom to top) on the vertical axis from 0.00 to 1.00, by Gender (Female, Male) on the horizontal axis.]
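The geometry of a mosaic plot can be worked out from the counts alone. The sketch below assumes the usual construction: each gender's column width is its share of all students, and the segment heights within a column are that gender's row proportions. The function name `mosaic_rectangles` and the tuple layout are illustrative choices.

```python
def mosaic_rectangles(table, stack_order):
    """Return (group, level, x_left, x_right, y_bottom, y_top) for each tile."""
    grand_total = sum(sum(row.values()) for row in table.values())
    rects = []
    x_left = 0.0
    for group, row in table.items():
        group_total = sum(row.values())
        width = group_total / grand_total      # column width = share of all cases
        y_bottom = 0.0
        for level in stack_order:
            height = row[level] / group_total  # tile height = row proportion
            rects.append((group, level, x_left, x_left + width,
                          y_bottom, y_bottom + height))
            y_bottom += height
        x_left += width
    return rects


# Cell counts from the cross-tabs table of Gender by Answer.
counts = {
    "Female": {"Same": 16, "Opposite": 58, "No Diff": 63},
    "Male": {"Same": 13, "Opposite": 15, "No Diff": 40},
}

# Stack from bottom to top: No Diff, Opposite, Same.
for rect in mosaic_rectangles(counts, ["No Diff", "Opposite", "Same"]):
    group, level, x0, x1, y0, y1 = rect
    print(f"{group:<7}{level:<9} x: {x0:.3f}-{x1:.3f}  y: {y0:.3f}-{y1:.3f}")
```

Under this construction the Female column is about twice as wide as the Male column, because there are about twice as many females in the class.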
Interpretation
• More than 50% of males say no
difference, while less than 50%
of females say no difference.
• Females are about twice as
likely as males to say opposite
sex (42.3% vs. 22.1%).
• Males are about 1.6 times as
likely as females to say same
sex (19.1% vs. 11.7%).
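The "times as likely" comparisons can be checked by dividing one group's row proportion by the other's. A small sketch using the counts from the cross-tabs table; the variable names are illustrative.

```python
# Counts and row totals from the cross-tabs table.
female_total, male_total = 137, 68
female_opposite, male_opposite = 58, 15
female_same, male_same = 16, 13

# How many times as likely are females as males to answer "opposite sex"?
opposite_ratio = (female_opposite / female_total) / (male_opposite / male_total)

# How many times as likely are males as females to answer "same sex"?
same_ratio = (male_same / male_total) / (female_same / female_total)

print(f"Opposite sex, female vs. male: {opposite_ratio:.2f}")
print(f"Same sex, male vs. female:     {same_ratio:.2f}")
```

The first ratio comes out a little under 2 and the second a little over 1.6.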
Scatter Plot
• Statistics is about … variation.
• Recognize, quantify and try to
explain variation.
• Variation in two quantitative
variables is displayed in a scatter
plot.
Scatter Plot
• Numerical variable on the
vertical axis, y, is the
response variable.
• Numerical variable on the
horizontal axis, x, is the
explanatory variable.
Scatter Plot
• Example: Body mass (kg) and
Bite force (N) for Canidae.
–y, Response: Bite force (N)
–x, Explanatory: Body mass (kg)
–Cases: 28 species of Canidae.
[Scatter plot: Bivariate Fit of BFca (N) By Body Mass (kg). Vertical axis: BFca (N), 0 to 500. Horizontal axis: Body Mass (kg), 0 to 40.]
Positive Association
• Positive Association
–Above average values of Bite
force are associated with above
average values of Body mass.
–Below average values of Bite
force are associated with below
average values of Body mass.
Scatter Plot
• Example: Outside temperature
and amount of natural gas used.
–Response: Natural gas used (1000
ft³).
–Explanatory: Outside temperature
(°C).
–Cases: 26 days.
[Scatter plot: Gas (vertical axis, 0 to 10) versus Temp (horizontal axis, -5.0 to 15.0).]
Negative Association
–Above average values of gas
are associated with below
average temperatures.
–Below average values of gas
are associated with above
average temperatures.
Association
• Positive
–As x goes up, y tends to go up.
• Negative
–As x goes up, y tends to go
down.
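The phrase "as x goes up, y tends to go up" can be read as a statement about pairs of cases: pick two cases, and ask whether the one with the larger x also has the larger y. The sketch below counts such pairs. It is one possible illustration rather than a method from the lecture, the function name `direction_of_association` is hypothetical, and the numbers are made up (they are NOT the 26 days of gas data).

```python
from itertools import combinations


def direction_of_association(x, y):
    """Compare every pair of cases and report the more common direction."""
    up = 0    # pairs where the case with larger x also has larger y
    down = 0  # pairs where the case with larger x has smaller y
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        change = (x2 - x1) * (y2 - y1)
        if change > 0:
            up += 1
        elif change < 0:
            down += 1
    if up > down:
        return "positive"
    if down > up:
        return "negative"
    return "no clear direction"


# Hypothetical illustration values, not data from the lecture.
temp = [-3.0, 0.0, 4.0, 9.0, 14.0]  # outside temperature (°C)
gas = [9.5, 8.0, 6.5, 4.0, 2.0]     # natural gas used (1000 ft³)

print(direction_of_association(temp, gas))
```

For these made-up values every warmer day uses less gas, so the function reports a negative association.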