Today’s Agenda - SPSS: Boxplots - Ratios - Crosstabs

Today’s Agenda
- SPSS: Boxplots
- Ratios
- Crosstabs
- Conditionals
- Association and causation
- SPSS: Crosstabs
Relevant text: P.54-60 Chapter 2
From last time:
- Boxplots are good for visualizing the general trend in
interval data.
- They show everything in the _________________ (Min-Q1-Median-Q3-Max), the whiskers and the outliers.
From last time:
- __________________boxplots can be used to compare the
distributions of multiple sets of interval data.
To build one in SPSS, go to
_____________________________________________
Then, in the boxplot pop-up, switch to “Summaries of
__________________ “ and click __________________
This assumes that we’re comparing data from different
variables like X,Y, and Z in the SPSS example from last week.
Move the variables you want plotted into
“___________________________”
Then click OK
- This is the result. Side-by-side boxplots can be used to
compare more than 2 variables.
- But, boxplots can only display __________________ data.
When we have no interval data, we need something else, like
cross tabulations (or crosstabs).
Cross tabulations (literally __________________) are a non-graphical method of summarizing two variables of _________
or __________________ data.
                        Favoured Ice Cream
Favoured Pet      Vanilla   Chocolate   Total
Cat                    72          29     101
Dragon                210         825    1035
Total                 282         854    1136
Each _________ represents the category of one variable.
The row totals show that _________ people prefer cats,
and _________ people prefer bearded dragons.
Each _________ represents the category of the other variable.
The column totals show that _________ people prefer vanilla,
and _________ people prefer chocolate.
Each _________ represents the number of cases that are in
both the row and column categories.
_________ people like vanilla AND cats best. (boring)
_________ people like chocolate AND dragons best. (better)
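The cells, row totals and column totals described above can be tallied with a short script. This is only a minimal sketch in Python (the course itself uses SPSS); the variable names are illustrative, and the four cell counts are copied from the pet and ice cream table.

```python
from collections import Counter

# Cell counts from the table: (favoured pet, favoured ice cream) -> cases.
cells = {
    ("Cat", "Vanilla"): 72,
    ("Cat", "Chocolate"): 29,
    ("Dragon", "Vanilla"): 210,
    ("Dragon", "Chocolate"): 825,
}

row_totals = Counter()  # one total per pet category (rows)
col_totals = Counter()  # one total per ice cream category (columns)
for (pet, ice_cream), count in cells.items():
    row_totals[pet] += count
    col_totals[ice_cream] += count

grand_total = sum(cells.values())

# Print the crosstab with its margins.
flavours = ["Vanilla", "Chocolate"]
header = "".join(f"{flavour:>10}" for flavour in flavours)
print(f"{'':8}{header}{'Total':>10}")
for pet in ["Cat", "Dragon"]:
    counts = "".join(f"{cells[(pet, flavour)]:>10}" for flavour in flavours)
    print(f"{pet:8}{counts}{row_totals[pet]:>10}")
totals = "".join(f"{col_totals[flavour]:>10}" for flavour in flavours)
print(f"{'Total':8}{totals}{grand_total:>10}")
```

Each cell is counted once toward its row total and once toward its column total, so both sets of totals add up to the same grand total.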
Another perspective:
Of the people that prefer cats to dragons, 72 like vanilla.
Conditionals
When we only consider one row or one column, we are
conditioning on that response.
“Of the people that prefer cats to dragons, 72 like vanilla.”
Means:
_________ on the preference of cats, 72 (out of 101) prefer
vanilla.
To go further we need the _________.
A ratio is simply a measure of how much of one thing there is
in comparison to another.
Example: The fertility rate in Canada is 1.49 children per
woman. Comparing # of (expected) children to # of women.
Example: The DeLorean car goes 88 miles per hour. Comparing
# of miles traveled to # of hours passed.
Ratios can be used to make fair comparisons between things
that come from different scales.
Canada has a few more hockey players than the US, but a
MUCH bigger hockey player to citizen ratio.
Source: IIHF Survey of Players
These comparisons can be made over time too.
- There are roughly as many traffic fatalities in the US as
there were 60 years ago. (~30,000)
- Traffic fatalities are considered to be “at an all-time low”
because much more driving happened in 2009 than in
1949, yet it resulted in about the same number of deaths.
- Fatalities per mile are much lower now (about 1/6 as
many).
Source: National Highway Traffic Safety Administration
http://www-fars.nhtsa.dot.gov/Main/index.aspx
Ratios let us make a fair comparison between different
conditions.
                        Favoured Ice Cream
Favoured Pet      Vanilla   Chocolate   Total
Cat                    72          29     101
Dragon                210         825    1035
Total                 282         854    1136
There may be more total vanilla fans among dragon fans but…
72 of 101, or _________of cat fans prefer vanilla.
210 of 1035, or _________ of dragon fans prefer vanilla.
When we compare the ratios instead of the raw numbers, we
account for the different _________ of the groups being
considered.
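As a rough check on the comparison above, the two conditional proportions can be computed directly. This is a sketch in Python rather than SPSS; the helper name `conditional_proportion` is made up for illustration, and the counts and row totals come from the pet and ice cream table.

```python
def conditional_proportion(cell_count, group_total):
    """Share of one group (a single row or column) that falls in a given cell."""
    return cell_count / group_total

# Conditioning on the pet preference: divide by the row total, not the grand total.
vanilla_given_cat = conditional_proportion(72, 101)
vanilla_given_dragon = conditional_proportion(210, 1035)

print(f"Cat fans preferring vanilla:    {vanilla_given_cat:.1%}")
print(f"Dragon fans preferring vanilla: {vanilla_given_dragon:.1%}")
```

Dividing by each group's own total is what accounts for the groups being different sizes.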
                     Safety Status
Vehicle       Died in Traffic      Else      Total
Motorbike                  11     9,989     10,000
Car                        54    99,946    100,000
Total                      65   109,935    110,000
Which do you think is more dangerous? Motorbikes or cars?
Cars have more fatalities, but the fatalities _________ is much
higher with motorbikes.
                     Safety Status
Vehicle       Died in Traffic      Else      Total
Motorbike                  11     9,989     10,000
Car                        54    99,946    100,000
Total                      65   109,935    110,000
11/10000 = 0.11% Fatality rate on bikes.
54/100000 = 0.05% Fatality rate on cars.
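The two fatality rates can be reproduced, and compared to each other, with a few lines of Python. This is a sketch only; the dictionary layout and variable names are illustrative, and the counts come from the vehicle table.

```python
# Counts from the vehicle table: (died in traffic, total people) per vehicle.
vehicles = {
    "Motorbike": (11, 10_000),
    "Car": (54, 100_000),
}

# Fatality rate = deaths divided by the size of that vehicle's group.
rates = {name: died / total for name, (died, total) in vehicles.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.3%} fatality rate")

# How many times higher the motorbike rate is than the car rate.
rate_ratio = rates["Motorbike"] / rates["Car"]
print(f"Motorbike rate is about {rate_ratio:.1f} times the car rate")
```

Comparing the rates rather than the raw counts is what reverses the first impression that cars are more dangerous.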
We’ve only seen 2x2 crosstabs, but larger ones are possible.
The age and sex table of last week is a (2 x lots) crosstab. (Also,
age is _________ but sex is _________, they mix fine)
Ordinal variables can be included because they’re still
categories.
You can have more than two categories for both variables too.
                                   Child’s Education
Mother’s Education      <HS      HS   College     BSc    MSc+   TOTAL
< High School           142     381       112      31       2     668
High School             157     637       225      57      14    1090
Some College             36     206       486     549      98    1375
BSc                      18      25        68     410     103     624
Adv. Degree               4      11        22      91      54     182
TOTAL                   357    1260       913    1138     271    3939
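With a larger crosstab, conditioning on each row works the same way as in the 2x2 case. The sketch below, in Python rather than SPSS, turns each mother's-education row into percentages of that row's total; the function name `row_percentages` is an illustrative choice and the counts are copied from the education table.

```python
child_levels = ["<HS", "HS", "College", "BSc", "MSc+"]

# Rows: mother's education. Values: counts per child's education level.
counts = {
    "< High School": [142, 381, 112, 31, 2],
    "High School": [157, 637, 225, 57, 14],
    "Some College": [36, 206, 486, 549, 98],
    "BSc": [18, 25, 68, 410, 103],
    "Adv. Degree": [4, 11, 22, 91, 54],
}

def row_percentages(row):
    """Express each cell as a percentage of its own row total."""
    total = sum(row)
    return [100 * value / total for value in row]

print(f"{'Mother':15}" + "".join(f"{level:>9}" for level in child_levels))
for mother, row in counts.items():
    cells = "".join(f"{pct:>8.1f}%" for pct in row_percentages(row))
    print(f"{mother:15}{cells}")
```

Each printed row sums to 100%, so rows of very different sizes can be compared side by side.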
Association and causation
If the _________ from one variable is more common when a
particular response from another variable appears, we say
there is a __________________between the two responses.
                     Safety Status
Vehicle       Died in Traffic      Else      Total
Motorbike                  11     9,989     10,000
Car                        54    99,946    100,000
Total                      65   109,935    110,000
A _________ association means one is less common when the
other happens.
Here we would say that riding a motorbike has a _________
association with dying in traffic.
Important: an association does NOT imply _________.
Just because two things happen together does not mean one
of them caused the other.
Example (this is actually true): Being left handed has a negative
association with __________________.
That’s right, left handed people historically don’t live as long as
right handed people.
Source: Evidence for longevity differences between left handed and right handed men: an archival
study of cricketers. (1991)
J P Aggleton, R W Kentridge, and N J Neave
What’s so horribly wrong with lefties?
…nothing. Only the ones that died younger were counted.
Not long ago, left handedness was considered _________. Children were
forced to adopt right handedness, but not anymore.
If you looked at all the deaths today, more of the people that
were born a long time ago will be right handed.
So more of the people that had long lives ending today will
have been right handed.
It isn’t that lefties are dying young, it’s that lefties make up a
smaller portion of old people than of young people.
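The argument above can be illustrated with a toy calculation. Every number below is hypothetical and chosen only to make the effect visible; none of it comes from the cited study. The sketch assumes two groups of people dying today, and within each group handedness has no effect at all on age at death.

```python
# HYPOTHETICAL numbers, chosen only to illustrate the argument; not real data.
# Each group of deaths: (age at death, number of deaths, share left-handed).
# Within a group, handedness has no effect on age at death.
deaths_today = [
    (80, 900, 0.03),  # born long ago, when left-handedness was suppressed
    (30, 100, 0.12),  # born recently, when it was no longer suppressed
]

def mean_age_at_death(left_handed):
    """Average age at death among the left- or right-handed people dying today."""
    total_age = 0.0
    total_people = 0.0
    for age, n_deaths, left_share in deaths_today:
        share = left_share if left_handed else 1 - left_share
        people = n_deaths * share
        total_age += age * people
        total_people += people
    return total_age / total_people

left_mean = mean_age_at_death(left_handed=True)
right_mean = mean_age_at_death(left_handed=False)
print(f"Mean age at death, left-handed:  {left_mean:.1f}")
print(f"Mean age at death, right-handed: {right_mean:.1f}")
```

Under these made-up numbers the left-handed average comes out lower, even though handedness changes nothing within either group. The gap comes entirely from left-handers being a smaller share of the older group.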
The authors of the source paper didn’t account for forced-handedness.
Forced-handedness is a __________________, or a
_________.
It’s something other than handedness that affects life
expectancy, but hasn’t been accounted for.
Unless you _________ for all other possible variables, you
can’t pin an association down to one thing causing the other.
(Almost never in social research)
Later in the semester when we see the interval version of
association, called correlation, we’ll revisit this.
Crosstabs in SPSS
To build a crosstab table:
_____________________________________________
In the crosstab pop-up, move one variable to _________ and
one to _________, and click _________
In the output window, the crosstab will appear with the labels
instead of the variable names if you set them.
Another name for a crosstab is a __________________.