Today’s Agenda
- SPSS: Boxplots
- Ratios
- Crosstabs
- Conditionals
- Association and causation
- SPSS: Crosstabs

Relevant text: P.54-60, Chapter 2

From last time:
- Boxplots are good for visualizing the general trend in interval data.
- They show everything in the _________________ (Min-Q1-Median-Q3-Max), the whiskers and the outliers.
- __________________ boxplots can be used to compare the distributions of multiple sets of interval data.

To build one in SPSS, go to _____________________________________________
Then, in the boxplot pop-up, switch to “Summaries of __________________” and click __________________
This assumes that we’re comparing data from different variables, like X, Y, and Z in the SPSS example from last week.
Move the variables you want plotted into “___________________________”
Then click OK.

- This is the result. Side-by-side boxplots can be used to compare more than 2 variables.
- But boxplots can only display __________________ data.

When we have no interval data, we need something else, like cross tabulations (or crosstabs).

Cross tabulations (literally __________________) are a nongraphical method of summarizing two variables of _________ or __________________ data.

                Favoured Ice Cream
Favoured Pet    Vanilla   Chocolate   Total
Cat                  72          29     101
Dragon              210         825    1035
Total               282         854    1136

Each _________ represents the category of one variable.
The row totals show that _________ people prefer cats,
and _________ people prefer bearded dragons.

Each _________ represents the category of the other variable.
The column totals show that _________ people prefer vanilla,
and _________ people prefer chocolate.

Each _________ represents the number of cases that are in both the row and column categories.
_________ people like vanilla AND cats best. (boring)
_________ people like chocolate AND dragons best. (better)

Another perspective: Of the people that prefer cats to dragons, 72 like vanilla.

Conditionals
When we only consider one row or one column, we are conditioning on that response.
“Of the people that prefer cats to dragons, 72 like vanilla.” means:
_________ on the preference of cats, 72 (out of 101) prefer vanilla.

To go further we need the _________.

A ratio is simply a measure of how much of one thing there is in comparison to another.
Example: The fertility rate in Canada is 1.49 children per woman.
Comparing # of (expected) children to # of women.
Example: The DeLorean car goes 88 miles per hour.
Comparing # of miles traveled to # of hours passed.

Ratios can be used to make fair comparisons between things that come from different scales.
Canada has a few more hockey players than the US, but a MUCH bigger hockey-player-to-citizen ratio.
Source: IIHF Survey of Players

These comparisons can be made over time too.
- There are roughly as many traffic fatalities in the US as there were 60 years ago. (~30,000)
- Traffic fatalities are considered to be “at an all-time low” because much more driving happened in 2009 than in 1949, yet it resulted in about the same number of deaths.
- Fatalities per mile are much lower now. (About 1/6 as many)
Source: National Highway Traffic Safety Administration http://www-fars.nhtsa.dot.gov/Main/index.aspx

Ratios let us make a fair comparison between different conditions.

                Favoured Ice Cream
Favoured Pet    Vanilla   Chocolate   Total
Cat                  72          29     101
Dragon              210         825    1035
Total               282         854    1136

There may be more total vanilla fans among dragon fans but…
72 of 101, or _________ of cat fans prefer vanilla.
210 of 1035, or _________ of dragon fans prefer vanilla.
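The two conditional ratios for the pet and ice cream table can be checked with a short script. This is only a minimal sketch in Python (the course itself uses SPSS); the nested-dictionary layout and the function name `row_proportions` are assumptions made for this illustration.

```python
# Counts copied from the pet / ice cream crosstab in the handout.
# Storing them as a dict of dicts is an illustrative assumption.
crosstab = {
    "Cat": {"Vanilla": 72, "Chocolate": 29},
    "Dragon": {"Vanilla": 210, "Chocolate": 825},
}


def row_proportions(table):
    """Condition on each row: divide every cell by its row total."""
    result = {}
    for row_label, cells in table.items():
        row_total = sum(cells.values())
        result[row_label] = {
            col_label: count / row_total for col_label, count in cells.items()
        }
    return result


proportions = row_proportions(crosstab)
for pet, cells in proportions.items():
    # Share of this pet's fans who prefer vanilla, shown as a percentage.
    print(f"Of {pet} fans, {cells['Vanilla']:.1%} prefer vanilla")
```

Dividing by the row total is what conditioning on a row means here. Dividing each cell by its column total instead would condition on the ice cream preference.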
When we compare the ratios instead of the raw numbers, we account for the different _________ of the groups being considered.

             Safety Status
Vehicle      Died in Traffic      Else      Total
Motorbike                 11     9,989     10,000
Car                       54    99,946    100,000
Total                     65   109,935    110,000

Which do you think is more dangerous? Motorbikes or cars?
Cars have more fatalities, but the fatality _________ is much higher with motorbikes.
11/10,000 = 0.11% fatality rate on motorbikes.
54/100,000 ≈ 0.05% fatality rate in cars.

We’ve only seen 2x2 crosstabs, but larger ones are possible.
The age and sex table of last week is a (2 x lots) crosstab.
(Also, age is _________ but sex is _________; they mix fine.)
Ordinal variables can be included because they’re still categories.
You can have more than two categories for both variables too.

                        Child’s Education
Mother’s Education    <HS     HS   College    BSc   MSc+   TOTAL
< High School         142    381       112     31      2     668
High School           157    637       225     57     14    1090
Some College           36    206       486    549     98    1375
BSc                    18     25        68    410    103     624
Adv. Degree             4     11        22     91     54     182
TOTAL                 357   1260       913   1138    271    3939

Association and causation
If the _________ from one variable is more common when a particular response from another variable appears, we say there is a __________________ between the two responses.

             Safety Status
Vehicle      Died in Traffic      Else      Total
Motorbike                 11     9,989     10,000
Car                       54    99,946    100,000
Total                     65   109,935    110,000

A _________ association means one is less common when the other happens.
Here we would say that riding a motorbike has a _________ association with dying in traffic.

Important: an association does NOT imply _________.
Just because two things happen together does not mean one of them caused the other.

Example (this is actually true): Being left-handed has a negative association with __________________.
That’s right, left-handed people historically don’t live as long as right-handed people.
Source: “Evidence for longevity differences between left handed and right handed men: an archival study of cricketers” (1991), J. P. Aggleton, R. W. Kentridge, and N. J. Neave

What’s so horribly wrong with lefties? …nothing.
Only the ones that died younger were counted.
Not long ago, left-handedness was considered _________.
Children were forced to adopt right-handedness, but not anymore.
If you look at all the deaths today, more of the people who were born a long time ago will be right-handed.
So more of the people whose long lives are ending today will have been right-handed.
It isn’t that lefties are dying young; it’s that lefties make up a smaller portion of old people than of young people.

The authors of the source paper didn’t account for forced-handedness.
Forced-handedness is a __________________, or a _________.
It’s something other than handedness that affects life expectancy, but hasn’t been accounted for.
Unless you _________ for all other possible variables, you can’t pin an association down to one thing causing the other. (This is almost never possible in social research.)
Later in the semester, when we see the interval version of association, called correlation, we’ll revisit this.

Crosstabs in SPSS
To build a crosstab table: _____________________________________________
In the crosstab pop-up, move one variable to _________ and one to _________, and click _________
In the output window, the crosstab will appear with the labels instead of the variable names if you set them.
Another name for a crosstab is a __________________.
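To see what a crosstab actually tallies, the counting can be sketched outside SPSS in a few lines of Python. This illustrates the idea only and is not the SPSS procedure; the function name `build_crosstab` and the one-pair-per-person case list are assumptions made for the example, with counts taken from the vehicle safety table.

```python
from collections import Counter


def build_crosstab(cases):
    """Tally (row, column) pairs into cell counts, row totals and column totals."""
    cells = Counter(cases)                          # (row, col) -> count
    row_totals = Counter(row for row, _ in cases)   # row -> count
    col_totals = Counter(col for _, col in cases)   # col -> count
    return cells, row_totals, col_totals


# Assumed raw data layout: one (vehicle, outcome) pair per person,
# expanded from the counts in the vehicle safety table.
cases = (
    [("Motorbike", "Died")] * 11
    + [("Motorbike", "Else")] * 9989
    + [("Car", "Died")] * 54
    + [("Car", "Else")] * 99946
)

cells, row_totals, col_totals = build_crosstab(cases)
for vehicle in row_totals:
    # Conditioning on the vehicle: deaths divided by that vehicle's row total.
    rate = cells[(vehicle, "Died")] / row_totals[vehicle]
    print(f"{vehicle}: {cells[(vehicle, 'Died')]} of {row_totals[vehicle]} died ({rate:.3%})")
```

Each case lands in exactly one cell, so the cell counts always add up to the row totals, the column totals, and the grand total.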