Sociology 549, Lecture 3
Graphs
by Paul von Hippel

Common graphs for frequency distributions
• Pie chart
• Line chart (frequency polygon)
• Bar chart
• Histogram

Other common graphs
• Time series
• Statistical map

Common distortions
• False perspective
  – e.g., tilting a pie chart
• Shortening an axis; e.g.,
  – not starting the vertical at 0
  – breaking the vertical
  – squishing the horizontal
• Reasons
  – Add visual interest
  – Make small differences look big,
  – Or make big differences look small

Shapes of distributions
• Symmetric
• Skewed
  – Positively skewed
  – Negatively skewed
• Modal
  – unimodal
  – bimodal
  – multimodal

Pie chart
• Rare in research
• Common in media
• Hard to compare wedges (different orientations)
• Can’t show order
  – Restrict to nominal variables
[Figure: pie chart, “Majors in Soc 549”: Sociology 61%, Criminology 35%, Psychology 4%]

Perspective distortion
• Add a meaningless 3rd dimension
• Tilt pie away
  – Edge adds to front
  – Perspective shrinks back
  – Comparisons even harder
[Figure: tilted 3-D pie chart of the same data: Sociology 61%, Criminology 35%, Psychology 4%]

Pie charts in politics
• Federal budget, from the website of the War Resisters’ League
• Redrawn
[Figure: redrawn pie chart: Current Military 26%, Past Military 20%, Human Resources 32%, General Government 16%, Physical Resources 6%]

Bar chart (column chart)
• In research, more common than pie
• Can show order
  – Appropriate for ordinal and interval
  – (as well as nominal)
• Easy to compare vertical distances
[Figure: bar chart, “Majors in Soc 549”; vertical axis 0 to 16; bars for Sociology, Criminology, Psychology]

Axis distortion
• Start vertical above zero
  – Exaggerates all differences
• Similar distortion:
  – Break vertical axis
[Figure: bar chart, “Majors in Soc 549”; vertical axis 7 to 15; bars for Sociology, Criminology, Psychology]

Perspective distortion
• Add meaningless 3rd dimension
  – Reduces differences (caps same size)
[Figure: 3-D bar chart; vertical axis 0 to 14; bars for Psychology, Criminology, Sociology]

Perspective distortion (continued)
• Add 3rd dimension and overlap
• Exaggerates differences
  – Hides side of smaller bars
  – Also hides part of top
• Rotation would make it worse
[Figure: overlapping 3-D bar chart; vertical axis 0 to 14; bars for Psychology, Criminology, Sociology]

Line chart (frequency polygon)
• Common in research
• Can show order
  – Appropriate for ordinal and interval variables
[Figure: line chart; vertical axis 0 to 16; points for Sociology, Criminology, Psychology]

Axis distortions
• Start vertical above zero
  – Or break vertical
[Figure: line chart; vertical axis 7 to 15; points for Sociology, Criminology, Psychology]

Perspective distortion
• Add meaningless 3rd dimension
• Tilt horizontal
  – Exaggerates trend
[Figure: 3-D line chart, series “S1”; vertical axis 0 to 14; points for Sociology, Criminology, Psychology]

Bar vs. line: similarities
• Bar and line charts almost equivalent
  – Start with a bar chart
    • Connect tops
    • Remove bottoms
    • You get a line chart!
[Figure: “Majors in Soc 549” drawn as a bar chart and as a line chart; vertical axes 0 to 16; Sociology, Criminology, Psychology]

Bar vs. line: Differences
• Line suggests trend more strongly
  – Helpful with ordinal or interval variables
  – Misleading with nominal
[Figure: two line charts, vertical axes 0 to 16: one across majors (Sociology, Criminology, Psychology), one across class ranks (Senior, Junior, Sophomore)]

Bar vs. line: Differences
• Line eases comparison of groups
[Figure: bar chart and line chart comparing two courses (Social statistics, Sociology of Sport) across majors (Sociology, Criminology, Psychology, Physical therapy, Communications, Biology); vertical axes 0 to 16]

Histograms
• Like bar chart, except
  – Variable typically continuous
  – Bars touch
    • usually
  – Horizontal can represent equal class intervals (“bins”)
    • Bin shown by center value (e.g. 35.0)
    • Or by ends of class interval (e.g. 33.75-36.25)
[Figure: histogram, “Starting salaries for BAs in sociology, 2000-2001”; horizontal axis: starting salary in thousands, bins centered at 22.5 to 47.5; vertical axis 0 to 30; Std. Dev = 4.31, Mean = 28.7, N = 96. Source: National Association of Colleges and Employers survey of college placement offices]

Summary: Graphical display of distributions
Columns: Pie, Bar, Line, Histogram
• Nominal: Pie √; Bar √; Line: book disapproves
• Ordinal: Pie: book approves; Bar √; Line √
• Interval: √ if continuous

Shape of distributions: Positive or right skew
• Positive or right skew
• Characteristics:
  – Peak on left
  – Long right tail
    • Stretched (skewed) to the right
  – A few large values
• Common cause
  – Floor but no ceiling
[Figure: histogram, “Starting salaries for BAs in sociology, 2000-2001”; starting salary in thousands, 22.5 to 47.5; Std. Dev = 4.31, Mean = 28.7, N = 96. Source: National Association of Colleges and Employers survey of college placement offices]

Negative or left skew
• Negative or left skew
• Characteristics mirror positive skew:
  – Peak on right
  – Long left tail
    • Stretched (skewed) to the left
  – A few small values
• Common cause
  – Ceiling but no floor
[Figure: histogram, “Assignment 1 scores, sociology 549, winter 2001”; horizontal axis: Assignment 1 scores, 35.0 to 100.0; vertical axis 0 to 14; Std. Dev = 15.79, Mean = 75.4, N = 101]

Symmetry
• Symmetry, no skew
  – Two tails, or no tails
• Important example:
  – The normal curve
[Figure: histogram; horizontal axis: Height of adult males (inches), 58.00 to 82.00; vertical axis: Frequency, 0 to 200]

Dummy variables
• Describe the shape of this distribution.
[Figure: bar chart, “Sex distribution, Soc 549, winter 2003”; horizontal axis: Sex dummy (1=female), values 0 and 1; vertical axis: Number of students, 0 to 30]

Unimodal distributions
• Mode
  – peak
  – most common value
• Unimodal
  – one peak
  – e.g., starting salaries
    • mode around $27K
• Interpretation
  – the most common salaries
  – are in the high $20s
[Figure: histogram, “Starting salaries for BAs in sociology, 2000-2001”; starting salary in thousands, 22.5 to 47.5; Std. Dev = 4.31, Mean = 28.7, N = 96. Source: National Association of Colleges and Employers survey of college placement offices]

Bimodal distributions
• Bimodal
  – two modes
  – e.g., # children
    • modes at 0 and 2
• Interpretation?
[Figure: bar chart; horizontal axis: NUMBER OF CHILDREN, 0 to “EIGHT OR MORE”; vertical axis: Count, 0 to 500]

Multimodal distributions
• Multimodal
  – more than 2 modes
  – e.g., hours worked by OSU sociology students
    • modes at 0, 20, 40
[Figure: distribution of hours worked, annotated “(primary) mode” and “secondary modes”]

Review of shape
• Shapes
  – Symmetric
  – Skewed
    • Positive (right)
    • Negative (left)
  – Unimodal, bimodal, multimodal

Time series: don’t show distributions, show change over time
[Figure: time series, “BAs in social science and history” (National Center for Educational Statistics); vertical axis: % women, 0% to 50%; horizontal axis: 1970 to 1995]

Axis distortion: start (or break) vertical above zero
[Figure: same time series, “BAs in social science and history”; vertical axis: % women, 30% to 46%; horizontal axis: 1970 to 1995]

Axis distortion: Squeeze vertical or stretch horizontal
[Figure: same time series; vertical axis: % women, 0% to 50%; horizontal axis: 1970 to 1995]

Axis distortion: Squeeze horizontal or stretch vertical
[Figure: same time series; vertical axis: % women, 0% to 50%; horizontal axis: 1970 to 1990]

Axis distortion in business
• NASDAQ stock index, reported by Yahoo!
• Redrawn
[Figure: redrawn chart, “NASDAQ stock index”; vertical axis 0 to 2500; horizontal axis 6-Jan-02 to 6-Jan-03]

Graphical distortion: Summary
• Axis distortion
  – Squeeze one axis
    • Honest aspect ratio is 3:2 (Tufte)
  – Start or break vertical axis above zero
• Perspective distortion
  – Add disproportionate areas in a meaningless 3rd dimension
  – Use blocking & tilting

Graphics: Good advice
• Keep it simple
  – Don’t stretch axes
  – Don’t start or break axes above zero
  – Don’t use 3-D
    • If you have to use 3D, avoid abuses
  – With just a few numbers, consider a table instead of a graph

Graphics: Evil advice
• Use every trick (3D, distorted axes)
  – Maximize differences that serve your purpose
  – Minimize differences that work against you
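
Worked example: how much does starting the vertical above zero exaggerate?
The sketch below is one rough way to put a number on the "start vertical axis above zero" distortion. It compares the ratio of two drawn bar heights with the ratio of the underlying values. The function name `drawn_ratio` and the counts (14 and 8) are assumptions made for illustration; the slides give percentages and axis ranges, not the exact counts.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of two bars when the vertical axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the axis baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical counts (assumed for illustration, not taken from the slides).
sociology, criminology = 14, 8

# Honest chart: vertical axis starts at 0, so drawn heights match the data.
honest = drawn_ratio(sociology, criminology)

# Distorted chart: vertical axis starts at 7, as in the axis-distortion slides.
truncated = drawn_ratio(sociology, criminology, baseline=7)

# How many times larger the difference looks than it really is.
exaggeration = truncated / honest

print(f"ratio with axis at 0: {honest:.2f}")
print(f"ratio with axis at 7: {truncated:.2f}")
print(f"exaggeration factor:  {exaggeration:.2f}")
```

Under these assumed counts, the taller bar is 1.75 times the shorter one in the data but is drawn 7 times as tall once the axis starts at 7, so the visual difference is inflated by a factor of 4.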