Lecture 3

Sociology 549, Lecture 3
Graphs
by Paul von Hippel
Common graphs for frequency distributions
• Pie chart
• Line chart (frequency polygon)
• Bar chart
• Histogram
Other common graphs
• Time series
• Statistical map
Common distortions
• False perspective
– e.g., tilting a pie chart
• Shortening an axis; e.g.,
– not starting the vertical at 0
– breaking the vertical
– squishing the horizontal
• Reasons
– Add visual interest
– Make small differences look big
– Or make big differences look small
Shapes of distributions
• Symmetric
• Skewed
– Positively skewed
– Negatively skewed
• Modal
– unimodal
– bimodal
– multimodal
Pie chart
• Rare in research
• Common in media
• Hard to compare wedges (different orientations)
• Can’t show order
– Restrict to nominal variables
[Figure: pie chart, "Majors in Soc 549": Sociology 61%, Criminology 35%, Psychology 4%]
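The wedge sizes in a pie chart are relative frequencies: each category's count divided by the total. A minimal sketch in Python, assuming hypothetical counts of 14, 8, and 1 students. The slides give only the percentages; these counts are an assumption chosen because they round to the 61%, 35%, and 4% shown.

```python
def relative_frequencies(counts):
    """Return each category's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}


# Hypothetical counts (not stated in the lecture); they total 23 students.
majors = {"Sociology": 14, "Criminology": 8, "Psychology": 1}
percentages = relative_frequencies(majors)
print(percentages)
```

Because each share is rounded separately, the percentages will not always sum to exactly 100.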
Perspective distortion
• Add a meaningless 3rd dimension
• Tilt pie away
– Edge adds to front
– Perspective shrinks back
– Comparisons even harder
[Figure: the same pie chart tilted in 3-D: Criminology 35%, Psychology 4%, Sociology 61%]
Pie Charts in politics
• Federal budget, from the website of the War Resisters’ League
• Redrawn
[Figure: pie chart of the federal budget: Current Military 26%, Past Military 20%, Human Resources 32%, General Government 16%, Physical Resources 6%]
Bar chart (column chart)
• In research, more common than pie
• Can show order
– Appropriate for ordinal and interval
– (as well as nominal)
• Easy to compare vertical distances
[Figure: bar chart, "Majors in Soc 549"; categories Sociology, Criminology, Psychology; vertical axis 0 to 16]
Axis distortion
• Start vertical above zero
– Exaggerates all differences
• Similar distortion:
– Break vertical axis
[Figure: bar chart, "Majors in Soc 549"; categories Sociology, Criminology, Psychology; vertical axis running from 7 to 15]
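To see why a raised baseline exaggerates differences, compare the ratio of two bar heights measured from zero with the ratio of the heights actually drawn above the baseline. A rough sketch, assuming hypothetical counts of 14 and 8 (the slides do not state the counts) and a baseline of 7, the lowest tick on the distorted axis:

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of two bars when the axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical counts for two majors (an assumption, not from the lecture).
honest = apparent_ratio(14, 8)         # axis starts at zero: 14 / 8 = 1.75
distorted = apparent_ratio(14, 8, 7)   # axis starts at 7: 7 / 1 = 7.0
print(honest, distorted)
```

With these numbers the taller bar is 1.75 times the shorter one in the data, but it is drawn 7 times as tall.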
Perspective distortion
• Add meaningless 3rd dimension
– Reduces differences (caps same size)
[Figure: 3-D bar chart; categories Psychology, Criminology, Sociology; vertical axis 0 to 14]
Perspective distortion (continued)
• Add 3rd dimension and overlap
• Exaggerates differences
– Hides side of smaller bars
– Also hides part of top
• Rotation would make it worse
[Figure: 3-D bar chart with overlapping bars; categories Psychology, Criminology, Sociology; vertical axis 0 to 14]
Line chart (frequency polygon)
• Common in research
• Can show order
– Appropriate for ordinal and interval variables
[Figure: line chart; categories Sociology, Criminology, Psychology; vertical axis 0 to 16]
Axis distortions
• Start vertical above zero
– Or break vertical
[Figure: line chart; categories Sociology, Criminology, Psychology; vertical axis running from 7 to 15]
Perspective distortion
• Add meaningless 3rd dimension
• Tilt horizontal
– Exaggerates trend
[Figure: 3-D line chart with tilted horizontal axis; categories Sociology, Criminology, Psychology; vertical axis 0 to 14]
Bar vs. line: similarities
• Bar and line charts almost equivalent
– Start with a bar chart
• Connect tops
• Remove bottoms
• You get a line chart!
[Figure: "Majors in Soc 549" drawn as a bar chart and as a line chart; categories Sociology, Criminology, Psychology; vertical axes 0 to 16]
Bar vs. line: Differences
• Line suggests trend more strongly
– Helpful with ordinal or interval variables
– Misleading with nominal
[Figure: two line charts with vertical axes 0 to 16: one by major (Sociology, Criminology, Psychology), one by class year (Sophomore, Junior, Senior)]
Bar vs. line: Differences
• Line eases comparison of groups
[Figure: two charts comparing the series Social statistics and Sociology of Sport across the categories Sociology, Criminology, Psychology, Physical therapy, Communications, Biology]
Histograms
• Like bar chart, except
– Variable typically continuous
– Bars touch
• usually
– Horizontal can represent equal class intervals (“bins”)
• Bin shown by center value (e.g. 35.0)
• Or by ends of class interval (e.g. 33.75-36.25)
[Figure: histogram, "Starting salaries for BAs in sociology, 2000-2001"; horizontal axis: starting salary in thousands, bins centered at 22.5 to 47.5 in steps of 2.5; Mean = 28.7, Std. Dev = 4.31, N = 96. Source: National Association of Colleges and Employers survey of college placement offices]
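The two ways of labeling a bin are interchangeable: a bin centered at 35.0 with width 2.5 runs from 33.75 to 36.25. The sketch below shows that conversion and a simple equal-width binning. The function names and the sample salaries are hypothetical, and the rule that each bin includes its lower end but not its upper end is an assumption; software packages differ on this.

```python
def class_interval(center, width):
    """Return the (lower, upper) ends of a bin given its center and width."""
    half = width / 2
    return (center - half, center + half)


def histogram_counts(values, first_center, width, n_bins):
    """Count the values falling in each of n_bins equal-width bins.

    Each bin includes its lower end and excludes its upper end.
    Values outside the range covered by the bins are ignored.
    """
    lowest = first_center - width / 2
    counts = [0] * n_bins
    for v in values:
        index = int((v - lowest) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical salaries in thousands (not the survey data).
salaries = [27.0, 28.5, 29.9, 35.0, 46.0]
# 11 bins centered at 22.5, 25.0, ..., 47.5, as on the slide's horizontal axis.
counts = histogram_counts(salaries, first_center=22.5, width=2.5, n_bins=11)
print(class_interval(35.0, 2.5))
print(counts)
```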
Summary: Graphical display of distributions
Chart types: Pie, Bar, Line, Histogram
• Nominal: pie √, bar √, line (book disapproves)
• Ordinal: pie (book approves), bar √, line √
• Interval: √ if continuous
Shape of distributions: Positive or right skew
• Characteristics:
– Peak on left
– Long right tail
• Stretched (skewed) to the right
– A few large values
• Common cause
– Floor but no ceiling
[Figure: histogram, "Starting salaries for BAs in sociology, 2000-2001", repeated from the Histograms slide]
Negative or left skew
• Characteristics mirror positive skew:
– Peak on right
– Long left tail
• Stretched (skewed) to the left
– A few small values
• Common cause
– Ceiling but no floor
[Figure: histogram, "Assignment 1 scores, sociology 549, winter 2001"; horizontal axis: Assignment 1 scores, 35.0 to 100.0 in steps of 5; Mean = 75.4, Std. Dev = 15.79, N = 101]
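The direction of skew can also be checked numerically. One common summary, which the lecture itself does not use, is the moment-based skewness coefficient: the average cubed z-score. It is positive when the long tail is on the right and negative when it is on the left. A minimal sketch with made-up data:

```python
from statistics import fmean, pstdev


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = fmean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return fmean([((v - mean) / sd) ** 3 for v in values])


# Hypothetical data: a floor near 1 and a single large value in the right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]
# Mirror image of the same data: a ceiling near 10 and a long left tail.
left_skewed = [11 - v for v in right_skewed]

print(skewness(right_skewed))  # positive
print(skewness(left_skewed))   # negative
```

The two results have the same magnitude and opposite signs, matching the slides' point that negative skew mirrors positive skew.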
Symmetry
• Symmetry, no skew
– Two tails, or no tails
• Important example:
– The normal curve
[Figure: histogram of height of adult males (inches); horizontal axis 58.00 to 82.00; vertical axis: frequency, 0 to 200]
Dummy variables
• Describe the shape of this distribution.
[Figure: bar chart, "Sex distribution, Soc 549, winter 2003"; horizontal axis: sex dummy (1=female), values 0 and 1; vertical axis: number of students, 0 to 30]
Unimodal distributions
• Mode
– peak
– most common value
• Unimodal
– one peak
– e.g., starting salaries
• mode around $27K
• Interpretation
– the most common salaries are in the high $20s
[Figure: histogram, "Starting salaries for BAs in sociology, 2000-2001", repeated from the Histograms slide]
Bimodal distributions
• Bimodal
– two modes
– e.g., # children
• modes at 0 and 2
• Interpretation?
[Figure: bar chart of number of children; horizontal axis: 0 to 7 and "eight or more"; vertical axis: count, 0 to 500]
Multimodal distributions
• Multimodal
– more than 2 modes
– e.g., hours worked by OSU sociology students
• modes at 0, 20, 40
[Figure: distribution of hours worked, annotated with the (primary) mode and the secondary modes]
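Counting modes amounts to counting peaks in the frequency distribution. The sketch below treats a mode as a count that is higher than both of its neighbors. This is a simplification: it ignores ties and plateaus, and small bumps in noisy data would be counted as modes. The function names and the counts are hypothetical, chosen only to mimic the shapes described on the slides.

```python
def find_modes(counts):
    """Return the positions of local peaks: counts higher than both neighbors."""
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else float("-inf")
        right = counts[i + 1] if i < len(counts) - 1 else float("-inf")
        if c > left and c > right:
            peaks.append(i)
    return peaks


def describe_shape(counts):
    """Label a distribution by its number of local peaks."""
    n = len(find_modes(counts))
    if n == 1:
        return "unimodal"
    if n == 2:
        return "bimodal"
    if n > 2:
        return "multimodal"
    return "no clear mode"


# Hypothetical counts for 0, 1, 2, ..., 7, and "eight or more" children.
children = [450, 250, 400, 220, 120, 60, 30, 15, 10]
# Hypothetical counts for 0, 10, 20, 30, 40, 50 hours worked per week.
hours = [30, 10, 25, 12, 40, 5]

print(find_modes(children), describe_shape(children))
print(find_modes(hours), describe_shape(hours))
```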
Review of shape
• Shapes
– Symmetric
– Skewed
• Positive (right)
• Negative (left)
– Unimodal, bimodal, multimodal
Time series: don’t show distributions, show change over time
[Figure: time series, "BAs in social science and history" (National Center for Education Statistics); vertical axis: % women, 0% to 50%; horizontal axis: 1970 to 1995]
Axis distortion: start (or break) vertical above zero
[Figure: time series, "BAs in social science and history"; vertical axis: % women, running from 30% to 46%; horizontal axis: 1970 to 1995]
Axis distortion: squeeze vertical or stretch horizontal
[Figure: the same time series with the vertical axis squeezed; vertical axis: % women, 0% to 50%; horizontal axis: 1970 to 1995]
Axis distortion: squeeze horizontal or stretch vertical
[Figure: the same time series with the horizontal axis squeezed; vertical axis: % women, 0% to 50%; horizontal axis: 1970 to 1990]
Axis distortion in business
• NASDAQ stock index, reported by Yahoo!
• Redrawn
[Figure: NASDAQ stock index, 6-Jan-02 to 6-Jan-03, redrawn with the vertical axis running from 0 to 2500]
Graphical distortion: Summary
• Axis distortion
– Squeeze one axis
• Honest aspect ratio is 3:2 (Tufte)
– Start or break vertical axis above zero
• Perspective distortion
– Add disproportionate areas in a meaningless 3rd dimension
– Use blocking & tilting
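Both kinds of axis distortion change how steep a trend looks on the page. A rough way to quantify this is the angle of the line as drawn, which depends on how much of each axis range the line covers and on the physical width and height of the plot. The sketch below is illustrative only: the function name and the example numbers are assumptions, not values from the lecture.

```python
import math


def apparent_angle(rise_fraction, run_fraction, width, height):
    """Angle in degrees of a straight trend line as drawn on the page.

    rise_fraction: share of the vertical axis range the line climbs (0 to 1)
    run_fraction:  share of the horizontal axis range the line spans (0 to 1)
    width, height: physical size of the plotting area, in any common unit
    """
    return math.degrees(math.atan2(rise_fraction * height, run_fraction * width))


# Hypothetical trend that climbs 40% of a zero-based vertical axis
# across the full horizontal axis.
honest = apparent_angle(0.4, 1.0, width=3, height=2)     # 3:2 plot
squeezed = apparent_angle(0.4, 1.0, width=1, height=2)   # horizontal squeezed
truncated = apparent_angle(1.0, 1.0, width=3, height=2)  # axis cut to fit the data
print(round(honest, 1), round(squeezed, 1), round(truncated, 1))
```

The data are identical in all three cases; only the drawn angle changes.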
Graphics: Good advice
• Keep it simple
– Don’t stretch axes
– Don’t start or break axes above zero
– Don’t use 3-D
• If you have to use 3D, avoid abuses
– With just a few numbers, consider a table instead of a graph
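As an illustration of the last point, the three percentages from the "Majors in Soc 549" pie chart fit comfortably in a small table. A minimal sketch of a plain-text table formatter (the function name and layout are assumptions, not part of the lecture):

```python
def format_table(rows, headers):
    """Render rows as a plain-text table with left-aligned, padded columns."""
    table = [list(headers)] + [[str(cell) for cell in row] for row in rows]
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    ]
    return "\n".join(lines)


# Percentages as given on the pie chart slide.
majors = [("Sociology", "61%"), ("Criminology", "35%"), ("Psychology", "4%")]
text = format_table(majors, ["Major", "Share"])
print(text)
```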
Graphics: Evil advice
• Use every trick (3D, distorted axes)
– Maximize differences that serve your purpose
– Minimize differences that work against you