The Structure of a Social Science Collaboration

The Structure of a Social Science Collaboration Network:
Disciplinary Cohesion from 1963 to 1999
James Moody
The Ohio State University
"If we ever get to the point of charting a whole city or a whole
nation, we would have … a picture of a vast solar system of
intangible structures, powerfully influencing conduct, as
gravitation does in space. Such an invisible structure underlies
society and has its influence in determining the conduct of
society as a whole."
J.L. Moreno, New York Times, April 13, 1933
"Science, carved up into a host of detailed studies that have no
link with one another, no longer forms a solid whole."
Durkheim, 1933
[Figure. Labeled regions: Stratification, Social Welfare, Organizations, Historical Sociology, Crime, Gender, Health]
Large-Scale Social Network Models
Three large-scale network models:
1) Small-World Networks (Watts, 1999)
2) Scale-Free Networks (Barabási & Albert, 1999)
3) Structurally Cohesive Networks (White & Harary, 2001)
Milgram’s Small World Finding:
Distance to target person, by sending group.
Small-World Networks
Small-world (SW) graphs: C is large and L is small.
•High relative probability that a node's contacts are connected to each other (large clustering, C).
•Small relative average distance between nodes (small path length, L).
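As a rough illustration of the two quantities, the Python sketch below computes an average clustering coefficient C and an average path length L for a small made-up graph. The toy graph, the function names, and the choice to average local clustering over nodes with at least two contacts are assumptions for illustration, not the measures used in the talk.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Average share of each node's contact pairs that are themselves tied."""
    scores = []
    for node, nbrs in adj.items():
        if len(nbrs) < 2:
            continue  # undefined for nodes with fewer than two contacts
        pairs = list(combinations(nbrs, 2))
        closed = sum(1 for a, b in pairs if b in adj[a])
        scores.append(closed / len(pairs))
    return sum(scores) / len(scores)


def average_path_length(adj):
    """Mean shortest-path distance over all reachable ordered pairs (BFS)."""
    total, count = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count


# Hypothetical toy graph: two triangles joined by a single bridge tie (c-d).
toy = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
C = clustering_coefficient(toy)
L = average_path_length(toy)
print(f"C = {C:.3f}, L = {L:.3f}")
```

A small-world reading would compare both values against a random graph of the same size and density rather than interpret them in isolation.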
Small-World Networks
In a highly clustered, ordered network, a single random connection will create a shortcut that lowers L dramatically.
Watts demonstrates that small-world properties can occur in graphs with a surprisingly small number of shortcuts.
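The shortcut effect can be sketched on a ring lattice, a common stand-in for a clustered, ordered network. The code below is a minimal sketch under stated assumptions: the lattice size (100 nodes, 2 neighbours per side) and the fixed shortcut between nodes 0 and 50 are arbitrary choices, and a single deterministic shortcut replaces the random rewiring in Watts's model.

```python
from collections import deque


def ring_lattice(n, k):
    """Tie each node to its k nearest neighbours on each side of a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_path_length(adj):
    """Mean shortest-path distance over all reachable ordered pairs (BFS)."""
    total, count = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count


lattice = ring_lattice(100, 2)
L_ordered = average_path_length(lattice)

# Add one shortcut across the ring (assumed endpoints, chosen for illustration).
lattice[0].add(50)
lattice[50].add(0)
L_one_shortcut = average_path_length(lattice)

print(f"L without shortcut: {L_ordered:.2f}")
print(f"L with one shortcut: {L_one_shortcut:.2f}")
```

Adding further shortcuts in a loop and recording L after each one gives a simple way to see how quickly the average distance falls.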
Small-World Networks
Locally clustered graphs are a good model for coauthorship when there are many authors on a paper, since each multi-author paper ties all of its authors to one another.
[Figure: authors linked through Papers 1-5]
Newman (2001) finds that coauthorship among natural scientists fits a small-world model.
Scale-Free Networks
Many large networks are characterized by a highly skewed distribution of the number of partners (degree):
$p(k) \sim k^{-\gamma}$

Scale-Free Networks
• Scale-free networks appear when new nodes enter the network by attaching to already popular nodes.
• Scale-free networks are common (WWW, sexual networks, email).
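The attachment process in the first bullet can be sketched as a simulation in which each entering node forms one tie and picks its partner with probability proportional to the partner's current degree. This is a minimal sketch in the spirit of preferential attachment; the function name, the one-tie-per-newcomer rule, the network size, and the seed are all assumptions for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a network where newcomers attach to already popular nodes."""
    rng = random.Random(seed)
    # Every tie adds both of its endpoints to this list, so a node appears
    # once per tie it holds. A uniform draw from the list therefore selects
    # a node with probability proportional to its degree.
    endpoints = [0, 1]
    degree = Counter({0: 1, 1: 1})
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        endpoints.extend([new, target])
        degree[new] += 1
        degree[target] += 1
    return degree


degree = preferential_attachment(5000)
degree_counts = Counter(degree.values())  # k -> number of nodes with degree k

for k in sorted(degree_counts)[:5]:
    print(k, degree_counts[k])
print("largest degree:", max(degree.values()))
```

Plotting `degree_counts` on log-log axes is the usual visual check for the skewed, heavy-tailed shape described above.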
Scale-Free Networks
Colorado Springs High-Risk (Sexual contact only)
•Network is power-law distributed, with an exponent of -1.3
Scale-Free Networks
Hubs make the network fragile to node disruption.
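A toy hub-and-spoke graph makes the fragility concrete: removing the hub shatters the network, while removing a peripheral node barely changes it. The graph below (one hub, 10 spokes, 2 leaves per spoke) and the helper name are hypothetical, chosen only to illustrate the point.

```python
def largest_component(adj, removed=frozenset()):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)  # removed nodes are never visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


# Hypothetical network: one hub tied to 10 spokes, each spoke tied to 2 leaves.
adj = {"hub": set()}
for s in range(10):
    spoke = f"spoke{s}"
    adj[spoke] = {"hub"}
    adj["hub"].add(spoke)
    for leaf_id in range(2):
        leaf = f"spoke{s}_leaf{leaf_id}"
        adj[leaf] = {spoke}
        adj[spoke].add(leaf)

intact = largest_component(adj)
without_hub = largest_component(adj, removed={"hub"})
without_leaf = largest_component(adj, removed={"spoke0_leaf0"})
print(intact, without_hub, without_leaf)
```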
Structurally Cohesive Networks
•Networks are structurally cohesive if they remain connected even when nodes are removed.
[Figure: example graphs with node connectivity 0, 1, 2, and 3]
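Node connectivity can be read as the smallest number of nodes whose removal disconnects the graph. The brute-force sketch below applies that definition to four small made-up graphs with connectivity 0 through 3. It is only practical for tiny graphs, the function names are assumptions, and returning n - 1 for a complete graph is a convention adopted here.

```python
from itertools import combinations


def is_connected(adj, removed=frozenset()):
    """True if the nodes left after deleting `removed` form one component."""
    remaining = [n for n in adj if n not in removed]
    if len(remaining) <= 1:
        return True
    seen = {remaining[0]}
    stack = [remaining[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(remaining)


def node_connectivity(adj):
    """Smallest number of nodes whose removal disconnects the graph."""
    n = len(adj)
    if not is_connected(adj):
        return 0
    for k in range(1, n - 1):
        for cut in combinations(adj, k):
            if not is_connected(adj, set(cut)):
                return k
    return n - 1  # no cut set exists: the graph is complete


two_pairs = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
complete = {n: set("abcd") - {n} for n in "abcd"}

for name, graph in [("two pairs", two_pairs), ("path", path),
                    ("4-cycle", cycle), ("complete K4", complete)]:
    print(name, node_connectivity(graph))
```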
Structurally Cohesive Networks
•Identified in wide-ranging contexts:
  •High school friendship networks
  •Biotechnology inter-organizational networks
  •Mexican political networks
•Structurally cohesive networks are conducive to equality and diffusion, since no node can control the flow of goods through the network.
•Empirical trace of organic solidarity
Coauthorship in the Social Sciences
Data
•Data are from Sociological Abstracts
•281,163 papers published between 1963 and 1999
•128,151 people who have coauthored
•Data re-coded to correct for middle initials and similar names
•The coauthorship network is created by linking any two people who publish a paper together.
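The linking rule in the last bullet can be sketched as follows: every pair of authors on the same paper receives an edge, and sole authors enter the graph as isolates. The paper list and author names are hypothetical, and the sketch skips the name-cleaning step mentioned above.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: one author list per paper.
papers = [
    ["Adams", "Baker", "Cole"],
    ["Baker", "Diaz"],
    ["Evans"],
    ["Cole", "Diaz"],
]


def build_coauthorship_graph(papers):
    """Link any two people who appear on the same paper."""
    adj = defaultdict(set)
    for authors in papers:
        unique = sorted(set(authors))
        for author in unique:
            adj[author]  # register the author even if they have no coauthors
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


graph = build_coauthorship_graph(papers)
n_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
print(len(graph), "authors,", n_edges, "edges")
```

Because a paper with m authors contributes m(m-1)/2 ties at once, papers with long author lists generate dense local clusters in the resulting graph.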
Coauthorship Trends in the Social Sciences
[Figure: Distribution of Coauthorship Across Journals, Sociological Abstracts, 1963-1999. y-axis: proportion of papers with >1 author (0 to 1); x-axis: coauthorship rank (0 to 1100). Labeled journals: Child Development, Soc. Forces, J. Health & Soc. Beh., ASR, J. Am. Statistical A., AJS, Acta Politica, Soc. Theory, Signs, J. Soc. History]
[Figure: Odds of Coauthorship by Substantive Area (axis 0 to 2.5)]
[Figure: Coauthorship Trends in Sociology, Sociological Abstracts and ASR. y-axis: proportion of papers with >1 author (0 to 0.75); x-axis: year (1930 to 2000). Series: Sociological Abstracts, ASR]
Publication Rates
The two key constraints on a collaboration network are the distribution of
the number of authors on a paper and the number of papers authors publish.
[Figure: number of authors (y-axis, log scale) by number of publications (x-axis, log scale)]
[Figure: number of papers (y-axis, log scale) by number of authors (x-axis, log scale)]
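Both constraining distributions can be tabulated directly from a list of papers. The sketch below counts authors per paper and papers per author on a hypothetical paper list; the data and variable names are assumptions for illustration.

```python
from collections import Counter

# Hypothetical input: one author list per paper.
papers = [
    ["Adams", "Baker", "Cole"],
    ["Baker", "Diaz"],
    ["Evans"],
    ["Cole", "Diaz"],
]

# Distribution 1: how many papers have 1, 2, 3, ... authors.
authors_per_paper = Counter(len(set(authors)) for authors in papers)

# Distribution 2: how many authors have 1, 2, 3, ... publications.
papers_per_author = Counter(a for authors in papers for a in set(authors))
publication_counts = Counter(papers_per_author.values())

print("authors per paper:", dict(authors_per_paper))
print("publications per author:", dict(publication_counts))
```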
The Social Science Collaboration Graph
Constructed by assigning an edge between any pair of people who coauthored a paper together.
[Figure: g = 745]
[Figure: Example Paths: 3 steps from N. B. Tuma. Node size = ln(degree); g = 745]
Degree
[Figure: Distribution of Number of Coauthors (Degree)]
p(k )  e  k    (ln(k ))
[Axes: number of coauthors (log, 1 to 100) by number of authors (log, 1 to 100,000)]
Does not conform to the scale-free model.
Centrality
A better indicator of location in the network is closeness centrality.
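Closeness centrality can be sketched as the inverse of a node's average shortest-path distance to the nodes it can reach. The example below uses a hypothetical five-node path, where the three interior nodes all have the same degree but differ in closeness. The normalization, (reachable nodes - 1) divided by total distance, is one common convention and is assumed here; the slides do not say which variant was used.

```python
from collections import deque


def closeness_centrality(adj):
    """(number of reachable others) / (sum of shortest-path distances to them)."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical path graph a-b-c-d-e: b, c, and d all have degree 2.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
        "d": {"c", "e"}, "e": {"d"}}
closeness = closeness_centrality(path)
most_central = max(closeness, key=closeness.get)
print(most_central, round(closeness[most_central], 3))
```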
Top 10 Authors, by Centrality:
Ronald Kessler (2620)
James S. House (2060)
Duane F. Alwin (1913)
Kenneth C. Land (1829)
Philip J. Leaf (1651)
Peter H. Rossi (1631)
Steven S. Martin (1577)
David G. Ostrow (1492)
Charles W. Mueller (1486)
Edward O. Laumann (1465)
Component Structure
[Figure: percent of the population in a component of size g. Labeled sizes: g=2, g=3, g=68,285. Slice values: 19%, 9%, 54%, 5%, 3%, 10%]
Figure 7. Selected components from the Sociology Coauthorship Network
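The quantity in the chart, the share of people who sit in a component of a given size, can be computed by finding connected components and weighting each size by the number of people in it. The toy graph and names below are hypothetical.

```python
from collections import Counter


def component_sizes(adj):
    """Sizes of all connected components, found by depth-first search."""
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        sizes.append(size)
    return sizes


# Hypothetical graph: a dyad, a three-person chain, and a five-person component.
edges = [("a", "b"), ("c", "d"), ("d", "e"),
         ("f", "g"), ("g", "h"), ("h", "i"), ("i", "f"), ("f", "j")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

sizes = component_sizes(adj)
total_people = sum(sizes)
# Share of the population living in a component of each size g.
share_by_size = {g: g * count / total_people
                 for g, count in Counter(sizes).items()}
print(sorted(share_by_size.items()))
```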
Small-World Structure?
              Observed   Random
Clustering    0.194      0.206
Distance      9.81       7.57
The Sociology network does not have a small-world structure: clustering is no higher than in the random graph, and average distance is longer.
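The comparison behind this conclusion is a pair of ratios: a small-world graph should show clustering well above the random baseline with distance close to it. The sketch below works through the arithmetic with the four values reported on the slide; the variable names are assumptions.

```python
# Values as reported on the slide.
observed = {"clustering": 0.194, "distance": 9.81}
random_baseline = {"clustering": 0.206, "distance": 7.57}

clustering_ratio = observed["clustering"] / random_baseline["clustering"]
distance_ratio = observed["distance"] / random_baseline["distance"]

# Small-world graphs need clustering far above random; here it is below.
clustering_exceeds_random = clustering_ratio > 1

print(f"clustering ratio: {clustering_ratio:.2f}")
print(f"distance ratio:   {distance_ratio:.2f}")
print("clustering exceeds random:", clustering_exceeds_random)
```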
Component Structure
[Figure: Largest Bicomponent, g = 29,462. Legend values: 0.04, 0.27, 0.50, 0.73, 0.96]
Internal Structure of the largest bicomponent
                                      Group 1   Group 2
Size                                  3667      987
In-group / out-group ties             3.24      2.86
% male                                67        52
Years in discipline                   8.46      4.67
Number of co-authored publications    5.32      3.24
Internal Structure of the largest bicomponent
[Figure: size of nested k-connected sets (axis 0 to 40)]
5+-connected: 5,223
4-connected: 7,992
3-connected: 14,672
2-connected: 29,462
Component Structure
•Broad core-periphery structure
Structurally isolated: (68,923)
Unconnected: 59,866
Component: 38,823
Bicomponent: 29,462
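The four counts can be cross-checked against figures given elsewhere in the deck. The sketch below assumes the tiers are mutually exclusive (each person is counted once, in their most embedded tier), which the sums appear to support: the non-isolated tiers add up to the 128,151 people who have coauthored, and the two connected tiers add up to the g=68,285 component.

```python
# Counts from the core-periphery slide; tier labels follow the slide.
tiers = {
    "structurally isolated": 68_923,
    "unconnected": 59_866,
    "component": 38_823,
    "bicomponent": 29_462,
}

total = sum(tiers.values())
coauthored = total - tiers["structurally isolated"]
giant_component = tiers["component"] + tiers["bicomponent"]

print("total authors:", total)
print("ever coauthored:", coauthored)
print("in giant component:", giant_component)
for name, count in tiers.items():
    print(f"{name}: {count / total:.1%} of all authors")
```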
Network Core Position
Characteristics of authors by component embeddedness
                                   Total     Structural    Unconnected   Largest         Core
                                             isolate (0)   (1)           component (2)   bicomponent (3)
Percent male                       62%       69% (a)       62%           56%             59%
Years in the discipline            4.02      2.88          3.44          3.98            7.97
Avg. number of authors per paper   2.26      1.0           2.57          2.83            3.78
Number of publications             2.17      1.51          1.76          2.02            4.78
Number of co-authors               2.05      0.00          1.95          2.48            6.49
Year of first publication          1985.85   1984.5        1986.6        1986.6          1986.18
N                                  197,074   68,923        59,866        38,823          29,462
Network Core Position
•Distinct subfield effects on whether an author has ever coauthored
Unlikely to coauthor:
History & Theory
Sociology of Knowledge
Radical / Marxist Sociology
Feminist / Gender Studies
Likely to coauthor:
Social Psychology
Family
Health & Medicine
Social Problems
Social Welfare
Network Core Position
•Weak subfield effects for network embeddedness
•A large number of coauthors increases embeddedness
•A large number of people on any given paper decreases embeddedness
[Figure: Graph Connectivity, Cumulative 1963-1999. y-axis: percent (0 to 0.6); x-axis: years (1963 to date), 1965 to 2000. Series: % in giant component; % of connected in bicomponent]
Figure 10. Growth of Sociology Coauthorship Networks, 5-year moving window
[Figure: y-axis: number of people (0 to 70,000); x-axis: ending year (1965 to 2005)]
[Figure: Network Connectivity, 5-year moving window. Left y-axis: percent (0 to 0.4); right y-axis: connectivity (2 to 2.25); x-axis: year (1975 to 2000). Series: Bicomponent, Component, Connectivity]