Intellectual diversity of scientific disciplines and research teams
Staša Milojević
School of Informatics and Computing, Indiana University Bloomington
Exponential rise in scientific literature
•  Doubling time 10-20 years
[Figure: Astronomy. Number of articles published annually (0 to 8000) vs. publication year (1880 to 2000), with annotations marking 1 paper per day and 20 papers per day.]
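As a rough illustration of what a fixed doubling time implies, the sketch below converts a doubling time into a yearly growth rate and into the time needed for a given rise in volume. It assumes strictly exponential growth; the function names are hypothetical and not taken from the talk.

```python
import math

def annual_growth_rate(doubling_time_years):
    """Fractional yearly growth implied by a fixed doubling time."""
    return 2 ** (1 / doubling_time_years) - 1

def years_to_grow(factor, doubling_time_years):
    """Years needed for the volume to grow by `factor` (assumes pure exponential growth)."""
    return doubling_time_years * math.log2(factor)

for t_double in (10, 20):
    rate = 100 * annual_growth_rate(t_double)
    print(f"doubling time {t_double} yr -> about {rate:.1f}% growth per year")
```

Under this assumption, doubling times of 10 and 20 years correspond to roughly 7.2% and 3.5% growth per year, and a 20-fold rise in volume takes log2(20), or about 4.3, doublings.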
Is publication volume a good proxy for the growth of science?
[Figure: 1960 textbook vs. 2015 textbook?]
Measure concepts instead of publications!
Cognitive concepts from document titles
•  Full text not normally available
•  Abstracts: availability incomplete
•  Keywords: very restricted
•  Titles function as “attention triggers” (Bazerman 1985, 1989)
•  Titles have undergone a change during the 20th century, becoming
    •  More informative
    •  More specific
    •  Containing a larger number of words that indicate article content
•  Any given article title captures the concepts only crudely and incompletely
=> Use a very large number of articles!
Measuring cognitive content from titles
•  Idea: count number of unique phrases in titles
•  Must normalize for unequal volume of literature
=> Cognitive content: number of unique phrases in 10,000 title words (~1000 articles)
[Figure A: Title length (words, 0 to 14) vs. publication year (1880 to 2010).]
[Figure: Intensity (# of occurrences) vs. cognitive extent (number of different unique concepts) for Literature X and Literature Y, which have the same volume of literature.]
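The normalization described above can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes each title has already been reduced to a word count and a list of phrases, closes a batch as soon as the running word count reaches the threshold, and discards a trailing incomplete batch. The name `cognitive_extent` and the input layout are hypothetical.

```python
def cognitive_extent(titles, words_per_batch=10_000):
    """Count unique phrases per batch of roughly `words_per_batch` title words.

    `titles` is an iterable of (n_words, phrases) pairs in publication order.
    Returns one unique-phrase count per completed batch.
    """
    extents = []
    seen = set()
    words = 0
    for n_words, phrases in titles:
        seen.update(phrases)
        words += n_words
        if words >= words_per_batch:
            # Batch is full: record its number of distinct phrases and start over.
            extents.append(len(seen))
            seen, words = set(), 0
    return extents  # an unfinished final batch is deliberately dropped

# Toy input with a tiny batch size so the behaviour is easy to follow.
toy = [(5, ["A", "B"]), (5, ["B", "C"]), (6, ["D"]), (4, ["D", "E"]), (3, ["F"])]
print(cognitive_extent(toy, words_per_batch=10))  # [3, 2]
```

Because every batch holds about the same number of title words, two literatures of equal volume can be compared directly: the one that repeats the same phrases scores lower than the one that spreads over more distinct phrases.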
Concepts from titles
•  Automatically identify phrases
•  Use general words to separate phrases
•  General words: prepositions, articles + non-specific words, such as:
study, analysis, result, determination, comparison, discovery
•  Examples of titles with phrases capitalized:
    •  HALOS and VOIDS in MULTIFRACTAL+MODEL of COSMIC+STRUCTURE
    •  COLLISION+STRENGTHS for ELECTRON+IMPACT+EXCITATION of FINE+STRUCTURE+LEVELS in FE+XIII
•  Old words in new context:
    •  MASS, ENERGY -> MASS-ENERGY EQUIVALENCE
    •  considered a new phrase (introduced in relativity theory)
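The separation rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the `GENERAL_WORDS` list here is a small hypothetical stand-in (the full list used in the study is not given in the slides), and tokenization simply keeps runs of letters and digits.

```python
import re

# Hypothetical, deliberately short list of "general" words.
GENERAL_WORDS = {
    "a", "an", "the", "and", "or", "of", "in", "on", "for", "to",
    "with", "by", "from", "at",
    "study", "analysis", "result", "determination", "comparison", "discovery",
}

def extract_phrases(title):
    """Split a title into phrases, using general words as separators."""
    words = re.findall(r"[A-Za-z0-9]+", title.lower())
    phrases, current = [], []
    for word in words:
        if word in GENERAL_WORDS:
            # A general word ends the phrase collected so far.
            if current:
                phrases.append("+".join(current).upper())
                current = []
        else:
            current.append(word)
    if current:
        phrases.append("+".join(current).upper())
    return phrases

print(extract_phrases("Halos and voids in multifractal model of cosmic structure"))
# ['HALOS', 'VOIDS', 'MULTIFRACTAL+MODEL', 'COSMIC+STRUCTURE']
```

With this rule, a title containing "mass-energy equivalence" yields the single phrase MASS+ENERGY+EQUIVALENCE, which is counted separately from MASS or ENERGY on their own.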
Data
•  Research articles for:
•  Astronomy (160,000 articles since 1892)
•  Physics (440,000 articles since 1895)
•  Biomedicine (20 million articles since 1946)
•  Sources:
•  Thomson Reuters Web of Science
•  NASA ADS
•  American Physical Society
•  PubMed
Evolution of the cognitive extent of astronomy
[Figure: Astronomy. Cognitive extent (2000 to 7000) vs. publication year (1880 to 2010).]
Dot = ca. 1000 articles
Blue line = 2-year average
Standard deviation from independent batches of articles: ~1.5%
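One plausible way to arrive at a scatter figure like the one quoted above is to measure the cognitive extent of several independent batches from the same period and take the relative standard deviation. The sketch below assumes exactly that; the helper name and the sample numbers are hypothetical and are not data from the talk.

```python
from statistics import mean, stdev

def relative_std(extents):
    """Sample standard deviation of batch extents, as a fraction of their mean."""
    return stdev(extents) / mean(extents)

# Made-up extents for three independent batches from the same period.
example = [100, 102, 98]
print(f"{100 * relative_std(example):.1f}%")  # 2.0%
```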
Evolution of the cognitive extent of astronomy
[Figure: Astronomy. Cognitive extent (2000 to 7000) vs. publication year (1880 to 2010), annotated with Phase II, Phase III, WWII, Sputnik, and "Astronomy -> astrophysics". Inset: number of articles published annually (0 to 8000) vs. publication year (1880 to 2000).]
Milojević, S. (in prep.)
Cognitive extent by journal
•  Cognitive extent should not be equated with quality!
[Figure: Cognitive extent (800 to 2400) vs. publication year (1880 to 2010) for MNRAS, ApJ, AJ, and A&A.]
Milojević, S. (in prep.)
Cognitive content by country
•  Country of lead author
•  Again, this is all normalized to the same publication volume
[Figure: Cognitive extent (4000 to 6500) vs. publication year (1960 to 2010) for USA, UK, France, Italy, Germany, Japan, Netherlands, Canada, and China.]
Milojević, S. (in prep.)
Science as a collaborative effort
•  Collaboration is one of the defining features of modern science
•  Most common way of studying collaboration: through coauthorship networks
•  Connect authors on a paper
•  Add to the network
Börner et al. 2004
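The two steps above (connect the authors on a paper, then add those links to the network) can be sketched with the standard library alone. This is a minimal illustration, assuming each paper is given as a list of author names; the function name and the weighted-edge representation are choices made for the example.

```python
from collections import Counter
from itertools import combinations

def build_coauthorship_network(papers):
    """Return edge weights: how many papers each pair of authors shares."""
    edges = Counter()
    for authors in papers:
        # Connect every pair of distinct authors on this paper.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges

papers = [["A", "B", "C"], ["A", "B"], ["D"]]
network = build_coauthorship_network(papers)
print(network[("A", "B")])  # 2
```

Single-author papers add no edges, so solo work is invisible in a network built this way.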
Teams as foundation for collaboration
•  Teams are becoming dominant in the production of knowledge (Wuchty, Jones, & Uzzi 2007)
•  Teams are growing in size (Bordons & Gomez 2000)
•  High-impact research is increasingly attributed to large teams (Wuchty, Jones, & Uzzi 2007; Börner et al. 2010)
•  Research that features novel ideas is attributed to large teams (Uzzi, Mukherjee, Stringer, & Jones 2013)
Change in the character of team size distribution
•  Poisson function used to be a good fit, but not any more!
[Figure: team size distribution in astronomy.]
Team formation essentially a Poisson process, which somehow transformed
Milojević, S. (2014). PNAS, 111(11), 3984
Functional description of different team types
Distribution of each team type, as determined in the modeling, can be described by a corresponding analytical function
Functional forms allow the empirical distribution to be analyzed without running a model simulation
[Figure: two log-log panels for 2006-10. Articles (1E-7 to 1) vs. article team size k (1 to 100). Standard core teams: Poisson (standard). Core +1 teams: Poisson w/extra member (Poisson+1, shifted). Extended teams: truncated power law. All teams: sum.]
Milojević, S. (2014). PNAS, 111(11), 3984
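To make the idea of a sum of three functional forms concrete, here is a minimal sketch of such a mixture. The exact parameterization used in the paper is not given on the slide, so everything here is an assumption: the shifted Poisson is taken as a Poisson in k - 1, the truncated power law as k^(-alpha) with a hard cutoff at a hypothetical `k_max`, and the three weights are free parameters.

```python
import math

def poisson_pmf(n, lam):
    """Standard Poisson probability of n events with mean lam (lam > 0)."""
    if n < 0:
        return 0.0
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def shifted_poisson_pmf(k, lam):
    """Poisson with one extra member: a Poisson in k - 1."""
    return poisson_pmf(k - 1, lam)

def truncated_power_law_pmf(k, alpha, k_max):
    """k**(-alpha) normalized over 1..k_max (hard cutoff, an assumption)."""
    if k < 1 or k > k_max:
        return 0.0
    norm = sum(j ** -alpha for j in range(1, k_max + 1))
    return k ** -alpha / norm

def team_size_mixture(k, weights, lam_core, lam_plus1, alpha, k_max):
    """Weighted sum of the three components; weights should add up to 1."""
    w_core, w_plus1, w_ext = weights
    return (w_core * poisson_pmf(k, lam_core)
            + w_plus1 * shifted_poisson_pmf(k, lam_plus1)
            + w_ext * truncated_power_law_pmf(k, alpha, k_max))

# Illustrative parameter values only; they are not fitted to any data.
for k in (1, 5, 50):
    print(k, team_size_mixture(k, (0.5, 0.3, 0.2), 2.0, 3.0, 2.5, 100))
```

Note that the plain Poisson component places a small amount of probability on k = 0, which is not a valid team size; a real fit would need to handle that, for example by truncating at zero.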
Cognitive extent covered by different types of teams
•  Large teams do not (yet) cover the entire field
[Figure: Astronomy. Cognitive extent (2000 to 7000) vs. publication year (1880 to 2010) for single authors, author pairs, core teams, extended teams, and large extended teams.]
Currently: core (3-9 authors), extended (10-20), large extended (>20)
Milojević, S. (in prep.)
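The team categories used in these plots can be written as a simple size-based rule. The sketch below uses the current boundaries quoted on the slide (core 3-9, extended 10-20, large extended above 20); the function name is hypothetical, and the boundaries may have differed for earlier periods.

```python
def classify_team(n_authors):
    """Map an article's author count to a team type (current boundaries)."""
    if n_authors < 1:
        raise ValueError("an article needs at least one author")
    if n_authors == 1:
        return "single author"
    if n_authors == 2:
        return "author pair"
    if n_authors <= 9:
        return "core team"
    if n_authors <= 20:
        return "extended team"
    return "large extended team"

print([classify_team(n) for n in (1, 2, 3, 9, 10, 20, 21)])
```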
Cognitive extent covered by different types of teams
•  Large teams do not (yet) cover the entire field
[Figure: Physics. Cognitive extent (3000 to 8000) vs. publication year (1880 to 2010) for single authors, author pairs, core teams, extended teams, and large extended teams.]
Milojević, S. (in prep.)
Cognitive extent covered by different types of teams
•  Single authors cover less than small teams
[Figure: Physics and Biomedicine panels. Cognitive extent vs. publication year for single authors, author pairs, core teams, extended teams, and large extended teams.]
Milojević, S. (in prep.)
Thank you!
Staša Milojević
School of Informatics and Computing
Indiana University Bloomington
smilojev@indiana.edu