The Network Structure of Sociology Production
James Moody
Ohio State University
Indiana University
December, 2005

Introduction

Outline:
• Big Picture: Networks, Structure, Action & Outcomes
• Guiding Questions & General Approach
• Examples: Hierarchy, Romance & the spread of STDs
• Networks & Science: Two Questions & 4 networks
• How do scientific fields evolve?
• Where do good ideas come from?
• Data Sources & Methods
• Results
• Where does sociology fit? Journal co-citation networks
• What do sociologists study? Topic networks
• Who produces sociology? Social science collaboration networks
• Discussion

Networks, Structure, Action & Outcomes

Guiding Questions:
Where does social structure come from?
How does social structure enable & constrain action & outcomes?

General Approach:
(1) Seek structure in patterns of association:
“To speak of social life is to speak of the association between people – their associating in work and in play, in love and in war, to trade or to worship, to help or to hinder. It is in the social relations men establish that their interests find expression and their desires become realized.”
(Peter M. Blau, Exchange and Power in Social Life, 1964)

General Approach:
(2) Focus on large-scale network structure:
"If we ever get to the point of charting a whole city or a whole nation, we would have … a picture of a vast solar system of intangible structures, powerfully influencing conduct, as gravitation does in space. Such an invisible structure underlies society and has its influence in determining the conduct of society as a whole."
(J.L. Moreno, New York Times, April 13, 1933)

General Approach:
(3) Link well-defined network structures to relevant social theory…
“The social structure [of the dyad] rests immediately on the one and on the other of the two, and the secession of either would destroy the whole. . . . As soon, however, as there is a sociation of three, a group continues to exist even in case one of the members drops out.”
(Simmel [1908] 1950:123)
This can then be operationalized directly as node connectivity.

General Approach:
(4) …in a manner that can explain truly emergent social properties.
“[Social facts] assume a shape, a tangible form peculiar to them and constitute a reality sui generis vastly distinct from the individual facts which manifest that reality”
(Durkheim, Rules of Sociological Method)

Examples: Hierarchy in High School
A Gallery of Friendship Networks
776 adolescents from a working-class, all-white, suburban school in the Midwest. (Source: Add Health)

Examples: Hierarchy in High School
A Gallery of Friendship Networks
678 adolescents from a working-class, all-white, rural school in the Midwest.
Across these settings (and many more) we can literally see the differences imposed by classic ‘Blau space’ features of youth communities. Race, grades, SES, etc. often shape the gross topography of school friendship networks.
(Source: Add Health)

Examples: Hierarchy in High School
[Figure: Distribution of Popularity, by size and city type. Panels arrayed by community type and size.]

If you examine all schools you find:
• All of the school networks have a rank-strata structure
• The structure remains constant even though nearly half of all relationships are new
• People’s position in the popularity distribution is fluid
What social process will explain a stable macro-structure in the face of dynamic relations?

Examples: Hierarchy in High School
Endogenous Building Blocks: A periodic table of social elements:

(0)    (1)    (2)    (3)    (4)    (5)    (6)
003    012    102    111D   201    210    300
              021D   111U   120D
              021U   030T   120U
              021C   030C   120C

Examples: Hierarchy in High School
Classic balance theory offers a set of simple local rules for relational change:
• A friend of a friend is a friend
• My enemy’s enemy is my friend.
[Figure: the triad table above, with each triad marked as Intransitive, Transitive, or Mixed.]

Examples: Hierarchy in High School
[Figure: transitions among the 16 triad types. Legend: vacuous transition; increases # transitive; decreases # intransitive; decreases # transitive; increases # intransitive; vacuous triad; intransitive triad; transitive triad.]
(Some transitions will both increase transitivity & decrease intransitivity. The effects are independent; they are colored here for net balance.)

Examples: Hierarchy in High School
ERGM Coefficient Distributions*
[Figure: coefficient distributions, grouped as Endogenous, Focal Orgs., and Dyadic Similarity/Distance.]
*Coefficients based on pseudo-likelihood approximations, here standardized so they fit well on the page…

Examples: Building Romantic Networks
[Figure: romantic network structures. Legend: Male, Female. Counts shown: 2, 12, 9, 63.]

Examples: Building Romantic Networks
What micro-structures are taboo in high-school romantic relations?

Examples: Building Romantic Networks
The 4-cycle prohibition fits the observed data.

Examples: Systemic Effects of Local Action in STD Cores
An STD Puzzle:
• Contact rates are low (most people have few partners)
• Dyadic transmission is difficult (compared, say, to the flu)
• People are infectious for short periods of time
  • Particularly for bacterial STDs, but even AIDS infectiousness peaks shortly after acquiring the disease
How does the disease manage to remain endemic?
• Activity heterogeneity is the common answer: a few active “stars” keep the disease endemic
  • But this doesn’t fit the empirical facts-on-the-ground in many cases.
  • What if many people make small changes, instead of few people making big changes?

Networks, Structure, Action & Outcomes
While my substantive work has ranged widely, I always focus on the intersection of individual action embedded in network structures over time. The long-term goal is to identify fundamental principles for either networks or action that can explain the wide variety of observed social structures with a small number of locally digestible and contextually relevant action rules.
My new work turns these tools to questions about the development of science.

Networks & Science: Two Questions & 4 networks

1) How do scientific fields evolve?
   a) Is there a coherent logic to the ebb and flow of topics studied?
   b) How does the success or failure of ideas depend on the social community in which they are embedded?
   c) (How) does the evidentiary basis of a field shape its logic of discovery?
The descriptive answer is given by mapping the field in network space.
The analytic answer will come by modeling the emergence, growth and decline of scientific subfields.

2) Where do good ideas come from?
   a) What is a good idea?
      Ideas that change a scientific field. Indexed by (a) citations and (b) the relevant topography of the networks within which the idea was originally embedded. Ideas are not inherently good; they are recognized as “good” by their effect on a field.
   b) How do disciplines produce new ideas?
      a) Intersection: Good ideas are produced by combining ideas of others in unique ways (Burt)
      b) Development: Good ideas arise naturally from either the progressive “error reduction” process of good normal science (Popper) or the accepted practices of a scientific community (Crane).
      c) Peer Influence & Recognition: Any idea is a good idea if others think so, and thinking so is influenced by the network (Gould).
      d) Resource competition: Search for prestige conditioned by organizational structure (Fuchs)
Will model this by examining how citations are affected by field dynamics (and vice versa).

Networks & Science: Two Questions & 4 networks
Theoretical approaches to scientific development

We are thus left with multiple action frames to guide our understanding:
• Truth: Ideas run their error-reduction course (Popper)
• Prestige: Actors seek the greatest visibility (Merton)
• Resource competition: “To the victor goes the spoils” (Fuchs)
• Boundary Protection (Lamont)
• Fractal Development (Abbott)
• Community Influence (SSK: Collins, etc.)
• Peer magnification (Gould)
• Power (JL Martin)
For entire fields, these mechanisms are largely unknown and underspecified.
• Need to extend beyond particular lab studies
• Take a large-scale “Satellite” view of science dynamics
• Link action frames to specific patterns in 4 science networks

Theoretical approaches to scientific development
Four relevant networks:
1. Citation networks: a direct trace of scientific recognition & production
2. Topic networks: clusters of scientific products related to the same subject
3. Collaboration networks: “invisible communities” of social interaction that produce scientific products
4. Research Communities: people linked through common research topics (substantively a derivative of 2 & 3)

Scientific Environments
Evidentiary Basis: How do we array disciplines with respect to evidence?
Two Dimensions: Objectivity & Control
Objectivity is taken from Popper: the extent to which a given knowledge claim is independent of the knower.
Control refers to the ability of scientists to directly manipulate the object of study. “Lab Science” with complete ability to control apparatus (and thus environment) represents the strongest ability, while “observation” represents the other extreme.
Cases:
• Chemistry (Lab Science: High Objectivity & High Control)
• Geology (Field Science: High Objectivity & Low Control)
• Sociology (Social Science: Moderate Objectivity & Low Control)
• Literary Criticism (Humanities: Low Objectivity & Low Control)
This approach is very similar to Fuchs (1993)

Networks & Science: Two Questions & 4 networks
[Design grid: disciplines (Chemistry, Geology, Sociology, Literary Criticism) crossed with networks and the pattern studied in each: Citation (Journal Citation Structure), Topics (Subfield Evolution), Collaboration and Community (Collaboration & Cohesion).]

Networks & Science: Two Questions & 4 networks
Focusing on Sociology as a current case
The field of sociology can thus be thought of as the intersection of multiple networks. The shape of these networks differs across scales and over time.
- Differences between local and global visions of the network shape our perceptions of scientific coherence.
- We tend to perceive coherence in our own specialty fields and incoherence for the entire discipline.
- A globally federated structure that cannot easily exclude empirical topics might still be socially coherent if scientific mixing crosscuts empirical problems.
We can see this structure by examining these 4 networks at large scale and over time.

Data Sources
• Citation Networks
  • Compiled from the ISI Web of Science journal citation tables
  • Covers 1681 social science journals indexed in 2003
  • Will eventually
    - fill this series from 1990 to present across all fields.
    - add a sample of paper-level citations to model performance.
• Topic & Collaboration Networks (for Sociology)
  • Compiled from Sociological Abstracts
  • 281,163 papers published between 1963 and 1999
  • A sub-sample of “sociology only” papers published in a select set of non-specialty sociology journals (35% of the total, ~100K)
  • Contains information on title, abstract, keywords, author(s), tables, journal & citation
  • Will use similar indexes for Chemistry, Geology and Lit Crit

Where does sociology fit?
• Perennial debates over the existence of a theoretical core
• Rapid growth in the internal diversity of topics sociologists study:
[Figure: Number of ASA Sections, 1950 to 2010.]
• Rapid growth in the number of journals relevant to sociologists:
[Figure]

Where does sociology fit?
This growth & diversity has been seen as evidence for the ultimate emptiness of sociology as a scientific discipline.
But disciplines are shaped by the connections between ideas, not the number of ideas. That is, we recognize fields by who they speak to as much as by what they speak about.
The clearest empirical trace of this communication is citation. Disciplines can then be defined as clusters of work that speak more to each other than to anyone else, which we trace with co-citation networks.

Building co-citation networks
Links in a co-citation network are constructed by measuring how similar each journal is to every other journal. Similarity is gauged by correlating the pattern of citations received by each journal from every other journal.

        AJS    ASR    AER    …    JER
J1       #      #      0           0
J2       #      #      0           0
J3       0      0      #           #
J4       0      #      #           #
 .
 .
JER      0      0      #           #

Comparing across columns tells us whether the two journals are recognized by others as similar.

Building co-citation networks
The resulting journal-by-journal similarity matrix:

        AJS    ASR    AER    …    JER
AJS     1.0
ASR     High   1.0
AER     Low    Med    1.0
 .
JER     Low    Low    High        1.0

This creates a valued network of ties between journals. I use a cosine similarity score developed in bibliometrics, keeping only ties with similarity > 0.45 and at least 2% shared citation volume.
Source: Loet Leydesdorff
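The similarity step for the co-citation network can be sketched as follows. This is a minimal illustration, not the procedure used for the results: the citation counts are invented, the function names are hypothetical, self-citations are left on the diagonal, and the 2% citation-volume filter is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cocitation_ties(cites, journals, threshold=0.45):
    """Return valued ties between journals with similar citation-received profiles.

    cites[i][j] is the number of citations from journal i (row, citing)
    to journal j (column, cited). Two journals are compared on their
    columns, i.e. on who cites them.
    """
    n = len(journals)
    columns = [[cites[i][j] for i in range(n)] for j in range(n)]
    ties = {}
    for a in range(n):
        for b in range(a + 1, n):
            score = cosine(columns[a], columns[b])
            if score > threshold:
                ties[(journals[a], journals[b])] = score
    return ties


# Toy citation counts, invented for illustration only.
journals = ["AJS", "ASR", "AER", "JER"]
cites = [
    [10, 8, 0, 0],
    [9, 12, 0, 0],
    [0, 0, 7, 5],
    [0, 1, 6, 9],
]
ties = cocitation_ties(cites, journals)
for pair, score in sorted(ties.items()):
    print(pair, round(score, 3))
```

In this toy matrix the two sociology journals are cited by the same sources and so are the two economics journals, so only those two pairs clear the 0.45 threshold.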
[Figure: Economics co-citation similarity network. Density = 0.197, N = 152, Isolates (not shown): 5. Node size proportional to log(degree).]

[Figure: Political Science co-citation similarity network. Density = 0.160, N = 69, Isolates (not shown): 10. Node size proportional to log(degree).]

[Figure: Sociology co-citation similarity network. Density = 0.140, N = 69, Isolates: 7.]

Where does sociology fit?
• Sociology “fits” at the center of the social sciences. We are not as internally cohesive as Economics or Law, but more so than many (anthropology, allied health fields).
• This represents a tradeoff. We have traded unique dominance of a topic (markets, politics, mind, space, history) for diversity & thus centrality.
• Sociology is an interstitial discipline (Abbott, 2004) in at least two senses:
  • There is no content topic we can reasonably exclude
  • We pull together, and generate, the ideas and topics covered by specialty disciplines.
• This makes us uniquely positioned to provide insights on many different empirical questions.
How have the topics sociologists study shifted over time?

What do sociologists study?
How do we capture the internal organization of research problems?
• Could use paper-level citation networks (see Hargens 2000), but data are difficult & expensive to obtain for large-scale networks.
• Can examine the network of papers formed by the topics they write about.
  • Directly taps scientific content
  • Purely endogenous creation of topics that allows new topic areas to emerge and old ones to die over time
  • Tractability: data can be extracted from information held in Sociological Abstracts
  • Multiple levels:
    • Coarse grained: Focus solely on keywords (Light 2005)
    • Fine grained: Use all information available (title, abstract, keywords)

A fine grained view
Data Selection & Manipulation:
Index entries contain title, abstract and keywords that summarize the paper’s content.
• Sample all papers indexed within four 3-year windows between 1970 and 1999.
• Construct a paper-by-word matrix, where the ij cell lists how many times word i is used to describe paper j.
  • Word set is stemmed to get at root words
  • A stop-list is used to minimize inclusion of low-information content words (“the”, “and”, “is”, etc.) or words commonly found in the data source (“Tables”, “Figure”, “References”)
• Construct a network by linking the most highly correlated papers
  • Use correlation of 0.40 or better
  • Ties are treated as valued in the network analyses

A fine grained view
Analysis & Presentation:
General approach is “quantitatively inductive”
- Construct a low-dimensional map of the network, using contour sociograms. These allow for full information in the network structure.
- Use cluster analysis to identify distinct topics
- Use a variant of Moody’s RNM algorithm to cluster the network. This clustering routine:
  (a) is efficient: allows clustering on tens of thousands of nodes
  (b) automatically specifies the optimal number of clusters
  (c) allows that some cases can fall ‘between’ clusters
- I set a minimum cluster size of 12 papers published over the 3-year window.
- Evaluate the clustered papers for content and label the maps.
- Compare the maps over time qualitatively, looking for general changes in the frequency & alliance of topics.
- Examine shifts in structural indicators of the extent of clustering & cluster size distributions.

A fine grained view
[Figure: Example: One-step neighborhood of “More information, better jobs?”]

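The paper-by-word construction described under Data Selection & Manipulation can be sketched as below. This is an illustrative sketch under stated assumptions, not the original pipeline: the three "papers" are invented, the stop-list is a tiny hypothetical one, stemming is skipped, and Pearson correlation stands in for whatever correlation measure was actually used.

```python
import math
import re

# Hypothetical stop-list: low-information words plus data-source boilerplate.
STOP_WORDS = {"the", "and", "is", "of", "a", "in", "to", "on", "for",
              "tables", "figure", "references"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words (no stemming here)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def word_count_vectors(papers):
    """Build one word-count vector per paper over a shared vocabulary."""
    vocab = sorted({w for text in papers.values() for w in tokenize(text)})
    vectors = {}
    for paper_id, text in papers.items():
        tokens = tokenize(text)
        vectors[paper_id] = [tokens.count(w) for w in vocab]
    return vocab, vectors


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def topic_ties(papers, threshold=0.40):
    """Link pairs of papers whose word profiles correlate at or above the threshold."""
    _, vectors = word_count_vectors(papers)
    ids = sorted(vectors)
    ties = {}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            r = pearson(vectors[a], vectors[b])
            if r >= threshold:
                ties[(a, b)] = r  # the correlation is kept as the tie value
    return ties


# Invented index entries, for illustration only.
papers = {
    "p1": "job search networks and labor market information",
    "p2": "labor market networks and job information flows",
    "p3": "suicide rates and religious integration",
}
ties = topic_ties(papers)
print(ties)
```

The two labor-market entries share most of their vocabulary and end up tied; the third shares none and stays unlinked.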
A fine grained view: Content (all journals)
[Figures: topic maps for each sampled period.]

A fine grained view: Content (all journals)
The cluster content of the topic network has evolved slowly:
• Some clearly central specialties have remained prominent over the entire period. This includes larger areas such as:
  • Class & Stratification
  • Race & Ethnicity
  • Education
  • Gender (Strongest from 1980s on)
  • Family (Strongest from the 1980s on)
  • Crime
  As well as clearly distinct, though numerically smaller, bodies of research related to
  • Suicide
  • Sociology of Science, Technology & “Reflexive” sociology
  • Unions

A fine grained view: Content (all journals)
The cluster content of the topic network has evolved slowly:
• The clearest change has been the rapid growth of social research on health.
  • Dominated by a very large body of research related to HIV/AIDS
• Other areas of relative growth include:
  • Family topics were most prominent in the 1980s
  • A strong presence of research on sex & sexuality emerged in the 1980s and 90s
• Relative declines have come in areas such as:
  • Groups
  • Interaction
  • “Radical” studies
  • Elite studies
Summary: A move away from basic social processes toward studying social problems, with a growing uniqueness of theory & method

A fine grained view: Content (Restricted Sample)
The cluster content of the restricted topic network has evolved similarly to the wider social science field:
• The subfield structure is less dominated by the purely applied work on HIV/AIDS in the 90s, but there is still a clear association of topics around sexuality, health and AIDS.
• Health, Family, Education, Gender, and Race are always prominent and large.
• The relative prominence of “reflexive sociology” is much higher:
  • These topics cannot be published elsewhere, and the resulting tight cluster looks proportionately larger in the smaller sample.

A fine grained view: Content
We can measure the degree of consensus in words used to describe papers with:

\[ C = \sum_i p_i^2 \]

where p_i is the proportion of times word i is used.

[Figure: Word Consensus Scores, 1970 - 1999. Y axis: C (x 100). Series: Soc Only, All SA Journals.]

A fine grained view (Core Soc)
[Figure: Proportion of papers falling inside a cluster, 1965 to 2000. Series: Total and Restricted, for Cn > 12 and for Cn > 100.]

A fine grained view
We can measure the extent that ties fall within clusters with the modularity score:

\[ M = \sum_s \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right] \]

where:
s indexes clusters in the network
l_s is the number of lines in cluster s
d_s is the sum of the degrees of s
L is the total number of lines

[Figure: Network Modularity, 1970 - 1999. Y axis: Modularity Score. Series: All SA Journals, Soc Only.]

[Figure: Number of Clusters, 1970 - 1999. Series: All Journals, Soc Only. Axis label as extracted: In-cluster ties / Total ties.]

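Both structural indicators are short computations. The sketch below reads the consensus score as the sum of squared word proportions and the modularity score as the sum over clusters of l_s/L minus (d_s/2L) squared; that reading of the modularity formula is an assumption based on the symbol definitions given on the slide. The toy word list and network are invented, ties are treated as unweighted for simplicity, and the function names are hypothetical.

```python
from collections import Counter


def consensus(words):
    """C = sum_i p_i^2, with p_i the share of all word uses accounted for by word i."""
    counts = Counter(words)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def modularity(edges, cluster_of):
    """M = sum_s [ l_s / L - (d_s / (2L))^2 ] for an undirected, unweighted edge list.

    l_s counts lines with both ends in cluster s, d_s sums the degrees of the
    nodes in cluster s, and L is the total number of lines.
    """
    total_lines = len(edges)
    lines_in = Counter()
    degree_sum = Counter()
    for u, v in edges:
        degree_sum[cluster_of[u]] += 1
        degree_sum[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            lines_in[cluster_of[u]] += 1
    return sum(
        lines_in[s] / total_lines - (degree_sum[s] / (2 * total_lines)) ** 2
        for s in degree_sum
    )


# Toy data, invented for illustration: two triangles joined by one bridge.
words = ["class", "class", "race", "gender"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
cluster_of = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B", "f": "B"}

c_score = consensus(words)
m_score = modularity(edges, cluster_of)
print(c_score, m_score)
```

C rises as word use concentrates on fewer terms, and M rises as more ties fall inside clusters than the cluster degree totals alone would suggest.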
A fine grained view
[Figure: Mean Cluster Size, 1970 - 1999. Series: All Journals, Soc Only. Axis label as extracted: In-cluster ties / Total ties.]

A fine grained view
The cluster structure of the topic network:
• The vast majority of papers can be assigned to clear clusters, with slight growth in this proportion over time.
• The number of clusters has increased rapidly, though slightly slower within core sociology than in the broader field of social science.
• There has been significant growth in the tails of the distribution: the size distribution is more skewed in later periods.
• The modularity of the network has increased over time, though most of this change is between the 1970 and 1980 periods.

A fine grained view
Next steps:
1. Build a continuous moving window to fill in the dates from 1970 to 2005.
2. Link clusters across time periods, so we can track exactly the relative growth and decline of each subfield.
3. Model this growth as a function of connections to other fields, author composition and disciplinary environment.
4. Build this network’s dual: Scientists connected through topics.

What do sociologists study?
A clustered topic structure focused strongly on practical problem solving has a hint of Durkheim’s concern: Is there any integration across these topic clusters?
We shouldn’t jump too quickly to the fractured conclusion:
• Topic clusters are formed from papers, and papers typically have well encapsulated ideas. They have a small “maximum digestible unit”
• Scientific integration is really about how scientists bridge these multiple topics.
• If authors write and collaborate across these topics, then ideas can quickly disseminate as well.
What is the structure of the collaboration graph? If it is highly clustered, that would signal potential fragmentation.

Who produces sociology?
Science is typically produced through collaboration, both formally and informally (Crane 1972, Crane & Small 2000, Friedkin 1998). The best empirical trace of collaboration for large communities of science is coauthorship.
• Misses the less intense collaborations recognized in acknowledgements, discussions, colleagues reading each other’s work
• But should provide the strongest test of a fractionalization hypothesis, since the set of people we write with should be more like us than the set of people we have lunch with or discuss work with informally.
• There are differences across subfields in formal collaboration rates, which, if anything, should magnify the extent of observed fragmentation.

Who produces sociology?
[Figure: Coauthorship Trends in Sociology, Sociological Abstracts and ASR. Y axis: proportion of papers with >1 author; X axis: Year, 1930 to 2000. Series: Sociological Abstracts, ASR.]

Who produces sociology?
[Figure: Distribution of Coauthorship Across Journals, Sociological Abstracts, 1963-1999. Y axis: proportion of papers with >1 author; X axis: Coauthorship Rank. Labeled journals: Child Development, Soc. Forces, J. Health & Soc. Beh., ASR, J. Am. Statistical A., AJS, Acta Politica, Soc. Theory, Signs, J. Soc. History.]

Who produces sociology?
Construct a collaboration network by assigning an edge between any pair of people who coauthored a paper together.
[Figure: Example Paths: 3-steps from Stan Wasserman, N=361. Node size proportional to log of degree.]

Who produces sociology?
The simplest summary test for a fragmented network is to measure the extent of clustering in the network. Watts’ work on the “small-world problem” suggests that if the collaboration network is a small-world network it might be fractured.
C is large, L is small = SW graphs
• High relative probability that a node’s contacts are connected to each other.
• Small relative average distance between nodes

Who produces sociology?
In a highly clustered, ordered network, a single random connection will create a shortcut that lowers L dramatically.
Watts demonstrates that small-world properties can occur in graphs with a surprisingly small number of shortcuts.

Who produces sociology?
Locally clustered graphs are a good model for coauthorship when there are many authors on a paper.
[Figure: authors grouped by Paper 1 through Paper 5.]
Newman (2001) finds that coauthorship among natural scientists fits a small world model. I test this model on the sociology coauthorship network, using all authors from 1963 – 1999.

Who produces sociology?

              Observed   Random
Clustering    0.194      0.206
Distance      9.81       7.57

The sociology network is less clustered than would be expected by chance and has somewhat longer overall distances. This suggests that it does not have a small-world structure.

Who produces sociology?
The network has a broad core-periphery structure
[Figure: nested core-periphery diagram. Labels: Bicomponent, Component, Unconnected, Structurally Isolated. Counts shown: (68,923), 59,866, 38,823, 29,462.]

Who produces sociology?
[Figure: Internal Structure of the Coauthorship Core. Labeled regions: Health, General Sociology.]

Who produces sociology?
• Strong specialty effects for ever-coauthored
  Unlikely:
    History & Theory
    Sociology of Knowledge
    Radical / Marxist Sociology
    Feminist / Gender Studies
  Likely:
    Social psychology
    Family
    Health & Medicine
    Social Problems
    Social Welfare

Who produces sociology?
• Weak specialty effects for network embeddedness
• Large number of coauthors increases embeddedness
• Large number of people on any given paper decreases embeddedness

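The two small-world quantities, clustering (C) and average distance (L), can be computed on a coauthorship graph as sketched below. This is a hedged illustration rather than the analysis behind the reported values: the author lists are invented, the function names are hypothetical, nodes with fewer than two coauthors are skipped when averaging clustering (one of several conventions), and the random-graph baseline used for comparison is not reproduced.

```python
from collections import deque
from itertools import combinations


def coauthor_graph(papers):
    """Assign an edge between every pair of people who coauthored a paper.

    papers is a list of author lists; the result maps each author to a set of coauthors.
    """
    adj = {}
    for authors in papers:
        for a in authors:
            adj.setdefault(a, set())
        for a, b in combinations(sorted(set(authors)), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def clustering(adj):
    """Mean share of a node's contact pairs that are themselves connected."""
    scores = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # undefined with fewer than two contacts; skipped here
        linked = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        scores.append(linked / (k * (k - 1) / 2))
    return sum(scores) / len(scores) if scores else 0.0


def mean_distance(adj):
    """Average shortest-path length over all reachable pairs (breadth-first search)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Invented author lists, for illustration only.
papers = [["A", "B", "C"], ["C", "D"], ["D", "E"]]
adj = coauthor_graph(papers)
c_obs = clustering(adj)
l_obs = mean_distance(adj)
print(c_obs, l_obs)
```

A small-world test would then compare these observed values against the same statistics in a suitably randomized graph; which randomization to use is left open here.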
[Figure: Evolution of Network Cohesion: 5-year moving window, 1975 to 2000. Left axis: Percent (series: Bicomponent, Component); right axis: Connectivity.]

Summary & discussion
Social Science Citation Structure
• Economics, Law, Psychology, Business/Management, Linguistics are most cohesive
  • They are also “peripheral” in that they speak to a relatively limited set of problems
• Sociology is at least as cohesive as Political Science, and more cohesive than fields such as Anthropology, Social Work, Education or allied health fields that all have more limited empirical domains
• Our position represents a tradeoff between internal cohesion and external centrality.

Summary & discussion
Scientific Topic Network
• Big-Picture: A general progression towards problem solving and the specialization of work on theory & methods (Light 2005).
• Fine-grained structure:
  • A federated topic structure that has remained largely constant since the 1970s, though there have been shifts in topics.
  • Key content areas have remained largely constant
    • Race, Family, Class, Gender, Science, and Health
  • A decrease in focus on general foundation problems
    • Group structure, community, interaction
  • An increase in work on social problems
    • Health & HIV/AIDS-related topics
  • Some evidence for greater homogeneity in topics discussed

Summary & discussion
Scientific Collaboration Network
• The network is not divided into small research-area based clusters.
  • There is no partition that strongly separates scientists.
  • This has to imply that authors bridge topic clusters.
  • This is good for social cohesion, and probably good for theoretical cohesion.
• Caveat: There is evidence for a division based on research method, with largely quantitative work more likely to be coauthored, though there is no such simple division in the topics network.

Summary & discussion
Combined, these models suggest a discipline that is integrated socially and locally cohesive topically.
Discipline-wide integration will likely only increase as pressures for collaboration push more scientists to work together across topics.
However, the perception of disintegration will likely continue:
• because most of us are only exposed outside our areas by work that appears in the general journals.
• But almost all of the topical cohesion is due to “normal science” work occurring in specialty journals.