Mapping the Structure and Evolution of Chemistry Research Dr. Katy Börner Cyberinfrastructure for Network Science Center, Director Information Visualization Laboratory, Director School of Library and Information Science Indiana University, Bloomington, IN katy@indiana.edu Collaborative work with Kevin W. Boyack & Richard Klavans CHE-0524661 June 26, 2007 @ ISSI, Madrid, Spain 1 Studying the Emerging Global Brain: Analyzing and Visualizing the Impact of Co-Authorship Teams Börner, Dall’Asta, Ke & Vespignani (2005) Complexity, 10(4):58-67. Research question: • Is science driven by prolific single experts or by high-impact coauthorship teams? Contributions: • New approach to allocate citational credit. • Novel weighted graph representation. • Visualization of the growth of weighted co-author network. • Centrality measures to identify author impact. • Global statistical analysis of paper production and citations in correlation with co-authorship team size over time. • Local, author-centered entropy measure. 2 Spatio-Temporal Information Production and Consumption of Major U.S. Research Institutions Börner, Katy, Penumarthy, Shashikant, Meiss, Mark and Ke, Weimao. (2006) Mapping the Diffusion of Scholarly Knowledge Among Major U.S. Research Institutions. Scientometrics. 68(3), pp. 415-426. Research questions: 1. Does space still matter in the Internet age? 2. Does one still have to study and work at major research institutions in order to have access to high quality data and expertise and to produce high quality research? 3. Does the Internet lead to more global citation patterns, i.e., more citation links between papers produced at geographically distant research instructions? Contributions: ¾ Answer to Qs 1 + 2 is YES. ¾ Answer to Qs 3 is NO. ¾ Novel approach to analyzing the dual role of institutions as information producers and consumers and to study and visualize the diffusion of information among them. Overview 1. Introduction 2. Base map of science 3. Mapping chemistry References: Boyack, Kevin W., Börner, Katy and Klavans, Richard. Mapping the Structure and Evolution of Chemistry Research, Accepted for 11th International Conference on Scientometrics and Informetrics, Madrid, Spain, June 25-27, 2007. Boyack, Kevin W., Klavans, Richard and Börner, Katy. (2005). Mapping the Backbone of Science. Scientometrics. 64(3), 351-374. Shiffrin, Richard M. and Börner, Katy. (April 2004). Mapping Knowledge Domains. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl_1). Börner, Katy, Chen, Chaomei, and Boyack, Kevin. (2003). Visualizing Knowledge Domains. In Blaise Cronin (Ed.), Annual Review of Information Science & Technology, Information Today, Inc./American Society for Information Science and Technology, Medford, NJ, volume 37, chapter 5, pp. 179-255. Börner, Katy, Sanyal, Soma and Vespignani, Alessandro (2007). Network Science. In Blaise Cronin (Ed.), Annual Review of Information Science & Technology, Information Today, Inc./American Society for Information Science and Technology, Medford, NJ, volume 41, chapter 12, pp. 537-607. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 6 3 Why map science and/or chemistry? ¾ How does our collective scholarly knowledge grow over time? ¾ What major areas of science exist and how are they interlinked? ¾ Which areas are major knowledge producers; which ones are consumers? Computational scientometrics – the application of bibliometric/scientometric methods to large-scale scholarly datasets – and the communication of results via maps of science might help us answer these questions. ¾ Chemistry is a field that is undergoing significant change. Interdisciplinary research has increased over time and the lines between chemistry and the life sciences have seemingly blurred. ¾ This study maps the structures of Chemistry, Biochemistry, Biology, and Bioengineering, and their interactions over 30 years using journal citation patterns. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 7 How to map science and/or chemistry? 1. 2. 3. 4. 5. Base Map. Information from the combined Science (SCIE) and Social Science (SSCI) Citations Indexes from 2002 was used to generate a disciplinary map of 7,227 journals and 671 journal clusters. Overlays of DOE, NIH, NSF funding. Chemistry Disciplines. Clusters relevant to study the structure and evolution of chemistry were identified using JCR categories and were further clustered into 14 disciplines. Knowledge Diffusion among Disciplines. The changing scientific composition of these 14 disciplines and their knowledge exchange via citation linkages was computed. The result is visually communicated using the 2002 map of science introduced in 1.) Discussion. Major changes on the dominance, influence, and role of Chemistry, Biology, Biochemistry, and Bioengineering over these 30 years are discussed. We conclude with a discussion and suggestions for future work. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 8 4 General mapping process (multi-step) Coupling matrix Coupling matrix Search Databases Patents/ Papers/ Papers Journals Emergent categories patent_no US6821009 US6821143 US6821553 US6821643 US6821644 US6821647 US6821648 US6821649 US6821650 US6821651 US6821793 US6821799 US6821800 US6821803 US6821804 US6821806 lev1 1542 2273 98 2286 cluster lev2 lev3 368 56 332 2 258 3 307 33 284 2173 652 2015 2015 2295 2252 832 1965 1700 1993 187 307 289 309 309 95 372 334 339 188 279 47 33 33 33 33 19 2 47 1 58 47 lev4 5 2 1 1 3 1 1 1 1 3 2 3 1 2 3 Bibliographic Bibliographic coupling Coupling 1 Aggregate records Aggregate records 1 .7 .6 .6 2 5 2 3 .1 .9 .8 .2 4 .6 6 A Graph Layout Graph Layout Cluster Assignment Cluster Assignment Browsable Maps modified single-link VxOrd Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 9 Generation of 2002 paper-level base map ¾ Combined SCIE/SSCI from 2002 ¾ 1.07M papers, 24.5M references, 7,300 journals ¾ Bibliographic coupling of papers, aggregated to journals, counts are normalized using cosine index. ¾ First step ¾ remove 25 multidisciplinary journals (MD) ¾ run graph layout and clustering on remaining journals ¾ resulting in 646 clusters Law Math Stat CompSci Economics PhysChem Education Psychology Vision Brain Physics Chemistry Astro Environ Psychiatry GeoSci MRI BioChem ¾ Second step ¾ add MD journals back as single-journal clusters, total is now 671 clusters ¾ re-aggregate coupling counts at the cluster level ¾ layout positions of 671 clusters using VxOrd ¾ by association, this gives x,y positions for each journal Engineering Policy Biology BioMat General Medicine Plant MicroBio Cancer Therapeutic Virology Infect Disease Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. Pajek 10 5 Node – cluster of journals Edge – similarity of journal clusters based on bibliographic coupling. 2002 disciplinary map Law Math Engineering Policy Stat CompSci Economics PhysChem Education Vision Physics Chemistry Psychology Brain Astro GeoSci Environ Psychiatry MRI BioChem Biology BioMat General Medicine Plant MicroBio Cancer Therapeutic Virology Infect Disease Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. Pajek 11 6 Node – cluster of journals Edge – similarity of journal clusters based on bibliographic coupling. 2002 disciplinary map Law Math Engineering Policy Stat CompSci Economics PhysChem Education Vision Physics Chemistry Psychology Brain Astro GeoSci Environ Psychiatry MRI BioChem Biology BioMat General Medicine Plant MicroBio Cancer Therapeutic Virology Infect Disease Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. Pajek 14 7 Funding overlays ¾ DOE – simple lookup of author institutions (LANL, LLNL, SNL, ANL, BNL, …) ¾ NIH and NSF: Used RaDiUS data from 1999 ¾ Matched funded PI and institution from 1999 funding data to first author and institution from the 2002 ISI publication data. ¾ Not comprehensive, but representative. ¾ Undercounts actual funding profiles (e.g., if PI wasn’t first author, if publications appeared earlier or later than 2002, …). ¾ Some false hits – could use text analysis to narrow further. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 15 Funding overlay – DOE laboratories Law Math Engineering Policy Stat CompSci Economics PhysChem Education Vision Physics Chemistry Psychology Brain Astro GeoSci Environ Psychiatry MRI BioChem Biology BioMat Vitality High General Medicine Plant MicroBio Cancer Low Therapeutic Virology Infect Disease Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. Pajek 16 8 Funding overlay – NIH funded research (extramural) Law Math Engineering Policy CompSci Stat Economics PhysChem Education Vision Physics Chemistry Psychology Brain Astro GeoSci Environ Psychiatry MRI BioChem Biology BioMat Vitality High General Medicine Plant MicroBio Cancer Therapeutic Low Infect Disease Virology Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. Pajek 17 Pajek 18 Funding overlay – NSF funded research Law Math Engineering Policy Stat CompSci Economics PhysChem Education Vision Physics Chemistry Psychology Brain Astro GeoSci Environ Psychiatry MRI BioChem Biology BioMat Vitality High General Medicine Plant MicroBio Cancer Low Therapeutic Virology Infect Disease Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 9 How to map chemistry ¾ Data ¾ ¾ ¾ ¾ ¾ 1993-2004 JCR (citing:cited journal counts). Need comparable data for back years. Obtained journal citing:cited pairs for 1974, 79, 84, 89. Summed pairs to citing:cited journal counts. Thus, full set of citing:cited journal counts at 5 year intervals 1974-2004. JCR (citing:cited journal counts) 1974 1979 1984 1989 1994 1997 2004 JCR Data ¾ Used 2002 base map as starting point. ¾ Added unique journals for other years to existing (2002) structure. ¾ Order of adding years: 2004, 1999, 1994, 1989, 1984, 1979, 1974 ¾ Identified four fields and 14 disciplines within map. ¾ Fractional counting of publications by field and discipline. ¾ Graphical display of results. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 19 Adding unique journals ¾ Example, for 2004, if [newjnl] was not in the 2002 list. ¾ ¾ ¾ ¾ ¾ Intercitation counts were read from table for all newjnl::existingjnl pairs. Counts were aggregated to a newjnl::existingcluster matrix. Only counts to citedyr >= pubyear-9 were included (JCR standard). Cosine index values were generated for the newjnl::existingcluster matrix. Newjnl was assigned to the existingcluster with the highest cosine value. ¾ For the next year, 1999, if [newjnl] was not in the combined 2002,2004 list ¾ same process. ¾ All journals for all years were assigned to one of the 671 clusters. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 20 10 Four fields of interest / JCR categories Field JCR Categories Chemistry DW – Chemistry, Applied DX – Chemistry, Medicinal DY – Chemistry, Multidisciplinary EA – Chemistry, Analytical EC – Chemistry, Inorganic & Nuclear EE – Chemistry, Organic EI – Chemistry, Physical HQ – Electrochemistry II – Engineering, Chemical GC – Geochemistry & Geophysics UH – Physics, Atomic, Molecular & Chemical Biology CU – Biology CX – Biology, Miscellaneous DR – Cell Biology HY – Developmental Biology HT – Evolutionary Biology PI – Marine & Freshwater Biology QU – Microbiology WF – Reproductive Biology Biochemistry CO – Biochemical Research Methods CQ – Biochemistry & Molecular Biology individual journals: Science, Nature, PNAS Bioengineering DA – Biophysics IG – Engineering, Biomedical DB – Biotechnology & Applied Microbiology QE – Materials Science, Biomaterials Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 21 Fractional counting of papers ¾ For each journal: ¾ Find JCR category assignments. ¾ For example, the journal Bioelectrochemistry is in 4 JCR categories ¾ CQ (Biochem & Mol Bio) – Biochemistry ¾ CU (Biology) – Biology ¾ DA (Biophysics) – Bioengineering ¾ HQ (Electrochemistry) – Chemistry ¾ Split paper counts among fields. ¾ If there are 64 papers in Bioelectrochemistry, each of the four fields will get 16 papers. ¾ Most journals are assigned to only 1 or 2 JCR categories. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 22 11 Chemistry on the 2002 base map Law Math Engineering Policy CompSci Stat Economics PhysChem Education Vision Physics Chemistry Psychology Brain Astro GeoSci Environ Psychiatry MRI BioChem Biology BioMat General Medicine Plant MicroBio Cancer Therapeutic Infect Disease Virology Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. Pajek 23 Pajek 24 Four fields on the 2002 base map Law Math Engineering Policy Stat CompSci Economics PhysChem Education Vision Physics Chemistry Psychology Brain Astro GeoSci Environ Psychiatry MRI BioChem Biology BioMat General Medicine Plant Cancer Therapeutic Virology Infect Disease MicroBio CHEMISTRY BIOLOGY BIOCHEM BIOENG Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 12 Definition of 14 disciplines • • • • CHEMISTRY BIOLOGY BIOCHEM BIOENG Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 25 Symbolic map of 14 disciplines; 1974 Materials Chem related to Materials/Physics. CM Physics Chemistry, Polymer Phys Chem Majority of Chem Research Number of papers by cluster Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chem related to Earth Science GeoSci Climate Chemistry Other Biochemistry BioEngineering Biology; Zoology; Ecology Toxicology; Pharmacology Knowledge Flows (cluster to cluster) source cited Biochemistry Food Sci citing Earth Sciences Chem related to Biology et al. 8 recipient Physics Other arrow colors as above MicroBio; Plant Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 26 13 Symbolic map of 14 disciplines; 1974 Materials Chemistry, Polymer Phys Chem Major knowledge flows. Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Largest net donor of knowledge. Biochemistry Knowledge Flows (cluster to cluster) Biology; Zoology; Ecology Toxicology; Pharmacology source recipient cited citing Earth Sciences Physics Other arrow colors as above 8 MicroBio; Plant Sci Food Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 27 Symbolic map of 14 disciplines; 1979 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Biology; Zoology; Ecology Toxicology; Pharmacology Knowledge Flows (cluster to cluster) source cited recipient citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 Food Sci MicroBio; Plant Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 28 14 Symbolic map of 14 disciplines; 1984 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Knowledge Flows (cluster to cluster) Biology; Zoology; Ecology Toxicology; Pharmacology source recipient cited citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 MicroBio; Plant Sci Food Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 29 Symbolic map of 14 disciplines; 1989 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Biology; Zoology; Ecology Toxicology; Pharmacology Knowledge Flows (cluster to cluster) source cited recipient citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 Food Sci MicroBio; Plant Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 30 15 Symbolic map of 14 disciplines; 1994 Materials 10 12 6 5 Chemistry, Polymer Phys Chem 13 Number of papers by cluster 4 CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical 3 Fraction of papers by cluster Biology Chemistry Climate 7 9 Other GeoSci Biochemistry BioEngineering 14 Toxicology; Pharmacology 1 2 Knowledge Flows (cluster to cluster) Biology; Zoology; Ecology source recipient cited citing Earth Sciences Biochemistry Physics Other arrow colors as above 11 8 Food Sci MicroBio; Plant Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 31 Symbolic map of 14 disciplines; 1999 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Biology; Zoology; Ecology Toxicology; Pharmacology Knowledge Flows (cluster to cluster) source cited recipient citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 Food Sci MicroBio; Plant Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 32 16 Symbolic map of 14 disciplines; 2004 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Knowledge Flows (cluster to cluster) Biology; Zoology; Ecology Toxicology; Pharmacology source recipient cited citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 MicroBio; Plant Sci Food Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 33 Symbolic map of 14 disciplines; 2004 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Biology; Zoology; Ecology Toxicology; Pharmacology Knowledge Flows (cluster to cluster) source cited recipient citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 Food Sci MicroBio; Plant Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 34 17 Symbolic map of 14 disciplines; 2004 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Knowledge Flows (cluster to cluster) Biology; Zoology; Ecology Toxicology; Pharmacology source recipient cited citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 MicroBio; Plant Sci Food Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 35 Symbolic map of 14 disciplines; 2004 Materials Chemistry, Polymer Phys Chem Number of papers by cluster CM Physics Physics; Optics Chemistry, Gen/Organic 40,000 20,000 10,000 5,000 Chemistry, Analytical Fraction of papers by cluster Biology Chemistry Climate Other GeoSci Biochemistry BioEngineering Biology; Zoology; Ecology Toxicology; Pharmacology Knowledge Flows (cluster to cluster) source cited recipient citing Earth Sciences Biochemistry Physics Other arrow colors as above 8 Food Sci MicroBio; Plant Sci Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 36 18 Summary ¾ Maps show the growth, distribution, and knowledge flows between Chemistry, Biology, Biochemistry, and Bioengineering. ¾ Over the past 30 years, Biochemistry and Bioengineering are moving steadily into Chemistry territory, and are having a large influence on the general knowledge base. ¾ Chemistry’s impact on the knowledge base is growing, but at a slower rate. ¾ Journal-level data provides no information about the topics at the interface between fields, thus limiting the strategic decisions that can be made based on the mapping exercise. ¾ Folding in patent and or commercial data would provide a basis to study the impact of research on innovation and product development. It might very well be the case that some areas of science change their impact from a generator of cited scholarly knowledge to a generator of commercially valuable and hence patented and/or disclosed knowledge. ¾ Paper-level data would support the ¾ identification of topics on the interfaces between fields, knowledge flows at topical levels, and detailed trends at these micro-levels. ¾ analysis of the trajectories and impact of single researchers, teams, institutions, or nations. Börner: “Mapping the Evolving Interface of Mainstream Chemistry and the Fields of Biochemistry, Biology, and Bioengineering. 37 The End. 19