IUNI Web of Science Data: An Introduction for Beginners

Katy Börner and Robert Light
Cyberinfrastructure for Network Science Center, School of Informatics and Computing, and Indiana University Network Science Institute
Indiana University, USA
January 11, 2016

Olivier H. Beauchesne, 2011. Map of Scientific Collaborations from 2005-2009.

IUNI Web of Science Data

Data Acquisition
The IUNI Science of Science Hub acquired the complete set of Thomson Reuters’ Web of Science XML raw data (Web of Knowledge version 5) comprising
• Science Citation Index Expanded from 1900-2013
• Social Sciences Citation Index from 1900-2013
• Arts & Humanities Citation Index from 1975-2013
• Book Citation Index -- Science from 2005-2013
• Book Citation Index -- Social Sciences & Humanities from 2005-2013
• Conference Proceedings Citation Index -- Science & Technical from 1990-2013

Basic Statistics:
• Web of Science Core Collection: The number of total items from 1900 through 2013 is 56,442,146.
• There are 1,005,597,828 references to all items in the collection.
• Items By Edition (some documents span multiple editions)
  • SCIE (Science Citation Index Expanded) - 42,263,961 [828.9M references]
  • SSCI (Social Sciences Citation Index) - 7,690,154 [131.6M references]
  • AHCI (Arts & Humanities Citation Index) - 4,281,088 [35.3M references]
  • BSCI (Book Citation Index – Science) - 307,091 [15.6M references]
  • BHCI (Book Citation Index – Social Sciences & Humanities) - 452,559 [14.2M references]
  • ISTP (Index to Scientific & Technical Proceedings) - 7,291,457 [72.7M references]
  • ISSHP (Index to Social Sciences & Humanities Proceedings) - 564,970 [9.4M references]

Data Cleanliness
Not all data is as pristine as we’d wish. For instance, grant agencies are very much taken as is from the Acknowledgement section of papers, making a simple question like “How many papers are derived from NIH funding?” rather challenging to address until some form of data cleaning is done.
How do we avoid every team doing this data cleaning repetitively?

Raw Data
Provided by Thomson Reuters in the form of 166 XML files (561 GB).
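Files of this size are usually read as a stream rather than loaded whole. The following is a minimal sketch of that idea, not the Hub's actual parser: the record element name `REC`, the sample structure, and the helper names `local_name` and `count_records` are all assumptions made for illustration and should be checked against the format documentation.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}REC' down to 'REC'."""
    return tag.rsplit("}", 1)[-1]


def count_records(source, record_tag="REC"):
    """Count record elements in one XML file while reading it incrementally.

    `source` is a file path or a binary file object.
    `record_tag` is an assumed element name, not taken from the documentation.
    """
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) == record_tag:
            count += 1
            elem.clear()  # drop the record's children once it has been counted
    return count


# Tiny in-memory stand-in for a raw file (hypothetical structure and values).
SAMPLE = b"<records><REC><UID>A</UID></REC><REC><UID>B</UID></REC></records>"
sample_count = count_records(io.BytesIO(SAMPLE))
```

The same loop could be pointed at a file path on the Enclave instead of the in-memory sample; any per-record processing would go where the counter is incremented.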
The 127-page documentation of the XML data format is available on the Enclave at /WoS/Documentation.

WoS Database
• PostgreSQL database with a fully parsed set of ALL data.
• Database schema and Data Dictionary documentation for ease of use.
• Planned: Simple user interface for running common queries.
The Table Schema is at /WoS/Documentation on the Enclave and will be added to the public site soon.
Data Dictionary work is in progress (expected mid-to-end of January).
The reduced need for XML parsing will free up Enclave compute cycles for running network analysis and visualization algorithms.

Data Enclave
Setup
• Created by IU as a secure environment for working with the data.
• Powered by the Karst cluster and maintained by UITS administrators.
• Overseen by the Data Steward and the Data Advisory Board.
• Opened for testing in September 2015.
• Currently houses the raw XML files, along with statistical and analysis software.
Usage
• Log in from campus (or via VPN) using any computer.
• Use the virtual desktop setup to run queries and analyses and to generate visualizations.
• To extract data/results, save files in the dedicated /mbx directory for your research team and contact the Data Steward.
• So far, over twenty data extraction requests have been issued; the average response time is less than 30 minutes (the median is less than ten minutes).
• Consider writing papers in the enclave to reduce data extraction requests.

Data Access
This data can be used by any employee of Indiana University for academic research and without any sharing of data. Data details and information on how to access the data are given at http://www.indiana.edu/~iuni/resources/wos.html
To request data access, please complete the IUNI Web of Science Data (WoS) Access Request Form.
So far, five teams have requested and gained access to the data.

Data Enclave Screenshot

WoS Data Usage
The WoS data is a core asset of the IUNI Science of Science Research Hub. It is essential for R&D related to the Science Observatory, see http://iuni.iu.edu/about.
Ultimately, the Science Observatory will provide an extensible, secure infrastructure and expertise to study the science, technology, and innovation system in near real-time and to communicate results to a broad audience of researchers, practitioners, educators, patients, and interested policy makers.
The IUNI Science of Science Research Hub is collaborating in/leading the following grant development efforts:
• NSF Mid-West Data HUB - "Impact Assessment" and "Community Engagement"; IU lead is Beth Plale, SOIC
• NSF EAGER XDMoD Value Analytics (with UITS)
• NWB on Jetstream/XSEDE (with UITS)
• "Science of Science" NSF Expeditions in Computing led by Alex Vespignani, NEU
• "Science Observatory" NSF Science and Technology Center led by Katy Börner, SOIC
• NSF NRT proposal led by Luis Rocha, SOIC
Please let us know how we can maximize the value of the IUNI WoS data for your R&D.

Planned WoS Data Usage
Special agreements exist for using WoS publications by IU faculty. Data can be used to
• Bulk-fill and pre-populate FAR-like systems on an annual basis.
• Visualize IU's impact since 1900 as part of the bicentennial celebrations.
• Serve Researcher Networking systems like http://nrn.cns.iu.edu that empower all faculty, staff, students, alumni, and funders to identify expertise based on publications as well as funding and teaching data provided by IU production systems.
The IUNI Science of Science Research Hub is currently
• Working on the deployment on Jetstream/XSEDE of the Network Workbench, which is used by 100k experts around the globe; see the "IU to launch Jetstream" video at https://www.youtube.com/watch?v=t63c12A6bds.
• Learning Analytics, IU Grand Challenges proposal
• Organizing an NSF SciSIP agenda-setting conference on “Modelling Science, Technology, and Innovation” that will take place at NAS, DC on May 17-18, 2016, see http://modsti.cns.iu.edu
• Prototyping moderated “Science and Technology Forecasts” in collaboration with Journalism, see next slide.

Science Forecast S1:E1, 2015