IUNI Web of Science Data: An Introduction for Beginners

Katy Börner and Robert Light
Cyberinfrastructure for Network Science Center
School of Informatics and Computing and Indiana University Network Science Institute
Indiana University, USA
January 11, 2016
Olivier H. Beauchesne, 2011. Map of Scientific Collaborations from 2005-2009.
IUNI Web of Science Data
Data Acquisition
The IUNI Science of Science Hub acquired the complete set of Thomson Reuters' Web
of Science XML raw data (Web of Knowledge version 5) comprising
• Science Citation Index Expanded from 1900-2013
• Social Sciences Citation Index from 1900-2013
• Arts & Humanities Citation Index from 1975-2013
• Book Citation Index – Science from 2005-2013
• Book Citation Index – Social Sciences & Humanities from 2005-2013
• Conference Proceedings Citation Index – Science & Technical from 1990-2013
Basic Statistics:
• Web of Science Core Collection: The number of total items from 1900 through 2013 is 56,442,146.
• There are 1,005,597,828 references to all items in the collection.
• Items By Edition (some documents span multiple editions)
  • SCIE (Science Citation Index Expanded) - 42,263,961 [828.9M references]
  • SSCI (Social Sciences Citation Index) - 7,690,154 [131.6M references]
  • AHCI (Arts & Humanities Citation Index) - 4,281,088 [35.3M references]
  • BSCI (Book Citation Index – Science) - 307,091 [15.6M references]
  • BHCI (Book Citation Index – Social Sciences & Humanities) - 452,559 [14.2M references]
  • ISTP (Index to Scientific & Technical Proceedings) - 7,291,457 [72.7M references]
  • ISSHP (Index to Social Sciences & Humanities Proceedings) - 564,970 [9.4M references]
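The per-edition counts add up to more than the collection total because some documents appear in several editions. The short sketch below makes that arithmetic explicit using only the figures listed above. It reads the 1,005,597,828 figure as the total number of references in the collection, which is an interpretation, and the function names are illustrative.

```python
# Figures copied from the Basic Statistics list above.
TOTAL_ITEMS = 56_442_146
TOTAL_REFERENCES = 1_005_597_828

ITEMS_BY_EDITION = {
    "SCIE": 42_263_961,
    "SSCI": 7_690_154,
    "AHCI": 4_281_088,
    "BSCI": 307_091,
    "BHCI": 452_559,
    "ISTP": 7_291_457,
    "ISSHP": 564_970,
}


def edition_overlap(items_by_edition, total_items):
    """Number of edition memberships beyond one per distinct item."""
    return sum(items_by_edition.values()) - total_items


def mean_references_per_item(total_references, total_items):
    """Average reference count per item, assuming the total counts all references."""
    return total_references / total_items


if __name__ == "__main__":
    overlap = edition_overlap(ITEMS_BY_EDITION, TOTAL_ITEMS)
    print(f"Extra edition memberships: {overlap:,}")
    mean_refs = mean_references_per_item(TOTAL_REFERENCES, TOTAL_ITEMS)
    print(f"Mean references per item: {mean_refs:.1f}")
```

The overlap figure counts extra memberships, not distinct documents: an item listed in three editions contributes two to it.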
Data Cleanliness
Not all data is as pristine as we'd wish. For instance, grant agencies are very much
taken as is from the Acknowledgement section of papers, making a simple question
like “How many papers are derived from NIH funding?” rather challenging to address
until some form of data cleaning is done.
How do we avoid every team doing this data cleaning repetitively?
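One way to avoid repeated cleaning is to keep a single shared normalization routine that every team applies. The following is a minimal sketch of that idea for the NIH question, not the method used on the Enclave. The variant spellings are hypothetical examples, and a real list would have to come from inspecting the acknowledgement strings in the data.

```python
import re

# Hypothetical variant spellings, for illustration only.
FUNDER_PATTERNS = {
    "NIH": [
        r"\bnih\b",
        r"\bnational institutes? of health\b",
    ],
}

_COMPILED = {
    name: [re.compile(pattern) for pattern in patterns]
    for name, patterns in FUNDER_PATTERNS.items()
}


def normalize_funder(raw):
    """Map a raw acknowledgement string to a canonical funder name, or None."""
    text = raw.lower().replace(".", "")        # "N.I.H." -> "nih"
    text = re.sub(r"[^a-z0-9]+", " ", text).strip()
    for name, patterns in _COMPILED.items():
        if any(pattern.search(text) for pattern in patterns):
            return name
    return None


def count_funded(papers, funder):
    """Count papers with at least one agency string that maps to `funder`.

    `papers` is an iterable of lists of raw agency strings, one list per paper.
    """
    return sum(
        any(normalize_funder(agency) == funder for agency in agencies)
        for agencies in papers
    )
```

Pattern matching of this kind is only a starting point: it misses misspellings and can produce false positives for agencies with similar names, so the results would still need spot checks.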
Raw Data
Provided by Thomson Reuters in the form of 166 XML files (561 GB).
The 127-page documentation of the XML data format is available on the Enclave at
/WoS/Documentation.
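At 561 GB across 166 files, the raw files average roughly 3.4 GB each, so loading one fully into memory is impractical. A streaming parser is one common way to handle files of that size. The sketch below uses Python's standard-library `iterparse`. The element names `REC` and `UID` are assumptions made for illustration and should be checked against the format documentation.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace, turning '{uri}REC' into 'REC'."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Return the text of the first direct child called `name`, or None."""
    for child in elem:
        if local_name(child.tag) == name:
            return child.text
    return None


def iter_records(source, record_tag="REC"):
    """Yield record elements one at a time from a file path or file object.

    `record_tag` is an assumed element name, not taken from the documentation.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) == record_tag:
            yield elem
            elem.clear()  # free the record's content once the caller is done


def count_records(source, record_tag="REC"):
    """Count records without holding the whole file in memory."""
    return sum(1 for _ in iter_records(source, record_tag))
```

Each record must be read before the loop advances, because its content is cleared as soon as the next one is requested.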
WoS Database
• PostgreSQL database with a fully parsed set of ALL data.
• Database schema and Data Dictionary documentation for ease of use.
• Planned: Simple user interface for running common queries.
Table Schema is at /WoS/Documentation on the Enclave and will be added to the public
site soon.
Data Dictionary work is in progress (expected mid-to-end of Jan).
Reduced need for XML parsing will free up Enclave compute cycles for running network
analysis and visualization algorithms.
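To illustrate what a "common query" against a parsed database might look like, here is a small sketch that counts items per publication year. The table schema is not reproduced in these slides, so the table and column names (`publications`, `pub_year`) are hypothetical. Python's built-in `sqlite3` stands in for PostgreSQL only so that the example runs anywhere.

```python
import sqlite3


def publications_per_year(conn, first_year, last_year):
    """Return (year, item count) rows for an inclusive range of years."""
    sql = """
        SELECT pub_year, COUNT(*) AS n_items
        FROM publications              -- hypothetical table name
        WHERE pub_year BETWEEN ? AND ?
        GROUP BY pub_year
        ORDER BY pub_year
    """
    return conn.execute(sql, (first_year, last_year)).fetchall()


def demo_connection():
    """Build a tiny in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE publications (id TEXT PRIMARY KEY, pub_year INTEGER)"
    )
    conn.executemany(
        "INSERT INTO publications VALUES (?, ?)",
        [("a", 2012), ("b", 2013), ("c", 2013), ("d", 1999)],
    )
    return conn


if __name__ == "__main__":
    for year, n_items in publications_per_year(demo_connection(), 2012, 2013):
        print(year, n_items)
```

Against the real database, the same SQL would be sent through a PostgreSQL driver. The placeholder syntax may then differ from the `?` used here.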
Data Enclave
Setup
• Created by IU as a secure environment for working with the data.
• Powered by the Karst cluster and maintained by UITS administrators.
• Overseen by the Data Steward and the Data Advisory Board.
• Opened for testing in September 2015.
• Currently houses the raw XML files, along with statistical and analysis software.
Usage
• Log in from campus (or via VPN) using any computer.
• Use the virtual desktop setup to run queries and analyses and to generate visualizations.
• To extract data/results, save files in the dedicated /mbx directory for your research team
and contact the Data Steward.
• So far, over twenty data extraction requests have been issued, and the average
response time is less than 30 minutes (the median response is less than ten minutes).
• Consider writing papers in the enclave to reduce data extraction requests.
Data Access
This data can be used by any employee of Indiana University for academic research
and without any sharing of data. Details about the data and how to access it
are available at http://www.indiana.edu/~iuni/resources/wos.html
To request data access, please complete the IUNI Web of Science Data (WoS) Access
Request Form.
So far, five teams have requested and gained access to the data.
Data Enclave Screenshot
WoS Data Usage
The WoS data is a core asset of the IUNI Science of Science Research Hub. It is essential
for R&D related to the Science Observatory, see http://iuni.iu.edu/about.
Ultimately, the Science Observatory will provide an extensible, secure infrastructure and
expertise to study the science, technology, and innovation system in near real-time and
to communicate results to a broad audience of researchers, practitioners, educators,
patients, and interested policy makers.
The IUNI Science of Science Research Hub is collaborating in/leading the following grant
development efforts:
• NSF Mid-West Data HUB - "Impact Assessment" and "Community Engagement"; IU
lead is Beth Plale, SOIC
• NSF EAGER XDMoD Value Analytics (with UITS)
• NWB on Jetstream/XSEDE (with UITS)
• "Science of Science" NSF Expeditions in Computing led by Alex Vespignani, NEU
• "Science Observatory" NSF Science and Technology Center led by Katy Börner, SOIC
• NSF NRT proposal led by Luis Rocha, SOIC
Please let us know how we can maximize the value of the IUNI WoS data for your R&D.
Planned WoS Data Usage
Special agreements exist for using WoS publications by IU faculty. Data can be used to
• Bulk-fill and pre-populate FAR-like systems on an annual basis.
• Visualize IU's impact since 1900 as part of the bicentennial celebrations.
• Serve Researcher Networking systems like http://nrn.cns.iu.edu that empower all
faculty, staff, students, alumni, and funders to identify expertise based on publications
as well as funding and teaching data provided by IU production systems.
The IUNI Science of Science Research Hub is currently
• Working on the deployment of the Network Workbench that is used by 100k experts
around the globe on Jetstream/XSEDE, see the "IU to launch Jetstream" video at
https://www.youtube.com/watch?v=t63c12A6bds.
• Learning Analytics, IU Grand Challenges proposal
• Organizing an NSF SciSIP agenda-setting conference on “Modelling Science, Technology,
and Innovation” that will take place at NAS, DC on May 17-18, 2016, see
http://modsti.cns.iu.edu
• Prototyping moderated “Science and Technology Forecasts” in collaboration with
Journalism, see next slide.
Science Forecast
S1:E1, 2015