Making sense of Interest Group/Working Group Activity
by RDA Technical Advisory Board

Beth Plale
Professor of Data Science
Indiana University, USA

With special thanks to RDA/US Fellow Nic Weber

Technical Advisory Board Members
TAB is an elected body
• Beth Plale, co-chair (US)
• Andrew Treloar, co-chair (Australia)
• Bridget Almas (US)
• Carole Palmer (US)
• Chuang Liu (China)
• Francoise Genova (France)
• Jamie Shiers (Switzerland)
• Peter Fox (US)
• Peter Wittenburg (Germany)
• Rainer Stotzka (Germany)
• Simon Cox (Australia)
• Susanna-Assunta Sansone (UK)

TAB: what it does
• Case statement review: reviews and guides case statement creation
• Liaison: engages and supports IG/WG activity. Hosts plenary IG/WG Chairs meetings. Each IG/WG has a liaison. Cross-group coordination.
• Plenary planning: with an eye towards minimizing overlap and encouraging quality proposals
• Socio-technical vision and strategy: technical scope of RDA, issues of productivity
  – e.g., 30% are Working Groups and 70% are Interest Groups. Is that the right balance?

RDA P6: 60 working groups and interest groups
60 WGs and IGs is a lot of activity. How can a newcomer possibly make sense of RDA?

Conceptualizing RDA Activity through Clustering: A Brief History
• RDA TAB undertook an effort, begun in 2014 under the lead of TAB co-chair Dr. B. Plale, to better illuminate the collective activity of RDA
• Sources of information influencing the clustering
  – Analysis of WG/IG stated objectives and other information
  – Numerous discussions with WG/IG chairs and community
  – Multiple earlier versions of clustering, none of which was quite comprehensive or illuminating enough

Clustering Purpose
• Guide newcomers to products in progress that interest them, and to groups to which they can contribute
• Help those outside RDA see the scope of RDA's solution space
• Guide RDA members in identifying gaps and overlaps
• Help TAB in guidance and evaluation of existing and new groups

Clustering along two dimensions
• Beneficiary dimension: spectrum from data provider to data consumer
  – Primary beneficiary is the data provider (or the act of data provisioning) at one end of the spectrum, or the data consumer at the other end
• Solution dimension: spectrum from technical to social/organizational
  – Solution manifests itself most strongly as software or infrastructure (technical) on one hand, or as policy, organizational, governance, educational, or community building (social) on the other
• Resulting quadrants:
  – Social/organizational solution aimed at data provider
  – Social/organizational solution aimed at data consumer
  – Technical solution aimed at data provider
  – Technical solution aimed at data consumer

Placing activity on grid
• Self-identification/positioning by WG/IG chairs
• Each activity is represented as a single point in grid space, scored from 0 to 100 in each dimension
• The following graphs cover those WGs/IGs that have responded to inquiries so far (about 50% have responded)
• Graph panels: Social/organizational + Data Consumer; Technical + Data Consumer; Technical + Data Provider; Social/organizational + Data Provider

Terms to further describe
• Terms are used to further describe the activity of each WG/IG
• Terms are drawn from, but not limited to, the Data Practices and Curation Vocabulary (DPCVocab)

For 34 groups who have replied with their info. Location: Q1: UR (upper right), Q2: LR (lower right), Q3: LL (lower left), Q4: UL (upper left).
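The grid placement described above can be sketched in code. This is a minimal sketch, not the TAB's actual procedure. It assumes a score of 0 means data provider (beneficiary axis) or technical (solution axis), that 100 means data consumer or social/organizational, and that the quadrant boundary sits at 50. The function name `quadrant`, the `midpoint` parameter, and the tie-breaking rule are hypothetical, chosen so that the sample rows from the term-assignment table land in the quadrants listed for them.

```python
def quadrant(beneficiary: float, solution: float, midpoint: float = 50.0) -> str:
    """Map a (beneficiary, solution) score pair to a quadrant label Q1..Q4.

    Assumed orientation (inferred from the table, not stated in the slides):
      beneficiary: 0 = data provider, 100 = data consumer
      solution:    0 = technical,     100 = social/organizational
    """
    for score in (beneficiary, solution):
        if not 0 <= score <= 100:
            raise ValueError(f"score {score} is outside the 0-100 grid")

    # Tie-breaking is an assumption: a beneficiary score exactly at the
    # midpoint counts as consumer, a solution score at the midpoint as technical.
    consumer = beneficiary >= midpoint
    social = solution > midpoint

    if consumer:
        return "Q1" if social else "Q2"  # upper right / lower right
    return "Q4" if social else "Q3"      # upper left / lower left


# Sample rows taken from the term-assignment table.
print(quadrant(65, 95))  # Community Capability Model IG -> Q1
print(quadrant(80, 30))  # Agriculture Data Interest Group (IGAD) -> Q2
print(quadrant(40, 45))  # Data in Context IG -> Q3
print(quadrant(42, 80))  # Archiving multimedia ... data and projects -> Q4
print(quadrant(50, 50))  # Metadata IG, on the boundary -> Q2
```

Because chairs position their own groups, a borderline group such as Metadata IG (50, 50) may have been assigned its quadrant by judgment rather than by any fixed rule; the tie-breaking here only reproduces what the table shows.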
Color coded by quadrant, with WGs in dark.

Term Assignment. Orange: social/consumer; Blue: technical/consumer. Terms chosen by group to describe activity more precisely than name alone.

Full list of term assignments to date:

| WG/IG Name | Quadrant | Beneficiary | Solution | Keywords |
|---|---|---|---|---|
| Community Capability Model IG | Q1 | 65 | 95 | Data Management, Data Literacy, Education |
| Data for Development | Q1 | 58 | 63 | Discovery, Knowledge Organization / Representation, Education, Data Literacy |
| Development of cloud computing capacity and education in developing world research | Q1 | 60 | 60 | Education, Research Practices |
| Long tail of research data IG | Q1 | 66 | 66 | Interoperability, Data Fabric, Knowledge Organization / Representation |
| Quality of Urban Life Interest Group | Q1 | 86 | 60 | Governance, Education, Values / Ethics |
| RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World | Q1 | 95 | 95 | Education, Data Literacy, Research Practices |
| Agriculture Data Interest Group (IGAD) | Q2 | 80 | 30 | Interoperability, Data Fabric, Knowledge Organization / Representation |
| Big Data IG | Q2 | 65 | 15 | Big Data, Interoperability, Discovery, Values / Ethics |
| Biodiversity Data Integration IG | Q2 | 51 | 5 | Interoperability, Data Brokering, Knowledge Organization / Representation |
| Brokering IG | Q2 | 70 | 25 | Interoperability, Data Brokering, Knowledge Organization / Representation |
| Data Description Registry Interoperability (DDRI) | Q2 | 55 | 15 | Interoperability, Brokering, Registry |
| ELIXIR Bridging Force IG | Q2 | 60 | 1 | Interoperability, Data Brokering, Infrastructure |
| Marine Data Harmonization IG | Q2 | 52 | 10 | Interoperability, Data Brokering, Infrastructure |
| Metadata IG | Q2 | 50 | 50 | Knowledge Organization / Representation, Discovery, Metadata |
| Structural Biology IG | Q2 | 85 | 15 | Data Brokering, Knowledge Organization / Representation, Metadata |
| Toxicogenomics Interoperability IG | Q2 | 55 | 5 | Interoperability, Data Brokering, Data Management, Metadata |
| Wheat Data Interoperability WG | Q2 | 80 | 5 | Interoperability, Data Brokering, Data Management, Metadata |
| Data in Context IG | Q3 | 40 | 45 | Values / Ethics, Research Practices, Big Data |
| Data Type Registries WG | Q3 | 45 | 45 | Registry, Interoperability, Infrastructure |
| Domain Repositories Interest Group | Q3 | 10 | 40 | Infrastructure, Data Management, Data Dissemination / Publication |
| PID IG | Q3 | 11 | 25 | Knowledge Organization / Representation, Discovery, Metadata |
| PID Information Types WG | Q3 | 15 | 25 | Knowledge Organization / Representation, Discovery, Metadata |
| Practical Policy WG | Q3 | 40 | 45 | Governance, Research Practices |
| Preservation e-Infrastructure IG | Q3 | 10 | 20 | Infrastructure, Data Management, Data Dissemination / Publication |
| RDA/WDS Publishing Data Workflows WG | Q3 | 8 | 10 | Data Dissemination / Publication, Knowledge Organization / Representation |
| Repository Platforms for Research Data | Q3 | 12 | 10 | Infrastructure, Data Management, Data Dissemination / Publication, Research Practices |
| Archiving multimedia interactive/dynamic data and projects | Q4 | 42 | 80 | Data Brokering, Governance, Data Management |
| Data Foundation and Terminology WG | Q4 | 30 | 75 | Interoperability, Data Literacy, Data Fabric |
| RDA/CODATA Legal Interoperability IG | Q4 | 38 | 90 | Interoperability, Governance |
| RDA/WDS Publishing Data Cost Recovery for Data Centres | Q4 | 10 | 90 | Data Dissemination / Publication, Knowledge Organization / Representation |
| RDA/WDS Publishing Data Services WG | Q4 | 10 | 80 | Data Dissemination / Publication, Research Practices, Data Management |
| Repository Audit and Certification DSA–WDS Partnership WG | Q4 | 25 | 55 | Governance, Infrastructure, Registry |
| Service Management IG | Q4 | 10 | 90 | Data Management, Governance |
| Standardisation of Data Categories and Codes WG | Q4 | 1 | 90 | Interoperability, Data Fabric, Knowledge Organization / Representation |

Summary
• Clustering has exposed relatively equal representation of WG/IG activity in each category
• WG activity more heavily concentrated in technical dimension.
  TAB is discussing solutions to stimulate WG activity on the social/organizational dimension
• RDA/US Fellow: building clustering into a new web-enabled tool to explore RDA activity for the RDA site
• RDA/US Fellow: gathering additional information to study RDA (WG/IG engagement, e.g., profiles of those engaged based on organizational affiliation)
• Whitepaper on clustering in preparation
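The two Summary observations can be checked against the term-assignment table with a simple tally. The sketch below is illustrative only. The group names and quadrant labels are transcribed from the table, while the variable names, the `is_working_group` helper, and the rule that a name ending in "WG" marks a Working Group are assumptions. It also assumes, based on the quadrant layout, that Q1 and Q4 form the social/organizational half and Q2 and Q3 the technical half.

```python
from collections import Counter

# Group names by quadrant, transcribed from the term-assignment table.
GROUPS_BY_QUADRANT = {
    "Q1": [
        "Community Capability Model IG",
        "Data for Development",
        "Development of cloud computing capacity and education in developing world research",
        "Long tail of research data IG",
        "Quality of Urban Life Interest Group",
        "RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World",
    ],
    "Q2": [
        "Agriculture Data Interest Group (IGAD)",
        "Big Data IG",
        "Biodiversity Data Integration IG",
        "Brokering IG",
        "Data Description Registry Interoperability (DDRI)",
        "ELIXIR Bridging Force IG",
        "Marine Data Harmonization IG",
        "Metadata IG",
        "Structural Biology IG",
        "Toxicogenomics Interoperability IG",
        "Wheat Data Interoperability WG",
    ],
    "Q3": [
        "Data in Context IG",
        "Data Type Registries WG",
        "Domain Repositories Interest Group",
        "PID IG",
        "PID Information Types WG",
        "Practical Policy WG",
        "Preservation e-Infrastructure IG",
        "RDA/WDS Publishing Data Workflows WG",
        "Repository Platforms for Research Data",
    ],
    "Q4": [
        "Archiving multimedia interactive/dynamic data and projects",
        "Data Foundation and Terminology WG",
        "RDA/CODATA Legal Interoperability IG",
        "RDA/WDS Publishing Data Cost Recovery for Data Centres",
        "RDA/WDS Publishing Data Services WG",
        "Repository Audit and Certification DSA–WDS Partnership WG",
        "Service Management IG",
        "Standardisation of Data Categories and Codes WG",
    ],
}

# Assumption: solution is the vertical axis, so the upper quadrants (Q1, Q4)
# are social/organizational and the lower quadrants (Q2, Q3) are technical.
SOLUTION_HALF = {
    "Q1": "social/organizational",
    "Q2": "technical",
    "Q3": "technical",
    "Q4": "social/organizational",
}


def is_working_group(name: str) -> bool:
    """Heuristic (an assumption): a name ending in 'WG' marks a Working Group."""
    return name.endswith("WG")


# Number of responding groups in each quadrant.
groups_per_quadrant = {q: len(names) for q, names in GROUPS_BY_QUADRANT.items()}

# Number of Working Groups in each half of the solution dimension.
wgs_per_half = Counter(
    SOLUTION_HALF[q]
    for q, names in GROUPS_BY_QUADRANT.items()
    for name in names
    if is_working_group(name)
)

print(groups_per_quadrant)  # {'Q1': 6, 'Q2': 11, 'Q3': 9, 'Q4': 8}
print(dict(wgs_per_half))   # {'technical': 5, 'social/organizational': 4}
```

The tally covers only the 34 responding groups, and groups whose listed name carries no "WG" suffix are counted as non-WGs, so the split is indicative rather than definitive.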