New Task Group
CRIS Architecture & Development

Maximilian Stempfhuber
RWTH Aachen University Library
stempfhuber@bth.rwth-aachen.de

Agenda
• A view to research information
• The role of CERIF
  – As a data model
  – In the CRIS development process
• Why a new Task Group?
• Task Group’s Mission

Information & Research Process
[Diagram. Process stages: Project proposal, Project, Results, Transfer, Wealth creation. Research information labels: Research strategy, Research programmes, Research organisations, (Human) Resources, Infrastructure, State of the Art, Work programme, Project management, Proposal management, Publication management, Experiments, Communication, Research data, Knowledge, Expertise, Software, Prizes, patents, Commercial products, Excellence, Wealth.]

CRIS: Semantic & Temporal Aspects
Current Research Information System
[Diagram of linked entities: Org. Unit, Person, Expertise, CV, Project, Funding Programme, Results, Publications, Events, Software, Patents.]

CERIF: A Data Model for CRIS
Common European Research Information Format
[Diagram of entities: Funding Programme, Person, Skills, Project, Publication, Organisation, Service, Equipment, CV, Patent, Product, Event, Classification (Semantics).]

CERIF: A Data Model for CRIS
Common European Research Information Format
• Entity Relationship Model
• Generators for several DBMS
• CERIF-XML as exchange format
• Code of Good Practice
• Commercial software systems
• Proprietary implementations

Same model… … different results

Current Process
[Flowchart with the columns Responsibility, Process and Controls; each acceptance step has Accept and Reject branches.]
Responsibility: Anyone, Project Manager, RTD-Promoter, Steering Group
Process: Concept, Review, Concept Acceptance, RIS Proposal, RIS Proposal Acceptance, RIS Design, RIS Development Acceptance, Information Processing, Information Processing Acceptance, Publishing/Distribution, Maintenance (ongoing)
Functions: Development, Publishing, Collection, Production, Marketing; Structured Output
Controls:
• Marketing Plan
  – Market analysis
  – Cost-benefit analysis / economic model
• RIS Proposal
  – Definition of purpose
  – Identification of users
  – Definition of content
• RIS Design Plan
  – Database specification
  – Structure and presentation
  – Classification and indexing
  – Search and navigation
• RIS Information Processing Plan
  – Data collection plan
  – Collection guidelines
  – Quality control plan
  – Acceptance test plan
• Distribution Plan
• Marketing Plan (revisited)
• Economic model (implementation)
• Maintenance Plan / Acceptance

Code of Good Practice
• Organizational view
• Covers whole process
• Waterfall-like

Missing Aspects
• (Software) Architecture
• Technology
• Reference Implementation

Looking Beyond… … the CRIS domain
• Administrative systems at the institution
• Local information systems (OAR etc.)
• Community systems (ResearchGate etc.)
• Clusters of Excellence (IDEA League)
• Virtual Organizations (Fraunhofer, Helmholtz, Leibniz, Max-Planck)

CERIF-CRIS Connectivity
[Diagram: several CERIF-CRIS instances, each holding Projects, Persons, Org. Units, Publications, Events, Research Programmes, etc., linked by CERIF-XML to one another and to the surrounding systems: Institutional Repository, Research Data Repository, Finance, Human Resources, Project Management, Community CRIS.]

euroCRIS Strategy
• Enhance existing CRIS
• Connect CRIS with a common CERIF wrapper
• Create standardized, reusable services
• Fill gaps with new CRIS

The Gap… … between model and implementation
[Diagram: a scale of "Agreements, Standards, Best Practices, Re-Use" running from low to high, with Concrete System, Code of Good Practice and CERIF placed along it.]

What’s Missing?
[Layer diagram with the labels: User Interface; Business Logic; Search, Harvesting, Services; Data Access Layer; Database Management System; Operating System. CERIF, CERIF-XML and the Code of Good Practice are marked against the layers.]

Why is it important?
20 : 80 (CERIF : Development & Testing)

What can be gained… … for euroCRIS as an organization?
• Community building
• Exchange
• Reuse
• Evolution
• Spreading ideas & Connectivity
… beyond CERIF

What can be gained… … for euroCRIS members?
• Using building blocks
• Reducing development & testing
• Getting additional functionality
• Opening one's system & content
… even in combination with commercial software

Requirements
• Requirements engineering
• (Functional) Software specification
• Code of Good Practice (Updated)
• Best Practice Examples / DRIS (Updated)
• Available (commercial) solutions

Database Systems
• Paradigms (Relational, Object-Oriented, XML, multi-dimensional DBMS)
• Systems (IBM DB2, Oracle, PostgreSQL; commercial vs. Open Source)
• Interfaces (ODBC, JDBC, Perl DBI, PEAR)
• Query languages (SQL, OQL, XQuery)
• Schema evolution / migration

Database Abstraction
• Separating software architecture from the (physical) database model
• Encapsulation vs. normalization
• Object-Relational-Mapping (ORM)
• Schema evolution / migration
• Convention over configuration (Coding by convention) & tool support

Programming & Managing
• Re-use of modules and libraries
• Generating CRIS Open Source code base
• Share experience with colleagues
  – Scalability (e.g. middleware)
  – Reliability (e.g. components)
  – Integrated Development Environments (IDE)
  – Development process (SCRUM, V-Model, MDA, …)

Software Architecture
• Permanent evolution vs. re-use
  – Development philosophy → architecture
  – Domain modeling → architecture
  – Software frameworks → architecture
  – Tools support → architecture
  – Programming languages → architecture
• Current buzz words: SOA/REST, Cloud Computing, RIA, BPM, Portal/CMS

Functional Modules
• Self containment
• Standardized interfaces
• Standardized functionality
• Standardized input (e.g. CERIF-XML)
• Standardized output?
• CRIS plug-in architecture needed

Workflow
• Business Process Modeling (BPM)
• Workflows at the UI level
• Quality assurance in CRIS
• Event/data-driven services
• Drives re-usable software modules (e.g. input verification, data acquisition) & processes

User Interface
• Common / consistent user experience
• Re-use of interaction patterns
• Sharing solutions (e.g. CSS frameworks)
• Sharing knowledge (e.g. accessibility)
• Integration of CRISs and services

Information Design
• Common ways for expressing
  – Semantic relationships
  – Temporal aspects
  – Qualities & quantities
• Software modules for visualizations
  – Network graphs
  – Timelines
  – Charts, …
• Experiences with commercial software

Statistics & Reporting
• Defining recurring information needs
• Standardizing on basic data formats
• Statistics / reporting as a (re-usable / commercial) service
• Software modules
• Layout templates (e.g. XSLT, XML FO)

External Access
• Defining public CRIS services
  – Functional specification
  – Interface specification
  – I/O format specification
• Services
  – Searching for entities
  – Data analysis / information extraction

Data Exchange
• Harvesting interfaces
• Entity extraction
• Replication
• Federation
• Schema mapping

TG Roadmap
• Establishing TG Mission
• Recruiting TG Members
• Initial Survey: Where are we now? Where are we going?
  – Technologies used (DBMS, languages etc.)
  – Methodologies used (SOA, SCRUM, outsourcing etc.)
  – Gap analysis: Topics for support & exchange, common modeling of CRIS architectures, abstraction layers, module specifications etc.
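
Illustration: module specification sketch
The module specifications named in the gap analysis, together with the plug-in architecture and standardized interfaces called for under Functional Modules, might look roughly like the following. This is only a minimal sketch under stated assumptions: `Record`, `CrisModule`, `PluginRegistry` and the two example modules are hypothetical names invented for illustration, and `Record` is a simplified stand-in, not the CERIF schema or CERIF-XML.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Protocol


@dataclass(frozen=True)
class Record:
    """Simplified stand-in for a CRIS entity (hypothetical, not the CERIF schema)."""
    entity: str  # e.g. "Person", "Project", "Publication"
    id: str
    title: str


class CrisModule(Protocol):
    """Hypothetical standardized interface that every functional module implements."""
    name: str

    def run(self, records: Iterable[Record]) -> List[str]:
        ...


class InputVerification:
    """Example module: reports records whose title is empty."""
    name = "input-verification"

    def run(self, records: Iterable[Record]) -> List[str]:
        return [f"{r.entity} {r.id}: missing title"
                for r in records if not r.title.strip()]


class EntityCount:
    """Example module: counts records per entity type for a simple report."""
    name = "entity-count"

    def run(self, records: Iterable[Record]) -> List[str]:
        counts: Dict[str, int] = {}
        for r in records:
            counts[r.entity] = counts.get(r.entity, 0) + 1
        return [f"{entity}: {n}" for entity, n in sorted(counts.items())]


class PluginRegistry:
    """Holds self-contained modules and runs them on the same standardized input."""

    def __init__(self) -> None:
        self._modules: Dict[str, CrisModule] = {}

    def register(self, module: CrisModule) -> None:
        if module.name in self._modules:
            raise ValueError(f"module already registered: {module.name}")
        self._modules[module.name] = module

    def run_all(self, records: Iterable[Record]) -> Dict[str, List[str]]:
        data = list(records)  # materialize once so every module sees all records
        return {name: module.run(data) for name, module in self._modules.items()}


if __name__ == "__main__":
    registry = PluginRegistry()
    registry.register(InputVerification())
    registry.register(EntityCount())
    sample = [
        Record("Person", "p1", "Ada"),
        Record("Project", "pr1", ""),
        Record("Publication", "pub1", "A paper"),
    ]
    for module_name, lines in registry.run_all(sample).items():
        print(module_name, lines)
```

Because each module depends only on the shared interface and input type, a module written at one institution could in principle be registered in another CRIS unchanged; whether the output should be standardized in the same way is the open question raised on the Functional Modules slide.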