Tutorial Outline 30’ 50’ 80 20’ • From Content Management Systems to VREs • Creating a VRE • Using a VRE • Conclusions 1 www.d4science.eu RCDL 2008 10 October 2007 Dubna (Russia) From Content Management Systems to VREs: requirements, challenges, and opportunities Pasquale Pagano CNR-ISTI Pasquale.pagano@isti.cnr.it www.d4science.eu Session Outline • Towards VREs A bit of history Delos DL Reference Model VREs, VOs, and e-Infrastructures D4Science Vision • gCube System Characteristics Technological Complexity and Solution Architecture • gCube Standards and Technology SOI, WSRF gCore • gCube Enabling Services Features Opportunities 3 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu The evolution 2006 Virtual Research Environments Digital Library Management System 2003 consumer and data provider 2001 Digital Library Repository + Catalogue + Search service consumer and resource provider consumer 1996 consumer few large institutions few small institutions many small institutions many virtual organizations 4 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu The DL Universe BRICKS From Content Management Systems to VREs 10 October 2008, Dubna (Russia) 5 www.d4science.eu DELOS: Digital Library definition A (potentially virtual) organization that comprehensively collects, manages, and preserves for the long term rich digital content and offers to its user communities specialized functionality on that content, of measurable quality, and according to prescribed policies * DELOS Reference Model for Digital Libraries 6 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu DL Reference Model: the 7 Domains The Reference Model is founded on 6+1 domains: Resource – captures generic characteristics (super-domain) Content – information available User – actors interacting with system Functionality –operations supported Policy – rules and conditions governing operation Quality – qualitative & quantitative characterisations of system Architecture –physical software (and hardware) constituents realising the DL * DELOS Reference Model forconcretely Digital Libraries 7 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu DL Reference Model: the 7 Domains [cont.] * DELOS Reference Model for Digital Libraries From Content Management Systems to VREs 10 October 2008, Dubna (Russia) 8 www.d4science.eu Reference Frameworks 9 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Virtual Research Environments (VRE) • Distributed frameworks for carrying out cooperative activities like “in silico experiments”, data analysis and processing, production of new knowledge using specialized tools • Largely based on retrieval and access of always updated knowledge from diverse heterogeneous content sources • Produce knowledge that is preserved and made available for other usages inside and outside the VRE 10 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Highly dynamic, created and dismissed on-demand From Content Management Systems to VREs 10 October 2008, Dubna (Russia) Informaion Service VDL Generator Annotation Metadata Broker Content Security Operating on new information objects 11 www.d4science.eu ImpECt Portal Arte Portal Process Optimization Process Execution & Reliability Process Design & Verification Feature Extraction Service Search Service Index Service Personalization CSDS Data Fusion Metadata Management M26 Wrapper & Monitor Content Management 1,2 DVOS Keeper Broker & Matchmaker Virtual Research Environments: characteristics 0,8 0,6 1 PrototypeAvailableBuild 0,4 0,2 0 Based on specialised tools which support the generation of new knowledge Virtual Research Environments: the vision • A Virtual Research Environment (VRE) provides a framework of applications, services and data sources dynamically identified to support the underlying processes of research/collaboration/cooperation. The purpose of a VRE is to help researchers* belonging to Virtual Organization by managing the increasingly complex range of tasks involved in carrying out their activities. *Researcher has to be considered in the large, i.e. end-user, decision-makers, resource and data providers, etc. 12 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Virtual Organization (VO) • A Virtual Organization (VO) models sets of users and resources belonging to a e-Infrastructure. It defines clearly and carefully what is shared, who is allowed to share, and the conditions under which sharing occurs, usually based on an authentication and authorization policies. VOs may have a limited lifetime and they are dynamically created to satisfy transient needs of the constituent potentially heterogeneous actual Organizations. 13 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu e-Infrastructure • An infrastructure is the basic physical and organizational structures and facilities (roads, power supplies, ..) needed for the operation of a society or enterprise • An e-Infrastructure provides support for effective consumption of shared resources: hardware-bound resources (i.e. networks, storage, instruments, and computational resources), system-level software resources (i.e. basic middleware services), and application-level software resources (i.e. data sources and services). These e-Infrastructures offer mechanisms that concurrently exploit networks, grids and data in a seamless fashion, and will thus enable scientific communities to operate within a coherent model, regardless of the location of their research facilities. 14 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu D4Science vision D4Science vision calls for the realization of scientific e-Infrastructures that will remove technical concerns from the minds of scientists, hide all related complexities from their perception, and enable users to focus on their science and collaborate on common research challenges gCube is a framework to manage distributed e-infrastructures where it is possible to define, host, and maintain dynamic virtual environments capable to satisfy the collaboration needs of distributed Virtual Organizations (VOs) 15 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube System 16 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Characteristics A Software framework to support ON-DEMAND virtual collaborations* among remote parties cost-effective, secure, dynamic, both short and long lived overcome ad-hoc systems alike to make discoverable and accessible computing, storage, data and service resources to promote and/or contribute to data and service integration * Research Environment From Content Management Systems to VREs 10 October 2008, Dubna (Russia) 17 www.d4science.eu gCube Technological Complexity Software framework needs a ‘middleware’ (typically distributed) is open by definition new resource types and/or new resource instances can be de/registered at any time is powerful if it supports application scope the portion of the infrastructure in which a resource exists the portion of the infrastructure in which a resource can act, operate, or has power or control is powerful if it supports sharing scope (controlled resource sharing) machines, storage, data and services resources 18 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Technological Solution • By relying on gLite, gCube is an e-Infrastructure enabling system to share computing, storage, data and service resources → g3 • gCube system allows collaborations in eScience strongly content-oriented, potentially data and processing intensive within the sharing scope of Virtual Organizations (VOs) broader and longer lived may stretch across the whole infrastructure or else over significant subsets thereof take place in Virtual Research Environments (VREs) scope interactively created, managed, defined, and used: system administrators, application designers, researchers typically short to medium lifespan 19 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Architecture gCube framework is composed by 4 main subsystems: Enabling Services definition and runtime management of VREs Information Organization Services storage, management, description, and annotation of information in a VRE Information Retrieval Services retrieval of information in a VRE Presentation Services VRE users interface with system and application services 20 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Standards and Technology 21 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Service Oriented Infrastructure Is a Virtualised IT Infrastructure which 1. exposes a catalog of services WS instead instead of running of running service service instances, instances, 2. supports Workflow SOA Application, definition and and execution, and 3. includes infrastructure resources such as compute, storage, and networking hardware and software (middleware) to support the running of services. gCube provides a production quality software framework to enable scientific e-Infrastructures empowered by a collaborative environment 22 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube & Standards for communication • Java WSCore, Apache Axis, and GridForum specifications (and implementation if any): WS-Notification, WS-Addressing, WS-Security WSRF WS-ResourceProperties (WSRF-RP) WS-ResourceLifetime (WSRF-RL) WS-BaseFaults (WSRF-BF) WS-ServiceGroup (WSRF-SG) WS-DAI, WS-DAIR, WS-DAIX 23 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Web Service Resource Framework (WSRF) Life time Properties Service Groups Notification Error Handling Web Services Unified way to model and interact with stateful web services Logic Logic WS State Statefull WS 24 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Web Service Resource Framework (WSRF) Life time Service Groups Properties Notification Error Handling Web Services Lifetime (WS-ResourceLifetime) Factory based dynamic creation of services Instances are created with a limited lifetime Prevent services from consuming resource indefinitely (“Garbage Collection”) Properties (WS-ResourceProperties) Defines type and values of a resource state From Content Management Systems to VREs 10 October 2008, Dubna (Russia) Service groups (WS-ServiceGroups) Describes an interface for operating on collections of WS-Resources E.g. to distribute an action to a set of services Notification (WS-Notification) Notification about state changes Applies traditional publish/subscribe paradigm Error Handling (WS-BaseFaults) Defines base handling of communication errors 25 www.d4science.eu WSRF: Advantages and Disadvantages Advantages Clear Description of Resources and Interfaces Dynamic sharing of resources On-demand services exploitation Cross-organizations trusted environment Widely accepted Web Service standards Disadvantages Reference implementations are still in development Several complementing specifications are in development Complex middleware requires maintenance and administration overhead 26 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Towards the usability: gCore To overcome the complexities of the design and implementation of SOI compliant services an application framework for the consolidation / development of existing/new gCube services the gCore Framework (gCF) To meet the needs of system administrators, infrastructure managers, and resource providers an easy to install self-contained service container for the participation to Service Oriented Infrastructures the gCore Container (gCore) 27 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCF facilities: partial overview re-deployement activation update models lifetime failure lifetime scoping scoping WS management State notification publication Calls persistence configuration scoping security faults port-types service 28 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube & Standards for communication • Mutual authentication based on GSI secure conversation (through delegation and renewal) • Business Process Execution Language for Web Services (WS-BPEL) • GridFTP and SRM support • VOMS for users and groups management • GWT and JSR168(JSR268 is coming) https://quality.wiki.d4science.researchinfrastructures.eu/quality/index.php/Standards 29 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Enabling Services 30 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Enabling Services • VRE Management services definition of VREs the dynamic deployment of VRE resources across the infrastructure • Software Repository service Storage and provision of deployable software components • Information Service publication of resources profile discovery of VRE resources through xPath, XQuery real-time monitoring subscription/notification Dynamic Virtual Organisation Support services robust and flexible security framework for managing VOs 31 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Enabling Services Production Failure Recovery WS Software gHN HW Repository Rapid deployment State Dynamic Load Balancing State State State State WS WS WS WS gHN HW CPU Usage 90% … WS State WS gHN HW gHN HW Service provision continuity gHN HW CPU Usage 30% Balancing utilization with head room 32 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu gCube Enabling Services gCube goes beyond other management systems by adding Service Management capabilities Provides solutions for system administrators to eliminate manual deployment overheads, eliminate manual environment configuration overheads, ensure optimal placement of services within the infrastructure support user community services orchestration Opens unique opportunities for outsourcing state-of-the-art service implementations 33 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu Conclusions • gCube infrastructures creates new opportunities to change the VRE development model used by distributed and dynamic organisations and communities • gCube offers an application framework for the development of Stateful Web Services (gCF) an easy to install self-contained service container (gCore) SOI middleware (Enabling services) Rapid deployment Failure recovery Load balance (soon) 34 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu http://www.gcube-system.org/ http://www.d4science.org/ 35 www.d4science.eu gCube & Standardization Bodies • ISO: data representation (e.g. ISO3166 for countries, ISO4217 for currencies) and metadata (ISO19115 for GIS) • OGF: Standards related to Architecture (e.g. OGSA), Data (e.g. DAIS, GridFTP), Management (e.g. GLUE, Resources Usage), Applications (e.g. DRMAA), Compute (e.g. JSDL) • OAI: Resources Exposure/Harvesting (OAI-PMH) Resources Exchange (OAI-ORE) • OASIS: Standards related to stateful web services (e.g. WSRF), process management (BPEL), remote user interfaces (WSRP), A&A (SAML / XACML) • W3C: All the standards related to Web Architecture (e.g., URI/URL, HTTP), Service Oriented Architectures (e.g. SOAP, WSDL, WSAddressing) and data representation and manipulation( e.g. XML*) • Others: Classification systems (e.g. ISSCAAP, ISSCFV, ISSCFG), features representation (e.g. GML for GIS), metadata (e.g. AgMES for Agricultural, SDMX for Statistics) 36 From Content Management Systems to VREs 10 October 2008, Dubna (Russia) www.d4science.eu