Data Mediation and OGSA-DQP in the context of @neurIST
Martin Koehler
Department of Scientific Computing
University of Vienna
http://www.aneurist.org

EU Project @neurIST
Integrated Biomedical Informatics for the Management of Cerebral Aneurysms
• Project duration: 2006-2009 (48 months)
• 33 Partners
• Budget: ~17.5 MEuro
Objectives:
• Development of a generic IT infrastructure for the management and processing of heterogeneous data associated with the diagnosis and treatment of cerebral aneurysms and subarachnoid haemorrhage.
• Transform the management of cerebral aneurysms by providing new insight, personalised risk assessment and methods for the design of improved medical devices and treatment protocols.

@neurIST Service-oriented Architecture
(Diagram labels: CA Client, Service Registry)
• Based on standard Web Services technology
– WSDL, SOAP, WS-Addressing
– End-to-End Security (own WS-Security implementation)
• Generic Grid Services
– Based on Vienna Grid Environment, GEMSS middleware
– Applications and data virtualized as Grid Services
  with uniform interface
• Application-level QoS support
– Dynamic negotiation of service level agreements (QoS contracts)
• Business Phase
– AAA, Security, Pricing Model
• QoS Negotiation Phase
– Estimate job capacity
– Negotiate QoS
– Exchange Contracts
• Job Handling Phase
– Upload input data
– Start job
– Monitor job
– Download results

@neurIST Data Services
• Virtualization of heterogeneous data sources as services
• Utilizes OGSA-DAI (internally)
• Interface compliant with @neurIST compute services
Different Variants
– Data Access Services: access to single data source
  (Diagram: Client, Data Service, DBS)
– Data Mediation Services: integration of multiple data sources
  (Diagram: Client, Data Service, Registry, DBS, CSV File)
Client Access Mechanism
• OGSA-DAI (perform document / response document)

@neurIST Scenario
(Diagram: Hospital 1, Hospital 2, Hospital 3, Hospital 4, ..., Hospital n; Grid Data Service; Grid Data Mediation Service; OGSA-DAI; Virtual Clinic)

Distributed Data Mediation Service
(Diagram: Central Data Mediation Service; Distributed Data Mediation Service; OGSA-DQP Coordination Service; GDMS Query Execution Engine; OGSA-DQP Evaluation Services; Data Services)

Extending DQP with Data Mediation capabilities
• Usage of a distributed query processing engine
– Distribution of query processing steps and data
– Support more complex queries (combine OGSA-DQP and data mediation)
– Improving query execution performance
• Provisioning of a global data schema
– Less complexity in data schema
– Providing location, schema, and language transparency
– Provide application/scenario specific views

Distributed Query Execution
• Queries against virtual data source are processed in three phases
• Distributed Query Plan Creation
– Query is parsed by OGSA-DQP and a distributed query plan is created
• Distributed Query Plan Mediation
– Query plan against the global schema is adapted to a query plan against the mediated
  sources
– TABLE_SCAN operators against virtual data source are exchanged with sub query plans mediating data to the global schema
– Query plan is updated according to the inserted operators
• Distributed Query Plan Execution
– Distributed query plan is executed by the OGSA-DQP query execution engine using evaluation services

Data Mediation Example
Mediated schema
• C: C1 (A1=B1), C2 (A2), C3 (B2)
• D: D1 (E1=F1), D2 (E2), D3 (F2)
Sources
• A: A1, A2
• B: B1, B2
• E: E1, E2
• F: F1, F2
Step 1: Provide static info to OGSA-DQP for Table C & D
Step 2: select d2, c2 from C, D where c.c1 = d.d1 and c.c2 = 'ABCDE'
Step 3 for each table: replace simple scan operation with query tree according to mediation schema
(Diagram: query tree of J and S operators over C and D, rewritten into a query tree over A, B, E, F with the predicate a2 = 'ABCDE')

@neurIST Security Infrastructure

@neurIST specific issues
• @neuInfo based on
– OGSA-DAI 2.2
– GDMS
– OGSA-DQP 3.2 Tech Preview
– Axis 1.2
• @neurIST Security System
– Based on Axis 1.2
– Security Token Services and Relationship Manager
– Token Delegation Support for Data Mediation Services
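
The plan mediation phase from the Data Mediation Example (each TABLE_SCAN on a virtual table is replaced with a sub-plan over its sources) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not OGSA-DQP or GDMS code: the Scan, Join and Project classes, the MEDIATION mapping and the mediate and source_tables functions are hypothetical names introduced here. Only the schema comes from the example (C built from A and B joined on A1=B1, D built from E and F joined on E1=F1). The selection c2 = 'ABCDE' and its rewriting to a2 = 'ABCDE' are left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Scan:
    """Table scan over a virtual or physical table (hypothetical operator)."""
    table: str


@dataclass(frozen=True)
class Join:
    """Equi-join of two sub-plans on left_col = right_col."""
    left: "Plan"
    right: "Plan"
    left_col: str
    right_col: str


@dataclass(frozen=True)
class Project:
    """Maps source columns to global-schema columns as (global, source) pairs."""
    child: "Plan"
    columns: Tuple[Tuple[str, str], ...]


Plan = Union[Scan, Join, Project]

# Mediation schema taken from the example: C and D are virtual tables.
MEDIATION: Dict[str, Plan] = {
    "C": Project(Join(Scan("A"), Scan("B"), "A1", "B1"),
                 (("C1", "A1"), ("C2", "A2"), ("C3", "B2"))),
    "D": Project(Join(Scan("E"), Scan("F"), "E1", "F1"),
                 (("D1", "E1"), ("D2", "E2"), ("D3", "F2"))),
}


def mediate(plan: Plan) -> Plan:
    """Replace every scan of a virtual table with its mediating sub-plan."""
    if isinstance(plan, Scan):
        # Scans of physical tables are not in MEDIATION and stay unchanged.
        return MEDIATION.get(plan.table, plan)
    if isinstance(plan, Join):
        return Join(mediate(plan.left), mediate(plan.right),
                    plan.left_col, plan.right_col)
    return Project(mediate(plan.child), plan.columns)


def source_tables(plan: Plan) -> List[str]:
    """List the tables scanned by a plan, left to right."""
    if isinstance(plan, Scan):
        return [plan.table]
    if isinstance(plan, Join):
        return source_tables(plan.left) + source_tables(plan.right)
    return source_tables(plan.child)


# Plan for the join in Step 2 (c.c1 = d.d1) against the global schema.
global_plan = Join(Scan("C"), Scan("D"), "C1", "D1")
mediated_plan = mediate(global_plan)
print(source_tables(mediated_plan))  # ['A', 'B', 'E', 'F']
```

After mediation the plan no longer refers to C or D: every leaf is a scan of a source table, which is the form the execution phase can distribute across evaluation services.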