NeSC Middleware Workshop, July 22-23 2004
www.eu-egee.org

Discovery and Monitoring of Services using R-GMA
Abdeslem DJAOUI / RAL
EGEE is a project funded by the European Union under contract IST-2003-508833

Content
• Background on R-GMA
• Service discovery and monitoring using R-GMA
• Service interface characteristics

GMA separates information source discovery from transfer to a sink
• GMA: see GGF document GFD-I.7
• Producer: sends messages
• Consumer: receives messages
• Combined C/P
[Diagram: a replicated Directory holds Producer metadata and Consumer metadata; messages go directly from Producer (P) to Consumer (C).]

R-GMA
• A relational implementation of GMA
  - Producers announce: SQL “CREATE TABLE”; publish: SQL “INSERT”
  - Consumers collect: SQL “SELECT”
  - Powerful data model and query language
    • All data modeled as tables
    • SQL can express most queries in one expression
    • Users don’t have to construct complex SQL; front ends that automate the process can be used.
• Creates impression that you have one RDBMS per VO
  - Ability to issue global queries across all information

R-GMA: producers, consumers, registry and schema
• Registry has two main tables:
  - Producer
    • Table name
    • Predicate
    • Location
  - Consumer
    • Query
    • Location
• Schema holds description of tables
  - Column names and types of each table
• Registry predicate defines subset of “global” table
[Diagram: user services (Producer, Consumer), Registry and Schema; arrow labels: “execute or stream”, “Store location”, “Look up location”, “Store table description”.]

Contributions to the “global” table

CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 1)   WHERE country = ’UK’ AND site = ’RAL’
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 3)   WHERE country = ’CH’ AND site = ’CERN’
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002

Queries over “global” table: merging streams
SELECT * FROM CPULoad WHERE country = ’UK’

CPULoad (Consumer)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002

• Mediator handles P/C matchmaking and merges information from multiple producers for queries on one table

Service Discovery: Service table
• Service table
column definitions:
  - Endpoint: VARCHAR(255) - URI to contact the service
  - Type: VARCHAR(50) - Type of service
  - MajorVersion: INT - Major version number
  - MinorVersion: INT - Minor version number
  - PatchVersion: INT - Patch version number
  - Site_Name: VARCHAR(100) - Name of the site
  - WSDL: VARCHAR(255) - URL of WSDL describing the service
  - Semantics: VARCHAR(255) - URL of detailed description

Service monitoring: ServiceStatus table
• ServiceStatus table column definitions:
  - Endpoint: VARCHAR(255) - URI to contact the service
  - Status: INT - Status code; 0 means the service is up
  - Message: VARCHAR(255) - Human-readable indication of the service status

Queries over “global” table: joining tables
SELECT S.Endpoint, S.Site_Name FROM Service S, ServiceStatus SS WHERE (S.Endpoint = SS.Endpoint AND SS.Status = 0)

Service/ServiceStatus (Consumer)
Endpoint     Site_Name
gppse02      RAL

Service
Endpoint     Type  Site_Name  …
gppse01      SE    RAL        …
gppse02      SE    RAL        …
lxshare0404  SE    CERN       …

ServiceStatus
Endpoint     Status  Message
gppse01
gppse02      0       Service is up
lxshare0404

R-GMA services for EGEE users
• Producer services
  - Used for publishing information
  - Advertise the type of information by declaring a table
  - A predicate (SQL WHERE clause) can define the precise subset of the global table to publish
  - 3 types
    • Primary
    • Secondary
    • On-Demand
• Consumer service
  - Used as a sink for information
  - Defined by a single query (SQL SELECT statement)
  - Query types
    • Continuous
    • One-time
      - History
      - Latest

Producers
• Primary and Secondary Producers support
  - History Queries
    • over time sequenced data
  - Latest Queries
    • correspond to intuitive idea of current information
  - Continuous Queries
    • as soon as new data becomes available it is broadcast to all interested
      parties
• On-Demand Producers
  - Static queries (similar to a standard query to a database)

Producer Properties
• Primary or Secondary Producer may use:
  - Memory
    • Gives best performance for continuous queries
  - File
    • Data has a good chance of being recovered after machine crash
    • Fair performance
  - Database
    • Poor performance for inserts and continuous queries
    • Best chance of data recovery after machine crash
    • Best performance for joins

Three Kinds of Query
[Figure: tuples are inserted into a Producer; select paths are shown for a Continuous Query, a History Query and a Latest Query.]

Service: PrimaryProducer
• A client uses a PrimaryProducer to publish information into R-GMA
• Availability of code
  - URL - http://hepunx.rl.ac.uk/egee/jra1-uk/
  - License - EGEE
  - Support - Through EGEE (OMII in the future)
• SOA Model: WS
  - WS-I compliant WSDL

PrimaryProducer Service Operations
• For latest Java API see
  - http://hepunx.rl.ac.uk/egee/jra1-uk/

Service: Consumer
• A client uses a Consumer to retrieve information from one or more producers
• Availability of code
  - URL - http://hepunx.rl.ac.uk/egee/jra1-uk/
  - License - EGEE
  - Support - Through EGEE (OMII in the future)
• SOA Model: WS
  - WS-I compliant WSDL

Consumer Service Operations
• For latest Java API see
  - http://hepunx.rl.ac.uk/egee/jra1-uk/

What do you use to build your service? (i.e. How ‘standard’ is your service?)
NB: A low score means less risk & more mainstream
• Widely Implemented Standard Specification (1pt) - 1
  - <Demonstrable Multiple Implementations, e.g. SOAP, WSDL>
• Implemented draft specification (2pt)
  - <Specification in standards body and supported by most/many companies. One/few implementations exist (e.g., WS-Security, BPEL)>
• Implemented draft specification (3pt)
  - <Specification in standards body but alternatives exist. Industry is divided. One/few implementations exist (e.g., Transactions, coordination, notification, etc.)>
• Implemented proposal (4pt)
  - <An implementation of an idea, a proposal but not submitted to standards body yet (e.g., WS-Addressing, WS-Trust, etc.)>
• Non-implemented proposal (5pt)
  - <An idea that exists as a white paper, but no code and no specification details>
• Concept (6pt)
  - <An idea that exists only as PowerPoint slides!!>
TOTAL: <List specs and add up!>

Service Dependencies
• What else does your service depend on (i.e. external dependencies)?
  - RDBMS (e.g. service persistence): MySQL, DB2
  - Other services (name them): Registry and Schema
• What does your implementation depend on?
  - Languages (Java)
  - Container type: Apache Tomcat, Axis, ….

AAA & Security
• What authentication mechanism do you use?
  - EGEE Authentication
• What authorisation mechanism do you use?
  - Fine-grained authorisation
    • The authorisation rules are defined in a TableAuthorization object that is passed into the createTable method. (View : AllowedCredentials)
    • To impose the constraint that a row of the table is available only to the owner of the job, i.e. if the DN matches:
      SELECT * from Job where Owner=[DN] : DN=[DN];
    • If you match the allowed credentials you will have read access to the data defined in that view.
• What accounting mechanism do you use?
  - Is interaction audited? Not now
  - Is usage run against quotas? Not now
• Does service interaction need to be encrypted?
  - This is a requirement from Bioinformatics
• If these are not used now, will they be in the future?
  - They could be

Exploiting the Service Architecture
• What features from your ‘plumbing’ do you use in your service?
  - Factory pattern
  - Java Logging
  - Event notification (streaming)
  - Meta-data
  - Registry discovery/advertisement
  - Other WSRF/WS characteristics?

Service Activity
• Multiple interaction or single user? Both
• Throughput (1 per day or 100 per second?) Both
• Typical data volume moved in
• Typical data volume moved out

Service Failure
• Required Reliability
  - Failure semantics?
    • Positive ack - in future
    • Submit & forget - current
• Required Persistence
  - Work never lost? Optional
• Required Availability
  - One of many or unique requirement

Required Service Management
• Remote access to:
  - Performance
  - Progress
  - Diagnostic and repair interfaces:
    • Nagios monitoring
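
Sketch: trying the “global” table queries locally
The merging and joining queries from the CPULoad and Service/ServiceStatus slides can be tried out with a small script. This is a minimal illustration, assuming a single in-memory SQLite database as a stand-in for the producers, registry and mediator; it does not use the R-GMA API. The function names (build_demo_db, uk_load, services_up), the GLA predicate and the non-zero Status rows are assumptions for illustration and do not come from the slides.

```python
import sqlite3

# Rows each producer contributes, keyed by its (Country, Site) predicate.
# Row values are copied from the CPULoad slides; the ("UK", "GLA") predicate
# is an assumption, since the slides show no predicate for that producer.
PRODUCERS = {
    ("UK", "RAL"): [("UK", "RAL", "CDF", 0.3, "19055711022002"),
                    ("UK", "RAL", "ATLAS", 1.6, "19055611022002")],
    ("UK", "GLA"): [("UK", "GLA", "CDF", 0.4, "19055811022002"),
                    ("UK", "GLA", "ALICE", 0.5, "19055611022002")],
    ("CH", "CERN"): [("CH", "CERN", "ATLAS", 1.6, "19055611022002"),
                     ("CH", "CERN", "CDF", 0.6, "19055511022002")],
}


def build_demo_db():
    """Create the tables and load the demo rows into one SQLite database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE CPULoad (Country TEXT, Site TEXT, Facility TEXT,
                              Load REAL, Timestamp TEXT);
        CREATE TABLE Service (Endpoint VARCHAR(255), Type VARCHAR(50),
                              Site_Name VARCHAR(100));
        CREATE TABLE ServiceStatus (Endpoint VARCHAR(255), Status INT,
                                    Message VARCHAR(255));
    """)
    for (country, site), rows in PRODUCERS.items():
        for row in rows:
            # A producer only publishes rows that satisfy its predicate.
            if (row[0], row[1]) != (country, site):
                raise ValueError(f"row {row} violates predicate {country}/{site}")
            db.execute("INSERT INTO CPULoad VALUES (?, ?, ?, ?, ?)", row)
    db.executemany("INSERT INTO Service VALUES (?, ?, ?)", [
        ("gppse01", "SE", "RAL"),
        ("gppse02", "SE", "RAL"),
        ("lxshare0404", "SE", "CERN"),
    ])
    # Only the gppse02 status is given on the slide; the other two rows
    # (Status 1, 'Service is down') are assumed values for the demo.
    db.executemany("INSERT INTO ServiceStatus VALUES (?, ?, ?)", [
        ("gppse01", 1, "Service is down"),
        ("gppse02", 0, "Service is up"),
        ("lxshare0404", 1, "Service is down"),
    ])
    return db


def uk_load(db):
    """Merging: one query over the global table spans several producers."""
    return db.execute(
        "SELECT Site, Facility, Load FROM CPULoad "
        "WHERE Country = 'UK' ORDER BY Site, Facility").fetchall()


def services_up(db):
    """Joining: endpoints and sites of services whose status code is 0."""
    return db.execute(
        "SELECT S.Endpoint, S.Site_Name FROM Service S, ServiceStatus SS "
        "WHERE S.Endpoint = SS.Endpoint AND SS.Status = 0").fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    print(uk_load(demo))
    print(services_up(demo))
```

With the demo data, uk_load returns the four UK rows contributed by the RAL and GLA producers, and services_up returns only gppse02 at RAL, matching the consumer result on the joining slide.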