WP3
R-GMA – DataGrid’s Monitoring System
1/7/2003
Werner Nutt (Heriot-Watt University) <w.nutt@hw.ac.uk>

R-GMA = Relational Grid Monitoring Architecture
• Grid Monitoring and Information System developed within DataGrid (Work Package 3)
• Based on the “Grid Monitoring Architecture” of the Global Grid Forum
• Code is open source and freely available
Homepage: type “wp3” into Google

Contributors
• Heriot-Watt, Edinburgh
  – Andrew Cooke, Alasdair Gray, Lisha Ma, Werner Nutt
• IBM-UK
  – James Magowan, Manfred Oevers, Paul Taylor
• Queen Mary, University of London
  – Roney Cordenonsi
• CCLRC/PPARC
  – Rob Byrom, Laurence Field, Steve Hicks, Manish Soni, Antony Wilson, Jason Leake
  – Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton
• SZTAKI, Hungary
  – Peter Kacsuk, Norbert Podhorszki
• Trinity College Dublin
  – Brian Coghlan, Stuart Kenny, David O’Callaghan

Overview
• Grid monitoring: Requirements
• The R-GMA approach: A virtual monitoring database
• Components of R-GMA:
  – Schema
  – Producers and Consumers
  – Registry
  – Republishers
• Query Planning

Major Components of DataGrid
[Diagram: User Interface, Resource Broker, Monitoring System, Logging and Bookkeeping, Replica Catalogue, Computing Element and Storage Element (each with several computers); labelled flows: Job Submission, Status Information, Data Transfer]

WP7: R-GMA Collects Network Monitoring Data
[Figure]

The Grid Monitoring Problem
In a Grid we have
  – Computers
  – Storage elements
  – Network nodes and connections
  – Application programmes, …
Monitoring:
  – What is the current state of the system?
  – How did the system behave in the past?

Monitoring Data Come in Two Kinds
A Grid monitoring system makes available two kinds of data:
• static data “pools”, e.g., databases on
  – network topology, nodes connected
  – applications available (versions, licences, ...)
• “streams” of data, e.g.,
  – sensor data (cpu load, network traffic, ...)
Data streams may give rise to data pools if they are archived.
Today: R-GMA is tailored towards streams, but not pools

Examples of Monitoring Queries
• “Show me the (average) cpu-load of computers at Heriot-Watt!”
• “Between which nodes was the average transportation time for 1 MB packets yesterday higher than 0.… seconds?”
• “For every computing element CE, how many computers of CE currently have a cpu-load of no more than 30%?”

Grid Monitoring Requirements
• Support for publishing data “pools” and “streams”
• Support for locating data sources (automatic, if possible)
• Queries with different temporal interpretations (continuous, latest state, history)
• Scalability (there may be thousands of data sources)
• Resilience to failure (data sources may become unavailable)
• Flexibility (we don’t know which queries will be posed)

Architecture Approach 1: A Monitoring Data Warehouse
Idea:
  – store all data about the Grid status in a huge database
  – and query it
Not realistic:
• Loading takes time
• Data occupy space
• Connections to the warehouse may fail
• Monitoring data often flow as data streams, and queries ask for data streams as output

Approach 2: Monitoring with a “Multi-agent System”
The Grid Monitoring Architecture (GMA) of the Global Grid Forum distinguishes between:
• Consumers of information
• Producers of information
• Directory Service
  – Producers register their supply
  – Consumers register their demand
[Diagram: Sensor, Data Base, Producer, Consumer, Monitoring Application and Directory Service, linked by “find/register”]
Directory Service mediates between producers and consumers

Questions about GMA:
• Which kinds of producers and consumers are there?
• In which language do producers register their supply and consumers their demand?
• What is the meaning of a registration?
• How does a consumer find suitable producers? And how does a producer find suitable consumers?
• Producers have different capabilities to answer queries (e.g. selections, joins, …). Which of them should they register?

R-GMA: A Virtual Monitoring Data Warehouse
• Language of producers and consumers: relational queries (SQL)
• Vocabulary: Relations in a global schema
• Consumer: poses queries over global schema
[Diagram: Consumer, Query, Global Schema S, DB-Producer with DB, views V1, V2, ..., Vn on S, Registry]
• Producer:
  – has a type (stream producer or database producer)
  – publishes relations R1, …, Rk
  – for every R, registers a simple view V on the global schema
[The diagram also shows a Stream Producer with a Sensor]

Schema & Contributions

CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Stream Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Stream Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Stream Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002

Contributions are Views

CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
    SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'

CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
    SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'

Keys in the Global Schema
Network throughput: tp(src, dest, method, pcktSize, timestamp, time)
Intuitively, tp has the primary key (src, dest, method, pcktSize, timestamp).
We need to know the primary keys
• to understand the global schema
• to answer latest snapshot queries
Primary keys are declared, but not enforced!
However, they sometimes hold globally if they hold locally!
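The remark that keys are declared but not enforced can be made concrete with a small check over the union of several producers' contributions. The following is only a sketch, not R-GMA code: the function name `key_violations`, the dictionary layout of a tuple and all sample values are assumptions made for illustration. It takes the declared key of tp from the slide and reports the key values that occur with more than one `time` measurement.

```python
from collections import defaultdict

# Declared primary key of tp(src, dest, method, pcktSize, timestamp, time),
# as stated on the slide. Everything else here is hypothetical.
KEY = ("src", "dest", "method", "pcktSize", "timestamp")


def key_violations(rows, key=KEY):
    """Return the key values that occur with more than one 'time' value."""
    times_per_key = defaultdict(set)
    for row in rows:
        times_per_key[tuple(row[a] for a in key)].add(row["time"])
    return {k for k, times in times_per_key.items() if len(times) > 1}


# Made-up contributions of three producers.
p1 = [dict(src="hw", dest="ral", method="ping", pcktSize=1000, timestamp=1, time=0.2)]
p2 = [dict(src="ral", dest="hw", method="ping", pcktSize=1000, timestamp=1, time=0.3)]
p3 = [dict(src="ral", dest="hw", method="ping", pcktSize=1000, timestamp=1, time=0.9)]

# p1 and p2 publish for different sources, so the key holds on their union.
disjoint_ok = key_violations(p1 + p2)
# p2 and p3 overlap and disagree, so the key fails globally
# although it holds within each producer.
overlap_bad = key_violations(p2 + p3)
```

In this sketch the key holds globally whenever the producers' views select disjoint key values, which is one reading of "they hold globally if they hold locally".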

Metaphor: Roles and Agents
R-GMA Clients: Grid components or Grid applications
• Clients can play the roles of producers or consumers
A client would need special capabilities for a role:
• Clients are supported in their roles by agents
Implementation:
• APIs for client roles: “new StreamProducer(…)”
• Agents are objects on a Web server

Primary Producers
Database producer
• supports queries over a fixed set of tuples (static queries)
• can be used to publish a database
Stream producer
• supports queries over a changing set of tuples (continuous queries)
• supports “latest snapshot queries”
  – offers up-to-date values for each primary key in a db
Today: the DatabaseProducers and StreamProducers in R-GMA are different from the above!

Communication Modes of Stream Producers
Stream Producers may offer two communication modes for continuous queries:
  – lossless (… but tuples could become stale)
  – lossy (… but tuples are fresh)
[Diagram: Producer, Producer Servlet, Consumer Servlet and Consumer, with two queues]
Today: R-GMA’s StreamProducers are resilient and support lossless communication

Republishers Publish Query Answers

                   into database       into stream
Static Query       Materialised View   --
Continuous Query   Archiver            Stream Republisher

Archiver: shows the history of a stream.
Stream Republisher: enables
  – merging,
  – thinning,
  – summarising
of streams …

Republishers in R-GMA Today
Republishers are called “archivers” (although some of them don't archive anything)
An archiver (= republisher)
• is defined by a query
• consumes only from “stream producers”
• publishes the query result according to its type, using
  – a “stream producer”, or
  – a “latest snapshot producer”, or
  – a “database producer” (which keeps an archive)
Republishers are used to answer complex queries!

The Next Step: Hierarchies of Stream Republishers
[Diagram: a National Republisher (country = ‘uk’) above Local/site Republishers (site = ‘ral’, site = ‘hw’), above the Stream Producers at ral and hw]

Republisher Hierarchies: The Issues
• Republishers are defined by queries: hierarchies have to be maintained automatically
• new stream producers must be added only to republishers at the “lowest level”
• the hierarchy has to be replanned if a republisher fails
• difficult: transition from one plan to the other without loss of tuples
• How well can we describe the content of a stream? We possibly need descriptions that join
  – stream relations: CPULoad(machineID, load, timestamp)
  – static relations: locatedAt(machineID, site)

What is the Meaning of a Query in R-GMA?
Assumption: the views of (primary) producers are selections on a single relation, i.e., queries of the form
    SELECT * FROM cpu_load WHERE machine_id = 'AB123' AND loc = 'hw'
(each producer contributes its part of a relation)
• The virtual database contains the union of the data of all the primary producers
• Conceptually, a query is evaluated over the entire virtual db

Stream Queries can have Various Temporal Interpretations
Consider a query over the relation “Transport Time”
    tt(src, dest, pcktSize, method, timestamp, time)
    SELECT * FROM tt WHERE src = 'ral' AND dest = 'bologna'
What is meant? Measurements
  – from now? (Continuous Query)
  – up until now? (History Query)
  – right now? (Latest Snapshot Query)
Today: Queries can be “flagged” with their type

Advanced Queries: Mixing Temporal Query Types
• “Which connections currently have a transportation time that is higher than last week's average?” (latest snapshot and history)
• “Show me the cpu load of those machines where it is lower than yesterday's load average!” (continuous and history)
We do not intend to support such queries in R-GMA!
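The three temporal interpretations of the query over tt (history, latest snapshot, continuous) can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the R-GMA API: the function names and the sample measurements are invented, and "latest snapshot" is assumed to mean the most recent tuple for each value of the primary key without its timestamp component.

```python
# A tt tuple is (src, dest, pcktSize, method, timestamp, time), as on the slide.
# All data below are made up for illustration.

def matches(row):
    """WHERE src = 'ral' AND dest = 'bologna'"""
    return row[0] == "ral" and row[1] == "bologna"


def history(archive, pred):
    """History query: every matching tuple published up until now."""
    return [r for r in archive if pred(r)]


def latest_snapshot(archive, pred):
    """Latest snapshot query: the newest matching tuple per (src, dest, pcktSize, method)."""
    newest = {}
    for r in archive:
        if pred(r):
            key = r[:4]
            if key not in newest or r[4] > newest[key][4]:
                newest[key] = r
    return sorted(newest.values())


def continuous(stream, pred):
    """Continuous query: yield matching tuples as they arrive from now on."""
    for r in stream:
        if pred(r):
            yield r


past = [
    ("ral", "bologna", 1000, "ping", 1, 0.20),
    ("ral", "bologna", 1000, "ping", 2, 0.25),
    ("hw", "ral", 1000, "ping", 2, 0.10),
]
future = [
    ("ral", "bologna", 1000, "ping", 3, 0.22),
    ("hw", "ral", 1000, "ping", 3, 0.11),
]

hist = history(past, matches)                   # both ral-bologna measurements
snap = latest_snapshot(past, matches)           # only the one with timestamp 2
cont = list(continuous(iter(future), matches))  # only the one with timestamp 3
```

The same selection condition gives three different answers, which is why a query needs to be flagged with its type.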

In R-GMA Query Answering Needs Mediation
Suppose P1, P2 publish for tp (throughput)
    P1: … WHERE src = 'hw'
    P2: … WHERE src = 'ral' AND pcktSize > 20
A global consumer poses its query over global relations
    SELECT * FROM tp WHERE pcktSize > 10
A mediator translates this into queries over local relations
    SELECT * FROM P1.tp WHERE pcktSize > 10
    UNION
    SELECT * FROM P2.tp
Today: R-GMA’s mediator handles simple queries like the one above

Global and Local Consumers
• Global consumers pose queries over global relations
    SELECT * FROM tp WHERE pcktSize > 10
  which are translated into queries over local relations
    SELECT * FROM P1.tp WHERE pcktSize > 10
    UNION
    SELECT * FROM P2.tp
• Local consumers pose queries over local relations directly
    SELECT * FROM P1.tp WHERE method = 'ping'
Today: a consumer can be global or local, but local relations cannot be referred to explicitly

How does the Mediator Find Suitable Publishers?
P1, P2, P3 publish for tt (Transport Time)
    P1: … src = 'hw'
    P2: … src = 'ral' AND pcktSize > 20
    P3: … src = 'ral' AND method = 'ping'
    Q: SELECT * FROM tt WHERE src = 'ral' AND method = 'ping'
We see: P1 is not suitable for Q, but P2 and P3 are. Why?
    src = 'hw' AND src = 'ral' AND method = 'ping'    is never true
    src = 'ral' AND pcktSize > 20 AND …               is sometimes true
Satisfiability Test!
Today: implemented

… So Which Publishers Should the Mediator Ask?
    P2: … src = 'ral' AND pcktSize > 20
    P3: … src = 'ral' AND method = 'ping'
    Q: SELECT * FROM tt WHERE src = 'ral' AND method = 'ping'
All answers to Q returned by P2 are also returned by P3: whenever
    src = 'ral' AND pcktSize > 20 AND src = 'ral' AND method = 'ping'
is true, then
    src = 'ral' AND method = 'ping' AND src = 'ral' AND method = 'ping'
is true.
Hence, R-GMA only needs to ask P3
Entailment Test!
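The satisfiability and entailment tests used above can be sketched for the small class of conditions that appear on these slides: conjunctions of `attribute = constant` and `attribute > constant`. This is an illustrative sketch, not R-GMA's mediator. The atom representation and the function names are assumptions, and so is the premise that attribute domains are unbounded above, so that `x > c` alone is always satisfiable.

```python
# An atom is (attribute, operator, constant) with operator "=" or ">".
# A condition is a list of atoms, read as their conjunction.

def summarise(conj):
    """Return (equalities, lower bounds) per attribute, or None if contradictory."""
    eq, lower = {}, {}
    for attr, op, const in conj:
        if op == "=":
            if attr in eq and eq[attr] != const:
                return None                      # e.g. src = hw AND src = ral
            eq[attr] = const
        elif op == ">":
            lower[attr] = max(lower.get(attr, const), const)
        else:
            raise ValueError("unsupported operator: " + op)
    for attr, bound in lower.items():
        if attr in eq and not eq[attr] > bound:
            return None                          # e.g. x = 5 AND x > 20
    return eq, lower


def satisfiable(conj):
    return summarise(conj) is not None


def entails(premise, conclusion):
    """True if every tuple satisfying premise also satisfies conclusion."""
    summary = summarise(premise)
    if summary is None:
        return True                              # nothing satisfies the premise
    eq, lower = summary
    for attr, op, const in conclusion:
        if op == "=":
            if attr not in eq or eq[attr] != const:
                return False
        elif op == ">":
            if attr in eq:
                if not eq[attr] > const:
                    return False
            elif attr not in lower or lower[attr] < const:
                return False
        else:
            raise ValueError("unsupported operator: " + op)
    return True


# The views and the query from the slides.
P1 = [("src", "=", "hw")]
P2 = [("src", "=", "ral"), ("pcktSize", ">", 20)]
P3 = [("src", "=", "ral"), ("method", "=", "ping")]
Q = [("src", "=", "ral"), ("method", "=", "ping")]

relevant = {name: satisfiable(view + Q) for name, view in [("P1", P1), ("P2", P2), ("P3", P3)]}
p2_covered_by_p3 = entails(P2 + Q, P3 + Q)
p3_covered_by_p2 = entails(P3 + Q, P2 + Q)
```

With these inputs the satisfiability test rules out P1, and the entailment test shows that whatever P2 can contribute to Q is also offered by P3 but not the other way round, matching the reasoning on the slides. As the deck goes on to discuss, that conclusion is only safe if P3's registered view is a complete description of what it publishes.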
The Entailment Test is needed for Republisher Hierarchies! (not yet implemented)

… But What Did the Producers Promise?
P registers view V
Does P promise
  – some of V? (sound description)
  – all of V? (sound and complete description)
• The Entailment Test only makes sense when the registered views are sound and complete descriptions
• Producers should register completeness flags

… Why May a Producer not be Complete?
• The language of views is more restricted than the language of queries
  Hence: republishers may be unable to say exactly what they publish
• Archivers may archive in lossy mode
• Producers may lose tuples
• A producer may not know everything about the real world
Open to debate

Summary (1)
Monitoring data come in Pools and Streams
Global Schema
• primary keys
Types of Stream Queries
• continuous vs. history vs. latest snapshot
Producers
• DB producers: publish database
• stream producers: lossless vs. lossy communication modes