Grids for Data Intensive Science
Paul Avery, University of Florida
http://www.phys.ufl.edu/~avery/  avery@phys.ufl.edu
Texas APS Meeting, University of Texas, Brownsville, Oct. 11, 2002

Outline of Talk
- Grids and Science
- Data Grids and Data Intensive Sciences
- High Energy Physics
- Digital Astronomy
- Data Grid Projects
- Networks and Data Grids
- Summary
- This talk represents only a small slice of a fascinating, multifaceted set of research efforts

Grids and Science

The Grid Concept
- Grid: geographically distributed computing resources configured for coordinated use
- Fabric: physical resources & networks provide the raw capability
- Middleware: software ties it all together (tools, services, etc.)
- Goal: transparent resource sharing

Fundamental Idea: Resource Sharing
- Resources for complex problems are distributed: advanced scientific instruments (accelerators, telescopes, ...), storage and computing, groups of people
- Communities require access to common services: research collaborations (physics, astronomy, biology, engineering, ...), government agencies, health care organizations, large corporations, ...
- Goal: "Virtual Organizations": create a "VO" from geographically separated components, make all community resources available to any VO member, leverage strengths at different institutions, add people & resources dynamically

Short Comment About "The Grid"
- There is no single "Grid" a la the Internet; there are many Grids, each devoted to different organizations
- Grids are (or soon will be) the foundation on which to build secure, efficient, and fair sharing of computing resources
- Grids are not sources of free computing, nor the means to access and process Petabyte-scale data freely without thinking about it

Proto-Grid: SETI@home
- Community: SETI researchers + enthusiasts
- Arecibo radio data sent to users (250 KB data chunks)
- Over 2M PCs used

More Advanced Proto-Grid: Evaluation of AIDS Drugs
- Entropia "DCGrid" software, using 1000s of PCs
- Chief applications: drug design, AIDS research

Some (Realistic) Grid Examples
- High energy physics: 3,000 physicists worldwide pool Petaflops of CPU resources to analyze Petabytes of data
- Climate modeling: climate scientists visualize, annotate, & analyze Terabytes of simulation data
- Biology: a biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- Engineering: a multidisciplinary analysis in aerospace couples code and data in four companies to design a new airframe
- Many commercial applications
(From Ian Foster)

Grids: Why Now?
- Moore's law improvements in computing: highly functional end systems
- Universal wired and wireless Internet connections: universal connectivity
- Changing modes of working and problem solving: interdisciplinary teams; computation and simulation as primary tools
- Network exponentials (next slide)

Network Exponentials & Collaboration
- Network (WAN) vs. computer performance: computer speed doubles every 18 months; WAN speed doubles every 12 months (revised)
- Difference = an order of magnitude per 10 years, plus ubiquitous network connections!
- 1986 to 2001: computers x1,000; networks x50,000
- 2001 to 2010 (?): computers x60; networks x500
(Scientific American, Jan. 2001; a quick check of these growth factors appears in the sketch below)
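As a sanity check on the exponentials quoted above, the following sketch recomputes the growth factors implied by the stated doubling times (18 months for computers, 12 months for wide-area networks). It is illustrative arithmetic only; the doubling times are the ones on the slide, not independent measurements.

```python
# Growth factors implied by fixed doubling times (illustrative sketch).
# Doubling times are taken from the slide: computers ~18 months, WANs ~12 months.

def growth_factor(years: float, doubling_time_years: float) -> float:
    """Factor by which capability grows over `years` at a fixed doubling time."""
    return 2.0 ** (years / doubling_time_years)

for label, start, end in [("1986-2001", 1986, 2001), ("2001-2010", 2001, 2010)]:
    span = end - start
    computers = growth_factor(span, 1.5)   # doubling every 18 months
    networks = growth_factor(span, 1.0)    # doubling every 12 months
    print(f"{label}: computers ~x{computers:,.0f}, networks ~x{networks:,.0f}")

# Prints roughly: computers x1,024 / networks x33,000 for 1986-2001, and
# computers x64 / networks x512 for 2001-2010, consistent in order of
# magnitude with the x1,000 / x50,000 and x60 / x500 figures on the slide.
```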
Basic Grid Challenges
- Overall goal: coordinated sharing of resources that are under different administrative control
- Many technical problems to overcome: authentication, authorization, policy, auditing; resource discovery, access, negotiation, allocation, control; dynamic formation & management of Virtual Organizations; delivery of multiple levels of service; autonomic management of resources; failure detection & recovery
- Additional issue: lack of central control & knowledge; preservation of local site autonomy

Advanced Grid Challenges: Workflow
- Manage workflow across the Grid: balance policy vs. instantaneous capability to complete tasks; balance effective resource use vs. fast turnaround for priority jobs; match resource usage to policy over the long term; goal-oriented algorithms that steer requests according to metrics
- Maintain a global view of resources and system state: coherent end-to-end system monitoring; adaptive learning (new paradigms for execution optimization)
- Handle user-Grid interactions: guidelines, agents
- Build high-level services & an integrated user environment

Layered Grid Architecture (Analogy to Internet Architecture)
- Application: specialized, application-specific distributed services
- Collective: managing multiple resources (ubiquitous infrastructure services)
- Resource: sharing single resources (negotiating access, controlling use)
- Connectivity: talking to things (communications, security)
- Fabric: controlling things locally (accessing and controlling resources)
- Internet Protocol architecture analogue: Application / Transport / Internet / Link
(From Ian Foster)

Globus Project and Toolkit
- Globus Project (UC/Argonne + USC/ISI): O(40) researchers & developers; identify and define core protocols and services
- Globus Toolkit 2.0: reference implementation of the core protocols & services; used by most Data Grid projects today (US: GriPhyN, PPDG, TeraGrid, iVDGL, ...; EU: EU-DataGrid and national projects)
- Recent progress: OGSA and web services (2002)
- OGSA (Open Grid Services Architecture): applying "web services" to Grids (WSDL, SOAP, XML, ...); keeps Grids in the commercial mainstream; Globus Toolkit 3.0

Data Grids

Data Intensive Science: 2000-2015
- Scientific discovery increasingly driven by IT: computationally intensive analyses, massive data collections, data distributed across networks of varying capability, geographically distributed collaboration
- Dominant factor: data growth (1 Petabyte = 1000 TB): ~0.5 Petabyte in 2000, ~10 Petabytes in 2005, ~100 Petabytes in 2010, ~1000 Petabytes in 2015 (?) (see the sketch below)
- How to collect, manage, access and interpret this quantity of data?
- Drives demand for "Data Grids" to handle the additional dimension of data access & movement

Data Intensive Physical Sciences
- High energy & nuclear physics, including new experiments at CERN's Large Hadron Collider
- Gravity wave searches: LIGO, GEO, VIRGO
- Astronomy: digital sky surveys (Sloan Digital Sky Survey, VISTA, other gigapixel arrays); "virtual" observatories (multi-wavelength astronomy)
- Time-dependent 3-D systems (simulation & data): Earth observation, climate modeling; geophysics, earthquake modeling; fluids, aerodynamic design; pollutant dispersal scenarios
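The data-growth figures above imply a steep exponential. The sketch below is a minimal calculation, using only the slide's round numbers, of the annual growth factor and doubling time they imply.

```python
import math

# Implied growth rate of the data volumes quoted on the "Data Intensive
# Science: 2000-2015" slide (~0.5 PB in 2000 to ~1000 PB projected in 2015).
# Purely illustrative arithmetic on the slide's round numbers.

v0, v1 = 0.5, 1000.0           # Petabytes in 2000 and (projected) 2015
years = 15
factor = v1 / v0               # total growth factor, x2000
annual = factor ** (1 / years) # equivalent annual growth factor
doubling = math.log(2) / math.log(annual)

print(f"total growth: x{factor:.0f}")
print(f"annual growth: x{annual:.2f} per year")
print(f"implied doubling time: {doubling:.1f} years")
# ~x1.66 per year, i.e. the projected data volume doubles roughly every 1.4 years.
```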
Data Intensive Biology and Medicine
- Medical data: X-ray, mammography data, etc. (many Petabytes); digitizing patient records (ditto)
- X-ray crystallography: bright X-ray sources, e.g. the Argonne Advanced Photon Source
- Molecular genomics and related disciplines: Human Genome and other genome databases; proteomics (protein structure, activities, ...); protein interactions, drug delivery (cf. Craig Venter keynote @ SC2001)
- Brain scans (1-10 µm, time dependent)
- Virtual Population Laboratory (proposed): database of populations, geography, transportation corridors; simulate the likely spread of disease outbreaks

Example: High Energy Physics @ LHC
- The "Compact" Muon Solenoid (CMS) detector at the LHC (CERN), shown next to a Smithsonian "standard man" for scale

CERN LHC Site
- View of the LHC site with the CMS, LHCb, ALICE and ATLAS experiments

Collisions at LHC (2007?)
- Proton-proton collisions: ~10^11 protons/bunch, 2835 bunches/beam, beam energy 7 TeV x 7 TeV, luminosity 10^34 cm^-2 s^-1
- Bunch crossing rate: every 25 nsec; proton collision rate ~10^9 Hz (average ~20 collisions per crossing)
- New physics rate ~10^-5 Hz; selection: 1 in 10^13 (SUSY, ...)
- Event sketch: parton (quark, gluon) collision producing, e.g., a Higgs decaying via Z bosons to e+e- pairs, plus jets

Data Rates: From Detector to Storage (see the sketch below)
- Physics filtering cascade, starting from collision data off the detector: 40 MHz, ~1000 TB/sec equivalent
- Level 1 trigger (special hardware): 75 KHz, 75 GB/sec
- Level 2 trigger (commodity CPUs): 5 KHz, 5 GB/sec
- Level 3 trigger (commodity CPUs): 100 Hz, 100 MB/sec of raw data to storage

LHC Data Complexity
- "Events" resulting from beam-beam collisions: the signal event is obscured by ~20 overlapping, uninteresting collisions in the same crossing
- CPU time does not scale from previous generations (comparison of 2000 vs. 2007 event complexity)

LHC: Higgs Decay into 4 Muons
- Event display with +30 minimum bias events: all charged tracks with pt > 2 GeV vs. reconstructed tracks with pt > 25 GeV
- 10^9 events/sec; selectivity: 1 in 10^13

LHC Computing Overview
- Complexity: millions of individual detector channels
- Scale: PetaOps (CPU), Petabytes (data)
- Distribution: global distribution of people & resources: 1800 physicists, 150 institutes, 32 countries

Global LHC Data Grid
- Experiment (e.g., CMS); aggregate capacity Tier0 / (sum of Tier1s) / (sum of Tier2s) ~ 1:1:1
- Online system to Tier 0 (CERN computer center, > 20 TIPS): ~100 MBytes/sec
- Tier 0 to Tier 1 centers (France, Italy, UK, USA): 2.5 Gbits/sec
- Tier 1 to Tier 2 centers: 2.5 Gbits/sec; Tier 2 to Tier 3 (institutes, ~0.25 TIPS each): ~0.6 Gbits/sec
- Tier 3 to Tier 4 (physics data caches, PCs, other portals): 0.1-1 Gbits/sec

LHC Tier2 Center (2001): "Flat" Switching Topology
- A single FEth/GEth switch with a WAN router, more than one RAID array, and tape

LHC Tier2 Center (2001): "Hierarchical" Switching Topology
- Several FEth switches aggregated by a GEth switch, with a WAN router, more than one RAID array, and tape

Hardware Cost Estimates
- Buy late, but not too late: phased implementation (component price/performance halving times in the range of roughly 1.1-2.1 years)
- R&D phase 2001-2004; implementation phase 2004-2007
- R&D to develop capabilities and the computing model itself; prototyping at increasing scales of capability & complexity
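The trigger cascade in "Data Rates: From Detector to Storage" reduces the raw 40 MHz crossing rate to ~100 Hz written to storage. The sketch below is a back-of-envelope restatement of those numbers, deriving the implied event size at each level and the annual raw-data volume; the 10^7 seconds of effective running per year is a common rule of thumb assumed here, not a figure from the talk.

```python
# Back-of-envelope LHC trigger/data-rate cascade, using the rates quoted on
# the slide. The 1e7 seconds/year of effective running time is an assumption.

stages = [
    # (name,               accept rate [Hz], data rate [bytes/sec])
    ("Beam crossings",     40e6,             1000e12),  # ~1000 TB/sec equivalent
    ("Level 1 (hardware)", 75e3,             75e9),     # 75 GB/sec
    ("Level 2 (CPUs)",     5e3,              5e9),      # 5 GB/sec
    ("Level 3 (CPUs)",     100,              100e6),    # 100 MB/sec to storage
]

for name, rate_hz, bytes_per_sec in stages:
    event_size_mb = bytes_per_sec / rate_hz / 1e6
    print(f"{name:20s} {rate_hz:10.3g} Hz  ~{event_size_mb:.1f} MB/event")

seconds_per_year = 1e7                      # assumed effective running time
raw_pb_per_year = stages[-1][2] * seconds_per_year / 1e15
print(f"Raw data to storage: ~{raw_pb_per_year:.0f} PB/year")
# ~1 MB/event after each trigger level, and roughly 1 PB/year of raw data.
```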
Example: Digital Astronomy Trends
- Future dominated by detector improvements: Moore's-law growth in CCDs; gigapixel arrays on the horizon
- Growth in CPU/storage tracking data volumes; investment in software is critical
- Chart: total area of 3m+ telescopes in the world (m^2 of "glass") vs. total number of CCD pixels (Mpix), 1970-2000; 25-year growth: 30x in glass, 3000x in pixels

The Age of Mega-Surveys
- Next-generation mega-surveys will change astronomy: top-down design, large sky coverage, sound statistical plans, well controlled and uniform systematics
- The technology to store and access the data is here; we are riding Moore's law
- Integrating these archives is a task for the whole community
- Astronomical data mining will lead to stunning new discoveries: the "Virtual Observatory"

Virtual Observatories
- Multi-wavelength astronomy; multiple surveys tied together by standards
- Components: source catalogs and image data; specialized data (spectroscopy, time series, polarization); information archives (derived & legacy data: NED, Simbad, ADS, etc.); discovery tools (visualization, statistics)

Virtual Observatory Data Challenge (see the sketch below)
- Digital representation of the sky: all-sky + deep fields; integrated catalog and image databases; spectra of selected samples
- Size of the archived data: 40,000 square degrees at resolution < 0.1 arcsec gives > 50 trillion pixels; one band (2 bytes/pixel) is ~100 Terabytes; multi-wavelength, 500-1000 Terabytes; with a time dimension, many Petabytes
- Large, globally distributed database engines: multi-Petabyte data size; thousands of queries per day; GByte/s I/O speed per site
- Data Grid computing infrastructure

Sloan Sky Survey Data Grid

Data Grid Projects

New Collaborative Endeavors via Grids
- Grids fundamentally alter the conduct of scientific research. Old: people and resources flow inward to labs. New: resources and data flow outward to universities
- Strengthens universities: couples universities to data intensive science and to national & international labs; brings front-line research to students; exploits intellectual resources of formerly isolated schools; opens new opportunities for minority and women researchers
- Builds partnerships to drive new IT/science advances, linking application sciences (physics, astronomy, biology, etc.), other fundamental sciences, computer science, universities, laboratories, IT infrastructure, the IT industry, and the research community

Background: Major Data Grid Projects
- Particle Physics Data Grid (US, DOE): Data Grid applications for HENP experiments
- GriPhyN (US, NSF): Petascale Virtual-Data Grids
- iVDGL (US, NSF): global Grid laboratory
- DOE Science Grid (US, DOE): link major DOE computing sites
- TeraGrid (US, NSF): distributed supercomputing resources (13 TFlops)
- European DataGrid (EU, EC): Data Grid technologies, EU deployment
- CrossGrid (EU, EC): Grid tools for realtime-intensive experiments
- DataTAG (EU, EC): transatlantic network
- Japanese Grid Project (APGrid?) (Japan): Grid deployment throughout Japan
- Common threads: collaborations of application scientists & computer scientists; infrastructure development & deployment; Globus-based Grid applications
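The sketch below reproduces the arithmetic behind the Virtual Observatory data sizes quoted above: 40,000 square degrees imaged at roughly 0.1 arcsec per pixel, 2 bytes per pixel per band. Only the slide's figures are used as inputs.

```python
# Arithmetic behind the Virtual Observatory data-volume estimate.
# Inputs are the figures quoted on the slide.

sky_sq_deg = 40_000          # archived sky coverage, square degrees
pixel_arcsec = 0.1           # pixel scale (resolution ~0.1 arcsec)
bytes_per_pixel = 2          # one band, 2 bytes/pixel

pixels_per_deg = 3600 / pixel_arcsec          # 36,000 pixels per degree
pixels = sky_sq_deg * pixels_per_deg ** 2     # total pixels over the sky
one_band_tb = pixels * bytes_per_pixel / 1e12

print(f"total pixels: {pixels:.2e}")          # ~5.2e13, i.e. >50 trillion
print(f"one band:     ~{one_band_tb:.0f} TB") # ~100 TB (104 with these inputs)
print(f"5-10 bands:   ~{5*one_band_tb:.0f}-{10*one_band_tb:.0f} TB")
# Matches the slide: >50 trillion pixels, ~100 TB per band, 500-1000 TB multi-wavelength.
```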
GriPhyN: PetaScale Virtual-Data Grids
- Target scale: ~1 Petaflop, ~100 Petabytes
- Architecture diagram: users (production teams, individual investigators, interactive users, workgroups) drive virtual data tools, request planning & scheduling tools, and request execution & management tools
- These sit on top of resource management services, security and policy services, and other Grid services
- Underneath: data transforms and distributed resources (code, storage, CPUs, networks), raw data sources, and major facilities & archives

Virtual Data Concept
- A data request may: compute locally, compute remotely, access local data, access remote data, or fetch the item
- Scheduling based on: local policies, global policies, cost
- Data can live at major facilities and archives, regional facilities and caches, or local facilities and caches

Early GriPhyN Challenge Problem: CMS Data Reconstruction (April 2001; a simplified workflow sketch appears below)
- Master Condor job running at Caltech (Caltech workstation); sites involved: Caltech, Wisconsin, NCSA
- 2) Launch a secondary Condor job on the Wisconsin pool; input files shipped via Globus GASS
- 3) 100 Monte Carlo jobs run on the Wisconsin Condor pool
- 4) 100 data files transferred via GridFTP, ~1 GB each
- 5) Secondary job reports complete to the master
- 6) Master starts reconstruction jobs via the Globus jobmanager on the NCSA Linux cluster
- 7) GridFTP fetches data from UniTree (NCSA UniTree: a GridFTP-enabled FTP server)
- 8) Processed Objectivity database stored to UniTree
- 9) Reconstruction job reports complete to the master

Particle Physics Data Grid
- Funded by DOE MICS ($9.5M for 2001-2004)
- DB replication, caching, catalogs; practical orientation: networks, instrumentation, monitoring
- Computer science program of work: CS1 job description language; CS2 schedule and manage data processing & placement activities; CS3 monitoring and status reporting; CS4 storage resource management; CS5 reliable replication services; CS6 file transfer services; ...; CS11 Grid-enabled analysis

iVDGL: A Global Grid Laboratory
- "We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science." (From the NSF proposal, 2001)
- International Virtual-Data Grid Laboratory: a global Grid laboratory (US, EU, Asia, South America, ...); a place to conduct Data Grid tests "at scale"; a mechanism to create common Grid infrastructure; a laboratory for other disciplines to perform Data Grid tests; a focus of outreach efforts to small institutions
- U.S. part funded by NSF (2001-2006): $13.7M (NSF) + $2M (matching); UF directs the project; international partners bring their own funds

Current US-CMS Testbed (30 CPUs)
- Sites: Wisconsin, Princeton, Fermilab, Caltech, UCSD, Florida, Brazil

US-iVDGL Data Grid (Dec. 2002)
- Map of Tier1, Tier2 and Tier3 sites: SKC, LBL, Wisconsin, Michigan, PSU, Fermilab, Argonne, NCSA, Caltech, Oklahoma, Indiana, J. Hopkins, Hampton, FSU, Arlington, BNL, Vanderbilt, UCSD/SDSC, Brownsville, Boston U, UF, FIU

iVDGL Map (2002-2003)
- Map of Tier0/1, Tier2 and Tier3 facilities with 10 Gbps, 2.5 Gbps, 622 Mbps and other links, including Surfnet and DataTAG
- New partners: Brazil (T1), Russia (T1), Chile (T2), Pakistan (T2), China (T2), Romania (?)
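To make the ordering and data flow of the 2001 CMS reconstruction challenge easier to follow, here is a minimal sketch of the workflow as a linear chain of steps. The function names and orchestration code are hypothetical stand-ins; this is not the actual Condor, Globus GASS/jobmanager, or GridFTP interface that was used.

```python
# Hypothetical sketch of the 2001 CMS reconstruction challenge workflow
# described above. The step functions are placeholders for the real Condor,
# Globus GASS/jobmanager and GridFTP operations; only the ordering and data
# flow between the Caltech master, the Wisconsin pool and NCSA is shown.

from typing import Callable, List

def step(description: str) -> Callable[[], None]:
    """Return a placeholder action that just logs what the real step did."""
    def run() -> None:
        print(f"[workflow] {description}")
    return run

workflow: List[Callable[[], None]] = [
    step("Launch secondary Condor job on the Wisconsin pool (inputs via Globus GASS)"),
    step("Run 100 Monte Carlo jobs on the Wisconsin Condor pool"),
    step("Transfer 100 x ~1 GB output files via GridFTP"),
    step("Secondary job reports completion to the Caltech master"),
    step("Master submits reconstruction jobs to the NCSA cluster via the Globus jobmanager"),
    step("GridFTP fetches input data from NCSA UniTree"),
    step("Store the processed Objectivity database back to UniTree"),
    step("Reconstruction job reports completion to the master"),
]

for action in workflow:   # steps ran in this order; each depends on the previous
    action()
```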
TeraGrid: 13 TeraFlops, 40 Gb/s
- Diagram of four sites connected by a 40 Gb/s backbone, each with external networks: Caltech, Argonne, SDSC (4.1 TF, 225 TB), NCSA/PACI (8 TF, 240 TB)
- Site resources include HPSS and UniTree mass storage

DOE Science Grid
- Link major DOE computing sites (LBNL)

EU DataGrid Project
- Work packages and lead contractors:
  WP1  Grid Workload Management: INFN
  WP2  Grid Data Management: CERN
  WP3  Grid Monitoring Services: PPARC
  WP4  Fabric Management: CERN
  WP5  Mass Storage Management: PPARC
  WP6  Integration Testbed: CNRS
  WP7  Network Services: CNRS
  WP8  High Energy Physics Applications: CERN
  WP9  Earth Observation Science Applications: ESA
  WP10 Biology Science Applications: INFN
  WP11 Dissemination and Exploitation: INFN
  WP12 Project Management: CERN

LHC Computing Grid Project

Need for Common Grid Infrastructure
- Grid computing is sometimes compared to the electric grid: you plug in to get a resource (CPU, storage, ...) and you don't care where the resource is located
- This analogy is more appropriate than originally intended: it expresses a USA viewpoint of a uniform power grid
- What happens when you travel around the world? Different frequencies (60 Hz, 50 Hz), different voltages (120 V, 220 V), different sockets (USA 2-pin, France, UK, etc.)!
- We want to avoid this situation in Grid computing

Grid Coordination Efforts
- Global Grid Forum (GGF), www.gridforum.org: international forum for general Grid efforts; many working groups, standards definitions; next meeting in Toronto, Feb. 17-20
- HICB (high energy physics): represents HEP collaborations, primarily the LHC experiments; joint development & deployment of Data Grid middleware (GriPhyN, PPDG, TeraGrid, iVDGL, EU-DataGrid, LCG, DataTAG, CrossGrid); common testbed, open source software model; several meetings so far
- New infrastructure Data Grid projects? Fold them into the existing Grid landscape (primarily US + EU)

Networks

Next Generation Networks for HENP
- Rapid access to massive data stores: Petabytes and beyond
- Balance high throughput vs. rapid turnaround; coordinate & manage computing, data and networks
- Seamless, high-performance operation of WANs & LANs (WAN: wide area network; LAN: local area network); reliable, quantifiable, high performance
- Rapid access to the data and computing resources; "Grid-enabled" data analysis, production and collaboration
- Full participation by all physicists, regardless of location: requires good connectivity, Grid-enabled software, advanced networking, collaborative tools

2.5 Gbps Backbone
- 201 primary participants: all 50 states, D.C. and Puerto Rico
- 75 partner corporations and non-profits; 14 state research and education networks
- 15 "GigaPoPs" support 70% of members

Total U.S. Internet Traffic (see the sketch below)
- Chart of U.S. Internet data traffic, 1970-2010: ARPA & NSF data to 1996, new measurements after that, projected forward at 4x/year
- Historical growth between 2.8x and 4x per year; voice/data crossover in August 2000; upper limit shown at the same percentage of GDP as voice
(Source: Roberts et al., 2001)
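A short sketch of the projection logic behind that chart: growing at the 4x/year (versus historical 2.8x/year) rates quoted on the slide, traffic climbs several orders of magnitude per decade. The year-2000 starting level used here is an assumed round number chosen only to make the arithmetic concrete, not a figure from the talk.

```python
# Projection arithmetic behind the "Total U.S. Internet Traffic" chart.
# The 4x/year and 2.8x/year growth rates are from the slide; the year-2000
# starting level is an assumed round number used only for illustration.

start_year = 2000
start_bps = 1e12          # assumed ~1 Tbps aggregate traffic in 2000 (illustrative)

def project(rate_per_year: float, year: int) -> float:
    """Traffic in bps at `year`, growing by `rate_per_year` annually."""
    return start_bps * rate_per_year ** (year - start_year)

for year in (2002, 2005, 2010):
    fast, slow = project(4.0, year), project(2.8, year)
    print(f"{year}: 4x/yr -> {fast:.1e} bps, 2.8x/yr -> {slow:.1e} bps")
# At 4x/year, traffic grows by roughly a factor of a million per decade, so a
# Tbps-scale starting point reaches the Pbps range well before 2010.
```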
Bandwidth for the US-CERN Link
- Evolution typical of major HENP links, 2001-2006:
  FY2001: 310 Mbps; FY2002: 622 Mbps; FY2003: 1250 Mbps; FY2004: 2500 Mbps; FY2005: 5000 Mbps; FY2006: 10000 Mbps
- 2 x 155 Mbps in 2001; 622 Mbps in May 2002
- 2.5 Gbps research link in Summer 2002 (DataTAG); 10 Gbps research link in mid-2003 (DataTAG)

Transatlantic Network Estimates (bandwidth in Mbps, assuming 50% utilization; see the conversion sketch below)
          2001   2002   2003   2004   2005   2006
  CMS      100    200    300    600    800   2500
  ATLAS     50    100    300    600    800   2500
  BaBar    300    600   1100   1600   2300   3000
  CDF      100    300    400   2000   3000   6000
  D0       400   1600   2400   3200   6400   8000
  BTeV      20     40    100    200    300    500
  DESY     100    180    210    240    270    300
  CERN     311    622   2500   5000  10000  20000
  See http://gate.hep.anl.gov/lprice/TAN

All Major Links Advancing Rapidly
- Next-generation 10 Gbps national network backbones starting to appear in the US, Europe and Japan
- Major transoceanic links are, or will be, at 2.5-10 Gbps in 2002-2003
- Critical path: remove regional and last-mile bottlenecks; remove compromises in network quality; prevent TCP/IP inefficiencies at high link speeds

U.S. Cyberinfrastructure Panel: Draft Recommendations (4/2002)
- New initiative to revolutionize science and engineering research; capitalize on new computing & communications opportunities: supercomputing, massive storage, networking, software, collaboration, visualization, and human resources
- Budget estimate: incremental $650M/year (continuing); new office with a highly placed, credible leader
- Initiate competitive, discipline-driven, path-breaking applications; coordinate policy and allocations across fields and projects
- Develop middleware & other software essential to scientific research
- Manage individual computational, storage, and networking resources at least 100x larger than individual projects or universities
- Participants: NSF directorates, Federal agencies, international e-science

Summary
- Data Grids will qualitatively and quantitatively change the nature of collaborations and approaches to computing
- Current Data Grid projects will provide vast experience for new collaborations and point the way to the future
- Networks must continue their exponential growth
- Many challenges during the coming transition: new Grid projects will provide rich experience and lessons; it is difficult to predict the situation even 3-5 years ahead

Grid References
- Grid Book: www.mkp.com/grids
- Globus: www.globus.org
- Global Grid Forum: www.gridforum.org
- TeraGrid: www.teragrid.org
- EU DataGrid: www.eu-datagrid.org
- PPDG: www.ppdg.net
- GriPhyN: www.griphyn.org
- iVDGL: www.ivdgl.org

More Slides

1990s Information Infrastructure
- O(10^7) nodes; network-centric
- Simple, fixed end systems; few embedded capabilities; few services; no user-level quality of service
(From Ian Foster)

Emerging Information Infrastructure
- O(10^10) nodes; application-centric; Grid functions such as caching, resource discovery, processing, QoS
- Heterogeneous, mobile end-systems; many embedded capabilities; rich services; user-level quality of service
- Qualitatively different, not just "faster and more reliable"
(From Ian Foster)
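The transatlantic estimates above quote provisioned bandwidth in Mbps "assuming 50% utilization". The sketch below shows the conversion that assumption implies: starting from a daily data-movement requirement (the example volumes are arbitrary illustrations, not numbers from the table), the average throughput is doubled to get the link bandwidth that must be provisioned.

```python
# Converting a daily transfer requirement into provisioned link bandwidth
# at 50% average utilization, as assumed in the transatlantic estimates.
# The example volumes (TB/day) are illustrative, not figures from the talk.

def required_link_mbps(tb_per_day: float, utilization: float = 0.5) -> float:
    """Provisioned bandwidth (Mbps) needed to move tb_per_day at the given utilization."""
    bits_per_day = tb_per_day * 1e12 * 8          # terabytes -> bits
    average_mbps = bits_per_day / 86_400 / 1e6    # average rate over a day
    return average_mbps / utilization             # provision headroom

for volume in (1, 5, 25):                         # TB per day, illustrative
    print(f"{volume:3d} TB/day -> provision ~{required_link_mbps(volume):.0f} Mbps")
# 1 TB/day averages ~93 Mbps, so ~185 Mbps must be provisioned at 50% utilization;
# 25 TB/day already calls for a multi-Gbps transatlantic link.
```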
Globus General Approach
- Define Grid protocols & APIs: protocol-mediated access to remote resources; integrate and extend existing standards (a hypothetical illustration of this wrapping idea appears at the end of this section)
- Develop a reference implementation: the open source Globus Toolkit; client & server SDKs, services, tools, etc.
- Grid-enable a wide variety of existing tools: FTP, SSH, Condor, SRB, MPI, ...
- Learn about real-world problems: deployment, testing, applications
- Stack: applications, diverse global services, core services, diverse resources

ICFA SCIC
- SCIC: Standing Committee on Interregional Connectivity, created by ICFA in July 1998 in Vancouver
- Makes recommendations to ICFA concerning the connectivity between the Americas, Asia and Europe
- SCIC duties: monitor traffic; keep track of technology developments; periodically review forecasts of future bandwidth needs; provide early warning of potential problems; create subcommittees when necessary
- Reports: February, July and October 2002

SCIC Details
- Network status and upgrade plans: bandwidth and performance evolution, per country & transatlantic
- Performance measurements (world overview); study of specific topics, e.g. bulk transfer, VoIP, collaborative systems, QoS, security
- Identification of problem areas; ideas on how to improve, or encourage improvements: e.g. faster links, equipment cost issues, TCP/IP scalability, etc.
- Meeting summaries and sub-reports available (February, May, October): http://www.slac.stanford.edu/grp/scs/trip/notes-icfa-dec01cottrell.html

Internet2 HENP Working Group
- Mission: ensure that the following HENP needs are met in a timely manner: national and international network infrastructures (end-to-end); standardized tools & facilities for high-performance end-to-end monitoring and tracking; collaborative systems
- Serves the US LHC and other major HENP programs as well as the at-large scientific community; create a program broadly applicable across many fields
- Internet2 Working Group formed Oct. 26, 2001; co-chairs: S. McKee (Michigan), H. Newman (Caltech)
- http://www.internet2.edu/henp (WG home page); http://www.internet2.edu/e2e (end-to-end initiative)
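The "protocol-mediated access to remote resources" idea on the Globus General Approach slide, putting one uniform interface in front of diverse resources and tools, can be illustrated with a toy sketch. Everything below (class names, the local/remote split, the logging stand-in) is hypothetical and deliberately generic; it is not the Globus Toolkit API, only the shape of the idea.

```python
# Toy illustration of "protocol-mediated access to remote resources":
# callers use one uniform interface regardless of where a file lives or
# which underlying tool serves it. All names here are hypothetical; this
# is not the Globus API.

from abc import ABC, abstractmethod
import shutil

class FileResource(ABC):
    """Uniform interface that hides the tool actually moving the data."""

    @abstractmethod
    def fetch(self, source: str, destination: str) -> None: ...

class LocalCopy(FileResource):
    def fetch(self, source: str, destination: str) -> None:
        shutil.copyfile(source, destination)          # plain local copy

class LoggingRemoteCopy(FileResource):
    """Stand-in for a wrapped transfer tool (e.g. an FTP-like service)."""
    def fetch(self, source: str, destination: str) -> None:
        # A real implementation would negotiate authentication and then
        # delegate to the underlying transfer tool; here we only log it.
        print(f"[remote] would transfer {source} -> {destination}")

def stage_input(resource: FileResource, source: str, destination: str) -> None:
    # Application code is written once against the uniform interface.
    resource.fetch(source, destination)

stage_input(LoggingRemoteCopy(), "gridftp://example.org/data/run42.dat", "run42.dat")
```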