Research Issues for Building and Integrating Peer-based and Grid Systems
Xiaodong Zhang
National Science Foundation
This talk does not necessarily reflect NSF's official opinions.

Hardware Cost and Implications
• 1980: $400,000/MIPS (Cray-1)
• 1990: $250/MIPS (i860)
• 2002: $2/MIPS or less
Storage is large and cheap. Information and computing are available everywhere.
Major challenges: distributed resource management, security, and privacy.

Impact on US Computer Exports
• Speed limits on computer exports
  - to Russia, China, India, and Middle East countries
  - measured in Millions of Theoretical Operations Per Second (MTOPS)
• Before 2001: limit of 28,000 MTOPS, less powerful than a cluster of ten 1.5 GHz/2-way PCs.
• 2001: limit of 85,000 MTOPS, less powerful than a cluster of ten 2.2 GHz/4-way PCs.
• 2002: limit of 195,000 MTOPS, less powerful than a cluster of ten 3 GHz/8-way PCs.

MTOPS Hardly Reflects Reality
• MTOPS views a computer as a high performance calculator. It
  - ignores the deep memory hierarchy,
  - ignores the fast internal interconnections,
  - ignores the power of clusters, and
  - ignores resource sharing over the Internet.
• The Senate passed a bill to remove MTOPS on 9/6/01.
• Computing power is mainly determined by the effective utilization of aggregated networked resources.

Client/Server based IT Infrastructure
• Services are provided by data/computing centers.
• Grid and Web search engines are server-based.
• Each server can be built from a distributed cluster.
• Inter- and intra-resource coordination.
• Services are guaranteed and trusted.
• Security is enforced within each server.

Current and Coming Internet Systems
• Rapidly growing Internet services are provided by an increasing number of peers.
• Variety of devices: ranging from a cell phone to a supercomputer center.
• Pervasive computing: access to information and services anytime and anywhere.

Client/Server Model is Being Challenged
No single server or search engine can sufficiently cover the increasing Web contents.
• 2×10^18 bytes/year are generated on the Internet.
• But only 3×10^12 bytes/year are available to the public (0.00015%).
• Google only searches 1.3×10^8 Web pages.
(Source: Gong, IEEE Internet Computing, 2001)

Client/Server (continued)
The client/server model seriously limits the utilization of available bandwidth and services.
• Popular servers and search engines become traffic bottlenecks.
• But high speed networks connecting many clients sit idle.
• Computing cycles and information in clients are ignored.

A New Paradigm: Peer-oriented Systems
• Each peer is both a client (consumer) and a server (producer).
• Each peer has the freedom to join and leave at any time.
• Huge peer diversity: service ability, storage space, networking speed, and service demand.
• A widely decentralized system that opens up both opportunities and new concerns.

Peer-oriented Systems
[Figure: three architectures. Client/server: a server such as a search engine or grid. Pure P2P: e.g. Freenet and Gnutella. Hybrid P2P: peers plus a directory, e.g. Napster.]

Peer-oriented Applications
• File sharing: document sharing among peers with no or limited central control.
• Instant Messaging (IM): immediate voice and file exchanges among peers.
• Distributed processing: one can widely utilize resources available in other remote peers.

Problem 1: Losing Security and Privacy
• Providing a conduit for malicious code and viruses.
• Providing loopholes for information leakage.
• Relaxing privacy protection by exposing peer identities.

Problem 2: Weak Resource Coordination
• Limited or no central control; the system mainly relies on self-organization.
• Lack of communication monitoring and scheduling causes unnecessary traffic jams.
• Lack of access and service coordination causes unbalanced loads among peers.

Demanded Solution (1): Fast Peer Services
• Dynamically identifying and collecting trusted and guaranteed peers as the backbone.
• Establishing adaptive self-organization and monitoring for resource coordination.
• Fast data and service searching within a low-diameter region.
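The difference between the hybrid (directory) and pure P2P architectures, and why uncoordinated search can generate unnecessary traffic, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the five-peer overlay, the file names, and the function names are hypothetical, and the flooding routine is a simplified TTL-limited broadcast in the spirit of Gnutella-style search, not the actual protocol of any system named in the slides.

```python
from collections import deque

# Hypothetical overlay: peer -> neighbours (undirected), and peer -> files held.
NEIGHBOURS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
FILES = {"A": set(), "B": {"x.txt"}, "C": set(), "D": set(), "E": {"y.txt"}}


def build_directory(files):
    """Index that a hybrid system's central directory would maintain."""
    index = {}
    for peer, names in files.items():
        for name in names:
            index.setdefault(name, set()).add(peer)
    return index


def directory_search(directory, name):
    """Hybrid P2P: a single query to the directory returns the holders."""
    return sorted(directory.get(name, ())), 1  # (holders, messages sent)


def flood_search(start, name, ttl):
    """Pure P2P: forward the query to neighbours until the TTL runs out."""
    seen = {start}
    queue = deque([(start, ttl)])
    holders, messages = [], 0
    while queue:
        peer, hops_left = queue.popleft()
        if name in FILES[peer]:
            holders.append(peer)
        if hops_left == 0:
            continue  # query expires here and is not forwarded
        for nxt in NEIGHBOURS[peer]:
            messages += 1  # every forwarded copy costs one message
            if nxt not in seen:  # duplicate copies are dropped on arrival
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return sorted(holders), messages


if __name__ == "__main__":
    print(directory_search(build_directory(FILES), "y.txt"))
    print(flood_search("A", "y.txt", ttl=2))
    print(flood_search("A", "y.txt", ttl=3))
```

On this toy overlay the directory answers with one query message, while flooding from peer A needs a TTL of 3 and 9 forwarded messages to reach the holder E; with a TTL of 2 the query spends 6 messages and finds nothing. The sketch ignores the trust and privacy concerns raised above and only shows the traffic side of the trade-off.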

(2): Allowing Distrustful Peers to Exist
• Ensure that peer interactions
  - do not become intrusive (monitoring/scheduling),
  - do protect privacy (communication anonymity),
  - are not used for denial-of-service attacks (security).

(3): Measurable Security Metrics
• Benchmarks for security measurement.
• Stochastic models for security analysis.
• Validating systems and quantifying degrees of security.

(4): Understanding the Trade-offs
• Analyzing the impact of centralized controls on performance and security.
• Quantifying the security loss and the performance gain/loss caused by decentralization.
• Optimizing peer-oriented systems for individual and combined objectives: high performance, highly secured, a balance of both, for a given performance objective, finding...

(5): Utilizing Existing Infrastructure
• Avoid establishing new standards and protocols.
• Avoid modifying commonly used and general purpose software.
• Peer-oriented processing should be automatic, with little user involvement.

Application Differences: Grid & P2P
• Grid: provides a global problem solving environment for large and critical scientific applications and professional collaborations, where each grid is a server.
• P2P: provides general and commercial information/computing services, where each peer can be both server and client.

Operation Differences: Grid & P2P
• Grid: direct access to computing, software, and data resources at remote and targeted sites. (Server-based)
• P2P: random access to available computing, software, and data resources without a specific target. (Client-based)

Different Participants: Grid & P2P
• Grid: pre-determined and registered clients and servers.
• P2P: clients and servers are neither distinguished nor registered, and can come and go as they choose.

Different QoS: Grid & P2P
• Grid: guaranteed and reliable services are required of each grid server.
• P2P: only partially reliable, because services from some peers are neither guaranteed nor trusted.

Security Differences: Grid & P2P
• Grid: authentication, authorization, and firewall protection for each grid.
• P2P: privacy, anonymity, authentication, authorization, and firewall protection are not guaranteed for each peer.

Different Controls: Grid & P2P
• Grid: centralized control plays an important role in resource monitoring/allocation and job scheduling.
• P2P: limited or no central control; mainly relies on self-organization.

Changing of NSF Sponsored High Performance Computing Efforts
• 1986 to 1996:
  - 5 independent Supercomputer Centers: Illinois, San Diego, Pittsburgh, Cornell, and Princeton
  - Science & Technology Research Centers: CRPC and Visualization & Graphics
• Missions:
  - providing high performance resources
  - developing new technologies
  - advancing scientific discovery.

Changing of NSF Sponsored High Performance Computing Efforts
• 1997 to 2002: two Partnerships for Advanced Computational Infrastructure (PACI)
  - NCSA at Illinois and NPACI at San Diego, leading 60+ institutions from 27 states.
• Missions:
  - providing grid computing and data resources
  - developing grid software tools
  - applications on grids
  - education outreach and training.

Changing of NSF Sponsored High Performance Computing Efforts
• 2001 to 2004: Distributed Terascale Facility (DTF)
  - 4 DTF sites: NCSA, NPACI, Argonne, and Caltech, providing an aggregate of 14+ teraflops and 450+ terabytes.
• Tasks:
  - NCSA: 6+ TF and 240+ TB, Linux cluster of Itaniums
  - NPACI: 4+ TF and 225+ TB
  - Argonne: 1+ TF IBM cluster, grid and visualization software
  - Caltech: 86 TB of on-line storage.

Large NSF Sponsored Grid Projects
• GIOD (Globally Interconnected Object Databases): global data storage and access for particle collider experiments.
• GriPhyN (Grid Physics Network): building global grids for experimental physics studies.
• iVDGL (international Virtual-Data Grid Laboratory): grids for physics/astronomy experiments and data-intensive science; US and EU collaboration.
• NEES (Network for Earthquake Engineering Simulation): shifting from physical tests to simulation (20 grid sites).

Changing of NSF Sponsored High Performance Computing Efforts
• 2003 to 2005: Enhanced Distributed Terascale Facility
  - 4 original DTF sites plus the Pittsburgh Supercomputing Center.
• Tasks:
  - enhancing the existing DTF software and hardware
  - testing large scale applications
  - widely connecting to users.

Merging P2P and Grid
• Envisioning 2005 and later: Decentralized Distributed Terascale Facility
  - many DTF sites, both large (servers) and small (peers)
  - pervasive computing: application scope beyond SC.
• Tasks: merging with peer-oriented technology
  - developing security and privacy protocols
  - coordinating heterogeneous Internet resources.

Future of Distributed Computing
• Grid infrastructure will provide reliable computing resources for large applications.
• Within a grid region, peer-oriented techniques will be integrated.
• The peer-oriented paradigm will play a major role in information retrieval.
• The demand for data access/transfer will be higher than the demand for cycles.

Suggestions To China Grid Projects
• Building Internet and system infrastructure
  - using open standards for resource sharing
  - both grid and peer-oriented systems.
• Joining and becoming a part of international efforts, to learn and to contribute.
• Identifying key domestic applications.

Identifying Key Domestic Applications
• Distributed digital libraries
  - management of natural/human resources
  - national security data archives
  - ...
• Large collaborative operations
  - large distributed simulation
  - collaborative design and manufacturing
  - bioinformatics
  - global information searching and processing, ...