20-755: The Internet
Lecture 1: Introduction
David O’Hallaron
School of Computer Science and Department of Electrical and Computer Engineering
Carnegie Mellon University
Institute for eCommerce, Summer 1999

Lecture 01, 20-755: The Internet, Summer 1999

Today’s lecture
• Course overview (25 min)
• Internet history (25 min)
• Break (10 min)
• Research overview (50 min)

Course Goals
• Understand the basic Internet infrastructure
  – review of basic computer system and internetworking concepts, TCP/IP protocol suite.
• Understand how this infrastructure is used to provide Internet services
  – client-server programming model
  – existing Internet services
  – building secure, scalable, and highly available services
• Understand how to write Internet programs
  – Use DNS and HTTP to map a part of the CMU Internet
  – Build a server that provides an interesting Internet service.

Teaching approach
• Approach the Internet from a host-centric viewpoint
  – How the Internet is used to provide services.
  – Complements the network-centric viewpoint of 20-770: Communications and Networking.
• Students learn best by doing
  – In our case, this means programming.

Course organization
• 14 lectures
  – Readings from the textbook and supplementary readings are posted beforehand.
  – Guest lecture: Bruce Maggs, SCS Assoc Prof and VP for Research at Akamai, a Boston-based Internet startup.
• Evaluation
  – Class participation (10%)
  – Two programming homeworks (20%) (groups of up to 2)
  – Programming project (50%) (groups of up to 2)
  – Final exam (20%)
• Office Hours
  – Mon 2:00-3:30
  – These are nominal times. Visit anytime my door is open.

Programming assignments
• Will be done on euro.ecom.cmu.edu
  – Pentium-class PC server running Linux
• Homeworks will use Perl5.
• Project can use language of your choice.
• Question:
  – Does the class need additional tutoring in editing and running Perl5 programs on a Unix box?

Scheduling issues
• We’ll need to double up on lectures (10:30-12:20 and 1:30-3:20) on three different days:
  – Mon July 12
  – Fri July 16
  – Fri July 23
• No class Fri Aug 6.

Course coverage
• Intro to computer systems (2 lectures)
• Review of internetworking (2 lectures)
• Client-server computing (1 lecture)
• Web technology (2 lectures)
• Other Internet applications (1 lecture)
• Secure servers (1 lecture)
• Scalable and available servers (2 lectures)
• RPC-based computing (1 lecture)
• Internet startup guest lecture

Internet history
• Sources:
  – Leiner et al., “A brief history of the Internet”, www.isoc.org/internet-history/brief.html
  – R. H. Zakon, “Hobbes’ Internet Timeline, v4.1”, www.isoc.org/guest/zakon/Internet/History/HIT.html
  – D. Comer, “The Internet Book, Second Edition”, Prentice Hall, 1997.

ARPANET Origins
• 1962
  – J.C.R. Licklider (MIT) describes “Galactic Network”.
  – Licklider becomes head of computer research at the Defense Advanced Research Projects Agency (DARPA) and convinces his eventual successor, Lawrence Roberts (MIT), among others, of the importance of the concept.
• 1964
  – Leonard Kleinrock (MIT) publishes first book on packet switching.
• 1965
  – Roberts and Thomas Merrill build first wide-area network (using a dial-up phone line!) between MA and CA.
• 1967
  – Roberts (now at DARPA) publishes plan for “ARPANET”, running at a blistering rate of 50 kbps.

ARPANET Origins (cont)
• 1968
  – DARPA issues RFQ for the packet switch component.
  – BBN (led by Frank Heart) wins contract and designs switch called an Interface Message Processor (IMP).
  – Bob Kahn (BBN) works on overall ARPANET architecture.
  – Roberts and Howard Frank (Network Analysis Corp) work on network topology and economics.
  – Kleinrock (UCLA) builds network measurement system.
• 1969
  – First IMP installed at UCLA (first ARPANET node).
  – Nodes added at SRI, UCSB, and Utah.
  – By the end of the year the 4-node ARPANET is working, with 50 kbps lines supplied by AT&T.

ARPANET Origins (cont)
• 1970
  – BBN, RAND, and MIT added to ARPANET.
  – Network Working Group (NWG), under Steve Crocker, designs initial host-to-host protocol (NCP).
• 1971
  – 15 hosts: UCLA, SRI, UCSB, Utah, BBN, MIT, RAND, SDC, Harvard, Lincoln Labs, UIUC, CWRU, CMU, NASA/Ames.
  – Ray Tomlinson (BBN) writes first ARPANET email program (origin of the @ sign).
  – email becomes the first Internet killer app.

Birth of Internetworking
• 1972
  – Kahn (DARPA) introduces idea of “open architecture networking”:
    » Each network must stand on its own, with no internal changes required to connect to the Internet.
    » Communications would be on a best-effort basis.
    » “Black boxes” (later called “gateways” and “routers”) would be used to connect the networks.
    » No global control at the operations level.
• 1973
  – Metcalfe and Boggs (Xerox) develop Ethernet.
• 1974
  – Kahn and Vint Cerf (Stanford) publish first details of TCP, which is later split into TCP and IP in 1978.

Birth of Internetworking (cont)
• ~1980
  – Berkeley releases open source BSD Unix with a TCP/IP implementation.
• 1982
  – DARPA establishes TCP/IP as the protocol suite for ARPANET, offering first definition of an “internet”.
• 1983
  – Jan 1: ARPANET switches from NCP to TCP/IP.
• 1984
  – Mockapetris (USC/ISI) invents DNS.
  – Number of ARPANET hosts surpasses 1,000.
• 1985
  – symbolics.com becomes first registered domain name.
  – other firsts: cmu.edu, purdue.edu, rice.edu, ucla.edu, css.gov, mitre.org

Birth of Internetworking (cont)
• 1986
  – NSFNET backbone created (56 Kbps) between 5 supercomputing sites (Princeton, Pittsburgh, San Diego, Ithaca, Urbana), allowing explosion of university sites.
• 1988
  – Internet worm attack.
  – NSFNET backbone upgraded to T1 (1.544 Mbps).
• 1989
  – Number of hosts breaks 100,000.
• 1990
  – ARPANET ceases to exist.
  – world.std.com becomes first commercial dial-up ISP.

The Web changed everything...
• 1991
  – Tim Berners-Lee (CERN) invents the World Wide Web (HTTP server and a text-based browser).
  – NSFNET backbone upgraded to T3 (44.736 Mbps).
• 1993
  – Mosaic WWW browser developed by Marc Andreessen (UIUC).
• 1995
  – WWW traffic surpasses FTP as the source of greatest Internet traffic.
  – Netscape goes public.
  – NSFNET decommissioned and replaced by interconnected commercial network providers.
• 1999
  – MCI/Worldcom upgrades its US backbone to 2.5 Gbps.

Internet Domain Survey (www.isc.org)
Figure: number of Internet hosts (log scale, 100 to 100,000,000) by survey date: Aug-81, Oct-84, Nov-86, Oct-88, Oct-89, Jul-91, Apr-92, Jan-93, Oct-93, Oct-94, Jan-96, Jul-97, Jan-99.

Summary
• The Internet has had an enormous impact on the world economy and day-to-day lives.
  – mechanism for world-wide information dissemination.
  – medium for collaboration and interaction without regard to geographic location.
• One of the most successful examples of government, university, and business partnership.
  – Possible only because of sustained government investment and commitment to research and development.
  – Successful because of commitment by passionate researchers to “rough consensus and running code” (David Clark, MIT)

Break time!
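A rough sense of the growth rate behind the host counts quoted in the timeline (1,000 hosts surpassed in 1984, 100,000 in 1989) can be had by assuming exponential growth between those two points. The sketch below is only an illustration under that assumption: it treats the two thresholds as exact counts five years apart, and the function names are hypothetical.

```python
import math


def annual_growth_factor(n_start: float, n_end: float, years: float) -> float:
    """Factor by which the count multiplies each year, assuming exponential growth."""
    return (n_end / n_start) ** (1.0 / years)


def doubling_time_years(n_start: float, n_end: float, years: float) -> float:
    """Time for the count to double, assuming exponential growth."""
    return years * math.log(2) / math.log(n_end / n_start)


# Assumption: the thresholds from the timeline stand in for exact host counts.
HOSTS_1984 = 1_000
HOSTS_1989 = 100_000

factor = annual_growth_factor(HOSTS_1984, HOSTS_1989, 1989 - 1984)
doubling = doubling_time_years(HOSTS_1984, HOSTS_1989, 1989 - 1984)
```

Under these assumptions the host count multiplies by about 2.5 each year, which corresponds to a doubling time of roughly nine months.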
Dv: A toolkit for visualizing massive remote datasets
David O’Hallaron
School of Computer Science and Department of Electrical and Computer Engineering
Carnegie Mellon University
Institute for eCommerce, Summer 1999

Internet service models
Diagram: a client sends a request to a server; the server returns a response.
• Traditional lightweight service model
  – small to moderate amount of computation to satisfy requests
  – e.g. serving web pages, stock quotes, online trading, search engines
• Proposed heavyweight service model
  – massive amounts of computation to satisfy requests
  – e.g. scientific visualization, data mining, medical imaging

Quake Project
• Carnegie Mellon
  – David O’Hallaron (CS and ECE)
  – Jacobo Bielak [PI] and Omar Ghattas (CivE)
• University of California Berkeley
  – Jonathan Shewchuk (EECS)
• Southern California Earthquake Center
  – Steve Day and Harold Magistrale (San Diego State)
• Kogakuin University, Tokyo
  – Yoshi Hisada

Figure: Teora, Italy 1980

San Fernando Valley
Map labels: lat. 34.38 long. -118.16; epicenter (x) lat. 34.32 long. -118.48; lat. 34.08 long.
-118.75

San Fernando Valley (top view)
Figure labels: hard rock; soft soil; epicenter (x).

San Fernando Valley (side view)
Figure labels: soft soil; hard rock.

Figure: Initial node distribution

Figure: Unstructured mesh

Figure: Unstructured mesh (top view)

Partitioned unstructured finite element mesh of San Fernando
Figure labels: nodes; element.

Communication graph
• Vertices: processors
• Edges: communications

Quake solver code

    NODEVECTOR3 disp[3], M, C, M23;
    MATRIX3 K;

    /* matrix and vector assembly */
    FORELEM(i) {
      ...
    }

    /* time integration loop */
    for (iter = 1; iter <= timesteps; iter++) {
      MV3PRODUCT(K, disp[dispt], disp[disptplus]);
      disp[disptplus] *= - IP.dt * IP.dt;
      disp[disptplus] += 2.0 * M * disp[dispt] -
        (M - IP.dt / 2.0 * C) * disp[disptminus] - ...;
      disp[disptplus] = disp[disptplus] / (M + IP.dt / 2.0 * C);
      /* rotate the three time-level indices for the next step */
      i = disptminus;
      disptminus = dispt;
      dispt = disptplus;
      disptplus = i;
    }

Archimedes
www.cs.cmu.edu/~quake
Diagram (tool chain):
• Problem geometry (.poly) -> Triangle/Pyramid -> .node, .ele -> Slice -> .part -> Parcel -> .pack
• Finite element algorithm (.arch) -> Author -> .c -> C compiler (with runtime library) -> a.out
• a.out and .pack run on the parallel system
• Sample .arch code: MVPRODUCT(A,x,w); DOTPRODUCT(x,w,xw); r = r/xw;

Northridge quake simulation
• 40 seconds of an aftershock from the Jan 17, 1994 Northridge quake in the San Fernando Valley of Southern California.
• Model:
  – 50 x 50 x 10 km region of San Fernando Valley.
  – 13,422,563 nodes, 76,778,630 linear tetrahedral elements, 1 Hz frequency resolution, 20 meter spatial resolution.
• Simulation
  – 0.0024s timestep
  – 16,666 timesteps (one 40M x 40M sparse matrix-vector product (SMVP) each timestep).
  – ~15 GBytes of DRAM.
  – 6.5 hours on 256 PEs of Cray T3D (150 MHz 21064 Alphas, 64 MB/PE).
  – Comp: 16,679s (71%) Comm: 575s (2%) I/O: 5995s (25%)
  – 80 trillion (10^12) flops (sustained 3.5 GFLOPS).
  – 800 GB/575s (burst rate of 1.4 GB/s).

Figure: Kobe 2/2/95 aftershock

Figure: Visualization of 1994 Northridge aftershock

Typical Quake viz pipeline
Diagram (pipeline stages): FEM solver engine with materials database -> remote database -> reading -> interpolation -> isosurface extraction -> scene synthesis -> rendering -> local display and input.
Other diagram labels: ROI, resolution, contours, scene; vtk library routines.

Heavyweight grid service model
Diagram: remote compute hosts (allocated once per service by the service provider) connected over a WAN to local compute hosts (allocated once per request by the service user).

Active frames
Diagram: an active frame server running on a host. An input active frame (frame data and frame program) arrives at the active frame interpreter, which uses application libraries (e.g., vtk) and emits an output active frame (frame data and frame program).

Overview of a Dv visualization service
Diagram: a local Dv client (user inputs, display) sends a request frame to a Dv server acting as the request server for the remote dataset. Response frames pass from Dv server to Dv server, across the remote Dv active frame servers and the local Dv active frame servers, to the client.

Grid-enabling vtk with Dv
Diagram labels: request frame [request server, scheduler, flowgraph, data reader]; status; request server; reader; scheduler; local Dv client; result; ... ...
response frames (to other Dv servers); local Dv server [native data, scheduler, flowgraph, control]; remote machine (Dv request server); local machine (Dv client).

Scheduling Dv programs
• Scheduling at request frame creation time
  – all response frames use same schedule
  – performance portability (i.e., adjusting to heterogeneous resources) is possible.
  – no adaptivity (i.e., adjusting to dynamic resources)
• Scheduling at response frame creation time
  – performance portability and limited adaptivity.
• Scheduling at response frame delivery time
  – performance portability and greatest degree of adaptivity.
  – per-frame scheduling overhead a potential disadvantage.

Scheduling scenarios
• Ultrahigh bandwidth link: low-end remote server, powerful local server
• High bandwidth link: high-end remote server, powerful local workstation
• Low bandwidth link: high-end remote server, local PC
• Low bandwidth link and high bandwidth link: high-end remote server, powerful local proxy server, low-end local PC or PDA

Summary
• Heavyweight grid service model
  – service providers can constrain resources allocated to a particular service
  – service users can contribute resources to improve response time or throughput
• Active frames
  – general software framework for providing heavyweight Internet services
  – framework can be specialized for a particular service type
• Dv
  – specialized version of active frame server for visualization
  – grid-enables existing vtk toolkit
  – flexible framework for experimenting with scheduling algorithms
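The active frame idea summarized above (a frame carries both data and a program, and a server's interpreter runs that program and forwards the resulting frame) can be sketched in a few lines of Python. This is only an illustrative sketch, not the Dv implementation: `ActiveFrame`, `ActiveFrameServer`, `least_loaded`, `run`, `demo`, the toy stage functions, and the host names are all hypothetical, and forwarding between hosts is simulated with in-process calls rather than network messages. The next host is chosen each time a frame is delivered, which corresponds to the "scheduling at response frame delivery time" option.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A frame program is a list of stages; each stage maps frame data to new frame data.
Stage = Callable[[List[int]], List[int]]


@dataclass
class ActiveFrame:
    """Hypothetical active frame: application data plus the program still to run on it."""
    data: List[int]
    program: List[Stage]


class ActiveFrameServer:
    """Hypothetical server whose interpreter runs the next stage of a frame's program."""

    def __init__(self, name: str) -> None:
        self.name = name

    def process(self, frame: ActiveFrame) -> ActiveFrame:
        stage, rest = frame.program[0], frame.program[1:]
        # The output frame carries the transformed data and the remaining program.
        return ActiveFrame(data=stage(frame.data), program=rest)


def least_loaded(load: Dict[str, int]) -> str:
    """Toy delivery-time scheduler (an assumption): pick the host with the lowest load."""
    return min(load, key=lambda host: load[host])


def run(frame: ActiveFrame,
        servers: Dict[str, ActiveFrameServer],
        load: Dict[str, int]) -> Tuple[ActiveFrame, List[str]]:
    """Pass the frame from server to server until its program is exhausted."""
    trace: List[str] = []
    while frame.program:
        host = least_loaded(load)  # decided per frame, at delivery time
        frame = servers[host].process(frame)
        load[host] += 1
        trace.append(host)
    return frame, trace


# Toy stand-ins for visualization stages; real stages would call into a library such as vtk.
def select_roi(data: List[int]) -> List[int]:
    return data[2:8]


def downsample(data: List[int]) -> List[int]:
    return data[::2]


def threshold(data: List[int]) -> List[int]:
    return [x for x in data if x >= 4]


def demo() -> Tuple[ActiveFrame, List[str]]:
    servers = {name: ActiveFrameServer(name) for name in ("remote", "local")}
    load = {"remote": 0, "local": 1}  # made-up initial loads
    request = ActiveFrame(data=list(range(10)),
                          program=[select_roi, downsample, threshold])
    return run(request, servers, load)
```

Calling `demo()` returns the final frame together with the list of hosts that handled each stage. Moving the `least_loaded` call out of the loop, so that one host assignment is computed up front and reused, would correspond to fixing the schedule once at request frame creation time.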