Intelligent Information Systems
1. Internet History
Gio Wiederhold
EPFL, April-June 2000, at 14:15 - 15:15, room INJ 211
7/26/2016 EPFL1H - Gio spring 2000 1

Schedule for Seminar Course
• Presentations in English -- but I'll try to manage discussions in French and/or German.
• I plan to cover the material in an integrating fashion, drawing from concepts in databases, artificial intelligence, software engineering, and business principles.
1. 13/4 Historical background, enabling technology: ARPA, Internet, DB, OO, AI, IR, XML.
2. 27/4 Search engines and methods (recall, precision, overload, semantic problems).
3. 4/5 Digital libraries, information resources. Value of services, copyright.
4. 11/5 E-commerce. Client-servers. Portals. Payment mechanisms, dynamic pricing.
5. 19/5 Mediated systems. Functions, interfaces, and standards. Intelligence in processing. Role of humans and automation, maintenance.
6. 26/5 Software composition. Distribution of functions. Parallelism. [ww D. Beringer]
7. 31/5 Application to Bioinformatics.
8. 15/6 Semantic Interoperation. (Changed from original plan)
9. 22/6 Privacy protection and security. Security mediation.
10. 29/6 Educational challenges. Expected changes in teaching and learning. Summary and projection for the future.
• Feedback and comments are appreciated.

The origin: ARPAnet
• Motivation
– Share expensive computing resources funded at 5 principal research sites by ARPA
– services needed
  • TELNET -- remote execution control
  • FTP -- file transfer needed for TELNET
  • messaging for synchronization -- email - SMTP
– requirements
  • handle heterogeneity
  • survivability

(D)ARPA
• (Defense)
– internal motivation, greater or lesser as a function of the political climate
• Advanced Research
– not undertaken by industry (by need)
• Projects
– limited time, intense support
• Agency
– started 1958 - post sputnik - rocket science
– Information science started ~1967 IPTO

Technologies
• Platform, representation independence
– ASCII (7 bits), BCD (6 bits), EBCDIC (8 bits), binary (any size)
• Packeting
– limits buffer lengths, allows rerouting
• Dynamic path determination
– nodes decide next best node -- now by DNS servers
– versus other systems, initially:
  • UUnet required specifying all nodes
  • NASA network had direct connections
  • VMnet central directory

Growth - exponential
• 1969 - 5 nodes, 4 computing sites
• 1972 - ~12 nodes, 37 sites
• ~1976 - ad hoc gateways to other nets for email
• 1979 - many computer scientists have/need access
• 1981 - Stanford & Xerox router / gateway protocols
• 1985 - domain naming x.y.typ / Internet Protocol (4 segment) addresses
• 1991 - base for NREN, NSF backbone; except for x.y.mil
• 1992 - commercial domains permitted
• 1993 - 15M users, 3M paying
• 1994 - Digital Library initiative
• 1995 - fully commercial operation, research use by grants
• 1996 - NSF research initiative Next Generation Internet
• 1998 - ICANN established
• 1999 - 2.2M sites on Internet, 288M public pages

Initial configuration
• BBN - development node - IMP [Bob Kahn]
– Lockheed mini-computer
• SRI - documentation node - RFCs [Jon Postel]
– DEC PDP-10
• UCLA - network science node [Leonard Kleinrock]
– IBM 360
• UCSB - software node [Feldman, … ]
– SDS Sigma 7 (~ 360 architecture)
• Utah - graphics hardware node [Ivan Sutherland, Evans]
– SDS 940
[Diagram: the five nodes BBN, SRI, UT, SB, LA and their interconnections]
Each node has multiple connections to other nodes
Nodes can serve more than one computing site

Packeting and IMPs
IMP (interface message processor) must deal with
• Limited memory
• Unreliable communications
• Long sessions & big files
TCP Transmission Control Protocol: Packeting and Packet Switching (used in AlohaNet, 1962 [Abramson])
• splits up messages, files into independently portable units: packets { from, to, number, size, data }
Each node reads header, makes forwarding decisions based on a table (can change dynamically)
[Diagram: packets { A, G, 1, 256, "ijkl" } and { A, G, 2, 202, "mno" } travel from node A towards node G over links labelled b, c, g.
Forwarding table at A: A here; B use b; C use c; D use b; E use c; F use c; G use b.
Forwarding table at B (partial): A use b; B here; C use b.]

Development
Informal, distributed over the user community
• Requests For Comment (RFCs): collected at SRI; adopted when they made sense to enough participants, as demonstrated by prototypes; can become standards
• RFCs available now at Network Solutions Inc
• allowed growth without a central authority
Is that a generalizable principle?

More early participants
• IMPs could handle 4?? computer sites
– (i.e., at SRI: SRI PDP-10, Stanford SAIL PDP-6, SUMEX)
• added Terminal Interface Processors (TIPs)
– for terminals (AT&T TTY, DEC VT100, …) only
• More IMPs, TIPs, but restricted to ARPA contractors
Other networks, other technologies
• IBM VMnet internal, then external IBM customers
– central naming authority
• NASA for sharing its satellite data processing
– high bandwidth, mainly Telnetting
• UUnet for Unix users and ARPAnet sites
– periodic forwarding, message names all intermediate nodes
– access to Europe (fast via SEISMO in Norway)

Email ? !
• Initially - at Telnet login show system status
– local time, up/busy/down, special situations
• Add arbitrary messages
• As need for remote computing diminished, use of E-mail increased - new communication medium
Formalized 1982 by RFC 821, Simple Mail Transfer Protocol
Serendipitous major social / research benefit
• Many Related Functions
– bulletin boards ...

ETHERNET
Novel protocol for broadcast medium [Metcalfe, Boggs, Shoch]
• Also developed in Aloha net (Hawaiian islands)
• collision detection (CD) protocol
– no synchronization, fully distributed
– relatively long latency in space and on wire causes collisions -- crossed, mixed signals
– listen while sending; when a collision is detected, both senders stop!
– resend with exponential backoff (wish humans would do that)
– simple and stable to fairly high utilization
• outperformed at high rate theoretically only
Used for local networks, with gateways to Internet

INTERNET
Backbone providers
– UUNet (1993), SPRINT, AT&T, MCI, GT&E, Worldcom + European PT&Ts
• serve regional networks, large users (CNN, …)
• share resources by free peering at gateway nodes
Regional subnets and ISPs distribute bandwidth to
1. Consumers (#=n) may pay / are seen as having value
   Metcalfe’s law: the value of a net is ~ n^2
2. Smaller ISPs
All dealers oversubscribe: sell more bandwidth than they buy
– count on fractional use
– don’t need to buy for intra-region / intra-ISP traffic
• Peripheral buffering services can reduce traffic further (MIT Akamai)

HTML
• HyperText Markup Language
– sharing of physics preprints [Tim Berners-Lee @CERN]
– markup = embedded format commands for layout
• Multi-part, multi-representation (text, figs) documents
• Markups per SGML + (hyper = external) links
– SGML = Standard Generalized Markup Language, an IBM-initiated document markup standard
– basic commands are size, font, color independent, to be interpreted by the publisher for report, book, manual, ...
Alternatives (among others; also UNIX runoff, …)
• XEROX initiated Postscript (PS), Adobe PDF
– exact bit-wise layout via executable script
• TeX markup: detail (pretty math) [Knuth], LaTeX macros [Lamport]
– generates device independent format (DVI), then PS

Web
• Browsers for HTML
• Mosaic [Andreessen, Bina at UIUC]
• Netscape …
• Search engines (Topic 2)

XML
Machine Processable !
• return to origin?
– ARPAnet -- share heterogeneous machines
– Email -- people-to-people
– Digital Library -- people-to-machines
– E-commerce (E2B) -- people-to-machines
• client-server
– Mediated -- people-to-services-to-machines
– Business (B2B) -- machine-to-machine(s)
– Business services -- machine-to-services-to-machines
– Ubiquitous -- gadget-to-gadget
• Future (embedded)

Fin
Comments?
• what was new / what was old or boring?
• future emphasis
– more technological detail?
– more situational detail?
– more extrapolation to the future
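
Supplement to the "Packeting and IMPs" slide
A minimal Python sketch of the packeting idea described there: a message is split into packets { from, to, number, size, data }, and the destination restores it even when packets arrive out of order. The names Packet, split_message and reassemble, the 256-byte payload limit, and the 458-byte test message are assumptions made for illustration (chosen so that the packet sizes match the 256 and 202 shown in the slide's diagram). It is not the IMP or TCP implementation, and it ignores loss, retransmission and routing.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """One independently transportable unit, with the header fields from the slide."""
    src: str      # "from"
    dst: str      # "to"
    number: int   # position of this packet within the message, starting at 1
    size: int     # payload length in bytes
    data: bytes   # payload


def split_message(src: str, dst: str, message: bytes, max_payload: int = 256) -> list[Packet]:
    """Cut a message into packets of at most max_payload bytes each."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [message[i:i + max_payload] for i in range(0, len(message), max_payload)]
    return [Packet(src, dst, n, len(chunk), chunk) for n, chunk in enumerate(chunks, start=1)]


def reassemble(packets: list[Packet]) -> bytes:
    """Restore the message from packets received in any order."""
    ordered = sorted(packets, key=lambda p: p.number)
    return b"".join(p.data for p in ordered)


# Hypothetical 458-byte message sent from node A to node G.
message = bytes(range(256)) + b"x" * 202
packets = split_message("A", "G", message)

# Simulate packets taking different paths and arriving out of order.
arrived = packets[:]
random.Random(0).shuffle(arrived)
restored = reassemble(arrived)

print([(p.src, p.dst, p.number, p.size) for p in packets])
print(restored == message)
```

Because every packet carries its own header, a node can forward it without knowing the rest of the message; the sequence number is what lets the receiver put the pieces back in order.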