Chapter 2
Technology Infrastructure

Learning Objectives
1. Origins, growth and current structure of the Internet
2. Packet-switching
3. Internet protocols
4. History of markup languages
5. Internets, intranets and extranets

The beginning …
• 1858 – let there be transatlantic cables
  – But it was not right … technical failures
• 1866 – let there be transatlantic cables 2
  – Great success which remained in operation for 100 years
• 1957 – The USSR launched Sputnik
  – The first satellite
• 1958 – The US set up ARPA
  – Advanced Research Projects Agency

1960s – The Cold War
• US vs USSR
• Threat of nuclear attack
• Computers were strategic
• Dr Licklider chosen to improve the military's use of computer technologies

ARPANET's first transmission …
• Plan:
  – University of California, Los Angeles hoped to log onto the Stanford computer and try to send it the word "LOGIN"
"We set up a telephone connection between us and the guys at SRI..."
"We typed the L and we asked on the phone, 'Do you see the L?'"
"'Yes, we see the L,' came the response."
"We typed the O, and we asked, 'Do you see the O?'"
"'Yes, we see the O.'"
"Then we typed the G, and the system crashed."
• On the second attempt, it worked perfectly!
• The revolution had begun …

1969 ARPANET
• Financed by Defence ARPA (DARPA)
• Made up of a few nodes initially
• 1971 first Terminal Interface Processor enabled direct dial-in to the net
• 1972 first public demonstration with 24 sites (NASA, National Science Foundation, Defence, etc)
• Email was born when Ray Tomlinson wrote a program that could send messages
• 1973 connections with England and Norway. First satellite connection.
• 1974 62 computers online

The ARPANET started growing (1974)

1975 ARPANET
• 1975 opened to the public, with 111 computers online by 1977
• 1979 Students at Duke University and the University of North Carolina started UseNet to share news
• 1983 the Department of Defence Data Network (called MILNET) split off from ARPANET
• 1985 public links existed across North America, Europe and Australia
• 1989 the National Science Foundation permitted two commercial email services, MCI Mail and CompuServe
• 1990 ARPANET was retired and its computers were moved to the new network, NSFNET, which came to be known as the Internet
• 1995 privatization completed, with several Network Access Points in place through which access could be sold to ISPs, etc.

Concept behind the WWW
• 1945 Vannevar Bush, director of the US Office of Scientific Research and Development, speculated on the creation of a machine called a memex which would hold all documents
• Hypertext
  – Created by Ted Nelson in the 60s
  – Text, pictures, anything can be linked
  – Move from one item to the other
  – Browser is used to interpret hypermedia
• http://www.xanadu.com

World Wide Web (1)
• Proposed by Sir Tim Berners-Lee while working at CERN (European Organization for Nuclear Research)
• Originally for record keeping and links
• Read
  – Original WWW proposal
    • http://www.w3.org/History/1989/proposal.html
  – WWW past, present, future
    • http://www.w3.org/People/Berners-Lee/1996/ppf.html

World Wide Web (2)
• CERN had an ARPANET connection via EUnet since 1990
• 1991 Tim posted a note on the alt.hypertext newsgroup about
  – Web server
  – Line browser
• Servers started appearing everywhere
• 1993 MOSAIC (first to embed images)
• 1994 W3C created (http://www.w3.org)

The Internet as we know it …
• Enormous network of computers
• E-mail
• Mailing lists
• Bulletin boards
• Web pages
• Intranets
• Distribution of information, software, etc
• What else?

Why a packet switching network?
Packet-Switched Networks (1)
• Local area network (LAN)
  – Network of computers located close together
• Wide area networks (WANs)
  – Networks of computers connected over greater distances
• Circuit
  – Combination of telephone lines and closed switches that connect them to each other

Packet-Switched Networks (2)
• Circuit switching is used in telephone communication
• The Internet uses packet switching
• Packet switching needs computers called "routers" and programs called "routing algorithms"

Packet-Switched Networks (3)
• Information is divided into packets
• Each packet is passed from node to node
• The packets are recomposed into one chunk on the destination server

Routing Packets
• Routing computers
  – Computers that decide how best to forward packets
• Routing algorithms
  – Rules contained in programs on router computers that determine the best path on which to send packets
  – Programs apply their routing algorithms to information they have stored in routing tables

ARPANET
• ARPANET is the earliest packet-switched network
• This wide area network used the Network Control Protocol (NCP)
• A protocol is a collection of rules for formatting, ordering, and error-checking data sent across a network

ARPANET (2)

Open Architecture of ARPANET
1. Independent networks should not require any internal changes in order to be connected to the network
2. Packets that do not arrive at their destinations must be retransmitted from their source network
3. The router computers do not retain information about the packets that they handle
4. No global control exists over the network.
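The "divide, pass along, recompose" idea from the Packet-Switched Networks slides can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `to_packets` and `reassemble`, the 8-byte packet size, and the use of a shuffle to stand in for packets taking different routes are all hypothetical, not part of any real protocol.

```python
import random


def to_packets(message: bytes, size: int = 8):
    """Divide a message into (sequence_number, payload) packets."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]


def reassemble(packets):
    """Recompose the message by ordering packets on their sequence number."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


message = b"LOGIN request sent over a packet-switched network"
packets = to_packets(message)

# Packets are forwarded independently, so they may arrive out of order.
random.shuffle(packets)

assert reassemble(packets) == message
```

The sequence number is what lets the destination rebuild the message without the routers keeping any information about the packets they forwarded, in line with point 3 of the open architecture list.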
The Internet

The TCP/IP Protocol
• The Transmission Control Protocol (TCP) and the Internet Protocol (IP) are the two protocols that support the Internet's operation
• TCP controls the disassembly of a message into smaller packets before it is transmitted over the Internet, and their reassembly at the destination
• The IP protocol includes rules for routing individual data packets from their source to their destination

Open Systems Interconnection Model
OSI model layers (from the highest to the lowest), with examples from the TCP/IP protocol suite:

Layer            Examples
7 Application    HTTP, SMTP, FTP, Telnet, SSH, Whois, etc. (layers 5 to 7)
6 Presentation
5 Session
4 Transport      TCP, UDP
3 Network        IP
2 Data Link      Ethernet
1 Physical       Wire, Radio, Fibre Optic

Some jargon …
• HTTP – HyperText Transfer Protocol
• SMTP – Simple Mail Transfer Protocol
• FTP – File Transfer Protocol
• SSH – Secure Shell
• Telnet – Teletype Network
• Whois – Who Is?
• TCP – Transmission Control Protocol
• UDP – User Datagram Protocol
• IP – Internet Protocol

IP Address
• Internet addresses are based on a 32-bit number called an IP address
• IP addresses are written as four numbers (each from 0 to 255) separated by periods
• An address such as 126.204.89.56 uniquely identifies a computer connected to the Internet
• IP subnetting conceptually divides a large network into smaller sub-networks

IP Classes (1)

IP Classes (2)

Class    Leading bits  Network numbers  Addresses per network
Class A  0             126              16,777,214
Class B  10            16,384           65,534
Class C  110           2,097,152        254

Subnetting

Without subnetting …
• Explosion in size of IP routing tables.
• Every time more address space was needed, the administrator would have to apply for a new block of addresses.
• Any changes to the internal structure of a company's network would potentially affect devices and sites outside the organization.
• Keeping track of all those different Class C networks would be a bit of a headache in its own right.
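The IP address, class and subnetting slides above can be tried out with Python's standard `ipaddress` module. The sketch below is illustrative only: the helper name `address_class` and the block `192.168.10.0/24` are assumptions chosen for the example, and classful addressing is shown purely to mirror the table.

```python
import ipaddress


def address_class(addr: str) -> str:
    """Return the classful category, decided by the leading bits of the address."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet >> 7 == 0b0:
        return "A"
    if first_octet >> 6 == 0b10:
        return "B"
    if first_octet >> 5 == 0b110:
        return "C"
    return "D/E"


addr = ipaddress.IPv4Address("126.204.89.56")
print(int(addr))                 # the same address as one 32-bit number
print(address_class(str(addr)))  # leading bit is 0, so this is Class A

# Subnetting: divide one Class C sized block into four smaller sub-networks.
block = ipaddress.ip_network("192.168.10.0/24")  # hypothetical example block
subnets = list(block.subnets(prefixlen_diff=2))
for net in subnets:
    # Two addresses per subnet are reserved (network and broadcast).
    print(net, "usable hosts:", net.num_addresses - 2)
```

Only the organization's internal routers need to know about the four `/26` sub-networks; from outside, the whole block can still be reached through a single routing table entry.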
Benefits of Subnetting
• Better match to physical network structure
• Flexibility
• Invisibility to public Internet
• No need to request new IP addresses
• No routing table entry proliferation

IPv6 (or IP Next Generation)
• Network Layer
• Developed in 1994
• Will replace the IPv4 standard
  – limits on network addresses will eventually lead to exhaustion of available addresses (by 2023)
  – supports only 4,294,967,296 addresses (32 bits)
• Improvements include
  – providing future cell phones and mobile devices their own unique and permanent addresses
  – supports about 3.4 × 10^38 addresses (128 bits)

Domain Names
• A Uniform Resource Locator (URL) consists of names and abbreviations that are much easier to remember than IP addresses
• The HTTP protocol defines how an Internet resource is accessed
• An address such as www.microsoft.com is called a domain name

Top-Level Domain Names
• Internet Corporation for Assigned Names and Numbers (ICANN)
  – Responsible for managing domain names and coordinating them with IP address registrars

More Top-Level Domain Names
• Unsponsored: .biz .com .edu .gov .info .mil .name .net .org
• Sponsored: .aero .cat .coop .int .jobs .museum .pro .travel
• Infrastructure: .arpa .root
• Startup phase: .mobi .post .tel
• Proposed: .asia .cym .geo .kid .kids .mail .sco .web .xxx
• Deleted/retired: .nato
• Reserved: .example .invalid .localhost .test
• Pseudodomains: .bitnet .csnet .local .onion .uucp

Country Top Level Domains
• .ac .ad .ae .af .ag .ai .al .am .an .ao .aq .ar .as .at .au .aw .ax .az
  .ba .bb .bd .be .bf .bg .bh .bi .bj .bm .bn .bo .br .bs .bt .bv .bw .by .bz
  .ca .cc .cd .cf .cg .ch .ci .ck .cl .cm .cn .co .cr .cu .cv .cx .cy .cz
  .de .dj .dk .dm .do .dz .ec .ee .eg .er .es .et .eu
  .fi .fj .fk .fm .fo .fr .ga .gd .ge .gf .gg .gh .gi .gl .gm .gn .gp .gq .gr .gs .gt .gu .gw .gy
  .hk .hm .hn .hr .ht .hu .id .ie .il .im .in .io .iq .ir .is .it .je .jm .jo .jp
  .ke .kg .kh .ki .km .kn .kr .kw .ky .kz .la .lb .lc .li .lk .lr .ls .lt .lu .lv .ly
  .ma .mc .md .mg .mh .mk .ml .mm .mn .mo .mp .mq .mr .ms .mt .mu .mv .mw .mx .my .mz
  .na .nc .ne .nf .ng .ni .nl .no .np .nr .nu .nz .om
  .pa .pe .pf .pg .ph .pk .pl .pm .pn .pr .ps .pt .pw .py .qa .re .ro .ru .rw
  .sa .sb .sc .sd .se .sg .sh .si .sk .sl .sm .sn .sr .st .sv .sy .sz
  .tc .td .tf .tg .th .tj .tk .tl .tm .tn .to .tr .tt .tv .tw .tz
  .ua .ug .uk .us .uy .uz .va .vc .ve .vg .vi .vn .vu .wf .ws .ye .yt .yu .za .zm .zw

HTTP
• Hypertext Transfer Protocol (HTTP) is responsible for transferring Web pages, which the browser then displays
• A user's Web browser opens an HTTP session and sends a request for a Web page to a remote server
• In response, the server creates an HTTP response message that is sent back to the client's Web browser

Internet Utility Programs
• Finger
  – Lists information about the user
• Ping (Packet Internet Groper)
  – Sends data packets to check the quality of a link or verify that a machine is connected to the Internet.
• Tracert
  – Used to see a network packet being sent and received and the number of hops required for that packet to get to its destination
• VisualRoute

Internet Applications
Three representative Internet applications:
• Electronic mail
• Telnet
• FTP

SMTP, POP, and IMAP (1)
• E-mail sent across the Internet is managed and stored by mail servers
• Simple Mail Transfer Protocol (SMTP) is the standard for sending e-mail and relaying it between mail servers
• Post Office Protocol (POP) is the standard an e-mail client program uses to retrieve mail from the mail server
• The Internet Message Access Protocol (IMAP, originally the Interactive Mail Access Protocol) is a newer e-mail retrieval protocol

SMTP, POP, and IMAP (2)

FTP
• The File Transfer Protocol (FTP) implements a mechanism to transfer files between TCP/IP-connected computers
• FTP transfers both binary and ASCII text files
• Full privilege FTP allows remote uploading and downloading of files
• Anonymous FTP allows you to log on as a guest

Controlling Unsolicited Commercial E-Mail (UCE), better known as Spam
• Use complex email addresses rather than name and surname combinations
  – Why? Bots? Name directories?
• Control exposure of email address
  – How? JavaScript? JPEG?
• Use multiple email addresses for different purposes
  – On what occasions?
• Use content-filtering software
  – black list spam filter
  – white list spam filter
  – challenge response using graphical challenges?
Overview of Markup Languages
• SGML is a rich meta language that is useful for defining markup languages
• HTML is particularly useful for displaying Web pages
• XML defines data structures for electronic commerce (and much more …)

Development of Markup Languages
http://www.w3.org/

More Markup Languages …
• Accounting XML
• Address XML
• Advertising XML
• Astronomy XML
• Building XML
• Chemistry XML
• Computing Environment XML
• Construction XML
• Content Syndication XML
• Customer Information XML
• Education XML
• Electronic Data Interchange (EDI) XML
• Finance XML
• Food XML
• Geospatial XML
• Government XML
• Healthcare XML
• Human XML
• Human Resources XML
• Instruments XML
• Insurance XML
• Legal XML
• Localization XML
• Manufacturing XML
• Math XML
• News XML
• Open Office XML
• Photo XML
• Physics XML
• Publishing XML
• Real Estate XML
• Telecommunications XML
• Topic maps XML
• Trade XML
• Translation XML
• Travel XML
• Universal Business Language (UBL)
• Universal Data Element Framework (UDEF)

Standard Generalized Markup Language
• ISO adopted SGML as a standard in 1986
• SGML is nonproprietary and platform-independent
• SGML supports user-defined tags and architecture to complement the required richness of documents

Hypertext Markup Language
• Tim Berners-Lee invented HTML
• HTML is a document production language that includes a set of tags that define the format and style of a document
• HTML is based on SGML
• HTML is an instance of one particular SGML document type
  – Document Type Definition (DTD)

HTML Tags
• An HTML document contains both document content and tags
• The tags are the HTML codes inserted in a document to specify the format on screen
• Each tag is enclosed in angle brackets (< >)
• Most tags are two-sided – opening and closing tags
• Well formed tags, bots, meta tags? Why are they important?
HTML Links
• Hyperlinks are bits of text that connect the current document to:
  – Another location in the same document
  – Another document on the same host machine
  – Another document on the Internet
  – Can they link to a toaster at home?
• Hyperlinks are created using the HTML anchor tag
• Two popular link structures:
  – Linear hyperlink structure
  – Hierarchical hyperlink structure

HTML Editors (1)
• Low end editors display HTML code on the screen and allow you to insert HTML tag pairs by clicking selected buttons
• High end editors are Web site builder programs; they provide a rich environment that displays the Web page, not the HTML code
• Macromedia Dreamweaver is an example of a Web site builder

HTML Editors (2)

Extensible Markup Language
• XML is a descendant of SGML (subset)
• XML allows designers to easily describe and deliver structured data from any application in a standard, consistent way
• XML can be embedded within an HTML document
• XML allows you to create your own customized markup language.

Learn XML in a slide
• Tag – a piece of markup
  – An opening tag: <name>
  – A closing tag: </name>
• Element – well formed usage of tags
  – <name>Alexiei</name>
• Attribute – properties
  – <name length="7">Alexiei</name>
• Rules to keep XML well formed
  1. Can be nested but not overlapping
  2. Case sensitivity
  3. Quoted attributes
  4. Required end tag
• Short hand
  – <abc></abc> is equivalent to <abc/>

Which is valid XML?
1. <book pages=100>E-Commerce</book>
2. <book pages="100"><title>E-Commerce</title></book>
3. <book>E-Commerce</booK>
4. <book pages="100">
     <title>E-Commerce</title>
     <author>
       <name>Gary</name>
       <surname>Schneider</surname>
     </author>
   </book>
5. <book pages="100"><title>E-Commerce</book></title>

Answers
1. <book pages=100>E-Commerce</book>
   – Not well formed: the attribute value is not quoted
2. <book pages="100"><title>E-Commerce</title></book>
   – Well formed
3. <book>E-Commerce</booK>
   – Not well formed: tags are case sensitive, so </booK> does not close <book>
4. <book pages="100"> … <author> … </author> </book> (the nested example)
   – Well formed
5. <book pages="100"><title>E-Commerce</book></title>
   – Not well formed: the tags overlap instead of nesting

XML exercise
• Create an XML ID card having
  – ID number, which must be an attribute of the top element
  – And using the following tags
    • Name
    • Surname
    • DOB
    • Address, subdivided by
      – House Number
      – House Name
      – Street
      – Locality

Processing a Request for an XML Page
• Why go through all this hassle?
• How would you go about displaying HTML on a
  – PC
  – Handheld
  – Mobile

HTML + XML Version History
• HTML version 1.0 was introduced in 1991
• HTML 2.0 was released in Sept. 1995
• HTML 3.2 was introduced in 1997
• HTML 4.0 was released by W3C in Dec 1997
• HTML 4.01 was released in Dec 1999
• XHTML 1.0 became a W3C recommendation in Jan 2000

Web Clients and Servers
• Your PC is a Web client in a worldwide client/server network
• Web software is platform-neutral
• Computers that are connected to the Internet and contain documents made publicly available are called Web servers

What's happening to web software?

Web Client/Server Architecture
• Client/server architecture may be used on LANs, WANs, and the Web
• The server's workload is heavy
• Servers therefore need to be high-end computers with lots of disk capacity, fault-tolerant processors, and ample memory
• The term thin client describes a client's relatively low workload, compared with that of a server
  – Where's the workload done?
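Returning to the "Which is valid XML?" slide earlier in this part of the deck: the five candidates can also be checked mechanically. The sketch below is one possible way to do it, using the XML parser in Python's standard library; the helper name `is_well_formed` is an assumption for illustration, and the slide's typographic quotes are written as plain double quotes, since an XML parser only accepts those.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


candidates = [
    '<book pages=100>E-Commerce</book>',                   # unquoted attribute
    '<book pages="100"><title>E-Commerce</title></book>',
    '<book>E-Commerce</booK>',                             # case mismatch
    '<book pages="100"><title>E-Commerce</title>'
    '<author><name>Gary</name><surname>Schneider</surname></author></book>',
    '<book pages="100"><title>E-Commerce</book></title>',  # overlapping tags
]

for xml_text in candidates:
    print(is_well_formed(xml_text), xml_text)
```

Note that this only tests well-formedness (the four rules on the "Learn XML in a slide" slide); checking a document against a DTD or schema is a separate step that this sketch does not attempt.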
Two-Tier Client/Server
• A two-tier architecture is one in which only a client (tier 1) and a server (tier 2) are involved in the requests and the responses that flow between them over the Internet
• A typical request message from a client to a server consists of three major parts:
  – A request line
  – Optional request headers
  – An optional entity body

Example message …

Three-Tier Client/Server
• A three-tier architecture builds on the traditional two-tier approach
• The first tier is the client, the second tier is the Web server, and the third tier consists of applications and their databases
• The Common Gateway Interface (CGI) is a protocol which allows Web servers to interact dynamically with clients

Web-Site Types
• Development sites
• Intranets
• Extranets
• Transaction-processing sites
• Content-delivery sites

Intranets
• An intranet is a Web-based private network that hosts Internet applications on a LAN
• Intranets are an extremely popular and low-cost way to distribute corporate information
• The intranet infrastructure includes a TCP/IP network, Web authoring software, Web server hardware and software, Web clients, and a firewall server

Intranet Benefits
• Increased, less expensive, environmentally friendly internal communication
• Low acquisition and deployment costs
• Low maintenance costs
• Increased information accessibility
• Timely, current information availability
• Easy information publication, distribution, and training

Extranets
• Extranets connect companies with suppliers or other business partners
• An extranet can be: a public network, a secure (private) network, or a virtual private network (VPN)
• Extranets provide the private infrastructure for companies to coordinate their purchases and communications with one another

Extranets (Cont.)
• A public network extranet exists when an organization allows the public to access its intranet from any public network
• A private network is a private, leased-line connection between two companies that physically connects their intranets to one another
• A VPN extranet is a network that uses public networks and their protocols to send sensitive data to partners, customers, suppliers, and employees using a system called "tunneling"

Virtual Private Network (VPN)
• Extranet that uses public networks and their protocols
• IP tunneling
  – Effectively creates a private passageway through the public Internet
• Encapsulation
  – Process used by VPN software
• VPN software
  – Must be installed on the computers at both ends of the transmission

VPN Architecture Example

Internet Connectivity
• Internet Service Providers (ISPs) provide Internet access services to other businesses.
• Ways to connect to an ISP:

WiMax
• Covers a much wider range than WiFi

Internet 2
• Future of Internet
• Consortium of 220 members
• High speed network
• 1999 – Capacity of 2.5 Gbit/s
• 2003 – Capacity of 10 Gbit/s
• 2006 – Speed record of 9.08 Gbit/s over 30,000 km for 5 hours

Web X.0
• It was thought that the next web would be the Semantic Web; instead Web 2.0 came along …
• But is it really a new Web or just hype?

Semantic Web (SW)
• Problem of information overload
  – A lot of data on the Internet
  – Too much, much more than a human can process
• Idea
  – Make automated agents process the data for people
    • They're fast,
    • don't get tired and
    • don't get bored
• How to achieve it?
  – Create the SW, a web understandable by both humans and machines

SW Example

SW stages (cake)

Questions?

Questions
• Compare the two-tier and three-tier architectures. Which do you think is the likely candidate for an e-commerce site?
• Design it and explain how it would work.