Origins of the Internet

- The Internet started as a research project sponsored by the Advanced Research Projects Agency (ARPA) within the U.S. Dept. of Defense in the late 1960s.
- Called ARPAnet, this distributed network was conceived to support university and military research.
- In the 1980s the National Science Foundation (NSF) used the same technology to create NSFnet, which was intended to support education and research.
- In the mid-1990s the Internet was commercialized as dot-coms were created.
- Much traffic on the Internet is now Web-based and commercial in nature.

What is the Internet?

The Internet is a constellation of communicating devices supported by a common communications protocol (TCP/IP), offering the following capabilities:
- SMTP: Simple Mail Transfer Protocol (aka “Email”)
- telnet: the ability to connect to a remote host and interact as if one were onsite
- FTP: File Transfer Protocol, the ability to connect to a remote host and upload/download a file

On the Internet, packets of data are routed from place to place. Data may include:
- Email
- Web pages
- Graphics and multimedia
- Chats, conferences
- Video and voice
- Other types of files and data

Some characteristics of the Internet

- The Internet is agnostic about the data it carries.
- Today, the Internet is being used for all kinds of things not envisioned by its founders, from telephone calls to e-commerce.
- In Q301, we will learn about many of the uses of the Internet, and become prepared to use, develop and shape many of the forms of communication taking place.

Hostnames and the Domain Name System (DNS)

- Every host (aka system or computer) on the Internet requires a unique identification number, a numeric address.
- The Internet requires this number to get data from a source to a destination address.
- The number is built up of 4 numbers, each between 0 and 255 (i.e., 4 bytes total).
- For every hostname, such as sislt.missouri.edu, there is a numeric equivalent (128.206.171.229).
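To make the "4 numbers, each between 0 and 255" point concrete, here is a small Python sketch using the standard ipaddress module. The helper name describe_address is made up for this illustration; the address is the one quoted above for sislt.missouri.edu.

```python
import ipaddress


def describe_address(dotted):
    """Split a dotted IPv4 address into its 4 bytes and its 32-bit value.

    describe_address is a hypothetical helper written for this example.
    """
    # IPv4Address raises ValueError if any part falls outside 0-255.
    addr = ipaddress.IPv4Address(dotted)
    return {
        "octets": list(addr.packed),  # the four numbers, one byte each
        "as_integer": int(addr),      # the same 4 bytes read as one number
    }


info = describe_address("128.206.171.229")
print(info["octets"])      # [128, 206, 171, 229]
print(info["as_integer"])
```

An address such as 256.1.1.1 is rejected with a ValueError, because 256 does not fit in one byte.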
Domain Name System (DNS)

The Domain Name System (DNS) and associated sub-systems on the Internet are used to:
- Find out which server handles addressing for a particular domain (like .com or .unc.edu)
- Contact that server
- Do a DNS lookup to find out the numeric address associated with the hostname

The Internet routing structure then worries about how to get the data from one location to the other, using a series of "routes" (which are maintained by routers and other networking equipment).

DNS (cont’d):

- Theoretically, data can take different routes between a source and a destination. In practice, there is usually one route that is most likely between two hosts.
- Top-level domains include .com, .gov, .edu and .net, which are mostly used for US-based organizations.
- .us, .ca, .jp and others are used for specific countries.
- New top-level domains include .biz, .coop and .info.

Internet Protocols: TCP and IP

- Protocols are rules for communication. The Internet works because everyone agrees on such rules.
- Many protocols are used on the Internet to make sure data get from one place to another.
- For example, the Hypertext Transfer Protocol (HTTP) governs how Web clients (browsers) talk with Web servers.
- The defining protocol for the Internet is the Internet Protocol (IP).

IP: the Internet Protocol

- The Internet is the network of IP networks. (This is the minimal definition of the Internet.)
- IP is a protocol that manages getting data from one place to the other on the Internet.
- IP doesn't do any quality control or error handling for the data it carries (IPv4 checksums only its own packet header); it does not guarantee that the payload arrives, or arrives intact.
- TCP does quality control, packet reassembly, and other things to make sure the data are usable.

Hostnames and the DNS

For the most part, the DNS operates behind the scenes. We don't need to know an IP address for a particular hostname, and don't need to worry about how the data get from one place to another.
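As an illustration of the lookup that normally happens behind the scenes, here is a minimal Python sketch that asks the operating system's resolver for the numeric addresses of a hostname. The function name resolve is an assumption made for this example; socket.getaddrinfo is the standard library call doing the work.

```python
import socket


def resolve(hostname):
    """Return the numeric addresses the local resolver reports for hostname.

    resolve is a hypothetical helper; it simply wraps socket.getaddrinfo.
    """
    # Restricting to TCP avoids getting one duplicate entry per socket type.
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        address = sockaddr[0]  # first item of sockaddr is the numeric address
        if address not in addresses:
            addresses.append(address)
    return addresses
```

On a networked machine, calling resolve("www.state.gov") should return one or more numeric addresses. The exact values depend on the DNS at the time of the call, so none are shown here.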
For experts, though, it's good to be able to get behind the scenes to diagnose problems or errors. Some utilities available include ping, traceroute, nslookup and whois. These are available for Unix/Linux and Windows machines, among others.

Usage (samples are using Unix, but ping and nslookup are available in the DOS/Command-line interface with Windows)

1. whois missouri.edu
   gets basic information about the DNS servers for missouri.edu
2. whois -h whois.networksolutions.com missouri.edu
   gets more detailed information, including the administrative, technical, billing and zone contacts. There are other registrars than networksolutions.com, but they were one of the first. whois can help you find out about the human organization on the other side of a hostname.
3. ping www.yahoo.com
   sends an ICMP packet over IP to check whether a host is alive, and how quickly a response is received. On Sun systems (including the SILS server), use

Usage (cont’d):

4. nslookup www.state.gov
   gets the IP address(es) associated with the hostname www.state.gov. Some sites have more than one IP address for redundancy or load handling (try www.whitehouse.gov or www.yahoo.com).
5. traceroute www.google.com (^C to cancel)
   traces the route your packets take to get to www.google.com. This can be thrown off by firewalls or other systems that block some types of packets.

The World Wide Web is

- A software application, most often running on the Internet (but not required to be), using a client-server protocol for communications.
- When run on the Internet, the Web travels over TCP/IP, the same foundation that supports SMTP, telnet, and FTP.

What makes the Web Unique?

Hyperlinks!!!
- Hyperlinks (the ability to move from one source to another in a webbed environment) are the primary reason why the Web is so popular (and navigable).
- Hyperlinks can point at resources served by different Internet services over TCP/IP (the so-called “Internet protocol”): most often Web pages, but also FTP file transfers and telnet sessions.
- Hyperlinks are examples of “associative trails” (Vannevar Bush 1945).

So what is “client-server”?

- Client-server is a software design that supports connectivity and functionality between users (running “client software applications”) and hosts, or servers, running server-side software.
- Sometimes we call this design philosophy an “Open Systems” design, since it supports multiple hardware/software platforms.

The Web (or “WWW”) enables these functions:
- Text and graphic presentation to the end user
- Hyperlinks to related materials created by a web page author

What is HTML?

- The authoring language of the Web is currently HTML, which stands for HyperText Markup Language.
- Future versions of the Web are likely to be based on XML, the eXtensible Markup Language.

By itself, HTML does not support such things as:
- Sound, motion, video
- User interactions
- Counters and market information

But these applications can be accomplished through external programming tools that run “under” HTML.
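To show what HTML markup and its hyperlinks look like in practice, here is a short Python sketch that reads a tiny page and lists the links in it, roughly what a Web client does before following one. The page text, the URLs in it, and the class name LinkCollector are all invented for this illustration; HTMLParser comes from Python's standard library.

```python
from html.parser import HTMLParser

# Hypothetical page, written only for this example.
PAGE = """
<html>
  <body>
    <h1>Course links</h1>
    <p>Read the <a href="http://www.example.org/syllabus.html">syllabus</a>
       or fetch <a href="ftp://ftp.example.org/notes.txt">the notes</a>.</p>
  </body>
</html>
"""


class LinkCollector(HTMLParser):
    """Collect the target of every hyperlink (<a href="...">) in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # A hyperlink is an <a> tag whose href attribute names the target.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)
```

The first part of each link (http or ftp here) names the service the client would use to fetch the target.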