EMTM 553: E-commerce Systems
Lecture 2: The Internet and the Web
Insup Lee
Department of Computer and Information Science
University of Pennsylvania
lee@cis.upenn.edu
www.cis.upenn.edu/~lee
7/26/2016 EMTM

Computer Networks

Development of the Internet
• A network of networks, or an “inter-network”
• ARPANET in the 1960s
• 1980s: NSFNET connects universities and research labs
• 1991: NSF allows commercial traffic onto the Internet
• 1995: Internet Service Providers (ISPs), companies that provide connections to the Internet and charge a fee for them

Uses of the Internet
• E-mail – to send messages to one or many recipients across the Internet
• File Transfer Protocol (FTP) – to transfer data files from one computer to another
• Telnet – to log on remotely to another computer
• World Wide Web (WWW) – to access information using a common interface
• Videoconferencing – to use video across the Internet for conferencing purposes
• Multimedia – to use video, audio, and animations across the Internet

Design Principles of the Internet
• Interoperability
  – Independent implementations of Internet protocols actually work together.
• Layering and Simplicity
  – IP itself is simple.
  – Below IP, IP hides the complexity of many different kinds of network hardware.
  – Above IP, higher-level protocols offer service abstractions that are easy for application programs to use and understand.
• Uniform naming and addressing
  – IP address: 32-bit (e.g., 123.45.67.0 in dotted quad form)
• End-to-end protocols
  – The network needs to know only the destination address for delivering a packet.
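
Example: dotted quad form and the 32-bit address

The dotted quad notation mentioned under uniform addressing can be illustrated with a short sketch. It is not part of the original lecture; the function names `dotted_quad_to_int` and `int_to_dotted_quad` are hypothetical, and the sample address is the one the lecture uses for red.seas.upenn.edu. Each of the four decimal fields is one byte (0 to 255) of the 32-bit address.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Pack a dotted-quad IPv4 string into its 32-bit integer value."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four fields: {address!r}")
    value = 0
    for part in parts:
        octet = int(part)  # raises ValueError for non-numeric fields
        if not 0 <= octet <= 255:
            raise ValueError(f"field out of range 0-255: {part!r}")
        # Shift the bytes collected so far and append the next one.
        value = (value << 8) | octet
    return value


def int_to_dotted_quad(value: int) -> str:
    """Unpack a 32-bit integer back into dotted-quad notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    addr = "158.130.64.176"
    packed = dotted_quad_to_int(addr)
    # Cross-check against the standard library's own IPv4 parser.
    assert packed == int(ipaddress.IPv4Address(addr))
    print(f"{addr} -> {packed:#010x} -> {int_to_dotted_quad(packed)}")
```

Because every field must fit in one byte, a string with a field above 255 is rejected rather than silently packed.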

Layering of Internet Protocols
(layers listed from top to bottom)
Application (Email, Web Browser)
End-to-End Protocol (TCP/UDP)
Host-to-Host Protocol (IP)
Physical Layer

Internet protocols
• Host computers and routers
• Computers form networks
• Routers connect networks
• Each host has a unique address
• Internet Protocol (IP)
  – IP addressing
  – IP datagram
  – Ports: (IP address, port number)
• Transmission Control Protocol (TCP)
• User Datagram Protocol (UDP)
• Domain Name System (DNS)

Physical Layer
• No single technology:
  – Ethernet, token ring, FDDI (Fiber Distributed Data Interface), ATM (Asynchronous Transfer Mode), etc.
• ARP (Address Resolution Protocol) translates IP addresses (32-bit) into Ethernet addresses (48-bit)
• Routing protocols

Internet Protocol (IP)
• Protocol that supports the interconnection of multiple networking technologies into a single, logical inter-network.
• IP specifies the format of packets, also called datagrams.
• IP provides the addressing scheme: global, unique, hierarchical (e.g., 158.130.64.176).

Packet-Switched Networks
• The Internet uses packet switching
  – Files and messages are broken down into packets, which are electronically labeled with their origin and destination
  – Each computer (router) the packet encounters decides the best route toward its destination
  – The destination computer collects the packets and reassembles the data from the pieces in each packet

Packet-Switched Network and Message Packets
[Figure] Source: Schneider and Perry

TCP: Transmission Control Protocol
• End-to-end protocol on top of IP
• TCP provides reliable, in-order delivery of packets using acknowledgements, checksums, and sequence numbers.
• Flow control, congestion control
• Suitable for file transfer, email

UDP: User Datagram Protocol
• End-to-end protocol on top of IP.
• It multiplexes and demultiplexes datagrams by port number and detects corrupted datagrams with a checksum; it does not guarantee delivery or ordering.
• UDP is used for small or real-time packet delivery, such as voice data.

Domain Name System (DNS)
• IP addresses are difficult for humans to remember.
• DNS maps readable host names to IP addresses.
• For example: red.seas.upenn.edu (158.130.64.176)

edu
  cmu
  upenn
    cis
      clapton
    seas
      red
      blue
  rice
  washington

Other Internet Protocols
• Hypertext Transfer Protocol (HTTP)
  – Responsible for transferring Web pages between servers and browsers
• Simple Mail Transfer Protocol (SMTP)
  – Specifies how mail messages are transferred between mail servers
• Post Office Protocol (POP)
  – Responsible for retrieving e-mail from a mail server

Other Internet Protocols
• Internet Message Access Protocol (IMAP)
  – Newer protocol than POP, may replace it
  – Defines how a client program asks a mail server to present available mail
    o Download only selected messages, instead of all messages
    o View headers only
    o Create and manipulate mailboxes on the server

Other Internet Protocols
• File Transfer Protocol (FTP)
  – Transfers files between TCP/IP-connected computers
  – Uses client/server model
  – Transfers both binary and ASCII text
  – Displays and manipulates remote and local computer file directories

The World Wide Web
• In 1991, Tim Berners-Lee at CERN publicly released the first implementation of the WWW.
• A global hypertext network of Web servers and Web browsers connected by HTTP (Hypertext Transfer Protocol).
• The Web is a collection of “pages” located on “servers” all over the world.
• Servers store HTML (Hypertext Markup Language) files and respond to requests.
• A browser provides a point-and-click user interface to access pages in HTML.
• Each page is assigned a URL (Uniform Resource Locator), which is the page’s worldwide name.

Advantages of the Web
• A global information sharing architecture that integrates online content and information servers in an easy-to-use manner.
  – Ease of navigation and use
  – Ease of publishing content
• New distribution channel for digital goods such as software, documents, music, video, etc.
• Enables a network-centric computing paradigm.
• New business applications (e.g., auction of surplus capacity)

Markup Languages and the Web
• Standard Generalized Markup Language (SGML)
  – ISO standard since 1986
  – Nonproprietary
  – Supports user-defined tags
  – Costly to set up
  – Expensive compared to HTML
  – Steep learning curve

Markup Languages and the Web
• Hypertext Markup Language (HTML)
  – Based on SGML
  – Easier to learn and support
  – Supports commonly used text markup features
    o Headings, title bars, bullets, lines, lists
    o Precise graphic positioning, tables, and frames
  – Standard language for Web pages
• Extensible Markup Language (XML)
  – Descendant of SGML
  – Defines which data to display, instead of how a page is displayed
  – Describes a page’s actual content, unlike HTML
  – Data-tracking capability

Traditional vs.
Hyperlinked Document Pages
[Figure] Source: Schneider and Perry

HTML
• HTML tags
  – <tagname properties>Displayed information affected by tag</tagname>
    o <B>best</B> - Bolds the word “best”
    o <P align="right"> - Aligns text to the right
• HTML code defines the formatting of the page, but a page may look different in two different browsers

HTML Codes to Format Memo Page
[Figure] Source: Schneider and Perry

Internet Explorer Display of Memo Page
[Figure] Source: Schneider and Perry

More about HTML
• HTML Links
  – Anchor tags are used to link to text within the same document, or on a distant computer
    o <A HREF="address">Visible link text</A>
    o <A HREF="http://www.upenn.edu">University of Pennsylvania</A>
    o <A HREF="#references">References are found here</A>
  – Text between the anchor tags appears as a hyperlink

Hyperlink Structures
[Figure] Source: Schneider and Perry

HTML Version History
• Version 1.0 appeared in the summer of 1991
• Version 2.0 was released in September 1995
  – Internet Explorer 2.0 and Netscape Navigator 2.0 appeared
• Version 3.2 was released in 1997
  – Provided support for tables, complex numbers, and text flow around images
• Version 4.0 was released in December 1997
  – Support for OBJECT tag and Cascading Style Sheets (CSS)
  – Internationalization for various languages
  – Accessibility features

HTML Editors
• Used to generate the HTML code
  – Simple text editors offer limited flexibility
  – Any word processor can be used
  – Web site builders offer more control
    o Microsoft FrontPage
    o Dreamweaver

Web Clients and Servers
• Client computers typically request services, including printing, information retrieval, and database access
• Servers are responsible for processing the clients’ requests

Client/Server Structure of the WWW
[Figure] Source: Schneider and Perry

Web Browser
• Implements HTTP (HyperText Transfer Protocol)
  – Interacts
with servers
  – Displays web pages
  – Caching, freshness control
• Page rendering, font mapping
• Compression, decompression
• Handles multimedia, supports plug-ins
• Interprets scripts
• Executes Java applets
• Maintains cache, history
• Manipulates cookies

Web Browser
HTTP: HyperText Transfer Protocol
URL: Uniform Resource Locator
  Request/reply pages
  Ex. www.google.com
HTML: HyperText Markup Language
  Ex. Graphic, colorful page

Web Server Software
• Capabilities/Features
  – Support HTTP protocol
  – Support Secure Sockets Layer (SSL)
  – File Transfer Protocol (FTP)
  – Search engines and indexing
  – Data analysis

Uniform Resource Locators (URL)
• http://www.w3.org/example/index.html
• Protocol designator (http:)
• Server name (servername.domain)
  – DNS name
  – The browser uses DNS to translate the name into an IP network address
• Pathname (/path/name/of/object.html)

Message Flow Between Web Client and Server
[Figure] Source: Schneider and Perry

Hypertext Transfer Protocol (HTTP)
• The standard Web transfer protocol.
• With HTTP, a client opens a TCP connection to the web server.
• Two types of messages
  – Requests from browsers to servers; e.g.,
    o GET to retrieve a document
    o PUT to upload files to the server
    o POST to send the results of a form filled out by the user
  – Responses from servers to browsers

Basic Web application architecture
[Diagram labels: Web Browser, requests, downloads, Web Server, Programs, Database]

Two-Tier Client/Server
[Diagram: Tier 1 and Tier 2 joined by a Network; diagram label: DMS]
• Tier 1: Client
  – User Interface
• Tier 2: Server
  – Web Server
  – Web Applications
  – Database Server
  – Application Server

Three-Tier Client/Server
[Diagram: Tier 1, Tier 2, and Tier 3 joined by Networks; diagram label: DMS]
• Tier 1: Client
  – User Interface
• Tier 2: Server
  – Web Server
  – Web Applications
  – Application Server
• Tier 3: Database Server

N-Tier Client/Server
[Diagram: Tier 1, Tier 2, Tier 3, ..., Tier N; diagram label: DMS]
• Client
• Web Server
• Application Server
• Database Server

Q&A
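
Example: from a URL to an HTTP GET request

The URL parts and the GET request described in the HTTP slides can be sketched in a few lines. This is an illustration under stated assumptions, not part of the original lecture: the helper names `split_url` and `build_get_request` are hypothetical, the request uses HTTP/1.1 with a `Host` header (details the slides do not cover), and the sketch only builds the request text without opening the TCP connection.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple[str, str, str]:
    """Return (protocol designator, server name, pathname) for a URL."""
    parts = urlsplit(url)
    # An empty path means the root object on the server.
    return parts.scheme, parts.hostname or "", parts.path or "/"


def build_get_request(url: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request for the URL."""
    _, server, path = split_url(url)
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, pathname, version
        f"Host: {server}",       # HTTP/1.1 requires the Host header
        "Connection: close",     # ask the server to close the connection
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    url = "http://www.w3.org/example/index.html"  # URL from the slides
    print(split_url(url))
    print(build_get_request(url))
```

A browser would first resolve the server name through DNS, open a TCP connection to that address, send this text, and then read the server's response on the same connection.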