Web Servers
Web Protocols and Practice: Chapter 4
Presented by Chris Janneck (“CJ”)

What’s Coming Up
• This presentation will cover many aspects of Web Servers and Service
• Introduction of Servers and Sites
• Steps in Handling a Client Request
• Sharing and Saving data across Clients
• Server Architectures
• Real-World Servers and Hosting
• Case Study: The Apache Server

Web Site or Server: Which is it?
• A web server is the software which serves the site(s) on the Internet
• Servers handle the HTTP requests, call any necessary scripts or dynamic material, etc.
• A web site is a collection of pages, seen by users on the Internet
• Types of sites can include web searches, e-commerce sites, personal pages, etc.

Method of Handling Requests
1. Read and parse HTTP request message
   – Examine header, operations needed, etc.
2. Translate URL to file name
   – URL != PATH
3. Determine authorization for request
   – Does client have permission?
4. Generate and transmit response
   – Provide content, or error message if suitable

Setting the Web’s “Permissions”
• Divided into two parts:
  – Authentication: determines if the client request comes from a valid user, and which one
    • Usually a username and password
    • Necessary in establishing a session with the server
  – Authorization: determines if the user has permission to access/use the requested resource
    • Can be established on a directory or file basis

Dynamic Responses
• The web and web servers are not limited to “static” HTML pages
• This differentiates the Web from prior file transfer methods (e.g. FTP)
• Usually, dynamic content takes a larger toll on the server than similar “static” pages

Methods for Dynamic Response
• Server-Side Includes
  – Can include macros (directives)
    • Given in file, invisible to the user
    • E.g. <!--#echo var="LAST_MODIFIED" -->
  – To alleviate unnecessary load in parsing for includes, pages which use them are given specific URL extensions
    • .shtml
    • .php – Personal Home Page
    • .asp – Microsoft Active Server Page

Methods for Dynamic Response
• Server Scripts
  – New process invoked by server
    • CGI (Common Gateway Interface)
  – Module within Web server
    • Saves overhead of new processes
    • NSAPI – Netscape Server API
    • ISAPI – Microsoft’s Internet Server API
  – External persistent process which the server contacts
    • Databases, etc.
    • FastCGI

Cautions in Dynamic Content
• Security cautions
  – Local scripts may access data otherwise blocked from the outside
• Performance hit on the server
  – Dynamic content forces the server to generate or retrieve data internally, instead of simply transmitting a static page

“C” is for Cookie
• Cookies are simple text files placed on the client host by the server
• They allow the server to retain state in the “stateless” world of HTTP
  – Retain a user’s settings in a dynamic environment
  – May contain preferences, contact info, items in an online “shopping cart”
• Security concerns with Cookies

It’s Nice to Share
• Servers can also share responses to requests among similar (or identical) requests
  – “server-side” caching
  – Storing content in memory decreases the time needed to fetch/generate responses
• Or, the server can share data involved in parsing through an HTTP request
  – The translation process, control information, etc.

Server Architectures
• A server handles requests continually from different clients across the internet
• Like multiprocessing, there are 3 main approaches to this problem
  – One way, the other way, or the middle way

One way: Event-Driven
• Simplest idea: handle one request at a time
  – Once the first request is handled, begin the next
• Can tweak this to begin working on another request if the first one is idle, waiting for a resource, disk transaction, etc.
• Works well for small, bounded sites
• Problems arise when waiting for resources from other sites, or generating highly dynamic pages

Other way: Process-Driven
• Another common computing idea is to allocate a new process to each request
• Each process completes all necessary steps for the request, then terminates
• The entire server is not hampered if a single process encounters a long wait
• Caution: Must be wary of memory leaks and computational overhead of forking

The Middle Way: Hybrid
• A Hybrid architecture is simply one which utilizes some aspects of Event-driven and Process-driven styles
• E.g. A process-driven style which allows each process to handle more than one request simultaneously
• Multithreaded servers are another example
• Are susceptible to downsides of both styles

What really happens on Servers
• For most sites, a machine does not need to be dedicated to web hosting
• It is possible, and often done, to host multiple sites on a single machine
• Can have multiple sites within the same domain name
  – http://www.me.com/two and …/three
• Can install multiple domains on the same machine

And then, there was Yahoo!
• Larger, more popular sites will suffer harshly when hosted on a single machine
• These sites can have their content hosted on many machines to help with traffic
• Often called “mirror” sites
• Mirrors can replicate the whole site, or large files (video, audio, hi-res pics, etc.)
• Make sure data is up-to-date on all machines

Case study of “A Patchy” Server
• Architecture: Process-driven
  – Parent process pre-forks processes when started, can place limits on number allowed
• Resources: A “resource pool” metaphor
  – Each process adds, removes, and uses the “pool” through the API

Apache Handling (This side up)
1. Convert the URL into a file name
2. Determine if request has permission to access file
3. Identify and invoke a handler to generate response
4. Transmit response to client
5. Log the request

Did You Get All That?
• Difference between a Server and a Site?
• Process in handling a client?
• Sharing data and caching?
• This way or that? (Event or Process-driven)
• How to get the most from your machine?
• Do I use Apache server or not?

Questions? Thank you!
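
Appendix: Handling a Request in Code

The four steps from “Method of Handling Requests” (parse the request, translate the URL to a file name, check authorization, generate the response) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Apache or any real server does it: the document root, the protected directory, the `index.html` default, and every function name are hypothetical, and an in-memory dict stands in for the disk. A missing user gets a plain 403 here, whereas a real server would normally challenge with a 401.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOC_ROOT = "/var/www/html"                 # hypothetical document root
PROTECTED_DIRS = (DOC_ROOT + "/private",)  # hypothetical per-directory rule


def parse_request(raw):
    """Step 1: read and parse the HTTP request message."""
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Raises ValueError if the request line is not "METHOD target VERSION".
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


def translate_url(target, doc_root=DOC_ROOT):
    """Step 2: translate the URL to a file name (URL != PATH)."""
    url_path = unquote(urlsplit(target).path)
    parts = []
    for part in url_path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if parts:
                parts.pop()      # never climb above the document root
            continue
        parts.append(part)
    if url_path.endswith("/") or not parts:
        parts.append("index.html")   # assumed default document
    return posixpath.join(doc_root, *parts)


def is_authorized(file_name, user):
    """Step 3: does the client have permission? (directory basis)"""
    for directory in PROTECTED_DIRS:
        if file_name == directory or file_name.startswith(directory + "/"):
            return user is not None  # any authenticated user may enter
    return True


def build_response(status, reason, body):
    head = (f"HTTP/1.1 {status} {reason}\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n")
    return head.encode("ascii") + body


def handle_request(raw, files, user=None):
    """Step 4: generate the response: content, or an error message."""
    try:
        method, target, _version, _headers = parse_request(raw)
    except ValueError:
        return build_response(400, "Bad Request", b"bad request\n")
    if method != "GET":
        return build_response(405, "Method Not Allowed", b"GET only\n")
    file_name = translate_url(target)
    if not is_authorized(file_name, user):
        return build_response(403, "Forbidden", b"forbidden\n")
    body = files.get(file_name)      # dict lookup stands in for a disk read
    if body is None:
        return build_response(404, "Not Found", b"not found\n")
    return build_response(200, "OK", body)


if __name__ == "__main__":
    site = {DOC_ROOT + "/index.html": b"<h1>Hello</h1>\n"}
    print(handle_request("GET / HTTP/1.1\r\nHost: example.org\r\n\r\n", site))
```

Each function maps to one numbered step, so the flow in `handle_request` reads top to bottom like the slide. Authentication (working out who `user` is) is left outside the sketch and passed in as a parameter.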