Web Servers: Implementation and Performance
Erich Nahum
IBM T.J. Watson Research Center
www.research.ibm.com/people/n/nahum
nahum@us.ibm.com
Contents and Timeline:
• Introduction to the Web (30 min):
– HTTP, Clients, Servers, Proxies, DNS, CDN’s
• Outline of a Web Server Transaction (25 min):
– Receiving a request, generating a response
• Web Server Architectural Models (20 min):
– Processes, threads, events
• Web Server Workload Characteristics (30 min):
– File sizes, document popularity, embedded objects
• Web Server Workload Generation (20 min):
– Webstone, SpecWeb, TPC-W
Things Not Covered in Tutorial
• Client-side issues: HTML rendering, Javascript
interpretation
• TCP issues: implementation, interaction with HTTP
• Proxies: some similarities, many differences
• Dynamic Content: CGI, PHP, EJB, ASP, etc.
• QoS for Web Servers
• SSL/TLS and HTTPS
• Content Distribution Networks (CDN’s)
• Security and Denial of Service
Assumptions and Expectations
• Some familiarity with WWW as a user
(Has anyone here not used a browser?)
• Some familiarity with networking concepts
(e.g., unreliability, reordering, race conditions)
• Familiarity with systems programming
(e.g., know what sockets, hashing, caching are)
• Examples will be based on C & Unix
taken from BSD, Linux, AIX, and real servers
(sorry, Java and Windows fans)
Objectives and Takeaways
After this tutorial, hopefully we will all know:
• Basics of the Web, HTTP, clients, servers, DNS
• Basics of server implementation & performance
• Pros and cons of various server architectures
• Characteristics of web server workloads
• Difficulties in workload generation
• Design loop of implement, measure, debug, and fix
Many lessons should be applicable to any networked
server, e.g., files, mail, news, DNS, LDAP, etc.
Acknowledgements
Many people contributed comments and
suggestions to this tutorial, including:
Abhishek Chandra
Mark Crovella
Suresh Chari
Peter Druschel
Jim Kurose
Balachander Krishnamurthy
Vivek Pai
Jennifer Rexford
Anees Shaikh
Srinivasan Seshan
Errors are all mine, of course.
Chapter 1: Introduction to the
World-Wide Web (WWW)
Introduction to the WWW
[Figure: a Client sends an http request to a Proxy, which forwards it to a Server; the http response flows back through the Proxy to the Client]
• HTTP: Hypertext Transfer Protocol
– Communication protocol between clients and servers
– Application layer protocol for WWW
• Client/Server model:
– Client: browser that requests, receives, displays object
– Server: receives requests and responds to them
– Proxy: intermediary that aggregates requests, responses
• Protocol consists of various operations
– Few for HTTP 1.0 (RFC 1945, 1996)
– Many more in HTTP 1.1 (RFC 2616, 1999)
How are Requests Generated?
• User clicks on something
• Uniform Resource Locator (URL):
– http://www.nytimes.com
– https://www.paymybills.com
– ftp://ftp.kernel.org
– news://news.deja.com
– telnet://gaia.cs.umass.edu
– mailto:nahum@us.ibm.com
• Different URL schemes map to different services
• Hostname is converted from a name to a 32-bit IP
address (DNS resolve)
• Connection is established to server
Most browser requests are HTTP requests.
How are DNS names resolved?
• Clients have a well-known IP address for a local
DNS name server
• Clients ask local name server for IP address
• Local name server may not know it, however!
• Name server has, in turn, a parent to ask (the
“DNS hierarchy”)
• The local name server’s job is to iteratively query
servers until the name is found, then return the IP
address to the client
• Each name server can cache names, but:
– Each name:IP mapping has a time-to-live field
– After time expires, name server must discard mapping
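As a concrete (and hedged) illustration, here is a minimal sketch of the client-side resolve step in C, using the standard getaddrinfo() call; the hostname and the IPv4-only restriction are illustrative assumptions, not part of the tutorial's examples.

/* Minimal sketch: resolve a hostname to an IPv4 address before
 * connecting. getaddrinfo() asks the local resolver, which queries
 * the configured name server (and the DNS hierarchy) as described
 * above. The hostname here is just an example. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

int main(void)
{
    struct addrinfo hints, *res;
    char ipstr[INET_ADDRSTRLEN];

    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_INET;      /* IPv4 only, for simplicity */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if (getaddrinfo("www.ipam.ucla.edu", "http", &hints, &res) != 0) {
        fprintf(stderr, "resolve failed\n");
        return 1;
    }
    inet_ntop(AF_INET,
              &((struct sockaddr_in *)res->ai_addr)->sin_addr,
              ipstr, sizeof(ipstr));
    printf("resolved to %s\n", ipstr);
    freeaddrinfo(res);
    return 0;
}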
DNS in Action
[Figure: DNS in action - myclient.watson.ibm.com asks its local name server ns.watson.ibm.com for www.ipam.ucla.edu; the local name server iteratively queries A.GTLD-SERVER.NET (the name server for .edu) and ns.ucla.edu, then returns 12.100.104.5 (TTL = 10 min); the client connects to 12.100.104.5, sends GET /index.html, and receives 200 OK index.html]
What Happens Then?
• Client downloads HTML document
– Sometimes called “container page”
– Typically in text format (ASCII)
– Contains instructions for rendering
(e.g., background color, frames)
– Links to other pages
• Many have embedded objects:
– Images: GIF, JPG (logos, banner ads)
– Usually automatically retrieved
• I.e., without user involvement
• can control sometimes
(e.g. browser options, junkbuster)
<html>
<head>
<meta
name=“Author”
content=“Erich Nahum”>
<title> Linux Web
Server Performance
</title>
</head>
<body text=“#00000”>
<img width=31
height=11
src=“ibmlogo.gif”>
<img
src=“images/new.gif”>
<h1>Hi There!</h1>
Here’s lots of cool
linux stuff!
<a href=“more.html”>
Click here</a>
for more!
</body>
</html>
sample html file
So What’s a Web Server Do?
• Respond to client requests, typically a browser
– Can be a proxy, which aggregates client requests (e.g., AOL)
– Could be search engine spider or custom (e.g., Keynote)
• May have work to do on client’s behalf:
– Is the client’s cached copy still good?
– Is client authorized to get this document?
– Is client a proxy on someone else’s behalf?
– Run an arbitrary program (e.g., stock trade)
• Hundreds or thousands of simultaneous clients
• Hard to predict how many will show up on some day
• Many requests are in progress concurrently
Server capacity planning is non-trivial.
What do HTTP Requests Look Like?
GET /images/penguin.gif HTTP/1.0
User-Agent: Mozilla/0.9.4 (Linux 2.2.19)
Host: www.kernel.org
Accept: text/html, image/gif, image/jpeg
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
Cookie: B=xh203jfsf; Y=3sdkfjej
<cr><lf>
• Messages are in ASCII (human-readable)
• Carriage-return and line-feed indicate end of headers
• Headers may communicate private information
(browser, OS, cookie information, etc.)
What Kind of Requests are there?
Called Methods:
• GET: retrieve a file (95% of requests)
• HEAD: just get meta-data (e.g., mod time)
• POST: submitting a form to a server
• PUT: store enclosed document as URI
• DELETE: remove named resource
• LINK/UNLINK: in 1.0, gone in 1.1
• TRACE: http “echo” for debugging (added in 1.1)
• CONNECT: used by proxies for tunneling (1.1)
• OPTIONS: request for server/proxy options (1.1)
What Do Responses Look Like?
HTTP/1.0 200 OK
Server: Tux 2.0
Content-Type: image/gif
Content-Length: 43
Last-Modified: Fri, 15 Apr 1994 02:36:21 GMT
Expires: Wed, 20 Feb 2002 18:54:46 GMT
Date: Mon, 12 Nov 2001 14:29:48 GMT
Cache-Control: no-cache
Pragma: no-cache
Connection: close
Set-Cookie: PA=wefj2we0-jfjf
<cr><lf>
<data follows…>
• Similar format to requests (i.e., ASCII)
What Responses are There?
• 1XX: Informational (def’d in 1.0, used in 1.1)
100 Continue, 101 Switching Protocols
• 2XX: Success
200 OK, 206 Partial Content
• 3XX: Redirection
301 Moved Permanently, 304 Not Modified
• 4XX: Client error
400 Bad Request, 403 Forbidden, 404 Not Found
• 5XX: Server error
500 Internal Server Error, 503 Service
Unavailable, 505 HTTP Version Not Supported
What are all these Headers?
Specify capabilities and properties:
• General:
Connection, Date
• Request:
Accept-Encoding, User-Agent
• Response:
Location, Server type
• Entity:
Content-Encoding, Last-Modified
• Hop-by-hop:
Proxy-Authenticate, Transfer-Encoding
Server must pay attention to respond properly.
The Role of Proxies
[Figure: clients send requests through a local proxy, across the Internet, to servers]
• Clients send requests to local proxy
• Proxy sends requests to remote servers
• Proxy can cache responses and return them
Why have a Proxy?
• For performance:
– Many of the same web documents are requested by many
different clients (“locality of reference”)
– A copy of the document can be cached for later requests
(typical document hit rate: ~ 50%)
– Since the proxy is closer to the client, response times are
smaller than from the server
• For cost savings:
– Organizations pay by ISP bandwidth used
– Cached responses don’t consume ISP bandwidth
• For security/policy:
– Typically located in “demilitarized zone” (DMZ)
– Easier to protect a single point rather than all clients
– Can enforce corporate/government policies (e.g., porn)
Proxy Placement in the Web
[Figure: proxies placed at several points in the network - near clients, inside the Internet, and as a “reverse” proxy in front of the servers]
• Proxies can be placed in arbitrary points in net:
– Can be organized into hierarchies
– Placed in front of a server: “reverse” proxy
– Route requests to specific proxies: content distribution
Content Distribution Networks
[Figure: origin servers push content across the Internet to proxies located near clients]
• Push content out to proxies:
– Route client requests to “closest” proxy
– Reduce load on origin server
– Reduce response time seen by client
Mechanisms for CDN’s
• IP Anycast:
– Route an IP packet to one-of-many IP addresses
– Some research, but not widely deployed or supported in IPv4
• TCP Redirection:
– Client TCP packets go to one machine, but responses
come from a different one
– Clunky, not clear it reduces load or response time
• HTTP Redirection:
– When client connects, use 302 response (moved temp) to
send client to proxy close to client
– Server must be aware of CDN network
• DNS Redirection:
– When client asks for server IP address, tell them based
on where they are in the network
– Used by most CDN providers (e.g., Akamai)
DNS Based Request-Routing
[Figure: DNS-based request-routing - the client asks its local nameserver for www.service.com; the local nameserver queries the request-routing DNS name server, which answers with the address of cdn 3, the CDN node closest to the client (out of cdn 1-5); the client then connects to cdn 3]
Summary: Introduction to WWW
• The major application on the Internet
– Majority of traffic is HTTP (or HTTP-related)
– Messages mostly in ASCII text (helps debugging!)
• Client/server model:
– Clients make requests, servers respond to them
– Proxies act as servers to clients, clients to servers
• Content may be spread across network
– Through either proxy caches or content distr. networks
– DNS redirection is the common approach to CDNs
• Various HTTP headers and commands
– Too many to go into detail here
– We’ll focus on common server ones
– Many web books/tutorials exist
(e.g., Krishnamurthy & Rexford 2001)
Chapter 2: Outline of a Typical
Web Server Transaction
Outline of an HTTP Transaction
• In this section we go over the
basics of servicing an HTTP GET
request from user space
• For this example, we'll assume a
single process running in user
space, similar to Apache 1.3
• At each stage see what the
costs/problems can be
• Also try to think of where costs
can be optimized
• We’ll describe relevant socket
operations as we go
initialize;
forever do {
get request;
process;
send response;
log request;
}
server in
a nutshell
Readying a Server
s = socket();              /* allocate listen socket            */
bind(s, 80);               /* bind to TCP port 80               */
listen(s);                 /* indicate willingness to accept    */
while (1) {
    newconn = accept(s);   /* accept new connection             */
• First thing a server does is notify the OS it is interested in
WWW server requests; these are typically on TCP port 80.
Other services use different ports (e.g., SSL is on 443)
• Allocate a socket and bind() it to the address (port 80)
• Server calls listen() on the socket to indicate willingness to
receive requests
• Calls accept() to wait for a request to come in (and blocks)
• When the accept() returns, we have a new socket which
represents a new connection to a client
Processing a Request
remoteIP = getsockname(newconn);
remoteHost = gethostbyname(remoteIP);
gettimeofday(currentTime);
read(newconn, reqBuffer, sizeof(reqBuffer));
reqInfo = serverParse(reqBuffer);
• getsockname() is called to get the remote host’s address
– for logging purposes (optional, but done by most)
• gethostbyname() is called to get the name of the other end
– again for logging purposes
• gettimeofday() is called to get the time of the request
– both for the Date header and for logging
• read() is called on the new socket to retrieve the request
• the request is determined by parsing the data
– “GET /images/jul4/flag.gif”
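serverParse() above is pseudocode; the following is one possible sketch of pulling the method and URI out of the request line. The struct and function names echo the pseudocode, but their signatures are assumptions, and real servers must handle far messier input.

/* Simplistic sketch of parsing the HTTP request line, e.g.
 * "GET /images/jul4/flag.gif HTTP/1.0". Real servers must cope
 * with malformed input, extra whitespace, the header lines, etc. */
#include <stdio.h>
#include <string.h>

struct reqInfo {
    char method[16];
    char uri[1024];
    char version[16];
};

/* returns 0 on success, -1 on a malformed request line */
int serverParse(const char *reqBuffer, struct reqInfo *req)
{
    if (sscanf(reqBuffer, "%15s %1023s %15s",
               req->method, req->uri, req->version) != 3)
        return -1;
    if (strcmp(req->method, "GET") != 0 &&
        strcmp(req->method, "HEAD") != 0)
        return -1;            /* only static GET/HEAD handled in this sketch */
    return 0;
}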
Processing a Request (cont)
fileName = parseOutFileName(requestBuffer);
fileAttr = stat(fileName);
serverCheckFileStuff(fileName, fileAttr);
open(fileName);
• stat() called to test the file path
– to see if file exists/is accessible
– may not be there, may only be available to certain people
– "/microsoft/top-secret/plans-for-world-domination.html"
• stat() also used for file meta-data
– e.g., size of file, last modified time
– "Have plans changed since last time I checked?"
• might have to stat() multiple files just to get to the end
– e.g., 4 stats in the "bill g" example above
• assuming all is OK, open() is called to open the file
Responding to a Request
read(fileName, fileBuffer);
headerBuffer = serverFigureHeaders(fileName, reqInfo);
write(newSock, headerBuffer);
write(newSock, fileBuffer);
close(newSock);
close(fileName);
write(logFile, requestInfo);
• read() called to read the file into user space
• write() is called to send HTTP headers on socket
  (early servers called write() for each header!)
• write() is called to write the file on the socket
• close() is called to close the socket
• close() is called to close the open file descriptor
• write() is called on the log file
Optimizing the Basic Structure
• As we will see, a great deal of locality exists in
web requests and web traffic.
• Much of the work described above doesn't really
need to be performed each time.
• Optimizations fall under 2 categories: caching and
custom OS primitives.
Optimizations: Caching
Idea is to exploit locality in client requests. Many
files are requested over and over (e.g., index.html).
• Why open and close files over and over again? Instead,
  cache open file FD’s, manage them LRU (a sketch of such a
  cache appears below):
      fileDescriptor = lookInFDCache(fileName);
• Why stat() them again and again? Cache path name and
  access characteristics:
      metaInfo = lookInMetaInfoCache(fileName);
• Again, cache HTTP header info on a per-URL basis, rather
  than re-generating it over and over:
      headerBuffer = lookInHTTPHeaderCache(fileName);
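As one possible illustration of the FD cache mentioned above, here is a small sketch of an open-file-descriptor cache with LRU replacement; the names (FD_CACHE_SIZE, fd_cache_lookup) and the linear scan are illustrative assumptions, not taken from any particular server.

/* Sketch: tiny open-FD cache with LRU replacement. A real server
 * would use a hash table; linear search keeps the idea visible. */
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#define FD_CACHE_SIZE 64

struct fd_entry {
    char     name[256];
    int      fd;          /* 0 means the slot is unused */
    unsigned last_used;
};

static struct fd_entry cache[FD_CACHE_SIZE];
static unsigned clock_tick;

int fd_cache_lookup(const char *name)
{
    int i, victim = 0;

    for (i = 0; i < FD_CACHE_SIZE; i++) {
        if (cache[i].fd > 0 && strcmp(cache[i].name, name) == 0) {
            cache[i].last_used = ++clock_tick;   /* cache hit */
            return cache[i].fd;
        }
        if (cache[i].last_used < cache[victim].last_used)
            victim = i;                          /* remember LRU slot */
    }
    /* miss: open the file and replace the least-recently-used entry */
    int fd = open(name, O_RDONLY);
    if (fd < 0)
        return -1;
    if (cache[victim].fd > 0)
        close(cache[victim].fd);
    strncpy(cache[victim].name, name, sizeof(cache[victim].name) - 1);
    cache[victim].name[sizeof(cache[victim].name) - 1] = '\0';
    cache[victim].fd = fd;
    cache[victim].last_used = ++clock_tick;
    return fd;
}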
Optimizations: Caching (cont)
• Instead of reading and writing the data, cache data,
  as well as meta-data, in user space:
      fileData = lookInFileDataCache(fileName);
• Even better, mmap() the file so that two copies don’t
  exist in both user and kernel space (see the mmap() sketch
  below):
      fileData = lookInMMapCache(fileName);
• Since we see the same clients over and over, cache
  the reverse name lookups (or better yet, don’t do
  resolves at all, log only IP addresses):
      remoteHostName = lookRemoteHostCache(remoteIP);
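A hedged sketch of the mmap() approach: map the file once and reuse the mapping for later requests. The function name and the omitted cache-insertion step are assumptions.

/* Sketch: map a file read-only and reuse the mapping for later
 * requests, instead of read()ing it into a private buffer each time. */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Map fileName; returns the mapping (and its size) or NULL.
 * A server would stash the result in its mmap cache keyed by name. */
void *map_file(const char *fileName, size_t *lenOut)
{
    struct stat st;
    int fd = open(fileName, O_RDONLY);
    if (fd < 0 || fstat(fd, &st) < 0) {
        if (fd >= 0) close(fd);
        return NULL;
    }
    void *data = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                       /* mapping stays valid after close */
    if (data == MAP_FAILED)
        return NULL;
    *lenOut = st.st_size;
    return data;                     /* later: write(sock, data, *lenOut) */
}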
Optimizations: OS Primitives
• Rather than call accept(), getsockname() & read(), add a new
  primitive, acceptExtended(), which combines the 3 primitives:
      acceptExtended(listenSock, &newSock, readBuffer, &remoteInfo);
• Instead of calling gettimeofday(), use a memory-mapped counter
  that is cheap to access (a few instructions rather than a
  system call):
      currentTime = *mappedTimePointer;
• Instead of calling write() many times, use writev():
      buffer[0] = firstHTTPHeader;
      buffer[1] = secondHTTPHeader;
      buffer[2] = fileDataBuffer;
      writev(newSock, buffer, 3);
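For reference, the real writev() interface takes an array of struct iovec rather than bare buffers; a minimal sketch (function and variable names are illustrative):

/* Sketch: send HTTP headers and file data in a single writev() call,
 * avoiding multiple write() system calls and small TCP segments. */
#include <sys/types.h>
#include <sys/uio.h>
#include <string.h>

ssize_t send_response(int newSock, const char *headerBuffer,
                      const void *fileData, size_t fileLen)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)headerBuffer;   /* pre-built HTTP headers */
    iov[0].iov_len  = strlen(headerBuffer);
    iov[1].iov_base = (void *)fileData;       /* file body (e.g., mmap'ed) */
    iov[1].iov_len  = fileLen;

    return writev(newSock, iov, 2);  /* may write less; real code loops */
}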
OS Primitives (cont)
• Rather than calling read() & write(), or write() with
  an mmap()'ed file, use a new primitive called sendfile()
  (or transmitfile()). Bytes stay in the kernel.
• While we're at it, add a header option to sendfile() so
  that we don't have to call write() at all.
• Also add an option to close the connection so that we
  don't have to call close() explicitly:
      httpInfo = cacheLookup(reqBuffer);
      sendfile(newConn, httpInfo->headers,
               httpInfo->fileDescriptor, OPT_CLOSE_WHEN_DONE);
All this assumes proper OS support. Most have it these days.
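The exact interface varies by OS: Linux's sendfile(2) takes no header or close option, while FreeBSD's accepts header/trailer iovecs. A hedged, Linux-flavored sketch with the headers sent separately (function and variable names are assumptions):

/* Sketch (Linux flavor): send pre-built headers with write(), then let
 * the kernel copy the file to the socket with sendfile(2). */
#include <sys/sendfile.h>
#include <string.h>
#include <unistd.h>

int send_static_response(int newConn, const char *headerBuffer,
                         int fileFd, off_t fileLen)
{
    off_t offset = 0;

    if (write(newConn, headerBuffer, strlen(headerBuffer)) < 0)
        return -1;
    while (offset < fileLen) {       /* file bytes never enter user space */
        ssize_t n = sendfile(newConn, fileFd, &offset, fileLen - offset);
        if (n <= 0)
            return -1;
    }
    close(newConn);                  /* no OPT_CLOSE_WHEN_DONE on Linux */
    return 0;
}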
An Accelerated Server Example
acceptex(socket, newConn, reqBuffer, remoteHostInfo);
httpInfo = cacheLookup(reqBuffer);
sendfile(newConn, httpInfo->headers,
httpInfo->fileDescriptor, OPT_CLOSE_WHEN_DONE);
write(logFile, requestInfo);
• acceptex() is called
– gets new socket, request, remote host IP address
• string match in hash table is done to parse request
– hash table entry contains relevant meta-data, including
modification times, file descriptors, permissions, etc.
• sendfile() is called
– pre-computed header, file descriptor, and close option
• log written back asynchronously (buffered write()).
That’s it!
Complications
• Much of this assumes sharing is easy:
– but, this is dependent on the server architectural model
– if multiple processes are being used, as in Apache, it is
difficult to share data structures.
• Take, for example, mmap():
– mmap() maps a file into the address space of a process.
– a file mmap'ed in one address space can’t be re-used for
a request for the same file served by another process.
– Apache 1.3 does use mmap() instead of read().
– in this case, mmap() eliminates one data copy versus a
separate read() & write() combination, but process will
still need to open() and close() the file.
Complications (cont)
• Similarly, meta-data info needs to be shared:
– e.g., file size, access permissions, last modified time, etc.
• While locality is high, cache misses can and do
happen sometimes:
– if previously unseen file requested, process can block
waiting for disk.
• OS can impose other restrictions:
– e.g., limits on number of open file descriptors.
– e.g., sockets typically allow buffering about 64 KB of data.
If a process tries to write() a 1 MB file, it will block until
other end receives the data.
• Need to be able to cope with the misses without
slowing down the hits
Summary: Outline of a Typical
HTTP Transaction
• A server can perform many steps in the process of
servicing a request
• Different actions depending on many factors:
– e.g., 304 not modified if client's cached copy is good
– e.g., 404 not found, 401 unauthorized
• Most requests are for small subset of data:
– we’ll see more about this in the Workload section
– we can leverage that fact for performance
• Architectural model affects possible optimizations
– we’ll go into this in more detail in the next section
Chapter 3:
Server Architectural Models
Server Architectural Models
Several approaches to server structure:
• Process based: Apache, NCSA
• Thread-based: JAWS, IIS
• Event-based: Flash, Zeus
• Kernel-based: Tux, AFPA, ExoKernel
We will describe the advantages and disadvantages
of each.
Fundamental tradeoffs exist between performance,
protection, sharing, robustness, extensibility, etc.
Process Model (ex: Apache)
• Process created to handle each new request:
– Process can block on appropriate actions,
(e.g., socket read, file read, socket write)
– Concurrency handled via multiple processes
• Quickly becomes unwieldy:
– Process creation is expensive.
– Instead, pre-forked pool is created.
– Upper limit on # of processes is enforced
• First by the server, eventually by the operating system.
• Concurrency is limited by upper bound
Process Model: Pros and Cons
• Advantages:
– Most importantly, consistent with programmer's way of
thinking. Most programmers think in terms of linear
series of steps to accomplish task.
– Processes are protected from one another; can't nuke
data in some other address space. Similarly, if one
crashes, others unaffected.
• Disadvantages:
– Slow. Forking is expensive, allocating stack, VM data
structures for each process adds up and puts pressure on
the memory system.
– Difficulty in sharing info across processes.
– Have to use locking.
– No control over scheduling decisions.
Thread Model (Ex: JAWS)
• Use threads instead of processes. Threads
consume fewer resources than processes (e.g.,
stack, VM allocation).
• Forking and deleting threads is cheaper than
processes.
• Similarly, pre-forked thread pool is created. May
be limits to numbers but hopefully less of an issue
than with processes since fewer resources
required.
Thread Model: Pros and Cons
• Advantages:
– Faster than processes. Creating/destroying cheaper.
– Maintains programmer's way of thinking.
– Sharing is enabled by default.
• Disadvantages:
– Less robust. Threads not protected from each other.
– Requires proper OS support, otherwise, if one thread
blocks on a file read, will block all the address space.
– Can still run out of threads if servicing many clients
concurrently.
– Can exhaust certain per-process limits not encountered
with processes (e.g., number of open file descriptors).
– Limited or no control over scheduling decisions.
Event Model (Ex: Flash)
while (1) {
accept new connections until none remaining;
call select() on all active file descriptors;
for each FD:
if (fd ready for reading) call read();
if (fd ready for writing) call write();
}
• Use a single process and deal with requests in a
event-driven manner, like a giant switchboard.
• Use non-blocking option (O_NDELAY) on sockets, do
everything asynchronously, never block on anything,
and have OS notify us when something is ready.
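The pseudo-code above compresses a lot of detail; below is a minimal select()-based skeleton showing the accept path only. Per-connection request/response state machines, error handling, and connection teardown are omitted, and the names are illustrative.

/* Skeleton of a single-process event loop using select(). Real servers
 * track per-connection state machines (reading request, writing reply). */
#include <sys/select.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

void event_loop(int listenSock)
{
    fd_set readable;
    int conns[FD_SETSIZE], nconns = 0, i, maxfd;

    for (;;) {
        FD_ZERO(&readable);
        FD_SET(listenSock, &readable);
        maxfd = listenSock;
        for (i = 0; i < nconns; i++) {
            FD_SET(conns[i], &readable);
            if (conns[i] > maxfd)
                maxfd = conns[i];
        }
        select(maxfd + 1, &readable, NULL, NULL, NULL); /* block for events */

        if (FD_ISSET(listenSock, &readable)) {          /* new connection(s) */
            int c = accept(listenSock, NULL, NULL);
            if (c >= 0 && nconns < FD_SETSIZE) {
                fcntl(c, F_SETFL, O_NONBLOCK);          /* never block on it */
                conns[nconns++] = c;
            }
        }
        for (i = 0; i < nconns; i++)
            if (FD_ISSET(conns[i], &readable)) {
                /* read() the request, parse, queue/send the response here */
            }
    }
}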
Event-Driven: Pros and Cons
• Advantages:
– Very fast.
– Sharing is inherent, since there’s only one process.
– Don’t even need locks as in thread models.
– Can maximize concurrency in request stream easily.
– No context-switch costs or extra memory consumption.
– Complete control over scheduling decisions.
• Disadvantages:
– Less robust. Failure can halt whole server.
– Pushes per-process resource limits (like file descriptors).
– Not every OS has full asynchronous I/O, so can still
block on a file read. Flash uses helper processes to deal
with this (AMPED architecture).
In-Kernel Model (Ex: Tux)
[Figure: protocol stacks of a user-space server (HTTP above the socket layer, crossing the user/kernel boundary into TCP, IP, ETH) and a kernel-space server (HTTP below the user/kernel boundary, directly alongside TCP, IP, ETH)]
• Dedicated kernel thread for HTTP requests:
– One option: put whole server in kernel.
– More likely, just deal with static GET requests in kernel to
capture majority of requests.
– Punt dynamic requests to full-scale server in user space,
such as Apache.
In-Kernel Model: Pros and Cons
• In-kernel event model:
– Avoids transitions to user space, copies across u-k boundary, etc.
– Leverages already existing asynchronous primitives in the kernel
(kernel doesn't block on a file read, etc.)
• Advantages:
– Extremely fast. Tight integration with kernel.
– Small component without full server optimizes common case.
• Disadvantages:
– Less robust. Bugs can crash whole machine, not just server.
– Harder to debug and extend, since kernel programming required,
which is not as well-known as sockets.
– Similarly, harder to deploy. APIs are OS-specific (Linux, BSD,
NT), whereas sockets & threads are (mostly) standardized.
– HTTP evolving over time, have to modify kernel code in response.
So What’s the Performance?
• Graph shows server throughput for Tux, Flash, and Apache.
• Experiments done on 400 MHz P/II, gigabit Ethernet, Linux
2.4.9-ac10, 8 client machines, WaspClient workload generator
• Tux is fastest, but Flash close behind
Summary: Server Architectures
• Many ways to code up a server
– Tradeoffs in speed, safety, robustness, ease of
programming and extensibility, etc.
• Multiple servers exist for each kind of model
– Not clear that a consensus exists.
• Better case for in-kernel servers as devices
e.g. reverse proxy accelerator, Akamai CDN node
• User-space servers have a role:
– OS should provide proper primitives for efficiency
– Leave HTTP-protocol related actions in user-space
– In this case, event-driven model is attractive
• Key pieces to a fast event-driven server:
– Minimize copying
– Efficient event notification mechanism
Chapter 5:
Workload Characterization
Workload Characterization
• Why Characterize Workloads?
– Gives an idea about traffic behavior
("Which documents are users interested in?")
– Aids in capacity planning
("Is the number of clients increasing over time?")
– Aids in implementation
("Does caching help?")
• How do we capture them ?
– Through server logs (typically enabled)
– Through packet traces (harder to obtain and to process)
Factors to Consider
[Figure: logs can be collected at the client, a proxy, or the server]
• Where do I get logs from?
– Client logs give us an idea, but not necessarily the same
– Same for proxy logs
– What we care about is the workload at the server
• Is trace representative?
– Corporate POP vs. News vs. Shopping site
• What kind of time resolution?
– e.g., second, millisecond, microsecond
• Does trace/log capture all the traffic?
– e.g., incoming link only, or one node out of a cluster
Probability Refresher
• Lots of variability in workloads
– Use probability distributions to express
– Want to consider many factors
• Some terminology/jargon:
– Mean: average of samples
– Median: half are bigger, half are smaller
– Percentiles: dump samples into N bins
  (median is 50th percentile number)
• Heavy-tailed:
– As x -> infinity, Pr[X > x] \sim c x^{-a}
Important Distributions
Some Frequently-Seen Distributions:
• Normal (mean \mu, variance \sigma^2):
      f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}
• Lognormal (x >= 0; \sigma > 0):
      f(x) = \frac{1}{x \sigma \sqrt{2\pi}} e^{-(\ln(x)-\mu)^2 / (2\sigma^2)}
• Exponential (x >= 0):
      f(x) = \lambda e^{-\lambda x}
• Pareto (x >= k, shape a, scale k):
      f(x) = a k^a / x^{a+1}
More Probability
• Graph shows 3 distributions with average = 2.
• Note: average ≠ median in some cases!
• Different distributions have different “weight” in tail.
What Info is Useful?
• Request methods
– GET, POST, HEAD, etc.
• Response codes
– success, failure, not-modified, etc.
• Size of requested files
• Size of transferred objects
• Popularity of requested files
• Numbers of embedded objects
• Inter-arrival time between requests
• Protocol support (1.0 vs. 1.1)
Sample Logs for Illustration
Chess 1997:    Kasparov-Deep Blue event site; 2 weeks in May 1997;
               1,586,667 hits; 14,171,711 bytes; 256,382 clients; 2,293 URLs
Olympics 1998: Nagano 1998 Olympics event site; 2 days in Feb 1998;
               5,800,000 hits; 10,515,507 bytes; 80,921 clients; 30,465 URLs
IBM 1998:      Corporate presence; 1 day in June 1998;
               11,485,600 hits; 54,697,108 bytes; 860,211 clients; 15,788 URLs
IBM 2001:      Corporate presence; 1 day in Feb 2001;
               12,445,739 hits; 28,804,852 bytes; 319,698 clients; 42,874 URLs
We’ll use statistics generated from these logs as examples.
Request Methods
          Chess 1997   Olympics 1998   IBM 1998   IBM 2001
GET       96%          99.6%           99.3%      97%
HEAD      04%          00.3%           00.08%     02%
POST      00.007%      00.04%          00.02%     00.2%
Others    noise        noise           noise      noise
• KR01: "overwhelming majority" are GETs, few POSTs
• IBM2001 trace starts seeing a few 1.1 methods (CONNECT,
OPTIONS, LINK), but still very small (1/10^5 %)
Response Codes
Code  Meaning             Chess 1997   Olympics 1998   IBM 1998   IBM 2001
200   OK                  85.32        76.02           75.28      67.72
204   NO_CONTENT          --.-         --.-            00.00001   --.-
206   PARTIAL_CONTENT     00.25        --.-            --.-       --.-
301   MOVED_PERMANENTLY   00.05        --.-            --.-       --.-
302   MOVED_TEMPORARILY   00.05        00.05           01.18      15.11
304   NOT_MODIFIED        13.73        23.24           22.84      16.26
400   BAD_REQUEST         00.001       00.0001         00.003     00.001
401   UNAUTHORIZED        --.-         00.001          00.0001    00.001
403   FORBIDDEN           00.01        00.02           00.01      00.009
404   NOT_FOUND           00.55        00.64           00.65      00.79
407   PROXY_AUTH          --.-         --.-            --.-       00.002
500   SERVER_ERROR        --.-         00.003          00.006     00.07
501   NOT_IMPLEMENTED     --.-         00.0001         00.0005    00.006
503   SERVICE_UNAVAIL     --.-         --.-            00.0001    00.0003
???   UNKNOWN             00.0003      00.00004        00.005     00.0004
• Table shows percentage of responses.
• Majority are OK and NOT_MODIFIED.
• Consistent with numbers from AW96, KR01.
Resource (File) Sizes
• Shows file/memory usage (not weighted by frequency!)
• Lognormal body, consistent with results from AW96, CB96, KR01.
• AW96, CB96: sizes have Pareto tail; Downey01: sizes are lognormal.
Tails from the File Size
• Shows the complementary CDF (CCDF) of file sizes.
• Haven’t done the curve fitting but looks Pareto-ish.
Response (Transfer) Sizes
• Shows network usage (weighted by frequency of requests)
• Lognormal body, pareto tail, consistent with CBC95,
AW96, CB96, KR01
Tails of Transfer Size
• Shows the complementary CDF (CCDF) of transfer sizes.
• Looks more Pareto-like; certainly some big transfers.
Resource Popularity
• Follows a Zipf model: p(r) = r^{-alpha}
  (alpha = 1 is true Zipf; others are "Zipf-like")
• Consistent with CBC95, AW96, CB96, PQ00, KR01
• Shows that caching popular documents is very effective
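As a quick, illustrative calculation of why caching works so well under Zipf-like popularity, the sketch below computes the fraction of requests absorbed by the most popular documents; the document count, alpha, and cache size are arbitrary assumptions.

/* Sketch: fraction of requests covered by caching the hottest documents,
 * if popularity follows p(r) proportional to r^(-alpha). */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const int    ndocs = 100000;   /* assumed number of distinct URLs */
    const double alpha = 1.0;      /* "true" Zipf */
    double total = 0.0, covered = 0.0;
    int r;

    for (r = 1; r <= ndocs; r++)
        total += pow(r, -alpha);
    for (r = 1; r <= 1000; r++)    /* cache the 1000 hottest documents */
        covered += pow(r, -alpha);

    printf("top 1000 of %d docs cover %.1f%% of requests\n",
           ndocs, 100.0 * covered / total);
    return 0;
}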
Number of Embedded Objects
• Mah97: avg 3, 90% are 5 or less
• BC98: pareto distr, median 0.8, mean 1.7
• Arlitt98 World Cup study: median 15 objects, 90%
are 20 or less
• MW00: median 7-17, mean 11-18, 90% 40 or less
• STA00: median 5,30 (2 traces), 90% 50 or less
• Mah97, BC98, SCJO01: embedded objects tend to
be smaller than container objects
• KR01: median is 8-20, pareto distribution
Trend seems to be that number is increasing over time.
Session Inter-Arrivals
• Inter-arrival time between successive requests
– “Think time"
– difference between user requests vs. ALL requests
– partly depends on definition of boundary
• CB96: variability across multiple timescales, "self-similarity"; average load very different from peak
or heavy load
• SCJO01: log-normal, 90% less than 1 minute.
• AW96: independent and exponentially distributed
• KR01: pareto with a=1.5, session arrivals follow
poisson distribution, but requests follow pareto
Protocol Support
• IBM.com 2001 logs:
– Show roughly 53% of client requests are 1.1
• KA01 study:
– 92% of servers claim to support 1.1 (as of Sep 00)
– Only 31% actually do; most fail to comply with spec
• SCJO01 show:
– Avg 6.5 requests per persistent connection
– 65% have 2 connections per page, rest more.
– 40-50% of objects downloaded by persistent connections
Appears that we are in the middle of a slow transition to 1.1
Summary: Workload
Characterization
• Traffic is variable:
– Responses vary across multiple orders of magnitude
• Traffic is bursty:
– Peak loads much larger than average loads
• Certain files more popular than others
– Zipf-like distribution captures this well
• Two-sided aspect of transfers:
– Most responses are small (zero pretty common)
– Most of the bytes are from large transfers
• Controversy over Pareto/log-normal distribution
• Non-trivial for workload generators to replicate
Chapter 6: Workload Generators
Why Workload Generators?
• Allows stress-testing and
bug-finding
• Gives us some idea of server
capacity
• Allows us a scientific process
to compare approaches
– e.g., server models, gigabit
adaptors, OS implementations
• Assumption is that
difference in testbed
translates to some
difference in real-world
• Allows the performance
debugging cycle
[Figure: the Performance Debugging Cycle - Measure, Find Problem, Fix and/or Improve, Reproduce, repeat]
Problems with Workload
Generators
• Only as good as our understanding of the traffic
• Traffic may change over time
– generators must too
• May not be representative
– e.g., are file size distributions from IBM.com similar to mine?
• May be ignoring important factors
– e.g., browser behavior, WAN conditions, modem connectivity
• Still, useful for diagnosing and treating problems
How does W. Generation Work?
• Many clients, one server
– match asymmetry of Internet
• Server is populated with some
kind of synthetic content
• Simulated clients produce
requests for server
• Master process to control
clients, aggregate results
• Goal is to measure server
– not the client or network
• Must be robust to conditions
– e.g., if server keeps sending 404
not found, will clients notice?
Evolution: WebStone
• The original workload generator from SGI in 1995
• Process based workload generator, implemented in C
• Clients talk to master via sockets
• Configurable: # client machines, # client processes, run time
• Measured several metrics: avg + max connect time, response
  time, throughput rate (bits/sec), # pages, # files
• 1.0 only does GETS, CGI support added in 2.0
• Static requests, 5 different file sizes:
Percentage   Size
35.00        500 B
50.00        5 KB
14.00        50 KB
0.90         500 KB
0.10         5 MB
www.mindcraft.com/webstone
Evolution: SPECWeb96
• Developed by SPEC
– Systems Performance Evaluation Consortium
– Non-profit group with many benchmarks (CPU, FS)
• Attempt to get more representative
– Based on logs from NCSA, HP, Hal Computers
• 4 classes of files:
Percentage   Size
35.00        0-1 KB
50.00        1-10 KB
14.00        10-100 KB
1.00         100 KB - 1 MB
• Poisson distribution between each class
SPECWeb96 (cont)
• Notion of scaling versus load:
– number of directories in the data set doubles as the
expected throughput quadruples: directories = sqrt(throughput/5) * 10
– requests spread evenly across all application directories
• Process based WG
• Clients talk to master via RPC's (less robust)
• Still only does GETS, no keep-alive
www.spec.org/osg/web96
Evolution: SURGE
• Scalable URL Reference GEnerator
– Barford & Crovella at Boston University CS Dept.
• Much more worried about representativeness,
captures:
– server file size distributions
– request size distribution
– relative file popularity
– embedded file references
– temporal locality of reference
– idle periods ("think times") of users
• Process/thread based WG
SURGE (cont)
• Notion of “user-equivalent”:
– statistical model of a user
– active “off” time (between URLS),
– inactive “off” time (between pages)
• Captures various levels of burstiness
• Not validated, shows that load generated is
different than SpecWeb96 and has more
burstiness in terms of CPU and # active
connections
www.cs.wisc.edu/~pb
Evolution: S-client
• Almost all workload generators are closed-loop:
– client submits a request, waits for server, maybe thinks
for some time, repeat as necessary
• Problem with the closed-loop approach:
– client can't generate requests faster than the server can
respond
– limits the generated load to the capacity of the server
– in the real world, arrivals don’t depend on server state
• i.e., real users have no idea about load on the server when
they click on a site, although successive clicks may have this
property
– in particular, can't overload the server
• s-client tries to be open-loop:
– by generating connections at a particular rate
– independent of server load/capacity
S-Client (cont)
• How is s-client open-loop?
– connecting asynchronously at a particular rate
– using non-blocking connect() socket call
• Connect complete within a particular time?
– if yes, continue normally.
– if not, socket is closed and new connect initiated.
• Other details:
– uses single-address space event-driven model like Flash
– calls select() on large numbers of file descriptors
– can generate large loads
• Problems:
– client capacity is still limited by active FD's
– “arrival” is a TCP connect, not an HTTP request
www.cs.rice.edu/CS/Systems/Web-measurement
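A hedged sketch of the non-blocking connect() step in the spirit of s-client; the rate control, select() bookkeeping, and timeout handling are omitted, and the names are assumptions.

/* Sketch: initiate a connection without blocking, in the style s-client
 * uses to generate connections at a fixed rate regardless of server state. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>

/* Start a connect to 'srv'; returns the socket, or -1 on error.
 * Completion (or timeout) is detected later via select()/getsockopt(). */
int start_connect(const struct sockaddr_in *srv)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    fcntl(s, F_SETFL, O_NONBLOCK);
    if (connect(s, (const struct sockaddr *)srv, sizeof(*srv)) < 0 &&
        errno != EINPROGRESS) {      /* EINPROGRESS means "still connecting" */
        close(s);
        return -1;
    }
    return s;   /* caller adds s to its select() set with a deadline */
}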
Evolution: SPECWeb99
• In response to people "gaming" benchmark, now
includes rules:
– IP maximum segment lifetime (MSL) must be at least 60
seconds (more on this later!)
– Link-layer maximum transmission unit (MTU) must not be
larger than 1460 bytes (Ethernet frame size)
– Dynamic content may not be cached
• not clear that this is followed
– Servers must log requests.
• W3C common log format is sufficient but not mandatory.
– Resulting workload must be within 10% of target.
– Error rate must be below 1%.
• Metric has changed:
– now "number of simultaneous conforming connections“:
rate of a connection must be greater than 320 Kbps
SPECWeb99 (cont)
• Directory size has changed (worked example after this list):
  directories = (25 + (400000/122000) * simultaneous conns) / 5.0
• Improved HTTP 1.0/1.1 support:
– Keep-alive requests (client closes after N requests)
– Cookies
• Back-end notion of user demographics
– Used for ad rotation
– Request includes user_id and last_ad
• Request breakdown:
– 70.00 % static GET
– 12.45 % dynamic GET
– 12.60 % dynamic GET with custom ad rotation
– 04.80 % dynamic POST
– 00.15 % dynamic GET calling CGI code
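As a worked example of the directory-size formula above: at 500 simultaneous connections, (25 + (400000/122000) * 500) / 5.0 ≈ 333 directories of synthetic content.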
SPECWeb99 (cont)
• Other breakdowns:
– 30 % HTTP 1.0 with no keep-alive or persistence
– 70 % HTTP 1.0 with keep-alive to "model" persistence
– still has 4 classes of file size with Poisson distribution
– supports Zipf popularity
• Client implementation details:
– Master-client communication now uses sockets
– Code includes sample Perl code for CGI
– Client configurable to use threads or processes
• Much more info on setup, debugging, tuning
• All results posted to web page,
– including configuration & back end code
www.spec.org/osg/web99
SpecWeb99 vs. File Sizes
• SpecWeb99: In the ballpark, but not very smooth
SpecWeb99 vs. File Size Tail
• SpecWeb99 tail isn’t as long as real logs (900 KB max)
SpecWeb99 vs. Transfer Sizes
• Doesn’t capture 304 (not modified) responses
• Coarser distribution than real logs (i.e., not smooth)
Spec99 vs. Transfer Size Tails
• SpecWeb99 does OK, although tail drops off rapidly (and in
fact, no file is greater than 1 MB in SpecWeb99!).
Spec99 vs. Resource Popularity
• SpecWeb99 seems to do a good job, although tail
isn’t long enough
Evolution: TPC-W
• Transaction Processing Council (TPC-W)
– More known for database workloads like TPC-D
– Metrics include dollars/transaction (unlike SPEC)
– Provides specification, not source
– Meant to capture a large e-commerce site:
  – web serving, searching, browsing, shopping carts
  – online transaction processing (OLTP)
  – decision support (DSS)
  – secure purchasing (SSL), best sellers, new products
  – customer registration, administrative updates
• Models online bookstore
• Has notion of scaling per user
– 5 MB of DB tables per user
– 1 KB per shopping item, 25 KB per item in static images
TPC-W (cont)
• Remote browser emulator (RBE)
– emulates a single user
– send HTTP request, parse response, wait ("think time"), repeat
• Metrics:
– WIPS: shopping
– WIPSb: browsing
– WIPSo: ordering
• Setups tend to be very large:
– multiple image servers, application servers, load balancer
– DB back end (typically SMP)
– Example: IBM 12-way SMP w/DB2, 9 PCs w/IIS: 1M $
www.tpc.org/tpcw
Summary: Workload Generators
• Only the beginning. Many other workload generators:
– httperf from HP
– WAGON from IBM
– WaspClient from IBM
– Others?
• Both workloads and generators change over time:
– Both started simple, got more complex
– As workload changes, so must generators
• No one single "good" generator
– SpecWeb99 seems the favorite (2002 rumored in the works)
• Implementation issues similar to servers:
– They are networked-based request producers
(i.e., produce GET's instead of 200 OK's).
– Implementation affects capacity planning of clients!
(want to make sure clients are not bottleneck)
End of this tutorial…
• This is roughly half of a four-hour tutorial:
– ACM SIGMETRICS 2002 (June, Marina Del Rey, CA)
• Remainder gets into more detailed issues:
–
–
–
–
Event notification mechanisms in servers
Overview of the TCP protocol
TCP dynamics for servers
TCP implementation issues for servers
• Talk to me if you’re still interested, or
• Point your browser at:
www.sigmetrics.org
Chapter: Event Notification
• Event notification:
– Mechanism for kernel and application to notify each
other of interesting/important events
– E.g., connection arrivals, socket closes, data available to
read, space available for writing
• Idea is to exploit concurrency:
– Concurrency in user workloads means host CPU can
overlap multiple events to maximize parallelism
– Keep network, disk busy; never block
• Simultaneously, want to minimize costs:
– user/kernel crossings and testing idle socket descriptors
• Event notification changes applications:
– state-based to event-based
– requires a change in thinking
Chapter: Introduction to TCP
• Layering is a common principle in
network protocol design
• TCP is the major transport protocol
in the Internet
• Since HTTP runs on top of TCP,
much interaction between the two
• Asymmetry in client-server model
puts strain on server-side TCP
implementations
• Thus, major issue in web servers is
TCP implementation and behavior
[Figure: protocol layers - application, transport, network, link, physical]
Chapter: TCP Dynamics
• In this section we'll describe some of the
problems you can run into as a WWW server
interacting with TCP.
• Most of these affect the response as seen by the
client, not the throughput generated by the
server.
• Ideally, a server developer shouldn't have to
worry about this stuff, but in practice, we'll see
that's not the case.
• Examples we'll look at include:
– The initial window size
– The delayed ACK problem
– Nagle and its interaction with delayed ACK
– Small receive windows interfering with loss recovery
Chapter: Server TCP
Implementation
• In this section we look at ways in which the host
TCP implementation is stressed under large web
server workloads. Most of these techniques deal
with large numbers of connections:
– Looking up arriving TCP segments with large numbers of
connections
– Dealing with the TIME-WAIT state caused by closing
large number of connections
– Managing large numbers of timers to support connections
– Dealing with memory consumption of connection state
• Removing data-touching operations
– byte copying and checksums