Building Scalable Web Services Using
Apache JServ
Sunny Gleason
COM S 717
Tuesday, December 4, 2001
In This Lecture
• What is JServ?
• The Alternatives
• Java Servlet API
• Apache JServ / Tomcat
– Scalability
– Load Balancing
– Fault-Tolerance
• JServ Security
Introduction
• Running a web service has changed a
lot since the early 1990s
• Originally static HTML, text, and images
• Still a great deal of HTML content
• Shift from static pages to dynamically
generated content
• Database-driven content, WAP, XML,
XSLT
What is JServ?
• JServ Server is a Java Servlet Engine
(compliant with the Java Servlet API v2.0)
• Free software produced by the Apache
Software Foundation
• Mod_jserv is a module for connecting JServ
to Apache HTTP Server
• JServ engine has been replaced by Tomcat
• Mod_jserv has been replaced by mod_jk
HTTP Basics
• HyperText Transfer Protocol
• Built on top of TCP
• 2 Well-Known Methods:
– GET
– POST
• Other Methods
– HEAD, PUT, DELETE, ...
• Stateless
HTTP GET
• Format:
– GET url HTTP/1.1 crlf headers crlf crlf
• The url string contains the resource identifier
e.g. “/top.htm”
• The headers contain optional information
provided by the client to the server
• Query Data may follow a question mark in
the URL
– e.g. “/search.pl?query=linux”
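The request format above can be sketched in a few lines of Java. This is only an illustration: the class name GetRequestSketch, the host www.example.com, and the choice of the Host header are assumptions made for the example, not part of the lecture.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class GetRequestSketch {
    /** Builds the text of a GET request: request line, headers, empty line. */
    static String build(String host, String path, String key, String value) {
        // Query data follows a question mark in the URL and is URL-encoded.
        String query = URLEncoder.encode(key, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
        return "GET " + path + "?" + query + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"  // headers, one per line
                + "\r\n";                   // empty line ends the header block
    }

    public static void main(String[] args) {
        System.out.print(build("www.example.com", "/search.pl", "query", "linux"));
    }
}
```

A GET request has no body, so everything the server needs must fit in the URL and the headers.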
HTTP POST
• Format:
– POST url HTTP/1.1 crlf headers crlf crlf
form_data
• Form data not passed through URL
• Allows submission of data values which
are larger than maximum URL length
– [URL ~ 2k on MS IE4.0 and above]
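A form submission whose data travels in the request body rather than in the URL might look like the sketch below, which uses the POST method. The class name PostRequestSketch, the host, and the two Content-* headers shown are assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

class PostRequestSketch {
    /** Builds a POST request whose form data follows the empty line. */
    static String build(String host, String path, String formData) {
        // The body length is announced in bytes so the server knows where it ends.
        int length = formData.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n"
                + formData; // not limited by the maximum URL length
    }

    public static void main(String[] args) {
        System.out.print(build("www.example.com", "/search.pl", "query=linux"));
    }
}
```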
HTTP Server Response
• HTTP/1.1 200 OK crlf headers crlf crlf
content
• Headers include MIME-type, content
length, content encoding
• Other responses: 301 Moved Permanently, 401
Authorization Required, 403 Forbidden,
500 Internal Server Error
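A client reads the first line of the response to learn the outcome. The sketch below assumes a status line of the form "HTTP/1.1 200 OK" (version, code, reason phrase); the class and method names are hypothetical.

```java
class StatusLineSketch {
    /** Extracts the numeric code from a status line such as "HTTP/1.1 200 OK". */
    static int statusCode(String statusLine) {
        // Three space-separated fields: version, code, reason phrase.
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("not a status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    /** Groups a code by its first digit. */
    static String category(int code) {
        switch (code / 100) {
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        int code = statusCode("HTTP/1.1 403 Forbidden");
        System.out.println(code + " is a " + category(code));
    }
}
```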
Cookies
• Persistent Client-Side Information
• <Server, Key, Value, Expiration Date>
tuples
• Server sets cookie using Set-Cookie
header
• All future requests to server (before
expiration date) accompanied by cookie
in header
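The tuple view of cookies can be sketched as a tiny client-side cookie jar. This is a simplification under stated assumptions: CookieJarSketch and its methods are hypothetical, and real cookies also carry attributes such as path and domain that are ignored here.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

class CookieJarSketch {
    /** One <server, key, value, expiration date> tuple. */
    static final class Cookie {
        final String server, key, value;
        final Instant expires;
        Cookie(String server, String key, String value, Instant expires) {
            this.server = server; this.key = key; this.value = value; this.expires = expires;
        }
    }

    private final List<Cookie> cookies = new ArrayList<>();

    /** Called when a response from the server carries a Set-Cookie header. */
    void store(String server, String key, String value, Instant expires) {
        cookies.removeIf(c -> c.server.equals(server) && c.key.equals(key));
        cookies.add(new Cookie(server, key, value, expires));
    }

    /** Value of the Cookie request header for this server at time now ("" if none). */
    String headerFor(String server, Instant now) {
        StringBuilder sb = new StringBuilder();
        for (Cookie c : cookies) {
            if (c.server.equals(server) && now.isBefore(c.expires)) {
                if (sb.length() > 0) sb.append("; ");
                sb.append(c.key).append('=').append(c.value);
            }
        }
        return sb.toString();
    }
}
```

Because HTTP itself is stateless, this returned header is what lets a server recognize a returning client.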
Serving Dynamic Content
• We discuss 3 early models for dynamic
content:
– CGI
– Mod_perl
– Mod_php
The Alternatives: CGI
• Common Gateway Interface
• Advantages
– Flexibility - run any program
• bash, perl, python, php
– Low process overhead when idle
• Disadvantages
– Reload interpreter upon every request
– Re-establish (costly) database connections
– Security concerns - passing parameters
The Alternatives: Mod_Perl
• Apache module for Perl
• Memory-resident interpreter
• Precompiled scripts / Script cache
• Speed / Memory Tradeoff
– HTTP Processes maintain individual perl
interpreters
– Allows persistent database connections, other
persistent server state
– Consistency between HTTP processes was not
always assured
The Alternatives: Mod_Php
• Apache module for PHP
(PHP: Hypertext Preprocessor)
• Template-based language
– Code tags are “embedded” within HTML template
files
– Similar to MS ASP
• Suitable where HTML to script code ratio is
high
• Huge library of add-on modules
• Tradeoffs similar to those of mod_perl
The Alternatives: Summary
• Should application logic be running on
the web server?
– scalability
– fault-tolerance
– security
• Clearly, need something better for
enterprise-scale applications
Apache JServ
• Separate Application Server from Web Server
– Clean up the architecture
– Improve Scalability
– Provide fault-tolerance
• Embrace Java Philosophy
– “Write once, run anywhere”
• Provide additional Servlet functionality
– Like user sessions
JServ: Openness
• JServ is 100% Java Code
– Platform-Independent
– Runs on any compliant JVM (IBM, Sun, ...)
• JServ is built on top of TCP
• Part of the Apache Software Foundation
– Integrates nicely with Apache HTTP Server
– Ports available for Windows, BSD, Linux ...
JServ: Security
• JServ/Apache can run on different hosts
(also: different users)
• JServ itself is comprised of many “Zones”
– A zone is a JVM which executes some number of
Java Servlets
• JServ may be placed behind a firewall
• JServ offers ACL security by IP address
• Optional shared-key authentication
• Apache HTTP Server may integrate SSL for
secure HTTP client-server interaction
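The two connection checks (ACL by IP address, optional shared key) can be illustrated as follows. This is a deliberately simplified sketch: ConnectionGuardSketch is a hypothetical class, and comparing the key directly is an assumption made for brevity that does not reproduce the actual JServ authentication handshake.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Set;

class ConnectionGuardSketch {
    private final Set<String> allowedAddresses;
    private final byte[] sharedKey; // null means key authentication is switched off

    ConnectionGuardSketch(Set<String> allowedAddresses, String sharedKey) {
        this.allowedAddresses = allowedAddresses;
        this.sharedKey = sharedKey == null ? null : sharedKey.getBytes(StandardCharsets.UTF_8);
    }

    /** Accepts a connection only if it passes the ACL and, when enabled, the key check. */
    boolean accepts(String remoteAddress, String presentedKey) {
        if (!allowedAddresses.contains(remoteAddress)) return false; // ACL by IP address
        if (sharedKey == null) return true;                          // key check is optional
        if (presentedKey == null) return false;
        // MessageDigest.isEqual compares the two byte arrays for equality.
        return MessageDigest.isEqual(sharedKey, presentedKey.getBytes(StandardCharsets.UTF_8));
    }
}
```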
JServ: Load Balancing
• Level 0: 1 - 1 Apache/JServ
– No load balancing, no redundancy
• Level 1: 1 - n Apache/JServ
– Each JServ hosts different zones
(load partitioning)
• Level 2: 1 - m*n Apache/JServ
– Each zone may be balanced among several JServs
• Level 3: p - m*n Apache/JServ
– Multiple Apache Servers, multiple JServs
JServ: Levels 0-1
• Level 0: allows smaller hosts to run
entire application on a single machine
• Level 1: allows different hosts to serve
different applications
• Typically difficult to plan/partition
applications in this manner
JServ: Level 2
• 1 - m*n Apache/JServ
– Allows Apache to balance requests among
several JServ servers hosting the same
zones
– Apache configuration file specifies ratio of
hits for each JServ
– Each HTTP process chooses server for each
JServ zone, sends new requests to this
target
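One way to honor a configured ratio of hits is a weighted random choice, sketched below. The lecture does not say which selection algorithm mod_jserv uses, so this is an assumption; the class name and the weights in the example are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

class WeightedChoiceSketch {
    /** Picks a JServ with probability proportional to its configured weight. */
    static String choose(Map<String, Integer> weights, Random rng) {
        int total = 0;
        for (int w : weights.values()) total += w;
        int ticket = rng.nextInt(total); // 0 <= ticket < total
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            ticket -= e.getValue();
            if (ticket < 0) return e.getKey();
        }
        throw new IllegalStateException("weights must be positive");
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("JS1", 3); // JS1 should receive about three hits
        weights.put("JS2", 1); // for every one that goes to JS2
        System.out.println(choose(weights, new Random()));
    }
}
```

In the scheme described above, each HTTP process would make such a choice per zone and keep sending new requests to the chosen target.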
JServ: Level 3
• p - m*n Apache/JServ
– Allows HTTP traffic to be load-balanced
among several Apache servers
– Allows Servlet workload to be distributed
among several JServ servers
– In order for the system to work, each
Apache HTTP server must have identical
JServ configuration
• (To preserve sessions, as we’ll see later)
JServ: Session Handling
• Once established, a session is bound to a
particular JServ
• But, HTTP client accesses might be “sprayed”
among many HTTP servers
– Allows HTTP Server fault-tolerance
• Identical mod_jserv configuration allows
different Apache servers to “route” requests
to the right JServ
• Mechanism requires client to maintain a
cookie which contains JServ server ID
JServ: Session Handling
• How does it work?
– Every time a request arrives for a balanced
ServletMountPoint, mod_jserv chooses a JServ to
handle the request
– mod_jserv adds a cookie trailer to the
environment variables of the JServ request
(e.g. JS3)
– JServ appends the cookie trailer to the end of the
session cookie
– Upon subsequent requests, Apache examines
cookie, and sends the request to the correct JServ
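The cookie-trailer mechanism can be sketched as two string operations plus a table lookup. The "-" separator follows the "xxxx-JS1" example used in these slides; the class name, method names, and the routes table are assumptions for illustration.

```java
import java.util.Map;
import java.util.Optional;

class SessionRouteSketch {
    /** Appends the trailer to a session ID: "xxxx" and "JS1" give "xxxx-JS1". */
    static String tag(String sessionId, String trailer) {
        return sessionId + "-" + trailer;
    }

    /** Reads the trailer back from a cookie value, if one is present. */
    static Optional<String> trailerOf(String cookieValue) {
        int dash = cookieValue.lastIndexOf('-');
        if (dash < 0 || dash == cookieValue.length() - 1) return Optional.empty();
        return Optional.of(cookieValue.substring(dash + 1));
    }

    /** Resolves a cookie to a JServ address; empty if the trailer is unknown. */
    static Optional<String> targetFor(String cookieValue, Map<String, String> routes) {
        return trailerOf(cookieValue).map(routes::get);
    }

    public static void main(String[] args) {
        Map<String, String> routes = Map.of("JS1", "192.168.0.51:8885");
        System.out.println(targetFor(tag("xxxx", "JS1"), routes).orElse("no route"));
    }
}
```

The lookup only works if every Apache server holds the same routes table, which is why identical mod_jserv configuration is required.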
JServ: Fault-Tolerance
• (Assume Level 3)
• No Single Point of Failure
– Apache can become overloaded and fail, but JServ
servers continue to provide services (although SSL
sessions lost)
– JServ redundancy allows applications to continue
running even if multiple hosts fail (although
application sessions will be lost)
– Since any Apache can route to any JServ, as long
as one of each stays up, the system can work
JServ: Fault-Tolerance
• How is the JServ fault tolerance
implemented?
– Each Apache contains a memory-mapped file
where it keeps JServ information
– Each Apache process has access to the file
– If a process does not receive a response from a
JServ process, it marks it as DOWN in the file
• (Load is re-distributed [fairly] among the survivor JServs)
– A “watchdog” process pings the JServs
intermittently, updates the JServ status in memory
if the server is back online
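The shared status table and the watchdog can be modelled in a few lines. In this sketch an in-process map stands in for the memory-mapped file, and the ping is passed in as a function; both are assumptions, as are all names used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

class JServStatusTableSketch {
    enum Status { UP, DOWN }

    // Stand-in for the memory-mapped file shared by all Apache processes.
    private final Map<String, Status> status = new ConcurrentHashMap<>();

    void register(String id) { status.put(id, Status.UP); }

    /** Called by a process that received no response from this JServ. */
    void markDown(String id) { status.replace(id, Status.DOWN); }

    /** JServs that may still receive requests, in sorted order. */
    List<String> live() {
        List<String> ids = new ArrayList<>();
        status.forEach((id, s) -> { if (s == Status.UP) ids.add(id); });
        Collections.sort(ids);
        return ids;
    }

    /** One watchdog pass: ping every DOWN server and restore those that answer. */
    void watchdogPass(Predicate<String> ping) {
        status.replaceAll((id, s) -> (s == Status.DOWN && ping.test(id)) ? Status.UP : s);
    }
}
```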
JServ: Fault-Tolerance
• Apache Fault-Tolerance: Step 1
– 1. Client requests www.jserv.com:80
– 2. HTTP Load-balancing system routes request
to 111.222.244.10:3000
– 3. Apache server chooses a random JServ
machine, say 192.168.0.51:8885
– 4. JServ machine responds to request with
content of page, along with cookie with name
“JServSessionID” and value “xxxx-JS1”
JServ: Fault-Tolerance
• Apache Fault-Tolerance: Step 2
– 1. Client requests another page from
www.jserv.com:80
– 2. HTTP Load-balancing system routes request to
111.222.244.20:3000
– 3. Apache server recognizes session cookie, finds
“JS1” at end of the cookie
– 4. Apache looks up “JS1” in JServ configuration,
routes request to 192.168.0.51:8885
JServ: Load-Balancing
• Step 1: JServ Load-Balancing
– 1. Client A requests a servlet (A1)
– 2. HTTP chooses target JServ (A’1)
– 3. Client A cookie is set for JS1
– 4. Client B requests a servlet (B2)
– 5. HTTP chooses target JServ (B’2)
– 6. Client B cookie is set for JS2
JServ: Load-balancing
• Step 2: Session Handling
– 1. Client B requests a servlet (sends
previously-set cookie)
– 2. HTTP server recognizes cookie
– 3. Request is routed to JServ2 (B’2)
JServ: Fault-Tolerance
• [assume JServ1 goes down]
• Step 3: JServ Fault-Tolerance
– 1. Client A requests a servlet
– 2. HTTP Server recognizes the JS1 cookie
– 3. Request is passed to JServ1, resulting in
timeout
– 4. HTTP marks JServ1 “dead” in shared memory
– 5. HTTP looks up another server for the servlet
mount point, sends request to JServ3
– 6. If a new session is needed, a new one is
created and the new cookie is set to “JS3” (JS1
erased)
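The failover steps can be sketched as a single routing function. The names here are hypothetical, a timeout is modelled by a predicate that returns false, and the replacement is chosen at random among the survivors, which is one assumed way to spread the failed server's load rather than a description of mod_jserv itself.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Predicate;

class FailoverSketch {
    private final List<String> servers;            // JServs for one servlet mount point
    private final Set<String> dead = new HashSet<>(); // stand-in for shared memory

    FailoverSketch(List<String> servers) { this.servers = new ArrayList<>(servers); }

    boolean isDead(String id) { return dead.contains(id); }

    /** Tries the JServ named in the cookie first, then a random survivor. */
    String route(String cookieTarget, Predicate<String> responds, Random rng) {
        if (cookieTarget != null && !dead.contains(cookieTarget)) {
            if (responds.test(cookieTarget)) return cookieTarget;
            dead.add(cookieTarget); // timeout: mark the target dead
        }
        List<String> survivors = new ArrayList<>(servers);
        survivors.removeAll(dead);
        while (!survivors.isEmpty()) {
            String candidate = survivors.remove(rng.nextInt(survivors.size()));
            if (responds.test(candidate)) return candidate; // new cookie would name this server
            dead.add(candidate);
        }
        throw new IllegalStateException("no live JServ for this mount point");
    }
}
```

The session state held by the failed JServ is not recovered; the client simply starts a new session on the server that route returns.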
JServ: Fault-Tolerance
• Implementation Issues
– Denial of Service
• Failed requests must be re-distributed evenly!
• Otherwise, a single server will bear the brunt of the load,
and probably crash
– Network Partitioning and Application-level Data
Synchronization Issues
• Must still be anticipated by the app. designer
– Watchdog process
• For single-threaded watchdog, if timeout is t, time
between crash and restoration could be f*t, where f is
the number of failed processes
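The f*t bound comes from pinging failed servers one after another, each ping waiting out the full timeout. A small worked example, with hypothetical names and illustrative numbers:

```java
class WatchdogDelaySketch {
    /**
     * Duration of one sequential watchdog pass in which every ping to a
     * failed server waits for the full timeout before the next ping starts.
     */
    static long passMillis(int failedServers, long timeoutMillis) {
        return Math.multiplyExact((long) failedServers, timeoutMillis);
    }

    public static void main(String[] args) {
        // Assumed values: f = 5 failed JServs, t = 10 seconds per ping.
        System.out.println(passMillis(5, 10_000) + " ms per pass");
    }
}
```

With these assumed values a recovered JServ could wait up to 50 seconds before the watchdog reaches it again, which is why the delay grows with the number of failures.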
JServ: Manageability
• Shared JServ State allows HTTP process
coordination
• Admins can mark JServs as “shutdown” in
shared memory
• JServ processes can be brought down for
maintenance
• Apache HTTP processes redirect requests
among “live” servers
• Detailed availability information can be
produced by logging contents of shared
memory file
Tomcat: New Features
• Enhanced security model
• Property files which specify access
rights (open socket, write file, etc.)
• Allows different protection levels within
the same JVM (i.e. Java 2 protection
model)
Conclusion
• JServ provides:
– Limited support for
• Load balancing
• Fault-tolerance
• External Security
– Good support for
• Internal Security
• N-tier application abstraction provides
flexibility when needed, “loopback” option
otherwise
The End
• Any questions/comments?
– Apache Web Server
– JServ / Tomcat Servlet Containers
– Scalability / Load-balancing
– Fault-tolerance
– Security
For Further Info
• Apache Jakarta Project
• http://jakarta.apache.org/
• http://jakarta.apache.org/tomcat/