CS233: Database Systems — Lecture Slides 8
presented by Timothy Heron∗
November 19, 2004
∗ E-mail: theron@dcs.warwick.ac.uk

Databases and the Web

• The Three-tier architecture
• Approaches to integrating Databases into the Web environment
• Java solutions

What is the WWW?

• A network of information systems
• Platform-independent delivery and dissemination of information and interactive applications
• Global network

However, from a database perspective:

• Static Web documents are file-based, which makes the information hard to manage and keep up to date.
• Many useful Web sites use dynamic content (more than brochure-ware), e.g. product and pricing information. This sort of data is the bread and butter of databases and DBMSs.

The Hyperlink

• The WWW is a hypermedia (non-sequential) means of browsing information.
• Simple point-and-click interface.
• Multimedia.

Web architecture is client-server:

• Clients (browsers) request information from servers (Web servers, e.g. Apache HTTP Server).
• Much of the information is stored in documents using HTML (Hypertext Markup Language).

Mailing List

Our Mailing List application could easily be written as a Web application. We can re-use our Members.java class without modification; we just need to change the interface code.

Mailing List Web Demo

The Protocol: HTTP

Documents are addressed by a URL (Uniform Resource Locator). The Hypertext Transfer Protocol (HTTP) defines how Web clients and Web servers communicate:

1. Connection: the client establishes a connection with the Web server.
2. Request: the client sends a request message to the Web server.
3. Response: the Web server sends a response (e.g. an HTML document) to the client.
4. Close: the connection is closed by the Web server.

The protocol is stateless: the server retains no information between requests.

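As a rough illustration of steps 2 and 3, the sketch below builds the text of a minimal HTTP/1.0 GET request and reads the status code from the first line of a response. The class HttpMessages and its method names are invented for this example and are not part of the lecture code. Opening and closing the network connection (steps 1 and 4) is left out.

```java
// Hypothetical helper for illustration only; not part of the Mailing List code.
final class HttpMessages {

    // Step 2: build the request message the client sends to the Web server.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n"; // a blank line ends the request headers
    }

    // Step 3: read the numeric status code from a status line
    // such as "HTTP/1.0 200 OK".
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }
}
```

Nothing in the request refers to an earlier request. Each message is self-contained, which is what makes the protocol stateless.
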
Statelessness

Because HTTP is stateless, servers (and clients) can be ‘thin’, requiring little memory or disk space.

It also means that the concept of a session, which is essential to DBMS transactions, is difficult to support. One way is to put hidden identifiers in HTML forms, which get passed back and forth between client and server; other ways include cookies and storing a session id in the URL.

Static v Dynamic Pages

An HTML page stored in a file is an example of a static page: its content does not change unless the file is changed.

Dynamic pages are generated each time they are accessed, e.g. they could contain information generated from queries to a database.

Essentially, dynamic pages are created by programs (usually scripts) which the server runs when requested by the client, and which create HTML dynamically containing the results of data access, e.g. to an RDBMS.

Web-DBMS Architecture

Two-tier client-server architecture:

• Client: the user interface and the ‘business logic’ within data processing programs. Primary role is the presentation of data.
• Server: the DBMS. Primary role is to supply data services.

Three-tier architecture:

• First tier: client running the user interface.
• Second tier: business logic and data processing logic handled by an Application Server.
• Third tier: data validation and database access handled by the DBMS.

The second tier centralises application maintenance and application logic in an extensible framework. It reduces the burden on the client, making it ‘thin’. The client no longer has to include database-specific code.

The three-tier architecture maps naturally onto the Web architecture, with the Web server acting as an Application Server or sending requests to a co-resident Application Server.

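A minimal sketch of the three-tier split in plain Java, using the Mailing List as the example. All three class names are hypothetical, and an in-memory list stands in for the DBMS so that the sketch is self-contained. A real third tier would be reached through a database API. The non-blank rule in the second tier is an assumed business rule, chosen only to show where such logic would live.

```java
import java.util.ArrayList;
import java.util.List;

// Third tier stand-in (assumption): an in-memory list plays the role of the DBMS.
final class MemberStore {
    private final List<String> rows = new ArrayList<>();

    void insert(String name) { rows.add(name); }

    // Return a copy so callers cannot modify the stored rows.
    List<String> selectAll() { return new ArrayList<>(rows); }
}

// Second tier: business logic lives here, not in the client.
final class MemberService {
    private final MemberStore store;

    MemberService(MemberStore store) { this.store = store; }

    // Hypothetical business rule: a member name must not be blank.
    boolean addMember(String name) {
        if (name == null || name.trim().isEmpty()) {
            return false;
        }
        store.insert(name.trim());
        return true;
    }

    List<String> listMembers() { return store.selectAll(); }
}

// First tier: presentation only; it never touches the store directly.
final class MemberView {
    static String render(List<String> members) {
        return String.join("\n", members);
    }
}
```

Because the view only talks to MemberService, the storage behind it could be swapped for a real database without changing any client code, which is the point of keeping the client ‘thin’.
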
What are the advantages of using the Web environment to support three-tier systems?

• The DBMS can manage Web document hierarchies.
• HTML offers simple user-end markup and a multimedia GUI; it is also standardised.
• Through browsers, Web access is platform independent.
• Network access is transparent and global: the Internet.

What are the disadvantages?

• Reliability: the Internet can be slow, with no guarantees on packet delivery or bandwidth.
• HTTP is not scalable: the protocol is not optimised for multiple small requests.
• HTTP is stateless: session management has to be coded.
• HTML is limited in functionality.

Approaches to Integrating Web and DBMSs

How to get dynamic content into Web pages?

• CGI: programs and scripts written in languages such as C, PHP, Perl or JavaScript
• Java solutions:
  – Applets (not used much now)
  – Servlets (API)
  – JavaServer Pages (JSP)
  – Enterprise Java Beans (EJB)
  – Struts

Other solutions (not discussed)

• SSI (Server Side Includes)
• Microsoft’s Web Solutions Platform: COM, DCOM, Active Server Pages (ASP) and ASP.NET, ActiveX Data Objects (ADO)
• Oracle Internet Platform: Oracle Internet Application Server (iAS)

CGI

The Common Gateway Interface (CGI) is a specification for how a Web server passes a browser’s request to an external program (a CGI script) and returns the program’s output to the browser.

• An HTML <form> is used to collect contents which are passed by the GET or POST method to a named CGI script.
• GET appends the form contents to the URL as a query string, e.g. url/test-cgi2?stuff=Hi+there&action=dosomething
• POST sends the form contents in the body of the request, which the server passes to the CGI script on its standard input (stdin). This keeps the contents out of the URL.
• Results from the CGI program are passed back to the browser. The output must begin with a header giving its MIME (Multipurpose Internet Mail Extensions) type, e.g. Content-Type: text/html.

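To make the GET case concrete, here is a sketch of how a program might decode a query string such as the one in the example URL. A CGI program receives this string in the QUERY_STRING environment variable. The class name QueryString is made up for this example, and for simplicity a repeated parameter name keeps only its last value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class QueryString {

    // Split "name=value&name=value" into decoded name/value pairs.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate stray "&&"
            }
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    // Form encoding writes a space as '+' and other characters as %XX.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For the query string stuff=Hi+there&action=dosomething this gives stuff mapped to "Hi there" and action mapped to "dosomething".
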
Problems with CGI

• The browser-server interaction is limited: only the query string is passed and standard output (MIME) returned.
• A new process is started for each HTTP request (e.g. each form submission), which is costly.
• There is no persistence of session state between user requests.
• Badly written CGI is open to abuse.

Servlets

Servlets are (Java) objects that extend the functionality of the Web server. They act as servers dedicated to specific CGI-style requests. Once started, they wait for requests, handle them and wait for more.

Servlets are created by compiling Java code which uses the packages:

    javax.servlet
    javax.servlet.http

Web servers support and manage these programs through built-in extensions (plugins), e.g. Apache Tomcat.

The life cycle of a servlet

1. The servlet execution engine (e.g. Tomcat) creates the servlet object and runs requests to it in threads (light-weight processes) rather than in separate processes.
2. The servlet object is initialised by the init() method and waits for service requests.
3. It responds to GET and POST requests by calling the user-defined doGet() or doPost() methods.
4. User content is unpackaged using the getParameterValues() method (or getParameter() for a single value).
5. It writes its MIME-typed response to the response object’s writer (rather than to stdout as in CGI), perhaps using an HTML helper object like HtmlWriter.

Why use servlets?

Servlets are preferred over CGI scripts because:

• Servlets can keep user sessions open for multiple requests.
• They can manage client state within the object (or with supporting objects).
• They can access a DBMS through the Java database API: JDBC.
• They are easier to write securely: Java does not allow direct access to memory, so attacks such as ‘buffer overflows’ are easier to protect against.

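To illustrate how a long-lived server object can hold client state across requests, here is a minimal sketch in plain Java. It does not use the servlet API: SessionTable and its methods are hypothetical names, and the random id stands in for whatever identifier the client sends back with each request (hidden form field, cookie or URL). In a real servlet, the container’s HttpSession, obtained from request.getSession(), normally plays this role.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: per-client state kept in a long-lived server object.
final class SessionTable {
    // session id -> that client's attributes
    private final Map<String, Map<String, String>> sessions = new HashMap<>();

    // Start a session and return the id the client must send back each time.
    synchronized String create() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new HashMap<>());
        return id;
    }

    // Store a value for one client; unknown ids are rejected.
    synchronized void put(String id, String key, String value) {
        Map<String, String> attributes = sessions.get(id);
        if (attributes == null) {
            throw new IllegalArgumentException("Unknown session: " + id);
        }
        attributes.put(key, value);
    }

    // Look up a value; returns null for an unknown session or key.
    synchronized String get(String id, String key) {
        Map<String, String> attributes = sessions.get(id);
        return attributes == null ? null : attributes.get(key);
    }
}
```

The methods are synchronized because a servlet engine may serve several requests at once in different threads. A CGI script cannot keep such a table in memory, since its process ends after each request.
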
Mailing List as Servlets

We can break our Mailing List application into 4 key parts:

• An HTML page to select whether we want to ‘Add a member’ or ‘List members’.
• An HTML page containing a form to enter a member’s details.
• An AddMemberServlet to add the form’s contents to the database.
• A ListMembersServlet to retrieve data from the database and present it as an HTML page.

ListMembersServlet.java

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class ListMembersServlet extends HttpServlet {

        public void doGet(HttpServletRequest request, HttpServletResponse response)
                throws IOException, ServletException {

            // Get the list of members
            Members members = new Members();
            String[] memberList = members.listMembers();

            // Prepare the beginning of our response
            // (set the content type before obtaining the writer)
            response.setContentType("text/html");
            PrintWriter writer = response.getWriter();
            writer.println("<html>");
            writer.println("<head>");
            writer.println("<title>List of Members</title>");
            writer.println("</head>");
            writer.println("<body bgcolor=\"white\">");
            writer.println("<h1>Members</h1><hr>");

            // One line of output per member
            for (int i = 0; i < memberList.length; i++)
                writer.println("<br>" + memberList[i]);

            // Prepare the ending of our response
            writer.println("<hr><br><br>");
            writer.println(
                "<a href=\"../mailinglist/MailingList.html\">Return</a><br>");
            writer.println("</body>");
            writer.println("</html>");
        }
    }

The Add Member form

We have to construct a Web page to accept input from the user. This requires us to use the <FORM> tag and the <INPUT> tag.

AddMember.html

    <html>
    <head>
    <title>Mailing List - Add Member</title>
    </head>
    <body>
    <h1>Add a new member</h1>
    <hr>
    <form action="../servlet/AddMemberServlet" method="post">
    <table width="100%" border="0">
    <tr><td>Forename :</td>
        <td><input type="TEXT" name="forename" /></td></tr>
    <tr><td>Surname :</td>
        <td><input type="TEXT" name="surname" /></td></tr>
    <tr><td>Address 1 :</td>
        <td><input type="TEXT" name="address1" /></td></tr>
    <tr><td>Address 2 :</td>
        <td><input type="TEXT" name="address2" /></td></tr>
    <tr><td>City :</td>
        <td><input type="TEXT" name="city" /></td></tr>
    <tr><td>Postcode :</td>
        <td><input type="TEXT" name="postcode" /></td></tr>
    </table>
    <input type="SUBMIT" />
    </form>
    <hr>
    <p><a href="MailingList.html">Return</a></p>
    </body>
    </html>

AddMemberServlet

We have to create a servlet that the Web browser will submit this form to. Since the user is submitting a form with method="post" (rather than following a link), we must override the doPost method instead of the doGet method we overrode in ListMembersServlet.

The fields on the form are passed to the servlet through the HttpServletRequest request object. We call the request object’s String getParameter(String parameterName) method to get at the user’s submission.

AddMemberServlet.java

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class AddMemberServlet extends HttpServlet {

        public void doPost(HttpServletRequest request, HttpServletResponse response)
                throws IOException, ServletException {

            // Acquire the parameters (the names match the form's input names)
            String forename = request.getParameter("forename");
            String surname = request.getParameter("surname");
            String address1 = request.getParameter("address1");
            String address2 = request.getParameter("address2");
            String city = request.getParameter("city");
            String postcode = request.getParameter("postcode");

            // Try to add the member to our list
            Members members = new Members();
            boolean success = members.addMember(forename, surname,
                    address1, address2, city, postcode);

            // Prepare the beginning of our response
            // (set the content type before obtaining the writer)
            response.setContentType("text/html");
            PrintWriter writer = response.getWriter();
            writer.println("<html>");
            writer.println("<head>");
            writer.println("<title>Add Member Result</title>");
            writer.println("</head>");
            writer.println("<body bgcolor=\"white\">");
            writer.println("<h3>Add Member Result</h3>");

            // Report whether the insert worked
            if (success)
                writer.println("<p>Member " + forename + " added successfully.</p>");
            else
                writer.println("<p>Could not add member to the list.</p>");

            // Prepare the ending of our response
            writer.println("<hr>");
            writer.println(
                "<a href=\"../mailinglist/AddMember.html\">Add another member</a><br>");
            writer.println(
                "<a href=\"../mailinglist/MailingList.html\">Return</a><br>");
            writer.println("</body>");
            writer.println("</html>");
        }
    }

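The servlet above writes the submitted forename straight into the HTML response, so a forename containing markup would be interpreted by the browser. One common precaution is to escape user-supplied text before printing it. The helper below is a sketch of that idea; HtmlEscaper is a hypothetical class, not part of the servlet API or of the lecture code.

```java
// Hypothetical helper for illustration only.
final class HtmlEscaper {

    // Replace the characters that have special meaning in HTML.
    static String escape(String text) {
        if (text == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Under this assumption, the success message would be printed with HtmlEscaper.escape(forename) in place of forename, and the same would apply to each member name listed by ListMembersServlet.
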