Web and databases
Bruno DEFUDE
Computer Science Dept
Web and DB - INT Evry

WWW – Web advantages
Universal client
Easy to use
Open standards
Good integration with other Internet services and protocols
Extensible
Low software and network costs
Corporate network (intranet), inter-company network (extranet), WAN

Why couple the web and DBs?
The web is a kind of DB, but
  » Without a schema
  » Without a query language
  » Without transactions, recovery, …
  » Without a powerful authorization mechanism
DBs store huge amounts of data that are worth publishing on the web

Simple coupling
Transform DBs into sets of static web pages
  » Simple
  » Redundancy, consistency problems
  » Disadvantages of the web
Need to generate dynamic web pages built from DB content

First solution: CGI scripts
[Figure: the browser (client) sends a URL with a query string to the HTTP server, which runs a CGI script; the script sends SQL to the DBMS and returns the result to the browser as HTML]

Pros and cons of CGI
Simple and portable
Lots of code to write (one CGI script per query!): can be improved with generic solutions
Not very efficient

WWW – CGI problem
[Figure: clients 1, 2 and 3 connect to the HTTP server process, which starts a separate CGI script process for each request]

WWW – the API solution
[Figure: clients 1, 2 and 3 connect to the HTTP server process; the API is a set of functions run as threads 1, 2 and 3 inside that process]
CGI scripts become functions of a DLL, executed as threads within the multi-threaded HTTP server

FastCGI
Persistent CGI script (daemon-like)
A FastCGI script is split into 3 parts:
  » init: executed once
  » body: executed at each request
  » ending: executed once
The init part should contain the costly part of the CGI code (connection to a DBMS, …)
Well suited to DBMS access, but the script is persistent (it cannot handle numerous different connections)

WWW – Web limitations
Performance
  » Internet network => increase bandwidth
  » CGI scripts => FastCGI, server-side
    scripts
Security
  » HTTP => S-HTTP (Secure HTTP)
  » TCP/IP level => SSL protocol

WWW – Web limitations (2)
Transaction management
  » Not possible with HTTP 1.0
    – No session mode
    – One can use "cookies"
  » Possible with HTTP 1.1
    – Allows persistent connections (TCP level)
User interface
  » HTML is not very powerful for building sophisticated UIs
  » Java is more powerful

WWW – Web evolutions
At the beginning
  » Access to information
    – Distributed over the network
    – Represented by hypermedia documents
Today
  » Software construction and processing
    – Client/server, heterogeneous
    – Inside the same organization (intranet)

Coupling Web and DBMS
Gateway principles
Java approach (applets, servlets, JSP)
Server-side scripting (PHP)

Gateway principles
Single-request gateways
Multiple-request gateways (transactional)

Functionalities
(1) Translate the HTTP request (mapping environment variables to SQL statements)
(2) Process SQL queries on the DBMS
(3) Encode SQL results as HTML pages

Process SQL queries
Program written using a DBMS API
  » DBMS-dependent approach (embedded SQL)
  » DBMS-independent approach (ODBC, JDBC, or DBI for Perl)
Programming languages used
  » Classical: C, C++, Ada, … when using embedded SQL
  » Scripting language: Perl (with DBI)

HTML encoding
Generic solution (can be automated):
  » select: HTML table
  » other query: string
Specific solution: a specific program for each query
    – Within the gateway (one gateway per query!)
    – Within the DBMS (one stored procedure per query)
Intermediate solution:
  » Encoding can be parameterized (update forms, hypertext navigation within the DB, …)

DB access (simple HTML)
[Figure: the browser (client) sends a URL with a query string to the HTTP server, which calls the CGI gateway; the gateway sends SQL to the DBMS, receives the data, and returns the result as HTML]

Simple HTML example
<html><body>
<form name="f1" action="http://mica/multi2.cgi" method="get">
  <input type="hidden" name="uid" value="citcom/citcom@MICA">
  <input type="hidden" name="sqlstatement" value="select * from students where sname=">
  Give a name : <input name="sname" value=""> <p>
  <input type="submit" value="lancer">
</form>
</body></html>

DB access (client-side script)
[Figure: the browser (client) sends a URL with a query string to the HTTP server and its gateway; the gateway sends SQL to the DBMS, receives the data, and returns the result as HTML]

HTML with JavaScript example
<html><body>
<form name="f1" action="http://mica/multi2.cgi" method="get">
  <input type="hidden" name="uid" value="citcom/citcom@MICA">
  <input type="hidden" name="sqlstatement" value="select * from students where sname=">
  Give a name : <input name="sname" value=""> <p>
  <input type="button" value="go" onClick="f1.sqlstatement.value+=f1.sname.value; f1.submit();">
</form>
</body></html>

DB access (without HTTP)
[Figure: a Java or Tcl program (applet, tclet, …) running in the browser (client) talks to the DBMS over a specific protocol]
Specific protocol: JDBC, IIOP (CORBA)

Functionalities
[Table: where each functionality (1), (2), (3) is carried out for the three approaches (simple HTML, client-side script, without HTTP); the cell values that survive are: specific gateway or CGI, client, gateway or DBMS, DBMS]

Variations
DBMS-dependent or DBMS-independent (native or ODBC)
Supported query language (SQL or a subset, static vs dynamic)
HTML encoding (generic or specific)
Efficiency

Efficiency
CGI vs NSAPI, ISAPI (but these are proprietary solutions)
Reduce the number of processes (when there are many clients): multithreaded gateway

Transactions
Transaction = DB
program (a sequence of reads and writes)
ACID properties
  – A: Atomicity
  – C: Consistency
  – I: Isolation
  – D: Durability

Examples of transactional services
Online ordering (music, books, plane tickets, …)
Banking
Insurance
E-business!

Transactions and Web
Transaction = sequence of URL invocations (context managed by the web?)
Transaction = ACID properties (ensured by the DBMS)
HTTP = no session support
  – Client-side context management (cookies)
  – Simulation of HTTP sessions (transactional web)
  – Use of server-side scripting languages (PHP, ASP, JSP, ...)

Transactions and cookies
[Figure: message sequence between the browser (client), the HTTP server with its CGI scripts, and the DBMS, using the steps below]
1: first access
1': cookie c1 is generated
2: further access, carrying cookie c1
2': cookies c1, c2 are generated
3: end of transaction, carrying c1, c2
4: the transaction is constructed and processed on the DBMS: sql(c1, c2)
4': ok
5': cookies are deleted: del(c1, c2)

Transactions and cookies (2)
Context is managed on the client side (cookies)
Advantages:
  – Easy to code
  – DB access happens only once, at the end of the session (DB resources are not blocked)
  – If the user never explicitly ends the session, nothing needs to be done (at DB level)

Transactions and cookies (3)
Cons
  – Not really transactional
  – Limited functionality
  – Lots of CGI scripts to code
Cookie problems
  – Global to a user (no distinction between two windows)
  – Global to a URL (does not allow two different transactions on the same site at the same time)

Transactional gateway
[Figure: browser (client) -> HTTP server -> CGI -> daemon -> gateway -> DBMS]

Transactional gateway principles
Context is managed by a daemon on the server side
A transaction id must be stored on the client side (cookie or rewritten URL) and on the server side (in an array kept by the daemon)
A gateway does not process a single
query but a complete transaction (a sequence of queries)

Transactional gateway operations (1)
1: transaction beginning request (implicit or explicit)
  – An id is allocated by the daemon, the transactions array is updated, a gateway is launched, and the id is sent back to the client (cookie or rewritten URL)
2: DB operation request
  – The request is routed by the daemon to the right gateway using the transaction id and the transactions array

Transactional gateway operations (2)
3: transaction ending request
  – Implicit: error, timeout
  – Explicit: as in step 2, plus the transactions array is updated, the id is deleted on the client side, and the gateway is stopped

Transactional gateway summary
Pros
  – Really transactional
  – Generic solution
Cons
  – Complex architecture
  – DB resources are blocked until the end of the transaction
  – Need to detect a user "abort" (timeout)

Scripting languages (server-side)
Offer a programming language integrated with the web (can be called from a web page, associated with an HTTP server)
The language run-time offers session support (and consequently transaction support)
PHP, servlets and JSP, ASP, XSP (Cocoon)
See Java and PHP for more details

Java and Web
The JVM concept is well suited to the web
The Java language is widely used
JDBC (standard API for DBMSs)
Applets
Servlets
Java Server Pages

Java - Java and Web
Language for the W3 client
  » Applet = small compiled Java application
    – Loaded from a W3 server
    – Execution is secured by the client JVM
Language for the W3 server
  » Servlet = compiled Java application
    – Stored on the server side
    – Run by the server-side JVM

Servlets
Servlet 2.2 (http://java.sun.com/servlet)
Tomcat from Apache is the reference implementation
The Java equivalent of CGI
Extensions over CGI (sessions, buffer management, redirection/chaining)

Servlet architecture
Web browser (client) -> HTTP
server -> servlet container (servlet1, servlet2)
The container manages servlets (activation, deactivation)

Servlet object model
[Figure: servlet1 works with a request object (getParameter, getAttribute, getHeaders, getCookies), a session object (getAttribute), and a response object (setBufferSize, setHeader, setRedirect)]

JSP
JSP 1.2 (http://java.sun.com/jsp)
Tomcat from Apache is the reference implementation
Scripts embedded in HTML pages
Same as ASP and PHP, but for Java
Compiled into servlets
Portability of Java
Server-side scripting, extensible through tag libraries

Example of JSP page
<html>
<%@ page language="java" import="java.util.*" %>
<h1>Welcome</h1>
<p>Date: <p>
<jsp:useBean id="clock" class="jspCalendar"/>
<ul>
<li>Day : <%= clock.getDayOfMonth() %>
<li>Year : <%= clock.getYear() %>
</ul>
<%-- Test for AM or PM --%>
<%! int time = Calendar.getInstance().get(Calendar.AM_PM); %>
<% if (time == Calendar.AM) { %>
Good Morning
<% } else { %>
Good Afternoon
<% } %>
<%@ include file="copyright.html" %>
</html>

JSP page components (1)
JSP actions (or tags): allow calling beans (XML syntax)
Directives: processed by the JSP engine when the page is compiled into a servlet (include, import, ...)
Declarations: as in Java

JSP page components (2)
Expressions: variables or constants inserted into the HTML result page
Scriptlets: blocks of Java code embedded in a JSP page
Comments

Custom tags
You can add your own tags to simplify JSP page writing
Tag libraries
The mechanism for defining new tags is complex

Example use of tag libraries
<html>
<%@ taglib uri="http://acme.com/taglibs/simpleDB.tld" prefix="x" %>
<x:queryBlock connData="defude/bruno@MICA">
  <x:queryStatement>select number, sname from students</x:queryStatement>
  The first 10 students are:
  <table>
    <tr><th>Number</th><th>Name</th></tr>
    <x:queryCreateRows from="1" to="10">
      <td><x:queryDisplay field="number"/></td>
      <td><x:queryDisplay field="sname"/></td>
    </x:queryCreateRows>
  </table>
</x:queryBlock>
</html>

Example (cont)
Defines a queryBlock tag, itself composed of other tags (queryStatement, queryCreateRows)
Hides the JDBC code from the page developer
A tag is defined by a descriptor (XML document) plus a Java class for its implementation

XML and Web
XML offers a clear separation between logical structure and presentation
Makes it possible to develop software that produces different output formats from the same XML source (HTML vs PDF, WML vs HTML, ...)

Publishing environment for XML
Software environment that produces different output formats from the same XML source
Cocoon from XML-Apache
Often based on a Java implementation (servlets, JSP)

Cocoon
[Figure: Cocoon processing in six numbered steps, involving the client (web browser or PDA), Apache + Tomcat, Cocoon, the XML document, the XSLT interpreter (step 4), and the HTML, WML and PDF output documents]
[Cocoon figure, continued: the FOP interpreter also appears at step 4]

Conclusion
Numerous software products and several solutions
The choice depends on the objective:
  » Query web site: simple gateways or a server-side scripting language (PHP, ASP)
  » Different web sites produced from the same source: XML
  » Corporate portals: JSP or Cocoon
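
Appendix sketch: a single-request gateway
The three gateway functionalities listed earlier (translate the HTTP request, process the SQL query, encode the result as HTML) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: an in-memory SQLite database stands in for the DBMS, the students table with number and sname columns is borrowed from the slide examples, and the function names (translate_request, run_query, encode_html, gateway, demo_db) are hypothetical. Unlike the HTML forms shown in the slides, the sketch keeps the SQL text on the server and passes only the name as a bound parameter.

```python
import html
import sqlite3
from urllib.parse import parse_qs


def translate_request(query_string):
    """(1) Map the HTTP query string to a parameterized SQL statement."""
    params = parse_qs(query_string)
    sname = params.get("sname", [""])[0]
    # Assumption: the SQL text lives in the gateway; only the value comes from the client.
    return "SELECT number, sname FROM students WHERE sname = ?", (sname,)


def run_query(conn, sql, args):
    """(2) Process the SQL query on the DBMS."""
    cur = conn.execute(sql, args)
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


def encode_html(columns, rows):
    """(3) Generic encoding: a select result becomes an HTML table."""
    head = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<html><body><table><tr>{head}</tr>{body}</table></body></html>"


def gateway(conn, query_string):
    """Chain the three steps for one request."""
    sql, args = translate_request(query_string)
    columns, rows = run_query(conn, sql, args)
    return encode_html(columns, rows)


def demo_db():
    """Hypothetical sample data standing in for a real DBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (number INTEGER, sname TEXT)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)", [(1, "Durand"), (2, "Martin")]
    )
    return conn


if __name__ == "__main__":
    print(gateway(demo_db(), "sname=Martin"))
```

In a real CGI deployment the query string would come from the QUERY_STRING environment variable and the page would be written to standard output after a Content-Type header; the sketch leaves that wiring out so that the three steps stay visible.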