databases and web technology

DATABASES AND WEB TECHNOLOGY
Dr. Awad Khalil
Computer Science Department
AUC
Content
 The Web Building Blocks
 The Web as a Database Application Platform
 Database Gateways for Database Connectivity
 Client-side Approach
 Server-side Approach
 CGI (Common Gateway Interface)
 API (Application Programming Interface)
 Available Tools and Technologies
 Scripting languages such as JavaScript, Perl, and PHP
 HTTP cookies
 Netscape Server API (NSAPI) and Microsoft Internet Server API (ISAPI).
 ODBC (Open DataBase Connectivity)
 Java and JDBC, SQLJ, Servlets, and JavaServer Pages (JSP).
 Microsoft’s Web Solution Platform with Active Server Pages (ASPs) and Active Data Objects (ADO)
 Oracle’s Internet Platform
The Web Building Blocks
 Internet
 World Wide Web (WWW or Web)
 Web Page
 HyperText Markup Language
 Hyperlink
 Web browser
 Static Web page
 Dynamic Web page
 Web server
 Web site
 HyperText Transfer Protocol
Internet
 Worldwide network of computers connected together through a standard network protocol known as
Transmission Control Protocol/Internet Protocol, or TCP/IP. You can think of the Internet as the “highway”
on which data travel, as in the phrase “the information super-highway.” The terms Internet and World Wide
Web are often used interchangeably, but they are not synonyms.

World Wide Web (WWW or Web)
 Worldwide network of servers that store collections of specially formatted files known as Web pages. The
Web is just one of many services provided by the Internet. However, it is the World Wide Web that truly
opened the Internet to the World.
Web Page
 A text document containing labels and special commands (or tags) written in a Markup Language (HTML or
XML).
HyperText Markup Language
 Standard document-formatting language for Web pages.
Hyperlink
 Web pages are linked to each other – that is, each Web page can call other Web pages – creating the effect of
a “web.” Because this link can connect to different types of documents such as text, graphics, animated
graphics, video, and audio, it is known as a “hyperlink.”
Web browser
 The end user application used to browse or navigate (jump from page to page) through the Web. The browser
is a graphical interface that runs on the client computer, and its main function is to display Web pages. The
Web browser (e.g., Netscape Navigator, Internet Explorer) requests Web pages from the Web server.
Static Web page
 A Web page whose contents remain the same (when viewed in a browser) unless the page is manually edited.
An example of a static Web page is a standard price list posted by a manufacturer.
Dynamic Web page
 A Web page whose contents are automatically created and tailored to an end user’s needs each time the end
user requests the page. For example, an end user can access a Web page that displays the latest stock
selections entered by that end user.
Web server
 A specialized application whose only function is to “listen” for clients’ requests and to send the requested
Web page(s).
Web site
 Term used to refer to the Web server and the collection of Web pages stored on the local hard disk of the
server computer or an accessible shared directory. The Web server and the Web client communicate using a
special protocol known as HyperText Transfer Protocol or HTTP.
HyperText Transfer Protocol
 The standard protocol used by the Web browser and Web server to communicate.
Intranet & Extranet
 An intranet is a locally owned and operated internet whose access is carefully controlled.
 Intranet architecture is based on the same basic components as the Internet: Web servers, clients using Web
browsers, the TCP/IP and HTTP protocols, and HTML/XML formatting.
 An extranet extends the intranet to the corporation’s value chain.
Intranet Services
HTTP
HTTP Request
 An HTTP request consists of a header indicating the type of request, the name of a resource, and the HTTP
version, followed by an optional body. The main HTTP request types are:
 GET – Retrieves (gets) the resource the user has requested.
 POST – Transfers (posts) data to the specified resource.
 HEAD – Returns only an HTTP header instead of response data.
 PUT – Uploads the resource to the server.
 DELETE – Deletes the resource from the server.
 OPTIONS – Requests the server’s configuration options.
 HTTP is currently a stateless protocol – the server retains no information between requests.
 This means that the information a user enters on one page is not automatically available on the next page
requested.
 The stateless property of HTTP makes it difficult to support the concept of a session that is essential to basic
DBMS transactions.
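As a rough illustration of the request layout described above (request line, header fields, optional body), the following Python sketch splits a raw HTTP/1.x request into its parts. The function name `parse_request` and the sample request are assumptions made for illustration; a real Web server handles many more cases.

```python
def parse_request(raw: str):
    """Split a raw HTTP/1.x request into (method, resource, version, headers, body)."""
    # The header section and the optional body are separated by an empty line.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: type of request, name of the resource, HTTP version.
    method, resource, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, resource, version, headers, body


# Hypothetical POST request carrying form data in its body.
sample = (
    "POST /orders HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "item=42&qty=2"
)
```

Because HTTP is stateless, nothing parsed from one request is remembered when the next one arrives; any session information has to travel with each request.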
HTML
 HTML is the document formatting language used to design most Web pages.
 It is a simple, yet powerful, platform-independent document language.
 It was originally developed by Tim Berners-Lee while at CERN and was standardized (as HTML 2.0) in November 1995.
 In early 2000, W3C produced XHTML 1.0 as a reformulation of HTML 4 in XML.
 HTML Tags:
HTML tags cover formatting, structural, and semantic markup of an HTML document.
<HTML>
<HEAD>
<TITLE>
<BODY>
<A>
<IMG>
<B>
<I>
<HR>




Good morning <B> Egypt </B>
<HR WIDTH="200">
In 1989, <B><I>Tim Berners-Lee</I></B> developed the first version of HTML
<HTML>
<HEAD>
<TITLE>Page1</TITLE>
</HEAD>
<BODY>
Good Morning, <B><I>Egypt</I></B>
</BODY>
</HTML>
HTML Document Structure
HTML document
<HTML>
<HEAD>
<TITLE>
Title of your document goes here
</TITLE>
</HEAD>
<BODY>
Your document goes here
</BODY>
</HTML>
Title: the <HEAD> ... </HEAD> part. Body section: the <BODY> ... </BODY> part.
A Complete HTML Page
<HTML>
<HEAD>
<TITLE>The Universal Construction Company</TITLE>
</HEAD>
<BODY>
<IMG SRC="welcome.gif">
<H1>The Universal Construction Company !</H1>
<H2>Specialists in Construction Since 1920</H2>
<CENTER>
<IMG SRC="logo.gif" HEIGHT=60 WIDTH=100>
</CENTER>
<P>
Our Company offers a highly specialized group of <B><I>highly experienced engineers
</I></B> with <I><B>state-of-the-art technology </B></I>for constructing modern
buildings. Write us at:<BR>
P.O. Box 555<BR>
Cairo 11511, Egypt<BR>
Or visit our
<A HREF="http://www.ucc.com/technical/">UCC Technical Site</A>
<HR>
</BODY>
</HTML>
XML versus HTML
HTML
 HTML encompasses formatting, structural, and semantic markup.
<dt>Hot Cop
<dd> by Jacques Morali, Henri Belolo and Victor Willis
<ul>
<li>Producer: Jacques Morali
<li>Publisher: PolyGram Records
<li>Length: 6:20
<li>Written: 1978
<li>Artist: Village People
</ul>
 HTML is designed for a specific application: to convey information to humans (usually visually, through a
Web browser).
XML
 XML markup describes a document’s structure and meaning. It does not describe the formatting of the
elements on the page. Formatting can be added to a document with a style sheet.
<SONG>
<TITLE>Hot Cop</TITLE>
<COMPOSER>Jacques Morali</COMPOSER>
<COMPOSER>Henri Belolo</COMPOSER>
<COMPOSER>Victor Willis</COMPOSER>
<PRODUCER>Jacques Morali</PRODUCER>
<PUBLISHER>PolyGram Records</PUBLISHER>
<LENGTH>6:20</LENGTH>
<YEAR> 1978</YEAR>
<ARTIST>Village People</ARTIST>
</SONG>
 XML has no specific application; it is designed for whatever use you need it for.
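Because the XML markup above names what each piece of data means, a program can pick out fields directly. A minimal sketch using Python's standard `xml.etree.ElementTree` module follows; the helper name `describe_song` and the returned dictionary layout are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The SONG document from the slide, reproduced as a string.
SONG_XML = """
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
"""


def describe_song(xml_text: str) -> dict:
    """Extract a few fields by element name (hypothetical helper)."""
    song = ET.fromstring(xml_text.strip())
    return {
        "title": song.findtext("TITLE"),
        # An element may repeat; findall returns every COMPOSER in order.
        "composers": [c.text for c in song.findall("COMPOSER")],
        "length": song.findtext("LENGTH"),
        "year": int(song.findtext("YEAR")),
    }
```

The HTML version of the same data would require guessing which `<li>` holds the year; here the element name carries that meaning.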
The Web as a Database Application Platform
1- Requirements
 The ability to access valuable corporate data in a secure manner.
 Data- and vendor-independent connectivity to allow freedom of choice in the selection of the DBMS now and
in the future.
 The ability to interface to the database independent of any proprietary Web browser or Web server.
 A connectivity solution that takes advantage of all the features of an organization’s DBMS.
 An open-architecture approach to allow interoperability with a variety of systems and technologies; for
example, support for:
 Different Web servers;
 Microsoft’s (Distributed) Component Object Model (DCOM/COM);
 CORBA/IIOP (Internet Inter-ORB protocol);
 Java/RMI (Remote Method Invocation).
 A cost-effective solution that allows for scalability, growth, and changes in strategic directions.
 Support for transactions that span multiple HTTP requests.
 Support for session- and application-based authentication.
 Minimal administration overhead.
2- Web-DBMS Architecture
 The Web-DBMS architecture is a three-tier architecture consisting of:
 a Web browser acting as the “thin” client;
 a Web server acting as the application server;
 a database server acting as the third tier.
Web-DBMS Approach
Advantages
 Simplicity
 Platform independence
 Graphical User Interface
 Standardization
 Cross-platform support
 Transparent network access
 Scalable deployment
 Innovation
Disadvantages
 Reliability
 Security
 Cost
 Scalability
 Limited functionality of HTML
 Stateless
 Bandwidth
 Performance
 Immaturity of development tools
Components of a Web Database Application
 Web database applications may be created using various approaches. However, there are a number of
components that will form essential building blocks for such applications. In other words, a Web database
application should comprise the following four layers (i.e., components):
 Browser layer.
 Application logic layer.
 Database connection layer.
 Database layer.
Browser Layer
 The browser is the client of a Web database application, and it has two major functions:
 First, it handles the layout and display of HTML documents.
 Second, it executes client-side extension functionality such as Java, JavaScript, and ActiveX (a
method to extend a browser’s capabilities).
 The two most popular browsers at present are Netscape Navigator (Netscape for short) and Microsoft
Internet Explorer (IE). Each has its own advantages and disadvantages with respect to Web database
applications.
IE versus Netscape
 Both browsers support Java and JavaScript; IE also supports ActiveX.
 Netscape is supported on numerous platforms, whereas IE runs only on Microsoft systems (e.g., Windows
2000, 98 and NT).
 IE offers compatibility with other Microsoft products and can easily be integrated with existing tools such as
Word, Excel and PowerPoint. The drawback is that IE is heavily dependent on the Windows platform and
other Microsoft proprietary systems.
 ActiveX is designed to extend the functionality of IE, and works only on Windows platforms and Macintosh
System 7+.
The Application Logic Layer
 The application logic layer is the part of a Web database application with which a developer will
spend the most time. It is responsible for:
 collecting data for a query (e.g., an SQL statement);
 preparing and sending the query to the database via the database connection layer;
 retrieving the results from the connection layer;
 formatting the data for display.
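The four responsibilities listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the DBMS, and the table `product`, the form field `max_price`, and the function names are hypothetical.

```python
import sqlite3
from html import escape


def setup_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database standing in for the database layer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?)",
        [("Cement", 12.5), ("Steel bar", 40.0), ("Brick", 0.75)],
    )
    return conn


def price_list_page(conn: sqlite3.Connection, form: dict) -> str:
    # 1. Collect data for the query from the submitted form.
    max_price = float(form.get("max_price", "1e9"))
    # 2. Prepare the query and send it through the connection layer.
    cursor = conn.execute(
        "SELECT name, price FROM product WHERE price <= ? ORDER BY name",
        (max_price,),
    )
    # 3. Retrieve the results from the connection layer.
    rows = cursor.fetchall()
    # 4. Format the data for display as HTML.
    items = "".join(
        f"<LI>{escape(name)}: {price:.2f}</LI>" for name, price in rows
    )
    return f"<HTML><BODY><UL>{items}</UL></BODY></HTML>"
```

Passing the user's value as a query parameter (the `?` placeholder) rather than pasting it into the SQL text is a deliberate choice here; it keeps form input from being interpreted as SQL.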
The Database Connection Layer
 This is the component that actually links a database to the Web server.
 Many current Web database building tools offer database connectivity solutions and they are used to simplify
the connection process.
 The database connection layer provides a link between the application logic layer and the DBMS.
Connection solutions come in many forms, such as:
 DBMS net protocols;
 APIs (Application Programming Interfaces) or class libraries;
 programs that are themselves database clients.
 Some of these solutions resulted in tools being specifically designed for developing Web database
applications.
 In Oracle, for example, there are native API libraries for connection and a number of tools, such as Web
Publishing Assistant, for developing Oracle applications on the Web.
 The connection layer within a Web database application must accomplish a number of goals. It has to
provide access to the underlying database, and it also needs to be easy to use, efficient, flexible, robust,
reliable, and secure. Different tools and methods fulfill these goals to different extents.
The Database Layer
 This is the place where the underlying database resides within the Web database application.
 The database is responsible for storing, retrieving, and updating data based on user requirements, and the
DBMS can provide efficiency and security measures.
 In many cases, when developing a Web database application, the underlying database already exists.
A major task, therefore, is to link the database to the Web (the connection layer) and to develop
the application logic layer.
Database Gateways for Database Connectivity
 A Web database gateway (middleware) is a bridge between the Web and a DBMS, and its objective is to
give a Web-based application the ability to manipulate data stored in the database.
 Web database gateways link stateful systems (i.e., databases) with a stateless, connectionless protocol (i.e.,
HTTP).
 HTTP is a stateless protocol in the sense that each connection is closed once the server provides a response.
Thus, a Web server will not normally keep any record about previous requests. This results in an important
difference between a Web-based client-server application and a traditional client-server application.
 In a Web-based application, only one transaction can occur on a connection. In other words, the connection
is created for a specific request from the client. Once the request has been satisfied, the connection is closed.
Thus, every request involving access to the database will have to incur the overhead of making the
connection.
 In a traditional application, multiple transactions can occur on the same connection. The overhead of making
the connection will only occur once at the beginning of each database session.
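The difference in connection overhead described above can be made concrete with a small Python sketch. It is an illustration only: `CountingDatabase` is a hypothetical stand-in for a DBMS that counts how many connections are opened, and SQLite is used simply because it ships with Python.

```python
import sqlite3


class CountingDatabase:
    """Hypothetical stand-in for a DBMS that counts opened connections."""

    def __init__(self):
        self.connections_opened = 0

    def connect(self) -> sqlite3.Connection:
        self.connections_opened += 1
        return sqlite3.connect(":memory:")


def web_style(db: CountingDatabase, queries):
    """Web-based application: one connection per request."""
    results = []
    for sql in queries:
        conn = db.connect()  # connection overhead is paid on every request
        results.append(conn.execute(sql).fetchone()[0])
        conn.close()         # closed as soon as the request is satisfied
    return results


def traditional_style(db: CountingDatabase, queries):
    """Traditional client-server application: one connection per session."""
    conn = db.connect()      # overhead is paid once, at the start of the session
    try:
        return [conn.execute(sql).fetchone()[0] for sql in queries]
    finally:
        conn.close()


QUERIES = ["SELECT 1 + 1", "SELECT 2 * 3", "SELECT 10 - 4"]
```

Running the same three queries both ways gives the same answers, but the Web-style path opens three connections where the session-style path opens one.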
Types of Solutions
 There are a number of different ways of creating Web database gateways. Generally, they can be grouped
into two categories:
 Client-side solutions.
 Server-side solutions.
Client-side Solutions
 The client-side solutions include two types of approaches for connections:
 browser extensions
 external applications.
 Browser extensions are add-ons to the core Web browser that enhance and augment the browser’s original
functionality. Specific methods include plug-ins for Netscape and IE, and ActiveX controls for IE. Also, both
types of browsers (Netscape and IE) support the Java and JavaScript languages (i.e., Java applets and JavaScript
can be used to extend browsers’ capabilities).
 External applications are helper applications or viewers. They are typically existing database clients that
reside on the client machine and are launched by the Web browser in a particular Web application. Using
external applications is a quick and easy way to bring legacy database applications online, but the resulting
system is neither open nor portable. Legacy database clients do not take advantage of the platform
independence and language independence available through many Web solutions. Legacy clients are resistant
to change, meaning that any modification to the client program must be propagated via costly manual
installations throughout the user base.
Browser Extensions
 These types of gateways take advantage of the resources of the client machine to aid server-side database
access.
 Remember, however, it is advantageous to have a thin client. Thus, the scope of such programming on the
client-side should be limited.
 A very large part of the database application should be on the server-side.
 Browser extensions can be created by incorporating support for:
 Scripting Languages (JavaScript, JScript, VBScript, Perl, and PHP)
 Java (Applets)
 ActiveX
 Plug-Ins
Scripting Languages
JavaScript & JScript
 JavaScript and JScript are virtually identical interpreted scripting languages from Netscape and Microsoft,
respectively.
 JavaScript is a scripting language that allows programmers to create and customize applications on the
Internet and Intranets.
 On the client-side, it can be used to perform simple data manipulation such as mathematical calculations and
form validation.
 JavaScript code is normally sent as a part of an HTML document and is executed by the browser upon receipt
(the browser must have the script language interpreter). On the server-side, LiveWire (an online development
environment for server-side JavaScript) works with Netscape, providing gateway functionality such as access
to databases.
 It should be noted that JavaScript has little to do with the Java language. JavaScript was originally called
LiveScript, but the name was changed to benefit from the excitement surrounding Java. The only relationship
between JavaScript and Java is a gateway between the former and Java applets (Web applications written in Java).
JavaScript provides developers a simple way to access certain properties and methods of Java applets on the
same page without having to understand or modify the Java source code of the applet.
 As a database gateway, JavaScript on the client-side does not offer much without the aid of a complementary
approach such as Java, plug-ins, or CGI (Common Gateway Interface). For example:
 If a Java applet on a page of HTML has access to a database, a programmer can write JavaScript code to
manipulate the applet.
 If there is a form on the HTML document and if an action parameter for that form refers to a CGI
program that has access to a database, a programmer can write JavaScript code to manipulate the data
elements within the form and then submit it (i.e., submit a kind of request to a DBMS).
 JavaScript can improve the performance of a Web database application if it is used for client-side state
management. It can eliminate the need to transfer state data repeatedly between the browser and the Web
server. Instead of sending an HTTP request each time it updates an application state, it sends the state only
once as the final action. However, this gain in performance has side effects. For
example, the application may become less robust if state management is completely on the
client-side: if the client accidentally or deliberately exits, the session state is lost.
VBScript
 VBScript is a Microsoft proprietary interpreted scripting language whose goals and operations are virtually
identical to those of JavaScript/JScript.
 VBScript, however, has syntax more like Visual Basic than Java.
 VBScript is a procedural language and so uses subroutines as the basic unit.
Perl & PHP
 Perl (Practical Extraction and Report Language) is an interpreted programming language with extensive,
easy-to-use text processing capabilities.
 It is now one of the most widely used languages for server-side programming.
 Although Perl was originally developed on the Unix platform, it was always intended as a cross-platform
language and there is now a version of Perl for the Windows platform (called ActivePerl).
 PHP (PHP: Hypertext Preprocessor) is another popular open source HTML-embedded scripting language that is
supported by many Web servers including Apache HTTP Server and Microsoft’s Internet Information Server
(IIS), and is a popular Web scripting language on Linux.
 The goal of the language is to allow Web developers to write dynamically generated pages quickly.
 A popular choice nowadays is to use the open source combination of Apache HTTP Server, PHP, and one
of the database systems MySQL or PostgreSQL.
Java (Applets)
 As mentioned earlier, Java applets can be manipulated by JavaScript functions to access databases.
 In general, Java applets can be downloaded into a browser and executed on the client-side (the browser
should have the bytecode interpreter).
 The connection to the database is made through appropriate APIs (Application Programming Interfaces), such
as JDBC and ODBC.
ActiveX
 ActiveX is a way to extend Microsoft IE’s (Internet Explorer) capabilities.
 An ActiveX control is a component on the browser that adds functionality which cannot be obtained in
HTML, such as access to files on the client, other applications, complex user interfaces, and additional
hardware devices.
 ActiveX is similar to Microsoft OLE (Object Linking and Embedding), and ActiveX controls can be
developed by any organization and individual.
 At present, more than one thousand ActiveX controls, including controls for database access,
are available for developers to incorporate into Web applications.
 A number of commercial ActiveX controls offer database connectivity. Because ActiveX has abilities similar
to OLE, it supports most or all the functionality available to any Windows program.
 Like JavaScript, ActiveX can aid in minimizing network traffic. In many cases, this technique results in
improved performance. ActiveX can also offer rich GUIs. The more flexible interface, executed entirely on
the client-side, makes operations more efficient for users.
Plug-Ins
 Plug-ins are Dynamic Link Libraries (DLLs) that allow data of new MIME (Multipurpose Internet Mail
Extensions) types to be viewed or heard.
 Plug-ins can be installed to run seamlessly inside the browser window, transparent to the user.
 They have full access to the client’s resources.
 To create a plug-in, the developer writes an application using the plug-in API and native calls. The code is
then compiled as a DLL. Installing a plug-in is just a matter of copying the DLL into the directory where the
browser looks for plug-ins. The next time that the browser is run, the MIME type(s) that the new plug-in
supports will be opened with the plug-in. One plug-in may support multiple MIME types.
 There are a number of important issues concerning plug-ins:
 Plug-ins incur installation requirements. Because they are native code not packaged with the browser
itself, plug-ins must be installed on the client machine.
 Plug-ins are platform-dependent. Whenever a change is made, it must be made on all supported
platforms.
 The latest version of IE supports the same plug-in architecture as Netscape. Thus, a plug-in should work
on both browsers provided they are running on the same platform.
Connection to databases
 Plug-ins can operate like any stand-alone applications on the client-side. They can be used to create direct
socket connections to databases via the DBMS net protocols (such as SQL*Net for Oracle). Plug-ins can also
use JDBC, ODBC, OLE, and any other methods to connect to databases.
Performance
 Plug-ins are loaded on demand. When a user starts up a browser, the installed plug-ins are registered with the
browser along with their supported MIME types, but the plug-ins themselves are not loaded. When a plug-in
for a particular MIME type is requested, the code is then loaded into memory. Because plug-ins use native
code, they execute quickly.
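The register-first, load-on-demand behaviour described above can be modelled with a short Python sketch. It is a model only, not how any browser is implemented: `PluginRegistry`, `load_audio_plugin`, and the MIME types used are hypothetical, and a Python function stands in for native plug-in code.

```python
class PluginRegistry:
    """Toy model of a browser's plug-in table: register at start-up, load lazily."""

    def __init__(self):
        self._loaders = {}   # MIME type -> function that loads the plug-in
        self._loaded = {}    # loader -> handler returned by that loader
        self.load_count = 0  # how many plug-ins have actually been loaded

    def register(self, mime_types, loader):
        # One plug-in may support several MIME types.
        for mime in mime_types:
            self._loaders[mime] = loader

    def handle(self, mime_type, data):
        loader = self._loaders[mime_type]
        if loader not in self._loaded:
            # First request for this plug-in: load its code into memory now.
            self._loaded[loader] = loader()
            self.load_count += 1
        return self._loaded[loader](data)


def load_audio_plugin():
    """Hypothetical plug-in: pretend to load native code and return a handler."""
    def play(data):
        return f"playing {len(data)} bytes"
    return play
```

Registering costs almost nothing at start-up; the loading cost is paid once, on the first request for any MIME type the plug-in supports.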
External Applications
 External helper applications can be new or legacy database clients, or a terminal emulator. If there are
existing traditional client-server database applications which reside on the same machine as the browser, then
they can be launched by the browser and execute as usual.
 This approach may be an appropriate interim solution for migrating from an existing client-server application
to a purely Web-based one.
 It is straightforward to configure the browser to launch existing applications. It just involves the registration
of a new MIME type and the associated application name.
 Using the external applications approach, the existing database applications need not be changed. However,
it means that all the maintenance burdens associated with traditional client-server applications will remain.
Any change to the external application will require a very costly reinstallation on all client machines.
Because this is not a pure Web-based solution, many advantages offered by Web-based applications cannot
be realized.
 Traditional client-server database applications usually offer good performance. They do not incur the
overhead of requiring repeated connections to the database. External database clients can make one
connection to the remote database and use that connection for as many transactions as necessary for the
session, closing it only when finished.
Server-side Solutions
 Server-side solutions are more widely adopted than the client-side solutions.
 A main reason for this is that the Web database architecture requires the client to be as thin as possible.
 The Web server should not only host all the documents, but should also be responsible for dealing with all
the requests from the client.
Server-side versus Client-side Solutions
Server Side
 listening for HTTP requests.
 checking the validity of the request.
 finding the requested resource.
 requesting authentication if necessary.
 delivering requested resource.
 spawning programs if required.
 passing variables to programs.
 delivering output of programs to the requester.
 displaying error message if necessary.
Client Side
 rendering HTML documents.
 allowing users to navigate HTML links.
 displaying images.
 sending HTML form data to a URL.
 interpreting Java applets.
 executing plug-ins.
 executing external helper applications.
 interpreting JavaScript and other scripting language programs.
 executing ActiveX controls in case of IE.

Web-to-Database Middleware
Server-side Solutions
 CGI (Common Gateway Interface)
 HTTP Server APIs and Server modules
CGI (Common Gateway Interface)
 CGI is a protocol for allowing Web browsers to communicate with Web servers, such as sending data to the
servers. Upon receiving the data, the Web server can then pass them to a specified external program (residing
on the server host machine) via environment variables or standard input stream (STDIN).
 The external program is called a CGI program or CGI script. Because CGI is a protocol, not a library of
functions written specifically for any particular Web server, CGI programs/scripts are language-independent.
 As long as the program/script conforms to the specification of the CGI protocol, it can be written in any
language such as C, C++ or Java.
 In short, CGI is the protocol governing communications among browsers, servers, and CGI programs.
 In general, a Web server is only able to send documents and to tell a browser what kinds of documents it is
sending. By using CGI, the server can also launch external programs (i.e., CGI programs). When the server
recognizes that a URL points to a file, it returns the contents of that file. When the URL points to a CGI
program, the server executes it and then sends back the output of the program’s execution to the browser
as if it were a file.
 Before the server launches a CGI program, it prepares a number of environment variables representing the
current state of the server, who is requesting the action, and so on.
 The program collects this information and reads STDIN. It then carries out the necessary processing and
writes its output to STDOUT (the standard output stream).
 In particular, the program must send the MIME header information prior to the main body of the output. This
header information specifies the type of the output.
 The CGI approach enables access to databases from the browser. The Web client can invoke a CGI
program/script via a browser, and then the program performs the required action and accesses the database
via the gateway. The outcome of accessing the database is then returned to the client via the Web server.
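A minimal sketch of a CGI program in Python may help make the description above concrete. It reads the standard CGI variables `REQUEST_METHOD`, `QUERY_STRING`, and `CONTENT_LENGTH`, reads STDIN for posted data, and writes a MIME header followed by the body to STDOUT. The form field `name`, the function names, and the greeting page are assumptions for illustration; the database access step is omitted.

```python
import io
import os
import sys
from html import escape
from urllib.parse import parse_qs


def run_cgi(environ=os.environ, stdin=sys.stdin, stdout=sys.stdout):
    """Behave like a CGI program: environment + STDIN in, header + body out."""
    method = environ.get("REQUEST_METHOD", "GET")
    if method == "POST":
        # Posted form data arrives on STDIN; CONTENT_LENGTH says how much.
        # (Simplification: reads characters from a text stream, not raw bytes.)
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For GET, the data is passed in the QUERY_STRING variable.
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)
    name = fields.get("name", ["world"])[0]  # "name" is a hypothetical field
    # The MIME header must precede the body, separated by an empty line.
    stdout.write("Content-Type: text/html\n\n")
    stdout.write(f"<HTML><BODY>Hello, {escape(name)}</BODY></HTML>\n")


def simulate(environ, body=""):
    """Run the program in-process the way a server would, capturing STDOUT."""
    out = io.StringIO()
    run_cgi(environ, io.StringIO(body), out)
    return out.getvalue()


if __name__ == "__main__":
    run_cgi()
```

The `simulate` helper exists only so the program can be tried without a Web server, for example `simulate({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Egypt"})`.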
The CGI Environment
CGI (Common Gateway Interface)
 Invoking and executing CGI programs from a Web browser is mostly transparent to the user. The following
steps need to be taken in order for a CGI program to execute successfully:
 The user (Web client) calls the CGI program by clicking on a link or by pressing a button. The program
can also be invoked when the browser loads an HTML document (hence being able to create a dynamic
Web page).
 The browser contacts the Web server asking for permission to run the CGI program.
 The server checks the configuration and access files to ensure that the program exists and the client has
access authorization to the program.
 The server prepares the environment variables and launches the program.
 The program executes and reads the environment variables and STDIN.
 The program sends the appropriate MIME headers to STDOUT followed by the remainder of the output
and terminates.
 The server sends the data in STDOUT (i.e., the output from the program’s execution) to the browser and
closes the connection.
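The server's side of the steps above (prepare the environment variables, launch the program, collect STDOUT) can be sketched with Python's `subprocess` module. This is an assumption-laden illustration: the inline `CGI_SOURCE` program and the function `launch_cgi` are hypothetical, only a few CGI variables are set, and the existence and authorization checks are left out.

```python
import os
import subprocess
import sys

# Hypothetical CGI program, kept inline so the sketch is self-contained.
CGI_SOURCE = r'''
import os, sys
query = os.environ.get("QUERY_STRING", "")
posted = sys.stdin.read()
sys.stdout.write("Content-Type: text/plain\n\n")
sys.stdout.write("query=" + query + " body=" + posted)
'''


def launch_cgi(query_string, body=""):
    """Launch the CGI program as a separate process and split its output."""
    # Prepare the environment variables describing the request.
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": "POST" if body else "GET",
        "QUERY_STRING": query_string,
        "CONTENT_LENGTH": str(len(body)),
    })
    # A new process is created for every request; posted data goes to STDIN.
    result = subprocess.run(
        [sys.executable, "-c", CGI_SOURCE],
        input=body, env=env, capture_output=True, text=True, check=True,
    )
    # The program's STDOUT holds the MIME header, an empty line, then the body.
    head, _, payload = result.stdout.partition("\n\n")
    headers = dict(line.split(": ", 1) for line in head.splitlines())
    return headers, payload
```

Each call starts a fresh process, which is the per-request cost that server APIs try to avoid.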
CGI (Common Gateway Interface)
CGI is the de facto standard for interfacing Web clients and servers with external applications, and is
arguably the most commonly adopted approach for interfacing Web applications to data sources
(such as databases).
Advantages
 Simplicity
 Language independence
 Web server independence
 Wide acceptance
Disadvantages
 The first notable drawback of CGI is that the communication between a client (browser) and the database
server must always go through the Web server in the middle, which may cause a bottleneck if there is a large
number of users accessing the Web server simultaneously. For every request submitted by a Web client or every
response delivered by the database server, the Web server has to convert data from or to an HTML document.
This incurs a significant overhead to query processing.
 The second disadvantage of CGI is the lack of efficiency and transaction support in a CGI program. For every
query submitted through CGI, the database server has to perform the same logon and logout procedure, even for
subsequent queries submitted by the same user. The CGI program could handle queries in batch mode, but then
support for online database transactions that contain multiple interactive queries would be difficult.
 The third major shortcoming of CGI is due to the fact that the server has to generate a new process or thread
for each CGI program. For a popular site (like Yahoo), there can easily be hundreds or even thousands of
processes competing for memory, disk, and processor time. This situation can incur significant overhead.
 Last but not least, extra measures have to be taken to ensure server security. CGI itself does not provide any
security measures, and therefore, developers of CGI programs must be security conscious. Any request for
unauthorized action must be spotted and stopped.
HTTP Server APIs and Server Modules
 HTTP server (Web server) APIs and modules are the server equivalent of browser extensions.
 The central theme of Web database sites created with HTTP server APIs or modules is that the database
access programs coexist with the server. They share the address space and run-time process of the server.
 This approach is in direct contrast with the architecture of CGI, in which CGI programs run as separate
processes and in separate memory spaces from the HTTP server.
 At present, there are two main APIs: the Netscape Server API (NSAPI) and the Microsoft Internet Server
API (ISAPI).
 Instead of creating a separate process for each CGI program, the API offers a way to create an interface
between the server and the external programs using dynamic linking or shared objects.
 Programs are loaded as part of the server, giving them full access to all the I/O functions of the server.
 In addition, only one copy of the program is loaded and shared among multiple requests to the server.
 Server modules are just prefabricated applications written in some server APIs.
 Developers can often purchase commercial modules to aid or replace the development of an application
feature.
 Sometimes, the functionality required in a Web database application can be found as an existing server
module.
 Vendors of Web servers usually provide proprietary server modules to support their products.
 There are a very large number of server modules that are commercially available, and the number is still
rising.
 For example, Oracle provides the “Oracle PL/SQL” module, which contains procedures to drive
database-backed Web sites.
 The Oracle module supports both NSAPI and ISAPI.
Advantages of server APIs and modules
 Having database access programs coexist with the HTTP server improves Web database access by improving
speed, resource sharing, and the range of functionality.
Speed
Server API programs run as dynamically loaded libraries or modules. A server API program is usually
loaded the first time the resource is requested, and therefore, only the first user who requests that program
will incur the overhead of loading the dynamic libraries. Alternatively, the server can force this first
instantiation so that no user will incur the loading overhead. This technique is called preloading. Either way,
the API approach is more efficient than CGI.
Resource sharing
Unlike a CGI program, a server API program shares address space with other instances of itself and with the
HTTP server. This means that any common data required by the different threads and instances need exist
only in one place. This common storage area can be accessed by concurrent and separate instances of the
server API program. The same principle applies to common functions and code. The same set of functions
and code are loaded just once and can be shared by multiple server API programs. The above techniques save
space and improve performance.
Range of functionality
CGI programs have access to a Web transaction only at certain limited points. They have no control over the
HTTP authentication scheme and no contact with the inner workings of the HTTP server, because a CGI
program is considered external to the server.
In contrast, server API programs are closely linked to the server; they exist in conjunction with or as part of
the server. They can customize the authentication method as well as transmission encryption methods. Server
API programs can also customize the way access logging is performed, providing more detailed transaction
logs than are available by default. For example, server API programs can:
 Provide Web page or site security by inserting an authentication “layer” requiring an identifier and a
password outside that of the Web browser’s own security methods.
 Log incoming and outgoing activity by tracking more information than the Web server does, and store it in a
format not limited to those available with the Web server.
 Serve data out to browsing clients in a different way than the Web server would (or even could) by itself.
Overall, server APIs provide a very flexible and powerful solution for extending the capabilities of Web
servers. However, the approach is much more complex than CGI, requiring specialized programmers with a
deep understanding of the Web server and with sophisticated programming skills.
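The authentication “layer” and custom logging described above can be sketched as a wrapper around a request handler. The sketch below uses Python's WSGI calling convention purely as a stand-in for a server API; the handler app, the wrapper with_auth_and_logging, the access_log list, and the user table are all hypothetical names chosen for illustration.

```python
import base64

access_log = []   # custom log: richer than a default server log could be


def app(environ, start_response):
    """The wrapped program: returns some database-backed content."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"employee list"]


def with_auth_and_logging(inner, users):
    """Wrap a handler with an identifier/password check and custom logging."""
    def wrapper(environ, start_response):
        header = environ.get("HTTP_AUTHORIZATION", "")
        user = None
        if header.startswith("Basic "):
            try:
                decoded = base64.b64decode(header[6:]).decode("utf-8")
                name, _, password = decoded.partition(":")
                if users.get(name) == password:
                    user = name
            except (ValueError, UnicodeDecodeError):
                user = None   # malformed credentials count as unauthenticated
        # Record every attempt, including rejected ones.
        access_log.append({"path": environ.get("PATH_INFO", ""), "user": user})
        if user is None:
            start_response("401 Unauthorized",
                           [("WWW-Authenticate", 'Basic realm="db"')])
            return [b"authentication required"]
        return inner(environ, start_response)
    return wrapper


protected = with_auth_and_logging(app, {"guest": "secret"})
```

Because the wrapper runs inside the server's request path, it sees every request before the wrapped program does, which is the kind of access a separate CGI process does not have.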
Important issues
Server architecture dependence
Server APIs are closely tied to the server they work with. The only way to provide efficient cross-server
support is for vendors to adhere to the same API standard. If a common API standard is used, programs
written for one server will work just as well with another server. However, setting up standards involves
compromises among competitors. In many cases, they are hard to come by.
Platform dependence
Server APIs and modules are also dependent on computing platforms. Netscape servers are supported on
multiple platforms; nevertheless, each version is tied to the platform it was built for. Similarly, the Microsoft
server is available only for Windows NT.
Programming language
Both Netscape servers and Microsoft Web servers can be extended using a variety of programming
languages and facilities. In addition, Microsoft provides an application environment called Active Server
Pages. Active Server Pages is an open, compile-free application environment in which developers can
combine HTML, scripts and reusable ActiveX server components to create dynamic and powerful web-based
business solutions.
Features                                          CGI    APIs
Language-independent                               *
Runs in different process from core Web server     *
Open standard                                      *
Web server architecture-independent                *
Supports distributed computing                     *
Multiple, extensible roles                                 *
Memory sharing with other processes                        *
Does NOT create new process for each instance              *
Easy to use                                        *
Connecting to the Database
 We have seen various approaches that enable browsers to communicate with Web servers, and in turn allow
Web clients to have access to databases.
 For example, CGI and API programs can be invoked by the Web client to access the underlying database.
 In general, database connectivity solutions include the use of:
 Native database APIs.
 Database-independent APIs (such as ODBC).
 Template-driven database access packages.
 Third-party class libraries.
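To illustrate what a database-independent API buys the developer, here is a minimal sketch using Python's DB-API 2.0 convention, which plays a role comparable to ODBC in this respect. The function fetch_names and the Employee table are hypothetical; sqlite3 is used only because it ships with Python, and any other DB-API module's connect function could be passed in instead.

```python
import sqlite3


def fetch_names(connect, *connect_args):
    """Run the same statements through any DB-API 2.0 'connect' callable."""
    conn = connect(*connect_args)
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE Employee (name TEXT, salary REAL)")
        cur.execute("INSERT INTO Employee VALUES ('Ann', 1000.0)")
        cur.execute("SELECT name FROM Employee")
        return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()


# Only this call site names a specific DBMS.
names = fetch_names(sqlite3.connect, ":memory:")
```

The body of fetch_names never mentions SQLite; with a native database API, by contrast, every call in the function would be vendor-specific.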
Tools and Related Technologies
 Scripting languages such as JavaScript, Perl, and PHP
 HTTP cookies
 Netscape API (NSAPI) and Microsoft Internet Server API (ISAPI).
 ODBC (Open DataBase Connectivity)
 Java and JDBC, SQLJ, Servlets, and JavaServer Pages (JSP).
 Microsoft’s Web Solution Platform with Active Server Pages (ASP) and ActiveX Data Objects (ADO)
 Oracle’s Internet Platform
HTTP Cookies
 One way to make CGI scripts more interactive is to use cookies.
 A cookie is a piece of information that the client stores on behalf of the server.
 The information that is stored in the cookie comes from the server as part of the server’s response to an
HTTP request.
 A client may have many cookies stored at any given time, each one associated with a particular Web site or
Web page.
 Each time the client visits that site/page, the browser packages the cookie with the HTTP request.
 The Web server can then use the information in the cookie to identify the user and, depending on the nature
of the information collected, possibly personalize the appearance of the Web page.
 The Web server can also add or change the information within the cookie before returning it.
 Cookies can be used to store registration information or preferences.
 Example:
#!/bin/sh
# CGI script: send two cookies, then end the header block with a blank line
echo "Content-type: text/html"
echo "Set-Cookie: UserID=conn-ci0; expires=Tuesday, 30-Apr-02 12:00:00 GMT"
echo "Set-Cookie: Password=guest; expires=Tuesday, 30-Apr-02 12:00:00 GMT"
echo ""
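The round trip described above (the server sets a cookie, the browser returns it, the server reads it) can be sketched with Python's standard http.cookies module. The browser is only simulated here by a hand-written request header, and the Path attribute is an assumption added for illustration.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
response_cookie = SimpleCookie()
response_cookie["UserID"] = "conn-ci0"
response_cookie["UserID"]["path"] = "/"
set_cookie_header = response_cookie["UserID"].OutputString()

# Client side (simulated): on the next visit the browser sends back
# only the name=value pair in its Cookie request header.
request_header = "UserID=conn-ci0"

# Server side, next request: parse the Cookie header to identify the user.
request_cookie = SimpleCookie()
request_cookie.load(request_header)
user_id = request_cookie["UserID"].value
```

With user_id recovered, the server can look up stored preferences and personalize the page it returns.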
Netscape API
 Netscape LiveWire Pro provides server-side JavaScript constructs for database connectivity using the
Netscape API (NSAPI).
 JavaScript is compiled into bytecode and then interpreted by the LiveWire Pro server extension running in
conjunction with the Netscape server.
 JavaScript can accomplish many of the tasks usually associated with retrieving and working with information
from a database, including:
 Connecting to and disconnecting from the database;
 Beginning, committing, and rolling back an SQL transaction;
 Displaying the results of an SQL query;
 Creating updateable cursors for viewing, inserting, deleting, and modifying data.
 Accessing binary large objects (BLOBs) for multimedia content, such as images and sounds.
Netscape API – An Example
// Connect to the database and check that the connection succeeded
database.connect("ORACLE", "my_server", "auser_name", "auser_password", "Company_db");
if (!database.connected()) {
    write("Error connecting to database");
} else {
    // Set up a cursor for the query; the second parameter indicates that updates will occur through the cursor
    var myCursor = database.cursor("SELECT * FROM Employee", true);
    // Loop over all the records and update the salary field
    while (myCursor.next()) {
        myCursor.salary = myCursor.salary * 1.05;
        myCursor.updateRow("Employee");
    }
    // Finally, disconnect from the database
    database.disconnect();
}
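For readers without a LiveWire installation, the same 5% salary update can be sketched in Python. This is a rough analogue, not a translation of the LiveWire API: it uses the standard sqlite3 module with an in-memory database, the Employee table and its two rows are made up for the example, and explicit UPDATE statements take the place of an updateable cursor.

```python
import sqlite3

# Connect and create a small sample table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER PRIMARY KEY, salary REAL)")
conn.executemany("INSERT INTO Employee (salary) VALUES (?)",
                 [(1000.0,), (2000.0,)])
conn.commit()

try:
    # Read all rows first, then update each salary inside one transaction.
    rows = conn.execute("SELECT id, salary FROM Employee").fetchall()
    for emp_id, salary in rows:
        conn.execute("UPDATE Employee SET salary = ? WHERE id = ?",
                     (salary * 1.05, emp_id))
    conn.commit()       # make all the raises permanent together
except sqlite3.Error:
    conn.rollback()     # undo partial updates if anything failed
    raise

salaries = [row[0] for row in
            conn.execute("SELECT salary FROM Employee ORDER BY id")]

# Finally, disconnect from the database.
conn.close()
```

The commit and rollback calls correspond to the transaction control listed among the tasks above.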
ODBC (Open DataBase Connectivity)
 ODBC is an SQL-based product of Microsoft with the objective of providing a common interface for
accessing heterogeneous SQL databases.
 This interface (built on the ‘C’ language) provides a high degree of interoperability: a single application can
access different SQL DBMSs through a common set of code.
 This enables a developer to build and distribute a client-server application without targeting a specific
DBMS.
 ODBC has emerged as a de facto industry standard for the following reasons:
 Applications are not tied to a proprietary vendor API;
 SQL statements can be explicitly included in source code or constructed dynamically at runtime;
 An application can ignore the underlying data communications protocols;
 Data can be sent and received in a format that is convenient to the application;
 ODBC is designed in conjunction with the X/Open and ISO Call-Level Interface (CLI) standards;
 There are ODBC drivers available today for many of the popular DBMSs.
ODBC Architecture
 The ODBC interface defines the following:
 A library of function calls that allow an application to connect to a DBMS, execute SQL statements, and
retrieve results;
 A standard way to connect and log on to DBMSs;
 A standard representation of data types;
 A standard set of error codes;
 SQL syntax based on the X/Open and ISO Call-Level Interface (CLI) specifications.
 The ODBC architecture has four components:
 Application – which performs processing and calls ODBC functions to submit SQL statements to the
DBMS and to retrieve results from DBMS.
 Driver Manager – which loads and unloads drivers on behalf of an application.
 Driver and Database Agent – which process ODBC function calls, submit SQL requests to a specific data
source, and return results to the applications.
 Data source – which consists of the data the user wants to access and its associated DBMS, its host
operating system, and network platform, if any.
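The division of labour among the four components can be sketched as a toy model. Nothing below is the real ODBC API: the Driver and DriverManager classes, the data source names "payroll" and "sales", and the returned strings are hypothetical and exist only to show who calls whom.

```python
class Driver:
    """Hypothetical driver: handles calls for one kind of DBMS."""

    def __init__(self, dbms_name):
        self.dbms_name = dbms_name

    def execute(self, sql):
        # A real driver would submit the SQL to its data source here.
        return f"[{self.dbms_name}] {sql}"


class DriverManager:
    """Loads and unloads drivers on behalf of the application."""

    def __init__(self):
        self._factories = {}   # data source name -> callable building a driver
        self._loaded = {}      # data source name -> loaded driver

    def register(self, dsn, factory):
        self._factories[dsn] = factory

    def connect(self, dsn):
        if dsn not in self._loaded:
            self._loaded[dsn] = self._factories[dsn]()
        return self._loaded[dsn]

    def disconnect(self, dsn):
        self._loaded.pop(dsn, None)

    def is_loaded(self, dsn):
        return dsn in self._loaded


manager = DriverManager()
manager.register("payroll", lambda: Driver("Oracle"))
manager.register("sales", lambda: Driver("SQL Server"))

# Application: the same code runs against both data sources.
results = [manager.connect(dsn).execute("SELECT * FROM Employee")
           for dsn in ("payroll", "sales")]
manager.disconnect("sales")
```

The application names only a data source; the choice of driver, and therefore of DBMS, is resolved by the manager at connection time.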
Windows Applications Use ODBC to Access Databases
Java
 Java is a proprietary language developed by Sun Microsystems.
 Java is rapidly becoming the de facto standard programming language for Web computing.
 Java is a type-safe, object-oriented, distributed, interpreted, robust, secure, architecture neutral, portable,
high-performance, multi-threaded, and dynamic programming language that is interesting because of its
potential for building Web applications (applets) and server applications (servlets).
J2EE – Java 2 Enterprise Edition
 Java 2 Enterprise Edition (J2EE) is aimed at robust, scalable, multi-user, and secure enterprise applications.
 The cornerstone of J2EE is Enterprise JavaBeans (EJB), a standard for building server-side components in
Java.
 We are particularly interested in two J2EE components:
 JDBC
 JavaServer Pages
 Enterprise JavaBeans (EJB) is a server-side component architecture for the business tier, encapsulating
business and data logic.
JDBC Connectivity Using ODBC drivers
The Pure JDBC platform
SQLJ
 SQLJ is a specification for Java with static embedded SQL.
 SQLJ comprises a set of clauses that extend Java to include SQL constructs.
 An SQLJ translator transforms the SQLJ clauses into standard Java code that accesses the database through a
call-level interface.
 SQLJ is based on static embedded SQL while JDBC is based on dynamic SQL (allows a calling program to
compose SQL at runtime).
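The static versus dynamic distinction can be illustrated loosely in Python with the standard sqlite3 module. This is an analogy under stated assumptions, not SQLJ or JDBC: Python has no translator that checks SQL ahead of time, so "static" here only means that the statement text is fixed when the program is written. The table, the data, and the names FIND_BY_DEPT, find_by_dept, find_by, and ALLOWED_COLUMNS are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [("Ann", "IT", 1000.0), ("Bob", "HR", 900.0)])

# Static style: the statement text is fixed in the source; only values vary.
FIND_BY_DEPT = "SELECT name FROM Employee WHERE dept = ?"


def find_by_dept(dept):
    return [row[0] for row in conn.execute(FIND_BY_DEPT, (dept,))]


# Dynamic style: the statement text itself is composed at run time.
ALLOWED_COLUMNS = {"name", "dept", "salary"}


def find_by(column, value):
    # Column names cannot be bound as parameters, so check them explicitly.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column}")
    sql = f"SELECT name FROM Employee WHERE {column} = ?"
    return [row[0] for row in conn.execute(sql, (value,))]
```

A static statement such as FIND_BY_DEPT is the kind of text a translator could check against the schema before the program runs; the statement built inside find_by is not known until the call is made.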
Java Servlets
 Servlets are programs that run on a Java-enabled Web server and build Web pages, analogous to CGI
programming.
 Servlets have a number of advantages over CGI, such as:
 Improved performance – with servlets a lightweight thread inside JVM handles each request.
 Portability – Java servlets adhere to the “write once, run anywhere” philosophy of Java.
 Extensibility – Java servlets can utilize Java code from any source and can access the large set of APIs
available for the Java platform.
 Simpler session management – A typical CGI program uses cookies on either the client or server (or
both) to maintain some sense of state or session. Servlets, on the other hand, can maintain state and
session identity because they are persistent: the same servlet processes all client requests until it is shut
down by the Web server.
 Improved security and reliability – servlets benefit from the built-in Java security model and inherit Java
type safety, making them more reliable.
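The session-management point can be sketched with a long-lived handler object. This is Python rather than the Java Servlet API, and the class CounterServlet, its service method, and the per-session visit counter are hypothetical; the sketch only shows why a persistent object can keep session state in memory between requests.

```python
import threading
import uuid


class CounterServlet:
    """One long-lived instance serves every request, so state survives."""

    def __init__(self):
        self._sessions = {}            # session id -> visit count
        self._lock = threading.Lock()  # requests may arrive on many threads

    def service(self, session_id=None):
        """Handle one request; return (session id, visits in this session)."""
        with self._lock:
            if session_id is None or session_id not in self._sessions:
                session_id = uuid.uuid4().hex   # start a new session
                self._sessions[session_id] = 0
            self._sessions[session_id] += 1
            return session_id, self._sessions[session_id]


servlet = CounterServlet()
first_id, first_count = servlet.service()            # new visitor
_, second_count = servlet.service(first_id)          # same visitor returns
```

A CGI program starts from scratch on every request, so it would have to rebuild this state from cookies or server-side storage each time.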
JSP (JavaServer Pages)
 JSP is a Java-based server-side scripting language that allows static HTML to be mixed with dynamically
generated HTML.
 The HTML developers can use their normal Web page building tools (for example, Microsoft’s FrontPage or
Macromedia’s Dreamweaver) and then modify the HTML file and embed the dynamic content within special
tags.
 JSP works with most Web servers including Apache HTTP Server and Microsoft Internet Information Server
(with plug-ins from IBM’s WebSphere, LiveSoftware’s JRun, or New Atlanta’s ServletExec).
 The JSP engine transforms JSP tags, Java code, and static HTML content into Java code, which is then
automatically organized by the JSP engine into an underlying Java servlet, after which the servlet is then
automatically compiled into Java bytecodes.
 Thus, when a client requests a JSP page, a generated precompiled servlet does all the work.
 JSP gives both efficient performance and the flexibility of rapid development with no need to manually
compile code.
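The translate-then-compile pipeline can be imitated with a toy page compiler. It is a sketch of the idea only, written in Python rather than Java: it supports just the expression tag, the functions translate and get_page and the cache _compiled are hypothetical, and eval is used for brevity, which would be unsafe with untrusted pages.

```python
import re

_compiled = {}   # page name -> compiled render function (the "servlet")


def translate(page_source):
    """Turn static HTML plus <%= expr %> tags into Python source code."""
    parts = re.split(r"<%=(.*?)%>", page_source)
    lines = ["def render(**ctx):", "    out = []"]
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Even positions are static HTML, copied through unchanged.
            lines.append(f"    out.append({part!r})")
        else:
            # Odd positions are expressions, evaluated per request.
            lines.append(
                f"    out.append(str(eval({part.strip()!r}, {{}}, ctx)))")
    lines.append("    return ''.join(out)")
    return "\n".join(lines)


def get_page(name, page_source):
    """Translate and compile on first request; reuse the result afterwards."""
    if name not in _compiled:
        namespace = {}
        exec(compile(translate(page_source), name, "exec"), namespace)
        _compiled[name] = namespace["render"]
    return _compiled[name]


page = get_page("hello.jsp", "<p>Hello, <%= name %>! Total: <%= a + b %></p>")
html = page(name="Ann", a=1, b=2)
```

Only the first request pays for translation and compilation; later requests call the cached function directly, which mirrors the precompiled servlet doing all the work.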
Microsoft Web Solution Platform
 Microsoft Web Solution Platform has been created for building and deploying interoperable Web solutions.
 It is the precursor to Microsoft’s “.net” – a vision for the third generation of the Internet where “software is
delivered as a service, accessible by any device, any time, any place, and is fully programmable
and personalizable”.
 There are various tools, services, and technologies in the platform such as Windows 2000, Exchange Server,
BizTalk Server, Visual Studio, HTML/XML, scripting (JScript, VBScript, …), and components (Java or
ActiveX).
Object Linking and Embedding (OLE)
 OLE is an object-oriented technology that enables development of reusable software components.
 Instead of traditional procedural programming in which each component implements the functionality it
requires, the OLE architecture allows applications to use shared objects that provide specific functionality.
 Objects like text documents, charts, spreadsheets, e-mail messages, graphics, and sound clips all appear as
objects to the OLE application.
 When objects are embedded or linked, they appear within the client application.
 When the linked data needs to be edited, the user double-clicks the object, and the application that created it
is started.
Component Object Model
 Component objects are objects that provide services to other client applications.
 The Component Object Model (COM) is an object-based model consisting of both a specification that
defines the interface between objects within a system and a concrete implementation, packaged as a Dynamic
Link Library (DLL).
 COM is a service to establish a connection between a client application and an object and its associated
services.
 COM provides a standard method of finding and instantiating objects, and for the communication between
the client and the component.
 One of the major strengths of COM lies in the fact that it provides a binary interoperability standard; that is,
the method for bringing the client and object together is independent of any programming language that
created the client and object.
Distributed Component Object Model (DCOM)
 Distributed Component Object Model (DCOM) extends the COM architecture to provide a distributed
component-based computing environment, allowing components to look the same to clients on a remote
machine as on a local machine.
 DCOM does this by replacing the interprocess communication between client and component with an
appropriate network protocol.
 DCOM is well suited to the three-tier architecture.
COM+
 COM+ provides the basis for Microsoft’s new framework for unifying and integrating the PC and the
Internet.
 The Web Solution Platform is “An architectural framework for building modern, scalable, multi-tier
distributed computing solutions, that can be delivered over any network”, which defines a common set of
services including components.
 There are several core components to this architecture:
 Active Server Pages (ASP)
 ActiveX Data Objects (ADO)
Universal Data Access
 The Microsoft ODBC technology provides a common interface for accessing heterogeneous SQL databases.
 ODBC has many limitations when used as a programming interface.
 Microsoft packaged Access and Visual C++ with Data Access Objects (DAO).
 The object model of DAO consists of objects such as Databases, TableDefs, QueryDefs, Recordsets, fields,
and properties.
 Next, Microsoft defined a set of data objects, collectively known as OLE DB (Object Linking and Embedding for
Databases), that allows OLE-oriented applications to share and manipulate sets of data as objects.
 OLE DB provides low-level access to any data source, including relational and non-relational databases, e-mail and file systems, text and graphics, custom business objects, and more.
 OLE DB is an object-oriented specification based on a C++ API.
Universal Data Access
Active Server Pages and ActiveX Data Objects
 Active Server Pages (ASP) is a programming model that allows dynamic, interactive Web pages to be
created on the Web server, analogous to JavaServer Pages (JSP).
 The pages can be based on what browser type the user has, on what language the user’s machine supports,
and on what personal preferences the user has chosen.
 ASP was introduced with the Microsoft Internet Information Server (IIS) and supports ActiveX scripting,
allowing a large number of different scripting engines to be used, within a single ASP script if necessary.
 Native support is provided for VBScript (the default scripting language for ASP) and JScript.
ASP Architecture
 ASP provides the flexibility of CGI, without the performance overhead.
 Unlike CGI, ASP runs in-process with the server, and is multi-threaded and optimized to handle a large
volume of users.
 ASP is built around files with the extension “.asp”, which can contain any combination of the following:
 Text;
 HTML tags;
 Script commands and output expressions.
 An ASP script starts to run when a browser requests an “.asp” file from the Web server.
 The Web server then calls ASP, which reads through the requested file from top to bottom, executes any
commands, and sends the generated HTML page back to the browser.
 It is possible to generate client-side scripts within a server-side generated HTML file by simply including the
script as text within the ASP script.
ActiveX Data Objects
 ActiveX Data Objects (ADO) is a programming extension of ASP supported by the Microsoft Internet
Information Server (IIS) for database connectivity.
 ADO is designed as an easy-to-use application level interface to OLE DB.
 ADO supports the following key features:
 Independently created objects;
 Support for stored procedures, with input and output parameters and return parameters;
 Different cursor types, including the potential for the support of different back-end-specific cursors;
 Batch updating;
 Support for limits on the number of returned rows and other query goals;
 Support for multiple recordsets returned from stored procedures or batch statements.
Comparison of ASP and JSP
 ASP and JSP are designed to enable developers to separate page design from programming logic through the
use of callable components, and both provide an alternative to CGI programming that simplifies Web page
development and deployment.
 Platform and Server Independence – JSP conforms to the “Write Once, Run Anywhere” philosophy of the
Java environment. Thus, JSP can run on any Java-enabled Web server and is supported by a wide variety
of vendor tools. In contrast, ASP is primarily restricted to Microsoft Windows-based platforms.
 Extensibility – Although both technologies use a combination of scripting and tagging to create dynamic
Web pages, JSP allows developers to extend the JSP tags available.
 Reusability – JSP components (JavaBeans, EJB, and custom tags) are reusable across platforms. For
example, an EJB component can access distributed databases across a variety of platforms (e.g., UNIX,
Windows).
 Security and reliability – JSP has the added advantages of benefiting from the in-built Java security model
and the inherent Java type safety, making JSP potentially more reliable.
Microsoft Access and Web Page Generation
 Microsoft Access 2000 provides three wizards for automatically generating HTML pages based on tables,
queries, forms, or reports in the database:
 Static pages – the user can export data to HTML format.
 Dynamic pages, using Active Server Pages – with this approach, the user can export data to an .asp file on
the Web server, by specifying the name of the current database, a username, and password to connect to
the database, and the URL of the Web server that will store the ASP file.
 Dynamic pages, using data access pages – Data access pages are Web pages bound directly to the data in
the database. Data access pages can be used like Access forms, except that these pages are stored as
external files, rather than within the database or database project. Data access pages are written in
dynamic HTML (DHTML), an extension of HTML that allows dynamic objects as part of the Web page.
Unlike ASP files, a data access page is created within Access using a wizard or in design view employing
many of the same tools that are used to create Access forms.
Microsoft .net
 Microsoft is launching a platform called .net – a vision for the next generation of the Internet.
 This vision is motivated by a shift from individual Web sites to clusters of computers, devices, and
mechanisms that collaborate to provide improved user services.
 The intention is to allow people to control how, when, and what information is delivered to them. Among the
components in this new platform are ASP.net and ADO.net:
 ASP.net – the next version of ASP, reengineered to improve performance and scalability.
 ADO.net – the next version of ADO, with new classes that expose data access services to the programmer.
Oracle Internet Platform
 The Oracle Internet Platform, comprising Oracle Internet Application Server (iAS) and Oracle DBMS, is
aimed particularly at providing extensibility for distributed environments.
 It is an n-tier architecture based on industry standards such as:
 HTTP and HTML/XML for Web enablement.
 The Object Management Group’s CORBA technology for manipulating objects.
 Internet Inter-ORB Protocol (IIOP) for object interoperability and Remote Method Invocation (RMI).
 Java, Enterprise JavaBeans (EJB), JDBC and SQLJ for database connectivity, Java servlets, and JSP. It
also supports Java Messaging Service (JMS), Java Naming and Directory Interface (JNDI), and it allows
stored procedures to be written in Java.
Oracle Internet Application Server (iAS)