Project P817-PF
Database Technologies for Large Scale
Databases in Telecommunication
Deliverable 1
Overview of Very Large Database Technologies and Telecommunication
Applications using such Databases
Volume 3 of 5: Annex 2 - Data manipulation and management issues
Suggested readers:
- Users of very large database information systems
- IT managers responsible for database technology within the PNOs
- Database designers, developers, testers, and application designers
- Technology trend watchers
- People employed in innovation units and R&D departments.
For full publication
March 1999
EURESCOM PARTICIPANTS in Project P817-PF are:

BT

Deutsche Telekom AG

Koninklijke KPN N.V.

Tele Danmark A/S

Telia AB

Telefonica S.A.

Portugal Telecom S.A.
This document contains material which is the copyright of certain EURESCOM
PARTICIPANTS, and may not be reproduced or copied without permission.
All PARTICIPANTS have agreed to full publication of this document.
The commercial use of any information contained in this document may require a
license from the proprietor of that information.
Neither the PARTICIPANTS nor EURESCOM warrant that the information
contained in the report is capable of use, or that use of the information is free from
risk, and accept no liability for loss or damage suffered by any person using this
information.
This document has been approved by the EURESCOM Board of Governors for
distribution to all EURESCOM Shareholders.
© 1999 EURESCOM Participants in Project P817-PF
Preface
(Edited by EURESCOM Permanent Staff)
The Project will investigate different database technologies to support high
performance and very large databases. It will focus on state-of-the-art, commercially
available database technology, such as data warehouses, parallel databases, multidimensional databases, real-time databases and replication servers. Another important
area of concern will be the overall architecture of the database and the application
tools, and the different interaction patterns between them. Special attention will be
given to service management and service provisioning, covering issues such as data
warehouses to support customer care and market intelligence, and database technology
for web-based applications (e.g. Electronic Commerce).
The Project started in January 1998 and will end in December 1999. It is a partially
funded Project with an overall budget of 162 MM and additional costs of around
20,000 ECU. The Participants of the Project are BT, DK, DT, NL, PT, ST and TE.
The Project is led by Professor Willem Jonker from NL.
This is the first of four Deliverables of the Project and is titled “Overview of Very
Large Database Technologies and Telecommunication Applications using such
Databases”. The Deliverable consists of five Volumes, of which the Main Report is
the first; the other Volumes contain the Annexes. The other Deliverables are: D2
“Architecture and Interaction Report”, D3 “Experiments: Definition” and D4
“Experiments: Results and Conclusions”.
This Deliverable contains an extensive state-of-the-art technological overview of very
large database technologies. It addresses low-cost hardware to support very large
databases, multimedia databases, web-related database technology and data
warehouses. It also contains a first mapping of technologies onto applications in the
service management and service provisioning domain.
Executive Summary
This annex contains part of the results of the literature study on database
technologies for very large databases. The other parts can be found in the other annexes.
This document is subdivided into three parts:
- Transaction Processing Monitors: describing the concepts and added value of
transaction processing monitors, together with available products
- Retrieval and Manipulation: describing aspects of data retrieval and manipulation
in distributed databases, plus an overview of current commercial database
systems
- Backup and Recovery: describing backup and recovery strategies and their
necessity in the context of highly available very large database applications.
Transaction Processing Monitors reside between the client and the (database) server.
Their main goals are to ensure that transactions are correctly processed, often in a
heterogeneous environment, and that workload is equally distributed among available
systems in compliance with security rules. The most mature products are Tuxedo,
Encina, TOP END and CICS. Grip and Microsoft Transaction Server (MTS) lack
some features and standards support. If you are looking for enterprise-wide capacity,
consider TOP END and Tuxedo. If your project is medium-sized, consider Encina as
well. If you need support for a vast number of different platforms, Tuxedo may be the
product to choose. If DCE is already used as the underlying middleware, Encina should
be considered. Regarding support for objects and components, MTS is clearly leading
the field with a tight integration of transaction concepts into the COM component
model; Tuxedo and Encina will support the competing CORBA object model from the
OMG. The market for TP Monitors appears to be consolidating. On the one hand,
Microsoft has discovered the TP Monitor market and will certainly gain a large share
of the NT server market. On the other hand, the former TP Monitor competitors are
merging, which leaves IBM (CICS and Encina) and BEA Systems (Tuxedo and TOP
END) as the established vendors. The future will depend heavily on the market's choice
among object and component models such as DCOM, CORBA and JavaBeans, and on
easy access to integrated development tools.
Retrieval and manipulation of data in different database architectures offer various
options for finding optimal solutions for database applications. In recent years many
architectural options have been discussed in the field of distributed and federated
databases, and various algorithms have been implemented to optimise the handling of
data and the methodologies used to implement database applications. Nevertheless,
retrieval and manipulation in different architectures apply similar theoretical
principles for optimising the interaction between applications and database systems.
Efficient query and request execution is an important criterion when retrieving large
amounts of data. This part also covers a number of commercial database products
competing in the VLDB segment, most of which run on various hardware platforms.
The DBMSs are generally supported by a range of tools, for example for data
replication and data retrieval.
Being able to back up and recover data is essential for an organisation, as no system
(not even a fault-tolerant one) is free of failures. Moreover, errors are not only caused
by hardware and software failures but also by wrong user actions, whether accidental
or deliberate. Some types of failure can be corrected by the DBMS immediately (e.g. wrong user operations)
but others need a recovery action from a backup device (e.g. disk crashes). Depending
on issues such as the type of system, the availability requirements, the size of the
database and so on, one can choose from two levels of backup and recovery: the
operating system level and the database level. Products of the former are often
operating system dependent and DBMS independent; products of the latter are the
other way around. Which product to choose depends on the issues mentioned above.
List of Authors
Part 1
Berend Boll
Deutsche Telekom Berkom GmbH, Germany
Part 2
Frank Norman
Tele Danmark
Wolfgang Müller
Deutsche Telekom
Part 3
Sabine Gerl
Deutsche Telekom Berkom GmbH, Germany
Andres Peñarrubia
Telefonica, Spain
Table of Contents
Preface ............................................................................................................................ i
Executive Summary ....................................................................................................... ii
List of Authors .............................................................................................................. iv
Table of Contents........................................................................................................... v
Abbreviations ................................................................................................................ ix
Definitions .................................................................................................................... xi
Part 1 Transaction Processing Monitors ................................................................... 1
1 Introduction................................................................................................................. 1
2 Concepts of Transactions ........................................................................................... 1
2.1 ACID Properties ............................................................................................... 1
2.2 Two Phase Commit Protocol ........................................................................... 1
3 Concepts of TP Monitors ............................................................................................ 2
3.1 Why should you use a TP Monitor? ................................................................ 2
3.2 Standards and Architecture .............................................................................. 4
3.3 Transaction management ................................................................................. 6
3.4 Process management ........................................................................................ 7
3.4.1 Server classes ...................................................................................... 7
3.4.2 Reduced server resources ................................................................... 7
3.4.3 Dynamic load balancing ..................................................................... 8
3.5 Robustness ....................................................................................................... 8
3.6 Scalability ........................................................................................................ 9
3.6.1 Shared process resources .................................................................... 9
3.6.2 Flexible hardware requirements ......................................................... 9
3.7 Performance ..................................................................................................... 9
3.8 Security .......................................................................................................... 10
3.9 Transaction profiles ....................................................................................... 10
3.10 Administration ............................................................................................. 11
3.11 Costs ............................................................................................................. 11
3.12 3-tier architecture framework ...................................................................... 12
3.13 When not to use a TP Monitor ..................................................................... 12
4 Commercial TP Monitors ......................................................................................... 13
4.1 BEA Systems Inc.'s Tuxedo ........................................................................... 13
4.1.1 Summary ........................................................................................... 13
4.1.2 History .............................................................................................. 14
4.1.3 Architecture ...................................................................................... 15
4.1.4 Web Integration ................................................................................ 16
4.1.5 When to use ...................................................................................... 17
4.1.6 Future plans ...................................................................................... 17
4.1.7 Pricing ............................................................................................... 18
4.2 IBM's TXSeries (Transarc's Encina) .............................................................. 18
4.2.1 Summary ........................................................................................... 18
4.2.2 History .............................................................................................. 19
4.2.3 Architecture ...................................................................................... 19
4.2.4 Web Integration ................................................................................ 21
4.2.5 When to use ....................................................................................... 21
4.2.6 Future plans ....................................................................................... 22
4.2.7 Pricing ............................................................................................... 22
4.3 IBM's CICS..................................................................................................... 22
4.3.1 Summary ........................................................................................... 22
4.3.2 History ............................................................................................... 23
4.3.3 Architecture ....................................................................................... 23
4.3.4 Web integration ................................................................................. 25
4.3.5 When to use ....................................................................................... 26
4.3.6 Future plans ....................................................................................... 26
4.3.7 Pricing ............................................................................................... 27
4.4 Microsoft Transaction Server MTS ............................................................... 27
4.4.1 Summary ........................................................................................... 27
4.4.2 History ............................................................................................... 27
4.4.3 Architecture ....................................................................................... 28
4.4.4 Web Integration................................................................................. 29
4.4.5 When to use ....................................................................................... 29
4.4.6 Future plans ....................................................................................... 29
4.4.7 Pricing ............................................................................................... 29
4.5 NCR TOP END .............................................................................................. 30
4.5.1 Summary ........................................................................................... 30
4.5.2 History ............................................................................................... 30
4.5.3 Architecture ....................................................................................... 31
4.5.4 Web Integration................................................................................. 32
4.5.5 When to use ....................................................................................... 33
4.5.6 Future plans ....................................................................................... 33
4.5.7 Pricing ............................................................................................... 34
4.6 Itautec's Grip................................................................................................... 34
4.6.1 Summary ........................................................................................... 34
4.6.2 History ............................................................................................... 34
4.6.3 Architecture ....................................................................................... 35
4.6.4 Web Integration................................................................................. 36
4.6.5 When to use ....................................................................................... 36
4.6.6 Future plans ....................................................................................... 36
4.6.7 Pricing ............................................................................................... 37
5 Analysis and recommendations................................................................................. 37
5.1 Analysis .......................................................................................................... 37
5.2 Recommendations .......................................................................................... 37
References .................................................................................................................... 38
Part 2 Retrieval and Manipulation .......................................................................... 39
1 Introduction ............................................................................................................... 39
1.1 General architecture of distributed Databases ............................................... 39
1.1.1 Components of a distributed DBMS ................................................. 39
1.1.2 Distributed versus Centralised databases .......................................... 41
1.2 General architecture of federated Databases .................................................. 41
1.2.1 Constructing Federated Databases .................................................... 42
1.2.2 Implementing federated database systems ........................................ 44
1.2.3 Data Warehouse Used To Implement Federated System .................. 46
1.2.4 Query Processing in Federated Databases ........................................ 47
1.2.5 Conclusion: Federated Databases ..................................................... 47
2 Organisation of distributed data ............................................................................... 48
2.1 Schema integration in Federated Databases .................................................. 48
2.2 Data Placement in Distributed Databases ...................................................... 49
2.2.1 Data Fragmentation .......................................................................... 50
2.2.2 Criteria for the distribution of fragments.......................................... 50
3 Parallel processing of retrieval ................................................................................. 51
3.1 Query Processing ........................................................................................... 51
3.2 Query optimisation......................................................................................... 51
4 Parallel processing of transactions ........................................................................... 52
4.1 Characteristics of transaction management .................................................. 52
4.2 Distributed Transaction.................................................................................. 52
5 Commercial products ................................................................................................ 53
5.1 Tandem........................................................................................................... 53
5.1.1 Designed for scalability .................................................................... 53
5.1.2 High degree of manageability ........................................................... 53
5.1.3 Automatic process migration and load balancing............................. 53
5.1.4 High level of application and system availability ............................ 53
5.2 Oracle ............................................................................................................. 54
5.2.1 Oracle8.............................................................................................. 54
5.2.2 A Family of Products with Oracle8 .................................................. 55
5.3 Informix ......................................................................................................... 60
5.3.1 Informix Dynamic Server ................................................................. 60
5.3.2 Basic Database Server Architecture ................................................. 60
5.3.3 Informix Dynamic Server Features................................................... 62
5.3.4 Supported Interfaces and Client Products ........................................ 64
5.4 IBM ................................................................................................................ 66
5.4.1 DB2 Universal Database................................................................... 66
5.4.2 IBM's Object-Relational Vision and Strategy .................................. 69
5.4.3 IBM’s Business Intelligence Software Strategy ............................... 71
5.5 Sybase ............................................................................................................ 73
5.5.1 Technology Overview: Sybase Computing Platform ....................... 73
5.5.2 Sybase's Overall Application Development/Upgrade Solution:
Customer-Centric Development ................................................... 76
5.5.3 Java for Logic in the Database ......................................................... 77
5.6 Microsoft ........................................................................................................ 79
5.6.1 Overview........................................................................................... 79
5.6.2 Microsoft Cluster Server .................................................................. 81
5.7 NCR Teradata ................................................................................................ 83
5.7.1 Data Warehousing with NCR Teradata ............................................ 83
5.7.2 Teradata Architecture ....................................................................... 84
5.7.3 Application Programming Interfaces ................................................ 85
5.7.4 Language Preprocessors ................................................................... 85
5.7.5 Data Utilities ..................................................................................... 86
5.7.6 Database Administration Tools ........................................................ 86
5.7.7 Internet Access to Teradata .............................................................. 86
5.7.8 NCR's Commitment to Open Standards ........................................... 86
5.7.9 Teradata at work ............................................................................... 87
6 Analysis and recommendations ................................................................................ 87
References .................................................................................................................... 88
Part 3 Backup and Recovery..................................................................................... 91
1 Introduction ............................................................................................................... 91
2 Security aspects ......................................................................................................... 91
3 Backup and Recovery Strategies ............................................................................... 93
3.1 Recovery ......................................................................................................... 95
3.2 Strategies ........................................................................................................ 96
3.2.1 Requirements .................................................................................... 96
3.2.2 Characteristics ................................................................................... 97
4 Overview of commercial products ............................................................................ 97
4.1 Tools ............................................................................................................... 98
4.1.1 PC-oriented backup packages ........................................................... 98
4.1.2 UNIX packages ................................................................................. 98
4.2 Databases ...................................................................................................... 100
4.2.1 IBM DB2 ......................................................................................... 100
4.2.2 Informix........................................................................................... 101
4.2.3 Microsoft SQL Server ..................................................................... 102
4.2.4 Oracle 7 ........................................................................................... 102
4.2.5 Oracle 8 ........................................................................................... 103
4.2.6 Sybase SQL Server ......................................................................... 105
5 Analysis and recommendations............................................................................... 105
References .................................................................................................................. 106
Appendix A: Backup and Restore Investigation of Terabyte-scale Databases .......... 107
A.1 Introduction ................................................................................................. 107
A.2 Requirements ............................................................................................... 107
A.3 Accurate benchmarking ............................................................................... 107
A.4 The benchmark environment ....................................................................... 108
A.5 Results ......................................................................................................... 109
A.5.1 Executive summary ........................................................................ 109
A.5.2 Detailed results ............................................................................... 111
A.6 Interpreting the results ................................................................................. 113
A.7 Summary ...................................................................................................... 113
Appendix B: True Terabyte Database Backup Demonstration .................................. 115
B.1 Executive Summary ..................................................................................... 115
B.1.1 Definitions ...................................................................................... 116
B.2 Detailed Results ........................................................................................... 116
B.2.1 Demonstration Environment .......................................................... 116
B.2.2 Results ............................................................................................ 117
B.3 Interpreting the Results ................................................................................ 118
B.4 Summary ...................................................................................................... 119
Abbreviations
ACID
Atomicity, Consistency, Isolation, Durability
ACL
Access Control List. Used to define security restrictions for
objects/resources
COM
Microsoft's Component Object Model
CORBA
OMG's Common Object Request Broker Architecture
DBA
Database Administrator
DBMS
Database Management System
DBS
Data Base System
DCE
Open Group’s Distributed Computing Environment
DCOM
Microsoft's Distributed Component Object Model
DDL
Data Definition Language
DML
Data Manipulation Language
DRM
Disaster Recovery Manager
DSA
Database Server Architecture
DTP Model
Distributed Transaction Processing Model, defined by the Open
Group.
FDBS
Federated Database System
GIF
Graphics Interchange Format
HSM
Hierarchical Storage Management
HTML
Hypertext Markup Language
IDL
Interface Definition Language
JDBC
Java Database Connectivity
LOB
Line-Of-Business
MDBS
Multi Database System
MOM
Message-Oriented Middleware
MPP
Massively Parallel Processing
NCA
Network Computing Architecture
ODBC
Open Database Connectivity
OLAP
Online Analytical Processing
OMG
Object Management Group
Open Group
Non-profit, vendor-independent, international consortium. Has
created the DTP Model and the XA Standard for Transaction
Processing.
ORB
Object Request Broker
ORDBMS
Object-Relational DBMS
PDF
Portable Document Format
RDBMS
Relational DBMS
RPC
Remote Procedure Call
SMP
Symmetric Multiprocessing
TLI
Transport Layer Interface (APIs such as CPI-C, the SNA peer-to-peer protocol, and Named Pipes)
TP Monitor
Transaction Processing Monitor
TPC
Transaction Processing Performance Council
tpmC
Transactions per minute measured in accordance with TPC's C
standard (TPC-C).
UDF
User-defined Function
UDT
User-defined Data Type
UML
Unified Modeling Language
VLDB
Very Large Database
XA
API used to co-ordinate transaction updates across resource
managers
Definitions
ACID properties
the four transaction properties: atomicity, consistency,
isolation, durability.
Conversational
Kind of communication. Unlike request-response each request
in a conversation goes to the same service. The service
retains state information about the conversation. There is no
need to send state information with each client request.
Publish-Subscribe
Kind of communication. (Publisher) Components are able to
send events and other components (Subscribers) are able to
subscribe to a particular event. Every time the subscribed event occurs within the
publisher component, the subscriber component is notified by a message.
Queue
Kind of communication. A queue provides time-independent
communication. Requests and responses are stored in a queue and can be accessed
asynchronously.
Request-response
Kind of communication. The client issues a request to a
service and then waits for a response before performing other
operations (an example is an RPC)
Resource managers
a piece of software that manages shared resources
server class
a group of processes that are able to run the code of the
application program.
two-phase commit
Protocol for distributed transactions
Part 1 Transaction Processing Monitors
1 Introduction
"The idea of distributed systems without transaction management is like a society
without contract law. One does not necessarily want the laws, but one does need a
way to resolve matters when disputes occur. Nowhere is this more applicable than in
the PC and client/server worlds." - Jim Gray (May, 1993)
2 Concepts of Transactions
Transactions are fundamental in all software applications, especially in distributed
database applications. They provide a basic model of success or failure by ensuring
that a unit of work is either completed in its entirety or not at all.
From a business point of view a transaction changes the state of the enterprise; for
example, a customer paying a bill results in a change of the order status and a change
on the balance sheet.
From a technical point of view we define a transaction as "a collection of actions that
is governed by the ACID-properties" ([5]).
2.1 ACID Properties
The ACID properties describe the key features of transactions:
- Atomicity. Either all changes to the state happen or none do. This includes
changes to databases, message queues and all other actions under transaction
control.
- Consistency. The transaction as a whole is a correct transformation of the state.
The actions undertaken do not violate any of the integrity constraints associated
with the state.
- Isolation. Each transaction runs as though there are no concurrent transactions.
- Durability. The effects of a committed transaction survive failures.
Database and TP systems both provide these ACID properties. They use locks, logs,
multiversions, two-phase-commit, on-line dumps, and other techniques to provide this
simple failure model.
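As a concrete illustration of atomicity, the following minimal Python sketch (using the standard sqlite3 module; the account table and the transfer function are invented for this example) bundles two updates into a single transaction: either both the debit and the credit are applied, or, if anything fails, neither is.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    # The connection used as a context manager commits on success and rolls back
    # automatically if any statement raises an error, so the two updates form one
    # atomic unit of work.
    with conn:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                     (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                     (amount, dst))

transfer(conn, 1, 2, 40)
print(conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall())
# [(1, 60), (2, 40)]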
2.2 Two Phase Commit Protocol
The two-phase commit protocol is currently the accepted standard protocol to achieve
the ACID properties in a distributed transaction environment. Each distributed
transaction has a coordinator, which initiates and coordinates the transaction.
In the first phase the coordinator (root node) informs all participating subordinate
nodes of the modifications of the transaction. This is done via the prepare-to-commit
message. The coordinator then waits for the answers of the subordinate nodes. If all
goes well, it receives a ready-to-commit message from each of the subordinate nodes.
The root node logs this fact in a safe place to allow recovery from a root node failure.
If any of the subordinate nodes fails and does not send a ready-to-commit message to
the root node then the whole transaction will be aborted.
In the second phase the coordinator sends a commit message to all subordinate nodes.
They commit their actions and answer with a complete message. The protocol is
illustrated in the figure below.
Figure 1. Two-Phase Commit Protocol: in phase 1 the coordinator (root node) sends prepare messages to the participants (subordinate nodes), collects their ready-to-commit answers and logs the outcome; in phase 2 it sends commit messages and collects the complete answers.
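The message flow of Figure 1 can be sketched in a few lines of Python. The Participant class, its method names and the in-memory log are all invented for illustration; a real implementation would of course involve network messages, persistent logging and timeout handling.

class Participant:
    """A subordinate node that votes in phase 1 and commits in phase 2."""
    def __init__(self, name):
        self.name = name

    def prepare(self):
        # Phase 1: tentatively apply the changes, hold locks, and answer
        # with ready-to-commit (True) or a refusal (False).
        return True

    def commit(self):
        print(f"{self.name}: complete")

    def rollback(self):
        print(f"{self.name}: aborted")

def two_phase_commit(participants, coordinator_log):
    # Phase 1: send prepare to every subordinate node and collect the votes.
    votes = [p.prepare() for p in participants]
    if all(votes):
        coordinator_log.append("commit")   # decision logged before phase 2 starts
        for p in participants:             # Phase 2: commit everywhere
            p.commit()
        return True
    coordinator_log.append("abort")        # any missing vote aborts the whole transaction
    for p in participants:
        p.rollback()
    return False

log = []
two_phase_commit([Participant("DB-A"), Participant("DB-B")], log)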
Most TP Monitors can easily handle transactions that span 100 two-phase
commit engines. However, the two-phase commit has some limitations:
- Performance overhead. There is a message overhead, because the protocol does
not distinguish between different kinds of transactions. This means that even for
read-only transactions all the messages of the two-phase commit protocol will be
sent, although they are not really needed.
- Hazard windows. At particular moments the failure of a node can lead to a
problem. For example, if the root node crashes after the first phase, the
subordinate nodes may be left in disarray. There are workarounds, but they tend
to be tricky, so the architecture should be built in such a way that the root node is
located on a fault-tolerant system.
3 Concepts of TP Monitors
3.1 Why should you use a TP Monitor?
Based on the TPC ranking of February 1998, the success of TP Monitors was clearly
demonstrated by the fact that every single test environment of the top 20 TPC-C
benchmark results (ranked by transactions per minute) included a TP Monitor. If the
same results are ranked by the price/performance ratio, 18 of the top 20 used a TP
Monitor ([1], [12]).
Why are TP Monitors so popular in modern architectures and what problems do they
address?
To understand this, one has to take a look at how application architectures are built.
All applications consist of three parts:
- a presentation layer (GUI), which resides on the client
- an application layer, which can reside on separate application servers
- a data layer, which can reside on separate database servers.
In many applications there is no clear separation of these layers within the code; the
same code may do work for all three layers. Well-structured applications separate
these layers at the code (software) level and at the hardware level.
2-tier applications have the application layer integrated within the presentation layer
on the client and/or within the data layer (as remote procedures) on the database
server.
3-tier applications separate the application layer and most often run it on dedicated
application servers. Even with a separate application layer one could still run
application layer services physically on the client or on the database server, but the
point is that there is a separation, and hence the possibility to redistribute the
application layer services to special-purpose machines based on workload and
performance considerations.
Basically, TP Monitors provide an architecture to build 3-tiered client/server
applications. According to the Standish Group, in 1996 57% of mission-critical
applications were built with TP Monitors. This is because traditional 2-tiered
architectures have the following problems:
- For each active client the database must maintain a connection, which consumes
machine resources and reduces performance as the number of clients rises.
- 2-tiered applications scale well up to a point and then degrade quickly.
- Reuse is difficult, because 2-tiered application code such as stored procedures is
tightly bound to specific database systems.
- Transactional access to multiple data sources is only possible via gateways. But
gateways integrate applications at the data level, which is "politically" and
technically unstable and not adaptive to change ("politically" refers to the
problem that the owner of the data might not be willing to grant access at the data
level outside of his department).
- Database stored procedures cannot execute under global transaction control.
They cannot be nested or programmed on a modular basis, and they are
vendor-specific.
- Outside of trusted LAN environments the security model used in 2-tiered systems
does not work well, because it focuses on granting users access to data. Once
administrators give a user write or change access to a table, the user can do
almost anything to the data. There is no security at the application level.
- There is no transaction mechanism for objects (CORBA) or components (COM,
JavaBeans).
TP Monitors address all of the above problems, and despite their rare usage in average
client/server applications they have a long history in the mainframe area.
Nowadays they attract more and more attention because of the development of
commercial applications on the Internet.
3.2 Standards and Architecture
A TP Monitor could be described as an operating system for transaction processing. It
delivers the architecture to distribute and manage transactions over a heterogeneous
infrastructure. This implicitly forces the application architecture to be 3-tier, because
a TP Monitor is a type of middleware.
A TP Monitor does three things extremely well:
- Process management includes starting server processes, funneling work to
them, monitoring their execution and balancing their workloads.
- Transaction management means that the TP Monitor guarantees the ACID
properties to all the programs that run under its protection.
- Client/server communication management allows clients (and services) to
invoke an application component in a variety of ways, including request-response,
conversations, queuing, publish-subscribe or broadcast.
Figure 2. 3-tier client/server with TP Monitor: clients connect to a TP Monitor that hosts the business logic; the TP Monitor in turn drives application processes and resources such as DBMS 1, DBMS 2, a file handling system and a message queue.
A TP Monitor consists of several components. The Open Group's Distributed
Transaction Processing Model (1994) ([10]), which has achieved wide acceptance in
the industry, defines the following components:
- The application program contains the business logic. It defines the transaction
boundaries through calls it makes to the transaction manager. It controls the
operations performed against the data through calls to the resource managers.
- Resource managers are components that provide ACID access to shared
resources like databases, file systems, message queuing systems, application
components and remote TP Monitors.
- The transaction manager creates transactions, assigns transaction identifiers to
them, monitors their progress and coordinates their outcome.
- The Communication Resource Manager controls the communications between
distributed applications.
Figure 3. X/Open 1994 Distributed Transaction Processing Model: the Application Program (AP) uses the RM API towards the Resource Managers (RM), the TX API towards the Transaction Manager (TM), and XATMI, TxRPC or CPI-C towards the Communication Resource Managers (CRM); the TM co-ordinates the RMs via the XA API and the CRMs via the XA+ API.
The following interfaces exist between the components:
- RM API is used to query and update resources owned by a resource manager.
Typically the provider of the resource manager defines this interface; for
example, the API for a relational database would be an SQL API.
- TX API is used to signal to the transaction manager the beginning, the
commitment or the abortion of a transaction.
- XA API is used to coordinate transaction updates across resource managers
(two-phase commit protocol).
- XA+ API defines the communication between the communication resource
managers and the transaction manager.
This interface, however, was never ratified, so not all vendors use it. On the
whole, the XA+ interface is relatively unimportant as, generally, it is used
internally within the product.
- XATMI, TxRPC and CPI-C are transactional communication programming
interfaces.
There is no general standard: XATMI is based on BEA Tuxedo's Application to
Transaction Monitor Interface (ATMI), TxRPC is based on the Distributed
Computing Environment (DCE) RPC interface, and CPI-C is based on IBM's
peer-to-peer conversational interface.
The role of these components and interfaces within a 3-tier architecture is visualized
with the following picture. The client (presentation layer) communicates with the
Application Program (AP). The access to the data-layer is done via the Resource
Manager (RM) component. If several distributed TP Monitors are involved within a
transaction, the Communication Resource Managers (CRM) are responsible for the
necessary communication involved.
Figure 4. TP Monitor within a 3-tier architecture: the client (presentation layer) reaches the Application Program in TP Monitor X via RPCs, queues etc.; within each TP Monitor the CRM, TM and RM co-operate, the CRMs of TP Monitor X and TP Monitor Y communicate over TCP/IP, SNA or OSI, and each RM accesses its database in the data layer.
Actual implementations of TP Monitors consist of several other components, but
these differ from product to product. The Open Group DTP Model should therefore
only be used to understand the main function of a TP Monitor: distributed transaction
management.
Other components include modules for client/server communication such as queues
(which all TP Monitors now include), administration tools, directory services
and many more. We refer here to the chapter on commercial products in this part.
The key interface is XA, because it is the interface between the resource manager
from one vendor and the DTPM from the middleware vendors.
XA is not a precise standard, nor does it comprise source code which can be licensed.
It is a specification against which vendors are expected to write their own
implementations. There are no conformance tests for the X/Open XA specification, so
it is not possible for any vendor to state that it is 'XA-compliant'; all that vendors can
claim is that they have used the specification and produced an XA implementation
which conforms to it. The situation is complicated even more by the fact that the
committee which devised the XA Model and the specifications has now disbanded.
The DBMSs that definitely support the XA standard and also work with all DTPMs
that support the standard are Oracle, Informix, SQL Server and DB2/6000.
In the following chapter we will describe special features of TP Monitors in more
detail.
3.3 Transaction management
TP Monitors are operating systems for business transactions. The unit of
management, execution and recovery is the transaction. The job of the TP Monitor is
to ensure the ACID properties even in a distributed resource environment while
maintaining a high transaction throughput.
The ACID properties are achieved through co-operation between the transaction
manager and resource managers. All synchronisation, commit and rollback actions are
co-ordinated by the transaction manager via the XA interface and the 2-phase-commit
protocol.
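From the application programmer's point of view, transaction management is mostly a matter of demarcation: telling the transaction manager where a transaction begins and whether it should commit or abort, as the TX interface of section 3.2 does. The sketch below uses an invented TransactionManager class rather than any product's real API, just to show the pattern.

from contextlib import contextmanager

class TransactionManager:
    """Stand-in for a TP Monitor's transaction manager (hypothetical API)."""
    def begin(self):
        print("tx: begin")

    def commit(self):
        print("tx: commit (two-phase commit across the enlisted resource managers)")

    def rollback(self):
        print("tx: rollback")

@contextmanager
def transaction(tm):
    # Demarcation: everything inside the with-block runs in one transaction.
    tm.begin()
    try:
        yield
        tm.commit()
    except Exception:
        tm.rollback()
        raise

tm = TransactionManager()
with transaction(tm):
    # Calls to resource managers (databases, queues, ...) would go here;
    # either all of them take effect or none of them do.
    pass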
3.4 Process management
3.4.1 Server classes
The main job of TP Monitors is managing processes. They keep pools of pre-started
application processes or threads (called server classes). Each process or thread in the
server class is able to run the application. The TP Monitor balances the work between
them. Each application can have one or more server classes.
These server processes are pre-warmed. They are already loaded in memory, have a
context and are ready to start instantly. If they finish their work for one client request
they stay in memory and wait for the next request.
3.4.2 Reduced server resources
Keeping and sharing these pre-warmed processes dramatically reduces the number of
concurrent processes and therefore makes it possible to support a huge number of
clients.
Figure 5. Process Management without TP Monitor: 1,000 clients require 1,000 database connections, 1,000 processes, 500 MB of RAM and 10,000 open files on the database server.
This kind of process management can be described as pooling and funnelling. It
provides scalability for huge database applications because it addresses the problem
that databases establish and maintain a separate connection for each client. It
is because of this feature that all leading TPC-C benchmark results are obtained using
TP Monitors ([12]).
Figure 6. Process Management with TP Monitor - pooling and funneling: 1,000 clients are funnelled through 100 server classes, so that the database server needs only 50 shared connections, 50 processes, 25 MB of RAM and 500 open files.
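The pooling and funnelling idea can be sketched with standard Python threads and queues: a small, fixed pool of worker threads (one "server class") shares a handful of pre-opened connections while serving a much larger number of queued client requests. The connection strings and the empty request handler are placeholders, not a real database driver.

import queue
import threading

requests = queue.Queue()      # incoming client requests (the funnel)
connections = queue.Queue()   # small pool of shared database connections

for conn_id in range(5):      # 5 shared connections instead of one per client
    connections.put(f"db-conn-{conn_id}")

def worker():
    """One pre-started member of a server class."""
    while True:
        request = requests.get()
        if request is None:            # shutdown signal
            break
        conn = connections.get()       # borrow a shared connection
        try:
            pass                       # here: execute the request on conn
        finally:
            connections.put(conn)      # return the connection to the pool
            requests.task_done()

workers = [threading.Thread(target=worker) for _ in range(10)]
for w in workers:
    w.start()

for client in range(1000):    # 1,000 clients funnelled through 10 workers
    requests.put(f"request from client {client}")
requests.join()

for _ in workers:             # stop the pool
    requests.put(None)
for w in workers:
    w.join()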
3.4.3 Dynamic load balancing
If the number of incoming client requests exceeds the number of processes in a server
class, the TP Monitor may dynamically start new processes or even new server
classes. The server classes can also be distributed across multiple CPUs in SMP or
MPP environments. This is called load balancing. It can be done via manual
administration or automatically by the TP Monitor; in the latter case it is called
dynamic load balancing.
This load balancing can be applied to application layer or data layer processes. For
data layer processes the database server bottleneck can be relieved by having
several (replicated) databases over which the TP Monitor distributes the workload.
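A toy sketch of the dynamic part, with invented thresholds: a supervisor watches the request backlog and starts additional workers of the server class while the backlog justifies it (a real TP Monitor would start whole processes, possibly on other nodes).

import queue
import threading
import time

requests = queue.Queue()

def worker():
    while True:
        job = requests.get()
        if job is None:
            break
        time.sleep(0.01)              # simulate the work for one request
        requests.task_done()

def start_worker():
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

workers = [start_worker() for _ in range(2)]   # initial size of the server class

for i in range(500):                           # burst of client requests
    requests.put(i)

MAX_WORKERS = 20                               # invented tuning parameters
BACKLOG_PER_WORKER = 10
# Supervisor: grow the server class while the backlog is too deep.
while (requests.qsize() > len(workers) * BACKLOG_PER_WORKER
       and len(workers) < MAX_WORKERS):
    workers.append(start_worker())

requests.join()                                # wait until the burst is processed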
3.5 Robustness
TP systems mask failures in a number of ways. At the most basic level, they use the
ACID transaction mechanism to define the scope of failure. If a service fails, the TP
Monitor backs out and restarts the transaction that was in progress. If a node fails, it
could migrate server classes at that node to other nodes. When the failed node restarts,
the TP system's transaction log governs restart and recovery of the node's resource
managers.
In that way the whole application layer running as server classes on potentially
distributed application servers is highly available. Each failure (local process/server
class or total machine) could be masked by the TP Monitor by restarting or migrating
the server class. Also the data layer is robust to failure, because a resource manager
under TP Monitor control masks the real database server. If a database server crashes,
the TP Monitor could restart the database server or migrate the server classes of the
resource manager to a fallback database server. In either cases (failure of application
or data layer) the failures will be fixed by the TP Monitor without disturbing the
client. the TP Monitor is acting as a self-healing system, it not just handles faults, it
automatically corrects them.
This could be done because the client is only connected to one TP Monitor. The TP
Monitor handles all further connections to different application, database, file services
etc. The TP Monitor handles failures of servers and redirects the requests if necessary.
That implies that the TP Monitor itself should be located on a fault-tolerant platform.
Still it is possible to run several TP monitor server processes on different server
machines that could take over the work of each other in case of failure, so that even
this communication links has a fallback.
Different to a 2-tier approach a client could crash and leave an open transaction with
locks on the database, because not the client but the TP Monitor controls the
connections to the database. Therefore the TP Monitor could rollback connections for
a crashed client.
TP systems can also use database replicas in a fallback scheme - leaving the data
replication to the underlying database system. If a primary database site fails, the TP
Monitor sends the transactions to the fallback replica of the database. This hides
server failures from clients - giving the illusion of instant fail-over.
Moreover TP Monitors deliver a wide range of synchronous and asynchronous
communication links for application building including RPCs, message queues
(MOMs), ORB/DCOM-invocation, conversational peer-to-peer and event-based
publish-and-subscribe communication. Therefore based on the infrastructure and the
nature of the services, the best communication links could be chosen.
In short, the following services are delivered:
- Failover server
- Automatic retry to re-establish a connection on failure
- Automatic restart of processes on the client, server or middleware
- Automatic redirection of requests to a new server instance on server failure
- Appropriate and reliable communication links
- Resolution of transaction deadlocks.
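The failover behaviour can be approximated by a small helper that retries a request and then redirects it to a fallback replica; the service stubs below are invented, and in a real system the TP Monitor performs this redirection transparently, below the client.

def call_with_failover(request, replicas, retries_per_replica=2):
    """Try each replica in turn, retrying a few times before moving on."""
    last_error = None
    for service in replicas:                 # e.g. [primary, fallback]
        for _ in range(retries_per_replica):
            try:
                return service(request)
            except ConnectionError as exc:   # transient failure: retry / fail over
                last_error = exc
    raise RuntimeError("no replica could process the request") from last_error

# Two toy service stubs standing in for a primary and a fallback database site.
def primary(request):
    raise ConnectionError("primary site is down")

def fallback(request):
    return f"processed: {request}"

print(call_with_failover("debit account 42", [primary, fallback]))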
3.6 Scalability
3.6.1 Shared process resources
TP Monitors are the best at database funneling, because they are able to use only a
handful of database connections while supporting thousands of clients (see section on
Process Management). Therefore by reducing the connection overhead on the
database server they make the database side much more scalable.
In a 3-tier TP Monitor architecture for database access there are typically at least a
factor of 10 fewer database connections necessary than in a straightforward 2-tier
implementation.
Figure 7. Scalability of the database layer: the database resources required grow far more slowly with the number of clients when a TP Monitor is used than without one.
3.6.2 Flexible hardware requirements
Scalability is not only enhanced on the database side. With dynamic load balancing
the load could be distributed over several machines. By doing this the whole
architecture becomes very flexible on the hardware side. To increase the total
processing capabilities of the 3-tier architecture one could either upgrade server
machines or increase the number of servers. The decision could be made based on
costs and robustness of the hardware and on company policies.
3.7 Performance
Performance can be improved in several ways. First of all it is based on the
effective handling of processes on one machine. TP Monitors deliver an architecture
for multi-threading/multi-tasking on the middleware and the application layer. That
includes:
- Automatic creation of new threads/tasks based on load
- Automatic deletion of threads/tasks when load decreases
- Application parameters dynamically altered at runtime.
TP Monitors speed up performance by keeping the pool of processes in memory
together with their pre-allocated resources. In this pre-warmed environment each
client request can be processed instantaneously, without the normal start-up phase.
Another way to enhance performance is to use more processors/machines to share the
workload. TP Monitors support load balancing for:
- multiple processors in one machine (SMP or MPP)
- multiple machines (nodes)
- multiple middleware, application and resource services (which may be started on
demand as stated above).
3.8 Security
The TP Monitor provides a convenient way to define users and roles and to specify
the security attributes of each service in an access control list (ACL). It authenticates
clients and checks their authority on each request, rejecting those that violate the
security policy. In doing so, it delivers fine-grained security at the
service/application level, which adds considerable value on top of security at the
database or communication line level.
Aspects of security typically include:
- Role. Based on their roles, users are restricted in which services they may use.
- Workstation. It is defined which physical workstation is allowed to request
which service.
- Time. A service might be limited to a particular period of time.
For example, in a payment system, clerks might be allowed to make payments from
in-house workstations during business hours.
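A compact sketch of the kind of check a TP Monitor makes before dispatching a request. The ACL contents, role names, workstation names and service name are all invented; the point is that role, workstation and time-of-day restrictions are evaluated per service, not per database table.

from datetime import time

# Hypothetical access control list: for each service, the allowed roles,
# the allowed workstations and an allowed time window.
ACL = {
    "make_payment": {
        "roles": {"clerk", "supervisor"},
        "workstations": {"ws-branch-01", "ws-branch-02"},
        "hours": (time(9, 0), time(17, 0)),
    },
}

def authorised(service, role, workstation, now):
    entry = ACL.get(service)
    if entry is None:
        return False                        # unknown service: reject
    start, end = entry["hours"]
    return (role in entry["roles"]
            and workstation in entry["workstations"]
            and start <= now <= end)

print(authorised("make_payment", "clerk", "ws-branch-01", time(10, 30)))   # True
print(authorised("make_payment", "clerk", "ws-branch-01", time(22, 0)))    # False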
Furthermore, a TP Monitor supports general communication security features such as:
- Authentication
- Authorization
- Encryption.
3.9 Transaction profiles
It is possible to assign profiles to transactions/server classes. Attributes of profiles
include:
- Priority. Transactions can have different priorities, which are used by the TP
Monitor for load balancing.
- Security. Defines who is allowed to use the service.
3.10 Administration
As described in the sections above, processes are clustered into pools called server
classes. Typically, short-running or high-priority services are packaged together, and
batch or low-priority work is packaged in separate low-priority server classes.
After packaging the services, an administrator could assign security attributes to them.
Change is constant, and TP Monitors manage it on the fly. They can, for example,
automatically install a service by creating a server class for it (dynamic load
balancing). Even more interestingly, they can upgrade an existing service in place by
installing the new version, creating new server classes that use it, and gradually
killing off old server classes as they complete their tasks (on-the-fly software release
upgrades). Of course, the new version must use the same request-reply interface as
the old one.
This load-balancing feature can also be used during planned maintenance outages of
24x7 applications, shifting all load to a back-up machine.
The registration and start-up/shut-down of resources is unbundled from the
application, because the application uses services of the TP Monitor which are
mapped to available resources. Therefore resources can be added and removed on
the fly. There is no direct connection between a resource and an application (e.g. via
an IP number or DNS name of a database server).
Overall TP Monitors help to configure and manage client/server interactions. They
help system administrators to install, configure and tune the whole system, including
application services, servers and large populations of clients.
Administration features include:

Remote installation of middleware and applications

Remote configuration

Performance monitoring of applications, databases, network, middleware

Remote start up/shut down of application, server, communication link and
middleware

Third party tool support

Central administration

Fault diagnosis with alerts/alarms, logs and analysis programs.
Even though it should be possible to administer processes and load balancing manually, the
preferred option should always be an automatic, self-administering and self-healing
system.
3.11
Costs
TP Monitors help to reduce overall costs in large, complex application systems.
They do this in several ways:

Less expensive hardware. By doing optimized load-balancing together with
better performance, TP Monitors help to use resources more efficiently.
Moreover, by funneling client requests, the number of concurrent processes
running on the resources (databases) can be drastically reduced (typically by
a factor of 10). Therefore TP Monitor architectures have lower hardware
requirements.

Reduced downtime. Because of the robust application architecture, application
downtime is reduced, and with it the revenue lost during outages.

Reduced license costs. The funneling effect reduces the number of concurrent
open connections to the database server, therefore reducing expensive license
costs.

Development time savings. By delivering an application architecture and
forcing developers to build the system as a 3-tier architecture, system
development and maintenance time is reduced (according to the Standish Group, by
up to 50%).
According to the Standish Group, this may result in total system cost savings of more
than 30% over a data-centric (2-tier) approach.
3.12
3-tier architecture framework
TP Monitors provide a framework to develop, build, run and administer client/server
applications. Increasingly, visual tool vendors are integrating TP Monitors and
making them transparent to the developer.
TP Monitors deliver shared state handling (transaction context) to exchange
information between the services, freeing developers from this task.
They introduce an event-driven, component-based programming style on the server
side. Services are created, and only function calls or objects are exported, not the data
itself. This makes it possible to keep adding functions and let the TP Monitor distribute
them over multiple servers.
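To make this service-oriented programming style concrete, the sketch below shows a request/reply service and a client call using Tuxedo's ATMI interface (see section 4.1) as one possible example. The TOUPPER service is the classic Tuxedo sample; error handling, tpinit/tpterm and buffer management are reduced to a minimum.

#include <atmi.h>     // Tuxedo ATMI: tpcall, tpalloc, tpreturn, ...
#include <cctype>
#include <cstring>

// Server side: the exported unit is a service (a function), not the data.
// The TP Monitor routes requests to whichever server process hosts it.
extern "C" void TOUPPER(TPSVCINFO *rqst)
{
    for (long i = 0; i < rqst->len; ++i)
        rqst->data[i] = std::toupper(static_cast<unsigned char>(rqst->data[i]));
    tpreturn(TPSUCCESS, 0, rqst->data, 0L, 0);   // reply goes back via the monitor
}

// Client side: the caller only knows the service name, not the server's
// location (a real client would also call tpinit()/tpterm()).
int call_toupper(const char *text)
{
    long len = std::strlen(text) + 1;
    char *buf = tpalloc(const_cast<char*>("STRING"), nullptr, len);
    if (buf == nullptr) return -1;
    std::strcpy(buf, text);
    long olen = len;
    int rc = tpcall(const_cast<char*>("TOUPPER"), buf, 0, &buf, &olen, 0);
    tpfree(buf);
    return rc;   // 0 on success, -1 on failure (tperrno is set)
}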
They provide a clear 3-tier architecture framework. It is almost impossible for "lazy"
programmers to violate this paradigm. The application becomes strictly modularised
with decoupled components. This leads to a state-of-the-art architecture which fits best
into current object and component paradigms. Maintenance and re-use are best
supported by such architectures.
3.13
When not to use a TP Monitor
Despite the fact that a TP Monitor has a lot of advantages, you do not need it for all
kinds of applications, and there are also some drawbacks:

Few users. Even if you have a VLDB with complex data, you do not need a TP
Monitor if you have just a few concurrent users at any time. A TP Monitor does not
help in managing data, but it helps a lot in managing processes.

Increased complexity. Including a new component like a TP Monitor in the
architecture raises the complexity to be mastered (at least initially, and with respect
to the knowledge of the people involved), so you need the necessary expertise in your
development and administration team. With big systems, however, the overall
complexity is not raised but lowered, because the whole application structure becomes
better modularised and decoupled (see the section on the 3-tier architecture framework).

Vendor dependence. Because the architectures of the various TP Monitor systems
differ, it is not easy to switch from one TP Monitor to another. Therefore you are, to
some extent, locked into a one-vendor solution.
4
Commercial TP Monitors
The following comparison of the available TP Monitors is heavily based on a study by
Ovum Publications, "Distributed TP Monitors", February 1997 [11], and on recent
information from the product companies.
In general, all of the products follow the Open Group standard DTP architecture
described in Figure 3. They differ in their support of the XA interface between the
transaction manager and the resource manager, which is of the greatest importance
for the usage of a TP Monitor.
Some of them also integrate the transaction manager and the communication
resource manager into one component (BEA's Tuxedo). This has no importance
for the usage of the TP Monitor, because the interface between those two
components is only used internally and should have no impact on the applications
built on the TP Monitor. Moreover, this XA+ interface between those components was
never officially standardised.
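To make the application's view of the DTP model concrete, the following sketch uses the X/Open TX verbs for transaction demarcation; under the DTP architecture the transaction manager then drives the two-phase commit across the XA-registered resource managers. The actual resource-manager work is only hinted at in a comment, and the function name transfer_funds is illustrative.

#include <tx.h>   // X/Open TX interface: tx_open, tx_begin, tx_commit, ...

// One global transaction spanning whatever XA resource managers
// (databases, queues) the TP Monitor has been configured with.
int transfer_funds()
{
    if (tx_open() != TX_OK)          // connect to all configured resource managers
        return -1;

    if (tx_begin() != TX_OK) {       // start a global transaction
        tx_close();
        return -1;
    }

    // ... work against the XA resource managers goes here, e.g. SQL
    // statements issued through each DBMS's own API; the transaction
    // manager enlists them in the global transaction ...

    int rc = (tx_commit() == TX_OK) ? 0 : -1;  // 2PC across all RMs (or tx_rollback())

    tx_close();
    return rc;
}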
So the differentiation between the products should mostly be made on the basis of:

supported resources

supported communication protocols

platform support

company history, structure and future

Internet support

Object / Component support

Price

Easy usage

Future developments

market share

your expertise in similar/complementary tools and programming languages
4.1
BEA Systems Inc.'s Tuxedo
4.1.1
Summary
Key Points

Transaction control available via XA compliant resource managers, such as
Oracle, Informix, and others. Interoperates with MQSeries via third party or own
gateways. Can also interoperate with other TP environments via the XAP OSI-TP
standard and SNA (CICS) and R/3.

Runs on OS/2, MS-DOS, Windows 3, Windows 95, Windows NT, Apple
Macintosh, AS/400 and a wide range of UNIX flavours from the major system
providers. Supports Bull GCOS8, ICL VME, MVS IMS, Unisys A series, and
MVS/CICS via third-party or own gateways. (Note: GCOS8 and VME support may
not be present in the latest versions of the product, although third-party versions
may provide it.)

Directly supports TCP/IP, via sockets or TLI. Indirectly supports SNA LU6.2.
Strengths

Excellent directory services, with a labor-saving architecture and good
administrative support tools, all suited to large-scale deployment

Vast array of platforms (Hardware, Netware, Middleware) supported, with links
to some environments which have no other strategic middleware support, such as
GCOS and VME

All technology is now available from one supplier, with a correspondingly clear
strategy for development, support and evolution
Weaknesses

BEA still has some way to go to integrate all the technology and people it has
acquired - especially after buying TOP END from NCR.

Guaranteed delivery services on the communication services side are not well
developed

Load balancing services should be more automated
4.1.2
History
AT&T started the development of Tuxedo in 1979 as part of an application called
LMOS (Line Maintenance and Operation System). The product evolved internally
within Bell Labs until 1989, when AT&T decided to license the technology to OEMs
(value-added resellers).
In 1992, AT&T spun off the development of Unix, languages and Tuxedo into a
new group named Unix Systems Laboratories (USL). In 1993, Novell bought USL
and started to develop plans which involved the integration of Tuxedo with Novell's
Directory System and AppWare application development tools. These plans never
worked out.
In September 1994, Novell released version 5 of Tuxedo. Enhancements to the
product included support for DCE, extra platform support, a runtime trace feature,
dynamic data-dependent routing and the 'domain' feature - used by systems
administrators to configure Tuxedo servers into autonomous groups.
However, in February 1996, BEA Systems assumed all development, sales and
support responsibilities for Tuxedo. BEA was a start-up company specifically set up
to acquire and develop middleware technology in the transaction processing area.
Novell retained the right to develop the technology on NetWare platforms. BEA
acquired the rights to develop the technology on all other platforms.
Despite the somewhat confusing language of the announcement, BEA has effective
control of Tuxedo, the technology and its future development. It has obtained an
exclusive 'licence' to the technology in perpetuity. The entire Tuxedo development,
support and sales team - in effect the entire Tuxedo business unit - transferred to BEA
which now has about 100 to 150 developers working on Tuxedo, including many of
the developers from AT&T and Bell Labs.
Release 6.1 was released in June 1996 and added the event broker, an administration
API and ACLs.
In 1997, BEA Systems had revenues of $ 61.6 million.
In May 1998, BEA Systems announced that it would buy the competing TP Monitor
TOP END from NCR.
4.1.3
Architecture
Figure 8. Tuxedo architecture
The Tuxedo architecture includes the following components:

Core System provides the critical distributed application services: naming,
message routing, load balancing, configuration management, transactional
management, and security.

The Workstation component off-loads processing to desktop systems. This
allows applications to have remote clients, without requiring that the entire BEA
TUXEDO infrastructure reside on every machine.

Queue Services component provides a messaging framework for building
distributed business workflow applications.

Domains allow the configuration of servers into administratively autonomous
groups called domains.

DCE Integration is a set of utilities and libraries that allows integration between
Tuxedo applications and The Open Group's DCE.

BEA CICx is an emulator of the CICS transaction processing product which runs
on Unix.

BEA Connect provides connectivity gateways to other TP environments (via the SNA
LU6.2 and OSI-TP protocols).

BEA Builder covers tools that assist in the development and testing of Tuxedo-based applications.

BEA Manager is the administration component

BEA Jolt for Web integration
Figure 9. Tuxedo 3-tier architecture
Special features include:

Event Brokering System, which implements an event system based on the
publish-and-subscribe programming paradigm. This allows for the notification of
events on a subscription basis (a short sketch follows this list).

Security is offered via both 40-bit and 128-bit Link Level Encryption add-on
products to protect data on the network. It also supports service-level Access
Control Lists (ACLs) for events, queues, and services.

Cobol Support. A COBOL version of ATMI is provided.

Service Directory. The Bulletin Board, located on every server node
participating in an application, serves as the naming service for application
objects, providing location transparency in the distributed environment. It also
serves as the runtime repository of application statistics.

Internationalization. In compliance with The Open Group’s XPG standards,
users can easily translate applications into the languages of their choice.
Languages can also be mixed and matched within a single application.
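As a minimal sketch of the publish-and-subscribe model, the fragment below uses the ATMI event calls introduced with the event broker. The event names (STOCK_.* and STOCK_TRADE) are invented for illustration, and details of unsolicited-message delivery (e.g. tpchkunsol polling in workstation clients) are omitted.

#include <atmi.h>   // Tuxedo ATMI event broker calls: tppost, tpsubscribe, ...
#include <cstring>

// Handler invoked when a subscribed event is delivered to this client.
static void on_event(char *data, long len, long flags)
{
    // react to the notification, e.g. refresh a cache
}

// Subscriber: register interest in all events matching the expression "STOCK_.*".
long subscribe_to_stock_events()
{
    tpsetunsol(on_event);   // events arrive as unsolicited messages
    return tpsubscribe(const_cast<char*>("STOCK_.*"), nullptr, nullptr, 0);
}

// Publisher: post an event; the broker notifies all matching subscribers.
int post_stock_event(const char *text)
{
    long len = std::strlen(text) + 1;
    char *buf = tpalloc(const_cast<char*>("STRING"), nullptr, len);
    if (buf == nullptr) return -1;
    std::strcpy(buf, text);
    int rc = tppost(const_cast<char*>("STOCK_TRADE"), buf, 0, 0);
    tpfree(buf);
    return rc;
}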
4.1.4
Web Integration
BEA Jolt enables Java programs to make Tuxedo service requests from Java-enabled
Web browsers across the Internet (or intranets). The aim of Jolt is to get round the
restrictions of normal Internet communication.
Jolt consists of a collection of Java classes. It also replaces HTTP with its own
enhanced Jolt Transaction Protocol.
Figure 10. Jolt architecture
4.1.5
When to use

You are developing object-based applications. Tuxedo works with non-object
based applications, but it is especially suited to object based ones. In fact, you
cannot implement object-based applications involving transactions without
underpinning them with a distributed TP Monitor like Tuxedo (ORBs without
TP underpinning are not secure).

You have a large number of proprietary and other platforms which you need to
integrate. You use Oracle, DB2/6000, Microsoft SQL Server, Gresham ISAM-XA, Informix DBMSs or MQSeries, and you need to build transaction processing
systems that update all these DBMSs/resource managers concurrently.

You want to integrate Internet/intranet based applications with in-house
applications for commercial transactions.
4.1.6
Future plans
There are plans to enhance Tuxedo in the following key areas:

Java, internet integration – see newest release of Jolt 1.1: HTML client support
via BEA Jolt Web Application Services and JavaBeans support for BEA Jolt
client development via JoltBeans

Exploiting the Object Technology by creating an object interface to ATMI for
Java (BEA Jolt), COM, CORBA and C++ - the CORBA, COM and C++
interfaces are available using M3, Desktop Broker and BEA Builder for
TUXEDO Active Expert

EJB Builder, a graphical tool for building Enterprise JavaBeans (EJBs)
applications.


The Iceberg project (release date June 1998) includes an updated version of
BEA Tuxedo; a revised version of BEA ObjectBroker, formerly called Digital's
CORBA ORB; and an integrated pairing of the two. This will integrate ORBS
into Tuxedo.

additional security features such as link-level encryption, end-to-end
encryption, digital signatures and built-in support for off-the-shelf security
systems such as RSA - TUXEDO 6.4 provides most of these features.

More support is also planned for multi-threading and more automated load
balancing.

tighter integration with CICS

the BEA Manager will be supported on Windows NT and a browser-based
administration tool will be released.
Figure 11. Tuxedo future plans
4.1.7
Pricing
Tuxedo costs about $ 5,000 per development seat. Runtime charges per user per
Tuxedo runtime are about $ 550 for concurrent use and $ 125 for non-concurrent use.
Jolt is sold on a per-server basis and its cost is related to the number of users that can
access any server. The minimum likely charge is about $ 3500.
4.2
IBM's TXSeries (Transarc's Encina)
4.2.1
Summary
Key points

Based on DCE, X/Open, Corba II and JavaBeans standards

Runs on AIX, HP-UX, SunOS, Solaris, Digital Unix, Windows 3, Windows NT.
Third party versions from Hitachi (HI-UX), Stratus (FTX) and Bull (DPX). Two-phase commit access to MVS CICS through DPL and access to IMS through
Encina client on MVS

Supports TCP/IP and LU6.2 via integrated Peer-to-Peer gateway
Strengths

Adds considerable value to DCE

Extensive, well developed product set

Excellent service offering with well defined, helpful consultancy and technical
support

Encina++ provides distributed object support
Weaknesses

TXSeries supports only a small subset of the platforms DCE supports

Third-party tool support is poor and the TXSeries interface is not easy to
understand - the choices can be bewildering to a novice user

Limited automatic failover features (no automatic redirection of requests to new
server instances or new nodes).
4.2.2
History
Transarc has its roots in pioneering research into distributed databases, distributed file
systems and support for transaction-based processing, which was undertaken at
Carnegie Mellon university in the early 1980s.
Shortly after Transarc was founded in 1989, Hewlett-Packard, IBM, Transarc, Digital
and Siemens met under the auspices of the OSF to create the architecture that was to
become DCE. Transarc played a major role in providing the vision behind DCE.
Transarc's AFS became the DFS component of DCE and was released as the first
commercial version in 1994.
Encina in its first product version was released in 1992.
Transarc's close links with IBM resulted in an agreement with it that Encina should
form the foundation for CICS on other platforms besides MVS. A joint team was
formed to build CICS over Encina services, initially on AIX. About 60% of the code
bases of Encina and Encina for CICS is common, but there are, in effect, two products
for two markets (this is further explored in the CICS part).
IBM recently bought Transarc and bundled the two products Encina 2.5 and CICS 4.0
into a new product called IBM TXSeries 4.2. Also included in the product package
are MQSeries, Domino Go Web Server, and DCE servers and gateways.
4.2.3
Architecture
TXSeries is a distributed transaction processing monitor, based on the XA standard
and including support for both transactions and nested transactions. It supports two-phase commit across:

heterogeneous DBMSs, file systems and message queuing systems (such as IBM
MQSeries) supporting the XA standard

the queues (RQS) and record oriented file system (SFS) provided by TXSeries.

any LU6.2-based mainframe application (in the case of CICS this is full two-phase,
two-way transactional support through DPL (Distributed Program Link), which is a
higher-level call interface than LU6.2).
TXSeries provides synchronous and asynchronous support for client-server based
applications. It supports synchronous processing using remote procedure calls and
asynchronous processing using either its recoverable queuing service (RQS) or IBM
MQSeries, which is included in the TXSeries server package.
TXSeries is layered over DCE. The developer can access DCE services and TXSeries
has been built to take advantage of them. Services such as time, security, threads
support and directory are all provided by DCE.
Figure 12. TXSeries architecture
The Encina Monitor
The Encina Monitor is the core component of TXSeries and consists of run-time and
development libraries. It provides programmers with tools and high level APIs with
which to develop applications, as well as run-time services such as load balancing,
scheduling and fault tolerance services. The Encina Monitor comes with a GUI-based
management and administration tool, called Enconsole, which is used to manage all
resources in the DCE network (Encina servers, clients and DCE resources such as
cells) from a central point.
The Monitor also acts as the central co-ordination agent for all the services. For
example, it contains processing agents that receive and handle client requests, multi-threaded services and multi-threaded processes.
Further modules are:

Encina Toolkit is built from several modules including the logging service,
recovery service, locking service, volume service and TRAN - the two-phase
commit engine.

Recoverable queuing service (RQS) provides a message queue for message
storage.

Encina structured file server (SFS) is Transarc's own record-oriented file
handling system

Peer-to-peer communication (PPC) executive is a programming interface that
provides support for peer-to-peer communication between TXSeries-based
applications and either applications on the IBM MVS mainframe or other Unix
applications, using the CPI-C (common program interface communication) and
CPI-RR (common program interface resource recovery) interface.

Peer-to-peer communication (PPC) gateway provides a context bridge between
TCP/IP and SNA networks allowing LU6.2 'sync level syncpoint (synclevel2)'
communication.

DE-Light is a lightweight implementation of TXSeries which consists of three
components:
- the C client component, which runs on Windows
- the JavaBeans component which runs on any JDK 1.1-enabled browser
- the gateway, which runs on Solaris, AIX and HP-UX.
Figure 13. TXSeries interfaces
4.2.4
Web Integration
The DE-Light Web client enables any Web browser supporting Java to access
TXSeries and DCE services. The DE-Light Web client does not
need DCE or TXSeries on the client. It is implemented as a set of Java classes, which
are downloaded automatically from the Web server each time the browser accesses
the Web page referencing the client. DE-Light Web also has minimal requirements for
RAM and disk.
Figure 14. TXSeries Web Integration
4.2.5
When to use

You are already using, or will be happy to use, DCE

Your programmers are familiar with C, C++ or Corba OTS

You use Oracle, DB2/6000, MS SQL Server, Sybase, CA-Ingres, Informix,
ISAM-XA, MQSeries and/or any LU6.2-based mainframe transaction, and you
need to build transaction processing systems that update or inter-operate with all
these DBMSs/resource managers concurrently.

You need to build applications that enable users to perform transactions or access
files over the Internet - all Transarc's Web products are well thought out and
useful.
4.2.6
Future plans
Future development plans include:

Integration with Tivoli's TME, the systems management software from IBM's
subsidiary

Integration with Lotus Notes (available as sample code today)

Enhancement of the DE-Light product. Full JavaBeans client support is currently
available; there are plans to provide an Enterprise JavaBeans environment later
this year, which will be integrated with TXSeries in the future.

TXSeries currently provides Corba 2.0 OTS and OCCS services. These work
with IONA's ORB today and there are plans to support other ORBs, such as
IBM's Component Broker, in the near future.

Available since 4Q97, IBM provides direct TXSeries links to IMS, which will
enable IMS transactions to become part of a transaction controlled using two-phase commit and specified using TxRPC.

Transarc will provide tools for automatic generation of COM objects from the
TXSeries IDL in the next release. Today, integration with tools like PowerBuilder,
Delphi, etc. is achieved through calls to DLL libraries.

Support for broadcast and multi-cast communication.
4.2.7
Pricing

Product                               Price
TXSeries Server                       $ 3,600
TXSeries Registered User              $ 80
TXSeries Unlimited Use (mid-tier)     $ 16,500

4.3
IBM's CICS
4.3.1
Summary
Key Points

Supports NetBIOS (CICS clients only), TCP/IP (except the ESA, AS/400 and
VSE platforms) and SNA (all platforms)

CICS servers run on AIX, HP-UX, Solaris, Digital Unix, Sinix, OS/2, Windows
NT, ESA, MVS, VSE and OS/400

CICS clients run on AIX, OS/2, DOS, Apple Macintosh, Windows 3, Windows
95 and Windows NT
Strengths

Development environment (API, tools, languages supported) and communication
options for programmer well developed

Context bridging and message routing support provided to bridge SNA and
TCP/IP environments

Worldwide IBM support, training and consultancy available
Weaknesses

Directory services are weak and likely to need high administrator input to set up
and maintain

Central cross-platform administration of the environment is weak

Underlying security services are not provided in a uniform way across all the
environments and lack encryption facilities
4.3.2
History
Over the years IBM has produced three transaction monitors - CICS (Customer
Information Control System), IMS (Information Management System) and TPF
(Transaction Processing Facility), but CICS has become the dominant one.
During the 1960s IBM set up a team of six people at IBM Des Plaines to develop a
common software architecture on the System/360 operating system; this became
CICS. The product was initially announced as Public Utility Customer Information
Control System (PUCICS) in 1968. It was re-announced the following year at the time
of full availability as CICS. In 1970, the CICS team moved to Palo Alto, California.
In 1974, CICS product development was moved to IBM's Hursley Laboratory in the
UK. During the late 1970s and early 1980s, support for distributed processing was
added to CICS. The main functions added were: transaction routing, function shipping
and distributed transaction processing. Support for the co-ordination of other resource
managers as part of a CICS transaction was introduced in the late 1970s.
The first version on Unix to be released was CICS/6000, based on Encina from
Transarc and DCE. Only parts of the Encina product and DCE were used: IBM
estimates that between 40% and 50% of the resulting code base is new code, the
remainder being Encina and DCE.
The Windows NT version of CICS was originally based on the OS/2 code base, but is
currently being transferred to the Encina code base.
CICS development is now located at IBM's Global Network Division.
4.3.3
Architecture
CICS comprises a CICS core, application servers and listener services. The
listener services handle the communications links on the network, apply any security
and collect buffers received from the network software, breaking the buffers down
into their components.
Each instance of CICS has a schedule queue: a shared memory queue, invisible to the
programs but used within CICS to handle the scheduling of requests. As requests are
received by the listeners, they place the requests on the schedule queue. CICS does
not have a 'transmission queue' on the sending end to store requests.
One CICS instance on a machine (there can be more than one CICS instance on
mainframes) can handle the messages or requests destined for all the programs, CICS
files, or CICS transient data queues which are using that CICS instance. Thus, the
scheduling queue will contain requests and messages from multiple programs,
destined for multiple programs, files or queues.
The CICS instance can also contain one or more application servers. These
components handle the dequeuing of messages and requests from the scheduling
queue and the transfer of these messages to the appropriate program, file or queue. An
application server is not 'allocated' to a specific application program, file system or
transient queue. Each application server simply takes whatever happens to be the next
message on the schedule queue and then processes it.
In essence, the scheduling queue is organised in first-in-first-out order. CICS supports
prioritisation of messages on the ESA machine, but this is only used to despatch
messages, not to process them when a message is received.
Once the application server has taken the message off the queue, it will load the
program where necessary and then wait until the program has completed its activity,
the action has been completed by the file system or the message has been placed on
the transient queue. Once the program has finished processing, the application server
will clean up any data in memory and then go back to see if there are any messages on
the schedule queue for it to process.
Once there are more messages/requests on the schedule queue than a specified limit
(normally ten), new application servers are spawned automatically. When there are
fewer messages than the limit, application servers are automatically taken out of
service.
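The spawn/retire rule just described can be summarised in a few lines. The sketch below is purely illustrative C++ (the ServerPool structure and adjust function are invented for this annex) and does not reflect any actual CICS interface or data structure.

#include <cstddef>

// Illustrative only: the threshold-based spawn/retire rule described above.
struct ServerPool {
    std::size_t servers;   // currently running application servers
    std::size_t queued;    // requests waiting on the schedule queue
    std::size_t limit;     // threshold, "normally ten"
};

// Called on enqueue/dequeue (or periodically) to adjust the pool size.
void adjust(ServerPool& pool)
{
    if (pool.queued > pool.limit)
        ++pool.servers;                     // spawn a new application server
    else if (pool.queued < pool.limit && pool.servers > 1)
        --pool.servers;                     // take one application server out of service
}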
Further components include:

directory service

CICS file system (a record-based file-handling system)

temporary storage (memory-based or disk-based queue)

transient data (sequential files outside the file control functional area, which
can be shared between transactions/tasks)

memory-based semaphore (CICS-controlled area of memory, which acts like a
semaphore - not under transaction control)

shared temporary storage.
Figure 15. CICS architecture
Special features include:

New Java functionality. CICS transaction capability from any Java-enabled
browser.

Integration with Lotus Domino and the Internet.
4.3.4
Web integration
IBM has two products to integrate CICS with the Web.
CICS Internet Gateway
A recently released component enables CICS applications to be accessed via the Web.
The user uses a normal Web browser to access the Web server; the CICS Internet
Gateway is then used to interface between CICS client applications and the Web
server. The CICS client can be AIX or OS/2.
CICS/Java Gateway
The CICS/Java Gateway enables Java applets to be automatically downloaded from a
Web server via a Web page. As shown in Figure 16, the CICS ECI applet then
connects directly to a CICS/Java Gateway, while running on the client machine
within the Java Virtual Machine. The CICS/Java Gateway uses a secure
sockets connection to transmit data and ECI calls. The CICS/Java Gateway can run on
OS/2, Windows NT and AIX.
Figure 16. CICS Web Integration
4.3.5
When to use

You are already a big user of CICS and intend to be, in the future, a user of
predominantly IBM machines and operating systems. Note that if you intend to
use non-IBM operating systems you must be prepared to use DCE.

You are attracted by IBM's service and support network.

You are prepared to dedicate staff to performing all the administrative tasks
needed to ensure CICS is set up correctly and performs well across the company.

You do not need an enterprise-wide, self-administering solution supporting a
large range of different vendors' machines.
4.3.6
Future plans
CICS Systems Manager
IBM wants to harmonise all the different versions of the CICS Systems Manager so
that there is one unified interface, preferably GUI-based. All administration should be
done from a central remote console.
CICS Java-based applet
Due early in 1997, the CICS External Call Interface is used to enable a Java applet to
access CICS servers through Web browsers. Read-only access is provided to systems
over TCP/IP and SNA networks. By late 1997, a Java interface should be available on
CICS servers, enabling developers to write client-server, CICS-based Java
applications.
Support for data format translation
IBM is thinking of providing support for self-describing message data and for
messages which are more than 32Kb long. This will enable more flexible support for
messages and will allow the format of those messages to be translated. IBM is
considering the use of a type of data definition language to describe the message
content. It could thus borrow both the ideas and technology available in distributed
databases and use them to do the conversion.
Support for dynamic load balancing
CICS only supports load balancing on the ESA and AIX platforms. IBM would like to
extend this facility to all the platforms and add to the support so that tasks could be
automatically created.
Better security support
IBM is planning support for encryption and for more unified security methods such as
Secure Sockets. It is also investigating the use of DCE's GSSAPI.
4.3.7
Pricing
IBM packages its products into what are termed 'transaction servers'. These servers
are part of a range of server components, including database server and
communications server. Therefore no separate prices can be stated.
4.4
Microsoft Transaction Server MTS
4.4.1
Summary
Key Points

MTS is Windows NT-only. MTS extends the application server capabilities of
NT and uses features of NT for security and robustness.

Synchronous communication support via DCOM, DCE RPC and asynchronous
store and forward features of MSMQ.

Support of SQL Server 6.5 and 7, and ORACLE 7.3 via ODBC-interfaces.
Strengths

Adding transactions and shared data to COM-Objects.

Easy integration of DCOM on the client side.

Simple COM-API (only three methods: GetObjectContext, SetComplete and
SetAbort).

Easy administration. The drag-and-drop GUI management console is tightly
integrated into the Microsoft Management Console (MMC).

It is cheap. There is no additional software cost because it is bundled with the NT
Server 5.0.
Weaknesses

Limited support of the standard XA resource interface.

Poor transaction recovery.

Poor cross-platform support for diverse networks and existing systems.
4.4.2
History
A few years ago, Microsoft hired some of the best and brightest minds in transaction
processing, including Jim Gray, who literally wrote the book on it, and set them to
work on a next-generation TP monitor. The result was Microsoft Transaction Server
(MTS). Version 1.0 was released in January 1997.
MTS is an ActiveX-based component coordinator. It manages a pool of ODBC
connections and COM object connections that clients can draw from.
MTS focuses on transactions for COM objects and supports developers with an easily
manageable tool, thereby lowering the barrier for developers to use a transaction
monitor. It was targeted toward Visual Basic applications running as ActiveX
components under IIS (Internet Information Server).
In December 1997 Microsoft released a Windows NT Option pack which includes
MTS 2.0. It adds support for the Message Queue Server (MSMQ), transactional
Active Server Pages (ASP) for IIS, and support for ORACLE 7.3. MTS 2.0 was
integrated fully into NT Server 5.0.
It is positioned against JavaBeans and CORBA by "naturally" enhancing the DCOM
model with transactions.
4.4.3
Architecture
MTS consists of the following components:

MTS Explorer is the management console to create and manage packages,
configure component properties such as security and transaction behavior, and
monitor/manage operating servers. MTS Explorer can run as a snap-in to the
Microsoft Management Console (MMC).

Resource dispensers create and manage a reusable pool of resource connections
automatically.

Automatic Object Instance Management extends the COM object model with
just-in-time activation where components only consume server resources while
they are actually executing.

Shared Property Manager is a special-purpose resource dispenser that enables
multiple components to access the same data concurrently.

Distributed Transaction Coordinator (DTC) is responsible for the transaction
management (2-phase-commit, etc.).

Microsoft Message Queue Server (MSMQ) provides a transaction-aware
message queue.

SNA Server 4.0 Integration via COM-based interfaces to mainframe
applications.
Figure 17. MTS architecture
The client side of MTS is totally integrated into Windows 95. Therefore no special
MTS client is necessary.
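As a minimal illustration of the three-method programming model mentioned under Strengths, the sketch below shows a C++ component method obtaining its MTS object context and voting on the transaction outcome; the DoWork function and its business logic are placeholders.

#include <windows.h>
#include <mtx.h>        // MTS context interfaces: GetObjectContext, IObjectContext

// A method of an MTS-hosted COM component. MTS decides, based on the
// component's transaction attribute, whether a transaction is running.
HRESULT DoWork()
{
    IObjectContext* ctx = nullptr;
    HRESULT hr = GetObjectContext(&ctx);      // obtain the context MTS created for us
    if (FAILED(hr))
        return hr;

    // ... perform work against ODBC/resource dispenser connections here ...
    bool ok = true;                           // outcome of the (placeholder) business logic

    if (ok)
        ctx->SetComplete();                   // vote to commit and allow deactivation
    else
        ctx->SetAbort();                      // vote to roll back the transaction

    ctx->Release();
    return ok ? S_OK : E_FAIL;
}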
4.4.4
Web Integration
MTS 2.0 is tightly integrated with IIS 4.0. This makes it possible to mark Active Server Pages
(ASPs) as transactional, using the same transaction settings that system administrators
assign in the MTS Explorer. IIS can run in its entirety within an MTS-managed
process, or individual ASP applications can run in separate processes.
4.4.5
When to use

You are building a multi-tier (especially Internet) application based on
Microsoft Backend Server Suites.

Your system architecture is built upon the DCOM architecture.

Your developers have good Visual Basic, VisualJ++ and NT knowledge.

You are building a complex business application which stays in the Microsoft
world only.
4.4.6
Future plans
Not known.
4.4.7
Pricing
MTS 2.0 is included as a feature of the Microsoft Windows NT Server. There is no
additional charge.
4.5
NCR TOP END
4.5.1
Summary
Key Points

Oracle, Informix, Sybase, Teradata, CA-Ingres, Gresham's ISAM-XA, MS SQL
Server, DB2/6000 and MQSeries are supported via XA

Supports many flavours of Unix including AIX, HP-UX, Sun Solaris and NCR
SvR4. IBM support for MVS, AS/400 and TPF is provided via gateways or
remote server. Client-only support is provided for OS/2, MS-DOS and Windows.
Remote server support is also provided for Windows 95 and OS/2

Supports TCP/IP and OSI TLI. Support is provided for LU6.2 via gateway
Strengths

Excellent directory services, with a labor saving architecture and good
administrative support tools, all suited to large scale, enterprise-wide deployment

Highly developed automatic load balancing and restart/recovery facilities, again
providing a labor saving administrative environment and likely to guarantee high
availability and performance

NCR has found a way to circumvent the current limited support for XA by using
its own 'veneers'
Weaknesses

Some key platforms are not supported (for example, Digital's OpenVMS), or are
only supported via remote clients and/or remote servers

Limited support for guaranteed delivery
4.5.2
History
In the early 1970s when NCR first started to ship mainframe computers, its staff in
countries such as the UK and Switzerland decided that the machines needed to be
supported by a mainframe-class TP Monitor that worked across the range of machines.
NCR therefore developed a TP Monitor called TranPro, which it provided for all
its proprietary systems.
TranPro evolved to become MultiTran, a TP Monitor that could support the NCR
9800 and clustered environments with added functionality such as failover.
When the NCR 3000 range was launched in 1990 and NCR decided to move away
from its proprietary operating systems to Unix, it realised that MultiTran would need
to be adapted and re-developed to support not only the Unix operating system, but
large networks of distributed computers. MultiTran was consequently re-developed to
support distributed environments and parallel architectures and the resulting product
was renamed Top End. It was released in late 1991.
At that time developments were affected by AT&T's ownership of the competing
Tuxedo product. Despite the rivalry between Tuxedo and Top End, development and
enhancement of Top End continued. Once AT&T sold USL to Novell in 1993,
however, a notable increase in the pace of Top End development took place.
Top End on Windows NT (release 2.03.01) was available at the beginning of 1996.
In 1996, AT&T allowed NCR to become an independent public company.
In May 1998, BEA Systems, owner of the competing product Tuxedo, announced that it
would buy TOP END from NCR.
4.5.3
Architecture
Top End consists of the following components:

Top End Base Server software runs on every server node in the network where
there are Top End clients connected or where there are Top End applications.
Top End Base Server software is also required on any node that is being used to
connect to IBM hosts, or to drive the SNA open gateway. Security services are an
optional component that can be added to the Base Server software

Top End Client software is needed on any client machine running an application
that issues Top End transaction requests. It also includes software that can handle
PC and 3270 terminal screen support

Top End connections provide connectivity to mainframe and other server
platforms.

Node Manager is a collection of processes that run on each server and provide
the core middleware services such as message routing and queuing, transaction
management, security checking, runtime administrative services, as well as
exception logging, alert and failure recovery.

Network Interfaces provide communication services between node managers.

Remote Client Services extends client application services and APIs to remote
networked workstations.

Remote Server Services extend server application services and APIs to remote
networked workstations and various server platforms.

Network Agents extend all application services to the Remote Client Services
and the Remote Server Services platforms.

Login Clients use a library of terminal support routines and TOP END screen
formats to facilitate communications between a character-mode terminal (client)
and a TOP END application program (service).

Tools for developing and configuring applications, and managing the enterprise.

Administration tools are used to perform component start-up and shutdown,
manage auditing and recovery, activate communication links, monitor application
alerts, and other distributed services.
Figure 18. TOP END architecture
Special features include:

Automatic Recovery from application failures, transaction failures, network
failures, and node failures. Components are monitored by the system monitor.
When a component fails, the system monitor notifies the node manager and
restarts the failed process.

Data warehousing. NCR is a leading provider of large-scale data warehousing
solutions with its Teradata database software. TOP END's transaction monitor
brings OLTP transaction scalability to this data warehouse.
4.5.4
Web Integration
Java remote client is aimed at organisations wanting to support commercial
transactions on the Internet using the Web.
The Internet and its protocols are normally unable to recognise the state of a
transaction, which makes the multi-step, often complex interactions that take place
within a transaction largely impossible to support. Java remote client solves this
problem by combining the distributed transaction processing capabilities of Top End
with software that can sustain transaction interactions over the Web using browsers.
Although Web browsers and servers can be used directly with Top End, NCR believes
that the combination of the Java remote client with the Web browser is a more robust
and high performance solution for transaction processing and other applications.
Java remote client is written in the Java language and is supplied as a Java applet.
The Enterprise ActiveX controls (TEC) are currently available.
Figure 19. TOP END web-integration
4.5.5
When to use

You need a strategic, high performance, high availability middleware product
that combines support for synchronous and asynchronous processing with
message queuing, to enable you to support your entire enterprise.

You use TCP/IP, Unix (AIX, HP-UX, Solaris, Dynix, Sinix, IRIX, NCR SvR4,
U6000 SvR4, Olivetti SvR4, SCO UnixWare, Digital Unix, Pyramid DC/OSx),
Windows (NT, 95 or 3), OS/2, MVS, AS/400 or TPF.

You need distributed transaction processing support for Oracle, Informix,
Sybase, Teradata, CA-Ingres, Gresham's ISAM-XA, Microsoft SQL Server or
DB2/6000.

Your programmers use C, Cobol or C++, Oracle Developer/2000, NatStar,
Natural, PowerBuilder, Informix 4GL, SuperNova, Visual Basic, Visual C++ (or
any other ActiveX compliant tool), Java and Web browsers.
4.5.6
Future plans
Enhancements include:

Remote clients - support will be provided for message compression and
encryption from remote clients.

Top End development environment - support will be provided for Enterprise
ActiveX controls

Security - enhancements will be added within the Windows NT environment to
match those in the Unix environment

Interactive systems definition - this type of support will be added for XR, MSR,
YCOTS and BYNET.

More 'data warehouse enablers'

Support for very large systems, with the focus on scalability and availability.

Support of an enterprise component model for "plug-and-play" application
development with CORBA and JavaBeans. A key development here is the TOP
END Enterprise JavaBeans Server (EJB).
4.5.7
Pricing
NCR has a tiered pricing model. Machines are split into groups based on their
theoretical relative processing power. All machines within a category (whatever make
and operating system) are then priced the same. Pricing ranges from $ 2,700 to $
150,000.
4.6
Itautec's Grip
4.6.1
Summary
Key Points

Grip can control transactions which access local Btrieve 5 & 6 (Windows NT
and NetWare), Oracle (Windows NT and NetWare), SQL Server (Windows NT),
Faircom (Windows NT), and Sybase version 10 (NetWare and Windows NT)
databases.

Grip clients run on Windows 3.11 and MS-DOS version 3, NetWare, Windows
95 and Windows NT (Servers or Workstations). Grip Servers can run under
Windows NT, NetWare 3 & 4, NetWare SFT III or SCO Unix. Access is
available to IBM hosts (MVS or AS/400) via SNA LU6.2 and a gateway

Grip supports IPX/SPX, TCP/IP, X.25 and SNA LU6.2
Strengths

Simple, easy-to-use API and development environment

Easy-to-understand architecture and configuration

Provides a design geared towards fast transaction throughput
Weaknesses

No dynamic directory facilities

Support restricted to a small number of key platforms

No support of the standard XA resource interface.

Limited support services outside South America and Portugal
4.6.2
History
Grip was designed by Itautec Informatica, Brazil's largest IT company, to handle the
10 million transactions per day processed by the country's second-largest private
bank, Itau.
The development of Grip was driven largely by the nature of Brazilian politics,
coupled with the state of Brazil's economy in the 1980s. At that time Brazil had
protectionist policies, with self-imposed restrictions which largely prevented the use
of external technology. Its economy, however, was in severe trouble, with inflation
exceeding 30% per month. Tremendous pressures were placed on Brazil's banking
systems to support the need to put cash immediately into electronic form, in order to
help to protect it from inflation (via interest rates) and move it around the banking
system quickly and reliably.
Work started on Grip in 1982. Grip technology was sold during the late 1980s and
early 1990s as part of a package of banking and other commercial solutions developed
by Itautec. In 1993, Itautec decided to decouple what became Grip from these
packaged solutions.
4.6.3
Architecture
Grip has two main components: a Grip Server component and a Grip Client
component.
Grip Server
Most of the modules on the server can be multi-threaded.

Grip Start-up module and tables provides a configuration file which contains
tables and parameters used on start-up.

Application manager manages the application execution, including handling
routines, statistics information and database transaction control.

Time scheduler controls and activates all the time scheduled transactions.

Message managers (two managers, Grora and Groea) manage all the incoming
and outgoing messages.

Server queues, which are invisible to the programmer and only used internally.

Communications modules allow messages to be sent across a network, while
hiding the underlying network protocols.
Figure 20. Grip Server architecture
Grip Client
The client is intended to be a complete environment for both running and creating
client applications on Windows. The Grip Client components include a GUI screen
generator, Microsoft VBXs, OCXs and Microsoft Foundation Classes.
The communication modules on the client provide a programming interface which can
be used to open and close sessions and to send and receive messages.
Figure 21. Grip client architecture
4.6.4
Web Integration
No support for the Internet.
4.6.5
When to use

You need a DTPM which is capable of supporting a cost-effective, stand-alone or
locally distributed application, which may exchange data with a central
mainframe.

You want to develop these applications on Windows NT or NetWare servers.

Your hardware and network configurations are relatively stable.

The DBMSs you intend to use are Oracle, Sybase, SQL Server, Btrieve or
Faircom.
4.6.6
Future plans
The plans that Itautec has for Grip cover three areas:

Integration of Grip for Windows NT with the Windows NT event viewer,
performance monitor, NT login process and security system. This enhancement is
'imminent'

Integration of Grip on NetWare with Novell's NDS (NetWare Directory
Services).

Integration of Grip on Windows NT with Exchange so that messages can be received by
and sent to users.
4.6.7
Pricing
Prices range from $ 4,000 to $ 31,000. A typical customer can expect to spend an
additional 10-25% on interfaces, for example, to Sybase, TCP/IP and NetWare
(NetWare for SAA), depending on the configuration of the system.
5
Analysis and recommendations
5.1
Analysis
The Standish Group recommends the use of TP Monitors for any client/server
application that has more than 100 clients, processes more than five TPC-C type
transactions per minute, uses three or more physical servers and/or uses two or more
databases.
TP Monitors force the building of robust 3-tier applications. The Internet is
making 3-tier client/server applications ubiquitous and therefore creates a huge
demand for this kind of middleware technology.
As component-based middleware becomes dominant, support of transactional objects
becomes a must for robust applications. This will clearly govern the future
development of TP Monitors. These next-generation monitors, which extend transaction
control over components and objects, are the so-called Object Transaction Monitors
(OTMs).
5.2
Recommendations
The most mature products are Tuxedo, Encina, TOP END and CICS. Grip and MTS
lack some features and standards support.
If you are looking for enterprise-wide capacity, consider Top End and Tuxedo. If your
project is medium-sized, consider Encina as well. If you are looking for a product to support
a vast number of different platforms, then Tuxedo may be the product to choose. If
DCE is already used as the underlying middleware, then Encina should be considered.
MTS and Grip are low-cost solutions. If cost is not an issue then consider Tuxedo,
TOP END and Encina. Internet integration is best for MTS, Encina, Tuxedo and TOP
END.
Regarding support of objects or components MTS is clearly leading the field with a
tight integration of transaction concepts into the COM component model. Tuxedo and
Encina will support the competing CORBA object model from the OMG.
The market for TP Monitors seems to be consolidating. On the one hand,
Microsoft has discovered the TP Monitor market and will certainly gain a big portion
of the NT server market. On the other hand, the former TP Monitor competitors are
merging, which leaves only IBM (CICS and Encina) and BEA Systems (Tuxedo and
TOP END) among the established vendors.
The future will depend heavily on which object and component models (such as
DCOM, CORBA and JavaBeans) the market chooses, and on easy access to integrated
development tools.
References
[1] BEA Systems, www.beasys.com; www.beasys.com/action/tpc.htm
[2] Edwards, Jeri; DeVoe, Deborah, "3-tier client/server at work", John Wiley & Sons, Inc., 1997
[3] Frey, Anthony, "Four DTP Monitors Build Enterprise App Services", Network Computing Online, techweb.cmp.com/nc/820/820r1.html
[4] Gray, Jim, "Where is Transaction Processing Headed?", OTM Spectrum Reports, May 1993
[5] Gray, Jim; Reuter, Andreas, "Transaction Processing: Concepts and Techniques", Morgan Kaufmann, 1993
[6] IBM, www.ibm.com
[7] Itautec Philco SA - Software Products Division, www.itautec.com.br
[8] Microsoft, "Transaction Server - Transactional Component Services", December 1997, www.microsoft.com/com/mts/revguide.htm
[9] NCR Corporation, www.ncr.com
[10] Open Group, www.opengroup.org
[11] Ovum Publications, "Distributed TP Monitors", February 1997, www.ovum.com
[12] Transaction Processing Performance Council, www.tpc.com
[13] Transarc Corporation, www.transarc.com
Part 2 Retrieval and Manipulation
1
Introduction
1.1
General architecture of distributed Databases
A distributed database is a collection of databases distributed over different nodes in a
communication network. Each node may represent different branches of an
organisation with partly autonomous processes. Each site participates in at least one
global application, which may be responsible for the exchange of data between the
different branches or for synchronising the different participating database systems
(Figure 1).
Figure 1: DBMS distributed in a communication network
Applying distributed databases in organisations with decentralised branches of
responsibility naturally matches most of the required functionality of a
decentralised information management system. Organisational and economic
motivations are probably the most important reasons for developing distributed
databases [2]. The integration of pre-existing databases in a decentralised
communication environment is probably less costly than the creation of a completely
new centralised database. Another reason for developing distributed databases is the
easy extensibility of the whole system.
1.1.1
Components of a distributed DBMS
A distributed database management system supports the creation and maintenance of
distributed databases. In addition to the components of a centralised DBMS, a
distributed DBMS contains additional components which extend its capabilities by
supporting communication and cooperation between several DBMS instances installed on
different sites of a computer network (Figure 2). The software components which are
typically necessary for building a distributed database are, according to [2]:

DB: Database Management Component

DC: Data Communication Component

DD: Data Dictionary

DDB: Distributed Database Component
Figure 2: Components of a distributed DBMS
The components DB, DC and DD represent the database management system
of a conventional non-distributed database, but the data dictionary (DD) has to be
extended to represent information about the distribution of data in the network. The
distributed database component (DDB) provides one of the most important features:
remote database access for application programs.
Some of the features supported by the above types of components are:

Remote database access

Some degree of distribution transparency; there is a strong trade-off between
distribution transparency and performance

Support for database administration and control: this feature includes tools for
monitoring, gathering information about database utilisation etc.

Some support for concurrency control and recovery of distributed transactions.
One important aspect in a distributed DBMS environment is the degree of
heterogeneity of the DBMSs involved. Heterogeneity can be considered at different
levels in a distributed database: hardware, operating system and the type of local
DBMS.
As DB software vendors offer their products for different hardware/operating systems,
heterogeneity problems of these types usually do not have to be considered;
heterogeneity at these levels is managed by the communication software.
When a distributed DBMS is developed without pre-existing local DBMSs, the design may be performed top-down. In this case it is possible to implement a harmonised data model for all involved sites; this type of development results in a homogeneous distributed DBMS.
In case of pre-existing local databases, one type of heterogeneity has to be considered
in building a distributed DBMS: the required translation between the different data
models used in the different local databases. This translation realises a global view of
the data in a heterogeneous distributed DBMS.
1.1.2
Distributed versus Centralised databases
Distributed databases are not simply distributed implementations of centralised databases [2]. The main features expected of a centralised DB have to be provided in a different way in the case of a distributed DBMS:
- Centralised Control
  The idea of centralised control, which is one of the most important features of centralised DBs, has to be replaced by a hierarchical control structure based on a global database administrator, who is responsible for the whole database, and local database administrators, who are responsible for their local databases. This so-called site autonomy may be realised in different degrees: from complete site autonomy without any centralised database administrator to almost completely centralised control.
- Data independence
  Data independence guarantees to the application programmer that the actual organisation of the data is transparent to the applications: programs are unaffected by changes in the physical organisation of the data. In a distributed DB environment data independence has the same importance, but an additional aspect has to be considered: distribution transparency.
- Distribution transparency
  Distribution transparency means that programs can be written as if the databases were not distributed, so moving data from one site to another does not affect the DB applications. Distribution transparency is realised by additional levels of DB schemata.
- Reduction of redundancy
  Contrary to the design rules of traditional databases, data redundancy in distributed databases is a desirable feature: the accessibility of data may be increased if the data is replicated at the sites where the applications need it. Additionally, the availability of the system is increased: in case of site failures, applications may work on replicated data at other sites. But, as in the traditional environment, identifying the optimal degree of redundancy requires several evaluation techniques. As a general statement, replicating a data item pays off when the ratio of retrieval accesses to update accesses is high, because keeping replicated data consistent makes updates more expensive.
1.2
General architecture of federated Databases
In [3] different definitions for distributing data and applications on database systems
are identified (see Figure 3).
A centralised DBS consists of a single centralised database management system
(DBMS), which manages a single database on the same computer system. A
distributed DBS consists of a single distributed database management system
(DBMS) managing multiple databases. The databases may differ in hardware and
software, may be located on a single or multiple computer system and connected by a
communication infrastructure. A multidatabase system (MDBS) supports operations on multiple DBSs, where each component DBS is managed by a "component database management system (DBMS)".
The degree of autonomy of the component DBSs within an MDBS identifies two classes of MDBS: nonfederated database systems and federated database systems. In a nonfederated database system the component DBMSs are not autonomous; for instance, a nonfederated DBS does not distinguish between local and nonlocal users.
The component DBSs in a federated database system participate in the federation as autonomous components managing their own data, while allowing controlled sharing of their data between the component DBSs.
Figure 3: Components of a DBS according to [3]
The cooperation between the component DBSs allows different degrees of integration. A FDBS represents a compromise between no integration (users must explicitly interface with multiple autonomous databases) and total integration (users may access data through a single global interface, but cannot directly access a DBMS as a local user).
1.2.1
Constructing Federated Databases
In general there are different interests in distributed databases. Some of these are (1-3): (1) The opportunity to implement new architectures which are distributed according to their conceptual nature; this is often the case for client/server systems. (2) The opportunity to make distributed implementations of conceptually non-distributed systems in order to achieve efficiency. For instance, it might be desirable to implement advanced query searching on a conceptually large entity (table) by distributing the table among many computers and letting each computer do the query search on part of the entity. (3) There may be situations where data are available on
a distributed system by nature, but where it is more desirable to consider these data as non-distributed. For instance, this might be the case when two companies (or two divisions in one company) wish to join forces and hence data. In such a case the companies may choose to design a federated database in order to be able to share data.
A federated database is also called a multidatabase. The term "federated" is mainly used in the research world, while the term "multidatabase" is used in commercial products. A federated database arises by considering a set of databases as one (federated) database. The databases might very well exist on different computers in a distributed system.
As an example we can think of a fictitious commercial telephone company comprised of two more or less independent divisions. Let us assume that one division is responsible for providing the customers with mobile telephone services, and the other division is responsible for providing the necessary network for static desk telephone communication. It is then very likely that each of these divisions has its own system to handle customer orders and billing (billing system). That is, at the end of the month each customer receives two bills: one bill covering the customer's expenses for using the mobile telephone (printed by the first division's billing system) and another bill covering the expenses for using the static telephone (printed by the second division's billing system). But it might be more desirable, both from the company's and the customer's point of view, that the customer receives only one bill each month, covering both the mobile and the static telephone expenses. If the company decides to implement this, it might choose to implement a whole new billing system integrated across the two divisions. However, this solution might be very difficult. One reason is that customer billing directly or indirectly must be based on call detail records (CDRs) from the networks implementing the telephone services. Most likely the networks implementing the mobile and static telephone services have different topologies and internal communication protocols. (The mobile network might be an extension of the static network.) So the CDR formats differ from one network to the other. Assuming that each division uses a database (for instance comprised of a set of tables / entities) to keep track of CDR collections, the two databases (one for each division) will likewise represent heterogeneous information. For instance, this means that the entities (tables) used by the two databases have different entity formats (record structures). The implementer of the new billing system therefore has two choices (1-2): (1) Restructure the telephone service networks to unify the CDRs and merge them into one database instead of the existing two databases. (2) Base the new billing system on the existing databases. Restructuring (1) the networks is probably impossible and undesirable. A new billing system must therefore more or less be based on the existing databases (2). In practical terms: the job of the new billing system is to make the information in the existing databases look uniform on the customer bill. The billing system might do this by implementing a new view of the databases as a federated database.
In the rest of this section we deal with some of the aspects of implementing federated database systems. To some extent we also discuss how to express queries in such systems and how to implement query searching. A federated database will probably be implemented using concepts and techniques known from the world of distributed systems, so in the following we assume that the federated databases under discussion are implemented on some kind of distributed database system. We assume we are equipped with techniques to access the databases on the different computers in the system. We further assume that we have query techniques available with the power of SQL or some variant of it. We use the word "entity" to
identify the basic elements in a database, so we more or less expect that we are in the relational world of databases.
1.2.2
Implementing federated database systems
To some extent the concept of a federated database in a distributed system conforms to the concept of a view, as defined for instance in SQL and provided by many DBMSs. Logically a view is a function from a set of entities to an entity; likewise, a federated database is logically a function from a set of databases to a database. On the other hand, a DBMS itself is not expected to support functional mappings directly suitable for federated purposes. But since a database is a set of entities (in the relational case), views might be an important tool for implementing a federated database.
As an example we continue the discussion of the two databases defined by the divisions of the telephone company above. For simplification we now further assume that each database is comprised of just one entity (a table with no relation/record ordering and no multiple instances of the same relation), and that these entities are instances of the same schema (contain the same attributes). In all we deal with two entities A and B. The entities A and B keep the customer accounts of the two company divisions respectively (say A for the division dealing with the mobile phones and B for the other). For each customer in each division, a relation (record) in the entity specifies the name of the customer and the amount of money to appear on the next monthly bill. (This amount increases as the customer uses the telephone.) It might look like this:
Entity A                        Entity B
Name        Account             Name        Account
Peterson    100                 Peterson    75
Hanson      50                  Hanson      50
Larson      200                 Jonson      200
From the federated perspective we are really just interested in the customers and the accounts. When we print the bills, it is not important whether the customers are using a mobile phone or a static phone, as long as they pay. So the federated database should just be comprised of one logical table made as some sort of concatenation of the accounts represented by A and B. We could consider using the view U, defined as the union of A and B, as a first step for this purpose. Apart from some problems with Hanson's account (to be discussed below), U represents all accounts in A and B. Using for instance SQL it is easy to express that we want the sum of the accounts per customer, presented as the entity (view) F0. Except for the problem with Hanson, F0 will do as our federated database.
U: Union of A and B             F0: Federated database - first attempt!
Name        Account             Name        Account
Peterson    100                 Peterson    175
Hanson      50                  Hanson      50
Larson      200                 Larson      200
Jonson      200                 Jonson      200
Peterson    75
But since we are in the (strict) relational world, U has lost too much information. The reason is that Hanson has used the same amount on the mobile and the static telephone, and Hanson is therefore represented by identical relations (records) in A and B. Since a relation is represented by exactly one or zero instances in the (strict) relational model, only one of Hanson's accounts is registered in U. So in U we have lost the necessary information about one of Hanson's accounts.
We assume that it will be possible to solve the problem with Hanson's account on any DBMS. But the point here is that the facilities provided by a particular DBMS are not necessarily suitable for the needs of constructing a federated database. It could be argued that the problems above are due to bad database design. But the problem with federated databases as such is that the databases forming the basis of the federated database are not designed for that purpose.
Therefore different attempts have been made to address the special problems met when designing federated databases. An example is the Tuple-Source (TS) model described in [11]. The TS model can be thought of as an extension of the traditional relational model. We will not try to explain this model in detail, but in the following we try to give some idea of how it can solve the problem above.
Above we used the union operator to define the view U. The TS model provides us with an alternative "union" view operator; let us call it TS-union (this operator is not explicitly named as such in [11]). The TS-union operator makes it possible to concatenate entities which are instances of the same schema (like entities A and B) without the loss of information we saw in the case of U. Using the TS-union on A and B yields the entity shown below (W).
W: TS-union on A and B
Name        Account     DB
Peterson    100         DB_A
Hanson      50          DB_A
Larson      200         DB_A
Hanson      50          DB_B
Jonson      200         DB_B
Peterson    75          DB_B
The TS-union makes sure that each instance of a relation appearing in both A and B is replicated as two distinct relations in W. This is assured by providing W with an extra attribute identifying the originating database - either A (DB_A) or B (DB_B). From W it is now easy to generate the sum of the accounts belonging to each client (identified by the Name attribute) using TS-SQL (the query language provided with the TS model - a variant of SQL). The sum of all the accounts is presented as the entity below (F), which makes up our federated database.
F: The federated database - as we want it - summing Hanson's accounts correctly!
Name        Account
Peterson    175
Hanson      100
Larson      200
Jonson      200
When designing a federated database, we sometimes need to consider a set of databases (the basis) as one, as above. For the same federated database, we might at other times need to consider the basis it is made of - a set of distinct databases. TS-SQL contains extensions of SQL which make it convenient to deal with both cases. Regarding the TS model it should finally be noted that the extensions do not violate the fundamental properties of the traditional model. This for instance means that a system implementing the TS model can make use of the query optimisation techniques used in traditional relational systems.
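To make the discussion concrete, the following sketch shows how the example could be expressed in ordinary SQL, reusing the entity names A, B, U, W and F from above. It is only an illustration of the idea: the column definitions are hypothetical, and the TS-union is emulated here with a plain UNION ALL plus an added source column, which is not the actual TS-SQL syntax of [11].

    -- Hypothetical definitions of the two source entities
    CREATE TABLE A (Name VARCHAR(30), Account INTEGER);
    CREATE TABLE B (Name VARCHAR(30), Account INTEGER);

    -- U: a plain UNION removes duplicate rows, so one of Hanson's
    -- two identical accounts is lost
    CREATE VIEW U AS
      SELECT Name, Account FROM A
      UNION
      SELECT Name, Account FROM B;

    -- W: emulation of the TS-union; tagging each row with its source
    -- database keeps identical rows from A and B distinct
    CREATE VIEW W AS
      SELECT Name, Account, 'DB_A' AS DB FROM A
      UNION ALL
      SELECT Name, Account, 'DB_B' AS DB FROM B;

    -- F: the federated view, summing the accounts per customer
    CREATE VIEW F AS
      SELECT Name, SUM(Account) AS Account
      FROM W
      GROUP BY Name;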
1.2.3
Data Warehouse Used To Implement Federated System
Above we ended up defining our federated database as an entity (view) called F. To define F we used an intermediate entity (view) called W, defined from two other entities A and B. We did say define - not generate. The fact that F and W are defined with A and B as basis does not necessarily mean that each relation in F and W is represented physically by the DBMS. It might be feasible just to keep a symbolic, logical definition in the DBMS - also in the case where the entities grow to contain a very large number of relations. For instance, the size of the stored definition of W might just be on the order of 50 bytes/octets, independent of the number of relations in W. To see why, let us look at yet another view. Let us define the view H as comprised of "all accounts regarding Peterson in the union of A and B". Again, it might seem as if we first need the union of A and B (in a physical, though small, representation) as a first step to generate a representation of H. But possibly the DBMS (which is relational) will choose internally to rephrase the definition of H (which will have a formal representation in the DBMS), using properties of the relational algebra. Probably the DBMS can find out that H is equally well defined as "the union of Peterson's accounts in A and Peterson's accounts in B". Since there is a small number of Peterson accounts in both A and B, the latter expression of H might be much more efficient to calculate than a brute-force calculation of the first definition of H.
Although it may seem that a federated database can be implemented mostly by using properties of relational algebra on the databases making up the basis, it might anyway be necessary to generate physical representations of the federated database (like F above) or physical representations of views used to define the federated database (like W above). Let us for instance say that we actually need frequent access to the data warehouse F above, defining the amount of money each customer owes the telephone company. We might want to give the customers online access to their own account in F, so they can keep track of their growing bill. But it might be, for instance, that it is not possible to get access to A and B during the daytime. A reason for this could be that the local DBMSs collecting CDR information in A and B are simply too busy doing that during the day. So since A and B are not available in the daytime, we have to generate a physical representation of F at night, when the local DBMSs managing A and B can find spare time to send a copy of A and B over the network to the local DBMS administrating F. In this case the federated database F is implemented as a data warehouse. We now say that F is a data warehouse because we have made explicit that F has a physical representation independent of the physical representations of the data sources A and B.
Further references to work supporting the design of federated databases are given in [11].
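As a minimal sketch of this idea (generic SQL, hypothetical names, assuming the entities A and B from above), the data warehouse could be materialised and then refreshed during the nightly window roughly as follows:

    -- One-time creation of the physical representation of F
    CREATE TABLE F_warehouse AS
      SELECT Name, SUM(Account) AS Account
      FROM (SELECT Name, Account FROM A
            UNION ALL
            SELECT Name, Account FROM B) all_accounts
      GROUP BY Name;

    -- Nightly refresh, run when A and B are reachable; committed as one
    -- transaction so daytime readers never see a half-loaded table
    DELETE FROM F_warehouse;
    INSERT INTO F_warehouse
      SELECT Name, SUM(Account) AS Account
      FROM (SELECT Name, Account FROM A
            UNION ALL
            SELECT Name, Account FROM B) all_accounts
      GROUP BY Name;
    COMMIT;

Many products also offer built-in snapshot or replication mechanisms for this kind of periodic refresh; the explicit DELETE/INSERT above is only meant to show the principle.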
1.2.4
Query Processing in Federated Databases
Query processing in a federated database meets the problems of distributed database systems in general. Above we described the TS model, which is implemented on a distributed DBMS also described in [11]. Regarding query processing we quote almost literally from [11]:
The distributed query processor consists of a query mediator and a number of query
agents, one for each local database. The query mediator is responsible for
decomposing global queries given by multidatabase applications into multiple
subqueries to be evaluated by the query agents. It also assembles the subquery results
returned by the query agents and further processes the assembled results in order to
compute the final query result. Query agents transform subqueries into local queries
that can be directly processed by the local database systems. The local query results
are properly formatted before they are forwarded to the query mediator. By dividing
the query processing tasks between query mediator and query agents, concurrent
processing of subqueries on local databases is possible, reducing the query response
time. This architectural design further enables the query mediator to focus on global
query processing and optimization, while the query agents handle the transformation
of subqueries decomposed by query mediator into local queries. Note that the query
decomposition performed by query mediator assumes all local database schemas are
compatible with the global schema. It is a job for the query agents to convert the
subqueries into local queries on heterogeneous local schemas. The heterogeneous
query interfaces of local database systems are also hidden from the query mediator by
the query agents.
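To illustrate this division of labour, the following sketch (generic SQL, reusing the hypothetical entities A, B and F from section 1.2.2; it does not show the actual interfaces of the system in [11]) indicates how a global query could be decomposed by the mediator into subqueries for the two query agents:

    -- Global query received by the query mediator
    SELECT Name, SUM(Account) AS Account
    FROM   F
    WHERE  Name = 'Peterson'
    GROUP  BY Name;

    -- Subquery shipped to the query agent of local database A
    SELECT Name, Account FROM A WHERE Name = 'Peterson';

    -- Subquery shipped to the query agent of local database B
    SELECT Name, Account FROM B WHERE Name = 'Peterson';

    -- The mediator assembles the two partial results and computes the
    -- final sum; the agents hide the heterogeneous local schemas.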
1.2.5
Conclusion: Federated Databases
A federated database is conceptually just a mapping of a set of databases. When the word federated is used, it indicates that the federated database is a mapping of a set of databases that were not originally designed for a mutual purpose. This gives rise to special problems, some of which we have tried to indicate above. We have not, however, discussed the situations where it is desirable to perform changes (conceptually) directly in the federated database.
2
Organisation of distributed data
2.1
Schema integration in Federated Databases
Following the definitions in [3], a five-level schema architecture is identified to support the general requirements of a FDBMS: distribution, heterogeneity and autonomy (Figure 4).
Figure 4: Five-level schema architecture of a FDBMS [3]
The Local Schema is the conceptual schema of a component database. The Local Schema is expressed in the native data model of the component DBMS; different Local Schemas may be expressed in different data models.
The Component Schema builds a single representation of the divergent local schemas. Semantics that are missing in a local schema can be added in its component schema. The Component Schema homogenises the local schemas into a canonical data model.
The Export Schema filters the data available to the federation; it represents a subset of the component schema. The purpose of defining Export Schemas is to facilitate control and management of association autonomy. A filtering processor may limit the set of operations that can be submitted to the corresponding Component Schema.
The Federated Schema is an integration of multiple Export Schemas. It also includes
information about data distribution, represented by the integrated Export Schemas.
There may be multiple Federated Schemas in an FDBMS, one for each class of
federation users. A class of federation users is a group of users and/or applications
performing a related set of activities (e.g. corporate environment: managers and
employees).
The External Schema defines a schema for a user and/or application, or for a class of users/applications. As a Federated Schema may be large, complex, and difficult to manage, the External Schema can be used to specify a subset of the information relevant
to the users of the External Schema (customisation). Additionally integrity constraints
may be specified in the External Schema and access control provided for the
component databases.
Based on this schema model and an introduced reference architecture, different design methodologies for distributed/federated databases are discussed in [3].
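As a minimal sketch (generic SQL, hypothetical table, column and user names), an export schema can often be approximated on a relational component DBS by a view plus restricted privileges, exposing to the federation only a subset of the component schema:

    -- Component schema table: customer(cust_id, name, account, confidential)
    CREATE VIEW export_customer AS
      SELECT cust_id, name, account       -- only these columns are exported
      FROM   customer
      WHERE  confidential = 'N';          -- rows kept local are filtered out

    -- Association autonomy: the federation may only read the export schema
    GRANT SELECT ON export_customer TO federation_user;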
2.2
Data Placement in Distributed Databases
The design of a distributed database is an optimisation problem requiring solutions to several interrelated problems, e.g.:
- Data fragmentation
- Data allocation and replication
- Partitioning and local optimisation
An optimal solution will reduce delays caused by communication costs and additionally enable parallel processing of the distributed data. The main keywords in this field are data declustering and data partitioning. A general description of data placement and the applied technology may be found in [8].
Figure 5: Reference architecture for distributed databases
Figure 5 shows a reference architecture for distributed databases according to [2]. Several levels of schemas are introduced which reflect the different views of the distributed DB environment. At the top level is the global schema, which defines all the data as if the database were not distributed. Each global relation may be split into several (non-overlapping) fragments (fragmentation schema). Fragmentation is a one-to-many mapping: several fragments may correspond to one global relation, but only one global relation corresponds to each fragment.
The allocation schema defines at which site a fragment is located. The local mapping schema maps the physical images of the global relations to the objects of the local DBMS.
This concept describes the following features in general [2]:
- Fragmentation Transparency
  ensures that users or application programmers work on global relations
- Location Transparency
  is a lower degree of transparency: applications work on fragments and not on global relations
- Local Mapping Transparency
  independence from the local DBMS: this feature guarantees that the local representation of the data is hidden from the global applications
The different levels of transparency offer various options for optimising the design of
distributed DBMS.
2.2.1
Data Fragmentation
Data fragmentation can be done in several ways: vertical, horizontal and mixed.
Horizontal fragmentation consists of partitioning the tuples of a global relation into subsets. This type of partitioning is useful in distributed databases where each subset can contain data which have common geographical properties.
Vertical partitioning takes the attributes of a relation and groups them together into non-overlapping fragments. The fragments are then allocated in a distributed database system to increase the performance of the system. The objective of vertical partitioning is to minimise the cost of accessing data items during transaction processing. [6] and [9] give a survey of techniques for vertical partitioning, but see also the remarks in [8].
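The following sketch in generic SQL makes the two fragmentation styles concrete; the global relation, its columns and the fragment names are hypothetical and purely illustrative.

    -- Global relation: customer(cust_id, name, region, credit_limit)

    -- Horizontal fragmentation: tuples are split by a geographical
    -- predicate and each fragment is allocated to the site that uses it most
    CREATE TABLE customer_north AS
      SELECT * FROM customer WHERE region = 'NORTH';
    CREATE TABLE customer_south AS
      SELECT * FROM customer WHERE region = 'SOUTH';

    -- Vertical fragmentation: attribute groups are stored separately;
    -- the key is repeated so the global relation can be rebuilt by a join
    CREATE TABLE customer_base    AS SELECT cust_id, name, region FROM customer;
    CREATE TABLE customer_finance AS SELECT cust_id, credit_limit FROM customer;

    -- Reconstruction of the global relation from the fragments
    CREATE VIEW customer_global_h AS
      SELECT * FROM customer_north
      UNION ALL
      SELECT * FROM customer_south;
    CREATE VIEW customer_global_v AS
      SELECT b.cust_id, b.name, b.region, f.credit_limit
      FROM   customer_base b, customer_finance f
      WHERE  b.cust_id = f.cust_id;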
2.2.2
Criteria for the distribution of fragments
Two key aspects influence the distribution of data in a distributed DBS:
- efficiency of access
- high availability
The efficiency of access is influenced by the transmission time of requests and the response time of the involved nodes. Distributing data near the places where they are needed reduces the overall transmission time, and a smart distribution of the workload reduces the response time.
High availability is realised by the redundant distribution of data among several nodes, so that the system keeps running when one or more nodes fail. In case of failures the redundant copies help the whole system to maintain an acceptable level of performance even while some nodes are down (graceful degradation).
3
Parallel processing of retrieval
3.1
Query Processing
The steps to be executed for query processing are, in general: parsing a request into an internal form, validating the query against meta-data information (schemas or catalogues), expanding the query using different internal views, and finally building an optimised execution plan to retrieve the requested data objects.
In a distributed system the query execution plans have to be optimised in such a way that query operations may be executed in parallel, avoiding costly shipping of data. Several forms of parallelism may be implemented: inter-query parallelism allows the execution of multiple queries concurrently on a database management system. Another form of parallelism is based on the fragmentation of queries (into sets of database operations, e.g. selection, join, intersection, collecting) and on the parallel execution of these fragments, pipelining the results between the processes [8].
Inter-operator parallelism may be used in two forms: either to execute producers and consumers of intermediate results in pipelines (vertical inter-operator parallelism), or to execute independent subtrees of a complex query execution plan concurrently (horizontal inter-operator parallelism) [7].
A detailed description of technologies applied for query evaluation may be found in
[7].
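As one concrete illustration of intra-query parallelism, many parallel database servers let the degree of parallelism be requested per query. The statement below uses Oracle-style hint syntax on a hypothetical call_record table; it is only an example, not a vendor-neutral standard:

    -- Scan and aggregate call_record with (up to) four parallel processes
    SELECT /*+ PARALLEL(call_record, 4) */
           region, SUM(duration)
    FROM   call_record
    GROUP  BY region;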
3.2
Query optimisation
The main parts involved in query processing in a database are the query execution engine and the query optimiser. The query execution engine implements a set of physical operators which take one or more data streams as input and produce an output data stream. Examples of physical operators are "sort", "sequential scan", "index scan", "nested loop join", and "sort merge join".
The query optimiser is responsible for generating the input for the execution engine. It takes a parsed representation of an SQL query as input and is responsible for generating an efficient execution plan for the given SQL query from the space of possible execution plans.
The query optimiser has to solve a difficult search problem in a possibly vast search space. To solve this problem it is necessary to provide:
- A space of plans (search space)
- A cost estimation technique, so that a cost may be assigned to each plan in the search space
- An enumeration algorithm that can search through the execution space
A desirable optimiser is one where (1) the search space includes plans that have low cost, (2) the costing technique is accurate and (3) the enumeration algorithm is efficient. Each of these tasks is nontrivial.
In [5] an overview is given of the state of the art in query optimisation.
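Most relational systems expose the plan chosen by the optimiser so that it can be inspected. As an example (Oracle-style syntax, hypothetical tables, assuming the usual PLAN_TABLE has been created):

    EXPLAIN PLAN FOR
      SELECT c.name, SUM(r.duration)
      FROM   customer c, call_record r
      WHERE  c.cust_id = r.cust_id
      GROUP  BY c.name;

    -- The chosen physical operators (e.g. full scans, nested-loop or
    -- sort-merge joins) can then be read back from the plan table
    SELECT id, operation, options, object_name
    FROM   plan_table;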
4
Parallel processing of transactions
4.1
Characteristics of transaction management
Transaction processing management in a distributed environment is responsible for keeping the data in the distributed system consistent, for guaranteeing the efficient execution of competing transactions, and for handling error conditions. The main properties of a transaction are the ACID properties [1]:
- Atomicity
  Within a transaction, either all or none of the needed operations are performed. The interruption of a transaction initiates a recovery which re-establishes the former state.
- Consistency
  A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state.
- Isolation
  Even though transactions execute concurrently, it appears to each transaction T that the others executed either before T or after T, but not both.
- Durability
  Once a transaction completes successfully, its changes to the state survive failures.
4.2
Distributed Transaction
The Two-Phase commit protocol is used primarily to coordinate work of independent
resource managers within a transaction (from [1]). It deals with integrity checks that
have been deferred until the transaction is complete (phase 1) and with work that has
been deferred until the transaction has finally committed (phase 2). In a distributed
system each node in a cluster has its own transaction manager. When transactions access objects on different nodes, the transactions are known to several transaction managers. The two-phase commit protocol is used to make the commit of such distributed transactions atomic and durable, and to give each transaction manager the option of unilaterally aborting any transaction that is not yet prepared to commit.
The protocol is fairly simple. The transaction manager that began the transaction is
called the root transaction manager. As work flows from one node to another, the
transaction managers in the transaction form a tree with the root transaction manager
at the root of the tree. Any member of the transaction can abort the transaction, but
only the root can perform the commit. It represents the commit coordinator. The
coordinator polls the participants at phase 1; if any vote no or fail to respond within
the timeout period, the coordinator rejects the commit and broadcasts the abort
decision. Otherwise, the coordinator writes the commit record and broadcasts the
commit decision to all the other transaction managers.
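A minimal sketch may help: in Oracle-style SQL, a distributed transaction is simply a transaction that touches tables on more than one node (here via a hypothetical database link remote_site); the single COMMIT is then carried out with the two-phase commit protocol by the transaction managers involved.

    -- Debit on the local node, credit on the remote node
    UPDATE account             SET balance = balance - 100 WHERE id = 42;
    UPDATE account@remote_site SET balance = balance + 100 WHERE id = 42;

    -- Phase 1 (prepare) and phase 2 (commit) are coordinated transparently
    -- by the root transaction manager when the application commits
    COMMIT;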
5
Commercial products
5.1
Tandem
NonStop Clusters software combines a standards-based version of SCO UnixWare 2.1.2 with Tandem's unique single system image (SSI) clustering software technology. SSI simplifies system administration tasks with a consistent, intuitive view of the cluster. This helps migrate current UNIX system applications to a clustered environment. It also allows transparent online maintenance such as hot plug-in of disks, facilitates the addition of more nodes to the cluster, and provides automatic failover and recovery. [12]
5.1.1
Designed for scalability
With NonStop Clusters software, servers within the cluster are highly scalable and
can be scaled as needed without incurring significant downtime. Through the use of
NonStop Clusters software, each node can contain a different number of processors of
different speeds, memory capacities, and internal disk storage capacities. Other
features, such as shared devices and memory, enhance scalability. Depending upon
the application's structure, there is nearly one-to-one performance scaling. In addition,
NonStop Clusters software supports industry-standard middleware to facilitate cluster
growth.
5.1.2
High degree of manageability
The SSI approach to cluster management is designed to avoid the need for special
cluster considerations. With the SSI, you manage a single resource, not a collection of
systems.
5.1.3
Automatic process migration and load balancing
NonStop Clusters software lets you migrate applications, operating system objects,
and processes between cluster nodes. Migrating into default or predefined operating
areas balances the load between nodes. Migration can be automatic, and the load can
be balanced with other tasks or through a failover to specific node groups. Any node
can be selected for failover. Automatic process migration and load balancing are
available during normal operation or after application restart or failover. This
promotes efficient cluster operation without the need for a dedicated standby node.
5.1.4
High level of application and system availability
NonStop Clusters software promotes a high level of availability throughout the enterprise. It runs on industry-standard servers, the reliable Integrity XC series, which consists of packaged Compaq ProLiant servers. The clustering operation provides a replicated operating system, which continues membership services as if the hardware were replicated. Availability is also enhanced by the Tandem ServerNet® fault-tolerant system area network (SAN) technology. NonStop Clusters software runs on the Integrity XC series from Compaq. CPU performance on the Integrity XC series is tied to the evolution of the Intel processor architecture and to enhancements and extensions of Compaq's ProLiant server line.
5.2
Oracle
5.2.1
Oracle8
Oracle8 is a data server from Oracle. Oracle8 is based on an object-relational model. In this model it is possible to create a table with a column whose datatype is another table; that is, tables can be nested within other tables as values in a column. The Oracle server stores nested table data "out of line" from the rows of the parent table, using a store table which is associated with the nested table column. The parent row contains a unique set identifier value associated with a nested table instance [14].
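A minimal sketch of such a nested table column (Oracle8 SQL; the type, table and store-table names are hypothetical) could look as follows:

    -- Collection type used as the datatype of a column
    CREATE TYPE phone_list_t AS TABLE OF VARCHAR2(20);
    /
    -- Parent table with a nested table column and its associated store table
    CREATE TABLE customer (
      cust_id  NUMBER PRIMARY KEY,
      name     VARCHAR2(40),
      phones   phone_list_t )
      NESTED TABLE phones STORE AS customer_phones_tab;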
Oracle products run on Microsoft Windows 3.x/95/NT, Novell NetWare, Solaris, HP-UX and Digital UNIX platforms.
Many operational and management issues must be considered in designing a very
large database under Oracle8 or migrating from an Oracle7 (the major predecessor of
Oracle8) database. If the database is not designed properly, the customer will not be
able to take full advantage of Oracle8’s new features. This section discusses issues
related to designing a VLDB under Oracle8 or migrating from an Oracle7 database
[17].
5.2.1.1
Partitioning Strategies – Divide And Conquer
One of the core features of Oracle8 is the ability to physically partition a table and its associated indexes. By partitioning tables and indexes into smaller components while maintaining the table as a single database entity, the management and maintenance of data in the table becomes more flexible.
Data management can now be accomplished at the finer-grained partition level, while queries can still be performed at the table level. For example, applications do not need to be modified to run against newly partitioned tables.
The divide-and-conquer principle allows data to be manipulated at the partition level, reducing the amount of data per operation. In most cases, this approach also allows partition-level operations to be performed in parallel with the same operations on other partitions of the same table, speeding up the entire operation.
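As a sketch of the feature (Oracle8 SQL; the table, column and partition names are hypothetical), a table can for instance be range-partitioned by date while applications keep querying it as a single table:

    CREATE TABLE call_record (
      call_id    NUMBER,
      call_date  DATE,
      duration   NUMBER )
    PARTITION BY RANGE (call_date)
    ( PARTITION p_1999_q1 VALUES LESS THAN (TO_DATE('01-04-1999','DD-MM-YYYY')),
      PARTITION p_1999_q2 VALUES LESS THAN (TO_DATE('01-07-1999','DD-MM-YYYY')),
      PARTITION p_rest    VALUES LESS THAN (MAXVALUE) );

    -- Queries are still expressed against the table as a whole
    SELECT SUM(duration)
    FROM   call_record
    WHERE  call_date >= TO_DATE('01-05-1999','DD-MM-YYYY');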
5.2.1.2
Benefits From Table Partitioning
The greatest benefit of Oracle8 partitioning is the ability to maintain and administer very large databases. The following dimensions of scalability can be found in Oracle8 compared to Oracle7:
- Higher Availability
- Greater Manageability
- Enhanced Performance
5.2.1.3
Higher Availability
Using intelligent partitioning strategies, Oracle8 can help meet the increasing
availability demands of VLDBs. Oracle8 reduces the amount and duration of
scheduled downtime by providing the ability to perform operations that previously required downtime while the database is still open and in use. Refer to the paper Optimal Use of Oracle8
Partitions, for further information about this topic. The key to higher availability is
partition autonomy, the ability to design partitions to be independent entities within
the table. For example, any operation performed on partition X should not impact
operations on a partition Y in the same table.
5.2.1.4
Greater Manageability
Managing a database consists of moving data in and out, backing up, restoring and rearranging data due to fragmentation or performance bottlenecks. As data volumes increase, the job of data management becomes more difficult. Oracle8 supports table growth while placing data management at the finer-grained partition level. In Oracle7 the unit of data management was the table, while in Oracle8 it is the partition. As a table grows in Oracle8, the partitions need not also grow; instead, the number of partitions increases. All of the data management functions of Oracle7 still exist in Oracle8, but with the ability to act on a single partition, a new medium for parallel processing within a table is now available. Similar rules apply to designing a database for high availability: the key is data-segment size and, to a lesser degree, autonomy.
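Continuing the hypothetical call_record table sketched in section 5.2.1.1, typical partition-level maintenance operations could look as follows (Oracle8 SQL; the names are illustrative only):

    -- Age out an old quarter without touching the rest of the table
    ALTER TABLE call_record DROP PARTITION p_1999_q1;

    -- Exchange the contents of a partition with a separately loaded
    -- staging table (a fast way to publish bulk-loaded data)
    ALTER TABLE call_record
      EXCHANGE PARTITION p_1999_q2 WITH TABLE call_record_q2_staging;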
5.2.1.5
Enhanced Performance
The strategy for enhanced performance is divide-and-conquer through parallelism. This paradigm inherently results in performance improvements because most operations that were performed at the table level in Oracle7 can now be carried out at the partition level in Oracle8. However, if a database is not designed correctly under Oracle8, the level of achievable parallelism and the resulting performance gains will be limited. The table partitioning strategy used in Oracle7 may not necessarily be adequate to make optimal use of the Oracle8 features and functionality.
5.2.2
A Family of Products with Oracle8
Oracle offers a set of products and tools to aid in the development and management of
complex computer systems and applications. This section focuses on how specific
products are integrated with and take advantage of the new features of Oracle8 [13].
The Oracle products described in this section are built around NCA (Network
Computing Architecture). NCA provides a cross-platform, standards-based
environment for developing and deploying network-centric applications. NCA aids
integration of Web servers, database servers, and application servers.
All Oracle products are embracing NCA, from the core Oracle8 database to Oracle's comprehensive development tools and packaged enterprise applications.
A key strength of Oracle’s product offering is the breadth of tools available for
working with the Oracle database. Database application development requires tools to
build, design, and manage the applications. In addition, Oracle is working with many
third-party partners to ensure that their products fully leverage and exploit the new
capabilities provided with Oracle8.
Oracle offers a set of tools targeted to build and deploy both applications for a
potentially large network of computers and applications for a single computer.
The tools are designed to let developers focus on solving business problems instead of
programming application infrastructure.
 1999 EURESCOM Participants in Project P817-PF
page 55 (120)
Volume 3: Annex 2 - Data manipulation and management issues
Deliverable 1
The suite of products makes it possible to:
- Provide a management system for all servers in the enterprise.
- Develop applications using component creation and assembly tools.
- Mine data from data warehouses with OLAP (Online Analytical Processing) and decision support tools.
- Deploy Web applications.
The rest of this section gives a more detailed description of the capabilities of some of these products and tools. All of them are integrated with the Oracle8 system. The focus is on how these specific products take advantage of the new features of Oracle8 [16].
5.2.2.1
SQL*Plus
SQL*Plus, the primary ad-hoc access tool to the Oracle8 server, provides an
environment for querying, defining, and controlling data. SQL*Plus delivers a full
implementation of Oracle SQL and PL/SQL (see below), along with a set of
extensions.
SQL*Plus provides a flexible interface to Oracle8, enabling developers to manipulate
Oracle SQL commands and PL/SQL blocks. With SQL*Plus it is possible to create ad
hoc queries, retrieve and format data, and manage the database.
For maximum administrative flexibility, you can work directly with Oracle8 to
perform a variety of database maintenance tasks. You can view or manipulate
database objects, even copy data between databases.
The new Oracle8 features can be used through SQL*Plus 8.0.3: creating and maintaining partitioned objects, using parallel Data Manipulation Language (DML) commands, creating index-organized tables and reverse key indexes, deferred constraint checking and the enhanced character set support for National Language Support can all be exercised from SQL*Plus. SQL*Plus also supports the new password management capability of Oracle8.
In addition, SQL*Plus 8.0.3 fully supports the object capability of Oracle8, as well as its very large database support features. New object types can be defined, including collection types and REF (reference) attributes. SQL*Plus supports the SQL syntax for creating object tables using the newly defined object types, as well as all the new DML syntax to access the object tables. Object type methods are written in PL/SQL from the SQL*Plus tool, along with object views and INSTEAD OF triggers. All the storage handling syntax is supported from SQL*Plus, in addition to handling all aspects of LOB (Large Object) manipulation and storage management.
PL/SQL is Oracle's procedural extension to SQL. PL/SQL, as an extension of industry-standard SQL, is a block-structured programming language that offers application developers the ability to combine procedural logic with SQL to satisfy complex database application requirements. PL/SQL provides application developers with benefits including seamless SQL access, tight integration with the Oracle RDBMS and tools, portability, security, and internationalization [15].
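A small anonymous PL/SQL block, as it could be run from SQL*Plus, may illustrate this combination of procedural logic and SQL (the table name and threshold below are hypothetical):

    -- SQL*Plus setting so that DBMS_OUTPUT text is displayed
    SET SERVEROUTPUT ON

    DECLARE
      v_total NUMBER;
    BEGIN
      SELECT SUM(account) INTO v_total FROM customer_account;
      IF v_total > 1000000 THEN
        DBMS_OUTPUT.PUT_LINE('Total outstanding: ' || v_total);
      END IF;
    END;
    /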
5.2.2.2
Oracle8 Enterprise Manager
Oracle8 Enterprise Manager (OEM) is Oracle's framework for managing the Oracle
environment. Enterprise Manager consists of a centralized console, common services
(such as job scheduling and event management), and intelligent agents running on
managed nodes. Various applications run on this framework, providing comprehensive systems management capabilities.
Oracle8 Enterprise Manager 1.4 supports the new scalability features of Oracle8 such
as partitioning, queuing, password management, and server managed backup and
recovery. Bundled with the Console are a set of database administration tools that
help automate and simplify the common tasks in the life of a database administrator
(DBA). All the tools provide a graphical user interface, with drag-and-drop
functionality and wizards.
OEM improves the availability of data in a recovery situation by enabling the DBA to complete recovery sooner.
OEM’s Backup Manager provides a graphical interface to various backup options,
such as the new Oracle8 utility, Recovery Manager (RMAN). The Oracle8 Recovery
Manager supports secure management of backups by using either a recovery catalog
or a control file. The DBA initiates restore and recovery operations very quickly,
using a point-and-click interface, allowing the recovery operation to complete that
much sooner.
Backups can be scheduled to run at off-hours of the day using OEM’s job scheduling
capability. In addition, Oracle8 Backup Manager also has support for Oracle7 type
database backups.
OEM provides graphical user interface (GUI) support for all the password
management capabilities of Oracle8. Additionally, OEM supports GUI creation of
global users and global roles, greatly simplifying the security administrator’s user
management tasks.
5.2.2.3
Designer/2000: Model Once For Multiple Targets
Designer/2000 is a business and application modeling tool with the ability to generate
complete applications from those models. Business analysts and developers use a
visual modeling interface to represent and define business objects, functionality,
business rules and requirements in a declarative way. These rules can then be
implemented on one or more tiers: on the client, the application server or the database server.
Designer/2000 enables the definition of systems independently of their
implementation so that applications can be generated in multiple environments and
configurations from a single model. Developers can reuse these definitions by
dragging and dropping them into new models. From these models, Designer/2000
generates and reverse engineers Oracle database objects, Developer/2000 client/server
and Web applications, Oracle Web Application Server applications, Visual Basic
applications, Oracle Power Objects applications, and C++ mappings for database
objects.
Designer/2000 2.0 also provides modeling, generation, and design recovery for all the scalability features of Oracle8, such as partitioned tables, new LOBs (Large Objects), index-organized tables and deferred constraint checking, as well as object features including user-defined types, type tables, referenced and embedded types, and
object views of relational data. These concepts are represented using an extension of
the UML (Unified Modeling Language), the emerging open standard for object
modeling. The design is then implemented within the Oracle8 Server through
automated SQL DDL generation. Existing Oracle8 Server designs can be reverse
engineered into the Designer/2000 Release 2.0 repository, including the automated
construction of diagrams based on recovered objects and definitions.
Additionally, in the definition of client-side applications Designer/2000 uses a
concept called a 'module component'. A module component defines what data is
accessed (tables and columns), how it appears in a form, and what specialized
behavior it has (in addition to that inherited from the tables it includes, e.g. validation
rules). A module component can then be included in many forms (or reports, etc.).
Any change made in the original component definition will be inherited in every
module definition that includes the component.
5.2.2.4
Object Database Designer
Object Database Designer is the natural companion for anyone designing and building
Oracle8 systems. The product addresses key areas of functionality designed to aid in
all aspects of ORDBMS design, creation, and access: type modeling forms the core of
an object oriented development, and is used in all stages of analysis and design.
Object Database Designer, like Designer/2000, implements type modeling by UML;
thereby meeting the needs of both the major developer roles: database designers and
application developers.
The type model is transformed into an Oracle8 database schema, giving the database
designer an excellent head start on the design, and mapping abstract type models onto
the world of ORDBMS. The designer can then refine this schema design to exploit
Oracle8 implementation options. Then the visual design is automatically translated
into the appropriate SQL DDL to implement it. This approach takes the effort out of
manually building a database, and guarantees bug-free SQL.
Because it is equally important to be able to visualize existing database structures,
Object Database Designer supports reverse engineering and full round-trip
engineering of model and schema. This database design and generation capability is
identical to the corresponding capability in Designer/2000 thus providing DBAs with
a single tool-set. However Object Database Designer is specifically designed for
developers of Object Oriented 3GL applications of Oracle8.
C++ is currently the most widely used object-oriented programming language; as such, it is extremely important to provide a mechanism for C++ programs to seamlessly
access Oracle8. Using the type model as its base, the C++ generator automatically
generates C++ classes that provide transparent database persistency for those objects.
This delivers major productivity benefits to C++ programmers, allowing them to
concentrate on application functionality rather than database access.
C++ Generator also creates a run-time mapping to allow those applications to interact
with their persistent store: the Oracle database. This allows the database schema to
migrate without unnecessarily affecting the applications. Additionally, by exploiting
the power and performance of the Oracle8 client-side cache, the generated code
provides a high performance database access. Not only is the interface simplified for
the developer, it is also performance tuned.
On the client side, Object Database Designer generates a library of class definitions,
each of which may have a persistency mapping onto Oracle8 types. The class
structure generated is based on the abstract type model, which is UML compliant and
hence is capable of modeling and generating multiple inheritance class structures.
However, the transformer to Oracle8 schema design only resolves single inheritance
trees, which it implements using a number of options (super-type references sub-type,
sub-type references super-type, super-type and sub-type union, and single type with
type differentiator attribute).
The type modeling and generation in Object Database Designer is a streamlined
packaging of the equivalent Designer/2000 capability. Object Database Designer is
focused specifically at the Oracle8 database designer and C++ Programmer.
5.2.2.5
Developer/2000: Building Enterprise Database Applications
Developer/2000 is a high-productivity client/server and Web development tool for
building scalable database applications. A typical Developer/2000 application might
include integrated forms, reports, and charts, all developed using an intuitive GUI
interface and a common programming language, PL/SQL. Developer/2000
applications can be constructed and deployed in two-tier or multi-tier architectures, in
a client/server or Web environment.
One of the strengths of Developer/2000 lies in its database integration and its inherent
ability to support highly complex transactions and large numbers of users. Oracle has
carried these strengths through to the Developer/2000 Server, an application server
tier that supports the deployment of robust database applications in a network
computing environment. The Developer/2000 Server enables any new or previously created Developer/2000 application to be deployed on the Web using Java, and to
publish information using industry standard formats, including HTML, PDF, and GIF.
Oracle8 Developer/2000 Release 2.0 is fully certified for application development
against Oracle8. This release will also enhance the large scale OLTP nature of
applications developed with Developer/2000 by support of such functions as
Transparent Application Fail-over, password management, and connection pooling.
Along with an enhanced user interface to support these newer more complex data
structures, Developer/2000 will allow developers to extend the scalability of their
applications, and in conjunction with Sedona, provide greater access to the object
world.
5.2.2.6
Sedona: Component-Based Development
Sedona is a development environment for building component-based applications. It
includes a component framework, a repository, and a suite of visual tools that work in
concert to simplify and expedite the specification, construction, and evolution of
component-based applications.
5.3
Informix
5.3.1
Informix Dynamic Server
Informix Dynamic Server is a database server. A database server is a software
package that manages access to one or more databases for one or more client
applications. Specifically, Informix Dynamic Server is a multithreaded relational
database server that manages data stored in rows and columns. It employs single-processor or symmetric multiprocessor (SMP) systems and a dynamic scalable architecture (DSA) to deliver database scalability, manageability and performance.
This section deals with Informix Dynamic Server, Version 7.3. This version is
provided in a number of configurations. Not all features discussed are provided by all
versions. Informix Dynamic Server runs on a range of hardware platforms, some of them UNIX based and some Microsoft Windows NT based [18].
5.3.2
Basic Database Server Architecture
The basic Informix database server architecture (DSA) consists of the following three
main components:
- Shared memory
- Disk
- Virtual processor
These components are described briefly in this section.
5.3.2.1
The Shared-Memory Component
Shared memory is an operating-system feature that lets the database server threads
and processes share data by sharing access to pools of memory. The database server
uses shared memory for the following purposes:
- To reduce memory use and disk I/O
- To perform high-speed communication between processes
Shared memory lets the database server reduce overall memory use because the
participating processes - in this case, virtual processors - do not need to maintain
individual copies of the data that is in shared memory. Shared memory reduces disk
I/O because buffers, which are managed as a common pool, are flushed on a database
server-wide basis instead of on a per-process basis. Furthermore, a virtual processor
can often avoid reading data from disk because the data is already in shared memory
as a result of an earlier read operation. The reduction in disk I/O reduces execution
time. Shared memory provides the fastest method of interprocess communication
because processes read and write messages at the speed of memory transfers.
5.3.2.2 The Disk Component
A disk is a collection of one or more units of disk space assigned to the database
server. All the data in the databases and all the system information that is necessary to
maintain the database server resides within the disk component.
5.3.2.3 The Virtual Processor Component
The central component of Informix DSA is the virtual processor, which is a database
server process that the operating system schedules for execution on the CPU.
Database server processes are called virtual processors because they function
similarly to a CPU in a computer. Just as a CPU runs multiple operating-system
processes to service multiple users, a virtual processor runs multiple threads to service
multiple client applications. A thread is a task for a virtual processor in the same way
that the virtual processor is a task for the CPU. How the database server processes a
thread depends on the operating system. Virtual processors are multithreaded
processes because they run multiple concurrent threads.
5.3.2.4 Architectural Elements of Informix Dynamic Server
5.3.2.4.1 Scalability
Informix Dynamic Server lets you scale resources in relation to the demands that
applications place on the database server. Dynamic scalable architecture provides the
following performance advantages for both single-processor and multiprocessor
platforms:
- A small number of database server processes can service a large number of client application processes.
- DSA provides more control over setting priorities and scheduling database tasks than the operating system does.
Informix Dynamic Server employs single-processor or symmetric multiprocessor
computer systems. In an SMP computer system, multiple central processing units
(CPUs) or processors all run a single copy of the operating system, sharing memory
and communicating with each other as necessary.
5.3.2.4.2 Raw (Unbuffered) Disk Management
Informix Dynamic Server can use both file-system disk space and raw disk space in
UNIX and Windows NT environments. When the database server uses raw disk
space, it performs its own disk management using raw devices. By storing tables on
one or more raw devices instead of in a standard operating-system file system, the
database server can manage the physical organization of data and minimize disk I/O.
Doing so results in three performance advantages:
- No restrictions due to operating-system limits on the number of tables that can be accessed concurrently.
- Optimization of table access by guaranteeing that rows are stored contiguously.
- Elimination of operating-system I/O overhead by performing direct data transfer between disk and shared memory.
If these issues are not a primary concern, you can also configure the database server to
use regular operating-system files to store data. In this case, Informix Dynamic Server
manages the file contents, but the operating system manages the I/O.
5.3.2.4.3 Fragmentation
Informix Dynamic Server supports table and index fragmentation over multiple disks.
Fragmentation lets you group rows within a table according to a distribution scheme
and improve performance on very large databases. The database server stores the rows
in separate database spaces (dbspaces) that you specify in a fragmentation strategy. A
dbspace is a logical collection of one or more database server chunks. Chunks
represent specific regions of disk space.
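To illustrate how a fragmentation strategy is declared, the sketch below creates a table fragmented by expression across three dbspaces and submits the DDL through JDBC. The table, the dbspace names (dbs1, dbs2, dbs3), the driver class, and the connection URL are assumptions made for this example; the exact syntax should be verified against the Informix Dynamic Server 7.3 documentation.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Sketch: creating a table fragmented by expression over several dbspaces.
    // Dbspace names, table name and connection settings are illustrative only.
    public class FragmentExample {
        public static void main(String[] args) throws Exception {
            Class.forName("com.informix.jdbc.IfxDriver");   // Informix JDBC driver (assumed installed)
            Connection con = DriverManager.getConnection(
                "jdbc:informix-sqli://dbhost:1526/sales:INFORMIXSERVER=ol_srv1", "user", "password");
            Statement stmt = con.createStatement();
            // Rows are routed to the dbspace whose expression they satisfy.
            stmt.executeUpdate(
                "CREATE TABLE call_record (" +
                "  call_id  INTEGER, " +
                "  region   CHAR(2), " +
                "  duration INTEGER) " +
                "FRAGMENT BY EXPRESSION " +
                "  region = 'NO' IN dbs1, " +
                "  region = 'SO' IN dbs2, " +
                "  REMAINDER IN dbs3");
            stmt.close();
            con.close();
        }
    }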
5.3.2.4.4 Fault Tolerance and High Availability
Informix Dynamic Server uses the following logging and recovery mechanisms to
protect data integrity and consistency in the event of an operating-system or media
failure:
- Dbspace and logical-log backups of transaction records
- Fast recovery
- Mirroring
- High-availability data replication
- Point-in-time recovery
5.3.2.4.5 Dbspace and Logical-Log Backups of Transaction Records
Informix Dynamic Server lets you back up the data that it manages and also store
changes to the database server and data since the backup was performed. The changes
are stored in logical-log files. You can create backup tapes and logical-log backup
tapes while users are accessing the database server. You can also use on-line
archiving to create incremental backups. Incremental backups let you back up only
data that has changed since the last backup, which reduces the amount of time that a
backup would otherwise require. After a media failure, if critical data was not
damaged (and Informix Dynamic Server remains on-line), you can restore only the
data that was on the failed media, leaving other data available during the restore.
5.3.2.4.6 Database Server Security
Informix Dynamic Server provides the following security features:
- Database-level security
- Table-level security
- Role creation
The Workgroup and Developer Editions of Informix Dynamic Server do not support role
creation or the CREATE ROLE statement. The databases and tables that the
database server manages enforce access based on a set of database and table
privileges.
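As a hedged sketch of how these database- and table-level privileges and roles are typically granted, the example below issues the corresponding SQL statements over JDBC. The role, user, and table names are invented for illustration, and, as noted above, CREATE ROLE is not available in the Workgroup and Developer Editions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Sketch: role-based privileges in Informix Dynamic Server.
    // Role, user, table and connection names are illustrative assumptions.
    public class SecurityExample {
        public static void main(String[] args) throws Exception {
            Class.forName("com.informix.jdbc.IfxDriver");
            Connection con = DriverManager.getConnection(
                "jdbc:informix-sqli://dbhost:1526/sales:INFORMIXSERVER=ol_srv1", "dba", "password");
            Statement stmt = con.createStatement();
            stmt.executeUpdate("CREATE ROLE report_reader");                    // not in Workgroup/Developer Editions
            stmt.executeUpdate("GRANT CONNECT TO maria");                       // database-level privilege
            stmt.executeUpdate("GRANT SELECT ON call_record TO report_reader"); // table-level privilege
            stmt.executeUpdate("GRANT report_reader TO maria");                 // give the role to a user
            stmt.close();
            con.close();
        }
    }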
5.3.3 Informix Dynamic Server Features
5.3.3.1 Relational Database Management
An Informix RDBMS consists of a database server, a database, and one or more client
applications. This chapter discusses the first two components. Informix Dynamic
Server works with relational databases. A relational database lets you store data so
that the data is perceived as a series of rows and columns. This series of rows and
columns is called a table, and a group of tables is called a database. (An edition of the
database server that is not discussed in this manual works with object relational
databases.) SQL statements direct all operations on a database. The client application
interacts with you, prepares and formats data, and uses SQL statements to send data
requests to the database server. The database server interprets and executes SQL
statements to manage the database and return data to the client application. You can
use SQL statements to retrieve, insert, update, and delete data from a database. To
retrieve data from a database, you perform a query. A query is a SELECT statement
that specifies the rows and columns to be retrieved from the database.
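The minimal sketch below illustrates this division of labour: a small JDBC client sends a SELECT statement to the database server and formats the rows that come back. The driver class, connection URL, and table are assumptions made for the example rather than details of the product description above.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch: the client prepares the SQL, the server executes it and returns rows.
    // Table and connection details are illustrative only.
    public class QueryExample {
        public static void main(String[] args) throws Exception {
            Class.forName("com.informix.jdbc.IfxDriver");
            Connection con = DriverManager.getConnection(
                "jdbc:informix-sqli://dbhost:1526/sales:INFORMIXSERVER=ol_srv1", "user", "password");
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery(
                "SELECT call_id, region, duration FROM call_record WHERE duration > 600");
            while (rs.next()) {
                System.out.println(rs.getInt("call_id") + " " + rs.getString("region")
                    + " " + rs.getInt("duration"));
            }
            rs.close();
            stmt.close();
            con.close();
        }
    }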
An Informix RDBMS permits high-speed, short-running queries and transactions on
the following types of data:
- Integer
- Floating-point number
- Character string, fixed or variable length
- Date and time, time interval
- Numeric and decimal
- Complex data stored in objects
5.3.3.2 High-Performance Loader
The High-Performance Loader (HPL) is an Informix Dynamic Server feature that lets
you efficiently load and unload very large quantities of data to or from an Informix
database. Use the HPL to exchange data with tapes, data files, and programs and
convert data from these sources into a format compatible with an Informix database.
The HPL also lets you manipulate and filter the data as you perform load and unload
operations.
5.3.3.3 Informix Storage Manager
The Informix Storage Manager (ISM) lets you connect an Informix database server to
storage devices for backup and restore operations. ISM also manages backup media.
ISM has two main components: the ISM server for data backup and recovery, and the
ISM administrator program for management and configuration of the ISM server,
storage media, and devices.
5.3.3.4 Database Support
Informix Dynamic Server supports the following types of databases:
- ANSI compliant
- Distributed
- Distributed on multiple vendor servers
- Dimensional (data warehouse)
5.3.3.5 ANSI-Compliant Databases
Informix Dynamic Server supports ANSI-compliant databases. An ANSI-compliant
database enforces ANSI requirements, such as implicit transactions and required
ownership, that are not enforced in databases that are not ANSI compliant. You must
decide whether you want any of the databases to be ANSI compliant before you
connect to a database server. ANSI-compliant databases and databases that are not
ANSI-compliant differ in a number of areas.
5.3.3.6 Dimensional Databases
Informix Dynamic Server supports the concept of data warehousing. This typically
involves a dimensional database that contains large stores of historical data. A
dimensional database is optimized for data retrieval and analysis. The data is stored as
a series of snapshots, in which each record represents data at a specific point in time.
A data warehouse integrates and transforms the data that it retrieves before it is
loaded into the warehouse. A primary advantage of a data warehouse is that it
provides easy access to, and analysis of, vast stores of information.
A data-warehousing environment can store data in one of the following forms:
- Data warehouse
- Data mart
- Operational data store
- Repository
5.3.3.7 Transaction Logging
The database server supports buffered logging and lets you switch between buffered
and unbuffered logging with the SET LOG statement. Buffered logging holds
transactions in memory until the buffer is full, regardless of when the transaction is
committed or rolled back. You can also choose to log or not to log data.
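A minimal sketch of switching logging modes from an application follows. It assumes the SET LOG and SET BUFFERED LOG statements documented for Informix Dynamic Server; the connection details are invented.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Sketch: switching between buffered and unbuffered logging with SET LOG.
    // Connection details are illustrative; syntax assumed from the Informix manuals.
    public class LoggingExample {
        public static void main(String[] args) throws Exception {
            Class.forName("com.informix.jdbc.IfxDriver");
            Connection con = DriverManager.getConnection(
                "jdbc:informix-sqli://dbhost:1526/sales:INFORMIXSERVER=ol_srv1", "user", "password");
            Statement stmt = con.createStatement();
            stmt.executeUpdate("SET BUFFERED LOG");  // hold log records in memory until the buffer fills
            // ... work for which losing the very last transactions after a crash is acceptable ...
            stmt.executeUpdate("SET LOG");           // back to unbuffered logging: flush at every commit
            stmt.close();
            con.close();
        }
    }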
5.3.4 Supported Interfaces and Client Products
5.3.4.1 Interfaces
5.3.4.1.1 Informix Enterprise Command Center
Informix Enterprise Command Center (IECC) runs on Windows 95 or Windows NT.
IECC provides a graphical interface that allows the administrator to configure,
connect to, control, and monitor the status of the database server. IECC simplifies the
process of database server administration and automates common administrative
functions.
5.3.4.1.2 Optical Subsystem
The optical storage subsystem supports the storage of TEXT and BYTE data on
optical platters known as WORM optical media. It includes a specific set of SQL
statements that support the storage and retrieval of data to and from the optical
storage subsystem.
5.3.4.1.3 Informix SNMP Subagent
Simple Network Management Protocol (SNMP) is a published, open standard for
network management. The Informix SNMP subagent lets hardware and software
components on networks provide information to network administrators. The
administrators use the information to manage and monitor applications, database
servers, and systems on networks.
5.3.4.2 Client SDK Products
The Informix Client SDK provides several application-programming interfaces that
you can use to develop applications for Informix database servers. These APIs let
developers write applications in the language with which they are familiar, such as
ESQL, C, C++, and Java. INFORMIX-Connect contains the runtime libraries of the
APIs in the Client SDK.
5.3.4.2.1 INFORMIX-ESQL/C
INFORMIX-ESQL/C lets programmers embed SQL statements directly into a C program. ESQL/C contains:
- ESQL/C libraries of C functions, which provide access to the database server
- ESQL/C header files, which provide definitions for the data structures, constants, and macros useful to the ESQL/C program
- ESQL, a command that manages the source-code processing to convert a C file that contains SQL statements into an object file
5.3.4.2.2 INFORMIX-GLS
The INFORMIX-GLS application-programming interface lets ESQL/C programmers
develop internationalized applications with a C-language interface. It accesses GLS
locales to obtain culture-specific information. Use INFORMIX-GLS to write or
change programs to handle different languages, cultural conventions, and code sets.
5.3.4.2.3 INFORMIX-CLI
INFORMIX-CLI is the Informix implementation of the Microsoft Open Database
Connectivity (ODBC) standard. INFORMIX-CLI is a Call Level Interface that
supports SQL statements with a library of C functions. An application calls these
functions to implement ODBC functionality. Use the INFORMIX-CLI application
programming interface (API) to access an Informix database and interact with an
Informix database server.
5.3.4.3 Online analytical processing
MetaCube is a suite of tools for online analytical processing [19].
MetaCube Explorer is a graphical data access tool that enables quick retrieval and
analysis of critical business data stored in a large data warehouse. Explorer works
with the MetaCube analysis engine to query data warehouses stored in an Informix
database. Explorer’s graphical interface displays multiple views of the information
retrieved by any query. As a business analyst, you can: [20]
- retrieve results of complex queries.
- pivot rows and columns to present data by different categories or groupings; sort report rows and columns alphabetically, numerically, and chronologically.
- drill down for more detailed information or up for a more summarised report.
- incorporate calculations into reports that provide comparisons and rankings of business data, thereby facilitating analysis of business data.
- customize reports to present user-defined views of data.
MetaCube for Excel is an add-in for Excel spreadsheet software that enables quick
retrieval and analysis of business data stored in a MetaCube data warehouse.
MetaCube for Excel works with the MetaCube analysis engine to query a
multidimensional data warehouse that resides in an Informix database. Using
MetaCube for Excel, you can: [21]
- Retrieve results of complex queries.
- Automatically sort, subtotal, and total data retrieved from the data warehouse.
- Apply powerful analysis calculations to returned data.
5.4 IBM
5.4.1 DB2 Universal Database
IBM delivered its first phase of object-relational capabilities with Version 2 of DB2
Common Server in July, 1995. In addition, IBM released several packaged Relational
Extenders for text, images, audio, and video. The DB2 Universal Database combines
Version 2 of DB2 Common Server, including object-relational features, with the
parallel processing capabilities and scalability of DB2 Parallel Edition on symmetric
multiprocessing (SMP), massively parallel processing (MPP), and cluster platforms.
(See Figure 3.) DB2 Universal Database, for example, will execute queries and UDFs
in parallel. After that, IBM will add major enhancements to DB2 Universal Database's
object-relational capabilities, including support for abstract data types, row types,
reference types, collections, user-defined index structures, and navigational access
across objects [22].
The DB2 product family spans AS/400* systems, RISC System/6000* hardware, IBM
mainframes, non-IBM machines from Hewlett-Packard* and Sun Microsystems*, and
operating systems such as OS/2, Windows (95 & NT)*, AIX, HP-UX*, SINIX*, SCO
OpenServer*, and Sun Solaris [24].
5.4.1.1 Extensibility
5.4.1.1.1 User-defined Types (UDTs)
DB2 Common Server v2 supports user-defined data types in the form of distinct
types. UDTs are strongly typed and provide encapsulation. The DB2 Relational
Extenders (see below) provide predefined abstract data types for text, image, audio,
and video. DB2 Universal Database will introduce an OLE object for storing and
manipulating OLE objects in the DBMS. An organisation will be able to store
personal productivity files centrally, query the contents with predefined UDFs ("find
all of the spreadsheets with 'profit' and 'loss' in them"), manage the files as regular
relational data (e.g., for backup purposes), and apply integrated content-searching
capabilities. In the future, DB2 will add user-defined abstract data types with support
for multiple inheritance.
5.4.1.1.2 User-defined Functions (UDFs)
DB2 v2 supports scalar UDFs for defining methods such as comparison operators,
mathematical expressions, aggregate functions, and casting functions. DB2 Universal
Database will add table functions (UDFs that return tables), a very significant
enhancement, and parallel execution of UDFs. Functions are resolved based on
multiple attributes and can be delivered in binary form (source code is not required),
making it attractive for third parties to develop DB2 UDFs. In addition, DB2 supports
the notion of a "sourced" function, allowing one to reuse the code of an existing
function. UDFs can run in an "unfenced" mode in the same address space as the DB2
server for fast performance, or in a "fenced" mode in a separate address space for
security.
UDFs can be written in C, Visual Basic, Java, or any language that follows the C
calling convention. The ability to write UDFs in SQL is coming. Support for the
JDBC API is also available. This provides a set of object methods for Java
applications to access relational data.
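The hedged sketch below registers an external scalar UDF written in C and then uses it, like a built-in function, in a query issued from a Java client over JDBC. The library name, function, table, and connection details are assumptions, and the exact CREATE FUNCTION clauses should be taken from the DB2 documentation.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch: registering an external C UDF and calling it from SQL over JDBC.
    // Library path, function and table names, and connection URL are assumptions.
    public class UdfExample {
        public static void main(String[] args) throws Exception {
            Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");  // DB2 "app" JDBC driver (assumed installed)
            Connection con = DriverManager.getConnection("jdbc:db2:sample", "db2user", "password");
            Statement stmt = con.createStatement();
            // Register a scalar UDF implemented in a C library; NOT FENCED runs it
            // in the server's address space for speed.
            stmt.executeUpdate(
                "CREATE FUNCTION discount(DOUBLE) RETURNS DOUBLE " +
                "EXTERNAL NAME 'pricelib!discount' " +
                "LANGUAGE C PARAMETER STYLE DB2SQL " +
                "NO SQL DETERMINISTIC NO EXTERNAL ACTION NOT FENCED");
            // The UDF can now be used anywhere a built-in function can.
            ResultSet rs = stmt.executeQuery("SELECT name, discount(price) FROM product");
            while (rs.next()) {
                System.out.println(rs.getString(1) + " " + rs.getDouble(2));
            }
            rs.close();
            stmt.close();
            con.close();
        }
    }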
5.4.1.1.3 New Index Structures
Several new index structures for text, images, audio, and video are now available
through the DB2 Relational Extenders. Future releases of DB2 will add navigational
access via object pointers, indexes on expressions (for example, index on salary +
commission), indexes on the attribute of a UDT (e.g., index on language(document)
where document is a UDT and language is an attribute of document), and support for
user-defined index structures.
5.4.1.1.4 Extensible Optimiser
The optimiser plays a critical role in achieving good performance, and IBM has made
a significant investment here. DB2's optimiser is now rule-based and has sophisticated
query transformation, or query rewrite, capabilities. This is a key foundation for
DB2's object-relational extensibility. IBM plans to document the rules interface for
the optimiser so that customers and third parties can also extend the scope of the DB2
optimiser. The user can give the DB2 optimiser helpful information about the cost of
a UDF, including the number of I/Os per invocation, CPU cost, and whether the
function involves any external actions.
5.4.1.2 LOBs And File Links
Version 2 of DB2 added LOB (large object) support with three predefined LOB
subtypes: BLOBs (binary), CLOBs (character), and DBCLOBs (double-byte
character). There can be any number of LOB columns per table. DB2 provides
significant flexibility in storing and retrieving LOB data to provide both good
performance and data recoverability. For example, the option to preallocate storage
for LOBs trades off storage requirements and performance. Locators within LOBs
allow DB2 to optimize delivery of the data via piecemeal retrieval of LOB data. In the
area of recovery, the option to log or not log LOB data trades off recovery and
performance; the ability to recover LOB data without writing changes to the log files
optimises both. Reading LOB data into application memory rather than into shared
memory avoids excessive manipulation of shared memory. And direct file support
means DB2 can write LOBs directly to disk on the client for manipulation by the
application.
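A hedged sketch of defining a CLOB column and reading it through the JDBC 2.0 Clob interface follows; the table, column size, and connection details are invented for illustration, and locator-based retrieval depends on the driver in use.

    import java.sql.Clob;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch: a CLOB column in DB2 and piecemeal retrieval through a locator-style
    // JDBC 2.0 Clob object. Names and sizes are illustrative assumptions.
    public class LobExample {
        public static void main(String[] args) throws Exception {
            Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
            Connection con = DriverManager.getConnection("jdbc:db2:sample", "db2user", "password");
            Statement stmt = con.createStatement();
            stmt.executeUpdate(
                "CREATE TABLE report (id INTEGER NOT NULL PRIMARY KEY, body CLOB(1M) NOT LOGGED)");
            ResultSet rs = stmt.executeQuery("SELECT body FROM report WHERE id = 1");
            if (rs.next()) {
                Clob body = rs.getClob(1);                     // the driver may return a locator, not the data
                String firstPage = body.getSubString(1, 2000); // fetch only the piece that is needed
                System.out.println(firstPage);
            }
            rs.close();
            stmt.close();
            con.close();
        }
    }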
Integration with external file systems via the robust file-links technology described
above is in development. External filesystems will be supported for data/index storage
and delivery of data, and DB2 will guarantee the integrity of data stored in external
files.
5.4.1.3 Integrated Searchable Content
All of the object-relational extensions to DB2 are fully supported by SQL, enabling
the user to access any and all data in the database in a single SQL statement. Extended
functions for searching the content of complex data can be used anywhere a built-in
SQL function can be used. This is an important goal for IBM. The company is
committed to the SQL3 standard here.
5.4.1.4 Business Rules
DB2 v2 already supports SQL3-style triggers, declarative integrity constraints, and
stored procedures. DB2 offers significant flexibility in its stored procedures. Stored
procedures are written in a 3GL (C, Cobol, Fortran), a 4GL, or Java. This approach
provides both portability and programming power, unlike proprietary stored
procedure languages. IBM also plans to implement the SQL3 procedural language
extensions so that procedures can be written in SQL as well.
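As a hedged illustration, the sketch below registers a stored procedure whose body is an ordinary Java method and invokes it through a JDBC CallableStatement. The class, method, and registration clauses are assumptions to be checked against the DB2 manuals.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Sketch: a stored procedure written in Java, registered in DB2 and called
    // through JDBC. Class, method and connection names are assumptions.
    public class ProcedureExample {
        public static void main(String[] args) throws Exception {
            Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
            Connection con = DriverManager.getConnection("jdbc:db2:sample", "db2user", "password");
            Statement stmt = con.createStatement();
            stmt.executeUpdate(
                "CREATE PROCEDURE raise_salary (IN empno CHAR(6), IN pct DOUBLE) " +
                "EXTERNAL NAME 'HrProcs.raiseSalary' LANGUAGE JAVA PARAMETER STYLE JAVA");
            CallableStatement call = con.prepareCall("{call raise_salary(?, ?)}");
            call.setString(1, "000010");
            call.setDouble(2, 4.5);
            call.execute();
            call.close();
            stmt.close();
            con.close();
        }
    }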
5.4.1.5 Predefined Extensibility: DB2 Relational Extenders
The DB2 Relational Extenders build on the object-relational infrastructure of DB2.
Each extender is a package of predefined UDTs, UDFs, triggers, constraints, and
stored procedures that satisfies a specific application domain. With the extenders, the
user can store text documents, images, videos, and audio clips in DB2 tables by
adding columns of the new data types provided by the extenders. The actual data can
be stored inside the table or outside in external files. These new data types also have
attributes that describe aspects of their internal structures, such as "language" and
"format" for text data. Each extender provides the appropriate functions for creating,
updating, deleting, and searching through data stored in its data types. The user can
now include these new data types and functions in SQL statements for integrated
content searching across all types of data. Here are some highlights of the current
extender offerings. All of the extenders will be bundled with DB2 Universal
Database.
5.4.1.5.1 DB2 Text Extender
The Text Extender supports full-text indexing and linguistic and synonym search
functions on text data in 17 languages. Search functions include word and phrase,
proximity, wild card, and others plus the ability to rank each retrieved document
based on how well it matches the search criteria. Text Extender works with multiple
document formats and can be applied to pre-existing character data in the database.
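To give a feel for how such search functions appear in ordinary SQL, the hedged sketch below uses a CONTAINS-style UDF in a WHERE clause. The handle column, table, and search-argument syntax are assumptions modelled on the Text Extender documentation rather than verbatim product syntax.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch: full-text search through an extender-supplied UDF in plain SQL.
    // The "CONTAINS" usage, handle column and table are illustrative assumptions.
    public class TextSearchExample {
        public static void main(String[] args) throws Exception {
            Class.forName("COM.ibm.db2.jdbc.app.DB2Driver");
            Connection con = DriverManager.getConnection("jdbc:db2:sample", "db2user", "password");
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery(
                "SELECT title FROM article " +
                "WHERE CONTAINS(body_handle, '\"service management\"') = 1");
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
            rs.close();
            stmt.close();
            con.close();
        }
    }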
5.4.1.5.2 DB2 Image Extender
The Image Extender offers similar capabilities for images stored in multiple formats.
Searches can be based on average color, color distribution, and texture within an
image. The user can also find images similar to an existing image. Future
enhancements include image-within-image support and more sophisticated indexing
techniques.
5.4.1.5.3 DB2 Audio Extender
The Audio Extender supports a variety of audio file formats, such as WAVE and
MIDI, and offers import/export facilities and the ability to playback audio clips on an
audio browser. Attributes include number of channels, transfer time, and sampling
rate.
5.4.1.5.4 DB2 Video Extender
The Video Extender maintains attributes such as frame rate, compression ratio, and
number of video tracks for each video clip stored in the database. This extender
supports several video file formats (MPEG-1, AVI, QuickTime, etc.) and a variety of
file-based video servers. In addition to playing video clips, the Video Extender can
identify and maintain data about scene changes within a video, enabling the user to
find specific shots (all of the frames associated with a particular scene) and a
representative frame within the shot.
5.4.1.5.5 Other Extenders
IBM also offers an extender for fingerprints with extenders for time series and spatial
data in development. The company is also working with third-party vendors to help
them incorporate their software products as DB2 extenders. An extender developer's
kit with wizards for generating and registering user-defined types and functions is
coming to make the development effort easier. This is an important component of
IBM's strategy.
5.4.2 IBM's Object-Relational Vision and Strategy
IBM has made significant contributions to our understanding and use of database
management systems and technology over the past three decades. IBM developed
both the relational model and SQL, its now-industry-standard query language, in the
1970s. IBM has also long recognized the need for an extensible database server. The
company began research in this area over ten years ago with its Starburst project, a
third-generation database research project that followed the System R (a prototype
relational database) and R* (distributed relational database) projects. Designed to
provide an extensible infrastructure for the DBMS, Starburst technology is now a
major underpinning for object-relational capabilities in IBM's DB2 family of RDBMS
products.
IBM has four primary efforts underway in its drive to deliver state-of-the-art object-relational data management:
5.4.2.1 DB2 Object-Relational Extensions
Extending the DB2 server itself is obviously a focal point. To do this, IBM is
incorporating Starburst technology into DB2 in phases. Version 2 of DB2 Common
Server added the first round of object-relational features, including UDTs, UDFs,
large object support, triggers, and enhanced integrity constraints. IBM also replaced
the entire query compiler and optimiser with an extensible one.
With these building blocks in place, IBM can extend DB2 to "push down" complex
business logic as much as possible into the database server. The company has already
introduced the DB2 Relational Extenders for text, image, audio, video, and others.
Future versions will continue to expand DB2's object-relational capabilities. IBM
plans to include object extensions and Relational Extenders in DB2 for OS/390 and
DB2 for OS/400 in the future as well.
5.4.2.2 Robust File Links For External File Access
We have already discussed file links and support of data stored in external file
systems. A major challenge is ensuring the integrity of external data and controlling
access to the data even though they are physically stored outside the database. To do
this, IBM is developing what it calls "robust file links." These file links will enable
SQL to provide a single point of access to data stored in both the DBMS and external
files. (See Figure 2.) The goal is to add integrated search capability to existing
applications that already use a file API for storage of data, and to give SQL-based
applications transparent access to external data. A future release of DB2 will include
this file-link capability.
A file link is actually a UDT (a file-link type) with a handle that points to an external
flat file. The developer uses the UDT when creating any new column that represents
data in an external file. In the example in Figure 2, photographs of employees are
stored externally and referenced in the "picture" column of the employee table. The
picture column is a file-link UDT that points to an external image file. External
indexes could also be represented as file-link UDTs.
DB2 provides two software components to support file-link types in a particular file
system. One is a DB2 file-system API that the DBMS uses to control the external files
in a file server. For example, when a new file is "inserted" into the database, the
DBMS checks to see if the file exists (that is, the external file has already been
created) and then tells the file system, "note that I own this file." The second software
component, DB2 File-Link Filter, is a thin layer on the file system that intercepts
certain file-system calls to these DBMS-managed external files to ensure that the
request meets DBMS security and integrity requirements. When a user submits a
query to retrieve the employee picture, the DBMS checks to see if the user has
permission to access the image. If yes, the DBMS returns to the application the file
name with an authorization token embedded in it. The application then uses the file
API to retrieve the image. No changes are required to the existing file API provided
by the operating system to support file links. The DBMS also uses the DB2 filesystem API to include external files when backing up the database.
File links will provide tight integration of file-system data with the object-relational
DBMS, allowing the DBMS to guarantee the integrity of data whether they are stored
inside or outside the database.
5.4.2.3 Client Object Support
Client Object Support is designed to fulfil many important roles in IBM's object-relational vision. This component, currently under development, is an extended
database front end that enhances object support for the client application. Client
Object Support can run anywhere but will most likely reside close to the application
for performance reasons. The overall goal is to enable the execution of database
extensions wherever appropriate: on the client, on an application server, or in the
DBMS. Client Object Support will:
- Provide a logical view of all data accessible through DB2, through file links, or through DataJoiner (see below) and guarantee transaction consistency.
- Manage the client cache, automatically moving objects between the database and the client cache as appropriate to satisfy query requests. Client Object Support will have enough optimization intelligence to decide where to run queries and UDFs (both can be executed locally or on the server) depending on the contents of the client cache, and to navigate through objects via pointers.
- Extend the existing database API to include a native API to the programming language in which the application is written. Thus, a C or C++ application, for example, will be able to navigate pointers among objects on the client side (Client Object Support will automatically create these pointers) and directly invoke UDFs. This also provides a way to map objects in the application program to objects in the database server so relational data can be materialized as native C/C++ objects, Java objects, etc.
Client Object Support is the mechanism by which IBM will provide tight integration
with object-oriented programming languages. The extensible architecture of Client
Object Support will also enable IBM to implement different APIs on top of it. An
example would be an API based on object-request-broker (ORB) technology that
supports the ORB Interface Definition Language (IDL) and Java objects.
5.4.2.4 DataJoiner For Heterogeneous Access
DataJoiner is IBM's solution for heterogeneous data access. It provides transparent
read/write access to all IBM RDBMS products plus VSAM, IMS, Oracle, Sybase,
Informix, Microsoft SQL Server, and database managers with an ODBC- or X/Open-compliant call-level interface, with others coming. DataJoiner is not just a simple
gateway between DB2 and other database managers. It includes the full functionality
of the DB2 server, a global optimiser that has considerable knowledge about the
various data managers supported, and the ability to handle SQL compensation
requirements.
Because DataJoiner is built on the DB2 server, it can take advantage of all of DB2's
object-relational extensions, including the ability to simulate these capabilities, where
possible, in non-DB2 data managers. The next release of DataJoiner will incorporate
Version 2 of DB2 Common Server, and thus begin to deliver a SQL3-based, object-relational API for access to all of its supported data managers.
5.4.3 IBM’s Business Intelligence Software Strategy
5.4.3.1 Business Intelligence Structure
The IBM business intelligence structure is an evolution of IBM’s earlier Information
Warehouse architecture. The structure consists of the following components: [23]
5.4.3.1.1 Business Intelligence Applications
These applications are complete business intelligence solution packages tailored for a
specific industry and/or application area. These packages use products from other
components of the business intelligence structure.
5.4.3.1.2 Decision Support Tools
These tools range from basic query and reporting tools to advanced online analytical
processing (OLAP) and information mining tools. All these tools support GUI-driven
client interfaces. Many can also be used from a Web interface. At present most of
these tools are designed to handle structured information managed by a database
product, but IBM’s direction here is to add capabilities for handling both complex and
unstructured information stored in database and file systems, and also on Web
servers.
5.4.3.1.3 Access Enablers
These consist of application interfaces and middleware that allow client tools to
access and process business information managed by database and file systems.
Database middleware servers enable clients to transparently access multiple back-end
IBM and non-IBM database servers — this is known as a federated database. Web
server middleware allows Web clients to connect to this federated database.
5.4.3.1.4 Data Management
These products are used to manage the business information of interest to end users.
Included in this product set is IBM’s DB2 relational database family. Business
information can also be accessed and maintained by third-party relational database
products through the use of IBM’s database middleware products. Web server
middleware permits information managed by Web servers to participate in the
business intelligence environment.
IBM sees up to three levels of information store being used to manage business
information. This three-level architecture is based on existing data warehousing
concepts, but as has already been mentioned, other types of information, for example,
multimedia data, will be supported by these information stores in the future. At the
top level of the architecture is the global warehouse, which integrates enterprise-wide
business information. In the middle tier are departmental warehouses that contain
business information for a specific business unit, set of users, or department. These
departmental warehouses may be created directly from operational systems, or from
the global warehouse. (Note that these departmental warehouses are often called data
marts.) At the bottom of the architecture are other information stores, which contain
information that has been tailored to meet the requirements of individual users or a
specific application. An example of using this latter type of information store would
be where financial data is extracted from a departmental information store and loaded
in a separate store for modeling by a financial analyst.
5.4.3.1.5 Data Warehouse Modeling and Construction Tools
These tools are used to capture data from operational and external source systems,
clean and transform it, and load it into a global or departmental warehouse. IBM
products use the database middleware of the Access Enabler component to access and
maintain warehouse data in non-IBM databases.
5.4.3.1.6 Metadata Management
This component manages the metadata associated with the complete business
intelligence system, including the technical metadata used by developers and
administrators, and the business metadata for supporting business users.
5.4.3.1.7 Administration
This component covers all aspects of business intelligence administration, including
security and authorization, backup and recovery, monitoring and tuning, operations
and scheduling, and auditing and accounting.
5.4.3.2 Business Intelligence Partner Initiative
IBM’s business intelligence structure is designed to be able to integrate and
incorporate not only IBM’s business intelligence products, but also those from third-party vendors. To encourage support for its business intelligence structure, IBM has
created a Business Intelligence Partner Initiative. The objective of this program is to
have not only joint marketing relationships with other vendors, but also joint
development initiatives that enable other vendors’ products to be integrated with
IBM’s products. Proof that IBM is serious about tight integration between its products
and those from other vendors can be seen in its current relationships with Arbor
Software, Evolutionary Technology International, and Vality Technology. The next
part of this paper on IBM’s business intelligence product set reviews the level of
integration that has been achieved to date with products from these vendors.
5.5 Sybase
5.5.1 Technology Overview: Sybase Computing Platform
The Sybase Computing Platform is directly aimed at the new competitive-advantage application development/deployment needs of enterprise IS. The Sybase Computing
Platform includes a broad array of products and features for Internet/middleware
architecture support, decision support, mass-deployment, and legacy-leveraging IS
needs, bundled into a well-integrated, field-proven architecture. The Adaptive Server
DBMS product family, the core engine of the Sybase Computing Platform, supports
the full spectrum of new-application data needs: mass-deployment, enterprise-scale
OLTP, and terabyte-scale data warehousing.
The most notable point about the Sybase Computing Platform for the developer is its
combination of simplicity and power. Developers can create applications that run
without change on all major platforms and architectures, scaling up from the laptop to
the enterprise server or Web server. These applications can take advantage of the
scalability of Adaptive Server and PowerDynamo, the flexibility and programmer
productivity of Powersoft's Java tools, and the legacy interoperability of Sybase's
Enterprise CONNECT middleware [25].
5.5.1.1 Adaptive Server
DBMSs need the following crucial characteristics to effectively support IS's new
application-development needs:
- Scalability down to the desktop and up to the enterprise;
- Flexibility to handle the new Internet and distributed-object architectures and to merge them with client-server and host-based solutions;
- Programmer productivity support by supplying powerful object interfaces that allow "write once, deploy many."
Notable new or improved Adaptive Server features to meet these needs include:
- A common database architecture from the laptop and desktop to the enterprise - improving flexibility and programmer productivity;
- Ability to handle mixed and varying workloads - improving scalability and flexibility;
- Ability to leverage legacy data via Sybase middleware - improving flexibility;
- Scalable-Internet-architecture support - improving scalability/flexibility; and
- New low- and high-end scalability features.
5.5.1.1.1 Common Database Architecture
Adaptive Server adds a common language capability and a "component layer" that
allows developers to write to a common API or class library across Adaptive Server
Enterprise, Adaptive Server IQ, and Adaptive Server Anywhere. This library is based
on Transact SQL, will soon include Java support, and will later support a superset of
each DBMS's APIs. Thus, developers using this interface can today write applications
for Adaptive Server Anywhere's Transact SQL that will run without change on the
other two DBMSs, and will shortly be able to "write once, deploy many" for all
Adaptive Server DBMSs.
5.5.1.1.2 Mixed and Varying Workloads
Adaptive Server Enterprise provides notable mixed-workload flexibility and
scalability, e.g., via the Logical Process Manager's effective allocation of CPU
resources and the Logical Memory Manager's tunable block I/O. Tunable block I/O
allows Adaptive Server Enterprise to adapt more effectively to changes in data-access
patterns, improving performance for changing workloads. This is especially useful for
scaling packaged applications that mix OLTP and decision support, and as flexible
"insurance" where the future mix of OLTP and decision support is hard to predict.
5.5.1.1.3 Leverage Legacy Data
Combined with Replication Server or other data-movement tools, Enterprise
CONNECT allows users to merge legacy and divisional databases periodically or on a
time-delayed basis into a common mission-critical-data pool. Administrators may
translate the data into end-user-friendly information or duplicate and group it for
faster querying. Thus, Enterprise CONNECT combined with Replication Server and
Adaptive Server IQ forms the core of an enterprise-scalable data warehouse.
Enterprise CONNECT includes an array of gateways, open clients, and open servers
that match or exceed other suppliers' offerings in breadth and functionality. These
include high-performance, globalized versions of Open Client and Open Server
integrated with directory and security services such as Novell's NDS, as well as
DirectCONNECT for MVS that integrates Open ClientCONNECT and Open
ServerCONNECT. Sybase supports X/Open's TP-monitor standard via Sybase's XA
Library that includes support for CICS/6000, Encina, Tuxedo, and Top End.
Replication Server - which allows developers to replicate data across multiple
suppliers' distributed databases - is now fully supported in all three DBMS products.
Users can use Replication Server to synchronize Sybase databases with heterogeneous
distributed databases within non-SQL-Server and multisupplier environments. The
OmniSQL Gateway "data catalogs" provide data-dictionary information across
multiple databases. Users may now apply distributed queries across not only Adaptive
Server's databases but also previous versions of SQL Server.
Overall, Sybase's legacy-data support is exceptionally broad, allowing simple, flexible
backend access to a particularly wide range of user data. Moreover, it is field-proven;
products such as Open Client and Open Server have been highly popular and field-tested for more than half a decade.
5.5.1.1.4 Internet-Architecture Support
Adaptive Server is "Web-enabled," allowing users to write once to its common API
and thereby deploy automatically across a wide range of target Internet environments.
Enterprise CONNECT allows Internet access to more than 21 backend database
servers, including partnerships with "firewall" suppliers for database security.
5.5.1.1.5 Low- and High-end Scalability
Adaptive Server builds on SQL Server 11's benchmark-proven record of high
scalability in database size and numbers of end users. Adaptive Server Enterprise now
includes features such as bi-directional index scans for faster query processing,
parallel querying, and parallel utilities such as online backup. Adaptive Server
effectively supports database applications at the workgroup and client level through
its synergism with Adaptive Server Anywhere.
5.5.1.1.6 Adaptive Server Anywhere
Users should also note Adaptive Server Anywhere's exceptional downward
scalability. Adaptive Server Anywhere's relatively small engine size ("footprint")
allows it to fit within most of today's desktops and many of today's laptops. It takes up
approximately 1 to 2 megabytes in main memory and 5.3 megabytes on disk. Thus, for
exceptionally high performance for small-to-medium-scale databases, Adaptive
Server Anywhere code and the entire database or a large database cache could run
entirely in main memory.
Adaptive Server Anywhere also requires only 2K bytes per communications
connection, a key consideration in many sites where free space in low main memory
is scarce. At the same time, Adaptive Server Anywhere provides the RDBMS features
essential to high performance and scalability in desktop and workgroup environments:
the ability to run in main memory; multi-user support and multithreading; 32-bit
support; stored procedures and triggers; transaction support; native ODBC Level 2
support for faster ODBC access to the server; and cursor support to minimize time
spent on result set download from the server.
Further, Adaptive Server Anywhere completely supports Java database development
by storing Java objects and JavaBeans, by accommodating stored procedures and
triggers written in Java, and by offering high-performance Java database connectivity.
Finally, Adaptive Server Anywhere sets a new standard in low-end performance by
integrating Symmetric Multi-Processor support and including enhanced caching and
optimizing capabilities.
5.5.2 Sybase's Overall Application Development/Upgrade Solution: Customer-Centric Development
Adaptive Server is a key component of Sybase's overall application development/deployment solution that also includes Sybase middleware such as PowerDynamo and
Powersoft development tools such as PowerBuilder Enterprise, PowerSite, and
PowerJ. PowerBuilder builds on the highly-popular PowerBuilder client-server
application development environment to provide enterprise features such as data-access and Web-enablement support. The PowerSite Visual Programming
Environment (VPE) provides support for developers creating new data-driven, Web-based enterprise-scale applications; it includes advanced Web-development features
such as team programming support, automated Web application deployment, and
application management across the Internet. PowerJ supports development of Java
applications for both Web server and client applets, including Enterprise JavaBeans
support. PowerDesigner provides team-programming application design and data
modeling support, with specific features for data warehousing and the Internet.
The Jaguar CTS transaction server and PowerDynamo application server provide load
balancing for Web-based applications (in effect acting as an Internet TP monitor for
scalability and access to multiple suppliers' backend databases), and PowerDynamo
offers automated application deployment via replication. jConnect for JDBC allows
developers to access multiple suppliers' backend databases via a common SQL-based
API.
Sybase aims the Sybase Computing Platform at "customer-centric development": that
is, allowing IS to create new applications that use new technology to deliver
competitive advantage by providing services to "customer" end users inside and
outside the enterprise - for example, data mining for lines of business or Web
electronic commerce for outside customers.
The Sybase Computing Platform and Powersoft tools together provide exceptional
features to aid incorporation of new technologies into competitive-advantage
application development and deployment:
- Solution scalability in both application and database complexity - e.g., via Adaptive Server's "both-ways" scalability and the focus of all products on enterprise-scale application development via data-driven and team programming;
- Solution flexibility to cover a broad range of IS needs - e.g., via Adaptive Server's ability to cover OLTP, decision support, and mass-deployment needs, and by the emphasis of the development tools and middleware on open-architecture standards such as the Internet, Java and JDBC;
- Programmer productivity features enabling a "write once, deploy many" approach, including such features as Adaptive Server's common API, the development tools' VPEs and team-programming features, and PowerSite's automated deployment features; and
- Specific support for new technologies such as the Internet, objects, and data warehousing - including not only support across the product line for Java objects and the Internet architecture, but also specialty datastores for multimedia and geospatial data from Sybase partners as well as Replication Server, distributed querying, and Adaptive Server IQ's fast-query capabilities.
5.5.3 Java for Logic in the Database
- A full-featured programming language for the DBMS: Application logic (in the form of Java classes) will run in Adaptive Server in a secure fashion. A Java virtual machine (VM) and an internal JDBC interface are being built into Adaptive Server to make this happen. In this way, Sybase is bringing a full-featured yet secure programming language into the server, overcoming the programming limitations of SQL-based stored procedures.
- Object data type: Java objects can be stored as values in a relational table. This provides the support for rich data types that other object-relational databases have aimed for, but by using Java it does so in an open, non-proprietary fashion. (A hedged sketch of this idea appears after this list.)
- A consistent programming model: For the first time, application components can be moved between clients or middle-tier servers and the DBMS. Developers have a single, consistent programming model for all tiers.
- A natural implementation: Sybase is committed to a natural implementation: Java objects and syntax work as you expect them to work; server schemas function in an expected manner, even when interacting with Java objects.
The Sybase Java initiative will open new doors for enterprise application development
[26].
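As a hedged sketch of the object data type mentioned in the list above, the code below shows a plain Java class that, once its compiled bytecode has been installed in Adaptive Server, could serve as a column type. The class and the DDL in the trailing comment are assumptions illustrating the announced direction, not confirmed product syntax.

    import java.io.Serializable;

    // Sketch: a Java class intended to be installed into Adaptive Server and used
    // as a column data type. The class itself is ordinary, portable Java; the
    // installation step and the DDL sketched below are Sybase-specific assumptions.
    public class Address implements Serializable {
        public String street;
        public String city;
        public String postalCode;

        public Address(String street, String city, String postalCode) {
            this.street = street;
            this.city = city;
            this.postalCode = postalCode;
        }

        // A method that server-side logic could invoke on stored Address values.
        public boolean inCity(String name) {
            return city != null && city.equalsIgnoreCase(name);
        }
    }

    // Assumed DDL once the class is installed in the server (illustrative only):
    //   CREATE TABLE customer (name VARCHAR(40), addr Address)
    // Field and method access syntax on such columns is vendor-specific.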
5.5.3.1 A Commitment to Openness and Standards
The Java relational architecture removes barriers to application development
productivity, but proprietary implementations of new technologies can create other
barriers.
To further promote the open development environment Sybase believes IT
organizations need, Sybase is working with JavaSoft, the ANSI SQL standards
committee, and the JSQL consortium to develop standards for running Java in the
DBMS.
Sybase aims to succeed by being the best company for IT to work with, not by
providing proprietary solutions to IT problems.
5.5.3.2 Java for Logic in the Database
Today's data servers use SQL to perform two tasks: data access and server-based
logic. While SQL continues to be an excellent language for data manipulation and
definition, the stored procedure extensions to SQL that allow server-based logic show
some clear weaknesses.
SQL stored procedures are limited by the lack of development tools, the inability to
move stored procedures outside the server, and the lack of many features found in
modern application programming languages such as external libraries, encapsulation
and other aspects of object orientation, and the ability to create components.
5.5.3.2.1 Installing classes into the server
Java logic is written in the form of classes. To use Java in the server, Adaptive Server
will provide the ability to install a Java class into the server. The class is compiled
into bytecode (ready for execution by the VM) outside the server. Once installed, it
can be run and debugged from inside the server.
Java provides a natural solution to the limitations of stored procedures for encoding
logic in a server. SQL continues to be the natural language for data access and
modification.
5.5.3.2.2 Accessing SQL from Java with JDBC
To implement Java logic in the database, there is a need for a Java interface to SQL.
Just as SQL data manipulation and definition statements can be accessed from stored
procedures, so too must they be accessible from Java methods.
On the client side, JDBC provides an application programming interface (API) for
including SQL in Java methods. JDBC is a Java Enterprise API for executing SQL
statements and was introduced in the Java SDK 1.1.0.
To meet the goal of removing barriers to application development and deployment,
JDBC must also provide the interface for accessing SQL from Java methods inside
the database. An internal JDBC interface for Adaptive Server is therefore a key part
of the Sybase Java initiative.
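A hedged sketch of a server-installed Java class that uses such an internal JDBC interface is shown below. The jdbc:default:connection URL is an assumption borrowed from the convention several vendors use for in-server JDBC, not a confirmed Sybase identifier, and the audit_log table is invented.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    // Sketch: server-side logic written as a Java class, compiled to bytecode and
    // installed into Adaptive Server. The connection URL for the internal JDBC
    // interface is an assumption, as is the audit_log table.
    public class AuditLogger {
        public static void record(String user, String action) throws Exception {
            // Inside the server, the "connection" is the session the caller already has.
            Connection con = DriverManager.getConnection("jdbc:default:connection");
            PreparedStatement ps = con.prepareStatement(
                "INSERT INTO audit_log (login_name, action, logged_at) VALUES (?, ?, getdate())");
            ps.setString(1, user);
            ps.setString(2, action);
            ps.executeUpdate();
            ps.close();
            // The internal connection is not closed: it belongs to the calling session.
        }
    }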
5.5.3.2.3 Facilitating JDBC Development
Like ODBC, JDBC is a low-level database interface. Just as many RAD tools have
built their own more usable interfaces on top of ODBC, so too is there a need for
interfaces on top of JDBC if developers are to be productive.
As the Sybase implementation installs compiled Java classes (bytecode) into the
DBMS, any of the higher-level tools and methods that generate Java and JDBC code
are automatically supported.
For example:
- JSQL: JSQL is an alternative method of including SQL calls in Java code, managed by a consortium that includes IBM, Oracle, Sybase, and Tandem. JSQL provides an embedded SQL capability for JDBC. JSQL code is in some cases simpler to write than JDBC, and for database administrators it has the advantage of being closer to the way in which SQL-based stored procedures are written. JSQL code is preprocessed into JDBC calls before compilation. Adaptive Server users will be able to write in JSQL if they wish, and install the preprocessed code into the server.
- RAD Tools: RAD tools for Java, such as Sybase's PowerJ, provide Java classes built on top of JDBC to give developers a more usable interface. Such classes can be installed into the server for use.
- JavaBeans: JavaBeans are components: collections of Java classes with a well-defined interface. JavaBeans can be installed into the server in the same way as any other set of classes.
The Sybase Adaptive Component Architecture recognizes that component-based
application development has become the major way to build enterprise database
applications and the principal method of accelerating software delivery by reusing
code.
5.6 Microsoft
Microsoft SQL Server Enterprise Edition 6.5 is a high-performance database
management system designed specifically for the largest, highly available applications
on the Microsoft Windows NT operating system. It extends the capabilities of SQL Server
by providing higher levels of scalability, performance, built-in high-availability, and a
comprehensive platform for deploying distributed, mission-critical database
applications [27].
5.6.1 Overview
As businesses streamline processes and decentralize decision-making, they
increasingly depend on technology to bring users and information together. To that
end, enterprise-class organizations are turning to distributed computing as the bridge
between data and informed business decisions. Performance and reliability become an
even greater factor as today's transactional processing systems grow in size and
number of users.
Microsoft SQL Server, Enterprise Edition 6.5 was engineered with this environment
in mind. Microsoft SQL Server, Enterprise Edition extends the tradition of excellence
in Microsoft SQL Server, providing a higher level of scalability and availability.
Optimized for the Windows NT Enterprise Edition operating system, Microsoft SQL
Server, Enterprise Edition is designed to meet the needs of enterprise OLTP, data
warehouse and Internet applications. In addition to the features provided in the
standard version of Microsoft SQL Server, the Enterprise Edition of SQL Server
supports high-end symmetric multiprocessing (SMP) servers with additional memory,
providing customers with better performance and scalability. To meet the availability
and 7-day by 24-hour requirements of mission-critical applications, Microsoft SQL
Server, Enterprise Edition also supports high-availability 2-node clusters.
Many of these performance and reliability gains are achieved through the close
integration with the Enterprise Edition of Windows NT Server. And, as part of the
Microsoft BackOffice, Enterprise Edition family, Microsoft SQL Server, Enterprise
Edition 6.5 works with the other Microsoft BackOffice server products for superior,
integrated client/server and Web-based applications.
5.6.1.1 Product Highlights
5.6.1.1.1 Support for larger SMP servers
Microsoft SQL Server is architected to deliver excellent scalability on SMP servers
from a variety of system vendors. The standard version is optimized for use on up to
four-processor SMP servers. Enterprise Edition is designed and licensed for use on a
new class of more-than-four-processor SMP servers for superior scalability.
5.6.1.1.2 Cluster-ready for high availability
Microsoft SQL Server, Enterprise Edition also delivers built-in support for Microsoft
Cluster Server, formerly known by the code name "Wolfpack." In a high-availability
cluster configuration, Microsoft SQL Server delivers 100 percent protection against
hardware faults for mission-critical applications. To simplify the management of a
high-availability cluster, Microsoft SQL Server, Enterprise Edition provides easy-to-use graphical tools for setting up and configuring the two-node cluster.
5.6.1.1.3 Support for additional memory
Complete support for Windows NT Server 4 GB RAM Tuning (4GT) allows Microsoft SQL Server to take advantage of additional memory. Making use of 4GT allows
Microsoft SQL Server to address up to 3 GB of real memory, providing increased
performance for applications such as data warehousing. This feature is available for
Microsoft SQL Server, Enterprise Edition only on 32-bit Intel architecture servers.
Very large memory (VLM) support for Digital's 64-bit Alpha Servers will be
delivered in a future release of Microsoft SQL Server, Enterprise Edition.
5.6.1.1.4 Natural language interface
The natural language interface enables the retrieval of information from SQL Server, Enterprise Edition using English rather than a formal query language such as SQL. An application using
Microsoft English Query accepts English commands, statements, and questions as
input and determines their meaning. It then writes and executes a database query in
SQL Server and formats the answer.
5.6.1.1.5 A platform for building reliable, distributed applications
In addition, Microsoft SQL Server, Enterprise Edition, in conjunction with Windows
NT Server, Enterprise Edition, is a complete platform for reliable, large-scale,
distributed database applications, utilizing the Microsoft Transaction Server and
Microsoft Message Queue Server software. Microsoft Transaction Server is
component-based middleware for building scalable, manageable distributed
transaction applications quickly. Microsoft Transaction Server provides simple
building blocks that can reliably and efficiently execute complex transactions across
widespread distributed networks, including integrated support for Web-based
applications. Microsoft Message Queue Server is store-and-forward middleware that
ensures delivery of messages between applications running on multiple machines
across a network. Microsoft Message Queue Server is an ideal environment for
building large-scale distributed applications that encompass mobile systems or
communicate across occasionally unreliable networks.
5.6.1.2 General Info
5.6.1.2.1 Specifications
- System using an Intel Pentium or Digital Alpha processor, running Microsoft Windows NT Server 4.0, Enterprise Edition
- 64 MB of memory
- 80 MB of available hard disk space (95 MB with books online)
- CD-ROM drive
5.6.1.2.2 Networking Options
The following networks are supported using native protocols:
- Microsoft Windows NT Server
- Microsoft LAN Manager
- Novell NetWare
- TCP/IP-based networks
- IBM LAN Server
- Banyan VINES
- Digital PATHWORKS
- Apple AppleTalk
5.6.1.2.3 Clients supported:
- Microsoft Windows operating system version 3.1
- Microsoft Windows 95
- Microsoft Windows for Workgroups
- Microsoft Windows NT Workstation
- Microsoft MS-DOS® operating system
5.6.2 Microsoft Cluster Server
In late 1995, Microsoft announced that they would work with their hardware and
software vendors to deliver clustering for the Microsoft Windows NT Server network
operating system, the Microsoft BackOffice integrated family of server software, and
leading application software packages. Clustering technology enables customers to
connect a group of servers to improve application availability, data availability, fault
tolerance, system manageability, and system performance. Unlike other clustering
solutions, the Microsoft approach does not require proprietary systems or proprietary
server interconnection hardware. Microsoft outlined this strategy because customers
indicated a need to understand how clustering will fit into their long-term, information
technology strategy.
Microsoft Cluster Server (MSCS), formerly known by its code name ”Wolfpack”, will be included as a built-in feature of Microsoft Windows NT Server, Enterprise Edition.
Over fifty hardware and software vendors participated in the MSCS design reviews
throughout the first half of 1996, and many of these are now working on MSCS-based
products and services. Microsoft is also working closely with a small group of Early
Adopter system vendors in the development and test of its clustering software:
Compaq Computer Corp., Digital Equipment Corp., Hewlett-Packard, IBM, NCR, and
Tandem Computers. Together, Microsoft and these vendors will create a standard set
of products and services that will make the benefits of clustered computers easier to
utilize and more cost effective for a broad variety of customers. [28]
5.6.2.1 A Phased Approach
MSCS software will include an open Application Programming Interface (API) that
will allow applications to take advantage of Windows NT Server, Enterprise Edition,
in a clustered environment. As will other application vendors, Microsoft plans to use
this API to add cluster-enabled enhancements to future versions of its server
applications, the BackOffice family of products. Clustering will be delivered in
phases.
Phase 1: Support for two-node failover clusters. Applications on a primary server will
automatically fail over to the secondary server when instructed to do so by the
administrator or if a hardware failure occurs on the primary server.
Phase 2: Support for shared-nothing clusters up to 16 nodes and for parallel
applications that can use these large clusters to support huge workloads.
The progress in MSCS will be mirrored by progress in applications that use these
features to provide application-level availability and scalability. Microsoft SQL Server is a good example of how applications built on top of MSCS provide these benefits to the customer.
5.6.2.2 SQL Server Use of Microsoft Cluster Server
Microsoft SQL Server is an excellent example of a Windows NT Server-based
application that will take advantage of MSCS to provide enhanced scalability and
availability.
Microsoft will deliver SQL Server clustering products in two phases:
Phase 1: Symmetric Virtual Server: Enables a two-node cluster to support multiple
SQL Servers. When one node fails or is taken offline, all the SQL Servers migrate to
the surviving node.
Phase 2: Massive Parallelism: Enables more than two servers to be connected for
higher performance.
5.6.2.2.1 Phase 1: Symmetric Virtual Server Solution
SQL Server will have the capability to run several SQL Server services on an MSCS
Cluster. In a two-node cluster, each node will be able to support half the database and
half the load. On failure, the surviving node will host both servers. During normal
operation, each node will serve half the clients and will be managing the database on
half the disks, as shown in Figure 2. SQL Server will also include wizards and
graphical tools to automate cluster setup and management. This phase will be
supported with the Phase 1 release of MSCS.
5.6.2.2.2 Availability
SQL Server 6.5, Enterprise Edition is scheduled for release in the third quarter of 1998. It will provide support for Microsoft Cluster Server and will utilize a 3 GB memory space for its execution, offering users even higher performance.
5.6.2.2.3 Phase 2: Massive Parallelism
Phase 2 will enable future versions of SQL Server to use massive parallelism on large
clusters. When the overall load exceeds the capabilities of a cluster, additional
systems may be added to scale up or speed up the system. This incremental growth
enables customers to add processing power as needed. This parallelism is almost
automatic for client-server applications like online-transaction processing, file
services, mail services, and Internet services. In those applications the data can be
spread among many nodes of the cluster, and the workload consists of many
independent small jobs that can be executed in parallel. By adding more servers and
disks, the storage and workload can be distributed among more servers. Similarly, for
batch workloads like data mining and decision support queries, parallel database
technology can break a single huge query into many small independent queries that
can be executed in parallel. Sphinx will support pipeline parallelism, while future
versions will support partition parallelism.
Formerly, IS professionals needed to make up-front commitments to expensive, high-end servers that provided space for additional CPUs, drives, and memory. With the
Phase 2 implementation of SQL Server on MSCS, they will be able to purchase new
servers as needed and just add them to the cluster to grow the system's capacity and
throughput.
5.7 NCR Teradata
5.7.1 Data Warehousing with NCR Teradata
Beginning with the first shipment of the Teradata RDBMS, NCR has over 16 years of experience in building and supporting data warehouses worldwide. Today, NCR Scalable Data Warehousing (SDW) delivers solutions across the data warehouse marketplace, from entry-level data marts to very large production warehouses with hundreds of terabytes. Data warehousing from NCR is a complete solution that combines Teradata parallel database technology, scalable hardware, experienced data warehousing consultants, and industry tools and applications available on the market today [29], [30].
5.7.1.1 The Database – A Critical Component of Data Warehousing
Most databases were designed for OLTP environments with quick access and updates
to small objects or single records. But what happens when you want to use your
OLTP database to scan large amounts of data in order to answer complex questions?
Can you afford the constant database tuning required to accommodate change and
growth in your data warehouse? And will your database support the scalability
requirements imposed by most data warehouse environments? Data warehousing is a
dynamic and iterative process, the requirements of which are constantly changing as
the demands on your business change.
5.7.1.2 NCR claims to be The Leader in Data Warehousing
NCR has more than 16 years of experience in the design, implementation, and management of large-scale data warehouses.
NCR presents itself as the data warehousing leader and as dominating industry benchmarks for decision support at all data volumes. NCR's WorldMark servers have been hailed [31] as the most open and scalable computing platforms on the market today. NCR claims to have the most comprehensive data warehousing programs to support your current and future initiatives, as well as alliances with other software and services vendors in the industry.
5.7.1.3 NCR Teradata RDBMS - The Data Warehouse Engine
The NCR Teradata Relational Database Management System (RDBMS) is a scalable,
high performance decision support solution. This data warehouse engine is an answer
for customers who develop scalable, mission-critical decision support applications.
Designed for decision support and parallel implementation from its conception, NCR
Teradata is not constrained by the limitations that plague traditional relational
database engines. Teradata easily and efficiently handles complex data requirements
and simplifies management of the data warehouse environment by automatically
distributing data and balancing workloads.
5.7.2 Teradata Architecture
NCR Teradata's architectural design was developed to support mission-critical, fault-tolerant decision support applications. Here is how NCR database technology has evolved.
5.7.2.1 The Beginning – AMPs
The original Teradata design employed a thin-node (one processor per logical
processing unit) shared nothing architecture that was implemented on Intel-based
systems. Each Access Module Processor (AMP), a physically distinct unit of
parallelism, consisted of a single Intel x386 or x486 CPU. Each AMP had exclusive
access to an equal, random portion of the database.
5.7.2.2 The Next Phase – VPROCs
As hardware technology advanced and the demand for non-proprietary systems
increased, Teradata entered its next phase of evolution in which the original
architecture was "virtually" implemented in a single Symmetric Multi-Processing
(SMP) system. The multitasking capabilities of UNIX, the enhanced power of next
generation processors like the Pentium, and the advent of disk array subsystems
(RAID), allowed for the implementation of virtual AMPs, also known as virtual
processors (VPROCs).
In this implementation, the logical concept of an AMP is separated even further from
the underlying hardware. Each VPROC is a collection of tasks or threads running
under UNIX or Windows NT. This allows the system administrator to configure NCR
Teradata to use more AMPs than the underlying system has processors. In turn, each
VPROC has semi-exclusive access to one or more physical devices in the attached
RAID subsystem. There are actually two types of VPROCs in Teradata: the AMP and
the Parsing Engine (PE). The PE performs session control and dispatching tasks, as
well as SQL parsing functions. The PE receives SQL commands from the user or
client application and breaks the command into sub-queries, which are then passed on
to the AMPs. There need not be a one-to-one relationship between the number of PEs
and the number of AMPs. In fact, one or two PEs may be sufficient to serve all other
AMPs on a single SMP node. The AMP executes SQL commands and performs
concurrency control, journaling, cache management, and data recovery.
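As a rough, purely illustrative sketch of the shared-nothing idea behind AMPs and VPROCs, the toy program below hash-partitions rows over a configurable number of AMP-like units. The class and variable names are invented for the example and do not correspond to the actual Teradata implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedNothingSketch {
    // Stand-in for an AMP: owns its own partition of the rows exclusively.
    static class Amp {
        final int id;
        final List<String> rows = new ArrayList<>();
        Amp(int id) { this.id = id; }
    }

    public static void main(String[] args) {
        int ampCount = 8;                       // e.g. more AMPs than physical CPUs
        List<Amp> amps = new ArrayList<>();
        for (int i = 0; i < ampCount; i++) {
            amps.add(new Amp(i));
        }

        // The "parsing engine" role here is reduced to routing each row to the
        // AMP that owns its hash bucket; real PEs also parse SQL and dispatch
        // sub-queries back to the AMPs.
        String[] primaryKeys = {"cust-17", "cust-42", "cust-99", "cust-4711"};
        for (String key : primaryKeys) {
            int bucket = Math.floorMod(key.hashCode(), ampCount);
            amps.get(bucket).rows.add(key);
        }

        for (Amp amp : amps) {
            System.out.println("AMP " + amp.id + " owns " + amp.rows);
        }
    }
}
```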
5.7.2.3 Today – MPP
Today's version of Teradata takes the architecture even further, providing openness
and scalability by adding additional operating system support and interconnect
capabilities. Now running on UNIX or Windows NT (with modifications - see note on
NT below), each SMP system can be connected via Teradata's high-speed
interconnect, the BYNET, to form "FAT Nodes" in a loosely coupled Massively
Parallel Processing (MPP) architecture that is managed as a single system. This
provides the foundation for Teradata's linear scalability, which can start with a four-processor SMP environment and scale to thousands of physical processors and tens of thousands of VPROCs, a perfect fit for entry-level data marts or massive enterprise warehouses.
Note on NT: Teradata on NT currently allows up to 50 GB of user data. An update of Teradata on NT is planned for fall 1998; this update will allow up to 300 GB of user data. According to plan, Teradata 3.0 will be released in the second half of 1999, and from that point forward feature and functionality enhancements will be released on the NT and UNIX platforms simultaneously [32].
5.7.2.3.1 BYNET – Scalable Interconnect
The BYNET is a redundant, fault-tolerant, intelligent, high-speed circuit switching
interconnect for Teradata. The BYNET allows the database to coordinate and
synchronize the activities of a large number of SMP nodes without increasing network
traffic or degrading performance as the system grows. The BYNET provides a node-to-node data transfer bandwidth of 10 MB per second and can linearly scale,
supporting up to 1024 nodes on a single system.
5.7.2.3.2 Scalable Hardware Platform
Teradata's scalability is further enhanced through its tight integration with the NCR
WorldMark platform. The WorldMark family of Intel-based servers provides seamless
and transparent scalability. Adding more computational power is as simple as adding
more hardware to the current system. The operating system will automatically
recognize and adapt to the additional system resources, and NCR Teradata will
redistribute existing data to take advantage of the new hardware. Existing applications
continue to run without modification.
5.7.3 Application Programming Interfaces
Teradata provides a number of standardized interfaces to facilitate easy development
of client/server applications. Included are the Teradata ODBC Driver, the Teradata
Call-Level Interface (CLI), and the TS/API which permits applications that normally
access IBM DB2 to run against Teradata. Also included are a number of third-party
interfaces like the Oracle Transparent Gateway for Teradata, Sybase Open Server and
Open Client.
5.7.4 Language Preprocessors
NCR Teradata provides a number of preprocessors to facilitate application
development in languages such as COBOL, C/C++ and PL/1. With the libraries in
these preprocessors, developers can create or enhance client or host-based
applications that access the Teradata RDBMS.
5.7.5 Data Utilities
Teradata includes both client-resident and host-based utilities that allow users and
administrators to interact with or control the Teradata engine. Among them are the
Basic Teradata Query facility (BTEQ) for command-line and batch-driven querying
and reporting; BulkLoad, FastLoad and MultiLoad for data loading and updating; and
FastExport for extracting data from Teradata.
5.7.6 Database Administration Tools
The Teradata RDBMS has a rich collection of tools and facilities to control the
operation, administration, and maintenance of the database. These include ASF/2 for
backup, archive, and recovery, the Database Window (DBW) for status and
performance statistics, and the Administrative Workstation (AWS) for a single point
of administrative control over the entire WorldMark-based Teradata system. All of
these tools and many others can be accessed individually or through a common user
interface known as Teradata Manager. Teradata Manager runs on Windows NT or
OS/2.
5.7.7 Internet Access to Teradata
The Internet can greatly expand your company's exposure to global markets. NCR
understands this emerging opportunity and consequently offers two common methods
for accessing information stored in the NCR Teradata RDBMS from the World Wide
Web: Java and CGI.
5.7.7.1 Java
The Teradata Gateway for Java provides application developers with a simple, easy-to-use API to access Teradata from the Internet or an intranet. Any client capable of
running a Java applet or application, including web browsers like Netscape Navigator
or Microsoft Internet Explorer, can now access the Teradata RDBMS directly.
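A minimal sketch of what such client-side access can look like, assuming a JDBC-style driver is available to the applet or application; the driver URL, credentials and table are placeholders, and the real Teradata Gateway for Java API may differ.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TeradataFromJavaSketch {
    public static void main(String[] args) throws Exception {
        // Driver class and URL are placeholders; the actual gateway/driver names
        // must be taken from the NCR documentation.
        String url = "jdbc:placeholder://teradata-host/database";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT region, SUM(revenue) FROM sales GROUP BY region")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + ": " + rs.getDouble(2));
            }
        }
    }
}
```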
5.7.7.2 CGI Access
The Common Gateway Interface (CGI) describes a standard for interfacing database
applications with web servers. NCR's CGI solution for Teradata allows SQL
statements to be embedded within an HTML page and provides a mechanism to return
result sets in the same HTML format. It validates parameters received through the
HTTP query string and allows all data manipulation language (DML) constructs,
including SELECT, INSERT, UPDATE, and DELETE statements.
5.7.8 NCR's Commitment to Open Standards
NCR is a dedicated member of many committees that define industry standards,
including the ANSI SQL Committee, the Microsoft Data Warehouse Alliance, the
OLAP Council and the Metadata Coalition. Fifteen years ago, with its original
parallel and scalable design, Teradata's developers began the process necessary to
make the database available on many different platforms.
5.7.9 Teradata at work
As your business expands in volume and complexity, NCR's data warehousing solutions protect your investment in both hardware and software. NCR data warehouses scale proportionately to support your users and your increasingly complex data. In fact, NCR offers a seamless pathway to scale a pilot data warehouse to a multi-terabyte configuration without changing hardware, databases or applications. The NCR Teradata RDBMS with its shared-nothing architecture, combined with a new class of scalable, modular and highly available WorldMark servers, gives you a secure, guaranteed solution to help you move confidently into the 21st century.
5.7.9.1 Retail
Retailers use NCR Teradata to compile and analyze months and years of data gathered
from checkout scanners in thousands of retail stores worldwide to manage purchasing,
pricing, stocking, inventory management and to make store configuration decisions.
5.7.9.2 Financial
The financial industry uses NCR Teradata for relationship banking and householding
where all customer account information is merged for cross segment marketing. Data
is sourced from diverse geographical areas, different lines of business (checking,
savings, auto, home, credit cards, ATMs) and from various online systems.
5.7.9.3 Telecommunications
The telecommunications industry uses NCR Teradata to store data on millions of
customers, circuits, monthly bills, volumes, services used, equipment sold, network
configurations and more. Revenues, profits and costs are used for target marketing,
revenue accounting, government reporting compliance, inventory, purchasing and
network management.
5.7.9.4 Consumer Goods Manufacturing
Manufacturers use NCR Teradata to determine the most efficient means for supplying
their retail customers with goods. They can determine how much product will sell at a
price point and manufacture goods for "just in time" delivery.
6 Analysis and recommendations
In general, there are different reasons for interest in distributed databases. Three of these are: (1) the opportunity to implement new architectures that are distributed according to their conceptual nature; (2) the opportunity to make distributed implementations of conceptually non-distributed systems in order to achieve efficiency; and (3) situations where the data are distributed by nature but where it is more desirable to treat them as non-distributed. In the latter case a federated database might be implemented to access the data.
The design of a distributed database is an optimisation problem requiring solutions to
several interrelated problems e.g. data fragmentation, data allocation, data replication,
partitioning and local optimisation. Special care is needed to implement query
processing and optimisation.
Retrieval and manipulation of data in the different database architectures offer various options for finding optimal solutions for database applications. In recent years many architectural options have been discussed in the field of distributed and federated databases, and various algorithms have been implemented to optimise the handling of data and the methodologies used to implement database applications.
Retrieval and manipulation in the different architectures nevertheless apply similar theoretical principles for optimising the interaction between applications and database systems. Efficient query and request execution is an important criterion when retrieving large amounts of data.
This part has also described a number of commercial database products competing in the VLDB segment. Most of these run on a variety of hardware platforms, and the DBMSs are generally supported by a range of tools for, e.g., data replication and data retrieval.
References
[1] Gray, Jim, Andreas Reuter, "Transaction Processing: Concepts and Techniques", Morgan Kaufmann, 1993
[2] Ceri, S.: "Distributed Databases", McGraw-Hill, 1984
[3] A. P. Sheth, J. A. Larson, "Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases", ACM Computing Surveys, 22(3): 183-236, September 1990
[5] S. Chaudhuri, "An Overview of Query Optimization in Relational Systems", Proceedings of the ACM PODS, 1998, http://www.research.microsoft.com/users/surajitc
[6] J. K. Smith, "Survey Paper On Vertical Partitioning", November 3, 1997, http://www.ics.hawaii.edu/~jkdmith/survey69l.html
[7] Goetz Graefe, "Query Evaluation Techniques for Large Databases", ACM Computing Surveys, pp. 73-170, June 1993, available online: http://wilma.cs.brown.edu/courses/cs227/papers/bl/Graefe-Survey.ps
[8] EURESCOM P817, Deliverable 1, Volume 2, Annex 1 - Architectural and Performance issues, September 1998
[9] Chakravarthy, S., Muthuraj, J., Varadarajan, R., and Navathe, S., "An Objective Function for Vertically Partitioning Relations in Distributed Databases and its Analysis", University of Florida Technical Report UF-CIS-TR-92-045, 1992
[11] Ee-Peng Lim, Roger H. L. Chiang, Yinyan Cao, "Tuple Source Relational Model: A Source-Aware Data Model for Multidatabases". Note: to be published in Data & Knowledge Engineering, Amsterdam: Elsevier, 1985-, ISSN 0169-023X. Requests for further details can be sent by email to fnorm@tdk.dk.
[12] Compaq World NonStop <http://www.tandem.com>
[13] Oracle Technology Network <http://technet.oracle.com/>
[14] The Object-Relational DBMS <http://technet.oracle.com/doc/server.804/a58227/ch5.htm#10325>
[15] PL/SQL New Features with Oracle8 and Future Directions <http://ntsolutions.oracle.com/products/o8/html/plsqlwp1.htm>
[16] A Family of Products with Oracle8 <http://www.oracle.com/st/o8collateral/html/xo8twps2.html>
[17] <http://www.oracle.com/st/o8collateral/html/xo8vtwp3.html>
[18] Getting Started with Informix Dynamic Server <http://www.informix.com/answers/english/pdf_docs/73ids/4351.pdf>
[19] Introduction to new features <http://www.informix.com/answers/english/pdf_docs/metacube/5025.pdf>
[20] Explorer User's Guide, MetaCube ROLAP Option <http://www.informix.com/answers/english/pdf_docs/metacube/4188.pdf>
[21] MetaCube for Excel, User's Guide, MetaCube ROLAP Option <http://www.informix.com/answers/english/pdf_docs/metacube/4193.pdf>
[22] Creating An Extensible, Object-Relational Data Management Environment, IBM's DB2 Universal Database <http://www.software.ibm.com/data/pubs/papers/dbai/db2unidb.htm>
[23] The IBM Business Intelligence Software Solution <http://www.software.ibm.com/data/pubs/papers/bisolution/index.html>
[24] The DB2 Product Family <http://www.software.ibm.com/data/db2/>
[25] Sybase Adaptive Server And Sybase Computing Platform: A Broad, Powerful Foundation For New-Technology Deployment <http://www.sybase.com/adaptiveserver/whitepapers/computing_wps.html>
[26] Sybase Adaptive Server: Java in the Database <http://www.sybase.com/adaptiveserver/whitepapers/java_wps.html>
[27] Microsoft SQL Server, Enterprise Edition <http://www.microsoft.com/sql/guide/enterprise.asp?A=2&B=2>
[28] Clustering support for Microsoft SQL Server <http://www.microsoft.com/sql/guide/sqlclust.asp?A=2&B=4>
[29] <http://www3.ncr.com/teradata/teraover.pdf>
[30] <http://www.teradata.com>
[31] <http://www3.ncr.com/data_warehouse/awards.html>
[32] <http://www3.ncr.com/teradata/nt/tntmore.html>
Part 3 Backup and Recovery
1 Introduction
The purpose of this chapter is to discuss the control strategies available to the
database administrator in order to deal with failures and security threats.
Any deviation of a system from the expected behaviour is considered a failure.
Failures in a system can be attributed to deficiencies in the components that make it
up, both hardware and software, or in the design.
Backups are important because no system is free from failures, not even fault-tolerant
systems. It is necessary to restore and recover the data quickly to resume operations.
The key to success in this situation is a well-defined backup and recovery strategy.
The definition of the rules for controlling data manipulation is part of the
administration of the database, so security aspects must also be taken into account.
This part ends with two appendices containing backup and recovery demonstrations of terabyte databases. The figures mentioned give an idea of the time needed and the system overhead generated when backing up and recovering a very large database.
2 Security aspects
Data security is an important function of a database system that protects data against
unauthorised access. Data security includes two aspects: data protection and
authorisation control.
Data protection is required to prevent unauthorised users from understanding the
physical content of data. This function is typically provided by data encryption.
Authorisation control must guarantee that only authorised users perform operations
they are allowed to perform on the database. Authorisations must be refined so that
different users have different rights on the same objects.
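As a small, generic illustration of the data protection aspect (not of any particular DBMS feature), the sketch below encrypts a sensitive value with AES before it would be stored, so that the physical content cannot be interpreted without the key. The key handling is deliberately simplified.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class ColumnEncryptionSketch {
    public static void main(String[] args) throws Exception {
        // Generate a 128-bit AES key; in practice the key would come from a key store.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        // Random initialisation vector for AES in GCM mode.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));

        // The sensitive value is encrypted before it is written to storage, so a
        // reader without the key cannot understand the physical content.
        byte[] ciphertext = cipher.doFinal(
                "4711-2222-3333".getBytes(StandardCharsets.UTF_8));
        System.out.println("stored ciphertext length: " + ciphertext.length);
    }
}
```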
When discussing security, it is important to note the various threats to data. Some
threats are accidental, but they can lead to the disclosure, deletion, or destruction of
the data in the databases. These threats include software, hardware, and human errors.
However, attempts to deliberately bypass or violate the security facilities are by far
the biggest security threats. Such attempts include the following:
- Unauthorised stealing, copying, changing, corrupting, or browsing through stored data.
- Electronic bugging of communication lines, terminal buffers, or storage media.
- Sabotaging, which can include erasing and altering the data, deliberately inputting erroneous data, or maliciously destroying the equipment or the storage media.
- Personnel aspects, such as position misuse, false identification, blackmail, bribery, or transferred authorisation (where users can obtain other passwords).
- DBAs avoiding or suppressing the security facilities.
- Shared programs performing functions not described in their specifications by taking advantage of the rights of their environment.
- Masquerading, such as when a program poses as the operating system or the application, to obtain user passwords.
Each organisation should have a data security policy, which is a set of high-level
guidelines determined by user requirements, environmental aspects, internal
regulations, and governmental laws. In a database environment, security focuses on
the allowed access, the control of the access, and the granularity of the control. The
allowed access to data is somewhere on the scale between need-to-know (only the
data necessary to perform a task is supplied) and maximal sharing. Data access
control can be described by a scale ranging from an open system to a closed system.
Control granularity determines for which types of objects access rights are specified -- for example, individual data items, collections of data items (such as rows in tables), data object contents (such as all the rows in a table or a view), the functions executed, the context in which something is done, or the previous access history.
The approaches, techniques, and facilities one uses for security control must cover
external (or physical) security control as well as internal (computer system) security
control. The external security controls include access control, personnel screening,
proper data administration, clean desk policies, waste policies, and many, many more.
The main focus lies here on the internal controls that ensure the security of the stored
and operational data.
These include the following:
- Access controls: Ensure that only authorised accesses to objects are made -- doing so specifies and enforces who may access the database and who may use protected objects in which way. Authorisation is often specified in terms of an access matrix, consisting of subjects (the active entities of the system), objects (the protected entities of the model), and access rights, where an entry for a [subject, object] pair documents the allowable operations that the subject can perform on the object. Two variations on access matrices are authorisation lists and capabilities. Authorisation lists or access-control lists (per object) specify which subjects are allowed to access the object and in what fashion. Capabilities are [object, rights] pairs allocated to users; they specify the name or address of an object and the manner in which those users may access it. (A minimal illustrative sketch of such an access matrix follows this list.)
- Ownership and sharing: Users may dispense and revoke access privileges for objects they own or control.
- Threat monitoring: An audit trail is recorded to examine information concerning installation, operations, applications, and fraud of the database contents. A usage log is kept of all the executed transactions, of all the attempted security violations, and of all the outputs provided.
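The sketch referred to above models the access matrix as a map from [subject, object] pairs to sets of rights. It is a toy in-memory illustration of the concept, not how any particular DBMS stores its authorisation rules.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AccessMatrixSketch {
    // Maps a [subject, object] pair to the set of allowed operations, i.e. one
    // entry of the access matrix described above.
    private final Map<String, Set<String>> matrix = new HashMap<>();

    void grant(String subject, String object, String right) {
        matrix.computeIfAbsent(subject + "/" + object, k -> new HashSet<>()).add(right);
    }

    boolean allowed(String subject, String object, String right) {
        return matrix.getOrDefault(subject + "/" + object, Set.of()).contains(right);
    }

    public static void main(String[] args) {
        AccessMatrixSketch acl = new AccessMatrixSketch();
        acl.grant("clerk", "CUSTOMER_TABLE", "SELECT");

        System.out.println(acl.allowed("clerk", "CUSTOMER_TABLE", "SELECT")); // true
        System.out.println(acl.allowed("clerk", "CUSTOMER_TABLE", "DELETE")); // false
    }
}
```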
The security mechanisms you use to protect your databases should have the following
properties:
- Completeness: Defence is maintained against all possible security-threatening attacks.
- Confidence: The system actually does protect the database as it is supposed to.
- Flexibility: A wide variety of security policies can be implemented.
- Ease of use: The database administrator (DBA) has an easy interface to the security mechanisms.
- Resistance to tampering: The security measures themselves are secure.
- Low overhead: The performance costs can be predicted and are low enough for efficiency.
- Low operational costs: The mechanisms utilise the available resources efficiently.
In most DBMSs, authorisation rules enforce security. Authorisation rules are controls
incorporated in the database and enforced by the DBMS. They restrict data access and
also the actions that people may take when they access the data.
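To make this concrete, the sketch below issues standard SQL GRANT and REVOKE statements through JDBC so that the DBMS itself records and enforces the authorisation rules. The connection details, table and user names are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AuthorisationRulesSketch {
    public static void main(String[] args) throws Exception {
        // Connection details are placeholders for whichever DBMS is in use.
        try (Connection con = DriverManager.getConnection(
                "jdbc:placeholder://host/db", "dba", "secret");
             Statement st = con.createStatement()) {
            // The DBMS stores these authorisation rules and enforces them on
            // every subsequent access by report_user.
            st.executeUpdate("GRANT SELECT ON customer TO report_user");
            st.executeUpdate("REVOKE DELETE ON customer FROM report_user");
        }
    }
}
```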
3 Backup and Recovery Strategies
This chapter puts the main emphasis on backup and recovery strategies for VLDBs. Recovery may be necessary to retrieve data from an archive with a specific date stamp, to reset an application, or, in case of data loss, to keep an application running with the lowest possible loss of service time.
One of the innumerable tasks of the DBA is to ensure that all of the databases of the
enterprise are always "available." Availability in this context means that the users
must be able to access the data stored in the databases, and that the contents of the
databases must be up-to-date, consistent, and correct. It must never appear to a user
that the system has lost the data or that the data has become inconsistent.
Many factors threaten the availability of the databases. These include natural disasters
(such as floods and earthquakes), hardware failures (for example, a power failure or
disk crash), software failures (such as DBMS malfunctions -- read "bugs" -- and
application program errors), and people failures (for example, operator errors, user
misunderstandings, and keyboard trouble). To this list one can also add security
aspects, such as malicious attempts to destroy or corrupt the contents of the database.
Oracle classifies the most frequent failures as follows:
- Statement and process failure: Statement failure occurs when there is a logical failure in the handling of a statement in an Oracle program (for example, the statement is not a valid SQL construction). When statement failure occurs, the effects (if any) of the statement are automatically undone by Oracle and control is returned to the user. A process failure is a failure in a user process accessing Oracle, such as an abnormal disconnection or process termination. The failed user process cannot continue work, although Oracle and other user processes can, with minimal impact on the system or other users.
- Instance failure: Instance failure occurs when a problem arises that prevents an instance (system global area and background processes) from continuing work. Instance failure may result from a hardware problem such as a power outage, or a software problem such as an operating system crash. When an instance failure occurs, the data in the buffers of the system global area is not written to the datafiles.
- User or application error: User errors can require a database to be recovered to a point in time before the error occurred. For example, a user might accidentally delete data from a table that is still required (for example, payroll taxes). To allow recovery from user errors and accommodate other unique recovery requirements, Oracle provides for exact point-in-time recovery. For example, if a user accidentally deletes data, the database can be recovered to the instant in time before the data was deleted.
- Media (disk) failure: An error can arise when trying to write or read a file that is required to operate the database. This is called disk failure because there is a physical problem reading or writing physical files on disk. A common example is a disk head crash, which causes the loss of all files on a disk drive. Different files may be affected by this type of disk failure, including the datafiles, the redo log files, and the control files. Also, because the database instance cannot continue to function properly, the data in the database buffers of the system global area cannot be permanently written to the datafiles.
Data saved in VLDBs are used by applications such as data mining and data warehousing to derive estimates, to make decisions, or to drive operational actions. Depending on the application, the loss of data may be without effect or may be unacceptable. In the latter case, security concepts have to be developed to avoid loss or to allow repair, respectively.
Errors that cause loss of data can be distinguished into six categories:
1. User error: a user deletes or changes data improperly or erroneously.
2. Operation error: an operation causes a mistake and the DBMS reacts with an error message.
3. Process error: a failure of a user process is called a process error.
4. Network error: an interruption of the network can cause network errors in client/server based databases.
5. Instance error: e.g. power failures or software failures can cause the instance (SGA with background processes) to stop working properly.
6. Media error: physical hardware defects cause read or write failures.
Errors 1-5 can be handled by algorithms of the DBMS. Backup and recovery strategies are therefore focused on the treatment of media errors and on archiving.
In a large enterprise, the DBA must ensure the availability of several databases, such
as the development databases, the databases used for unit and acceptance testing, the
operational online production databases (some of which may be replicated or
distributed all over the world), the data warehouse databases, the data marts, and all
of the other departmental databases. All of these databases usually have different
requirements for availability. The online production databases typically must be
available, up-to-date, and consistent for 24 hours a day, seven days a week, with
minimal downtime. The warehouse databases must be available and up-to-date during
business hours and even for a while after hours.
On the other hand, the test databases need to be available only for testing cycles, but
during these periods the testing staff may have extensive requirements for the
availability of their test databases. For example, the DBA may have to restore the test
databases to a consistent state after each test. The developers often have even more ad
hoc requirements for the availability of the development databases, specifically
toward the end of a crucial deadline. The business hours of a multinational
organization may also have an impact on availability. For example, a working day
from 8 a.m. in central Europe to 6 p.m. in California implies that the database must be
available for 20 hours a day. The DBA is left with little time to provide for
availability, let alone perform other maintenance tasks.
3.1 Recovery
Recovery is the corrective process to restore the database to a usable state from an
erroneous state.
The basic recovery process consists of the following steps:
1. Identify that the database is in an erroneous, damaged, or crashed state.
2. Suspend normal processing.
3. Determine the source and extent of the damage.
4. Take corrective action, that is:
   - Restore the system resources to a usable state.
   - Rectify the damage done, or remove invalid data.
   - Restart or continue the interrupted processes, including the re-execution of interrupted transactions.
5. Resume normal processing.
To cope with failures, additional components and algorithms are usually added to the
system. Most techniques use recovery data (that is, redundant data), which makes
recovery possible. When taking corrective action, the effects of some transactions
must be removed, while other transactions must be re-executed; some transactions
must even be undone and redone. The recovery data must make it possible to perform
these steps.
The following techniques can be used for recovery from an erroneous state:
Dump and restart: The entire database must be backed up regularly to archival
storage. In the event of a failure, a copy of the database in a previous correct state
(such as from a checkpoint) is loaded back into the database. The system is then
restarted so that new transactions can proceed. Old transactions can be re-executed if
they are available. The following types of restart can be identified:
- A warm restart is the process of starting the system after a controlled system shutdown, in which all active transactions were terminated normally and successfully.
- An emergency restart is invoked by a restart command issued by the operator. It may include reloading the database contents from archive storage.
- A cold start is when the system is started from scratch, usually when a warm restart is not possible. This may also include reloading the database contents from archive storage. Usually used to recover from physical damage, a cold restart is also used when recovery data was lost.
Undo-redo processing (also called roll-back and re-execute): By using an audit trail
of transactions, all of the effects of recent, partially completed transactions can be
undone up to a known correct state. Undoing is achieved by reversing the updating
process. By working backwards through the log, all of the records of the transaction in
question can be traced, until the begin transaction operations of all of the relevant
transactions have been reached. The undo operation must be "idempotent," meaning
that failures during undo operations must still result in the correct single intended
undo operation taking place. From the known correct state, all of the journaled
transactions can then be re-executed to obtain the desired correct resultant database
contents. The operations of the transactions that were already executed at a previous
stage are obtained from the audit trail. The redo operation must also be idempotent,
meaning that failures during redo operations must still result in the correct single
intended redo operation taking place. This technique can be used when partially
completed processes are aborted.
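A minimal sketch of the undo/redo idea, under many simplifying assumptions (a single in-memory "table", physical before/after images, no concurrency): because recovery simply overwrites each affected row with the logged image, running an undo or redo pass more than once yields the same result, which is the idempotence property described above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UndoRedoSketch {
    // One audit-trail record: the before and after image of a single row.
    record LogRecord(String rowId, String beforeImage, String afterImage) {}

    // Undo: overwrite each touched row with its before image (idempotent).
    static void undo(Map<String, String> table, List<LogRecord> log) {
        for (int i = log.size() - 1; i >= 0; i--) {       // walk the log backwards
            LogRecord r = log.get(i);
            table.put(r.rowId(), r.beforeImage());
        }
    }

    // Redo: overwrite each touched row with its after image (also idempotent).
    static void redo(Map<String, String> table, List<LogRecord> log) {
        for (LogRecord r : log) {
            table.put(r.rowId(), r.afterImage());
        }
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>(Map.of("acc-1", "100", "acc-2", "200"));
        List<LogRecord> log = List.of(new LogRecord("acc-1", "100", "80"),
                                      new LogRecord("acc-2", "200", "220"));

        redo(table, log);
        redo(table, log);              // a second redo pass changes nothing: idempotent
        System.out.println(table);     // acc-1=80, acc-2=220 (map order may vary)

        undo(table, log);
        System.out.println(table);     // back to acc-1=100, acc-2=200
    }
}
```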
Roll-forward processing (also called reload and re-execute): All or part of a
previous correct state (for example, from a checkpoint) is reloaded; the DBA can then
instruct the DBMS to re-execute the recently recorded transactions from the
transaction audit trail to obtain a correct state. It is typically used when (part of) the
physical media has been damaged.
Restore and repeat: This is a variation of the previous method, where a previous
correct state is restored. The difference is that the transactions are merely reposted
from before and/or after images kept in the audit trail. The actual transactions are not
re-executed: They are merely reapplied from the audit trail to the actual data table. In
other words, the images of the updated rows (the effects of the transactions) are
replaced in the data table from the audit trail, but the original transactions are not re-executed as in the previous case.
Some organizations use so-called "hot standby" techniques to increase the
availability of their databases. In a typical hot standby scenario, the operations
performed on the operational database are replicated to a standby database. If any
problems are encountered on the operational database, the users are switched over and
continue working on the standby database until the operational database is restored.
However, database replication is an involved and extensive topic.
In the world of mainframes, backup and recovery are very well known and there are some well-established tools. In contrast to the mainframe world, fewer professional VLDB tools are available for Unix- and NT-based systems.
This chapter presents concepts and requirements of backup strategies and gives an overview of available commercial tools.
3.2 Strategies
3.2.1 Requirements
The selection of backup and recovery strategies is driven by quality and business guidelines.
The quality criteria for backup and recovery strategies are [3]:
- Consistency: The scripts or programs used to conduct all kinds of physical backups and full exports must be identical for all databases and servers, to keep the learning curve for new administrators low.
- Reliability: Backups from which one cannot recover are useless. Therefore backups should be automated and monitored to keep the possibility of errors low and to bring errors to attention immediately.
- Extensibility/scalability: Extensibility describes the ability to include new servers and/or databases into the backup schedule easily.
- Support of heterogeneous environments: Backup should be able to handle a variety of platforms, operating systems and tape libraries.
- Usability: To keep the resource cost of backup strategies low, they have to be easy to use and to learn.
There are also some important guidelines:
- Speed: The speed of a backup depends on the speed of the tape, network, and/or disk on which the backup relies, and on the software performing the backup.
- Application load: Backups of large databases can place a noticeable load on the server where the database resides; therefore peak transaction periods should be avoided.
- Resources: personnel and hardware resources for periodically testing backups and restores have to be included in the backup plans.
- Business requirements: Business requirements determine the availability requirements for an application and database. They also determine database size, which, in conjunction with the hardware/software configuration, determines the restoration time.
- Restoration time: The hardware and software configuration determines the restoration time.
3.2.2 Characteristics
Backup strategies differ along the following characteristics:
- Locality: Backups can be located on the server where the database resides or can be executed on a remote server over a network.
- Storage media.
- Toolset: To automate the backup strategies, system-based tools can be written or commercial tools can be used.
- Database size: Very large databases tend to be problematic in terms of backup and recovery time; with most types of disk or tape subsystems it takes hours to back up and restore them.
- Availability requirements: Some applications/systems have to be available 24 hours a day, 7 days a week, others less. Applications with very high availability requirements have no window for cold physical backups, and online/hot backups are more complicated.
- Life cycle: Depending on the importance of the data, backup and recovery strategies can be differentiated by production, development or test environment.
4 Overview of commercial products
As a result, the DBA has an extensive set of requirements for the tools and facilities
offered by the DBMS. These include facilities to back up an entire database offline,
facilities to back up parts of the database selectively, features to take a snapshot of the
database at a particular moment, and obviously journaling facilities to roll back or roll
forward the transactions applied to the database to a particular identified time. Some
of these facilities must be used online -- that is, while the users are busy accessing the
database. For each backup mechanism, there must be a corresponding restore
mechanism -- these mechanisms should be efficient. The backup and restore facilities
should be configurable -- e.g. to stream the backup data to and from multiple devices
in parallel, to add compression and decompression (including third-party compression tools), to delete old backups automatically from disk, or to label the tapes according to one's own standards. One should also be able to take the backup of
a database from one platform and restore it on another -- this step is necessary to cater
for non-database-related problems, such as machine and operating system failures.
For each facility, one should be able to monitor its progress and receive an
acknowledgement that each task has been completed successfully.
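As a small illustration of one of the configurable facilities mentioned above (adding compression to a backup stream), the sketch below copies a datafile through a GZIP stream. The paths are placeholders, and real backup tools additionally handle devices, catalogues and parallel streams.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedBackupSketch {
    public static void main(String[] args) throws Exception {
        // Paths are placeholders; a real backup would typically go to a tape
        // device or a backup server rather than a local file.
        try (InputStream in = new FileInputStream("/data/db/datafile.dbf");
             OutputStream out = new GZIPOutputStream(
                     new FileOutputStream("/backup/datafile.dbf.gz"))) {
            byte[] buffer = new byte[64 * 1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);          // compress while streaming
            }
        }
    }
}
```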
There exist two kinds of tools used to perform backups: the facilities offered by each
DBMS and the generally applicable tools which can be used with more than one
(ideally all) commercial DBMSs.
4.1 Tools
4.1.1 PC-oriented backup packages
None of these tools come with tape support built in, so it is necessary to have some
third party software to work with tapes. Here is a set of the most commonly used tools
to perform backups oriented towards PC servers:
- Arcada Software - Storage Exec.
- Avail
- Cheyenne Software - ArcServe
- Conner Storage Systems - Backup Exec
- Emerald Systems - Xpress Librarian
- Fortunet - NSure NLM/AllNet
- Hewlett Packard - Omniback II
- IBM - ADSM (Adstar Distributed Storage Manager)
- Legato - Networker
- Mountain Network Solutions - FileSafe
- NovaStor
- Palindrome - Network Archivist
- Palindrome - Backup Director
- Performance Technology - PowerSave
- Systems Enhancement - Total Network Recall
4.1.2 UNIX packages
Among the tools available in the Unix world, the following can be found:
- APUnix - FarTool
- Cheyenne - ArcServe
- Dallastone - D-Tools
- Delta MycroSystems (PDC) - BudTool
- Epoch Systems - Enterprise Backup
- IBM - ADSM (ADSTAR Distributed Storage Manager)
- Hewlett Packard - Omniback II
- Legato - Networker
- Network Imaging Systems
- Open Vision - AXXion Netbackup
- Software Moguls - SM-arch
- Spectra Logic - Alexandria
- Workstation Solutions
4.1.2.1 Example: IBM's ADSM
In the following, the main features of IBM's ADSM are presented to illustrate the functionality of these tools with an example:
- Provides unattended backups, long-term data archives and Hierarchical Storage Management (HSM) operations.
- Supports a wide range of hardware platforms.
- Administrator capabilities to manage the ADSM server from any ADSM client platform.
- Easy-to-use Web-browser and Graphical User Interfaces (GUIs) for daily administrative and user tasks.
- Extensive storage device support.
- Disaster recovery features allowing multiple file copies onsite or offsite.
- Optional compression to reduce network traffic, transmission time and server storage requirements.
- HSM capability to automatically move infrequently used data from workstations and file servers onto an ADSM storage management server, reducing expensive workstation and file server storage upgrades and providing fast access to data.
- Provides a Disaster Recovery Manager (DRM) feature to help plan, prepare and execute a disaster recovery plan.
- Multitasking capability.
- Online and offline database backup and archive support.
- Security capabilities.
4.2 Databases
In this section the tools and facilities offered by IBM, Informix, Microsoft, Oracle,
and Sybase for backup and recovery will be presented.
4.2.1 IBM DB2
IBM's DB2 release 2.1.1 provides two facilities to back up your databases, namely the
BACKUP command and the Database Director. It provides three methods to recover
your database: crash recovery, restore, and roll-forward.
Backups can be performed either online or offline. Online backups are only supported
if roll-forward recovery is enabled for the specific database. To execute the BACKUP
command, you need SYSADM, SYSCTRL, or SYSMAINT authority. A database or a
tablespace can be backed up to a fixed disk or tape. A tablespace backup and a
tablespace restore cannot be run at the same time, even if they are working on
different tablespaces. The backup command provides concurrency control for multiple
processes making backup copies of different databases at the same time.
The restore and roll-forward methods provide different types of recovery. The restore-only recovery method makes use of an offline, full backup copy of the database;
therefore, the restored database is only as current as the last backup. The roll-forward
recovery method makes use of database changes retained in logs -- therefore it entails
performing a restore database (or tablespaces) using the BACKUP command, then
applying the changes in the logs since the last backup. You can only do this when roll-forward recovery is enabled. With full database roll-forward recovery, you can
specify a date and time in the processing history to which to recover.
Crash recovery protects the database from being left in an inconsistent state. When
transactions against the database are unexpectedly interrupted, you must perform a
rollback of the incomplete and in-doubt transactions, as well as the completed
transactions that are still in memory. To do this, you use the RESTART DATABASE
command. If you have specified the AUTORESTART parameter, a RESTART
DATABASE is performed automatically after each failure. If a media error occurs
during recovery, the recovery will continue, and the erroneous tablespace is taken
offline and placed in a roll-forward pending state. The offline tablespace will need
additional fixing up -- restore and/or roll-forward recovery, depending on the mode of
the database (whether it is recoverable or non-recoverable).
Restore recovery, also known as version control, lets you restore a previous version of
a database made using the BACKUP command. Consider the following two scenarios:

A database restore will rebuild the entire database using a backup made earlier,
thus restoring the database to the identical state when the backup was made.

A tablespace restore is made from a backup image, which was created using the
BACKUP command where only one or more tablespaces were specified to be
backed up. Therefore this process only restores the selected tablespaces to the
state they were in when the backup was taken, while leaving the unselected
tablespaces in their current state. A tablespace restore can be done online (shared
mode) or offline (exclusive mode).
Roll-forward recovery may be the next task after a restore, depending on your
database's state. There are two scenarios to consider:

Database roll-forward recovery is performed to restore the database by applying
the database logs. The database logs record all of the changes made to the
database. On completion of this recovery method, the database will return to its
prefailure state. A backup image of the database and archives of the logs are
needed to use this method.

Tablespace roll-forward can be done in two ways: either by using the
ROLLFORWARD command to apply the logs against the tablespaces in a
roll-forward pending state, or by performing a tablespace restore and roll-forward
recovery, followed by a ROLLFORWARD operation to apply the logs.
4.2.2 Informix
Informix for Windows NT release 7.12 has a Storage Manager Setup tool and a
Backup and Restore tool. These tools let you perform complete or incremental
backups of your data, back up logical log files (continuous and manual), restore data
from a backup device, and specify the backup device.
Informix has a Backup and Restore wizard to help you with your backup and restore
operations. This wizard is only available on the server machine. The Backup and
Restore wizard provides three options: Backup, Logical Log Backup, and Restore.
The Backup and Restore tool provides two types of backups: complete and
incremental. A complete backup backs up all of the data for the selected database
server. A complete backup -- also known as a level-0 backup -- is required before you
can do an incremental backup. An incremental backup -- also known as a level-1
backup -- backs up all changes that have occurred since the last complete backup,
thereby requiring less time because only part of the data from the selected database
server is backed up. You also get a level-2 backup, performed using the command-line
utilities, that is used to back up all of the changes that have occurred since the last
incremental backup. The Backup and Restore tool provides two types of logical log
backups: continuous backup of the logical logs and manual backup of the logical logs.
A Logical Log Backup backs up all full and used logical log files for a database
server. The logical log files are used to store records of the online activity that occurs
between complete backups.
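Although the text above describes the graphical Backup and Restore tool, the same
backup levels can also be illustrated with Informix's ontape command-line utility;
the sketch below assumes the backup devices have already been configured and is not
specific to release 7.12:

   ontape -s -L 0     # complete (level-0) backup
   ontape -s -L 1     # incremental (level-1) backup: changes since the level-0
   ontape -s -L 2     # level-2 backup: changes since the last level-1
   ontape -c          # continuous backup of the logical logs
   ontape -r          # full restore from the level-0/1/2 backups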
The Informix Storage Manager (ISM) Setup tool lets you specify the storage device
for storing the data used for complete, incremental, and logical log backups. The
storage device can be a tape drive, a fixed hard drive, a removable hard drive, or none
(for example, the null device). It is only available on the server machine. You can
select one backup device for your general backups (complete or incremental) and a
separate device for your logical log backups. You always have to move the backup
file to another location or rename the file before starting your next backup. Before
restoring your data, you must move the backup file to the directory specified in the
ISM Setup and rename the backup file to the filename specified in ISM Setup.
If you specify None as your logical log storage device, the application marks the
logical log files as backed up as soon as they become full, effectively discarding
logical log information. Specify None only if you do not need to recover transactions
from the logical log. When doing a backup, the server must be online or in
administration mode. Once the backup has started, changing the mode will terminate
the backup process. When backing up to your hard drive, the backup file will be
created automatically.
 1999 EURESCOM Participants in Project P817-PF
page 101 (120)
Volume 3: Annex 2 - Data manipulation and management issues
Deliverable 1
The Restore option of the Backup and Restore wizard restores the data and logical log
files from a backup source. You cannot restore the data if you have not made a
complete backup. The server must be in offline mode during the restore operation.
You can back up your active logical log files before doing the restore, and you can
also specify which log files must be used. A level-1 (incremental) backup can be
restored, but you will be prompted to proceed with a level-2 backup at the completion
of the level-1 restore. Once the restore is completed, the database server can be
brought back online, and processing can continue as usual. If you click on Cancel
during a restore procedure, the resulting data may be corrupted.
4.2.3 Microsoft SQL Server
Microsoft SQL Server 6.5 provides more than one backup and recovery mechanism.
For backups of the database, the user can either use the Bulk Copy Program (BCP)
from the command line to create flat-file backups of individual tables or the built-in
Transact-SQL DUMP and LOAD statements to back up or restore the entire database
or specific tables within the database.
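A flat-file table backup with BCP is a one-line operation per table; the server name,
login and file path below are placeholders used only for the example:

   REM export the authors table of the pubs database in character format
   bcp pubs..authors out C:\BACKUP\authors.txt -c -SMYSERVER -Usa -Psecret
   REM the same utility reloads the file with "in" instead of "out"
   bcp pubs..authors in C:\BACKUP\authors.txt -c -SMYSERVER -Usa -Psecret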
Although the necessary Transact-SQL statements are available from within the SQL
environment, the Microsoft SQL Enterprise Manager provides a much more
user-friendly interface for making backups and recovering them later on. The Enterprise
Manager will prompt the DBA for information such as database name, backup device
to use, whether to initialize the device, and whether the backup must be scheduled for
later or done immediately. Alternatively, you can use the Database Maintenance
wizard to automate the whole maintenance process, including the backup procedures.
These tasks are automatically scheduled by the wizard on a daily or weekly basis.
Both the BCP utility and the dump statement can be run online, which means that
users do not have to be interrupted while backups are being made. This facility is
particularly valuable in 24 X 7 operations.
A database can be restored up to the last committed transaction by also LOADing the
transaction logs that were dumped since the previous database DUMP. Some of the
LOAD options involve more management. For example, the database dump file and
all subsequent transaction-log dump files must be kept until the last minute in case
recovery is required. It is up to the particular site to determine a suitable backup and
recovery policy, given the available options.
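A minimal Transact-SQL sequence illustrating such a policy; the pubs database, the
file names and the WITH INIT option are assumptions made for the example:

   /* full database backup, overwriting the previous contents of the device */
   DUMP DATABASE pubs TO DISK = 'C:\MSSQL\BACKUP\pubs_full.dat' WITH INIT
   /* periodic transaction log backups between full dumps */
   DUMP TRANSACTION pubs TO DISK = 'C:\MSSQL\BACKUP\pubs_log1.dat'
   /* recovery: reload the full dump, then the log dumps in order */
   LOAD DATABASE pubs FROM DISK = 'C:\MSSQL\BACKUP\pubs_full.dat'
   LOAD TRANSACTION pubs FROM DISK = 'C:\MSSQL\BACKUP\pubs_log1.dat'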
To protect against hardware failures, Microsoft SQL Server 6.5 has the built-in
capability to define a standby server for automatic failover. This option requires
sophisticated hardware but is good to consider for 24 X 7 operations. Once
configured, it does not require any additional tasks on an ongoing basis. However,
separate backups of the database are still required in case of data loss or multiple
media failures.
4.2.4 Oracle 7
Oracle 7 Release 7.3 uses full and partial database backups and a redo log for its
database backup and recovery operations. The database backup is an operating system
backup of the physical files that constitute the Oracle database. The redo log consists
of two or more preallocated files, which are used to record all changes made to the
database. You can also use the export and import utilities to create a backup of a
database. Oracle offers a standby database scheme, with which it maintains a copy of
a primary database on duplicate hardware, in a constant recoverable state, by applying
the redo logs archived off the primary database.
A full backup is an operating system backup of all of the data files, parameter files,
and the control file that constitute the database. A full database backup can be taken
by using the operating system's commands or by using the host command of the
Server Manager. A full database backup can be taken online when the database is
open, but only an offline database backup (taken when the database server is shut
down) will necessarily be consistent. An inconsistent database backup must be
recovered with the online and archived redo log files before the database will become
available. The best approach is to take a full database backup after the database has
been shut down with normal or immediate priority.
A partial backup is any operating system backup of a part of the full backup, such as
selected data files, the control file only, or the data files in a specified tablespace only.
A partial backup is useful if the database is operated in ARCHIVELOG mode. A
database operating in NOARCHIVELOG mode rarely has sufficient information to use a
partial backup to restore the database to a consistent state. The archiving mode is
usually set during database creation, but it can be reset at a later stage.
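A sketch of the corresponding SQL, assuming a tablespace called USERS; the
operating system copy step in the middle is indicated only as a comment:

   -- switch to ARCHIVELOG mode (database mounted but not open)
   ALTER DATABASE ARCHIVELOG;
   -- online ("hot") backup of a single tablespace
   ALTER TABLESPACE users BEGIN BACKUP;
   --   ... copy the tablespace's data files with an operating system utility ...
   ALTER TABLESPACE users END BACKUP;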
You can recover a database damaged by a media failure in one of three ways after you
have restored backups of the damaged data files. These steps can be performed using
the Server Manager's Apply Recovery Archives dialog box, using the Server
Manager's RECOVER command, or using the SQL ALTER DATABASE command:

You can recover an entire database using the RECOVER DATABASE
command. This command performs media recovery on all of the data files that
require redo processing.

You can recover specified tablespaces using the RECOVER TABLESPACE
command. This command performs media recovery on all of the data files in the
listed tablespaces. Oracle requires the database to be mounted and open in order
to determine the names of the data files contained in the tablespaces.

You can list the individual files to be recovered using the RECOVER
DATAFILE command. The database can be open or closed, provided that Oracle
can take the required media recovery locks.
In certain situations, you can also recover a specific damaged data file, even if a
backup file isn't available. This can only be done if all of the required log files are
available and the control file contains the name of the damaged file. In addition,
Oracle provides a variety of recovery options for different crash scenarios, including
incomplete recovery, change-based, cancel-based, and time-based recovery, and
recovery from user errors.
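The three recovery commands, again as a sketch with placeholder file names; the
final line shows the time-based incomplete recovery mentioned above:

   RECOVER DATABASE;
   RECOVER TABLESPACE users;
   RECOVER DATAFILE '/disk1/oradata/prod/users01.dbf';
   -- time-based incomplete recovery to a known good point in time
   RECOVER DATABASE UNTIL TIME '1999-02-28:18:00:00';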
4.2.5 Oracle 8
Oracle 8 offers the DBA three possibilities for performing backups:

Recovery Manager.

Operating System.

Export.
Backup Method      Version Available        Requirements
Recovery Manager   Oracle8                  Media Manager (if backing up to tape)
Operating System   All versions of Oracle   O/S backup utility (for example UNIX dd)
Export             All versions of Oracle   N/A
Table 1. Requirements for different backup methods
The following table summarizes and compares the features of the backup methods
described above.
Feature: closed database backups
   Recovery Manager:  Supported. Requires the instance to be mounted.
   Operating System:  Supported.
   Export:            Not supported.
Feature: open database backups
   Recovery Manager:  Supported. Does not use BEGIN/END BACKUP commands.
   Operating System:  Supported. Generates more redo when using the BEGIN/END
                      BACKUP commands.
   Export:            Requires RBS to generate consistent backups.
Feature: incremental backups
   Recovery Manager:  Supported. Backs up all modified blocks.
   Operating System:  Not supported.
   Export:            Supported, but not a true incremental, as it backs up a
                      whole table even if only one block is modified.
Feature: corrupt block detection
   Recovery Manager:  Supported. Identifies corrupt blocks and writes them to
                      V$BACKUP_CORRUPTION or V$COPY_CORRUPTION.
   Operating System:  Not supported.
   Export:            Supported. Identifies corrupt blocks in the export log.
Feature: automatically backs up data
   Recovery Manager:  Supported. Establishes the names and locations of all files
                      to be backed up (whole database, tablespace, datafile or
                      control file backup).
   Operating System:  Not supported. Files to be backed up must be specified
                      manually.
   Export:            Supported. Performs either full, user or table backups.
Feature: catalogs backup performed
   Recovery Manager:  Supported. Backups are cataloged to the recovery catalog and
                      to the control file, or just to the control file.
   Operating System:  Not supported.
   Export:            Not supported.
Feature: makes backups to tape
   Recovery Manager:  Supported. Interfaces with a Media Manager.
   Operating System:  Supported. Backup to tape is manual or managed by a Media
                      Manager.
   Export:            Not supported.
Feature: backs up init.ora and password files
   Recovery Manager:  Not supported.
   Operating System:  Supported.
   Export:            Not supported.
Feature: operating system independent language
   Recovery Manager:  An O/S independent scripting language.
   Operating System:  O/S dependent.
   Export:            O/S independent scripting language.
Table 2. Feature comparison of backup methods
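For illustration, a simple Recovery Manager job written in this O/S independent
scripting language; the channel type and format string are placeholders, and the
exact syntax depends on the Oracle8 release:

   run {
     # allocate an I/O channel (type 'SBT_TAPE' would go through a Media Manager)
     allocate channel c1 type disk;
     # whole-database backup with a format mask for the backup piece names
     backup format '/backup/df_%s_%p' (database);
     release channel c1;
   }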
4.2.6 Sybase SQL Server
Sybase SQL Server 11 uses database dumps, transaction dumps, checkpoints, and a
transaction log per database for database recovery. All backup and restore operations
are performed by an Open Server program called Backup Server, which runs on the
same physical machine as the Sybase SQL Server 11 process.
A database dump is a complete copy of the database, including the data files and the
transaction log. This function is performed using the DUMP DATABASE operation,
which can place the backup on tape or on disk. You can make dynamic dumps, which
let the users continue using the database while the dump is being made. A transaction
dump is a routine backup of the transaction log. The DUMP TRANSACTION
operation also truncates the inactive portion of the transaction log file. You can use
multiple devices in the DUMP DATABASE and DUMP TRANSACTION operations
to stripe the dumps across multiple devices.
The transaction log is a write-ahead log, maintained in the system table called syslogs.
You can use the DUMP TRANSACTION command to copy the information from the
transaction log to a tape or disk. You can use the automatic checkpointing task or the
CHECKPOINT command (issued manually) to synchronize a database with its
transaction log. Doing so causes the database pages that are modified in memory to be
flushed to the disk. Regular checkpoints can shorten the recovery time after a system
crash.
Each time Sybase SQL Server restarts, it automatically checks each database for
transactions requiring recovery by comparing the transaction log with the actual data
pages on the disk. If the log records are more recent than the data page, it reapplies
the changes from the transaction log.
An entire database can be restored from a database dump using the LOAD
DATABASE command. Once you have restored the database to a usable state, you
can use the LOAD TRANSACTION command to load all transaction log dumps, in
the order in which they were created. This process reconstructs the database by
re-executing the transactions recorded in the transaction log.
You can use the DUMP DATABASE and LOAD DATABASE operations to port a
database from one Sybase installation to another, as long as they run on similar
hardware and software platforms.
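As a sketch with placeholder database and device names, the corresponding
Transact-SQL commands are:

   /* complete, dynamic database dump striped across two devices */
   dump database pubs2 to "/backups/pubs2_a.dmp"
       stripe on "/backups/pubs2_b.dmp"
   /* routine transaction log dump; also truncates the inactive log portion */
   dump transaction pubs2 to "/backups/pubs2_log1.dmp"
   /* recovery: reload the database dump, then the log dumps in order */
   load database pubs2 from "/backups/pubs2_a.dmp"
       stripe on "/backups/pubs2_b.dmp"
   load transaction pubs2 from "/backups/pubs2_log1.dmp"
   online database pubs2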
5 Analysis and recommendations
Although each of the DBMSs treated here has a range of backup and recovery
facilities, it is always important to ensure that those facilities are used properly and
adequately. "Adequately" means that backups must be taken regularly. All of the
treated DBMSs provide facilities to repost or re-execute completed transactions
from a log or journal file. However, reposting or re-executing a few weeks' worth of
transactions may take an unbearably long time. In many situations, users require quick
access to their databases, even in the presence of media failures. Remember that the
end users are not concerned with physical technicalities, such as restoring a database
after a system crash.
Even better than quick recovery is no recovery, which can be achieved in two ways.
First, by performing adequate system monitoring and using proper procedures and
good equipment, most system crashes can be avoided. It is better to provide users with
a system that is up and available 90 percent of the time than to have to do sporadic
fixes when problems occur. Second, by using redundant databases such as hot standby
or replicated databases, users can be relieved of the recovery delays: Users can be
switched to the hot backup database while the master database is being recovered.
A last but extremely important aspect of backup and recovery is testing. Test your
backup and recovery procedures in a test environment before deploying them in the
production environment. In addition, the backup and recovery procedures and
facilities used in the production environment must also be tested regularly. A recovery
scheme that worked perfectly well in a test environment is useless if it cannot be
repeated in the production environment -- particularly in that crucial moment when
the root disk fails during the month-end run!
Appendix A: Backup and Restore Investigation of Terabyte-scale Databases
A Proof of Concept featuring Digital AlphaServers, Storage Tek Redwood Tape
Drives, Oracle's Enterprise Backup Utility (EBU) and Spectra Logic's Alexandria
Backup Librarian.
October 30, 1996
A.1 Introduction
Databases and database applications continue to grow exponentially. Document
imaging, data warehousing and data mining, and massive On-Line Transaction
Processing (OLTP) systems are constantly adding to the demand for increased
database performance and size. Advances in server and I/O technology also contribute
to the viability of Very Large Databases (VLDBs).
In the past, however, database administrators (DBAs) have not had the methods, tools
and available hours to accomplish backups. They cannot bring down a database long
enough to back it up. With earlier backup tools, some sites were unable to accomplish
a "hot" (on-line) backup in 24 hours, during which users were forced to accept
substantial performance degradation.
A.2 Requirements
For Terabyte-scale VLDBs to become truly viable, there must be tools and methods to
back them up. While certain features and functionality will be advantageous to certain
sites, a sine qua non list of DBA requirements would certainly include:

Hot backup capability: 24 x 7 access to applications and world-wide access to
OLTP preclude many sites from bringing the database off-line at all for routine
maintenance or backup.

Performance and the ability to scale: sites must be able to accomplish backups in
short time intervals with the addition of more and faster tape drives -- the backup
software or utilities must not be a bottleneck.

Low CPU utilisation: during the backup, the system cannot devote a large portion
of system resources to backup; CPU bandwidth must be available to the RDBMS
applications and any other tasks (reports, etc.) that must be accomplished in the
background.

Support for a wide range of hardware platforms, operating systems and backup
devices: applications are running on a variety of platforms in different
environments, and the software should not limit those choices.
A.3 Accurate benchmarking
With more mature tools on the market for VLDB backup, it has become difficult to
prove which products can meet the requirements outlined above. There are many
bottlenecks to consider and it is dangerous to extrapolate. Backup to a single tape
drive at 1 MB/sec does not guarantee that you can back up to a thousand tape drives at
1000 MB/sec.
The only way to truly find the limits of backup performance is to do it empirically:
hook up all the hardware, load the software and run real-life tests. While this
approach is accurate, it is time-consuming and expensive to achieve the numbers
described in this paper.
Spectra Logic was able to partner with Digital Equipment Corporation to bring this
demonstration to fruition. Digital, with partners Storage Technology Corporation and
Oracle, provided millions of dollars worth of hardware to test Oracle's EBU backup
utility with Spectra Logic's Alexandria Backup Librarian.
A.4 The benchmark environment
At Digital's Palo Alto Database Technology Center, the following was available for
the test:


Digital AlphaServer 8400 (Turbo Laser) Server:

eight 300MHz Alpha processors,

8 GB system memory,

one TLIOP I/O channel,

30 KZPSA F/W SCSI Controllers,

Digital UNIX 64-bit OS (v4.0a).
Storage Technology Corporation (STK):

16 Redwood SD-3 Drives.
NOTE: Although 16 tape drives were available for the testing, only 15 were used for
the hot backups. This was considered a better match for the database structure and the
number of disk drives available for source data. This allowed 14 drives to interleave
data from four disks each and one to interleave from three. If the database were spread
across 64 disks (4 per tape drive), the benchmarks could have made use of all 16 tape
drives.

Oracle 7.2.2 database and Oracle EBU v2.0

184 GB of data on 59 SCSI disk drives

15 tablespaces:

14 tablespaces of 4 disks each

1 tablespace of 3 disks

59 Tables, one (3.1 GB) per disk

59 Indexes, one (0.7 GB) per disk

26,000,000 rows per table; 1,534,000,000 rows total.
Table Data Structure:
   L_ORDERKEY        NUMBER(10)
   L_PARTKEY         NUMBER(10)
   L_SUPPKEY         NUMBER(10)
   L_LINENUMBER      NUMBER(10)
   L_QUANTITY        NUMBER(10)
   L_EXTENDEDPRICE   NUMBER
   L_DISCOUNT        NUMBER(10)
   L_TAX             NUMBER(10)
   L_RETURNFLAG      CHAR(1)
   L_LINESTATUS      CHAR(1)
   L_SHIPDATE        DATE
   L_COMMITDATE      DATE
   L_RECEIPTDATE     DATE
   L_SHIPINSTRUCT    VARCHAR2(25)
   L_SHIPMODE        VARCHAR2(10)
   L_COMMENT         VARCHAR2(27)
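Expressed as Oracle DDL, this structure corresponds to a statement of the following
form; the table name lineitem is an assumption, as the whitepaper does not name the
tables:

   CREATE TABLE lineitem (
     l_orderkey       NUMBER(10),
     l_partkey        NUMBER(10),
     l_suppkey        NUMBER(10),
     l_linenumber     NUMBER(10),
     l_quantity       NUMBER(10),
     l_extendedprice  NUMBER,
     l_discount       NUMBER(10),
     l_tax            NUMBER(10),
     l_returnflag     CHAR(1),
     l_linestatus     CHAR(1),
     l_shipdate       DATE,
     l_commitdate     DATE,
     l_receiptdate    DATE,
     l_shipinstruct   VARCHAR2(25),
     l_shipmode       VARCHAR2(10),
     l_comment        VARCHAR2(27)
   );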
The 8400 Server can accept up to three TLIOPs (Turbo Laser Input Output
Processors), each with four PCI busses allowing the system to scale beyond the
numbers achieved. Enough hardware was available, however, to saturate the single
channel and demonstrate impressive transfer capabilities.
Three backup methods were chosen for comparison: a hot backup using Oracle's
EBU, a cold backup of raw disk using Alexandria's Raw Partition Formatter (RPF),
and a hot backup using Spectra Logic's Comprehensive Oracle Backup and Recovery
Agent (COBRA).
No compression was used in the benchmarks. The compressibility of data varies with
the data itself, making it difficult to reproduce or compare different benchmarks. All
of the numbers in this document were achieved with native transfer.
A.5 Results
A.5.1 Executive summary
Spectra Logic's Alexandria achieved impressive throughput and CPU figures in all
three tests. Sustained transfer rates were similar in each of the tests, implying that I/O
limitations of the hardware were being approached. During the testing, close to
80-90% of theoretical maximum was achieved; allowing for system overhead, arbitration
and I/O waits, this is close to a realistic maximum.
Both wall clock rates and sustained transfer were measured. The wall clock rate is
computed from the elapsed time to complete the backup from start to finish. The
sustained transfer is the throughput when the system is running - the actual processor
time not accounting for media selection, mounting or any backup overhead.
The sustained transfer rate is, however, a valid predictor of how much additional time
would be added or subtracted from the backup window if more or less data were
backed up.

Cold Backup: 542 GB/hour at 3% CPU Utilisation
For the cold backup testing, the disks containing the Oracle data were written to
tape using Alexandria's high performance RPF format. A total of 236 GB was
written to 15 drives in 29 minutes:
TOTALS:
   Max Xfer          542 GB/hour
   Wall clock        236 GB / 29 minutes (0.483 hours) = 488 GB/hour
   CPU utilisation   < 3%

Hot Backup using Spectra Logic's COBRA: 525 GB/hour at 4% CPU
Utilization
Spectra Logic has been shipping a hot backup product for Oracle for over a year.
The COBRA agent uses SQL commands to place individual tablespaces in
backup mode and coordinates with Alexandria to back up their datafiles.
This method also uses Alexandria's high performance RPF format. A total of 236
GB was written to 16 drives in 30 minutes:
TOTALS:
   Max Xfer          525 GB/hour
   Wall clock        236 GB / 30 minutes (0.5 hours) = 472 GB/hour
   CPU utilisation   < 4%

Hot Backup Using Oracle's EBU: 505 GB/hour at 9.5% CPU Utilisation
The Enterprise Backup Utility is provided by and supported by Oracle for on-line
backup of Oracle databases. EBU is not a stand-alone product but rather an API
which gives third-party developers a standard access method for database backup
and retrieval. Coupled with Alexandria's media management and scheduling
capabilities it provides a robust, high-performance backup method.
For the EBU backup testing, a total of 184 GB was written to 15 drives in 29:17.
TOTALS:
   Max Xfer          505.5 GB/hour
   Wall clock        184 GB / 29:17 (0.49 hours) = 375 GB/hour
   CPU utilisation   9.5%

Hot Backup With Transaction Load: 477 GB/hour at 16% CPU Utilisation
As the purpose of hot backup is to allow for database access during backup, one
of the tests was to perform a full backup with a light transaction load (118
updates per second) on the system.
Performing a hot, EBU backup with this transaction load, a total of 184 GB was
written to 15 drives in 38:03.
TOTALS:
   Max Xfer          477.4 GB/hour
   Wall clock        184 GB / 38:03 (0.63 hours) = 291 GB/hour
   CPU utilisation   16.3%

Hot Restore Using Oracle's EBU
To preserve the integrity of the test system and to provide a means of verifying
restores, it was decided not to write over the original datafiles. Rather, restores
were directed toward other disks on the system. The maximum number of concurrent
restores that was practical in this environment was 12 tapes at a time.
To check the ability to scale, restore tests were run with one, two, four and 12
concurrent restores.
   TAPE DRIVES   SUSTAINED THROUGHPUT   CPU UTILIZATION
   1             40.28 GB/hour          3%
   2             78.05 GB/hour          4%
   4             140.73 GB/hour         6%
   12            382.75 GB/hour         30%
A.5.2 Detailed results
Spectra Logic's Alexandria achieved impressive throughput and CPU figures in all
three tests. Sustained transfer rates were similar in each of the tests, implying that I/O
limitations of the hardware were being approached. During the testing, close to
80-90% of theoretical maximum was achieved; allowing for system overhead, arbitration
and I/O waits, this is close to a realistic maximum.

Cold Backup: 542 GB/hour
For the cold backup testing, 16 stores were launched by Alexandria, each one to
a single tape drive. Each store backed up multiple physical disks.
A total of 236 GB was written to the 16 drives in 28.4 minutes at a CPU
utilisation of less than 3%. Table 1 shows the statistics for each store operation in
both wall clock rate (from the launch of the entire operation to the completion of
the store) and processor rate (from the launch of the store process to its
completion). The process rate does not account for media selection, software
latency or any system time to start the store operation.
SIZE                          WALL CLOCK   RATE            PROCESS CLOCK   RATE
10,477,371,392 bytes          1412 sec     7246.32 kb/s    1267 sec        8075.62 kb/s
13,969,129,472 bytes          1418 sec     9620.40 kb/s    1264 sec        10792.51 kb/s
13,969,129,472 bytes          1438 sec     9486.60 kb/s    1281 sec        10649.28 kb/s
13,969,129,472 bytes          1420 sec     9606.85 kb/s    1274 sec        10707.79 kb/s
13,969,129,472 bytes          1414 sec     9647.62 kb/s    1265 sec        10783.97 kb/s
13,969,129,472 bytes          1418 sec     9620.40 kb/s    1270 sec        10741.52 kb/s
13,969,129,472 bytes          1433 sec     9519.70 kb/s    1278 sec        10674.28 kb/s
13,969,129,472 bytes          1410 sec     9674.98 kb/s    1305 sec        10453.43 kb/s
13,969,129,472 bytes          1397 sec     9765.02 kb/s    1310 sec        10413.53 kb/s
13,969,129,472 bytes          1413 sec     9654.44 kb/s    1263 sec        10801.05 kb/s
13,969,129,472 bytes          1415 sec     9640.80 kb/s    1268 sec        10758.46 kb/s
13,969,129,472 bytes          1411 sec     9668.13 kb/s    1264 sec        10792.51 kb/s
13,969,129,472 bytes          1419 sec     9613.62 kb/s    1264 sec        10792.51 kb/s
13,969,129,472 bytes          1414 sec     9647.62 kb/s    1266 sec        10775.46 kb/s
13,969,129,472 bytes          1407 sec     9695.61 kb/s    1266 sec        10775.46 kb/s
Total: 206,045,184,000 bytes               142108.10 kb/s                  157987.37 kb/s   CPU 2-3%

Hot Backup Using Oracle's EBU: 505 GB/hour
For the EBU backup testing, a total of 184 GB was written to 15 drives in 29:17
at a CPU utilisation of 9.5%.
The following chart shows a chronological report of transfer rate as different
store operations were launched. Note that the 505 GB/hour maximum transfer rate
was sustained for 18:07 -- more than half the store.
It is irresistible to extrapolate these figures and ask "How much throughput could
be achieved with more hardware?" Any sign of stress in the system would show up as
a loss of linearity in the system's ability to scale.
For this reason, tests were run with one, two, four, eight and 15 tape drives. Up
through 15 drives, numbers were almost completely linear.
The other factor germane to scalability is CPU utilization. During the actual data
transfer, CPU usage was between 9-10%. A short peak to 18% reflects
Alexandria's overhead to select media and update its internal database.

Hot Backup with Transaction Load: 477 GB/hour
Using Oracle's EBU for a hot backup with a light transaction load on the
database (118 transactions per second), a total of 184 GB was written to 15
drives in 38:03 at a CPU utilisation of 16.3%.
Hot Restore Using Oracle's EBU
To preserve the integrity of the test system and to provide a means of verifying
restores, it was decided not to write over the original datafiles. Rather, the
restores were directed toward other disks on the system. The maximum number
of concurrent restores that was practical in this environment was 12 at a time.
Of all the tests, hot restores were the most demanding in CPU and system
resource utilization. It is interesting, therefore, to check the ability to scale when
using one, two, four and 12 drives. Again, the data through 12 drives is almost
completely linear.
   TAPE DRIVES   SUSTAINED THROUGHPUT   CPU UTILIZATION
   1             40.28 GB/hour          3%
   2             78.05 GB/hour          4%
   4             140.73 GB/hour         6%
   12            382.75 GB/hour         30%
A.6 Interpreting the results
The total throughput numbers for all the tests are impressive: a Terabyte database can
be backed up in about two to three hours.
Just as important, however, is the CPU utilisation. The reason for a hot backup is to
allow access to the database during the backup. If a large portion of the CPU is
devoted to backup tasks, the users will see severe performance degradation. By taking
less than 10%, Alexandria/EBU is leaving most of the bandwidth available to other
applications.
Another consideration is the application's ability to scale. If 505 GB/hour is required
today, will the requirement be 1 TB/hour next year? If this benchmark configuration
were a production system, the user could add one or two more TLIOP processors and
additional tape drives and be confident of scaling well beyond 505 GB/hour. Again,
extrapolating is dangerous, but it is obvious that at less than 10% CPU utilisation
there is ample headroom to support more hardware without taking all the system
resources.
The linearity of the scaling throughout the demonstration also suggests that both the
Alexandria application and the EBU utility are capable of even faster benchmarks.
Quite likely the ultimate limitation here was the number of disk drives available for
simultaneous read. To use substantially more than the 59 disks, however, would have
necessitated an additional TLIOP I/O channel and perhaps more tape drives. All in all,
every component in this demonstration proved its ability to scale.
A.7 Summary
This demonstration was important not only to Spectra Logic but also to all the
vendors involved. The Digital 8400 AlphaServers and 64-bit Digital UNIX showed
amazing throughput. The TLIOP provides I/O which is not bound by backplane
limitations and, as the numbers attest, the I/O operates very near its theoretical
maximum with 30 SCSI busses and over a hundred separate devices attached.
Likewise, the StorageTek Redwood tape drives performed magnificently. In weeks of
full-throttle testing by multiple software vendors, the drives proved robust and fast.
Lastly, Oracle's EBU demonstrates the firm's commitment to supporting and
managing VLDBs. EBU scaled well with minimal CPU utilization, proving that these
vendors are ready to manage and support the databases of tomorrow.
RDBMS usage will continue to grow in size and in applications. Developers and
Information Service professionals can now rest assured that huge databases can be
managed and protected.
Appendix B: True Terabyte Database Backup Demonstration
A Proof of Concept for backing up a one-terabyte database in a one-hour timeframe
featuring the Silicon Graphics® Origin2000 S2MP™ Server, IBM Magstar™ 3590
ACF Tape Libraries, Oracle's Enterprise Backup Utility (EBU™) and Spectra Logic
Corporation's Alexandria™ Backup and Archival Librarian.
B.1 Executive Summary
This whitepaper describes the results of a performance demonstration where Silicon
Graphics and Spectra Logic partnered for the enterprise industry's first successful true
Terabyte database backup demonstration. The demonstration was a proof of concept
for backing up a 1-Terabyte Oracle7 database in approximately one hour using a
Silicon Graphics® Origin2000 S2MP™ Server, IBM Magstar™ 3590 with ACF
(Automatic Cartridge Facility), Oracle's Enterprise Backup Utility (EBU™) and
Spectra Logic Corporation's Alexandria™ Backup and Archival Librarian software.
The results of this demonstration are as follows:
TEST                              Sustained Throughput   Wall-Clock Throughput   Total System Overhead
                                  (TB/Hour)              (GB/Hour)
Cold                              1.5                    1,237                   6% overhead (94% of the system still available)
Hot Backup                        1.3                    985                     6% overhead (94% of the system still available)
Hot Backup with load (4500 tpm)   1.1                    901                     21% overhead (79% of the system still available)
Fast-growing multi-gigabyte and multi-terabyte enterprise sites looking for
high-performance solutions should note the following demonstration results:
All the tests were run on an actual 1.0265 Terabyte Oracle7 database. No
extrapolation was used for the wall-clock throughput rates.

The hot backup left 94% of the system still available for user processes. This
removes the need to quiesce databases during the backup.

The hot backup was successfully completed in approximately one wall-clock
hour, including tape exchanges. Shrinking backup windows on 24x7 systems can
now be addressed with a proven, real-time solution.

The demonstration distinctly shows scalability. There was plenty of I/O left for
growth in additional data processing, CPUs, memory, tape drives, and disk
drives.

All of the products used in the demonstration are commercially available today.
For the true terabyte performance demonstration, Silicon Graphics and Spectra Logic
performed three tests: a cold backup, a hot backup and a hot backup with transaction
load.
B.1.1 Definitions

Cold Backup. The database is offline during the backup and not available to
end-users.

Hot Backup. The database remains available to end-users during the backup.

System. The system consists of the server (nodes, backplane, etc.), the disks, the
Oracle database, the operating system, the backup software, the SCSI busses and
the tape drives.

Total System Overhead. Total system overhead includes CPU utilization
generated by the operating system, the relational database management system
(RDBMS) and the backup software, and is an average taken for the duration of
the backup.

Total System Throughput. Total system throughput is the total amount of data
written to the tape drives during the duration of the backup divided by the system
time required to complete the backup. Total system overhead and total system
throughput are inseparable numbers that directly influence system scalability.
B.2 Detailed Results
This section includes descriptions and graphs explaining the performance results in
detail. The database used was the same for all three tests - a 1.0265 Terabyte Database
using Oracle7.
The data used to derive the charts was captured using Silicon Graphics' System
Activity Reporter, or SAR.
B.2.1 Demonstration Environment
The demonstration took place at Silicon Graphics headquarters in Mountain View,
California, using the following hardware and software:
The Server. Silicon Graphics Inc. Origin2000 S2MP Server running IRIX 6.4:

16 MIPS RISC R10000 64-bit CPUs on 8 boards on 2 Origin2000 modules. The
modules were connected via a CrayLink™ Interconnect.

5 Gigabytes memory.

Each node card has up to 700 MB/sec sustained memory bandwidth.

20 XIO slots were available for SCSI. Since each XIO slot can accommodate 4
SCSI channels, this means a total of 80 UltraSCSI channels could be used.
The Database. Oracle 7.3.2.3 database and Oracle Enterprise Backup Utility (EBU)
v2.1:

Database size - 1.0265 TB.

138 disks of 9 Gigabytes each, housed in 3 rackmountable Origin Vaults. The 2 internal
SCSI drive bays had a total of 10 single-ended disks enclosed in them (5 per bay)
but these were not used for the test database. One of them was the system disk.

Each rackmountable disk enclosure had one SCSI channel. 18 enclosures had the
full 6 disk complement. 6 enclosures had 4 disks. 2 enclosures had 3 disks.
page 116 (120)
 1999 EURESCOM Participants in Project P817-PF
Deliverable 1
Volume 3: Annex 2 - Data manipulation and management issues

Database data was striped and consisted of 193 data files in a TPC-C schema
database for transactions.

80% of the database was populated.
The Drives. IBM Magstar 3590 Tape Subsystem with ACF (Automatic Cartridge
Facility):

38 drives were available for the cold backup. 35 drives were available for the hot
backups

Each 3590 used a separate SCSI channel.
The Backup Software. Spectra Logic Corporation's Alexandria v3.7 and
Alexandria's Oracle Backup Agent:

Alexandria exhibits high performance and scalability across multiple platforms
and storage library systems.
Miscellaneous.

The demonstration configuration used a total of 64 SCSI channels including
disks and tape drives. Since each XIO slot can accommodate 4 SCSI channels,
this means that only 16 XIO slots out of 20 were used. This is important for
hardware scalability because there were still 16 SCSI channels available for
additional headroom and expansion, without adding another Origin module.

Two backup methods were used: the hot backups used Oracle's EBU, and the
cold backup was performed on raw disk using Alexandria's Raw Partition
Formatter (RPF).

No software compression was used, however hardware compression was enabled
on the tape drives.

Tape exchanges produced visible troughs on the throughput curve. This is related
to the load/unload times staggering slightly for various reasons, such as the
amount of data that was streamed to that particular drive.

Certain CPU spikes may be related to file finds, UNIX kernel and EBU
processes.
B.2.2 Results

Cold Backup at 1,237 GB/hour with 38 drives
The cold backup was performed with 38 IBM Magstar 3590 drives and wall-clock
measured at 1,237 GB/hour and 6% total system overhead.
TOTALS                   Cold Backup
   Database                 1.0265 TB Oracle7 database
   Elapsed Time             51 minutes
   Sustained Xfer           1.5 TB/hour
   Wall clock Xfer          1,237 GB/hour
   Total System Overhead    6% (94% of the system was still available for other processes)

Hot Backup without Transaction Load at 985 GB/hour with 35 drives
The hot backup without transaction load was performed with 35 IBM Magstar 3590
drives and was wall-clock measured at 985 GB/hour and 6% total system overhead.
TOTALS                   Hot Backup
   Database                 1.0265 TB Oracle7 database
   Elapsed Time             64 minutes
   Sustained Xfer           1.3 TB/hour
   Wall clock Xfer          985 GB/hour
   Total System Overhead    6% (94% of the system was still available for other processes)
 Hot Backup with Transaction Load at 901 GB/hour with 35 drives
The hot backup with transaction load was performed with 35 IBM Magstar 3590
drives and was wall-clock measured at 901 GB/hour and 21% total system overhead.
Scripts were used to generate a load of 75 transactions per second (4500 tpm).
TOTALS                   Hot Backup with Transaction Load
   Database                 1.0265 TB Oracle7 database
   Elapsed Time             70 minutes
   Sustained Xfer           1.1 TB/hour
   Wall clock Xfer          901 GB/hour
   Total System Overhead    21% @ 4500 updates per minute load (79% of the system
                            was still available for other processes)
B.3 Interpreting the Results
Demonstrating the ability to back up a true terabyte database in approximately one
hour is an important landmark for companies with VLDBs looking for a capable
backup solution.
Why is one hour so important? Taking a VLDB off-line directly cuts into a 24x7
company's bottom line. In a recent Information Week article, one of the largest banks
in the country estimated it would lose close to $50 million for every 24 hours, or
$2.08 million per hour, that its system is down.1
Backup of terabyte databases requires certain features and functionality to take
advantage of their sheer size:

Hot (Online) backup capability: As applications increasingly demand 24x7
uptime, "hot" online database backup will take precedence over "cold" offline
database backup.

Availability: Database availability is key. The backup system should have
minimal impact on availability, so that as the database grows the backup system
will not interfere with its availability to the end-users.

Ability to Scale: the system must be designed so that it will not collapse under
the weight of its own growth in 18 to 24 months. 10% growth of a terabyte
database will use considerably more resources than 10% growth of a 10-gigabyte
database.

Support for Heterogeneous Environments: today's terabyte UNIX sites consist of
a wide variety of platforms, operating systems and tape libraries. The backup
software should not limit your choices.
For this performance demonstration, a true terabyte database was used for all of the
tests conducted. The only way to truly exercise the boundaries of backup performance
is to do it empirically: hook up all the hardware, load the software and run real-life
tests. While this approach is accurate, it is time-consuming and expensive to achieve
the numbers described in this paper. But the significance of these numbers needs to be
understood given the criteria listed above.
One aspect of this demonstration that warrants looking at closely is scalability.2 A
scaleable system is one which allows for an increase in the amount of data managed
and the amount of user workload supported without losing any functionality. Figure 4
illustrates a growing database. It attempts to put in perspective the considerable
amount of resources that a terabyte of data can consume.
Figure 4: Growing towards a Terabyte Database
The backups at Silicon Graphics demonstrated a level of throughput and overhead that
had minimal impact on ongoing end-user operations (see Figure 5). Because of the
minimal overhead generated by Alexandria and EBU, there was no need to increase
system availability by adding additional CPU nodes and memory. During the hot
backup demonstration, 79% of the total system throughput was still available when a
transaction load of 4500 tpm was applied during the backup.
Figure 5: Hot Database Backup with Transaction Load. Minimal Backup
Software Overhead Allows for Scalability
B.4 Summary
This demonstration has proven the viability of true terabyte backup solutions. The
Origin2000 running Alexandria Backup and Archive Librarian is now confirmed in
its ability to handle tens of gigabytes to terabytes of Oracle database data. This also
demonstrated a new backup paradigm, as shown in Figure 6. It is no longer necessary
to divide terabyte sites into several multi-gigabyte groups. Now, terabyte sites can be
backed up in several terabyte groups.
Figure 6: The New Backup Paradigm
The Origin2000 system can scale considerably beyond the configuration used in
the performance demonstration. The maximum configuration consists of 128 CPUs,
192 XIO boards and an I/O bandwidth of up to 82 GB/second. On the other hand,
given the system's tremendous I/O capabilities, a much smaller (and considerably less
expensive) Origin2000 configuration would have adequately handled the
demonstration parameters.
The IBM Magstar 3590s performed superbly in terms of performance, capacity and
reliability, based on hours of sustained streaming. Magstar has leading-edge streaming
and start/stop performance, with an uncompacted data transfer rate of 9 MB/sec and an
instantaneous data rate of up to 20 MB/sec with compaction. The advanced Magstar
tape drive is designed for up to a 100-fold increase in data integrity with improved
error correction codes and servo tracking techniques. Magstar uses longitudinal
serpentine recording technology, using IBM's industry-leading magneto-resistive
heads that read and write sixteen tracks at a time. The IBM Magstar 3590 Model B11
includes a removable 10-cartridge magazine and provides random access to over
300 GB of (compacted) data. Each cartridge has an uncompacted capacity of 10 GB, 50
times more than 3480 cartridges.
Spectra Logic's Alexandria Backup and Archive Librarian proved that it is the
industry's most scaleable solution. Furthermore, the low CPU overhead shows that
Alexandria will continue to scale with your needs.
Once again, Oracle's EBU demonstrated the firm's commitment to supporting and
managing very large databases. EBU scaled exceptionally well with minimal CPU
utilisation, proving that these vendors are ready to manage and support the databases
of tomorrow.