Ch 5. Transaction Processing Monitors

Transaction Processing Monitors
An Overview
Module 2
COP 6730
Overview
• A reference architecture of a
transaction-oriented system
• The role of a TP monitor within this
framework
– services provided by a TP monitor
– structure of this system component
The Role of TP Monitors
Operating systems, communication systems,
etc. are usually not designed for the needs of a
transaction-oriented environment:
– A TP monitor provides
• either essential services absent from the host system, or
• services the host performed so poorly that a new
implementation was required.
– The main function of a TP monitor is to integrate
other system components to make them work
together to support transaction-oriented
processing.
Characteristics of TRANSACTION-ORIENTED PROCESSING (1)
Data sharing: Computations read and update
databases shared among all users.
Repetitive workload: Users do not run arbitrary
programs, but rather request the system to
execute certain functions out of a predefined set.
Mostly simple functions: Each consumes 10^5 – 10^7
instructions and does about 10 disk I/Os.
Variable requests: The request pattern exhibits some statistical
regularity, but cannot be preplanned.
Characteristics of TRANSACTION-ORIENTED PROCESSING (2)
Some batch transactions: have the size and
duration of typical batch jobs.
Many concurrent users: 10^3 – 10^6
High availability: Because of the large
number of users, the system must be
highly reliable and available.
The system performs recovery itself.
Automatic load balancing: The system should
deliver high throughput with guaranteed
low response time (soft real-time system).
Transaction Types
Transaction types are distinguished along
three dimensions:
– Direct vs queued
– Simple vs complex
– Local vs distributed
Direct vs Queued Transactions
Direct: The terminal and the
process running the server
program (handling the
request) are associated with
each other.
Queued: Transactions are put
in a queue and scheduled for
processing according to the
queuing discipline.
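The queued case can be sketched as a request queue with a pluggable discipline. This is a minimal illustration that assumes a first-in, first-out discipline; the class and method names are hypothetical and not taken from any TP monitor.

```python
from collections import deque


class RequestQueue:
    """Holds queued transaction requests (hypothetical sketch).

    The queuing discipline assumed here is FIFO; a real system might
    use priorities or deadlines instead.
    """

    def __init__(self):
        self._pending = deque()

    def enqueue(self, request):
        # A queued transaction is not tied to a server process on arrival.
        self._pending.append(request)

    def schedule_next(self):
        # Return the next request to process, or None if nothing is waiting.
        return self._pending.popleft() if self._pending else None
```

In the direct case there is no such queue: the terminal stays associated with the server process that handles its request.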
Simple vs Complex Transactions
Simple
– Single message: There is a single input message from
the terminal; upon commit, a single output message
is delivered.
– Short: The number of objects it touches is in the tens.
Complex
– Conversational: It allows for repeated exchange of
messages between the user and the application.
– Long: The number of objects it touches is in the tens
of thousands (batch-like transaction).
Local vs Distributed Transactions
• Local: Transactions run entirely on the
network node where the request
originated (centralized processing).
• Distributed: In addition to the local
node, transactions may also invoke
services from other nodes.
A TAXONOMY OF TRANSACTION EXECUTION
Transaction
  Direct
    Single message (local or distributed): direct OLTP transaction
    Conversational (local or distributed): complex online transaction
  Queued
    Short (local or distributed): queued OLTP transaction
    Long (local or distributed): long batch transaction (e.g., ad hoc queries)
OLTP: Online Transaction Processing
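The taxonomy can be read as a small decision table. The sketch below is one way to express it; the function and its three boolean flags are hypothetical names chosen for illustration.

```python
def classify(queued: bool, complex_: bool, distributed: bool) -> str:
    """Map the three distinctions onto the taxonomy's leaf names.

    For direct transactions, "complex" means conversational;
    for queued transactions, "complex" means long (batch-like).
    """
    if not queued:
        kind = "complex online transaction" if complex_ else "direct OLTP transaction"
    else:
        kind = "long batch transaction" if complex_ else "queued OLTP transaction"
    scope = "distributed" if distributed else "local"
    return f"{scope} {kind}"
```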
Transaction Processing Services
• Transaction services must provide a
programming environment that integrates
transaction control in a seamless manner.
– The program need not worry about
concurrency, failures, clean-up, and so forth.
• As far as data sharing is concerned,
applications can use the services provided
by a database service.
Transaction Processing Services
Apart from the technical issue of access to shared
data, more system services are required:
Manage heterogeneity: Local transaction mechanisms in each
subsystem are not sufficient to ensure the ACID properties for
the whole function.
Control communication: The status of communication sessions must also
be subject to transaction control (e.g., transactional RPC).
Terminal management: Since the ACID properties must be perceived
by the user, sending and receiving messages must be part of
the transaction (e.g., was the response delivered to the user before the failure?).
Presentation services: If the terminal uses sophisticated presentation
services, then reestablishing the window environment after a
crash is also part of the transaction guarantee.
Context management: Storing and recovering context must be
bound to the sphere of control (SoC).
Start/restart: The TP monitor must also handle restart after any failure.
By doing so, all the subsystems are brought up in a state that is
consistent with respect to the ACID rules.
Integrated Control
Database transaction
control is not all there
is to transaction
processing
Note:
• All components integrated by the transaction services must implement a
basic set of protocols that enable them to cooperate in transaction
processing
• Subsystems that support these protocols are called resource managers
Server and Server Class
• Typically, a number of services are
bunched together in one application.
• Server class is a group of processes
(servers) that are able to run the
code of a given application program.
• At run time, a server class is
maintained for each application
program.
[Figure: a server class containing several servers]
• Execution of a service request requires the
request to be sent to a process (a server) of
the right server class (service invocation).
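Service invocation against a server class can be sketched as follows. This is a simplified, single-threaded illustration; `ServerClass`, `invoke_service`, and the handler functions are hypothetical names, and real servers would be separate processes.

```python
class ServerClass:
    """A group of servers able to run one application program's code."""

    def __init__(self, application, handler, size):
        self.application = application
        self.handler = handler          # stands in for the application code
        self.idle = list(range(size))   # ids of idle servers in this class

    def invoke(self, request):
        if not self.idle:
            raise RuntimeError(f"no idle server in class {self.application!r}")
        server_id = self.idle.pop()
        try:
            return server_id, self.handler(request)
        finally:
            self.idle.append(server_id)  # completion frees the server


def invoke_service(server_classes, application, request):
    """Service invocation: send the request to a server of the right class."""
    return server_classes[application].invoke(request)
```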
One Process Per Terminal
• All applications are linked together to form one
application program.
• At logon, each terminal is given its own process for
the entire session (e.g., time-sharing systems).
[Figure: one process per terminal, each process containing all 100 applications]
One Process Per Terminal
Problems:
1. Too many capabilities per process: Each
process comes with more capabilities than
a terminal needs.
2. Too many process switches: Process
switches are very expensive operations in
most operating systems (2,000 – 5,000
instructions).
Limitation: Acceptable only for small
systems of fewer than 200 clients.
Only One Terminal Process
• All terminals talk to one process, which can be the
TP monitor process itself.
• The TP monitor process receives the function
requests and routes them to the programs that can
service them.
[Figure: one process containing all 100 applications]
EXAMPLE: CICS (Customer Information Control System) is a
transaction server that runs primarily on IBM mainframe systems.
Only One Terminal Process
Advantages & Disadvantages
• Advantages:
– Makes transaction processing simpler.
– The TP monitor can check the function requests,
schedule them according to its own policies, and so on.
• Disadvantages:
– Each page fault or other exception in the process will
stop the whole TP environment.
– Since a single process can employ only one CPU at a time,
the TP system can use only one CPU.
– The process is confined within one address space, which
can be a serious limitation for large applications.
Many Servers, One Scheduler
• One (data communication) process handles all
request and response messages.
• There is a group of processes (i.e., a server class)
for each application program.
– Different applications are fenced off against each other.
– The data communication process routes the service
request to the appropriate server.
Many Servers, One Scheduler
Advantages & Disadvantages
• Example: IMS/DC
IMS (Information Management System) is a joint
hierarchical database and information management
system with extensive transaction processing
capabilities
• Advantages: Simplicity! There is one
place for scheduling and load control.
• Disadvantages: The data communication
resource can become a bottleneck.
Many Servers, Many Schedulers
A number of (functionally identical) data communication
processes do the terminal handling
– There is a server class for data communication services.
– The communication service must multiplex itself among
the terminals it is attached to (i.e., multi-threaded
process).
[Figure: terminals, many data communication processes (presentation services), a monitor process, and many application servers (Application 1 … Application n)]
Many Servers, Many Schedulers
• Load control, activation/deactivation of processes, etc. must
be coordinated by a separate instance, the monitor process.
• The application server classes are set up as in the “many servers,
one scheduler” scenario.
• The application servers can be simple, single-threaded
processes.
• The presentation service process should be
multi-threaded to support multiple terminals.
Many Servers, Many Schedulers
Advantages & Disadvantages
Example: Tandem’s Pathway, DEC’s ACMS
(Application Control Management System).
Advantages:
• The data communication process is no
longer a bottleneck.
• Expensive process switches can generally
be replaced by much cheaper process-internal thread switches.
Disadvantage: Load balancing becomes more
difficult.
Tasks of TP Monitors (1)
• Scheduling: Service requests must be
mapped to the proper servers.
• Server class management: The TP monitor is
responsible for setting up the server class.
• Recovery: After a crash, the TP monitor is
responsible for bringing up the TP
environment.
– It starts all the system processes,
– brings up the server classes, and then
– passes control to the transaction manager.
Tasks of TP Monitors (2)
• Resource administration: Information about the
terminals, databases, application programs, users,
etc. is kept in a system repository managed by
the TP monitor.
• Authentication and authorization: Service
requests must be cleared by the TP monitor
before they are executed.
• System operation: The TP monitor must
– provide the operators with sufficient information to
tune the system, and
– inform them about any problems that occur during
normal operations.
Resource Managers
A resource manager is a software subsystem that ties into the TP
monitor to provide protected actions on its state.
• It must be able to participate in transaction-oriented recovery.
Example (many servers, one scheduler):
BEGIN WORK;                              (starts the SoC; the TRID is used to tag all subsequent messages)
receive (input message);
< some SQL >                             (DB2 participates in the transaction)
send (statistics menu) to (window w1);
COMMIT WORK;
Context-Sensitive Scheduling
• The completion of a request typically frees
the server so that it can be reassigned to
another request.
• However, there are cases in which a server
is reserved for a special user.
Example: For chained transactions, the server
must be reserved for the “next” transaction,
because it may refer to local context variables
available only in that server process.
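A minimal sketch of context-sensitive scheduling, assuming a hypothetical scheduler that tracks free servers and per-user reservations. The names are illustrative only; a real TP monitor would also handle timeouts and failures.

```python
class ContextSensitiveScheduler:
    """Assigns servers to requests and honors reservations (sketch)."""

    def __init__(self, server_ids):
        self.free = set(server_ids)
        self.reserved = {}  # user -> id of the server holding that user's context

    def assign(self, user):
        # A user with a reserved server always gets that same server back.
        if user in self.reserved:
            return self.reserved[user]
        if not self.free:
            raise RuntimeError("no free server")
        return self.free.pop()

    def complete(self, user, server_id, chained=False):
        if chained:
            # Keep the server for the "next" transaction of the chain.
            self.reserved[user] = server_id
        else:
            # Normal case: completion frees the server for other requests.
            self.reserved.pop(user, None)
            self.free.add(server_id)
```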
Transaction Manager (TM)
Once the transaction program has
started, the TP monitor has little to do
with transaction management.
The coordination of the resource managers
is done by the transaction manager.
Transaction Manager (TM) cont’
We want to separate
• the components exercising transaction
control (transaction manager) from
• those that do transaction-oriented
resource scheduling (TP monitor).
Reason: There are transactions that do not
come in through the TP monitor.
Examples:
• Ad hoc query interface of an SQL system.
• CAD applications that run their own terminal
environment.
[Figure labels: Query, DBMS, TP Environment]
Responsibilities of TP Monitors (1)
• The TP monitor brings up the
resource managers upon startup.
• For restart, the TP monitor only has
to bring up the resource managers.
The actual recovery protocol is
completely handled among the resource
managers and the transaction manager.
Responsibilities of TP Monitors (2)
• To dispatch a server for a request, the TP
monitor creates a process (or reuses an
existing one) and loads the code into it.
• All the calls among resource managers are
so-called transactional remote procedure
calls (TRPCs). The mechanisms to handle
them are provided by the TP monitor.
Example: BEGIN_WORK is a TRPC to the
transaction manager.
Transactional Remote Procedure Call (TRPC)
Remote Procedure Call (RPC)
An RPC system enables
a client program to communicate with
server programs on different computers by
calling procedures in a way similar to the
conventional use of procedure calls in high-level languages.
[Figure: a client program on computer 1 calling server programs on computer 1 and computer 2]
Export/Import Service
– Export Procedures: At the RPC level a service
may be viewed as a module with an interface that
exports a set of procedures appropriate for
operating on some data abstraction or resource.
[Figure: the server exports procedures 1 to 3, which operate on a resource, to the client]
Export/Import Service
– Import Procedures: From the perspective of
client programs, a service provides the same
facilities as a software module – enabling clients
to import its procedures.
[Figure: the client imports a procedure from the server's procedures 1 to 3, which operate on a resource]
Marshalling
• Marshalling is the process of taking a collection
of data items and assembling them into a form
suitable for transmission in a message.
– Flatten structured data items into a sequence of
basic data items.
– Translate those data items into an external data
representation.
Unmarshalling
• Unmarshalling is the process of disassembling
a message on arrival to produce an equivalent
collection of data items at the destination.
– Translate the external data representation to the
local one.
– Unflatten the data items.
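Marshalling and unmarshalling can be sketched with Python's standard `struct` module. The record layout (an account id, an amount, and a memo string) is a made-up example, and network byte order is assumed as the external data representation.

```python
import struct

# "!" selects network (big-endian) byte order with no padding:
# i = 4-byte int, d = 8-byte float, H = 2-byte length of the memo.
HEADER = "!idH"


def marshal(account_id: int, amount: float, memo: str) -> bytes:
    """Flatten the items and translate them to the external representation."""
    data = memo.encode("utf-8")
    return struct.pack(HEADER, account_id, amount, len(data)) + data


def unmarshal(message: bytes):
    """Translate back to local values and rebuild the original items."""
    size = struct.calcsize(HEADER)
    account_id, amount, length = struct.unpack(HEADER, message[:size])
    memo = message[size:size + length].decode("utf-8")
    return account_id, amount, memo
```

A round trip such as `unmarshal(marshal(42, 12.5, "rent"))` yields the original three values.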
Message Destinations
• Potential clients need to know an identifier for
communicating with a server.
• In the Internet protocols, the destination
addresses for messages are specified as
– a port number used by a process and
– the Internet address of the computer on which it runs.
[Figure: Send(p, message) delivers a message to port p at the destination's Internet address, where Receive(p, message) picks it up; the sender uses its own port q]
RPC: Main Tasks
The software that supports remote procedure
calling has three main tasks:
– Binding: Locating an appropriate server for a
particular service.
– Communication handling: Transmitting and receiving
request and reply messages.
– Interface processing: Integrating the RPC
mechanism with client and server programs in
conventional programming languages.
• dispatching of request messages to the appropriate
procedure in the server.
• marshalling and unmarshalling of arguments in the client
and the server.
Stub Procedure
An RPC system provides a stub procedure to stand
in for each remote procedure that is called by the
client program.
[Figure: in the client process, the client makes a local call to the client stub procedure, which marshals the arguments and sends the request through the communication module; in the server process, the communication module receives the request, the dispatcher selects the procedure, and the server stub unmarshals the arguments, executes the service procedure, marshals the results, and sends the reply, which the client stub unmarshals before the local return]
Client Stub Procedure
The purpose of a client stub
procedure is to convert a local
procedure call to a remote
procedure call to the server. Its tasks are to
– marshal the arguments and pack them up with the procedure
identifier into a message,
– send the message to the server
and then await the reply
message, and
– unmarshal it and return the
results.
[Figure: client side of the stub diagram: local call, marshal arguments, send request, receive reply, unmarshal results, local return]
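The three steps of a client stub can be sketched as below. JSON stands in for the external data representation, and `send_and_wait` is a hypothetical callable representing the communication module; `fake_server` exists only so the sketch runs without a network.

```python
import json


def make_client_stub(procedure_id, send_and_wait):
    """Build a stub that looks like a local procedure to the client."""
    def stub(*args):
        # 1. Marshal the arguments together with the procedure identifier.
        request = json.dumps({"proc": procedure_id, "args": list(args)}).encode()
        # 2. Send the message and block until the reply arrives.
        reply = send_and_wait(request)
        # 3. Unmarshal the reply and return the result to the caller.
        return json.loads(reply.decode())["result"]
    return stub


def fake_server(request: bytes) -> bytes:
    """Loopback stand-in for the network and the remote server (demo only)."""
    message = json.loads(request.decode())
    return json.dumps({"result": sum(message["args"])}).encode()


add = make_client_stub("add", fake_server)
```

The client then calls `add(2, 3)` exactly as it would call a local procedure.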
Server Stub Procedure
An RPC system provides a dispatcher and a set of
server stub procedures.
Dispatcher: uses the procedure
identifier in the request
message to select one of the
server stub procedures and passes
on the arguments.
Server stub procedure:
– unmarshals the arguments,
– calls the appropriate service
procedure, and
– when it returns, marshals the
output arguments into a reply
message.
[Figure: server side of the stub diagram: receive request, select procedure (dispatcher), unmarshal arguments, execute service procedure, marshal results, send reply]
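A dispatcher with one server stub can be sketched as follows. JSON is again an assumed message format, and `balance`, `balance_stub`, and `SERVER_STUBS` are hypothetical names. In this simplified version the dispatcher decodes the whole message in order to read the procedure identifier, and the stub unpacks the argument list and packs the result.

```python
import json


def balance(account):
    """Hypothetical service procedure with made-up data."""
    return {"alice": 100}.get(account, 0)


def balance_stub(raw_args):
    """Server stub: unpack arguments, call the service, pack the result."""
    (account,) = raw_args
    return {"result": balance(account)}


# One server stub per exported procedure, keyed by procedure identifier.
SERVER_STUBS = {"balance": balance_stub}


def dispatch(request: bytes) -> bytes:
    """Dispatcher: select the stub named by the procedure identifier."""
    message = json.loads(request.decode())
    stub = SERVER_STUBS[message["proc"]]
    return json.dumps(stub(message["args"])).encode()
```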
Remote Procedure Calls (RPCs)
[Figure: (1) the caller (client) makes a subroutine call to its RPC stub; (2) the stub sends a request message to the RPC stub on the callee (server); (3) that stub makes a subroutine call to the service routine]
• RPC makes the invocation of services at
remote nodes look like local subroutine
calls.
• The RPC stub on the callee acts fully
complementary to the stub at the caller’s
side.
Interface Definition
• The types of the arguments and results in
the client stub must conform to those
expected by the server stub. This is
achieved by the use of a common interface
definition.
• An RPC interface definition specifies those
characteristics of the procedures provided
by a server that are visible to the server’s
clients:
– names of the procedures, and
– types of their parameters.
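A common interface definition can be pictured as a table that both stubs consult. The sketch below uses a plain Python dictionary rather than a real interface definition language, and the procedures `transfer` and `balance` are hypothetical.

```python
# Hypothetical interface definition shared by client and server stubs:
# procedure names and the types of their parameters.
INTERFACE = {
    "transfer": {"params": [("source", str), ("target", str), ("amount", int)]},
    "balance": {"params": [("account", str)]},
}


def conforms(procedure: str, args: list) -> bool:
    """Return True if args match the declared parameter count and types."""
    spec = INTERFACE.get(procedure)
    if spec is None:
        return False
    params = spec["params"]
    if len(args) != len(params):
        return False
    return all(isinstance(value, expected)
               for (_, expected), value in zip(params, args))
```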
Interface Compilers
[Figure: an interface definition, written in an interface definition language, is processed by interface compiler A to produce the client stub and by interface compiler B to produce the server stub; the client program and client stub are compiled with compiler A and linked into the client; the server program, dispatcher, and server stub are compiled with compiler B and linked into the server; client and server communicate by RPC]
Interface compilers can be designed to process interfaces for
use with different languages, enabling clients and servers written
in different languages to communicate by using RPCs.
Invocation of SQL Resource Manager
(SQL Pre-compiler)
1. Server Side: SQL pre-compiler parses and
translates the SQL statement into an internal
representation that can be interpreted directly by
the SQL executor.
2. Client Side: The pre-compiler also generates code
in the host language to call the SQL server:
!sqlselect('fastsql', format_CB, expression_CB,
&variable_CB);
[Figure annotations: the call is a resource manager invocation (recognized by the stub compiler), written in the host language; its parts are the entry point, the resource manager name (RMNAME), and the parameters]
Execution Plans
Embedded SQL is compiled once, and from
then on the generated query plan is executed.
• At compile time, the client has to issue an rmCall to the
SQL server for it to compile the statement.
• The SQL server compiles the statement and
generates the access plan, and hands back an ID for
that plan.
• At run time, the rmCalls from the client refer to the
access plan ID and thereby ask the server to run
that pre-compiled query.
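The compile-once, run-many pattern can be sketched as follows. `SqlServer` is a hypothetical stand-in, not a real database API: "compiling" here only normalizes whitespace, whereas a real server would build an access plan.

```python
import itertools


class SqlServer:
    """Stand-in SQL server that stores plans under numeric IDs (sketch)."""

    def __init__(self):
        self._plans = {}
        self._ids = itertools.count(1)

    def compile(self, statement: str) -> int:
        """Compile-time call: build a plan and hand back its ID."""
        plan_id = next(self._ids)
        # Placeholder for real query compilation.
        self._plans[plan_id] = " ".join(statement.split())
        return plan_id

    def run(self, plan_id: int, *params):
        """Run-time call: refer to the pre-compiled plan by ID only."""
        plan = self._plans[plan_id]
        return plan, params
```

The client keeps only the plan ID, so the statement text is never re-parsed at run time.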
Binding
An interface definition specifies a textual service name
for a server. However, client request messages must be
addressed to a server port.
Look Up: When a client process starts, it sends a message to the
binder requesting it to look up the identifier of the server port of a
named service.
Registration: When a server process starts executing, it sends a
message to the binder requesting it to register its service name and
server port.
[Figure: the server registers its service name and server port with the binder (name service); the client sends a service name to the binder and receives the server port]
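Registration and look-up can be sketched with an in-memory binder. The class and the service name used in the usage note are hypothetical; a real binder is itself a server reached through messages.

```python
class Binder:
    """Name service mapping textual service names to server ports (sketch)."""

    def __init__(self):
        self._ports = {}

    def register(self, service_name: str, server_port: int) -> None:
        # Called by a server process when it starts executing.
        self._ports[service_name] = server_port

    def look_up(self, service_name: str):
        # Called by a client process; returns None if the name is unknown.
        return self._ports.get(service_name)
```

For example, after `binder.register("accounts", 5001)`, a client calling `binder.look_up("accounts")` obtains the port to which it addresses its request messages.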
Transactional RPC (TRPC)
TP monitors provide the mechanism to handle RPCs. In
addition, TP monitors turn each RPC into a TRPC:
• Bind RPCs to transactions: Each RPC is tagged with
a TRID.
• Inform the transaction manager: The TP monitor makes sure
that the transaction manager knows the callee is
participating in a transaction (i.e., expanding the
sphere of control).
• Bind processes to transactions: When
dispatching a server, the TP monitor remembers
the transaction for which the server is running
and thus can inform the transaction manager if
that process crashes.
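The three additions that turn an RPC into a TRPC can be sketched as follows. Both classes and all method names are hypothetical; resource managers are modeled as plain callables that receive the TRID as their first argument.

```python
class TransactionManager:
    """Tracks which resource managers participate in each transaction."""

    def __init__(self):
        self.participants = {}  # TRID -> set of resource manager names

    def join(self, trid, rm_name):
        self.participants.setdefault(trid, set()).add(rm_name)


class TPMonitor:
    """Wraps calls to resource managers so that they become TRPCs (sketch)."""

    def __init__(self, tm, resource_managers):
        self.tm = tm
        self.rms = resource_managers  # name -> callable(trid, *args)
        self.running = {}             # server process id -> TRID

    def trpc(self, trid, rm_name, process_id, *args):
        self.tm.join(trid, rm_name)            # inform the transaction manager
        self.running[process_id] = trid        # bind the process to the transaction
        return self.rms[rm_name](trid, *args)  # the call itself is tagged with the TRID

    def process_crashed(self, process_id):
        # Returns the TRID to report to the transaction manager, if any.
        return self.running.pop(process_id, None)
```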
TP Monitors & O.S.
TP monitors allocate resources for
other system components to do the
work, rather than doing the work themselves.
– Their tasks are similar to the duties of
an operating system.
– Some believe it would be best if the
operating system just swallowed the TP
monitor.
Summary
The function of a TP monitor is twofold:
1. It extends standard RPC mechanisms to
include server class management.
2. It provides the transaction manager with
enough information to keep the dynamically
expanding web of resource managers
participating in a transaction within a
sphere of control.
Dynamics of TRPC (1)
1. Bind the RMNAME in the invocation to a
NODEID and an RMID; information is
obtained from the name server.
2. Look up the callee’s interface prototype
description (in the repository).
3. Coerce* the local parameter representation
into the one expected by the invoked
resource manager.
4. Pack all the transformed parameter values
into a byte string (parameter marshalling).
*e.g., mapping the data type from Big Endian (most significant byte in smallest
address) to Little Endian (least significant byte in smallest address)
Dynamics of TRPC (2)
5. Send the message to the peer TRPC stub.
6. The caller is now suspended until the
response from the server arrives.
7. When the response from the server arrives,
unpack the byte string (reverse marshalling).
8. Coerce the parameter values received into
the representation used by the caller.
Note:
Client makes it right: coercing the parameter values is done at
the caller’s site.
Server makes it right: coercing is done at the server’s site.
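Coercion between byte orders can be sketched with Python's `struct` module. The function name is hypothetical and only 4-byte integers are handled; whether it runs at the caller or at the server is exactly the "client makes it right" versus "server makes it right" choice.

```python
import struct


def coerce_int32(raw: bytes, source: str, target: str) -> bytes:
    """Re-encode a 4-byte integer between byte orders.

    '<' means little-endian (least significant byte first),
    '>' means big-endian (most significant byte first).
    """
    (value,) = struct.unpack(source + "i", raw)
    return struct.pack(target + "i", value)
```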