WICS TP Chapter 1 - Microsoft Research

The Whirlwind Tour
[Figure: course schedule grid for Aug. 2 - 6, with sessions starting at 9:00, 11:00, 13:30, 15:30, and 18:00. Aug. 2: Intro & terminology; Reliability; Fault tolerance; Transaction models; Reception. Topics on Aug. 3 - 6 include TP monitors & ORBs, logging & resource manager, files & buffer manager, structured files, locking theory and techniques, resource manager & transaction manager, COM+, access paths, CICS, CORBA/EJB + TP, groupware & Internet, queueing, advanced transaction manager, replication, performance & TPC, workflow, and Cyberbricks, plus a party and a free slot.]
Chapter 1a
© Jim Gray, Andreas Reuter
Transaction Processing - Concepts and Techniques
WICS August 2 - 6, 1999

Transactions: Where It All Started
[Cuneiform] documents now number about half a million, three-quarters of them more or less directly related to the history of law, dealing, as they do, with contracts, acknowledgment of debts,
receipts, inventories, and accounts, as well as containing records
and minutes of judgments rendered in courts, business letters,
administrative and diplomatic correspondence, laws, international
treaties, and other official transactions. The total evidence enables
the historian to reach back as far as the beginnings of writing, to the
dawn of history.[ ... ]
Moreover, because of the inconvenience of writing in stone or clay,
Mesopotamians wrote only when economic or political necessity
demanded it.

(Encyclopaedia Britannica, 1974 edition)
From Transactions to Transaction
Processing Systems - I
The Sumerian way of doing business involved two
components:



Database. An abstract system state,
represented as marks on clay tablets, was
maintained. Today, we would call this the
database.
Transactions. Scribes recorded state changes
with new records (clay tablets) in the database.
Today, we would call these state changes
transactions.
From Transactions to Transaction
Processing Systems - II
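[Figure: reality is mirrored by an abstraction. A change in reality corresponds to a transaction that transforms the database DB into DB'; a query against the database returns an answer.]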
The real state is represented by an abstraction, called the database, and
the transformation of the real state is mirrored by the execution of a
program, called a transaction, that transforms the database.
Transactions Are In ...
Communications:
Each time you make a phone call, there is a call setup
transaction that allocates some resources to your
conversation; the call teardown is a second
transaction, freeing those resources. The call setup
increasingly involves complex algorithms to find the
callee (800 numbers could be anywhere in the world)
and to decide who is to be billed (800 and 900
numbers have complex billing). The system must deal
with features like call forwarding, call waiting, and voice
mail. After the call teardown, billing may involve many
phone companies.
Transactions Are In ...
Finance:
Each time you purchase gas using a credit card, the
point-of-sale terminal connects to the credit card
company's computer. In case that fails, it may
alternatively try to debit the amount to your account by
connecting to your bank.
This generalizes to all kinds of point-of-sale terminals
such as cash registers, ATMs, etc.
When banks balance their accounts with each other
(electronic fund transfer), they use transactions for
reliability and recoverability.
Transactions Are In ...
Travel:
Making reservations for a trip requires many related
bookings and ticket purchases from airlines, hotels,
rental car companies, and so on.
From the perspective of the customer, the whole trip
package is one purchase. From the perspective of the
multiple systems involved, many transactions are
executed: One per airline reservation (at least), one for
each hotel reservation, one for each car rental, one for
each ticket to be printed, one for setting up the bill, etc.
Along the way, each inquiry that may not have resulted
in a reservation is a transaction, too.
Transactions Are In ...
Manufacturing:
Order entry, job and inventory planning and scheduling,
accounting, and so on are classical application areas
of transaction processing. Computer integrated
manufacturing (CIM) is a key technique for improving
industrial productivity and efficiency. Just-in-time
inventory control, automated warehouses, and robotic
assembly lines each require a reliable data storage
system to represent the factory state.
Transactions Are In ...
Real-Time Systems:
This application area includes all kinds of physical
machinery that needs to interact with the real world,
either as a sensor or as an actuator. Traditionally, such
systems were custom made for each individual plant,
starting from the hardware. The usual reason for that
was that 20 years ago off-the-shelf systems could not
guarantee real-time behavior that is critical in these
applications. This has changed, and so has the
feasibility of building entire systems from scratch.
Standard software is now used to ensure that the
application will be portable.
A Transaction Processing System
A transaction processing system (TP-system) provides tools to ease
or automate application programming, execution, and administration
of complex, distributed applications.
Transaction processing applications typically support a network of
devices that submit queries and updates to the application.
Based on these inputs, the application maintains a database
representing some real-world state.
Application responses and outputs typically drive real-world actuators
and transducers that alter or control the state.
The applications, database, and network tend to evolve over several
decades.
Increasingly, the systems are geographically distributed,
heterogeneous (they involve equipment and software from many
different vendors), continuously available (there is no scheduled
downtime), and have stringent response time requirements.
ACID Properties: First Definition




Atomicity: A transaction’s changes to the state are atomic: either
all happen or none happen. These changes include database
changes, messages, and actions on transducers.
Consistency: A transaction is a correct transformation of the
state. The actions taken as a group do not violate any of the
integrity constraints associated with the state. This requires that
the transaction be a correct program.
Isolation: Even though transactions execute concurrently, it
appears to each transaction T that the others executed either before T
or after T, but not both.
Durability: Once a transaction completes successfully (commits),
its changes to the state survive failures.
Structure of a Transaction Program




The application program declares the start of a new transaction by
invoking BEGIN_WORK().
All subsequent operations will be covered by the transaction.
Eventually, the application program will call COMMIT_WORK(), if a
new consistent state has been reached. This makes sure the new
state becomes durable.
If the application program cannot complete properly (violation of
consistency constraints), it will invoke ROLLBACK_WORK(), which
appeals to the atomicity of the transaction, thus removing all effects
the program might have had so far.
If for some reason the application fails to call either commit or
rollback (there could be an endless loop, a crash, a forced process
termination), the transaction system will automatically invoke
ROLLBACK_WORK() for that transaction.
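The bracketing described above can be sketched in a few lines. This is a minimal illustration, not the TP system's own interface: it assumes Python's sqlite3 module as a stand-in resource manager, uses the SQL statements BEGIN, COMMIT, and ROLLBACK in place of BEGIN_WORK(), COMMIT_WORK(), and ROLLBACK_WORK(), and the account table, the names, and the "no negative balance" constraint are hypothetical.

```python
import sqlite3


def make_bank():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    # isolation_level=None puts sqlite3 in autocommit mode, so the code
    # below controls transaction boundaries explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("jim", 100), ("andreas", 50)])
    return conn


def balances(conn):
    """Return the current state as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account").fetchall())


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True if committed."""
    conn.execute("BEGIN")                       # stands in for BEGIN_WORK()
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                     (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                     (amount, dst))
        # Assumed consistency constraint: no account may go negative.
        lowest = conn.execute("SELECT MIN(balance) FROM account").fetchall()[0][0]
        if lowest < 0:
            raise ValueError("consistency constraint violated")
        conn.execute("COMMIT")                  # stands in for COMMIT_WORK()
        return True
    except Exception:
        conn.execute("ROLLBACK")                # stands in for ROLLBACK_WORK()
        return False
```

A transfer that would overdraw the source account is rolled back, and the database looks as though the transaction never ran.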
The End User’s View of a Transaction
Processing System
[Figure: a mail system as seen by the end user. Operations on mail and mailboxes: Logon (name, password), Headers (list of messages by sender and subject), Read Message, Delete Message, Cancel Message, and Send Message (text, sound, image). Mailboxes shown: Bruce, Chris, Andreas, Betty, and Jim.]
The Administrator's/Operator’s View of a
TP System
[Figure: the administrator's and operator's view. Labels: Hong Kong, New York, Berlin; Data Comm, Application, Data Base; Repository; Mail Gateway to other mail systems.]
Performance Measures of Interactive
Transactions
Performance/Transaction   Small/Simple   Medium       Complex
----------------------------------------------------------------
Instr./transaction        100k           1M           100M
Disk I/O / TA             1              10           1000
Local msgs. (B)           10 (5KB)       100 (50KB)   1000 (1MB)
Remote msgs. (B)          2 (300B)       2 (4KB)      100 (1MB)
Cost/TA/second            10k$/tps       100k$/tps    1M$/tps
Peak tps/site             1000           100          1
Client-Server Computing: The Classical
Idea
[Figure: a workstation client handles presentation and invokes the services Logon, Headers, Read, Send, and Delete on the host server(s) by transactional remote procedure call. Host-side labels: Data communications, TP Monitor, the services, and Data Base.]
Client-Server Computing: The CORBA
Idea
[Figure: a client on the workstation (presentation, services, etc.) sends the request Delete through an IDL stub to the Object Request Broker, which delivers it through an IDL skeleton to the object implementation, Jim's mailbox.]
Client-Server Computing: The
WWW Idea
[Figure: a WWW browser loads a Java applet plus Java Database Connection (JDBC) driver code from an HTTP server. Three routes to the database server are shown: JDBC driver code using a proprietary protocol; a JDBC-ODBC bridge with an ODBC driver and a proprietary protocol; and a JDBC network driver using a public protocol (e.g. TCP/IP).]
Using Transactional Remote Procedure
Calls (TRPCs)
[Figure: timeline of a request. Labels: User, Screen, Client, Network, TP Monitor, Service (server), Database, Network, Another TP-Monitor and Server.]
Terms We Have Introduced So Far

Resource manager: The system comes with an array of
transactional resource managers that provide ACID operations on the
objects they implement. Database systems, persistent programming
languages, and queue managers are typical examples.

Durable state: Application state represented as durable data stored
by the resource managers.

TRPC: Transactional remote procedure calls allow the application to
invoke local and remote resource managers as though they were
local. They also allow the application designer to decompose the
application into client and server processes on different computers.

Transaction program: Inquiries and state transformations are
written as programs in conventional or specialized programming
languages. The programmer brackets the successful execution of the
program with a Begin-Commit pair and brackets a failed execution
with a Begin-Rollback pair.
Terms We Have Introduced So Far

Atomicity: At any point before the commit, the application or the
system may abort the transaction, invoking rollback. If the
transaction is aborted, all of its changes to durable objects will be
undone (reversed), and it will be as though the transaction never
ran.

Consistency: The work within a Begin-Commit pair must be a
correct transformation.

Isolation: While the transaction is executing, the resource
managers ensure that all objects the transaction reads are isolated
from the updates of concurrent transactions.

Durability: Once the commit has been successfully executed, all the
state transformations of that transaction are made durable and
public.
The World According to the Resource Manager
[Figure: labels include Application, Application Servers, Resource Managers, and Transaction Manager.]
Where To Split Client/Server?
[Figure: the layers Presentation, Flow Control, Application Logic (= business objects), and Data Access, with the client/server split placed anywhere from a thin client with a fat server to a fat client with a thin server.]
Client/Server Infrastructure
[Figure: client, middleware, and server stacks. Labels: Objects, GUI, OOUI, System Mgmt., SQL, Files, ORB, TRPC, Groupware, Mail, Security, TP-Mon., WWW, DBMS, Transport, OS, etc.]
Transactional Core Services
[Figure: the application calls Begin_Work() and receives a transid from the transaction manager, then sends work requests to resource managers. Resource managers perform their normal functions, issue Join_Work to the transaction manager, send lock requests to the lock manager, and send log records to the log manager. On Commit_Work(), the transaction manager asks "Commit Phase 1?" (answered Yes/No), writes the commit log record and forces the log, then runs Commit Phase 2 (answered with an ack). The recovery manager is shown with the transaction recovery functions.]
The X/Open TP-Model
[Figure: the application issues Begin, Commit, and Abort to the transaction manager (TM) and sends requests to the resource manager (RM). The RM issues Join to the TM, and the TM issues Prepare, Commit, and Abort to the RM.]
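The Join, Prepare, Commit, and Abort calls named in the X/Open model can be illustrated with a toy coordinator. This is only a sketch under stated assumptions: the class and method names are hypothetical, everything runs in one process, and the commit log record that a real transaction manager forces to disk between the two phases is omitted.

```python
class ResourceManager:
    """Toy resource manager that votes on commit (hypothetical class)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote yes only if the work can be made durable.
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


class TransactionManager:
    """Toy coordinator: resource managers join, then commit in two phases."""

    def __init__(self):
        self.joined = []

    def join(self, rm):
        # A resource manager joins when it first does work for the transaction.
        self.joined.append(rm)

    def commit(self):
        # Phase 1: ask every joined resource manager to prepare.
        votes = [rm.prepare() for rm in self.joined]
        if all(votes):
            # Phase 2: everyone voted yes, so tell everyone to commit.
            for rm in self.joined:
                rm.commit()
            return True
        # At least one "no" vote: abort everywhere.
        for rm in self.joined:
            rm.abort()
        return False
```

A single "no" vote in phase 1 aborts the transaction at every participant, which is how the atomicity property carries over to several resource managers.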
The X/Open Distributed Transaction
Processing Model
[Figure: the application issues Begin, Commit, and Abort to its transaction manager (TM) and requests to its resource manager (RM). Remote requests travel through a communications manager (CM) on each side; the CMs report outgoing and incoming transactions to their TMs, and the server-side TM starts the transaction there. Each TM sends Prepare, Commit, and Abort to its RM.]
The OTS Model
[Figure: the transaction originator uses the transaction service for creation and termination of a transaction. The transaction context (TAcontext) is transmitted with each request (invocation) to a recoverable server, which takes part in commit coordination with the transaction service.]
Transaction Processing System Feature
List



Application development features
Application generators; graphical programming interfaces; screen
painters; compilers; CASE tools; test data generators; starter system with
a complete set of administrative and operations functions, security, and
accounting.
Repository features
Description of all components of the system, both hardware and software.
Description of the dependencies among components (bill-of-material).
Description of all changes to all components to keep track of different
versions. The repository is a database. Its role in the system must be
complete, extensible, active and allow for local autonomy.
TP-Monitor Features
Process management; server classes; transactional remote procedure
calls; request-based authentication and authorization; support for
applications and resource managers in implementing ACID operations on
durable objects.
Transaction Processing System Feature List




Data communications features
Uniform I/O interfaces; device independence; virtual terminal; screen
painter support; support for RPC and TRPC; support for context-oriented
communication (peer-to-peer).
Database features
Data independence; data definition; data manipulation; data control; data
display; database operations.
Operations features
Archiving; reorganization; diagnosis; recovery; disaster recovery; change
control; security; system extension.
Education and testing features
Imbedded education; online documentation; training systems; national
language features; test database generators; test drivers.
Data Communications Protocols
[Figure: applications sit on a standard interface to all networks, which adds transactions, RPC, naming, security, reliable messages, and a uniform interface on top of protocols such as SNA LU0, SNA LU6.2 / PU2.1, X.25, TCP/IP, named pipes, and OSI.]
Presentation Management
[Figure: presentation management (PM). A form description in the repository (1 LOGON, 2 NAME PIC X(20), 2 PIN PIC 9(4)) and a device description drive the "OUR BANK" logon screen with NAME and PASSWORD fields. The application reads the terminal, checks the PIN, and displays HELLO or NO.]
SQL Data Definition
[Figure: the TABLE (= file) employee has the COLUMNS (= fields) name, dept, and loc. Each row is a TUPLE (= record), and each column takes its values from a DOMAIN (= type). The VIEW emp_view, with columns dept and loc, is defined on it:]
DEFINE VIEW emp_view AS
SELECT dept, loc
FROM employee
WHERE loc = 7;
SQL Data Manipulation
[Figure: three operations on the employee table (name, dept, loc). PROJECT returns a column subset, SELECT returns a row subset, and JOIN combines employee with the address table (dept, mgr) on matching values.]
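The three operations named on this slide (project, select, join) can be tried out with a small script. It is a sketch that assumes Python's sqlite3 module as the database; the table and column names come from the slide, while the rows and the variable names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept TEXT, loc INTEGER);
    CREATE TABLE address  (dept TEXT, mgr TEXT);
    -- Hypothetical sample rows.
    INSERT INTO employee VALUES ('Betty', 'sales', 7);
    INSERT INTO employee VALUES ('Bruce', 'dev',   3);
    INSERT INTO employee VALUES ('Chris', 'sales', 7);
    INSERT INTO address  VALUES ('sales', 'Jim');
    INSERT INTO address  VALUES ('dev',   'Andreas');
""")

# PROJECT: keep a subset of the columns.
projected = conn.execute(
    "SELECT DISTINCT dept, loc FROM employee ORDER BY dept").fetchall()

# SELECT: keep a subset of the rows.
selected = conn.execute(
    "SELECT name, dept, loc FROM employee WHERE loc = 7 ORDER BY name").fetchall()

# JOIN: combine rows of two tables that have matching dept values.
joined = conn.execute(
    "SELECT e.name, e.dept, a.mgr FROM employee AS e "
    "JOIN address AS a ON e.dept = a.dept ORDER BY e.name").fetchall()
```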
Summary of Chapter 1




A transaction processing system is a large web of application
generators, system design and operation tools, and the more
mundane language, database, network, and operations software.
The repository and the applications that maintain it are the
mechanisms needed to manage the TP system. The repository is a
transaction processing application.
It represents the system configuration as a database and supplies
change control by transactions that manipulate the configuration and
the repository.
The transaction concept, like contract law, is intended to resolve the
situation when exceptions arise. The first order of business in
designing a system is, therefore, to have a clear model of system
failure modes. What breaks? How often do things break?
Basic Terminology
Chapter 1b

A Word About Words (Chapter 2)
Humpty Dumpty: “When I use a word, it means exactly what I
choose it to mean; nothing more nor less.”
Alice: “The question is, whether you can make words
mean so many different things.”
Humpty Dumpty: “The question is, which is to be master, that’s
all.”
Lewis Carroll
Basic Computer Terms
To avoid the confusion that the many synonyms in our field can cause,
let us adopt the following conventions for the rest of this class:

domain = data type = ...
field = column = attribute = ...
record = tuple = object = entity = ...
block = page = frame = slot = ...
file = data set = table = ...
process = task = thread = actor = ...
function = request = method = ...
All the other terms and definitions we need will be briefly introduced and
explained during the session.
Basic Hardware Architecture I
In Bell and Newell’s classic taxonomy, hardware consists of three
types of modules:
Processors, memory, and communications (switches or wires).

Processors execute instructions from a program,
read and write memory,
and send data via communication lines.
Computers are generally classified as supercomputers, mainframes,
minicomputers, workstations, and personal computers. However,
these distinctions are becoming fuzzy with current shifts in
technology.
Basic Hardware Architecture II
Today’s workstation has the power of yesterday’s
mainframe. Similarly, today’s WAN (wide area network)
has the communications bandwidth of yesterday’s LAN
(local area network).
In addition, electronic memories are growing in size to
include much of the data formerly stored on magnetic
disk.

These technology trends have deep implications for
transaction processing.
Basic Hardware Architecture III



Distributed processing: Processing is moving closer
to the producers and consumers of the data
(workstations, intelligent sensors, robots, and so on).
Client-server: These computers interact with each
other via request-reply protocols. One machine, called
the client, makes requests to another, called the
server. Of course, the server may in turn be a client to
other machines.
Clusters: Powerful servers consist of clusters of many
processors and memories, cooperating in parallel to
perform common tasks.
Basic Hardware Architecture IV
[Figure: several processors and memories connected by the network.]
Memories - The Economic Perspective I


The processor executes instructions from virtual memory, and it
reads and alters bytes from the virtual memory. Virtual memory is
mapped onto real memory, which includes electronic memory (close to
the processor, volatile, fast, and expensive) and magnetic memory
("far away" from the processor, non-volatile, slow, and cheap). The
mapping is handled by the operating system with some hardware
assistance.
Memory performance is measured by its access time:
Given an address, the memory presents the data at some later
time. The delay is called the memory access time. Access time is a
combination of latency (the time to deliver the first byte), and
transfer time (the time to move the data). Transfer time, in turn, is
determined by the transfer size and the transfer rate. This produces
the following overall equation:
memory access time = latency + ( transfer size / transfer rate )
Memories - The Economic Perspective II

Memory price-performance is measured in one of two ways:


Cost/byte. The cost of storing a byte of data in that medium.
Cost/access. The cost of reading a block of data from that medium.
This is computed by dividing the device cost by the number of accesses
per second that the device can perform. The actual units are
cost/access/second, but the time unit is implicit in the metric’s name.

These two cost measures reflect the two different views of a
memory’s purpose:
it stores data, and
it receives and retrieves data.
Memories - The Economic Perspective III
[Figure: typical large system capacity.]
Memories - The Economic Perspective IV
[Figure: price in $ / MB.]
Magnetic Memory

There are two types of magnetic storage media: disk and tape.
Disks rotate, passing the data in the cylinder by the electronic
read-write heads every few milliseconds. This gives low access
latency. The disk arm can move among cylinders in tens of
milliseconds. Tapes have approximately the same storage density
and transfer rate, but they must move long distances if random
access is desired. Consequently, tapes have large random access
latencies—on the order of seconds.
Disk Access Time = Seek_Time + Rotational_Latency + (Transfer_Size / Transfer_Rate)
Magnetic Memory
Compare the times required for two access patterns to 1MB stored in
1000 blocks on disk:

Sequential access: Read or write sectors [x, x + 1, ..., x + 999] in
ascending order. This requires one seek (10 ms) and half a rotation (5
ms) before the data in the cylinder begins transferring the megabyte at
10 MBps (the transfer takes 100 ms, ignoring one-cylinder seeks).
The total access time is 115ms.

Random access: Read the 1000 sectors [x, ..., x + 999] in random
order. In this case, each read requires a seek (10 ms), half a rotation (5
ms), and then the 1 KB transfer (0.1 ms). Since there are 1000 of these
events, the total access time is 15.1 seconds.
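The two access patterns can be recomputed from the disk access time formula. The sketch below assumes decimal units (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes), which is what the slide's round numbers imply; the function and variable names are hypothetical.

```python
def disk_access_ms(seek_ms, rotational_latency_ms, transfer_bytes, rate_bytes_per_s):
    """Seek time + rotational latency + transfer time, in milliseconds."""
    transfer_ms = transfer_bytes / rate_bytes_per_s * 1000.0
    return seek_ms + rotational_latency_ms + transfer_ms


RATE = 10_000_000  # 10 MBps, as on the slide

# Sequential: one seek, half a rotation, then 1 MB in one transfer.
sequential_ms = disk_access_ms(10, 5, 1_000_000, RATE)

# Random: 1000 separate accesses of 1 KB, each paying seek and rotation.
random_ms = 1000 * disk_access_ms(10, 5, 1_000, RATE)
```

The seek and rotation terms dominate the random case, which is why it is more than a hundred times slower for the same amount of data.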
Memory Hierarchies
[Figure: the memory hierarchy, ordered by increasing memory capacity: processor registers, cache, main memory (electronic storage holding current data), online external storage (block addressed, non-volatile electronic or magnetic), near-line (archive) storage (tape or disc robots), and off-line storage.]
Memory Hierarchies




The hierarchy uses small, fast, expensive cache memories to cache
some data present in larger, slower, cheaper memories.
If hit ratios are good, the overall memory speed approximates the speed
of the cache.
At any level of the memory hierarchy, the hit ratio is defined as:
hit ratio = references satisfied by cache / all references to cache
Suppose a cache memory with access time C has hit rate H, and
suppose that on a miss the secondary memory access time is S. Further,
suppose that C = .01 • S. The effective access time of the cache will be
as follows:
Effective memory access time = H • C + (1 - H) • S
= H • (.01 • S) + (1 - H) • S
= (1 - .99 • H) • S
≈ (1 - H) • S
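The effective access time formula for a two-level hierarchy is easy to check numerically. This is a small sketch with hypothetical function names; the values H = 0.9, S = 1, and C = .01 • S are picked only to show how close the approximation (1 - H) • S comes.

```python
def effective_access_time(hit_ratio, cache_time, secondary_time):
    """H * C + (1 - H) * S for a cache in front of a slower memory."""
    return hit_ratio * cache_time + (1.0 - hit_ratio) * secondary_time


S = 1.0          # secondary memory access time (arbitrary unit)
C = 0.01 * S     # cache access time, as assumed on the slide
H = 0.9          # assumed hit ratio

exact = effective_access_time(H, C, S)   # H*C + (1-H)*S
approx = (1.0 - H) * S                   # the slide's approximation
```

With these numbers the exact value is about 0.109 and the approximation about 0.1; the gap is the H • C term that the approximation drops.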
The Five Minute Rule







Assume there are no special response time (real-time) requirements; the decision to
keep something in cache is, therefore, purely economic.
To make things simple, suppose that data blocks are 10 KB.
At 1995 prices, 10 KB of main memory cost about $1. Thus, we could keep the data
in main memory forever if we were willing to spend a dollar.
With 10 KB of disk costing only $.10, we could save $.90 if we kept the 10 KB on
disk.
In reality, the savings are not so great; if the disk data is accessed, it must be moved
to main memory, and that costs something. How much, then, does a disk access
cost?
A disk, along with all its supporting hardware, costs about $3,000 (in 1995) and
delivers about 30 accesses per second; the cost per access per second is, therefore,
about $100. At this rate, if the data is accessed once a second, it costs $100.10 to
store it on disk (disk storage and disk access costs). That is considerably more than
the $1 to store it in main memory.
The break-even point is about one access per 100 seconds. At that rate, the main
memory cost is about the same as the disk storage cost plus the disk access costs.
At a more frequent access rate, disk storage is more expensive. At a less frequent
rate, disk storage is cheaper. Anticipating the cheaper main memory that will result
from technology changes, this observation is called the five-minute rule rather than
the two-minute rule.
The Five Minute Rule
Keep a data item in electronic memory if it is accessed at least once
every five minutes; otherwise keep it in magnetic memory.
Similar arguments apply to objects stored on tape and cached on
disk. Given the object size, the cost of cache, the cost of secondary
memory, and the cost of accessing the object in secondary memory
once per second, the frequency at the break-even point in units of
accesses per second (a/s) is given by the following formula:

Frequency ≈ ((Cache_Cost/Byte - Secondary_Cost/Byte) • Object_Bytes) /
(Object_Access_Per_Second_Cost) a/s
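Plugging the 1995 prices from the previous slide into the break-even formula gives the "about one access per 100 seconds" figure. The sketch below assumes 10 KB = 10,000 bytes; the function and constant names are hypothetical.

```python
def break_even_frequency(cache_cost_per_byte, secondary_cost_per_byte,
                         object_bytes, access_per_second_cost):
    """Accesses per second at which cache and secondary memory cost the same."""
    saving = (cache_cost_per_byte - secondary_cost_per_byte) * object_bytes
    return saving / access_per_second_cost


BLOCK_BYTES = 10_000                   # 10 KB block (decimal units assumed)
MAIN_MEMORY_COST = 1.00 / BLOCK_BYTES  # $1 per 10 KB of main memory
DISK_COST = 0.10 / BLOCK_BYTES         # $.10 per 10 KB of disk
ACCESS_COST = 3000 / 30                # $3,000 disk at 30 accesses/second

frequency = break_even_frequency(MAIN_MEMORY_COST, DISK_COST,
                                 BLOCK_BYTES, ACCESS_COST)
seconds_between_accesses = 1.0 / frequency
```

The result is roughly 0.009 accesses per second, or one access every 111 seconds, which the slide rounds to 100 seconds.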
The Rules of Exponential Growth
Electronic memory (Moore’s Law):
MemoryChipCapacity(year) = 4^((year-1970)/3) Kb/chip, for year in [1970...2000]

Magnetic memory (Hoagland’s Law):
MagneticAreaDensity(year) = 10^((year-1970)/10) Mb/inch^2, for year in [1970...2000]

Processors (Joy’s Law):
SunMips(year) = 2^(year-1984) MIPS, for year in [1984...2000]
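To get a feel for these growth rules, they can be evaluated at the ends of their stated ranges. The sketch below is a direct transcription of the three formulas; the Python function names are hypothetical, and the results are only meaningful for the year ranges given on the slide.

```python
def memory_chip_capacity_kb(year):
    """Moore's Law as stated here: capacity quadruples every three years (Kb/chip)."""
    return 4 ** ((year - 1970) / 3)


def magnetic_area_density_mb_per_inch2(year):
    """Hoagland's Law as stated here: density grows tenfold every ten years."""
    return 10 ** ((year - 1970) / 10)


def sun_mips(year):
    """Joy's Law as stated here: processor speed doubles every year (MIPS)."""
    return 2 ** (year - 1984)
```

For example, the memory rule goes from 1 Kb/chip in 1970 to 4^10 Kb (about 1 Gb) per chip in 2000, and the processor rule from 1 MIPS in 1984 to 2^16 MIPS in 2000.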
Communication Hardware
The early 90s
The four kinds of networks are defined by their diameters. These
diameters imply certain latencies (based on the speed of light). In
1990, Ethernet (at 10 Mbps) was the dominant LAN. Metropolitan
networks typically are based on 1 Mbps public lines. Such lines are
too expensive for transcontinental links at present; most long-distance
lines are therefore 50 Kbps or less. As you will gather from the
news, these things are changing fast.

Type of Network            Diameter    Latency   Bandwidth   Send 1 KB
Cluster                    100 m       .5 µs     1 Gbps      10 µs
LAN (local area network)   1 km        5 µs      10 Mbps     1 ms
MAN (metro area network)   100 km      .5 ms     1 Mbps      10 ms
WAN (wide area network)    10,000 km   50 ms     50 Kbps     210 ms
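The time to send 1 KB in the early-90s figures is approximately the latency plus the transmission time. The sketch below makes that model explicit; it assumes 1 KB = 1,000 bytes = 8,000 bits, and the function and dictionary names are hypothetical. The slide rounds the results.

```python
def send_time_s(latency_s, size_bytes, bandwidth_bps):
    """Latency plus transmission time for a message, in seconds."""
    return latency_s + (8 * size_bytes) / bandwidth_bps


# (latency in seconds, bandwidth in bits per second), early-90s values
NETWORKS_1990 = {
    "Cluster": (0.5e-6, 1e9),
    "LAN": (5e-6, 10e6),
    "MAN": (0.5e-3, 1e6),
    "WAN": (50e-3, 50e3),
}

send_1kb = {name: send_time_s(latency, 1000, bandwidth)
            for name, (latency, bandwidth) in NETWORKS_1990.items()}
```

For the WAN this gives 0.21 s (50 ms of latency plus 160 ms of transmission); for the MAN it gives 8.5 ms, which the slide rounds to 10 ms.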
Communication Hardware
Scenario 2000

Point-to-point bandwidth likely to be common among
computers by the year 2000.

Type of Network            Diameter    Latency   Bandwidth   Send 1 KB
Cluster                    100 m       .5 µs     1 Gbps      5 µs
LAN (local area network)   1 km        5 µs      1 Gbps      10 µs
MAN (metro area network)   100 km      .5 ms     100 Mbps    .6 ms
WAN (wide area network)    10,000 km   50 ms     100 Mbps    50 ms
Processor Architectures
[Figure: processors attached to the network in three arrangements: private memories only (shared nothing); private memory plus a global memory of shared disks and tapes (shared global); and shared memory.]
Processor Architectures



Shared nothing: In a shared-nothing design, each memory is
dedicated to a single processor. All accesses to that data must pass
through that processor. Processors communicate by sending
messages to each other via the communications network.
Shared global: In a shared-global design, each processor has
some private memory not accessible to other processors. There is,
however, a pool of global memory, shared by the collection of
processors. This global memory is usually addressed in blocks
(units of a few kilobytes or more) and is RAM disk or disk.
Shared memory: In a shared-memory design, each processor has
transparent access to all memory. If multiple processors access the
data concurrently, the underlying hardware regulates the access to
the shared data and provides each processor a current view of the
data.
Address Spaces
[Figure: three processes, each with its own address space; some slots hold shared data segments and shared code segments.]
Address Spaces



Memory segmentation and sharing: A process executes in an
address space—a paged, segmented array of bytes. Some
segments may be shared with other address spaces. The sharing
may be execute-only, read-only, or read-write. Most of the segment
slots are empty (lightly shaded boxes), and most of the occupied
segments are only partially full of programs or data.
To simplify memory addressing, the virtual address space is
divided into fixed-size segment slots, and each segment partially
fills a slot.
Typical slot sizes range from 2**24 to 2**32 bytes. This gives a
two-dimensional address space, where addresses are
{segment_number, byte}. Segments, in turn, are often partitioned
into virtual memory pages, which are the unit of transfer between
main and secondary memory. If an object is bigger than a segment,
it can be mapped into consecutive segments of the address space.
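The two-dimensional {segment_number, byte} address can be illustrated with a little bit arithmetic. This is a minimal sketch assuming a slot size of 2**24 bytes (the low end of the range quoted above); the function names are hypothetical.

```python
from typing import Tuple

SLOT_BITS = 24  # assumed slot size of 2**24 bytes


def split_address(virtual_address: int, slot_bits: int = SLOT_BITS) -> Tuple[int, int]:
    """Split a flat virtual address into (segment_number, byte)."""
    segment_number = virtual_address >> slot_bits
    byte_offset = virtual_address & ((1 << slot_bits) - 1)
    return segment_number, byte_offset


def join_address(segment_number: int, byte_offset: int, slot_bits: int = SLOT_BITS) -> int:
    """Inverse of split_address."""
    return (segment_number << slot_bits) | byte_offset


def slots_needed(object_size: int, slot_bits: int = SLOT_BITS) -> int:
    """Number of consecutive segment slots an object of this size occupies."""
    slot_size = 1 << slot_bits
    return (object_size + slot_size - 1) // slot_size


print(split_address(0x03000010))
print(slots_needed(2**24 + 1))
```

An object one byte larger than a slot spills into a second consecutive segment, as the text describes.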
Processes




A process is a virtual processor. It has an address space that contains the
program the process is executing and the memory the process reads and
writes. One can imagine a process executing Java programs statement by
statement, with each statement reading and writing bytes in the address
space or sending messages to other processes.
Processes provide an ability to execute programs in parallel; they provide a
protection entity; and they provide a way of structuring computations into
independent execution streams. So they provide a form of fault
containment in case a program fails.
Processes are building blocks for transactions, but the two concepts are
orthogonal. A process can execute many different transactions over time,
and parts of a single transaction may be executed by many processes.
Each process executes on behalf of some user, or authority, and with some
priority. The authority determines what the process can do: which other
processes, devices, and files the process can address and communicate
with. The process priority determines how quickly the process’s demand for
resources will be serviced if other processes make competing demands.
Short tasks typically run with high priority, while large tasks are given lower
priority.
Protection Domains



There are two ways to provide protection:
Process = protection domain: Each subsystem executes as a
separate process with its own private address space. Applications
execute subsystem requests by switching processes, that is, by
sending a message to a process.
Address space = protection domain: A process has many
address spaces: one for each protected subsystem and one for the
application. Applications execute subsystem requests by switching
address spaces. The address space protection domain of a
subsystem is just an address space that contains some of the
caller’s segments; in addition, it contains program and data
segments belonging to the called subsystem. A process connects to
the domain by asking the subsystem or OS kernel to add the
segment to the address space. Once connected, the domain is
callable from other domains in the process by using a special
instruction or kernel call.
Protection Domains
[Figure: a single process containing four protection domains: Application, DataBase, Network, and OS Kernel.]

A process may have many protection domains.
Threads
There is a need for multiple processes per address space:



For example, to scan through a data stream, one process is
appointed the producer, which reads the data from an external
source, while the second process processes the data. Further
examples of cooperating processes are file read-ahead,
asynchronous buffer flushing, and other housekeeping chores in the
system.
Processes can share the same address space simply by having all
their address spaces point to the same segments. Most operating
systems do not make a clean distinction between address spaces
and processes. Thus a new concept, called a thread or a task, is
introduced.
But note: Several operating systems do not use the term process at
all. For example, in the Mach operating system, thread means
process, and task means address space; in MVS, task means
process, and so on.
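The producer/consumer example can be sketched with two threads that share one address space and communicate through a bounded buffer. This is an illustrative sketch only: the function names, the buffer size, and the "processing" step (doubling each item) are assumptions, not part of the slides.

```python
import queue
import threading


def producer(source, buffer):
    """Read items from the external source and hand them to the consumer."""
    for item in source:
        buffer.put(item)
    buffer.put(None)  # sentinel: no more data


def consumer(buffer, results):
    """Process items as they arrive; 'processing' here just doubles them."""
    while True:
        item = buffer.get()
        if item is None:
            break
        results.append(item * 2)


def scan(source):
    buffer = queue.Queue(maxsize=4)  # bounded, so the producer cannot run far ahead
    results = []                     # shared memory, visible to both threads
    threads = [
        threading.Thread(target=producer, args=(source, buffer)),
        threading.Thread(target=consumer, args=(buffer, results)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


print(scan([1, 2, 3]))
```

Both threads see the same `results` list because they run in the same address space; separate processes would need a message or a shared segment instead.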
Threads


The term thread often implies a second property: inexpensive to
create and dispatch. Threads are commonly provided by some
software that found the operating system processes to be too
expensive to create or dispatch. The thread software multiplexes
one big operating system process among many threads, which can
be created and dispatched hundreds of times faster than a
process.
The term thread is used in the following to connote these lightweight processes. Unless this lightweight property is intended,
“process” is used. Several threads usually share a common
address space. Typically, all the threads have the same
authorization identifier, since they are part of the same address
space domain, but they may have different scheduling priorities.
Messages and Sessions
There are two styles of communication among processes:


Datagrams: The sender of a message determines the recipient's address
(e.g. the process name) and constructs an envelope consisting of the
sender's name and address, the recipient's name and address, and the
message text. This envelope is delivered to the capable hands of the
communication system. It is analogous to sending letters by mail.
Sessions: Before any messages are sent, a fixed connection is
established between sender and receiver, a so-called session. Once it
has been established, both parties can send and receive messages via
this session. This symmetry is often referred to as "peer-to-peer".
Establishing a session requires a datagram. A session must at some point
be closed down explicitly. It is analogous to a phone conversation.
Advantages of Sessions




Shared state: A session represents shared state between the
client and the server. A datagram might go to any process with the
designated name, but a session goes to a particular instance of
that name.
Authorization: Processes do not always trust each other. The
server often checks the client’s credentials to see that the client is
authorized to perform the requested function. The authentication
protocols require multi-message exchanges. Once the session key
is established, it is shared state.
Error correction: Messages flowing in each session direction are
numbered sequentially. These sequence numbers can detect lost
messages and duplicate messages.
Performance: The operations described above are fairly costly; each
step often involves several messages. Establishing a session lets the
results be cached, so the cost is paid once per session, not per message.
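The error-correction point can be made concrete: with per-direction sequence numbers, a receiver can tell a duplicate from a gap. The class below is a minimal sketch; its name, the return strings, and the policy of rejecting out-of-order messages (rather than buffering them) are assumptions made for illustration.

```python
class SessionReceiver:
    """Receiving end of one session direction, tracking sequence numbers."""

    def __init__(self):
        self.next_expected = 0  # shared session state: next sequence number due
        self.delivered = []

    def receive(self, seq, payload):
        if seq < self.next_expected:
            return "duplicate"  # already delivered once; drop it
        if seq > self.next_expected:
            return "gap"        # an earlier message was lost or delayed
        self.delivered.append(payload)
        self.next_expected += 1
        return "ok"


receiver = SessionReceiver()
for seq, payload in [(0, "a"), (0, "a"), (2, "c"), (1, "b")]:
    print(seq, receiver.receive(seq, payload))
```

A datagram service has no such counter to consult, which is why this check needs the shared state a session provides.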
Clients and Servers




The question of how computations consisting of many interacting
processes should be structured has no simple answer. Currently, two
styles are particularly popular: peer-to-peer and client-server.
The debate about which style is "better" often creates the impression
that they are radically different. But in reality, peer-to-peer is more
general and more complex, and it subsumes client-server. Here is a
brief characterization:
Peer-to-peer: The two processes are independent peers, each
executing its computation and occasionally exchanging data with the
other.
Client-server: The two processes interact via request-reply exchanges
in which one process, the client, makes a request to a second process,
the server, which performs this request and replies to the client.
Clients and Servers


The limitation of the client-server model lies in the fact that it
implies a synchronous pattern of one request/one response.
There are, however, cases in which one request generates
thousands of replies, or where thousands of requests generate one
reply. Operations that have this property include transferring a file
between the client and server or bulk reading and writing of
databases. In other situations, a client request generates a request
to a second server, which, in turn, replies to the client. Parallelism
is a third area where simple RPC is inappropriate. Because the
client-server model postulates synchronous remote procedure
calls, the computation uses one processor at a time. However,
there is growing interest in schemes that allow many processes to
work on problems in parallel. The RPC model in its simplest form
does not allow any parallelism.
Remote Procedure Calls (RPCs)
LOCAL PROCEDURE CALL
  caller:  z = add(x,y)
  callee:  add(int x,y) { return x + y }
  caller:  receives z

REMOTE PROCEDURE CALL
  client:  z = add(x,y)
  client:  pack & send (add, x, y)
  server:  unpack & call add(int x,y) { return x + y }
  server:  pack and send (x + y)
  client:  unpack, return z
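The remote-call steps (pack and send, unpack and call, pack the result, unpack and return) can be sketched in a few lines. This is a toy illustration, not a real RPC system: JSON stands in for the wire format, a direct function call stands in for the network, and the names `remote_call`, `server_handle`, and `PROCEDURES` are hypothetical.

```python
import json


def add(x, y):
    return x + y


PROCEDURES = {"add": add}  # server-side table of callable procedures


def server_handle(request_bytes):
    """Server stub: unpack the request, call the procedure, pack the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def remote_call(proc, *args, transport=server_handle):
    """Client stub: pack and send the call, then unpack and return the reply."""
    request_bytes = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply_bytes = transport(request_bytes)  # stands in for the network round trip
    return json.loads(reply_bytes.decode("utf-8"))["result"]


z = remote_call("add", 2, 3)
print(z)
```

The caller blocks inside `remote_call` until the reply arrives, which is the synchronous one-request/one-response pattern.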
Naming


Naming has to do with the problem of how a client denotes a
server it wants to invoke. Typical naming schemes distinguish
between an object's name, its address, and its location. The
name is an abstract identifier for the object, the address is the
path to the object, and the location is where the object is.
An object can have several names. Some of these names may be
synonyms, called aliases. Let us say that Bruce and Lindsay are
two aliases for Bruce Lindsay. To make names unambiguous, all names,
addresses, and locations must be interpreted in some context,
called a directory. For example, in our RPC context, Bruce means
Bruce Nelson, and in our publishing context, Bruce means Bruce
Spatz. Within the 408 telephone area, Bruce Lindsay’s address is
927-1747, and outside the United States it is +1-408-927-1747.
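Interpreting a name relative to a context can be sketched as a two-level lookup. The entries below come from the examples in the text; the dictionary layout, the context labels, and the `resolve` function are assumptions made for illustration.

```python
# Each context (directory) maps names to what they denote in that context.
NAMES = {
    "rpc": {"Bruce": "Bruce Nelson"},
    "publishing": {"Bruce": "Bruce Spatz"},
}

# The same object has different addresses depending on where the caller is.
ADDRESSES = {
    "408 area": {"Bruce Lindsay": "927-1747"},
    "outside US": {"Bruce Lindsay": "+1-408-927-1747"},
}


def resolve(directory, context, name):
    """Interpret a name within one context; raises KeyError if unknown there."""
    return directory[context][name]


print(resolve(NAMES, "rpc", "Bruce"))
print(resolve(NAMES, "publishing", "Bruce"))
print(resolve(ADDRESSES, "outside US", "Bruce Lindsay"))
```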
Name Servers


Names are grouped into a hierarchy called the name space. An
international commission has defined a universal name space
standard, X.500, for computer systems. The commission
administers the root of that name space. Each interior node of the
hierarchy is a directory. A sequence of names delimited by a period
(.) gives a path name from the directory to the object.
No one stores the entire name space—it is too big, and it is
changing too rapidly. Certain processes, called name servers,
store parts of the name space local to their neighborhood; in
addition, they store a directory of more global name servers.
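A name server that knows its own neighborhood and refers everything else to a more global server might look like the sketch below. The class, the single-parent referral scheme, and the path names are all hypothetical; real name services add caching, replication, and more elaborate referral rules.

```python
class NameServer:
    """Stores part of the name space and knows one more global name server."""

    def __init__(self, local_names, parent=None):
        self.local_names = dict(local_names)
        self.parent = parent  # more global name server, or None at the top

    def lookup(self, path_name):
        if path_name in self.local_names:
            return self.local_names[path_name]
        if self.parent is not None:
            return self.parent.lookup(path_name)  # refer the request upward
        raise KeyError(path_name)


# Hypothetical period-delimited path names and addresses.
global_server = NameServer({"org.sales.db": "node-17"})
local_server = NameServer({"org.lab.printer": "node-3"}, parent=global_server)

print(local_server.lookup("org.lab.printer"))
print(local_server.lookup("org.sales.db"))
```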
Authentication Techniques


Passwords are the simplest technique. The client has a secret password,
a string of bytes known only to it and the server. The client sends its
password to the server to prove its identity. A second password is
then needed to authenticate the server to the client. Thus, two passwords
are required, and they must be sent across the wire.
Challenge-response uses only one password or key. In this scheme, the
client and the server share a secret encryption key. The server picks a
random number, N, and encrypts it with the key as EN. The server sends
EN to the client and challenges the client to decrypt it using the secret key.
If the client responds with N, the server believes the client knows the secret
encryption key. The client can also authenticate the server by challenging it
to decrypt a second random number. The shared secret is stored at both
ends, but random numbers are sent across the wire.
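The challenge-response exchange can be sketched as follows. The cipher here is a deliberately insecure toy (XOR with a pad derived from the key), used only so that the example runs with the standard library; it is an assumption of this sketch, and a real system would use a proper cipher. All function names are hypothetical.

```python
import hashlib
import secrets


def toy_encrypt(key, block):
    """TOY cipher, NOT secure: XOR the block with a pad derived from the key."""
    pad = hashlib.sha256(key).digest()[: len(block)]
    return bytes(a ^ b for a, b in zip(block, pad))


toy_decrypt = toy_encrypt  # XOR with the same pad undoes itself


def server_challenge(key):
    """Server picks a random number N and sends its encryption EN."""
    n = secrets.token_bytes(16)
    return n, toy_encrypt(key, n)


def client_response(key, encrypted_n):
    """Client proves it knows the key by decrypting EN back to N."""
    return toy_decrypt(key, encrypted_n)


def server_verify(n, response):
    return secrets.compare_digest(n, response)


shared_key = b"shared secret"
n, en = server_challenge(shared_key)
print(server_verify(n, client_response(shared_key, en)))
print(server_verify(n, client_response(b"wrong key", en)))
```

Only EN and N cross the wire; the key itself never does.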
Authentication Techniques

Public key system: Each authid has a pair of keys—a public encryption
key, EK, and a private decryption key, DK. The keys are chosen so that
DK(EK(X)) = X, but knowing only EK and EK(X) it is hard to compute X.
Thus, a process’s ability to compute X from EK(X) is proof that the process
knows the secret DK. Each authid publishes its public key to the world.
Anyone wanting to authenticate the process as that authid goes through the
challenge protocol: The challenger picks a random number X, encrypts it
with the authid’s public key EK, and challenges the process to compute X
from EK(X). Secrets are stored in one place only, and they do not go across
the wire.
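The property DK(EK(X)) = X and the public-key challenge can be illustrated with textbook RSA on tiny numbers. This is a sketch under loud assumptions: the slides do not name a cryptosystem, RSA is swapped in here as one well-known instance, and the primes 61 and 53 are far too small for any real use.

```python
import secrets

# Toy RSA parameters (assumed for illustration only).
MODULUS = 61 * 53      # 3233
PUBLIC_EXPONENT = 17   # plays the role of EK, published to the world
PRIVATE_EXPONENT = 2753  # plays the role of DK; 17 * 2753 = 1 (mod 3120)


def ek(x):
    """Public encryption: anyone can compute this."""
    return pow(x, PUBLIC_EXPONENT, MODULUS)


def dk(c):
    """Private decryption: only the holder of the private key can compute this."""
    return pow(c, PRIVATE_EXPONENT, MODULUS)


def challenge():
    """Challenger picks a random X and sends EK(X)."""
    x = secrets.randbelow(MODULUS - 2) + 2
    return x, ek(x)


def respond(ciphertext):
    """The process being authenticated recovers X using DK."""
    return dk(ciphertext)


x, c = challenge()
print(respond(c) == x)
```

The private exponent stays with its owner; only EK(X) and X travel between the parties.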
Scheduling
The purpose of scheduling is to make sure all requests get processed,
i.e. are assigned to a specific server process. There are basically two
additional constraints:



Short response times: The requests should not wait longer than
necessary before they get serviced.
Economic usage of resources: The required throughput should be
achieved with the minimum number of resources (processors, nodes,
links, etc.).
Throughput and response time at resource utilization r are related by the
following formula:

Average_Response_Time(r) = (1/ (1 - r)) • Service_Time
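Evaluating the formula at a few utilizations shows how sharply response time grows as r approaches 1. The function name and the choice of sample points are illustrative; the formula is the one stated above.

```python
def average_response_time(utilization, service_time=1.0):
    """Average response time at utilization r, per the formula above."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must satisfy 0 <= r < 1")
    return service_time / (1.0 - utilization)


for r in (0.1, 0.5, 0.8, 0.9, 0.95):
    print(f"r = {r:4.2f}  response time = {average_response_time(r):6.2f} x service time")
```

At 50% utilization a request takes twice its service time; at 90% it takes ten times.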
The Scheduling Problem
[Figure: Response Time vs Utilization. Vertical axis: response time in multiples of service time, 0 to 30; horizontal axis: utilization, 0 to 1.]
File Organizations
[Figure: classification tree of file organizations. A file is either unstructured or structured; the remaining node labels are direct, entry sequenced, associative, relative, key sequenced, and hash.]
SQL in a Distributed Environment
[Figure: a client application program and several SQL servers. Each node is drawn as a stack of layers: SQL (set-oriented logic), then a file system or file server (record logic; records and files), then the network (message transport).]
Software Performance
(with 10 MIPS and Ethernet)
[Figure: chart placing typical operations on two logarithmic scales, instructions (10 to 1,000,000) and microseconds (0.1 to 100,000). Operations shown, in the order they appear from the top of the chart: process create; simple database transaction; main memory transaction; null transaction; WAN rpc; random read/write disc record; random write memory record; LAN rpc; random read memory record; sequential write record; local rpc; process dispatch; sequential read record; domain switch; procedure call. Reference costs marked on the chart: 1KB on Ethernet, 1KB memory copy, WAN transmit delay, disc access.]
Protocol Standards
[Figure, top: Porting and Installation Steps. A portable program passes through the API compiler and the linker/loader to become a "local" compiled program.]
[Figure, bottom: Operation and Inter-Operation. A client process on a client machine (Unix operating system) and a server on a server machine (VMS operating system) each run a protocol machine; the two exchange FAP message formats.]
Relevant FAP-Standards







CSMA/CD, Token Ring, etc.: Low-level protocols that specify how bits are
physically transmitted across a shared medium.
TCP/IP, NetBIOS, HTTP: Transport-level protocols (HTTP is, strictly speaking, an application-level protocol carried over TCP).
LU6.2: SNA's peer-to-peer protocol that allows both session-oriented and
client-server-style communication under transaction protection.
OSI-TP: ISO's rendering of a protocol that provides functionality very
similar to LU6.2.
ASN.1: Protocol for exchanging data formatting and structuring
information. Required for RPCs in a heterogeneous environment.
DRDA: Interoperability standard for IBM SQL-systems.
ODBC, JDBC: Interoperability standards for general SQL-systems.
Relevant API-Standards






SQL: Portability standard for accessing relational databases
(lots of proprietary extensions).
APPC, CPI-C: Two of IBM's APIs for the LU6.2 protocol.
X/Open-XA, X/Open-XA+, etc.: APIs defined by the X/Open
consortium on top of ISO's OSI-TP protocols.
IDL: OMG's interface definition language to let objects be
integrated through an object request broker.
STDL: Language for programming TP applications; based on
the ACMS TP monitor.
Java: The web's favorite programming language; comes with
its own FAP component.
OSI Standards and X/Open APIs
[Figure: OSI/TP and CCR protocols. Each of two nodes has a transaction manager (TM), a communications manager (CM), and a resource manager (RM). Labels on the arrows: the application calls the TM with begin, commit, abort; the two TMs exchange prepare, commit, abort and +ack, -ack, restart; each TM sends prepare, commit, abort to its CM and RM; the CMs report "transid is leaving this node" and "new transid is arriving"; the application and the server send requests to their RMs, and remote requests flow between the application and the server through the CMs.]
A Last Glance at TP-Standards
PARTICIPANTS            PROTOCOL / API                    DEFINER
application : TM        TX                                X/Open DTP
application : RM        RM specific (e.g. SQL, Queues)    various
application : server    RPC or ROSE                       OSI + application
TM : RM                 XA                                X/Open DTP
TM : CM                 XA+                               X/Open DTP
TM : TM                 OSI-TP + CCR                      OSI
Each resource manager (RM) registers with its local transaction
manager (TM). Applications start and commit transactions by calling
their local TM. At commit, the TM invokes every participating RM. If the
transaction is distributed, the communications manager informs the
local and remote TM about the incoming or outgoing transaction, so
that the two TMs can use the OSI-TP protocol to commit the
transaction.
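The local part of this interplay (RMs register with the TM, the application begins and commits through the TM, and the TM invokes every RM at commit) can be sketched as below. This is a toy model, not the XA or TX interface: the class and method names are hypothetical, every registered RM is treated as a participant, and the prepare-then-commit ordering is assumed from the prepare/commit/abort messages named in the standards.

```python
class ResourceManager:
    """Toy RM that records the calls it receives from the TM."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []

    def prepare(self, transid):
        self.log.append(("prepare", transid))
        return self.can_commit  # vote: True = +ack, False = -ack

    def commit(self, transid):
        self.log.append(("commit", transid))

    def abort(self, transid):
        self.log.append(("abort", transid))


class TransactionManager:
    """Toy local TM: RMs register once; applications call begin and commit."""

    def __init__(self):
        self.resource_managers = []
        self.next_transid = 1

    def register(self, rm):
        self.resource_managers.append(rm)

    def begin(self):
        transid = self.next_transid
        self.next_transid += 1
        return transid

    def commit(self, transid):
        # Phase 1: ask every RM to prepare and collect the votes.
        votes = [rm.prepare(transid) for rm in self.resource_managers]
        # Phase 2: commit only if all voted yes, otherwise abort everywhere.
        if all(votes):
            for rm in self.resource_managers:
                rm.commit(transid)
            return "committed"
        for rm in self.resource_managers:
            rm.abort(transid)
        return "aborted"


tm = TransactionManager()
database, queue_rm = ResourceManager("database"), ResourceManager("queue")
tm.register(database)
tm.register(queue_rm)
print(tm.commit(tm.begin()))
```

In the distributed case, the same prepare and commit messages would also flow between the two TMs.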
Summary




Transaction processing systems comprise all parts of a system,
software and hardware.
Building such a system requires considering end-to-end arguments
at all levels of abstraction.
The performance of distributed TP systems is influenced by the
hardware architecture (what is shared), by software issues (which
protocols are used), and by configuration aspects (what limits
scalability).
The multitude of those influences gives rise to a constant dilemma:
Should one restrict the variety to a few (proprietary) components for
better tuning and performance, or should one embrace all the
standards for openness, at the risk of poor scalability and
performance?