Online transaction processing (OLTP) has been used for decades and remains the cornerstone by which most businesses do business. It involves gathering input information, processing that information, and updating data to reflect the new information. Whether a business handles worldwide commerce through the Internet or runs reservation systems for major airlines, the core of its IT environment is OLTP.
As the world grows more interconnected, OLTP is becoming more complex and has new demands on the systems that support it. In large applications, efficient OLTP can depend on transaction management software or database optimization tactics to facilitate the processing of large numbers of concurrent updates to an OLTP-oriented database. For more challenging, decentralized database systems, software can distribute transaction processing among multiple computers on a network while it is closely integrated into a service-oriented architecture (SOA) and web services. These approaches can be successful for easily partitioned workloads.
However, for more demanding workloads with interrelated data at high transaction rates, these solutions often lead to data inconsistency, scalability, and performance issues.
This IBM® Redpaper™ publication, which was written for architects, designers, system engineers, and software developers, explores a unique and innovative software solution that solves the challenges of OLTP. This solution is called IBM WebSphere® Transaction Cluster Facility.
This paper presents a case study of a large customer in the travel and tourism industry that handles airline reservations. This case study highlights the unparalleled scalability and performance characteristics of WebSphere Transaction Cluster Facility to handle workload challenges today and into the future. In addition, this paper examines the importance and power of the WebSphere Transaction Cluster Facility schema design and how it differs from traditional relational database designs. It includes analysis of the hierarchical network data model for WebSphere Transaction Cluster Facility.
IBM WebSphere Transaction Cluster Facility is designed for applications that have large, centralized databases with frequent read and update requirements, where data must be globally consistent. It is ideal for applications that are common in many industries where performance, scalability, and availability are paramount.
IBM WebSphere Transaction Cluster Facility provides a high-performance, continuously available, and highly scalable offering for large-volume OLTP. WebSphere Transaction Cluster
Facility is designed for applications that have large, centralized databases with demanding and frequent read and update requirements where data is not easily partitioned and must be globally consistent. Typical uses for WebSphere Transaction Cluster Facility include reservation, payment, and gambling systems.
WebSphere Transaction Cluster Facility is built on IBM DB2® Enterprise Server Edition to provide a highly available database solution. It also uses the IBM DB2 pureScale® feature for
DB2 Enterprise Server Edition. This feature provides superior performance and near-linear scalability for a centralized database in a clustered, distributed system. WebSphere
Transaction Cluster Facility is a distributed system-based solution that is modeled on proven
IBM OLTP database technology to provide world-class application performance, availability, and scalability. This combination of IBM middleware solutions provides the infrastructure that is needed for demanding transaction processing applications.
The WebSphere Transaction Cluster Facility database architecture and middleware are based on the industry-proven IBM z/Transaction Processing Facility Enterprise Edition (z/TPF). z/TPF has been the premier mainframe-based transaction processing solution at the core of reservation and financial systems around the world for more than 50 years. The advent of the
DB2 pureScale Coupling Facility made it possible to develop a distributed, high-performance, continuously available, and highly scalable offering (WebSphere Transaction Cluster Facility) for large-volume OLTP on IBM Power Systems™ servers.
The DB2 pureScale feature uses a high-speed InfiniBand data network to synchronize database workloads across servers in the cluster. Through this interconnection, the DB2 pureScale feature can achieve cross-server update times of just microseconds. This capability is key for WebSphere Transaction Cluster Facility to scale in a near-linear fashion as more servers are added to the cluster to support growing workloads. WebSphere Transaction
Cluster Facility is a modern solution with a rich history that addresses the most extreme demands of today’s transaction processing environments.
WebSphere Transaction Cluster Facility takes a unique approach to the challenge of a centralized, nonpartitionable database architecture. WebSphere Transaction Cluster Facility uses DB2 in a nonrelational manner to implement a customized network or hierarchical model database with a flexible, many-to-many relationship between collections of heterogeneous customer-defined records. This highly customized approach to defining a database allows for high performance data traversal, reads and updates that use hash-based indexing, minimized index collisions, and data consolidation that can minimize I/O operations. The unique approach to database architecture design used by WebSphere Transaction Cluster Facility results in a large scale, high performance database solution.
WebSphere Transaction Cluster Facility ensures data integrity by using customer-designed database metadata. This way, WebSphere Transaction Cluster Facility can automatically perform type checking, input validation, and data verification through generated code. It can ensure data consistency at all levels of the database hierarchy and throughout the customer code. WebSphere Transaction Cluster Facility maintains transaction integrity across the various members of clustered systems by storing all data on disk, not in memory.
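As an illustration only, the following Java sketch shows the general idea of metadata-driven validation; the class, fields, and methods are invented for this paper and are not the WebSphere Transaction Cluster Facility interface. A record schema declares each field's type and maximum length, and checking code, which in the product would be generated, rejects values that disagree with that metadata.

Example 1 Hypothetical sketch of metadata-driven record validation

import java.util.HashMap;
import java.util.Map;

public class RecordValidator {
    enum FieldType { STRING, NUMERIC }

    // Field name -> declared type and maximum length, as a customer
    // schema might define them.
    private final Map<String, FieldType> types = new HashMap<>();
    private final Map<String, Integer> maxLengths = new HashMap<>();

    void define(String field, FieldType type, int maxLength) {
        types.put(field, type);
        maxLengths.put(field, maxLength);
    }

    // Reject any value whose type or length disagrees with the metadata.
    void validate(String field, String value) {
        FieldType type = types.get(field);
        if (type == null)
            throw new IllegalArgumentException("Unknown field: " + field);
        if (value.length() > maxLengths.get(field))
            throw new IllegalArgumentException(field + " exceeds declared length");
        if (type == FieldType.NUMERIC && !value.matches("\\d+"))
            throw new IllegalArgumentException(field + " must be numeric");
    }

    public static void main(String[] args) {
        RecordValidator pnr = new RecordValidator();
        pnr.define("ticketNumber", FieldType.NUMERIC, 13);
        pnr.define("passengerName", FieldType.STRING, 64);
        pnr.validate("ticketNumber", "0012345678901"); // passes
        pnr.validate("passengerName", "SMITH/JOHN");   // passes
    }
}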
Because WebSphere Transaction Cluster Facility uses DB2 as a persistent data store,
WebSphere Transaction Cluster Facility users can take advantage of many features of DB2.
Some of these features include transaction recoverability, excellent reliability, availability, and serviceability (RAS), database backup and restore, advanced data caching algorithms, redundancy for high availability and failover, and disaster recovery.
The object-oriented C++ and Java interface of WebSphere Transaction Cluster Facility helps enhance application performance and programmer productivity in an IBM AIX® development environment. WebSphere Transaction Cluster Facility stores and retrieves defined data objects in the database without the costly marshalling and demarshalling of data that is typical of relational database solutions that use object-to-relational mapping. Avoiding this overhead helps achieve maximum read and update performance.
WebSphere Transaction Cluster Facility supports multitenancy, which is an architectural principle of virtualization and cloud computing where multiple customers, or tenants, can share hardware and software resources in an isolated and controlled fashion. This method helps to maximize the utilization of your physical resources. It also helps to reduce the ongoing development and maintenance costs of supporting multiple customer databases.
WebSphere Transaction Cluster Facility is equipped with and works with the following tools:
The graphical, Eclipse-based WebSphere Transaction Cluster Facility Toolkit helps with database layout and record content design. The Toolkit generates object classes for the traversal, read, and update of the data structures in the database. It simplifies code development, enhances the maintainability of the customer code, and helps improve application time to market.
WebSphere Transaction Cluster Facility provides several middleware features, such as a high performance cache manager for read-intensive data with synchronization across members in the cluster, and tools to ensure the integrity of the database.
By using IBM Tivoli® Monitoring, an agent allows specific WebSphere Transaction Cluster Facility measurements to be monitored in real time.
IBM WebSphere Application Server is an industry-leading runtime environment for Java applications that provides ease of integration for WebSphere Transaction Cluster Facility into a SOA.
The schema design in WebSphere Transaction Cluster Facility is different from a traditional relational database design. WebSphere Transaction Cluster Facility uses a hierarchical network data model that the travel and transportation industry has proven to scale to extreme I/O rates, while it maintains data consistency and integrity. With a network model, no table join operations occur when retrieving data because the design or layout of the data is optimized for the specific use cases that are required by the application. In addition to use cases that require fast retrieval of data, the network data model of WebSphere Transaction
Cluster Facility provides efficient and scalable transactions that require consistent updates to multiple logical pieces of data. WebSphere Transaction Cluster Facility can scale these update-type use cases by minimizing the number of locks and the number of I/O operations that are required to access the data.
The performance of WebSphere Transaction Cluster Facility depends on the layout or design of the data. The following sections highlight an example of a customer in the travel and
transportation industry. They illustrate how the following questions or considerations influence the layout of your data:
How many different logical groupings of data do you have?
What records are closely related? If you need to access one record, are you likely to need to access another related record that contains different information?
In what ways do you find a record? Which methods of finding a record must be optimized?
What types of indexes are needed?
When do you need to duplicate data versus just linking to it?
The case study in this paper involves a large customer in the travel and transportation industry that handles airline reservations. A reservation is the pairing of customer information with specific airline flight information. A customer can have multiple reservations, such as when a business traveler or frequent flyer has multiple trips that are booked in advance. A single reservation can contain multiple flight segments, such as for connecting flights to reach a destination, or for a round trip.
The data model for WebSphere Transaction Cluster Facility consists of a hierarchy of
cabinets, folders, and records (Figure 1) that are much like their physical counterparts.
Figure 1 Simplified data model hierarchy for WebSphere Transaction Cluster Facility
Related pieces of paper (records) are stored in folders, and folders are kept in cabinets according to an organization methodology, such as alphabetically by drawer. To find information, you locate the folder, open it, and look for the specific record that you need.
Folders can contain different types of records, and depending on the size of the records, they can sometimes all be retrieved with a single I/O. For transactions that need to update multiple pieces of information, this organization is efficient in the number of I/Os that are required and in the number of locks that need to be held to perform the update.
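The following Java sketch models this hierarchy. It is a simplified stand-in (the Cabinet, Folder, and Rec names are invented for illustration, not taken from the product interface) that shows why co-locating heterogeneous records in one folder makes a multiple-record update cheap: the folder is located once, and all of its records are then at hand.

Example 2 Hypothetical sketch of the cabinet, folder, and record hierarchy

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DataModel {
    record Rec(String type, String data) {} // e.g., type "FlightSegment"

    static class Folder {
        // Heterogeneous records stay together, so one folder retrieval
        // (and one lock) can serve an update that touches several of them.
        final List<Rec> records = new ArrayList<>();
    }

    static class Cabinet {
        // One folder per key; locating the folder is the single lookup step.
        final Map<String, Folder> folders = new HashMap<>();
        Folder folder(String key) {
            return folders.computeIfAbsent(key, k -> new Folder());
        }
    }

    public static void main(String[] args) {
        Cabinet reservation = new Cabinet();
        Folder pnr = reservation.folder("PNR-000123"); // locate the folder once
        pnr.records.add(new Rec("PassengerName", "SMITH/JOHN"));
        pnr.records.add(new Rec("FlightSegment", "Flight 117, JFK-LHR"));
        pnr.records.add(new Rec("Payment", "TICKETED"));
        System.out.println(pnr.records.size() + " records in one folder");
    }
}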
Based on the application requirements, the case-study data model entails two logical groupings of data: customer information and reservation information. The basic information in each grouping contains the following elements:
Customer information
– Name
– Address
– Phone number
– Form of ID (for example, driver’s license or passport)
– Credit card
– Frequent flyer number
Reservation information
– Name
– Flight number
– Date
– Origin and Destination
– Ticket number
– Payment
You can modify this organization for other use cases.
Logical groupings of data are typically organized in cabinets, as shown in Figure 2. This
example has two cabinets, one for customer information and one for reservation information.
Figure 2 Cabinet organization for the use case
All use cases in the study involve accessing a reservation, so we look more closely at the folders and records in the reservation cabinet. As stated earlier, a single reservation can be for multiple people, such as a family that is traveling together, and multiple flights. Therefore, you organize the information into multiple records, for example, by passenger names, flight information, ticket number, and payment. If more flights are added to the reservation, you add a record. When you retrieve a reservation, you likely need to see the passenger name, flight information, and payment status. Keeping related records in the same folder is efficient because folder contents are kept together in the database.
The next consideration is how many different reservations to place in the same folder.
Considering that you locate information by first locating the folder, if multiple reservations are placed in the same folder, you must compare each record to determine whether it is part of the reservation that you are trying to access. However, looping through records and performing comparisons is inefficient.
In this case study, a single reservation (also called a passenger name record (PNR)) is placed into a folder that contains a record for each of the following items:
Passenger name
Flight segment
Ticket number
Frequent Flyer number
ID information (for example, type of ID and ID number)
Payment information (for example, form of payment and credit card number)
Phone number
Email address
Next, you set up indexes and fast access paths for the most frequent use cases. In this case study, several tests contain different use cases. Table 1 shows these use cases and the frequencies at which they occur.
Table 1 Use-case tests

Use case                                                             Frequency
Create, query, modify, and delete a reservation by using the        35%
reservation (PNR) ID
Query or update a reservation by using a form of ID                 32%
Query or update a reservation by using the ticket number            18%
Query a reservation by using a frequent flyer number                8%
Query or update a reservation by using the flight number and date   5%
Query or update a reservation by using an agent ID and date         1%
Query or update a reservation by using a phone number               1%
To build the data model, we used mixed-message workload definitions (containing all transaction types), as typically seen in a production environment, to show how the data can be optimized.
The most frequent use case is to access a reservation by the reservation ID. In the data model, this use case was given the most direct optimization. From a logical data organization standpoint, all records that are associated with a PNR were placed in a single folder. WebSphere Transaction Cluster Facility provided a way to save the location of the
folder. When you use the physical cabinet analogy (Figure 3), finding the PNR folder is the
same as locating the sixth folder in drawer #2. You do not have to think about how the folders are organized. Instead, you go directly to the specified folder.
Figure 3 Logical data organization
Although the use case that finds reservations based on a ticket number is not the next most frequent use case, that use case provides a good example of how to set up a simple index.
In WebSphere Transaction Cluster Facility, indexes are defined by using root cabinets. The index, which in this example is based on the ticket number, is used as a search key to locate
a folder in the root cabinet. The folder contains a record that associates the search key value with a reference to the item that you want to index. This case study involves associating the ticket number with a reference to the PNR folder.
The model in Figure 4 shows a root cabinet, a ticket folder, and a ticket record.
Figure 4 Using the ticket number as an index to get the PNR
Although the data is organized this way, the application needs only to make one call, using the ticket number as an index, to get the reference to the PNR. The reference is then used to read the specific records in the PNR folder.
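A minimal sketch of this one-call pattern, again with invented names rather than actual product calls, treats the Tickets root cabinet as a keyed lookup whose value is a reference to the PNR folder.

Example 3 Hypothetical sketch of a ticket-number index in a root cabinet

import java.util.HashMap;
import java.util.Map;

public class TicketIndex {
    // The "Tickets" root cabinet: ticket number -> PNR folder reference.
    private final Map<String, String> ticketsRoot = new HashMap<>();

    // Creating the index stores one record that pairs the search key
    // with a reference to the item being indexed.
    void index(String ticketNumber, String pnrFolderRef) {
        ticketsRoot.put(ticketNumber, pnrFolderRef);
    }

    // One call with the ticket number returns the PNR folder reference;
    // the application then reads the records it needs from that folder.
    String lookup(String ticketNumber) {
        return ticketsRoot.get(ticketNumber);
    }
}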
For the use case that involves forms of ID, access is more complicated than accessing the
PNR by ticket number because the ID is more closely related to the specific customer than a specific reservation.
You can set up an index similar to ticket number, but there is a trade-off. That is, when a PNR is removed from the system, the customer information (and the index) is also removed. By separating the customer information into its own cabinet, you can reuse the customer information when you make new reservations. This consideration is significant for performance when you add new reservations because all of the customer-related index information is in place after the customer makes their first reservation.
Similar to the reservation information, the customer information is stored in multiple records within a folder. Some data duplication occurs between the customer information and what is stored in the reservation for performance reasons. When a reservation is retrieved by using ticket information, you do not need to go into the customer cabinet because the information is in the reservation. The data that is duplicated is infrequently updated. It is more expensive to update information if it is in multiple places. If the data needs to be updated frequently, it might be better to keep one copy and use references to access it from multiple places.
Pushing Data Performance with IBM WebSphere Transaction Cluster Facility 7
In Figure 5, the customer folder has a segment record (highlighted) that contains a reference
to a PNR folder.
Figure 5 Using a segment record containing a reference to a PNR
As stated earlier, a customer can have multiple reservations, and a single reservation can contain multiple flights. For these use cases, multiple segment records might exist, where two segment records can point to different PNRs (multiple reservations) or to the same PNR (one reservation that involves two flights).
With this structure for the CustomerInfo cabinet, you can now set up an index to retrieve a
PNR based on a customer’s form of ID. Similar to the ticket index, a root cabinet is defined with the form of ID as the search key, and the ID value is associated with a reference to the customer folder.
Using the form of ID as a method to retrieve the PNR involves an extra step, compared to using a ticket number. The form of ID is used as an index, but you get a reference to the customer folder, not a reference to the PNR. Therefore, the segment record must be read to
obtain the reference to the PNR, as shown in Figure 6. The path length to access
the PNR by using a form of ID is longer than when you use the ticket index. However, the path length to add the PNR to the form-of-ID-related index is similar to adding the ticket index because, when the customer exists, you need to add only the segment reference.
Figure 6 Reading a segment record to obtain the PNR reference
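The two-step path can be sketched as follows. The structure and names are assumptions made for this paper: the ID index resolves to a customer folder reference, and the folder's segment records supply the PNR references. Note that adding a reservation for an existing customer touches only the segment list, which is why the index maintenance cost stays low.

Example 4 Hypothetical sketch of the two-step form-of-ID lookup

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IdToPnr {
    // "Form of ID" root cabinet: ID value -> customer folder reference.
    private final Map<String, String> idRoot = new HashMap<>();
    // CustomerInfo cabinet: customer reference -> segment records (PNR refs).
    private final Map<String, List<String>> segments = new HashMap<>();

    void addCustomer(String idValue, String customerRef) {
        idRoot.put(idValue, customerRef);
    }

    // For an existing customer, only a segment reference is appended;
    // the ID index is already in place from the first reservation.
    void addReservation(String customerRef, String pnrRef) {
        segments.computeIfAbsent(customerRef, c -> new ArrayList<>()).add(pnrRef);
    }

    List<String> pnrsForId(String idValue) {
        String customerRef = idRoot.get(idValue);              // step 1: index read
        if (customerRef == null) return List.of();
        return segments.getOrDefault(customerRef, List.of()); // step 2: segments
    }
}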
You can set up indexes for other customer-related information, such as frequent flyer number or phone number, in a similar way to what was done here for the form-of-ID-related index.
The use cases to query or update a reservation by using an agent ID and date or by using a phone number depend on the selection of a reasonable distribution method. The folders in a root cabinet are placed into a hash table by using the search key as a hash value. Based on the hash value, the size of the hash table (maximum number of folders) and the algorithm that is used for hashing determine the distribution of the records across all of the folders in the root cabinet. Determining the best size and algorithm is a balance between space requirements and performance. If space is not an issue, you choose a distribution algorithm that can spread your data so that you can have, at most, one record per folder in the root cabinet. This approach can eliminate collisions and the need to compare the record against the search key value.
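This trade-off can be made concrete with a small, self-contained simulation. It is not product code; it only illustrates the principle from the preceding paragraph: a hash table sized close to the number of keys leaves many folders holding several records, while a larger table leaves far fewer folders with more than one record and thus avoids search-key comparisons, at the cost of space.

Example 5 Illustrative simulation of hash-table sizing for a root cabinet

import java.util.Random;

public class HashSizing {
    public static void main(String[] args) {
        int keys = 1_000_000;
        // Compare a tight table (collisions likely) with a generous one.
        for (int tableSize : new int[] { keys, 4 * keys }) {
            Random rnd = new Random(42); // same key stream for both runs
            int[] perFolder = new int[tableSize];
            int collidingKeys = 0;
            for (int i = 0; i < keys; i++) {
                // Stand-in for hashing a search key such as a phone number.
                int slot = Math.floorMod(Long.hashCode(rnd.nextLong()), tableSize);
                if (perFolder[slot]++ > 0) {
                    collidingKeys++; // a second (or later) key in this folder
                }
            }
            System.out.printf("table size %d: %d of %d keys share a folder%n",
                    tableSize, collidingKeys, keys);
        }
    }
}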
For a large database, such as the one in this case study with the number of PNRs defined to be 100 million, an enormous amount of space would be consumed for each index. Consider the Agent ID example. The case study defined the number of agents to be 100,000, with reservations kept in the system for 365 days. Using the Agent ID and date as the search key would result in 36.5 million unique values. Because the frequency of the Agent ID-based query or update is only 1%, the improved performance was not worth the space that is needed for the large number of entries that are required.
The alternative solution is to create a multilevel index structure as shown in Figure 7.
Figure 7 Using a multilevel index structure
It takes longer to access the data by using multiple indexes, but this method requires less space. The first-level index is used to find data that is associated with a particular agent based on the agent’s ID. The second-level index is set up by date. Date folders are created only when an agent creates a reservation on that particular date, which minimizes the number of folders that are created as part of the second-level index.
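A sketch of this two-level structure, with names invented for illustration rather than taken from the product, might look like the following. The date level is created lazily, mirroring the point above that a date folder exists for an agent only when that agent created a reservation on that date.

Example 6 Hypothetical sketch of the two-level agent ID and date index

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AgentDateIndex {
    // Level 1: agent ID -> level 2 index; level 2: date -> PNR references.
    private final Map<String, Map<String, List<String>>> byAgent = new HashMap<>();

    // Date entries are created only when needed, keeping level 2 sparse.
    void add(String agentId, String date, String pnrRef) {
        byAgent.computeIfAbsent(agentId, a -> new HashMap<>())
               .computeIfAbsent(date, d -> new ArrayList<>())
               .add(pnrRef);
    }

    // Two index reads (agent, then date) instead of one large flat index.
    List<String> find(String agentId, String date) {
        return byAgent.getOrDefault(agentId, Map.of())
                      .getOrDefault(date, List.of());
    }
}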
The last use case involves accessing the PNR by using a flight number and date. The number of flights or the size of the data is not a factor as it was for the use case with the Agent ID and date as input. Instead, it is more a consideration of the data to use as a hash value. Not all data is ideal as a hash value.
WebSphere Transaction Cluster Facility offers the DirectOrdinalSpecification option, where access to folders is based on an ordinal number rather than a search key value. By using the
DirectOrdinalSpecification option, you can create your own method for finding and placing data into root cabinet folders. It works well for a fixed or managed set of data values (such as flight numbers and dates) that can be mapped easily to an ordinal number.
In this use case (Figure 8), each combination of flight number and date is
assigned to an ordinal number in a table. To access the PNR by using a flight number and date, a binary search is done on the table to find the ordinal number that is assigned to the flight and date. The ordinal number is then used to access the flight record that contains references to all of the PNRs that are associated with that flight.
Figure 8 The DirectOrdinalSpecification option in WebSphere Transaction Cluster Facility
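The ordinal technique can be sketched as follows. The class and table contents are assumptions made for this paper; the point is that a sorted table of flight-and-date keys is binary searched once, and the resulting ordinal addresses the flight record directly, with no hashing involved.

Example 7 Hypothetical sketch of ordinal-based access for flight and date

import java.util.Arrays;

public class FlightOrdinals {
    // Sorted keys; a key's position in this array is its ordinal number.
    private final String[] sortedKeys;

    FlightOrdinals(String[] flightDateKeys) {
        sortedKeys = flightDateKeys.clone();
        Arrays.sort(sortedKeys);
    }

    // Binary search the table; the ordinal then selects the folder directly.
    int ordinalOf(String flightNumber, String date) {
        return Arrays.binarySearch(sortedKeys, flightNumber + "/" + date);
    }

    public static void main(String[] args) {
        FlightOrdinals table = new FlightOrdinals(new String[] {
            "117/2013-03-01", "117/2013-03-02", "930/2013-03-01" });
        // Ordinals 0..2 index straight into the flight-record folders.
        System.out.println(table.ordinalOf("117", "2013-03-02")); // prints 1
    }
}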
When you make updates to WebSphere Transaction Cluster Facility data, by adding new records or by modifying existing ones, WebSphere Transaction Cluster Facility holds locks on the folders where updates are being made. For example, if an update is being made to information in a PNR, WebSphere Transaction Cluster Facility locks that specific PNR folder.
In the example data model, each PNR is in its own folder, so that there is no contention when you update the PNR. However, updating information in a PNR that is used as an index involves more than just updating the PNR. For example, when you update a ticket number, the
PNR and the index records that use the ticket number need to be updated. After the PNR is updated, a new index is created based on the new ticket number. After the new ticket number index is created, the old index can be removed.
When processing indexes, WebSphere Transaction Cluster Facility minimizes lock contention by holding the lock on the folder that contains the new ticket number only long enough to create the index. Yet a large amount of contention can still occur if the data is distributed poorly, such as when many ticket numbers go to the same folder. In this case, updating any ticket number in the folder requires the folder lock to be held for a longer time. It also increases the likelihood that another transaction needs that same folder lock and must wait (lock contention), which impacts overall system performance. By mapping the ticket numbers to different folders, updates to different ticket numbers can proceed with minimal lock contention, resulting in better overall system performance.
The WebSphere Transaction Cluster Facility data model uses a hierarchy of cabinets, folders,
and records, as shown in Figure 9. Logical groupings of data form cabinets that hold folders.
Data is stored in records, and related records are placed into the same folder for efficient retrieval and update. Indexes are defined by using root cabinets that provide fast access to folders. The index-based distribution method that is used with root cabinets is an important factor to minimize contention for updates and to ensure performance for retrieval.
Figure 9 Complete WebSphere Transaction Cluster Facility data model hierarchy
The performance tests involve the same large customer in the travel and tourism industry that handles airline reservations. All information that is related to a reservation is kept in a PNR document. A reservation can contain multiple passengers and multiple flight segments. Some transactions work with a single PNR, such as creating a reservation or updating an existing reservation. Other transactions involve multiple PNRs, such as showing all reservations for a customer or generating a passenger manifest for a flight.
An agent or customer can locate a PNR in multiple ways. The query PNR transaction allows the following forms of input, among other forms:
PNR record locator (also referred to as the PNR ID)
Ticket number
Form of ID (driver’s license, passport, or other government issued ID)
Credit card number and type
Phone number
Email address
Passenger name, flight number, and date
Passenger name, departure city, and date
Similarly, the update PNR transaction can have multiple variations such as the following examples:
Adding or changing a phone number
Updating nonindexed data such as a PNR history record
Adding a flight segment
Canceling an existing flight for one or more passengers
Changing a flight (adding a new flight and canceling an old flight)
Creating or updating a reservation can require updating multiple database records (changes must be persisted to a storage device). These records must be updated in the same transaction scope, including the PNR, the customer database records for each passenger, and the inventory and manifest records for each flight.
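Because WebSphere Transaction Cluster Facility persists its data in DB2, the single transaction scope can be illustrated with standard JDBC. The table and column names in this sketch are invented, and the actual WebSphere Transaction Cluster Facility object interface is not shown here; the sketch conveys only that the PNR, customer, and flight inventory changes commit together or not at all.

Example 8 Illustrative JDBC sketch of one transaction scope for a reservation update

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class AddSegmentTxn {
    static void addFlightSegment(Connection con, String pnrId,
            String customerId, String flightKey) throws SQLException {
        con.setAutoCommit(false); // open a single transaction scope
        try (PreparedStatement pnr = con.prepareStatement(
                 "INSERT INTO pnr_segment (pnr_id, flight_key) VALUES (?, ?)");
             PreparedStatement cust = con.prepareStatement(
                 "INSERT INTO customer_segment (customer_id, pnr_id) VALUES (?, ?)");
             PreparedStatement inv = con.prepareStatement(
                 "UPDATE flight_inventory SET seats_left = seats_left - 1"
                 + " WHERE flight_key = ?")) {
            pnr.setString(1, pnrId);
            pnr.setString(2, flightKey);
            pnr.executeUpdate();
            cust.setString(1, customerId);
            cust.setString(2, pnrId);
            cust.executeUpdate();
            inv.setString(1, flightKey);
            inv.executeUpdate();
            con.commit();   // all three records persist together...
        } catch (SQLException e) {
            con.rollback(); // ...or none of them do
            throw e;
        }
    }
}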
The test environment consisted of five WebSphere Transaction Cluster Facility server nodes, each running on its own 20-core IBM POWER7® (P780) box. The WebSphere Transaction
Cluster Facility database contained 100 million PNRs, 50 million unique customers, and
1 million flights. A client load generator generated transaction request messages and sent them across the network to the server. The client load generator was cluster unaware. That is, any transaction for any PNR could be sent to any of the WebSphere Transaction Cluster
Facility server boxes. The environment was as close to a real production system as possible, including high availability (HA) of the server and database layers. It was also a fully functional, indexed database that could handle multiple transaction types.
The following terminology is used in the test descriptions:
Response time The amount of time that it takes WebSphere Transaction Cluster Facility to process a transaction. This is the time between when WebSphere
Transaction Cluster Facility receives a request message from the network and when WebSphere Transaction Cluster Facility sends the response to that message.
Full message mix A workload in which the ratio of PNR query, update, create, and delete actions is 100:10:1:1. In a full message mix, for every 112 transactions, 100 transactions query a PNR, 10 transactions update a PNR, one transaction creates a PNR, and one transaction deletes a
PNR.
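As an illustration only (the enum and method are invented for this paper), a load generator can realize this ratio by drawing uniformly from the 112 transactions of one mix cycle:

Example 9 Illustrative generator for the 100:10:1:1 full message mix

import java.util.Arrays;
import java.util.Random;

public class MessageMix {
    enum Txn { QUERY, UPDATE, CREATE, DELETE }

    static Txn next(Random rnd) {
        int r = rnd.nextInt(112);       // 112 transactions per mix cycle
        if (r < 100) return Txn.QUERY;  // 100 of 112 query a PNR
        if (r < 110) return Txn.UPDATE; // 10 of 112 update a PNR
        if (r < 111) return Txn.CREATE; // 1 of 112 creates a PNR
        return Txn.DELETE;              // 1 of 112 deletes a PNR
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int[] counts = new int[Txn.values().length];
        for (int i = 0; i < 112_000; i++) {
            counts[next(rnd).ordinal()]++;
        }
        // Prints roughly [100000, 10000, 1000, 1000]
        System.out.println(Arrays.toString(counts));
    }
}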
Test results: The results of the tests are specific to the test environment, test data, network, and other variables for this paper. These results will vary depending on your specific circumstances.
The goal of this test was to show that WebSphere Transaction Cluster Facility can handle high query request volumes, because querying a PNR is the most frequent type of transaction. In the test environment, query PNR-type transactions were sent to WebSphere Transaction
Cluster Facility at a rate of 100,000 transactions per second. At this sustained rate, the average response time remained constant and was less than 3 milliseconds.
The goal of this test was to show that WebSphere Transaction Cluster Facility can handle high database update rates. In the test environment, update PNR-type transactions were sent to
WebSphere Transaction Cluster Facility at a rate of 18,000 transactions per second. At this sustained rate, the average response time remained constant and was less than 18 milliseconds.
The goal of this test was to show that, as message rates increase, WebSphere Transaction Cluster Facility can be scaled by adding more nodes to the cluster. The test began with one active WebSphere Transaction Cluster Facility node, with traffic sent by using the full message mix at increasing rates until the maximum capacity of the node was reached. This message rate is called N. A second node was then activated, and traffic was increased until the maximum capacity of both nodes was reached. This message rate is called 2N. This procedure was repeated by adding third, fourth, and fifth nodes. Message rate 5N represents the message rate with all five nodes active.
Throughput scaled in a near-linear fashion as nodes were added. For example, the throughput of two nodes (2N) was 1.99 times the throughput of one node (N). The throughput of five nodes (5N) was 4.97 times the throughput of one node.
Figure 10 illustrates the throughput as nodes were added.
Figure 10 Throughput as extra nodes are added
The goal of this test was to show that no increase in response time occurs as the size of
WebSphere Transaction Cluster Facility database increases. The first part of the test involved running the full message mix at a certain message rate against a smaller database that contained 100,000 PNRs. The second part involved running the full message mix at the same message rate against a larger database that contained 100 million PNRs. Even though the database was 1,000 times larger, the average response time in the second test was identical to the average time in the first test.
The goal of this test was to show that, even when you process a heavy workload, WebSphere
Transaction Cluster Facility can handle large increases or spikes in message rates. The initial setup involved the full message mix at a rate of 21,000 transactions per second, with the system running steady at that rate for a predefined period.
Next, the message rate was increased to a steady 31,000 transactions per second, and WebSphere Transaction Cluster Facility immediately reacted to the increase in traffic. At time T, WebSphere Transaction Cluster Facility was processing 21,000 transactions per second. At time T+1, WebSphere Transaction Cluster Facility was processing 31,000 transactions per second and remained at that rate during the test. The average response time remained constant from time T until the end of the test.
Figure 11 shows the transaction rates and average response times during the test.
Figure 11 Rates and response times during bursts of transactions
The goal of this test was to show that WebSphere Transaction Cluster Facility can process high transaction workloads consistently for long periods of time. The full message mix was run at 50,000 transactions per second for 18 hours. The average response time was less than
5 milliseconds and remained consistent throughout the test.
The tests that are described in the previous sections were for a large volume PNR workload.
Yet these tests demonstrated only a small fraction of the full capabilities of WebSphere
Transaction Cluster Facility. WebSphere Transaction Cluster Facility can scale vertically by adding more cores to a node. It can scale horizontally by adding more nodes to the cluster.
The full message mix PNR workload in this test environment showed a result of 50,000 transactions per second across five WebSphere Transaction Cluster Facility nodes, each with
20 cores and running at only 75% of processor capacity. Considering the observed scalability characteristics of WebSphere Transaction Cluster Facility, a projection was made for this test environment. The projection is that 128 nodes, with 80 cores each (plus sufficient DASD, disk control units, and network bandwidth), can process more than 4 million transactions per second in the full message mix PNR workload.
The demands of OLTP will continue to challenge existing database solutions. IBM
WebSphere Transaction Cluster Facility provides a continuously available, high-performance, scalable solution for large-volume online transaction processing. The DB2 pureScale feature enables WebSphere Transaction Cluster Facility to synchronize database workloads across the servers in the cluster. The DB2 pureScale feature allows WebSphere Transaction Cluster
Facility to achieve cross-server update times of microseconds, which is key to the product’s ability to scale in a near-linear fashion to support expanding workloads. Data integrity is ensured through metadata that automatically performs data type checking, and transaction integrity is maintained across clustered systems by storing all data on disk, not in memory.
For workloads where the data cannot be partitioned, the powerful WebSphere Transaction
Cluster Facility networked data model provides a number of advantages over relational technologies that rely on partitioning. The unique benefit of this data model lies in its ability to efficiently handle transactions that require consistent updates to multiple logical pieces of data. WebSphere Transaction Cluster Facility is able to scale these update-type use cases by minimizing the number of locks and the number of I/Os required for data operations.
WebSphere Transaction Cluster Facility uses a highly customized, networked data model that defines the relationships between collections of heterogeneous customer-defined records.
This database definition approach allows for high performance data traversal, reads, and updates that minimize index collisions and consolidate data for minimal I/O operations.
The airline case study demonstrates that WebSphere Transaction Cluster Facility is a viable option for a large, high-volume online transaction system. WebSphere Transaction Cluster
Facility successfully met and exceeded the various requirements, showcasing the characteristics of scalability, reliability, consistency, and stability that are required in any mission-critical system. WebSphere Transaction Cluster Facility provided a single view of a database that contains 100 million PNRs from a multinode complex. Projections also indicate that it can process more than 4 million transactions per second in a larger environment. This case study proved that
WebSphere Transaction Cluster Facility can handle enormous workloads, such as those of an airline, both today and for many years to come. Explore how this distinctive technology can help you achieve your short-term and long-term transaction processing needs.
For more information about WebSphere Transaction Cluster Facility, go to: http://www.ibm.com/software/webservers/wtcf
This paper was produced by a team of specialists from around the world at the International
Technical Support Organization (ITSO), Raleigh Center.
Robert Blackburn is a Senior Statistician with IBM in Poughkeepsie, New York (NY). He has more than 30 years of experience in the field of performance analysis. His specialties include probability, queueing theory, and applied statistics. Robert is the performance analyst for the z/TPF and WebSphere Transaction Cluster Facility products and consults with z/TPF customers on matters of performance and design. He also works with the IBM hardware design teams to enable optimum performance in processor and storage systems. Robert has a doctorate degree in mathematics from The Graduate Center at The City University of New York.
Jonathan Collins is a Product Line Manager for the WebSphere Transaction Cluster Facility,
Airline Control System, and Transaction Processing Facility products in the IBM Software Group.
Jonathan handles financial management, acquisitions, pricing, marketing, new business opportunities, partner enablement, vendor relationships, strategy, and product direction. He ensures that the EPS Industries portfolio of products meets the transaction processing needs of
C-level and LOB executives. His background includes software development and management in several fields, and he has written extensively about the value of IBM transaction processing products. Jonathan has a Bachelor of Arts degree from Union College with a major in English and minor in Computer Science.
Jamie Farmer is a Senior Software Engineer at IBM in Poughkeepsie, NY. Jamie is a lead developer for two leading high volume transaction processing platforms: z/TPF and IBM
WebSphere Transaction Cluster Facility. He has over 15 years of experience in designing and developing components of these platforms, including network communications, cryptography and encryption, database management, and enterprise integration. Jamie has a Bachelor of
Science degree in computer science from Ithaca College and a master’s degree in computer science from Marist College.
Mark Gambino is a Senior Technical Staff Member at IBM in Poughkeepsie, NY. He is a senior architect for the Transaction Processing Facility and WebSphere Transaction Cluster
Facility high-end transaction processing product lines. Mark has over 24 years of experience in designing, developing, and testing numerous components of these platforms.
Colette Manoni is a Senior Technical Staff Member and Architect for the Transaction
Processing Facility family of products in IBM Software Group. With over 28 years of experience in developing large-scale transaction systems, Colette is responsible for the requirements, functional designs, and overall product solutions. She is also a Master Inventor and holds over 20 patents in the US, Japan, China, and several other countries. Colette has a
Bachelor of Science degree from Rensselaer Polytechnic Institute.
Joshua Wisniewski is a Senior Software Engineer at IBM in Poughkeepsie, NY. He has over
12 years of experience in designing, developing, and maintaining various features for the high-volume transaction processing platforms of z/TPF and IBM WebSphere Transaction
Cluster Facility. He was a technical lead, customer liaison, developer, architect, and project manager for z/TPF debugger technologies, various z/TPF tools, and WebSphere Transaction
Cluster Facility applications development. Joshua has a Bachelor of Science degree from
Clarkson University with a major in computer engineering.
Here's an opportunity to spotlight your skills, grow your career, and become a published author—all at the same time! Join an ITSO residency project and help write a book in your area of expertise, while honing your experience using leading-edge technologies. Your efforts will help to increase product acceptance and customer satisfaction, as you expand your network of technical contacts and relationships. Residencies run from two to six weeks in length, and you can participate either in person or as a remote resident working from your home base.
Learn more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Find us on Facebook: http://www.facebook.com/IBMRedbooks
Follow us on Twitter: http://twitter.com/ibmredbooks
Look for us on LinkedIn: http://www.linkedin.com/groups?home=&gid=2130806
Explore new IBM Redbooks® publications, residencies, and workshops with the IBM
Redbooks weekly newsletter: https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
Stay current on recent Redbooks publications with RSS Feeds: http://www.redbooks.ibm.com/rss.html
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
© Copyright International Business Machines Corporation 2013. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by
GSA ADP Schedule Contract with IBM Corp.
This document REDP-4910-00 was created or updated on March 13, 2013.
Send us your comments in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to: redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400 U.S.A.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines
Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX®
DB2®
IBM®
Power Systems™
POWER7®
pureScale®
Redbooks®
Redpaper™
Redbooks (logo)
Tivoli®
WebSphere®
The following terms are trademarks of other companies:
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Other company, product, or service names may be trademarks or service marks of others.