Network Software and Systems Lab
Electrical Engineering Faculty, Technion
Spring 2011

Final Report
Submitted By: Omer Kiselov, Ofer Kiselov
Supervised By: Dmitri Perelman
Project in Software

To my brother, who
needed this for his
interview. We wish
him good luck.

INDEX
ABSTRACT ................................................................................................. 5
1. INTRODUCTION ..................................................................................... 6
1.1 Parallel Computing ............................................................................... 6
1.2 Locks and the problems they present .................................................. 9
2. BACKGROUND ....................................................................................... 11
2.1 Software Transactional Memory Abstraction ...................................... 11
2.2 STM Implementation example – Transactional Locking Overview ..... 14
2.3 Aborts in STM ....................................................................................... 18
2.4 Unnecessary Aborts in STM ................................................................. 19
2.4.1 What Are They? ................................................................................ 19
2.4.2 Why do they happen? ....................................................................... 20
2.4.3 How do we detect them? .................................................................. 22
2.4.4 Example: Aborts in TL2 .................................................................... 23
3. PROJECT GOAL ..................................................................................... 28
4. IMPLEMENTATION ................................................................................. 29
4.1 Overview ............................................................................................... 29
4.2 The log file and parser ......................................................................... 29
4.3 The offline analysis .............................................................................. 33
4.4 The online logging ................................................................................ 37
5. EVALUATION .......................................................................................... 54
5.1 Hardware .............................................................................................. 54
5.2 Deuce Framework ................................................................................ 54
5.4 Benchmarks .......................................................................................... 56
5.4.1 AVL test bench .................................................................................. 56
5.4.2 Vacation test bench .......................................................................... 57
5.4.3 SSCA2 test bench ............................................................................. 58
5.5 Results .................................................................................................. 59
6. CONCLUSION AND SUMMARY ............................................................ 65
7. ACKNOWLEDGEMENTS ........................................................................ 67
8. REFERENCES ........................................................................................ 68
9. INDEX A – CLASS LIST ......................................................................... 70
Figures List
Figure 1 - Serial Software ................................................................................ 7
Figure 2 - Parallel Programming Schematic .................................................... 8
Figure 3 - TL2 Tests vs. different algorithms.................................................. 18
Figure 4 - Aborts ............................................................................................ 20
Figure 5 - Example Of Unnecessary Aborts................................................... 21
Figure 6 - Precedence Graph ........................................................................ 22
Figure 7 - Structure of the classes representing the log ................................ 32
Figure 8 - A schematic description of the offline analysis part ....................... 34
Figure 9 - Another View of the Offline Design ................................................ 37
Figure 10 - A schematic description of the online logging part ...................... 38
Figure 11 - Online Part version 1 ................................................................... 41
Figure 12 - Online Part Final Version ............................................................. 43
Figure 13 - Deuce Method application ........................................................... 54
Figure 14 - Deuce Context for TM algorithms ................................................ 55
Figure 15 - Comparison between Deuce and similar methods for running TM ... 56
Figure 16 - AVL Benchmark results – commit ratio, aborts percentage ......... 59
Figure 17 - SSCA2 Benchmark results – commit ratio, aborts percentage .... 60
Figure 18 - Vacation Benchmark results – commit ratio, aborts percentage ... 60
Figure 19 – AVL Benchmark results - Analysis of Aborts by type .................. 61
Figure 20 - SSCA2 Benchmark results - Analysis of Aborts by type .............. 61
Figure 21 – Vacation Benchmark results - Analysis of Aborts by type ........... 62
Figure 22 - AVL Benchmark results - Categorizing by type and amounts ...... 63
Figure 23 - SSCA2 Benchmark results - Categorizing by type and amounts . 63
Figure 24 – Vacation Benchmark results - Categorizing by type and amounts ... 64
Source Code List
Source Code 1 - Log Formation..................................................................... 32
Source Code 2 - Logger Interface .................................................. 39
Source Code 3 - Background Collector Run method ..................... 42
Source Code 4 - QueueManager C'tor .......................................... 44
Source Code 5 - Add Queue Method ............................................ 44
Source Code 6 - getNextLine & traverseQueues methods ........... 46
Source Code 7 - Addition to the TL2 context ................................. 47
Source Code 8 - Change in context C'tor ...................................... 47
Source Code 9 - Changes in TL2 Context init() .............................. 48
Source Code 10 - The new exceptions in LockTable ..................... 50
Source Code 11 - Exception throw instance .................................. 50
Source Code 12 - Change in Commit method ............................... 51
Source Code 13 - Changes in the onReadAccess method ............ 52
Source Code 14 - Changes in beforeReadAccess method ............ 52
Source Code 15 - Change in Context objectid ............................... 53
Abstract
Aborts in Software Transactional Memory are a blow to a program's
performance – the aborted transaction must perform its action again, while
previous efforts are wasted, and CPU utilization decreases. Aborts in STM
algorithms are caused by many factors, but not all of them are necessary to
maintain correctness: sometimes, a transaction may abort only because
validating its correctness would be too complex.
In this project, our goals were to measure the amounts of aborts and
unnecessary aborts, and their impact on performance over a selected STM
algorithm. First, we formulated a log file structure and built an offline "Abort
Analyzer", which is able to read the log and determine whether each abort was
necessary. Afterwards, we modified an existing STM environment so that it
reports every transactional action it performs (e.g. reads, writes, transaction
starts, commits and aborts) to a log file matching the Abort Analyzer.
1. Introduction
1.1 Parallel Computing
Over the years, computer performance has risen as a result of
technology improvements, and computer architectures have
benefited from these improvements by adapting to and embracing
them. These improvements also bring problems, due to power
dissipation and scaling boundaries; on the other hand, they make it
possible to place multiple circuits on a smaller area of die.
Parallel computing is a form of computer architecture and design in
which calculations can be carried out simultaneously. The key
principle is that large size problems can be divided into smaller
ones which can be solved concurrently. As power consumption
(and consequently heat generation) by computers has become a
concern in recent years, parallel computing has become the
dominant paradigm in computer architecture, mainly in the form
of multicore processors.
VLSI technology allows larger and larger numbers of
components to fit on a chip and clock rates to increase. Parallel
computer architecture translates the potential of the
technology into greater performance and expanded capability of
the computer system. The leading characteristic is parallelism: a larger
volume of resources means that more operations can be done at
once, in parallel. Parallel computer architecture is about organizing
these resources so that they work well together. Computers of all
types have harnessed parallelism more and more effectively to
gain performance from the raw technology, and the level at which
parallelism is exploited continues to rise. The other key characteristic is
storage: the data that is operated on at an ever faster rate must be
held somewhere in the machine.
Traditionally, computer software has been written for serial
computation. To solve a problem, an algorithm is constructed and
implemented as a serial stream of instructions. These instructions
are executed on a central processing unit on one computer. Only
one instruction may execute at a time—after that instruction is
finished, the next is executed.
Figure 1 - Serial Software
In the simplest sense, parallel computing is the simultaneous use
of multiple compute resources to solve a computational problem:
• The problem is run using multiple CPUs.
• The problem is broken into discrete parts that can be solved concurrently.
• Each part is further broken down to a series of instructions.
• Instructions from each part execute simultaneously on different CPUs.
Figure 2 - Parallel Programming Schematic
Historically, parallel computing has been considered to be "the high
end of computing", and has been used to model difficult scientific
and engineering problems found in the real world. Some examples
are: Physics - applied, nuclear, particle, condensed matter, high
pressure, fusion, photonics, Bioscience, Biotechnology, Genetics,
Chemistry, Molecular Sciences, Geology, Seismology, Mechanical
Engineering - from prosthetics to spacecraft, Electrical Engineering,
Circuit Design, Microelectronics, Computer Science, Mathematics,
and graphics.
There are two directions for achieving parallelism in computing.
The first is through hardware: building mechanisms that take a set
of commands, find their dependencies, and execute the independent
commands in parallel.
Most of these hardware mechanisms are built in VLSI, use
advanced caching algorithms, and consist of high-end
architecture and micro-architecture changes from the norm (x86 and
the like). We will not deal with these possible changes.
Instead, we will deal with software parallelism methods. There are
several ways to execute parallel computing above the architecture
layer. The most common way to perform parallel computing is
concurrent programming using locks.
1.2 Locks and the problems they present
A lock is a synchronization mechanism for enforcing limits on
access to a resource in an environment where there are
many threads of execution. Locks are one way of
enforcing concurrency control policies.
Locks are an effective method for obtaining a certain degree of
concurrency, but they have many problems.
The first is the overhead of locking. The locking mechanism itself
consumes resources: the memory allocated for the locks, the CPU
time needed to initialize and destroy them (they are usually built
from an ADT called a semaphore), and of course the time each
thread spends acquiring and releasing the locks.
Naturally, these are problems one will have with any concurrency
mechanism, but in this case the overhead limits our utilization and
our use of the mechanism. Thus we are forced to use larger locks
or less locking, which in turn has its own issues.
Second, there is the problem of contention. If a lock is held by
one thread and another thread (or two) wants to obtain it, it must
stand in "line". This problem has a direct link to the granularity
of the locks: the more granular the locks are, the lower the
contention is. Granularity is the amount of data locked
within each lock – a lock can cover an entire matrix of memory
data or a single cell, perhaps a row or a column. As we can
immediately conclude, granularity has a direct link to the level of
contention and a trade-off with the overhead.
The direct result of such a choice is that we can have a few locks
locking a lot of data, which hurts the concurrency and
performance of the machine but has much less overhead, or we
can have a large amount of locks locking less data – but again we
get the overhead.
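As a small, hypothetical illustration of this trade-off (the example and its names are ours, not taken from the project code), the sketch below contrasts one coarse lock guarding a whole array with one lock per cell:

import java.util.concurrent.locks.ReentrantLock;

public class GranularitySketch {
    private final int[] data = new int[1024];

    // Coarse-grained: low overhead (a single lock), but every writer contends on it.
    private final ReentrantLock coarseLock = new ReentrantLock();

    public void coarseIncrement(int i) {
        coarseLock.lock();
        try { data[i]++; } finally { coarseLock.unlock(); }
    }

    // Fine-grained: writers to different cells never contend,
    // but we pay the memory and initialization cost of 1024 locks.
    private final ReentrantLock[] cellLocks = new ReentrantLock[1024];
    { for (int j = 0; j < cellLocks.length; j++) cellLocks[j] = new ReentrantLock(); }

    public void fineIncrement(int i) {
        cellLocks[i].lock();
        try { data[i]++; } finally { cellLocks[i].unlock(); }
    }
}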
Last, we have the problem of deadlocks and livelocks. In both cases
we have two threads, each holding a lock that protects the
information the other thread needs. Thus each of them waits for the
other to release its lock, no progress is made, and the threads are
stuck. The difference between deadlocks and livelocks is that in a
livelock some sort of action is still running, and some reading and
writing to memory still proceeds, in an endless loop.
This introduction was a brief summary of the subject we are here to
discuss. We will talk about a different concurrency mechanism, one with
high granularity and low overhead, that guarantees
atomicity and smart execution with no deadlocks.
This mechanism is called transactional memory.
We will not discuss hardware transactional memory, but rather
software transactional memory. That is because we want to
minimize the cost of the changes to existing systems: existing
microprocessors could be adapted to transactional memory, but
only with great difficulty in back-fitting, and it would probably not fit
existing or earlier systems.
2. Background
2.1 Software Transactional Memory Abstraction
Software transactional memory (STM) is a scheme for concurrent
programming with multiple threads that uses transactions similar to
those used in databases.
At its heart, a transaction is simply a way of performing a group of
operations in such a way that they appear to happen atomically, all at a
single instant. The intermediate states cannot be read or updated by
any other transaction. This enables you to keep the data consistent with
its constraints at all times - one operation can violate a constraint, but
both before and after each transaction the constraints hold.
In typical implementations of transactions, you have the option of
either committing or aborting a transaction at any time. When you
commit a transaction, you agree that the transaction went as planned
and make its changes permanent in the database. If you abort a
transaction, on the other hand, this means that something went wrong
and you want to roll back, or reverse, any partial changes it has already
made. Good installers use a similar rollback scheme: if some
installation step fails and recovery is not possible, the installer erases all the
cruft it has already put on your machine so that it is just like it was before.
If it fails to do this, the machine may become unusable or future
installation attempts may fail.
If we assume no faults happen, the way to ensure the atomicity of the
operations is usually based on locking or acquiring exclusive
ownership of the memory locations accessed by an operation. If a
transaction cannot capture an ownership it fails, and releases the
ownerships already acquired. To guarantee liveness one must first
eliminate deadlock, which for static transactions is done by acquiring
the ownerships needed in some increasing order. In order to continue
ensuring liveness in a situation where faults happen, we must make
certain that every transaction completes even if the process which
executes it has been delayed, swapped out, or crashed. This is
achieved by a "helping" methodology, forcing other transactions which
are trying to capture the same location to help the owner of this location
to complete its own transaction. The key feature in the transactional
approach is that in order to free a location one need only help its single
owner transaction. Moreover, one can effectively avoid the overhead of
coordination among several transactions attempting to help release a
location by employing a "reactive" helping policy.
The benefit of this optimistic approach is increased concurrency: no
thread needs to wait for access to a resource, and different threads can
safely and simultaneously modify disjoint parts of a data structure that
would normally be protected under the same lock. Despite the
overhead of retrying transactions that fail, in most realistic programs,
conflicts arise rarely enough that there is an immense performance
gain over lock-based protocols on large numbers of processors.
However, in practice STM systems also suffer a performance hit
relative to fine-grained lock-based systems on small numbers of
processors. This is due primarily to the overhead associated with
maintaining the log and the time spent committing transactions. Even in
this case performance is typically no worse than twice as slow.
Hence we see that transactional memory eliminates the programmer's use of locks.
Implementing it in software alone requires only the existing CAS
(compare-and-swap) instruction.
There are many ways to implement STM. STM can be implemented as
a lock-free algorithm or it can use locking. There are two types of
locking schemes: In encounter-time locking, memory writes are done
by first temporarily acquiring a lock for a given location, writing the
value directly, and logging it in the undo log. Commit-time locking locks
memory locations only during the commit phase.
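To make the abstraction concrete, here is a minimal sketch of transactional code written with the @Atomic annotation of the Deuce STM framework used later in this project (the class and method names are ours; the annotated method body runs as one transaction and is retried by the TM on conflict):

import org.deuce.Atomic;

public class TransactionalCounter {
    private int value = 0;

    @Atomic
    public void increment() {
        // The read-modify-write appears to happen atomically to all other transactions.
        value = value + 1;
    }

    @Atomic
    public int get() {
        return value;
    }
}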
A commit-time scheme named "Transactional Locking II" implemented
by Dice, Shalev, and Shavit uses a global version clock. Every
transaction starts by reading the current value of the clock and storing it
as the read-version. Then, on every read or write, the version of the
particular memory location is compared to the read-version; and, if it's
greater, the transaction is aborted. This guarantees that the code is
executed on a consistent snapshot of memory. During commit, all write
locations are locked, and version numbers of all read and write
locations are re-checked. Finally, the global version clock is
incremented, new write values from the log are written back to memory
and stamped with the new clock version.
2.2 STM Implementation example – Transactional Locking
Overview
To better illustrate the method and advantages of STM, we now
present an example of an algorithm used to implement
transactional memory in software. Its use of a global version clock
affects the way it is used and increases concurrency. We
will examine the implementation in order to exploit the properties of
the algorithm for our future needs within the project.
The Transactional Locking 2 (TL2) algorithm is based on a combination of
commit-time locking and a novel global-version-clock-based
validation technique. TL2 fits with any system's memory life
cycle, including platforms using malloc and free. TL2 also
avoids periods of unsafe execution: user code is guaranteed to
operate on consistent memory states. While providing
these properties, TL2 achieves reasonable performance (better
than other STM algorithms).
Lock-based STMs tend to outperform non-blocking ones due to
simpler algorithms that result in lower overheads (as discussed
earlier regarding the trade-offs). The main requirements
of the STM discussed here were introduced by the authors in order
to set the criteria for commercial use of STM.
The first of those is that memory used transactionally must be
recyclable for non-transactional use. Hence garbage-collected
languages are assumed; when working in C, a garbage collector
must be provided.
The second requirement is that the STM algorithm must not use a
special runtime environment, a need which cripples performance in many
otherwise efficient STM algorithms. TL2 runs user
code on consistent states of memory, thus eliminating the
need for specialized managed runtime environments.
The TL2 is a two phase locking scheme that employs commit-time
lock acquisition mode. Each implemented transactional system has
a shared global version-clock variable. The global clock is
incremented using the CAS operation. The global version clock will
be read by each transaction.
There is a special versioned write-lock associated with every
transacted memory location. In its simplest form, the versioned
write-lock is a single word spinlock that uses a CAS operation to
acquire the lock and a store to release it. Since one only needs a
single bit to indicate that the lock is taken, the rest of the lock word
is used to hold a version number. This number is advanced by
every successful lock-release.
To implement a given data structure there is a need to allocate a
collection of versioned write-locks. Various schemes for associating
locks with shared data are used:
Per object (PO), where a lock is assigned per shared object or per
stripe (PS), where we allocate a separate large array of locks and
memory is striped (partitioned) using some hash function to map
each transactable location to a stripe.
Other mappings between transactional shared variables and locks
are possible.
The PO scheme requires either manual or compiler-assisted
automatic insertion of lock fields whereas PS can be used with
unmodified data structures.
The TL2 algorithm is as follows.
For write transactions:
1. Sample global version-clock: Load the current value of the
global version clock and store it in a thread local variable called the
read-version number (rv). This value is later used for detection of
recent changes to data fields by comparing it to the version fields of
their versioned write-locks.
2. Run through a speculative execution: Execute the transaction
code (load and store instructions are mechanically augmented and
replaced so that speculative execution does not change the shared
memory's state). Locally maintain a read-set of addresses loaded
and a write-set of address/value pairs stored. This logging
functionality is implemented by augmenting loads with instructions
that record the read address and replacing stores with code
recording the address and value to be written. The transactional
load first checks to see if the load address already appears in the
write-set. If so, the transactional load returns the last value written
to the address. This provides the illusion of processor consistency
and avoids read-after-write hazards. A load instruction sampling
the associated lock is inserted before each original load, which is
then followed by post-validation code checking that the location's
versioned write-lock is free and has not changed. Additionally,
make sure that the lock's version field is smaller than or equal to rv and
the lock bit is clear. If it is greater than rv, it suggests that the
memory location has been modified after the current thread
performed step 1, and the transaction is aborted.
3. Lock the write-set: Acquire the locks (avoid indefinite deadlock).
In case not all of these locks are successfully acquired, the
transaction fails.
4. Increment global version-clock: Upon successful completion of
lock acquisition of all locks in the write-set, perform an
increment-and-fetch (using a CAS operation) of the global version-clock,
recording the returned value in a local write-version number
variable wv.
5. Validate the read-set: Validate, for each location in the read-set,
that the version number associated with the versioned write-lock is
smaller than or equal to rv. Verify that these memory locations have not
been locked by other threads. In case the validation fails, the
transaction is aborted. (By re-validating the read-set, there is a
guarantee that its memory locations have not been modified while
steps 3 and 4 were being executed.) In the special case where rv +
1 = wv it is not necessary to validate the read-set, as it is
guaranteed that no concurrently executing transaction could have
modified it.
6. Commit and release the locks: For each location in the write-set,
store to the location the new value from the write-set and release
the location's lock by setting the version value to the write-version
wv and clearing the write-lock bit.
Read-only transactions:
1. Sample the global version-clock: Load the current value of the
global version-clock and store it in a local variable called read-version (rv).
2. Run through a speculative execution: Execute the transaction
code. Each load instruction is post-validated by checking that the
location's versioned write-lock is free and making sure that the
lock's version field is smaller than or equal to rv. If it is greater than rv,
the transaction is aborted; otherwise it commits.
We will not discuss the implementation of the version clock, nor do
we want to deal with any further details of the TL2 implementation.
The discussion here is purely about the algorithm that TL2
prescribes for transactions. Naturally, TL2 differs from other STM
algorithms, but we believe it is representative of the majority of them.
What we intend to check are the aborts that TL2 can perform, which
we discuss in the next chapter.
We will now review the results of the TL2 empirical performance
evaluation as done by Dice, Shalev, and Shavit.
Figure 3 - TL2 Tests vs. different algorithms
The tests were run on a red-black tree with customizable
contention. The results show that TL and TL2 are above all
other algorithms. TL2 beats the TL algorithm at the highest
contention, which means it poses a small overhead cost. In any case,
according to these preliminary results, TL2 is a strong
representative of the STM algorithms as a whole.
2.3 Aborts in STM
As seen in the TL2 algorithm, and in others as well, there are situations
in which a transaction encounters inconsistencies that make it
determine that the set of atomic actions it holds cannot be committed.
The actions are then canceled and the transaction aborts.
Aborts are in general not a "good" thing. They cripple performance
and jam the memory with wasted work. They lower concurrency at
times and, as a result, create latency.
A transaction's abort may be initiated by a programmer or may be
the result of a TM decision. The latter case, in which the transaction is
forcefully aborted by the TM, is the one we will mostly handle. Take
as an example a "regular" abort: a transaction reads object
A and writes to object B, while at the same time another transaction
reads the old value of B and writes A. One of the transactions
must be aborted to ensure atomicity. This abort is a must; there is
no way to proceed without aborting one or both transactions.
Most existing TMs perform unnecessary (spare) aborts: aborts of
transactions that could have committed without violating
correctness. Spare aborts have several drawbacks: as with ordinary
aborts, the work done by the aborted transaction is lost, computer
resources are wasted, and the overall throughput decreases.
Moreover, after the aborted transactions restart, they may conflict
again, leading to livelock and degrading performance even further.
The point is that while some aborts are a "necessary evil", since we
cannot maintain atomicity without them, unnecessary aborts are
wrong and are a waste in every way.
Some unnecessary aborts cannot be avoided: the algorithms are
built in a way that certain situations cause aborts regardless. On
the other hand, some aborts can be avoided by adapting the software
or the algorithm to make it possible to maintain atomicity when such an
incident occurs.
2.4 Unnecessary Aborts in STM
2.4.1 What Are They?
Our first task was in fact to determine: what are
unnecessary aborts?
The narrowest and most accurate definition of an
unnecessary abort is an abort which we can avoid (by a
software or hardware mechanism) and/or which was caused not by
an actual conflict in the object version, or any other real
conflict, but by a step in the algorithm that excludes such states
as "risky" and therefore aborts when they are entered.
In other words, an unnecessary abort is an abort that
shouldn't have occurred.
We discussed the TM abstraction in terms of utilization of
resources: a writing transaction accesses an object to write to
it, and a reading transaction accesses an object to read from it. In
TL2, the rule of thumb is that the version clock remains
constant while a transaction performs its actions.
Hence, to maintain correctness in TL2, we need the
version clock to remain steady.
2.4.2 Why do they happen?
There is more than one way to look at unnecessary aborts;
at times they may not seem unnecessary at all.
The point is that most TM implementations abort one
transaction whenever two overlapping transactions access
the same object and at least one access is a write.
While easy to implement, this approach may lead to high
abort rates, especially in situations with long-running
transactions and contended shared objects.
This behavior of TM algorithms creates several odd situations.
Figure 4 - Aborts
For example, the figure clearly shows a case in which the abort
performed by TL2 could have been avoided. What happened was that the algorithm
saw that two transactions accessed the same object and
simply aborted one of them.
Figure 5 - Example Of Unnecessary Aborts
This occurs in most TM algorithms and has many examples.
The run can continue without violating correctness yet the
transaction aborts.
Of course, the question arises: why would the TM
algorithms allow such a state?
The conditions discussed are difficult to check for
correctness. In order to check their correctness, the algorithm
needs to run speculatively and inspect the run of the program.
There is a trade-off between the "permissiveness" of the algorithm,
in terms of the conditions it allows, and the latency which the
correctness check requires.
The computational cost of this correctness check has been
shown to be high for many of the TM algorithms.
The main point is that, due to this phenomenon, most algorithms
have low permissiveness rates, and hence there is
an impact on the commit rate and performance of the
algorithm (to be statistically demonstrated by us here).
2.4.3 How do we detect them?
In the article "On Avoiding Spare Aborts in Transactional
Memory", Prof. Idit Keidar and Dmitri Perelman show
that transactions, when their reads and writes are recorded
correctly in read lists and write lists, may point to each other, forming a
precedence graph.
The precedence graph reflects the dependencies among the
transactions as created during the run. The vertexes of the
precedence graph are transactions.
The edges of the precedence graph are the dependencies
between them. For example, say we have two transactions
T1 and T2. If T1 reads an object named O1 and T2 writes to
O1, then the precedence graph contains an edge (T1,T2)
labeled RAW – read after write. If the transaction T1 writes to
O1 and T2 also writes to O1, the precedence graph contains
an edge (T1,T2) labeled WAW – write after write. If T1 writes
to O1 and T2 reads O1 before it was changed (O1 is written to
version n and T2 reads version n-1), then the edge (T1,T2)
will be labeled WAR – write after read.
Figure 6 - Precedence Graph
From here on we have the definition of the precedence
graph. We can use the following conclusion to determine
its meaning.
Corollary 1. Consider a TM that maintains object version lists,
keeps PG acyclic, and forcefully aborts a set S of live
transactions only when aborting any subset S' ⊂ S of
transactions creates a cycle in PG. Then this TM satisfies
opacity and online opacity-permissiveness.
This is proven in the article.
What this conclusion means is that if, after the transactions
are entered into the precedence graph, the precedence
graph remains acyclic, and a transaction was nonetheless aborted, then
that abort was unnecessary.
In order to detect unnecessary aborts, we can review the transactions'
speculative run and create a speculative precedence graph
as described above. In that graph we will check (using a
DFS algorithm or other means) for cycles containing the
aborted transaction. If there are no cycles, the abort was
unnecessary. Otherwise the abort was necessary, since the
correctness of the run could not have been maintained had the
transaction not aborted.
So, in order to check and analyze aborts, we must document
the run of the transactions, place it in a precedence graph, and
check for cycles. If there are cycles, the abort was
necessary; otherwise it was unnecessary.
We will want more information than the abort ratio and the
ratio of the unnecessary aborts; we will discuss this further
in the evaluation part.
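The check described above can be illustrated with a small, hypothetical example using the JGraphT library that our offline analyzer also relies on (the CycleDetector class sits in org.jgrapht.alg in older releases and in org.jgrapht.alg.cycle in newer ones; the transactions and edges below are invented for the example):

import org.jgrapht.alg.CycleDetector;          // org.jgrapht.alg.cycle.CycleDetector in newer versions
import org.jgrapht.graph.DefaultDirectedGraph;
import org.jgrapht.graph.DefaultEdge;

public class PrecedenceGraphSketch {
    public static void main(String[] args) {
        DefaultDirectedGraph<String, DefaultEdge> pg =
                new DefaultDirectedGraph<>(DefaultEdge.class);
        pg.addVertex("T1");
        pg.addVertex("T2");
        pg.addEdge("T1", "T2");   // e.g. a RAW dependency created through object O1
        pg.addEdge("T2", "T1");   // suppose the run also created a WAR dependency back

        // Aborting T1 is necessary only if T1 lies on a cycle in the precedence graph.
        CycleDetector<String, DefaultEdge> detector = new CycleDetector<>(pg);
        boolean necessary = detector.detectCyclesContainingVertex("T1");
        System.out.println(necessary ? "abort was necessary" : "abort was unnecessary");
    }
}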
2.4.4 Example: Aborts in TL2
We review the TL2 algorithm and see where we can find
possible aborts:
For write transactions:
1. Sample global version-clock: Load the current value of
the global version clock and store it in a thread local variable
called the read-version number (rv). This value is later used
for detection of recent changes to data fields by comparing it
to the version fields of their versioned write-locks.
2. Run through a speculative execution: Execute the
transaction code. Locally maintain a read-set of addresses
loaded and a write-set of address/value pairs stored. This
logging functionality is implemented by augmenting loads
with instructions that record the read address and replacing
stores with code recording the address and value to be written.
The transactional load first checks to see if the load
address already appears in the write-set. If so, the
transactional load returns the last value written to the
address. A load instruction sampling the associated lock is
inserted before each original load, which is then followed by
post-validation code checking that the location's versioned
write-lock is free and has not changed. Additionally, make
sure that the lock's version field is smaller than or equal to rv and
the lock bit is clear. If it is greater than rv, it suggests that the
memory location has been modified after the current thread
performed step 1, and the transaction is aborted.
3. Lock the write-set: Acquire the locks (avoid indefinite
deadlock). In case not all of these locks are successfully
acquired, the transaction fails.
4. Increment global version-clock: Upon successful
completion of lock acquisition of all locks in the write-set,
perform an increment-and-fetch (using a CAS operation) of
the global version-clock, recording the returned value in a
local write-version number variable wv.
5. Validate the read-set: Validate, for each location in the
read-set, that the version number associated with the
versioned write-lock is smaller than or equal to rv. Verify that
these memory locations have not been locked by other
threads. In case the validation fails, the transaction is
aborted; this re-validation guarantees that the read locations
were not modified while steps 3 and 4 were being executed. In the
special case where rv + 1 = wv it is not necessary to validate
the read-set, as it is guaranteed that no concurrently
executing transaction could have modified it.
6. Commit and release the locks: For each location in the
write-set, store to the location the new value from the write-set
and release the location's lock by setting the version
value to the write-version wv and clearing the write-lock bit.
Read-only transactions:
1. Sample the global version-clock: Load the current value
of the global version-clock and store it in a local variable
called read-version (rv).
2. Run through a speculative execution: Execute the
transaction code. Each load instruction is post-validated by
checking that the location's versioned write-lock is free and
making sure that the lock's version field is smaller than or equal
to rv. If it is greater than rv, the transaction is aborted;
otherwise it commits.
We can see that the TL2 algorithm has three abort types.
One is the "Version Incompatibility Abort", meaning that the
version was changed by a writing transaction which
overlaps this one. The second is the "Locks Abort", in which
the transaction cannot acquire all the required locks to
perform the operation (this prevents deadlocks). The last is
the "Read-Set Invalidation Abort", which occurs when the
read-set validation fails, i.e. a read memory location has been
modified or locked.
These three abort types are the places in which the
algorithm aborts. If we can diagnose their amounts, we may be
able to offer treatment so that the aborts could be
avoided at a low cost to performance.
3. Project Goal
Our goals are the following:
• Analyze a given run statistically: check the amount of aborts, the
percentage of unnecessary aborts, the percentage of wasted work, and the
impact of aborts and unnecessary aborts on performance and on load.
• Analyze aborts by type: see what the main causes of aborts in
several algorithms are; categorize and compare results under
different contentions and running terms.
• Check general performance.
• Inspect the collected data and answer the following question:
"Will it pay off to add designs to stop the unnecessary aborts?"
Meaning: are they worth addressing? Is their impact on
performance so massive that solving some of them would be a
practical, visible change for the better in performance,
speedup, and load?
4. Implementation
4.1 Overview
To accomplish our goals, we had to find a way to get information about a
program's run. The information we looked for was the order of
transactional events (e.g. transaction starts, reads, writes, commits and
aborts). Since trying to analyze the data while the program is still running may
hurt the program's parallelism, we decided to split the implementation into
three different parts:
• Designing the log file generated through the run for future analysis.
• The offline analysis part – this part includes reading the log file
written by the online part, and calculating the various statistics we
attempt to measure.
• The online logging part – this part includes running the program in
an STM environment, and writing the transactional events into a log
file.
This section will focus on the implementation of the classes we used to
log and analyze TL2. Throughout the implementation we used the standard
Java libraries, including their concurrency packages, the JGraphT library,
which supplies a graph-theory interface, the XML format and
Java's SAX library, as well as object-oriented design patterns such as
singleton, abstract factory and visitor.
4.2 The log file and parser
The first step in implementing our abort analyzer was planning what data
will be written to the log file throughout the program's run. To know the
causes for each and every abort, we needed data on the transactions'
chronological order of reads and writes, as well as the order of
transactions starting, committing and aborting.
To ease the parsing of the log file by our analyzer, and to apply a
rigid format to it, we decided to use the XML format. Using
Java's SAX (Simple API for XML), we are able to access the log file with
relative ease.
The XML log is built with a main "log" tag, which is the container for
every action the online part logs. The log tag contains a number of
records. Each record is called a "LogLine", marked with a "line" tag, and
represents a single action of a single transaction: start, read, write or
commit.
The data we're saving for each action:
• For transaction starts: the global version clock at that time.
• For reads: an object ID for the read object, the object's lock
version and the result of the read – a "version" tag that contains the
read object's current version on success, and an "abort" tag, which
we also use in commit lines, on abort. The abort tag also contains
the read object's current version, but also has the "reason" for the
abort as an attribute. While modifying the algorithm to log for us, we
inserted different names for different causes of aborts, to distinguish
between the types.
• For writes: the object's ID.
• For commits: on commit we saved the write version for the written
objects (0 if no objects were overwritten), and an abort tag on
aborts, similar to those described for read lines. This time, the abort
tag's content is the version of the object preventing the transaction
from committing.
Finally, the log's format is:
<log>
<line type = "start" txn = (transaction ID)><rv>
(Global version clock version on transaction start)
</rv></line>
<line type="read" txn= (transaction ID)>
<obj> (object ID) </obj>
<result>
[On success:
<version>0</version>]
[On abort: <abort reason = (a string describing the
reason for the abort)> (Object current version)
</abort>]
</result>
</line>
<line type="write" txn= (transaction ID)><obj>
(object ID) </obj></line>
<line type="commit" txn= (transaction ID)>
<result>
[On success:
<write-set>
<wv> (transaction write version) </wv>
</write-set>]
[On abort: <abort reason = (a string describing the
reason for the abort)>0</abort>]
</result>
</line>
</log>
Source Code 1 - Log Formation
This log structure is designed specifically for the TL2 algorithm. There are
no "abort" tags possible in start lines or write lines, since TL2 only
samples the global version clock on start, and adds objects to the write-set
on write. For different algorithms, the log's format may be changed.
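For illustration, a log produced for a single successful transaction might look like the following hypothetical fragment (transaction and object IDs are invented; the structure follows the format above):

<log>
  <line type="start" txn="3"><rv>17</rv></line>
  <line type="read" txn="3">
    <obj>o42</obj>
    <result><version>0</version></result>
  </line>
  <line type="write" txn="3"><obj>o42</obj></line>
  <line type="commit" txn="3">
    <result><write-set><wv>18</wv></write-set></result>
  </line>
</log>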
The structure of the classes representing each line is as follows:
name: String
result:
boolean
LogLine
reason:
String
opType:
Enum Type
IS A
Log_StartLine
Log_CommitLine
Log_ReadLine
Log_WriteLine
rv:
long
wv:
Long
object:
String
version:
Long
object:
String
Figure 7 - Structure of the classes representing the log
For the purpose of storing all the actions that were performed, we
created a general "LogLine" class, and four classes that extend it for
each action. Each of these classes contains the data we wanted to save
for its action type respectively.
For each type of LogLine we created a SingleLineFactory – a class with
a single method, "createLine", which receives an XML node and, using
DOM, returns a LogLine according to it. To determine which
factory we should use, we created a general "LogLineFactory" singleton
class, which reads the type of LogLine from the "type" attribute of each
"line" tag and calls the matching factory, using a Map with strings as keys
and SingleLineFactories as data.
With these classes, we are able to read an XML log file and use the
LogLineFactory.parseLogFile method, which gets one XML node at a
time and calls the matching factory. The method produces a list of
LogLines. We assume the list maintains the order of insertion.
After the factory finishes parsing the file, we have no further use
for it, since we have all the data in our list. The list will now be processed
by the analyzer.
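As a rough, simplified sketch of how the parser side walks the XML log and dispatches on the "type" attribute (illustrative only; the project's LogLineFactory builds full LogLine objects rather than strings):

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class LogParseSketch {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new File("run.xml"));
        NodeList lines = doc.getElementsByTagName("line");
        List<String> parsed = new ArrayList<>();
        for (int i = 0; i < lines.getLength(); i++) {
            Element line = (Element) lines.item(i);
            String type = line.getAttribute("type");   // start / read / write / commit
            String txn = line.getAttribute("txn");
            // A real SingleLineFactory would build the matching LogLine subclass here.
            parsed.add(type + " by transaction " + txn);
        }
        parsed.forEach(System.out::println);           // order of insertion is preserved
    }
}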
4.3 The offline analysis
There are several stages to the offline analysis:
• parsing the log file;
• analyzing the statistics.
Using a "divide and conquer" approach, we designed different
packages for each of these stages: the "ofersParser" package, which
contains the means to parse the log file – the LogLine classes and
factory classes discussed earlier – and the "analyzer" package, which is
called by the parser when the log's data is stored in classes, rather
than the XML file. The analyzer package maintains a database of
transactions, dependencies and actions performed, as well as the
statistics we're attempting to measure.
Figure 8 - A schematic description of the offline analysis part
(XML log → Parser → run description → Analyzer → run-description data structure → Abort analyzer.)
In this scheme, we can see that the "Parser" unit receives the log file
and generates a run description. The run description is actually the log
lines as they were written to the log, this time in Java classes. The
Parser passes this down to the Analyzer unit, which reads the data and
maintains a "Run Description Data Structure", which is a precedence
graph. That precedence graph is passed to the "Abort analyzer" unit,
which checks each abort's necessity.
The analyzer functions as an interpreter – it reads line by line without
looking ahead, and reacts to each line, independent of other lines.
The analyzer package contains three data collecting classes and three
information storage classes. The storage classes are:
• TxnDsc – contains general information about the transaction in the
current stage of the run, as interpreted by the analyzer thus far (as
recalled, the analysis is offline, so "thus far" refers to the analyzer's
progress through the list, not the actual state in the program's run).
These are used as the vertexes in the precedence graph. The
TxnDsc contains the transaction's ID, status (active, committed or
aborted), read version, read-set and write-set.
• ObjectVersion – contains information about a specific version of a
single object: its version, the previous and the next versions, the
writer of this version and a set of the readers of this version. While
building the precedence graph, we access the objectVersions to
determine which edges we should add to the graph.
• ObjectHandle – contains information about the object itself: its ID,
the first and last versions, and the number of versions until now.
Using the object handles, we may traverse through objects' versions
easily.
The other three classes are "LogLineVisitors" – they use the Visitor
design pattern to distinguish between the types of LogLine. Each one of
these classes has a different role in the analysis of the log:
• Analyzer – keeps basic statistics of the run, and a database of
TxnDscs that can be accessed for general purposes. The statistics
available in the Analyzer are abort and commit counts, read and
wasted-read counts, and write and wasted-write counts, which are
not valid for TL2, since writes are only speculative and are not
performed. It also contains an "abort map", to count aborts of each
type.
Generally, the Analyzer has every statistic aside from unnecessary
aborts, which are treated by the other classes. The Analyzer has the method
"analyzeRun", which iterates over the LogLines list and calls all three
data collection classes to visit each line.
• RunDescriptor – maintains the precedence graph. As recalled, the
precedence graph is a graph whose vertexes are the transactions,
and whose edges represent dependencies of Read after Write (RaW),
Write after Read (WaR) and Write after Write (WaW). An
enumerated type was created for the edges, while the vertexes use
TxnDscs. The RunDescriptor also visits each line and modifies the
graph suitably, by adding vertexes on transaction starts, updating
TxnDscs, ObjectHandles and ObjectVersions on reads, adding
ObjectVersions and edges on commits, and doing nothing on aborts.
The graph is built using the JGraphT package. The RunDescriptor
class also has an "exportToVisio" method, which creates a CSV file
with the precedence graph.
• AbortAnalyzer – the missing piece of the RunDescriptor, this class's
job is to detect unnecessary aborts. On aborts, it speculatively adds
the matching dependency edges to the graph, and tries to detect
cycles containing the aborting transaction's vertex in the graph
(using the RunDescriptor.hasCycle method). As recalled, a cycle in
the graph means that the abort was necessary. The AbortAnalyzer
counts the number of unnecessary aborts.
In summary:

Class           Role
TxnDsc          Holds data concerning a single transaction.
ObjectVersion   Holds data concerning one version of a certain object, and references to the previous and next versions.
ObjectHandle    Holds data concerning a single object and all of its versions.
Analyzer        Holds a transactions database and basic statistics.
RunDescriptor   Holds the precedence graph describing the run.
AbortAnalyzer   When it encounters an abort, detects cycles in the precedence graph to determine the abort's necessity.
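The three data-collecting classes above share the LogLineVisitor interface. A minimal, hypothetical sketch of that visitor split (field names simplified from Figure 7; not the project's exact code):

interface LogLineVisitor {
    void visit(Log_StartLine line);
    void visit(Log_ReadLine line);
    void visit(Log_WriteLine line);
    void visit(Log_CommitLine line);
}

abstract class LogLine {
    String txnId;
    abstract void accept(LogLineVisitor v);   // double dispatch into the visitor
}

class Log_StartLine extends LogLine {
    long rv;                                  // global clock at transaction start
    void accept(LogLineVisitor v) { v.visit(this); }
}

class Log_ReadLine extends LogLine {
    String object;                            // read object ID
    long version;                             // its version, if the read succeeded
    void accept(LogLineVisitor v) { v.visit(this); }
}

class Log_WriteLine extends LogLine {
    String object;                            // written object ID
    void accept(LogLineVisitor v) { v.visit(this); }
}

class Log_CommitLine extends LogLine {
    long wv;                                  // write version on success
    boolean success;                          // commit or abort
    void accept(LogLineVisitor v) { v.visit(this); }
}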
Figure 9 - Another View of the Offline Design
(XML log → Parser → Analyzer / RunDescriptor → Abort Analyzer → Matlab histograms and final analysis; the output of the analysis is a precedence graph showing the transactions and their actions.)
Combining these three classes, we may measure:
• Abort rate
• Unnecessary abort rate
• Wasted reads
• Each abort type's impact on performance.
In conclusion, we have three classes which iterate over every LogLine,
and maintain a wide database of transactions and statistics, along with
the matching precedence graph. These classes allow us to read a log
generated by a program and reach the statistics we need. Our final step
is to make an STM environment do that for us.
4.4 The online logging
As recalled, the online logging part's aim is to monitor a running program
and generate a log file, matching the format we discussed earlier.
The online logging part also consists of two stages. The first of them is
creating the means for logging an algorithm with the XML format –
generating a "Logger" mechanism to which the STM environment will
report on every transaction's action. The second one is modifying the
TL2 algorithm with the logging actions so that it may report to the logger
on every action.
The STM framework we modify is Deuce STM, described further in the
evaluation part (Section 5.2).
Figure 10 - A schematic description of the online logging part
(The Deuce framework runs the TL2 algorithm; logging instructions added to the transactions report each action – start, read, write, commit – to the Logger, which writes the XML log data to the XML log file.)

The Logger mechanism was very problematic to implement: first, we had
to build it with a minimal amount of actions, since long insertions into highly
complex data structures may distort concurrency levels in the program.
Second, we had to make sure the LogLines reach the log in their
chronological order.
The most basic class in the Logger implementation is the LogCreator
class. The LogCreator is a singleton, and is in charge of receiving data
and writing it directly to the log file. To pass the data to the LogCreator,
we re-used the LogLine classes from the ofersParser package. And so,
LogCreator is a LogLine visitor. When we call LogCreator.getInstance
().visit (LogLine), the LogCreator independently writes the LogLine
directly to the file in XML format. We used manual writing of XML, rather
than DOM XML for this purpose.
LogCreator also has "flush" and "quit" methods. Both of them close
the log tag in the XML file and write out the data accumulated
in the LogCreator's Writer class. The quit method also waits some time
until all the data has been flushed, and then closes the log file.
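As a rough, hypothetical sketch of the LogCreator idea (the real class is a LogLine visitor with a visit method per line type; here a single write method stands in for them):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public final class LogCreatorSketch {
    private static LogCreatorSketch instance;
    private final BufferedWriter out;

    private LogCreatorSketch() throws IOException {
        out = new BufferedWriter(new FileWriter("run.xml"));
        out.write("<log>\n");                 // open the main log tag
    }

    public static synchronized LogCreatorSketch getInstance() throws IOException {
        if (instance == null) instance = new LogCreatorSketch();
        return instance;
    }

    public void writeStart(String txnId, long rv) throws IOException {
        // hand-written XML, as in the project, rather than DOM construction
        out.write("<line type=\"start\" txn=\"" + txnId + "\"><rv>" + rv + "</rv></line>\n");
    }

    public void quit() throws IOException {
        out.write("</log>\n");                // close the log tag
        out.flush();
        out.close();
    }
}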
The Logger interface implementation is in charge of holding the
LogLines generated by each thread with TL2 in order of instantiation. It
has the following methods:
public interface Logger {
    public void startTxn(String txnId, long rv);
    public void readOp(String txnId, String objId, long verNum,
                       boolean result, String reason);
    public void writeOp(String txnId, String objId);
    public void commitOp(String txnId, long wv, boolean result,
                         String reason);
    public void stop();
}
Source Code 2 - Logger Interface
Each method receives the data required to generate a LogLine. Then,
the method sends it to the LogCreator for writing in the log.
The Logger is a ThreadLocal, because synchronizing the threads to take
turns using the Logger may be costly in terms of parallelism. With
ThreadLocal Loggers, each thread may write independently to its own
private Logger.
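A minimal sketch of the ThreadLocal arrangement described above (AsyncLogger here is a stand-in for the project's implementation):

public class AsyncLoggerFactorySketch {
    private static final ThreadLocal<AsyncLogger> LOGGER =
            ThreadLocal.withInitial(AsyncLogger::new);

    public static AsyncLogger getLogger() {
        return LOGGER.get();                  // each thread sees its own private logger
    }

    static class AsyncLogger {
        // the per-thread queue of pending LogLines would live here
    }
}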
To take load off the running threads, we decided that, along with the
threads, a background collector thread will collect the LogLines. The
Collector will also be responsible for ordering the LogLines in
their chronological order.
We had two attempts at building the collector.
First, we created the LogLineHolder class. It is a container for a LogLine,
and an integer, called serialNum, representing its index. Lower
serialNum means earlier appearance. The index is generated by an
AtomicInteger. Each logger that needs to add a line accesses this
integer, puts its value as the line's index, and increments the integer
(using the AtomicInteger.getAndIncrement() method).
In both of our attempts, the Logger holds a Queue of LogLineHolders.
Inserting an element into the Queue is O(1), and the FIFO rule
allows the lowest-indexed lines to be treated first.
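A hedged sketch of this indexing scheme (simplified names; the real classes hold LogLines rather than plain objects): one shared AtomicInteger hands out global serial numbers, and each thread enqueues its holders locally in O(1).

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

class LogLineHolderSketch {
    static final AtomicInteger NEXT_INDEX = new AtomicInteger(0);

    static class LogLineHolder {
        final int serialNum;                  // lower serialNum means earlier appearance
        final Object line;                    // stands in for the real LogLine
        LogLineHolder(Object line) {
            this.serialNum = NEXT_INDEX.getAndIncrement();
            this.line = line;
        }
    }

    // per-thread queue; insertion is O(1) and FIFO order is preserved
    final Queue<LogLineHolder> queue = new ConcurrentLinkedQueue<>();

    void log(Object line) {
        queue.add(new LogLineHolder(line));
    }
}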
In our first attempt, our algorithm included a priority queue of
LogLineHolder queues. The priority of each queue is decided by the
index of its top element. In the background, a collector thread pops
the priority queue, gets the first element of the top queue, and
writes it to the log by calling LogCreator. If the queue is now empty, the
Collector does not insert it back into the priority queue, which means each thread
may re-insert it when it fills up. Insertion into a PriorityQueue may be costly, and
so we decided to refrain from using it.
A schematic description of the first attempt:
Threads add the loggers to the queue themselves.
Figure 11 - Online Part version 1
In our second attempt, we focused on making the running threads more
passive: each thread inserts its LogLineHolders to its queue. In the
background, a new Collector thread, named BackendCollector, tries to
get the next line each time, and adds it to the log. The BackendCollector
keeps running while a Boolean value called "stop" is false and there are
more lines to collect:
public void run() {
    while (true) {
        LogLine line = qm.getNextLine();
        if (line == null) {
            if (stop)
                break;
            else
                continue;
        }
        try {
            line.accept(LogCreator.getInstance());
        } catch (IOException e) {
            e.printStackTrace();
            return;
        }
    }
    try {
        LogCreator.getInstance().quit();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Source Code 3 - Background Collector Run method
The qm used in the second line of run() is the QueueManager, which will be explained later. Its getNextLine() method returns either the next line of the log, or null if no matching line exists. If line is null, the Collector keeps asking for the next line. When the Collector finds a line, it sends it to the LogCreator using the line.accept(LogLineVisitor) method. At the end, the Collector attempts to close the log.
A schematic description of our second attempt:
Collector iterates through the
loggers, threads remain passive.
Figure 12 - Online Part Final Version
The BackendCollector is quite inefficient, since its search function is trivial, but its own performance does not concern us. Our only requirement is that it create minimal interference with the running program: writing the log file may take longer, as long as concurrency remains unharmed. With this algorithm, the threads' only extra tasks are to access an AtomicInteger and to add a LogLineHolder to a ThreadLocal queue.
The Logger's matching implementation is called AsyncLogger. It contains the AtomicInteger for the indexes and a LogLineHolder queue for the lines. We also built an "AsyncLoggerFactory", which holds the ThreadLocal Logger. Threads access the factory to get their Logger; the factory's getter simply calls ThreadLocal.get().
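A minimal sketch of such a factory, assuming the Logger interface of Source Code 2 and a no-argument AsyncLogger constructor, could look as follows:

public final class AsyncLoggerFactorySketch {
    private static final AsyncLoggerFactorySketch INSTANCE = new AsyncLoggerFactorySketch();

    // Each thread lazily receives its own private AsyncLogger on first access.
    private final ThreadLocal<Logger> localLogger = new ThreadLocal<Logger>() {
        @Override
        protected Logger initialValue() {
            return new AsyncLogger();
        }
    };

    public static AsyncLoggerFactorySketch getInstance() {
        return INSTANCE;
    }

    public Logger getLogger() {
        // ThreadLocal.get() returns the calling thread's own logger,
        // so no synchronization between logging threads is needed.
        return localLogger.get();
    }
}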
The final unit in the Logger mechanism is the QueueManager. Its job is to supply the BackendCollector with the next line in order each time it is asked.
QueueManager is a singleton class. Its constructor is what starts the
Collector thread:
BackendCollector collector = new BackendCollector(this);
Thread t;

private QueueManager() {
    t = new Thread(collector);
    t.start();
}
Source Code 4 - Queue Manager C'tor
With the thread running early on, the logger is ready to accept LogLines
from the Threads.
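For completeness, this is roughly the singleton access point implied here (a sketch; anything beyond the getInstance name used later in the text is an assumption):

public final class QueueManagerSketch {
    // Creating the single instance runs the private constructor of Source Code 4,
    // so the BackendCollector thread starts as soon as the manager is first used.
    private static final QueueManagerSketch INSTANCE = new QueueManagerSketch();

    private QueueManagerSketch() { /* starts the collector thread, as in Source Code 4 */ }

    public static QueueManagerSketch getInstance() {
        return INSTANCE;
    }
}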
The QueueManager holds a map whose keys are thread ids (longs) and whose values are LogLineHolder queues. The AsyncLogger C'tor adds its queue to the QueueManager's map using the "addQueueIfNotExists" method, which stores the queue under the current thread's id, obtained with Thread.currentThread().getId():
public void addQueueIfNotExists(Queue<LogLineHolder> q) {
    long curThreadId = Thread.currentThread().getId();
    if (!threadQueues.containsKey(curThreadId)) {
        threadQueues.put(curThreadId, q);
    }
}
Source Code 5 - Add Queue Method
The method "getNextLine", which is used by the Collector, uses a PriorityQueue of LogLineHolders, sorted by index, called "LogLineCache".
LogLineCache is filled by a method called "traverseQueues": if the cache does not contain the next line in order, the Collector uses this method to loop over all the queues and collect any LogLines found there. If none are found, the method returns false and "getNextLine" returns null, making the Collector check its stop condition.
The "getNextLine" method checks for the next line in order using an int named "counter", initialized to 0 and incremented whenever a matching line is returned; the lines' indexes are expected to match the counter.
public LogLine getNextLine() {
    boolean empty = false;
    while (!empty) {
        LogLineHolder cacheHead = logLinesCache.peek();
        if (cacheHead != null && cacheHead.serialNum == counter) {
            counter++;
            return logLinesCache.poll().line;
        }
        empty = !traverseQueues();
    }
    return null;
}
private boolean traverseQueues() {
    boolean smtFound = false;
    for (Map.Entry<Long, Queue<LogLineHolder>> entry : threadQueues.entrySet()) {
        LogLineHolder lineHolder = entry.getValue().poll();
        if (lineHolder != null) {
            smtFound = true;
            logLinesCache.add(lineHolder);
        }
    }
    return smtFound;
}
Source Code 6 - getNextLine & traverseQueues methods
Using these methods, the Collector keeps fetching lines from the queues as long as the "stop" boolean value is false and there are more lines in the queues.
To stop the Collector and end the run, the user calls the "stop" method, which sets "stop" to true and uses Thread.join() to wait for the Collector to finish. The user must call QueueManager.getInstance().stop() in order to complete the Collector's run.
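A sketch of what that stop method might look like, reusing the collector and t fields from Source Code 4 (the collector's own stop setter is an assumption):

public void stop() {
    collector.stop();                  // assumed setter: flips the collector's
                                       // (ideally volatile) "stop" flag to true
    try {
        t.join();                      // wait for the BackendCollector to drain
                                       // the remaining lines and close the log
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}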
With these interfaces in hand, the only task remaining is to embed the
Logger mechanism in the TL2 implementation.
Deuce STM has an interface called "org.deuce.transaction.Context". As recalled, the Deuce framework automatically routes a transaction's reads and writes to this interface, allowing any user to implement an STM algorithm. The user has two roles: implementing the STM algorithm, and adding the "@Atomic" annotation to every function that should be a critical section. Deuce turns such a function into a transaction.
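For example, a user-level critical section can be as simple as the following; the class itself is ours for illustration, and only the @Atomic annotation and the retry-loop behaviour belong to Deuce:

import org.deuce.Atomic;

public class SharedCounter {
    private int value;

    // Deuce replaces this method with a retry-loop that runs it as a transaction,
    // routing the read and the write of "value" through the configured Context.
    @Atomic
    public void increment() {
        value++;
    }
}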
We modified an existing TL2 implementation. As we discussed earlier, we want to log each and every read and write.
Our additions to the TL2 Context, besides the calls to the logger, were:
final static int writtenByThis = -1;
private String name = "";
private final ThreadLocal<Integer> txnId = new ThreadLocal<Integer>() {
    @Override
    protected Integer initialValue() {
        return 0;
    }
};
private final Logger logger;
Source Code 7 - Addition to the TL2 context
In the Context's C'tor:
logger = LoggerFactory.getInstance ().getLogger ();
Source Code 8 - Change in context C'tor
And in the "init ()" method:
this.name = Thread.currentThread().getId() + "_T" + txnId.get();
txnId.set(txnId.get() + 1);
logger.startTxn(name, this.localClock);
Source Code 9 - Changes in TL2 Context init()
•	The integer "writtenByThis" is a constant. Whenever a transaction reads a value it has written itself, the logger receives a read with this value (currently -1) to indicate it.
•	The "name" string holds the transaction's name, computed anew in the Context's init() method for every transaction.
•	The "txnId" field is a counter. It is ThreadLocal, so each thread has an instance of its own, initialized to 0. Whenever a transaction starts, its name is calculated from the thread's id and this integer, and the value is then incremented so that different transactions receive different ids.
Within the C'tor, the Context saves a reference to its thread's logger. This way, the methods can use this field instead of accessing the LoggerFactory on every logging operation.
When a transaction starts, the "init" method is called. The new transaction is given a new name, composed of the thread's id and the transaction id. As recalled, the ThreadLocal "txnId" is incremented, and the first call to the logger within the transaction is made. This means that on each "init", the logger writes the transaction's start to the log, along with the matching global version clock value.
This settles the new fields.
In the original TL2 implementation, a single static TransactionException was stored in the LockTable class and reused by all transactions for every abort. Neither concurrency nor correctness was harmed by this, since that exception carried no relevant information; in particular, it said nothing about the reason for the abort. In order to distinguish between the different causes of aborts, we had to create new exception types to replace it, one for each abort cause in the LockTable:
•	LockVersionException, in case the lock's version is too high.
•	ObjectLockedException, in case the lock is acquired by another thread.
We did not need an exception for every abort type in the whole algorithm. For instance, an exception for a failed read-set validation is not necessary, since Deuce does not throw any exception for it anyway (the check is performed in the Context class). To maintain Deuce's correctness, the new exceptions extend TransactionException, and can be caught by the Deuce framework the same way the original exception was.
Since we need different information from each abort, we could not use a single shared instance of each abort-reason exception, like the old TransactionException: most aborts differ in the object and version that caused them. However, each thread runs a single transaction at a time, so a single thread needs only one exception instance at a time. To preserve memory, we therefore made the new exceptions ThreadLocal. Each time an abort occurs, the thread accesses its own ThreadLocal instance, stores the relevant data in it, and throws it as usual. When we catch it within the TL2 Context methods, we read its data, send it to the logger, and throw it again to preserve the Deuce framework's correctness. We are guaranteed that no one besides the relevant thread accesses these exceptions, since the aborting transaction stops. The new exceptions reside in the LockTable class, since lock states and object versions are checked there:
private static ThreadLocal<ObjectLockedException> lockedException =
        new ThreadLocal<ObjectLockedException>() {
    @Override
    public ObjectLockedException initialValue() {
        return new ObjectLockedException("Object is locked.", 0);
    }
};

private static ThreadLocal<LockVersionException> versionException =
        new ThreadLocal<LockVersionException>() {
    @Override
    public LockVersionException initialValue() {
        return new LockVersionException("Object is locked.", 0);
    }
};
Source Code 10 - The new exceptions in LockTable
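The exception classes themselves are not reproduced in the report. A plausible sketch of ObjectLockedException, assuming only the constructor, throwWithVersion and getLockVersion members that the surrounding code uses (LockVersionException is analogous), is:

public class ObjectLockedException extends TransactionException {
    private long lockVersion;

    public ObjectLockedException(String message, long lockVersion) {
        super(message);                 // assumes TransactionException forwards a message
        this.lockVersion = lockVersion;
    }

    // Reuse the thread-local instance: record the offending version, then throw.
    public void throwWithVersion(long version) {
        this.lockVersion = version;
        throw this;
    }

    public long getLockVersion() {
        return lockVersion;
    }
}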
Additional changes were made to the TL2 Context, such as changing the return type of the LockTable.checkLock and LockProcedure.setAndUnlockAll methods from void to int, to obtain the lock version and the write versions respectively. We also added the exception throw statements. For instance:

if (clock < (lock & UNLOCK))
    versionException.get().throwWithVersion(lock & UNLOCK);

Source Code 11 - Exception throw instance
The if statement performs the actual lock version check; the statement inside it replaces the old TransactionException with the new versionException. UNLOCK is a constant mask (0x0FFFFFF in the code), reflecting the fact that a locked object is marked with 1 in the MSB of its lock word. The exception is thrown with lock & UNLOCK, since reporting the raw lock value while the lock bit is set could yield negative versions. Similar changes were made to both checkLock methods.
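Putting these pieces together, the modified checkLock might look roughly as follows. This is a simplified sketch: the LOCK and MASK constants, the lock array and the exact structure of Deuce's LockTable are abbreviated assumptions, while the version check itself follows Source Code 11:

public static int checkLock(int hash, int clock) {
    final int lockWord = locks[hash & MASK];        // versioned write-lock word (simplified)

    if (clock < (lockWord & UNLOCK))
        // the object's version is newer than the transaction's read version -> abort
        versionException.get().throwWithVersion(lockWord & UNLOCK);

    if ((lockWord & LOCK) != 0)
        // the lock bit is set, so another transaction holds the lock -> abort
        lockedException.get().throwWithVersion(lockWord & UNLOCK);

    return lockWord & UNLOCK;                       // the method now returns the version
}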
Since we do not want to interfere with the Deuce framework, we added another try/catch block in the "commit" method:
try {
    // pre commit validation phase
    writeSet.forEach(lockProcedure);
    readSet.checkClock(localClock);
} catch (ObjectLockedException exception) {
    lockProcedure.unlockAll();
    logger.commitOp(name, exception.getLockVersion(), false, "ReadsetInvalid");
    return false;
} catch (LockVersionException e) {
    lockProcedure.unlockAll();
    logger.commitOp(name, e.getLockVersion(), false, "ReadsetInvalid");
    return false;
}
Source Code 12 - Change in Commit method
In the "onReadAccess0" method:
try {
    // Check the read is still valid
    LockTable.checkLock(hash, localClock, lastReadLock);
} catch (ObjectLockedException e) {
    logger.readOp(name, objectId(obj, field), e.getLockVersion(), false, "ObjectLocked");
    throw e;
} catch (LockVersionException e) {
    logger.readOp(name, objectId(obj, field), e.getLockVersion(), false, "VersionTooHigh");
    throw e;
}
Source Code 13 - Changes in the onReadAccess method
And in "beforeReadAccess" method:
try {
    // Check the read is still valid
    lastReadLock = LockTable.checkLock(next.hashCode(), localClock);
} catch (ObjectLockedException e) {
    logger.readOp(name, objectId(obj, field), e.getLockVersion(), false, "ObjectLocked");
    throw e;
} catch (LockVersionException e) {
    logger.readOp(name, objectId(obj, field), e.getLockVersion(), false, "VersionTooHigh");
    throw e;
}
Source Code 14 - Changes in beforeReadAccess method
The code we added to each method distinguishes between the abort types. Whenever the LockTable class is accessed, we attempt to catch the new exceptions. In the original implementation, onReadAccess0 had a simple catch (TransactionException), while commit had no try/catch block there at all; the TransactionException went directly back to the Deuce framework. We "intercept" the exceptions on their way and catch each of them separately. For each exception, we log the necessary data and then throw it again as usual.
Using our Logger interface and the "logger" field we added, we inserted calls to the logging operations where needed:
•	when a TransactionException is thrown (to mark an abort);
•	when a function completes successfully (to mark a successful operation).
Depending on the method, either "init" (as seen earlier), "onReadAccess", "onWriteAccess" or "commit", we added the matching log calls, using the necessary data, which lies within the Context and in our new exceptions.
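For instance, the write path, which is not shown above, amounts to a single added call. A sketch, assuming one of Deuce's onWriteAccess overloads and using the objectId helper introduced next:

public void onWriteAccess(Object obj, int value, long field) {
    // ... original TL2 write-set bookkeeping, unchanged ...
    logger.writeOp(name, objectId(obj, field));   // record the (always successful) write
}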
Another important addition we made to the Context class is the objectId
method:
private static String objectId(Object reference, long field) {
    return Long.toString(System.identityHashCode(reference) + field);
}
Source Code 15 - The objectId method added to the Context
This method gives us an id for each accessed object field. The object reference alone would not have been enough, since the JVM may move objects in memory while the program runs, so we use the identity hash code and add the field offset to distinguish between fields of the same object.
We gathered the TL2 classes we changed into a new package, "loggingTL2", which uses the original "field" and "pool" packages. When running our benchmarks, we configured loggingTL2.Context as the Deuce STM Context class, so that when a run ended we had a log file describing it, ready for the offline analysis.
5. Evaluation
5.1 Hardware
To allow a high degree of concurrency, we ran our benchmarks on the Trinity machine. Trinity consists of eight quad-core AMD Opteron CPUs and 132GB of RAM. Using c-shell scripts, we ran each benchmark automatically, about five times, to obtain an average measurement. We ran each benchmark with 1 to 1024 threads (growing exponentially). While the benchmarks were running, Trinity was relatively idle, so parallelism was the best attainable.
With Trinity's 32 cores, the high parallelism allowed us to sample STM benchmark results that are practical and relevant to the future of multi-core computer architecture.
5.2 Deuce Framework
Figure 13 - Deuce Method application
We use the Deuce Java-based STM framework. Deuce is a
pluggable STM framework that allows different implementations of
STM protocols. A developer only needs to implement the Context
interface, and provide his own implementation for the various STM
library functions. The library functions specify which actions to take
on reading a field, writing to a field, committing a transaction, and
rolling back a transaction. Deuce is non-invasive: it does not modify
the JVM or the Java language, and it does not require re-compiling
source code in order to instrument it. It works by introducing a new
@Atomic annotation. Java methods that are annotated with
@Atomic are replaced with a retry-loop that attempts to perform and
commit a transacted version of the method. All methods are
duplicated. The transacted copy of every method is similar to the
original, except that all field and array accesses are replaced with
calls to the Context interface, and all method invocations are
rewritten so that the transacted copy is invoked instead of the
original. Deuce works either in online or offline mode. In online
mode, the entire process of instrumenting the program happens
during runtime. A Java agent is attached to the running program, by
specifying a parameter to the JVM. During runtime, just before a
class is loaded into memory, the Deuce agent comes into play and
transforms the program in-memory.
Figure 14 - Deuce Context for TM algorithms
To read and rewrite classes, Deuce uses ASM, a general-purpose
byte code manipulation framework. In order to avoid the runtime
overhead of the online mode, Deuce offers the offline mode, that
performs the transformations directly on compiled class files. In this
mode, the program is transformed similarly, and the transacted
version of the program is written into new class files. Deuce's STM
library is homogeneous. In order to allow its methods to take advantage of specific cases where optimization is possible, each of its STM functions is enhanced to accept an extra incoming parameter, advice. This parameter is a simple bit-set which represents pre-calculated information that may help fine-tune the instrumentation. For example, when writing to a field that will not be read, the advice passed to the STM write function will have 1 in the bit corresponding to no-read-after-write. In this work we focus on the Transactional Locking II protocol implementation in Deuce. In TL2, conflict detection is done using a combination of versioned write-locks, associated with memory locations or objects, together with a global version clock. TL2 is a lazy-update STM, so values are only written to memory at commit time; therefore locks are held for a very short amount of time.
Our use of Deuce is through the Context methods discussed earlier, together with our changes. Since all of Deuce's major design choices aim to maintain concurrency and minimize locking times, we believe Deuce can save and document the run quickly, without harming concurrency.
Figure 15 - Comparison between Deuce and similar methods for running TM
5.4 Benchmarks
5.4.1 AVL test bench
The AVL benchmark is a simple test we created for the STM. It creates a shared AVL tree of integers and runs many threads. Each thread repeatedly chooses at random between adding a random number, removing a random number, or searching the tree for random numbers. The AVL tree is a perfect example of a data structure that is made thread-safe easily by STM, whereas doing so with locks would have been a catastrophe.
The AVL tree is a high-contention data structure: its frequent rotations cause constant changes in every part of the tree, and these changes can create conflicts between transactions running at the same time.
The AVL tree test is customizable in many ways (besides the number of actions). We may change the range of the integers to adjust the program's contention: the smaller the range, the more threads access the same tree nodes. We can also modify the distribution of actions, increasing or decreasing the probability of reads and writes, which lets us compare read-dominated runs to read-write or write-dominated runs.
The AVL tree test is expected to yield a very high abort ratio, and analyzing its unnecessary-abort ratio can give us a picture of how rampant the wasted work is when STM is used with dynamic data structures.
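For concreteness, the per-thread workload can be sketched as follows; the AVLTree method names and the exact read/write split here are our assumptions:

import java.util.Random;

public class AvlWorker implements Runnable {
    private final AVLTree tree;      // shared tree whose methods are marked @Atomic
    private final int range;         // smaller range -> more contention on the same nodes
    private final int operations;
    private final Random random = new Random();

    public AvlWorker(AVLTree tree, int range, int operations) {
        this.tree = tree;
        this.range = range;
        this.operations = operations;
    }

    @Override
    public void run() {
        for (int i = 0; i < operations; i++) {
            int key = random.nextInt(range);
            int op = random.nextInt(100);    // shifting these thresholds changes the
                                             // read/write mix of the workload
            if (op < 50)
                tree.contains(key);          // read-only transaction
            else if (op < 75)
                tree.insert(key);            // writing transaction
            else
                tree.remove(key);            // writing transaction
        }
    }
}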
5.4.2 Vacation test bench
The vacation benchmark implementation was supplied by Deuce.
The test implements a travel reservation system. Its workload
consists of several client threads interacting with the database, via
the system's transaction manager.
The database consists of four entities: cars, rooms, flights, and customers. Each of the first three is a relation whose fields represent a unique ID number, a reserved quantity, a total available quantity, and a price. The customers table tracks the reservations made by each customer and the total price of those reservations. The tables are implemented as Red-Black trees.
The customizable parameters in the benchmark are:
•	The initial size of the database, i.e. the number of entries in each tree at initialization.
•	The total number of tasks performed by the reservation system.
•	The distribution of the tasks.
•	The range of values from which the clients generate requests, which affects the contention level.
There are four different possible tasks:
•	Make Reservation -- the client checks the price of n items and reserves a few of them.
•	Delete Customer -- the total cost of a customer's reservations is computed, and then the customer is removed from the system.
•	Add to Item Tables -- add n new items for reservation, where an item is one of {car, flight, room} and has a unique ID number.
•	Remove from Item Tables -- remove n items from reservation, where an item is one of {car, flight, room} and has a unique ID number.
We ran Vacation with a low-contention configuration, also with 1 to 1024 threads.
5.4.3 SSCA2 test bench
The Scalable Synthetic Compact Applications 2 (SSCA2) benchmark comprises four kernels that operate on a large, directed, weighted multi-graph. The threads in this test add nodes to the graph in parallel and use transactional memory to synchronize their accesses. Each such operation is relatively small, so transactions are short; the lengths of the transactions and the sizes of their read and write sets are small as well. The amount of contention in the application is also relatively low: the large number of graph nodes leads to infrequent concurrent updates of the same adjacency list.
5.5 Results
Our final results were written to TXT files and parsed with Matlab. Following the analysis, we made several runs with varying thread counts and contention levels. On the high-contention benchmark we saw that aborts, and specifically unnecessary aborts, cripple performance: with large thread counts, almost 90% of the operations are wasted work.
The percentage of unnecessary aborts is relatively stable, rising only slightly as the number of threads grows. The commit ratio decreases roughly linearly with the increase in the number of threads.
Overall, we can see that aborts and unnecessary aborts play a large part, and that if some of them were avoided, the commit ratio and the overall performance could rise. We discuss this further in the conclusion and summary section.
Figure 16 - AVL benchmark results: percentage of successful commits, amount of unnecessary aborts, percentage of unnecessary aborts, and percentage of wasted reads, as a function of the number of threads.
Figure 17 - SSCA2 benchmark results: percentage of successful commits, amount of unnecessary aborts, percentage of unnecessary aborts, and percentage of wasted reads, as a function of the number of threads.
Figure 18 - Vacation benchmark results: percentage of successful commits, amount of unnecessary aborts, percentage of unnecessary aborts, and percentage of wasted reads, as a function of the number of threads.
Figure 19 - AVL benchmark results: breakdown of aborts by type (Version Too High, Object Locked, Readset Invalid) for 2, 4, 8, 16, 32 and 64 threads.
Figure 20 - SSCA2 benchmark results: breakdown of aborts by type (Version Too High, Object Locked, Readset Invalid) for 2, 4, 8, 16, 32 and 64 threads.
Figure 21 - Vacation benchmark results: breakdown of aborts by type (Version Too High, Object Locked, Readset Invalid) for 2, 4, 8, 16, 32 and 64 threads.
Figure 22 - AVL benchmark results: amount and percentage of aborts, categorized by type (Version Too High, Object Locked, Readset Invalid), as a function of log2 of the number of threads.
Figure 23 - SSCA2 benchmark results: amount and percentage of aborts, categorized by type (Version Too High, Object Locked, Readset Invalid), as a function of log2 of the number of threads.
Figure 24 - Vacation benchmark results: amount and percentage of aborts, categorized by type (Version Too High, Object Locked, Readset Invalid), as a function of log2 of the number of threads.
6. Conclusion and Summary
Several conclusions can be drawn from the results presented above.
We start by answering the question posed in the project goals section: is it worthwhile to create mechanisms that avoid or avert unnecessary aborts? The answer seems obvious when looking at the results. For high contention and medium-to-long transactions, the percentage of unnecessary aborts is stable and relatively high in every configuration (except for a single thread, which of course produces no aborts). With shorter transactions we see a decline in the level of aborts and unnecessary aborts, but the percentage of aborts remains very high (over 20% in every case except a single thread). The performance damage still lies mostly with the higher-contention transactions, because with short transactions the probability of transactions overlapping in time decreases: short transactions spend less time in speculative execution and relatively more of their time committing, so locks are held briefly, versions remain consistent, and objects are rarely found locked.
To summarize our findings on this question: the abort level is high enough to cripple performance in more than 50% of the cases (all except short transactions with little shared memory over time). Hence, according to the results we saw, handling and avoiding aborts is more than worthwhile; it is essential in order to keep performance improving as the degree of parallelism in transactional memory grows.
Further analysis shows that shorter transactions may not lead to better results in most cases. In the low-contention benchmarks, the largest share of aborts is of the read-set invalidation type, which means that an object read by a transaction is often altered by the time that transaction commits. Naturally, we can deduce that the avoidance mechanism should focus on recovering from the abort situation, i.e. on the algorithm level rather than on the way transactions are formed.
In most cases we see a sharp rise in the percentage of the version-incompatibility abort type. With shorter transactions this does not happen, of course, since an object does not have enough time to change during the transaction's actions. Version incompatibility can be avoided by keeping earlier versions of an object for a longer time, as proposed in the SMV algorithm by Dmitri Perelman and Prof. Idit Keidar. It seems that such an algorithm would yield a substantial improvement in performance and commit ratio. We recommend further examination of such alternatives in order to measure the change in the factors in question; it may be that most aborts and wasted work can be avoided by recovery mechanisms of this sort.
Another aspect of the results is the amount of wasted reads during all runs. Their number increases roughly exponentially, in step with the drop in the commit ratio, and a corresponding decrease in performance is visible. This is where abort avoidance is most essential, since these wasted reads create unwanted latency and cripple performance.
Categorizing the aborts at such a high level leaves us with only general outlines of the problem. We recommend analyzing the aborts per type while also taking the program's run timeline into consideration.
It is clearly visible from the results that the locking method of the TL2 algorithm is still crude in many ways: the locking leaves many situations in which the level of aborts is far too high and could have been avoided. Other algorithms may show completely different results, so we find that there is a need to analyze other algorithms in order to reach a better-founded conclusion. In any case, since abort types differ between algorithms, each analysis will eventually come back to the performance decrease due to aborts, which provides the desired approximation of an algorithm's efficiency in comparison to others.
The logging mechanism is relatively efficient, yet we can still see a certain harm to the program's concurrency, so there are in fact somewhat more aborts and more wasted reads than in an unlogged run.
We believe that our measurements are sufficient to justify further research into creating abort-avoiding mechanisms in STM.
7. Acknowledgements
We would like to thank our supervisor Dima, who went above and beyond in his help, far exceeding the role of a supervisor: he became a full partner in the process and taught us programming from scratch.
We would like to thank Victor, Ilana and the entire lab staff for hosting us, for being very kind and helpful at all times, and for the technical support while running the simulation on Trinity.
8. References
1. Asanovic, Krste et al. (December 18, 2006). "The Landscape of Parallel Computing Research: A View from Berkeley". University of California, Berkeley. Technical Report No. UCB/EECS-2006-183.
2. Barney, Blaise. "Introduction to Parallel Computing". Lawrence Livermore National Laboratory. Retrieved 2007-11-09.
3. C. Cao Minh, J. Chung, C. Kozyrakis, and K. Olukotun. STAMP: Stanford Transactional Applications for Multi-Processing. In IISWC '08: Proceedings of the IEEE International Symposium on Workload Characterization, September 2008.
4. C. Cao Minh, J. Chung, C. Kozyrakis, and K. Olukotun. STAMP: Stanford Transactional Applications for Multi-Processing. In IISWC '08: Proceedings of the IEEE International Symposium on Workload Characterization, September 2008.
5. Maurice Herlihy, Victor Luchangco, Mark Moir, and William N. Scherer III. Software Transactional Memory for Dynamic-Sized Data Structures. Proceedings of the Twenty-Second Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC), pp. 92-101, July 2003.
6. Dave Dice, Ori Shalev and Nir Shavit (September 2006). Transactional Locking II. In DISC '06: Proc. 20th International Symposium on Distributed Computing, pp. 194-208. Springer-Verlag Lecture Notes in Computer Science, volume 4167.
7. Nir Shavit and Dan Touitou (August 1995). Software Transactional Memory. In Proceedings of the 14th ACM Symposium on Principles of Distributed Computing, pp. 204-213.
8. Maurice Herlihy, Victor Luchangco, Mark Moir and William N. Scherer III (July 2003). Software Transactional Memory for Dynamic-Sized Data Structures. In PODC '03: Proc. 22nd ACM Symposium on Principles of Distributed Computing, pp. 92-101.
9. Christopher Cole and Maurice Herlihy (July 2004). Snapshots and Software Transactional Memory. In Proceedings of the ACM PODC Workshop on Concurrency and Synchronization in Java Programs.
10. Dmitri Perelman and Idit Keidar (August 2009). On Avoiding Spare Aborts in Transactional Memory. In SPAA '09: Proc. 21st Symposium on Parallelism in Algorithms and Architectures.
11. Mohammad Ansari, Mikel Luján, Christos Kotselidis, Kim Jarvis, Chris Kirkham, and Ian Watson. Steal-on-Abort: Improving Transactional Memory Performance through Dynamic Transaction Reordering. The University of Manchester.
12. Yehuda Afek, Guy Korland, and Arie Zilberstein. Lowering STM Overhead with Static Analysis. LCPC '10.
13. Guy Korland, Nir Shavit and Pascal Felber. "Noninvasive Concurrency with Java STM". MultiProg '10, Pisa, Italy.
14. http://www.deucestm.org/
15. D. Perelman, A. Bishevsky, O. Litmanovich, and I. Keidar. SMV: Selective Multi-Versioning STM.
9. Index A – Class List
Package analyzer:
analyzer.AbortsAnalyzer – Searches for cycles in the precedence graph to check the necessity of aborts.
analyzer.Analyzer – Holds basic transaction statistics and information, as well as the abort ratio and wasted reads.
analyzer.ObjectHandle – Holds data of a certain object and its versions.
analyzer.ObjectVersion – Holds data of a certain version of a single object.
analyzer.RunDescriptor – Holds the precedence graph.
analyzer.TxnDsc – Holds data of a single transaction.

Package logger:
logger.Logger – Interface for the logging-actions class.
logger.LoggerFactory – Interface for a Logger generator.
logger.AsyncLogger – Implements Logger, using queues.
logger.AsyncLogger.AsyncLoggerFactory – Implements LoggerFactory; generates AsyncLoggers.
logger.BackendCollector – Collects LogLines in their order, looping over all Loggers.
logger.LogCreator – Writes LogLines to the XML log file.
logger.LogLineHolder – Holds a LogLine and a number representing the LogLine's time of occurrence.
logger.QueueManager – Holds the Loggers and allows the BackendCollector to loop over them.
Package loggingTL2:
loggingTL2.BloomFilter – The Bloom filter used by the TL2 algorithm.
loggingTL2.Context – The modified Context class, with logging functions.
loggingTL2.LockProcedure – The LockProcedure class used by the TL2 algorithm.
loggingTL2.LockTable – The lock array used to hold object locks, with changes to throw exceptions and return versions.
loggingTL2.LockVersionException – An exception indicating an abort caused by access to an object whose version is higher than the read version.
loggingTL2.ObjectLockedException – An exception indicating an abort caused by an unacquired lock.
loggingTL2.ReadSet – The ReadSet class used by the TL2 algorithm.
loggingTL2.WriteSet – The WriteSet class used by the TL2 algorithm.

Package notLoggingTL2:
notLoggingTL2.BloomFilter – The Bloom filter used by the TL2 algorithm.
notLoggingTL2.Context – The modified Context class, with Counter access.
notLoggingTL2.Counter – A counter holding integers to count aborts and transactions.
notLoggingTL2.LockProcedure – The LockProcedure class used by the TL2 algorithm.
notLoggingTL2.LockTable – The lock array used to hold object locks, unmodified.
notLoggingTL2.ReadSet – The ReadSet class used by the TL2 algorithm.
notLoggingTL2.WriteSet – The WriteSet class used by the TL2 algorithm.
Package ofersParser:
ofersParser.SingleLineFactory – An interface for LogLine-generating classes.
ofersParser.FormatErrorException – An exception indicating an error in the log format.
ofersParser.LogLine – An abstract class holding all the data in a generic log line.
ofersParser.LogLineFactory – A class that calls the matching SingleLineFactory for each log line in the log.
ofersParser.LogLineVisitor – An interface used to implement the visitor design pattern over LogLines.
ofersParser.Log_CommitLine – A LogLine representing a commit attempt.
ofersParser.Log_ReadLine – A LogLine representing a read attempt.
ofersParser.Log_StartLine – A LogLine representing the start of a transaction.
ofersParser.Log_WriteLine – A LogLine representing a write attempt.
ofersParser.Main – The main class, calling the LogLineFactory and the analyzer for analysis.
ofersParser.ReadLineFactory – A SingleLineFactory that generates Log_ReadLines from read lines in the log.
ofersParser.StartLineFactory – A SingleLineFactory that generates Log_StartLines from start lines in the log.
ofersParser.WriteLineFactory – A SingleLineFactory that generates Log_WriteLines from write lines in the log.

Package tests.avltree:
tests.avltree.AVLNode – A node in the tree.
tests.avltree.AVLTree – A self-balancing binary tree with Atomic methods.
tests.avltree.Tree – An interface for a search tree.