
Assignment 1: Review of two research papers (done individually)
Read the papers carefully and write a personally written review on each paper based on the issues
below (about A4 page normal sized text per paper, 300 - 700 words). The analysis must be expressed
in your own words.
PAPER 1
Title: Worst-Case Execution Time Analysis for Dynamic Branch Predictors
Authors: Iain Bate and Ralf Reutemann
Department of Computer Science, University of York, York, United Kingdom
e-mail: {ijb,ralf}@cs.york.ac.uk

Is the paper well organized?
The paper is well organized. The abstract clearly states that branch prediction is commonly used in modern microprocessors and that it creates predictability problems, such as deciding what should be predicted first. The introduction is also clear and briefly describes worst-case execution time (WCET) analysis and how branch prediction occurs.
The paper then describes the branch prediction techniques in detail and how they work. The facts and figures are understandable and clear, and the mathematical notation and algorithms capture the efficiency of each prediction scheme. The conclusion covers the related work and the comparison, and states which prediction algorithm is suitable for which conditions.
The references, drawn from scientific papers, books, and workshops, are well presented.
Comment on the following sections (if present):

Title
As far as the title of this scientific paper is concerned, it suggests a general comparison of all branch prediction methods rather than dynamic branch prediction in particular. A more accurate title might therefore be "Worst-Case Execution Time Analysis for Branch Predictors".

Abstract
The abstract is well written: it identifies the central problem with branch predictors, namely predictability, and states what the authors are going to do, which is to compare bimodal and global-history branch prediction. It also mentions that whether a branch is easy or hard to predict depends on the semantic context of the branch in the source code.
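To make the comparison concrete, here is a minimal sketch of the bimodal scheme the authors analyse (my own illustration, not code from the paper): a table of 2-bit saturating counters indexed by low-order bits of the branch address.

```python
# Toy bimodal branch predictor: one 2-bit saturating counter per table
# entry, indexed by the low bits of the branch address (a common textbook
# formulation; table size and initial state are my assumptions).

class BimodalPredictor:
    def __init__(self, table_bits=4):
        self.mask = (1 << table_bits) - 1
        self.counters = [2] * (1 << table_bits)  # counters in 0..3, start weakly taken

    def predict(self, pc):
        return self.counters[pc & self.mask] >= 2  # True = predict taken

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

# A loop branch taken many times in a row is mispredicted a bounded number
# of times per run of the loop, which is what makes worst-case analysis of
# such predictors tractable.
p = BimodalPredictor()
outcomes = [True] * 8 + [False]  # loop body taken 8 times, then loop exit
mispredictions = 0
for taken in outcomes:
    if p.predict(0x40) != taken:
        mispredictions += 1
    p.update(0x40, taken)
print(mispredictions)  # → 1 (only the loop-exit branch is mispredicted)
```

The bounded, state-dependent misprediction count is exactly the kind of behaviour the paper's static WCET analysis has to model.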

Introduction
The introduction is well written and clear. It states the main requirement behind the problem, namely that all tasks must be able to meet their respective deadlines, and notes that many commercially available microprocessors are not designed for real-time systems, so their branch prediction complicates schedulability analysis. It also describes how a branch predictor operates.

Main section(s)
The main sections cover the related work, the comparison, the mathematical analysis, and the techniques and statistics of the different branch prediction algorithms. They also describe the statistical behaviour of the different branch prediction techniques.

Summary
The paper concludes that static WCET analysis of dynamic branch predictors is feasible, and states which branch predictors are better suited to which conditions for achieving better worst-case performance.

Conclusions, and
The conclusion lays out the foundations of static analysis for branch predictors. It also points to the referenced work in which the models of the different branch prediction techniques are described.

References.
The references are helpful and informative for finding other papers and their ideas. There are also references to workshops and books, which are useful for finding more related topics and perspectives.

Comment on the language used in the paper.
The language is fine and clear to understand; especially the facts and figures are informative and easy to follow.

General comments to the paper.
Overall the paper is informative, with good links between each section and the next. The language is technical and mathematical, but as a scientific paper it should have this technicality. The paper is aimed at readers who already have sufficient knowledge of the field.
PAPER 2
Title: Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers
Author: Norman P. Jouppi
Digital Equipment Corporation Western Research Lab
100 Hamilton Ave., Palo Alto, CA 94301
Is the paper well organized?
Having studied the whole paper, I find it well organized, although some of the descriptions and explanations could have been fuller. Overall the work lives up to its title, "Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers", which is an interesting topic.
Title
The title of this IEEE paper is good because it describes the whole idea of the paper: from the title alone a reader can understand the purpose and the technology behind it. It could perhaps carry a little more information about the topic, e.g. "Improving Direct-Mapped Cache Performance Controlled by the Addition of a Small Fully-Associative Cache and Prefetch Buffers".
Abstract
In the abstract the author introduces miss caching, victim caching, stream buffers, and multi-way stream buffers, and explains how these techniques are used to reduce cache conflicts. He also mentions how stream buffering fits into a pipelined machine. Multi-way stream buffers replace the single stream buffer to decrease the conflict-miss rate further, and with these techniques we get better processing speed.
Introduction
In the introduction the author makes a comparison across different kinds of machines, such as the VAX 11/780 and the WRL Titan. According to Table 1-1, as machines become faster (fewer cycles per instruction and shorter cycle times, while memory access time improves more slowly), the cost of a cache miss, measured in cycles and especially in instructions, grows steeply. In other words, when the number of cycles per instruction is high, the relative cost of a miss in instructions is low, and vice versa.
Main section(s)
2. Baseline Design
In the baseline section the author explains how to increase the speed of data processing, taking CMOS and GaAs technologies into consideration when discussing large and small caches. The baseline design is shown in Fig. 2-1, and for further elaboration the test programs are listed in Table 2-1.
3. Reducing Conflict Misses: Miss Caching and Victim Caching
Cache misses fall into four categories: conflict, compulsory, capacity, and coherence. More detail is given below.
3.1. Miss Caching
In this section the author discusses the worst, average, and best cases for avoiding cache conflicts. For example, assuming at least 60 different instructions are executed in each procedure, the conflict misses would span more than the 15 lines in the maximum-size miss cache tested. In other words, a small miss cache could not contain the entire overlap and so would be reloaded repeatedly before it could be used. This type of reference pattern exhibits the worst miss cache performance, as shown in Fig. 3-3.
3.2. Victim Caching
A victim cache improves on the miss cache: instead of holding a copy of the missed line, it holds the line evicted from the direct-mapped cache (the victim), so it never duplicates the contents of the main cache. We can therefore say that victim caching is the main step in reducing cache conflicts.
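As a hedged illustration of the idea (my own toy model, not the paper's implementation), the following sketch shows how a small fully-associative victim cache turns the conflict misses of two lines that map to the same direct-mapped set into victim hits:

```python
from collections import OrderedDict

# Toy direct-mapped cache backed by a tiny fully-associative victim cache
# holding recently evicted lines (sizes and replacement policy are my
# assumptions; Jouppi's hardware details differ).

class VictimCachedDM:
    def __init__(self, sets=4, victim_entries=2):
        self.sets = sets
        self.lines = [None] * sets            # direct-mapped: one tag per set
        self.victims = OrderedDict()          # fully associative, LRU order
        self.victim_entries = victim_entries

    def access(self, addr):
        """Return 'hit', 'victim-hit', or 'miss' for a line address."""
        idx, tag = addr % self.sets, addr
        if self.lines[idx] == tag:
            return "hit"
        evicted = self.lines[idx]
        self.lines[idx] = tag                 # fetch the requested line
        if tag in self.victims:               # conflict miss saved by victim cache
            del self.victims[tag]
            result = "victim-hit"
        else:
            result = "miss"
        if evicted is not None:               # the victim goes to the victim cache
            self.victims[evicted] = True
            if len(self.victims) > self.victim_entries:
                self.victims.popitem(last=False)  # drop least recently added
        return result

# Two lines that map to the same set would ping-pong in a plain
# direct-mapped cache; with the victim cache, only the first two
# references go to memory.
c = VictimCachedDM()
print([c.access(a) for a in (0, 4, 0, 4, 0)])
# → ['miss', 'miss', 'victim-hit', 'victim-hit', 'victim-hit']
```

In a real design a victim hit swaps the two lines in roughly one cycle, which is why it removes conflict misses so cheaply.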
3.3. The Effect of Direct-Mapped Cache Size on Victim Cache Performance
As the number of victim cache entries increases, more conflict misses are removed; this section examines how the size of the direct-mapped cache affects how much benefit the victim cache provides.
3.4. The Effect of Line Size on Victim Cache Performance
Victim cache performance also depends on line size: as the line size increases, conflict misses become more frequent, and the victim cache removes a larger share of them.
3.5. Victim Caches and Second-Level Caches
The study applies victim caches at the first level of the memory hierarchy; the section discusses how the technique relates to second-level caches, which sit between the first-level cache and main memory.
4. Reducing Capacity and Compulsory Misses
There are three kinds of prefetch algorithm for reducing capacity and compulsory misses: prefetch on miss, tagged prefetch, and prefetch always. With a second-level cache behind the first-level cache, prefetching can give the tagged scheme a useful head start on supplying instructions.
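As a rough illustration (my own sketch, under simplifying assumptions: one line per reference and an unbounded cache), the difference between prefetch on miss and tagged prefetch shows up clearly on a purely sequential stream: tagged prefetch keeps running ahead of the program, while prefetch on miss stalls on every other line.

```python
# Toy comparison of "prefetch on miss" vs "tagged prefetch" on a stream
# of sequential line addresses. Tagged prefetch marks prefetched lines;
# the first reference to a tagged line triggers the next prefetch.

def run(addresses, scheme):
    cache, tagged_lines, misses = set(), set(), 0

    def prefetch(line):
        if line not in cache:
            cache.add(line)
            tagged_lines.add(line)      # prefetched but not yet referenced

    for a in addresses:
        if a not in cache:
            misses += 1                 # demand miss: fetch line, prefetch next
            cache.add(a)
            prefetch(a + 1)
        elif a in tagged_lines and scheme == "tagged":
            prefetch(a + 1)             # first use of a prefetched line
        tagged_lines.discard(a)
    return misses

stream = list(range(16))                # sequential line references
print(run(stream, "on-miss"), run(stream, "tagged"))  # → 8 1
```

With prefetch on miss, only a miss triggers a prefetch, so a sequential stream misses on every second line; tagged prefetch misses once and then stays ahead, which matches the paper's motivation for the tagged scheme.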
4.1. Stream Buffers
A stream buffer is mainly used to start prefetching earlier than tagged prefetch can. The sequential stream buffer is shown in Fig. 4-2: each entry holds a line together with its tag and address. Because it prefetches ahead, it makes good use of the available second-level bandwidth and so reduces misses.
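The mechanism can be sketched as follows (a toy model of my own, assuming unit-stride line addresses and a single buffer of depth four): on a true miss the buffer is reallocated at the missing line and begins prefetching its successors; subsequent sequential misses are then serviced from the head of the buffer.

```python
from collections import deque

# Toy single stream buffer: a FIFO of prefetched line addresses. A hit at
# the head shifts the buffer and prefetches the next sequential line.

class StreamBuffer:
    def __init__(self, depth=4):
        self.depth = depth
        self.lines = deque()

    def allocate(self, line):
        # On a true miss, restart prefetching at the lines after `line`.
        self.lines = deque(range(line + 1, line + 1 + self.depth))

    def lookup(self, line):
        # Only the head entry is checked, as in the basic sequential design.
        if self.lines and self.lines[0] == line:
            self.lines.popleft()
            self.lines.append(self.lines[-1] + 1 if self.lines else line + 1)
            return True
        return False

# Sequential fetch: only the very first reference goes all the way to
# memory; the stream buffer services the rest.
cache, buf, misses_to_memory = set(), StreamBuffer(), 0
for a in range(8):
    if a in cache:
        continue
    if buf.lookup(a):                   # serviced from the stream buffer
        cache.add(a)
    else:
        misses_to_memory += 1           # true miss: fetch and reallocate
        cache.add(a)
        buf.allocate(a)
print(misses_to_memory)  # → 1
```

Checking only the head entry keeps the hardware simple; the multi-way variant discussed next replicates this structure so several interleaved streams can be followed at once.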
4.2. Multi-Way Stream Buffers
A multi-way stream buffer is a technique that reduces misses still further (improving the reduction from 7% to 60% in the cases reported), particularly for references such as matrix operations that interleave several streams, as shown in Fig. 4-4.
4.3. Stream Buffer Performance vs. Cache Size
The instruction stream buffers have remarkably constant performance over a wide range of cache sizes. The data stream buffer performance generally improves as the cache size increases; this is especially true for the single stream buffer, which is more sensitive to the conflict misses that smaller caches produce.
4.4. Stream Buffer Performance vs. Line Size
The single data stream buffer performance is especially hard hit compared to the multi-way stream
buffer because of the increase in conflict misses at large line sizes.
Summary
Miss caching targets cache conflicts. In the baseline section the author explains how to increase the speed of data processing. Misses fall into four categories: conflict, compulsory, capacity, and coherence. A small miss cache could not contain the entire overlap of a conflicting reference pattern and so would be reloaded repeatedly before it could be used. A victim cache improves on the miss cache by storing evicted lines instead of copies of missed lines. A stream buffer is mainly used to start prefetching before tagged prefetch can take effect, and a multi-way stream buffer reduces misses further still (from a 7% to a 60% reduction) for interleaved reference streams.
Conclusions
Victim caches are an improvement to miss caching that saves the victim of the cache miss instead of the
target in a small associative cache. Victim caches are even more effective at removing conflict misses
than miss caches. Multi-way stream buffers are a set of stream buffers that can prefetch down several
streams concurrently. Multi-way stream buffers are useful for data references that contain interleaved
accesses to several different large data structures, such as in array operations. This study has
concentrated on applying victim caches and stream buffers to first-level caches. An interesting area for
future work is the application of these techniques to second-level caches. Also, the numeric programs
used in this study used unit stride access patterns. Numeric programs with non-unit stride and mixed
stride access patterns also need to be simulated. Finally, the performance of victim caching and stream
buffers needs to be investigated for operating system execution and for multiprogramming workloads.
References
The references are good and precise, and they adequately cover the requirements of the topic.
Comment on the language used in the paper.
The language used in the paper is technical yet reasonably reader-friendly, but it could be improved by elaborating more and providing information about each process.