Laboratory 8

Laboratory Nine – cache and performance improvement
– 23 August 2002 – y k choi
Objective
This laboratory has two purposes: to study the caching effect in more detail, and to use
profiling to measure and improve a program's performance. The exercise is taken
from exercise 4, CTE.
Note that for the Pentium III:
Cache Structure:
32 KB split cache: 16 KB (16384 bytes) data, 16 KB instructions
4-way set associative
512 lines (in each 16 KB half)
32 bytes per line (per cache line)
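The figures above fix the cache geometry: 16384 bytes / 32 bytes per line gives 512 lines, and 512 lines / 4 ways gives 128 sets. The following is a minimal sketch of that arithmetic. The names (kCacheBytes, cache_set_index, floats_per_line) are hypothetical, and the indexing rule "line number modulo number of sets" is the usual textbook assumption, not something this handout states.

```cpp
#include <cstddef>
#include <cstdint>

// Pentium III data cache parameters quoted in the handout.
constexpr std::size_t kCacheBytes = 16384; // 16 KB data cache
constexpr std::size_t kLineBytes  = 32;    // bytes per cache line
constexpr std::size_t kWays       = 4;     // 4-way set associative

constexpr std::size_t kLines = kCacheBytes / kLineBytes; // 512 lines
constexpr std::size_t kSets  = kLines / kWays;           // 128 sets

// Hypothetical helper: the set a byte address maps to, assuming the
// set index is (line number) modulo (number of sets).
constexpr std::size_t cache_set_index(std::uintptr_t address) {
    return (address / kLineBytes) % kSets;
}

// Hypothetical helper: how many floats share one cache line
// (8 when sizeof(float) is 4, as on the Pentium III).
constexpr std::size_t floats_per_line() {
    return kLineBytes / sizeof(float);
}
```

Under these assumptions, two addresses that are 4096 bytes apart (128 sets x 32 bytes) map to the same set, which is why the stride between matrix rows can matter as much as the total amount of data touched.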
Procedure 1 – Caching
Activity 1 – 20 minutes
You should run the program three times and take the third result, as it is more accurate.
This program uses a square block to maximize the performance.
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <minmax.h>
// this program uses a SQUARE block size to maximise
// the use of cache memory
void main(){
    float a[250][250], b[250][250], c[250][250];
    int i, j, k, kk, jj;
    float r;
    int BLKSZE = 8; // one block row is 8 x 4 bytes (float) = 32 bytes
    for (i = 0; i < 250; i++){
        for (kk = 0; kk < 250; kk += BLKSZE)
            for (k = kk; k < min(kk + BLKSZE, 250); k++) {
                r = a[i][k];
                for (jj = 0; jj < 250; jj += BLKSZE)
                    for (j = jj; j < min(jj + BLKSZE, 250); j++)
                        c[i][j] += r * b[k][j];
            }
    }
}
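The advice to run three times and keep the third result can also be applied when timing a kernel by hand, outside the profiler. The sketch below is only an illustration: time_last_run_ms is a hypothetical helper, and it uses std::chrono from standard C++ rather than anything the lab environment provides.

```cpp
#include <chrono>

// Hypothetical helper: calls kernel() `runs` times and returns the
// wall-clock time of the final call in milliseconds. The earlier calls
// act as warm-up, so caches and page tables are already populated.
template <typename Kernel>
double time_last_run_ms(Kernel kernel, int runs = 3) {
    double ms = 0.0;
    for (int run = 0; run < runs; run++) {
        auto start = std::chrono::steady_clock::now();
        kernel();
        auto stop = std::chrono::steady_clock::now();
        // Overwritten on every iteration, so only the last run is kept.
        ms = std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return ms;
}
```

A call such as time_last_run_ms([&] { /* the loop nest */ }) would wrap the multiplication above; the profiler figures requested below remain the measurement the lab asks for.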
Write down the following figures:
        Func          Func+Child           Hit
        Time   %         Time      %      Count  Function
---------------------------------------------------------
Now change the block size to 16 bytes (BLKSZE = 4; compare this with the line size of the cache).
Write down the following figures:
        Func          Func+Child           Hit
        Time   %         Time      %      Count  Function
---------------------------------------------------------
Now change the block size to 64 bytes (BLKSZE = 16; compare this with the line size of the cache).
Write down the following figures:
        Func          Func+Child           Hit
        Time   %         Time      %      Count  Function
---------------------------------------------------------
Now change the block size to 96 bytes (BLKSZE = 24; compare this with the line size of the cache).
Write down the following figures:
        Func          Func+Child           Hit
        Time   %         Time      %      Count  Function
---------------------------------------------------------

BLKSZE    Expected cache line (bytes)    Real time
4         16
8         32
16        64
24        96
Looking at the above table, what is your conclusion? (Which block size performs best?)
One mark______
Activity 2 – Consider the following program, in which two loops form a square tile [15 minutes]
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <minmax.h>
// this program uses a SQUARE block size to maximise
// the use of cache memory
void main(){
    float a[250][250], b[250][250], c[250][250];
    int i, j, k, kk, jj;
    float r;
    int BLKSZE = 4;
    for (kk = 0; kk < 250; kk += BLKSZE)
        for (jj = 0; jj < 250; jj += BLKSZE)
            for (i = 0; i < 250; i++){
                for (k = kk; k < min(kk + BLKSZE, 250); k++) {
                    r = a[i][k];
                    for (j = jj; j < min(jj + BLKSZE, 250); j++)
                        c[i][j] += r * b[k][j];
                }
            }
}
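When sweeping BLKSZE it is worth confirming that the reordered loops still compute the same product, especially for block sizes that do not divide the matrix size. The sketch below assumes a portable rewrite with std::vector and std::min in place of the stack arrays and minmax.h; Matrix, multiply_naive, multiply_tiled and tiled_matches_naive are hypothetical names, not part of the lab code.

```cpp
#include <algorithm>
#include <vector>

using Matrix = std::vector<float>; // row-major n x n (hypothetical helper type)

// Reference version: plain i-k-j triple loop with no blocking.
void multiply_naive(const Matrix& a, const Matrix& b, Matrix& c, int n) {
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {
            float r = a[i * n + k];
            for (int j = 0; j < n; j++)
                c[i * n + j] += r * b[k * n + j];
        }
}

// Same loop order as Activity 2: kk and jj select a square tile of b.
void multiply_tiled(const Matrix& a, const Matrix& b, Matrix& c, int n, int blksze) {
    for (int kk = 0; kk < n; kk += blksze)
        for (int jj = 0; jj < n; jj += blksze)
            for (int i = 0; i < n; i++)
                for (int k = kk; k < std::min(kk + blksze, n); k++) {
                    float r = a[i * n + k];
                    for (int j = jj; j < std::min(jj + blksze, n); j++)
                        c[i * n + j] += r * b[k * n + j];
                }
}

// Fills a and b with small integers so every partial sum is exact in
// float, then checks that both versions give identical results.
bool tiled_matches_naive(int n, int blksze) {
    Matrix a(n * n), b(n * n), c_ref(n * n, 0.0f), c_tiled(n * n, 0.0f);
    for (int i = 0; i < n * n; i++) {
        a[i] = static_cast<float>(i % 3);
        b[i] = static_cast<float>(i % 5);
    }
    multiply_naive(a, b, c_ref, n);
    multiply_tiled(a, b, c_tiled, n, blksze);
    return c_ref == c_tiled;
}
```

Unlike the lab listing, this sketch initialises its inputs and zeroes c before accumulating, which is what makes the comparison meaningful.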
Write down the following figures:
        Func          Func+Child           Hit
        Time   %         Time      %      Count  Function
---------------------------------------------------------
At this point, you should choose BLKSZE so that a BLKSZE x BLKSZE submatrix of b and a row of length BLKSZE of c can fit in the cache. Now complete the following table.

BLKSZE       4      8      12      16      32
TIME (ms)

One mark______
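The sizing rule for BLKSZE can be checked with a little arithmetic: a BLKSZE x BLKSZE tile of b plus a BLKSZE-long row of c occupies (BLKSZE * BLKSZE + BLKSZE) * 4 bytes. The sketch below is a rough capacity check only. The helper names are hypothetical, and it ignores the elements of a as well as set conflicts caused by the 1000-byte stride between matrix rows.

```cpp
#include <cstddef>

// 16 KB data cache, as quoted for the Pentium III in this handout.
constexpr std::size_t kDataCacheBytes = 16384;

// Bytes needed for one BLKSZE x BLKSZE tile of b plus one row of
// BLKSZE elements of c (4 bytes each when sizeof(float) is 4).
constexpr std::size_t tile_working_set_bytes(std::size_t blksze) {
    return (blksze * blksze + blksze) * sizeof(float);
}

// Capacity check only: true when the working set is no larger than
// the data cache. Associativity conflicts are not modelled.
constexpr bool tile_fits_in_cache(std::size_t blksze) {
    return tile_working_set_bytes(blksze) <= kDataCacheBytes;
}
```

By this estimate every block size in the table fits comfortably (BLKSZE = 32 needs 4224 bytes), so differences in the measured times are more likely to come from loop overhead and line utilisation than from running out of cache capacity.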
Procedure 2 – one hour and twenty minutes
Log in to the system and follow the instructions.
This is from CTE exercise 4.
Objective:
You should make a new version of substitute.exe and demonstrate, using profiling output,
that it runs faster. You should be able to obtain at least a factor of 2 speedup (old run
time divided by new run time). You do not have to use Microsoft Foundation Class
objects, but given that these are well-written and probably correct, you should only
replace code that is doing unnecessary work as reflected in profiler measurements.
Procedure
1) Download the code again, as you did last week, and measure the time it takes.
2) There are many ways to modify the code to make it run faster. You will be given
one mark if you find one, and two marks if you find a second.
You have to measure the original program without modification and then the version with
your modification.
One mark______, find one method and demonstrate it to me
One mark_____, find another method and demonstrate it to me
/* substitute -- substitute strings in a list of files
This program operates on a set of files listed on the command line. The first file
specifies a list of string substitutions to be performed on the remaining
files. The list of string substitutions has the form:
"string 1" "replacement 1"
"string 2" "replacement 2"
...
If a string contains a double quote character or a backslash character, escape the
character with backslash: "\"" denotes the string with one double
quote character. "\\" contains one backslash. Each file is searched for instances of
"string 1". Any occurrences are replaced with "replacement 1".
In a similar manner, all "string 2"s are replaced with "replacement 2"s, and so on.
The results are written to the input file. Be sure to keep a backup of files if you do not
want to lose the originals when you run this program.
*/
#include "afx.h"
#include "iostream.h"
// parse a quoted string from buffer
// return final index in string
int parse1(CString *buffer, int start, CString *str)
{
    // look for initial quote:
    int i = buffer->Find("\"", start);
    if (i != -1) {
        // copy to result string
        str->Empty();
        int j = 0; // index into str
        i++; // skip over the opening double-quote
        // scan and copy up to the closing double-quote:
        while ((*buffer)[i] != 0) {
            if ((*buffer)[i] == '\\') {
                // read next char to see what to do
                i++;
                if ((*buffer)[i] == 0) {
                    break; // dangling backslash at end of line
                }
                str->Insert(j++, CString((*buffer)[i]));
            } else if ((*buffer)[i] == '\"') {
                return i + 1;
            } else {
                // ordinary character: copy it once
                str->Insert(j++, CString((*buffer)[i]));
            }
            i++;
        }
    }
    return -1;
}
// parse two quoted strings from buffer; return false on failure
bool parse(CString *buffer, CString *pattern, CString *replacement)
{
    int start = parse1(buffer, 0, pattern);
    if (start < 0) {
        return false;
    }
    start = parse1(buffer, start, replacement);
    return (start >= 0);
}
// modify the code here for the first mark
void substitute(CString *data, CString *pattern, CString *replacement)
{
    int loc;
    // find every occurrence of pattern:
    for (loc = data->Find(*pattern, 0); loc >= 0;
         loc = data->Find(*pattern, 0)) {
        // delete the pattern string from loc:
        data->Delete(loc, pattern->GetLength());
        // (somewhere here)
        // insert each character of the replacement string:
        for (int i = 0; i < replacement->GetLength(); i++) {
            data->Insert(loc + i, (*replacement)[i]);
        }
    }
}
void do_substitutions(CString *data, CString *subs_filename)
{
    TRY {
        CStdioFile file(*subs_filename, CFile::modeRead);
        while (true) {
            CString buffer; // holds line from file
            CString pattern;
            CString replacement;
            file.ReadString(buffer);
            // handle end of file
            if (buffer.GetLength() == 0) break;
            if (parse(&buffer, &pattern, &replacement)) {
                substitute(data, &pattern, &replacement);
            } else {
                cout << "Bad pattern/replacement line: " << buffer << endl;
                return;
            }
        }
    }
    CATCH(CFileException, e ) {
        cout << "File could not be opened or read " << e->m_cause << endl;
    }
    END_CATCH
}
void process_file(CString *filename, CString *subs_filename)
{
    // read in filename to a CString
    TRY {
        CFile file(*filename, CFile::modeRead);
        int size = file.GetLength();
        // read the data, allocate more than we need
        char *data = new char[size + 16];
        file.Read(data, size);
        // files are not zero-terminated but string should be:
        data[size] = 0;
        // now we can make a CString from the data:
        CString content(data);
        delete[] data; // data is no longer needed (array form matches new[])
        do_substitutions(&content, subs_filename);
        // write the data
        file.Close();
        file.Open(*filename, CFile::modeWrite);
        file.Write(content, content.GetLength());
        file.Close();
    }
    CATCH(CFileException, e ) {
        cout << "File could not be opened or read " <<
            e->m_cause << " " << *filename << endl;
    }
    END_CATCH
}
int main(int argc, char *argv[])
{
    if (argc < 3) {
        cout << "Not enough input arguments" << endl;
        cout << "Usage: substitute subs-file src1 src2 ..." << endl;
    } else {
        CString subs_filename(argv[1]);
        for (int i = 2; i < argc; i++) {
            CString filename(argv[i]);
            process_file(&filename, &subs_filename);
        }
    }
    return 0;
}
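The substitution-file format described in the header comment (quoted strings, with backslash escaping a double quote or a backslash) can be exercised without MFC. The sketch below is a hypothetical standard C++ counterpart of parse1 that uses std::string; parse_quoted and its parameters are illustrative names and are not part of substitute.exe.

```cpp
#include <cstddef>
#include <string>

// Hypothetical counterpart of parse1: reads one quoted string from
// `line`, starting the search at `start`. On success, stores the
// unescaped text in `out`, sets `next` to the index just past the
// closing quote, and returns true.
bool parse_quoted(const std::string& line, std::size_t start,
                  std::string& out, std::size_t& next) {
    std::size_t i = line.find('"', start);
    if (i == std::string::npos) return false; // no opening quote
    out.clear();
    for (i++; i < line.size(); i++) {
        char ch = line[i];
        if (ch == '\\') {
            // A backslash escapes the character that follows it.
            if (++i == line.size()) return false; // dangling backslash
            out.push_back(line[i]);
        } else if (ch == '"') {
            next = i + 1; // index just past the closing quote
            return true;
        } else {
            out.push_back(ch);
        }
    }
    return false; // no closing quote
}
```

Calling it twice on one line, the second time starting at `next`, yields the pattern and the replacement in the same way that parse calls parse1 twice.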