Parallel Processing in SAS

CPUCOUNT
A comparison of Proc Means for the Project
THREADS and CPUCOUNT
• SAS has an option to let it create threads, THREADS=YES. This allows SAS to start different threads that can be run on different CPUs.
• SAS also has an option to set the number of available CPUs, for example CPUCOUNT = 2. Here the values CPUCOUNT = 1 to 8 were tried.
• The times to run the following code on Project problem 2 were recorded for the different values of CPUCOUNT:
proc means data=data01 n mean std maxdec=2;
class location;
title 'Mean scores for each location';
var M1 - M8;
run;
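One way the timing runs could be repeated for each CPUCOUNT is sketched below. This is only a sketch, not the code used for the slides: the macro name time_means and its parameter maxcpu are made up for illustration, the threads option is written in its keyword form (THREADS | NOTHREADS), and data01 is assumed to exist already. The real time and cpu time for each run are read from the PROCEDURE MEANS note in the log.

```sas
/* Hypothetical helper: run the same PROC MEANS once per CPUCOUNT value. */
%macro time_means(maxcpu=8);
  %local i;
  %do i = 1 %to &maxcpu;
    /* Allow threading and tell SAS how many CPUs it may assume. */
    options threads cpucount=&i;

    /* Write the current setting to the log next to the timing notes. */
    %put NOTE: CPUCOUNT is now %sysfunc(getoption(cpucount));

    proc means data=data01 n mean std maxdec=2;
      class location;
      title "Mean scores for each location (CPUCOUNT=&i)";
      var M1 - M8;
    run;
  %end;
%mend time_means;

/* Usage: time the step for CPUCOUNT = 1 to 8. */
%time_means(maxcpu=8)
```

Each run may also be affected by file caching, so the first pass through the data can be slower than later ones.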
Code
DM 'LOG;CLEAR;OUT;CLEAR';
OPTIONS THREADS=YES CPUCOUNT=8;
options mlogic mprint mtrace symbolgen;
data data01;
infile 'C:\Documents and Settings\Eric A. Suess\Desktop\Stat6250\data01.txt';
input ID Location M1 M2 M3 M4 M5 M6 M7 M8;
run;
/* Compute the mean and stdev of the measurement. */
proc means data=data01 n mean std maxdec=2;
class location;
title 'Mean scores for each location';
var M1 - M8;
run;
proc sort data=data01;
by location;
run;
proc means data=data01 n mean std maxdec=2;
by location;
title 'Mean scores for each location';
var M1 - M8;
run;
/* Create a new data set using the OUTPUT statement. */
proc means data=data01 noprint nway;
class location;
var M1 - M8;
output out=locationsum
n = n_measurement
mean = m_measurement
std = s_measurement
max = max_measurement
min = min_measurement;
run;
proc print data=locationsum;
title 'Listing of data set locationsum';
run;
proc contents data=data01 ;
run;
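Note on the OUTPUT statement above: each statistic is given a single name while VAR lists eight variables, so PROC MEANS should store each statistic for the first variable (M1) only. If statistics for all of M1 to M8 are wanted, one possible variant uses the AUTONAME option, which builds names such as M1_Mean automatically. This is a sketch under that assumption; the data set name locationsum_all is made up for illustration.

```sas
/* Sketch: keep n, mean, std, max and min for every analysis variable. */
proc means data=data01 noprint nway;
  class location;
  var M1 - M8;
  /* Empty name lists plus AUTONAME let SAS generate one name per     */
  /* variable and statistic instead of naming only the first variable. */
  output out=locationsum_all
         n= mean= std= max= min= / autoname;
run;

proc print data=locationsum_all;
  title 'Listing of data set locationsum_all';
run;
```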
For CPUCOUNT=7
419  /* Compute the mean and stdev of the measurement. */
420  proc means data=data01 n mean std maxdec=2;
421  class location;
422  title 'Mean scores for each location';
423  var M1 - M8;
424  run;
NOTE: There were 3000000 observations read from the data set WORK.DATA01.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           1.64 seconds
      cpu time            3.39 seconds
Times
CPUs    real time (seconds)    cpu time (seconds)
1       5.92                   2.81
2       2.48                   3.56
3       2.03                   3.29
4       1.92                   3.7
5       1.54                   3.42
6       1.62                   3.31
7       1.64                   3.39
8       1.57                   3.2
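The recorded times can be turned into a speedup table. The sketch below assumes the usual definitions, which are not stated on the slides: speedup is the 1-CPU real time divided by the real time with p CPUs, and efficiency is speedup divided by p. The data set and variable names (times, speedup, base_real, gain) are made up for illustration.

```sas
/* Real and cpu times in seconds, copied from the Times table. */
data times;
  input cpus real_time cpu_time;
  datalines;
1 5.92 2.81
2 2.48 3.56
3 2.03 3.29
4 1.92 3.7
5 1.54 3.42
6 1.62 3.31
7 1.64 3.39
8 1.57 3.2
;
run;

data speedup;
  set times;
  retain base_real;
  /* Keep the 1-CPU real time as the baseline for every row. */
  if _n_ = 1 then base_real = real_time;
  speedup    = base_real / real_time;      /* how many times faster than 1 CPU  */
  efficiency = speedup / cpus;             /* speedup per CPU                   */
  gain       = lag(real_time) - real_time; /* seconds saved by adding one CPU   */
  format speedup efficiency gain 6.2;
  drop base_real;
run;

proc print data=speedup noobs;
  title 'Speedup of Proc Means by CPUCOUNT';
run;
```

The gain column is missing for the first row because there is no earlier run to compare with, and it is negative wherever adding a CPU made the real time longer.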
Graph
[Figure: Proc Means times. Real time and cpu time in seconds (vertical axis, 0 to 7) plotted against the number of CPUs (horizontal axis, 1 to 8).]
Question
• For what CPUCOUNT is there the most gain?
• After what number of CPUs is there no clear added advantage?