SAS

UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
SAS -- Scatter Plots (X-Y Graphs) and Correlation Analysis
Proc Gplot and Proc Corr
Proc Gplot -- Scatter Plots in SAS
Scatter Plots are graphs that show the corresponding values for two or more variables. For example, a
graph that shows the (X,Y) ordered pairs from a data set with variables X and Y is a scatter plot.
Proc Gplot is used to create scatter plots in SAS. For example, if we have two variables GDP and Time
in dataset01, we can create a scatter plot with GDP on the vertical (Y) axis and Time on the horizontal
(X) axis with the following commands:
proc gplot data=dataset01;
plot GDP*Time;
run;
You can include multiple "plot" statements between the "proc gplot" and "run" commands if you want to
make more than one plot. For example, if you had variables GDP, InflationRate and Time in your
dataset, you could make several different plots:
proc gplot data=dataset01;
plot GDP*Time;
plot InflationRate*Time;
plot GDP*InflationRate;
run;
You can graph two or more Y variables on the same graph with the "Overlay" option. For example, if
you want to graph GDP and InflationRate both against Time on the same graph, you can use the
following commands:
proc gplot data=dataset01;
plot GDP*Time InflationRate*Time / Overlay;
run;
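With "Overlay", both Y variables share a single vertical axis, which can flatten one series when the
two variables are measured on very different scales (as GDP and an inflation rate usually are). One
possible alternative, sketched below on the assumption that dataset01 holds GDP, InflationRate and
Time, is the "plot2" statement of Proc Gplot, which draws the second variable against a separate
right-hand vertical axis:

```sas
/* Sketch only: assumes dataset01 contains GDP, InflationRate and Time. */
proc gplot data=dataset01;
   plot  GDP*Time;            /* GDP against the left vertical axis            */
   plot2 InflationRate*Time;  /* InflationRate against the right vertical axis */
run;
quit;
```

The "plot2" statement must use the same horizontal (X) variable as the "plot" statement it accompanies.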
Proc Corr -- Correlation Analysis in SAS
To do correlation analysis in SAS, use PROC CORR. For example, suppose we have variables
“PopCens” and “Age65per10000” in dataset01. To calculate the correlation coefficient r for these
variables, use the following SAS commands:
ods graphics on;
proc corr data = dataset01 plots=matrix(histogram nvar=all nwith=all);
var PopCens Age65per10000;
run;
ods graphics off;
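The “ods graphics on;” and “ods graphics off;” statements turn ODS Graphics on and off. “ods
graphics” is the graphics part of SAS’s “output delivery system,” and the “plots=” option of Proc Corr
produces its graphs through it. In some SAS versions and setups ODS Graphics is already on by default,
in which case the ods statements are optional, but including them is harmless and ensures that the
graphs are produced.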
To have SAS run correlations between all pairs in a list of variables, simply list all the variables on the
"var" line of the proc corr command. Remember, the variables must be numerical measurement variables.
The key output of Proc Corr is a Correlation Matrix, a table that gives, for every pair of variables
selected for analysis:
• the correlation coefficient (the "r" values),
• the p-value (for the hypothesis test H0: ρ = 0, H1: ρ ≠ 0), and
• the number of observations used to calculate the correlation coefficient.
The upper-left to lower-right diagonal elements of the table will be exactly 1.00, because any variable is
perfectly correlated with itself. Also, the numbers above the diagonal will be the same as the
corresponding numbers below the diagonal--you only need to look at the numbers above the diagonal or
the numbers below the diagonal, not both. Finally, if there are missing values in the data, then the
number of observations used to calculate each correlation coefficient may vary from one correlation
coefficient to the next in the table.
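As a sketch of both points above (several variables on the "var" line, and missing values), the
commands below request correlations among three variables and add the "nomiss" option of Proc Corr.
With "nomiss", any observation that has a missing value for any variable on the "var" line is dropped
from the whole analysis, so every correlation coefficient in the table is based on the same number of
observations. Note that MedIncome is a hypothetical variable name used only for illustration; it is
not part of the example dataset described in this handout.

```sas
/* Sketch only: MedIncome is a hypothetical third variable, used for illustration. */
proc corr data=dataset01 nomiss;
   var PopCens Age65per10000 MedIncome;  /* correlations for all pairs in this list */
run;
```

Without "nomiss", Proc Corr uses all observations available for each individual pair of variables,
which is why the number of observations can differ across the table.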
The “plots=matrix(histogram nvar=all nwith=all)” option of Proc Corr causes SAS to create a
Scatterplot-Matrix—a matrix of graphs, where each graph shows a pair of variables from the “var” list
plotted against one another. In addition, the histograms for all variables are shown along the diagonal of
the scatterplot matrix. Very cool!
IMPORTANT NOTE (Accessing the Scatterplot Matrix): SAS puts the Scatterplot Matrix in the
Results Window of SAS (not the Output Window), in subfolder “Corr: The SAS System”, filename
“MatrixPlot.” In SAS, go to the Results Window by clicking on the Results tab at the bottom of the SAS
screen. Then, in the Results Window, double-click on the “Corr: The SAS System” folder, then
double-click on “MatrixPlot.” SAS will put the Scatterplot Matrix in a new window. You can enlarge or
reduce the MatrixPlot window, and then you can use the “Snip” tool in MS Windows to take a snip of the
Scatterplot Matrix and paste it into a MS Word document.
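As a possible alternative to snipping, the ODS RTF destination can write the Proc Corr output,
including the graphs, to a file that MS Word can open. The commands below are only a sketch: the file
path is a hypothetical placeholder, so replace it with a folder that exists on your own computer.

```sas
/* Sketch only: the file path below is a hypothetical placeholder. */
ods rtf file="C:\temp\corr_output.rtf";   /* start writing output to an RTF file */
ods graphics on;
proc corr data=dataset01 plots=matrix(histogram nvar=all nwith=all);
   var PopCens Age65per10000;
run;
ods graphics off;
ods rtf close;                            /* finish and close the RTF file */
```

After "ods rtf close;" runs, the RTF file can be opened in MS Word, and the Correlation Matrix and
Scatterplot Matrix can be copied from there.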