UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas

SAS -- Scatter Plots (X-Y Graphs) and Correlation Analysis
Proc Gplot and Proc Corr

Proc Gplot -- Scatter Plots in SAS

Scatter plots are graphs that show the corresponding values for two or more variables. For example, a graph that shows the (X,Y) ordered pairs from a data set with variables X and Y is a scatter plot.

Proc Gplot is used to create scatter plots in SAS. For example, if we have two variables GDP and Time in dataset01, we can create a scatter plot with GDP on the vertical (Y) axis and Time on the horizontal (X) axis with the following commands:

    proc gplot data=dataset01;
    plot GDP*Time;
    run;

You can include multiple "plot" statements between the "proc gplot" and "run" commands if you want to make more than one plot. For example, if you had variables GDP, InflationRate and Time in your dataset, you could make several different plots:

    proc gplot data=dataset01;
    plot GDP*Time;
    plot InflationRate*Time;
    plot GDP*InflationRate;
    run;

You can graph two or more Y variables on the same graph with the "Overlay" option. For example, if you want to graph GDP and InflationRate both against Time on the same graph, you can use the following commands:

    proc gplot data=dataset01;
    plot GDP*Time InflationRate*Time / Overlay;
    run;

Proc Corr -- Correlation Analysis in SAS

To do correlation analysis in SAS, use Proc Corr. For example, suppose we have variables “PopCens” and “Age65per10000” in dataset01. To calculate the correlation coefficient r for these variables, use the following SAS commands:

    ods graphics on;
    proc corr data = dataset01 plots=matrix(histogram nvar=all nwith=all);
    var PopCens Age65per10000;
    run;
    ods graphics off;

The “ods graphics on;” and “ods graphics off;” statements turn ODS Graphics on and off. ODS Graphics is the graphics part of SAS’s “output delivery system.” It must be on for Proc Corr to draw the graphs requested by the “plots=” option. Depending on the SAS version and settings, ODS Graphics may already be on by default, in which case the two ods statements are optional, but including them ensures that the graphs are produced.
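If only the correlation table is wanted, and no graphs, the plot request and the ods statements can be left out. A minimal sketch, assuming the same dataset01 and the same two variables as in the example above:

```sas
/* Correlation table only: no plots= option, so no ODS Graphics statements are needed */
proc corr data=dataset01;
   var PopCens Age65per10000;
run;
```

This version still reports r, the p-value, and the number of observations. It simply does not draw any graphs.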
To have SAS run correlations between all pairs in a list of variables, simply list all the variables on the "var" line of the proc corr command. Remember, the variables must be numerical measurement variables.

The key output of Proc Corr is a Correlation Matrix, a table that gives, for every pair of variables selected for analysis: the correlation coefficient (the "r" value), the p-value (for the hypothesis test H0: ρ = 0, H1: ρ ≠ 0), and the number of observations used to calculate the correlation coefficient. The upper-left to lower-right diagonal elements of the table will be exactly 1.00, because any variable is perfectly correlated with itself. Also, the numbers above the diagonal will be the same as the corresponding numbers below the diagonal, so you only need to look at the numbers above the diagonal or the numbers below the diagonal, not both. Finally, if there are missing values in the data, then the number of observations used to calculate each correlation coefficient may vary from one correlation coefficient to the next in the table.

The “plots=matrix(histogram nvar=all nwith=all)” option of Proc Corr causes SAS to create a Scatterplot Matrix: a matrix of graphs, where each graph shows a pair of variables from the “var” list plotted against one another. In addition, the histograms for all variables are shown along the diagonal of the Scatterplot Matrix. Very cool!

IMPORTANT NOTE -- Accessing the Scatterplot Matrix: SAS puts the Scatterplot Matrix in the Results Window of SAS (not the Output Window), in subfolder “Corr: The SAS System”, filename “MatrixPlot.” In SAS, go to the Results Window by clicking on the Results tab at the bottom of the SAS screen. Then, in the Results Window, double-click on the “Corr: The SAS System” folder, then double-click on “MatrixPlot.” SAS will put the Scatterplot Matrix in a new window.
You can enlarge or reduce the MatrixPlot window, and then you can use the “Snip” tool in MS Windows to take a snip of the Scatterplot Matrix and paste it into an MS Word document.
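As a sketch of the multi-variable case described above, the following commands request correlations for every pair among three variables. It assumes, for illustration only, that the GDP, InflationRate and Time variables from the scatter plot examples are numerical variables in dataset01.

```sas
/* Sketch: correlations for all pairs among three variables.        */
/* Assumes GDP, InflationRate and Time are numeric and in dataset01 */
ods graphics on;
proc corr data=dataset01 plots=matrix(histogram nvar=all nwith=all);
   var GDP InflationRate Time;
run;
ods graphics off;
```

With three variables on the "var" line, the Correlation Matrix has three rows and three columns, so it reports three distinct pairwise correlations: GDP with InflationRate, GDP with Time, and InflationRate with Time.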