
Module 15: Correlation
Assignment: 15.2, 15.7
March 13, 2009
1 Correlation
Today we look at the correlation between two random variables. The correlation
coefficient is a number between -1 and 1 that measures how strongly two variables are
linearly related. PROC CORR produces descriptive statistics for each variable listed in
the VAR statement and the correlation coefficient for every pair of those variables. It
also computes a p-value for testing whether the true population correlation ρ = 0.
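For reference, here is a sketch of the quantities involved. It assumes PROC CORR is run with its default Pearson product-moment correlation; the handout does not name the statistic. The symbols r, n, x_i, y_i and t are notation introduced here for illustration. For n paired observations (x_i, y_i), the sample correlation and the t statistic usually used to test H0: ρ = 0 can be written as:

```latex
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \sim t_{n-2}
\quad \text{under } H_0:\ \rho = 0 .
\]
```

Values of r near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one. A small p-value is evidence against ρ = 0.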
Example 15.1
We want to see if there is a relationship between test grades in the file grades.dat.
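filename datain 'Grades.dat';

/* Read the grade records; id and gender are character variables */
data one;
   infile datain;
   input id $ gender $ class quiz exam1 exam2 lab final;
run;

/* Sort by gender so that BY-group processing can be used below */
proc sort data=one;
   by gender;
run;

/* Correlations among the three test scores, separately for each gender */
proc corr data=one;
   var exam1 exam2 final;
   by gender;
run;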
As the output page shows, PROC CORR prints descriptive statistics for each variable, as
well as the correlation coefficient between each pair of variables, separately for each
gender. The correlations between the variables appear to be positive. Notice that we first
used PROC SORT to sort the data by gender, because a BY statement expects the data set to
be sorted by the BY variable. We then added a 'by gender' statement after the 'var' line
in PROC CORR.