More about Correlations

Spearman Rank Order Correlation
• Does the same type of analysis as a Pearson r, but with data that represent only order.
  – Ordinal data represent highest to lowest, but give no indication of the distance between ranks.

Spearman correlation cont.
  – With a Spearman rank order correlation, both variables (x and y) are ranked.
  – The correlation then measures the relationship between the rankings.
  – Easier calculation, but less powerful as a statistical test.

Multiple Regression
Correlation: the relationship between two variables.
Regression: what you would predict about the dependent variable, given the independent variable(s).

Since you can have several variables:
• One or more are designated as dependent while all others are independent.
  – The DV is identified based on prior knowledge or expectations.
  – The IVs can be continuous measurements (unlike in an ANOVA).
  – This analysis still does not show causation.

The relationship is defined by:

$$Y' = a + B_1 x_1 + B_2 x_2 + \dots + B_k x_k$$

  – Where:
    • Y' is the predicted value of the DV
    • a is the intercept
    • Each x is an IV
    • Each B is a regression coefficient for a particular IV

Looking at the output
The overall correlation is evaluated with F:

$$F = \frac{MS_{reg}}{MS_{res}}$$

[Figure: diagram of IV1, IV2, IV3, and the DV]

• R - the multiple correlation coefficient, the correlation between the predicted y and the obtained y.
• R² - the proportion of the variation of the DV that is predictable from the regression equation.

Output cont.
• Each IV can be evaluated with a t test based on its regression coefficient.

If:
• cancer deaths,
• % of smokers, and
• % of the population over 75
are used to predict median health care costs…

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .640(a)   .410       .364                **********

a. Predictors: (Constant), deaths due to cancer/100,000, % of smokers, % of population over 75

ANOVA(b)

Model 1      Sum of Squares   df   Mean Square   F       Sig.
Regression   4944948          3    1648316.001   9.029   .000(a)
Residual     7119704          39   182556.515
Total        12064652         42

a. Predictors: (Constant), deaths due to cancer/100,000, % of smokers, % of population over 75
b. Dependent Variable: health cost spent/person - 2000

Coefficients(a)

Model 1                        B (unstd.)   Std. Error   Beta (std.)   t       Sig.
(Constant)                     -473.998     1067.026                   -.444   .659
% of population over 75        148.853      61.315       .339          2.428   .020
% of smokers                   2.186        21.909       .013          .100    .921
deaths due to cancer/100,000   7.484        2.416        .418          3.098   .004

a. Dependent Variable: health cost spent/person - 2000

If:
• # of hospitals and
• # of MDs
are used to predict median health care costs…

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .758(a)   .575       .557                **********

a. Predictors: (Constant), # of MD's/100,000 people, number of hospitals/100000

ANOVA(b)

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   7566999          2    3783499.516   31.763   .000(a)
Residual     5598500          47   119117.014
Total        13165499         49

a. Predictors: (Constant), # of MD's/100,000 people, number of hospitals/100000
b. Dependent Variable: health cost spent/person - 2000

Coefficients(a)

Model 1                      B (unstd.)   Std. Error   Beta (std.)   t       Sig.
(Constant)                   1746.150     289.435                    6.033   .000
number of hospitals/100000   94.256       35.635       .273          2.645   .011
# of MD's/100,000 people     7.223        .908         .822          7.955   .000

a. Dependent Variable: health cost spent/person - 2000
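
The pieces of the output are tied together by a few simple ratios, and recomputing them by hand is a useful check when reading a regression table. The sketch below does this for the two models shown, using only the sums of squares, degrees of freedom, B values, and standard errors printed in the output. The function names `summarize` and `t_value` are made up for this illustration and are not part of SPSS or any library; the adjusted R² line uses the standard textbook formula, which is an assumption about how the printed value was obtained.

```python
def summarize(ss_reg, ss_res, df_reg, df_res):
    """Recompute F, R, R-squared and adjusted R-squared from an ANOVA table."""
    ss_total = ss_reg + ss_res
    df_total = df_reg + df_res
    ms_reg = ss_reg / df_reg          # mean square for the regression
    ms_res = ss_res / df_res          # mean square for the residual
    r2 = ss_reg / ss_total            # proportion of DV variation predicted
    # Standard adjustment for the number of predictors (assumed formula).
    adj_r2 = 1 - (1 - r2) * df_total / df_res
    return {"F": ms_reg / ms_res, "R": r2 ** 0.5, "R2": r2, "adj_R2": adj_r2}


def t_value(b, std_error):
    """t statistic for one coefficient: B divided by its standard error."""
    return b / std_error


# Values copied from the two ANOVA tables in the output.
model_1 = summarize(ss_reg=4944948, ss_res=7119704, df_reg=3, df_res=39)
model_2 = summarize(ss_reg=7566999, ss_res=5598500, df_reg=2, df_res=47)

# B and Std. Error pairs copied from the two Coefficients tables.
coefficients = {
    "% of population over 75": (148.853, 61.315),
    "% of smokers": (2.186, 21.909),
    "deaths due to cancer/100,000": (7.484, 2.416),
    "number of hospitals/100000": (94.256, 35.635),
    "# of MD's/100,000 people": (7.223, 0.908),
}

for name, stats in (("Model 1", model_1), ("Model 2", model_2)):
    print(name, {key: round(value, 3) for key, value in stats.items()})

for label, (b, se) in coefficients.items():
    print(f"{label}: t = {t_value(b, se):.3f}")
```

The recomputed values agree with the printed R, R Square, Adjusted R Square, F, and t columns to within rounding. Note that % of smokers has a very small t, which matches its nonsignificant Sig. value of .921 in the first model.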
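
Returning briefly to the Spearman rank order correlation from the start of these notes: the idea that both variables are ranked and the rankings are then correlated can be sketched in a few lines. This is a minimal illustration, not a replacement for a statistics package. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the five data points are hypothetical, chosen only to show the mechanics, and tied values are given the average of their ranks, which is one common convention.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest) upward; ties share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always rises with x, but not along a straight line.
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]

print("Pearson r on raw scores:", pearson_r(x, y))
print("Spearman rho on ranks:  ", spearman_rho(x, y))
```

Because the order of the two variables matches perfectly, the rank correlation is at its maximum even though the raw scores are not linearly related. The distance between scores is discarded once the data are ranked.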