Running head: SPSS CORRELATION & REGRESSION

SPSS Correlation & Regression Assignment
Deborah Davis
Liberty University

The purpose of this document is to respond to the third SPSS assignment, which addresses the use and understanding of correlation and regression in statistics. The research scenario was given as part of the assignment and presumes that leaders of colleges and universities are seeking ways to develop and maintain online and blended programs (Szapkiw, 2014, p. 2). This particular assignment required the use of SPSS to “choose the appropriate tests . . . run the tests and analyze the data” (Szapkiw, 2014, p. 14).

Structure

Because this assignment has two aspects, each with seven sections, and this is a short college paper, only the first two levels of headings are used, and each aspect is divided by assignment section.

Research Question 1

Is there a statistically significant relationship between students’ total community score as measured by the Classroom Community Scale (Rovai, 2002) and the total perceived learning scores as measured by the Perceived CAP Learning Scale (Rovai et al., 2009) after controlling for gender among students who are enrolled in the Introduction to Statistics course?

Null Hypothesis 1

There is no significant relationship between students’ total community score as measured by the Classroom Community Scale (Rovai, 2002) and the total perceived learning scores as measured by the Perceived CAP Learning Scale (Rovai et al., 2009) after controlling for gender among students who are enrolled in the Introduction to Statistics course.

Alternative Hypothesis 1

There is a significant relationship between students’ total community score as measured by the Classroom Community Scale (Rovai, 2002) and the total perceived learning scores as measured by the Perceived CAP Learning Scale (Rovai et al., 2009) after controlling for gender among students who are enrolled in the Introduction to Statistics course.

Continuous Variables

Students’ total community score as measured by the Classroom Community Scale (Rovai, 2002)
Students’ total perceived learning scores as measured by the Perceived CAP Learning Scale (Rovai et al., 2009)

Test to be Used

Partial correlation was used to evaluate the null hypothesis that there is no significant relationship between students’ total community score as measured by the Classroom Community Scale (Rovai, 2002) and the total perceived learning scores as measured by the Perceived CAP Learning Scale (Rovai et al., 2009), while controlling for gender (N = 40). This was the most appropriate analysis because it lets researchers examine the strength and direction of the linear relationship between two variables while controlling for a third variable (i.e., a confounding variable suspected of influencing the other two; Warner, 2013). A bivariate correlation is ruled out because the research question requires controlling for gender in addition to relating the two variables.

Assumptions

Preliminary analyses using a scatter plot were performed to ensure no violations of the assumptions of linearity, bivariate normality, and homoscedasticity; assumptions were found tenable.

1) Normality: This assumption holds that the population distributions are normal. Check for normality by creating histograms or by conducting normality tests, such as the Shapiro-Wilk and Kolmogorov-Smirnov tests. On the histogram, normality is assumed when there is a symmetrical, bell-shaped curve.
For the normality tests, non-significant results (a significance level greater than .05) indicate that the assumption is tenable; that is, normality can be assumed. The assumption applies to both the total community score as measured by the Classroom Community Scale (Rovai, 2002) and the total perceived learning scores as measured by the Perceived CAP Learning Scale (Rovai et al., 2009). There should also be no significant outliers. Outliers are data points in the sample that do not follow the pattern of the other data points. Pearson's correlation coefficient, r, is sensitive to outliers, meaning that outliers can have an exaggerated influence on the value of r, so that the coefficient no longer represents the data as a whole. Therefore, it is best if there are no outliers or if they are kept to a minimum.

2) Linearity: This assumption holds that the relationship between the two variables is linear. Check for linearity using a scatterplot; a roughly straight line (no curve) indicates that the assumption is tenable.

3) Homoscedasticity: This assumption holds that the variability in scores is similar across both variables. Check for homoscedasticity using a scatterplot; a cigar shape indicates that the assumption is tenable.

See Figure 1.

Figure 1.

Descriptive Statistics

This assignment required elements to be presented from the data set provided. An SPSS analysis of the mean, median, standard deviation, variance, and range for community connectedness and perceived learning was performed. See Table 1.

Table 1.
Descriptive Statistics for Community Connectedness and Perceived Learning

Variable (N = 40)          M       Mdn     SD      V        Range
Community Connectedness    54.65   27.00   19.66   96.38    29
Perceived Learning         37.43   42.50   13.86   192.20   42

Note. For Community Connectedness, M and SD are those of Community_Scale_Total, the variable entered in the partial correlation; Mdn, V, and Range are those of Community_Connectedness_Scale (M = 27.03, SD = 9.82), the variable in the descriptive output in the Appendix.

Results

Partial correlation was used to evaluate the null hypothesis that there is no significant relationship between online university students’ sense of community and perceived learning, after controlling for gender (N = 40). Preliminary analyses using a scatter plot were performed to ensure no violations of the assumptions of linearity, bivariate normality, and homoscedasticity; assumptions were found tenable. There was significant evidence to reject the null hypothesis and conclude that there was a strong positive partial correlation between sense of community (M = 54.65, SD = 19.66) and perceived learning (M = 37.43, SD = 13.86), while controlling for gender (M = 1.50, SD = 0.51), r(37) = .91, p < .001. Higher levels of community are associated with higher levels of perceived learning. The zero-order correlation between sense of community and perceived learning was also strong and positive, r(38) = .89, p < .001, indicating that controlling for gender had a minimal effect on the strength of the relationship between the two variables.

Research Question 2

Is there a statistically significant relationship between students’ sense of community and their grade point average among students who are enrolled in the Introduction to Statistics course?

Null Hypothesis 2

There is not a significant predictive relationship between students’ sense of community and their grade point average (GPA) among students who are enrolled in the Introduction to Statistics course.

Alternative Hypothesis 2

There is a significant predictive relationship between students’ sense of community and their GPA among students who are enrolled in the Introduction to Statistics course.

Continuous Variables

Students’ total community score as measured by the Classroom Community Scale (Rovai, 2002)
Grade Point Average (GPA)

Test to be Used

Because the question concerns the direction and strength of the relationship between two variables, the Pearson product-moment correlation coefficient is an appropriate tool for answering this second research question. A linear regression analysis was conducted to evaluate the null hypothesis that university students’ sense of community cannot significantly predict their GPA. The objective of the research aligns with the purpose of a bivariate linear regression. A simple linear regression (1) determines the proportion of the variation in the dependent variable explained by the independent variable and (2) predicts criterion variable values from the predictor variable value (Warner, 2013). With a single predictor, the test of the regression slope is equivalent to the test of Pearson’s r, and r squared gives the proportion of variance explained.

Assumptions

Preliminary analyses using a scatter plot were performed to ensure no violations of the assumptions of linearity, no significant outliers or influential points, independence of residuals, and homoscedasticity; assumptions were found tenable. See Figure 2.

Figure 2.

Descriptive Statistics

This assignment required elements to be presented from the data set provided. An SPSS analysis of the mean, median, standard deviation, variance, and range for community connectedness and grade point average was performed. See Table 2.

Table 2.
Descriptive Statistics for Community Connectedness and Grade Point Average

Variable (N = 40)          M       Mdn     SD     V       Range
Community Connectedness    27.03   27.00   9.82   96.38   29.00
Grade Point Average        3.05    3.10    0.69   0.47    2.30

Results

The Pearson product-moment correlation coefficient was used to evaluate the null hypothesis that there is no significant relationship between online university students’ sense of community and grade point average (N = 40).
Preliminary analyses using a scatter plot were performed to ensure no violations of the assumptions of linearity, bivariate normality, and homoscedasticity; assumptions were found tenable. There was significant evidence to reject the null hypothesis and conclude that there was a strong positive correlation between community connectedness (M = 27.03, SD = 9.82) and grade point average (M = 3.05, SD = 0.69), r(38) = .91, p < .001. Higher levels of community are associated with higher grade point averages.

Effect Size

The effect size was determined using the formula for eta squared, t² / (t² + (N1 + N2 − 2)), with the paired-samples t statistic from the Appendix (t = 16.48). In this case that means 16.48² / (16.48² + (40 + 40 − 2)) = 271.59 / (271.59 + 78) = 271.59 / 349.59 = 0.78. The Pearson correlation coefficient itself (r = .908, r² = .82) likewise suggests a strong correlation per the guidelines suggested by Cohen (1988).

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Rovai, A. P., Wighting, M. J., Baker, J. D., & Grooms, L. D. (2009). Development of an instrument to measure perceived cognitive, affective, and psychomotor (CAP) learning in traditional and virtual classroom higher education settings. Internet and Higher Education, 12(1), 7-13.

Szapkiw, A. (2014). SPSS assignment instructions. Lynchburg, VA: Liberty University. Retrieved from http://amandaszapkiw.com/elearning/writing/downloads/ReportingStatisical-Results-in-APA-Format.pdf

Appendix

GET FILE='C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
DESCRIPTIVES VARIABLES=Community_Connectedness_Scale Perceived_CAP_Learning_Scale
  /STATISTICS=MEAN STDDEV VARIANCE RANGE MIN MAX.

Descriptives

Notes
  Output Created: 17-SEP-2014 12:52:34
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Missing Value Handling: Definition of Missing = User defined missing values are treated as missing. Cases Used = All non-missing data are used.
  Syntax: DESCRIPTIVES VARIABLES=Community_Connectedness_Scale Perceived_CAP_Learning_Scale /STATISTICS=MEAN STDDEV VARIANCE RANGE MIN MAX.
  Resources: Processor Time = 00:00:00.00; Elapsed Time = 00:00:00.04

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav

Descriptive Statistics
                                N    Range   Minimum   Maximum   Mean    Std. Deviation   Variance
Community_Connectedness_Scale   40   29      11        40        27.03   9.818            96.384
Perceived_CAP_Learning_Scale    40   42      12        54        37.43   13.864           192.199
Valid N (listwise)              40

FREQUENCIES VARIABLES=Community_Connectedness_Scale Perceived_CAP_Learning_Scale
  /STATISTICS=MEAN MEDIAN
  /ORDER=ANALYSIS.

Frequencies

Notes
  Output Created: 17-SEP-2014 12:57:53
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Missing Value Handling: Definition of Missing = User-defined missing values are treated as missing. Cases Used = Statistics are based on all cases with valid data.
  Syntax: FREQUENCIES VARIABLES=Community_Connectedness_Scale Perceived_CAP_Learning_Scale /STATISTICS=MEAN MEDIAN /ORDER=ANALYSIS.
  Resources: Processor Time = 00:00:00.02; Elapsed Time = 00:00:00.03

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav

Statistics
              Community_Connectedness_Scale   Perceived_CAP_Learning_Scale
N   Valid     40                              40
    Missing   0                               0
Mean          27.03                           37.43
Median        27.00                           42.50

Frequency Table

Community_Connectedness_Scale
Valid   Frequency   Percent   Valid Percent   Cumulative Percent
11      3           7.5       7.5             7.5
12      3           7.5       7.5             15.0
14      1           2.5       2.5             17.5
15      1           2.5       2.5             20.0
20      3           7.5       7.5             27.5
22      2           5.0       5.0             32.5
23      6           15.0      15.0            47.5
27      2           5.0       5.0             52.5
30      1           2.5       2.5             55.0
32      1           2.5       2.5             57.5
34      1           2.5       2.5             60.0
35      4           10.0      10.0            70.0
36      3           7.5       7.5             77.5
37      5           12.5      12.5            90.0
38      1           2.5       2.5             92.5
40      3           7.5       7.5             100.0
Total   40          100.0     100.0

Perceived_CAP_Learning_Scale
Valid   Frequency   Percent   Valid Percent   Cumulative Percent
12      1           2.5       2.5             2.5
14      1           2.5       2.5             5.0
15      1           2.5       2.5             7.5
16      2           5.0       5.0             12.5
18      1           2.5       2.5             15.0
19      1           2.5       2.5             17.5
22      1           2.5       2.5             20.0
25      2           5.0       5.0             25.0
27      3           7.5       7.5             32.5
29      2           5.0       5.0             37.5
32      1           2.5       2.5             40.0
35      1           2.5       2.5             42.5
40      3           7.5       7.5             50.0
45      4           10.0      10.0            60.0
47      3           7.5       7.5             67.5
48      2           5.0       5.0             72.5
50      3           7.5       7.5             80.0
51      2           5.0       5.0             85.0
52      1           2.5       2.5             87.5
53      2           5.0       5.0             92.5
54      3           7.5       7.5             100.0
Total   40          100.0     100.0

FREQUENCIES VARIABLES=Community_Connectedness_Scale GPA
  /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM MEAN MEDIAN MODE
  /ORDER=ANALYSIS.

Frequencies

Notes
  Output Created: 17-SEP-2014 13:01:48
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Missing Value Handling: Definition of Missing = User-defined missing values are treated as missing. Cases Used = Statistics are based on all cases with valid data.
  Syntax: FREQUENCIES VARIABLES=Community_Connectedness_Scale GPA /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM MEAN MEDIAN MODE /ORDER=ANALYSIS.
  Resources: Processor Time = 00:00:00.00; Elapsed Time = 00:00:00.00

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav

Statistics
                 Community_Connectedness_Scale   GPA
N   Valid        40                              40
    Missing      0                               0
Mean             27.03                           3.0500
Median           27.00                           3.1000
Mode             23                              4.00
Std. Deviation   9.818                           .68687
Variance         96.384                          .472
Range            29                              2.30
Minimum          11                              1.70
Maximum          40                              4.00

Frequency Table

Community_Connectedness_Scale
(Identical to the Community_Connectedness_Scale frequency table in the first FREQUENCIES output above.)

GPA
Valid   Frequency   Percent   Valid Percent   Cumulative Percent
1.70    2           5.0       5.0             5.0
1.90    2           5.0       5.0             10.0
2.10    1           2.5       2.5             12.5
2.30    2           5.0       5.0             17.5
2.40    2           5.0       5.0             22.5
2.50    1           2.5       2.5             25.0
2.60    2           5.0       5.0             30.0
2.70    2           5.0       5.0             35.0
2.80    1           2.5       2.5             37.5
2.90    1           2.5       2.5             40.0
3.00    4           10.0      10.0            50.0
3.20    3           7.5       7.5             57.5
3.30    1           2.5       2.5             60.0
3.40    2           5.0       5.0             65.0
3.50    3           7.5       7.5             72.5
3.60    1           2.5       2.5             75.0
3.70    3           7.5       7.5             82.5
3.80    2           5.0       5.0             87.5
4.00    5           12.5      12.5            100.0
Total   40          100.0     100.0

T-TEST PAIRS=Community_Connectedness_Scale WITH GPA (PAIRED)
  /CRITERIA=CI(.9500)
  /MISSING=ANALYSIS.

T-Test

Notes
  Output Created: 17-SEP-2014 13:10:56
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Missing Value Handling: Definition of Missing = User defined missing values are treated as missing. Cases Used = Statistics for each analysis are based on the cases with no missing or out-of-range data for any variable in the analysis.
  Syntax: T-TEST PAIRS=Community_Connectedness_Scale WITH GPA (PAIRED) /CRITERIA=CI(.9500) /MISSING=ANALYSIS.
  Resources: Processor Time = 00:00:00.02; Elapsed Time = 00:00:00.06

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav

Paired Samples Statistics
                                         Mean     N    Std. Deviation   Std. Error Mean
Pair 1   Community_Connectedness_Scale   27.03    40   9.818            1.552
         GPA                             3.0500   40   .68687           .10860

Paired Samples Correlations
                                               N    Correlation   Sig.
Pair 1   Community_Connectedness_Scale & GPA   40   .908          .000

Paired Samples Test (Pair 1: Community_Connectedness_Scale - GPA)
Paired Differences
  Mean: 23.97500
  Std. Deviation: 9.19849
  Std. Error Mean: 1.45441
  95% Confidence Interval of the Difference: Lower = 21.03318; Upper = 26.91682
t: 16.484
df: 39
Sig. (2-tailed): .000

PARTIAL CORR
  /VARIABLES=Perceived_CAP_Learning_Scale Community_Scale_Total BY Sex
  /SIGNIFICANCE=TWOTAIL
  /STATISTICS=DESCRIPTIVES
  /MISSING=LISTWISE.

Partial Corr

Notes
  Output Created: 17-SEP-2014 12:07:20
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Missing Value Handling: Definition of Missing = User defined missing values are treated as missing. Cases Used = Statistics are based on cases with no missing data for any variable listed.
  Syntax: PARTIAL CORR /VARIABLES=Perceived_CAP_Learning_Scale Community_Scale_Total BY Sex /SIGNIFICANCE=TWOTAIL /STATISTICS=DESCRIPTIVES /MISSING=LISTWISE.
  Resources: Processor Time = 00:00:00.00; Elapsed Time = 00:00:00.01

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav

Descriptive Statistics
                               Mean    Std. Deviation   N
Perceived_CAP_Learning_Scale   37.43   13.864           40
Community_Scale_Total          54.65   19.657           40
Sex                            1.50    .506             40

Correlations (Control Variable: Sex)
                                                         Perceived_CAP_Learning_Scale   Community_Scale_Total
Perceived_CAP_Learning_Scale   Correlation               1.000                          .910
                               Significance (2-tailed)   .                              .000
                               df                        0                              37
Community_Scale_Total          Correlation               .910                           1.000
                               Significance (2-tailed)   .000                           .
                               df                        37                             0

GRAPH
  /SCATTERPLOT(BIVAR)=Perceived_CAP_Learning_Scale WITH Community_Connectedness_Scale
  /MISSING=LISTWISE.

Graph

Notes
  Output Created: 17-SEP-2014 12:07:42
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Syntax: GRAPH /SCATTERPLOT(BIVAR)=Perceived_CAP_Learning_Scale WITH Community_Connectedness_Scale /MISSING=LISTWISE.
  Resources: Processor Time = 00:00:00.31; Elapsed Time = 00:00:00.58

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav

CORRELATIONS
  /VARIABLES=Community_Connectedness_Scale GPA
  /PRINT=TWOTAIL NOSIG
  /STATISTICS DESCRIPTIVES
  /MISSING=PAIRWISE.

Correlations

Notes
  Output Created: 17-SEP-2014 12:21:31
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Missing Value Handling: Definition of Missing = User-defined missing values are treated as missing. Cases Used = Statistics for each pair of variables are based on all the cases with valid data for that pair.
  Syntax: CORRELATIONS /VARIABLES=Community_Connectedness_Scale GPA /PRINT=TWOTAIL NOSIG /STATISTICS DESCRIPTIVES /MISSING=PAIRWISE.
  Resources: Processor Time = 00:00:00.03; Elapsed Time = 00:00:00.06

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav

Descriptive Statistics
                                Mean     Std. Deviation   N
Community_Connectedness_Scale   27.03    9.818            40
GPA                             3.0500   .68687           40

Correlations
                                                      Community_Connectedness_Scale   GPA
Community_Connectedness_Scale   Pearson Correlation   1                               .908**
                                Sig. (2-tailed)                                       .000
                                N                     40                              40
GPA                             Pearson Correlation   .908**                          1
                                Sig. (2-tailed)       .000
                                N                     40                              40
**. Correlation is significant at the 0.01 level (2-tailed).

GRAPH
  /SCATTERPLOT(BIVAR)=GPA WITH Community_Connectedness_Scale
  /MISSING=LISTWISE.

Graph

Notes
  Output Created: 17-SEP-2014 12:22:21
  Comments:
  Input: Data = C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav; Active Dataset = DataSet1; Filter = <none>; Weight = <none>; Split File = <none>; N of Rows in Working Data File = 40
  Syntax: GRAPH /SCATTERPLOT(BIVAR)=GPA WITH Community_Connectedness_Scale /MISSING=LISTWISE.
  Resources: Processor Time = 00:00:00.31; Elapsed Time = 00:00:00.39

[DataSet1] C:\Users\Deb\Documents\liberty\educ812\SPSS_Assignment_Data_Set.sav
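
Cross-Check of the Reported Correlations

As a cross-check on the effect sizes and significance levels reported in the Results sections, the following Python sketch recomputes r squared and the t statistic for each correlation from the coefficients printed in the SPSS output. It is an illustration only and was not part of the SPSS analysis: the function name r_to_t and the variable names are assumptions introduced here, and the sketch relies on the standard conversion t = r * sqrt(df / (1 - r^2)), with df = N - 2 - (number of control variables).

```python
import math


def r_to_t(r: float, n: int, n_controls: int = 0) -> tuple[float, int]:
    """Return (t, df) for testing a zero-order or partial correlation against zero.

    df = n - 2 - n_controls, so a zero-order correlation uses n - 2 and a
    first-order partial correlation (one control variable) uses n - 3.
    """
    df = n - 2 - n_controls
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


# Coefficients copied from the SPSS output in the Appendix (N = 40).
N = 40
reported = [
    # (label, r, number of control variables)
    ("community with GPA (zero-order)", 0.908, 0),
    ("community with perceived learning, controlling for Sex", 0.910, 1),
]

for label, r, k in reported:
    t, df = r_to_t(r, N, k)
    # r squared is the proportion of variance shared by the two variables.
    print(f"{label}: r = {r:.3f}, r^2 = {r * r:.3f}, t({df}) = {t:.2f}")
```

Both t statistics come out above 13, far beyond the two-tailed critical value for p < .001 at these degrees of freedom, which is consistent with the Sig. = .000 entries in the SPSS tables.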