HANDOUT ON RELIABILITY

Reliability refers to the consistency and stability of the results of a test or scale. A test is said to be reliable if it yields similar results in repeated administrations when the attribute being measured is believed not to have changed in the interval between measurements, even though the test may be administered by different people and alternative forms of the test may be used. For example, if you weighed yourself twice in succession and the scale read 130 lbs. the first time and 140 lbs. the second time, we would say that the scale was an unreliable measure of weight.

In addition, to be reliable, an instrument or test must be confined to measuring a single construct and only one dimension. For example, if a questionnaire designed to measure anxiety simultaneously measured depression, the instrument would not be a reliable measure of anxiety. A reliable instrument or test must therefore meet two conditions: it must have a small random error, and it must measure a single dimension.

One major source of inconsistency in test results is random measurement error. A primary concern of test developers and test users is therefore to determine the extent to which random measurement errors influence test performance.

The classical true score model provides a useful theoretical framework for defining reliability and for the development of practical reliability investigations. In the classical true score model, an examinee’s or a subject’s observed score on a particular test is viewed as a random sample of one of the many possible test scores that the person could have earned under repeated administrations of the same test, and the observed score (X) is envisioned as the composite of two hypothetical components: a true score (T) and a random error component (E).
T is defined as the expected value of the examinee’s test scores over many repeated testings with the same test, and E is the discrepancy between an examinee’s observed score and his/her true score. The following equation summarizes the relationship between X, T and E:

$X = T + E$

An important question which follows from the above is: How closely related are the examinees’ true and observed scores on a particular test or instrument? Based on the classical true score model [1], two indices are derived to measure the relationship between true and observed scores.

1. Reliability coefficient - defined as the correlation between parallel measures [2]. This coefficient ($\rho_{XX'}$) can be shown to equal the ratio $\sigma_T^2 / \sigma_X^2$, the proportion of observed score variance due to true score variance.

2. Reliability index - defined as the correlation between true and observed scores on a single measure (i.e. $\rho_{XT}$); it is equivalent to $\sigma_T / \sigma_X$, the square root of the reliability coefficient.

However, in reality, we rarely know the true scores. Besides, the reliability coefficient defined above is purely a theoretical concept because it is not possible to verify that two tests are truly parallel. Therefore the reliability of a test has to be estimated using other methods.

[1] “X = T + E” is only one of the assumptions of the classical true score theory. Please consult texts on measurement/test theory for other assumptions in the model as well as how the reliability coefficient and the reliability index are derived from the model.

[2] According to classical true score theory, two measures/tests are defined as parallel when 1) each examinee or subject has the same true score on both measures/tests, and 2) the error variances of the two measures/tests are equal. Based on this definition, it is sensible to assume that parallel tests are matched in content.

Dr. Robert Gebotys 2003

Methods of Estimating Reliability:

The methods of estimating reliability can be roughly categorized into two groups: methods that require two separate test administrations, and methods that use a single test administration.

1. Methods Requiring Two Separate Test Administrations:

a. Test-Retest Method

The test-retest method yields a reliability estimate, $r_{12}$, that is based on testing the same examinees/subjects twice with the same test/scale and then correlating the results. If each examinee/subject receives exactly the same observed score on the second testing as he/she did on the first, and if there is some variance in the observed scores among examinees/subjects, then the correlation is 1.0, indicating perfect reliability. The correlation coefficient obtained from this test-retest procedure is called the coefficient of stability, which measures how consistently examinees/subjects respond to this test/scale at different times.

b. Alternate-Forms Method

This method involves constructing two similar forms of a test/scale (i.e. both forms have the same content) and administering both forms to the same group of examinees within a very short time period. The correlation between observed scores on the alternate test/scale forms (i.e. $r_{XY}$, computed using the Pearson product-moment formula) is an estimate of the reliability of either one of the alternate forms. This correlation coefficient is known as the coefficient of equivalence.

c. Test-Retest with Alternate Forms Method

This method is a combination of the test-retest and alternate-forms methods. In this case, the procedure is to administer form 1 of the test/scale, wait, and then administer form 2. The correlation coefficient between the two sets of observed scores is an estimate of the reliability of either one of the alternate forms and is known as the coefficient of stability and equivalence.

2. Methods Using One Test Administration:

There are many situations in which a single form of a test/scale will be administered only once to a group of examinees/subjects. The following are methods of estimating reliability based on scores from a single test administration. These methods focus on how consistently the examinees/subjects performed or scored across items or subsets of items on this single test/scale form. The reliability estimates generated by these methods are usually called coefficients of internal consistency.

These methods are based on the argument that if the scores of the subjects/examinees are consistent across items or subsets of items on the single test/scale form, then it is reasonable to think that these items or subsets of items came from the same content domain and were constructed according to the same specifications. In addition, if the examinees’/subjects’ performance is consistent across subsets of items within a test/scale, the test/scale administrator can also have some confidence that this performance would generalize to other possible items in the content domain.

a. Reliability Estimates Based on Item Variances: Calculation of Cronbach’s Alpha

This is the most widely used method of estimating reliability using a single test administration. Cronbach’s Alpha (α) is calculated based on the following formula:

$\alpha = \frac{k}{k-1} \left( 1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_X^2} \right)$

where k is the number of items on the test/scale, $\sigma_i^2$ is the variance of item i, and $\sigma_X^2$ is the total test variance.

Cronbach’s α can be conceived as the average of all the possible split-half reliabilities (the calculation of split-half reliabilities is discussed in the following section) estimated on the single test/scale. However, unlike the split-half methods, Cronbach’s α is not affected by how the items are arranged in the test/scale.

b. Split-Half Method

Under this method, test/scale developers divide the scale/test into two halves, so that the first half forms the first part of the entire test/scale and the second half forms the remaining part of the test/scale. Both halves are normally of equal length and they are designed in such a way that each is an alternate form of the other. Estimation of reliability is based on correlating the results of the two halves of the same test/scale. If the two halves of the test/scale are parallel forms of one another, the Spearman-Brown prophecy formula is used to estimate the reliability coefficient of the entire test/scale. The Spearman-Brown prophecy formula is:

$\rho_{XX'} = \frac{2\rho_{YY'}}{1 + \rho_{YY'}}$

where $\rho_{XX'}$ is the reliability projected for the full-length test/scale, and $\rho_{YY'}$ is the correlation between the half-tests. $\rho_{YY'}$ is also an estimate of the reliability of the test/scale if it contained the same number of items as the half-test.

If the two halves of the test/scale are not parallel, the reliability of the full-length test/scale is calculated using the formula for coefficient α for split halves:

$\alpha = \frac{2\left[\sigma_X^2 - \left(\sigma_{Y_1}^2 + \sigma_{Y_2}^2\right)\right]}{\sigma_X^2}$

where $\sigma_{Y_1}^2$ and $\sigma_{Y_2}^2$ are the variances of scores on the two halves of the test, and $\sigma_X^2$ is the variance of the scores on the whole test, with $X = Y_1 + Y_2$.

In the SPSS program, the “SPLIT-HALF” model for reliability analysis is conducted on the assumption that the two halves of the test/scale are parallel forms. Hence, coefficient α has to be obtained by hand calculation. It must also be noted that the split-half reliability estimate is contingent upon how the items in the test/scale are arranged. Reordering and/or regrouping the items in the test/scale can result in different reliability estimates using the split-half method.
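The three formulas in this section (Cronbach’s α, the Spearman-Brown prophecy formula, and coefficient α for split halves) can be sketched in a few lines of Python. This is an added illustration, not part of the original handout: the function names and the small score matrix `toy` are hypothetical, and sample variances (denominator n - 1) are assumed, which matches the variances in the SPSS output reproduced later.

```python
from statistics import variance  # sample variance (denominator n - 1)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per examinee."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r_half):
    """Project the correlation between two parallel halves to the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_alpha(half1, half2):
    """Coefficient alpha for split halves (halves need not be parallel)."""
    total_var = variance([a + b for a, b in zip(half1, half2)])
    return 2 * (total_var - (variance(half1) + variance(half2))) / total_var


# Hypothetical scores: 5 examinees (rows) on 4 dichotomous items (columns).
toy = [[1, 1, 1, 0], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 1, 1]]
first_half = [row[0] + row[1] for row in toy]   # items 1 and 2
second_half = [row[2] + row[3] for row in toy]  # items 3 and 4

print(cronbach_alpha(toy))
print(spearman_brown(0.5))
print(split_half_alpha(first_half, second_half))
```

Changing which items go into `first_half` and `second_half` changes the split-half result but leaves `cronbach_alpha(toy)` untouched, which is the contrast drawn in the text.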
Because the split-half estimate depends on how the items are arranged, the reliability estimate obtained from the even/odd method (a method similar to the split-half method, described below) on the same test/scale will most likely differ from the reliability estimated using the split-half method.

c. Even/Odd Method

The even/odd method is similar to the split-half method, except that the estimation of reliability for the entire test/scale is no longer based on correlating the first half of the test/scale with the second half; instead, it is based on correlating the even items with the odd items.

Determining Reliability Using SPSS: Example 1:

The following illustrative example contains six items extracted from a scale used to measure adolescents’ attitudes towards the use of physically aggressive behaviours in their daily life. Each item in the scale refers to a situation where physically aggressive behaviour is or is not used. Adolescents are asked whether they agree or disagree with each item on the scale. Adolescents’ responses to the items are converted to scores of either 1 or 0, where 0 represents endorsement of the use of physically aggressive behaviours and 1 represents disapproval of the use of physically aggressive behaviours. Below are the contents of the six items as well as the scores of 14 adolescents on these six items:

Item No.   Content
   1       When there are conflicts, people won’t listen to you unless you get physically aggressive.
   2       It is hard for me not to act aggressively if I am angry with someone.
   3       Physical aggression does not help to solve problems, it only makes situations worse.
   4       There is nothing wrong with a husband hitting his wife if she has an affair.
   5       Physical aggression is often needed to keep things under control.
   6       When someone makes me mad, I don’t have to use physical aggression. I can think of other ways to express my anger.
The following is the data obtained from 14 adolescents:

                        Items
Person     1     2     3     4     5     6
   1       0     0     0     0     0     0
   2       0     0     0     0     1     0
   3       1     0     1     1     1     0
   4       1     1     1     1     1     1
   5       1     1     1     1     1     1
   6       0     0     1     0     0     0
   7       0     0     1     1     1     0
   8       1     1     1     1     1     0
   9       0     0     0     1     0     0
  10       0     1     0     1     0     1
  11       1     1     1     0     1     1
  12       0     0     1     1     1     1
  13       0     0     0     0     0     0
  14       0     0     0     0     0     0

In the pages that follow, we will first outline the major commands for different models of reliability analysis and briefly explain the usage of these commands. The whole program for the different reliability analyses will then be reproduced in the next section, followed by discussions of the outputs.

SPSS Commands for Reliability Analyses: [3]

1. Calculation of Cronbach’s Alpha:

    reliability variables=item1 to item6/
      scale (testscore)=item1 to item6/
      model=alpha
    statistics all

The subcommand “scale (testscore)=item1 to item6” specifies the items on which reliability analysis is to be carried out. In this case, item1 to item6 will form the “scale” on which the analysis will be done.

The subcommand “model=alpha” instructs the computer to perform the “ALPHA” model (i.e. to calculate Cronbach’s Alpha) for reliability analysis.

The command “statistics all” instructs the computer to give us the following additional statistics from reliability analysis: [4]

a. Item means and standard deviations;
b. Inter-item covariance matrix;
c. Inter-item correlation matrix;
d. Scale mean, variance and standard deviation;
e. Summary statistics for item means, item variances, inter-item covariances, inter-item correlations, and item-total statistics (i.e. summary statistics comparing each item to the scale composed of the other items, including alpha (α) if that item is deleted);
f. ANOVA;
g. Hotelling’s T-Squared;
h. Other statistics like Friedman’s chi-square, Kendall’s coefficient of concordance and Cochran’s Q, if applicable.

[3] In the following illustrations and explanations, only a sample of commonly used commands and computer languages are shown. Students are advised to consult the SPSS User’s Guide for other appropriate commands and computer languages in reliability analyses.

[4] Only outputs containing statistics categorized under a to e will be reproduced and discussed in subsequent pages because these are already sufficient for the purposes and needs of our present analyses. Statistics under categories f to h will not be reported.

2. Assessing Split-Half Reliability:

    reliability variables=item1 to item6/
      statistics=scale/
      summary=means variances covariance correlations/
      scale (testscore)=item1 to item6/
      model=split

The “scale (testscore)=item1 to item6” subcommand specifies the number as well as the order of the items on which the subsequent reliability analysis is to be performed.

The subcommand “model=split” instructs the computer to use the “SPLIT-HALF” model for reliability analysis on the scale. A split-half reliability analysis will be performed based on the order in which the items were named on the preceding “scale” subcommand, i.e., the first half of the items (rounding up if the number of items is odd) form the first part/half, and the remaining items form the second part/half. In this case, items 1, 2 and 3 will form the first part and items 4, 5 and 6 will form the second part.

Since the inter-item covariance matrix, inter-item correlation matrix, item means and standard deviations as well as the item-total statistics produced from this reliability analysis are the same as those produced in the preceding “ALPHA” model (because the two analyses were performed on the same set of data), we may not want to look at these again at this stage. However, we may be interested in knowing the following:

a. the means and standard deviations of each of the two parts of the scale;
b. the summary statistics (i.e. item means, item variances, inter-item covariances and inter-item correlations) of each of the two parts of the scale.

The insertion of the two subcommands, namely “statistics=scale” and “summary= . . . correlations”, into the computer program will enable us to obtain the above-mentioned statistics, which were not provided by the previous analysis based on the “ALPHA” model. [5]

3. Estimating Even/Odd Reliability:

    reliability variables=item1 to item6/
      scale (testscore)=item1 item3 item5 item2 item4 item6/
      model=split
    statistics all

Since an “EVEN/ODD” model for reliability analysis is not an available option in SPSS, the “SPLIT-HALF” model is used for this analysis. However, in order that the “SPLIT-HALF” model can be successfully employed for estimating even/odd reliability, the items listed in the preceding “scale” subcommand must be arranged in such a way that the odd items form the first part of the scale and the even items form the remaining part. Please see the above “scale” subcommand for an illustration.

As already mentioned, the command “statistics all” instructs the computer to produce the eight categories of additional statistics from reliability analysis. [6] In a later section, it will be shown that the item-total summary statistics, item means and standard deviations, inter-item covariance matrix, and inter-item correlation matrix produced in this analysis are the same as those produced from the “ALPHA” model of reliability analysis, except that the statistics are displayed in a different order as a result of reordering the six items.

Alternatively, additional statistics which are specific to this model of reliability analysis and which are of interest to us can be obtained by using the same “statistics=scale” and “summary= . . . correlations” subcommands as those shown in the computer program for the “SPLIT-HALF” model of reliability analysis.
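As a cross-check on the three SPSS analyses described above, the same coefficients can be computed directly from the 14 x 6 score matrix. The following Python sketch is an added illustration and not part of the original SPSS program: the names `ITEMS`, `part_scores`, `pearson`, `cronbach_alpha` and `split_half` are hypothetical, the scores are transcribed from the data table for Example 1, and sample variances (denominator n - 1) are assumed.

```python
from math import sqrt
from statistics import mean, variance

# Scores of the 14 adolescents, one list per item (transcribed from the data table).
ITEMS = {
    1: [0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0],
    3: [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    4: [0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0],
    5: [0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0],
    6: [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0],
}


def part_scores(item_numbers):
    """Sum the named items for each person."""
    return [sum(scores) for scores in zip(*(ITEMS[i] for i in item_numbers))]


def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(item_numbers):
    """Cronbach's alpha for the scale formed by the named items."""
    k = len(item_numbers)
    item_var_sum = sum(variance(ITEMS[i]) for i in item_numbers)
    return k / (k - 1) * (1 - item_var_sum / variance(part_scores(item_numbers)))


def split_half(part1, part2):
    """Return (correlation between halves, Spearman-Brown, alpha for split halves)."""
    y1, y2 = part_scores(part1), part_scores(part2)
    total_var = variance([a + b for a, b in zip(y1, y2)])
    r = pearson(y1, y2)
    return r, 2 * r / (1 + r), 2 * (total_var - variance(y1) - variance(y2)) / total_var


alpha = cronbach_alpha([1, 2, 3, 4, 5, 6])
first_second = split_half([1, 2, 3], [4, 5, 6])  # order used by "model=split"
odd_even = split_half([1, 3, 5], [2, 4, 6])      # reordered items for even/odd

print(f"alpha = {alpha:.4f}")
print("first/second: r = %.4f, Spearman-Brown = %.4f, split-half alpha = %.4f" % first_second)
print("odd/even:     r = %.4f, Spearman-Brown = %.4f, split-half alpha = %.4f" % odd_even)
```

The only difference between the two `split_half` calls is the grouping of the items, which mirrors the way the even/odd analysis reuses the “SPLIT-HALF” model with a reordered “scale” subcommand.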
Conducting All the Above-mentioned 3 Models of Reliability Analyses on the Set of Scores Obtained from 14 Adolescents for the 6 Items Using SPSS

1. SPSS Computer Program

2. SPSS Outputs and Discussions [7]

a. Reliability Analysis - “ALPHA” Model

The initial part of the output contains descriptive statistics on each of the items (i.e. means and standard deviations), an inter-item covariance [8] matrix and an inter-item correlation matrix. These are followed by descriptive statistics for the scale and the summary statistics.

****** Method 2 (covariance matrix) will be used for this analysis ******

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (A L P H A)

                 Mean    Std Dev    Cases
  1.  ITEM1     .3571      .4972     14.0
  2.  ITEM2     .3571      .4972     14.0
  3.  ITEM3     .5714      .5136     14.0
  4.  ITEM4     .5714      .5136     14.0
  5.  ITEM5     .5714      .5136     14.0
  6.  ITEM6     .3571      .4972     14.0

[5] If you want the full set of additional statistics from the split-half reliability analysis, you have to write the command “statistics all” into the program in the same manner as that shown in the computer program for conducting the “ALPHA” model of reliability test.

[6] Again, only statistics under categories a to e will be reported and discussed in subsequent pages.

[7] Discussions and explanations are in italics. These are not parts of the original computer outputs.

[8] Covariance ($S_{XY}$) is defined as the average product of the deviations in X and Y, where a deviation is a distance from the mean. Its relation to the Pearson product-moment correlation coefficient is given by the formula $r_{XY} = \frac{S_{XY}}{S_X S_Y}$.
R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (T E S T S C O R)

Correlation Matrix

           ITEM1    ITEM2    ITEM3    ITEM4    ITEM5    ITEM6
ITEM1     1.0000
ITEM2      .6889   1.0000
ITEM3      .6455    .3443   1.0000
ITEM4      .3443    .3443    .4167   1.0000
ITEM5      .6455    .3443    .7083    .4167   1.0000
ITEM6      .3778    .6889    .3443    .3443    .3443   1.0000

The above inter-item correlation matrix shows that the largest correlation coefficient occurs between items 3 and 5 (i.e. r = .7083). Item 2 is also fairly highly correlated with both item 1 and item 6 (r = .6889 in both cases). The lowest correlation coefficient is .3443, which occurs between a number of pairs of items (e.g. between item 1 and item 4).

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (A L P H A)

N of Cases = 14.0

Statistics for      Mean   Variance   Std Dev   N of Variables
   Scale          2.7857     5.1044    2.2593                6

                            Mean  Minimum  Maximum    Range  Max/Min  Variance
Item Means                 .4643    .3571    .5714    .2143   1.6000     .0138
Item Variances             .2555    .2473    .2637    .0165   1.0667     .0001
Inter-item Covariances     .1190    .0879    .1868    .0989   2.1250     .0015
Inter-item Correlations    .4665    .3443    .7083    .3641   2.0575     .0234

The section of output reproduced above gives us descriptive statistics for the scale [9] and summary statistics for the items. From the above section, it can be seen that the average score for the scale is 2.7857 and the standard deviation is 2.2593. The average score on an item is 0.4643, with a range of 0.2143 (i.e. maximum minus minimum). The average of the item variances is 0.2555, with a minimum of 0.2473 and a maximum of 0.2637. These show that the items in the scale have fairly comparable variances. The average covariance between the items is .1190. The correlations between the items range from .3443 to .7083.
The ratio between the largest and the smallest correlations is .7083/.3443, or 2.0575. The average correlation between the items is .4665.

[9] The scale in this case is formed by items 1 to 6. For each individual adolescent (or case), a score on the scale is computed by adding his/her scores on the six items.

The item-total summary statistics form the next section of the output and are reproduced below:

Item-total Statistics

           Scale        Scale       Corrected
           Mean         Variance    Item-        Squared       Alpha
           if Item      if Item     Total        Multiple      if Item
           Deleted      Deleted     Correlation  Correlation   Deleted
ITEM1       2.4286       3.4945       .7330        .7511        .7901
ITEM2       2.4286       3.6484       .6364        .7469        .8095
ITEM3       2.2143       3.5659       .6572        .6000        .8051
ITEM4       2.2143       3.8736       .4784        .2533        .8404
ITEM5       2.2143       3.5659       .6572        .6000        .8051
ITEM6       2.4286       3.8022       .5440        .5733        .8273

For each item, the first column of the above set of statistics shows what the average score for the scale would be if the item were excluded from the scale. For example, if item 1 were deleted from the scale, the mean score of the scale would be 2.4286. The next column in this set of statistics is the scale variance if the item were eliminated. The column labeled “Corrected Item-Total Correlation” is the Pearson correlation coefficient between the score on the individual item and the sum of the scores on the remaining items. For example, the smallest correlation reported is .4784, which occurs between the score on item 4 and the sum of the scores on items 1, 2, 3, 5 and 6. We can say that the relationship between item 4 and the other items is not very strong. Comparatively speaking, the relationship between item 1 and the other items is much stronger, with r = .7330.

Another way of looking at the relationship between an individual item and the rest of the scale is to try to predict a person’s score on the item based on the scores obtained on the other items. We can do this by calculating a multiple regression equation with the item of interest as the dependent variable and with all of the other items as independent variables. The multiple R² from this regression equation is displayed for each of the items in the column labeled “Squared Multiple Correlation”. We can see that about 75% of the observed variability in the responses to item 1 can be explained by the other items. As expected, item 4 is less well predicted from the other items. Its multiple R² is only .2533.

The final column, “Alpha if Item Deleted”, tells us how the reliability of the scale is affected by each of the items. Six Cronbach’s α’s are reported in this column, each representing the Cronbach’s α of the scale when one item on the scale is removed. As will be shown later, the Cronbach’s α for the entire scale of 6 items is .8396. We can see from this column of statistics that removing item 4 from the scale causes α to increase from .8396 to .8404. On the other hand, eliminating any item other than item 4 from the scale will cause α to decrease. If for some reason the scale has to be shortened, then item 4 will logically be the first one to go. Conversely, it will be most undesirable to remove item 1 from the scale because α will decrease to .79 as a result.

The final results of the reliability analysis based on the “ALPHA” model are reported in the final section of the output and are reproduced below:

Reliability Coefficients     6 items

Alpha = .8396          Standardized item alpha = .8399

Cronbach’s alpha is shown in the above output. The value is .8396 and can be regarded as quite large. This indicates that the 6-item scale is quite reliable. “Standardized item alpha” refers to the α that would be obtained if all of the items were standardized to have a variance of 1. Since there is not much variation among the variances of the 6 items in the scale [10], there is little difference between the two reported α’s.
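The coefficients discussed above can be reproduced by hand from the summary statistics in the output. The worked arithmetic below is an added illustration rather than part of the SPSS output. The sum of the item variances is taken as six times the mean item variance (6 x .2555 = 1.5330), and the expression for standardized alpha in terms of the mean inter-item correlation $\bar{r}$ is the standard textbook formula, assumed here because the handout does not state it. Differences in the last digit can arise from rounding of the inputs.

```latex
% Cronbach's alpha for the full 6-item scale (total variance 5.1044)
\alpha = \frac{6}{5}\left(1 - \frac{1.5330}{5.1044}\right) = 1.2\,(0.6997) \approx .8396

% Alpha if item 4 is deleted: k = 5, item 4 variance = .2637,
% scale variance without item 4 = 3.8736
\alpha_{(-4)} = \frac{5}{4}\left(1 - \frac{1.5330 - .2637}{3.8736}\right) = 1.25\,(0.6723) \approx .8404

% Standardized item alpha from the mean inter-item correlation \bar{r} = .4665
\alpha_{\mathrm{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
                      = \frac{6(.4665)}{1 + 5(.4665)} = \frac{2.7990}{3.3325} \approx .8399
```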
If items in the scale have widely differing variances, the two α’s may differ substantially.

[10] Please refer to the statistics reported in the section on “summary statistics on the items” as well as the discussion in that section. The variances of individual items can be computed by squaring the standard deviations reported for individual items in the initial section of the output.

b. Reliability Analysis - “SPLIT-HALF” Model

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (S P L I T)

N of Cases = 14.0

Statistics for      Mean   Variance   Std Dev   N of Variables
   Part 1         1.2857     1.6044    1.2666                3
   Part 2         1.5000     1.3462    1.1602                3
   Scale          2.7857     5.1044    2.2593                6

The subcommand “statistics=scale” instructs the computer to produce the above output, while the subcommand “summary=means . . . correlations” instructs the computer to produce the following:

                            Mean  Minimum  Maximum    Range  Max/Min  Variance
Item Means
   Part 1                  .4286    .3571    .5714    .2143   1.6000     .0153
   Part 2                  .5000    .3571    .5714    .2143   1.6000     .0153
   Scale                   .4643    .3571    .5714    .2143   1.6000     .0138
Item Variances
   Part 1                  .2527    .2473    .2637    .0165   1.0667     .0001
   Part 2                  .2582    .2473    .2637    .0165   1.0667     .0001
   Scale                   .2555    .2473    .2637    .0165   1.0667     .0001
Inter-item Covariances
   Part 1                  .1410    .0879    .1703    .0824   1.9375     .0017
   Part 2                  .0952    .0879    .1099    .0220   1.2500     .0001
   Scale                   .1190    .0879    .1868    .0989   2.1250     .0015
Inter-item Correlations
   Part 1                  .5596    .3443    .6889    .3446   2.0010     .0282
   Part 2                  .3684    .3443    .4167    .0724   1.2103     .0014
   Scale                   .4665    .3443    .7083    .3641   2.0575     .0234

Please note that the descriptive statistics for the entire scale and the summary statistics over all items in the entire scale given in these sections of the computer output are identical to those produced in the corresponding sections of the output based on the “ALPHA” model of reliability analysis (check the statistics on the “Scale” row of the corresponding sets of statistics). The significant feature of these sections of the output is that descriptive and summary statistics are given for each of the two parts of the scale, namely Part 1, which is formed by items 1, 2 and 3, and Part 2, which is composed of items 4, 5 and 6. It is evident that the two parts have different means and standard deviations, as well as different mean item means, item variances, inter-item covariances and inter-item correlations.

Reliability Coefficients     6 items

Correlation between forms = .7328     Equal-length Spearman-Brown = .8458
Guttman Split-half = .8439            Unequal-length Spearman-Brown = .8458

3 items in part 1                     3 items in part 2
Alpha for part 1 = .7911              Alpha for part 2 = .6367

The above section of the output contains the results of the reliability analysis based on the “SPLIT-HALF” model. The correlation between the two halves (or parts), labeled on the output as “Correlation between forms”, is .7328. This is an estimate of the reliability of the scale if it had three items. The equal-length Spearman-Brown coefficient, which has a value of .8458 in this case, tells us what the reliability of the entire scale would be if it were made up of two equal (or parallel) parts that have a three-item reliability of .7328. If the number of items in each of the two parts is not equal, the unequal-length Spearman-Brown coefficient can be used to estimate the reliability of the overall scale. In the present example, since the two parts of the scale are of equal length, the two Spearman-Brown coefficients are identical. The Guttman split-half coefficient is another estimate of the reliability of the overall scale. It does not assume that the two parts are equally reliable or have the same variance, hence the reliability coefficient produced is smaller. Finally, separate values of Cronbach’s α are also shown for each of the two parts of the scale in the output.

c. Reliability Analysis - “EVEN/ODD” Model

****** Method 2 (covariance matrix) will be used for this analysis ******

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (S P L I T)

                 Mean    Std Dev    Cases
  1.  ITEM1     .3571      .4972     14.0
  2.  ITEM3     .5714      .5136     14.0
  3.  ITEM5     .5714      .5136     14.0
  4.  ITEM2     .3571      .4972     14.0
  5.  ITEM4     .5714      .5136     14.0
  6.  ITEM6     .3571      .4972     14.0

Correlation Matrix

           ITEM1    ITEM3    ITEM5    ITEM2    ITEM4    ITEM6
ITEM1     1.0000
ITEM3      .6455   1.0000
ITEM5      .6455    .7083   1.0000
ITEM2      .6889    .3443    .3443   1.0000
ITEM4      .3443    .4167    .4167    .3443   1.0000
ITEM6      .3778    .3443    .3443    .6889    .3443   1.0000

The additional statistics produced in this section of the output are the same as those shown in the corresponding sections of the output based on the “ALPHA” model. The only difference is that, as a result of reordering the items in the scale, the statistics are displayed in a different order here.

N of Cases = 14.0

Statistics for      Mean   Variance   Std Dev   N of Variables
   Part 1         1.5000     1.8077    1.3445                3
   Part 2         1.2857     1.4505    1.2044                3
   Scale          2.7857     5.1044    2.2593                6

                            Mean  Minimum  Maximum    Range  Max/Min  Variance
Item Means
   Part 1                  .5000    .3571    .5714    .2143   1.6000     .0153
   Part 2                  .4286    .3571    .5714    .2143   1.6000     .0153
   Scale                   .4643    .3571    .5714    .2143   1.6000     .0138
Item Variances
   Part 1                  .2582    .2473    .2637    .0165   1.0667     .0001
   Part 2                  .2527    .2473    .2637    .0165   1.0667     .0001
   Scale                   .2555    .2473    .2637    .0165   1.0667     .0001
Inter-item Covariances
   Part 1                  .1722    .1648    .1868    .0220   1.1333     .0001
   Part 2                  .1154    .0879    .1703    .0824   1.9375     .0018
   Scale                   .1190    .0879    .1868    .0989   2.1250     .0015
Inter-item Correlations
   Part 1                  .6664    .6455    .7083    .0628   1.0973     .0011
   Part 2                  .4591    .3443    .6889    .3446   2.0010     .0317
   Scale                   .4665    .3443    .7083    .3641   2.0575     .0234

The above section of the output looks similar to the corresponding section produced under the “SPLIT-HALF” model. In fact, the descriptive and summary statistics reported in these two outputs for the entire scale are identical. However, the descriptive and summary statistics for corresponding parts of the scale reported in the two outputs are not the same. The differences originate from the fact that the compositions of Part 1 and Part 2 are altered in the present analysis, i.e., Part 1 is made up of items 1, 3 and 5, and Part 2 is composed of items 2, 4 and 6.

Item-total Statistics

           Scale        Scale       Corrected
           Mean         Variance    Item-        Squared       Alpha
           if Item      if Item     Total        Multiple      if Item
           Deleted      Deleted     Correlation  Correlation   Deleted
ITEM1       2.4286       3.4945       .7330        .7511        .7901
ITEM3       2.2143       3.5659       .6572        .6000        .8051
ITEM5       2.2143       3.5659       .6572        .6000        .8051
ITEM2       2.4286       3.6484       .6364        .7469        .8095
ITEM4       2.2143       3.8736       .4784        .2533        .8404
ITEM6       2.4286       3.8022       .5440        .5733        .8273

The item-total statistics reported in the present analysis are exactly the same as those reported under the “ALPHA” model, except that the statistics are arranged in a different order. Again, this is a direct result of reordering the items in the scale.

Reliability Coefficients     6 items

Correlation between forms = .5700     Equal-length Spearman-Brown = .7262
Guttman Split-half = .7234            Unequal-length Spearman-Brown = .7262

3 items in part 1                     3 items in part 2
Alpha for part 1 = .8571              Alpha for part 2 = .7159

The above are the results of the reliability analysis based on the “EVEN/ODD” model. Please note that the correlation coefficient between the parts formed respectively by the odd and even items is smaller than the correlation reported in the “SPLIT-HALF” model (i.e. .5700 compared with .7328 in the “SPLIT-HALF” model). As a result, the Spearman-Brown coefficients reported in this analysis are comparatively smaller (i.e. .7262 against .8458).
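The coefficients in the two split-half outputs can be verified by substituting the reported values into the Spearman-Brown prophecy formula and the formula for coefficient α for split halves given earlier. The arithmetic below is an added illustration using only numbers printed in the outputs. The Spearman-Brown value for the odd/even split comes out at .7261 rather than the reported .7262 because the correlation is entered rounded to four decimals. In both analyses the hand-calculated coefficient α for split halves agrees with the value SPSS labels “Guttman Split-half”.

```latex
% First-half / second-half split (items 1-3 vs. items 4-6)
\rho_{XX'} = \frac{2(.7328)}{1 + .7328} = \frac{1.4656}{1.7328} \approx .8458
\alpha = \frac{2\left[5.1044 - (1.6044 + 1.3462)\right]}{5.1044}
       = \frac{4.3076}{5.1044} \approx .8439

% Odd / even split (items 1, 3, 5 vs. items 2, 4, 6)
\rho_{XX'} = \frac{2(.5700)}{1 + .5700} = \frac{1.1400}{1.5700} \approx .7261
\alpha = \frac{2\left[5.1044 - (1.8077 + 1.4505)\right]}{5.1044}
       = \frac{3.6924}{5.1044} \approx .7234
```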
This illustrative example shows that “split-half” reliability analyses can produce different reliability estimates for the same scale, depending on the method the researcher uses to split the items in the scale.

Determining Reliability Using SPSS: Example 2:

The following questionnaire was developed by a researcher as part of an effort to collect participants’ feedback on a five-week community-based program designed to teach individuals disease prevention and to encourage healthier lifestyles. The questionnaire contained six items. Respondents were asked to respond to each item according to the following scale:

  Strongly Agree     Agree     No Opinion     Disagree     Strongly Disagree
        1              2           3              4                5

The 6 items in the questionnaire were:

1. The goals of the program are clear.
2. I feel comfortable in discussing my plans, concerns and experiences with the group.
3. The materials covered in the program are helpful.
4. The health contract is useful in assisting me to make healthy lifestyle changes.
5. Overall speaking, the group is supportive.
6. Overall, the program is useful in assisting me develop positive changes towards healthy lifestyles.

The following is the data obtained from 10 participants:

                          Items
  Person      1      2      3      4      5      6
     1        2      3      1      3      4      2
     2        1      2      1      1      3      1
     3        4      3      4      5      3      3
     4        5      3      2      4      3      2
     5        2      1      2      2      1      1
     6        3      3      1      3      3      1
     7        4      5      2      3      4      2
     8        2      1      2      2      1      1
     9        2      2      2      2      2      2
    10        3      4      2      5      4      2

Conducting Cronbach’s Alpha, Split-Half Reliability and Even-Odd Reliability Analyses on the Set of Scores Obtained from 10 Respondents for the 6 Items Using SPSS

1. SPSS Computer Program

2. SPSS Outputs and Results¹¹

¹¹ Only sections of outputs relevant to the purposes and needs of our present analysis will be reproduced below. Brief discussions are in italics and they are not parts of the original computer outputs.

a. Reliability Analysis - “Alpha” Model

****** Method 2 (covariance matrix) will be used for this analysis ******

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (A L P H A)

                   Mean     Std Dev     Cases
  1.   ITEM1     2.8000     1.2293       10.0
  2.   ITEM2     2.7000     1.2517       10.0
  3.   ITEM3     1.9000      .8756       10.0
  4.   ITEM4     3.0000     1.3333       10.0
  5.   ITEM5     2.8000     1.1353       10.0
  6.   ITEM6     1.7000      .6749       10.0

Correlation Matrix

            ITEM1     ITEM2     ITEM3     ITEM4     ITEM5     ITEM6
  ITEM1    1.0000
  ITEM2     .6066    1.0000
  ITEM3     .4955     .0710    1.0000
  ITEM4     .7457     .5992     .5710    1.0000
  ITEM5     .3662     .8914    -.1341     .5138    1.0000
  ITEM6     .5892     .5392     .6956     .7408     .4930    1.0000

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (A L P H A)

N of Cases = 10.0

  Statistics for        Mean    Variance     Std Dev    N of Variables
    Scale            14.9000     25.8778      5.0870          6

                        Mean     Minimum     Maximum       Range     Max/Min    Variance
  Item Means          2.4833      1.7000      3.0000      1.3000      1.7647       .2937
  Item Variances      1.2278       .4556      1.7778      1.3222      3.9024       .2621
  Inter-item
  Covariances          .6170      -.1333      1.2667      1.4000     -9.5000       .1434
  Inter-item
  Correlations         .5189      -.1341       .8914      1.0255     -6.6457       .0651

Item-total Statistics

              Scale         Scale        Corrected
              Mean          Variance     Item-          Squared        Alpha
              if Item       if Item      Total          Multiple       if Item
              Deleted       Deleted      Correlation    Correlation    Deleted

  ITEM1      12.1000       16.9889         .7281          .7344         .8192
  ITEM2      12.2000       16.8444         .7267          .9060         .8196
  ITEM3      13.0000       22.0000         .3788          .8634         .8750
  ITEM4      11.9000       15.4333         .8273          .7752         .7973
  ITEM5      12.1000       18.9889         .5660          .9363         .8499
  ITEM6      13.2000       20.6222         .7830          .8531         .8311

Reliability Coefficients     6 items

  Alpha =   .8584           Standardized item alpha =   .8662

The Cronbach’s Alpha reported in the above analysis is .8584. This indicates that the 6-item questionnaire is quite reliable.
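The alpha coefficient and the “Alpha if Item Deleted” column can be reproduced directly from the raw scores. The following is a minimal Python sketch, not the SPSS implementation. It assumes that the data table for the 10 participants is read column by column (one column per item), a reading that reproduces the item means in the output; the names SCORES, cronbach_alpha and alpha_if_deleted are illustrative.

```python
from statistics import variance

# Assumed transcription of the Example 2 data: rows = persons 1-10, columns = items 1-6.
SCORES = [
    [2, 3, 1, 3, 4, 2],
    [1, 2, 1, 1, 3, 1],
    [4, 3, 4, 5, 3, 3],
    [5, 3, 2, 4, 3, 2],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 1, 3, 3, 1],
    [4, 5, 2, 3, 4, 2],
    [2, 1, 2, 2, 1, 1],
    [2, 2, 2, 2, 2, 2],
    [3, 4, 2, 5, 4, 2],
]


def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]  # sample variances (n - 1)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_deleted(rows, item):
    """Alpha of the remaining scale after dropping one item (1-based item number)."""
    reduced = [[v for j, v in enumerate(r, start=1) if j != item] for r in rows]
    return cronbach_alpha(reduced)


alpha = cronbach_alpha(SCORES)
print(f"alpha = {alpha:.4f}")
for item in range(1, 7):
    print(f"alpha if item {item} deleted = {alpha_if_deleted(SCORES, item):.4f}")
```

The sketch uses sample variances (divisor n - 1), which matches the variances printed in the output above.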
The last column in the Item-total Statistics indicates that removing item 4 from the questionnaire will lead to a drop of Cronbach’s α from .8584 to .7973, while removing item 3 from the questionnaire will lead to an increase of Cronbach’s α from .8584 to .8750.

b. Reliability Analysis - “SPLIT-HALF” Model

RELIABILITY ANALYSIS - S C A L E (T E S T S C O R)

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (S P L I T)

N of Cases = 10.0

  Statistics for        Mean    Variance     Std Dev    N of Variables
    Part 1            7.4000      6.9333      2.6331          3
    Part 2            7.5000      7.1667      2.6771          3
    Scale            14.9000     25.8778      5.0870          6

  Item Means            Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1            2.4667      1.9000      2.8000       .9000      1.4737       .2433
    Part 2            2.5000      1.7000      3.0000      1.3000      1.7647       .4900
    Scale             2.4833      1.7000      3.0000      1.3000      1.7647       .2937

  Item Variances        Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1            1.2815       .7667      1.5667       .8000      2.0435       .1995
    Part 2            1.1741       .4556      1.7778      1.3222      3.9024       .4470
    Scale             1.2278       .4556      1.7778      1.3222      3.9024       .2621

  Inter-item
  Covariances           Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1             .5148       .0778       .9333       .8556     12.0000       .1466
    Part 2             .6074       .3778       .7778       .4000      2.0588       .0341
    Scale              .6170      -.1333      1.2667      1.4000     -9.5000       .1434

  Inter-item
  Correlations          Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1             .3910       .0710       .6066       .5356      8.5474       .0639
    Part 2             .5825       .4930       .7408       .2478      1.5026       .0151
    Scale              .5189      -.1341       .8914      1.0255     -6.6457       .0651

Reliability Coefficients     6 items

  Correlation between forms =    .8354     Equal-length Spearman-Brown =      .9103
  Guttman Split-half =           .9103     Unequal-length Spearman-Brown =    .9103

  3 items in part 1                        3 items in part 2
  Alpha for part 1 =             .6683     Alpha for part 2 =                 .7628

The Spearman-Brown results reported in the output of the reliability analysis based on the “SPLIT-HALF” model indicate that the reliability of the entire scale/questionnaire is .9103 if it is made up of two equal (or parallel) three-item parts that correlate .8354 with each other, which is the estimated reliability of each three-item half.
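The correlation between forms and the Spearman-Brown coefficient for the “SPLIT-HALF” model can be recomputed from the raw scores. The following is a minimal Python sketch under the same assumption as before, namely that the Example 2 data table is read one column per item; SCORES, pearson and split_half are illustrative names, and the sketch is not the SPSS procedure itself.

```python
from math import sqrt
from statistics import mean

# Assumed transcription of the Example 2 data: rows = persons 1-10, columns = items 1-6.
SCORES = [
    [2, 3, 1, 3, 4, 2],
    [1, 2, 1, 1, 3, 1],
    [4, 3, 4, 5, 3, 3],
    [5, 3, 2, 4, 3, 2],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 1, 3, 3, 1],
    [4, 5, 2, 3, 4, 2],
    [2, 1, 2, 2, 1, 1],
    [2, 2, 2, 2, 2, 2],
    [3, 4, 2, 5, 4, 2],
]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half(rows, part1_items, part2_items):
    """Return (correlation between forms, equal-length Spearman-Brown).

    Items are given as 1-based item numbers; each part score is the sum of its items.
    """
    part1 = [sum(r[i - 1] for i in part1_items) for r in rows]
    part2 = [sum(r[i - 1] for i in part2_items) for r in rows]
    r = pearson(part1, part2)
    return r, 2 * r / (1 + r)


# First half (items 1-3) versus second half (items 4-6), as in the "SPLIT-HALF" model.
r_forms, sb = split_half(SCORES, [1, 2, 3], [4, 5, 6])
print(f"Correlation between forms   = {r_forms:.4f}")
print(f"Equal-length Spearman-Brown = {sb:.4f}")
```

The same function accepts any other division of the items, for example odd items versus even items, by changing the two item lists.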
Separate values of Cronbach’s α are shown for each of the two parts of the scale/questionnaire, i.e., Cronbach’s α for the first half is .6683 and that for the second half is .7628.

c. Reliability Analysis - “EVEN/ODD” Model

R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E (T E S T S C O R)

N of Cases = 10.0

  Statistics for        Mean    Variance     Std Dev    N of Variables
    Part 1            7.5000      5.3889      2.3214          3
    Part 2            7.4000      8.0444      2.8363          3
    Scale            14.9000     25.8778      5.0870          6

  Item Means            Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1            2.5000      1.9000      2.8000       .9000      1.4737       .2700
    Part 2            2.4667      1.7000      3.0000      1.3000      1.7647       .4633
    Scale             2.4833      1.7000      3.0000      1.3000      1.7647       .2937

  Item Variances        Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1            1.1889       .7667      1.5111       .7444      1.9710       .1460
    Part 2            1.2667       .4556      1.7778      1.3222      3.9024       .5046
    Scale             1.2278       .4556      1.7778      1.3222      3.9024       .2621

  Inter-item
  Covariances           Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1             .3037      -.1333       .5333       .6667     -4.0000       .1147
    Part 2             .7074       .4556      1.0000       .5444      2.1951       .0603
    Scale              .6170      -.1333      1.2667      1.4000     -9.5000       .1434

  Inter-item
  Correlations          Mean     Minimum     Maximum       Range     Max/Min    Variance
    Part 1             .2425      -.1341       .4955       .6296     -3.6942       .0885
    Part 2             .6264       .5392       .7408       .2016      1.3738       .0086
    Scale              .5189      -.1341       .8914      1.0255     -6.6457       .0651

Reliability Coefficients     6 items

  Correlation between forms =    .9450     Equal-length Spearman-Brown =      .9717
  Guttman Split-half =           .9618     Unequal-length Spearman-Brown =    .9717

  3 items in part 1                        3 items in part 2
  Alpha for part 1 =             .5072     Alpha for part 2 =                 .7914

The Spearman-Brown coefficient reported in the results of the reliability analysis based on the “EVEN/ODD” model is .9717, indicating that the 6-item questionnaire is very reliable. This Spearman-Brown coefficient is even higher than that reported in the “SPLIT-HALF” model.
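The Guttman split-half coefficient in these outputs can be checked from the part and scale variances alone. The following is a minimal Python sketch using the standard formula 2 × (1 − (variance of part 1 + variance of part 2) / variance of the whole scale); the function name guttman_split_half is illustrative, and the inputs are the rounded variances printed in the “Statistics for” tables.

```python
def guttman_split_half(var_part1: float, var_part2: float, var_scale: float) -> float:
    """Guttman split-half reliability from the two part variances and the scale variance."""
    return 2 * (1 - (var_part1 + var_part2) / var_scale)


# Variances taken from the "Statistics for" tables of the two Example 2 outputs.
guttman_even_odd = guttman_split_half(5.3889, 8.0444, 25.8778)    # "EVEN/ODD" model
guttman_split_half_model = guttman_split_half(6.9333, 7.1667, 25.8778)  # "SPLIT-HALF" model

print(f"EVEN/ODD   Guttman split-half: {guttman_even_odd:.4f}")
print(f"SPLIT-HALF Guttman split-half: {guttman_split_half_model:.4f}")
```

Unlike the Spearman-Brown coefficient, this formula does not assume that the two parts have equal variances. In the “SPLIT-HALF” output the part variances are nearly equal (6.9333 and 7.1667), so the two coefficients coincide at .9103; in the “EVEN/ODD” output the part variances differ more (5.3889 and 8.0444), and the Guttman coefficient (.9618) falls slightly below the Spearman-Brown coefficient (.9717).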
The correlation coefficient between the parts formed respectively by even and odd items is also larger than the correlation reported in the “SPLIT-HALF” model (i.e., .9450 compared with .8354). The Cronbach’s α for the first part of the questionnaire (i.e., the part made up of odd items) is .5072, while that for the second part is .7914.

Part Five: Using SPSS for Windows to Implement Reliability Analyses

The following section outlines the steps necessary in undertaking three different forms of ‘single test administration’ reliability analyses: (1) Cronbach alpha, (2) Split-half, and (3) Even-odd. For further discussion of these reliability measures, students are encouraged to consult Bob Gebotys’ “Handout on Reliability.”

5.1 Conducting a Reliability Analysis using the Cronbach Alpha (α) Measure

For this analysis we will use the data regarding adolescent attitudes toward physical aggression, as outlined on page 6 of Gebotys’ “Handout on Reliability.” In order to conduct this analysis, the following steps are required.

1. Enter the aforementioned data set into an SPSS Data Editor window (see Section 2.1 for instructions, if necessary).

2. Next, click Statistics on the main menu bar, followed by Scale, and then Reliability Analysis… This series of clicks will open a Reliability Analysis dialog box similar to the one shown below.

3. You should note that all of the variables (all Items) are listed in the text box at the left-hand side of the dialog box. Click on “item1” and, keeping the left mouse button depressed, drag the mouse downward until all variables (i.e., item1 through item6) are highlighted. Once they are highlighted, click the right arrow button (►) in the centre of the dialog box to move the selected variables into the ‘Items:’ text box.

4. Next, check to see that the text in the ‘Model:’ text box reads “Alpha.” If it does not, click on the downward arrow (▼) to the right of the text box and select Alpha from the list that appears.

5. Next, click on the Statistics… pushbutton, which will open a ‘Reliability Analysis: Statistics’ subdialog box similar to the one below.

6. Next, select (i.e., click on) all options under ‘Descriptives for’ (i.e., item, scale, scale if item deleted), ‘Summaries’ (i.e., means, variances, covariances, correlations), and ‘Inter-item’ (i.e., correlations, covariances). These are the primary statistics that you will need to interpret your reliability analyses. If, however, you would like further statistics, such as the ‘F test’ and ‘Hotelling’s T-square,’ you can make these selections from options in this subdialog box.

7. Once you have made your selections, click the Continue command pushbutton at the top right-hand corner of the subdialog box. This will return you to the ‘Reliability Analysis’ dialog box.

8. You have now completed all the necessary steps in specifying the reliability procedure. If you would like to examine the SPSS syntax for this procedure, please read the note below. If you would like to run this procedure now, without examining the syntax, click the OK command pushbutton at the top right-hand corner of the dialog box.

Note: If you would like to examine the SPSS syntax for this procedure, click on the Paste command pushbutton to open an SPSS Syntax window. The syntax window should then resemble the one below. In order to run this syntax and complete the reliability analysis, click Run on the menu bar, followed by All.

Once you have run the Cronbach Alpha reliability procedure, the results should appear in an SPSS Viewer window similar to the one shown below. At this stage it is recommended that you save and print the contents of the SPSS Viewer window. The steps that you take to save and print this reliability analysis are identical to the steps taken to save and print the Scatterplot, as outlined in section 2.4 of this guide.
Your output should resemble the information on the following pages.

5.2 Conducting the Split-Half Reliability Analysis

Please note that the steps necessary for conducting the Split-Half reliability analysis are almost identical to the procedures outlined above for the Cronbach Alpha analysis. The only difference when using SPSS for Windows is that you must specify the “Split-half” model instead of the “Alpha” model in the Reliability Analysis dialog box. Therefore, for the Split-Half model, step #4 should read as follows:

#4. Next, check to see that the text in the ‘Model:’ text box reads “Split-half.” If it does not, click on the downward arrow to the right of the text box and select Split-half from the list that appears.

The SPSS Syntax window for the Split-half analysis should resemble the example below. Again, it is recommended that you save and print the contents of the SPSS Viewer window. Your output should resemble the output on the following pages.

5.3 Conducting the Even-Odd Reliability Analysis

Unlike the Cronbach Alpha and Split-Half models, the Even-Odd method of assessing reliability cannot be accessed using the “point and click” approach in SPSS. In order to use the Even-Odd option, one needs to modify the syntax file for the Split-Half model. More specifically, the order of the items examined needs to be changed so that the odd items form the first part of the scale and the even items form the remaining part. Therefore, your first step should be to follow the instructions noted above for undertaking the Split-Half model, but then be sure to click the Paste command pushbutton in the Reliability Analysis dialog box in order to open the SPSS Syntax window.
Recall that the syntax for the Split-Half model contains the following information:

  RELIABILITY
    /VARIABLES=item1 item2 item3 item4 item5 item6
    /FORMAT=NOLABELS
    /SCALE(SPLIT)=ALL/MODEL=SPLIT
    /STATISTICS=DESCRIPTIVE SCALE HOTELLING CORR COV ANOVA
    /SUMMARY=TOTAL MEANS VARIANCE COV CORR .

For the Even-Odd model, you will need to make the following change to line 4 of the syntax file, replacing the keyword ALL with the items listed in odd-then-even order:

  Before:   /SCALE(SPLIT)=ALL/MODEL=SPLIT
  After:    /SCALE(SPLIT)= item1 item3 item5 item2 item4 item6/MODEL=SPLIT

The entire syntax file should now read:

  RELIABILITY
    /VARIABLES=item1 item2 item3 item4 item5 item6
    /FORMAT=NOLABELS
    /SCALE(SPLIT)= item1 item3 item5 item2 item4 item6/MODEL=SPLIT
    /STATISTICS=DESCRIPTIVE SCALE HOTELLING CORR COV ANOVA
    /SUMMARY=TOTAL MEANS VARIANCE COV CORR .

Once you have made the change noted above, click Run on the menu bar, followed by All. The analysis output should then appear in an SPSS Viewer window. You should then proceed to save and print the analysis. Your output should resemble the information provided on the following pages.
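For scales with a different number of items, the odd-then-even item order used in the modified /SCALE(SPLIT) subcommand can be generated rather than typed by hand. The following is a minimal Python sketch, offered only as a convenience and not as an SPSS feature; the helper names odd_even_order and scale_subcommand are hypothetical, and the sketch assumes the variables are named item1, item2, and so on, as in the syntax above.

```python
def odd_even_order(n_items: int, prefix: str = "item") -> list[str]:
    """List variable names with odd-numbered items first, then even-numbered items."""
    names = [f"{prefix}{i}" for i in range(1, n_items + 1)]
    return names[0::2] + names[1::2]  # item1, item3, ... followed by item2, item4, ...


def scale_subcommand(n_items: int) -> str:
    """Build the text of the reordered /SCALE(SPLIT) line for pasting into the syntax file."""
    return "/SCALE(SPLIT)= " + " ".join(odd_even_order(n_items)) + "/MODEL=SPLIT"


line = scale_subcommand(6)
print(line)
```

For six items, the printed line matches the “After” version of line 4 of the syntax file.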