Missing Data 1

Appendix B: Sample Syntax of Analyses for Illustration

The following three sections of this appendix provide sample syntax for performing mean substitution using SPSS, MI with SAS, and FIML with Mplus. All use the illustrative data set described in the text and available from the journal website. This data set is a tab-delimited text file titled “illustration.txt” that contains 60 rows representing the 60 cases and 10 columns representing 10 variables: ID number (from 1 to 60), group membership (0 or 1), the covariate, and 7 variables representing the dependent variable under (a) no missing values; (b) MCAR with 10, 20, or 50% missing; and (c) MAR with 10, 20, or 50% missing. Missing values are denoted by the value 999 in this text file. The data set does not contain variable names in the first row; instead, the syntax files specify the variable names and order.

Mean Substitution using SPSS (not recommended)

Mean substitution, which is an approach we do not recommend, can be performed in all programs. We next provide SPSS syntax to demonstrate these analyses with the illustrative data set.

* The following opens the data file
* assuming it is placed directly in the C drive .
GET DATA
  /TYPE=TXT
  /FILE='C:\illustration.txt'
  /DELCASE=LINE
  /DELIMITERS="\t"
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=1
  /IMPORTCASE=ALL
  /VARIABLES=
    ID F2.0
    Group F1.0
    Covariat F18.16
    DV0Miss F18.16
    DV10MCAR F18.16
    DV20MCAR F18.16
    DV50MCAR F17.16
    DV10MAR F18.16
    DV20MAR F18.16
    DV50MAR F18.16.
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

* The next lines recode the missing code 999 into SPSS system missing values .
RECODE DV10MCAR DV20MCAR DV50MCAR DV10MAR DV20MAR DV50MAR (999=SYSMIS).
EXECUTE .

* Analysis with 0% missing .
* The dependent variable (DV0Miss) is regressed onto “Group” .
* The remaining syntax specifies SPSS defaults for regression,
  which are reasonable in this analysis (including listwise
  deletion, which is irrelevant in this situation of no missing
  data) .
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT DV0Miss
  /METHOD=ENTER Group .

* Analysis with 10% MCAR using Mean Substitution .
* Here, the dependent variable is “DV10MCAR”, which is the same
  variable as used previously, but with 10% of cases randomly
  deleted (i.e., missing) .
* Note that the line “/MISSING MEANSUBSTITUTION” specifies mean
  substitution as the method of managing missing data .
DATASET ACTIVATE DataSet1.
REGRESSION
  /MISSING MEANSUBSTITUTION
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT DV10MCAR
  /METHOD=ENTER Group.

* Analysis with 20% MCAR using Mean Substitution .
DATASET ACTIVATE DataSet1.
REGRESSION
  /MISSING MEANSUBSTITUTION
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT DV20MCAR
  /METHOD=ENTER Group.

To conserve space, we do not show the syntax for the remaining four analyses; these simply involve specifying DV50MCAR, DV10MAR, DV20MAR, or DV50MAR as the dependent variable on the /DEPENDENT subcommand.

MI using SAS

MI is an approach greatly preferred over mean substitution. SPSS does not have the ability to directly perform MI.
We illustrate these analyses using SAS syntax as shown next:

***The following lines read in data and recode 999 as missing***;
PROC IMPORT OUT=WORK.Illustration
    DATAFILE="C:\illustration.txt"
    DBMS=TAB REPLACE;
  GETNAMES=NO;
  DATAROW=1;
RUN;

DATA illustration;
  SET illustration;
  ID = Var1;
  Group = Var2;
  Covariat = Var3;
  DV0Miss = Var4;
  IF (Var5 < 999) THEN DV10MCAR = Var5;
  IF (Var6 < 999) THEN DV20MCAR = Var6;
  IF (Var7 < 999) THEN DV50MCAR = Var7;
  IF (Var8 < 999) THEN DV10MAR = Var8;
  IF (Var9 < 999) THEN DV20MAR = Var9;
  IF (Var10 < 999) THEN DV50MAR = Var10;
RUN;

****Analysis with 0% missing****;
* The dependent variable (DV0Miss) is regressed onto “Group”;
PROC REG DATA=illustration;
  MODEL DV0Miss = group;
RUN;

****Multiple Imputation for 10% MCAR****;
* Here, the dependent variable is “DV10MCAR”, which has 10% of
  cases randomly deleted (i.e., missing);
**** The first command creates 10 imputed data sets ***;
PROC MI DATA=illustration OUT=MCAR10 NIMPUTE=10 SEED=1211981;
  VAR group covariat DV10MCAR;
RUN;
**** This second command performs 10 regression analyses for the
     10 imputed data sets ***;
PROC REG DATA=MCAR10 OUTEST=a COVOUT;
  MODEL DV10MCAR = group;
  BY _IMPUTATION_;
RUN;
**** This third command combines results of the 10 regression
     analyses to estimate appropriate standard errors for the
     regression coefficients intercept and group ***;
PROC MIANALYZE DATA=a;
  MODELEFFECTS INTERCEPT group;
RUN;

****Multiple Imputation for 20% MCAR****;
PROC MI DATA=illustration OUT=MCAR20 NIMPUTE=10 SEED=1211981;
  VAR group covariat DV20MCAR;
RUN;
PROC REG DATA=MCAR20 OUTEST=a COVOUT;
  MODEL DV20MCAR = group;
  BY _IMPUTATION_;
RUN;
PROC MIANALYZE DATA=a;
  MODELEFFECTS INTERCEPT group;
RUN;

To conserve space, we do not show the syntax for the remaining four analyses. These involve giving a new name to the imputed data file (the OUT= file of PROC MI, which is also the DATA= file of PROC REG) and substituting DV50MCAR, DV10MAR, DV20MAR, or DV50MAR in the VAR statement of PROC MI and as the dependent variable in the MODEL statement of PROC REG.

FIML using Mplus

FIML is a model-based approach that is also greatly preferred over mean substitution. The FIML approach to missing data management is most commonly implemented in structural equation modeling or multilevel modeling software. We illustrate these analyses using Mplus syntax as shown next. Note that two warnings appear in the output for this syntax: an input warning noting that ‘Type=Missing’ is now the default, and a warning regarding the standard errors. Both of these warnings can be ignored.

Syntax for no missing data:

TITLE: 0% Missingness;
DATA: FILE IS "c:\illustration.txt";
VARIABLE:
  NAMES ARE ID Group Covariat DV0Miss
    DV10MCAR DV20MCAR DV50MCAR
    DV10MAR DV20MAR DV50MAR;
  USEVARIABLES ARE group DV0Miss covariat;
  MISSING = all (999);
ANALYSIS:
  TYPE IS MISSING;
  ESTIMATOR IS ML;
  ITERATIONS = 1000;
  CONVERGENCE = 0.00005;
  COVERAGE = 0.10;
MODEL:
  DV0Miss on group;
  group with covariat;
  DV0Miss with covariat;
OUTPUT: tech1 tech3;

Syntax for 10% MCAR:

TITLE: FIML 10% MCAR;
DATA: FILE IS "c:\illustration.txt";
VARIABLE:
  NAMES ARE ID Group Covariat DV0Miss
    DV10MCAR DV20MCAR DV50MCAR
    DV10MAR DV20MAR DV50MAR;
  USEVARIABLES ARE group DV10MCAR covariat;
  MISSING = all (999);
ANALYSIS:
  TYPE IS MISSING;
  ESTIMATOR IS ML;
  ITERATIONS = 1000;
  CONVERGENCE = 0.00005;
  COVERAGE = 0.10;
MODEL:
  DV10MCAR on group;
  group with covariat;
  DV10MCAR with covariat;
OUTPUT: tech1 tech3;

Again, to conserve space, we do not show the syntax for the remaining five analyses. To perform these analyses, one simply replaces DV10MCAR within the syntax above with DV20MCAR, DV50MCAR, DV10MAR, DV20MAR, or DV50MAR.
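For readers who want to see what the pooling step in the MI section does conceptually, the following Python sketch combines one coefficient's estimates and standard errors across imputed data sets using Rubin's rules. It is an illustration only, not the SAS implementation: the function name `pool_rubin` is hypothetical, and the input numbers are made-up values, not results from the illustrative data set.

```python
import math

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists with at least two imputations")
    pooled = sum(estimates) / m                      # mean of the m estimates
    within = sum(se ** 2 for se in std_errors) / m   # average sampling variance
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between           # total variance
    return pooled, math.sqrt(total)

# Hypothetical estimates and standard errors from m = 3 imputed data sets
est, se = pool_rubin([0.9, 1.0, 1.1], [0.2, 0.2, 0.2])
print(round(est, 4), round(se, 4))
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates differ across imputations, because the between-imputation variance is added to the average within-imputation variance.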
Illustration.txt

(Columns, in order: ID, Group, Covariat, DV0Miss, DV10MCAR, DV20MCAR, DV50MCAR, DV10MAR, DV20MAR, DV50MAR.)

1 0 -.850867829681874 -.154493593047363 -.154493593047363 -.154493593047363 -.154493593047363 -.154493593047363 -.154493593047363 -.154493593047363
2 0 -.211056217578008 -1.29827630533749 -1.29827630533749 -1.29827630533749 999 -1.29827630533749 -1.29827630533749 999
3 0 -1.58158061928887 -1.84729308697254 -1.84729308697254 999 999 999 999 999
4 0 -.892182736398273 -.749896078284945 -.749896078284945 -.749896078284945 -.749896078284945 -.749896078284945 -.749896078284945 999
5 0 -.0154904264600894 -.0350404123315361 -.0350404123315361 -.0350404123315361 999 -.0350404123315361 -.0350404123315361 999
6 0 -.617416180956403 -.791173756467051 -.791173756467051 -.791173756467051 -.791173756467051 -.791173756467051 -.791173756467051 999
7 0 -.160947342612139 -1.39305601560536 -1.39305601560536 -1.39305601560536 -1.39305601560536 -1.39305601560536 -1.39305601560536 999
8 0 .0822082173324673 -.592680564158753 -.592680564158753 -.592680564158753 -.592680564158753 -.592680564158753 -.592680564158753 -.592680564158753
9 0 -2.84579505283808 -.928260768294282 -.928260768294282 -.928260768294282 999 999 999 999
10 0 -.750327378280308 -.566730518398268 -.566730518398268 -.566730518398268 -.566730518398268 999 999 999
11 0 -.192208289689706 1.6010839421218 1.6010839421218 999 999 1.6010839421218 1.6010839421218 1.6010839421218
12 0 -.29929998834839 -.706578484159677 -.706578484159677 -.706578484159677 999 -.706578484159677 -.706578484159677 -.706578484159677
13 0 .397103644817022 -.854409356857959 -.854409356857959 -.854409356857959 -.854409356857959 -.854409356857959 -.854409356857959 -.854409356857959
14 0 -1.86598842125329 -2.01778396402153 -2.01778396402153 -2.01778396402153 -2.01778396402153 -2.01778396402153 -2.01778396402153 999
15 0 -.563836006435321 .336646818210308 .336646818210308 .336646818210308 .336646818210308 .336646818210308 999 999
16 0 -.147133183225006 1.09641236468901 1.09641236468901 1.09641236468901 1.09641236468901 1.09641236468901 1.09641236468901 1.09641236468901
17 0 -.0355818140701889 -.0997428904192106 999 999 999 -.0997428904192106 -.0997428904192106 -.0997428904192106
18 0 .887897514907359 -.182639887782136 -.182639887782136 -.182639887782136 999 -.182639887782136 -.182639887782136 -.182639887782136
19 0 -1.48840552487275 -.539959146486662 -.539959146486662 -.539959146486662 999 999 999 999
20 0 .108675987681218 -.437665939005987 -.437665939005987 999 999 -.437665939005987 -.437665939005987 -.437665939005987
21 0 -.20275634025207 -1.38987700220506 999 999 999 -1.38987700220506 -1.38987700220506 -1.38987700220506
22 0 -.376034169822592 -.0396592797147315 -.0396592797147315 -.0396592797147315 999 999 999 999
23 0 1.00077829138927 -.358278452919041 999 999 999 -.358278452919041 -.358278452919041 -.358278452919041
24 0 -.542027337979399 .0171460549100705 .0171460549100705 .0171460549100705 .0171460549100705 .0171460549100705 .0171460549100705 .0171460549100705
25 0 -.849060312346486 -.549660453094282 -.549660453094282 -.549660453094282 999 -.549660453094282 -.549660453094282 -.549660453094282
26 0 -1.80705390136401 -.289094061603656 -.289094061603656 -.289094061603656 -.289094061603656 -.289094061603656 999 999
27 0 -.101662079471109 -.276314150394167 -.276314150394167 999 999 -.276314150394167 -.276314150394167 999
28 0 -3.2620755881702 -1.39658693937835 -1.39658693937835 -1.39658693937835 -1.39658693937835 -1.39658693937835 999 999
29 0 -.0330451498263781 .24681560739095 .24681560739095 .24681560739095 .24681560739095 999 999 999
30 0 .0396122058845316 -.680536147388183 999 999 999 -.680536147388183 999 999
31 1 1.05826453711523 .99917526641664 .99917526641664 .99917526641664 .99917526641664 .99917526641664 .99917526641664 .99917526641664
32 1 -.181826632905936 .29456458296444 .29456458296444 .29456458296444 .29456458296444 .29456458296444 .29456458296444 999
33 1 .325298620152614 -.124115873877004 -.124115873877004 -.124115873877004 -.124115873877004 -.124115873877004 -.124115873877004 -.124115873877004
34 1 -.554650029752322 -.622863667018844 -.622863667018844 -.622863667018844 999 -.622863667018844 -.622863667018844 -.622863667018844
35 1 -.654212127213383 -1.01541423045684 -1.01541423045684 -1.01541423045684 999 -1.01541423045684 999 999
36 1 -1.1177906863268 .209256887497134 .209256887497134 .209256887497134 999 .209256887497134 .209256887497134 999
37 1 -.181923486491488 -.717691998996674 -.717691998996674 -.717691998996674 -.717691998996674 -.717691998996674 -.717691998996674 -.717691998996674
38 1 .0570406365471001 .156146647451192 999 999 999 .156146647451192 .156146647451192 999
39 1 .664147230154368 -.288229623773199 -.288229623773199 -.288229623773199 -.288229623773199 -.288229623773199 -.288229623773199 999
40 1 1.20738016510671 .251229411715677 .251229411715677 .251229411715677 .251229411715677 .251229411715677 .251229411715677 .251229411715677
41 1 1.49167062362998 1.82453435527453 1.82453435527453 1.82453435527453 1.82453435527453 1.82453435527453 1.82453435527453 1.82453435527453
42 1 .10443799659144 -.505317323754116 -.505317323754116 -.505317323754116 999 -.505317323754116 -.505317323754116 999
43 1 -.549409315402947 .688840627679144 .688840627679144 .688840627679144 999 .688840627679144 .688840627679144 .688840627679144
44 1 1.2355847811718 .488225613985325 .488225613985325 .488225613985325 .488225613985325 .488225613985325 .488225613985325 .488225613985325
45 1 1.4159448165691 .102824753225284 .102824753225284 .102824753225284 999 .102824753225284 .102824753225284 .102824753225284
46 1 -.829901220499762 -.884557475454705 -.884557475454705 999 999 -.884557475454705 -.884557475454705 999
47 1 1.16593272486838 2.28809182337569 2.28809182337569 2.28809182337569 2.28809182337569 2.28809182337569 2.28809182337569 2.28809182337569
48 1 .43362804115754 .446226340180492 .446226340180492 .446226340180492 999 .446226340180492 .446226340180492 999
49 1 .206497558025471 .641067713273713 .641067713273713 .641067713273713 .641067713273713 .641067713273713 .641067713273713 999
50 1 .847932701494698 1.25236793850088 1.25236793850088 1.25236793850088 1.25236793850088 1.25236793850088 1.25236793850088 1.25236793850088
51 1 1.84759808931025 1.69199874491296 1.69199874491296 1.69199874491296 1.69199874491296 1.69199874491296 1.69199874491296 1.69199874491296
52 1 -.7760562984868 -.390328025772538 -.390328025772538 -.390328025772538 -.390328025772538 -.390328025772538 999 999
53 1 1.83621118386067 1.06028971435592 1.06028971435592 1.06028971435592 999 1.06028971435592 1.06028971435592 1.06028971435592
54 1 1.63096618805731 1.72387743255306 1.72387743255306 1.72387743255306 999 1.72387743255306 1.72387743255306 1.72387743255306
55 1 2.43408888774903 .821243682071921 .821243682071921 .821243682071921 .821243682071921 .821243682071921 .821243682071921 .821243682071921
56 1 1.05397592438395 .507454217808059 .507454217808059 999 999 .507454217808059 .507454217808059 999
57 1 .0521649622721339 .238215282499099 .238215282499099 .238215282499099 .238215282499099 .238215282499099 .238215282499099 999
58 1 1.43465274304311 2.1426558324697 999 999 999 2.1426558324697 2.1426558324697 2.1426558324697
59 1 2.72640613683008 2.3463057079243 2.3463057079243 2.3463057079243 999 2.3463057079243 2.3463057079243 2.3463057079243
60 1 -1.20849872180246 -.748491890025146 -.748491890025146 -.748491890025146 -.748491890025146 -.748491890025146 -.748491890025146 999
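As a rough illustration of how the 999 codes in this listing are treated, the following Python sketch recodes 999 as missing and then applies mean substitution to a single variable. It is a minimal sketch rather than a reproduction of the SPSS procedure; the helper names `recode_missing` and `mean_substitute` are hypothetical, and only the DV50MCAR values for cases 1 through 5 are used.

```python
MISSING_CODE = 999.0  # missing-value code used in illustration.txt

def recode_missing(values, code=MISSING_CODE):
    """Replace the numeric missing code with None."""
    return [None if v == code else v for v in values]

def mean_substitute(values):
    """Fill each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

# DV50MCAR for cases 1-5 as listed above
dv50mcar = [-.154493593047363, 999, 999, -.749896078284945, 999]
recoded = recode_missing(dv50mcar)
filled = mean_substitute(recoded)
print(recoded)
print(filled)
```

Every substituted case receives the same value, so the filled variable has less spread than the observed values alone, which is consistent with the recommendation against this approach.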