SOP_repCovariance

SOP: How I automate repetitive analyses of covariance (ANCOVA) with
SPSS.
Stefan Vormfelde in September 2011
Here I describe my standard operating procedure to automate analyses of covariance
(ANCOVAs) in SPSS. My main application is to analyze the effect of multiple
genotypes on multiple phenotypes in stratified populations.
Stratification. I may be interested in the effect of variables like gender or study center,
which stratify my data, or I may not be interested in their effect. When the results for
gender or study center are of interest, I include them in the analyses as “fixed
factors”. I may also include gender or study center when I am not interested in their
results. My aim is then to reduce excess variation, and I include such stratifying
variables as “random factors”.
Interaction term. Commonly, to explore covariance, the first run has to include an
interaction term. The reason is that a factor may act in opposite directions in
different strata, e.g. positively in left-handed and negatively in right-handed subjects.
Without an interaction term, such an effect may well appear negligible. If the interaction
term is not statistically significant, a second run is restricted to the main effects. I
commonly analyze data in which genotypes are expected to affect all strata in parallel. I
therefore start directly with the main effects (omitting the interaction term). Here I
describe how I automate such ANCOVAs, analyzing the main effects directly.
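Why opposite effects in different strata can vanish from a pooled comparison can be shown with a few numbers. The following Python sketch is only an arithmetic illustration, not an ANCOVA. The strata (left, right), the genotypes (AA, AB) and all values are hypothetical.

```python
from statistics import mean

# Hypothetical phenotype values per (stratum, genotype); all numbers are made up.
data = {
    ("left", "AA"): [10, 11, 12],
    ("left", "AB"): [14, 15, 16],
    ("right", "AA"): [14, 15, 16],
    ("right", "AB"): [10, 11, 12],
}


def genotype_effect(strata):
    """Mean of genotype AB minus mean of genotype AA over the given strata."""
    aa = [value for s in strata for value in data[(s, "AA")]]
    ab = [value for s in strata for value in data[(s, "AB")]]
    return mean(ab) - mean(aa)


pooled = genotype_effect(["left", "right"])  # both strata together
left = genotype_effect(["left"])             # left-handed subjects only
right = genotype_effect(["right"])           # right-handed subjects only
print(pooled, left, right)
```

In this made-up example the genotype difference is +4 in one stratum and -4 in the other, while the pooled difference is zero. A model without an interaction term would see only the pooled picture.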
Study phases. One may want to run an ANCOVA on data stratified by study phase. This
is not correct. Data from different study phases are dependent because they refer to the
same subjects. A repeated-measures ANOVA has to be used in that case.
Samples
I can follow this SOP with the example files
repCovariance_routinePreparation.xls (an Excel file to prepare the syntax)
repCovariance_routine.sps (the desired SPSS-routine)
repCovariance_dataSheet.sav (an SPSS data sheet)
repCovariance_output.xls (an Excel file containing the output)
repCovariance_output_worked_up.xls (identifying my most significant result)
I prepare the syntax for the sps-file (SPSS routine) in Excel
I open my Excel sheet “repCovariance_routinePreparation.xls”.
I enter the name of the first phenotype I want to analyze in the column "phenotypes".
I enter the name of my stratifying variable(s) in the column "stratifyer".
I insert as many lines as there are genotypes I want to analyze and copy the line above
downwards.
I copy the names of the polymorphisms from the "variable view" of SPSS to the column
named "polymorphisms".
I copy the whole block of commands for the first phenotype downwards, according to
the number of phenotypes I want to analyze.
I replace (e.g. using ctrl+h) the names of the phenotypes in the downward blocks.
I enter the directory and file name for the output file in the second-to-last cell of the
polymorphisms-column.
Ready. I save the file.
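The same expansion (one command per phenotype and polymorphism) could also be produced by a short script instead of copying lines in Excel. The following Python sketch is not part of the example files. It assumes that each analysis is a UNIANOVA command with the polymorphism as fixed factor and the stratifier as random factor. The actual command template in the Excel sheet is not shown here and may differ. The variable names (phenotype1, snp1, center) are placeholders, and the part that writes the results to an xls-file is omitted.

```python
from itertools import product


def build_syntax(phenotypes, polymorphisms, stratifier):
    """Return one main-effects UNIANOVA command per phenotype/polymorphism pair.

    The command layout is an assumption, not the template from the Excel sheet.
    """
    commands = []
    for phenotype, polymorphism in product(phenotypes, polymorphisms):
        commands.append(
            f"UNIANOVA {phenotype} BY {polymorphism} {stratifier}\n"
            f"  /RANDOM={stratifier}\n"
            f"  /METHOD=SSTYPE(3)\n"
            f"  /DESIGN={polymorphism} {stratifier}."
        )
    return "\n\n".join(commands)


# Placeholder names; in practice they come from the variable view of SPSS.
phenotypes = ["phenotype1", "phenotype2"]
polymorphisms = ["snp1", "snp2", "snp3"]
syntax = build_syntax(phenotypes, polymorphisms, "center")
print(syntax)
```

The commands come out block by block, all polymorphisms for the first phenotype, then all for the second, which mirrors the layout of the Excel sheet. The printed text could be pasted into an empty sps-file.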
I prepare the SPSS-routine and let it run
I open SPSS, then file/new/syntax. A window with an empty sps-file pops up.
I mark and copy (right-click + copy) the grey-shaded area from the above xls-file and paste
it into the sps-file (right-click + paste).
I save the sps-file, e.g. as “repCovariance_routine.sps”, in my working
directory. This keeps my work traceable.
I open the sav-file containing my data, e.g. “repCovariance_dataSheet.sav”.
I switch to my sps-file and select the pull-down menu run (“Ausführen”), then all
(“Alle”).
An output window pops up. I can follow the progress of the calculations. When all
calculations are done, the window closes.
Ready. I find my results in an xls-file with the name and in the directory I defined
above.
I work up the results
I open the xls-file, which contains my results.
I save this file under a new name, e.g. “repCovariance_output_worked_up.xls”.
I add a sorting column:
- I enter a “1” in the first cell of the first empty column to the right of the data
(usually column 7).
- I enter “=1+” in the cell below, click on the cell with the “1” and press “enter”. A “2”
appears in the cell.
- I copy (ctrl+c) this cell.
- I go to the end of the worksheet (ctrl+end).
- I mark all cells up to the cell displaying the “2” (ctrl+shift+up-arrow) and press enter.
- I mark and copy the column with the numbers I produced: I right-click the header of
the column, usually a “7” or a “G”, and select copy.
- I paste the numbers as values (instead of formulas): I right-click the header of the
column again and select paste special (“Inhalte einfügen”). In the pop-up window I
double-click “values” (“Werte”).
I unmerge merged cells:
- I mark the entire worksheet (e.g. I click on the cell in the very upper left corner
between the column headers and the row headers).
- I right-click on the marked area and select “format cells” (“Zellen formatieren”). A
window pops up.
- I select the second tab from the left, “alignment” (“Ausrichtung”).
- I click the box to the left of “merge cells” (“Zellen verbinden”) twice, so that it is
neither shaded nor ticked but empty.
- I press “OK”.
I identify my most significant association
I sort by p-value:
- I select all (e.g. ctrl+a twice)
- I select the pull-down menu “data”, then “sort”. A window pops up.
- I select to sort by the column that contains the p-values. This is usually the
second-to-last column, column “6” or “F”.
- I press “OK”.
- I find the most significant association usually just below or near the end of the rows
with the constant terms.
- I mark the line of the most significant association yellow: I right-click the row header
and select “format cells”. I select the second-to-last tab, “fill” (“Ausfüllen”). I
select a standard color, usually yellow, and press “OK”.
I rearrange my sheet:
- I select all (e.g. ctrl+a twice)
- I select the pull-down menu “data”, then “sort”. A window pops up.
- I select to sort by the sorting column. This is usually the last column, column “7” or
“G”.
- I press “OK”.
I find my most significant association:
- I press ctrl+f. A window pops up.
- I select “options >>”
- I select the “format…” button to the right of the first line.
- I select the tab “fill” (“Ausfüllen”).
- I select the standard color with which I marked the line, usually yellow.
- I press “OK”.
- I select “find next” (“Weitersuchen”).
Ready. I should now have been taken to the context of my most significant association.
I save the file (ctrl+s).
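The logic behind the work-up (number the rows, sort by p-value, mark the best association, restore the original order, look at the neighbouring rows) can be sketched in a few lines of Python. This is only an illustration of the idea, not a replacement for the Excel steps. The rows, effect names and p-values below are made up, and the sketch assumes that rows for constant terms and for the stratifier (here called center) are to be skipped when looking for the most significant association.

```python
# Hypothetical result rows; effect names and p-values are made up.
rows = [
    {"effect": "constant term", "p": 0.0001},
    {"effect": "snp1", "p": 0.41},
    {"effect": "center", "p": 0.03},
    {"effect": "constant term", "p": 0.0002},
    {"effect": "snp2", "p": 0.002},
    {"effect": "center", "p": 0.028},
]

# Sorting column: remember the original position of every row (1, 2, 3, ...).
for number, row in enumerate(rows, start=1):
    row["order"] = number

# Sort by p-value, smallest first.
by_p = sorted(rows, key=lambda row: row["p"])

# The first row that is neither a constant term nor the stratifier is the
# most significant association; mark it (the "yellow" line).
skip = {"constant term", "center"}
best = next(row for row in by_p if row["effect"] not in skip)
best["marked"] = True

# Restore the original order with the sorting column.
restored = sorted(by_p, key=lambda row: row["order"])

# Show the marked row together with its neighbours (its context).
position = restored.index(best)
context = restored[max(0, position - 1):position + 2]
for row in context:
    print(row)
```

The sorting column is what makes the round trip possible: without it, the original block structure of the output could not be recovered after sorting by p-value.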