2008-2010 Community Survey Syntax Instructions
The following instructions provide guidance for running the SPSS syntax file to analyze your 2008-2010 Community
Survey data. The instructions will walk you through each step of the syntax file, show you examples of the resulting
output, and describe how to use the output to complete the Community Survey reporting template.
Note that the red headers below refer to the section headers in the syntax file.
1. Frequencies:
The frequencies provide an opportunity to look at the distribution of the variables in the dataset and to become more
familiar with the data. You should review the frequencies (I would not recommend printing them all out) and look for
any odd entries or anything that seems incorrect or unusual about how respondents have answered questions. For
example, are there a lot of missing values for a variable that you would expect to be answered? Variables used in the
analyses are either original or recoded. Importantly, if an original variable has a corresponding recoded or
reverse-coded variable in the dataset, always use the recoded or reverse-coded variable in the analysis, because these
variables already account for skip patterns, reverse codings, inconsistencies, dichotomizations, etc.
You will notice that the syntax directly preceding the frequencies splits the file based on year (2008, 2009, 2010) and the
grouping variable (target site, SPF-SIG sites, comparison sites). This allows the output to be displayed separately for
each year and group, thus allowing for comparisons of frequencies across these categories.
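As a rough illustration of what splitting the file does to the frequency output, the Python sketch below counts responses separately for each year and group. It is not the SPSS syntax file; the records and the function name are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical records: (year, group, response to one survey item).
records = [
    (2008, "target", "Yes"),
    (2008, "target", "No"),
    (2008, "target", "Yes"),
    (2008, "SPF-SIG", "No"),
    (2009, "target", "Yes"),
]


def frequencies_by_split(rows):
    """Count responses separately for each (year, group) combination."""
    tables = defaultdict(Counter)
    for year, group, response in rows:
        tables[(year, group)][response] += 1
    return tables


tables = frequencies_by_split(records)
for (year, group), counts in sorted(tables.items()):
    # One small frequency table per year and group, like the split output.
    print(year, group, dict(counts))
```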
When reviewing the output, you will need to take note of the headers to determine to which year and group the output
pertains. For example, you’ll see in Figure 1 below that the output is for the target site (i.e., your site) in 2008.
Figure 1. Frequencies by year and by group – year 2008 target site.
[Figure callouts: 2008; target site (same as your site)]
Scroll down past the 2008 target site data and you will find the output for the SPF-SIG sites in 2008. See Figure 2 below.
You can find the remaining year and site data in a similar manner.
Figure 2. Frequencies by year and by group – year 2008 SPF-SIG sites.
[Figure callouts: 2008; SPF-SIG sites (not including your site)]
2. Crosstabs
The crosstabs summarize the three years of data on each analytic variable for the three groups (i.e., target site,
SPF-SIG sites, and comparison sites). The syntax generates one table for each variable, cross-tabulated by group and
year. This will help you see how responses have changed over the 3 years the survey was conducted. If you are
interested in user-defined missing values such as 888 or 999, use the syntax in Section A to obtain the count and
percentage of those values (see Figure 3). If you are not, use the syntax in Section B to exclude them from the
calculations (see Figure 4). Keep in mind that if your data in year 1 were not representative, you may find that your
year 1 estimate is quite different from those for years 2 and 3.
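The difference between the two options can be sketched in Python as follows. This is only an illustration of the arithmetic, not the SPSS syntax: the response values and the function name are hypothetical, and only the codes 888 and 999 come from the text above.

```python
from collections import Counter

MISSING_CODES = {888, 999}  # user-defined missing codes named in the text

# Hypothetical responses to one item for a single group, keyed by year.
responses = {
    2008: [1, 1, 0, 999, 0],
    2009: [1, 0, 0, 0, 888, 1],
}


def column_percentages(values, keep_missing):
    """Return {code: percent} for one year, with or without missing codes."""
    if not keep_missing:
        values = [v for v in values if v not in MISSING_CODES]
    counts = Counter(values)
    total = sum(counts.values())
    return {code: round(100 * n / total, 1) for code, n in sorted(counts.items())}


# Section A style: missing codes are counted and get their own percentage.
section_a = {yr: column_percentages(v, keep_missing=True) for yr, v in responses.items()}
# Section B style: missing codes are dropped before percentages are computed.
section_b = {yr: column_percentages(v, keep_missing=False) for yr, v in responses.items()}
```

Note that dropping the missing codes changes the denominator, so the percentages for the substantive responses differ between the two versions.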
Figure 3. User-defined missing values in the crosstab tables.
[Figure callouts: 3 years; site (scroll down to see comparison sites); user-defined missing value]
Figure 4. No user-defined missing values in the crosstab tables.
[Figure callout: no user-defined missing value in the table]
3. Graphs – part 1
The first set of graphs compares the percentages for each variable by year. As you can see in the syntax that directly
precedes the graphs syntax, the file is now split on group. The output, therefore, will first be displayed for the target
group, then SPF-SIG group, and comparison group. Noting the headers that appear directly above each graph will tell
you to which group the output pertains. Figure 5 displays the graph for Question #1 for the target group. You’ll notice
that the group is noted directly above the graph, and the text of the question is directly below it.
Figure 5. Graphs comparing the 3-year data within each site.
[Figure callouts: target site; 2008 vs. 2009 vs. 2010; text of Q1]
4. Graphs – part 2
The second set of graphs compares each variable across the three groups, first for 2008 then for 2009 then for 2010.
Figure 6 below displays the graph for Q1 in 2008.
Figure 6. Graphs comparing the 3-site data within each year.
[Figure callouts: 2008; target vs. SPF-SIG vs. comparison; text of Q1]
5. Demographic Characteristics in Table 1 in the REQUIRED reporting templates
This syntax will provide the demographic characteristics to complete Table 1 in the REQUIRED reporting templates. The
file is now split on year and group, so the 2008 data for the three groups will come first, followed by the 2009 data then
the 2010 data for the three groups.
As you will see in the output, the descriptive statistics for respondents’ age are presented first. Figure 7 below shows
where the relevant statistics (mean, standard deviation, and minimum and maximum) are located in the output. Figure
8 shows where the percentages for gender, race/ethnicity, and language are located. When presenting the percentages,
be sure to cite the “valid percent” from the output, as these percentages do not include those who skipped the question
or provided a response of “Don’t Know” or “Not Applicable.”
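For readers who want to check the Table 1 figures by hand, the following Python sketch shows one way the age statistics and a valid percent could be computed. The data and function names are hypothetical, and missing responses are represented here as None, which is an assumption about the data layout rather than something taken from the syntax file.

```python
import statistics

# Hypothetical ages; None marks a respondent who skipped the question.
ages = [19, 24, 31, 45, None, 52]
# Hypothetical gender responses; None marks a skipped or "Don't Know" answer.
gender = ["Female", "Male", "Female", None, "Female"]


def describe(values):
    """n, mean, sample SD, minimum, and maximum of the non-missing values."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "mean": statistics.mean(valid),
        "sd": statistics.stdev(valid),  # sample SD (n - 1 denominator)
        "min": min(valid),
        "max": max(valid),
    }


def valid_percent(values, category):
    """Percent of non-missing responses that fall in `category`."""
    valid = [v for v in values if v is not None]
    return 100 * valid.count(category) / len(valid)


age_stats = describe(ages)
female_valid_pct = valid_percent(gender, "Female")
```

In this made-up example, 3 of the 5 respondents answered "Female", but the valid percent is computed over the 4 who answered at all.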
Figure 7. Age in Table 1 in the REQUIRED reporting templates.
[Figure callouts: n’s (Table 1); mean, SD, min, max (Table 1)]
Figure 8. Gender, ethnicity and language in Table 1 in the REQUIRED reporting templates.
[Figure callouts: valid percentages (Table 1); n’s (Table 1)]
6. Scale Score Creation
This syntax creates the five scale scores representing the intervening variables. You will notice that the syntax uses
recoded variables, and that the scores are created by taking the mean score of the constituent variables. Once created,
these variables will appear at the end of your dataset. After running this syntax, please check your data file to make
sure the new measures are there and are coded correctly.
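A minimal Python sketch of a mean-based scale score is shown below, which may help when checking a few cases by hand. The item names are hypothetical, and averaging only the non-missing items is an assumption made for this sketch; the syntax file may handle missing items differently, so check it before relying on this rule.

```python
def scale_score(item_values):
    """Mean of the non-missing item values, or None if all are missing.

    Assumption for this sketch: missing items (None) are skipped.
    """
    valid = [v for v in item_values if v is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical respondents, each with three recoded items.
respondents = [
    {"item1_r": 3, "item2_r": 4, "item3_r": 2},
    {"item1_r": 1, "item2_r": None, "item3_r": 2},
]
items = ["item1_r", "item2_r", "item3_r"]

for person in respondents:
    # The new variable is added alongside the existing ones.
    person["scale"] = scale_score([person[name] for name in items])
```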
7. Intervening Variables
This syntax contains two parts. Section A generates output for completing Table 1 in the OPTIONAL reporting templates.
Section B generates output for completing Table 2 in the REQUIRED reporting templates.
Section A: performs one-way ANOVAs to determine if the three groups (target, SPF-SIG, and comparison) differ within
each year with respect to the scale scores created in the previous step. The resulting output can be used to complete
Table 1 in the OPTIONAL reporting templates. Again, the file is split on year so that the 2008 results come first, followed
by the 2009 then the 2010 results.
The F-test and its associated p-value test whether the mean scale scores of the three groups differ significantly from
each other. If they do, the post-hoc Tukey test can be used to assess which means differ, by examining the p-values in
the output produced by the Tukey test. Results of the Tukey test are used to complete the Pairwise Comparisons cell in
Table 1 in the OPTIONAL reporting templates.
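To make the F-test less of a black box, here is a short Python sketch of how the F statistic in a one-way ANOVA is formed from between-group and within-group variation. The scores and names are hypothetical. The sketch stops at the F statistic and its degrees of freedom; the p-value and the Tukey comparisons are left to the SPSS output.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of {group: [scores]}."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        # Variation of the group mean around the grand mean.
        ss_between += len(scores) * (group_mean - grand_mean) ** 2
        # Variation of individual scores around their own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in scores)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scale scores for one year, by group.
scores_2008 = {
    "target": [2.0, 3.0, 4.0],
    "SPF-SIG": [3.0, 4.0, 5.0],
    "comparison": [4.0, 5.0, 6.0],
}
f_stat, df_b, df_w = one_way_anova(scores_2008)
```

A larger F means the group means are spread further apart relative to the spread of scores within each group.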
Figure 9 indicates where in the output the scale means, standard deviations, the F test results and associated p-values
can be found to complete Table 1 in the OPTIONAL reporting templates. Figure 10 indicates where to find the results of
the Tukey test to determine which means are different.
Figure 9. Means, standard deviations, the F test results and associated p-values to complete Table 1 in the OPTIONAL
reporting templates.
[Figure callouts: n’s, mean & SD (Table 1); F-test & p-value (Table 1)]
Figure 10. Tukey test to determine which means are different.
[Figure callout: p-value for Tukey test (Table 1 pairwise comparison). The significant value here indicates that the
target and SPF-SIG sites are different regarding their mean scores on the perceived risk scale (p = .001) in 2008.]
Section B: performs one-way ANOVAs to determine if the three years (2008, 2009, and 2010) differ within each group
(target, SPF-SIG, and comparison) with respect to the scale scores created in the previous step. The resulting output can
be used to complete Table 2 in the REQUIRED reporting templates. The file is split on group so that the target group
results come first, followed by the SPF-SIG then the comparison results.
Find the scale means, standard deviations, the F-test and associated p-values in the same way as illustrated in
Section A. If the F-test is significant, check the Tukey test in the Multiple Comparisons table in the output to
determine which year’s mean is different. Results of the Tukey test are used to complete the Pairwise Comparisons cell
in Table 2 in the REQUIRED reporting templates.
Figure 11 indicates where in the output the scale means, standard deviations, the F test results and associated p-values
can be found to complete Table 2 in the REQUIRED reporting templates. Figure 12 indicates where to find the results of
the Tukey test to determine which means are different.
To complete the cell ‘Trend observed from 2008 to 2010’ in Table 2 in the REQUIRED reporting templates, first
determine whether there is a difference across the three years in the data, which is what the F-test assesses. If the
F-test is not significant, the intervening variable outcomes do not differ from 2008 to 2010, and the Trend cell can
be completed by saying there is no significant difference (or change) from 2008 to 2010. If the F-test is significant,
use the pairwise comparisons (from the Tukey test) to describe how the means of the intervening variable outcomes
change from year to year. For example, if the mean of UDOC (likelihood of police involvement in patrolling
underage drinking) is 2.74 (year 2008), 2.73 (year 2009), and 2.89 (year 2010), and the Tukey test indicates that year
2010 is different from year 2008 and year 2009, but year 2008 and year 2009 are not different, then you can say UDOC
has increased significantly from 2009 to 2010.
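The decision steps for the Trend cell can be sketched in Python as follows. This is only an illustration of the reasoning, not part of the syntax file: the function name, the wording it returns, and the p-values are hypothetical, while the UDOC means are the ones from the example. The sketch treats a p-value of .05 or smaller as significant.

```python
def describe_trend(f_p_value, means, tukey_p_values, alpha=0.05):
    """Draft wording for the Trend cell from ANOVA and Tukey results.

    means: {year: mean}; tukey_p_values: {(earlier_year, later_year): p}.
    """
    if f_p_value > alpha:
        # Non-significant F-test: the years do not differ.
        return "No significant change from 2008 to 2010"
    notes = []
    for (earlier, later), p in sorted(tukey_p_values.items()):
        if p <= alpha:
            direction = "higher" if means[later] > means[earlier] else "lower"
            notes.append(f"{later} significantly {direction} than {earlier}")
    if not notes:
        return "F-test significant, but no pairwise difference"
    return "; ".join(notes)


udoc_means = {2008: 2.74, 2009: 2.73, 2010: 2.89}  # means from the example
# Hypothetical p-values, chosen to match the pattern described in the example.
udoc_tukey = {(2008, 2009): 0.98, (2008, 2010): 0.01, (2009, 2010): 0.01}

trend = describe_trend(0.004, udoc_means, udoc_tukey)
no_trend = describe_trend(0.40, udoc_means, udoc_tukey)
```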
Figure 11. Means, standard deviations, the F test results and associated p-values to complete Table 2 in the REQUIRED
reporting templates.
[Figure callouts: n’s, mean & SD (Table 2); 3 years; F-test & p-value (Table 2)]
Figure 12. Tukey test to determine which means are different.
[Figure callout: p-value for Tukey test (Table 2 pairwise comparison). The non-significant values here indicate that
the 3 years are not different from each other regarding their mean scores on the underage drinking scale.]
8. Behavioral Outcomes for completing Table 2 in the OPTIONAL reporting templates
This syntax provides chi-square tests of whether the three groups are different with respect to each behavioral
outcome. These results can be used to complete Table 2 in the OPTIONAL reporting templates (see Figure 13). The
variables used here are dichotomized (yes/no) versions of the outcome variables. The file is split by year so that the
results for 2008 come first, followed by those for 2009 and 2010.
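As an illustration of what the chi-square test compares, the Python sketch below computes the Pearson chi-square for a table of three groups by a yes/no outcome. The counts and names are hypothetical. With three groups and two outcome categories there are 2 degrees of freedom, a case where the p-value has a simple closed form; the sketch relies on that and does not apply to other table sizes.

```python
import math


def chi_square_3x2(table):
    """Pearson chi-square for a {group: (n_no, n_yes)} table with 3 groups.

    Returns (chi_square, p_value). For 2 degrees of freedom the upper-tail
    probability of the chi-square distribution is exp(-chi_square / 2).
    """
    if len(table) != 3:
        raise ValueError("the closed-form p-value assumes exactly 3 groups")
    col_totals = [sum(row[j] for row in table.values()) for j in range(2)]
    n = sum(col_totals)
    chi_square = 0.0
    for row in table.values():
        row_total = sum(row)
        for j in range(2):
            # Count expected if the outcome were unrelated to group.
            expected = row_total * col_totals[j] / n
            chi_square += (row[j] - expected) ** 2 / expected
    return chi_square, math.exp(-chi_square / 2)


# Hypothetical (no, yes) counts for one year.
counts_2008 = {"target": (60, 40), "SPF-SIG": (50, 50), "comparison": (40, 60)}
chi_square, p_value = chi_square_3x2(counts_2008)
```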
Figure 13. Relevant statistics in the output for Table 2 in the OPTIONAL reporting templates.
[Figure callouts: percentages (Table 2); chi-square value (Table 2); n’s (Table 2); p-value for chi-square (Table 2)]
9. Behavioral Outcomes and Trend analysis for completing Table 3 in the REQUIRED reporting templates
The last set of syntax provides significance testing on the association between year and behavioral outcomes. The goal
of this analysis is to examine whether or not there is a linear trend in behavioral outcomes from 2008 to 2010 (i.e.,
increase or decrease over time). These results can be used to complete Table 3 in the REQUIRED reporting templates.
The χ2 you need to use is labeled “linear by linear association”. An associated p-value of .05 or smaller indicates a
significant association between year and the behavioral outcome. If the p-value is significant, compare the percentages
in the column “More than one day” (see Figure 14 and footnote 1) to determine whether the trend is increasing or
decreasing.
The file is split by group such that the results of the target site come first, followed by SPF SIG sites then comparison
sites.
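For those curious about what the linear-by-linear statistic measures, the Python sketch below uses the standard formula M² = (N − 1)r², where r is the Pearson correlation between year and the 0/1 outcome across all N respondents, with 1 degree of freedom. The counts and names are hypothetical, and the sketch is meant only to show the idea; use the values from the SPSS output when completing Table 3.

```python
import math


def linear_by_linear(counts):
    """Linear-by-linear association test for {year: (n_zero, n_one)} counts.

    Returns (m2, p_value), where m2 = (N - 1) * r^2 and the p-value is the
    upper tail of a chi-square distribution with 1 degree of freedom.
    """
    first_year = min(counts)
    # Score the years 0, 1, 2, ... to keep the arithmetic small.
    scored = [(year - first_year, a, b) for year, (a, b) in counts.items()]
    n = sum(a + b for _, a, b in scored)
    sum_x = sum(x * (a + b) for x, a, b in scored)
    sum_y = sum(b for _, _, b in scored)
    sum_xx = sum(x * x * (a + b) for x, a, b in scored)
    sum_xy = sum(x * b for x, _, b in scored)
    cov = sum_xy - sum_x * sum_y / n
    var_x = sum_xx - sum_x ** 2 / n
    var_y = sum_y - sum_y ** 2 / n  # outcome is 0/1, so sum(y^2) == sum(y)
    m2 = (n - 1) * cov ** 2 / (var_x * var_y)
    p_value = math.erfc(math.sqrt(m2 / 2))  # chi-square upper tail, df = 1
    return m2, p_value


# Hypothetical (0, 1) counts per year for one group.
counts_by_year = {2008: (70, 30), 2009: (60, 40), 2010: (50, 50)}
m2, p_value = linear_by_linear(counts_by_year)
# Percent in the "1" column per year, used to read the direction of the trend.
pct_one = {yr: 100 * b / (a + b) for yr, (a, b) in counts_by_year.items()}
```

In this made-up example the percentage in the "1" column rises each year, so a significant result would be read as an increasing trend.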
Figure 14. Relevant statistics in the output for Table 3 in the REQUIRED reporting templates.
[Figure callouts: percentages (Table 3); n’s (Table 3); p-value for chi-square (Table 3); chi-square value (Table 3).
Its non-significant p-value indicates that these three years of data are not different from each other and that there
is no linear trend in the data.]
Footnote 1: Given that all behavioral outcomes are dichotomized, the two columns in the output table will be 0 (None or
No) and 1 (the column title may differ across outcomes). Be careful not to use the percentages in the column titled
“None or No” to determine the trend.