Chapter 1 Fundamental concepts SPSS

SPSS - Descriptive statistics
Before starting any advanced analysis, it is a good habit to begin with some descriptive statistics and
simple graphics, to see what is going on in your data!
Datafile used: gss.sav
How to get there: Analyze > Descriptive Statistics > Frequencies…
This menu selection opens the following Frequencies dialog box:
As you can see, the variables are difficult to read. To make them easier to read, we’ll use variable names
instead of labels in dialog boxes. Do this by choosing Edit > Options. Then, in the Options dialog box,
click the General tab. In the Variable Lists group box (top right), select ‘Display names’ and click
OK.
This change does not take effect until the next time you open a data file!
So close the data file and reopen it. Then return to the Frequencies dialog box.
Now you’ll see the following Frequencies dialog box:
Choose the variable(s) for which you need descriptive statistics by selecting them and clicking on the
arrow. They appear in the ‘Variable(s):’ box.
‘Display frequency tables’ is selected by default. A frequency table shows the absolute and the relative
frequencies, as well as the percentage and cumulative percentage of valid cases (that is, excluding missing
values). The cumulative percentage is the share of valid cases with a value less than or equal to the value in question.
Button Statistics…
One can select many descriptive statistics. Most importantly, these are the Mean, Median, and Mode, and
Std. deviation, Range, Minimum and Maximum. See following figure.
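The statistics listed above can also be reproduced outside SPSS. The following is a minimal sketch in Python using only the standard library. The list ages is a small made-up sample for illustration and is not taken from gss.sav, and the helper name describe is hypothetical.

```python
import statistics

# Hypothetical sample of ages (illustration only, not taken from gss.sav)
ages = [23, 28, 28, 35, 41, 43, 52, 60, 67]

def describe(values):
    """Return the descriptive statistics named in the Statistics dialog as a dict."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (divides by n - 1)
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
    }

summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The sketch uses the sample standard deviation (n - 1 in the denominator), which is assumed here to correspond to the Std. deviation option.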
Button Charts…
Some simple charts can be obtained, such as bar charts, pie charts and histograms. A histogram is a
graphical display of counts for ranges of data values. In histograms, one can choose to indicate the normal
curve as well. See following figure.
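To make the idea of "counts for ranges of data values" concrete, here is a minimal sketch in Python that groups values into equal-width bins and counts them. The ages, the bin width of 10 and the function name histogram_counts are assumptions for illustration. SPSS chooses its own bin boundaries, so the bins in a real SPSS histogram may differ.

```python
from collections import Counter

# Hypothetical ages (illustration only, not taken from gss.sav)
ages = [19, 22, 24, 27, 28, 31, 33, 38, 42, 44, 47, 55, 61, 68, 73]

def histogram_counts(values, bin_width):
    """Count how many integer values fall in each bin [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

bins = histogram_counts(ages, bin_width=10)
for start, count in bins.items():
    # One text bar per bin: the bar length equals the count in that range
    print(f"{start:>3}-{start + 9:<3} {'#' * count} ({count})")
```

Each bar stands for a range of values rather than a single value, which is what distinguishes a histogram from a bar chart of categories.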
A chart in the output can be modified from the SPSS Viewer. Double-clicking the chart opens a new window,
the SPSS Chart Editor, in which changes can be made by clicking on a part of the chart (e.g. axis,
legend, title). In the following figure, the ‘Category Axis’ window appears after clicking on the x-axis title
Respondent’s Sex.
Output of running frequencies
Output 1
When you perform an analysis using ‘Frequencies’ on the variable degree, without indicating any
options, the results are the following:
Frequencies

Statistics
RS Highest Degree
N   Valid     1496
    Missing      4

RS Highest Degree
                            Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Less than HS            279      18,6            18,6                 18,6
          High school             780      52,0            52,1                 70,8
          Junior college           90       6,0             6,0                 76,8
          Bachelor                234      15,6            15,6                 92,4
          Graduate                113       7,5             7,6                100,0
          Total                  1496      99,7           100,0
Missing   Don't know                2        ,1
          No answer                 2        ,1
          Total                     4        ,3
Total                            1500     100,0
In the table ‘Statistics’, the number of cases (N) is split into Valid and Missing cases.
In the frequency table ‘RS Highest Degree’, the variable degree is split into the possible answers (Less
than HS, High school, etc.), and their absolute (Frequency) and relative (Percent) frequencies are
shown, as well as the percentage and cumulative percentage of valid cases (Valid Percent and Cumulative
Percent). Percent gives the relative frequencies including the missing cases, whereas Valid Percent
gives the relative frequencies excluding the missing cases, so that the relative frequencies of the valid
cases add up to 100 %.
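The difference between Percent, Valid Percent and Cumulative Percent can be checked by hand. The sketch below, in Python, recomputes the three columns from the frequencies reported in Output 1. The variable names are illustrative, and rounding to one decimal is assumed to match the displayed output. Python prints decimal points where the SPSS output shows decimal commas.

```python
# Frequencies copied from the 'RS Highest Degree' table in Output 1
valid_counts = {
    "Less than HS": 279,
    "High school": 780,
    "Junior college": 90,
    "Bachelor": 234,
    "Graduate": 113,
}
missing_counts = {"Don't know": 2, "No answer": 2}

n_valid = sum(valid_counts.values())                # valid cases only
n_total = n_valid + sum(missing_counts.values())    # valid + missing cases

rows = []
running = 0  # running count of valid cases, for the cumulative column
for label, count in valid_counts.items():
    running += count
    rows.append({
        "label": label,
        "percent": round(100 * count / n_total, 1),         # base: all cases
        "valid_percent": round(100 * count / n_valid, 1),   # base: valid cases
        "cumulative_percent": round(100 * running / n_valid, 1),
    })

for row in rows:
    print(row)
```

Only the denominator changes between Percent (all 1500 cases) and Valid Percent (the 1496 valid cases), which is why the two columns differ slightly for some categories.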
Output 2
When you perform an analysis using ‘Frequencies’ on the variable age, indicating the options mean,
median and mode (button Statistics…) and histogram with normal curve (button Charts…), some of the
results are the following (we left the table ‘Age of Respondent’ out because it is very large):
Frequencies

Statistics
Age of Respondent
N        Valid      1495
         Missing       5
Mean               46,23
Median             43,00
Mode                 28a
a. Multiple modes exist. The smallest value is shown
[Figure: histogram of Age of Respondent with normal curve. X-axis: Age of Respondent (20,0 to 90,0 in steps of 5,0); y-axis: Frequency (0 to 200). Std. Dev = 17,42; Mean = 46,2; N = 1495,00]
As usual, the numbers of valid and missing cases are shown in the ‘Statistics’ table. The other descriptive
statistics (Mean, Median and Mode) are given in the same table.
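Footnote a in the ‘Statistics’ table says that several modes exist and that the smallest is shown. A minimal sketch of that rule in Python follows. The ages are made up so that two values tie for the highest frequency; they are not taken from gss.sav.

```python
from statistics import multimode

# Hypothetical ages in which 28 and 35 both occur most often (illustration only)
ages = [28, 35, 28, 41, 35, 52, 60]

modes = multimode(ages)      # every value tied for the highest frequency
reported_mode = min(modes)   # the smallest of them, as described in footnote a

print(modes)
print(reported_mode)
```

When a mode carries such a footnote, it is worth inspecting the full frequency table, since the single reported value hides the other equally frequent values.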
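The histogram of the variable age shows its distribution, with Age of Respondent on the x-axis and
Frequency on the y-axis. The distribution is roughly bell-shaped but not symmetric: the mean (46,23) lies above the median (43,00), which indicates a skew to the right, that is, a longer tail toward the higher ages.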