Welcome to SPSS 101 – Data Entry: Getting Started

Creating a Data File in SPSS
Welcome to a tutorial on Getting Started in SPSS - Creating a Data File.
Some terms you should know before you begin:
Variable name: in SPSS this is a name of 8 characters or fewer that you assign to a variable,
although later versions allow more than 8 characters. The SPSS program uses the variable
name to identify the variable. You should try to be as descriptive as possible in naming
your variables, e.g., gender could be called gender (6 characters), marital status could be
called marstat (7 characters), etc. Remember, each variable name must be unique; you
cannot give two variables the same name.
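If you plan your variable names on paper first, the two rules above (length and uniqueness) can be checked before you open SPSS. The following is only a minimal Python sketch, not part of SPSS: the function name check_variable_names is made up for illustration, the 8 character limit applies only to older versions, and comparing names without regard to case is an assumption.

```python
def check_variable_names(names, max_length=8):
    """Return a list of problems found in a list of proposed variable names."""
    problems = []
    seen = set()
    for name in names:
        key = name.lower()  # assumption: names are compared without regard to case
        if len(name) > max_length:
            problems.append(f"{name}: longer than {max_length} characters")
        if key in seen:
            problems.append(f"{name}: duplicate name")
        seen.add(key)
    return problems


# Hypothetical list of planned names; the last two break the rules.
print(check_variable_names(["id", "gender", "marstat", "maritalstatus", "gender"]))
```

An empty list means every name is short enough and no name is used twice.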
Variable label: this is a longer description of the variable name that appears on your
output when you do your data analysis. For the variable marstat, you would type marital
status for the variable label, for gender you would probably type “gender” or “gender of
participant.” Variable labels are important because they carry description that does not fit
in the variable name.
Variable values: these are the range of possible responses for the variable. If the variable
is interval or ratio, you do not have to assign variable values because the numbers have a
true quantitative meaning, e.g., gross annual income is gross annual income. If the
variable is nominal or ordinal, you need to define the values you will use for data entry.
For example, gender could be female or male. You would need to assign each possible
value a number, e.g., female = 0, and male = 1. Although the numbers are arbitrary and
have no quantitative meaning, it is best to start with zero or one and work sequentially (0,
1, 2, 3, etc.) through all possibilities.
Where there is the possibility of missing data, you need to assign missing values a
number. Usually the number 9 is used for missing data, or 99 if 9 is a real value, or
999, etc. You also need missing values specified for interval or ratio level variables, and
again it is convention to use a series of 9s. You can leave the cell blank for missing
data, but if you do it is sometimes difficult to know whether the blank was intentional.
Also, if you have to add the variable to another variable, SPSS will not compute a total
for any case with a blank cell. For instance, in one of my files the variables days in
placement, days home, and days runaway were added to check that they equaled the 365 days
in a year. If they do not equal 365, then I know I have an error and can easily find it.
Because SPSS will not add the values for a case when missing values are left blank, I use
zeros for 0 days and a defined value like 999 for any missing data.
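The 365 day check described above can be pictured outside SPSS as well. This is a rough Python sketch of the logic only, not of how SPSS does it: the field names days_placement, days_home, and days_runaway and the three example cases are invented for illustration, and 999 is the missing-value code from the text.

```python
MISSING = 999  # defined missing-value code, as suggested in the text


def check_days(record, expected_total=365):
    """Classify one case as 'ok', 'error', or 'missing'."""
    fields = ("days_placement", "days_home", "days_runaway")  # hypothetical names
    values = [record[field] for field in fields]
    if MISSING in values:
        # A missing code is not a real count, so the total cannot be checked.
        return "missing"
    return "ok" if sum(values) == expected_total else "error"


# Fictional cases: one that adds up, one that does not, one with missing data.
cases = [
    {"id": 1, "days_placement": 300, "days_home": 65, "days_runaway": 0},
    {"id": 2, "days_placement": 200, "days_home": 100, "days_runaway": 60},
    {"id": 3, "days_placement": 999, "days_home": 120, "days_runaway": 0},
]
for case in cases:
    print(case["id"], check_days(case))
```

Keeping 0 (a real count) separate from 999 (no information) is what lets the check tell an entry error apart from a case that simply cannot be verified.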
Research for Effective Social Work Practice by Judy L. Krysik and Jerry Finn
© 2010 Routledge / Taylor & Francis
Let's get started using SPSS to create a new data file. Double click on the SPSS icon.
Depending on the version of SPSS you are using, you may see a screen that allows you to
run an SPSS tutorial, open an existing file, or create a new file. For now, select create a
new file and click on OK; a blank screen with a grid on it should appear. In newer
versions of SPSS you will arrive directly at an empty grid ready to create a new data file. It is
advisable to return to the SPSS tutorial at some point. You can access the tutorial from
the Help button on the main toolbar at the top of the screen.
If you look in the bottom left of the screen you should see two tabs, “data view” and
“variable view.” The tab that is highlighted will be “data view.” The variable view screen is
where you will define the variables or make changes to how existing variables are defined.
The data view screen is where you will go to enter the data. You can click back and forth between these
two tabs. The SPSS data file is structured with the variables in the columns and cases in
the rows. All of the data for a single case (organization, individual, family, couple, etc.) is
entered in one row and is referred to as a “record.”
The first thing you want to do in setting up an SPSS data file is to create a variable you
will use to identify the case. Let’s begin by creating a variable for case identification.
There are two ways to define variables. From the variable view screen you can click on
the cell where the first variable name will appear; this is at the top of the first column, in
the first row. From data view you can click on “var” at the top of the column where you
want the variable to be placed; this action will take you to the variable view screen to
define the variable. You can begin by typing in your first variable name of eight characters
or fewer, which in this example is ID. Type in ID for the first variable name.
You have different options to help define the variable. The first is “type.” Most of the
data we will be concerned with will be numeric, for example, our first variable ID is
numeric. Data that consists of words are called “string” variables. Some variables may be
in dollars and you could select “dollar” for the type. If the variable is a date, as in date of
birth, it is important to specify the variable type as a date. If not, you will not be able
to use SPSS to calculate the time between two periods or age at a certain point such as
baseline or entry to the program. If you select date as the type, then you must next select
the format for how you will enter the date. It is important to be consistent throughout
(e.g., mm/dd/yy, or mm/dd/yyyy).
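To see why a real date type and one consistent format matter, here is a small Python sketch of the kind of calculation a date variable makes possible. It is an illustration under assumptions, not SPSS itself: the mm/dd/yyyy format is one of the two formats mentioned above, the helper names parse_date and age_in_years are made up, and both dates are fictional.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # mm/dd/yyyy, used for every date in the file


def parse_date(text):
    """Read a date typed as mm/dd/yyyy; any other layout raises ValueError."""
    return datetime.strptime(text, DATE_FORMAT).date()


def age_in_years(birth, on):
    """Completed years between a date of birth and a later date."""
    years = on.year - birth.year
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1  # birthday has not yet occurred in that year
    return years


date_of_birth = parse_date("03/15/1998")  # fictional date of birth
program_entry = parse_date("09/01/2009")  # fictional entry to the program
print(age_in_years(date_of_birth, program_entry))  # 11
print((program_entry - date_of_birth).days)  # time between the two dates, in days
```

A date typed in some other layout, or stored as plain text, cannot be subtracted this way, which is the same problem you run into in SPSS when the variable type is not set to date.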
When you design a database you want to consider speed, accuracy, and ease of data
entry. The order of the variables you define should match with the order of your data
gathering device (questionnaire, interview schedule, etc.). It will be very difficult to enter
the data if it requires flipping back and forth between pages or moving the cursor up and
down to different variables. You want to aim for a smooth process.
The next option is “Width.” The width you will select depends on the number of
characters in the range of possible responses to that variable. For instance, if my ID data
are all 3 digit numbers, I may want to change width from the default of 8 to 3. The width
is set to 8 by default; this simply means that 8 characters are allowed for that
variable unless you specify differently. If you changed width to two, the space for that
variable would become very narrow. Go ahead and try it out. I usually leave width at the
default of 8 so that I can read the 8 character variable names. If you need more spaces, for
example, if you are entering a 9 digit ID number, you would need to change the width to
9, and so on. To change these values, simply click on the cell and then use the arrows
or gray box, whichever appears.
The next field is “Decimals.” Unless I want decimal places, I like to change the decimals
from 2 to 0; otherwise I will have a decimal point and two zeros behind every value in the
database.
Once you are finished changing type, width, and decimals you will see the option for
values. In older versions of SPSS you need to click on continue. Since there are no value
labels for ID, and there should be no missing data, you can leave missing at its default,
which is none. Also, in older versions of SPSS, click on OK and you are ready to start
defining another variable. You will not need to change anything under the options column
format, align, or measure. Do be careful, however, that nominal or ordinal is not chosen
under measure for a variable that is continuous, or that you are using as interval or ratio
data. Scale is the default.
Now, go to the second row and type in “gender” for the second variable name in the cell
under the first variable ID. Change the field decimals so that there are no decimal points
allowed. Click on label or place your cursor in the label cell. Type in the longer
descriptive title for the variable in the cell or in older versions in the space beside
“variable label.” Next define the value labels by clicking in the cell and then on the dots
in the gray box to the right. Type a numeral “1” in the box labeled “value.” Tab to the
next box or use the arrow key and type in the “value label” – which in this example is
female. Click on “Add.” You will see 1 = female appear in the value label box. Now type
a 2 in the box labeled value, tab to the next box and type in male, then click on Add. If
you make an error, simply click on the value definition (e.g., 1=female) and then on
“remove.” When you are finished defining all possible values for the variable, click on
OK or continue. Remember that the values must be unique and mutually exclusive. If two
values are possible, as in questions that instruct the respondent to choose all that apply,
you must create a separate variable for each value and then indicate the response as yes or
no.
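A “choose all that apply” question can be pictured as one yes/no variable per option. The sketch below is in Python rather than SPSS and rests on assumptions: the three option names are invented, the function name to_yes_no is made up, and coding yes as 1 and no as 0 is just one reasonable choice.

```python
OPTIONS = ["counseling", "housing", "employment"]  # hypothetical response options


def to_yes_no(selected, options=OPTIONS):
    """Turn the options a respondent checked into one 1/0 variable per option."""
    return {option: (1 if option in selected else 0) for option in options}


# A fictional respondent who checked two of the three options.
print(to_yes_no(["housing", "employment"]))
# {'counseling': 0, 'housing': 1, 'employment': 1}
```

Each key would become its own variable (column) in the data file, so a respondent who checks several options still has exactly one value in every cell.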
Now define “missing values.” In some cases you may not know the gender of some of
the respondents. In this case you would click in the cell, then on the gray box in the
corner, and change the default from no missing values to discrete missing values, or to a
range of missing values if appropriate. For gender we will enter a 9 for missing values in
one of the three spaces below discrete missing values. Click “OK” or “continue” when
you are done defining the missing values.
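The definitions you have just made for gender amount to a small lookup table: each number entered either has a value label, is a declared missing value, or is a code nobody defined. Here is a minimal Python sketch of that idea, using the codes from this example (1 = female, 2 = male, 9 = missing); the function name describe and the list of entered values are invented, and this is not how SPSS stores the definitions internally.

```python
VALUE_LABELS = {1: "female", 2: "male"}  # value labels defined for gender
MISSING_VALUES = {9}                     # discrete missing value for gender


def describe(value):
    """Translate an entered number into its label, 'missing', or a warning."""
    if value in MISSING_VALUES:
        return "missing"
    if value in VALUE_LABELS:
        return VALUE_LABELS[value]
    return f"undefined code {value}"


entered = [1, 2, 2, 9, 1, 3]  # fictional data; the 3 is a deliberate typing error
print([describe(value) for value in entered])
# ['female', 'male', 'male', 'missing', 'female', 'undefined code 3']
```

Any value that comes back as an undefined code is worth checking against the original questionnaire, since it is neither a real response nor a declared missing value.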
Now you have finished defining two variables. The next step is to save your file. As with
all computer programs, you should save your work regularly to avoid losing it.
To save the data file, click on “File” in the upper left hand corner of the top toolbar, then
on “save as,” and then name your file as you would any other file. The extension .sav
will automatically be added to all data files. You can save your file to a specific drive if
you have a disk or a memory stick with you. To do this click on the little folder beside
“save in” at the top of the save screen. Select the directory where you want the file to be
stored. You only need to execute the “save as” command once. Subsequently when you
want to save you will click “save” from the File option on the top toolbar.
Once you have defined all of your variables and saved your file, you are ready to start
entering data. I like to use the number pad on the right side of the keyboard to enter data
on a desktop computer. Make sure the number lock is on if you use the number pad. In
newer versions of SPSS you will go to the data view tab in the left hand corner of the
screen to enter data.
For the variable ID, place your cursor in the first cell, where ID and row one intersect, and
type in the first value for ID, “1.” If you press enter you will advance to the next row. If
you press the right arrow key you will advance to the next cell in the same row. Continue
entering data until you get to ID number 15. If you make a mistake, you can backspace
and fix it, or simply go back to that cell, click on it, and type over what was there or press
delete. Now move your cursor to the variable gender and enter in the values for male and
female for your 15 cases. You can enter fictional data and even type in some 9s for
missing data. Remember to save your data file every now and then.
There are two ways to view the data in your data file. One is to view the numeric values
(the numbers); the other is to view the value labels (the words such as male or female).
To change the view, click on “view” on the top toolbar, then go to the item “value
labels,” and click on it. To change it back again, follow the same steps.
To delete a variable or delete a case, place your cursor on the variable or case at the edge
of the row or column just outside the data grid and click. You will see the row or column
highlighted. Next, select “edit” from the top toolbar and then “clear,” or simply click on
“delete.”
Delete the case in row 8 and see what happens.
Under the “data” tab on the top toolbar you can insert a variable, insert a case, or go to a
case. Try all three – insert a variable, insert a case, and go to a case by using the ID
number.
If you have many variables with the same value labels, such as in a satisfaction interview
in which the same Likert scale is used for each item, you can use the “template” option
under “data” to define your value labels, missing values, and type. This avoids having to
retype the same information for every variable. In newer versions of SPSS the option for
this action is called “copy data properties.” SPSS will prompt you with a series of
questions as to where you want to copy the properties (the working data file in this case),
the source variable (the one you want to copy from), the working file variables (the one
or ones you want to copy to), and then you will see a screen where you can select which
properties you want to copy, and then choose to execute the command.
The “utilities” option on the top toolbar is useful to review your variables and their
definitions. Select “variables” from the utilities option and see what happens. Next select
the option called “file info.”
The key to learning SPSS is to practice and not be afraid to experiment. There is usually
more than one way to do what needs to be done. The best way to learn about your
preferences and to gain speed is to go ahead and try new approaches.