CHAPTER 1 GETTING STARTED

GETTING STARTED WITH SPSS

In this chapter you will find
(a) general information about SPSS
(b) general directions for using the Windows style pull-down menus
(c) general instructions for choosing values for dialog boxes
(d) how to enter data
(e) other general commands.

General Information

SPSS is a powerful tool that can perform many statistical procedures. Data are entered in the data editor window. The data editor window offers two choices: the data view screen and the variable view screen. The variable view screen is where you define variables: name each variable, declare its type, determine its format, and declare its measurement type. The choices for measurement type are scale, ordinal, or nominal. The data view screen is where you enter data. It has a spreadsheet format, and each column contains data for one variable. If a variable is not defined, a default variable name is used: “var00001” for the first column, “var00002” for the second column, and so on. Once data are entered, Windows style pull-down menus are used to select activities, graphs, or other statistical procedures.

Starting and Ending SPSS

The steps you use to start SPSS may differ according to the computer equipment you are using. You will need to get specific instructions for your installation from your professor or computer lab manager. Use this space to record the details of logging onto your system and accessing SPSS.

Copyright © Houghton Mifflin Company. All rights reserved.
Technology Guide Understandable Statistics, 8th Edition
Part IV: SPSS Guide

Once SPSS is activated, an opening dialog box is the first screen you see.

[Figure: SPSS opening dialog box]

Choose Type in data to reach the data view screen of the data editor window.

Notice the main menu items:

File  Edit  View  Data  Transform  Analyze  Graphs  Utilities  Add-ons  Window  Help

The toolbar contains icons for frequently used operations.
To end SPSS: Click on the File option. Select Exit and click on it or press ENTER.

Menu selection summary: ➤File➤Exit

Entering Data

One of the first tasks you do when you begin an SPSS session is to enter data into the data editor window. Before doing that, you may choose to define the variables (columns) of the data. This is done in the variable view screen. Click on the Variable View button located at the bottom of the data editor window to reach the variable view screen, where you define each variable's name, type, format, and so on. After the variables are defined, click on the Data View button, which is also located at the bottom of the data editor window, to get back to the data view screen, where you enter the data.

Notice that the active cell is outlined by a heavier box. To enter a number, type it in the active cell and then press ENTER or TAB. The data value is entered and the next cell is activated: ENTER moves to the next cell in the same column, and TAB moves to the next cell in the same row. Arrow keys and the mouse cursor may also be used to move around in the data view screen. Each column contains data for a specific variable. Notice that there is a cell for a column label above row number 1. To change a data value in a cell, activate the cell by clicking on it, then correct the data in the entry bar above the data sheet, and press ENTER or TAB.

Example

Open a new data sheet by selecting ➤File➤New➤Data.

Let’s create a new data set that contains data about ads on TV. A random sample of 20 hours of prime time viewing on TV gave information about the number of TV ads in each hour as well as the total time consumed in the hour by ads. We will enter the data into two variables (columns): one variable representing the number of ads and the other the time per hour devoted to ads.

First, let’s get into the variable view screen to define the two variables. We name the two variables Ad_Count and Min_Per_Hr. Both are of type Numeric. Width (number of digits) is 8 for both of them. Decimals (number of digits after the decimal point) is 0 for Ad_Count and 2 for Min_Per_Hr. We use “ad” and “mph” as the labels for the two variables, respectively. We assign no value labels and declare no user-defined missing values, so “Values” (labels attached to particular data values) and “Missing” (data values to be treated as missing) are None for both. “Columns” stands for the column width, and we use 8 for both variables. Both are aligned to the right and have scale measurement.

[Figure: variable view screen with Ad_Count and Min_Per_Hr defined]

Now click on the Data View button to get into the data view screen, where we enter our data.

[Figure: data view screen with the 20 values entered for Ad_Count and Min_Per_Hr]

Working with Data

In an SPSS data sheet, each column stands for one variable, and each row stands for a case (record). To delete a variable or a case, use the Edit menu option. To insert a variable or a case, use the Data menu option. To delete the value in a certain cell, activate that cell and press the Delete key. The cell itself will not be deleted; deleting the value in a cell simply produces a “missing value” for the corresponding variable.

To insert a variable (column) to the left of column K: activate a cell in column K, then select ➤Data➤Insert Variable.

To insert a case (row) above row K: activate a cell in row K, then select ➤Data➤Insert Cases.

To delete a variable (column) or a case (row): select the column by clicking on the variable name (or the row by clicking on the row number), then use ➤Edit➤Cut.

Click on the Data menu item. You will see these cascading options in the pull-down list.

[Figure: Data menu pull-down list]

Click on the Edit menu item. You will see these cascading options in the pull-down list.

[Figure: Edit menu pull-down list]
To print the data sheet, use ➤File➤Print.

Manipulating Data

You can also do calculations with entire columns. Click on the Transform menu item and select Compute (➤Transform➤Compute). The Compute dialog box appears.

[Figure: Compute dialog box]

Suppose you would like to calculate a new variable x such that x = 3(Ad_Count) + 4, and store the results in the third column. To do that, first type x in the “Target Variable” entry bar. Then type 3, click on the multiply key * on the calculator pad, highlight Ad_Count in the variable list and click the arrow button to enter it into the “Numeric Expression” entry bar, click on the + key on the calculator pad, and type 4. Now click on OK. The results of this arithmetic will appear in the third column (variable x) of the data sheet.

[Figure: data view screen with the computed variable x in the third column]

Saving Data

Click on the File menu and select Save As (➤File➤Save As). The Save As dialog box appears.

[Figure: Save As dialog box]

For most computer labs, you will save your file on a 3½ inch floppy. Insert it in the appropriate drive. Scroll down the Save in box until you find 3½ Floppy (A:). Then select a file name. In most cases you will save the file as an SPSS file. If you will be changing versions of SPSS or changing systems, you might select SPSS portable instead.

Example

Let’s save the worksheet created in the previous example (information about ads on TV). If you added the variable x as described under Manipulating Data, click on the variable name “x” to highlight this column and press the Del key. Your data should have only two columns. Use ➤File➤Save As. Insert a diskette in drive A. Scroll down the Save in box and select 3½ Floppy (A:). Name the file ads. Click on Save. The worksheet will be saved as ads.sav.

LAB ACTIVITIES FOR GETTING STARTED WITH SPSS

1. Go to your computer lab (or use your own computer) and learn how to access SPSS.

2. (a) Use the data worksheet to enter the data:
   Enter the data 1 3.5 4 10 20 in column 1 and name this variable First.
   Enter the data 3 7 9 8 12 in column 2 and name this variable Second.
   (b) Use ➤Transform➤Compute to generate a new variable Result, stored in column 3. The data in Result should be Result = 2*First + Second. Check to see that the first entry in column 3 for Result is 5. Do the other entries check?
   (c) Save the data as Prob 2 on a floppy diskette.
   (d) Retrieve the data by selecting ➤File➤Open➤Data.
   (e) Print the data. Use either the Print button or select ➤File➤Print.

RANDOM SAMPLES (SECTION 1.2 OF UNDERSTANDABLE STATISTICS)

In SPSS you can take random samples from a variety of distributions. We begin with one of the simplest: random samples from a range of consecutive integers under the assumption that each of the integers is equally likely to occur. To generate such a random sample, follow these steps:

1) In the data editor, enter the sample numbers in the first column. For example, to generate five random numbers, enter 1 through 5 in the first column. Compute fills in a value only for existing cases, so this column determines how many random numbers are generated.

2) Use the menu options ➤Transform➤Compute. In the dialog box, first type in a variable name as the Target Variable, then select a function to generate random numbers from the desired distribution. The function RV.UNIFORM(min, max) under the group Random Numbers generates random numbers from the uniform distribution on the interval (min, max). The function TRUNC(k) under the group Arithmetic truncates a real number to its integer part. Therefore, to generate a random sample of integers between two numbers, say between 1 and 100, use the formula TRUNC(RV.UNIFORM(1, 101)). The max is set to 101, one more than the largest integer wanted, because the real numbers between 100 and 101 truncate to 100.

The random sample numbers are given in the order of occurrence.
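For readers who want to check the truncation idea outside SPSS, here is a minimal Python sketch of the same recipe. It is an illustration only, not SPSS's own generator: the function name random_integers and the seed argument are assumptions made for this example, and Python's random.uniform stands in for RV.UNIFORM.

```python
import math
import random


def random_integers(n, low, high, seed=None):
    """Draw n integers from low..high (inclusive), each equally likely.

    Mimics TRUNC(RV.UNIFORM(low, high + 1)): draw a real number from the
    uniform distribution on (low, high + 1) and keep its integer part.
    """
    rng = random.Random(seed)  # a seed makes the draws repeatable
    values = []
    for _ in range(n):
        real = rng.uniform(low, high + 1)
        # min() guards against the rare floating-point case where the
        # draw lands exactly on the upper endpoint high + 1.
        values.append(min(high, math.trunc(real)))
    return values


sample = random_integers(5, 1, 100, seed=1)
print(sample)          # in the order of occurrence
print(sorted(sample))  # in ascending order, to spot repeated values
```

As in the SPSS formula, the upper limit passed to the uniform draw is one more than the largest integer wanted, so the largest integer is as likely as the others.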
If you want the random numbers in ascending order (so you can quickly check to see if any values are repeated), use the menu options ➤Data➤Sort Cases.

➤Data➤Sort Cases Dialog Box Responses
Sort by: Enter the variable name that you wish to sort by.
Sort order: Select either ascending or descending order.

Example

There are 175 students enrolled in a large section of introductory statistics. Draw a random sample of 15 of the students.

We number the students from 1 to 175, so we will be sampling from the integers 1 to 175. We don’t want any student repeated, so if our initial sample has repeated values, we will continue to sample until we have 15 distinct students. We sort the data so that we can quickly see if any values are repeated.

First we follow the above two steps to generate the fifteen numbers. We use x as the variable containing the 15 random numbers. Note that we use 176 as the “max” for the RV.UNIFORM function, so the formula is TRUNC(RV.UNIFORM(1, 176)).

[Figure: responses in the ➤Transform➤Compute dialog box]

Now click on OK. (Your results will vary.)

[Figure: data view screen showing the 15 random numbers in x]

Next, sort the data using ➤Data➤Sort Cases with x as the Sort by variable. Click on OK. Highlight the column for x to see the sorted data.

[Figure: data view screen showing the sorted values of x]

In our sample no values are repeated. If you have repetitions, keep sampling until you get 15 distinct values.

Random numbers are also used to simulate activities or outcomes of a random experiment such as tossing a die. Since the six outcomes 1 through 6 are equally likely, we can use the above procedure to simulate tossing a die any number of times. When outcomes are allowed to occur repeatedly, it is convenient to use a frequency table to tally, count, and give percents of the outcomes. We do this with the menu options ➤Analyze➤Descriptive Statistics➤Frequencies.

➤Analyze➤Descriptive Statistics➤Frequencies Dialog Box Responses
Variables: variable name containing data
Option to check: Display frequency tables.

Example

Use the above random number generating procedure with min = 1 and max = 7 (numbers between 6 and 7 will truncate to 6) to simulate 100 tosses of a fair die. Use the frequency table to give a count and percent of outcomes.

Enter the numbers 1 through 100 in the first column, then generate the random sample using the formula TRUNC(RV.UNIFORM(1, 7)). Use the name Outcome for the variable containing the outcomes. Then use ➤Analyze➤Descriptive Statistics➤Frequencies with “Display frequency tables” checked.

[Figure: Frequencies dialog box with Outcome selected]

Click on OK. (Your results will vary.)

[Figure: frequency table of Outcome]

If you have a finite population and wish to sample from it, you may use the menu options ➤Data➤Select Cases.

➤Data➤Select Cases Dialog Box Responses
Select the variable to be sampled from.
Check: Random sample of cases
Check: either Filtered (the unselected cases will be marked with 0, selected cases with 1) or Deleted (the unselected cases will be deleted).

Next, click on the Sample button. Another dialog box will show up.

Dialog Box Responses
Check: Exactly
Enter the number of cases to be selected from the first N cases, where N is the total number of cases.

Example

Take a sample of size 10 without replacement from the population of numbers 1 through 100.

First, enter the numbers 1 through 100 in the first column, using x as the variable name. Then use ➤Data➤Select Cases. In the dialog box, select variable x, check Random sample of cases, and check Filtered.

[Figure: Select Cases dialog box]

Next, click on the Sample button. In the Sample button dialog box, check “Exactly” and enter “10” cases from the first “100” cases.
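The two ways of treating unselected cases can be illustrated outside SPSS with a short Python sketch. This is an assumption-laden illustration, not the algorithm SPSS uses: the function name select_cases and the seed argument are hypothetical, and random.sample is used to draw exactly k cases without replacement.

```python
import random


def select_cases(cases, k, seed=None):
    """Choose exactly k of the cases at random, without replacement.

    Returns (flags, kept). flags[i] is 1 if cases[i] was selected and 0
    otherwise, like the Filtered option. kept lists only the selected
    cases in their original order, like the Deleted option.
    """
    rng = random.Random(seed)  # a seed makes the selection repeatable
    chosen = set(rng.sample(range(len(cases)), k))  # k distinct positions
    flags = [1 if i in chosen else 0 for i in range(len(cases))]
    kept = [case for case, flag in zip(cases, flags) if flag]
    return flags, kept


population = list(range(1, 101))  # the numbers 1 through 100
flags, kept = select_cases(population, 10, seed=2)
print(sum(flags))  # number of cases marked 1
print(kept)        # the 10 selected values; no value can repeat
```

Because the positions are drawn without replacement, no case can be selected twice, which is why no sorting and re-sampling step is needed here.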
[Figure: Select Cases: Random Sample dialog box with Exactly 10 cases from the first 100 cases]

Now click on Continue. Then click on OK.

[Figure: data view screen with a filter column marking selected cases 1 and unselected cases 0]

Let us now check Deleted in the ➤Data➤Select Cases dialog box. This way only the selected cases will show up. Note that results vary from sampling to sampling.

[Figure: data view screen showing only the 10 selected cases]

LAB ACTIVITIES FOR RANDOM SAMPLES

1. Out of a population of 8173 eligible county residents, select a random sample of 50 for prospective jury duty. Should you sample with or without replacement? Use the TRUNC(RV.UNIFORM(min,max)) function to generate the sample. Use the sorting procedure to sort the data so that you can check for repeated values. If necessary, repeat the procedure to continue sampling until you have 50 different people.

2. Retrieve the SPSS data Svls02.sav on the CD-ROM. This file contains weights of a random sample of linebackers on professional football teams. The data are in Column 1. Use the menu options ➤Data➤Select Cases to take a random sample of 10 of these weights. Print the 10 weights included in the sample.

Simulating experiments in which outcomes are equally likely is another important use of random numbers.

3. We can simulate dealing bridge hands by numbering the cards in a bridge deck from 1 to 52. Then we draw a random sample of 13 numbers without replacement from the population of 52 numbers. A bridge deck has 4 suits: hearts, diamonds, clubs, and spades. Each suit contains 13 cards: those numbered 2 through 10, a jack, a queen, a king, and an ace. Decide how to assign the numbers 1 through 52 to the cards in the deck.
   (a) Use TRUNC(RV.UNIFORM(min,max)) to get the numbers of the 13 cards in one hand. Translate the numbers into cards and tell what cards are in the hand. For a second game, the cards would be collected and reshuffled. Use the computer to determine the hand you might get in a second game.
   (b) Store the 52 cards in the first column, and then use ➤Data➤Select Cases to sample 13 cards. Print the results. Repeat this process to determine the hand you might get in a second game.
   (c) Compare the four hands you have generated. Are they different? Would you expect this result?

4. We can also simulate the experiment of tossing a fair coin. The possible outcomes resulting from tossing a coin are heads or tails. Assign the outcome heads the number 2 and the outcome tails the number 1. Use TRUNC(RV.UNIFORM(min,max)) to simulate the act of tossing a coin 10 times. Use a frequency table to tally the results. Repeat the experiment with 10 tosses. Do the percents of outcomes seem to change? Repeat the experiment again with 100 tosses.