Chapter 3 How to Do Scatterplots and Maps in SPSS 10.0 to 13.0

At the start of Chapter 3 Section 1 we wanted to find measures of reliance on agriculture and fixity of settlement patterns to test the hypothesis that the transition to agriculture would lead to a transition to fixed settlement. We found one variable, AGRICULTURE-CONTRIBUTION TO LOCAL FOOD SUPPLY (variable 3 in the SCCS codebook), for the first measure, and another, FIXITY OF SETTLEMENT (variable 150), for the second measure. In this chapter we explore how to test bivariate (two-variable) hypotheses with scatterplots (Section 1) and how to map the geographical distribution of the societies in the sample that have different configurations of values of the variables in our hypothesis (Section 2). You may want to use the latest two-week demo version of SPSS 13.0 for some of the options discussed in this chapter. It can be downloaded from http://www.spss.com/spss/

Scatterplots…(to be expanded)
Maps…(to be expanded)

Section 1: Making Scatterplots to Test Relationships between Variables

In our case the hypothesis test involves the analysis of the bivariate correlation between two variables. In such cases we advise you to start by making a scatterplot, which makes it possible to see the general relationship between the variables. To make the scatterplot, go to the menu line and choose:

Graphs → Scatter

You will see the following window:

Press the “Define” button. You will see the following submenu:

The independent variable should be moved to the X axis, whereas the dependent one should be placed on the Y axis. The independent variable in our case is AGRICULTURE-CONTRIBUTION TO LOCAL FOOD SUPPLY, so let us move it to the X axis. The dependent variable is FIXITY OF SETTLEMENT, so we are to move it to the Y axis. But first we have to find it.
To do this we would advise you to look in your notes and check the number of this variable (do you remember that we asked you to write it down?). This number is 61. The total number of variables in the database is just under 2000. Hence, our variable should be quite close to the beginning of the list (note that if its number were, say, 1700, you would have to move to the end of the list). In general, knowing the variable number makes it much easier to find the variable. That is why we advised you to write down variable numbers when you find them in the Codes.doc file.

Finally, we have put AGRICULTURE-CONTRIBUTION TO LOCAL FOOD SUPPLY on the X axis, and FIXITY OF SETTLEMENT is on the Y axis. Now we can press the OK button, and we shall see the following scatterplot:

[Scatterplot of Fixity of Settlement (Y axis, 0 to 7) against Agriculture-Contribution to Local Food Supply (X axis, 0 to 7)]

In fact, in this form the scatterplot is not particularly informative. To make it really informative you will have to edit it. To do this, first double-click on the scatterplot (the time between the two left-clicks should be VERY short, otherwise you will simply not get into the editing mode). You will see the following:

In SPSS 12 and 13 the Chart Editor window has a different menu, as shown here.

To adjust the node sizes in the scatterplot in v.12-13 so that they reflect the frequencies in the cross-tab, you “Bin Data Points”: double-click on one of the points, and a properties menu will appear. Then click Display As ( ) Bins and ( ) Marker Size. Then click the Marker tab next to Point Bins, and increase the Marker Size from 5 to, say, 25. Then click Apply. If needed, click Edit / Undo and redo the point sizes. A Scale box will also appear on the graph; you can click on it and move it to a more convenient location.
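The idea behind point binning can be sketched outside SPSS as follows. This is a minimal illustration in Python, not a description of what SPSS does internally: the data pairs are made up (they are not SCCS values), and the helper names bin_points and marker_size, as well as the linear sizing rule, are assumptions made for the example.

```python
from collections import Counter

# Hypothetical (x, y) pairs: x = agriculture code, y = fixity code.
# These values are invented for illustration; they are not SCCS data.
cases = [(1, 1), (1, 1), (1, 2), (3, 1), (5, 5), (5, 5), (5, 5), (6, 6), (6, 6)]


def bin_points(pairs):
    """Count how many cases fall on each (x, y) position."""
    return Counter(pairs)


def marker_size(count, base=5, step=5):
    """Assumed sizing rule: the marker grows linearly with the case count."""
    return base + step * (count - 1)


bins = bin_points(cases)
for (x, y), n in sorted(bins.items()):
    print(f"x={x} y={y} cases={n} marker_size={marker_size(n)}")
```

Each distinct position is drawn once, and the size of its marker carries the information that would otherwise be hidden by overlapping points.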
In SPSS v.14, you must now click Edit / Properties and then click the Point Bins tab on the Properties menu (in v.13 you did not need to click Edit first). Finally, you can click Elements / Interpolation line in the main menu to get your final graph.

Figure 3.1. Scatterplot of Agricultural Reliance and Fixity of Settlement (SPSS 12-13)

In Figure 3.1 the dip in the interpolation line at Agriculture = 3 (3 = < 10%) indicates that between the nomadism associated with food collection (Agriculture = 1, None) and the fixity of residence found with Agriculture = 6 (Primarily agricultural) there is another sort of nomadism, associated with pastoralism, in which there is often a small amount of agriculture.

Incidentally, how can you find out what values such as 2, 4, or 6 mean? It is very simple. Just do not close the “Codes.doc” file while working in SPSS with the Standard Sample database. There you can easily find answers to all such questions. The codes can also be found on-line at http://eclectic.ss.uci.edu/~drwhite/courses/SCCCodes.htm.

Figure 3.1, however, was made with SPSS version 12 or 13. For versions 10 and 11, the scatterplot graph and the means of making it will differ. First enlarge the SPSS Chart Editor window and choose in the SPSS Chart Editor menu line:

CHART → OPTIONS

You will see the following:

Now tick “Show sunflowers” and “Fit Line: Total”, and then press the “Fit Options” button. You will see the following:

Mark “Lowess.” Press “Continue”, then “OK”, and the scatterplot will change as follows:

Figure 3.2. Scatterplot of Agricultural Reliance and Fixity of Settlement (the dotted black line is explained in Chapter 5)

Irrespective of which version of SPSS we used, we have changed the scatterplot in two important ways.
First, we did “Point Binning” or “sunflowers.” In the sunflower scatterplot (Figure 3.2) a single circle denotes one case, a circle with two “petals” denotes two cases, a circle with three “petals” denotes three cases, etc. It is really very difficult, however, to count the “petals” when there are more than a dozen. Using Point Binning (Figure 3.1), by contrast, we were able to make the size of the nodes in the graph reflect the number of cases in the cross tables, though only in intervals of ten (0, 10, 20, 30, 40, 50, 60, 70).

The other new feature is that an Interpolation line (Figure 3.1, versions 12-13) or Lowess curve (Figure 3.2, versions 10-11) was fitted to the scatterplot. The Interpolation or Lowess line follows the local pattern of the data instead of imposing a single straight line on them; hence, it is sensitive not only to linear but also to curvilinear relationships. As we shall see below, this is immensely important for us.

In general, the scatterplot in its current form suggests that we have every reason to expect a rather strong and significant correlation between the variables. However, note that it in fact tells us much more than this. For example, we may notice that in the reliance-on-agriculture range 1-3 (that is, 0-10% reliance on agriculture) growth in the agricultural contribution to food supply does not tend to increase settlement fixity. Yet growth of reliance on agriculture beyond 10% (the transition from value 3 to 4 on the X axis scale) apparently leads to a sharp increase in settlement fixity. This sharp increase continues with the transformation of agriculture into the most important single food supplier (even before it starts contributing more than 50% of food; this corresponds to the transition from value 4 to 5 on the X axis scale).
However, further growth of the agricultural contribution to food supply does not result in as sharp an increase in settlement fixity (this corresponds to the transition from value 5 to 6 on the X axis scale).

To understand why we advise you to start your analysis of correlation by building a scatterplot, let us make a scatterplot for two variables: political complexity and polygyny. For the latter we shall use v860. It is named “Cultural Basis of Polygyny”, but has the following values:

1 = Monogamy prescribed
2 = Monogamy preferred, but exceptional cases of polygyny
3 = Polygyny preferred by individual men with leadership attributes (chiefs, medicine men, outstanding hunters)
4 = Polygyny preferred by men of a higher social class: men of wealth, rank, nobility, etc.
5 = Polygyny preferred by most men, and attained by most men of sufficient years or wealth to obtain wives.

Hence, it may well be regarded as a polygyny index. Now, follow the algorithm specified above. If you do not make any mistakes the result (SPSS 11 format) will look as follows:

Figure 3.3. Scatterplot of Jurisdictional Hierarchy and Polygyny
[Scatterplot: X axis, Jurisdictional Hierarchy Beyond Local Community (0 to 6); Y axis, 0 to 6]

The scatterplot immediately suggests that the relationship between political complexity and polygyny is curvilinear. The polygyny level tends to increase with the growth of political complexity up to a medium complexity level (3 = 2 political integration levels over community, which corresponds to complex chiefdoms and their equivalents, e.g. complex tribal confederations). However, afterwards it tends to decrease rather sharply. Standard correlation tests (such as Pearson's r or Spearman's rho) are aimed only at linear (or, in the case of Spearman's rho, monotonic) relationships. Hence, in our case such tests would yield patently misleading results. That is why we advise you so strongly to make scatterplots before doing any crosstabs and statistical tests.
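The danger can be checked with a small calculation. The sketch below, in Python, computes Pearson's r by hand for made-up data shaped like an inverted U. The values are purely illustrative (they are not SCCS data), and pearson_r is a helper written for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's product-moment correlation, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented inverted-U data: the index rises up to the middle value of
# complexity and then falls again.
complexity = [1, 2, 3, 4, 5]
polygyny = [1, 3, 5, 3, 1]

r_all = pearson_r(complexity, polygyny)               # whole range
r_rising = pearson_r(complexity[:3], polygyny[:3])    # rising half only
r_falling = pearson_r(complexity[2:], polygyny[2:])   # falling half only

print(f"whole range: {r_all:.3f}")
print(f"rising half: {r_rising:.3f}")
print(f"falling half: {r_falling:.3f}")
```

In this constructed case the two halves are each perfectly linear, in opposite directions, and they cancel out over the whole range, so a coefficient near zero hides a very strong relationship that a scatterplot shows at a glance.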
As an exercise, now make a scatterplot for population density and political complexity. If you do not make any mistakes, the result should look as follows:

Figure 3.4. Scatterplot of Population Density and Jurisdictional Hierarchy (the dotted black line and circles are explained in Chapter 5)
[Scatterplot: X axis, Population Density (0 to 8); Y axis, 0 to 6]

Now interpret the scatterplot.

Section 2: Making Maps

Using the "Scatter" option you can also make maps. This makes good sense, because before studying correlations between cultural traits it is very helpful to study the geographic distribution of the traits in question. Let us, for example, make a map showing the geographic distribution of postmarital residence patterns. To make this map (and other maps) use the file S-DATMAPnew.sav. After opening it choose the "Scatter" option. Move "Latitude" to the Y Axis; move "Longitude" to the X Axis. Move "Marital Residence" (v69) to the "Set Markers by" box. Press "OK". You will see the following:

[Map: X axis, Longitude in Degrees (-200 to 200); Y axis, -60 to 80. Legend (Marital Residence): *; Neolocal-separate from kin; Ambilocal-w/ either wife's or husband's; Patrilocal or virilocal; Avunculocal-w/husband's mother's brother; Matrilocal or uxorilocal-with wife's kin]

In fact, it is possible to learn a great deal even from this map. However, such maps produced directly by SPSS are not really user-friendly, so we would advise you to edit it. To do this, double-click on the map. Imagine that we would like to study the patterns of geographic distribution of matrilocal vs. patrilocal residence. Click on a matrilocal residence marker (within the map itself, or in its margin), then choose in the SPSS Chart Editor menu line:

FORMAT → MARKER

You will see the following:

We would advise you to choose the dark circle and the large size by clicking on the respective options. Now click "Apply".
As a result the map will change as follows:

Now do the same with the patrilocality markers. The map will now look as follows:

Next, we would advise you to change the colors a little. To do this choose in the SPSS Chart Editor menu line:

FORMAT → COLOR

Click on the markers whose colors you would like to change, select the color you prefer and click "Apply".

Some general advice. If you are going to use the map in an essay, thesis, etc. that will be printed on a black-and-white printer, it makes sense to observe the following points. Avoid light colors: printers frequently fail to print them (sometimes they are not visible even in PowerPoint presentations). We would recommend the following colors, which look satisfactory in both multicolor and black-and-white formats: dark blue for continent contours, red for one marker, dark grey for the alternative marker. After you change the map in this way it will look as follows:

[Map: X axis, Longitude in Degrees (-200 to 200); Y axis, -60 to 80. Legend (Marital Residence): *; Neolocal-separate from kin; Ambilocal-w/ either wife's or husband's; Patrilocal or virilocal; Avunculocal-w/husband's mother's brother; Matrilocal or uxorilocal-with wife's kin]

In fact, we would still advise you to make some further amendments to this map. One point is that in our case the marker labels are too long. It would make sense, for example, to shorten the first label from "Neolocal-separate from kin" to just "Neolocal". To do this double-click on any marker legend and you will see the following:

Using this menu you can make any changes you like to both the legend title and the marker labels. Just do not forget to click the "Change" button after making changes to any label (otherwise your changes will be lost). Finally click OK.
Now the map will look as follows:

[Map: X axis, Longitude in Degrees (-200 to 200); Y axis, -60 to 80. Legend (Residence): *; Neolocal; Ambilocal; Patrilocal; Avunculocal; Matrilocal]

Finally, it might make sense to delete the axis titles ("Longitude in Degrees" and "Latitude in Degrees"), since we do not really need them (to do this just double-click on the respective axis title, delete it in the window that opens and press OK). The final version of the map will look as follows:

[Map without axis titles: X axis, -200 to 200; Y axis, -60 to 80. Legend (Residence): *; Neolocal; Ambilocal; Patrilocal; Avunculocal; Matrilocal]

Now we can easily interpret the map. We see immediately that in the world of traditional cultures the dominant postmarital residence pattern was patrilocal. Matrilocal residence occurred much more rarely, and most matrilocal cases are concentrated in South-East Asia and the adjacent part of Oceania, and in the eastern parts of both North America and South America. Note that this immediately suggests that the observed distribution of these cultural traits might be to a considerable extent a result of diffusion processes (and hence we are confronting here what cross-cultural anthropologists call "Galton's Problem" [on which see below in Chapter *]).

Now make a map of the geographical distribution of matrilineal vs. patrilineal descent. To do this use the variable DESCENT - MEMBERSHIP IN CORPORATE KINSHIP GROUPS (v70). If you follow the algorithm specified above correctly, you will get the following map:

[Map without axis titles: X axis, -200 to 200; Y axis, -60 to 80. Legend (Descent): *; Bilateral; Ambilineal; Patrilineal; Double; Matrilineal]

Now try to interpret this map (suggestion: a meaningful interpretation is easiest if you compare it with the map of the geographic distribution of postmarital residence patterns above).
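The logic of such a map (a scatterplot of longitude against latitude, with one marker series per category) can be sketched in a few lines of Python. This is only an illustration of the idea behind the "Set Markers by" box, not SPSS code: the records are invented (they are not SCCS societies), and the helper names group_by_marker and marker_style are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records: (society, longitude, latitude, residence).
# Names and coordinates are invented for illustration; they are not SCCS data.
societies = [
    ("Society A", 102.0, 14.0, "Matrilocal"),
    ("Society B", 25.0, -10.0, "Patrilocal"),
    ("Society C", -75.0, 40.0, "Matrilocal"),
    ("Society D", 35.0, 5.0, "Patrilocal"),
    ("Society E", 140.0, -5.0, "Neolocal"),
]


def group_by_marker(records):
    """Split the points into one (longitude, latitude) series per category."""
    groups = defaultdict(list)
    for _name, lon, lat, category in records:
        groups[category].append((lon, lat))
    return dict(groups)


def marker_style(category, emphasised=("Matrilocal", "Patrilocal")):
    """Assumed styling: large markers for the two categories being compared."""
    return "large" if category in emphasised else "small"


series = group_by_marker(societies)
for category, points in sorted(series.items()):
    print(category, marker_style(category), points)
```

Each series can then be drawn with its own marker, which is what makes clusters of one category (for example, matrilocal cases in one region) stand out on the map.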
For further examples of maps of SCCS variables you may consult the White-Veit EthnoAtlas at http://eclectic.ss.uci.edu/~drwhite/ethnoatlas/nindex.html. The maps there were made with the pre-Windows MAPTAB program (White 1985, published on diskette; see http://eclectic.ss.uci.edu/~drwhite/worldcul/wldvol11.htm).

Now let us move on. To test a correlation we would advise you to use the cross-tab option, so let us turn to the next chapter.