Supplement: Population Size Predicts Technological
Complexity in Oceania
Michelle A Kline* and Robert Boyd*
* Department of Anthropology, University of California, Los Angeles, CA, 90095
Sampling from Electronic Human Relations Area Files (eHRAF)
We limited our data set to ethnographies found in the eHRAF (2009), because of the
strict criteria of inclusion for that database. The HRAF database contains published
books and articles, and unpublished dissertations. Authors include anthropologists
and other scholarly researchers, as well as missionaries and explorers. Most
publications are in the form of ethnographies done in situ, but the collection also
includes publications resulting from research in museums, and ethno-histories. The
publications included in our sample were overwhelmingly ethnographic accounts.
For details of how we controlled for potential variation in ethnographic coverage,
see the section on control variables, below. For further details on the eHRAF
collection, see their online guides (World Cultures Database, 2009). Our sample
contains societies from near and far Oceania (also known as Micronesia, Melanesia,
and Polynesia).
Figure S1. Geographical locations of the ten societies in our sample.
Geographical distance between societies is not always indicative of rates of contact. Map from
GoogleMaps (2008).
Collection of Tool Data
We collected data in two stages, described in detail below.
1. Collection of ethnographic excerpts from the electronic Human Relations Area Files
database, and calculation of total number of tools.
We selected the societies to be included in this sample by limiting ourselves to (a)
societies in Oceania, as defined by eHRAF, and (b) societies for which it is possible to
obtain a population size estimate. We did not use groups located in Papua New
Guinea or mainland Australia, so that our sample would include only relatively
isolated groups for which we could estimate rates of contact. After examining the
organization of eHRAF indexing codes, we used the advanced search function to
retrieve all paragraphs indexed by eHRAF as having content on “fishing,” “marine
hunting,” or “fishing gear.” The coder then copied and pasted all retrieved
paragraphs into a Word document for each society, for further coding.
The coder read through the Word document while using an Excel sheet to
assemble a list of all technologies mentioned (whether in passing or as the focus) in
these ethnographic excerpts, along with a record of the paragraph(s) in which each
technology is mentioned. These Excel sheets formed separate technology indexes for each
society in our sample. After completing the index, the coder reviewed notes made
during the indexing process and double-checked that each tool indexed was in fact
an independent tool. Tools were granted this status if any of the following was true:
tools were described as performing a unique purpose, tools were described as
having a unique structure or production technique, or tools were given a unique
name by informants. Tool names sometimes differed according to the author of the
ethnography, and in these cases the coder gave priority to the structural and
functional descriptions of tools.
The coder indexed every technology mentioned in the collected excerpts, even if
they were not expressly marine foraging technologies. After verifying that each tool
was in fact an independent tool type, we removed technologies that were not
directly involved in procuring marine resources (for instance, canoe houses or
cooking pots). We also removed technologies like canoes, because while often
mentioned in fishing-related excerpts, they were not reliably sampled by the eHRAF
codes we used. We retained tools or tool parts that were used in marine foraging,
but seemed to be primarily decorative or supernatural, because they are
nonetheless part of the tool kit from the perspective of the producers, so that they
too are subject to the dynamics of cultural evolution. The resulting list of tools
composed our “total number of tools” data. In the next step, we gauged the
complexity of each tool, by revisiting the ethnographic excerpts according to the
excel spreadsheet index created here.
2. Coding of Techno-units.
We defined techno-units just as Oswalt (1976) did: “an integrated, physically
distinct, and unique structural configuration that contributes to the form of a
finished artefact (p. 38).” The coder used the index and any notes created in phase
one to revisit paragraphs on each tool, and to estimate the number of techno-units
that composed each tool. The coder did this individually for each tool, for each
group.
Techno-unit counts are based on verbal descriptions, illustrations, and photographs
from the eHRAF. Thus, our study differs from Oswalt’s (1976) in that he had access
to museum collections and was able to handle some actual specimens. Since we
were limited to using ethnographies only, we sometimes found that there was not
enough information available on a given technology to rate its techno-units. We do
not include these tools in our techno-unit measures.
Many of the same tools were present in a number of groups in our sample, with
varying degrees of techno-unit information available in the ethnographic excerpts
collected for each group. In these cases, we took the average of all estimates we
were able to generate, and used this new mean value as the techno-unit data for that
tool across all groups, regardless of whether or not information on techno-units was
available in that specific instance. The proportion of tools without techno-unit
ratings does not predict the estimate of mean techno-units for a group (ß= .142,
p=.696, R2=.0201). See Figure 1 for an example of techno-unit coding.
Figure 1. A sink-net used by Santa Cruz Islanders; made up of 7 techno-units. (From
Speizer 1958, p71, Fig 29: in eHRAF 2008)
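The pooling of techno-unit estimates described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the record layout, and the example records are hypothetical and are not taken from our data files.

```python
from collections import defaultdict
from statistics import mean

def pooled_techno_units(estimates):
    """Average all available techno-unit estimates for each tool type.

    `estimates` is a list of (society, tool, count) tuples; count is None
    when the excerpts did not allow a techno-unit rating (assumed layout).
    """
    by_tool = defaultdict(list)
    for _society, tool, count in estimates:
        if count is not None:
            by_tool[tool].append(count)
    # Tools with no usable estimate in any society are left out entirely.
    return {tool: mean(counts) for tool, counts in by_tool.items()}

def society_mean_complexity(estimates, pooled):
    """Mean techno-units per tool for each society, using the pooled values."""
    per_society = defaultdict(list)
    for society, tool, _count in estimates:
        if tool in pooled:
            per_society[society].append(pooled[tool])
    return {society: mean(values) for society, values in per_society.items()}

# Hypothetical records, invented for illustration only.
records = [
    ("A", "sink-net", 7), ("B", "sink-net", 9), ("C", "sink-net", None),
    ("A", "spear", 2), ("C", "spear", None), ("C", "trap", None),
]
pooled = pooled_techno_units(records)
complexity = society_mean_complexity(records, pooled)
```

In this sketch, society C receives the pooled sink-net and spear values even though no techno-unit rating was possible from its own excerpts, while the trap, which could not be rated anywhere, drops out of the complexity measure.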
Calculation of Control Variables
Most of the control variables we used are self-explanatory. Here we discuss five
types of control variables: ethnographic coverage, effective temperature (ET),
vulnerability to catastrophic storms, threat of drought, and relative importance of fishing.
1. Ethnographic Coverage
One alternate explanation of our result is that it is an artifact of the way that data
were collected by ethnographers, across the different societies in our sample. By
this account, larger populations might draw more ethnographers, or ethnographers
might write more about them, so that there is more information available on their
tool kits. To control for this possibility, we used the following data on the eHRAF
collection: (1) number of publications, (2) number of authors, and (3) number of
pages published on a particular society. None of these variables predicts tool kit
variation, in terms of either number of tools or complexity of tools as measured by
mean techno-units, when included in a regression with log-transformed population
size (see Tables S1 and S2). In an Akaike model selection analysis of number of tools,
the strongest of these models (number of tools regressed on number of publications) is
ranked 15 out of 23 and so is not a preferred model. Likewise, in the Akaike analysis
of tool complexity, the model using number of publications is the least preferred of 23
models.
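For readers unfamiliar with Akaike model selection, the following is a minimal sketch of how small-sample AIC (AICc) values and Akaike weights can be computed for least-squares models. The exact AICc variant used for the values in this supplement is not specified here, so the residual-sum-of-squares form below is an assumption, and the function names are hypothetical.

```python
import math

def aicc(rss, n, k):
    """Small-sample corrected AIC for a least-squares fit.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters (assumed to include the intercept
    and the error variance).
    """
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Convert a list of AICc scores into weights that sum to one."""
    best = min(scores)
    relative = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(relative)
    return [r / total for r in relative]

# Hypothetical AICc scores for three candidate models.
weights = akaike_weights([-3.0, -2.0, -2.0])
```

Lower AICc scores receive larger weights, and models with equal scores receive equal weights, which is how the models are ranked against one another.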
Table S1. Rows give the standardized regression coefficients and significance values for
regressions in which the dependent variable is the logarithm of the total number of tools and
the independent variables are log-transformed population size, and a log-transformed measure
of ethnographic coverage.

IV               ß        Sig      BS sig   R2
Population       0.733    0.014    0.058    0.6853
# Publications   0.205    0.396    0.581

Population       0.744    0.023    0.460    0.6593
# Authors        0.120    0.653    0.856

Population       0.809    0.014    0.023    0.6486
# Pages          -0.008   0.975    0.979
Table S2. Rows give the standardized regression coefficients and significance values for
regressions in which the dependent variable is the logarithm of the mean number of
techno-units per tool, per society and the independent variables are log-transformed population size,
and a log-transformed measure of ethnographic coverage.

IV               ß        Sig      BS sig   R2
Population       0.721    0.039    0.154    0.5008
# Publications   -0.044   0.883    0.936

Population       0.814    0.030    0.072    0.5324
# Authors        -0.212   0.503    0.546

Population       0.717    0.047    0.124    0.4995
# Pages          -0.023   0.939    0.956
2. Effective Temperature
Effective temperature (ET) is a measure of ecosystem abundance that is based on
the amount of solar energy available in a particular location. It has been used
previously for this purpose (Bailey 1960, Binford 2001, Collard et al. 2005). It is
calculated by the following formula, where MWM is the mean temperature in
degrees centigrade during the warmest month, and MCM is the mean temperature
in degrees centigrade during the coldest month:
ET = (18 * MWM – 10 * MCM) / (MWM – MCM + 8)
We also used latitude as another proxy of solar energy available. We did not use
variation in growing season length, since all groups in our sample had year-round
growing seasons.
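As a worked illustration of the ET formula above, here is a minimal Python sketch. The function name and the example temperatures are hypothetical and are not values from our sample.

```python
def effective_temperature(mwm, mcm):
    """Effective temperature from the mean temperatures (degrees centigrade)
    of the warmest month (mwm) and the coldest month (mcm)."""
    return (18 * mwm - 10 * mcm) / (mwm - mcm + 8)

# Hypothetical tropical island: warmest month 28 C, coldest month 26 C.
et_example = effective_temperature(28, 26)   # (504 - 260) / 10 = 24.4

# With no seasonal variation, ET reduces to the constant monthly temperature.
et_flat = effective_temperature(25, 25)      # (450 - 250) / 8 = 25.0
```

The second case shows why ET varies little across low-latitude island groups: when the warmest and coldest months are nearly the same, ET is close to the year-round mean temperature.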
3. Vulnerability to Catastrophic Storms
We also collected data on the threat of cyclones and tropical storms, since these
influence risk of resource failure in the Pacific. Cyclones may tear the roofs from
houses, uproot crops, and prevent fishing and marine collecting for days at a time.
To estimate the threat of cyclones and tropical storms for each group, we gathered
data using cyclone path maps for the Pacific region, available at Australian Severe
Weather (2009). Each map covers one cyclone season. We used maps from the
1998-1999 to 2008-2009 seasons. We used recent data because historical data were
not available for all groups in our sample and did not seem to be as reliable. For any
cyclone path coming within approximately 50 miles of our target group’s island, we
recorded the maximum wind speed of the entire storm and the maximum actual
wind speed of the storm at the point nearest the island. From these data we
calculated for each group: (a) the total number of storms across all seasons from
1998 to 2009, (b) the sum of the maximum wind speeds of all storms, (c) the mean
wind speed for all storms, and (d) the maximum wind speed of any storm. None of these
measures has an effect on a group’s total number of tools, or on their average tool
complexity (in techno-units).
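The four storm measures can be sketched as a simple summary over the storms recorded for one island group. This is an illustrative sketch only: the function and key names are hypothetical, the example wind speeds are invented, and the choice of which recorded wind speed (whole-storm maximum or speed at the nearest point) feeds each measure is left to the caller as an assumption.

```python
from statistics import mean

def storm_summary(wind_speeds):
    """Summarize the storms recorded for one island group.

    `wind_speeds` holds one wind-speed value per storm that passed within
    the distance cutoff during the seasons examined.
    """
    if not wind_speeds:
        # No storms in the window: the count and sum are zero, and the
        # mean and maximum are undefined.
        return {"n_storms": 0, "sum_speed": 0,
                "mean_speed": None, "max_speed": None}
    return {
        "n_storms": len(wind_speeds),
        "sum_speed": sum(wind_speeds),
        "mean_speed": mean(wind_speeds),
        "max_speed": max(wind_speeds),
    }

# Hypothetical wind speeds for three storms, for illustration only.
example = storm_summary([60, 80, 100])
```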
4. Threat of Drought
Another major source of risk of resource failure is drought. We controlled for this
using weather data from Weatherbase (2009) and the National Weather Service for
Hawaii (2009), including (a) number of rainy days per year, (b) total annual rainfall,
(c) mean annual rainfall, and (d) standard deviation in rainfall per year. None of these
measures has an effect on a group’s total number of tools, or on their average tool
complexity (in techno-units).
5. Importance of Fishing
In order to measure the importance of fishing’s contribution to the subsistence of
each of the groups in our sample, we obtained data from Ember (2008) on the
subsistence “types” of each group. These data were themselves compiled by Ember
for use in the Human Relations Area Files database. Some of these sources specified
the contribution of fishing in the group’s diet by percent, rounded to the nearest ten;
others provided rough estimates. As a result, we were unable to obtain interval data
on the importance of fishing. Instead, we used the best available data for all groups
in order to rank the groups in order of how much fishing contributed to their
subsistence. We allowed for ties, so that groups with equal percentages of their
subsistence coming from fishing received the same score. See the table below with
the data provided by Ember, and our conversion of partial interval to ordinal data.
Table S3. Data on the importance of fishing for subsistence of groups in our sample. Data are in
two forms: raw percentages obtained from Ember (2008), and a conversion of those
percentages into ordinal data, with 1 being least important and 7 being most.

Culture          % Fishing (Ember)     Fishing Rank
Malekula         Min: 56, Max: 85      6
Tikopia          20                    2
Santa Cruz       Min: 56, Max: 85      6
Yap              40                    4
Fiji             50                    5
Trobriand Isl.   10                    1
Chuuk            40                    4
Manus            90                    7
Tonga            30                    3
Hawaii           40                    4
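The conversion from percentages to ranks with ties can be illustrated with a short Python sketch. The function name is hypothetical; the percentages are those listed in Table S3, with the ranges for Malekula and Santa Cruz represented by either their lower or upper bound (an assumption made here only to show that the resulting ranks do not depend on that choice).

```python
def dense_rank(values):
    """Rank values from 1 (smallest) upward; ties share a rank, with no gaps."""
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]

cultures = ["Malekula", "Tikopia", "Santa Cruz", "Yap", "Fiji",
            "Trobriand Isl.", "Chuuk", "Manus", "Tonga", "Hawaii"]

# Percent contribution of fishing, using the low bound (56) for the two ranges.
pct_low = [56, 20, 56, 40, 50, 10, 40, 90, 30, 40]
# The same data using the high bound (85) for the two ranges.
pct_high = [85, 20, 85, 40, 50, 10, 40, 90, 30, 40]

rank_low = dict(zip(cultures, dense_rank(pct_low)))
rank_high = dict(zip(cultures, dense_rank(pct_high)))
```

Because both 56 and 85 fall between Fiji (50) and Manus (90), either bound places Malekula and Santa Cruz at rank 6, and the three groups at 40 percent share rank 4.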
We analyzed the data on reliance on fishing for subsistence in three forms. First, we
converted Ember’s percentage data into ordinal data, which we call fishing rank. We
present these analyses in the paper. To test the robustness of these results, we also
analyzed the data in two other forms: one using the lowest estimates for fishing
subsistence for Malekula and Santa Cruz, the other using the highest estimates. The
results support our conclusions using the ordinal data—fishing importance does not
predict toolkit breadth or complexity.
Table S4. The first two rows give the standardized regression coefficients and significance values
for regressions in which the dependent variable is the logarithm of the total number of tools and
the independent variable is percent contribution of fishing toward subsistence, (1) using the
low-end estimates for Malekula and Santa Cruz, and (2) the high-end estimates for both groups.
The last two pairs of rows combine each of those independent variables in turn with the logarithm of
population size, for a multiple regression. The coefficients for both measures of importance of
fishing are low and in the opposite of the predicted direction, and the effects are not significant
according to asymptotic and bootstrap significance values. The coefficients for population size
are large and mostly significant, a result supported by bootstrap values as well.

IV                      ß        Sig      BS sig   R2
% fishing (low est.)    -0.065   0.859    0.898    0.0042

% fishing (high est.)   -0.298   0.402    0.499    0.0890

Population              0.806    0.008    0.012    0.6546
% fishing (low)         -0.078   0.737    0.833

Population              0.777    0.010    0.190    0.6699
% fishing (high)        -0.149   0.523    0.677
Table S5. The first two rows give the standardized regression coefficients and significance values
for regressions in which the dependent variable is the logarithm of the mean number of
techno-units per tool and the independent variable is percent contribution of fishing toward
subsistence, using (1) the low-end estimates for Malekula and Santa Cruz, and (2) the high-end
estimates for both groups. The last two pairs of rows combine each of those independent variables in
turn with the logarithm of population size, for a multiple regression. The coefficients for both
measures of importance of fishing are low, and the effects are not significant according to
asymptotic and bootstrap significance values. The coefficients for population size are large and
significant, a result mostly supported by bootstrap values.

IV                      ß         Sig      BS sig   R2
% fishing (low est.)    0.260     0.468    0.587    0.0677

% fishing (high est.)   -0.0593   0.871    0.886    0.0035

Population              0.702     0.026    0.049    0.5610
% fishing (low)         0.249     0.354    0.498

Population              0.722     0.032    0.101    0.5052
% fishing (high)        0.080     0.777    0.823
Checking the Robustness of Results
1. Bootstrap Resampling Analyses
We checked the robustness of our regression analyses for our original sample using
a bootstrap resampling to calculate significance for the regression coefficients.
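To make the idea of a bootstrap significance value concrete, here is a minimal sketch of one common approach: resampling societies with replacement and asking how often the resampled coefficient falls on the far side of zero. This is an assumption-laden illustration, not a description of the exact procedure or software used for the tables in this supplement; the function names and the example data are hypothetical. For a regression with one independent variable, the standardized coefficient equals the Pearson correlation, which is what the sketch computes.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation; returns None if either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def bootstrap_p(x, y, n_boot=2000, seed=0):
    """Two-sided percentile-style bootstrap p-value for the standardized slope."""
    rng = random.Random(seed)
    n = len(x)
    observed = pearson_r(x, y)
    draws = []
    while len(draws) < n_boot:
        # Resample cases (societies) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples with no variance
            draws.append(r)
    frac_le = sum(r <= 0 for r in draws) / n_boot
    frac_ge = sum(r >= 0 for r in draws) / n_boot
    return observed, min(1.0, 2 * min(frac_le, frac_ge))

# Hypothetical, strongly related data for ten cases (illustration only).
x = list(range(1, 11))
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
beta, p_boot = bootstrap_p(x, y)
```

With only ten societies, resampling-based significance values are a useful check on asymptotic p-values, which rest on distributional assumptions that are hard to verify in small samples.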
Table S6. Each row gives the standardized regression coefficient and significance values for a
regression in which the dependent variable is the logarithm of number of tool types and the
independent variable is the one listed. The coefficients for the control variables are smaller than
that for population size and none are close to significant, equally for asymptotic and bootstrap
values. The AICc value for a regression with only the constant is 0.63.

Independent Variable                      ß        Significance   Bootstrap Significance   R2      AICc    AICc weights
Population                                0.805    0.005          0.001                    0.649   -2.41   .03997
Mean rainfall/yr.                         -0.474   0.166          0.119                    0.225   -1.62   .02691
Publications                              0.464    0.176          0.213                    0.216   -1.61   .02677
Standard Dev Rain/yr.                     -0.442   0.201          0.325                    0.195   -1.58   .02641
Sum of max wind speeds for all cyclones   0.360    0.306          0.588                    0.130   -1.51   .02541
Effective Temperature                     -0.344   0.331          0.315                    0.118   -1.49   .02523
Latitude                                  0.270    0.450          0.450                    0.073   -1.44   .02461
Total cyclones                            0.241    0.502          0.610                    0.058   -1.43   .02441
Fish Genera                               0.192    0.594          0.713                    0.037   -1.40   .02415
Mean rainy days/yr.                       -0.153   0.673          0.777                    0.023   -1.39   .02398
Mean maximum cyclone wind speed           -0.128   0.724          0.724                    0.017   -1.38   .02389
Importance of fishing                     -0.104   0.773          0.824                    0.011   -1.38   .02383
Table S7. Each pair of rows gives the standardized regression coefficients and significance values for a
multiple regression in which the dependent variable is the logarithm of number of tool types
and the independent variables are the logarithm of population size and one of the
alternative variables. The coefficients for population size are large and mostly significant, while
the coefficients for the control variables are smaller and none are close to significant, equally for
asymptotic or bootstrap values. The AICc value for a regression with only the constant is -2.91.

Independent Variable                   ß        p       Bootstrap Significance   R2      AICc    AICc weight
Population                             10.045   0.002   0.015                    0.762   -3.39   .06513
Fish Genera                            -0.413   0.110   0.383

Population                             0.988    0.017   0.028                    0.677   -3.08   .05592
Mean rainfall/yr.                      0.249    0.455   0.528

Population                             0.824    0.006   0.020                    0.691   -3.12   .05711
Mean max cyclone wind speed            -0.206   0.361   0.409

Population                             0.733    0.014   0.015                    0.685   -3.11   .05663
Publications                           0.205    0.396   0.550

Population                             0.724    0.069   0.012                    0.652   -3.01   .05381
Importance of fishing                  0.008    0.979   0.873

Population                             0.871    0.007   0.012                    0.674   -3.07   .05570
Mean # rainy days/yr.                  0.175    0.476   0.694

Population                             0.918    0.012   0.020                    0.671   -3.06   .05539
Sum max wind speeds for all cyclones   -0.188   0.510   0.793

Population                             0.858    0.010   0.024                    0.661   -3.03   .05454
Total # cyclones                       -0.123   0.628   0.786

Population                             0.783    0.010   0.130                    0.661   -3.03   .05451
Latitude                               0.635    0.112   0.844

Population                             0.792    0.006   0.054                    0.716   -3.21   .05956
Contact                                0.259    0.239   0.376

Population                             0.844    0.019   0.142                    0.651   -3.00   .05378
SD Rainfall per Year                   0.065    0.822   0.881

Population                             0.819    0.014   0.023                    0.649   -3.00   .05362
Effective Temperature                  0.030    0.908   0.935
Table S8. Each row gives the standardized regression coefficient and significance values for a
regression in which the dependent variable is the logarithm of average number of
techno-units per tool and the independent variable is the one listed. The coefficient for population
size is large and significant, while the coefficients for the control variables are smaller and none
are significant, equally for asymptotic or bootstrap values. The AICc value for a regression with
only the constant is -2.91.

Independent Variable                      ß        p       Bootstrap Significance   R2      AICc    AICc weight
Population                                0.706    0.022   0.018                    0.499   -3.60   .03827
Standard Dev Rain/yr.                     -0.629   0.051   0.124                    0.396   -3.42   .03486
Fish Genera                               0.495    0.146   0.200                    0.245   -3.19   .03116
Mean rainfall/yr.                         -0.390   0.265   0.170                    0.152   -3.08   .02942
Mean rainy days/yr.                       -0.348   0.324   0.340                    0.121   -3.04   .02890
Sum of max wind speeds for all cyclones   0.292    0.413   0.610                    0.085   -3.00   .02831
Total cyclones                            0.202    0.576   0.678                    0.041   -2.95   .02765
Effective Temperature                     -0.163   0.652   0.655                    0.027   -2.94   .02745
Mean maximum cyclone wind speed           -0.136   0.708   0.745                    0.019   -2.93   .02734
Importance of fishing                     0.084    0.818   0.833                    0.007   -2.92   .02717
Latitude                                  0.022    0.952   0.961                    0.001   -2.91   .02709
Publications                              0.212    0.557   0.599                    0.045   -1.65   .01439
Table S9. Each pair of rows gives the standardized regression coefficients and significance values for a
multiple regression in which the dependent variable is the logarithm of average number of
techno-units per tool and the independent variables are the logarithm of population size and
one of the alternative variables. The coefficients for population size are large and mostly
significant, while the coefficients for the control variables are smaller and none are close to
significant. Significance values based on bootstrap analysis are larger, but show a similar
pattern.

Independent Variable                   ß        p       Bootstrap Significance   R2      AICc    AICc weight
Population                             0.722    0.039   0.136                    0.500   -4.19   .05137
Publications                           -0.044   0.883   0.920

Population                             0.632    0.093   0.207                    0.510   -4.21   .05186
# Fish Genera                          0.128    0.705   0.831

Population                             0.798    0.029   0.045                    0.531   -4.25   .05301
Effective Temperature                  0.201    0.511   0.567

Population                             0.514    0.143   0.305                    0.565   -4.33   .05504
SD Rainfall per Year                   -0.321   0.337   0.557

Population                             0.732    0.030   0.279                    0.515   -4.22   .05209
Latitude                               -0.127   0.652   0.845

Population                             0.727    0.026   0.027                    0.541   -4.27   .05355
Mean max cyclone wind speed            -0.205   0.453   0.518

Population                             0.757    0.036   0.100                    0.511   -4.21   .05189
Total # cyclones                       -0.120   0.694   0.790

Population                             0.828    0.038   0.104                    0.526   -4.24   .05270
Sum max wind speeds for all cyclones   -0.203   0.551   0.791

Population                             0.670    0.052   0.086                    0.507   -4.20   .05171
Mean # Rainy days per Year             -0.096   0.747   0.840

Population                             0.907    0.048   0.188                    0.534   -4.26   .05317
Mean rainfall per Year                 0.274    0.494   0.661

Population                             0.715    0.030   0.258                    0.520   -4.23   .05238
Contact                                -0.144   0.600   0.746

Population                             0.702    0.033   0.049                    0.513   -4.22   .05215
Importance of fishing                  -0.103   0.710   0.784
2. Coding Reliability
In order to check the accuracy of our data, we obtained second measures of
number of tools for half of our sample. We did this by using Oswalt’s (1976) data on
the Chuuk and by having research assistants (RAs) recode four randomly selected groups in our
sample. RAs were given very little training, and were told the same rules of thumb
for deciding what constitutes a “tool” that are provided in the paper (see section on
techno-unit coding above). We did this only for number of tools and not
techno-units because our criteria for what “counts” as a techno-unit intentionally differ
from Oswalt’s (1976), and because of the time-intensive nature of training to code
and actually coding techno-units.
Table S10. Comparison of first and second coder ratings for half of our sample (n=5). *Data for the
Chuuk are taken from Oswalt (1976, Trukese), counting only marine foraging tools.

Culture      # Tools (1st coder)   # Tools (2nd coder)
Chuuk        40                    *33
Tikopia      22                    29
Manus        28                    30
Trobriands   19                    23
Santa Cruz   24                    24
Table S11. Rows give the standardized regression coefficients and significance values for
regressions in which the dependent variable is the logarithm of number of tool types
and the independent variables are the logarithm of population size and contact, coded as a
dummy variable. The results here replicate our main findings using our first-coder ratings,
suggesting that coder accuracy is not influencing our results.

Independent variable   ß      Significance   Bootstrap Significance   R2
Population             .794   0.006          0.005                    0.6303

Population             .782   0.008          0.077                    0.6795
Contact                .222   0.335          0.475
3. Sample Representativeness
We do not mix data sources in our primary dataset, because of differences between
our coding scheme and Oswalt’s. However, here we reanalyze our dataset and
include two groups from Oswalt’s study (the Tiwi and the Pukapuka). These groups
were not available on electronic HRAF at the time of the study. Using Oswalt’s data,
we counted the number of tools for marine foraging: 30 for the Pukapuka and 4 for
the Tiwi. We reanalyzed our original data after adding these two groups (n=12). The
results from this altered dataset support our initial findings.
Table S12. Rows give the standardized regression coefficients and significance values for
regressions in which the dependent variable is the logarithm of number of tool types
and the independent variables are the logarithm of population size and contact, coded as a
dummy variable. The results here replicate our main findings using the original data set plus
two groups from Oswalt (1976), suggesting that our sample is robust.

Independent variable   ß       Significance   Bootstrap Significance   R2
Population             0.600   0.039          0.045                    0.3598

Population             0.543   0.066          0.261                    0.4218
Contact                0.255   0.352          0.422
References
Australian Severe Weather. 2009. South Pacific Tropical Cyclones:JWTC Data. 1 Dec
2009. http://www.australiasevereweather.com/cyclones/index.html
Ember, C. 2008. Classification of the Major Forms of Subsistence (for the cultures in
eHRAF World Cultures). 15 January 2010. www.yale.edu/hraf.
National Weather Service. 2009. NWS: Severe Weather. 1 Dec 2009.
http://www.nws.noaa.gov/
Weatherbase. 2009. Weatherbase: Oceania. 1 Dec 2009. http://weatherbase.com/.
World Cultures Ethnography Database. Human Relations Area Files, Inc.;
http://ehrafWorldCultures.yale.edu; 2008.