https://global.oup.com/academic/authors/bookproposals/?cc=gb&lang=en
6.9 x 4.4 inches
Comparative Social Science: A Very Short Introduction
Cross-Cultural Research is the vehicle whereby Comparative Social Science interrogates the texts of historians, economists and ethnographers through comparable categories. These are coded into variables that can be used for testing models of how evolutionary processes transformed human societies. Such codes may also be used to test hypotheses about functional relationships among variables and to trace both the diffusion of peoples and the diffusion of variables across societies, i.e., the processes of human histories and evolution. These problems are not so easy to decipher. Postmodernists do not believe in “comparable categories,” but scientists do, and define measures accordingly. Humans employ categories that may be subjective and need to be decoded in context (contexts of observable behavior, conversation, signs and symbols, …) and then assigned to measures that may or may not be comparable. Not all categories are equally viable: some may be highly inferential, others reliable in the contexts of behavior or of culture as places with sets of ideas having a high degree of commonality, agreement and interlocked functional activity, like that of getting recurrent jobs done in contexts of interaction. This is why ethnographers tend to stay within or compare cultural contexts for long periods of time and benefit from histories of these contexts, and why historians or historical economists tend to gather a great deal of evidence and observations before reaching conclusions. Efforts to assemble samples of well-described (and often repeatedly studied) cultural contexts that could be coded for a large number of variables did not come easily in cross-cultural research.
From the 1880s to the 1960s comparativists tended to select their own samples of societies or contexts to study, on the assumption that many such studies would cohere into shared disciplinary knowledge. Having collected, during my time as a graduate student, all such studies that had been converted to the punched cards of that era, I can testify empirically that convergence did not tend to occur. Testable hypotheses did coalesce within specific disciplines, and these could be tested with correlations and significance tests, but anthropology had another problem: these statistics simply did not work when historically similar local societies closely abutted one another, with constellations of similar or dissimilar societies spread in much more complicated ways than in random samples of persons in a large population. Cultures were much more deeply entwined or separated, depending on their larger histories: those of mass migration, colonialism, and fundamentally different types of economy or transport. The construction of samples that were relatively complete, like extant hunter-gatherers studied by ethnographers (Binford 2001), and coded for the full compass of variables (some 506 variables in this case for 399 societies, plus a great many ecological, animal species and climate variables) began only in the 1960s. Jorgenson (1985) did the same for Western American Indians (172 societies, 496 variables). Murdock (1967) coded only 167 variables for 1257 societies, but Murdock and White (1969) began a different approach. They took 186 maximally different societies, each the earliest best-described in its cultural milieu, and pinpointed each precisely in time and space with the applicable published ethnographic texts. The idea was that cross-culturalists could develop and contribute their own codes on different topics, and biogeographers could contribute their data for the specific eco-contexts of each society and its surrounding regions.
Their database, the SCCS or Standard Cross-Cultural Sample, now hosts 2109 variables contributed by over 100 cross-cultural and other researchers. Some contributors to the open-access SCCS, a powerhouse for cross-cultural research to which researchers append their own codes, coded only a 25% random sample of cases, others 50% or 78%, etc. New forms of analysis, however, compensate for missing data by taking the set of societies coded for the dependent variable as the focus of study and, for those societies, imputing missing data for all the other variables that the investigator wishes to include among the hypothesized independent variables. At this level of organization it might seem that cross-cultural research could establish a scientific foundation. Statistician Sir Francis Galton, however, had noted a major flaw in the very first presentation of a cross-cultural study in 1888-89, a flaw that had never been repaired. This was the fact that societies are intertwined through cultural borrowing, expansion of populations from the same language groups, and shared environments, so that they cannot be treated as independent cases. The same might also be true for ordinary survey questionnaires and for medical research on populations of patients. The first full treatment of the problem was concerned with how to separate, for every set of dependent and independent variables, the common or independent evolutionary histories and processes among societies from correlations among variables that are often completely misleading. Correlation is not causation. Eff and Dow (2009) were concerned with how multiple evolutionary sources could be revealed so as to bring a solution to Galton’s problem to fruition. They thought of the matrix of weighted distances between societies as a measure of the potential routes of diffusion, and the network matrix of linguistic trees as a weighted measure of the potential routes of cultural heritage.
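To make the idea of distance and language weight matrices concrete, here is a minimal sketch, not Eff and Dow's own code. It assumes planar coordinates in kilometres (real work would use great-circle distances), an inverse-distance decay, and language proximity counted as the number of leading language-tree levels two societies share. All function names and the four toy societies are hypothetical. Each matrix is normalized to row sums of one, and the product W·y gives, for each society, a weighted average of its neighbours' values on a variable y.

```python
import numpy as np

def row_normalize(M):
    """Zero the diagonal and scale each row to sum to one (all-zero rows stay zero)."""
    M = np.array(M, dtype=float)
    np.fill_diagonal(M, 0.0)
    sums = M.sum(axis=1, keepdims=True)
    return np.divide(M, sums, out=np.zeros_like(M), where=sums > 0)

def distance_weights(coords_km):
    """Inverse-distance proximity between societies (an assumed decay form)."""
    coords = np.asarray(coords_km, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    prox = np.zeros_like(dist)
    mask = dist > 0
    prox[mask] = 1.0 / dist[mask]
    return row_normalize(prox)

def language_weights(lineages):
    """Proximity = number of leading language-tree levels two societies share."""
    n = len(lineages)
    prox = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = 0
            for a, b in zip(lineages[i], lineages[j]):
                if a != b:
                    break
                shared += 1
            prox[i, j] = shared
    return row_normalize(prox)

# Hypothetical toy data: four societies, planar coordinates in km,
# and a two-level language classification (family, branch).
coords = [(0, 0), (0, 100), (0, 300), (400, 0)]
lineages = [("A", "A1"), ("A", "A1"), ("A", "A2"), ("B", "B1")]
y = np.array([1.0, 2.0, 3.0, 4.0])  # some coded variable

W_dist = distance_weights(coords)    # potential routes of diffusion
W_lang = language_weights(lineages)  # potential routes of cultural heritage

# Network-lag terms: each society's weighted average of the others' y values.
Wy_dist = W_dist @ y
Wy_lang = W_lang @ y
```

In this toy case the fourth society has no linguistic relatives in the sample, so its row in `W_lang` is all zeros and its language lag term is zero. How to treat such isolates is a modelling choice that the sketch does not settle.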
Indeed, these W matrices, and their sums with weighted squares (W + W²), when normalized to row sums of one, might be put to good use in modeling evolutionary effects. Part of their innovation was to consider multiplying every term in a regression equation, both the dependent and the independent variables, by the sum of these weights, and then taking the calculated value of the Wy term in this estimate as the total measure of evolutionary effects. Adding this Wy term to the original regression equation then makes it possible to test whether the independent variables are truly independent of the error term (“exogenous”), in which case evolutionary effects have been separated from truly independent effects on the dependent variable. First of all, however, Eff and Dow impute missing data, because the categories used in cross-cultural studies cannot always be coded. Imputation proves to be important in examining causal effects within networks of variables. Eff and Dow’s example examined independent variables that were predictive of each society’s evaluation of the value of children, but this code was rather vague, and the independent variables had very little predictive power, with R² = 0.10. When the focus shifts to society’s evaluation of the value of girls, R² = 0.16, as shown in Table 1, and is produced by women’s work and residence with the wife’s kin in the first years of marriage.
Table 1: Women’s work and kin as predictors of Value of Girls: Subsistence Contribution, Fishing, Cultivation with Rain
The focus of Eff and Dow’s initial model was
Imputed datasets
The XC Bnlearn