Multilevel networks and world ethnography
Doug White and UC team
UCI human complexity seminar, 1:30 Fri, Sept 24, 2010
UCI grads and undergrads can sign up for 1.33 credits: SOC SCI 240A SEM A (72100)

UCI_imbs+UCSD_econ+UCLA_cs project team
Scott D. White, UCI, One Spot
B. Tolga Oztan, MBS
Ren Feng, Xi’an Jiaotong Univ.
Karim Chalak, BU, Econ
Halbert White, UCSD
Doug White, MBS
Assist from Judea Pearl, UCLA
Tony Eff, MTSU, Econ

Comparing ethnographic data
• Early best-described ethnographies give our best chance of understanding the evolution of societies and cultures.
• We now have large samples, N=186, 390, 1400, ….
• And many variables, V=2000+ (SCCS), 500+ (foragers), 150 (EthnoAtlas), ….
• The problem is that correlations of variables have no meaning.
• Historical interactions have huge effects: splitting, branching, merging, borrowing, migrating, colonizing, conquering.
• We still have, among these ethnographic cases, living societies like the Hadza with the genetic stock of humanity’s common ancestors of 150,000 years ago, or the San with the next split of 120,000 years ago, etc.
• It’s not a matter of splitting the historical network interaction and regional similarities from functional and causal relations. (Harold Driver’s struggle with Kroeber)
• It’s that the statistics for doing so have been too weak to support these inferences, and the concept of inference has been too weak.

Inference, Statistics, Causality
• Statistical inference has to do with replicability, or change, given changing conditions. In survey data, it’s hard to get replication because of changing
  – Composition of the sample: location, composition
  – Peer effects and interactions within the sample
  – Time period
• So results will vary with different (e.g., cross-cultural) samples.
• Looking for invariance is like Norm Schofield’s question: “What are the causal variables for how people vote” other than the names and parties of the candidates?
I.e., no proper nouns in the variables for causality.
• When there are peer effects operating in the sample, significance tests are exaggerated (type 1 error). Cross-cultural studies are full of type 1 error.
• Random sampling (Ember & Bernard solution) does not solve this problem, although means and correlations may be estimated correctly.
• But correlations will vary widely with changes in the sample or with time. Replication is thwarted by finding differences between replications when none exist (type 2 error).
• A stronger concept of inference would deal with causation, not correlation.
• This began with structural equation models (SEM; Sewall Wright 1921, 1935) and continues with Judea Pearl et al.’s extensions of graphical methods.

Where we are now: *Rccs*
• New program package *Rccs* takes into account network effect matrices (distance, language, etc.)
• Computes regression coefficients
• R code for 2SLS implemented for classroom use
• (see pdfs attached to intersci wiki talk)
• Computes causal graph models (in development)
• E.g., computes total effects (adding indirect paths)
• Chalak and H. White extension: reciprocal effects x-->y plus y-->x-->y as including the indirect path

Solving Galton’s problem with 2SLS: 2-Stage OLS
2-stage ordinary least squares regression with peer effects
• Stage 1 OLS: calculate the “Instruments”
• Stage 2 OLS: include the “Instruments”
[Diagram: peer effects (instruments I, IW) and independent variables X1, X2, X3, with dependent variable Y]

Causal graph, Pearl’s regression method
“Say for three variables you are trying to estimate the direct effect c of X on Z given an indirect effect of Y.
The causal diagram model gives you a license to do it by the regression method, where, for example (for reference on the pdf)

\[ c = \frac{E(y \mid x, z) - E(y \mid x', z)}{x - x'} \qquad (1) \]

[Path diagram: X → Y (coefficient a), Y → Z (coefficient b), X → Z (coefficient c)]

Controlling for the change from x to x´, E(y|x, z) and E(y|x´, z) are the changes in variable Z due to unit changes in X controlling for Y.” (email from Pearl; see Pearl 2000:151, 368; Chalak and White 2010).
Because the x,z in (y|x,z) is a joint distribution, eqn (1) means that x→x´ changes y, which through the x-y-z path, considered as a joint distribution, changes z. From this it follows, given the single door criterion (Pearl 2000:150), that c + a·b = r_{xy.z}, the coefficient for the total effect of X on Z.

Comparing causality in ethnographic data
• R package *Rsccs* takes into account any number of peer effects as Instruments in the previous equations that allow further causal analysis
• Regressions change with time periods
• Correlate total effects to X → Y (time lagged correlations)
• Regressions that yield results for causality can be identified in Pearl’s (2000) single door, back door, and front door causal graph criteria. Some graphical structures may require that some potential confounder variables be blocked, and that direct, indirect and total effects be computed from regression coefficients without those confounders.

Language families as an Instrument for measuring peer effects
Multilevel effect: language tree

Spatial distances as Instrument for measuring peer effects
Standard Cross-Cultural Sample (SCCS Wikipedia maps by Tony Eff); Afro-Eurasia drawn to a slightly smaller scale
Multilevel effect: spatial

World system peer effects (of exchange) as Instruments
Folded image: Core, Semi-periphery1, SP2, P1-2
[Figure labels: Core, Semi-Peri1, Semi-Peri2, Periphery1, Periphery2]
Multilevel effects: world system

Multilevel effects
A structurally endogamous kinship network core of a Turkish nomad clan (White and Johansen 2005: 379; 76-79).
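The relation c + a·b for the total effect of X on Z can be checked numerically. The following is a minimal sketch, assuming a linear model with the three paths X → Y (a), Y → Z (b) and X → Z (c) and no other confounders. The function names and the data values are invented for illustration and are not taken from the SCCS or from *Rccs*. With ordinary least squares the identity holds exactly in any sample: the slope of Z on X alone equals the direct slope plus the product of the two indirect slopes.

```python
def cov(u, v):
    """Population covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / len(u)


def path_coefficients(x, y, z):
    """OLS slopes for the assumed diagram X -> Y (a), Y -> Z (b), X -> Z (c)."""
    sxx, syy = cov(x, x), cov(y, y)
    sxy, sxz, syz = cov(x, y), cov(x, z), cov(y, z)
    det = sxx * syy - sxy ** 2
    a = sxy / sxx                      # Y regressed on X
    b = (syz * sxx - sxz * sxy) / det  # Z on Y, holding X fixed
    c = (sxz * syy - syz * sxy) / det  # Z on X, holding Y fixed (direct effect)
    total = sxz / sxx                  # Z regressed on X alone (total effect)
    return a, b, c, total


# Invented illustrative values, not SCCS data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.6, 8.2]
z = [2.1, 2.8, 4.9, 5.2, 7.4, 8.1, 9.0, 11.3]

a, b, c, total = path_coefficients(x, y, z)
print(f"a={a:.3f} b={b:.3f} c={c:.3f} c+a*b={c + a * b:.3f} total={total:.3f}")
```

The two-regressor slopes are written in closed form from covariances to keep the sketch free of dependencies; a regression routine would give the same numbers.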
Internal networks
Up and down effects

13 linked regressions out of 2000+ SCCS variables
http://eclectic.ss.uci.edu/~drwhite/courses/SCCCodes.htm
Nodes are variables in regression analyses of variables from the Standard Cross-Cultural Sample of 186 societies (SCCS). Lines represent independent variables. They point down to 13 dependent variables in successive colored layers. Black lines are positive effects, red lines negative effects from regression results. Colors of nodes for variables show depth in a causal hierarchy with net effects estimated as causal graphs (Pearl 2000). At level 4 the Evil eye dependent variable has a triangular relationship with money and milked domestic animals. The regressions control for peer effects of spatial transmission (distance) and cultural transmission (language phylogeny), incorporated as Instrumental Variables in a second-stage regression, with the IVs estimated in a first-stage regression. Node sizes reflect the significance of spatial transmission peer effects. Language effects are sometimes negative.

Paired visual comparison of spatial distributions
v1189 Belief in evil eye
v238 Moral gods==4
238. HIGH GODS
   18  . = Missing data
   68  1 = Absent or not reported
   47  2 = Present but not active in human affairs
   13  3 = Present and active in human affairs but not supportive of human morality
   40  4 = Present, active, and specifically supportive of human morality
NOTE the circum-Mediterranean overlap with Evil eye (previous slide)

Paired visual comparison of spatial distributions
v1189 Belief in evil eye (dichotomy): large nodes red, small nodes orange
v155 True money==5
155. SCALE 7- MONEY (here, an independent variable)
   77  1 = None
   14  2 = Domestically usable articles
   43  3 = Alien currency
   27  4 = Elementary forms
   25  5 = True money
NOTE the circum-Mediterranean overlap with Evil eye (previous slide)

Paired visual comparison of spatial distributions
v1189 Belief in evil eye
v272 Caste stratification
272. CASTE STRATIFICATION (ENDOGAMY) (two cases have secondary castes)
    5  . = Missing data
 (154) 0 = (Omitted from map) Absent or insignificant
   17  1 = Despised occupational group(s)
    3  2 = Ethnic stratification
    7  3 = Complex
NOTE the circum-Mediterranean overlap with Evil eye (previous slide)

Paired visual comparison of spatial distributions
v1189 Belief in evil eye
v245 Milked animals
NOTE the circum-Mediterranean overlap with Evil eye (previous slide)

v1189 Belief in evil eye
R2=0.513; N=186; 10 imputations; standard errors and R2 adjusted for two-stage least squares. Language nonsignificant (p > .33). No effect of Islam or Christianity.

v1189 Belief in evil eye
• Some nonlinear relationships
• No additional variables
• Error terms homoskedastic
• Error terms not normally distributed
• No cultural lag in error terms
• No spatial lag in error terms

R2=0.490; N=186; 10 imputations; standard errors and R2 adjusted for two-stage least squares. Distance (p < .00002) & language significant (p < .003).

v155 Money
• No nonlinear relationships
• Some additional variables
• Error terms homoskedastic
• Error terms normally distributed
• No cultural lag in error terms
• No spatial lag in error terms

v155 Money
R2=0.504; N=186; 10 imputations; standard errors and R2 adjusted for two-stage least squares. Distance significant (p < .00001) & language insignificant (p > .15).
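The two-stage procedure behind these regressions (peer effects estimated as instruments in a first stage, then included in the second stage) can be sketched as follows. This is a minimal illustration in Python, not the R code of the package. It assumes a single row-normalized proximity matrix W, one exogenous variable x, and the peer average of x (W·x) as the only instrument for the peer average of y (W·y). The names `ols`, `fitted`, `peer_average`, `two_stage_peer_regression` and `simulate` are hypothetical, and the data are simulated rather than drawn from the SCCS.

```python
import random


def ols(columns, y):
    """OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    A = [[sum(r[p] * r[q] for r in rows) for q in range(k)]
         + [sum(r[p] * yi for r, yi in zip(rows, y))] for p in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [u - factor * v for u, v in zip(A[r], A[p])]
    return [A[p][k] / A[p][p] for p in range(k)]


def fitted(coefs, columns):
    """Fitted values for an intercept-first coefficient list."""
    return [coefs[0] + sum(b * col[i] for b, col in zip(coefs[1:], columns))
            for i in range(len(columns[0]))]


def peer_average(W, v):
    """W v: each society's weighted average of the other societies' values."""
    return [sum(w * vj for w, vj in zip(row, v)) for row in W]


def two_stage_peer_regression(y, x, W):
    Wy = peer_average(W, y)  # endogenous peer-effect term
    Wx = peer_average(W, x)  # assumed instrument: peers' values of x
    # Stage 1: predict the peer-effect term from exogenous information only.
    Wy_hat = fitted(ols([x, Wx], Wy), [x, Wx])
    # Stage 2: regress y on x and the instrumented peer effect.
    intercept, beta, rho = ols([x, Wy_hat], y)
    return {"intercept": intercept, "beta": beta, "rho": rho}


def simulate(n=60, rho=0.5, beta=1.0, noise_sd=0.5, seed=1):
    """Hypothetical data: societies on a line, nearer ones weighted more."""
    rng = random.Random(seed)
    pos = [rng.uniform(0, 10) for _ in range(n)]
    x = [rng.gauss(0, 1) for _ in range(n)]
    eps = [rng.gauss(0, noise_sd) for _ in range(n)]
    W = []
    for i in range(n):
        row = [0.0 if i == j else 1.0 / (1.0 + abs(pos[i] - pos[j]))
               for j in range(n)]
        total = sum(row)
        W.append([w / total for w in row])  # rows sum to 1, zero diagonal
    # Solve y = rho*W*y + beta*x + eps by fixed-point iteration (|rho| < 1).
    y = [0.0] * n
    for _ in range(200):
        y = [rho * wy + beta * xi + e
             for wy, xi, e in zip(peer_average(W, y), x, eps)]
    return y, x, W


y, x, W = simulate()
print(two_stage_peer_regression(y, x, W))
```

The sketch returns point estimates only. Second-stage standard errors and R2 from a plain OLS run are not valid for 2SLS and need the adjustment noted in the captions above. Several peer-effect matrices (distance, language) could be handled the same way by adding one instrumented term per matrix.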
v238 Moral gods
• No nonlinear relationships
• No additional variables
• Error terms approximately homoskedastic
• Error terms not normally distributed
• No cultural lag in error terms
• No spatial lag in error terms

v238 Moral gods

Transmission effects (Galton’s problem): Spatial and cultural

Peer effect                        Variable      coef     p-value
Spatial Transmission (Distance)    Money         .960     .0000009
Spatial Transmission (Distance)    Moral gods    .824     .0000014
Spatial Transmission (Distance)    Evil eye      .767     .000002
Cultural Transmission (Language)   Money        -.988     .002
Cultural Transmission (Language)   Moral gods   -.672     p > 0.14
Cultural Transmission (Language)   Evil eye     -.228     p > 0.36

Negative peer effects for language indicate that, for each of these dependent variables, there is a tendency, strong for Money and weak for the other two variables, NOT to be the result of cultural tradition but of innovation that differentiates the societies with Money, Moral gods and Evil eye from the norms in their respective language families. This tendency is nearly significant (p < 0.15) for societies with Moral gods.

Excluding peer effects: Causal graph with multiple triangular regression coefficients (numbers are the regression coefficients)
[Figure: causal graph. Nodes: A Milking animals (v245), B Money (v155), C Evil eye (v1188), D Moral gods (v238), E Caste stratification (Caststrat LGd). Edge coefficients shown: -0.393, 0.484, 0.102 (p<0.14), 0.294, 0.792, 0.430, 0.597, 0.664, 1.372]

Causal graph total effects and regression slopes

Independent   Dependent   Net effects = direct and indirect      = Total
variable      variable    causal graph effects                   effects
Money         Evil eye    0.597                                   0.597
Moral gods    Evil eye    0.294+(0.102*.597)                      0.355
Milking       Evil eye    0.664+(-.393*.597)+(.484*.104*.597)     0.744
Moral gods    Money       0.102                                   0.102
Milking       Money       -.393+(0.484*.102)                     -0.344

These causalities A-E-D-B-C are transitive, all significant or nearly so, and completely ordered, but the arrow from A to B is NEGATIVE.
A Milked domestic animals
E Caste stratification
D Moral gods (to money only p < .15)
B Money
C Evil eye

A 2-slide example for two time periods is next, if time allows (package *Rccs* applies to time series, includes multiplicative interactions as well)

Causal analysis: Transformation predictions from Indian Jajmani to market system
Data source: Martin Orans. 1968. Maximizing in Jajmaniland: A Model of Caste Relations. American Anthropologist 70(5): 875–897.
Correct time 2 predictions match causal inferences
Peer effect regression time 1 (Temporal predictions about changes are even stronger)
[Figure annotations: R2 = .672, R2 = .623, R2 = .747; P = .067, P = .055, P = .05, p = .067]

Causal graphs may incorporate multiplicative or interaction effects, which are used by Martin Orans in his 1968 article. These are diagrammed.
[Diagram labels: Jajmani system, Power concentration, Isolation, Ritual-secular correlation]
None of these models were significant, however, compared to the simple linear additive effects that we tested and found significant (Ren Feng, T. Oztan, D. White)

Further slides, if time allows, show different kinds of analysis than that of *Rccs*

Other kinds of cross-cultural data structures and analyses:
• Statistical Entailment Analyses: Society sets for variables tend to form chains of sets
• Galois duality lattice (Concept lattices): Society sets for variables tend to form chains of sets and intersections, and opposite ordering of sets of variables that tend to form chains of sets
[Diagram labels: VS1 VS2 VS3 VS4; A B C D]
• Intrasocietal network structure overlays on genealogy. For each society these will define new variables such as
  1) sidedness, reciprocal marriage to opposites
  2) structurally endogamous groups
  3) marriage-type census as against random simulation
  4) distribution of structural features over generations
• Multilevel analysis, e.g. regional or world system effects on local societies.

Fig. 3: An exact world entailment digraph for the sexual division of labor
[Figure labels: Late Task A, Early Task B; Female, Male]

Fig. 3: An exact world ethnographic lattice of kin avoidances has a four-dimensional partial ordering of distributions: 1) parents of Hu, Wi (opp/same sex, within circles), 2) siblings and siblings-in-law of Hu and Wife (opp/same sex, in parallelograms), 3) opposite sex siblings & parents’ siblings & parallel cousins (White 1995). Lower types of avoidances entail upper ones’ features in perfect inclusion relations, found by statistical entailment analysis (White 1999b). Of the 250 societies, names attached to each node show each subset of avoidance relations.

Table 1
Columns: Pajek | Repast Simulation | Cohesion | Peer Effects | ArcGIS.com | New Codes | New Ethnogr. Cases
80 KinSources [1]: X X X X X 3
400 foragers [2]: X X (Binford & Boehm)
85 World-system [3]: X X
1294 Atlas [4]: X X 0
186 SCCS [5]: X X 0
28 1945-1965 [6]: X X X 28 (SCCS)
30 Post 1965 [7]: X X X 30 [8] (eSCCS)
2 (country data)

[1] http://kinsource.net/kinsrc/bin/view/KinSources archives kinship network data contributed by anthropologists. Only three KS ethnographies remain for conversion from paper-based genealogies to e-networks for analysis with Pajek, but others will be added.
[2,5] Binford’s (2001) Constructing Frames of Reference forager database has been spreadsheeted by Boehm and Hill. Non-foragers from the SCCS will be analyzed separately. Extensive testing of “peer effects” methods has established their validity.
[3] Smith and White (1992) have postwar WS commodity flow time series in 5yr intervals; capital and migration flow will be added.
[4] Murdock’s Ethnographic Atlas (EA) in SPSS format has been supplemented by newly authored installments 30-31.
[5] Murdock and White’s (1969) Standard Cross-Cultural Sample dataset on 186 societies in SPSS and R formats has coded data contributions from 80+ different authors on 2008+ variables. Citations to SCCS are now 95+/year and growing.
[6] 109 missing codes for 28 SCCS variables 1006-1115 will be coded for 28 SCCS societies on the world-system impacts variables partially coded in White and Burton’s (1985-1988) NSF 8507685 funded research on “World-Systems and Ethnological Theory.”
[7] To bring the SCCS societies up to date for post-1965 societies, 30 well-described post-1965 ethnographic cases will be added to an (expanded) eSCCS and coded for EA variables and the CDC Cultural Diversity Codebook of 180 SCCS variables.
[8] Given that the SCC Sample was published in 1969, the eSCCS additions to the sample will bring it up to date temporally. This will allow study of world-system impacts on 37 well-described ethnographic cases in the contemporary post-war period.

A structurally endogamous kinship network core of a Turkish nomad clan (White and Johansen 2005: 379; 76-79).

Fig. 1.A. Gmap of Cultural Survival (2010) 100+ recent trouble spot study cases: Gmaps extend to networks at the global level, clicking into cases at the local level. Live: http://bit.ly/c1funC

Fig. 1.B. This Google map tracks cases of swine flu in 2009: types of cases are color coded, fatal cases have no dot, and clicking a region gives a more detailed map of cases within the region. Similarly, Wolf (1982) drills down at several hundred ethnographic data points to analyze how commodity exchange affected indigenous societies in the 1500-1980 period of overseas conquest and modern world-systems.
Interactive maps provide for drilling down from a network at one level (network spread of disease not shown here) by clicking a node to see a more detailed map or a network within that node. The upper-level nodes can be societies, with their organizations’ networks reached by clicking a given node.