Colonialism and Its Legacies: A Comprehensive Historical Dataset

A PROJECT FUNDED BY THE NATIONAL SCIENCE FOUNDATION (PROPOSALS 0648292 and 0647921)

Principal investigators:
John Gerring, Department of Political Science, Boston University
James Mahoney, Department of Sociology and Political Science, Northwestern University

Collaborators:
Paul Barclay, Department of History, Lafayette College
Neil Englehart, Department of Political Science, Bowling Green State University
Andrew Harris, Department of Government, Harvard University
Charles Kurzman, Department of Sociology, University of North Carolina
James Robinson, Department of Government, Harvard University
Nicolas van de Walle, Department of Government, Cornell University

The current era of globalization began in 1415 with the Portuguese conquest of Ceuta (off the coast of Morocco), the first enduring overseas European colony. In subsequent centuries, Anglo-Europeans and Japanese managed to control, at various times, virtually the entire inhabitable planet. Only a few nations escaped direct control of these colonial powers. None escaped their spheres of influence. It has become commonplace to observe that the colonial experience shaped the modern era in profound ways. Colonial policies and practices are widely blamed for the underdevelopment of the South, the absence of significant industrialization, ethnic strife, weak state capacity, authoritarian rule, weak national identity, diffuse and porous borders, hunger, illiteracy, and corruption. Interestingly, colonialism is also sometimes praised for furthering social, political, and economic development in the South. Indeed, it is a central issue of dispute in the scholarly community whether colonialism fostered, or delayed, the development of the regions that it touched. Africa provides a stunning example of these directly contradictory arguments.
Conventionally, Africa’s developmental prospects were thought to have been hindered by colonial interference (e.g., Young 1994). Yet, the striking fact is that Africa experienced considerably less colonial intervention than most parts of the world. This has led some writers to claim, at least implicitly, that Africa’s problems at the present time are attributable to insufficient colonial influence (Herbst 2000; Mamdani 1996). In short, while there is general agreement that “colonialism mattered,” it is less clear what the long-term effects of this traumatic intervention actually have been. Indeed, the virulence of scholarly and popular opinions about colonialism is matched only by the inconclusiveness of current research (contrast Alam 2000 and Grier 1999). Given that colonialism is a complex subject and evokes strong feelings, it is perhaps not surprising that the extensive study devoted to this weathered subject across the fields of the social sciences has not rendered a clear verdict on its legacy. Contributing to the inconclusiveness of this research are certain persistent methodological problems associated with the two dominant strategies of research – the case study (area study) and the global study (cross-national study) – which we now briefly review. Usually, work on colonialism follows a case study approach that involves the intensive study of a particular country or region of the world (e.g., Brown 2000; Young 1994), or a particular colonial power (e.g., Armitage 2000). Alternatively, an author or a group of authors may cover the globe but do so with a series of case study analyses (e.g., Chamberlain 1998; Kohli 2004). While there are advantages to this way of approaching the subject – and our own approach builds self-consciously on precisely this sort of work – case study work is not designed to estimate typical causal effects for large populations.
Rather, the goal of this literature is to understand the causes and consequences of colonialism in delimited contexts, leaving questions about general causal effects for other types of research. Thus, while case studies have taught us a great deal about the effects of colonialism in particular places and at particular times, they have not told us whether colonialism overall had a positive or negative impact on the non-western world, or in what specific ways it affected that world as a whole. Given that many writers do presume general effects, not simply contextually specific effects, there is a prima facie case for a general (global) approach to the problem of colonialism. In any case, whether colonialism left behind general effects cannot be determined by mere assertion; it requires comprehensive evidence in the form of a global dataset. The same may be said for the claim of historical specificity; “different effects in different places” is not a hypothesis that can be proven without systematic testing across all the cases (or a significant sub-sample thereof). The opposite difficulty is posed by the (rather few) studies of colonialism that are truly global in scope – where a general hypothesis is tested across all developing countries. The problem here is that the subject of interest, colonialism, is usually reduced to a single dimension, e.g., a) a dummy variable registering the predominant colonizer of a country (La Porta et al. 1999), b) a measure of “settler” versus “extractive” colonialism (Acemoglu, Johnson & Robinson 2001), or c) a variable measuring the number of years a country was under colonial control (Grier 1999). These sorts of studies are useful, though preliminary, attempts to systematize hypotheses drawn from the case study literature. However, they scarcely exhaust the topic. Indeed, the true effects of colonialism may not be apparent from an approach that reduces the topic to one or two dimensions.
Consider the debate over the relative importance of colonial institutions (Acemoglu, Johnson & Robinson 2001, 2002; North 1981) and geography (Diamond 1997; Olsson & Hibbs 2000; Sachs & Warner 1997) in structuring long-run economic development. While the measurement of geographic factors has advanced to include a host of highly differentiated variables (e.g., climate, soil, native perennial wild grasses, disease vectors, domesticable mammals, continental axis, latitude), the measurement of colonialism has been stuck at one or two (as listed above). Consequently, it has been impossible to provide a fine-grained test of the colonialism hypothesis. Likewise, students of political regimes have long suspected that patterns of colonialism may strongly shape national prospects to establish and maintain democracy. For example, cross-national statistical research has found a relationship between colonial status and subsequent regime history (e.g., Bernhard, Reenock & Nordstrom 2004; Bollen 1979; Bollen & Jackman 1985). However, these findings are based on the use of one or two variables: dummies for the identity of the colonial power and (in some studies) a variable measuring the number of years under colonial rule. These are not unsophisticated studies in other respects, but their measurement of the key hypothesis is strictly circumscribed. A survey of the immense literature on colonialism thus reveals two truisms: 1) the case study literature is informative but also un-systematic and 2) the crossnational literature is systematic but not very discriminating. As a consequence, the topic of colonialism suffers from simultaneous promiscuity and neglect. Although the subject is ubiquitous in the contemporary fields of anthropology, economics, history, political science, and sociology, it is rarely studied in a detailed and systematic manner. 
Our ambition is to marry the virtues of case study and cross-national approaches so that the influence of colonialism on the modern world (whatever that may be) can be measured in ways that are satisfying to scholars working with in-depth historical studies as well as global datasets. Specifically, we propose to develop a comprehensive dataset on colonialism, a cross-national time-series study that will stimulate future research concerning the causes and effects of colonialism by scholars in all fields of the social sciences -- regardless of method, theoretical framework, or substantive area of interest. Our focus is on Anglo-European and Japanese overseas colonialism since the fifteenth century. We intend to measure the type and degree of involvement of various colonial powers across a variety of dimensions that may be expected to influence subsequent development trajectories. We suggest that good data on these dimensions alone would go a long way toward enabling scholars to test key hypotheses related to colonialism. In addition to activities directly tied to colonialism, we seek to put together data on other key historical variables that might be relevant to scholars who study colonialism and long-term patterns of development. These ancillary topics fall into five broad categories – geography, economics and demography, human development, the state, and the nation – described at length in the appendix to this document. Without such a comprehensive historical dataset we lack the means to adjudicate among rival causal hypotheses. What has colonialism wrought? Under what circumstances might colonialism leave favorable or unfavorable legacies? Is the causal effect of colonialism to be discovered in the immense variety of colonial experiences? If so, how shall we understand these experiences, and judge their effects? How might the various eras of globalization be compared with each other?
Was colonialism in the eighteenth century, for example, significantly different from colonialism in the nineteenth and twentieth centuries? These kinds of questions motivate the data collection effort of this study. We hope not only to elucidate the fraught subject of colonialism but also to shed light on long-term patterns of development, a subject usually hostage to late-twentieth century datasets. The study also has an implicit methodological goal. It is often noted that the field of comparative politics is rent by a central cleavage separating cross-national statistical researchers, who work with global datasets drawn from the postwar era, and historical institutionalists, who work with historical materials drawn from a single region or a small set of countries (Pierson & Skocpol 2002). The project at hand attempts to bridge these two camps, integrating the salient features of in-depth historical accounts into a single global dataset that stretches back over the centuries. As such, we hope it will provide a new way of doing business in the social sciences, one that is acceptable and accessible to both qualitative and quantitative researchers.

The Dataset Problem

Given that the work of social science is increasingly global in scope, it is not surprising that global datasets have played an increasingly important role in the disciplines of political science, sociology, and economics. A short list of the most important and most frequently utilized datasets in these fields would include the following: Correlates of War, Cross-National Time-Series Data Archive, Penn World Tables, Polity IV, State Failure Task Force, and World Development Indicators (see Table 1). Scholars rely on these datasets -- and many others -- for a wide range of tasks. They perform roughly the same role for comparativists that standard surveys such as the National Election Study and the General Social Survey perform for Americanists.
Yet, despite the prominence of crossnational data in contemporary research, existing datasets suffer from three generic problems. First, they are limited in temporal scope. Few global datasets reach back before 1950, and only three extend to the early nineteenth century. We have not found a single widely used global dataset that extends into the eighteenth century. Given that work in the social sciences increasingly deals with causal and descriptive propositions that extend back to the Enlightenment or to earlier historical eras, this lack of historical coverage may be regarded as a monumental lacuna. Second, existing datasets often suffer from ambiguity about their sources, coding procedures, and the methods of aggregation. Thus, although crossnational datasets have become staples of scholarly research, it is with considerable unease that scholars employ their variables. While some of these faults are inherent to the enterprise – collecting data globally is, after all, a daunting task – others may be corrected through careful attention to coding decisions, the use of multiple sources, the recording of data in disaggregated form, detailed recording of procedures, and – perhaps most important of all -- reliance on the expertise of country specialists. These methodological issues are discussed at length below. Third, existing datasets do not address the issue of colonialism in any detail. At best, a single dichotomous variable for principal colonizer is included (e.g., British colonial origin). Thus, our dataset would constitute the first attempt to systematically examine and record the imprint of colonialism on the modern world. The need for a comprehensive historical dataset that stretches back to the pre-modern era, and treats colonialism in a more differentiated fashion, seems clear. 
Table 1: A Sample of Extant Global Datasets

Dataset                                  Subjects                 Years  Source                    Location
Correlates of War                        International relations  1815-  Singer, Diehl (1990)      www.umich.edu/~cowproj/
Cross-National Time-Series Data Archive  Comparative politics     1815-  Banks (1994)              http://www.databanks.sitehosting.net/
Penn World Tables (PWT)                  Economics                1950-  Heston, Summers (1991)    www.bized.ac.uk/dataserv/penndata/pennhome.htm
Polity IV                                Democracy, governance    1800-  Marshall, Jaggers (2002)  www.cidcm.umd.edu/inscr/polity/
State Failure Task Force                 International relations  1955-  Goldstone et al. (2000)   www.cidcm.umd.edu/inscr/stfail/
World Development Indicators (WDI)       Economics, demography    1960-  World Bank (2003)         www.worldbank.org/data/wdi2002/

Hypotheses

In order to direct the data-collection process it is necessary to establish priorities. Which descriptive patterns and causal relationships warrant attention? What sorts of evidence can be coded numerically or in natural-language categories such that broader features of social, political, and cultural development across the world can be better understood? Data collection is always, at least implicitly, motivated by theory. Even so, we are wary of macro-theoretical frameworks that might limit the utility of the resulting dataset for scholars working in other schools and genres. To this end, we wish to avoid an overly “theoretical” vocabulary that would identify this as a project emanating from Marxist, world-systems, Weberian, neoclassical, or some other theoretical framework. In this light, our approach is fairly close-to-the-ground. It should be clear that the purpose of this investigation is not solely to explore causal relationships, but also descriptive patterns.
Colonialism is of great intrinsic significance, influencing our views on a wide range of present-day phenomena, e.g., globalization, North/South relations, slavery, development, and what some have called “neo-imperialism.” Many of the assertions at issue in these contemporary debates concern what? questions, rather than (or in addition to) why? questions. Thus, our initial hypotheses, listed in Table 2 (below), include both causal and descriptive inferences. It is our hope that, once completed, the data included in this project will generate new hypotheses. This, in turn, will undoubtedly stimulate further collection of data (which we expect will be integrated into the dataset). Social science is a dynamic process. But one must start somewhere. We offer the following list of hypotheses in an open-ended spirit, as a point of departure for future work on colonialism and long-term development.

Table 2: Hypotheses

COLONIALISM
British/other: British rule was different – more decentralized, more indirect, more democracy, and/or better governance.
Japan/other: Japanese rule was different from European rule – more interventionist, more developmental.
Africa/other: Africa was less intensively colonized than Latin America, South and Southeast Asia.
Property rights: Property rights were more likely to be established in colonies that attracted large numbers of settlers (AJR 2001, 2002).
Population density: Densely settled indigenous areas were less likely (AJR), or more likely (Sokoloff & Engerman 2000), to be targets of settlement by Europeans.
Extractable resources: Regions with readily extractable resources (e.g., gold) were subject to more European settlers (a common assumption in the literature on Latin America).

DEVELOPMENT
British rule: British colonialism, by virtue of its greater local democracy, indirect rule, and/or effective civil service, leads to greater development.
Territorial continuity: Continuity of borders, or at least the endurance of a “core” region within the colony, allows for a more successful transition during the post-independence era, and hence to greater development.
Colonial intervention: Greater colonial intervention causes greater (Alam 2000; Grier 1999), or lesser (Young 1994), development.
Type of colonization: Directly-ruled settler colonies have the strongest developmental performance; indirectly-ruled nonsettler colonies have the worst.
Property rights: Areas with well established property rights experienced greater subsequent development (AJR 2001, 2002; North 1981).
Property rights and conflict: Reification of customary norms governing access to land and property in colonial law generated conflict over interpretation and enforcement of such laws (Chanock 1998, Colson 1974).

DEMOCRACY
British rule: British rule encouraged local- and national-level democracy, thus establishing norms and procedures that would help democracy survive in the post-independence era (Bernhard, Reenock & Nordstrom 2004; Bollen and Jackman 1985; Lipset et al., 1993; Weiner 1987).
Direct/indirect rule: Directly ruled colonies are more democratic later on because direct rule destroys traditional (and often undemocratic) power-holders (e.g., chiefs).
Colonial settlement: European settlement produces democracy.
Pre-colonial legislatures: Experience with democratic procedures through pre-colonial legislatures helps to establish and protect democratic norms in the post-independence era.

Constructing the Dataset

There is no simple recipe for designing and pursuing a successful data collection project. Existing crossnational datasets, discussed above, provide both exemplary models and cautionary tales.
They are exemplary insofar as they manage to capture, in quantitative form, a variety of indispensable concepts commonly used in comparative analysis. They are worrisome insofar as they have often failed to provide adequate explanation of their coding procedures and are subject to important measurement errors (see, e.g., Munck and Verkuilen 2002). We aim to provide a more careful and thorough – and consequently a more useful – dataset that includes a detailed codebook with all concepts, definitions, and primary and secondary sources employed. Our intention is to remain as close to the ground as possible in our coding decisions, which is to say that aggregated concepts will be employed only in conjunction with their component (disaggregated) parts, so that future scholars can re-visit the ground that we cover. In this manner, most coding decisions can be easily revised.

Variables

Our hope is to identify dimensions of politics, economy, and society that are valid across time and across regions. Ideally, these measures would also be applicable to a variety of political units including empires, nation-states, city-states, colonies, and so forth. Of course, we do not imagine that data will be equally available, or equally informative, for these diverse units. The point, rather, is that the coding categories should be valid. Variables are divided into six general categories: 1) colonial rule, 2) politics, 3) geography, 4) economics and demography, 5) human development, and 6) society. Granted, these categories are somewhat arbitrary. Indeed, future researchers may choose to divide up the subject in quite different ways. Fortunately, the six-part division of subjects does not affect the substantive goals of the project in any way. A complete list of variables, along with their definitions and potential sources, can be found in the appendix.
Note that most variables could be conceptualized alternately as explanatory variables, control variables, or outcome variables, depending upon one’s theoretical proposition. Only a few, such as several of the geographic variables, are entirely exogenous. Note also that some variables are invariant, or occur only at one period of time (e.g., at the moment of the initial colonial encounter). These will be static (identical in all years of the time-series dataset). Most variables of interest experience some change over time and are thus properly coded in a time-series format. Some time-series variables, such as GDP, are available only for contemporary years (Angus Maddison’s estimates extend back to 1800; solid annual data for a broad cross-section of countries begins in 1950). We choose to include such variables, even though they cannot be extended over the entirety of our chosen time-period. (This means that the resulting panel will be “unbalanced.”) A final category of variables will be composed from various time-series indicators to allow for a summary measure of a country’s overall colonial or development experience, as discussed below.

Coding

Our intention is to collect data on “natural” units of analysis, as defined by primary and secondary sources, leaving the task of aggregation for a later stage. This means that we must deal with a wide variety of units of analysis -- empires, continents, cultural zones, nation-states, subnational regions, cities, and so forth. These units overlap and, in many instances, change over time. The British Empire in 1700 evidently refers to a very different geographic entity than the British Empire in 1800. In order to incorporate this complexity into the basic architecture of the dataset we plan to employ GIS mapping software. Evidently, we need to find a data structure that can preserve the original units of analysis (as drawn from primary sources) while offering the possibility of multiple aggregation techniques.
For example, we need to be able to reconstruct the history of present-day nation-states, whatever their previous geo-political identities might have been. Temporal units of analysis are also complicated. Evidently, data for many of our variables are available only at very irregular intervals. At the same time, a few variables are much more precisely dated. In many cases, for example, we know the exact year and day that colonial administrators assumed office (Henige 1970). Again, the purpose of the dataset is to preserve as much precision in the original data as possible. Thus, we are hoping to find an architecture that allows a precise dating of particular days (June 10, 1555) as well as larger, less precise temporal units (the eighteenth century). Some data will be available in convenient numerical form (e.g., number of colonial settlers in a region). Other information will have to be estimated, on the basis of historical accounts or by expert coders (e.g., theory of colonial rule). We distinguish between “primary” variables (requiring relatively little judgment on the part of the coder) and “secondary” variables (more highly aggregated, and relying on information from primary variables). The dataset thus represents a mix of “objective” and “subjective” codings, and quantitative and qualitative data. Resulting variables are of all sorts: string (ordinary language), nominal, ordinal, and interval. For each data point (cell), the dataset will note the following: variable (substantive information), location (i.e., country, colony, region, city, or town), year, source(s), and additional notes. The latter is an all-purpose field allowing us to comment on the viability of the source, disagreements among sources or coders, special coding rules, or any other facet of the data point that might be relevant. This cell-by-cell information system should make the task of any future recoding immeasurably easier and allow for a full reporting of the procedures employed.
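The cell-by-cell record and the mixed temporal precision described above can be pictured with a minimal sketch in Python. It is an illustration under stated assumptions, not the project's architecture: the class names `TimeSpan` and `Cell`, the field names, and the sample values are all hypothetical. The one design idea it shows is that a date can be stored as an interval, so that a single day and a whole century share the same representation and differ only in width.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class TimeSpan:
    """A dated interval; a wider interval encodes a less precise date."""
    start: date
    end: date

    @classmethod
    def day(cls, year: int, month: int, day_of_month: int) -> "TimeSpan":
        # A precisely dated event is an interval of width zero.
        d = date(year, month, day_of_month)
        return cls(d, d)

    @classmethod
    def century(cls, n: int) -> "TimeSpan":
        # Assumed convention: the n-th century runs from year 100*(n-1)+1 to 100*n.
        return cls(date(100 * (n - 1) + 1, 1, 1), date(100 * n, 12, 31))

    def contains_year(self, year: int) -> bool:
        return self.start.year <= year <= self.end.year


@dataclass
class Cell:
    """One data point with the fields listed in the text (names are hypothetical)."""
    variable: str
    value: object
    location: str
    when: TimeSpan
    sources: List[str] = field(default_factory=list)
    notes: str = ""


# Placeholder record: the values below are illustrative, not real data.
example = Cell(
    variable="administrator_assumed_office",
    value=True,
    location="Example Colony",
    when=TimeSpan.day(1555, 6, 10),
    sources=["(placeholder source)"],
    notes="Illustrative record only.",
)
imprecise = TimeSpan.century(18)
```

Because the notes and sources travel with each cell, a later recoding could in principle filter or revise individual data points without touching the rest of the dataset.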
Imputation and Aggregation

While the primary purpose of the dataset is to provide a centralized primary-data source for scholars, we also intend to apply imputation and aggregation techniques so that the data is rendered in a more usable form. The problem of aggregation besets variable-centered research. Evidently, the way in which indicators of a central concept are aggregated (e.g., additive; weighted-additive; necessary/sufficient approaches) can have important implications for the final result. All aggregations are, in some sense, research interventions. In turn, these aggregation issues depend in part on how the specific indicators are measured themselves (e.g., at a nominal, ordinal, or interval level). Our effort will be to preserve the most basic indicators used in all aggregative concepts, wherever possible, so that our choices of aggregation can be revisited by future scholars, perhaps with different hypotheses in mind. Here, we describe a few of the techniques that may be employed for these secondary datasets. Where several sources are employed to measure the same concept, or very similar concepts, we may choose the “best” (most reliable, most consistent, most comprehensive) of these variables to report in a final, secondary dataset. Alternatively, we may combine these multiple variables into a single variable, using an averaging or imputation technique (to fill in missing data). When a single variable lacks temporal data, and where this data is highly trended, simple techniques of linear interpolation may be employed. A more complicated problem arises with missing spatial data (data unavailable for particular units), or temporal data that is not highly trended. Here, multiple-imputation techniques may be employed (e.g., King et al. 2001). In arriving at a final, “complete” dataset, we are cognizant that different temporal units may be advisable. For some variables, an annual coding scheme may be possible.
For others, decadal or century-long intervals may be more appropriate. The objective, ultimately, would be to provide a reasonably complete dataset for the world over the past six centuries across chosen dimensions. The purpose of “complete-ness” in this context is to provide a dataset that can be used for varied analyses -- descriptive, causal, and predictive – without biasing results by over-representing those parts of the world where richer data is available. At the same time, any techniques of aggregation that involve imputation of missing data should be reflected in a corresponding measure of uncertainty for that variable. Measures of uncertainty will therefore be reported along with each variable in the final product. Most important, the raw data – as collected from numerous sources – will be preserved so as to be available to future scholars, who may wish to apply different techniques of aggregation. Thus, at the end of the project we envision several datasets, as follows: 1) a Primary dataset, with raw data, along with any new variables that we decide to code; 2) a Completed dataset, with selected variables for key concepts interpolated and imputed so that all units and time-periods are covered across the entire six centuries; 3) a Cross-sectional dataset, centered on the year 2010, summarizing the temporal data for each contemporary nation-state in a static cross-country format. The latter will be based on cumulative totals and/or averages. For a subset of variables, this dataset will provide information in four time-periods: a) prior to colonization, b) at the height of the colonial period (during the decade of greatest colonial influence), c) at independence (the approximate year in which a country attains formal sovereignty), and d) at present (the most recent year in the dataset). Thus, once the data collection phase of the project is complete, a set of subsidiary variables and datasets will be constructed from the underlying (raw) data. 
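The simple linear interpolation mentioned above for highly trended series can be sketched as follows. This is a minimal illustration in Python, not the procedure the project will necessarily adopt: the function name `interpolate_annual`, the annual step, and the boolean flag marking interpolated values are assumptions made for the example. The flag is only a crude stand-in for the measures of uncertainty described in the text, and the sketch does not extrapolate beyond the first and last observed years.

```python
from typing import Dict, Tuple


def interpolate_annual(observed: Dict[int, float]) -> Dict[int, Tuple[float, bool]]:
    """Fill the years between observations by straight-line interpolation.

    Returns {year: (value, was_interpolated)}. Observed years keep their
    original value and are flagged False; filled years are flagged True.
    """
    years = sorted(observed)
    filled: Dict[int, Tuple[float, bool]] = {}
    for y0, y1 in zip(years, years[1:]):
        v0, v1 = observed[y0], observed[y1]
        for y in range(y0, y1):
            # Fraction of the way from the earlier to the later observation.
            frac = (y - y0) / (y1 - y0)
            filled[y] = (v0 + frac * (v1 - v0), y != y0)
    if years:
        # The final observed year is not covered by the loop above.
        last = years[-1]
        filled[last] = (observed[last], False)
    return filled


# Hypothetical series with observations in two years only.
series = interpolate_annual({1800: 10.0, 1804: 18.0})
```

Keeping the raw observations separate from the filled series, as the flag does here, is what allows a later analyst to discard the interpolated values and apply a different technique.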
Operationalizing Key Concepts

While the utility and conceptual validity of most of the variables appearing in the appendix will be readily apparent to the reader, some of them raise complicated issues of conceptualization and measurement and thus deserve more extended commentary. In the following sections we discuss measures of Colonialism, Economic development, Human development, and Cultural transformation.

Colonialism

Measuring colonial rule requires a viable definition of colonialism, a term that has been variously understood (Abernethy 2000: 19-21; Esherick et al. 2006). Our definition, a fairly conventional one, highlights three elements: 1) a colonizer (the metropole) makes a successful claim of sovereignty over an overseas territory; 2) it exerts influence over the occupied territory through the creation of an administrative structure that extracts resources and enforces regulations; and 3) it perceives the indigenous population as different and usually inferior in culture (as defined by race, ethnicity, religion, customs, and/or language) and denies this population full citizenship rights. Accordingly, we include in the analysis all of the European colonizers as well as Japan. Excluded from our purview are land empires such as the Qing, Ottoman, and Romanov Empires. To address questions about colonialism and its legacies, four variables seem especially important: 1) presence of colonial rule; 2) identity of primary colonizer; 3) type of colonialism; and 4) level of colonialism. We discuss these in turn. First, we code the presence of colonial rule using three categories: sovereign territory, partial colony, and colony. We include the second category of partial colony to accommodate borderline cases such as India from the mid-1700s until the formal establishment of colonialism in 1857 and Sierra Leone during the 1787-1896 period.
Scholars who have a more expansive definition of colonialism may choose to treat these borderline cases as colonies, whereas those with a more restrictive definition may choose to treat them as periods of sovereignty. The number of country-years that will be coded in the borderline category is modest, corresponding mostly to territories where the beginning or end of colonialism is ambiguous over several decades. Second, for country-years where colonialism is fully or partially present, we code the identity of the primary colonizer (i.e., England, Spain, Portugal, and so on). Again, in most cases this coding is straightforward, but occasionally territories are subject to multiple colonizers. For example, portions of the southern United States were colonized by Spain and France, and subsequently by England. In this regard, it bears emphasis that our GIS-based approach to coding takes account of (estimates of) the actual spatial control of different colonizers, as judged by historical maps. GIS software also allows us to arrive at aggregate measures of colonial rule over the longue durée, using different measures of colonial control (e.g., percent of territory now part of the United States controlled by Metropole A, multiplied by the number of years it was in control of that territory). Third, we propose to code cases in light of four types of colonialism that are of central importance in the literature: direct settler, indirect settler, direct non-settler, and indirect non-settler colonialism. Direct settler colonies refer to those in which the colonial settler population represents the majority of the total population and permanently resides in the colony; these colonies are marked by direct rule by the settler population (e.g., Pearson 2001). Classic examples include the British settler colonies of Canada, the United States, Australia, and New Zealand.
Indirect settler colonies refer to colonies in which the colonial settler population represents a substantial portion of the total population (though less than half) and this population permanently resides in the colony; however, these colonies feature indirect rule in which indigenous elites are responsible for governance outside of the central bureaucracy. Examples of indirect settler colonies include South Africa, Zimbabwe, and the Spanish colonial empire for much of its history.

Both direct and indirect non-settler colonies refer to colonies in which the population from the colonizing nation does not permanently settle the colonial territory. The two types are distinguished by the presence of direct or indirect rule. Direct rule occurs when the colonial state is unified and bureaucratically organized. Indirect rule, by contrast, consists of a bifurcated state in which the central administration is bureaucratically organized and the peripheral administration – which is supposed to be based on pre-colonial institutions and is run by indigenous intermediaries who collaborate with the colonial authorities – is organized along patrimonial lines. The uneasy combination of central bureaucratization and peripheral patrimonialism is therefore the hallmark of indirect rule.

For some cases, more than one type may be applicable. For example, India featured both pockets of direct rule and pockets of indirect rule. For these cases, we will code both the dominant type of colonialism and the secondary type of colonialism. If there is real ambiguity concerning which type is dominant, the case will be coded as a hybrid.

Finally, perhaps our most important new variable is “level of colonialism.” An adequate measure of this variable is central to testing whether, ceteris paribus, more colonialism or less colonialism is associated with better post-colonial development performance. Level of colonialism is a multi-dimensional category that aggregates several important dimensions.
The overarching variable of level of colonialism will be especially useful for scholars who seek to introduce a basic control for colonialism in their studies. By contrast, scholars who seek to explore the impact of specific features of colonialism will likely find the codes for the underlying dimensions of this concept to be more useful.

We define level of colonialism as the extent to which the economic, sociocultural, and political institutions of a metropole are imposed upon a colonized region. Thus, we are concerned with the extent to which a broad array of institutions is transferred – in whole or in part and possibly with substantial modification – from a colonizing nation to a colonized territory. We operationalize the concept using three measures: economic institutional transformation, sociocultural institutional transformation, and political institutional transformation. Because we will systematically gather data on each of these dimensions, the three measures themselves could be used as variables for analysis.

With economic institutions, we are centrally concerned with the extent to which the colonizing power shapes methods of production and trade relations within the colonized territory. In some cases, colonizers introduced entirely new modes of production into their occupied territories (e.g., the Spanish in the Andes). In other cases, colonizers left indigenous modes of production at least partly intact (e.g., British colonialism in parts of East Africa). Likewise, the extent to which internal and external trade relations were controlled and regulated by colonial authorities varied across regions and time (compare, for example, the restrictions of Spanish mercantilism with the more liberal orientation of British and Japanese colonizers). We draw on these observable differences to assess the extent to which economic institutions were implanted into the different colonies.
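To make the relation between the overarching variable and its three measures concrete, one possible aggregation is sketched below. The proposal does not specify an aggregation rule, so everything here is an assumption made for illustration: each measure is taken to be scored on a 0-to-1 scale, the default combination is an unweighted mean, and the names `DIMENSIONS` and `level_of_colonialism` are hypothetical.

```python
# The three measures named in the text; the 0-1 scoring scale is an assumption.
DIMENSIONS = ("economic", "sociocultural", "political")


def level_of_colonialism(scores, weights=None):
    """Combine the three institutional-transformation scores into one index.

    By default this is an unweighted mean (an assumed rule, not one taken
    from the proposal); optional weights allow alternative aggregations.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise KeyError(f"missing dimension scores: {missing}")
    for d in DIMENSIONS:
        if not 0.0 <= scores[d] <= 1.0:
            raise ValueError(f"{d} score must lie between 0 and 1")
    if weights is None:
        weights = {d: 1.0 for d in DIMENSIONS}
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / total_weight


# Made-up scores for a hypothetical territory-year.
example = {"economic": 0.75, "sociocultural": 0.25, "political": 0.5}
overall = level_of_colonialism(example)
economy_heavy = level_of_colonialism(
    example, weights={"economic": 2.0, "sociocultural": 1.0, "political": 1.0}
)
print(overall, economy_heavy)
```

Keeping the three scores alongside the composite matches the point made above: the composite can serve as a basic control, while the separate dimensions remain available for more specific hypotheses.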
In the course of coding the economic institutional measure, we will be able to gather data on additional variables related to the economy (see appendix). For example, the variables of colonial economic activity (extractive, mixed, extensive), level of colonial investment, and type of labor force can be coded in conjunction with scoring the measure of economic institutional transformation. With sociocultural institutions, we are concerned with the extent to which the colonizing power implanted new cultural practices and new styles of life into colonial territories. For empirical purposes, we especially focus on cultural institutions, such as religious institutions, civic associations, and educational facilities. In some places and at some times, colonizers implanted almost none of these institutions, whereas in other places and at other times, colonizers vigorously pursued the cultural transformation of the colonized territory. We seek to measure these variations. Again, in the course of scoring cases for the measure of sociocultural transformation, we will be in a position to gather data relevant to the coding of several other variables. These variables include those related to missionary activity, religion, and language (see appendix). With political institutions, we focus on the extent to which colonizers brought bureaucratic and governmental structures into the colonies. In some colonies, the colonial state represented no more than a couple hundred bureaucrats and soldiers located in a distant capital; no effort was made to impose a formal government (e.g., much of Africa). In other colonies, the colonial state was more developed, perhaps featuring a national government in which indigenous leaders collaborated with colonial authorities (e.g., the West Indies in the 20th century). 
In still other colonies, the colonial power imposed a large, centrally-controlled bureaucratic, legal, military, and government apparatus that extended deeply into society (e.g., parts of East Asia). When coding political institutional transformation, we will also seek to gather data for several other related variables, including those related to military presence, legal penetration, and a range of variables that we list under “Politics” in the appendix.

Economic development

We rely centrally on demographic variables to measure the developmental capacity of societies prior to the modern era. Two demographic variables, urbanization and population of a state’s largest city, are looked upon as proxies for aggregate societal wealth and civilizational development (including technology, the division of labor, and the development of advanced forms of social and political organization) in periods prior to the demographic revolution (Acemoglu et al. 2002; Bairoch 1988). These are also periods in which the calculation of a gross domestic product is virtually impossible, since there was no formal economy to speak of (and, to make matters even more complicated, no common measures by which purchasing power parity could be observed crossnationally). This makes demographic variables all the more essential. Fortuitously, most civilizations that we are aware of – and certainly all modern civilizations – were based on urban agglomerations.[1] Cities and civilization went hand in hand (Bairoch 1988; Chandler 1987; Childe 1950; Modelski 2000). In the twentieth century, and perhaps even the nineteenth, it becomes possible to arrive at reasonably good estimates of GDP per capita (Maddison 2001), which can be combined with demographic data from earlier periods to arrive at a comprehensive accounting of societal wealth throughout the world during the period covered in this survey (1400 to the present).
Another variable that may be helpful in charting economic growth within colonies and nation-states prior to the mid-twentieth century (when GDP statistics become widely available and reliable) is export revenue per capita (e.g., Manning 1982: 4).

Human development

It is important to stress that the foregoing measures are regarded as proxies for aggregate wealth and civilizational development, not human development. Various privations associated with social inequality and oppression, of which slavery is the most egregious example, are entirely excluded from such demographic and economic variables. Indeed, it is likely that urbanization was associated with an increase in mortality rates prior to the twentieth century (Bairoch 1988). It would be folly, in any case, to equate the size of a political unit’s largest city, its level of urbanization, or its export revenue with the quality of life enjoyed by its inhabitants.

Measurements of human development are more problematic since estimates of mortality – the most common indicator – are available on a global scale only from the mid-twentieth century. Other indicators of health and literacy are even more limited in historical and geographic range. For a small set of regions an available proxy for human physical wellbeing exists in the form of human stature. Stature, understood here as the average height of mature members of a specific human community, is the best measure – indeed, virtually the only measure – of human development prior to the tabulation of mortality rates (Bogin 1988; Bogin & Keep 1999; Eveleth & Tanner 1991; Komlos & Baten 1998; Steckel 1995; Steckel & Rose 2002; Steckel & Floud 1997). Human stature is highly sensitive to nutritional intake, particularly during childhood, and to insecurities in food provision that might disrupt nutritional intake.
Since we have accurate measurements of stature potential – drawn from healthy specimens of present-day populations with similar genetic composition – biologists can easily calculate the degree to which previous populations achieved this potential. Particularly revelatory is the degree to which stature has varied over time in human populations around the world. Strong evidence suggests, for example, that stature declined in the initial aftermath of colonial interventions in Latin America. Granted, stature is not quite the same concept as overall mortality since, in principle, a population of tall adults might co-exist with high infant and child mortality rates (if children are dying of diseases that are not nutritionally related). However, prior to recent discoveries about disease, sanitation, and medicine we can expect that adult stature corresponded with mortality rates. Thus, we propose that stature is a useful proxy for human development through most eras covered in this study. As data on historical stature in different parts of the world becomes available, it will be integrated into the dataset and may provide a good measure of human development over the long run.

[1] Partial exception might be made for the Egyptian empire (Modelski 2000: 25-6) and the Roman Empire in its later stages, when wealth migrated from Rome to large latifundia-style estates situated in rural regions around the empire. However, the fact that this movement was associated with the empire’s decline is not coincidental.

Cultural Transformation

Constructing accurate and sensitive measures of cultural change is perhaps the most daunting task of all. One set of proxies involves linguistic and religious practices. If these change – if, for example, a region adopts the language and/or religion of its principal colonizer – there is strong reason to suppose that a wide-ranging cultural transformation has occurred.
The speed and thoroughness of this transformation can presumably be tracked by the rate and extent to which indigenous practices disappear. Thus, the variables measuring linguistic and religious practices (see appendix) offer a crude tracking of broader societal transformations.

A related approach looks to changes in the racial complexion of a population as a clue to the cultural transformation of indigenous peoples and – equally important – how integrated/segregated these societies were, overall. Presumably, where the color line was shifting and indistinct, fewer barriers separated colonial and indigenous peoples.

These are all outcome-based approaches, to be sure. In order to get a sense of the inputs – i.e., the extent of direct cultural intervention on the part of a colonizing power – one may attempt to estimate the number of schools run by, or established by, the colonizer and the principal language of instruction in that school system(s).

Methods of Analysis

Before concluding, we wish to pay explicit attention to the methods by which the data resulting from this study might be analyzed. The first anticipated use is purely descriptive, i.e., to show how countries and/or colonies fared at various points in their historical trajectory. Indeed, the most important use of this dataset may be primarily descriptive, allowing researchers to make better and broader comparisons through time and across space. This, in turn, may provide the point of departure for focused case studies.

A second use of the dataset is to provide direct evidence of causal relationships. Our point of departure is the global cross-country research design, usually focused on the postwar decades (1950-). Although sometimes approached in a pooled time-series format (Gerring, Thacker & Moreno 2005), there is usually relatively little variation in key variables through this time period, and such variation as exists is manifestly non-experimental (and thus correlated with the error term).
Complicating matters further is the extreme heterogeneity across units (nation-states). It is no wonder that the format has been strongly criticized in recent years (e.g., Kittel 2006; Rodrik 2005). Even so, for many questions of interest to scholars the crossnational regression remains among the best of all bad options. While we take no position in this ongoing debate, it is worth noting that whatever confidence one might have in cross-sectional models depends largely upon whether the model is adequately specified. This, in turn, rests upon the intensity and variety of specification tests that a writer is able to apply to a given hypothesis (since “correct” benchmark models are virtually impossible to identify). Only if a result is robust in the face of many plausible specifications can it be regarded as providing strong evidence of a causal hypothesis. Such specification tests evidently depend upon the prior existence of a large set of correctly measured and crossnationally valid indicators – to be used as controls or as alternate measures of the key concept of interest. In short, crossnational regressions depend upon specification tests, and specification tests depend upon data. Once again, the crucial importance of the present project becomes apparent if we are to sustain the viability of this common mode of crossnational analysis.

At the same time, one of the anticipated benefits of the Colonial Legacies project is to open up new methodological approaches. In particular, we hope to offer scholars the possibility of exploiting useful variation over time. Note that the projected dataset will collect information on key variables at annual, decadal, or centennial intervals (depending upon data availability). This means that the resulting data may be examined in a panel format over a much longer period of time, and this opens the way for what may prove, in some circumstances, a more productive use of the panel format.
Since the project will collect some data for spatial units much smaller than the contemporary nation-state (e.g., cities and regions) it may also be possible to perform analyses that are much more disaggregated than the typical cross-country regression. These analyses might center on spatial units chosen according to available GIS formats, e.g., geographically circumscribed areas or hectares. The possibilities for new spatial units of analysis are, in principle, unlimited, and may greatly change our capacity to model causal relations through time. An additional approach, also relying on temporal variation, focuses on the histories of territories whose colonial ruler changed – from Dutch to British in New York and in South Africa, for example. These cases, which are quite numerous throughout the world, offer critical evidence for any hypothesis concerning the effects of colonial rulership, for the ceteris paribus assumption of causal analysis is likely to be satisfied if the comparison is restricted to periods just prior to, and after, the changeover in control. A final methodology compares the performance of colonies in decades just prior to, and after, the achievement of independence. This approach regards independence as an exogenous “treatment,” allowing for pre- and post-tests along various dimensions of development (political, social, and economic). Of particular importance in this sort of analysis is the construction of appropriate controls for regional and global trends that might otherwise produce spurious findings. Also important is a wide-angle focus on either side of the independence divide so that temporary effects associated with this unique political rupture are neutralized. (Alternatively, one might simply ignore the decadal period surrounding the date of independence. In any case, territories achieving independence in the very recent past [e.g., East Timor] or perhaps not at all [e.g., Puerto Rico] would be excluded from this species of analysis.) 
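The pre- and post-independence comparison just described can be illustrated with a short sketch. It is a simplified illustration under stated assumptions, not the project's estimation procedure: the function names are hypothetical, the window and buffer lengths are arbitrary choices, the "control" for wider trends is reduced to subtracting the same difference computed for a reference series (for instance a regional average), and the data are invented.

```python
from statistics import mean


def pre_post_difference(series, event_year, window=10, buffer=5):
    """Mean outcome after the event minus mean outcome before it.

    `series` maps year -> outcome. Years within `buffer` of the event are
    skipped so that temporary effects of the rupture itself are left out.
    """
    pre = [v for y, v in series.items()
           if event_year - buffer - window <= y < event_year - buffer]
    post = [v for y, v in series.items()
            if event_year + buffer < y <= event_year + buffer + window]
    if not pre or not post:
        raise ValueError("no observations on one side of the event")
    return mean(post) - mean(pre)


def trend_adjusted_difference(territory, reference, event_year, window=10, buffer=5):
    """Subtract the change seen in a reference series over the same years."""
    return (pre_post_difference(territory, event_year, window, buffer)
            - pre_post_difference(reference, event_year, window, buffer))


# Invented data: a territory independent in 1960 and a regional reference series.
years = range(1940, 1981)
territory = {y: (10.0 if y < 1960 else 14.0) for y in years}
region = {y: (10.0 if y < 1960 else 12.0) for y in years}

raw_change = pre_post_difference(territory, 1960)
adjusted_change = trend_adjusted_difference(territory, region, 1960)
print(raw_change, adjusted_change)
```

With a buffer of five years on each side, the decade surrounding independence is ignored, in line with the alternative mentioned in the parenthetical above; a territory with no observations on one side of the divide raises an error and is simply left out of this kind of comparison.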
Advancing Historical Research in the Social Sciences

Social science theories increasingly recognize that historical events and processes (i.e., events lying in the distant past) are critical for the explanation of contemporary outcomes. They are also increasingly inclined to seek global answers for significant questions and problems. Global history is here to stay. Indeed, it is increasingly recognized that most of the various processes attendant upon “globalization” are by no means novel to the twentieth century.

However, researchers in the social sciences are often unable to devise adequate empirical tests for propositions rooted in the distant past. This is not primarily the fault of the researchers. The problem is that data for prior historical periods is limited, and that which exists is often suspect, requiring a deep and nuanced knowledge of a particular time-period and region. Evidence constructed in this painstaking fashion seems to resist all but the most anodyne generalizations and hence provides rather unpropitious ground for social-scientific theory-making.

We are well aware that data recovery for the fifteenth century will never match the quality and quantity of data available for the twentieth century. Yet, we are equally convinced that much more can be done to collect the data that lies out there already (in the form of secondary sources and specialized datasets), to provide new codings of substantively important topics, and to make this information more widely available to scholars. We regard this project as an important step in this direction, and one that will greatly enhance the ability of social scientists to test the propositions suggested by their increasingly historically-oriented theories.
In much the same way that Polity IV, Correlates of War, and the World Development Indicators now function as standard references for the study of nation-states in the modern era, we anticipate that this new dataset may serve as the leading source of quantitative and qualitative information for those investigating periods prior to the nineteenth century – a jumping-off place for research on all conceivable topics.

Plan of Action

The project is funded for three years, beginning in summer 2007. Over this period, we anticipate three phases of activity. In the first phase, we plan to incorporate all existing historical data relevant to colonialism and development that is relatively easy to collect, e.g., information that can be drawn from existing datasets or printed sources. In the second phase, we will begin coding original data for additional variables, or adding additional data to provide more complete or more reliable coverage for existing variables. Decisions on which topics (i.e., variables) to address will be based on three general criteria: a) ease of collection, b) data reliability, and c) expected theoretical yield. In the third and final phase we will aggregate the raw data into a series of aggregate variables and sub-set datasets, as described above.

A project this expansive has no definitive point of completion. No matter how long we labor, there will always remain significant shortcomings – in data coverage, data reliability, and theoretical scope. This is true, naturally, of all projects. But it is particularly true of a project that aims to discover patterns on a global scale and in the distant past. Our long-term objective, therefore, is to ensure that this project will be maintained – amended, emended, perhaps even fundamentally reconceptualized – into the future. Just as other datasets (WDI, PWT, State Failure, Polity) have endured, so, we imagine, the Colonial Legacies dataset might endure.
To this end, we hope to create a community of scholars who are sufficiently committed to the project that they will lend their expertise, and their time, to ensure its future.

REFERENCES

Abernethy, David B. 2000. The Dynamics of Global Dominance: European Overseas Empires, 1415-1980. New Haven: Yale University Press.
Acemoglu, Daron; Simon Johnson; James A. Robinson. 2001. “The Colonial Origins of Comparative Development: An Empirical Investigation.” American Economic Review 91, 1369-1401.
Acemoglu, Daron; Simon Johnson; James A. Robinson. 2002. “Reversal of Fortune: Geography and Institutions in the Making of the Modern World Income Distribution.” Quarterly Journal of Economics 117, 1231-94.
Alam, M. Shahid. 2000. Poverty from the Wealth of Nations: Integration and Polarization in the Global Economy since 1760. Houndmills, Basingstoke: Macmillan.
Armitage, David. 2000. The Ideological Origins of the British Empire. Cambridge: Cambridge University Press.
Bairoch, Paul. 1988. Cities and Economic Development: From the Dawn of History to the Present. Chicago: University of Chicago Press.
Banks, Arthur S. 1994. “Cross-National Time-Series Data Archive.” Center for Social Analysis, State University of New York at Binghamton. Binghamton, New York.
Beach, Harlan P.; Charles H. Fahs (eds). 1925. World Missionary Atlas. New York: Institute of Social and Religious Research.
Benjamin, Thomas (ed). 2006. Encyclopedia of Western Colonialism since 1450. New York: Macmillan.
Bernhard, Michael; Christopher Reenock; Timothy Nordstrom. 2004. “The Legacy of Western Overseas Colonialism on Democratic Survival.” International Studies Quarterly 48, 225-50.
Bogin, Barry. 1988. Patterns of Human Growth. Cambridge: Cambridge University Press.
Bogin, Barry; R. Keep. 1999. “Eight Thousand Years of Economic and Political History in Latin America Revealed by Anthropometry.” Annals of Human Biology 26:4, 333-51.
Bollen, Kenneth A. 1979.
“Political Democracy and the Timing of Development.” American Sociological Review 44: 572-87.
Bollen, Kenneth A.; Robert W. Jackman. 1985. “Political Democracy and the Size Distribution of Income.” American Sociological Review 54: 612-21.
Boswell, Terry. 1989. “Colonial Empires and the Capitalist World-Economy: A Time Series Analysis of Colonization, 1640-1960.” American Sociological Review 54:2 (April) 180-96.
Brown, David S. 2000. “Democracy, Colonization, and Human Capital in Sub-Saharan Africa.” Studies in Comparative International Development 35:1, 20-40.
Brown, Michael E. 1997. “The Impact of Government Policies on Ethnic Relations.” In Michael E. Brown and Sumit Ganguly (eds), Government Policies and Ethnic Relations in Asia and the Pacific (Cambridge: MIT Press).
Carlson, W. Bernard. 2005. Technology in World History, 7 vols. Oxford: Oxford University Press.
Carneiro, Robert L. 1970. “A Theory of the Origin of the State.” Science 169, 733-38.
Chamberlain, Muriel Evelyn (ed). 1998. The Longman Companion to European Decolonisation in the Twentieth Century. Addison-Wesley.
Chandler, Tertius. 1987. Four Thousand Years of Urban Growth: An Historical Census. Lewiston, NY: St. David’s University Press.
Chase-Dunn, Christopher; Thomas Reifer. 2003. The Social Foundations of Global Conflict and Cooperation: Waves of Globalization and Global Elite Integration Since 1840. NSF Grant, University of California-Riverside.
Childe, Gordon V. 1950. “The Urban Revolution.” Town Planning Review 21:1, 3-17.
Clark, Grover. 1936. The Balance Sheets of Imperialism: Facts and Figures on Colonies. New York: Columbia University Press.
Cohen, Ronald; Elman R. Service (eds). 1978. Origins of the State: The Anthropology of Political Evolution. Philadelphia: Institute for the Study of Human Issues.
Curtin, Philip D. 1989. Death by Migration: Europe’s Encounter with the Tropical World in the Nineteenth Century. Cambridge: Cambridge University Press.
Diamond, Jared. 1997.
Guns, Germs, and Steel: The Fates of Human Societies. New York: Norton.
Eggimann, Gilbert. 1999. La Population des villes des Tiers-Mondes, 1500-1950. Geneva: Centre d’histoire economique Internationale de l’Universite de Geneve, Librairie Droz.
Eisenstadt, S.N.; Stein Rokkan (eds). 1973. Building Nations and States: Models and Data Resources, vol. 1. Beverly Hills, CA: Sage.
Englebert, Pierre. 2000. State Legitimacy and Development in Africa. Boulder: Lynne Rienner.
Esherick, Joseph W.; Hasan Kayali; Eric van Young (eds). 2006. Empire to Nation: Historical Perspectives on the Making of the Modern World. Lanham, MD: Rowman & Littlefield.
Etemad, Bouda. 2000. Possession du monde: Poids et mesures de la colonisation (XVIIIe-XXe Siecles). Bruxelles: Editions complexes.
Eveleth, Phyllis B.; James M. Tanner. 1991. Worldwide Variation in Human Growth, 2d ed. Cambridge: Cambridge University Press.
Fieldhouse, D.K. 1966. The Colonial Empires: A Comparative Study from the Eighteenth Century. London: Macmillan.
Frankema, Ewout. 2006. “The Colonial Origins of Inequality: Exploring the Causes and Consequences of Land Distribution.” Research Memorandum GD-81, Groningen Growth and Development Centre.
Gerring, John; Strom Thacker; Carola Moreno. 2005. “Centripetal Democratic Governance: A Theory and Global Inquiry.” American Political Science Review 99:4 (November) 567-81.
Goldstone, Jack A. et al. 2000. “State Failure Task Force Report: Phase III Findings.” [Available at http://www.cidcm.umd.edu/inscr/stfail/SFTF%20Phase%20III%20Report%20Final.pdf]
Grier, Robin M. 1999. “Colonial Legacies and Economic Growth.” Public Choice 98: 317-35.
Hailey, Lord. 1945. An African Survey: A Study of Problems Arising in Africa South of the Sahara. London: Oxford University Press.
Hailey, Lord. 1979. Native Administration in the British African Territories, 5 vols. Colonial Office.
Henige, David P. 1970. Colonial Governors from the Fifteenth Century to the Present.
Madison, WI: University of Wisconsin Press.
Hensel, Paul. [various years]. “ICOW Colonial History Data Set.” http://garnet.acns.fsu.edu/~phensel/
Herbst, Jeffrey. 2000. States and Power in Africa: Comparative Lessons in Authority and Control. Princeton: Princeton University Press.
Heston, Alan; Robert Summers. 1991. “The Penn World Table (Mark 5): An Expanded Set of International Comparisons, 1950-1988.” Quarterly Journal of Economics (May) 327-68.
Jacobson, Harold K. 1968. “United Nations and Colonialism, 1946-1967.” ICPSR Study No. 5513.
Jones, Eric L. 1981. The European Miracle: Environments, Economics and Geopolitics in the History of Europe and Asia, 2d ed. Cambridge: Cambridge University Press.
King, Gary; James Honaker; Anne Joseph; Kenneth Scheve. 2001. “Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation.” American Political Science Review 95:1 (March) 49-69.
Kittel, Bernhard. 2006. “A Crazy Methodology?: On the Limits of Macroquantitative Social Science Research.” International Sociology 21, 647-77.
Kohli, Atul. 2004. State-Directed Development: Political Power and Industrialization in the Global Periphery. Cambridge: Cambridge University Press.
Komlos, John; Joerg Baten (eds). 1998. The Biological Standard of Living in Comparative Perspective. Stuttgart: Franz Steiner Verlag.
Kuczynski, R.R. 1948. Demographic Survey of the British Colonial Empire, vol. 1. London: Oxford University Press.
Kuczynski, R.R. 1949. Demographic Survey of the British Colonial Empire, vol. 2. London: Oxford University Press.
Kuczynski, R.R. 1953. Demographic Survey of the British Colonial Empire, vol. 3. London: Oxford University Press.
Kurzman, Charles; Erin Leahey. 2004. “Intellectuals and Democratization, 1904-1912 and 1988-1996.” American Journal of Sociology 109:4.
Lange, Matthew. 2003. “The British Colonial Lineages of Despotism and Development.” Dissertation, Department of Sociology, Brown University.
La Porta, Rafael; Florencio Lopez-de-Silanes; Andrei Shleifer; Robert W. Vishny. 1998. “Law and Finance.” Journal of Political Economy 106:6.
La Porta, Rafael; Florencio Lopez-de-Silanes; Andrei Shleifer; Robert W. Vishny. 1999. “The Quality of Government.” Journal of Law, Economics, and Organization 15:1, 222-79.
Lipset, S. M.; K. Seong; J. C. Torres. 1993. “A Comparative Analysis of the Social Requisites of Democracy.” International Social Science Journal 136: 155-75.
Macfarlane, Alan. 1978. The Origins of English Individualism: The Family, Property, and Social Transition. Cambridge: Cambridge University Press.
Maddison, Angus. 2001. The World Economy: A Millennial Perspective. Paris: OECD.
Mahoney, James. 2003. “Long-Run Development and the Legacy of Colonialism in Spanish America.” American Journal of Sociology 109:1.
Mamdani, Mahmood. 1996. Citizen and Subject: Decentralized Despotism and the Legacy of Late Colonialism. Oxford: Oxford University Press.
Manning, Patrick. 1982. Slavery, Colonialism and Economic Growth in Dahomey, 1640-1960. Cambridge: Cambridge University Press.
Marshall, Monty G.; Keith Jaggers. 2002. “Polity IV Project: Political Regime Characteristics and Transitions, 1800-2002.” Manuscript. [Available at http://www.cidcm.umd.edu/inscr/polity/]
Masters, William A.; Margaret S. McMillan. 2000. “Climate and Scale in Economic Growth.” Journal of Economic Growth 6, 167-86.
McEvedy, Colin; Richard Jones. 1978. Atlas of World Population History. New York: Facts on File.
Mitchell, Brian R. 2003a. International Historical Statistics: Africa, Asia and Oceania, 1750-1993, 3d ed. London: Macmillan.
Mitchell, Brian R. 2003b. International Historical Statistics: The Americas, 1750-2000, 5th ed. London: Macmillan.
Mitchell, Brian R. 2003c. International Historical Statistics: Europe, 1750-1993, 4th ed. London: Macmillan.
Murdock, George P. 1967. Ethnographic Atlas. Pittsburgh: University of Pittsburgh Press.
Modelski, George. 2000.
World Cities: -300 to 2000. Washington: Faros.
Munck, Gerardo L.; Jay Verkuilen. 2002. “Conceptualizing and Measuring Democracy: Evaluating Alternative Indices.” Comparative Political Studies 35:1 (Feb) 5-34.
North, Douglass C. 1981. Structure and Change in Economic History. New York: Norton.
Olsson, Ola; Douglas A. Hibbs, Jr. 2000. “Biogeography and Long-Run Economic Development.” Working Papers in Economics No. 26 (August), Department of Economics, Goteborg University.
Pearson, David. 2001. The Politics of Ethnicity in Settler Societies. Basingstoke: Palgrave.
Pierson, Paul; Theda Skocpol. 2002. “Historical Institutionalism in Contemporary Political Science.” In Ira Katznelson, Helen V. Milner (eds), Political Science: The State of the Discipline (New York: W. W. Norton) 693-721.
Posner, Daniel. 1999. “The Colonial Origins of Ethnic Cleavages: The Case of Linguistic Divisions in Zambia,” presented at the Spring 2001 Meeting of the Laboratory in Comparative Ethnic Processes (LiCEP), Harvard University, 23 March 2001; and the James S. Coleman African Studies Center Seminar, UCLA.
Posner, Daniel. 2000. “Measuring Ethnic Identities and Attitudes Regarding Inter-group Relations: Methodological Pitfalls and a New Technique,” presented at the Fall 2000 Meeting of the Laboratory in Comparative Ethnic Processes (LiCEP), University of Pennsylvania.
Putterman, Louis. 2003. “State Antiquity Index (‘Statehist’), Version 2.” See http://www.econ.brown.edu/fac/Louis_Putterman/
Rodrik, Dani. 2005. “Why We Learn Nothing from Regressing Economic Growth on Policies.” Ms.
Sachs, Jeffrey D.; Andrew Warner. 1997. “Fundamental Sources of Long-run Growth.” American Economic Review 87:2, 184-8.
Singer, J. David; Paul Diehl (eds). 1990. Measuring the Correlates of War. Ann Arbor: University of Michigan Press.
Sokoloff, Kenneth L.; Stanley L. Engerman. 2000. “Institutions, Factor Endowments, and Paths of Development in the New World.” Journal of Economic Perspectives 14:3 (Summer) 217-32.
Steckel, Richard H. 1995. “Stature and the Standard of Living.” Journal of Economic Literature 33 (December) 1903-40.
Steckel, Richard H.; Jerome C. Rose (eds). 2002. The Backbone of History. Cambridge: Cambridge University Press.
Steckel, Richard H.; Roderick Floud (eds). 1997. Health and Welfare during Industrialization. Chicago: University of Chicago Press.
Stewart, John. 1996. The British Empire: An Encyclopedia of the Crown's Holdings, 1493 through 1995. McFarland & Co.
Stewart, John. 1999. African States and Rulers, 2d ed. Jefferson, NC: McFarland & Company.
Strang, David. 1990. “From Dependency to Sovereignty: An Event History Analysis of Decolonization, 1870-1987.” American Sociological Review 55 (December) 846-60.
Strang, David. 1991. “Global Patterns of Decolonization, 1500-1987.” International Studies Quarterly 35:4 (December) 429-54.
Thorp, Rosemary. 1998. Progress, Poverty, and Exclusion: An Economic History of Latin America in the 20th Century. New York: Inter-American Development Bank.
United Nations Educational, Scientific and Cultural Organization. 1957. World Illiteracy at Mid-Century, a Statistical Study. Paris: UNESCO.
Weiner, Myron. 1987. “Empirical Democratic Theory,” in Competitive Elections in Developing Countries, edited by M. Weiner and E. Ozbudun, pp. 3-34. Durham: Duke University Press.
Wilkinson, Steven I. Forthcoming. Colonization, Institutions and Conflict. Book manuscript.
Woodberry, Robert D. 2004. “The Shadow of Empire: Christian Missions, Colonial Policy, and Democracy in Postcolonial Societies.” PhD Dissertation, Department of Sociology, UNC-Chapel Hill.
World Bank. 2003. World Development Indicators 2003. Washington, DC: International Bank for Reconstruction and Development.
Young, Crawford. 1994. The African Colonial State in Comparative Perspective. New Haven: Yale University Press.
Appendix: VARIABLES AND SOURCES
Sources listed below refer to primary sources of data (i.e., published work or publicly available datasets) or, in some cases, to works that offer discussions or examples of the hypothesis that a variable represents. In addition, we wish to acknowledge certain general sources, of use for a wide range of indicators. These include: Beach & Fahs (1925), Benjamin (2006), Boswell (1989), Carlson (2005), Chase-Dunn & Reifer (2003), Clark (1936), Correlates of War dataset (Singer & Diehl 1990), Cross-National Time-Series Data Archive (Banks 1994), Eisenstadt & Rokkan (1973), Etemad (2000), Frankema (2006), Hailey (1945), Henige (1970; see dataset employed in Strang 1990), Hensel (various years), Jacobson (1968), Kuczynski (1953), Mitchell (1998a, 1998b, 1998c), Penn World Tables (Heston & Summers 1991), Polity IV dataset (Marshall & Jaggers 2002), State Failure Task Force dataset (Goldstone et al. 2000), Statesman’s Yearbook (various years), Stewart (1996, 1999), Wilkinson (forthcoming), Woodberry (2004), World Development Indicators (“WDI” [various years]). These will be carefully culled for additional data.
For each coding, it will be necessary to assign a spatial unit – empire, country, colony, region, city, and so forth. This should follow the designation of the original source as closely as possible, unless there are reasons to assign a different coding (e.g., to retain consistent usage in the meaning of a place-name or to conform to a more reliable spatial designation than is contained in the original source).
The principal temporal unit of analysis is the territory-year. However, more precise dates (e.g., for an election) should also be noted, wherever available. Note that most questions can be applied to all regions, while some pertain only to colonies.
I. Colonial rule
Primary variables
Arrival of first explorers. Coding: date.
Arrival of first missionaries. Coding: date.
Arrival of first traders. Coding: date.
Arrival of first permanent officials from the metropole. Coding: date.
Colonial status. Coding: 1) sovereign territory, 2) partial colony, 3) full colony. Sources: Abernethy (2000: appendix), Clark (1936), Putterman (2003), Strang (1991).
Identity of colonizer. Coding: string. Includes countries that are coded as colony or partial colony. Sources: Clark (1936), Strang (1990, 1991).
Distance from metropole. Applicable only with reference to colonies. Coding: miles from colonial capital.
Travel time from metropole. Applicable only with reference to colonies. Coding: shortest time required to travel from the metropole to the colony.
Distance from trade centers. A measure of peripheral status relative to the cores of the world economy in the modern era. Coding: miles from London or Japan, whichever is closer (from the country’s center).
Type of military presence. Coding: 1) no foreign military presence except that which may be voluntarily negotiated (e.g., NATO troops in Europe), 2) temporary interventions (or threat of same), 3) permanent stationing of foreign troops.
Number of permanent military. Coding: number of troops from the metropole (or in the service of the metropole).
Status within the empire. Coding: 1) colonial center, 2) semi-periphery, 3) periphery. Sources: Mahoney (2003).
Settler mortality. Sources: Acemoglu et al. (2001, 2002), Curtin (1989).
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
R.R. Kuczynski, Colonial Population, 1937. Re: Global, 1900-1937. Location: Mugar HB 885 K8 1969
Colonials as percent of native population. Sources: Clark (1936), Kuczynski (1953).
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
R.R. Kuczynski, Colonial Population, 1937. Re: Global, 1900-1937. Location: Mugar HB 885 K8 1969
Europeans/Japanese as percent of population. Sources: Acemoglu et al. (2002), Kuczynski (1953), McEvedy and Jones (1978).
R.R. Kuczynski, Colonial Population, 1937. Re: Global, 1900-1937. Location: Mugar HB 885 K8 1969
Territory under foreign control. Coding: percent of territory under foreign rule. Sources: Putterman (2003).
Legal penetration. Number of customary court cases/total number of court cases/total population at independence. Sources: Lange (2003).
Open door treaty. Coding: 1) no open door treaty, 2) open door treaty.
Theory of colonial rule. Refers to the ideology of colonialism, not to facts on the ground. Coding: 1) direct, 2) mixed, 3) indirect.
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
Practice of colonial rule. Refers to facts on the ground, as near as they can be ascertained, rather than the official dogma professed by leaders of the colonial power. Coding: percent of country ruled directly (i.e., under the direct administrative authority of the metropole or those appointed and responsible to the metropole) and indirectly.
Colonial administrators. Coding: number of colonial administrators. Sources: British Blue Books.
Colonial administrators (%). Coding: number of colonial administrators as percent of total administration (see below).
Profits provided by colony. Net profits/losses from all sources.
Cost of colony. Costs, including infrastructure, military, social policies, and administration.
Net cost/profit of colony. Profits minus costs.
Independence movement. The “strength” of an independence movement understood in terms of the size and coherence of the formal organization, the size of its (passive) support among the populace, its commitment to independence as a goal, and actions taken to achieve that end. Coding: 1) weak, 2) medium-strength, 3) strong.
Achieving independence. Coding: 1) won by force of arms, 2) won by some combination of force and politics, 3) granted freely by the colonizer without military threat.
Colonial taxation. A measure of the proportion of colonial revenues derived from “native” hut, poll, and income taxes.
Agricultural policy and institutions. 1) A measure of the export tax on primary commodities of the colony. 2) An indicator noting the presence of government monopsony over a primary export commodity.
“Extractiveness” of colonial activity. The value of tax revenues and exports minus administrative expenditures.
Colonial land policy. A measure of colonial policy towards indigenous land tenure systems.
Colonial regulation of indigenous political activity. A set of variables recording colonial policy on native representation and political organization.
Secondary variables
Level of colonialism. Discussed in text. [Fill in details here.]
Type of colonialism. Coding: 1) direct settler, 2) indirect settler, 3) direct non-settler, 4) indirect non-settler. For a slightly more disaggregated typology see Fieldhouse (1966; reported in Abernethy 2000: 55-6).
Colonial economic activity (predominant type). Coding: 1) extractive, 2) mixed, 3) extensive. [Is this codable?]
II. Politics
Primary variables
Predominant political unit. Coding: 1) bands, clans, or tribes (nothing above the tribal level), 2) kingdom (hereditary), 3) empire, 4) nation-state. Sources: Cohen and Service (1978), Putterman (2003).
Democracy (Polity). The Polity IV dataset codes sovereign countries with populations of at least half a million on a twenty-one point scale from 1800 to the present. Sources: the “Polity2” variable (Polity IV).
Elections. Coding: 1) no election, 2) an election. Sources: Wilkinson (forthcoming).
Type of election. Coding: 1) metropole (election in colony for an office in the metropole), 2) national or colonywide, 3) subnational (regional or local).
Suffrage. Coding: constitutional rule or norm, e.g., white males only or citizens of the metropole only.
Suffrage (%). Coding: percent of permanent residents who are permitted to vote.
Turnout. Coding: percent of permanent residents who vote.
Decentralization. Autonomy of local political units relative to central authority. Coding: 1) low, 2) moderate, 3) high. [Note: this variable needs more work; not sure how to clarify.]
Monopoly of physical force. Concerns the degree to which the “official” political unit said to be in charge of a territory manages (by force or persuasion) to suppress other conflicts. Specifically, are there civil wars, regions of self-proclaimed autonomy, or areas where government officials venture with trepidation? Coding: 1) very little control over the use of physical force, 2) partial monopoly of physical force, 3) total monopoly of physical force.
Revenue. Coding: central government tax revenue as share of GDP or (prior to the availability of GDP measures) population. Sources: Mitchell (1998a, 1998b, 1998c), WDI.
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
Expenditure. Coding: central government expenditure as share of GDP or (prior to the availability of GDP measures) population. Sources: Mitchell (1998a, 1998b, 1998c), WDI.
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
Bureaucracy. Size of bureaucracy measured as the number of persons in regular employ, military and nonmilitary.
Bureaucracy (non-military). Size of bureaucracy measured as the number of persons in regular employ, nonmilitary only.
Indigenization of post-colonial bureaucracy. Coding: 1) gradual, 2) moderate, 3) fast and thorough.
Military (number). Coding: size of military as share of population.
Military (character). Coding: 1) separate from colonizers’ armed forces, 2) many, incorporated into colonizers’ armed forces.
Legal system. Coding: 1) common law, 2) mixed, 3) civil law, 4) indigenous system of law, 5) no formal system of law.
III. Geography
Primary variables
Total land area. Coding: square kilometers. Sources: WDI.
Latitude. Coding: absolute value of latitude (natural logarithm). Sources: La Porta et al. (1998).
Continental axis. Coding: the distance in longitudinal degrees between the eastern and westernmost points of each continent, divided by the distance in latitudinal degrees between the northernmost and southernmost points. A value of 2, for instance, indicates that the landmass in question is about two times more east-west oriented than north-south oriented. Sources: Diamond (1997), Olsson and Hibbs (2000).
Isolation. Coding: 1) on a north-south landmass (North and South America, Africa) or isolated island (e.g., Australia), 2) on an east-west landmass (Asia). Sources: Diamond (1997), Olsson and Hibbs (2000).
Frost. Coding: proportion of land with more than 5 frost-days per month in winter. Sources: Masters and McMillan (2000).
Climate. Coding: 1) unfavorable to agriculture, 2) moderately favorable, 3) favorable to agriculture. Sources: Olsson and Hibbs (2000).
Mineral resources. Judged according to their availability, ease of extraction, and recognized value at the time. Coding: 1) poor, 2) moderate, 3) plentiful.
Natural infrastructure. Includes harbors, waterways, sea access, and terrain suitable for roads. Coding: 1) poor, 2) moderate, 3) plentiful.
Secondary variables
Border consistency. Coding: how many years has the current border been maintained (with only minor revisions)?
Homeland duration. Coding: how many years has the territory of the current state been together in the same unit?
IV. Economics and Demography
Primary variables
GDP per capita. Sources: Maddison (2001), Mitchell (1998a, 1998b, 1998c), Thorp (1998), WDI.
Trade routes. Coding: list any direct trade contacts, i.e., the city or state which is directly linked by a trade route (a trade route is the route taken by an overland caravan, ship, or plane, and may include several stops).
Trade travel time. Coding: list travel time in number of days to each direct trade contact.
Trade openness (quantitative). Refers to foreign trade, i.e., trade outside the official borders of a political unit (whether a sovereign state or colony). Trade between the metropole and its colony is understood as foreign trade for both units. Coding: imports and exports as share of GDP. Sources: Clark (1936), Mitchell (1998a, 1998b, 1998c), WDI.
Export revenue. Coding: export revenue per capita. Sources: Manning (1982: 4).
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
Trade openness (qualitative). Coding: 1) all foreign trade expressly forbidden and strictly enforced, 2) trade permitted only through a few entrepot trading centers and strictly limited, 3) trade within the confines of the empire, strictly enforced, 4) trade within the confines of an empire, not strictly enforced, 5) trade with all parties allowed.
Currency. Coding: 1) no currency of any sort, 2) primitive monies (not used as units of account and not fully fungible; restricted to special purposes), 3) a variety of currencies that compete with one another, 4) a uniform currency enforced by national authorities.
Roads – Length. Coding: length of all major highways (kilometers). Sources: Herbst (2000: 84-), WDI.
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
Railroads – Length. Coding: length of railway line open (kilometers). Sources: Mitchell (1998a, 1998b, 1998c).
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
Railroads – Freight. Coding: freight traffic on railways (thousand metric tons). Sources: Mitchell (1998a, 1998b, 1998c).
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
Railroads – Passenger traffic. Coding: passenger traffic on railways (thousands). Sources: Mitchell (1998a, 1998b, 1998c).
Connectedness. Coding: % of population who are one day’s journey away or less (using usual means of transport) from a navigable river, airport, or ocean port.
Postal traffic. Coding: mail items (millions). Sources: Mitchell (1998a, 1998b, 1998c).
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.8
Telegraph traffic. Coding: telegrams (millions). Sources: Mitchell (1998a, 1998b, 1998c).
Telephones. Coding: telephones in use (Mitchell) or telephone mainlines (telephone lines connecting a customer’s equipment to the public switched telephone network per 1,000 people; WDI). Sources: Mitchell (1998a, 1998b, 1998c), WDI.
Waterways. Percent of territory reachable by navigable waterway. [Clarification needed.]
Property rights. Coding: 1) communal property or no property rights, 2) mix of communal and private property but not all property is tradable, 3) private property is recognized and freely tradable.
Labor force, Subsistence. Coding: percent of labor force engaged in hunting, gathering and planting for consumption only (non-commercial).
Labor force, Plantation. Coding: percent of labor force engaged in plantation agriculture (commercial).
Labor force, Agriculture. Coding: percent of labor force engaged in agriculture (any variety).
Agriculture as % of GDP. Sources: Mitchell (1998a, 1998b, 1998c), WDI.
Mineral extraction. Coding: percent of formal economy (GDP or equivalent) derived from mineral extraction.
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
Population. Sources: Clark (1936), Kuczynski (1953), Maddison (2001), McEvedy and Jones (1978), Mitchell (1998a, 1998b, 1998c), WDI.
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
R.R. Kuczynski, Colonial Population, 1937. Re: Global, 1900-1937. Location: Mugar HB 885 K8 1969
R.R. Kuczynski, The Cameroons and Togoland: A Demographic Study, 1939. “Collection of population statistics and to the demographic situation of an African area from the beginning of its colonization up to the present time.”
James D. Tarver, The Demography of Africa, 1996. Re: Africa, 1950-1996 (AD14-1996). Location: Widener HB 3661 A3 T37 1996
Urbanization. Coding: percent of population living in urban areas. Sources: Acemoglu et al. (2002a), Bairoch (1988), Eggimann (1999), Maddison (2001), Mitchell (1998a, 1998b, 1998c), WDI.
Nomadic. Coding: percent of population who are not sedentary.
Population density. Sources: Kuczynski (1953), WDI.
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
G.B. Kay, Political Economy of Colonialism in Ghana, a Collection of Documents and Statistics 1900-1960, 1972. Re: Ghana, 1900-1960. Location: Widener Afr 6193.82
City population. Main sources: Chandler (1987), Modelski (2000). Additional sources: Maddison (2001), WDI.
Largest city. Largest city within larger unit (e.g., nation-state, colony, empire). Derived from City population variable (above).
Compactness of human settlement. Coding: 1) dispersed (multiple population centers), 2) moderately dispersed, 3) compact (a single population center). Sources: Englebert (2000).
Ethnic cleansing. Forcible displacement, either within the country or through emigration, of population groups equaling at least 3% of the national population. Coding: 1) none, 2) ethnic cleansing in progress.
V. Human Development
Primary variables
Stature. Discussed in text. Sources: Bogin (1988), Bogin and Keep (1999), Eveleth and Tanner (1991), Komlos and Baten (1998), Steckel (1995), Steckel and Rose (2002), Steckel and Floud (1997).
Birth rate. Sources: Kuczynski (1953), Maddison (2001), Mitchell (1998a, 1998b, 1998c), WDI.
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
James D. Tarver, The Demography of Africa, 1996. Re: Africa, 1950-1996 (AD14-1996). Location: Widener HB 3661 A3 T37 1996
Infant mortality. Sources: Kuczynski (1953), Mitchell (1998a, 1998b, 1998c), WDI.
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
R.R. Kuczynski, Colonial Population, 1937. Re: Global, 1900-1937. Location: Mugar HB 885 K8 1969
James D. Tarver, The Demography of Africa, 1996. Re: Africa, 1950-1996 (AD14-1996). Location: Widener HB 3661 A3 T37 1996
Mortality rate (crude). Sources: Kuczynski (1953), Maddison (2001), Mitchell (1998a, 1998b, 1998c), WDI.
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
R.R. Kuczynski, Colonial Population, 1937. Re: Global, 1900-1937. Location: Mugar HB 885 K8 1969
James D. Tarver, The Demography of Africa, 1996. Re: Africa, 1950-1996 (AD14-1996). Location: Widener HB 3661 A3 T37 1996
Life expectancy. Sources: Maddison (2001), WDI.
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
Literacy. Coding: the percentage of people aged 15+ who can, with understanding, both read and write a short, simple statement on their everyday life. (Coded as zero if there is no generally recognized written language.) Sources: Eisenstadt and Rokkan (1973: 245-47), United Nations Educational, Scientific and Cultural Organization (1957), WDI.
Schooling – Primary and Secondary. Coding: the number of children in school as percent of school-age population (or, where unavailable, of general population). Sources: Mitchell (1998a, 1998b, 1998c), PWT, WDI.
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
Schooling – University. Coding: the number of university students as percent of total population. Sources: Mitchell (1998a, 1998b, 1998c), PWT, WDI.
Schooling – Overseas. Coding: the number of persons educated (for some period of time) in a university located in a metropole. Sources: Kurzman & Leahey (2004).
Schooling – system type. Coding: public secular (%), public denominational (%), private secular (%), private denominational (%).
Schooling – language of instruction. Coding: language of instruction in each of the following: public schools, private secular schools, private denominational schools.
VI. Society
Primary variables
Colonial genocide. Percent of indigenous population felled by disease or other causes as a direct result of colonizer’s presence during the initial colonial encounter.
Inter-breeding. Inter-breeding between colonials and indigenes. Coding: 1) low, 2) medium, 3) high. Sources: Kuczynski (1953).
Missionary religion(s). Coding: list each missionary group and, if available, the approximate number of missionaries that they had in the field. Sources: Beach & Fahs (1925), Woodberry (2004).
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
Religion (members). Coding: list all major religions and the approximate percent of the population who adhered to, or were born into, each.
Bruce Fetter, Demography from Scanty Evidence, Central Africa in the Colonial Era, 1990. Re: Africa, 1800s-1970. Location: Widener HB 3664.3 A3 D46 1990
Religion (geography). Coding: show all major religions and the territory in which their adherents lived.
Religious homogeneity. Percent of population in a given territory who are members of, or were born into, the largest religion. Sources: Posner (1999, 2000).
Language. Coding: list all major languages and the approximate percent of the population who are speakers of each. Percentages may sum to more than 100, since some people speak more than one language.
Language (geography). Coding: show all major languages and the territory in which their speakers lived.
“First” language. Coding: list all major “first” (native) languages and the approximate percent of the population who are speakers of each.
Linguistic homogeneity. Coding: percent who are fluent in the most commonly spoken tongue (even if not a ‘first’ language for most users).
Linguistic distance. Coding: distance between major “first” languages, as understood by linguists.
Ethnicity (geography). Coding: show all major ethnicities and the territory in which their members lived.
Cultural composition. Culture is understood here in the most encompassing sense, i.e., including whatever racial, religious, linguistic, and ethnic cues help to distinguish ‘us’ from ‘them’ in a given context. (Sometimes the term “ethnic” is used in this sense.) Coding: 1) unipolar (more than 90% of the population belong to a single cultural group), 2) bipolar (more than 90% of the population belong to two main cultural groups), 3) multipolar (no two cultural groups together comprise 90% of the population). Sources: Brown (1997), Kuczynski (1953), Posner (1999, 2000).
Cross-border cultural affinities. Coding: 1) main ethnic groups are not found in neighboring countries, 2) at least one major ethnic group has significant numbers in an adjacent country. Sources: Englebert (2000).
Cultural stratification. Concerns the degree of separateness characterizing the most salient cultural cleavages; the degree to which cultural divisions are invidious. Excludes consideration of very small groups (e.g., Jews in most European countries, the Burakumin in Japan). Coding: 1) low, 2) moderate, 3) high (e.g., caste or racial apartheid).
Cultural conflict. Level of conflict among non-colonial (indigenous) groups. Coding: 1) low, 2) moderate, 3) high.
Cultural geography. Coding: 1) none (no important cultural cleavage), 2) dispersed (cultural groups intermingle), 3) regionally concentrated (cultural groups do not intermingle). Sources: Brown (1997).
Cultural cleavages. Coding (narrative): describe how linguistic, ethnic, and religious cleavages overlap or reinforce.
Slavery. Coding: 1) no slavery or slave-trading, 2) extensive slave-trading, 3) extensive slave population.
Antonio McDaniel, Swing Low Sweet Chariot, the Mortality Cost of Colonizing Liberia in the 19th Century, 1995. Re: Liberia, 1800-1900. Location: Mugar HB 1497.2 A3 M35 1995
Economic equality. Coding: Gini coefficient of family income. Sources: WDI.
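A few of the coding rules stated in this appendix are mechanical enough to express directly. The following is a minimal Python sketch, not part of the project's own tooling, of three of them: the Cultural composition categories, the Continental axis ratio, and quantitative Trade openness. The function and argument names are hypothetical, and the treatment of a share of exactly 90% (which the coding rule leaves ambiguous) is an assumption made here for illustration.

```python
def cultural_composition(group_shares):
    """Return 1 (unipolar), 2 (bipolar), or 3 (multipolar).

    group_shares: percent of the population in each cultural group.
    Assumption: exactly 90% does not count as "more than 90%".
    """
    shares = sorted(group_shares, reverse=True)
    if shares and shares[0] > 90:
        return 1  # a single group exceeds 90% of the population
    if sum(shares[:2]) > 90:
        return 2  # the two largest groups together exceed 90%
    return 3      # no two groups together exceed 90%


def continental_axis(east_west_degrees, north_south_degrees):
    """East-west extent (longitudinal degrees) divided by
    north-south extent (latitudinal degrees)."""
    if north_south_degrees <= 0:
        raise ValueError("north-south extent must be positive")
    return east_west_degrees / north_south_degrees


def trade_openness(imports, exports, gdp):
    """Imports plus exports as a share of GDP (same currency units)."""
    if gdp <= 0:
        raise ValueError("GDP must be positive")
    return (imports + exports) / gdp


if __name__ == "__main__":
    # Illustrative, made-up inputs (not data from any source listed above).
    print(cultural_composition([60, 35, 5]))  # 2: two groups hold 95%
    print(continental_axis(100, 50))          # 2.0: twice as wide as tall
    print(trade_openness(20, 30, 200))        # 0.25
```

Keeping each rule in its own small function means a borderline case, such as two groups that together comprise exactly 90% of the population, can be resolved in one place once the coding decision is settled.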