STATISTICAL ANALYSES OF THE DISTRIBUTION OF CERAMICS AT KOM EL-HISN
Robert J. Wenke
As noted in the introduction to this chapter, the primary objectives in our analyses of the Kom
el-Hisn ceramics are to derive a relative seriation suitable for inference of a relative chronology,
so that we can address various kinds of change over time at Kom el-Hisn; and also, to define
spatial associations of ceramics and other artifacts in order to analyze various kinds of functional
variability of the Kom el-Hisn community, both as it existed at any one time and as it changed
over time.
Some of the statistical and other problems associated with these objectives have already been
discussed in this chapter, but it is worth repeating that the constant re-use of occupational debris
for building materials--particularly in the form of the numerous pottery sherds found in most
mudbricks--created at Kom el-Hisn an archaeological site in which there are repeated violations
of the "law of superposition," in the sense that the ceramic contents of a particular volume of
excavated materials cannot necessarily be assumed to be older than the materials on which they
lie. Other disturbances and problems mean that we shall have to use large samples and
conservative analytical techniques to search for general patterns in the spatial distribution of
artifacts at this site.
Stratigraphic and other difficulties aside, a more fundamental concern involves the analytical
units constructed from these ceramics and other artifacts. When confronted with a collection of
Egyptian pottery sherds or stone tools, most analysts have classified or grouped these objects on
the basis of principles and categories that have been in use for many decades. Not only are these
traditional taxonomic systems well-established in Egyptian archaeology, they have repeatedly
proved their usefulness in the standard archaeological tasks of description, relative dating, and
functional analysis. Petrie's pioneering efforts at seriating ceramics (1899) and Tixier's
systematization of lithics (1971) remain extremely influential in Egyptian artifact analyses.
Both Petrie's and Tixier's systems were created by arranging artifacts into groups of objects that
seemed similar to one another and dissimilar to other groups on the basis of gross characteristics
of size, shape, and "style." To a great extent, virtually all schemes for categorization and
analysis of Egyptian artifacts, even the most recent (e.g., Bourriau 1981, Arnold and Bietak in
press, Jacquet-Gordon ed. 1986) are logically similar to Petrie's and Tixier's methods.
Despite the widespread and productive use of these and other traditional methods of Egyptian
artifact typology, the procedures and assumptions on which they are based remain extremely
controversial. Indeed, numerous scholars have argued that the kinds of classificatory and
typological systems traditionally and currently applied to Egyptian artifacts are, at best,
incomplete (e.g., Read 1982, Whallon 1982), and at worst, incorrectly and inefficiently
formulated for their intended purposes (Dunnell 1971, 1986).
These negative evaluations are based on theoretical and methodological considerations that
appear only rarely in the Egyptian archaeological literature (but see Kemp 1977, Adams 1963,
Hassan 1980, Close 1977, 1980a, 1980b, Wendorf and Schild 1980: 8-11).
Instead, the emphasis in recent studies of Egyptian artifacts seems to be on: (1) chemical
analyses of artifacts (e.g., neutron activation analysis [David ed. 1986]); (2) increasingly precise
and technical descriptions of the composition of ceramic wares (Arnold and Bietak in press); and
(3) applications of multivariate statistical techniques to tabulations of traditional Egyptian
ceramic and lithic types (Wenke, Long, and Buck n.d., Close 1980a).
These technical studies and other approaches are useful. But they do not in and of themselves
meet the criticisms directed at traditional methods of Egyptian artifact classification and
typology.
To understand why the kinds of typologies applied to Egyptian artifacts have been judged
inadequate, it is necessary to consider the objectives of analyses and also the criteria by which
one approach can be said to be better than another.
Most archaeologists have in common some basic research objectives, in that the use of
artifacts to infer chronologies and to reconstruct the activities of ancient peoples may be said to
be common to all students of ancient material culture. But in the final analysis, many
anthropologically-trained archaeologists differ from some of the scholars working in Egypt in
that the former retain the aspiration of making archaeology a scientific discipline--at least in the
sense of a discipline that can explain history.
Currently, there is widespread disenchantment with the 1960s-era hopes of making
archaeology a formal predictive science based on the model of physics (e.g., Salmon 1982,
Dunnell 1984, Hodder 1986), but there also appears to be general agreement that, whatever kind
of discipline archaeology can become, significant improvement in its explanatory power must be
based on reconsiderations of how archaeologists formulate and use artifact-based analytical
units. Rejection of traditional anthropological and archaeological typologies has been
widespread. Generally, many specialists in problems of archaeological classification assume that
archaeology cannot progress substantially until several issues of method and theory with regard
to classification are resolved (Spaulding 1982, Dunnell 1986, Read 1982, Wenke 1981, 1987);
indeed, the issue of unit formation has become the center around which debates about the
possibilities of historical and cultural analysis revolve.
Some anthropologically-inclined archaeologists, in particular, entertain ideas of scientific unit
formation that are fundamentally at odds with the idea--which is the norm in Egyptian
studies--that archaeological analysis is in essence a historical and humanistic enterprise in which
one may use scientific methods (e.g., radiocarbon dating) but one's goal is fundamentally not
scientific--one's goal is the description of an historical process, and the explanation of that process is in
terms of events, personalities, etc., and the common-sense kinds of interpretations of historical
processes (Hawkes 1968). This is to be contrasted with the kinds of explanations of history
envisioned by anthropologically-trained archaeologists, such as Binford (1983), Flannery (1972),
Watson, Redman, and LeBlanc (1986), Dunnell (1984), and Salmon (1982).
It seems evident that archaeologists of many disparate theoretical persuasions have in common
some basic notions of using artifacts for these descriptive and comparative purposes, and for
inferring relative chronologies and reconstructing trade patterns, social arrangements, etc. Given
this, most archaeologists must address two questions: (1) how do we construct archaeological
analytical units, and (2) how can we evaluate these units' usefulness?
With regard to the first question, one's assumptions and theoretical framework obviously
determine the kinds of units one constructs. In archaeology we have only a few primitive
theoretical notions to direct us in the creation of analytical units, and the questions to which these
units have been applied have been relatively simple. In devising artifact types and categories,
most archaeologists have simply been trying to describe their finds, or have been trying to
measure aspects of stylistic and functional variability--two concepts with no universally accepted
definition. With regard to stylistic variability, archaeologists have been particularly concerned
with relative chronologies and inferences about cultural interaction. Archaeologists assume that
people have always made some artifacts with characteristics that are not related to the function of
these objects and that, thus, these characteristics can be considered "style"; further, to be
considered stylistic, variability must be distributed through time continuously and unimodally, in
the sense that a stylistic artifact attribute or complex of attributes is defined as those that are
invented in a given area, begin to be used by increasingly more people, reach a peak of
popularity, and then die out. Since the distribution of artifact styles is affected by distance (e.g.,
styles can reach their ultimate point of dispersion long after they have died out at their point of
origin), methods of relative seriation for purposes of inferring chronologies require that the units
to be seriated come from a small enough area that spatial variability is not a factor.
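The unimodal "popularity" expectation just described can be stated as a simple check on a type's frequencies across an ordered set of assemblages. The following is only a minimal sketch under stated assumptions: the function name `is_unimodal` and the example percentages are hypothetical and are not drawn from the Kom el-Hisn data.

```python
def is_unimodal(freqs):
    """Return True if freqs rises (weakly) to a single peak and then falls (weakly)."""
    n = len(freqs)
    i = 0
    # Climb while the sequence is non-decreasing.
    while i + 1 < n and freqs[i] <= freqs[i + 1]:
        i += 1
    # Descend while the sequence is non-increasing.
    while i + 1 < n and freqs[i] >= freqs[i + 1]:
        i += 1
    # Unimodal only if the climb-then-descend walk reached the last element.
    return i == n - 1


# Hypothetical relative frequencies (%) of one type in five ordered assemblages.
battleship = [2, 10, 25, 12, 3]   # rises to one peak, then declines
bimodal = [2, 10, 4, 12, 3]       # two peaks: violates the seriation model
```

An ordering of assemblages in which most types pass such a check is a candidate chronological seriation; an ordering in which many types fail it is not.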
Summaries of the finer points of the seriation method can be found in Ford (1954), Dunnell
(1970), and Marquardt (1978). Seriation and related issues of artifact classification and typology
continue to be actively debated, in part because many archaeologists have concluded that some
of the most important problems of historical analysis will only be resolved at a regional
scale--that is, on the basis of archaeological surveys and analyses of many sites in large areas, as
opposed to research focussed on single sites. This has made comparisons between sites a
particularly important analytical step, and such comparisons have often taken the form of
establishing the relative chronology of occupations at many different sites. Also, relative
seriations have become increasingly important in analyses of surface collections (Johnson 1972,
Adams 1981, Kemp 1977, Wenke 1987) and in dealing with materials excavated at sites where
stratigraphy is complex or so obscured that it is no certain guide to sequence through time--a
common situation in Egyptian archaeology (Wenke 1986, Kemp 1977, Hassan 1984, Hoffman
1984).
The use of artifact styles to infer community and societal patterns and interactions is based,
obviously, on the spatial distribution of stylistic variability. We may find, for example, that a
particular kind of red-slipped bowl is common in a "rich" tomb of a 5th Dynasty noble but rare at
a rural site of this same period--implying, perhaps, differences of rank and wealth. Within the
confines of a single site, patterns in the distribution of pottery styles have been taken as
indications of social differentiation (e.g., Hoffman 1982), and there is a long tradition in Egypt of
using various wares as evidence of interregional and international commodity exchange (e.g.,
Bourriau 1981:121-39). Regarding functional variability, archaeologists traditionally have used
analogy and inference to link "functional" artifact types to ancient activities. When we find
apparently smoke-smudged "cooking wares" in profusion in midden deposits near mud-brick
buildings but rarely or never in tomb assemblages, we infer their use as cooking utensils. Pottery
vessels have to meet certain conditions of permeability, thermodynamic expansion, etc. to
perform various functions, and variables presumably reflecting these characteristics can be
formulated and measured.
There is also a large literature on the mathematical problems of determining and measuring the
spatial clustering and patterns of co-occurrence of functional artifact types (e.g., Carr 1984), and
on the use of high-power microscopy to identify edge-damage characteristics in lithics (e.g.,
Vaughn 1985, Shipman 1986).
With regard to both stylistic and functional units of archaeological variability, the criteria by
which we judge the efficacy of our units are derived from these simple ideas about the
distribution through time and space of stylistic and functional variability. To judge, for example,
whether one method or another is better for purposes of inferring a relative chronological
seriation, we have as performance criteria only their relative fit to the seriation model, or their
relative agreement with independent criteria (e.g., dendrochronological evidence); in the case of
a functional analysis, we generally evaluate the adequacy of our units in terms of whether they
seem to be in agreement with other categories of evidence: whether or not lithic blades with
"sickle sheen," for example, appear in association with floral remains and other artifacts and
features indicative of an agricultural economy.
In contemporary archaeology there are four influential and, to some extent,
opposing schools of thought on archaeological artifact arrangement: (1) intuitive, "traditional"
methods of unit formation; (2) statistical approaches to nominal-level attribute analysis; (3)
multivariate statistical analyses of metric data as a means of grouping and classifying artifacts;
and (4) paradigmatic classification.
It is beyond the scope of this paper to review in detail these several approaches, and as yet the
Kom el-Hisn ceramics have only been analyzed in terms of the traditional typological methods.
But we hope to analyze the whole corpus of Kom el-Hisn ceramics using each of these methods
and then to use the different units created to test them against our theoretical expectations and
against the basic assumptions of the seriation method. Thus, it is relevant to outline each of
these approaches.
(1) Traditional Typologies
The artifact classifications and typologies commonly used in Egyptian studies have been
employed for various purposes, including: (1) description, often simply for the purpose of
comparing artifacts found at different sites (e.g., Bourriau 1981); (2) studies of stylistic
variation, especially for purposes of relative seriation (e.g., Petrie 1899, Kemp 1977, Wenke
1984:34-38), or for inferring cultural interactions (Close 1980a); and (3) studies of physicochemical composition, usually for the purpose of reconstructing trade patterns or identifying
manufacturing sites (e.g., David ed. 1986).
Methods
A common preliminary step in pursuing these research objectives has been the simple
grouping of objects by their obvious attributes of size, shape, and decoration. That is, most
analysts sort lithics or pot sherds into groups of objects that look alike, and they do so on the
basis of a simultaneous and complex visual consideration of many different characteristics of
shape, size, etc. Common examples of such categories in Egyptian studies are "Epipaleolithic
backed bladelet" and "Predynastic black-top ware." Studies (e.g., Berlin 1968) reveal that the
mental processes by which such artifact arrangements are made--to the extent that they can be
verbalized--involve a shifting hierarchy of criteria, so that variously weighted combinations of
shape, size, decoration, etc. are used to place objects in groups. So subtle and complex are the
mental processes involved in such taxonomies that attempts to duplicate them with multivariate
statistical analyses and high-speed computers have not been particularly successful (Doran and
Hodson 1975).
These kinds of intuitive groupings of Egyptian artifacts have recently been combined with
more precise and detailed descriptive procedures. Arnold and Bietak (in press), Nordstrom
(1972), Bourriau (1981), Adams (1962) and others, for example, have provided lists of
characteristics on the basis of which pottery may be grouped into wares, forms, styles, etc., by
reference to such variables as type of clays and silts used, Munsell color-chart values, size and
type of tempering particles, ratios of size measurements, etc.
In analyzing the Kom el-Hisn ceramics, we have begun by sorting most of the "diagnostics"
(i.e., rims, bases, handles, spouts, and decorated sherds) into the types illustrated in Figures x*yy. There is considerable variability within some of these types (e.g., Figure x), but these types
seem quite consistent in that they appear in substantial numbers in different areas of the site and
are sufficiently distinct that different analysts, working independently, reliably identify them. As
noted in the first section of this chapter, some of these types have been reported at various other
Old Kingdom sites, and some vessels are virtually identical to examples from tombs of 5th and
6th Dynasty nobles at Saqqara and Giza.
Given the widespread recognition of these traditional types, their ease of application, and the
common language of comparison they provide, what, if anything, is wrong with them as
analytical units? Although our analyses are in their early stages, these Kom el-Hisn types appear
to be reasonable reflections of changing styles over time (as inferred from stratigraphy); and
their spatial distributions fit some of our ideas about the functional composition of the site, as
well.
But, as noted above, such types must be assumed to have various limitations. Perhaps the most
important of these is that they do not allow reliable comparability between Kom el-Hisn
materials and those found elsewhere. Because they are based on unspecified procedures of
combining and differentially weighting different variables, the pottery from two different sites
cannot be precisely compared nor can the similarities and differences be precisely tabulated and
expressed.
Also, if we intended to use these intuitive types for chronological or functional seriations, we
would have to suppose that the groups into which we have arranged the Kom el-Hisn ceramics
reflect a mixture of stylistic and functional variability--at least until we have tested these groups
against the expectations of our chronological and functional models. Vessels like those in Type
31 (Figure xx) may be found in a certain frequency in a given level of occupation in part because
they were used for certain economic functions and in part because that particular style of bowl
was at a certain point in its "popularity" trajectory in time. In such a case, the most precise
indicator of chronology may be some complex combination of lip angle and radius, rather than
simple counts of the melange of variability embodied in Group 2 (Figure 2).
In short, these groupings are imprecise, in that the exact considerations that went into their
construction can never be entirely verbalized or expressed in precise measurement procedures.
Moreover, the research objectives that determined the creation of these units were simply the
assumption that by sorting these objects into groups that looked alike, descriptive categories and
units useful for seriation would be produced.
In summary, the perceived faults with traditional methods of categorizing Egyptian materials
include these elements: (1) Egyptian classifications and typologies have been established without
a clear expression of the objectives of the research for which they have been constructed, or the
research objectives specified are inadequate; (2) they are usually based on blends of size, shape,
and decoration--in other words, of both stylistic and functional variability, and thus they must be
assumed to be less than optimal for measuring either style or function; (3) these typologies are
summations of considerable variability, and the variability within the types may be particularly
important for precise analyses; (4) because of these assorted limitations, traditional Egyptian
types do not allow effective comparisons between assemblages from different areas; and (5)
because these traditional units are usually based entirely on physical groups and observed
objects, taxonomies of Egyptian artifacts are entirely bound to specific data sets and thus are not
suitable for conversion to the kinds of scientific units with which some scholars still hope to
build a scientific archaeology.
Because of all these limitations on intuitive methods of traditional type formation,
archaeologists in the last two decades have continually reassessed the basis on which they make
these intuitive groupings and have sought better ways in which to categorize the archaeological
record.
In our work at Kom el-Hisn we have just begun the process of applying alternative methods of
artifact categorization, and only our preliminary descriptive typology has been applied to enough
of the corpus of ceramics that we can analyze the distribution of these types statistically.
The frequencies of the most numerous and "stable" types (in the sense of reliability of
identification) are given by excavation SU in Table x. Almost all of the ceramics from our first
two seasons have been tabulated in terms of our typology, but only the ceramics from the units
listed in Table have been sufficiently studied that they can be analyzed statistically. Our
intention is to combine the ceramics from the anticipated third season with those from the first
two in a "final" typology, so that in our final analyses we can analyze the whole corpus in terms
both of the final typology and the alternative methods of artifact arrangement described below.
Figures x-y illustrate the most common and stable types so far defined, and in pp. of this
chapter some of these have been related to finds from other sites.
Considering both the illustrations in Figures x-y and their distribution by excavation unit, as
well as the additional illustrations in Figures xx-*yy, it is apparent that most of these ceramics
are from the kinds of vessels one would expect from an Old Kingdom agricultural settlement.
The number of fine-wares, particularly the relatively high-fired, red-slipped bowls like those
of Types 31a-d, is perhaps somewhat unexpected, in that these vessels were also commonly
included in tombs of nobles at Giza and Saqqara. At Kom el-Hisn these vessels seem to be the
common utensils of everyday life--though they are rare or absent in the best preserved areas of
mud-brick architecture (e.g., 1202S-1070E - 1213S-1074E, Figure x). The trays and plates
illustrated as Types 3a-d and Type 13, too, although called "offering trays" by some, are so
common at Kom el-Hisn that they must be presumed to be objects of everyday domestic use.
In general, the types we have defined seem to be distributed throughout the site, in the sense
that few or none of these types can be said to come mainly from specific strata or areas of the
site. Thus, none of these types--as they are now defined--appears to be a good "index fossil" that
marks a distinct time period or social class. The distribution of these types is certainly not
random, however. Some types are much more likely to be found in association than other
combinations, and these associations may mark functional, chronological, or other depositional
patterning.
As noted previously, the determination of spatial associations is an enormously complex
statistical problem. Here, too, only when we have much larger samples from Kom el-Hisn and a
much refined system of artifact categorization can we expect to determine with considerable
precision the associations between these kinds of ceramics and other artifacts. Despite all the
qualifications imposed on our analyses by limited sample sizes, a primitive typology, and
substantial redeposition of materials, we can at least take as a working assumption the notion that
the kinds of ceramic artifacts used together at about the same time and for related purposes will
tend to be found together if the site is excavated by cultural stratigraphy.
Our preliminary attempt at analyzing patterns of spatial association of ceramic types involves
some complex statistical procedures, but the underlying assumptions and principles are quite
simple. We began by tabulating the frequency of the 37 different types (and some combinations
of types) that were represented by at least 15 individual sherds of that type. We then formed a
data matrix comprising all the basic excavation units (SUs, see Chapter II) that had at least 10
sherds identifiable as to type in them (thereby eliminating those SUs that represent brick walls,
small areas of soil discoloration, and other volumes of occupational debris that cannot be
assumed to reflect in their ceramics specific patterns of use).
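The two thresholds just described (types represented by at least 15 sherds, excavation units with at least 10 typed sherds) can be sketched as a small filtering step. This is an illustration only: the unit and type labels, the counts, and the function name `build_matrix` are hypothetical, and it assumes that the unit threshold counts all typed sherds in a unit rather than only those of the retained types.

```python
# Hypothetical sherd counts: counts[su][type_name] = number of sherds.
counts = {
    "SU-A": {"T1": 12, "T2": 5, "T3": 1},
    "SU-B": {"T1": 6, "T2": 9},
    "SU-C": {"T3": 2},  # too few typed sherds; dropped below
    "SU-D": {"T1": 4, "T2": 7, "T3": 1},
}

MIN_SHERDS_PER_TYPE = 15  # threshold stated in the text
MIN_SHERDS_PER_SU = 10    # threshold stated in the text


def build_matrix(counts, min_type=MIN_SHERDS_PER_TYPE, min_su=MIN_SHERDS_PER_SU):
    """Keep types with >= min_type sherds overall and SUs with >= min_su typed sherds."""
    type_totals = {}
    for su_counts in counts.values():
        for type_name, n in su_counts.items():
            type_totals[type_name] = type_totals.get(type_name, 0) + n
    kept_types = sorted(t for t, n in type_totals.items() if n >= min_type)
    # Assumption: the SU threshold counts every typed sherd in the unit.
    kept_sus = sorted(su for su, c in counts.items() if sum(c.values()) >= min_su)
    matrix = {su: [counts[su].get(t, 0) for t in kept_types] for su in kept_sus}
    return kept_types, matrix


kept_types, matrix = build_matrix(counts)
# Each row of `matrix` is one excavation unit; each column is one retained type.
```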
The next analytical step was to calculate a coefficient of similarity that expresses how similar
every pair of types is in their spatial distribution: that is, to calculate a number such that this
number is high when two types are often found together in some excavation units and are also
both absent in other excavation units, but this number is low when one of these types is
frequently found where the other is not. Such a similarity coefficient can be computed in many
different ways, using for example the actual frequency of occurrence or just the presence and
absence of occurrence.
We took a conservative approach in which we converted the actual frequency of each type in
each excavation unit to simple presence or absence, thereby losing some information but
reducing--it is hoped--the effect on these coefficients of the different sizes of excavation units,
sampling error, etc. We used Sokal and Sneath's Similarity Measure 1 (SPSSX 1986: 739),
which gives double weight to cases in which both types are present or absent in a given
excavation unit, compared to cases in which one type is present and the other is not.
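As a worked illustration of the coefficient, the sketch below converts two hypothetical frequency vectors to presence/absence and computes Sokal and Sneath's first measure in its usual form, 2(a + d) / (2(a + d) + b + c), where a counts units with both types present, d units with both absent, and b and c the two kinds of mismatch. The function name and the counts are assumptions made for the example.

```python
def sokal_sneath_1(x, y):
    """Sokal and Sneath similarity 1 for two equal-length count (or 0/1) vectors."""
    pairs = [(bool(p), bool(q)) for p, q in zip(x, y)]  # counts -> presence/absence
    a = sum(1 for p, q in pairs if p and q)             # both types present
    d = sum(1 for p, q in pairs if not p and not q)     # both types absent
    mismatches = sum(1 for p, q in pairs if p != q)     # b + c
    denom = 2 * (a + d) + mismatches
    return 2 * (a + d) / denom if denom else 0.0


# Hypothetical sherd counts of two types across six excavation units.
type_a = [12, 0, 3, 0, 7, 0]
type_b = [4, 0, 1, 2, 0, 0]

ss1 = sokal_sneath_1(type_a, type_b)
# Here a = 2, d = 2, b + c = 2, so ss1 = 8 / 10 = 0.8,
# against 4 / 6 for an unweighted simple-matching coefficient.
```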
After having computed a matrix of SS1 coefficients expressing the similarity of occurrence of
all possible pairs of pottery types, we subjected this matrix to a non-metric multidimensional
scaling analysis (or MDS analysis). The mathematical basis of MDS is beyond the scope of this
report, but MDS has been extensively used in archaeological analyses (Kendall 1969, LeBlanc
1975, Wenke 1975-76, Drennan 1976). Multidimensional scaling is a method whereby artifacts
or assemblages are measured on their characteristics (e.g., size and shape variables of objects,
frequencies of artifact types in excavation units), then a measure of similarity is computed
among the set of objects or assemblages. With MDS one tries to find the fewest number of
dimensions in which the proximities of these points can be expressed, while maintaining the
distances (or the "ranks" of these distances) between these points as measured by their similarity
coefficients. The usual example is a table of driving distances between cities, say New York,
San Francisco, and 10 or 12 others. From a table of the driving distances between these
cities--which is analogous to the matrix of coefficients of similarity among the 37 pottery types--an
MDS computer program can plot the location of each city in such a way that the information
contained in the matrix of distances is precisely expressed in terms of the distance of these cities
from each other as located as points on a two-dimensional plot--in other words, on a standard
map. In this example there are two main dimensions of variability -- longitude and latitude.
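The map example can be reproduced in miniature. The sketch below uses classical (metric) scaling, which is a simpler relative of the non-metric procedure applied to the Kom el-Hisn data and is substituted here only because it fits in a few lines. The "cities" are hypothetical points on a plane, not real driving distances, and the function names are assumptions.

```python
import math


def euclid(p, q):
    """Straight-line distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def classical_mds(dist, dims=2, iters=200):
    """Classical (Torgerson) scaling: coordinates whose distances approximate dist."""
    n = len(dist)
    # Double-centre the squared distances to obtain an inner-product matrix.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the leading eigenvector of the (deflated) matrix.
        v = [1.0 + 0.1 * i * (k + 1) for i in range(n)]
        start = math.sqrt(sum(x * x for x in v))
        v = [x / start for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))  # eigenvalues are >= 0 for Euclidean input
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so the next pass finds the next dimension.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# Hypothetical "cities"; only their distance table is passed to the routine.
true_xy = [(0, 0), (5, 0), (5, 2), (0, 2), (9, 1)]
dist = [[euclid(p, q) for q in true_xy] for p in true_xy]
coords = classical_mds(dist)
recovered = [[euclid(p, q) for q in coords] for p in coords]
max_error = max(abs(dist[i][j] - recovered[i][j])
                for i in range(len(dist)) for j in range(len(dist)))
```

The recovered configuration may be rotated or mirrored relative to the original "map"; only the distances between points are preserved, which is all that a similarity matrix specifies.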
In archaeological applications the main dimensions of variability usually sought are change
over time or some functional dimension: for mathematical reasons, if excavation units or some
other unit of analysis exactly fit the "battle-ship" shaped curve of a perfect chronological
seriation, when analyzed with MDS they can be plotted as a horse-shoe shape in a
two-dimensional space. And the sequence through time of these units can be read around the arc of
the horse-shoe in such a way that, with adequate data, the distances between the points on this
horse-shoe can be exactly translated into differences of years (LeBlanc 1975; Drennan 1976).
The results of the MDS analysis of the 37 Kom el-Hisn pottery types are presented in Figure x,
with the statistics normally used to interpret such data. So many cautions and qualifications
attend these data that the sequence in Figure x cannot be interpreted necessarily as a
chronological or a functional sequence. In fact there is some evidence in these data that one
primary dimension of variability is simply the relative frequency of these ceramics, and statistics
associated with this analysis (stress and RMS) indicate that the variability among these types
cannot be expressed with considerable precision in a space of only two dimensions.
Nonetheless, the pattern illustrated in Figure x can be taken as a working hypothesis about the
patterns in which these ceramics co-occur, and we will investigate the significance, if any, of
these groupings in our future excavations and our reanalyses of these data.
In our future analyses of the Kom el-Hisn ceramics we shall apply several specific methods of
artifact categorization, and it is appropriate here to explain briefly these alternative methods and
some of our preliminary results in applying them.
(2) Statistical Analyses of Nominal-level Attribute Associations
One of the most influential attempts to replace or supplement traditional typologies is that
promulgated by Albert Spaulding (1953, 1982) and applied by him and others (e.g., Sackett
1982) in the form of statistical analyses of artifact attributes.
Research Objectives
Spaulding has consistently noted the utility of traditional methods of artifact classification and
taxonomy for purposes of seriation and functional analysis (1976), but he considers the ultimate
goal of archaeological analysis to require the construction of archaeological units of a kind quite
different from traditional units:
"Presumably the primary task of archaeology is to discover and describe whatever structure (or
order or pattern or predictability) there may be in the data of archaeology. The data of
archaeology consist of artifacts and other evidences of past human activity together with
observations on the circumstances in which they were found" (1982:1).
Spaulding argues that we should attempt to construct units that reflect behavior:
"A good type is a material reflection of more or less discrete culturally patterned segmentation of
human activities. This segmentation may be connected with the physical requirements of kinds
of tasks . . . or it may be a stylistic reflection of social patterning . . . or it may reflect some
combination of physical requirements and stylistic habits. In any case, the good type is a
summary expression or index of the jointedness of cultural nature, of the distinctive kinds of
activities performed by the participants in cultural systems. In fact, I suppose that understanding
a cultural system means identifying these distinctive kinds of activities and exploring their
interrelationships with the aid of ethnographic analogy, provenience data, chronological
information, environmental reconstruction, and anything else that seems potentially relevant"
(1983:19).
Methods
Spaulding suggests that we can search for this patterning in various ways, but that any
scientific and powerful analysis of the cultural principles that produced the archaeological record
must eventually focus on nominal variables: that is, variables that have mutually exclusive
states, such as "long" and "short," and "shell-tempered" and "sand-tempered." Nominal variables
are to be distinguished from ordinal variables, such as an ordering of pottery vessels from largest
to smallest, or ratio and interval levels of measurements, which imply an exact degree of
difference between two measurements (e.g., a lithic 4.2 cm long is twice as long as one 2.1 cm
long). Spaulding argues that we should focus on nominal variables because, "If I can distinguish
readily between long and short projectile points in some groups, so could the makers and users of
the points. And in attempting to infer why this distinction was made, I search for non-random
relationships between the attributes short and long and other variables" (1982:6).
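One elementary way to look for such non-random relationships between two nominal attributes is a chi-square test of independence on a contingency table. The sketch below is a minimal illustration, assuming a 2 x 2 table with no empty row or column; the attribute labels and counts are hypothetical.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table of counts (no empty margins)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, k in enumerate(col_totals):
            expected = r * k / n  # count expected if the attributes were independent
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical counts of projectile points:
# rows = long / short, columns = shell-tempered-style A / style B.
table = [[30, 10],
         [10, 30]]

stat = chi_square_2x2(table)
# Every expected count is 20, so stat = 4 * (10 ** 2 / 20) = 20.0, well above
# 3.841, the 5% critical value for one degree of freedom.
```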
In his later papers Spaulding (1982) has used a rather complex form of statistical
analysis--log-linear and hierarchical log-linear models--to investigate complex combinations of attributes, but
these are extensions, not modifications, of his basic method.
Spaulding's approach remains a central issue in contemporary debates on artifact
arrangement. The major criticisms of his approach have been that: (1) its over-all objective is
the reconstruction of ancient behavior, and the potential--or even possibility--of such
reconstructions in higher-level analyses of culture has not been demonstrated (Dunnell 1971); (2)
the chi-square statistical method used to establish attribute co-occurrence is inappropriate
(though this criticism has been blunted by Spaulding's adoption of log-linear models), and by
compressing all variability into nominal categories, significant variability is lost or obscured
(Doran and Hodson 1975); and (3) Spaulding's approach is focused exclusively on attribute-clustering, whereas the most productive archaeological units possibly are to be formed by object-clustering (Doran and Hodson 1975; Cowgill 1982).
(3) Multivariate Statistical Methods of Archaeological Classification and Typology
During the past two decades the use of computer-based multivariate statistical techniques to
group and classify artifacts has become very popular. Many scholars (Dunnell 1971, Whallon
1972, 1982, Spaulding 1977, Christensen and Read 1977, Vierra 1982) have pointed out the
problems associated with some of these techniques, but these methods remain an active area of
research.
It is beyond the scope of this article to summarize the many abstruse mathematical points that
underlie these and other statistical methods. The MDS analysis presented in Figure x is an
example of a multivariate statistical approach. In terms of artifact analyses, the most widely
used multivariate statistical techniques are various forms of cluster analysis and principal
components analysis. They have been used together (Doran and Hodson 1975), but there is
considerable controversy about their relationship and the ultimate utility of either.
Some of those who advocate multivariate clustering of archaeological data do so in part on the
assumption that it is possible--even probable--that the significant patterning in large
archaeological assemblages is of such a complexity and on such a scale that it will only be
identified using complex mathematical analyses (Doran and Hodson 1975, Hodson 1982). The
human mind can arrange groups of objects in categories of similarity and dissimilarity with
great virtuosity, but no one can make reliable comparisons between thousands of precise
measurements on tens of thousands of objects.
Multivariate statistical clustering and other methods can be applied to attributes, objects, or
assemblages--in other words to any numbers derived from analyses of the archaeological record
at various scales. Cluster analysis as applied to artifacts generally has three stages: (1) a
collection of objects is measured on a large set of interval or ratio level variables; (2) on the basis
of these measurements a single number--usually a similarity coefficient--is calculated that
expresses the similarity of each object to every other object in the collection; and (3) on the basis
of a computerized method, all the objects are grouped into sub-groups (or clusters) of objects that
are similar to each other and different from members of other sub-groups.
An example of cluster analysis as applied to the Kom el-Hisn ceramics is presented in Figures
5-9. These sherds were measured on 12 variables, such as radius, maximum thickness, the angle
formed by the long axes of the neck and body, etc. These measurements were then standardized
so that they had comparable means and standard deviations. Then the "distance" between each
sherd and every other sherd was calculated, using a common statistical measure (Euclidean
distance). Finally, the sherds were rearranged on the basis of Euclidean distances such that--insofar as the program could do so--each sherd was placed next to the sherds with which it is
most similar, in a dendrogram form, as illustrated in Figure 6. Note that there is some similarity
between the "intuitive" groups produced in Figure 2 and these computer-generated groups in
Figure 6.
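The three steps just described (standardization, Euclidean distances, and grouping) can be sketched in a few lines. This is an illustration only, not the program used for Figure 6: the five sherds and their three measurements are invented, and average linkage is assumed as the grouping rule because the text does not say which rule was used.

```python
import math
import statistics

def standardize(rows):
    """Rescale each variable (column) to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def euclidean(a, b):
    """Euclidean distance between two sherds' measurement vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(rows):
    """Agglomerative clustering. Returns the merge history as
    (members_a, members_b, distance) tuples, in the order of merging."""
    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean distance over all pairs drawn from the two clusters.
                d = statistics.mean(
                    euclidean(rows[p], rows[q])
                    for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Invented sherds: radius (cm), maximum thickness (cm), neck angle (degrees).
sherds = [
    [8.0, 0.60, 95.0],
    [8.5, 0.70, 100.0],
    [15.0, 1.20, 130.0],
    [16.0, 1.30, 135.0],
    [9.0, 0.65, 98.0],
]

merges = average_linkage(standardize(sherds))
for a, b, d in merges:
    print(a, b, round(d, 3))
```

The merge history is the information a dendrogram displays: early merges join the most similar sherds, and the final merge joins the two most dissimilar groups. A different linkage rule or variable scaling could produce a different history from the same measurements.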
This form of clustering (Figure 6) is just one of many alternative methods in which different
coefficients, methods of forming groups, variable scalings, etc., could have been used.
These forms of cluster analysis have been applied to assemblages of Egyptian materials, but
not to measurements of objects. Close (1980a), for example, clustered Terminal Eastern Saharan
Paleolithic and Neolithic sites on the basis of stylistic attributes of stone tools. Lubell, Sheppard,
and Jackes (1984) grouped Epipaleolithic sites in the Maghreb on the basis of their relative
frequencies of tool types. I have used these same methods to cluster pottery types for purposes
of chronological seriation for Late Period pottery (Wenke 1984). By using these techniques on
assemblages rather than artifact attributes one escapes some of the limitations of the method, but
most of these comparisons are based on units and frequencies of units that must be assumed to
mix stylistic and functional variability. Also, as is discussed below, Read has pointed out (1982)
that most clustering analyses are based on the questionable assumption that every variable has
equal importance in the creation of groups. Read notes that there is really no reason why this
should be true in most archaeological analyses, but he is basing this suggestion on the idea that it
is the cognitive categories of artifact makers and sorters that are the ultimate criteria.
In part to avoid the problem in cluster analysis of each variable having equal weight in
defining groupings, some archaeologists have turned to data-reduction and data summarization
techniques of multivariate analysis. Read (1982), in fact, argues that multivariate analyses of
artifact attributes in many cases should include as a preliminary stage the use of principal
components analysis (hereafter PCA), and many computerized methods of cluster analysis offer
the option of clustering on the basis of statistical summaries of variables rather than the original
variables taken individually. PCA is a method of determining the extent to which the
variables comprising a data set are measuring the same general components or dimensions.
As an example of PCA, consider Figures 7-8. The 25 Kom el-Hisn sherds have been
analyzed using principal components analysis2, the implicit assumption being that the 17
measurements made on these sherds are really measurements of some smaller set of underlying
components or dimensions, such as general size and shape. Mathematically, PCA involves
calculating a measure of how closely two variables co-vary. We can calculate, for example, the
extent to which the radius co-varies with the maximum thickness of the neck.
In PCA these measures of covariation are manipulated and summarized using matrix algebra.
If, for example, these 17 measurements of pot sherds are mainly measuring over-all size of the
vessel, PCA will show us in the form of "factor loadings" precisely the extent to which each of
our variables is measuring this composite sense of "size." But if there are two dimensions of
variability being measured by these variables--size and shape--it will not be possible to reduce
the variability in the correlation matrix to a single dimension. PCA involves calculating how
many significant dimensions of variability exist in a data set and how each of the variables is
related to these dimensions.
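As a rough illustration of how loadings arise from a correlation matrix, the sketch below extracts only the first principal component by power iteration. It is not the procedure used for Figures 7-8: the six sherds and three variables are invented, and the function names are hypothetical. A statistical package would extract all components at once.

```python
import math
import statistics

def correlation_matrix(rows):
    """Pearson correlations between every pair of variables (columns)."""
    cols = list(zip(*rows))
    n = len(rows)
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    z = [[(v - m) / s for v in c] for c, m, s in zip(cols, means, sds)]
    p = len(cols)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(p)]
            for i in range(p)]

def first_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigenvalue, v

# Invented sherds: radius (cm), maximum thickness (cm), neck angle (degrees).
data = [
    [8.0, 0.60, 110.0],
    [10.0, 0.75, 95.0],
    [12.0, 0.85, 120.0],
    [14.0, 1.05, 100.0],
    [16.0, 1.15, 115.0],
    [18.0, 1.35, 105.0],
]

matrix = correlation_matrix(data)
eigenvalue, vector = first_component(matrix)

# Loadings: correlation of each original variable with the component.
loadings = [x * math.sqrt(eigenvalue) for x in vector]
# Share of total standardized variance accounted for by this component.
share = eigenvalue / len(matrix)
print([round(x, 2) for x in loadings], round(share, 2))
```

In this invented case radius and thickness load heavily on the first component, which could be read as "size," while neck angle barely loads on it at all. That is the situation described above, in which a single dimension cannot summarize all the variability.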
The PCA analysis of the 25 Kom el-Hisn sherds indicates that--based on the measurements
made on these sherds--there are at least seven underlying dimensions of variability, seven
"components," on which these sherds differ significantly and independently. "Independently" is
important here in that PCA finds components of variability that are uncorrelated--"orthogonal,"
statistically speaking.
Extended discussions of the methods of PCA are available (e.g., Tabachnick and Fidell [1986]),
but for our purposes the principal question is, how can these techniques be used to classify and
categorize artifacts, assemblages, etc.?
Read (1982) has suggested that we could use PCA and cluster analysis in combination. Each
of our 25 sherds, for example, could be given a score on each of these seven components, and
that score used in seriation. Indeed, LeBlanc (1975) applied a similar approach to ceramics from
the American Southwest and found that PCA could identify clusters of attributes that proved to
be excellent units for inferring chronological seriations--at least as tested with
dendrochronological and stratigraphic evidence.
Even if one does not use the groupings provided by PCA, various scholars (Whallon 1982,
Read 1982) have argued that PCA can be used to identify those variables that are critical in terms
of forming groups of artifacts for purposes of chronological seriation, functional analysis, or
simply description. And the principal components extracted in PCA can be used in histogram
form to provide the nominal categories used in the kinds of analyses Spaulding employs.
A PCA analysis of the 37 Kom el-Hisn ceramic types is presented in Figure x.
In many cases the criterion used to evaluate multivariate statistical methods of artifact
grouping has simply been how accurately they were able to duplicate the results of intuitive
groupings (Doran and Hodson 1975: ). In such cases one may legitimately wonder why, then,
one should employ such complex methods. Hodson (1982) and others have claimed that such
groupings are more precise because they involve the same measurements, but this really does not
address the problem of comparability. To compare the Kom el-Hisn ceramics to those from the
Old Kingdom site at Bhuto, for example, we could measure samples from both areas on these
same variables and then do these same kinds of multivariate statistical procedures on each, but in
the end we would get groups of sherds and groups of variables that would be slightly changed
each time a new specimen was included in the analysis. Certainly, these groups would offer
some measure of comparison between these sites, but is this the best such measure? As
discussed below, at least some archaeologists think not.
Also, some apparently important kinds of variability in artifacts are difficult to measure with
the quantitative variables central to multivariate statistical methods of artifact grouping. This
problem is particularly severe with regard to shape (Read 1982). Whallon (1982), working with
Swiss Neolithic pottery, found that eleven different measurements of dimensions of these pots
failed to produce adequate measurement of shape--all were mainly measures of size. He found
that shape could only be measured with his variables if they were converted to ratios and other
composite measurements. Similarly, Read (1982) found that shape in lithic artifacts could only
be measured by a complex summary mathematical expression.
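The difference between raw dimensions and ratios can be seen in a trivial case. The vessels and the height-to-diameter ratio below are invented for illustration and are not the composite measurements that Whallon or Read actually used.

```python
def height_to_diameter(height_cm, diameter_cm):
    """A simple shape index: it is unchanged when a vessel is scaled up or down."""
    return height_cm / diameter_cm

# Invented vessels as (height, rim diameter) in cm.
small_jar = (10.0, 8.0)
large_jar = (20.0, 16.0)     # same proportions as small_jar, twice the size
shallow_bowl = (5.0, 20.0)

ratios = [height_to_diameter(h, d) for h, d in (small_jar, large_jar, shallow_bowl)]
print(ratios)
```

The two jars differ on every raw measurement yet share a single ratio, while the bowl is set apart by its ratio rather than by its absolute size.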
Generally, in those cases in which cluster analysis, PCA, and MDS do seem to have worked
well--principally in problems of relative seriation--it is not at all clear that different methods
would necessarily have given inferior results.
(4) Paradigmatic Dimensional Classification
The fourth and last method of arrangement that I will consider here is one whose basics were
established by Rouse (1960) and by Dunnell (1971, 1978, 1986). This approach is quite different
from these others, although there are points of convergence that may be particularly relevant to
the issues discussed here.
Dunnell's version (1971) of this approach is the most explicit, yet it is complex and defies easy
summation; it has also been rather controversial (Benfer 1975, Doran and Hodson 1975,
Spaulding 1972). However, Dunnell's method is like Spaulding's, in that although few
archaeologists have adopted it directly, many have established their own approaches in reference
to issues raised by Dunnell (e.g., Vierra 1982, Read 1982, Voorrips 1982).
Dunnell's objective is to use artifacts to create
analytical units that are maximally useful in the standard pursuits of relative seriation and
functional analysis, but he is also concerned with creating units that have a specific "scientific"
character and utility. His notion of science is "a systematic study deriving from a logical system
which results in the ordering of phenomena to which it is applied in such a manner as to make
them ahistorical and capable of explanation. . ." (1971: 199-200). In his view, explanations will
derive from the articulation of analytical units and laws--in other words, from theory, which he
defines as "a system of units (classes) and relationships (laws) between units that provides the
basis for the explanation of the phenomena" (1971: 200).
Methods
Dunnell makes a major distinction that determines the whole structure of the rest of his
method: this is the difference between groups and classes (see also Rouse 1960). Groups are
real collections of phenomena, such as the 25 Kom el-Hisn sherds described above. One can
construct groups by simply sorting similar objects into sets, by multivariate statistical methods,
and by various other methods; the way in which these groups are formed is not relevant to their
definition as a "group" -- they are groups because such arrangements are sortings of physical
objects and are thus bound to the set of phenomena that comprise them: in the case of the Kom
el-Hisn ceramics, for example, somewhat different groups would be formed by any of the above
procedures if other sherds were added to the sample.
Classes, in contrast, have no objective existence; they are definitional. One can construct
classes relevant to artifacts by selecting a number of dimensions, such as length, or color, or type
of material, and then breaking these dimensions into segments, or modes. For the 25 Kom el-Hisn sherds we can produce classes by choosing as dimensions such variables as color, temper,
and radius, breaking each of these into segments, or modes, and then intersecting the dimensions
to produce the paradigm illustrated in Figure 10. Such classes need not describe any particular
artifacts, but once formed they can be used to tabulate the frequency of combinations of attribute
states.
The contents of an excavation unit, surface collection, or whatever, thus can be described by
tallying the combinations of numbers representing the intersections of the various dimensions
that have been divided into modes (Figure 10). These class frequencies can be tested against the
seriation model, applied to functional analyses, or manipulated by multivariate statistical means
for various purposes.
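A paradigm of this kind, and the tallying of class frequencies for one collection, can be sketched as follows. The dimensions (color, temper, radius) follow the text, but the modes, the radius cutoff, and the sherds are all invented for illustration and do not reproduce Figure 10.

```python
from collections import Counter
from itertools import product

# Hypothetical modes for each dimension.
COLORS = ("red", "brown")
TEMPERS = ("sand", "chaff")
RADIUS_MODES = ("small", "large")
RADIUS_CUTOFF_CM = 12.0  # assumed boundary between the two radius modes

# The paradigm: every intersection of one mode from each dimension.
# Classes exist by definition, whether or not any sherd falls in them.
PARADIGM = list(product(COLORS, TEMPERS, RADIUS_MODES))

def radius_mode(radius_cm):
    """Assign a continuous radius measurement to one of the radius modes."""
    return "small" if radius_cm < RADIUS_CUTOFF_CM else "large"

def classify(sherd):
    """Return the class to which a sherd (color, temper, radius in cm) belongs."""
    color, temper, radius_cm = sherd
    return (color, temper, radius_mode(radius_cm))

def tally(sherds):
    """Frequency of every class in the paradigm for one collection."""
    counts = Counter(classify(s) for s in sherds)
    return {cls: counts.get(cls, 0) for cls in PARADIGM}

# Invented contents of one excavation unit.
unit = [
    ("red", "sand", 8.0),
    ("red", "sand", 9.5),
    ("brown", "chaff", 15.0),
    ("red", "chaff", 13.0),
]

frequencies = tally(unit)
for cls, count in frequencies.items():
    print(cls, count)
```

Because each sherd is assigned by definition rather than by comparison with the other sherds, adding new sherds changes the counts but never the class to which an already recorded sherd belongs.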
A complete discussion of the many complexities of Dunnell's approach is beyond the scope of
this paper, but the general point of relevance here is that he sees classes, not groups, as the most
useful and powerful analytical units in seriation and other analyses.
A form of paradigmatic analysis has been applied to Egyptian ceramics (Kroeper, personal
communication; Arnold and Bietak in press). In these analyses, ceramics are scored on many
different dimensions of variability, but unlike Dunnell's version of this method, these dimensions
have been ordered hierarchically. Wares, shapes, methods of forming, fabrics, and kinds of
decoration are ordered from most important to least, based on assumptions about how people
make pottery. From Dunnell's point of view, such a hierarchy is wrong, in that the only method
of determining that one variable is more important than another is by comparison to models
explaining the distribution through time and space. Also, there is nothing gained by such a
hierarchy when assemblages are being compared, since a non-hierarchical paradigm identifies
the same patterns of similarity.
Evaluation
Unlike grouping procedures, such as cluster analysis, paradigmatic classifications do not
necessarily change if one considers new data; thus--assuming the measurements are made
precisely--objects from any number of places can be meaningfully compared with each other.
The intuitive typology of the Kom el-Hisn sherds presented in Figure 2 may be useful in
analyzing style and function, in the sense that subsequent excavations and analyses may show
that these groups belong to different chronological periods, or that these groups co-occur with
other kinds of artifacts, or animal bones, or architecture. From Dunnell's point of view, however,
such intuitive groups--or even their statistical distillates, such as PCA component scores or
cluster analysis arrangements--would be expected to be less than optimal for purposes of
analyzing stylistic or functional variability. He argues that for these purposes, one should
instead: (1) select dimensions on the basis of what one is trying to study, whether style or
function and some explanatory model of how this variability should be distributed through time
and space; (2) construct a multidimensional paradigm for these objects; and (3) count the
frequencies of these attribute combinations and compare the variation in frequency of occurrence
of these classes across space or in relation to specific kinds of faunal remains.
If, for example, we were particularly interested in stylistic variability, we might consider that
rather subtle variations on the angle of the neck to the rim (Figure 5) might vary more directly
with the passage of time than might, say, radius, which may be tied to the vessel's function. We
could select other dimensions of variability likely to be time-dependent, and then arrange our
analysis in the form of a count of frequencies of intersecting sets of these dimensions. These
frequencies then, could be tested against stratigraphy, absolute dating (e.g., association with
inscribed sealings), etc.
As noted, the advantages of such an approach are several: it makes it possible to compare the
Kom el-Hisn pottery directly to pottery from, say, Hierakonpolis, in an extremely precise way--
in fact with a precision limited only by the ability of the respective analysts to make the same
measurements. Such comparisons cannot be made with the same exactitude using the groups
produced by cluster analysis or the other statistical approaches, because in each case the
groupings change with each new sherd and each new variable, and--more important--in the case
of traditional types, one would be trying to compare units that were constructed without an
explicit explanation of how they were produced.
Critics of Dunnell's method have suggested that it offers little increase in analytical power
over intuitive typologies in relation to the much greater time required, that it ignores the
"natural" groupings evident in the archaeological record, that it is inefficient in selecting the best
dimensions for a given form of analysis (in comparison to PCA, for example), and that
dimensional paradigms are incomplete, in that they do not identify multivariate interactions
between attribute states (Benfer 1975, Spaulding 1972, Read 1982, Doran and Hodson 1975).
As an example of the time-costs of the paradigmatic method, note that to construct a
paradigm that would distinguish the groups pictured in Figure 2, and then to distinguish these
groups from the many other "types" of ceramic artifacts found at Kom el-Hisn would require a
dimensional paradigm with many dimensions and modes. In comparison, the traditional intuitive
types are quickly identified and sorted, and they are in any case reflections of "primitive" or
unstated paradigms, in the sense that they can be easily reduced to dimensional paradigms and
the counts of dimensional intersections may not vary much from type counts in some collections
of artifacts. Nonetheless, dimensional paradigms can be applied quite efficiently and rapidly
once the paradigm is constructed.
In any case, the important point is that we shall never really know for any given data set
whether or not a paradigmatic classification produces better units for seriation, etc. than other
approaches, unless performance tests are made using appropriate archaeological data and
models.
Summary and Conclusions
Given these different methods of classification and typology, with their perceived strengths
and weaknesses, what general lessons or conclusions can we draw?
With regard to research objectives, it seems only sensible that, in the absence of definitive
testing, archaeologists evaluate methods of classification and typology at least in part in terms of
the possibility that archaeology can someday become a powerful scientific discipline. As noted
above, even those who reject this possibility cannot ignore the criticism described here of
traditional Egyptian artifact categories, even for simple purposes of description, comparison, and
seriation. To some extent, the development of a more powerful explanatory form of archaeology
will probably be built on simple improvements in constructing units of artifact variability for
relative seriations and functional analyses.
With regard to methods of classification and typology, it is important to recognize that the
variant approaches described here have not been adequately tested, so we have no adequate basis
for saying one approach usually works better than another.
In the case of the Kom el-Hisn ceramics, for example, an adequate test of just the "fit" of the
analytical units defined by these various methods would require that these tens of thousands of
ceramics be analyzed in terms of these different methods, and then the different units compared
to the seriation model or to stratigraphic evidence and other measures of chronology. In the end,
the "best" method of unit formation, in such an experiment, would simply be the one that made
the most "sense" in some composite conception, based on all these different lines of evidence.
Although no such large scale tests of different methods have been done, to my knowledge, if
the evaluations presented here are valid, certain conclusions follow. The intuitive traditional
typologies of Egyptian artifacts, for example, can be expected to have a role as simple
descriptive devices, but they may have few virtues other than ease of application. If I report, for
example, that our excavations at Kom el-Hisn revealed groups of red-slipped bowls and jars in
the proportions like those in Figure 2, the person who wishes to compare our ceramics with those
from Giza, or Bhuto, or some other site can only look at drawings of representative sherds of our
groups and come to some approximate idea of how similar these drawn specimens and group
frequencies are to those at these other sites.
Similarly, computerized cluster analysis in which the defining criteria are equally-weighted
quantitative variables would seem to offer few improvements on the intuitive groupings,
except perhaps in the comparisons of assemblages, as opposed to artifacts. Even here there is
doubt about the utility of cluster analysis and similar approaches.
Spaulding's approach is interesting, but the units it produces do not seem ideally suited to
archaeological chores like relative seriation (Whallon 1972). Here too, however, no definitive
testing has been done, and the development of effective log-linear models has made it at least
possible to determine if the complex interactions of nominal level variables provide useful
analytical units. It is not at all clear, however, that Spaulding is correct in asserting that the
artificer's conceptualization of artifact variability can be monitored by our artifact
categorizations. In the first place, we'll never know what these conceptualizations were; in the
second place, not all distinctions that may be important would be evident to the ancient maker
and user of artifacts. If we were to study agricultural economies, for example, the appearance of
a stone tool assemblage used increasingly for harvesting of cereal stems may make itself evident
only or principally in "luster" patterns on edges, such that one can only measure them
microscopically.
Currently in anthropologically-oriented archaeology the major philosophical division in
studies of artifact categorization involves the relative analytical priority of multivariate statistical
procedures and dimensional paradigms. The merits and limitations of both these approaches are
widely recognized.
Read (1982), for example, has argued that all the methods of artifact
analysis described here can be--and should be--combined in most analytical frameworks. The
sequence he suggests is to begin with some form of cluster analysis, in which principal
components analysis has been used to deal with redundant and irrelevant variables. Using the
groups formed by cluster analysis and associated statistics one should then construct a
paradigmatic classification. Finally, Read suggests, a Spaulding-style chi-square and log-linear
analysis should be done to identify significant co-occurrences of attributes.
A crucial decision in all these processes of artifact categorization is how we select the
dimensions that we eventually use in our analyses. In traditional methods one selects obvious
variability in size, shape, and decoration, without specifying exactly what combinations and
weightings of these three dimensions are to be applied to every object--and generally with
considerable variation from object to object and observer to observer in how these criteria are
applied. In Spaulding's approach one does statistical analyses of these obvious dimensions of
variability and pursues in the analysis only those combinations that are statistically significant in
their patterns of co-occurrence. In the multivariate statistical methods, we measure as many
variables as we can think of or have time for, and either use these measurements directly or
distill them with PCA, MDS, etc. In a paradigmatic approach, the question of how one selects
dimensions of variability is particularly crucial, since there are strict practical limits to how many
precise measurements one can make of multi-modal dimensions.
Again the critical question for all these methods of artifact arrangement is, how do we
determine that we have concentrated on the appropriate dimensions of variability and that we
have measured these dimensions adequately? If we try, for example, to analyze stone tool
function by considering the physics of percussion and abrasion of brittle solids, how can we have
confidence that we are considering every relevant dimension or even the most important
dimensions? Our understanding of the physics of the process of edge alteration is incomplete.
Moreover, we cannot hope to reconstruct the functional considerations an individual might have
had in mind when he made tools: he may have wanted a particular kind of edge for shaping
arrows, but he might have wanted a certain over-all weight to apply that edge with a certain
force.
So how can we know the relevant dimensions? And even if we have confidence that we can
pick out the 15 or 20 most likely dimensions, what about the possibility that the most effective
units of comparison between two assemblages in terms of their use is some abstruse and
complicated mathematical function involving three or four modes?
There is also the problem of the information lost by breaking up a continuous variable into
modal classes. How can we determine the proper scale at which to divide our dimensions?
It is not evident that we can solve these problems by using cluster analysis or principal
components analysis to identify dimensions or variables that have non-random distributions
across microenvironments or significant co-occurrences with other artifact or faunal types. To
do this allows a particular group of artifacts to determine what variables we use, and this could
be misleading because accidental or other kinds of associations not causally connected with the
phenomena we are trying to analyze will always be found.
These same considerations of how one chooses variables to measure also apply in the case of
the selection of variables for stylistic analyses. It is the essence of stylistic variability that it is
random, in the sense that no one can predict whether this slight turn of lip angle, or that subtle
color will be the element that changes unimodally through time. So, in trying to isolate potential
stylistic variables, one might sort sherds in piles from different sites, or from different levels, and
see if there is anything evident that distinguishes them. These dimensions then could be used to
construct the paradigm that will separate them in a chronologically meaningful manner.
Alternatively, we could measure a great many variables, expressing them also as ratios,
combinations, and transformed values, and then use principal components analysis or other
multivariate statistical techniques to define attributes or combinations of attributes to test against
the seriation method. One cannot conclude that, because seriations must address conceptualized
categories--significant differences apparent to the artisan--the identification of chronologically
significant variability may not involve ratios or other measurements mathematically more
complex than "red" or "black," or "long" or "short."
Considering all these issues, it seems reasonable that anyone analyzing Egyptian artifacts
should: (1) define as precisely as possible the objectives for which a given arrangement is to be
made; (2) state explicitly why the attributes being measured are reasonable links to the kinds of
objectives specified in the first step; (3) try alternate forms of grouping and classification, and
check to see if in fact one or another is more in line with expectations, based on corroborative
information, such as stratigraphy or documentary evidence; and (4) publish fully exact
measurements, counts, etc., so that eventually we can determine what kinds of arrangements best
serve the needs of the analyst for a given purpose.
The increased time required to do these analyses may require major changes in the ways
resources for field work--especially time--are allocated, but the alternative seems to be the
continued limitation of Egyptian archaeology to a simple, descriptive exercise.
We hope to follow our own recommendations in our future analyses of the Kom el-Hisn
ceramics, once we have increased the diversity of our ceramics samples and--most important--related the different strata of the areas excavated.