The recognition of multiple spatial patterns: a case study from the French upper paleolithic


Journal of Archaeological Science 1983, 10, 165-180

The Recognition of Multiple Spatial Patterns: a Case Study from the French Upper Paleolithic

Jan F. Simek^a and Roy R. Larick

A pattern recognition approach to spatial analysis is applied to artifact distributions from the Magdalenian site of Pincevent, Section 36. Patterning is investigated using a k-means cluster analysis that permits iterative mapping of artifact distributions at several scales of spatial complexity. Multiple scales of patterning are recognized in the Pincevent distributions. These patterns are assessed in terms of those discovered by the excavators through intensive visual inspection of the materials. Basic concordance is seen between these two approaches. Finally, the analysis detects more detailed patterning. A relationship between artifact abundance and location seems to hold constant throughout the site.

Keywords: SPATIAL ANALYSIS, MATHEMATICAL PATTERN RECOGNITION, K-MEANS CLUSTER ANALYSIS, FRANCE, PARIS BASIN, PINCEVENT, UPPER PALEOLITHIC, MAGDALENIAN, LIVING FLOOR, OPEN-AIR SITE, STONE ARTIFACTS, BONE ARTIFACTS.

Introduction

One of the goals of current research on Paleolithic habitation sites is to make better use of artifact distributions recovered from site excavations. Ideally, one would like to have spatial methods making it possible to infer aspects of human social and economic organization at a site. The ambition is to relate form to process. However, frustration has often been encountered in these efforts for a variety of reasons. Given spatial artifact distributions that are overwhelmingly rich, a major problem concerns the current methodology for pattern recognition. The two most common strategies employed are visual inspections of distributions by informed human analysts without a formal guide or technique, and quantitative approaches based on summary statistics such as nearest neighbour.

Modern excavations usually record the locations of most artifacts in space using three-dimensional coordinates. These data may be translated into point distribution maps of sometimes daunting complexity. A basic problem facing the analyst is how to deal with the massive number of points present on such maps. The visual assessment of point maps may be the most desirable method, if carried out with the intensity and attention to detail needed to view and account for all potential patterns. In this kind of study, the excavator’s knowledge of context and structure can be applied to the pattern search, and other kinds of information, for example, artifact conjoining and the presence and nature of features, can be integrated into the analysis.

^a Department of Anthropology, State University of New York at Binghamton, New York, U.S.A.

0305-4403/83/020165+16 $03.00/0 © 1983 Academic Press Inc. (London) Limited

166 J. F. SIMEK AND R. R. LARICK

However, there are major drawbacks to the visual method. First, it is time-consuming and requires a large group of specialists in order to achieve results. Second, partly because of this, there is a problem with replicability. There is no guarantee that two teams of analysts, perhaps guided by different research interests, will detect the same patterns. Therefore, the judgement criterion for evaluating such studies is limited to the perceived competence of the analysts. Finally, the human eye is most likely to see obvious or general patterns, but may not be able to discern small-scale ones within the same distribution. This is especially the case when we try to compare two or more artifact distributions.

An alternative to the visual method is the use of computer-assisted quantitative procedures. These have the advantages of replicability and efficiency. However, many quantitative techniques are overly reductive. Some require the expression of point data as quadrat counts (cf. Whallon, 1973). These methods test for deviation from random patterning, but the theoretical relevance of this in archaeological contexts can be questioned (Hodder & Orton, 1976, p. 8; Hietala & Stevens, 1977, pp. 539-540). Techniques such as nearest neighbour analysis reduce a distribution of points to a single statistic which is used to characterize it as random, aggregated (clustered), or uniform (Greig-Smith, 1964; Pinder et al., 1979; see also Hodder & Orton, 1976, p. 40). These summary statistics are difficult to relate to contextual elements (e.g., features, stains, etc.) present on the surface, since they do not use the locational information contained in point patterns.

How can we, then, analyze point distribution maps without spending the time and resources required for adequate visual studies and without resorting to overly reductive or inappropriate mathematical methods? We require a method which can manage large distributions and can search for patterns in these at various scales. Obviously, some data reduction is necessary to accomplish this goal. But ideally this reduction will not obscure or ignore possible fine-scale patterns that result from subtle yet important site formation processes. A pattern search procedure that can identify both large and small-scale patterning in the same distribution is called for. The method must also characterize patterns in such a way that many maps, e.g., of specific artifact classes, can be compared analytically, and that relationships among patterns can be defined and examined. We illustrate a technique that accomplishes these ends.

In a recent paper, K. Kintigh and A. Ammerman propose an “heuristic” approach to archaeological spatial analysis. A basic goal of the heuristic approach is ultimately “to open the way for the use of contextual knowledge and human expertise within a formal (computer executed) procedure” (Kintigh & Ammerman, 1982, p. 31). Motivated by the kinds of problems outlined above, Kintigh and Ammerman developed an analytic procedure to solve archaeological problems with a less rigid methodology than those traditionally employed in spatial analysis. As their basic pattern recognition technique, they used a k-means clustering algorithm.

The k-means procedure allows for pattern searching at many levels. The method provides a step-by-step description of the distribution under study at various spatial configurations. These can be assessed to find appropriate solutions for maps at a variety of scales. These multiple configurations are analyzed sequentially from the simplest to the more complex. In this way, small-scale patterns can be examined within an understanding of larger-scale patterns. Although the patterns are not explanations but characterizations of the distributions, the results retain content and locational information that are required in order to frame explanation in terms of complex formation processes. The heuristic approach is discussed in detail in Kintigh and Ammerman’s paper, and the reader is referred to that source for a complete presentation. We will discuss a few relevant details below.

MULTIPLE SPATIAL PATTERNS 167

The present study illustrates and evaluates the k-means technique on distributions from the Magdalenian open air site at Pincevent (Leroi-Gourhan & Brezillon, 1972). We use Pincevent for several reasons. First, the site was recently excavated using three-dimensional coordinates for recovered materials. Second, there is a monograph on the site available. Third, the site was visually analyzed and interpreted in terms of several site formation processes. The last characteristic is important, since we can use the results of this intensive visual analysis to check our results for plausibility. It should be stressed that we do not wish to “test” or even to replicate Leroi-Gourhan & Brezillon’s formulation. However, correspondence between the two studies would be encouraging from our point of view, since we will have succeeded in producing patterns comparable to those recognized after years of study. Moreover, we will have detected such patterns using an explicit and replicable formal procedure.

In addressing these problems, we hope to detect formal patterns within the Pincevent distributions that are comparable to those identified by the excavators. We attempt to recognize patterns on several scales that may be related to a complex set of site formation processes. We illustrate a technique that can realize these goals while providing reproducible characterizations of patterns.

The Data Base

Pincevent, Section 36 (Seine-et-Marne, France) is a well-preserved series of Paleolithic habitation surfaces fully reported by Leroi-Gourhan & Brezillon (1972). The habitation surface serving as our data base (Level IV-2) produced 23 features; thousands of lithic artifacts; a well-preserved faunal component including reindeer, horse, wolf, hare, and birds; masses of imported rock; and other cultural debris (including iron and manganese oxides). The preservation of these Magdalenian deposits was the result of fluvial inundation shortly after occupation. Swift but gentle deposition of silts was a key element in protecting the integrity of the archaeological finds.

Based on an extensive study of the distribution maps, Leroi-Gourhan & Brezillon present a credible model of the accumulation of materials at Section 36. This interpretation, which infers a series of three distinct habitation areas, views deposition at Pincevent as the result of dumping behaviour by the occupants. The three principal habitations are centered on rock-lined hearths which served as the foci of domestic activities (Figure 1). Material resulting from activities performed at these hearths was ejected en masse to one side of the features in a fan-shaped pattern. Because of the positioning of artifact concentrations to one side of the fireplace, Leroi-Gourhan & Brezillon infer three covered tent structures occupying relatively empty areas near the hearths. Refuse, then, was removed from the “interior” areas and deposited into semi-circular “domestic dumps” (see Figure 1). A general area of communal dumping is suggested in between the two northernmost structures. Smaller artifact concentrations are also identified, and these occur in peripheral areas of sparse deposition.

The Leroi-Gourhan & Brezillon interpretation for the Pincevent deposits foreshadows the “ring model” proposed by Yellen in 1977. The Pincevent model, like Yellen’s, views overall patterns of artifact densities as linked with occupation loci (i.e., the distribution of social units) and the duration of occupation rather than reflecting the refuse of “specific activity areas”. Habitation structures are located near focal hearths, and most refuse deposition in adjacent areas results from a desire to keep residential areas clear. Space away from the focal hearths is used less intensively, and specific activities may here achieve a certain measure of spatial separation.

The spatial data used in this study are points representing mapped artifacts. These are grouped for analysis into ten categories as presented in the excavation report. Cartesian coordinates were obtained from the published maps using a digitizer. The ten artifact groups, and the numbers of objects obtained from the maps for each category, along with figure references from Leroi-Gourhan & Brezillon (1972) are presented in Table 1.

Figure 1. Hypothetical map of habitation structures and deposition areas at Pincevent, Section 36. Following Leroi-Gourhan & Brezillon (1972, pp. 247-250), the habitation spaces are divided into various areas of refuse disposal. Solid and broken lines delimit three areas within the inferred structures. A, Focal hearths; B, domestic activity areas, and C, residential or interior areas. Broken and pointed lines delimit concentric areas of deposition associated with each structure. D, Areas of domestic refuse disposal; E, areas of dispersed deposition; F, areas of rare deposition; G, areas of isolated deposition.

Table 1. Artifact classes used for analysis and reference to class distribution map in Leroi-Gourhan & Brezillon (1972)

Artifact class                     n     Figure   Pages
Cores                              149   66       112-115
Burins                             177   70       126-128
Backed bladelets                   691   73       134-136
Endscrapers                        132   74       138-140
Borers                             136   74       138-140
Antlers                            164   79       156-158
Mandibular and dental fragments    219   92       168-170
Metacarpals                        235   95       178-180
Metatarsals                        186   95       178-180
Ribs                               366   96       182-184

In our case, the k-means procedure functions by taking the initial point configuration and finding the center of that distribution based on the mean x and y coordinate values. This mean value is called a “centroid” and is expressed as a pair of coordinates. The farthest point from the centroid is then found by calculating the Euclidean distances between all data points and the mean. This second point becomes the seed for another centroid. The distances between all points and the two reference points (the centroid and the seed) are computed. Data points are then assigned to the nearest reference. The mean x and y coordinate values are recomputed within the groups, and another pass is made through the data reassigning points to their nearest centroid. This process continues until no more reassignments are made. The result is a “two cluster solution” for the distribution. This process is carried out to form in sequence three, four, five, etc., clusters until a user-specified termination point, “k”, is achieved.
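The splitting pass described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: the function names (`centroid`, `assign`, `refine`, `split_sequence`) are hypothetical, and for levels beyond two clusters the sketch assumes that the new seed is the point farthest from its own assigned centroid, which the text does not spell out. The lumping pass is not included.

```python
import math

def centroid(points):
    """Mean x and mean y of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def assign(points, centers):
    """Index of the nearest center (Euclidean distance) for each point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def refine(points, centers, max_passes=100):
    """Recompute centroids and reassign points until no assignment changes."""
    labels = assign(points, centers)
    for _ in range(max_passes):
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to lose all its points.
            new_centers.append(centroid(members) if members else centers[j])
        centers = new_centers
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels

def split_sequence(points, k_max):
    """Build the 1, 2, ..., k_max cluster solutions by successive splitting."""
    centers = [centroid(points)]
    labels = [0] * len(points)
    solutions = {1: (centers, labels)}
    for k in range(2, k_max + 1):
        # Assumption: seed the new cluster with the point farthest from
        # the centroid it is currently assigned to.
        seed = max(zip(points, labels),
                   key=lambda pl: math.dist(pl[0], centers[pl[1]]))[0]
        centers, labels = refine(points, centers + [seed])
        solutions[k] = (centers, labels)
    return solutions
```

Calling `split_sequence(points, 15)` on a list of digitized (x, y) pairs would return one set of centroids and assignments per clustering level, which is the raw material for the statistics discussed next.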

After a solution is reached at a given level, the clusters are lumped back to the previous level by combining the two clusters with the closest centroids. The dividing procedure is performed again, and all centroids are recomputed. The points are again assigned to the nearest centroid. If the results of the lumping process match or surpass the previous solution as a description of the distribution, then the new configuration is retained. If the first solution was better than the second, it is retained. Clusters are lumped back through all previous levels, assessing the strength of the results at each stage, until a single cluster has again been formed. The results of the run are then reported.

The degree of clustering (internal homogeneity vs. external isolation) is defined by a statistic, the summed squared error (SSE), which is the sum of the squared distances between all points in the distribution and their nearest centroid at a given clustering level. The k-means procedure retains the solution having the minimum SSE in comparisons during lumping and splitting passes through the data.
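As a small worked illustration of this definition, the SSE can be computed directly from a list of points and a list of centroids. The function name `sse` and the sample coordinates are hypothetical and are not taken from the Pincevent data.

```python
def sse(points, centers):
    """Sum over all points of the squared distance to the nearest centroid."""
    return sum(
        min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
        for px, py in points
    )

# Three points and two centroids: squared distances are 1, 1 and 0.
example_points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
example_centers = [(1.0, 0.0), (10.0, 0.0)]
print(sse(example_points, example_centers))  # 2.0
```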

The SSE is also used, after a transformation, to identify the most appropriate solutions for the distribution. To assess the overall cluster solutions, the SSE is expressed at each level as a percentage of the maximum SSE for the entire distribution. This characterizes the improvement in distribution description of a given level over the initial centroid and SSE values as a proportion. This percentage is then transformed by log10, as a convenience, and is plotted against the number of clusters. The log (%SSE) decreases naturally as more and more clusters are formed. Conventionally, the analyst uses inflections in the plot as a means of identifying good cluster solutions. Several such inflections indicate that several solutions characterize the point distribution. This feature of the k-means procedure provides a way to identify fine-scale patterning within simpler, perhaps more obvious patterns.
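The transformation can be sketched as follows, assuming the maximum SSE is that of the one cluster solution. The text identifies inflections by eye on the plot; the second-difference helper below is only one possible numerical stand-in for that visual judgement and is an assumption of this sketch, as are the function names and the sample SSE values.

```python
import math

def log_pct_sse(sse_by_k):
    """Map each cluster count k to log10 of SSE as a percentage of the k = 1 SSE.

    Assumes every SSE value is positive (log10 is undefined at zero).
    """
    max_sse = sse_by_k[1]
    return {k: math.log10(100.0 * s / max_sse) for k, s in sse_by_k.items()}

def second_differences(curve):
    """Second difference of the curve at each interior k.

    A large positive value marks a level where the curve drops sharply
    and then flattens, i.e. a candidate inflection (heuristic only).
    """
    return {
        k: curve[k - 1] - 2.0 * curve[k] + curve[k + 1]
        for k in sorted(curve)
        if (k - 1) in curve and (k + 1) in curve
    }

# Hypothetical SSE values for one, two and three clusters.
curve = log_pct_sse({1: 1000.0, 2: 100.0, 3: 90.0})
print(curve)
print(second_differences(curve))
```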

It should be pointed out that inflections in the log (%SSE) plot are always greater at lower solution levels than later in the sequence. This is an artifact of the larger SSE proportions associated with early divisions of the initial distribution, and it may also reflect the log transformation of the proportions. At this point, no significance test has been developed to assess clustering strength so that no solution indicated by an inflection can be defined as the “best”. This is the case despite apparent differences in the sizes of inflections on the plot. We do not use these log (%SSE) plots past the first indicated solution.

In addition to the SSE and the log (%SSE), three other k-means summary statistics will be used here. The coordinates of the centroids locate the clusters so they can be mapped. To describe the individual clusters once they have been formed at a given level, the root mean squared (RMS) which is the square root of the summed squared distances between a cluster centroid and associated points, provides an indication of cluster size as a radius. Finally, the number of units in each cluster will be used to characterize a cluster’s content.
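These per-cluster statistics can be sketched as below. One assumption should be flagged: the sketch computes the RMS as the square root of the mean squared distance to the centroid (the summed squared distance divided by the number of points), which is the usual reading of "root mean squared" and gives a value interpretable as a radius. The function name `cluster_summary` and the dictionary keys are hypothetical.

```python
import math

def cluster_summary(points, labels):
    """Centroid, point count and RMS radius for each cluster label."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    summary = []
    for lab, members in groups.items():
        n = len(members)
        cx = sum(x for x, _ in members) / n
        cy = sum(y for _, y in members) / n
        ss = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in members)
        # Assumption: RMS = sqrt(summed squared distance / n).
        summary.append({"label": lab, "n": n, "x": cx, "y": cy,
                        "rms": math.sqrt(ss / n)})
    return summary

print(cluster_summary([(0.0, 0.0), (2.0, 0.0), (9.0, 9.0)], [0, 0, 1]))
```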

The use of these statistics will be demonstrated in the following application of the k-means technique to spatial analysis at Pincevent. Before proceeding to the presentation, however, a few terms must be defined. For the purposes of our analysis, cluster will refer specifically to a concentration of points assigned to a centroid. A cluster has a spatial size, defined by the RMS, and a point frequency which is the number of points assigned to the centroid. Area will refer to a spatial relationship among clusters and contextual features, for example hearths and soil stains. Zone will refer to a set of clusters where the RMS radii show considerable overlap in space.
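The definition of a zone can be made concrete with a short sketch. The text does not quantify "considerable overlap", so the criterion used here (two RMS circles overlap when the distance between their centroids is less than the sum of their radii, and zones are chains of overlapping clusters) is an assumption, as are the names `overlaps` and `zones` and the sample clusters.

```python
import math

def overlaps(a, b):
    """True if the RMS circles of clusters a and b intersect (assumed criterion)."""
    return math.dist((a["x"], a["y"]), (b["x"], b["y"])) < a["rms"] + b["rms"]

def zones(clusters):
    """Group cluster indices into zones of mutually chained overlapping circles."""
    remaining = list(range(len(clusters)))
    result = []
    while remaining:
        stack = [remaining.pop(0)]
        zone = []
        while stack:
            i = stack.pop()
            zone.append(i)
            linked = [j for j in remaining if overlaps(clusters[i], clusters[j])]
            for j in linked:
                remaining.remove(j)
            stack.extend(linked)
        result.append(sorted(zone))
    return result

# Hypothetical clusters from different artifact classes.
sample = [
    {"x": 0.0, "y": 0.0, "rms": 1.0},
    {"x": 1.5, "y": 0.0, "rms": 1.0},
    {"x": 10.0, "y": 10.0, "rms": 1.0},
]
print(zones(sample))  # [[0, 1], [2]]
```

A stricter threshold (for example, requiring one centroid to fall inside the other circle) could be substituted in `overlaps` without changing the grouping logic.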

Stage I Analysis: the Individual Artifact Classes

The first stage of the study involves the investigation of spatial patterns within artifact categories. A series of k-means analyses was performed on the ten artifact classes listed in Table 1. Each run involved clustering to a termination level of fifteen clusters. Space limitations preclude the presentation of all ten analyses here. A single class (mandibular and other dental fragments) is presented in some detail to illustrate the technique with the relevant statistics and interpretations described.* All ten artifact categories were analyzed in this manner, and our interpretations are based on all classes.

As an initial assessment of the best cluster solutions characterizing the mandible distribution, the plot of the log (%SSE) is presented in Figure 2. Inflections in the plot at three, six, and nine clusters indicate good solutions at these levels.

Figure 2. Transformed summed square error plot (log (%SSE)) for mandibular and dental fragments after k-means cluster analysis.

Based on this plot, three maps are produced representing the mandible distribution divided into three, six, and nine clusters (Figures 3, 4 and 5). We shall hereafter refer to these solutions as the first (three cluster), second (six cluster), and third (nine cluster) solutions.

The individual cluster maps are formed using the centroids and RMS measures of cluster size. These statistics, along with the number of points belonging to each cluster, are given in Tables 2, 3 and 4.†

* Documentation and illustrations for all k-means runs described herein exist as a portfolio of drawings and printouts on file at the Department of Anthropology, State University of New York at Binghamton.

† Cluster centroid coordinates and RMS values are given in meters based on the scale of the published artifact distribution maps (Table 1). The Cartesian origin we used is at the northwest corner of the original excavation grid (the northwest corner of square 299). Our y axis is reversed from that of the excavators; our y values ascend to the south as the grid letters descend.


Figure 3. Three cluster solution map for mandibular and dental fragments. Numbers represent the rank of each cluster in terms of the artifact counts present near the centroid. Ranks are plotted at centroids.

Figure 4. Six cluster solution map for mandibular and dental fragments. Ranks are plotted at each corresponding centroid.


Figure 5. Nine cluster solution map for mandibular and dental fragments. Ranks are plotted at each corresponding centroid.

Table 2. Three cluster solution statistics for mandibular and dental fragments

Rank   n     x       y       RMS
1      132   8.36    6.18    1.51
2      78    14.36   6.36    1.15
3      9     16.00   15.27   1.71

Table 3. Six cluster solution statistics for mandibular and dental fragments

Rank   n             x       y       RMS
1      69            14.55   6.18    1.00
2      [illegible]   8.73    5.45    0.76
3      [illegible]   10.00   8.00    0.98
4      15            5.09    3.45    1.03
5      9             16.73   15.82   1.33
6      4             4.55    8.91    0.76

Table 4. Nine cluster solution statistics for mandibular and dental fragments

Rank   n     x       y       RMS
1      56    9.81    7.64    0.71
2      50    13.82   6.00    0.71
3      40    7.82    5.82    0.56
4      23    10.00   4.91    0.51
5      19    16.36   6.36    0.78
6      13    4.91    3.27    1.00
7      9     16.73   15.82   1.33
8      6     13.09   11.09   0.98
9      3     4.18    9.09    0.58


Important information to note here is the variation in point frequency among the clusters. For the first solution, there is a clear discrepancy among clusters: 59 % of all objects occur in one cluster while only 7 % are in another. The same is true for the second solution, where over 83 % of all points are located in three clusters. In the third solution, 64 % of all mandibles and teeth occur in only three clusters. As can be seen on the maps, the richest clusters in all solutions center at or near the three hearths defined by Leroi-Gourhan & Brezillon as foci of domestic activities.

Also important to note is the way in which the clusters are divided as the k-means analysis proceeds. The two northern clusters in Figure 3 are divided into five separate clusters in the second solution. The other cluster defined at the three cluster level remains the same. In the nine cluster solution (Figure 5), three more clusters are formed in the northern sector. Again, the southernmost cluster remains unchanged.

In sum, the richest clusters in the first solution are divided into the most clusters initially, and these are further divided as the run proceeds. The least rich cluster remains unchanged throughout the sequence. That sparse clusters are defined early in the sequence is important, because it suggests a relationship between abundance and location that goes beyond the possible effects of SSE minimization processes. The latter factors may, however, condition the results to some extent in all runs.

The example of mandibles and teeth should illustrate how the k-means procedure can be used to define multiple levels of patterning in the spatial distributions of individual artifact categories. Similar discussion can be made for each of the ten classes analyzed. To address the problems of complex deposition and artifact association, artifact categories must be examined together. That is the subject of the second analytic stage.

Stage II Analysis: Displaying Multiple Artifact Classes

In order to investigate overall spatial patterns at Pincevent, the individual artifact cluster maps produced in Stage I were overlaid at each of the three different cluster solutions. Three maps, corresponding to the first (three cluster), second (six cluster), and third (nine cluster) solutions, were produced from the centroid and RMS values provided by individual runs for all artifact classes. The algorithm found its best solution for each artifact class at the three cluster level. This is readily seen in the plot of the log (%SSE) curves where the greatest inflections are found at the three cluster level in each case (Figure 6).

We observe patterning at greater detail by defining larger numbers of clusters for each distribution of artifacts. Two multiples of the three cluster solution provide the two higher levels of clustering (or smaller scales). Specifically, levels of six and nine clusters increase detail while maintaining simplicity in presentation. No structural rule is implied by grouping at these multiples; it simply provides convenient scales for defining spatial patterning in greater complexity. The downward inflections in the log (%SSE) lines were not used to define higher levels of clustering. For reasons mentioned above, it is too difficult to judge the significance of these inflections within and between the runs of the ten classes of artifacts.

Figure 7 represents the map of the cluster centroid and RMS statistics produced by the first solutions for all ten artifact classes. Three depositional zones occur in the areas of the three hearths. Two of these, in the northern part of the site, are fairly compact with closely spaced cluster centroids. The third zone in the southern part of the site is less compact and contains larger RMS circles. The coincidence between this map and that proposed by Leroi-Gourhan & Brezillon (see Figure 1) is striking. All zones are centered on the rock-lined hearth features. Most centroids are located to one side of those hearths. Two zones contain clusters from all ten artifact categories. The northeastern zone has nine classes, lacking only bladelets. These results are similar to the excavators’ interpretations of the site, defining three general concentrations of deposition which plausibly represent separate habitation areas. Since we hoped to achieve some correspondence between patterning recognized analytically and the findings of Leroi-Gourhan & Brezillon, the results obtained at the three cluster level must be viewed as encouraging.

Figure 6. Transformed summed square error plots (log (%SSE)) overlaid for all ten artifact categories.

Figure 7. k-Means three cluster solution maps overlaid for all ten artifact classes.

Differences in the individual distributions do exist and can be identified at this analytic stage. In all three parts of the site, there appears to be a difference in the general distribution of bone and stone artifact categories. In the two northernmost areas, stone artifacts tend to center on or very close to the hearths, while bone materials cluster to the east of the features. In the south, the separation is similar, but the axis runs in a north-south direction. Stone artifacts are only slightly more concentrated on the hearth here, while bone objects are grouped a little further to the north. Overall, however, the pattern of these three areas seems equivalent. Hearths serve as the “centers of gravity” for the first solutions, and stone and bone artifacts at a given hearth are somewhat separated within a general concentration located on one side of the feature.

Figure 8. k-Means six cluster solution maps overlaid for all ten artifact classes.

In the second solutions (Figure 8) the tendencies noted above are even more pronounced. A series of clusters, with bone and stone still slightly separated, focuses on the hearths. Other clusters are located near some of the smaller features at the site (including unlined hearths, soil stains, and gravel and stain features). The latter clusters tend to have lesser point frequencies than those associated with focal hearths. Stone tool clusters are more compact (they have smaller RMS radii) than bone clusters; the distribution of bone materials is more dispersed and located away from the hearths. Small zones defined at this level show variation in the number of artifact classes present, with some clusters made up of only bone, lithics, or restricted mixtures of classes.

Figure 9, which shows the nine cluster overlay map, not only reflects the trends already noted but also displays some new characteristics. The clusters centered on the hearths are subdivided here, and these subdivisions are compact and closely spaced. The separation between bone and lithics is maintained, with bone centroids concentrated on one side of the hearths and lithic centroids on smaller peripheral features, particularly in the southern area. There, two principal zones can now be distinguished: a large and more general one focused on the hearth itself and a second, more dispersed zone near a gravel feature. A gravel and stain feature located between the two northern areas shows a complex pattern of all artifact categories similar in content to the hearth area concentrations. Peripheral features are the centers of gravity for smaller assemblage clusters.


Figure 9. k-Means nine cluster solution maps overlaid for all ten artifact classes.

In summary, the first two analytic stages produce the following results. The three large scale artifact concentrations isolated by Leroi-Gourhan & Brezillon were defined here using a quantitative procedure for detecting point patterns. These three areas, as they noted, center on hearth features. The deposition of materials around these hearths, particularly of bone debris, is located off to one side. As more clusters are defined, smaller-scale zones are suggested which have somewhat differing depositional characteristics. Zones centered around the hearths contain most of the artifact classes, but others away from the features have fewer. The close packing of small clusters in some zones, and the dispersal of fewer clusters in others, suggests an overall relationship between artifact abundance and spatial organization. The ability of the k-means procedure to detect this kind of multi-level patterning is important. By providing sequential solutions of increasing complexity, the method allows this kind of multiple pattern recognition. The relationship between abundance and location, and its implications for artifact class associations, is the focus of our third research stage.

Stage III Analysis: Deposition Zones and Artifact Abundance

To examine the relationship between artifact abundance and location, we rank clusters by their point frequencies for each of the three k-means solutions. For example, the cluster with the greatest frequency is the first rank and is given the symbol “1” in Figures 10 to 12. We then plot the ranks on the excavation grid using the centroid coordinates. When the ranks for all ten classes are plotted together for a given cluster solution, the overall spatial patterning among classes in terms of relative abundance is illustrated. We examine rank maps for each cluster solution individually, then turn to a discussion of general trends indicated by all three.
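The ranking step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually coded for the study: each cluster solution is assumed to be available as a list of centroid coordinates plus one cluster label per artifact, ties in frequency are broken by cluster index (the text does not say how ties were handled), and the names `rank_clusters` and `rank_map` and the class name in the usage note are hypothetical.

```python
from collections import Counter


def rank_clusters(labels, centroids):
    """Rank the clusters of one artifact class by point frequency.

    Rank 1 is the cluster containing the most points. Ties are broken by
    cluster index, which is an assumption made for this sketch.
    Returns a list of (rank, centroid, count) tuples.
    """
    counts = Counter(labels)
    order = sorted(range(len(centroids)),
                   key=lambda c: (-counts.get(c, 0), c))
    return [(rank, centroids[c], counts.get(c, 0))
            for rank, c in enumerate(order, start=1)]


def rank_map(solutions_by_class):
    """Collect plottable rows (class, rank, x, y) for one cluster solution.

    solutions_by_class maps a class name to (centroids, labels).
    """
    rows = []
    for cls, (centroids, labels) in solutions_by_class.items():
        for rank, (x, y), _count in rank_clusters(labels, centroids):
            rows.append((cls, rank, x, y))
    return rows
```

For instance, `rank_map({"class_A": (centroids, labels)})` yields one row per cluster, and plotting the rank symbol at each `(x, y)` on the excavation grid for all classes together gives a map of the kind shown in Figures 10 to 12.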


Figure 10. Ranks of cluster centroid point frequencies plotted for all k-means three cluster solutions.

Figure 11. Ranks of cluster centroid point frequencies plotted for all k-means six cluster solutions.


Figure 12. Ranks of cluster centroid point frequencies plotted for all k-means nine cluster solutions.

Figure 10 shows the rank map for all first (three cluster) solutions. With few exceptions, primarily in the northern part of the site, the ranks distribute sequentially into the three hearth areas. The highest ranking clusters occur generally in the northwest area while the second ranks are concentrated in the northeast. All of the lowest ranking clusters occur in the southernmost area. To summarize this pattern, then, all artifact classes occur in each hearth area, and within a given area the rank value for all classes is remarkably similar.

This pattern holds through the second and third solutions. At the six cluster level (Figure 11), clear patterning is still evident. High ranks (1 and 2) tend to occur together as do middle ones (3 and 4). Low ranks are located away from the central hearths. Although more difficult to evaluate visually, the nine cluster map (Figure 12) exhibits the same relationships at a more detailed scale. The highest ranks are concentrated near the two northern hearths where they tend to occur in groups. The southern hearth area, least rich of the three focal hearths, contains mostly middle rank clusters. Finally, a series of peripheral zones is identified at the nine cluster solution. Located away from the central hearths, these contain only a few low ranking clusters and are scattered across the site surface.

Mapping the rank of point frequencies within clusters enables us to recognize a general and recurring relationship between abundance and location throughout the site. Cluster centroids for most of the major typological classes show similar rank values when found in spatial proximity. This is the case regardless of the total amount of material present in a given location. Importantly, this pattern is observed at each of the three clustering levels examined in this study. In other words, in three different characterizations of the Pincevent distributions, the relative abundance of artifact types, expressed as ranks, is spatially patterned no matter how large or small a spatial division is made.
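The observation that nearby centroids share similar ranks was made here by inspecting the rank maps. One possible way to tabulate the same relationship, offered only as a sketch, is to assign each ranked centroid to its nearest reference location (for example a hearth) and list the ranks falling in each area. The row layout `(class, rank, x, y)`, the area names and coordinates, and the functions `ranks_by_area` and `rank_spread` are assumptions introduced for illustration and are not part of the original analysis.

```python
import math
from collections import defaultdict


def ranks_by_area(rank_rows, areas):
    """Group cluster ranks by the nearest named reference location.

    rank_rows: iterable of (class_name, rank, x, y) tuples.
    areas: dict mapping an area name to its (x, y) coordinates
           (hypothetical locations supplied by the analyst).
    """
    grouped = defaultdict(list)
    for _cls, rank, x, y in rank_rows:
        nearest = min(areas, key=lambda name: math.dist((x, y), areas[name]))
        grouped[nearest].append(rank)
    return {name: sorted(ranks) for name, ranks in grouped.items()}


def rank_spread(grouped):
    """Difference between highest and lowest rank value within each area."""
    return {name: max(ranks) - min(ranks) for name, ranks in grouped.items()}
```

A small spread within an area would correspond to the pattern described above, in which the classes present at a location hold similar ranks whatever the total amount of material there.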


Clusters, and zones of overlapping clusters, are not units representing single events organized on the basis of activities. Rather, they represent the outcome of more complex depositional events or pathways. A series of events intervenes between artifact use and the point and time of artifact abandonment and deposition. With this perspective, one would expect artifacts that were not used together to be recovered from the same depositional contexts. While the overall expectation is against finding direct spatial associations between the use and location of artifacts within the site, there are likely to be some areas where such associations may be found. This kind of patterning may best be reflected in the smallest clusters identified, for example, at the nine cluster level.

Discussion

Leroi-Gourhan and Brezillon interpret the Pincevent occupation as three habitation areas, presumably covered, focused on central hearth features. Fan-shaped deposition areas and a common refuse pile between the two northern habitations received most of the disposed materials. Our analysis supports this interpretation and, in so doing, finds the k-means technique to be a quantitative procedure capable of identifying meaningful patterns in spatial distributions. Three major concentrations, centered on the hearths, are indicated by the first level of k-means analysis. All artifact categories displayed this zonation at the three cluster level. Bone and stone materials show some degree of separation within the fan-shaped deposits. The relative abundance of materials (expressed as ranks) is very similar among all artifact classes in the three areas. The second solution adds the communal deposition area (identified by the excavators) between the northern structures as well as a few peripheral zones. At this level, the cluster point frequencies vary little within zones but noticeably among areas. The nine cluster k-means solution continues this pattern while adding peripheral zones and subdividing the central deposition areas into several smaller zones.

In assessing the success of the k-means pattern recognition approach, the value of observing multiple levels of patterning must be stressed. We have examined spatial patterning at three different scales. Each scale contributes a slightly different view of the depositional processes occurring throughout the site. For example, it may be that the three cluster pattern reflects the presence prehistorically of three similar social units. The six cluster pattern may reflect the organization of space by those groups either at the same time (where patterning may represent the spacing of distinct social units or the expected use of the location when occupied) or at different times (where avoidance of previous deposits in favour of clear areas may have constrained habitation placement). The smallest deposition zones, defined at nine clusters, may approach the level of pattern detail required in a search for specific activity contexts.

We believe that our study offers avenues to guide future research in spatial analysis. Three problems seem especially important. A central theoretical issue is the complex relationship between artifact abundance and location. Artifact spatial associations may result from an interplay of processes involving dropping rates, the intensity of activity at a given locus, the number of activities carried out in particular site areas, and the length of occupation in different parts of the site. Some peripheral site areas may reflect specific activities in a more direct way. These more simply organized areas may provide information on which to base investigations of the more complex central spaces. We may need to study outlying areas, and their depositional events, at other sites to be aware of the range of activities expected within them. With this information, models for the kinds of activities to be expected within the complex central spaces may be constructed. Another issue, more technical in nature, is that of developing better ways of comparing several spatial distributions at the same time. Our use of ranks is but a start in this direction.


Finally, one of the most promising lines of study may well be the comparison of distributions from different sites using the k-means approach to pattern recognition. We are developing a repertoire of k-means summaries of the distributions from several Upper Paleolithic sites with which we may examine this potential.

Acknowledgements

We thank A. Ammerman for his indispensable guidance throughout all stages of this study. We also thank K. Kintigh for ensuring his k-means program was operational, A. Cook and A. Wylie for their contribution to the development of this project, and the consultants of the Computer Center, State University of New York at Binghamton, for solving many technical problems. We thank M. Conkey for substantive criticisms of the manuscript. Finally, we acknowledge A. Leroi-Gourhan and M. Brezillon for producing such an exemplary site report.

References

Greig-Smith, P. (1964). Quantitative Plant Ecology. London: Methuen.

Hietala, H. & Stevens, D. (1977). Spatial analysis: multiple procedures in pattern recognition studies. American Antiquity 42 (4), 539-559.

Hodder, I. & Orton, C. (1976). Spatial Analysis in Archaeology. Cambridge: Cambridge University Press.

Hodson, F. R. (1971). Numerical typology and prehistoric archaeology. In (F. R. Hodson, D. G. Kendall & P. Tautu, Eds) Mathematics in the Archaeological and Historical Sciences. Edinburgh: Edinburgh University Press, pp. 30-45.

Johnson, A. E. & Johnson, A. S. (1975). K-means and temporal variability in Kansas City Hopewell ceramics. American Antiquity 40, 283-295.

Kintigh, K. W. & Ammerman, A. J. (1982). Heuristic approaches to spatial analysis. American Antiquity 47 (1), 31-63.

Leroi-Gourhan, A. & Brezillon, M. (1972). Fouilles de Pincevent. Essai d’analyse ethnographique d’un habitat magdalénien (La Section 36). VIIème Supplément à Gallia Préhistoire. Paris: Éditions du Centre National de la Recherche Scientifique.

Pinder, D., Shimada, I. & Gregory, D. (1979). The nearest-neighbor statistic: archaeological applications and new developments. American Antiquity 44 (3), 430-445.

Whallon, R. (1973). Spatial analysis of occupation floors: the application of dimensional analysis of variance. American Antiquity 38, 266-278.

Yellen, J. (1977). Archaeological Approaches to the Present. New York: Academic Press.