Land cover classification with AVHRR multichannel composites in northern environments



ELSEVIER

Land Cover Classification with AVHRR Multichannel Composites in Northern Environments

Josef Cihlar,* Hung Ly,* and Qinghan Xiao*

The objectives of this study were to test the usefulness of various spectral channel combinations of AVHRR multitemporal composites for deriving land cover information in northern environments, and to assess the effect of AVHRR spatial resolution on the classification accuracy. A sequence of operations was carried out to remove radiometric distortions from AVHRR composites (1 km pixel size) prepared for the landmass of Canada using multidate NOAA-11 data for the 1993 growing season: atmospheric corrections for AVHRR Channels 1, 2, and 4; identification and replacement of cloud-contaminated pixels; bidirectional reflectance corrections of Channels 1 and 2; and principal component (PC) calculations to retain significant independent PC channels. Input principal components were classified using an unsupervised clustering algorithm, and accuracies were assessed through a comparison to 30 m Landsat TM pixels at five different sites in three biomes. We found that the normalized difference vegetation index (NDVI) was the most effective single spectral dimension to derive land cover types, but other channels (especially 1 and 2) were needed to obtain highest accuracies. Overall, classification accuracies for the 30 m pixels were between 45% and 60%. Mixes of land cover classes within AVHRR pixels were the principal reason for the low accuracies. When considering only AVHRR pixels with one dominant land cover type, the accuracy increased up to 80% or more in proportion to the mixed types retained. The accuracy also increased when a dispersed class (mixed forest) was combined with the more ubiquitous coniferous forest class. The intrinsic AVHRR resolution and the compositing process are the major factors influencing the impact of mixed cover types on the classification accuracy. The impact of these factors is discussed and strategies for optimizing the use of multitemporal AVHRR data in land cover classification are suggested.

*Applications Division, Canada Centre for Remote Sensing, Ottawa, Ontario, Canada

*Intera Information Technologies (Canada) Ltd., Ottawa, Ontario, Canada

Address correspondence to Josef Cihlar, Applications Div., Canada Centre for Remote Sensing, 588 Booth St., Ottawa, ON K1A 0Y7, Canada.

Received 30 June 1995; revised 29 August 1995.

REMOTE SENS. ENVIRON. 58:36-51 (1996) ©Government of Canada 655 Avenue of the Americas, New York, NY 10010

INTRODUCTION AND OBJECTIVES

Land cover mapping has been practiced for centuries. With the advent of aerial photography, a new perspective was added which greatly increased the quality and representativeness of the resulting maps. Until recently, however, it has not been possible to obtain data for mapping land cover over large areas and within a sufficiently short time period so that the result could be regarded as representing a point in time. As a result, the existing global and many regional land cover maps are a collage of detailed maps produced at different times and by various methods, thus giving widely different pictures of land cover distribution (DeFries and Townshend, 1993).

The availability of Landsat and other satellite images of the earth's surface greatly increased our capability to obtain up-to-date land cover information around the globe. So far, however, the use of these and similar high resolution data has suffered from the infrequent coverage and high costs, in addition to the high data volume. As an interim step, effort has in recent years been directed toward the use of medium resolution optical data such as obtained by the NOAA Advanced Very High Resolution Radiometer (AVHRR). AVHRR provides coarser spatial resolution (1.1 km at nadir) but much better temporal resolution with its daily coverage. From such data, images of the earth's surface with much reduced cloud cover can be prepared every 10-30 days (Townshend, 1994). The methodological challenge is to extract meaningful land cover information from these composite images.

0034-4257 / 96 / $00.00 SSDI 0034-4257(95)00210-3

Two approaches to using AVHRR data in land cover mapping have been explored in previous studies: single-date and multitemporal. In the first case, AVHRR data are analyzed in the same way as those from higher resolution sensors, with digital image classification being used to map different cover types. In phenologically simple environments, it has been shown that this approach can produce very good results (Pokrant, 1991; Beaubien and Simard, 1993). For example, Beaubien and Simard (1993) were able to identify and map 16 land cover classes in a boreal forest area, including five conifer classes with stand densities ranging from 25 to above 60%, two deciduous, and two mixed classes. Similarly, Iverson et al. (1994), Hlavka and Spanner (1995), and others have shown that single-date AVHRR data can be used to estimate the proportion of forest within a pixel provided the landscape is spectrally simple (topography and cover type). However, the single-date method has two inherent drawbacks. First, because of the differences in atmospheric and other conditions during acquisition, it requires that the classification be derived separately for each image and the differences in classification in the overlapping zone, if any, be reconciled (e.g., Stone et al., 1994). Second and more important, it is extremely difficult to find single-date AVHRR images that are cloud-free and thereby to compile a near-coincident coverage of large areas. A consistent land cover map representing a fixed point in time is a critical input into many scientific and resource management applications because many such uses consider processes and issues that are time-dependent. Therefore, increased effort has recently been directed at the use of composite AVHRR images (Loveland et al., 1991; Brown et al., 1993; Evans et al., 1993; Townshend et al., 1994; Running et al., 1995). A global AVHRR data set is in preparation to produce land cover information for the International Geosphere-Biosphere Program (Eidenshink and Faundeen, 1994). Several global and regional land cover maps have been produced from AVHRR data at spatial resolution/pixel size ranging from over 100 km to 1 km (Tucker et al., 1985; Townshend et al., 1987; DeFries and Townshend, 1994; Loveland et al., 1991; Brown et al., 1993).

The classification accuracy of land cover types from AVHRR data depends on two factors: spectral uniqueness (including the temporal dimension) of the signatures of individual cover types, and the spatial homogeneity of the AVHRR pixels. The use of multitemporal AVHRR data thus poses two methodological questions.

1. Which spectral bands should be used? In most previous studies employing multitemporal data, NDVI images (computed as the per-pixel difference between Channel 2 and Channel 1 reflectance or radiance, depending on the study, divided by the sum of the two channels) were used because of availability or ease of use. Since most of the previous composite data sets were prepared using maximum NDVI as the pixel selection criterion (Townshend et al., 1994), the NDVI profiles are relatively noise-free and also provide information on the seasonal dynamics of the surface cover. However, AVHRR data span five spectral channels, and thus potentially useful information may not be taken advantage of in this manner.

2. What is the effect of mixed land cover types within the larger pixels? At resolutions > 1 km most pixels represent mixed land cover types, with attendant problems in labeling and accuracy of the resulting maps.
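The NDVI definition given in the first question can be written out directly. The following is a minimal sketch; the function name and the sample reflectance values are hypothetical and are not taken from the study.

```python
def ndvi(ch1, ch2):
    """Normalized difference of Channel 2 and Channel 1 for one pixel.

    Returns None when the two channels sum to zero, since the ratio
    is undefined in that case.
    """
    total = ch2 + ch1
    if total == 0:
        return None
    return (ch2 - ch1) / total


# Hypothetical reflectances, for illustration only.
vegetated = ndvi(ch1=0.05, ch2=0.40)   # large Channel 2 excess, NDVI near 0.78
bare = ndvi(ch1=0.20, ch2=0.25)        # small Channel 2 excess, NDVI near 0.11
```

Because the index is a ratio, it is bounded between -1 and 1 for non-negative inputs, which is part of why it is convenient as a compositing criterion.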

The purpose of this study was to analyze the above questions as part of a land cover mapping study for the Canadian landmass. The second question is addressed using AVHRR data and an analysis of effective pixel size in composite images. Because of the range of environments included in the study the results should be applicable to the circumpolar landmass north of about 50°N.

METHODOLOGY

The processing and analysis of AVHRR data consisted of a series of steps.

Step 1: Compositing. AVHRR composite data for the 1993 growing season (1 May to 10 October) were used in this study. The composites were prepared for 15 10-day periods (3 per month) using the GEOCOMP system (Robertson et al., 1992) operated by the Manitoba Remote Sensing Centre. For each 10-day period, daily NOAA-11 afternoon pass images were geolocated, resampled into a Lambert Conformal Conic projection using a 16-point Kaiser resampling kernel, and combined to select the most cloud-free pixel using the maximum NDVI value as the selection criterion. The selected pixels could have view zenith angles up to 68.7° and thus represent relatively large surface areas with a consequent mixture of cover types. The effect of the large pixel size is examined in the Appendix. GEOCOMP produces a 10-channel data set which includes for each selected pixel: composited Channels 1-5, NDVI, three angles (view zenith, solar zenith, relative azimuth), and the date of imaging.
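The maximum-NDVI selection rule in Step 1 can be illustrated with a short NumPy sketch. This is not the GEOCOMP implementation; the function name, the array layout (days, rows, columns) and the use of NaN for missing data are assumptions made for the example.

```python
import numpy as np


def max_ndvi_composite(ndvi_stack, channel_stack):
    """Pick, per pixel, the channel value from the day with maximum NDVI.

    ndvi_stack:    (days, rows, cols) daily NDVI, NaN where no data.
    channel_stack: (days, rows, cols) daily values of one channel (float).
    Returns (composite, day_index); pixels with no valid day get NaN and -1.
    """
    valid = ~np.isnan(ndvi_stack)
    # Replace missing NDVI by -inf so that argmax never selects it
    # when at least one valid day exists.
    filled = np.where(valid, ndvi_stack, -np.inf)
    day_index = np.argmax(filled, axis=0)
    composite = np.take_along_axis(
        channel_stack, day_index[None, ...], axis=0
    )[0]
    no_data = ~valid.any(axis=0)
    composite = np.where(no_data, np.nan, composite)
    day_index = np.where(no_data, -1, day_index)
    return composite, day_index
```

Returning the selected day index alongside the composite mirrors the idea that the date of imaging and the viewing geometry travel with each selected pixel.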


Step 2: Corrections. The GEOCOMP images were further processed to reduce noise in the composites. Channels 1 and 2 were corrected for atmospheric effects using the SMAC program (Rahman and Dedieu, 1994), and the NDVI was recomputed. The NDVI was corrected for solar zenith angle effect using the coefficients of Sellers et al. (1994) derived from a global AVHRR global area coverage (GAC) data set. Land cover information from Pokrant (1991) was used for this step as well as for bidirectional reflectance corrections below. This data set has 1 km pixel size, but the distribution of classes has been generalized by spatial smoothing. Consequently, the distribution represents regional trends rather than local detail. Because of this, and given that only some principal components were used in the classification, the use of the land cover data set in the preprocessing is not expected to influence the results. A new procedure (Cihlar, 1996) was used to identify pixels contaminated by clouds, snow, or similar effects, and the contaminated pixels in each composite were replaced through temporal interpolation. To reduce the significant noise present in multitemporal Channel 1 and 2 data, these were corrected by deriving land cover-dependent coefficients for the Walthall et al. (1984) model in each composite period using a 10% sample of the data. The sample was selected and the results applied on a pixel basis using the land cover data set of Pokrant (1991). The contaminated pixels were identified using the above derived masks and were replaced in the same way as for NDVI. Channel 4 data were corrected for atmospheric effects using the split window approach of Price (1984). The contaminated pixels were then replaced through interpolation, in the same way as for NDVI. No emissivity corrections were made to Channel 4 data. At this stage, the data set consisted of 64 channels (C1, C2, C4, NDVI for 15 dates), each channel 5700 × 4800 pixels in size.
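The replacement of contaminated pixels through temporal interpolation can be sketched for a single pixel's seasonal series. The text does not specify the interpolation scheme, so linear interpolation between the nearest clean composite periods is an assumption here, as are the function name and the boolean contamination mask.

```python
import numpy as np


def fill_contaminated(series, contaminated):
    """Linearly interpolate flagged samples of one pixel's seasonal series.

    series:       values for successive composite periods.
    contaminated: boolean flags of the same length (True = replace).
    Flagged samples before the first or after the last clean sample take
    the nearest clean value, because np.interp holds its end values.
    """
    series = np.asarray(series, dtype=float)
    contaminated = np.asarray(contaminated, dtype=bool)
    clean = ~contaminated
    if not clean.any():
        return series.copy()  # nothing to interpolate from
    t = np.arange(series.size)
    filled = series.copy()
    filled[contaminated] = np.interp(t[contaminated], t[clean], series[clean])
    return filled
```

For example, a series of 0.2, 0.9, 0.4 with the middle period flagged becomes 0.2, 0.3, 0.4.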

Step 3: Preprocessing. A principal component (PC) analysis was used to reduce the data volume while retaining significant information. Pultz et al. (1992) and Moorman et al. (1993) found that the higher-order PCs contain progressively more detailed information on the seasonal dynamics of the land cover and, given the mathematical basis of PC, also an increasingly higher proportion of noise. PC eigenvalues could therefore be used to select thresholds in retaining the PCs for further analysis. PC images were computed separately for each channel set (C1, C2, C4, NDVI) for the data set (except for period 4, which originally contained a high proportion of missing data). Only some PCs were retained for the analysis, based on the percentage variance explained by the individual PCs. The retained PCs accounted for 96.3% (C1), 97.8% (C2), 96.6% (C4), and 96.1% (NDVI) of the total variance in the input data (Table 1, Fig. 1). Figure 2 shows the scores (eigenvalues) for most of the principal components used; the remaining were omitted for clarity. In general, the first component corresponds to the mean seasonal value, and higher components capture the increasingly detailed elements of the seasonal distribution. This is most evident for NDVI (Fig. 2a) where, for example, the second component contrasts peak green with the ends of the growing season, the third contrasts green-up with senescence, and so on. Eigenvalues for some of the higher components behaved less regularly, showing the effect of image noise, for example in some components for Channels 1 and 2 (Figs. 2b and 2c). Nevertheless, all the components retained showed some regional patterns (beyond the obvious noise); for this reason and to account for approximately the same proportion of total variation in all channels, they were also used in the analysis.
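The data reduction in Step 3 can be illustrated with a small principal component sketch for one channel set. The study selected PCs from the variance explained by individual components; the single cumulative-variance threshold used below, the function name and the (pixels, dates) layout are simplifying assumptions for illustration.

```python
import numpy as np


def seasonal_pca(profiles, variance_to_keep=0.96):
    """PCA of (pixels, dates) seasonal profiles for one channel set.

    Returns (scores, eigenvectors, explained) for the leading components
    whose cumulative explained variance first reaches variance_to_keep.
    """
    centered = profiles - profiles.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    n_keep = int(np.searchsorted(np.cumsum(explained), variance_to_keep) + 1)
    n_keep = min(n_keep, eigvals.size)            # guard against round-off
    scores = centered @ eigvecs[:, :n_keep]
    return scores, eigvecs[:, :n_keep], explained[:n_keep]
```

The columns of the returned eigenvector array are the per-date weights of each component, which is the kind of quantity plotted against composite period in Figure 2.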

Figure 1. Variance in the seasonal profile of four spectral dimensions accounted for by various principal components. C1 (C2) = surface reflectance, C4 = radiometric temperature, NDVI = normalized difference vegetation index. NOAA-11 AVHRR data set over the Canadian landmass. [Plot not reproduced; horizontal axis: PC number; series: C1, C2, C4, NDVI.]

Table 1. Band Combinations and Classification Parameters Used

| Input Data (Number of PCs)(a) | Number of Clusters (MAXCLUS) | Number of Iterations (MAXNUMIT) | Cluster Combining Distance (MAXCLSTD) | Cluster Splitting Distance (CLUSDIST) | Minimum Number of Pixels (MINNUM) |
|---|---|---|---|---|---|
| C1(9) | 128 | 12 | 3.2 | 4.5 | 30 |
| C2(11) | 128 | 12 | 3.2 | 4.5 | 30 |
| N(5) | 128 | 12 | 3.2 | 4.5 | 30 |
| C4(6) | 128 | 12 | 3.2 | 4.5 | 30 |
| C1(7)C2(9) | 150 | 15 | 3.2 | 4.5 | 30 |
| C1(6)C2(7)N(3) | 150 | 15 | 3.2 | 4.5 | 30 |
| C1(6)C2(7)C4(3) | 150 | 15 | 3.2 | 4.5 | 30 |
| C1(5)C2(7)N(3)C4(1) | 150 | 15 | 3.2 | 4.5 | 30 |

(a) The first x PCs were used in the classification (e.g., PC1-9 for C1, PC1-7 for C1 plus PC1-9 for C2 for the C1C2 combination, etc.)

Step 4: AVHRR Classification. The retained PC images were classified using the ISOCLASS algorithm (Tou and Gonzales, 1974). The algorithm is a parametric clustering algorithm in which spectrally distinct clusters are identified through an iterative procedure. During each iteration a pixel is assigned to the nearest cluster mean (measured by Euclidean distance), and new means are recomputed. The initial assumption is that all pixels form one cluster. In subsequent iterations a cluster is subdivided along the jth axis (spectral channel) provided that the jth axis has the highest standard deviation (for that cluster) which is also higher than the threshold maximum cluster distance, and the cluster has more than a minimum number of pixels. On a given iteration, all clusters that have standard deviation higher than the threshold are split. If the maximum number of clusters has been reached, pixel reclassification continues without creating new clusters. At each iteration the intercluster distances are checked, and clusters are combined if the distance is below a threshold. The split and merge procedure continues until the specified number of iterations is reached.
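The split-and-merge procedure described in Step 4 can be sketched in a few lines of NumPy. This is a simplified illustration, not the ISOCLASS code used in the study: the function and parameter names are hypothetical, a split places the two new means one standard deviation on either side of the old mean along the widest axis, and merged clusters take the unweighted midpoint of their means.

```python
import numpy as np


def assign(data, means):
    """Index of the nearest mean (Euclidean distance) for every pixel."""
    d = np.linalg.norm(data[:, None, :] - means[None, :, :], axis=2)
    return np.argmin(d, axis=1)


def merge_close(means, merge_dist):
    """Merge the closest pair of means while it is closer than merge_dist."""
    means = [np.asarray(m, dtype=float) for m in means]
    while len(means) > 1:
        arr = np.array(means)
        d = np.linalg.norm(arr[:, None, :] - arr[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)
        a, b = np.unravel_index(np.argmin(d), d.shape)
        if d[a, b] >= merge_dist:
            break
        means[a] = (arr[a] + arr[b]) / 2.0   # unweighted midpoint (assumption)
        del means[b]
    return np.array(means)


def isodata(data, max_clusters, n_iter, split_std, merge_dist, min_pixels):
    """Simplified split-and-merge clustering of (pixels, channels) data."""
    means = data.mean(axis=0, keepdims=True)          # start with one cluster
    for _ in range(n_iter):
        labels = assign(data, means)
        count = len(means)                            # clusters after this pass
        new_means = []
        for k in range(len(means)):
            members = data[labels == k]
            if len(members) == 0:
                count -= 1                            # drop empty clusters
                continue
            mean, std = members.mean(axis=0), members.std(axis=0)
            j = int(np.argmax(std))                   # axis with largest spread
            can_split = (std[j] > split_std and len(members) > min_pixels
                         and count < max_clusters)
            if can_split:
                offset = np.zeros_like(mean)
                offset[j] = std[j]
                new_means.extend([mean - offset, mean + offset])
                count += 1
            else:
                new_means.append(mean)
        means = merge_close(new_means, merge_dist)
    return means, assign(data, means)
```

In this sketch a split separates the two new means by twice the cluster's standard deviation, so the merge distance should be chosen below twice the split threshold; otherwise freshly split clusters would be merged again in the same iteration.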

ISOCLASS thus requires that several parameters be specified. Eight different PC classification sets were used, shown in Table 1 with the associated ISOCLASS parameters. The classifications use PCs derived from AVHRR Channels 1, 2, 4 and NDVI. NDVI data were previously shown to identify useful land cover categories through the seasonal variation of greenness (Loveland et al., 1991). Channels 1 (wavelengths 0.57-0.70 μm) and 2 (0.72-0.98 μm) were also shown to be effective in land cover identification (Beaubien and Simard, 1993). They add spectral albedo information, that is, differences between dark and light targets of the same greenness. Channel 4 (10.3-11.3 μm) was expected to contribute temperature information which varies in relation to the land-cover dependent surface energy exchange.

The classification was applied to data for the Canadian landmass only, identified with the aid of a water mask (Pokrant, 1991). To derive cluster statistics, the PC images were sampled to include only every sixth line and sixth pixel. Once the means of all clusters were determined through ISOCLASS, the full-resolution PC images were classified by assigning each pixel using the minimum Euclidean distance with respect to all cluster means. This procedure was carried out separately for each PC combination.
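The two-stage scheme in the paragraph above (cluster statistics from a subsample, then minimum-distance labeling at full resolution) can be sketched as follows. The function names, the (rows, cols, channels) layout and the strip size are assumptions for the example; the cluster means would come from a clustering run on the subsample.

```python
import numpy as np


def subsample(pc_image, step=6):
    """Keep every `step`-th line and pixel of a (rows, cols, channels) image
    and return the samples as a (n_samples, channels) array."""
    return pc_image[::step, ::step, :].reshape(-1, pc_image.shape[2])


def classify_min_distance(pc_image, cluster_means, block_rows=256):
    """Label each pixel with the index of the nearest cluster mean."""
    rows, cols, _ = pc_image.shape
    labels = np.empty((rows, cols), dtype=np.int32)
    # Work in horizontal strips so the distance array stays small.
    for start in range(0, rows, block_rows):
        block = pc_image[start:start + block_rows]
        diff = block[:, :, None, :] - cluster_means[None, None, :, :]
        d2 = (diff ** 2).sum(axis=3)       # squared Euclidean distance
        labels[start:start + block_rows] = np.argmin(d2, axis=2)
    return labels
```

Squared distances are sufficient here because the nearest mean is the same with or without the square root.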

Step 5: Ground Data. The classification results were evaluated using Landsat Thematic Mapper (TM)-derived classifications. Several different land environments were chosen (Fig. 3, Table 2), including arctic tundra, boreal forest and a cropland/grassland area, to determine the effect of land cover combinations on the resulting accuracies. The land cover classes used correspond to an interim version of the IGBP set of classes (Belward, 1995; Table 3) and are referred to as "IGBP classes" in the remainder of this article. Other IGBP classes did not occur in the areas studied. Among the five test sites, three TM classifications were available from other sources; in these cases, the corresponding IGBP classes were identified and the Landsat-derived classes were grouped correspondingly. Two new classifications were prepared for this study (Head Lake and Matagami, Table 2) and in each case, the clusters were labeled directly as belonging to one of the IGBP classes.

Step 6: Coregistration. Each Landsat-derived classification was registered to the AVHRR data. To maximize registration accuracy of these dissimilar data types, ground control points were selected for the Landsat TM images on 1:50,000 topographic maps, and the images were reprojected into the Lambert Conformal Conic projection using the PCI software package (V.5.3). In this process, the pixel size of the AVHRR classifications was changed to 30 m through nearest-neighbor resampling. As a result, two coregistered data sets were obtained: classified and labeled Landsat data, and clustered AVHRR data. Consequently, each spectral AVHRR cluster could be directly related to the Landsat classes to determine the frequency of co-occurrence.

Figure 2. Eigenvalues for principal components of the AVHRR multitemporal data used in the analysis: NDVI (2a), Channel 1 (2b), Channel 2 (2c), and Channel 4 (2d). For clarity, only the first six components are shown for Channels 1 and 2. [Plots not reproduced.]

Step 7: Labeling. Individual AVHRR clusters were assigned to a TM-derived land cover class, separately for each PC classification result and for each site. The assignment was made by finding the TM class with which the particular AVHRR cluster occurred most frequently (i.e., labeling and accuracy testing was done with the same reference data). This approach represents the most favorable case for the AVHRR classification, comparable to using complete-cover training data set (thus optimizing the cluster labeling). It was used to obtain a sensitive measure of the information content of the various channels, not to assess the likely classification accuracy over large areas. All AVHRR clusters labeled as one TM class were combined and assigned the same label.
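The labeling rule in Step 7 amounts to a majority vote over co-occurring 30 m pixels. A minimal sketch, assuming the two coregistered data sets are supplied as equal-length sequences of per-pixel cluster ids and TM class names (the function name and the sample values are hypothetical):

```python
from collections import Counter, defaultdict


def label_clusters(avhrr_clusters, tm_classes):
    """Map each AVHRR cluster id to the TM class it co-occurs with most often.

    Both inputs run over the same coregistered 30 m pixels.
    Ties are resolved in favor of the class encountered first.
    """
    cooccurrence = defaultdict(Counter)
    for cluster, tm_class in zip(avhrr_clusters, tm_classes):
        cooccurrence[cluster][tm_class] += 1
    return {cluster: counts.most_common(1)[0][0]
            for cluster, counts in cooccurrence.items()}


# Hypothetical six-pixel example.
labels = label_clusters(
    [1, 1, 1, 2, 2, 3],
    ["conifer", "conifer", "fen", "water", "water", "fen"],
)
```

In the example, cluster 1 is labeled "conifer" even though one of its three pixels is fen, which is the mechanism by which small-patch classes are absorbed into the dominant class.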

Figure 3. Locations of the five test sites used to assess the AVHRR classification accuracies.

Step 8: Accuracy evaluation. Confusion matrices were prepared for all sites and classification combinations. Two measures of overall accuracy were then used, diagonal accuracy (DiAc) and Khat. Both are rather strict measures given the difference in spatial resolution of the two sensors; they were used to evaluate, as rigorously as possible, the correspondence of ground classes to the AVHRR classified results. The diagonal accuracies were computed as

$$\mathrm{DiAc}(i,i) = \frac{100 \sum_{i=j}^{q} P(\mathrm{avhrr})_i}{P(\mathrm{tm})_i} \tag{1}$$

where:

P(tm)_i = number of pixels in land cover class i,
P(avhrr)_i = number of 30 m pixels in the AVHRR clusters labeled as TM class i (Step 7 above),
j-q = AVHRR clusters labeled as class i.

DiAc thus measures the proportion of ground cover that was "positively identified" as that cover through AVHRR classification.

The Khat distance was also computed for each confusion matrix (Congalton, 1991) as follows:

$$\mathrm{Khat} = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})}{N^2 - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})} \tag{2}$$

where

x_{ii} = total number of pixels in row i, column i,
N = total number of pixels,
r = number of rows (columns),
x_{i+} and x_{+i} = row and column totals, respectively.

As seen from Eq. (2), Khat measures the dispersion outside of the diagonal in the confusion matrix in relation to the concentration along the diagonal axis.
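Both accuracy measures can be computed from a confusion matrix with a few lines of plain Python. The sketch below assumes that rows hold the TM reference classes and columns hold the AVHRR labels, so that the per-class diagonal accuracy is the diagonal count divided by the row total; that convention and the function names are assumptions, and the sample matrix is hypothetical.

```python
def khat(matrix):
    """Khat from a square confusion matrix given as a list of rows, Eq. (2)."""
    r = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(r))
    row_totals = [sum(matrix[i]) for i in range(r)]
    col_totals = [sum(matrix[i][j] for i in range(r)) for j in range(r)]
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))
    return (n * diagonal - chance) / (n * n - chance)


def diagonal_accuracy(matrix):
    """Per-class percentage of reference pixels found on the diagonal.

    Assumes rows are TM reference classes and that no row is empty.
    """
    return [100 * matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


# Hypothetical two-class matrix: rows = TM class, columns = AVHRR label.
example = [[40, 10],
           [20, 30]]
example_khat = khat(example)               # (100*70 - 5000) / (10000 - 5000)
example_diac = diagonal_accuracy(example)  # 40 of 50 and 30 of 50 pixels
```

A matrix with all pixels on the diagonal gives Khat = 1, while a matrix whose diagonal matches what the row and column totals alone would predict gives Khat = 0, which is why Khat can be low even when one dominant class is classified well.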

RESULTS

Effectiveness of Various Spectral Band Combinations

Table 4 shows an example of the confusion matrices for individual PC channel combinations at the SSA site. For brevity, only matrices for combined classes are shown (an example of the full matrix is in Table 6). The most frequent type (coniferous forest for SSA, NSA, MAT; grassland for SAL) was classified significantly more accurately (mostly above 85%) than the remaining classes in either case. Ground cover types occurring in smaller patches (fen, small water bodies, regeneration, disturbances) were usually misclassified, most frequently as coniferous forest. The differences between classification results from various PC channel combinations were not large, the diagonal accuracies varying between 53% and 57%.

Figure 4 shows the average diagonal accuracies for the five test sites. The individual TM classes were merged at three sites (SSA, NSA, MAT) to coincide with the IGBP classes; only IGBP classes were used at the two remaining sites. The AVHRR diagonal accuracies ranged between 42% and 63%, without a clear pattern among sites. In some cases, the accuracies were insensitive to the PC channels used (e.g., NSA and less so SSA, SAL) while in others the differences due to the channels employed were more substantial (MAT, HLA). In most cases, combinations including NDVI (N, C12N, C12N4) yielded higher classification accuracies. For all five sites, NDVI was the most consistently accurate channel if all classes were considered. C4 alone provided results at the lower end of the range in all cases except for HLA. Considering all sites and all classes per site, the effectiveness of individual PC combinations decreased approximately in the order NDVI > C2 > C1-C4 (Fig. 4).

The addition of other channels to NDVI increased the overall site accuracy to varying degrees, usually only marginally. However, various PC channels performed best if the accuracies for individual classes are considered. This relation is summarized in Table 5 by counting the number of the IGBP land cover classes, at each site, for which a given spectral combination provided the highest accuracy. The C12N combination was the single most successful one (27%), followed by C2, and C1-N-C124. It should be noted that while NDVI alone was best in only 14% of cases, all combinations involving NDVI were best in 46%.

Figure 5 shows the trends in Khat for the five sites. In general, the values were low (Khat = 1 for perfect classification), ranging from 0.04 to 0.32. The differences between the various spectral combinations were considerable, except for the NSA site. Khat values are consistent with the DiAc results: highest accuracies for combinations involving NDVI except for MAT, low accuracies for C4, and C2 with better performance than C1 or C4.

The effect of the size of the area used in deriving spectral clusters was tested by clustering full resolution data over a 280 km × 650 km area encompassing both BOREAS sites (i.e., not sampling the data to derive the cluster means and variances). This classification was completed only for NDVI PCs, using 10 clusters. The remainder of the procedure was as described above. It was found that for both SSA and NSA test areas, the difference in DiAc (Khat) was less than 1% (2%) compared to the classification derived for the whole country.

An assessment of the consistency of labeling a given spectral cluster with the same cover type at the five test sites was carried out for the NDVI PC classification. Approximately 60% of the clusters were consistently assigned to one of the IGBP land cover classes while the remainder was assigned to different classes at one or more sites. Confusions included vegetation classes (e.g., coniferous vs. deciduous) as well as nonvegetated (water HLA / NSA / SSA vs. bare soil or grassland SAL, conifers NSA vs. disturbed MAT, etc.). In many cases the confusion could be explained by the similarities of NDVI seasonal trajectories, and some should not occur in the other spectral combinations.

Effect of Spatial Land Cover Distribution

The generally low overall classification accuracies shown above are in large part related to the variability of ground cover types within AVHRR pixels. This is illustrated in Table 6 which shows the confusion matrices for three increasingly broader classes at the SSA site. When combining the Landsat-derived classes (Table 2) into 6 IGBP classes (Table 3), only a small increase in DiAc (56-57%) and none in Khat (0.29) took place because these merged classes occupied only a small


Table 2. Test Sites and Data Sources

| Site | Landsat Path / Row (Date) | Scene Center Lat / Long (°) (Size, km²) | Classification Procedure |
|---|---|---|---|
| SSA (BOREAS SSA) | 37 / 22-23 (90/08/06) | 53.8 / 105.4 (9850) | Note a |
| NSA (BOREAS NSA) | 33 / 21 (88/08/20) | 55.8 / 98.2 (9045) | Note b |
| MAT (Matagami, Quebec) | 18 / 25 (91/08/20) | 50.5 / 77.5 (9380) | Note c |
| SAL (Tabor, Alberta) | 40 / 25 (88/07/20) | 49.99 / 111.56 (29660) | Note d |
| HLA (Head Lake, NWT) | 47 / 13 (92/07/14) | 67.29 / 110.91 (7640) | Note e |

a Classification of TM Bands 1-5 and 7 prepared for the BOREAS project was used as the reference (Hall and Knapp, 1994a). It is based on supervised classification of an image from 6 August 1990 and supported by extensive field checking. The following original classes were identified: wet conifers (primarily black spruce), dry conifer (jack pine), mixed (coniferous and deciduous), deciduous, disturbed, fen (wetland), water, regeneration (medium age), regeneration (younger), regeneration (older), burn (visible). Accuracy was assessed through confusion matrix using independently visited sites. The diagonal accuracy was 66.6% and Khat = 0.56 (Hall and Knapp, 1994a).

b Classification of TM Bands 1-5 and 7 prepared for the BOREAS project was used as the reference (Hall and Knapp, 1994b). It is based on supervised classification of an image from 20 August 1988 and supported by extensive field checking. The following classes were identified: wet conifers (primarily black spruce), dry conifer (jack pine), mixed (coniferous and deciduous), deciduous, disturbed, fen (wetland), water, regeneration (medium age), regeneration (older), burn (visible). Accuracy was assessed through confusion matrix using independently visited sites. The diagonal accuracy was 73% and Khat = 0.63 (Hall and Knapp, 1994b).

c Classification of TM image from 20 August 1991 prepared for the BIOME-TEL project was used (Royer et al., 1994). It is based on supervised classification supported by air photos. The original 20 classes were grouped into seven: coniferous (including density classes > 60%, 40-60%, 25-40%), mixed (> 55% coniferous, > 55% deciduous), deciduous (density > 40%, with or without coniferous understorey), disturbed (recent cuts, recent cuts more or less covered by vegetation), wetland (wetland with 10-25% conifers, wetland with < 10% conifers, grass-dominated wetland), regeneration (various stages after disturbances). The training data and validation were obtained primarily from air photographs. Accuracy was not assessed statistically but the procedure was extensively used in similar environments in different parts of Quebec.

d Landsat TM Channels 1, 2, 3, 4, 5, and 7 were classified using ISOCLASS (25 clusters), labeled based on site visits (early 1995) in five classes corresponding to the IGBP categories (Table 3): grassland, annual broadleaf crops, mixed annual crops, water, bare soil (i.e., summerfallow). Confusion matrix was not computed but the classification is considered fairly accurate given the spectral contrasts between the broad land cover classes.

e Landsat TM Channels 1, 2, 3, 4, 5, and 7 were classified using ISOCLASS (25 clusters), labeled based on site visits in three classes corresponding to the IGBP categories (Table 3): grassland, bare soil or rock, water. Confusion matrix was not computed but the classification is considered fairly accurate based on the field information and the simple land cover conditions in the area.

Table 3. IGBP Classes Used in the Tests (a)

Evergreen needleleaf trees and shrubs (= coniferous forest)
Deciduous broadleaf trees and shrubs (= deciduous forest)
Mixed trees and shrubs (= mixed)
Grasslands (= grassland)
Permanent wetland (= fen)
Annual broadleaf crops
Annual grass crops
Mixed annual crops
Bare soil and rocks (= barren)
Water bodies (= water)

a Notes: 1) The name of the class used in this article is in the parentheses. 2) The IGBP classes were taken from an interim document. The final version may differ in some cases (e.g., annual broadleaf and grass crops are likely to be merged). 3) Source: Belward (1995).


Table 4. Confusion Matrices for Eight Classifications Using Various Spectral Combinations of IGBP Classes

            CON    CONDEC  DEC    FEN    WAT    OTHER (dist., regen)

C1
  CON       0.96   0.91    0.76   0.96   0.35   0.91
  CONDEC    0.00   0.00    0.00   0.00   0.00   0.00
  DEC       0.02   0.07    0.21   0.03   0.05   0.06
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.01   0.02    0.03   0.01   0.60   0.03
  OTHER     0.00   0.00    0.00   0.00   0.00   0.00
  Accur. Di 0.54   Khat 0.19

C2
  CON       0.92   0.74    0.42   0.91   0.35   0.80
  CONDEC    0.02   0.07    0.10   0.03   0.02   0.05
  DEC       0.05   0.17    0.45   0.05   0.08   0.13
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.01   0.02    0.02   0.01   0.55   0.02
  OTHER     0.00   0.00    0.00   0.00   0.00   0.00
  Accur. Di 0.55   Khat 0.25

N
  CON       0.92   0.70    0.36   0.91   0.31   0.84
  CONDEC    0.04   0.16    0.25   0.06   0.02   0.11
  DEC       0.02   0.12    0.39   0.02   0.02   0.03
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.01   0.01    0.01   0.01   0.65   0.02
  OTHER     0.00   0.00    0.00   0.00   0.00   0.00
  Accur. Di 0.57   Khat 0.29

C4
  CON       0.98   0.96    0.91   0.99   0.42   0.99
  CONDEC    0.00   0.00    0.00   0.00   0.00   0.00
  DEC       0.01   0.02    0.05   0.01   0.05   0.00
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.01   0.02    0.03   0.01   0.53   0.01
  OTHER     0.00   0.00    0.00   0.00   0.00   0.00
  Accur. Di 0.53   Khat 0.14

C12
  CON       0.95   0.83    0.59   0.95   0.34   0.87
  CONDEC    0.00   0.01    0.01   0.00   0.00   0.01
  DEC       0.04   0.14    0.37   0.04   0.07   0.10
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.01   0.01    0.03   0.01   0.59   0.02
  OTHER     0.00   0.00    0.00   0.00   0.00   0.00
  Accur. Di 0.55   Khat 0.23

C12N
  CON       0.93   0.75    0.41   0.93   0.30   0.87
  CONDEC    0.01   0.04    0.05   0.02   0.00   0.02
  DEC       0.04   0.19    0.53   0.04   0.03   0.07
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.02   0.01    0.01   0.01   0.67   0.03
  OTHER     0.00   0.00    0.00   0.00   0.00   0.01
  Accur. Di 0.57   Khat 0.29

C124
  CON       0.98   0.94    0.87   0.98   0.45   0.95
  CONDEC    0.00   0.00    0.00   0.00   0.00   0.00
  DEC       0.01   0.04    0.09   0.01   0.01   0.02
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.01   0.02    0.03   0.00   0.54   0.01
  OTHER     0.00   0.00    0.00   0.00   0.00   0.01
  Accur. Di 0.53   Khat 0.15

C12N4
  CON       0.92   0.72    0.37   0.91   0.30   0.84
  CONDEC    0.04   0.14    0.18   0.05   0.01   0.08
  DEC       0.03   0.13    0.43   0.03   0.03   0.05
  FEN       0.00   0.00    0.00   0.00   0.00   0.00
  WAT       0.01   0.01    0.01   0.01   0.66   0.04
  OTHER     0.00   0.00    0.00   0.00   0.00   0.00
  Accur. Di 0.57   Khat 0.29


Figure 4. Average diagonal classification accuracy (%) for five sites and eight spectral combinations (C1, C2, N, C4, C12, C12N, C124, C12N4): SSA (NSA) = BOREAS Southern (Northern) Study Area; MAT = Matagami, Quebec; SAL = Taber, Southern Alberta; HLA = Head Lake, NWT.

portion of the area (refer to row "Fraction" in Table 6). Only when the mixed class (a separate category in the IGBP nomenclature, Table 1), a relatively large class (17% of the area) frequently adjacent to the conifers, was merged, did accuracies change substantially (DiAc increased to 71%, Khat to 0.33). Another consequence of the patchiness is that the small classes were not differentiated at all. For example, the fen category which is present in small patches over a low total proportion of the area was not separated in the AVHRR classification (Table 6).
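As an illustration of how DiAc and Khat depend on the class proportions, the following minimal sketch recomputes both statistics from the six-class matrix of Table 6b. It assumes that each column of the table is normalized to the TM (reference) class and that the "Fraction" row gives the share of the area occupied by each reference class; the function and variable names are hypothetical and not part of the original processing chain.

```python
# Table 6b (6 IGBP classes, SSA site, NDVI PCs).
# Rows: AVHRR class; columns: TM reference class (each column sums to ~1).
CLASSES = ["Conif", "Mixed", "Deciduous", "Fen", "Water", "Other"]
COLUMN_NORMALIZED = [
    [0.92, 0.70, 0.36, 0.91, 0.31, 0.84],
    [0.04, 0.16, 0.25, 0.06, 0.02, 0.11],
    [0.02, 0.12, 0.39, 0.02, 0.02, 0.03],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    [0.01, 0.01, 0.01, 0.01, 0.65, 0.02],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
]
FRACTION = [0.48, 0.17, 0.10, 0.04, 0.10, 0.12]  # area share per reference class


def accuracy_stats(matrix, fraction):
    """Return (diagonal accuracy, Khat) from a column-normalized matrix."""
    n = len(fraction)
    # Weight each column by its area share to obtain joint proportions.
    joint = [[matrix[r][c] * fraction[c] for c in range(n)] for r in range(n)]
    # Renormalize because the published values are rounded.
    total = sum(sum(row) for row in joint)
    joint = [[v / total for v in row] for row in joint]
    observed = sum(joint[k][k] for k in range(n))
    row_marg = [sum(joint[r]) for r in range(n)]
    col_marg = [sum(joint[r][c] for r in range(n)) for c in range(n)]
    # Chance agreement expected from the marginal proportions.
    expected = sum(row_marg[k] * col_marg[k] for k in range(n))
    khat = (observed - expected) / (1.0 - expected)
    return observed, khat


diac, khat = accuracy_stats(COLUMN_NORMALIZED, FRACTION)
print(f"DiAc = {diac:.3f}, Khat = {khat:.3f}")
```

Under these assumptions the result is close to the DiAc of 0.57 and Khat of 0.29 given in Table 6b. Because the conifer class dominates the area, its column largely controls both statistics, which is why merging small classes changes them so little.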

The impact of combining ground classes can thus be expected to vary with the mixtures of cover types in a given area. Table 7 confirms this conclusion by showing the average diagonal accuracies (DiAc) and Khat for the eight PC channel combinations and three levels of ground class generalization: original (Table 2), IGBP (Table 3), and combined conifer and mixed classes. Both DiAc and Khat increased significantly at all three sites once the spatially dispersed mixed forest was merged with the ubiquitous conifers. However, the increase was not uniform, in part because the proportions of the mixed class were not the same at the three sites.

Figure 5. Average Khat value for five sites and eight spectral combinations (C1, C2, N, C4, C12, C12N, C124, C12N4): SSA (NSA) = BOREAS Southern (Northern) Study Area; MAT = Matagami, Quebec; SAL = Taber, Southern Alberta; HLA = Head Lake, NWT.

The effect of patchiness of ground cover classes was further examined by identifying those AVHRR pixels whose "purity", that is, the fraction occupied by one class, was above 80%, 60%, or 50%. In each case, the same procedure as outlined in the methodology section above was used, but only areas covered by AVHRR pixels meeting the specified levels of purity were considered for the confusion matrix. Figure 6 shows diagonal accuracies for the eight spectral combinations in relation to the pixel purity. In all cases, the accuracies increased with increasing purity. The linearity is partly a result of the procedure used to label the clusters, since the label was assigned on the basis of the most frequently occurring ground cover type. In general, DiAc is higher than the limiting pixel purity because many pixels have purity higher than the specified value. This is especially the case at lower purity thresholds, where the difference between DiAc and the threshold is also higher. The one exception noted was the lower accuracy for pixels with purity above 80% in classifications involving C4 (Fig. 6b). The reason is probably lack of cover type differentiation through thermal emission in the NSA. Khat results for SSA and NSA are given in Figure 7. As for DiAc, the values increased to 0.6 or more when more pure pixels were included. Figures 6 and 7 are also consistent with the above-discussed findings concerning the effectiveness of various spectral combinations for identifying land cover: best overall results were obtained with combinations involving the NDVI.

Table 5. The Number of Original (TM-Derived) Land Cover Classes for Which a Spectral Combination Gave the Highest Classification Accuracy

         SSA   NSA   MAT   SAL   HLA   Total
C1       0     1     1     1     0     3
C2       0     1     2     1     0     4
N        1     0     0     1     1     3
C4       0     1     0     0     1     2
C12      0     0     0     0     0     0
C12N     3     0     2     1     0     6
C124     2     0     1     0     0     3
C12N4    0     0     0     1     0     1
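The purity screening described above can be sketched as follows. This is only an illustration under stated assumptions: each AVHRR pixel is represented by its cluster label and by the list of TM class labels falling inside it, the class names and sample values are hypothetical, and the threshold is applied as "purity at or above the limit".

```python
from collections import Counter


def pixel_purity(tm_labels):
    """Return (dominant class, fraction of the AVHRR pixel it occupies)."""
    dominant, count = Counter(tm_labels).most_common(1)[0]
    return dominant, count / len(tm_labels)


def accuracy_for_pure_pixels(pixels, threshold):
    """Diagonal accuracy over TM cells in AVHRR pixels meeting the purity limit.

    pixels: list of (avhrr_label, tm_labels) tuples (hypothetical structure).
    """
    agree = total = 0
    for avhrr_label, tm_labels in pixels:
        _, purity = pixel_purity(tm_labels)
        if purity < threshold:
            continue  # pixel too mixed, excluded from the confusion matrix
        agree += sum(1 for label in tm_labels if label == avhrr_label)
        total += len(tm_labels)
    return agree / total if total else float("nan")


# Three hypothetical AVHRR pixels, all labeled "conifer" by the clustering.
SAMPLE = [
    ("conifer", ["conifer"] * 9 + ["fen"]),
    ("conifer", ["conifer"] * 6 + ["mixed"] * 4),
    ("conifer", ["mixed"] * 5 + ["conifer"] * 3 + ["water"] * 2),
]

for limit in (0.8, 0.6, 0.5):
    print(limit, accuracy_for_pure_pixels(SAMPLE, limit))
```

In this toy case the accuracy falls as more mixed pixels are admitted, and it stays above the purity limit because the retained pixels are on average purer than the limit itself, which is the behavior noted for Figure 6.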

DISCUSSION

Results of the AVHRR classification accuracies show that among the four channels used (C1, C2, C4, NDVI), the NDVI yielded the most consistent overall site accuracies. This is attributed to the lower proportion of image noise in this channel, for two reasons. First, the pixels were selected using the maximum NDVI criterion, and, second, some of the noise in Channels 1 and 2 is eliminated through the NDVI ratio. However, NDVI alone did not always provide the highest accuracies. The extreme case was the MAT site, where C2 and C12 gave better results, but as Table 5 shows, spectral combinations other than NDVI alone yielded higher accuracies in 86% of the cases. Although the "best" spectral combination (counted in Table 5) still did not yield high accuracy in most cases, the differences due to different spectral combinations were often significant in a relative sense. From a qualitative inspection of clustering results of the various PC channel combinations it was noted that C1 and C2 also showed local differentiation related to topographic effects, especially when combined with the presence of water in the landscape, such as at the wetland/uplands transition SW of James Bay, Ontario. These findings suggest that in northern areas, NDVI-based classification should provide relatively consistent results, but if the highest possible accuracies are sought (especially locally), individual channels (especially 1 and 2) should be used in addition to NDVI. This is consistent with results for single-date classification in eastern Canada (Beaubien, 1995), where AVHRR Channels 1 and 2 were used in addition to NDVI, and with results of Beaubien and Simard (1993) in a part of Quebec (including the MAT site), where only Channels 1 and 2 were successfully employed. Given that NDVI is computed from Channels 1 and 2, the added value of NDVI for classification purposes is noise reduction in the composite images.
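The shares quoted from Table 5 can be checked with a few lines of arithmetic. The sketch below simply tallies the "Total" column of Table 5; the variable names are illustrative only.

```python
# "Total" column of Table 5: number of TM-derived classes for which each
# spectral combination gave the highest accuracy (all five sites pooled).
BEST_COUNTS = {
    "C1": 3, "C2": 4, "N": 3, "C4": 2,
    "C12": 0, "C12N": 6, "C124": 3, "C12N4": 1,
}

total_cases = sum(BEST_COUNTS.values())
# Cases where something other than NDVI alone was best.
not_ndvi_alone = total_cases - BEST_COUNTS["N"]
# Cases where the best combination included NDVI (N, C12N, C12N4).
includes_ndvi = sum(n for name, n in BEST_COUNTS.items() if "N" in name)

share_not_ndvi_alone = 100.0 * not_ndvi_alone / total_cases
share_includes_ndvi = 100.0 * includes_ndvi / total_cases
print(f"{total_cases} cases; not NDVI alone: {share_not_ndvi_alone:.1f}%; "
      f"including NDVI: {share_includes_ndvi:.1f}%")
```

With 22 site/class cases in total, combinations other than NDVI alone account for 19 of them (about 86%), and combinations that include NDVI account for 10 (roughly 45%, close to the 46% quoted in the Conclusions).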

Channel 4 was consistently the least accurate. This might be due to the narrow range of geographic conditions, and it suggests that, over the relatively small areas examined, the land cover types had similar surface temperature regimes. It is thus possible that surface temperature may be more effective in separating cover types over a wider range of latitudes rather than within a limited geographic area. It is also possible that C4 reflects local variation in soil moisture (e.g., Nemani et al., 1993) rather than cover type. If no corrections were made for the presence of contaminated pixels (especially snow), greater distinction might be obtained between northern and southern cover types. However, using seasonal snow cover as a diagnostic land cover feature was considered inappropriate because snow cover is highly variable and not an intrinsic feature of the land cover type.

In all classifications, there was a distinct tendency to include smaller (i.e., less frequently occurring) cover types among the more widespread ones. This trend results from the statistical nature of the classification procedure, in which the definition of spectral clusters is influenced by the spatial representation of the various cover types. The same trend was found with TM data for the SSA and NSA areas (Hall and Knapp, 1994a, b). It is also related to the patchiness of land cover where small classes occur close to the larger ones. For example, Table 6 shows that several TM classes were classified as wet conifers with high accuracy by the AVHRR.

The classification results differed among geographic regions but not greatly so. The highest accuracies were obtained in southern Alberta, where contiguous areas of grassland occur, but the differences with other sites were not large (Figs. 4 and 5). The most obvious differences were caused by the patchiness of ground cover, and classification results thus improved in relation to the purity of the AVHRR pixels.

Results discussed above suggest that land cover classification derived from AVHRR data will have limited accuracy if evaluated through a rigorous comparison with high resolution data. This is true even if the land cover classes are fairly general, such as those proposed for use by the IGBP (Belward and Loveland, 1995). The principal reason is the patchiness of land cover. The patchiness includes land-open water mixture (as in HLA) but also various land cover types adjacent to each other over short distances in the landscape (MAT, NSA, SSA). The analysis shows that if pixels are sufficiently homogeneous, AVHRR data can provide the classes required.

The spatial homogeneity depends on the land cover distribution and is also complicated by the compositing process which introduces uncertainty into the effective pixel size through three principal effects.

i. Inclusion of pixels with various view zenith angles, which have different spatial resolutions. Depending on the maximum acceptable view zenith angle (usually constrained in the compositing process), the resolution may vary between 1.1 km and 6.8 km (refer to the Appendix). Since pixels with various viewing geometries are accepted in different composites, the effective resolution along the seasonal profile will differ from one composite pixel to the next and may in fact be difficult to determine.


Table 6. Effect of Merging Land Cover Classes on AVHRR Classification Accuracy Using NDVI PCs

a) 11 TM Classes

            Conif  Conif  Mixed  Decid- Dis-   Fen    Water  Regen  Regen  Regen  Burn
            wet    dry           uous   turbed               med    you    old
Conif_wet   0.92   0.91   0.70   0.36   0.73   0.91   0.31   0.83   0.83   0.86   0.93
Conif_dry   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Mixed       0.04   0.02   0.16   0.25   0.18   0.06   0.02   0.10   0.11   0.10   0.01
Deciduous   0.02   0.01   0.12   0.39   0.08   0.02   0.02   0.04   0.01   0.03   0.00
Disturbed   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Fen         0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Water       0.01   0.05   0.01   0.01   0.02   0.01   0.65   0.02   0.04   0.01   0.06
Regenmed    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.01   0.00   0.00   0.00
Regenyou    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Regenold    0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Burn        0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00

DiAc 0.57   Khat 0.29
Fraction    0.47   0.01   0.17   0.10   0.01   0.04   0.10   0.03   0.01   0.06   0.00

b) 6 IGBP Classes

            Conif  Mixed  Deciduous  Fen    Water  Other (Dist., Regen)
Conif       0.92   0.70   0.36       0.91   0.31   0.84
Mixed       0.04   0.16   0.25       0.06   0.02   0.11
Deciduous   0.02   0.12   0.39       0.02   0.02   0.03
Fen         0.00   0.00   0.00       0.00   0.00   0.00
Water       0.01   0.01   0.01       0.01   0.65   0.02
Other       0.00   0.00   0.00       0.00   0.00   0.00

DiAc 0.57   Khat 0.29
Fraction    0.48   0.17   0.10       0.04   0.10   0.12

c) 5 IGBP Classes

            Conif  Deciduous  Fen    Water  Other (Dist., Regen)
Conif       0.94   0.60       0.97   0.33   0.95
Deciduous   0.05   0.39       0.02   0.02   0.03
Fen         0.00   0.00       0.00   0.00   0.00
Water       0.01   0.01       0.01   0.65   0.02
Other       0.00   0.00       0.00   0.00   0.00

DiAc 0.711032   Khat 0.331676
Fraction    0.65   0.10       0.04   0.10   0.12

ii. Registration accuracy. The preparation of each composite requires that the daily images be registered to a common map reference and resampled to a constant pixel size. The registration process is not perfect and typically produces a misregistration error for each daily pixel. This error is then carried into the composite where adjacent pixels may have opposite errors, thus further increasing the effective pixel size.

The effective pixel size over the seasonal series of AVHRR composites results from a combination of the resolution and registration uncertainties. As an example, Figure 8 shows the effect of misregistration of the images in the temporal series if RMS = 1 km. It is evident that the area included in a 1 km² composite pixel may vary between 8 km² and > 20 km², depending on the angular cutoff used in the compositing process. For example, if a view zenith angle cutoff of 57° is used in compositing and assuming misregistration RMS of 1 km (1.5 km), the effective pixel size will be about 16.6 km² (24.7 km²; see Appendix). The corresponding 1-date value is 5.1 km².

Table 7. Effect of Merging Ground Classes on the AVHRR Classification Accuracy and Effectiveness of Various Channels

        Original classes      IGBP classes                                Conifer and mixed classes merged
Site    #     DiAc   Khat     #    DiAc   % Change   Khat   % Change      #    DiAc   % Change   Khat   % Change
SSA     11    0.54   0.23     6    0.55   1.3        0.23   1.7           5    0.70   28.6       0.30   34.2
NSA     10    0.48   0.08     6    0.54   13.6       0.10   20.2          5    0.74   54.6       0.18   119.1
MAT     7     0.45   0.20     6    0.47   5.9        0.22   10.4          5    0.53   19.5       0.24   20.6

Figure 6. Effect of the mixed pixels (expressed as the % of the dominant cover type) on AVHRR diagonal classification accuracy for eight spectral combinations (C1, C2, N, C4, C12, C12N, C124, C12N4): 6a) BOREAS Southern Study Area; 6b) BOREAS Northern Study Area.

iii. Pixel selection. The maximum NDVI value is presently used most frequently as the compositing criterion (Cihlar et al., 1994). This criterion is very good at rejecting cloudy pixels because of their low NDVI (Holben, 1986). However, in doing so, it also gives preference to pixels with higher NDVI in situations where mixed cover types are present. Open water (low NDVI) and deciduous vegetation (high NDVI) form the strongest contrast in the present data set, but the same principle operates in all situations, and its impact depends on the mixtures of adjacent cover types and their respective NDVI. The effects include the reduced size of open water bodies and increased size of patches with high NDVI (relative to their neighbors) in the classified images.
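The pixel selection effect in item iii can be illustrated with a minimal sketch of maximum NDVI compositing for a single composite pixel. NDVI is computed from Channels 1 and 2 in the usual way, (C2 - C1)/(C2 + C1); the reflectance values below are hypothetical and chosen only to mimic a shoreline pixel whose footprint shifts between dates.

```python
def ndvi(c1, c2):
    """NDVI from Channel 1 (visible) and Channel 2 (near-infrared) reflectance."""
    return (c2 - c1) / (c2 + c1)


def max_ndvi_composite(observations):
    """Return the (c1, c2) observation with the highest NDVI in the period."""
    return max(observations, key=lambda obs: ndvi(*obs))


# Hypothetical daily observations of one shoreline pixel during one period.
DAILY = [
    (0.04, 0.03),  # footprint mostly over open water (negative NDVI)
    (0.50, 0.52),  # cloud-contaminated (NDVI near zero)
    (0.05, 0.15),  # mixed water and vegetation
    (0.04, 0.36),  # footprint mostly over deciduous vegetation (high NDVI)
]

selected = max_ndvi_composite(DAILY)
print(selected, round(ndvi(*selected), 2))
```

The cloudy observation is rejected as intended, but so are the water-dominated and mixed observations: the composite keeps the date on which the footprint fell mostly on vegetation, which is how water bodies shrink and high-NDVI patches grow in the classified images.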

In spite of the high spatial variability/patchiness of diverse land cover types, uses of the resulting land cover maps for studies of the biosphere-atmosphere interactions require that a variety of cover types be resolved. For example, 20 or so cover classes identified for the IGBP needs (Belward and Loveland, 1995) are considered to be the minimum necessary for a realistic description of the energy and gas exchange between the biosphere and the atmosphere. Given the requirement for identifying these classes, the limitations of present satellite data sources evident from the above results and the reality of mixed cover types in many areas of the world, a possible mapping strategy with composite image data might be the following:

Figure 7. Effect of the mixed pixels (expressed as the % of the dominant cover type) on AVHRR Khat value for eight spectral combinations (C1, C2, N, C4, C12, C12N, C124, C12N4): 7a) BOREAS Southern Study Area; 7b) BOREAS Northern Study Area.

1. Minimize inclusion of higher view zenith angle pixels in the composites. This will have to be traded against the decrease in temporal resolution of the composites, which may be important in rapidly changing areas during the growing season.

2. Minimize misregistration errors. The short-term opportunity for improvements here seems limited as most composite production operations pay much attention to this aspect. Improvements could come from better orbit information and better DEM data, but these are not available at present.

3. Use ancillary data to eliminate open water from the mapping process. Because of the maximum NDVI compositing artifact, water bodies will tend to be underrepresented if identified through the classification process.

4. While aiming at the differentiation of many classes with the AVHRR data, accept that most will be relatively broad. The map legend should contain pure land cover types where land cover occurs in large, homogenous patches, and mixes of cover types where the patches are typically much smaller than the pixel size.

5. Use higher resolution satellite data (TM and the like) to characterize the land cover distribution (types and proportions) within the classes identified using the AVHRR data. This could involve a relatively small sample of the various types. Alternatively, use unmixing techniques to differentiate within the broad classes, although their success will hinge on the simplicity of land cover mixes and the availability of satellite data sets with minimum noise. As an intermediate step in phenologically simple environments, it should be possible to employ classifications derived from near-nadir, one-date images for increased spatial resolution (and thereby higher classification accuracy).

The classification accuracies reported above should be regarded in a comparative sense and not as definitive values. The principal reason is that none of the classifications were optimized to obtain maximum possible accuracies. This was done to facilitate objective comparisons between the various PC combinations. In an actual mapping exercise, such optimization is important for obtaining accurate and consistent results (e.g., Loveland et al., 1991). Another reason is that the preprocessing of the AVHRR data set could be further improved through better corrections of C1 and C2 (bidirectional effects) and C4 (end-season effects) and possibly a more conservative selection of PCs for the classification. On the other hand, an optimum labeling approach was used here which requires data that may not always be available in practice. It should also be noted that the accuracies were compared to those of TM classification which were less than 100% (Table 2). The actual accuracies could thus be higher or lower, depending on the randomness of the TM classification errors.

Figure 8. Effect of scan angle (degrees) on the spatial resolution of AVHRR composite data. Full triangles, full ellipses, and open triangles show respectively the azimuth (km), range (km), and area (km²) resolution in a single-date AVHRR image. The upper curve (Aeff, km²) shows the resolved area (corresponding to +1 standard deviation) assuming that all data originate at that scan angle and that RMS misregistration error among dates was 1.0 km.

CONCLUSIONS

A comparative study of the effectiveness of four spectral variables derived from AVHRR composite multitemporal data of Canada (Channel 1 and 2 surface reflectance, NDVI, brightness temperature) was carried out. Various combinations of input classes were tested using an unsupervised clustering algorithm and compared with TM-derived classifications at 30 m pixel size. It was found that:

1. NDVI was the most effective single spectral dimension, but higher class accuracies were derived when NDVI was combined with other channels (most often C1 and C2). Overall, combinations including NDVI were best in 46% of all site/class combinations.

2. The (site-dependent) patchiness of land cover distribution had major effects on the classification accuracy. Such patchiness occurs even for broadly defined classes. The accuracy improved dramatically when only AVHRR pixels with one dominant cover type were classified or when mixed classes were merged prior to classification (notably the mixed forest class).

It is concluded that individual Channels (1, 2, 4) as well as NDVI should be used to achieve maximum accuracies. Consideration of the artifacts in image composites suggests two basic steps that may be taken to maximize the expected accuracy: limiting acceptable view zenith and maximizing geometric registration accuracy (to retain the highest possible resolution in the composite data), and using images with a higher spatial resolution (TM or near-nadir, single-date AVHRR when available as cloud-free over significant areas) to quantify the composition of the mixed classes. The results also indicate that characterization of the composition of land cover classes defined from AVHRR is necessary if the total areas of individual classes are to be known at the regional level, especially for small but ecologically significant classes such as fens and disturbed forest classes.

Figure 9. Schematic representation of the effect of imaging geometry on spatial resolution (see text). Labeled quantities: D1, D2, h, U1, U2.

APPENDIX. EFFECTIVE RESOLUTION OF AVHRR COMPOSITE IMAGES

The spatial resolution of the composite pixels is determined by the view zenith angle of the sensor (which increases the instantaneous field of view away from nadir) and the misregistration of a pixel during the geometric correction process.

View Zenith Angle Effect

Although AVHRR composite pixels are resampled at 1 km, their effective resolution is equivalent to the field of view (FOV) of each pixel as imaged by the sensor. The FOV is determined by the resolution of AVHRR detectors and the sensor scan angle. As the scan angle increases, the FOV assumes an elliptical shape with increasing ratio of the axes. The two axes, one describing range resolution $R_r$ (in the scan direction) and the other azimuth resolution $R_a$ (orthogonal to the scan direction), can be computed for pixel i and compositing period j as follows (Fig. 9):

$$R_r(i,j) = \pi R_e \, \frac{U_2 - U_1}{180},$$

$$R_a(i,j) = \frac{\pi \, D \, \mathrm{IFOV}}{180},$$

$$U_1 = 180 - SA - X_1,$$

$$U_2 = 180 - SA - \mathrm{IFOV} - X_2,$$

$$D = \frac{D_1 + D_2}{2} = \frac{R_e}{2}\left(\frac{\sin(U_1)}{\sin(SA)} + \frac{\sin(U_2)}{\sin(SA + \mathrm{IFOV})}\right),$$

$$X_1 = \arcsin\left(\frac{R_e + h}{R_e}\,\sin(SA)\right),$$

$$X_2 = \arcsin\left(\frac{R_e + h}{R_e}\,\sin(SA + \mathrm{IFOV})\right),$$

where

$R_e$ = radius of the earth (6378 km),
$\pi$ = 3.14159,
$h$ = satellite altitude (nominally 833 km),
$SA$ = scan angle,
$\mathrm{IFOV}$ = instantaneous field of view at nadir (slightly different between AVHRR Channels 1 and 2 (Kidwell, 1991) but a value of 0.0014 radians can be used).

The area of a pixel i for period j is thus equal to

$$A(i,j) = \frac{\pi \, R_a(i,j) \, R_r(i,j)}{4}.$$
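The single-date geometry above can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' code: all angles are handled in radians (so the pi/180 conversion factors do not appear), X is read as the obtuse angle at the ground point of the triangle formed by the earth's center, the satellite, and the ground point (so that U is the angle subtended at the earth's center), and the function names are hypothetical.

```python
import math

RE = 6378.0     # radius of the earth, km
H = 833.0       # nominal satellite altitude, km
IFOV = 0.0014   # instantaneous field of view at nadir, radians


def geocentric_angle(sa):
    """Angle U (radians) at the earth's center for scan angle sa (radians)."""
    # Law of sines gives the view zenith angle; the triangle angle X at the
    # ground point is pi - vza, hence U = pi - sa - X = vza - sa.
    vza = math.asin((RE + H) / RE * math.sin(sa))
    return vza - sa


def slant_distance(sa):
    """Sensor-to-ground distance D (km) for scan angle sa (radians)."""
    if sa == 0.0:
        return H  # nadir view: the distance is the altitude
    return RE * math.sin(geocentric_angle(sa)) / math.sin(sa)


def pixel_resolution(scan_angle_deg):
    """Return (azimuth km, range km, area km^2) of a single-date pixel."""
    sa = math.radians(scan_angle_deg)
    u1 = geocentric_angle(sa)
    u2 = geocentric_angle(sa + IFOV)
    rr = RE * (u2 - u1)                                       # range resolution
    d = 0.5 * (slant_distance(sa) + slant_distance(sa + IFOV))
    ra = d * IFOV                                             # azimuth resolution
    return ra, rr, math.pi * ra * rr / 4.0


for angle in (0.0, 30.0, 55.4):
    ra, rr, area = pixel_resolution(angle)
    print(f"SA={angle:5.1f} deg  Ra={ra:.2f} km  Rr={rr:.2f} km  A={area:.2f} km2")
```

Under these assumptions the range resolution grows from a little under 1.2 km at nadir to roughly 6.8 km at a scan angle of 55.4 degrees, in line with the 1.1 km to 6.8 km span quoted in the Discussion, while the azimuth resolution grows much more slowly.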

Given the series of n images for all periods j and assuming that the VZAs for pixel i are normally distributed, the effect of the different pixel sizes on the area represented can be described as

$$R_{am}(i) = M(R_a(i,j)) + STD(R_a(i,j)),$$

$$R_{rm}(i) = M(R_r(i,j)) + STD(R_r(i,j)),$$

$$A_m(i) = \frac{\pi \, R_{am}(i) \, R_{rm}(i)}{4},$$

where

M and STD = mean and standard deviation (in km) calculated for all values of j of pixel i,

$R_{am}$ ($R_{rm}$) = effective resolution in azimuth (range) for pixel i during the period spanned by j composites,

$A_m$ = surface area determined by the effective resolution of pixel i during the period spanned by j composites.

Under the assumption of normal distribution, Am(i) will be less than the maximum value of A(i,j) in 67% of all dates j.


Effect of Misregistration

Because of the imperfections in the geometric registration process, a given pixel (i,j) can be shifted with respect to its correct geographic position. Since the shift at various dates j can occur in different directions, the combined effect for the series of n images will be a "blurring" of pixel i, the measurement signal having originated from an area larger than Am(i). If the misregistration is described by the root mean square (RMS) of the probability distribution of the difference between the true and the actual positions of the center of the pixel (i,j), then 67% of pixels (i,j) will be located within a spatial distance of 2*RMS from the center. This assumes that the misregistration of a pixel is a directionally insensitive random effect that can be quantified by a single normal distribution.

Effective Geometric Resolution

The actual surface area represented in pixel i over the time series of n composite images, $A_{eff}(i)$, is a combination of the scan angle and misregistration effects in these images. The overall resolution in azimuth ($R_a$) and range ($R_r$) can thus be approximated as

$$R_a(i) = R_{am}(i) + 2\,RMS,$$

$$R_r(i) = R_{rm}(i) + 2\,RMS,$$

and the combined surface area represented in pixel i is a blurred ellipse with area $A_{eff}$ equal to

$$A_{eff}(i) = \frac{\pi \, R_a(i) \, R_r(i)}{4}.$$
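The combination of the scan angle and misregistration effects can be sketched as below. The function takes per-composite azimuth and range resolutions for one pixel (hypothetical inputs here), adds one standard deviation to the mean, and then widens each axis by 2*RMS. The use of the population standard deviation is an assumption, since the text does not state which form of STD was applied.

```python
import math
from statistics import mean, pstdev


def effective_area(ra_series, rr_series, rms_km):
    """Effective area (km^2) of a composite pixel over a series of composites.

    ra_series, rr_series: azimuth and range resolution (km) of the pixel in
    each composite; rms_km: RMS misregistration error (km).
    """
    ra_m = mean(ra_series) + pstdev(ra_series)   # R_am = M + STD
    rr_m = mean(rr_series) + pstdev(rr_series)   # R_rm = M + STD
    ra_eff = ra_m + 2.0 * rms_km                 # blur added on the azimuth axis
    rr_eff = rr_m + 2.0 * rms_km                 # blur added on the range axis
    return math.pi * ra_eff * rr_eff / 4.0


# Hypothetical resolutions of one pixel in five successive composites.
RA_KM = [1.2, 1.5, 1.9, 1.3, 1.7]
RR_KM = [1.3, 2.1, 3.4, 1.6, 2.7]

for rms in (0.0, 1.0, 1.5):
    print(f"RMS={rms:.1f} km  Aeff={effective_area(RA_KM, RR_KM, rms):.1f} km2")
```

Because 2*RMS is added to both axes, even a 1 km misregistration error contributes more to the effective area than the scan angle effect for all but the most oblique views, which is the pattern shown in Figure 8.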

We are pleased to acknowledge the assistance of Mr. Ming Xie and Ms. Fengting Huang from Intera Technologies with data processing. The contribution of Drs. Zhanqing Li and Jing Chen (both from the Canada Centre for Remote Sensing) to the problem of spatial resolution definition in composite images and Jing Chen's comments on a draft of the article are much appreciated. The GEOCOMP data were produced by Ms. Pat Hurlburt at the Manitoba Remote Sensing Centre in Winnipeg, Manitoba. The ISOCLASS software was provided by Mr. Thomas Loveland from the EROS Data Center in Sioux Falls, South Dakota, USA as part of a cooperative project.

REFERENCES

Beaubien, J. (1995), personal communication, Centre de recherche forestieres des Laurentides, Quebec City, Quebec.

Beaubien, J., and Simard, G. (1993), Methodologie de classification des données AVHRR pour la surveillance du couvert vegetal, in Proceedings of the 16th Canadian Remote Sensing Symposium, Sherbrooke, Quebec, pp. 597-603.

Belward, A. (1995), The IGBP-DIS global 1 km land cover data set: a validation strategy proposal, Draft Strategy Document V.1, 28 pp.

Belward, A. S., and Loveland, T. R. (1995), The IGBP-DIS 1 km land cover project, in Remote Sensing in Action, Proceedings of the Remote Sensing Society 1995 Annual Conference, University of Southampton, 11-14 September, forthcoming.

Brown, J. F., Loveland, T. R., Merchant, J. W., Reed, B. C., and Ohlen, D. O. (1993), Using multisource data in global land-cover characterization: concepts, requirements, and methods, Photogramm. Eng. Remote Sens. 59:977-987.

Cihlar, J. (1996), Identification of contaminated pixels in AVHRR composite images for studies of land biosphere, Remote Sens. Environ., forthcoming.

Cihlar, J., Manak, D., and D'Iorio, D. (1994), Evaluation of compositing algorithms for AVHRR data over land, IEEE Trans. Geosci. Remote Sens. 32:427-437.

Congalton, R. G. (1991), A review of assessing the accuracy of classifications of remotely sensed data, Remote Sens. Environ. 37:35-41.

DeFries, R. S., and Townshend, J. R. G. (1993), Global land cover: comparison of ground-based data sets to classifications with AVHRR data, in Environmental Remote Sensing from Regional to Global Scales (G. Foody and P. Curran, Eds.), Wiley, Chichester, pp. 84-110.

DeFries, R. S., and Townshend, J. R. G. (1994), NDVI-derived land classifications at a global scale, Int. J. Remote Sens. 15:3567-3586.

Eidenshink, J. C., and Faundeen, J. L. (1994), The 1 km AVHRR global land data set: first stages in implementation, Int. J. Remote Sens. 15(17):3443-3462.

Evans, D. L., Zhu, Z., and Winterberger, K. (1993), Mapping forest distributions with AVHRR data, World Resource Rev. 5:66-71.

Hall, F. G., and Knapp, D. (1994a), Landsat TM forest cover classification image of BOREAS Southern Study Area, the BOREAS Information System, NASA Goddard Space Flight Center, Greenbelt, MD (available as a digital file).

Hall, F. G., and Knapp, D. (1994b), Landsat TM forest cover classification image of BOREAS Northern Study Area, the BOREAS Information System, NASA Goddard Space Flight Center, Greenbelt, MD (available as digital file).

Hlavka, C. A., and Spanner, M. A. (1995), Unmixing AVHRR imagery to assess clearcuts and forest regrowth in Oregon, IEEE Trans. Geosci. Remote Sens. 33:788-795.

Holben, B. (1986), Characteristics of maximum-value composite images from temporal AVHRR data, Int. J. Remote Sens. 7:1417-1434.

Iverson, L. R., Cook, E. A., and Graham, R. L. (1994), Regional forest cover estimation via remote sensing: the calibration center concept, Landscape Ecol. 9(3):159-174.

Kidwell, K. B. (1991), NOAA Polar Orbital User's Guide, National Oceanic and Atmospheric Administration, National Environmental Satellite Data and Information Service, Washington, DC.

Loveland, T. R., Merchant, J. W., Ohlen, D. O., and Brown, J. F. (1991), Development of a landcover characteristics database for the conterminous U.S., Photogramm. Eng. Remote Sens. 57:1453-1463.

Moorman, L. A., Ahern, F. J., Beaubien, J., and Cihlar, J. (1993), Relationships between multitemporal AVHRR data and Canada's biophysical regions, in Proceedings of the 16th Canadian Symposium on Remote Sensing, Sherbrooke, Quebec, pp. 571-577.

Nemani, R., Pierce, L. L., Running, S. W., and Goward, S. N. (1993), Developing satellite derived estimates of surface moisture status, J. Appl. Meteorol. 32:548-557.

Pokrant, H. (1991), Land cover map of Canada derived from AVHRR images, Manitoba Remote Sensing Centre, Winnipeg, Manitoba, Canada.

Price, J. C. (1984), Land surface temperature measurements from the split-window channels of NOAA 7 Advanced Very High Resolution Radiometer, J. Geophys. Res. 89:7231-7237.

Pultz, T., Ahern, F., Cihlar, J., and Howard, J. (1992), Exploring the information content of multi-temporal AVHRR data for land cover mapping, in Proceedings of the 15th Canadian Remote Sensing Symposium, June, Toronto, pp. 64-68.

Rahman, H., and Dedieu, G. (1994), SMAC: a simplified method for the atmospheric correction of satellite measurements in the solar spectrum, Int. J. Remote Sens. 15(1):123-143.

Robertson, B., Erickson, A., Friedel, J., et al. (1992), GEOCOMP, a NOAA AVHRR geocoding and compositing system, in Proceedings of the ISPRS Conference, Commission 2, Washington, DC, pp. 223-228.

Royer, A., Anseau, C., Viau, A., et al. (1994), Impact des changements de l'environnement global sur la foret boreale du Quebec, in Teledetection de l'Environnement dans l'Espace Francophone, (F. Bonn, Ed.), Presses de l'Université du Quebec, Sainte-Foy, Québec, pp. 305-330.

Running, S. W., Loveland, T. R., Pierce, L., Nemani, R. R., and Hunt, E. R., Jr. (1995), A remote sensing based vegetation classification logic for global land cover analysis, Remote Sens. Environ. 51:39-48.

Sellers, P. J., Tucker, C. J., Collatz, G. J., et al. (1994), A global 1° by 1° NDVI data set for climate studies. Part 2: The generation of global fields of terrestrial biophysical parameters from the NDVI, Int. J. Remote Sens. 15:3519-3545.

Stone, T. A., Schlesinger, P., Houghton, R. A., and Woodwell, G. M. (1994), A map of the vegetation of South America based on satellite imagery, Photogramm. Eng. Remote Sens. 60(5):541-551.

Tou, J. T., and Gonzalez, R. C. (1974), Pattern Recognition Principles, Addison-Wesley, Reading, MA, 377 pp.

Townshend, J. R. G. (1994), Global data sets for land applications from the Advanced Very High Resolution Radiometer: an introduction, Int. J. Remote Sens. 15:3319-3332.

Townshend, J. R. G., Justice, C. O., and Kalb, V. T. (1987), Characterization and classification of South American land cover type using satellite data, Int. J. Remote Sens. 8:1189-1207.

Townshend, J. R. G., Justice, C. O., Skole, D., et al. (1994), The 1 km resolution global data set: needs of the International Geosphere Biosphere Programme, Int. J. Remote Sens. 15:3417-3441.

Tucker, C. J., Townshend, J. R. G., and Goff, T. E. (1985), African land-cover classification using satellite data, Science 227:369-375.

Walthall, C. L., Norman, J. M., Welles, J. M., Campbell, G., and Blad, B. L. (1984), Simple equation to approximate the bidirectional reflectance from vegetative canopies and bare soil surfaces, Appl. Opt. 24:383-387.