An automated object-based classi�cation approach for Updating Corine Land Cover data

11
An automated object-based classification approach for Updating Corine Land Cover data Thilo Wehrmann a,c , Stefan Dech a,c ,R¨ udiger Glaser b a DLR–DFD German Remote Sensing Data Center, Weßling, Germany; c University of W¨ urzburg, Remote Sensing Unit, Am Hubland, W¨ urzburg, Germany; b University of Heidelberg, Department of Geography, Im Neuenheimer Feld, Heidelberg, Germany ABSTRACT In this paper, the framework of an object-based classification approach for land cover and land use classes is presented. Re- cently, there is an increasing demand for information on actual land cover resp. land use from planning, administration and science institutions. Remote sensing provides timely information products in different geometric and thematic scales. The effort to manually classify land use data is still very high. Therefore a new approach is required to incorperate automated image classification to human image understanding. The proposed approach couples object-based clasification technique – a rather new trend in image classification – with machine learning capacities (Support Vector Classifier) depending on information levels. To ensure spatial and spectral transferability of the classification scheme, the data has to be passed through several generalisation levels. The segmentation generates homogeneous and contiguous image objects. The hier- archical rule type uses direct and derived spectral attributes combined with spatial features and information extracted from the metadata. The identified land cover objects can be converted into the current CORINE classes after classification. Keywords: Object-Based Image Classification, Automatisation, CORINE Land Cover, Pattern Recognition, Machine Learning, Human Image Understanding 1. INTRODUCTION In this paper we present an automated approach for land cover classification. It is modelled on human image understanding and combines the object-based classification with fuzzy a-priori knowledge about separate land cover classes depending on landscape units. The development of this approach is necessary because it lacks existing methods to handle more complex classification tasks like extracting CORINE classes. Land cover is defined as the observed (bio-) physical layer, which covers the surface of the Earth. 1 However Land use implies the man-made function of the land. There is still an increasing demand for accurate LandCover / LandUse (LC/LU) information in several institutions and administrations for planning and science (e.g. input data for European structural development programmes and INSPIRE – INfrastructure for SPatial InfoRmation). Frequently lacking these spatial data, users are forced to use statistical data summarized to administrative levels for monitoring and modelling. In many cases even statistical data does not provide all information needed, e.g. knowledge about impervious areas. 2 Therefore accurate, detailed and spatially derived data is crucial to use, visualize and analyse changes in LC/LU. Remote sensing is the only alternative to acquire spatially contiguous and homogeneous data in various geometric, spectral and temporal resolutions. However, the effort of LC/LU classication by visual image interpretation is too high. Therefore it is far too expensive to produce large scale land use maps from remotely sensed data. Alternatively, operational classication can convert continuous image data into thematic information, producing LC/LU classes to its best level of detail. This initial classication assists human interpreters by producing pre-classified LC/LU maps. Experts are able to focus upon more complex topics. Further author information: (Send correspondence to Thilo Wehrmann) Thilo Wehrmann: E-mail: [email protected], Telephone: +49 931 888 4958

Transcript of An automated object-based classi�cation approach for Updating Corine Land Cover data

An automated object-based classification approach for

Updating Corine Land Cover data

Thilo Wehrmanna,c, Stefan Decha,c, Rudiger Glaserb

aDLR–DFD German Remote Sensing Data Center, Weßling, Germany;cUniversity of Wurzburg, Remote Sensing Unit, Am Hubland, Wurzburg, Germany;

bUniversity of Heidelberg, Department of Geography, Im Neuenheimer Feld, Heidelberg, Germany

ABSTRACT

In this paper, the framework of an object-based classification approach for land cover and land use classes is presented. Re-cently, there is an increasing demand for information on actual land cover resp. land use from planning, administration andscience institutions. Remote sensing provides timely information products in different geometric and thematic scales. Theeffort to manually classify land use data is still very high. Therefore a new approach is required to incorperate automatedimage classification to human image understanding. The proposed approach couples object-based clasification technique– a rather new trend in image classification – with machine learning capacities (Support Vector Classifier) depending oninformation levels. To ensure spatial and spectral transferability of the classification scheme, the data has to be passedthrough several generalisation levels. The segmentation generates homogeneous and contiguous image objects. The hier-archical rule type uses direct and derived spectral attributes combined with spatial features and information extracted fromthe metadata. The identified land cover objects can be converted into the current CORINE classes after classification.

Keywords: Object-Based Image Classification, Automatisation, CORINE Land Cover, Pattern Recognition, MachineLearning, Human Image Understanding

1. INTRODUCTION

In this paper we present an automated approach for land cover classification. It is modelled on human image understandingand combines the object-based classification with fuzzy a-priori knowledge about separate land cover classes depending onlandscape units. The development of this approach is necessary because it lacks existing methods to handle more complexclassification tasks like extracting CORINE classes.

Land cover is defined as the observed (bio-) physical layer, which covers the surface of the Earth.1 However Landuse implies the man-made function of the land. There is still an increasing demand for accurate LandCover / LandUse(LC/LU) information in several institutions and administrations for planning and science (e.g. input data for Europeanstructural development programmes and INSPIRE – INfrastructure for SPatial InfoRmation). Frequently lacking thesespatial data, users are forced to use statistical data summarized to administrative levels for monitoring and modelling.In many cases even statistical data does not provide all information needed, e.g. knowledge about impervious areas.2

Therefore accurate, detailed and spatially derived data is crucial to use, visualize and analyse changes in LC/LU. Remotesensing is the only alternative to acquire spatially contiguous and homogeneous data in various geometric, spectral andtemporal resolutions. However, the effort of LC/LU classication by visual image interpretation is too high. Therefore it isfar too expensive to produce large scale land use maps from remotely sensed data. Alternatively, operational classicationcan convert continuous image data into thematic information, producing LC/LU classes to its best level of detail. Thisinitial classication assists human interpreters by producing pre-classified LC/LU maps. Experts are able to focus uponmore complex topics.

Further author information: (Send correspondence to Thilo Wehrmann)Thilo Wehrmann: E-mail: [email protected], Telephone: +49 931 888 4958

Classification

The act of forming into a class or classes; a distribution into groups, as classes, orders, families, etc., accord-ing to some common relations or affinities.

WEBSTER’S 1913 DICTIONARY

In this context, image classification is defined as extraction of differentiated classes of land use and land cover categoriesfrom remotely sensed satellite data due to common spectral, textural or contextural features. LILLESAND AND KIEFER3

distinguish three types of image information, including spectral, spatial and temporal patterns. The analysis of spectralpatterns is widely implemented in commercial software packages. Its classification algorithms use statistical clustering,assigning pixels to classes according to its spectral information (per-pixel classification). However, the relation of pixelsto its neighbours, utilizing various measurements such as texture or entropy, is often an important characteristic for classassignment and incorporates the spatial pattern into image classification process. If multitemporal data is available thetemporal patterns, e.g. phenology and periodical changes, can be used to aid feature identification.

Common automated classification approaches such as unsupervised classification categorize spectral data into statisticaldistinct clusters (ISODATA, k−means). Afterwards the human interpreter assigns these clusters to thematic land coverclasses. Another approach, the supervised image classification, employs representative sample sites of known land coverclasses as training areas. Although these training sites refer to real land cover classes, additional information such asbackground knowledge or topographic maps is helpful for accurate class assignment. Due to the process that every pixelmust be considered, pixels of unknown land cover are categorized to their most likely land cover class (for exampledepending on probability (Maximum Liklihood) or closest distance to the mean in feature space (Nearest Neighboor)).The pixel-based classification approaches do not allow to determine land use classes, because the satellite cannot observeanthropogenic functions of an area. A solution to this problem is either the comprehension of more information extractedfrom the data source (e.g. topology) or auxiliary information. According to the geometrical resolution of the image, mixedpixels cause further problems, because only objects larger than the geometrical resolution provide spectrally ”pure” pixels.

Another categorization for classification techniques is the assumption of data distribution in feature space. The para-metric approaches like Maximum Liklihood depend on a predeefind distribution function (mostly gaussian) of the data,because the mean and covariance are calculated from the trianing set. Non-parametric techniques are Artificial Neural Net-works(ANNs) and Decision Tree Classifiers(DTCs). They reconstruct the data independently from statistical distribution.Another method for non-parametric image classification is the Support Vector Classifier (SVC) based on Structural RiskMinimization4 (SRM) developed by Vapnik.

A rather new approach is the object-based classification, integrating spatial and spectral patterns. The image is spatiallysegmented into homogeneous areas, called image objects,5 prior to the actual classification step. This procedure generatesadditional information about the object, e.g. shape, size and neighbourhood, which can also be integrated into imageclassification. Features and attributes of classes can be inherited to single objects in the class hierarchy. The commercialsoftware eCognition6 uses this per-field approach . Utilising objects instead of pixels allows the determination of certainland use classes based on object size and neighbourhood relationships for the first time in classification.

pixel bases techniques parametric Maximum Likelihood (ML)Nearest Neighbour (NN)

...non-parametric Artificial Neural Networks (ANN)

Decision Tree Classifier (DTC)Support Vector Classifier (SVC)

object based techniques eCognitionexperimental approaches

Table 1. Frequently used supervised Classification Approaches

There are many automated or semiautomated approaches for classification of remotly sensed data. They use low spatial/ high temporal resolution data (AVHRR,7 MODIS, VEGETATION8) for land cover classification or specify and thereforesimplify the classification scheme for just a few classes (e.g. roads,9 junctions, vehicles, sealed areas).

CORINE Land Cover Project The CORINE LandCover (CLC) project develops a pan-european land cover databasebased on remote sensing data. Its hierarchical structure contains 44 different but harmonized land cover classes groupedinto three major categories. The first data acquisition was in the mid 90ies. Currently the CLC2000 update10 is nearlycompleted. The information is derived manually by GIS-aided visual image interpretation with tremendous financial andhuman effort. Its scheduled update interval is ten years. The hierarchical structure of the CORINE classes allows logicalclass aggregation, hence abstract mapping. The classification system is extendible by adding classes to level four and five.The pan-european CLC project requires partial operationalisation of the classification process. This operational classifi-cation chain enables faster and more frequent update as well as improved class information. The proposed classificationsystem builds upon two different datasets (CLC90 and CLC2000) for training and validation purposes.

There is several research done in CORINE Land Cover data related to class conversation11 and updating. Besides theCORINE system there are more land cover classification systems developed by non-european agencies (e.g. FAO, USGS).

Figure 1. Research Area: The subscene covers an area more than 5.000 km2 of heterogeneous classes (NDVI; edge-enhanced)

2. RESEARCH AREA AND DATA SETS

There are several testing sites randomly choosen over entire Germany. These sites allow to test the approach with differentclass composites. The current research area (Figure 1) is in the north-eastern part of Germany. It includes the capital,Berlin,and extends to the Odra river in the East and the Lake Plateau of Mecklenburg in the North (Mecklenburg-Western Pomera-nia). It is part of the North German Plain, which was heavily reshaped by nordic ice sheets during the Pleistocene. Dueto the heterogenity of the landscape (younger glacial drift areas) and the high degree of anthropogenic functionality and

usage (urban settlements, river management, agriculture, local recreation) it is suitable for testing of automated classifica-tion approach. The area covers several natural landscape classes (ground moraine plates with lakes / depressions, terminalmoraines, outwash plains and glacial valleys) with different dominant land cover classes.

The satellite data used for this study was acquired on July, 7th 1989 by Landsat 5 TM (path 193 / row 23). Themultispectral sensor collects 6 bands in 25m spatial resolution (bands 1–5, 7) and one band (band 6) in 120m resolution.Due to georeferencing and further processing, all bands and data are geometrically resampled to 28.5m resolution.

The Natural landscape classes (”Naturr amliche Gliederung”) by Meynen-Schmithusen are digitized and rasterized forselect actual training samples. Moreover a Digital Elevation Model (DEM) is used for terrain analysis and visualisation.

3. METHODOLOGY

Remotely sensed image classification is a well examined field of research. However no approach is omnipotent nor capableto fulfill human generalisation abilities. But, is it possible to imitate human pattern recognition ? It is not trivial to answerthe question and leads us towards neuropsychology.

Following the human image understanding in a more psychological way, the task would be split in several parts orprocessing stages. In the mid-80s BIEDERMAN12 developed a theory to explain human image understanding throughRecognition by Components (RBC). The problem of object recognition is divided into several cascading subprocesses.

1. Edge Extraction

(a) Detection of Nonaccidential Properties

(b) Parsing Regions of Concavity

2. Determination of Components

3. Matching of Components to Object Representations

4. Object Identification

When an image of an object is illustrated on the retina, the RBC approach assumes that the representation of the image issegmented – or parsed – into separate regions. The object is decomposed into primitive components also known as geons(for ”geometrical ions”). Instead of a vast account of these primitives further research determined ”only” 36 different geons.Any complex objects can be constructed by these 36 geons. Color and texture are only used for additional informationabout the object or for segmentation. Two dimensional and three dimensional objects can be represented by its componentsand therefore classified by its gestalt. The robust capacity of abstraction allows the recognition of various incomplete anddegraded objects. Instead of recognizing daily life objects, the classification of remotely sensed images is a much morespecialized problem. Because of sensor limitations ”pure” objects cannot be recognized from space. There have to be otherfeatures in the process of class identification.

The RBC-approach of BIEDERMAN bears how the human speech is perceived. The understanding of the human speechis also a problem of object recongition. Complex object arrangements can be abstracted into a small number of primitives.”Only” 44 phonems or 26 letters are sufficient to code english speech or english words, respectively. The set of primitivesis derived from dichotomous or trichotomous contrasts of a small number. With simple combination of primitives anddichotomous contrasts it is possible to code a wide range of words. EDELMAN13 has recently explained the representationof objects by reference shapes, or ”prototypes” (Chorus of Prototypes). He introduced a global shape space in which allkinds of objects can be projected and classified. But robustness and processing speed could not be achieved by using onlyobject prototypes. RIESENHUBER14 developed a more neuropsychological way of object recognition. He broadened theidea of object representation by establishing class categories (Categorical Perception).

The next example explains some of these features in context of specialised land cover classification.

Once upon a midnight dreary, while I pondered weak and weary,Over many a quaint and curious volume of forgotten lore,While I nodded, nearly napping, suddenly there came a tapping,As of some one gently rapping, rapping at my chamber door.‘’Tis some visitor,’ I muttered, ‘tapping at my chamber door —Only this, and nothing more.’

E.A.POE, THE RAVEN (1835)

In the traditional way of classification the former paragraph is split into separate letters. These letters and the occurenceof them is statistically analysed. It is not a trivial work to get the distribution of this data in any feature space. Indead theresults can be compared to other poems and maybe the author used a special vocabulary we can use to identify the text.An extended approach is the use of words – devided by spaces and word marks because the combination of the letters isnot randomly distributed and follows more or less rigorous rules. These letter fragments can be evaluated, too. At thattime words get a special function (as nouns, verbs, attributes, etc.) and meaning in the sentence. The classification is moreaccurate than just using letters. However the number and type of words do not describe the nature of the author. Maybeit is possible to determine the author with this method but there is a certain source of mistake in the results. Letters andwords do not allow to estimate the author exactly. A possible solution is the extension of used features. Certain words(e.g. dreary, weak, forgotten lore, muttered, etc.) get additional information about their meaning. The context exists fromthe meaning of the separate words. This information is extracted from learning. Words become objects with content andadditional attributes. These objects derive from classes which inherit further attributes. With this information the poemobtains new features (like feeling / atmosphere). This new feature is more stable and capable to classify the poem becauseinstead of using the same letters or words Poe used the same atmosphere or style for most of his work. In using categoriesthese features can be used to estimate the right category for the poem. It is obvious that there are several informationlevels in these patterns. Letters and phonems (in speech) are the lowest information units. The next level consists ofwords with semantic meaning. These words are coded by letters and enable the exchange of information (communication).The sentence explains a short statement. The content of information exists at that level. More information is given in anaccumulation of sentences (paragraph). The document itself imparts complex structures, too.

Level Text Image1. Letters – Pixel2. Words – Image Objects3. Sentences – Forest Objects4. Paragraphs – Forest Class5. Documents – Landscape

Table 2. A comparison between text and image objects

What can we learn from this example for image classification? For object classification it is highly important to choosethe right features. Pixels and image objects reflect the true but simple nature of the objects in the image but they cannot beused alone to fulfill complex classification tasks. The human interpret solves this problem in implementing his backgroundknowledge into the classification (e.g. class scheme, hierarchical classification and so on). These results become betterbut it lacks of spatial or thematical transferability of the working steps. What kind of features can be generated from theimage data we can use for background knowledge (a-priori knowledge) ? Surely it depends on the nature of the classes.For the CORINE Land Cover classification detailed information about vegetation, soil and artificial structures are usefulfor hierarchical classification. Moreover a certain knowledge about the spatial and temporal distribution of the classesenhances the classification accuracy. Single objects are not randomly distributed in space and time. In a simple way itis possible to create rules for CORINE objects. The combination of the hard (pixel data) and soft (a-priori knowledge)information is realised in a fuzzy manner.

The combination of object-based classification with a-priori knowledge is not a new invention. Especially the use ofeCognition allows a complex realisation for implementing knowledge. However the developed class hierarchies are rarelytransferable without ease. The robustness and stability of the system evolve from using simple features and non complexrules. Most classes from CORINE LandCover can be identified in combination with former CORINE data set (CLC90)with satisfying accuracy. The classification can be enhanced with additional information and multitemporal data.

Level extracted from Information typepixel level spectral pixel data tone

brightness, saturationImage- texture (kernel size)Level segment level spectral pixel data texture (adapted)

shape size (absolute)compactness, ...

object level shape size (relative)compactness, ...

segments pattern of segmentsSemantic- class level objects pattern of objects

Level spatial distributionneighbourhood

connectivitylandscape level classes, objects spatial distribution

diversity

Table 3. Information Levels of Images

The process of CORINE object recognition of remotely sensed data is also separated into several processing stages(data preparation, generalisation, post-processing).

1. Preprocessing

2. Multi–Level Classification / Generalisation

3. Post-processing

These stages are adapted to the conceptual frame for image understanding by IBRAHIM.15 The model consists of filtering,segmentation, lower image interpretation and complex object recognition. It is postulated that image understanding is aknowledge-based process. Knowledge can be coded implicitly (procedural knowledge) or explicitly (declarative knowl-edge). Procedural knowledge is incorporated in an algorithm or a program. However declarative knowledge is representedsymbolically in production rules, semantic nets and frames. In contrast to procedural knowledge the explicite representa-tion does not contain any control information about the usage of the knowledge.

3.1. Preprocessing

All preprocessing modules are developed in an object-oriented way and use relational databases for data storage.

Metadata Extraction This data is extracted from the scene (acquisition date and place and sensor type). The geograph-ical position determines the Natural landscape classes of the scene.

Image Enhancement The data mining approach (SVC) classifies the data spectrally in a supervised way. This attemptrequires some sort of training data, which is stored in a training samples database. To enhance classification results anatmospheric correction tool like ATCOR16 is used to homogenise data from various scenes. In using reflectances insteadof simple DNs it is possible to apply ratios and colour features for a-priori knowledge.

Image Segmentation There are different approaches for image segmentation. Until now the fractal net evolution ap-proach (FNEA) is used to segment the data. This approach is embedded in the commercial eCognition6 package. It isplanned to substitute this method for an own-developed, more automatical texture approach.

Generation of a-priori knowledge Information is extracted from the image with a-priori knowledge (e.g. Vegetationmask derived from NDVI, Greeness and NIR channel). Thus it is possible to gain fuzzy LC masks for vegetation, soil andsettlement.17

Level 0 Non Vegetation Mixed Vegetation Dense Vegetation

Level 1 Waterbodies Settlement areas ForestImpervious Areas Cropland MeadowBare Soil Artificial areas CroplandPermanent Snow / Ice Natural areas Natural areas

Level 2 Clear Water Open Residential Sites Dedicuous ForestDisturbed Water Cropland (early stages) Coniferous ForestCommercial / Industrial site Cropland (extensive usage) MeadowDense residential sites Parks Cropland (later stages)Acres, Bare Soil Greenland (early) ParksRock, Exploited Areas Wetlands Greenland (later)Sparse Vegetation WetlandsPermanent Snow / Ice

Figure 2. Example for a Vegetation Memership Function

3.2. Multi-Level Classification

Due to the different information levels (table 3) a multi-level classification system is developed. It combines the object-based classification technique, machine learning capacities and fuzzy a-priori knowledge decision in an iterative classi-fication and identification process. Generalisation is split in two parts (spatial and thematical) to ensure robustness andaccuracy. The spatial generalisation is done by image segmentation. Whereas the thematical generalisation consists ofhierarchical spectral classification and fuzzy land cover masks.

Level ComponentLevel 0/1 hierarchical, per pixel Classification with Support Vector ClassifierLevel 2 Neighbourhood / Shape Analysis with GIS ToolsLevel 3 Temporal Analysis (Change Detection) with Decision Support SystemLevel 4 CORINE Analysis with Decision Support System

Table 4. Multi-level Generalisation / Classification

Figure 3. Concept for CORINE object recognition

Object classification A spectral database provides information to identify training samples in the image. A new promis-ing18 data mining technique, called Support Vector Classifier (SVC), is used for supervised classification. It is a machine-learning based approach developed for classification problems and regression analysis. Contrary to Maximum LikelihoodClassication (MLC) the SVC method is non-parametric and produces higher-generalized results. SVCs are successfullyapplied to broad aspects of classification like particle identification, face identication and text categorization. The approachis systematic and reproducible. Although it is a new classification approach, SVCs are used for remotely sensed data forseveral times1920.21 The modules allocate the training samples in ”intelligent” way (sensor, phenology, geographicalposition etc.) from the database.

Support Vector Classifier The technique of support vector machines (SVM) has been developed in the frameworkof statistical learning approaches. In BENNETT AND CAMPBELL22 the approach is examined geometrically because “itis a rare example of a methodology where geometric intuition, elegant mathematics, theoretical guarantees and practicalalgorithms meet.” VAPNIK4 gives an excellent comprehension in statistical learning theory. Like in other binary classifi-cation tasks, there are m data points xi (i=1, ..., m) having corresponding labels yi = ±1. The data points are representedin a k-dimensional feature space. Let the classification function be: f(x) = sign(wx − b). The vector w determines theorientation of a discriminant plane, and the scalar b determines the offset of the plane from the origin. Let us consider thetwo data sets are linearly separable. There are infinitely many possible planes that correctly separate the two training sets.The optimal plane with the highest degree of generalization is the one being ”furthest” from both clusters. This plane canbe constructed by maximizing the margin between both classes. For that purpose those points have to be found with theclosest distance to each other. There are two ways to determing these points. One of them is to create a convex hull aroundeach training data set. The best plane will bisect orthogonally the closest points in the convex hulls.

These located points are called ”support vectors”. The second method is to maximize the margin between the parallelplanes which limit the data of each class. After rescaling the data the linear function w · xi − b ≥ 1 defines all data of the

Figure 4. Schematic overview about Support Vector Machine

class +1. For the class -1 the similar function is w · xi − b ≤ 1. The best dividing plane between the two dataset can becalculated in maximizing the distance between these support planes (w · x = b + 1 and w · x = b − 1). The margin isγ = 2/ ‖ w ‖2. Maximizing the margin is equivalent to minimizing γ.

The method of maximizing the margin between the supporting planes is equivalent to the first one because the supportvectors on the plane are the same as the closest points on the convex hull (duality in mathematics).

The solution of the problem depends only on these support vectors instead of the whole data. In maximizing the marginof separation the complexity is reduced to describe a linear function. This produces less generalization errors and improvedgeneralization with better probability. Thus the dimensionality of the data does not affect the size of the margin. In contrastto other classification methods the problem of overfitting high-dimensional data (”curse of high dimensionality”) is reduced.The more complex the function describing both datasets the poorer the capability of future generalization. Maximizing themargin reduces the complexity of the model.

In reality datasets with linear separability are rarely met. With real life data the strategy of construction a separatingplane will produce no solution. To estimate the best (reduced) convex hull the influence of each point has to be restrictedin introducing an upper bound D < 1 on the multiplier for that point. The former equitation has to be enhanced with theidea of reduced a convex hull. The separability of the maximized margin can produce a wide range of training errors if thelinear discriminants are not adapted to the data sets. This problem can be solved by introducing the kernel trick.

Kernel Trickery In many cases a simple linear discriminant function cannot solve each classification problem (forinstance no linear function can describe a quadratic function).Additional features enhance the data set to increase thedimensionality in using a kernel function. In this artificial feature space a linear discriminant can be constructed. Thiskernel mapping can be applied to all equations. The SVM approach has developed several kernel functions.

θ(u) K(u, v)

Degree d polynomial (u · v + 1)d

Radial Basis Function Machine exp(

‖u−v‖2

)

Two-Layer Neural Network sigmoid(η(u · v) + c)

Table 5. Some Kernel Functions

Kernel mapping produces highly nonlinear classifiers using the same linear discriminant function.

Up to this step there are only image objects containing the spatial and spectral information of the individual pixels.Further useful knowledge can be calculated, e.g. information about shape, size or texture parameters. The integration ofthis additional knowledge allows the construction of complex objects by aggregating the simple image objects.

Fuzzy Object Analysis The classification of image objects deliveres a fuzzy set of land cover classes of different levels(rough and finer classes). The miscellaneous information levels, the land cover masks (vegetation, soil and settlement)and former classification results add additional ”diffuse” data to the object identifier. At the end the determination modulecombines all data sets (Spectral Membership + a-priori Membership + [Landscape] + [Terrain] = Fuzzy Land Cover Class).At that point several classification passes can alter the loading of each component (simple ”learning” element).

After defuzzification the actual CORINE class is estimated.

Spectral Vegetation Settlement Soil0.6 Barren 0.4 No Vegetation 0.2 Settlement 0.7 Soil0.2 Grassland 0.3 Mixed Vegetation 0.8 Settlement 0.3 Soil0.1 Settlement 0.6 Mixed Vegetation 0.9 Settlement 0.1 Soil

=⇒ 211 Non-irrigated arable land

Table 6. Example for non standardised membership and following defuzzification process

3.3. Post-processing

CORINE classes are clearly described thematically and geometrically. Image objects which are smaller than 25 ha (0.25km2) are eliminated or rather merged with their neighbours. In many cases (no change in objects) the border lines can beused from current CORINE data because of redundancy / validation / change detection. Complex classes (e.g. airports)can be constructed in merging certain objects together (asphaltic areas, greenland and buildings).

The validation is effected by current CORINE classification data (CLC 2000). Employing this data, there are threepossibilities of temporal variation. Due to temporal reasons, the (1) geometry changed because of processes such asurbanization and / or (2) the thematic content has been altered, e.g. afforestation. However, the most likely assumption formany CORINE objects is (3) no change. Misclassified objects can be manually corrected after notification.

It is very practical to incorporate the existing CORINE data into the classification system. The supervised classificationapproach for classifying image objects requires training samples. These samples must be extracted from the image. Pre-vious CORINE datasets can also help to assign thematic knowledge to this training data without manual interaction. ForCORINE purpose it is also necessary to use the same geometry for comparison and change detection.

4. RESULTS AND CONCLUSION

Not all componends of this approach are implemented in Python until now. Up to this moment it is possible to process thea-priori knowledge masks for the subscene. Currently level 2 to 4 of the multi-level classification scheme is in progress.Until the end of the year the remaining componends will be implemented. An extensive testing will validate the project.

The approach uses the full geometrical and multispectral potential of LANDSAT 7. It is possible to extract objectssmaller than 25 ha for certain classes (one disadvantage of current CORINE classification). The (semi-) automated methodenables timely land cover maps for large scale areas in spite of heavy processing costs.

REFERENCES

1. A. D. Gregorio and L. Jansen, “A new concept for a land cover classification system,” in Proceedings of the EarthObservation and Environmental Information, 1997.

2. ed: Statistisches Bundesamt, “Umwelt — Umweltproduktivit at, Bodennutzung, Wasser, Abfall. Ausgew ahlte Ergeb-nisse der Umwelt okonomischen Gesamtrechnungen und der Umweltstatistik 2003,” tech. rep., Statistisches Bunde-samt, 2003.

3. T. Lillesand and R. Kiefer, Remote Sensing and image interpretation, 4th ed.John Wiley & Sons, New York, 2000.4. V. Vapnik, The Nature of Statistical Learning Theory, Springer. New York, 1995.5. M. Pedley and P. Curran, “Per-field classification: an example using spot imagery,” Int. Journal of Remote Sensing.

12, pp. 2181–2192, 1991.6. U. C. Benz, P. Hofmann, G. Willhauck, I. Lingenfelder, and M. Heynen, “Multi-resolution, object-oriented fuzzy

analysis of remote sensing data for GIS-ready information,” ISPRS Journal of Photogrammetry & Remote Sensing58, pp. 239–258, 2004.

7. J. C. Eidenshink and J. L. Faundeen, “The 1-km AVHRR global land data set: first stages in implementation,” Inter-national Journal of Remote Sensing 15, pp. 3443–3462, 1994.

8. K.-S. Han, J.-L. Champeaux, and J.-L. Roujean, “A land cover classification product over France at 1 km resolutionusing SPOT4/VEGETATION data,” Remote Sensing of Environment 92, pp. 52–66, 2004.

9. S. Hinz, “Automatic road extraction in urban scenes - and beyond,” in Proceedings of ISPRS congress ”Geoinforma-tion Bridging Continents”, 35, International Archieves of Photogrammetry, Remote Sensing and Spatial InformationSciences, 2004.

10. M. Keil, B. Mohaupt-Jahr, R. Kiefl, and G. Strunz, “Update of the CORINE Land Cover Data Base in Germany,” inCLC 2000 Workshop, Federal Environmental Agency, 2003.

11. R. M. Fuller and N. Brown, “A CORINE map of Great Britain by automated means. Techniques for automaticgeneralization of the Land cover map of Great Britain,” International Journal of Geographical Information Systems8, pp. 937–953, 1996.

12. I. Biederman, “Recognition-by-Components: A Theory of Human Image Understanding,” Psychological Review94(2), pp. 115–147, 1987.

13. S. Edelman, “Representation, Similarity and the Chorus of Prototypes,” Minds and Machines 5, pp. 45–68, 1995.14. M. Riesenhuber, How a Part of the Brain Might or Might Not Work: A New Hierarchical Model of Object Recognition.

PhD thesis, Massachusetts Institute of Technology, 2000.15. A. E. Ibrahim. Internet.16. R. Richter and D. Schl apfer, “Geo-Atmospheric Processing of Airborne Imaging Spectrometry Data, Part 2: Atmo-

spheric / Topographic Correction,” International Journal of Remote Sensing 23(13), pp. 2631–2649, 2002.17. R. de Kok, T. Wever, and R. Fockelmann, “Analysis of urban structure and development applying procedures for

automatic mapping of large area data,” The international archives of the photogrammetry, remote sensing and spatialinformation science 34, pp. 41–45, 1994.

18. P. A. Flach, “On the state of the art in machine learning: A personal review,” Artificial Intelligence 131, pp. 199–222,2001.

19. G. Camps-Valls, L. Gomez-Chova, J. Calpe-Maravilla, J. D. Martın-Guerrero, E. Soria-Olivias, L. Alonso-Chorda,and J. Moreno, “Robust support vector method for hyperspectral data classifaction and knowledge discovery,” IEEETransactions on Geoscience and Remote Sensing 42(7), pp. 1530–1542, 2004.

20. G. M. Foody and A. Mathur, “A relative evaluation of multiclass image classification by support vector machines,”IEEE Transactions on Geoscience and Remote Sensing 42(6), pp. 1335–1343, 2004.

21. L. D. C. Huang and J. Townshend, “An assessment of support vector machines for land cover classification,” Interna-tional Journal of Remote Sensing 23, pp. 725–749, 2002.

22. K. P. Bennett and C. Campbell., “Support vector machines: Hype or hallelujah ?,” in SIGKDD Explorations, pp. 1–13,2000.