Imaging spectroscopy for scene analysis: challenges and opportunities


www.ietdl.org

Published in IET Computer Vision
Received on 20th November 2010
Revised on 21st December 2012
Accepted on 3rd January 2013
doi: 10.1049/iet-cvi.2010.0205

IET Comput. Vis., 2013, Vol. 7, Iss. 6, pp. 467–477

ISSN 1751-9632

Antonio Robles-Kelly 1,3, Bill Simpson-Young 2

1 NICTA, Tower A, 7 London Circuit, Canberra, ACT 2601, Australia
2 NICTA, Level 5, Australian Technology Park, Eveleigh, NSW 2015, Australia
3 College of Engineering and Computer Science, ANU, Canberra, ACT 0200, Australia

E-mail: [email protected]

Abstract: In this study, the authors explore the opportunities, application areas and challenges involving the use of imaging spectroscopy as a means for scene understanding. This is important, since scene analysis in the scope of imaging spectroscopy involves the ability to robustly encode material properties, object composition and concentrations of primordial components in the scene. The combination of spatial and compositional information opens up a vast number of application possibilities. For instance, spectroscopic scene analysis can enable advanced capabilities for surveillance by permitting objects to be tracked based on material properties. In computational photography, images may be enhanced taking into account each specific material type in the scene. For food security, health and precision agriculture it can be the basis for the development of diagnostic and surveying tools which can detect pests before symptoms are apparent to the naked eye. This combination of a broad domain of application with the use of key technologies makes the use of imaging spectroscopy a worthwhile opportunity for researchers in the areas of computer vision and pattern recognition.

1 Introduction

Imaging spectroscopy technology captures and processes image data in tens or hundreds of bands covering a broad spectral range. Compared with traditional monochrome and red/green/blue (RGB) camera sensors, the multispectral and hyperspectral image sensors used for imaging spectroscopy can provide an information-rich representation of the spectral response of materials, which poses great opportunities and challenges.

Despite its potential, imaging spectroscopy has often only been available to a limited number of researchers and professionals because of the high cost of spectral cameras and the complexity of processing spectral data corresponding to a large number of bands. Further, until recently, commercial spectral imaging systems were mainly airborne ones which could not be used for ground-based image acquisition.

The vast majority of ground-based image capture (e.g. for general photography, surveillance and industrial machine vision) is currently performed using RGB filters on imaging sensors. Since its inception in the late eighties, digital photography has almost exclusively used RGB sensors. Approaches to camera sensors have varied, including using complementary metal oxide semiconductor (CMOS) detectors or charge coupled devices (CCDs) with standard RGB filter mosaics using Bayer arrays, three-CCD cameras or a set of stacked photodiodes with a layer for each colour channel, such as that used in the Foveon(R) chip. Despite their differences, all of these sensor types involve capturing light at the wavelengths of the three additive primary colours (red, green and blue). These bands are used because they are a close approximation to the bands to which the human eye is sensitive and, therefore, are well-suited to capturing photos and videos for immediate human perception.

In today's cameras, the captured RGB image data is not only used for immediate human perception as originally intended, but also for a wide range of applications devoid of direct human visualisation. Over the past 5 years, many digital cameras have featured integrated circuits and firmware for sophisticated image processing (such as Canon's DIGIC chips and Nikon's Expeed chips). These initially performed functions such as automatic focus, exposure and white balance. More recently, such chips have performed a wider range of higher-level scene analysis features such as face detection, smile detection and object tracking. Such scene analysis features provide high value to professional and consumer camera users and are typically significant selling points in particular models. Many industrial cameras now also include high-level scene analysis functions, such as people counting in surveillance cameras and object recognition in 'smart cameras' used in industrial machine vision.

Every scene comprises a rich tapestry of light sources, material reflectance, and lighting effects because of object curvature and shadows. Despite being reasonably effective for scene analysis, trichromatic (i.e. RGB) technology does have limits in its scene analysis capabilities. For example, a camera with an RGB sensor cannot determine the constituent material of an object in a scene and alter the

© The Institution of Engineering and Technology 2013


aesthetics accordingly. Similarly, cameras with RGB sensors cannot, in general, deliver photometric invariants characteristic to a material independent of the lighting for robust tracking, identification and recognition tasks.

In contrast, imaging spectroscopy delivers an information-rich representation of the surface radiance acquired by multispectral and hyperspectral sensing devices. As in trichromatic imaging, the multispectral reflectance distribution is determined not only by the light source and viewer directions, but also by the material properties of the surface under study. The main difference stems from the capacity of imaging spectroscopy to robustly encode material properties, object composition and concentrations. Imaging spectroscopy research will enable the use of compositional information for scene understanding.

With the appearance of commercial systems such as the Hyperspectral Image Intensified Benchtop Camera Systems of OKSI, the Brimrose Acousto-Optic Tunable Filter (AOTF) imagers and the FluxData Multispectral Cameras, scene analysis via imaging spectroscopy will find its way into areas such as defence, biosecurity, surveillance and computational photography [1].

With the availability of imaging spectroscopy in ground-based cameras, it will no longer be necessary to limit the camera data captured to three RGB colour channels; instead, cameras should become available that offer an alternative number and range of bands providing the best trade-off of functionality, performance and cost for a particular market segment or application need. Rather than having the same spectra captured as displayed, it will be practical to decouple these, capturing a rich spectral representation, performing processing on this representation and then rendering in trichromatic form when this is needed for display.

The ability to combine spatial and compositional information of the scene will require solving several difficult problems. With these problems solved, spectroscopic scene analysis offers the possibility of performing shape analysis from a single view for non-diffuse surfaces [2], recovering photometric invariants and material-specific signatures [3], recovering the power spectrum of the illuminant [4] and visualising digital media [5].

In the following sections, we explore the opportunities, application areas and challenges involving the use of imaging spectroscopy for advanced shape analysis, recovery of photometric parameters and material-dependent signatures. These technologies can potentially change current practices by enabling consumer digital cameras with advanced scene analysis features and more accurate colour reproduction; by providing manufacturers with improved visual inspection methods; by giving security professionals more information on a surveillance scene; and by saving farmers and quarantine officers hundreds of hours in laborious and time-consuming pest inspections.

Fig. 1 Example applications for computational photography

Top panel: Example of changing the reflectance of a particular material in the scene
Bottom panel: Example of changing both the material reflectance and the illuminant power spectrum

2 Application areas

Imaging spectroscopy technology can be used in a wide range of applications and a subset of these will be described in this section. Note that despite the diversity of the following applications, it is feasible to build core imaging spectroscopy technology for the handling and processing of spectral data across all of these application areas, just as some core technologies for processing RGB images


are now used in almost all cameras. In the following section, we turn our attention to some of the research challenges in computer vision and pattern recognition involving spectroscopic scene analysis. For now, we focus on applications in computational photography, defence, health and food security.

2.1 Computational photography

The research on scene analysis using imaging spectroscopy can provide a means for the development of advanced features in computational photography. The illuminant recovery techniques proposed in [4] and the polarimetric analysis methods in [6] can be further explored for purposes of scene classification through the integration of spatial and spectral information. This will permit the development of advanced methods for autofocusing, white balancing, and post- and pre-processing tasks for digital content generation and media. This is at the core of future developments in computational photography [1]. Since resolution is not expected to drive the market in high-end photography in the long term, manufacturers will seek new ways of improving image quality and adding advanced features, such as re-illumination (changing the scene lighting accurately) and material substitution (changing characteristics of materials in the scene, such as changing cotton to appear like silk).

Imaging spectroscopy can provide advanced functionality and high-fidelity colour reproduction while allowing optics to remain unaffected, thus providing a natural extension of current technologies while allowing advanced features. An example of this is shown in Fig. 1. In the top panel, we show the results obtained by substituting the reflectance of the subject's T-shirt for that of a green cotton cloth. The bottom panel shows the new T-shirt reflectance combined



with the substitution of the illuminant power spectrum in the scene with that of a neon strip light. The images have been obtained by recovering the illuminant power spectrum using the method in [4]. The material recognition has been done using the algorithm in [3]. Despite being simple examples, note that post-processing tasks such as re-illumination and material substitution can be employed in digital media generation, whereas colour enhancement and retouching are applicable to digital image processing and post-production.
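As a rough illustration of the re-illumination step described above, the diffuse image-formation model can be sketched in a few lines. The cube dimensions, illuminant spectra and camera sensitivities below are all invented placeholders for the sketch, not the data or the methods of [3, 4]:

```python
import numpy as np

# Hypothetical spectral re-illumination sketch: divide out the estimated
# illuminant, multiply in the new one, then render to RGB for display.
H, W, B = 4, 4, 31                      # 31 bands, e.g. 400-700 nm in 10 nm steps
rng = np.random.default_rng(0)

reflectance = rng.uniform(0.0, 1.0, (H, W, B))   # per-pixel surface reflectance
daylight = np.linspace(0.8, 1.2, B)              # stand-in daylight power spectrum
neon = np.exp(-0.5 * ((np.arange(B) - 20) / 3.0) ** 2)  # peaky stand-in neon spectrum

# Diffuse image formation: radiance = reflectance x illuminant (per band)
radiance_day = reflectance * daylight            # broadcasts over the band axis
# Re-illumination: recover reflectance, then apply the new illuminant
radiance_neon = (radiance_day / daylight) * neon

# Render to trichromatic form with stand-in Gaussian camera sensitivities
centres = [25, 15, 5]                            # red, green, blue band centres
sens = np.stack([np.exp(-0.5 * ((np.arange(B) - c) / 4.0) ** 2) for c in centres])
rgb = radiance_neon @ sens.T                     # (H, W, 3) display image
print(rgb.shape)
```

The same divide-and-multiply structure underlies material substitution: instead of swapping the illuminant spectrum, the recovered reflectance of the target region is replaced before re-rendering.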

2.2 Food security

Food security is increasingly becoming a national priority for nations around the world. Imaging spectroscopy provides a rich representation of natural environments. Spectral signatures for flora can provide a means for non-destructive biosecurity practices by accurately identifying pests which threaten crops, livestock and habitats. This may be done so that signatures can be stored in a database in order to perform recognition on spectral images. Thus, imaging spectroscopy technologies will enable early detection of plant pests by allowing diagnosis before symptoms are visible to the naked eye. For instance, consider the use of hyperspectral imagery for crop disease detection and mapping. In the left-hand column of Fig. 2, we show pseudo-colour images of capsicum plants, some of which have been infected with a virus whose visible symptoms are not yet apparent. Note that, from the colour

Fig. 2 Example results for the detection of viral infection on capsicums from hyperspectral imagery

Left-hand column: pseudo-colour images
Middle column: foreground regions recovered after chlorophyll recognition
Right-hand column: pathogen mapping results after the application of the method in [7]


alone, there is no noticeable difference between the healthy and infected specimens. Despite this, making use of hyperspectral imagery, the infected instances can be recovered making use of a two-step process, where the plant regions in the image are recovered using chlorophyll recognition and, once the plants have been segmented from the scene, infection can be detected using the method in [7].

Further, imaging spectroscopy technologies are not limited to biosecurity but can also be used to solve a set of strategic problems that will provide more automated and informative processes to existing platforms for crop management, plant health and food security. This is exemplified in Fig. 3, where in the left-hand panel we show the acquisition setting for the imagery in Fig. 2, whereas the right-hand panel shows a similar set-up in the field. As a result, imaging spectroscopy may find its way into precision agriculture in order to achieve higher yields and improve agribusiness efficiency. For instance, imaging spectroscopy could be used for phenology by relating spectral signatures to periodic plant life-cycle events and the ways in which these are influenced by seasonal and climatic variations. This will permit the selection and delivery of crops better suited to particular environments. Another example application is food sorting. In Fig. 4, we show results on nut classification using imaging spectroscopy. The figure shows the pseudo-colour image for the hyperspectral images under study along with example spectra of the nuts being classified. In our imagery, there are five nut types that



Fig. 3 Example acquisition settings for biosecurity and precision agriculture

On the left-hand panel, we show the actual acquisition set-up for the imagery shown in Fig. 2. The right-hand panel shows a deployment at an apple orchard

Fig. 4 Nut classification example

a Input pseudo-colour spectra
b Classification map [cashew: blue, macadamia: green, almond: red, peanut: cyan, pistachio: magenta]
c Selected sample spectra



Fig. 5 Calibrated data acquisition in a studio

Here, we show a set-up specifically designed to acquire hyperspectral photometric stereo imagery using ten incandescent light sources and two cameras in a daisy chain


appear in the scene in different quantities. Note that the colour and shape of the nuts do not permit a straightforward classification from RGB data. Nonetheless, by applying the method in [3], the nuts can be segmented into different varieties.
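One simple way to sketch this kind of signature-based sorting is the spectral angle mapper (SAM), a standard hyperspectral classifier (not the method of [3]); the library signatures and band count below are random stand-ins:

```python
import numpy as np

# Nearest-signature labelling with the spectral angle mapper: the angle
# between spectra is insensitive to overall brightness, only to shape.
def spectral_angle(pixels, library):
    """Angles between each pixel spectrum (N, B) and each library spectrum (K, B)."""
    p = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)
    q = library / np.linalg.norm(library, axis=1, keepdims=True)
    cos = np.clip(p @ q.T, -1.0, 1.0)
    return np.arccos(cos)                     # (N, K) angles in radians

rng = np.random.default_rng(1)
library = rng.uniform(0.1, 1.0, (5, 31))      # stand-ins for cashew, macadamia, ...
# Test pixels: library spectra rescaled by random positive brightness factors
pixels = library[[0, 3, 3, 1]] * rng.uniform(0.5, 1.5, (4, 1))

labels = spectral_angle(pixels, library).argmin(axis=1)
print(labels)                                  # recovers the source indices [0 3 3 1]
```

Because SAM normalises each spectrum, a shaded and a brightly lit sample of the same nut map to the same signature, which is the property that makes per-pixel material sorting workable.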

2.3 Defence technologies

Spectroscopic scene analysis also has the potential to transform defence technologies. New spectroscopy methods will provide better judgment on materials and objects in the scene and will deliver advanced tracking techniques robust to illumination changes and confounding factors such as camouflage or make-up. Such capabilities are elusive in current computer vision systems because of their vulnerability to metamerism, that is, the possibility of having two materials with the same colour but dissimilar composition.

Other functionalities include gender, age and ethnic group recognition from skin biometrics [8]. Extensions of this research so as to include polarimetric measurements can permit tracking in scattering media so as to deal with natural phenomena such as mirages. For surveillance, the integration of spectral and spatial information will produce a very information-rich representation of the scene and may open up the possibility of exploiting the interdependency of the spectral and spatial information to detect dangerous goods and provide positive access making use of biometric measurements such as skin spectra.

2.4 Earth sciences

Imaging spectroscopy technologies can also be applied to the earth sciences, for example, to measure carbon content in vegetation and biomass. This is related to the natural extension of imaging spectroscopy to geosciences and remote sensing, where photometric invariance in hyperspectral imagery for material classification and mapping in aerial imaging is mainly concerned with artifacts induced by atmospheric effects and changing solar illumination [9]. For ground-based scene analysis, assumptions often found in remote sensing regarding the illuminant, sensors, scene geometry and imaging circumstances no longer apply. For scene analysis, the light source is no longer a single point at infinity, that is, the sun; the geometry of the scene is no longer a plane, that is, the earth's surface; and the spectrum of the illuminant and transmission media are not constrained to the atmosphere and sunlight power spectrum. Thus, the recovery of reflectance invariant to the illuminant, viewer directions and imager choice is further complicated and several difficult problems remain to be solved. Close-range ground inspection relevant to areas such as mining and resource exploration can benefit from scene analysis techniques which solve these problems.

2.5 Health

The advanced recognition capabilities provided by pattern recognition approaches will allow tackling problems such as skin cancer detection and non-intrusive saturation measurement for health and high-performance sports. This is because set-ups such as that in Fig. 5, where a variety of controlled light sources can be combined with hyperspectral cameras in a studio, can be used to capture photometrically calibrated imagery. These imaging


techniques can provide an opportunity to develop a non-invasive way of measuring haemoglobin concentration, ratios of oxygenated against desaturated haemoglobin and carboxy-haemoglobin concentrations, or diagnosing melanoma by making use of imaging spectroscopy technology together with three-dimensional (3D) shape analysis and statistical models for skin reflectance.
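The saturation measurement mentioned above ultimately rests on the Beer–Lambert law: absorbance at each wavelength is a linear mix of the chromophore concentrations. A toy two-wavelength solve illustrates the idea; the extinction coefficients below are invented placeholders, not physiological values:

```python
import numpy as np

# Beer-Lambert sketch of oxygen-saturation estimation from absorbance at two
# wavelengths. E holds made-up extinction coefficients (path length folded in);
# the point is the 2x2 linear solve, not the numbers.
E = np.array([[1.0, 3.0],      # wavelength 1: [eps_HbO2, eps_Hb]
              [2.5, 0.8]])     # wavelength 2: [eps_HbO2, eps_Hb]
c_true = np.array([0.7, 0.3])  # concentrations [HbO2, Hb] (arbitrary units)
A = E @ c_true                 # simulated measured absorbances

c = np.linalg.solve(E, A)      # recover concentrations from absorbances
saturation = c[0] / c.sum()    # fraction of oxygenated haemoglobin
print(round(float(saturation), 3))  # -> 0.7
```

With tens of bands rather than two, the same system becomes overdetermined and is solved by least squares, which is what makes hyperspectral measurement attractive for this task.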

3 Research challenges

As mentioned earlier, imaging spectroscopy is a key area of research enabling a wide range of functions in diverse areas of application. Absorption spectroscopy, which is the measurement of the amount of energy which is absorbed as a function of wavelength by a sample or object, is used for identifying materials in chemistry, astronomy and the geosciences. Imaging spectroscopy relies on associating each pixel in the image with a spectrum representing the intensity at each wavelength. Performing scene analysis on an image with such pixels confronts us with several hard research problems. These spectra are not only dependent on the materials in the scene, but also depend on the shape of the objects, the illuminant 'colour' and the light position. The photometrics of the scene not only influence the appearance of an object to an observer, but also the polarisation properties of the emitted radiation. This implies that, to achieve reliable scene understanding, we are not only required to focus on higher-level tasks such as



recognition or classification, but we also have to recover the object shape, the illuminant power spectrum and the position of the light with respect to the camera.

Thus, the research on spectroscopic scene analysis has a twofold aim. Firstly, it should address the photometric invariance problem so as to recover features devoid of illumination variations, specularities and shadowing. This, in turn, may comprise complex analysis of the spectroscopy data, with goals such as the following:

† Illuminant recovery for invariant imaging spectroscopy.
† Shape analysis and reflectance modelling.
† Polarimetric scene analysis.

Secondly, such research must explore the use of theimaging spectra for purposes of scene classification andobject material classification aimed at

† Extraction of scale and affine invariant features and spectral signatures.
† Spatio-spectral unmixing.
† Spatio-spectral feature combination and selection.
† Integration of spatial and spectral information for structural scene representation.
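Several of these aims treat each pixel spectrum as a point in a high-dimensional space. As a minimal illustration (with random data, not tied to any method cited here), pixel-level dimensionality reduction via PCA can be sketched as:

```python
import numpy as np

# PCA on pixel spectra via SVD: each pixel of a (H, W, B) hyperspectral cube
# is a point in R^B; project onto the top-k principal components.
rng = np.random.default_rng(2)
H, W, B, k = 8, 8, 64, 3
cube = rng.normal(size=(H, W, B))       # stand-in hyperspectral cube

X = cube.reshape(-1, B)                 # (H*W, B) matrix of pixel spectra
Xc = X - X.mean(axis=0)                 # centre each band
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
features = Xc @ Vt[:k].T                # (H*W, k) reduced spectral features
print(features.shape)
```

Kernel variants follow the same pattern after replacing the inner products with kernel evaluations.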

3.1 Illuminant recovery for invariant imaging spectroscopy

In order to retrieve the spectra of the materials in the scene, it is first necessary to remove the spectra of the illuminants. The main challenge here stems from the fact that imagers deliver radiance, that is, the amount of light that passes through the lens of the camera, not reflectance (the fraction of the power spectrum that is reflected from the surface). Note that, whereas radiance is determined by the light source and viewer directions and the material properties of the surface under study, reflectance is a characteristic of the object. This is well known in remote sensing material identification [10], and is, in general, a classification problem. For spectral image classification, each pixel is associated

with a spectrum which can be viewed as an input vector in a high-dimensional space. Thus, algorithms from statistical pattern recognition and machine learning have been adopted to perform pixel-level feature extraction and classification [11]. These methods either directly use the complete spectra, or often make use of preprocessing and dimensionality reduction steps at input, and attempt to recover statistically optimal solutions. Linear dimensionality reduction methods are based on the linear projection of the input data to a lower-dimensional feature space. Typical methods include principal component analysis (PCA) [12], linear discriminant analysis (LDA) [13] and projection pursuit [14]. Almost all linear feature extraction methods can be kernelised, resulting in kernel PCA [15], kernel LDA [16] and kernel projection pursuit [17]. These methods exploit non-linear relations between different segments in the spectra by mapping the input data onto a high-dimensional space through different kernel functions [18].

From an alternative viewpoint, features of the absorption

spectrum can be used as signatures for chemicals and their concentrations [19]. Absorption and reflection are two complementary behaviours of the light incident on the material surface. Although reflections are directly measurable by imaging sensors, absorptions, indicated by


local dips or valleys in the reflectance spectra, are less straightforward to recover. Nevertheless, absorptions are inherently related to the material chemistry as well as other physical properties such as surface roughness [20]. Therefore, the presence of an absorption at a certain spectral range is a 'signature' which can be used for identification and recognition purposes. Furthermore, absorption features have been used in the Tetracorder system [21] to identify spectrum components. Unfortunately, the Tetracorder is a semi-automatic expert system where, for purposes of material identification, absorption features are required to be manually labelled. The Tetracorder is based upon spectral feature identification algorithms such as that in [22], where a least-squares fit is used to match the spectrum under study to that in a reference library.

In any case, whether spectra viewed as high-dimensional

vectors or absorptions are employed, there is the need to recover the reflectance invariant to illuminant, viewer directions and sensor choice. This is further complicated by the fact that assumptions often found in remote sensing regarding the illuminant, sensors, scene geometry and imaging circumstances no longer apply. As mentioned earlier, for scene analysis, the light source is no longer a single point at infinity, that is, the sun; the geometry of the scene is no longer a plane, that is, the earth's surface; and the spectrum of the illuminant and transmission media are not fixed to the atmosphere and sunlight power spectrum. Thus, for scene analysis, the problem should be treated as one arising from the complex geometric settings found in real-world scenes, with one or more illuminants of different sorts (neon tubes, incandescent light bulbs, sunlight) and interreflections between object surfaces, some of them translucent or transparent.

In computer vision, the main focus regarding reflectance

has traditionally revolved around the use of the bidirectional reflectance distribution function (BRDF) [23]. Despite being effective, the direct application of these methods to imaging spectroscopy is somewhat limited, since extending them to images comprising tens or even hundreds of bands is not a straightforward task. This is mainly because of the complexity of obtaining a closed-form solution for the high-dimensional image data involved, but it is also related to the fact that the BRDF parameters become wavelength dependent and, hence, are often intractable.

For reliable scene analysis, methods should be applicable

to highly textured surfaces and scenes with multiple illuminants. The case of multiple illuminant directions, where these are known, is also interesting from the scholarly point of view, since the angular variables for isotropic reflection depend solely on the surface normal [24]. Finally, the multi-image case, such as that pertaining to stereo vision, where the light source directions are not known, may be a worthy vehicle for the application of large-scale optimisation methods so as to process all the spatial and spectral domain parameters simultaneously in order to recover the illuminant directions, surface shape and photometric parameters.
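As a concrete instance of the known-lights case, classical Lambertian photometric stereo recovers the albedo and surface normal at a pixel by least squares from its intensities under three or more lights. The light directions and test normal below are made-up values for illustration, not data from the set-up of Fig. 5:

```python
import numpy as np

# Least-squares Lambertian photometric stereo at a single pixel: intensities
# I_j = albedo * max(0, l_j . n) under known light directions l_j.
L = np.array([[0.0, 0.0, 1.0],
              [0.7, 0.0, 0.714],
              [0.0, 0.7, 0.714],
              [-0.5, -0.5, 0.707]])
L = L / np.linalg.norm(L, axis=1, keepdims=True)   # unit light directions

n_true = np.array([0.2, -0.1, 0.97])
n_true /= np.linalg.norm(n_true)                   # ground-truth unit normal
albedo_true = 0.8
I = albedo_true * np.clip(L @ n_true, 0.0, None)   # simulated intensities

g, *_ = np.linalg.lstsq(L, I, rcond=None)          # g = albedo * normal
albedo = np.linalg.norm(g)
n = g / albedo                                     # recovered unit normal
print(np.round(n, 3), round(float(albedo), 3))
```

In the spectral setting the same solve is repeated per band, and coupling the bands is precisely where the large-scale optimisation mentioned above comes in.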

3.2 Shape analysis and reflectance modelling

Spectroscopic scene analysis opens up the possibility of recovering the object shape and photometric parameters from a single viewpoint making use of a single image by understanding the relation between shape and reflectance. This is because of the intrinsic relation between photometric invariance and shape recovery arising from the fact that the



reflectance of an object is determined not only by the lightsource and viewing directions, but also by the materialproperties of the surface under study.This is not only practically useful but theoretically

important since it would provide a unified view of previous work in the fields of shape from shading and photometric stereo. Note that the classic approaches to shape from shading [25–27] often rely on Lambertian, that is, diffuse, object reflectance assumptions. Other shape-from-shading methods use a reflectance model to capture the photometric invariance of non-Lambertian surfaces, where specular spikes and rough materials have to be considered. Along these lines, it is perhaps the work of Beckmann on smooth and rough surface reflectance that is the best known in the vision and graphics communities [28]. Other reflectance models used in the literature are those proposed by Vernold and Harvey [29] and Torrance and Sparrow [30].

The recovery of shape from a single view of non-Lambertian surfaces using the solutions to the reflectance models above can lead to a generalisation of classical methods in computer vision for purposes of shape recovery. This is not a straightforward task. Indeed, it is a challenging one entailing dependence upon wavelength and the transmission and refraction of light through the boundary between different object media. Nonetheless, the rewards are vast since a solution to the shape recovery problem in spectroscopic scene analysis can cast the problem in a general theoretical setting so as to recover the photometric invariants of materials and surface shape from multispectral or hyperspectral imagery. Moreover, imaging spectroscopy methods are quite general in nature, being equally applicable to monochromatic or trichromatic imagery by fixing the discrete wavelength-indexed terms accordingly.
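By way of illustration, the wavelength-indexed image formation that underpins these methods can be sketched under the simplest of the models above, the Lambertian one. The function and variable names below are illustrative assumptions rather than the formulation of any specific method cited here; fixing the number of bands to one or three recovers the monochromatic and trichromatic cases.

```python
import numpy as np

def lambertian_render(albedo, normals, light, illuminant):
    """Render a hyperspectral image under the Lambertian model:
    I(u, lambda) = L(lambda) * rho(u, lambda) * max(N(u) . S, 0).

    albedo:     (H, W, B) wavelength-indexed diffuse albedo rho
    normals:    (H, W, 3) unit surface normals N
    light:      (3,) unit light source direction S
    illuminant: (B,) illuminant power spectrum L
    """
    shading = np.clip(normals @ light, 0.0, None)        # (H, W) cosine term
    return albedo * illuminant[None, None, :] * shading[..., None]

# Toy usage: a fronto-parallel surface lit head-on, so the rendered
# image is simply the albedo scaled by the illuminant.
albedo = np.full((2, 2, 4), 0.5)
normals = np.zeros((2, 2, 3))
normals[..., 2] = 1.0
image = lambertian_render(albedo, normals,
                          np.array([0.0, 0.0, 1.0]), np.ones(4))
```

Non-Lambertian models such as those of Beckmann or Torrance and Sparrow would replace the cosine shading term with their respective specular and roughness-dependent terms.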

3.3 Extraction of scale and affine invariant features and spectral signatures

It is surprising that, despite the widespread use of high-level features for recognition and retrieval of monochromatic and trichromatic imagery, image descriptors in imaging spectroscopy are somewhat under-researched, with current methods focusing on the local analysis of individual pixel spectra rather than profiting from the structural information that the imagery provides. The development of affine invariant imaging spectroscopy features to capture the structural and spectral information in the imagery for purposes of recognition can greatly enhance the results delivered by material classification methods which operate on pixel spectra alone. Thus, the use of higher-level features in imaging spectroscopy opens up great opportunities in recognition and classification tasks. Moreover, the multidimensional nature of these local

image features and descriptors may be combined to improve performance. For instance, for RGB data, Varma and Ray [31] have used a kernel learning approach to learn the trade-off between discriminative power and invariance of image descriptors in classification tasks. Other alternatives tend to view the features as multidimensional data and, making use of unsupervised learning, exploit similarity information in a graph-theoretic setting. Examples of these are the method presented by Sengupta and Boyer [32] and that developed by Shokoufandeh et al. [33], which employ information-theoretical criteria to hierarchically structure the


dataset under study and pattern recognition methods to match the candidates.

For imaging spectroscopy, the descriptors and representations of signatures used for recognition are mainly limited to edge detection [34] or approaches based upon the derivative analysis of the spectra [35]. Nonetheless, although this local analysis of the spectra was shown to be intrinsic to the surface albedo, the analysis in [35] was derived from the Lambertian reflection model (i.e. where all surfaces reflect light perfectly diffusely) and, hence, is not applicable to complex scenes where interreflections and multiple illuminants may occur. Fu et al. [36] have proposed the use of band ratios as an alternative to raw spectral bands as features for classification invariant to shading. In [37], a subspace projection method for specularity-free spectral representation is presented. Huynh and Robles-Kelly [38] have presented a method to represent reflectance data making use of a continuous basis by fitting a B-spline to the spectra under study. Despite being effective, the methods above focus on the

representation of pixels as signatures, overlooking the spatial structure of the image. Recently, Khuwuthyakorn et al. [39] proposed a texture descriptor for imaging spectroscopy based upon Fourier analysis and heavy-tailed probability distributions. This is reminiscent of time-dependent textures, whose probability density functions exhibit first- and second-order moments which are space and time-shift invariant [40]. Unfortunately, the descriptor presented in [39] cannot be used to recover the spectral signatures but rather has been designed for purposes of recognition, where precise material matching is not necessary. Note that the methods above employ empirical

observations of statistical patterns in the spectra or signal processing techniques to ‘capture’ the information in the image. Even though effective, the combination of these with the material-specific spectral information delivered by imaging spectroscopy is still a somewhat sparse area of research. The combination of the spectral and spatial structure delivered by the imagery can provide a means of obtaining compact representations of the scene robust to affine transformations in the scene. Further, by viewing the spectra as high-dimensional vector fields, this can also allow the application of kernel methods so as to incorporate machine learning techniques into the feature recovery process. Other possibilities comprise the use of attributed graph-based representations, where pixels, object parts or superpixels can be viewed as nodes. In this manner, the information delivered by their spectra can then be used to recover an attribute set for the graph.
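To illustrate how kernel methods apply once pixel spectra are viewed as high-dimensional vectors, the sketch below computes a standard radial basis function (RBF) kernel matrix between sets of spectra; the names and the choice of kernel are illustrative assumptions, not taken from the works cited above.

```python
import numpy as np

def rbf_kernel(spectra_a, spectra_b, gamma=1.0):
    """RBF kernel matrix between two sets of pixel spectra.

    Treating each B-band spectrum as a point in R^B lets standard
    kernel machinery (SVMs, kernel PCA, spectral clustering) operate
    on spectral similarity directly.
    spectra_a: (N, B), spectra_b: (M, B) -> kernel matrix (N, M)
    """
    sq_dists = (np.sum(spectra_a**2, axis=1)[:, None]
                + np.sum(spectra_b**2, axis=1)[None, :]
                - 2.0 * spectra_a @ spectra_b.T)
    return np.exp(-gamma * np.clip(sq_dists, 0.0, None))

# Usage: identical spectra yield similarity 1; dissimilar ones decay.
spectra = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
K = rbf_kernel(spectra, spectra)
```

The resulting Gram matrix can be handed directly to any kernelised learner, or thresholded to build the affinity structure of a graph-based representation.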

3.4 Spatio-spectral unmixing

Up to this stage, we have elaborated upon the invariance or representation problems related to imaging spectroscopy devoid of subpixel information. The use of imaging spectroscopy for scene analysis also permits addressing the problem of spectral unmixing with known lighting conditions and unknown constitutive material compounds of the objects in the scene. Spectral unmixing is commonly stated as the problem of decomposing an input spectral signal into relative portions of known spectra of end members. The end members can be any man-made or naturally occurring materials, such as water, metals etc. The input data varies in many forms, such as radiance or reflectance spectra obtained from hyperspectral images. The problem of unmixing applies to all those cases where a

© The Institution of Engineering and Technology 2013


capability to provide subpixel detail is needed, such as the geosciences, food quality assessment and process control. Moreover, unmixing is not exclusive to subpixel processing but can be viewed as a pattern recognition task related to soft-clustering with known or unknown end members. Unlike previous approaches related to the field of spectral

unmixing [41] and photometric invariants [37], in the case of scene analysis the constitutive compounds, or end members as they are known in the geosciences and mineralogy, are, in general, not known. Spectral unmixing with automatic end member extraction is closely related to the simultaneous estimation of illuminant power spectrum and material reflectance presented earlier. Many of the methods elsewhere in the literature hinge on the notion that the simultaneous recovery of the reflectance and illuminant power spectrum requires an inference process driven by statistical techniques. In [42], Stainvas and Lowe proposed a Markov random field to separate illumination from reflectance in the input images. On the other hand, the physics-based approach in [43] for

image colour understanding alternately forms hypotheses of colour clusters from local image data and verifies whether these hypotheses fit the input image. Moreover, unmixing with unknown end member spectra

recovery has the potential to avoid the cumbersome labelling of scene spectral signatures which, at the moment, is mainly effected through expert intervention [44]. Automatic end member extraction is often complicated because of the confounding factors introduced by illumination and the complex nature of real-world settings, where the number of materials in the scene is not known. This automatic end member identification task can be viewed as a blind-source labelling problem. Note that this can be tackled in a number of ways. In [45], a

probabilistic treatment of the problem is presented where soft clustering is effected on the pixel reflectance spectra. This is done using deterministic annealing in a manner akin to clustering so as to recover groups of spectra with similar signatures. As the inference process converges, these groups can be further refined. In Fig. 6, we show the probability maps of skin and cloth materials recovered by the method in [45]. In the fourth and fifth columns, we show the probability maps recovered by the spectral angle mapper [46] commonly used in remote sensing. The two sample images shown have been captured under different illumination conditions, one in the visible and the other in the infrared spectrum. In the panels, the brightness of the probability maps is proportional to the association probability with the reference material.
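The spectral angle mapper [46] used for comparison here admits a very compact implementation. The sketch below (with illustrative variable names) scores a pixel spectrum against a reference material spectrum; because the angle depends only on spectral shape, it is insensitive to multiplicative brightness changes such as shading.

```python
import numpy as np

def spectral_angle(pixel, reference):
    """Spectral angle (in radians) between a pixel spectrum and a
    reference material spectrum; smaller angles mean closer matches."""
    cos = np.dot(pixel, reference) / (
        np.linalg.norm(pixel) * np.linalg.norm(reference))
    # Clip to guard against rounding pushing the cosine outside [-1, 1].
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Usage: a brighter (scaled) copy of the reference maps to angle ~0,
# whereas a spectrum of different shape yields a larger angle.
skin_ref = np.array([0.2, 0.4, 0.6])
angle_same = spectral_angle(2.0 * skin_ref, skin_ref)
angle_diff = spectral_angle(np.array([0.6, 0.4, 0.2]), skin_ref)
```

Thresholding the angle per pixel against each reference spectrum yields material maps of the kind shown in the fourth and fifth columns of Fig. 6.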

Fig. 6 Material maps estimated from a visible range image (top row) and a near infrared range image (bottom row)

First column: the input image in pseudo colour. Second and third columns: the probability maps of skin and cloth produced by the unmixing method in [45]. Fourth and fifth columns: the probability maps of skin and cloth produced by the spectral angle mapper [46]


Note that, from the figure, we can appreciate not only that the scene analysis setting is quite dissimilar to the remote sensing one, but also that retrieving the end members, even for very simple images, opens up the possibility of effecting segmentation and recognition simultaneously. Thus, the method is quite general in the sense that it is applicable to any number of wavelength-indexed bands, assuming no prior knowledge of the materials in the scene. Such approaches enable the use of spatial relations between image pixels and their spectra so as to cast the unmixing task as a mixture of primal components, making use of the input radiance images normalised with respect to the illuminant power spectrum.
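Once end member spectra are in hand, the per-pixel unmixing step itself is a constrained inversion. A minimal sketch, assuming known end members and using non-negative least squares (the function and variable names are illustrative, not drawn from [45]):

```python
import numpy as np
from scipy.optimize import nnls

def unmix(pixel, end_members):
    """Estimate relative abundances of known end members for one pixel
    by non-negative least squares, normalised to sum to one.

    pixel:       (B,) radiance or reflectance spectrum
    end_members: (B, K) matrix whose columns are end member spectra
    """
    abundances, _residual = nnls(end_members, pixel)
    total = abundances.sum()
    return abundances / total if total > 0 else abundances

# Usage: a pixel synthesised as a 70/30 mixture of two end members
# should yield abundances close to [0.7, 0.3].
E = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
pixel = 0.7 * E[:, 0] + 0.3 * E[:, 1]
a = unmix(pixel, E)
```

In the scene analysis setting discussed above, the end member matrix would itself have to be estimated, for instance by the soft clustering of [45], rather than taken from a spectral library.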

3.5 Spatio-spectral feature combination and selection

In practice, object classification and image categorisation techniques [47–49] are based upon the comparison of image features between those in a query image and those corresponding to the images in a dataset. This is usually based on a codebook that serves as a means to recover the closest match through classification. Furthermore, descriptors are often combined so as to assign different importance to each of them in order to maximise performance. Earlier, we elaborated upon the use of features in imaging spectroscopy. These high-level features can be further selected and combined making use of the reflectance information and the results yielded by the unmixing operation for recognition via supervised learning over a training set of known object categories or classes. In either case of recovering the optimal descriptors directly

[50] or their optimal combination [31, 51], the aim in categorisation is to maximise performance so as to minimise the variance across the dataset with respect to the classifier output. This is because the classifier output is dependent on the image representation and the similarity measure employed to categorise the images [52]. This hints at the fact that the image features arise not only from the local descriptors but also from spectral signatures which capture the composition of objects in terms of materials and their quantities in the scene. To this end, feature selection can be adapted to a spatial-spectral model. This would allow us to apply mixtures of spatio-spectral features to the imagery for purposes of recognition. Moreover, methods to handle the classification of objects composed of multiple end members can be improved by casting the feature combination and selection problem into a pairwise classification framework. Note that, here, selection can also be applied to the spectra.



This would also allow the further selection of bands so as to achieve goal-directed, compact representations of the imagery based upon training data.
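One simple instance of such goal-directed band selection, sketched here under the assumption of labelled training spectra for a two-class problem, ranks bands by a Fisher-style discriminability score and keeps the most informative ones; the function name and scoring choice are illustrative assumptions.

```python
import numpy as np

def rank_bands(spectra, labels, eps=1e-12):
    """Rank spectral bands by a two-class Fisher score,
    (m1 - m2)^2 / (v1 + v2) per band, larger = more discriminative.

    spectra: (N, B) training spectra, labels: (N,) in {0, 1}
    Returns band indices ordered from most to least discriminative.
    """
    a, b = spectra[labels == 0], spectra[labels == 1]
    score = (a.mean(0) - b.mean(0))**2 / (a.var(0) + b.var(0) + eps)
    return np.argsort(score)[::-1]

# Usage: band 1 separates the two classes, band 0 is pure noise,
# so band 1 should be ranked first.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.repeat([0, 1], 20)
X[y == 1, 1] += 5.0
order = rank_bands(X, y)
```

Keeping only the top-ranked bands yields exactly the kind of compact, training-data-driven representation described above.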

3.6 Polarimetric scene analysis

When light is reflected, it is polarised perpendicularly with respect to the plane of incidence and parallel to the reflective surface. This is an important observation since it implies that polarisation can be a means for shape analysis. This can be observed in Fig. 7, where we show the surface profiles, presented as shading maps, recovered making use of the phase corresponding to the polarised light as reported in [6]. In these shading maps, the brightest points on the surface correspond to those whose normal is in the viewer’s direction. As a result, recently there has been an increased interest in

the use of polarisation as a means to shape analysis. Rahmann and Canterakis [53] proposed a method based on polarisation imaging for the shape recovery of specular surfaces. This approach to depth recovery relies on the correspondences between phase images from multiple views. An important result drawn from their research is that three polarisation views are sufficient for surface reconstruction. Atkinson and Hancock [54, 55] also made use of the correspondences between the phase and degree of polarisation in two views for shape recovery. As an alternative to the use of multiple views, polarisation information has been extracted from photometric stereo images for shape recovery. Drbohlav and Sára [56] have shown how to disambiguate surface orientations from uncalibrated photometric stereo using images corresponding to different polarisation angles of the incident and emitted light. Their method is based on two constraints representing the projections of the object surface normals onto planes perpendicular to the viewing and illumination directions. The challenge here is to develop a method which is not

limited by the requirement of a known light source position relative to the observation point. Moreover, the methods

Fig. 7 Top row: polarised images of sample real-world objects; bottom row: surface profiles recovered using polarisation information


above rely on the assumption of a known index of refraction. Thus, polarimetric analysis can tackle the task of simultaneous estimation of surface orientation and photometric invariants, including the index of refraction. This is an important theoretical development since it relates polarisation to the shape, material index of refraction and other photometric variables. Further, polarisation permits the recovery of phase maps, that is, the distribution of polarisation phase over the image lattice. Since the phase is related to the plane of incidence, this can be viewed as a map of the scene which provides information on topographical artefacts and variations in object composition. Topographical and compositional information using

polarimetric analysis may be recovered in a number of ways. An option here is to estimate the shape of objects in the scene and the index of refraction in tandem through a coordinate search procedure in the space spanned by the reflectance models under consideration. This simultaneous estimation of shape and index of refraction allows the wavelength-dependency of the refractive index to be used as a feature for material recognition. Thus, the use of polarisation not only provides a means to the recovery of the shape of the objects in the scene, but also their refractive properties and hence their material composition.
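The phase and degree of polarisation that such methods operate on are recoverable from a small number of intensity measurements. The sketch below uses the standard relation between three polariser angles (0°, 45° and 90°) and the first three Stokes parameters; the function name is an illustrative assumption and is not taken from the works cited above.

```python
import numpy as np

def polarisation_from_three(i0, i45, i90):
    """Degree and phase of linear polarisation from intensities measured
    behind a linear polariser at 0, 45 and 90 degrees, via the first
    three Stokes parameters:
        s0 = i0 + i90,  s1 = i0 - i90,  s2 = 2*i45 - i0 - i90
    """
    s0 = i0 + i90
    s1 = i0 - i90
    s2 = 2.0 * i45 - i0 - i90
    degree = np.sqrt(s1**2 + s2**2) / s0      # degree of polarisation
    phase = 0.5 * np.arctan2(s2, s1)          # polarisation angle (rad)
    return degree, phase

# Usage: fully polarised light aligned with the 0-degree axis passes
# fully at 0, is halved at 45 and blocked at 90 (Malus's law), giving
# a degree of polarisation of 1 and a phase of 0.
degree, phase = polarisation_from_three(1.0, 0.5, 0.0)
```

Applied per pixel (and, for imaging spectroscopy, per band), this yields the phase maps over the image lattice described above.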

3.7 Integration of spatial and spectral information for structural scene representation

Note that the ultimate goal proposed here is to achieve scene understanding. This requires the combination of individual descriptors or features into a structured, goal-directed and contextual representation of the scene. This is because scene understanding can be considered a top-down process, where features are integrated into a coherent representation making use of their contextual relationships [57]. This has been the focus of attention of visual saliency in computer vision, where, when guided by human preferences, those parts in the image that are less related to the targets of visual attention can be assigned smaller contributions on the



saliency map or even completely ignored [58, 59]. Unfortunately, saliency computation methods often overlook the intrinsic relationships between the individual features. This is due to the fact that the features are often treated as independent primitives, even though they may actually be interrelated or highly correlated. Moreover, the structural information is often disregarded since the features are usually combined devoid of relational information in the scene. This lack of relational or structural information contrasts

with methods in shape analysis [60], motion analysis [61] and stereo reconstruction [62], where the task is that of matching features or points in the scene. If structural representations are to be adopted, the challenges are twofold. The first is to construct, in a principled manner, graphical structures which can capture the relationship between the image descriptors, their spectral signatures and polarisation information. These relational structures may be constructed in a manner akin to visual attention to represent the relationships between image descriptors at multiple scales and spectral bands at different wavelengths. Another possibility is to use statistics and 3D-view geometry to build an augmented representation of the scene. The second is to develop a suitable set of metrics so as to operate in a structural sense on the imaging spectroscopy data. These metrics can then be used for purposes of scene classification, categorisation and comparison.

4 Conclusions

The research described above is applicable to a wide variety of settings where sensing technologies are required for purposes of inferring materials and general characteristics of the scene. These challenges and opportunities pertaining to spectroscopic scene analysis have the potential to be key enablers in the geosciences, computational photography and other areas such as plant health, surveillance and the creative arts. These applications arise when sensing technologies are required to ‘perceive’ the world for purposes of image aesthetics, recognition, representation, classification and advanced pre-processing and post-processing. Thus, addressing the challenges above will provide a better

understanding of materials and objects in the scene and will deliver advanced tracking techniques robust to illumination changes and confounding factors. In the geosciences, it will be aimed at providing a means to quantify carbon in vegetation. Resource management, exploration and geophysics are natural areas of application for the unmixing technologies in Section 3.4. In plant health and biosecurity, the utility of the imaging spectroscopy methods pertains to the early detection of plant pests by allowing diagnosis before symptoms are visible to the naked eye. Again, these technologies are not exclusive to the early detection of pathogens but will also allow the selection and delivery of better-suited crops. As mentioned earlier, these methods are also applicable to precision agriculture for achieving higher yields and improving agribusiness efficiency. For digital photography, these methods can yield novel re-illumination, highlight removal, colour transfer and reflectance modelling and recovery techniques. Note that, throughout, we have made very little, if any,

differentiation between the use of spectra in either the visible or the near-infrared range. This is because applications of imaging spectroscopy for scene understanding using bench-top hyperspectral and multispectral cameras are not


necessarily constrained to the range visible by the human eye. There is, hence, scope for the application of near-infrared spectra, such as that presented in [63]. This combination of a broad domain of application with the use of key technologies makes the challenges above a promising opportunity to advance the areas of spectral image understanding, computer vision and pattern recognition. From the research point of view, the development of

context-specific spectral image descriptors presented here is novel and opens up the possibility of not only advancing the knowledge base of computer vision, but also of making use of the information-rich representation of the scene that spectral imaging provides so as to achieve indexing, template creation and data mining tasks. This, by itself, is a new approach to scene analysis, making use of the spectral signatures and their context so as to effect scene classification. The use of polarisation and reflection to recover object profiles akin to those in phase imaging can give rise to novel methods capable of recovering an optimal representation of the scene which captures shape, material, object profiles and photometric parameters such as the index of refraction. Furthermore, the inherently higher dimensionality of

spectroscopy data implies that these algorithms may not be exclusive to imaging spectroscopy, but could also be applied to the processing of other high-dimensional data. Thus, these methods may be extensible to many other sensing technologies beyond just spectral imagery.

5 Acknowledgments

NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.

6 References

1 Raskar, R., Tumblin, J., Mohan, A., Agrawal, A., Li, Y.: ‘Computational photography’. Proc. Eurographics: State of the Art Report STAR, 2006

2 Huynh, C.P., Robles-Kelly, A.: ‘Simultaneous photometric invariance and shape recovery’. IEEE Int. Conf. Computer Vision, 2009

3 Fu, Z., Robles-Kelly, A.: ‘Discriminant absorption feature learning for material classification’, IEEE Trans. Geosci. Remote Sens., 2011, 49, (5), pp. 1536–1556

4 Huynh, C.P., Robles-Kelly, A.: ‘A solution of the dichromatic model for multispectral photometric invariance’, Int. J. Comput. Vis., 2010, 90, (1), pp. 1–27

5 Kim, S.J., Zhuo, S., Deng, F., Fu, C.W., Brown, M.: ‘Interactive visualization of hyperspectral images of historical documents’, IEEE Trans. Vis. Graphics, 2010, 16, (6), pp. 1441–1448

6 Huynh, C.P., Robles-Kelly, A., Hancock, E.R.: ‘Shape and refractive index recovery from single-view polarisation images’. IEEE Conf. Computer Vision and Pattern Recognition, 2010

7 Fu, Z., Robles-Kelly, A.: ‘MILIS: multiple instance learning with instance selection’, IEEE Trans. Pattern Anal. Mach. Intell., 2011, 33, (5), pp. 958–977

8 Huynh, C.P., Robles-Kelly, A.: ‘Hyperspectral imaging for skin recognition and biometrics’. Int. Conf. Image Processing, 2010

9 Healey, G., Slater, D.: ‘Invariant recognition in hyperspectral images’. IEEE Conf. Computer Vision and Pattern Recognition, 1999, pp. 1438–1443

10 Chang, J.Y., Lee, K.M., Lee, S.U.: ‘Shape from shading using graph cuts’. Proc. Int. Conf. Image Processing, 2003

11 Landgrebe, D.: ‘Hyperspectral image data analysis’, IEEE Signal Process. Mag., 2002, 19, pp. 17–28

12 Jolliffe, I.T.: ‘Principal component analysis’ (Springer, 2002)

13 Fukunaga, K.: ‘Introduction to statistical pattern recognition’ (Academic Press, 1990, 2nd edn.)


14 Jimenez, L., Landgrebe, D.: ‘Hyperspectral data analysis and feature reduction via projection pursuit’, IEEE Trans. Geosci. Remote Sens., 1999, 37, (6), pp. 2653–2667

15 Schölkopf, B., Smola, A.J., Müller, K.-R.: ‘Kernel principal component analysis’. Advances in kernel methods: support vector learning, 1999, pp. 327–352

16 Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Müller, K.: ‘Fisher discriminant analysis with kernels’. IEEE Neural Networks for Signal Processing Workshop, 1999, pp. 41–48

17 Dundar, M., Landgrebe, D.: ‘Toward an optimal supervised classifier for the analysis of hyperspectral data’, IEEE Trans. Geosci. Remote Sens., 2004, 42, (1), pp. 271–277

18 Schölkopf, B., Smola, A.J.: ‘Learning with kernels: support vector machines, regularization, optimization, and beyond’ (MIT Press, 2001)

19 Sunshine, J., Pieters, C.M., Pratt, S.F.: ‘Deconvolution of mineral absorption bands: an improved approach’, J. Geophys. Res., 1990, 95, (B5), pp. 6955–6966

20 Hapke, B.: ‘Theory of reflectance and emittance spectroscopy’ (Cambridge University Press, 1993)

21 Clark, R., Swayze, G., Livo, K., et al.: ‘Imaging spectroscopy: Earth and planetary remote sensing with the USGS Tetracorder and expert system’, J. Geophys. Res., 2003, 108, (5), pp. 1–44

22 Clark, R.N., Gallagher, A.J., Swayze, G.A.: ‘Material absorption band depth mapping of imaging spectrometer data using a complete band shape least-squares fit with library reference spectra’. Proc. Second Airborne Visible/Infrared Imaging Spectrometer Workshop, 1990, pp. 176–186

23 Wyszecki, G., Stiles, W.S.: ‘Color science: concepts and methods, quantitative data and formulae’ (Wiley, 2000)

24 Horn, B.K.P., Brooks, M.J.: ‘Shape from shading’ (MIT Press, 1989)

25 Ikeuchi, K., Horn, B.K.P.: ‘Numerical shape from shading and occluding boundaries’, Artif. Intell., 1981, 17, (1–3), pp. 141–184

26 Horn, B.K.P., Brooks, M.J.: ‘The variational approach to shape from shading’, CVGIP, 1986, 33, (2), pp. 174–208

27 Zheng, Q., Chellappa, R.: ‘Estimation of illuminant direction, albedo, and shape from shading’, IEEE Trans. Pattern Anal. Mach. Intell., 1991, 13, (7), pp. 680–702

28 Beckmann, P., Spizzochino, A.: ‘The scattering of electromagnetic waves from rough surfaces’ (Pergamon, New York, 1963)

29 Vernold, C.L., Harvey, J.E.: ‘A modified Beckmann–Kirchhoff scattering theory for nonparaxial angles’. Scattering and Surface Roughness, number 3426 in Proc. SPIE, 1998, pp. 51–56

30 Torrance, K., Sparrow, E.: ‘Theory for off-specular reflection from roughened surfaces’, J. Opt. Soc. Am., 1967, 57, (9), pp. 1105–1112

31 Varma, M., Ray, D.: ‘Learning the discriminative power-invariance trade-off’. Int. Conf. Computer Vision, 2007

32 Sengupta, K., Boyer, K.L.: ‘Using geometric hashing with information theoretic clustering for fast recognition from a large CAD modelbase’. IEEE Int. Symp. Computer Vision, 1995, pp. 151–156

33 Shokoufandeh, A., Dickinson, S.J., Siddiqi, K., Zucker, S.W.: ‘Indexing using a spectral encoding of topological structure’. Proc. Computer Vision and Pattern Recognition, 1998, pp. 491–497

34 Stokman, H.M.G., Gevers, T.: ‘Detection and classification of hyper-spectral edges’. British Machine Vision Conf., 1999

35 Angelopoulou, E.: ‘Objective colour from multispectral imaging’. European Conf. Computer Vision, 2000, pp. 359–374

36 Fu, Z., Caelli, T., Liu, N., Robles-Kelly, A.: ‘Boosted band ratio feature selection for hyperspectral image classification’, Proc. Int. Conf. Pattern Recognit., 2006, 1, pp. 1059–1062

37 Fu, Z., Tan, R., Caelli, T.: ‘Specular free spectral imaging using orthogonal subspace projection’, Proc. Int. Conf. Pattern Recognit., 2006, 1, pp. 812–815

38 Huynh, C.P., Robles-Kelly, A.: ‘A NURBS-based spectral reflectance descriptor with applications in computer vision and pattern recognition’. IEEE Conf. Computer Vision and Pattern Recognition, 2008

39 Khuwuthyakorn, P., Robles-Kelly, A., Zhou, J.: ‘An affine invariant hyperspectral texture descriptor based upon heavy-tailed distributions and Fourier analysis’. Joint IEEE Int. Workshop on Object Tracking and Classification in and Beyond the Visible Spectrum, 2009, pp. 112–119

40 Doretto, G., Chiuso, A., Wu, Y., Soatto, S.: ‘Dynamic textures’, Int. J. Comput. Vis., 2003, 51, (2), pp. 91–109

41 Bergman, M.: ‘Some unmixing problems and algorithms in spectroscopy and hyperspectral imaging’. Proc. 35th Applied Imagery and Pattern Recognition Workshop, 2006

42 Stainvas, I., Lowe, D.: ‘A generative model for separating illumination and reflectance from images’, J. Mach. Learn. Res., 2003, 4, pp. 1499–1519

43 Klinker, G., Shafer, S., Kanade, T.: ‘A physical approach to color image understanding’, Int. J. Comput. Vis., 1990, 4, (1), pp. 7–38

44 Lennon, M., Mercier, G., Mouchot, M.C., Hubert-Moy, L.: ‘Spectral unmixing of hyperspectral images with the independent component analysis and wavelet packets’. Proc. Int. Geoscience and Remote Sensing Symp., 2001

45 Huynh, C.P., Robles-Kelly, A.: ‘A probabilistic approach to spectral unmixing’. Proc. S+SSPR, 2010, pp. 344–353

46 Yuhas, R.H., Goetz, A.F.H., Boardman, J.W.: ‘Discrimination among semiarid landscape endmembers using the spectral angle mapper (SAM) algorithm’. Summaries of the Third Annual JPL Airborne Geoscience Workshop, 1992, pp. 147–149

47 Nister, D., Stewenius, H.: ‘Scalable recognition with a vocabulary tree’. Computer Vision and Pattern Recognition, 2006, II, pp. 2161–2168

48 Sivic, J., Russell, B., Efros, A., Zisserman, A., Freeman, W.: ‘Discovering objects and their location in images’. Int. Conf. Computer Vision, 2005, I, pp. 370–377

49 Chum, O., Philbin, J., Sivic, J., Isard, M., Zisserman, A.: ‘Total recall: automatic query expansion with a generative feature model for object retrieval’. Int. Conf. Computer Vision, 2007

50 Winder, S., Brown, M.: ‘Learning local image descriptors’. IEEE Conf. Computer Vision and Pattern Recognition, 2007

51 Nilsback, M.E., Zisserman, A.: ‘A visual vocabulary for flower classification’. Computer Vision and Pattern Recognition, 2006, II, pp. 1447–1454

52 Vasconcelos, N.: ‘On the efficient evaluation of probabilistic similarity functions for image retrieval’, IEEE Trans. Inf. Theory, 2004, 50, (7), pp. 1482–1496

53 Rahmann, S., Canterakis, N.: ‘Reconstruction of specular surfaces using polarization imaging’, IEEE Conf. Computer Vision and Pattern Recognition, 2001, 1, pp. 149–155

54 Atkinson, G., Hancock, E.R.: ‘Recovery of surface height using polarization from two views’. CAIP, 2005, pp. 162–170

55 Atkinson, G., Hancock, E.R.: ‘Multi-view surface reconstruction using polarization’. Int. Conf. Computer Vision, 2005, pp. 309–316

56 Drbohlav, O., Sára, R.: ‘Unambiguous determination of shape from photometric stereo with unknown light sources’. Int. Conf. Computer Vision, 2001, pp. 581–586

57 Rensink, R.A.: ‘The dynamic representation of scenes’, Vis. Cogn., 2000, 7, pp. 17–42

58 Navalpakkam, V., Itti, L.: ‘Search goal tunes visual features optimally’, Neuron, 2007, 53, (4), pp. 605–617

59 Berengolts, A., Lindenbaum, M.: ‘On the performance of connected components grouping’, Int. J. Comput. Vis., 2001, 41, (3), pp. 195–216

60 Cootes, T., Taylor, C., Cooper, D., Graham, J.: ‘Active shape models – their training and application’, Comput. Vis. Image Underst., 1995, 61, (1), pp. 38–59

61 Torr, P., Murray, D.W.: ‘The development and comparison of robust methods for estimating the fundamental matrix’, Int. J. Comput. Vis., 1997, 24, pp. 271–300

62 Shapiro, L., Brady, J.M.: ‘Feature-based correspondence – an eigenvector approach’, Image Vis. Comput., 1992, 10, pp. 283–288

63 Li, S.Z., Chu, R., Liao, S., Zhang, L.: ‘Illumination invariant face recognition using near-infrared images’, IEEE Trans. Pattern Anal. Mach. Intell., 2007, 29, pp. 627–639
