
An Optimization-Based Approach to Fusion of Hyperspectral Images

Ketan Kotwal and Subhasis Chaudhuri, Fellow, IEEE

Abstract—In this paper we propose a new approach for visualization-oriented fusion of hyperspectral image bands. The proposed technique has been devised to generate the fused image with a certain set of desired properties for a better visualization. The fusion technique should provide a resultant image with a high local contrast without driving individual pixels into over- or under-saturation. We focus on these desired properties of the resultant image, and formulate a multi-objective cost function for the same. We have shown how we can incorporate the constraint of spatial smoothness of the weight vectors, as opposed to the smoothness of the fused image. The solution of this optimization problem has been provided using the Euler–Lagrange technique. By using an appropriate auxiliary variable, we show how the constrained optimization problem can be converted into a computationally efficient unconstrained one. The effectiveness of the proposed technique is substantiated from the visual and quantitative results provided.

Index Terms—Hyperspectral image fusion, image fusion, optimization, visualization.

I. INTRODUCTION

OFTEN a single image is not capable of providing complete information about a scene, and multiple images of the same scene are captured to provide complementary information. The fusion of images is defined as the process of combining information from two or more images of a scene into a single composite image that is more informative and is more suitable for visual perception or computer processing [1]. However, one can create different fused images from the same set of input data depending on the need of the application. The fusion of images is generally carried out for various applications like improved reliability, decreased uncertainty, improved classification, and better visualization [1], [2]. The result of fusion is either a grayscale image or an RGB image which can be displayed on a standard tristimulus display.

The integration of information, especially when the number of images is very large, typically more than a hundred, is a challenging task. To provide an effective description of the scene in a single grayscale (or RGB) image by appropriately combining the abundant data provided by a large number of bands requires an efficient technique of fusion.

Manuscript received September 15, 2011; revised December 09, 2011; accepted December 30, 2011. Date of publication March 13, 2012; date of current version May 23, 2012. This work was supported by the Bharti Centre for Communication.

The authors are with the Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSTARS.2012.2187274

Hyperspectral images are one of the important and frequently occurring examples of datasets with a very large number of bands. Hyperspectral images are acquired by an array of contiguous, narrowband hyperspectral sensors that measure the reflectance response of the surface. This array of sensors is tuned to cover a range of wavelengths in the visible and the infrared spectra. A typical hyperspectral dataset provides high-resolution spectral information in the form of 200–250 bands covering the wavelength range from 0.4 μm to 2.5 μm. Research in the processing of hyperspectral images has lately been receiving a lot of attention due to its ability to provide robust, accurate, and detailed multidimensional information about the scene. Due to these distinct advantages, applications of hyperspectral images have been explored in various areas such as remote sensing, medical imaging, anomaly detection, surveillance, and geological surveying [3]–[6]. However, the visualization of this rich source of information on a standard display device is a problem as the number of bands in hyperspectral data is very large. A fast and efficient visualization of the contents of a hyperspectral scene can be an important primary step prior to the use of complex processing algorithms. This visualization, which provides a quick overview of the data, is accomplished by fusion of the disparate information in different bands and by elimination of the redundancy from the input hyperspectral data. Many techniques for a visualization-oriented fusion of hyperspectral data have been proposed in the literature. However, most of these techniques are ad hoc in nature in terms of the fusion strategy, and the proof of concept always lies in a subjective analysis of the output. We ask the question: could the fusion strategy be made an objective one, with the output based on minimization of a certain objective function?

Pixel-based fusion techniques generate the resultant fused image as a linear combination of the input image bands. The weights are typically related to a certain measure of the relative importance of a pixel/band in the input image data, and are then used to combine pixels from all image bands. However, these fusion techniques do not consider any of the properties of the resultant fused image from the point of view of a better visualization. In this paper, we propose a new approach for fusion of hyperspectral data that explicitly seeks some of the desirable characteristics of the fused image during the process of fusion. The proposed technique seeks to find the best set of fusion weights that generates the most desired output image from the visualization perspective. Accordingly, we derive an appropriate cost function, minimization of which results in the fused image.

The organization of the paper is as follows. In Section II, we provide a review of existing techniques for hyperspectral image visualization. The proposed optimization-based fusion technique is described in detail in Section III. We perform experiments on various datasets and substantiate the usefulness of the proposed method in Section IV, while conclusions are drawn in Section V.

II. RELATED WORK

A large number of techniques for fusion of a few images, say 2–6 in the case of multispectral, multi-focus, or multi-exposure data, can be found in the literature. However, their extension to the fusion of datasets involving an arbitrarily large number of images, such as hyperspectral images, is a non-trivial task. We restrict the scope of the literature survey to the fusion of hyperspectral images only.

Fusion of hyperspectral images deals with combining the various bands in the image to generate a single (gray level or color) image of the same spatial resolution. The simplest technique used for the visualization of hyperspectral data is to choose three specific input bands and assign them to the red, green, and blue channels, respectively. In the browser for the AVIRIS hyperspectral sensor, the bands with wavelengths closest to the natural wavelengths of the red, green, and blue colors are chosen for the purpose of rendering.¹ The choice of bands can also be adaptive, depending upon the features to be highlighted. A low complexity technique for the selection of 3 bands using the one-bit transform (1BT) has been proposed in [7]. Both these techniques lose a large amount of information available in the rest of the bands of the hyperspectral image. These methods are basically band selection techniques (see also [8], [9]), and no quality guarantee in terms of the resultant fused image can be ascertained.

A variety of subspace-based methods have been applied for the fusion of hyperspectral images. Here the goal is to obtain a representation of the data in a low dimensional space. The classical technique of dimensionality reduction is principal component analysis (PCA) [10], which projects the data onto the basis vectors in the direction of maximum variance of the input data. Tyo et al. proposed a PCA-based fusion method for hyperspectral images which maps the first three principal component vectors of the input data to the RGB or HSI space [11]. For efficient PCA-based fusion, three schemes for partitioning the input hyperspectral data have been suggested in [12]. A fusion method based on independent component analysis has been proposed in [13]. The main disadvantage of the subspace-based fusion techniques is their dependence on the image statistics for the principal components. A small change in the input data can lead to a different configuration of the basis vectors and thus to a completely different resultant image, i.e., these fusion techniques are not consistent [14]. In [15], each band is assigned a specific weight, and the final fused image is generated as the weighted sum of the pixels across all the bands. The weight kernels have been designed on the basis of specific applications and the purpose of visualization. Jacobson et al. have proposed the use of specific weights derived from the color matching functions (CMF) [16].

¹The details about AVIRIS can be found at http://aviris.jpl.nasa.gov.

The color matching functions define the weights by specifying the amount of primary colors to be mixed which will create the sensation of the same color as that of viewing the original spectrum of the hyperspectral data. These techniques assign the same weight to all the pixels within a given band, and hence do not fully exploit the local variation of data within the band.

A matte-based technique for fusion of hyperspectral images proposed in [17] uses the edge-preserving characteristic of the bilateral filter to calculate the weights at individual pixels of the data. The calculation of weights is based on the weak textures in the data in order to preserve them during the process of fusion. Multiresolution analysis (MRA)-based techniques have also proved to be very useful in the field of image fusion. A fusion technique that decomposes the set of available bands into four hierarchical levels using downsampling and a Gaussian kernel has been proposed in [18]. Each level of each band in this pyramidal structure, except the topmost level, is then subjected to a set of gradient filters to extract directional details. These pyramidal decompositions are fused at every level using a predefined rule to generate a pyramidal representation of the final fused image. The wavelet-based techniques also provide directional information at each decomposition level, and these prove to be superior to traditional techniques of image fusion [19]. However, hyperspectral detectors may introduce line-to-line striping and scanline noise due to unequal detector sensitivities and electronic factors [20]. With this kind of noise, the wavelet-based techniques may introduce spatial distortions in the fused image such as blurring of contours and textures, or ringing or aliasing effects [21].

A fusion technique involving data transformation and classification for the RGB visualization of hyperspectral data has been proposed in [22]. Cui et al. proposed a convex optimization-based approach for visualization of hyperspectral images in [23]. This technique formulates a cost function for the preservation of the pairwise spectral distances between the image pixels within the input data and the pairwise distances between the pixels of the output image in the HSV space. However, the actual optimization is performed over very few points in the image data, referred to as the representative points, and the values for the remaining points are computed using numerical interpolation. This technique uses PCA for dimensionality reduction followed by constrained optimization, both of which are computationally demanding.

An optimization-based technique has been used for fusion of multi-exposure optical images by Raman et al. [24], wherein a set of such images has been fused together for the purpose of enhancing the dynamic range in the output image. However, due to the incorporation of smoothness of the output image in the cost function, it results in a locally smooth solution. Unlike in [24], we do not demand that the output image be smooth to circumvent the ill-posedness of the optimization problem. Instead, we enforce the smoothness of the weight function locally in each band during the fusion process, which results in a very high contrast output.

The quality assessment of the fused image is still an open problem. When a reference image or a ground truth is available, we may evaluate the quality of the fused image using some of the existing measures, e.g., [25]–[27]. However, the assessment in the absence of ground truth is quite a challenging problem.


problem. We evaluate the result of the proposed fusion tech-nique using some of the measures proposed in [13], [14], [28].

III. PROPOSED METHOD

One of the primary purposes of image fusion is to provide a quick visualization of the scene contents using a reduced dataset. This is achieved by selectively merging the useful features from the set of source images. A hyperspectral image contains a large number of bands acquired by sampling the wavelength spectrum at narrow intervals in a contiguous manner. Most of the existing techniques of fusion aim at extracting the specific useful information contained at a particular location in the available spatial and spectral range, and assigning it a suitable weight during the fusion process. The final image is usually formed as a linear combination of the pixels from all of the input image bands. However, these fusion techniques do not consider any properties of the resultant fused image. As the fusion techniques are intended for the visualization of hyperspectral images, it would be highly desirable to use a fusion strategy which generates the resultant image with certain characteristics. We propose a novel fusion technique based on some of the desired characteristics of the fused image. Based on these characteristics, we develop a multi-objective cost function and transform the fusion problem into an optimization framework, the solution of which provides an optimal set of weights for the purpose of fusion of the image bands. In the following subsections, we first discuss the desired objectives based on the quality of the fused image, and formulate the corresponding cost function. Subsequently we provide a solution to the problem using the Euler–Lagrange method.

A. Formulation of Objective Function

The basic approach for pixel-based fusion of hyperspectral images is the linear combination of pixels across the available sampled wavelength spectrum [15]–[18]. Different fusion schemes generate different sets of weights, which are also popularly known as α-mattes in the computer graphics literature.

Let $I$ be a hyperspectral image containing $K$ bands, each with the dimensions $X \times Y$ pixels. One may consider $I$ as a 3-D array with the dimensions $X \times Y \times K$ where the individual bands of the hyperspectral image are stacked. Let $F$ represent the fused image of dimensions $X \times Y$ generated using a pixel-based fusion technique, i.e., using a linear combination of the sampled wavelength response at every pixel location, given as

$$F(x,y) = \sum_{k=1}^{K} \alpha_k(x,y)\, I_k(x,y) \qquad (1)$$

where $I_k(x,y)$ represents the pixel at location $(x,y)$ in the $k$th band of the hyperspectral image, and $\alpha_k(x,y)$ is the value of the α-matte for the corresponding pixel, which acts as the weight for fusion. The weights $\alpha_k(x,y)$ should satisfy the following properties:

• At any given pixel, the sum of all the weights should be equal to 1, i.e., $\sum_{k=1}^{K} \alpha_k(x,y) = 1 \;\; \forall\, (x,y)$.

• The weights should be non-negative, i.e., $\alpha_k(x,y) \ge 0 \;\; \forall\, (x,y),\, k$.

The non-negativity of the weights provides a sufficient condition for the fused image $F$ to be also non-negative. Throughout the fusion procedure, we assume that the input hyperspectral data is normalized such that $0 \le I_k(x,y) \le 1$.
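For concreteness, the fusion rule (1) and its two constraints can be expressed in a few lines of NumPy. This sketch is ours, not the paper's: the names `cube` and `alpha` and the (X, Y, K) array layout are illustrative assumptions.

```python
import numpy as np

def fuse(cube: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Pixel-based fusion, eq. (1): F(x, y) = sum_k alpha_k(x, y) * I_k(x, y).

    cube  : hyperspectral data, shape (X, Y, K), normalized to [0, 1]
    alpha : per-pixel weights,  shape (X, Y, K), non-negative,
            summing to 1 along the band axis
    """
    assert np.all(alpha >= 0), "weights must be non-negative"
    assert np.allclose(alpha.sum(axis=2), 1.0), "weights must sum to 1 per pixel"
    return (alpha * cube).sum(axis=2)

# Example: uniform weights reproduce the plain band average.
X, Y, K = 64, 64, 170
cube = np.random.rand(X, Y, K)
alpha = np.full((X, Y, K), 1.0 / K)
F = fuse(cube, alpha)            # F lies in [0, 1] by construction
```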

In the present work, we deal with the visualization-oriented fusion of hyperspectral images, where a human analyst will observe the contents of the scene for its interpretation. One may wish to develop a fusion scheme based on a criterion derived from the human visual system (HVS). However, the choice of such a factor is a study topic in itself, and it has not been well investigated for fusion applications. Thus, in this paper we focus on some of the quantifiable characteristics of the resultant images that are robust and well investigated. For a clear and appealing visualization, one may naturally expect the fused image to have high values of contrast and sharpness. While a visually appealing fused image is the primary requirement, it should also possess a high amount of detail from the input data. This improvement in the details of the output image is also helpful for some of the post-fusion operations such as object detection and segmentation.

The resultant image is to be obtained from the hyperspectral bands, which encompass a large bandwidth. Thus, the reflectance response of the scene, which is a function of wavelength, also varies largely over the set of captured bands. The fusion algorithm has to map pixels from this large dynamic intensity range into the smaller range of the display system. This mapping can be accomplished using two different strategies:

• A1: Map the pixels from the input toward the central region (0.50) of the dynamic range of the gray values [0, 1]. This mapping helps in bringing the under- and over-exposed regions in the scene into the mid-range of the display. The minimization of the distance of gray values from the middle of the dynamic range has, in fact, been used as one of the objectives for an enhanced visualization of the image [29]. As the dynamic range of the fused image directly conforms to that of the display device, usually no further post-processing, such as non-linear stretching for display, is required. This is preferred in computer graphics applications.

• A2: Map the pixels from the input toward the mean gray value of the input data. This mapping helps maintain the average intensity level of the output close to that of the input hyperspectral data. The fused image will have a low value of mean bias, which has been defined in [14] as the deviation of the mean of the fused image from that of the constituent images. Thus, the fused image has a lesser degree of radiometric distortion, and is often preferred in remote sensing applications.

Following strategy A1, our primary objective is to calculate the weights in such a way that the gray values of the pixels in the fused image will be close to the central gray level. From an information theoretic point of view, this problem can be restated as the problem of entropy maximization. We define our objective function to compute a set of weights $\alpha_k(x,y)$ that will maximize the entropy of the fused image

$$J_1(\alpha) = -\int_x \int_y \frac{F(x,y)}{0.50\,e}\, \ln\!\left(\frac{F(x,y)}{0.50\,e}\right) dx\, dy \qquad (2)$$

with $F(x,y)$ being given in (1). The factor of $0.50\,e$ in the denominator has been introduced to force the trivial solution to go to 0.50 as explained in A1. It should be noted that in the present context, the entropy of an image has been defined over the normalized pixel values of the hyperspectral data. This should not be confused with the general and more commonly used definition of the image entropy as a measure of average information content in the picture. In order to meet condition (A2), instead of using $0.50\,e$ in the denominator in (2), we replace this by $m\,e$, where $m$ is the mean of the entire hyperspectral data $I$.

As the trivial solution for the maximization of (2) is a constant, i.e., $F(x,y) = 0.50 \;\forall\, (x,y)$, it indicates that the output image has a poor contrast. Thus, the maximum entropy (MENT) objective alone is not enough to obtain a sharply fused image. Therefore, one must add a complementary objective to the optimization function which will maintain the balance between the entropy maximization and the contrast of the fused image.

The variance of an image is one of the commonly used contrast measures. A high value of variance indicates well spread-out values of intensities in the fused image over the given range, yielding a visually sharper image. A small value of variance indicates a lack of detail in the fused image. We incorporate an additional term $J_2(\alpha)$ in the objective function of maximizing the variance to produce high contrast resultant fused images:

$$J_2(\alpha) = \int_x \int_y \left(F(x,y) - \bar{F}\right)^2 dx\, dy \qquad (3)$$

where $\bar{F}$ denotes the mean gray value of the fused image.

It can be seen that these two objectives ($J_1$ and $J_2$) are complementary in nature, as the former criterion indicates "pulling in" the far distinct pixels towards the mean gray level, and the latter criterion "pushes out" the pixels away from the mean.
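The two data terms can be evaluated numerically as below. This is a minimal sketch of the reconstructed forms of (2) and (3), with a discrete mean in place of the integral and a small `eps` guarding log(0); both of these choices are our assumptions.

```python
import numpy as np

def entropy_objective(F: np.ndarray, target: float = 0.50,
                      eps: float = 1e-12) -> float:
    """MENT term (2): mean of -g*log(g) with g = F / (target * e).

    The integrand -g*log(g) peaks at g = 1/e, so the pointwise maximizer
    is F = target (the trivial constant solution). Replacing `target` by
    the data mean follows strategy A2 instead of A1.
    """
    g = F / (target * np.e) + eps
    return float(np.mean(-g * np.log(g)))

def variance_objective(F: np.ndarray) -> float:
    """Variance term (3): spread of gray values around the image mean."""
    return float(np.mean((F - F.mean()) ** 2))
```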

The adjacent pixels in the hyperspectral data generally belong to the same or similar underlying object in the scene, and thus such pixels exhibit a high degree of correlation with their neighborhood pixels. However, in the proposed approach, so far the pixels are fused without taking into consideration any spatial correlation among them within the same image. A simple and commonly used solution is the inclusion of a certain kind of smoothness constraint in the fusion expression [24]. However, enforcing the smoothness constraint on the resultant fused image leads to excessive smoothing, thereby blurring edges and washing away the weak and minor features. Enforcing the smoothness constraint on the fused image not only deteriorates the quality of the result, but also contradicts one of our objectives from (3) of obtaining high contrast images. In order to acknowledge the strong spatial correlation among the input pixels, and yet avoid any smoothing of the fusion result especially at edges, we propose to incorporate an additional penalty term $J_3(\alpha)$ in the cost function enforcing the smoothness of the fusion weights (i.e., the α-matte). When the data in the input hyperspectral bands are smooth, one would expect the corresponding weights also to be smooth, i.e., $\alpha_k$ should be smooth along the $x$ and $y$ directions. However, the sharpness or the discontinuities in the input data get preserved in the output, which is a linear combination of the input image bands, as long as the input image bands have a good contrast. The smoothness of the α-matte can be written as

$$J_3(\alpha) = \int_x \int_y \sum_{k=1}^{K} \left( \alpha_{k,x}^2(x,y) + \alpha_{k,y}^2(x,y) \right) dx\, dy \qquad (4)$$

where the subscripts $x$ and $y$ denote differentiation in the respective directions.

From (2), (3), and (4), we formulate the overall objective function for a constrained minimization as

$$\hat{\alpha} = \arg\min_{\alpha} \left[ -J_1(\alpha) - \lambda_v\, J_2(\alpha) + \lambda_s\, J_3(\alpha) \right] \qquad (5)$$

subject to

$$\sum_{k=1}^{K} \alpha_k(x,y) = 1, \quad \alpha_k(x,y) \ge 0 \quad \forall\, (x,y) \qquad (6)$$

where $\lambda_v$ and $\lambda_s$ are the regularization parameters that define the weightage given to the variance term and the smoothness term, respectively, as compared to the MENT term. From (5), we observe that the fused image can be obtained by solving a problem in the calculus of variations.
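A discrete version of the smoothness term (4) using forward differences might look as follows; the paper does not specify its discretization, so the use of `np.diff` here is an illustrative choice.

```python
import numpy as np

def smoothness_penalty(alpha: np.ndarray) -> float:
    """Smoothness term (4): squared forward differences of the alpha-matte
    along x and y, summed over all bands. alpha has shape (X, Y, K)."""
    dx = np.diff(alpha, axis=0)   # alpha_x, shape (X-1, Y, K)
    dy = np.diff(alpha, axis=1)   # alpha_y, shape (X, Y-1, K)
    return float((dx ** 2).sum() + (dy ** 2).sum())
```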

B. Variational Solution

Among the two constraints on the weights, viz., normalization and positivity as mentioned in (6), the normalization constraint can easily be incorporated into the cost function with the help of a Lagrangian multiplier. However, the addition of the positivity constraint on the weights converts the problem into a computationally demanding constrained optimization problem. In order to provide a computationally simple solution while maintaining the positivity of the weights, we introduce a set of auxiliary variables $w_k(x,y)$, each defined as the positive square root of the corresponding matte value. Thus the original weights can be replaced as $\alpha_k(x,y) = w_k^2(x,y)$. The cost function and the constraints in (6) are then appropriately modified in terms of $w$. The expressions for the entropy and variance objectives are modified by replacing $\alpha_k$ by $w_k^2$. In the case of the smoothness term, smoothness in $w_k$, which is a positive square root of $\alpha_k$, implies smoothness in the actual weights $\alpha_k$. Therefore, we can impose the smoothness on $w$ as an explicit constraint to ensure the smoothness of the actual weights $\alpha$. It can be easily observed that the actual weights $\alpha_k = w_k^2$ are always non-negative, irrespective of the sign of $w_k$. The normalization constraint, however, should be explicitly added to ensure that for every pixel $(x,y)$, the sum of the weights should equal 1. After modifying the weights to the square term, the unity or normalization constraint can be written as

$$\sum_{k=1}^{K} w_k^2(x,y) = 1 \quad \forall\, (x,y). \qquad (7)$$

The constraint given in (6) required that the weights be chosen such that they lie on the hyperplane $\sum_{k} \alpha_k(x,y) = 1$ in the first quadrant, while the constraint in (7) requires the auxiliary variables to lie on a unit hyper-sphere. It will be seen later that such a unity norm constraint can be easily enforced externally while solving (5) as an unconstrained optimization problem.
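The reparameterization and the unit-sphere projection are straightforward to express in code; a minimal sketch, assuming the (X, Y, K) layout used earlier.

```python
import numpy as np

def to_auxiliary(alpha: np.ndarray) -> np.ndarray:
    """w = positive square root of alpha, so that alpha = w * w >= 0
    holds automatically for any real w."""
    return np.sqrt(alpha)

def renormalize(w: np.ndarray) -> np.ndarray:
    """Project the auxiliary vectors onto the unit hyper-sphere (7):
    sum_k w_k^2 = 1 at every pixel."""
    norm = np.sqrt((w ** 2).sum(axis=2, keepdims=True))
    return w / norm
```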


Now we provide the solution to the multi-objective optimization problem using the Euler–Lagrange method. For the sake of brevity and easy understanding, we adopt the following vector notations. Let $\mathbf{I}(x,y) = [I_1(x,y), \ldots, I_K(x,y)]^T$ denote a vector consisting of the $K$ bands at the location $(x,y)$ across the hyperspectral image cube; then $\boldsymbol{\alpha}(x,y)$ denotes the weight vector for the same location. This vector is actually an elementwise product of the vector $\mathbf{w}(x,y) = [w_1(x,y), \ldots, w_K(x,y)]^T$ with itself. This is also known as the Hadamard product. Mathematically, we can write this expression as

$$\boldsymbol{\alpha}(x,y) = \mathbf{w}(x,y) \odot \mathbf{w}(x,y) \qquad (8)$$

where $\odot$ is an elementwise product operator. Using the vector notation, the resultant fused image $F(x,y)$ can be represented in the form of a dot product of the input data vector $\mathbf{I}(x,y)$ at the corresponding pixel location with the weight vector $\boldsymbol{\alpha}(x,y)$ at the same spatial location, i.e.,

$$F(x,y) = \mathbf{I}^T(x,y)\, \boldsymbol{\alpha}(x,y) = \mathbf{I}^T(x,y) \left( \mathbf{w}(x,y) \odot \mathbf{w}(x,y) \right). \qquad (9)$$
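At a single pixel, (8) and (9) reduce to an elementwise square followed by a dot product, as the toy check below illustrates; the values are random and purely illustrative.

```python
import numpy as np

# One pixel: I is the K-vector of band values, w the auxiliary vector.
K = 5
I = np.random.rand(K)
w = np.random.rand(K)
w /= np.linalg.norm(w)           # enforce the unit-norm constraint (7)

alpha = w * w                    # Hadamard product, eq. (8)
F_xy = I @ alpha                 # dot product, eq. (9)
assert np.isclose(alpha.sum(), 1.0)
```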

The combination of (5) and (7) gives the following cost functional:

$$J(\mathbf{w}) = \int_x \int_y \left[ \frac{F}{0.50\,e} \ln\!\left( \frac{F}{0.50\,e} \right) - \lambda_v \left( F - \bar{F} \right)^2 + \lambda_s \left( \| \mathbf{w}_x \|^2 + \| \mathbf{w}_y \|^2 \right) + \lambda \left( \mathbf{w}^T \mathbf{w} - 1 \right) \right] dx\, dy. \qquad (10)$$

Here $\lambda_v$ and $\lambda_s$ are the scalars that weigh the relative importance of the variance and the smoothness terms, respectively, while $\lambda$ is the Lagrangian multiplier for the unity norm constraint. The arguments $x$ and $y$ of the functions are omitted for the purpose of brevity ($F \equiv F(x,y)$, etc.). The solution of (10) is obtained using the corresponding Euler–Lagrange equation

$$\frac{\partial L}{\partial \mathbf{w}} - \frac{d}{dx} \frac{\partial L}{\partial \mathbf{w}_x} - \frac{d}{dy} \frac{\partial L}{\partial \mathbf{w}_y} = \mathbf{0} \qquad (11)$$

where $L$ is the integrand in (10). On simplification, (11) becomes

$$\left[ \frac{1}{0.50\,e} \left( 1 + \ln \frac{F}{0.50\,e} \right) - 2 \lambda_v \left( F - \bar{F} \right) \right] \left( \mathbf{I} \odot \mathbf{w} \right) + \lambda\, \mathbf{w} - \lambda_s\, \nabla^2 \mathbf{w} = \mathbf{0}. \qquad (12)$$

Here $\odot$ represents the elementwise product operator, and $\nabla^2$ represents the Laplacian operator. For the RHS of the equation, $\mathbf{0}$ indicates a zero vector. The Laplacian operator for a 2-D function $f(x,y)$ is given by $\nabla^2 f = \partial^2 f / \partial x^2 + \partial^2 f / \partial y^2$. A discrete approximation of the Laplacian operator is given as [30]

$$\nabla^2 \mathbf{w}(x,y) \approx \frac{4}{\epsilon^2} \left( \bar{\mathbf{w}}(x,y) - \mathbf{w}(x,y) \right)$$

where $\bar{\mathbf{w}}(x,y)$ is the local average of the weight vectors in the $x$ and $y$ dimensions, and $\epsilon$ is the distance between adjacent pixels, trivially set to 1. After discretization, we obtain the iterative solution for the weight vector by rearranging the terms as

$$\mathbf{w}^{(t+1)} = \frac{4\lambda_s/\epsilon^2}{4\lambda_s/\epsilon^2 + \lambda} \left[ \bar{\mathbf{w}}^{(t)} - \frac{\epsilon^2}{4\lambda_s}\, G^{(t)} \left( \mathbf{I} \odot \mathbf{w}^{(t)} \right) \right] \qquad (13)$$

where $G^{(t)}(x,y)$ denotes the bracketed scalar factor in (12) evaluated at iteration $t$.

Here $t$ is the index of the iteration. From (13), it can be observed that the scalar $\lambda$ appears only as a part of a positive scaling factor. Since the purpose of $\lambda$ is just to enforce the unit length of the weight vector, we can avoid this scaling factor and explicitly normalize the weight vector $\mathbf{w}^{(t+1)}(x,y)$ at each iteration to satisfy the necessary constraint given in (7) [30]. We introduce an intermediate variable $\tilde{\mathbf{w}}$ to represent the un-normalized weights. The final solution is thus given as

$$\tilde{\mathbf{w}}^{(t+1)} = \bar{\mathbf{w}}^{(t)} - \frac{\epsilon^2}{4\lambda_s}\, G^{(t)} \left( \mathbf{I} \odot \mathbf{w}^{(t)} \right) \qquad (14)$$

$$\mathbf{w}^{(t+1)} = \frac{\tilde{\mathbf{w}}^{(t+1)}}{\left\| \tilde{\mathbf{w}}^{(t+1)} \right\|}. \qquad (15)$$

Hence we arrive at an unconstrained optimization problem to solve the fusion problem. The resultant fused image is given by (1) with $\alpha_k(x,y) = w_k^2(x,y)$. We expect the fused image to be centered around the mean radiometric value of the data cube, and to have a high contrast. The estimated α-matte is locally smooth, but not the fused image.
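Putting the pieces together, the iteration (13)–(15) can be sketched as below. Since the paper's exact update expression was lost in this transcript, the gradient term follows our reconstructed (12); the starting point, iteration count, and default parameter values are assumptions, not the authors' settings.

```python
import numpy as np

def local_average(w: np.ndarray) -> np.ndarray:
    """4-neighbor average of the weight field, shape (X, Y, K),
    with replicated borders."""
    p = np.pad(w, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def fuse_optimized(cube, lam_v=1.0, lam_s=10.0, n_iter=20,
                   target=0.50, eps=1e-12):
    """Iterative minimization sketch following (13)-(15).

    cube: (X, Y, K) data in [0, 1]. lam_v, lam_s: weights of the variance
    and smoothness terms. Returns the fused image F.
    """
    X, Y, K = cube.shape
    w = np.full((X, Y, K), 1.0 / np.sqrt(K))     # uniform, unit-norm start
    c = target * np.e
    for _ in range(n_iter):
        F = (cube * w * w).sum(axis=2)           # eq. (9)
        # scalar bracket G from the reconstructed eq. (12)
        g = (1.0 / c) * (1.0 + np.log(F / c + eps)) \
            - 2.0 * lam_v * (F - F.mean())
        # un-normalized update (14); pixel spacing epsilon set to 1
        w_tilde = local_average(w) \
            - (1.0 / (4.0 * lam_s)) * g[..., None] * cube * w
        # explicit renormalization (15) replaces the Lagrange multiplier
        w = w_tilde / np.sqrt((w_tilde ** 2).sum(axis=2, keepdims=True))
    return (cube * w * w).sum(axis=2)
```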

IV. EXPERIMENTAL RESULTS

To test the effectiveness of the proposed algorithm, we have tested it on several hyperspectral datasets. Here we provide the experimental results of the proposed technique over four such datasets acquired by two different hyperspectral imaging sensors. The first two datasets are acquired by the Hyperion imaging sensor aboard the EO-1 spacecraft for Earth observation. The Hyperion data consist of 242 bands spanning the total bandwidth from 0.4 μm to 2.5 μm with 30 m spatial resolution. We have used the terrain-corrected dataset, designated as Level G1, provided as 16-bit radiance data. The two datasets used for the experimentation depict the areas of the urban region of Palo Alto, CA, and a certain geological region, and are referred to as "urban" and "geological", respectively, in the paper. The dimension of the data cube is (256 × 512 × 242).


The other two datasets considered for testing of our algorithm are obtained by AVIRIS (Airborne Visible/Infrared Imaging Spectrometer), operated by the National Aeronautics and Space Administration/Jet Propulsion Laboratory. Each of these datasets contains 224 bands of size (614 × 512) pixels provided as 16-bit reflectance values. One of the datasets depicts the region of Moffett Field, CA, and the other dataset shows various features of the Lunar Lake. These datasets are referred to as "moffett" and "lunar", respectively.

The transmission of radiation from the earth's surface gets blocked at certain wavelengths due to molecular absorption by water and carbon dioxide [20]. After removing such nearly zero-response bands, and other highly noisy bands, the number of available bands for the fusion reduces to nearly 170–175.
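A simple screen for such unusable bands might look as follows; the paper does not state its exact removal criterion, so the mean-response threshold here is purely illustrative.

```python
import numpy as np

def usable_bands(cube: np.ndarray, thresh: float = 1e-3) -> np.ndarray:
    """Drop nearly zero-response bands (e.g., water/CO2 absorption regions).

    A simple mean-response screen over the normalized cube; the threshold
    is an illustrative assumption, not the authors' criterion.
    """
    keep = cube.mean(axis=(0, 1)) > thresh
    return cube[:, :, keep]
```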

It may be noted that the result of fusion may depend on the choice of the regularization factor. The selection of regularization parameters is known to be a difficult problem, typically solved using a cross-validation technique [31]. The final result of fusion is, however, not very sensitive to the exact values of the regularization parameters, but depends on their order of magnitude. We have used values in a range that was found to provide a good balance among the competing terms in the objective function. The value of $\lambda_s$ should be less than $\lambda_v$, as $\lambda_s$ serves as the relative weightage given to the smoothness term of the minimization functional. A higher value of this term may reduce the contrast in the fused image, which would lead to spectral averaging of the image bands. We have used the same set of values of $\lambda_v$ and $\lambda_s$ in all our experimentation. Several different techniques to stop the iterative minimization process may be employed. We have followed the commonly used relative-cost based criterion to conclude the iterative procedure. In this procedure, the total cost $J^{(t)}$ of the functional is computed after every iteration $t$. The change in the value of the functional from that of the previous iteration is calculated, i.e., $\Delta J^{(t)} = J^{(t)} - J^{(t-1)}$. We define the stopping rule in terms of the relative difference of the cost functional, i.e., the iterations are terminated when $|\Delta J^{(t)}| / |J^{(t)}|$ falls below a small threshold. It was seen that typically the fusion process took 8–10 iterations to converge.
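The stopping rule is easy to express directly; the tolerance value below is an assumption, since the paper does not report its threshold.

```python
def converged(J_prev: float, J_curr: float, tol: float = 1e-3) -> bool:
    """Relative-cost stopping rule: stop when |J(t) - J(t-1)| / |J(t)|
    drops below tol (the paper reports 8-10 iterations in practice)."""
    return abs(J_curr - J_prev) / max(abs(J_curr), 1e-12) < tol
```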

Some of the visualization techniques for hyperspectral data generate three independent images and assign them to the red, green, and blue channels to provide a fused RGB image to the user. The assignment of colors is not directly related to the actual wavelengths of these primary colors, and hence several pseudo-color schemes are used to present the result in an enhanced manner. To extend the proposed technique for the generation of an RGB image, we illustrate the results using a simple strategy: partition the input data sequentially along the wavelength into three groups having nearly equal numbers of bands. Contiguity is important, as all these bands are clubbed into a single color channel. Each of the groups is fused separately, and the three fused images are assigned to the R, G, and B channels to form a composite color output.
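This partitioning strategy is easy to sketch. In the snippet below, `fuse_fn` stands for any single-group fusion routine (e.g., the sketch in Section III), and the assignment of the three fused groups to the R, G, and B channels is a design choice the paper leaves to the pseudo-color scheme.

```python
import numpy as np

def fuse_rgb(cube: np.ndarray, fuse_fn) -> np.ndarray:
    """Split the K bands into three contiguous wavelength groups, fuse
    each group independently, and stack the results as an RGB image."""
    groups = np.array_split(np.arange(cube.shape[2]), 3)
    channels = [fuse_fn(cube[:, :, g]) for g in groups]
    # group-to-channel assignment is a presentation choice
    return np.dstack(channels)
```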

Fig. 1(a) shows the result of fusion of the urban dataset using the proposed technique, while Fig. 1(b) provides the RGB version of the fusion result using the aforementioned strategy. Each of the image bands in the Hyperion data represents the spectral response of the underlying scene over a nominal bandwidth of 10 nm.

Fig. 1. Result of fusion for (a) the urban data, and (c) the geological data, from the Hyperion using the proposed technique. (b) and (d) are the RGB representations of the corresponding fusion results. Gamma is set to 1.50 for all color images.

Thus the resultant image represents the collective spectral response of the scene over nearly 2500 nm, fused for a better visualization of the scene contents. It can be observed that the proposed technique provides a clear and sharp image. It can also be seen that the resulting images [Fig. 1(a)–(b)] have very few over- or under-saturated regions (except for the completely dark regions which were a part of the data). However, there is no loss of contrast while preventing the fusion result from saturation. Similar sharp and high contrast results can also be seen for the geological Hyperion data in Fig. 1(c), and the corresponding RGB version of the fusion result in Fig. 1(d). The geological image does not contain sharp details, as opposed to the sharp and very prominent details in the former, urban image. Also, most of the input data are concentrated in a narrow range of the gray values, which indicates low contrast image bands. The proposed technique, however, has been able to provide results with a reasonably high amount of contrast, suitable for a human observer. It may also be noted that all the images are the direct result of fusion without any further post-processing such as contrast stretching, histogram modification, or gamma correction (except for the RGB versions, where only a gamma correction has been applied). Thus, the proposed technique is capable of providing a good quality visualization of challenging low contrast input hyperspectral data.


Fig. 2. Result of an RGB image from fusion of the moffett data from the AVIRIS using (a) the piecewise linear function technique, (b) the color matching function technique, (c) the bilateral filtering-based technique, and (d) the proposed technique.

We have employed the same technique for the remaining test datasets. These datasets are provided by the AVIRIS hyperspectral sensor, where 224 bands together encompass the bandwidth from 350.5 nm to 2582.2 nm. The RGB version of the result of fusion using the proposed technique over the moffett data is shown in Fig. 2(d). This image depicts a variety of features such as water resources, urban areas, and vegetation. The urban regions in the top and right parts of the resultant fused image appear very clear and sharp. At the same time, the fused image does not suffer from the saturation problem which is usually observed in the case of data involving a large number of bright gray values. The visualization results of the moffett data using the proposed technique are compared in Fig. 2 with the results of some of the existing techniques: the piecewise linear function [16], the color matching function [16], and bilateral filtering-based fusion [17]. It can easily be observed from the fused images that the proposed technique is able to provide sharp features without introducing any visible artefacts. Fig. 2(a), which is the result of fusion using the piecewise linear function technique, appears to have a low contrast at various places, and the output appears to be saturated at several places. Therefore, features of smaller size, such as small urban establishments, are hard to visualize. The fused image using the color matching function technique [Fig. 2(b)] also suffers from a similar problem, although the result has a slightly higher contrast than the previous image. The result of bilateral filtering-based fusion shown in Fig. 2(c) provides a better visualization of the scene as it assigns a different weight to each of the pixels in the data, as opposed to the former two techniques that assign a single weight to the entire band. However, this technique does not improve the contrast in the result, which is much needed for observation by a human analyst. The proposed technique, on the other hand, is specifically designed to impart a high contrast to the resultant image without the pixels running out of the range. This set of multiple objectives yields a set of weights that provides a visually sharp image without losing the minor features in the data.

Fig. 3. Visual comparisons of different fusion techniques over a block of the Moffett Field hyperspectral image as shown in Fig. 2(d). Result of an RGB version of fusion using (a) the piecewise linear function technique, (b) the color matching function technique, (c) the bilateral filtering-based technique, and (d) the proposed technique.

For the purpose of a better subjective assessment of the proposed technique, we have provided zoomed-in images of a smaller block in Fig. 3. These regions refer to the areas marked by a white rectangle in Fig. 2(d). One can observe the clarity and the sharpness of the regions having a mesh-like structure in Fig. 3(d). These details are not properly visible in the results of the other techniques. Figs. 3(a)–(b) appear to be quite saturated and lack contrast. While the result in Fig. 3(c) provides enough contrast in the image, the overall appearance of the image is dark. The resultant image in Fig. 3(d) does not appear as bright as the other two images, because the average intensity of the image was chosen to be close to half of the gray range due to the factor of 0.50 in (2).

The second dataset from the AVIRIS used in our experiments captures the region of the Lunar Lake. Fig. 4 shows the RGB fused image of this dataset using the proposed optimization-based technique along with the results of the other techniques. The fused image provides a high contrast view of the scene details present across almost the entire image. Unlike the other results, the proposed technique [Fig. 4(d)] does not create any over- or under-exposed pixels in the fused image.

The performance advantage of the proposed fusion technique is quite evident from the visual inspection of the results, which is the primary objective of the fusion here. However, as such an assessment is very subjective, we need a set of quantitative measures that provides an objective assessment of the various techniques, and also facilitates an easier comparative analysis. We compare the results of the proposed technique with some of the recent visualization techniques. We have used the piecewise linear function technique [16], the color matching function technique [16], the multi-resolution analysis (MRA)-based technique [18], and the bilateral filtering-based technique [17] for the quantitative assessment of the urban, geological, and lunar data. In the present work, we have not considered the PCA-based techniques, which are computationally demanding and are not consistent [14].


TABLE I. QUANTITATIVE ASSESSMENT FOR VARIOUS TECHNIQUES OF VISUALIZATION-ORIENTED FUSION OF HYPERSPECTRAL DATA. L—LUNAR DATA, U—URBAN DATA, G—GEOLOGICAL DATA.

Fig. 4. Result of an RGB image from fusion of the Lunar Lake data from the AVIRIS. (a) The piecewise linear function technique, (b) the color matching function technique, (c) the bilateral filtering-based technique, and (d) the proposed technique.

We have also not considered comparison against some of the band-selection based techniques [7], as these techniques do not actually involve any fusion of the data.

The fused images are expected to have high values of variance and entropy, which represent the image contrast and the amount of image information, respectively. Another measure commonly used for image sharpness in terms of its gradient values is the average gradient [28], [32]. For an image $F$ of dimensions $X \times Y$, if $\Delta F_x$ and $\Delta F_y$ denote the differences in the $x$ and $y$ directions, respectively, then the average gradient $\bar{g}$ is given as

$$\bar{g} = \frac{1}{XY} \sum_{x} \sum_{y} \sqrt{\frac{\Delta F_x^2(x,y) + \Delta F_y^2(x,y)}{2}}.$$

The aforementioned performance measures are used for the quantitative evaluation of the final result of fusion. However, they do not provide any idea about the extent to which independent bands have contributed to the final result for a given fusion technique. The measure used to quantify the similarity between the final image and the constituent input image bands in terms of mutual information is known as the fusion factor (FF). The traditional definition of the FF for fusion of only two images can be analyzed from [28], [32], and [33]. In [14], a modified definition of the FF has been proposed that weighs the contribution of the individual bands towards the result of fusion based on the information content within the band. If $MI(I_k, F)$ denotes the amount of mutual information between one of the input bands $I_k$ and the fused image $F$ generated by fusing $K$ bands, then the fusion factor FF is given as

$$FF = \sum_{k=1}^{K} \frac{H(I_k)}{\sum_{j=1}^{K} H(I_j)}\; MI(I_k, F)$$

where $H(I_k)$ is the entropy of the band $I_k$. We compute the values of the fusion factor using the modified definition, as it facilitates assessing fusion in a better way. High values of this measure suggest a higher amount of mutual information between the resultant image and its constituent hyperspectral bands.
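The two non-standard measures can be computed as below. The average gradient follows the reconstructed formula above; the entropy-weighted form of the fusion factor is our reading of the modified definition in [14] and should be treated as an assumption.

```python
import numpy as np

def average_gradient(F: np.ndarray) -> float:
    """Mean of sqrt((dFx^2 + dFy^2) / 2) over the image interior."""
    dx = np.diff(F, axis=0)[:, :-1]
    dy = np.diff(F, axis=1)[:-1, :]
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))

def band_entropy(img: np.ndarray, bins: int = 64) -> float:
    """Histogram estimate of the image entropy in nats."""
    p, _ = np.histogram(img.ravel(), bins=bins)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log(p)).sum())

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 64) -> float:
    """Histogram estimate of MI(a; b) in nats."""
    pab, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pab /= pab.sum()
    pa, pb = pab.sum(axis=1), pab.sum(axis=0)
    nz = pab > 0
    return float((pab[nz] * np.log(pab[nz] / np.outer(pa, pb)[nz])).sum())

def fusion_factor(cube: np.ndarray, F: np.ndarray) -> float:
    """Entropy-weighted fusion factor (after [14]): bands with more
    information content contribute more to the score."""
    K = cube.shape[2]
    H = np.array([band_entropy(cube[:, :, k]) for k in range(K)])
    MI = np.array([mutual_information(cube[:, :, k], F) for k in range(K)])
    return float((H / H.sum()) @ MI)
```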

The quantitative assessment results of the fused images are provided in Table I for the lunar, urban, and geological datasets. The fused images using the proposed technique yield very high values of the image variance. It can be observed that the variance measure of the results is at least 10% higher than those of the other methods. The values of image entropy, defined as the average information content in the image, are provided in the second column of the table. The proposed technique performs very well in terms of this measure as well. Better results for the gradient measure, which captures the sharpness of the image across the $x$ and $y$ directions, can be seen in Table I for the proposed method. It may also be noted from Table I that the proposed technique has performed much better than all the other techniques for all the parameters in the case of the geological data. Recalling that the geological input data has a very low value of intrinsic contrast and is difficult to visualize, the effectiveness of the proposed optimization-based technique is substantiated.

V. CONCLUSION

We have proposed a new approach for the visualization-oriented fusion of hyperspectral images. The fusion weights are optimally computed in order to provide better visualization results. The novelty of the proposed approach lies in the fact that the fusion process is driven by the desired output characteristics. We have developed a multi-objective cost function based on some of these output characteristics, and then we provide the solution using the elegant Euler–Lagrange equations. As the proposed technique is completely output-driven, it does not require any extra input information. We have included all the typical constraints of a pixel-based image fusion. However, by introducing an auxiliary variable, we have been able to represent fusion as an unconstrained optimization problem. This leads to a computationally simpler and easier solution. The visual and quantitative results are provided to substantiate the quality of the results obtained using the proposed approach.

REFERENCES

[1] A. A. Goshtasby and S. Nikolov, "Image fusion: Advances in the state of the art," Information Fusion, vol. 8, no. 2, pp. 114–118, 2007, Special Issue on Image Fusion: Advances in the State of the Art.

[2] C. Pohl and J. L. Van Genderen, "Multisensor image fusion in remote sensing: Concepts, methods and applications," Int. J. Remote Sens., vol. 19, pp. 823–854, Mar. 1998.


[3] D. Stein, S. Beaven, L. Hoff, E. Winter, A. Schaum, and A. Stocker, "Anomaly detection from hyperspectral imagery," IEEE Signal Process. Mag., vol. 19, pp. 58–69, Jan. 2002.

[4] H. Erives and N. Targhetta, "Implementation of a 3-D hyperspectral instrument for skin imaging applications," IEEE Trans. Instrum. Meas., vol. 58, pp. 631–638, Mar. 2009.

[5] H. Ren and C.-I. Chang, "Automatic spectral target recognition in hyperspectral imagery," IEEE Trans. Aerosp. Electron. Syst., vol. 39, pp. 1232–1249, Oct. 2003.

[6] J.-P. Ardouin, J. Levesque, and T. Rea, "A demonstration of hyperspectral image exploitation for military applications," in Proc. 10th Int. Conf. Information Fusion, Jul. 2007, pp. 1–8.

[7] B. Demir, A. Çelebi, and S. Ertürk, "A low-complexity approach for the color display of hyperspectral remote-sensing images using one-bit-transform-based band selection," IEEE Trans. Geosci. Remote Sens., vol. 47, pp. 97–105, Jan. 2009.

[8] K. Kotwal and S. Chaudhuri, "A fast approach for fusion of hyperspectral images through redundancy elimination," in Proc. 7th Indian Conf. Computer Vision, Graphics and Image Processing, ICVGIP'10, 2010, pp. 506–511.

[9] H. Yang, Q. Du, and G. Chen, "Unsupervised hyperspectral band selection using graphics processing units," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. (JSTARS), vol. 4, pp. 660–668, Sep. 2011.

[10] P. Ready and P. Wintz, "Information extraction, SNR improvement, and data compression in multispectral imagery," IEEE Trans. Commun., vol. 21, pp. 1123–1131, Oct. 1973.

[11] J. Tyo, A. Konsolakis, D. Diersen, and R. Olsen, "Principal-components-based display strategy for spectral imagery," IEEE Trans. Geosci. Remote Sens., vol. 41, pp. 708–718, Mar. 2003.

[12] V. Tsagaris, V. Anastassopoulos, and G. Lampropoulos, "Fusion of hyperspectral data using segmented PCT for color representation and classification," IEEE Trans. Geosci. Remote Sens., vol. 43, pp. 2365–2375, Oct. 2005.

[13] Y. Zhu, P. Varshney, and H. Chen, "Evaluation of ICA based fusion of hyperspectral images for color display," in Proc. 10th Int. Conf. Information Fusion, Québec, Canada, Jul. 2007, pp. 1–7.

[14] K. Kotwal and S. Chaudhuri, "A novel approach to quantitative evaluation of hyperspectral image fusion techniques," Information Fusion, 2011, DOI: 10.1016/j.inffus.2011.03.008.

[15] B. Guo, S. Gunn, B. Damper, and J. Nelson, "Hyperspectral image fusion using spectrally weighted kernels," in Proc. 8th Int. Conf. Information Fusion, Jul. 2005, vol. 1, p. 7.

[16] N. Jacobson, M. Gupta, and J. Cole, "Linear fusion of image sets for display," IEEE Trans. Geosci. Remote Sens., vol. 45, pp. 3277–3288, Oct. 2007.

[17] K. Kotwal and S. Chaudhuri, "Visualization of hyperspectral images using bilateral filtering," IEEE Trans. Geosci. Remote Sens., vol. 48, pp. 2308–2316, May 2010.

[18] T. Wilson, S. Rogers, and M. Kabrisky, "Perceptual-based image fusion for hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 35, pp. 1007–1017, Jul. 1997.

[19] J. J. Lewis, R. J. O'Callaghan, S. G. Nikolov, D. R. Bull, and N. Canagarajah, "Pixel- and region-based image fusion with complex wavelets," Information Fusion, vol. 8, no. 2, pp. 119–130, 2005.

[20] R. A. Schowengerdt, Remote Sensing—Models and Methods for Image Processing, 3rd ed. Boston, MA: Academic Press, 2007.

[21] L. Alparone, L. Wald, J. Chanussot, C. Thomas, P. Gamba, and L. Bruce, "Comparison of pansharpening algorithms: Outcome of the 2006 GRSS data-fusion contest," IEEE Trans. Geosci. Remote Sens., vol. 45, pp. 3012–3021, Oct. 2007.

[22] Q. Du, N. Raksuntorn, S. Cai, and R. Moorhead, "Color display for hyperspectral imagery," IEEE Trans. Geosci. Remote Sens., vol. 46, pp. 1858–1866, Jun. 2008.

[23] M. Cui, A. Razdan, J. Hu, and P. Wonka, "Interactive hyperspectral image visualization using convex optimization," IEEE Trans. Geosci. Remote Sens., vol. 47, pp. 1673–1684, Jun. 2009.

[24] S. Raman and S. Chaudhuri, "A matte-less, variational approach to automatic scene compositing," in Proc. IEEE 11th Int. Conf. Computer Vision, Rio de Janeiro, Brazil, Oct. 2007, pp. 1–6.

[25] Z. Wang and A. Bovik, "A universal image quality index," IEEE Signal Process. Lett., vol. 9, pp. 81–84, Mar. 2002.

[26] G. Piella and H. Heijmans, "A new quality metric for image fusion," in Proc. Int. Conf. Image Processing, ICIP'03, Sep. 2003, vol. 3, pp. 173–176.

[27] C. Xydeas and V. Petrović, "Objective image fusion performance measure," Electron. Lett., vol. 36, pp. 308–309, Feb. 2000.

[28] C. Ramesh and T. Ranjith, "Fusion performance measures and a lifting wavelet transform based algorithm for image fusion," in Proc. 5th Int. Conf. Information Fusion, 2002, vol. 1, pp. 317–320.

[29] G. Piella, "Image fusion for enhanced visualization: A variational approach," Int. J. Computer Vision, vol. 83, pp. 1–11, 2009.

[30] M. J. Brooks and B. K. P. Horn, Eds., Shape From Shading. Cambridge, MA: MIT Press, 1989.

[31] G. Golub, M. Heath, and G. Wahba, "Generalized cross-validation as a method for choosing a good ridge parameter," Technometrics, vol. 21, pp. 215–223, 1979.

[32] G. Qu, D. Zhang, and P. Yan, "Information measure for performance of image fusion," Electron. Lett., vol. 38, pp. 313–315, Mar. 2002.

[33] Y. Chen, Z. Xue, and R. S. Blum, "Theoretical analysis of an information-based quality measure for image fusion," Information Fusion, vol. 9, no. 2, pp. 161–175, 2008.

Ketan Kotwal received the B.E. degree in electronics and telecommunication from the University of Pune, Pune, India, in 2005. In 2007, he joined the Indian Institute of Technology Bombay, Mumbai, India, for the M.Tech. degree in electrical engineering. Currently he is working toward the Ph.D. degree in electrical engineering at IIT Bombay. His research interests include image processing, computer vision, and pattern classification.

Subhasis Chaudhuri (M'87–SM'02–F'11) is an Institute Chair Professor in the Electrical Engineering Department and the Dean of International Relations at IIT Bombay. His areas of research include image processing, computer vision, and computational haptics. Prof. Chaudhuri was the recipient of the Bhatnagar Award in engineering sciences in 2004 and the GD Birla Award for scientific research in 2010.