UNLV Theses, Dissertations, Professional Papers, and Capstones
8-1-2014
A Novel Multimodal Image Fusion Method Using Hybrid Wavelet-based Contourlet Transform
Yoonsuk Choi University of Nevada, Las Vegas
Part of the Computer Engineering Commons, and the Electrical and Computer Engineering Commons
Repository Citation
Choi, Yoonsuk, "A Novel Multimodal Image Fusion Method Using Hybrid Wavelet-based Contourlet Transform" (2014). UNLV Theses, Dissertations, Professional Papers, and Capstones. 2172. http://dx.doi.org/10.34917/6456402
This Dissertation is protected by copyright and/or related rights. It has been brought to you by Digital Scholarship@UNLV with permission from the rights-holder(s). You are free to use this Dissertation in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/or on the work itself. This Dissertation has been accepted for inclusion in UNLV Theses, Dissertations, Professional Papers, and Capstones by an authorized administrator of Digital Scholarship@UNLV. For more information, please contact digitalscholarship@unlv.edu.
A NOVEL MULTIMODAL IMAGE FUSION METHOD USING HYBRID
WAVELET-BASED CONTOURLET TRANSFORM
By
Yoonsuk Choi
Bachelor of Engineering in Electrical Engineering
Korea University, South Korea
2003
Master of Engineering in Electronics and Computer Engineering
Korea University, South Korea
2006
A dissertation submitted in partial fulfillment of the requirements for the
Doctor of Philosophy - Electrical Engineering
Department of Electrical and Computer Engineering
Howard R. Hughes College of Engineering
The Graduate College
University of Nevada, Las Vegas
August 2014
THE GRADUATE COLLEGE
We recommend the dissertation prepared under our supervision by
Yoonsuk Choi
entitled
A Novel Multimodal Image Fusion Method Using Hybrid Wavelet-based Contourlet
Transform
is approved in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Engineering - Electrical Engineering
Department of Electrical and Computer Engineering
Shahram Latifi, Ph.D., Committee Chair
Sahjendra Singh, Ph.D., Committee Member
Venkatesan Muthukumar, Ph.D., Committee Member
Laxmi Gewali, Ph.D., Graduate College Representative
Kathryn Hausbeck Korgan, Ph.D., Interim Dean of the Graduate College
August 2014
ABSTRACT
By
Yoonsuk Choi
Dr. Shahram Latifi, Examination Committee Chair
Professor of Electrical and Computer Engineering
University of Nevada, Las Vegas
Various image fusion techniques have been studied to meet the requirements of different
applications such as concealed weapon detection, remote sensing, urban mapping,
surveillance and medical imaging. Combining two or more images of the same scene or
object produces an image that is better suited to the application than any single source
image. The conventional wavelet transform (WT) has been widely used in image fusion
because of its multi-scale framework and its capability of isolating discontinuities at
object edges. More recently, the contourlet transform (CT) has been applied to image
fusion to overcome the drawbacks of the WT. The experimental studies in this
dissertation show that the contourlet transform is more suitable than the conventional
wavelet transform for image fusion. However, the contourlet transform also has two
major drawbacks. First, the contourlet transform framework provides neither
shift-invariance nor the structural information of the source images that is needed to
enhance the fusion performance. Second, unwanted artifacts are produced during image
decomposition in the contourlet transform framework; they are caused by setting some
transform coefficients to zero for nonlinear approximation. In this dissertation, a
novel fusion method using the hybrid wavelet-based contourlet transform (HWCT) is
proposed to overcome the drawbacks of both the conventional wavelet and contourlet
transforms and to enhance the fusion performance. In the proposed method, the
Daubechies Complex Wavelet Transform (DCxWT) is employed to provide both
shift-invariance and structural information, and the Hybrid Directional Filter Bank
(HDFB) is used to achieve fewer artifacts and more directional information. The
shift-invariance of the DCxWT is desirable during the fusion process because it
mitigates the mis-registration problem: without shift-invariance, even slight
mis-registration between the source images significantly degrades the fusion results.
The DCxWT also provides structural information through the imaginary part of its
wavelet coefficients; hence, more relevant information can be preserved during the
fusion process, which gives a better representation of the fused image. Moreover, the
HDFB is applied to the fusion framework, where the source images are decomposed, to
provide abundant directional information, less complexity, and reduced artifacts.
The proposed method is applied to five different categories of multimodal image
fusion, and an experimental study is conducted to evaluate the performance of the
proposed method in each category using suitable quality metrics. Various datasets,
fusion algorithms, pre-processing techniques and quality metrics are used for each
fusion category. In every experimental study and analysis, the proposed method
produced better fusion results than the conventional wavelet and contourlet
transforms, which validates its usefulness as a fusion method and verifies its high
performance.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Image Fusion
1.2 Multimodal Image Fusion
1.3 Applications of Multimodal Image Fusion
1.4 Challenges and Approaches
1.5 Outline of this Dissertation
CHAPTER 2 TRANSFORM THEORIES
2.1 Wavelet Theory
2.2 Wavelet Transform
2.3 Contourlet Transform
2.4 Summary
CHAPTER 3 FUSION METHODS
3.1 Intensity-Hue-Saturation (IHS)
3.2 Principal Component Analysis (PCA)
3.3 Wavelet-based Fusion
3.4 Contourlet-based Fusion
3.5 Comparative Analysis and Results
3.6 Conclusion
CHAPTER 4 PROPOSED FUSION METHOD
4.1 Hybrid Wavelet-based Contourlet Transform (HWCT) Fusion Model
4.2 Wavelet-based Contourlet Transform Modeling
4.3 Daubechies Complex Wavelet Transform (DCxWT)
4.4 Usefulness of Daubechies Complex Wavelets in Image Fusion
4.5 Hybrid Directional Filter Bank (HDFB) Modeling
4.6 Summary
CHAPTER 5 PRE-PROCESSING OF DATASETS
5.1 Image Registration
5.2 Band Selection
5.3 Decomposition Level
5.4 Conclusion
CHAPTER 6 EXPERIMENTAL STUDY AND ANALYSIS
6.1 Remote Sensing Image Fusion
6.2 Medical Image Fusion
6.3 Infrared Image Fusion
6.4 Radar Image Fusion
6.5 Multi-focus Image Fusion
6.6 Conclusion
CHAPTER 7 CONCLUSION AND FUTURE WORK
7.1 Conclusion
7.2 Future Work
REFERENCES
CURRICULUM VITAE
LIST OF TABLES
Table 1. A performance comparison using quality assessment metrics
Table 2. A performance comparison using quality assessment metrics
Table 3. A comparison of fusion results using performance quality metrics – Dataset 1
Table 4. A comparison of fusion results using performance quality metrics – Dataset 2
Table 5. A performance comparison using quality assessment metrics
Table 6. A performance comparison using quality assessment metrics
Table 7. A comparison of the fusion results with different levels of decomposition
Table 8. A comparison of the fusion results with different levels of decomposition
Table 9. A performance comparison of the fusion results using quality assessment metrics
Table 10. A performance comparison of the fusion results using quality assessment metrics
Table 11. A performance comparison of the fusion results using quality assessment metrics
Table 12. A performance comparison of the fusion results using quality assessment metrics
Table 13. A performance comparison of the fusion results using quality assessment metrics
Table 14. A performance comparison of the fusion results using quality assessment metrics
Table 15. Performance evaluation of the proposed HWCT method
Table 16. Performance evaluation of the proposed HWCT method
Table 17. Performance evaluation of the proposed HWCT method
Table 18. Performance evaluation of the proposed HWCT method
Table 19. Performance evaluation of the proposed HWCT method
Table 20. Performance evaluation of the proposed HWCT method
Table 21. Performance evaluation of the proposed HWCT method
Table 22. Performance evaluation of the proposed HWCT method
Table 23. Performance evaluation of the proposed HWCT method
Table 24. Performance evaluation of the proposed HWCT method
Table 25. Performance evaluation of the proposed HWCT method
Table 26. Performance evaluation of the proposed HWCT method
Table 27. Performance evaluation of the proposed HWCT method
LIST OF FIGURES
Figure 1. Comparison of wavelet transform and contourlet transform
Figure 2. Challenges in contourlet transform
Figure 3. Haar wavelet
Figure 4. Mother wavelet and daughter wavelets
Figure 5. Three-level one-dimensional discrete wavelet transform
Figure 6. One-level two-dimensional discrete wavelet transform
Figure 7. One stage of 2-D DWT multiresolution image decomposition
Figure 8. A representation of one-level and two-level image decomposition
Figure 9. The contourlet transform framework
Figure 10. Laplacian pyramid
Figure 11. Directional filter bank
Figure 12. Two-dimensional spectrum partition using quincunx filter banks with fan filters
Figure 13. Example of shearing operation that is used like a rotation operation for DFB decomposition
Figure 14. The contourlet filter bank
Figure 15. Comparison between actual 2-D wavelets (left) and contourlets (right)
Figure 16. General framework for contourlet-based image fusion
Figure 17. Original MS image and two synthesized source images
Figure 18. Fusion results
Figure 19. Original HS image and two synthesized source images
Figure 20. Fusion results
Figure 21. Schematic of the proposed fusion method
Figure 22. (a) A schematic plot of the WBCT using 3 dyadic wavelet levels and 8 directions at the finest level (b) An example of the wavelet-based contourlet packet
Figure 23. A diagram that shows the multi-resolution subspaces for the WBCT
Figure 24. The WBCT coefficients of the Peppers image
Figure 25. (a) A circular edge structure. (b) Reconstructed using wavelet coefficients of real-valued DWT at single scale. (c) Reconstructed using wavelet coefficients of Daubechies complex wavelet transform at single scale
Figure 26. (a) Cameraman image. (b) Medical image. (c) Image reconstructed from the phase of wavelet coefficients of cameraman image and modulus of wavelet coefficients of medical image. (d) Image reconstructed from the phase of wavelet coefficients of medical image and modulus of wavelet coefficients of cameraman image
Figure 27. Directional filter bank frequency partitioning using 8 directions
Figure 28. (a) An example of the vertical directional filter banks. (b) An example of the horizontal directional filter banks
Figure 29. (a) Quincunx filter bank. H0 and H1 are fan filters and Q is the sampling matrix. Pass bands are shown by white color in the fan filters. (b) An image downsampled by Q. (c) A horizontal or vertical strip of the downsampled image
Figure 30. Applying resampling operations to an image downsampled by Q
Figure 31. Fusion framework
Figure 32. Fusion scheme
Figure 33. Two original MS images
Figure 34. Fusion results of four different registration methods using Dataset 1
Figure 35. Fusion results of four different registration methods using Dataset 2
Figure 36. Source images that are used in the fusion
Figure 37. Fusion results
Figure 38. Source images that are used in the fusion
Figure 39. Fusion results
Figure 40. Original HS image and two synthesized source images using Dataset 1
Figure 41. Original HS image and two synthesized source images using Dataset 2
Figure 42. Original HS image and two synthesized source images
Figure 43. Fusion results
Figure 44. Original HS image and two synthesized source images
Figure 45. Fusion results
Figure 46. Original HS image and two synthesized source images
Figure 47. Fusion results
Figure 48. Original MS image and two synthesized source images
Figure 49. Fusion results
Figure 50. Original MS image and two synthesized source images
Figure 51. Fusion results
Figure 52. Original MS image and two synthesized source images
Figure 53. Fusion results
Figure 54. A set of source images
Figure 55. Fusion results
Figure 56. A set of source images
Figure 57. Fusion results
Figure 58. A set of source images
Figure 59. Fusion results
Figure 60. A set of source images
Figure 61. Fusion results
Figure 62. A set of source images
Figure 63. Fusion results
Figure 64. A set of source images
Figure 65. Fusion results
Figure 66. A set of source images
Figure 67. Fusion results
Figure 68. A set of source images
Figure 69. Fusion results
Figure 70. A set of source images
Figure 71. Fusion results
Figure 72. A set of source images
Figure 73. Fusion results
Figure 74. A set of source images
Figure 75. Fusion results
Figure 76. A set of source images
Figure 77. Fusion results
Figure 78. A set of source images
Figure 79. Fusion results
CHAPTER 1
INTRODUCTION
1.1. Image Fusion
Image fusion is a process to combine multisource imagery data using advanced fusion
techniques including fusion framework, schemes and algorithms. The main purpose is the
integration of disparate and complementary data to enhance the information apparent in
the images as well as to increase the reliability of the interpretation. This leads to more
accurate data [1] and increased utility [2]. It is also stated that fused data provides robust
operational performance, increased confidence, reduced ambiguity, improved reliability
and improved classification [2]. Image fusion is generally applied to digital imagery for
the following applications that are valuable in human life [3]-[10]:
Geographical change detection
Deforestation monitoring
Glacier monitoring
Hazards monitoring
Military target detection
Border security surveillance
Early detection of medical conditions such as cancer
Urban mapping
Replacement of defective data
Object identification and classification
1.2. Multimodal Image Fusion
Multimodal image fusion techniques have been employed in various applications, such as
target detection, remote sensing, urban mapping and medical imaging. Combining two or
more images from heterogeneous sources usually produces an image that is better suited
to the application [11]. The fusion of different images can reduce the uncertainty
related to a single image. Furthermore, image fusion should include techniques that
perform the geometric alignment of several images acquired by different sensors. Such
techniques are called multimodal or multisensor image fusion [12]. The resultant fused
images are widely used in military and security applications, such as target
detection, object tracking, weapon detection and night vision.
With the availability of multisensor data in many fields, such as remote sensing, medical
imaging or machine vision, sensor fusion has emerged as a new and promising research
area. It is possible to have several images of the same scene providing different
information although the scene is the same. This is because each image has been captured
with a different sensor. If we are able to merge the heterogeneous information that is
collected from different image sensors, we can obtain a new and improved image which
is called a multimodal fusion image.
In general, the problem that image fusion tries to solve is how to combine information
from several images (sensors) of the same scene into a new fused image that contains
the best information from the original images. The resultant fused image is therefore
intended to have better quality than any of the original images.
1.3. Applications of Multimodal Image Fusion
There is an increasing number of applications in which multimodal images are used to
improve and enhance image interpretation. This section gives some examples of
multimodal image fusion that combine multiple images and ancillary data with remote
sensing images:
Topographic mapping and map updating
Land use, agriculture and forestry
Flood monitoring
Ice and snow monitoring
Geology
Each of the following subsections cites references for further reading on its topic.
1.3.1. Topographic Mapping and Map Updating
Image fusion as a tool for topographic mapping and map updating has its importance in
the provision of up-to-date information. Areas that are not covered by one sensor might
be contained in another. In the field of topographic mapping or map updating,
combinations of visible and infrared (VIR) and synthetic aperture radar (SAR) are used.
The optical data serves as a reference while the SAR data that can be acquired at any time
provides the most recent situation. In addition, the two datasets complement each other
in terms of the information contained in the imagery. Work in this field by many
researchers is discussed in [13]-[28].
1.3.2. Land Use, Agriculture and Forestry
Regarding the classification of land use, the combination of VIR with SAR data helps
discriminate classes that are not distinguishable in the optical data alone, based on
the complementary information provided by the two data sets [29]-[32]. Similarly, crop
classification in agricultural applications is facilitated [33]-[35]. In multisensor
SAR image fusion, the difference in incidence angles may resolve ambiguities in the
classification results [36]. Multitemporal SAR is a valuable data source in countries
with frequent cloud cover and has been used successfully in crop monitoring. For
developing countries in particular, the fusion of SAR data with VIR is a cost-effective
approach that enables continuous monitoring [37]-[39]. Optical and microwave image
fusion is also well known for identifying and mapping forest cover and other cover
types. The combined optical and microwave data allow more accurate identification than
the results obtained with the individual sensors [40]-[45]. With the implementation of
fusion techniques using multisensor optical data, the accuracy of urban area
classification is improved, mainly because multispectral data are integrated with data
of high spatial resolution [46]-[48].
1.3.3. Flood Monitoring
In the management of natural hazards and in flood monitoring, multisensor VIR/SAR
images play an important role. In general, introducing SAR data into the fusion process
with optical imagery has two advantages:
1. SAR is sensitive to the dielectric constant, which is an indicator of the humidity
of the soil. In addition, many SAR systems provide images in which water can be
clearly distinguished from land.
2. SAR data is available at any time of the day or year independent from cloud cover
or daylight. This makes it a valuable data source in the context of regular
temporal data acquisition necessary for monitoring purposes.
For the representation of the pre-flood situation the optical data provides a good basis.
The VIR image represents the land use and the water bodies before flooding. Then, SAR
data acquisition at the time of the flood can be used to identify flood extent and damage.
Examples of multisensor fusion for flood monitoring are described in [49]-[54]. Others
rely on multitemporal SAR image fusion to assess flood extents and damage [55]-[61].
Furthermore, multitemporal SAR or SAR/VIR combinations are used together with
topographic maps [62]-[65].
1.3.4. Ice and Snow Monitoring
The fusion of data in the field of ice monitoring provides results with higher reliability
and more detail [66], [67]. Regarding the use of SAR from different orbits for snow
monitoring, the amount of distorted areas due to layover, shadow and foreshortening can
be reduced significantly [68].
1.3.5. Geology
Multimodal image fusion is well established in the field of geology and is a widely
applied technique for geological mapping. It is well known that the use of multisensor
data improves the interpretation capabilities of the images. Geological features that
are not visible in a single data set alone can be detected in integrated imagery. In
most cases VIR is combined with SAR because the data sets complement each other: they
contribute information on soil geochemistry, vegetation and land use (VIR) as well as
soil moisture, topography and surface roughness (SAR) [69]-[88].
1.4. Challenges and Approaches
There are many different methods for multimodal image fusion. The Brovey Transform
(BT), Intensity-Hue-Saturation (IHS) and Principal Component Analysis (PCA) [89]
provide the basis for many commonly used image fusion techniques. The IHS method is the
oldest method used in image fusion. It starts in the RGB domain: the RGB input image is
transformed to the IHS domain, the intensity component is replaced by the other
(typically higher-resolution) image, and the inverse IHS transform converts the result
back to the RGB domain [90]. The Brovey transform is based on the chromaticity
transform. In the first step, the RGB input image is normalized and multiplied by the
other image. The resultant image is then added to the intensity component of the RGB
input image [91]. PCA-based image fusion methods are similar to IHS methods but have no
limitation on the number of fused bands. Some of these techniques improve the spatial
resolution while distorting the original chromaticity of the input images, which is a
major drawback.
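The IHS substitution described above can be illustrated with a minimal sketch. It assumes the simple linear IHS model in which intensity is the band average, I = (R + G + B) / 3; under that assumption the forward transform, the intensity swap and the inverse transform reduce to adding the intensity difference to each band. The function name `ihs_fuse` and the sample values are hypothetical and are not taken from this dissertation.

```python
def ihs_fuse(rgb, pan):
    """Fuse an RGB image with a single-band image by intensity substitution.

    rgb: rows of (r, g, b) tuples; pan: rows of floats with the same shape.
    Assumption: linear IHS model with intensity I = (R + G + B) / 3, so the
    transform / swap / inverse chain equals adding (pan - I) to every band.
    """
    fused = []
    for rgb_row, pan_row in zip(rgb, pan):
        out_row = []
        for (r, g, b), p in zip(rgb_row, pan_row):
            intensity = (r + g + b) / 3.0
            delta = p - intensity  # only the intensity component changes
            out_row.append((r + delta, g + delta, b + delta))
        fused.append(out_row)
    return fused


# Hypothetical 2x2 example: a colour image and a co-registered single-band image.
rgb = [[(0.2, 0.4, 0.6), (0.5, 0.5, 0.5)],
       [(0.9, 0.3, 0.0), (0.1, 0.2, 0.3)]]
pan = [[0.5, 0.7],
       [0.4, 0.2]]
fused = ihs_fuse(rgb, pan)
```

In this sketch the fused intensity equals the substituted band while the differences between bands are preserved. Fused values can leave the valid range (here some become negative or exceed 1), so a practical implementation would clip or rescale the result.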
Recently, considerable interest has arisen in new transform techniques that use
multi-resolution analysis, such as the wavelet transform (WT). Multi-resolution
decomposition schemes decompose input images into different scales or levels of
frequencies. Wavelet-based image fusion techniques are implemented by replacing the
detail components (high-frequency coefficients) of one input image with the detail
components of another input image. However, wavelet-based fusion techniques are not
optimal in capturing two-dimensional singularities in the input images. Two-dimensional
wavelets, which are obtained as a tensor product of one-dimensional wavelets, are good
at detecting discontinuities at edge points. However, 2-D wavelets exhibit limited
capabilities in detecting the smoothness along contours [92]. Moreover, the singularity
in some objects is due to discontinuity points located at the edges, and these points
lie along smooth curves that form the smooth boundaries of objects.
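The detail-substitution idea described above can be sketched with a one-level 2-D Haar transform. This is only an illustration under stated assumptions: the Haar filter, the single decomposition level, and the names `haar_dwt2`, `haar_idwt2` and `fuse_substitute_details` are choices made for this example, not the wavelet or fusion rule used in this dissertation.

```python
def haar_dwt2(img):
    """One-level 2-D Haar DWT of an image with even height and width.

    Returns (approx, d1, d2, d3): the low-frequency approximation and three
    detail sub-bands, each half the size of the input.
    """
    approx, d1, d2, d3 = [], [], [], []
    for i in range(0, len(img), 2):
        rows = ([], [], [], [])
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2.0)  # scaled local average
            rows[1].append((a + b - c - d) / 2.0)  # top minus bottom
            rows[2].append((a - b + c - d) / 2.0)  # left minus right
            rows[3].append((a - b - c + d) / 2.0)  # diagonal difference
        for band, row in zip((approx, d1, d2, d3), rows):
            band.append(row)
    return approx, d1, d2, d3


def haar_idwt2(approx, d1, d2, d3):
    """Inverse of haar_dwt2."""
    n_rows, n_cols = len(approx), len(approx[0])
    img = [[0.0] * (2 * n_cols) for _ in range(2 * n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            s, h, v, g = approx[i][j], d1[i][j], d2[i][j], d3[i][j]
            img[2 * i][2 * j] = (s + h + v + g) / 2.0
            img[2 * i][2 * j + 1] = (s + h - v - g) / 2.0
            img[2 * i + 1][2 * j] = (s - h + v - g) / 2.0
            img[2 * i + 1][2 * j + 1] = (s - h - v + g) / 2.0
    return img


def fuse_substitute_details(img_a, img_b):
    """Keep the approximation of img_a and take all detail bands from img_b."""
    approx_a, _, _, _ = haar_dwt2(img_a)
    _, *details_b = haar_dwt2(img_b)
    return haar_idwt2(approx_a, *details_b)


# Hypothetical 4x4 inputs: a blocky low-resolution image and a sharper one.
low_res = [[1.0, 1.0, 2.0, 2.0],
           [1.0, 1.0, 2.0, 2.0],
           [3.0, 3.0, 4.0, 4.0],
           [3.0, 3.0, 4.0, 4.0]]
sharp = [[0.0, 1.0, 0.0, 1.0],
         [1.0, 2.0, 1.0, 2.0],
         [0.0, 1.0, 0.0, 1.0],
         [1.0, 2.0, 1.0, 2.0]]
fused = fuse_substitute_details(low_res, sharp)
```

Because the transform is invertible, the fused image has exactly the approximation band of `low_res` and the detail bands of `sharp`. Practical schemes use more levels and a coefficient-selection rule rather than wholesale substitution.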
The contourlet transform (CT) was introduced by Do and Vetterli in 2005 [93]. This
transform is more suitable for constructing multi-resolution and multi-directional
expansions, and is capable of detecting both the discontinuities at edge points and the
smoothness along contours. In the contourlet transform, a Laplacian pyramid (LP) is
employed in the first stage, while directional filter banks (DFB) are used in the
angular decomposition stage. However, the contourlet transform provides neither
shift-invariance nor structural information; hence, it may not be the optimum choice
for image processing applications such as image fusion. Recently, several approaches
have attempted to introduce image transforms based on the DFB that are capable of both
radial and angular decomposition. The octave-band directional filter banks [94] are a
new family of directional filter banks that also offer an octave-band radial
decomposition. Another approach is the critically sampled contourlet (CRISP-contourlet)
transform [95], which is realized using a one-stage non-separable filter bank. Using a
frequency decomposition similar to that of the contourlet transform, it provides a
non-redundant version of the contourlet transform. The second major drawback of the
contourlet transform is the occurrence of artifacts caused by setting some transform
coefficients to zero for nonlinear approximation during the fusion process. These
unwanted artifacts occur in areas with useful information; hence, important
characteristics or information may be lost once the fusion process is completed.
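The effect of zeroing coefficients for nonlinear approximation can be seen in a small one-dimensional sketch. It uses an orthonormal Haar decomposition as a stand-in, which is an assumption made for brevity; the contourlet transform itself is not implemented here, and the function names and the sample signal are hypothetical.

```python
import math


def haar_forward(signal):
    """Full orthonormal 1-D Haar decomposition (length must be a power of two)."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * k] + coeffs[2 * k + 1]) / math.sqrt(2) for k in range(half)]
        dif = [(coeffs[2 * k] - coeffs[2 * k + 1]) / math.sqrt(2) for k in range(half)]
        coeffs[:n] = avg + dif
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for s, d in zip(out[:n], out[n:2 * n]):
            merged += [(s + d) / math.sqrt(2), (s - d) / math.sqrt(2)]
        out[:2 * n] = merged
        n *= 2
    return out


def keep_largest(coeffs, m):
    """Nonlinear approximation: zero all but the m largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:m])
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]


signal = [4.0, 4.0, 4.0, 4.0, 9.0, 9.0, 1.0, 1.0]
coeffs = haar_forward(signal)
approx = haar_inverse(keep_largest(coeffs, 2))
error = [a - s for a, s in zip(approx, signal)]
```

In this example a single small coefficient is discarded, yet the reconstruction error is spread over every sample, including the flat region at the start of the signal. This is the kind of artifact that coefficient zeroing introduces, shown here in a much simpler transform.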
Figure 1. Comparison of wavelet transform and contourlet transform.
Figure 2. Challenges in contourlet transform.
In this study, a new fusion method is proposed using the hybrid wavelet-based
contourlet transform (HWCT). Daubechies complex wavelets are used in the first-stage
filter bank to realize multiscale subband decompositions with shift-invariance and
structural information. A hybrid directional filter bank is employed in the
second-stage filter bank to achieve angular decompositions with reduced artifacts.
Figure 2 depicts the challenges of the contourlet transform and the approaches used to
overcome its drawbacks.
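Why shift-invariance matters can be illustrated with a minimal sketch of the opposite case. It assumes a critically sampled, real-valued one-level Haar transform; it does not implement the Daubechies complex wavelet transform, and the signal values and function names are hypothetical.

```python
def haar_details(signal):
    """One-level Haar detail coefficients of an even-length signal.

    The transform is critically sampled: one coefficient per pair of samples.
    """
    return [(signal[2 * k] - signal[2 * k + 1]) / 2.0 for k in range(len(signal) // 2)]


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
shifted = [0.0] + step[:-1]  # the same edge, moved right by one sample

energy_step = energy(haar_details(step))
energy_shifted = energy(haar_details(shifted))
```

Here the edge falls between two sample pairs in `step`, so the detail band is empty, while in `shifted` the edge falls inside a pair and produces a non-zero detail coefficient. A one-sample misalignment between source images can therefore change which coefficients a fusion rule selects, which is the motivation for using a shift-invariant first stage.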
1.5. Outline of this Dissertation
This dissertation focuses on the study of multimodal image fusion and proposes a novel
fusion method using hybrid wavelet-based contourlet transform (HWCT). Although both
wavelet and contourlet transforms have advantages, the main challenge is to overcome
their major drawbacks and enhance the fusion performance. First, the contourlet
transform framework does not provide shift-invariance and structural information of the
source images that are necessary to enhance the fusion performance. Second, unwanted
artifacts are produced during the image decomposition process via contourlet transform
framework, which are caused by setting some transform coefficients to zero for nonlinear
approximation.
In this dissertation, a novel fusion method using the hybrid wavelet-based contourlet
transform (HWCT) is proposed to overcome the drawbacks of both the conventional wavelet
and contourlet transforms and to enhance the fusion performance. In the proposed
method, the Daubechies Complex Wavelet Transform (DCxWT) is employed to provide both
shift-invariance and structural information, and the Hybrid Directional Filter Bank
(HDFB) is used to achieve fewer artifacts and more directional information. The
shift-invariance of the DCxWT is desirable during the fusion process because it
mitigates the mis-registration problem: without shift-invariance, even slight
mis-registration between the source images significantly degrades the fusion results.
The DCxWT also provides structural information through the imaginary part of its
wavelet coefficients; hence, more relevant information can be preserved during the
fusion process, which gives a better representation of the fused image. Moreover, the
HDFB is applied to the fusion framework, where the source images are decomposed, to
provide abundant directional information, less complexity, and reduced artifacts.
The proposed method is applied to five different categories of the multimodal image
fusion: i) remote sensing image fusion, ii) medical image fusion, iii) infrared image
fusion, iv) radar image fusion, and v) multi-focus image fusion. An experimental study
is conducted to evaluate the performance of the proposed method in each multimodal
fusion category using suitable quality metrics.
In Chapter 2, the transform theories are explained in detail, beginning with the wavelet
theory. Based on the wavelet theory, the wavelet transform and the contourlet transform
are discussed because both transforms are a foundation of the proposed method.
In Chapter 3, four of the most widely used fusion methods, namely Intensity-Hue-Saturation
(IHS), Principal Component Analysis (PCA), Wavelet-based Fusion and Contourlet-
based Fusion, are discussed in detail. After the discussion of each fusion method,
comparative analyses are conducted using several multimodal datasets and quality
metrics.
In Chapter 4, the hybrid wavelet-based contourlet transform (HWCT) modeling is
discussed first in detail. Next, the Daubechies complex wavelet transform (DCxWT) is
discussed, especially in terms of its advantages and usefulness in the image fusion
process. Lastly, the hybrid directional filter bank modeling is discussed, especially in
terms of its capability in obtaining abundant directional information during the
decomposition process with reduced artifacts.
In Chapter 5, the three most important pre-processing steps are discussed in detail:
i) image registration, ii) band selection, and iii) decomposition level. Performance
evaluations are conducted for each pre-processing technique to show which method
produces the best fusion results.
In Chapter 6, five different categories of the multimodal image fusion are discussed with
detailed experimental study and analysis for each category. For each category, numerous
multimodal datasets are used in the experiments, and different quality metrics are
employed to analyze and evaluate the fusion results. Moreover, for each multimodal
fusion category, a different fusion algorithm is used due to the characteristics of the
multimodal datasets, i.e., it is not a good idea to apply the same fusion algorithm to every
category.
The dissertation concludes in Chapter 7 with a summary of research findings,
experimental results and contributions. This chapter also provides a summary of possible
future work that can be further conducted.
CHAPTER 2
TRANSFORM THEORIES
2.1. Wavelet Theory
Wavelet theory is a relatively recent branch of mathematics. The first and simplest
wavelet was developed by Alfred Haar in 1909. The Haar wavelet belongs to the group of
wavelets known as Daubechies wavelets, which are named after Ingrid Daubechies, who
proved the existence of wavelet families whose scaling functions have certain useful
properties, namely compact support over an interval, at least one non-vanishing moment,
and orthogonal translates. Because of its simplicity (see Eq. (1) and Figure 3), the Haar
wavelet is useful for illustrating the basic concepts of wavelet theory but has limited
utility in applications.
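$$
\psi_{Haar}(x) =
\begin{cases}
1, & 0 \le x < \tfrac{1}{2} \\
-1, & \tfrac{1}{2} \le x < 1 \\
0, & \text{otherwise}
\end{cases}
\qquad
\phi_{Haar}(x) =
\begin{cases}
1, & 0 \le x < 1 \\
0, & \text{otherwise}
\end{cases}
\tag{1}
$$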
Figure 3. Haar wavelet.
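The Haar wavelet and its scaling function are simple enough to evaluate directly. The following is a minimal sketch in Python with NumPy; the function names `haar_wavelet` and `haar_scaling` are illustrative choices and do not come from the dissertation.

```python
import numpy as np

def haar_wavelet(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def haar_scaling(x):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 1.0), 1.0, 0.0)

samples = np.array([-0.25, 0.0, 0.25, 0.5, 0.75, 1.0])
print(haar_wavelet(samples).tolist())  # [0.0, 1.0, 1.0, -1.0, -1.0, 0.0]
print(haar_scaling(samples).tolist())  # [0.0, 1.0, 1.0, 1.0, 1.0, 0.0]
```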
Various researchers further developed the concept of wavelets over the next half century
but it was not until the 1980s that the relationships between quadrature mirror filters,
pyramid algorithms, and orthonormal wavelet bases were discovered, allowing wavelets
to be applied in signal processing. Over the past decade, there has been an increasing
amount of research into the applications of wavelet transforms to remote sensing,
particularly in image fusion. It has been found that wavelets can be used to extract detail
information from one image and inject it into another, since this information is contained
in high frequencies and wavelets can be used to select a set of frequencies in both time
and space. The resulting merged image, which can in fact be a combination of any
number of images, contains the best characteristics of all the original images.
2.1.1. Wavelet Family
Wavelets can be described in terms of two groups of functions: wavelet functions and
scaling functions. It is also common to refer to them as families: the wavelet function is
the “mother” wavelet, the scaling function is the “father” wavelet, and transformations of
the parent wavelets are “daughter” and “son” wavelets.
A. Wavelet Functions
Generally, a wavelet family is described in terms of its mother wavelet, denoted as ψ(x).
The mother wavelet must satisfy certain conditions to ensure that its wavelet transform is
stably invertible. These conditions are:
$$ \int |\psi(x)|^2 \, dx = 1 \tag{2} $$
$$ \int |\psi(x)| \, dx < \infty \tag{3} $$
$$ \int \psi(x) \, dx = 0 \tag{4} $$
The conditions specify that the function must be an element of $L^2(\mathbb{R})$, and in fact must
have normalized energy, that it must be an element of $L^1(\mathbb{R})$, and that it has zero mean
[96]. The third condition allows the addition of wavelet coefficients without changing the
total flux of the signal. Other conditions might be specified according to the application.
For example, the wavelet function might need to be continuous, or continuously
differentiable, or it might need to have compact support over a specific interval, or a
certain number of vanishing moments. Each of these conditions affects the results of the
wavelet transform.
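As a worked check, the Haar wavelet of Eq. (1) can be substituted into the three conditions above. Because the function is constant on each half of the unit interval and zero elsewhere, each integral reduces to a sum of two terms:

```latex
\begin{align*}
\int |\psi_{Haar}(x)|^2 \, dx &= \int_0^{1/2} (1)^2 \, dx + \int_{1/2}^{1} (-1)^2 \, dx
  = \tfrac{1}{2} + \tfrac{1}{2} = 1, \\
\int |\psi_{Haar}(x)| \, dx &= \tfrac{1}{2} + \tfrac{1}{2} = 1 < \infty, \\
\int \psi_{Haar}(x) \, dx &= \int_0^{1/2} 1 \, dx + \int_{1/2}^{1} (-1) \, dx
  = \tfrac{1}{2} - \tfrac{1}{2} = 0.
\end{align*}
```

The Haar wavelet therefore has unit energy, is absolutely integrable, and has zero mean.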
To apply a wavelet function, it must be scaled and translated. Generally, a normalization
factor is also applied so that the daughter wavelet inherits all of the properties of the
mother wavelet. A daughter wavelet $\psi_{a,b}(x)$ is defined by the following equation:
$$ \psi_{a,b}(x) = |a|^{-1/2} \, \psi\!\left(\frac{x-b}{a}\right) \tag{5} $$
where $a, b \in \mathbb{R}$ and $a \neq 0$; $a$ is called the scaling or dilation factor and $b$ is called the
translation factor. In most practical applications, it is necessary to place limits on the
values of $a$ and $b$. A common choice is $a = 2^{-j}$ and $b = 2^{-j}k$, where $j$ and $k$ are
integers. The resulting equation is:
$$ \psi_{j,k}(x) = 2^{j/2} \, \psi(2^{j}x - k) \tag{6} $$
This choice for dilation and translation factors is called a dyadic sampling. Changing j
by one corresponds to changing the dilation by a factor of two, and changing k by one
corresponds to a shift of $2^{-j}$. Figure 4 uses the Haar wavelet to illustrate the relationship
of daughter wavelets to the mother wavelet and the effect of varying dilation and
translation for both the general equation and the dyadic equation. The mother wavelet is
$\psi_{1,0}(x)$ in Figure 4(a) and $\psi_{0,0}(x)$ in Figure 4(b). Non-integer values are used for $j$ and
$k$ in one example in Figure 4(b) to allow direct comparison with $\psi_{0.5,1.5}(x)$ in Figure 4(a).
Figure 4. Mother wavelet and daughter wavelets. (a) Daughter wavelets according to
Eq. 5. (b) Daughter wavelets according to Eq. 6.
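The relationship between the general form (Eq. 5) and the dyadic form (Eq. 6) can be checked numerically. The sketch below uses the Haar wavelet as the mother wavelet; the function names and the sampling grid are illustrative assumptions. With a = 2^(-j) and b = 2^(-j) k, the two forms should coincide, and the normalization factor should keep the energy of the daughter wavelet equal to one.

```python
import numpy as np

def haar(x):
    """Haar mother wavelet."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def daughter(x, a, b, mother=haar):
    """General daughter wavelet (Eq. 5): |a|^(-1/2) * psi((x - b) / a)."""
    x = np.asarray(x, dtype=float)
    return np.abs(a) ** -0.5 * mother((x - b) / a)

def dyadic_daughter(x, j, k, mother=haar):
    """Dyadic daughter wavelet (Eq. 6): 2^(j/2) * psi(2^j x - k)."""
    x = np.asarray(x, dtype=float)
    return 2.0 ** (j / 2) * mother(2.0 ** j * x - k)

x = np.linspace(-1.0, 3.0, 801)   # grid spacing 0.005
j, k = 1, 3
general = daughter(x, a=2.0 ** -j, b=2.0 ** -j * k)
dyadic = dyadic_daughter(x, j, k)
print(np.allclose(general, dyadic))     # the two forms agree

# Riemann-sum estimate of the energy; it should be close to 1
energy = np.sum(dyadic ** 2) * (x[1] - x[0])
print(round(float(energy), 2))
```

Here j = 1 halves the support of the wavelet and k = 3 moves it to the interval [1.5, 2), while the factor 2^(j/2) raises the amplitude so that the energy is unchanged.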
2.1.2. Scaling Functions
In discrete wavelet transforms, a scaling function, or father wavelet, is needed to cover
the low frequencies. If the mother wavelet is regarded as a high pass filter then the father
wavelet, denoted as $\phi(x)$, should be a low pass filter. To ensure that this is the case, it
cannot have any vanishing moments. It is useful to specify that, in fact, the father wavelet
has a zeroth moment, or mean, equal to one:
$$ \int \phi(x) \, dx = 1 \tag{7} $$
In mathematical terms, $\phi(x)$ is chosen so that the set $\{\phi(x - k),\ k \in \mathbb{Z}\}$ forms an
orthonormal basis for the reference space $V_0$. A subspace $V_j$ is spanned by
$\{\phi_{j,k}(x) = 2^{j/2}\phi(2^{j}x - k),\ k \in \mathbb{Z}\}$. Multiresolution analysis makes use of a closed and
nested sequence of subspaces $\{V_j\}_{j \in \mathbb{Z}}$, which is dense in $L^2(\mathbb{R})$: each subsequent
subspace is at a higher resolution and contains all the subspaces at lower resolutions [97].
Since the father wavelet is in $V_0$, it, as well as the mother wavelet, can be expressed as
linear combinations of the basis functions for $V_1$, $\phi_{1,k}(x)$:
$$ \phi(x) = \sum_k l_k \, \phi_{1,k}(x) \tag{8} $$
$$ \psi(x) = \sum_k h_k \, \phi_{1,k}(x) \tag{9} $$
The set $\{\psi_{j,k}(x) = 2^{j/2}\psi(2^{j}x - k),\ k \in \mathbb{Z}\}$ then forms a basis for $W_j$, with $W_j$ being
the orthogonal complement to $V_j$ and $\{W_j\}_{j \in \mathbb{Z}}$ forming a basis for $L^2(\mathbb{R})$. In practice,
neither the scaling function nor the wavelet function is explicitly derived. Provided that
the wavelet function has compact support, the scaling function is equivalent to a scaling
filter and it is sufficient to determine the filter coefficients. The coefficients $l_k$ in Eq. 8
form this scaling, or lowpass, filter and the coefficients $h_k$ in Eq. 9 form the wavelet, or
highpass, filter. To ensure that a signal can be exactly reconstructed from its
decomposition, the scaling coefficients and wavelet coefficients must form a quadrature
mirror filter [98].
In this case, the relationship between the coefficients is given as follows:
$$ h[L - 1 - n] = (-1)^{n} \, l[n] \tag{10} $$
where $L$ is the length of the filter and $0 \le n < L$.
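The quadrature mirror relationship of Eq. 10 can be applied mechanically to obtain the highpass coefficients from the lowpass coefficients. The sketch below uses the well-known four-tap Daubechies lowpass filter as an example; the choice of filter and the function name `qmf_highpass` are assumptions made for illustration.

```python
import numpy as np

def qmf_highpass(lowpass):
    """Build h from l using h[L - 1 - n] = (-1)^n * l[n] (Eq. 10)."""
    l = np.asarray(lowpass, dtype=float)
    L = len(l)
    h = np.empty(L)
    for n in range(L):
        h[L - 1 - n] = (-1) ** n * l[n]
    return h

# Four-tap Daubechies lowpass filter, normalized to unit energy
s3 = np.sqrt(3.0)
l = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
h = qmf_highpass(l)

print(np.isclose(h.sum(), 0.0))       # highpass filter has zero mean
print(np.isclose(l @ h, 0.0))         # highpass is orthogonal to lowpass
print(np.isclose(l[2:] @ h[:2], 0.0)) # also orthogonal under a shift of two samples
print(np.isclose(h @ h, l @ l))       # both filters carry the same energy
```

All four checks should print True for this filter pair.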
Since it can be difficult to create wavelets that meet certain specific needs, yet are
orthogonal, this condition is relaxed in the group of wavelets known as biorthogonal
wavelets. These have two scaling functions, which may generate different multiresolution
analyses, and two wavelet functions. They must satisfy the biorthogonality condition,
$$ \sum_{n \in \mathbb{Z}} \tilde{l}_n \, l_{n+2m} = 2\,\delta_{m,0} \tag{11} $$
where $l$ and $\tilde{l}$ are the coefficients for the two scaling functions, which do not have to
be the same length [99]. Biorthogonal wavelets are used in image compression, as well as
other applications.
2.2. Wavelet Transform
Wavelet transform provides a framework in which a signal is decomposed, with each
level corresponding to a coarser resolution, or lower frequency band. There are two main
groups of transforms, continuous and discrete. Discrete transforms are more commonly
used and can be subdivided in various categories.
2.2.1. Continuous Wavelet Transform
A continuous wavelet transform is performed by applying an inner product to the signal
and the wavelet functions. The dilation and translation factors are elements of the real
line. For a particular dilation $a$ and translation $b$, the wavelet coefficient $W_f(a,b)$ for a
signal $f$ can be calculated as follows:
$$ W_f(a,b) = \langle f, \psi_{a,b} \rangle = \int f(x) \, \psi_{a,b}(x) \, dx \tag{12} $$
Wavelet coefficients represent the information contained in a signal at the corresponding
dilation and translation. The original signal can be reconstructed by applying the inverse
transform:
$$ f(x) = \frac{1}{C_\psi} \iint W_f(a,b) \, \psi_{a,b}(x) \, db \, \frac{da}{a^2} \tag{13} $$
where $C_\psi$ is the normalization factor of the mother wavelet [100].
Although the continuous wavelet transform is simple to describe mathematically, both the
signal and the wavelet function must have closed forms, making it difficult or impractical
to apply. The discrete wavelet transform is used instead.
2.2.2. Discrete Wavelet Transform
The term discrete wavelet transform (DWT) is a general term, encompassing several
different methods. It must be noted that the signal itself is continuous; discrete refers to
discrete sets of dilation and translation factors and discrete sampling of the signal. For
simplicity, it will be assumed that the dilation and translation factors are chosen so as to
have dyadic sampling, but the concepts can be extended to other choices of factors.
At a given scale J , a finite number of translations are used in applying multiresolution
analysis to obtain a finite number of scaling and wavelet coefficients. The signal can be
represented in terms of these coefficients as:
$$ f(x) = \sum_k C_{J,k} \, \phi_{J,k}(x) + \sum_{j=1}^{J} \sum_k d_{j,k} \, \psi_{j,k}(x) \tag{14} $$
where $C_{J,k}$ is the scaling coefficient and $d_{j,k}$ is the wavelet coefficient. The first term in
Eq. 14 gives the low-resolution approximation of the signal while the second term gives
the detailed information at resolutions from the original down to the current resolution J
[101]. The process of applying the DWT can be represented as a bank of filters, as in
Figure 5. At each level of decomposition, the signal is split into high frequency and low
frequency components; the low frequency components can be further decomposed until
the desired resolution is reached. When multiple levels of decomposition are applied, the
process is referred to as multiresolution decomposition. In practice when wavelet
decomposition is used for image fusion, one level of decomposition can be sufficient, but
this depends on the ratio of the spatial resolutions of the images being fused (for dyadic
sampling, a 1:2 ratio is needed).
Figure 5. Three-level one-dimensional discrete wavelet transform. (a) Filter bank
representation. (b) Results in frequency domain.
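The filter bank structure described above can be sketched with the Haar filters, which are the shortest possible choice. This is a minimal illustration rather than the implementation used in the experiments; it assumes that the signal length is divisible by 2 raised to the number of levels, and all function names are hypothetical.

```python
import numpy as np

def haar_dwt_step(signal):
    """One level: lowpass and highpass filtering, each followed by downsampling by 2."""
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / np.sqrt(2.0)
    detail = (s[0::2] - s[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_idwt_step(approx, detail):
    """Invert one level by interleaving the reconstructed even and odd samples."""
    out = np.empty(2 * len(approx))
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

def haar_dwt(signal, levels):
    """Return [approx_J, detail_J, ..., detail_1]; only the lowpass branch is split again."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt_step(approx)
        details.append(d)
    return [approx] + details[::-1]

def haar_idwt(coeffs):
    """Rebuild the signal from the coarsest level back to the finest."""
    approx = coeffs[0]
    for d in coeffs[1:]:
        approx = haar_idwt_step(approx, d)
    return approx

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
coeffs = haar_dwt(x, levels=3)
print([len(c) for c in coeffs])           # [1, 1, 2, 4]
print(np.allclose(haar_idwt(coeffs), x))  # True
```

The coefficient list mirrors Eq. 14: the first entry holds the scaling coefficients at the coarsest level, and the remaining entries hold the wavelet coefficients from the coarsest to the finest level.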
The wavelet and scaling filters are one-dimensional, necessitating a two-stage process for
each level in the multiresolution analysis: the filtering and down-sampling are first
applied to the rows of the image and then to its columns. This produces four images at the
lower resolution, one approximation image and three wavelet coefficient, or detail,
images. In Figure 6(a), $x[n]$ represents the original image; in both Figure 6(a) and (b), A,
HD, VD, and DD are the sub-images produced after one level of transformation. The A
sub-image is the approximation image and results from applying the scaling or low-pass
filter to both rows and columns. A subsequent level of transformation would be applied
only to this sub-image. The HD sub-image contains the horizontal details (from low-pass
on rows, high-pass on columns), the VD sub-image contains the vertical details (from
high-pass on rows, low-pass on columns) and the DD sub-image contains the diagonal
details (from high-pass, or wavelet filter, on both rows and columns) [101].
Figure 6. One-level two-dimensional discrete wavelet transform. (a) Filter bank
representation. (b) Image representation.
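The two-stage row and column filtering can be sketched as follows, again with Haar filters for brevity. The sub-image names A, HD, VD and DD follow the text; the helper names and the test image are assumptions, and the image sides are assumed to be even.

```python
import numpy as np

def haar_split(a, axis):
    """Haar lowpass/highpass filtering plus downsampling by 2 along one axis."""
    a = np.moveaxis(np.asarray(a, dtype=float), axis, 0)
    low = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    high = (a[0::2] - a[1::2]) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_dwt2(image):
    """One level of the 2-D DWT: filter along the rows first, then along the columns."""
    row_low, row_high = haar_split(image, axis=1)  # stage 1: along each row
    A, HD = haar_split(row_low, axis=0)            # stage 2: along each column
    VD, DD = haar_split(row_high, axis=0)
    return A, HD, VD, DD

# Horizontal stripes: intensity changes only from one row to the next
image = np.tile(np.array([[0.0], [1.0]]), (2, 4))  # shape (4, 4)
A, HD, VD, DD = haar_dwt2(image)
print(A.shape)                                     # (2, 2)
print(np.allclose(HD, -1.0))                       # stripes appear as horizontal detail
print(np.allclose(VD, 0.0), np.allclose(DD, 0.0))  # no vertical or diagonal detail
```

Each sub-image has half the resolution of the input in both directions, so the four sub-images together contain exactly as many samples as the original image.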
An example of image decomposition using the discrete wavelet transform is shown in
Figure 7, which depicts one stage of 2-D DWT multiresolution image decomposition.
Next, Figure 8 shows the difference between one-level and two-level image
decomposition.
Figure 7. One stage of 2-D DWT multiresolution image decomposition.
Figure 8. A representation of (a) one-level and (b) two-level image decomposition.
2.3. Contourlet Transform
2.3.1. Contourlet Transform Framework
The wavelet transform is good at isolating the discontinuities at object edges, but cannot
detect the smoothness along the edges. Moreover, it can capture limited directional
information. The contourlet transform effectively overcomes these disadvantages of the
wavelet transform; it is a multi-scale and multi-direction framework for discrete
images. In this transform, the multi-scale analysis and the multi-direction analysis are
separated in a serial way. The Laplacian pyramid (LP) [102] is first used to capture the
point discontinuities, then followed by a directional filter bank (DFB) [103] to link point
discontinuities into linear structures. The overall result is an image expansion using basic
elements like contour segments. The framework of contourlet transform is shown in
Figure 9.
Figure 9. The contourlet transform framework.
2.3.2. Laplacian Pyramid
One way to obtain a multiscale decomposition is to use the Laplacian pyramid (LP)
introduced by Burt and Adelson [102]. The LP decomposition at each level generates a
downsampled lowpass version of the original and the difference between the original and
the prediction, resulting in a bandpass image. Figure 10(a) depicts this decomposition
process, where H and G are called (lowpass) analysis and synthesis filters, respectively,
and M is the sampling matrix. The process can be iterated on the coarse (downsampled
lowpass) signal. Note that in multidimensional filter banks, sampling is represented by
sampling matrices; for example, downsampling $x[n]$ by $M$ yields $x_d[n] = x[Mn]$, where
M is an integer matrix [104].
Figure 10. Laplacian pyramid. (a) One-level decomposition. The outputs are a coarse
approximation a[n] and a difference b[n] between the original signal and the prediction.
(b) The new reconstruction scheme for the Laplacian pyramid [104][26].
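One level of the Laplacian pyramid can be sketched in a few lines. The example below assumes the sampling matrix M = diag(2, 2), a 2x2 box filter for the analysis filter H, and sample replication for the synthesis filter G; these are simple stand-ins chosen for illustration, not the filters used in the cited work. The reconstruction shown is the conventional one (prediction plus difference), not the new reconstruction scheme of Figure 10(b).

```python
import numpy as np

def lp_decompose(x):
    """One LP level: coarse approximation a and bandpass difference b."""
    x = np.asarray(x, dtype=float)
    # Analysis: 2x2 box lowpass followed by downsampling by 2 in each direction
    a = 0.25 * (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2])
    # Synthesis: upsample by replication to form the prediction of x
    prediction = np.kron(a, np.ones((2, 2)))
    b = x - prediction
    return a, b

def lp_reconstruct(a, b):
    """Conventional reconstruction: add the difference back to the prediction."""
    return np.kron(a, np.ones((2, 2))) + b

rng = np.random.default_rng(0)
x = rng.random((8, 8))
a, b = lp_decompose(x)
print(a.shape, b.shape)                      # (4, 4) (8, 8)
print(np.allclose(lp_reconstruct(a, b), x))  # True
print((a.size + b.size) / x.size)            # 1.25
```

The last line shows the redundancy of the representation: the coarse and bandpass images together hold 25 percent more samples than the original image.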
A drawback of the LP is the implicit oversampling. However, in contrast to the critically
sampled wavelet scheme, the LP has the distinguishing feature that each pyramid level
generates only one bandpass image (even for multidimensional cases), and this image
does not have “scrambled” frequencies. This frequency scrambling happens in the
wavelet filter bank when a highpass channel, after downsampling, is folded back into the
low frequency band, and thus its spectrum is reflected. In the LP, this effect is avoided by
downsampling the lowpass channel only.
2.3.3. Directional Filter Bank
Bamberger and Smith [103] constructed a 2-D directional filter bank (DFB) that can be
maximally decimated while achieving perfect reconstruction. The DFB is efficiently
implemented via an l-level binary tree decomposition that leads to $2^l$ subbands with
wedge-shaped frequency partitioning as shown in Figure 11(a). The original construction
of the DFB in [103] involves modulating the input image and using quincunx filter banks
with diamond-shaped filters [105]. To obtain the desired frequency partition, a
complicated tree expanding rule has to be followed for finer directional partitions.
In [106], a new construction for the DFB that avoids modulating the input image is
proposed and this new construction has a simpler rule for expanding the decomposition
tree. The simplified DFB is intuitively constructed from two building blocks. The first
building block is a two-channel quincunx filter bank with fan filters (see Figure 12) that
divides a 2-D spectrum into two directions: horizontal and vertical. The second building
block of the DFB is a shearing operator, which amounts to just reordering of image
samples. Figure 13 shows an application of a shearing operator where a −45° direction
edge becomes a vertical edge. By adding a shearing operator before, and its inverse
(“unshearing”) after, the two-channel filter bank in Figure 12,
we obtain a different directional frequency partition while maintaining perfect
reconstruction. Thus, the key in the DFB is to use an appropriate combination of shearing
operators together with two-direction partition of quincunx filter banks at each node in a
binary tree-structured filter bank, to obtain the desired 2-D spectrum division as shown in
Figure 11(a).
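The statement that shearing amounts to a reordering of image samples can be illustrated with a small sketch. The version below shifts each row circularly by an amount proportional to its row index; the circular wrap-around at the image border and the function names are assumptions made to keep the example short.

```python
import numpy as np

def shear_rows(image, slope=1):
    """Circularly shift row i to the left by slope * i samples (a pure reordering)."""
    image = np.asarray(image)
    out = np.empty_like(image)
    for i, row in enumerate(image):
        out[i] = np.roll(row, -slope * i)
    return out

def unshear_rows(image, slope=1):
    """Inverse operation: shift each row back by the same amount."""
    return shear_rows(image, -slope)

n = 6
diagonal = np.eye(n, dtype=int)  # a diagonal line: pixel (i, i) is set
sheared = shear_rows(diagonal)
print(sheared[:, 0].tolist())                           # [1, 1, 1, 1, 1, 1]
print(np.array_equal(unshear_rows(sheared), diagonal))  # True
```

After shearing, the diagonal line occupies a single column, so a filter bank that separates horizontal from vertical directions can isolate it. Because no sample values are changed, applying the inverse shear afterwards preserves perfect reconstruction.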
Using multirate identities [107], it is instructive to view an l-level tree-structured DFB
equivalently as a $2^l$ parallel channel filter bank with equivalent filters and overall
sampling matrices as shown in Figure 11(b). Denote these equivalent (directional)
synthesis filters as $D_k^{(l)}$, $0 \le k < 2^l$, which correspond to the subbands indexed as in Figure
11(a). The corresponding overall sampling matrices were shown [106] to have the
following diagonal forms:
$$
S_k^{(l)} =
\begin{cases}
\mathrm{diag}(2^{l-1}, 2) & \text{for } 0 \le k < 2^{l-1} \\
\mathrm{diag}(2, 2^{l-1}) & \text{for } 2^{l-1} \le k < 2^{l}
\end{cases}
\tag{15}
$$
which means sampling is separable. The two sets correspond to the mostly horizontal
and mostly vertical set of directions, respectively.
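The sampling matrices of Eq. 15 are easy to tabulate. The sketch below (the function name is hypothetical) builds the matrix for each subband and checks that its determinant is 2^l. Since there are 2^l subbands and each is downsampled by a factor of 2^l, the total number of coefficients equals the number of input samples, which is what maximal decimation means.

```python
import numpy as np

def dfb_sampling_matrix(l, k):
    """Overall sampling matrix S_k^(l) of Eq. 15 for subband k of an l-level DFB."""
    if l < 1 or not 0 <= k < 2 ** l:
        raise ValueError("require l >= 1 and 0 <= k < 2**l")
    if k < 2 ** (l - 1):
        return np.diag([2 ** (l - 1), 2])   # mostly horizontal directions
    return np.diag([2, 2 ** (l - 1)])       # mostly vertical directions

l = 3
matrices = [dfb_sampling_matrix(l, k) for k in range(2 ** l)]
print(matrices[0].tolist())   # [[4, 0], [0, 2]]
print(matrices[7].tolist())   # [[2, 0], [0, 4]]
print(all(S[0, 0] * S[1, 1] == 2 ** l for S in matrices))  # True
```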
From the equivalent parallel view of the DFB, we see that the family,
$$ \left\{ d_k^{(l)}\!\left[ n - S_k^{(l)} m \right] \right\}_{0 \le k < 2^l,\ m \in \mathbb{Z}^2} \tag{16} $$
obtained by translating the impulse responses of the equivalent synthesis filters $D_k^{(l)}$ over
the sampling lattices by $S_k^{(l)}$, provides a basis for discrete signals in $\ell^2(\mathbb{Z}^2)$. This basis
exhibits both directional and localization properties. These basis functions have
quasi-linear supports in space and span all directions. In other words, the basis (16) resembles a
local Radon transform, and its elements are called Radonlets. Furthermore, it can be shown [106] that
if the building block filter bank in Figure 12 uses orthogonal filters, then the resulting
DFB is orthogonal and (16) becomes an orthogonal basis.
Figure 11. Directional filter bank. (a) Frequency partitioning where l=3 and there are
$2^3 = 8$ real wedge-shaped frequency bands. Subbands 0-3 correspond to the mostly
horizontal directions, while subbands 4-7 correspond to the mostly vertical directions. (b)
The multichannel view of an l-level tree-structured directional filter bank.
Figure 12. Two-dimensional spectrum partition using quincunx filter banks with fan
filters. The black regions represent the ideal frequency supports of each filter. Q is a
quincunx sampling matrix.
Figure 13. Example of shearing operation that is used like a rotation operation for DFB
decomposition.
2.3.4. Contourlet Filter Bank
Figure 14 shows the contourlet filter bank. First, multi-scale decomposition is performed
by the Laplacian pyramid, and then a directional filter bank is applied to each bandpass
channel.
Figure 14. The contourlet filter bank.
Contourlet expansion of images consists of basis images oriented at various directions in
multiple scales with flexible aspect ratio. In addition to retaining the multi-scale and
time-frequency localization properties of wavelets, the contourlet transform offers a high
degree of directionality. Contourlet transform adopts non-separable basis functions,
which makes it capable of capturing the geometrical smoothness of the contour along any
possible direction. Compared with traditional image expansions, contourlet can capture
2-D geometrical structure in natural images much more efficiently [108].
Furthermore, for image enhancement, one needs to improve the visual quality of an
image with minimal image distortion. Wavelet-based methods present some limitations
because they are not well adapted to the detection of highly anisotropic elements such as
alignments in an image. Contourlet transform (CT) has better performance in
representing the image salient features such as edges, lines, curves and contours than
wavelet transform because of CT’s anisotropy and directionality. Therefore, CT is well-
suited for multi-scale edge based image enhancement.
Figure 15. Comparison between actual 2-D wavelets (left) and contourlets (right) [109].
To highlight the difference between the wavelet and contourlet transform, Figure 15
shows a few wavelet and contourlet basis images. It is possible to see that contourlets
offer a much richer set of directions and shapes, and thus they are more effective in
capturing smooth contours and geometric structures in images [109].
2.4. Summary
The proposed fusion method (see Chapter 4) is based on both wavelet transform and
contourlet transform, which are based on the wavelet theory. Therefore, the wavelet
theory is briefly explained in Section 2.1 as a basis for the subsequent chapters and
discussions. The most widely used transforms in the field of fusion are the wavelet
transform and the contourlet transform, and these two transforms serve as a foundation of
the proposed fusion method. In order to provide readers with better understanding, the
wavelet transform and the contourlet transform are discussed in detail, in Section 2.2 and
Section 2.3, respectively.
CHAPTER 3
FUSION METHODS
3.1. Intensity-Hue-Saturation (IHS)
The IHS color transformation effectively separates spatial (I) and spectral (H, S)
information from a standard RGB image. It relates to the human color perception
parameters. The mathematical context is expressed by Eq. 17. I relates to the intensity,
while ‘v1’ and ‘v2’ represent intermediate variables which are needed in the
transformation. H and S stand for Hue and Saturation [110].
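$$
\begin{pmatrix} I \\ v_1 \\ v_2 \end{pmatrix}
=
\begin{pmatrix}
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\
\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & -\frac{2}{\sqrt{6}} \\
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0
\end{pmatrix}
\begin{pmatrix} R \\ G \\ B \end{pmatrix},
\qquad
H = \tan^{-1}\!\left(\frac{v_2}{v_1}\right),
\qquad
S = \sqrt{v_1^2 + v_2^2}
\tag{17}
$$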
There are two ways of applying the IHS technique in image fusion: direct and
substitutional. The first refers to the transformation of three image channels assigned to I,
H and S [111]. The second transforms three channels of the data set representing RGB
into the IHS color space which separates the color aspects in its average brightness
(intensity). This corresponds to the surface roughness, its dominant wavelength
contribution (hue) and its purity (saturation) [112], [113]. Both the hue and the saturation
in this case are related to the surface reflectivity or composition [114]. Then, one of the
components is replaced by a fourth image channel which is to be integrated. In many
published studies the channel that replaced one of the IHS components is contrast
stretched to match the latter. A reverse transformation from IHS to RGB as presented in
Eq. 18 converts the data into its original image space to obtain the fused image [115]. The
IHS technique has become a standard procedure in image analysis. It serves color
enhancement of highly correlated data, feature enhancement, the improvement of spatial
resolution and the fusion of disparate data sets.
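$$
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
=
\begin{pmatrix}
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{3}} & -\frac{2}{\sqrt{6}} & 0
\end{pmatrix}
\begin{pmatrix} I \\ v_1 \\ v_2 \end{pmatrix}
\tag{18}
$$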
The use of IHS technique in image fusion is manifold, but based on one principle: the
replacement of one of the three components (I, H or S) of one data set with another image.
Most commonly the intensity channel is substituted. Replacing the intensity (sum of the
bands) by a higher spatial resolution value and reversing the IHS transformation leads to
composite bands. These are linear combinations of the original (resampled) multispectral
bands and the higher resolution panchromatic band.
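The intensity substitution described above can be sketched as follows. The forward matrix is written with the sign pattern commonly cited for this form of the IHS transformation, which is an assumption here; its rows are orthonormal, so the reverse transformation is simply the transpose. The panchromatic band is stretched to the mean and standard deviation of the intensity component before substitution. The function names and the random test data are hypothetical.

```python
import numpy as np

# Forward matrix (Eq. 17); the sign pattern is an assumption (commonly cited form)
FWD = np.array([
    [1 / np.sqrt(3), 1 / np.sqrt(3), 1 / np.sqrt(3)],
    [1 / np.sqrt(6), 1 / np.sqrt(6), -2 / np.sqrt(6)],
    [1 / np.sqrt(2), -1 / np.sqrt(2), 0.0],
])
INV = FWD.T  # rows are orthonormal, so the inverse (Eq. 18) is the transpose

def rgb_to_iv1v2(rgb):
    """Map an (H, W, 3) RGB image to the (I, v1, v2) components."""
    return rgb @ FWD.T

def iv1v2_to_rgb(ivv):
    """Map (I, v1, v2) components back to RGB."""
    return ivv @ INV.T

def hue_saturation(ivv):
    """Hue and saturation; arctan2 is a quadrant-aware form of tan^-1(v2 / v1)."""
    v1, v2 = ivv[..., 1], ivv[..., 2]
    return np.arctan2(v2, v1), np.hypot(v1, v2)

def ihs_fuse(ms_rgb, pan):
    """Replace the intensity of the (resampled) MS image by the stretched PAN band."""
    ivv = rgb_to_iv1v2(ms_rgb)
    intensity = ivv[..., 0]
    pan_matched = (pan - pan.mean()) / pan.std() * intensity.std() + intensity.mean()
    fused = ivv.copy()
    fused[..., 0] = pan_matched
    return iv1v2_to_rgb(fused)

rng = np.random.default_rng(0)
ms = rng.random((4, 4, 3))   # stand-in for a resampled multispectral image
pan = rng.random((4, 4))     # stand-in for the panchromatic band
fused = ihs_fuse(ms, pan)

print(np.allclose(iv1v2_to_rgb(rgb_to_iv1v2(ms)), ms))  # round trip recovers RGB
_, s_before = hue_saturation(rgb_to_iv1v2(ms))
_, s_after = hue_saturation(rgb_to_iv1v2(fused))
print(np.allclose(s_before, s_after))                   # saturation is unchanged
```

Only the intensity component changes, so each fused band is the original band plus the same multiple of the intensity difference, which is the linear combination of multispectral and panchromatic data mentioned in the text.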
A variation of the IHS fusion method applies a stretch to the hue and saturation components
before they are combined and transformed back to RGB. This is called color contrast
stretching. The IHS transformation can be performed either in one or in two steps. The
two step approach includes the possibility of contrast stretching the individual I, H and S
channels. It has the advantage of resulting in color enhanced fused imagery. A closely
related color system to IHS is the HSV: hue, saturation and value.
3.2. Principal Component Analysis (PCA)
The PCA is useful for image encoding, image data compression, image enhancement,
digital change detection, multitemporal dimensionality and image fusion. It is a statistical
technique that transforms a multivariate data set of intercorrelated variables into a data
set of new un-correlated linear combinations of the original variables. It generates a new
set of axes which are orthogonal.
The approach for the computation of the principal components (PCs) comprises the
calculation of:
1. Covariance (unstandardized PCA) or correlation (standardized PCA) matrix
2. Eigenvalues and eigenvectors
3. PCs
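The three steps above can be sketched with NumPy as follows. The layout of the input (one row per pixel, one column per band), the function names and the synthetic data are assumptions made for illustration; the `standardized` flag switches between the covariance and correlation variants.

```python
import numpy as np

def pca(bands, standardized=False):
    """PCA of an (n_pixels, n_bands) array; returns PCs and the quantities needed to invert."""
    X = np.asarray(bands, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    scale = Xc.std(axis=0, ddof=1) if standardized else np.ones(X.shape[1])
    Z = Xc / scale
    # Step 1: covariance matrix (this is the correlation matrix when standardized)
    C = np.cov(Z, rowvar=False)
    # Step 2: eigenvalues and eigenvectors, sorted by decreasing variance
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 3: principal components
    pcs = Z @ eigvecs
    return pcs, eigvals, eigvecs, mean, scale

def inverse_pca(pcs, eigvecs, mean, scale):
    """Transform the principal components back to the original band space."""
    return (pcs @ eigvecs.T) * scale + mean

rng = np.random.default_rng(1)
base = rng.normal(size=(500, 1))
bands = np.hstack([base + 0.1 * rng.normal(size=(500, 1)) for _ in range(3)])

pcs, eigvals, eigvecs, mean, scale = pca(bands, standardized=True)
print(np.allclose(np.cov(pcs, rowvar=False), np.diag(eigvals)))  # PCs are uncorrelated
print(np.allclose(inverse_pca(pcs, eigvecs, mean, scale), bands))  # inverse recovers the data
print(eigvals[0] / eigvals.sum() > 0.9)  # PC1 dominates for highly correlated bands
```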
An inverse PCA transforms the combined data back to the original image space. The use
of the correlation matrix implies a scaling of the axes so that the features receive a unit
variance. It prevents certain features from dominating the image because of their large
digital numbers. The signal-to-noise ratio (SNR) is significantly improved by applying the
standardized PCA [116], [117]. Better results are obtained if the statistics are derived
from the whole study area rather than from a subset area [118]. The PCA technique can
also be found under the expression Karhunen-Loève approach [119].
Two types of PCA can be performed: selective or standard. The latter uses all available
bands of the input image and the selective PCA uses only a selection of bands which are
chosen based on a priori knowledge or application purposes. In the case of Landsat Thematic Mapper (TM) data, the first three
PCs contain 98-99 percent of the variance and therefore are sufficient to represent the
information.
PCA in image fusion has two approaches:
1. PCA of a multichannel image, with replacement of the first principal component by
different images (Principal Component Substitution - PCS).
2. PCA of all multi-image data channels.
The first version follows the idea of increasing the spatial resolution of a multichannel
image by introducing an image with a higher resolution. The channel which will replace
PC1 is stretched to the variance and average of PC1. The higher resolution image
replaces PC1 since it contains the information which is common to all bands while the
spectral information is unique for each band; PC1 accounts for maximum variance which
can maximize the effect of the high resolution data in the fused image.
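Principal component substitution can be sketched as follows, assuming the multispectral image has already been resampled to the grid of the higher resolution image. The sign of an eigenvector is arbitrary, so the sketch flips PC1 when it is negatively correlated with the replacement channel; this orientation step, like the function name and the synthetic data, is an assumption added for illustration.

```python
import numpy as np

def pcs_fuse(ms, pan):
    """Replace PC1 of an (H, W, B) multispectral image with a stretched (H, W) pan band."""
    h, w, b = ms.shape
    X = ms.reshape(-1, b).astype(float)
    mean = X.mean(axis=0)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]
    pcs = Xc @ eigvecs
    p = pan.reshape(-1).astype(float)
    # Orient PC1 so that it is positively correlated with the pan band (assumption)
    if np.corrcoef(pcs[:, 0], p)[0, 1] < 0:
        eigvecs[:, 0] *= -1
        pcs[:, 0] *= -1
    pc1 = pcs[:, 0]
    # Stretch the pan band to the variance and average of PC1, then substitute it
    pcs[:, 0] = (p - p.mean()) / p.std() * pc1.std() + pc1.mean()
    fused = pcs @ eigvecs.T + mean
    return fused.reshape(h, w, b)

rng = np.random.default_rng(2)
ms = rng.random((8, 8, 3))
pan = ms.mean(axis=2) + 0.05 * rng.random((8, 8))
fused = pcs_fuse(ms, pan)
print(fused.shape)                                                  # (8, 8, 3)
print(np.allclose(fused.mean(axis=(0, 1)), ms.mean(axis=(0, 1))))   # band means are preserved
```

Because the substituted channel is matched to the mean and variance of PC1, the band means of the fused image equal those of the input, while the spatial detail of the first component now comes from the higher resolution image.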
The second procedure integrates the disparate natures of multisensor input data in one
image. The image channels of the different sensor are combined into one image file and a
PCA is calculated from all the channels.
A similar approach to the PCS is accomplished in the C-stretch (color stretch) [120] and
the D-stretch (de-correlation stretch) [121]. The de-correlation stretch helps to overcome
the perceived problem that the original data often occupy a relatively small portion of the
overall data space [121]. In D-stretching three-channel multispectral data are transformed
on to principal component axes, stretched to give the data a spherical distribution in
feature space and then transformed back onto the original axes. In C-stretching PC1 is
discarded, or set to a uniform DN across the entire image, before applying the inverse
transformation. This yields three color stretched bands which, when composited, retain
the color relations of the original color composite but albedo and topographically induced
brightness variations are removed.
The PCA approach is sensitive to the choice of area to be analyzed. The correlation
coefficient reflects the tightness of a relation for a homogeneous sample. However, shifts
in the band values due to markedly different cover types also influence the correlations
and particularly the variances [121].
3.3. Wavelet-based Fusion
A mathematical tool developed originally in the field of signal processing can also be
applied to fuse image data following the concept of the multiresolution analysis (MRA)
[122]. Another application is the automatic geometric registration of images, one of the
pre-requisites to pixel based image fusion [123]. The wavelet transform creates a
summation of elementary functions (wavelets) from arbitrary functions of finite energy.
The weights assigned to the wavelets are the wavelet coefficients which play an
important role in the determination of structure characteristics at a certain scale in a
certain location. The interpretation of structures or image details depends on the image
scale which is hierarchically compiled in a pyramid produced during the MRA.
The wavelet transform in the context of image fusion is used to describe differences
between successive images provided by the MRA. Once the wavelet coefficients are
determined for the two images of different spatial resolution, a transformation model can
be derived to determine the missing wavelet coefficients of the lower resolution image.
Using these it is possible to create a synthetic image from the lower resolution image at
the higher spatial resolution. This image contains the preserved spectral information with
the higher resolution, hence showing more spatial detail.
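A very simple instance of this idea is direct substitution: the detail coefficients of the higher resolution image are combined with the lower resolution band acting as the approximation sub-image. This is a swapped-in simplification rather than the transformation model mentioned in the text. The sketch assumes one-level Haar filters, a resolution ratio of 1:2 and co-registered inputs; all function names are hypothetical.

```python
import numpy as np

def split(a, axis):
    """Haar analysis along one axis: lowpass and highpass halves."""
    a = np.moveaxis(a, axis, 0)
    low = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    high = (a[0::2] - a[1::2]) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def merge(low, high, axis):
    """Haar synthesis along one axis: inverse of split."""
    low, high = np.moveaxis(low, axis, 0), np.moveaxis(high, axis, 0)
    out = np.empty((2 * low.shape[0],) + low.shape[1:])
    out[0::2] = (low + high) / np.sqrt(2.0)
    out[1::2] = (low - high) / np.sqrt(2.0)
    return np.moveaxis(out, 0, axis)

def dwt2(img):
    lo, hi = split(np.asarray(img, dtype=float), axis=1)
    A, HD = split(lo, axis=0)
    VD, DD = split(hi, axis=0)
    return A, HD, VD, DD

def idwt2(A, HD, VD, DD):
    lo = merge(A, HD, axis=0)
    hi = merge(VD, DD, axis=0)
    return merge(lo, hi, axis=1)

def wavelet_fuse(ms_band_lowres, pan):
    """Combine the low-resolution band with the detail coefficients of the PAN image."""
    _, HD, VD, DD = dwt2(pan)
    # The orthonormal Haar approximation equals 2 x (2x2 block mean), hence the factor 2
    return idwt2(2.0 * np.asarray(ms_band_lowres, dtype=float), HD, VD, DD)

rng = np.random.default_rng(3)
pan = rng.random((8, 8))
ms_band = rng.random((4, 4))
fused = wavelet_fuse(ms_band, pan)

block_mean = 0.25 * (fused[0::2, 0::2] + fused[0::2, 1::2]
                     + fused[1::2, 0::2] + fused[1::2, 1::2])
print(fused.shape)                       # (8, 8)
print(np.allclose(block_mean, ms_band))  # low-resolution content is preserved
```

The fused band has the size of the higher resolution image, its 2x2 block means reproduce the lower resolution band, and its detail coefficients are those of the higher resolution image.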
3.4. Contourlet-based Fusion
The distribution of the contourlet transform coefficients depends on the parameter
n-levels given in the DFB decomposition stage, where n-levels is a one-dimensional vector
that stores the DFB decomposition level for each level of the pyramid. If the
decomposition level is 0, the DFB uses the wavelet to process the subimage of the
pyramid. If the decomposition level is $l_j$, the DFB divides the subimage into $2^{l_j}$
directions. Corresponding to the vector parameter n-levels, the coefficient Y of the
contourlet decomposition is a vector too. The length of Y is equal to length(n-levels)
+ 1. Y{1} is the low-frequency subimage. Y{i} (i = 2, ..., Len) is the directional
subimage obtained by DFB decomposition, where i denotes the i-th level of pyramid
decomposition.
Fusion methods based on contourlet analysis combine decomposition coefficients of two
or more source images using a certain fusion algorithm. Then, the inverse transform is
performed on the combined coefficients resulting in the fused image. A general scheme
for contourlet-based fusion methods is shown in Figure 16, where Image 1 and Image 2
denote the input images, CT represents the contourlet transform, and Image F is the final
fused image.
Figure 16. General framework for contourlet-based image fusion.
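The coefficient-combination step of this scheme can be sketched independently of any particular contourlet implementation. In the sketch below the decomposition is mocked with random arrays that only reproduce the layout described above (a lowpass subimage followed by one list of directional subbands per pyramid level), and the fusion rules (averaging for the lowpass subimage, maximum absolute value for the directional subbands) are illustrative assumptions, not the algorithms used later in the dissertation. Python lists are zero-based, so `Y[0]` plays the role of Y{1}, and the case of a decomposition level of 0 is not modeled.

```python
import numpy as np

def fuse_coefficients(Y1, Y2):
    """Fuse two decompositions that share the same layout."""
    fused = [0.5 * (Y1[0] + Y2[0])]  # lowpass subimage: average
    for level1, level2 in zip(Y1[1:], Y2[1:]):
        # directional subbands: keep the coefficient with the larger magnitude
        fused.append([np.where(np.abs(c1) >= np.abs(c2), c1, c2)
                      for c1, c2 in zip(level1, level2)])
    return fused

def mock_decomposition(rng, nlevels, size=4):
    """Hypothetical stand-in for a contourlet decomposition (layout only, random values)."""
    Y = [rng.normal(size=(size, size))]
    for l in nlevels:
        Y.append([rng.normal(size=(size, size)) for _ in range(2 ** l)])
    return Y

rng = np.random.default_rng(0)
nlevels = [2, 3]
Y1 = mock_decomposition(rng, nlevels)
Y2 = mock_decomposition(rng, nlevels)
F = fuse_coefficients(Y1, Y2)
print(len(F))                           # 3, that is, len(nlevels) + 1
print([len(level) for level in F[1:]])  # [4, 8]
```

In a complete pipeline, the fused structure `F` would be passed to the inverse contourlet transform to obtain the fused image.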
3.5. Comparative Analysis and Results
3.5.1. Experimental Study and Analysis
In this section, experiments are conducted in order to compare and analyze which fusion
method is optimal in the image fusion process. Pre-processing of the datasets, fusion
process, fusion algorithms or schemes and performance quality metrics are explained in
detail in the following chapters. The main point of this section, however, is to provide the
readers with a clear view of the fusion performance of the four methods that are
widely used in the fusion process. Furthermore, the given experimental results verify
that the contourlet-based fusion method is the most suitable of the four for achieving
better fusion performance.
Figure 17. Original MS image and two synthesized source images. (a) Original
MS image. (b) Synthesized PAN source image. (c) Synthesized MS source image.
Figure 18. Fusion results. (a) IHS. (b) PCA. (c) WT. (d) CT.
Table 1. A performance comparison using quality assessment metrics.

Fusion     Spectral Analysis             Spatial Analysis
Method     CC       RASE      SAM        AG        UIQI     SNR
IHS        0.846    44.853    0.277      26.252    0.674    68.652
PCA        0.859    44.738    0.268      26.016    0.683    68.738
WT         0.862    44.682    0.256      25.891    0.691    68.744
CT         0.879    44.527    0.245      25.472    0.699    68.757
Figure 19. Original HS image and two synthesized source images. (a) Original HS
image. (b) Synthesized PAN source image. (c) Synthesized HS source image.
Figure 20. Fusion results. (a) IHS. (b) PCA. (c) WT. (d) CT.
Table 2. A performance comparison using quality assessment metrics.

Fusion     Spectral Analysis             Spatial Analysis
Method     CC       RASE      SAM        AG        UIQI     SNR
IHS        0.753    45.769    0.267      28.621    0.651    67.567
PCA        0.759    45.756    0.262      28.536    0.658    67.734
WT         0.762    45.741    0.256      28.511    0.663    67.849
CT         0.769    45.728    0.248      28.503    0.672    67.937
3.6. Conclusion
In Chapter 3, the four most widely used fusion methods, namely Intensity-Hue-Saturation
(IHS), Principal Component Analysis (PCA), wavelet-based fusion and contourlet-based
fusion, were discussed in detail. After the discussion of each fusion method,
comparative analyses were conducted using several multimodal datasets and quality
metrics. As mentioned earlier, the fusion process and quality metrics are discussed in
detail in Chapter 6.
From the experimental results, we can observe that the contourlet-based fusion method
produced better results than the other three methods, both spatially and spectrally. A total
of six different quality metrics were employed in the performance evaluations: CC,
RASE and SAM for spectral performance; AG, UIQI and SNR for spatial performance
(see Tables 1 and 2). Each quality metric indicates that the contourlet-based fusion
produces better results than the other three methods.
CHAPTER 4
PROPOSED FUSION METHOD
4.1. Hybrid Wavelet-based Contourlet Transform (HWC) Fusion Model
The block diagram of the proposed fusion method is illustrated in Figure 21. Source
images are first decomposed using Daubechies Complex Wavelet Transform (DCxWT)
in order to realize multiscale subband decompositions with no redundancy. Next, hybrid
directional filter banks are applied to the frequency coefficients obtained from the
previous stage to achieve angular decompositions. The obtained frequency coefficients
are fused together based on certain fusion algorithms which are discussed in Chapter 6.
The resultant fused coefficients are used to reconstruct an image using inverse transform.
As a result, the final fusion result is obtained.
Figure 21. Schematic of the proposed fusion method.
4.2. Wavelet-based Contourlet Transform (WBCT) Modeling
Similar to the contourlet transform, the WBCT consists of two filter bank stages. The first
stage provides subband decomposition, which in the case of the WBCT is a wavelet
transform, in contrast to the Laplacian pyramid used in contourlets. The second stage of
the WBCT is a directional filter bank (DFB), which provides angular decomposition. The
first stage is realized by separable filter banks, while the second stage is implemented
using non-separable filter banks. For the DFB stage, the iterated tree-structured filter
banks are employed using fan filters [124].
At each level $j$ in the wavelet transform, it is possible to obtain the traditional three
high-pass bands corresponding to the LH, HL, and HH bands. The DFB is then applied with
the same number of directions to each band in a given level $j$. Starting from the desired
maximum number of directions $N_D = 2^L$ on the finest level of the wavelet transform $J$,
the number of directions is decreased at every other dyadic scale when proceeding through
the coarser levels ($j < J$). In this way, the anisotropy scaling law,
$\text{width} \approx \text{length}^2$, can be achieved.
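The direction schedule described above can be sketched in a few lines. This is only an illustration: the function name `directions_per_level` is hypothetical, and the choice of which alternate levels lose a directional level (here, after every second step away from the finest level) is an assumption, since the text only says "every other dyadic scale".

```python
def directions_per_level(num_levels, finest_dir_levels):
    """Map each wavelet level j (j = num_levels is the finest) to a DFB direction count.

    Assumption: one directional level is dropped after every two steps
    toward the coarser scales, starting from 2**finest_dir_levels.
    """
    directions = {}
    for j in range(num_levels, 0, -1):
        steps_from_finest = num_levels - j
        dir_levels = max(finest_dir_levels - steps_from_finest // 2, 0)
        directions[j] = 2 ** dir_levels
    return directions


# Example: 3 wavelet levels with L = 3 directional levels at the finest scale.
schedule = directions_per_level(3, 3)
print(schedule)
```

Under this assumed parity, the two finest levels share the maximum number of directions and the coarsest level has half as many.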
Figure 22(a) illustrates a schematic plot of the WBCT using 3 wavelet levels and L = 3
directional levels. Since we have mostly vertical directions in the HL image and
horizontal directions in the LH image, it might seem logical to use partially decomposed
DFB’s with vertical and horizontal directions on the HL and LH bands, respectively.
However, since the wavelet filters are not perfect in splitting the frequency space to the
low-pass and high-pass components, that is, not all of the directions in the HL image are
vertical and in the LH image are horizontal, fully decomposed DFB is used on each band.
Figure 22. (a) A schematic plot of the WBCT using 3 dyadic wavelet levels and 8
directions at the finest level ($N_D = 8$). The directional decomposition is overlaid on
the wavelet subbands. (b) An example of the wavelet-based contourlet packet.
One of the major advantages of the WBCT is that we can have Wavelet-based Contourlet
Packets in much the same way as we have Wavelet Packets. That is, keeping in mind the
anisotropy scaling law (the number of directions is doubled at every other wavelet level
when we refine the scales), we allow quad-tree decomposition of both the low-pass and
high-pass channels in wavelets and then apply the DFB on each subband. Figure 22(b)
schematically illustrates an example of the wavelet-based contourlet packets. However, if
the anisotropy constraint is ignored, a quad-tree-like angular decomposition, which is
introduced in [125] as Contourlet Packets, can be constructed as well. Below, a brief
multi-resolution modeling of the WBCT is presented.
Following a procedure similar to that outlined in [126], for an $l$-level DFB we have $2^l$
directional subbands with equivalent synthesis filters $G_k^{(l)}$, $0 \le k < 2^l$, and the
overall downsampling matrices $S_k^{(l)}$, $0 \le k < 2^l$, are defined as follows:

$$S_k^{(l)} = \begin{cases} \begin{bmatrix} 2^{l-1} & 0 \\ 0 & 2 \end{bmatrix}, & \text{if } 0 \le k < 2^{l-1} \\[2ex] \begin{bmatrix} 2 & 0 \\ 0 & 2^{l-1} \end{bmatrix}, & \text{if } 2^{l-1} \le k < 2^l \end{cases} \tag{19}$$

Next, $\{ g_k^{(l)}[n - S_k^{(l)} m] \}$, $0 \le k < 2^l$, $m \in \mathbb{Z}^2$, is a directional
basis for $l^2(\mathbb{Z}^2)$, where $g_k^{(l)}$ is the impulse response of the synthesis filter
$G_k^{(l)}$. Assuming an orthonormal separable wavelet transform, we will have a separable
2-D multi-resolution [127]:

$$V_j^2 = V_j \otimes V_j, \quad \text{and} \quad V_{j-1}^2 = V_j^2 \oplus W_j^2, \tag{20}$$

where $W_j^2$ is the detail space and the orthogonal complement of $V_j^2$ in $V_{j-1}^2$. The
family $\{\psi_{j,n}^1, \psi_{j,n}^2, \psi_{j,n}^3\}_{n \in \mathbb{Z}^2}$ is an orthonormal basis of
$W_j^2$. Now, if we apply $l_j$ directional levels to the detail multi-resolution space $W_j^2$,
we obtain $2^{l_j}$ directional subbands of $W_j^2$ (see Figure 23):

$$W_j^2 = \bigoplus_{k=0}^{2^{l_j}-1} W_{j,k}^{2,(l_j)}. \tag{21}$$

Defining

$$\eta_{j,k,n}^{i,(l_j)} = \sum_{m \in \mathbb{Z}^2} g_k^{(l_j)}[m - S_k^{(l_j)} n] \, \psi_{j,m}^i, \quad i = 1, 2, 3, \tag{22}$$

the family $\{\eta_{j,k,n}^{1,(l_j)}, \eta_{j,k,n}^{2,(l_j)}, \eta_{j,k,n}^{3,(l_j)}\}_{n \in \mathbb{Z}^2}$
is a basis for the subspace $W_{j,k}^{2,(l_j)}$.
Figure 23. A diagram that shows the multi-resolution subspaces for the WBCT.
Figure 24 shows an example of the WBCT coefficients of the Peppers image. Here, 3
wavelet levels and 8 directions are used at the finest level. It can be seen that most of the
coefficients in the HL subbands are in the vertical directional subbands (the upper half of
the subbands) while those in the LH subbands are in the horizontal directional subbands
(the lower half of the subbands).
Figure 24. The WBCT coefficients of the Peppers image. For better visualization, the
transform coefficients are clipped between 0 and 7.
4.3. Daubechies Complex Wavelet Transform
The scaling equation of multi-resolution theory is given as follows:

$$\phi(x) = 2 \sum_k a_k \, \phi(2x - k), \tag{23}$$

where $a_k$ are the coefficients. The $a_k$ can be real as well as complex valued, and
$\sum_k a_k = 1$. Daubechies’s wavelet bases $\{\psi_{j,k}(t)\}$ in one dimension are defined
through the above scaling function and the multi-resolution analysis of $L^2(\mathbb{R})$. To
provide a general solution, Daubechies considered $a_k$ to be real valued only. The
construction details of the Daubechies complex wavelet transform are given in [128].
The generating wavelet $\psi(t)$ is given as follows:

$$\psi(t) = 2 \sum_n (-1)^n \, \overline{a_{1-n}} \, \phi(2t - n). \tag{24}$$

Here, $\psi(t)$ and $\phi(t)$ share the same compact support $[-N, N+1]$. Any function $f(t)$
can be decomposed into the complex scaling function and mother wavelet as follows:

$$f(t) = \sum_k c_k^{j_0} \, \phi_{j_0,k}(t) + \sum_{j=j_0}^{j_{\max}-1} \sum_k d_k^j \, \psi_{j,k}(t), \tag{25}$$

where $j_0$ is a given resolution level, and $\{c_k^{j_0}\}$ and $\{d_k^j\}$ are known as the
approximation and detail coefficients.
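To make the relation between the scaling coefficients and the generating wavelet concrete, the following sketch derives the wavelet coefficients $b_n = (-1)^n \overline{a_{1-n}}$ from a given set of $a_k$. It assumes that Eq. (24) uses the complex conjugate of $a_{1-n}$ (which makes no difference for real coefficients), and it uses the Haar coefficients only as a simple real-valued stand-in; the actual Daubechies complex coefficients are constructed in [128] and are not reproduced here. The function name is hypothetical.

```python
import numpy as np


def wavelet_coefficients(a, first_index=0):
    """Return (n, b_n) with b_n = (-1)^n * conj(a_{1-n}).

    `a` holds a_k for k = first_index, first_index + 1, ...
    """
    a = np.asarray(a, dtype=complex)
    ks = np.arange(first_index, first_index + len(a))
    ns = 1 - ks                                  # index n such that 1 - n = k
    signs = np.where(ns % 2 == 0, 1.0, -1.0)     # (-1)^n, also valid for negative n
    b = signs * np.conj(a)
    order = np.argsort(ns)
    return ns[order], b[order]


# Haar example under the normalization sum_k a_k = 1.
a_haar = np.array([0.5, 0.5])
n_idx, b_haar = wavelet_coefficients(a_haar)
print(n_idx, b_haar)
```

For the Haar case this yields coefficients of opposite sign that sum to zero, as expected of a wavelet filter.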
The Daubechies complex wavelet transform has the following advantages:
1) It has perfect reconstruction.
2) It is a non-redundant wavelet transform, unlike the Dual Tree Complex Wavelet
Transform (DTCWT) [10], which has a redundancy of $2^m : 1$ for an m-dimensional
signal.
3) It has the same number of computation steps as the DWT (although it involves
complex computations), while the DTCWT requires $2^m$ times more computations than
the DWT for m-dimensional signals.
4) It is symmetric. This property makes it easy to handle edge points during the
signal reconstruction.
4.4. Usefulness of Daubechies Complex Wavelets in Image Fusion
Daubechies complex wavelet transform exhibits two important properties that directly
improve the quality of the fusion results.
4.4.1. Reduced Shift Sensitivity
Daubechies complex wavelet transform is approximately shift invariant. A transform is
shift sensitive if an input signal shift causes an unpredictable change in transform
coefficients. In discrete wavelet transform (DWT), shift sensitivity arises from use of
downsamplers in the implementation. Figure 25 shows a circular edge structure
reconstructed using real and complex Daubechies wavelets at single scale. It is clear that
as the circular edge structure moves through space, the reconstruction using real valued
DWT coefficients changes erratically, while Daubechies complex wavelet transform
reconstructs all local shifts and orientations in the same manner. Shift invariance is
desirable during the fusion process; otherwise, mis-registration problems [129] occur,
which in turn produce a mismatched or non-aligned fused image.
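The shift sensitivity of a critically sampled real DWT can be reproduced with a very small experiment. The sketch below uses a one-level real Haar transform as a stand-in (it is not the Daubechies complex wavelet transform) and compares the detail-band energy of a step edge before and after a one-sample shift. All names are illustrative.

```python
import numpy as np


def haar_detail(signal):
    """One-level Haar detail coefficients: difference filter, then keep every second sample."""
    x = np.asarray(signal, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)


def step_edge(position, length=16):
    """Signal that is 0 before `position` and 1 from `position` onward."""
    x = np.zeros(length)
    x[position:] = 1.0
    return x


# Same edge, shifted by a single sample.
energy_even = float(np.sum(haar_detail(step_edge(8)) ** 2))
energy_odd = float(np.sum(haar_detail(step_edge(9)) ** 2))
print(energy_even, energy_odd)
```

When the edge falls between two sample pairs, the detail band is empty; after a one-sample shift the same edge produces a nonzero detail coefficient. This dependence on alignment with the downsampling grid is the effect described above.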
Figure 25. (a) A circular edge structure. (b) Reconstructed using wavelet coefficients of
real-valued DWT at single scale. (c) Reconstructed using wavelet coefficients of
Daubechies complex wavelet transform at single scale.
4.4.2. Availability of Phase Information
The Daubechies complex wavelet transform (DCxWT) provides phase information through
the imaginary part of its wavelet coefficients. Most of the structural information of an
image is contained in its phase. In order to show the importance of the phase, a
cameraman image and a medical image are decomposed by the DCxWT, and the images are
then reconstructed after exchanging their phases with each other. As we can see from
Figure 26, the phase of an image represents the structural details or skeleton of the
image. It was found that the phase is an important criterion for detecting strong
(salient) features of images such as edges, corners, contours, etc. Therefore, by using
the DCxWT, we are able to preserve more relevant information during the fusion process,
which gives a better representation of the fused image.
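The phase-exchange experiment can be imitated with any complex-valued transform. The sketch below swaps in the 2-D FFT for the DCxWT (an assumption made purely for self-containment, since the DCxWT filters are not given here) and uses random arrays in place of the cameraman and medical images. It combines the modulus of one image with the phase of the other and checks which source the result resembles.

```python
import numpy as np


def swap_phase(modulus_source, phase_source):
    """Rebuild an image from the FFT modulus of one image and the FFT phase of another."""
    f_mod = np.fft.fft2(modulus_source)
    f_phase = np.fft.fft2(phase_source)
    hybrid = np.abs(f_mod) * np.exp(1j * np.angle(f_phase))
    return np.real(np.fft.ifft2(hybrid))


def correlation(u, v):
    """Pearson correlation between two images, treated as flat vectors."""
    return float(np.corrcoef(np.ravel(u), np.ravel(v))[0, 1])


rng = np.random.default_rng(0)
image_a = rng.standard_normal((64, 64))   # stand-in for the first image
image_b = rng.standard_normal((64, 64))   # stand-in for the second image

hybrid = swap_phase(modulus_source=image_b, phase_source=image_a)
corr_with_phase_source = correlation(hybrid, image_a)
corr_with_modulus_source = correlation(hybrid, image_b)
print(corr_with_phase_source, corr_with_modulus_source)
```

In this toy setting the hybrid image correlates strongly with the image that supplied the phase and hardly at all with the image that supplied the modulus, which mirrors the qualitative observation made for Figure 26.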
Figure 26. (a) Cameraman image. (b) Medical image. (c) Image reconstructed from the
phase of wavelet coefficients of cameraman image and modulus of wavelet coefficients
of medical image. (d) Image reconstructed from the phase of wavelet coefficients of
medical image and modulus of wavelet coefficients of cameraman image.
4.5. Hybrid Directional Filter Bank (HDFB) Modeling
As discussed previously, wavelet-based contourlet transform is non-redundant and can be
adopted for the process of image fusion for better results. However, there is a main
drawback of the contourlet-based transforms, including WBCT, which is the occurrence
of artifacts that are caused by setting some transform coefficients to zero for nonlinear
approximation. In order to reduce the unexpected artifacts, Hybrid Directional Filter
Bank (HDFB) model is employed.
The original Directional filter bank (DFB) decomposes the frequency space into wedge-
shaped partitions as illustrated in Figure 27. In this example, eight directions are used,
where directional subbands of 1, 2, 3, and 4 represent horizontal directions (directions
between -45° and +45°) and the rest stand for the vertical directions (directions between
45° and 135°). The DFB is realized using iterated quincunx filter banks.
For the proposed HDFB, it is required to decompose the input into either horizontal
directions or vertical directions or both. Therefore, it is necessary to explore Vertical
DFB and Horizontal DFB, where one can achieve vertical or horizontal directional
decompositions, respectively. Figure 28 shows the frequency space partitioned by the
Vertical DFB and Horizontal DFB. The implementation of these schemes is
straightforward when we use the iterated tree-structured filter banks to realize the DFB.
At the first level of the DFB, a quincunx filter bank (QFB) is employed as depicted in
Figure 29(a). The quincunx sampling matrix is defined as follows:
$$Q = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \tag{26}$$
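The effect of quincunx downsampling can be visualized by marking which pixels of the input grid are retained. The sketch below assumes the sign convention Q = [[1, 1], [-1, 1]], which is one common choice; other sign placements with |det Q| = 2 generate the same lattice. The function name is hypothetical.

```python
import numpy as np

# Assumed sign convention for the quincunx sampling matrix.
Q = np.array([[1, 1], [-1, 1]])


def quincunx_lattice_mask(size, sampling_matrix):
    """Mark the pixels (r, c) of a size x size grid that lie on the lattice {Q n}."""
    mask = np.zeros((size, size), dtype=bool)
    for n1 in range(-size, size + 1):
        for n2 in range(-size, size + 1):
            r, c = sampling_matrix @ np.array([n1, n2])
            if 0 <= r < size and 0 <= c < size:
                mask[r, c] = True
    return mask


mask = quincunx_lattice_mask(4, Q)
print(mask.astype(int))
```

The retained samples form a checkerboard pattern containing half of the pixels, in agreement with |det Q| = 2, and it is this diagonal arrangement that gives the downsampled image its diamond-shaped support.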
Figure 27. Directional filter bank frequency partitioning using 8 directions.
Figure 28. (a) An example of the vertical directional filter banks. (b) An example of the
horizontal directional filter banks.
Figure 29(b) shows how downsampling by Q affects the input image. The image is
rotated +45° clockwise. Therefore, in the DFB, since this is not a rectangular output, the
image is further decomposed by using two other QFBs at the outputs y0 and y1. As a
result, four outputs corresponding to the four directions of the DFB can be obtained. At
level three and higher, QFBs are employed in conjunction with some resampling matrices
to further decompose the DFB. In the Vertical DFB or Horizontal DFB, however, we stop
at y1 (y0) and decompose the other channel (y0 in Vertical DFB and y1 in Horizontal DFB)
in a similar manner as we decompose the DFB. Therefore, since we keep y1 or y0, we
have to find a way to represent these outputs in a rectangular form.
Assuming periodic filters are used, one can select a rectangular strip of these outputs as
depicted in Figure 29(c). However, for better visualization and possible further
processing of the coefficients in image processing applications such as fusion, we need a
better representation. A solution to this issue is the use of a resampling matrix. During
resampling, the sampling rate of the input image does not change and the samples are
merely reordered. In particular, we find resampling matrices to reorder the samples of y1
or y0 from a diamond shape to a shape of parallelogram. The resampling matrices can be
selected as follows:
1 0
0 1hR
and 1 0
1 1vR
(27)
Applying these resampling operations to the outputs of the QFB, we obtain
parallelogram-shaped outputs as illustrated in Figure 30. Next, we simply shift the
resulting coefficients (column-wise in the case of Rh and row-wise in the case of Rv) to
obtain rectangular outputs. Thus, the resulting overall sampling matrix for representing y1
and y0 is Qh = QRh , or Qv = QRv , where Qh (Qv) in conjunction with a shifting operation
results in a horizontal (vertical) rectangular output.
Figure 29. (a) Quincunx filter bank. H0 and H1 are fan filters and Q is the sampling
matrix. Pass bands are shown by white color in the fan filters. (b) An image
downsampled by Q. (c) A horizontal or vertical strip of the downsampled image.
Figure 30. Applying resampling operations Rh and Rv to an image downsampled by Q.
The right side images show the resulting outputs after shifting the coefficients into a
rectangle box.
4.6. Summary
In Chapter 4, the proposed fusion method is discussed in detail. Source images are first
decomposed using Daubechies Complex Wavelet Transform (DCxWT) in order to realize
multiscale subband decompositions with no redundancy. Next, hybrid directional filter
banks are applied to the frequency coefficients obtained from the previous stage to
achieve angular decompositions with reduced artifacts. The obtained frequency
coefficients are fused together based on a certain fusion algorithm which is discussed in
Chapter 6. The fusion algorithm is different for each category of multimodal image
fusion due to the characteristics of source images. For example, the algorithm used in the
remote sensing image fusion is different from the one used in the medical image fusion.
The resultant fused coefficients are used to reconstruct an image using inverse transform.
As a result, the final fusion result is obtained.
The wavelet-based contourlet transform modeling was discussed first in detail. Next, the
DCxWT was discussed, especially in terms of its advantages and usefulness in the image
fusion process. Lastly, the hybrid directional filter bank modeling was discussed in detail,
especially in terms of its capability in obtaining abundant directional information during
the decomposition process with reduced artifacts.
CHAPTER 5
PRE-PROCESSING OF DATASETS
5.1. Image Registration
5.1.1. Registration Methods
Image registration is one of the necessary pre-processing techniques that significantly
affect the fusion results. Image registration is also called image alignment: its goal
is to align the input images as closely as possible in order to produce the best
fusion results. If the input image datasets are not aligned to each other, it is impossible
to obtain good fusion results even if the fusion framework, scheme and algorithm are
optimal. Therefore, it is necessary to align or register the input images as closely as
possible prior to the main fusion process. In this section, various image registration
methods that are used in the image fusion process are discussed.
A. Cross-correlation Method
The classical representative of the area-based methods is the normalized cross-correlation
(CC) and its modifications [130].
$$CC(i, j) = \frac{\sum_W \big(W - E(W)\big)\big(I_{(i,j)} - E(I_{(i,j)})\big)}{\sqrt{\sum_W \big(W - E(W)\big)^2} \, \sqrt{\sum_{I_{(i,j)}} \big(I_{(i,j)} - E(I_{(i,j)})\big)^2}} \tag{28}$$
This measure of similarity is computed for window pairs from the sensed and reference
images and its maximum is searched. The window pairs for which the maximum is
achieved are set as the corresponding ones. If the subpixel accuracy of the registration is
demanded, the interpolation of the CC measure values needs to be used. Although the CC
based registration can exactly align mutually translated images only, it can also be
successfully applied when slight rotation and scaling are present.
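The window search described above can be sketched as follows. This is a minimal, translation-only illustration of normalized cross-correlation matching with integer-pixel accuracy; the function names and the synthetic test data are assumptions made for the example.

```python
import numpy as np


def normalized_cc(window, patch):
    """Normalized cross-correlation between a window and an equally sized image patch."""
    w = window - window.mean()
    p = patch - patch.mean()
    denom = np.sqrt(np.sum(w ** 2) * np.sum(p ** 2))
    return 0.0 if denom == 0 else float(np.sum(w * p) / denom)


def best_match(window, image):
    """Slide the window over the image and return the position with the maximum CC."""
    wh, ww = window.shape
    best_pos, best_score = None, -np.inf
    for i in range(image.shape[0] - wh + 1):
        for j in range(image.shape[1] - ww + 1):
            score = normalized_cc(window, image[i:i + wh, j:j + ww])
            if score > best_score:
                best_pos, best_score = (i, j), score
    return best_pos, best_score


rng = np.random.default_rng(0)
reference = rng.random((20, 20))
# Window cut from the reference, with a linear change in brightness and contrast.
sensed_window = 2.0 * reference[5:10, 7:12] + 3.0
position, score = best_match(sensed_window, reference)
print(position, score)
```

Because the measure subtracts the means and normalizes by the standard deviations, the match is found despite the linear intensity change applied to the window.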
There are generalized versions of CC for geometrically more deformed images. They
compute the CC for each assumed geometric transformation of the sensed image window
[131] and are able to handle geometric deformations more complicated than
translation, usually the similarity transform. Berthilsson [132] tried to register in this
manner even finely deformed images and Simper [133] proposed to use a divide and
conquer system and the CC technique for registering images differing by perspective
changes as well as changes due to the lens imperfections. The computational load,
however, grows very fast with the increase of the transformation complexity. In case the
images/objects to be registered are partially occluded, the extended CC method based on
increment sign correlation can be applied [134].
Similar to the CC methods is the sequential similarity detection algorithm (SSDA) [135].
It uses the sequential search approach and a computationally simpler distance measure
than the CC. It accumulates the sum of absolute differences of the image intensity values
and applies the threshold criterion—if the accumulated sum exceeds the given threshold,
the candidate pair of windows from the reference and sensed images is rejected and the
next pair is tested. The method is likely to be less accurate than the CC but it is faster.
Sum of squared differences similarity measure was used in [136] for iterative estimation
of perspective deformation using piecewise affine estimates for image decomposed to
small patches.
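A simplified sketch of the sequential rejection idea behind the SSDA is given below. It accumulates absolute differences in raster order and abandons a candidate window as soon as the running sum exceeds a fixed threshold; the fixed threshold, the raster ordering and the function name are simplifying assumptions for illustration rather than a faithful reproduction of [135].

```python
import numpy as np


def ssda_match(window, image, threshold):
    """Return (position, error) of the best window position that survives the threshold test."""
    window = np.asarray(window, dtype=float)
    image = np.asarray(image, dtype=float)
    wh, ww = window.shape
    best_pos, best_error = None, np.inf
    for i in range(image.shape[0] - wh + 1):
        for j in range(image.shape[1] - ww + 1):
            patch = image[i:i + wh, j:j + ww]
            total = 0.0
            for w_val, p_val in zip(window.ravel(), patch.ravel()):
                total += abs(w_val - p_val)
                if total > threshold:
                    break  # reject this candidate early and test the next one
            if total <= threshold and total < best_error:
                best_pos, best_error = (i, j), total
    return best_pos, best_error


rng = np.random.default_rng(1)
reference = rng.random((12, 12))
sensed_window = reference[3:8, 4:9].copy()
position, error = ssda_match(sensed_window, reference, threshold=1.0)
print(position, error)
```

The early exit is what makes the method cheaper than the CC: most candidate windows are discarded after only a few pixel comparisons.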
Two main drawbacks of the correlation-like methods are the flatness of the similarity
measure maxima (due to the self-similarity of the images) and high computational
complexity. The maximum can be sharpened by preprocessing or by using the edge or
vector correlation. Pratt [137] applied, prior to the registration, image filtering to improve
the CC performance on noisy or highly correlated images. Van Wie [138] and Anuta [139]
employed the edge-based correlation, which is computed on the edges extracted from the
images rather than on the original images themselves. In this way, the method is less
sensitive to intensity differences between the reference and sensed images, too. Extension
of this approach, called vector-based correlation, computes the similarity measures using
various representations of the window.
Despite the limitations mentioned above, the correlation-like registration methods are still
often in use, particularly thanks to their easy hardware implementation, which makes
them useful for real-time applications.
B. Mutual Information Method
The mutual information (MI) methods are a more recent development and represent the
leading technique in multimodal registration. Registration of multimodal images is a
difficult task, but one that often has to be solved, especially in medical imaging. The
comparison of anatomical and functional images of the patient’s body can lead to a
diagnosis that would be impossible to obtain otherwise. Remote sensing often exploits
multiple sensor types as well.
The MI, originating from the information theory, is a measure of statistical dependency
between two data sets and it is particularly suitable for registration of images from
different modalities. MI between two random variables X and Y is given by:
$$MI(X, Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(X, Y), \tag{29}$$
where $H(X) = -E_X(\log P(X))$ represents the entropy of a random variable and $P(X)$ is
the probability distribution of $X$. The method is based on the maximization of MI. The
registration is often sped up by exploiting a coarse-to-fine resolution strategy (the
pyramidal approach).
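Equation (29) can be estimated from a joint intensity histogram. The sketch below is one simple estimator, assuming a fixed number of histogram bins and base-2 logarithms; the function name and the synthetic images are illustrative only.

```python
import numpy as np


def mutual_information(x, y, bins=32):
    """Estimate MI(X, Y) = H(X) + H(Y) - H(X, Y) in bits from a joint histogram."""
    joint, _, _ = np.histogram2d(np.ravel(x), np.ravel(y), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)

    def entropy(p):
        p = p[p > 0]                      # ignore empty bins (0 log 0 = 0)
        return float(-np.sum(p * np.log2(p)))

    return entropy(px) + entropy(py) - entropy(pxy.ravel())


rng = np.random.default_rng(0)
image = rng.random((64, 64))
inverted = 1.0 - image                    # same structure, different intensity mapping
unrelated = rng.random((64, 64))

mi_aligned = mutual_information(image, inverted)
mi_unrelated = mutual_information(image, unrelated)
print(mi_aligned, mi_unrelated)
```

The inverted image has no linear intensity agreement with the original, yet its MI with the original is high because one image is still predictable from the other. This is the property that makes MI suitable for images from different modalities.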
C. Feature-based Method using Spatial Relations
Methods based primarily on the spatial relations among the features are usually applied if
detected features are ambiguous or if their neighborhoods are locally distorted. The
information about the distance between the control points (CPs) and about their spatial
distribution is exploited. Goshtasby [140] described the registration based on the graph
matching algorithm. He was evaluating the number of features in the sensed image that,
after the particular transformation, fall within a given range next to the features in the
reference image. The transformation parameters with the highest score were then set as a
valid estimate.
Clustering technique, presented by Stockman et al. [141], tries to match points connected
by abstract edges or line segments. The assumed geometrical model is the similarity
transform. For every pair of CPs from both the reference and sensed images, the
parameters of the transformation which maps the points on each other are computed and
represented as a point in the space of transform parameters. The parameters of
transformations that closely map the highest number of features tend to form a cluster,
while mismatches fill the parameter space randomly. The cluster is detected and its
centroid is assumed to represent the most probable vector of matching parameters.
Mapping function parameters are thus found simultaneously with the feature
correspondence. Local errors do not influence globally the registration process. The
clustering technique was implemented in [142].
D. Feature-based Method using Invariant Descriptors
As an alternative to the methods exploiting the spatial relations, the correspondence of
features can be estimated using their description, preferably invariant to the expected
image deformation. The description should fulfill several conditions. The most important
ones are invariance (the descriptions of the corresponding features from the reference and
sensed image have to be the same), uniqueness (two different features should have
different descriptions), stability (the description of a feature which is slightly deformed in
an unknown manner should be close to the description of the original feature), and
independence (if the feature description is a vector, its elements should be functionally
independent). However, usually not all these conditions have to (or can) be satisfied
simultaneously and it is necessary to find an appropriate trade-off.
Features from the sensed and reference images with the most similar invariant
descriptions are paired as the corresponding ones. The choice of the type of the invariant
description depends on the feature characteristics and the assumed geometric deformation
of the images. While searching for the best matching feature pairs in the space of feature
descriptors, the minimum distance rule with thresholding is usually applied. If a more
robust algorithm is needed, the matching likelihood coefficients [143], which can better
handle questionable situations, can be an appropriate solution. Guest et al. proposed to
select features according to the reliability of their possible matches [144]. The simplest
feature description is the image intensity function itself, limited to the close
neighborhood of the feature. To estimate the feature correspondence, authors computed
the CC on these neighborhoods. Other types of similarity measures can be used as well.
Zheng and Chellappa make use of the correlation coefficients [145]. They assumed a
similarity geometric deformation. In their approach, the rotation between the images
was first compensated by estimating the illuminant direction, and then coarse-to-fine
correlation-based registration was performed.
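The minimum distance rule with thresholding mentioned above can be sketched as follows. The descriptors are assumed to be plain numeric vectors compared with the Euclidean distance, and the function name and threshold value are hypothetical.

```python
import numpy as np


def match_descriptors(desc_ref, desc_sensed, max_distance):
    """Pair each sensed feature with its nearest reference feature if it is close enough.

    Returns a list of (sensed_index, reference_index) pairs.
    """
    desc_ref = np.asarray(desc_ref, dtype=float)
    desc_sensed = np.asarray(desc_sensed, dtype=float)
    # Pairwise Euclidean distances, shape (num_sensed, num_ref).
    distances = np.linalg.norm(desc_sensed[:, None, :] - desc_ref[None, :, :], axis=2)
    nearest = distances.argmin(axis=1)
    return [(i, int(j)) for i, j in enumerate(nearest) if distances[i, j] <= max_distance]


reference_desc = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
sensed_desc = [[0.1, 0.0], [9.8, 0.1], [50.0, 50.0]]   # the last feature has no counterpart
matches = match_descriptors(reference_desc, sensed_desc, max_distance=1.0)
print(matches)
```

The threshold discards features whose nearest neighbor is still far away in descriptor space, which is the simplest safeguard against forcing a match for features that appear in only one image.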
5.1.2. Performance Evaluations
A. Fusion Framework
The fusion framework used in the experiments is shown in Figure 31. First, source
images are decomposed into multi-scale and multi-directional components using
contourlet transform, and these components are fused together based on a certain fusion
scheme. Next, inverse contourlet transform is performed in order to obtain a final fused
image.
Figure 31. Fusion framework.
B. Fusion Scheme
The source images are fused according to the fusion scheme and fusion rule that are
described as follows:
Figure 32. Fusion scheme.
1) The source images are decomposed using contourlet transform in order to
obtain multi-scale or multi-directional frequency coefficients. For each
decomposition level K, K approximation subband and 3K detail subbands are
produced. In our experiments, decomposition level of 3 was used since the
level beyond 3 significantly degraded the fusion performance.
2) Once the source images are decomposed, high frequency components are
selected from the PAN source image and then injected into detail subbands of
the MS source image via maximum frequency fusion rule which compares
and selects the frequency coefficient with the highest absolute value at each
pixel.
3) The inverse contourlet transform is performed to obtain the final fusion image.
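The maximum frequency fusion rule in step 2 can be sketched independently of the transform. The example below operates on subband arrays that are assumed to have been produced already by some multi-scale, multi-directional decomposition; the dictionary layout with `"approx"` and `"details"` keys is hypothetical, and taking the approximation subband from the MS image is an assumption inferred from the description of injecting PAN details into the MS image.

```python
import numpy as np


def fuse_max_abs(coeff_a, coeff_b):
    """At each position, keep the coefficient with the larger absolute value."""
    a = np.asarray(coeff_a, dtype=float)
    b = np.asarray(coeff_b, dtype=float)
    return np.where(np.abs(a) >= np.abs(b), a, b)


def fuse_subbands(ms_subbands, pan_subbands):
    """Fuse detail subbands with the max-abs rule; keep the MS approximation (assumed)."""
    fused_details = [
        fuse_max_abs(ms_detail, pan_detail)
        for ms_detail, pan_detail in zip(ms_subbands["details"], pan_subbands["details"])
    ]
    return {"approx": np.asarray(ms_subbands["approx"], dtype=float), "details": fused_details}


ms = {"approx": [[1.0, 2.0], [3.0, 4.0]], "details": [[[1.0, -5.0], [0.5, 2.0]]]}
pan = {"approx": [[9.0, 9.0], [9.0, 9.0]], "details": [[[-3.0, 4.0], [0.2, -2.0]]]}
fused = fuse_subbands(ms, pan)
print(fused["details"][0])
```

The fused subbands would then be passed to the inverse transform of step 3 to obtain the fused image.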
C. Performance Quality Metrics
Specific criteria are necessary to measure and evaluate the performance of the fusion
results. Qualitative and quantitative inspections are two major means of evaluation.
However, the quantitative approach is adopted in our study because the qualitative
approach is often influenced by subjective factors such as personal preferences and
eyesight.
For the quantitative approach, we employed correlation coefficient (CC), relative average
spectral error (RASE) and spectral angle mapper (SAM) for spectral analysis, and
average gradient (AG), universal image quality index (UIQI) and signal to noise ratio
(SNR) for spatial analysis. Performance quality metrics are discussed in detail in Chapter
6 for each fusion category.
D. Experimental Results
Figure 33. Two original MS images. (a) Dataset 1. (b) Dataset 2.
Figure 34. Fusion results of four different registration methods using Dataset 1. (a) CC.
(b) MI. (c) Spatial relations (SR). (d) Invariant Descriptors (ID).
Figure 35. Fusion results of four different registration methods using Dataset 2. (a) CC.
(b) MI. (c) Spatial relations (SR). (d) Invariant Descriptors (ID).
Table 3. A comparison of fusion results using performance quality metrics – Dataset 1.
Pre-processing   Spectral Analysis            Spatial Analysis
Method           CC      RASE     SAM         AG       UIQI    SNR
CC               0.764   44.965   0.314       28.583   0.539   66.475
MI               0.924   44.438   0.226       30.424   0.916   70.732
SR               0.885   44.587   0.263       29.236   0.721   69.139
ID               0.853   44.781   0.285       29.021   0.682   68.317
Table 4. A comparison of fusion results using performance quality metrics – Dataset 2.
Pre-processing   Spectral Analysis            Spatial Analysis
Method           CC      RASE     SAM         AG       UIQI    SNR
CC               0.646   47.659   0.427       26.835   0.438   63.745
MI               0.918   46.184   0.236       30.542   0.902   69.332
SR               0.852   46.386   0.337       29.268   0.710   66.979
ID               0.734   46.817   0.365       28.421   0.645   65.143
5.2. Band Selection
5.2.1. Similarity-based Band Selection
In order to select the distinctive bands, i.e., the most dissimilar bands, a similarity
metric such as distance or correlation needs to be designated. Such a measurement is
usually taken on each pair of bands; in our study, however, band similarity is evaluated
jointly rather than pairwise. The band selection algorithm using the concept of endmember
extraction has
this property. In addition, due to the large number of original bands, the exhaustive
search for optimal band combinations is computationally prohibitive. The sequential
forward search can save significant computation time [146]. It begins with the best two
band combination. Next, this two-band combination is subsequently augmented to three,
four, and more until the desired number of bands is selected. The band selection
algorithm using the endmember extraction concept adopts the sequential forward search
strategy [147]. Another advantage is that it is less dependent on the number of bands to
be selected because those bands that are already being selected do not change with this
value; increasing this value simply means to continue the algorithm execution with the
bands being selected, whereas decreasing this number simply means to keep enough
bands from the selected band subset (starting with the first selected band) as the final
result. This is different from the algorithms in the parallel mode, such as the NFINDR-
based band selection method, where the number of bands to be selected must be
determined before the band selection process, and the change of this number results in the
re-execution of the entire band selection process [147].
The basic steps of the band selection can be described as follows:
1. Initialize the algorithm by choosing a pair of bands B1 and B2. Then, the resulting
selected band subset is Φ = {B1, B2}.
2. Find the third band B3 that is the most dissimilar to all the bands in the current Φ
by using a certain criterion. Then, the selected band subset is updated as
Φ = Φ ∪ {B3}.
3. Continue with Step 2 until the number of bands in Φ is large enough.
A straightforward criterion that can be employed for the similarity comparison is Linear
Prediction (LP), which can jointly evaluate the similarity between a single band and
multiple bands. The concept in the LP-based band selection was originally used in the
unsupervised fully constrained least squares linear unmixing (UFCLSLU) for endmember
pixel selection in [148], which means that a pixel with the maximum reconstruction error,
using the linear combination of existing endmember pixels, is the most distinctive pixel.
The difference here is that, for band selection, there is no constraint imposed on the
coefficients of linear combination [147].
Linear Prediction (LP) Method can be described as follows:
Assume that there are two bands $B_1$ and $B_2$ in Φ with $N$ pixels each. $B_1$ and $B_2$
are used to estimate a third band $B$, i.e.,

$$a_0 + a_1 B_1 + a_2 B_2 = B', \tag{30}$$

where $B'$ is the estimate or linear prediction of band $B$ using $B_1$ and $B_2$, and $a_0$,
$a_1$ and $a_2$ are the parameters that minimize the linear prediction error
$e = \|B - B'\|$. Let the parameter vector be $a = (a_0, a_1, a_2)^T$. It can be determined
using a least squares solution,

$$a = (X^T X)^{-1} X^T y, \tag{31}$$

where $X$ is an $N \times 3$ matrix whose first column is all ones, whose second column
includes all the $N$ pixels in $B_1$, and whose third column includes all the pixels in
$B_2$, and $y$ is an $N \times 1$ vector with all the pixels in $B$. The band that yields
the maximum value of the minimized error $e_{\min}$ (using the optimal parameters in $a$)
is considered the most dissimilar band to $B_1$ and $B_2$, and is selected as $B_3$ for Φ.
A similar procedure can easily be conducted when the number of bands in Φ is larger than
two [147].
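The LP-based sequential forward search can be sketched as follows. The code solves the least squares problem of Eq. (31) with `numpy.linalg.lstsq`, which gives the same solution as the explicit normal-equation formula when X has full column rank. The function names, the `(bands, rows, cols)` array layout and the synthetic data are assumptions for this example, and the choice of the initial band pair is left to the caller.

```python
import numpy as np


def lp_error(selected_bands, candidate):
    """Minimum linear prediction error of `candidate` from the selected bands plus a constant."""
    y = np.ravel(candidate).astype(float)
    columns = [np.ones(y.size)] + [np.ravel(band).astype(float) for band in selected_bands]
    X = np.column_stack(columns)
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.linalg.norm(y - X @ a))


def select_bands(cube, initial_pair, num_bands):
    """Sequential forward search: repeatedly add the band that is predicted worst."""
    selected = list(initial_pair)
    while len(selected) < num_bands:
        remaining = [i for i in range(cube.shape[0]) if i not in selected]
        errors = {i: lp_error([cube[s] for s in selected], cube[i]) for i in remaining}
        selected.append(max(errors, key=errors.get))
    return selected


rng = np.random.default_rng(0)
band0 = rng.random((8, 8))
band1 = rng.random((8, 8))
band2 = 2.0 * band0 - band1 + 1.0     # fully predictable from bands 0 and 1
band3 = rng.random((8, 8))            # carries new information
cube = np.stack([band0, band1, band2, band3])

chosen = select_bands(cube, initial_pair=(0, 1), num_bands=3)
print(chosen)
```

In this toy cube the redundant band is predicted with essentially zero error and is therefore skipped in favor of the band that cannot be explained by the bands already selected.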
5.2.2. Performance Evaluations
In these performance evaluations, two different datasets are used: i) multispectral (MS)
and panchromatic (PAN) images, and ii) hyperspectral (HS) and panchromatic (PAN)
images. Each dataset is explained more in detail in the following sub-sections. Moreover,
the following sub-sections explain how the pre-processing is performed over the source
images prior to the image fusion.
In order to quantitatively analyze the fusion results, we employ various quality metrics,
and they can be classified into two categories: i) spectral analysis and ii) spatial analysis.
Correlation coefficient (CC), relative average spectral error (RASE) and spectral angle
mapper (SAM) are used for spectral analysis. For spatial analysis, we employ average
gradient (AG), universal image quality index (UIQI) and signal-to-noise ratio (SNR).
A. Multispectral and Panchromatic Images
The first dataset was downloaded from [149]. This is a set of 4m-MS and 1m-PAN
images of the city of Fredericton, Canada which were acquired by the commercial
satellite IKONOS. In order to obtain a PAN source image, the original multispectral
image is spectrally integrated over the entire spectral range. The final result is a
synthesized panchromatic image that is perfectly registered, i.e., aligned to the
multispectral image [150]. Next, we perform the band reduction on the original MS
image to create a new band-reduced MS image. Fusion is performed over two different
pairs of datasets: i) Band-reduced MS + PAN and ii) Original MS + PAN. The fusion
results are quantitatively analyzed, both spectrally and spatially, using six different
quality metrics. Comparing each metric across the two cases shows how similar the two fusion
results are. If the difference between the fusion results is small, then the band reduction
is verified to be effective in the fusion process.
Figure 36. Source images that are used in the fusion. (a) PAN image. (b) Original MS
image. (c) Band-reduced MS image.
Fusion of the above source images is performed using the contourlet transform (CT) and the
fusion scheme discussed in the previous section. The fused result is analyzed using the
quality metrics discussed earlier. Table 5 shows the fusion results of two pairs of
datasets: i) Band-reduced MS + PAN and ii) Original MS + PAN.
Figure 37. Fusion results. (a) Band-reduced MS + PAN. (b) Original MS + PAN.
Table 5. A performance comparison using quality assessment metrics.

Datasets                 Spectral Analysis            Spatial Analysis
                         CC      RASE     SAM         AG       UIQI    SNR
Band-reduced MS + PAN    0.869   43.352   0.257       27.975   0.641   65.274
Original MS + PAN        0.783   41.171   0.286       25.593   0.747   67.149
B. Hyperspectral and Panchromatic Images
The second dataset is a hyperspectral image from the MultiSpec© homepage maintained by
Purdue University [151]. In order to obtain a panchromatic source image, the
original hyperspectral image is spectrally integrated over the entire spectral range. The
final result is a synthesized panchromatic image that can be used as the second source
image [150]. By doing this, we can obtain two perfectly co-registered source images
without going through the registration process. Figure 38 shows the source images that
are used in the fusion, and Figure 39 shows the fusion results of two different pairs of
datasets: (a) Band-reduced MS + PAN and (b) Original MS + PAN respectively.
Figure 38. Source images that are used in the fusion. (a) PAN image. (b) Original MS
image. (c) Band-reduced MS image.
Figure 39. Fusion results. (a) Band-reduced MS + PAN. (b) Original MS + PAN.
Table 6. A performance comparison using quality assessment metrics.

Fusion Method            Spectral Analysis            Spatial Analysis
                         CC      RASE     SAM         AG       UIQI    SNR
Band-reduced MS + PAN    0.718   43.645   0.347       31.216   0.537   71.635
Original MS + PAN        0.667   42.278   0.365       29.572   0.624   72.835
As the above comparative analyses show, despite the band reduction performed on the
original spectral images, the fusion results obtained with the band-reduced MS/HS images
and with the original MS/HS images are very similar in terms of spatial resolution.
Moreover, the band reduction does not noticeably degrade the fusion performance in terms
of spectral resolution; it even improved the performance in terms of CC and SAM, which
represent the spectral similarity between the fusion results of the band-reduced spectral
image fusion and the original spectral image fusion. This better performance was possible
because the redundancy in the correlation of the spectral bands was removed during the
band reduction process. Therefore, the spectral bands used in the fusion are less
correlated with each other and represent only useful spectral information from each
selected band. As a result, the fusion results offer a higher spatial resolution with
reduced data volume, while preserving the spectral information. This fusion method, which
combines band reduction with the contourlet transform, would be suitable for data
processing, transmission and storage.
5.3. Decomposition Level
5.3.1. Decomposition of Datasets
Decomposition of the input images is a very important pre-processing step where the
original input images are decomposed into multiple subbands with frequency components
which have capabilities as follows:
1. Detect discontinuities at edge points and smoothness along the contours.
2. Represent salient image features, such as lines and curves.
3. Capture smooth contours and geometric structures in images.
All experiments in this section use a wavelet transform with “9-7” biorthogonal filters
[152], [153] and 3 decomposition levels. For the contourlet transform, in the LP stage, the
“9-7” filters are selected again. The “9-7” biorthogonal filters are chosen because they
have been proven to provide the best results for images, partly because they are linear
phase and are close to being orthogonal. In the DFB stage, the “23-45” biorthogonal
quincunx filters designed by Phoong et al. [154] are selected and modulated to obtain the
biorthogonal fan filters. Besides being linear phase and nearly orthogonal, these fan
filters are close to having the ideal frequency response and thus can approximate the
directional vanishing moment condition. The drawback is that they have large support,
which creates a large number of significant coefficients near edges. The number of DFB
decomposition levels is doubled at every other finer scale and is equal to 5 at the finest
scale. Note that in this case, both the wavelet and the contourlet transforms share the
same detail subspaces Wj. The difference is that each detail subspace Wj in the wavelet
transform is represented by a basis with three directions, whereas in the contourlet
transform it is represented by a redundant frame with many more directions. Only
contourlets that match both the location and the direction of image contours produce
significant coefficients.
5.3.2. Performance Evaluations
In order to evaluate the fusion performance and the effects of the decomposition, we
applied the same fusion framework and scheme as in the previous section to the remote
sensing images. However, we varied the level of decomposition in each fusion process to
observe how the decomposition level affects the fusion performance.
Two sets of hyperspectral images are employed, and the following tables show that a
decomposition level of 3 produces the best fusion results, whereas decomposition levels
beyond 3 do not necessarily provide better fusion results.
Figure 40. Original HS image and two synthesized source images using Dataset 1.
(a) Original (reference) HS image. (b) Synthesized PAN source image. (c) Synthesized
HS source image.
Figure 41. Original HS image and two synthesized source images using Dataset 2.
(a) Original (reference) HS image. (b) Synthesized PAN source image. (c) Synthesized
HS source image.
Table 7. A comparison of the fusion results with different levels of decomposition –
Dataset 1.

Decomposition   Spectral Analysis                    Spatial Analysis
Level           CC      RASE     SAM     SID         E       UIQI    SNR      AG
Level 1         0.746   47.357   0.256   0.234       6.225   0.632   69.463   5.926
Level 2         0.823   45.264   0.232   0.193       6.451   0.734   70.773   6.132
Level 3         0.875   43.711   0.212   0.157       6.731   0.832   71.837   6.375
Level 4         0.819   46.925   0.247   0.221       6.158   0.779   68.962   6.021
Table 8. A comparison of the fusion results with different levels of decomposition –
Dataset 2.

Decomposition   Spectral Analysis                    Spatial Analysis
Level           CC      RASE     SAM     SID         E       UIQI    SNR      AG
Level 1         0.746   47.357   0.256   0.234       6.225   0.632   69.463   5.926
Level 2         0.823   45.264   0.232   0.193       6.451   0.734   70.773   6.132
Level 3         0.875   43.711   0.212   0.157       6.731   0.832   71.837   6.375
Level 4         0.819   46.925   0.247   0.221       6.158   0.779   68.962   6.021
5.4. Conclusion
In Chapter 5, three main techniques for pre-processing the datasets were discussed in
detail. Image registration is necessary in multimodal image fusion because the source
datasets may not be registered (aligned) to each other. With misaligned sources, the
fusion results are worse than expected even when the employed fusion method is optimal.
Therefore, it is very important to begin with the registration of the datasets in order
to achieve the best fusion results. The performance evaluation verified that, among the
registration methods considered, MI is the most suitable for use in the fusion because it
aligns the source images well and thus gives the best fusion results.
Band selection is also an important pre-processing step that needs to be taken prior to the
multimodal image fusion, especially in terms of remote sensing image fusion. Spectral
images are often very large to process, transfer and store; hence, it is necessary to
perform the band selection technique in order to reduce the size of the datasets and the
computational load/time, while preserving the useful information as much as possible.
From the experimental results, it is verified that the band selection does not noticeably
affect the fusion performance in terms of spatial resolution and that it improves
the performance in terms of spectral resolution. This better performance is possible
because the redundancy in the correlation of the spectral bands is removed during the
band reduction process. Therefore, the spectral bands that are used in the fusion are less
correlated to each other and represent only useful spectral information from each selected
band. As a result, it is a good solution to use this band selection technique in processing,
transferring and storing large data like remote sensing images.
Another important pre-processing step is the decomposition. More importantly, it is
necessary to understand up to what level the source datasets need to be decomposed in
order to achieve desirable fusion results. As can be seen from the performance
evaluations, decomposition level of 3 produced the best fusion results and the levels
beyond 3 did not necessarily improve the fusion results.
CHAPTER 6
EXPERIMENTAL STUDY AND ANALYSIS
Experimental study and analysis are conducted in this chapter for five different categories
of multimodal image fusion: i) Remote sensing image fusion, ii) Medical image fusion,
iii) Infrared image fusion, iv) Radar image fusion, and v) Multi-focus image fusion. The
only common component in the experiments is the fusion framework which is based on
the hybrid wavelet-based contourlet transform (HWCT) framework and this is discussed
in Chapter 4 in detail. For each fusion category, a different fusion algorithm is proposed
which is suitable for that specific category. Moreover, necessary performance quality
metrics are carefully chosen for each fusion category to conduct the best assessment on
the fusion results, and various pre-processing techniques are applied to the source
datasets in each fusion category that are necessary to correctly prepare the source images
for the main fusion process. The performance of the proposed fusion method is evaluated
via comparative analyses against the conventional wavelet transform and the contourlet
transform.
6.1. Remote Sensing Image Fusion
In the remote sensing field, the color information is provided by three sensors covering
the red, green and blue spectral wavelengths. These sensors have a low number of pixels
(low spatial resolution), so small objects and details (cars, small lines, etc.) are hard
to see. Such small objects and details can be observed with a different sensor
(panchromatic), which has a high number of pixels (high spatial resolution) but provides
no color information. With a fusion process, a single image can be obtained that contains
both high spatial resolution and color information.
6.1.1. Experimental Study and Analysis
A. Fusion Algorithm
The source images first go through both multiscale and multidirectional decomposition
stages using the hybrid wavelet-based contourlet transform (HWCT) framework (see
Chapter 4), and then these decomposed images are fused based on a certain fusion
scheme. Since the low frequency parts include mostly the background information, we
need weighted-average operators. In other words, the low frequency coefficients are
weight-averaged. On the other hand, the high frequency parts include mostly the image
representation information, such as edge and texture information. A pixel is usually
strongly correlated with its neighboring pixels. Hence, the fusion scheme should take
into account the region centered on each pixel: it computes the region energy of the
center pixel and its neighboring pixels. The expression of region energy is as follows:
$$E_A(i,j) = \sum_{(m,n)\in w} \omega(m,n)\, f_A^2(m,n) \tag{32}$$
$$E_B(i,j) = \sum_{(m,n)\in w} \omega(m,n)\, f_B^2(m,n) \tag{33}$$
where $i-k \le m \le i+k$, $j-k \le n \le j+k$, the width of the region is $w$ ($w = 2k+1$),
$\omega(m,n)$ is the weighted value and $f(m,n)$ is the pixel gray value.
The region energy is larger when the image features are salient; hence, the region energy
for each pixel is compared and high frequency coefficients of the pixel with larger region
energy are selected to be used as high frequency coefficients of the fused image. This
way, the salient image features, such as edge and texture information can be preserved.
Based on the above discussion, the region-based measure fusion operators are used in the
experiments. First, the region is selected using a 3×3 window. Second, if the difference of
the region energy of pixels from the source images is large, select high frequency
coefficients of the pixel with larger region energy. Otherwise, perform weighted-average
over the pixels’ high frequency coefficients as follows:
$$F(i,j) = \begin{cases} A(i,j), & E_A(i,j) \ge E_B(i,j),\ R_{AB}(i,j) < T \\ B(i,j), & E_A(i,j) < E_B(i,j),\ R_{AB}(i,j) < T \\ \omega_1 A(i,j) + \omega_2 B(i,j), & R_{AB}(i,j) \ge T \end{cases} \tag{34}$$
where T is the energy match degree threshold and $\omega_1 + \omega_2 = 1$. The energy-match
degree $R_{AB}(i,j)$ is as follows:
$$R_{AB}(i,j) = \frac{2 \sum_{(m,n)\in w} \omega(m,n)\, f_A(m,n)\, f_B(m,n)}{E_A(i,j) + E_B(i,j)} \tag{35}$$
where the value of $R_{AB}(i,j)$ is between 0 and 1.
According to this fusion scheme, the high frequency coefficients of the salient feature
region can be preserved. Lastly, the inverse transform is performed to reconstruct the
fused image.
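The high frequency rule of Eqs. (32) to (35) can be sketched for a single pair of subbands as follows. This is a minimal illustration under several assumptions that the text does not fix: uniform weights ω(m,n) = 1 over the 3×3 window, replicated borders, a threshold T = 0.7 and equal averaging weights ω1 = ω2 = 0.5. The names `window_sum` and `fuse_highpass` are hypothetical.

```python
import numpy as np

def window_sum(img, weights):
    """Weighted sum over the (2k+1)x(2k+1) neighbourhood of every pixel (edges replicated)."""
    k = weights.shape[0] // 2
    padded = np.pad(img, k, mode="edge")
    rows, cols = img.shape
    out = np.zeros(img.shape, dtype=float)
    for dm in range(weights.shape[0]):
        for dn in range(weights.shape[1]):
            out += weights[dm, dn] * padded[dm:dm + rows, dn:dn + cols]
    return out

def fuse_highpass(A, B, T=0.7, w1=0.5, weights=None):
    """Fuse two high frequency subbands A and B following Eqs. (32)-(35)."""
    if weights is None:
        weights = np.ones((3, 3))          # assumed uniform weights
    A = A.astype(float)
    B = B.astype(float)
    EA = window_sum(A * A, weights)        # Eq. (32)
    EB = window_sum(B * B, weights)        # Eq. (33)
    denom = EA + EB
    # Eq. (35); the match degree is set to 0 where both energies vanish.
    R = np.divide(2.0 * window_sum(A * B, weights), denom,
                  out=np.zeros_like(denom), where=denom > 0)
    selected = np.where(EA >= EB, A, B)    # keep the coefficient with larger energy
    averaged = w1 * A + (1.0 - w1) * B     # weighted average, w1 + w2 = 1
    return np.where(R < T, selected, averaged)  # Eq. (34)
```

In the full method this rule would be applied to every directional subband before the inverse transform.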
B. Performance Quality Metrics
Specific criteria are necessary to measure and evaluate the performance of the fusion
results. Qualitative and quantitative inspections are two major means of evaluation.
However, the quantitative approach is adopted in our study because the qualitative
approach is often influenced by subjective factors, such as personal preferences and
eyesight.
For the quantitative approach to evaluate the performance of the remote sensing image
fusion, we employed correlation coefficient (CC), relative average spectral error (RASE)
and spectral angle mapper (SAM) for spectral analysis, and average gradient (AG),
universal image quality index (UIQI) and signal-to-noise ratio (SNR) for spatial analysis.
a) Correlation coefficient (CC) is as follows:
$$CC = \frac{\sigma^2_{A,B}}{\sigma_A\,\sigma_B} \tag{36}$$
where $\sigma^2_{A,B}$ is the covariance between images A and B, and $\sigma_A$, $\sigma_B$
are the standard deviations of each image. When two images are similar to each other, CC
is close to 1; if the two images are identical, CC is equal to 1. In other words, the
closer the CC between the fusion result and the reference image is to 1, the higher the
fusion performance.
b) Relative Average Spectral Error (RASE) is as follows:
$$RASE = \frac{1}{\mu}\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(DM^2(R_i) + SSD^2(R_i)\right)} \tag{37}$$
where $\mu$ is the mean radiance of the N spectral bands of the reference image, DM is
the difference between the means of the reference and the fused images, SSD is the
squared sum of intensity differences, and $R_i$ is the i-th band of the reference image
(i = 1, …, N).
RASE characterizes the average performance of the method for all bands. The lower the
RASE, the better the spectral quality of the fused image.
c) Spectral Angle Mapper (SAM) is as follows:
$$SAM(v, \hat{v}) = \arccos\left(\frac{\langle v, \hat{v}\rangle}{\lVert v\rVert_2\,\lVert \hat{v}\rVert_2}\right) \tag{38}$$
where $v$ and $\hat{v}$ are two spectral vectors, both having L components:
$v = \{v_1, \ldots, v_L\}$ is the original spectral vector and
$\hat{v} = \{\hat{v}_1, \ldots, \hat{v}_L\}$ is the distorted vector obtained by applying
fusion to the spectral data. When two images are similar, SAM is closer to 0.
d) Average Gradient (AG) is as follows:
$$AG = \frac{1}{KL}\sum_{m=1}^{K}\sum_{n=1}^{L}\sqrt{\frac{1}{2}\left[\left(\frac{\partial F(m,n)}{\partial m}\right)^2 + \left(\frac{\partial F(m,n)}{\partial n}\right)^2\right]} \tag{39}$$
where K and L are the number of lines and columns of the fused image F. AG describes the
changing feature of image texture and detailed information. Larger values of AG
correspond to higher spatial resolution and sharpness of the fused image.
e) Universal Image Quality Index (UIQI) is as follows:
$$UIQI = \frac{4\,\sigma_{F_i R_i}\,\mu_{F_i}\,\mu_{R_i}}{\left(\sigma^2_{F_i} + \sigma^2_{R_i}\right)\left(\mu^2_{F_i} + \mu^2_{R_i}\right)} \tag{40}$$
where $\sigma_{F_i R_i}$ is the covariance between the bands of the reference and the fused
image, and $\mu$, $\sigma$ are the mean and standard deviation of the images,
respectively. The higher the UIQI, the better the spatial quality of the fused image.
When two images are identical, UIQI is equal to 1.
f) Signal-to-Noise Ratio (SNR) is as follows:
$$SNR = 10\log_{10}\left(\frac{\sum_{m=1}^{S_1}\sum_{n=1}^{S_2} z(m,n)^2}{\sum_{m=1}^{S_1}\sum_{n=1}^{S_2}\left[z(m,n) - o(m,n)\right]^2}\right) \tag{41}$$
where z(m,n) is the intensity of the pixel of the fused image, o(m,n) is the intensity of
the pixel of the original image, and S1 × S2 is the size of the image. The higher the SNR,
the better the spatial quality of the fused image.
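Most of the metrics above can be computed in a few lines each. The sketch below covers CC, SAM, AG, UIQI and SNR for single-band arrays (or single spectral vectors in the case of SAM). It is an illustration under stated assumptions, not the evaluation code used for the tables: the function names are hypothetical, population statistics are used for the variances and covariance, and the partial derivatives of Eq. (39) are approximated with `numpy.gradient`. RASE is left out because the normalization of DM and SSD is not spelled out in the text.

```python
import numpy as np

def cc(a, b):
    """Correlation coefficient, Eq. (36)."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    return cov / (a.std() * b.std())

def sam(v, v_hat):
    """Spectral angle in radians between two spectral vectors, Eq. (38)."""
    cos = np.dot(v, v_hat) / (np.linalg.norm(v) * np.linalg.norm(v_hat))
    return np.arccos(np.clip(cos, -1.0, 1.0))   # clip guards against rounding

def average_gradient(f):
    """Average gradient, Eq. (39), with finite-difference derivatives (assumption)."""
    gm, gn = np.gradient(f.astype(float))
    return np.mean(np.sqrt((gm ** 2 + gn ** 2) / 2.0))

def uiqi(f, r):
    """Universal image quality index for one band, Eq. (40)."""
    f = f.astype(float).ravel()
    r = r.astype(float).ravel()
    cov = np.mean((f - f.mean()) * (r - r.mean()))
    return (4.0 * cov * f.mean() * r.mean()
            / ((f.var() + r.var()) * (f.mean() ** 2 + r.mean() ** 2)))

def snr(z, o):
    """Signal-to-noise ratio in dB of fused image z against original o, Eq. (41)."""
    z = z.astype(float)
    o = o.astype(float)
    return 10.0 * np.log10(np.sum(z ** 2) / np.sum((z - o) ** 2))
```

Note that `snr` is undefined when the two images are identical, because the denominator of Eq. (41) is then zero.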
C. Pre-processing of Datasets
In this experimental study, multiple multispectral and hyperspectral images are used as
reference images, against which the fusion results are later compared to evaluate the
fusion performance. The quality metrics from the previous section are adopted to analyze
how similar the fusion results are to the reference image. If the quality metrics
indicate that the fused images are close to the reference image, the fusion performance
can be considered high.
First, the original spectral (either MS or HS) image is downsampled by a factor of 3 and
resized back to its original size using bilinear interpolation. The result is a
synthesized spectral image that is used in our experiment as the first source image.
Second, the original spectral image is spectrally integrated over the entire spectral
range to obtain a panchromatic image with spatial details preserved as much as possible.
The result is a synthesized panchromatic image that is used as the second source image
[26]. By doing this, we obtain two perfectly co-registered source images without going
through the registration process, which takes considerable time and computational load.
This way, we can focus on the performance of the proposed fusion method without other
factors, such as misaligned source images, affecting the fusion process.
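The construction of the two synthesized source images can be sketched as below. The text only states the downsampling factor and the interpolation type, so several details here are assumptions: downsampling is done by plain decimation, the spectral integration is an unweighted mean over the bands, the interpolation grid is corner-aligned, and the names `bilinear_resize` and `synthesize_sources` are hypothetical.

```python
import numpy as np

def bilinear_resize(band, out_rows, out_cols):
    """Resize a 2-D array with bilinear interpolation on a corner-aligned grid."""
    rows, cols = band.shape
    r = np.linspace(0, rows - 1, out_rows)
    c = np.linspace(0, cols - 1, out_cols)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, rows - 1)
    c1 = np.minimum(c0 + 1, cols - 1)
    fr = (r - r0)[:, None]                 # fractional row offsets
    fc = (c - c0)[None, :]                 # fractional column offsets
    top = band[np.ix_(r0, c0)] * (1 - fc) + band[np.ix_(r0, c1)] * fc
    bottom = band[np.ix_(r1, c0)] * (1 - fc) + band[np.ix_(r1, c1)] * fc
    return top * (1 - fr) + bottom * fr

def synthesize_sources(cube, factor=3):
    """Build the PAN and low-resolution spectral sources from a reference cube
    of shape (bands, rows, cols)."""
    cube = cube.astype(float)
    pan = cube.mean(axis=0)                          # spectral integration (assumed mean)
    low = cube[:, ::factor, ::factor]                # downsample by decimation (assumed)
    spectral = np.stack([bilinear_resize(b, cube.shape[1], cube.shape[2]) for b in low])
    return pan, spectral
```

Because both outputs are derived from the same reference cube, they share its pixel grid, which is why no registration step is needed in this setting.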
D. Experiments
a) Hyperspectral and Panchromatic Datasets
Figure 42. Original HS image and two synthesized source images. (a) Original
(reference) HS image. (b) Synthesized PAN source image. (c) Synthesized HS source
image
Figure 43. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 9. A performance comparison of the fusion results using quality assessment
metrics.

Transform    Spectral Analysis            Spatial Analysis
Type         CC      RASE     SAM         AG       UIQI    SNR
WT           0.765   47.764   0.234       28.327   0.748   64.782
CT           0.784   46.721   0.215       25.542   0.821   66.643
Proposed     0.823   44.712   0.193       22.623   0.892   70.129
Figure 44. Original HS image and two synthesized source images. (a) Original
(reference) HS image. (b) Synthesized PAN source image. (c) Synthesized HS source
image
Figure 45. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 10. A performance comparison of the fusion results using quality assessment
metrics.

Transform    Spectral Analysis            Spatial Analysis
Type         CC      RASE     SAM         AG       UIQI    SNR
WT           0.812   50.124   0.334       29.247   0.684   66.382
CT           0.854   48.567   0.237       26.253   0.741   68.640
Proposed     0.901   45.781   0.209       24.334   0.822   71.297
Figure 46. Original HS image and two synthesized source images. (a) Original (reference)
HS image. (b) Synthesized PAN source image. (c) Synthesized HS source image.
Figure 47. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 11. A performance comparison of the fusion results using quality assessment
parameters.

Transform    Spectral Analysis            Spatial Analysis
Type         CC      RASE     SAM         AG       UIQI    SNR
WT           0.844   49.157   0.229       30.247   0.788   69.872
CT           0.875   45.198   0.204       27.462   0.832   71.643
Proposed     0.914   43.645   0.183       23.263   0.922   73.912
b) Multispectral and Panchromatic Datasets
Figure 48. Original MS image and two synthesized source images. (a) Original
(reference) MS image. (b) Synthesized PAN source image. (c) Synthesized MS source
image.
Figure 49. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 12. A performance comparison of the fusion results using quality assessment
metrics.

Transform    Spectral Analysis            Spatial Analysis
Type         CC      RASE     SAM         AG       UIQI    SNR
WT           0.657   45.645   0.244       31.734   0.848   66.456
CT           0.747   42.221   0.205       27.749   0.891   69.640
Proposed     0.853   38.122   0.181       24.159   0.922   73.571
Figure 50. Original MS image and two synthesized source images. (a) Original MS
image. (b) Synthesized PAN source image. (c) Synthesized MS source image.
Figure 51. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 13. A performance comparison of the fusion results using quality assessment
metrics.

Transform    Spectral Analysis            Spatial Analysis
Type         CC      RASE     SAM         AG       UIQI    SNR
WT           0.615   49.763   0.334       29.112   0.778   67.728
CT           0.788   45.421   0.305       24.452   0.851   69.943
Proposed     0.833   41.912   0.223       21.223   0.932   73.136
Figure 52. Original MS image and two synthesized source images. (a) Original
MS image. (b) Synthesized PAN source image. (c) Synthesized MS source image.
Figure 53. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 14. A performance comparison of the fusion results using quality assessment
metrics.

Transform    Spectral Analysis            Spatial Analysis
Type         CC      RASE     SAM         AG       UIQI    SNR
WT           0.825   51.764   0.224       32.327   0.798   74.728
CT           0.864   49.720   0.195       30.453   0.821   77.343
Proposed     0.933   46.722   0.163       27.663   0.855   80.126
6.2. Medical Image Fusion
In medical imaging, we can have positron emission tomography (PET), computed
tomography (CT), and magnetic resonance imaging (MRI) images of brains or organs from the
same patient. The PET and CT images are functional images displaying the brain/organ
functional activity but no anatomical information. On the contrary, the MRI provides
anatomical information but no functional activity. Moreover, although the images come
from exactly the same brain/organ area, both PET and CT images have fewer pixels than
the MRI, due to the resolution of the image sensors. The goal of this medical image
fusion is to obtain a single image with both functional and anatomical information
while preserving the original resolution.
6.2.1. Experimental Study and Analysis
A. Fusion Algorithm
Fusion framework used in this medical image fusion is the same as the one used in the
remote sensing image fusion, which is hybrid wavelet-based contourlet transform
(HWCT) framework. In other words, the input images are decomposed into multiscale
and multidirectional subbands using the filter stages discussed in Chapter 5.
However, the fusion algorithm used in the medical image fusion is different and it is
explained below:
1) The decomposed subbands from the transformation stage are combined using
lowpass and highpass fusion rules.
2) Lowpass subband fusion: The local energy domain is developed as the
measurement, then the selection and averaging modes are used to compute the
final coefficients.
The local energy E(x,y) is calculated centering the current coefficient in the
approximate subband $a_J$,
$$E(x,y) = \sum_{m}\sum_{n} a_J^2(x+m, y+n)\, W_L(m,n) \tag{42}$$
where (x, y) denotes the current subband coefficient and $W_L(m,n)$ is a template
of size 3×3,
$$W_L = \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{43}$$
Then the salience factor is calculated to determine which mode is to be used
between the selection mode and the averaging mode, in the fusion process.
$$M_J^{AB}(x,y) = \frac{2\sum_{m}\sum_{n} a_J^A(x+m, y+n)\, a_J^B(x+m, y+n)}{E^A(x,y) + E^B(x,y)} \tag{44}$$
where $a_J^X(x,y)$, $X = A, B$, denotes the lowpass subband coefficients of the source
images A and B, and $M_J^{AB}(x,y)$ is the salience factor.
The salience factor reflects the similarity of the lowpass subbands of the two source
images. This value is then compared to a predefined threshold $T_L$.
If $M_J^{AB}(x,y) \ge T_L$, the averaging mode is selected for the following fusion process:
$$a_J^F(x,y) = \alpha_A\, a_J^A(x,y) + \alpha_B\, a_J^B(x,y) \tag{45}$$
where $a_J^F(x,y)$ represents the fused result at position (x, y), and $\alpha_A$,
$\alpha_B$ are the weights:
$$\alpha_A = \begin{cases} \alpha_{\min}, & \text{for } E^A(x,y) < E^B(x,y) \\ \alpha_{\max}, & \text{for } E^A(x,y) \ge E^B(x,y) \end{cases} \tag{46}$$
$$\alpha_B = 1 - \alpha_A \tag{47}$$
where $\alpha_{\min} \in (0,1)$ and $\alpha_{\max} = 1 - \alpha_{\min}$.
If $M_J^{AB}(x,y) < T_L$, the selection mode is chosen with the following fusion rule:
$$a_J^F(x,y) = \begin{cases} a_J^A(x,y), & \text{for } E^A(x,y) \ge E^B(x,y) \\ a_J^B(x,y), & \text{for } E^A(x,y) < E^B(x,y) \end{cases} \tag{48}$$
3) Highpass subband fusion: The coefficients with larger absolute values in the high
frequency subbands $d_{j,k}(x,y)$ are fused using the average method as follows:
$$E^F_{j,k}(x,y) = d^A_{j,k}(x,y) + d^B_{j,k}(x,y) \tag{49}$$
where $E^F_{j,k}(x,y)$ is the local energy and $d^X_{j,k}(x,y)$ is the high frequency
coefficient.
4) Reconstruction of the fusion image: The fused image is reconstructed from
$a_J^F(x,y)$ and $E^F_{j,k}(x,y)$ using the inverse transform.
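The lowpass rule of step 2 (Eqs. (42) to (48)) can be sketched as follows. The sketch covers the lowpass subband only and rests on several assumptions: the threshold T_L = 0.75 and the weight α_min = 0.25 are placeholder values, borders are replicated, and the template W_L is also applied to the cross term of Eq. (44) so that the salience factor equals 1 for identical subbands. The names `local_sum` and `fuse_lowpass` are hypothetical.

```python
import numpy as np

def local_sum(img):
    """Sum over each pixel's 3x3 neighbourhood (edges replicated)."""
    p = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    return sum(p[m:m + rows, n:n + cols] for m in range(3) for n in range(3))

def fuse_lowpass(aA, aB, T_L=0.75, alpha_min=0.25):
    """Fuse two lowpass subbands with the averaging/selection rule of Eqs. (42)-(48)."""
    aA = aA.astype(float)
    aB = aB.astype(float)
    EA = local_sum(aA ** 2) / 9.0          # Eq. (42) with W_L = ones(3, 3) / 9
    EB = local_sum(aB ** 2) / 9.0
    cross = local_sum(aA * aB) / 9.0       # assumption: same template on the cross term
    denom = EA + EB
    # Eq. (44); the factor is set to 1 where both energies vanish.
    M = np.divide(2.0 * cross, denom, out=np.ones_like(denom), where=denom > 0)
    alpha_A = np.where(EA >= EB, 1.0 - alpha_min, alpha_min)   # Eq. (46)
    averaged = alpha_A * aA + (1.0 - alpha_A) * aB             # Eqs. (45) and (47)
    selected = np.where(EA >= EB, aA, aB)                      # Eq. (48)
    return np.where(M >= T_L, averaged, selected)
```

With these defaults, subbands that agree closely are blended with the larger weight on the more energetic source, while dissimilar regions keep the coefficient of the more energetic source unchanged.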
B. Performance Quality Metrics
a) Entropy (EN): Entropy can effectively reflect the amount of information in a given
image. The larger the value is, the better the fusion result:
$$EN = -\sum_{i=0}^{L-1} p_F(i)\,\log_2 p_F(i) \tag{50}$$
where $p_F$ is the normalized histogram of the fused image to be evaluated, and L is
the maximum gray level for a pixel in the image.
b) Overall cross entropy (OCE): The overall cross entropy is used to measure the
difference between the two source images and the fused image. A small value
corresponds to a good fusion result:
$$OCE(f_A, f_B; F) = \frac{CE(f_A, F) + CE(f_B, F)}{2} \tag{51}$$
where $f_A$, $f_B$ are the input multimodality medical images, F is the fused result, and
$CE(f_A, F)$, $CE(f_B, F)$ are the cross entropies of the source images $f_A$, $f_B$ and
the fused image F:
$$CE(G, F) = \sum_{i=0}^{L-1} p_G(i)\,\log_2\frac{p_G(i)}{p_F(i)}, \quad G = A \text{ or } B \tag{52}$$
c) Spatial frequency (SF): Spatial frequency can be used to measure the overall
activity and clarity level of an image. A larger SF value denotes a better fusion result:
$$SF = \sqrt{RF^2 + CF^2} \tag{53}$$
where RF is the row frequency and CF is the column frequency:
$$RF = \sqrt{\frac{1}{M(N-1)}\sum_{i=0}^{M-1}\sum_{j=0}^{N-2}\left(F(i,j+1) - F(i,j)\right)^2} \tag{54}$$
$$CF = \sqrt{\frac{1}{(M-1)N}\sum_{i=0}^{M-2}\sum_{j=0}^{N-1}\left(F(i+1,j) - F(i,j)\right)^2} \tag{55}$$
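The three reference-free metrics of Eqs. (50) to (55) can be sketched as below for integer-valued grayscale images. The function names are hypothetical, 256 gray levels are assumed, and histogram bins that are empty in either image are skipped in the cross entropy, which is an assumption made here to keep the logarithm finite.

```python
import numpy as np

def histogram(img, levels=256):
    """Normalized gray-level histogram of an integer-valued image."""
    counts = np.bincount(img.ravel().astype(int), minlength=levels)
    return counts / counts.sum()

def entropy(img, levels=256):
    """Entropy EN, Eq. (50); empty bins contribute nothing."""
    p = histogram(img, levels)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def cross_entropy(src, fused, levels=256):
    """Cross entropy CE(G, F), Eq. (52)."""
    pg = histogram(src, levels)
    pf = histogram(fused, levels)
    mask = (pg > 0) & (pf > 0)             # assumption: skip bins empty in either image
    return np.sum(pg[mask] * np.log2(pg[mask] / pf[mask]))

def overall_cross_entropy(a, b, fused, levels=256):
    """Overall cross entropy OCE, Eq. (51)."""
    return 0.5 * (cross_entropy(a, fused, levels) + cross_entropy(b, fused, levels))

def spatial_frequency(f):
    """Spatial frequency SF, Eqs. (53)-(55)."""
    f = f.astype(float)
    rf = np.sqrt(np.mean((f[:, 1:] - f[:, :-1]) ** 2))   # row frequency, Eq. (54)
    cf = np.sqrt(np.mean((f[1:, :] - f[:-1, :]) ** 2))   # column frequency, Eq. (55)
    return np.sqrt(rf ** 2 + cf ** 2)
```

For an image whose pixels are split evenly between two gray levels, `entropy` returns 1 bit, which is a convenient sanity check.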
C. Pre-processing of Datasets
It is hard to find a reference image in the field of medical image fusion; therefore, it is
very important to use quality assessment metrics that can evaluate the fusion
performance without a reference image (see Section B above).
Moreover, it is very important to pre-process the source datasets prior to the fusion
process. Unlike in the remote sensing image fusion, it is necessary to register the source
images to each other, i.e., align the source images, in order to produce the best fusion
results. As discussed in Chapter 5, the image registration method based on Mutual
Information (MI) appears to be a suitable solution for image fusion. Therefore, from
a pair of source images, the one with lower resolution is resampled to match the one with
higher resolution, and the one with smaller size is resized to match the one with larger
size. Next, the source images are registered to each other using the MI registration
method in order to align the source images as closely as possible.
D. Experiments
Figure 54. A set of source images. (a) Original CT image. (b) Original MRI image.
Figure 55. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 15. Performance evaluation of the proposed HWCT method.

Fusion Method     Performance Quality Metrics
                  EN       OCE      SF
WT                4.2378   0.6324   4.6532
CT                4.5681   0.4983   4.8415
Proposed HWCT     4.8169   0.3945   4.9967
Figure 56. A set of source images. (a) Original MRI image. (b) Original SPECT image.
Figure 57. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 16. Performance evaluation of the proposed HWCT method.

Fusion Method     Performance Quality Metrics
                  EN       OCE      SF
WT                4.1987   0.5346   4.8713
CT                4.4716   0.4712   4.9134
Proposed HWCT     4.7934   0.3567   5.1237
Figure 58. A set of source images. (a) Original CT image. (b) Original MRI image.
Figure 59. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 17. Performance evaluation of the proposed HWCT method.

Fusion Method     Performance Quality Metrics
                  EN       OCE      SF
WT                5.0146   0.6431   5.1432
CT                5.2417   0.5864   5.3754
Proposed HWCT     5.3834   0.5243   5.4983
Figure 60. A set of source images. (a) Original MRI image. (b) Original MRA image.
Figure 61. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 18. Performance evaluation of the proposed HWCT method.

Fusion Method     Performance Quality Metrics
                  EN       OCE      SF
WT                4.3148   0.6338   4.7129
CT                4.8561   0.6124   4.9874
Proposed HWCT     5.2908   0.5897   5.2647
6.3. Infrared Image Fusion
Because of its different imaging mechanism and waveband, the infrared image reflects the
radiation information of the scene, but its image clarity is lower. The visible image
reflects the reflection information of the scene and gives a better description of the
surroundings. Therefore, the fusion of infrared and visible images can make full use of
the information contained in the source images and yield a better understanding of the
whole scene. For example, details of targets that are hard to detect in a visible image
can be better discovered in an infrared image; hence, infrared image fusion can be
widely used in surveillance and security monitoring.
6.3.1. Experimental Study and Analysis
A. Fusion Algorithm
1) Decompose source images $S_1(x,y)$ and $S_2(x,y)$ using the Daubechies complex
wavelet based contourlet transform to obtain approximation $AS_1(x,y)$, $AS_2(x,y)$
and detail $DS_1(x,y)$, $DS_2(x,y)$ coefficients as follows:
$$[AS_1(x,y), DS_1(x,y)] = T[S_1(x,y)] \quad \text{and} \quad [AS_2(x,y), DS_2(x,y)] = T[S_2(x,y)] \tag{56}$$
where T represents the transform.
2) For approximation coefficients $AS_1(x,y)$ and $AS_2(x,y)$, the maximum fusion rule is
applied as follows:
$$AS_f(x,y) = \begin{cases} AS_1(x,y), & \text{if } AS_1(x,y) \ge AS_2(x,y) \\ AS_2(x,y), & \text{if } AS_2(x,y) > AS_1(x,y) \end{cases} \tag{57}$$
where $AS_f(x,y)$ is the approximation level coefficient for the fused image.
3) For the detail subbands $DS_j(x,y)$ ($j$ runs over the detail subbands) of the source
images $S_i(x,y)$ ($i$ runs over the source images), the energy of each subband is
denoted by $EDS_j(x,y)$ and defined as follows:
$$EDS_j(x,y) = \sum_{k=1}^{n} \left| DS_j(x,y) \right|^2 \tag{58}$$
where $k = 1, 2, \ldots, n$ and $n$ is the maximum size of the detail subbands.
If $EDS_1(x,y)$ and $EDS_2(x,y)$ are the energies of the detail subbands $DS_1(x,y)$ and
$DS_2(x,y)$ of the source images $S_1(x,y)$ and $S_2(x,y)$, respectively, then the
detail coefficients are selected by the following rule:
$$DS_f(x,y) = \begin{cases} DS_1(x,y), & \text{if } EDS_1(x,y) \ge EDS_2(x,y) \\ DS_2(x,y), & \text{if } EDS_2(x,y) > EDS_1(x,y) \end{cases} \tag{59}$$
where $DS_f(x,y)$ is the detail coefficient of the fused image.
4) The fused image $F(x,y)$ is obtained by taking the inverse transform of $AS_f(x,y)$ and
$DS_f(x,y)$ as follows:
$$F(x,y) = \mathrm{Inverse}\,T\,[AS_f(x,y),\, DS_f(x,y)] \tag{60}$$
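The coefficient selection in steps 2) and 3) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the arrays stand in for coefficients that a real decomposition would supply (the transform itself is not implemented here), magnitudes are compared because complex wavelet coefficients are complex-valued, and the energy is read as one scalar per subband. The function names are hypothetical.

```python
import numpy as np

def fuse_approximation(as1, as2):
    """Maximum rule: keep, per position, the coefficient of larger magnitude."""
    return np.where(np.abs(as1) >= np.abs(as2), as1, as2)

def subband_energy(ds):
    """Energy of one detail subband: sum of squared coefficient magnitudes."""
    return float(np.sum(np.abs(ds) ** 2))

def fuse_details(ds1_list, ds2_list):
    """For each pair of detail subbands, keep the one with the higher energy."""
    fused = []
    for ds1, ds2 in zip(ds1_list, ds2_list):
        fused.append(ds1 if subband_energy(ds1) >= subband_energy(ds2) else ds2)
    return fused

# Toy coefficients (assumed values, not taken from any dataset).
as1 = np.array([[1 + 1j, 0.5], [-3, 2]])
as2 = np.array([[1, -2], [1, 2j]])
fused_approx = fuse_approximation(as1, as2)

ds1 = [np.ones((2, 2)), np.zeros((2, 2))]
ds2 = [2 * np.ones((2, 2)), np.zeros((2, 2))]
fused_detail = fuse_details(ds1, ds2)
```

Ties are resolved in favour of the first source image, which matches the "greater than or equal" branch of the selection rules.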
B. Performance Quality Metrics
In order to evaluate the fusion performance, the same quality metrics that were used for
the medical image fusion are employed (see Section 6.2 for details). These specific
quality metrics are chosen because there is no reference image that is compared to the
fused image. In other words, these metrics evaluate the fusion results without the
reference image.
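As an illustration of how such no-reference metrics operate on the fused image alone, the sketch below computes entropy (EN) and spatial frequency (SF) using their commonly used textbook definitions. It is assumed, not confirmed here, that these match the definitions given in Section 6.2; OCE is omitted because it also needs the source images. The function names and the 8-bit input are assumptions.

```python
import numpy as np

def entropy(img, levels=256):
    """Shannon entropy (bits) of the gray-level histogram of an integer image."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]                      # empty bins contribute nothing
    return float(-np.sum(p * np.log2(p)))

def spatial_frequency(img):
    """SF = sqrt(RF^2 + CF^2), with row and column frequencies from first differences."""
    f = img.astype(float)
    rf2 = np.sum((f[:, 1:] - f[:, :-1]) ** 2) / f.size   # row frequency squared
    cf2 = np.sum((f[1:, :] - f[:-1, :]) ** 2) / f.size   # column frequency squared
    return float(np.sqrt(rf2 + cf2))

# Toy 2x2 image with two equally likely gray levels (assumed values).
toy = np.array([[0, 255], [0, 255]], dtype=np.uint8)
flat = np.full((4, 4), 7, dtype=np.uint8)
```

Both functions take only the fused image as input, which is why they remain usable when no ground-truth image exists.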
C. Pre-processing of Datasets
As in medical image fusion, a reference image is rarely available in infrared image
fusion; the quality assessment metrics must therefore be able to evaluate the fusion
performance without a reference image (see Section 6.2). It is equally important to
pre-process the source datasets before fusion. Unlike the remote sensing case, the source
images must first be registered to each other, i.e., aligned, to produce the best fusion
results. As discussed in Chapter 5, registration based on Mutual Information (MI) appears
to be the most suitable method for image fusion. Therefore, in each pair of source
images, the one with lower resolution is resampled to match the one with higher
resolution, and the smaller one is resized to match the larger one. Next, the source
images are registered to each other using the MI registration method so that they are
aligned as closely as possible.
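To make the idea of MI-based alignment concrete, the sketch below estimates mutual information from a joint histogram and searches a small set of integer shifts for the one that maximises it. This is a toy stand-in under stated assumptions, not the registration procedure of Chapter 5: the bin count, the brute-force search, the circular shifts via `np.roll`, and the function names are all illustrative choices.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Mutual information (bits) of two equally sized images via a joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def best_shift(fixed, moving, max_shift=3):
    """Exhaustively try integer (dy, dx) shifts of `moving`; return the MI-maximising one."""
    best, best_mi = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            mi = mutual_information(fixed, np.roll(moving, (dy, dx), axis=(0, 1)))
            if mi > best_mi:
                best, best_mi = (dy, dx), mi
    return best, best_mi

# Synthetic pair: the second image is shifted and intensity-inverted,
# loosely mimicking two modalities that view the same scene differently.
rng = np.random.default_rng(0)
fixed = rng.random((64, 64))
moving = 1.0 - np.roll(fixed, (2, -1), axis=(0, 1))
shift, mi = best_shift(fixed, moving)
```

Because MI depends only on the statistical dependence between intensities, the inverted image is still aligned correctly, which is the property that makes MI attractive for multimodal pairs.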
D. Experiments
Figure 62. A set of source images. (a) Infrared image. (b) Visible image.
Figure 63. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 19. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               5.2567    0.5218    5.8681
CT               5.6781    0.3492    6.1942
Proposed HWCT    6.0114    0.3107    6.3321
Figure 64. A set of source images. (a) Infrared image. (b) Visible image.
Figure 65. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 20. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               5.8142    0.4127    5.2149
CT               6.0034    0.3891    5.6741
Proposed HWCT    6.2478    0.2687    5.9947
Figure 66. A set of source images. (a) Infrared image. (b) Visible image.
Figure 67. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 21. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               5.6417    0.3984    5.3741
CT               5.9837    0.3547    5.5745
Proposed HWCT    6.2147    0.3146    6.1473
6.4. Radar Image Fusion
Synthetic Aperture Radar (SAR) is an important branch of radar imaging technology. It is
an active remote sensing system capable of all-time, all-weather, long-range,
high-resolution imaging, and it has been widely used in military detection, earth
observation, and border monitoring. Because SAR is a coherent imaging system, it cannot
capture smoothness along contours; a SAR image therefore shows discontinuities as
isolated edge points. The infrared (IR) image, on the other hand, approximately reflects
the temperature and radiation gradients of the observed object and provides
comparatively complete edge and texture information. If a SAR image and an infrared image
are fused, the edge and texture information obtained from the infrared image is added to
the SAR image. As a result, the edge and texture information in the fused image is more
complete while the frequency characteristics of the SAR image are preserved, which makes
the fused image more readable and useful.
6.4.1. Experimental Study and Analysis
A. Fusion Scheme
1) The low frequency part mainly reflects the approximate and average characteristics of
the source images and contains most of the energy; the low frequency subband therefore
determines the outline of the image. Because the imaging mechanisms of the Synthetic
Aperture Radar sensor and the IR sensor differ, the former captures the high-resolution
characteristics of a scene, whereas the latter provides comparatively complete edge and
texture information. Therefore, the low frequency subband coefficients of the SAR image
are selected as the low frequency subband of the fused image:
$$AB_L^0(m,n) = \mathrm{select}\left(A_L^0(m,n),\, B_L^0(m,n)\right) \tag{61}$$
where $AB_L^0(m,n)$ is the low frequency subband of the fused image.
2) Bandpass directional subband coefficients always contain edge and texture features. In
order to make full use of the information in the neighborhood and cousin coefficients in
the transform domain, a salience measure based on area characteristics is used as the
selection principle for the bandpass directional subband coefficients of the source
images. The fusion rule is defined as follows:
In each bandpass directional subband of the source images, a corresponding block
neighborhood of size M×N (typically 3×3 or 5×5) is chosen, whose central point is the
pixel to be fused. Let $(m, n)$ be the central point and $r$ the width of the window.
$M_A$ and $M_B$ are counters for the two windows, respectively, and both are initialized
to zero. The window covers
$$x \in \left[m - \tfrac{r}{2},\, m + \tfrac{r}{2}\right], \quad y \in \left[n - \tfrac{r}{2},\, n + \tfrac{r}{2}\right].$$
The counters for the two windows are updated according to the following selection:
$$\begin{cases} M_A = M_A,\; M_B = M_B, & \text{if } |A_l^k(x,y)| = |B_l^k(x,y)| \\ M_A = M_A + 1,\; M_B = M_B, & \text{if } |A_l^k(x,y)| > |B_l^k(x,y)| \\ M_A = M_A,\; M_B = M_B + 1, & \text{if } |A_l^k(x,y)| < |B_l^k(x,y)| \end{cases} \tag{62}$$
Then, the bandpass directional subband coefficients are chosen as follows:
$$AB_l^k(m,n) = \begin{cases} A_l^k(m,n), & \text{if } M_A > M_B \\ B_l^k(m,n), & \text{if } M_A < M_B \\ A_l^k(m,n), & \text{if } M_A = M_B \text{ and } |A_l^k(m,n)| \ge |B_l^k(m,n)| \\ B_l^k(m,n), & \text{if } M_A = M_B \text{ and } |A_l^k(m,n)| < |B_l^k(m,n)| \end{cases} \tag{63}$$
where $AB_l^k(m,n)$, $k = 1, 2, \ldots, 2^{n_l}$, $l = 1, 2, \ldots, L$, is the bandpass
frequency subband of the fused image.
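The window-counting rule for the bandpass directional subbands can be sketched as follows. The code is an illustration under stated assumptions rather than the implementation used in the experiments: coefficient magnitudes are compared, the window width r is taken to be odd, positions outside the subband are ignored by both counters, and the function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_count(mask, r):
    """Count True entries of `mask` inside the r x r window centred on each pixel."""
    pad = r // 2
    padded = np.pad(mask.astype(int), pad, mode="constant")  # outside counts as 0
    return sliding_window_view(padded, (r, r)).sum(axis=(-2, -1))

def fuse_bandpass(a, b, r=3):
    """Select coefficients from subband a or b using the counters M_A and M_B."""
    mag_a, mag_b = np.abs(a), np.abs(b)
    m_a = window_count(mag_a > mag_b, r)   # M_A: positions where A dominates
    m_b = window_count(mag_a < mag_b, r)   # M_B: positions where B dominates
    take_a = (m_a > m_b) | ((m_a == m_b) & (mag_a >= mag_b))
    return np.where(take_a, a, b)

# Toy subbands (assumed values): A is strong everywhere except at the centre.
a = np.array([[5.0, 5.0, 5.0], [5.0, 0.0, 5.0], [5.0, 5.0, 5.0]])
b = np.ones((3, 3))
fused = fuse_bandpass(a, b)
```

In the toy example the centre coefficient of A is weaker than that of B, yet it is still taken from A because A dominates the surrounding window; this is the neighborhood effect the counters are meant to capture.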
B. Performance Quality Metrics
In order to evaluate the fusion performance, entropy (EN), overall cross entropy (OCE) and
spatial frequency (SF) are employed in the assessment (see Section 6.2). These
specific quality metrics are chosen because there is no reference image that is compared
to the fused image. In other words, these metrics evaluate the fusion results without the
reference image.
C. Pre-processing of Datasets
It is very important to pre-process the source datasets prior to the fusion process. Unlike
the remote sensing image fusion, it is necessary to register the source images to each
other, i.e., align the source images, in order to produce the best fusion results. As
discussed in Chapter 5, the image registration method based on Mutual Information (MI)
appears to be the optimum solution to the image fusion. Therefore, from a pair of source
images, the one with lower resolution is resampled to match the one with higher
resolution, and the one with smaller size is resized to match the one with larger size. Next,
the source images are registered to each other using MI registration method in order to
align the source images as perfectly as possible. The source images are then decomposed
in level 3, as discussed in Chapter 5, to produce necessary subband coefficients that are
used for the fused image.
D. Experiments
Figure 68. A set of source images. (a) SAR image. (b) Infrared image.
Figure 69. Fusion results. (a) WT. (b) CT. (c) HWCT.
Table 22. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               4.3244    0.7355    5.0541
CT               4.8531    0.6841    5.8614
Proposed HWCT    5.0396    0.6512    7.2184
Figure 70. A set of source images. (a) SAR image. (b) Infrared image.
Figure 71. Fusion results. (a) WT. (b) CT. (c) HWCT.
Table 23. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               4.5871    0.6872    6.5013
CT               4.7784    0.6421    6.5571
Proposed HWCT    4.9573    0.5998    6.7834
6.5. Multi-focus Image Fusion
Because on-board image sensors, especially those deployed on satellites or aircraft, have
a limited depth of focus, it is often not possible to capture a single image in which all
relevant objects are in focus. A multi-focus fusion process is therefore required so that
all parts and objects in the resultant image are in focus.
6.5.1. Experimental Study and Analysis
A. Fusion Algorithm
1) Decompose the source images using HWCT and obtain the lowpass (coarse) coefficients
$C^A$, $C^B$ and the bandpass (detail) coefficients:
$$D_l^A,\; l = 1, 2, \ldots, q, \qquad D_l^B,\; l = 1, 2, \ldots, q \tag{64}$$
where $q$ is the decomposition level.
At each scale, the hybrid directional filter bank (HDFB) is applied to the bandpass
coefficients to implement the directional decomposition:
$$D_{l,k_l}^A,\; l = 1, 2, \ldots, q, \qquad D_{l,k_l}^B,\; l = 1, 2, \ldots, q \tag{65}$$
where $k_l$ is the number of directions of the level-$l$ bandpass coefficients.
2) Combine the decomposed coefficients of the coarse subbands and the detail subbands,
respectively, as follows:
For the high frequency coefficients, there are two kinds of fusion rules for combining
the transform detail coefficients: pixel-based and region-based. Pixel-based fusion rules
consider only the coefficient at the current pixel, whereas region-based fusion rules are
more robust and less sensitive to noise. The high frequency (bandpass) part contains most
of the image detail information (edge and texture information). The region-based rule
computes the weighted region energy of the center pixel and its neighboring pixels. In
this new fusion scheme, therefore, the weighted region energy fusion rule is adopted for
the bandpass coefficients:
$$D_{l,k_l}^F(i,j) = \begin{cases} D_{l,k_l}^A(i,j), & E_{l,k_l}^A(i,j) \ge E_{l,k_l}^B(i,j) \\ D_{l,k_l}^B(i,j), & E_{l,k_l}^A(i,j) < E_{l,k_l}^B(i,j) \end{cases} \tag{66}$$
where $E_{l,k_l}^A(i,j)$ and $E_{l,k_l}^B(i,j)$ are the region energies of source images A
and B, respectively.
If a pixel is selected from source image A but the majority of its surrounding neighbors
are selected from B, the pixel is switched to B.
For the lowpass coefficients, the averaging rule is used:
$$C^F(i,j) = \left(C^A(i,j) + C^B(i,j)\right) / 2 \tag{67}$$
3) Perform inverse HWCT using the fused coefficients in order to produce a
resultant fusion image.
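The coefficient combination in step 2) can be sketched as follows. This is a minimal illustration under stated assumptions: the 3×3 weighting window is an assumed choice (the text does not specify the weights), borders are handled by reflection, the majority check is applied symmetrically to both sources, and the HWCT decomposition and reconstruction are not implemented. All names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Assumed 3x3 weights for the region energy; centre-weighted, summing to 1.
WEIGHTS = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
# Mask of the 8 surrounding neighbours (centre excluded) for the majority check.
NEIGHBOURS = np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)

def windowed(x, kernel):
    """Weighted sum of x over the kernel-sized neighbourhood of every pixel."""
    pad = kernel.shape[0] // 2
    windows = sliding_window_view(np.pad(x, pad, mode="reflect"), kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def region_energy(d):
    """Weighted region energy of a bandpass subband."""
    return windowed(np.abs(d) ** 2, WEIGHTS)

def consistency_check(from_a):
    """Flip a pixel's source when most of its 8 neighbours come from the other image."""
    n_a = windowed(from_a.astype(float), NEIGHBOURS)
    return np.where(n_a > 4, True, np.where(n_a < 4, False, from_a))

def fuse_bandpass(d_a, d_b):
    """Pick, per pixel, the coefficient whose region energy is larger."""
    from_a = consistency_check(region_energy(d_a) >= region_energy(d_b))
    return np.where(from_a, d_a, d_b)

def fuse_lowpass(c_a, c_b):
    """Average rule for the lowpass coefficients."""
    return (c_a + c_b) / 2.0

# Toy decision map (assumed): every pixel from A except one isolated pixel.
decision = np.ones((5, 5), dtype=bool)
decision[2, 2] = False
cleaned = consistency_check(decision)
```

The consistency check removes isolated decisions such as the single pixel in the toy map, so that neighbouring coefficients tend to come from the same source image.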
B. Performance Quality Metrics
In order to evaluate the fusion performance, we employed the same quality metrics that
were used for the medical image fusion (see Section 6.2 for details). These specific
quality metrics are chosen because there is no reference image that is compared to the
fused image. In other words, these metrics evaluate the fusion results without the
reference image.
C. Pre-processing of Datasets
It is very important to pre-process the source datasets prior to the fusion process. Unlike
the remote sensing image fusion, it is necessary to register the source images to each
other, i.e., align the source images, in order to produce the best fusion results. As
discussed in Chapter 5, the image registration method based on Mutual Information (MI)
appears to be the optimum solution to the image fusion. Therefore, from a pair of source
images, the one with lower resolution is resampled to match the one with higher
resolution, and the one with smaller size is resized to match the one with larger size. Next,
the source images are registered to each other using MI registration method in order to
align the source images as perfectly as possible. The source images are then decomposed
in level 3, as discussed in Chapter 5, to produce necessary subband coefficients that are
used for the fused image.
D. Experiments
Figure 72. A set of source images. (a) Right focused. (b) Left focused.
Figure 73. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 24. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               5.9127    0.5478    5.6478
CT               6.2354    0.5102    5.7714
Proposed HWCT    6.5587    0.4458    6.0017
Figure 74. A set of source images. (a) Right focused. (b) Left focused.
Figure 75. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 25. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               4.3147    0.6587    5.3647
CT               4.8695    0.5384    5.6724
Proposed HWCT    5.2871    0.3145    5.9567
Figure 76. A set of source images. (a) Right focused. (b) Left focused.
Figure 77. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 26. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               4.2567    0.6447    5.0578
CT               4.8925    0.5142    5.9145
Proposed HWCT    5.2874    0.4456    6.4512
Figure 78. A set of source images. (a) Right focused. (b) Left focused.
Figure 79. Fusion results. (a) WT. (b) CT. (c) Proposed HWCT.
Table 27. Performance evaluation of the proposed HWCT method.
                 Performance Quality Metrics
Fusion Method    EN        OCE       SF
WT               3.2547    0.6354    5.1237
CT               4.1156    0.5471    5.5646
Proposed HWCT    4.8562    0.4412    6.5002
6.6. Conclusion
In this chapter, multiple datasets are used to test and verify the performance of the
proposed fusion method against the existing methods (WT and CT). For each fusion
category, a novel fusion algorithm is proposed because the algorithm itself is very
important in producing the best fusion results due to the nature of datasets in each fusion
category. For example, the same algorithm should not be used in both remote sensing
image fusion and medical image fusion because the characteristics and the information
that each fusion category possesses are different; hence, algorithms with different
selection of the fusion coefficients are necessary. Various performance quality metrics
are also adopted to conduct comparative analyses on the fusion results. In the remote
sensing image fusion, CC, RASE, and SAM are used for the spectral analysis, and AG,
UIQI, and SNR are used for the spatial analysis. For the other four categories of the
fusion, EN, OCE, and SF are selected to analyze the fusion performance.
The fusion results in the remote sensing image fusion show that the proposed method
performs well and its effectiveness as a fusion method is validated. The results are
assessed as precisely as possible using six different quality metrics. Moreover, the source
images are synthesized from the original spectral images in order to produce a perfectly
co-registered pair of source images. As a result, no additional pre-processing steps are
necessary, and there is less chance that other factors affect the fusion performance during
the process. Therefore, it is possible to state that the fusion results and comparative
analyses are accurate, and that the proposed method is verified to be a high-performance
multimodal fusion method.
Unlike the remote sensing image fusion, a pair of original source images is used in the
other four categories of the image fusion, i.e., no synthesizing process is done. Therefore,
it is not possible to assume that the source images are registered to each other with the
same sample size and resolution. In order to match the size and resolution of the source
images, necessary resampling and resizing processes are undertaken. Next, the MI-based
image registration is performed over the source images to align them as precisely as
possible (see Chapter 5 for details). Another noticeable difference can be found in
evaluating the fusion performance because there is no reference image to compare the
resultant fused images to. In other words, it is not possible to employ the same quality
metrics as used in the remote sensing image fusion. Three other quality metrics are
chosen for the performance evaluations in the remaining four fusion categories, which are
EN, OCE, and SF (see Section 6.2.1). These quality metrics have an advantage of
evaluating the fusion performance without the reference image because they can analyze
the fusion results directly while other metrics need a reference image to find similarities
between the results and the reference.
In all four fusion categories (medical, infrared, radar, and multi-focus image fusion), the
proposed fusion method outperformed the existing methods according to the performance
evaluations using three different quality metrics. Numerous datasets were used to conduct
accurate and reliable evaluations. Since the proposed method produced better fusion
results in every experiment, its usefulness as a fusion method is validated and its high
performance is verified.
CHAPTER 7
CONCLUSION AND FUTURE WORK
This dissertation is a study of multimodal image fusion, which has applications that
contribute to many aspects of human life, such as urban mapping, target detection,
concealed weapon identification, natural disaster monitoring, and early detection of
medical conditions such as cancer. In this final chapter, a summary of research
findings, experimental results and contributions is provided.
7.1. Conclusion
The major research findings, experimental results and contributions are highlighted
below:
1) A novel multimodal image fusion method is proposed based on the following
ideas:
a. Hybrid wavelet-based contourlet transform (HWCT) fusion model
b. Wavelet-based contourlet transform model
c. Daubechies complex wavelets
d. Hybrid directional filter bank (HDFB) model
e. Proposed fusion algorithm for each multimodal fusion category
2) Necessary pre-processing techniques are studied and novel methods are proposed:
a. Original spectral images in the remote sensing image fusion are
synthesized to produce a pair of source images. As a result, a perfectly co-
registered (aligned) pair of source images is obtained.
b. Various image registration methods are studied, and mutual information
(MI)-based registration is proposed to align the source images as precisely
as possible.
c. A similarity-based band selection technique is proposed to reduce the size of
the spectral images, which are usually too large to be processed,
transferred and stored.
d. The study on the decomposition level is conducted to observe how the
level affects the fusion performance and what level is sufficient for the
fusion process. It is observed that the level-3 is good enough and that the
levels beyond the level-3 do not necessarily improve the fusion
performance.
3) Multiple categories of the multimodal image fusion are studied using numerous
datasets and performance quality metrics. From every experimental study and
analysis in each fusion category, the proposed method produced better fusion
results than the conventional wavelet and contourlet transforms; therefore, its
usefulness as a fusion method has been validated and its high performance has
been verified.
a. For the remote sensing image fusion, CC, RASE, and SAM are used for
the spectral analysis, and AG, UIQI, and SNR are used for the spatial
analysis.
b. For the other four categories of the fusion, EN, OCE, and SF are employed
to analyze the fusion performance.
7.2. Future Work
This section outlines additional research and studies that can be further conducted to
extend the work of this dissertation. Further steps may be taken as follows:
1) Develop a robust image registration method to further enhance the fusion
performance.
2) Develop a robust method to compress the spectral bands of remote sensing images
for effective processing, transferring and storage of large data, while preserving
both information and characteristics of source images.
3) Develop a method to automatically determine the decomposition level for
optimum decomposition of the subband coefficients for the fused image.
4) Study and find an image denoising method that can be used prior to the main
fusion process, which can further enhance the fusion results.
5) Develop a numerical method of a new quantitative metric for the fusion
performance evaluation that can be commonly used in different types of image
fusion.
6) Integrate powerful hardware such as a graphics processing unit (GPU) in order to
achieve faster processing of large datasets.
121
REFERENCES
[1] Keys, L. D., Schmidt, N. J., and Phillips, B. E., 1990, A prototype example of sensor
fusion used for a siting analysis. Technical Papers 1990, ACSM-ASPRS Annual
Convention, Image Processing and Remote Sensing, 4, 238-249.
[2] Rogers, R. H., and Wood, L., 1990, The history and status of merging multiple sensor
data: an overview. Technical Papers 1990, ACSM-ASPRS Annual Convention, Image
Processing and Remote Sensing, 4, 352-360.
[3] Chavez, P. S., Sides, S. C., and Anderson, J. A., 1991, Comparison of three different
methods to merge multiresolution and multispectral data: TM & SPOT pan.
Photogrammetric Engineering and Remote Sensing, 57, 295-303.
[4] Strobl, D., Raggam, J., and Buchroithner, M. F., 1990, Terrain correction geocoding of a
multi-sensor image data set. Proceedings 10th EARSeL Symposium, Toulouse, France
(Paris: European Space Agency), pp. 98-107.
[5] Bloom, A., Fielding, E., and Fu, X., 1988, A demonstration of stereo photogrammetry
with combined SIR-B and Landsat-TM images. International Journal of Remote
Sensing, 9, 1023-1038.
[6] Leckie, D. G., 1990, Synergism of SAR and visible/infrared data for forest type
discrimination. Photogrammetric Engineering and Remote Sensing, 56, 1237- 1246.
[7] Schistad-Solberg, A. H., Jain, A. K., and Taxt, T., 1994, Multisource classification of
remotely sensed data: fusion of Landsat TM and SAR images. IEEE Transactions on
Geoscience and Remote Sensing, 32, 768-778.
[8] Duguay, G., Holder, G., Howarth, P., and LeDrew, E., 1987, Integrating remotely sensed
data from different sensors for change detection. Proceedings of the IEEE International
Geoscience and Remote Sensing Symposium (IGARSS ’87), Ann Arbor, USA, May 18-
21, 1987, p. 333.
[9] Aschbacher, J., and Lichtenegger, J., 1990, Complementary nature of SAR and optical
data: a case study in the Tropics. Earth Observation Quarterly, 31, 4-8.
[10] Suits, G., Malila, W., and Weller, T., 1988, Procedures for using signals from one sensor
as substitutes for signals of another. Remote Sensing of Environment, 25, 395-408.
[11] S. Ibrahim and M. Wirth, “Visible and IR Data Fusion Technique Using the Contourlet
Transform”, International conference on computational science and engineering, CSE
09, IEEE, vol. 2, pp. 42-47, 2009.
[12] Mouyan Zou and Yan Liu, “Multi-Sensor Image Fusion: Difficulties and Key
Techniques”, 2nd International congress on image and signal processing, IEEE, pp. 1-5,
2009.
[13] Essadiki, M., 1987, A combination of panchromatic and multispectral SPOT images for
topographic mapping. IT C Journal, 1987-1, 59-66.
[14] Bloom, A., Fielding, E., and Fu, X., 1988, A demonstration of stereo photogrammetry
with combined SIR-B and Landsat-TM images. International Journal of Remote
Sensing, 9, 1023-1038.
[15] Tauch, R., and Ka¨hler, M., 1988, Improving the quality of satellite image maps by
various processing techniques. International Archives of Photogrammetry and Remote
Sensing, Proceedings XV I ISPRS Congress, Kyoto, Japan, pp. IV238-IV247.
[16] Welch, R., and Ehlers, M., 1988, Cartographic feature extraction from integrated SIR-B
and Landsat TM images. International Journal of Remote Sensing, 9, 873-889.
[17] Tanaka, S., Sugimura, T., and Higashi, M., 1989, High resolution satellite image map
from the SPOT and Landsat TM data. Advanced Space Research, 9, 115-120.
122
[18] Rogers, R. H., and Wood, L., 1990, The history and status of merging multiple sensor
data: an overview. Technical Papers 1990, ACSM-ASPRS Annual Convention, Image
Processing and Remote Sensing, 4, 352-360.
[19] Dallemand, J. F., Lichtenegger, J., Kaufmann, V., Paudyal, D. R., and Reichert, A., 1992,
Combined analysis of ERS-1 SAR and visible/infrared RS data for land cover/land use
mapping in tropical zone: a case study in Guinea. Space at the Service of our
Environment, Proceedings First ERS-1 Symposium, Cannes, France, 6-8 November
1992, SP-359, pp. 555-561.
[20] Perlant, F., 1992, Example of SPOT/ERS-1 complementary. Space at the Service of our
Environment, Proceedings First ERS-1 Symposium, 4± 6 November 1992, Cannes,
France, pp. 563-568.
[21] Albertz, J., and Tauch, R., 1994,Mapping from space Ð Cartographic applications of
satellite image data. GeoJournal, 32, 29-37.
[22] Kaufmann, K.-H., and Buchroithner, M. F., 1994, Herstellung und AnwendungsmoÈ
glichkeiten von Satellitenkarten durch digitale Kombination von Landsat-TM und KWR-
1000-Daten. Zeitschrift fuÈ r Photogrammetrie und Fernerkundung, 62, 133-137.
[23] Matte, R., 1994, L’ortho-image au service de la cartographie. Numerimage, Bulletin
d’Information Quadrimetriel (Quebec), 3, 9-12.
[24] Perlant, F., Sempere, J.-P., and Guerre, L.-F., 1994, Production of SPOT/ERSI image
maps. Proceedings of First ERS-1 Pilot Project Workshop, Toledo, Spain, 22-24 June
1994, pp. 331-336.
[25] Hinse, M., and Proulx, J., 1995, La spatiocarte regionale. Numerimage, Bulletin
d’Information Quadrimestriel (Quebec), 3, p. 2.
[26] Pohl, C., 1995, SPOT/ERS Image maps: Topographic map updating in Indonesia. SPOT
MAGAZINE, December 1995, 17-19.
[27] Pohl, C., 1996, Geometric aspects of multisensor image fusion for topographic map
updating in the humid Tropics. IT C publication No. 39, ISBN 90 6164 121 7.
[28] Pohl, C., and Genderen, J. L. van, 1993, Geometric integration of multi-image
information. Space at the Service of our Environment, Proceedings of the Second ERS-1
Symposium, 11-14 October 1993, Hamburg, Germany, pp. 1255-1260.
[29] Toll, D. L., 1985, Analysis of digital Landsat MSS and Seasat SAR data for use in
discriminating land cover at the urban fringe of Denver, Colorado. International Journal
of Remote Sensing, 6, 1209-1229.
[30] Munechika, C. K., Warnick, J. S., Salvaggio, C., and Schott, J. R., 1993, Resolution
enhancement of multispectral image data to improve classification accuracy.
Photogrammetric Engineering and Remote Sensing, 59, 67-72.
[31] Franklin, S. E., and Blodgett, C. F., 1993, An example of satellite multisensor data
fusion. Computers and Geoscience, 19, 577-583.
[32] Hussin, Y. A., and Shaker, S. R., 1996, Optical and radar satellite image fusion
techniques and their applications in monitoring natural resources and land use change.
Synthetic Aperture Radar, Proceedings European Conference, Germany, 26-28 March
1996, pp. 451-456.
[33] Ahern, F. J., Goodenough, D. G., Grey, A. L., and Ryerson, R. A., 1978, Simultaneous
microwave and optical wavelength observations of agricultural targets. Canadian Journal
of Remote Sensing, 4, 127-142.
[34] Ulaby, F. T., Li, R. Y., and Shanmugan, K. S., 1982, Crop classification using airborne
radar and Landsat data. IEEE Transactions in Geoscience and Remote Sensing, 20, 42-
51.
123
[35] Brisco, B., and Brown, R. J., 1995, Multidate SAR/TM synergism for crop classification
in Western Canada. Photogrammetric Engineering and Remote Sensing, 61, 1009-1014.
[36] Brisco, B., Ulaby, F. T., and Dobson, M. C., 1983, Spaceborne SAR data for land cover
classification and change detection. Digest of International Geoscience and Remote
Sensing Symposium, 1, 1.1-1.8.
[37] Nezry, E.,Mougin, E., Lopes, A., and Gastellu-Etchegorry, J. P., 1993, Tropical
vegetation mapping with combined visible and SAR spaceborne data. International
Journal of Remote Sensing, 14, 2165-2184.
[38] Mangolini, M., and Arino, O., 1996 a, Improvement of agriculture monitoring by the use
of ERS/SAR and Landsat/TM. Fusion of Earth Data, Proceedings EARSeL Conference,
Cannes, France, 6± 8 February 1996, pp. 29-35.
[39] Mangolini, M., and Arino, O., 1996 b, ERS-1SAR and Landsat-TM multitemporal
fusion for crop statistics. Earth Observation Quarterly, 51, 11-15.
[40] Leckie, D. G., 1990, Synergism of SAR and visible/infrared data for forest type
discrimination. Photogrammetric Engineering and Remote Sensing, 56, 1237-1246.
[41] Lozano-Garcia,D. F., and Hoffer, R. M., 1993, Synergistic e ffects of combined Landsat-
TM and SIR-B data for forest resources assessment. International Journal of Remote
Sensing, 14, 2677-2694.
[42] Kachhwalha, T. S., 1993, Temporal and multisensor approach in forest/vegetation
mapping and corridor identification for effective management of Rajaji Nat. Park, Uttar
Pradesh, India. International Journal of Remote Sensing, 14, 3105-3114.
[43] Hinse, M., and Coulombe, A., 1994, Numerimage, Bulletin d’Information
Quadrimestriel (Quebec), 3, 2-4.
[44] Aschbacher, J., Giri, C. P., Ofren, R. S., Tiangco, P. N., Delso, J. P., Suselo, T. B.,
Vibulsresth, S., and Charupat, T., 1994, Tropical mangrove vegetation mapping using
advanced remote sensing and GIS technology. Executive Summary, ADC Austria and
Asian Institute of Technology.
[45] Wilkinson, G. G., Folving, S., Kanellopoulos, I., McCormick, N., Fullerton, K., and
Me’gier, J., 1995, Forest mapping from multi-source satellite data using neural network
classifiers - An experiment in Portugal. Remote Sensing Reviews, 12, 83-106.
[46] Griffiths, G. H., 1988, Monitoring urban change from Landsat TM and SPOT satellite
imagery by image differencing. Proceedings I.E.E.E. International Geoscience and
Remote Sensing Symposium (IGARSS ’88), Edinburgh, Scotland, 13-16 September
1988, pp. 493-497.
[47] Haack, B., and Slonecker, T., 1991, Sensor fusion for locating villages in Sudan. GIS/L
IS ACSM-ASPRS Fall Convention, Technical Papers, Atlanta, U.S.A., 28 October 1991,
pp. B97-B107.
[48] Ranchin, T., Wald, L., and Mangolini, M., 1996, The ARSIS method: a general solution
for improving spatial resolution of images by the means of sensor fusion. Fusion of
Earth Data, Proceedings EARSeL Conference, Cannes, France, 6-8 February 1996.
[49] Corves, C., 1994, Assessment of multi-temporal ERS-1 SAR Landsat TM data for
mapping the Amazon river floodplain. Proceedings of First ERS-1 Pilot Project
Workshop, Toledo, Spain, 22-24 June 1994, SP-365, pp. 129-132.
[50] Ye’sou, H., Besnus, Y., and Rolet, J., 1994, Perception of a geological body using
multiple source remotely-sensed data-relative influence of the spectral content and the
spatial resolution. International Journal of Remote Sensing, 15, 2495-2510.
[51] Pohl, C., and Genderen, J. L. van, 1993, Geometric integration of multi-image
information. Space at the Service of our Environment, Proceedings of the Second ERS-1
Symposium, 11-14 October 1993, Hamburg, Germany, ESA SP-361, pp. 1255-1260.
124
[52] Pohl, C., and Genderen, J. L. van, 1995, Image fusion of microwave and optical remote
sensing data for map updating in the Tropics. Image and Signal Processing for Remote
Sensing, Proceedings EUROPT O ’95, Paris, France, 25-29 September 1995, SPIE Vol.
2579, pp. 2-10.
[53] Wang, Y., Koopmans, B. N., and Pohl, C., 1995, The 1995 flood in The Netherlands
monitored from space: a multi-sensor approach. International Journal of Remote
Sensing, 16, 2735-2739.
[54] Fellah, K., and Tholey, N., 1996, Flood risk evaluation and monitoring in North East
France. In Application Achievements of ERS-1, New Views of the Earth, Preliminary
Version, pp. 116-119.
[55] Matthews, J., and Gaffney, S., 1994, A study of flooding on Lough Corrib, Ireland during
early 1994 using ERS-1 SAR data. Proceedings of First ERS-1 Pilot Project Workshop,
Toledo, Spain, 22-24 June 1994, SP-365, pp. 125-128.
[56] ESA, 1995, ERS-1 SAR images of flood in Northern Europe (January 1995). Earth
Observation Quarterly, 47, 13.
[57] ESA, 1995, Multitemporal image Maastricht (The Netherlands). Earth Observation
Quarterly, 47, 14.
[58] Lichtenegger, J., and Calabresi, G., 1995, ERS monitors flooding in The Netherlands.
ESA Bulletin, 81, Title page & p. 94.
[59] MacIntosh, H., and Profeti, G., 1995, The use of ERS SAR data to manage flood
emergencies at a smaller scale. Proceedings of Second ERS Applications Workshop,
London, U.K., 6-8 December 1995 (Paris: ESA publication), pp. 243-246.
[60] Badji, M., 1995, Flood plain management in Wallonia. In Application Achievements of
ERS-1, New Views of the Earth, Preliminary Version, pp. 114-115.
[61] Desnos, Y.-L., Mayer, T., and Sardar, A. M., 1996, Multitemporal ERS-1 SAR images of
the Brahmaputra flood plains in Northern Bangladesh. Earth Observation Quarterly, 51,
6-10.
[62] Bonansea, E., 1995, Flood monitoring and assessment in Piemonte. In Application
Achievements of ERS-1, New Views of the Earth, SP-1176/11, p. 95.
[63] Brakenridge, G. R., 1995, Flood mapping in the Mississippi. In Application
Achievements of ERS-1, New Views of the Earth, SP-1176/11, p. 92.
[64] Kannen, A., Oberstadler, R., and Protman, F., 1995, Flood mapping in Germany.
Application Achievements of ERS-1, New Views of the Earth, Preliminary Version, pp.
124-125.
[65] Otten, M. P. G., and Persie, M. van, 1995, SAR image of floods using PHARS system.
Earth Observation Quarterly, 47, 15.
[66] Ramseier, R. O., Emmons, A., Armour, B., and Garrity, C., 1993, Fusion of ERS-1 SAR
and SSM/I Ice data. Space at the Service of our Environment, Proceedings of the Second
ERS-1 Symposium, 11-14 October 1993, Hamburg, Germany, ESA SP-361, pp. 361-368.
[67] Armour, B., Ehrismann, J., Chen, F., and Bowman, G., 1994, An integrated package for
the processing and analysis of SAR imagery, and the fusion of radar and passive
microwave data. In Proceedings ISPRS Symposium, 6-10 June 1994, Ottawa, Canada,
pp. 299-306.
[68] Haefner, H., Holecz, F., Meier, E., Nüesch, D., and Piesbergen, J., 1993, Capabilities
and limitations of ERS-1 SAR data for snow-cover determination in mountainous
regions. Space at the Service of our Environment, Proceedings of the Second ERS-1
Symposium, Hamburg, Germany, 11-14 October 1993, SP-361, pp. 971-976.
[69] Daily, M. I., Farr, T., and Elachi, C., 1979, Geologic interpretation from composited
radar and Landsat imagery. Photogrammetric Engineering and Remote Sensing, 45,
1009-1116.
[70] Blom, R. G., and Daily, M., 1982, Radar image processing for rock type discrimination.
IEEE Transactions on Geoscience and Remote Sensing, 20, 343-351.
[71] Rebillard, P., and Nguyen, P. T., 1982, An exploration of co-registered SIR-A, SEASAT
and Landsat images. RS of Environment, RS for Exploration Geology, Proceedings
International Symposium, Second Thematic Conference, USA, pp. 109-118.
[72] Reimchen, T. H. F., 1982, Location of economic gypsum deposits by integration of
multispectral, multitemporal, geophysical, -chemical, -botanical and geological data. RS
of Environment, RS for Exploration Geology, Proceedings International Symposium,
Second Thematic Conference, Fort Worth, U.S.A., pp. 119-125.
[73] Haydn, R., Dalke, G. W., Henkel, J., and Bare, J. C., 1982, Application of the IHS color
transform to the processing of multisensor data and image enhancement. Remote
Sensing of Arid and Semi-Arid Lands, Proceedings of International Symposium, pp.
599-607.
[74] Aarnisalo, J., 1984, Image processing and integration of geophysical, MSS and other
data as a tool for numerical exploration in glaciated Precambrian terrain. Proceedings of
the International Symposium on Remote Sensing of Environment, Proceedings of Third
Thematic Conference, Remote Sensing for Exploration Geology, Colorado Springs, pp.
107-128.
[75] Evans, D., 1988, Multisensor classification of sedimentary rocks. Remote Sensing of
Environment, 25, 129-144.
[76] Baker, R. N., and Henderson, F. B., 1988, Results of the GEOSAT programme to
evaluate SPOT data for petroleum and mineral exploration. SPOT 1: Image utilization,
assessment, results, Proceedings CNES Conference, Paris, France, pp. 731-738.
[77] Hopkins, H. R., Navail, H., Berger, Z., Merembeck, B. F., Brovey, R. L., and Schriver, J.
S., 1988, Structural analysis of the Jura mountains - Rhine Graben intersection for
petroleum exploration using SPOT stereoscopic data. SPOT 1: Image utilization,
assessment, results, Proceedings CNES Conference, Paris, France, pp. 803-810.
[78] Paradella, W. R., Vitorello, I., Liu, C. C., Mattos, J. T., Meneses, P. R., and Dutea, L. V.,
1988, Spectral and spatial attribute evaluation of SPOT data in geological mapping of
Precambrian terrains in semi-arid environment of Brazil. SPOT 1: Image utilization,
assessment, results, Proceedings CNES Conference, Paris, France, pp. 851-860.
[79] Guo, H. D., and Pinliang, D., 1989, Integrated MSS-SAR-SPOT-geophysical
geochemical data for exploration geology in Yeder area. CAS-IRSA, pp. 1-8.
[80] Harris, J. R., and Murray, R., 1989, IHS transform for the integration of radar imagery
with geophysical data. Proceedings of the 12th Canadian Symposium on Remote
Sensing, Vancouver, Canada, pp. 923-926.
[81] Koopmans, B. N., Bakx, J. P. G., Dijk, P. M. van, and AlFasatwi, Y. A., 1989,
SPOT-RADAR synergism for geological mapping in arid areas. Project Report Prosper
Project, Final Report, CNES-SPOT Image-ITC.
[82] Harris, J. R., Murray, R., and Hirose, T., 1990, IHS transform for the integration of radar
imagery with other remotely sensed data. Photogrammetric Engineering and Remote
Sensing, 56, 1631-1641.
[83] Grasso, D. N., 1993, Application of the IHS colour transformation for 1:24,000-scale
geologic mapping: a low cost SPOT alternative. Photogrammetric Engineering and
Remote Sensing, 59, 73-80.
[84] Jutz, S. L., and Chorowicz, J., 1993, Geological mapping and detection of oblique
extension structures in the Kenyan Rift Valley with a SPOT/Landsat-TM data-merge.
International Journal of Remote Sensing, 14, 1677-1688.
[85] Koopmans, B. N., and Richetti, E., 1993, Optimal geological data extraction from
SPOT-Radar synergism with samples from Djebel Amour (Algeria), Red Sea Hills
(Sudan), Sirte Basin (Libya) and Magdalena Valley (Colombia). From Optics to Radar,
SPOT and ERS Applications, Conference Proceedings, 10-13 May 1993, Paris, France,
pp. 263-274.
[86] Koopmans, B. N., and Forero, G. R., 1993, Airborne SAR and Landsat MSS as
complementary information source for geological hazard mapping. ISPRS Journal of
Photogrammetry and Remote Sensing, 48, 28-37.
[87] Yésou, H., Besnus, Y., and Rolet, J., 1993 a, Extraction of spectral information from
Landsat TM data and merger with SPOT panchromatic imagery-a contribution to the
study of geological structures. ISPRS Journal of Photogrammetry and Remote Sensing,
48, 23-36.
[88] Ray, P. K. C., Roy, A. K., and Prabhakaran, B., 1995, Evaluation and integration of
ERS-1 SAR and optical sensor data (TM and IRS) for geological investigations.
Photonirvachak, Journal of the Indian Society of Remote Sensing, 23, 77-86.
[89] G. Pajares and M. de la Cruz, “A wavelet-based image fusion tutorial”, Pattern
Recognition, vol. 37, no. 9, pp. 1855–1872, 2004.
[90] A.A. Goshtasby, S. Nikolov, “Image fusion: advances in the state of the art”, Information
Fusion 8 (2) (2007) 114–118.
[91] V.S. Petrovic, C.S. Xydeas, “Gradient-based multiresolution image fusion”, IEEE
Transactions on Image Processing 13 (2) (2004) 228–237.
[92] G. V. Welland, “Beyond Wavelets”, Academic Press, 2003.
[93] M. N. Do and M. Vetterli, “The contourlet transform: An efficient directional
multiresolution image representation”, IEEE Transactions on Image Processing, vol. 14,
no. 12, pp. 2091–2106, 2005.
[94] P. Hong and M. J. T. Smith, “An octave-band family of non-redundant directional filter
banks,” in IEEE proc. ICASSP, vol. 2, pp. 1165-1168, 2002.
[95] Y. Lu and M. N. Do, “CRISP-contourlets: a critically sampled directional
multiresolution image representation,” in proc. of SPIE conference on Wavelet
Applications in Signal and Image Processing X, San Diego, USA, August 2003.
[96] Chui, C., 1992. An Introduction to Wavelets. Academic Press, New York.
[97] Vidakovic, B., Mueller, P., 1994. Wavelets for Kids: A Tutorial Introduction. Institute of
Statistics and Decision Science, Duke University, Durham, NC.
http://www2.isye.gatech.edu/~brani/wp/kidsA.pdf, last access date: May 18, 2007.
[98] Pajares, G., de la Cruz, J.M., 2004. A wavelet-based image fusion tutorial. Pattern
Recognition 37 (9), 1855–1872.
[99] Mallat, S.G., 1999. A Wavelet Tour of Signal Processing, second ed. Academic Press,
San Diego.
[100] Gonzalez-Audicana, M., Otazu, X., Fors, O., Seco, A., 2005. Comparison between
Mallat's and the ‘à trous’ discrete wavelet transform based algorithms for the fusion of
multispectral and panchromatic images. International Journal of Remote Sensing 26 (3),
595–614.
[101] Pajares, G., de la Cruz, J.M., 2004. A wavelet-based image fusion tutorial. Pattern
Recognition 37 (9), 1855–1872.
[102] Burt P J., “Merging images through pattern decomposition”, Proceedings of SPIE,
575: 173-18, 1985.
[103] Bamberger R H., “A filter bank for the directional decomposition of images: Theory
and design”, IEEE Trans. Signal Processing, 40 (4): 882 -893, 1992.
[104] M. Vetterli and J. Kovačević, Wavelets and Subband Coding. Prentice-Hall, 1995.
[105] M. Vetterli, “Multidimensional subband coding: Some theory and algorithms,” Signal
Proc., vol. 6, no. 2, pp. 97–112, February 1984.
[106] M. N. Do, “Directional multiresolution image representations,” Ph.D. dissertation,
Swiss Federal Institute of Technology, Lausanne, Switzerland, December 2001.
[107] M. Vetterli and J. Kovačević, Wavelets and Subband Coding. Prentice-Hall, 1995.
[108] Aboubaker M. ALEjaily et al., “Fusion of remote sensing images using contourlet
transform”, Innovations and Advanced Techniques in Systems, Computing Sciences and
Software Engineering, Springer, pp. 213-218, 2008.
[109] M. N. Do and M. Vetterli, “The contourlet transform: An efficient directional
multiresolution image representation”, IEEE Transactions on Image Processing, vol. 14,
no. 12, pp. 2091–2106, 2005.
[110] Harrison, B. A., and Jupp, D. L. B., 1990, Introduction to image processing.
MicroBRIAN Resource Manual, Part 2.
[111] Rast, M., Jaskolla, M., and Aranson, F. K., 1991, Comparative digital analysis of
Seasat-SAR and Landsat-TM data for Iceland. International Journal of Remote Sensing,
12, 527-544.
[112] Gillespie, A. R., Kahle, A. B., and Walker, R. E., 1986, Colour enhancement of highly
correlated images. I. Decorrelation and HSI contrast stretches. Remote Sensing of
Environment, 20, 209-235.
[113] Carper, W. J., Lillesand, T. M., and Kieffer, R. W., 1990, The use of Intensity-Hue-
Saturation transformations for merging SPOT Panchromatic and multispectral image
data. Photogrammetric Engineering and Remote Sensing, 56, 459-467.
[114] Grasso, D. N., 1993, Application of the IHS colour transformation for 1:24,000-scale
geologic mapping: a low cost SPOT alternative. Photogrammetric Engineering and
Remote Sensing, 59, 73-80.
[115] Hinse, M., and Proulx, J., 1995, Numerimage, Bulletin d’Information Quadrimestriel
(Quebec), 3, p. 2.
[116] Singh, A., and Harrison, A., 1985, Standardized principal components. International
Journal of Remote Sensing, 6, 883-896.
[117] Shettigara, V. K., 1992, A generalized component substitution technique for spatial
enhancement of multispectral images using a higher resolution data set.
Photogrammetric Engineering and Remote Sensing, 58, 561-567.
[118] Fung, T., and LeDrew, E., 1987, Application of Principal Component Analysis to
change detection. Photogrammetric Engineering and Remote Sensing, 53, 1649-58.
[119] Zobrist, A. L., Blackwell, R. J., and Stromberg, W. D., 1979, Integration of Landsat,
Seasat, and other geo-data sources. Proceedings 13th International Symposium on
Remote Sensing of Environment, ERIM, Ann Arbor, USA, 23-27 April 1979, pp. 271-
280.
[120] Rothery, D. A., and Francis, P. W., 1987, Synergistic use of MOMS-01 and Landsat
TM data. International Journal of Remote Sensing, 8, 501-508.
[121] Campbell, N. A., 1993, Towards more quantitative extraction of information from
remotely sensed data. Advanced Remote Sensing, Conference Proceedings, held in
Sydney, Australia, 2, 29-40.
[122] Mallat, S. G., 1989, A theory for multiresolution signal decomposition: The wavelet
representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11,
674-693.
[123] Djamdji, J.-P., Bijaoui, A., and Manieri, R., 1993, Geometrical registration of images:
The multiresolution approach. Photogrammetric Engineering and Remote Sensing, 59,
645-653.
[124] M. N. Do, Directional multiresolution image representations. PhD thesis, EPFL,
Lausanne, Switzerland, Dec. 2001.
[125] D. D.-Y. Po and M. N. Do, “Directional multiscale modeling of images using the
contourlet transform,” submitted to IEEE Trans. On Image Processing, 2003.
[126] M. N. Do and M. Vetterli, “Contourlets,” in Beyond Wavelets, Academic Press, New
York, 2003.
[127] S. Mallat, A Wavelet Tour of Signal Processing. Academic Press, 2nd Ed., 1998.
[128] D. Clonda, J. M. Lina, B. Goulard, Complex Daubechies Wavelets: Properties and
statistical image modeling, Signal Processing, vol. 84, pp. 1-23, 2004.
[129] Z. Qiang, W. Long, L. Huijuan, M. Zhaokun, Similarity-based multimodality image
fusion with shiftable complex directional pyramid, Pattern Recognition Letters, vol. 32,
pp. 1544-1553, 2011.
[130] W.K. Pratt, Digital Image Processing, 2nd ed., Wiley, New York, 1991.
[131] H. Hanaizumi, S. Fujimura, An automated method for registration of satellite remote
sensing images, Proceedings of the International Geoscience and Remote Sensing
Symposium IGARSS’93, Tokyo, Japan, 1993, pp. 1348–1350.
[132] R. Berthilsson, Affine correlation. Proceedings of the International Conference on
Pattern Recognition ICPR’98, Brisbane, Australia, 1998, pp. 1458–1461.
[133] A. Simper, Correcting general band-to-band misregistrations, Proceedings of the
IEEE International Conference on Image Processing ICIP’96, Lausanne, Switzerland,
1996, 2, pp. 597–600.
[134] S. Kaneko, Y. Satoh, S. Igarashi, Using selective correlation coefficient for robust
image registration, Pattern Recognition 36 (2003) 1165–1173.
[135] D.I. Barnea, H.F. Silverman, A class of algorithms for fast digital image registration,
IEEE Transactions on Computers 21 (1972) 179–186.
[136] G. Wolberg, S. Zokai, Image registration for perspective deformation recovery, SPIE
Conference on Automatic Target Recognition X, Orlando, Florida, USA, April 2000, p.
12.
[137] W.K. Pratt, Correlation techniques of image registration, IEEE Transactions on
Aerospace and Electronic Systems 10 (1974) 353–358.
[138] P. van Wie, M. Stein, A landsat digital image rectification system, IEEE Transactions
on Geoscience Electronics 15 (1977) 130–136.
[139] P.E. Anuta, Spatial registration of multispectral and multitemporal digital imagery
using Fast Fourier Transform, IEEE Transactions on Geoscience Electronics 8 (1970)
353–368.
[140] A. Goshtasby, G.C. Stockman, Point pattern matching using convex hull edges, IEEE
Transactions on Systems, Man and Cybernetics 15 (1985) 631–637.
[141] G. Stockman, S. Kopstein, S. Benett, Matching images to models for registration and
object detection via clustering, IEEE Transactions on Pattern Analysis and Machine
Intelligence 4 (1982) 229–241.
[142] S.H. Chang, F.H. Cheng, W.H. Hsu, G.Z. Wu, Fast algorithm for point pattern
matching: Invariant to translations, rotations and scale changes, Pattern Recognition 30
(1997) 311–320.
[143] J. Flusser, Object matching by means of matching likelihood coefficients, Pattern
Recognition Letters 16 (1995) 893–900.
[144] E. Guest, E. Berry, R.A. Baldock, M. Fidrich, M.A. Smith, Robust point
correspondence applied to two- and three-dimensional image registration, IEEE
Transactions on Pattern Analysis and Machine Intelligence 23 (2001) 165–179.
[145] Q. Zheng, R. Chellappa, A computational vision approach to image registration, IEEE
Transactions on Image Processing 2 (1993) 311–325.
[146] A. Ifarraguerri, “Visual method for spectral band selection,” IEEE Geosci. Remote
Sens. Lett., vol. 1, no. 2, pp. 101–106, Apr. 2004.
[147] Qian Du; He Yang, "Similarity-Based Unsupervised Band Selection for Hyperspectral
Image Analysis," Geoscience and Remote Sensing Letters, IEEE, vol. 5, no. 4,
pp. 564-568, Oct. 2008.
[148] D. C. Heinz and C.-I Chang, “Fully constrained least squares linear spectral mixture
analysis method for material quantification in hyperspectral imagery,” IEEE Trans.
Geosci. Remote Sens., vol. 39, no. 3, pp. 529–545, Mar. 2001.
[149] http://studio.gge.unb.ca/UNB/zoomview/examples.html
[150] M. Eismann, R. Hardie, “Application of the stochastic mixing model to hyperspectral
resolution enhancement”, IEEE Transactions on Geoscience and Remote Sensing 42 (9)
(2004) 1924–1933.
[151] MultiSpec©, https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html
[152] A. Cohen, I. Daubechies, and J.-C. Feauveau, “Biorthogonal bases of compactly
supported wavelets,” Commun. on Pure and Appl. Math., vol. 45, pp. 485–560, 1992.
[153] M. Vetterli and C. Herley, “Wavelets and filter banks: Theory and design,” IEEE
Trans. Signal Proc., vol. 40, no. 9, pp. 2207–2232, September 1992.
[154] S.-M. Phoong, C. W. Kim, P. P. Vaidyanathan, and R. Ansari, “A new class of two-
channel biorthogonal filter banks and wavelet bases,” IEEE Trans. Signal Proc., vol. 43,
no. 3, pp. 649–665, Mar. 1995.
CURRICULUM VITAE
Graduate College
University of Nevada, Las Vegas
Yoonsuk Choi
Degrees:
Bachelor of Engineering in Electrical Engineering, 2003
Korea University, South Korea
Master of Engineering in Electronics and Computer Engineering, 2006
Korea University, South Korea
Special Honors and Awards:
1st Place Winner, Best Dissertation Award, College of Engineering, UNLV, 2014.
Graduate Assistant Scholarship, Dept. of Electrical & Computer Eng., UNLV, 2010-2014.
Best New Employee Prize, Missile R&D Team, LIG Nex1, 2007.
Best Research Team Award, Korea Institute of Science and Technology, 2004 & 2006.
Outstanding Researcher Award, Korea Institute of Science and Technology, 2005.
Industry-Academic Collaboration Scholarship, Samsung Electronics, 2003-2005.
Brain Korea (BK 21) Scholarship, Korea University, 2003-2005.
Outstanding Teaching Assistant Award, Dept. of Electrical Eng., Korea University, 2004.
Publications:
Journal Articles
1. E. Sharifahmadian, Y. Choi, and S. Latifi, “Multichannel Data Compression using
Wavelet Subbands Arranging Technique,” International Journal of Computer Application,
Vol. 91, No. 4, pp. 17-22, April 2014.
2. Y. Choi, E. Sharifahmadian and S. Latifi, “Remote Sensing Image Fusion using
Contourlet Transform with Sharp Frequency Localization”, International Journal of
Information Technology, Modeling and Computing, Vol. 2, No. 1, pp. 23-35, Feb. 2014.
3. Y. Choi, E. Sharifahmadian and S. Latifi, “Quality Assessment of Image Fusion Methods
in Transform Domain”, International Journal on Information Theory, Vol. 3, No. 1, pp. 7-
18, Jan. 2014.
4. Y. Choi, E. Sharifahmadian and S. Latifi, “Performance Analysis of Contourlet-based
Hyperspectral Image Fusion Methods”, International Journal on Information Theory, Vol.
2, No. 1, pp. 1-14, Oct. 2013.
5. Y. Choi and S. Latifi, “Modern Flash Memory Technologies: A Flash Translation Layer
Perspective”, International Journal of High Performance Systems Architecture, Vol. 4,
No. 3, pp. 167-182, 2013.
6. Y. Choi and S. Latifi, “Future Prospects of DRAM: Emerging Alternatives”,
International Journal of High Performance Systems Architecture, Vol. 4, No. 1, pp. 1-12,
2012.
7. G. B. Kim, S. Y. Cha, E. K. Hyun, Y. C. Jung, Y. Choi, J. S. Rieh, S. R. Lee and S. W.
Hwang, "Integrated Planar Spiral Inductors with CoFe and NiFe Ferromagnetic Layer",
Microwave and Optical Technology Letters, Vol. 50, No. 3, pp. 676-678, March 2008.
8. Y. Choi, S. H. Son, et al., “Fabrication of an In-Plane Gate Transistor and Electron
Transport Characterization”, Journal of the Korean Vacuum Society, Vol. 27, pp. 215-
225, 2006.
9. S. H. Son, Y. Choi, et al., “Gate bias controlled NDR in an in-plane-gate quantum dot
transistor”, Physica E, Volume 32, No. 1-2, pp. 532-535, May 2006.
10. S. H. Son, Y. Choi, et al., “Single-Electron Transport in GaAs/AlGaAs Nano-In-Plane-
Gate Transistors”, Journal of the Korean Physical Society, Vol. 47, pp. S517-S521,
November 2005.
11. S. H. Hong, H. K. Kim, Y. Choi, et al., “Controllable Capture of Au Nano-Particles by
using Dielectrophoresis”, Journal of the Korean Physical Society, Vol. 45, pp. S665-S668,
December 2004.
12. S. H. Son, K. H. Cho, Y. Choi, et al., “Fabrication and Characterization of Metal-
Semiconductor Field-Effect-Transistor-Type Quantum Devices”, Journal of Applied
Physics, Vol. 96, No. 1, pp. 704-708, July, 2004.
Conference Articles
1. Y. Choi, E. Sharifahmadian and S. Latifi, “Spectral Image Fusion using Band Reduction
and Contourlets”, SPIE Defense, Security, and Sensing, Vol. 9088, pp. 90880W1-
90880W9, Baltimore, USA, May 2014.
2. E. Sharifahmadian, Y. Choi and S. Latifi, “Wavelet-based Compression of Multichannel
Climate Data”, SPIE Defense, Security, and Sensing, Vol. 9124, pp. 91240B1-91240B6,
Baltimore, USA, May 2014.
3. E. Sharifahmadian, Y. Choi and S. Latifi, “Wavelet-based Identification of Objects from
a Distance”, SPIE Defense, Security, and Sensing, Vol. 9091, pp. 90911B1-90911B8,
Baltimore, USA, May 2014.
4. Y. Choi, E. Sharifahmadian and S. Latifi, “Effect of Pre-processing on Satellite Image
Fusion”, The 17th CSI International Symposium on Computer Architecture and Digital
Systems (IEEE CADS) 2013, pp. 111-115, Tehran, Iran, Oct. 30-31, 2013.
5. Y. Choi, E. Sharifahmadian and S. Latifi, “Fusion and Quality Analysis for Remote
Sensing Images using Contourlet Transform”, SPIE Defense, Security, and Sensing, Vol.
8743, pp. 874326(1)-874326(9), Baltimore, USA, May 2013.
6. Y. Choi, E. Sharifahmadian and S. Latifi, “Performance Analysis of Image Fusion
Methods in Transform Domain”, SPIE Defense, Security, and Sensing, Vol. 8756, pp.
87560G1-87560G12, Baltimore, USA, May 2013.
7. E. Sharifahmadian, Y. Choi and S. Latifi, “A Simulation Study of Target Detection using
Hyperspectral Data Analysis”, SPIE Defense, Security, and Sensing, Vol. 8744, pp.
874410(1)-874410(13), Baltimore, USA, May 2013.
8. E. Sharifahmadian, Y. Choi and S. Latifi, “A Simulation Study of Detection of Weapon
of Mass Destruction based on Radar”, SPIE Defense, Security, and Sensing, Vol. 8710,
pp. 87100Y1-87100Y12, Baltimore, USA, May 2013.
9. Y. Choi and S. Latifi, “Fusion Techniques for Satellite Images”, The 2012 International
Conference on Information and Knowledge Engineering, pp. 294-300, Las Vegas, USA,
July 16-19, 2012.
10. Y. Choi and S. Latifi, “Contourlet Based Multi-Sensor Image Fusion”, The 2012
International Conference on Information and Knowledge Engineering, pp. 301-307, Las
Vegas, USA, July 16-19, 2012.
11. S. Y. Cha, G. B. Kim, Y. C. Jung, Y. Choi, K. H. Cho, J. S. Rieh, S. W. Hwang, E. K.
Hyun, S. R. Lee, "Characterization and Analysis of Integrated RF Ferromagnetic Spiral
Inductances", KIEEME Autumn Conference 2006, pp. 109-111, South Korea, December
1, 2006.
12. S. H. Son, Y. Choi, et al., “Resonant Tunneling through Quantum States of Enhancement
Mode In-Plane-Gate Quantum Dot Transistors”, The 28th International Conference on
the Physics of Semiconductors (ICPS Conference), pp. 210, Vienna, Austria, July 24-28,
2006.
13. S. H. Son, Y. Choi, K. H. Cho, et al., “Gate Bias Controlled NDR in an In-Plane-Gate
Quantum Dot Transistor”, The 12th International Conference on Modulated
Semiconductor Structures (MSS12), Albuquerque, New Mexico, USA, July 10-15, 2005.
14. S. H. Son, Y. Choi, et al., “Single Electron Transport in GaAs/AlGaAs Nano-In-Plane-
Gate Transistors”, 2005 Asia-Pacific Workshop on Fundamental and Application of
Advanced Semiconductor Devices (AWAD 2005), pp. 181-183, Seoul, South Korea, June
28-30, 2005.
15. S. H. Son, Y. Choi, et al., “Fabrication and Characterization of an In-Plane-Gate
Quantum Dot Resonant Tunneling Transistor”, The Second Conference on Nanoscale
Devices & System Integration (NDSI 2005), pp. 20, Houston, Texas, USA, April 4-6,
2005.
16. S. H. Son, Y. Choi, et al., “Single Electron Transport in a GaAs/AlGaAs Nano-In-Plane-
Gate Transistor”, The 12th Seoul International Symposium on the Physics of
Semiconductors and Applications, March, 2005.
17. S. H. Son, Y. Choi, et al., “Fabrication and Transport Characterization of the In-Plane-
Gate Transistor”, Korea Society of Physics, 2005.
18. S. H. Hong, H. K. Kim, Y. Choi, et al., “Controllable Capture of Au Nano-particles using
Dielectrophoresis”, The 12th Seoul International Symposium on the Physics of
Semiconductors and Applications, March 2004.
Technical Articles
1. E. Sharifahmadian, Y. Choi and S. Latifi, “Remotely Detecting Weapons of Mass
Destruction”, SPIE Newsroom, August 22, 2013, DOI: 10.1117/2.1201308.005057.
2. E. Sharifahmadian, Y. Choi and S. Latifi, “New Sensors Could Evaluate Astronauts’
Vital Signs in Flight”, SPIE Newsroom, February 27, 2014, DOI:
10.1117/2.1201402.005321.
Dissertation Title:
A Novel Multimodal Image Fusion Method using Hybrid Wavelet-based Contourlet
Transform
Dissertation Examination Committee:
Committee Chair, Shahram Latifi, Ph.D.
Committee Member, Sahjendra Singh, Ph.D.
Committee Member, Venkatesan Muthukumar, Ph.D.
Graduate Faculty Representative, Laxmi Gewali, Ph.D.