
Content-Driven Pyramid Compression of Medical Images for Error-Free Archival and Progressive Transmission (1)

Franco Lotti, Bruno Aiazzi, Stefano Baronti, Andrea Casini
Istituto di Ricerca sulle Onde Elettromagnetiche “Nello Carrara” - CNR, Via Panciatichi 64, 50127 Firenze - Italy
Luciano Alparone
Dipartimento di Ingegneria Elettronica, Università di Firenze, Via di S. Marta 3, 50139 Firenze - Italy

Abstract. An efficient encoding scheme based on an improved Laplacian pyramid is proposed for lossless/lossy compression aimed at archival and progressive transmission of greyscale medical images. The major assets are: pyramid entropy is decreased by adopting two different filters for reduction and expansion; encoding priority is given to major details through a hierarchical content-driven decision rule; the binary quad-tree of split/unsplit nodes is blockwise zig-zag scanned and run-length encoded. Error feedback along the levels of the Laplacian pyramid ensures control of the maximum absolute error, up to fully reversible reconstruction capability, and enhances the efficiency and the robustness of the whole scheme. While coding results on the standard greyscale image Lena are quite competitive with the most recent literature, reversible compression of scanned RX images achieves ratios of about 15, establishing improvements over DPCM schemes; high-quality lossy versions at 0.15 bit/pel outperform JPEG both visually and quantitatively. Moreover, no floating-point computation is required in the algorithm, and on-line compression and reconstruction are feasible on general-purpose terminals.

1. INTRODUCTION

Lossless image compression is recently gaining attention over a wider audience in the field of Medical Imaging and particularly of teleradiology, in which a huge amount of digital imagery or volumetric data (tomographic sections) must be stored, retrieved and transmitted [1]. However, the error-free requirement severely reduces the attainable compression performance, due to the intrinsic noise level introduced by the imaging system, which must be coded as well [2], and is therefore justified only if all the digits of the binary representation of the image samples are recognized to be significant, i.e. the quantization step is comparable to the RMS value of the system's noise. This reason explains why a certain loss of information, or distortion, is usually tolerated whenever a significant compression ratio is requested, provided that a high reconstruction quality is fully guaranteed. The term quality should be intended as absence of visual, or better diagnostic, artifacts [3].

(1) Work partially supported by a grant of the National Research Council of Italy (CNR) within the framework of the Progetto Finalizzato Telecomunicazioni.

Consequently, control of the maximum absolute error, or Peak Error (PE), instead of globally averaged measures like Mean Square Error (MSE) or Peak Signal-to-Noise Ratio (PSNR), is demanded.

Most of the classical image compression techniques that are based on MSE (or PSNR) as quality measure, like Vector Quantization (VQ) [4, 5], Transform Coding [6] and Sub-Band Coding (SBC) [7], even though effective for irreversible coding, are not suitable by themselves for lossless compression and, in this case, must be followed by an encoding procedure of the residual error, with a performance penalty due to its high entropy. If merely statistical coding techniques (e.g. bit-plane and arithmetic coding) [8] are disregarded, the only method capable of achieving fully reversible image compression, at least at reasonable rates, is 2D Differential Pulse Code Modulation (DPCM), widely used for reversible compression of medical images [9], in which prediction/interpolation errors are rounded and encoded.

Progressive image coding is growing in interest as well, as an alternative to block or raster coding, for transmission on low/medium rate channels [10]. Applications range from image data retrieval and telebrowsing of Picture Archival and Communication Systems (PACS), to remote monitoring and low-rate video coding. In the basic scheme [11] a coarse but whole version of the image is first transmitted; refinements are attained by encoding gradually subtler details, up to full resolution, if requested. In this frame, producing a compact source representation is a crucial point, since the entropic coders take advantage if the spatial decorrelation procedure is designed to reduce the zero-order entropy of the image source.

Vol. 6, No. 3, May-June 1995

In this paper a complete and adaptive system is proposed for either lossless or semi-lossy compression of medical images (especially RX) and monochrome still pictures in general, as well as for progressive transmission at any rate. The term semi-lossy denotes that a prefixed number of the MSBs of the samples are reconstructed without loss, and has been introduced when dealing with image data whose bits are not all meaningful [3]. The outline is based on a modified LP designed to possess lower correlation between adjacent levels, and consequently lower zero-order entropy, than the classical LP [12], which is a representation of the (zero-mean) detail content of the original image, at increasing spatial resolution. The major novelties concern the adoption of two different filters for the two pyramid generating procedures: reduction (filtering, and decimation or down-sampling) and expansion (zero-interleaving or up-sampling, and interpolation). In particular, polynomial half-band kernels have been used for 2D interpolation, while the classical Burt's kernel [13] has been employed for reduction. To increase the performance of the encoding scheme, priority is given to major image details through a hierarchical content-driven decision rule [14-16] defining a binary quad-tree of synchronization flags, which is run-length encoded. The rule is based on a simple L∞ activity measure on 2 × 2 data blocks. Feedback along the pyramid layers of the values of unsplit nodes, as well as of quantization errors [17], ensures semi-lossy reconstruction capability (up to lossless, on request), also reducing the overall zero-order entropy of the LP, and improves the robustness of the algorithm to different types of pyramid generating kernels.
Both error-free and irreversible coding tests have been carried out on the standard image Lena as well as on a digital RX; the results, when compared with predictive DPCM and block-DCT JPEG [18], show significant coding improvements and benefits in terms of visual quality.

2. LAPLACIAN PYRAMIDS

2.1. Generalized pyramids

Let G_0 = {G_0(i, j), i = 0, ..., M − 1; j = 0, ..., N − 1}, M = p × 2^K and N = q × 2^K, be the input image, p and q positive integers; G_0 and the following set of recursively down-sampled image versions

G_k(i, j) = Σ_{m=−M_1}^{M_1} Σ_{n=−N_1}^{N_1} W_1(m, n) G_{k−1}(2i + m, 2j + n)    (1)

for k = 1, ..., K, are defined as Generalized Gaussian Pyramid (GGP); where i = 0, ..., M/2^k − 1, j = 0, ..., N/2^k − 1, k identifies the pyramid level and K > 0 the top level, or root, of size p × q.

From the GGP a Generalized Laplacian Pyramid (GLP) is recursively defined as:

L_k(i, j) = G_k(i, j) − Σ_{m=−M_2}^{M_2} Σ_{n=−N_2}^{N_2} W_2(m, n) G_{k+1}((i + m)/2, (j + n)/2)    (2)

for i = 0, ..., M/2^k − 1, j = 0, ..., N/2^k − 1, and k = 0, ..., K − 1; the summation terms are taken to be null for noninteger values of (i + m)/2 and (j + n)/2. L_K(i, j) is taken equal to G_K(i, j), for i = 0, ..., p − 1 and j = 0, ..., q − 1; the total number of nodes is given by the sum of a truncated geometric series:
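As a concrete illustration of recursions (1) and (2), the following minimal NumPy sketch builds a GGP/GLP with separable kernels. All function and variable names are ours, not the paper's, and symmetric (reflective) border extension is an assumption, since the paper does not specify border handling.

```python
import numpy as np

def filt1d(x, h):
    """Symmetric 1D convolution with reflective border extension (assumed)."""
    p = len(h) // 2
    return np.convolve(np.pad(x, p, mode="reflect"), h, mode="valid")

def sep_filter(img, h):
    """Separable 2D filtering: the 1D kernel h is applied along rows, then columns."""
    out = np.apply_along_axis(filt1d, 1, img, h)
    return np.apply_along_axis(filt1d, 0, out, h)

def reduce_layer(g, w1):
    """One GGP step (1): low-pass filtering followed by 2:1 decimation."""
    return sep_filter(g, w1)[::2, ::2]

def expand_layer(g, w2, shape):
    """Expansion used in (2): zero-interleaving followed by interpolation.
    w2 has DC gain 2 per axis, which restores the amplitude."""
    up = np.zeros(shape)
    up[::2, ::2] = g
    return sep_filter(up, w2)

def build_glp(g0, w1, w2, K):
    """Gaussian pyramid by (1), then Laplacian levels by (2); L_K = G_K (root)."""
    ggp = [np.asarray(g0, dtype=float)]
    for _ in range(K):
        ggp.append(reduce_layer(ggp[-1], w1))
    glp = [ggp[k] - expand_layer(ggp[k + 1], w2, ggp[k].shape) for k in range(K)]
    glp.append(ggp[K])
    return glp

def burt_kernel(a):
    """Burt's 5-tap parametric kernel (4); unit DC gain."""
    return np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])

# Demo: without quantization the GLP is exactly invertible.
rng = np.random.default_rng(0)
g0 = rng.integers(0, 256, (16, 16)).astype(float)
w2_linear = np.array([0.5, 1.0, 0.5])
glp = build_glp(g0, burt_kernel(0.6), w2_linear, K=2)
rec = glp[-1]
for k in range(1, -1, -1):
    rec = glp[k] + expand_layer(rec, w2_linear, glp[k].shape)
```

Note that the expansion in (2) uses only samples of G_{k+1} at integer (i + m)/2, (j + n)/2; zero-interleaving followed by filtering, as above, is the standard equivalent implementation.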

N_t = Σ_{k=0}^{K} (MN / 4^k) = (4/3) MN [1 − 4^{−(K+1)}]    (3)

The terms Gaussian and Laplacian pyramids find their origin in Marr's vision theory [ 191: by modelling the hu- man visual system, the former represents a whole descrip- tion of the original image at different resolutions; the lat- ter contains the contour sketch of the image at the various resolutions. The attribute generalized has been introduced because in the original definitions only Gaussian-shaped kernels were used for both reduction and expansion.

The purposes of the spatial filters W_1(m, n) and W_2(m, n) are the following: W_1 introduces a lowpass effect to prevent spatial frequency aliasing from being generated by decimation; W_2 provides interpolation of the upper level of the GGP, whose difference from the current level recursively makes up the GLP. Both these functions are critical for the entropy and energy content of the resulting GLP, as will be shown in subsection 3.2. W_1(m, n) and W_2(m, n), transformation kernels with

support regions (2N_1 + 1) × (2M_1 + 1) and (2N_2 + 1) × (2M_2 + 1), are taken to be separable as products of two 1D symmetric kernels:

W_i(m, n) = H_i(m) · H_i(n),  i = 1, 2

In the following it is assumed that M_1 = N_1 and M_2 = N_2.

The parametric kernel introduced by Burt in 1981 [13] and stated in the z domain as

H_1(z) = (1/4 − a/2) z^{−2} + (1/4) z^{−1} + a + (1/4) z + (1/4 − a/2) z^2    (4)


has been intensively used for both reduction (1) and expansion (2) with equal values of the kernel parameter [12, 14-16]; in these works a scale factor of 4 has been added in (2) to adjust the gain, since samples to be interpolated are zero interleaved along each coordinate axis.

2.2. Improvements of the Laplacian pyramid

Considerations on the frequency response properties of pyramid generating filters, as well as on computational complexity, suggest using half-band filters [20]. However, from the analysis of the spectral energy distribution of up-sampled signals [21], one recognizes that the half-band requirement is tighter for expansion than for reduction. In fact, the reduction filter must achieve a tradeoff between delivering the maximum amount of spectral energy to upper levels and reducing the contribution due to aliasing. For natural (correlated) images this implies that the filter passband may also exceed π/2 (i.e. f_s/4, where f_s denotes the sampling frequency). Instead, the interpolation filter should cut at exactly π/2, and also exhibit vestigial symmetry of the frequency response H(ω) between pass-band and stop-band, i.e. 1 − H(ω) = H(ω − π), for best removal of the spectral images centered at odd multiples of π introduced by up-sampling. From such considerations it stems that Burt's kernel, which is not half-band except when a = 0.5, may be suitable for Gaussian pyramid generation, but not for interpolating pyramid layers. Therefore we decided to adopt Burt's kernel for reduction and to employ cheap half-band odd-order polynomial kernels, having null even-symmetric coefficients, for expansion. The linear kernel is stated as:

H_2(z) = (1/2) z^{−1} + 1 + (1/2) z    (5)

the cubic kernel, having the same number (five) of nonzero coefficients as (4), is defined by

H_2(z) = −(1/8) z^{−3} + (5/8) z^{−1} + 1 + (5/8) z − (1/8) z^3    (6)

Also a fifth-order interpolator, with seven nonzero coefficients, has been assessed:

H_2(z) = (3/256) z^{−5} − (25/256) z^{−3} + (150/256) z^{−1} + 1 + (150/256) z − (25/256) z^3 + (3/256) z^5    (7)

The DC gain of all the above kernels equals 2, as for any 1D interpolation filter. The real-valued frequency responses H(ω) of parametric Burt's kernel (4) and of the polynomial interpolators (5)-(7) are shown in Fig. 1 a) and 1 b), respectively. Notice the parametric dependence of the pass-band and the flatness of the response for Burt's filter. As to the polynomial interpolators, the cubic kernel differs little from the fifth-order one, except for a small overshoot.

Fig. 1 - Frequency responses of the parametric filter for decimation (a) and of the polynomial interpolators (b). Values of parameter a: 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, left-to-right in (a); linear, fifth-order, and cubic interpolator, left-to-right in passband, in (b).
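The half-band and vestigial-symmetry properties discussed above can be checked numerically. The sketch below is ours, with the fifth-order coefficients as reconstructed in (7); it evaluates the real-valued response H(ω) = Σ_n h(n) e^{−jωn} of a symmetric kernel.

```python
import numpy as np

def burt_kernel(a):
    """Burt's parametric 5-tap reduction kernel (4); unit DC gain."""
    return np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])

# Fifth-order half-band interpolator (7): even-indexed taps are zero
# except the centre one; DC gain is 2, as for any 1D interpolator.
fifth = np.array([3, 0, -25, 0, 150, 256, 150, 0, -25, 0, 3]) / 256.0

def freq_resp(h, w):
    """Real-valued frequency response of a symmetric, odd-length kernel."""
    n = np.arange(len(h)) - len(h) // 2
    return np.real(np.exp(-1j * np.outer(np.atleast_1d(w), n)) @ h)

# Sample H(w) on [0, pi]; the grid is symmetric about pi/2, so reversing
# it gives H(pi - w), which is what vestigial symmetry relates to H(w).
w = np.linspace(0.0, np.pi, 257)
H = freq_resp(fifth, w)
```

For the interpolator, vestigial symmetry reads H(ω) + H(π − ω) = 2 with the DC gain of 2 used here; for Burt's kernel, only a = 0.5 zeroes the outer taps and makes the filter half-band, in agreement with the text.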

Employing two different kernels, with the interpolator being strictly half-band, makes reduction and expansion independent of one another, introducing a separate adjustability that significantly decreases the entropy, as well as the variance, of the resulting Enhanced Laplacian Pyramid (ELP) [22] with respect to Burt's Laplacian pyramid (LP). This is shown in Fig. 2 a) and 2 b), reporting the equivalent zeroth-order entropy [17], and the equivalent amplitude, of the test image Lena, chosen for comparison with the literature. The definition of equivalent measures is necessary because the total number of nodes of the pyramid N_t is greater than the number of image pixels M × N, as shown by (3). The former is defined as

H_eq = (1/MN) Σ_{k=0}^{K} (MN/4^k) H_0(L_k) = Σ_{k=0}^{K} 4^{−k} H_0(L_k)    (8)

the latter is analogously defined as the RMS value at the various levels of the LP

σ_eq = [Σ_{k=0}^{K} 4^{−k} σ²(L_k)]^{1/2}    (9)

where H_0(L_k) and σ²(L_k) respectively denote the zeroth-order entropy and the variance at the k-th level of the LP, measured with the same quantization as for the original (i.e. with unitary steps). Although efficacy is similar, the fifth-order kernel (7) has been chosen for interpolation in the experimental tests, as minimizing the equivalent entropy of the test image Lena. From Fig. 2 a), lower entropy is obtained for a ranging in 0.6 ÷ 0.7. As a trend, the minimum entropy is attained when the frequency response of the two filters in cascade (Figs. 1 a - b) is as flat as possible in pass-band, especially as the frequency approaches DC. From the two plots of Figs. 2 a) - b), it is evident that for the modified LP, minimizing energy does not correspond to optimum coding performance, as it does for Burt's LP; in fact the minimum energy is practically always attained for a = 0.5, resulting in reduction and expansion filters that are both half-band, while the minimum entropy is attained for a = 0.625 and the fifth-order interpolation filter. Therefore kernels designed for minimum energy (i.e. minimum mean square interpolation error) [23] might fail in minimizing the zeroth-order entropy of the source, which is the goal of lossless compression.

Fig. 2 - Zero-order equivalent entropy (a) and equivalent amplitude (b) of test image Lena vs. Burt's kernel parameter a. LP: Burt's kernel for both reduction and expansion, with the same parameter value. ELP: Burt's kernel for reduction and half-band polynomial kernels for expansion.
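The equivalent measures can be sketched as follows (our code; the per-level weighting by 4^{−k} reflects the node counts in (3), matching our reconstruction of the normalization in (8) and (9)):

```python
import numpy as np

def zero_order_entropy(level):
    """H_0 of one pyramid level, measured on integer-quantized values
    (unitary steps, as in the text)."""
    _, counts = np.unique(np.round(level).astype(int), return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def equivalent_entropy(glp):
    """Eq. (8): per-pixel bit budget of the whole pyramid, since level k
    holds MN/4^k nodes."""
    return sum(zero_order_entropy(L) / 4 ** k for k, L in enumerate(glp))

def equivalent_amplitude(glp):
    """Eq. (9): RMS value over the pyramid with the same 4^-k weighting."""
    return float(np.sqrt(sum(np.var(L) / 4 ** k for k, L in enumerate(glp))))
```

Passing the list of ELP levels (base first, root last) to `equivalent_entropy` reproduces the quantity plotted in Fig. 2 a).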

3. ENCODING ALGORITHM

3.1. Basic definitions

The fundamental scheme consists of transmitting the K-th level of the GP (root image) followed by the quantized and coded LP coefficients at levels from K − 1 to 0 [12]. In order to introduce a selection criterion for the information to be coded (content-driven decision), each level of the LP is divided into adjacent non-overlapped 2 × 2 blocks of coefficients (nodes), that may not be further partitioned, as information atoms. With reference to Fig. 3, a quartet of nodes at level k is taken to have one father node at level k + 1, and each node of a quartet at level k > 0 is regarded as the father node of a 2 × 2 block at level k − 1. In this way, a hierarchy is introduced in the pyramids, analogously to quad-trees, that enables the application of a split decision rule based on content.

Fig. 3 - Pyramidal quad-tree hierarchy for content-driven split decision rule.

3.2. Content-driven split decision rule

Content-driven progressive transmission, introduced by Dreizen [14] for quad-tree structured grey-level images, and by two of the authors [15] in the framework of the LP, consists of multiple breadth-first scan steps, each driven by a set of thresholds {T_k, k = 1, ..., K}, one for each pyramid level. The thresholds identify those parts of the LP that are most significant. On each 2 × 2 block of the LP at level k − 1 an activity function E_k of the father node at level k is computed. If E_k < T_k, the block is skipped; otherwise, the block (i.e. the interpolation errors) is considered for transmission. The major advantage of this approach, which has been extended to block DCT coding [24] and to the wavelet transform [25], is that the most important data are transmitted since the very early stages, which means that the information is prioritized. The basic principle is the same as in Recursive Block Coding [26], in which, however, a two-source decomposition is considered, and the stationary part of the image is reconstructed through bilinear interpolation.

In the former works by the authors [15, 16, 22], only nodes underlying a split node were considered for the content-driven transmission, while all the nodes underlying a skipped (i.e. unsplit) node were forced to be skipped as well, as in a zero-tree [25]. Therefore, in a single-step scanning fashion, the content-driven feature prevented the error-free reconstruction capability, since skipped nodes would no longer be encountered in a later step. As the present scheme features a single-step pyramid scanning, aiming more at efficient lossless performance than at multi-step progressiveness, all the quartets at any level of the ELP are checked, regardless of their fathers. This feature requires a completely different synchronization strategy, as will be shown in subsection 3.5.

A crucial point of the above outline is the choice of an efficient information measure. Several functions have been reviewed in [16]. In the present work E_k(i, j), the activity function of node (i, j) at level k, is defined as the maximum absolute value of its four sons:

E_k(i, j) = MAX_{m,n = 0,1} |L_{k−1}(2i + m, 2j + n)|    (10)

This choice represents a very simple yet efficient selection criterion, both in terms of visual quality and objective errors (MSE). In fact, due to the residual spatial correlation of Laplacian pyramids, four neighboring coefficients are likely to be of similar magnitude; therefore, thresholding their absolute maximum guarantees an efficient selection of the whole quartet.
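Rule (10) amounts to a 2 × 2 block-maximum of absolute values; a vectorized sketch (our naming):

```python
import numpy as np

def activity(L_lower):
    """E_k of every father node: max |son| over each 2 x 2 block of L_{k-1},
    as in (10)."""
    h, w = L_lower.shape
    return np.abs(L_lower).reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def split_map(L_lower, T):
    """Binary quad-tree flags at one level: True = split (quartet transmitted),
    False = skipped."""
    return activity(L_lower) >= T

# Tiny example: two active quartets out of four.
L = np.array([[1, -2, 0, 0],
              [3,  4, 0, 0],
              [0,  0, -5, 0],
              [0,  0,  0, 0]])
```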

3.3. Quantization and error feedback

Quantization is critical in pyramid and SBC schemes, because errors at higher levels spread down over lower ones and accumulate in the lower frequency components of the reconstructed version [27]. Since differential schemes must ensure that the predictor, either causal or not (interpolator), produces the same output both in compression and in decompression, in pyramid schemes quantization errors at upper levels should be fed back at the lower levels, as reported in [17]. Therefore, the ELP with quantization error feedback, i.e. L'_k, is given by

L'_k(i, j) = G_k(i, j) − Σ_m Σ_n W_2(m, n) G'_{k+1}((i + m)/2, (j + n)/2)    (11)

The term G_{k+1} in (2) has been replaced by G'_{k+1} in (11), which recursively accounts for the quantization errors from the root down to level k + 1. G'_k, the reconstructed GP at level k < K, is made up by adding L'_k, rounded to an integer multiple of the quantization step Δ_k, to the interpolated up-sampled G'_{k+1}:

G'_k(i, j) = Δ_k · round[L'_k(i, j)/Δ_k] + Σ_m Σ_n W_2(m, n) G'_{k+1}((i + m)/2, (j + n)/2)    (12)

The block diagram of the ELP scheme is outlined in Fig. 4, for layers 0 (base) to 3 (root). DPCM encoding of the root will be addressed in subsection 3.4. Notice the error feedback loops, including quantization (Q) and de-quantization (Q⁻¹) blocks, at layers 1 and 2.

Fig. 4 - Block diagram of a 4-layer ELP coder with error feedback and spatial prediction (DPCM) of the root level.

The error-feedback strategy has the effect of damping the propagation of quantization errors produced at upper levels and allows the reconstruction error to be controlled, up to lossless performance, by acting only on Δ_0. Hence, the only source of quantization error is the final quantizer at the pyramid basis, since prior errors are corrected at each level. The error-feedback mechanism also reduces the overall entropy of the ELP, by modifying the first-order distribution of the quantized residuals at each level; hence, the equivalent entropy (lossless case) turns out to be even lower than that reported in Fig. 2 a), derived in the absence of quantization errors. Moreover, the half-band filters (5)-(7) are more appropriate than Burt's kernel (4) when uncorrelated quantization errors are interpolated. Also, the activity function (10) is computed on the actual ELP, thus improving the efficiency of the content-driven decision.
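A minimal encoder/decoder pair following (11)-(12) can be sketched as below. This is our own code, with several simplifying assumptions: the linear interpolator (5) is used for expansion, Burt's kernel with a = 0.6 for reduction, borders are reflected, all level sizes are even, and the root is sent verbatim instead of being DPCM-coded as in the paper. With Δ_0 = 1 and integer input, reconstruction is exact regardless of the coarser steps used at upper levels.

```python
import numpy as np

W1 = np.array([0.25 - 0.3, 0.25, 0.6, 0.25, 0.25 - 0.3])  # Burt's kernel (4), a = 0.6
W2 = np.array([0.5, 1.0, 0.5])                            # linear half-band interpolator (5)

def filt1d(x, h):
    p = len(h) // 2
    return np.convolve(np.pad(x, p, mode="reflect"), h, mode="valid")

def sep_filter(img, h):
    out = np.apply_along_axis(filt1d, 1, img, h)
    return np.apply_along_axis(filt1d, 0, out, h)

def reduce_layer(g):
    return sep_filter(g, W1)[::2, ::2]

def expand(g, shape):
    up = np.zeros(shape)
    up[::2, ::2] = g
    return sep_filter(up, W2)

def encode(g0, steps):
    """steps[k] = Delta_k at level k; steps[0] = 1 gives lossless coding."""
    K = len(steps)
    ggp = [np.asarray(g0, float)]
    for _ in range(K):
        ggp.append(reduce_layer(ggp[-1]))
    g_rec = np.round(ggp[K])                            # root, transmitted as-is here
    q = [None] * K
    for k in range(K - 1, -1, -1):
        interp = np.round(expand(g_rec, ggp[k].shape))  # integer, as in the text
        q[k] = np.round((ggp[k] - interp) / steps[k])   # quantized L'_k, eq. (11)
        g_rec = interp + steps[k] * q[k]                # reconstructed G'_k, eq. (12)
    return np.round(ggp[K]), q

def decode(root, q, steps):
    g_rec = root
    for k in range(len(q) - 1, -1, -1):
        shape = tuple(2 * s for s in g_rec.shape)       # even sizes assumed
        interp = np.round(expand(g_rec, shape))
        g_rec = interp + steps[k] * q[k]
    return g_rec
```

Because the decoder repeats exactly the encoder's reconstruction loop, the quantization error made at level k + 1 is present in `interp` on both sides and is cancelled by the residual of level k, which is the feedback mechanism described above.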

In a practical implementation the values of G_k (1) and of the interpolated version of G'_{k+1} (12) are rounded to integers, and their differences L'_k (11) are uniformly quantized with odd integer steps, roughly doubling for decreasing k. The uniform quantizer is known to minimize the entropy of the Laplacian source [28]. The selection criterion based on thresholding the maximum absolute value of the ELP (10) introduces a dead zone in the quantizer, which decreases the entropy of the source with a moderate additional distortion. Quantization levels of the split nodes are encoded through Huffman codes fitting the statistics of each level.

The content-driven decision rule with fixed quantization of the retained quartets may be regarded as a space-variant quantization, in which quartets to be skipped are quantized with so large a step that their quantized values are zero and the four quantization errors equal the quartet itself. Therefore also all the errors introduced by the split/unsplit decision rule may be recovered at the lower levels, thus generalizing the mechanism of quantization error feedback to content-driven encoding schemes. The major asset of the error feedback is that T_k, the decision threshold at level k, is not critical for the coding performance, whether lossless or lossy, since the reconstruction error is thoroughly determined at the zeroth layer.

3.4. Coding of the root level

Analogously to the basic scheme, the p × q root level {G_K(i, j), i = 0, ..., p − 1, j = 0, ..., q − 1} is coded first. The choice of K usually varies from one image to another. As a tendency, for a given image, increasing K decreases the mean correlation length of the root level image. However, due to aliasing introduced by recursive low-pass filtering, the increase of K makes the ELP extremely energetic at high levels; therefore it is advisable to keep K reasonably low and to exploit the residual correlation of the root image by spatial DPCM encoding. A simple prediction scheme uses a linear combination of the values of four neighboring pixels, three in the previous and one in the current row. The predicted value at (i, j), Ĝ_K(i, j), is given by:

Ĝ_K(i, j) = α G_K(i, j − 1) + β G_K(i − 1, j − 1) + γ G_K(i − 1, j) + δ G_K(i − 1, j + 1)    (13)

The optimal coefficients α, β, γ, δ are found by minimizing ε², the mean square error between the original and the predicted root:

ε² = E{[G_K(i, j) − Ĝ_K(i, j)]²}    (14)

where E(·) is the expectation operator. The solution of this problem is well known in the literature [6, 9]. For an Auto-Regressive (AR) model of the field it involves the inversion of a (square) correlation matrix of size equal to the prediction length, that is 4 in this case. Nonzero prediction errors are Huffman coded, while the positions of zero/nonzero quantization levels are run-length encoded, as will be described in subsection 3.5. The best choice of K for the simple predictor adopted,

considering only neighboring pixels at unitary distance, is the value for which the resulting root image exhibits an average correlation length d as close to unity as possible; in fact, if d < 1 the field is little correlated and any prediction is ineffective; conversely, if d > 1 the predictor would require a longer memory. Hence, K should be increased for spatially correlated data, whereas little correlated images benefit from a smaller K.
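The four prediction coefficients of (13) can be obtained by least squares, which is equivalent to solving the normal equations with the 4 × 4 correlation matrix mentioned above. A sketch with our own naming:

```python
import numpy as np

def fit_ar4(root):
    """Least-squares fit of the causal predictor (13):
    neighbours W, NW, N, NE of each interior pixel of the root image."""
    g = np.asarray(root, dtype=float)
    y = g[1:, 1:-1].ravel()                  # targets G_K(i, j), i >= 1, 1 <= j <= N-2
    X = np.stack([g[1:, :-2].ravel(),        # G_K(i, j-1)   -> alpha
                  g[:-1, :-2].ravel(),       # G_K(i-1, j-1) -> beta
                  g[:-1, 1:-1].ravel(),      # G_K(i-1, j)   -> gamma
                  g[:-1, 2:].ravel()],       # G_K(i-1, j+1) -> delta
                 axis=1)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef                # coefficients and prediction errors

# Demo: a plane image admits an exact causal linear prediction.
i, j = np.mgrid[0:12, 0:12]
plane = 2.0 * i + 3.0 * j + 7.0
coef, err = fit_ar4(plane)
```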

3.5. Run-length encoding of the split quad-tree

The introduction of a selective choice of the nodes to be split is effective in the coding scheme, but needs a synchronization overhead. Flag bits constitute a binary quad-tree [29], whose root corresponds to level K, and whose bottom to level 1 of the ELP. Each one marks the split of the related pyramid node into the quartet of its sons (Fig. 3). Conversely, each zero denotes that the underlying quartet of interpolation errors has not been considered.

A straightforward encoding of the split-tree is not efficient, because of the strong correlation existing within each of its levels, reflecting the features of the ELP: contours and active regions are marked by clustered ones, while homogeneous areas by gatherings of zeroes.

An easy way to exploit such correlation is to run-length encode the 1D sequences of zeroes and ones. Each level of the synchronization quad-tree is partitioned into square blocks; each block is zig-zag scanned to yield sequences of zeroes or ones (i.e. the runs), possibly continuing inside the next block. This scanning is better than raster scanning, taking into account correlation not only along rows but also along columns: tests on RX images have shown that the average length of the runs is maximized when 8 × 8 blocks are considered.

Run lengths of zeroes and ones are separately coded with Laemmel codes (A-codes) and Hasler codes (B-codes) [30], which are more efficient for run-lengths than Huffman's, because for decoding they do not require knowledge of the run-length occurrences, whose number may be extremely large. At each pyramid level a preliminary check is made on whether A or B codes are more efficient: in fact, the former are suitable for short runs, while the latter better fit long runs. Also the optimal blocksize for the A and B codes [30] is chosen level by level. With integrated use of A and B codes, the average coding length approaches the entropy of the run-lengths (first-order Markov source [30]), with redundancy always lower than 3%, as little additional information is to be specified, differently from Huffman's codes.
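The blockwise zig-zag scan and run extraction can be sketched as follows (our code; the A/B variable-length codes themselves are not reproduced here):

```python
import numpy as np

def zigzag_indices(n):
    """(i, j) visiting order of an n x n block along anti-diagonals,
    reversing direction on every other one."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2:
            diag.reverse()
        order.extend(diag)
    return order

def blockwise_zigzag(flags, bs=8):
    """Concatenate the zig-zag scans of all bs x bs blocks (raster block order);
    runs may continue from one block into the next."""
    h, w = flags.shape
    seq = []
    for bi in range(0, h, bs):
        for bj in range(0, w, bs):
            seq.extend(int(flags[bi + i, bj + j]) for i, j in zigzag_indices(bs))
    return seq

def run_lengths(seq):
    """(symbol, length) pairs; zero-runs and one-runs would then be fed
    separately to the A- or B-codes."""
    runs, cur, n = [], seq[0], 0
    for b in seq:
        if b == cur:
            n += 1
        else:
            runs.append((cur, n))
            cur, n = b, 1
    runs.append((cur, n))
    return runs
```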

4. CODING RESULTS AND COMPARISONS

Coding results have been assessed by considering the total (i.e. gross) bit-rate (BR) in bits per pixel (bpp), with side information like image size, quantization steps, occurrences of the quantization levels at each pyramid layer for Huffman codes, and the option for A or B run-length codes and block sizes, all included in a header. Distortion is measured by means of the Mean Square Error (MSE), the Peak Signal-to-Noise Ratio (PSNR), and the Peak Error (PE).

If G_0(i, j) (GP zeroth layer) denotes the original and Ĝ_0(i, j) its reconstructed version, then

PE = MAX_{i,j} |G_0(i, j) − Ĝ_0(i, j)|

MSE = (1/MN) Σ_i Σ_j [G_0(i, j) − Ĝ_0(i, j)]²

PSNR = 10 log_10 (G_fs² / MSE)

where G_fs denotes full-scale, namely 255 for 8-bit images.

Table 1 - Bit-rates and distortions for the coding tests shown in Fig. 5.

BR (bpp) | PE | MSE  | PSNR (dB)
4.72     |  0 |  0   | ∞
0.8      | 10 | 11.2 | 37.65
0.4      | 19 | 21.8 | 34.74
0.2      | 47 | 41.7 | 31.93

Figs. 5 a) - d) show the original monochrome Lena
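The three distortion measures can be computed as below (a trivial sketch with our naming; G_fs = 255 for 8-bit data):

```python
import math
import numpy as np

def peak_error(orig, rec):
    """PE: maximum absolute reconstruction error."""
    return float(np.max(np.abs(np.asarray(orig, float) - np.asarray(rec, float))))

def mse(orig, rec):
    """Mean square error over all pixels."""
    return float(np.mean((np.asarray(orig, float) - np.asarray(rec, float)) ** 2))

def psnr(orig, rec, full_scale=255):
    """PSNR in dB; infinite for a lossless reconstruction."""
    m = mse(orig, rec)
    return math.inf if m == 0 else 10.0 * math.log10(full_scale ** 2 / m)
```

Note the PSNR values in Tables 1 and 2 are consistent with this definition, e.g. MSE = 11.2 gives 10 log10(255²/11.2) ≈ 37.6 dB.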

(512 x 512, 8 bits/pixel (bpp), with H, =7.45 bitdpixel) and three coded versions at 0.8, 0.4 and 0.2 bitslpixel, respectively. All the related distortion values are report- ed in Table 1. Also extremely pleasant versions of re- duced sizes (token images) are made available by the procedure at no additional cost, as intermediate recon-

Level (size)

3 (64x64)

Fig. 6 - Token images from lossless coding of Lena; rates and distor- tions in Table 2.

CR PE MSE PSNR(dB)

118.8 1 0.68 49.8

struction steps. The levels 3 to 1 of the reconstructed Gaussian pyramid in the lossless case are shown in Fig. 6. For each token image Table 2 reports compression ratios with respect to the 512 x 512 original and the dis- tortion values with respect to the original Gaussian pyr- amid (l), assumed as reference. Rate Distortion curves, relating MSE to BR, are drawn in Fig. 7 relatively to Le- na coded with four different algorithms:

I 2(128x 128) 1 42.4

1 (256 x 256) 18.62

i) Burt's Laplacian pyramid (LP), with feedback of quantization error;

ii) DPCM with 4 pel AR predictor (13) and IU en- coding of zero-nonzero prediction errors;

iii) JPEG (XV, Rel. 3.0); iv) Content-driven coding of the ELP with feed-

Fig. 5 - Original 8 bit Len0 (a) and coding results of ELP scheme (a = 0.59375. half-band 5-th-order interpolator. content-driven split and error feedback) in (b). (c), and (d); gmss rates, including all side in- formation. and distortions in Table 1.

back of quantizationldecision errors.

~

43 3 3.25

5 6.67 39.9

Table 1 - Bit-Rates and distortions for the coding tests shown in Fig. 5. Table 2 - Compression ratios (CR) towards 8 bit original,

and distortions for the token images shown in Fig. 6.
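The three figures of merit can be computed directly from their definitions; the short sketch below (the function name is ours) evaluates PE, MSE and PSNR for images given as flat sequences of grey levels:

```python
import math

def distortions(orig, recon, full_scale=255):
    # peak error, mean square error and peak SNR between an original
    # image and its reconstruction (flat sequences of grey levels)
    diffs = [a - b for a, b in zip(orig, recon)]
    pe = max(abs(d) for d in diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    psnr = math.inf if mse == 0 else 10 * math.log10(full_scale**2 / mse)
    return pe, mse, psnr
```

For a lossless reconstruction the function returns PE = 0, MSE = 0 and an infinite PSNR, matching the first row of Table 1.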

Vol. 6, No. 3 May - June 1995 307



       BR (bpp)   PE   MSE    PSNR (dB)
   (a) 1.86        0   0      ∞
   (b) 0.15        5   0.94   48.42
   (c) 0.08        8   1.94   45.26
   (d) 0.15       25   7.87   39.17


Fig. 7 - Rate-distortion performance of several compression schemes for Lena.

Thanks to the error-feedback capability, lossless reconstruction is achieved by both pyramidal schemes through coarse quantization at the upper levels and recovery of the errors at level zero, by setting the quantization step Δ0 to unity. The DPCM scheme is also able to attain reversible compression through a unitary step. Conversely, JPEG cannot provide error-free results, as is well known. The four plots show that the pyramid scheme proposed in this paper is more efficient than the classical LP with error recovery [16], as well as than DPCM and JPEG. Moreover, the MSE-BR performance of both pyramid schemes is steady over low, medium and high rates, without critical points or cutoffs.
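The reason a unitary step Δ0 suffices for reversibility can be sketched on a toy two-level 1-D pyramid with error feedback (the reduction and expansion filters below are deliberately trivial stand-ins for the paper's kernels, not the actual ELP filters):

```python
def reduce2(x):
    # toy reduction filter: rounded pairwise average, integer output
    return [(x[2*i] + x[2*i + 1] + 1) // 2 for i in range(len(x) // 2)]

def expand2(x):
    # toy expansion filter: zero-order hold (sample repetition)
    out = []
    for v in x:
        out += [v, v]
    return out

def quant(v, step):
    # uniform mid-tread quantizer; identity on integers when step == 1
    return round(v / step) * step

def encode_decode(g0, step1, step0):
    g1 = reduce2(g0)
    g1_hat = [quant(v, step1) for v in g1]      # coarse top level
    # error feedback: the residual is computed against the *decoded*
    # upper level, so level-1 quantization error is recovered at level 0
    l0 = [a - b for a, b in zip(g0, expand2(g1_hat))]
    l0_hat = [quant(v, step0) for v in l0]
    return [a + b for a, b in zip(expand2(g1_hat), l0_hat)]
```

With integer data and step0 = 1 the bottom-level quantizer is the identity, so reconstruction is exact whatever coarse step is used above; with step0 > 1 the maximum absolute error stays bounded by step0/2.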

Since the goals of this coding scheme are the archival and transmission of medical images, all further results are addressed in this direction. Fig. 8 a) shows a 768 x 832 RX image of a chest, scanned at 60 dpi (dots per inch). Only the 8 MSBs have been selected, for lossless compression as well as for display; the zero-order entropy of the image source is H0 = 6.71 bits/pixel. Figs. 8 b) and 8 c) display reconstructions after application of the proposed

Fig. 8 - Lossless and lossy coding of the original 8-bit RX: ELP with a = 0.625, fifth-order interpolator, content-driven split and error feedback in (a), (b), (c); JPEG in (d); rates and distortions in Table 3.

pyramid scheme (reduction kernel with a = 0.625, fifth-order interpolator, content-driven split and error feedback) at different rates. Fig. 8 d) portrays the result of the JPEG scheme at the same BR as in Fig. 8 b). Visual judgement confirms that significant features, such as contours and especially fine textures, are preserved, and that no appreciable artifacts are introduced in homogeneous regions, even at the lowest rate; JPEG, instead, shows tiling effects due to block DCT coding. The compression performance of the tests of Figs. 8 a)-d), with the reversible case listed first, is reported in Table 3: both average and maximum errors prove that the novel pyramid scheme outperforms JPEG. MSE-BR plots of schemes i) to iv) are supplied in Fig. 9 for the RX image. The ELP scheme still attains the best performance at any rate.

Table 3 - Bit-rates and distortions for the tests shown in Fig. 8.


Fig. 9 - Rate Distortion performance of some compression schemes for test RX image.

The superiority of the novel pyramid scheme over JPEG stands out not only in terms of global error (MSE or PSNR) but also of local error (PE), which is kept reasonably low by the error-feedback mechanism even at extremely low rates. This trend is fully exposed in the plots of Fig. 10, in which the PE of the RX image is reported against BR for the four above-mentioned schemes. Again, both pyramid schemes outperform the other two.

Besides the efficiency and progressiveness features outlined in this section, its simple and fast algorithm recommends this method for PACS organization as well as for image browsing and retrieval from remote terminals.




Fig. 10 - Peak Error versus Bit Rate of test RX image.

Since no floating-point computation is required, on-line compression/decompression is feasible on general-purpose computers.

A final remark concerns the robustness of the scheme to changes of the filtering parameters: tests carried out with different (non-optimal) kernels have shown very little performance penalty, thus validating the half-band interpolator when embedded in an error-feedback mechanism.

5. CONCLUSIONS

A novel encoding scheme has been proposed for reversible compression of medical images. It exploits an enhanced LP (ELP) with entropy and energy lower than the classical LP, achieved by means of different reduction and expansion filters. The scheme features a content-driven decision rule based on a simple L1 activity measure on 2 x 2 ELP blocks. Errors due both to quantization and to ELP nodes skipped by the decision process are fed back at the lower pyramid levels. This strategy ensures control of the maximum absolute reconstruction error up to reversibility, improves the robustness of the algorithm and reduces the zero-order entropy of the ELP. Synchronization of the quartets of values to be encoded is provided by a binary quad-tree, efficiently implemented on a layer basis by run-length encoding the split/unsplit pyramid nodes through integrated use of A and B codes. The complete outline may be regarded as a hierarchical interpolative (i.e. non-causal) spatial DPCM, with the advantages of differential schemes in terms of error control and a source decorrelation more effective than that of a spatial predictive (i.e. causal) method, due to its intrinsic capability of coping with the nonstationarity of the data field. Moreover, control of the maximum absolute reconstruction error prevents the introduction of visual artifacts, like tiling and ringing effects, in textures and homogeneous regions of the lossy reconstructed versions, even at high compression ratios; in fact, the algorithm neither operates on blocks, like JPEG and other DCT-based schemes, nor utilizes selective filters, which may cause the onset of artifacts in the neighborhood of step edges, as in SBC. Reversible results on medical images are superior to those of a 2-D DPCM scheme, whereas lossy coding outperforms JPEG as well as the classical pyramid scheme. Moreover, no floating-point computation is required, and real-time archival, as well as retrieval, is feasible on commercial general-purpose computers.
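The content-driven decision rule summarized above can be illustrated with a short sketch (the L1 activity measure and the threshold are assumptions for this example): each 2 x 2 block of an ELP layer is flagged split (1) when its activity exceeds a threshold, and the resulting binary map is what the quad-tree run-length coder transmits.

```python
def split_map(level, threshold):
    # activity-driven split decision on 2x2 blocks of one ELP layer
    # ('level' is a 2-D list of detail coefficients); a block is marked
    # split (1) when its L1 activity exceeds the threshold, so only
    # significant detail quartets are handed to the encoder
    h, w = len(level), len(level[0])
    flags = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            block = [level[i][j], level[i][j + 1],
                     level[i + 1][j], level[i + 1][j + 1]]
            activity = sum(abs(v) for v in block)   # L1 activity (assumed)
            row.append(1 if activity > threshold else 0)
        flags.append(row)
    return flags
```

In the full scheme, the values of unsplit blocks are not discarded outright: they are fed back to the level below, consistently with the error-feedback strategy that bounds the maximum reconstruction error.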

Acknowledgement

The authors wish to gratefully acknowledge the valuable support of Prof. Raffaella De Dominicis of the Dipartimento di Fisiologia Clinica, University of Florence, for her radiological skill and expertise, and for her useful suggestions throughout this work.

Manuscript received on September 9, 1994.

REFERENCES

[1] G. R. Kuduvalli, R. M. Rangayyan: Performance analysis of reversible image compression techniques for high-resolution digital teleradiology. "IEEE Trans. Medical Imaging", Vol. MI-11, No. 3, Nov. 1992, p. 430-445.

[2] R. E. Roger, J. F. Arnold: Reversible image compression bounded by noise. "IEEE Trans. Geosci. Remote Sensing", Vol. GE-32, No. 1, Jan. 1994, p. 19-24.

[3] P. C. Cosman, R. M. Gray, R. A. Olshen: Evaluating quality of compressed medical images: SNR, subjective rating, and diagnostic accuracy. "Proc. IEEE", Vol. 82, No. 6, Jun. 1994, p. 919-932.

[4] N. M. Nasrabadi, R. A. King: Image coding using vector quantization: A review. "IEEE Trans. Communications", Vol. COM-36, No. 8, Aug. 1988, p. 957-971.

[5] F. G. B. De Natale, G. S. Desoli, D. D. Giusto, G. Vernazza: Data compression for image archiving applications. "ETT", Vol. 4, No. 2, Mar.-Apr. 1993, p. 183-191.

[6] A. N. Netravali, B. G. Haskell: Digital picture representation and compression. Plenum Press, New York, 1988.

[7] J. W. Woods, S. D. O'Neil: Sub-band coding of images. "IEEE Trans. Acoustics, Speech and Signal Processing", Vol. ASSP-34, No. 10, Oct. 1986, p. 1278-1288.

[8] R. B. Arps, T. K. Truong: Comparison of international standards for lossless still image compression. "Proc. IEEE", Vol. 82, No. 6, Jun. 1994, p. 889-899.

[9] M. Das, S. Burgett: Lossless compression of medical images using two-dimensional multiplicative autoregressive models. "IEEE Trans. Medical Imaging", Vol. MI-12, No. 4, Dec. 1993, p. 721-726.

[10] K. H. Tzou: Progressive image transmission: a review and comparison of techniques. "Optical Engineering", Vol. 26, No. 7, Jul. 1987, p. 581-589.

[11] K. R. Sloan, S. L. Tanimoto: Progressive refinement of raster scan images. "IEEE Trans. Computers", Vol. C-28, No. 11, Nov. 1979, p. 871-874.

[12] P. J. Burt, E. H. Adelson: The Laplacian pyramid as a compact image code. "IEEE Trans. Communications", Vol. COM-31, No. 4, Apr. 1983, p. 532-540.

[13] P. J. Burt: Fast filter transforms for image processing. "Computer Vision, Graphics, and Image Processing", Vol. 16, No. 1, Oct. 1981, p. 20-51.

[14] H. M. Dreizen: Content-driven progressive transmission of grey-scale images. "IEEE Trans. Communications", Vol. COM-35, No. 3, Mar. 1987, p. 289-296.

[15] G. Mongatti, L. Alparone, S. Baronti: Entropy criterion for progressive Laplacian pyramid-based image transmission. "Electronics Letters", Vol. 25, No. 7, 1989, p. 450-451.

[16] G. Mongatti, L. Alparone, G. Benelli, S. Baronti, F. Lotti, A. Casini: Progressive image transmission by content driven Laplacian pyramid encoding. "IEE Proc.-I, Communications, Speech and Vision", Vol. 139, No. 5, Oct. 1992, p. 495-500.

[17] L. Wang, M. Goldberg: Comparative performance of pyramid data structures for progressive image transmission. "IEEE Trans. Communications", Vol. COM-39, No. 4, Apr. 1991, p. 540-548.

[18] G. K. Wallace: The JPEG still picture compression standard. "Communications of the ACM", Vol. 34, No. 4, 1990, p. 30-44.

[19] D. Marr: Vision. Freeman, San Francisco, CA, 1982.

[20] P. Meer, E. S. Baugher, A. Rosenfeld: Frequency domain analysis and synthesis of image pyramid generating kernels. "IEEE Trans. Pattern Analysis and Machine Intelligence", Vol. PAMI-9, No. 4, Apr. 1987, p. 512-522.

[21] R. E. Crochiere, L. R. Rabiner: Multirate digital signal processing. Prentice-Hall, Englewood Cliffs, NJ, 1983.

[22] S. Baronti, A. Casini, F. Lotti, L. Alparone: Content-driven differential encoding of an enhanced image pyramid. "Signal Processing: Image Communication", Vol. 6, No. 5, Oct. 1994, p. 463-469.

[23] F. Chin, A. Choi, Y. Luo: Optimal generating kernels for image pyramids by piecewise fitting. "IEEE Trans. Pattern Analysis and Machine Intelligence", Vol. PAMI-14, No. 12, Dec. 1992, p. 1190-1198.

[24] Y. Huang, H. M. Dreizen, N. P. Galatsanos: Prioritized DCT for compression and progressive transmission of images. "IEEE Trans. Image Processing", Vol. IP-1, No. 4, Oct. 1992, p. 477-487.

[25] J. M. Shapiro: Embedded image coding using zerotrees of wavelet coefficients. "IEEE Trans. Signal Processing", Vol. 41, No. 12, Dec. 1993, p. 3445-3462.

[26] P. M. Farrelle: Recursive block coding for image data compression. Springer-Verlag, Berlin-New York, 1990.

[27] K. M. Uz, M. Vetterli, D. J. LeGall: Interpolative multiresolution coding of advanced television with compatible subchannels. "IEEE Trans. Circuits and Systems for Video Technology", Vol. 1, No. 1, Mar. 1991, p. 86-99.

[28] N. Farvardin, J. W. Modestino: Optimum quantizer performance for a class of non-Gaussian memoryless sources. "IEEE Trans. Information Theory", Vol. IT-30, May 1984, p. 485-497.

[29] C. A. Shaffer, H. Samet: Optimal quadtree construction algorithms. "Computer Vision, Graphics, and Image Processing", Vol. 37, No. 3, Mar. 1987, p. 402-419.

[30] H. Meyr, H. G. Rosdolsky, T. S. Huang: Optimum run length codes. "IEEE Trans. Communications", Vol. COM-22, No. 6, Jun. 1974, p. 826-835.
