B-term Approximation Using Tree-Structured Haar Transforms

Hsin-Han Ho†, Karen O. Egiazarian*, and Sanjit K. Mitra†

†Dept. of ECE, University of California, Santa Barbara, CA 93106, U.S.A.
*Dept. of Signal Processing, Tampere University of Technology, P.O. Box 553, 33101,

Tampere, Finland

ABSTRACT

We present a heuristic solution for B-term approximation of 1-D discrete signals using Tree-Structured Haar (TSH) transforms. Our solution consists of two main stages: best basis selection and greedy approximation. In addition, when approximating the same signal with different B constraints or error metrics, our solution provides the flexibility of reducing overall computation time by increasing overall storage space. We adopt a lattice structure to index basis vectors, so that one index value can fully specify a basis vector. Based on the concept of fast computation of the TSH transform by a butterfly network, we also develop an algorithm for directly deriving butterfly parameters and incorporate it into our solution. Results show that, when the error metric is either the normalized ℓ1-norm or the normalized ℓ2-norm, our solution has approximation quality comparable to (sometimes better than) prior data synopsis algorithms.

Keywords: Haar transform, approximation, data synopsis, best basis algorithm

1. INTRODUCTION

Approximation techniques are of fundamental importance due to their wide applicability in science and engineering. In particular, in many database applications it is often desirable to have a quality approximation of a target signal, called a synopsis, given a space constraint.1, 2 Due to this demand many synopsis algorithms have been developed. Although some of them are developed for higher dimensions, here we focus on approximating 1-D discrete signals. More specifically, our objective is, given an original signal f of length N, to construct its approximation f̂ by B (≪ N) non-zero coefficients and B indices, such that a pre-defined error metric is minimized. The error metric we use here is the normalized ℓp-norm:

( Σi |fi − f̂i|^p / N )^{1/p},  p ∈ [1, ∞].
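As a concrete reference, the error metric above can be written as a short function. The following is a minimal sketch; the function name and the NumPy usage are our own illustration, not from the paper:

```python
import numpy as np

def normalized_lp_error(f, f_hat, p):
    """Normalized l_p-norm error: (sum_i |f_i - f_hat_i|^p / N)^(1/p).
    p = inf gives the maximum absolute error."""
    diff = np.abs(np.asarray(f, dtype=float) - np.asarray(f_hat, dtype=float))
    if np.isinf(p):
        return float(diff.max())
    return float((np.sum(diff ** p) / diff.size) ** (1.0 / p))
```

For example, with f = (1, 2, 3, 4) and f̂ = (1, 2, 3, 0), the normalized ℓ1 error is 1, the normalized ℓ2 error is 2, and the normalized ℓ∞ error is 4.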

Synopsis algorithms can be generally classified into two principal approaches: 1) histogram-based approaches,3–9 and 2) hierarchical approaches utilizing dyadic Haar wavelets.8, 10–14 Very recently, combined approaches have also emerged, such as Haar+,15 Compact Hierarchical Histogram (CHH)16 and Lattice Histogram (LH).17 Among these algorithms, LH has particularly attracted our attention. In the LH paper, an optimal approach and a heuristic approach are proposed. Both approaches run under a space constraint B and a resolution parameter δ. In the general case of a monotonic distributive error metric, defined in Ref. 15, the optimal approach minimizes error but is space expensive. In the special case of the maximum error metric, the optimal approach requires less space than in the general case by solving the dual error-bounded problem followed by a binary search in the error domain. To reduce space usage in the general error metric case, a heuristic approach is proposed. It consists of a two-stage optimization: 1) obtaining the optimal solution, namely node locations and node values, under an appropriate maximum error metric, and 2) adjusting node values following the spirit of Ref. 16.

Further author information: (Send correspondence to H.H.)
H.H.: E-mail: [email protected]
K.O.E.: E-mail: karen.egiazarian@tut.fi
S.K.M.: E-mail: [email protected]

Image Processing: Algorithms and Systems VII, edited by Jaakko T. Astola, Karen O. Egiazarian, Nasser M. Nasrabadi, Syed A. Rizvi, Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 7245, 724505

© 2009 SPIE-IS&T · CCC code: 0277-786X/09/$18 · doi: 10.1117/12.816680

SPIE-IS&T/ Vol. 7245 724505-1

When the value of the resolution parameter δ is properly set, both approaches of LH significantly outperform all earlier algorithms in terms of approximation quality. The advantage of LH lies in its good mix of local and non-local approximation, achieved by adopting the lattice data structure. Local approximation is carried out using a single coefficient to represent multiple consecutive data points, whereas non-local approximation is naturally embedded in the hierarchical structure of the lattice. When the error metric is the normalized ℓ1-norm, optimal LH requires O((Δ/δ)N³B²) time complexity and O((Δ/δ)N²B) space complexity. However, before obtaining the final satisfactory approximation, it seems necessary to spend additional time/space resources to obtain a moderate δ value. Based on this observation, we feel it is of interest to develop an approximation algorithm with lower time and space complexity than both optimal LH and heuristic LH, namely keeping the advantage of the lattice data structure while removing the resolution parameter δ.

To this end, we have chosen Tree-Structured Haar (TSH) transforms18 for developing our approximation algorithm. TSH transforms are a family of generalized Haar transforms defined by binary interval splitting trees (BISTs). Each tree splitting node specifies the support of the corresponding basis vector. It is straightforward to embed basis vector information into the lattice data structure, where each lattice node specifies a basis vector φi and its coefficient ci conveys the information for constructing f̂ = Σi ci φi. In addition, we have chosen ci = ⟨f, φi⟩ to avoid the use of a resolution parameter. This is in contrast to the unrestricted Haar wavelets13 in which ci ∈ ℝ.

We have taken a heuristic approach for determining the B non-zero coefficients of TSH transforms. More specifically, our solution consists of two main stages: 1) selecting the best basis whose coefficients ci = ⟨f, φi⟩ have the minimum cost with respect to a pre-chosen additive cost function such as the ℓp-norm or entropy, and 2) greedy approximation using the best basis. In stage one, we select the minimum cost basis from a large library of piecewise constant orthonormal bases by varying the structure of the BIST. The best basis should capture the signal's main features in just a few basis vectors, giving low approximation error after properly choosing B non-zero coefficients. In stage two, we use the greedy algorithm of Guha and Harb.13 This algorithm supports error metrics in all ℓp-norms, p ∈ [1, ∞], and guarantees that the final approximation error is within a finite distance of the minimum approximation error of the optimal solution. Although it is developed for compactly supported dyadic wavelet bases, it also works for TSH transforms. Our solution has lower time/space complexity than, and approximation quality comparable to, heuristic LH for the case of the normalized ℓ1-norm and normalized ℓ2-norm error metrics.

It can be immediately seen that this approach belongs to B-term approximation in approximation theory. B-term approximation is defined as selecting B non-zero coefficients for a pre-defined orthonormal basis such that the approximation error is minimized.19 Often the best basis is first selected from a large library according to some cost criterion. For example, Coifman and Wickerhauser provide an O(N log N) time algorithm to select the basis with minimum entropy cost from a library of O(2^N) bases.20 Due to this double stage of optimization, this type of approximation is highly nonlinear.19

2. TREE-STRUCTURED HAAR TRANSFORMS

TSH transforms18 are generalizations of the classical dyadic Haar transform, obtained by allowing the support of each basis vector (row in the transform matrix) to be split arbitrarily into one part for positive values and the other part for negative values, except for the flat (DC in circuit theory) basis vector. Among all non-flat basis vectors, a hierarchical dependency exists between parent (longer) and child (shorter) basis vectors. More specifically, the support for positive (or negative) values of a parent vector is equal to the support of the child vector. All basis vectors are defined by a pre-chosen binary tree, called a BIST, whose every splitting node is associated with a piecewise constant function, called a TSH function, with one positive part next to one negative part. The BIST root node is associated with one TSH function and one flat (DC) function. A TSH basis vector of length N is then sampled from each TSH function at N equally spaced points. For example, given the BIST in Figure 1(a), the associated TSH transform matrix H is given in Eq. (1).



        ⎡  √(1/5)       √(1/5)       √(1/5)       √(1/5)      √(1/5)    ⎤
        ⎢  √(2/(3·5))   √(2/(3·5))   √(2/(3·5))  −√(3/(2·5)) −√(3/(2·5))⎥
    H = ⎢  √(1/(2·3))   √(1/(2·3))  −√(2/(1·3))   0           0         ⎥        (1)
        ⎢  √(1/2)      −√(1/2)       0            0           0         ⎥
        ⎣  0            0            0            √(1/2)     −√(1/2)    ⎦

Recently, Fryzlewicz also generalized the classical dyadic Haar transform and named it the Discrete Unbalanced Haar Transform (DUHT).21 DUHT does not explicitly take any tree structure for constructing the transform matrix. Instead, a set of N − 1 breakpoints is used to specify the N × N transform matrix. For example, H in Eq. (1) can be specified by the set of breakpoints {3, 2, 1, 4}. It should be noted that a breakpoint together with its location in the set specifies a basis vector; thus a breakpoint set may not be suitable for specifying an arbitrary set of basis vectors. In the scenario of B-term approximation, we would need a method to index basis vectors, so that one index value specifies one basis vector.
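To make the breakpoint-set description concrete, the following sketch rebuilds an N × N TSH matrix from a breakpoint set. The function name, the interval bookkeeping, and the value formulas (read off Eq. (1)) are our own illustration, not code from Ref. 18 or 21:

```python
import numpy as np

def tsh_matrix(bs, n):
    """Build an n x n TSH transform matrix from a breakpoint set.
    Each breakpoint b splits its enclosing interval [lo, hi] into a
    positive part [lo, b] and a negative part [b+1, hi]."""
    rows = [np.full(n, 1.0 / np.sqrt(n))]      # flat (DC) basis vector
    intervals = [(1, n)]                       # 1-based, inclusive
    for b in bs:
        lo, hi = next(iv for iv in intervals if iv[0] <= b < iv[1])
        pos_len, neg_len, total = b - lo + 1, hi - b, hi - lo + 1
        row = np.zeros(n)
        row[lo - 1:b] = np.sqrt(neg_len / (pos_len * total))
        row[b:hi] = -np.sqrt(pos_len / (neg_len * total))
        rows.append(row)
        intervals.remove((lo, hi))
        intervals += [(lo, b), (b + 1, hi)]    # children become splittable
    return np.vstack(rows)

# The breakpoint set {3, 2, 1, 4} reproduces H of Eq. (1); H is orthonormal.
H = tsh_matrix([3, 2, 1, 4], 5)
assert np.allclose(H @ H.T, np.eye(5))
```

Note that the rows come out in breakpoint-set order, which is why a breakpoint's position in the set matters, as remarked above.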

2.1. Indexing Basis Vectors

Inspired by LH,17 we have used a similar lattice structure to index basis vectors. Figure 1(b) illustrates the lattice structure and indices associated with the TSH basis defined by the BIST in Figure 1(a). The original signal {f1, f2, f3, f4, f5} is placed under the lattice in order to illustrate the support range of each lattice node. The TSH basis is indexed by the set of non-negative integers {0, 3, 12, 17, 20}.

The detailed indexing scheme is described as follows. For a TSH basis of size N, the associated lattice consists of N(N + 1)(N − 1)/6 nodes, fully specifying the dictionary of non-flat basis vectors, among which N − 1 orthonormal vectors are included in the basis. Each included vector is denoted by a black node (•). Positive integers are used to index lattice nodes, whereas 0 denotes the flat (DC) basis vector. All basis vectors with the same support correspond to lattice nodes stacked at the same location.
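The node count N(N + 1)(N − 1)/6 can be checked by enumerating all (support, breakpoint) pairs; a small sanity-check sketch (names ours):

```python
def lattice_size(n):
    """Count the non-flat dictionary vectors: a support [p, q] with p < q
    together with a breakpoint b satisfying p <= b < q."""
    return sum(1 for p in range(1, n + 1)
                 for q in range(p + 1, n + 1)
                 for b in range(p, q))

# Matches the closed form N(N+1)(N-1)/6; for N = 5 there are 20 lattice
# nodes, consistent with the indices running up to 20 in Figure 1(b).
assert lattice_size(5) == 5 * 6 * 4 // 6
```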


Figure 1. (a) Example of a binary interval splitting tree, and (b) the corresponding lattice structure and indices. The original signal {f1, f2, f3, f4, f5} is placed under the lattice.

2.2. Butterfly Network Computation by Parameter Table

In Ref. 18, a fast algorithm for computing the TSH transform in O(N) operations is proposed. Fast computation is achieved by a series of high-pass and low-pass operations, called a butterfly network. Based on an input BIST, a set of sparse matrices is constructed such that their product is equal to the TSH transform matrix. Each sparse matrix specifies one layer of the butterfly network. In this section, we propose an algorithm to compute the butterfly parameters directly without using any sparse matrix. This algorithm takes a breakpoint set BS as input, and then outputs a table of butterfly parameters TB and a reordered set of breakpoints RBS. Each table



cell corresponds to a layer of the butterfly network. RBS is created by permuting BS according to the order of the sorted parameters in TB, and is converted to lattice indices in our proposed heuristic solution.

Our algorithm for building the parameter table for a size-N TSH transform is as follows:

Algorithm: BuildParameterTable(BS)

Input: Breakpoint set BS = {b1, b2, ..., bN−1} which specifies a TSH basis in ℝ^N.

Output: A table TB (with O(N) space) consisting of butterfly parameters, and a reordered set of breakpoints RBS.

Step 1 Initialize an array {a1, a2} = {1, N}. According to each bk ∈ BS, iteratively split the array at al−1 ≤ bk < al and insert the values {bk, bk + 1}. Extract 4 butterfly parameters each time the array is split: 1) 2 parameters hk,1 = al − bk, hk,2 = −(bk − al−1 + 1) for the high-pass operation, and 2) 2 indices {al−1, bk + 1} for the low-pass and high-pass coefficients. Parameters of the low-pass operation are set to {1, 1} throughout this step.

Step 2 Use stable sorting to sort the butterfly parameters twice, the first time by |hk,2| and the second time by hk,1 + |hk,2|. Permute BS by the order of the sorted parameters, and store it into a new set RBS. Group parameters with the same high-pass operation and store them into one cell in TB.

Time and Space Complexity: Since we use an array data structure in Step 1, it takes O(N²) time to process the N − 1 breakpoints. Step 2 takes O(N log N) time due to the sorting operations. Space complexity is O(N) due to the growing array and the parameter table.

For example, given the input breakpoint set BS = {4, 2, 1, 3, 6, 5, 9, 8, 7}, Step 1 is illustrated in Figure 2. Figure 3(a) shows the output parameter table, which specifies the butterfly network in Figure 3(b). The output is RBS = {4, 6, 2, 9, 8, 1, 3, 5, 7}.

Figure 2. Illustration of array splitting and extracting butterfly parameters. Input breakpoint set is {4, 2, 1, 3, 6, 5, 9, 8, 7}.
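The two steps above can be sketched directly in code. Assumptions ours: intervals are kept as a flat array of endpoint pairs, and the second stable sort is taken in descending order of hk,1 + |hk,2| (the direction is not stated in the text; it is inferred from the RBS and Figure 3(a) of this example):

```python
def build_parameter_table(bs, n):
    """Sketch of BuildParameterTable: derive the butterfly parameters and
    the reordered breakpoint set RBS from a breakpoint set BS (Section 2.2)."""
    arr = [1, n]                 # flat list of interval endpoint pairs
    params = []                  # (h1, h2, low_index, high_index, breakpoint)
    for b in bs:
        # Step 1: find the enclosing interval [lo, hi] with lo <= b < hi
        for i in range(0, len(arr), 2):
            lo, hi = arr[i], arr[i + 1]
            if lo <= b < hi:
                break
        params.append((hi - b, -(b - lo + 1), lo, b + 1, b))
        arr[i:i + 2] = [lo, b, b + 1, hi]        # split the interval
    # Step 2: stable double sort; h2 < 0, so -t[1] is |h2| and t[0]-t[1] is h1+|h2|
    params.sort(key=lambda t: -t[1])             # first by |h2|
    params.sort(key=lambda t: t[0] - t[1], reverse=True)
    table = [t[:4] for t in params]
    rbs = [t[4] for t in params]
    return table, rbs
```

On the paper's example, build_parameter_table([4, 2, 1, 3, 6, 5, 9, 8, 7], 10) returns RBS = [4, 6, 2, 9, 8, 1, 3, 5, 7] and high-pass parameter rows matching Figure 3(a), starting with (6, −4, 1, 5).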


[Figure 3(a): parameter table, one row per high-pass operation (low-pass parameters, high-pass parameters hk,1 and hk,2, coefficient indices):

1 1 6 -4 1 5
1 1 4 -2 5 7
1 1 2 -2 1 3
1 1 1 -3 7 10
1 1 1 -2 7 9
1 1 1 -1 1 2
1 1 1 -1 3 4
1 1 1 -1 5 6
1 1 1 -1 7 8

Figure 3(b): butterfly network diagram, signal index (0–10) versus layer m of the network.]

Figure 3. (a) Example of a parameter table, and (b) its associated butterfly network. Only parameter values of high-pass operations are shown.

3. PROPOSED HEURISTIC SOLUTION

Step 1 Select the best TSH basis and represent it by a set of breakpoints BS.

Step 2 Run the algorithm BuildParameterTable(BS) to obtain the parameter table TB. Calculate the transform coefficients ci = ⟨f, φi⟩ by TB, and store them into an array by the order of the sorted parameters in TB.

Step 3 Convert the reordered breakpoint set RBS to lattice indices. Store 4 arrays (transform coefficients, lattice indices, hk,1, |hk,2|) of length N for greedy approximation with input parameters (p, B). B specifies the number of non-zero terms, and p denotes the normalized ℓp-norm error metric.

Step 4 Load the previously stored 4 arrays. Based on hk,1, |hk,2| and p, calculate p′ = p/(p − 1) and ||φi||p′ for each basis vector φi.

Step 5 Apply the greedy approximation in Ref. 13, namely select the B largest terms of |ci|/||φi||p′.
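Steps 4 and 5 can be sketched as follows. The closed form for ||φi||p′ below treats |hk,2| as the positive-part length and hk,1 as the negative-part length of the two-valued basis vector, which is our reading of Section 2.2; the names and this derivation are ours:

```python
import numpy as np

def phi_pprime_norm(h1, h2_abs, p_prime):
    """l_{p'} norm of a two-valued TSH basis vector: positive part of
    length |h2| with value sqrt(h1/(|h2|*T)), negative part of length h1
    with value sqrt(|h2|/(h1*T)), where T = h1 + |h2|."""
    total = h1 + h2_abs
    pos_val = (h1 / (h2_abs * total)) ** 0.5
    neg_val = (h2_abs / (h1 * total)) ** 0.5
    if np.isinf(p_prime):
        return max(pos_val, neg_val)
    return (h2_abs * pos_val ** p_prime + h1 * neg_val ** p_prime) ** (1.0 / p_prime)

def greedy_b_term(coeffs, norms, b):
    """Step 5: keep the b terms with the largest |c_i| / ||phi_i||_{p'};
    returns the indices of the retained coefficients."""
    scores = np.abs(np.asarray(coeffs, dtype=float)) / np.asarray(norms, dtype=float)
    return np.argsort(scores)[::-1][:b]
```

For p′ = 2 every TSH basis vector has unit norm (the basis is orthonormal), so the greedy step reduces to keeping the B largest |ci|.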

3.1. Best Basis Algorithm for TSH Transforms

We have adopted dynamic programming to select the best basis having minimum cost. Because of the distributive nature of dynamic programming, we only use additive cost functions, such as the entropy or ℓp-norm of the transform coefficients. Since the flat (DC) basis vector is always included in the best basis, the best basis search essentially selects N − 1 orthonormal vectors from the dictionary of N(N + 1)(N − 1)/6 non-flat TSH basis vectors, where N denotes the length of the input signal. The total number of orthonormal bases, CN, in TSH transforms can be derived by the following recursion:

CN = Σ_{k=1}^{N−1} Ck · CN−k,   C1 = 1.   (2)
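Eq. (2) is the Catalan recurrence, so CN equals the (N − 1)-st Catalan number. A direct sketch of the recursion (function name ours):

```python
def num_tsh_bases(n):
    """Number C_N of orthonormal TSH bases of size n via Eq. (2):
    C_N = sum_{k=1}^{N-1} C_k * C_{N-k}, with C_1 = 1."""
    c = [0] * (n + 1)
    c[1] = 1
    for m in range(2, n + 1):
        c[m] = sum(c[k] * c[m - k] for k in range(1, m))
    return c[n]
```

For example, a signal of length 5 admits C5 = 14 distinct TSH bases.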

Let J(P, Q) be the minimum cost achievable with the Q − P non-flat orthonormal basis vectors having support within [P, Q]. The following recursive equation computes J(P, Q):

J(P, Q) = min_{P ≤ b ≤ Q−1} ( J(P, b) + J(b + 1, Q) + cost of basis vector φ(P, Q, b) ),   (3)

where basis vector φ(P, Q, b) has support [P, Q] and breakpoint location b. For a 1-D signal defined on [1, N], the cost of the best basis is thus equal to J(1, N) + the cost of the flat (DC) basis vector. Figure 4 illustrates the pseudocode of our dynamic programming implementation for the cost minimization in Eq. (3). During minimization, two 2-D tables are used to bookkeep solutions of sub-problems: minimum costs (table_J) and breakpoints


(table_brk_pt). Computation of each inner product in O(1) time is achieved using partial sums of the input signal stored in a separate table table_psum. After cost minimization, the breakpoints specifying the best basis are derived by backtracking table_brk_pt.

Our best basis algorithm is related to the fast recursive tiling algorithm by Huang et al.22 for finding the optimal tiling of a 2-D discrete image. Our algorithm differs in two aspects: 1) it does not conduct any tree pruning, and 2) the original signal is placed at the tree leaf nodes. In addition, our implementation does not require the use of recursive functions.

MinCost_TSHaarTransform(sig_in)

    N = length(sig_in);
    for k = 1:N
        table_J(k, k) = 0;
        table_psum(k, k) = sig_in(k);
    end

    for k = 1:N-1                      // LOOP: (support length)-1 of each basis vector
        for p = 1:N-k                  // LOOP: starting location of each support
            table_J(p, p+k) = inf;
            min_brk_pt = [];

            table_psum(p, p+k) = table_psum(p, p+k-1) + table_psum(p+k, p+k);

            for b = p:p+k-1            // LOOP: all possible breakpoints of basis vector phi[p, p+k, b]

                // calculate inner product of basis vector phi[p, p+k, b]
                tmp_innpd = ((p+k-b)*table_psum(p, b) - (b-p+1)*table_psum(b+1, p+k)) / ...
                            sqrt((p+k-b)*(b-p+1)*(k+1));

                // calculate cost
                tmp_cost = table_J(p, b) + table_J(b+1, p+k) + (cost of tmp_innpd);

                if tmp_cost <= table_J(p, p+k)
                    table_J(p, p+k) = tmp_cost;
                    min_brk_pt = b;
                end
            end
            table_brk_pt(p, p+k) = min_brk_pt;
        end
    end
    return table_J(1, N)

Figure 4. Pseudocode of dynamic programming implementation for cost minimization.
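The cost minimization of Figure 4, extended with the breakpoint backtracking described above, can be sketched in runnable form as follows. This is a sketch under our assumptions: the per-coefficient cost defaults to the ℓ1-norm, ties keep the larger breakpoint (as the `<=` test in Figure 4 does), and all names are ours:

```python
import math

def best_basis(sig, cost=abs):
    """Minimize the additive cost of TSH coefficients over all bases
    (Eq. (3)) and backtrack the breakpoint set of the best basis."""
    n = len(sig)
    psum, j, brk = {}, {}, {}            # partial sums, min costs, arg-min b
    for p in range(1, n + 1):
        j[(p, p)] = 0.0
        psum[(p, p)] = float(sig[p - 1])
    for k in range(1, n):                # (support length) - 1
        for p in range(1, n - k + 1):    # starting location of each support
            q = p + k
            psum[(p, q)] = psum[(p, q - 1)] + psum[(q, q)]
            j[(p, q)], brk[(p, q)] = math.inf, None
            for b in range(p, q):
                # inner product with basis vector phi[p, q, b] in O(1)
                innpd = ((q - b) * psum[(p, b)] - (b - p + 1) * psum[(b + 1, q)]) \
                        / math.sqrt((q - b) * (b - p + 1) * (k + 1))
                c = j[(p, b)] + j[(b + 1, q)] + cost(innpd)
                if c <= j[(p, q)]:
                    j[(p, q)], brk[(p, q)] = c, b
    bs, stack = [], [(1, n)]             # backtrack brk in pre-order
    while stack:
        p, q = stack.pop()
        if q > p:
            b = brk[(p, q)]
            bs.append(b)
            stack += [(b + 1, q), (p, b)]
    return j[(1, n)], bs
```

For the piecewise constant signal (1, 1, 1, 9, 9) the search returns the breakpoint set [3, 2, 1, 4] of Section 2 (only the coefficient of φ(1, 5, 3) is non-zero), with cost 48/√30.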

3.2. Time and Space Complexity

In this section, the time and space complexity of each step in the solution is presented. Note that in this paper we only focus on offline processing of input signals. Therefore, in Steps 4 and 5, we choose a simpler method for implementation, instead of generalizing the stream model in Ref. 13 to the variable tree structure case.

Step 1 During cost minimization, it takes O(Σ_{k=1}^{N−1} Σ_{p=1}^{N−k} 1) = O(N²) time to set up table_psum and table_brk_pt, and O(Σ_{k=1}^{N−1} Σ_{p=1}^{N−k} k) = O(N³) time to set up table_J, adding up to O(N³) time. Overall this step takes O(N²) space to store the three 2-D tables.

Step 2 O(N²) time and O(N) space. Please see Section 2.2 for the detailed analysis.

Step 3 Our algorithm for this step takes O(N²) time and O(N²) space. In short, it first builds a size-O(N²) table consisting of cumulative sums of index values in O(N²) time, and then computes all lattice indices in O(N) time.

Step 4 Our implementation for this step simply computes all ||φi||p′ values in O(N) time and stores them into a size-O(N) array.


Step 5 Sort the scaled coefficients in O(N log N) time, and then select the B largest terms.

Table 1 tabulates a complexity comparison between prior synopsis algorithms and our work under the normalized ℓ1-norm error metric. In each associated algorithm, Δ denotes the difference between the maximum and minimum value in the original data, δ denotes the resolution step parameter, ε∞ denotes an upper bound for the target maximum absolute error, and r denotes the machine resolution in representing real numbers.

Table 1. Complexity comparison with one-dimensional synopsis algorithms. Error metric: normalized ℓ1-norm.

Algorithm                            Time Complexity               Space Complexity
Plain Histogram7, 9                  O(N²(B + log N))              O(N)
CHH (Greedy, time efficient)16       O(NB² log N)                  O(NB log N)
CHH (Greedy, space efficient)16      O(NB² log² N)                 O(B log² N + N)
Haar+ 15                             O((Δ/δ)²NB)                   O((Δ/δ)B log(N/B) + N)
Lattice Histogram (Optimal)17        O((Δ/δ)N³B²)                  O((Δ/δ)N²B)
Lattice Histogram (Heuristic)17      O((Δ/δ)N³ log(ε∞/r))          O((Δ/δ)N²)
This Work                            O(N³)                         O(N²)

[Figure 5 plots: Average Absolute Error versus B; curves: TSH, Optimal Plain Histogram, CHH, Haar+, Heuristic LH.]

Figure 5. Approximation quality of datasets (a) FC and (b) FR. Error metric: normalized ℓ1-norm.

4. EXPERIMENTAL RESULTS

For direct comparison, we have used the same datasets FC, FR, and DJIA as mentioned in Ref. 17 to run simulations. All algorithms were implemented in Matlab version 7.6.0.324 (R2008a).

In all the presented results, we have used the ℓ1-norm of the transform coefficients as the cost function. The state-of-the-art algorithms we compare with are the optimal plain histogram,7, 9 Haar+,15 greedy heuristic CHH,16 optimal LH and heuristic LH.17 We have used the experimental results from Refs. 15 and 17 for comparison. Therefore, for both Haar+ and LH on datasets FC, FR and DJIA, the resolution parameter δ = 10, 50, and 0.5 respectively, the same as in their original papers. For the normalized ℓ1-norm and normalized ℓ2-norm error metrics, the LH results are from the heuristic approach, whereas for the normalized ℓ∞-norm error metric, the LH results are from the optimal approach.

For each dataset with the normalized ℓ1-norm error metric, our solution achieves the lowest approximation error. Figures 5 and 6(a) show the results. Note that on the DJIA dataset, we have found multiple best bases having the same minimum cost value, and all of them achieve the same approximation quality. Therefore, we only use one


[Figure 6 plots: (a) Average Absolute Error and (b) RMSE versus B; curves: TSH (k = 1, 2, 3, 4), Optimal Plain Histogram, CHH, Haar+, Heuristic LH.]

Figure 6. Approximation quality of the DJIA dataset. Error metric: (a) normalized ℓ1-norm and (b) normalized ℓ2-norm.

[Figure 7 plots: RMSE versus B; curve: TSH.]

Figure 7. Approximation quality of datasets (a) FC and (b) FR. Error metric: normalized ℓ2-norm.

curve with index number k to represent all the results. On the other hand, results for the normalized ℓ2-norm error are presented in Figures 6(b) and 7. It can be seen that our approximation error is consistently the lowest for the DJIA dataset. For both the normalized ℓ1-norm and normalized ℓ2-norm error metrics, the use of the heuristic makes the approximation quality of LH sub-optimal. It is possible that optimal LH has better approximation quality than our solution when the δ value is properly set.

In the case of the normalized ℓ∞-norm error metric, optimal LH consistently has better approximation quality than our solution, whereas Haar+ is better in general but with a few exceptions. Depending on the dataset and the B value, the optimal histogram and greedy heuristic CHH sometimes perform better than our solution, sometimes worse. It should be noted that the optimal histogram consistently performs better than our solution on the DJIA dataset. Figure 8 shows our results. The superiority of optimal LH and Haar+ in this case may be explained by the fact that both algorithms are capable of using only one non-zero coefficient to represent one data point of f, whereas a non-zero coefficient in TSH transforms can only represent two or more data points.

We do not get better results than optimal LH for 2-term approximation of the short signal {4, 3, 5, 10, 12, 11, 11, 4} in Ref. 17. In this case, our normalized ℓ1-norm error, normalized ℓ2-norm error and normalized ℓ∞-norm error are 1.125, 1.5 and 3.5 respectively, all higher than the 0.5, √0.5 and 1 from optimal LH, but still lower than the 1.375 (ℓ1-optimal) and 4 (ℓ∞-optimal) from the 2-term plain histogram, and the 2 (ℓ1-optimal) and 4 (ℓ∞-optimal) from 2-term Haar+/CHH.


[Figure 8 plots: Maximum Absolute Error versus B; curves: TSH (with k = 1, 2, 3, 4 for DJIA), Optimal Plain Histogram, CHH, Haar+, Optimal LH.]

Figure 8. Approximation quality under the normalized ℓ∞-norm error metric. Dataset: (a) FC, (b) FR, and (c) DJIA.

5. CONCLUDING REMARKS

Our contributions are summarized as follows:

1. A method to index each TSH basis vector by only one value.

2. An O(N²) time and O(N) space algorithm for building the parameter table from a set of breakpoints specifying a TSH basis in ℝ^N.

3. An O(N³) time and O(N²) space solution for B-term approximation. Both complexities are due to the cost minimization stage by dynamic programming.

4. In the case of approximating the same signal with different parameter pairs (p, B), our solution provides the flexibility of reducing overall running time by increasing overall storage space. In short, the algorithm only needs to run cost minimization once in O(N³) time, then stores four arrays of length N. For each parameter pair (p, B) a separate approximation is conducted, after which the final 2 arrays of length B are stored.

ACKNOWLEDGMENTS

The first author thanks Dr. Panagiotis Karras for providing experimental results and helpful comments for improving this paper. This work is supported in part by a Finnish Centre for International Mobility (CIMO) fellowship, and by University of California MICRO grants with matching support from Intel Corporation, Microsoft Corporation, and Xerox Corporation.


REFERENCES

1. K. Chakrabarti, M. Garofalakis, R. Rastogi, and K. Shim, "Approximate query processing using wavelets," The VLDB Journal 10(2-3), pp. 199–223, 2001.
2. Y. E. Ioannidis, "Approximations in database systems," in ICDT '03: Proceedings of the 9th International Conference on Database Theory, pp. 16–30, Springer-Verlag, (London, UK), 2002.
3. Y. E. Ioannidis and V. Poosala, "Balancing histogram optimality and practicality for query result size estimation," SIGMOD Rec. 24(2), pp. 233–244, 1995.
4. V. Poosala, P. J. Haas, Y. E. Ioannidis, and E. J. Shekita, "Improved histograms for selectivity estimation of range predicates," SIGMOD Rec. 25(2), pp. 294–305, 1996.
5. V. Poosala and Y. E. Ioannidis, "Selectivity estimation without the attribute value independence assumption," in VLDB '97: Proceedings of the 23rd International Conference on Very Large Data Bases, pp. 486–495, 1997.
6. H. V. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K. C. Sevcik, and T. Suel, "Optimal histograms with quality guarantees," in VLDB '98: Proceedings of the 24th International Conference on Very Large Data Bases, pp. 275–286, 1998.
7. S. Guha, K. Shim, and J. Woo, "REHIST: Relative error histogram construction algorithms," in VLDB '04: Proceedings of the 30th International Conference on Very Large Data Bases, pp. 300–311, 2004.
8. S. Guha, "Space efficiency in synopsis construction algorithms," in VLDB '05: Proceedings of the 31st International Conference on Very Large Data Bases, pp. 409–420, 2005.
9. S. Guha, "On the space–time of optimal, approximate and streaming algorithms for synopsis construction problems," The VLDB Journal 17(6), pp. 1509–1535, 2008.
10. Y. Matias, J. S. Vitter, and M. Wang, "Wavelet-based histograms for selectivity estimation," SIGMOD Rec. 27(2), pp. 448–459, 1998.
11. M. Garofalakis and A. Kumar, "Wavelet synopses for general error metrics," ACM Trans. Database Syst. 30(4), pp. 888–928, 2005.
12. S. Guha and B. Harb, "Wavelet synopsis for data streams: minimizing non-euclidean error," in KDD '05: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pp. 88–97, ACM, (New York, NY, USA), 2005.
13. S. Guha and B. Harb, "Approximation algorithms for wavelet transform coding of data streams," IEEE Transactions on Information Theory 54(2), pp. 811–830, 2008.
14. P. Karras and N. Mamoulis, "One-pass wavelet synopses for maximum-error metrics," in VLDB '05: Proceedings of the 31st International Conference on Very Large Data Bases, pp. 421–432, 2005.
15. P. Karras and N. Mamoulis, "Hierarchical synopses with optimal error guarantees," ACM Trans. Database Syst. 33(3), pp. 1–53, 2008.
16. F. Reiss, M. Garofalakis, and J. M. Hellerstein, "Compact histograms for hierarchical identifiers," in VLDB '06: Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 870–881, 2006.
17. P. Karras and N. Mamoulis, "Lattice histograms: a resilient synopsis structure," in ICDE 2008: IEEE 24th International Conference on Data Engineering, pp. 247–256, 2008.
18. K. Egiazarian and J. Astola, "Tree-structured Haar transforms," Journal of Mathematical Imaging and Vision 16, pp. 269–279, 2002.
19. R. A. DeVore, "Nonlinear approximation," Acta Numerica 7, pp. 51–150, 1998.
20. R. Coifman and M. Wickerhauser, "Entropy-based algorithms for best basis selection," IEEE Transactions on Information Theory 38(2), pp. 713–718, 1992.
21. P. Fryzlewicz, "Unbalanced Haar technique for nonparametric function estimation," Journal of the American Statistical Association 102, pp. 1318–1327, December 2007.
22. Y. Huang, I. Pollak, M. Do, and C. Bouman, "Fast search for best representations in multitree dictionaries," IEEE Transactions on Image Processing 15(7), pp. 1779–1793, 2006.
