On the optimal Halton sequence


Mathematics and Computers in Simulation 70 (2005) 9–21


H. Chi a,∗, M. Mascagni b, T. Warnock c

a Department of Computer and Information Science, Florida A&M University, Tallahassee, FL 32307, USA
b Department of Computer Science and School of Computational Science, Florida State University, Tallahassee, FL 32306-4530, USA
c Los Alamos National Laboratory, PO Box 1663, MS B265, Los Alamos, NM 87544, USA

Received 9 June 2004; received in revised form 3 March 2005; accepted 7 March 2005. Available online 23 May 2005.

Abstract

Quasi-Monte Carlo methods are a variant of ordinary Monte Carlo methods that employ highly uniform quasirandom numbers in place of Monte Carlo's pseudorandom numbers. Clearly, the generation of appropriate high-quality quasirandom sequences is crucial to the success of quasi-Monte Carlo methods. The Halton sequence is one of the standard (along with (t, s)-sequences and lattice points) low-discrepancy sequences, and one of its important advantages is that the Halton sequence is easy to implement due to its definition via the radical inverse function. However, the original Halton sequence suffers from correlations between radical inverse functions with different bases used for different dimensions. These correlations result in poorly distributed two-dimensional projections. A standard solution to this phenomenon is to use a randomized (scrambled) version of the Halton sequence. An alternative approach to this is to find an optimal Halton sequence within a family of scrambled sequences. This paper presents a new algorithm for finding an optimal Halton sequence within a linear scrambling space. This optimal sequence is numerically tested and shown empirically to be far superior to the original. In addition, based on analysis and insight into the correlations between dimensions of the Halton sequence, we illustrate why our algorithm is efficient for breaking these correlations. An overview of various algorithms for constructing various optimal Halton sequences is also given.
© 2005 IMACS. Published by Elsevier B.V. All rights reserved.

Keywords: Quasi-Monte Carlo; Scrambling; Correlation; Optimal Halton sequence

∗ Corresponding author at: School of Computational Science, Florida State University, Tallahassee, FL 32306-4120, USA. Tel.: +1 850 6440 313; fax: +1 850 6440 098.

E-mail address: [email protected] (H. Chi), [email protected] (M. Mascagni), [email protected] (T. Warnock).

0378-4754/$30.00 © 2005 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.matcom.2005.03.004


1. Introduction

Quasi-Monte Carlo (QMC) methods are a variant of ordinary Monte Carlo methods that employ highly uniform quasirandom numbers in place of Monte Carlo's pseudorandom numbers. Quasi-Monte Carlo methods are now widely used in scientific computation, especially in estimating integrals over multidimensional domains [14,17] and in many different financial computations [16]. Clearly, the generation of appropriate high-quality quasirandom sequences is crucial to the success of using quasi-Monte Carlo methods.

Monte Carlo methods offer statistical error estimates; however, while quasi-Monte Carlo often has a faster convergence rate than normal Monte Carlo, there is no practical way to obtain error estimates directly from quasi-Monte Carlo sample values. In order to take advantage of both Monte Carlo and quasi-Monte Carlo methods, several researchers [15,18] proposed randomized quasi-Monte Carlo methods, where randomness can be brought to bear on quasirandom sequences through scrambling and other related randomization techniques.

Fig. 1. Poor 2D projections were studied in several papers. For example, left top: included in Braaten's paper [1]; right top: in Morokoff's paper [12]; left bottom: in Kocis's paper [7]; right bottom: a random-start Halton sequence [21].


Scrambling provides a good method to obtain error estimates in quasi-Monte Carlo by treating each scrambled sequence as a different and independent random sample. The core of randomized quasi-Monte Carlo is to find an effective and fast algorithm to scramble (randomize) quasirandom sequences.

In this paper, we study the Halton sequence, which is one of the standard (along with (t, s)-sequences and lattice points) low-discrepancy sequences, and thus is widely used in quasi-Monte Carlo applications. Many scrambling methods have been proposed [1,7] for the Halton sequence. Since the Halton sequence has properties that are fundamentally different from (t, s)-sequences, some of these scrambling techniques have been designed specifically for the Halton sequence. Although the Halton sequence has poor two-dimensional projections (see Fig. 1) as dimension increases, a scrambled Halton sequence obtains both good two-dimensional projections and a smaller discrepancy, which is a measure of deviation from uniformity.

By scrambling a single quasirandom sequence, we can produce a family of related quasirandom sequences. Finding an optimal quasirandom sequence within this scrambled family is an interesting problem, as such optimal quasirandom sequences can be quite useful in quasi-Monte Carlo applications. The process of finding such optimal quasirandom sequences is called the derandomization of a randomized (scrambled) family. We will present a derandomization technique for the Halton sequence. Then, based on a certain family of scrambled Halton sequences, an algorithm to search for the optimal sequence in this sense will be proposed.

The rest of this paper is organized as follows. First, a brief review of the Halton sequence is presented in Section 2. This is followed by an analysis of the poor two-dimensional projections and of the correlations between the radical inverse functions used in the Halton sequence in Section 3. Methods for breaking these correlations are presented in Section 4. In Section 5, we summarize some previous methods used to scramble the Halton sequence and present our own, simpler scrambling algorithm. An algorithm for finding the optimal Halton sequence within a thusly scrambled family is presented in Section 6, and numerical results that confirm the quality of the optimal scramblings are presented in Section 7. This is followed by the final section, Section 8, where we provide conclusions and discuss future work.

2. The Halton sequence

A classical family of low-discrepancy sequences is the Halton sequence [6]. Its definition is based on the radical inverse function, defined as follows:

$$\phi_p(n) \equiv \frac{b_0}{p} + \frac{b_1}{p^2} + \cdots + \frac{b_m}{p^{m+1}}, \qquad (1)$$

where p is a prime number, and the p-ary expansion of n is given as n = b_0 + b_1 p + · · · + b_m p^m, with integers 0 ≤ b_j < p. The Halton sequence, X_n, in s dimensions is then defined as

$$X_n = (\phi_{p_1}(n), \phi_{p_2}(n), \ldots, \phi_{p_s}(n)), \qquad (2)$$

where the dimensional bases p_1, p_2, ..., p_s are pairwise coprime. In practice, we always use the first s primes as the bases.
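As a concrete illustration, the following Python sketch implements the radical inverse of Eq. (1) and the Halton points of Eq. (2). The function names (radical_inverse, first_primes, halton_point) are our own, and double-precision floating point is assumed to be adequate for illustration.

```python
def radical_inverse(n, p):
    """phi_p(n): reflect the base-p digits of n about the radix point (Eq. (1))."""
    phi, inv_base = 0.0, 1.0 / p
    while n > 0:
        n, digit = divmod(n, p)       # peel off the next base-p digit of n
        phi += digit * inv_base
        inv_base /= p
    return phi

def first_primes(s):
    """Return the first s primes by trial division (adequate for moderate s)."""
    primes, candidate = [], 2
    while len(primes) < s:
        if all(candidate % q for q in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def halton_point(n, bases):
    """X_n = (phi_{p_1}(n), ..., phi_{p_s}(n)), as in Eq. (2)."""
    return tuple(radical_inverse(n, p) for p in bases)

if __name__ == "__main__":
    bases = first_primes(3)           # (2, 3, 5)
    for n in range(1, 6):
        print(n, halton_point(n, bases))
```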

In comparison to other low-discrepancy sequences, the Halton sequence is much easier to implement due to the ease of implementation of the radical inverse function. However, a problem with the Halton sequence arises from correlations between the radical inverse functions used for different dimensions.


These correlations cause the Halton sequence to have poor two-dimensional projections for some pairs of dimensions. For example, the two-dimensional projections of the 7th and 8th [1], 28th and 29th [12], and 39th and 40th [7] dimensions are known to be very poor. Fig. 1 illustrates the poor projections in these cases. In addition, Section 3 has a more detailed analysis of these correlations.

The poor two-dimensional projections are caused by the fact that the difference between the two prime bases corresponding to the different dimensions is very small relative to the base size. Fox [5] coded the first 40 dimensions of the Halton sequence, and this implementation has become somewhat of a reference Halton sequence code, so it is fair to analyze it for our purpose. The first 40 primes were used for the bases in Fox's implementation. Among these 40 primes, there are eight pairs of twin primes greater than 10. The list is as follows: (11,13), (17,19), (41,43), (59,61), (71,73), (101,103), (107,109), (149,151). All of them have poor two-dimensional projections.

To study the phenomenon of poor two-dimensional projections, consider one base, p, and another base, p + α, where the difference, α, can be thought of as being relatively small. Let n be a positive integer; then we can expand it in base p + α as n = a_0 + a_1(p + α) + · · · + a_m(p + α)^m. Thus, the formula for φ_p(n) is given in Eq. (1), and the formula for φ_{p+α}(n) is as follows:

$$\phi_{p+\alpha}(n) = \frac{a_0}{p+\alpha} + \frac{a_1}{(p+\alpha)^2} + \cdots + \frac{a_m}{(p+\alpha)^{m+1}}
= \frac{a_0}{p(1+\alpha/p)} + \frac{a_1}{p^2(1+\alpha/p)^2} + \cdots + \frac{a_m}{p^{m+1}(1+\alpha/p)^{m+1}}. \qquad (3)$$

From Eqs. (1) and (3), we can see that a strong correlation exists between φ_p(n) and φ_{p+α}(n) because (1 + α/p) is close to 1 when α is small compared to p. Thus, one would expect the worst problems to occur when α is the smallest possible, say α = 2, the case of twin primes for p and p + α.

However, good two-dimensional projections for the Halton sequence may be obtained if the number of points used is equal to the product of the bases. This is due to the fact that the least significant digit in the p-ary expansion of n is b_0, and so it repeats every p points; similarly, a_0 repeats every p + α points. Since the uniformity of φ_p(n) is dictated mostly by b_0 (its most significant digit after the radix point), we should get a uniform two-dimensional projection by using p(p + α) points. According to this reasoning, the Halton sequence should have a good two-dimensional projection if we choose the number of points to be the product of the bases. For example, if we plan to use the 39th and 40th dimensions of the Halton sequence, the bases are 167 and 173, respectively, and so 167 × 173 = 28,891 points should be well distributed in these two dimensions.
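The product-of-bases argument can be checked directly. The sketch below (reusing radical_inverse from the previous listing) uses the smaller pair of bases 17 and 19, our choice for speed: with N = 17 × 19 points every cell of a 17 × 19 grid over the unit square receives exactly one point, while a much smaller N leaves most cells empty.

```python
# With N = p*q points, the leading digits (n mod p, n mod q) run through every pair
# exactly once (Chinese remainder theorem), so each cell of a p-by-q grid over the
# unit square receives exactly one Halton point. Bases 17 and 19 are our choice.
from collections import Counter

def cell_counts(N, p, q):
    counts = Counter()
    for n in range(1, N + 1):
        x, y = radical_inverse(n, p), radical_inverse(n, q)
        counts[(int(x * p), int(y * q))] += 1   # index of the coarse cell containing (x, y)
    return counts

p, q = 17, 19
full = cell_counts(p * q, p, q)          # 323 points: every cell hit exactly once
partial = cell_counts(50, p, q)          # 50 points: most cells still empty
print(min(full.values()), max(full.values()))                    # 1 1
print(p * q - len(partial), "of", p * q, "cells empty with 50 points")
```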

3. Correlations

The original Halton sequence suffers from correlations between radical inverse functions with different bases used for different dimensions. These correlations result in poorly distributed two-dimensional projections, among other things. In this section, we will calculate the correlation coefficient between two radical inverse functions, φ_p(n) and φ_{p+α}(n). The calculations will provide some insight into the correlations between dimensions in the Halton sequence. Based on this analysis, an effective scrambling algorithm will be proposed.

In Fig. 1, we can see similarities in the poor two-dimensional projections. For example, there are two clusters of lines parallel to the line y = x. A more careful analysis would reveal that the number of parallel lines in each cluster is almost equal to the ceiling of the number of points divided by the prime base.


At the end of this section, we will give an explanation for this observation.

The main point of this section is to compute the correlation coefficient between φ_p(n) and φ_{p+α}(n). We have the p-ary and the (p + α)-ary expansions of n given as

$$n \equiv b_0 + b_1 p + \cdots + b_m p^m = a_0 + a_1(p+\alpha) + \cdots + a_m(p+\alpha)^m. \qquad (4)$$

Let us consider only the first two most significant digits, i.e., m = 1. Then, after truncating at m = 1, we obtain the following relations from Eq. (4):

$$\begin{cases} b_1 = a_1 + \lfloor a_0/p \rfloor, \\ b_0 = a_0 + \alpha a_1 \pmod{p}. \end{cases} \qquad (5)$$

The period of both a_0 and b_0 is p(p + α), so we only consider n between 1 and p(p + α). Therefore, φ_{p+α}(n) can be expressed as follows:

$$\phi_{p+\alpha}(n) = \frac{a_0}{p+\alpha} + \frac{a_1}{(p+\alpha)^2} + O(p^{-2}). \qquad (6)$$

By combining Eqs. (1) and (5), φ_p(n) can thus be expressed in terms of the (p + α)-ary expansion of n as

$$\phi_p(n) = \frac{a_0 + \alpha a_1 \pmod{p}}{p} + \frac{a_1 + \lfloor a_0/p \rfloor}{p^2} + O(p^{-2}). \qquad (7)$$

For 1 ≤ n ≤ p(p + α), we partition this interval into p + α parts, namely kp + 1 ≤ n ≤ (k + 1)p − 1 for k = 0, 1, 2, ..., p + α − 1. Then we calculate R_k, the correlation coefficient between φ_p(n) and φ_{p+α}(n) with kp ≤ n < (k + 1)p. However, R_k can be obtained from R_0, the correlation coefficient between φ_p(n) and φ_{p+α}(n) with 1 ≤ n < p − 1. This is due to the fact that the second most significant digits, a_1 and b_1, will not change until the most significant digits have cycled. Thus, the correlation between φ_p(n) and φ_{p+α}(n) is primarily based on the correlation of their most significant digits, b_0 and a_0. Computing the correlation coefficient between φ_p(n) and φ_{p+α}(n) with kp ≤ n < (k + 1)p therefore translates into computing the correlation coefficient between b_0/(p + α) and (b_0 + αb_1)/p with 1 ≤ n < p.

We now define the formula for the correlation coefficient, R, between any two sequences {x_i}_{1≤i≤N} and {y_i}_{1≤i≤N}. Let x̄ and ȳ denote the averages of the two sequences, respectively; then the correlation coefficient is defined as

$$R = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}, \qquad (8)$$

where S_xy = Σ(x_i − x̄)(y_i − ȳ), S_xx = Σ(x_i − x̄)², and S_yy = Σ(y_i − ȳ)².

In our case x_i = i/(p + α) and y_i = i/p, and we take those i's such that 1 ≤ i ≤ p − 1. Thus, as α is small, x̄ = 1/2 + O(1/(p + α)) and ȳ = 1/2. The pieces of the correlation coefficient between φ_p(n) and φ_{p+α}(n), for n = 1, ..., p − 1, can then be calculated by the following formula:

$$S_{xy} \approx \sum_{i=1}^{p-1}\left(\frac{i}{p+\alpha} - \frac{1}{2}\right)\left(\frac{i}{p} - \frac{1}{2}\right) = \frac{p^2 + (4 - 3\alpha)}{12(p+\alpha)} = \frac{p}{12} + O\!\left(\frac{\alpha}{p}\right). \qquad (9)$$


Then S_xx and S_yy can be calculated as follows:

$$S_{xx} S_{yy} = \sum_{i=0}^{p-1}\left(\frac{i}{p+\alpha} - \frac{1}{2}\right)^2 \sum_{i=0}^{p}\left(\frac{i}{p} - \frac{1}{2}\right)^2
= \left(\frac{p(p^2 + 3p + 2 - \alpha(1-\alpha))}{12(p+\alpha)^2}\right)\left(\frac{p^2 + 2}{12p}\right)
= \left(\frac{p}{12}\right)^2 + O\!\left(\frac{\alpha^2}{(p+\alpha)^2}\right). \qquad (10)$$

Using the same assumption as above, we can approximate Eq. (10) by (p/12)². Let R_0 denote the correlation coefficient between φ_p(n) and φ_{p+α}(n) for n = 1, 2, ..., p − 1; then

$$R_0 \approx \frac{p/12}{\sqrt{(p/12)^2}} = 1. \qquad (11)$$

One can calculate that R_k ≈ 1 for kp + 1 ≤ n ≤ (k + 1)p − 1 with b_1 = k. This explains why the poor two-dimensional projections of the Halton sequence in Fig. 1 look like lines parallel to the line y = x, since all pairs of points approximately fall on the lines y = R_k x + c with c = b_1 or a_1. This is based on a common interpretation of the correlation coefficient [3].

Now, let us compute the number of parallel lines seen in these poor projections of the Halton sequence. Every time b_1 or a_1 changes, the line y = R_k x + c wraps. For n points, b_1 will change ⌈n/p⌉ times and a_1 will change ⌈n/(p + α)⌉ times. Thus, the total number of lines for any n points may be computed as

$$\left\lceil \frac{n}{p} \right\rceil + \left\lceil \frac{n}{p+\alpha} \right\rceil.$$
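A quick numerical check of Eq. (11): the sketch below evaluates the sample correlation of Eq. (8) between φ_p(n) and φ_{p+α}(n) for 1 ≤ n < p, i.e., the quantity R_0, for the twin-prime pair (17, 19) (our choice of example), and also prints the predicted number of parallel lines for an N-point projection. On this range both coordinates are linear in n, so R_0 comes out as 1.

```python
# Sample correlation R = S_xy / sqrt(S_xx * S_yy) of Eq. (8), applied to R_0 for the
# twin-prime pair p = 17, p + alpha = 19. Uses radical_inverse() from the first sketch.
from math import sqrt, ceil

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

p, q = 17, 19
xs = [radical_inverse(n, p) for n in range(1, p)]
ys = [radical_inverse(n, q) for n in range(1, p)]
print(f"R_0 for bases ({p},{q}): {correlation(xs, ys):.6f}")   # prints 1.000000

# The paper's count of parallel lines for an N-point two-dimensional projection:
N = 2000
print("predicted number of lines:", ceil(N / p) + ceil(N / q))
```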

4. Methods to break correlations

There are at least two possible ways to break the correlations we have seen in the Halton sequence. One is to increase the difference between the bases for any pair of dimensions; the other is to scramble the Halton sequence.

The first method is useful when the number of dimensions is small; when the number of dimensions is large, it is hardly useful. When p is large, p + α has to be very much larger if we want to break the correlations. Let α = ep, where e > 0; then, from Eqs. (9) and (10), we see that the correlation coefficient will be approximately 1/(e + 1). Thus, we can increase e until the correlation coefficient is sufficiently small. Note that e = p is normally considered sufficient to ensure small correlation; however, that implies prime pairs of the form p and p + p².

Increasing α also raises a problem with the upper bound of the star discrepancy¹ for the Halton sequence. For N points of the s-dimensional Halton sequence, the upper bound of the star discrepancy satisfies the inequality

$$D_N^{*} \le C(p_1, \ldots, p_s)\,\frac{(\log N)^s}{N} + O\!\left(\frac{(\log N)^{s-1}}{N}\right),$$

¹ The measure of uniformity.

where C(p_1, ..., p_s) ≈ ∏_{j=1}^{s} (p_j − 1)/(2 log p_j). With p_j increasing, this constant in the upper bound increases as p_j/log p_j, which is an increasing function of p_j.

Fig. 2. A list of φ_17(n) and φ_19(n) with 2 ≤ n ≤ 14, and φ_5(n) and φ_7(n) with 2 ≤ n ≤ 11.

The other method to break the correlations is to scramble the Halton sequence. The first four dimensions of the Halton sequence give us a hint for obtaining better quality high-dimensional sequences. If one can reorder or shuffle the digits of each point in the Halton sequence for the different dimensions, the correlations between different dimensions can be made very small. This is due to the fact that there are gaps between the most significant digits of φ_2(n), φ_3(n), φ_5(n), and φ_7(n), which have good two-dimensional projections with p < 10.

However, when p > 10, there are no gaps in the most significant base-10 digits of φ_p(n) and φ_{p+α}(n). In Fig. 2, it is easy to see that the most significant digits of φ_17(n) and φ_19(n) cycle from 1 to 9 without jumps, whereas the most significant digits of φ_5(n) and φ_7(n) jump. Scrambling the Halton sequence can break these cycles, and with them the correlations created by the radical inverse function. It is clear that the correlations seen are due to the shadowing of the most significant digits of the Halton sequence.

Linear scrambling is the simplest and most effective scrambling method to break this correlation. This is the reason why we focus on linear scrambling and try to look for the "best" sequence in the linear scrambling space in the next section.

5. Linear scrambling

Many scrambling methods [1,11,21,22] have been proposed for the Halton sequence in order to break correlations between dimensions. Most of them are based on digital permutation, which is defined as follows:

$$\phi^{\pi}_{p_i}(n) \equiv \frac{\pi_{p_i}(b_0)}{p_i} + \frac{\pi_{p_i}(b_1)}{p_i^2} + \cdots + \frac{\pi_{p_i}(b_m)}{p_i^{m+1}}, \qquad (12)$$

where π_{p_i} is a permutation of the set {0, 1, 2, 3, ..., p_i − 1}.


Before we start to search for the optimal Halton sequence, we must decide which permutation functions can be chosen for π_{p_i}. In other words, we are trying to find a function f(x), x ∈ {0, 1, 2, 3, ..., p_i − 1}, of a given form such that f(x) is a permutation of the set {0, 1, 2, 3, ..., p_i − 1}. There are two simple functions which conveniently define a subset of the p_i! permutations [8]. One is f(x) = wx + c (mod p_i), which is the "best" in some sense among a subset of the p_i! possible permutations. The other is f(x) = x^k (mod p_i), with gcd(k, p_i − 1) = 1. From the implementational point of view, the linear scrambling, f(x) = wx + c (mod p_i), is quite effective in comparison to other scrambling methods.

We choose the linear function, f(x) = wx + c (mod p_i), with c = 0, as our π_{p_i} to scramble the Halton sequence:

$$\pi_{p_i}(b_j) = w_i\, b_j \pmod{p_i}, \qquad (13)$$

where 1 ≤ w_i ≤ p_i − 1 and 0 ≤ j ≤ m. The reason for choosing c = 0 is that we do not want to permute zero. The idea of not permuting zero is to keep the sequence unbiased: permuting zero (assuming an infinite string of trailing zeros) leads to a biased sequence. This linear scrambling gives us a stochastic family of scrambled Halton sequences, which includes (p_1 − 1)(p_2 − 1) · · · (p_{s−1} − 1)(p_s − 1) sequences for the s-dimensional Halton sequence. The main goal of this paper is to find an optimal sequence from among this scrambled family. The algorithm for finding the optimal sequence is described in the next section.
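A minimal sketch of the linear scrambling of Eqs. (12) and (13): each digit b_j is mapped to w·b_j (mod p) before the reflection, and c = 0 so that the digit 0 is left fixed. The function name is ours; w = 1 recovers the unscrambled radical inverse.

```python
def scrambled_radical_inverse(n, p, w):
    """Linearly scrambled radical inverse: digit b_j is replaced by w*b_j mod p (c = 0)."""
    phi, inv_base = 0.0, 1.0 / p
    while n > 0:
        n, digit = divmod(n, p)
        phi += ((w * digit) % p) * inv_base
        inv_base /= p
    return phi

# Example: base 17 with multiplier w = 3 (the Table 2 entry for p = 17) vs. w = 1 (original).
print([round(scrambled_radical_inverse(n, 17, 3), 4) for n in range(1, 6)])
print([round(scrambled_radical_inverse(n, 17, 1), 4) for n in range(1, 6)])
```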

6. The optimal Halton sequence

In this section, we provide a number-theoretic criterion that can be used to choose the optimal scrambling from among a large family of possible linear scramblings. Based on this criterion, we have been able to find optimal scramblings from this family for the Halton sequence in any dimension. This derandomized Halton sequence is then numerically tested and shown empirically to be far superior to the original sequence.

The basic idea behind searching for an optimal Halton sequence is to find optimal choices for the permutations π_{p_i}. But we must be careful in choosing the measure used to judge the "optimum." If we only consider the "best" permutation in one dimension, then the sequence may not be optimal. For example, Braaten and Weller [1] picked a permutation π_{p_i}(j + 1) by minimizing the one-dimensional discrepancy of the set {π_{p_i}(1)/p_i, ..., π_{p_i}(j)/p_i, π_{p_i}(j + 1)/p_i}. This method gets rid of some poor projections of the original Halton sequence. However, the permutation is optimal only in one dimension, and the use of such a measure leaves the selected permutations still highly related for two dimensions whose bases are twin primes. For example, part of a table given in [1] listing optimal permutations based on the above one-dimensional criterion is listed in our Table 1. Except for 0, there are eight pairs of similar permutation entries whose difference is within 2. We use ⋆ to identify these closely related pairs in Table 1. These permutations make it hard to break the correlations between two dimensions when the prime bases are close. After permuting, the poor two-dimensional projections of the original Halton sequence may not improve much (please refer to Fig. 3). This is because the measure of the best permutation is based on the one-dimensional discrepancy [12], which turns out to produce very similar optimal permutations when the bases are close. In Fig. 3, we can see that the poor two-dimensional projections are not completely removed by Braaten and Weller's optimal permutations. Several papers [7,19] extended the same permutations to high dimensions without considering correlation between dimensions.


Table 1
The 7th and 8th dimensional permutations for the Halton sequence in Braaten and Weller [1]

s    p    Permutation of (0, 1, 2, ..., p − 1)
7    17   (0 8 13 3 11 5 16 1 10 7 14 4 12 2 15 6 9)
          ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆
8    19   (0 9 14 3 17 6 11 1 15 7 12 4 18 8 2 16 10 5 13)

⋆ means that the difference between these pairs (entries in corresponding positions of the two permutations) is no greater than 2.
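For comparison, the following sketch implements the one-dimensional criterion attributed to Braaten and Weller above: keep 0 fixed and repeatedly append the unused digit that minimizes the star discrepancy of the partial set {π(1)/p, ..., π(j+1)/p}. The tie-breaking rule and the exact discrepancy formula are our own guesses, so the output need not reproduce Table 1 exactly.

```python
# Greedy construction of a permutation by the one-dimensional discrepancy criterion.
# star_discrepancy_1d() is the standard sorted-points formula for D*_N in one dimension.
def star_discrepancy_1d(points):
    xs = sorted(points)
    N = len(xs)
    return max(max((i + 1) / N - x, x - i / N) for i, x in enumerate(xs))

def greedy_permutation(p):
    perm = [0]                                  # keep 0 fixed, as discussed in Section 5
    remaining = set(range(1, p))
    while remaining:
        best = min(remaining,
                   key=lambda d: star_discrepancy_1d([x / p for x in perm[1:] + [d]]))
        perm.append(best)
        remaining.remove(best)
    return perm

print(greedy_permutation(17))
```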

Fig. 3. Left: 2000 points of the Braaten and Weller modified Halton sequence; right: our newly proposed optimal Halton sequence.

In the rest of this section, we present an algorithm and a measure for finding the optimal Halton sequence in the linear scrambling space. We try to find an optimal multiplier w_i in Eq. (13) within the linear scrambling function. Warnock [23,24] replaced w_i in Eq. (13) with either ⌈p_i{√p_i}⌉ or ⌊p_i{√p_i}⌋. The choice depends on the sums of the partial quotients in the continued fraction expansions ⌈p_i{√p_i}⌉/p_i = [q_{u1}, q_{u2}, ..., q_{uk}] and ⌊p_i{√p_i}⌋/p_i = [q_{l1}, q_{l2}, ..., q_{lj}]. In other words, let q_u = Σ_{i=1}^{k} q_{ui} and q_l = Σ_{i=1}^{j} q_{li}, and compare q_u with q_l: if q_u < q_l, then w_i = ⌈p_i{√p_i}⌉; otherwise w_i = ⌊p_i{√p_i}⌋. Note that the Weyl sequence² is used in choosing the optimal w_i. Since p_i is different for each w_i, the scrambled versions of the sequence generated by the different w_i are expected to be independent. This is due to the fact that the Weyl sequences [13,22], consisting of multiples of square roots of primes, have good discrepancy in low dimension, and the square roots of primes are "independent." This property can help us find a scrambled Halton sequence that is not only optimal in one dimension, but also optimal among dimensions.

² A Weyl sequence is given by x_k = kα (mod 1), where α is an irrational number.
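The continued-fraction selection rule just described can be sketched as follows. Both candidates ⌈p{√p}⌉ and ⌊p{√p}⌋ are formed, each is expanded over p by the Euclidean algorithm, and the candidate whose partial quotients have the smaller sum is kept. Helper names are ours, and double-precision square roots are assumed to be accurate enough for the small primes shown.

```python
# Sketch of the selection rule described above: compare the candidates ceil(p*{sqrt(p)})
# and floor(p*{sqrt(p)}) by the sum of the partial quotients of candidate/p.
from math import sqrt, floor, ceil

def partial_quotients(num, den):
    """Continued fraction expansion of num/den via the Euclidean algorithm."""
    quotients = []
    while den:
        q, r = divmod(num, den)
        quotients.append(q)
        num, den = den, r
    return quotients

def warnock_multiplier(p):
    frac = sqrt(p) - floor(sqrt(p))          # {sqrt(p)}, irrational for prime p
    upper, lower = ceil(p * frac), floor(p * frac)
    qu = sum(partial_quotients(upper, p))
    ql = sum(partial_quotients(lower, p))
    return upper if qu < ql else lower

for p in (5, 7, 11, 13, 17, 19):             # p = 2 is a trivial special case (w = 1)
    print(p, warnock_multiplier(p))
```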

Mascagni and Chi [9] proposed using a primitive root modulo p_i, A_{p_i}, instead of w_i in Eq. (12). Normally, p_i has more than one primitive root, and the primitive root whose largest partial quotient is smallest is chosen to be A_{p_i}. The theoretical background for the relation between the "best" primitive root modulo p_i and the discrepancy of this sequence can be found in [2,13].


Table 2
Choices of the optimal w_i for the first 50 dimensions of the optimal Halton sequence

p_i  w_i   p_i  w_i   p_i  w_i   p_i  w_i   p_i  w_i
2    1     31   17    73   45    127  29    179  69
3    2     37   5     79   70    131  57    181  83
5    2     41   17    83   8     137  97    191  157
7    5     43   26    89   38    139  110   193  171
11   3     47   40    97   82    149  32    197  8
13   7     53   14    101  8     151  48    199  32
17   3     59   40    103  12    157  84    211  112
19   10    61   44    107  38    163  124   223  205
23   18    67   12    109  47    167  155   227  15
29   11    71   31    113  70    173  26    229  31

For a prime modulus p, and a primitive root w modulo p as multiplier, we have that the extreme discrepancy, D_N^(2), satisfies [13]

$$(p-1)\,D_p^{(2)} \le \begin{cases} 2 + \dfrac{2}{\log 2}\,\log p, & \text{if } 1 \le q \le 3,\\[4pt] 2 + \dfrac{q+1}{\log(q+1)}\,\log p, & \text{if } q \ge 4, \end{cases} \qquad (14)$$

where q = max_{1<i≤k} q_i and q_i is the ith partial quotient in the continued fraction expansion of w/p.

However, the disadvantage of this method is that the condition for choosing the optimal primitive root A_{p_i} is based only on p_i. It can happen that the A_{p_i} for different p_i are close whenever the p_i are close. For example, A_{p_13} = 17 and A_{p_14} = 18, where p_13 = 41 and p_14 = 43 are the bases for dimensions 13 and 14 of the Halton sequence, and this leads to very high correlations. Therefore, we cannot simply use the "best" multiplier in each dimension, because we can again get spurious correlations. The reason is that the best multipliers tend to be close to the golden ratio.

We now try to blend Warnock's choices for w_i with Mascagni and Chi's A_{p_i}:

$$w_i = \begin{cases} S_{p_i}, & \text{if } S_{p_i} \text{ is a primitive root modulo } p_i,\\ A'_{p_i}, & \text{otherwise,} \end{cases}$$

where S_{p_i} denotes Warnock's choice above and A'_{p_i} is a primitive root modulo p_i close to S_{p_i}. With such a modification, we keep the "independence" among dimensions while maintaining the reduced error estimates as well.

In Table 2, we list the choices of w_i for the first 50 dimensions, where p_i refers to the base used for the Halton sequence and w_i refers to the best choice for the linear scrambling.
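The blended rule can be sketched in the same style: take the Warnock candidate S_p from the previous listing, keep it if it is a primitive root modulo p, and otherwise switch to a primitive root nearest to it. The primitive-root test and the nearest-root tie-breaking below are our own choices.

```python
# Sketch of the blended rule: S_p if it is a primitive root mod p, otherwise the nearest
# primitive root. Uses warnock_multiplier() from the previous sketch.
def prime_factors(n):
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root(w, p):
    """w generates the group mod p iff w^((p-1)/q) != 1 for every prime q dividing p-1."""
    return w % p != 0 and all(pow(w, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

def blended_multiplier(p):
    s = warnock_multiplier(p)
    if is_primitive_root(s, p):
        return s
    roots = [w for w in range(1, p) if is_primitive_root(w, p)]
    return min(roots, key=lambda w: abs(w - s))   # nearest primitive root to S_p

for p in (5, 7, 11, 13, 17, 19, 23):
    print(p, blended_multiplier(p))
```

For p = 19, for instance, this sketch returns 10, which agrees with the Table 2 entry; agreement for every prime is not guaranteed, since the paper does not spell out its tie-breaking or its exact notion of closeness.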

7. Numerical results

To empirically test the quality of our proposed optimal Halton sequence, we evaluate the test integral discussed in [4,5]:

$$\int_0^1 \cdots \int_0^1 \prod_{i=1}^{s} \frac{|4x_i - 2| + a_i}{1 + a_i}\, dx_1 \cdots dx_s = 1. \qquad (15)$$


The accuracy of QMC integration depends not simply on the dimension of the integrand, but on its effective dimension. The test function in Eq. (15) is among the most difficult cases for high-dimensional numerical integration. We have estimated the values of these test integrals in dimensions 20 ≤ s ≤ 50, with a_1 = a_2 = · · · = a_s. In this integral, all the variables are equally important, and Wang and Fang [20] calculated that the effective dimension is approximately the same as the nominal dimension of the integrand. Thus, we may expect improvements in QMC by using the derandomized sequences we have constructed.
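The experiment of this section can be sketched as follows: estimate the integral of Eq. (15) with a_i = 0 by averaging the integrand over the first N scrambled Halton points. The multipliers below are the Table 2 values for the first 10 primes; N and s are our own choices, so the printed value is only illustrative and is not meant to reproduce Table 3.

```python
# QMC estimate of Eq. (15) with a_i = 0: average prod_i |4*x_i - 2| over N scrambled
# Halton points. Uses scrambled_radical_inverse() and first_primes() from earlier sketches.
def qmc_estimate(N, bases, multipliers, a=0.0):
    total = 0.0
    for n in range(1, N + 1):
        point = [scrambled_radical_inverse(n, p, w) for p, w in zip(bases, multipliers)]
        value = 1.0
        for x in point:
            value *= (abs(4.0 * x - 2.0) + a) / (1.0 + a)
        total += value
    return total / N

s = 10
bases = first_primes(s)
multipliers = [1, 2, 2, 5, 3, 7, 3, 10, 18, 11]   # Table 2 values for the first 10 primes
print(qmc_estimate(1000, bases, multipliers))     # exact value of the integral is 1
```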

The result for a_1 = a_2 = · · · = a_s = 0 is listed in Table 3. In Fox's paper [5], the original Halton sequence is used to evaluate the integral, and the errors in the numerical results for over 20 dimensions become quite large. However, after derandomization, we found that the integral has a much smaller relative error above dimension 20.

• wHalton refers to the algorithm modified by Warnock's derandomization;
• mHalton refers to the algorithm proposed in [9];
• dHalton refers to the new algorithm we propose in this paper.

We notice from Table 3 that dHalton is stable and convergent with an increase in the number of simulations compared with the other optimal sequences. In general, we found that the relative error for this integration problem was less than 5% (except for one case) beyond dimension 20 with our new optimal sequence.

Table 3
Estimates of the integral ∫_0^1 · · · ∫_0^1 ∏_{i=1}^{s} |4x_i − 2| dx_1 · · · dx_s = 1

Generator   N        s = 10   s = 20   s = 30   s = 40   s = 50
wHalton     1000     1.333    0.972    0.594    0.146    0.042
mHalton     1000     0.881    0.601    0.198    0.311    0.862
dHalton     1000     0.947    1.417    0.506    0.196    0.300
wHalton     3000     1.313    0.996    1.655    1.159    0.139
mHalton     3000     0.952    0.868    0.410    0.514    0.319
dHalton     3000     1.014    1.058    0.556    0.332    0.242
wHalton     7000     1.275    0.975    1.169    0.734    0.146
mHalton     7000     0.992    1.216    1.188    0.489    0.274
dHalton     7000     0.994    1.127    0.829    2.184    0.454
wHalton     30000    1.293    1.230    1.518    0.952    0.634
mHalton     30000    0.999    1.097    2.028    1.276    0.251
dHalton     30000    1.003    1.053    1.087    1.125    0.737
wHalton     50000    1.291    1.186    1.710    1.980    0.543
mHalton     50000    1.002    1.116    1.949    1.034    0.256
dHalton     50000    1.001    1.096    1.080    1.050    1.717
wHalton     80000    1.287    1.167    1.532    1.571    0.517
mHalton     80000    1.003    1.047    1.483    0.826    0.318
dHalton     80000    0.993    1.038    1.088    0.929    1.170
wHalton     100000   1.285    1.160    1.620    1.579    0.561
mHalton     100000   1.002    1.068    1.366    0.788    0.354
dHalton     100000   0.999    1.012    1.023    0.864    0.998


It is important to single out s = 50, where dHalton is convergent, while the other two derandomization techniques seem not to converge with an increasing number of simulations. Obviously, the optimal Halton sequence can improve the convergence rate for QMC applications. More numerical experiments will be needed to further explore the derandomization of the Halton sequence. Here, the optimal Halton sequence, dHalton, is based on a global measure, while the others are based only on certain local measures. Thus, the fact that dHalton has better performance in this integration problem is intuitively reassuring.

8. Conclusions

Above, we gave an overview of the Halton sequence and of various methods used to scramble the Halton sequence. We also explored the reasons that explain the poor two-dimensional projections of the Halton sequence. By scrambling the Halton sequence, we can produce a family of scrambled Halton sequences. It is a natural question, therefore, to ask how to choose an optimal Halton sequence from among this family. Derandomization techniques provide us with a way to find an optimal sequence from such a stochastic family of Halton sequences.

In this paper, based on the analysis of these correlations, we propose a new and simple algorithm to generate an optimal Halton sequence. This algorithm generates the optimal Halton sequence very rapidly. This was shown to be important for practical quasi-Monte Carlo applications through the example of a difficult high-dimensional integral. Even though it is well known that the distribution of the Halton sequence in high dimensions is not good, the scrambled or optimally scrambled Halton sequence can often improve the quality. Thus, the scrambled and optimal Halton sequences can be widely applied in quasi-Monte Carlo applications. It is important to note that the idea used for selecting the optimal Halton sequence in this paper can also be used for derandomization of the generalized Faure (GFaure) [10,18] family. This will be the subject of some of our future research.

References

[1] E. Braaten, G. Weller, An improved low-discrepancy sequence for multidimensional quasi-Monte Carlo integration, J. Comput. Phys. 33 (1979) 249–258.
[2] E. Brunner, A. Uhl, Optimal multipliers for linear congruential pseudo-random number generators with prime moduli: parallel computation and properties, BIT 68 (1999) 249–260.
[3] G. Casella, Statistical Inference, Brooks/Cole Pub. Co., Pacific Grove, CA, 1990.
[4] P.J. Davis, P. Rabinowitz, Methods of Numerical Integration, Academic Press, New York, 1984.
[5] B. Fox, Implementation and relative efficiency of quasirandom sequence generators, ACM Trans. Math. Software 12 (1986) 362–376.
[6] J. Halton, On the efficiency of certain quasirandom sequences of points in evaluating multidimensional integrals, Numerische Mathematik 2 (1960) 84–90.
[7] L. Kocis, W. Whiten, Computational investigations of low discrepancy sequences, ACM Trans. Math. Software 23 (1997) 266–294.
[8] R. Lidl, H. Niederreiter, Introduction to Finite Fields and Their Applications, Cambridge University Press, Cambridge, 1994.
[9] M. Mascagni, H. Chi, On the scrambled Halton sequences, Monte Carlo Meth. Appl. 10 (2004) 435–442.
[10] M. Mascagni, H. Chi, Optimal quasi-Monte Carlo valuation of derivative securities, in: M. Costantino, C. Brebbia (Eds.), Computational Finance and its Applications, WIT Press, Southampton, Boston, 2004, pp. 177–185.
[11] J. Matousek, On the L2-discrepancy for anchored boxes, J. Complex. 14 (1998) 527–556.
[12] W.J. Morokoff, R.E. Caflisch, Quasirandom sequences and their discrepancy, SIAM J. Sci. Comput. 15 (1994) 1251–1279.
[13] H. Niederreiter, Quasi-Monte Carlo methods and pseudorandom numbers, Bull. Am. Math. Soc. 84 (1978) 957–1041.
[14] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, 1992.
[15] A.B. Owen, Randomly permuted (t, m, s)-nets and (t, s)-sequences, in: Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Lecture Notes in Statistics, vol. 106, Springer, 1995, pp. 299–317.
[16] S.H. Paskov, J.F. Traub, Faster valuation of financial derivatives, J. Portfolio Manage. 22 (1) (1995) 113–120.
[17] J. Spanier, E.H. Maize, Quasi-random methods for estimating integrals using relatively small samples, SIAM Rev. 36 (1) (1994) 18–44.
[18] S. Tezuka, Uniform Random Numbers: Theory and Practice, Kluwer Academic Publishers, New York, 1995.
[19] B. Tuffin, Improvement of Halton sequences distribution, Technical Report No. 998, IRISA, Rennes, 1996.
[20] X. Wang, K.T. Fang, The effective dimension and quasi-Monte Carlo, J. Complex. 19 (2) (2003) 101–124.
[21] X. Wang, F.J. Hickernell, Randomized Halton sequences, Math. Comput. Model. 32 (2000) 887–899.
[22] T. Warnock, Computational investigations of low discrepancy point sets, in: S.K. Zaremba (Ed.), Applications of Number Theory to Numerical Analysis, Academic Press, New York, 1972, pp. 319–343.
[23] T. Warnock, Computational investigations of low discrepancy point sets II, in: H. Niederreiter (Ed.), Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Springer, 1995, pp. 354–361.
[24] T. Warnock, Effective error estimates for quasi-Monte Carlo computations, Uncertainty Quantification Working Group and Related Special Seminars, http://www.public.lanl.gov/kmh/uncertainty/meetings/warnvgr.pdf, 2002.