
SELECTING THE t POPULATIONS
WITH THE LARGEST α-PERCENTILE*

by Milton Sobel

SELECTING A SUBSET CONTAINING THE
POPULATION WITH THE LARGEST α-PERCENTILE*

by M. Haseeb Rizvi† and Milton Sobel

Technical Report No. 65
November, 1965

University of Minnesota
Minneapolis, Minnesota

†Now at Ohio State University, Columbus, Ohio.

* This report was supported by National Science Foundation Grant No. GP-3813.



Selecting the t Populations with the Largest α-Percentiles

by Milton Sobel, University of Minnesota

1. Introduction

There are given k populations with cumulative distribution functions (c.d.f.) F_i(x) = F_i (i = 1, 2, ..., k); a number α with 0 < α < 1 and an integer t with 1 ≤ t ≤ k-1 are preassigned. The goal is to select those t of the k populations which have the largest α-percentiles. The form of the c.d.f. F_i is unknown, but it is assumed that F_i(x) is continuous in x (i = 1, 2, ..., k). Two different formulations of this nonparametric problem will be given; for each formulation there will be a Case A and a Case B, according to whether a certain assumption is made.

Let x_α(F) denote the αth percentile of the distribution F; if this percentile is not unique we can define it as any arbitrary convex combination of points in the closure of the set {x : F(x) = α}. However, we shall assume that each F_i has a unique α-percentile in order to avoid extra notation which does not add to the basic ideas of the problem. Let F_[i](x) = F_[i] denote the c.d.f. with the ith smallest α-percentile. The correct ordering of the k distributions is

(1.1)    x_α(F_[1]) ≤ x_α(F_[2]) ≤ ··· ≤ x_α(F_[k]),

where x_α(F_[i]) denotes the α-percentile of F_[i]. It is assumed that no a priori information (in the form of a distribution or otherwise) is available concerning the correct pairing of the F_i and the F_[j]. We do not assume that the k distributions have the same supports.

It is clear that if the F_i(x) are ordered uniformly in x, i.e., if there are no crossovers, then the problem of selecting the t smallest of the F_i(x) is equivalent to the problem of selecting the t populations with the largest α-percentiles.

Formulation 1:

Let ε* > 0 be a specified number such that ε* ≤ α ≤ 1-ε*. Let x_β^-(F) and x_β^+(F) denote the infimum and supremum of the set {x : F(x) = β}.

Define the closed interval

(1.2)    I = [ x^-_{α-ε*}(F_[k-t+1]), x^+_{α+ε*}(F_[k-t+1]) ].

For i = 1, 2, ..., k-t and j = k-t+1, k-t+2, ..., k, let

(1.3)    d_{i,j} = inf_{x ∈ I} { F_[i](x) - F_[j](x) }.

Finally, let d* and P* denote specified constants with d* > 0 and 1 > P* > 1/(k choose t). We define F_[k-t] < F_[k-t+1] to mean that x_α(F_[k-t]) < x_α(F_[k-t+1]), and when this is the case we define a correct selection (CS) to mean the selection of those t populations with the largest α-percentiles; we shall not need the definition of a CS when F_[k-t] = F_[k-t+1].

Our goal is to find a procedure R which satisfies the probability requirement

(1.4)    P{CS | R} ≥ P*  when  inf_{(i,j)} d_{i,j} ≥ d*,

i.e., whenever the infimum of d_{i,j} over all pairs (i,j) with 1 ≤ i ≤ k-t and k-t+1 ≤ j ≤ k is at least d*. Any vector (or point) (F_1, F_2, ..., F_k) for which d_{i,j} < d* for at least one pair (i,j) is said to be in the "indifference zone", and no probability requirement is made for such points.

Let the common sample size n from each population be sufficiently large so that 1 ≤ (n+1)α ≤ n, and let the positive integer r be defined by

(1.5)    r ≤ (n+1)α < r+1.

It follows from the above that 1 ≤ r ≤ n. Let Y_{j,i} denote the jth order statistic from F_i (j = r, r+1; i = 1, 2, ..., k) and let

(1.6)    W_i = [r+1-(n+1)α] Y_{r,i} + [(n+1)α - r] Y_{r+1,i}.


The procedure R will be to take n observations from each population and select the t populations which give rise to the t largest W-values.
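The rule just described is simple to state computationally. The following is a minimal illustrative sketch (ours, not part of the report): it computes the interpolated statistic W of (1.6) from a list of observations and then selects the t samples with the largest W-values. The function names are our own.

```python
import math

def w_statistic(sample, alpha):
    """Interpolated sample alpha-percentile W of (1.6):
    W = [r+1-(n+1)a]*Y_r + [(n+1)a - r]*Y_{r+1},
    where Y_r is the r-th smallest observation and r <= (n+1)a < r+1 as in (1.5)."""
    n = len(sample)
    y = sorted(sample)
    r = math.floor((n + 1) * alpha)          # r as defined by (1.5)
    assert 1 <= r <= n, "n must satisfy 1 <= (n+1)*alpha <= n"
    frac = (n + 1) * alpha - r
    y_r = y[r - 1]                            # 1-indexed order statistic Y_r
    y_r1 = y[r] if r < n else y[r - 1]        # Y_{r+1}, only weighted when frac > 0
    return (1 - frac) * y_r + frac * y_r1

def procedure_R(samples, alpha, t):
    """Procedure R: select the t populations giving the t largest W-values."""
    w = [w_statistic(s, alpha) for s in samples]
    order = sorted(range(len(samples)), key=lambda i: w[i], reverse=True)
    return sorted(order[:t])
```

For α = ½ and odd n, W reduces to the ordinary sample median, the case studied in Sections 4 and 5.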


If α is rational, then (n+1)α will be an integer for some arithmetic progression A of n-values, and then r = (n+1)α and W_i = Y_{r,i}; for simplicity we shall assume that α is rational.

Hence the problem of finding a procedure R satisfying (1.4) reduces to the problem of finding the smallest integer n such that (1.4) is satisfied; we shall be content with finding the smallest n in A that satisfies (1.4).

Formulation 2

Let ε* > 0 be a specified number such that ε* ≤ α ≤ 1-ε*. For i = 1, 2, ..., k-t and j = k-t+1, k-t+2, ..., k, let

(1.7)    d'_{ij} = x^-_{α-ε*}(F_[j]) - x^+_{α+ε*}(F_[i]).

Let P* denote a preassigned constant such that 1 > P* > 1/(k choose t). Our goal is to find a procedure R such that

(1.8)    P{CS | R} ≥ P*  when  inf_{(i,j)} d'_{ij} ≥ 0,

i.e., whenever the infimum of d'_{ij} over all pairs (i,j) with 1 ≤ i ≤ k-t and k-t+1 ≤ j ≤ k is non-negative. The form of the procedure is the same as for Formulation 1 above, and the comments made there hold in this case also; in particular, the same assumption that α is rational will be made. Hence the problem again reduces to finding the smallest integer n such that (1.8) is satisfied. The vectors (or points) in the indifference zone are those for which d'_{ij} < 0 for at least one pair (i,j).

2. Probability of a Correct Selection

For each formulation we calculate the exact and asymptotic (n → ∞) probability of a correct selection for an arbitrary configuration and for the "least favorable" configuration, the latter being defined as the configuration attaining the infimum of the P{CS} in the complement of the indifference zone.


2.1 P{CS} for Formulation 1A

Let Y_{r,(i)} = Y_(i) denote the rth order statistic associated with the distribution F_[i]. The density element dH_{r,i}(y) = dH_i(y) and the c.d.f. H_{r,i}(y) = H_i(y) of Y_(i) are respectively given by

(2.1)    dH_i(y) = r (n choose r) F_[i]^{r-1}(y) [1 - F_[i](y)]^{n-r} dF_[i](y),

(2.2)    H_i(y) = Σ_{j=r}^{n} (n choose j) F_[i]^{j}(y) [1 - F_[i](y)]^{n-j}.

The probability of a correct selection under the procedure R is given by

(2.3)    P{CS | R} = P{ Max(Y_(1), ..., Y_(k-t)) < Min(Y_(k-t+1), ..., Y_(k)) }
            = Σ_{j=k-t+1}^{k} P{ Max(Y_(1), ..., Y_(k-t)) < Y_(j) = Min(Y_(k-t+1), ..., Y_(k)) }
            = Σ_{j=k-t+1}^{k} ∫_{-∞}^{∞} Π_{i=1}^{k-t} H_i(y) Π_{β=k-t+1, β≠j}^{k} [1 - H_β(y)] dH_j(y)
            ≥ Σ_{j=k-t+1}^{k} ∫_I Π_{i=1}^{k-t} H_i(y) Π_{β=k-t+1, β≠j}^{k} [1 - H_β(y)] dH_j(y) + G^{k-t}(α+ε*+d*) [1 - G(α+ε*)]^t,

where G_{r,n}(p) = G(p) is used for the standard incomplete beta function

    G_{r,n}(p) = [ n! / ((r-1)!(n-r)!) ] ∫_0^p x^{r-1} (1-x)^{n-r} dx;

in the last term of (2.3) we have used the fact that H_i(y) is a continuous nondecreasing function of F_[i](y) and hence also of y. Using this fact again gives

(2.4)    P{CS | R} ≥ Σ_{j=k-t+1}^{k} ∫_I [H_{k-t}(y)]^{k-t} Π_{β=k-t+1, β≠j}^{k} [1 - H_β(y)] dH_j(y) + G^{k-t}(α+ε*+d*) [1 - G(α+ε*)]^t,


and we note that we have set each H_i(y) under the integral sign equal to H_{k-t}(y) to attain a minimum.

Similarly we can write the P{CS|R} in the form

(2.5)    P{CS | R} = P{ Max(Y_(1), ..., Y_(k-t)) < Min(Y_(k-t+1), ..., Y_(k)) }
            = Σ_{i=1}^{k-t} P{ Max(Y_(1), ..., Y_(k-t)) = Y_(i) < Min(Y_(k-t+1), ..., Y_(k)) }
            = Σ_{i=1}^{k-t} ∫_{-∞}^{∞} Π_{γ=1, γ≠i}^{k-t} H_γ(y) Π_{j=k-t+1}^{k} [1 - H_j(y)] dH_i(y)
            ≥ Σ_{i=1}^{k-t} ∫_I Π_{γ=1, γ≠i}^{k-t} H_γ(y) [1 - H_{k-t+1}(y)]^t dH_i(y) + G^{k-t}(α-ε*+d*) [1 - G(α-ε*)]^t.

We note that to minimize the P{CS|R} we have to set each H_j(y) equal to H_{k-t+1}(y) for y in the interior of the interval I. Carrying this out in (2.4) we obtain

(2.6)    P{CS | R} ≥ t ∫_I [H_{k-t}(y)]^{k-t} [1 - H_{k-t+1}(y)]^{t-1} dH_{k-t+1}(y) + G^{k-t}(α+ε*+d*) [1 - G(α+ε*)]^t.

On the interval I we have F_[k-t](y) = F_[k-t+1](y) + d* and hence, letting u = F_[k-t+1](y),

(2.7)    H_{k-t}(y) = G(F_[k-t](y)) = G(u+d*),

(2.8)    H_{k-t+1}(y) = G(u).

Hence (2.6) takes the form

(2.9)    P{CS | R} ≥ t ∫_{α-ε*}^{α+ε*} G^{k-t}(u+d*) [1 - G(u)]^{t-1} dG(u) + G^{k-t}(α+ε*+d*) [1 - G(α+ε*)]^t.

From (2.5) we now obtain the expression

(2.10)    P{CS | R} ≥ (k-t) ∫_{α+d*-ε*}^{α+d*+ε*} G^{k-t-1}(v) [1 - G(v-d*)]^t dG(v) + G^{k-t}(α-ε*+d*) [1 - G(α-ε*)]^t,

which is equivalent to (2.9).

It is interesting to note that with this formulation the right side of (2.9) does not give 1/(k choose t) if we set d* = 0 (or if we set ε* = d* = 0), but actually gives something smaller; for n → ∞ so that r/n → α we will show below that the right side of (2.9) approaches one for d* > 0 and approaches 1/(k choose t) for d* = 0.
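The right side of (2.9) is easy to evaluate numerically, computing G_{r,n} from the binomial-sum identity (2.2) and the integral by the trapezoidal rule. The sketch below is our own illustration, not part of the report, and the function names are assumptions.

```python
import math

def G(p, r, n):
    """Incomplete beta c.d.f. G_{r,n-r+1}(p) via the binomial-sum identity (2.2)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))

def dG(p, r, n):
    """Density of G_{r,n-r+1}: r * C(n,r) * p^(r-1) * (1-p)^(n-r)."""
    return r * math.comb(n, r) * p**(r - 1) * (1 - p)**(n - r)

def pcs_lower_bound(n, r, k, t, alpha, eps, d, steps=400):
    """Right side of (2.9): lower bound on P{CS|R} for Formulation 1A."""
    a, b = alpha - eps, alpha + eps
    h = (b - a) / steps
    f = lambda u: G(u + d, r, n)**(k - t) * (1 - G(u, r, n))**(t - 1) * dG(u, r, n)
    integral = h * (f(a) / 2 + sum(f(a + i * h) for i in range(1, steps)) + f(b) / 2)
    tail = G(alpha + eps + d, r, n)**(k - t) * (1 - G(alpha + eps, r, n))**t
    return t * integral + tail
```

With d* > 0 fixed, the bound increases toward one as n grows, in line with the asymptotic statement above.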

2.2 P{CS} for Formulation 1B

A slight variation of the results in the last section is obtained if we make the assumption that for all x

(2.11)    F_[i](x) ≥ F_[k-t+1](x)    (i = 1, 2, ..., k-t),
          F_[j](x) ≤ F_[k-t](x)     (j = k-t+1, k-t+2, ..., k).

For the special case t = 1 we omit the second part of (2.11), and for t = k-1 we omit the first part. [It should be noted that we could have assumed that for all pairs (i,j) and all x

(2.12)    F_[i](x) ≥ F_[j](x),

but the assumption (2.11) is a weaker assumption and accomplishes the same purpose, namely to make F_[i] = F_[j] = F (say) in the worst configuration.]


Under the assumption (2.11) the derivation of the first term on the right side of (2.13), i.e., the integral, remains exactly the same as in (2.9), but the contribution from the complement of I is different. The final result for this formulation, using the same methods as above, is

(2.13)    P{CS | R} ≥ t ∫_{α-ε*}^{α+ε*} G^{k-t}(u+d*) [1 - G(u)]^{t-1} dG(u)
                    + G^{k-t}(α+ε*+d*) { [1 - G(α+ε*)]^t - [1 - G(α+ε*+d*)]^t }
                    + (1/(k choose t)) [ 1 - G_{k-t+1,k}(G(α+ε*+d*)) + G_{k-t+1,k}(G(α-ε*)) ].

The explanation of the second term on the right side of (2.13) is as follows. In the jth term of the sum in (2.3) we let u = F_[j](y) and consider that portion of the integral for which α+ε* < u < α+ε*+d*. As y approaches y_1 = x^+_{α+ε*}(F_[k-t+1]), the right end point of I, from within I, we have F_[j](y) → α+ε*, and it follows that F_[i](y) → α+ε*+d* and hence H_i(y) → G(α+ε*+d*) for i = 1, 2, ..., k-t and y = y_1. Hence, letting I'_1 denote this part of the integral in (2.3),

(2.14)    I'_1 ≥ G^{k-t}(α+ε*+d*) Σ_{j=k-t+1}^{k} ∫_{y_1}^{y_1^+} Π_{β=k-t+1, β≠j}^{k} [1 - H_β(y)] dH_j(y)
               = G^{k-t}(α+ε*+d*) { 1 - Π_{j=k-t+1}^{k} [1 - H_j(y)] } |_{y_1}^{y_1^+}.

The last part of (2.13) represents the contributions of the intervals α+ε*+d* ≤ u ≤ 1 and 0 ≤ u ≤ α-ε*, respectively.

For d* = 0 and any ε* > 0, it is easily seen that the right side of (2.13) reduces to 1/(k choose t).

It is natural to conjecture that the right side of (2.13) is greater than the right side of (2.9), and hence that formulation 1B requires an n-value which is not larger than that required by formulation 1A. This is indeed the case, since the difference Δ of these quantities satisfies the inequality

(2.15)    Δ ≥ t ∫_x^1 y^{k-t} (1-y)^{t-1} dy - x^{k-t} (1-x)^t    (0 ≤ x ≤ 1),

where x = G(α+ε*+d*). Differentiating the right side of (2.15), we find that it is decreasing from 1/(k choose t) to 0 as x increases from 0 to 1, so that Δ ≥ 0.

It will also be seen from the discussion below that the expressions in (2.9) and (2.13) are asymptotically equivalent for large n (or, equivalently, for d* close to zero), and it will be of some interest to see how much difference it makes in the sample size to make the assumption (2.11). Table 1 shows that this difference is quite small.

2.3 The P{CS|R} for Formulation 2A

Using the third expression in (2.3) as a starting point, setting H_i(y) = G(α+ε*) for each i, integrating the resulting expression, and letting x_1 = min_j x^-_{α-ε*}(F_[j]) gives

(2.16)    P{CS | R} ≥ G^{k-t}(α+ε*) Π_{j=k-t+1}^{k} [1 - H_j(x_1)].

Increasing every H_j as much as possible at x_1, we obtain G(α-ε*) for each of them, and hence

(2.17)    P{CS | R} ≥ G^{k-t}(α+ε*) [1 - G(α-ε*)]^t.


It will be seen below that this tends to one as n → ∞, and the remaining problem is to determine the smallest n for which the right side of (2.17) is at least P*.

For the special case α = ½ we have r = (n+1)/2 = n-r+1 and, using a well-known property of the symmetric incomplete beta function, (2.17) becomes

(2.18)    P{CS | R} ≥ G^k(½+ε*),

which does not depend on t.

2.4 The P{CS|R} for Formulation 2B

We now make the same assumption (2.11) as in formulation 1B and, using the same methods as above, we obtain

(2.19)    P{CS | R} ≥ t ∫_0^{α-ε*} G^{k-t}(u) [1 - G(u)]^{t-1} dG(u) + t ∫_{α+ε*}^{1} G^{k-t}(u) [1 - G(u)]^{t-1} dG(u)
                    = (1/(k choose t)) [ G_{k-t+1,k}(G(α-ε*)) + 1 - G_{k-t+1,k}(G(α+ε*)) ].

It is natural to conjecture that the right side of (2.19) is greater than the right side of (2.17), and hence formulation 2B requires an n which is not larger than that required by formulation 2A. This is indeed the case, and the proof is exactly the same as in formulation 1. We also note that for ε* = 0 the right side of (2.19) reduces to 1/(k choose t).

For k = 2, t = 1, α = ½ and r = (n+1)/2, Table 1 gives the smallest odd integers that will solve the problem for selected values of d* = ε* and P*, for each of the four formulations, 1A, 1B, 2A and 2B.


3. Asymptotic Expressions

If X has the incomplete beta distribution G_{r,n-r+1}(x) = G(x), then as n → ∞ and r/n → α

(3.1)    G(α+y) = P{ X ≤ α+y } ≈ Φ( [ α+y - r/(n+1) ] / √( r(n-r+1) / ((n+2)(n+1)²) ) ),

where Φ denotes the standard normal c.d.f. Hence the asymptotic expression as n → ∞ with r/n → α for formulation 1A is

(3.2)    P{CS | R} ≈ t ∫_{-∞}^{∞} Φ^{k-t}(x + v_n) [1 - Φ(x)]^{t-1} dΦ(x),

where v_n = d*√(n+2) / √(α(1-α)), and it will be understood that n → ∞ because d* → 0. Values of v have been tabulated for t = 1 by Gupta [3] and Milton [6]. If v(t,k,P*) = v* is the tabulated solution of

(3.3)    t ∫_{-∞}^{∞} Φ^{k-t}(x + v*) [1 - Φ(x)]^{t-1} dΦ(x) = P*,

then the required asymptotic solution for n is given by

(3.4)    n ≈ (v*/d*)² α(1-α).

Clearly, we get an upper bound to this asymptotic solution by setting α(1-α) = ¼. It is interesting to note that v* and also the asymptotic solution (3.4) do not depend on ε*.

For formulation 1B the asymptotic (n → ∞, r/n → α) expression for the right side of (2.13) is the same as (3.2), since for x > 0

(3.5)    1 - G(α+x) ≈ 1 - Φ( x√(n+2) / √(α(1-α)) ) → 0,

(3.6)    G(α-x) ≈ Φ( -x√(n+2) / √(α(1-α)) ) → 0,

and hence all terms after the integral in (2.13) tend to zero.

For formulation 2A we easily obtain from (2.17)

(3.7)    P{CS | R} ≈ Φ^k( ε*√(n+2) / √(α(1-α)) ),

and hence, if λ(P) denotes the P-percentile of the standard normal c.d.f., the required asymptotic solution for n is given by

(3.8)    n ≈ α(1-α) [ λ(P*^{1/k}) / ε* ]²,

and again we obtain an upper bound over different α by replacing α(1-α) by ¼ in (3.8).

It is curious that the results in (3.7) and (3.8) do not depend on t; here this property holds for all values of α.

For formulation 2B the asymptotic (n → ∞, r/n → α) expression for the right side of (2.19) is the same as that of (3.7) by virtue of (3.5) and (3.6). Hence (3.7) also holds for formulation 2B.
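The defining equation (3.3) and the sample-size formula (3.4) can be evaluated with elementary numerics. The sketch below is ours, not the report's: it solves (3.3) for v* by bisection (the left side is increasing in v) and then applies (3.4).

```python
import math

def phi(x):
    """Standard normal c.d.f. Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def dphi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lhs(v, k, t, steps=2000, lim=8.0):
    """Left side of (3.3): t * integral of Phi^{k-t}(x+v) [1-Phi(x)]^{t-1} dPhi(x),
    by the trapezoidal rule over [-lim, lim]."""
    h = 2 * lim / steps
    total = 0.0
    for i in range(steps + 1):
        x = -lim + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * phi(x + v) ** (k - t) * (1 - phi(x)) ** (t - 1) * dphi(x)
    return t * total * h

def v_star(k, t, p_star):
    """Solve (3.3) for v* by bisection."""
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lhs(mid, k, t) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def n_asymptotic(k, t, p_star, d, alpha):
    """Asymptotic sample size (3.4): n ~ (v*/d*)^2 * alpha*(1-alpha)."""
    return (v_star(k, t, p_star) / d) ** 2 * alpha * (1 - alpha)
```

For k = 2, t = 1 the left side of (3.3) collapses to Φ(v/√2), so v* = √2 λ(P*), which provides a quick check on the numerics.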

4. Asymptotic Relative Efficiency for Formulation 1

In this section we compare the asymptotic relative efficiency (ARE) of our procedure for formulation 1 with α = ½ with that of several other procedures developed for particular parametric families. Unless stated otherwise we shall assume that P* is fixed and that d* → 0 for fixed small ε*; then we will let ε* → 0 also.

1. ARE for Normal Slippage Alternatives (NA)

We first compare (3.4) with the result obtained in [1] for the problem of selecting the t 'best' out of k normal populations with common variance σ² = 1. The infimum of the P{CS} under the procedure R' based on the sample means is easily seen to be (see [1] for details)

(4.1)    P{CS | R'} ≈ t ∫_{-∞}^{∞} Φ^{k-t}(x + v'_n) [1 - Φ(x)]^{t-1} dΦ(x),

where v'_n = δ*√n, δ* > 0 being a specified measure of distance between the tth largest population mean and the (t+1)st largest. Then n is obtained by setting the right side of (4.1) equal to a specified value P*, where 1 > P* > 1/(k choose t).

For the two c.d.f.'s Φ(x) and Φ(x+δ*), we want to find the smallest value of δ* such that Φ(x+δ*) - Φ(x) ≥ d* for all x ∈ I, where I is defined in (1.2) with F_[k-t+1](x) = Φ(x). It is easy to see that we need only ask for equality at the right end point of I, i.e., at x_0 = Φ^{-1}(½+ε*), and the smallest δ* that satisfies our condition is given by

(4.2)    Φ(x_0 + δ*) - Φ(x_0) = d*,

or equivalently,

(4.3)    δ* = Φ^{-1}(d* + ½ + ε*) - Φ^{-1}(½ + ε*).

Hence the asymptotic (n → ∞, r/n → ½) value of n (say n') required for this normal problem is

(4.4)    n' ≈ (v*/δ*)² ≈ (v*/d*)² [ (d/dx) Φ^{-1}(x) |_{x=½+ε*} ]^{-2} = (v*/d*)² (1/2π) exp{ -[Φ^{-1}(½+ε*)]² }.

Using (3.4) with α = ½ and (4.4), it follows that the ARE(R,R') of the procedure R based on the medians relative to the best fixed sample procedure R' based on the sample means is

(4.5)    ARE(R,R'; NA) = lim_{d*→0} n'/n = (2/π) exp{ -[Φ^{-1}(½+ε*)]² }.


If we let ε* → 0 then we can make this result as close to 2/π as we wish; in particular, if we had set ε* = d* and let d* approach zero, then the result would be exactly 2/π.
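The limiting value just described is easy to check numerically. The helper below is our own illustration of the right side of (4.5); the inverse normal is computed by bisection purely for self-containment.

```python
import math

def phi_inv(p, tol=1e-12):
    """Inverse standard normal c.d.f. by bisection (illustration only)."""
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def are_normal(eps):
    """Right side of (4.5): ARE of the median procedure R relative to the
    means procedure R' under normal slippage, as a function of epsilon*."""
    return (2 / math.pi) * math.exp(-phi_inv(0.5 + eps) ** 2)
```

As ε* decreases the value rises toward 2/π ≈ 0.6366, the classical efficiency of the median relative to the mean for normal populations.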

We now compare our procedure R with a class of procedures based on ranks (including the rank sum procedure and the normal scores procedure) which form a slight generalization of the class of procedures described by Lehmann [5]. To develop Lehmann's results on the required sample size for the problem of selecting the t best populations (he considers only t = 1), we now prove an analogue of his Lemma 2.

Let X_{ij} denote the jth observation from the population π_i (i = 1, 2, ..., k; j = 1, 2, ..., n), and let R_{ij} denote the rank of X_{ij} in the combined sample. Let

(4.6)    V_i = Σ_{j=1}^{n} J( R_{ij} / (nk+1) )    (i = 1, 2, ..., k),

where J(x) is the inverse of some arbitrary c.d.f., i.e., J(x) is nondecreasing and defined on the interval [0,1]. It is assumed that X_{ij} has distribution F_i and that the F_i differ only in a location parameter θ. Let θ_[1] ≤ θ_[2] ≤ ··· ≤ θ_[k] denote the ordered θ-values. The procedure R'' for selecting the t 'best' populations will be to select those that give rise to the t largest V_i-values. If the tth best is shifted to the right of the (t+1)st best by a positive amount δ = θ_[k-t+1] - θ_[k-t] > 0, then a correct selection (CS) is clearly defined. We want to determine the sample size n required to satisfy the condition

(4.7)    P{CS | R''} ≥ P*  whenever  δ ≥ δ*;

here δ* > 0 and P* (1 > P* > 1/(k choose t)) are preassigned constants.

It is easy to show that to satisfy (4.7) we need only consider the configuration

(4.8)    θ_[1] = ··· = θ_[k-t] = θ_[k-t+1] - δ* = ··· = θ_[k] - δ*,

i.e., (4.8) is the least favorable configuration (LFC).
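For concreteness, the scores (4.6) and the selection rule R'' can be sketched as follows. This is our illustration only; ties and the regularity conditions are ignored, and J(x) = x gives the rank-sum procedure.

```python
def rank_scores(samples, J=lambda x: x):
    """V_i of (4.6): V_i = sum_j J(R_ij / (nk+1)).
    `samples` is a list of k equal-length lists of observations."""
    k, n = len(samples), len(samples[0])
    pooled = sorted((x, i) for i, s in enumerate(samples) for x in s)
    v = [0.0] * k
    for rank, (_, i) in enumerate(pooled, start=1):   # combined-sample ranks 1..nk
        v[i] += J(rank / (n * k + 1))
    return v

def select_t_best(samples, t, J=lambda x: x):
    """Procedure R'': select the t populations with the largest V_i-values."""
    v = rank_scores(samples, J)
    return sorted(sorted(range(len(samples)), key=lambda i: v[i], reverse=True)[:t])
```

Passing a different score function J, e.g. an inverse normal c.d.f., gives the normal scores procedure discussed below.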

Lemma 2. For fixed P*, let n = n(δ*,t,k) denote the smallest integer that satisfies (4.7), and suppose that the F_i and J satisfy the regularity conditions of Theorem 6.2 and Lemma 7.2 of Puri [8]. Then as n → ∞

(4.9)    δ* = (v*/√n) · A / ∫ (d/dx){J(F(x))} dF(x) + o(1/√n),

where v* = v(t,k,P*) is defined in Section 3 and

(4.10)    A² = ∫ J²(u) du - ( ∫ J(u) du )².

Proof: The proof is quite similar to that given in Lehmann [5]; we consider a sequence of parameter vectors satisfying (4.8) with δ* → 0. By Puri's theorem the standardized random variables

(4.11)    ξ_i = [ V_i - E(V_i) ] / σ(V_i)    (i = 1, 2, ..., k)

have a limiting normal distribution with

(4.12)    E(ξ_i) = 0,  σ²(ξ_i) = 1  (i = 1, 2, ..., k),  ρ(ξ_i, ξ_j) = ½  (i ≠ j).

The P{CS} then is easily shown to take the form

(4.13)    P{CS | R''} ≈ P{ Max_i ξ_i < Min[ Min_j (ξ_j + v/√2), v/√2 ] },

where i = 1, 2, ..., k-t, j = k-t+1, k-t+2, ..., k-1, and v is defined by


(4.14)    v = (√n/A) ∫ [ J( ((k-t)/k)F(x+δ*) + (t/k)F(x) ) - J( ((k-t)/k)F(x) + (t/k)F(x-δ*) ) ] dF(x)
            ≈ δ* [ √n ∫ (d/dx){J(F(x))} dF(x) ] / A.

We can now write, for all a = 1, 2, ..., k-1, ξ_a = (η_a - η_k)/√2, where the η_i (i = 1, 2, ..., k) are independent standard normal chance variables, and we then obtain

(4.15)    P{CS | R''} ≈ P{ Max_i η_i < Min_{j=k-t+1,...,k} (η_j + v) }
            = Σ_{a=k-t+1}^{k} P{ Max_i η_i < η_a + v, η_a < Min_{j≠a} η_j }
            = t ∫_{-∞}^{∞} Φ^{k-t}(x+v) [1 - Φ(x)]^{t-1} dΦ(x).

If we set the last expression in (4.14) equal to v* then we obtain the required result (4.9).

For the location parameter problem with α ≥ ½ we make use of the

Lemma: If f_0(x) is a continuous unimodal density that is symmetric about x = 0, and if α ≥ ½, then to satisfy the condition that F_0(x+δ*) - F_0(x) ≥ d* for all x ∈ I it is sufficient to set F_0(x_0+δ*) - F_0(x_0) = d*, where x_0 = F_0^{-1}(α+ε*) is the right end point of I.

Proof: If f_0(x) is unimodal and continuous and δ* is sufficiently small, then there exists a point (or a whole interval, in which case we select the midpoint of this interval) inside the support of f_0(x) where f_0(x+δ*) = f_0(x); call this point x'. Then clearly F_0(x+δ*) - F_0(x) is also unimodal with a maximum at x'; if in addition f_0(x) is symmetric (say, about x = 0) then F_0(x+δ*) - F_0(x) will be symmetric about x' = -δ*/2. If α ≥ ½ then x_0 = F_0^{-1}(α+ε*) ≥ 0 since ε* > 0 and F_0 has median 0; clearly the left end point of I, -x_0, is closer to x' < 0 than x_0 is. Hence F_0(x+δ*) - F_0(x) takes on its minimum value over I at x_0, which proves the lemma.

For small values of δ* (and hence also d*) we have, using the above lemma,

(4.16)    δ* f_0(x_0) ≈ d*,  i.e.,  δ* ≈ d* / f_0(x_0),

when α ≥ ½ and f_0(x) satisfies the properties in the above lemma.

From (3.4), (4.9) and (4.16) we now obtain for translation alternatives (TA)

Theorem: If α ≥ ½ and f(x) has the properties mentioned above, then for any ε* > 0

(4.17)    ARE(R, R''(J); TA) = f_0²(x_0) A² / { α(1-α) [ ∫ (d/dx){J(F(x))} dF(x) ]² }.

For the rank sum test R''_1 we take J(x) = x and obtain from (4.17), for normal alternatives (NA) with α = ½,

(4.18)    ARE(R, R''_1; NA) = lim_{d*→0} n''_1/n = (2/3) exp{ -[Φ^{-1}(½+ε*)]² },

which can be made arbitrarily close to 2/3 by making ε* small.

For the normal scores test R''_2 we take J(x) = Φ^{-1}(x) and obtain from (4.17), for normal alternatives with α = ½,

(4.19)    ARE(R, R''_2; NA) = lim_{d*→0} n''_2/n = (2/π) exp{ -[Φ^{-1}(½+ε*)]² },

which can be made arbitrarily close to 2/π by making ε* small.

2. ARE for 2-sided Exponential Slippage Alternatives (TEA)

We wish to compare our procedure R with certain other procedures which we might prefer if the k distributions all have the form

(4.20)    F_i(x) = F(x; θ_i) = ½ ∫_{-∞}^{x} e^{-|t-θ_i|} dt;

here the F_i(x) differ at most in the value of the real location parameter θ_i. If we knew that the distributions had this form then we might use a fixed sample procedure R_1 based on the sample medians.

[The procedure R_1 is asymptotically equivalent to a procedure R_2 which depends on scores as in (4.6) with

(4.21)    J(x) = sgn(x - ½) = -1 for 0 ≤ x < ½;  0 for x = ½;  1 for ½ < x ≤ 1.

This procedure R_2 can be shown to be the locally most powerful test on the basis of Hájek [4] for the 2-sample case; we shall not prove this result here.]

The rule for procedure R_1 is to select the t populations which give rise to the t largest sample medians. Let θ_[i] denote the ith smallest θ-value. We want to choose n large enough so that


(4.22)    P{CS | R_1} ≥ P*

for all vectors θ = (θ_1, θ_2, ..., θ_k) such that θ_[k-t+1] - θ_[k-t] ≥ δ*; here δ* > 0 and P* with 1 > P* > 1/(k choose t) are both preassigned constants.

For general θ the P{CS|R_1} is again given by the development and third expression in (2.3), with H_i(y) defined in (2.2), except that now r = (n+1)/2 and F_[i](y) = F(y; θ_[i]). It is easy to show, using a result of Rizvi [9], that the infimum of the P{CS|R_1} is obtained at the configuration (4.8), i.e., the configuration (4.8) is again least favorable. Using this in (2.3) we obtain

(4.23)    P{CS | R_1} ≥ t ∫_{-∞}^{∞} H_0^{k-t}(y+δ*) [1 - H_0(y)]^{t-1} dH_0(y),

where H_0(y) is the right side of (2.2) with F_[i](y) replaced by F(y; 0), i.e., H_0(y) is the distribution of the sample median for an odd number n_1 of observations taken from the c.d.f. (4.20). We now use the fact that the sample median is asymptotically normal with mean equal to the population median ξ and variance [4n_1 f²(ξ)]^{-1}, where f(x) is the derivative of F(x; ξ). Applying this to (4.23) gives

(4.24)    P{CS | R_1} ≈ t ∫_{-∞}^{∞} Φ^{k-t}(x + δ*√n_1) [1 - Φ(x)]^{t-1} dΦ(x).

The value of n_1 is then obtained by setting the right side of (4.24) equal to P* and solving for n_1.

To get the relation between the horizontal measure of distance δ* and the vertical measure d* used in procedure R, we set the difference of the c.d.f.'s equal to d* at the right endpoint of the interval and obtain

(4.25)    2d*/(1-2ε*) = 1 - e^{-δ*} ≈ δ*.

Hence, using (4.24) and the definition of v* above,

(4.26)    n_1 ≈ [ v*(1-2ε*) / (2d*) ]².

If we compare this with (3.4) for α = ½, or use (4.17), we obtain

(4.27)    ARE(R, R_1) = lim_{d*→0} n_1/n = (1-2ε*)²,

which can be made arbitrarily close to 1 by making ε* sufficiently small.

It should be noted that although the sampling procedures are the same, the expressions for the infimum of the P{CS} are not the same for the two procedures above. In view of the relation mentioned above between R_1 and the asymptotically most powerful test R_2, it follows that R also has an asymptotic efficiency with respect to R_2 that can be made arbitrarily close to 1 by taking ε* sufficiently small.

With respect to the rank sum test R''_1 and the normal scores test R''_2, we now wish to show that the procedure R has an asymptotic efficiency which is greater than one when the distributions are in fact given by (4.20).

For the rank sum test we use (4.9) with J(x) = x and F(x) given by (4.20) with θ = 0 and obtain

(4.28)    n''_1 ≈ (4/3)(v*/δ*)².

Hence, using (4.25) and (3.4) with α = ½,

(4.29)    ARE(R, R''_1; TEA) = lim_{d*→0} n''_1/n = (4/3) [ v*(1-2ε*)/(2d*) ]² (2d*/v*)² = (4/3)(1-2ε*)²,

which can be made arbitrarily close to 4/3 by making ε* sufficiently small.

For the normal scores test R''_2 we set J(x) = Φ^{-1}(x) and use the same F(x); using (4.9) again and integrating by parts we obtain

(4.30)    ∫ (d/dx){ Φ^{-1}(F(x)) } dF(x) = 2/√(2π),

since it is easy to show that the first term in this integration by parts vanishes and the second term is 2/√(2π). Hence, using (4.25) and (3.4) with α = ½, we obtain

(4.31)    ARE(R, R''_2; TEA) = lim_{d*→0} n''_2/n = (π/2) [ v*(1-2ε*)/(2d*) ]² (2d*/v*)² = (π/2)(1-2ε*)²,

which can be made arbitrarily close to π/2 by making ε* sufficiently small.

Using (4.26) we can easily find the ARE of our procedure R with respect to the procedure R' based on the sample means for the 2-sided exponential case. Using the fact that the variance of X with c.d.f. given by (4.20) is 2, we again obtain (4.1) with v'_n = δ*√(n/2), where δ* > 0 has the same meaning as in (4.1). Hence, if v* is defined by (3.3) and n' denotes the required value of n for procedure R', then as in (4.26)

(4.32)    n' ≈ 2 (v*/δ*)².

Comparing this with (3.4) for α = ½, we obtain

(4.33)    ARE(R, R'; TEA) = lim_{d*→0} n'/n = 2(1-2ε*)²,

which can be made arbitrarily close to 2 by making ε* small.
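The comparisons of this section can also be explored by simulation. The sketch below is our own illustration, not the report's: it estimates P{CS} for the median rule with α = ½ under the two-sided exponential model (4.20), with the t best populations shifted right by δ as in the least favorable configuration (4.8). A standard two-sided exponential variate is generated as an exponential with a random sign.

```python
import random
import statistics

def pcs_monte_carlo(n, k, t, delta, reps=2000, seed=1):
    """Monte Carlo estimate of P{CS} for the sample-median selection rule
    (alpha = 1/2, odd n) under the Laplace slippage configuration (4.8)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(reps):
        meds = []
        for i in range(k):
            shift = delta if i >= k - t else 0.0      # t 'best' populations shifted
            sample = [shift + rng.expovariate(1.0) * rng.choice((-1, 1))
                      for _ in range(n)]
            meds.append(statistics.median(sample))
        chosen = sorted(sorted(range(k), key=meds.__getitem__, reverse=True)[:t])
        if chosen == list(range(k - t, k)):
            correct += 1
    return correct / reps
```

Increasing n at fixed δ drives the estimate toward one, as the asymptotic formulas of this section predict.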

5. Supplementary Results Needed for Table Calculation

Formulation 1A

For the special case k = 2, t = 1, α = ½ and ε* = d*, using the right side of (2.9), the equation determining n for formulation 1A becomes

(5.1)    I + G(½+2d*) G(½-d*) = P*,  where  I = ∫_{½-d*}^{½+d*} G(u+d*) dG(u)

and G = G_{r,n-r+1} is the incomplete beta c.d.f. with parameters r = (n+1)/2 and n-r+1 = (n+1)/2. To evaluate the integral I in (5.1), it is useful to note that for c > 0 and 2d*/c an integer

(5.2)    I ≈ ½ Σ_{j=1}^{2d*/c} [ G(½+jc) + G(½+(j-1)c) ] [ G(½-d*+jc) - G(½-d*+(j-1)c) ];

this is obtained by considering non-overlapping squares of side length c whose diagonals form the line segment v = u+d* with ½-d* < u < ½+d*, and replacing the integral over the lower triangle of each square by ½ the integral over the square. If we replace the integral over the lower triangle in each square by the integral over the whole square we clearly obtain an upper bound for I; it follows that an upper bound B_1 to the absolute value of the error in (5.2) is given by

(5.3)    B_1 = ½ Σ_{j=1}^{2d*/c} [ G(½+jc) - G(½+(j-1)c) ] [ G(½-d*+jc) - G(½-d*+(j-1)c) ].

We choose c < d* so that 0 < Δ = (d*-c)/2 < ¼; then the point (½,½) is not included in any of these squares, and for each square the distance from the boundary to (½,½) is at least Δ√2. Hence each square has a distance from ½ of at least Δ in one of the two coordinates. Since there are 2d*/c squares it follows that

(5.4)    B_1 ≤ (2d*/c) [ G(½+c) - G(½+Δ) ] ≤ φ[ (d*+c)√(n+2) ] / (c√(n+2)).

We note that B_1 = cd* for n = 1, and it can be shown that B_1 decreases with n; the last expression in (5.4) shows that both B_1 and B_2 decrease exponentially fast with n. Hence for n ≤ 99 we can use Pearson's table [7] with c = .01 (or c = .02, etc., provided c < d* and 2d*/c is an integer) to solve (5.1) for n with high accuracy, by using (5.2) to evaluate the left side of (5.1) for different n values.
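The finite-n computation just outlined is easy to reproduce. The sketch below is our own: it evaluates the left side of (5.1), approximating the integral I by the square rule (5.2), and scans odd n. We use a finer grid than the c = .01 of the text, and the function names are assumptions.

```python
import math

def G(p, r, n):
    """Symmetric incomplete beta c.d.f. G_{r,n-r+1}(p), r = (n+1)/2, odd n,
    computed via the binomial-sum identity (2.2)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))

def lhs_51(n, d, m=40):
    """Left side of (5.1) with the integral I approximated as in (5.2),
    using 2d*/c = m squares of side c = 2d/m."""
    r, c = (n + 1) // 2, 2 * d / m
    I = 0.5 * sum(
        (G(0.5 + j * c, r, n) + G(0.5 + (j - 1) * c, r, n))
        * (G(0.5 - d + j * c, r, n) - G(0.5 - d + (j - 1) * c, r, n))
        for j in range(1, m + 1))
    return I + G(0.5 + 2 * d, r, n) * G(0.5 - d, r, n)

def smallest_odd_n(d, p_star, n_max=301):
    """Smallest odd n whose (5.1) left side reaches P* (k = 2, t = 1, eps* = d*)."""
    for n in range(1, n_max + 1, 2):
        if lhs_51(n, d) >= p_star:
            return n
    return None
```

This reproduces the kind of scan used to build Table 1, though the report's exact table entries are not asserted here.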

For n > 99 we need an asymptotic approximation, and we first note that, for k = 2, t = 1, α = ½ and ε* = d*, (3.3) reduces to

(5.5)    Φ(x/√2) = P*,

where x = 2d*√(n+2). However, this result gives an error of several percent for n between 200 and 400, and hence a better approximation is required which does not drop all terms approaching zero. Applying (3.1) to the right side of (2.9) and setting x = 2d*√(n+2), k = 2, t = 1, α = ½ and ε* = d*, we find that the equation to be solved for x is

(5.6)    P* = Φ(2x)[1-Φ(x)] + ∫_{-x}^{x} Φ(y+x) dΦ(y)
            = Φ(x/√2) - ∫_{x}^{∞} [Φ(y+x) - Φ(y-x)] dΦ(y) - [1-Φ(x)][1-Φ(2x)];

the last form shows that the root of (5.5) is a lower bound to the root of (5.6).

Using the function V(h,λh), defined by

(5.7)    V(h,λh) = ∫_0^h ∫_0^{λx} φ(y) φ(x) dy dx

and tabled in [2], we can write (5.6) exactly in the form

(5.8)    P* = ½ + ½[Φ(x/√2) - ½]² + V(x, 2x) + V(x/√2, 3x/√2) - [1-Φ(x)][1-Φ(2x)];

the details of this are easily worked out with the help of the introduction in [2] and are omitted here. Assuming that we have any approximate solution x_0 of (5.8) (say, the solution of (5.5)), we set x = x_0 + ε in (5.8), expand in powers of ε, drop powers of 2 and higher, and solve the resulting equation for ε; then x = x_0 + ε appears to give always a better approximation to the solution of (5.8); thus an iteration of Newton's approximation formula was used to solve (5.8). The last term on the right in (5.8) is very small and can be dropped; the resulting equation for x_{j+1} in terms of x_j is

(5.9)    x_{j+1} = x_j + { P* - ½ - ½[Φ(x_j/√2) - ½]² - V(x_j, 2x_j) - V(x_j/√2, 3x_j/√2) }
                        / { (1/√2) φ(x_j/√2) [Φ(x_j/√2) + Φ(3x_j/√2) - 1] + φ(x_j) [Φ(2x_j) - ½] }.

For large h and λ not close to one, the tables for V(h,λh) in [2] are inadequate and it was found necessary to use the formula

(5.10)    V(h,λh) = ¼[2Φ(λh) - 1] - (1/2π) arctan(1/λ),

which is given in the introduction to [2].

Another equation whose root yields a close upper bound to the root of (5.6), and hence also of (5.8) for large x, but requires less calculation than the iteration (5.9), is

(5.11)    Φ(x/√2) - φ(x)/(2x) + φ(x)/(x²√(2π)) = P*.

This equation is useful to obtain a good (i.e., close) starting value for (5.9); we omit the derivation of (5.11).

In (5.9) we have indicated that for j ≥ 1 the trial value for the (j+1)st iteration is the same as the result of the jth iteration, x_{j+1}, but a slight variation sometimes gives better results. After the jth iteration we know which x_i's (i ≤ j-1) are upper bounds; let y_{j-1} denote the smallest of these upper bounds. If we replace x_j in the right side of (5.9) by the average y_j = (x_j + y_{j-1})/2 then the resulting iteration may converge even faster than (5.9).

Formulation 1B

For k = 2, t = 1, α = ½ and e* = d*, using the right side of (2.13), the equation determining n for formulation 1B becomes

(5.12)  ∫_{½−d*}^{½+d*} G(u+d*) dG(u) + G(½−d*) + ½[G(½+2d*) − G(½+d*)]² = P*,

where G is the same as in (5.1). The only part of (5.12) that presents any difficulty is the integral, and it is the same integral as the one already considered for Formulation 1A above; hence all the results above on I can be used here also.

In the asymptotic case (i.e., for n > 99) equation (5.5) holds here also, but again it gives errors too large for the purpose of calculating tables. The equation corresponding to (5.6) is

(5.13)  P* = 1 − Φ(x) + ½[Φ(2x) − Φ(x)]² + ∫_{−x}^{x} Φ(y+x) dΦ(y),

and the root of (5.5) is a lower bound to the root of (5.13). Corresponding to (5.8) we can write (5.13) in the form

(5.14)  P* = ½ + ½[Φ(x/√2) − ½]² + V(x, 2x) + V(x/√2, 3x/√2) + ½[Φ(2x) − Φ(x)]².

As P* → 1 and/or d* = e* → 0, the difference of the n-values required by formulations 1A and 1B tends to zero, as is also indicated by the tables.


6. Acknowledgement

The author wishes to thank Professor M. Haseeb Rizvi of Ohio State

University and Mr. George Woodworth of the University of Minnesota for

several helpful discussions in connection with this paper. The author is

also indebted to Miss Marilyn Huyett of the Bell Telephone Laboratories

and Mr. A. P. Basu of the University of Minnesota for their help with the

tables in this paper.


ADDENDA

For formulation 2B the asymptotic (n → ∞, r/n → α) expression for the right side of (2.19) is equivalent to that in (3.7) [by virtue of (3.5) and (3.6)]. Hence (3.7) can also be used for formulation 2B. However, a better approximation, needed for table-making, is

(3.9)  P{CS|R} ≈ … + G_{t,k−t+1}( Φ( −e*√(n+1) / √(α(1−α)) ) ),

and for k = 2, t = 1 this reduces to

(3.10)  P{CS|R} ≈ Φ²( e*√(n+1) / √(α(1−α)) ) + Φ²( −e*√(n+1) / √(α(1−α)) ).

For formulations 1A and 1B we can also get an asymptotic expression for P{CS|R1} after making a variance-stabilizing substitution. We do not alter the procedure but merely try to obtain an expression in which the convergence to normality is faster. We assume that d* < min(e*, α) and that the F_{[r]} are all differentiable with positive derivatives in the interval I. Then for each j ≤ k−t, as we approach the LF configuration,

(3.11)  F_{[k−t+1]}( F⁻¹_{[j]}(z_j) ) → z_j − d*,

where z_j = F_{[j]}(y_j) → α; hence for all r (1 ≤ r ≤ k) we make the substitution

(3.12)  U_r = arcsin √( F_{[k−t+1]}( F⁻¹_{[r]}(z_r) ) )    (1 ≤ r ≤ k),

where z_r = F_{[r]}(y_{r,r}) has the beta distribution β(r, n−r+1). Then, using (3.11), we have asymptotically

(3.13)  Var U_r ≈ [4(n+2)]⁻¹,

(3.14)  E(U_j − U_i) ≈ arcsin √α − arcsin √(α−d*) = d′ > 0

for i ≤ k−t and j ≥ k−t+1. Since arcsin √x is an increasing function of x for 0 ≤ x ≤ 1,

(3.15)

and the P*-condition for determining n as d* → 0 becomes

(3.16)

This is consistent with (3.2) for very small values of d*; it was not used in the computations of Table 1, where d* = e*.


Table 1

Number n of Observations Required per Population under Procedure R1 for Selecting from k = 2 Populations the One with the Largest Population Median.

[Here d* = e* and the condition satisfied by Procedure R1 is P{Correct Selection|R1} ≥ P* whenever the distance d ≥ d*.]

        Formulation 1A       Formulation 1B       Formulation 2A       Formulation 2B
 P*    .05  .10  .15  .20   .05  .10  .15  .20   .05  .10  .15  .20   .05  .10  .15  .20
 .60    31    9    —    1    15    3    1    1    43   11    5    3    17    5    3    1
 .65    47   11    5    3    33    9    3    1    57   15    7    3    35    9    5    3
 .70    67   17    7    5    55   13    7    3    75   19    9    5    57   15    7    3
 .75    95   23   11    5    85   21    9    5    97   25   11    5    81   21    9    5
 .80   129   33   15    7   125   31   13    7   123   31   13    7   111   27   13    7
 .85   181   45   19   11   175   45   19   11   155   39   17    9   147   37   17    9
 .90   253   63   27   15   251   63   27   15   201   49   21   13   193   49   21   11
 .95   363   91   39   21   363   91   39   21   265   65   29   15   261   65   29   15
 .975  567  143   63   35   565  141   63   35   381   95   41   23   379   93   41   23
 .990  777  193   85   49   777  193   85   49   501  125   53   29   501  125   53   29
 .995 1087  271  127   69  1087  271  127   69   663  165   71   39   663  165   71   39

The entries for formulations 1A, 1B, 2A and 2B, respectively, are solutions of the equations (2.9), (2.13), (2.17) and (2.19) with k = 2, t = 1, α = ½ and e* = d*. For n > 150 the equations used were (5.9), (5.14), (3.8) and (3.10), respectively.

i

REFERENCES

[1] Bechhofer, R. E. (1954). A single-sample multiple-decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25, 16-39.

[2] Blanch, G., Fix, E., Germond, H. H. and Neyman, J. (1956). Tables of the bivariate normal distribution function and related functions. Applied Math. Series #50, Nat'l. Bur. of Standards, Washington, D.C.

[3] Gupta, S. S. (1963). Probability integrals of the multivariate normal and multivariate t. Ann. Math. Statist. 34, 792-828.

[4] Hajek, J. (1962). Asymptotically most powerful rank-order tests. Ann. Math. Statist. 33, 1124-1147.

[5] Lehmann, E. L. (1963). A class of selection procedures based on ranks. Math. Annalen 150, 268-275.

[6] Milton, Roy (1963). Tables of the equally correlated multivariate normal probability integral. Technical Report #27, Univ. of Minnesota, Minneapolis, Minn.

[7] Pearson, K. (1948). Tables of the incomplete beta function. The Biometrika Office, University College, London.

[8] Puri, M. L. (1964). Asymptotic efficiency of a class of c-sample tests. Ann. Math. Statist. 35, 102-121.

[9] Rizvi, M. H. (1963). Ranking and selection problems of normal populations using the absolute values of their means: fixed sample size case. Technical Report #31, Univ. of Minnesota, Minneapolis, Minn.


Selecting a Subset Containing the Population with the Largest α-Percentile

by M. Haseeb Rizvi and Milton Sobel
Ohio State Univ. and Univ. of Minnesota

1. Introduction

A nonparametric solution is considered for the problem of selecting a subset of k populations which contains the one with the largest α-th percentile. It is assumed that each population has associated with it a continuous c.d.f. F_i(x) (i = 1,2,...,k) and that observations from the same or different populations are all independent. We use the notation of the companion paper [7] without redefining all the symbols and, in particular, it is convenient to identify each population with its c.d.f.

2. Formulation of the Problem

A common fixed (i.e., given) number of observations is taken from each of the k populations. A constant P* > 1/k is preassigned and we are required to find a procedure R1 such that

(2.1)  P{CS | R1} ≥ P*

for all possible k-tuples (F_1, F_2, ..., F_k) for which

(2.2)  min_{1≤j≤k} F_{[j]}(y) ≥ F_{[k]}(y)  for all y.

We denote this set of k-tuples (F_1, F_2, ..., F_k) by Ω_1.

To avoid being "too far out" in the tails we assume that the given common sample size n is large enough so that

(2.3)  1 ≤ (n+1)α ≤ n,

and we define a positive integer r by the inequalities

(2.4)  r ≤ (n+1)α < r+1.

It follows that 1 ≤ r ≤ n.
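In code, (2.4) simply says r = ⌊(n+1)α⌋; a minimal sketch (the function name rank_r is ours, not the paper's):

```python
import math

def rank_r(n, alpha):
    """The r of (2.4): the unique integer with r <= (n+1)*alpha < r+1."""
    r = math.floor((n + 1) * alpha)
    assert 1 <= r <= n, "requires 1 <= (n+1)*alpha <= n, i.e. condition (2.3)"
    return r

print(rank_r(15, 0.5), rank_r(99, 0.25))   # 8 25
```

For α = ½ and odd n this gives r = (n+1)/2, the sample median rank used throughout the tables.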


We now define a procedure R1 = R1(c) in terms of a positive integer c (1 ≤ c ≤ r−1) and the order statistics Y_{j,i}, where Y_{j,i} denotes the j-th order statistic from the population F_i(y) (j = 1,2,...,n; i = 1,2,...,k).

Procedure R1: Put F_i in the selected subset if and only if

(2.5)  Y_{r,i} ≥ max_{1≤j≤k} Y_{r−c,j},

where c is the smallest integer with 1 ≤ c ≤ r−1 for which (2.1) is satisfied.

We shall show that for any given α and k a value of c ≤ r−1 may not exist for some pairs (n,P*), but that if P* is chosen not greater than some function P_1 = P_1(n,α,k), where 1/k < P_1 < 1, then a value of c ≤ r−1 does exist that satisfies (2.1). The function P_1 will be derived and evaluated; in particular we shall be interested to see, for fixed k, how rapidly it approaches 1 as n increases.

[If we allow c = r and define Y_{0,j} ≡ −∞, then any value of P* ≤ 1 can be obtained by putting all the populations in the selected subset. Thus P_1 represents the largest P* for which the (non-randomized) problem is nondegenerate. If P_1 < P* < 1, randomization between the rule (2.5) with c = r−1 and the degenerate rule (c = r) is desirable, but we shall not consider any such rules in the discussion below.]

3. Probability of a Correct Selection for R1

Using the notation of sections 4 and 5 of [7], the probability of a correct selection P{CS|R1} is easily seen to be

(3.1)  P{CS|R1} = ∫_{−∞}^{∞} ∏_{i=1}^{k−1} H_{r−c,i}(y) dH_{r,k}(y),

where H_{r,i}(y) denotes the c.d.f. of Y_{r,i}.
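The probability (3.1) can also be estimated by direct simulation of procedure R1; a sketch, with an illustrative configuration (two normal populations with medians 0 and 1) and the illustrative choice c = 5 at n = 15, which corresponds to the entry r−c = 3 in Table 2 at P* = .95:

```python
import random

def select_subset(samples, r, c):
    """Procedure R1: keep population i iff Y_{r,i} >= max_j Y_{r-c,j}
    (order statistics are 1-indexed, as in the paper)."""
    ordered = [sorted(s) for s in samples]
    threshold = max(s[r - c - 1] for s in ordered)   # max of the (r-c)-th order statistics
    return [i for i, s in enumerate(ordered) if s[r - 1] >= threshold]

random.seed(1)
n, r, c = 15, 8, 5          # alpha = 1/2, so r = (n+1)/2
reps, correct = 2000, 0
for _ in range(reps):
    pop0 = [random.gauss(0.0, 1.0) for _ in range(n)]   # smaller median
    pop1 = [random.gauss(1.0, 1.0) for _ in range(n)]   # larger median: the "best" population
    if 1 in select_subset([pop0, pop1], r, c):
        correct += 1
print(correct / reps)        # estimate of P{CS|R1}; should be at least about .975 here
```

Since the shifted normal pair satisfies (2.2), the estimated frequency should not fall below the infimum guaranteed by the choice of c.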


To find the infimum of (3.1) in Ω_1, we note that H_{r−c,i}(y) depends on y only through F_{[i]}(y) and that it is an increasing function of F_{[i]}(y). Hence we obtain the infimum by setting F_{[i]}(y) = F_{[k]}(y) for i = 1,2,...,k−1. Making the transformation u = F_{[k]}(y) then gives

(3.3)  inf_{Ω_1} P{CS|R1} = ∫₀¹ G_{r−c}^{k−1}(u) dG_r(u),

where G_r(u) = I_u(r, n−r+1) is the standard incomplete beta function. When c = 0 the last member of (3.3) is clearly 1/k, and we wish to show that G_{r−c}(u) is an increasing function of c (c = 1,2,...,r−1) for fixed r and u, or that G_r(u) is decreasing in r for fixed u. Integrating G_r(u) by parts gives

(3.4)  G_r(u) − G_{r+1}(u) = C(n,r) u^r (1−u)^{n−r} ≥ 0,

which shows the required monotonicity. Hence the maximum value of P* for which a value of c satisfying (2.1) exists is obtained by setting c = r−1 in (3.3), and this easily reduces to

(3.5)

Hence for any P* < P_1(n,α,k) a value of c ≤ r−1 exists and it is unique, so that the procedure R1 is well defined for P* < P_1. A short table of P_1-values for α = ½, so that r = (n+1)/2, and k = 2 is given at the end of this paper. It can be shown that P_1 → 1 exponentially fast as n → ∞. In fact, we can write for large n (with r/n → α)

(3.6)  P_1 ≈ 1 − (k−1) C₀ [ (1+α′) {(1+α′)/α′}^{α′} / 4 ]^n,

where α′ = 1−α and C₀ does not depend on k or n; a detailed derivation of (3.6) is omitted. Since 1+α′ and [(1+α′)/α′]^{α′} are both increasing in α′ (0 < α′ < 1), their product is also, and hence the maximum value of the quantity in square brackets in (3.6) is one; this shows that P_1 → 1 exponentially fast.
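For k = 2 and α = ½ the value of P_1(n) has the closed form quoted with Table 1 at the end of this paper, P_1(n) = 1 − C(n,r)/C(2n,r) with r = (n+1)/2; a short computation reproducing the tabled values:

```python
from math import comb

def P1(n):
    """Largest usable P* for k = 2, alpha = 1/2 (c = r-1 in (3.3)); r = (n+1)/2, n odd."""
    r = (n + 1) // 2
    return 1.0 - comb(n, r) / comb(2 * n, r)

for n in (1, 3, 5, 7, 9, 11, 13, 15):
    print(n, round(P1(n), 6))   # matches Table 1: .500000, .800000, .916667, ...
```

The geometric decay of C(n,r)/C(2n,r) is the exponential approach to 1 asserted in (3.6).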

Of course, the same phenomenon described above can be interpreted from another point of view, namely, that for a given P* > 1/k there is a minimum common number n of observations required to satisfy (2.1). Table 1 also gives the required common value of n for α = ½ and k = 2 if we take P_1 to be the specified value of P*.

For small values of k and any α satisfying (2.3), the integral in (3.3) can be written more explicitly and, in particular, for k = 2 we obtain after algebraic simplification the three equivalent expressions

(3.7)  inf_{Ω_1} P{CS|R1, k = 2}
        = (1/C(2n,n)) Σ_{i=r−c}^{n} C(r−1+i, r−1) C(2n−r−i, n−r)
        = ½ + (1/C(2n,n)) Σ_{i=r−c}^{r−1} C(r−1+i, r−1) C(2n−r−i, n−r)
        = 1 − (1/C(2n,n)) Σ_{i=0}^{r−c−1} C(r−1+i, r−1) C(2n−r−i, n−r).

The last expression is more useful for computing if r is small and the middle one is more useful if c is small. A table of r−c values satisfying (2.1) for α = ½, so that r = (n+1)/2, and selected values of n and P* is given at the end of this paper.
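The three expressions in (3.7) are easy to compare numerically; the sketch below implements them with exact integer binomials (the helper names are ours):

```python
from math import comb

def term(n, r, i):
    return comb(r - 1 + i, r - 1) * comb(2 * n - r - i, n - r)

def pcs_k2(n, r, c):
    """inf P{CS|R1, k = 2} via the last expression in (3.7)."""
    return 1.0 - sum(term(n, r, i) for i in range(r - c)) / comb(2 * n, n)

n, r, c = 15, 8, 5
e1 = sum(term(n, r, i) for i in range(r - c, n + 1)) / comb(2 * n, n)
e2 = 0.5 + sum(term(n, r, i) for i in range(r - c, r)) / comb(2 * n, n)
e3 = pcs_k2(n, r, c)
print(e1, e2, e3)   # all three agree
```

For n = 15, r = 8 the value exceeds .95 at c = 5 but not at c = 4, which is how the entry r−c = 3 under P* = .95 in Table 2 arises.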

4. Expected Size of the Selected Subset

The size S of the subset selected by procedure R1 is, of course, a random variable, and its expected value may be taken as a measure of the efficiency of R1, with small values corresponding to high efficiency. In this section we derive an expression for E{S|R1} and show that the maximum value in Ω_1 is attained when the c.d.f.'s are all identical.

It is easily seen that for any c (1 ≤ c ≤ r−1)


(4.1)  E{S|R1} = Σ_{i=1}^{k} P{F_{[i]} is included in the subset | R1}
               = Σ_{i=1}^{k} ∫_{−∞}^{∞} ∏_{j=1, j≠i}^{k} H_{r−c,j}(y) dH_{r,i}(y).

For the special configuration

(4.2)  F_{[1]}(y) = F_{[2]}(y) = ··· = F_{[k−1]}(y)

this reduces to

(4.3)  E{S|R1} = ∫_{−∞}^{∞} H_{r−c,1}^{k−1}(y) dH_{r,k}(y) + (k−1) ∫_{−∞}^{∞} H_{r−c,k}(y) H_{r−c,1}^{k−2}(y) dH_{r,1}(y),

and, if we set all of the F_{[i]}'s equal, we obtain

(4.4)  E{S|R1} = k ∫_{−∞}^{∞} H_{r−c,k}^{k−1}(y) dH_{r,k}(y);

for c equal to the integer c-value satisfying (2.1), the right side of (4.4) is approximately kP*.

To prove that E{S|R1} is a maximum when F_{[1]}(y) = F_{[2]}(y) = ··· = F_{[k]}(y) for all y, we first prove the

Lemma: For any two c.d.f.'s F_1(y) and F_2(y) such that F_1(y) = F_{[1]}(y) ≥ F_{[2]}(y) = F_2(y) for all y, and for any integer c with 1 ≤ c ≤ r−1,

(4.5)  Q = H_{r−c,1}(y) H_{r,2}(y) − H_{r,1}(y) H_{r−c,2}(y)

is nonincreasing in F_{[1]}(y) for any fixed F_{[2]}(y).

Proof: Differentiating Q with respect to F_1(y), it is sufficient to show that

(4.6)  Q_1 = H_{r,2}(y) dH_{r−c,1}(y) − H_{r−c,2}(y) dH_{r,1}(y) ≤ 0.


We show (4.6) by letting F_2(y) → 0 for all y, so that Q_1 = 0, and showing that 0 is the maximum value of Q_1 for F_1(y) ≥ F_2(y). For this purpose we differentiate Q_1 with respect to F_2(y) and obtain (neglecting common factors)

(4.7)  Q_{12} = F_1^{r−c−1}(y)[1−F_1(y)]^{n−r+c} F_2^{r−1}(y)[1−F_2(y)]^{n−r} − F_1^{r−1}(y)[1−F_1(y)]^{n−r} F_2^{r−c−1}(y)[1−F_2(y)]^{n−r+c}
            = F_1^{r−c−1}(y)[1−F_1(y)]^{n−r} F_2^{r−c−1}(y)[1−F_2(y)]^{n−r} { [1−F_1(y)]^c F_2^c(y) − F_1^c(y)[1−F_2(y)]^c }.

It follows from (4.7) that Q_{12} ≤ 0 whenever F_2(y) ≤ F_1(y) for all y, and this proves the lemma.

Theorem: E{S} is a maximum in Ω_1 when

(4.8)  F_{[1]}(y) = F_{[2]}(y) = ··· = F_{[k]}(y)  for all y.

Proof: Consider the pair F_{[i]} and F_{[k]} for any particular i ≠ k; we wish to show that if F_{[i]}(y) decreases on any subset of y values without violating (4.8), then E{S} is not decreased. Integrating by parts in the general expression (4.1) for E{S} gives

(4.9)  E{S} = Σ_{j=1}^{k−1} ∫_{−∞}^{∞} ∏_{α=1, α≠j}^{k} H_{r−c,α}(y) dH_{r,j}(y) + ∫_{−∞}^{∞} ∏_{α=1}^{k−1} H_{r−c,α}(y) dH_{r,k}(y)
            = 1 − Σ_{j=1}^{k−1} ∫_{−∞}^{∞} ∏_{α=1, α≠j}^{k−1} H_{r−c,α}(y) [ H_{r,k}(y) dH_{r−c,j}(y) − H_{r−c,k}(y) dH_{r,j}(y) ].

For fixed F_{[k]}(y) we can disregard all terms except j = i, since in the j-th term (j ≠ i), if F_{[i]}(y) decreases then H_{r−c,i}(y) decreases and hence E{S} increases. Hence we need only show that

(4.10)  ∫_{−∞}^{∞} ∏_{α=1, α≠i}^{k−1} H_{r−c,α}(y) [ H_{r,k}(y) dH_{r−c,i}(y) − H_{r−c,k}(y) dH_{r,i}(y) ]

is a nondecreasing function of F_{[i]}(y) for fixed F_{[k]}(y) in Ω_1. Integrating by parts again we find that it is sufficient to show that both of the functions


(4.11)  H_{r−c,i}(y) H_{r,k}(y) − H_{r,i}(y) H_{r−c,k}(y)

and

(4.12)  H_{r−c,i}(y) dH_{r,k}(y) − H_{r,i}(y) dH_{r−c,k}(y)

are nonincreasing functions of F_{[i]}(y) for fixed F_{[k]}(y) in Ω_1. This was shown for (4.11) in the lemma; to show it for (4.12) we differentiate (4.12) with respect to F_{[i]}(y), and the proof is exactly the same as in (4.7). This completes the proof of the theorem.
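Equation (4.4) can be checked numerically: with all c.d.f.'s equal, E{S|R1} = k ∫₀¹ G_{r−c}^{k−1}(u) dG_r(u), and the integral is the same one that gives inf P{CS|R1} in (3.3). A sketch using a midpoint rule:

```python
from math import comb

def G(r, n, u):
    """G_r(u) = I_u(r, n-r+1): c.d.f. of the r-th order statistic of n uniforms."""
    return sum(comb(n, j) * u**j * (1.0 - u)**(n - j) for j in range(r, n + 1))

def g(r, n, u):
    """Beta(r, n-r+1) density, the derivative of G_r."""
    return r * comb(n, r) * u**(r - 1) * (1.0 - u)**(n - r)

def E_S_equal(k, n, r, c, m=2000):
    """(4.4): E{S|R1} = k * integral_0^1 G_{r-c}(u)^(k-1) dG_r(u) (midpoint rule)."""
    h = 1.0 / m
    return k * sum(G(r - c, n, (i + 0.5) * h)**(k - 1) * g(r, n, (i + 0.5) * h) * h
                   for i in range(m))

print(E_S_equal(2, 15, 8, 5))   # about k * P{CS} = 2 * 0.9749
```

This is the sense in which the right side of (4.4) is "approximately kP*" once c is chosen to satisfy (2.1).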

5. Asymptotic Results and Asymptotic Relative Efficiency (ARE)

Asymptotic (n → ∞) expressions for inf_{Ω_1} P{CS|R1} and E{S|R1} will be obtained for fixed P*. A definition is given for the asymptotic relative efficiency ARE(R′,R″;A) of a procedure R′ relative to another procedure R″ under the alternatives A. Our procedure R1 for α = ½ is then compared to others, e.g., the one based on the sample means, under general alternatives including the normal case.

For n → ∞ with r/n → α, let β denote the value of c/(n+1) for large n and fixed k, P*; we use the fact that

(5.1)

where Φ(·) is the standard normal c.d.f. Then by (3.3)

(5.2)  inf_{Ω_1} P{CS|R1} ≈ ∫_{−∞}^{∞} Φ^{k−1}( (y√(α(1−α)) + β√(n+2)) / √((α−β)(1−α+β)) ) dΦ(y).

Since this remains equal to P* < 1 as n → ∞, it follows that β → 0. Hence if S = S(k,P*) is the root of

(5.3)  ∫_{−∞}^{∞} Φ^{k−1}(y+S) dΦ(y) = P*,

then the rate at which β → 0 (and c → ∞) is given by

(5.4)  β√n → S√(α(1−α))  as n → ∞.
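The root S = S(k,P*) of (5.3) can be obtained by bisection with numerical integration; for k = 2 the equation reduces to Φ(S/√2) = P*, which serves as a check. A sketch:

```python
from math import erf, sqrt, exp, pi

def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def phi(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def lhs(S, k, m=4000, lo=-8.0, hi=8.0):
    """integral of Phi(y+S)^(k-1) dPhi(y), by the midpoint rule."""
    h = (hi - lo) / m
    return sum(Phi(lo + (i + 0.5) * h + S) ** (k - 1) * phi(lo + (i + 0.5) * h) * h
               for i in range(m))

def solve_S(k, pstar, tol=1e-9):
    """Root S of (5.3), by bisection (lhs is increasing in S)."""
    a, b = 0.0, 10.0
    while b - a > tol:
        mid = 0.5 * (a + b)
        if lhs(mid, k) < pstar:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

print(solve_S(2, 0.95))   # for k = 2 this is sqrt(2)*Phi^{-1}(0.95), about 2.326
```

For α = ½, (5.4) says c√n/(n+1) → S/2; the limiting row of Table 2 at the end of this paper (e.g. 1.163087 at P* = .95) is exactly S/2.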


The value of S (or H = S/√2) has been extensively tabulated by Bechhofer [2], Gupta [4] and Milton [6]. Treating (5.4) as an equality and solving for β, we can then use c = (n+1)β to make the procedure R1 explicit, and (2.1) will then be satisfied for large n.

For E{S|R1} we obtain from (4.4) the maximum value kP*; for other asymptotic results we assume a particular class of alternatives, e.g., (NA) denotes that F_i is normal with mean θ_i and variance 1 (i = 1,2,...,k). As n → ∞ with r/n → α, the distribution H_{r,i}(y) of the r-th order statistic from F_{[i]} is asymptotically normal with mean θ_{[i]} + ξ_α and variance α(1−α)/(n f²_{[i]}(ξ_α)), where the density f_{[i]}(·) is the derivative of F_{[i]}(x) and ξ_α = ξ_α(F_{[i]}) is the α-percentile of F_{[i]}; i.e.,

(5.5)  H_{r,i}(y) → Φ( (y − θ_{[i]} − ξ_α) √n f_{[i]}(ξ_α) / √(α(1−α)) ).

We note that for any f(t) that is continuous at ξ_α we have

(5.6)  α − β = ∫_{−∞}^{ξ_{α−β}} f(t) dt = α − ∫_{ξ_{α−β}}^{ξ_α} f(t) dt ≈ α − (ξ_α − ξ_{α−β}) f(ξ_α),

since β → 0 as n → ∞; hence we can write ξ_{α−β} ≈ ξ_α − β/f(ξ_α). For the particular configuration with θ_{[k]} = 0 and all the others equal to −θ (where θ > 0), we obtain from (4.3), (5.4), (5.5) and (5.6)

(5.7)  E{S|R1} ≈ ∫_{−∞}^{∞} Φ^{k−1}(y+S+θ_n) dΦ(y) + (k−1) ∫_{−∞}^{∞} Φ(y+S−θ_n) Φ^{k−2}(y+S) dΦ(y),

where θ_n = θ√n f(ξ_α)/√(α(1−α)) and S is defined by (5.3). It is easy to verify that E{S|R1} → 1; our present aim is to obtain an asymptotic expression for the smallest value of n required to satisfy

(5.8)  E{S|R1} ≤ 1 + ε

for positive values of ε near zero. From (5.7), using the fact that (1−x)^k ≈ 1−kx for x small and the Feller-Laplace normal approximation, we have


(5.9)  E{S|R1} − 1 ≈ (k−1) ∫_{−∞}^{∞} Φ(y+S−θ_n) Φ^{k−2}(y+S) dΦ(y) − (k−1) Φ( (−S−θ_n)/√2 ) ≈ ½(k−1) Φ( (S−θ_n)/√2 ).

If we set this equal to ε > 0, then the asymptotic value n_1(ε) of n required by procedure R1 is given by

(5.10)  n_1(ε) ≈ α(1−α) (S − √2 λ_{ε′})² / ( θ² f²(ξ_α) ),

where ε′ = 2ε/(k−1) and λ_{ε′} = Φ⁻¹(ε′) is the ε′-percentile of Φ.

We now apply the same analysis to the procedure R2 based on sample means (see Gupta [3] for a detailed development and an up-to-date set of references). We assume that the F_i have a finite, known and (for simplicity) common variance σ². To ensure that procedure R2 is accomplishing the same goal as R1, we also assume that the asymptotic normal distribution of the mean X̄ from the population with c.d.f. F_{[i]} has mean θ_{[i]}, where θ_{[1]} ≤ θ_{[2]} ≤ ··· ≤ θ_{[k]}; this certainly holds if we are dealing with a location parameter.

The procedure R2 is to put the i-th population Π_i in the selected subset iff

(5.11)  X̄_i ≥ max_{1≤j≤k} X̄_j − d,

where d > 0 is chosen to satisfy the P*-condition (2.1). Here, for specified P*, we find that d = Sσ/√n, where S = S(k,P*) is defined by (5.3).


For the same configuration with θ_{[k]} = 0 and all the others equal to −θ (where θ > 0), we obtain, corresponding to (5.7) and (5.9), letting θ′_n = θ√n/σ,

(5.12)  E{S|R2} = ∫_{−∞}^{∞} Φ^{k−1}(y+S+θ′_n) dΦ(y) + (k−1) ∫_{−∞}^{∞} Φ(y+S−θ′_n) Φ^{k−2}(y+S) dΦ(y),

(5.13)  E{S|R2} − 1 ≈ ½(k−1) Φ( (S−θ′_n)/√2 ).

Setting this equal to ε > 0, we find that the value n_2(ε) of n required by procedure R2 is

(5.14)  n_2(ε) ≈ σ² (S − √2 λ_{ε′})² / θ².

We define the asymptotic relative efficiency ARE(R′,R″;A) of R′ relative to R″ for the alternatives A to be the limit, as ε → 0, of the ratio of n″(ε) to n′(ε); hence

(5.15)  ARE(R1,R2;A) = lim_{ε→0} n_2(ε)/n_1(ε) = σ² f²(ξ_α) / ( α(1−α) ),

which is independent of ε for any alternative.
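Formula (5.15) is immediate to evaluate; the sketch below computes it for the normal case (σ = 1, f(0) = 1/√(2π)) and for the two-sided exponential case (f(0) = ½, σ² = 2):

```python
from math import pi, sqrt

def are_R1_R2(sigma2, f_xi, alpha):
    """(5.15): ARE(R1, R2; A) = sigma^2 * f(xi_alpha)^2 / (alpha*(1-alpha))."""
    return sigma2 * f_xi**2 / (alpha * (1.0 - alpha))

print(are_R1_R2(1.0, 1.0 / sqrt(2.0 * pi), 0.5))   # normal, sigma = 1: 2/pi
print(are_R1_R2(2.0, 0.5, 0.5))                    # two-sided exponential: 2.0
```

The normal value 2/π is the classical efficiency of the median relative to the mean, as one would expect for α = ½.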

For α = ½ and the normal shift alternatives (NA) with σ = 1 we obtain

(5.16)  ARE(R1,R2;NA) = 2/π.

For α = ½ and the two-sided exponential shift alternatives (TEA) with median zero and f(0) = ½ we obtain

(5.17)  ARE(R1,R2;TEA) = 2;

thus R1 is asymptotically twice as efficient as the procedure R2 under TEA.

We can also compare the procedure R1 with both non-randomized and randomized rank sum (or score) procedures, which have been applied to the problem of selecting a subset containing the best population by Bartlett [1]. The basic procedures are defined in Lehmann [5] in terms of a function J(x), defined


for 0 ≤ x ≤ 1, whose inverse J^{(−1)}(x) = H(x) is a c.d.f. For some fixed function J(x) we can define the score for the i-th population by

(5.18)  S_{N,i} = (1/n) Σ_{j=1}^{n} J( R_{i,j}/(N+1) )    (i = 1,2,...,k),

where R_{i,j} denotes the rank of the j-th observation X_{i,j} from the i-th population Π_i in the combined sample of N observations; we are assuming that there is a common number n from each population, so that N = kn. Then the non-randomized procedure R′ = R′(J) based on these scores is to put Π_i in the selected subset iff

(5.19)  S_{N,i} ≥ max_{1≤j≤k} S_{N,j} − c′,

where c′ is a positive number chosen to satisfy the P*-condition (2.1).
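The scores (5.18) are straightforward to compute; a sketch (ties are ignored, since the c.d.f.'s are continuous; the argument R_{i,j}/(N+1) for J is the usual convention and the exact argument used in the original is an assumption here):

```python
def scores(samples, J):
    """(5.18): S_{N,i} = (1/n) * sum_j J(R_{ij}/(N+1)), ranks over the combined sample."""
    pooled = sorted((x, i) for i, s in enumerate(samples) for x in s)
    N = len(pooled)
    n = len(samples[0])
    out = [0.0] * len(samples)
    for rank, (_, i) in enumerate(pooled, start=1):   # rank = position in combined sample
        out[i] += J(rank / (N + 1)) / n
    return out

# Wilcoxon-type scores, J(u) = u (so H = U, the standard uniform):
s = scores([[0.1, 0.4, 2.0], [1.0, 3.0, 5.0]], lambda u: u)
print(s)   # the second population has the larger observations, hence the larger score
```

With J(u) = u the rule (5.19) is a rank-sum (Wilcoxon) subset-selection rule; other choices of J (e.g. normal scores) are handled identically.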

The randomized procedure R″ is similarly defined in terms of randomized scores but, since the asymptotic properties of R″ are equivalent to those of R′, we need not define R″ here. The c.d.f.'s F_i(x) are now assumed to be of the form F(x−θ_i), so that we are dealing with the selection problem for a location parameter. We shall use the result from [1] that, for asymptotically small d* (our θ), the common (large) sample size n′ that would be needed for R′ (and also for R″) to satisfy both (2.1) and (5.8) is

(5.20)  n′ ≈ [ A d / ( θ ∫_{−∞}^{∞} (d/dx){ J(F(x)) } dF(x) ) ]²,

where (this) d in (5.20) is the root in θ_n of the equation obtained by setting the right side of (5.7) equal to 1 + ε, and where

(5.21)  A² = ∫₀¹ J²(u) du − ( ∫₀¹ J(u) du )².

It should be noted that n′ in (5.20) represents the value of n that would be needed if one knew that the true configuration were given by θ_{[k]} − θ_{[i]} = θ (i = 1,2,...,k−1) for some fixed θ not depending on i. The limiting process here is to keep ε > 0 fixed and let θ_{[i]} → d√(α(1−α)) / (√n f(ξ_α)), where d is the root of the equation obtained by setting the right side of (5.7) equal to 1 + ε. This is a different limiting process than the one we have defined, and hence we write ≈ above and shall write ARE_B for any ratio based on such a limit; in general, these limits will be functions of ε.

If we carry out this same limiting process for our procedure R1, then using (5.7) we obtain

(5.22)  n ≈ α(1−α) d² / ( θ² f²(ξ_α) ),

where d is the same as in (5.20). Hence, for translation alternatives (TA), we have the result

(5.23)  ARE_B(R1, R′(J); TA) = A² f²(ξ_α) / ( α(1−α) [ ∫_{−∞}^{∞} (d/dx){ J(F(x)) } dF(x) ]² ),

which is independent of ε.

If we take J^{(−1)} = H = Φ and consider normal alternatives (NA), then (5.23) with α = ½ gives the same result as in (5.16), viz.

(5.24)  ARE_B(R1, R′(Φ); NA) = 2/π.

For the same H, if we consider the TEA, then (5.23) with α = ½ gives, after using symmetry and straightforward methods of integration,

(5.25)  ARE_B(R1, R′(Φ); TEA) = (1/2)² / [ (1/4)(√(2/π))² ] = π/2.

If we take J^{(−1)} = H = U, the standard uniform distribution, and consider logistic alternatives (LA), then (5.23) with α = ½ gives

(5.26)  ARE_B(R1, R′(U); LA) = (1/12)(1/4a₁)² / [ (1/4)(1/6a₁)² ] = 3/4;

for the logistic alternatives it is known that this "score" statistic with


J^{(−1)} = H = U reduces to the Wilcoxon statistic and gives the locally most powerful test. For the same H, if we consider the (TEA), then (5.23) with α = ½ gives

ARE_B(R1, R′(U); TEA) = (1/12)(1/2)² / [ (1/4)(1/4)² ] = 4/3.

We note with curiosity (but without explanation) that the last two numbers are reciprocals and the two numbers before that are also reciprocals.

It can be shown (we omit the proof) that the ARE_B in both (5.15) and (5.23), considered as a function of α for 0 ≤ α ≤ 1, is a maximum at α = ½ for all three of the alternatives considered above, i.e., for NA, LA and TEA. A sufficient condition for this is that f(x) be symmetric about x = 0 and that f²(x)/[F(x)(1−F(x))] be decreasing for x ≥ 0; these properties hold in each of the three different alternatives considered above.
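The ratios 3/4 and 4/3 obtained from (5.23) with J(u) = u (H = U) can be recovered numerically; for this J the quantity I = ∫ (d/dx){J(F(x))} dF(x) reduces to ∫ f²(x) dx. A sketch:

```python
from math import exp

def mid(f, a, b, m=20000):
    """Midpoint-rule quadrature of f on [a, b]."""
    h = (b - a) / m
    return sum(f(a + (i + 0.5) * h) for i in range(m)) * h

A2 = 1.0 / 12.0   # A^2 from (5.21) with J(u) = u

# logistic density, scale 1: f(x) = e^{-x}/(1+e^{-x})^2, f(0) = 1/4, I = 1/6
f_log = lambda x: exp(-x) / (1.0 + exp(-x)) ** 2
I_log = mid(lambda x: f_log(x) ** 2, -40.0, 40.0)
are_la = A2 * f_log(0.0) ** 2 / (0.25 * I_log ** 2)

# two-sided exponential: f(x) = e^{-|x|}/2, f(0) = 1/2, I = 1/4
f_tea = lambda x: 0.5 * exp(-abs(x))
I_tea = mid(lambda x: f_tea(x) ** 2, -40.0, 40.0)
are_tea = A2 * f_tea(0.0) ** 2 / (0.25 * I_tea ** 2)

print(are_la, are_tea)   # 3/4 and 4/3, as in the text
```

The scale parameter of the logistic density cancels in the ratio, which is why (5.26) is free of a₁.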

6. Dual Problem

Consider the dual problem of selecting a subset containing the population with the smallest α-percentile. For specified P* > 1/k we now require a procedure R1′ that satisfies

(6.1)  P{CS | R1′} ≥ P*

for all possible k-tuples (F_1, F_2, ..., F_k) for which

(6.2)  max_{1≤j≤k} F_{[j]}(y) ≤ F_{[1]}(y)  for all y;

denote the set of all such k-tuples by Ω_1′. We again assume (2.3) and define r by (2.4). The procedure dual to procedure R1 of section 2 is

Procedure R1′: Put F_i in the selected subset if and only if

(6.3)  Y_{r,i} ≤ min_{1≤j≤k} Y_{r+c,j},

where c is the smallest integer with 1 ≤ c ≤ n−r for which (6.1) is satisfied.


It will be convenient to use also the notation H_r(F_{[i]}(y)) for H_{r,i}(y). For this problem equation (3.1) is replaced by

(6.4)  P{CS|R1′} = ∫_{−∞}^{∞} ∏_{j=2}^{k} {1 − H_{r+c,j}(y)} dH_{r,1}(y)
                = ∫_{−∞}^{∞} ∏_{j=2}^{k} H_{n−r+1−c}(1 − F_{[j]}(y)) dH_{n−r+1}(1 − F_{[1]}(y)).

If the F_i(y) are all symmetric about y = 0, then for any k, by making the transformation x = −y, we obtain

(6.5)  P{CS|R1′} = ∫_{−∞}^{∞} ∏_{j=2}^{k} H_{n−r+1−c}(F_{[j]}(x)) dH_{n−r+1}(F_{[1]}(x)),

and we note that the problem is equivalent to selecting a subset containing the largest (1−α)-percentile (α ≠ ½).

Equation (3.3) is replaced by

(6.6)  inf_{Ω_1′} P{CS|R1′} = ∫₀¹ [1 − G_{r+c}(u)]^{k−1} dG_r(u) = ∫₀¹ G_{n−r+1−c}^{k−1}(u) dG_{n−r+1}(u).

If α = ½ then n−r+1 = r, and hence for any k we note that (6.6) reduces to (3.3). Hence for α = ½ the tables for determining c are the same for both problems; this is not true in general for α ≠ ½. In particular, the tables we have computed for k = 2, α = ½ can also be used for the problem of selecting a subset containing the population with the smallest median.

Other discussions for this problem are similar to those of the dual problem and are omitted.
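For α = ½ the reduction of (6.6) to (3.3) can be verified numerically; a sketch comparing the two integrals at n = 15, r = 8, c = 5 (an illustrative choice):

```python
from math import comb

def G(r, n, u):
    """G_r(u) = I_u(r, n-r+1)."""
    return sum(comb(n, j) * u**j * (1.0 - u)**(n - j) for j in range(r, n + 1))

def g(r, n, u):
    """Beta(r, n-r+1) density, the derivative of G_r."""
    return r * comb(n, r) * u**(r - 1) * (1.0 - u)**(n - r)

def quad(f, m=4000):
    h = 1.0 / m
    return sum(f((i + 0.5) * h) for i in range(m)) * h

n, r, c, k = 15, 8, 5, 2                                               # alpha = 1/2: n-r+1 = r
orig = quad(lambda u: G(r - c, n, u) ** (k - 1) * g(r, n, u))          # (3.3)
dual = quad(lambda u: (1.0 - G(r + c, n, u)) ** (k - 1) * g(r, n, u))  # (6.6)
print(orig, dual)   # equal when alpha = 1/2
```

The equality follows from 1 − G_{r+c}(u) = G_{n−r−c+1}(1−u) together with the symmetry of the Beta(r, n−r+1) density when n−r+1 = r.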

7. Property of Monotonicity or Unbiasedness

Let P_i denote the probability that R1 retains F_i in the selected subset. We wish to prove the

Theorem: For any two c.d.f.'s F_i(·) and F_j(·) such that

(7.1)  F_i(y) ≥ F_j(y)  for all y,

we have P_i ≤ P_j.

Proof: Letting M denote the set {m : m = 1,2,...,k; m ≠ i, m ≠ j}, we use the fact that H_{r,i}(y) is an increasing function of F_{[i]}(y) and then integrate by parts to obtain

(7.2)  P_i − P_j = ∫_{−∞}^{∞} ∏_{m∈M} H_{r−c,m}(y) { H_{r−c,j}(y) dH_{r,i}(y) − H_{r−c,i}(y) dH_{r,j}(y) }
       + Σ_{m′∈M} ∫_{−∞}^{∞} ∏_{m∈M, m≠m′} H_{r−c,m}(y) H_{r−c,i}(y) { H_{r,j}(y) − H_{r,i}(y) } dH_{r−c,m′}(y).

Since F_i(y) ≥ F_j(y) for all y, it follows that H_{r,i}(y) ≥ H_{r,j}(y) for all y, and every term in the last member of (7.2) is non-positive; this proves the theorem.

It follows that under our assumption (2.2) the probability of including

F[k] in the selected subset is not less than the probability of including

any other F[j] in the selected subset, i.e., the procedure R1 is unbiased.

A similar result holds for the dual problem; the proof is similar to the

above proof and is omitted.

8. Acknowledgement

The authors wish to thank Mr. George Woodworth of the University of

Minnesota for several useful discussions in connection with this paper.

Thanks are also due to Mr. of Ohio State University for

his help in carrying out the computations in the tables.


Table 1

Values of P_1(n) for k = 2 and α = ½, and the smallest value of n for which P_1(n) ≥ P*.

P_1(n) = 1 − C(n,r)/C(2n,r), where r = (n+1)/2.

  n     P_1(n)
  1    .500000
  3    .800000
  5    .916667
  7    .965035
  9    .985294
 11    .993808
 13    .997391
 15    .998901

 P*                           .500 .550 .600 .650 .700 .800 .850 .900 .950 .975 .990
 Smallest n with P_1(n) ≥ P*    1    3    3    3    3    3    5    5    7    9   11

Table 2

Largest values of r−c for which P{CS} ≥ P*

                            P*
   n   r=(n+1)/2   .750  .900  .950  .975  .990
    5      3         1     1     *     *     *
   15      8         6     4     3     2     2
   25     13        10     8     7     6     5
   35     18        15    12    11    10     8
   45     23        19    16    15    13    12
   55     28        24    21    19    17    16
   65     33        29    25    23    22    20
   75     38        33    30    28    26    24
   85     43        38    34    32    30    28
   95     48        43    39    36    34    32
  145     73        67    62    59    56    53
  195     98        91    85    81    78    75
  245    123       115   108   104   101    97
  295    148       139   132   128   124   119
  345    173       164   156   151   147   142
  395    198       188   180   174   170   165
  445    223       212   203   198   193   188
  495    248       237   227   222   217   211

Values of c√n/(n+1) (see (5.4))†

   n     r       .750      .900      .950      .975      .990
   95    48    .507646   .913762  1.218349  1.421407  1.624466
  195    98    .498723   .926200  1.211184  1.424923  1.638661
  295   148    .522230   .928409  1.160511  1.392613  1.682741
  395   198    .501884   .903391  1.204522  1.405275  1.656217
  495   248    .493416   .941977  1.166257  1.390537  1.659673
   ∞     ∞     .476936   .906194  1.163087  1.385903  1.644976

*Degenerate cases in which both the populations go into the selected subset with probability one.

†The erratic behavior in the approach of c√n/(n+1) to a limit as n → ∞ is due to the fact that c is forced to be an integer under R1. This suggests that we might use a procedure R1″ that allows randomization between r−c and r−c+1 to get P{CS|R1″} = P* exactly. Then we get an improvement because E{S|R1} − E{S|R1″} is positive, although it will usually be small; tables for the randomized procedure have not been constructed.

References

[1] Bartlett, N. (1965). Ranking and selection procedures for continuous and certain discrete populations. Ph.D. Thesis, Case Institute of Technology.

[2] Bechhofer, R. E. (1954). A single-sample multiple-decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25, 16-39.

[3] Gupta, S. S. (1965). On some multiple decision (selection and ranking) rules. Technometrics 7, 225-245.

[4] Gupta, S. S. (1963). Probability integrals of the multivariate normal and multivariate t. Ann. Math. Statist. 34, 792-828.

[5] Lehmann, E. L. (1963). A class of selection procedures based on ranks. Math. Annalen 150, 268-275.

[6] Milton, Roy (1963). Tables of the equally correlated multivariate normal integral. Technical Report No. 27, University of Minnesota, Minneapolis, Minnesota.

[7] Sobel, Milton (1966). Selecting the population with the largest α-percentile. Submitted to Ann. Math. Statist. (for this issue).
