SELECTING THE t POPULATIONS WITH THE LARGEST α-PERCENTILES*
by Milton Sobel

SELECTING A SUBSET CONTAINING THE POPULATION WITH THE LARGEST α-PERCENTILE*
by M. Haseeb Rizvi† and Milton Sobel

Technical Report No. 65
November, 1965
University of Minnesota
Minneapolis, Minnesota

†Now at Ohio State University, Columbus, Ohio.
*This report was supported by National Science Foundation Grant No. GP-3813.
Selecting the t Populations with the Largest α-Percentiles
by Milton Sobel, University of Minnesota

1. Introduction

There are given k populations with cumulative distribution functions (c.d.f.'s) F_i(x) = F_i (i = 1,2,...,k); a number α with 0 < α < 1 and an integer t with 1 ≤ t ≤ k−1 are preassigned. The goal is to select those t of the k populations which have the largest α-percentiles. The form of the c.d.f. F_i is unknown, but it is assumed that F_i(x) is continuous in x (i = 1,2,...,k). Two different formulations of this nonparametric problem will be given; for each formulation there will be a Case A and a Case B according to whether a certain assumption is made.

Let x_α(F) denote the α-percentile of the distribution F; if this percentile is not unique we can define it as any arbitrary convex combination of points in the closure of the set {x : F(x) = α}. However, we shall assume that each F_i has a unique α-percentile in order to avoid extra notation which does not add to the basic ideas of the problem. Let F_[i](x) = F_[i] denote the c.d.f. with the i-th smallest α-percentile. The correct ordering of the k distributions is

(1.1)    x_α(F_[1]) ≤ x_α(F_[2]) ≤ ··· ≤ x_α(F_[k]).

It is assumed that no a priori information (in the form of a distribution or otherwise) is available concerning the correct pairing of the F_i and the F_[j]. We do not assume that the k distributions have the same supports.

It is clear that if the F_i(x) are ordered uniformly in x, i.e., if there are no crossovers, then the problem of selecting the t smallest of the F_i(x) is equivalent to the problem of selecting the t populations with the largest α-percentiles.
Formulation 1:

Let ε* > 0 be a specified number such that ε* ≤ α ≤ 1−ε*. Let x_β⁻(F) and x_β⁺(F) denote the infimum and supremum of the set {x : F(x) = β}. Define the closed interval

(1.2)    I = [ x⁻_{α−ε*}(F_[k-t+1]), x⁺_{α+ε*}(F_[k-t+1]) ].

For i = 1,2,...,k−t and j = k−t+1, k−t+2,...,k, let

(1.3)    d_{i,j} = inf_{y∈I} { F_[i](y) − F_[j](y) }.

Finally let d* and P* denote specified constants with d* > 0 and 1 > P* > 1/(k choose t). We define F_[k-t] < F_[k-t+1] to mean that x_α(F_[k-t]) < x_α(F_[k-t+1]), and when this is the case we define a correct selection (CS) to mean the selection of those t populations with the largest α-percentiles; we shall not need the definition of a CS when F_[k-t] = F_[k-t+1].

Our goal is to find a procedure R which satisfies the probability requirement

(1.4)    P{CS|R} ≥ P*  whenever  inf_{(i,j)} d_{i,j} ≥ d*,

i.e., whenever the infimum of d_{i,j} over all pairs (i,j) with 1 ≤ i ≤ k−t and k−t+1 ≤ j ≤ k is at least d*. Any vector or point (F_1, F_2, ..., F_k) for which d_{i,j} < d* for at least one pair (i,j) is said to be in the "indifference zone," and no probability requirement is made for such points.

Let the common sample size n from each population be sufficiently large so that 1 ≤ (n+1)α ≤ n, and let the positive integer r be defined by

(1.5)    r ≤ (n+1)α < r+1.

It follows from the above that 1 ≤ r ≤ n. Let Y_{j,i} denote the j-th order statistic from F_i (j = r, r+1; i = 1,2,...,k) and let

(1.6)    W_i = [r+1−(n+1)α] Y_{r,i} + [(n+1)α − r] Y_{r+1,i}.
The procedure R will be to take n observations from each population and select the t populations which give rise to the t largest W-values.
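As a computational illustration, the procedure R just described can be sketched as follows (a minimal Python sketch; the function names are ours, and ties among W-values are broken arbitrarily):

```python
import math

def w_statistic(sample, alpha):
    """Interpolated alpha-percentile estimate W of (1.6): a convex
    combination of the r-th and (r+1)-st order statistics, where the
    integer r satisfies r <= (n+1)*alpha < r+1 as in (1.5)."""
    n = len(sample)
    y = sorted(sample)
    r = math.floor((n + 1) * alpha)
    assert 1 <= r <= n, "need 1 <= (n+1)*alpha <= n"
    frac = (n + 1) * alpha - r
    if r == n:          # here frac == 0, so W is the largest order statistic
        return y[-1]
    return (1 - frac) * y[r - 1] + frac * y[r]

def select_top_t(samples, alpha, t):
    """Procedure R: take n observations from each population and select
    the t populations giving rise to the t largest W-values."""
    w = [w_statistic(s, alpha) for s in samples]
    order = sorted(range(len(samples)), key=lambda i: w[i], reverse=True)
    return sorted(order[:t])
```

When (n+1)α is an integer, `frac` is zero and W_i reduces to the single order statistic Y_{r,i}, which is the case (1.7) used throughout the report.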
If α is rational then (n+1)α will be an integer for some arithmetic progression A of n-values, and then r = (n+1)α and

(1.7)    W_i = Y_{r,i};

for simplicity we shall assume that α is rational. Hence the problem of finding a procedure R satisfying (1.4) reduces to the problem of finding the smallest integer n such that (1.4) is satisfied; we shall be content with finding the smallest n in A that satisfies (1.4).
Formulation 2:

Let ε* > 0 be a specified number such that ε* ≤ α ≤ 1−ε*. For i = 1,2,...,k−t and j = k−t+1, k−t+2,...,k, let

d'_{i,j} = x⁻_{α−ε*}(F_[j]) − x⁺_{α+ε*}(F_[i]).

Let P* denote a preassigned constant such that 1 > P* > 1/(k choose t). Our goal is to find a procedure R such that

(1.8)    P{CS|R} ≥ P*  whenever  inf_{(i,j)} d'_{i,j} ≥ 0,

i.e., whenever the infimum of d'_{i,j} over all pairs (i,j) with 1 ≤ i ≤ k−t and k−t+1 ≤ j ≤ k is non-negative. The form of the procedure is the same as for Formulation 1 above and the comments made there hold in this case also; in particular, the same assumption that α is rational will be made. Hence the problem again reduces to finding the smallest integer n such that (1.8) is satisfied. The vectors (or points) in the indifference zone are those for which d'_{i,j} < 0 for at least one pair (i,j).
2. Probability of a Correct Selection

For each formulation we calculate the exact and asymptotic (n → ∞) probability of a correct selection for an arbitrary configuration and for the "least favorable" configuration, the latter being defined by the infimum of the P{CS} in the complement of the indifference zone.
2.1 P{CS} for Formulation 1A

Let Y_{r,(i)} = Y_(i) denote the r-th order statistic associated with the distribution F_[i]. The density element dH_{r,i}(y) = dH_i(y) and the c.d.f. H_{r,i}(y) = H_i(y) of Y_(i) are respectively given by

(2.1)    dH_i(y) = r (n choose r) F_[i]^{r-1}(y) [1 − F_[i](y)]^{n-r} dF_[i](y),

(2.2)    H_i(y) = Σ_{j=r}^{n} (n choose j) F_[i]^{j}(y) [1 − F_[i](y)]^{n-j}.

The probability of a correct selection under the procedure R is given by

(2.3)    P{CS|R} = P{ Max(Y_(1),...,Y_(k-t)) < Min(Y_(k-t+1),...,Y_(k)) }
         = Σ_{j=k-t+1}^{k} P{ Max(Y_(1),...,Y_(k-t)) < Y_(j) = Min(Y_(k-t+1),...,Y_(k)) }
         = Σ_{j=k-t+1}^{k} ∫_{−∞}^{∞} Π_{i=1}^{k-t} H_i(y) Π_{β=k-t+1, β≠j}^{k} [1 − H_β(y)] dH_j(y)
         ≥ Σ_{j=k-t+1}^{k} ∫_I Π_{i=1}^{k-t} H_i(y) Π_{β=k-t+1, β≠j}^{k} [1 − H_β(y)] dH_j(y) + G^{k-t}(α+ε*+d*) [1 − G(α+ε*)]^t,

where G_{r,n}(p) = G(p) is used for the standard incomplete beta function

G_{r,n}(p) = ( n! / ((r−1)!(n−r)!) ) ∫_0^p x^{r-1} (1−x)^{n-r} dx;

in the last term of (2.3) we have used the fact that H_i(y) is a continuous nondecreasing function of F_[i](y) and hence also of y. Using this fact again gives

(2.4)    P{CS|R} ≥ Σ_{j=k-t+1}^{k} ∫_I [H_{k-t}(y)]^{k-t} Π_{β=k-t+1, β≠j}^{k} [1 − H_β(y)] dH_j(y) + G^{k-t}(α+ε*+d*) [1 − G(α+ε*)]^t,
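For computation it is convenient that G_{r,n}(p) of the display above equals the binomial tail probability visible in (2.2); a short Python sketch (ours) exploits this identity:

```python
from math import comb

def G(r, n, p):
    """Incomplete beta function G_{r,n}(p), computed through the
    binomial-tail identity underlying (2.2):
    G_{r,n}(p) = sum_{j=r}^{n} C(n,j) p^j (1-p)^(n-j)."""
    p = min(max(p, 0.0), 1.0)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))
```

For r = (n+1)/2 (n odd) this G is the symmetric incomplete beta function, with the symmetry G(½+x) = 1 − G(½−x) used repeatedly below.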
and we note that we have set each H_i(y) under the integral sign equal to H_{k-t}(y) to attain a minimum.

Similarly we can write the P{CS|R} in the form

(2.5)    P{CS|R} = P{ Max(Y_(1),...,Y_(k-t)) < Min(Y_(k-t+1),...,Y_(k)) }
         = Σ_{i=1}^{k-t} P{ Max(Y_(1),...,Y_(k-t)) = Y_(i) < Min(Y_(k-t+1),...,Y_(k)) }
         = Σ_{i=1}^{k-t} ∫_{−∞}^{∞} Π_{r=1, r≠i}^{k-t} H_r(y) Π_{j=k-t+1}^{k} [1 − H_j(y)] dH_i(y)
         ≥ Σ_{i=1}^{k-t} ∫_I Π_{r=1, r≠i}^{k-t} H_r(y) [1 − H_{k-t+1}(y)]^t dH_i(y) + G^{k-t}(α−ε*+d*) [1 − G(α−ε*)]^t.

We note that to minimize the P{CS|R} we have to set each H_j(y) equal to H_{k-t+1}(y) for y in the interior of the interval I. Carrying this out in (2.4) we obtain

(2.6)    P{CS|R} ≥ t ∫_I [H_{k-t}(y)]^{k-t} [1 − H_{k-t+1}(y)]^{t-1} dH_{k-t+1}(y) + G^{k-t}(α+ε*+d*) [1 − G(α+ε*)]^t.

On the interval I we have F_[k-t](y) = F_[k-t+1](y) + d* and hence, letting u = F_[k-t+1](y),

(2.7)    H_{k-t}(y) = G(F_[k-t](y)) = G(u+d*),

(2.8)    H_{k-t+1}(y) = G(u).

Hence (2.6) takes the form
(2.9)    P{CS|R} ≥ t ∫_{α−ε*}^{α+ε*} G^{k-t}(u+d*) [1 − G(u)]^{t-1} dG(u) + G^{k-t}(α+ε*+d*) [1 − G(α+ε*)]^t.

From (2.5) we now obtain the expression

(2.10)    P{CS|R} ≥ (k−t) ∫_{α+d*−ε*}^{α+d*+ε*} G^{k-t-1}(v) [1 − G(v−d*)]^t dG(v) + G^{k-t}(α−ε*+d*) [1 − G(α−ε*)]^t,

which is equivalent to (2.9).

It is interesting to note that with this formulation the right side of (2.9) does not give 1/(k choose t) if we set d* = 0 (or if we set ε* = d* = 0), but actually gives something smaller; for n → ∞ so that r/n → α we will show below that the right side of (2.9) approaches one for d* > 0 and approaches 1/(k choose t) for d* = 0.
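The lower bound (2.9) is easy to evaluate numerically; the following Python sketch (ours) treats the k = 2, t = 1 case with a midpoint rule, and illustrates how the bound improves with n for fixed d* and ε*:

```python
from math import comb

def G(r, n, p):
    # incomplete beta via the binomial-tail identity of (2.2)
    p = min(max(p, 0.0), 1.0)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))

def pcs_lower_bound(n, d, eps, alpha=0.5, steps=2000):
    """Midpoint-rule value of the k = 2, t = 1 case of (2.9):
    the integral of G(u+d*) dG(u) over [alpha-eps*, alpha+eps*] plus the
    boundary term G(alpha+eps*+d*)[1 - G(alpha+eps*)].  n must be odd so
    that r = (n+1)/2 matches alpha = 1/2."""
    r = (n + 1) // 2
    lo, hi = alpha - eps, alpha + eps
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        u0, u1 = lo + i * h, lo + (i + 1) * h
        um = 0.5 * (u0 + u1)
        total += G(r, n, um + d) * (G(r, n, u1) - G(r, n, u0))
    return total + G(r, n, alpha + eps + d) * (1 - G(r, n, alpha + eps))
```

The bound increases toward one as n grows with d* > 0 fixed, in line with the remark above.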
2.2 P{CS} for Formulation 1B

A slight variation of the results in the last section is obtained if we make the assumption that for all x

(2.11)    F_[i](x) ≥ F_[k-t](x)    (i = 1,2,...,k−t),
          F_[j](x) ≤ F_[k-t+1](x)    (j = k−t+1, k−t+2,...,k).

For the special case t = 1 we omit the second part of (2.11) and for t = k−1 we omit the first part. [It should be noted that we could have assumed that for all pairs (i,j) with 1 ≤ i ≤ k−t and k−t+1 ≤ j ≤ k and all x

(2.12)    F_[i](x) ≥ F_[j](x),

but the assumption (2.11) is a weaker assumption and accomplishes the same purpose, namely to make F_[i] = F_[j] = F (say) in the worst configuration.]
Under the assumption (2.11) the derivation of the first term on the right side of (2.9), i.e., the integral, remains exactly the same as in (2.9), but the contribution from the complement of I is different. The final result for this formulation, using the same methods as above, is

(2.13)    P{CS|R} ≥ t ∫_{α−ε*}^{α+ε*} G^{k-t}(u+d*) [1 − G(u)]^{t-1} dG(u)
                    + (1/(k choose t)) [ 1 − G_{k-t+1,t}(G(α+ε*+d*)) + G_{k-t+1,t}(G(α−ε*)) ],

where G_{a,b}(p) here denotes the incomplete beta function with parameters a and b. The explanation of the second term on the right side of (2.13) is as follows. In the j-th term of the sum in (2.3) we let u = F_[j](y) and consider that portion of the integral for which α+ε* < u < α+ε*+d*. As y approaches y₁ = x_{α+ε*}(F_[k-t+1]), the right end point of I, from within I we have F_[j](y) → α+ε*, and it follows that F_[i](y) ≥ α+ε*+d* and hence H_i(y) → G(α+ε*+d*) for i = 1,2,...,k−t and y = y₁. Hence, letting I₁' denote this part of the integral in (2.3),

(2.14)    I₁' ≥ G^{k-t}(α+ε*+d*) Σ_{j=k-t+1}^{k} ∫_{y₁}^{y₁⁺} Π_{β=k-t+1, β≠j}^{k} [1 − H_β(y)] dH_j(y)
              = G^{k-t}(α+ε*+d*) { 1 − Π_{j=k-t+1}^{k} [1 − H_j(y)] } |_{y₁}^{y₁⁺},

where y₁⁺ denotes the point at which F_[k-t+1] reaches α+ε*+d*. The last part of (2.13) represents the contributions of the intervals α+ε*+d* ≤ u ≤ 1 and 0 ≤ u ≤ α−ε*, respectively.

For d* = 0 and any ε* > 0, it is easily seen that the right side of
(2.13) reduces to 1/(k choose t).

It is natural to conjecture that the right side of (2.13) is greater than the right side of (2.9), and hence that formulation 1B requires an n-value which is not larger than that required by formulation 1A. This is indeed the case, since the difference Δ of these quantities satisfies the inequality

(2.15)    Δ ≥ t ∫_x^1 y^{k-t} (1−y)^{t-1} dy − x^{k-t} (1−x)^t    (0 ≤ x ≤ 1),

where x = G(α+ε*+d*). Differentiating the right side of (2.15) we find that it is decreasing from 1/(k choose t) to 0 as x increases from 0 to 1, so that Δ ≥ 0.

It will also be seen from the discussion below that the expressions in (2.9) and (2.13) are asymptotically equivalent for large n (or, equivalently, for d* close to zero), and it will be of some interest to see how much difference it makes in the sample size to make the assumption (2.11). Table 1 shows that this difference is quite small.
2.3 The P{CS|R} for Formulation 2A

Using the third expression in (2.3) as a starting point, setting H_i(y) = G(α+ε*) for each i, integrating the resulting expression, and letting x₁ = min_j x_{α−ε*}(F_[j]) gives

(2.16)    P{CS|R} ≥ G^{k-t}(α+ε*) Π_{j=k-t+1}^{k} [1 − H_j(x₁)].

Increasing every H_j as much as possible at x₁ we obtain G(α−ε*) for each of them and hence

(2.17)    P{CS|R} ≥ G^{k-t}(α+ε*) [1 − G(α−ε*)]^t.

It will be seen below that this tends to one as n → ∞, and the remaining problem is to determine the smallest n for which the right side of (2.17) is at least P*.

For the special case α = ½ we have r = (n+1)/2 = n−r+1 and, using a well-known property of the symmetric incomplete beta function, (2.17) becomes

(2.18)    P{CS|R} ≥ G^k(½+ε*),

which does not depend on t.
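Given (2.18), the smallest odd n for formulation 2A with α = ½ can be found by direct search; a small Python sketch (ours), with G computed as in (2.2):

```python
from math import comb

def G(r, n, p):
    # incomplete beta via the binomial-tail identity of (2.2)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))

def smallest_odd_n_2A(k, eps, P):
    """Smallest odd n with G^k(1/2 + eps*) >= P*, per (2.18)
    (alpha = 1/2, r = (n+1)/2), for formulation 2A."""
    n = 1
    while True:
        if G((n + 1) // 2, n, 0.5 + eps) ** k >= P:
            return n
        n += 2
```

The search terminates because G((n+1)/2, n, ½+ε*) → 1 as n → ∞ for any ε* > 0.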
2.4 The P{CS|R} for Formulation 2B

We now make the same assumption (2.11) as in formulation 1B and, using the same methods as above, we obtain

(2.19)    P{CS|R} ≥ t ∫_0^{α−ε*} G^{k-t}(u) [1 − G(u)]^{t-1} dG(u) + t ∫_{α+ε*}^{1} G^{k-t}(u) [1 − G(u)]^{t-1} dG(u)
                    + G^{k-t}(α+ε*) [G(α+ε*) − G(α−ε*)]^t
          = (1/(k choose t)) [ G_{k-t+1,t}(G(α−ε*)) + 1 − G_{k-t+1,t}(G(α+ε*)) ] + G^{k-t}(α+ε*) [G(α+ε*) − G(α−ε*)]^t.

It is natural to conjecture that the right side of (2.19) is greater than the right side of (2.17), and hence formulation 2B requires an n which is not larger than that required by formulation 2A. This is indeed the case, and the proof is exactly the same as in formulation 1. We also note that for ε* = 0 the right side of (2.19) reduces to 1/(k choose t).

For k = 2, t = 1, α = ½ and r = (n+1)/2, Table 1 gives the smallest odd integers that will solve the problem for selected values of d* = ε* and P*, for each of the four formulations, 1A, 1B, 2A and 2B.
3. Asymptotic Expressions

If X has the incomplete beta distribution G_{r,n}(x) = G(x), then as n → ∞ and r/n → α,

(3.1)    G(x) ≈ Φ( (x − r/(n+1)) / √( r(n−r+1) / ((n+2)(n+1)²) ) ),

where Φ denotes the standard normal c.d.f. Hence the asymptotic expression as n → ∞ with r/n → α for formulation 1A is

(3.2)    P{CS|R} ≈ t ∫_{−∞}^{∞} Φ^{k-t}(x + v_n) [1 − Φ(x)]^{t-1} dΦ(x),

where v_n = d*√(n+1)/√(α(1−α)), and it will be understood that n → ∞ because d* → 0. Values of v have been tabulated for t = 1 by Gupta [3] and Milton [6]. If v(t,k,P*) = v* is the tabulated solution of

(3.3)    t ∫_{−∞}^{∞} Φ^{k-t}(x + v*) [1 − Φ(x)]^{t-1} dΦ(x) = P*,

then the required asymptotic solution for n is given by

(3.4)    n ≈ (v*/d*)² α(1−α).

Clearly, we get an upper bound to this asymptotic solution by setting α(1−α) = ¼. It is interesting to note that v* and also the asymptotic solution (3.4) do not depend on ε*.
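The solution v* of (3.3) and the sample size (3.4) can also be approximated numerically rather than read from the Gupta–Milton tables; a Python sketch (ours), using midpoint quadrature and bisection:

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def pcs_normal(v, k, t=1, lo=-8.0, hi=8.0, steps=4000):
    """Left side of (3.3): t * integral of Phi^(k-t)(x+v)[1-Phi(x)]^(t-1) dPhi(x),
    by the midpoint rule."""
    h = (hi - lo) / steps
    s = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        s += Phi(x + v) ** (k - t) * (1.0 - Phi(x)) ** (t - 1) * phi(x)
    return t * s * h

def v_star(k, P, t=1):
    """Solve (3.3) for v* by bisection (the left side increases in v)."""
    a, b = 0.0, 10.0
    for _ in range(60):
        m = 0.5 * (a + b)
        if pcs_normal(m, k, t) < P:
            a = m
        else:
            b = m
    return 0.5 * (a + b)

def n_asymptotic(k, P, d, alpha=0.5, t=1):
    """Asymptotic sample size (3.4): n ~ (v*/d*)^2 alpha(1-alpha)."""
    return (v_star(k, P, t) / d) ** 2 * alpha * (1 - alpha)
```

For k = 2, t = 1 the left side of (3.3) is Φ(v/√2), so v* = √2 Φ^{-1}(P*), which provides a check on the quadrature.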
For formulation 1B the asymptotic (n → ∞, r/n → α) expression for the right side of (2.13) is the same as (3.2), since for x > 0

(3.5)    1 − G(α+x) ≈ 1 − Φ( x√(n+1)/√(α(1−α)) ) → 0,

(3.6)    G(α−x) ≈ Φ( −x√(n+1)/√(α(1−α)) ) → 0,

and hence all terms after the integral in (2.13) tend to zero.
For formulation 2A we easily obtain from (2.17)

(3.7)    P{CS|R} ≈ Φ^k( ε*√(n+1) / √(α(1−α)) ),

and hence, if λ(P) denotes the P-percentile of the standard normal c.d.f., the required asymptotic solution for n is given by

(3.8)    n ≈ α(1−α) [ λ(P*^{1/k}) / ε* ]²,

and again we obtain an upper bound over different α by replacing α(1−α) by ¼ in (3.8).

It is curious that the results in (3.7) and (3.8) do not depend on t; here this property holds for all values of α.

For formulation 2B the asymptotic (n → ∞, r/n → α) expression for the right side of (2.19) is the same as that of (3.7) by virtue of (3.5) and (3.6). Hence (3.7) also holds for formulation 2B.
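A numerical sketch (Python, ours) of (3.8): since Φ^k(w) = P* gives w = Φ^{-1}(P*^{1/k}), the asymptotic n for formulation 2 is immediate:

```python
import math

def Phi_inv(p):
    """Inverse of the standard normal c.d.f. by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(80):
        m = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(m / math.sqrt(2.0))) < p:
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)

def n_formulation2(k, P, eps, alpha=0.5):
    """Asymptotic sample size of (3.8): solving Phi^k(w) = P* for
    w = eps* sqrt(n+1)/sqrt(alpha(1-alpha)) gives
    n ~ alpha(1-alpha) [Phi_inv(P*^(1/k)) / eps*]^2."""
    return alpha * (1 - alpha) * (Phi_inv(P ** (1.0 / k)) / eps) ** 2
```

As the text notes, the result does not involve t, and halving ε* quadruples the required n.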
4. Asymptotic Relative Efficiency for Formulation 1

In this section we compare the asymptotic relative efficiency (ARE) of our procedure for formulation 1 with α = ½ with that of several other procedures developed for particular parametric families. Unless stated otherwise we shall assume that P* is fixed and that d* → 0 for fixed small ε*; then we will let ε* → 0 also.

1. ARE for Normal Slippage Alternatives (NA)

We first compare (3.4) with the result obtained in [1] for the problem of selecting the t 'best' out of k normal populations with common variance
σ² = 1. The infimum of the P{CS} under the procedure R' based on the sample means is easily seen to be (see [1] for details)

(4.1)    P{CS|R'} ≥ t ∫_{−∞}^{∞} Φ^{k-t}(x + v'_n) [1 − Φ(x)]^{t-1} dΦ(x),

where v'_n = δ*√n, δ* > 0 being a specified measure of distance between the t-th largest population mean and the (t+1)-st largest. Then n is obtained by setting the right side of (4.1) equal to a specified value P*, where 1 > P* > 1/(k choose t).

For two c.d.f.'s Φ(x) and Φ(x+δ*), we want to find the smallest value of δ* such that Φ(x+δ*) − Φ(x) ≥ d* for all x ∈ I, where I is defined in (1.2) with F_[k-t+1](x) = Φ(x). It is easy to see that we need only ask for equality at the right end point of I, i.e., at x₀ = Φ^{-1}(½+ε*), and the smallest δ* that satisfies our condition is given by

(4.2)    Φ(x₀+δ*) − Φ(x₀) = d*,

or equivalently,

(4.3)    δ* = Φ^{-1}(d*+½+ε*) − Φ^{-1}(½+ε*).

Hence the asymptotic (n → ∞, r/n → ½) value of n (say n') required for this normal problem is

(4.4)    n' ≈ (v*/δ*)² ≈ (v*/d*)² [ (d/dx) Φ^{-1}(x) |_{x=½+ε*} ]^{-2}
            = (v*/d*)² (1/2π) exp{ −[Φ^{-1}(½+ε*)]² }.

Using (3.4) with α = ½ and (4.4) it follows that the ARE(R,R') of the procedure R based on the medians relative to the best fixed sample procedure R' based on the sample means is

(4.5)    ARE(R,R';NA) = lim_{d*→0} n'/n = (2/π) exp{ −[Φ^{-1}(½+ε*)]² }.
If we let ε* → 0 then we can make this result as close to 2/π as we wish; in particular, if we had set ε* = d* and let d* approach zero then the result would be exactly 2/π.
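The limit (4.5) is easy to check numerically; a Python sketch (ours):

```python
import math

def Phi_inv(p):
    # bisection inverse of the standard normal c.d.f.
    lo, hi = -10.0, 10.0
    for _ in range(80):
        m = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(m / math.sqrt(2.0))) < p:
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)

def are_medians_vs_means(eps):
    """ARE(R, R'; NA) of (4.5): (2/pi) exp(-[Phi_inv(1/2 + eps*)]^2)."""
    return (2.0 / math.pi) * math.exp(-Phi_inv(0.5 + eps) ** 2)
```

At ε* = 0 this is exactly 2/π ≈ 0.637, the classical efficiency of the sample median relative to the sample mean under normality, and it decreases as ε* grows.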
We now compare our procedure R with a class of procedures based on ranks (including the Rank Sum Procedure and the Normal Scores Procedure) which form a slight generalization of the class of procedures described by Lehmann [5]. To develop Lehmann's results on the required sample size for the problem of selecting the t best populations (he considers only t = 1), we now prove an analogue of his Lemma 2.

Let X_{ij} denote the j-th observation from the population π_i (i = 1,2,...,k; j = 1,2,...,n) and let R_{ij} denote the rank of X_{ij} in the combined sample. Let

(4.6)    V_i = Σ_{j=1}^{n} J( R_{ij} / (nk+1) )    (i = 1,2,...,k),

where J(x) is the inverse of some arbitrary c.d.f., i.e., J(x) is nondecreasing and defined on the interval [0,1]. It is assumed that X_{ij} has distribution F_i and that the F_i differ only in a location parameter θ. Let θ_[1] ≤ θ_[2] ≤ ··· ≤ θ_[k] denote the ordered θ-values. The procedure R'' for selecting the t 'best' populations will be to select those that give rise to the t largest V_i-values. If the t-th best is shifted to the right of the (t+1)-st best by a positive amount δ = θ_[k-t+1] − θ_[k-t] > 0 then a correct selection (CS) is clearly defined. We want to determine the sample size n required to satisfy the condition

(4.7)    P{CS|R''} ≥ P*  whenever  δ ≥ δ*;

here δ* > 0 and P* (1 > P* > 1/(k choose t)) are preassigned constants.

It is easy to show that to satisfy (4.7) we need only consider the configuration
(4.8)    θ_[1] = ··· = θ_[k-t] = θ_[k-t+1] − δ* = ··· = θ_[k] − δ*,

i.e., (4.8) is the least favorable configuration (LFC).

Lemma 2. For fixed P*, let n = n(δ*,t,k) denote the smallest integer that satisfies (4.7), and suppose that the F_i and J satisfy the regularity conditions of Theorem 6.2 and Lemma 7.2 of Puri [8]. Then as n → ∞

(4.9)    δ* = v* A / ( √n ∫ (d/dx){J(F(x))} dF(x) ) + o(1/√n),

where v* = v(t,k,P*) is defined in Section 3 and

(4.10)    A² = ∫ J²(u) du − ( ∫ J(u) du )².

Proof: The proof is quite similar to that given in Lehmann [5]; we consider a sequence of parameter vectors satisfying (4.8) with δ* → 0. By Puri's theorem the suitably standardized random variables ξ_i have a limiting normal distribution with

(4.12)    E ξ_i = 0  (i = 1,2,...,k),   σ²(ξ_i) = 1  (i = 1,2,...,k),   ρ(ξ_i,ξ_j) = ½  (i ≠ j).

The P{CS} then is easily shown to take the form

(4.13)    P{CS|R''} ≈ P{ Max_i ξ_i < Min[ Min_j (ξ_j + v/√2), v/√2 ] },

where i = 1,2,...,k−t, j = k−t+1, k−t+2,...,k−1, and v is defined by
(4.14)    v = √n ∫ [ J(½F(x+δ*) + ½F(x)) − J(½F(x) + ½F(x−δ*)) ] dF(x) / A
            ≈ δ* √n ∫ (d/dx){J(F(x))} dF(x) / A.

We can now write, for all a (a = 1,2,...,k−1), ξ_a = (η_a − η_k)/√2, where the η_i (i = 1,2,...,k) are independent standard normal chance variables, and we then obtain

(4.15)    P{CS|R''} ≈ P{ Max_i η_i < Min_{j=k-t+1,...,k} (η_j + v) }
          = Σ_{a=k-t+1}^{k} P{ Max_i η_i < η_a + v, η_a < Min_{j≠a} η_j }
          = t ∫_{−∞}^{∞} Φ^{k-t}(x+v) [1 − Φ(x)]^{t-1} dΦ(x).

If we set the last expression in (4.14) equal to v* then we obtain the required result (4.9).
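The score procedure R'' of (4.6) can be sketched directly; Python (ours), with J(u) = u giving the rank-sum version:

```python
def select_by_scores(samples, t, J=lambda u: u):
    """Sketch of the score procedure R'' of (4.6)-(4.7): each observation
    is scored by J(rank/(nk+1)) in the pooled sample (J(u) = u gives the
    rank-sum procedure) and the t populations with the largest score sums
    V_i are selected.  Ties in the pooled sample are not handled here."""
    k, n = len(samples), len(samples[0])
    pooled = sorted((x, i) for i, s in enumerate(samples) for x in s)
    V = [0.0] * k
    for rank, (_, i) in enumerate(pooled, start=1):
        V[i] += J(rank / (n * k + 1))
    order = sorted(range(k), key=lambda i: V[i], reverse=True)
    return sorted(order[:t])
```

Supplying J = Φ^{-1} instead of the identity gives the normal-scores procedure discussed below.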
For the location parameter problem with α ≥ ½ we make use of the

Lemma: If f₀(x) is a continuous unimodal density that is symmetric about x = 0, and if α ≥ ½, then to satisfy the condition that F₀(x+δ*) − F₀(x) ≥ d* for all x ∈ I it is sufficient to set F₀(x₀+δ*) − F₀(x₀) = d*, where x₀ = F₀^{(-1)}(α+ε*) is the right end point of I.

Proof: If f₀(x) is unimodal and continuous and δ* is sufficiently small, then there exists a point (or a whole interval, in which case we select the midpoint of this interval) inside the support of f₀(x) where f₀(x+δ*) = f₀(x); call this point x'. Then clearly F₀(x+δ*) − F₀(x) is also unimodal with a maximum at x'; if in addition f₀(x) is symmetric (say, about x = 0) then F₀(x+δ*) − F₀(x) will be symmetric about x' = −δ*/2. If α ≥ ½ then x₀ = F₀^{(-1)}(α+ε*) ≥ 0 since ε* > 0 and F₀ has median 0; clearly the left end point of I is closer to x' < 0 than x₀ is. Hence F₀(x+δ*) − F₀(x) takes on its minimum value over I at x₀, which proves the lemma.

For small values of δ* (and hence also d*) we have, using the above lemma,

(4.16)    d* ≈ δ* f₀(x₀)

when α ≥ ½ and f₀(x) satisfies the properties in the above lemma.

From (3.4), (4.9) and (4.16) we now obtain for translation alternatives (TA) the
Theorem: If α ≥ ½ and f(x) has the properties mentioned above, then for any ε* > 0

(4.17)    ARE(R,R''(J);TA) = A² f²(x₀) / ( α(1−α) [ ∫ (d/dx){J(F(x))} dF(x) ]² ).

For the rank sum test R₁'' we take J(x) = x and obtain from (4.17), for normal alternatives (NA) with α = ½,

(4.18)    ARE(R,R₁'';NA) = lim_{d*→0} n₁''/n = (2/3) exp{ −[Φ^{(-1)}(½+ε*)]² },

which can be made arbitrarily close to 2/3 by making ε* small.

For the normal scores test R₂'' we take J(x) = Φ^{(-1)}(x) and obtain from (4.17), for normal alternatives with α = ½,

(4.19)    ARE(R,R₂'';NA) = lim_{d*→0} n₂''/n = (2/π) exp{ −[Φ^{(-1)}(½+ε*)]² },

which can be made arbitrarily close to 2/π by making ε* small.
2. ARE for 2-sided Exponential Slippage Alternatives (TEA)

We wish to compare our procedure R with certain other procedures which we might prefer if the k distributions all have the form

(4.20)    F_i(x) = F(x;θ_i) = ½ ∫_{−∞}^{x} e^{−|t−θ_i|} dt;

here the F_i(x) differ at most in the value of the real location parameter θ_i. If we knew that the distributions had this form then we might use a fixed sample procedure R₁ based on the sample medians.

[The procedure R₁ is asymptotically equivalent to a procedure R₂ which depends on scores as in (4.6) with

(4.21)    J(x) = sgn(x − ½) = −1 for 0 ≤ x < ½,  0 for x = ½,  1 for ½ < x ≤ 1.

This procedure R₂ can be shown to be the locally most powerful test on the basis of Hájek [4] for the 2-sample case; we shall not prove this result here.]

The rule for procedure R₁ is to select the t populations which give rise to the t largest sample medians. Let θ_[i] denote the i-th smallest θ-value. We want to choose n large enough so that
(4.22)    P{CS|R₁} ≥ P*

for all vectors θ = (θ₁, θ₂, ..., θ_k) such that θ_[k-t+1] − θ_[k-t] ≥ δ*; here δ* > 0 and P* with 1 > P* > 1/(k choose t) are both preassigned constants.

For general θ the P{CS|R₁} is again given by the development and third expression in (2.3) with H_i(y) defined in (2.2), except that now r = (n+1)/2 and F_[i](y) = F(y;θ_[i]). It is easy to show, using a result of Rizvi [9], that the infimum of the P{CS|R₁} is obtained at the configuration (4.8), i.e., the configuration (4.8) is again least favorable. Using this in (2.3) we obtain

(4.23)    P{CS|R₁} ≥ t ∫_{−∞}^{∞} H₀^{k-t}(y+δ*) [1 − H₀(y)]^{t-1} dH₀(y),

where H₀(y) is the right side of (2.2) with F_[i](y) replaced by F(y;0), i.e., H₀(y) is the distribution of the sample median for an odd number n₁ of observations taken from the c.d.f. (4.20). We now use the fact that the sample median is asymptotically normal with mean equal to the population median ξ and variance [4n₁f²(ξ)]^{-1}, where f(x) is the derivative of F(x;ξ). Applying this to (4.23) gives

(4.24)    P{CS|R₁} ≈ t ∫_{−∞}^{∞} Φ^{k-t}(x + δ*√n₁) [1 − Φ(x)]^{t-1} dΦ(x).

The value of n₁ is then obtained by setting the right side of (4.24) equal to P* and solving for n₁.

To get the relation between the horizontal measure of distance δ* and the vertical measure d* used in procedure R, we set the difference of the c.d.f.'s equal to d* at the right endpoint of the interval and obtain

(4.25)    2d*/(1−2ε*) = 1 − e^{−δ*} ≈ δ*.
Hence, using (4.24) and the definition of v* above,

(4.26)    n₁ ≈ ( v*(1−2ε*) / (2d*) )².

If we compare this with (3.4) for α = ½, or use (4.17), we obtain

(4.27)    ARE(R,R₁;TEA) = lim_{d*→0} n₁/n = (1−2ε*)²,

which can be made arbitrarily close to 1 by making ε* sufficiently small.

It should be noted that although the sampling procedures are the same, the expressions for the infimum of the P{CS} are not the same for the two procedures above. In view of the relation mentioned above between R₁ and the asymptotically most powerful test R₂, it follows that R also has an asymptotic efficiency with respect to R₂ that can be made arbitrarily close to 1 by taking ε* sufficiently small.

With respect to the rank sum test R₁'' and the normal scores test R₂'' we now wish to show that the procedure R has an asymptotic efficiency which is greater than one when the distributions are in fact given by (4.20).

For the rank sum test we use (4.9) with J(x) = x and F(x) given by (4.20) with θ = 0 and obtain

(4.28)    n₁'' ≈ (4/3) (v*/δ*)².

Hence, using (4.25) and (3.4) with α = ½,

(4.29)    ARE(R,R₁'';TEA) = lim_{d*→0} n₁''/n = (4/3) ( v*(1−2ε*)/(2d*) )² (2d*/v*)² = (4/3)(1−2ε*)²,

which can be made arbitrarily close to 4/3 by making ε* sufficiently small.

For the normal scores test R₂'' we set J(x) = Φ^{(-1)}(x) and use the same
F(x); using (4.9) again and integrating by parts we obtain

(4.30)    ∫ (d/dx){Φ^{(-1)}(F(x))} dF(x) = ∫_{−∞}^{∞} f²(x) / φ(Φ^{(-1)}(F(x))) dx = 2/√(2π),

since it is easy to show that the first term in this integration by parts vanishes and the second term is 2/√(2π). Hence, using (4.25) and (3.4) with α = ½, we obtain

(4.31)    ARE(R,R₂'';TEA) = lim_{d*→0} n₂''/n = (π/2) ( v*(1−2ε*)/(2d*) )² (2d*/v*)² = (π/2)(1−2ε*)²,

which can be made arbitrarily close to π/2 by making ε* sufficiently small.

Using (4.25) we can easily find the ARE of our procedure R with respect to the procedure R' based on the sample means for the 2-sided exponential case. Using the fact that the variance of X with c.d.f. given by (4.20) is 2, we again obtain (4.1) with v'_n = δ*√(n/2), where δ* > 0 has the same meaning as in (4.1). Hence, if v* is defined by (3.3) and n' denotes the required value of n for procedure R', then as in (4.26)

(4.32)    n' ≈ 2 ( v*(1−2ε*)/(2d*) )².

Comparing this with (3.4) for α = ½, we obtain

(4.33)    ARE(R,R';TEA) = lim_{d*→0} n'/n = 2(1−2ε*)²,

which can be made arbitrarily close to 2 by making ε* small.
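The two integrals entering (4.28) and (4.30) for the double-exponential density of (4.20) — ∫f² dx = ¼ for the rank-sum scores and 2/√(2π) for the normal scores — can be checked numerically; a Python sketch (ours):

```python
import math

def f(x):          # double-exponential density of (4.20), theta = 0
    return 0.5 * math.exp(-abs(x))

def F(x):          # its c.d.f.
    return 0.5 * math.exp(x) if x < 0 else 1.0 - 0.5 * math.exp(-x)

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi_inv(p):
    lo, hi = -12.0, 12.0
    for _ in range(80):
        m = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(m / math.sqrt(2.0))) < p:
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)

def integrate(g, lo=-20.0, hi=20.0, steps=20000):
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h

# J(x) = x (rank sum): integral of f^2, entering (4.28)
rank_sum_integral = integrate(lambda x: f(x) ** 2)
# J = Phi^{-1} (normal scores): integral of f^2 / phi(Phi^{-1}(F)), as in (4.30)
normal_scores_integral = integrate(lambda x: f(x) ** 2 / phi(Phi_inv(F(x))))
```

Together with A² = 1/12 for J(x) = x and A² = 1 for J = Φ^{-1}, these values reproduce the constants 4/3 and π/2 of (4.29) and (4.31).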
5. Supplementary Results Needed for Table Calculation

Formulation 1A

For the special case k = 2, t = 1, α = ½ and ε* = d*, using the right side of (2.9), the equation determining n for formulation 1A becomes

(5.1)    I + G(½+2d*) G(½−d*) = P*,  where  I = ∫_{½−d*}^{½+d*} G(u+d*) dG(u)

and G = G_{r,n} is the incomplete beta c.d.f. with parameters r = (n+1)/2 and n−r+1 = (n+1)/2. To evaluate the integral I in (5.1), it is useful to note that for c > 0 and 2d*/c an integer

(5.2)    I ≈ ½ Σ_{j=1}^{2d*/c} [G(½+jc) + G(½+(j−1)c)] [G(½−d*+jc) − G(½−d*+(j−1)c)];

this is obtained by considering non-overlapping squares of side length c whose diagonals form the line segment on v = u+d* with ½−d* ≤ u ≤ ½+d*, and replacing the integral over the lower triangle of each square by ½ the integral over the square. If we replace the integral over the lower triangle in each square by the integral over the whole square we clearly obtain an upper bound for I; it follows that an upper bound B₁ to the absolute value of the error in (5.2) is given by

(5.3)    B₁ = ½ Σ_{j=1}^{2d*/c} [G(½+jc) − G(½+(j−1)c)] [G(½−d*+jc) − G(½−d*+(j−1)c)].

We choose c < d* so that 0 < Δ = (d*−c)/2 < ½; then the point (½,½) is not included in any of these squares and for each square the distance from the boundary to (½,½) is at least Δ√2. Hence each square has a distance from ½ of at least Δ in one of the two coordinates. Since there are 2d*/c squares it follows that
(5.4)    B₁ ≤ (2d*/c) [G(½+Δ+c) − G(½+Δ)] ≤ φ[(d*+c)√(n+2)] / (c√(n+2)).

We note that B₁ = cd* for n = 1 and it can be shown that B₁ decreases with n; the last expression in (5.4) shows that both B₁ and B₂ decrease exponentially fast with n. Hence for n ≤ 99 we can use Pearson's table [7] with c = .01 (or c = .02, etc., provided c < d* and 2d*/c is an integer) to solve (5.1) for n with high accuracy by using (5.2) to evaluate the left side of (5.1) for different n values.
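The squares approximation (5.2) and its error bound (5.3) translate directly into code; a Python sketch (ours), again using the binomial form of G:

```python
from math import comb

def G(r, n, p):
    # incomplete beta via the binomial-tail identity of (2.2)
    p = min(max(p, 0.0), 1.0)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))

def integral_I(n, d, c):
    """Squares approximation (5.2) to I = integral of G(u+d*) dG(u) over
    [1/2-d*, 1/2+d*] (alpha = 1/2, n odd), together with the error bound
    B1 of (5.3).  Requires 2d*/c to be an integer."""
    m = round(2 * d / c)
    assert abs(2 * d / c - m) < 1e-9, "2d*/c must be an integer"
    r = (n + 1) // 2
    approx = err = 0.0
    for j in range(1, m + 1):
        du = G(r, n, 0.5 - d + j * c) - G(r, n, 0.5 - d + (j - 1) * c)
        hi = G(r, n, 0.5 + j * c)
        lo = G(r, n, 0.5 + (j - 1) * c)
        approx += 0.5 * (hi + lo) * du
        err += 0.5 * (hi - lo) * du
    return approx, err
```

For n = 1, G(p) = p, so I = d* + 2d*² exactly and B₁ = cd*, matching the remark after (5.4).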
For n > 99 we need an asymptotic approximation, and we first note that for k = 2, t = 1, α = ½ and ε* = d*, (3.3) reduces to

(5.5)    Φ(x/√2) = P*,

where x = 2d*√(n+2). However this result gives an error of about 5% or more for n between 200 and 400, and hence a better approximation is required which does not drop all terms approaching zero. Applying (3.1) to the right side of (2.9) and setting x = 2d*√(n+2), k = 2, t = 1, α = ½ and ε* = d*, we find that the equation to be solved for x is

(5.6)    P* = Φ(2x)[1 − Φ(x)] + ∫_{−x}^{x} Φ(y+x) dΦ(y)
            = Φ(x/√2) − ∫_x^{∞} [Φ(y+x) − Φ(y−x)] dΦ(y) − [1−Φ(x)][1−Φ(2x)];

the last form shows that the root of (5.5) is a lower bound to the root of (5.6). Using the function V(λh,h), defined by

(5.7)    V(λh,h) = ∫_0^{λh} ∫_0^{x/λ} φ(y) φ(x) dy dx

and tabled in [2], we can write (5.6) exactly in the form
(5.8)    P* = ½ + ½[Φ(x/√2) − ½]² + V(x, 2x) + V(x/√2, 3x/√2) − [1−Φ(x)][1−Φ(2x)];

the details of this are easily worked out with the help of the introduction in [2] and are omitted here. Assuming that we have an approximate solution x₀ of (5.8) (say, the solution of (5.5)), we set x = x₀+ε in (5.8), expand in powers of ε, drop powers of 2 and higher, and solve the resulting equation for ε; then x = x₀+ε appears always to give a better approximation to the solution of (5.8); thus an iteration of Newton's approximation formula was used to solve (5.8). The last term on the right in (5.8) is very small and can be dropped; the resulting equation for x_{j+1} in terms of x_j is

(5.9)    x_{j+1} = x_j + [ P* − ½ − ½[Φ(x_j/√2) − ½]² − V(x_j, 2x_j) − V(x_j/√2, 3x_j/√2) ]
                         / [ (1/√2) φ(x_j/√2) [Φ(x_j/√2) + Φ(3x_j/√2) − 1] + φ(x_j) [Φ(2x_j) − ½] ].

For large h and λ not close to one, the tables for V(λh,h) in [2] are inadequate and it was found necessary to use the formula

(5.10)    V(λh,h) ≈ ¼[2Φ(λh) − 1] − (1/2π) arctan λ,

which is given in the introduction to [2].

Another equation whose root yields a close upper bound to the root of (5.6), and hence also of (5.8) for large x, but requires less calculation than the iteration (5.9), is

(5.11)    ½[Φ(x) + Φ(2x)] − φ(x)/(x²√(2π)) = P*.

This equation is useful to obtain a good (i.e., close) starting value for (5.9); we omit the derivation of (5.11).

In (5.9) we have indicated that for j ≥ 1 the trial value for the (j+1)-st iteration is the result x_{j+1} of the j-th iteration, but a slight variation sometimes gives better results. After the j-th iteration we know which x_i's (i ≤ j−1) are upper bounds; let y_{j-1} denote the smallest of these upper bounds. If we replace x_j in the right side of (5.9) by the average ȳ_j = (x_j + y_{j-1})/2 then the resulting iteration may converge even faster than (5.9).
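The improvement of (5.6) over (5.5) can be seen numerically by solving both; a Python sketch (ours) of the first form of (5.6), using quadrature and bisection rather than the V-function iteration (5.9):

```python
import math

def Phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi_inv(p):
    lo, hi = -10.0, 10.0
    for _ in range(80):
        m = 0.5 * (lo + hi)
        if Phi(m) < p:
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)

def rhs_56(x, steps=4000):
    """First form of (5.6): Phi(2x)[1-Phi(x)] + integral_{-x}^{x} Phi(y+x) dPhi(y),
    by the midpoint rule."""
    h = 2.0 * x / steps
    s = 0.0
    for i in range(steps):
        y = -x + (i + 0.5) * h
        s += Phi(y + x) * phi(y)
    return Phi(2.0 * x) * (1.0 - Phi(x)) + s * h

def root_56(P):
    """Solve (5.6) for x by bisection; n then follows from x = 2 d* sqrt(n+2)."""
    lo, hi = 0.0, 10.0
    for _ in range(60):
        m = 0.5 * (lo + hi)
        if rhs_56(m) < P:
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)
```

The root of (5.5), namely √2 Φ^{-1}(P*), indeed falls below the root of (5.6), which is the direction of the correction discussed in the text.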
Formulation 1B

For k = 2, t = 1, α = ½ and ε* = d*, using the right side of (2.13), the equation determining n for formulation 1B becomes

(5.12)    ∫_{½−d*}^{½+d*} G(u+d*) dG(u) + G(½−d*) + ½[G(½+2d*) − G(½+d*)]² = P*,

where G is the same as in (5.1). The only part that presents any difficulty in (5.12) is the integral, and it is the same integral I as the one already considered for formulation 1A above; hence all the results above on I can be used here also.

In the asymptotic case (i.e., for n > 99) equation (5.5) holds here also, but again it gives too large errors for the purposes of calculating tables. The equation corresponding to (5.6) is

(5.13)    P* = 1 − Φ(x) + ½[Φ(2x) − Φ(x)]² + ∫_{−x}^{x} Φ(y+x) dΦ(y),

and the root of (5.5) is a lower bound to the root of (5.13). Corresponding to (5.8) we can write (5.13) in the form

(5.14)    P* = ½ + ½[Φ(x/√2) − ½]² + V(x, 2x) + V(x/√2, 3x/√2) + ½[Φ(2x) − Φ(x)]².

As P* → 1 and/or d* = ε* → 0, the difference of the n-values required by formulations 1A and 1B tends to zero, as is also indicated by the tables.
6. Acknowledgement
The author wishes to thank Professor M. Haseeb Rizvi of Ohio State
University and Mr. George Woodworth of the University of Minnesota for
several helpful discussions in connection with this paper. The author is
also indebted to Miss Marilyn Huyett of the Bell Telephone Laboratories
and Mr. A. P. Basu of the University of Minnesota for their help with the
tables in this paper.
ADDENDA
For Formulation 2B the asymptotic (n → ∞, r/n → α) expression for the
right side of (2.19) is equivalent to that in (3.7) [by virtue of (3.5) and
(3.6)]. Hence (3.7) can also be used for Formulation 2B. However, a better
approximation, needed for table-making, is

(3.9)    P{CS|R} ≈ G_{t,k-t+1}( Φ( e*√(n+1) / √(α(1-α)) ) ) + G_{t,k-t+1}( Φ( -e*√(n+1) / √(α(1-α)) ) ),

and for k = 2, t = 1 this reduces to

(3.10)    P{CS|R} ≈ Φ²( e*√(n+1) / √(α(1-α)) ) + Φ²( -e*√(n+1) / √(α(1-α)) ).
For Formulations 1A and 1B we can also get an asymptotic expression for
P{CS|R₁} after making a variance-stabilizing substitution. We do not alter
the procedure but merely try to obtain an expression in which the convergence
to normality is faster. We assume that d* < min(e*, α) and that the F_{[r]} are
all differentiable with positive derivatives in the interval I. Then for each
j ≤ k-t, as we approach the LF configuration, F_{[k-t+1]}(F⁻¹_{[j]}(z_j)) → z_j - d*,
where z_j = F_{[j]}(y_{r,j}) → α; and hence for all r (1 ≤ r ≤ k) we make the
substitution

(3.12)    U_r = arc sin √( F_{[k-t+1]}(F⁻¹_{[r]}(z_r)) )    (1 ≤ r ≤ k),

where z_r = F_{[r]}(y_{r,r}) has the beta distribution β(r, n-r+1). Then, using
(3.11), we have asymptotically

(3.13)    Var U_r ≈ [4(n+2)]⁻¹,

(3.14)    E(U_j - U_i) ≈ arc sin √α - arc sin √(α-d*) = d' > 0

for i ≤ k-t and j ≥ k-t+1. Since arc sin √x is an increasing function of x
for 0 ≤ x ≤ 1,

(3.15)

and the P*-condition for determining n as d* → 0 becomes

(3.16)

This is consistent with (3.2) for very small values of d*; it was not used
in the computations of Table 1, where d* = e*.
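The variance-stabilizing property behind (3.13) is easy to check numerically: if z
has the β(r, n-r+1) distribution, then arc sin √z has variance close to 1/[4(n+2)]
whatever r/n is. A small Monte Carlo sketch (the sample size, replication count and
seed below are arbitrary illustrative choices, not values from the paper):

```python
import math
import random

def arcsine_variance(n: int, r: int, reps: int = 40000, seed: int = 7) -> float:
    """Monte Carlo variance of U = arcsin(sqrt(z)) with z ~ Beta(r, n - r + 1)."""
    random.seed(seed)
    us = [math.asin(math.sqrt(random.betavariate(r, n - r + 1)))
          for _ in range(reps)]
    mean = sum(us) / reps
    return sum((u - mean) ** 2 for u in us) / (reps - 1)

# (3.13) predicts Var U ~ 1/(4(n+2)), here 1/400 = 0.0025 for n = 98,
# both at r = 49 (alpha near 1/2) and at r = 25 (alpha near 1/4).
```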
Table 1

Number n of Observations Required per Population under Procedure R₁ for
Selecting from k = 2 Populations the One with the Largest Population Median.

[Here d* = e* and the condition satisfied by Procedure R₁ is
P{Correct Selection|R₁} ≥ P* whenever the distance d ≥ d*.]

         Formulation 1A        Formulation 1B        Formulation 2A        Formulation 2B
  P*    .05  .10  .15  .20    .05  .10  .15  .20    .05  .10  .15  .20    .05  .10  .15  .20
 .600    31    9    5    1     15    3    1    1     43   11    5    3     17    5    3    1
 .650    47   11    5    3     33    9    3    1     57   15    7    3     35    9    5    3
 .700    67   17    7    5     55   13    7    3     75   19    9    5     57   15    7    3
 .750    95   23   11    5     85   21    9    5     97   25   11    5     81   21    9    5
 .800   129   33   15    7    125   31   13    7    123   31   13    7    111   27   13    7
 .850   181   45   19   11    175   45   19   11    155   39   17    9    147   37   17    9
 .900   253   63   27   15    251   63   27   15    201   49   21   13    193   49   21   11
 .950   363   91   39   21    363   91   39   21    265   65   29   15    261   65   29   15
 .975   567  143   63   35    565  141   63   35    381   95   41   23    379   93   41   23
 .990   777  193   85   49    777  193   85   49    501  125   53   29    501  125   53   29
 .995  1087  271  127   69   1087  271  127   69    663  165   71   39    663  165   71   39

The column headings .05, .10, .15, .20 are the values of d* = e*. The entries
for Formulations 1A, 1B, 2A and 2B, respectively, are solutions of the equations
(2.9), (2.13), (2.17) and (2.19) with k = 2, t = 1, α = ½ and e* = d*. For
n > 150 the equations used were (5.9), (5.14), (3.8) and (3.10), respectively.
REFERENCES

[1] Bechhofer, R. E. (1954). A single-sample multiple-decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25, 16-39.

[2] Blanch, G., Fix, E., Germond, H. H. and Neyman, J. (1956). Tables of the bivariate normal distribution function and related functions. Applied Math. Series No. 50, Nat'l. Bur. of Standards, Washington, D.C.

[3] Gupta, S. S. (1963). Probability integrals of the multivariate normal and multivariate t. Ann. Math. Statist. 34, 792-828.

[4] Hajek, J. (1962). Asymptotically most powerful rank-order tests. Ann. Math. Statist. 33, 1124-1147.

[5] Lehmann, E. L. (1963). A class of selection procedures based on ranks. Math. Annalen 150, 268-275.

[6] Milton, Roy (1963). Tables of the equally correlated multivariate normal probability integral. Technical Report No. 27, Univ. of Minnesota, Minneapolis, Minn.

[7] Pearson, K. (1948). Tables of the incomplete beta function. The Biometrika Office, University College, London.

[8] Puri, M. L. (1964). Asymptotic efficiency of a class of c-sample tests. Ann. Math. Statist. 35, 102-121.

[9] Rizvi, H. (1963). Ranking and selection problems of normal populations using the absolute values of their means: fixed sample size case. Technical Report No. 31, Univ. of Minnesota, Minneapolis, Minn.
Selecting a Subset Containing the Population with the Largest α-Percentile

by M. HASEEB RIZVI and MILTON SOBEL
Ohio State Univ. and Univ. of Minnesota

1. Introduction

A nonparametric solution is considered for the problem of selecting a
subset of k populations which contains the one with the largest αth-
percentile. It is assumed that each population has associated with it a
continuous c.d.f. F_i(x) (i = 1,2,...,k) and that observations from the
same or different populations are all independent. We use the notation of
the companion paper [5] without redefining all the symbols and, in particular,
it is convenient to identify each population with its c.d.f.

2. Formulation of the Problem

A common fixed (i.e., given) number of observations are taken from
each of the k populations. A constant P* > 1/k is preassigned and we
are required to find a procedure R₁ such that

(2.1)    P{CS|R₁} ≥ P*

for all possible k-tuples (F₁, F₂, ..., F_k) for which

(2.2)    Min_{1≤j≤k} F_{[j]}(y) ≥ F_{[k]}(y)    for all y.

We denote this set of k-tuples (F₁, F₂, ..., F_k) by Ω₁.

To avoid being "too far out" in the tails we assume that the given
common sample size n is large enough so that

(2.3)    1 ≤ (n+1)α ≤ n,

and we define a positive integer r by the inequalities

(2.4)    r ≤ (n+1)α < r+1.

It follows that 1 ≤ r ≤ n.
We now define a procedure R₁ = R₁(c) in terms of a positive integer
c (1 ≤ c ≤ r-1) and the order statistics Y_{j,i}, where Y_{j,i} denotes the
jth order statistic from the population F_i(y) (j = 1,2,...,n; i = 1,2,...,k).

Procedure R₁: Put F_i in the selected subset if and only if

(2.5)    Y_{r,i} ≥ Max_{1≤j≤k} Y_{r-c,j},

where c is the smallest integer with 1 ≤ c ≤ r-1 for which (2.1) is satisfied.

We shall show that for any given α and k a value of c ≤ r-1 may
not exist for some pairs (n, P*), but if P* is chosen not greater than some
function P₁ = P₁(n,α,k), where 1/k < P₁ < 1, then a value of c ≤ r-1
does exist that satisfies (2.1). The function P₁ will be derived and
evaluated; in particular, we shall be interested to see for fixed k how
rapidly it approaches 1 as n increases.

[If we allow c = r and define Y_{0,j} ≡ -∞, then any value of P* ≤ 1
can be obtained by putting all the populations in the selected subset. Thus
P₁ represents the largest P* for which the (non-randomized) problem is
nondegenerate. If P₁ < P* < 1, randomization between the rule (2.5)
with c = r-1 and the degenerate rule (c = r) is desirable, but we shall not
consider any such rules in the discussion below.]
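In code, the rule (2.5) is a single comparison of order statistics. The sketch
below (hypothetical helper name; the sample data in the usage note are invented
for illustration) selects every population whose rth order statistic is at least
the largest (r-c)th order statistic:

```python
def select_subset(samples, r, c):
    """Procedure R1: keep population i iff Y_{r,i} >= max_j Y_{r-c,j}.
    `samples` is a list of k equal-length observation lists; r and c are
    the integers of (2.4)-(2.5), 1-based, with 1 <= c <= r-1."""
    ordered = [sorted(s) for s in samples]
    # Largest (r-c)th order statistic over all k populations.
    threshold = max(s[r - c - 1] for s in ordered)
    return [i for i, s in enumerate(ordered) if s[r - 1] >= threshold]
```

For example, with n = 5 and α = ½, (2.4) gives r = 3; taking c = 1 and three
samples of which one is clearly shifted upward, only the shifted population is
retained, while identically distributed samples are all retained.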
3. Probability of a Correct Selection for R₁

Using the notation of the companion paper [5], the probability of a
correct selection P{CS|R₁} is easily seen to be

(3.1)    P{CS|R₁} = ∫_{-∞}^{∞} Π_{i=1}^{k-1} H_{r-c,i}(y) dH_{r,k}(y),

where

(3.2)    H_{r,i}(y) = G_r( F_{[i]}(y) )

is the c.d.f. of the order statistic Y_{r,i}, with G_r defined in (3.3) below.
i
~
: I
I.I
' :
-
To find the infimum of (3.1) in Ω₁, we note that H_{r-c,i}(y) depends on
y only through F_{[i]}(y) and that it is an increasing function of F_{[i]}(y).
Hence we obtain the infimum by setting F_{[i]}(y) = F_{[k]}(y) for i = 1,2,...,k-1.
Making the transformation u = F_{[k]}(y) then gives

(3.3)    inf_{Ω₁} P{CS|R₁} = ∫_0^1 G_{r-c}^{k-1}(u) dG_r(u),

where G_r(u) = I_u(r, n-r+1) is the standard incomplete beta function. When
c = 0 the last member of (3.3) is clearly 1/k, and we wish to show that
G_{r-c}(u) is an increasing function of c (c = 1,2,...,r-1) for fixed r
and u, or that G_r(u) is decreasing in r for fixed u. Integrating G_r(u)
by parts gives

(3.4)    G_r(u) = G_{r+1}(u) + C(n,r) u^r (1-u)^{n-r},

where C(a,b) denotes a binomial coefficient; this shows the required
monotonicity. Hence the maximum value of P* for which a value of c
satisfying (2.1) exists is obtained by setting c = r-1 in (3.3), and this
easily reduces to

(3.5)

Hence for any P* < P₁(n,α,k) a value of c ≤ r-1 exists and it is unique,
so that the procedure R₁ is well defined for P* < P₁. A short table of
P₁-values for α = ½, so that r = (n+1)/2, and k = 2 is given at the
end of this paper. It can be shown that P₁ → 1 exponentially fast as n → ∞.
In fact, we can write for large n (with r/n → α)

(3.6)    P₁ ≈ 1 - (k-1) C₀ [ (1+α')^{1+α'} / (4(α')^{α'}) ]^n,

where α' = 1-α and the constant C₀ does not depend on k or n; a detailed
derivation of (3.6) is omitted. Since 1+α' and [(1+α')/α']^{α'} are both
increasing in α' (0 < α' < 1), their product is also increasing, and hence
the maximum value of the quantity in square brackets in (3.6) is one; this
shows that P₁ → 1 exponentially fast.

Of course, the same phenomenon described above can be interpreted from
another point of view, namely, that for a given P* > 1/k there is a minimum
common number n of observations required to satisfy (2.1). Table 1 also
gives the required common value of n for α = ½ and k = 2 if we take P₁
to be the specified value of P*.

For small values of k and any α satisfying (2.3) the integral in
(3.3) can be written more explicitly and, in particular, for k = 2 we obtain
after algebraic simplification the three equivalent expressions

(3.7)    inf_{Ω₁} P{CS|R₁, k = 2} = ∫_0^1 G_{r-c}(u) dG_r(u)

             = ½ + Σ_{i=r-c}^{r-1} C(r-1+i, r-1) C(2n-r-i, n-r) / C(2n, n)

             = Σ_{i=r-c}^{n} C(r-1+i, r-1) C(2n-r-i, n-r) / C(2n, n).

The last expression is more useful for computing if r is small and the
middle one is more useful if c is small. A table of r-c values satisfying
(2.1) for α = ½, so that r = (n+1)/2, and selected values of n and P* is
given at the end of this paper.
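The middle expression in (3.7) can be evaluated exactly in integer arithmetic.
As a sanity check, at c = r-1 it must agree with the values P₁(n) tabulated at
the end of this paper, e.g. P₁(3) = .800000 and P₁(5) = .916667. A short sketch
(hypothetical function name):

```python
from math import comb

def inf_pcs_k2(n: int, r: int, c: int) -> float:
    """Middle member of (3.7): exact inf P{CS | R1} for k = 2."""
    total = comb(2 * n, n)
    s = sum(comb(r - 1 + i, r - 1) * comb(2 * n - r - i, n - r)
            for i in range(r - c, r))
    return 0.5 + s / total
```

For n = 3 (r = 2, c = 1) this gives ½ + 6/20 = 0.8, and for n = 5 (r = 3, c = 2)
it gives ½ + 105/252 = 11/12 ≈ 0.916667, matching the P₁ table.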
4. Expected Size of the Selected Subset
The size S of the subset selected by procedure R1 is, of course,
a random variable and its expected value may be taken as a measure of the
efficiency of R1 , with small values corresponding to high efficiency. In
this section we derive an expression for E{S|R₁} and show that the maximum
value in Ω₁ is attained when the c.d.f.'s are all identical.
It is easily seen that for any c (1 ~ c ~ r-1)
(4.1)    E{S|R₁} = Σ_{i=1}^{k} P{F_{[i]} is included in the subset | R₁}
                 = Σ_{i=1}^{k} ∫_{-∞}^{∞} Π_{j≠i} H_{r-c,j}(y) dH_{r,i}(y).

For the special configuration

(4.2)    F_{[1]}(y) = F_{[2]}(y) = ... = F_{[k-1]}(y)

this reduces to

(4.3)    E{S|R₁} = ∫_{-∞}^{∞} H_{r-c,1}^{k-1}(y) dH_{r,k}(y)
                 + (k-1) ∫_{-∞}^{∞} H_{r-c,k}(y) H_{r-c,1}^{k-2}(y) dH_{r,1}(y),

and, if we set all of the F_{[i]}'s equal, we obtain

(4.4)    E{S|R₁} = k ∫_{-∞}^{∞} H_{r-c,k}^{k-1}(y) dH_{r,k}(y);

for c equal to the integer c-value satisfying (2.1) the right side of (4.4)
is approximately kP*.

To prove that E{S|R₁} is a maximum when F_{[1]}(y) = F_{[2]}(y) = ... = F_{[k]}(y)
for all y, we first prove the

Lemma: For any two c.d.f.'s F₁(y) and F₂(y) such that F₁(y) = F_{[1]}(y) ≥
F_{[2]}(y) = F₂(y) for all y and for any integer c with 1 ≤ c ≤ r-1,

(4.5)    Q = H_{r-c,1}(y) H_{r,2}(y) - H_{r,1}(y) H_{r-c,2}(y)

is nonincreasing in F_{[1]}(y) for any fixed F_{[2]}(y).

Proof: Differentiating Q with respect to F₁(y), it is sufficient to show that

(4.6)    Q₁ = H_{r,2}(y) dH_{r-c,1}(y) - H_{r-c,2}(y) dH_{r,1}(y) ≤ 0.
We show (4.6) by letting F₂(y) → 0 for all y, so that Q₁ → 0, and showing
that 0 is the maximum value of Q₁ for F₁(y) ≥ F₂(y). For this purpose we
differentiate Q₁ with respect to F₂(y) and obtain (neglecting common factors)

(4.7)    Q₁₂ = F₁^{r-c-1}(y)[1-F₁(y)]^{n-r+c} F₂^{r-1}(y)[1-F₂(y)]^{n-r}
             - F₁^{r-1}(y)[1-F₁(y)]^{n-r} F₂^{r-c-1}(y)[1-F₂(y)]^{n-r+c}
             = F₁^{r-c-1}(y)[1-F₁(y)]^{n-r} F₂^{r-c-1}(y)[1-F₂(y)]^{n-r}
               { [1-F₁(y)]^c F₂^c(y) - F₁^c(y)[1-F₂(y)]^c }.

It follows from (4.7) that Q₁₂ ≤ 0 whenever F₂(y) ≤ F₁(y) for all y,
and this proves the lemma.

Theorem: E{S} is a maximum in Ω₁ when

(4.8)    F_{[1]}(y) = F_{[2]}(y) = ... = F_{[k]}(y)    for all y.

Proof: Consider the pair F_{[i]} and F_{[k]} for any particular i ≠ k; we
wish to show that if F_{[i]}(y) decreases on any subset of y values without
violating (2.2), then E{S} is not decreased. Integrating by parts in
the general expression (4.1) for E{S} gives

(4.9)    E{S} = Σ_{j=1}^{k-1} ∫_{-∞}^{∞} Π_{σ≠j} H_{r-c,σ}(y) dH_{r,j}(y)
              + ∫_{-∞}^{∞} Π_{σ=1}^{k-1} H_{r-c,σ}(y) dH_{r,k}(y)

              = 1 - Σ_{j=1}^{k-1} ∫_{-∞}^{∞} Π_{σ≤k-1, σ≠j} H_{r-c,σ}(y)
                [ H_{r,k}(y) dH_{r-c,j}(y) - H_{r-c,k}(y) dH_{r,j}(y) ].

For fixed F_{[k]}(y) we can disregard all terms except j = i, since in the
jth term (j ≠ i), if F_{[i]}(y) decreases then H_{r-c,i}(y) decreases and hence
E{S} increases. Hence we need only show that

(4.10)    ∫_{-∞}^{∞} Π_{σ≤k-1, σ≠i} H_{r-c,σ}(y)
          [ H_{r,k}(y) dH_{r-c,i}(y) - H_{r-c,k}(y) dH_{r,i}(y) ]

is a nondecreasing function of F_{[i]}(y) for fixed F_{[k]}(y) in Ω₁. Integrating
by parts again we find that it is sufficient to show that both of the functions

(4.11)    H_{r-c,i}(y) H_{r,k}(y) - H_{r,i}(y) H_{r-c,k}(y)

and

(4.12)    H_{r-c,i}(y) dH_{r,k}(y) - H_{r,i}(y) dH_{r-c,k}(y)

are nonincreasing functions of F_{[i]}(y) for fixed F_{[k]}(y) in Ω₁. This
was shown for (4.11) in the lemma; to show this for (4.12) we differentiate
(4.12) with respect to F_{[i]}(y) and the proof is exactly the same as in
(4.7). This completes the proof of the theorem.
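The theorem can be illustrated numerically: under R₁ the expected subset size is
largest when all k c.d.f.'s coincide. A Monte Carlo sketch with invented parameters
(k = 3, n = 5, r = 3, c = 1, uniform populations, arbitrary seed):

```python
import random

def mean_subset_size(shift: float, reps: int = 5000, k: int = 3,
                     n: int = 5, r: int = 3, c: int = 1) -> float:
    """Average size of the subset selected by R1 when population 0 is
    uniform(shift, shift + 1) and the others are uniform(0, 1)."""
    random.seed(42)
    total = 0
    for _ in range(reps):
        samps = [sorted(random.random() + (shift if i == 0 else 0.0)
                        for _ in range(n)) for i in range(k)]
        threshold = max(s[r - c - 1] for s in samps)  # rule (2.5)
        total += sum(s[r - 1] >= threshold for s in samps)
    return total / reps

# E{S|R1} should be largest in the equal-c.d.f. configuration (shift = 0).
```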
5. Asymptotic Results and Asymptotic Relative Efficiency (ARE)

Asymptotic (n → ∞) expressions for inf_{Ω₁} P{CS|R₁} and E{S|R₁} will
be obtained for fixed P*. A definition is given for the asymptotic relative
efficiency ARE(R',R'';A) of a procedure R' relative to another procedure R''
under the alternatives A. Our procedure R₁ for α = ½ is then compared to others,
e.g. the one based on the sample means, under general alternatives including
the normal case.

For n → ∞ with r/n → α, let β denote the value of c/(n+1) for
large n and fixed k, P*; we use the fact that

(5.1)    G_r(u) ≈ Φ( (u - α)√(n+2) / √(α(1-α)) ),

where Φ(·) is the standard normal c.d.f. Then by (3.3)

(5.2)    inf_{Ω₁} P{CS|R₁} ≈ ∫_{-∞}^{∞} Φ^{k-1}( (y√(α(1-α)) + β√(n+2)) / √((α-β)(1-α+β)) ) dΦ(y).

Since this remains equal to P* < 1 as n → ∞, it follows that β → 0.
Hence if S = S(k,P*) is the root of

(5.3)    ∫_{-∞}^{∞} Φ^{k-1}(y+S) dΦ(y) = P*,

then the rate at which β → 0 (and c → ∞) is given by

(5.4)    β√n → S√(α(1-α))    as n → ∞.

The value of S (or H = S/√2) has been extensively tabulated by Bechhofer
[1], Gupta [3] and Milton [4]. Treating (5.4) as an equality and solving
for β, we can then use c = (n+1)β to make the procedure R₁ explicit, and
(2.1) will then be satisfied for large n.

For E{S|R₁} we obtain from (4.4) the maximum value kP*; for other
asymptotic results we assume a particular class of alternatives, e.g., (NA)
denotes that F_i is normal with mean θ_i and variance 1 (i = 1,2,...,k).
As n → ∞ with r/n → α, the distribution H_{r,i}(y) of the rth order statistic
from F_{[i]} is asymptotically normal with mean θ_{[i]} + ξ_α and variance
α(1-α)/(n f²_{[i]}(ξ_α)), where the density f_{[i]}(·) is the derivative of F_{[i]}(x)
and ξ_α = ξ_α(F_{[i]}) is the αth-percentile of F_{[i]}, i.e.,

(5.5)    H_{r,i}(y) → Φ( (y - θ_{[i]} - ξ_α) √n f_{[i]}(ξ_α) / √(α(1-α)) ).

We note that for any f(t) that is continuous at ξ_α we have

(5.6)    α - β = ∫_{-∞}^{ξ_{α-β}} f(t) dt = α - ∫_{ξ_{α-β}}^{ξ_α} f(t) dt ≈ α - (ξ_α - ξ_{α-β}) f(ξ_α),

since β → 0 as n → ∞; hence we can write ξ_{α-β} ≈ ξ_α - β/f(ξ_α). For the
particular configuration with θ_{[k]} = 0 and all others equal to -θ
(where θ > 0), we obtain from (4.3), (5.4), (5.5) and (5.6)

(5.7)    E{S|R₁} ≈ ∫_{-∞}^{∞} Φ^{k-1}(y + S + θ_n) dΦ(y)
                 + (k-1) ∫_{-∞}^{∞} Φ(y + S - θ_n) Φ^{k-2}(y + S) dΦ(y),

where θ_n = θ√n f(ξ_α)/√(α(1-α)) and S is defined by (5.3). It is easy to
verify that E{S|R₁} → 1; our present aim is to obtain an asymptotic expression
for the smallest value of n required to satisfy

(5.8)    E{S|R₁} ≤ 1 + ε

for positive values of ε near zero. From (5.7), using the fact that
(1-x)^k ≈ 1-kx for x small and the Feller-Laplace normal approximation, we have
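The constant S(k,P*) of (5.3) is simple to compute by quadrature and bisection.
For k = 2 and α = ½ the limiting values S√(α(1-α)) = S/2 appear in the table at
the end of the paper, e.g. 0.476936 at P* = .75 and 0.906194 at P* = .90. A sketch
(the grid limits and step count are arbitrary numerical choices):

```python
import math

def Phi(x: float) -> float:
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_star_of_S(S: float, k: int, lo: float = -8.0, hi: float = 8.0,
                steps: int = 4000) -> float:
    """Integral in (5.3), by the trapezoidal rule."""
    h = (hi - lo) / steps
    def f(y):
        return Phi(y + S) ** (k - 1) * math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)
    return h * (0.5 * f(lo) + sum(f(lo + i * h) for i in range(1, steps)) + 0.5 * f(hi))

def S_of(k: int, p_star: float) -> float:
    """Root of (5.3) in S, by bisection (the integral increases with S)."""
    lo, hi = 0.0, 6.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if p_star_of_S(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For k = 2 the closed form S = √2 Φ⁻¹(P*) provides an independent check of the
quadrature.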
(5.9)    E{S|R₁} - 1 ≈ (k-1) [ ∫_{-∞}^{∞} Φ(y + S - θ_n) Φ^{k-2}(y + S) dΦ(y) - Φ( (-S-θ_n)/√2 ) ]
                     ≈ (k-1) φ( (S - θ_n)/√2 ) / (θ_n √2).

If we set this equal to ε > 0 then the asymptotic value n₁(ε) of n
required by procedure R₁ is given by

(5.10)    n₁(ε) ≈ α(1-α) ( S - √2 λ_{ε'} )² / ( θ² f²(ξ_α) ),

where ε' = 2ε/(k-1) and λ_{ε'} is the ε'-percentile of Φ, i.e., λ_{ε'} = Φ⁻¹(ε').
We now apply the same analysis to the procedure R₂ based on sample means
(see Gupta [3] for a detailed development and an up-to-date set of references).
We assume that the F_i have a finite, known and (for simplicity) common
variance σ². To ensure that procedure R₂ is accomplishing the same goal as
R₁, we also assume that the asymptotic normal distribution of the mean X̄ from
the population with c.d.f. F_{[i]} has mean θ_{[i]}, where θ_{[1]} ≤ θ_{[2]} ≤ ... ≤ θ_{[k]};
this certainly holds if we are dealing with a location parameter.

The procedure R₂ is to put the ith population Π_i in the selected subset iff

(5.11)    X̄_i ≥ Max_{1≤j≤k} X̄_j - d,

where d > 0 is chosen to satisfy the P*-condition (2.1). Here, for
specified P*, we find that d = Sσ/√n, where S = S(k,P*) is defined by (5.3).
For the same configuration with θ_{[k]} = 0 and all the others equal to -θ
(where θ > 0) we obtain, corresponding to (5.7) and (5.9), letting θ'_n = θ√n/σ,

(5.12)    E{S|R₂} ≈ ∫_{-∞}^{∞} Φ^{k-1}(y + S + θ'_n) dΦ(y)
                  + (k-1) ∫_{-∞}^{∞} Φ(y + S - θ'_n) Φ^{k-2}(y + S) dΦ(y),

(5.13)    E{S|R₂} - 1 ≈ ½(k-1) Φ( (S - θ'_n)/√2 ).

Setting this equal to ε > 0 we find that the value n₂(ε) of n required
by procedure R₂ is

(5.14)    n₂(ε) ≈ σ² ( S - √2 λ_{ε'} )² / θ².

We define the asymptotic relative efficiency ARE(R',R'';A) of R' relative
to R'' for the alternatives A to be the limit as ε → 0 of the ratio of
n''(ε) to n'(ε); hence

(5.15)    ARE(R₁,R₂;A) = lim_{ε→0} n₂(ε)/n₁(ε) = σ² f²(ξ_α) / (α(1-α)),

which is independent of ε for any alternative.

For α = ½ and the normal shift alternatives (NA) with σ = 1 we obtain

(5.16)    ARE(R₁,R₂;NA) = 2/π.

For α = ½ and the two-sided exponential shift alternatives (TEA) with median
zero and f(0) = ½ we obtain

(5.17)    ARE(R₁,R₂;TEA) = 2;

thus R₁ is asymptotically twice as efficient as the procedure R₂ under TEA.
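Formula (5.15) with α = ½ yields (5.16) and (5.17) directly, since ξ_{1/2} = 0 for
a distribution with median zero. A small sketch evaluating σ²f²(0)/(α(1-α)) for
the normal and two-sided exponential (Laplace) cases:

```python
import math

def are_r1_r2(sigma2: float, f_at_median: float, alpha: float = 0.5) -> float:
    """ARE(R1, R2; A) = sigma^2 f^2(xi_alpha) / (alpha (1 - alpha)), eq. (5.15)."""
    return sigma2 * f_at_median ** 2 / (alpha * (1.0 - alpha))

# (NA): standard normal, sigma^2 = 1, f(0) = 1/sqrt(2 pi)  ->  2/pi.
are_na = are_r1_r2(1.0, 1.0 / math.sqrt(2.0 * math.pi))
# (TEA): Laplace density f(x) = (1/2) e^{-|x|}, f(0) = 1/2, sigma^2 = 2  ->  2.
are_tea = are_r1_r2(2.0, 0.5)
```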
We can also compare the procedure R₁ with both non-randomized and randomized
rank sum (or score) procedures, which have been applied to the problem of
selecting a subset containing the best population by Bartlett [1]. The basic
procedures are defined in Lehmann [5] in terms of a function J(x), defined
for 0 ≤ x ≤ 1, whose inverse J^{(-1)}(x) = H(x) is a c.d.f. For some fixed
function J(x) we can define the score for the ith population by

(5.18)    S_{N,i} = (1/n) Σ_{j=1}^{n} J( R_{i,j}/(N+1) )    (i = 1,2,...,k),

where R_{i,j} denotes the rank of the jth observation X_{i,j} from the ith
population Π_i in the combined sample of N observations; we are assuming
that there is a common number n from each population, so that N = kn. Then
the non-randomized procedure R' = R'(J) based on these scores is to put
Π_i in the selected subset iff

(5.19)    S_{N,i} ≥ Max_{1≤j≤k} S_{N,j} - c',

where c' is a positive number chosen to satisfy the P*-condition (2.1).
The randomized procedure R'' is similarly defined in terms of randomized
scores but, since the asymptotic properties of R'' are equivalent to those of
R', we need not define R'' here. The c.d.f.'s F_i(x) are now assumed to be
of the form F(x-θ_i), so that we are dealing with the selection problem for
a location parameter. We shall use the result from [1] that for asymptotically
small d* (our θ), the common (large) sample size n' that would be needed
for R' (and also for R'') to satisfy both (2.1) and (5.8) is

(5.20)    n' ≈ [ A d / ( θ ∫_{-∞}^{∞} J'(F(x)) f²(x) dx ) ]²,

where (his) d in (5.20) is the root in θ_n of the equation obtained by
setting the right side of (5.7) equal to 1 + ε, and where

(5.21)    A² = ∫_0^1 J²(u) du - ( ∫_0^1 J(u) du )².

It should be noted that n' in (5.20) represents the value of n that would be
needed if one knew that the true configuration were given by θ_{[k]} - θ_{[i]} = θ (i = 1,2,...,k-1)
for some fixed θ not depending on i. The limiting process in [1] is to
keep ε > 0 fixed and let θ = d √(α(1-α)) / (f(ξ_α)√n), where d is the root of
the equation obtained by setting the right side of (5.7) equal to 1 + ε.
This is a different limiting process from the one we have defined, and hence
we write ≈ above and shall write AREB for any ratio based on such a
limit; in general, these limits will be functions of ε.

If we carry out this same limiting process for our procedure R₁, then
using (5.7) we obtain

(5.22)    n₁ ≈ α(1-α) d² / ( θ² f²(ξ_α) ),

where d is the same as in (5.20). Hence, for translation alternatives (TA),
we have the result

(5.23)    AREB(R₁, R'(J); TA) = A² f²(ξ_α) / ( α(1-α) [ ∫_{-∞}^{∞} J'(F(x)) f²(x) dx ]² ),

which is independent of ε.

If we take J^{(-1)} = H = Φ and consider normal alternatives (NA), then
(5.23) with α = ½ gives the same result as in (5.16), viz.

(5.24)    AREB(R₁, R'(J); NA) = 2/π.

For the same H, if we consider the TEA, then (5.23) with α = ½ gives,
after using symmetry and straightforward methods of integration,

(5.25)    AREB(R₁, R'(J); TEA) = (1/2)² / [ (1/4)(√(2/π))² ] = π/2.

If we take J^{(-1)} = H = U, the standard uniform distribution, and
consider logistic alternatives (LA), then (5.23) with α = ½ gives

(5.26)    AREB(R₁, R'(J); LA) = (1/12)(1/4)² / [ (1/4)(1/6)² ] = 3/4;

for the logistic alternatives it is known that this "score" statistic with
J^{(-1)} = H = U reduces to the Wilcoxon statistic and gives the locally most
powerful test. For the same H, if we consider the TEA, then (5.23) with
α = ½ gives

          (1/12)(1/2)² / [ (1/4)(1/4)² ] = 4/3.

We note with curiosity (but without explanation) that the last two numbers
are reciprocals and the two numbers before that are also reciprocals.

It can be shown (we omit the proof) that the efficiencies in both (5.15) and
(5.23), considered as a function of α for 0 ≤ α ≤ 1, attain a maximum at
α = ½ for all three of the alternatives considered above, i.e., for NA, LA
and TEA. A sufficient condition for this is that f(x) be symmetric about
x = 0 and that f²(x)/[F(x)(1-F(x))] be decreasing for x ≥ 0; these properties
hold in each of the three different alternatives considered above.
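The two Wilcoxon-score values above can be reproduced mechanically from (5.23)
with J(u) = u (so J' = 1 and A² = 1/12), evaluating ∫ f²(x) dx numerically; the
logistic and Laplace densities below are the standard scale-one choices:

```python
import math

def integral_f_squared(f, lo=-40.0, hi=40.0, steps=40000):
    """Trapezoidal approximation to the efficacy integral of (5.23) when
    J(u) = u, i.e. J'(F(x)) = 1, so the integral is just int f(x)^2 dx."""
    h = (hi - lo) / steps
    vals = [f(lo + i * h) ** 2 for i in range(steps + 1)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

def are_b_wilcoxon(f, alpha=0.5):
    """AREB(R1, R'(J); TA) of (5.23) for J(u) = u, where A^2 = 1/12."""
    A2 = 1.0 / 12.0
    eff = integral_f_squared(f)
    return A2 * f(0.0) ** 2 / (alpha * (1.0 - alpha) * eff ** 2)

logistic = lambda x: math.exp(-x) / (1.0 + math.exp(-x)) ** 2
laplace = lambda x: 0.5 * math.exp(-abs(x))
```

For the logistic density (∫f² = 1/6, f(0) = 1/4) this gives 3/4, and for the
Laplace density (∫f² = 1/4, f(0) = 1/2) it gives 4/3, agreeing with the two
displays above.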
6. Dual Problem

Consider the dual problem of selecting a subset containing the population
with the smallest α-percentile. For specified P* > 1/k we now require a
procedure R₁' that satisfies

(6.1)    P{CS|R₁'} ≥ P*

for all possible k-tuples (F₁, F₂, ..., F_k) for which

(6.2)    Max_{1≤j≤k} F_{[j]}(y) ≤ F_{[1]}(y)    for all y;

denote the set of all such k-tuples by Ω₁'. We again assume (2.3) and define
r by (2.4). The procedure dual to procedure R₁ of Section 2 is

Procedure R₁': Put F_i in the selected subset if and only if

(6.3)    Y_{r,i} ≤ Min_{1≤j≤k} Y_{r+c,j},

where c is the smallest integer with 1 ≤ c ≤ n-r for which (6.1) is satisfied.
It will be convenient to use also the notation H_r(F_{[i]}(y)) for H_{r,i}(y).
For this problem equation (3.1) is replaced by

(6.4)    P{CS|R₁'} = ∫_{-∞}^{∞} Π_{j=2}^{k} { 1 - H_{r+c,j}(y) } dH_{r,1}(y)
                   = ∫_{-∞}^{∞} Π_{j=2}^{k} H_{n-r+1-c}( 1 - F_{[j]}(y) ) dH_{n-r+1}( 1 - F_{[1]}(y) ).

If the F_i(y) are all symmetric about y = 0 then for any k, by
making the transformation x = -y, we obtain

(6.5)    P{CS|R₁'} = ∫_{-∞}^{∞} Π_{j=2}^{k} H_{n-r+1-c}( F_{[j]}(x) ) dH_{n-r+1}( F_{[1]}(x) ),

and we note that the problem is equivalent to selecting a subset containing
the largest (1-α)-percentile (α ≠ ½).

Equation (3.3) is replaced by

(6.6)    inf_{Ω₁'} P{CS|R₁'} = ∫_0^1 [ 1 - G_{r+c}(u) ]^{k-1} dG_r(u)
                             = ∫_0^1 G_{n-r+1-c}^{k-1}(u) dG_{n-r+1}(u).

If α = ½ then n-r+1 = r, and hence for any k we note that (6.6) reduces
to (3.3). Hence for α = ½ the tables for determining c are the same for
both problems; this is not true in general for α ≠ ½. In particular, the
tables we have computed for k = 2, α = ½ can also be used for the problem
of selecting a subset containing the population with the smallest median.

Other discussions for this problem are similar to those of the dual
problem and are omitted.
7. Property of Monotonicity or Unbiasedness

Let P_i denote the probability that R₁ retains F_i in the selected
subset. We wish to prove the

Theorem: For any two c.d.f.'s F_i(·) and F_j(·) such that

(7.1)    F_i(y) ≥ F_j(y)    for all y,

we have P_i ≤ P_j.

Proof: Letting M denote the set {m : m = 1,2,...,k; m ≠ i, m ≠ j}, we use the
fact that H_{r,i}(y) is an increasing function of F_{[i]}(y) and then integrate
by parts to obtain

(7.2)    P_i - P_j = ∫_{-∞}^{∞} Π_{m∈M} H_{r-c,m}(y) { H_{r,j}(y) dH_{r-c,i}(y) - H_{r,i}(y) dH_{r-c,j}(y) }

                   + Σ_{m'∈M} ∫_{-∞}^{∞} Π_{m∈M, m≠m'} H_{r-c,m}(y) H_{r-c,i}(y) { H_{r,j}(y) - H_{r,i}(y) } dH_{r-c,m'}(y).

Since F_i(y) ≥ F_j(y) for all y, it follows that H_{r,i}(y) ≥ H_{r,j}(y) for
all y, and every term in the last member of (7.2) is non-positive; this proves
the theorem.

It follows that under our assumption (2.2) the probability of including
F_{[k]} in the selected subset is not less than the probability of including
any other F_{[j]} in the selected subset, i.e., the procedure R₁ is unbiased.
A similar result holds for the dual problem; the proof is similar to the
above proof and is omitted.
8. Acknowledgement
The authors wish to thank Mr. George Woodworth of the University of
Minnesota for several useful discussions in connection with this paper.
Thanks are also due to Mr. of Ohio State University for
his help in carrying out the computations in the tables.
- 15 -
Table 1 I I.,;
~
.-I
Values of P1
(n) fork= 2 and a=\ w
and the smallest value of n for which P1
(n) ~ p* ~
P1(n) = 1 - (n)/( 2n) where r = n+l
r r 2 ad
.... n I pl (n)
1 I .500000 .,J
3 .800000
5 .916667 ... 7 . 965035 9 .985294 -11 .993808
13 .997391 I
15 .998901 '-'
I
I ' ~-
._J
~
~
p* .500 .550 .600 .650 .700 .Boo .850 .900 .950 .975 .990
Smallest n w for which 1 3 3 3 3 3 5 5 7 9 11 Pl (n) ~ P* -
~
tr
~
~
...,
- 16 -~
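The formula displayed with the table is a one-line computation; the sketch below
transcribes it directly and also searches for the smallest odd n meeting a given
P* (helper names are ours):

```python
from math import comb

def P1(n: int) -> float:
    """P1(n) = 1 - C(n,r)/C(2n,r) with r = (n+1)/2, for k = 2 and alpha = 1/2."""
    r = (n + 1) // 2
    return 1.0 - comb(n, r) / comb(2 * n, r)

def smallest_n(p_star: float) -> int:
    """Smallest odd n with P1(n) >= p_star."""
    n = 1
    while P1(n) < p_star:
        n += 2
    return n
```

This reproduces the tabulated values, e.g. P₁(3) = .800000, P₁(5) = .916667,
and smallest n of 7 for P* = .950 and 11 for P* = .990.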
Table 2

Largest values of r-c for which P{CS} ≥ P* (k = 2, α = ½, r = (n+1)/2).

      n     r  |  .750   .900   .950   .975   .990
      5     3  |    1      1      †      †      †
     15     8  |    6      4      3      2      2
     25    13  |   10      8      7      6      5
     35    18  |   15     12     11     10      8
     45    23  |   19     16     15     13     12
     55    28  |   24     21     19     17     16
     65    33  |   29     25     23     22     20
     75    38  |   33     30     28     26     24
     85    43  |   38     34     32     30     28
     95    48  |   43     39     36     34     32
    145    73  |   67     62     59     56     53
    195    98  |   91     85     81     78     75
    245   123  |  115    108    104    101     97
    295   148  |  139    132    128    124    119
    345   173  |  164    156    151    147    142
    395   198  |  188    180    174    170    165
    445   223  |  212    203    198    193    188
    495   248  |  237    227    222    217    211

Values‡ of c√n/(n+1) ≈ S√(α(1-α)) = S/2 (see (5.4)):

     95    48  |  .507646   .913762  1.218349  1.421407  1.624466
    195    98  |  .498723   .926200  1.211184  1.424923  1.638661
    295   148  |  .522230   .928409  1.160511  1.392613  1.682741
    395   198  |  .501884   .903391  1.204522  1.405275  1.656217
    495   248  |  .493416   .941977  1.166257  1.390537  1.659673
     ∞     ∞   |  .476936   .906194  1.163087  1.385903  1.644976

† Degenerate cases in which both the populations go into the selected subset
with probability one.

‡ The erratic behavior in the approach of c√n/(n+1) to a limit as n → ∞ is
due to the fact that c is forced to be an integer under R₁. This suggests that
we might use a procedure R₁'' that allows randomization between r-c and r-c+1
to get P{CS|R₁''} = P* exactly. Then we get an improvement, because
E{S|R₁} - E{S|R₁''} is positive, although it will usually be small; tables for
the randomized procedure have not been constructed.
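Table 2's r-c entries can be regenerated from the middle expression of (3.7):
for each n, take the smallest c whose exact P{CS} reaches P*. A sketch for
k = 2, α = ½ only (helper names are ours; `None` marks the degenerate cases):

```python
from math import comb

def inf_pcs(n: int, r: int, c: int) -> float:
    """Middle member of (3.7): exact inf P{CS | R1} for k = 2."""
    total = comb(2 * n, n)
    s = sum(comb(r - 1 + i, r - 1) * comb(2 * n - r - i, n - r)
            for i in range(r - c, r))
    return 0.5 + s / total

def largest_r_minus_c(n: int, p_star: float):
    """Largest r-c with P{CS} >= p_star, or None in the degenerate case."""
    r = (n + 1) // 2
    for c in range(1, r):
        if inf_pcs(n, r, c) >= p_star:
            return r - c
    return None
```

For n = 15 this returns 6 at P* = .750 and 3 at P* = .950, and for n = 5 at
P* = .990 it returns None, matching the table.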
References

[1] Bartlett, N. (1965). Ranking and selection procedures for continuous and certain discrete populations. Ph.D. Thesis, Case Institute of Technology.

[2] Bechhofer, R. E. (1954). A single-sample multiple-decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25, 16-39.

[3] Gupta, S. S. (1965). On some multiple decision (selection and ranking) rules. Technometrics 7, 225-245.

[4] Gupta, S. S. (1963). Probability integrals of the multivariate normal and multivariate t. Ann. Math. Statist. 34, 792-828.

[5] Lehmann, E. L. (1963). A class of selection procedures based on ranks. Math. Annalen 150, 268-275.

[6] Milton, Roy (1963). Tables of the equally correlated multivariate normal integral. Technical Report No. 27, University of Minnesota, Minneapolis, Minnesota.

[7] Sobel, Milton (1966). Selecting the population with the largest α-percentile. Submitted to Ann. Math. Statist. (for this issue).
- JJj -
L
I I ... I I
l..i
I I .. I I
!Mi
I
41111
I ~
I ... I ... i .. I
...i
I i
w