Optimal estimation methods under weakened linear restrictions in regression



Computational Statistics & Data Analysis 13 (1992) 527-536

North-Holland

H. Toutenburg

G. Trenkler *

Received July 1990. Revised October 1991.

Abstract: This paper deals with the linear regression model under additional stochastic linear restrictions when the dispersion matrix of the restriction disturbances is unknown and the traditional mixed model approach is not applicable. The concept of weak unbiasedness is introduced. Since the optimal weakly unbiased estimators are not operational, some replacement strategies for the unknown quantities are discussed.

Keywords: Linear model; Stochastic linear restrictions; Mixed estimator; Quadratic risk; Weak unbiasedness.

1. Introduction

In the context of the general linear regression model it may occur that some information is available on the parameters, but not in an exact form.

A very popular approach to incorporate information of the form $r = R\beta + \phi$ is given by the so-called mixed estimator. However, this estimator heavily depends on the knowledge of the dispersion matrix of $\phi$, which is not realistic. In the following we suggest some alternative estimation procedures which only make use of the given vector $r$.

Correspondence to: G. Trenkler, Department of Statistics, University of Dortmund, P.O. Box 500500, 4600 Dortmund 50, Germany.

* Support by the Deutsche Forschungsgemeinschaft (grant number TR253/1-1) is gratefully acknowledged.




2. Linear regression under stochastic linear restrictions

Consider the linear regression model

$$y = X\beta + \varepsilon, \qquad (2.1)$$

where $y$ is an $n \times 1$ vector of observations on the dependent variable, $X$ is an $n \times p$ matrix of observations on $p$ explanatory variables of full column rank, $\varepsilon$ is an $n \times 1$ vector of unobservable disturbances, and $\beta$ is an unknown $p \times 1$ vector of parameters. We assume $E(\varepsilon) = 0$ and $\text{Cov}(\varepsilon) = \sigma^2 W$, where the positive definite $n \times n$ matrix $W$ is given, but $\sigma^2 > 0$ may be unknown.

In addition to the former model (2.1), related only to sample information, let us assume prior information about $\beta$ in the form of $m$ independent stochastic linear restrictions

$$r = R\beta + \phi, \qquad (2.2)$$

where $r$ is an $m \times 1$ stochastic vector, $R$ is a given $m \times p$ matrix of rank $m$, and $\phi$ is an $m \times 1$ vector of disturbances such that $E(\phi) = 0$, $\text{Cov}(\phi) = \sigma^2 V$ and $E(\varepsilon\phi') = 0$, with $V$ being a positive definite matrix. The vector $r$ is assumed to be observable and given.

The so-called mixed estimator, introduced by Theil and Goldberger (1961), is based on the sample information (2.1) and the prior information (2.2), and is given by

$$b_r = b + S^{-1}R'(V + RS^{-1}R')^{-1}(r - Rb), \qquad (2.3)$$

where $S = X'W^{-1}X$ and $b = (X'W^{-1}X)^{-1}X'W^{-1}y$ is the generalized least squares estimator (GLSE). The mixed estimator is unbiased and has dispersion matrix

$$\text{Cov}(b_r) = \sigma^2 S^{-1} - \sigma^2 S^{-1}R'(V + RS^{-1}R')^{-1}RS^{-1}. \qquad (2.4)$$

In statistical practice, unfortunately, the matrix $V$ is rarely known, and consequently $b_r$ becomes inoperational. Nevertheless, we should still be interested in how to extract the remaining applicable substance of the information contained in (2.2). One way consists in replacing $V$ by a sample based estimate or by a prior guess, but the statistical properties of the resulting operational variants of the mixed estimator are difficult to evaluate.
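To make the classical approach concrete, here is a minimal numerical sketch of the GLSE $b$ and the mixed estimator (2.3); the data, the matrices $W$ and $V$, and all dimensions below are illustrative assumptions, and $V$ is treated as known only for this idealized demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 50, 4, 2

# Illustrative model (2.1): y = X beta + eps, Cov(eps) = sigma^2 W (here W = I).
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5, 3.0])
sigma2 = 4.0
W = np.eye(n)
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

# Stochastic restrictions (2.2): r = R beta + phi, Cov(phi) = sigma^2 V.
R = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
V = 0.25 * np.eye(m)                      # assumed known here, rarely so in practice
r = R @ beta + rng.multivariate_normal(np.zeros(m), sigma2 * V)

Winv = np.linalg.inv(W)
S = X.T @ Winv @ X                        # S = X'W^{-1}X
b = np.linalg.solve(S, X.T @ Winv @ y)    # GLSE b

# Mixed estimator (2.3): b_r = b + S^{-1}R'(V + R S^{-1} R')^{-1}(r - Rb).
Sinv_Rt = np.linalg.solve(S, R.T)
b_r = b + Sinv_Rt @ np.linalg.solve(V + R @ Sinv_Rt, r - R @ b)
print(b, b_r)
```

If $V$ had to be guessed, $b_r$ would change with the guess; the weakly unbiased estimators developed below avoid $V$ altogether.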

Alternatively, we may look for a concept which makes use of the warranted part of the auxiliary information (2.2). If $V$ is not fully known, then one way to take information (2.2) into account when constructing estimators $\hat\beta$ for $\beta$ is to require that



$$E(R\hat\beta \mid r) = r. \qquad (2.6)$$

An estimator $\hat\beta$ for $\beta$ is said to be weakly-$(R, r)$-unbiased with respect to the stochastic linear restriction $r = R\beta + \phi$ if (2.6) holds.

Thus, conditional on $r$, the expectation of the weakly-$(R, r)$-unbiased estimator $\hat\beta$ lies on the hyperplane defined by $R$ and $r$.

3. Optimal weakly-$(R, r)$-unbiased estimators

Let us be given linear inhomogeneous estimators $\hat\beta = Cy + d$. Then the condition of weak-$(R, r)$-unbiasedness becomes equivalent to the following condition on $C$ and $d$:

$$E(R\hat\beta \mid r) = RCX\beta + Rd = r, \qquad (3.1)$$

where now $r$ has to be considered as a fixed realization. To measure the quality of an estimator $\hat\beta$ we introduce the following risk function

$$R_A(\hat\beta, \beta) = E[(\hat\beta - \beta)'A(\hat\beta - \beta)], \qquad (3.2)$$

where $A$ is a positive definite matrix of weights. If $\hat\beta = Cy + d$ we obtain

$$R_A(\hat\beta, \beta) = \sigma^2 \operatorname{tr}(ACWC') + [(CX - I)\beta + d]'A[(CX - I)\beta + d]. \qquad (3.3)$$

To identify an optimal linear estimator $\hat\beta = Cy + d$ we will minimize (3.3) under the additional restriction (3.1), i.e. we have to solve the problem

$$\min_{C, d, \lambda} \{R_A(\hat\beta, \beta) - 2\lambda'(RCX\beta + Rd - r)\}, \qquad (3.4)$$

where $\lambda$ is an $m \times 1$ vector of Lagrangian multipliers. Upon setting

$$g(C, d, \lambda) = R_A(\hat\beta, \beta) - 2\lambda'(RCX\beta + Rd - r) \qquad (3.5)$$

with $R_A(\hat\beta, \beta)$ given in (3.3) we get the first order equations for an optimum:

$$\frac{\partial g}{\partial C} = 2(ACX\beta\beta'X' - A\beta\beta'X' + Ad\beta'X' + \sigma^2 ACW - R'\lambda\beta'X') = 0, \qquad (3.6)$$

$$\frac{\partial g}{\partial d} = 2(Ad + ACX\beta - A\beta - R'\lambda) = 0, \qquad (3.7)$$

$$\frac{\partial g}{\partial \lambda} = -2(RCX\beta + Rd - r) = 0. \qquad (3.8)$$

Solving (3.7) for $Ad$ and inserting in (3.6) yields $\sigma^2 ACW = 0$. Since $A$ and $W$ are positive definite we conclude $C = 0$, and consulting (3.7) again we obtain

$$d = \beta + A^{-1}R'\lambda. \qquad (3.9)$$

Premultiplying (3.9) by $R$ and using (3.8) with $C = 0$ gives

$$\hat\lambda = (RA^{-1}R')^{-1}(r - R\beta). \qquad (3.10)$$

Hence our optimal estimator is

$$\hat d = \beta + A^{-1}R'(RA^{-1}R')^{-1}(r - R\beta). \qquad (3.11)$$

The following theorem summarizes our findings.

Theorem 1. In the regression model (2.1) the optimal inhomogeneous weakly-$(R, r)$-unbiased estimator for $\beta$ is given by

$$\hat\beta_1(\beta, A) = \beta + A^{-1}R'(RA^{-1}R')^{-1}(r - R\beta), \qquad (3.12)$$

and its risk is, conditional on $r$,

$$R_A(\hat\beta_1(\beta, A), \beta) = (r - R\beta)'(RA^{-1}R')^{-1}(r - R\beta). \qquad (3.13)$$

Remarks. (i) Formula (3.13) for the minimal risk is an easy consequence of (3.3) and (3.12).

(ii) Since $R_A(\hat\beta, \beta)$ is a convex function of $C$, our solution $\hat d = \hat\beta_1(\beta, A)$ from (3.11) yields the minimum.

(iii) The estimator $\hat\beta_1(\beta, A)$ depends on the unknown parameter vector $\beta$, and thus is not operational. However, if $\beta$ is replaced by an unbiased estimator $\tilde\beta$, the resulting feasible estimator $\hat\beta_1(\tilde\beta, A)$ becomes weakly-$(R, r)$-unbiased. We will consider this problem in Section 4 again.

(iv) As $\hat\beta_1(\beta, A)$ explicitly depends on the weight matrix $A$, variation with respect to $A$ defines a new class of estimators. Let for instance $\beta$ be replaced by the generalized least squares estimator $b = (X'W^{-1}X)^{-1}X'W^{-1}y$. Then the choice $A = X'W^{-1}X = S$ results in the generalized restricted least squares estimator (conditional on $r$)

$$\hat\beta_1(b, X'W^{-1}X) = b + S^{-1}R'[RS^{-1}R']^{-1}(r - Rb). \qquad (3.14)$$
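Unlike (2.3), the feasible estimator (3.14) requires no knowledge of $V$; a short sketch on simulated data (all names and dimensions assumed, $W = I$) also confirms that it reproduces the restriction exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, m = 50, 4, 2
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta + rng.normal(scale=2.0, size=n)     # W = I for simplicity
R = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
r = R @ beta                                      # a fixed observed realization

S = X.T @ X                                       # S = X'W^{-1}X with W = I
b = np.linalg.solve(S, X.T @ y)                   # GLSE (= OLS here)

# Feasible estimator (3.14): beta1(b, S) = b + S^{-1}R'(R S^{-1} R')^{-1}(r - Rb).
Sinv_Rt = np.linalg.solve(S, R.T)
beta1_bS = b + Sinv_Rt @ np.linalg.solve(R @ Sinv_Rt, r - R @ b)

# The correction enforces the restriction exactly: R beta1_bS == r.
assert np.allclose(R @ beta1_bS, r)
```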

Let us now have a closer look at linear homogeneous estimators. If $\hat\beta = Cy$, then the requirement of weak unbiasedness of $\hat\beta$ is equivalent to $RCX\beta = r$ (conditional on $r$). Then we obtain the following first order equations for an optimum:

$$\frac{\partial g}{\partial C} = 2(ACB - A\beta\beta'X' - R'\lambda\beta'X') = 0, \qquad (3.15)$$

$$\frac{\partial g}{\partial \lambda} = -2(RCX\beta - r) = 0, \qquad (3.16)$$

where the matrix $B$ is defined as

$$B = X\beta\beta'X' + \sigma^2 W. \qquad (3.17)$$

Obviously $B$ is positive definite, and its inverse is

$$B^{-1} = \frac{1}{\sigma^2}\left[W^{-1} - \frac{W^{-1}X\beta\beta'X'W^{-1}}{\sigma^2 + \beta'S\beta}\right]. \qquad (3.18)$$
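Equation (3.18) is the Sherman-Morrison identity applied to (3.17), so it is easy to verify numerically; a minimal check on simulated quantities (all assumed, with $W = I$) follows.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 30, 3
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
sigma2 = 2.0

S = X.T @ X                                               # W = I
B = X @ np.outer(beta, beta) @ X.T + sigma2 * np.eye(n)   # (3.17)
bSb = beta @ S @ beta

# (3.18) with W = I: B^{-1} = (I - X beta beta' X' / (sigma^2 + beta'S beta)) / sigma^2
B_inv = (np.eye(n) - X @ np.outer(beta, beta) @ X.T / (sigma2 + bSb)) / sigma2
assert np.allclose(B @ B_inv, np.eye(n))
```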

After some straightforward calculations the optimal values for $C$ and $\lambda$ turn out to be

$$\hat\lambda = (RA^{-1}R')^{-1}(r/\alpha - R\beta) \qquad (3.19)$$

and

$$\hat C = \beta\beta'X'B^{-1} + A^{-1}R'(RA^{-1}R')^{-1}(r/\alpha - R\beta)\beta'X'B^{-1}, \qquad (3.20)$$

where

$$\alpha = \beta'X'B^{-1}X\beta = \beta'S\beta/(\sigma^2 + \beta'S\beta). \qquad (3.21)$$

Then the optimal homogeneous weakly-$(R, r)$-unbiased estimator $\hat\beta_2 = \hat Cy$ is

$$\hat\beta_2(\beta, A) = \beta\alpha(y) + A^{-1}R'(RA^{-1}R')^{-1}(r/\alpha - R\beta)\alpha(y), \qquad (3.22)$$

where

$$\alpha(y) = \beta'X'B^{-1}y = \beta'X'W^{-1}y/(\sigma^2 + \beta'S\beta). \qquad (3.23)$$

It should be emphasized that

$$\beta\alpha(y) = \frac{\beta\beta'X'W^{-1}y}{\sigma^2 + \beta'S\beta} \qquad (3.24)$$

minimizes $R_A(\hat\beta, \beta)$ within the class of linear homogeneous estimators $\hat\beta = Cy$ without the restriction of weak unbiasedness (Theil, 1971, pp. 346-352, 670-672).
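A small simulation (all quantities assumed, $W = I$, $A = I$) illustrates the classical result just cited: averaged over repeated samples, the squared error of $\beta\alpha(y)$ stays below that of the GLSE, though the estimator needs the true $\beta$ and $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 30, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, -0.5, 2.0])
sigma2 = 4.0
S = X.T @ X                                  # W = I
denom = sigma2 + beta @ S @ beta

reps, se_shrink, se_ols = 5000, 0.0, 0.0
for _ in range(reps):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    est = beta * (beta @ X.T @ y) / denom    # beta * alpha(y), cf. (3.24)
    b = np.linalg.solve(S, X.T @ y)          # GLSE/OLS
    se_shrink += np.sum((est - beta) ** 2)
    se_ols += np.sum((b - beta) ** 2)
print(se_shrink / reps, se_ols / reps)       # the shrinkage risk is the smaller one
```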

Thus, $\hat\beta_2(\beta, A)$ is the sum of the biased estimator $\beta\alpha(y)$ and a correction term ensuring weak-$(R, r)$-unbiasedness:

$$E(R\hat\beta_2(\beta, A) \mid r) = R\beta\alpha + \frac{r}{\alpha}\alpha - R\beta\alpha = r. \qquad (3.25)$$

As $\beta\alpha(y)$ itself, $\hat\beta_2(\beta, A)$ is not operational. Its bias is given by

$$\text{Bias }\hat\beta_2(\beta, A) = \beta(\alpha - 1) + z\alpha, \qquad (3.26)$$

where

$$z = A^{-1}R'(RA^{-1}R')^{-1}(r/\alpha - R\beta). \qquad (3.27)$$

Observe that $\text{Bias}(\beta\alpha(y)) = \beta(\alpha - 1)$. Obviously, the dispersion matrix of $\hat\beta_2(\beta, A)$ is

$$\text{Cov}(\hat\beta_2(\beta, A)) = \text{Cov}(\beta\alpha(y)) + (zz' + z\beta' + \beta z')\frac{\sigma^2\beta'S\beta}{(\sigma^2 + \beta'S\beta)^2}, \qquad (3.28)$$


which implies for the mean square error matrix of $\hat\beta_2$

$$M(\hat\beta_2(\beta, A), \beta) = E[(\hat\beta_2(\beta, A) - \beta)(\hat\beta_2(\beta, A) - \beta)'] = M(\beta\alpha(y), \beta) + \alpha zz', \qquad (3.29)$$

where $M(\beta\alpha(y), \beta)$ is the mean square error matrix of $\beta\alpha(y)$. Using the convexity property of the functions involved we may state the following result.

Theorem 2. In our model (2.1) the estimator which minimizes $R_A(\hat\beta, \beta)$ within the class of homogeneous weakly-$(R, r)$-unbiased estimators is

$$\hat\beta_2(\beta, A) = \beta\alpha(y) + A^{-1}R'(RA^{-1}R')^{-1}(r/\alpha - R\beta)\alpha(y), \qquad (3.30)$$

where $\alpha$ and $\alpha(y)$ are given in (3.21) and (3.23), respectively. The corresponding risk is

$$R_A(\hat\beta_2(\beta, A), \beta) = \operatorname{tr}(AM(\beta\alpha(y), \beta)) + \alpha z'Az = R_A(\beta\alpha(y), \beta) + \alpha z'Az \qquad (3.31)$$

with $z$ given in (3.27).
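The estimator (3.30) can be examined in simulation: the sketch below (data and dimensions assumed, $W = I$, $A = I$) holds $r$ fixed and checks the weak unbiasedness property $E(R\hat\beta_2 \mid r) = r$ by averaging over repeated draws of $\varepsilon$; note that the true $\beta$ and $\sigma^2$ enter the formula.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, m = 40, 3, 1
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.0, 0.5])
sigma2 = 1.0
R = np.array([[1.0, 2.0, 0.0]])
r = np.array([0.3])                          # a fixed realization of (2.2)

S = X.T @ X                                  # W = I
bSb = beta @ S @ beta
alpha = bSb / (sigma2 + bSb)                 # (3.21)
A = np.eye(p)
Ainv = np.linalg.inv(A)
Abar = Ainv @ R.T @ np.linalg.inv(R @ Ainv @ R.T)   # A^{-1}R'(RA^{-1}R')^{-1}

reps, acc = 20000, np.zeros(p)
for _ in range(reps):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    a_y = (beta @ X.T @ y) / (sigma2 + bSb)         # alpha(y), (3.23)
    acc += beta * a_y + (Abar @ (r / alpha - R @ beta)) * a_y   # (3.22)
print(R @ (acc / reps), r)                   # the two should nearly coincide
```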

Observe that the estimator from the preceding theorem is not operational since it depends on the unknown parameter $\beta$. As noted in the remarks, this disadvantage can also be stated for the estimator from Theorem 1. In the following section we will propose some replacement strategies.

4. Replacement strategies

From Remark (iii) in Section 3 we know that any replacement of $\beta$ by an unbiased estimator $\tilde\beta$ leaves $\hat\beta_1(\tilde\beta, A)$ weakly-$(R, r)$-unbiased. To find out an estimator $\tilde\beta$ such that the feasible version $\hat\beta_1(\tilde\beta, A)$ is optimal with respect to the quadratic risk we have to confine ourselves to well-defined classes. Let us demonstrate this problem for the class of homogeneous estimators $\tilde\beta = \hat Cy$ with the corresponding inhomogeneous estimator

$$\hat\beta_1(\hat Cy, A) = \hat Cy + A^{-1}R'(RA^{-1}R')^{-1}(r - R\hat Cy). \qquad (4.1)$$

Letting

$$\bar A = A^{-1}R'(RA^{-1}R')^{-1}, \qquad (4.2)$$

we obtain

$$\text{Bias }\hat\beta_1(\hat Cy, A) = (\hat CX - I)\beta + \bar A(r - R\hat CX\beta). \qquad (4.3)$$


If $\tilde\beta = \hat Cy$ is unbiased, i.e. $\hat CX = I$, then $\hat\beta_1(\hat Cy, A)$ from (4.1) also becomes unbiased, with dispersion matrix

$$\text{Cov}(\hat\beta_1(\hat Cy, A)) = \sigma^2(I - \bar AR)\hat CW\hat C'(I - \bar AR)'. \qquad (4.4)$$

Define

$$Q = I - A^{-1/2}R'(RA^{-1}R')^{-1}RA^{-1/2}. \qquad (4.5)$$

The matrix $Q$ is an orthogonal projector with $\operatorname{rank}(Q) = p - m$. It is readily seen that

$$(I - \bar AR)'A(I - \bar AR) = A^{1/2}QA^{1/2} \qquad (4.6)$$

holds.

Let $\Lambda = (\lambda_1, \ldots, \lambda_p)$ be a $p \times p$ matrix of $p$ vectors $\lambda_i$ of type $p \times 1$ of Lagrangian vector multipliers. Then the weakly-$(R, r)$-optimal unbiased operationalization of the estimator $\hat\beta_1(\beta, A)$ is the solution to

$$\min_{\hat C, \Lambda}\{\operatorname{tr}(A\,\text{Cov}(\hat\beta_1(\hat Cy, A))) - 2\operatorname{tr}\Lambda'(\hat CX - I)\} = \min_{\hat C, \Lambda}\{\sigma^2 \operatorname{tr}(A^{1/2}QA^{1/2}\hat CW\hat C') - 2\operatorname{tr}\Lambda'(\hat CX - I)\} = \min g(\hat C, \Lambda), \qquad (4.7)$$

say. Differentiating with respect to $\hat C$ and $\Lambda$, respectively, gives the necessary conditions for a minimum:

$$\frac{\partial g(\hat C, \Lambda)}{\partial \hat C} = 2(\sigma^2 A^{1/2}QA^{1/2}\hat CW - \Lambda X') = 0 \qquad (4.8)$$

and

$$\frac{\partial g(\hat C, \Lambda)}{\partial \Lambda} = -2(\hat CX - I) = 0. \qquad (4.9)$$

Postmultiplying (4.8) by $W^{-1}X$ and using (4.9) gives

$$\Lambda^* = \sigma^2 A^{1/2}QA^{1/2}S^{-1}, \qquad (4.10)$$

and consequently from (4.8)

$$A^{1/2}QA^{1/2}[\hat C - S^{-1}X'W^{-1}] = 0. \qquad (4.11)$$

The principal solution of (4.11) then is given by

$$\hat C_* = (X'W^{-1}X)^{-1}X'W^{-1} \qquad (4.12)$$

with the corresponding estimator

$$\tilde\beta = \hat C_* y = (X'W^{-1}X)^{-1}X'W^{-1}y = b \qquad (4.13)$$

(cf. (3.14)). Hence we may state the following theorem, by using the convexity argument again.

Theorem 3. Let us be given the class $\hat\beta_1(\hat Cy, A)$ of weakly-$(R, r)$-unbiased estimators with $\tilde\beta = \hat Cy$ being an unbiased estimator for $\beta$. Then in this class the estimator $\hat\beta_1(b, A)$ minimizes the risk $R_A(\cdot\,, \beta)$. There are two obvious choices for $A$:

(1) $A = I$: $\hat\beta_1(b, I) = b + R'(RR')^{-1}(r - Rb)$;

(2) $A = S$: $\hat\beta_1(b, S) = b + S^{-1}R'(RS^{-1}R')^{-1}(r - Rb)$.

The latter estimator clearly gives the generalized restricted least squares estimator.
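A short sketch on simulated data (all names assumed, $W = I$) makes the two choices concrete; both enforce $R\hat\beta = r$ exactly and differ only in how the correction $r - Rb$ is spread over the components of $b$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 4
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta + rng.normal(size=n)        # W = I
R = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
r = R @ beta

S = X.T @ X
b = np.linalg.solve(S, X.T @ y)

# Choice (1), A = I: b + R'(RR')^{-1}(r - Rb)
bh_I = b + R.T @ np.linalg.solve(R @ R.T, r - R @ b)

# Choice (2), A = S: the generalized restricted least squares estimator
Sinv_Rt = np.linalg.solve(S, R.T)
bh_S = b + Sinv_Rt @ np.linalg.solve(R @ Sinv_Rt, r - R @ b)

assert np.allclose(R @ bh_I, r) and np.allclose(R @ bh_S, r)
print(bh_I, bh_S)
```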

5. Updating a prior guess

Suppose now that a prior guess $d_*$ of $\beta$ is available, such that $d_*$ is either nonstochastic or stochastic but uncorrelated with the disturbance vector $\varepsilon$, i.e. $E(d_*\varepsilon') = 0$. Then we can update the prior guess $d_*$ with respect to the sample and the auxiliary information. For this define an estimator

$$\hat\beta = Cy + d_*, \qquad (5.1)$$

where $C$ has to be chosen such that $\hat\beta$ becomes optimal under the condition of weak-$(R, r)$-unbiasedness. Then $C$ must obey

$$RCX\beta = r - Rd_* \qquad (5.2)$$

(conditional on $r$ and $d_*$). To find the optimal matrix $C$ of this problem we have to solve

$$\min_{C, \lambda}\{R_A(\hat\beta, \beta) - 2\lambda'(RCX\beta - r + Rd_*)\}. \qquad (5.3)$$

The necessary conditions are

$$ACB - A(\beta - d_* + A^{-1}R'\lambda)\beta'X' = 0 \qquad (5.4)$$

and

$$RCX\beta - r + Rd_* = 0, \qquad (5.5)$$

where $B = X\beta\beta'X' + \sigma^2 W$. Solving (5.4) for $C$ gives

$$C = (\beta - d_* + A^{-1}R'\lambda)\beta'X'B^{-1}. \qquad (5.6)$$

Inserting in (5.5) and using $\alpha$ from (3.21) we get

$$RCX\beta = \alpha(R\beta - Rd_* + RA^{-1}R'\lambda) = r - Rd_*, \qquad (5.7)$$

and solving for $\lambda$ yields

$$\hat\lambda = \frac{1}{\alpha}(RA^{-1}R')^{-1}[r - \alpha R\beta - Rd_*(1 - \alpha)]. \qquad (5.8)$$

The resulting optimal matrix is

$$\hat C = \left(\beta - d_* + \frac{1}{\alpha}A^{-1}R'(RA^{-1}R')^{-1}[r - \alpha R\beta - (1 - \alpha)Rd_*]\right)\beta'X'B^{-1}. \qquad (5.9)$$

Hence the corresponding estimator is

$$\hat\beta_3(\beta, d_*, A) = \hat Cy + d_* = (\beta - d_*)\alpha(y) + d_* + A^{-1}R'(RA^{-1}R')^{-1}\left[\frac{1}{\alpha}(r - Rd_*) - R(\beta - d_*)\right]\alpha(y), \qquad (5.10)$$

where $\alpha(y)$ is defined in (3.23). Comparison with the estimator $\hat\beta_2(\beta, A)$ gives the relationship

$$\hat\beta_3(\beta, d_*, A) = \hat\beta_2(\beta, A) + t(d_*), \qquad (5.11)$$

where

$$t(d_*) = d_* - \left[d_* + \frac{\sigma^2}{\beta'S\beta}A^{-1}R'(RA^{-1}R')^{-1}Rd_*\right]\alpha(y) \qquad (5.12)$$

may be interpreted as a term correcting $\hat\beta_2(\beta, A)$ with respect to the prior guess $d_*$; at the same time it corrects $d_*$ with respect to the requirement of weak unbiasedness. If $d_* = 0$ (null guess) then $\hat\beta_3(\beta, 0, A)$ coincides with $\hat\beta_2(\beta, A)$.

In addition to the assumptions on $d_*$ given before, let us now suppose that the prior guess $d_*$ is consistent with the restriction $R\beta = r$, i.e.

$$E(Rd_* \mid r) = r. \qquad (5.13)$$

Again using the special linear inhomogeneous estimator (5.1), the condition of weak-$(R, r)$-unbiasedness now becomes equivalent to

$$RCX\beta = 0. \qquad (5.14)$$

Minimization of the quadratic risk $R_A(\hat\beta, \beta)$ under condition (5.14) is equivalent to the minimization of $R_A(\hat\beta, \beta) - 2\lambda'RCX\beta$. Differentiating with respect to $C$ and setting equal to zero again yields the solution (5.6) for $C$. Due to the restriction (5.14) we then obtain

$$\hat\lambda = (RA^{-1}R')^{-1}R(d_* - \beta). \qquad (5.15)$$

Hence the optimal weakly-$(R, r)$-unbiased estimator, taking into account the prior guess $d_*$ which satisfies $E(Rd_* \mid r) = r$, becomes

$$\hat\beta_4(\beta, d_*, A) = \alpha(y)(\hat\beta_1(\beta, A) - d_*) + d_*, \qquad (5.16)$$

where $\hat\beta_1(\beta, A)$ is the overall optimal weakly-$(R, r)$-unbiased estimator given in (3.12) and $\alpha(y)$ is defined in (3.23).

Straightforward derivations show that

$$E(R\hat\beta_4(\beta, d_*, A) \mid r) = r, \qquad (5.17)$$

irrespective of whether $d_*$ is stochastic or not.


There is a close relationship between $\hat\beta_3(\beta, d_*, A)$ and $\hat\beta_4(\beta, d_*, A)$:

$$\hat\beta_3(\beta, d_*, A) = \hat\beta_4(\beta, d_*, A) + \frac{\sigma^2}{\beta'S\beta}\,\alpha(y)\,A^{-1}R'(RA^{-1}R')^{-1}(r - Rd_*). \qquad (5.18)$$

Both estimators can be made operational by the same method applied to $\hat\beta_1(\beta, A)$ in Section 4.
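As one possible operationalization (the plug-ins below are illustrative choices, not prescribed by the paper): replace $\beta$ by the GLSE $b$ in (5.16), take $A = S$ so that $\hat\beta_1(b, S)$ is the restricted estimator (3.14), and use the standard unbiased residual estimate for the $\sigma^2$ appearing in $\alpha(y)$; with $W = I$ and simulated data this reads as follows.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 60, 4
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta + rng.normal(size=n)               # W = I
R = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
r = R @ beta
d_star = beta + rng.normal(scale=0.3, size=p)   # a prior guess for beta

S = X.T @ X
b = np.linalg.solve(S, X.T @ y)
sigma2_hat = np.sum((y - X @ b) ** 2) / (n - p)  # unbiased plug-in for sigma^2

# alpha(y) from (3.23) with (beta, sigma^2) replaced by (b, sigma2_hat)
a_y = (b @ X.T @ y) / (sigma2_hat + b @ S @ b)

# beta1(b, S) from (3.14), the generalized restricted least squares estimator
Sinv_Rt = np.linalg.solve(S, R.T)
beta1_b = b + Sinv_Rt @ np.linalg.solve(R @ Sinv_Rt, r - R @ b)

# Operational version of (5.16): combine the restricted estimate with the prior guess
beta4_hat = a_y * (beta1_b - d_star) + d_star
print(beta4_hat)
```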

References

Theil, H., Principles of Econometrics (Wiley, New York, 1971).

Theil, H. and A.S. Goldberger, On pure and mixed estimation in economics, International Economic Review, 2 (1961) 65-78.