Chapter 14: Logistic Regression, Poisson Regression, and Generalized Linear Models
許湘伶
Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li)
hsuhl (NUK) LR Chap 10 1 / 29
14.1 Regression Models with Binary Response Variable
The response variable of interest has only two possible qualitative outcomes and can be represented by a binary indicator variable taking the values 0 and 1. Such a response variable is said to involve binary or dichotomous responses.
Meaning of Response Function when Outcome Variable is Binary
1. Consider the simple linear regression model:
   Yi = β0 + β1Xi + εi, Yi = 0, 1 (E{εi} = 0)
   E{Yi} = β0 + β1Xi
2. Consider Yi to be a Bernoulli random variable:
   Yi = 1 with probability P(Yi = 1) = πi; Yi = 0 with probability P(Yi = 0) = 1 − πi
   ⇒ E{Yi} = 1(πi) + 0(1 − πi) = πi = P(Yi = 1) = β0 + β1Xi
The mean response E{Yi} = β0 + β1Xi is simply the probability that Yi = 1 when the level of the predictor variable is Xi.
Special Problems when Response Variable is Binary
1. Nonnormal error terms: εi = Yi − (β0 + β1Xi) takes only two values:
   When Yi = 1: εi = 1 − β0 − β1Xi
   When Yi = 0: εi = −β0 − β1Xi
2. Nonconstant error variance: with εi = Yi − πi,
   σ²{Yi} = σ²{εi} = πi(1 − πi) = (E{Yi})(1 − E{Yi})
   which depends on Xi through πi, so the error variance is not constant.
3. Constraints on the response function:
   0 ≤ E{Y} = π ≤ 1
   Many response functions do not automatically possess this constraint.
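As a quick numeric illustration of problem 3, a linear response function with made-up coefficients (β0 = 0.2, β1 = 0.1, chosen only for this sketch) readily produces "probabilities" outside [0, 1]:

```python
# Hypothetical linear response function E{Y} = 0.2 + 0.1*X; the
# coefficients are illustrative, not from any example in the text.
beta0, beta1 = 0.2, 0.1

def linear_response(x):
    # E{Y} is supposed to be a probability, but nothing bounds it.
    return beta0 + beta1 * x

print(linear_response(5))    # inside [0, 1]: a valid probability
print(linear_response(10))   # exceeds 1
print(linear_response(-5))   # falls below 0
```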
Sigmoidal (S-shaped) Response Functions for Binary Response
Example: a health researcher studies the effect of a mother's alcohol use.
X: an index of the degree of alcohol use during pregnancy
Y^c: the duration of her pregnancy (a continuous response)
Simple linear regression model:
Y^c_i = β^c_0 + β^c_1 Xi + ε^c_i, ε^c_i ∼ N(0, σ²_c)
⇒ the usual simple linear regression analysis applies.
Code each pregnancy as:
Yi = 1 if Y^c_i ≤ T weeks, Yi = 0 if Y^c_i > T weeks.
Then
P(Yi = 1) = πi = P(Y^c_i ≤ T) = P(β^c_0 + β^c_1 Xi + ε^c_i ≤ T)
= P( ε^c_i/σc ≤ (T − β^c_0)/σc − (β^c_1/σc) Xi )
= P(Z ≤ β∗0 + β∗1 Xi)
= Φ(β∗0 + β∗1 Xi)
where Z = ε^c_i/σc ∼ N(0, 1), β∗0 = (T − β^c_0)/σc, β∗1 = −β^c_1/σc, and Φ(z) = P(Z ≤ z).
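The derivation above can be checked numerically. A minimal sketch using only the standard library, with Φ computed from the error function; β∗0 = −2.0 and β∗1 = 0.5 are made-up values, not taken from the pregnancy example:

```python
import math

def std_normal_cdf(z):
    # Phi(z) = P(Z <= z) for Z ~ N(0, 1), via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

b0_star, b1_star = -2.0, 0.5  # illustrative probit parameters

def probit_response(x):
    # E{Y} = pi = Phi(beta*_0 + beta*_1 * X): always a probability.
    return std_normal_cdf(b0_star + b1_star * x)

# The response rises in an S-shape and stays inside (0, 1):
for x in (0, 2, 4, 6, 8):
    print(x, round(probit_response(x), 4))
```

The symmetry Φ(z) = 1 − Φ(−z) used later in the section holds for this implementation as well.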
1. Probit mean response function:
   E{Yi} = πi = Φ(β∗0 + β∗1 Xi)
   ⇒ Φ⁻¹(πi) = π′i = β∗0 + β∗1 Xi
   Φ⁻¹ is sometimes called the probit transformation; π′i = β∗0 + β∗1 Xi is called the probit response function or linear predictor.
Properties of the probit response function:
- bounded between 0 and 1
- larger β∗1 (with β∗0 = 0) gives a more pronounced S-shape, changing rapidly in the center
- changing the sign of β∗1 reverses the direction of the S-curve
- symmetry: for Y′i = 1 − Yi, since Φ(z) = 1 − Φ(−z),
  P(Y′i = 1) = P(Yi = 0) = 1 − Φ(β∗0 + β∗1 Xi) = Φ(−β∗0 − β∗1 Xi)
The logistic function: the logistic distribution, standardized to mean 0 and variance 1, is similar in shape to the standard normal distribution but has slightly heavier tails.
Let εL be a logistic random variable with µ = 0 and σ = π/√3:
fL(εL) = exp(εL) / [1 + exp(εL)]², FL(εL) = exp(εL) / [1 + exp(εL)]
If ε^c_i instead follows a logistic distribution with µ = 0 and standard deviation σc, then since E{ε^c_i/σc} = 0 and σ{ε^c_i/σc} = 1:
P(Yi = 1) = πi = P(ε^c_i/σc ≤ β∗0 + β∗1 Xi)
= P( (π/√3)(ε^c_i/σc) ≤ (π/√3)β∗0 + (π/√3)β∗1 Xi )
= P(εL ≤ β0 + β1 Xi)
= FL(β0 + β1 Xi) = exp(β0 + β1 Xi) / [1 + exp(β0 + β1 Xi)]
where (π/√3)(ε^c_i/σc) is a standard logistic variable εL, β0 = (π/√3)β∗0, and β1 = (π/√3)β∗1.
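A small sketch of FL and fL as defined above (pure standard library), including a check that FL(η) equals the equivalent form 1/(1 + exp(−η)) that appears in the logistic mean response function:

```python
import math

def logistic_cdf(eta):
    # F_L(eta) = exp(eta) / (1 + exp(eta))
    return math.exp(eta) / (1.0 + math.exp(eta))

def logistic_pdf(eta):
    # f_L(eta) = exp(eta) / (1 + exp(eta))^2
    return math.exp(eta) / (1.0 + math.exp(eta)) ** 2

print(logistic_cdf(0.0))                       # 0.5, like Phi(0)
print(logistic_pdf(2.0) - logistic_pdf(-2.0))  # density is symmetric about 0
```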
2. Logistic mean response function:
   E{Yi} = πi = FL(β0 + β1Xi)
   = exp(β0 + β1Xi) / [1 + exp(β0 + β1Xi)]
   = 1 / [1 + exp(−β0 − β1Xi)]
   ⇒ FL⁻¹(πi) = π′i = β0 + β1Xi
FL⁻¹(πi) = β0 + β1Xi = loge(πi / (1 − πi)):
- loge(πi / (1 − πi)) is called the logit transformation of the probability πi
- πi / (1 − πi) is called the odds
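A one-line check that the logit transformation linearizes the logistic mean response (β0 = −3 and β1 = 1.5 are arbitrary illustrative values):

```python
import math

def logit(p):
    # log-odds: log(p / (1 - p))
    return math.log(p / (1.0 - p))

beta0, beta1 = -3.0, 1.5  # made-up coefficients
for x in (0.0, 1.0, 2.0, 5.0):
    pi = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
    # logit(pi) recovers the linear predictor beta0 + beta1*x
    print(x, logit(pi), beta0 + beta1 * x)
```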
3. Complementary log-log response function (based on the extreme value, or Gumbel, probability distribution):
   E{Yi} = πi = 1 − exp(−exp(β^G_0 + β^G_1 Xi))
   ⇒ π′i = loge[−loge(1 − πi)] = β^G_0 + β^G_1 Xi
14.3 Simple Logistic Regression
Simple Logistic Regression
Logistic regression is the most widely used of these models because:
1. The regression parameters have relatively simple and useful interpretations.
2. Statistical software is widely available for the analysis of logistic regression models.
Estimation of parameters:
- Maximum likelihood estimation (MLE) is used to estimate the parameters of the logistic response function.
- The likelihood is built from the Bernoulli distribution of the binary response variable.
Simple Logistic Regression Model
Y ∼ Ber(π), so E{Y} = π
⇒ Yi = E{Yi} + εi
where the distribution of εi depends on the Bernoulli distribution of the response Yi.
The Yi are independent Bernoulli random variables with expected values E{Yi} = πi, where
E{Yi} = πi = exp(β0 + β1Xi) / [1 + exp(β0 + β1Xi)]
Likelihood function: for each Yi,
P(Yi = 1) = πi; P(Yi = 0) = 1 − πi
⇒ fi(Yi) = πi^Yi (1 − πi)^(1−Yi), Yi = 0, 1; i = 1, . . . , n
The joint probability function:
g(Y1, . . . , Yn) = ∏_{i=1}^n fi(Yi) = ∏_{i=1}^n πi^Yi (1 − πi)^(1−Yi)
Taking logarithms of the joint probability function:
ln g(Y1, . . . , Yn) = Σ_{i=1}^n [Yi ln(πi / (1 − πi))] + Σ_{i=1}^n ln(1 − πi)
(since 1 − πi = [1 + exp(β0 + β1Xi)]⁻¹ and ln(πi / (1 − πi)) = β0 + β1Xi)
⇒ ln L(β0, β1) = Σ_{i=1}^n Yi(β0 + β1Xi) − Σ_{i=1}^n ln[1 + exp(β0 + β1Xi)]
There is no closed-form solution for the values of β0, β1 that maximize ln L(β0, β1).
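Because no closed form exists, the maximization is done numerically. A self-contained Newton-Raphson sketch for the simple (one-predictor) model, on a tiny made-up data set (both the data and the starting values are illustrative):

```python
import math

# Tiny invented data set; Y is binary.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
Y = [0,   0,   0,   1,   0,   1,   1,   1]

def log_lik(b0, b1):
    # ln L(b0, b1) = sum Yi*(b0 + b1*Xi) - sum ln(1 + exp(b0 + b1*Xi))
    return sum(y * (b0 + b1 * x) - math.log(1.0 + math.exp(b0 + b1 * x))
               for x, y in zip(X, Y))

def fit_newton(b0=0.0, b1=0.0, iters=25):
    """Newton-Raphson search for the MLEs b0, b1."""
    for _ in range(iters):
        # gradient g and information matrix h (negative Hessian) of ln L
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(X, Y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # solve the 2x2 system h * delta = g and step
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_newton()
print(b0, b1, log_lik(b0, b1))
```

Statistical packages perform essentially this search, with safeguards such as step halving and convergence tests.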
Maximum likelihood estimation: finding the MLEs b0 and b1 requires computer-intensive numerical search procedures. Once b0 and b1 are found, the fitted value is
π̂i = exp(b0 + b1Xi) / [1 + exp(b0 + b1Xi)]
and the fitted logit transformation is
π̂′ = ln(π̂ / (1 − π̂)) = b0 + b1X
1. Examine the appropriateness of the fitted response function.
2. If the fit is good ⇒ make a variety of inferences and predictions.
Interpretation of b1: the estimated regression coefficient b1 in the fitted logistic response function does not have the straightforward slope interpretation familiar from linear regression. Instead: the estimated odds π̂ / (1 − π̂) are multiplied by exp(b1) for any unit increase in X.
Let π̂′(Xj) denote the logarithm of the estimated odds when X = Xj (written ln(odds1)):
π̂′(Xj) = b0 + b1Xj
Similarly, ln(odds2) = π̂′(Xj + 1). Then
b1 = ln(odds2 / odds1) = ln(odds2) − ln(odds1)
and the odds ratio is
OR = odds2 / odds1 = exp(b1)
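Numerically, the odds-ratio interpretation means odds(X + 1) / odds(X) = exp(b1) no matter where the unit increase starts (b0 = −4.0 and b1 = 0.8 below are made-up fitted values):

```python
import math

b0, b1 = -4.0, 0.8  # hypothetical fitted coefficients

def odds(x):
    # estimated odds pi_hat / (1 - pi_hat) at level x
    pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return pi / (1.0 - pi)

# A unit increase in X multiplies the odds by exp(b1),
# regardless of the starting level:
print(odds(3) / odds(2))
print(odds(10) / odds(9))
```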
Deviance Goodness of Fit Test
Deviance goodness of fit test: completely analogous to the F test for lack of fit for simple and multiple linear regression models.
Assume:
- c unique combinations of the predictor values: X1, . . . , Xc
- nj: the number of repeat binary observations at Xj
- Yij: the ith binary response at Xj
The deviance goodness of fit test is a likelihood ratio test of the reduced model against the full model:
Reduced model: E{Yij} = [1 + exp(−X′jβ)]⁻¹
Full model: E{Yij} = πj, j = 1, . . . , c, where the πj are free parameters
Let
pj = Y·j / nj, j = 1, . . . , c (Y·j = Σi Yij, the number of 1's at Xj)
be the full-model estimate of πj, and let π̂j be the reduced-model estimate of πj. The likelihood ratio test statistic is the deviance:
G² = −2[ln L(R) − ln L(F)]
= −2 Σ_{j=1}^c [ Y·j ln(π̂j / pj) + (nj − Y·j) ln((1 − π̂j) / (1 − pj)) ]
= DEV(X0, X1, . . . , Xp−1)
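The deviance can be computed directly from grouped counts. A sketch on invented grouped data (the counts and the reduced-model estimates π̂j are made up for illustration):

```python
import math

# Invented grouped data: at each unique Xj there are nj binary trials
# with Y.j observed successes; pi_hat holds reduced-model estimates.
n_j    = [10, 10, 10, 10]
y_dot  = [2, 4, 7, 9]
pi_hat = [0.25, 0.40, 0.65, 0.85]

def deviance(n_j, y_dot, pi_hat):
    # G^2 = -2 sum_j [ Y.j ln(pihat_j/p_j) + (n_j - Y.j) ln((1-pihat_j)/(1-p_j)) ]
    # (requires 0 < Y.j < n_j so the logs are defined)
    g2 = 0.0
    for nj, yj, ph in zip(n_j, y_dot, pi_hat):
        pj = yj / nj  # full-model estimate p_j = Y.j / n_j
        g2 += yj * math.log(ph / pj)
        g2 += (nj - yj) * math.log((1.0 - ph) / (1.0 - pj))
    return -2.0 * g2

print(deviance(n_j, y_dot, pi_hat))  # > 0 unless pi_hat matches p_j exactly
```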
Test:
H0: E{Yij} = [1 + exp(−X′jβ)]⁻¹
Ha: E{Yij} ≠ [1 + exp(−X′jβ)]⁻¹
Decision rule:
If DEV(X0, X1, . . . , Xp−1) ≤ χ²(1 − α; c − p) ⇒ conclude H0
If DEV(X0, X1, . . . , Xp−1) > χ²(1 − α; c − p) ⇒ conclude Ha
Logistic Regression Diagnostics
Various residuals are used.
Ordinary residuals ei:
ei = 1 − π̂i if Yi = 1; ei = −π̂i if Yi = 0
The ei are not normally distributed, and plots of ei against Ŷi or Xj are uninformative.
Pearson residuals rPi:
rPi = (Yi − π̂i) / √(π̂i(1 − π̂i))
These are related to the Pearson chi-square goodness of fit statistic.
Studentized Pearson residuals rSPi:
rSPi = rPi / √(1 − hii), where H = W^{1/2} X (X′WX)⁻¹ X′ W^{1/2}
W: the n × n diagonal matrix with elements π̂i(1 − π̂i)
Deviance residuals devi: with
DEV(X0, X1, . . . , Xp−1) = −2 Σ_{i=1}^n [Yi ln(π̂i) + (1 − Yi) ln(1 − π̂i)]
each observation contributes
devi = sign(Yi − π̂i) √( −2[Yi ln(π̂i) + (1 − Yi) ln(1 − π̂i)] )
so that Σ_{i=1}^n devi² = DEV(X0, X1, . . . , Xp−1).
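The identity Σ devi² = DEV can be verified directly. A sketch with invented fitted probabilities π̂i and binary responses:

```python
import math

pi_hat = [0.1, 0.3, 0.5, 0.7, 0.9]   # illustrative fitted probabilities
y      = [0,   0,   1,   1,   1]     # illustrative binary responses

def pearson_residual(yi, p):
    # r_Pi = (Yi - pihat_i) / sqrt(pihat_i * (1 - pihat_i))
    return (yi - p) / math.sqrt(p * (1.0 - p))

def deviance_residual(yi, p):
    # dev_i = sign(Yi - pihat_i) * sqrt(-2 [Yi ln(pihat_i) + (1-Yi) ln(1-pihat_i)])
    term = yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return math.copysign(math.sqrt(-2.0 * term), yi - p)

DEV = -2.0 * sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
                 for yi, p in zip(y, pi_hat))

# squared deviance residuals sum to the model deviance
print(sum(deviance_residual(yi, p) ** 2 for yi, p in zip(y, pi_hat)), DEV)
```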