STK643 PEMODELAN NON-PARAMETRIK - stat.ipb.ac.id · Regresi Spline 8. Model aditif . KEPUSTAKAAN 1)...


STK643 NONPARAMETRIC MODELING

Introduction


TOPICS

1. Introduction
  • Why nonparametric modeling
  • Applications of nonparametric modeling (data exploration and inference)

2. Univariate density estimation
  • Histogram method
  • Kernel method

3. Multivariate density estimation

4. Applications of density estimation

5. Nonparametric modeling
  • Scatterplot smoothing
  • Kernel smoothing methods

6. Multivariate nonparametric modeling

7. Spline regression

8. Additive models


REFERENCES

1) Bowman AW, Azzalini A. 1997. Applied Smoothing Techniques for Data Analysis: the Kernel Approach With S-Plus Illustrations. Oxford University Press. London.

2) Eubank RL. 1999. Nonparametric Regression and Spline Smoothing. Marcel Dekker. New York.

3) Fan J. Prospects of nonparametric modeling. http://escholarship.org/uc/item/38w9t0km. (10 February 2016)

4) Hastie T, Tibshirani R. 1990. Generalized Additive Models. Chapman & Hall/CRC. London.

5) Hastie T, Tibshirani R, Friedman J. 2008. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer-Verlag. New York.


6) Scott DW. 1992. Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley & Sons, Inc. New York.

7) Silverman BW. 1986. Density Estimation for Statistics and Data Analysis. Vol. 26 of Monographs on Statistics and Applied Probability. Chapman & Hall/CRC. London.

8) Simonoff JS. 1996. Smoothing Methods in Statistics. Springer. New York.

9) Rizzo ML. 2008. Statistical Computing with R. Chapman & Hall/CRC. London.


SOFTWARE

1) R programming language (www.r-project.org)
  Packages: stats, graphics, ash, GenKern (KernSec, KernSur), kerdiest, KernSmooth, ks, np, plugdensity, sm

2) SAS (www.sas.com)
  KDE Procedure (density estimation)
  LOESS Procedure (estimating regression surfaces)
  TPSPLINE Procedure (thin-plate smoothing splines)
  GAM Procedure (generalized additive models)
  GAMPL Procedure (generalized additive models based on low-rank regression splines)
  ADAPTIVEREG Procedure (multivariate adaptive regression splines)


INTRODUCTION

• Statistics: collection, summarization, presentation, and interpretation of data

• Data are the key to making inferences

• No assumptions about the underlying process that generated these data

• A model is assumed, either parametric (such as Gaussian with µ and σ²) or nonparametric

• If the assumed model is not the correct one, inferences can be poor and interpretations of the data misleading


INTRODUCTION: WHY NONPARAMETRIC?

• Parametric:
  • strict assumptions that are often violated by real data
  • strict hypotheses → if correct, accurate and precise estimates; otherwise very misleading
  • assumes linear relationships between the dependent variable and the predictor variables (normality and linearity)

• Nonparametric:
  • less strict assumptions that are less frequently violated by data
  • fewer conditions → estimates are free from hypotheses
  • allows a wide range of relationships between the dependent variable and the predictor variables (linear, moderately nonlinear, or highly nonlinear)


INTRODUCTION: NONPARAMETRIC (SMOOTHING)

• a bridge between making no assumptions on formal structure (a purely nonparametric approach) and making very strong assumptions (a parametric approach)

• to identify potentially unexpected structure in more complicated data analysis problems

• to extract more information from the data than is possible purely nonparametrically, as long as the (weak) assumption of smoothness is reasonable

• to provide flexible and robust analyses


INTRODUCTION: PURPOSE OF NONPARAMETRIC

• Exploratory data analysis
  • Smoothing highlights important structure clearly

• Model building
  • Choosing the appropriate model as the basis of analysis

• Goodness of fit
  • Smoothed curves can be used to test formally the adequacy of fit of a hypothesized model
  • Smoothed density estimates and regression curves can be used to construct confidence intervals and regions for true densities and regression functions, with similar avoidance of restrictive parametric assumptions

• Parametric estimation
  • Compared to maximum likelihood, density and regression estimates are often fully efficient and more robust (less sensitive to outliers)

(Simonoff 1996)


INTRODUCTION: PURPOSE OF NONPARAMETRIC

• Explores the general relationship between two variables

• Gives predictions of observations without reference to a fixed parametric model

• Provides a tool for finding spurious observations by studying the influence of isolated points

• Offers a flexible method of substituting for missing values or interpolating between adjacent X-values

(Härdle 1994)

An aim of nonparametric techniques is to reduce the possible modeling biases of parametric models


HISTOGRAM

52 34 12 69 44 22 36 41 77 39 73 21 37 38 32

22 21 41 41 11 25 22 63 48 18 28 22 70 27 51

44 38 20 20 53 80 30 56 46 32 15 72 92 13 22

34 25 45 30 39 24 18 49 16 10 36 26 19 64 33

37 95 14 26 22 41 83 19 34 24 27 37 46 13 17

56 28 32 53 21 33 58 47 33 28 16 65 30 38 31

53 38 21 23 83 64 49 36 64 18

Stem-and-leaf display of the data above:

Depth  Stem  Leaves
    6     1  012334
   15     1  566788899
   30     2  001111222222344
   39     2  556677888
 (13)     3  0001222333444
   48     3  666777888899
   36     4  111144
   30     4  5667899
   23     5  12333
   18     5  668
   15     6  3444
   11     6  59
    9     7  023
    6     7  7
    5     8  033
    2     8
    2     9  2
    1     9  5


PROBLEMS WITH THE HISTOGRAM

• Definition of classes:
  the choice of intervals and the truncation influence the estimate (boundary dependence)

• Not smooth:
  the estimate is a step function even if the density function is continuous

Solutions:

• local histogram: frees the histogram from the definition of classes

• kernel density estimates: tackle the lack of smoothness
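The boundary dependence noted on this slide can be illustrated with a minimal sketch. It assumes NumPy, uses only the first 15 values of the HISTOGRAM slide data as a sample, and the function name `histogram_density` and the chosen origins and bin width are hypothetical choices for illustration, not part of the course material.

```python
import numpy as np

def histogram_density(data, origin, width):
    """Histogram density estimate with bin edges origin, origin + width, ...

    Assumes origin <= min(data) so that every observation falls in a bin.
    """
    data = np.asarray(data, dtype=float)
    n_bins = int(np.ceil((data.max() - origin) / width)) + 1
    edges = origin + width * np.arange(n_bins + 1)
    counts, _ = np.histogram(data, bins=edges)
    # Dividing by n * width makes the bar areas sum to one.
    density = counts / (data.size * width)
    return edges, density

# First 15 observations from the HISTOGRAM slide (illustrative subset only).
sample = np.array([52, 34, 12, 69, 44, 22, 36, 41, 77, 39, 73, 21, 37, 38, 32])

# Same data, same bin width, two different bin origins.
edges_a, dens_a = histogram_density(sample, origin=10.0, width=10.0)
edges_b, dens_b = histogram_density(sample, origin=5.0, width=10.0)

# The left edge of the modal bin moves when only the origin changes.
print("modal bin starts at", edges_a[np.argmax(dens_a)], "with origin 10")
print("modal bin starts at", edges_b[np.argmax(dens_b)], "with origin 5")
```

Both estimates integrate to one, yet the apparent location of the peak depends on where the classes start, which is the motivation for the local histogram and kernel estimates.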


HISTOGRAM VS KERNEL ESTIMATOR

• bias vs variance

• bin width


HISTOGRAM WITH A SMOOTHER

9.868 7.724 7.552 8.481 10.756 11.886 10.506 4.954

7.454 7.802 8.562 12.82 9.886 9.847 12.976 7.452

10.624 9.203 6.164 12.08 6.625 7.612 13.990 6.198

8.221 7.312 8.93 9.199 10.305 8.196 8.761 10.057

8.842 7.538 9.24 10.117 5.893 8.865 8.782 9.286

8.448 5.198 10.349 10.454 9.114 5.179 5.883 11.902

5.631 3.959 3.169 4.982 3.148 2.342 3.787 3.269

5.133 5.143 4.293 4.392 5.343 4.591 4.273 6.001

5.932 4.193 3.669 4.738 4.114 3.626 4.165 2.544

4.096 5.312 2.12 3.732 2.936 4.009 4.5 3.411

6.167 3.496 3.009 6.439 4.366 3.653 4.16 3.528

4.192 5.432 4.839 4.217 5.058 2.488 5.05 3.943

3.874 3.509 5.018 4.563

Stem-and-leaf display of the data above:

Depth  Stem  Leaves
    5     2  13459
   21     3  0112445566677899
   40     4  0011111222335557899
 (14)     5  00011113346889
   46     6  011146
   40     7  34455678
   32     8  1244577889
   22     9  11222888
   14    10  01334567
    6    11  89
    4    12  089
    1    13  9

\[
w(u) =
\begin{cases}
1 & \text{for } |u| < \tfrac{1}{2} \\
0 & \text{for } |u| \ge \tfrac{1}{2}
\end{cases}
\]
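A minimal sketch of how the weight function w on this slide can turn a histogram into a local (moving-window) estimate. The slide gives only w(u); the estimator form f(x) = (1 / (n h)) Σ w((x − x_i) / h) is the standard naive estimator and is an assumption here, as are the names `box_weight` and `local_histogram`, the window width h = 2, and the use of only the first eight data values from this slide.

```python
import numpy as np

def box_weight(u):
    """Weight function from the slide: 1 where |u| < 1/2, otherwise 0."""
    return np.where(np.abs(u) < 0.5, 1.0, 0.0)

def local_histogram(x_grid, data, h):
    """Naive density estimate: fraction of points within h/2 of x, divided by h."""
    x_grid = np.atleast_1d(np.asarray(x_grid, dtype=float))
    data = np.asarray(data, dtype=float)
    # One row per evaluation point, one column per observation.
    u = (x_grid[:, None] - data[None, :]) / h
    return box_weight(u).sum(axis=1) / (data.size * h)

# First eight observations from this slide (illustrative subset only).
sample = np.array([9.868, 7.724, 7.552, 8.481, 10.756, 11.886, 10.506, 4.954])

# Every evaluation point gets its own class centred on it,
# so no fixed class boundaries have to be chosen.
grid = np.linspace(3.0, 13.0, 11)
for x, f in zip(grid, local_histogram(grid, sample, h=2.0)):
    print(f"x = {x:5.1f}   estimate = {f:.4f}")
```

The result no longer depends on a bin origin, but it is still a step function in x; replacing the box weight with a smooth kernel removes the steps.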


DENSITY FUNCTION WITH A KERNEL SMOOTHER


NONPARAMETRIC REGRESSION

[Figure: plot of y against X with the fitted line y = -1.07 + 2.74 X]


NONPARAMETRIC REGRESSION

\[ f(x) = b_0 + b_1 x \]

\[ f(x) = b_0 + b_1 x + b_2 x^2 \]
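A minimal sketch comparing the linear and quadratic parametric forms shown on this slide. It assumes NumPy; the data are hypothetical noise-free values generated from a quadratic trend, not the data behind the slide's plot, and the helper name `rss` is introduced only for this example.

```python
import numpy as np

# Hypothetical data: an exact quadratic trend on the range 3 to 9.
x = np.linspace(3.0, 9.0, 25)
y = 1.0 + 0.5 * x + 0.3 * x**2

# np.polyfit returns coefficients from the highest power down.
b1, b0 = np.polyfit(x, y, deg=1)          # f(x) = b0 + b1 x
c2, c1, c0 = np.polyfit(x, y, deg=2)      # f(x) = b0 + b1 x + b2 x^2

def rss(observed, fitted):
    """Residual sum of squares."""
    return float(np.sum((observed - fitted) ** 2))

rss_linear = rss(y, b0 + b1 * x)
rss_quadratic = rss(y, c0 + c1 * x + c2 * x**2)

print("RSS, linear fit:   ", rss_linear)
print("RSS, quadratic fit:", rss_quadratic)
```

When the chosen parametric form is wrong (here, the straight line), the misfit is systematic rather than random, which is the modeling bias that nonparametric regression aims to reduce.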


NONPARAMETRIC REGRESSION

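\[
Q_\lambda(f) = \sum_{i=1}^{n} \bigl( Y_i - f(x_i) \bigr)^2 + \lambda \int_{x_1}^{x_n} \bigl( f''(x) \bigr)^2 \, dx
\]

\[
\hat{f}_h(x) = \frac{\sum_{i=1}^{n} y_i \, K\!\left( \frac{x - x_i}{h} \right)}{\sum_{i=1}^{n} K\!\left( \frac{x - x_i}{h} \right)}
\]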
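A minimal sketch of the kernel regression estimate on this slide, a kernel-weighted average of the responses with bandwidth h. It assumes NumPy and a Gaussian kernel K (the slide does not say which kernel is used); the function names and the data are hypothetical. The penalized criterion Q with smoothing parameter λ is not implemented here.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_regression(x_grid, x, y, h, kernel=gaussian_kernel):
    """Kernel-weighted average of y at each point of x_grid with bandwidth h."""
    x_grid = np.atleast_1d(np.asarray(x_grid, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # weights[j, i] = K((x_grid[j] - x[i]) / h)
    weights = kernel((x_grid[:, None] - x[None, :]) / h)
    return (weights * y[None, :]).sum(axis=1) / weights.sum(axis=1)

# Hypothetical observations with a nonlinear trend.
x_obs = np.linspace(3.0, 9.0, 30)
y_obs = np.sin(x_obs) + 0.1 * x_obs

# A large bandwidth averages over many neighbours (smoother, more bias);
# a small bandwidth follows the observations closely (rougher, more variance).
fit_smooth = kernel_regression(x_obs, x_obs, y_obs, h=2.0)
fit_rough = kernel_regression(x_obs, x_obs, y_obs, h=0.05)

print("largest gap to data, h = 2.0: ", np.max(np.abs(fit_smooth - y_obs)))
print("largest gap to data, h = 0.05:", np.max(np.abs(fit_rough - y_obs)))
```

The bandwidth h plays the same role here that the bin width plays for the histogram: it controls the trade-off between bias and variance.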


NONPARAMETRIC REGRESSION


AREAS OF RESEARCH

• Nonparametric inferences

• High-dimensional nonparametric modeling

• Functional data analysis

• Information engineering and signal processing

• Nonlinear time series and finance modeling

• Nonparametric modeling in biostatistics