Chapter-5-Curve-fitting-PART-1.pdf - People@UTM

Chapter 5 Curve fitting


• Linear regression (exponential model, power equation and saturation growth rate equation)

• Polynomial Regression

• Polynomial Interpolation (linear interpolation, quadratic interpolation, Newton divided differences)

• Lagrange Interpolation

• Spline Interpolation

• Curve fitting describes techniques for fitting curves to discrete data points so that estimates can be obtained at intermediate values.

• Two general approaches for curve fitting:

a) Least-Squares Regression - fits the shape or general trend of the data with a single "best" curve, without necessarily matching the individual points (Figure PT5.1, pg 426).

- 2 types of fitting:

i) Linear Regression

ii) Polynomial Regression


Figure PT5.1 (pg 439) shows sketches developed from the same set of data by three engineers.

a) Least-squares regression - does not attempt to connect the points, but characterizes the general upward trend of the data with a straight line.

b) Linear interpolation - uses straight-line segments to connect the points; a very common practice in engineering. If the values are close to being linear, such approximation provides estimates that are adequate for many engineering calculations. However, if the data are widely spaced, significant errors can be introduced by linear interpolation.

c) Curvilinear interpolation - uses curves to try to capture the shape suggested by the data.

Our goal here is to develop systematic and objective methods for deriving such curves.


a) Least-square Regression : i) Linear Regression

• Linear regression minimizes the discrepancy between the data points and the fitted curve. Sometimes, polynomial interpolation is inappropriate and may yield unsatisfactory results when used to predict intermediate values (see Fig. 17.1, pg 455).

Fig. 17.1 a) shows 7 experimentally derived data points exhibiting significant variability (significant error).


Linear regression fits a "best" straight line through the points. The mathematical expression for the straight line is:

y = a0 + a1x + e ----- Eq 17.1

where, a1- slope

a0 - intercept

e - error, or residual, between the model and the observations

Rearranging the eq. above as:

e = y - a0 - a1x

Thus, the error or residual, is the discrepancy between the true value y and the approximate value, a0+a1x, predicted by the linear equation.


Criteria for a "best" Fit

• One criterion for a "best" fit line through the data is to minimize the sum of the residual errors, given by Eq 17.2, where n is the total number of points. This criterion has shortcomings, since errors of opposite sign can cancel.

• A strategy that overcomes the shortcomings: minimize the sum of the squares of the errors between the measured y and the y calculated with the linear model, as shown in Eq 17.3.


Σ_{i=1}^{n} e_i = Σ_{i=1}^{n} (y_i − a_0 − a_1 x_i) ----- Eq 17.2

S_r = Σ_{i=1}^{n} e_i² = Σ_{i=1}^{n} (y_{i,measured} − y_{i,model})² = Σ_{i=1}^{n} (y_i − a_0 − a_1 x_i)² ----- Eq 17.3

Least-squares fit for a straight line

• To determine values for a0 and a1: i) differentiate Eq 17.3 with respect to each coefficient; ii) set the derivatives equal to zero (minimizing Sr); iii) use Σa0 = n·a0. This gives equations 17.4 and 17.5, called the normal equations (refer to the textbook), which can be solved simultaneously for a1 and a0:

a_1 = (n Σx_i y_i − Σx_i Σy_i) / (n Σx_i² − (Σx_i)²) ----- Eq 17.6

a_0 = ȳ − a_1 x̄ ----- Eq 17.7

where ȳ and x̄ are the means of y and x.

EXAMPLE 1

Use least-squares regression to fit a straight line to:

x 1 2 3 4 5 6 7

y 0.5 2.5 2.0 4.0 3.5 6.0 5.5
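As a sketch, Eqs 17.6 and 17.7 can be applied to the Example 1 data with a few lines of code (the helper name linear_fit is ours, not from the text):

```python
# Least-squares straight-line fit using the normal equations (Eqs 17.6-17.7).
def linear_fit(x, y):
    """Return (a0, a1) for the best-fit line y = a0 + a1*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)  # Eq 17.6
    a0 = sum_y / n - a1 * sum_x / n                                # Eq 17.7
    return a0, a1

x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
a0, a1 = linear_fit(x, y)
print(a0, a1)  # a0 ≈ 0.0714, a1 ≈ 0.8393
```

The fitted line for this data is y = 0.0714 + 0.8393x.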

• Two criteria under which least-squares regression provides the best estimates of a0 and a1 (the maximum likelihood principle in statistics):

i. The spread of the points around the line is of similar magnitude along the entire range of the data.

ii. The distribution of these points about the line is normal.

• If these criteria are met, a "standard deviation" for the regression line is given by Eq. 17.9, where:

s_{y/x} : standard error of the estimate

"y/x" : the error is for a predicted value of y corresponding to a particular value of x

n − 2 : two data-derived estimates, a0 and a1, were used to compute Sr (we have lost 2 degrees of freedom)


s_{y/x} = √( S_r / (n − 2) ) ---------- Eq. 17.9

• Equation 17.9 is derived from the standard deviation (s_y) about the mean:

s_y = √( S_t / (n − 1) ) -------- (PT5.2, pg 442)

S_t = Σ(y_i − ȳ)² -------- (PT5.3, pg 442)

S_t : total sum of squares of the residuals between the data points and the mean.

• Just as with the standard deviation, the standard error of the estimate quantifies the spread of the data.

Estimation of errors in summary

1. Standard Deviation

s_y = √( S_t / (n − 1) ), S_t = Σ(y_i − ȳ)² ----- (PT5.2, PT5.3, pg 442)

2. Standard error of the estimate

s_{y/x} = √( S_r / (n − 2) ), S_r = Σ e_i² = Σ_{i=1}^{n} (y_i − a_0 − a_1 x_i)² ----- (Eq 17.8, Eq 17.9)

where y/x designates that the error is for a predicted value of y corresponding to a particular value of x.

3. Determination coefficient

r² = (S_t − S_r) / S_t ----- Eq 17.10

4. Correlation coefficient

r = (n Σx_i y_i − Σx_i Σy_i) / [ √(n Σx_i² − (Σx_i)²) √(n Σy_i² − (Σy_i)²) ] ----- Eq 17.11

EXAMPLE 2

Use least-squares regression to fit a straight line to:

x 1 2 3 4 5 6 7

y 0.5 2.5 2.0 4.0 3.5 6.0 5.5

Compute the standard deviation, the standard error of the estimate, and the correlation coefficient for the data above (use the Example 1 result).
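The error statistics of Eqs 17.8-17.11 for this data can be sketched as follows (the fit coefficients are recomputed so the block is self-contained):

```python
import math

# Error statistics for the straight-line fit of the Example 1/2 data.
x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_xx = sum(a * a for a in x)
a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)  # Eq 17.6
a0 = sum_y / n - a1 * sum_x / n                                # Eq 17.7

y_bar = sum_y / n
St = sum((yi - y_bar) ** 2 for yi in y)                        # PT5.3
Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))     # Eq 17.8
s_y = math.sqrt(St / (n - 1))     # standard deviation (PT5.2)
s_yx = math.sqrt(Sr / (n - 2))    # standard error of estimate (Eq 17.9)
r2 = (St - Sr) / St               # determination coefficient (Eq 17.10)
print(s_y, s_yx, math.sqrt(r2))   # ≈ 1.946, 0.773, 0.932
```

Since s_yx < s_y, the linear model has merit; r ≈ 0.932 indicates that about 87% of the original uncertainty is explained by the line.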

QUIZ 1

Use least-squares regression to fit a straight line to:

Compute the standard error of the estimate and the correlation coefficient.

x 1 2 3 4 5 6 7 8 9

y 1 1.5 2 3 4 5 8 10 13

QUIZ 2 (DIY)

Compute the standard error of the estimate and the

correlation coefficient.

x 0.25 0.75 1.25 1.50 2.00

y -0.45 -0.60 0.70 1.88 6.00


Linearization of Nonlinear Relationships

• Linear regression provides a powerful technique for fitting the best line to data, where the relationship between the dependent and independent variables is linear.

• But this is not always the case; thus, the first step in any regression analysis should be to plot the data and visually inspect whether a linear model applies.


Figure 17.8: a) data that is ill-suited for linear regression; b) a parabola is preferable.


Figure 17.9: Type of polynomial equations and their linearized versions, respectively.


• Fig. 17.9, pg 453, shows population-growth or radioactive-decay behavior.

Fig. 17.9 (a): the exponential model

y = α₁ e^{β₁ x} ------ (17.12)

α₁, β₁ : constants, β₁ ≠ 0

This model is used in many fields of engineering to characterize quantities.

Quantities increase : β₁ positive
Quantities decrease : β₁ negative

Example 3

Fit an exponential model y = a e^{bx} to:

Solution

• Linearize the model by taking natural logarithms:

ln y = ln a + bx

which has the straight-line form y = a0 + a1x ----- (Eq. 17.1)

• Build the table of parameters used in Eqs 17.6 and 17.7, as in Example 17.1, pg 444.


x 0.4 0.8 1.2 1.6 2.0 2.3

y 750 1000 1400 2000 2700 3750


xi     yi     ln yi       xi²     (xi)(ln yi)
0.4    750    6.620073    0.16     2.648029
0.8    1000   6.900775    0.64     5.520620
1.2    1400   7.244228    1.44     8.693074
1.6    2000   7.600902    2.56    12.161443
2.0    2700   7.901007    4.00    15.802014
2.3    3750   8.229511    5.29    18.927875
Σ      8.3    44.496496   14.09   63.753055

From the table: n = 6, Σx_i = 8.3, Σ ln y_i = 44.496496, Σx_i² = 14.09, Σ(x_i)(ln y_i) = 63.753055

x̄ = Σx_i / n = 8.3 / 6 = 1.383333

mean(ln y) = Σ ln y_i / n = 44.496496 / 6 = 7.416083

Curve fitting

24

a₁ = b = [n Σ(x_i)(ln y_i) − Σx_i Σ ln y_i] / [n Σx_i² − (Σx_i)²]
= [(6)(63.753055) − (8.3)(44.496496)] / [(6)(14.09) − (8.3)²] = 0.843

a₀ = ln a = mean(ln y) − b x̄ = 7.416083 − (0.843)(1.383333) = 6.25

a = e^{6.25} = 518

Straight-line: ln y = 6.25 + 0.843x

Exponential: y = a e^{bx} = 518 e^{0.843x}
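The linearized exponential fit can be sketched as below (the helper name exp_fit is ours; because the slides tabulate ln y_i to six rounded decimals, the hand values a = 518, b = 0.843 may differ slightly from full-precision output):

```python
import math

# Exponential fit y = a*e^(b*x) by linear regression on (x, ln y).
def exp_fit(x, y):
    n = len(x)
    ln_y = [math.log(v) for v in y]
    sx, sl = sum(x), sum(ln_y)
    sxl = sum(xi * li for xi, li in zip(x, ln_y))
    sxx = sum(xi * xi for xi in x)
    b = (n * sxl - sx * sl) / (n * sxx - sx ** 2)   # slope (Eq 17.6)
    ln_a = sl / n - b * sx / n                      # intercept (Eq 17.7)
    return math.exp(ln_a), b

x = [0.4, 0.8, 1.2, 1.6, 2.0, 2.3]
y = [750, 1000, 1400, 2000, 2700, 3750]
a, b = exp_fit(x, y)
print(a, b)   # close to the slide result a = 518, b = 0.843
```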


Power Equation

y = α₂ x^{β₂} -------- (17.13)

• Equation (17.13) can be linearized by taking the base-10 logarithm to yield:

log y = β₂ log x + log α₂ -------- (17.16)

• A plot of log y versus log x will yield a straight line with a slope of β₂ and an intercept of log α₂.

Power Equation

Example 4

Linearization of a power equation: fit equation (17.13) to the data in the table below using a logarithmic transformation of the data.


x 1 2 3 4 5

y 0.5 1.7 3.4 5.7 8.4

Solution:


xi   yi    log xi   log yi   (log xi)²   (log xi)(log yi)
1    0.5   0        -0.301   0           0
2    1.7   0.301    0.226    0.090601    0.068026
3    3.4   0.477    0.534    0.227529    0.254718
4    5.7   0.602    0.753    0.362404    0.453306
5    8.4   0.699    0.922    0.488601    0.644478
Σ          2.079    2.134    1.169135    1.420528


From the table: n = 5, Σ log x_i = 2.079, Σ log y_i = 2.134, Σ(log x_i)² = 1.169135, Σ(log x_i)(log y_i) = 1.420528

mean(log x) = 2.079 / 5 = 0.4158

mean(log y) = 2.134 / 5 = 0.4268

b = [n Σ(log x_i)(log y_i) − Σ(log x_i) Σ(log y_i)] / [n Σ(log x_i)² − (Σ log x_i)²]

b = [(5)(1.420528) − (2.079)(2.134)] / [(5)(1.169135) − (2.079)²] = 1.75


log a = mean(log y) − b · mean(log x) = 0.4268 − (1.75)(0.4158) = −0.3

a = 10^{−0.3} = 0.5

Straight-line: log y = −0.3 + 1.75 log x

Power: y = a x^b = 0.5 x^{1.75}

• Fig. 17.10 a), pg 455, is a plot of the original data in its untransformed state, while Fig. 17.10 b) is a plot of the transformed data.

• The intercept is log α₂ = −0.300; taking the antilogarithm, α₂ = 10^{−0.3} = 0.5.

• The slope is β₂ = 1.75; consequently, the power equation is y = 0.5x^{1.75}.
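The same log-log transformation can be sketched in code (the helper name power_fit is ours, not from the text):

```python
import math

# Power fit y = a*x**b by linear regression on (log x, log y).
def power_fit(x, y):
    n = len(x)
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    sx, sy = sum(lx), sum(ly)
    sxy = sum(a * b for a, b in zip(lx, ly))
    sxx = sum(a * a for a in lx)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope = beta2
    log_a = sy / n - b * sx / n                     # intercept = log(alpha2)
    return 10 ** log_a, b

x = [1, 2, 3, 4, 5]
y = [0.5, 1.7, 3.4, 5.7, 8.4]
a, b = power_fit(x, y)
print(a, b)   # ≈ 0.5, 1.75
```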


Saturation-growth-rate Equation

y = α₃ x / (β₃ + x) ------- (17.14)

• Equation (17.14) can be linearized by inverting it to yield:

1/y = (β₃/α₃)(1/x) + 1/α₃ ------- (17.17)

• A plot of 1/y versus 1/x will yield a straight line with a slope of β₃/α₃ and an intercept of 1/α₃.

• In their transformed forms, these models are fit using linear regression in order to evaluate the constant coefficients.

• They can then be transformed back to their original form and used for predictive purposes.

Example 5

Fit a saturation-growth-rate equation to the data in the table below.


x 0.75 2 2.5 4 6 8 8.5

y 0.8 1.3 1.2 1.6 1.7 1.8 1.7

Solution


From the table below: n = 7, Σ(1/x_i) = 2.8926, Σ(1/y_i) = 5.2094, Σ(1/x_i)² = 2.3074, Σ(1/x_i)(1/y_i) = 2.8127

mean(1/x) = 2.8926 / 7 = 0.4132

mean(1/y) = 5.2094 / 7 = 0.7442


xi     yi    1/xi      1/yi      (1/xi)²   (1/xi)(1/yi)
0.75   0.8   1.33333   1.25000   1.7777    1.6666
2      1.3   0.50000   0.76923   0.2500    0.3846
2.5    1.2   0.40000   0.83333   0.1600    0.3333
4      1.6   0.25000   0.62500   0.0625    0.1562
6      1.7   0.16667   0.58823   0.0278    0.0981
8      1.8   0.12500   0.55555   0.0156    0.0694
8.5    1.7   0.11765   0.58823   0.0138    0.1045
Σ            2.89260   5.20940   2.3074    2.8127

b/a = [n Σ(1/x_i)(1/y_i) − Σ(1/x_i) Σ(1/y_i)] / [n Σ(1/x_i)² − (Σ(1/x_i))²]

= [(7)(2.8127) − (2.8926)(5.2094)] / [(7)(2.3074) − (2.8926)²] = 0.5935


1/a = mean(1/y) − (b/a) · mean(1/x) = 0.7442 − (0.5935)(0.4132) = 0.4990

a = 1 / 0.4990 ≈ 2

b = (b/a) · a = (0.5935)(2) = 1.187

Straight-line: 1/y = 0.4990 + 0.5935 (1/x)

Saturation-growth: y = a x / (b + x) = 2x / (1.187 + x)
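The reciprocal transformation of Eq 17.17 can be sketched in code (the helper name saturation_fit is ours; the 4-decimal rounding in the slide's tabulated products means the hand answer a ≈ 2, b ≈ 1.187 may differ somewhat from full-precision output):

```python
# Saturation-growth-rate fit y = a*x/(b + x) by linear regression on (1/x, 1/y).
def saturation_fit(x, y):
    n = len(x)
    u = [1.0 / v for v in x]   # 1/x values
    w = [1.0 / v for v in y]   # 1/y values
    su, sw = sum(u), sum(w)
    suw = sum(ui * wi for ui, wi in zip(u, w))
    suu = sum(ui * ui for ui in u)
    slope = (n * suw - su * sw) / (n * suu - su ** 2)   # slope = b/a
    intercept = sw / n - slope * su / n                 # intercept = 1/a
    a = 1.0 / intercept
    return a, slope * a        # (a, b)

x = [0.75, 2, 2.5, 4, 6, 8, 8.5]
y = [0.8, 1.3, 1.2, 1.6, 1.7, 1.8, 1.7]
a, b = saturation_fit(x, y)
print(a, b)
```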


Quiz 3

Fit a power equation and saturation growth rate equation to:


x 1 2 3 4 5 6 7

y 2.1 2.2 2.3 2.4 2.5 2.6 2.7


Polynomial Regression (pg 470)

• Another alternative is to fit polynomials to the data using polynomial regression.

• The least-squares procedure can be readily extended to fit the data to a higher-order polynomial.

• For example, to fit a second-order polynomial or quadratic:

y = a₀ + a₁x + a₂x² + e

• The sum of the squares of the residuals is:

S_r = Σ_{i=1}^{n} (y_i − a₀ − a₁x_i − a₂x_i²)² ----- Eq 17.18

where n = total number of points

• Then, take the derivative of equation (17.18) with respect to each of the unknown coefficients a₀, a₁, and a₂ of the polynomial:

∂S_r/∂a₀ = −2 Σ (y_i − a₀ − a₁x_i − a₂x_i²)

∂S_r/∂a₁ = −2 Σ x_i (y_i − a₀ − a₁x_i − a₂x_i²)

∂S_r/∂a₂ = −2 Σ x_i² (y_i − a₀ − a₁x_i − a₂x_i²)

• Setting these equations equal to zero and rearranging (using Σa₀ = n·a₀) develops the set of normal equations:

n a₀ + (Σx_i) a₁ + (Σx_i²) a₂ = Σy_i

(Σx_i) a₀ + (Σx_i²) a₁ + (Σx_i³) a₂ = Σx_i y_i

(Σx_i²) a₀ + (Σx_i³) a₁ + (Σx_i⁴) a₂ = Σx_i² y_i ----- Eq 17.19

• The above 3 equations are linear in the 3 unknown coefficients (a₀, a₁, a₂), which can be calculated directly from the observed data.

• In matrix form:

[ n      Σx_i    Σx_i²  ] [a₀]   [ Σy_i      ]
[ Σx_i   Σx_i²   Σx_i³  ] [a₁] = [ Σx_i y_i  ]
[ Σx_i²  Σx_i³   Σx_i⁴  ] [a₂]   [ Σx_i² y_i ]

• The two-dimensional case is easily extended to an mth-order polynomial:

y = a₀ + a₁x + a₂x² + ... + a_m x^m + e

• Thus, the standard error for an mth-order polynomial is:

s_{y/x} = √( S_r / (n − (m+1)) ) ----- Eq 17.20

Example 6

Fit a second order polynomial to the data in the first 2 columns of table 17.4:

• From the given data:

m = 2, n = 6, x̄ = 2.5, ȳ = 25.433

Σx_i = 15, Σy_i = 152.6, Σx_i² = 55, Σx_i³ = 225, Σx_i⁴ = 979, Σx_i y_i = 585.6, Σx_i² y_i = 2488.8


xi   yi      xi²   xi³   xi⁴   xi·yi   xi²·yi
0    2.1     0     0     0     0       0
1    7.7     1     1     1     7.7     7.7
2    13.6    4     8     16    27.2    54.4
3    27.2    9     27    81    81.6    244.8
4    40.9    16    64    256   163.6   654.4
5    61.1    25    125   625   305.5   1527.5
Σ    15  152.6  55  225  979  585.6  2488.8

• Therefore, the simultaneous linear equations are:

[  6   15    55 ] [a₀]   [  152.6 ]
[ 15   55   225 ] [a₁] = [  585.6 ]
[ 55  225   979 ] [a₂]   [ 2488.8 ]

• Solving these equations through a technique such as Gauss elimination gives:

a₀ = 2.47857, a₁ = 2.35929, and a₂ = 1.86071

• Therefore, the least-squares quadratic equation for this case is:

y = 2.47857 + 2.35929x + 1.86071x²

• To calculate S_t and S_r, build table 17.4 for columns 3 and 4.

xi   yi     (yi − ȳ)²   (yi − a₀ − a₁xi − a₂xi²)²
0    2.1    544.44      0.14332
1    7.7    314.47      1.00286
2    13.6   140.03      1.08158
3    27.2   3.12        0.80491
4    40.9   239.22      0.61951
5    61.1   1272.11     0.09439
Σ    152.6  2513.39     3.74657

S_t = Σ(y_i − ȳ)² = 2513.39

S_r = Σ(y_i − a₀ − a₁x_i − a₂x_i²)² = 3.74657

The standard error of the regression polynomial:

s_{y/x} = √( S_r / (n − (m+1)) ) = √( 3.74657 / (6 − (2+1)) ) = 1.12

• The correlation coefficient can be calculated by using equations 17.10 and 17.11, respectively:

Therefore, r2 = (St – Sr) / St = (2513.39 – 3.74657) / 2513.39

r2 = 0.99851

The correlation coefficient is, r = 0.99925

• The results indicate that 99.851% of the original uncertainty has been explained by the model. This result supports the conclusion that the quadratic equation represents an excellent fit, as is evident from Fig. 17.11.
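Example 6 can be sketched end to end in code; the normal equations of Eq 17.19 are solved by plain Gauss elimination, as the text suggests (the helper name polyfit2 is ours):

```python
# Quadratic least-squares fit: build and solve the Eq 17.19 normal equations.
def polyfit2(x, y):
    n = len(x)
    sx  = sum(x)
    sx2 = sum(v ** 2 for v in x)
    sx3 = sum(v ** 3 for v in x)
    sx4 = sum(v ** 4 for v in x)
    sy   = sum(y)
    sxy  = sum(a * b for a, b in zip(x, y))
    sx2y = sum(a * a * b for a, b in zip(x, y))
    # Augmented matrix of the normal equations (Eq 17.19)
    M = [[n,   sx,  sx2, sy],
         [sx,  sx2, sx3, sxy],
         [sx2, sx3, sx4, sx2y]]
    for k in range(2):                     # forward elimination
        for i in range(k + 1, 3):
            f = M[i][k] / M[k][k]
            for j in range(k, 4):
                M[i][j] -= f * M[k][j]
    a = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                    # back substitution
        a[i] = (M[i][3] - sum(M[i][j] * a[j] for j in range(i + 1, 3))) / M[i][i]
    return a

x = [0, 1, 2, 3, 4, 5]
y = [2.1, 7.7, 13.6, 27.2, 40.9, 61.1]
a0, a1, a2 = polyfit2(x, y)
print(a0, a1, a2)   # ≈ 2.47857, 2.35929, 1.86071
```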



Figure 17.11: Fit of a second-order polynomial.
