Chapter-5-Curve-fitting-PART-1.pdf - People@UTM
Chapter 5
Curve fitting
• Linear regression (exponential model, power equation and saturation growth rate equation)
• Polynomial Regression
• Polynomial Interpolation (Linear interpolation, Quadratic Interpolation, Newton DD)
• Lagrange Interpolation
• Spline Interpolation
• Curve fitting describes techniques for fitting curves through discrete data points so that intermediate estimates can be obtained.
• Two general approaches for curve fitting:
a) Least-Squares Regression - fits the shape or general trend of the data with a 'best' line, without necessarily matching the individual points (figure PT5.1, pg 426).
- 2 types of fitting:
i) Linear Regression
ii) Polynomial Regression
b) Interpolation - fits a curve, or a series of curves, that passes exactly through every data point.
Figure PT5.1, pg 439 shows sketches developed from the same set of data by three engineers:
a) Least-squares regression - did not attempt to connect the points, but characterized the general upward trend of the data with a straight line.
b) Linear interpolation - used straight-line segments (linear interpolation) to connect the points. This is a very common practice in engineering. If the values are close to linear, such approximation provides estimates adequate for many engineering calculations; however, if the data are widely spaced, significant errors can be introduced by such linear interpolation.
c) Curvilinear interpolation - used curves to try to capture the trend suggested by the data.
Our goal here is to develop systematic and objective methods for deriving such curves.
a) Least-squares Regression: i) Linear Regression
• Regression is used to minimize the discrepancy between the data points and the fitted curve. Polynomial interpolation is sometimes inappropriate and may yield unsatisfactory results when used to predict intermediate values (see Fig. 17.1, pg 455).
Fig. 17.1 a): shows 7 experimentally derived data points exhibiting significant variability, i.e. data with significant error.
Linear Regression is fitting a 'best' straight line through the points.
The mathematical expression for the straight line is:
y = a0 + a1x + e    ----- Eq 17.1
where, a1 - slope
a0 - intercept
e - error, or residual, between the model and the observations
Rearranging the eq. above as:
e = y − a0 − a1x
Thus, the error, or residual, is the discrepancy between the true value y and the approximate value, a0 + a1x, predicted by the linear equation.
Criteria for a 'best' Fit
• One way to fit a 'best' line through the data is to minimize the sum of the residual errors:
Σ(i=1..n) ei = Σ(i=1..n) (yi − a0 − a1xi)    ----- Eq 17.2
where, n : total number of points
• This criterion is inadequate, because positive and negative errors can cancel. A strategy that overcomes this shortcoming is to minimize the sum of the squares of the errors between the measured y and the y calculated with the linear model:
Sr = Σ(i=1..n) ei² = Σ(i=1..n) (yi,measured − yi,model)² = Σ(i=1..n) (yi − a0 − a1xi)²    ----- Eq 17.3
Least-squares fit for a straight line
• To determine values for a0 and a1: i) differentiate equation 17.3 with respect to each coefficient; ii) set the derivatives equal to zero (minimizing Sr); iii) use Σa0 = n·a0. This gives equations 17.4 and 17.5, called the normal equations (refer to the textbook), which can be solved simultaneously for a1 and a0:
a1 = [n·Σxiyi − Σxi·Σyi] / [n·Σxi² − (Σxi)²]    ----- Eq 17.6
a0 = ȳ − a1·x̄    ----- Eq 17.7
where x̄, ȳ : means of x and y
EXAMPLE 1
Use least-squares regression to fit a straight line to:
x 1 2 3 4 5 6 7
y 0.5 2.5 2.0 4.0 3.5 6.0 5.5
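The normal equations (17.6 and 17.7) translate directly into a few lines of code. A minimal sketch in plain Python (the helper name `linear_fit` is my own, not from the text), checked against the Example 1 data:

```python
# Least-squares straight-line fit via the normal equations (Eqs. 17.6 and 17.7).
def linear_fit(x, y):
    """Return (a0, a1) for the model y = a0 + a1*x."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope, Eq. 17.6
    a0 = sy / n - a1 * sx / n                       # intercept, Eq. 17.7
    return a0, a1

# Data from Example 1
a0, a1 = linear_fit([1, 2, 3, 4, 5, 6, 7], [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5])
# a0 ≈ 0.0714, a1 ≈ 0.8393, i.e. y = 0.0714 + 0.8393x
```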
• Least-squares regression provides the best (i.e., maximum likelihood) estimates of a0 and a1 when two criteria are met:
i. The spread of the points around the line is of similar magnitude along the entire range of the data.
ii. The distribution of these points about the line is normal.
• If these criteria are met, the "standard error of the estimate" for the regression line is given by:
s_y/x = sqrt( Sr / (n − 2) )    ---------- Eq. 17.9
s_y/x : standard error of the estimate
"y/x" : designates that the error is for a predicted value of y corresponding to a particular value of x
n − 2 : two data-derived estimates, a0 and a1, were used to compute Sr (we have lost 2 degrees of freedom)
• Equation 17.9 is derived in analogy with the standard deviation (sy) about the mean:
sy = sqrt( St / (n − 1) )    -------- (PT5.2, pg 442)
St = Σ(yi − ȳ)²    -------- (PT5.3, pg 442)
St : total sum of squares of the residuals between the data points and the mean.
• Just as in the case of the standard deviation, the standard error of the estimate quantifies the spread of the data (here, around the regression line rather than the mean).
Estimation of errors in summary
1. Standard Deviation
sy = sqrt( St / (n − 1) ),  where St = Σ(yi − ȳ)²    ----- (PT5.2, PT5.3, pg 442)
2. Standard error of the estimate
Sr = Σei² = Σ(yi − a0 − a1xi)²    ----- Eq 17.8
s_y/x = sqrt( Sr / (n − 2) )    ----- Eq 17.9
where y/x designates that the error is for a predicted value of y corresponding to a particular value of x.
3. Determination coefficient
r² = (St − Sr) / St    ----- Eq 17.10
4. Correlation coefficient
r = sqrt( (St − Sr) / St ), or equivalently:
r = [n·Σxiyi − (Σxi)(Σyi)] / [ sqrt(n·Σxi² − (Σxi)²) · sqrt(n·Σyi² − (Σyi)²) ]    ----- Eq 17.11
EXAMPLE 2
Use least-squares regression to fit a straight line to:
x 1 2 3 4 5 6 7
y 0.5 2.5 2.0 4.0 3.5 6.0 5.5
Compute the standard deviation, the standard error of the estimate, and the correlation coefficient for the data above (use the Example 1 results).
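The error measures above can be computed in a short script. A sketch in plain Python (the helper name `regression_stats` is mine, not from the text), using the Example 1 fit coefficients:

```python
# Error measures for a linear fit: PT5.2 (sy), Eq. 17.8 (Sr), Eq. 17.9 (s_y/x),
# and Eq. 17.10 (r^2).
import math

def regression_stats(x, y, a0, a1):
    n = len(x)
    ybar = sum(y) / n
    st = sum((yi - ybar) ** 2 for yi in y)                       # PT5.3
    sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))   # Eq. 17.8
    sy = math.sqrt(st / (n - 1))     # standard deviation, PT5.2
    syx = math.sqrt(sr / (n - 2))    # standard error of the estimate, Eq. 17.9
    r2 = (st - sr) / st              # determination coefficient, Eq. 17.10
    return sy, syx, r2

x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
sy, syx, r2 = regression_stats(x, y, 0.0714286, 0.8392857)  # fit from Example 1
# sy ≈ 1.9457, syx ≈ 0.7734, r2 ≈ 0.868 (so r ≈ 0.932)
```

Since syx < sy, the regression model has merit: the spread about the line is smaller than the spread about the mean.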
QUIZ 1
Use least-squares regression to fit a straight line to the data below. Compute the standard error of the estimate and the correlation coefficient.
x 1 2 3 4 5 6 7 8 9
y 1 1.5 2 3 4 5 8 10 13
QUIZ 2 (DIY)
Compute the standard error of the estimate and the
correlation coefficient.
x 0.25 0.75 1.25 1.50 2.00
y -0.45 -0.60 0.70 1.88 6.00
Linearization of Nonlinear Relationships
• Linear regression provides a powerful technique for fitting the best line to data, where the relationship between the dependent and independent variables is linear.
• But this is not always the case; thus, the first step in any regression analysis should be to plot the data and visually inspect whether a linear model applies.
Figure 17.8: a) data is ill-suited for linear regression, b) parabola is preferable.
Figure 17.9: Types of nonlinear equations and their linearized versions, respectively.
• Fig. 17.9, pg 453 shows behavior such as population growth or radioactive decay.
Fig. 17.9 (a): the exponential model
y = α1·e^(β1·x)    ------ (17.12)
α1, β1 : constants, β1 ≠ 0
This model is used in many fields of engineering to characterize quantities that
increase : β1 positive
decrease : β1 negative
Example 3
Fit an exponential model y = a·e^(bx) to:
Solution
• Linearize the model:
ln y = ln a + bx
which has the straight-line form y = a0 + a1x    ----- (Eq. 17.1)
• Build the table of the quantities used in Eqs. 17.6 and 17.7, as in Example 17.1, pg 444.
x 0.4 0.8 1.2 1.6 2.0 2.3
y 750 1000 1400 2000 2700 3750
xi    yi     ln yi      xi²     xi·ln yi
0.4   750    6.620073   0.16     2.648029
0.8   1000   6.907755   0.64     5.526204
1.2   1400   7.244228   1.44     8.693074
1.6   2000   7.600902   2.56    12.161443
2.0   2700   7.901007   4.00    15.802014
2.3   3750   8.229511   5.29    18.927875
Σ     8.3   44.503476  14.09    63.758639

With n = 6:
x̄ = 8.3/6 = 1.383333
(ln y)bar = 44.503476/6 = 7.417246

a1 = b = [n·Σxi·ln yi − Σxi·Σln yi] / [n·Σxi² − (Σxi)²]
       = [6(63.758639) − (8.3)(44.503476)] / [6(14.09) − (8.3)²]
       = 0.8417

a0 = ln a = (ln y)bar − b·x̄ = 7.417246 − (0.8417)(1.383333) = 6.2529

a = e^6.2529 ≈ 519.5

Straight-line: ln y = 6.2529 + 0.8417x
Exponential:   y = 519.5·e^(0.8417x)
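The exponential fit above can be reproduced in code by transforming y to ln y and reusing the straight-line normal equations. A sketch in plain Python, computed directly from the raw data (so values may differ in the last digit from hand calculations with rounded table entries):

```python
# Exponential fit y = a*e^(b*x) by linearizing to ln y = ln a + b*x.
import math

x = [0.4, 0.8, 1.2, 1.6, 2.0, 2.3]
y = [750, 1000, 1400, 2000, 2700, 3750]

w = [math.log(v) for v in y]          # transform: ln y
n = len(x)
sx, sw = sum(x), sum(w)
sxw = sum(xi * wi for xi, wi in zip(x, w))
sxx = sum(xi * xi for xi in x)

b = (n * sxw - sx * sw) / (n * sxx - sx * sx)   # slope, b ≈ 0.842
a = math.exp(sw / n - b * sx / n)               # antilog of intercept, a ≈ 519.5
```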
Power Equation
• The power equation (17.13) can be linearized by taking the base-10 logarithm to yield (17.16):
y = α2·x^β2    -------- (17.13)
log y = β2·log x + log α2    -------- (17.16)
• A plot of log y versus log x will yield a straight line with a slope of β2 and an intercept of log α2.
Example 4
Linearization of a power equation: fit equation (17.13) to the data in the table below using a logarithmic transformation of the data.
x 1 2 3 4 5
y 0.5 1.7 3.4 5.7 8.4
Solution:
xi yi log xi log yi (log xi)² (log xi)(log yi)
1 0.5 0 -0.301 0 0
2 1.7 0.301 0.226 0.090601 0.068026
3 3.4 0.477 0.534 0.227529 0.254718
4 5.7 0.602 0.753 0.362404 0.453306
5 8.4 0.699 0.922 0.488601 0.644478
2.079 2.134 1.169135 1.420528
With n = 5:
(log x)bar = 2.079/5 = 0.4158
(log y)bar = 2.134/5 = 0.4268
Σ(log xi) = 2.079, Σ(log yi) = 2.134, Σ(log xi)² = 1.169135, Σ(log xi)(log yi) = 1.420528

b = [n·Σ(log xi)(log yi) − Σ(log xi)·Σ(log yi)] / [n·Σ(log xi)² − (Σlog xi)²]
  = [5(1.420528) − (2.079)(2.134)] / [5(1.169135) − (2.079)²]
  = 1.75

log a = (log y)bar − b·(log x)bar = 0.4268 − (1.75)(0.4158) = −0.3
a = 10^(−0.3) = 0.5

Straight-line: log y = −0.3 + 1.75·log x
Power:         y = 0.5·x^1.75
• Fig. 17.10 a), pg 455, is a plot of the original data in its untransformed state, while Fig. 17.10 b) is a plot of the transformed data.
• The intercept is log α2 = −0.300; taking the antilogarithm, α2 = 10^(−0.3) = 0.5.
• The slope is β2 = 1.75. Consequently, the power equation is y = 0.5x^1.75.
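The power-law fit of Example 4 follows the same pattern with a log10 transform of both variables. A sketch in plain Python, computed directly from the raw data:

```python
# Power-law fit y = a*x^b via log10 transform (Eqs. 17.13 and 17.16).
import math

x = [1, 2, 3, 4, 5]
y = [0.5, 1.7, 3.4, 5.7, 8.4]

lx = [math.log10(v) for v in x]       # transform: log x
ly = [math.log10(v) for v in y]       # transform: log y
n = len(x)
sx, sy = sum(lx), sum(ly)
sxy = sum(p * q for p, q in zip(lx, ly))
sxx = sum(p * p for p in lx)

b = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope, b ≈ 1.75
a = 10 ** (sy / n - b * sx / n)                 # antilog of intercept, a ≈ 0.5
```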
Saturation-growth rate Equation
• Equation (17.14) can be linearized by inverting it to yield (17.17):
y = α3·x / (β3 + x)    ------- (17.14)
1/y = (β3/α3)·(1/x) + 1/α3    ------- (17.17)
• A plot of 1/y versus 1/x will yield a straight line with a slope of β3/α3 and an intercept of 1/α3.
• In their transformed forms, these models are fit using linear regression in order to evaluate the constant coefficients.
• They can then be transformed back to their original form and used for predictive purposes.
Example 5
Fit a saturation-growth rate equation (17.14) to the data in the table below, using the linearized form (17.17).
x 0.75 2 2.5 4 6 8 8.5
y 0.8 1.3 1.2 1.6 1.7 1.8 1.7
Solution
First tabulate the transformed data:

xi     yi    1/xi     1/yi     (1/xi)²   (1/xi)(1/yi)
0.75   0.8   1.33333  1.25000  1.77778   1.66667
2      1.3   0.50000  0.76923  0.25000   0.38462
2.5    1.2   0.40000  0.83333  0.16000   0.33333
4      1.6   0.25000  0.62500  0.06250   0.15625
6      1.7   0.16667  0.58823  0.02778   0.09804
8      1.8   0.12500  0.55556  0.01563   0.06944
8.5    1.7   0.11765  0.58823  0.01384   0.06920
Σ            2.89265  5.20958  2.30753   2.77755

With n = 7:
(1/x)bar = 2.89265/7 = 0.41324
(1/y)bar = 5.20958/7 = 0.74423

b/a = [n·Σ(1/xi)(1/yi) − Σ(1/xi)·Σ(1/yi)] / [n·Σ(1/xi)² − (Σ(1/xi))²]
    = [7(2.77755) − (2.89265)(5.20958)] / [7(2.30753) − (2.89265)²]
    = 0.5617

1/a = (1/y)bar − (b/a)·(1/x)bar = 0.74423 − (0.5617)(0.41324) = 0.51211
a = 1/0.51211 = 1.95
b = (b/a)·a = (0.5617)(1.95) = 1.10

Straight-line:      1/y = 0.51211 + 0.5617·(1/x)
Saturation-growth:  y = 1.95x / (1.10 + x)
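The saturation-growth fit uses the same linear machinery on the reciprocals of both variables. A sketch in plain Python, computed directly from the raw data (so values may differ slightly from hand calculations with rounded table entries):

```python
# Saturation-growth fit y = a*x/(b + x) via reciprocal transform (Eq. 17.17):
# 1/y = (b/a)*(1/x) + 1/a, a straight line in (1/x, 1/y).
x = [0.75, 2, 2.5, 4, 6, 8, 8.5]
y = [0.8, 1.3, 1.2, 1.6, 1.7, 1.8, 1.7]

u = [1.0 / v for v in x]   # transform: 1/x
w = [1.0 / v for v in y]   # transform: 1/y
n = len(x)
su, sw = sum(u), sum(w)
suw = sum(ui * wi for ui, wi in zip(u, w))
suu = sum(ui * ui for ui in u)

slope = (n * suw - su * sw) / (n * suu - su * su)   # slope = b/a
intercept = sw / n - slope * su / n                  # intercept = 1/a
a = 1.0 / intercept                                  # a ≈ 1.95
b = slope * a                                        # b ≈ 1.10
```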
Quiz 3
Fit a power equation and a saturation-growth rate equation to:
x 1 2 3 4 5 6 7
y 2.1 2.2 2.3 2.4 2.5 2.6 2.7
Polynomial Regression (pg 470)
• Another alternative is to fit polynomials to the data using polynomial regression.
• The least-squares procedure can be readily extended to fit the data to a higher-order polynomial.
• For example, to fit a second-order polynomial, or quadratic:
y = a0 + a1x + a2x² + e
• The sum of the squares of the residuals is:
Sr = Σ(i=1..n) (yi − a0 − a1xi − a2xi²)²    ----- 17.18
where n = total number of points
• Then, take the derivative of equation (17.18) with respect to each of the unknown coefficients a0, a1, and a2 of the polynomial:
∂Sr/∂a0 = −2·Σ(yi − a0 − a1xi − a2xi²)
∂Sr/∂a1 = −2·Σxi(yi − a0 − a1xi − a2xi²)
∂Sr/∂a2 = −2·Σxi²(yi − a0 − a1xi − a2xi²)
• Setting these derivatives equal to zero and rearranging (using Σa0 = n·a0) develops the set of normal equations:
n·a0 + (Σxi)·a1 + (Σxi²)·a2 = Σyi
(Σxi)·a0 + (Σxi²)·a1 + (Σxi³)·a2 = Σxi·yi
(Σxi²)·a0 + (Σxi³)·a1 + (Σxi⁴)·a2 = Σxi²·yi    ----- 17.19
• The above 3 equations are linear in the 3 unknown coefficients (a0, a1, and a2), which can be calculated directly from the observed data.
• In matrix form:
[ n     Σxi    Σxi²  ] [a0]   [ Σyi     ]
[ Σxi   Σxi²   Σxi³  ] [a1] = [ Σxi·yi  ]
[ Σxi²  Σxi³   Σxi⁴  ] [a2]   [ Σxi²·yi ]
• The two-dimensional case can easily be extended to an mth-order polynomial:
y = a0 + a1x + a2x² + ... + am·x^m + e
• Thus, the standard error for an mth-order polynomial is:
s_y/x = sqrt( Sr / (n − (m + 1)) )    ----- 17.20
Example 6
Fit a second-order polynomial to the data in the first 2 columns of Table 17.4:

xi    yi      xi²   xi³   xi⁴    xi·yi    xi²·yi
0      2.1     0     0      0      0         0
1      7.7     1     1      1      7.7       7.7
2     13.6     4     8     16     27.2      54.4
3     27.2     9    27     81     81.6     244.8
4     40.9    16    64    256    163.6     654.4
5     61.1    25   125    625    305.5    1527.5
Σ     15  152.6    55   225    979    585.6    2488.8

• From the given data:
m = 2, n = 6, x̄ = 15/6 = 2.5, ȳ = 152.6/6 = 25.433
Σxi = 15, Σyi = 152.6, Σxi² = 55, Σxi³ = 225, Σxi⁴ = 979, Σxi·yi = 585.6, Σxi²·yi = 2488.8
• Therefore, the simultaneous linear equations are:
[  6    15    55 ] [a0]   [  152.6 ]
[ 15    55   225 ] [a1] = [  585.6 ]
[ 55   225   979 ] [a2]   [ 2488.8 ]
• Solving these equations through a technique such as Gauss elimination gives:
a0 = 2.47857, a1 = 2.35929, a2 = 1.86071
• Therefore, the least-squares quadratic equation for this case is:
y = 2.47857 + 2.35929x + 1.86071x²
• To calculate St and Sr, build Table 17.4 columns 3 and 4:
xi    yi     (yi − ȳ)²   (yi − a0 − a1xi − a2xi²)²
0      2.1     544.44      0.14332
1      7.7     314.47      1.00286
2     13.6     140.03      1.08158
3     27.2       3.12      0.80491
4     40.9     239.22      0.61951
5     61.1    1272.11      0.09439
Σ    152.6    2513.39      3.74657
St = Σ(yi − ȳ)² = 2513.39
Sr = Σ(yi − a0 − a1xi − a2xi²)² = 3.74657
The standard error of the regression polynomial:
s_y/x = sqrt( Sr / (n − (m + 1)) ) = sqrt( 3.74657 / (6 − (2 + 1)) ) = 1.12
• The correlation coefficient can be calculated by using equations 17.10 and 17.11, respectively:
Therefore, r2 = (St – Sr) / St = (2513.39 – 3.74657) / 2513.39
r2 = 0.99851
The correlation coefficient is, r = 0.99925
• The results indicate that 99.851% of the original uncertainty has been explained by the model. This result supports the conclusion that the quadratic equation represents an excellent fit, as evident from Fig.17.11.
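Example 6 can be reproduced by assembling the normal equations (17.19) from the data moments and solving the 3×3 system. A sketch in plain Python using naive Gauss elimination (adequate here, since the small normal-equation matrix needs no pivoting; for larger or ill-conditioned systems a library solver would be preferable):

```python
# Second-order polynomial regression: build the normal equations (Eq. 17.19)
# and solve them by Gauss elimination with back substitution.
x = [0, 1, 2, 3, 4, 5]
y = [2.1, 7.7, 13.6, 27.2, 40.9, 61.1]

S = [sum(xi ** k for xi in x) for k in range(5)]                  # S[k] = sum of x^k
T = [sum((xi ** k) * yi for xi, yi in zip(x, y)) for k in range(3)]  # sums of x^k * y

A = [[S[i + j] for j in range(3)] for i in range(3)]  # normal-equation matrix
rhs = T[:]

# Forward elimination
for k in range(3):
    for i in range(k + 1, 3):
        f = A[i][k] / A[k][k]
        for j in range(k, 3):
            A[i][j] -= f * A[k][j]
        rhs[i] -= f * rhs[k]

# Back substitution
a = [0.0] * 3
for i in range(2, -1, -1):
    a[i] = (rhs[i] - sum(A[i][j] * a[j] for j in range(i + 1, 3))) / A[i][i]
# a ≈ [2.47857, 2.35929, 1.86071], i.e. y = 2.47857 + 2.35929x + 1.86071x^2
```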