Transcript: Lecture 22

Engineering Analysis ENG 3420 Fall 2009
Dan C. Marinescu
Office: HEC 439 B
Office hours: Tu-Th 11:00-12:00
Lecture 22


Attention: The last homework HW5 and the last project are due on
Tuesday November 24!!
Last time:

- Linear regression
- Exponential, power, and saturation non-linear models
- Linear least-squares regression

Today

- Linear regression versus the sample mean; coefficient of determination
- Polynomial least-squares fit
- Multiple linear regression
- General linear least squares
- More on non-linear models
- Interpolation (Chapter 15)
  - Polynomial interpolation
  - Newton interpolating polynomials
  - Lagrange interpolating polynomials

Next time

- Splines

Quantification of Errors

For a straight line the sum of the squares of the estimate residuals is:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2$$

The standard error of the estimate:

$$s_{y/x} = \sqrt{\frac{S_r}{n-2}}$$
Linear regression versus the sample mean

- What is the difference between linear regression and the case when we simply compute the sample mean and draw a line corresponding to the sample mean?
- The spread is measured by the histogram of the differences between the values predicted by linear regression and the actual sample values.
- Regression data showing (a) the spread of the data around the mean of the dependent data and (b) the spread of the data around the best-fit line (figure).
- The reduction in spread represents the improvement due to linear regression.
Coefficient of Determination

The coefficient of determination is

$$r^2 = \frac{S_t - S_r}{S_t}, \qquad S_t = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2$$

- r^2 represents the percentage of the original uncertainty explained by the model.
- For a perfect fit, S_r = 0 and r^2 = 1.
- If r^2 = 0, there is no improvement over simply picking the mean.
- If r^2 < 0, the model is worse than simply picking the mean!
Example

| i | x_i = v (m/s) | y_i = F (N) | a_0 + a_1 x_i | (y_i - ȳ)^2 | (y_i - a_0 - a_1 x_i)^2 |
|---|---------------|-------------|---------------|-------------|--------------------------|
| 1 | 10  | 25   | -39.58  | 380535  | 4171  |
| 2 | 20  | 70   | 155.12  | 327041  | 7245  |
| 3 | 30  | 380  | 349.82  | 68579   | 911   |
| 4 | 40  | 550  | 544.52  | 8441    | 30    |
| 5 | 50  | 610  | 739.23  | 1016    | 16699 |
| 6 | 60  | 1220 | 933.93  | 334229  | 81837 |
| 7 | 70  | 830  | 1128.63 | 35391   | 89180 |
| 8 | 80  | 1450 | 1323.33 | 653066  | 16044 |
| Σ | 360 | 5135 |         | 1808297 | 216118 |
The best-fit line is:

$$F_{est} = -234.2857 + 19.47024\, v$$

$$S_t = \sum \left( y_i - \bar{y} \right)^2 = 1808297 \qquad S_r = \sum \left( y_i - a_0 - a_1 x_i \right)^2 = 216118$$

$$s_y = \sqrt{\frac{1808297}{8-1}} = 508.26 \qquad s_{y/x} = \sqrt{\frac{216118}{8-2}} = 189.79$$

$$r^2 = \frac{1808297 - 216118}{1808297} = 0.8805$$

88.05% of the original uncertainty has been explained by the linear model.
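
A minimal MATLAB sketch of this computation (the vectors v and F simply restate the data from the table above; the variable names are assumptions):

v = [10 20 30 40 50 60 70 80]';
F = [25 70 380 550 610 1220 830 1450]';
p   = polyfit(v, F, 1);           % straight-line fit: p(1) = a1, p(2) = a0
Fp  = polyval(p, v);              % predicted forces
St  = sum((F - mean(F)).^2);      % spread around the mean
Sr  = sum((F - Fp).^2);           % spread around the best-fit line
r2  = (St - Sr)/St;               % coefficient of determination
syx = sqrt(Sr/(length(v) - 2));   % standard error of the estimate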
Polynomial least-squares fit

MATLAB has a built-in function polyfit that fits a least-squares n-th order polynomial to data:

p = polyfit(x, y, n)

- x: independent data
- y: dependent data
- n: order of the polynomial to fit
- p: coefficients of the polynomial
  f(x) = p_1 x^n + p_2 x^{n-1} + ... + p_n x + p_{n+1}

MATLAB's polyval command can be used to compute a value using the coefficients:

y = polyval(p, x)
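
A minimal sketch of the two commands used together (the sample data and the cubic order are assumptions chosen for illustration):

x = 0:0.5:5;                          % assumed independent data
y = 2 + x - 0.5*x.^2 + 0.1*x.^3;      % assumed dependent data
p = polyfit(x, y, 3);                 % fit a 3rd-order polynomial
yfit = polyval(p, x);                 % evaluate the fitted polynomial at x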
Fitting an mth-order polynomial to n data points

Minimize:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 - \cdots - a_m x_i^m \right)^2$$

The standard error is:

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}$$

because the mth-order polynomial has (m+1) coefficients.

The coefficient of determination r^2 is:

$$r^2 = \frac{S_t - S_r}{S_t}, \qquad S_t = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2$$
Multiple Linear Regression

Now y is a linear function of two or more independent variables:

$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m$$

The best fit minimizes the sum of the squares of the estimate residuals:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} - \cdots - a_m x_{m,i} \right)^2$$

For example, when y = a_0 + a_1 x_1 + a_2 x_2, instead of a line we have a plane.
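
A minimal MATLAB sketch of fitting such a plane with the left-divide operator (the data vectors x1, x2, and y are assumptions; they lie exactly on a plane for simplicity):

x1 = [0 1 2 3 4 5]';
x2 = [0 2 1 4 3 5]';
y  = 5 + 4*x1 - 3*x2;          % assumed observations
Z  = [ones(size(x1)) x1 x2];   % one column per coefficient a0, a1, a2
a  = Z\y;                      % least-squares solution [a0; a1; a2]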
General Linear Least Squares

Linear, polynomial, and multiple linear regression all belong to the general linear least-squares model:

$$y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e$$

where z_0, z_1, ..., z_m are a set of m+1 basis functions and e is the error of the fit.

The basis functions can be any function of the data but cannot contain any of the coefficients a_0, a_1, etc.

The equation can be re-written for the data points as a matrix equation:

$$\{y\} = [Z]\{a\} + \{e\}$$

where {y} is a vector of the n dependent data, {a} is a vector of the (m+1) coefficients of the equation, {e} contains the error at each point, and [Z] is:

$$[Z] = \begin{bmatrix} z_{01} & z_{11} & \cdots & z_{m1} \\ z_{02} & z_{12} & \cdots & z_{m2} \\ \vdots & \vdots & & \vdots \\ z_{0n} & z_{1n} & \cdots & z_{mn} \end{bmatrix}$$

with z_{ji} representing the value of the jth basis function calculated at the ith point.

Solving General Linear Least Squares Coefficients

Generally, [Z] is an n x (m+1) matrix. Simple inversion cannot be used to solve for the (m+1) coefficients {a}. Instead the sum of the squares of the estimate residuals is minimized:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j z_{ji} \right)^2$$

The outcome of this minimization yields the normal equations:

$$[Z]^T [Z] \{a\} = [Z]^T \{y\}$$
Example

Given the column vectors x and y, find the coefficients for the best fit of the equation y = a0 + a1*x + a2*x^2:

Z = [ones(size(x)) x x.^2]
a = (Z'*Z)\(Z'*y)

MATLAB's left-divide will automatically include the [Z]^T terms if the matrix is not square, so

a = Z\y

would work as well.

To calculate measures of fit:

St = sum((y-mean(y)).^2)
Sr = sum((y-Z*a).^2)
r2 = 1-Sr/St
syx = sqrt(Sr/(length(x)-length(a)))
Nonlinear Models

How to deal with nonlinear models (when we cannot fit a straight line to the sample data):

- Transform the variables and solve for the best fit of the transformed variables. This works well for exponential, power, and saturation models, but not all equations can be transformed easily or at all.
- Perform nonlinear regression to directly determine the least-squares fit.

To perform nonlinear regression:

- write a function that returns the sum of the squares of the estimate residuals for a fit, and then
- use the fminsearch function to find the values of the coefficients where a minimum occurs.

The arguments to the function that computes Sr should be the coefficients, the independent variables, and the dependent variables.
Example

Given two vectors of n observations, ym for the force F and xm for the velocity v, find the coefficients a0 and a1 for the best fit of the equation:

$$F = a_0 v^{a_1}$$

First, write a function called fSSR.m containing the following:

function f = fSSR(a, xm, ym)
yp = a(1)*xm.^a(2);
f = sum((ym-yp).^2);

Then use fminsearch in the command window to obtain the values of a that minimize fSSR:

a = fminsearch(@fSSR, [1, 1], [], v, F)

where [1, 1] is an initial guess for the [a0, a1] vector and [] is a placeholder for the options.
Comparison between the transformed power equation and the direct method in our example

In the general case the two methods produce different results (the coefficients of the equations are different). The direct method produces the larger r^2.
Polynomial Interpolation

Problem: estimate intermediate values between precise data points.

- Related to data fitting, but the function used to interpolate must pass through the data points; this makes interpolation more restrictive than fitting.
- Polynomial interpolation: an (n-1)th order polynomial is found that passes through n data points:

$$f(x) = p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n$$

How to find the coefficients of the polynomial:

- Use linear algebra to solve a system of n linear equations.
- Use the polyfit and polyval built-in functions, making sure the order of the fit for n data points is n-1.
Matrix formulation of polynomial interpolation: find the coefficients p_1, p_2, ..., p_n knowing the values of the function f(x_1), f(x_2), ..., f(x_n):

$$\begin{bmatrix}
x_1^{n-1} & x_1^{n-2} & \cdots & x_1 & 1 \\
x_2^{n-1} & x_2^{n-2} & \cdots & x_2 & 1 \\
\vdots & \vdots & & \vdots & \vdots \\
x_{n-1}^{n-1} & x_{n-1}^{n-2} & \cdots & x_{n-1} & 1 \\
x_n^{n-1} & x_n^{n-2} & \cdots & x_n & 1
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{n-1} \\ p_n \end{bmatrix}
=
\begin{bmatrix} f(x_1) \\ f(x_2) \\ \vdots \\ f(x_{n-1}) \\ f(x_n) \end{bmatrix}$$
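
A minimal MATLAB sketch of this formulation (the four data points are assumptions chosen for illustration):

x  = [1 2 3 4]';              % assumed interpolation nodes
fx = [3 5 11 21]';            % assumed function values at the nodes
V  = vander(x);               % Vandermonde matrix [x.^3 x.^2 x 1]
p  = V\fx;                    % coefficients of the interpolating polynomial
yi = polyval(p, 2.5);         % interpolated value at an intermediate point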
Ill-conditioned linear problems

- A matrix is ill-conditioned if small changes in its coefficients have drastic effects on the solution, which makes iterating the solution to a small residual a tricky operation.
- Another type of ill-conditioned matrix occurs when the matrix values vary by several orders of magnitude.
- Numerical round-off in the system can be challenging for solving a model having an ill-conditioned matrix.
Problems

- Vandermonde matrices, such as the coefficient matrix of the interpolation system above, are very ill-conditioned: their solutions are very sensitive to round-off errors.
- The issue can be minimized by scaling and shifting the data.
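
A minimal MATLAB sketch of scaling and shifting the data using the three-output form of polyfit (the year and population values are assumptions chosen for illustration):

t   = (1950:10:2000)';              % assumed years: large, poorly scaled values
pop = [2.5 3.0 3.7 4.4 5.3 6.1]';   % assumed population data (billions)
[c, S, mu] = polyfit(t, pop, 3);    % fit in the shifted/scaled variable (t - mu(1))/mu(2)
pest = polyval(c, 1985, S, mu);     % evaluate using the same shift and scale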

Newton Interpolating Polynomials

The differences between a simple polynomial and Newton's interpolating polynomial for first- and second-order interpolations are:

| Order | Simple | Newton |
|-------|--------|--------|
| 1st | f_1(x) = a_1 + a_2 x | f_1(x) = b_1 + b_2 (x - x_1) |
| 2nd | f_2(x) = a_1 + a_2 x + a_3 x^2 | f_2(x) = b_1 + b_2 (x - x_1) + b_3 (x - x_1)(x - x_2) |
First-order Newton interpolating polynomial

- The first-order Newton interpolating polynomial may be obtained from linear interpolation and similar triangles, as shown.
- The resulting formula based on known points x_1 and x_2 and the values of the dependent function at those points is:

$$f_1(x) = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1} (x - x_1)$$
Second-order Newton interpolating polynomial

- The second-order Newton interpolating polynomial introduces some curvature to the line connecting the points, but still goes through the first two points.
- The resulting formula based on known points x_1, x_2, and x_3 and the values of the dependent function at those points is:

$$f_2(x) = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}(x - x_1) + \frac{\dfrac{f(x_3) - f(x_2)}{x_3 - x_2} - \dfrac{f(x_2) - f(x_1)}{x_2 - x_1}}{x_3 - x_1}(x - x_1)(x - x_2)$$
Newton interpolating polynomial of degree n-1

- In general, an (n-1)th order Newton interpolating polynomial has all the terms of the (n-2)th order polynomial plus one extra.
- The general formula is:

$$f_{n-1}(x) = b_1 + b_2 (x - x_1) + \cdots + b_n (x - x_1)(x - x_2) \cdots (x - x_{n-1})$$

where

$$b_1 = f(x_1), \quad b_2 = f[x_2, x_1], \quad b_3 = f[x_3, x_2, x_1], \quad \ldots, \quad b_n = f[x_n, x_{n-1}, \ldots, x_2, x_1]$$

and the f[...] represent divided differences.
Divided differences

Divided differences are calculated as follows:

$$f[x_i, x_j] = \frac{f(x_i) - f(x_j)}{x_i - x_j}$$

$$f[x_i, x_j, x_k] = \frac{f[x_i, x_j] - f[x_j, x_k]}{x_i - x_k}$$

$$f[x_n, x_{n-1}, \ldots, x_2, x_1] = \frac{f[x_n, x_{n-1}, \ldots, x_2] - f[x_{n-1}, x_{n-2}, \ldots, x_1]}{x_n - x_1}$$

Divided differences are calculated using divided differences of a smaller number of terms.
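
A minimal MATLAB sketch that builds the divided-difference coefficients b_1, ..., b_n and evaluates the Newton polynomial (the function name newtint is an assumption):

function yint = newtint(x, y, xi)
% Newton interpolating polynomial via divided differences (illustrative sketch)
n = length(x);
b = zeros(n, n);
b(:,1) = y(:);                                        % zeroth-order differences f(x_i)
for j = 2:n
    for i = 1:n-j+1
        b(i,j) = (b(i+1,j-1) - b(i,j-1)) / (x(i+j-1) - x(i));
    end
end
% Nested evaluation of f_{n-1}(xi) using b(1,1), ..., b(1,n)
yint = b(1,n);
for j = n-1:-1:1
    yint = b(1,j) + (xi - x(j)).*yint;
end
end

For example, newtint([1 4 6], log([1 4 6]), 2) estimates ln(2) from three sample points.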

Lagrange interpolating polynomials

- Another method that uses shifted values to express an interpolating polynomial is the Lagrange interpolating polynomial.
- The differences between a simple polynomial and the Lagrange interpolating polynomial for first- and second-order polynomials are:

| Order | Simple | Lagrange |
|-------|--------|----------|
| 1st | f_1(x) = a_1 + a_2 x | f_1(x) = L_1 f(x_1) + L_2 f(x_2) |
| 2nd | f_2(x) = a_1 + a_2 x + a_3 x^2 | f_2(x) = L_1 f(x_1) + L_2 f(x_2) + L_3 f(x_3) |

where the L_i are weighting coefficients that are functions of x.
First-order Lagrange interpolating polynomial

- The first-order Lagrange interpolating polynomial may be obtained from a weighted combination of two linear interpolations, as shown.
- The resulting formula based on known points x_1 and x_2 and the values of the dependent function at those points is:

$$f_1(x) = L_1 f(x_1) + L_2 f(x_2), \qquad L_1 = \frac{x - x_2}{x_1 - x_2}, \quad L_2 = \frac{x - x_1}{x_2 - x_1}$$

so that

$$f_1(x) = \frac{x - x_2}{x_1 - x_2} f(x_1) + \frac{x - x_1}{x_2 - x_1} f(x_2)$$
Lagrange interpolating polynomial for n points

In general, the Lagrange polynomial interpolation for n points is:

$$f_{n-1}(x) = \sum_{i=1}^{n} L_i(x)\, f(x_i)$$

where L_i is given by:

$$L_i(x) = \prod_{\substack{j=1 \\ j \neq i}}^{n} \frac{x - x_j}{x_i - x_j}$$

Inverse Interpolation

- Interpolation generally means finding some value f(x) for an x that lies between the given independent data points.
- Sometimes it is useful to find the x for which f(x) is a certain value; this is inverse interpolation.
- Rather than finding an interpolation of x as a function of f(x), it may be useful to find an equation for f(x) as a function of x using interpolation and then solve the corresponding roots problem: f(x) - fdesired = 0 for x.
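
A minimal MATLAB sketch of this roots-based approach (the data, the fit order, and the desired value are assumptions chosen for illustration):

x = 1:7;                          % assumed independent data
y = 1./x;                         % assumed dependent data
p = polyfit(x, y, 2);             % assumed 2nd-order polynomial for f(x)
fdesired = 0.3;                   % assumed target value of f(x)
p(end) = p(end) - fdesired;       % form f(x) - fdesired = 0
xroots = roots(p);                % candidate x; keep the root inside the data range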
Extrapolation

- Extrapolation is the process of estimating a value of f(x) that lies outside the range of the known base points x_1, x_2, ..., x_n.
- Extrapolation represents a step into the unknown, and extreme care should be exercised when extrapolating!
Extrapolation Hazards

The following shows the results of extrapolating a seventh-order polynomial fit to a population data set (figure).
Oscillations

- Higher-order polynomials can not only lead to round-off errors due to ill-conditioning, but can also introduce oscillations to an interpolation or fit where they should not be.
- In the figures below, the dashed line represents a function, the circles represent samples of the function, and the solid line represents the results of a polynomial interpolation (figure).
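
A minimal MATLAB sketch that reproduces this kind of oscillation (Runge's function 1/(1 + 25x^2) is an assumed example, not necessarily the function in the figures):

xs = linspace(-1, 1, 11);             % 11 equally spaced samples
ys = 1./(1 + 25*xs.^2);               % assumed underlying function
p  = polyfit(xs, ys, 10);             % 10th-order interpolating polynomial (may warn about conditioning)
xf = linspace(-1, 1, 200);
plot(xf, 1./(1 + 25*xf.^2), '--', xs, ys, 'o', xf, polyval(p, xf), '-')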