Transcript ppt - COR@L
Emory University
Inexact Methods for
PDE-Constrained
Optimization
Frank Edward Curtis
Northwestern University
Joint work with Richard Byrd and Jorge Nocedal
February 12, 2007
Nonlinear Optimization
“One” problem
min
f ( x)
s.t.
g ( x) 0
h( x ) 0
x
Circuit Tuning
Building blocks:
Transistors
(switches) and Gates (logic units)
Improve aspects of the circuit – speed, area,
power – by choosing transistor widths
AT1
AT2
w1
w2
d1
AT3
d2
(A. Wächter, C. Visweswariah, and A. R. Conn, 2005)
Circuit Tuning
Building blocks:
Transistors
Improve aspects of the circuit – speed, area,
power – by choosing transistor widths
AT1
AT2
(switches) and Gates (logic units)
w1
w2
d1
AT3
d2
Formulate an optimization problem
min
s.t.
f (AT, w, d)
AT3 max AT1 d1, AT2 d2,...
(A. Wächter, C. Visweswariah, and A. R. Conn, 2005)
Strategic Bidding in Electricity Markets
Independent operator collects
bids and sets production
schedule and “spot price” to
minimize cost to consumers
(Pereira, Granville, Dix, and Barroso, 2004)
min
s.t.
bids P
PT e demand
Strategic Bidding in Electricity Markets
Independent operator collects
min
bids P
bids and sets production
schedule and “spot price” to
s.t. PT e demand
minimize cost to consumers
Electricity production companies “bid” on how
much they will charge for one unit of electricity
max
s.t.
(spot cost) P
bids P
min
P arg min
T
s.t. P e demand
(Pereira, Granville, Dix, and Barroso, 2004)
Strategic Bidding in Electricity Markets
Independent operator collects
min
bids P
bids and sets production
schedule and “spot price” to
s.t. PT e demand
minimize cost to consumers
Electricity production companies “bid” on how
much they will charge for one unit of electricity
max
s.t.
(spot cost) P
bids P
min
P arg min
T
s.t. P e demand
Bilevel problem
Equivalent to MPCC
Hard geometry!
(Pereira, Granville, Dix, and Barroso, 2004)
x0
y0
xy 0
y
x
Challenges for NLP algorithms
Very large problems
Numerical noise
Availability of derivatives
Degeneracies
Difficult geometries
Expensive function evaluations
Real-time solutions needed
Integer variables
Negative curvature
Outline
Problem Formulation
Inexact Framework
Merit function and sufficient decrease
Satisfying first order conditions
Numerical Results
Unconstrained optimization and nonlinear equations
Stopping conditions for linear solver
Global Behavior
Equality constrained optimization
Sequential Quadratic Programming
Model inverse problem
Accuracy tradeoffs
Final Remarks
Future work
Negative curvature
Outline
Problem Formulation
Inexact Framework
Merit function and sufficient decrease
Satisfying first order conditions
Numerical Results
Unconstrained optimization and nonlinear equations
Stopping conditions for linear solver
Global Behavior
Equality constrained optimization
Sequential Quadratic Programming
Model inverse problem
Accuracy tradeoffs
Final Remarks
Future work
Negative curvature
Equality constrained optimization
Goal: solve the problem
min
f ( x)
s.t.
c( x) 0
x
e.g., minimize the difference
between observed and expected
behavior, subject to atmospheric
flow equations (Navier-Stokes)
x
*
Equality constrained optimization
Goal: solve the problem
min
f ( x)
s.t.
c( x) 0
x
Define: the derivatives
g ( x) :
f ( x)
A( x) :
c i ( x)
2
W ( x, ) : xx L( x, )
Define: the Lagrangian
L( x, ) f ( x) T c( x)
Goal: solve KKT conditions
g ( x) A( x)T
0
c( x)
Sequential Quadratic Programming (SQP)
Two “equivalent” step computation techniques
Algorithm: Newton’s method
W
A
g AT
AT d
0
c
Algorithm: the SQP subproblem
min
g T d 12 d TWd
s.t.
c Ad 0
d
Sequential Quadratic Programming (SQP)
Two “equivalent” step computation techniques
Algorithm: Newton’s method
W
A
g AT
AT d
0
c
W
A
A
0
T
Algorithm: the SQP subproblem
min
g T d 12 d TWd
s.t.
c Ad 0
d
KKT matrix
• Cannot be formed
• Cannot be factored
Sequential Quadratic Programming (SQP)
Two “equivalent” step computation techniques
Algorithm: Newton’s method
W
A
g AT
AT d
0
c
W
A
A
0
T
Algorithm: the SQP subproblem
min
g T d 12 d TWd
s.t.
c Ad 0
d
KKT matrix
• Cannot be formed
• Cannot be factored
Linear system solve
• Iterative method
• Inexactness
Outline
Problem Formulation
Inexact Framework
Merit function and sufficient decrease
Satisfying first order conditions
Numerical Results
Unconstrained optimization and nonlinear equations
Stopping conditions for linear solver
Global Behavior
Equality constrained optimization
Sequential Quadratic Programming
Model inverse problem
Accuracy tradeoffs
Final Remarks
Future work
Negative curvature
Unconstrained optimization
Goal: minimize a nonlinear objective
min f ( x )
x
Algorithm: Newton’s method (CG)
f ( xk )d k f ( xk )
2
Unconstrained optimization
Goal: minimize a nonlinear objective
min f ( x )
x
Algorithm: Newton’s method (CG)
f ( xk )d k f ( xk )
2
Note: choosing any
intermediate step ensures
global convergence to a
local solution of NLP
xk
(Steihaug, 1983)
Nonlinear equations
Goal: solve a nonlinear system
F ( x) 0
Algorithm: Newton’s method
F ( xk )d k F ( xk )
Nonlinear equations
Goal: solve a nonlinear system
F ( x) 0
Algorithm: Newton’s method
F ( xk )d k F ( xk )
any step with
xk
F ( xk )d k F ( xk ) rk
and
rk F ( xk ) , 0 1
ensures descent
(Dembo, Eisenstat, and Steihaug, 1982)
(Eisenstat and Walker, 1994)
Line Search SQP Framework
W
A
g AT
AT d
0
c
min
g T d 12 d TWd
s.t.
c Ad 0
d
Define “exact” penalty function
( x) f ( x) c( x)
Line Search SQP Framework
W
A
g AT
AT d
0
c
min
g T d 12 d TWd
s.t.
c Ad 0
d
Define “exact” penalty function
( x) f ( x) c( x)
( x d )
( x d )
1
10
Algorithm Outline (exact steps)
for k = 0, 1, 2, …
Compute
step by…
W
A
Set
g AT
AT d
0
c
penalty parameter to ensure descent on…
( x) f ( x) c( x)
Perform
backtracking line search to satisfy…
( x d ) ( x) D (d )
Update
iterate
Exact Case
W
A
g AT
AT d
0
c
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
Exact Case
W
A
g AT
AT d
0
c
Exact step minimizes
the objective on the
linearized constraints
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
Exact Case
W
A
g AT
AT d
0
c
Exact step minimizes
the objective on the
linearized constraints
… which may lead to
an increase in the
model objective
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
Quadratic/linear model of merit function
W
A
g AT
AT d
0
c
min
g T d 12 d TWd
s.t.
c Ad 0
d
Create model
m(d ) f g T d 12 d TWd ( c Ad )
Quantify reduction obtained from step
mred (d ) m(0) m(d )
g d d Wd ( c c Ad )
T
1
2
T
Quadratic/linear model of merit function
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
Create model
m(d ) f g T d 12 d TWd ( c Ad )
Quantify reduction obtained from step
mred (d ) m(0) m(d )
g d d Wd ( c c Ad )
T
1
2
T
Exact Case
W
A
g AT
AT d
0
c
Exact step minimizes
the objective on the
linearized constraints
… which may lead to
an increase in the
model objective
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
Exact Case
W
A
g AT
AT d
0
c
Exact step minimizes
the objective on the
linearized constraints
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
… which may lead to
an increase in the
model objective
… but this is ok since
we can account for this
conflict by increasing
the penalty parameter
mred (d ) c , 0 1
Exact Case
W
A
g AT
AT d
0
c
Exact step minimizes
the objective on the
linearized constraints
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
… which may lead to
an increase in the
model objective
… but this is ok since
we can account for this
conflict by increasing
the penalty parameter
g T d 12 d T Wd
, 0 1
(1 ) c
Algorithm Outline (exact steps)
for k = 0, 1, 2, …
Compute
step by…
W
A
Set
g AT
AT d
0
c
penalty parameter to ensure descent on…
( x) f ( x) c( x)
Perform
backtracking line search to satisfy…
( x d ) ( x) D (d )
Update
iterate
First attempt
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
Proposition: sufficiently small residual
( , r ) ( g AT , c) , 0 1
Test: 61 problems from CUTEr test set
1e-8
1e-7
1e-6
1e-5
1e-4
1e-3
1e-2
1e-1
Success
100%
100%
97%
97%
90%
85%
72%
38%
Failure
0%
0%
3%
3%
10%
15%
28%
62%
First attempt… not robust
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
Proposition: sufficiently small residual
( , r ) ( g AT , c) , 0 1
… not enough for complete robustness
We
have multiple goals (feasibility and optimality)
Lagrange multipliers may be completely off
… may not have descent!
Second attempt
Step computation: inexact SQP step
W
A
g AT
AT d
0
c r
Recall the line search condition
( x d ) ( x) D (d )
We can show
D (d ) g d c r
T
Second attempt
Step computation: inexact SQP step
W
A
g AT
AT d
0
c r
Recall the line search condition
( x d ) ( x) D (d )
We can show
D (d ) g d c r
T
... but how negative should this be?
Algorithm Outline (exact steps)
for k = 0, 1, 2, …
Compute
step
W
A
Set
g AT
AT d
0
c
penalty parameter to ensure descent
( x) f ( x) c( x)
Perform
backtracking line search
( x d ) ( x) D (d )
Update
iterate
Algorithm Outline (inexact steps)
for k = 0, 1, 2, …
Compute
step and set penalty parameter to ensure
descent and a stable algorithm
W
A
g AT
AT d
0
c
r
Perform
( x) f ( x) c( x)
backtracking line search
( x d ) ( x) D (d )
Update
iterate
Inexact Case
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
Inexact Case
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
g T d 12 d TWd c r c
Inexact Case
W
A
g AT
A d
0
c r
T
Step is acceptable if for
0 1, 0 :
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
r ,
g AT
g T d 12 d TWd c r c
Inexact Case
W
A
g AT
A d
0
c r
T
Step is acceptable if for
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
0 1, 0 :
r c ,
c
g T d 12 d TWd c r c
Inexact Case
W
A
g AT
A d
0
c r
T
Step is acceptable if for
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
0 1, 0 :
r c ,
c
g T d 12 d T Wd
, 0 1
(1 ) c r
g T d 12 d TWd c r c
Algorithm Outline
for k = 0, 1, 2, …
Iteratively
solve
W
A
g AT
AT d
0
c r
Until
r c , 0 1
c , 0
r , 0
or
g AT , 0 1
mred (d ) c
Update penalty parameter
Perform backtracking line search
Update iterate
Termination Test
Observe KKT conditions
g AT max 1, g opt
, 0 opt 1
c max 1, c( x0 ) feas , 0 feas 1
Outline
Problem Formulation
Inexact Framework
Merit function and sufficient decrease
Satisfying first order conditions
Numerical Results
Unconstrained optimization and nonlinear equations
Stopping conditions for linear solver
Global Behavior
Equality constrained optimization
Sequential Quadratic Programming
Model inverse problem
Accuracy tradeoffs
Final Remarks
Future work
Negative curvature
Assumptions
The sequence of iterates is contained in a convex
set and the following conditions hold:
the
objective and constraint functions and their first
and second derivatives are bounded
the multiplier estimates are bounded
the constraint Jacobians have full row rank and their
smallest singular values are bounded below by a
positive constant
the Hessian of the Lagrangian is positive definite with
smallest eigenvalue bounded below by a positive
constant
Sufficient Reduction to Sufficient Decrease
Taylor expansion of merit function yields
D (d ) g T d c r
Accepted step satisfies
mred (d ) c , 0 1
g d c r d Wd c
T
1
2
T
D (d ) 12 d TWd c
d c
2
Intermediate Results
r c , 0 1
c , 0
g d d Wd
, 0 1
(1 ) c r
T
1
2
T
r , 0
g A , 0
T
mred (d ) c
d
is bounded above
is bounded above
is bounded below by
a positive constant
Sufficient Decrease in Merit Function
D (d ; x, ) d c
2
( x; ) ( x d ; ) d c
lim d k
k
2
ck 0
lim Z kT g k 0
k
2
Step in Dual Space
We converge to an optimal primal solution, and
g A g AT , 0 1
T
(for sufficiently small || c || and || d || )
Therefore,
lim ck 0
k
lim g k A k 0
k
T
k
Outline
Problem Formulation
Inexact Framework
Merit function and sufficient decrease
Satisfying first order conditions
Numerical Results
Unconstrained optimization and nonlinear equations
Stopping conditions for linear solver
Global Behavior
Equality constrained optimization
Sequential Quadratic Programming
Model inverse problem
Accuracy tradeoffs
Final Remarks
Future work
Negative curvature
Problem Formulation
Tikhonov-style regularized inverse problem
to solve for a reasonably large mesh size
Want to solve for small regularization parameter
Want
SymQMR for linear system solves
Input parameters:
g AT
r
c
Recall:
r c , 0 1
c , 0
(Curtis and Haber, 2007)
0.1, 1,
1, 0.1
r , 0
or
g AT , 0 1
mred (d ) c
Numerical Results
n
m
1024
512
1e-6
r
g AT
c
Iters.
Time
Total LS Avg. LS
Iters.
Iters.
Avg. Rel.
Res.
0.5
29
29.5s
1452
50.1
3.12e-1
0.1
12
11.37s
654
54.5
6.90e-2
0.01
9
11.60s
681
75.7
6.27e-3
(Curtis and Haber, 2007)
Numerical Results
n
m
1024
512
1e-6
r
g AT
c
Iters.
Time
Total LS Avg. LS
Iters.
Iters.
Avg. Rel.
Res.
0.5
29
29.5s
1452
50.1
3.12e-1
0.1
12
11.37s
654
54.5
6.90e-2
0.01
9
11.60s
681
75.7
6.27e-3
(Curtis and Haber, 2007)
Numerical Results
n
m
1024
512
1e-1
Iters.
Time
1e-6
12
1e-7
11.40s
Total LS Avg. LS
Iters.
Iters.
654
54.5
Avg. Rel.
Res.
6.90e-2
11
14.52s
840
76.4
6.99e-2
1e-8
8
10.57s
639
79.9
6.15e-2
1e-9
11
18.52s
1139
104
8.65e-2
1e-10
19
44.41s
2708
143
8.90e-2
(Curtis and Haber, 2007)
Numerical Results
n
m
8192
4096
1e-1
Iters.
1e-6
15
Total LS Avg. LS
Iters.
Iters.
264.47s 1992
133
Avg. Rel.
Res.
8.13e-2
1e-7
11
236.51s 1776
161
6.89e-2
1e-8
9
204.51s 1567
174
6.77e-2
1e-9
11
347.66s 2681
244
8.29e-2
1e-10
16
805.14s 6249
391
8.93e-2
(Curtis and Haber, 2007)
Time
Numerical Results
n
m
65536
32768
1e-1
Iters.
1e-6
15
Total LS Avg. LS
Iters.
Iters.
5055.9s 4365
291
Avg. Rel.
Res.
8.46e-2
1e-7
10
4202.6s 3630
363
8.87e-2
1e-8
12
5686.2s 4825
402
7.96e-2
1e-9
12
6678.7s 5633
469
8.77e-2
1e-10
14
14783s
895
8.63e-2
(Curtis and Haber, 2007)
Time
12525
Outline
Problem Formulation
Inexact Framework
Merit function and sufficient decrease
Satisfying first order conditions
Numerical Results
Unconstrained optimization and nonlinear equations
Stopping conditions for linear solver
Global Behavior
Equality constrained optimization
Sequential Quadratic Programming
Model inverse problem
Accuracy tradeoffs
Final Remarks
Future work
Negative curvature
Review and Future Challenges
Review
Defined a globally convergent inexact SQP algorithm
Require only inexact solutions of primal-dual system
Require
only matrix-vector products involving
objective and constraint function derivatives
Results also apply when only reduced Hessian of
Lagrangian is assumed to be positive definite
Numerical experience on model problem is promising
Future challenges
(Nearly) Singular constraint Jacobians
Inexact derivative information
Negative curvature
etc., etc., etc….
Negative Curvature
Big question
What
is the best way to handle negative curvature
(i.e., when the reduced Hessian may be indefinite)?
Small question
What
is the best way to handle negative curvature in
the context of our inexact SQP algorithm?
We have no inertia information!
Smaller question
When can we handle negative curvature in the
context of our inexact SQP algorithm with NO
algorithmic modifications?
When do we know that a given step is OK?
Our analysis of the inexact case leads to a few
observations…
Why Quadratic Models?
W
A
g AT
A d
0
c r
T
xk
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
Why Quadratic Models?
W
A
g AT
A d
0
c r
T
xk
Provides a good…
• direction? Yes
• step length? Yes
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
Provides a good…
• direction? Maybe
• step length? Maybe
Why Quadratic Models?
W
A
g AT
A d
0
c r
T
xk
min
g T d 12 d TWd
s.t.
c Ad 0
d
xk
One can use our stopping criteria as a mechanism for
determining which are good directions
All that needs to be determined is whether the step
lengths are acceptable
Unconstrained Optimization
Hd g
min g T d 12 d T Hd
d
Direct method is the angle test
gT d g d
Indirect method is to check the conditions
g , d Hd d
T
or
g d g , d g
T
2
2
Unconstrained Optimization
Hd g
min g T d 12 d T Hd
d
Direct method is the angle test
gT d g d
Indirect method is to check the conditions
g , d Hd d
T
or
step quality
g d g , d g
T
2
2
step length
Constrained Optimization
W
A
g AT
A d
0
c r
T
g T d 12 d TWd
s.t.
c Ad 0
d
Step quality determined by
r c , 0 1
c , 0
min
r , 0
or
g AT , 0 1
mred (d ) c
Step length determined by
d Wd d
T
2
or
d max c , r
Thanks!
Actual Stopping Criteria
W
A
g AT
A d
0
c r
T
g T d 12 d TWd
s.t.
c Ad 0
d
Stopping conditions: 0 , 1, 0 ,
r c
c
min
r max c , 1
or
max c , g AT
mred (d ) max c , r c
Model reduction condition
g T d 2 d TWd c r max c , r c
Constraint Feasible Case
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
If feasible, conditions reduce to
xk
r
g AT
1
r g T d 2 d T Wd
Constraint Feasible Case
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
If feasible, conditions reduce to
xk
r
g AT
1
r g T d 2 d T Wd
Constraint Feasible Case
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
If feasible, conditions reduce to
xk
r
g AT
1
r g T d 2 d T Wd
Some region
around the
exact solution
Constraint Feasible Case
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
If feasible, conditions reduce to
xk
r
g AT
1
r g d 2 d Wd
T
T
Ellipse distorted
toward the
linearized
constraints
Constraint Feasible Case
W
A
g AT
A d
0
c r
T
min
g T d 12 d TWd
s.t.
c Ad 0
d
If feasible, conditions reduce to
xk
r
g AT
1
r g T d 2 d T Wd