Dynamic Programming

We are often interested in net present values:

$$V_t = \pi_t + \beta \pi_{t+1} + \beta^2 \pi_{t+2} + \beta^3 \pi_{t+3} + \cdots$$
Net present values can be expressed as the sum of a current payoff and a discounted
future value:
$$V_t = \pi_t + \beta V_{t+1}$$
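To see why, factor $\beta$ out of every term after the first:
$$V_t = \pi_t + \beta \left( \pi_{t+1} + \beta \pi_{t+2} + \beta^2 \pi_{t+3} + \cdots \right) = \pi_t + \beta V_{t+1}$$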
Often we can eliminate time subscripts by establishing that value each period depends
on a set of "state variables" that describe different "states of the world":
$$V(k) = \pi(k, k') + \beta V(k')$$
where $k'$ depends on $k$
This yields a "stationary" problem (one that is not time-dependent)
Dynamic Programming is a set of tools and techniques for analyzing such problems
A Simple Example
Consider a simple economy with a representative agent who lives forever
The agent's problem is:

$$\max_{\{c_t, i_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t U(c_t)$$
subject to
$$k_{t+1} = k_t(1 - \delta) + i_t$$
$$c_t + i_t = F(k_t)$$
where $\beta$ is the discount factor, $\delta$ is the depreciation rate, $U(\cdot)$ is the utility function, $c_t$ is consumption in period $t$, $k_t$ is capital in period $t$, $i_t$ is investment in period $t$, and $F(\cdot)$ is the production function
We can use the constraints to eliminate $i_t$ and express $c_t$ as
$$c_t = F(k_t) - k_{t+1} + k_t(1 - \delta) = f(k_t) - k_{t+1}$$
where $f(k_t) \equiv F(k_t) + (1 - \delta) k_t$
Capital as a State Variable
The agent's problem can be rewritten as

$$\max_{\{k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t U\left( f(k_t) - k_{t+1} \right)$$
subject to
$$0 \le k_{t+1} \le f(k_t), \quad t = 0, 1, 2, \ldots$$
$k_0 > 0$ is given
Since this is an infinite horizon problem, it looks the same every period; the only thing
that differs if you begin in another period is the initial capital stock
Thus, intuitively, the solution to this problem must depend entirely on $k_0$
Another feature of this problem: if we start in period 0, the agent does not need to
choose the entire sequence of capital stocks immediately
The agent just needs to choose the next period's capital stock; this is sufficient to
determine current consumption and next period's output
The agent can wait until period 1 to choose $k_2$, and so on
Given this, we just need to describe the agent's preferences over $k_{t+1}$ given $k_t$
The Value Function
Suppose that the agent's problem has been solved for every value of $k_0$
Then we can define a value function $V(k_0)$ that represents the value of the maximized objective function for each $k_0$:

$$V(k_0) = \sum_{t=0}^{\infty} \beta^t U\left( f(k_t) - k_{t+1} \right)$$
when the sequence of capital stocks is chosen optimally subject to the constraints
Now for any $k_0$ we can say
$$V(k_0) = \max_{0 \le k_1 \le f(k_0)} \left[ U\left( f(k_0) - k_1 \right) + \beta V(k_1) \right]$$
The time subscripts are now completely unnecessary: the problem is stationary, or
independent of the date
The Functional Equation
In general, given a current capital stock $k$,
$$V(k) = \max_{0 \le k' \le f(k)} \left[ U\left( f(k) - k' \right) + \beta V(k') \right]$$
This one equation in the unknown function $V(\cdot)$ is called a functional equation
The study of dynamic optimization problems through the analysis of such functional
equations is called dynamic programming
We can often describe the optimal choices using a policy function $g(\cdot)$
The relationship $k_{t+1} = g(k_t)$ completely describes the dynamics of capital accumulation (changes in the state variable) from any given initial state
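Once $g(\cdot)$ is in hand, simulating the model is just repeated function application. A minimal sketch in Python, using a hypothetical closed-form policy $g(k) = \alpha\beta k^{\alpha}$ (the functional form and parameter values are assumptions for illustration, not from the slides):

```python
# Trace the capital path k_{t+1} = g(k_t) from an initial state k_0.
alpha, beta = 0.3, 0.95                 # illustrative parameters (assumed)
g = lambda k: alpha * beta * k**alpha   # hypothetical policy function

k = 0.1                    # initial capital stock k_0
path = [k]
for _ in range(20):
    k = g(k)               # the policy pins down next period's capital
    path.append(k)
print(round(path[-1], 4))  # the path settles down toward a steady state
```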
Computing the Value Function
Note that in the problem
$$V(k) = \max_{0 \le k' \le f(k)} \left[ U\left( f(k) - k' \right) + \beta V(k') \right]$$
$U(\cdot)$ and $f(\cdot)$ are known
The unknown is the value function
In general there are three ways to solve for the value function:
1. For a few simple cases, $V(k)$ has a known functional form (if you pick the right $U(\cdot)$ and $f(\cdot)$, a textbook can tell you what $V(k)$ is; see the example after this list)
2. You can guess. This method is notoriously unreliable
3. You can compute $V(k)$ numerically on a computer. This requires that you choose particular functional forms and parameter values, and then use an appropriate computational technique
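As an illustration of case 1 (a standard textbook example; the specific forms are assumptions, not from these slides): with $U(c) = \ln c$ and $f(k) = k^{\alpha}$, i.e. Cobb-Douglas production with full depreciation, the value function and policy function are known in closed form:
$$V(k) = A + \frac{\alpha}{1 - \alpha\beta} \ln k, \qquad g(k) = \alpha\beta k^{\alpha}$$
where $A$ is a constant that depends only on $\alpha$ and $\beta$.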
Computing the Value Function
The standard approach for computing value functions is known as “iteration on the value function”
1. To proceed, first specify functional forms and parameter values for the primitives of the problem: utility functions, production functions, the discount factor, etc.
2. Then set up a grid of possible values of the state variable: in cases where the state variable is continuous (such as the capital accumulation problem) this results in an approximate solution rather than an exact one, but this is usually good enough
3. The next step is to choose an initial value function; this sounds difficult but in practice it usually doesn’t matter too much what you start with: setting $V(k) = 0$ for every $k$ on the grid is a typical choice
Iteration on the Value Function
4. At every grid point $k$, taking the initial function $V(\cdot)$ as given, choose the value of $k'$ that solves the agent's problem:
$$\max_{0 \le k' \le f(k)} \left[ U\left( f(k) - k' \right) + \beta V(k') \right]$$
Record the value of $U(f(k) - k') + \beta V(k')$ as $V(k)$ in a new vector of values
5. Repeat step 4 using the new $V(\cdot)$ function; keep repeating until $V(\cdot)$ no longer changes (at which point we say that the value function has converged). A sketch of this procedure in code follows.
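A minimal sketch of the whole procedure in Python, assuming log utility, $f(k) = k^{\alpha}$, and illustrative parameter values (none of these specifics come from the slides):

```python
import numpy as np

# Step 1: primitives (illustrative assumptions).
alpha, beta = 0.3, 0.95
f = lambda k: k**alpha   # f(k) = F(k) + (1 - delta)k with delta = 1
U = np.log               # log utility

# Step 2: a grid of possible values of the state variable k.
grid = np.linspace(0.05, 2.0, 200)

# Step 3: initial value function, V(k) = 0 at every grid point.
V = np.zeros(grid.size)

# Precompute U(f(k) - k') for every (k, k') pair on the grid;
# rows index k, columns index k'. Infeasible pairs (c <= 0) get -inf.
c = f(grid)[:, None] - grid[None, :]
feasible = c > 0
u = np.where(feasible, U(np.where(feasible, c, 1.0)), -np.inf)

# Steps 4-5: iterate on the value function until it converges.
tol = 1e-8
while True:
    objective = u + beta * V[None, :]    # U(f(k) - k') + beta * V(k')
    V_new = objective.max(axis=1)        # new vector of values
    if np.max(np.abs(V_new - V)) < tol:  # stop once V stops changing
        break
    V = V_new

# The policy function g(k): the maximizing k' at each grid point.
g = grid[objective.argmax(axis=1)]
```

With these particular primitives, the computed $g$ should line up closely with the closed-form policy $g(k) = \alpha\beta k^{\alpha}$ noted earlier, which is a useful sanity check on the grid and the tolerance.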
Mathematical Foundations
• We would like our model to yield a unique solution for $V(\cdot)$ (a unique function, not a point)
• This means no matter which initial value function we choose at the start of the iterations, the final value function is always the same
• Similarly, we would like $g(\cdot)$ to be single-valued (a function instead of a correspondence)
• Then, given an initial state, the model would yield a unique prediction for the entire path of capital accumulation
• Whether the model yields unique solutions for the value function and policy function depends on the primitives of the problem: utility functions, production functions, etc.
• Stokey and Lucas describe general sufficient conditions for a model to yield a unique value function and a unique policy function (these conditions are sufficient, not necessary)
General Notation
We'll follow the notation in Stokey and Lucas (1989) Section 4.2 and consider
functional equations of the form:
$$v(x) = \max_{y \in \Gamma(x)} \left[ F(x, y) + \beta v(y) \right]$$
under the assumption that $F$ is bounded and $\beta < 1$
$X$ is the set of possible values of the state variable
$\Gamma : X \to X$ is the correspondence describing the feasibility constraints
$A = \{(x, y) \in X \times X : y \in \Gamma(x)\}$ is the graph of $\Gamma$
$F : A \to \mathbb{R}$ is the return function
$\beta \in [0, 1)$ is the discount factor
Assumptions
4.3: $X$ is a convex subset of $\mathbb{R}^l$, and the correspondence $\Gamma : X \to X$ is non-empty, compact-valued, and continuous.
4.4: The function $F : A \to \mathbb{R}$ is bounded and continuous, and $\beta \in (0, 1)$.
4.7: $F$ is strictly concave.
4.8: $\Gamma$ is convex in the sense that for any $\theta \in [0, 1]$ and $x, x' \in X$, if $y \in \Gamma(x)$ and $y' \in \Gamma(x')$, then $\theta y + (1 - \theta) y' \in \Gamma(\theta x + (1 - \theta) x')$.
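Continuing the mapping above for the capital accumulation example (a standard observation, sketched here for concreteness): assumption 4.3 holds when $f$ is continuous, assumptions 4.4 and 4.7 hold when $U$ is bounded, continuous, and strictly concave and $f$ is concave, and assumption 4.8 holds because the graph $\{(k, k') : 0 \le k' \le f(k)\}$ is a convex set when $f$ is concave.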
Key Results
Given assumptions 4.3, 4.4, 4.7, and 4.8, there is a unique bounded and continuous function $v$ that satisfies the functional equation. Further, $v$ is strictly concave, and the associated optimal policy function $g$ is single-valued and continuous.
Given $x_0$, the sequence $\{x_t\}$ defined by $x_{t+1} = g(x_t)$, $t = 0, 1, 2, \ldots$ is the unique solution to the dynamic programming problem.
We'll go over the assumptions to establish what might go wrong if they are not
satisfied, but it is important to keep in mind that the assumptions are sufficient
conditions, not necessary ones. It is possible to use dynamic programming in
other environments. "Existence proof by computer" is fairly common.
The Content of the Assumptions
To understand most of the assumptions, we do not really need to think about
dynamic programming.
It is sufficient to think about any constrained optimization problem.
In a constrained optimization problem, the constraints must allow for at least one solution. Thus, $\Gamma(x)$ must be non-empty.
The constraints must also create a feasible set that is closed and bounded.
Closed is necessary because you cannot maximize on an open interval.
To see this, consider $\max x$ subject to $x \in [0, 1)$: the supremum is 1, but it is never attained.
Bounded is necessary so that the objective function does not reach $\infty$
Closed and bounded $\Rightarrow$ compact, so $\Gamma(x)$ must be compact-valued.
The Assumptions
Assuming that $F$ is bounded helps ensure that $v$ is bounded
A bounded feasible set is insufficient for this: consider $\max 1/x$ s.t. $x \in (0, 1]$
The objective diverges as $x \to 0$, so the value is unbounded
Assuming that $F$ is continuous is important for obtaining a max
If the objective function is not continuous, a max may not be obtainable
Consider $f(x) = 0$ if $x = 0$ and $1/x$ if $x > 0$
If the problem is $\max_x f(x)$ s.t. $x \in [0, 1]$ then there is no solution
The Assumptions
Note that in the value maximization problem the objective function is
$$F(x, y) + \beta v(y)$$
In order to ensure that this function is continuous, we need $\Gamma(x)$ to be continuous
Otherwise, slight changes in $x$ could lead to jumps in $y$ that could cause $v(x)$ to have breaks
We need $\beta < 1$ because otherwise the infinite sum $\sum_{t=0}^{\infty} \beta^t F(x_t, y_t)$ may not be bounded even if $F(x, y)$ is
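Concretely, if $|F(x, y)| \le B$ on $A$, the geometric series gives
$$\left| \sum_{t=0}^{\infty} \beta^t F(x_t, y_t) \right| \le \sum_{t=0}^{\infty} \beta^t B = \frac{B}{1 - \beta}$$
which is finite precisely because $\beta < 1$.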
The Assumptions
In order to obtain a unique solution $y$ for every $x$, we need to ensure that the objective function is strictly concave
This rules out flat segments or multiple peaks
In order to attain a single peak and have a unique solution, the constraint set must be convex and $X$ must be convex
To see why, consider the problem $\max_x \{-x^2\}$ s.t. $x \le -1$ or $x \ge 1$: the feasible set is not convex, and the problem has two solutions, $x = -1$ and $x = 1$, so the optimal policy would be a correspondence rather than a function