Control of Distributed-Information Nonlinear Stochastic Systems

SWAN 2006, ARRI
Prof. Thomas Parisini
University of Trieste
Summary
Examples
Problem Formulation
Approximate NN Solution
Two Significant Applications
Example: Water Distribution Networks
Objective: control the spatio-temporal distribution of drinking-water disinfectant throughout the network by injecting appropriate amounts of disinfectant at appropriately chosen actuator locations
Example: Robot Soccer
Objective: d…
Distributed Decision and Control
How did this happen?
Outer-loop control: as we move from low-level control to higher-level control, we face the need to take into consideration other feedback systems that influence the environment.
Complexity: due to the presence of more complex systems, we had to break down the controller into smaller controllers.
Communications and Networks: data networks and wireless communications have facilitated the design of feedback systems that require distributed decision and control techniques.
Distributed Decision and Control
What does it imply?
Need for different types of control problem formulations.
Need to handle competition from other controllers (agents).
Need to handle cooperation with other controllers.
Need to handle inter-controller communication issues.
Need for suitable individual and team evaluation methods.
Distributed Decision and Control
How does Learning come in?
Learning the environment
Learning the strategy of adversarial agents
Predicting the actions of collaborating agents
Learning is crucial because the environment is highly time-varying (evolving).
Structure of a “Team” of Agents
Problem formulation
Definitions
Unpredictable variables:
Random vector representing all uncertainties in
the external world with known p.d.f.
Information function:
Decision function:
Cost functional:
Problem formulation
Problem T
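Given the known p.d.f. of the unpredictable variables and the information functions, find the optimal strategies (decision functions) that minimize the cost functional.
Solving Problem T analytically is in general impossible.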
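The definitions and Problem T can be written compactly as follows. This is only a sketch in commonly used team-theory notation: the symbols $\xi$, $g_i$, $\gamma_i$, $h$, $q$, and $M$ (the number of decision makers) are assumptions introduced here for illustration, and the argument list of $g_i$ is a generic choice rather than the form used on the slides.

```latex
\begin{align*}
&\text{Unpredictable variables:} && \xi \in \mathbb{R}^{q}, \quad \xi \sim p(\xi) \ \text{(known p.d.f.)} \\
&\text{Information function of DM } i\text{:} && y_i = g_i(\xi, u_1, \dots, u_M), \quad i = 1, \dots, M \\
&\text{Decision function of DM } i\text{:} && u_i = \gamma_i(y_i) \\
&\text{Cost functional:} && J(\gamma_1, \dots, \gamma_M) = \mathbb{E}_{\xi}\!\left[ h(u_1, \dots, u_M, \xi) \right] \\
&\text{Problem T:} && \min_{\gamma_1, \dots, \gamma_M} J(\gamma_1, \dots, \gamma_M)
\end{align*}
```

In this notation the unknowns are functions, which is why the problem is a functional optimization problem.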
About Problem T
Further Definitions and Concepts
Predecessor: one decision maker (DM) is a predecessor of another when the control actions generated by the former affect the information set of the latter for any possible …
About Problem T
Further Definitions and Concepts
Information set inclusion: the information set of one DM is a function of the information set of another DM (the former is “nested” in the latter)
About Problem T
Further Definitions and Concepts
Information network:
About Problem T
Further Definitions and Concepts
Information Structures
Static: the information of each DM is not influenced by the decisions of other DMs
Dynamic: otherwise
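One way to make these notions concrete is to store the information network as a directed graph and test it. The sketch below is only illustrative: the dictionary encoding, the function names, and the three example teams are hypothetical choices, not taken from the slides.

```python
from typing import Dict, Set

# Hypothetical encoding: influences[j] is the set of DMs whose information
# is affected by the decisions of DM j (DM j is a predecessor of each of them).
InfoNetwork = Dict[str, Set[str]]


def is_static(influences: InfoNetwork) -> bool:
    """Static structure: no DM's information depends on another DM's decisions."""
    return all(len(successors) == 0 for successors in influences.values())


def has_loop(influences: InfoNetwork) -> bool:
    """Detect a directed cycle in the information network by depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {dm: WHITE for dm in influences}

    def visit(dm: str) -> bool:
        colour[dm] = GREY  # dm is on the current search path
        for nxt in influences.get(dm, set()):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # edge back to the current path: a loop
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[dm] = BLACK  # fully explored, no loop through dm
        return False

    return any(colour[dm] == WHITE and visit(dm) for dm in influences)


# Example teams (illustrative only).
static_team = {"DM1": set(), "DM2": set()}
chain_team = {"DM1": {"DM2"}, "DM2": set()}
loop_team = {"DM1": {"DM2"}, "DM2": {"DM1"}}
```

Here `static_team` has a static information structure, while `chain_team` is dynamic but loop-free and `loop_team` contains a loop.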
About Problem T
Sufficient conditions for analytic solution
LQG Static Teams
Linear optimal control strategy
About Problem T
Sufficient conditions for analytic solution
“Partially Nested” LQG Dynamic Teams
Any DM can reconstruct the information of the DMs influencing its own information
Linear optimal control strategy
About Problem T
Sufficient conditions for analytic solution
Existence of a sequential partition
Optimal control strategy by DP
Approximate Solution of Problem T
A Simple Methodology
Assumption: no loops in the information network
Given parametric structure
Vector of “free” parameters to be optimized
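As an illustration of a fixed parametric structure with a vector of free parameters, here is a minimal sketch of a one-hidden-layer network with tanh units acting as a scalar control function. The layout of the weight vector, the function names, and the scalar input and output are assumptions made for this example only.

```python
import math
import random
from typing import List


def init_weights(n_hidden: int, rng: random.Random) -> List[float]:
    """Return a random vector of free parameters for a scalar-input network."""
    # Assumed layout: [a_1..a_n, b_1..b_n, c_1..c_n, c_0]
    return [rng.uniform(-1.0, 1.0) for _ in range(3 * n_hidden + 1)]


def nn_control(y: float, w: List[float]) -> float:
    """Parametric control function u = c_0 + sum_j c_j * tanh(a_j * y + b_j)."""
    n = (len(w) - 1) // 3
    a, b, c, c0 = w[:n], w[n:2 * n], w[2 * n:3 * n], w[3 * n]
    return c0 + sum(c[j] * math.tanh(a[j] * y + b[j]) for j in range(n))
```

Once the number of hidden units is fixed, choosing a control function amounts to choosing the finite vector `w`.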
Formulation of the Approximate Problem
Problem T’
Substitute the given parametric structures into the cost function
Formulation of the Approximate Problem
Problem T’
Given the NN structures, find the optimal vector of parameters that minimizes the cost function
Functional Optimization Problem T is reduced to Nonlinear Programming Problem T’
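The reduction can be sketched in formulas. The notation is assumed for illustration: $\hat{\gamma}_i$ is the fixed NN structure of DM $i$, $w_i$ its weight vector, $h$ the cost, and $\xi$ the unpredictable variables; none of these symbols are recovered from the slides.

```latex
\begin{align*}
u_i &= \hat{\gamma}_i(y_i, w_i), \qquad i = 1, \dots, M, \qquad
w = \operatorname{col}(w_1, \dots, w_M) \\
\hat{J}(w) &= \mathbb{E}_{\xi}\!\left[ h\big(\hat{\gamma}_1(y_1, w_1), \dots, \hat{\gamma}_M(y_M, w_M), \xi\big) \right] \\
\text{Problem T':} &\quad \min_{w} \hat{J}(w)
\end{align*}
```

The unknown is now a finite-dimensional vector instead of a set of functions.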
NN Learning Algorithm
Gradient Method
However: the gradient of the expected cost cannot be written in explicit form
NN Learning Algorithm
Stochastic Approximation
Compute the “realization”
NN Learning Algorithm
Stochastic Approximation
The unpredictable variables are randomly generated according to their (known) p.d.f.
The step size is chosen so as to guarantee convergence
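A stochastic-approximation iteration of this kind is usually written as below. The symbols $w(k)$, $\xi(k)$, $\alpha(k)$, $h$, $c_1$, and $c_2$ are assumed notation; the step-size conditions are the classical Robbins-Monro ones, and the example rule is a common choice, not necessarily the one shown on the slide.

```latex
\begin{align*}
w(k+1) &= w(k) - \alpha(k)\, \nabla_{w} h\big[w(k), \xi(k)\big], \qquad k = 0, 1, \dots \\
\alpha(k) &> 0, \qquad \sum_{k=0}^{\infty} \alpha(k) = \infty, \qquad \sum_{k=0}^{\infty} \alpha(k)^{2} < \infty \\
\text{e.g.} \quad \alpha(k) &= \frac{c_1}{c_2 + k}, \qquad c_1, c_2 > 0
\end{align*}
```

Each step uses the gradient of a single realization of the cost, so the expectation never has to be evaluated explicitly.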
NN Learning Algorithm
Important Remark
Gradient method + Stochastic Approximation
Distributed Learning: each DM is able to compute “locally” its control function by “exchanging messages” with cooperating DMs according to the Information Structure
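The idea can be tried on a toy example. The sketch below trains two linear decision makers by stochastic approximation, with each one updating only its own weight from its local information plus one shared scalar. Everything specific here is an assumption for illustration: the static two-DM team, the quadratic cost, the linear structures, the step-size constants, and the name `train_team`.

```python
import random
from typing import Tuple


def train_team(n_steps: int = 20000, seed: int = 0) -> Tuple[float, float]:
    """Stochastic-approximation training of two linear decision makers.

    Toy static team (all modelling choices are assumptions of this sketch):
      xi = (xi1, xi2) independent standard Gaussians, y_i = xi_i,
      u_i = w_i * y_i,
      cost h = (u1 + u2 - xi1 - xi2)**2 + 0.1 * (u1**2 + u2**2).
    """
    rng = random.Random(seed)
    w1, w2 = 0.0, 0.0
    c1, c2 = 1.0, 10.0  # step size alpha(k) = c1 / (c2 + k)
    for k in range(n_steps):
        # Draw one realization of the unpredictable variables.
        xi1, xi2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1, y2 = xi1, xi2            # local information of each DM
        u1, u2 = w1 * y1, w2 * y2    # local decisions
        error = u1 + u2 - xi1 - xi2  # the only quantity the DMs exchange
        # Each DM forms the realization of its own partial gradient locally.
        g1 = 2.0 * error * y1 + 0.2 * u1 * y1
        g2 = 2.0 * error * y2 + 0.2 * u2 * y2
        alpha = c1 / (c2 + k)
        w1 -= alpha * g1
        w2 -= alpha * g2
    return w1, w2
```

For this toy cost the expected value is minimized at `w1 = w2 = 10/11`, so the returned weights can be checked against that value.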
Methodology: Conceptual Steps
Problem T: minimize the cost functional, yielding exact optimal solutions
Replace the decision functions with the NN structure
Problem T’: minimize the cost over the NN parameters
Stochastic approximation to solve Problem T’, yielding approximate optimal solutions
The Witsenhausen Counterexample
Problem W
Given: two independent random variables, the information functions, and the cost function
Find the optimal strategies that minimize the cost function
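For reference, the Witsenhausen counterexample is usually stated as follows. The symbols $x_0$, $v$, $x_1$, $y_i$, $u_i$, $\gamma_i$, $k$, and $\sigma$ follow the common textbook statement and are assumed here; they may differ from the notation used on the slides.

```latex
\begin{align*}
&x_0 \sim \mathcal{N}(0, \sigma^{2}), \qquad v \sim \mathcal{N}(0, 1), \qquad x_0 \text{ and } v \text{ independent} \\
&y_1 = x_0, \qquad u_1 = \gamma_1(y_1), \qquad x_1 = x_0 + u_1 \\
&y_2 = x_1 + v, \qquad u_2 = \gamma_2(y_2) \\
&J(\gamma_1, \gamma_2) = \mathbb{E}\!\left[ k^{2} u_1^{2} + (x_1 - u_2)^{2} \right]
\end{align*}
```

The second decision maker observes the first one's action only through noise, which makes the information structure dynamic.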
The Witsenhausen Counterexample
Remarks on Problem W
LQG hypotheses hold
Information structure not partially nested
An optimal solution does exist
But: the optimal strategies are not affine functions of the available information
The Witsenhausen Counterexample
Remarks on Problem W
Best affine solution:
Wit. (Witsenhausen) solution:
For … and …
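The gap between the two solutions can be checked numerically. The sketch below assumes the standard statement of the counterexample (Gaussian `x0` with standard deviation `sigma`, unit-variance observation noise, cost `k**2 * u1**2 + (x1 - u2)**2`) and the two-point signaling strategy from Witsenhausen's original paper. The function names, the grid search, and the sample size are choices made for this example, and the parameter values used in the note below are the classical ones, assumed rather than read from the slide.

```python
import math
import random


def affine_cost(mu: float, k: float, sigma: float) -> float:
    """Exact cost of x1 = mu * x0 followed by the linear MMSE estimate of x1."""
    s = (mu * sigma) ** 2  # variance of x1
    return (k * sigma * (mu - 1.0)) ** 2 + s / (1.0 + s)


def best_affine_cost(k: float, sigma: float, grid: int = 20001) -> float:
    """Grid search over mu in [0, 2] (a simple, assumed search range)."""
    return min(affine_cost(2.0 * i / (grid - 1), k, sigma) for i in range(grid))


def witsenhausen_cost(k: float, sigma: float, n: int = 200000, seed: int = 0) -> float:
    """Monte Carlo cost of x1 = sigma * sign(x0), u2 = sigma * tanh(sigma * y2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x0 = rng.gauss(0.0, sigma)
        x1 = sigma if x0 >= 0.0 else -sigma  # first DM forces a two-point signal
        u1 = x1 - x0
        y2 = x1 + rng.gauss(0.0, 1.0)        # second DM sees x1 through unit noise
        u2 = sigma * math.tanh(sigma * y2)
        total += (k * u1) ** 2 + (x1 - u2) ** 2
    return total / n
```

With the classical values `k = 0.2` and `sigma = 5.0`, `best_affine_cost` gives roughly 0.96 and `witsenhausen_cost` roughly 0.40, so the nonlinear strategy is clearly cheaper than any affine one.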
The Witsenhausen Counterexample
Remarks on Problem W
Optimized Wit. solution: given the structures …
For … and …
The Witsenhausen Counterexample
Remarks on Problem W
Opt. Wit. solution outperforms the best linear solutions:
[Figure: region plot; axis ticks run from 0 to 0.4 and from 0 to 45]
Approximate NN Solution of Problem W
Choice of the parametric structures
Given parametric structures
Vector of “free” parameters to be optimized
Approximate NN Solution of Problem W
Problem W’
Substitute the given parametric structures into the cost functional
Approximate NN Solution of Problem W
Problem W’
Given two independent random variables, the information functions, and the cost function, find the optimal NN weights that minimize the cost function
Conceptual Steps to Solve Problem W Approximately
Problem W: minimize the cost functional, yielding exact optimal solutions
Replace the decision functions with the NNs
Problem W’: minimize the cost over the NN weights
Stochastic approximation to solve Problem W’, yielding approximate optimal solutions
Approximate NN Solution of Problem W
Results and Comparisons
[Figures: five slides, each showing two panels that compare the Best Linear, Opt. W., and NN solutions; axis ticks run from -15 to 15]
The “3-step” area
The “5-step” area
The “linear” area
[Figure: map of regions labeled 3, 5, and L, each marked + or -; axis ticks run from 0 to 0.7 and from 0 to 45]
Costs of the Neural, Optimized Witsenhausen, and Best Linear Solutions
Concluding Remarks
General approximate methodology for the solution of distributed-information control problems:
Decision makers act as cooperating members of a team
Team functional optimization problem reduced to a nonlinear programming one
Distributed learning scheme: each DM can compute (or adapt) its “personal” control function “locally”
Straightforward extension to the infinite-horizon case (receding-horizon paradigm)
Acknowledgments
Riccardo Zoppoli, Marios Polycarpou, Marco Baglietto,
Angelo Alessandri, Alessandro Astolfi, Daniele
Casagrande, Riccardo Ferrari, Elisa Franco, Frank Lewis,
R. Selmic, Jason Speyer, Marcel Staroswiecki, Jakob
Stoustrup, …