Huaguang Zhang

TC Report for the 2013 June AdCom Meeting (June 20, 2013)
Adaptive Dynamic Programming
and Reinforcement Learning
Technical Committee (ADPRL TC)
Chair: Huaguang Zhang, China
Vice-Chairs: Jagannathan Sarangapani, USA
Ana Maria Madureira, Portugal
1
Outline
• Introduction of ADPRLTC
• Technical Activities of ADPRLTC
• Review of 2013 ADPRLTC Meeting
• ADPRLTC Plans in 2013
• TF Activity Reports
2
ADPRLTC Members
                        2011 December   2012 June   2012 December   2013 May
Total Members           43              48          52              57
North America           21              21          22              26
Latin America           1               1           1               1
Europe                  11              12          13              13
Africa                  0               0           0               0
Asia                    9               13          15              16
Oceania                 1               1           1               1
Male                    37              42          46              51
Female                  6               6           6               6
Members from Industry   4               5           6               6
New Members             3               7           4               5
3
ADPRLTC New Members
There are five new members in 2013:
• Warren Dixon, University of Florida, USA
• Hao Xu, Missouri University of Science and Technology, USA
• Xiong Luo, University of Science and Technology Beijing, China
• Travis Dierks, Missouri University of Science and Technology, USA
• Evangelos Theodorou, University of Southern California, USA
4
ADPRL TC Main Conference
ADPRL: IEEE Symposium on Adaptive Dynamic
Programming and Reinforcement Learning
Year   Location    Attendees   Submitted   Accepted   Oral   Poster
2006   -           -           -           -          -      -
2007   Hawaii      -           65          49 (75%)   -      -
2008   -           -           -           -          -      -
2009   Nashville   -           40          33 (83%)   -      -
2010   -           -           -           -          -      -
2011   Paris       -           63          47 (75%)   -      -
2012   -           -           -           -          -      -
2013   Singapore   -           39          28 (72%)   24     4
2014   Orlando     -           -           -          -      -
2015   -           -           -           -          -      -
5
ADPRL TC Task Forces
TF 1: Applications of ADP and RL
Chair: Draguna Vrabie
Vice-Chair(s): Zhong-Ping Jiang
Members:
Warren Powell
Sean Meyn
John Valasek
Derong Liu
Jay H. Lee
Frank Lewis
Jagannathan Sarangapani
G K Venayagamoorthy
Warren Dixon
TF 2: Reinforcement Learning and Function
Approximation
Chair: Robert Babuska
Vice-Chair(s): Lucian Busoniu
Members:
Robert Babuska
Lucian Busoniu
Damien Ernst
Philippe Preux
New Vice-Chair of TF2:
Lucian Busoniu, University of Lorraine, France
6
ADPRL TC Task Forces
TF 3: Robot Reinforcement Learning
Chair: Evangelos Theodorou
Vice-Chair(s): Stefan Schaal
Members:
Leslie P. Kaelbling
Jens Kober
Martin Riedmiller
Jennie Si
Emo Todorov
Robert Babuska
Jun Morimoto
Nick Roy
Russ Tedrake
Nikos Vlassis
TF 4: Evolutionary Algorithms for ADPRL
Chair: Hisashi Handa
Vice-Chair(s): Kazuhiro Ohkura
Members:
Yoshiaki Katada
Kazuaki Yamada
Hisashi Handa
Matteo Gagliolo
Kazuhiro Ohkura
New Chair of TF3:
Evangelos Theodorou, University of Southern California, USA
7
ADPRL TC Task Forces
TF 5: ADPRL in Real-time Feedback Control Systems
Chair: Xin Xu
Vice-Chair(s): Haibo He
Members:
Wen Yu
Yanhong Luo
Dongbin Zhao
Lucian Busoniu
Haibo He
Xin Xu
TF 6: ADP in Game Theory and Multi-Agent Optimization
Chair: Kyriakos G. Vamvoudakis
Vice-Chair(s): Travis Dierks
Members:
Luis Rodolfo Garcia Carrillo
Marcio Fantini Miranda
Qinglai Wei
New Task Forces:
TF 6 is a new task force in 2013.
8
Activities at SSCI 2013
Symposium:
1. “Adaptive Dynamic Programming and Reinforcement
Learning (ADPRL)” (Chairs: Marco Wiering, Huaguang Zhang,
Jagannathan Sarangapani)
2. “Computational Intelligence Applications in Smart Grid
(CIASG)” (Chairs: Ganesh Kumar Venayagamoorthy, Haibo He)
Keynotes
1. Keynote on “General-purpose RLADP: Solving the scaling
problem” for ADPRL (Speaker: Paul Werbos)
2. Keynote on “Intelligent adaptive optimal control: algorithms
and stability” (Speaker: Huaguang Zhang)
10
Activities at SSCI 2013
Special Sessions:
1. “Evolutionary Algorithms for ADPRL” at ADPRL 2013
(Organizers: Hisashi Handa and Kazuhiro Ohkura)
2. “Online Planning” at ADPRL 2013 (Organizers: Lucian
Busoniu and Rémi Munos)
3. “ADP and RL in real-time feedback systems” at
ADPRL 2013 (Organizers: Xin Xu and Haibo He)
4. “Finite-Approximate-Error Based Adaptive Dynamic
Programming: Algorithms and Applications” at
ADPRL 2013 (Organizers: Yanhong Luo, Qinglai
Wei, and Zengguang Hou)
11
Planned Activities in 2014
Symposium:
1. “2014 Adaptive Dynamic Programming and Reinforcement Learning
(ADPRL2014)”
Special Sessions:
1. “Solving Games, with ADP” at WCCI 2014 (Organizers: Kyriakos G.
Vamvoudakis and Travis Dierks)
2. “ADP algorithm for the control of multidimensional systems” at
WCCI 2014 (Organizers: Huaguang Zhang and Yanhong Luo)
3. “Adaptive Dynamic Programming and Its Applications in Time-Delayed
Systems” at ADPRL 2014 (Organizers: Qinglai Wei, Ding Wang, and
Dongbin Zhao)
Tutorial:
1. “Extreme Learning Machine in Neural Computing and Applications”
at WCCI 2014 (Organizer: Guang-Bin Huang)
12
Activities at CIS-Related Journals(1)
Editorial Service
• Derong Liu: Editor in Chief, IEEE Transactions on Neural
Networks and Learning Systems.
• G K Venayagamoorthy: Associate Editor, IEEE
Transactions on Smart Grid.
• Marco Wiering: Associate Editor, IEEE Trans. on Neural
Networks and Learning Systems.
• Huaguang Zhang: Associate Editor, IEEE Transactions on
Fuzzy Systems.
• Draguna Vrabie: Associate Editor, IEEE Transactions on
Neural Networks and Learning Systems.
13
Activities at CIS-Related Journals(2)
• Huaguang Zhang: Associate Editor, IEEE
Transactions on Neural Networks and Learning
Systems.
• Haibo He: Associate Editor, IEEE Trans. on Neural
Networks and Learning Systems.
• W. B. Powell: Associate Editor, Operations Research.
• Haibo He: Associate Editor, IEEE Transactions on
Smart Grid.
• Xin Xu: Editor-in-Chief, Journal of Intelligent
Learning Systems and Applications.
14
Other Major Activities for CIS(1)
Special Issues for CIS-Related Journals
1. Special issue on Optimization Models and Algorithms for the
Smart Grid, 2013 (IEEE Transactions on Smart Grid)
2. Special issue of Neural Computing and Applications on
“Data-based control, optimization, modeling and applications” in
2013 (Organizers: Dongbin Zhao, Yi Shen, Zhanshan Wang, and
Xiaolin Hu)
3. Special issue on Learning Issues in Feedback Control of
Uncertain Dynamical Systems, International Journal of
Adaptive Control and Signal Processing, 2013 (Organizers: Xin
Xu, Frank Lewis)
4. Special issue on Computational Intelligence in Smart Grid,
IEEE Trans. on Smart Grid, 2013
15
Other Major Activities for CIS(2)
Major Activities for Other CIS-Sponsored Conferences/
Symposium
1. Derong Liu, 6th Int. Conf. on Brain Inspired Cognitive
Systems (BICS 2013), Beijing, China, June 9-11, 2013,
General Chair.
2. Huaguang Zhang, 4th Int. Conf. on Intelligent Control and
Information Processing (ICICIP 2013), Beijing, China,
June 9-11, 2013, General Chair.
3. Derong Liu, IEEE World Congress on Computational
Intelligence, July 6-11, 2014, Beijing, General Chair.
4. Haibo He, 2014 IEEE Symposium Series on Computational
Intelligence, Dec 9-12, 2014, Orlando, General Chair.
16
Other Major Activities(1)
Book Publications:
1. Huaguang Zhang, Derong Liu, Yanhong Luo, Ding
Wang, Adaptive Dynamic Programming for Control:
Algorithms and Stability. Springer-Verlag, 2013.
2. F. L. Lewis and D. Liu (eds.), Reinforcement Learning
and Approximate Dynamic Programming for
Feedback Control. New York: Wiley, 2012.
3. Ana Madureira, Cecilia Reis, Viriato Marques (eds.),
Computational Intelligence and Decision Making:
Trends and Applications. Springer-Verlag, 2012.
4. W. B. Powell, I. O. Ryzhov, Optimal Learning, John
Wiley and Sons, New York, 2012.
17
Other Major Activities(2)
5. D. Vrabie, K. G. Vamvoudakis, and F. L. Lewis, Optimal
Adaptive Control and Differential Games by
Reinforcement Learning Principles, IET press, 2012.
6. M. A. Wiering and M. van Otterlo (eds.), Reinforcement
Learning: State-of-the-Art, Springer, 2012.
7. K. G. Vamvoudakis, F. L. Lewis, Shuzhi Sam Ge,
“Neural Networks in Feedback Control Systems,” in
Mechanical Engineers’ Handbook, Instrumentation,
Systems, Controls, and MEMS, ed. Myer Kutz, John
Wiley, NY, 2012.
18
Other Major Activities(3)
8. K. G. Vamvoudakis, and F. L. Lewis, “Online Adaptive
Learning Solution of Multi-Agent Differential Graphical
Games,” in Frontiers in Advanced Control Systems, ed.
Ginalber Luiz Serra, Chapter 2, INTECH, 2012.
9. Yanhong Luo, Huaguang Zhang, Adaptive Optimal
Control for Complex Nonlinear Systems, Science Press,
Beijing, June 2013. (in Chinese)
10. Zhanshan Wang, Stability Analysis of Recurrent Neural
Network and Its Applications, Science Press, Beijing,
2013. (in Chinese)
19
Other Major Activities(4)
Workshops:
1. NSF workshop on May 31/June 1, 2012: "A conversation
between Artificial Intelligence and operations research on
stochastic optimization" which addressed modeling and
algorithmic issues in approximate dynamic programming
(Warren Powell).
2. Workshop at IEEE Conference on Decision and Control, Dec
2012: “Optimization Based Control” which included
presentations related to ADP and applications (Draguna
Vrabie).
3. Workshop at 24th Chinese Control and Decision Conference,
May 2012: “Industry Process Control and Optimization” which
includes presentations related to adaptive dynamic
programming theory and applications (Huaguang Zhang).
20
Other Major Activities(5)
4. Organizing an entire track on "computational
stochastic optimization", including talks
specifically on approximate dynamic
programming, both for the INFORMS annual
meeting and for the workshop sponsored by
the INFORMS Computing Society (Warren Powell).
5. Workshop on Exploration vs. Exploitation,
Edinburgh, Scotland, “The Knowledge Gradient
for Optimal Learning,” ICML 2012 (Warren
Powell).
21
Other Major Activities(6)
Major Activities for Other Conferences
• Frank Lewis: Keynote Lecture on “Optimal Design for Cooperative
Control Synchronization and Games on Communication Graphs” in
Brain Inspired Cognitive Systems (BICS 2013), Beijing, China,
June 9-11, 2013.
• Jagannathan Sarangapani: Invited Lecture “Optimal Adaptive
Control of Uncertain Nonlinear Dynamic Systems” in the 25th
Chinese Control and Decision Conference, Guiyang, China, May
25-27, 2013.
• Dongbin Zhao: Organizer of the Special Session “Data-based control
and optimization for nonlinear systems” at the 32nd Chinese Control
Conference (CCC 2013), Xi’an, China, July 26-28, 2013.
22
Other Major Activities(7)
• Huaguang Zhang: PC member of the 20th International Conference
on Neural Information Processing (ICONIP 2013), Daegu, Korea,
November 3-7, 2013.
• Haibo He: Invited talk at the 19th International Conference on
Neural Information Processing (ICONIP'12), Doha, Qatar,
November 14, 2012.
• Warren Powell: Advanced Tutorial “Unifying the Jungle of
Stochastic Optimization,” Conference on Principles and Practice of
Constraint Programming, Quebec City, Oct 12, 2012.
• Guang-Bin Huang, International Symposium on Extreme Learning
Machine (ELM2012), Singapore, Dec 11-13, 2012, Symposium
Chair.
23
Other Major Activities(8)
Society and Conference Service

• Huaguang Zhang: Chair of IEEE CIS Shenyang chapter
• Ana Maria Madureira: Elected vice-chair of IEEE
Portuguese section
• Ana Maria Madureira: Elected vice-chair of IEEE CIS
Portuguese chapter
• Huaguang Zhang: AdCom Member of Chinese
Association for Artificial Intelligence
• Warren Powell: Member of American Association for
Artificial Intelligence.
• Warren Powell: Member of Math Programming Society
and American Mathematical Society.
24
Discussions in 2013 TC Meeting
We have discussed the following issues at the TC Meeting:
1. How to increase the number of submissions to ADPRL Symposium
2014/WCCI 2014 and motivate authors to submit their papers
before the original deadline.
2. How to motivate highly qualified specialists to help review the
conference papers.
3. How to avoid possible plagiarism.
4. How to encourage the TC members to propose new Task Forces
and keep the webpage of each Task Force up to date.
5. How to cross the boundaries between different communities, such as
the ADP, reinforcement learning, and stochastic optimal control
communities.
6. How to extend the applications of ADPRL algorithms to more
complex industrial processes.
26
ADPRLTC Chair’s Plan in 2013
1. Activate the task force “Robot Reinforcement Learning” in
2013.
2. Increase the number of members from Oceania and Africa in 2013.
3. Encourage more keynotes/workshops/tutorials on ADPRL
during some related conferences, such as
CDC/ACC/ECC/ICSP/IJCNN etc., to publicize this field.
4. Consider a special issue on ADPRL in CIS-sponsored
journals, such as IEEE Computational Intelligence Magazine,
IEEE TNNLS, etc.
5. Extend the applications of ADPRL algorithms to more
complex industrial processes.
6. Organize some summer schools in Asia/Europe.
7. Encourage membership upgrades (e.g., Senior Member,
Fellow) and award nominations.
28
Task Force Report
Applications of ADP and RL
Chair: Draguna Vrabie
Vice-Chair(s): Zhong-Ping Jiang
Members:
Warren Powell
John Valasek
Jay H. Lee
Jagannathan Sarangapani
Warren Dixon
Sean Meyn
Derong Liu
Frank Lewis
G K Venayagamoorthy
Activities in 2013:
1. Keynote Lecture on “Optimal Design for Cooperative Control Synchronization
and Games on Communication Graphs” in Brain Inspired Cognitive Systems (BICS
2013), Beijing, China, June 9-11, 2013;
2. Invited Lecture “Optimal Adaptive Control of Uncertain Nonlinear Dynamic
Systems” in the 25th Chinese Control and Decision Conference, Guiyang, China,
May 25-27, 2013.
3. Workshop at IEEE CDC 2012: “Optimization Based Control” which includes
presentations related to ADP and applications, December 10-13, 2012.
Planned Activities in 2014:
1. Special issue “Reinforcement Learning and Adaptive Dynamic Programming” in
Acta Automatica Sinica, 2014.
30
Task Force Report
Reinforcement Learning and
Function Approximation
Chair: Robert Babuska
Vice-Chair(s): Lucian Busoniu
Members:
Robert Babuska
Lucian Busoniu
Damien Ernst
Philippe Preux
Activities in 2012/2013:
1. Book chapter: L. Busoniu, R. Munos, R. Babuska, Optimistic planning
in Markov decision processes. In: Reinforcement Learning and Approximate
Dynamic Programming for Feedback Control, F. Lewis, D. Liu (eds.), Wiley,
2012.
2. Special Session at SSCI 2013: “Online Planning”, Organizers: L. Busoniu,
R. Munos.
Planned Activities in 2014:
1. Lucian Busoniu will co-organize the ADPRL 2014 symposium as a co-chair.
31
Task Force Report
Robot Reinforcement Learning
Chair: Evangelos Theodorou
Vice-Chair(s): Stefan Schaal
Members:
Leslie P. Kaelbling
Jens Kober
Martin Riedmiller
Jennie Si
Emo Todorov
Robert Babuska
Jun Morimoto
Nick Roy
Russ Tedrake
Nikos Vlassis
Activities in 2012/2013:
1. Invited Lecture at WCCI2012: “Uncovering the Neural Code of Learning
Control”
2. Panel at WCCI2012: “Computational Intelligence in Education and
University Curricula”
Planned Activities in 2014:
Not yet determined for 2014.
32
Task Force Report
Evolutionary Algorithms for
ADPRL
Chair: Hisashi Handa
Vice-Chair(s): Kazuhiro Ohkura
Members:
Yoshiaki Katada
Kazuaki Yamada
Hisashi Handa
Matteo Gagliolo
Kazuhiro Ohkura
Activities in 2012/2013:
1. Special Session at WCCI 2012: “Real World Applications of
Reinforcement Learning”;
2. Special Session on “Evolutionary Algorithms for ADPRL” at ADPRL
2013.
Planned Activities in 2014:
Not yet determined for 2014.
33
Task Force Report
ADPRL in Real-time Feedback
Control Systems
Chair: Xin Xu
Vice-Chair(s): Haibo He
Members:
Wen Yu
Dongbin Zhao
Haibo He
Yanhong Luo
Lucian Busoniu
Xin Xu
Activities in 2013:
1. Special issue “Optimization Models and Algorithms for the Smart Grid”
in IEEE Transactions on Smart Grid, 2013;
2. A special issue on Learning Issues in Feedback Control of Uncertain
Dynamical Systems is being published in the International Journal
of Adaptive Control and Signal Processing in 2013;
3. Special Session on “ADP and RL in real-time feedback systems” at
ADPRL 2013.
Planned Activities in 2014:
1. Tutorial on “Reinforcement learning for real-time feedback control
systems” at WCCI2014.
34
Task Force Report
ADP in Game Theory and Multi-Agent Optimization
Chair: Kyriakos G. Vamvoudakis
Vice-Chair(s): Travis Dierks
Members:
Luis Rodolfo Garcia Carrillo
Marcio Fantini Miranda
Qinglai Wei
Activities in 2013:
1. Special session on “Games, ADP and Network Security” for IEEE CDC
2013;
2. K. G. Vamvoudakis, F. L. Lewis, Shuzhi Sam Ge, “Neural Networks in
Feedback Control Systems,” to appear in Mechanical Engineers’
Handbook, Instrumentation, Systems, Controls, and MEMS, ed. Myer
Kutz, John Wiley, NY, 2013.
Planned Activities in 2014:
1. Special session on “Solving Games, with ADP” for WCCI 2014.
35
36