
TC Report for the 2013 June AdCom Meeting (June 20, 2013)

Adaptive Dynamic Programming and Reinforcement Learning

Technical Committee (ADPRL TC)

Chair: Huaguang Zhang, China

Vice-Chairs: Jagannathan Sarangapani, USA

Ana Maria Madureira, Portugal


Outline

• Introduction of ADPRLTC

• Technical Activities of ADPRLTC

• Review of 2013 ADPRLTC Meeting

• ADPRLTC Plans in 2013

• TF Activity Reports

ADPRLTC Members

                    2011 December   2012 June   2012 December   2013 May
Members             43              48          52              57
North America       21              21          22              26
Latin America       1               1           1               1
Europe              11              12          13              13
Africa              0               0           0               0
Asia                9               13          15              16
Oceania             1               1           1               1
Male                37              42          46              51
Female              6               6           6               6
New Members         3               7           4               5

Members from Industry: 4

ADPRLTC New Members

There are five new members in 2013:

• Warren Dixon, University of Florida, USA

• Hao Xu, Missouri University of Science and Technology, USA

• Xiong Luo, University of Science and Technology Beijing, China

• Travis Dierks, Missouri University of Science and Technology, USA

• Evangelos Theodorou, University of Southern California, USA

ADPRL TC Main Conference

ADPRL: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning

Year   Location    Attendees   Submitted   Accepted   Oral   Poster
06     -
07     Hawaii      -           65          49 (75%)
08
09     Nashville   -           40          33 (83%)
10
11     Paris       -           63          47 (75%)
12
13     Singapore   -           39          28 (72%)   24     4
14     Orlando     -           -           -
15

ADPRL TC Task Forces

TF 1: Applications of ADP and RL
Chair: Draguna Vrabie
Vice-Chair(s): Zhong-Ping Jiang
Members: Warren Powell, Sean Meyn, John Valasek, Derong Liu, Jay H. Lee, Frank Lewis, Jagannathan Sarangapani, G K Venayagamoorthy, Warren Dixon

TF 2: Reinforcement Learning and Function Approximation
Chair: Robert Babuska
Vice-Chair(s): Lucian Busoniu
Members: Robert Babuska, Damien Ernst, Lucian Busoniu, Philippe Preux

New Vice-Chair of TF 2: Lucian Busoniu, University of Lorraine, France

TF 3: Robot Reinforcement Learning
Chair: Evangelos Theodorou
Vice-Chair(s): Stefan Schaal
Members: Leslie P. Kaelbling, Robert Babuska, Jens Kober, Jun Morimoto, Martin Riedmiller, Nick Roy, Jennie Si, Russ Tedrake, Emo Todorov, Nikos Vlassis

TF 4: Evolutionary Algorithms for ADPRL
Chair: Hisashi Handa
Vice-Chair(s): Kazuhiro Ohkura
Members: Yoshiaki Katada, Matteo Gagliolo, Kazuaki Yamada, Kazuhiro Ohkura, Hisashi Handa

New Chair of TF 3: Evangelos Theodorou, University of Southern California, USA


TF 5: ADPRL in Real-time Feedback Control Systems
Chair: Xin Xu
Vice-Chair(s): Haibo He
Members: Wen Yu, Yanhong Luo, Dongbin Zhao, Lucian Busoniu, Haibo He, Xin Xu

TF 6: ADP in Game Theory and Multi-Agent Optimization
Chair: Kyriakos G. Vamvoudakis
Vice-Chair(s): Travis Dierks
Members: Luis Rodolfo Garcia Carrillo, Marcio Fantini Miranda, Qinglai Wei

New Task Forces:

TF 6 is a new task force in 2013.


Activities at SSCI 2013

Symposium:

1. “Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)” (Chairs: Marco Wiering, Huaguang Zhang, Jagannathan Sarangapani)

2. “Computational Intelligence Applications in Smart Grid (CIASG)” (Chairs: Ganesh Kumar Venayagamoorthy, Haibo He)

Keynotes:

1. Keynote on “General-purpose RLADP: Solving the scaling problem” for ADPRL (Speaker: Paul Werbos)

2. Keynote on “Intelligent adaptive optimal control: algorithms and stability” (Speaker: Huaguang Zhang)

Activities at SSCI 2013

Special Sessions:

1. “Evolutionary Algorithms for ADPRL” at ADPRL 2013 (Organizers: Hisashi Handa and Kazuhiro Ohkura)

2. “Online Planning” at ADPRL 2013 (Organizers: Lucian Busoniu and Rémi Munos)

3. “ADP and RL in real-time feedback systems” at ADPRL 2013 (Organizers: Xin Xu and Haibo He)

4. “Finite-Approximate-Error Based Adaptive Dynamic Programming: Algorithms and Applications” at ADPRL 2013 (Organizers: Yanhong Luo, Qinglai Wei, and Zengguang Hou)

Planned Activities in 2014

Symposium:

1. “2014 Adaptive Dynamic Programming and Reinforcement Learning (ADPRL2014)”

Special Sessions:

1. “Solving Games, with ADP” at WCCI 2014 (Organizers: Kyriakos G. Vamvoudakis and Travis Dierks)

2. “ADP algorithm for the control of multidimensional systems” at WCCI 2014 (Organizers: Huaguang Zhang and Yanhong Luo)

3. “Adaptive Dynamic Programming and Its Applications in Time-Delayed Systems” at ADPRL 2014 (Organizers: Qinglai Wei, Ding Wang, and Dong-bin Zhao)

Tutorial:

1. “Extreme Learning Machine in Neural Computing and Applications” at WCCI 2014 (Organizer: Guang-bin Huang)

Activities at CIS-Related Journals (1)

Editorial Service

• Derong Liu: Editor-in-Chief, IEEE Transactions on Neural Networks and Learning Systems.

• G K Venayagamoorthy: Associate Editor, IEEE Transactions on Smart Grid.

• Marco Wiering: Associate Editor, IEEE Trans. on Neural Networks and Learning Systems.

• Huaguang Zhang: Associate Editor, IEEE Transactions on Fuzzy Systems.

• Draguna Vrabie: Associate Editor, IEEE Transactions on Neural Networks and Learning Systems.

Activities at CIS-Related Journals (2)

• Huaguang Zhang: Associate Editor, IEEE Transactions on Neural Networks and Learning Systems.

• Haibo He: Associate Editor, IEEE Trans. on Neural Networks and Learning Systems.

• W. B. Powell: Associate Editor, Operations Research.

• Haibo He: Associate Editor, IEEE Transactions on Smart Grid.

• Xin Xu: Editor-in-Chief, Journal of Intelligent Learning Systems and Applications.

Other Major Activities for CIS (1)

Special Issues for CIS-Related Journals:

1. Special issue on Optimization Models and Algorithms for the Smart Grid, IEEE Transactions on Smart Grid, 2013.

2. Special issue on “Data-based control, optimization, modeling and applications”, Neural Computing and Applications, 2013. (Organizers: Dongbin Zhao, Yi Shen, Zhanshan Wang, and Xiaolin Hu)

3. Special issue on Learning Issues in Feedback Control of Uncertain Dynamical Systems, International Journal of Adaptive Control and Signal Processing, 2013. (Organizers: Xin Xu, Frank Lewis)

4. Special issue on Computational Intelligence in Smart Grid, IEEE Trans. on Smart Grid, 2013.

Other Major Activities for CIS (2)

Major Activities for Other CIS-Sponsored Conferences/Symposia

1. Derong Liu, 6th Int. Conf. on Brain Inspired Cognitive Systems (BICS 2013), Beijing, China, June 9-11, 2013, General Chair.

2. Huaguang Zhang, 4th Int. Conf. on Intelligent Control and Information Processing (ICICIP 2013), Beijing, China, June 9-11, 2013, General Chair.

3. Derong Liu, IEEE World Congress on Computational Intelligence, July 6-11, 2014, Beijing, General Chair.

4. Haibo He, 2014 IEEE Symposium Series on Computational Intelligence, Dec 9-12, 2014, Orlando, General Chair.

Other Major Activities (1)

Book Publications:

1. Huaguang Zhang, Derong Liu, Yanhong Luo, Ding Wang, Adaptive Dynamic Programming for Control: Algorithms and Stability. Springer Verlag, 2013.

2. F. L. Lewis and D. Liu (eds.), Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. New York: Wiley, 2012.

3. Ana Madureira, Cecilia Reis, Viriato Marques (eds.), Computational Intelligence and Decision Making–Trends and applications. Springer Verlag, 2012.

4. W. B. Powell, I. O. Ryzhov, Optimal Learning, John Wiley and Sons, New York, 2012.

5. D. Vrabie, K. G. Vamvoudakis, and F. L. Lewis, Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles, IET Press, 2012.

6. M. A. Wiering and M. van Otterlo (eds.), Reinforcement Learning: State-of-the-Art, Springer, 2012.

7. K. G. Vamvoudakis, F. L. Lewis, Shuzhi Sam Ge, “Neural Networks in Feedback Control Systems,” in Mechanical Engineers’ Handbook, Instrumentation, Systems, Controls, and MEMS, ed. Myer Kutz, John Wiley, NY, 2012.


8. K. G. Vamvoudakis and F. L. Lewis, “Online Adaptive Learning Solution of Multi-Agent Differential Graphical Games,” in Frontiers in Advanced Control Systems, ed. Ginalber Luiz Serra, Chapter 2, INTECH, 2012.

9. Yanhong Luo, Huaguang Zhang, Adaptive Optimal Control for Complex Nonlinear Systems, Science Press, Beijing, June 2013. (in Chinese)

10. Zhanshan Wang, Stability Analysis of Recurrent Neural Network and Its Applications, Science Press, Beijing, 2013. (in Chinese)


Workshops:

1. NSF workshop on May 31/June 1, 2012: “A conversation between Artificial Intelligence and operations research on stochastic optimization”, which addressed modeling and algorithmic issues in approximate dynamic programming (Warren Powell).

2. Workshop at IEEE Conference on Decision and Control, Dec 2012: “Optimization Based Control”, which included presentations related to ADP and applications (Draguna Vrabie).

3. Workshop at 24th Chinese Control and Decision Conference, May 2012: “Industry Process Control and Optimization”, which included presentations related to adaptive dynamic programming theory and applications (Huaguang Zhang).

4. Organizing an entire track on “computational stochastic optimization”, including talks specifically on approximate dynamic programming, both for the annual INFORMS meeting and for the workshop sponsored by the INFORMS Computing Society (Warren Powell).

5. Workshop on Exploration vs. Exploitation, Edinburgh, Scotland, “The Knowledge Gradient for Optimal Learning,” ICML 2012 (Warren Powell).


Major Activities for Other Conferences

• Frank Lewis: Keynote Lecture on “Optimal Design for Cooperative Control Synchronization and Games on Communication Graphs” at Brain Inspired Cognitive Systems (BICS 2013), Beijing, China, June 9-11, 2013.

• Jagannathan Sarangapani: Invited Lecture “Optimal Adaptive Control of Uncertain Nonlinear Dynamic Systems” at the 25th Chinese Control and Decision Conference, Guiyang, China, May 25-27, 2013.

• Dongbin Zhao: Organizer of the Special Session “Data-based control and optimization for nonlinear systems” at the 32nd Chinese Control Conference (CCC 2013), Xi’an, China, July 26-28, 2013.


• Huaguang Zhang: PC member of 20th International Conference on Neural Information Processing (ICONIP2013), Daegu, Korea, November 3-7, 2013.

• Haibo He: Invited talk at the 19th International Conference on Neural Information Processing (ICONIP'12), Doha, Qatar, November 14, 2012.

• Warren Powell: Advanced Tutorial: “Unifying the Jungle of Stochastic Optimization,” Conference Principles and Practices of Constraint Propagation, Quebec City, Oct 12, 2012.

• Guang-Bin Huang: Symposium Chair, International Symposium on Extreme Learning Machine (ELM2012), Singapore, Dec 11-13, 2012.


Society and Conference Service

• Huaguang Zhang: Chair of IEEE CIS Shenyang chapter

• Ana Maria Madureira: Elected vice-chair of IEEE Portuguese section

• Ana Maria Madureira: Elected vice-chair of IEEE CIS Portuguese chapter

• Huaguang Zhang: AdCom Member of Chinese Association for Artificial Intelligence

• Warren Powell: Member of American Association for Artificial Intelligence.

• Warren Powell: Member of Math Programming Society and American Mathematical Society.


Discussions in 2013 TC Meeting

We discussed the following issues at the TC Meeting:

1. How to increase the number of submissions to ADPRL Symposium 2014/WCCI 2014 and motivate authors to submit their papers before the original deadline.

2. How to motivate highly qualified specialists to help review the conference papers.

3. How to avoid possible plagiarism.

4. How to encourage the TC members to propose new Task Forces and keep the webpage of each Task Force up to date.

5. How to cross the boundaries between different communities, such as the ADP community, the reinforcement learning community, the stochastic optimal control community, and so on.

6. How to extend the applications of ADPRL algorithms to more complex industrial processes.

ADPRLTC Chair’s Plan in 2013

1. Activate the task force “Robot Reinforcement Learning” in 2013.

2. Increase the number of members from Oceania and Africa in 2013.

3. Encourage more keynotes/workshops/tutorials on ADPRL at related conferences, such as CDC/ACC/ECC/ICSP/IJCNN, to publicize this field.

4. Consider a special issue on ADPRL in CIS-sponsored journals, such as IEEE Computational Intelligence Magazine, IEEE TNNLS, etc.

5. Extend the applications of ADPRL algorithms to more complex industrial processes.

6. Organize some summer schools in Asia/Europe.

7. Encourage membership upgrades (e.g., Senior Member, Fellow) and award nominations.

Task Force Report

Applications of ADP and RL

Chair: Draguna Vrabie
Vice-Chair(s): Zhong-Ping Jiang
Members: Warren Powell, Sean Meyn, John Valasek, Derong Liu, Jay H. Lee, Frank Lewis, Jagannathan Sarangapani, G K Venayagamoorthy, Warren Dixon

Activities in 2013:

1. Keynote Lecture on “Optimal Design for Cooperative Control Synchronization and Games on Communication Graphs” at Brain Inspired Cognitive Systems (BICS 2013), Beijing, China, June 9-11, 2013.

2. Invited Lecture “Optimal Adaptive Control of Uncertain Nonlinear Dynamic Systems” at the 25th Chinese Control and Decision Conference, Guiyang, China, May 25-27, 2013.

3. Workshop at IEEE CDC 2012: “Optimization Based Control”, which included presentations related to ADP and applications, December 10-13, 2012.

Planned Activities in 2014:

1. Special issue “Reinforcement Learning and Adaptive Dynamic Programming” in Acta Automatica Sinica, 2014.

Task Force Report

Reinforcement Learning and Function Approximation

Chair: Robert Babuska
Vice-Chair(s): Lucian Busoniu
Members: Robert Babuska, Damien Ernst, Lucian Busoniu, Philippe Preux

Activities in 2012/2013:

1. Book chapter: L. Busoniu, R. Munos, R. Babuska, Optimistic planning in Markov decision processes. In: Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, F. Lewis, D. Liu (eds.), Wiley, 2012.

2. Special Session at SSCI 2013: “Online Planning”, Organizers: L. Busoniu, R. Munos.

Planned Activities in 2014:

1. Lucian Busoniu is to co-organize the ADPRL 2014 symposium as a co-chair.

Task Force Report

Robot Reinforcement Learning

Chair: Evangelos Theodorou
Vice-Chair(s): Stefan Schaal
Members: Leslie P. Kaelbling, Robert Babuska, Jens Kober, Jun Morimoto, Martin Riedmiller, Nick Roy, Jennie Si, Russ Tedrake, Emo Todorov, Nikos Vlassis

Activities in 2012/2013:

1. Invited Lecture at WCCI2012: “Uncovering the Neural Code of Learning Control”

2. Panel at WCCI2012: “Computational Intelligence in Education and University Curricula”

Planned Activities in 2014: Unsure still for 2014.

Task Force Report

Evolutionary Algorithms for ADPRL

Chair: Hisashi Handa
Vice-Chair(s): Kazuhiro Ohkura
Members: Yoshiaki Katada, Matteo Gagliolo, Kazuaki Yamada, Kazuhiro Ohkura, Hisashi Handa

Activities in 2012/2013:

1. Special Session at WCCI 2012: “Real World Applications of Reinforcement Learning”.

2. Special Session on “Evolutionary Algorithms for ADPRL” at ADPRL 2013.

Planned Activities in 2014: Unsure still for 2014.

Task Force Report

ADPRL in Real-time Feedback Control Systems

Chair: Xin Xu
Vice-Chair(s): Haibo He
Members: Wen Yu, Yanhong Luo, Dongbin Zhao, Lucian Busoniu, Haibo He, Xin Xu

Activities in 2013:

1. Special issue “Optimization Models and Algorithms for the Smart Grid” in IEEE Transactions on Smart Grid, 2013.

2. A special issue on Learning Issues in Feedback Control of Uncertain Dynamical Systems is in press at the International Journal of Adaptive Control and Signal Processing, 2013.

3. Special Session on “ADP and RL in real-time feedback systems” at ADPRL 2013.

Planned Activities in 2014:

1. Tutorial on “Reinforcement learning for real-time feedback control systems” at WCCI2014.

Task Force Report

ADP in Game Theory and Multi-Agent Optimization

Chair: Kyriakos G. Vamvoudakis
Vice-Chair(s): Travis Dierks
Members: Luis Rodolfo Garcia Carrillo, Marcio Fantini Miranda, Qinglai Wei

Activities in 2013:

1. Special session on “Games, ADP and Network Security” for IEEE CDC 2013.

2. K. G. Vamvoudakis, F. L. Lewis, Shuzhi Sam Ge, “Neural Networks in Feedback Control Systems,” to appear in Mechanical Engineers’ Handbook, Instrumentation, Systems, Controls, and MEMS, ed. Myer Kutz, John Wiley, NY, 2013.

Planned Activities in 2014:

1. Special session on “Solving Games, with ADP” for WCCI 2014.
