Design of Attitude and Path Tracking Controllers for Quad-Rotor Robots using Reinforcement Learning
Sérgio Ronaldo Barros dos Santos, Cairo Lúcio Nascimento Júnior, Instituto Tecnológico de Aeronáutica (ITA), Brazil
Sidney Nascimento Givigi Júnior, Royal Military College of Canada (RMCC), Canada

Dec 18, 2015


Blaze Dean


Introduction

• Quad-rotor robots have attracted the attention of many researchers in the past few years.

• Examples of applications:

– Military applications: surveillance, border patrolling, crowd control.

– Civilian applications: rescue missions during floods and earthquakes, monitoring pipelines and electric transmission lines.


Introduction

A quad-rotor consists of four independent propellers attached to the corners of a cross-shaped frame, with one pair of propellers turning clockwise and the other pair turning counter-clockwise.


Quad-Rotor Dynamics

All rotational and translational movements of a quad-rotor can be achieved by adjusting its rotor speeds.
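How rotor speeds translate into motion can be sketched with a simple mixing rule. This is a minimal sketch for a plus-shaped layout, assuming that the thrust and drag torque of each rotor are proportional to the square of its speed. The coefficients `b` and `d`, the arm length `l`, the rotor numbering and the function name are illustrative assumptions, not values from the slides.

```python
def rotor_speeds_to_inputs(omega, b=3.0e-5, d=7.5e-7, l=0.2):
    """Map four rotor speeds (rad/s) to total thrust and body torques.

    Assumed layout (hypothetical): rotors 1 (front) and 3 (rear) spin one way,
    rotors 2 (right) and 4 (left) spin the other way.
    """
    w1, w2, w3, w4 = (w * w for w in omega)
    thrust = b * (w1 + w2 + w3 + w4)        # equal change on all rotors: climb or descend
    roll_torque = l * b * (w4 - w2)         # left/right speed difference
    pitch_torque = l * b * (w3 - w1)        # rear/front speed difference
    yaw_torque = d * (w2 + w4 - w1 - w3)    # imbalance between the two spin directions
    return thrust, roll_torque, pitch_torque, yaw_torque


# Equal speeds give pure thrust; speeding up rotor 4 and slowing rotor 2 adds roll torque.
hover = rotor_speeds_to_inputs([400.0, 400.0, 400.0, 400.0])
roll = rotor_speeds_to_inputs([400.0, 390.0, 400.0, 410.0])
```

With equal speeds all three torques cancel, which is why opposite spin directions are needed to hold the heading while hovering.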


Introduction

• Quad-rotor robots are affected by a number of physical effects such as:

– Aerodynamic effects,

– Gravity effect,

– Ground effect,

– Gyroscopic effect,

– Friction.

• Due to these nonlinear effects, it is difficult to design good controllers for a quad-rotor.


Introduction

• Typically quad-rotor applications use controllers derived using linearized models.

• These controllers exhibit poor performance for fast maneuvers or in the presence of disturbances such as wind and the ground effect.

• In order to perform path tracking in the presence of nonlinear disturbances, a machine learning technique, Reinforcement Learning with Learning Automata (RL-LA), is applied.


Objectives

• To present a solution for testing and evaluation of attitude stabilization and path tracking controllers for quad-rotors.

• To use a Reinforcement Learning algorithm (Learning Automata) to adjust the controller parameters in a simulation environment that includes wind and ground effects.


Quad-Rotor Dynamics

• Two frames are used: an inertial frame and a body-fixed frame whose origin is at the center of mass of the quad-rotor.


Quad-Rotor Dynamics

• The dynamic model is derived under the following assumptions:

– the vehicle frame is rigid and symmetrical,

– the body fixed frame is located at the vehicle center of mass,

– the propellers are also rigid.


Quad-Rotor Dynamics

• The dynamic model of the quad-rotor can be derived using the Newton-Euler formalism.
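For reference, the textbook Newton-Euler equations of a rigid body driven by a total thrust and three body torques can be written as below. This is a generic sketch under the rigid, symmetric-frame assumptions, with an inertial z-axis pointing up. The symbols (m, I, ξ for inertial position, ω for body angular velocity, R for the body-to-inertial rotation, T for thrust, τ for torques) are chosen here for illustration, and the model used in the slides may include further terms such as rotor gyroscopic effects, drag, wind and ground effect.

```latex
\begin{aligned}
m\,\ddot{\boldsymbol{\xi}} &= R(\phi,\theta,\psi)\begin{bmatrix}0\\ 0\\ T\end{bmatrix} - \begin{bmatrix}0\\ 0\\ m g\end{bmatrix},\\
I\,\dot{\boldsymbol{\omega}} &= -\,\boldsymbol{\omega}\times I\,\boldsymbol{\omega} + \boldsymbol{\tau}.
\end{aligned}
```

The first line is Newton's law for the translation of the center of mass; the second is Euler's equation for the rotation expressed in the body-fixed frame.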


Robot Controllers

• The control architecture for the robot involves two loops: inner and outer. The roll, pitch, and yaw angles are represented by Φ, θ and ψ, respectively.
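A two-loop architecture of this kind can be sketched as a cascade. The sketch assumes, as is common for quad-rotors, that the outer loop handles path tracking and produces roll and pitch references, and that the inner loop handles attitude and produces torque commands. The plain PD laws, the small-angle inversion near hover, the sign conventions, the gains and the function names are all illustrative assumptions rather than the controllers used in the slides.

```python
import math


def clamp(value, limit):
    """Limit a value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def outer_loop(pos, vel, pos_ref, yaw, kp=0.5, kd=0.8, g=9.81, max_tilt=0.35):
    """Path tracking loop: x/y position error -> desired roll and pitch (rad)."""
    # Desired horizontal accelerations in the inertial frame (PD law).
    ax = kp * (pos_ref[0] - pos[0]) - kd * vel[0]
    ay = kp * (pos_ref[1] - pos[1]) - kd * vel[1]
    # Small-angle inversion of the translational dynamics near hover.
    pitch_ref = (ax * math.cos(yaw) + ay * math.sin(yaw)) / g
    roll_ref = (ax * math.sin(yaw) - ay * math.cos(yaw)) / g
    return clamp(roll_ref, max_tilt), clamp(pitch_ref, max_tilt)


def inner_loop(angles, rates, angles_ref, kp=4.0, kd=1.2):
    """Attitude loop: (roll, pitch, yaw) errors -> three torque commands."""
    return tuple(kp * (r - a) - kd * w for a, w, r in zip(angles, rates, angles_ref))


# One control step: a 1 m error along x asks for a small pitch angle.
roll_ref, pitch_ref = outer_loop(pos=(0.0, 0.0), vel=(0.0, 0.0), pos_ref=(1.0, 0.0), yaw=0.0)
torques = inner_loop(angles=(0.0, 0.0, 0.0), rates=(0.0, 0.0, 0.0),
                     angles_ref=(roll_ref, pitch_ref, 0.0))
```

The outer loop never commands the rotors directly: it only sets angle references, so the inner loop has to respond faster than the outer one for the cascade to work.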


Robot Controllers

• Three nonlinear control strategies are used:

– Nonlinear PID control,

– Backstepping technique,

– Sliding mode control.


Robot Controllers

The parameters of the 6 controllers are tuned using the RL algorithm.

Technique      Path Tracking                 Attitude                                    Height
               x-position    y-position      Pitch         Roll          Yaw
PID            kp, ki, kd    kp, ki, kd      kp, ki, kd    kp, ki, kd    kp, ki, kd      kp, ki, kd
Backstepping   α12, α11      α10, α9         α4, α3        α1, α2        α5, α6          α7, α8
Sliding Mode   k5, λ5        k4, λ4          k2, λ2        k1, λ1        k3, λ3          k6, λ6
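To make the role of the two-parameter entries concrete, the sketch below applies a backstepping law (two gains, here `alpha1` and `alpha2`) and a sliding mode law (a gain `k` and a surface slope `lam`) to a single axis modelled as a double integrator. The plant model, the boundary layer used to soften the switching term, the gain values and the function names are assumptions made for illustration; the slides do not give the actual control laws.

```python
def backstepping_control(x, x_dot, ref, ref_dot, ref_ddot, alpha1, alpha2):
    """Backstepping law for the assumed plant x_ddot = u."""
    e1 = ref - x                           # tracking error
    virtual = ref_dot + alpha1 * e1        # desired velocity (virtual control)
    e2 = virtual - x_dot                   # velocity error
    e1_dot = e2 - alpha1 * e1
    return ref_ddot + alpha1 * e1_dot + e1 + alpha2 * e2


def sliding_mode_control(x, x_dot, ref, ref_dot, ref_ddot, k, lam, boundary=0.05):
    """Sliding mode law for the assumed plant x_ddot = u."""
    e = ref - x
    e_dot = ref_dot - x_dot
    s = e_dot + lam * e                            # sliding surface
    sat = max(-1.0, min(1.0, s / boundary))        # saturated switching term
    return ref_ddot + lam * e_dot + k * sat


def simulate(control, steps=4000, dt=0.005, **gains):
    """Drive the double integrator from rest at 0 towards a constant reference of 1."""
    x, x_dot = 0.0, 0.0
    for _ in range(steps):
        u = control(x, x_dot, 1.0, 0.0, 0.0, **gains)
        x_dot += u * dt
        x += x_dot * dt
    return x


final_backstepping = simulate(backstepping_control, alpha1=2.0, alpha2=2.0)
final_sliding = simulate(sliding_mode_control, k=2.0, lam=1.5)
```

In both laws the two gains set how fast the error decays, which is what makes them natural targets for automatic tuning.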


Simulation Environment

• A simulation setup is proposed to train and evaluate the quad-rotor controller under more realistic conditions.


Simulation Environment

• Using Plane Maker, an X-Plane model of the X3D-BL quad-rotor (manufactured by Ascending Technologies) was created.


Simulation Environment

• The responses of the X-Plane and SIMULINK models are compared for a hovering maneuver.


Reinforcement Learning

• Learning Automata (LA) is an alternative approach that can be used to adjust the parameters of the controllers.


Reinforcement Learning

• Steps of the learning process:

1. Initialize the probability and parameter vectors of each controller;

2. Select the parameters for each controller using its associated probability vector;

3. Execute the desired task, obtain its response and use a cost function to measure its performance;

4. Compute the reinforcement signal;

5. Adjust the probability vectors;

6. Check the probability vectors for convergence; if they have not converged, return to step 2.
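The six steps can be sketched as a loop for a single parameter chosen from a discrete set of candidates. This is a minimal sketch, not the authors' algorithm: the reinforcement signal (scaled between the best and worst cost seen so far), the additive update followed by normalization, the step size `theta`, the convergence threshold and the toy cost standing in for a simulated flight are all assumptions.

```python
import random


def tune_with_learning_automaton(candidates, run_trial, theta=0.1,
                                 max_trials=5000, threshold=0.95, seed=0):
    """Pick one controller parameter from `candidates` by trial and error."""
    rng = random.Random(seed)
    n = len(candidates)
    probs = [1.0 / n] * n                  # step 1: uniform probability vector
    best_cost, worst_cost = float("inf"), float("-inf")
    for _ in range(max_trials):
        j = rng.choices(range(n), weights=probs)[0]   # step 2: sample a parameter
        cost = run_trial(candidates[j])               # step 3: run the task, get its cost
        best_cost = min(best_cost, cost)
        worst_cost = max(worst_cost, cost)
        # Step 4 (assumed form): 1 for the best cost seen so far, 0 for the worst.
        spread = worst_cost - best_cost
        reward = (worst_cost - cost) / spread if spread > 0 else 0.0
        probs[j] += theta * reward                    # step 5: reinforce, then normalize
        total = sum(probs)
        probs = [p / total for p in probs]
        if max(probs) >= threshold:                   # step 6: convergence check
            break
    return candidates[probs.index(max(probs))], probs


# Hypothetical stand-in for a simulated flight: the cost is lowest for a gain of 2.0.
gains = [0.5, 1.0, 1.5, 2.0, 2.5]
best_gain, probabilities = tune_with_learning_automaton(gains, lambda g: (g - 2.0) ** 2)
```

Tuning several controllers at once would keep one probability vector per parameter and sample all of them before each trial.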


Reinforcement Learning

• Supervisory level: LA adjusts the parameters of the attitude and path tracking controllers.


Reinforcement Learning

• The parameters of the controllers were learned using the X-Plane model in 3 stages of increasing difficulty:

– without the presence of any external disturbances,

– considering only the presence of wind,

– considering the wind and ground effects.


Reinforcement Learning

• A cost function evaluates the response of each controller (i) for the selected task at the end of each trial (k):

$$J_k^i = G_e \int_0^T e^2(t)\,dt + G_M M_p + G_s E_{ss}$$
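A cost of this kind can be evaluated numerically from a sampled response. The sketch below assumes that the cost combines the integral of the squared tracking error, the maximum overshoot and the steady-state error, each with its own weight. That reading of the terms, the weight values, the trapezoidal integration and the function name are assumptions.

```python
def response_cost(times, response, reference, g_e=1.0, g_m=1.0, g_s=1.0):
    """Weighted cost of a sampled response to a constant positive reference."""
    errors = [reference - y for y in response]
    # Trapezoidal approximation of the integral of e(t)^2 over the trial.
    ise = sum(0.5 * (errors[i] ** 2 + errors[i + 1] ** 2) * (times[i + 1] - times[i])
              for i in range(len(times) - 1))
    overshoot = max(0.0, max(response) - reference)   # largest excursion above the reference
    steady_state_error = abs(errors[-1])              # error at the end of the trial
    return g_e * ise + g_m * overshoot + g_s * steady_state_error


# A response that overshoots to 1.2 before settling on the reference of 1.0.
cost = response_cost(times=[0.0, 1.0, 2.0, 3.0],
                     response=[0.0, 1.2, 1.0, 1.0],
                     reference=1.0)
```

Raising one weight relative to the others shifts the tuning towards, for example, less overshoot at the price of a slower response.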


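Reinforcement Learning

The reinforcement signal is computed for each controller (i) at the end of each trial (k):

[Equation garbled in the transcript. Recoverable elements: the signal R is computed from the trial cost J and a reference cost J_med, and is limited to the interval [0, 1] with min and max operations.]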


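Reinforcement Learning

1. The element of the probability vector associated with the selected controller parameter is adjusted:

[Equation garbled in the transcript: update of the element p(j) of the probability vector of controller i from trial k to trial k+1.]

2. The probability vector is then normalized.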


Reinforcement Learning

• Learning the desired trajectory using the PID controller during the first stage.


Results

• Results obtained with the nonlinear PID controllers in simulation. The trajectory is formed by the points (0,0) - (0,10) - (10,10) - (10,0) meters.
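A reference for such a piecewise-linear trajectory can be generated by walking along the waypoints. The sketch below uses the four points listed above; the constant speed of 1 m/s, the choice to stop at the last point and the function name are assumptions, since the slides do not say how the reference was timed.

```python
import math

WAYPOINTS = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]  # meters


def reference_position(t, waypoints=WAYPOINTS, speed=1.0):
    """Position at time t (s) when moving along the waypoints at constant speed."""
    remaining = max(0.0, t) * speed        # distance still to travel along the path
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        if remaining <= length:
            f = remaining / length         # fraction of the current segment covered
            return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
        remaining -= length
    return waypoints[-1]                   # hold the final waypoint afterwards
```

Feeding `reference_position(t)` to the x-position and y-position controllers at each time step gives them a moving target instead of a distant fixed one.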


Results

• The quad-rotor robot during the execution of a pre-defined trajectory, visualized in X-Plane.


Results

• Results of the backstepping controllers in the presence of wind and ground effects.


Results

• Path tracking of the quad-rotor obtained with the backstepping controllers in the presence of wind and ground effects, visualized in X-Plane.


Results

• The sliding mode controller response in the presence of wind and ground effects.


Results

• The quad-rotor trajectory obtained with the sliding mode controllers in the presence of wind and ground effects, visualized in X-Plane.


Results

• Evaluation of how well the controllers track the desired path after the learning process.


Conclusions

• The proposed method (Learning Automata) allows one to tune the parameters of different controllers for a quad-rotor aircraft, considering external disturbances such as wind and ground effects.

• It was shown that the proposed simulation framework can be useful to investigate the application of learning algorithms to adjust the control laws of quad-rotors for different flight maneuvers.


Future Research

• Evaluate the controllers (obtained using LA, the simulated model and the simulation environment) on real quad-rotors.

• On-line learning: useful to correct inaccuracies of the simulated (model + environment).


Future Research

• Comparison to other RL methods (e.g., Q-Learning) and other search procedures (e.g., genetic algorithms).

• Limitation of learning: generalization to other tasks

Problem: selection of tasks to be executed during training (adaptive control: choice of excitation signal).


Thank You !
