An Application of Reinforcement Learning to Aerobatic Helicopter
Greg McChesney
Texas Tech University
Greg.mcchesney@ttu.edu
Apr 08, 2009
CS5331: Autonomous Mobile Robots


Overview

Creating a robot that can fly autonomously
Software developed at Stanford, in its AI lab
This paper is slightly outdated, as many new maneuvers have been created since


Learning Approach

Apprenticeship
  Collect data from a human pilot attempting the maneuver (multiple times)
  Learn a model from the data
  Find a controller that works in simulation, based on the model
  Test on the helicopter (pray it doesn't crash)
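
The "learn a model from the data" step can be sketched as a least-squares fit. This is only an illustration: it assumes a scalar linear model x[t+1] = a*x[t] + b*u[t], and the name `fit_linear_model` is hypothetical. The helicopter model in the paper is far richer than this.

```python
def fit_linear_model(states, inputs):
    """Least-squares fit of x[t+1] = a*x[t] + b*u[t] from one logged flight.

    states has one more entry than inputs: states[t+1] follows inputs[t].
    """
    if len(states) != len(inputs) + 1:
        raise ValueError("need exactly one more state than inputs")
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, y in zip(states[:-1], inputs, states[1:]):
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * y
        suy += u * y
    # Solve the 2x2 normal equations [[sxx, sxu], [sxu, suu]] [a, b]^T = [sxy, suy]^T
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        raise ValueError("data does not excite the system enough to fit a and b")
    a = (sxy * suu - sxu * suy) / det
    b = (sxx * suy - sxu * sxy) / det
    return a, b
```

Repeating the maneuver several times, as the slide says, gives the fit more varied data; a single monotonous flight can leave the normal equations nearly singular.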


Helicopter State

Position
Velocity
Angular velocity
Controlled with 4 dimensions
  Cyclic pitch
  Tail rotor
Take gravity out when calculating the model


Controller Design

Use a Markov decision process
Sextuple (S, A, T, H, s(0), R)
  S - set of states
  A - set of actions (inputs)
  T - dynamics model: set of probability distributions for the next state
  H - horizon, or number of time steps of interest
  s(0) - initial state
  R - reward function
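
The sextuple maps directly onto a small data structure. The sketch below is a toy example under stated assumptions: the `MDP` class, `make_line_mdp`, and `rollout` are hypothetical names, and the five-cell "line world" stands in for the helicopter's continuous state space.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class MDP:
    states: Sequence[int]                                 # S
    actions: Sequence[int]                                # A
    transition: Callable[[int, int, random.Random], int]  # T, sampled
    horizon: int                                          # H
    initial_state: int                                    # s(0)
    reward: Callable[[int, int], float]                   # R


def make_line_mdp(slip: float = 0.2, goal: int = 4, horizon: int = 6) -> MDP:
    """Toy MDP: walk along cells 0..goal; each move fails with probability slip."""
    def transition(s, a, rng):
        if rng.random() < slip:
            return s                      # the move failed, stay put
        return min(max(s + a, 0), goal)   # clip to the ends of the line

    def reward(s, a):
        return -abs(goal - s)             # closer to the goal is better

    return MDP(tuple(range(goal + 1)), (-1, 1), transition, horizon, 0, reward)


def rollout(mdp: MDP, policy, rng: random.Random) -> float:
    """Run policy(s, t) for H steps from s(0) and return the summed reward."""
    s = mdp.initial_state
    total = 0.0
    for t in range(mdp.horizon):
        a = policy(s, t)
        total += mdp.reward(s, a)
        s = mdp.transition(s, a, rng)
    return total
```

A controller in this setting is just the `policy` argument; for example, `rollout(make_line_mdp(), lambda s, t: 1, random.Random(0))` scores the policy that always moves right.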


Differential Dynamic Programming (DDP)

Compute the linear approximation
Compute the optimal solution to the linear quadratic regulator
Must take into account error state
Cost for change in input - needed in real testing
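
The linear quadratic regulator step can be illustrated with a backward Riccati recursion. This is a minimal sketch, assuming a scalar, time-invariant system x[t+1] = a*x[t] + b*u[t] with cost q*x^2 + r*u^2 per step; the function names are hypothetical. Full DDP would re-linearize around the current trajectory and repeat this solve, and the sketch penalizes the input itself rather than the change in input mentioned on the slide.

```python
def lqr_gains(a, b, q, r, horizon):
    """Finite-horizon LQR gains for x[t+1] = a*x[t] + b*u[t].

    Minimizes the sum of q*x^2 + r*u^2 with terminal cost q*x^2.
    Returns gains k[0..horizon-1]; the control law is u[t] = -k[t] * x[t].
    """
    p = q                      # cost-to-go weight at the final time step
    gains = []
    for _ in range(horizon):   # sweep backward from the end of the horizon
        k = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * k
        gains.append(k)
    gains.reverse()            # gains[t] now applies at time t
    return gains


def simulate(a, b, gains, x0):
    """Apply u[t] = -gains[t] * x[t] and return the state trajectory."""
    x = x0
    trajectory = [x]
    for k in gains:
        x = a * x + b * (-k * x)
        trajectory.append(x)
    return trajectory
```

With a = 1.2 the open-loop system is unstable, yet `simulate(1.2, 1.0, lqr_gains(1.2, 1.0, 1.0, 1.0, 30), 1.0)` drives the state toward zero, which is the "error state" view: x measures deviation from the target trajectory.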


DDP-Continued

2 phases
  DDP to find the open-loop input sequence
  Use DDP again, refining the inputs as a deviation from the nominal open-loop input sequence
Integral control - take into account wind and errors in the model
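
Why an integral term helps against wind can be seen in a toy simulation. This is a sketch under assumptions, not the paper's controller: it uses a scalar system driven by a constant disturbance `wind`, a plain proportional-integral feedback law, and hypothetical gains `kp` and `ki`.

```python
def final_error(kp, ki, wind=1.0, dt=0.05, steps=400):
    """Simulate dx/dt = u + wind with u = -kp*x - ki*z and return the final error.

    x is the position error; z is the running integral of x.
    """
    x = 1.0   # initial position error
    z = 0.0   # integrated error
    for _ in range(steps):
        u = -kp * x - ki * z
        x_next = x + dt * (u + wind)   # Euler step of the dynamics
        z += dt * x                    # accumulate the error seen at this step
        x = x_next
    return x
```

With proportional feedback only (`ki=0`), the error settles at wind/kp instead of zero; adding the integral term lets the controller build up a constant counter-input that cancels the steady disturbance.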


Rewards

24 features
Used inverse reinforcement learning
Rewards from inverse reinforcement learning usually did not produce the correct result
Took the inverse reinforcement learning results and manually tuned them to get good results
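
A reward built from features is typically a weighted sum, which is what makes manual tuning practical. The sketch below assumes that form; the three features and all weight values are hypothetical stand-ins, since the slide does not list the 24 features.

```python
def features(error, u, u_prev):
    """Hypothetical features: squared error, squared input, squared input change."""
    return [error ** 2, u ** 2, (u - u_prev) ** 2]


def reward(weights, feats):
    """Linear reward: the weighted sum of the feature values."""
    if len(weights) != len(feats):
        raise ValueError("one weight per feature is required")
    return sum(w * f for w, f in zip(weights, feats))


# Stand-in for weights returned by inverse reinforcement learning.
learned = [-1.0, -0.1, -0.5]

# Manual tuning: penalize abrupt input changes four times as heavily.
tuned = list(learned)
tuned[2] *= 4.0
```

Because each weight scales one interpretable quantity, a poor flight can be traced to a feature that is under- or over-penalized, and that single weight adjusted by hand.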


Helicopter

Xcell Tempest
  54” long
  19” high
  13 lbs
  Two-stroke engine
Orientation sensors
GPS - doesn’t work during flips


Flip


Roll


Tail-In Funnel


Nose-In Funnel


Questions

Motivations/Who pays for it
  I can see applications in the defense sector
  DARPA
Could more maneuvers be done just by changing some parameters?
  Probably not, because the filter is learned based on a model, so you would need to create a new model


More Questions

What's the relationship between reinforcement learning and MDP?
  Not sure
Could a helicopter like this operate in the West Texas wind storms?


Fun Stuff

Videos:
  http://heli.stanford.edu/
  http://www.youtube.com/watch?v=VCdxqn0fcnE
Helicopter
  http://www.miniatureaircraftusa.com/helicopterkits/1025_Spectra_G/1025_kit_main.asp
