Top Banner
Controlling a multi-joint arm actuated by pneumatic muscles with quasi-DDP optimal control G. Kumar Hari Shankar Lal Das 1,2 , B. Tondu 1,2 , O. Stasse 1 , P. Soueres 1 Abstract—Pneumatic actuators have inherent compliance and hence they are very interesting for applications involving inter- action with environment or human. But controlling such kind of actuators is not trivial. The paper presents an implementation of iterative Linear Quadratic regulator (iLQR) based optimal control framework to control an anthropomorphic arm with each joint actuated by an agonist-antagonistic pair of Mckibben artificial muscles. The method is applied to positioning tasks and generation of explosive movements by maximizing the link speed. It is then compared to traditional control strategies to justify that optimal control is effective in controlling the position in highly non-linear pneumatic systems. Also the importance of varying compliance is highlighted by repeating the tasks at different compliance level. The algorithm validation is reported here by several simulations and hardware experiments in which the shoulder and elbow flexion are controlled simultaneously. I. I NTRODUCTION Pneumatic actuators are inherently compliant and have very high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil- itation or even in walking robot and human robot interaction. The goal of this paper is to use an arm actuated by McKibben muscles to perform explosive tasks like hammering or kicking balls. Recent works [1] advocates other kind of actuation system which are easier to control but come with a high price tag. Our goal was to demonstrate here that with minimal mod- ification of an existing platform [2], it is possible to perform such kind of explosive motions, and not fine manipulation as in [1]. The Mckibben artificial muscle is known for its non- linearities and hence pose a great control challenge. These non-linearities are mainly due to hysteresis, saturation and internal friction between fabrics. So far these challenges were dealt with traditional controllers like high gain PID controller, sliding mode controller for position control [3]. These methods usually lead to stiff system dynamics with higher impedances. Recent developments in pneumatic actuation control [4], [5] are use Model Predictive Control (MPC). For instance in [6], the control scheme is based on a switching Piece Wise Affine (PWA) system model approximation. The method is able to capture the high non-linearities of the Pneumatic Artificial Muscle (PAM). In [4] a linear formulation of the actuation system is proposed to simplify the algorithm implementation and makes it real-time. The contribution of this paper is two fold. - First it proposes a new model which provides a good compromise between accuracy and simplicity. 1 are with CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France 2 are also with Univ de Toulouse, INSA, F-31400 Toulouse, France gkharish,btondu,ostasse,[email protected] - The second contribution is to show that, on this model an iLQR control scheme can be used to generate with good precision positioning task and explosive movements. This is achieved with the introduction of an empirical model of pressure generation from the Intensity-Pressure con- verter (I/P). Each of the seven Degrees of Freedom (DoFs) of the manipulator arm of LAAS-CNRS is actuated by a pair of agonist-antagonist Mckibben muscles. The capabilities of the non-linear iLQR control to execute some simple tasks of reaching a point, maximizing the velocity of joint and end effector are demonstrated on this platform. The paper is organised as follows: The dynamical model of joints actuated by Mckibben muscles is presented in section II. Section III describes the model of the anthropomorphic robot arm. The optimal control formulation and brief introduction to iterative Linear Quadratic Regulator(iLQR) is presented in section IV. Section V presents simulations and experiments. II. DYNAMIC MODEL OF A JOINT ACTUATED BY MCKIBBEN MUSCLES There have been several attempts to model the McK- ibben artificial muscle [4], [7], [8], [9], [10]. The difficulties for getting an accurate model are due to a combination of complex phenomena during static and dynamic contraction: shape changing at the muscle tip which lose their initial cylindrical shape for a conical one, mobility and flexibility of the braided sleeve, elasticity of the inner rubber tube, exotic friction in the textile braided sleeve without forgetting, for the pneumatic version of the McKibben muscle, dynamic fluidic phenomena resulting from the artificial muscle volume variation during contraction. When we want to include such a physical model into the closed-loop control of a McKibben muscle actuator, a compromise must be found between the complexity of an accurate model and the time required for its computation. The need for an efficient but not too complex dynamic muscle model is all the more important in the case of artificial muscles for robot arms. Indeed the preservation of the actuator compliance imposes a joint direct-drive mode and so the consideration of a dynamic robot model, linked to the actuator model. The originality of our approach, described in this section, consists in including an original model of real pressure variation inside a simple McKibben muscle dynamic model before proposing a multi-variable pressure control of the actuator made of two similar antagonist muscles. A. Muscles side dynamics The following model was considered for relating the dy- namic contraction force F of the artificial muscle to its control pressure P and contraction ratio. F (ε , P)=(π r 2 0 )P[a(1 - kε ) 2 - b] (1)
8

Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

Aug 23, 2020

Download

Documents

dariahiddleston
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

Controlling a multi-joint arm actuated by pneumatic muscles withquasi-DDP optimal control

G. Kumar Hari Shankar Lal Das1,2, B. Tondu1,2, O. Stasse1, P. Soueres1

Abstract—Pneumatic actuators have inherent compliance andhence they are very interesting for applications involving inter-action with environment or human. But controlling such kind ofactuators is not trivial. The paper presents an implementationof iterative Linear Quadratic regulator (iLQR) based optimalcontrol framework to control an anthropomorphic arm witheach joint actuated by an agonist-antagonistic pair of Mckibbenartificial muscles. The method is applied to positioning tasksand generation of explosive movements by maximizing the linkspeed. It is then compared to traditional control strategies tojustify that optimal control is effective in controlling the positionin highly non-linear pneumatic systems. Also the importanceof varying compliance is highlighted by repeating the tasks atdifferent compliance level. The algorithm validation is reportedhere by several simulations and hardware experiments in whichthe shoulder and elbow flexion are controlled simultaneously.

I. INTRODUCTION

Pneumatic actuators are inherently compliant and have veryhigh power to weight ratio. These factors motivate researchersto use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation or even in walking robot and human robot interaction.The goal of this paper is to use an arm actuated by McKibbenmuscles to perform explosive tasks like hammering or kickingballs. Recent works [1] advocates other kind of actuationsystem which are easier to control but come with a high pricetag. Our goal was to demonstrate here that with minimal mod-ification of an existing platform [2], it is possible to performsuch kind of explosive motions, and not fine manipulation asin [1]. The Mckibben artificial muscle is known for its non-linearities and hence pose a great control challenge. Thesenon-linearities are mainly due to hysteresis, saturation andinternal friction between fabrics. So far these challenges weredealt with traditional controllers like high gain PID controller,sliding mode controller for position control [3]. These methodsusually lead to stiff system dynamics with higher impedances.Recent developments in pneumatic actuation control [4], [5]are use Model Predictive Control (MPC). For instance in [6],the control scheme is based on a switching Piece Wise Affine(PWA) system model approximation. The method is able tocapture the high non-linearities of the Pneumatic ArtificialMuscle (PAM). In [4] a linear formulation of the actuationsystem is proposed to simplify the algorithm implementationand makes it real-time.

The contribution of this paper is two fold.

- First it proposes a new model which provides a goodcompromise between accuracy and simplicity.

1 are withCNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France2 are also with Univ de Toulouse, INSA, F-31400 Toulouse, Francegkharish,btondu,ostasse,[email protected]

- The second contribution is to show that, on this modelan iLQR control scheme can be used to generate with goodprecision positioning task and explosive movements.

This is achieved with the introduction of an empiricalmodel of pressure generation from the Intensity-Pressure con-verter (I/P). Each of the seven Degrees of Freedom (DoFs)of the manipulator arm of LAAS-CNRS is actuated by a pairof agonist-antagonist Mckibben muscles. The capabilities ofthe non-linear iLQR control to execute some simple tasks ofreaching a point, maximizing the velocity of joint and endeffector are demonstrated on this platform.

The paper is organised as follows: The dynamical model ofjoints actuated by Mckibben muscles is presented in section II.Section III describes the model of the anthropomorphic robotarm. The optimal control formulation and brief introductionto iterative Linear Quadratic Regulator(iLQR) is presented insection IV. Section V presents simulations and experiments.

II. DYNAMIC MODEL OF A JOINT ACTUATED BYMCKIBBEN MUSCLES

There have been several attempts to model the McK-ibben artificial muscle [4], [7], [8], [9], [10]. The difficultiesfor getting an accurate model are due to a combination ofcomplex phenomena during static and dynamic contraction:shape changing at the muscle tip which lose their initialcylindrical shape for a conical one, mobility and flexibilityof the braided sleeve, elasticity of the inner rubber tube,exotic friction in the textile braided sleeve without forgetting,for the pneumatic version of the McKibben muscle, dynamicfluidic phenomena resulting from the artificial muscle volumevariation during contraction. When we want to include sucha physical model into the closed-loop control of a McKibbenmuscle actuator, a compromise must be found between thecomplexity of an accurate model and the time required for itscomputation. The need for an efficient but not too complexdynamic muscle model is all the more important in the caseof artificial muscles for robot arms. Indeed the preservationof the actuator compliance imposes a joint direct-drive modeand so the consideration of a dynamic robot model, linked tothe actuator model. The originality of our approach, describedin this section, consists in including an original model of realpressure variation inside a simple McKibben muscle dynamicmodel before proposing a multi-variable pressure control ofthe actuator made of two similar antagonist muscles.

A. Muscles side dynamics

The following model was considered for relating the dy-namic contraction force F of the artificial muscle to its controlpressure P and contraction ratio.

F(ε,P) = (πr20)P[a(1− kε)2−b] (1)

Page 2: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

l0

r0

α0

Fig. 1. Muscle geometric parameters: l0 initial length, r0 radius and α0 braidangle

Where,

ε =lo− l

lo

a =3

tan2 αo

b =1

sin2(αo)

l being the current muscle length and lo, ro, αo are the musclegeometric parameters corresponding to its initial length, radiusand braid angle respectively (see [7] for more details). Theparameter k, slightly greater than 1, is empirically chosen fortaking into account the conic shape at muscle tips. The viscouscoefficient is also experimentally estimated for expressing thenaturally damped behavior of the artificial muscle contractiondue to the complex kinetic friction inside the braided sleeve.Although hysteresis phenomenon due to dry friction is notincluded in this model, we think it captures most of the staticand dynamic behaviour of the artificial muscle.

B. Pressure side dynamics

Control pressure in a muscle is provided by an Intensity-Pressure (I/P) converter which translates a current value into adesired pressure value that has to be fed to the muscle. Inliterature, see [11] for a survey, there exist several modelsof pressure generated in a muscle in terms of the massof air injected, volume of the muscle and some parametersof servo valves or I/P converter (for an example see [12].These models are highly non-linear and eventually becometoo cumbersome for devising control strategies. We propose asomewhat simpler empirical model to cover the dynamics ofthe pressure generation from the I/P converter. It is derivedfrom SAMSON I/P converters reported in [13]. Followingthis model the instantaneous pressure p inside a muscle isrepresented by a damped second order differential functionwith a control pressure as input as follows:

p+2wn p+w2n p = w2

n pdes (2)

Note that the input of the intensity converter is a current valuewhich is then scaled to the corresponding pressure controlinput Pdes in the pressure unit. The above equation is alsonon-linear as the natural frequency wn has been empiricallyidentified as a function of the volume of the muscle, V .

wn = 2π fv(1/V ) (3)

fv is the empirically found parameter. The volume of a McK-ibben artificial muscle can be approached by the followingcylindrical volume: V = πr2l, where r and l are the current

radius and length of the artificial muscle. By using the rela-tionship Eq.(4) as reported in [14], we deduce an expressionfor the volume of the muscle.

r/r0 = (1/sin(αo))√(1− ε)2) (4)

V =πr2

olsin2(αo)

(1− cos2(α0)(1− ε)2) (5)

For a second order system, the rise time is inversely propor-tional to its natural frequency and in the case of our pressuredynamic model, the natural frequency is inversely proportionalto the volume. It implies that the bigger the muscle, the largerthe rise time. Fig. 2 depicts a typical variation of wn withthe joint angle at one of the manipulator joints. If the range

0 0.5 1 1.5 2 2.5 30

5

10

15

wn

agonist muscle at joint 1 (larger)agonist muscle at joint 2 (smaller)

0 0.5 1 1.5 2 2.5 30

500

1000

Angular Position (rad)

Volu

me(c

m3)

agonist muscle at joint 1 (larger)agonist muscle at joint 2 (smaller)

Fig. 2. Natural frequency and volume of the muscle with the joint angle

of operation for a joint is small i.e range of θ is small, thenthe corresponding changes in the volume of the muscles as θ

varies, would be small at that joint. In such case, a constantvalue for wn could be chosen for the muscles corresponding totheir mean volumes. Here, another interesting point to noticeis that for different muscles, pressure dynamics of a largervolume muscle will be slower. The above two points help intrading off between the operating range of the joint and thecorresponding easiness of control strategy.

C. Agonistic-antagonistic joint actuator

Following the human arm model, a pair of artificial musclescan be set up in antagonistic fashion to drive a chained wheelof radius R. According to Fig. 3, the resulting actuator torqueT can be written as follows:

T = R[F1(ε1,P1)−F2(ε2,P2)] (6)

where F1 and F2 are the forces of muscles 1 and 2 respec-tively, defining the antagonist muscle pair. As illustrated inFig. 3, the relationship between θ , the actuator angle and thecontraction ratios for each muscle can be expressed as follows:ε1 = ε10 +

l0and ε2 = ε20− Rθ

l0, where ε10 and ε20 are the

initial contraction ratios for muscle 1 and muscle 2 respectivelycorresponding to the zero-angular position. Moreover, wepropose to specify the pressures in muscle 1 and muscle 2 asfollows: P1 = P10 +∆P1 and P2 = P20 +∆P2 where P10 and P20are respectively the initial pressure in the agonist-antagonistmuscle and related to zero positioning of the joint. ∆P1 and∆P2, are the control pressures.

Using Eq.(1), and neglecting the terms ε21 and ε2

2 , the torqueat the joint can be expressed as

T = K1∆P1−K2∆P2−K3(P1 +P2)θ −K4θ +K5, (7)

Page 3: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

θT

ε20 − Rθl0

P20 + ∆P2

ε10 + Rθl0

P10 + ∆P1

ε-contraction axis for both muscles

0

Fig. 3. Pulley-chain driven antagonist muscle actuator made of two identicalMcKibben artificial muscles. It has the possibility to adapt initial torque atinitial zero θ -position by means of P10, P20, ε10 and ε20

where,

K1 = (πr2o)R[a(1−2kε10)−b],

K2 = (πr2o)R[a(1−2kε20)−b],

K3 = 2(πr2o)R

2ka/lo,

K4 = (πr2o)R

2 fv/lo,

K5 = (πr2o)R[(a−b)(P10−P20)−2ka(P10ε10−P20ε20)].

If P10 = P20 and ε10 = ε20, K5 is equal to zero. The detailsof our robot-arm including the dimensions of muscles aregiven in [2]. The antagonist muscle actuator is now consideredas a MIMO-system whose inputs are the control pressures∆P1,∆P2, and outputs are both the θ and the actuator stiffnesswhich is defined as the instantaneous ratio between the currenttorque variation and the current angular position variation(see Eq.(9)). When no gravity effect is considered, the staticequilibrium position of the actuator can be directly derivedfrom Eq.(7) with zero angular velocity.

θequ = (K1∆P1−K2∆P2 +K5)/(K3(P1 +P2)) (8)

With associated stiffness σequ expressed as

σequ =−∂T∂θ

= K3(P1 +P2) (9)

From Eq.(8) and Eq.(9), it is possible to remark that theequilibrium position can be changed while keeping the samestiffness by modulating ∆P1 and ∆P2 with a constant ∆P1+∆P2.In the case of a symmetrical pressure variation in both muscles:∆P1 = −∆P2 = ∆P , the actuator becomes a SISO-systemwhose corresponding torque TSISO , is now given by thefollowing relationship:

T = (K1 +K2)∆P−K3(P10 +P20)θ −K4θ +K5 (10)

where stiffness at equilibrium position is now constant andequal to K3(P10 + P20). When the joint angle θ varies, andassuming that the actuator chain is inextensible, the contractionratio of each muscle is known and, consequently, the pressuredynamic model, proposed in II-B, can be applied to eachmuscle. It is important to note that, in the case of relativelysmall joint muscles, the effect of volume variation with jointangle on pressure dynamics could be negligible. If the range ofoperation is small, a constant wn could be chosen for that joint.

As previously noted, pressure dynamics of a larger volumemuscle will be slower which is, especially, the case of themuscles at the shoulder joint compared to the muscles at theelbow joint.

III. ROBOT DYNAMICS

This section presents the robot model formally includingthe rigid body model of the robot with its pressure dynamics.Let us consider a n degrees of freedom robot with generalizedjoint angle coordinates q ∈ Rn. Each joint is actuated by 2pneumatic muscles, so there will be 2n muscles, each one witha pressure Pi ∈R2n. The robot dynamics can be represented instandard form as below:

M(q)q+C(q, q)+G(q) = T (q,P) (11)

P+2CpP+GpP = GpPdes (12)

M(q) ∈Rn×n is the mass inertia matrix of the robot, C(q, q) ∈Rn×1 is the coriolis and centrifugal terms, G(q) ∈ Rn isthe gravity related terms, T (q,P) ∈ Rn is the torque gen-erated by pneumatic muscles as described in Eq.(6). Cp =diag[wn1 ,wn2 , ...wn2n ] and Gp = diag[w2

n1,w2

n2, ...w2

n2n] are the

collection of coefficients of the pressure dynamics of eachmuscle and wni is the natural frequency of the pressure dynam-ics at the ith muscle. P = [P1,P2.....P2n]

T ∈ R2n is the vectorof current muscle pressure and Pdes = [Pdes1 ,Pdes2 .....Pdes2n ]

T ∈R2n, is the vector of desired pressure Pdesi , the control inputin pressure unit given to the I/P converter of the ith muscle.Any reference trajectory in q given to the robot has to respectthe pressure dynamics whose bandwidth depends on wn. Asdiscussed in the previous section, if the joint is actuated bylarger muscles (having smaller bandwidth) and higher risetime, rapidly varying trajectory would be difficult to track andpose a limiting factor in performing explosive tasks. Limits tothe operating range of the joints are obviously decided by themaximum contraction of the spanning muscles. In addition tothat, wn could be reasonably considered as constant in a limitedoperating range, removing in this way the non-linearity inpressure dynamics. Apart from this, due to hardware limitationof I/P converter, the control input Pdes is bounded as follows:

U = Pdes ∈ R2n : Pdes ∈ [Pmin,Pmax] (13)

Where, Pmin,Pmax are the lower and upper bound on controlinput Pdes.

IV. OPTIMAL CONTROL FORMULATION

This section presents the optimal computational frameworkused to find the control sequences to perform a desired task.The optimal control problem is the minimization or maximiza-tion of a performance criterion with respect to the control undera set of constraints that arise from the physical limitation ofthe control action and from the plant dynamics.

A. State space representation

Let us represent the dynamics stated in Eq.(11) andEq.(12) in state space form considering the state vector asx = [q, q,P, P]T .

Page 4: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

x = f (x,u) =

q

M−1(−C(q, q)−G(q)+T (q,P))P

−CpP−GpP+Gpu

(14)

where, f is the non-linear function given by Eq.(14) instate x and control u that gathers Eq.(11) and Eq.(12). In thepresent work, we consider the constraints on the state in theoptimal control formulation which is discussed in the followingsubsection.

B. Treatment of Constraints

There exists a mechanical limit to each degrees of freedomof the arm. To have safe operation, we have introduced theselimits as constraints on the state space inside the cost functionof the optimal control formulation. The chosen cost functionfor the state space constraints are expressed in the followingequations:

max = 1−λ (xmax− x)min = 1−λ (x− xmin)

Cs = eλ∗max + eλ∗min (15)

where xmin,xmax are lower and upper limits on the state andλ is a constant. The above consideration for the cost functionwill ensure that the cost near the limits will be very high asevident in the Fig: 4 and hence the optimal solution will keepthe system within the operating limits. Also, the control action

0 0.5 1 1.5 2

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

x 108

x

Cost fu

nction v

alu

e

Fig. 4. Cost function of the state constraints

has to be admissible, i.e u∈U = [Pmin,Pmax]. In order to handlethe constraints on the control, the control limited DifferentialDynamic Programming (DDP) approach reported in [15] isused. The control problem formulation is then expressed asdetermining an open-loop control input u = u(t,x) ∈U whichcan minimize or maximize a cost function along a given timeinterval t ∈ [0,T ] and with initial state x(0) = x0. The mostgeneric expression for the cost function can be written asfollows:

J(x0) =C f +Cr +Cs (16)

where C f = h(x(T )) is the final cost, Cr =∫ T0 c(x(t),u(t,x(t)))dt is the integral of the running cost

c(x,u) which encapsulates the task objectives and Cs isthe cost value imposed by the constraints. For a non-linear

dynamics Eq.(14) and non-quadratic cost Eq.(16), optimalcontrol solutions can be obtained using full DDP. However, asDDP is computationally expensive, an iterative LQR (iLQR)approach is considered [16]. The iLQR method is relying onlinearizing the dynamics and approximating the cost functionto quadratic form along the x trajectory. This control approachis briefly summarized in the next section.

C. Iterative Linear Quadratic Regulator (iLQR)

iLQR is initialized with a nominal control sequence and thecorresponding state trajectory (x0,u0). the dynamical systemis then linearized as in Eq.(17) and the cost function isapproximated by the quadratic form Eq.(18) and a local LQRproblem is then solved. Using this solution, the states and thecontrol sequence are improved iteratively.

δx = Aδx+Bδu (17)

∆J = hTx δx(T )+δxT (T )hxxδx(T )

+∫ T

0cT

x δx+ cTu δu+δxT cxxδx+δxT cxuδu+δuT cuuδu

(18)

where A = ∂ f∂x and B = ∂ f

∂u . In Eq.(17) and Eq.(18) subscriptx and u indicate that the function is partially derivated withrespect to x and u. At every iteration, Eq.(17) and Eq.(18)are solved and (δx,δu) are deduced from the resolution of amodified Ricatti-type system. Then the new improved sequenceis generated by x← x+δx and u← u+δu. When ∆J ≈ 0, theiLQR converges and gives an optimal control sequence u∗ ∈Uand the corresponding optimal state trajectory x∗.

V. SIMULATION AND EXPERIMENTAL RESULTS

The experimental set up considered here is the manipulatorof LAAS CNRS which is an anthropomorphic arm of sevendegrees of freedom (DoF), where each joint is pneumaticallyactuated by a pair of Mckibben muscles. In the experimentpresented in this paper, only two joints are controlled. Thesejoints are the flexion θ1 at the shoulder and the flexion θ2at the elbow. (see Fig. 5). So, for the experiments, the robotcan be viewed as a 2 DoF manipulator with state definedby x = [θ1, θ1,θ2, θ2]

T . Each muscle is pressurized by an I/Pconverter which converts a current command to a referencepressure value. The objective of the experiments is to evaluatethe performance of the iLQR control to achieve the followingtasks with our multi-link pneumatic robot.

1) Final position control: We aim to compare the quality ofthe positioning control in the presence of different loads withthe iLQR approach and with the feed-forward proportionalcontrol. 2) Capability to execute explosive movements bymaximizing the link speed at a given time.For this secondtask we compare simulation results with the results of ex-periment executed with the real robot. For each task, weanalyze the stiffness modulation and we show that the optimalcontrol approach enhances the explosive motion capabilitiesby simultaneous modulation of position and stiffness. In orderto do that, simulations and experiments are done in twocases. 1) Case-I: When the sum of the pressure in agonist-antagonist pair of muscles at each joint is kept constant, i.e.P1 + P2 = Constant. 1) Case-II: When P1 and P2 are leftindependent. In this case, there will be two inputs for eachjoint.

Page 5: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

Shoulder FlexionMuscle Pair

Elbow FlexionMuscle Pair

Fig. 5. Picture of the robot-arm showing the two pairs of muscles, in shoulderand elbow flexion.

A. Task 1: Position control

The manipulator is given a final position with a loadmass of ml = 0.1kg at the end effector. iLQR is used hereto find the optimal path to reach the final goal (θ1,θ2) =(28.8,57.6)degrees. The same task is repeated with differentload masses of ml = 0.1kg and ml = 1.0kg which is shown inFig. 6.

The following cost function is considered for this task.

C f = Q f (xre f (T )− x(t))2,

Cr = Q∫ T

0((xre f (T )− x(t))2 +u2(t))dt,

(19)

where, xre f is the final position for the two joints. TheiLQR control uses Eq.(14) and Eq.(19) to solve for optimalcontrol sequence u∗. This optimal control sequence in forwardapplication to Eq.(14) will yield the needed trajectory. Tocompare the effectiveness of the iLQR approach, the task isexecuted using a Proportional-Integral controller with a feed-forward term. The feed-forward term gives a desired pressurefor each muscle at the joint needed to maintain a desired jointangle. Thus, the control action of the feed-forward PI controllercan be defined as follows:

u(t) = Pf eed +Kp(e(t))+Ki

∫ t

0(e(t))dt (20)

where, e(t) = xre f − x(t), Kp and Ki are proportional andintegral gains respectively. The simulation results are shownin Fig. 7. Response of feed-forward PI controller is shown inblack dashed line is compared with the responses of optimalposition control in Case-I when P1+P2 =Constant (blue lines)and in Case-II when P1 and P2 are independent (magentalines)(See Fig. 7).

From the position plots in Fig. 7, it appears that theiLQR approach gives a good compromise between keepingthe stiffness low and minimizing the oscillations which resultsinto a smooth motion. However, the feed-forward PI controller

0 2 4 6 8 1010

20

30

40

Stiffness (

N−

m/r

ad)

Time(s)

ml = 1kgml = 0.1kg

0 2 4 6 8

5

10

15

Time(s)

ml = 1kgml =0.1

0 2 4 6 8 10−20

0

20

40Joint 1

Positio

n (

Degre

es)

0 2 4 6 8 100

20

40

60Joint 2

Fig. 6. Figures on the left side represents the position response and stiffnessvariation at joint 1 and the figures on the right side represents the same forjoint 2 in order to reach the target (θ1,θ2) = (28.8,57.6)degrees. The plotscompare the response of the joints and the stiffness profile when the endeffector is loaded with mass 0.1kg (solid line) and 1.0kg (dashed line). Forthe heavier load, optimal control gives a stiffer stiffness profile.

0 2 4 6 8 100

20

40

60

80Joint 2

0 2 4 6 8 1010

20

30

40

Stiff

ne

ss (

N−

m/r

ad

)

Time (s)

Feedforward PI

Optimal position control at P1+P2 = 3 bar

Optimal position control at P1+P2 = 4 bar

Optimal Position and stiffness control

0 2 4 6 8 100

5

10

15

20

Time (s)

Feedforward PI

Optimal position control at P1+P2 = 4 bar

Optimal position control at P1+P2 = 5 bar

Optimal Position and stiffness control

0 2 4 6 8 10−20

0

20

40Joint 1

Po

sitio

n (

De

gre

es)

Fig. 7. Figures on the left side represents the position response and stiffnessvariation at joint 1 and the figures on the right side represents the same forjoint 2.

makes joint 1 very stiff and joint 2 very flexible leading to anovershoot and oscillations.

As discussed in section II-C, the stiffness can be adjustedfor the same equilibrium position by changing the sum ofpressures in the agonist-antagonist pair of muscles. As thesum increases, stiffness at the equilibrium increases which isevident at each joint (blue and brown lines in stiffness plots).

In order to find the most compliant position trajectory, i.eto find the best sum of pressure in the agonist-antagonist pair,P1,P2 are left independent and Eq.(7) has two independentinputs for each joint. In this case, the optimal solution givesthe similar optimal position trajectory but with better stiffnessprofile (magenta line in stiffness plot in Fig. 7).

B. Task 2: Maximizing the link speed

The objective is to execute some explosive motions withthe aim to perform in the future tasks such as ball throwing,kicking or hammering a nail. Such motions would requireeither maximizing the joint/link speed or end-effector speed[17], [18]. Here, two subtasks are presented: 1) Maximizingthe angular speed of the elbow joint. The task is first simulatedand then executed by the real robot. A comparison between

Page 6: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

simulation and the experimental results are shown in Fig. 8. 2)Maximizing the end effector speed: The objective of this taskis to achieve maximum end-effector speed at terminal time T .Simulations are done for both Case-I (P11 +P2 = Constant)and Case-II (P1 and P2 are independent).

1) Maximizing the angular speed: For maximizing theelbow joint’s angular speed at final time T = 1, the taskrequires only the terminal cost, C f . But to minimize the controleffort, a running cost Cr involving only control pressures isused.

C f =−Q f (x4(T ))2,

Cr = Qu

∫ T

0u2(t)dt,

(21)

where, x4 = θ2 is the angular velocity of the elbow jointwhich constitutes the fourth state variable. Q f and Qu areweights for the terminal cost and the running cost. Thesimulation results are shown in Fig. 8

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−1

0

1

2

Positio

n (

rad)

Experimental vs simulated response

Joint 1Joint 2Join1 simJoin2 sim

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−20

−10

0

10

20

Join

t V

elo

city (

rad/s

)

Time(s)

Joint 1Joint 2Join1 simJoin2 sim

Fig. 8. Plots on the top compare the robot response (solid lines) withsimulated response (dashed lines) at joint 1 (blue line) and joint 2 (red line).Bottom plots compare the velocity response of the robot and the simulation.

In simulation, the maximum joint velocity is reached atterminal time which is around -15 rad/sec. The robot responseand the simulation response in position and speed show agood match. However, some discrepancies can be observed.Apart from some modeling error, this difference is probablydue to the fact that the constraints on the state vector are notconsidered in the optimal control problem formulation. This isan improvement that we plan to do in future work. Snapshotsof the video of the experiment are presented in Fig. 12

2) Maximizing the end effector speed: The objective is tomaximize the end-effector speed of the manipulator at terminaltime. The cost function, thus, comprises the terminal cost C finvolving the end-effector velocity and the running cost Crinvolving only control efforts.

C f =−Q f ( fvel(x(T ))2,

Cr = Qu

∫ T

0u2(t)dt,

(22)

where, fvel(x(t)) is the function which computes end effectorvelocity from the robot kinematics. Simulations are done forthe following two cases. In Case-I, the sum of pressures in theagonist-antagonistic pair is kept constant at the joint 1 to 3 barand at the joint 2 to 4 bar. In this second simulation (Case-II),

0 0.2 0.4 0.6 0.8 1

−50

0

50

100

150

200

Positio

n (

Degre

es)

Joint 1Joint 2

0 0.2 0.4 0.6 0.8 10

2

4

Contr

ol In

put (B

ar)

Time(s)

Input 1Input 2

Fig. 9. Case-I: P1 +P2 =Const. Plots on the top show the simulated positionresponse of joint 1 (blue line) and Joint 2 (black line). Control effort is shownin the bottom plots. The control inputs, Input 1 and Input 2 are applied onjoint 1 and joint 2 respectively.

0 0.2 0.4 0.6 0.8 1

−50

0

50

100

150

200

Positio

n (

Degre

es)

Joint 1Joint 2

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

2

4C

ontr

ol In

put (B

ar)

Input 1Input 2Input 3Input 4

Fig. 10. Case-II: P1,P2 are independent. Plots on the top show the simulatedposition response of joint 1 (blue line) and joint 2 (black line). Control effortis shown in the bottom plots. The control inputs, Input 1 and Input 3 areinputs to the agonist-antagonist pair at joint 1. The control inputs, Input 2 andInput 4 are inputs to the agonist-antagonist pair at joint 2.

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

2

4

6

End e

ffecto

r’s s

peed (

m/s

)

Case-I: StiffCase-II: Compliant

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1−10

0

10

20

30

Stifn

ess(N

−m

/rad)

Time (s)

Joint 1(Case-I: Stiff)Joint 2 (Case-I: Stiff)Joint 1 (Case-II: Compliant)Joint 2 (Case-II: Compliant)

Fig. 11. Comparison of the stiffness profile of each joint (joint 1: blue lines,Joint 2: red lines) in Case-I when P1 +P2 =Const (dashed lines ) and Case-IIwhen P1,P2 are independent (solid lines). It is evident from the plots that themotion in Case-II is indeed slightly more compliant than the motion in Case-I.

we let P1,P2 independent.

Comparing the velocity plots and the corresponding stiff-ness plots in Fig. 11, it can be seen that the maximumend-effector velocity has increased from 3.8m/s in Case-I to

Page 7: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

Fig. 12. The pneumatic manipulator arm executing the task of maximizing the angular speed of the elbow.

4.1m/s in Case-II. This is due to the fact that in Case-II, theoptimal control has the freedom to modulate the compliancewith two inputs at each joint and hence it was able to achievehigher speed.

The hardware set up used for the experimentation is therobot mentioned in [2] whose control modules have beenupgraded recently. The important components of the current setup are I/P converters, encoders, NI data acquisition devices andthe development computer running real time control software.1) I/P converter : It is a Samson I/P 6111 manufacturedby Samson Corporation, Frankfurt. This I/P is rapid andproduces output pressure in range 0 to 5 bar. The bandwidthof the I/P is volume dependent. There are 7 joints and agripper. Each joint and the gripper are actuated by a pairof Mckibben muscles and each muscle is controlled by oneI/P so, there are 16 IP converters. 2) Encoders/Potentiometers:There are seven potentiometers, one at each Dof to get thejoint angle data. 3) The data Acquisition device: CompactRIOfrom National Instruments is used for this purpose. Thisis a reconfigurable and embeddable chassis, integrating anintelligent real-time controller. It has NI9205 module which isan Analog-to-Digital-Converter (ADC) with up to 32 channels,selectable acquisition range (200mV,1V,5V and 0V ) and amax sampling frequency of 250kS/s. This device samples theseven potentiometers dedicated to the positioning of the arm.There is a NI9265 card which has three modules of Digital-to-Analog-Converters (DAC). They have the particularity toprovide analog current outputs. Each module is able to manageup to 4 channels in the 0 to 20mA range at 100kS/s. 4) Thedevelopment computer: The development computer runs thehigh level control algorithm for the robot. It communicates viathe NI module using standard UDP protocol at a frequency of1kHz in real time. The computer has linux 14.04 with Xenomaireal time kernel running in parallel. For the present experimen-tation, control software are running at 200Hz frequency.

VI. CONCLUSION AND FUTURE WORK

We considered the problem of modeling and controllinga robot manipulator actuated by the pneumatic Mckibbenmuscles in an optimal control frame-work. Our first contri-bution is to propose a model which encapsulates the pressuredynamics in an efficient way. It is done by using a secondorder differential function whose bandwidth is dependent onthe instantaneous volume of the muscles. Even though, themodel does not include the effect of dry friction and hysteresis,it covers most of the static and the dynamic behavior. Usingthis model, along with the robot manipulator model, our second

contribution is to show that the implementation of iLQR allowsto perform efficient position control but also preserves theinherent compliance of the Mckibben muscles with respectto the more conventional control approaches. The optimalcontrol formulation with this new model is used to performvarious tasks like positioning control under different loadsand maximizing the link speed. The results were reported insimulation first and then some of these results were validatedon the hardware platform. In future work, we plan to extendthe application on the real robot in order to perform explosivemovements such as throwing a ball or hammering a nail. Thiswill be done by embedding the state constraints in the optimalcontrol formulation. A detailed analysis of stiffness variationand possibilities of stiffness control will then be possible.

VII. ACKNOWLEDGEMENTS

This work was partially supported by the European FP7Project EUROC under Grant Agreement 608849 and the ERCAdvanced Grant Actanthrope. We thank Jerome Manhes forhis support in hardware development, Florent Forget and KevinGiraud Esclasse for their help in software development.

REFERENCES

[1] V. Kumar, Z. Xu, and E. Todorov, “Fast, strong and compliant pneu-matic actuation for dexterous tendon-driven hands,” in ICRA, 2013.

[2] B.Tondu, S. Ippolito, and J. Guiochet, “A seven-degrees-of-freedomrobot-arm driven by pneumatic artificial muscles for humanoid robots,”Int. Jour. of Robotics Research, vol. 24, no. 4, 2005.

[3] B. Tondu, K. Braikia, M. Chettoulr, and S. Ippolito, “Second ordersliding mode control for an anthropomorphic robot-arm driven withpneumatic artificial muscles,” in Int. Conf. on Humanoid Robotics(ICHR), 2009, pp. 47–54.

[4] Y. Tassa, T. Wu, J. Movellan, and E. Todorov, “Modeling and iden-tification of pneumatic actuators,” in Int. Conf. on Mechatronics andAutomation (ICMA), 2013, pp. 437–443.

[5] V. Kumar, Y. Tassa, T. Erez, and E. Todorov, “Real-time behavioursynthesis for dynamic hand-manipulation,” in ICRA, 2014.

[6] G. Andrikopoulos, G.Nikolakopoulos, and S. Manesisa, “Pneumaticartificial muscles: A switching model predictive control approach,”Control engineering practice, vol. 21, pp. 1653–1664, 2013.

[7] B. Tondu, “Modelling of the mckibben artificial muscle: A review,”Journal of Intelligent Material Systems and Structures, vol. 23, no. 3,2012.

[8] C. Kothera, M. Jangid, J. Sirohi, and N. Wereley, “Experimental char-acterization and static modeling of mckibben actuators,” Transactionsof the ASME, Journal of Mechanical Design, vol. 131, 2009.

[9] B. Tondu and S. Zagal, “Mckibben artificial muscle can be in ac-cordance with hill skeletal muscle model,” in IEEE/EMBS Int.Conf,BioMed. Rob. I& BioMecatron., no. 276, 2006.

Page 8: Controlling a multi-joint arm actuated by pneumatic ... · high power to weight ratio. These factors motivate researchers to use pneumatic actuation for exoskeleton, prosthesis, rehabil-itation

[10] F. Daerden and D. Lefeber, “The concept and design of pleatedpneumatic artificial muscles,” International Journal of Fluid Power,vol. 2, no. 3, pp. 41–50, 2001.

[11] E. Kelasidi, G. Andrikopoulos, G. Nikolakopoulos, and S. Manesis, “Asurvey on pneumatic muscle actuators modeling,” Journal of Energyand Power Engineering, vol. 6, pp. 1442–1452, 2011.

[12] B. Vanderborght, B.Verrelst, B.Ham, R. V. Damme, M. V. Beyl, andP. Lefeber, “Torque and compliance control of the pneumatic artificialmuscles in the biped ”lucy”,” in ICRA, 2006, pp. 842–847.

[13] V. Boitier, “Design and control of a 2-dof scara-type robot actuated bypneumatic mckibben artificial muscles,” Ph.D thesis, INSA-Toulouse,France, 1996.

[14] B. Tondu and P. Lopez, “Theory of an artificial pneumatic muscle andapplication to the modelling of mckibben artificial muscle,” C.R.A.S.French National Academy of Sciences, vol. 320, no. 4, 1995.

[15] Y. Tassa, N. Mansard, and E. Todorov, “Control-limited differentialdynamic programming,” in ICRA, 2014.

[16] W. Li and E. Todorov, “Iterative linearisation methods for approximatelyoptimal control and estimation of non-linear stochastic system,” Int.Jour. of Control, vol. 80, no. 9, pp. 1439–1453, 2007.

[17] M. Garabini, A. Passagliaand, F. Belo, P. Salaris, and A.Bicchi, “Opti-mality principles in variable stiffness control: Vsa hammer,” in IROS,2011.

[18] D.Braun, F. Petit, F. Huber, S. Haddadin, P. D. Smagt, A. Albu-Schaffer,and S.Vijayakumar, “Robots driven by compliant actuators: Optimalcontrol under actuation constraints,” in IEEE Transactions on Robotics,vol. 29, 2013.