Learning Robot Motions with Stable Dynamical Systems under Diffeomorphic Transformations
Klaus Neumann1,∗, Jochen J. Steil2,∗
Research Institute for Cognition and Robotics (CoR-Lab), Bielefeld University, Germany
Abstract
Accuracy and stability have in recent studies been emphasized as the two major ingredients to learn robot motions from demonstrations with dynamical systems. Several approaches yield stable dynamical systems but are also limited to specific dynamics that can potentially result in a poor reproduction performance. The current work addresses this accuracy-stability dilemma through a new diffeomorphic transformation approach that serves as a framework generalizing the class of demonstrations that are learnable by means of provably stable dynamical systems. We apply the proposed framework to extend the application domain of the stable estimator of dynamical systems (SEDS) by generalizing the class of learnable demonstrations by means of diffeomorphic transformations τ. The resulting approach is named τ-SEDS and analyzed with rigorous theoretical investigations and robot experiments.
Keywords: SEDS, imitation learning, programming by
demonstration, robotics, dynamical system, stability
1. Introduction
Nonlinear dynamical systems have recently been utilized as a flexible computational basis to model motor capabilities featured by modern robots [1, 2, 3]. Point-to-point movements are an important subclass that can be modeled by autonomous dynamical systems. They are often used to provide a library of basic components called movement primitives (MP) [4, 5], which are applied very successfully to generate movements in a variety of manipulation tasks [6]. The main advantage of the dynamical systems approach over standard path planning algorithms is its inherent robustness to perturbations, which results from encoding the endpoint or goal as a stable attractor, whereas the movement itself can be learned from demonstrations. Naturally, the generalization and robustness then depend on the stability properties of the underlying dynamical systems, as has been emphasized in several recent studies [7, 8, 9, 10].
The most widely known approach is the dynamic movement primitives (DMP) approach [7], which generates motions by means of a non-autonomous dynamical system. Essentially, a DMP is a linear dynamical system with a nonlinear perturbation, which can be learned from demonstration to model a desired movement behavior. Stability is enforced by suppressing the nonlinear perturbation at the end of the motion, where the smooth switch from nonlinear to stable linear dynamics is controlled by a phase variable. The phase variable can be seen as an external stabilizer which in return distorts the temporal pattern of the dynamics. This leads to the inherent inability of DMP to generalize well outside the demonstrated trajectory [11].
This shortcoming motivates methods which are capable of generalizing to unseen areas in case of spatio-temporal perturbations. Such methods are time-independent and thus preserve the spatio-temporal pattern. They became of special interest by focusing on the “what to imitate” problem [12, 13].
An appealing approach that aims at ensuring robustness to temporal perturbations by learning dynamical systems from demonstrations is the stable estimator of dynamical systems (SEDS) [13]. It is based on a mixture of Gaussian functions and respects correlations across several dimensions. In [13], it is rigorously shown that SEDS is globally asymptotically stable in a fixed-point attractor which marks the end of the encoded point-to-point movement. However, the proof also reveals that movements learned by SEDS are restricted to contractive dynamics corresponding to a quadratic Lyapunov function, i.e. the distance of the current state to the attractor decreases in time when integrating the system's dynamics. This results in dynamical systems with a globally asymptotically stable fixed-point attractor but also potentially poor reproduction performance if the demonstrated trajectories are not contractive.
This stability vs. accuracy dilemma was acknowledged by Khansari-Zadeh et al. in the original work on SEDS. They remark that “the stability conditions at the basis of SEDS are sufficient conditions to ensure global asymptotic stability of non-linear motions when modeled with a mixture of Gaussian functions. Although our experiments showed that a large library of robot motions can be modeled while satisfying these conditions, these global stability conditions might be too stringent to accurately model some complex motions.” ([13], p. 956).

Figure 1: Learning of demonstrations with a dynamical system admitting a quadratic Lyapunov function. Left: equipotential lines of the quadratic Lyapunov function contradicting the demonstrations. Right: the resulting dynamical system approximates the demonstrations inaccurately when using a learner which is bound to a quadratic Lyapunov function.
An example illustrating this trade-off in learning from demonstrations while simultaneously satisfying a quadratic Lyapunov function is shown in Fig. 1. The sharp-C-shaped demonstrations are part of the LASA data set [14]. The equipotential lines (colored) of the quadratic Lyapunov function together with the demonstrations (black) are depicted in the left plot of the figure. The resulting flow (blue arrows) of the dynamical system, the demonstrations (black), and their reproductions (red) are depicted in Fig. 1 (right). The reproductions by means of the dynamical system are obviously stable but also inaccurate in approximating the demonstrations.
One way to overcome this problem is to separate concerns, e.g. Reinhart et al. [10] used a neural network approach to generate movements for the humanoid robot iCub. Accuracy and stability are addressed in their approach by two separately trained but superpositioned networks, which is feasible but also very complex and yields no stability guarantees.
Another approach that allows learning larger sets of demonstrations accurately is the recently developed control Lyapunov function - dynamic movements (CLF-DM) approach, originally called SEDS-II, which was published in [15, 16]. CLF-DM is “in spirit identical to the control Lyapunov function - based control scheme in control engineering” ([15], p. 88) and implements less conservative stability conditions compared to SEDS, but relies on an online correction signal which potentially interferes with the dynamical system. We show an example of such interference in Sec. 7.3. The construction of an appropriate control Lyapunov function (CLF) is achieved by using an approach called weighted sum of asymmetric quadratic functions (WSAQF), which learns from the set of demonstrations.
The technique of control Lyapunov functions was developed mainly by Artstein and Sontag [17, 18], who generalized Lyapunov's theory of stability. Such control functions are used to stabilize non-linear dynamical systems (that are typically known beforehand and not learned) through corrections at runtime and interfere with the dynamical system whenever unstable behavior is detected. The detection and elimination of unstable tendencies of the dynamical system without distorting the dynamics too strongly nevertheless remains difficult.
A further approach to accurately learn a larger class of dynamics than SEDS was developed by Lemme et al. and called neurally imprinted stable vector fields (NIVF) [8]. It is based on neural networks, and stability is addressed through a Lyapunov candidate that shapes the dynamical system during learning by means of quadratic programming. Lyapunov candidates are constructed by the neurally imprinted Lyapunov candidate (NILC) approach introduced recently in [19]. While this approach leads to accurate results, stability is restricted to finite regions in the workspace. Also, the mathematical guarantees are obtained by an ex-post verification process which is computationally costly.
The current work therefore addresses the accuracy-stability dilemma through a new framework that generalizes the class of learnable demonstrations, provides provably stable dynamical systems, and integrates the Lyapunov candidate into the learning process.
Assume that a data set (x(k), v(k)) encoding demonstrations is given, where x(k) refers to the position and v(k) to the velocity of a robot end-effector. As a first step, a so-called Lyapunov candidate L ∈ L with L : Ω → R that is consistent with the demonstrations (i.e. ∇L(x(k)) · v(k) < 0 : ∀k) is learned. Second, diffeomorphic transformations τ : L × Ω → Ω̃ are defined that transform those candidates from the original space to a new space in which they appear as a fixed and simple function.
These transformations (parameterized with the learned Lyapunov candidate L) are then used to map the demonstrations from Ω to Ω̃, where they are consistent with the underlying fixed Lyapunov function of the learner - in the special case of SEDS, a quadratic function. That is, in the new space a provably globally stable dynamical system (i.e. Ω = Ω̃ = R^d) can be learned with respect to the transformed data, which is then back-transformed into the original space with the inverse mapping τ_L^{-1} : Ω̃ → Ω, which exists because of the diffeomorphic properties of τ_L. It is then shown that in the original space, the Lyapunov candidate can indeed be used to prove stability of the transformed dynamical system, which accurately models the demonstrations and resolves the dilemma.
We evaluate the new approach - named τ-SEDS - in detail, and provide rigorous theoretical investigations and experiments that substantiate the effectiveness and applicability of the proposed theoretical framework to enhance the class of learnable stable dynamical systems for generating robot motions.
2. Programming Autonomous Dynamical Systems by Demonstration
Assume that a data set D = (x^i(k), v^i(k)) with indices i = 1 ... N_traj and k = 1 ... N^i, consisting of N_traj demonstrations, is given. N = Σ_i N^i denotes the number of samples in the data set. The demonstrations considered in this paper encode point-to-point motions that share the same end point, i.e. x^i(N^i) = x^j(N^j) = x^* : ∀i, j = 1 ... N_traj and v^i(N^i) = 0 : ∀i = 1 ... N_traj. These demonstrations could be a sequence of the robot's joint angles or the position of the arm's end-effector, possibly obtained by kinesthetic teaching.

We assume that such demonstrations can be modeled by autonomous dynamical systems which can be learned by using a set of parameters that are optimized by means of the set of demonstrations:
ẋ(t) = y(x(t)) : x ∈ Ω , (1)
where Ω ⊆ R^d might be the joint or workspace of the robot. It is of particular interest that y(x) : Ω → Ω has a single asymptotically stable point attractor x^* with y(x^*) = 0 in Ω, and that y is nonlinear, continuous, and continuously differentiable. The limit of each trajectory has to satisfy:

lim_{t→∞} x(t) = x^* : ∀x ∈ Ω . (2)
New trajectories can be obtained by numerical integration of Eq. (1) when starting from a given initial point in Ω. They are called reproductions and denoted by x̂^i(·) if they start from the demonstrations' initial points x^i(0).
In order to analyze the stability of a dynamical system, we recall the conditions of asymptotic stability found by Lyapunov:

Theorem 1. A dynamical system is locally asymptotically stable at a fixed-point x^* ∈ Ω in the positive invariant neighborhood Ω ⊂ R^d of x^*, if and only if there exists a continuous and continuously differentiable function L : Ω → R which satisfies the following conditions:

(i) L(x^*) = 0    (ii) L(x) > 0 : ∀x ∈ Ω\x^*
(iii) L̇(x^*) = 0    (iv) L̇(x) < 0 : ∀x ∈ Ω\x^* . (3)

The dynamical system is globally asymptotically stable at the fixed-point x^* if Ω = R^d and L is radially unbounded, i.e. ‖x‖ → ∞ ⇒ L(x) → ∞. The function L : Ω → R is called a Lyapunov function.
It is usually easier to search for the existence of a Lyapunov function than to prove asymptotic stability of a dynamical system directly. Typically, previously defined Lyapunov candidates are used as a starting point for stability verification, and conditions (i)-(iv) of theorem 1 are verified in a stepwise fashion to promote the candidate to become an actual Lyapunov function. We thus first define what kind of Lyapunov candidates are in principle applicable for investigation.
Definition 1. A Lyapunov candidate is a continuous and continuously differentiable function L : Ω → R that satisfies the following conditions:

(i) L(x^*) = 0    (ii) L(x) > 0 : ∀x ∈ Ω\x^*
(iii) ∇L(x^*) = 0    (iv) ∇L(x) ≠ 0 : ∀x ∈ Ω\x^* , (4)

where x^* ∈ Ω is the asymptotically stable fixed-point attractor and Ω is a positive invariant neighborhood of x^*. L is a globally defined Lyapunov candidate if Ω = R^d and L is radially unbounded in addition to the previous conditions, i.e. ‖x‖ → ∞ ⇒ L(x) → ∞.
We use the term Lyapunov candidate whenever the function is used to enforce asymptotic stability of a dynamical system during learning, and control Lyapunov function if the dynamical system is stabilized during runtime. In the following, we will learn such candidate functions and evaluate their quality. To this aim we define what it means that a Lyapunov candidate contradicts a given demonstration or reference trajectory.
Definition 2. A Lyapunov candidate L : Ω → R violates/contradicts a dynamical system v : Ω → Ω or a demonstration (x^i(k), v^i(k)) : k = 1 ... N^i, if and only if

∃x ∈ Ω : ∇^T L(x) · v(x) > 0  or  ∃k : 1 ≤ k ≤ N^i : ∇^T L(x^i(k)) · v^i(k) > 0 . (5)

The dynamical system or the given demonstration is said to be consistent with or satisfying/fulfilling the Lyapunov candidate if and only if there is no violation.
3. Related Work
Several different approaches for movement generation and learning of autonomous dynamical systems have been introduced so far. This section introduces the most important developments among those methods as related work and embeds them in the previously defined formalism necessary to achieve stable movement generation.
3.1. Stable Estimator of Dynamical Systems (SEDS)
The stable estimator of dynamical systems (SEDS) [13] is an advanced version of the binary merging algorithm [9] and learns a dynamical system by means of a Gaussian mixture model

ẋ = Σ_{k=1}^K [ P(k)P(x|k) / Σ_i P(i)P(x|i) ] ( μ_ẋ^k + Σ_{ẋx}^k (Σ_x^k)^{-1} (x − μ_x^k) ) , (6)

where P(k), μ^k, and Σ^k yield the prior probability, the mean, and the covariance matrix of the K Gaussian functions, respectively. Note that this model can be expressed
as a space-varying sum of linear dynamical systems according to the definition of the matrix A^k = Σ_{ẋx}^k (Σ_x^k)^{-1}, the bias b^k = μ_ẋ^k − A^k μ_x^k, and the nonlinear weighting terms h_k(x) = P(k)P(x|k) / Σ_i P(i)P(x|i). The reformulation according to this definition leads to

ẋ(t) = y(x(t)) = Σ_{k=1}^K h_k(x(t)) ( A^k x(t) + b^k ) . (7)
Learning can be done by minimization of different objective functions by a non-linear program subject to a set of non-linear constraints. A possible objective function can be the mean square error functional

min Σ_{i,k} ‖v^i(k) − y(x^i(k))‖^2 , (8)

which is minimized in the parameters of the Gaussian mixture model and at the same time subject to the following non-linear constraints [13]:

(i) b^k = −A^k x^*
(ii) A^k + A^{kT} ≺ 0 : ∀k = 1, ... K , (9)
where ≺ 0 denotes the negative definiteness of a matrix. Note that it is also necessary to add the constraints for the requirements on the covariance matrices Σ^k and priors P to the non-linear program. It is shown that these constraints are sufficient conditions for the learned dynamical system to be globally asymptotically stable.
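To make the reformulation in Eq. (7) concrete, the following Python sketch evaluates a SEDS-style mixture of linear systems and checks the stability constraints of Eq. (9) numerically. The parameter arrays are hypothetical stand-ins, not a fitted SEDS model.

```python
import numpy as np

def seds_velocity(x, priors, means, covs_xx, A, b):
    """Evaluate Eq. (7): x_dot = sum_k h_k(x) (A_k x + b_k), where the
    weights h_k(x) are Gaussian responsibilities as in Eq. (6)."""
    K, d = A.shape[0], x.shape[0]
    h = np.zeros(K)
    for k in range(K):
        diff = x - means[k]
        quad = diff @ np.linalg.solve(covs_xx[k], diff)
        norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(covs_xx[k]))
        h[k] = priors[k] * np.exp(-0.5 * quad) / norm
    h /= h.sum()
    return sum(h[k] * (A[k] @ x + b[k]) for k in range(K))

def satisfies_constraints(A, b, x_star, tol=1e-9):
    """Check Eq. (9): b_k = -A_k x*, and A_k + A_k^T negative definite."""
    for k in range(A.shape[0]):
        if not np.allclose(b[k], -A[k] @ x_star, atol=1e-6):
            return False
        if np.max(np.linalg.eigvalsh(A[k] + A[k].T)) >= -tol:
            return False
    return True
```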
The stability analysis in the original contribution considers a quadratic Lyapunov candidate

L(x) = (1/2)(x − x^*)^T (x − x^*) : ∀x ∈ R^d , (10)

which is used for stability analysis. In detail, the theorem states that this scalar function is indeed a Lyapunov function of the autonomous dynamical system defined by SEDS:

L̇(x) = d/dt L(x(t)) = ∇L(x(t)) · ẋ(t) = Σ_{k=1}^K h_k(x) (x − x^*)^T A^k (x − x^*) < 0 : ∀x ≠ x^* ,

since the weights h_k(x) are positive and the constraints of Eq. (9) render each quadratic form (x − x^*)^T A^k (x − x^*) negative.
3.3. Neurally Imprinted Stable Vector Fields (NIVF)
Another successful approach to represent robotic movements by means of autonomous dynamical systems is based on neural networks and called the neurally imprinted vector fields approach [8]. It features efficient supervised learning and incorporates stability constraints via quadratic programming (QP). The constraints are derived from a parameterized or learned Lyapunov function which enforces local stability.

The approach considers feed-forward neural networks that comprise three different layers of neurons: x ∈ R^I denotes the input, h ∈ R^R the hidden, and y ∈ R^I the output neurons. These neurons are connected via the input matrix W^inp ∈ R^{R×I}, which remains fixed after random initialization and is not subject to supervised learning. The read-out matrix W^out ∈ R^{I×R} is subject to supervised learning. For input x, the output of the ith read-out neuron is thus given by

y_i(x) = Σ_{j=1}^R W^out_{ij} f( Σ_{n=1}^I W^inp_{jn} x_n + b_j ) , (14)

where the biases b_j parameterize the component-wise Fermi function f(x) = 1/(1 + e^{−x}) of the jth neuron in the hidden layer.
It is assumed that a Lyapunov candidate L is given. In order to obtain a learning algorithm for W^out that also respects condition (iv) of Lyapunov's theorem, this condition is analyzed by taking the time derivative of L:

L̇(x) = (∇_x L(x))^T · (d/dt) x = (∇_x L(x))^T · v̂
     = Σ_{i=1}^I (∇_x L(x))_i Σ_{j=1}^R W^out_{ij} f_j(W^inp x + b) < 0 . (15)
Interestingly, L̇ is linear in the output parameters W^out, irrespective of the form of the Lyapunov function L. For a given point u ∈ Ω, Eq. (15) defines a linear constraint on the read-out parameters W^out, which is implemented by solving the quadratic program with weight regularization [20]:

W^out = arg min_W ( ‖W · H(X) − V‖^2 + ε‖W‖^2 )
subject to: L̇(U) < 0 , (16)
where the matrix H(X) = (h(x(1)), ..., h(x(N_tr))) collects the hidden layer states obtained from a given data set D = (X, V) = (x^i(k), v^i(k)) for inputs X and the corresponding output vectors V, and where ε is a regularization parameter. It is shown in [20] that a well-chosen sampling of points U is sufficient to generalize the incorporated discrete constraints to continuous regions in a reliable way. The independence of Eq. (15) from the specific form of L motivates the use of methods to learn highly flexible Lyapunov candidates from data. The neurally imprinted Lyapunov candidate (NILC) [19] is such a method that enables the NIVF approach to generate robust and flexible movements for robotics. Details of the approach are stated in Sec. 5.2.
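Because Eq. (15) is linear in W^out, each constraint point u contributes one linear inequality row over the flattened read-out weights. A minimal numpy sketch of this constraint construction follows; the network shapes and names are illustrative, not the authors' implementation.

```python
import numpy as np

def hidden(x, W_inp, b):
    """Hidden-layer activations with the component-wise Fermi function."""
    return 1.0 / (1.0 + np.exp(-(W_inp @ x + b)))

def constraint_row(u, grad_L, W_inp, b):
    """Row a such that a . vec(W_out) = L_dot(u), cf. Eq. (15).
    Since L_dot(u) = grad_L(u)^T W_out h(u), the row is the outer
    product of grad_L(u) and h(u), flattened like W_out (I x R)."""
    return np.outer(grad_L(u), hidden(u, W_inp, b)).ravel()

# a feasible W_out must satisfy a . vec(W_out) < 0 for all sampled u,
# which plugs directly into a QP such as Eq. (16)
```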
4. Learning Stable Dynamics under Diffeomorphic Transformations
This section describes how to link a Lyapunov candidate with respect to given demonstrations in one space Ω and the learning of a stable dynamical system with quadratic Lyapunov function with respect to transformed data in a second space Ω̃ by means of a diffeomorphism τ. The latter is described on an abstract level and by an illustrative example. Also, the main algorithmic steps are introduced. The procedure undergoes a rigorous stability analysis that substantiates the underlying principles.
4.1. Overview
Assume that a Lyapunov candidate L : Ω → R with L ∈ L, which is consistent with the demonstrations in D, is given or can be constructed automatically. The main goal is to find a mapping τ : L × Ω → Ω̃ that transforms the Lyapunov function candidate L into a fixed and simple function L̃ : Ω̃ → R in the new space Ω̃ such that the parameterized mapping τ_L : Ω → Ω̃ is a diffeomorphism. The transformation is defined according to the following
Definition 3. A diffeomorphic candidate transformation τ : L × Ω → Ω̃ with (L, x) ↦ x̃ transforms all Lyapunov candidates L : Ω → R with L ∈ L to a fixed function L̃ : Ω̃ → R such that the parameterized mapping τ_L : Ω → Ω̃ is a diffeomorphism, i.e. τ_L : Ω → Ω̃ is bijective, continuous, continuously differentiable, and the inverse mapping τ_L^{-1} : Ω̃ → Ω is also continuous and continuously differentiable. We say τ corresponds to L.
The main example and standard case used in this work is to target a quadratic function L̃(x̃) = L(τ_L^{-1}(x̃)) = ‖x̃‖^2 after the transformation. The idea is then to use τ_L in order to transform the
data set D into the new space. The obtained data set

D̃ = (x̃^i(k), ṽ^i(k)) = ( τ_L(x^i(k)), J_τ^T(x^i(k)) · v^i(k) ) (17)

is consistent with this Lyapunov candidate L̃ if the initial data D is consistent with the Lyapunov function candidate L. The term (J_τ(x^i(k)))_{mn} = ∂τ_n(x^i(k))/∂x_m denotes the Jacobian matrix for τ_L at point x^i(k).
Also assume that a learner is given which is able to guarantee asymptotic stability by means of a quadratic Lyapunov function L̃(x̃) = ‖x̃‖^2 (e.g. the SEDS approach). The dynamical system ỹ : Ω̃ → Ω̃ trained with the data D̃ in Ω̃ is then expected to be accurate. The inverse of the diffeomorphism τ_L^{-1} : Ω̃ → Ω is used to map the dynamical system back to the original space. The back transformation y : Ω → Ω of ỹ is formally given by

y(x) := J_τ^{-T}(τ_L(x)) · ỹ(τ_L(x)) , (18)

where J_τ^{-T}(τ_L(x)) denotes the transpose of the inverse Jacobian matrix of τ_L at point x, with entries (J_τ^{-1}(x̃))_{ij} = ∂τ_{L,j}^{-1}(x̃)/∂x̃_i.
Algorithm 1 Diffeomorphic Transformation Approach
Require: Data set D = (x^i(k), v^i(k)) : i = 1 ... N_traj, k = 1 ... N^i is given
1) Construct a Lyapunov candidate L : Ω → R that is consistent with data D
2) Define a diffeomorphism τ : L × Ω → Ω̃ where L̃ takes a quadratic form
3) Transform D to D̃ = (x̃^i(k), ṽ^i(k)) = (τ_L(x^i(k)), J_τ^T(x^i(k)) · v^i(k))
4) Learn a dynamical system ỹ : Ω̃ → Ω̃ of data D̃ in Ω̃ with stability according to L̃
5) Apply the back transformation y(x) := J_τ^{-T}(τ_L(x)) · ỹ(τ_L(x)) in Ω to obtain a stable dynamical system
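The five steps of Alg. 1 map naturally onto a small pipeline. The sketch below wires them together in Python; the callables tau, tau_inv, jacobian, and fit_stable_learner are placeholders for a concrete candidate, diffeomorphism, and learner (e.g. SEDS), not code from the original paper.

```python
import numpy as np

def train_tau_learner(X, V, tau, jacobian, fit_stable_learner):
    """Alg. 1, steps 3-5: transform the data, learn in the new space,
    and return the back-transformed vector field y(x) of Eq. (18)."""
    # step 3: map positions and velocities into the transformed space
    X_t = np.array([tau(x) for x in X])
    V_t = np.array([jacobian(x).T @ v for x, v in zip(X, V)])

    # step 4: learner guaranteeing stability w.r.t. L~(x~) = ||x~||^2
    y_tilde = fit_stable_learner(X_t, V_t)

    # step 5: back transformation, Eq. (18)
    def y(x):
        J = jacobian(x)
        return np.linalg.solve(J.T, y_tilde(tau(x)))  # J^{-T} y~(tau(x))
    return y
```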
Figure 2: Schematic illustration of the proposed transformation approach. The left part of the figure shows the original space Ω, the demonstrations D, and the complex Lyapunov candidate L. The right side visualizes the transformed space Ω̃ equipped with a quadratic Lyapunov function L̃ and the corresponding data D̃. The transformation τ between those spaces is visualized by the arrows in the center part of the plot.
This transformation behavior is rigorously investigated in the following sections regarding the stability analysis of the underlying dynamical systems. The procedure is summarized in Alg. 1 and schematically illustrated in Fig. 2.
4.2. The Diffeomorphic Transformation Approach: A Simple Illustrative Example
Fig. 3 illustrates the intermediate steps of the diffeomorphic candidate transformation and learning of the SEDS approach shown in Alg. 1. The movement obviously violates a quadratic Lyapunov candidate (as shown in Fig. 1) and is thus well suited for the transformation approach.
First, we manually construct an elliptic Lyapunov candidate that is more or less consistent with the training data D (step 1). It is directly clear that an elliptic Lyapunov candidate is too restricted for learning complex motions, but it is good enough to serve as an example. We define the Lyapunov candidate as

L(x) = x^T P x , (19)

with the diagonal matrix P = diag(1, 5). The set of possible candidates L is given by

L = { x^T P x : P diagonal and positive definite } . (20)
The visualization of this scalar function that serves as Lyapunov candidate is shown in Fig. 3 (second). Note that this function still violates the training data but relaxes the violation to a satisfactory degree. A diffeomorphic candidate transformation τ that corresponds to L is given (step 2) by the following mapping

τ_L(x) = √P x , (21)

where √P is the component-wise square root of the matrix P, particularly constructed for the elliptic candidate functions defined by different diagonal matrices P. It is important to understand that this function τ maps any elliptic Lyapunov candidate in L onto a quadratic function:
L̃(x̃) = L(τ^{-1}(x̃)) = L(√P^{-1} x̃)
      = (√P^{-1} x̃)^T P √P^{-1} x̃
      = x̃^T √P^{-T} P √P^{-1} x̃
      = x̃^T √P^{-1} P √P^{-1} x̃
      = x̃^T x̃ = ‖x̃‖^2 . (22)
The respective Jacobian matrix is analytically given and calculated as

J_τ(x) = √P , (23)

where we used the symmetric definition of the matrix P: √P^T = √P.
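For this elliptic special case, all quantities of Alg. 1 are available in closed form, which the following sketch works out numerically. It is a toy illustration with the P = diag(1, 5) from Eq. (19), not the authors' code.

```python
import numpy as np

P = np.diag([1.0, 5.0])          # elliptic candidate L(x) = x^T P x, Eq. (19)
sqrtP = np.sqrt(P)               # component-wise root of the diagonal matrix

tau = lambda x: sqrtP @ x        # Eq. (21): maps L onto L~(x~) = ||x~||^2
tau_inv = lambda z: np.linalg.solve(sqrtP, z)
J = sqrtP                        # Eq. (23): constant Jacobian

x = np.array([2.0, -1.0])
x_t = tau(x)
# L(x) equals L~(tau(x)) as required by the transformation behavior
assert np.isclose(x @ P @ x, x_t @ x_t)
```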
The training data set D is then prepared for learning by transforming it into the data set D̃ that is defined in the transformed space (step 3) and consistent with a quadratic Lyapunov candidate L̃(x̃). The result of the data transformation and the Lyapunov candidate is illustrated in Fig. 3 (second). We then apply a learning approach (here: SEDS) to obtain ỹ (step 4), which is stable according to a quadratic Lyapunov function L̃ in Ω̃ after learning the data D̃. The result of the learning is depicted by the dynamic flow after learning the transformed demonstrations D̃ in Fig. 3 (third).

Finally, the inverse transformation τ_L^{-1} = √P^{-1} is used to obtain the dynamics y for the original data D in the original space Ω (step 5). Eq. (18) was used for back transformation. It is illustrated that the transformed data set still violates the quadratic function, however less strongly, such that a more accurate modeling of the original demonstrations is enabled, see Fig. 3 (fourth). Note that the resulting dynamical systems and their generalization can potentially differ in many aspects. Such effects are caused by many different features of the algorithm, e.g. the selection of the Lyapunov candidate or the randomness of the SEDS algorithm. The exact definition of the term “generalization capability” of a dynamical system and its measurement remains difficult. Systematic approaches to answer this question were rigorously discussed in [21].

Figure 3: Demonstrations D and the respective Lyapunov candidate L in Ω (first). Transformed Lyapunov function L̃ and transformed demonstrations D̃ in Ω̃ (second). The dynamical system ỹ learned by SEDS using the data set D̃, which fulfills a quadratic Lyapunov function in the transformed space Ω̃ (third). The result y in Ω after applying the inverse transformation τ_L^{-1} to ỹ (fourth).
4.3. General Stability Analysis
The main question raised regards the stability of the transformed system y in the original space Ω. It is also of fundamental interest how the generalization capability of the learner in Ω̃ transfers into the original space Ω. The following proposition indicates the necessary conditions for implementation.
Proposition 1. Let D = (x^i(k), v^i(k)) be a data set with i = 1 ... N_traj and k = 1 ... N^i consisting of N_traj demonstrations, and let L : Ω → R be a Lyapunov candidate from the set L. Let τ : L × Ω → Ω̃ be a diffeomorphic candidate transformation that corresponds to L.

Then, it holds for all L ∈ L that the dynamical system y : Ω → Ω with y(x) := J_τ^{-T}(τ_L(x)) · ỹ(τ_L(x)) is asymptotically stable at target x^* with Lyapunov function L if and only if the dynamical system ỹ : Ω̃ → Ω̃ is asymptotically stable at target x̃^* with τ_L(x^*) = x̃^* and Lyapunov function L̃.
Proof. We first derive the transformation properties for the Lyapunov candidate. Note that the dependence on the Lyapunov candidate L will be omitted in the following for notational simplicity, i.e. τ = τ_L. Scalar functions such as Lyapunov candidates show the following forward and backward transformation behavior

L(x) = L̃(τ(x)) and L̃(x̃) = L(τ^{-1}(x̃)) , (24)

where these equations hold for x ∈ Ω and x̃ ∈ Ω̃. This transformation behavior is important for the investigation of the differential information of the Lyapunov candidates in the different spaces. The gradient of the Lyapunov candidate thus transforms according to

∇L(x) = J_τ(x) · ∇̃L̃(x̃) , (25)

where (J_τ(x))_{ij} = ∂τ_j(x)/∂x_i is the Jacobian matrix for the diffeomorphism τ at point x. A vector field y(x) can also be represented in both spaces. The transformation behavior of the dynamical system is the following:
following
y(x) =
d∑k=1
d∑j=1
(J−Tτ (τ(x)))ji · ỹi(τ(x))
= J−Tτ (x̃) · ỹ(x̃) .
(26)
where (J−Tτ (τ(x)))ij is the transpose of the inverse Jaco-bian
matrix ∂∂x̃i τ
−1j (x̃) for the function τ at point x. These
identities hold because of the diffeomorphic properties ofτ .
The mathematical justification is given by the inversefunction
theorem, which states that the inverse of the Ja-cobian matrix
equals the Jacobian of the inverse function.
The following equations show that L is an actual Lyapunov function for the dynamical system y(x). Per definition, L satisfies (i) and (ii) of Lyapunov's conditions for asymptotic stability stated in theorem 1. We thus focus on condition (iii):

L̇(x^*) = (y(x^*))^T ∇L(x^*) = ( J_τ^{-T}(x̃^*) · ỹ(x̃^*) )^T · ∇L(x^*) = 0 , (27)

where ỹ(x̃^*) = 0 at the fixed point and ∇L(x^*) = 0 by condition (iii) of Def. 1,
which shows that condition (iii) is also satisfied. The main requirement for the proof of the proposition is that condition (iv) is fulfilled. It states that the dynamical system, which is stable with a quadratic Lyapunov function in the transformation space, becomes asymptotically stable according to the previously defined Lyapunov candidate function L in the original space Ω after back transformation. The Lyapunov candidate L thus becomes a Lyapunov function:
L̇(x) = (y(x))^T · ∇L(x)
      = ( J_τ^{-T}(x̃) · ỹ(x̃) )^T · J_τ(x) · ∇̃L̃(x̃)
      = ỹ(x̃)^T · J_τ^{-1}(x̃) · J_τ(x) · ∇̃L̃(x̃)
      = ỹ(x̃)^T · ∇̃L̃(x̃)
      = L̃̇(x̃) < 0 : ∀x̃ ∈ Ω̃, x̃ ≠ x̃^*
⇒ L̇(x) < 0 : ∀x ∈ Ω, x ≠ x^* , (28)

where Eq. (25) and Eq. (26) were used for derivation.
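The chain of identities in Eq. (28) can be sanity-checked numerically: for a smooth candidate and diffeomorphism, L̇ computed in Ω must match L̃̇ computed in Ω̃. A small finite-difference sketch with toy functions, purely illustrative:

```python
import numpy as np

def num_jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian with (J)_ij = d f_j / d x_i,
    matching the index convention used around Eq. (25)."""
    d = len(x)
    J = np.zeros((d, d))
    for i in range(d):
        e = np.zeros(d); e[i] = eps
        J[i, :] = (f(x + e) - f(x - e)) / (2 * eps)
    return J

# toy setup: elliptic candidate and its diffeomorphism, cf. Eqs. (19), (21)
P = np.diag([1.0, 5.0]); sqrtP = np.sqrt(P)
tau = lambda x: sqrtP @ x
y_tilde = lambda z: -z                      # stable w.r.t. L~(x~) = ||x~||^2

x = np.array([0.7, -0.3])
J = num_jacobian(tau, x)
y = np.linalg.solve(J.T, y_tilde(tau(x)))   # back transformation, Eq. (18)

L_dot = y @ (2 * P @ x)                     # grad L(x) = 2 P x
L_tilde_dot = y_tilde(tau(x)) @ (2 * tau(x))
assert np.isclose(L_dot, L_tilde_dot)       # Eq. (28): the two agree
```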
It is of great interest how this framework affects the approximation capabilities of the underlying approach during transformation. It can be shown that the approximation is optimal in the least squares sense, which is summarized in the following proposition.

Proposition 2. Assume that the same prerequisites as in Prop. 1 are given. Then, it holds for all L ∈ L that the dynamical system y : Ω → Ω approximates the data D in the least squares sense if and only if ỹ(x) approximates the transformed data set D̃ in the least squares sense.
Proof. We assume that the mapping in the transformed space according to the learner ỹ : Ω̃ → Ω̃ approximates the data set D̃ = (x̃(k), ṽ(k)) = (τ_L(x(k)), J_τ^T(x(k)) · v(k)), i.e. that the learner itself is continuous and minimizes the following error:

Ẽ = Σ_{k=1}^N ‖ỹ(x̃(k)) − ṽ(k)‖^2 → min . (29)

The error in the original space Ω for a given data set D, a back-transformed dynamical system y, and the corresponding data set D̃ in the transformed space Ω̃ learned by ỹ is given by

E = Σ_{k=1}^N ‖y(x^i(k)) − v^i(k)‖^2
  = Σ_{k=1}^N ‖J_τ^{-T}(τ(x^i(k))) [ ỹ(τ(x^i(k))) − ṽ^i(k) ]‖^2
  = Σ_{k=1}^N ‖J_τ^{-T}(x̃^i(k)) · [ ỹ(x̃^i(k)) − ṽ^i(k) ]‖^2 → min , (30)

where the factor J_τ^{-T}(x̃^i(k)) is fixed for a given transformation while the bracketed residual is minimized in Eq. (29). This shows that the error E decreases for a given fixed transformation τ if the error Ẽ in the transformed space is minimized, because L̃ and D̃ are consistent.
Note that the proposition gives no specific information about the construction of the Lyapunov candidate and the diffeomorphism. The following sections introduce and rigorously analyze possible Lyapunov candidates and a corresponding diffeomorphism.
5. Learning Complex Lyapunov Candidates
This section investigates step 1) of Alg. 1, the construction or learning of Lyapunov candidates from demonstrations.
5.1. Weighted Sum of Asymmetric Quadratic Functions (WSAQF)
The construction of valid Lyapunov candidates can be done in various ways. One option is to model the candidate function manually. However, this is potentially difficult and time consuming. We therefore suggest to apply automatic methods to learn valid Lyapunov candidate functions from data. A method that constructs Lyapunov candidates in a data-driven manner is the already mentioned weighted sum of asymmetric quadratic functions (WSAQF) [16]. The following equations describe the respective parametrization:

L(x) = x^T P^0 x + Σ_{l=1}^L β^l(x) ( x^T P^l (x − μ^l) )^2 , (31)

where we set x^* := 0 for convenience. L is the number of asymmetric quadratic functions used, μ^l are mean vectors that shape the asymmetry of the functions, and P^l ∈ R^{d×d} are positive definite matrices. The coefficients β are defined according to the following:

β^l(x) = 1 if x^T P^l (x − μ^l) ≥ 0, and β^l(x) = 0 if x^T P^l (x − μ^l) < 0 . (32)
Khansari-Zadeh et al. state that this scalar function is continuous and continuously differentiable. Furthermore, the function has a unique global minimum and therefore serves as a potential control Lyapunov function. Learning is done by adaptation of the components of the matrices P^l and the vectors μ^l in order to minimize the following constrained objective function:

min Σ_{i=1}^{N_traj} Σ_{k=1}^{N^i} [ (1 + w̄)/2 · sign(ψ_k^i) (ψ_k^i)^2 + (1 − w̄)/2 · (ψ_k^i)^2 ]
subject to P^l ≻ 0 : l = 0, ... L , (33)

where ≻ 0 denotes the positive definiteness of a matrix and w̄ is a small positive scalar. The function ψ is defined according to the following:

ψ_k^i = ∇L(x^i(k))^T v^i(k) / ( ‖∇L(x^i(k))‖ · ‖v^i(k)‖ ) . (34)
We show that this scalar function is a valid Lyapunov candidate.

Lemma 1. The WSAQF approach L : Ω → R is a (global) Lyapunov candidate function according to Def. 1 and it holds that (x − x^*)^T · ∇L > 0.
Proof. Obviously, conditions (i), (ii), and (iii) in Def. 1 are fulfilled. The function is also continuous and continuously differentiable despite the switches of β from zero to one or vice versa. In order to analyze condition (iv), the gradient is calculated:

∇L = (P^0 + P^{0T}) x + Σ_{l=1}^L 2β^l(x) x^T P^l (x − μ^l) · [ (P^l + P^{lT}) x − P^l μ^l ] . (35)

Condition (iv) holds because of the following inequality, which demonstrates that L becomes a valid Lyapunov candidate according to Def. 1. Note that we still set x^* := 0 for convenience without losing generality. Using x^T (P^l + P^{lT}) x − x^T P^l μ^l = x^T P^l (x − μ^l) + x^T P^{lT} x, we obtain

x^T · ∇L = x^T (P^0 + P^{0T}) x + Σ_{l=1}^L 2β^l(x) ( x^T P^l (x − μ^l) )^2 + 2β^l(x) x^T P^l (x − μ^l) · x^T P^{lT} x > 0 : ∀x ∈ Ω , (36)
where P^l are positive definite matrices and Ω = R^d: the first term is positive, β^l(x) ≥ 0 gates a squared and hence non-negative term, and x^T P^{lT} x > 0 since the transpose of a positive definite matrix is also positive definite. The WSAQF approach indeed constructs Lyapunov candidates that are radially unbounded because of its specific structure:

L(x) = x^T P^0 x + Σ_{l=1}^L β^l(x) ( x^T P^l (x − μ^l) )^2 , (37)

where the quadratic term x^T P^0 x tends to infinity for ‖x‖ → ∞ and the sum is non-negative, such that the WSAQF approach becomes a globally defined Lyapunov candidate.
Different methods to learn Lyapunov candidates are also potentially applicable as long as the learned function satisfies the conditions in Def. 1.
5.2. Neurally-Imprinted Lyapunov Candidates (NILC)
The learning or construction of appropriate Lyapunov candidate functions from data is challenging. In previous work [19], we have already introduced a neural network approach called neurally imprinted Lyapunov candidate (NILC). This approach learns Lyapunov candidates L : Ω → R that are smooth and well suited to shape dynamical systems that in earlier work have been learned with neural networks as well [8]. We briefly introduce the method from [19].

Consider a neural network architecture which defines a scalar function L : R^d → R. This network comprises three layers of neurons: x ∈ R^d denotes the input, h ∈ R^R the hidden, and L ∈ R the output neuron. The input is connected to the hidden layer through the input matrix W^inp ∈ R^{R×d}, which is randomly initialized and stays fixed during learning. The read-out matrix comprises the parameters subject to learning, denoted by W^out ∈ R^R. For input x, the output neuron is thus given by

L(x) = Σ_{j=1}^R W^out_j f( Σ_{n=1}^d W^inp_{jn} x_n + b_j ) . (38)
The main goal is to minimize the violation between the training data and the candidate function by making the negative gradient of this function follow the training data closely. A quadratic program is defined:

(1/N_ds) Σ_{i=1}^{N_traj} Σ_{k=1}^{N^i} ‖ −∇L(x^i(k)) − v^i(k) ‖^2 + ε_RR ‖W^out‖^2 → min_{W^out} , (39)

subject to the following equality and inequality constraints corresponding to Lyapunov's conditions (i)-(iv) in theorem 1, such that L becomes a valid Lyapunov candidate function:

(a) L(x^*) = 0    (b) L(x) > 0 : x ≠ x^*
(c) ∇L(x^*) = 0    (d) x^T ∇L(x) > 0 : x ≠ x^* , (40)
where the constraints (b) and (d) define inequality constraints, which are implemented by sampling. The gradient of the scalar function defined by the network in Eq. (38) is linear in W^out and given by

(∇L(x))_i = Σ_{j=1}^R W^out_j f′( Σ_{k=1}^d W^inp_{jk} x_k + b_j ) · W^inp_{ji} , (41)

where f′ denotes the first derivative of the Fermi function.
The disadvantage of this approach is that the Lyapunov candidate is not globally valid. It can be extended towards predefined but finite regions. Interestingly, this candidate also fulfills the following condition: (x − x^*)^T · ∇L > 0, which is important for the diffeomorphic transformation that is defined in the following section. The result of this constructive approach is summarized in the following lemma:

Lemma 2. The NILC approach L : Ω → R is a (local) Lyapunov candidate function according to Def. 1 and it holds that (x − x^*)^T · ∇L > 0.
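Since Eq. (38) and Eq. (41) are both linear in W^out, candidate value and gradient are cheap to evaluate. A minimal numpy sketch of such a random-feature network follows; shapes and initialization are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
d, R = 2, 100                        # input dimension and hidden layer size
W_inp = rng.normal(size=(R, d))      # fixed random input weights
b = rng.normal(size=R)               # hidden biases
W_out = rng.normal(size=R)           # read-out weights, subject to learning

fermi = lambda a: 1.0 / (1.0 + np.exp(-a))

def L(x):
    """Scalar candidate network, Eq. (38)."""
    return W_out @ fermi(W_inp @ x + b)

def grad_L(x):
    """Gradient of the candidate, Eq. (41); linear in W_out."""
    a = W_inp @ x + b
    fp = fermi(a) * (1.0 - fermi(a))   # derivative of the Fermi function
    return W_inp.T @ (W_out * fp)
```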
The previous section revealed that arbitrary Lyapunov function candidates are applicable for the learning of stable dynamical systems if a diffeomorphism is given that transforms the candidate into a quadratic function. The following section defines and investigates a corresponding diffeomorphism for the NILC and the WSAQF Lyapunov candidate approaches.
6. Coping with Complex Lyapunov Candidates: The Diffeomorphic Candidate Transformation
This section defines step 2) of Alg. 1 in detail. In order to allow an implementation of flexible Lyapunov candidates L : Ω → R, a diffeomorphic candidate transformation τ : L × Ω → Ω̃, x ↦ x̃ is defined as follows:

τ_L(x) = √(L(x)) · (x − x^*)/‖x − x^*‖ if x ≠ x^*, and τ_L(x) = x^* if x = x^* . (42)
This mapping transforms each Lyapunov candidate L according to Def. 1 into a quadratic function L̃ : Ω̃ → R, x̃ ↦ ‖x̃‖^2, as stated by the following lemma.
Lemma 3. The mapping τ : L × Ω → Ω̃ is a diffeomorphic candidate transformation according to Def. 3 that corresponds to the set of Lyapunov candidates L where each element L ∈ L fulfills (x − x^*)^T · ∇L > 0 : x ∈ Ω, i.e. τ_L : Ω → Ω̃ is bijective, continuous, continuously differentiable, and the inverse mapping τ_L^{-1} : Ω̃ → Ω is also continuous and continuously differentiable. Further, τ transforms functions L ∈ L to the fixed quadratic function L̃ : Ω̃ → R, x̃ ↦ ‖x̃‖^2.
Proof. We again define τ := τ_L and set x^* = 0 for convenience. At first, it is obvious that τ : Ω → Ω̃ is continuous and continuously differentiable, because L is continuous and continuously differentiable. Importantly, the diffeomorphism is injective, i.e.

∀x_1, x_2 ∈ Ω : ( x_1 ≠ x_2 ⇒ τ(x_1) ≠ τ(x_2) ) . (43)

If x_1, x_2 ∈ Ω are arbitrary vectors with x_1 ≠ x_2 and x_{1,2} ≠ 0, four different cases are distinguished:

(1) L(x_1) ≠ L(x_2) and x_1 ≁ x_2
(2) L(x_1) = L(x_2) and x_1 ≁ x_2
(3) L(x_1) ≠ L(x_2) and x_1 ∼ x_2
(4) L(x_1) = L(x_2) and x_1 ∼ x_2 , (44)
(44)
where x ∼ y means that there exists a real number λ > 0for
which x = λy holds. Cases (1) and (2) are unproblem-atic because
τ(x1) 6= τ(x2) directly follows from x1 � x2.In order to analyze
case (3), we calculate the directionalderivative of L along the
direction of x which exists dueto the total differentiability of τ
and satisfies the followinginequality
∇xL(x) = xT∇L(x) > 0 . (45)
This is directly according to condition (iv) of the con-sidered
Lyapunov candidate. L is thus strictly mono-tonically increasing
along a given direction in Ω. WithL(x1) 6= L(x2) we therefore infer
that ‖τ(x1)‖ 6= ‖τ(x2)‖and thus τ(x1) 6= τ(x2). Case (4) is
invalid, becauseL(x1) = L(x2) ⇒ ‖τ(x1)‖ = ‖τ(x2)‖ and with x1 ∼
x2it follows that x1 = x2 which is in contradiction to
theassumption that x1 6= x2. Therefore, τ is injective. It
It directly follows that τ : Ω → Ω̃ is surjective, because Ω̃ is the image of τ, and thus bijective. The inverse function τ^{-1} : Ω̃ → Ω exists because of the bijectivity and is continuous and continuously differentiable. The reason is that the directional derivative of L(x) along x is strictly monotonically increasing. In order to show that the diffeomorphism τ maps each L onto the fixed function L̃ : Ω̃ → R, x̃ ↦ ‖x̃‖^2, note that the following equivalence holds per definition:

‖τ(x)‖ = √(L(x)) ⇔ ‖τ(x)‖^2 = L(x) . (46)

The transformed function becomes quadratic with the use of Eq. (46):

L̃(x̃) = L(τ^{-1}(x̃)) = ‖τ(τ^{-1}(x̃))‖^2 = ‖x̃‖^2 . (47)
Each Lyapunov candidate that satisfies (x − x^*)^T · ∇L > 0 (such as the NILC and the WSAQF approach) and the diffeomorphism in Lem. 3 are therefore applicable for the implementation of flexible and desired Lyapunov candidates with the τ-SEDS approach.
In this particular case, the Jacobian J_τ(x) of the diffeomorphism τ_L can be derived analytically. We again set x^* = 0 for simplicity:

J_τ(x)_{ij} = ∂τ_j(x)/∂x_i = ∂/∂x_i ( √(L(x)) · x_j/‖x‖ ) , i.e.

J_τ(x) = ∇L(x)/(2√(L(x))) · x^T/‖x‖ + √(L(x)) ( I/‖x‖ − x x^T/‖x‖^3 ) , (48)

where I ∈ R^{d×d} is the identity matrix, L(x) is the Lyapunov candidate, and ∇L(x) denotes the gradient of the Lyapunov candidate. It is important to note that this Jacobian has a removable singularity at x = 0 and is thus well-defined for the limit case of ‖x‖ → 0, where J_τ(x) = 0 : x = 0.
This approach, based on the framework of diffeomorphic candidate transformations τ introduced in this section and applying SEDS with WSAQF or NILC as a basis for learning, is called τ-SEDS (WSAQF, NILC) or simply τ-SEDS in the following. Note that this theoretical framework is not restricted to the special forms of the used approaches and thus serves as a fundamental framework for learning complex motions under diffeomorphic transformations.
7. Experimental Results
This section introduces the experimental results obtained for the different approaches and compares them qualitatively and quantitatively. This comprises steps 3) to 5) of Alg. 1.
Figure 4: Lyapunov candidate constructed by the WSAQF approach (first row), and Lyapunov candidate originating from the NILC approach (second row). Desired Lyapunov function L and data set D in the original space Ω (first column). Transformed Lyapunov function L̃ and transformed data set D̃ in Ω̃ (second column). Dynamical system ỹ learned by SEDS using data set D̃, which admits a quadratic Lyapunov function in the transformed space Ω̃ (third column). The result y in Ω after applying the inverse transformation τ^{-1} to ỹ (fourth column).
7.1. Reproduction Accuracy
Measuring the accuracy of a reproduction is an important tool to evaluate the performance of a movement generation method. We use the swept error area (SEA), first defined in [16], as an error functional to evaluate the reproduction accuracy of the methods. It is computed by

E = (1/N) Σ_{i=1}^{N_traj} Σ_{k=1}^{N^i − 1} A( x̂^i(k), x̂^i(k+1), x^i(k), x^i(k+1) ) , (49)

where x̂^i(·) is the equidistantly re-sampled reproduction of the demonstration x^i(·) with the same number of samples N^i, and A(·) denotes the function which calculates the area of the enclosed tetragon generated by the four points x̂^i(k), x̂^i(k+1), x^i(k), and x^i(k+1).
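The SEA of Eq. (49) reduces to summing quadrilateral areas between corresponding samples. A compact sketch using the shoelace formula for planar data (an illustrative approximation, not the authors' implementation):

```python
import numpy as np

def quad_area(p1, p2, q1, q2):
    """Area of the tetragon spanned by reproduction points p1, p2 and
    demonstration points q1, q2, via the shoelace formula."""
    pts = np.array([p1, p2, q2, q1])          # consistent winding order
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def swept_error_area(reproductions, demonstrations):
    """Eq. (49): mean swept area between each reproduction x_hat^i and
    its demonstration x^i, both given as lists of (N_i, 2) arrays."""
    N = sum(len(d) for d in demonstrations)
    total = 0.0
    for rep, dem in zip(reproductions, demonstrations):
        for k in range(len(dem) - 1):
            total += quad_area(rep[k], rep[k + 1], dem[k], dem[k + 1])
    return total / N
```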
7.2. Illustrative Example: τ-SEDS
This experiment illustrates the process of diffeomorphic transformation and learning of the SEDS approach in combination with the WSAQF and NILC Lyapunov candidates. The experimental results are again obtained for a sharp-C-like movement from a library of 30 human handwriting motions called the LASA data set [14]. This data set provides realistic handwritten motions and is used in several different studies about the learning of dynamical systems applied for movement generation [13, 19, 8]. As mentioned, the movement violates a quadratic Lyapunov candidate (shown in Fig. 1). The previously introduced Lyapunov candidates are used for transformation and comparison. The first candidate function is constructed by means of the NILC technique [19] (results shown in the first row of Fig. 4). The second function is obtained with the WSAQF approach (second row). Learning in the transformed space is done by SEDS (we used the SEDS software 1.95 by Khansari-Zadeh et al. [22]), which is initialized with K = 5 Gaussian functions and trained for maximally 500 iterations. The function τ (see Eq. (42)) is used as the corresponding diffeomorphic candidate transformation.
Fig. 4 illustrates the intermediate steps obtained during the learning and transformation phase. The plots in the first column of Fig. 4 show the different Lyapunov candidates that are consistent with the respective six demonstrations. The training data set D is then prepared for learning by transforming it into D̃, which is defined in the transformed space and consistent with a quadratic Lyapunov candidate L̃(x̃). The result of the data transformation and the Lyapunov candidate is illustrated in Fig. 4 (second column). We then apply the SEDS learning approach to obtain ỹ, which is stable according to a quadratic Lyapunov function L̃ in Ω̃ after learning
the data D̃. The result of the learning is depicted by the dynamic flow after learning the transformed demonstrations D̃ in Fig. 4 (third column). It is illustrated that the new data does not violate the quadratic function and thus allows an accurate modeling of the data. Finally, the inverse transformation τ_L^{-1} is used to obtain the dynamics y for the original data D in the original space Ω (step 5). Eq. (18) was used for back-transformation. Note that the obtained vector field has no discontinuities and provides a gentle generalization of the applied data set D irrespective of the used Lyapunov candidate L, see Fig. 4 (fourth column).

The experiment shows that the class of learnable demonstrations of SEDS is enhanced by means of the proposed framework based on diffeomorphic transformations. The experiment also reveals that the generalization capability of the learner transfers to the original space, which is an important prerequisite for such systems. Please compare the results of this experiment to Fig. 1 and Fig. 3.

Figure 5: Explicit stabilization of the sharp-C-shape through CLF. A Lyapunov candidate function learned with the WSAQF approach (top left). An unstable dynamical system of six demonstrations with GMR (top second). The stabilized system for three demonstrations with parameter ρ0 = 10 (top third), ρ0 = 1000 (top fourth), ρ0 = 100000 (top fifth). SEA of the stabilized system for varying ρ0 (bottom).
7.3. Investigating the Control Lyapunov Approach
The explicit stabilization during runtime with online corrections of the learned dynamical system in the CLF-DM approach is parameterized with ρ0 and κ0 defining the function ρ(‖x‖), see Eq. (13), which shifts the activation threshold of a correction signal u(x). Basically, two fundamental problems concerning these parameters are inherent to this approach. First, the parameters should be selected and scaled according to the scalar product ∇L(x)^T ŷ(x) in order to allow an appropriate stabilization, where ŷ is defined according to Eq. (12). The optimization process of these parameters is independent of the actual learning of the Lyapunov candidate L; hence, the learning of ŷ constitutes a separate process. Optimization of these parameters usually requires several iterations and is thus computationally expensive. Second, the parameters can only deal with a finite range of the scalar product ∇L(x)^T ŷ(x). This is particularly problematic whenever the scalar product is too small in some region and at the same time too big for ρ(‖x‖) in another. This can lead to inaccurate reproduction capabilities or numerical integration problems. The respective parameterization appears to be too simple in such situations. However, the introduction of more parameters is unsatisfying.
Fig. 5 demonstrates the effect of the parameterization and shows the learning of six demonstrations by means of the CLF approach. The first two plots in the figure show the result of the WSAQF approach for learning the Lyapunov candidate (top first) and the learning of the dynamical system by means of the Gaussian mixture regression (GMR) approach (top second). As expected, the simple GM regression method results in an unstable estimate of the demonstrations. The remaining plots of Fig. 5 show the experimental results. We selected κ0 = 0.05, which remains fixed, and varied ρ0 in the range [10, 100000] logarithmically in 11 steps. We recorded the SEA measure in this experiment. The bottom plot of Fig. 5 shows the SEA of the demonstrations and the respective reproductions. It is demonstrated that the reproduction accuracy decreases with increasing ρ0. For small ρ0, the reproduction accuracy is high, while for big ρ0, the reproductions become inaccurate. In this case, the reproduction is forced to converge faster and thus follows the gradient of the Lyapunov candidate rather than the demonstrations. However, a too strong reduction of ρ0 is also an insufficient strategy. The reason is that the correction signal becomes too small and can hinder convergence to the target attractor by inducing numerical spurious attractors. This is especially problematic if perturbations are present and drive the trajectory into such a spurious attractor. This can be avoided by increasing ρ0, which simultaneously increases the correction signal and thus “overwrites” the numerical attractors. These two facts introduce a trade-off in the CLF approach for which a careful parameter selection is necessary.
The three top right plots in Fig. 5 visualize these situations for ρ0 = 10, 1000, 100000 from left to right. The plots comprise the demonstrations used as training data (black trajectories), the reproduction by the dynamical system (red trajectories), the dynamical flow (blue trajectories and arrows), and a trajectory (green) that was iterated for one hundred steps. Overall, the left plot shows good reproduction accuracy but also a spurious numerical attractor at the end of the green trajectory; the center plot shows good reproduction and convergence capability, for which tuning the CLF parameters was successful; and the right plot shows a correction signal that is too strong, which results in a poor reproduction of the demonstrations and a strong convergence behavior.
These effects are especially problematic if we assume that two different Lyapunov candidates L ∼ L′ are given which are equivalent in the sense that their gradients point in the same direction: ∇L ∼ ∇L′. These functions can be transformed into each other by means of a continuously differentiable function ϕ(L) : R → R with ϕ(0) = 0 and ∂ϕ/∂L > 0. This has a direct influence on the diffeomorphic transformation, which copes with the function ϕ(L(x)) : Ω → R; only the training data D̃ in the transformed space Ω̃ looks different. These changes disappear after back transformation of the dynamical system into the original space. In contrast, the parameters κ0 and ρ0 of the correction signal are not automatically related to ϕ and need to be re-tuned whenever a new Lyapunov candidate is applied.
In summary, this sensitivity of the CLF approach to the parameters of the runtime correction is unsatisfactory to the degree that a separate optimization process - which is computationally costly - is required. The diffeomorphic transformation approach τ-SEDS, however, requires no additional parameters, since it merges the learning of the Lyapunov candidate and the actual dynamics through the SEDS approach.
7.4. Performance Experiments
This experiment compares the proposed approach τ-SEDS with state of the art methods and SEDS as baseline in a qualitative and quantitative manner. The performance of the different approaches is analyzed on the LASA data set [14]. For evaluation, we use the neurally imprinted Lyapunov candidate (NILC) and the weighted sum of asymmetric quadratic functions (WSAQF) approach as Lyapunov candidates and combine them with Gaussian mixture regression through the control Lyapunov function - dynamic movements (CLF-DM) approach; the stable estimator of dynamical systems through the diffeomorphic transformation (τ-SEDS) approach; and neurally imprinted vector fields (NIVF) through quadratic programming. The accuracy of the reproductions is again measured according to the SEA. The results are stated in Fig. 6.

Figure 6: The SEA for the different method combinations on the library of 30 handwritten motions.
The WSAQF Lyapunov candidates are learned and optimized according to the following: the trade-off parameter w̄ = 0.9 is fixed; an additional regularization term λ·‖θ‖^2 with λ = 0.1 is added in order to obtain smoother functions, where θ = {P^0, ..., P^L, μ^0, ..., μ^L} collects all parameters of the WSAQF approach; and the number of asymmetric functions L is increased until a minimum of the violation between Lyapunov candidate L and demonstrations D is observed. The NILC candidates are based on a network with R = 100 hidden neurons, where the regularization parameter is decreased from 10^0 to 10^{-5} logarithmically in 5 equidistant steps until the violation reaches a minimum. The Lyapunov properties are validated in N_C = 10^5 uniformly distributed samples (see [19] for details).

The results for each shape are averaged over ten initializations. The SEDS and GMR models were initialized with K = 7 Gaussian functions in the mean square error (MSE) mode and trained for maximally 1500 iterations. The parameters of the CLF integration were selected from 9 logarithmically equidistant steps from 10^{-6} to 10^3.
The experiments first reveal that the average SEA of the SEDS approach is extraordinarily large in comparison to the other approaches. The reason is that some of the 30 shapes (e.g. the sharp-C-shape) violate a quadratic Lyapunov candidate function. This leads to inaccurate reproductions and thus to a poor average performance, because these shapes are principally not learnable by SEDS.
Approach (LC)    | Dynamics      | Stability | Integration
SEDS (Quad.)     | L = x̃^T x̃   | Global    | Yes
CLF-DM (NILC)    | x̃^T ∇L > 0  | Local     | No
CLF-DM (WSAQF)   | x̃^T ∇L > 0  | Global    | No
NIVF (NILC)      | x̃^T ∇L > 0  | Local     | Yes
NIVF (WSAQF)     | x̃^T ∇L > 0  | Local     | Yes
τ-SEDS (NILC)    | x̃^T ∇L > 0  | Local     | Yes
τ-SEDS (WSAQF)   | x̃^T ∇L > 0  | Global    | Yes

Table 1: Qualitative comparison of the different method combinations. The τ-SEDS (WSAQF) approach is promising. Approaches which use the NILC approach as Lyapunov candidate cannot guarantee stability globally.
The performance of the dynamical systems explicitly stabilized by the CLF approach is significantly better than for the original SEDS approach, but slightly worse than for the other approaches. This is due to the fact that the selection of the CLF parameters was restricted to a discrete set of variables and that the parameterization of ρ is insufficient for learning some of the shapes. However, the learning performance can in principle be increased, but this would demand more computational resources.
The τ-SEDS and the NIVF approach reach the best results among the tested methods. The differences in the results originating from the two Lyapunov candidates are non-negligible. The WSAQF approach performs slightly better than the NILC due to the fact that its error functional also directly implements a reduction of the violation, see Eq. (33). The simple alignment of the candidates' gradient and the velocity of the demonstrations is - in some cases - not sufficient to implement a violation-free Lyapunov candidate with NILC. In fact, it is also possible to use the learning functional of the WSAQF approach for the NILC approach and vice versa.
An additional qualitative summary of the several approaches can be found in Tab. 1. The table assigns three important properties to the discussed approaches. The first property is the class of learnable functions. SEDS is in principle restricted to dynamics that satisfy a quadratic Lyapunov function. All other approaches allow much larger classes of dynamics, irrespective of whether the NILC or the WSAQF approach is applied. The second column states the range of the stability property and distinguishes between local and numerical stability guarantees or constructively proven global asymptotic stability. The last column in the table states whether the Lyapunov candidate is integrated into the learning procedure. This is not the case for the CLF-DM approach, because stabilization is only applied online. The τ-SEDS (WSAQF) approach appears to be the only approach which provides a large class of learnable demonstrations, allows global stability that is proven constructively, and integrates the Lyapunov candidate into the learning of the dynamical system directly while simultaneously performing in a reliable manner. We provide the results of this approach for all 30 movements in Fig. 7.
8. Robotics Experiment
We apply the presented approach in a robotic scenario involving the humanoid robot iCub [23]. Such robots are typically designed to solve service tasks in environments where a high flexibility is required. Robust adaptability by means of learning is thus a prerequisite for such systems. The experimental setting is illustrated in Fig. 8 (left). A human tutor physically guides iCub's right arm in the sense of kinesthetic teaching, using a recently established force control on the robot. The tutor can thereby actively move all joints of the arm to place the end-effector at the desired position. Beginning on the right side of the workspace, the tutor first moves the arm around the obstacle on the table, touches its top, and then moves the arm towards the left side of the obstacle where the movement stops. This procedure is repeated three times.
The recorded demonstrations comprise between Ntraj = 542 and Ntraj = 644 samples. We apply the original SEDS and the τ-SEDS (WSAQF) approach to learn the demonstrations, equip SEDS in both cases with K = 5 Gaussians, and iterate for maximally 1500 steps. The WSAQF was parameterized with λ = 0.01 and w̄ = 0.9 and comprised L = 3 basis functions. The results of the experiment are visualized in Fig. 8. The center plot of the figure shows the result of learning the robot demonstrations with the SEDS approach. The dynamical system constructed by SEDS is not able to follow the demonstrations because the dynamics are restricted to a quadratic Lyapunov function. The right plot of Fig. 8 visualizes the back transformation of the SEDS dynamics into the original space, which yields good reproductions while simultaneously guaranteeing asymptotic stability by construction. A closer inspection reveals that the reproductions from the learned dynamical system are actually smoother than the demonstrations and emphasize more clearly their main feature, the touching of the upmost point of the tower, rather than smoothing out this important part of the demonstrated movement.
9. Discussion
The experimental results obtained by application of the theoretically derived framework substantiate the success of this transformation procedure. Proposition 1 and Proposition 2 leave many degrees of freedom that can be used in different ways, as demonstrated in this paper. Therefore, many questions arise that we discuss in this section. It is, however, important to note that the majority of the answers given are based on empirical observations and conjectures. We nevertheless believe that these points are interesting to focus on in the near future.
• What are “good” Lyapunov candidates? One of the main requirements on Lyapunov candidates is that they are consistent with the demonstrations; learning thus appears as a method of choice to obtain an appropriate candidate. Another important point is that robot movements should be smooth. Strong accelerations are undesired because they are dangerous, both for humans in the vicinity of the robot and for the robot itself. The applied Lyapunov candidate is a major factor concerning the smoothness of the resulting dynamics. The construction of smooth scalar functions that reduce the risk of undesired jerkiness is thus indispensable.

Figure 7: The collection of Lyapunov candidates constructed with the WSAQF approach for the 30 handwritten motions (left block) and the corresponding dynamical systems constructed with the τ-SEDS (WSAQF) approach (right block). This approach generates accurate and stable movements.
[Two 3D plots with axes x, y, z (in meters); legend: Demonstrations, Reproductions, Attractor.]
Figure 8: Kinesthetic teaching of iCub and the results of the iCub experiment. The tutor moves iCub's right arm from the right to the left side of the small colored tower (left). Reproduction (red) of the demonstrated trajectories (black), in meters, by the original SEDS approach (center), which is inaccurate. The reproductions of τ-SEDS with the WSAQF candidate according to the back transformation into the original space Ω are accurate and stable (right).
• What is the class of learnable dynamics? The class of learnable dynamics is mainly driven by the Lyapunov candidate function (WSAQF, NILC). The diffeomorphic candidate transformation τL : Ω → Ω̃ defined in Eq. (42) requires Lyapunov candidates that fulfill the inequality (x − x∗)T · ∇L > 0, see Eq. (36); a small numerical check of this condition is sketched after this list. Hence, the class is indeed restricted but still much larger than for quadratic Lyapunov candidates. It is nevertheless worth investigating Lyapunov candidates and diffeomorphic candidate transformations that allow more complex dynamics or even universal approximation capabilities.
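The admissibility condition (x − x∗)T ∇L > 0 can be probed numerically for any differentiable candidate before the transformation is attempted. The following is a minimal sketch under the assumption that L is passed in as a plain scalar function (a placeholder, not the original WSAQF implementation); the gradient is approximated by central finite differences.

```python
import numpy as np

def is_admissible(L, x_star, n_samples=10000, radius=1.0, eps=1e-6):
    """Check (x - x*)^T grad L(x) > 0 on random samples around x*.
    Returns False on the first detected violation. The attractor x*
    itself is excluded with probability one by the random sampling.
    """
    d = x_star.shape[0]
    rng = np.random.default_rng(0)
    for x in x_star + radius * rng.standard_normal((n_samples, d)):
        grad = np.array([
            (L(x + eps * e) - L(x - eps * e)) / (2 * eps)
            for e in np.eye(d)
        ])
        if (x - x_star) @ grad <= 0:
            return False
    return True
```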
• What are “good” diffeomorphisms? It is clear that the diffeomorphic candidate transformation τ has an impact on the resulting dynamics. The differential properties of the learned dynamical system transfer via τL into the original space. It is, however, unclear how the properties change after back transformation. We believe that the diffeomorphism should be curved as much as necessary (to obtain the quadratic Lyapunov candidate) and as little as possible (to keep the differential properties rather unchanged).
• What is the role of the learner? The learner is certainly the major ingredient for successful learning of a non-linear dynamical system. Different properties are very important, such as the generalization ability, the smoothness of the solution, and the space Ω̃ for which asymptotic stability can be guaranteed. SEDS, e.g., is globally stable (Ω̃ = Rd), which also results in a global solution after back transformation if the Lyapunov candidate is likewise globally valid in Ω = Rd (such as for the WSAQF approach).
• How is the generalization ability of the learner affected by the transformation? The learner is supposed to minimize the error in the transformed space Ω̃. However, the back transformation changes the error values according to Eq. (30). The experiments showed no negative effect of this change, but a theoretical justification is still missing.
• How is the proposed approach related to the idea of movement primitives? One of the key features of movement primitives is that they can be superimposed to create more complex motions without inducing unstable behavior. The superposition of models based on the original SEDS formulation includes this feature of stability but is restricted to the same class of learnable dynamics as the SEDS approach itself, i.e. superimposed SEDS movements are stable according to a quadratic Lyapunov function. The resulting dynamics of two superimposed τ-SEDS models are not necessarily stable because of the potentially different underlying Lyapunov candidates used for transformation. It is nevertheless possible to guarantee stability of the resulting dynamics if the used Lyapunov candidates are identical; a short argument is sketched after this list. How to choose an applicable Lyapunov candidate which satisfies the requirements of all demonstrations at the same time is yet unclear.
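The claim that a shared candidate suffices follows from a standard one-line argument, which we sketch here since the text does not spell it out. If both vector fields f1 and f2 strictly decrease the same candidate V, then so does any nonnegative superposition:

```latex
\dot{V}(x) = \nabla V(x)^{\top}\bigl(\alpha f_1(x) + \beta f_2(x)\bigr)
           = \alpha\,\nabla V(x)^{\top} f_1(x)
           + \beta\,\nabla V(x)^{\top} f_2(x) < 0,
\qquad \alpha,\beta \ge 0,\ \alpha + \beta > 0.
```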
• Is the robot's stability guaranteed? It is mathematically proven that the dynamical systems used for motion control of the robot are stable if the mentioned movement primitive approaches are used. This does not necessarily mean that the movement of the real robot is also stable. However, several recent experimental results suggest that this is practically irrelevant.
Answering these questions is left for future research. The following section concludes the paper.
10. Conclusion
SEDS is a very exciting approach to learn dynamical systems while simultaneously ensuring global asymptotic stability. One of its main advantages is that it guarantees stability of point-to-point movements globally by construction. However, the class of dynamics that are accurately learnable is restricted to those consistent with a quadratic Lyapunov function. This property is undesired because it effectively prevents the learning of complex movements.
Other approaches such as CLF-DM or NIVF enable learning of larger sets of dynamics, however not without the disadvantage of either an online correction signal or the lack of constructive mathematical guarantees.
This paper therefore proposes a theoretical framework that enhances the class of learnable movements by employing SEDS9 indirectly, after a diffeomorphic transformation. In comparison to the state of the art, the proposed τ-SEDS (WSAQF) approach appears to be the only approach which provides a large class of learnable demonstrations, guarantees global stability proven constructively, and integrates the Lyapunov candidate directly into the learning of the dynamical system while simultaneously performing in a reliable manner.
The key idea is to build a flexible data-driven Lyapunov candidate that is consistent with the given demonstrations. The diffeomorphic candidate transformation τ is then used to map the data set into the transformed space where the demonstrations follow a quadratic Lyapunov function. Learning is applied to the transformed data set by SEDS. The learned dynamical system then accurately reproduces the demonstrations while simultaneously satisfying the conditions for asymptotic fixpoint stability. Finally, the back transformation of the dynamical system is performed. The result is complex dynamics that accurately follow the original demonstrations and are at the same time stable according to the previously defined Lyapunov candidate; i.e. the Lyapunov candidate becomes a Lyapunov function for the back-transformed dynamics. This new approach is called τ-SEDS; a schematic sketch of the pipeline is given below. Interestingly, we could easily reuse the complete implementation of the SEDS approach. This means that the framework allows a modular implementation without much coding effort.
The theoretical results are complemented by experimental results from robotics which illustrate the effect of the learning and transformation. The generality of the framework is demonstrated by using different Lyapunov candidates. This emphasizes the fact that the framework is restricted neither to the special form of the Lyapunov candidate nor to the SEDS approach.
9 We thank S. Mohammad Khansari-Zadeh and the LASA Lab at EPFL for providing the open source software of SEDS.
ACKNOWLEDGMENT
This research is funded by the German BMBF within the Intelligent Technical Systems Leading-Edge Cluster.
References
[1] M. Mühlig, M. Gienger, J. J. Steil, Interactive imitation learning of object movement skills, Autonomous Robots 32 (2) (2012) 97–114.
[2] A. Pistillo, S. Calinon, D. G. Caldwell, Bilateral physical interaction with a robot manipulator through a weighted combination of flow fields, in: IEEE Conf. IROS, 2011, pp. 3047–3052.
[3] A. Billard, S. Calinon, R. Dillmann, S. Schaal, Robot programming by demonstration, Vol. 1, Springer, 2008.
[4] S. Schaal, Is imitation learning the route to humanoid robots?, Trends in Cognitive Sciences 3 (6) (1999) 233–242.
[5] F. L. Moro, N. G. Tsagarakis, D. G. Caldwell, On the kinematic motion primitives (kMPs) - theory and application, Frontiers in Neurorobotics 6 (10).
[6] A. Ude, A. Gams, T. Asfour, J. Morimoto, Task-specific generalization of discrete and periodic dynamic movement primitives, IEEE Transactions on Robotics 26 (5) (2010) 800–815. doi:10.1109/TRO.2010.2065430.
[7] A. Ijspeert, J. Nakanishi, S. Schaal, Learning attractor landscapes for learning motor primitives, in: NIPS, 2003, pp. 1523–1530.
[8] A. Lemme, K. Neumann, R. F. Reinhart, J. J. Steil, Neurally imprinted stable vector fields, in: Proc. Europ. Symp. on Artificial Neural Networks, 2013, pp. 327–332.
[9] S. M. Khansari-Zadeh, A. Billard, BM: An iterative algorithm to learn stable non-linear dynamical systems with Gaussian mixture models, in: IEEE Conf. ICRA, 2010, pp. 2381–2388.
[10] F. Reinhart, A. Lemme, J. Steil, Representation and generalization of bi-manual skills from kinesthetic teaching, in: Proc. Humanoids, 2012, pp. 560–567.
[11] E. Gribovskaya, A. Billard, Learning nonlinear multi-variate motion dynamics for real-time position and orientation control of robotic manipulators, in: IEEE Conf. Humanoids, 2009, pp. 472–477.
[12] K. Dautenhahn, C. L. Nehaniv, The agent-based perspective on imitation, Imitation in Animals and Artifacts (2002) 1–40.
[13] S. M. Khansari-Zadeh, A. Billard, Learning stable nonlinear dynamical systems with Gaussian mixture models, IEEE Transactions on Robotics 27 (5) (2011) 943–957.
[14] S. M. Khansari-Zadeh, http://www.amarsi-project.eu/open-source (2012).
[15] S. M. Khansari-Zadeh, A dynamical system-based approach to modeling stable robot control policies via imitation learning, Ph.D. thesis, École Polytechnique Fédérale de Lausanne, Switzerland (2012).
[16] S. M. Khansari-Zadeh, A. Billard, Learning control Lyapunov function to ensure stability of dynamical system-based robot reaching motions, Robotics and Autonomous Systems.
[17] Z. Artstein, Stabilization with relaxed controls, Nonlinear Analysis TMA 7 (11) (1983) 1163–1173.
[18] E. D. Sontag, A universal construction of Artstein's theorem on nonlinear stabilization, Systems & Control Letters 13 (2) (1989) 117–123.
[19] K. Neumann, A. Lemme, J. J. Steil, Neural learning of stable dynamical systems based on data-driven Lyapunov candidates, in: IEEE Proc. Int. Conf. on Intelligent Robots and Systems, 2013, pp. 1216–1222.
[20] K. Neumann, M. Rolf, J. J. Steil, Reliable integration of continuous constraints into extreme learning machines, Journal of Uncertainty, Fuzziness and Knowledge-Based Systems.
[21] M. Khansari, A. Lemme, Y. Meirovitch, B. Schrauwen, M. A. Giese, A. Ijspeert, A. Billard, J. J. Steil, Workshop on benchmarking of state-of-the-art algorithms in generating human-like robot reaching motions, in: Humanoids, 2013.
[22] S. M. Khansari-Zadeh, http://lasa.epfl.ch/people/member.php?SCIPER=183746/ (2013).
[23] N. G. Tsagarakis, G. Metta, G. Sandini, D. Vernon, R. Beira, F. Becchi, L. Righetti, J. Santos-Victor, A. J. Ijspeert, M. C. Carrozza, et al., iCub: the design and realization of an open humanoid platform for cognitive and neuroscience research, Advanced Robotics 21 (10) (2007) 1151–1175.