Page 1

TENSOR APPROXIMATION TOOLS

FREE OF THE CURSE OF DIMENSIONALITY

Eugene Tyrtyshnikov

Institute of Numerical Mathematics

Russian Academy of Sciences

(joint work with Ivan Oseledets)

Page 2

WHAT ARE TENSORS?

Tensors = d-dimensional arrays:

A = [aij...k]

i ∈ I, j ∈ J, ... , k ∈ K

Tensor A has:

• dimensionality (order) d = number of indices (modes, axes, directions, ways)

• size n1 × ... × nd

(number of nodes along each axis)

Page 3

WHAT IS THE PROBLEM?

NUMBER OF TENSOR ELEMENTS = n^d

GROWS EXPONENTIALLY IN d

WATER AND UNIVERSE

The H2O molecule has 18 electrons. Each electron has 3 coordinates.

Thus we have 18 × 3 = 54 axes.

If we take 32 nodes on each axis, we obtain 32^54 ≈ 10^81 points,

which is close to the number of atoms in the universe.

CURSE OF DIMENSIONALITY

Page 4

WE SURVIVE WITH

• COMPACT (LOW-PARAMETRIC) REPRESENTATIONS FOR TENSORS

• METHODS FOR COMPUTATIONS IN COMPACT REPRESENTATIONS

Page 5

TUCKER DECOMPOSITION

a(i1, ..., id) = Σ_{α1=1}^{r1} ... Σ_{αd=1}^{rd} g(α1, ..., αd) q1(i1, α1) ... qd(id, αd)

L. R. Tucker, Some mathematical notes on three-mode factor analysis,

Psychometrika, vol. 31, pp. 279–311 (1966).

COMPONENTS:

• 2D arrays q1, ..., qd with dnr entries

• d-dimensional array g(α1, ..., αd) with r^d entries

CURSE OF DIMENSIONALITY REMAINS
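To make the parameter count concrete, here is a minimal NumPy sketch of a Tucker decomposition computed by the standard higher-order SVD (HOSVD). It is a textbook construction consistent with the formula above, not code from the talk; the function name and the truncation rule are illustrative.

```python
import numpy as np

def hosvd(a, eps=1e-10):
    """Tucker decomposition of a d-way array via HOSVD (sketch)."""
    factors = []
    g = a
    for k in range(a.ndim):
        # Unfold along mode k: rows indexed by i_k, columns by all others.
        unfolding = np.moveaxis(a, k, 0).reshape(a.shape[k], -1)
        u, s, _ = np.linalg.svd(unfolding, full_matrices=False)
        rk = max(1, int(np.sum(s > eps * s[0])))   # mode-k rank r_k
        factors.append(u[:, :rk])                  # q_k(i_k, alpha_k)
        # Contract q_k into the core; the new mode goes to the end,
        # so after d steps g has shape (r_1, ..., r_d).
        g = np.tensordot(g, u[:, :rk], axes=(0, 0))
    return g, factors
```

The factors hold only dnr parameters, but the core g still has r^d entries, which is exactly the curse noted above.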

Page 6

CANONICAL DECOMPOSITION (PARAFAC, CANDECOMP)

a(i1, ..., id) = Σ_{α=1}^{R} u1(i1, α) ... ud(id, α)

Number of defining parameters is dRn.

DRAWBACKS:

• INSTABILITY (cf. Lim, de Silva)

x1, ..., xd, y1, ..., yd linearly independent

a = Σ_{t=1}^{d} z1^t ⊗ ... ⊗ zd^t,   where zk^t = xk for k ≠ t and zk^t = yk for k = t

a = (1/ε) (x1 + εy1) ⊗ ... ⊗ (xd + εyd) − (1/ε) x1 ⊗ ... ⊗ xd + O(ε)

So a, of canonical rank d, is approximated with accuracy O(ε) by tensors of rank 2: a best low-rank approximation need not exist.

• LACK OF ROBUST ALGORITHMS

Page 7

a(i1, ..., id) = Σ_{α1=1}^{r1} ... Σ_{αd=1}^{rd} g(α1, ..., αd) q1(i1, α1) ... qd(id, αd)

TUCKER DECOMPOSITION

Page 8

a(i1, ..., id) = Σ_{α=1}^{R} u1(i1, α) ... ud(id, α)

CANONICAL DECOMPOSITION (PARAFAC, CANDECOMP)

Page 9

a(i1, ..., id) = Σ_{α1,...,αd−1} g1(i1, α1) g2(α1, i2, α2) ... gd−1(αd−2, id−1, αd−1) gd(αd−1, id)

TENSOR-TRAIN DECOMPOSITION
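In code, a tensor train is just a list of 3-way cores, and evaluating a single entry is a product of d small matrices (cf. the matrix form on Page 24). A minimal sketch, assuming cores of shape (r_{k−1}, n_k, r_k) with boundary ranks equal to 1; the names are illustrative:

```python
import numpy as np

def tt_entry(cores, idx):
    """Evaluate a(i1, ..., id) from TT cores g_k of shape (r_{k-1}, n_k, r_k)."""
    v = np.ones((1, 1))
    for g, i in zip(cores, idx):
        v = v @ g[:, i, :]        # multiply by the slice G_k^{i_k}
    return v[0, 0]

# A random 4-dimensional TT tensor with mode size 5 and TT-ranks 3:
rs = [1, 3, 3, 3, 1]
cores = [np.random.rand(rs[k], 5, rs[k + 1]) for k in range(4)]
print(tt_entry(cores, (0, 1, 2, 3)))
```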

Page 10

TENSORS AND MATRICES

Let A = [aijklm].

Take up a pair of mutually complementary long indices

(ij) and (klm)

(kl) and (ijm)

.........

Tensor A gives rise to unfolding matrices:

B1 = [b(ij),(klm)]

B2 = [b(kl),(ijm)]

.........

By definition,

b(ij),(klm) = b(kl),(ijm) = ... = aijklm
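A sketch of the two unfoldings in NumPy (the mode sizes are arbitrary):

```python
import numpy as np

A = np.random.rand(2, 3, 4, 5, 6)         # modes i, j, k, l, m
ni, nj, nk, nl, nm = A.shape

B1 = A.reshape(ni * nj, nk * nl * nm)     # rows: long index (ij), columns: (klm)
B2 = (np.moveaxis(A, (2, 3), (0, 1))      # bring modes k and l to the front
        .reshape(nk * nl, ni * nj * nm))  # rows: long index (kl), columns: (ijm)
```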

Page 11

DIMENSIONALITY CAN BE DECREASED

a(i1, ..., id) = a(i1, ..., ik; ik+1, ..., id)

= Σ_{s=1}^{r} u(i1, ..., ik; s) v(ik+1, ..., id; s)

Dimension d reduces to dimensions k + 1 and d − k + 1.

Proceed by recursion.

Binary tree arises.

Page 12

TUCKER VIA RECURSION

[Binary-tree diagram: i1 i2 i3 i4 i5 is split recursively; at level k the leaf (ik, αk) is peeled off, leaving (ik+1, ..., i5, α1, ..., αk); the last split produces the leaf (i5, α5) and the core (α1, ..., α5).]

a(i1, i2, i3, i4, i5) = Σ_{α1,α2,α3,α4,α5} g(α1, α2, α3, α4, α5) q1(i1, α1) q2(i2, α2) q3(i3, α3) q4(i4, α4) q5(i5, α5)

Page 13

BINARY TREE IMPLIES

• Any auxiliary index belongs to exactly two leaf tensors.

• The tensor is the sum, over all auxiliary indices, of the products of elements of the leaf tensors.

HOW TO AVOID r^d PARAMETERS

• Let any leaf tensor have at most one spatial index.

• Let any leaf tensor have at most two (three) auxiliary indices.

Page 14

TREE WITHOUT TUCKER

[Binary-tree diagram without a Tucker core: each leaf carries one spatial index and at most two auxiliary indices, e.g. (i1, α1), (i2, α2), (i3, α1, α3), (i4, α2, α4), (i5, α3, α4).]

TENSOR-TRAIN DECOMPOSITION

a(i1, i2, i3, i4, i5) = Σ_{α1,α2,α3,α4} g1(i1, α1) g2(α1, i3, α3) g3(α3, i5, α4) g4(α4, i4, α2) g5(α2, i2)

Page 15

HOW MANY PARAMETERS

NUMBER OF TT PARAMETERS = 2nr + (d − 2)nr^2

EXTENDED TT DECOMPOSITION

[Tree diagram for the extended TT decomposition: spatial indices sit in leaves with one auxiliary index each, while interior transfer tensors carry up to three auxiliary indices, which gives the r^3 term below.]

NUMBER OF EXTENDED TT PARAMETERS = dnr + (d − 2)r^3

Page 16

TREE IS NOT NEEDED!

ALL IS DEFINED BY A PERMUTATION OF SPATIAL INDICES

TENSOR-TRAIN DECOMPOSITION

a(i1, i2, i3, i4, i5) = Σ_{β1,β2,β3,β4} g1(iσ(1), β1) g2(β1, iσ(2), β2) g3(β2, iσ(3), β3) g4(β3, iσ(4), β4) g5(β4, iσ(5))

TT = Tree–Tucker ⇒ neither Tree, nor Tucker ⇒ TENSOR TRAIN

Page 17

MINIMAL TT DECOMPOSITION

Let 1 ≤ βk ≤ rk.

What are minimal values for compression ranks rk?

rk ≥ rank Aσ,k

Aσ,k = [ aσ(iσ(1), ..., iσ(k); iσ(k+1), ..., iσ(d)) ]

aσ(iσ(1), ..., iσ(d)) = a(i1, ..., id)

Page 18

GENERAL PROPERTIES

THEOREM 1. Assume that a tensor a(i1, ..., id) possesses a canonical decomposition with R terms. Then a(i1, ..., id) admits a TT decomposition of rank R or less.

THEOREM 2. Assume that a tensor a(i1, ..., id), when ε-perturbed with arbitrarily small ε, possesses a canonical decomposition with R terms. Then a(i1, ..., id) admits a TT decomposition of rank R or less.

Page 19

FROM CANONICAL TO TENSOR TRAIN

a(i1, ..., id) = Σ_{s=1}^{R} u1(i1, s) ... ud(id, s)

= Σ_{α1,...,αd−1} u1(i1, α1) δ(α1, α2) u2(i2, α2) ... δ(αd−2, αd−1) ud−1(id−1, αd−1) ud(id, αd−1)

FREE!
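The conversion is indeed free of arithmetic: the middle TT cores are diagonal in the auxiliary indices, gk(αk−1, ik, αk) = uk(ik, αk) δ(αk−1, αk). A sketch, assuming canonical factors uk stored as nk × R matrices and boundary ranks 1:

```python
import numpy as np

def canonical_to_tt(factors):
    """Exact TT cores from canonical factors u_k of shape (n_k, R)."""
    R = factors[0].shape[1]
    cores = [factors[0][None, :, :]]           # first core: (1, n_1, R)
    for u in factors[1:-1]:
        g = np.zeros((R, u.shape[0], R))
        for a in range(R):
            g[a, :, a] = u[:, a]               # diagonal core u_k * delta
        cores.append(g)
    cores.append(factors[-1].T[:, :, None])    # last core: (R, n_d, 1)
    return cores
```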

Page 20

EFFECTIVE RANK OF A TENSOR

ERank(a) = lim sup_{ε→+0}  min { RANK(b) : |b − a| ≤ ε, b ∈ C(n1, ..., nd) }

F(n1, ..., nd): all tensors of size n1 × ... × nd with entries from F.

Let a ∈ F(n1, ..., nd) ⊂ C(n1, ..., nd). Then the canonical rank over F depends on F; the effective rank does not.

This is close to the border-rank concept (Bini, Capovani), which still depends on F.

THEOREM 2 (reformulated)

Let a ∈ F(n1, ..., nd). Then for this tensor there exists a TT decomposition of rank r ≤ ERank(a) with the entries of all tensors belonging to F.

Page 21

EXAMPLE 1

d-dimensional tensor in the matrix form

A = Λ ⊗ I ⊗ ... ⊗ I + I ⊗ Λ ⊗ ... ⊗ I + ... + I ⊗ ... ⊗ I ⊗ Λ

P(h) ≡ ⊗_{s=1}^{d} (I + hΛ) = I + hA + O(h^2)

A = (1/h) P(h) − (1/h) P(0) + O(h)

ERank(A) = 2

Page 22

EXAMPLE 2

Real-valued tensor F is defined by the function

f(x1, ..., xd) = sin(x1 + ... + xd)

on some 1D grids for x1, ..., xd.

Beylkin et al.: the canonical rank of F over R does not exceed d (it is likely to be exactly d). However,

sin x = (exp(ix) − exp(−ix)) / (2i)

ERank(F) = 2

Page 23

EXAMPLE 3

d-dimensional tensor A from discretization of operator

A = Σ_{1≤i≤j≤d} aij ∂^2 / (∂xi ∂xj)

on a tensor grid for variables x1, ..., xd.

Canonical rank ∼ d^2/2.

However,

ERank(A) ≤ (3/2) d + 1

(N. Zamarashkin, I. Oseledets, E. Tyrtyshnikov)

Page 24

TENSOR TRAIN DECOMPOSITION

a(i1, ..., id) = Σ_{α0,...,αd} g1(α0, i1, α1) g2(α1, i2, α2) ... gd(αd−1, id, αd)

MATRIX FORM

a(i1, ..., id) = G1^{i1} G2^{i2} ... Gd^{id}

MINIMAL TT COMPRESSION RANKS:

rk = rank Ak,   Ak = [a(i1...ik; ik+1...id)],   0 ≤ k ≤ d

size(Gk^{ik}) = rk−1 × rk

Page 25

THE KEY TO EVERYTHING

PROBLEM OF RECOMPRESSION:

Given a tensor train, possibly with large ranks, try to find in its ε-vicinity a tensor train with smaller compression ranks.

METHOD OF TT RECOMPRESSION (I. V. Oseledets):

• NUMBER OF OPERATIONS IS LINEAR

IN DIMENSIONALITY d AND MODE SIZE n

• THE RESULT HAS GUARANTEED

APPROXIMATION ACCURACY

Page 26

METHOD OF TENSOR TRAIN RECOMPRESSION

Minimal TT compression ranks = ranks of the unfolding matrices Ak.

The matrices Ak are of size n^k × n^{d−k}, but they never appear as full arrays of n^d elements.

Nevertheless, the SVDs of the Ak are constructed, with the orthogonal (unitary) matrices kept in compact factorized form.

When neglecting the smallest singular values, we provide GUARANTEED ACCURACY.

To show the idea, consider a TT decomposition

a(i1, i2, i3) = Σ_{α1,α2} g1(i1, α1) g2(α1, i2, α2) g3(α2, i3)

Page 27

TENSOR TRAIN RECOMPRESSION

RIGHT TO LEFT by QR

a(i1, i2, i3) = Σ_{α1,α2} g1(i1, α1) g2(α1, i2, α2) g3(α2; i3)

= Σ_{α1,α2′} g1(i1, α1) g2(α1, i2; α2′) q3(α2′; i3)

= Σ_{α1′,α2′} g1(i1; α1′) q2(α1′; i2, α2′) q3(α2′; i3)

The matrices q2(α1′; i2, α2′) and q3(α2′; i3) acquire orthonormal rows:

g3(α2; i3) = Σ_{α2′} r3(α2; α2′) q3(α2′; i3)   (QR)

g2(α1, i2; α2′) = Σ_{α2} g2(α1, i2; α2) r3(α2; α2′)

g2(α1; i2, α2′) = Σ_{α1′} r2(α1; α1′) q2(α1′; i2, α2′)   (QR)

g1(i1; α1′) = Σ_{α1} g1(i1; α1) r2(α1; α1′)

Page 28

TENSOR TRAIN RECOMPRESSION

LEFT TO RIGHT by SVD

a(i1, i2, i3) = Σ_{α1′,α2′} g1(i1; α1′) q2(α1′; i2, α2′) q3(α2′; i3)

= Σ_{α1″,α2′} z1(i1; α1″) g2(α1″; i2, α2′) q3(α2′; i3)

= Σ_{α1″,α2″} z1(i1; α1″) z2(α1″; i2, α2″) g3(α2″; i3)

The matrices z1(i1; α1″) and z2(α1″, i2; α2″) acquire orthonormal columns.
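A simplified NumPy sketch of the whole recompression sweep for general d, following the idea of these two slides (QR from right to left, then truncated SVD from left to right). It is not Oseledets' actual implementation; the relative truncation threshold eps is an illustrative choice.

```python
import numpy as np

def tt_round(cores, eps=1e-10):
    """TT recompression sketch: cores g_k of shape (r_{k-1}, n_k, r_k)."""
    cores = [g.copy() for g in cores]
    d = len(cores)
    # Right-to-left: make every core except the first row-orthonormal.
    for k in range(d - 1, 0, -1):
        r0, n, r1 = cores[k].shape
        q, r = np.linalg.qr(cores[k].reshape(r0, n * r1).T)
        cores[k] = q.T.reshape(-1, n, r1)            # orthonormal rows
        cores[k - 1] = np.tensordot(cores[k - 1], r.T, axes=(2, 0))
    # Left-to-right: SVD each core and drop the smallest singular values.
    for k in range(d - 1):
        r0, n, r1 = cores[k].shape
        u, s, vt = np.linalg.svd(cores[k].reshape(r0 * n, r1),
                                 full_matrices=False)
        rnew = max(1, int(np.sum(s > eps * s[0])))
        cores[k] = u[:, :rnew].reshape(r0, n, rnew)  # orthonormal columns
        m = s[:rnew, None] * vt[:rnew]               # carry S V^T forward
        cores[k + 1] = np.tensordot(m, cores[k + 1], axes=(1, 0))
    return cores
```

Thanks to the orthogonality established by the first pass, each local truncation is a truncation of the corresponding unfolding matrix, which gives the guaranteed accuracy stated on Page 30.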

Page 29

LEMMA ON ORTHONORMALITY

Let k ≤ l and let the matrices

qk(αk−1; ik, αk), ..., ql(αl−1; il, αl)

have orthonormal rows. Then the matrix

Qk(αk−1; ik, ..., il, αl) ≡ Σ_{αk,...,αl−1} qk(αk−1; ik, αk) ... ql(αl−1; il, αl)

has orthonormal rows as well.

PROOF BY INDUCTION.

Write i for the long index (ik+1, ..., il, αl). Then

Qk(αk−1; ik, i) = Σ_{αk} qk(αk−1; ik, αk) Qk+1(αk; i)  ⇒

Σ_{ik,i} Qk(α; ik, i) Qk(β; ik, i) = Σ_{ik,i} Σ_{µ,ν} qk(α; ik, µ) Qk+1(µ; i) qk(β; ik, ν) Qk+1(ν; i)

= Σ_{ik} Σ_{µ,ν} qk(α; ik, µ) qk(β; ik, ν) δ(µ, ν) = Σ_{ik,αk} qk(α; ik, αk) qk(β; ik, αk) = δ(α, β)

Page 30

TENSOR TRAIN RECOMPRESSION

a(i1, i2, i3) = Σ_{α1′,α2′} g1(i1, α1′) q2(α1′, i2, α2′) q3(α2′, i3)

= Σ_{α1″,α2′} z1(i1, α1″) g2(α1″, i2, α2′) q3(α2′, i3)

= Σ_{α1″,α2″} z1(i1, α1″) z2(α1″, i2, α2″) g3(α2″, i3)

rank A1 = rank [ g1(α0″, i1; α1′) ]

rank A2 = rank [ g2(α1″, i2; α2′) ]

rank A3 = rank [ g3(α2″, i3; α3′) ]

• Complexity of computation of compression ranks is linear in d.

• “Truncation” is performed in the SVD of small-size matrices.

• NUMBER OF OPERATIONS = O(dnr^3)

• GUARANTEED ACCURACY = √d · ε (in the Frobenius norm)

Page 31

TT APPROXIMATION FOR LAPLACIAN

d     TT recompression time   Canonical rank   Compression rank

10    0.01 sec                10               2
20    0.09 sec                20               2
40    0.78 sec                40               2
80    13 sec                  80               2
160   152 sec                 160              2
200   248 sec                 200              2

1D grids are of size 32.

Tensor has modes of size n = 1024.

Page 32

WHAT CAN WE DO WITH TENSOR TRAINS?

a(i1, ..., id) = Σ_{α1,...,αd−1} g1(i1, α1) g2(α1, i2, α2) ... gd(αd−1, id)

• RECOMPRESSION: given a tensor train with TT-ranks r, we can approximate it by another tensor train with guaranteed accuracy using O(dnr^3) operations.

• QUASI-OPTIMALITY OF RECOMPRESSION:

ERROR ≤ √(d − 1) · BEST APPROXIMATION ERROR WITH THE SAME TT-RANKS

• EFFICIENT APPROXIMATE MATRIX OPERATIONS

Page 33

CANONICAL VERSUS TENSOR-TRAIN

                           Canonical            Tensor-Train

Number of parameters       O(dnR)               O(dnr + (d − 2)r^3)
Matrix-by-vector           O(dn^2R^2)           O(dn^2r^2 + dr^6)
Addition                   O(dnR)               O(dnr)
Recompression              O(dnR^2 + d^3R^3)    O(dnr^2 + dr^4)
Tensor-vector contraction  O(dnR)               O(dnr + dr^3)

Page 34

TENSOR-VECTOR CONTRACTION

γ = Σ_{i1,...,id} a(i1, ..., id) x1(i1) ... xd(id)

ALGORITHM:

• Compute the matrices Zk = Σ_{ik} gk(αk−1, ik, αk) xk(ik)

• Multiply the matrices: γ = Z1 Z2 ... Zd

NUMBER OF OPERATIONS = O(dnr^2)
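A sketch of this algorithm, with cores of shape (r_{k−1}, n_k, r_k) and boundary ranks 1:

```python
import numpy as np

def tt_contract(cores, vectors):
    """γ = Σ a(i1,...,id) x1(i1) ... xd(id) for a tensor given in TT format."""
    z = np.ones((1, 1))
    for g, x in zip(cores, vectors):
        zk = np.tensordot(g, x, axes=(1, 0))  # Z_k: sum over i_k
        z = z @ zk                            # accumulate Z_1 Z_2 ... Z_k
    return z[0, 0]
```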

Page 35

RECOVER A d-DIMENSIONAL TENSOR

FROM A “SMALL” PORTION OF ITS ELEMENTS

Given a procedure for computation of a(i1, ..., id).

We need to choose the "true" elements and use them to construct a TT approximation of the tensor.

A TT decomposition with maximal compression rank r can be constructed from some O(dnr^2) elements.

Page 36

HOW THIS PROBLEM IS SOLVED FOR MATRICES

Let A be close to a matrix of rank r:

σ_{r+1}(A) ≤ ε

Then there exists a cross of r columns C and r rows R such that

|(A − C G^{−1} R)_{ij}| ≤ (r + 1) ε

where G is the r × r matrix on the intersection of C and R.

Take G of maximal volume among all r × r submatrices of A.

S. A. Goreinov, E. E. Tyrtyshnikov, The maximal-volume concept in approximation by low-rank matrices, Contemporary Mathematics, vol. 208 (2001), 47–51.

S. A. Goreinov, E. E. Tyrtyshnikov, N. L. Zamarashkin, A theory of pseudo-skeleton approximations, Linear Algebra Appl. 261 (1997), 1–21; Doklady RAS (1995).

Page 37

GOOD INSTEAD OF BEST: PSEUDO-MAX-VOLUME

Given A of size n × r, find a row permutation that moves a good submatrix into the upper r × r block. Since the volume does not change under right-side multiplications, assume that the upper r × r block of A is the identity:

A = [ I; Ã ],   Ã = [aij],   r + 1 ≤ i ≤ n, 1 ≤ j ≤ r

NECESSARY FOR MAX-VOL: |aij| ≤ 1, r + 1 ≤ i ≤ n, 1 ≤ j ≤ r

Let this define a good submatrix. Then the algorithm is as follows (a code sketch is given after the list):

• If some |aij| ≥ 1 + δ, swap rows i and j.

• Restore the identity in the first r rows by a right-side multiplication.

• Check the new |aij|. Quit if all are less than 1 + δ.

• Otherwise repeat.
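A NumPy sketch of this iteration (the function name, the iteration cap, and the assumption that the initial top r × r block is nonsingular are mine):

```python
import numpy as np

def pseudo_maxvol(A, delta=1e-2, max_iter=100):
    """Return indices of r rows of the n x r matrix A whose submatrix
    has quasi-maximal volume, by the swapping iteration above."""
    n, r = A.shape
    perm = np.arange(n)
    for _ in range(max_iter):
        # Right-side multiplication making the top r x r block the identity.
        B = A[perm] @ np.linalg.inv(A[perm[:r]])
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) < 1 + delta:
            break                               # all entries small enough
        perm[i], perm[j] = perm[j], perm[i]     # swap rows i and j
    return perm[:r]
```

Each swap multiplies the volume of the upper block by a factor greater than 1 + δ, so the iteration terminates.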

Page 38

MATRIX CROSS ALGORITHM

• Assume we are given some initial column indices j1, ..., jr.

• Find maximal-volume row indices i1, ..., ir in these columns.

• Find maximal-volume column indices in the rows i1, ..., ir.

• Proceed choosing columns and rows until the skeleton cross approximations stabilize (see the sketch after the reference below).

E. E. Tyrtyshnikov, Incomplete cross approximation in the mosaic-skeleton method, Computing 64, no. 4 (2000), 367–380.
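A sketch of the alternating sweeps, reusing pseudo_maxvol from the previous slide; get_row and get_col stand for any routine evaluating a single row or column of the implicitly given n × m matrix (illustrative names, not a library API):

```python
import numpy as np

def matrix_cross(get_row, get_col, r, cols, sweeps=5):
    """Skeleton approximation A ≈ C G^{-1} R from O((n + m) r) entries."""
    for _ in range(sweeps):
        C = np.column_stack([get_col(j) for j in cols])   # n x r columns
        rows = pseudo_maxvol(C)        # max-volume rows inside these columns
        R = np.vstack([get_row(i) for i in rows])         # r x m rows
        cols = pseudo_maxvol(R.T)      # max-volume columns inside these rows
    G = C[rows, :]                     # r x r intersection submatrix
    return C, G, R                     # A ≈ C @ np.linalg.inv(G) @ R
```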

Page 39

TENSOR-TRAIN CROSS INTERPOLATION

Given a(i1, i2, i3, i4), consider the unfoldings and r-column sets:

A1 = [a(i1; i2, i3, i4)],   J1 = { i2^(β1) i3^(β1) i4^(β1) }

A2 = [a(i1, i2; i3, i4)],   J2 = { i3^(β2) i4^(β2) }

A3 = [a(i1, i2, i3; i4)],   J3 = { i4^(β3) }

Successively choose good rows:

I1 = { i1^(α1) } in a(i1; i2, i3, i4):   a = Σ_{α1} g1(i1; α1) a2(α1; i2, i3, i4)

I2 = { i1^(α2) i2^(α2) } in a2(α1, i2; i3, i4):   a2 = Σ_{α2} g2(α1, i2; α2) a3(α2; i3, i4)

I3 = { i1^(α3) i2^(α3) i3^(α3) } in a3(α2, i3; i4):   a3 = Σ_{α3} g3(α2, i3; α3) g4(α3; i4)

Finally

a = Σ_{α1,α2,α3} g1(i1, α1) g2(α1, i2, α2) g3(α2, i3, α3) g4(α3, i4)

Page 40

TT-CROSS INTERPOLATION OF A TENSOR

Tensor A of size n1 × n2 × ... × nd with compression ranks

rk = rank Ak,   Ak = A(i1 i2 ... ik; ik+1 ... id)

is recovered from the elements of the TT-cross

Ck(αk−1, ik, βk) = A(i1^(αk−1), i2^(αk−1), ..., ik−1^(αk−1), ik, jk+1^(βk), ..., jd^(βk))

The TT-cross is defined by the index sets

Ik = { i1^(αk) ... ik^(αk) },   1 ≤ αk ≤ rk

Jk = { jk+1^(βk) ... jd^(βk) },   1 ≤ βk ≤ rk

The row index sets (the α sets) are nested. Require nonsingularity of the rk × rk matrices

Âk(αk, βk) = A(i1^(αk), ..., ik^(αk); jk+1^(βk), ..., jd^(βk)),   αk, βk = 1, ..., rk

Page 41

FORMULA FOR TT-INTERPOLATION

A(i1, i2, ..., id) = Σ_{α1,...,αd−1} Ĉ1(α0, i1, α1) Ĉ2(α1, i2, α2) ... Ĉd(αd−1, id, αd)

Ĉk(αk−1, ik, αk) = Σ_{αk′} Ck(αk−1, ik, αk′) Âk^{−1}(αk′, αk),   k = 1, ..., d,   Âd = I

Page 42

TENSOR-TRAIN CROSS ALGORITHM

• Assume we are given rk initial column indices jk+1^(βk), ..., jd^(βk) in the unfolding matrices Ak.

• Find rk maximal-volume rows in the submatrices of Ak of the form a(i1^(αk−1), ..., ik−1^(αk−1), ik; jk+1^(βk), ..., jd^(βk)).

• Use the row indices obtained and do the same from right to left to find new column indices.

• Proceed with these sweeps from left to right and from right to left.

• Stop when the tensor trains stabilize.

Page 43

EXAMPLE OF TT-CROSS APPROXIMATION

HILBERT TENSOR

a(i1, i2, ..., id) = 1 / (i1 + i2 + ... + id)

d = 60, n = 32

rmax   Time    Iterations   Relative accuracy

2      1.37    5            1.897278e+00
3      4.22    7            5.949094e-02
4      7.19    7            2.226874e-02
5      15.42   9            2.706828e-03
6      21.82   9            1.782433e-04
7      29.62   9            2.151107e-05
8      38.12   9            4.650634e-06
9      48.97   9            5.233465e-07
10     59.14   9            6.552869e-08
11     72.14   9            7.915633e-09
12     75.27   8            2.814507e-09

Page 44

COMPUTATION OF d-DIMENSIONAL INTEGRALS: example 1

I(d) = ∫_{[0,1]^d} sin(x1 + x2 + ... + xd) dx1 dx2 ... dxd = Im ∫_{[0,1]^d} e^{i(x1+x2+...+xd)} dx1 dx2 ... dxd = Im( ((e^i − 1)/i)^d )

Use the Chebyshev (Clenshaw-Curtis) quadrature with n = 11 nodes. All n^d values are NEVER COMPUTED!

Instead, we find a TT cross and construct a TT approximation of this tensor.

d      I               Relative accuracy   Time

10     -6.299353e-01   1.409952e-15        0.14
100    -3.926795e-03   2.915654e-13        0.77
500    -7.287664e-10   2.370536e-12        4.64
1000   -2.637513e-19   3.482065e-11        11.60
2000   2.628834e-37    8.905594e-12        33.05
4000   9.400335e-74    2.284085e-10        105.49
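The closed form makes these reference values easy to check; a small sketch:

```python
import numpy as np

def exact_integral(d):
    # I(d) = Im(((e^i - 1)/i)^d), the closed form derived above
    return (((np.exp(1j) - 1) / 1j) ** d).imag

for d in (10, 100, 500, 1000):
    print(d, exact_integral(d))   # d = 10 gives -6.299353...e-01
```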

Page 45

COMPUTATION OF d-DIMENSIONAL INTEGRALS: example 2

I(d) = ∫_{[0,1]^d} √(x1^2 + x2^2 + ... + xd^2) dx1 dx2 ... dxd,   d = 100

Chebyshev quadrature with n = 41 nodes plus a TT-cross with rmax = 32 gives a "reference solution". For comparison, take n = 11 nodes:

rmax   Relative accuracy   Time

2      1.747414e-01        1.76
4      2.823821e-03        11.52
8      4.178328e-05        42.76
10     3.875489e-07        66.28
12     2.560370e-07        94.39
14     4.922604e-08        127.60
16     9.789895e-10        167.02
18     1.166096e-10        211.09
20     2.706435e-11        260.13

Page 46

INCREASE DIMENSIONALITY

(TENSORS INSTEAD OF MATRICES)

Matrix is a 2-way array.

A d-level matrix is naturally viewed as a 2d-way array:

A(i, j) = A(i1, i2, . . . , id; j1, j2, . . . , jd)

i ↔ (i1...id), j ↔ (j1...jd)

Important to consider a related reshaped array:

B(i1j1, . . . , idjd) = A(i1, i2, . . . , id; j1, j2, . . . , jd)

Matrix A is represented by tensor B.

Page 47

MINIMAL TENSOR TRAINS

a(i1 ... id; j1 ... jd) = Σ_{1≤αk≤rk} g1(i1j1, α1) g2(α1, i2j2, α2) ... gd−1(αd−2, id−1jd−1, αd−1) gd(αd−1, idjd)

The minimal possible values of the compression ranks rk are equal to the ranks of specific unfolding matrices:

rk = rank Ak,   Ak = [A(i1j1, ..., ikjk; ik+1jk+1, ..., idjd)]

If all rk = 1 then

A = G1 ⊗ . . . ⊗ Gd

In general

A = Σ_{α1,...,αd−1} G1^{α1} ⊗ G2^{α1α2} ⊗ G3^{α2α3} ⊗ ... ⊗ Gd^{αd−1}

Page 48

NO CURSE OF DIMENSIONALITY

Let 1 ≤ ik, jk ≤ n and rk = r.

Then the number of representation parameters is dn^2r^2.

Dependence on d is linear!

SO LET US MAKE d AS LARGE AS POSSIBLE

BY ADDING FICTITIOUS AXES

Assume we had d0 levels. If n = 2^{d1}, then set d = d0 d1. Then

memory = 4dr^2,   d = log2(size(A))

LOGARITHMIC IN THE SIZE OF MATRIX
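Adding fictitious axes is a pure reshape. A sketch for one level (d0 = 1) and n = 2^10, after which the TT machinery above applies to the resulting 10-way array:

```python
import numpy as np

d1 = 10
x = np.random.rand(2 ** d1)    # a vector of length n = 1024
t = x.reshape((2,) * d1)       # d1 = 10 axes of size 2 instead of one axis
```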

Page 49

CAUCHY–TOEPLITZ EXAMPLE

A = [ 1 / (i − j + 1/2) ]

Relative accuracy   Compression ranks for A and A^{−1}

1.e-5              3 7 8 8 8 7 7 7 3
1.e-7              3 7 9 10 10 9 9 7 3
1.e-9              3 7 11 11 11 11 11 7 3
1.e-11             3 7 12 13 13 13 12 7 3
1.e-13             3 7 14 14 15 14 14 7 3

n = 1024, d0 = 1, d1 = 10

Page 50

INVERSES TO BANDED TOEPLITZ MATRICES

Let A be a banded Toeplitz matrix: A = [a(i − j)], with a(k) = 0 for |k| > s, where s is the half-bandwidth.

THEOREM

Let size(A) = 2^d × 2^d and det A ≠ 0. Then

rk(A^{−1}) ≤ 4s^2 + 1,   k = 1, ..., d − 1,

and the estimate is sharp.

COROLLARY

The inverse of a banded Toeplitz matrix A of size 2^d × 2^d with half-bandwidth s has a TT representation with O(s^4 log2 n) parameters.

Using a Newton iteration with truncations, we obtain an inversion algorithm with complexity O(log2 n).

Page 51

AVERAGE COMPRESSION RANK

r = √(memory / (4d))   ⇔   memory = 4dr^2

INVERSION OF d0-DIMENSIONAL LAPLACIAN

BY MODIFIED NEWTON

d1 = 10

Physical dimensionality (= d0)                        1      3      5      10     30     50
Average compression rank of A                         2.8    3.5    3.6    3.7    3.8    3.8
Average compression rank of approximation to A^{−1}   7.3    18.6   19.2   17.4   16.1   16.5
Time (sec)                                            2.     10.    17.    23.    27.    33.
||AX − I|| / ||I||                                    1.e-2  6.e-3  2.e-3  5.e-5  4.e-5  4.e-5

The last matrix size is 2^100.

Page 52

INVERSION OF 10-DIMENSIONAL LAPLACIAN VIA INTEGRAL REPRESENTATION BY THE STENGER FORMULA

∫_0^∞ exp(−At) dt ≈ (h/τ) Σ_{k=−M}^{M} wk exp(−(tk/τ) A)

h = π/√M,   wk = tk = exp(hk),   λmin(A/τ) ≥ 1
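A dense-matrix sanity check of this quadrature (no tensor format here; scipy's expm stands in for the structured matrix exponentials used in practice, and taking τ = λmin(A) is one way to satisfy the constraint):

```python
import numpy as np
from scipy.linalg import expm

def stenger_inverse(A, M=32):
    """Approximate A^{-1} = ∫_0^∞ exp(-At) dt for symmetric positive
    definite A by the Stenger quadrature above."""
    tau = np.linalg.eigvalsh(A).min()    # ensures λmin(A/τ) ≥ 1
    h = np.pi / np.sqrt(M)
    S = np.zeros_like(A)
    for k in range(-M, M + 1):
        wk = tk = np.exp(h * k)
        S += wk * expm(-(tk / tau) * A)
    return (h / tau) * S

A = np.diag([1.0, 2.0, 5.0])
print(np.abs(stenger_inverse(A) - np.linalg.inv(A)).max())
```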

Page 53

CONCLUSIONS AND PERSPECTIVES

• Tensor-train decompositions and the corresponding algorithms (see http://pub.inm.ras.ru) provide us with excellent approximation tools for vectors and matrices. A TT-toolbox for Matlab is available: http://spring.inm.ras.ru/osel.

• The memory needed depends on the matrix size logarithmically. This is a terrific advantage when the compression ranks are small, and that is exactly the case in many applications.

• Approximate inverses can be computed in the tensor-train format, generally with complexity logarithmic in the matrix size.

• Applications include huge-scale matrices (with size up to 2^100) as well as typical large-scale and even modest-scale matrices (like images).

• The key to efficient tensor-train operations is the recompression algorithm, with complexity O(dnr^3) and the reliability of the SVD.

• The modified Newton method with truncations and integral representations of matrix functions are viable in the tensor-train format.

Page 54

GOOD PERSPECTIVES

• Multi-variate interpolation (construction of tensor trains from a small portion of all elements; tensor cross methods using the maximal-volume concept).

• Fast computation of integrals in d dimensions (no Monte Carlo).

• Approximate matrix operations (e.g. inversion) with complexity O(log2 n):

linear in d = linear in log2 n

• A new direction in data compression and image processing (movies).

• Statistical interpretation of tensor trains.

• Applications to quantum chemistry, multi-parametric optimization, stochastic PDEs, data mining, etc.

Page 55

MORE DETAILS and WORK IN PROGRESS

• I. V. Oseledets and E. E. Tyrtyshnikov, Breaking the curse of dimensionality, or how to use SVD in many dimensions, Research Report 09-03, Hong Kong: ICM HKBU, 2009 (www.math.hkbu.edu.hk/ICM/pdf/09-03.pdf); SIAM J. Sci. Comput., 2009.

• I. Oseledets, Compact matrix form of the d-dimensional tensor decomposition, SIAM J. Sci. Comput., 2009.

• I. V. Oseledets, Tensors inside matrices give logarithmic complexity, SIAM J. Matrix Anal. Appl., 2009.

• I. V. Oseledets, TT-cross approximation for multidimensional arrays, Research Report 09-11, Hong Kong: ICM HKBU, 2009 (www.math.hkbu.edu.hk/ICM/pdf/09-11.pdf); Linear Algebra Appl., 2009.

• I. Oseledets, E. E. Tyrtyshnikov, On a recursive decomposition of multi-dimensional tensors, Doklady RAS, vol. 427, no. 2 (2009).

• I. Oseledets, On a new tensor decomposition, Doklady RAS, vol. 427, no. 3 (2009).

• I. Oseledets, On approximation of matrices with logarithmic number of parameters, Doklady RAS, vol. 427, no. 4 (2009).

• N. Zamarashkin, I. Oseledets, E. Tyrtyshnikov, Tensor structure of the inverse of a banded Toeplitz matrix, Doklady RAS, vol. 427, no. 5 (2009).

• Efficient ranks of tensors and stability of TT approximations; TTM for image processing; TT approximations in electronic structure calculations. In preparation.