Matrix Differential Calculus
By Dr. Md. Nurul Haque Mollah, Professor, Dept. of Statistics, University of Rajshahi, Bangladesh
01-10-11
Outline
1. Differentiable Functions
2. Classification of Functions and Variables for Derivatives
3. Derivatives of Scalar Functions w. r. to Vector Variable
4. Derivative of Scalar Functions w. r. to a Matrix Variable
5. Derivatives of Vector Function w. r. to a Scalar Variable
6. Derivatives of Vector Function w. r. to a Vector Variable
7. Derivatives of Vector Function w. r. to a Matrix Variable
8. Derivatives of Matrix Function w. r. to a Scalar Variable
9. Derivatives of Matrix Function w. r. to a Vector Variable
10. Derivatives of Matrix Function w. r. to a Matrix Variable
11. Some Applications of Matrix Differential Calculus
1. Differentiable Functions
A real-valued function $f: X \to \mathbb{R}$, where $X \subset \mathbb{R}^n$ is an open set, is said to be continuously differentiable if the partial derivatives $\partial f(x)/\partial x_1, \ldots, \partial f(x)/\partial x_n$ exist for each $x \in X$ and are continuous functions of $x$ over $X$. In this case we write $f \in C^1$ over $X$.
Generally, we write $f \in C^p$ over $X$ if all partial derivatives of order $p$ exist and are continuous as functions of $x$ over $X$.
If $f \in C^p$ over $\mathbb{R}^n$, we simply write $f \in C^p$.
If $f \in C^1$ on $X$, the gradient of $f$ at a point $x \in X$ is defined as
$$Df(x) = \left( \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right)'.$$
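If $f \in C^2$ over $X$, the Hessian of $f$ at $x$ is defined to be the symmetric $n \times n$ matrix having $\partial^2 f(x)/\partial x_i \partial x_j$ as the $ij$th element:
$$D^2 f(x) = \left[ \frac{\partial^2 f(x)}{\partial x_i \, \partial x_j} \right]_{n \times n}.$$
If $f: X \to \mathbb{R}^m$, where $X \subset \mathbb{R}^n$, then $f$ is represented by the column vector of its component functions $f_1, f_2, \ldots, f_m$ as
$$f(x) = \left( f_1(x), \ldots, f_m(x) \right)'.$$
If $X$ is open, we can write $f \in C^p$ on $X$ if $f_1 \in C^p, f_2 \in C^p, \ldots, f_m \in C^p$ on $X$. Then the derivative of the vector function $f(x)$ with respect to the vector variable $x$ is defined by
$$Df(x) = \left[ Df_1(x) \; \cdots \; Df_m(x) \right]_{n \times m}.$$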
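If $f: \mathbb{R}^{n+r} \to \mathbb{R}$ is a real-valued function of $(x, y)$, where $x = (x_1, x_2, \ldots, x_n)' \in \mathbb{R}^n$ and $y = (y_1, y_2, \ldots, y_r)' \in \mathbb{R}^r$, we write
$$D_x f(x,y) = \left( \frac{\partial f(x,y)}{\partial x_1}, \ldots, \frac{\partial f(x,y)}{\partial x_n} \right)'_{n \times 1}, \qquad D_y f(x,y) = \left( \frac{\partial f(x,y)}{\partial y_1}, \ldots, \frac{\partial f(x,y)}{\partial y_r} \right)'_{r \times 1},$$
$$D_{xx} f(x,y) = \left[ \frac{\partial^2 f(x,y)}{\partial x_i \, \partial x_j} \right]_{n \times n}, \qquad D_{xy} f(x,y) = \left[ \frac{\partial^2 f(x,y)}{\partial x_i \, \partial y_j} \right]_{n \times r}, \qquad D_{yy} f(x,y) = \left[ \frac{\partial^2 f(x,y)}{\partial y_i \, \partial y_j} \right]_{r \times r}.$$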
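If $f: \mathbb{R}^{n+r} \to \mathbb{R}^m$, $f = (f_1, f_2, \ldots, f_m)'$, then
$$D_x f(x,y) = \left[ D_x f_1(x,y) \; \cdots \; D_x f_m(x,y) \right]_{n \times m}, \qquad D_y f(x,y) = \left[ D_y f_1(x,y) \; \cdots \; D_y f_m(x,y) \right]_{r \times m}.$$
For $h: \mathbb{R}^r \to \mathbb{R}^m$ and $g: \mathbb{R}^n \to \mathbb{R}^r$, consider the function $f: \mathbb{R}^n \to \mathbb{R}^m$ defined by $f(x) = h(g(x))$.
Then if $h \in C^p$ and $g \in C^p$, we have $f \in C^p$, and the chain rule of differentiation is stated as
$$Df(x) = Dg(x) \, Dh(g(x)).$$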
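The chain rule $Df(x) = Dg(x)\,Dh(g(x))$ can be checked on a small case. The sketch below is only an illustration: the functions `g` and `h` are hypothetical examples chosen here, not taken from the slides, and the gradient matrices follow the convention above (one column per component function).

```python
def g(x):
    """Hypothetical inner function g: R^2 -> R^2."""
    x1, x2 = x
    return [x1 * x2, x1 + x2 ** 2]

def Dg(x):
    """n x r gradient matrix of g: column k holds the gradient of g_k."""
    x1, x2 = x
    return [[x2, 1],
            [x1, 2 * x2]]

def Dh(y):
    """r x m gradient matrix of the hypothetical h(y) = y1^2 + 3*y2 (m = 1)."""
    y1, _ = y
    return [[2 * y1],
            [3]]

def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def Df_direct(x):
    """Gradient of f(x) = (x1*x2)^2 + 3*(x1 + x2^2), differentiated by hand."""
    x1, x2 = x
    return [[2 * x1 * x2 ** 2 + 3],
            [2 * x1 ** 2 * x2 + 6 * x2]]

x = [2, 3]
Df_chain = matmul(Dg(x), Dh(g(x)))  # chain rule: Dg(x) Dh(g(x))
print(Df_chain, Df_direct(x))
```

Both routes give the same n x m gradient, which is the point of writing the chain rule with Dg on the left in this layout.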
2. Classification of Functions and Variables for Derivatives
Let us consider scalar functions $g$, vector functions $\mathbf{f}$ and matrix functions $\mathbf{F}$. Each of these may depend on one real variable $x$, a vector of real variables $\mathbf{x}$, or a matrix of real variables $\mathbf{X}$. We thus obtain the classification of functions and variables shown in the following table.

Table: Classification of functions and variables

                    Scalar variable    Vector variable             Matrix variable
Scalar function     $g(x)$             $g(\mathbf{x})$             $g(\mathbf{X})$
Vector function     $\mathbf{f}(x)$    $\mathbf{f}(\mathbf{x})$    $\mathbf{f}(\mathbf{X})$
Matrix function     $\mathbf{F}(x)$    $\mathbf{F}(\mathbf{x})$    $\mathbf{F}(\mathbf{X})$
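Some Examples of Scalar, Vector and Matrix Functions

Scalar function:
  $g(x)$: $a x^2$
  $g(\mathbf{x})$: $\mathbf{a}'\mathbf{x}$, $\mathbf{x}'\mathbf{A}\mathbf{x}$
  $g(\mathbf{X})$: $\mathbf{a}'\mathbf{X}\mathbf{b}$, $\mathrm{tr}(\mathbf{X}'\mathbf{X})$

Vector function:
  $\mathbf{f}(x)$: $(a x, \; b x^2)'$
  $\mathbf{f}(\mathbf{x})$: $\mathbf{A}\mathbf{x}$
  $\mathbf{f}(\mathbf{X})$: $\mathbf{X}\mathbf{a}$

Matrix function:
  $\mathbf{F}(x)$: $\begin{bmatrix} 1 & x \\ x & x^2 \end{bmatrix}$
  $\mathbf{F}(\mathbf{x})$: $\mathbf{x}\mathbf{x}'$
  $\mathbf{F}(\mathbf{X})$: $\mathbf{A}\mathbf{X}\mathbf{B}$, $\mathbf{X}^2$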
3. Derivatives of Scalar Functions w. r. to Vector Variable
3.1 Definition
Consider a scalar-valued function $g$ of $m$ variables
$$g = g(x_1, x_2, \ldots, x_m) = g(x),$$
where $x = (x_1, x_2, \ldots, x_m)'$. Assuming the function $g$ is differentiable, its vector gradient with respect to $x$ is the $m$-dimensional column vector of partial derivatives
$$\frac{\partial g}{\partial x} = \left( \frac{\partial g}{\partial x_1}, \ldots, \frac{\partial g}{\partial x_m} \right)'.$$
3.2 Example 1
Consider the simple linear functional of $x$
$$g(x) = a^T x = \sum_{i=1}^{m} a_i x_i,$$
where $a = (a_1, \ldots, a_m)^T$ is a constant vector. Then the gradient of $g$ w. r. to $x$ is given by
$$\frac{\partial g}{\partial x} = (a_1, \ldots, a_m)^T = a.$$
Also we can write it as
$$\frac{\partial (a^T x)}{\partial x} = a.$$
Because the gradient is constant (independent of $x$), the Hessian matrix of $g(x) = a^T x$ is zero.
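Example 2
Consider the quadratic form
$$g(x) = x^T A x = \sum_{i=1}^{m} \sum_{j=1}^{m} x_i x_j a_{ij},$$
where $A = (a_{ij})$ is an $m \times m$ square matrix. Then the gradient of $g(x)$ w. r. to $x$ is given by
$$\frac{\partial g}{\partial x} = \begin{bmatrix} \partial g / \partial x_1 \\ \vdots \\ \partial g / \partial x_m \end{bmatrix} = \begin{bmatrix} \sum_{j=1}^{m} x_j a_{1j} + \sum_{i=1}^{m} x_i a_{i1} \\ \vdots \\ \sum_{j=1}^{m} x_j a_{mj} + \sum_{i=1}^{m} x_i a_{im} \end{bmatrix} = A x + A^T x$$
$$= 2 A x \quad \text{(if } A \text{ is a symmetric matrix)}.$$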
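The gradient of the quadratic form, $Ax + A^T x$, can be compared against a finite-difference approximation. The sketch below uses only the standard library; the matrix `A`, the point `x` and the step size `h` are arbitrary values assumed for illustration.

```python
def quad_form(A, x):
    """g(x) = x' A x for a square matrix A given as a list of rows."""
    m = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(m) for j in range(m))

def analytic_gradient(A, x):
    """(A + A') x, the closed-form gradient of x' A x."""
    m = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(m)) for i in range(m)]

def numeric_gradient(g, x, h=1e-6):
    """Central-difference approximation of the gradient of g at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        dn = list(x)
        dn[i] -= h
        grad.append((g(up) - g(dn)) / (2 * h))
    return grad

A = [[2.0, -1.0, 0.5], [3.0, 1.0, 4.0], [0.0, -2.0, 1.5]]  # deliberately not symmetric
x = [1.0, -2.0, 0.5]
exact = analytic_gradient(A, x)
approx = numeric_gradient(lambda v: quad_form(A, v), x)
max_error = max(abs(e - a) for e, a in zip(exact, approx))
print(exact, max_error)
```

Using a non-symmetric `A` makes the check sensitive to the $A^T x$ term, which would be missed if the gradient were taken to be $2Ax$ in general.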
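Then the second-order gradient or Hessian matrix of $g(x) = x^T A x$ w. r. to $x$ becomes
$$\frac{\partial^2 (x^T A x)}{\partial x^2} = \begin{bmatrix} 2a_{11} & \cdots & a_{1m} + a_{m1} \\ \vdots & \ddots & \vdots \\ a_{m1} + a_{1m} & \cdots & 2a_{mm} \end{bmatrix} = A + A^T$$
$$= 2A \quad \text{(if } A \text{ is a symmetric matrix)}.$$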
3.3 Some useful rules for derivative of scalar functions w. r. to vectors
For computing the gradients of products and quotients of functions, as well as of composite functions, the same rules apply as for ordinary functions of one variable. Thus
$$\frac{\partial [f(x) g(x)]}{\partial x} = \frac{\partial f(x)}{\partial x} g(x) + f(x) \frac{\partial g(x)}{\partial x},$$
$$\frac{\partial [f(x)/g(x)]}{\partial x} = \left[ \frac{\partial f(x)}{\partial x} g(x) - f(x) \frac{\partial g(x)}{\partial x} \right] \Big/ g^2(x),$$
$$\frac{\partial f(g(x))}{\partial x} = f'(g(x)) \frac{\partial g(x)}{\partial x}.$$
The gradient of the composite function $f(g(x))$ can be generalized to any number of nested functions, giving the same chain rule of differentiation that is valid for functions of one variable.
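3.3 Fundamental Rules for Matrix Differential Calculus
1. $\dfrac{\partial A}{\partial x} = 0$, where $A$ is constant (not a function of $x$)
2. $\dfrac{\partial [\alpha f(x)]}{\partial x} = \alpha \dfrac{\partial f(x)}{\partial x}$
3. $\dfrac{\partial [f(x) + g(x)]}{\partial x} = \dfrac{\partial f(x)}{\partial x} + \dfrac{\partial g(x)}{\partial x}$
4. $\dfrac{\partial [f(x) g(x)]}{\partial x} = \dfrac{\partial f(x)}{\partial x} g(x) + f(x) \dfrac{\partial g(x)}{\partial x}$
5. $\dfrac{\partial [f(x)/g(x)]}{\partial x} = \left[ \dfrac{\partial f(x)}{\partial x} g(x) - f(x) \dfrac{\partial g(x)}{\partial x} \right] \Big/ g^2(x)$
6. $\dfrac{\partial f(g(x))}{\partial x} = f'(g(x)) \dfrac{\partial g(x)}{\partial x}$
7. $\dfrac{\partial [f(x) \otimes g(x)]}{\partial x} = \left[ \dfrac{\partial f(x)}{\partial x} \otimes g(x) \right] + \left[ f(x) \otimes \dfrac{\partial g(x)}{\partial x} \right]$
8. $\dfrac{\partial [f(x) \,\#\, g(x)]}{\partial x} = \left[ \dfrac{\partial f(x)}{\partial x} \,\#\, g(x) \right] + \left[ f(x) \,\#\, \dfrac{\partial g(x)}{\partial x} \right]$
9. $\dfrac{\partial f'(x)}{\partial x} = \left[ \dfrac{\partial f(x)}{\partial x} \right]'$
10. $\dfrac{\partial \, \mathrm{vec} f(x)}{\partial x} = \mathrm{vec} \dfrac{\partial f(x)}{\partial x}$
11. $\dfrac{\partial \, \mathrm{trace}[f(x)]}{\partial x} = \mathrm{trace} \left[ \dfrac{\partial f(x)}{\partial x} \right]$
(Note: $\otimes$ for Kronecker product and $\#$ for Hadamard product)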
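3.4 Some useful derivatives of scalar functions w. r. to a vector variable
1. $\dfrac{\partial (x'y)}{\partial x} = \dfrac{\partial (y'x)}{\partial x} = y$
2. $\dfrac{\partial (x'x)}{\partial x} = 2x$
3. $\dfrac{\partial (x'Ay)}{\partial x} = Ay$
4. $\dfrac{\partial (y'Ax)}{\partial x} = A'y$
5. $\dfrac{\partial (x'Ax)}{\partial x} = (A + A')x = 2Ax$ (if $A$ is symmetric)
6. $\dfrac{\partial [a(x)'b(x)]}{\partial x} = \dfrac{\partial a(x)}{\partial x} b(x) + \dfrac{\partial b(x)}{\partial x} a(x)$
7. $\dfrac{\partial [a(x)'Q\,a(x)]}{\partial x} = D_x(a(x)) \, (Q + Q') \, a(x)$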
4. Derivative of Scalar Functions w. r. to a Matrix Variable
4.1 Definition
Consider a scalar-valued function $f$ of the elements of a matrix $X = (x_{ij})$ as
$$f = f(X) = f(x_{11}, x_{12}, \ldots, x_{ij}, \ldots, x_{mn}).$$
Assuming that the function $f$ is differentiable, its matrix gradient with respect to $X$ is the $m \times n$ matrix of partial derivatives
$$\frac{\partial f}{\partial X} = \begin{bmatrix} \dfrac{\partial f}{\partial x_{11}} & \cdots & \dfrac{\partial f}{\partial x_{1n}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f}{\partial x_{m1}} & \cdots & \dfrac{\partial f}{\partial x_{mn}} \end{bmatrix}_{m \times n}.$$
4.2 Example 1
The trace of a matrix is a scalar function of the matrix elements. Let $X = (x_{ij})$ be an $m \times m$ square matrix whose trace is denoted by $\mathrm{tr}(X)$. Then
$$\frac{\partial \, \mathrm{tr}(X)}{\partial X} = I_m.$$
Proof: The trace of $X$ is defined by
$$\mathrm{tr}(X) = \sum_{i=1}^{m} x_{ii}.$$
Taking the partial derivative of $\mathrm{tr}(X)$ with respect to one of the elements, say $x_{ij}$, gives
$$\frac{\partial \, \mathrm{tr}(X)}{\partial x_{ij}} = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases}$$
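Thus we get
$$\frac{\partial \, \mathrm{tr}(X)}{\partial X} = \left[ \frac{\partial \, \mathrm{tr}(X)}{\partial x_{ij}} \right]_{m \times m} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I_m.$$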
4.2 Example 2
The determinant of a matrix is a scalar function of the matrix elements. Let $X = (x_{ij})$ be an $m \times m$ invertible square matrix whose determinant is denoted $|X|$. Then
$$\frac{\partial |X|}{\partial X} = |X| \, (X^{-1})'.$$
Proof: The inverse of a matrix $X$ is obtained as
$$X^{-1} = \frac{1}{|X|} \, \mathrm{adj}(X),$$
where $\mathrm{adj}(X)$ is known as the adjoint matrix of $X$. It is defined by
$$\mathrm{adj}(X) = \begin{bmatrix} C_{11} & \cdots & C_{m1} \\ \vdots & \ddots & \vdots \\ C_{1m} & \cdots & C_{mm} \end{bmatrix},$$
where $C_{ij} = (-1)^{i+j} M_{ij}$ is the cofactor w. r. to $x_{ij}$ and $M_{ij}$ is the minor w. r. to $x_{ij}$.
The minor $M_{ij}$ is obtained by first taking the $(m-1) \times (m-1)$ sub-matrix of $X$ that remains when the $i$-th row and $j$-th column of $X$ are removed, then computing the determinant of this sub-matrix. Thus the determinant $|X|$ can be expressed in terms of the cofactors as follows
$$|X| = \sum_{k=1}^{m} x_{ik} C_{ik}.$$
Row $i$ can be any row and the result is always the same. In the cofactors $C_{ik}$ none of the matrix elements of the $i$-th row appear, so the determinant is a linear function of these elements. Taking now the partial derivative of $|X|$ with respect to one of the elements, say $x_{ij}$, gives
$$\frac{\partial |X|}{\partial x_{ij}} = C_{ij}.$$
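Thus we get
$$\frac{\partial |X|}{\partial X} = \left[ \frac{\partial |X|}{\partial x_{ij}} \right]_{m \times m} = [C_{ij}]_{m \times m} = [\mathrm{adj}(X)]' = |X| \, (X^{-1})'.$$
This also implies that
$$\frac{\partial \log |X|}{\partial X} = \frac{1}{|X|} \frac{\partial |X|}{\partial X} = (X^{-1})'.$$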
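As a quick sanity check of $\partial |X| / \partial X = |X| (X^{-1})'$, the sketch below compares the formula with central differences for a 2 x 2 matrix. The matrix `X` and the step `h` are arbitrary values assumed for illustration, and the helper names are hypothetical.

```python
def det2(X):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv2(X):
    """Inverse of a 2 x 2 matrix via adj(X) / |X|."""
    d = det2(X)
    return [[X[1][1] / d, -X[0][1] / d],
            [-X[1][0] / d, X[0][0] / d]]

def det_gradient(X):
    """|X| (X^{-1})': entry (i, j) is |X| times entry (j, i) of the inverse."""
    d, Xi = det2(X), inv2(X)
    return [[d * Xi[j][i] for j in range(2)] for i in range(2)]

def numeric_gradient(f, X, h=1e-6):
    """Central difference of a scalar function f with respect to each x_ij."""
    G = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            up = [row[:] for row in X]
            up[i][j] += h
            dn = [row[:] for row in X]
            dn[i][j] -= h
            G[i][j] = (f(up) - f(dn)) / (2 * h)
    return G

X = [[3.0, 1.0], [2.0, 4.0]]
exact = det_gradient(X)          # the cofactor matrix [[4, -2], [-1, 3]] up to rounding
approx = numeric_gradient(det2, X)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
print(exact, max_error)
```

The entries of `exact` are the cofactors $C_{ij}$, in line with $\partial |X| / \partial x_{ij} = C_{ij}$.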
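4.3 Some useful derivatives of scalar functions w. r. to matrix
1. $\dfrac{\partial (a'Xb)}{\partial X} = ab'$
2. $\dfrac{\partial (a'X'b)}{\partial X} = ba'$
3. $\dfrac{\partial (a'X'Xb)}{\partial X} = X(ab' + ba')$
4. $\dfrac{\partial (a'X'Xa)}{\partial X} = 2Xaa'$
5. $\dfrac{\partial (a'X'CXb)}{\partial X} = C'Xab' + CXba'$
6. $\dfrac{\partial (a'X'CXa)}{\partial X} = (C + C')Xaa' = 2CXaa'$, if $C = C'$
7. $\dfrac{\partial [(Xa + b)'C(Xa + b)]}{\partial X} = (C + C')(Xa + b)a' = 2C(Xa + b)a'$, if $C = C'$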
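Identities of this kind are easy to spot-check numerically. The sketch below tests the rule $\partial (a'X'Xb) / \partial X = X(ab' + ba')$ with central differences; the values of `X`, `a`, `b` and the step `h` are arbitrary assumptions made for the illustration.

```python
def g(X, a, b):
    """Scalar function g(X) = a' X' X b = (Xa)'(Xb)."""
    Xa = [sum(row[k] * a[k] for k in range(len(a))) for row in X]
    Xb = [sum(row[k] * b[k] for k in range(len(b))) for row in X]
    return sum(u * v for u, v in zip(Xa, Xb))

def analytic_gradient(X, a, b):
    """X (ab' + ba'), written out entry by entry."""
    n = len(a)
    return [[sum(row[k] * (a[k] * b[j] + b[k] * a[j]) for k in range(n))
             for j in range(n)] for row in X]

def numeric_gradient(f, X, h=1e-6):
    """Central difference of f with respect to each element x_ij."""
    G = []
    for i in range(len(X)):
        G.append([])
        for j in range(len(X[0])):
            up = [row[:] for row in X]
            up[i][j] += h
            dn = [row[:] for row in X]
            dn[i][j] -= h
            G[i].append((f(up) - f(dn)) / (2 * h))
    return G

X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]   # a 3 x 2 matrix
a = [1.0, -2.0]
b = [0.5, 4.0]
exact = analytic_gradient(X, a, b)
approx = numeric_gradient(lambda M: g(M, a, b), X)
max_error = max(abs(exact[i][j] - approx[i][j])
                for i in range(len(X)) for j in range(len(X[0])))
print(max_error)
```

The gradient has the same 3 x 2 shape as `X`, matching the definition of the matrix gradient in 4.1.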
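Derivatives of trace w. r. to matrix
8. $\dfrac{\partial \, \mathrm{tr}(X)}{\partial X} = I$
9. $\dfrac{\partial \, \mathrm{tr}(AX)}{\partial X} = A'$
10. $\dfrac{\partial \, \mathrm{tr}(XA)}{\partial X} = A'$
11. $\dfrac{\partial \, \mathrm{tr}(X^k)}{\partial X} = k (X^{k-1})'$
12. $\dfrac{\partial \, \mathrm{tr}(AX^k)}{\partial X} = \sum_{i=0}^{k-1} \left( X^i A X^{k-i-1} \right)'$
13. $\dfrac{\partial \, \mathrm{tr}(AX^{-1}B)}{\partial X} = -\left[ X^{-1} B A X^{-1} \right]'$
14. $\dfrac{\partial \, \mathrm{tr}(X^{-1})}{\partial X} = -\left( X^{-1} X^{-1} \right)'$
15. $\dfrac{\partial \, \mathrm{tr}(e^X)}{\partial X} = (e^X)'$
16. $\dfrac{\partial \, \mathrm{tr}[\log(I_n - X)]}{\partial X} = -\left[ (I_n - X)^{-1} \right]^T$, where $\log(I_n - X) = -\sum_{k=1}^{\infty} \dfrac{X^k}{k}$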
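17. $\dfrac{\partial \, \mathrm{tr}(X^T A X)}{\partial X} = (A + A^T) X$
18. $\dfrac{\partial \, \mathrm{tr}(X A X^T)}{\partial X} = X (A + A^T)$
19. $\dfrac{\partial \, \mathrm{tr}(A X^T B)}{\partial X} = BA$
20. $\dfrac{\partial \, \mathrm{tr}(AXB)}{\partial X} = A^T B^T$
21. $\dfrac{\partial \, \mathrm{tr}(AXBX)}{\partial X} = A^T X^T B^T + B^T X^T A^T$
22. $\dfrac{\partial \, \mathrm{tr}(AXBX^T)}{\partial X} = A^T X B^T + AXB$
23. $\dfrac{\partial \, \mathrm{tr}(AXX^T B)}{\partial X} = (A^T B^T + BA) X$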
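Derivatives of determinants w. r. to matrix
24. $\dfrac{\partial |X|}{\partial X} = |X| \, (X^{-1})'$
25. $\dfrac{\partial |X|^k}{\partial X} = k |X|^k (X^{-1})'$
26. $\dfrac{\partial \log |X|}{\partial X} = (X^{-1})'$
27. $\dfrac{\partial |AXB|}{\partial X} = |AXB| \, A' \left[ (AXB)^{-1} \right]' B'$
28. $\dfrac{\partial |AXB|}{\partial X} = |AXB| \, (X^{-1})'$, if $|A| \neq 0$, $|B| \neq 0$, $|X| \neq 0$
29. $\dfrac{\partial \log |X'X|}{\partial X} = 2X (X'X)^{-1}$
30. $\dfrac{\partial |X'CX|}{\partial X} = |X'CX| \left[ CX (X'CX)^{-1} + C'X (X'C'X)^{-1} \right]$, if $C$ is a real matrix;
    $\dfrac{\partial |X'CX|}{\partial X} = 2 |X'CX| \, CX (X'CX)^{-1}$, if $C$ is real and symmetric.
31. $\dfrac{\partial \log |X'CX|}{\partial X} = 2CX (X'CX)^{-1}$, if $C$ is real and symmetric.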
5. Derivatives of Vector Function w. r. to a Scalar Variable
5.1 Definition
Consider the vector-valued function $f$ of a scalar variable $x$ as
$$f(x) = [f_1(x), f_2(x), \ldots, f_n(x)]'.$$
Assuming the function $f$ is differentiable, its scalar gradient with respect to $x$ is the $n$-dimensional row vector of derivatives
$$\frac{\partial f}{\partial x} = \left[ \frac{\partial f_1(x)}{\partial x}, \ldots, \frac{\partial f_n(x)}{\partial x} \right]_{1 \times n}.$$
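5.2 Example
Let
$$f(x) = (x, x^2, \ldots, x^m)'.$$
Then the gradient of $f$ with respect to $x$ is given by
$$\frac{\partial f}{\partial x} = \left[ \frac{\partial (x)}{\partial x}, \frac{\partial (x^2)}{\partial x}, \ldots, \frac{\partial (x^m)}{\partial x} \right] = \left[ 1, 2x, \ldots, m x^{m-1} \right]_{1 \times m}.$$
Also we can write it as
$$\frac{\partial y}{\partial x} = \left[ 1, 2x, \ldots, m x^{m-1} \right], \quad \text{where } y = f(x) \text{ with } f_1(x) = x, \; f_2(x) = x^2, \ldots, f_m(x) = x^m.$$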
6. Derivatives of Vector Function w. r. to a Vector Variable
6.1 Definition
Consider the vector-valued function $f$ of a vector variable $x = (x_1, x_2, \ldots, x_m)'$ as
$$f(x) = y = [y_1 = f_1(x), \; y_2 = f_2(x), \ldots, y_n = f_n(x)]'.$$
Assuming the function $f$ is differentiable, its vector gradient with respect to $x$ is the $m \times n$ matrix of partial derivatives
$$\frac{\partial f}{\partial x} = \left[ \frac{\partial f_1(x)}{\partial x}, \ldots, \frac{\partial f_n(x)}{\partial x} \right] = \begin{bmatrix} \dfrac{\partial f_1(x)}{\partial x_1} & \cdots & \dfrac{\partial f_n(x)}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_1(x)}{\partial x_m} & \cdots & \dfrac{\partial f_n(x)}{\partial x_m} \end{bmatrix}_{m \times n}.$$
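6.2 Example
Let
$$f(x) = Ax = (f_1(x) = A_1 x, \; f_2(x) = A_2 x, \ldots, f_n(x) = A_n x)',$$
where $x = (x_1, x_2, \ldots, x_m)'$, $A = (a_{ij})_{n \times m}$ and $A_i = [a_{i1}, a_{i2}, \ldots, a_{im}]$ is the $i$-th row of $A$ $(i = 1, 2, \ldots, n)$.
Then the gradient of $f(x)$ with respect to $x$ is given by
$$\frac{\partial f(x)}{\partial x} = \left[ \frac{\partial f_1(x)}{\partial x}, \ldots, \frac{\partial f_n(x)}{\partial x} \right] = \begin{bmatrix} \dfrac{\partial f_1(x)}{\partial x_1} & \cdots & \dfrac{\partial f_n(x)}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_1(x)}{\partial x_m} & \cdots & \dfrac{\partial f_n(x)}{\partial x_m} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{bmatrix}_{m \times n} = A'.$$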
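7. Derivatives of Vector Function w. r. to a Matrix Variable
7.1 Definition
Consider the vector-valued function $f$ of a matrix variable $X = (x_{ij})$ of order $m \times n$ as
$$f(X) = y = [y_1 = f_1(X), \; y_2 = f_2(X), \ldots, y_q = f_q(X)]'.$$
Assuming that the function $f$ is differentiable, its matrix gradient with respect to $X$ is the $mn \times q$ matrix of partial derivatives
$$\frac{\partial f}{\partial X} = \left[ \frac{\partial f_1(X)}{\partial \mathrm{vec}[X]}, \ldots, \frac{\partial f_q(X)}{\partial \mathrm{vec}[X]} \right] = \begin{bmatrix} \dfrac{\partial f_1(X)}{\partial x_{11}} & \dfrac{\partial f_2(X)}{\partial x_{11}} & \cdots & \dfrac{\partial f_q(X)}{\partial x_{11}} \\ \dfrac{\partial f_1(X)}{\partial x_{21}} & \dfrac{\partial f_2(X)}{\partial x_{21}} & \cdots & \dfrac{\partial f_q(X)}{\partial x_{21}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial f_1(X)}{\partial x_{mn}} & \dfrac{\partial f_2(X)}{\partial x_{mn}} & \cdots & \dfrac{\partial f_q(X)}{\partial x_{mn}} \end{bmatrix}_{mn \times q}.$$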
7.2 Example
Let
$$f(X) = Xa = (f_1(X) = X_1 a, \; f_2(X) = X_2 a, \ldots, f_m(X) = X_m a)',$$
where $a = (a_1, a_2, \ldots, a_n)'$, $X = (x_{ij})_{m \times n}$ and $X_i = [x_{i1}, x_{i2}, \ldots, x_{in}]$ is the $i$-th row of $X$ $(i = 1, 2, \ldots, m)$.
Then the gradient of $f(X)$ w. r. to the matrix variable $X$ is given by
$$\frac{\partial f(X)}{\partial X} = \left[ \frac{\partial f_1(X)}{\partial \mathrm{vec}[X]}, \ldots, \frac{\partial f_m(X)}{\partial \mathrm{vec}[X]} \right] = \begin{bmatrix} a_1 I_m \\ a_2 I_m \\ \vdots \\ a_n I_m \end{bmatrix}_{mn \times m} = a \otimes I_m,$$
since $\partial f_i(X) / \partial x_{kj} = a_j$ when $k = i$ and $0$ otherwise.
8. Derivatives of Matrix Function w. r. to a Scalar Variable
8.1 Definition
Consider the matrix-valued function $F$ of a scalar variable $x$ as $F(x) = Y = [y_{ij} = f_{ij}(x)]_{m \times n}$.
Assuming that the function $F$ is differentiable, its scalar gradient with respect to the scalar $x$ is the $m \times n$ order matrix of derivatives
$$\frac{\partial F(x)}{\partial x} = \left[ \frac{\partial f_{ij}(x)}{\partial x} \right]_{m \times n}.$$
8.2 Example
Let
$$F(x) = xA, \quad f_{ij}(x) = a_{ij} x,$$
where $A = (a_{ij})$ is an $m \times n$ order matrix.
Then the gradient of $F(x)$ w. r. to the scalar variable $x$ is given by
$$\frac{\partial F(x)}{\partial x} = \left[ \frac{\partial f_{ij}(x)}{\partial x} \right]_{m \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = A.$$
9. Derivatives of Matrix Function w. r. to a Vector Variable
9.1 Definition
Consider the matrix-valued function $F$ of a vector variable $x = (x_1, x_2, \ldots, x_m)'$ as $F(x) = Y = [y_{ij} = f_{ij}(x)]_{n \times q}$.
Assuming that the function $F$ is differentiable, its vector gradient with respect to the vector $x$ is the $m \times nq$ order matrix of partial derivatives
$$\frac{\partial F(x)}{\partial x} = \left[ \frac{\partial f_{11}(x)}{\partial x}, \ldots, \frac{\partial f_{n1}(x)}{\partial x}, \ldots, \frac{\partial f_{1q}(x)}{\partial x}, \ldots, \frac{\partial f_{nq}(x)}{\partial x} \right]_{m \times nq}.$$
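9.2 Example
Let
$$F(x) = x a', \quad f_{ij}(x) = x_i a_j,$$
where $x = (x_1, x_2, \ldots, x_m)'$ and $a = (a_1, a_2, \ldots, a_n)'$, so that $F(x) = (f_{ij}(x))_{m \times n}$.
Then the gradient of $F(x)$ w. r. to the vector variable $x$ is given by
$$\frac{\partial F(x)}{\partial x} = \left[ \frac{\partial f_{11}(x)}{\partial x}, \ldots, \frac{\partial f_{m1}(x)}{\partial x}, \ldots, \frac{\partial f_{1n}(x)}{\partial x}, \ldots, \frac{\partial f_{mn}(x)}{\partial x} \right]_{m \times mn} = \left[ a_1 I_m, \; a_2 I_m, \; \ldots, \; a_n I_m \right] = a' \otimes I_m,$$
since $\partial f_{ij}(x) / \partial x_k = a_j$ when $k = i$ and $0$ otherwise.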
10. Derivatives of Matrix Function w. r. to a Matrix Variable
10.1 Definition
Consider the matrix-valued function $F$ of a matrix variable $X = (x_{ij})_{m \times p}$ as $F(X) = Y = [y_{ij} = f_{ij}(X)]_{n \times q}$.
Assuming that the function $F$ is differentiable, its matrix gradient with respect to the matrix $X$ is the $mp \times nq$ order matrix of partial derivatives
$$\frac{\partial F(X)}{\partial X} = \frac{\partial [\mathrm{vec} F(X)]'}{\partial \, \mathrm{vec} X} = \left[ \frac{\partial f_{11}(X)}{\partial \, \mathrm{vec} X}, \ldots, \frac{\partial f_{n1}(X)}{\partial \, \mathrm{vec} X}, \ldots, \frac{\partial f_{1q}(X)}{\partial \, \mathrm{vec} X}, \ldots, \frac{\partial f_{nq}(X)}{\partial \, \mathrm{vec} X} \right]_{mp \times nq}.$$
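10.2 Example
Let
$$F(X) = AX, \quad f_{ij}(X) = \sum_{k=1}^{n} a_{ik} x_{kj},$$
where $A = [a_{ik}]_{m \times n}$ and $X = [x_{kj}]_{n \times q}$, so that $F(X)$ is of order $m \times q$.
Then the gradient of $F(X)$ w. r. to the matrix variable $X$ is given by
$$\frac{\partial F(X)}{\partial X} = \frac{\partial [\mathrm{vec} F(X)]'}{\partial \, \mathrm{vec} X} = \left[ \frac{\partial f_{11}(X)}{\partial \, \mathrm{vec} X}, \ldots, \frac{\partial f_{m1}(X)}{\partial \, \mathrm{vec} X}, \ldots, \frac{\partial f_{1q}(X)}{\partial \, \mathrm{vec} X}, \ldots, \frac{\partial f_{mq}(X)}{\partial \, \mathrm{vec} X} \right]_{nq \times mq}.$$
Writing out the entries, $\partial f_{ij}(X) / \partial x_{kl} = a_{ik}$ when $l = j$ and $0$ otherwise, so the gradient is block diagonal:
$$\frac{\partial F(X)}{\partial X} = \begin{bmatrix} A' & 0 & \cdots & 0 \\ 0 & A' & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A' \end{bmatrix}_{nq \times mq} = I_q \otimes A'.$$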
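The layout $\partial F / \partial X = \partial (\mathrm{vec} F)' / \partial \, \mathrm{vec} X$ can be made concrete for $F(X) = AX$. The sketch below builds the gradient row by row and compares it with $I_q \otimes A'$. Because $F$ is linear in $X$, each row is obtained exactly by applying $F$ to a unit matrix; the sizes and entries of `A` are arbitrary assumptions, and the helper names are hypothetical.

```python
def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def vec(M):
    """Stack the columns of M into one list (column-major order)."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]

def kron(P, Q):
    """Kronecker product P (x) Q."""
    return [[p * q for p in prow for q in qrow] for prow in P for qrow in Q]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1, 2, 3],
     [4, 5, 6]]          # m x n with m = 2, n = 3
n, q = 3, 2              # X is n x q

# Row (k, l) of the gradient holds d vec(F)' / d x_kl, with rows in vec(X) order.
gradient = []
for l in range(q):
    for k in range(n):
        E = [[1 if (r == k and c == l) else 0 for c in range(q)] for r in range(n)]
        gradient.append(vec(matmul(A, E)))

identity_q = [[1 if i == j else 0 for j in range(q)] for i in range(q)]
expected = kron(identity_q, transpose(A))   # I_q (x) A', of order nq x mq
print(gradient == expected)
```

The result is the transpose of the more familiar Jacobian $I_q \otimes A$ of $\mathrm{vec}(AX)$ with respect to $(\mathrm{vec} X)'$, which reflects the row/column layout chosen in 10.1.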
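Some important rules for matrix differentiation
1. $\dfrac{d}{dt}(A + B) = \dfrac{dA}{dt} + \dfrac{dB}{dt}$
2. $\dfrac{d}{dt}(ABC) = \dfrac{dA}{dt} BC + A \dfrac{dB}{dt} C + AB \dfrac{dC}{dt}$
3. $\dfrac{d}{dt} A^n = \dfrac{dA}{dt} A^{n-1} + A \dfrac{dA}{dt} A^{n-2} + \cdots + A^{n-1} \dfrac{dA}{dt}$
4. $\dfrac{d}{dt} A^{-1} = -A^{-1} \dfrac{dA}{dt} A^{-1}$
5. $\dfrac{d}{dt} \det(A) = \det(A) \, \mathrm{tr}\left( A^{-1} \dfrac{dA}{dt} \right)$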
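Rules 4 and 5 can be checked numerically for a small matrix function of $t$. In the sketch below, the 2 x 2 function `A(t)`, the evaluation point `t` and the step `h` are hypothetical choices made only for illustration.

```python
import math

def A(t):
    """Hypothetical 2 x 2 matrix function of the scalar t."""
    return [[2.0 + t, t * t], [math.sin(t), 3.0]]

def dA(t):
    """Element-wise derivative dA/dt, computed by hand."""
    return [[1.0, 2.0 * t], [math.cos(t), 0.0]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t, h = 0.4, 1e-5
Ai = inv2(A(t))

# Rule 4: d(A^{-1})/dt = -A^{-1} (dA/dt) A^{-1}
rule4 = [[-v for v in row] for row in matmul(matmul(Ai, dA(t)), Ai)]
up, dn = inv2(A(t + h)), inv2(A(t - h))
num4 = [[(up[i][j] - dn[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
err4 = max(abs(rule4[i][j] - num4[i][j]) for i in range(2) for j in range(2))

# Rule 5: d det(A)/dt = det(A) tr(A^{-1} dA/dt)
AidA = matmul(Ai, dA(t))
rule5 = det2(A(t)) * (AidA[0][0] + AidA[1][1])
num5 = (det2(A(t + h)) - det2(A(t - h))) / (2 * h)
err5 = abs(rule5 - num5)
print(err4, err5)
```

Both errors should be at the level of finite-difference noise, provided `A(t)` is invertible at the chosen `t`.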
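Homework
Show that:
1. $\dfrac{\partial (a'Xb)}{\partial X} = ab'$
2. $\dfrac{\partial (a'X'b)}{\partial X} = ba'$
3. $\dfrac{\partial (a'X'Xb)}{\partial X} = X(ab' + ba')$
4. $\dfrac{\partial X^{-1}}{\partial X} = -X^{-1} \otimes (X^{-1})'$
5. $\dfrac{\partial (AX^{-1}B)}{\partial X} = -(X^{-1}B) \otimes (AX^{-1})'$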
11. Some Applications of Matrix Differential Calculus
1. Test of independence between functions
2. Expansion of Taylor series
3. Transformations of multivariate density functions
4. Multiple integrations
5. And so on.
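Test of Independence
A set of functions $f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_n(\mathbf{x})$ of $\mathbf{x} = (x_1, x_2, \ldots, x_n)'$ are said to be functionally dependent on each other if their Jacobian is identically zero. That is,
$$J(f_1, f_2, \ldots, f_n) = \frac{\partial (f_1, f_2, \ldots, f_n)}{\partial (x_1, x_2, \ldots, x_n)} = 0.$$
Example: Show that the functions
$$f_1(\mathbf{x}) = x_1 + x_2 - x_3, \quad f_2(\mathbf{x}) = x_1 - x_2 + x_3, \quad f_3(\mathbf{x}) = x_1^2 + x_2^2 + x_3^2 - 2 x_2 x_3$$
are not independent of one another. Show that
$$f_1^2(\mathbf{x}) + f_2^2(\mathbf{x}) = 2 f_3(\mathbf{x}).$$
Proof:
$$J(f_1, f_2, f_3) = \frac{\partial (f_1, f_2, f_3)}{\partial (x_1, x_2, x_3)} = \begin{vmatrix} 1 & 1 & -1 \\ 1 & -1 & 1 \\ 2x_1 & 2x_2 - 2x_3 & 2x_3 - 2x_2 \end{vmatrix} = 0.$$
So the functions are not independent.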
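The Jacobian test can be illustrated with a short computation. The sketch below assumes the example functions $f_1 = x_1 + x_2 - x_3$, $f_2 = x_1 - x_2 + x_3$ and $f_3 = x_1^2 + x_2^2 + x_3^2 - 2x_2x_3$ (the signs are an assumption of this sketch), evaluates their Jacobian determinant at a few arbitrary integer points, and also checks the relation $f_1^2 + f_2^2 = 2f_3$.

```python
def f(x):
    """The three example functions, returned as a tuple (f1, f2, f3)."""
    x1, x2, x3 = x
    return (x1 + x2 - x3,
            x1 - x2 + x3,
            x1 ** 2 + x2 ** 2 + x3 ** 2 - 2 * x2 * x3)

def jacobian(x):
    """Row i holds the partial derivatives of f_i with respect to x1, x2, x3."""
    x1, x2, x3 = x
    return [[1, 1, -1],
            [1, -1, 1],
            [2 * x1, 2 * x2 - 2 * x3, 2 * x3 - 2 * x2]]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f_), (g, h, i) = M
    return a * (e * i - f_ * h) - b * (d * i - f_ * g) + c * (d * h - e * g)

points = [(1, 2, 3), (-4, 0, 5), (7, -2, -1)]   # arbitrary test points
dets = [det3(jacobian(p)) for p in points]
identity_holds = all(f(p)[0] ** 2 + f(p)[1] ** 2 == 2 * f(p)[2] for p in points)
print(dets, identity_holds)
```

A zero determinant at a handful of points does not prove dependence by itself; here it simply agrees with the algebraic relation between the three functions.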
Taylor series expansions of multivariate functions
In deriving some of the gradient-type learning algorithms, we have to resort to the Taylor series expansion of a function $g(x)$ of a scalar variable $x$,
$$g(x') = g(x) + \frac{dg}{dx}(x' - x) + \frac{1}{2} \frac{d^2 g}{dx^2}(x' - x)^2 + \ldots \qquad (3.19)$$
We can do a similar expansion for a function $g(\mathbf{x}) = g(x_1, x_2, \ldots, x_m)$ of $m$ variables. We have
$$g(\mathbf{x}') = g(\mathbf{x}) + \left( \frac{\partial g}{\partial \mathbf{x}} \right)^T (\mathbf{x}' - \mathbf{x}) + \frac{1}{2} (\mathbf{x}' - \mathbf{x})^T \frac{\partial^2 g}{\partial \mathbf{x}^2} (\mathbf{x}' - \mathbf{x}) + \ldots \qquad (3.20)$$
where the derivatives are evaluated at the point $\mathbf{x}$. The second term is the inner product of the gradient vector with the vector $\mathbf{x}' - \mathbf{x}$, and the third term is a quadratic form with the symmetric Hessian matrix $\partial^2 g / \partial \mathbf{x}^2$. The truncation error depends on the distance $|\mathbf{x}' - \mathbf{x}|$; the distance has to be small if $g(\mathbf{x}')$ is approximated using only the first- and second-order terms.
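The dependence of the truncation error on the distance can be seen on a small example. The sketch below uses a hypothetical test function $g(\mathbf{x}) = e^{x_1} \sin x_2$, with its gradient and Hessian worked out by hand, and compares the second-order expansion (3.20) with the true value at two distances from the expansion point. The function, the point and the step sizes are assumptions chosen for illustration.

```python
import math

def g(x):
    """Hypothetical test function g(x) = exp(x1) * sin(x2)."""
    return math.exp(x[0]) * math.sin(x[1])

def gradient(x):
    e, s, c = math.exp(x[0]), math.sin(x[1]), math.cos(x[1])
    return [e * s, e * c]

def hessian(x):
    e, s, c = math.exp(x[0]), math.sin(x[1]), math.cos(x[1])
    return [[e * s, e * c], [e * c, -e * s]]

def taylor2(x, x_new):
    """Second-order expansion around x, evaluated at x_new."""
    d = [b - a for a, b in zip(x, x_new)]
    grad, H = gradient(x), hessian(x)
    linear = sum(gi * di for gi, di in zip(grad, d))
    quadratic = sum(d[i] * H[i][j] * d[j] for i in range(2) for j in range(2))
    return g(x) + linear + 0.5 * quadratic

x = [0.3, 0.7]
errors = []
for step in (0.1, 0.01):
    x_new = [x[0] + step, x[1] + step]
    errors.append(abs(g(x_new) - taylor2(x, x_new)))
print(errors)
```

Shrinking the step by a factor of ten reduces the error by roughly a factor of a thousand, as expected when the first neglected term is of third order.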
The same expansion can be made for a scalar function of a matrix variable. The second-order term already becomes complicated because the second-order gradient is a four-dimensional tensor. But we can easily extend the first-order term in (3.20), the inner product of the gradient with the vector $\mathbf{x}' - \mathbf{x}$, to the matrix case. Remember that the vector inner product is defined as
$$\left( \frac{\partial g}{\partial \mathbf{x}} \right)^T (\mathbf{x}' - \mathbf{x}) = \sum_{i=1}^{m} \left( \frac{\partial g}{\partial \mathbf{x}} \right)_i (x'_i - x_i).$$
For the matrix case, this must become the sum
$$\sum_{i=1}^{m} \sum_{j=1}^{m} \left( \frac{\partial g}{\partial \mathbf{X}} \right)_{ij} (x'_{ij} - x_{ij}).$$
This is the sum of the products of corresponding elements, just like in the vectorial inner product. This can be nicely presented in matrix form when we remember that for any two matrices, say $A$ and $B$,
$$\mathrm{trace}(A^T B) = \sum_{i=1}^{m} (A^T B)_{ii} = \sum_{i=1}^{m} \sum_{j=1}^{m} (A)_{ij} (B)_{ij},$$
with obvious notation. So, we have
$$g(\mathbf{X}') = g(\mathbf{X}) + \mathrm{trace}\left[ \left( \frac{\partial g}{\partial \mathbf{X}} \right)^T (\mathbf{X}' - \mathbf{X}) \right] + \ldots \qquad (3.21)$$
for the first two terms in the Taylor series of a function $g$ of a matrix variable.
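The trace form of the first-order term can be tried out directly. The sketch below first checks that $\mathrm{trace}(A^T B)$ equals the sum of element-wise products, then applies the first-order expansion to the hypothetical choice $g(\mathbf{X}) = \mathrm{tr}(\mathbf{X}^T \mathbf{X})$, whose gradient is $2\mathbf{X}$ (the trace rule $\partial \, \mathrm{tr}(X^T A X) / \partial X = (A + A^T) X$ with $A = I$). The matrices used are arbitrary assumptions.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def elementwise_sum(A, B):
    """Sum over i, j of (A)_ij (B)_ij."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# 1) trace(A'B) agrees with the element-wise sum.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
trace_value = trace(matmul(transpose(A), B))

# 2) First-order expansion of g(X) = tr(X'X) around X, evaluated at X_new.
def g(X):
    return elementwise_sum(X, X)

X = [[1.0, 2.0], [3.0, 4.0]]
X_new = [[1.1, 2.0], [2.9, 4.2]]
D = [[X_new[i][j] - X[i][j] for j in range(2)] for i in range(2)]
grad = [[2 * v for v in row] for row in X]               # gradient of tr(X'X)
first_order = g(X) + trace(matmul(transpose(grad), D))
error = g(X_new) - first_order                            # what the expansion leaves out
print(trace_value, error)
```

For this particular $g$, the neglected part is exactly $\mathrm{tr}(D^T D)$ with $D = \mathbf{X}' - \mathbf{X}$, so the error shrinks quadratically with the size of the perturbation.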