
Journal of Computational and Applied Mathematics 177 (2005) 331–345

www.elsevier.com/locate/cam

A Schur complement approach for computing subcovariance matrices arising in a road safety measure modelling

Assi N'Guessan a,∗, Claude Langrand b

a Ecole Polytechnique Universitaire de Lille et Laboratoire de Mathématiques Appliquées CNRS, FRE 2222, Université de Lille 1, 59655 Villeneuve d'Ascq Cedex, France

b Laboratoire de Mathématiques Appliquées CNRS, FRE 2222, U.F.R. de Mathématiques Pures et Appliquées, Université de Lille 1, 59655 Villeneuve d'Ascq Cedex, France

Received 9 March 2004; received in revised form 20 July 2004

Abstract

This paper deals with the determination of the analytical expression of the inverse of a real partitioned matrix. A formal inversion method is suggested through Schur complements, and the elements of any block of the inverse matrix are made explicit, irrespective of the size of the considered matrix. The suggested approach does not need any numerical matrix inversion program. This methodology is applied to a Fisher information matrix relating to a multidimensional modelling of road accident data when a road safety measure is applied on different experiment sites. An example of formal calculation and interpretation is given to support our approach. © 2004 Elsevier B.V. All rights reserved.

Keywords: Matrix approximation; Schur complement; Fisher information matrix; Asymptotic covariance matrix; Formal estimation; Road safety measure; Accident data; Multinomial model; Restricted maximum likelihood

1. Introduction

Most statistical studies involving data collection (random phenomenon modelling, experiment planning, opinion polls, etc.) not only bring out the problems of parameter estimation (looking for optimal solutions) but also those related to the evaluation of the accuracy of those estimations. In many of the

∗ Corresponding author. Department of Statistics and Computer Sciences, Bât. Polytech'Lille, Université de Lille 1, 59655 Villeneuve d'Ascq, France. Tel.: +33 3 28 76 74 57; fax: +33 3 28 76 73 01.

E-mail addresses: [email protected] (A. N'Guessan), [email protected] (C. Langrand).

0377-0427/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.cam.2004.09.023


statistical evaluation problems, particularly in multivariate statistics, the parameters of interest are not functionally independent, which means there are relations, i.e. constraints, between them. These constraints add further difficulties in obtaining the solutions (estimates) and their accuracy. One of the most popular, as well as most frequently used, statistical tools to overcome those difficulties is the inversion of the so-called Fisher information matrix. The definition of this matrix uses both the notion of partial derivative of the logarithm of the likelihood function and the linear operator called mathematical expectation, a notion to which we will come back later in this paper. The concept of the Fisher information matrix is well known and of capital importance in the theory of statistical parameter estimation (see for example [4,7,9,10,18,22]) and even in the theory of linear systems (see for example [17,20]). Indeed this matrix intervenes in many iterative estimation algorithms, with or without constraints, and its inverse, called the asymptotic variance and covariance matrix, provides a measure of the estimation accuracy and of the degree of the linear relations which may exist among the different parameters of the considered model.

In this paper, we focus on the analytical expression of the elements of some blocks of the inverse of a particular Fisher information matrix arising from the statistical analysis of a road safety measure set up by N'Guessan et al. (see [13,16]). There are certainly other methods (for example [6,23,24,27]) to reach this expression, but the one we suggest is based on the Schur complement approach [2,8,19,26] and enables us to get the formal expression of the desired elements. This work stems from that of N'Guessan [14]. Our results go further than those of the latter author and explain the analytical structure of the elements of any block of the inverse of the information matrix considered, irrespective of the size of this matrix. The paper is organised as follows. Section 2 states all the notations and hypotheses used thereafter. Section 3 presents the main technical results and sketches their proofs. Section 4 offers an illustrative example coming from the multidimensional modelling of a road safety measure and accident risk. We conclude with an appendix describing the mechanism used to obtain the structure of the analysed Fisher information matrix.

2. Notations and assumptions

Let s (s > 0) and r (r > 1) be two given integers and n_k (n_k > 0) a given integer, k = 1, 2, ..., s. Let z_k = (z_{1k}, z_{2k}, ..., z_{rk})ᵀ be an r × 1 vector of given real data with z_{jk} > 0, k = 1, 2, ..., s, j = 1, 2, ..., r, and denote by β = (θ, φᵀ)ᵀ a (1 + sr) × 1 unknown vector parameter, where θ (θ > 0) is a real number and φ = (φ₁ᵀ, φ₂ᵀ, ..., φ_sᵀ)ᵀ ∈ R^{sr}, with φ_k = (φ_{1k}, φ_{2k}, ..., φ_{rk})ᵀ an r × 1 vector and φ_{jk} > 0. For k = 1, 2, ..., s, set

α_k = n_k/(1 + θ⟨z_k, φ_k⟩), a real number, where ⟨·, ·⟩ is the usual inner product;

V_{β,k} = (θα_k/n_k)(z_{1k}, z_{2k}, ..., z_{rk})ᵀ, an r × 1 vector;

Δ_{β,k} = diag((1 + θz_{1k})/φ_{1k}, (1 + θz_{2k})/φ_{2k}, ..., (1 + θz_{rk})/φ_{rk}), an r × r matrix;

B_{β,k} = α_k(Δ_{β,k} − V_{β,k}V_{β,k}ᵀ), an r × r matrix.

Then we assume that the unknown vector parameter β is subjected to the s restraints

(A1) h_k(β) = 0, k = 1, 2, ..., s, where the functions h_k : R^{1+sr} → R are given by
(A2) h_k(β) = ⟨1_r, φ_k⟩ − 1, with 1_r = (1, ..., 1)ᵀ ∈ R^r the vector of ones.

Therefore we note (writing block matrices row by row, rows separated by semicolons)

J_β = [γ_β, U_βᵀ; U_β, B_β],  Σ_β = [J_β, H_βᵀ; H_β, 0_{s,s}],  (1)

where γ_β, U_β and B_β are, respectively, a positive real number, an (sr) × 1 vector and an (sr) × (sr) matrix given by

γ_β = Σ_{k=1}^s α_k⟨z_k, φ_k⟩/(θ(1 + θ⟨z_k, φ_k⟩)),  U_β = (U_{β,1}ᵀ, U_{β,2}ᵀ, ..., U_{β,s}ᵀ)ᵀ,  B_β = bloc.diag(B_{β,1}, B_{β,2}, ..., B_{β,s})

with U_{β,k} = (α_k/θ)V_{β,k}, and H_β = (H₁, H₂) is an s × (1 + sr) matrix, where

H₁ = (∂h₁/∂θ, ∂h₂/∂θ, ..., ∂h_s/∂θ)ᵀ,  H₂ = (∂h₁/∂φ, ∂h₂/∂φ, ..., ∂h_s/∂φ)ᵀ

are, respectively, s × 1 and s × (sr) matrices. In the present paper, we investigate the inversion of the matrices J_β and Σ_β under assumptions (A1) and (A2), using the Schur complement approach (see for instance [19]).
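Before stating the main results, it may help to see these objects concretely. The following numpy sketch is ours, not part of the paper; the data values are arbitrary and the variable names are our own. It assembles α_k, V_{β,k}, Δ_{β,k}, B_{β,k}, U_β, γ_β, H_β and the matrices J_β and Σ_β of (1) for small s and r:

```python
import numpy as np

# Toy data (our own choice): s sites, r accident types, positive data.
rng = np.random.default_rng(0)
s, r, theta = 2, 3, 0.8                      # theta > 0 is the scalar parameter
z   = rng.uniform(0.5, 2.0, (s, r))          # z_k as rows, z_jk > 0
phi = rng.dirichlet(np.ones(r), size=s)      # phi_k as rows, <1_r, phi_k> = 1
n   = np.array([120.0, 80.0])                # n_k > 0

ip    = (z * phi).sum(axis=1)                # <z_k, phi_k>
alpha = n / (1.0 + theta * ip)               # alpha_k
V     = (theta * alpha / n)[:, None] * z     # V_{beta,k} (rows)
U     = (alpha / theta)[:, None] * V         # U_{beta,k} (rows)
B     = [alpha[k] * (np.diag((1.0 + theta * z[k]) / phi[k])
                     - np.outer(V[k], V[k])) for k in range(s)]
gamma = float((alpha * ip / (theta * (1.0 + theta * ip))).sum())

# J_beta of (1): arrowhead matrix with block-diagonal B_beta.
J = np.zeros((1 + s * r, 1 + s * r))
J[0, 0] = gamma
J[0, 1:] = J[1:, 0] = U.ravel()
for k in range(s):
    J[1 + k * r:1 + (k + 1) * r, 1 + k * r:1 + (k + 1) * r] = B[k]

# H_beta = (H1, H2): h_k depends only on phi_k, so H1 = 0, H2 = I_s kron 1_r^T.
H = np.hstack([np.zeros((s, 1)), np.kron(np.eye(s), np.ones((1, r)))])

# Bordered matrix Sigma_beta of (1).
Sigma = np.block([[J, H.T], [H, np.zeros((s, s))]])

sym_ok = np.allclose(Sigma, Sigma.T)
J_pd   = bool(np.all(np.linalg.eigvalsh(J) > 0))
```

The sketch also confirms two facts used implicitly later: Σ_β is symmetric, and J_β is positive definite for such data.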

3. Main results

Theorem 3.1. For k = 1, 2, ..., s,

(i) ‖z_k‖²_{Δ⁻¹_{β,k}} < ⟨z_k, φ_k⟩/θ;

(ii) ‖V_{β,k}‖²_{Δ⁻¹_{β,k}} < θ⟨z_k, φ_k⟩/(1 + θ⟨z_k, φ_k⟩)² < 1,

where, throughout the paper, ‖x‖²_M stands for the quadratic form xᵀMx.

Proof. (i) Using the fact that x/(1 + x) < 1 for all x > 0, one obtains after matrix manipulations

‖z_k‖²_{Δ⁻¹_{β,k}} < (1/θ) Σ_{m=1}^r φ_{mk}z_{mk} = ⟨z_k, φ_k⟩/θ.

(ii) This part follows from (i) and the relation between the vectors V_{β,k} and z_k. □

Theorem 3.2. (i) For k = 1, 2, ..., s, the matrix B_{β,k} is nonsingular and

B⁻¹_{β,k} = α_k⁻¹(Δ⁻¹_{β,k} + t_k Δ⁻¹_{β,k} V_{β,k} V_{β,k}ᵀ Δ⁻¹_{β,k})  with t_k = (1 − ‖V_{β,k}‖²_{Δ⁻¹_{β,k}})⁻¹.

(ii) ‖U_β‖²_{B⁻¹_β} < γ_β.

Proof. (i) Let us define

Δ⁽¹⁾_{β,k} = [Δ_{β,k}, V_{β,k}; V_{β,k}ᵀ, 1],

a (1 + r) × (1 + r) matrix, and (Δ⁽¹⁾_{β,k}/Δ_{β,k}) (resp. (Δ⁽¹⁾_{β,k}/1)) the Schur complement of Δ_{β,k} (resp. of 1) in Δ⁽¹⁾_{β,k}. Theorem 3.1 implies that

(Δ⁽¹⁾_{β,k}/Δ_{β,k}) = 1 − ‖V_{β,k}‖²_{Δ⁻¹_{β,k}} > 0.

So using the Woodbury [29] formula (see also [21, pp. 73–78; 19, pp. 201–204]), we can deduce that (Δ⁽¹⁾_{β,k}/1) is nonsingular and

(Δ⁽¹⁾_{β,k}/1)⁻¹ = Δ⁻¹_{β,k} + Δ⁻¹_{β,k} V_{β,k} (Δ⁽¹⁾_{β,k}/Δ_{β,k})⁻¹ V_{β,k}ᵀ Δ⁻¹_{β,k}.

The result follows from the relation B_{β,k} = α_k(Δ⁽¹⁾_{β,k}/1).

(ii) Matrix manipulations and the relations between the vectors U_{β,k} and V_{β,k} on the one hand, and between the matrices B_{β,k} and Δ_{β,k} on the other hand, imply that

‖U_β‖²_{B⁻¹_β} = (1/θ²) Σ_{k=1}^s α_k ‖V_{β,k}‖²_{Δ⁻¹_{β,k}} / (1 − ‖V_{β,k}‖²_{Δ⁻¹_{β,k}}).  (2)

Using Theorem 3.1, one obtains

‖U_β‖²_{B⁻¹_β} < Σ_{k=1}^s (α_k(1 + θ⟨z_k, φ_k⟩)/θ²) ‖V_{β,k}‖²_{Δ⁻¹_{β,k}} < γ_β. □  (3)
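Theorem 3.2(i) is a rank-one (Sherman-Morrison/Woodbury) update and is easy to check numerically. The following sketch is ours, with invented data, and verifies the closed-form inverse against a direct inversion:

```python
import numpy as np

# One site k with toy data (our own choice).
rng = np.random.default_rng(1)
r, theta, n_k = 4, 0.6, 100.0
z, phi = rng.uniform(0.5, 2.0, r), rng.dirichlet(np.ones(r))

alpha = n_k / (1.0 + theta * (z @ phi))
V     = (theta * alpha / n_k) * z
Delta = np.diag((1.0 + theta * z) / phi)
B     = alpha * (Delta - np.outer(V, V))               # B_{beta,k}

Dinv = np.diag(phi / (1.0 + theta * z))                # Delta^{-1} in closed form
nV2  = float(V @ Dinv @ V)                             # ||V||^2 in Delta^{-1} norm

t_k  = 1.0 / (1.0 - nV2)
Binv = (Dinv + t_k * Dinv @ np.outer(V, V) @ Dinv) / alpha   # Theorem 3.2(i)
inv_ok = np.allclose(Binv @ B, np.eye(r))
```

The quantity `nV2` staying below 1 is exactly Theorem 3.1(ii), which is what makes t_k well defined.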

Theorem 3.3. Let us define

B⁽⁰⁾_β = [B_β, U_β; U_βᵀ, 0],  Ω_{k,m} = [(J_β/B_β), −U_{β,m}ᵀ B⁻¹_{β,m}; B⁻¹_{β,k} U_{β,k}, δ_{k,m} B⁻¹_{β,k}],

respectively, a (1 + sr) × (1 + sr) matrix and a (1 + r) × (1 + r) matrix, with k, m = 1, 2, ..., s and δ_{k,m} the Kronecker symbol. Then

(i) (J_β/B_β) = γ_β + (B⁽⁰⁾_β/B_β) > 0;

(ii) (J_β/γ_β) is an (sr) × (sr) nonsingular matrix and

(J_β/γ_β)⁻¹ = [J_{1,1}, J_{1,2}, ..., J_{1,s}; J_{2,1}, J_{2,2}, ..., J_{2,s}; ...; J_{s,1}, J_{s,2}, ..., J_{s,s}],  (4)

where each block J_{k,m} is an r × r matrix given by (Ω_{k,m}/(J_β/B_β)), the Schur complement of (J_β/B_β) in Ω_{k,m}.

Proof. (i) Matrix manipulations and the relations between the vectors U_{β,k} and V_{β,k} on the one hand, and between the matrices B⁻¹_{β,k} and Δ⁻¹_{β,k} on the other hand, imply that

−(B⁽⁰⁾_β/B_β) = (1/θ²) Σ_{k=1}^s α_k ‖V_{β,k}‖²_{Δ⁻¹_{β,k}} / (1 − ‖V_{β,k}‖²_{Δ⁻¹_{β,k}}).

So doubly using part (ii) of Theorem 3.1, we obtain

−(B⁽⁰⁾_β/B_β) < Σ_{k=1}^s α_k⟨z_k, φ_k⟩/(θ(1 + θ⟨z_k, φ_k⟩)) = γ_β.

Therefore, we can deduce that (J_β/B_β) is nonsingular and

(J_β/B_β) = γ_β + (B⁽⁰⁾_β/B_β) > 0.  (5)

(ii) Since B⁻¹_β exists and (J_β/B_β) > 0, then, using the same argument as in part (i) of Theorem 3.2, the (sr) × (sr) matrix (J_β/γ_β) is nonsingular and

(J_β/γ_β)⁻¹ = B⁻¹_β + (J_β/B_β)⁻¹ B⁻¹_β U_β U_βᵀ B⁻¹_β.  (6)

The result follows from matrix manipulations. □
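Identity (6) is again a rank-one update (the Schur complement of the scalar γ_β), and can be checked mechanically. A sketch of ours, with invented data:

```python
import numpy as np

rng = np.random.default_rng(2)
s, r, theta = 2, 3, 0.9
z   = rng.uniform(0.5, 2.0, (s, r))
phi = rng.dirichlet(np.ones(r), size=s)
n   = np.array([90.0, 140.0])

ip    = (z * phi).sum(axis=1)
alpha = n / (1.0 + theta * ip)
V     = (theta * alpha / n)[:, None] * z
U     = ((alpha / theta)[:, None] * V).ravel()        # stacked U_beta
gamma = float((alpha * ip / (theta * (1.0 + theta * ip))).sum())

Bblk = np.zeros((s * r, s * r))
for k in range(s):
    Dk = np.diag((1.0 + theta * z[k]) / phi[k])
    Bblk[k * r:(k + 1) * r, k * r:(k + 1) * r] = \
        alpha[k] * (Dk - np.outer(V[k], V[k]))

Binv    = np.linalg.inv(Bblk)
JB      = gamma - U @ Binv @ U                        # (J_beta/B_beta), scalar
J_gamma = Bblk - np.outer(U, U) / gamma               # (J_beta/gamma_beta)
formula = Binv + (Binv @ np.outer(U, U) @ Binv) / JB  # right-hand side of (6)
inv_ok  = np.allclose(formula, np.linalg.inv(J_gamma))
```

Positivity of `JB` is part (i) of the theorem; formula (6) only needs a single scalar division beyond B⁻¹_β, which is the computational point of the whole construction.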

Theorem 3.4. Let us define

B⁽⁰⁾_{β,k} = [B_{β,k}, 1_r; 1_rᵀ, 0],  Λ⁽⁰⁾_β = [Λ_β, λ; λᵀ, 0],  Λ⁽¹⁾_β = [Λ_β, −λ; λᵀ, (J_β/B_β)],

respectively, a (1 + r) × (1 + r) matrix and two (1 + s) × (1 + s) matrices, where

Λ_β = diag[−(B⁽⁰⁾_{β,1}/B_{β,1}), ..., −(B⁽⁰⁾_{β,s}/B_{β,s})]  (7)

is an s × s matrix and

λ = H₂ B⁻¹_β U_β  (8)

is an s × 1 vector. Then

(i) −(B⁽⁰⁾_{β,k}/B_{β,k}) > 0 for k = 1, 2, ..., s;

(ii) (Λ⁽¹⁾_β/(J_β/B_β)), the Schur complement of (J_β/B_β) in Λ⁽¹⁾_β, is an s × s nonsingular matrix and

(Λ⁽¹⁾_β/(J_β/B_β))⁻¹ = Λ⁻¹_β − Λ⁻¹_β λ ((J_β/B_β) − (Λ⁽⁰⁾_β/Λ_β))⁻¹ λᵀ Λ⁻¹_β.  (9)

Proof. (i) B_{β,k} is nonsingular, so (B⁽⁰⁾_{β,k}/B_{β,k}) exists and

−(B⁽⁰⁾_{β,k}/B_{β,k}) = 1_rᵀ B⁻¹_{β,k} 1_r > 0

for k = 1, 2, ..., s.

(ii) Part (i) implies that Λ_β is nonsingular. Since (J_β/B_β) > 0, then (Λ⁽¹⁾_β/(J_β/B_β)) is nonsingular and

(Λ⁽¹⁾_β/(J_β/B_β))⁻¹ = Λ⁻¹_β − Λ⁻¹_β λ (Λ⁽¹⁾_β/Λ_β)⁻¹ λᵀ Λ⁻¹_β
= Λ⁻¹_β − Λ⁻¹_β λ ((J_β/B_β) − (Λ⁽⁰⁾_β/Λ_β))⁻¹ λᵀ Λ⁻¹_β. □  (10)

Theorem 3.5. Let R_β = −(Σ_β/J_β) denote the s × s matrix given by the negative of the Schur complement of J_β in Σ_β. Then R_β is nonsingular and

R⁻¹_β = Λ⁻¹_β − ((J_β/B_β) − (Λ⁽⁰⁾_β/Λ_β))⁻¹ Λ⁻¹_β λ λᵀ Λ⁻¹_β.  (11)

Proof. Using Theorems 3.2 and 3.3 and the well-known results of Schur complements about the partitioned matrix inverse (see for instance [19, pp. 201–204]), the matrix J_β is nonsingular and

J⁻¹_β = [(J_β/B_β)⁻¹, −(J_β/B_β)⁻¹ U_βᵀ B⁻¹_β; −B⁻¹_β U_β (J_β/B_β)⁻¹, (J_β/γ_β)⁻¹].  (12)

Now using assumptions (A1) and (A2) above and the expression of (J_β/γ_β)⁻¹ (see the proof of Theorem 3.3), we obtain

R_β = H₂ (J_β/γ_β)⁻¹ H₂ᵀ = Λ_β + (J_β/B_β)⁻¹ λ λᵀ = (Λ⁽¹⁾_β/(J_β/B_β)).  (13)

The result follows from Theorem 3.4. □
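Both characterisations of R_β in (13), and the inverse (11), can be confirmed numerically. The sketch below is ours, with invented data; note that only per-site r × r inversions and one scalar division are involved on the closed-form side:

```python
import numpy as np

rng = np.random.default_rng(3)
s, r, theta = 3, 2, 0.7
z   = rng.uniform(0.5, 2.0, (s, r))
phi = rng.dirichlet(np.ones(r), size=s)
n   = rng.uniform(60.0, 160.0, s)

ip    = (z * phi).sum(axis=1)
alpha = n / (1.0 + theta * ip)
V     = (theta * alpha / n)[:, None] * z
U     = (alpha / theta)[:, None] * V
gamma = float((alpha * ip / (theta * (1.0 + theta * ip))).sum())

B = [alpha[k] * (np.diag((1.0 + theta * z[k]) / phi[k]) - np.outer(V[k], V[k]))
     for k in range(s)]
Bblk = np.zeros((s * r, s * r))
for k in range(s):
    Bblk[k * r:(k + 1) * r, k * r:(k + 1) * r] = B[k]

Jfull = np.zeros((1 + s * r, 1 + s * r))
Jfull[0, 0] = gamma
Jfull[0, 1:] = Jfull[1:, 0] = U.ravel()
Jfull[1:, 1:] = Bblk

H2 = np.kron(np.eye(s), np.ones((1, r)))
H  = np.hstack([np.zeros((s, 1)), H2])
R  = H @ np.linalg.inv(Jfull) @ H.T                   # -(Sigma_beta/J_beta)

JB   = gamma - U.ravel() @ np.linalg.inv(Bblk) @ U.ravel()
Lam  = np.diag([np.ones(r) @ np.linalg.inv(B[k]) @ np.ones(r) for k in range(s)])
lam  = np.array([np.ones(r) @ np.linalg.inv(B[k]) @ U[k] for k in range(s)])
Lami = np.linalg.inv(Lam)

R13    = Lam + np.outer(lam, lam) / JB                              # Eq. (13)
Rinv   = Lami - Lami @ np.outer(lam, lam) @ Lami / (JB + lam @ Lami @ lam)  # Eq. (11)
r13_ok  = np.allclose(R, R13)
rinv_ok = np.allclose(Rinv @ R, np.eye(s))
```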

Theorem 3.6. Let η²_β = [(Λ⁽¹⁾_β/Λ_β)]⁻¹, where (Λ⁽¹⁾_β/Λ_β) is the Schur complement of Λ_β in Λ⁽¹⁾_β; let 1_{r,r} be the r × r matrix of ones; δ_{β,k} = z_kᵀ Δ⁻¹_{β,k} 1_r, k = 1, 2, ..., s; and A_β = ((A_{k,m})) an s × s matrix with (k,m)th element

A_{k,m} = δ_{k,m}/‖1_r‖²_{B⁻¹_{β,k}} − η²_β t_k t_m α_k α_m δ_{β,k} δ_{β,m} / (n_k n_m ‖1_r‖²_{B⁻¹_{β,k}} ‖1_r‖²_{B⁻¹_{β,m}}).  (14)

Then

(i) H₂ᵀ R⁻¹_β H₂ = A_β ⊗ 1_{r,r} is an (sr) × (sr) matrix, where ⊗ is the Kronecker product;

(ii) (J_β/γ_β)⁻¹ H₂ᵀ R⁻¹_β H₂ (J_β/γ_β)⁻¹ = G_β, where G_β is an (sr) × (sr) matrix whose blocks G_{k,m} are r × r matrices with

G_{k,m} = Σ_{j=1}^s [(Σ_{i=1}^s A_{i,j} J_{k,i}) 1_{r,r}] J_{j,m},  k, m = 1, 2, ..., s.  (15)

Proof. This proof stems from an algebraic manipulation of matrices, so we only give the outline.

(i) Using Theorems 3.4 and 3.5 and the relations

1_{r,r} B⁻¹_{β,k} U_{β,k} U_{β,m}ᵀ B⁻¹_{β,m} 1_{r,r} = (t_k t_m α_k α_m δ_{β,k} δ_{β,m} / (n_k n_m)) 1_{r,r},  (16)

we can deduce that

H₂ᵀ R⁻¹_β H₂ = A_β ⊗ 1_{r,r}.  (17)

(ii) In view of Theorem 3.3, the r × r elements J_{k,m} of (J_β/γ_β)⁻¹ are

J_{k,m} = δ_{k,m} B⁻¹_{β,k} + (J_β/B_β)⁻¹ B⁻¹_{β,k} U_{β,k} U_{β,m}ᵀ B⁻¹_{β,m}.

So using part (ii) of Theorem 3.3, the postmultiplication of H₂ᵀ R⁻¹_β H₂ by (J_β/γ_β)⁻¹ and finally the premultiplication of the same matrix, we get the (sr) × (sr) matrix (J_β/γ_β)⁻¹ H₂ᵀ R⁻¹_β H₂ (J_β/γ_β)⁻¹ whose (k,m)th r × r block is

Σ_{j=1}^s [(Σ_{i=1}^s J_{k,i} A_{i,j}) 1_{r,r}] J_{j,m},  (18)

and thereafter we obtain the matrix G_β. □
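The Kronecker structure in part (i) can be verified directly. In this sketch of ours (invented data), A_β is built entry by entry from (14) and compared with the left-hand side of (17):

```python
import numpy as np

rng = np.random.default_rng(4)
s, r, theta = 2, 2, 0.5
z   = rng.uniform(0.5, 2.0, (s, r))
phi = rng.dirichlet(np.ones(r), size=s)
n   = np.array([100.0, 70.0])

ip    = (z * phi).sum(axis=1)
alpha = n / (1.0 + theta * ip)
V     = (theta * alpha / n)[:, None] * z
U     = (alpha / theta)[:, None] * V
gamma = float((alpha * ip / (theta * (1.0 + theta * ip))).sum())

Dinv = [np.diag(phi[k] / (1.0 + theta * z[k])) for k in range(s)]
B    = [alpha[k] * (np.diag((1.0 + theta * z[k]) / phi[k])
                    - np.outer(V[k], V[k])) for k in range(s)]
Binv = [np.linalg.inv(B[k]) for k in range(s)]

t     = np.array([1.0 / (1.0 - V[k] @ Dinv[k] @ V[k]) for k in range(s)])
delta = np.array([z[k] @ Dinv[k] @ np.ones(r) for k in range(s)])   # delta_{beta,k}
norm1 = np.array([np.ones(r) @ Binv[k] @ np.ones(r) for k in range(s)])

JB   = gamma - sum(U[k] @ Binv[k] @ U[k] for k in range(s))
lam  = t * alpha * delta / n              # equals 1_r^T B_k^{-1} U_{beta,k}
lam2 = float(np.sum(lam**2 / norm1))      # Lemma 3.8(iii)
eta2 = 1.0 / (JB + lam2)

# A_beta of Eq. (14)
A = np.diag(1.0 / norm1) - eta2 * np.outer(lam, lam) / np.outer(norm1, norm1)

# Direct route: R per Eq. (13), then H2^T R^{-1} H2
R   = np.diag(norm1) + np.outer(lam, lam) / JB
H2  = np.kron(np.eye(s), np.ones((1, r)))
kron_ok = np.allclose(H2.T @ np.linalg.inv(R) @ H2, np.kron(A, np.ones((r, r))))
```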

Theorem 3.7. The matrix Σ_β is nonsingular and

Σ⁻¹_β = [W_β, J⁻¹_β H_βᵀ R⁻¹_β; R⁻¹_β H_β J⁻¹_β, −R⁻¹_β],  (19)

where W_β, the leading (1 + sr) × (1 + sr) block, is defined by

W_β = [W_β(1,1), W_βᵀ(2,1); W_β(2,1), W_β(2,2)]  (20)

with W_β(1,1), W_β(2,1) and W_β(2,2), respectively, a scalar, an (sr) × 1 vector and an (sr) × (sr) matrix as follows:

W_β(1,1) = (Λ⁽¹⁾_β/Λ_β)⁻¹,
W_β(2,1) = −(J_β/B_β)⁻¹[B⁻¹_β U_β − (J_β/γ_β)⁻¹ H₂ᵀ R⁻¹_β λ],
W_β(2,2) = (J_β/γ_β)⁻¹ − (J_β/γ_β)⁻¹ H₂ᵀ R⁻¹_β H₂ (J_β/γ_β)⁻¹.  (21)

Proof (Outline). Combining well-known results about the inverse of a partitioned matrix (see for instance [1, Lemma 3]) and the theorems above, we get the expression of Σ⁻¹_β above, where

W_β = J⁻¹_β − J⁻¹_β H_βᵀ R⁻¹_β H_β J⁻¹_β.  (22)

So using (A1), (A2) and the expression of J⁻¹_β (see the proof of Theorem 3.5), we obtain the block components of the matrix W_β as follows:

W_β(1,1) = (J_β/B_β)⁻¹ − (J_β/B_β)⁻¹ λᵀ R⁻¹_β λ (J_β/B_β)⁻¹,
W_β(2,1) = −B⁻¹_β U_β (J_β/B_β)⁻¹ + (J_β/γ_β)⁻¹ H₂ᵀ R⁻¹_β λ (J_β/B_β)⁻¹,
W_β(2,2) = (J_β/γ_β)⁻¹ − (J_β/γ_β)⁻¹ H₂ᵀ R⁻¹_β H₂ (J_β/γ_β)⁻¹.  (23)

The result follows from matrix manipulations. □
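The block structure (19) and (22) is the classical bordered-matrix inverse and can be checked against a brute-force inversion. A sketch of ours, with invented data:

```python
import numpy as np

rng = np.random.default_rng(5)
s, r, theta = 2, 3, 1.1
z   = rng.uniform(0.5, 2.0, (s, r))
phi = rng.dirichlet(np.ones(r), size=s)
n   = np.array([110.0, 95.0])

ip    = (z * phi).sum(axis=1)
alpha = n / (1.0 + theta * ip)
V     = (theta * alpha / n)[:, None] * z
U     = (alpha / theta)[:, None] * V
gamma = float((alpha * ip / (theta * (1.0 + theta * ip))).sum())

p = 1 + s * r
J = np.zeros((p, p))
J[0, 0] = gamma
J[0, 1:] = J[1:, 0] = U.ravel()
for k in range(s):
    Dk = np.diag((1.0 + theta * z[k]) / phi[k])
    J[1 + k * r:1 + (k + 1) * r, 1 + k * r:1 + (k + 1) * r] = \
        alpha[k] * (Dk - np.outer(V[k], V[k]))

H     = np.hstack([np.zeros((s, 1)), np.kron(np.eye(s), np.ones((1, r)))])
Sigma = np.block([[J, H.T], [H, np.zeros((s, s))]])
Sinv  = np.linalg.inv(Sigma)

Jinv = np.linalg.inv(J)
R    = H @ Jinv @ H.T
Rinv = np.linalg.inv(R)
W    = Jinv - Jinv @ H.T @ Rinv @ H @ Jinv            # Eq. (22)

br_ok = np.allclose(Sinv[p:, p:], -Rinv)              # bottom-right block of (19)
tl_ok = np.allclose(Sinv[:p, :p], W)                  # leading block of (19)
tr_ok = np.allclose(Sinv[:p, p:], Jinv @ H.T @ Rinv)  # off-diagonal block of (19)
```

Of course, the point of the theorem is that W_β, via (21), never requires forming these inverses numerically; the sketch only confirms the block identities.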

Lemma 3.8. We have the following results:

(i) −(B⁽⁰⁾_{β,k}/B_{β,k}) = ‖1_r‖²_{B⁻¹_{β,k}} = α_k⁻¹[trace(Δ⁻¹_{β,k}) + t_k θ² α_k² δ²_{β,k}/n_k²],  (24)

(ii) (J_β/B_β) = Σ_{i=1}^s (α_i²/n_i)(⟨z_i, φ_i⟩/θ − (α_i t_i/n_i) ‖z_i‖²_{Δ⁻¹_{β,i}}),  (25)

(iii) −(Λ⁽⁰⁾_β/Λ_β) = ‖λ‖²_{Λ⁻¹_β} = Σ_{i=1}^s α_i² t_i² δ²_{β,i} / (n_i² ‖1_r‖²_{B⁻¹_{β,i}}),  (26)

(iv) (Λ⁽¹⁾_β/Λ_β) = (J_β/B_β) − (Λ⁽⁰⁾_β/Λ_β).  (27)

Remark 1. Lemma 3.8 follows from matrix manipulations, so we leave out the proof. Details are displayed in a recent technical report [15].
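The closed forms (24)-(26) reduce everything to scalars built from per-site quantities, and they can be verified numerically. A sketch of ours, with invented data:

```python
import numpy as np

rng = np.random.default_rng(6)
s, r, theta = 2, 3, 0.6
z   = rng.uniform(0.5, 2.0, (s, r))
phi = rng.dirichlet(np.ones(r), size=s)
n   = np.array([130.0, 85.0])

ip    = (z * phi).sum(axis=1)
alpha = n / (1.0 + theta * ip)
V     = (theta * alpha / n)[:, None] * z
U     = (alpha / theta)[:, None] * V
gamma = float((alpha * ip / (theta * (1.0 + theta * ip))).sum())

Dinv  = [np.diag(phi[k] / (1.0 + theta * z[k])) for k in range(s)]
B     = [alpha[k] * (np.linalg.inv(Dinv[k]) - np.outer(V[k], V[k])) for k in range(s)]
Binv  = [np.linalg.inv(B[k]) for k in range(s)]
t     = np.array([1.0 / (1.0 - V[k] @ Dinv[k] @ V[k]) for k in range(s)])
delta = np.array([z[k] @ Dinv[k] @ np.ones(r) for k in range(s)])
znrm2 = np.array([z[k] @ Dinv[k] @ z[k] for k in range(s)])   # ||z_k||^2

# (24): 1_r^T B_k^{-1} 1_r in closed form
lhs24 = np.array([np.ones(r) @ Binv[k] @ np.ones(r) for k in range(s)])
rhs24 = (np.array([np.trace(Dinv[k]) for k in range(s)])
         + t * theta**2 * alpha**2 * delta**2 / n**2) / alpha

# (25): the scalar Schur complement (J_beta/B_beta)
lhs25 = gamma - sum(U[k] @ Binv[k] @ U[k] for k in range(s))
rhs25 = float(np.sum((alpha**2 / n) * (ip / theta - (alpha * t / n) * znrm2)))

# (26): ||lambda||^2 in the Lambda^{-1} norm
lam   = np.array([np.ones(r) @ Binv[k] @ U[k] for k in range(s)])
lhs26 = float(np.sum(lam**2 / lhs24))
rhs26 = float(np.sum(alpha**2 * t**2 * delta**2 / (n**2 * lhs24)))

ok24 = np.allclose(lhs24, rhs24)
ok25 = np.isclose(lhs25, rhs25)
ok26 = np.isclose(lhs26, rhs26)
```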

4. Application to road safety measure modelling

4.1. Problem formulation and data

We apply the above results to a statistical road accident data modelling when a road safety measure is applied on different target sites. Our model and estimation method stem from those of N'Guessan et al. [13,16]. The latter authors deal with a multidimensional combination of road accident frequencies before and after a similar change (crossroad lay-out, surface of a motorway section, etc.) at s (s > 0) target sites. Each target site counts r (r > 1) accident types (fatal accidents, seriously injured people, slightly injured people, material damage, etc.) and is linked to a specific control area. In order to take some external factors into account (such as traffic flow, speed limit variation, weather conditions, etc.), they suppose that the data set (z_{1k}, z_{2k}, ..., z_{rk})ᵀ (k = 1, 2, ..., s) is known for the specific control areas, where the measure (or change) is not directly applied, but which are linked to the target sites. They built a statistical model to share out the total accident number n_k (k = 1, 2, ..., s) on each site. The unknown vector parameter β of their model depends on the global average effect θ of the change as well as on the vector of accident risks (φ_{1k}, φ_{2k}, ..., φ_{rk})ᵀ of each control area k such that

Σ_{j=1}^r φ_{jk} = 1,  k = 1, 2, ..., s.  (28)

Their method of estimating simultaneously the global effect and the accident risks in control areas, i.e. β, is based on the maximization of the function

L(β) = Cte + Σ_{k=1}^s Σ_{j=1}^r {y_{.jk} log_e(φ_{jk}) + y_{2jk} log_e(θ) − y_{.jk} log_e(1 + θ⟨z_k, φ_k⟩)}  (29)

under assumptions (A1) and (A2), where y_{.jk} = y_{1jk} + y_{2jk} and y_{1jk} (resp. y_{2jk}) stands for the number of accidents of type j on site k before (resp. after) the setting up of the change, and n_k = Σ_{j=1}^r y_{.jk}. The function L(β) is called the log-likelihood, and the method proposed is clearly the restricted maximum likelihood estimation (RMLE) method, an approach that maximizes the logarithm of a likelihood over a restricted space, i.e. the well-known applied mathematical method that maximizes a function subject to restraints. The general statistical framework is well known and is not discussed here (see for instance [1,3,5,6,10, p. 172; 11,12,25,28]).

4.2. Computing information matrix using second partial derivatives

The information matrix Σ_β defined below, related to the function L(β) and assumptions (A1) and (A2), plays an important role in the RMLE method. Not only does it provide the solution for the unknown vector parameter β, but its inverse, if it exists, is also used to compute the covariance matrix of the RMLE estimates. The general statistical framework shows that a central role is played by the inverse, if it exists, of the bordered information matrix

Σ_β = [J_β, H_βᵀ; H_β, 0],  (30)

where the (1 + sr) × (1 + sr) Fisher information matrix J_β uses the second partial derivatives of the negative of the function L(β) with respect to the elements of β and is defined as

J_β = E(−∂²L/∂β∂βᵀ)

with E(·) the usual statistical expectation operator (see the appendix for details). Taking the second partial derivatives of the negative of the function L(β) and setting

E(y_{1jk}) = n_k φ_{jk}/(1 + θ⟨z_k, φ_k⟩),  E(y_{2jk}) = n_k φ_{jk} z_{jk} θ/(1 + θ⟨z_k, φ_k⟩),  (31)

we obtain the general form of the elements of the matrix J_β as follows:

J_β = [γ_β, U_{β,1}ᵀ, U_{β,2}ᵀ, ..., U_{β,s}ᵀ; U_{β,1}, B_{β,1}, 0, ..., 0; U_{β,2}, 0, B_{β,2}, ..., 0; ...; U_{β,s}, 0, ..., 0, B_{β,s}].  (32)
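The expectations (31) and the arrowhead structure (32) can be checked mechanically. In this sketch of ours (toy data), the expected before/after counts rebuild the totals n_k, and J_β is assembled as in (32):

```python
import numpy as np

rng = np.random.default_rng(8)
s, r, theta = 2, 3, 0.9
z   = rng.uniform(0.5, 2.0, (s, r))
phi = rng.dirichlet(np.ones(r), size=s)
n   = np.array([100.0, 150.0])

ip  = (z * phi).sum(axis=1)
Ey1 = n[:, None] * phi / (1.0 + theta * ip)[:, None]              # Eq. (31)
Ey2 = n[:, None] * phi * z * theta / (1.0 + theta * ip)[:, None]  # Eq. (31)
counts_ok = np.allclose((Ey1 + Ey2).sum(axis=1), n)               # totals n_k

alpha = n / (1.0 + theta * ip)
V     = (theta * alpha / n)[:, None] * z
U     = (alpha / theta)[:, None] * V
gamma = float((alpha * ip / (theta * (1.0 + theta * ip))).sum())

J = np.zeros((1 + s * r, 1 + s * r))
J[0, 0] = gamma
J[0, 1:] = J[1:, 0] = U.ravel()
for k in range(s):
    Dk = np.diag((1.0 + theta * z[k]) / phi[k])
    J[1 + k * r:1 + (k + 1) * r, 1 + k * r:1 + (k + 1) * r] = \
        alpha[k] * (Dk - np.outer(V[k], V[k]))

offdiag_ok = np.allclose(J[1:1 + r, 1 + r:], 0.0)   # B_beta is block diagonal
spd_ok     = bool(np.all(np.linalg.eigvalsh(J) > 0))
```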

4.3. Computing the inverse of the information matrix and interpretation

Without loss of generality, we suppose that s = 1 and r = 2, i.e., the number of experimental sites is equal to one and there are two accident types, with n_k the given total number of accidents on the site and z_k = (z_{1k}, z_{2k})ᵀ the given data from the control area. The parameter β is defined by β = (θ, φ_kᵀ)ᵀ ∈ R³ and φ_k = (φ_{1k}, φ_{2k})ᵀ ∈ R² with θ > 0, φ_{jk} > 0, and φ_{1k} + φ_{2k} = 1. So the dimension of the parameter space is three and we have one constraint. We show that the 4 × 4 constrained or bordered Fisher information matrix is

Σ_β = [γ_β, (α_k/θ)v_{1k}, (α_k/θ)v_{2k}, 0; (α_k/θ)v_{1k}, b⁽ᵏ⁾₁₁, b⁽ᵏ⁾₁₂, 1; (α_k/θ)v_{2k}, b⁽ᵏ⁾₂₁, b⁽ᵏ⁾₂₂, 1; 0, 1, 1, 0],

where γ_β = α_k²⟨z_k, φ_k⟩/(θn_k), v_{mk} = α_k θ z_{mk}/n_k, α_k = n_k/(1 + θ⟨z_k, φ_k⟩) and

B_β = B_{β,k} = [b⁽ᵏ⁾₁₁, b⁽ᵏ⁾₁₂; b⁽ᵏ⁾₂₁, b⁽ᵏ⁾₂₂]
= α_k [(1 + θz_{1k})/φ_{1k} − θ²z²_{1k}α²_k/n²_k, −θ²z_{1k}z_{2k}α²_k/n²_k; −θ²z_{1k}z_{2k}α²_k/n²_k, (1 + θz_{2k})/φ_{2k} − θ²z²_{2k}α²_k/n²_k].  (33)

In order to reach the variances (accuracy) and covariances (linear relations) of the components of the subparameter

φ_k = (φ_{1k}, φ_{2k})ᵀ ∈ R²,  φ_{1k} + φ_{2k} = 1,

we must get the elements of the 4 × 4 symmetric matrix Σ⁻¹_β = ((Σ⁻¹_β(i,j))), i, j = 1, 2, 3, 4, and thereafter we take the 2 × 2 element block

cov(φ_k, φ_k) = [Σ⁻¹_β(2,2), Σ⁻¹_β(2,3); Σ⁻¹_β(2,3), Σ⁻¹_β(3,3)],

where Σ⁻¹_β(2,2), Σ⁻¹_β(3,3) and Σ⁻¹_β(2,3) are, respectively, interpreted as the precisions related to the estimation of the parameters φ_{1k}, φ_{2k} and the degree of linear relation between φ_{1k} and φ_{2k}. The method proposed here allows us to avoid doing this inversion and gives the analytical expressions of the elements of the matrix cov(φ_k, φ_k) by using Theorem 3.7. In this case,

J_β = [γ_β, (α_k/θ)V_{β,k}ᵀ; (α_k/θ)V_{β,k}, B_{β,k}],  Δ_{β,k} = [(1 + θz_{1k})/φ_{1k}, 0; 0, (1 + θz_{2k})/φ_{2k}]

and B_{β,k} = α_k[Δ_{β,k} − V_{β,k}V_{β,k}ᵀ], a 2 × 2 matrix, with

V_{β,k} = (θα_k/n_k)(z_{1k}, z_{2k})ᵀ,  H₂ᵀ = (1, 1)ᵀ.

We show that the Schur complement of B_β in J_β is given by

(J_β/B_β) = (α_k²/n_k)(⟨z_k, φ_k⟩/θ − (α_k t_k/n_k) ‖z_k‖²_{Δ⁻¹_{β,k}}),  ‖z_k‖²_{Δ⁻¹_{β,k}} = ψ_{1k}z²_{1k} + ψ_{2k}z²_{2k}

with

ψ_{jk} = φ_{jk}/(1 + θz_{jk}),  j = 1, 2,

the elements of the 2 × 2 diagonal matrix Δ⁻¹_{β,k}, and

‖λ‖²_{Λ⁻¹_β} = α_k² t_k² δ²_{β,k} / (n_k² ‖1_r‖²_{B⁻¹_{β,k}})  with δ_{β,k} = ψ_{1k}z_{1k} + ψ_{2k}z_{2k}.

So using Theorems 3.6 and 3.7 we get

cov(φ_k, φ_k) = J_{k,k} − A_{k,k} J_{k,k} 1_{2,2} J_{k,k},  (34)


a 2 × 2 matrix, where

1_{2,2} = [1, 1; 1, 1],  J_{k,k} = α_k⁻¹ [ψ_{1k} + ω_k ψ²_{1k} z²_{1k}, ω_k ψ_{1k} ψ_{2k} z_{1k} z_{2k}; ω_k ψ_{1k} ψ_{2k} z_{1k} z_{2k}, ψ_{2k} + ω_k ψ²_{2k} z²_{2k}]

with

ω_k = (t_k α_k²/n_k²)[θ² + (J_β/B_β)⁻¹ t_k α_k]  and  A_{k,k} = 1/‖1_r‖²_{B⁻¹_{β,k}} − η²_β t_k² α_k² δ²_{β,k} / (n_k² ‖1_r‖⁴_{B⁻¹_{β,k}})

with

η²_β = [(J_β/B_β) + ‖λ‖²_{Λ⁻¹_β}]⁻¹.

Thereafter, we deduce the explicit expressions of the variances and the covariance of the subparameter φ_k as follows:

Σ⁻¹_β(j + 1, j + 1) = σ²(φ_{jk}) = α_k⁻¹[ψ_{jk} + ψ²_{jk}(ω_k z²_{jk} − A_{k,k} α_k⁻¹ x²_{jk})],  j = 1, 2,

Σ⁻¹_β(2,3) = cov(φ_{1k}, φ_{2k}) = α_k⁻¹ ψ_{1k} ψ_{2k} [ω_k z_{1k} z_{2k} − A_{k,k} α_k⁻¹ x_{1k} x_{2k}],

where x_{jk} = 1 + ω_k δ_{β,k} z_{jk}. The main strength of our approach is its applicability to any dimension of the matrix Σ_β without having to invert it numerically. It saves considerable time, enables great precision in the calculations, and is accessible to everyone: no numerical inversion program is needed, only matrix manipulations (products and additions) are called for.
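The whole worked example can be replayed numerically. In this sketch of ours (the data values are invented), the closed-form variances and covariance above are compared against a brute-force inversion of the 4 × 4 bordered matrix; note that, because of the constraint φ_{1k} + φ_{2k} = 1, the two variances and the negative of the covariance come out equal:

```python
import numpy as np

# Worked example with s = 1, r = 2 (our own numbers).
theta, n_k = 0.7, 150.0
z   = np.array([1.3, 0.6])           # control-area data z_k
phi = np.array([0.4, 0.6])           # phi_1k + phi_2k = 1

ip    = float(z @ phi)
alpha = n_k / (1.0 + theta * ip)
V     = (theta * alpha / n_k) * z
Delta = np.diag((1.0 + theta * z) / phi)
B     = alpha * (Delta - np.outer(V, V))
U     = (alpha / theta) * V
gamma = alpha * ip / (theta * (1.0 + theta * ip))

# Bordered information matrix (border row H = (0, 1, 1)) and direct inverse.
Sigma = np.zeros((4, 4))
Sigma[0, 0] = gamma
Sigma[0, 1:3] = Sigma[1:3, 0] = U
Sigma[1:3, 1:3] = B
Sigma[1:3, 3] = Sigma[3, 1:3] = 1.0
cov_direct = np.linalg.inv(Sigma)[1:3, 1:3]

# Closed-form route (Theorems 3.6 and 3.7, Eq. (34)).
psi   = phi / (1.0 + theta * z)                      # diagonal of Delta^{-1}
t_k   = 1.0 / (1.0 - float(V @ np.diag(psi) @ V))
JB    = (alpha**2 / n_k) * (ip / theta - (alpha * t_k / n_k) * float(psi @ z**2))
delta = float(psi @ z)
norm1 = float(np.ones(2) @ np.linalg.inv(B) @ np.ones(2))
lam2  = alpha**2 * t_k**2 * delta**2 / (n_k**2 * norm1)
eta2  = 1.0 / (JB + lam2)
A_kk  = 1.0 / norm1 - eta2 * t_k**2 * alpha**2 * delta**2 / (n_k**2 * norm1**2)
omega = (t_k * alpha**2 / n_k**2) * (theta**2 + t_k * alpha / JB)
x     = 1.0 + omega * delta * z

var   = (psi + psi**2 * (omega * z**2 - A_kk * x**2 / alpha)) / alpha
cov12 = psi[0] * psi[1] * (omega * z[0] * z[1] - A_kk * x[0] * x[1] / alpha) / alpha
cov_formula = np.array([[var[0], cov12], [cov12, var[1]]])

cov_ok = np.allclose(cov_formula, cov_direct)
```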

Acknowledgements

We would like to thank Prof. Claude Brezinski for turning our attention to the paper on Schur complements, and we are also deeply grateful to the anonymous referee, whose judicious notes and comments enabled a complete remodelling of our paper. Part of this work was carried out while the first author was a guest researcher at the CRT (Research Centre on Transportation) of Montreal University, whose technical and material help we would also like to acknowledge.

Appendix A. Computation of the Fisher information matrix

Differentiating L(β) with respect to each of the (1 + sr) components of β, we obtain the first partial derivatives as follows:

∂L(β)/∂θ = Σ_{k=1}^s Σ_{m=1}^r (y_{2mk} − y_{1mk} θ⟨z_k, φ_k⟩) / (θ(1 + θ⟨z_k, φ_k⟩)),

∂L(β)/∂φ_{mk} = y_{.mk}/φ_{mk} − y_{..k} θ z_{mk}/(1 + θ⟨z_k, φ_k⟩)  (k = 1, 2, ..., s; m = 1, 2, ..., r),


where y_{.mk} = y_{1mk} + y_{2mk} and y_{..k} = Σ_{m=1}^r y_{.mk}. So we deduce the second partial derivatives as follows:

∂²L(β)/∂θ∂θ = Σ_{k=1}^s {−y_{2.k}/θ² + y_{..k}⟨z_k, φ_k⟩²/(1 + θ⟨z_k, φ_k⟩)²},

where y_{2.k} = Σ_{m=1}^r y_{2mk},

∂²L(β)/∂θ∂φ_{mk} = −y_{..k} z_{mk}/(1 + θ⟨z_k, φ_k⟩)²,  k = 1, 2, ..., s, m = 1, 2, ..., r,

∂²L(β)/∂φ_{jp}∂φ_{mk} =
0 if k ≠ p,
y_{..k} z_{mk} z_{jk} θ²/(1 + θ⟨z_k, φ_k⟩)² if k = p and j ≠ m,
−y_{.mk}/φ²_{mk} + y_{..k} θ² z²_{mk}/(1 + θ⟨z_k, φ_k⟩)² if k = p and j = m,

for k, p = 1, 2, ..., s and m, j = 1, 2, ..., r, and thereafter the general structure of the (1 + sr) × (1 + sr) matrix J_β is given by

J_β = [J_{θ,θ}, J_{θ,φ₁}ᵀ, J_{θ,φ₂}ᵀ, ..., J_{θ,φ_s}ᵀ; J_{θ,φ₁}, J_{φ₁,φ₁}, 0, ..., 0; J_{θ,φ₂}, 0, J_{φ₂,φ₂}, ..., 0; ...; J_{θ,φ_s}, 0, ..., 0, J_{φ_s,φ_s}],

where J_{θ,θ} is a scalar given by

J_{θ,θ} = E(−∂²L(β)/∂θ∂θ) = Σ_{k=1}^s {(1/θ²) E(y_{2.k}) − (⟨z_k, φ_k⟩²/(1 + θ⟨z_k, φ_k⟩)²) E(y_{..k})},

J_{θ,φ_k} is an r × 1 vector given by

J_{θ,φ_k} = [E(−∂²L(β)/∂θ∂φ_{1k}), E(−∂²L(β)/∂θ∂φ_{2k}), ..., E(−∂²L(β)/∂θ∂φ_{rk})]ᵀ

with mth element given by

E(−∂²L(β)/∂θ∂φ_{mk}) = (z_{mk}/(1 + θ⟨z_k, φ_k⟩)²) E(y_{..k}),

and J_{φ_k,φ_k} an r × r matrix with (j,m)th elements

E(−∂²L(β)/∂φ_{jk}∂φ_{mk}) =
−(z_{mk} z_{jk} θ²/(1 + θ⟨z_k, φ_k⟩)²) E(y_{..k}) if j ≠ m,
(1/φ²_{mk}) E(y_{.mk}) − (θ² z²_{mk}/(1 + θ⟨z_k, φ_k⟩)²) E(y_{..k}) if j = m.


Now setting

E(y_{1jk}) = n_k φ_{jk}/(1 + θ⟨z_k, φ_k⟩),  E(y_{2jk}) = n_k φ_{jk} z_{jk} θ/(1 + θ⟨z_k, φ_k⟩)

for k = 1, 2, ..., s and m = 1, 2, ..., r, we get J_{θ,θ} = γ_β, J_{θ,φ_k} = U_{β,k} and J_{φ_k,φ_k} = B_{β,k}.

References

[1] J. Aitchison, S.D. Silvey, Maximum likelihood estimation of parameters subject to restraints, Ann. Math. Statist. 29 (1958) 813–829.
[2] C. Brezinski, M.R. Zaglia, A Schur complement approach to a general extrapolation algorithm, Linear Algebra Appl. 368 (2003) 279–301.
[3] N.R. Cook, Restricted maximum likelihood, in: P. Armitage, T. Colton (Eds.), Encyclopedia of Biostatistics, vol. 5, Wiley, 1998, pp. 3827–3830.
[4] H. Cramer, Mathematical Methods of Statistics, Princeton University Press, Princeton, 1946.
[5] M. Crowder, On the constrained maximum likelihood estimation with non i.i.d. observations, Ann. Inst. Statist. Math. 36A (1984) 239–249.
[6] F.J.H. Don, The use of generalized inverses in restricted maximum likelihood, Linear Algebra Appl. 70 (1985) 225–240.
[7] L.A. Escobar, W.Q. Meeker, Fisher information matrices with censoring, truncation, and explanatory variables, Statist. Sinica 8 (1998) 221–237.
[8] Y. Fan, Schur complements and its applications to symmetric nonnegative and Z-matrices, Linear Algebra Appl. 353 (2002) 289–307.
[9] A. Klein, G. Mélard, Computation of the Fisher information matrix for time series models, J. Comput. Appl. Math. 64 (1995) 57–68.
[10] J.R. Magnus, Linear Structures, Charles Griffin and Company Ltd., London, 1988.
[11] G.B. Matthews, N.A.S. Crowther, A maximum likelihood estimation procedure when modelling in terms of constraints, South African Statist. J. 29 (1995) 29–50.
[12] B.E. Neuenschwander, B.D. Flury, A note on Silvey's (1959) Theorem, Statist. Probab. Lett. 36 (1997) 307–317.
[13] A. N'Guessan, On a use of Schur complements for an inverse of a constrained accident data information matrix, vol. 60, no. VI, Pub. IRMA, Lille, 2003.
[14] A. N'Guessan, Constrained covariance matrix estimation in road accident modelling with Schur complements, C.R. Acad. Sci. Paris Ser. I 337 (2003) 219–222.
[15] A. N'Guessan, Constrained estimation of a road safety covariance matrix using Schur complements, Technical Report no. CRT-2003-12, Centre for Research on Transportation, Université de Montréal, 2003, 18pp.
[16] A. N'Guessan, A. Essai, C. Langrand, Estimation multidimensionnelle des contrôles et de l'effet moyen d'une mesure de sécurité routière, Rev. Statist. Appl. XLIX (2) (2001) 83–100.
[17] R.J. Ober, The Fisher information matrix for linear systems, Systems Control Lett. 47 (2002) 221–226.
[18] R.J. Ober, Q. Zou, Z. Lin, Calculation of the Fisher information matrix for multidimensional data sets, IEEE Trans. Signal Process. 51 (10) (2003) 2679–2691.
[19] D.V. Ouellette, Schur complements and statistics, Linear Algebra Appl. 36 (1981) 187–295.
[20] R.L.M. Peeters, B. Hanzon, Symbolic computation of Fisher information matrices for parametrized state-space systems, Automatica 35 (1999) 1059–1071.
[21] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, second ed., Cambridge University Press, Cambridge, 1992.
[22] C.R. Rao, Linear Statistical Inference and its Applications, second ed., Wiley, New York, 1973.
[23] C.R. Rao, H. Yanai, Generalized inverses of partitioned matrices useful in statistical applications, Linear Algebra Appl. 70 (1985) 105–115.
[24] M. Sagae, K. Tanabe, Symbolic Cholesky decomposition of the variance–covariance matrix of the negative multinomial distribution, Statist. Probab. Lett. 15 (1992) 103–108.
[25] S.D. Silvey, The Lagrangian multiplier test, Ann. Math. Statist. 30 (1959) 389–407.
[26] G.P.H. Styan, Schur complements and linear statistical models, in: T. Pukkila, S. Puntanen (Eds.), 1985, pp. 37–75.
[27] K. Tanabe, M. Sagae, An exact Cholesky decomposition and the generalized inverse of the variance–covariance matrix of the multinomial distribution, with applications, J. Roy. Statist. Soc. B 54 (1) (1992) 211–219.
[28] C. van Eeden, Estimation in restricted parameter spaces: some history and some recent developments, CWI Quart. 9 (1,2) (1996) 69–76.
[29] M.A. Woodbury, Inverting modified matrices, Memorandum Report 42, Statistical Research Group, Princeton, NJ, 1950.