Journal of Nonparametric Statistics, Vol. 22, No. 8, November 2010, 937–954

Spearman’s footrule and Gini’s gamma: a review with complements

Christian Genest a,*, Johanna Nešlehová b and Noomen Ben Ghorbal a

a Département de mathématiques et de statistique, Université Laval, 1045, avenue de la Médecine, Québec (Québec), Canada G1V 0A6; b Department of Mathematics and Statistics, McGill University, 805, rue Sherbrooke Ouest, Montréal (Québec), Canada H3A 2K6

(Received 19 February 2009; final version received 16 November 2009)

The scattered literature on Spearman’s footrule and Gini’s gamma is surveyed. The following topics are covered: finite-sample moments and asymptotic distribution under independence; large-sample distribution under arbitrary alternatives; asymptotic relative efficiency for testing independence; consistent asymptotic variance estimation through the jackknife; multivariate generalisations and uses. Complementary results and an extensive bibliography are provided, along with several original illustrations.

Keywords: asymptotic relative efficiency; concordance; copula; Gini’s gamma; jackknife; Spearman’s footrule; Spearman’s rho; ranks; test of independence

1. Introduction

Spearman’s footrule is a nonparametric measure of association. It was introduced by the British psychologist Charles Spearman as an alternative to the correlation in the pairs $(R_1, S_1), \ldots, (R_n, S_n)$ of ranks associated with a random sample $(X_1, Y_1), \ldots, (X_n, Y_n)$ from some continuous bivariate distribution $H(x, y) = \Pr(X \le x, Y \le y)$.

Spearman’s footrule usually refers to the statistic

$$\phi_n = 1 - \frac{3}{n^2 - 1} \sum_{i=1}^{n} |R_i - S_i|, \tag{1}$$

although other normalisations have been used, even by Spearman himself (cf. Spearman 1904, 1906; Dinneen and Blakesley 1971). This coefficient is closely related to the indice de cograduazione semplice introduced by the Italian statistician, demographer and sociologist Corrado Gini

*Corresponding author. Email: [email protected]

ISSN 1048-5252 print/ISSN 1029-0311 online. © American Statistical Association and Taylor & Francis 2010. DOI: 10.1080/10485250903499667. http://www.informaworld.com

Downloaded By: [Canadian Research Knowledge Network] At: 15:25 7 January 2011


(1914), viz.

$$\gamma_n = \frac{1}{\lfloor n^2/2 \rfloor} \sum_{i=1}^{n} \{ |(n + 1 - R_i) - S_i| - |R_i - S_i| \}, \tag{2}$$

where $\lfloor m \rfloor$ denotes the integer part of arbitrary $m > 0$.

Spearman’s footrule and Gini’s gamma remained largely neglected until fairly recently. In the fourth edition of his book on rank correlation methods, Kendall (1970) discussed the footrule as a nonparametric measure of association but dismissed it because of a lack of statistical properties. Prior to 1980, the main sources of information on Gini’s gamma were in Italian (Savorgnan 1915; Salvemini 1951; Amato 1954; Cucconi 1964).
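As an illustration of Equations (1) and (2), the following minimal Python sketch computes both coefficients from a small sample without ties. The function names `ranks`, `footrule` and `gini_gamma` and the data are hypothetical choices made for this example only; exact rational arithmetic is used so that the values can be compared with hand calculations.

```python
from fractions import Fraction

def ranks(values):
    """Ranks 1..n of a sequence without ties (continuous data assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, index in enumerate(order, start=1):
        r[index] = position
    return r

def footrule(r, s):
    """Spearman's footrule, Equation (1); requires n >= 2."""
    n = len(r)
    total = sum(abs(a - b) for a, b in zip(r, s))
    return 1 - Fraction(3 * total, n * n - 1)

def gini_gamma(r, s):
    """Gini's gamma, Equation (2); requires n >= 2."""
    n = len(r)
    total = sum(abs(n + 1 - a - b) - abs(a - b) for a, b in zip(r, s))
    return Fraction(total, (n * n) // 2)

# Illustrative data (not taken from the paper).
x = [2.1, 0.4, 3.3, 1.7, 5.0]
y = [1.0, 0.2, 2.5, 3.1, 4.4]
r, s = ranks(x), ranks(y)
print(footrule(r, s), gini_gamma(r, s))
```

Both functions take the two rank vectors rather than the raw data, which mirrors the fact that the coefficients depend on the sample only through its ranks.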

Interest in Spearman’s footrule was apparently revived by Diaconis and Graham (1977), who highlighted its natural interpretation in terms of the Manhattan (or city-block) distance between two sets of ranks. They derived its asymptotic distribution under independence and noted that in small samples, it is less variable than Spearman’s rho, which is based on the Euclidean metric. Extensions have since been proposed to handle data that are incomplete (Alvo and Charbonneau 1997), multivariate (Úbeda-Flores 2005), and censored (Sen, Salama, and Quade 2003; Salama and Quade 2004; Quade and Salama 2006).

Because of its simplicity, robustness and natural interpretation, the footrule has since been rediscovered and used in various contexts. For instance, motivated by litigation about a scoring procedure for civil service examinations, Berman (1996) proposed the statistic $M_n = \sum (R_i - S_i)\,\mathbf{1}(R_i > S_i)$ as a measure of ‘unfairness’ when the results of an exam leading to ranks $R_1, \ldots, R_n$ are replaced by scores leading to ranks $S_1, \ldots, S_n$. However, Berman did not notice that $M_n = (n^2 - 1)(1 - \phi_n)/6$.
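The identity between Berman’s statistic and the footrule follows from the fact that the differences $R_i - S_i$ sum to zero, so that their positive parts add up to half of $\sum |R_i - S_i|$. The short Python check below confirms it on random permutations; the helper names `footrule` and `berman_m` are hypothetical and introduced only for this sketch.

```python
import random
from fractions import Fraction

def footrule(r, s):
    """Spearman's footrule, Equation (1)."""
    n = len(r)
    total = sum(abs(a - b) for a, b in zip(r, s))
    return 1 - Fraction(3 * total, n * n - 1)

def berman_m(r, s):
    """Berman's statistic: sum of the positive rank differences."""
    return sum(a - b for a, b in zip(r, s) if a > b)

random.seed(0)
n = 9
r = list(range(1, n + 1))
checks = []
for _ in range(5):
    s = random.sample(r, n)  # a random permutation of the ranks
    lhs = berman_m(r, s)
    rhs = Fraction(n * n - 1, 6) * (1 - footrule(r, s))
    checks.append(lhs == rhs)
print(all(checks))
```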

In the field of genomics, a simple function of $\phi_n$ was advocated a few years ago by Kim, Rha, Cho, and Chung (2004) to measure reproducibility among replicates in microarray experiments, which are likely to produce outliers due to a low signal-to-noise ratio. In the field of information retrieval, Spearman’s footrule distance has also been used to measure the discrepancy between rank lists (Fagin, Kumar, and Sivakumar 2003; Mikki in press). The same idea was used very recently in gene expression profiling and in bioinformatics by Iorio, Tagliaferri, and di Bernardo (2009) and Lin and Ding (2009), respectively.

In comparison with Spearman’s footrule, Gini’s gamma seems to be used rather rarely in practice. This may well change in the years to come, however, as a strong connection between the two coefficients was recently uncovered by Nelsen and Úbeda-Flores (2004). They observed that $\gamma_n$ is in fact an extension of $\phi_n$ which Salama and Quade (2001) introduced to remedy its asymmetry, already noted by Spearman (1904).

In support of these recent developments, this paper aims to consolidate the knowledge base on Spearman’s footrule and Gini’s gamma. The scattered literature on the subject is collected and organised in a structured way using the theory of copulas as a unifying framework. This leads to several new results, proofs and illustrations.

Section 2 reviews basic properties and relations between Spearman’s footrule and Gini’s gamma. Sections 3 and 4 describe their distributions under independence and under general alternatives, respectively. Section 5 collects results on tests of independence based on the two coefficients.

A jackknife procedure is detailed in Section 6 for the estimation of the statistics’ asymptotic variance under any dependence structure. Finally, multivariate extensions of $\phi_n$ and $\gamma_n$ are considered in Section 7, and their sampling properties are studied in Section 8. Practical recommendations are summarised in the Conclusion.

Various Appendices contain the technical arguments, including new, simpler proofs of known results based on the asymptotic behaviour of the empirical copula process.


2. Definitions and basic properties

It is clear that the statistic $\phi_n$ defined in Equation (1) equals 1 when $R_i = S_i$ for all $i \in \{1, \ldots, n\}$. It takes its smallest value when the two sets of ranks are antithetic, that is, when $R_i = n + 1 - S_i$ for every $i \in \{1, \ldots, n\}$. A simple calculation shows that

$$\sum_{i=1}^{n} |n + 1 - 2i| = \begin{cases} \dfrac{n^2}{2}, & \text{when } n \text{ is even}, \\[1ex] \dfrac{n^2 - 1}{2}, & \text{when } n \text{ is odd}. \end{cases}$$

Therefore, $\phi_n$ varies in $[-1/2, 1]$ when $n$ is odd but it can go as low as $-(n^2 + 2)/\{2(n^2 - 1)\} \in [-1, -1/2)$ for $n$ even.
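The stated range can be confirmed by brute force for small samples. The sketch below enumerates all permutations of the ranks for $n = 2, \ldots, 7$ and compares the extreme values of $\phi_n$ with the bounds given above; `footrule` and `footrule_range` are hypothetical helper names used only here.

```python
from fractions import Fraction
from itertools import permutations

def footrule(r, s):
    """Spearman's footrule, Equation (1)."""
    n = len(r)
    total = sum(abs(a - b) for a, b in zip(r, s))
    return 1 - Fraction(3 * total, n * n - 1)

def footrule_range(n):
    """Smallest and largest value of the footrule over all n! rank pairings."""
    r = tuple(range(1, n + 1))
    values = [footrule(r, s) for s in permutations(r)]
    return min(values), max(values)

def lower_bound(n):
    """Lower bound stated in the text for odd and even n."""
    if n % 2 == 1:
        return Fraction(-1, 2)
    return -Fraction(n * n + 2, 2 * (n * n - 1))

results = {n: footrule_range(n) for n in range(2, 8)}
for n, (lo, hi) in results.items():
    print(n, lo, hi)
```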

In order to span the entire interval $[-1, 1]$, one can replace $3/(n^2 - 1)$ by $2/\lfloor n^2/2 \rfloor$ in Equation (1). Even if this is done, the statistic $\phi_n$ may still be regarded as unsatisfactory in some applications. This is because it generally assigns different degrees of dependence (in absolute value) to the samples $(X_1, Y_1), \ldots, (X_n, Y_n)$ and $(-X_1, Y_1), \ldots, (-X_n, Y_n)$. For example, if $(X_1, Y_1) = (10, 20)$, $(X_2, Y_2) = (20, 30)$ and $(X_3, Y_3) = (30, 10)$, then the rescaled $\phi_n$ equals $-1$, while it equals 0 for the sample $(-10, 20)$, $(-20, 30)$ and $(-30, 10)$.

As explained by Salama and Quade (2001), this problem can be solved by making $\phi_n$ symmetric with respect to the rank transformation $R \mapsto n + 1 - R$. Nelsen and Úbeda-Flores (2004) pointed out that the resulting coefficient is the right-hand side of Equation (2), that is, Gini’s $\gamma_n$.
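The three-point example can be reproduced directly. In the sketch below, `rescaled_footrule` uses the normalisation $2/\lfloor n^2/2 \rfloor$ mentioned in the text, and `gini_gamma` implements Equation (2); all function names are hypothetical. The output illustrates that the rescaled footrule changes in absolute value when $X$ is replaced by $-X$, whereas Gini’s gamma only changes sign.

```python
from fractions import Fraction

def ranks(values):
    """Ranks 1..n of a sequence without ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, index in enumerate(order, start=1):
        r[index] = position
    return r

def rescaled_footrule(r, s):
    """Footrule with 3/(n^2 - 1) replaced by 2/floor(n^2/2)."""
    n = len(r)
    total = sum(abs(a - b) for a, b in zip(r, s))
    return 1 - Fraction(2 * total, (n * n) // 2)

def gini_gamma(r, s):
    """Gini's gamma, Equation (2)."""
    n = len(r)
    total = sum(abs(n + 1 - a - b) - abs(a - b) for a, b in zip(r, s))
    return Fraction(total, (n * n) // 2)

def both(sample):
    """Rescaled footrule and Gini's gamma of a list of (x, y) pairs."""
    r = ranks([x for x, _ in sample])
    s = ranks([y for _, y in sample])
    return rescaled_footrule(r, s), gini_gamma(r, s)

sample = [(10, 20), (20, 30), (30, 10)]
flipped = [(-x, y) for x, y in sample]
print(both(sample))
print(both(flipped))
```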

Many properties of $\phi_n$ and $\gamma_n$ stem from their representation as linear rank statistics. From the identity $|u - v| = u + v - 2\min(u, v)$ valid for all $u, v \in \mathbb{R}$, one gets

$$\phi_n = \frac{1}{n - 1} \sum_{i=1}^{n} J_\phi\left( \frac{R_i}{n + 1}, \frac{S_i}{n + 1} \right) - \frac{2n + 1}{n - 1}, \tag{3}$$

where $J_\phi(u, v) = 6 \min(u, v)$. Similarly, one can use the identity $|(n + 1) - u - v| = 2 \max\{0, u + v - (n + 1)\} - u - v + (n + 1)$ to see that

$$\gamma_n = \frac{n + 1}{2\lfloor n^2/2 \rfloor} \sum_{i=1}^{n} J_\gamma\left( \frac{R_i}{n + 1}, \frac{S_i}{n + 1} \right) - \frac{n(n + 1)}{\lfloor n^2/2 \rfloor}, \tag{4}$$

where $J_\gamma(u, v) = 4 \min(u, v) + 4 \max(0, u + v - 1)$.
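The equivalence between the defining formulas (1) and (2) and the linear rank representations (3) and (4) can be checked exactly on random permutations. The sketch below does so with rational arithmetic; the function names are hypothetical and the sample sizes are arbitrary.

```python
import random
from fractions import Fraction

def footrule(r, s):
    """Equation (1)."""
    n = len(r)
    return 1 - Fraction(3 * sum(abs(a - b) for a, b in zip(r, s)), n * n - 1)

def gini_gamma(r, s):
    """Equation (2)."""
    n = len(r)
    total = sum(abs(n + 1 - a - b) - abs(a - b) for a, b in zip(r, s))
    return Fraction(total, (n * n) // 2)

def j_phi(u, v):
    return 6 * min(u, v)

def j_gamma(u, v):
    return 4 * min(u, v) + 4 * max(0, u + v - 1)

def footrule_linear(r, s):
    """Equation (3): footrule as a linear rank statistic."""
    n = len(r)
    total = sum(j_phi(Fraction(a, n + 1), Fraction(b, n + 1)) for a, b in zip(r, s))
    return total / (n - 1) - Fraction(2 * n + 1, n - 1)

def gamma_linear(r, s):
    """Equation (4): Gini's gamma as a linear rank statistic."""
    n = len(r)
    half = (n * n) // 2
    total = sum(j_gamma(Fraction(a, n + 1), Fraction(b, n + 1)) for a, b in zip(r, s))
    return Fraction(n + 1, 2 * half) * total - Fraction(n * (n + 1), half)

random.seed(3)
agree = []
for n in (6, 7):
    r = list(range(1, n + 1))
    for _ in range(20):
        s = random.sample(r, n)
        agree.append(footrule(r, s) == footrule_linear(r, s))
        agree.append(gini_gamma(r, s) == gamma_linear(r, s))
print(all(agree))
```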

3. Distribution under independence

The behaviour of $\phi_n$, $\gamma_n$ and variants thereof has been extensively studied under the assumption that the variables $X$ and $Y$ are independent. From results which Spearman (1904) attributed to Felix Hausdorff (see, for example, Kleinecke, Ury, and Wagner 1962, for a derivation), one gets

$$E(\phi_n) = 0 \quad \text{and} \quad \mathrm{var}(\phi_n) = \frac{2n^2 + 7}{5(n + 1)(n - 1)^2}.$$

Tables of the null distribution of $\phi_n$ were produced by Ury and Kleinecke (1979). They were later expanded by Franklin (1988) and Salama and Quade (1990); see also Salama and Quade (2002). Diaconis and Graham (1977) were apparently the first to show that under independence,

$$n^{1/2} \phi_n \rightsquigarrow \mathcal{N}(0, 2/5),$$

where $\rightsquigarrow$ denotes convergence in distribution as $n \to \infty$. See Sen and Salama (1983) for an alternative proof.
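Under independence, all $n!$ pairings of the two rank vectors are equally likely, so the finite-sample moments of $\phi_n$ can be verified by full enumeration for small $n$. The sketch below does this and also shows that $n\,\mathrm{var}(\phi_n)$ approaches the asymptotic variance $2/5$; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def footrule(r, s):
    """Spearman's footrule, Equation (1)."""
    n = len(r)
    return 1 - Fraction(3 * sum(abs(a - b) for a, b in zip(r, s)), n * n - 1)

def null_moments(n):
    """Exact mean and variance of the footrule over all n! permutations."""
    r = tuple(range(1, n + 1))
    values = [footrule(r, s) for s in permutations(r)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def var_formula(n):
    """Null variance of the footrule as stated in the text."""
    return Fraction(2 * n * n + 7, 5 * (n + 1) * (n - 1) ** 2)

exact = {n: null_moments(n) for n in range(2, 8)}
for n, (mean, var) in exact.items():
    print(n, mean, var, var == var_formula(n))

# The scaled variance n * var(phi_n) tends to 2/5.
print(float(1000 * var_formula(1000)))
```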

For Gini’s gamma, Amato (1954) and Cucconi (1964) obtained independently

$$E(\gamma_n) = 0 \quad \text{and} \quad \mathrm{var}(\gamma_n) = \begin{cases} \dfrac{2(n^2 + 2)}{3(n - 1)n^2}, & \text{when } n \text{ is even}, \\[1ex] \dfrac{2(n^2 + 3)}{3(n - 1)(n^2 - 1)}, & \text{when } n \text{ is odd}. \end{cases}$$

A third derivation was provided by Salama and Quade (2001), but note the typo in their final formula for $n$ even.
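The two-case variance formula for $\gamma_n$ can be checked in the same way, by enumerating all permutations for small even and odd $n$. The following sketch uses hypothetical helper names and exact fractions.

```python
from fractions import Fraction
from itertools import permutations

def gini_gamma(r, s):
    """Gini's gamma, Equation (2)."""
    n = len(r)
    total = sum(abs(n + 1 - a - b) - abs(a - b) for a, b in zip(r, s))
    return Fraction(total, (n * n) // 2)

def null_moments(n):
    """Exact mean and variance of Gini's gamma over all n! permutations."""
    r = tuple(range(1, n + 1))
    values = [gini_gamma(r, s) for s in permutations(r)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def var_formula(n):
    """Null variance of Gini's gamma as stated in the text."""
    if n % 2 == 0:
        return Fraction(2 * (n * n + 2), 3 * (n - 1) * n * n)
    return Fraction(2 * (n * n + 3), 3 * (n - 1) * (n * n - 1))

exact = {n: null_moments(n) for n in range(2, 8)}
for n, (mean, var) in exact.items():
    print(n, mean, var, var == var_formula(n))
```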

The exact null distribution of Gini’s gamma was given by Savorgnan (1915) for $n \le 5$; these tables were later extended by Salvemini (1951) and Cifarelli and Regazzini (1977). In addition, Rizzi (1971) used simulations to approximate the null distribution of $\gamma_n$ up to $n = 30$. Betrò (1993) later showed how the exact distribution can be derived numerically. Other approximations were designed by Landenna and Scagni (1989), and by Vittadini (1991). It was suspected for a long time (Salvemini 1951; Amato 1954; Cucconi 1964; Herzel 1972) that under independence,

$$n^{1/2} \gamma_n \rightsquigarrow \mathcal{N}(0, 2/3)$$

as $n \to \infty$. This was eventually proved by Cifarelli and Regazzini (1977).

4. Distribution in the case of dependence

The asymptotic distribution of Gini’s gamma was given by Cifarelli, Conti, and Regazzini (1996) in the general case where the pair $(X, Y)$ has a bivariate distribution $H(x, y) = \Pr(X \le x, Y \le y)$ with continuous margins $F(x) = \Pr(X \le x)$ and $G(y) = \Pr(Y \le y)$. The parallel result for Spearman’s footrule is reported below, seemingly for the first time.

As it turns out, the large-sample distributions of $\phi_n$ and $\gamma_n$ depend on $H$ only through the function $C$ implicitly defined by

$$H(x, y) = \Pr(X \le x, Y \le y) = C\{F(x), G(y)\}$$

for all $x, y \in \mathbb{R}$. The so-called copula $C$, which is unique, is a bivariate distribution function with uniform margins on the interval $(0, 1)$ (Nelsen 2006, Chap. 2).

The following proposition, whose proof is in Appendix 1, shows that $\phi_n$ and $\gamma_n$ are asymptotically unbiased estimators of

$$\phi_C = 1 - 3 \int_{(0,1)^2} |u - v| \, dC(u, v) = -2 + 6 \int_0^1 C(t, t) \, dt$$

and

$$\gamma_C = 2 \int_{(0,1)^2} \{ |u + v - 1| - |u - v| \} \, dC(u, v) = -2 + 4 \int_0^1 \{ C(t, t) + C(t, 1 - t) \} \, dt,$$

respectively. From these definitions, reported by Nelsen (1998), it is clear that $\phi_C$ and $\gamma_C$ depend only on the copula’s main and secondary diagonal sections, defined for all $t \in [0, 1]$ by $C(t, t)$ and $C(t, 1 - t)$, respectively.
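Because $\phi_C$ and $\gamma_C$ only involve one-dimensional integrals of the diagonal sections, they are easy to approximate numerically. The sketch below uses a composite midpoint rule (a choice made for this illustration, not a method from the paper) and evaluates both parameters at the independence copula and at the Fréchet–Hoeffding bounds $\min(u, v)$ and $\max(0, u + v - 1)$. The function names are hypothetical.

```python
def integrate(f, steps=20000):
    """Composite midpoint rule on (0, 1); avoids evaluating f at the endpoints."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def phi_C(copula):
    """Population footrule: -2 + 6 * integral of C(t, t)."""
    return -2 + 6 * integrate(lambda t: copula(t, t))

def gamma_C(copula):
    """Population Gini's gamma: -2 + 4 * integral of C(t, t) + C(t, 1 - t)."""
    return -2 + 4 * integrate(lambda t: copula(t, t) + copula(t, 1 - t))

def independence(u, v):
    return u * v

def upper_bound(u, v):
    return min(u, v)

def lower_bound(u, v):
    return max(0.0, u + v - 1)

for name, copula in [("independence", independence),
                     ("upper bound", upper_bound),
                     ("lower bound", lower_bound)]:
    print(name, round(phi_C(copula), 6), round(gamma_C(copula), 6))
```

The values at the lower bound, $-1/2$ for the footrule and $-1$ for Gini’s gamma, match the finite-sample ranges discussed in Section 2.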


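Proposition 1 Suppose that a bivariate copula $C$ admits continuous partial derivatives $C_1(u, v) = \partial C(u, v)/\partial u$ and $C_2(u, v) = \partial C(u, v)/\partial v$ on $(0, 1)^2$. Then as $n \to \infty$,

$$n^{1/2}(\phi_n - \phi_C) \rightsquigarrow \mathcal{N}(0, \sigma^2_{\phi_C}), \qquad n^{1/2}(\gamma_n - \gamma_C) \rightsquigarrow \mathcal{N}(0, \sigma^2_{\gamma_C}),$$

with $\sigma^2_{\phi_C}$ and $\sigma^2_{\gamma_C}$ defined in Equations (A1) and (A2), respectively.

When $C(u, v) = \Pi(u, v) \equiv uv$ is the independence copula, one gets $\sigma^2_{\phi_C} = 2/5$ and $\sigma^2_{\gamma_C} = 2/3$. Additional examples of explicit calculations are given below.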

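Example 1 Let $C(u, v) = uv + \theta uv(1 - u)(1 - v)$ be the Farlie–Gumbel–Morgenstern copula with parameter $\theta \in [-1, 1]$. Routine calculations yield $\phi_C = \theta/5$, $\gamma_C = 4\theta/15$,

$$\sigma^2_{\phi_C} = \frac{2}{5} + \frac{3}{70}\theta - \frac{11}{150}\theta^2 \quad \text{and} \quad \sigma^2_{\gamma_C} = \frac{2}{3} - \frac{88}{675}\theta^2.$$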

Example 2 Given $\theta \in [-1, 1]$, let $A(v) = \theta \sin(2\pi v)/(2\pi)$ and $C(u, v) = uv + u(1 - u)A(v)$ for all $u, v \in (0, 1)$. These are examples of copulas with quadratic sections, as defined by Quesada-Molina and Rodríguez-Lallena (1995). Interestingly, $\phi_C = \gamma_C = 0$ for all $\theta \in [-1, 1]$. This is in fact the case for any measure of concordance à la Scarsini (1984), because $C(u, v) + C(u, 1 - v) = u$ for all $u, v \in (0, 1)$, so that all members of this family are ‘indifferent,’ in the sense given to that term by Gini (Conti 1994). With the help of Maple, one finds

$$\sigma^2_{\phi_C} = \frac{1080\theta^2 + 72\theta^2\pi^4 + 225\theta^2\pi^2 + 64\pi^6}{160\pi^6}$$

and

$$\sigma^2_{\gamma_C} = \frac{330\theta^2 + 24\theta^2\pi^4 + 95\theta^2\pi^2 + 20\pi^6}{30\pi^6}.$$
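The population values quoted in Examples 1 and 2 can be recovered numerically from the diagonal-section formulas for $\phi_C$ and $\gamma_C$. The sketch below relies on a simple midpoint rule, which is an implementation choice for this illustration; `fgm` and `quadratic_section` are hypothetical names for the two copula families.

```python
import math

def integrate(f, steps=20000):
    """Composite midpoint rule on (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def phi_C(copula):
    return -2 + 6 * integrate(lambda t: copula(t, t))

def gamma_C(copula):
    return -2 + 4 * integrate(lambda t: copula(t, t) + copula(t, 1 - t))

def fgm(theta):
    """Farlie-Gumbel-Morgenstern copula of Example 1."""
    return lambda u, v: u * v + theta * u * v * (1 - u) * (1 - v)

def quadratic_section(theta):
    """Copula with quadratic sections of Example 2."""
    def copula(u, v):
        return u * v + u * (1 - u) * theta * math.sin(2 * math.pi * v) / (2 * math.pi)
    return copula

theta = 0.5
print(phi_C(fgm(theta)), theta / 5)
print(gamma_C(fgm(theta)), 4 * theta / 15)
print(phi_C(quadratic_section(theta)), gamma_C(quadratic_section(theta)))
```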

5. Asymptotic relative efficiency

Spearman’s footrule and Gini’s gamma are natural statistics for testing independence. Cifarelli and Regazzini (1977) compared the merits of the test based on $\gamma_n$ in terms of Pitman’s asymptotic relative efficiency in a Gaussian model. More recently, Conti and Nikitin (1999a) computed the local Bahadur efficiency of $\phi_n$ and $\gamma_n$ for a large class of alternatives. As the test statistics are rank-based, the calculations rely only on the dependence structure under the alternative, that is, the copula.

In their work, Conti and Nikitin (1999a) considered copula alternatives defined for each $u, v \in (0, 1)$ by $C_\theta(u, v) = uv + \theta\,\Omega(u, v)$, where $\theta \ge 0$ and $\Omega$ is a non-negative function whose mixed partial derivative satisfies mild conditions. They showed that for such alternatives, Bahadur’s and Pitman’s efficiencies coincide.

Using the results of Genest, Quessy, and Rémillard (2006), one can extend these comparisons to other copula families $(C_\theta)$ in which independence occurs when, say, $\theta = 0$. Indeed, note that by Equations (3) and (4), $\phi_n$ and $\gamma_n$ are asymptotically equivalent to statistics of the form

$$S_n^J = \frac{1}{n} \sum_{i=1}^{n} J\left( \frac{R_i}{n + 1}, \frac{S_i}{n + 1} \right) - \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} J\left( \frac{i}{n + 1}, \frac{j}{n + 1} \right). \tag{5}$$

Here, $J = J_\phi$ and $J = J_\gamma$, respectively. Many classical nonparametric tests of independence are based on statistics of the form (5) for some score function $J$.


Given right-continuous, square-integrable, quasi-monotone score functions $J_1$ and $J_2$, it is shown by Genest et al. (2006) that Pitman’s asymptotic relative efficiency (ARE) equals

$$\mathrm{ARE}(S_n^{J_1}, S_n^{J_2}) = \left( \frac{\mu_{J_1}/\sigma_{J_1}}{\mu_{J_2}/\sigma_{J_2}} \right)^2,$$

provided that the family $(C_\theta)$ of copula alternatives meets mild regularity conditions concerning mainly the existence and properties of the function $\dot{C}_0$, defined as $\partial C_\theta(u, v)/\partial\theta$ evaluated at $\theta = 0$. Here, $\mu_{J_i}$ is the derivative with respect to $\theta$ of the asymptotic mean of $S_n^{J_i}$ under $C_\theta$, evaluated at $\theta = 0$, that is,

$$\mu_{J_i} = \int_{(0,1)^2} \dot{C}_0(u, v) \, dJ_i(u, v).$$

Furthermore, $\sigma^2_{J_i}$ stands for the asymptotic variance of $S_n^{J_i}$ at independence.

Given below are applications of this result when $J_1, J_2 \in \{J_\phi, J_\gamma, J_\rho\}$ with $J_\rho(u, v) = 12uv$ for all $u, v \in (0, 1)$, which corresponds to Spearman’s rho.

Example 3 If $C_\theta$ is the Gaussian copula and $\Phi$ denotes the cumulative distribution function of a $\mathcal{N}(0, 1)$ random variable, one finds $\dot{C}_0(u, v) = \Phi'\{\Phi^{-1}(u)\}\,\Phi'\{\Phi^{-1}(v)\}$ for all $u, v \in (0, 1)$. Thus, $\mu_{J_\phi} = \sqrt{3}/\pi$, $\mu_{J_\gamma} = 4/(\sqrt{3}\,\pi)$ and $\mu_{J_\rho} = 3/\pi$. Hence,

$$\mathrm{ARE}(S_n^{J_\phi}, S_n^{J_\rho}) = \frac{5}{6} \approx 0.83 \quad \text{and} \quad \mathrm{ARE}(S_n^{J_\gamma}, S_n^{J_\rho}) = \frac{8}{9} \approx 0.89.$$

These calculations are in accordance with the findings of Cifarelli and Regazzini (1977). For this class of alternatives, both Spearman’s footrule and Gini’s gamma are less efficient than Spearman’s rho. The Pitman efficiency of the latter is $9/\pi^2 \approx 0.91$ when compared with the van der Waerden statistic, which is locally optimal for such alternatives (Genest and Verret 2005).
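The Gaussian efficiencies can be reproduced numerically. The sketch below rests on reductions that are derived here rather than quoted from the paper: since $J_\phi$ puts mass 6 along the main diagonal, $J_\gamma$ puts mass 4 along each diagonal, and $J_\rho$ has constant density 12, the slopes reduce to $\mu_{J_\phi} = 6\int_0^1 \dot{C}_0(t, t)\,dt$, $\mu_{J_\gamma} = 4\int_0^1 \{\dot{C}_0(t, t) + \dot{C}_0(t, 1 - t)\}\,dt$ and $\mu_{J_\rho} = 12\int\!\!\int \dot{C}_0$. The asymptotic null variances $2/5$, $2/3$ and $1$ are those stated in the text. All function names are hypothetical, and the midpoint rule is an implementation choice.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def integrate(f, steps=20000):
    """Composite midpoint rule on (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def g(u):
    """Standard normal density evaluated at the u-quantile."""
    return STD.pdf(STD.inv_cdf(u))

# For the Gaussian copula, the derivative at theta = 0 is g(u) * g(v).
mu_phi = 6 * integrate(lambda t: g(t) * g(t))
mu_gamma = 4 * integrate(lambda t: g(t) * g(t) + g(t) * g(1 - t))
mu_rho = 12 * integrate(g) ** 2  # the double integral factorises

def are(mu1, var1, mu2, var2):
    """Pitman ARE of test 1 relative to test 2."""
    return (mu1 ** 2 / var1) / (mu2 ** 2 / var2)

are_phi_rho = are(mu_phi, 2 / 5, mu_rho, 1.0)
are_gamma_rho = are(mu_gamma, 2 / 3, mu_rho, 1.0)
print(mu_phi, math.sqrt(3) / math.pi)
print(are_phi_rho, are_gamma_rho)
```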

Example 4 Suppose that the family $(C_\theta)$ is such that for all $u, v \in (0, 1)$, $\dot{C}_0(u, v) = kuv(u^m - 1)(v^m - 1)$ for some $k > 0$ and $m \ge 1$. The Farlie–Gumbel–Morgenstern, Dąbrowska, Plackett and Frank families of copulas fall in this category when $m = 1$. The alternatives of Woodworth (1970) illustrate the case $m > 1$. Simple calculations yield

$$\mu_{J_\phi} = \frac{4km^2}{(m + 3)(2m + 3)} \quad \text{and} \quad \mu_{J_\rho} = \frac{3km^2}{(2 + m)^2}.$$

A complex but explicit expression is also available for $\mu_{J_\gamma}$; it reduces to $4k/15$ if $m = 1$. Using $\sigma_\rho = 1$, one recovers the results of Conti and Nikitin (1999a) for $m = 1$, viz.

$$\mathrm{ARE}(S_n^{J_\phi}, S_n^{J_\rho}) = \frac{9}{10} = 0.90 \quad \text{and} \quad \mathrm{ARE}(S_n^{J_\gamma}, S_n^{J_\rho}) = \frac{24}{25} = 0.96.$$

Spearman’s footrule and Gini’s gamma are thus somewhat less efficient than Spearman’s rho, which is the locally optimal test statistic for this class of models (Genest and Verret 2005). As shown in Figure 1, however, $\phi_n$ eventually becomes more efficient than $\rho_n$ as $m \to \infty$, while $\gamma_n$ gradually loses ground when $m \ge 2$. In fact,

$$\lim_{m \to \infty} \mathrm{ARE}(S_n^{J_\phi}, S_n^{J_\rho}) = \frac{10}{9} \approx 1.11 \quad \text{and} \quad \lim_{m \to \infty} \mathrm{ARE}(S_n^{J_\gamma}, S_n^{J_\rho}) = \frac{2}{3} \approx 0.67.$$
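The dependence of the efficiencies on $m$ can be explored with a few lines of code. In the sketch below, the efficiency of the footrule uses the closed forms for $\mu_{J_\phi}$ and $\mu_{J_\rho}$ given in Example 4, with exact fractions. For Gini’s gamma, no closed form is reproduced in the text, so the slope is approximated by a midpoint rule applied to $4\int_0^1 \{\dot{C}_0(t, t) + \dot{C}_0(t, 1 - t)\}\,dt$; this reduction is an assumption derived from the form of $J_\gamma$. The constant $k$ cancels in every ratio and is set to 1. Function names are hypothetical.

```python
from fractions import Fraction

def integrate(f, steps=20000):
    """Composite midpoint rule on (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def are_phi_rho(m):
    """ARE of the footrule versus Spearman's rho, from the closed forms (k = 1)."""
    m = Fraction(m)
    mu_phi = 4 * m * m / ((m + 3) * (2 * m + 3))
    mu_rho = 3 * m * m / (2 + m) ** 2
    return (mu_phi ** 2 / Fraction(2, 5)) / mu_rho ** 2

def are_gamma_rho(m):
    """ARE of Gini's gamma versus Spearman's rho, slope obtained by quadrature."""
    def diagonals(t):
        main = t * t * (1 - t ** m) ** 2
        secondary = t * (1 - t) * (1 - t ** m) * (1 - (1 - t) ** m)
        return main + secondary
    mu_gamma = 4 * integrate(diagonals)
    mu_rho = 3 * m * m / (2 + m) ** 2
    return (mu_gamma ** 2 / (2 / 3)) / mu_rho ** 2

for m in (1, 2, 5, 7, 20, 50):
    print(m, float(are_phi_rho(m)), are_gamma_rho(m))
```

With these formulas, the footrule overtakes Spearman’s rho between $m = 6$ and $m = 7$.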

Figure 1. Relative efficiency of $\phi_n$ versus $\rho_n$ (left) and $\gamma_n$ versus $\rho_n$ (right) as a function of parameter $m \ge 1$ in the Woodworth alternatives of Example 4.

Example 5 Suppose that the family $(C_\theta)$ is such that for all $u, v \in (0, 1)$, $\dot{C}_0(u, v) = kuv \ln(u) \ln(v)$ for some $k > 0$. The Clayton/Cook–Johnson and Gumbel–Barnett families fall in this category, as well as Model 4.2.10 of Nelsen (2006). Here, $\mu_{J_\phi} = 4k/9$, $\mu_{J_\gamma} = k(15 - \pi^2)/9$ and $\mu_{J_\rho} = 3k/4$. Consequently,

$$\mathrm{ARE}(S_n^{J_\phi}, S_n^{J_\rho}) = \frac{640}{729} \approx 0.88 \quad \text{and} \quad \mathrm{ARE}(S_n^{J_\gamma}, S_n^{J_\rho}) = \frac{8(15 - \pi^2)^2}{243} \approx 0.87.$$

The tests based on $\phi_n$ and $\gamma_n$ thus have similar efficiencies. For this class of alternatives, however, neither they nor the test based on $\rho_n$ can be recommended. Indeed, the Pitman efficiency of Spearman’s rho is only $9/16 \approx 0.563$ when compared with Savage’s log-rank test, which is the locally most powerful test statistic in this case (Genest and Verret 2005).

The final example, adapted from Conti and Nikitin (1999a), exhibits dependence models for which Spearman’s footrule and Gini’s gamma are the locally most powerful test statistics.

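Example 6 Consider the families of copulas defined for all $u, v, \theta \in (0, 1)$ by

$$C^\phi_\theta(u, v) = uv + \frac{\theta\{ |u - v|^3 - (u + v)^3 + 2uv(u^2 + v^2 + 2) \}}{2}$$

and

$$C^\gamma_\theta(u, v) = uv + \frac{\theta\{ |1 - u - v|^3 + |u - v|^3 - 3(u^2 + v^2 - u - v) - 1 \}}{6}.$$

Both of them lie in the class of cubic-section copulas introduced by Nelsen, Quesada-Molina, and Rodríguez-Lallena (1997). As shown by Conti and Nikitin (1999a), tests of independence based on $\phi_n$ and $\gamma_n$ are locally most powerful for the classes of alternatives $C^\phi_\theta$ and $C^\gamma_\theta$, respectively. For the family $C^\phi_\theta$, one gets $\mu_{J_\phi} = 2/5$, $\mu_{J_\gamma} = 1/2$ and $\mu_{J_\rho} = 3/5$. Thus,

$$\mathrm{ARE}(S_n^{J_\phi}, S_n^{J_\rho}) = \frac{10}{9} \approx 1.11 \quad \text{and} \quad \mathrm{ARE}(S_n^{J_\gamma}, S_n^{J_\rho}) = \frac{25}{24} \approx 1.04.$$

For $C^\gamma_\theta$, one finds $\mu_{J_\phi} = 1/2$, $\mu_{J_\gamma} = 2/3$ and $\mu_{J_\rho} = 4/5$. Hence,

$$\mathrm{ARE}(S_n^{J_\phi}, S_n^{J_\rho}) = \frac{125}{128} \approx 0.98 \quad \text{and} \quad \mathrm{ARE}(S_n^{J_\gamma}, S_n^{J_\rho}) = \frac{25}{24} \approx 1.04.$$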
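The efficiencies of Example 6 can be approximated by quadrature from the functions $\dot{C}_0$ read off the two displayed families. The sketch below again assumes the reductions $\mu_{J_\phi} = 6\int_0^1 \dot{C}_0(t, t)\,dt$, $\mu_{J_\gamma} = 4\int_0^1 \{\dot{C}_0(t, t) + \dot{C}_0(t, 1 - t)\}\,dt$ and $\mu_{J_\rho} = 12\int\!\!\int \dot{C}_0$, which are derived from the score functions rather than quoted from the paper. Only ratios are reported, since the ARE is unaffected by the overall scale of $\dot{C}_0$. Function names are hypothetical.

```python
def integrate(f, steps=2000):
    """Composite midpoint rule on (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def integrate2(f, steps=200):
    """Midpoint rule on the unit square."""
    h = 1.0 / steps
    points = [(k + 0.5) * h for k in range(steps)]
    return h * h * sum(f(u, v) for u in points for v in points)

def c0_phi(u, v):
    """Derivative in theta of the first family of Example 6."""
    return (abs(u - v) ** 3 - (u + v) ** 3 + 2 * u * v * (u * u + v * v + 2)) / 2

def c0_gamma(u, v):
    """Derivative in theta of the second family of Example 6."""
    return (abs(1 - u - v) ** 3 + abs(u - v) ** 3
            - 3 * (u * u + v * v - u - v) - 1) / 6

NULL_VARIANCE = {"phi": 2 / 5, "gamma": 2 / 3, "rho": 1.0}

def slopes(c0):
    """Slopes mu_J for the three score functions (assumed reductions)."""
    return {
        "phi": 6 * integrate(lambda t: c0(t, t)),
        "gamma": 4 * integrate(lambda t: c0(t, t) + c0(t, 1 - t)),
        "rho": 12 * integrate2(c0),
    }

def are(c0, first, second="rho"):
    """Pitman ARE of the test `first` relative to the test `second`."""
    mu = slopes(c0)
    return ((mu[first] ** 2 / NULL_VARIANCE[first])
            / (mu[second] ** 2 / NULL_VARIANCE[second]))

print(are(c0_phi, "phi"), are(c0_phi, "gamma"))
print(are(c0_gamma, "phi"), are(c0_gamma, "gamma"))
```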


6. Estimation of the asymptotic variance

An alternative derivation of the limiting distributions of $\phi_n$ and $\gamma_n$ was given by Conti (1994) using an asymptotically equivalent $U$-statistic (see also Cifarelli et al. 1996). His approach leads to a consistent estimate of their large-sample variances. Given $u, v, s, t \in \mathbb{R}$, let

$$\psi_1(u, v; s, t) = |u - v| + \mathrm{sign}(u - v)\{ \mathbf{1}(s \le u) - \mathbf{1}(t \le v) - u + v \},$$

$$\psi_2(u, v; s, t) = |u + v - 1| + \mathrm{sign}(u + v - 1)\{ \mathbf{1}(s \le u) + \mathbf{1}(t \le v) - u - v \},$$

with the convention $\mathrm{sign}(0) = -1$. For $k = 1, 2$, define $\Psi_k(u, v; s, t) = \psi_k(u, v; s, t) + \psi_k(s, t; u, v)$ as well as

$$\Upsilon_{kn} = \binom{n}{2}^{-1} \sum_{i<j} \Psi_k(U_i, V_i; U_j, V_j),$$

where $U_i = F(X_i)$, $V_i = G(Y_i)$ for $i \in \{1, \ldots, n\}$. Conti’s result is as follows (see Conti 1994 for a proof).

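Proposition 2 If the conditions of Proposition 1 hold, then as $n \to \infty$,

$$n^{1/2}(\Upsilon_{1n} - \phi_C) \rightsquigarrow \mathcal{N}(0, \sigma^2_{\phi_C}) \quad \text{and} \quad n^{1/2}(\Upsilon_{2n} - \Upsilon_{1n} - \gamma_C) \rightsquigarrow \mathcal{N}(0, \sigma^2_{\gamma_C}).$$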

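Let $\hat\sigma^2_{\phi_C}$ and $\hat\sigma^2_{\gamma_C}$ be the delete-one jackknife variance estimators based on $\Upsilon_{1n}$ and $\Upsilon_{2n} - \Upsilon_{1n}$, respectively. The theory of $U$-statistics (Lee 1990, Chap. 5) implies that $\hat\sigma^2_{\phi_C}$ is a consistent estimate of $\sigma^2_{\phi_C}$; similarly, $\hat\sigma^2_{\gamma_C}$ estimates $\sigma^2_{\gamma_C}$ consistently. In his work, Conti (1994) used slight variants based on the work of Sen (1960). Specifically, let

$$\Upsilon_{kn,i} = \frac{1}{n - 1} \sum_{j=1,\, j \ne i}^{n} \Psi_k(U_i, V_i; U_j, V_j)$$

for $k = 1, 2$ and $i \in \{1, \ldots, n\}$. Conti’s estimators of $\sigma^2_{\phi_C}$ and $\sigma^2_{\gamma_C}$ are then given by

$$\tilde\sigma^2_{\phi_C} = \frac{4}{n} \sum_{i=1}^{n} (\Upsilon_{1n,i} - \Upsilon_{1n})^2 = \frac{(n - 2)^2}{n(n - 1)} \hat\sigma^2_{\phi_C}$$

and

$$\tilde\sigma^2_{\gamma_C} = \frac{4}{n} \sum_{i=1}^{n} (\Upsilon_{2n,i} - \Upsilon_{2n})^2 = \frac{(n - 2)^2}{n(n - 1)} \hat\sigma^2_{\gamma_C},$$

respectively. In this fashion, $\mathrm{var}(\tilde\sigma^2_{\phi_C}) < \mathrm{var}(\hat\sigma^2_{\phi_C})$ and $\mathrm{var}(\tilde\sigma^2_{\gamma_C}) < \mathrm{var}(\hat\sigma^2_{\gamma_C})$.

As shown by Conti (1994), the delete-one jackknife remains consistent when $U_i$ and $V_i$ are replaced by $F_n(X_i) = R_i/n$ and $G_n(Y_i) = S_i/n$, where $F_n$ and $G_n$ are the empirical versions of $F$ and $G$, respectively.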
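To convey the mechanics of the delete-one idea, the sketch below applies a plain, generic jackknife directly to $\phi_n$ and $\gamma_n$: each observation is removed in turn, the remaining $n - 1$ pairs are re-ranked, and the spread of the leave-one-out values is rescaled. This is not Conti’s $U$-statistic estimator described above, and its consistency in this setting is assumed here rather than taken from the text. All function names and the simulated data are hypothetical.

```python
import random

def ranks(values):
    """Ranks 1..n of a sequence without ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, index in enumerate(order, start=1):
        r[index] = position
    return r

def footrule_stat(x, y):
    """Spearman's footrule computed from raw data."""
    r, s = ranks(x), ranks(y)
    n = len(x)
    return 1 - 3 * sum(abs(a - b) for a, b in zip(r, s)) / (n * n - 1)

def gamma_stat(x, y):
    """Gini's gamma computed from raw data."""
    r, s = ranks(x), ranks(y)
    n = len(x)
    total = sum(abs(n + 1 - a - b) - abs(a - b) for a, b in zip(r, s))
    return total / ((n * n) // 2)

def jackknife_variance(stat, x, y):
    """Generic delete-one jackknife estimate of var(stat)."""
    n = len(x)
    leave_one_out = [stat(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean = sum(leave_one_out) / n
    return (n - 1) / n * sum((v - mean) ** 2 for v in leave_one_out)

random.seed(1)
n = 100
x = [random.random() for _ in range(n)]
y = [random.random() for _ in range(n)]

# Multiplying by n puts the estimate on the scale of the asymptotic variance.
print(n * jackknife_variance(footrule_stat, x, y))
print(n * jackknife_variance(gamma_stat, x, y))
```

The cost is quadratic in $n$ up to the sorting factor, which is acceptable for the sample sizes considered in this paper.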


Figure 2. Dispersion of the asymptotic variance estimate of Spearman’s footrule (left) and Gini’s gamma (right), based on 100 random samples of size $n = 50$, 100, 250 and 500 from the Farlie–Gumbel–Morgenstern copula with parameter $\theta = 1/2$.

The procedure can be implemented more easily upon noting that

$$\frac{\Psi_1(u, v; s, t)}{2} = \mathbf{1}\{ u \le \min(v, s),\ t < \min(v, s) \} + \mathbf{1}\{ s \le \min(t, u),\ v < \min(t, u) \}$$

and

$$\frac{\Psi_2(u, v; s, t)}{2} = \mathbf{1}\{ s \le \min(u, 1 - t),\ v > \max(t, 1 - u) \} + \mathbf{1}\{ u \le \min(s, 1 - v),\ t > \max(v, 1 - s) \}.$$

The behaviour of the estimators of $\sigma^2_{\phi_C}$ and $\sigma^2_{\gamma_C}$ is illustrated in Figure 2 using random samples of size $n = 50$, 100, 250 and 500 from the Farlie–Gumbel–Morgenstern copula with parameter $\theta = 1/2$. As can be seen, the convergence is fairly rapid. The same phenomenon was observed for several other classes of copulas [results not shown].

7. Extensions

In recent years, various generalisations of Spearman’s footrule and Gini’s gamma have been proposed. In particular, Cifarelli et al. (1996) considered

$$\phi_{g,C} = g^{-1}\left\{ \int_{(0,1)^2} g(|u - v|) \, dC(u, v) \right\}$$

and

$$\gamma_{g,C} = g^{-1}\left[ \int_{(0,1)^2} \{ g(|u + v - 1|) - g(|u - v|) \} \, dC(u, v) \right],$$

where $g : [0, 1] \to [0, 1]$ is a strictly increasing, continuous function. In addition, if $g$ is convex and satisfies $g(0) = 0$, $\gamma_{g,C}$ is a measure of concordance in the sense of Scarsini (1984); the cases $g(t) = t$ and $g(t) = t^2$ correspond to Gini’s gamma and Spearman’s rho, respectively. Cifarelli et al. (1996) identified the asymptotic distribution of the empirical version of $\gamma_{g,C}$ and showed how to estimate its variance consistently by the jackknife. See Conti and Nikitin (1999b) for additional limiting results. However, it is not clear how $g$ should be chosen in practice.


More recently, multivariate versions of Spearman’s footrule and Gini’s gamma were proposed by Úbeda-Flores (2005) and Behboodian, Dolati, and Úbeda-Flores (2007), respectively. The $d$-variate version of $\phi_C$ is

$$\phi_C = \frac{d + 1}{d - 1} \int_0^1 \{ C(t, \ldots, t) + \bar{C}(t, \ldots, t) \} \, dt - \frac{2}{d - 1},$$

where $\bar{C}$ is the distribution function of $1 - U$ with $U = (U_1, \ldots, U_d)$ distributed as $C$. Úbeda-Flores (2005) showed that $\phi_C = 0$ at independence and $\phi_C = 1$ at the Fréchet–Hoeffding upper bound, defined for every $u_1, \ldots, u_d \in (0, 1)$ by

$$M(u_1, \ldots, u_d) = \min(u_1, \ldots, u_d).$$

In addition, he proved that the inequality $\phi_C \ge -1/d$ always holds and that if $C_{12}$, $C_{13}$, $C_{23}$ are the bivariate margins of a trivariate copula $C$, then

$$\phi_C = \frac{1}{3}(\phi_{C_{12}} + \phi_{C_{13}} + \phi_{C_{23}}). \tag{6}$$

This property, which does not extend to higher dimensions, is shared by the multivariate extension of Gini’s gamma proposed by Behboodian et al. (2007). The latter is defined as a linear transformation of

$$\gamma^*_C = \int_0^1 \{ C(t, \ldots, t) + \bar{C}(t, \ldots, t) \} \, dt + \sum_{A \subseteq D} (-1)^{|A|} \int_{(0,1)^d} W(u_A) \, dC(u),$$

where $|A|$ denotes the cardinality of the set $A \subseteq D = \{1, \ldots, d\}$ and $u_A$ is the vector derived from $u = (u_1, \ldots, u_d) \in (0, 1)^d$ by replacing its $\ell$th coordinate by 1 if and only if $\ell \notin A$. The expression for $\gamma^*_C$ also involves the multivariate Fréchet–Hoeffding lower bound, defined for every $u_1, \ldots, u_d \in (0, 1)$ by

$$W(u_1, \ldots, u_d) = \max(0, u_1 + \cdots + u_d + 1 - d).$$

More specifically, Behboodian et al. (2007) defined $\gamma_C = (\gamma^*_C - 2a_d + 1)/(2b_d - 2a_d)$ with

$$a_d = \frac{1}{d + 1} + \frac{1}{2(d + 1)!} + \sum_{j=0}^{d} (-1)^j \binom{d}{j} \frac{1}{2(j + 1)!}, \qquad b_d = 1 - \sum_{j=1}^{d-1} \frac{1}{4j}$$

chosen in such a way that $\gamma_C = 0$ at independence and $\gamma_C = 1$ at $M$.

8. Sample properties in the multivariate case

Given a random sample $(X_{11}, \ldots, X_{1d}), \ldots, (X_{n1}, \ldots, X_{nd})$ from some continuous $d$-variate distribution, and $(R_{11}, \ldots, R_{1d}), \ldots, (R_{n1}, \ldots, R_{nd})$ the associated vectors of componentwise ranks, Úbeda-Flores (2005) defined the empirical version of $\phi_C$ by

$$\phi_n = 1 - \frac{d + 1}{d - 1} \sum_{i=1}^{n} \frac{L_i}{n^2 - 1},$$

where for each $i \in \{1, \ldots, n\}$, $L_i = \max(R_{i1}, \ldots, R_{id}) - \min(R_{i1}, \ldots, R_{id})$. The following proposition, whose proof is in Appendix 2, implies that $\phi_n$ is asymptotically unbiased. As will be shown below, however, it is generally biased in finite samples.


Proposition 3 Suppose that a $d$-variate copula $C$ admits continuous partial derivatives $C_1(u_1, \ldots, u_d) = \partial C(u_1, \ldots, u_d)/\partial u_1, \ldots, C_d(u_1, \ldots, u_d) = \partial C(u_1, \ldots, u_d)/\partial u_d$ on $(0, 1)^d$. Then as $n \to \infty$,

$$n^{1/2}(\phi_n - \phi_C) \rightsquigarrow \mathcal{N}(0, \sigma^2_{\phi_C}),$$

where $\sigma^2_{\phi_C}$ is defined in Equation (A5).

It is checked readily that Equation (A5) reduces to Equation (A1) when $d = 2$, and that Property (6) continues to hold for the empirical version of $\phi_C$. Although a closed-form expression is available for $\sigma^2_{\phi_C}$, its computation can be tedious. Here is a simple example in dimension $d = 3$.
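The empirical counterpart of Property (6) rests on the elementary fact that, for three numbers, the range equals half the sum of the three pairwise absolute differences. The sketch below checks the resulting identity exactly on random rank vectors, and also confirms that the $d$-variate statistic reduces to Equation (1) when $d = 2$. The function names are hypothetical.

```python
import random
from fractions import Fraction

def multivariate_footrule(rank_rows):
    """d-variate footrule; rank_rows[i] holds the d ranks of observation i."""
    n = len(rank_rows)
    d = len(rank_rows[0])
    total = sum(max(row) - min(row) for row in rank_rows)
    return 1 - Fraction(d + 1, d - 1) * Fraction(total, n * n - 1)

def footrule(r, s):
    """Bivariate footrule, Equation (1)."""
    n = len(r)
    return 1 - Fraction(3 * sum(abs(a - b) for a, b in zip(r, s)), n * n - 1)

random.seed(2)
n = 8
base = list(range(1, n + 1))
columns = [random.sample(base, n) for _ in range(3)]  # three rank vectors
rows = list(zip(*columns))

trivariate = multivariate_footrule(rows)
pairwise = [footrule(columns[a], columns[b]) for a, b in ((0, 1), (0, 2), (1, 2))]
print(trivariate, sum(pairwise) / 3)
```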

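Example 7 Given $\theta_{12}, \theta_{13}, \theta_{23}, \theta_{123} \in [-1, 1]$, a trivariate version of the Farlie–Gumbel–Morgenstern copula is defined for all $u, v, w \in (0, 1)$ by

$$C(u, v, w) = uvw\{ 1 + \theta_{12}(1 - u)(1 - v) + \theta_{13}(1 - u)(1 - w) + \theta_{23}(1 - v)(1 - w) + \theta_{123}(1 - u)(1 - v)(1 - w) \}.$$

Simple algebra yields $\phi_C = (\theta_{12} + \theta_{13} + \theta_{23})/15$ and

$$\sigma^2_{\phi_C} = \frac{2}{15} + \frac{2}{63}(\theta_{12} + \theta_{13} + \theta_{23}) - \frac{11}{1350}(\theta_{12}^2 + \theta_{13}^2 + \theta_{23}^2) - \frac{17}{1350}(\theta_{12}\theta_{13} + \theta_{12}\theta_{23} + \theta_{13}\theta_{23}).$$

Note that $\theta_{123}$ is absent from the formulas, as might be expected from Property (6).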

One possible use of the extended version of $\phi_n$ is as a test statistic for multivariate independence. It is shown in Appendix 3 that under the null hypothesis,

$$E(\phi_n) = 1 - \frac{d + 1}{d - 1} \, \frac{n}{n - 1} \left\{ 1 - \frac{2}{n + 1} \sum_{k=0}^{n} \left( \frac{k}{n} \right)^d \right\}. \tag{7}$$

Observe that while it vanishes when $d = 2$ or 3, this expectation is only $O(1/n^2)$ in general and, for example, equals $1/(9n^2)$ when $d = 4$. A closed-form expression for the finite-sample variance of $\phi_n$ is also given in Equation (A6), but it is cumbersome.
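Equation (7) is straightforward to evaluate exactly. The sketch below does so with rational arithmetic and cross-checks it against a brute-force computation that uses the following observation, stated here as an assumption of the sketch: under the null hypothesis, the $d$ ranks of any given observation are independent and uniformly distributed on $\{1, \ldots, n\}$, so that $E(L_i)$ can be obtained by averaging the range over all $n^d$ rank tuples. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def expected_footrule(n, d):
    """Null expectation of the d-variate footrule, Equation (7)."""
    power_sum = sum(Fraction(k, n) ** d for k in range(n + 1))
    inner = 1 - Fraction(2, n + 1) * power_sum
    return 1 - Fraction(d + 1, d - 1) * Fraction(n, n - 1) * inner

def expected_footrule_bruteforce(n, d):
    """Same expectation from the mean range of d iid uniform ranks."""
    tuples = list(product(range(1, n + 1), repeat=d))
    mean_range = Fraction(sum(max(t) - min(t) for t in tuples), len(tuples))
    return 1 - Fraction(d + 1, d - 1) * n * mean_range / (n * n - 1)

for d in (2, 3, 4, 5):
    print(d, [expected_footrule(n, d) for n in (2, 5, 10)])
```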

In view of Proposition 3, a more practical solution is to reject the null hypothesis at asymptotic level $\alpha$ if $n^{1/2}|\phi_n|/\sigma_\Pi$ is larger than the quantile of level $1 - \alpha/2$ of the $\mathcal{N}(0, 1)$ distribution. Here, $\sigma^2_\Pi$ stands for the large-sample variance of $n^{1/2}\phi_n$ under $H_0$, that is, when the underlying copula is $\Pi(u_1, \ldots, u_d) = u_1 \times \cdots \times u_d$ for all $u_1, \ldots, u_d \in (0, 1)$. As shown in Appendix 4,

$$\sigma^2_\Pi = 2 \left( \frac{d + 1}{d - 1} \right)^2 \left\{ \frac{2 + 4d - d^2 + d^3}{d(d + 2)(2d + 1)(d + 1)^2} - \frac{B(d, d + 2)}{d + 1} \right\},$$

where $B$ denotes the Beta function. In particular, $\sigma^2_\Pi = 2/5$, $2/15$ and $149/2268$ when $d = 2$, 3 and 4, respectively.

Behboodian et al. (2007) defined the empirical version of $\gamma_C$ by

$$\gamma_n = \frac{\gamma^*_n - c_n}{d_n - c_n}$$

for appropriate normalising constants $c_n$, $d_n$ and a function $\gamma^*_n$ of the vectors $\mathbf{R}_1, \ldots, \mathbf{R}_n$ of normalised ranks given for each $i \in \{1, \ldots, n\}$ by $\mathbf{R}_i = (R_{i1}, \ldots, R_{id})/(n + 1)$. Specifically,

$$\gamma^*_n = \frac{1}{2n} \sum_{i=1}^{n} \left[ M(\mathbf{R}_i) + W(\mathbf{R}_i) + \sum_{A \subseteq D} (-1)^{|A|} \{ M(\mathbf{R}_{iA}) + W(\mathbf{R}_{iA}) \} \right],$$

where for each $i \in \{1, \ldots, n\}$, $\mathbf{R}_{iA}$ is the vector obtained from $\mathbf{R}_i \in (0, 1)^d$ by replacing its $\ell$th coordinate by 1 if and only if $\ell \notin A$.
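Returning to the test of multivariate independence based on $\phi_n$, the formula for $\sigma^2_\Pi$ given earlier in this section can be evaluated exactly, since the Beta function with integer arguments is a ratio of factorials. The sketch below computes $\sigma^2_\Pi$ and applies the two-sided rejection rule to a value of $\phi_n$; the function names, the default level and the numerical inputs are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from math import factorial, sqrt
from statistics import NormalDist

def beta_int(a, b):
    """Beta function B(a, b) for positive integers a and b."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

def sigma2_independence(d):
    """Asymptotic variance of n^(1/2) * phi_n under d-variate independence."""
    first = Fraction(2 + 4 * d - d ** 2 + d ** 3,
                     d * (d + 2) * (2 * d + 1) * (d + 1) ** 2)
    second = beta_int(d, d + 2) / (d + 1)
    return 2 * Fraction(d + 1, d - 1) ** 2 * (first - second)

def rejects_independence(phi_n, n, d, alpha=0.05):
    """Two-sided asymptotic test based on the d-variate footrule."""
    z = sqrt(n) * abs(phi_n) / sqrt(sigma2_independence(d))
    return z > NormalDist().inv_cdf(1 - alpha / 2)

for d in (2, 3, 4):
    print(d, sigma2_independence(d))
print(rejects_independence(0.2, n=100, d=3))
print(rejects_independence(0.01, n=100, d=3))
```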


It may be conjectured that $\gamma_n$ is an unbiased, asymptotically normal estimator of $\gamma_C$ in arbitrary dimension $d \ge 3$. It will be a challenge to determine its large-sample variance, however, even under independence. This may be the object of future work.

9. Conclusion

This paper reviewed and complemented the properties of Spearman’s footrule and Gini’s gamma. As mentioned in the Introduction, Spearman’s footrule, $\phi_n$, is quickly gaining popularity in applications, mainly due to its interpretation as a Manhattan distance between two sets of ranks. As such, it is more robust than, for example, Spearman’s rho, which is based on the Euclidean distance. However, it suffers from one major drawback, namely its asymmetry. Gini’s statistic, $\gamma_n$, corrects this defect while maintaining the interpretation as a distance. From this point of view, it thus seems preferable.

Furthermore, $\phi_n$ and $\gamma_n$ may be regarded as measures of non-linear association. However, only $\gamma_n$ satisfies the axiomatic definition of such a measure proposed by Scarsini (1984). Nonetheless, both statistics can be used for testing independence. In most cases considered here, they turned out to be less efficient than the classical Spearman’s rho. A general recommendation cannot be made, however, as both $\phi_n$ and $\gamma_n$ are locally optimal for specific classes of alternatives. For additional discussion on rank-based tests of independence and efficiency considerations, see, for example, Genest and Rémillard (2004), Genest and Verret (2005) and Genest et al. (2006).

At present, standard errors for $\phi_n$ and $\gamma_n$ are rarely found in applications, if ever. Asymptotic confidence intervals for both statistics can be derived readily from Proposition 1 using the simpler form of Conti’s variance estimator given in Section 6.

Results in Sections 7 and 8 make it possible to use the multivariate version of Spearman’s footrule proposed by Úbeda-Flores (2005) for the comparison of $d \ge 3$ sets of ranks. It was shown that $\phi_n$ is again asymptotically normal, but that it is generally biased in finite samples if $d \ge 4$. The asymptotic variance under independence was computed and can be used to construct tests for multivariate independence. Similar results concerning the multivariate extension of Gini’s gamma are still under development.

Acknowledgements

Funding in support of this work was provided by the Natural Sciences and Engineering Research Council of Canada, the Fonds québécois de la recherche sur la nature et les technologies and the Institut de finance mathématique de Montréal.

References

Alvo, M., and Charbonneau, M. (1997), ‘The Use of Spearman’s Footrule in Testing for Trend When the Data are Incomplete’, Communications in Statistics: Simulation and Computation, 26, 193–213.

Amato, V. (1954), ‘Sulla distribuzione dell’indice del Gini’, Statistica, 14, 505–519.

Behboodian, J., Dolati, A., and Úbeda-Flores, M. (2007), ‘A Multivariate Version of Gini’s Rank Association Coefficient’, Statistical Papers, 48, 295–304.

Berman, S.M. (1996), ‘Rank Inversions in Scoring Multipart Examinations’, The Annals of Applied Probability, 6, 992–1005.

Betrò, B. (1993), ‘On the Distribution of Gini’s Rank Association Coefficient’, Communications in Statistics: Simulation and Computation, 22, 497–505.

Cifarelli, D.M., Conti, P.L., and Regazzini, E. (1996), ‘On the Asymptotic Distribution of a General Measure of Monotone Dependence’, The Annals of Statistics, 24, 1386–1399.

Cifarelli, D.M., and Regazzini, E. (1977), ‘On a distribution-free test of independence based on Gini’s rank association coefficient’, Recent Developments in Statistics (Proceedings of the European Meeting of Statisticians, Grenoble, 1976), Amsterdam, North-Holland, pp. 375–385.


Conti, P.L. (1994), ‘Asymptotic Inference on a General Measure of Monotone Dependence’, Journal of the Italian Statistical Society, 3, 213–241.

Conti, P.L., and Nikitin, Y.Y. (1999a), ‘Asymptotic Efficiency of Independence Tests Based on Gini’s Rank Association Coefficient, Spearman’s Footrule and Their Generalizations’, Communications in Statistics: Theory and Methods, 28, 453–465.

Conti, P.L., and Nikitin, Y.Y. (1999b), ‘Rates of Convergence for a Class of Rank Tests for Independence’, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 260, Veroyatn. i Stat. 3, 155–163, 319–320 [Translation in Journal of Mathematical Science (New York) 109 (2002) 2141–2147].

Cucconi, O. (1964), ‘La distribuzione campionaria dell’indice di cograduazione del Gini’, Statistica, 24, 143–151.

Deheuvels, P. (1979), ‘La fonction de dépendance empirique et ses propriétés: un test non paramétrique d’indépendance’, Académie royale de Belgique: Bulletin de la classe des sciences, 65(5), 274–292.

Diaconis, P., and Graham, R.L. (1977), ‘Spearman’s Footrule as a Measure of Disarray’, Journal of the Royal Statistical Society, Series B, 39, 262–268.

Dinneen, L.C., and Blakesley, B.C. (1971), ‘Definition of Spearman’s Footrule’, Journal of the Royal Statistical Society, Series C, 31, 66.

Fagin, R., Kumar, R., and Sivakumar, D. (2003), ‘Comparing Top k Lists’, SIAM Journal of Discrete Mathematics, 17, 134–160.

Fermanian, J.-D., Radulovic, D., and Wegkamp, M.H. (2004), ‘Weak Convergence of Empirical Copula Processes’, Bernoulli, 10, 847–860.

Franklin, L.A. (1988), ‘Exact Tables of Spearman’s Footrule for N = 11(1)18 With Estimate of Convergence and Errors for the Normal Approximation’, Statistics and Probability Letters, 6, 399–406.

Genest, C., Quessy, J.-F., and Rémillard, B. (2006), ‘Local Efficiency of a Cramér–von Mises Test of Independence’, Journal of Multivariate Analysis, 97, 274–294.

Genest, C., and Rémillard, B. (2004), ‘Tests of Independence and Randomness Based on the Empirical Copula Process’, Test, 13, 335–369.

Genest, C., and Verret, F. (2005), ‘Locally Most Powerful Rank Tests of Independence for Copula Models’, Journal of Nonparametric Statistics, 17, 521–539.

Gini, C. (1914), L’Ammontare e la Composizione della Ricchezza delle Nazione, Torino: Bocca.

Hájek, J., and Šidák, Z. (1967), Theory of Rank Tests, New York: Academic Press.

Herzel, A. (1972), ‘Sulla distribuzione campionaria dell’indice di cograduazione del Gini’, Metron, 30, 137–153.

Iorio, F., Tagliaferri, R., and di Bernardo, D. (2009), ‘Identifying Network of Drug Mode of Action by Gene Expression Profiling’, Journal of Computational Biology, 16, 241–251.

Kendall, M. (1970), Rank Correlation Methods (4th ed.), London: Griffin.

Kim, B.S., Rha, S.Y., Cho, G.B., and Chung, H.C. (2004), ‘Spearman’s Footrule as a Measure of cDNA Microarray Reproducibility’, Genomics, 84, 441–448.

Kleinecke, D.C., Ury, H.K., and Wagner, L.F. (1962), ‘Spearman’s Footrule—An Alternative Rank Statistic’, (Rep. no CDRP–182–114), Civil Defense Research Project, Institute of Engineering Research, University of California at Berkeley.

Landenna, G., and Scagni, A. (1989), ‘An Approximated Distribution of the Gini’s Rank Association Coefficient’, Communications in Statistics: Theory and Methods, 18, 2017–2026.

Lee, A.J. (1990), U-Statistics: Theory and Practice, New York: Dekker.

Lin, S., and Ding, J. (2009), ‘Integration of Ranked Lists via Cross Entropy Monte Carlo with Applications to mRNA and microRNA Studies’, Biometrics, 65, 9–18.

Mikki, S. (in press), ‘Comparing Google Scholar and ISI Web of Science for Earth Sciences’, Scientometrics.

Nelsen, R.B. (1998), ‘Concordance and Gini’s Measure of Association’, Journal of Nonparametric Statistics, 9, 227–238.

Nelsen, R.B. (2006), An Introduction to Copulas (2nd ed.), Berlin: Springer.

Nelsen, R.B., Quesada-Molina, J.J., and Rodríguez-Lallena, J.A. (1997), ‘Bivariate Copulas with Cubic Sections’, Journal of Nonparametric Statistics, 7, 205–220.

Nelsen, R.B., and Úbeda-Flores, M. (2004), ‘The Symmetric Footrule is Gini’s Rank Association Coefficient’, Communications in Statistics: Theory and Methods, 33, 195–196.

Quade, D., and Salama, I.A. (2006), ‘Concordance of Complete or Right-censored Rankings Based on Spearman’s Footrule’, Communications in Statistics: Theory and Methods, 35, 1059–1069.

Quesada-Molina, J.J., and Rodríguez-Lallena, J.A. (1995), ‘Bivariate Copulas with Quadratic Sections’, Journal of Nonparametric Statistics, 5, 323–337.

Rizzi, A. (1971), ‘Distribuzione dell’indice di cograduazione del Gini’, Metron, 29, 63–73.

Rüschendorf, L. (1976), ‘Asymptotic Distributions of Multivariate Rank Order Statistics’, The Annals of Statistics, 4, 912–923.

Salama, I.A., and Quade, D. (1990), ‘A Note on Spearman’s Footrule’, Communications in Statistics: Simulation and Computation, 19, 591–601.

Salama, I.A., and Quade, D. (2001), ‘The Symmetric Footrule’, Communications in Statistics: Theory and Methods, 30, 1099–1109.

Salama, I.A., and Quade, D. (2002), ‘Computing the Distribution of Spearman’s Footrule in O(n⁴) Time’, Journal of Statistical Computation and Simulation, 72, 895–898.

Salama, I.A., and Quade, D. (2004), ‘Agreement Among Censored Rankings Using Spearman’s Footrule’, Communications in Statistics: Theory and Methods, 33, 1837–1850.


Salvemini, T. (1951), ‘Sui vari indici di cograduazione’, Statistica, 11, 133–154.

Savorgnan, F. (1915), Sulla Formazione dei Valori dell’Indice di Cograduazione, Studi Economico-Giuridici dell’Università di Cagliari.

Scarsini, M. (1984), ‘On Measures of Concordance’, Stochastica, 8, 201–218.

Sen, P.K. (1960), ‘On Some Convergence Properties of U-Statistics’, Calcutta Statistics Association Bulletin, 10, 1–18.

Sen, P.K., and Salama, I.A. (1983), ‘The Spearman Footrule and a Markov Chain Property’, Statistics and Probability Letters, 1, 285–289.

Sen, P.K., Salama, I.A., and Quade, D. (2003), ‘Spearman’s Footrule Under Progressive Censoring’, Journal of Nonparametric Statistics, 15, 53–60.

Spearman, C. (1904), ‘The Proof and Measurement of Association Between Two Things’, The American Journal of Psychiatry, 15, 72–101.

Spearman, C. (1906), ‘Footrule for Measuring Correlation’, The British Journal of Psychiatry, 2, 89–108.

Stute, W. (1984), ‘The Oscillation Behavior of Empirical Processes: The Multivariate Case’, The Annals of Probability, 12, 361–379.

Tsukahara, H. (2005), ‘Semiparametric Estimation in Copula Models’, The Canadian Journal of Statistics, 33, 357–375.

Úbeda-Flores, M. (2005), ‘Multivariate Versions of Blomqvist’s Beta and Spearman’s Footrule’, Annals of the Institute of Statistical Mathematics, 57, 781–788.

Ury, H.K., and Kleinecke, D.C. (1979), ‘Tables of the Distribution of Spearman’s Footrule’, Applied Statistics, 28, 271–275.

Vittadini, G. (1991), ‘Una approssimazione della variabile casuale G di Gini’, Rivista Internationale di Scienze Economiche e Commerciali, 38, 81–94.

Woodworth, G.G. (1970), ‘Large Deviations and Bahadur Efficiency of Linear Rank Statistics’, The Annals of Mathematical Statistics, 41, 251–283.

Appendix 1: Proof of Proposition 1

First, define a variant of the empirical copula of Deheuvels (1979) by

$$C_n(u, v) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\left(\frac{R_i}{n+1} \le u, \frac{S_i}{n+1} \le v\right)$$

for every $u, v \in (0, 1)$. Observe that for any score function $J : (0, 1)^2 \to \mathbb{R}$, one has

$$\frac{1}{n} \sum_{i=1}^n J\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right) = \int_{(0,1)^2} J(u, v)\, dC_n(u, v).$$

If in addition $J$ itself is a copula, up to a multiplicative constant, Fubini’s theorem yields

$$\int_{(0,1)^2} J(u, v)\, dC_n(u, v) = \int_{(0,1)^2} C_n(u, v)\, dJ(u, v).$$

When these identities are used with $J = J_\phi$, Equation (3) becomes

$$\phi_n = \frac{6n}{n-1} \int_0^1 C_n(t, t)\, dt - \frac{2n+1}{n-1},$$

which shows that $n^{1/2}(\phi_n - \phi_C)$ has the same asymptotic behaviour as $Z_{C,n} = 6 \int_0^1 \mathbb{C}_n(t, t)\, dt$, where $\mathbb{C}_n = n^{1/2}(C_n - C)$ is the empirical copula process. Similarly, $n^{1/2}(\gamma_n - \gamma_C)$ behaves asymptotically as $Z^*_{C,n} = 4 \int_0^1 \{\mathbb{C}_n(t, t) + \mathbb{C}_n(t, 1 - t)\}\, dt$.

Now it has been known since the work of Rüschendorf (1976) that when $C$ admits continuous partial derivatives, $\mathbb{C}_n$ converges weakly as $n \to \infty$ to a continuous centred Gaussian process $\mathbb{C}$ of the form $\mathbb{C}(u, v) = \mathbb{U}_C(u, v) - C_1(u, v)\,\mathbb{U}_C(u, 1) - C_2(u, v)\,\mathbb{U}_C(1, v)$ for all $u, v \in (0, 1)$. Here, $\mathbb{U}_C$ denotes a pinned $C$-Brownian sheet, that is, a centred Gaussian random field whose covariance function at $u, v, s, t \in (0, 1)$ is given by $\mathrm{cov}\{\mathbb{U}_C(u, v), \mathbb{U}_C(s, t)\} = C\{\min(u, s), \min(v, t)\} - C(u, v)C(s, t)$. See, for example, Stute (1984), Fermanian, Radulovic, and Wegkamp (2004), or Tsukahara (2005) for further discussion.
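As an illustration of the covariance structure just described, the following sketch works out the covariance of the limiting process in the special case of the independence copula $C(u, v) = uv$. Two assumptions are made that are not part of the argument above: $C_1$ and $C_2$ are read as the partial derivatives of $C$ with respect to its first and second arguments (so that $C_1(u, v) = v$ and $C_2(u, v) = u$ here), and the covariance formula of the pinned sheet is extended to points with a coordinate equal to 1. Under these assumptions the covariance should factorise as $\{\min(u, s) - us\}\{\min(v, t) - vt\}$, which the code checks exactly on a rational grid. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def cov_sheet(p, q):
    """cov{U(p), U(q)} for the pinned sheet when C(u, v) = u * v."""
    (u, v), (s, t) = p, q
    return min(u, s) * min(v, t) - (u * v) * (s * t)


def limit_terms(u, v):
    """Write the limit process at (u, v) as a linear combination of sheet values:
    U(u, v) - C_1(u, v) U(u, 1) - C_2(u, v) U(1, v), with C_1 = v and C_2 = u."""
    return [(1, (u, v)), (-v, (u, 1)), (-u, (1, v))]


def cov_limit(p, q):
    """Covariance of the limit process at points p and q, by bilinearity."""
    return sum(a * b * cov_sheet(x, y)
               for (a, x), (b, y) in product(limit_terms(*p), limit_terms(*q)))


def cov_product_form(p, q):
    """Conjectured closed form: product of two Brownian-bridge covariances."""
    (u, v), (s, t) = p, q
    return (min(u, s) - u * s) * (min(v, t) - v * t)


grid = [Fraction(k, 5) for k in range(1, 5)]
points = list(product(grid, repeat=2))
agree = all(cov_limit(p, q) == cov_product_form(p, q) for p in points for q in points)
print(agree)
```

Exact rational arithmetic is used so that agreement is an identity on the grid rather than a floating-point coincidence.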

Because $Z_{C,n}$ is a continuous linear functional of $\mathbb{C}_n$, it converges weakly as $n \to \infty$ to the centred Gaussian random variable $Z_C = 6 \int_0^1 \mathbb{C}(t, t)\, dt$ with variance

$$\sigma^2_{\phi_C} = 36 \int_0^1 \int_0^1 \mathrm{cov}\{\mathbb{C}(s, s), \mathbb{C}(t, t)\}\, ds\, dt. \tag{A1}$$

Similarly, the weak limit of $Z^*_{C,n}$ as $n \to \infty$ is the centred Gaussian random variable $Z^*_C = 4 \int_0^1 \{\mathbb{C}(t, t) + \mathbb{C}(t, 1 - t)\}\, dt$ with variance

$$\sigma^2_{\gamma_C} = 16 \int_0^1 \int_0^1 \mathrm{cov}\{\mathbb{C}(s, s) + \mathbb{C}(s, 1 - s), \mathbb{C}(t, t) + \mathbb{C}(t, 1 - t)\}\, ds\, dt. \tag{A2}$$

The latter limit corresponds to the case $g(t) = t$ in Cifarelli et al. (1996, Theorem 4.1).
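Formulas (A1) and (A2) can be evaluated numerically once the covariance of the limiting process is known. The sketch below does so for the independence copula only, assuming that in this case the covariance factorises as $\{\min(u, s) - us\}\{\min(v, t) - vt\}$ (a standard fact that is not stated in the proof above). A plain midpoint rule is used, so the results are approximations; the function names and the grid size `m` are illustrative choices. The first value should be close to $2/5$, the variance quoted in Section 8 for $d = 2$, and the second close to $2/3$, which is what (A2) gives analytically under the same assumption.

```python
def bridge(a, b):
    """Brownian-bridge covariance min(a, b) - a * b."""
    return min(a, b) - a * b


def cov_limit(u, v, s, t):
    """Covariance of the limit process at (u, v) and (s, t) under independence (assumed form)."""
    return bridge(u, s) * bridge(v, t)


def double_midpoint(f, m=200):
    """Midpoint-rule approximation of the integral of f(s, t) over the unit square."""
    h = 1.0 / m
    pts = [(i + 0.5) * h for i in range(m)]
    return h * h * sum(f(s, t) for s in pts for t in pts)


def var_footrule():
    """Right-hand side of (A1) under independence."""
    return 36 * double_midpoint(lambda s, t: cov_limit(s, s, t, t))


def var_gamma():
    """Right-hand side of (A2) under independence, expanded by bilinearity."""
    def integrand(s, t):
        return (cov_limit(s, s, t, t) + cov_limit(s, s, t, 1 - t)
                + cov_limit(s, 1 - s, t, t) + cov_limit(s, 1 - s, t, 1 - t))
    return 16 * double_midpoint(integrand)


print(var_footrule(), var_gamma())
```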


Appendix 2: Proof of Proposition 3

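For arbitrary $\mathbf{u} = (u_1, \ldots, u_d) \in (0, 1)^d$, let

$$C_n(\mathbf{u}) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\left(\frac{R_{i1}}{n+1} \le u_1, \ldots, \frac{R_{id}}{n+1} \le u_d\right)$$

be the $d$-variate empirical copula. One then has

$$\phi_n = 1 - \frac{d+1}{d-1}\, \frac{n}{n-1} \int_{(0,1)^d} \{\max(\mathbf{u}) - \min(\mathbf{u})\}\, dC_n(\mathbf{u}), \tag{A3}$$

and the factor $n/(n - 1)$ can be ignored asymptotically.

Now if $(U_1, \ldots, U_d)$ has distribution $C_n$ and $U$ is an independent uniform random variable on $(0, 1)$, then

$$\int_{(0,1)^d} \max(\mathbf{u})\, dC_n(\mathbf{u}) = 1 - \Pr(U \ge U_1, \ldots, U \ge U_d) = 1 - \int_0^1 C_n(t, \ldots, t)\, dt$$

and

$$\int_{(0,1)^d} \min(\mathbf{u})\, dC_n(\mathbf{u}) = \Pr(U \le U_1, \ldots, U \le U_d) = \int_0^1 \Pr(U_1 > t, \ldots, U_d > t)\, dt.$$

The latter expression can be formulated alternatively in terms of $C_n$ by means of the inclusion–exclusion formula. To this end, let $|A|$ denote the cardinality of any set $A \subseteq D = \{1, \ldots, d\}$, and denote by $t_A$ the vector $(t_1, \ldots, t_d)$ such that $t_\ell = t\,\mathbf{1}(\ell \in A) + \mathbf{1}(\ell \notin A)$ for all $\ell \in \{1, \ldots, d\}$ so that, for example, $t_D = (t, \ldots, t)$. Then,

$$\Pr(U_1 > t, \ldots, U_d > t) = \sum_{A \subseteq D} (-1)^{|A|} \Pr\left(\bigcap_{i \in A} \{U_i \le t\}\right) = \sum_{A \subseteq D} (-1)^{|A|} C_n(t_A),$$

where an intersection over the empty set is to be interpreted as the sure event.

Similarly, one has

$$\int_0^1 \hat{C}(t, \ldots, t)\, dt = \int_0^1 \bar{C}(1 - t, \ldots, 1 - t)\, dt = \sum_{A \subseteq D} (-1)^{|A|} \int_0^1 C(t_A)\, dt,$$

where $\hat{C}$ denotes the survival copula associated with $C$ and $\bar{C}$ the corresponding survival function.

Consequently, $n^{1/2}(\phi_n - \phi_C)$ has the same asymptotic behaviour as

$$Z_{C,n} = \frac{d+1}{d-1} \left\{ \int_0^1 \mathbb{C}_n(t_D)\, dt + \sum_{A \subseteq D} (-1)^{|A|} \int_0^1 \mathbb{C}_n(t_A)\, dt \right\},$$

which is a continuous linear functional of the process $\mathbb{C}_n$. From the work of Rüschendorf (1976), the limit of the latter is of the form

$$\mathbb{C}(\mathbf{u}) = \mathbb{U}_C(\mathbf{u}) - \sum_{j=1}^d C_j(\mathbf{u})\, \mathbb{U}_C(\mathbf{u}_j) \tag{A4}$$

for arbitrary $\mathbf{u} = (u_1, \ldots, u_d)$, where $\mathbf{u}_j$ represents a $d$-dimensional vector with $u_j$ in its $j$th coordinate and 1 everywhere else. Here, $\mathbb{U}_C$ is a $d$-variate centred Gaussian field with covariance given by $\mathrm{cov}\{\mathbb{U}_C(\mathbf{u}), \mathbb{U}_C(\mathbf{v})\} = C(\mathbf{u} \wedge \mathbf{v}) - C(\mathbf{u})C(\mathbf{v})$, where for all $\mathbf{u}, \mathbf{v} \in (0, 1)^d$, $\mathbf{u} \wedge \mathbf{v}$ represents the componentwise minimum. Thus, $Z_{C,n}$ converges weakly as $n \to \infty$ to a centred Gaussian random variable

$$Z_C = \frac{d+1}{d-1} \left\{ \int_0^1 \mathbb{C}(t_D)\, dt + \sum_{A \subseteq D} (-1)^{|A|} \int_0^1 \mathbb{C}(t_A)\, dt \right\}.$$

Hence, if $s_A$ is defined as $t_A$ mutatis mutandis, the variance of $Z_C$ is given by

$$\sigma^2_{\phi_C} = \left(\frac{d+1}{d-1}\right)^2 \left\{ \Lambda(D, D) + 2 \sum_{A \subseteq D} (-1)^{|A|} \Lambda(A, D) + \hat{\Lambda}(D, D) \right\}, \tag{A5}$$

where for arbitrary $A, B \subseteq D$, one has

$$\Lambda(A, B) = \int_0^1 \int_0^1 \mathrm{cov}\{\mathbb{C}(s_A), \mathbb{C}(t_B)\}\, ds\, dt$$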


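and

$$\hat{\Lambda}(D, D) = \sum_{A \subseteq D} \sum_{B \subseteq D} (-1)^{|A| + |B|} \Lambda(A, B) = \int_0^1 \int_0^1 \mathrm{cov}\{\hat{\mathbb{C}}(s_D), \hat{\mathbb{C}}(t_D)\}\, ds\, dt.$$

Here, the process $\hat{\mathbb{C}}$ is defined as in Equation (A4), with $C$ replaced by $\hat{C}$ everywhere. Thus when $C$ is radially symmetric, that is, $C = \hat{C}$, one gets $\hat{\Lambda}(D, D) = \Lambda(D, D)$.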
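The finite-sample identities used at the start of this proof (the integrals of $\max(\mathbf{u})$ and $\min(\mathbf{u})$ with respect to $C_n$ and the inclusion–exclusion step) can be checked exactly on simulated rank tables. The sketch below is only an illustration: the helper names are hypothetical, the ranks are random permutations generated with a fixed seed, and the integral of $C_n(t_A)$ over $(0, 1)$ is computed from the observation that $C_n(t_A)$ is a step function, so that it equals the average of $1 - \max_{j \in A} R_{ij}/(n+1)$ over $i$.

```python
from fractions import Fraction
from itertools import combinations
import random


def random_rank_table(n, d, seed=0):
    """n rows of d ranks; each column is a random permutation of 1..n."""
    rng = random.Random(seed)
    columns = [rng.sample(range(1, n + 1), n) for _ in range(d)]
    return list(zip(*columns))


def diag_integral(ranks, subset):
    """Exact integral over (0, 1) of C_n(t_A) dt for A = subset (0-based indices)."""
    n = len(ranks)
    if not subset:
        return Fraction(1)  # C_n(1, ..., 1) = 1
    return sum(1 - Fraction(max(row[j] for j in subset), n + 1) for row in ranks) / n


def footrule_from_ranks(ranks):
    """Equation (A3), with the integral written as a sum over the observations."""
    n, d = len(ranks), len(ranks[0])
    spread = sum(max(row) - min(row) for row in ranks)
    return 1 - Fraction(d + 1, d - 1) * Fraction(spread, n * n - 1)


def footrule_from_copula(ranks):
    """Equation (A3), with the max and min parts expressed through C_n(t_A)."""
    n, d = len(ranks), len(ranks[0])
    subsets = [A for r in range(d + 1) for A in combinations(range(d), r)]
    max_part = 1 - diag_integral(ranks, tuple(range(d)))
    min_part = sum((-1) ** len(A) * diag_integral(ranks, A) for A in subsets)
    return 1 - Fraction(d + 1, d - 1) * Fraction(n, n - 1) * (max_part - min_part)


for n, d in [(5, 2), (8, 3), (6, 4)]:
    table = random_rank_table(n, d, seed=n + d)
    print(n, d, footrule_from_ranks(table), footrule_from_copula(table))
```

The two columns printed should coincide for every table, since the identities hold for any set of ranks and not only on average.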

Appendix 3: Moments of ϕn at independence

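For arbitrary integers $k \le n$, let $k/n_A$ denote the vector $(k_1, \ldots, k_d)/(n + 1)$, where $k_\ell = k\,\mathbf{1}(\ell \in A) + (n + 1)\mathbf{1}(\ell \notin A)$ for all $\ell \in \{1, \ldots, d\}$. Using results stated on p. 59 of Hájek and Šidák (1967), one can see easily that $E\{C_n(k/n_A)\} = (k/n)^{|A|}$ at independence. Thus if $t_A$ is defined as in Appendix 2, one gets

$$E\left\{\int_0^1 C_n(t_A)\, dt\right\} = \frac{1}{n+1} \sum_{k=0}^n E\{C_n(k/n_A)\} = \frac{1}{n+1} \sum_{k=0}^n \left(\frac{k}{n}\right)^{|A|}.$$

It then follows from the identities proven in Appendix 2 that

$$E\left\{\int_{(0,1)^d} \max(\mathbf{u})\, dC_n(\mathbf{u})\right\} = 1 - E\left\{\int_0^1 C_n(t_D)\, dt\right\} = 1 - \frac{1}{n+1} \sum_{k=0}^n \left(\frac{k}{n}\right)^d$$

and that

$$E\left\{\int_{(0,1)^d} \min(\mathbf{u})\, dC_n(\mathbf{u})\right\} = \sum_{A \subseteq D} (-1)^{|A|} E\left\{\int_0^1 C_n(t_A)\, dt\right\} = \frac{1}{n+1} \sum_{A \subseteq D} (-1)^{|A|} \sum_{k=0}^n \left(\frac{k}{n}\right)^{|A|} = \frac{1}{n+1} \sum_{\ell=0}^d (-1)^\ell \binom{d}{\ell} \sum_{k=0}^n \left(\frac{k}{n}\right)^\ell.$$

The latter sum can be simplified further using the binomial theorem, viz.

$$\sum_{k=0}^n \sum_{\ell=0}^d (-1)^\ell \binom{d}{\ell} \left(\frac{k}{n}\right)^\ell = \sum_{k=0}^n \left(1 - \frac{k}{n}\right)^d = \sum_{k=0}^n \left(\frac{k}{n}\right)^d.$$

Taking expectations on both sides of Equation (A3) and making the appropriate substitutions, one gets Formula (7), as stated.

Turning to the computation of $\mathrm{var}(\phi_n)$, one can immediately deduce from first principles and the above expression for $E(\phi_n)$ that

$$\mathrm{var}(\phi_n) = 2 \left(\frac{d+1}{d-1}\right)^2 \left(\frac{n}{n^2 - 1}\right)^2 \left[ E\{M(n, d)\} - 2 \left\{ \sum_{k=0}^n \left(\frac{k}{n}\right)^d \right\}^2 \right],$$

where

$$M(n, d) = (n + 1)^2 \int_0^1 \int_0^1 \left\{ C_n(s_D) C_n(t_D) + C_n(s_D) \sum_{A \subseteq D} (-1)^{|A|} C_n(t_A) \right\} ds\, dt.$$

Now at independence, one has

$$E\{C_n(j/n_D)\, C_n(k/n_A)\} = \frac{1}{n^2} \left(\frac{j}{n}\right)^{d - |A|} \left[ n \left\{\frac{\min(j, k)}{n}\right\}^{|A|} + n(n - 1) \left\{\frac{jk - \min(j, k)}{n^2 - n}\right\}^{|A|} \right]$$

for arbitrary $j, k \in \{0, \ldots, n\}$ and $A \subseteq D$. Thus if

$$\Theta_n(\ell, d) = \frac{1}{n} \sum_{j=0}^n \sum_{k=0}^n \left(\frac{j}{n}\right)^{d - \ell} \left[ \left\{\frac{\min(j, k)}{n}\right\}^\ell + (n - 1) \left\{\frac{jk - \min(j, k)}{n^2 - n}\right\}^\ell \right]$$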


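for all $\ell \in \{0, \ldots, d\}$, one gets

$$E\left\{\int_0^1 \int_0^1 C_n(s_D)\, C_n(t_A)\, ds\, dt\right\} = \frac{\Theta_n(|A|, d)}{(n + 1)^2}.$$

Upon substitution and an application of the binomial identity, one finds

$$E\{M(n, d)\} = \Theta_n(d, d) + \sum_{\ell=0}^d (-1)^\ell \binom{d}{\ell} \Theta_n(\ell, d) = \Theta_n(d, d) + \Xi_n(d),$$

where

$$\Xi_n(d) = \frac{1}{n} \sum_{j=0}^n \sum_{k=0}^n \left[ \left\{\frac{j}{n} - \frac{\min(j, k)}{n}\right\}^d + (n - 1) \left\{\frac{j}{n} - \frac{jk - \min(j, k)}{n^2 - n}\right\}^d \right].$$

Collecting terms, one finds

$$\mathrm{var}(\phi_n) = 2 \left(\frac{d+1}{d-1}\right)^2 \left(\frac{n}{n^2 - 1}\right)^2 \left[ \Theta_n(d, d) + \Xi_n(d) - 2 \left\{ \sum_{k=0}^n \left(\frac{k}{n}\right)^d \right\}^2 \right]. \tag{A6}$$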
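For small $n$ and $d$, the moment formulas of this appendix can be compared with a brute-force enumeration of all rank configurations. The sketch below is an illustration under stated assumptions: the expectation is taken to be the one obtained by substituting the two expected integrals above into Equation (A3), the variance is Formula (A6) as reconstructed here, and independence is modelled by letting each column of ranks be an independent uniformly distributed permutation (the first column is fixed at the identity, which does not affect the distribution of $\phi_n$). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product


def footrule(columns):
    """phi_n from d columns of ranks, via the sum form of Equation (A3)."""
    d, n = len(columns), len(columns[0])
    spread = sum(max(row) - min(row) for row in zip(*columns))
    return 1 - Fraction(d + 1, d - 1) * Fraction(spread, n * n - 1)


def exact_moments(n, d):
    """Mean and variance of phi_n by enumerating all (n!)^(d-1) configurations."""
    identity = tuple(range(1, n + 1))
    perms = list(permutations(identity))
    values = [footrule((identity,) + rest) for rest in product(perms, repeat=d - 1)]
    mean = sum(values) / len(values)
    return mean, sum((v - mean) ** 2 for v in values) / len(values)


def power_sum(n, d):
    return sum(Fraction(k, n) ** d for k in range(n + 1))


def mean_formula(n, d):
    c = Fraction(d + 1, d - 1)
    return 1 - c * Fraction(n, n - 1) * (1 - 2 * power_sum(n, d) / (n + 1))


def theta(n, ell, d):
    total = Fraction(0)
    for j, k in product(range(n + 1), repeat=2):
        m = min(j, k)
        total += Fraction(j, n) ** (d - ell) * (
            Fraction(m, n) ** ell + (n - 1) * Fraction(j * k - m, n * n - n) ** ell)
    return total / n


def xi(n, d):
    total = Fraction(0)
    for j, k in product(range(n + 1), repeat=2):
        m = min(j, k)
        total += Fraction(j - m, n) ** d + (n - 1) * (
            Fraction(j, n) - Fraction(j * k - m, n * n - n)) ** d
    return total / n


def var_formula(n, d):
    """Formula (A6)."""
    c = Fraction(d + 1, d - 1)
    bracket = theta(n, d, d) + xi(n, d) - 2 * power_sum(n, d) ** 2
    return 2 * c ** 2 * Fraction(n, n * n - 1) ** 2 * bracket


for n, d in [(3, 2), (3, 3), (3, 4)]:
    print(n, d, exact_moments(n, d), (mean_formula(n, d), var_formula(n, d)))
```

The enumeration grows as $(n!)^{d-1}$, so this check is only practical for very small cases; it is meant as a sanity check of the algebra rather than as a way of computing the moments.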

Appendix 4: Computation of $\sigma_\Pi^2$

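Because the independence copula is radially symmetric, Formula (A5) reduces to

$$\sigma^2_{\phi_C} = 2 \left(\frac{d+1}{d-1}\right)^2 \left\{ \Lambda(D, D) + \sum_{A \subseteq D} (-1)^{|A|} \Lambda(A, D) \right\}.$$

To compute $\sigma^2_{\phi_C}$, one must evaluate $2^d$ covariances of the form $\mathrm{cov}\{\mathbb{C}(s_A), \mathbb{C}(t_D)\}$ for some $A \subseteq D$. In view of Equation (A4), any such covariance may be expressed as

$$\mathrm{cov}\{\mathbb{U}_C(s_A), \mathbb{U}_C(t_D)\} + \sum_{j=1}^d \sum_{k=1}^d C_j(s_A) C_k(t_D)\, \mathrm{cov}\{\mathbb{U}_C(s_{A \cap \{j\}}), \mathbb{U}_C(t_{D \cap \{k\}})\}$$

$$- \sum_{j=1}^d C_j(s_A)\, \mathrm{cov}\{\mathbb{U}_C(s_{A \cap \{j\}}), \mathbb{U}_C(t_D)\} - \sum_{k=1}^d C_k(t_D)\, \mathrm{cov}\{\mathbb{U}_C(s_A), \mathbb{U}_C(t_{D \cap \{k\}})\}.$$

Simplifications occur when $C = \Pi$ because $C_j(s_A) = s^{|A \setminus \{j\}|}$, $C_k(t_B) = t^{|B \setminus \{k\}|}$ and for arbitrary $A, B \subseteq D$,

$$\mathrm{cov}\{\mathbb{U}_C(s_A), \mathbb{U}_C(t_B)\} = \begin{cases} s^{|A|} \left(t^{|B \setminus A|} - t^{|B|}\right), & \text{if } s < t, \\ t^{|B|} \left(s^{|A \setminus B|} - s^{|A|}\right), & \text{if } s > t. \end{cases}$$

Thus if $s < t$ and $A \subseteq D$, one finds

$$\mathrm{cov}\{\mathbb{C}(s_A), \mathbb{C}(t_D)\} = s^{|A|}\left(t^{|D \setminus A|} - t^{|D|}\right) + \sum_{j = k \in A} s^{|A|} t^{|D| - 1}(1 - t) - \sum_{j \in A} s^{|A|} t^{|D| - 1}(1 - t) - \sum_{k \in A} s^{|A|} t^{|D| - 1}(1 - t),$$

which reduces to

$$s^{|A|}\left(t^{|D \setminus A|} - t^{|D|}\right) - |A|\, s^{|A|} t^{|D| - 1}(1 - t) = s^{|A|}\left\{t^{|D \setminus A|}\left(1 - t^{|A|}\right) - |A|\, t^{|D| - 1}(1 - t)\right\}.$$

Similarly if $s > t$, one gets

$$t^{|D|}\left(s^{|A \setminus D|} - s^{|A|}\right) - |A|\, t^{|D|} s^{|A| - 1}(1 - s) = t^{|D|}\left\{\left(1 - s^{|A|}\right) - |A|\, s^{|A| - 1}(1 - s)\right\}.$$

Consequently, for any $A \subseteq D$ with $|A| = k$,

$$\Lambda(A, D) = \int_0^1 \int_0^1 \mathrm{cov}\{\mathbb{C}(s_A), \mathbb{C}(t_D)\}\, ds\, dt = \Psi_1(k, d) + \Psi_2(k, d),$$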


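where for arbitrary $k \in \{1, \ldots, d\}$,

$$\Psi_1(k, d) = \int_0^1 \int_0^t s^k \left\{t^{d - k}\left(1 - t^k\right) - k\, t^{d - 1}(1 - t)\right\} ds\, dt,$$

and

$$\Psi_2(k, d) = \int_0^1 \int_t^1 t^d \left\{\left(1 - s^k\right) - k\, s^{k - 1}(1 - s)\right\} ds\, dt.$$

Consequently,

$$\sigma_\Pi^2 = 2 \left(\frac{d+1}{d-1}\right)^2 \sum_{\ell=1}^2 \left\{ \Psi_\ell(d, d) + \sum_{k=0}^d (-1)^k \binom{d}{k} \Psi_\ell(k, d) \right\}.$$

Now observe that in view of the binomial identity,

$$\sum_{k=0}^d (-1)^k \binom{d}{k} \Psi_1(k, d) = \int_0^1 \int_0^t \sum_{k=0}^d (-1)^k \binom{d}{k} s^k \left\{t^{d - k}\left(1 - t^k\right) - k\, t^{d - 1}(1 - t)\right\} ds\, dt$$

$$= \int_0^1 \int_0^t \left\{(t - s)^d - (1 - s)^d t^d + d\, s (1 - s)^{d - 1} t^{d - 1} - d\, s (1 - s)^{d - 1} t^d\right\} ds\, dt.$$

Similarly,

$$\sum_{k=0}^d (-1)^k \binom{d}{k} \Psi_2(k, d) = \int_0^1 \int_t^1 -t^d \left\{(1 - s)^d - d (1 - s)^{d - 1} + d\, s (1 - s)^{d - 1}\right\} ds\, dt.$$

Upon collecting the terms and integrating, one gets the desired conclusion.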
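The final integration step can be reproduced with exact rational arithmetic. In the sketch below, which is an illustration and not the computation carried out by the authors, the integrands of $\Psi_1$ and $\Psi_2$ are expanded into monomials $s^a t^b$ and integrated term by term over the relevant triangle; the result is then compared with the closed form for $\sigma_\Pi^2$ given in Section 8. The function names are hypothetical, and the terms with $k = 0$ are included in the sums with the convention that a term with a zero coefficient is dropped.

```python
from fractions import Fraction
from math import comb, factorial


def lower(a, b):
    """Integral of s**a * t**b over the triangle 0 < s < t < 1."""
    return Fraction(1, (a + 1) * (a + b + 2))


def upper(a, b):
    """Integral of s**a * t**b over the triangle 0 < t < s < 1."""
    return (Fraction(1, b + 1) - Fraction(1, a + b + 2)) / (a + 1)


def psi(region, k, d):
    """Psi_1(k, d) (region=1) or Psi_2(k, d) (region=2), integrated term by term.
    Each tuple is (coefficient, power of s, power of t)."""
    if region == 1:
        terms = [(1, k, d - k), (-1, k, d), (-k, k, d - 1), (k, k, d)]
        integrate = lower
    else:
        terms = [(1, 0, d), (-1, k, d), (-k, k - 1, d), (k, k, d)]
        integrate = upper
    # Zero-coefficient terms are skipped; this also avoids s**(-1) when k = 0.
    return sum(c * integrate(a, b) for c, a, b in terms if c != 0)


def sigma2_from_integrals(d):
    """Sum over both regions of Psi(d, d) plus the alternating binomial sum."""
    total = Fraction(0)
    for region in (1, 2):
        total += psi(region, d, d)
        total += sum((-1) ** k * comb(d, k) * psi(region, k, d) for k in range(d + 1))
    return 2 * Fraction(d + 1, d - 1) ** 2 * total


def sigma2_closed_form(d):
    """Closed form quoted in Section 8, with B(d, d+2) = (d-1)!(d+1)!/(2d+1)!."""
    beta = Fraction(factorial(d - 1) * factorial(d + 1), factorial(2 * d + 1))
    ratio = Fraction(2 + 4 * d - d**2 + d**3,
                     d * (d + 2) * (2 * d + 1) * (d + 1) ** 2)
    return 2 * Fraction(d + 1, d - 1) ** 2 * (ratio - beta / (d + 1))


for d in (2, 3, 4):
    print(d, sigma2_from_integrals(d), sigma2_closed_form(d))
```

For $d = 2$, 3 and 4 both routes should return $2/5$, $2/15$ and $149/2268$, the values quoted in Section 8.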
